Adobe Inc.
GENERATING EMBEDDINGS IN A MULTIMODAL EMBEDDING SPACE FOR CROSS-LINGUAL DIGITAL IMAGE RETRIEVAL

Last updated:

Abstract:

The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generates a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identifies an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.

Status:
Application
Type:

Utility

Filling date:

20 Oct 2020

Issue date:

21 Apr 2022