Adobe Inc.
VIDEO RETRIEVAL USING TEMPORAL VISUAL CONTENT

Abstract:

Systems and methods for content-based video retrieval are described. The systems and methods may break a video into multiple frames, generate a feature vector from the frames based on the temporal relationships among them, and then embed the feature vector into a vector space along with a vector representing a search query. In some embodiments, the video feature vector is converted into a text caption prior to the embedding. In other embodiments, the video feature vector and a sentence vector are each embedded into a common space using a joint video-sentence embedding model. Once the video and the search query are embedded into a common vector space, a distance between them may be calculated. After calculating the distances between the search query and a set of videos, the distances may be used to select a subset of the videos to present as the search results.
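The following is a minimal sketch of the retrieval flow described in the abstract, not the patented implementation. The encoder functions encode_video and encode_query are hypothetical placeholders standing in for a temporally aware video encoder and a sentence embedding model; the sketch only illustrates embedding both into a common space and ranking videos by distance to the query.

import hashlib
import numpy as np

def encode_video(frames: np.ndarray) -> np.ndarray:
    """Placeholder video encoder: pool per-frame features with weights that
    depend on frame order, so the temporal relationship among frames
    contributes to the final vector (a real system would use a learned
    temporal model)."""
    weights = np.linspace(0.5, 1.0, num=len(frames))[:, None]
    return (frames * weights).mean(axis=0)

def encode_query(query: str, dim: int = 4) -> np.ndarray:
    """Placeholder text encoder: deterministic toy embedding of a query
    string into the same dimensionality as the video vectors."""
    seed = int(hashlib.sha256(query.encode()).hexdigest()[:8], 16)
    rng = np.random.default_rng(seed)
    return rng.standard_normal(dim)

def rank_videos(query: str, videos: dict, top_k: int = 3):
    """Embed the query and each video into a common space, compute the
    cosine distance between them, and return the closest videos."""
    q = encode_query(query)
    q = q / np.linalg.norm(q)
    scored = []
    for name, frames in videos.items():
        v = encode_video(frames)
        v = v / np.linalg.norm(v)
        scored.append((name, 1.0 - float(q @ v)))  # cosine distance
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]

# Toy usage: three "videos", each 8 frames with 4-dimensional frame features.
videos = {f"video_{i}": np.random.rand(8, 4) for i in range(3)}
print(rank_videos("a dog catching a frisbee", videos))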

Status: Application
Type: Utility
Filing date: 15 Oct 2019
Issue date: 15 Apr 2021