Adobe Inc.
TRANSLATING TEXTS FOR VIDEOS BASED ON VIDEO CONTEXT

Last updated:

Abstract:

The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.
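
The data flow described in the abstract (scene frames, then a contextual identifier, then a context-conditioned translation with an affinity score) can be illustrated with a toy sketch. This is not the disclosed implementation: both neural networks are replaced by simple stand-ins (a nearest-prototype lookup and a dictionary lookup), and every name here (`Scene`, `contextual_identifier`, `translate`, `PROTOTYPES`, `LEXICON`) is hypothetical and chosen only for illustration. The 0/1 affinity score is likewise an assumption; the abstract does not say how affinity scores are computed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Scene:
    """Frames belonging to one scene, plus the text tied to some of them."""
    frame_features: List[List[float]]  # one image-feature vector per frame
    term_sequence: str                 # source-language text to translate


def contextual_identifier(scene: Scene, prototypes: Dict[str, List[float]]) -> str:
    """Stand-in for the contextual neural network (assumption, not the real model).

    Averages the frame features over the scene and returns the tag whose
    prototype vector has the largest dot product with that average.
    """
    n = len(scene.frame_features)
    dim = len(scene.frame_features[0])
    mean = [sum(f[i] for f in scene.frame_features) / n for i in range(dim)]

    def score(tag: str) -> float:
        return sum(m * p for m, p in zip(mean, prototypes[tag]))

    return max(prototypes, key=score)


def translate(
    term_sequence: str,
    tag: str,
    lexicon: Dict[str, List[Tuple[str, str]]],
) -> Tuple[str, float]:
    """Stand-in for the translation neural network (assumption, not the real model).

    Looks up candidate translations, each labeled with a context tag, and
    returns the best candidate with a toy affinity score: 1.0 when the
    candidate's context matches the scene's tag, 0.0 otherwise.
    """
    candidates = lexicon.get(term_sequence, [])
    if not candidates:
        # No known translation: return the source text with zero affinity.
        return term_sequence, 0.0
    scored = [(text, 1.0 if ctx == tag else 0.0) for text, ctx in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy data: two context tags and an ambiguous English word
# with two Spanish translations.
PROTOTYPES = {"sports": [1.0, 0.0], "animal": [0.0, 1.0]}
LEXICON = {"bat": [("murciélago", "animal"), ("bate", "sports")]}

scene = Scene(frame_features=[[0.9, 0.1], [0.8, 0.2]], term_sequence="bat")
tag = contextual_identifier(scene, PROTOTYPES)
translation, affinity = translate(scene.term_sequence, tag, LEXICON)
```

The point of the sketch is only the ordering of the steps: the context tag is derived from the frames of the whole scene, and the translation step takes that tag as an input alongside the term sequence, so an ambiguous source term can resolve differently depending on what the video shows.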

Status:

Application

Type:

Utility

Filing date:

8 Nov 2019

Issue date:

13 May 2021