International Business Machines Corporation
Disambiguation of audio content using visual context

Last updated:

Abstract:

Provided is a method for disambiguating an audio component extracted from audiovisual content. Audiovisual content is identified. The audiovisual content includes an audio component and a video component. An ambiguous expression is detected in the audio component. An object referenced by the ambiguous expression is identified in the video component. A verbal description of the object is generated. The verbal description is injected into the audio component to generate a modified audio component.
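The steps in the abstract can be illustrated with a minimal, text-level sketch. It is not the patented implementation. It assumes the audio component has already been transcribed into timestamped words and that the video component has already been reduced to labeled objects with visibility intervals. The names `Word`, `VisualObject`, `AMBIGUOUS`, `find_object`, and `disambiguate` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of deictic words treated as "ambiguous expressions".
AMBIGUOUS = {"this", "that", "these", "those", "it"}


@dataclass
class Word:
    text: str
    time: float  # seconds from the start of the audio component


@dataclass
class VisualObject:
    label: str   # verbal description, e.g. from an assumed object detector
    start: float  # first second the object is visible
    end: float    # last second the object is visible


def find_object(objects, t):
    """Return the first object visible at time t, or None if there is none."""
    for obj in objects:
        if obj.start <= t <= obj.end:
            return obj
    return None


def disambiguate(words, objects):
    """Return the transcript with a description injected after each
    ambiguous word that coincides with a visible object."""
    out = []
    for w in words:
        out.append(w.text)
        if w.text.lower() in AMBIGUOUS:
            obj = find_object(objects, w.time)
            if obj is not None:
                out.append(f"({obj.label})")
    return " ".join(out)


words = [Word("Attach", 0.0), Word("this", 0.5), Word("here", 1.0)]
objects = [VisualObject("the red bracket", 0.2, 0.8)]
result = disambiguate(words, objects)
```

In this sketch the "modified audio component" is represented as text. A real system would presumably synthesize the injected description as speech and splice it into the audio track, which is outside the scope of this example.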

Status:
Grant
Type:

Utility

Filing date:

21 Apr 2020

Issue date:

27 Jul 2021