International Business Machines Corporation
Disambiguation of audio content using visual context
Last updated:
Abstract:
Provided is a method for disambiguating an audio component extracted from audiovisual content. Audiovisual content is identified. The audiovisual content includes an audio component and a video component. An ambiguous expression is detected in the audio component. An object referenced by the ambiguous expression is identified in the video component. A verbal description of the object is generated. The verbal description is injected into the audio component to generate a modified audio component.
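The steps in the abstract can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the patented implementation: the audio component is modeled as a timestamped transcript, the video component as a list of already-detected objects with on-screen intervals, and the "injection" is done on text rather than on synthesized speech. All names here (`Word`, `VisibleObject`, `AMBIGUOUS`, `find_object`, `disambiguate`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical set of deictic words treated as "ambiguous expressions".
AMBIGUOUS = {"this", "that", "these", "those"}


@dataclass
class Word:
    text: str
    time: float  # seconds from the start of the audio component


@dataclass
class VisibleObject:
    label: str    # e.g. "red mug"; assumed to come from an object detector
    start: float  # first second the object is on screen
    end: float    # last second the object is on screen


def find_object(objects, time):
    """Return the first object on screen at `time`, or None if there is none."""
    for obj in objects:
        if obj.start <= time <= obj.end:
            return obj
    return None


def disambiguate(words, objects):
    """Inject a verbal description after each ambiguous word in the transcript."""
    out = []
    for word in words:
        out.append(word.text)
        if word.text.lower() in AMBIGUOUS:
            obj = find_object(objects, word.time)
            if obj is not None:
                # Stand-in for generating and injecting a spoken description.
                out.append(f"(the {obj.label})")
    return " ".join(out)


words = [Word("Pick", 0.0), Word("this", 0.4), Word("up", 0.8)]
objects = [VisibleObject("red mug", 0.0, 2.0)]
result = disambiguate(words, objects)
print(result)
```

A real system would replace the word list with speech recognition output, the object list with a visual detector, and the string insertion with text-to-speech spliced into the audio track; none of those components are specified in the abstract.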
Status:
Grant
Type:
Utility
Filing date:
21 Apr 2020
Issue date:
27 Jul 2021