International Business Machines Corporation
Synchronized sound generation from videos

Abstract:

A computing device receives a video feed. The video feed is divided into a sequence of video segments. For each video segment, visual features of the video segment are extracted. A predicted spectrogram is generated based on the extracted visual features. A synthetic audio waveform is generated from the predicted spectrogram. All synthetic audio waveforms of the video feed are concatenated to generate a synthetic soundtrack that is synchronized with the video feed.
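The steps in the abstract can be pictured as a per-segment loop. The following is only a minimal sketch of that data flow, not the patented method: every function name, the list-based frame representation, and the `frames_per_segment` and `samples_per_frame` parameters are assumptions made for illustration, and the feature extractor, spectrogram predictor, and waveform synthesizer are trivial placeholders where a real system would presumably use trained models.

```python
from typing import List

Frame = List[float]     # hypothetical stand-in for one video frame (flattened pixels)
Waveform = List[float]  # audio samples


def split_into_segments(frames: List[Frame], frames_per_segment: int) -> List[List[Frame]]:
    """Divide the video feed into consecutive segments; the last one may be shorter."""
    return [frames[i:i + frames_per_segment]
            for i in range(0, len(frames), frames_per_segment)]


def extract_visual_features(segment: List[Frame]) -> List[float]:
    """Placeholder feature extractor: mean pixel value of each (non-empty) frame."""
    return [sum(frame) / len(frame) for frame in segment]


def predict_spectrogram(features: List[float], n_bins: int = 4) -> List[List[float]]:
    """Placeholder predictor: one spectrogram column of n_bins values per frame."""
    return [[value * (b + 1) for b in range(n_bins)] for value in features]


def synthesize_waveform(spectrogram: List[List[float]], samples_per_frame: int) -> Waveform:
    """Placeholder synthesizer: emits a fixed number of samples per column,
    so the audio duration of a segment matches its number of frames."""
    wave: Waveform = []
    for column in spectrogram:
        level = sum(column) / len(column)
        wave.extend([level] * samples_per_frame)
    return wave


def generate_soundtrack(frames: List[Frame], frames_per_segment: int,
                        samples_per_frame: int) -> Waveform:
    """Run the per-segment steps and concatenate the resulting waveforms in order."""
    soundtrack: Waveform = []
    for segment in split_into_segments(frames, frames_per_segment):
        features = extract_visual_features(segment)
        spectrogram = predict_spectrogram(features)
        soundtrack.extend(synthesize_waveform(spectrogram, samples_per_frame))
    return soundtrack
```

In this sketch, synchronization follows from each segment producing exactly `samples_per_frame` audio samples per video frame, so concatenating the segment waveforms in order yields a soundtrack whose length tracks the length of the feed.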

Status:

Grant

Type:

Utility

Filing date:

30 Jul 2019

Issue date:

15 Mar 2022