Meta Platforms, Inc.
Voice activity detection using audio and visual analysis

Abstract:

A method of detecting voice activity includes performing a video analysis on a frame of video signal to determine a position of a user in the frame and to identify one or more beams of a corresponding audio signal associated with a region including the position of the user. The identified one or more beams of audio signal are analyzed to determine whether voice is present in the frame. When a user is not identified during the video analysis of the frame of video signal, audio analysis is not performed on the corresponding frame of audio signal.
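
The gating logic described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The video analysis is assumed to have already produced a user angle (or None when no user is found), and a simple mean-energy threshold stands in for the audio analysis, which the abstract does not specify. The names Beam, beams_for_position, frame_energy, detect_voice, half_width_deg, and energy_threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Beam:
    """One beamformed audio channel for a single frame (hypothetical type)."""
    center_deg: float          # assumed steering direction of the beam
    samples: Sequence[float]   # audio samples of this beam for the frame


def beams_for_position(beams: List[Beam], user_angle_deg: float,
                       half_width_deg: float = 15.0) -> List[Beam]:
    """Select beams whose direction falls in a region around the user."""
    return [b for b in beams
            if abs(b.center_deg - user_angle_deg) <= half_width_deg]


def frame_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude; a placeholder for the real audio analysis."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def detect_voice(user_angle_deg: Optional[float], beams: List[Beam],
                 energy_threshold: float = 0.01) -> bool:
    """Return True if voice is judged present in this frame."""
    if user_angle_deg is None:
        # No user found in the video frame: skip audio analysis entirely.
        return False
    selected = beams_for_position(beams, user_angle_deg)
    # Only the beams associated with the user's region are analyzed.
    return any(frame_energy(b.samples) > energy_threshold for b in selected)


if __name__ == "__main__":
    beams = [
        Beam(-30.0, [0.5, -0.5, 0.5, -0.5]),  # loud beam on the left
        Beam(0.0, [0.0, 0.0, 0.0, 0.0]),      # silent beam in the center
    ]
    print(detect_voice(None, beams))    # no user detected in video
    print(detect_voice(-25.0, beams))   # user near the loud beam
    print(detect_voice(5.0, beams))     # user near the silent beam
```

In this sketch, sound arriving from a direction where no user is seen never triggers a detection, and frames with no visible user cost no audio processing, which mirrors the two behaviors the abstract describes.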

Status:

Grant

Type:

Utility

Filing date:

14 Oct 2019

Issue date:

25 Jan 2022