Snap Inc.
Text and audio-based real-time face reenactment

Last updated: 7 Jan 2022

Abstract:

Systems and methods are provided for text and audio-based real-time face reenactment. An example method includes receiving an input text and a target image, the target image including a target face; generating, based on the input text, a sequence of sets of acoustic features representing the input text; determining, based on the sequence of sets of acoustic features, a sequence of sets of scenario data indicating modifications of the target face for pronouncing the input text; generating, based on the sequence of sets of scenario data, a sequence of frames, wherein each of the frames includes the target face modified based on at least one of the sets of scenario data; generating, based on the sequence of frames, an output video; and synthesizing, based on the sequence of sets of acoustic features, audio data and adding the audio data to the output video.
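The data flow in the abstract can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not the patented implementation: the function names, the `Frame` type, and the toy "vowel means open mouth" rule are assumptions chosen only to show how each stage consumes the output of the previous one. A real system would presumably use trained models for each stage.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Frame:
    """Hypothetical frame: the target face plus one modification parameter."""
    face: str          # stand-in for image data of the target face
    mouth_open: float  # stand-in for a facial modification


def text_to_acoustic_features(text: str) -> List[List[float]]:
    # Toy assumption: one feature set per non-space character,
    # holding 1.0 for vowels and 0.0 otherwise.
    return [[1.0 if ch.lower() in "aeiou" else 0.0]
            for ch in text if not ch.isspace()]


def acoustic_to_scenario(features: List[List[float]]) -> List[Dict[str, float]]:
    # Toy assumption: each acoustic feature set maps to one scenario set
    # describing how the face should change.
    return [{"mouth_open": f[0]} for f in features]


def render_frames(target_face: str,
                  scenario: List[Dict[str, float]]) -> List[Frame]:
    # One frame per scenario set, each showing the modified target face.
    return [Frame(face=target_face, mouth_open=s["mouth_open"])
            for s in scenario]


def synthesize_audio(features: List[List[float]]) -> List[float]:
    # Toy assumption: audio samples are derived from the same acoustic
    # features that drive the face, so the two stay aligned.
    return [f[0] for f in features]


def reenact(text: str, target_face: str) -> Dict[str, list]:
    features = text_to_acoustic_features(text)
    scenario = acoustic_to_scenario(features)
    frames = render_frames(target_face, scenario)
    audio = synthesize_audio(features)
    # The "output video" is the frame sequence with the audio attached.
    return {"frames": frames, "audio": audio}


result = reenact("hi a", "target_face.png")
```

Both the frames and the audio are derived from the same sequence of acoustic features, which is what keeps the facial motion and the synthesized speech in step in this sketch.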

Status:

Grant

Type:

Utility

Filing date:

11 Jul 2019

Issue date:

7 Sep 2021
