Microsoft Corporation
Computing system for expressive three-dimensional facial animation

Last updated:

Abstract:

A computer-implemented technique for animating a visual representation of a face based on spoken words of a speaker is described herein. A computing device receives an audio sequence comprising content features reflective of spoken words uttered by a speaker. The computing device generates latent content variables and latent style variables based upon the audio sequence. The latent content variables are used to synchronized movement of lips on the visual representation to the spoken words uttered by the speaker. The latent style variables are derived from an expected appearance of facial features of the speaker as the speaker utters the spoken words and are used to synchronize movement of full facial features of the visual representation to the spoken words uttered by the speaker. The computing device causes the visual representation of the face to be animated on a display based upon the latent content variables and the latent style variables.

Status:
Grant
Type:

Utility

Filling date:

29 Oct 2018

Issue date:

1 Feb 2022