Netflix, Inc.
IDENTIFYING REPRESENTATIVE FRAMES IN VIDEO CONTENT
Last updated:
Abstract:
One embodiment of the present invention sets forth a technique for selecting a frame of video content that is representative of a media title. The technique includes applying an embedding model to a plurality of faces included in a set of frames of the video content to generate a plurality of face embeddings. The technique also includes aggregating the plurality of face embeddings into a plurality of clusters representing a plurality of characters included in the media title. The technique further includes computing a plurality of prominence scores for the plurality of characters based on one or more attributes of the plurality of clusters, and selecting, from the set of frames, a frame of video content as representative of the media title based on one or more prominence scores for one or more characters included in the frame.
Utility
10 Jun 2021
16 Dec 2021