Spotify Technology S.A.
Systems and Methods for Aligning Lyrics using a Neural Network

Abstract:

An electronic device receives audio data for a media item. The electronic device generates, from the audio data, a plurality of samples, each sample having a predefined maximum length. The electronic device, using a neural network trained to predict character probabilities, generates a probability matrix of characters for a first portion of a first sample of the plurality of samples. The probability matrix includes character information, timing information, and respective probabilities of respective characters at respective times. The electronic device identifies, for the first portion of the first sample, a first sequence of characters based on the generated probability matrix.
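To make the flow in the abstract concrete, the following is a minimal sketch, not the method claimed in the application. It assumes the audio is a flat sequence of numbers, that the probability matrix is a list of per-time-step rows with one probability per character, and that the sequence is read off with a greedy, CTC-style rule (take the most likely character at each step, merge repeats, skip a blank symbol). The abstract does not specify a decoding rule, so the blank symbol and the names `split_into_samples` and `greedy_decode` are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_into_samples(audio: Sequence[float], max_len: int) -> List[Sequence[float]]:
    """Cut audio into consecutive samples, each at most max_len values long."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [audio[i:i + max_len] for i in range(0, len(audio), max_len)]


def greedy_decode(
    prob_matrix: Sequence[Sequence[float]],
    alphabet: Sequence[str],
    blank: str = "_",
) -> List[Tuple[str, int]]:
    """Pick the most probable character per time step.

    prob_matrix[t][c] is the probability of alphabet[c] at time step t.
    Repeated characters are merged and the (assumed) blank symbol is
    dropped. Returns (character, first time step) pairs, so the timing
    information is kept alongside the characters.
    """
    decoded: List[Tuple[str, int]] = []
    previous = None
    for t, row in enumerate(prob_matrix):
        best_index = max(range(len(row)), key=row.__getitem__)
        char = alphabet[best_index]
        if char != blank and char != previous:
            decoded.append((char, t))
        previous = char
    return decoded


# Toy example: a hand-written matrix stands in for the network output.
alphabet = ["_", "h", "i"]
matrix = [
    [0.1, 0.8, 0.1],  # t=0: "h"
    [0.2, 0.7, 0.1],  # t=1: "h" again, merged with t=0
    [0.9, 0.05, 0.05],  # t=2: blank
    [0.1, 0.1, 0.8],  # t=3: "i"
]
aligned = greedy_decode(matrix, alphabet)
```

In this toy case `aligned` pairs each character with the time step at which it first appears, which is the kind of character-to-time mapping a lyrics alignment would need. A real system would obtain the matrix from a trained network rather than writing it by hand.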

Status:

Application

Type:

Utility

Filing date:

12 Sep 2019

Issue date:

30 Apr 2020