Cisco Systems, Inc.
Characterizing accuracy of ensemble models for automatic speech recognition by determining a predetermined number of multiple ASR engines based on their historical performance

Abstract:

One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
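The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say which features or which machine learning model are used, so the word-level agreement features, the logistic scoring function, the weights, and every name here (`word_similarity`, `build_features`, `score_accuracy`, `analyze`, `store`) are hypothetical choices made for the example.

```python
import math
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Word-level similarity ratio in [0, 1] between two transcriptions."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def build_features(engine_transcripts: list[str], best: str) -> list[float]:
    """Assumed features: how closely each engine's output agrees with the best transcription."""
    sims = [word_similarity(t, best) for t in engine_transcripts]
    return [
        sum(sims) / len(sims),    # mean agreement with the best transcription
        min(sims),                # agreement of the least-agreeing engine
        float(len(best.split())), # length of the best transcription in words
    ]


def score_accuracy(features: list[float], weights: list[float], bias: float) -> float:
    """Stand-in for the machine learning model: a logistic function over the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def analyze(engine_transcripts, best, weights, bias, store: dict) -> float:
    """Score the best transcription and store the score in association with it."""
    score = score_accuracy(build_features(engine_transcripts, best), weights, bias)
    store[best] = score
    return score
```

In this sketch, a best transcription that every engine agrees with receives a higher score than one that no engine supports. In practice the weights would be learned from labelled data; the ones passed in by a caller here would be placeholders.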

Status:

Grant

Type:

Utility

Filing date:

9 Mar 2019

Issue date:

5 Jul 2022