International Business Machines Corporation
Feature and feature variant reconstruction for recurrent model accuracy improvement in speech recognition
Last updated:
Abstract:
A multi-task learning system is provided for speech recognition. The system includes a common encoder network. The system further includes a primary network for minimizing a Connectionist Temporal Classification (CTC) loss for speech recognition. The system also includes a sub network for minimizing a mean squared error (MSE) loss for feature reconstruction. A first set of output data of the common encoder network is received by both the primary network and the sub network. A second set of the output data of the common encoder network is received only by the primary network from among the primary network and the sub network.
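The data flow described in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the patented implementation. The following are all assumptions made for the example: the single-layer tanh encoder, the linear heads, the split of each encoder output vector into a "first set" (the leading `split` dimensions) and a "second set" (the rest), the layer sizes, and the 0.5 weight used to combine the two losses. All function and parameter names are hypothetical.

```python
import math
import random

NEG_INF = float("-inf")

def logsumexp(values):
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def ctc_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (forward algorithm).
    log_probs is a T x V list of per-frame log-probabilities."""
    ext = [blank]                      # target interleaved with blanks
    for symbol in target:
        ext += [symbol, blank]
    size = len(ext)
    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, len(log_probs)):
        new = [NEG_INF] * size
        for s in range(size):
            cands = [alpha[s]]
            if s >= 1:
                cands.append(alpha[s - 1])
            # Skipping a blank is allowed only between different symbols.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            total = logsumexp(cands)
            if total != NEG_INF:
                new[s] = total + log_probs[t][ext[s]]
        alpha = new
    return -logsumexp(alpha[-2:] if size > 1 else alpha[-1:])

def matvec(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def log_softmax(z):
    lse = logsumexp(z)
    return [v - lse for v in z]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multitask_losses(features, target, params, split):
    # Common encoder: one hidden vector per input frame (assumed tanh layer).
    encoded = [[math.tanh(v) for v in matvec(params["encoder"], x)]
               for x in features]
    # Primary network receives both sets, i.e. the full encoder output.
    log_probs = [log_softmax(matvec(params["primary"], h)) for h in encoded]
    ctc = ctc_loss(log_probs, target)
    # Sub network receives only the first set and reconstructs the features.
    recon = [matvec(params["sub"], h[:split]) for h in encoded]
    count = sum(len(x) for x in features)
    mse = sum((r - v) ** 2
              for rx, x in zip(recon, features)
              for r, v in zip(rx, x)) / count
    return ctc, mse

# Toy dimensions (assumed): 4 features, 6 encoder outputs, 3 output labels.
rng = random.Random(0)
FEAT, HIDDEN, SPLIT, VOCAB, FRAMES = 4, 6, 3, 3, 5
params = {
    "encoder": rand_matrix(HIDDEN, FEAT, rng),
    "primary": rand_matrix(VOCAB, HIDDEN, rng),
    "sub": rand_matrix(FEAT, SPLIT, rng),
}
features = [[rng.uniform(-1, 1) for _ in range(FEAT)] for _ in range(FRAMES)]
ctc, mse = multitask_losses(features, [1, 2], params, SPLIT)
total = ctc + 0.5 * mse   # loss weighting is an assumption, not from the source
print(f"CTC loss: {ctc:.4f}  MSE loss: {mse:.4f}  combined: {total:.4f}")
```

In this sketch the reconstruction head never sees the second set of encoder outputs, so only the primary CTC task can shape those dimensions, while the first set is shaped by both tasks.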
Status:
Grant
Type:
Utility
Filing date:
8 Mar 2019
Issue date:
2 Aug 2022