Microsoft Corporation
MULTI-TOKEN EMBEDDING AND CLASSIFIER FOR MASKED LANGUAGE MODELS

Last updated:

Abstract:

Embodiments of the present disclosure include systems and methods for training transformer models. In some embodiments, a set of input data is received. The input data comprises a plurality of tokens, including masked tokens. The plurality of tokens is processed in an embedding layer, which is coupled to a transformer layer. The plurality of tokens is then processed in the transformer layer, which is coupled to a classifier layer. The plurality of tokens is processed in the classifier layer, which is coupled to a loss layer. At least one of the embedding layer and the classifier layer combines masked tokens at a current position with tokens at one or more of a previous position and a subsequent position.
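
As an illustration of the combining step on the embedding side, the following is a minimal sketch and not the claimed implementation. It assumes that "combine" means an element-wise average of the masked position's embedding with the embeddings at the previous and subsequent positions; the abstract does not specify the operation. The names `MASK_ID`, `make_embedding_table`, and `combine_masked_embeddings` are hypothetical and introduced only for this example.

```python
import random

MASK_ID = 0  # hypothetical id reserved for the mask token (assumption)


def make_embedding_table(vocab_size, dim, seed=0):
    """Build a toy embedding table filled with small random floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]


def combine_masked_embeddings(token_ids, table, mask_id=MASK_ID):
    """Embed tokens; at masked positions, average in neighbouring embeddings.

    Averaging is an assumption made for illustration: the abstract only
    states that masked tokens are combined with previous/subsequent tokens.
    """
    base = [list(table[t]) for t in token_ids]
    out = []
    for i, tok in enumerate(token_ids):
        if tok != mask_id:
            # Unmasked positions keep their ordinary embedding.
            out.append(base[i])
            continue
        # Collect the current position plus whichever neighbours exist.
        group = [base[i]]
        if i > 0:
            group.append(base[i - 1])
        if i + 1 < len(token_ids):
            group.append(base[i + 1])
        dim = len(base[i])
        out.append([sum(vec[d] for vec in group) / len(group)
                    for d in range(dim)])
    return out


table = make_embedding_table(vocab_size=5, dim=4)
tokens = [3, MASK_ID, 2]  # the middle token is masked
embedded = combine_masked_embeddings(tokens, table)
```

In this sketch the output of `combine_masked_embeddings` would feed the transformer layer; a classifier-side variant could apply the same kind of neighbour combination to the transformer outputs before the loss layer, though the abstract leaves that detail open as well.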

Status:

Application

Type:

Utility

Filing date:

25 Aug 2020

Issue date:

3 Mar 2022