Microsoft Corporation
Token Packing for Sequence Models

Abstract:

Embodiments of the present disclosure include systems and methods for packing tokens to train sequence models. In some embodiments, a plurality of datasets for training a sequence model is received. Each dataset in the plurality of datasets includes a sequence of correlated tokens. A set of training data is generated that includes a subset of a sequence of tokens from a first dataset in the plurality of datasets and a subset of a sequence of tokens from a second, different dataset in the plurality of datasets. The sequence model is trained using the set of training data.
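As an illustration only, the packing step described in the abstract might be sketched as follows. The abstract does not specify a packing strategy, so this sketch assumes a simple greedy scheme that concatenates token sequences into rows of a fixed length; the function name `pack_sequences`, the parameter `row_len`, and the use of a dataset index alongside each token are hypothetical choices made for this example and are not taken from the disclosure.

```python
from typing import List, Tuple

# Hypothetical type: a token paired with the index of the dataset it came from.
TaggedToken = Tuple[int, int]


def pack_sequences(datasets: List[List[int]], row_len: int) -> List[List[TaggedToken]]:
    """Greedily pack token sequences into rows of at most `row_len` tokens.

    Assumption: tokens are integer IDs and rows are filled in dataset order.
    A row may therefore hold the tail of one sequence and the head of the next.
    """
    if row_len <= 0:
        raise ValueError("row_len must be positive")

    rows: List[List[TaggedToken]] = []
    current: List[TaggedToken] = []
    for dataset_index, sequence in enumerate(datasets):
        for token in sequence:
            current.append((dataset_index, token))
            if len(current) == row_len:
                rows.append(current)
                current = []
    if current:
        # The final row may be shorter than row_len; padding is left out of this sketch.
        rows.append(current)
    return rows


# Three toy "datasets", each a sequence of token IDs.
datasets = [[1, 2, 3, 4, 5], [6, 7, 8], [9, 10, 11, 12]]
rows = pack_sequences(datasets, row_len=4)
for row in rows:
    print(row)
```

In this toy run, the second row holds the last token of the first sequence followed by all tokens of the second sequence, which corresponds to a set of training data drawing on two different datasets. Keeping the dataset index with each token is one possible way to let a training loop tell the packed sequences apart; the disclosure may handle this differently.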

Status:

Application

Type:

Utility

Filing date:

22 May 2020

Issue date:

25 Nov 2021