Microsoft Corporation
Token Packing for Sequence Models
Abstract:
Embodiments of the present disclosure include systems and methods for packing tokens to train sequence models. In some embodiments, a plurality of datasets for training a sequence model is received. Each dataset in the plurality of datasets includes a sequence of correlated tokens. A set of training data is generated that includes a subset of a sequence of tokens from a first dataset in the plurality of datasets and a subset of a sequence of tokens from a second, different dataset in the plurality of datasets. The sequence model is trained using the set of training data.
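The packing step described in the abstract can be illustrated with a minimal sketch. It is not the claimed method: the function name `pack_tokens`, the fixed `row_length`, the `<pad>` token, and the per-token source index are all assumptions introduced here for illustration. The sketch fills fixed-length rows by streaming tokens from each dataset in order, so a single row can hold the tail of one dataset's sequence followed by the head of the next.

```python
from typing import List, Sequence, Tuple


def pack_tokens(
    datasets: Sequence[Sequence[str]],
    row_length: int,
    pad: str = "<pad>",
) -> Tuple[List[List[str]], List[List[int]]]:
    """Pack token sequences into fixed-length rows (hypothetical helper).

    Returns the packed rows and, for every position, the index of the
    dataset the token came from (-1 marks padding).
    """
    if row_length <= 0:
        raise ValueError("row_length must be positive")

    rows: List[List[str]] = []
    sources: List[List[int]] = []
    cur_row: List[str] = []
    cur_src: List[int] = []

    for idx, seq in enumerate(datasets):
        for tok in seq:
            cur_row.append(tok)
            cur_src.append(idx)
            # Emit the row as soon as it is full, even mid-sequence.
            if len(cur_row) == row_length:
                rows.append(cur_row)
                sources.append(cur_src)
                cur_row, cur_src = [], []

    # Pad the final, partially filled row so all rows share one length.
    if cur_row:
        missing = row_length - len(cur_row)
        cur_row.extend([pad] * missing)
        cur_src.extend([-1] * missing)
        rows.append(cur_row)
        sources.append(cur_src)

    return rows, sources


datasets = [["a1", "a2", "a3", "a4", "a5"], ["b1", "b2", "b3", "b4"]]
rows, sources = pack_tokens(datasets, row_length=4)
for row, src in zip(rows, sources):
    print(row, src)
```

In this example the second row mixes the last token of the first dataset with the first three tokens of the second. The source indices are kept so that a training loop could, if desired, tell tokens from different datasets apart; how the sequence model actually uses that information is outside the scope of this sketch.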
Status:
Application
Type:
Utility
Filing date:
22 May 2020
Issue date:
25 Nov 2021