Adobe Inc.
TRAINING OF NEURAL NETWORK BASED NATURAL LANGUAGE PROCESSING MODELS USING DENSE KNOWLEDGE DISTILLATION
Abstract:
Techniques for training a first neural network (NN) model using a pre-trained second NN model are disclosed. In an example, training data that includes masked tokens and unmasked tokens is input to the first and second models. In response, the first model generates a first prediction associated with a masked token and a second prediction associated with an unmasked token, and the second model generates a third prediction associated with the masked token and a fourth prediction associated with the unmasked token. The first model is trained based at least in part on the first, second, third, and fourth predictions. In another example, a prediction associated with a masked token, a prediction associated with an unmasked token, and a prediction of whether two sentences of the training data are adjacent are received from each of the first and second models. The first model is then trained using these predictions.
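The abstract does not specify a loss function. A plausible reading is a standard softened-softmax (KL-divergence) distillation term applied at both masked and unmasked token positions (hence "dense"), plus an optional term for the adjacent-sentence prediction. The PyTorch sketch below illustrates that reading; all names and parameters (dense_distillation_loss, nsp_distillation_loss, temperature, the toy shapes) are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of a dense distillation objective; assumptions as noted above.
import torch
import torch.nn.functional as F


def dense_distillation_loss(student_logits, teacher_logits, mask,
                            temperature=2.0):
    """Pull the student's token predictions toward the teacher's at BOTH
    masked and unmasked positions, rather than at masked positions only.

    student_logits, teacher_logits: (batch, seq_len, vocab) prediction logits.
    mask: (batch, seq_len) bool tensor, True where the input token was masked.
    """
    t = temperature
    # Soften both distributions with a temperature, then take per-token KL.
    s_log_probs = F.log_softmax(student_logits / t, dim=-1)
    t_probs = F.softmax(teacher_logits / t, dim=-1)
    per_token_kl = F.kl_div(s_log_probs, t_probs, reduction="none").sum(-1)

    masked_term = per_token_kl[mask].mean()      # masked-token predictions
    unmasked_term = per_token_kl[~mask].mean()   # unmasked-token predictions
    # Scale by t**2 so gradient magnitudes stay comparable across temperatures.
    return (masked_term + unmasked_term) * t * t


def nsp_distillation_loss(student_nsp_logits, teacher_nsp_logits,
                          temperature=2.0):
    """Distill the teacher's adjacent-sentence prediction.

    *_nsp_logits: (batch, 2) logits over {not adjacent, adjacent}.
    """
    t = temperature
    return F.kl_div(F.log_softmax(student_nsp_logits / t, dim=-1),
                    F.softmax(teacher_nsp_logits / t, dim=-1),
                    reduction="batchmean") * t * t


# Toy usage with random logits standing in for the two models' outputs.
batch, seq_len, vocab = 4, 16, 1000
student_logits = torch.randn(batch, seq_len, vocab, requires_grad=True)
teacher_logits = torch.randn(batch, seq_len, vocab)   # teacher is frozen
mask = torch.rand(batch, seq_len) < 0.15              # ~15% of tokens masked

mlm_loss = dense_distillation_loss(student_logits, teacher_logits, mask)
nsp_loss = nsp_distillation_loss(torch.randn(batch, 2, requires_grad=True),
                                 torch.randn(batch, 2))
(mlm_loss + nsp_loss).backward()
```

In this sketch the unmasked-position term is what distinguishes dense distillation from distilling a masked-language-model head alone: the teacher's predictions at every position carry a training signal, not just the roughly 15% of positions that were masked.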
Type: Utility
Filed: 17 Dec 2019
Published: 17 Jun 2021