Microsoft Corporation
SENTENCE SIMILARITY SCORING USING NEURAL NETWORK DISTILLATION

Abstract:

The disclosure herein describes a system and method for attentive sentence similarity scoring. A distilled sentence embedding (DSE) language model is trained by decoupling a transformer language model using knowledge distillation. The trained DSE language model calculates sentence embeddings for a plurality of candidate sentences for sentence similarity comparisons. An embedding component associated with the trained DSE language model generates a candidate sentence representation for each candidate sentence in the plurality of candidate sentences, and these representations are stored for use in analyzing input sentences associated with queries or searches. When an input query containing a selected sentence is received, a representation is created for the selected sentence. The selected sentence representation is used with the plurality of candidate sentence representations to create a similarity score for each candidate sentence-selected sentence pair. A retrieval component identifies a set of similar sentences from the plurality of candidate sentences responsive to the input query based on the set of similarity scores.
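
The abstract describes a two-stage flow: candidate sentence embeddings are precomputed offline by the trained DSE model, and at query time the selected sentence is embedded and scored against every stored candidate to drive retrieval. The following is a minimal sketch of that scoring and retrieval flow, not the patented implementation; the `encode` function is a hypothetical stand-in for the trained DSE encoder, stubbed here with a seeded random projection so the example runs end to end.

```python
import numpy as np

# Hypothetical stand-in for the trained DSE encoder described in the abstract.
# A real system would run the distilled student model; this stub only produces
# a deterministic, unit-norm vector per sentence so the pipeline is runnable.
def encode(sentence: str, dim: int = 8) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    vec = rng.normal(size=dim)
    return vec / np.linalg.norm(vec)

# Offline step: the embedding component precomputes and stores a representation
# for each candidate sentence in the plurality of candidate sentences.
candidates = [
    "how do I reset my password",
    "steps to change an account password",
    "weather forecast for tomorrow",
]
candidate_embeddings = np.stack([encode(s) for s in candidates])

# Online step: embed the selected (query) sentence.
query = "I forgot my password, how can I reset it"
query_embedding = encode(query)

# Similarity score for each candidate sentence-selected sentence pair
# (cosine similarity, since all embeddings are unit-normalized).
scores = candidate_embeddings @ query_embedding

# Retrieval step: rank candidates by similarity score and return the best matches.
ranking = np.argsort(scores)[::-1]
for idx in ranking:
    print(f"{scores[idx]:+.3f}  {candidates[idx]}")
```

In this sketch the expensive encoding work for candidates happens once, ahead of time, and only a single encode plus a matrix-vector product is needed per query; that separation is what makes precomputed sentence embeddings attractive for search-style workloads.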

Status: Application
Type: Utility
Filing date: 12 Feb 2020
Issue date: 17 Jun 2021