Microsoft Corporation
Sentence similarity scoring using neural network distillation
Abstract:
The disclosure herein describes a system and method for attentive sentence similarity scoring. A distilled sentence embedding (DSE) language model is trained by decoupling a transformer language model using knowledge distillation. The trained DSE language model calculates sentence embeddings for a plurality of candidate sentences for use in sentence similarity comparisons. An embedding component associated with the trained DSE language model generates a plurality of candidate sentence representations, one for each candidate sentence in the plurality of candidate sentences, and these representations are stored for use in analyzing input sentences associated with queries or searches. A representation is created for a selected sentence, such as an input sentence associated with a query. The selected sentence representation is compared with each of the plurality of candidate sentence representations to create a similarity score for each pairing of a candidate sentence with the selected sentence, yielding a set of similarity scores. A retrieval component identifies a set of similar sentences from the plurality of candidate sentences responsive to the input query based on the set of similarity scores.
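The retrieval flow described in the abstract (store candidate representations once, embed the selected sentence at query time, score each pair, return the best-scoring candidates) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for the trained DSE language model, cosine similarity is one possible scoring function (the abstract does not specify one), and `build_index`, `retrieve`, and `top_k` are hypothetical names.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy stand-in for the trained DSE model (assumption): word counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed scorer)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(candidates):
    """Compute and store one representation per candidate sentence."""
    return [(sentence, embed(sentence)) for sentence in candidates]


def retrieve(query, index, top_k=2):
    """Score the selected sentence against every stored candidate
    representation and return the top_k (score, sentence) pairs."""
    query_rep = embed(query)
    scored = [(cosine(query_rep, rep), sentence) for sentence, rep in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Candidate representations are computed once, ahead of any query.
index = build_index([
    "the cat sat on the mat",
    "stock prices fell sharply",
    "a cat sat on a mat",
])
results = retrieve("the cat sat on the mat", index)
for score, sentence in results:
    print(f"{score:.3f}  {sentence}")
```

Because candidate representations are stored ahead of time, each query in this sketch costs one embedding call plus one cheap comparison per candidate, rather than one full model pass per sentence pair.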
Utility
12 Feb 2020
19 Jul 2022