International Business Machines Corporation
FEATURE VECTOR GENERATION FOR PROBABILISTIC MATCHING
Last updated:
Abstract:
A computer-implemented method increases the efficiency of matching records from two sources. The method includes identifying a first source and a second source, wherein each source includes one or more records and each record includes one or more attributes. The method further includes determining the one or more attributes based on a corpus, and generating, based on the attributes, a set of feature vectors that represent the one or more attributes. The method includes comparing each record in the first source against each record in the second source and generating, in response to the comparing, a link confidence. The method also includes linking the associated records in response to the link confidence being above a linking threshold. Finally, the method includes determining a first feature vector, of the set of feature vectors, that was used in the linking, and outputting a set of results.
Utility
30 Jul 2020
3 Feb 2022
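
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the attribute names, the string-similarity measure (Python's difflib.SequenceMatcher), the weighted-average scoring, and the names feature_vector, link_confidence, and link_records are all assumptions made for illustration. The abstract derives attributes from a corpus, whereas this sketch hard-codes them.

```python
from difflib import SequenceMatcher

# Hypothetical attributes; the abstract determines these from a corpus.
ATTRIBUTES = ("name", "city", "phone")


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def feature_vector(rec_a: dict, rec_b: dict) -> list:
    """One similarity score per attribute for a pair of records."""
    return [similarity(rec_a.get(attr, ""), rec_b.get(attr, ""))
            for attr in ATTRIBUTES]


def link_confidence(vector: list, weights: list) -> float:
    """Weighted average of the per-attribute scores (an assumed scoring rule)."""
    return sum(w * v for w, v in zip(weights, vector)) / sum(weights)


def link_records(source_a: list, source_b: list, weights: list,
                 threshold: float) -> list:
    """Compare every record in source_a with every record in source_b.

    Returns (index_a, index_b, confidence, vector) for each pair whose
    confidence is above the linking threshold, so the feature vector
    used for each link is part of the output.
    """
    results = []
    for i, rec_a in enumerate(source_a):
        for j, rec_b in enumerate(source_b):
            vector = feature_vector(rec_a, rec_b)
            confidence = link_confidence(vector, weights)
            if confidence > threshold:
                results.append((i, j, confidence, vector))
    return results


if __name__ == "__main__":
    first = [{"name": "Jane Doe", "city": "Austin", "phone": "555-0100"}]
    second = [
        {"name": "Jane Doe", "city": "Austin", "phone": "555-0100"},
        {"name": "Bob Smith", "city": "Boston", "phone": "555-0199"},
    ]
    for link in link_records(first, second, weights=[2, 1, 1], threshold=0.8):
        print(link)
```

The all-pairs loop mirrors the abstract's "comparing each record in the first source against each record in the second source"; a practical system would likely reduce the number of comparisons, but the abstract does not say how.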