System and techniques for data record merging
Abstract:
A non-transitory computer-readable storage medium stores computer-readable program code to receive an unmerged record set comprising a first plurality of data records; generate record-pairs from the first plurality of data records based upon a set of transitive deterministic matching criteria; apply a set of non-transitive matching rules to the record-pairs; and perform a partitioning operation on the record-pairs using a plurality of independent grouping operations, whereby a plurality of matched record groups is generated. The program code may further determine a set of maximal connected components from the plurality of matched record groups; perform a merge operation on the set of maximal connected components to generate a set of merged records, the set of merged records comprising a second plurality of data records fewer in number than the first plurality of data records; and send the merged records for storage in a non-transitory computer-readable storage medium.
Utility
28 Feb 2020
23 Feb 2021
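
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a blocking key stands in for the transitive deterministic criteria, a single pairwise predicate stands in for the non-transitive rules, and one union-find pass replaces the partitioning and connected-component steps. All function names, field names, and sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def generate_pairs(records, key):
    """Pair up records that share the same deterministic key (assumed blocking step)."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[key(rec)].append(idx)
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(ids, 2))
    return pairs


def connected_components(n, pairs):
    """Group record indices into maximal connected components with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


def merge_group(records, group):
    """Merge one component, keeping the first non-empty value per field (assumed policy)."""
    merged = {}
    for i in group:
        for field, value in records[i].items():
            if not merged.get(field):
                merged[field] = value
    return merged


def merge_records(records, key, rule):
    """Candidate pairs -> rule filter -> connected components -> merged records."""
    pairs = generate_pairs(records, key)
    matched = {(a, b) for a, b in pairs if rule(records[a], records[b])}
    components = connected_components(len(records), matched)
    return [merge_group(records, g) for g in components]


def compatible(a, b):
    """Hypothetical non-transitive rule: fields agree or at least one side is empty."""
    return all(not a[f] or not b[f] or a[f] == b[f] for f in ("email", "phone"))


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": ""},
    {"name": "Ann Lee", "email": "", "phone": "555-0100"},
    {"name": "Bob Ray", "email": "bob@example.com", "phone": ""},
]
merged = merge_records(records, key=lambda r: r["name"].lower(), rule=compatible)
```

The `compatible` rule is non-transitive because an empty field matches any value: a record with no email can match two records whose emails differ from each other. Taking connected components over the matched pairs is what pulls such chains into a single group before merging.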