Palantir Technologies Inc.
Systems and methods for automatic clustering and canonical designation of related data in various data structures

Abstract:

Computer-implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
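
The steps named in the abstract (pair the records, score each pair, merge overlapping pairs into clusters, pick a canonical name) can be illustrated with a minimal sketch. It is not the patented method: the `name` field, the `pair_probability` score (plain string similarity from Python's `difflib`), the 0.8 threshold, and the most-frequent-value rule for the canonical name are all assumptions chosen for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def pair_probability(a, b):
    """Hypothetical stand-in score in [0, 1]: similarity of the lowercased names."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def cluster_records(records, threshold=0.8):
    """Merge every pair scoring at or above the threshold; overlapping pairs
    end up in the same cluster (union-find over record indices)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if pair_probability(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


def canonical_name(cluster):
    """Assumed rule: the most frequent name wins; ties go to the longer name."""
    counts = Counter(r["name"] for r in cluster)
    return max(counts, key=lambda n: (counts[n], len(n)))


records = [
    {"name": "Acme Corp"},
    {"name": "Acme Corp"},
    {"name": "ACME Corp."},
    {"name": "Globex"},
]
for cluster in cluster_records(records):
    print(canonical_name(cluster), len(cluster))
```

Comparing every pair is quadratic in the number of records, so a sketch like this only suits small inputs; a real system would presumably restrict which pairs are compared and use a trained or calibrated probability rather than raw string similarity.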

Status:

Grant

Type:

Utility

Filing date:

13 Nov 2018

Issue date:

19 Jul 2022