International Business Machines Corporation
Data lineage and data provenance enhancement

Last updated:

Abstract:

One embodiment of the invention provides a method for data lineage and data provenance enhancement. The method comprises arranging a data set into a logical ordering, and partitioning the data set into at least one set of partitions based on the logical ordering. The method further comprises, for each partition of the at least one set of partitions, determining a corresponding score for the partition, and determining a data similarity between the partition and each other partition of each other data set based on the corresponding score for the partition and another score corresponding to the other partition. The method further comprises determining data lineage of the data set based on each data similarity determined.
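
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the logical ordering, the score, or the similarity measure are defined. The sketch assumes the ordering is plain sort order, partitions have a fixed size, the score is a SHA-256 digest of a partition's rows, and similarity is 1.0 for equal scores and 0.0 otherwise. All function names (partition, score, similarity, lineage_overlap) are hypothetical.

```python
import hashlib


def partition(rows, size):
    """Arrange rows into a logical ordering, then split into fixed-size partitions.

    Assumption: the logical ordering is ordinary sort order.
    """
    ordered = sorted(rows)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def score(part):
    """Compute a score for one partition.

    Assumption: the score is a SHA-256 digest over the partition's rows.
    """
    digest = hashlib.sha256()
    for row in part:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\x00")  # separator so adjacent rows cannot run together
    return digest.hexdigest()


def similarity(score_a, score_b):
    """Assumption: two partitions are similar only when their scores are equal."""
    return 1.0 if score_a == score_b else 0.0


def lineage_overlap(data_set, other, size=2):
    """Fraction of partitions in data_set that match some partition in other.

    A high value is treated here as a hint that the two data sets are related.
    """
    scores = [score(p) for p in partition(data_set, size)]
    other_scores = [score(p) for p in partition(other, size)]
    if not scores:
        return 0.0
    matched = sum(
        1 for s in scores
        if any(similarity(s, o) == 1.0 for o in other_scores)
    )
    return matched / len(scores)
```

For example, lineage_overlap([3, 1, 2, 4], [4, 3, 2, 1, 9, 8]) compares the partitions [1, 2] and [3, 4] against [1, 2], [3, 4], and [8, 9]. An exact-match digest is the simplest possible choice; a score that tolerates small edits within a partition would need a different similarity function.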

Status:

Grant

Type:

Utility

Filing date:

10 Jan 2020

Issue date:

17 May 2022