International Business Machines Corporation
Processing multiple data sets to generate a merged location-based data set
Abstract:
A computer system merges location-based data sets. Each of a plurality of data sets, at least two of which include information indicating a geographic location, is transformed into a standardized schema. The schemas of the plurality of data sets are combined by data set type to produce a resulting data set for each data set type. The schemas of a first data set and a second data set are joined to produce a merged data set, using a machine learning model to identify corresponding rows of the schemas. The schema of the merged data set is joined with the schemas of the resulting data sets for the data set types to produce a new data set. A resulting merged data set in the standardized schema is produced. Embodiments of the present invention further include a method and program product for merging location-based data sets in substantially the same manner described above.
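The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the field names (`name`, `lat`, `lon`, `category`), the function names, and the thresholds are all hypothetical, and the machine learning model is replaced by a simple stand-in that scores name similarity for rows lying close together.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def standardize(record, field_map):
    """Map a source record onto the (hypothetical) standardized schema.

    field_map maps each standardized field to its source column name;
    fields missing from the source become None.
    """
    return {std: record.get(src) for std, src in field_map.items()}


def combine_by_type(typed_data_sets):
    """Concatenate standardized data sets that share a data set type."""
    combined = defaultdict(list)
    for ds_type, rows in typed_data_sets:
        combined[ds_type].extend(rows)
    return dict(combined)


def match_score(a, b):
    """Stand-in for the machine learning model (an assumption of this sketch).

    Returns the name similarity when the two rows are within roughly
    0.001 degrees of each other, otherwise 0.0.
    """
    close = abs(a["lat"] - b["lat"]) < 0.001 and abs(a["lon"] - b["lon"]) < 0.001
    if not close:
        return 0.0
    name_a = (a["name"] or "").lower()
    name_b = (b["name"] or "").lower()
    return SequenceMatcher(None, name_a, name_b).ratio()


def join_matching_rows(first, second, threshold=0.8):
    """Join two standardized data sets on rows the scorer says correspond."""
    merged, used = [], set()
    for a in first:
        candidates = [
            (match_score(a, b), i) for i, b in enumerate(second) if i not in used
        ]
        score, idx = max(candidates, default=(0.0, None))
        if idx is not None and score >= threshold:
            used.add(idx)
            b = second[idx]
            # Prefer values from the first row; fill its gaps from the second.
            merged.append({k: a[k] if a[k] is not None else b.get(k) for k in a})
        else:
            merged.append(dict(a))
    # Keep rows of the second data set that had no counterpart.
    merged.extend(dict(b) for i, b in enumerate(second) if i not in used)
    return merged


def merge_all(first, second, typed_data_sets):
    """Join the first two data sets, then join the result with each per-type set."""
    merged = join_matching_rows(first, second)
    for rows in combine_by_type(typed_data_sets).values():
        merged = join_matching_rows(merged, rows)
    return merged
```

In this sketch every input is passed through `standardize` first, so all later joins operate on records with identical keys. Swapping `match_score` for a trained classifier would leave the rest of the pipeline unchanged.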
Utility
24 Apr 2020
14 Dec 2021