International Business Machines Corporation
Ranking datasets based on data attributes
Abstract:
Ranking a group of datasets using a computer includes determining a set of target data fields from a set of process documents that indicate user data field preferences. The computer also determines a set of target dataset attributes from a set of data use documents that indicate user data scope preferences. The computer receives a plurality of metadata sets for an associated plurality of candidate datasets, each of which the computer has determined to have a field suitability value (FSV) exceeding a predetermined suitability threshold value. The FSV represents a degree of similarity between the set of fields associated with a dataset and the set of target data fields. The computer assesses the metadata sets with regard to the target dataset attributes and generates a compared attribute score for each candidate dataset. The compared attribute score indicates a degree of likelihood that the associated dataset will have content exhibiting the target dataset attributes. The computer ranks the candidate datasets based on the compared attribute score.
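The flow described in the abstract (filter datasets by FSV, score the survivors against the target attributes, then rank) can be illustrated with a minimal Python sketch. The abstract does not define how the FSV or the compared attribute score is computed, so this sketch assumes Jaccard similarity for the FSV and the fraction of matching target attributes for the score. All names (`DatasetMetadata`, `field_suitability`, `compared_attribute_score`, `rank_datasets`) and the sample data are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical metadata set for one dataset."""
    name: str
    fields: set
    attributes: dict = field(default_factory=dict)


def field_suitability(dataset_fields, target_fields):
    """Assumed FSV: Jaccard similarity between the two field sets."""
    union = dataset_fields | target_fields
    if not union:
        return 0.0
    return len(dataset_fields & target_fields) / len(union)


def compared_attribute_score(attributes, target_attributes):
    """Assumed score: fraction of target attributes matched by the metadata."""
    if not target_attributes:
        return 0.0
    matched = sum(
        1 for key, value in target_attributes.items()
        if attributes.get(key) == value
    )
    return matched / len(target_attributes)


def rank_datasets(metadata_sets, target_fields, target_attributes, threshold):
    """Keep datasets whose FSV exceeds the threshold, then rank by score."""
    candidates = [
        m for m in metadata_sets
        if field_suitability(m.fields, target_fields) > threshold
    ]
    scored = [
        (m.name, compared_attribute_score(m.attributes, target_attributes))
        for m in candidates
    ]
    # Highest compared attribute score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative inputs only; in the abstract these come from process
# documents (target fields) and data use documents (target attributes).
target_fields = {"customer_id", "region", "revenue"}
target_attributes = {"period": "2020", "geography": "EU"}

metadata_sets = [
    DatasetMetadata("sales_us", {"customer_id", "region", "revenue"},
                    {"period": "2020", "geography": "US"}),
    DatasetMetadata("sales_eu", {"customer_id", "region", "revenue", "date"},
                    {"period": "2020", "geography": "EU"}),
    DatasetMetadata("weather", {"station", "temp"},
                    {"period": "2020", "geography": "EU"}),
]

ranking = rank_datasets(metadata_sets, target_fields, target_attributes, 0.5)
print(ranking)  # [('sales_eu', 1.0), ('sales_us', 0.5)]
```

In this sketch, `weather` matches both target attributes but is excluded because its fields share nothing with the target fields, which shows why the FSV filter runs before attribute scoring.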
Utility
17 Dec 2020
6 Sep 2022