International Business Machines Corporation
Data generalization for predictive models

Last updated:

Abstract:

A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.

Status:
Grant
Type:

Utility

Filling date:

6 Aug 2019

Issue date:

22 Mar 2022