International Business Machines Corporation
DATA MODEL PROCESSING IN MACHINE LEARNING EMPLOYING FEATURE SELECTION USING SUB-POPULATION ANALYSIS

Last updated: 24 Nov 2021

Abstract:

A computer system selects features of a dataset for predictive modeling. A first set of features that are relevant to outcome is selected from a dataset comprising a plurality of cases and controls. A subset of cases and controls having similar values for the first set of features is identified. The subset is analyzed to select a set of additional features relevant to outcome. A first and a second predictive model are evaluated to determine that the second predictive model more accurately predicts outcome, wherein the first predictive model is based on the first set of features and the second predictive model is based on the first set of features and the additional features. The second predictive model is utilized to predict outcomes. Embodiments of the present invention further include a method and program product for selecting features of a dataset for predictive modeling in substantially the same manner described above.
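The steps in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the toy dataset with binary features `a`, `b`, `c`, the relevance score (difference in feature means between cases and controls), the 0.5 threshold, the exact-match definition of "similar values", the lookup-table classifier, and the leave-one-out evaluation. The application does not specify any of these choices.

```python
from collections import Counter


def make_rows():
    """Hypothetical toy data: the outcome is 1 only when a == 1 and b == 1."""
    counts = {(0, 0): 2, (0, 1): 6, (1, 0): 2, (1, 1): 2}
    rows = []
    for (a, b), n in counts.items():
        for i in range(n):
            rows.append({"a": a, "b": b, "c": i % 2,
                         "outcome": int(a == 1 and b == 1)})
    return rows


def relevance(rows, feature):
    """Assumed score: |mean of feature in cases - mean of feature in controls|."""
    cases = [r[feature] for r in rows if r["outcome"] == 1]
    controls = [r[feature] for r in rows if r["outcome"] == 0]
    if not cases or not controls:
        return 0.0
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def select_features(rows, candidates, threshold=0.5):
    """Keep the candidates whose relevance score reaches the threshold."""
    return [f for f in candidates if relevance(rows, f) >= threshold]


def similar_subset(rows, features):
    """Cases and controls that share the most common case profile on `features`."""
    profiles = Counter(tuple(r[f] for f in features)
                       for r in rows if r["outcome"] == 1)
    target = profiles.most_common(1)[0][0]
    return [r for r in rows if tuple(r[f] for f in features) == target]


def predict(train, features, row):
    """Majority outcome among training rows with identical feature values.

    Ties and unseen profiles fall back to predicting a control (0).
    """
    key = tuple(row[f] for f in features)
    votes = Counter(r["outcome"] for r in train
                    if tuple(r[f] for f in features) == key)
    return 1 if votes[1] > votes[0] else 0


def loo_accuracy(rows, features):
    """Leave-one-out accuracy of the lookup-table model on `features`."""
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, features, row) == row["outcome"]
    return hits / len(rows)


def run_pipeline(rows, candidates):
    # Step 1: first set of features, selected on the whole dataset.
    first = select_features(rows, candidates)
    # Step 2: sub-population with similar values for the first set.
    subset = similar_subset(rows, first)
    # Step 3: additional features that are relevant inside the sub-population.
    remaining = [f for f in candidates if f not in first]
    additional = select_features(subset, remaining)
    # Step 4: compare a model on the first set with one on both sets.
    acc_first = loo_accuracy(rows, first)
    acc_second = loo_accuracy(rows, first + additional)
    chosen = first + additional if acc_second > acc_first else first
    return first, additional, acc_first, acc_second, chosen


if __name__ == "__main__":
    print(run_pipeline(make_rows(), ["a", "b", "c"]))
```

In this toy data, feature `b` falls below the threshold on the full dataset because many controls also have `b == 1`, but it separates cases from controls cleanly once the rows are restricted to `a == 1`. That is the situation the sub-population analysis is meant to surface: a feature whose relevance is masked globally but visible among otherwise similar cases and controls.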

Status:

Application

Type:

Utility

Filing date:

30 Apr 2020

Issue date:

4 Nov 2021

Full patent description

Patent application document
