International Business Machines Corporation
EXPLANATIVE ANALYSIS FOR RECORDS WITH MISSING VALUES

Abstract:

Embodiments relate to a system, computer program product, and method for determining missing values in data records, with an explanatory analysis that provides context for the determined values. The method includes receiving a dataset that includes complete data records and incomplete data records that are missing predictors. A model is trained with the complete data records, and candidate predictors are generated for the missing predictors. A predictor importance value is generated for each candidate predictor, and the candidate predictors whose predictor importance value exceeds a first threshold value are promoted. The promoted candidate predictors are inserted into the respective incomplete data records, thereby creating tentative data records. The tentative data records are injected into the model, a fit value is determined for each tentative data record, and a tentative data record with a fit value exceeding a second threshold value is selected.
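
The flow described in the abstract can be illustrated with a minimal sketch. Nothing below comes from the application itself: the toy "model" (plain frequency counts over categorical records), the definitions of predictor importance and fit value, the function names, the sample data, and the threshold values are all assumptions chosen only to make the two-threshold selection concrete.

```python
# Illustrative sketch only: the model, scoring rules, names, and data are
# assumptions, not the method claimed in the application.

def train(complete_records):
    """Toy 'model': keeps the complete records so frequencies can be counted."""
    return list(complete_records)

def importance(model, field, value):
    """Assumed importance: share of complete records where `field` equals `value`."""
    return sum(r[field] == value for r in model) / len(model)

def fit(model, record, field):
    """Assumed fit value: mean frequency of the filled value among complete
    records that agree with `record` on each of its other fields."""
    scores = []
    for other, known in record.items():
        if other == field:
            continue
        matching = [r for r in model if r[other] == known]
        if matching:
            hits = sum(r[field] == record[field] for r in matching)
            scores.append(hits / len(matching))
    return sum(scores) / len(scores) if scores else 0.0

def impute(model, incomplete, field, importance_threshold, fit_threshold):
    """Return the best tentative record above both thresholds, or None."""
    # Candidate predictors: every value seen for the missing field.
    candidates = sorted({r[field] for r in model})
    # Promote candidates whose importance exceeds the first threshold.
    promoted = [v for v in candidates
                if importance(model, field, v) > importance_threshold]
    # Insert each promoted candidate to create tentative records.
    tentative = [{**incomplete, field: v} for v in promoted]
    # Score each tentative record and keep those above the second threshold.
    scored = [(fit(model, t, field), t) for t in tentative]
    passing = [(s, t) for s, t in scored if s > fit_threshold]
    if not passing:
        return None
    return max(passing, key=lambda pair: pair[0])[1]

complete = [
    {"region": "north", "plan": "basic", "churn": "no"},
    {"region": "north", "plan": "basic", "churn": "no"},
    {"region": "north", "plan": "premium", "churn": "no"},
    {"region": "south", "plan": "basic", "churn": "yes"},
    {"region": "south", "plan": "premium", "churn": "yes"},
    {"region": "south", "plan": "premium", "churn": "no"},
]
incomplete = {"region": "south", "plan": None, "churn": "yes"}

model = train(complete)
result = impute(model, incomplete, "plan",
                importance_threshold=0.2, fit_threshold=0.5)
print(result)
```

In this sketch the per-candidate importance and fit scores double as a simple form of explanation, since they show why one tentative record was preferred over another. A real embodiment would presumably use a trained predictive model rather than raw counts.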

Status:

Application

Type:

Utility

Filing date:

14 Sep 2020

Issue date:

17 Mar 2022