International Business Machines Corporation
PRIORITY-BASED, ACCURACY-CONTROLLED INDIVIDUAL FAIRNESS OF UNSTRUCTURED TEXT
Abstract:
Methods, systems, and computer program products for priority-based, accuracy-controlled individual fairness of unstructured text are provided herein. A method includes identifying one or more samples in a set of data used to train a machine learning model having at least one attribute; generating counterfactual samples for each of the one or more identified samples; calculating scores for the one or more identified samples based at least in part on output of the machine learning model with respect to the counterfactual samples, wherein the scores indicate a relative level of bias between the one or more identified samples corresponding to the at least one attribute; creating an enhanced set of data at least in part by supplementing at least a portion of the identified samples with the corresponding counterfactual samples based on the calculated scores; and training the machine learning model using the enhanced set of data.
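The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration: the toy sentences, the `SWAPS` attribute vocabulary, the bag-of-words logistic model, the choice of absolute prediction gap as the bias score, the `top_k` selection rule, and the decision to give each counterfactual the label of its original sample.

```python
import math

# Hypothetical attribute vocabulary (assumption, not taken from the source).
SWAPS = {"he": "she", "she": "he"}

def counterfactual(text):
    """Generate a counterfactual by swapping each attribute term for its counterpart."""
    return " ".join(SWAPS.get(tok, tok) for tok in text.split())

def has_attribute(text):
    """Identify samples that mention the attribute at all."""
    return any(tok in SWAPS for tok in text.split())

def predict(weights, text):
    """Probability of the positive class under a bag-of-words logistic model."""
    z = sum(weights.get(tok, 0.0) for tok in text.split())
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.5):
    """Fit the toy model with full-batch gradient descent (stand-in for any trainer)."""
    weights = {}
    for _ in range(epochs):
        grad = {}
        for text, label in data:
            err = predict(weights, text) - label
            for tok in text.split():
                grad[tok] = grad.get(tok, 0.0) + err
        for tok, g in grad.items():
            weights[tok] = weights.get(tok, 0.0) - lr * g / len(data)
    return weights

def bias_score(weights, text):
    """Assumed score: gap between model output on a sample and on its counterfactual."""
    return abs(predict(weights, text) - predict(weights, counterfactual(text)))

def enhance(data, weights, top_k):
    """Rank identified samples by score and supplement the top_k with counterfactuals."""
    identified = [(t, y) for t, y in data if has_attribute(t)]
    ranked = sorted(identified, key=lambda s: bias_score(weights, s[0]), reverse=True)
    extra = [(counterfactual(t), y) for t, y in ranked[:top_k]]
    return data + extra, ranked

# Toy training set in which the attribute is spuriously correlated with the label.
data = [
    ("he is a brilliant engineer", 1),
    ("he is a great leader", 1),
    ("she is a terrible engineer", 0),
    ("she is a poor leader", 0),
    ("the movie was great", 1),
    ("the movie was terrible", 0),
]

model = train(data)                                # baseline model
enhanced, ranked = enhance(data, model, top_k=4)   # enhanced set of data
fair_model = train(enhanced)                       # retrain on the enhanced set

before = max(bias_score(model, t) for t, _ in ranked)
after = max(bias_score(fair_model, t) for t, _ in ranked)
print(f"max counterfactual gap before: {before:.4f}, after: {after:.2e}")
```

Here `top_k=4` supplements every identified sample, which makes the enhanced set symmetric in the swapped terms. A smaller `top_k` would supplement only the highest-scoring samples, which is one plausible reading of the priority-based selection in the title.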
Utility
28 Jan 2021
28 Jul 2022