International Business Machines Corporation
CORPUS DATA AUGMENTATION AND DEBIASING
Abstract:
Debiasing a machine learning model training corpus includes: identifying an attribute of input text selected from the training corpus, the attribute including one or more words of the input text and corresponding to an attribute class that encompasses different possible class values; recognizing bias in the input text with respect to the attribute class; and generating output text that corresponds to the attribute and imparts diversity with respect to the attribute class relative to the input text. Generating the output text uses an optimization function based on loss objectives to minimize loss in the generated output text as compared to the input text.
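The steps in the abstract can be illustrated with a toy sketch. This is not the claimed method: the lexicon, the function names, the two loss terms, and the weights are all hypothetical, bias recognition is reduced to checking whether an attribute word is present, and the "optimization" is an exhaustive search over single-word substitutions rather than a trained generator.

```python
# Hypothetical toy lexicon: word -> (attribute class, class value).
LEXICON = {
    "he": ("gender", "male"),
    "she": ("gender", "female"),
    "they": ("gender", "neutral"),
}


def identify_attributes(tokens):
    """Return (index, attribute class, class value) for each lexicon word."""
    return [
        (i, *LEXICON[t.lower()])
        for i, t in enumerate(tokens)
        if t.lower() in LEXICON
    ]


def candidates(tokens, attrs):
    """Yield the input itself, then variants with one attribute word
    swapped for another value of the same attribute class."""
    yield list(tokens)
    for i, cls, val in attrs:
        for word, (c, v) in LEXICON.items():
            if c == cls and v != val:
                out = list(tokens)
                out[i] = word
                yield out


def content_loss(src, out):
    """Assumed loss objective 1: fraction of token positions that changed."""
    return sum(a != b for a, b in zip(src, out)) / len(src)


def diversity_loss(src_attrs, out):
    """Assumed loss objective 2: 1.0 if the output carries the same
    class values as the input, 0.0 if at least one value differs."""
    out_values = [a[2] for a in identify_attributes(out)]
    src_values = [a[2] for a in src_attrs]
    return 1.0 if out_values == src_values else 0.0


def debias(text, w_content=1.0, w_diversity=1.0):
    """Pick the candidate that minimizes the weighted sum of both losses."""
    tokens = text.split()
    attrs = identify_attributes(tokens)
    if not attrs:  # no attribute word found, nothing to diversify
        return text
    best = min(
        candidates(tokens, attrs),
        key=lambda out: w_content * content_loss(tokens, out)
        + w_diversity * diversity_loss(attrs, out),
    )
    return " ".join(best)
```

Under these assumptions, `debias("he is a doctor")` returns `"she is a doctor"`: the unchanged input scores 1.0 (no diversity gained), while a single substitution scores 0.25 (one of four tokens changed). Text with no lexicon word is returned unchanged.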
Status:
Application
Type:
Utility
Filing date:
13 Jan 2021
Issue date:
14 Jul 2022