Microsoft Corporation
SYSTEM AND METHOD FOR IMPROVING MACHINE LEARNING MODELS BASED ON CONFUSION ERROR EVALUATION
Abstract:
Embodiments described herein are directed to improving machine learning (ML) model-based techniques for automatically labeling data items by identifying and resolving problematic labels. An ML model may be trained to predict labels for any given data item. The ML model may be validated to determine a confusion metric with respect to each distinct pair of labels predicted by the ML model. Each confusion metric indicates how a particular label is being mistaken for another particular label. The confusion metrics are analyzed to determine whether any of the ML model-generated labels are problematic (e.g., a label that conflicts with another label, a label that is rarely predicted, a label that is incorrectly predicted, etc.). Steps for resolving the problematic labels are implemented, and the ML model is retrained based on the resolution steps. As a result, the retrained ML model generates more accurate labels for data items.
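The validation and analysis steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the confusion metric or any thresholds, so everything here is an assumption made for illustration: the metric is taken to be the fraction of items with a given true label that the model predicted as a different label, and the function names (`confusion_metrics`, `find_problematic_labels`), the threshold values, and the sample labels are hypothetical.

```python
from collections import Counter


def confusion_metrics(true_labels, predicted_labels):
    """Return {(true, predicted): rate} for every observed mislabeled pair.

    Assumed definition: rate = (items with label `true` predicted as
    `predicted`) / (all items with label `true`). Correct predictions
    (true == predicted) are excluded.
    """
    pair_counts = Counter(zip(true_labels, predicted_labels))
    true_counts = Counter(true_labels)
    return {
        (t, p): count / true_counts[t]
        for (t, p), count in pair_counts.items()
        if t != p
    }


def find_problematic_labels(true_labels, predicted_labels,
                            conflict_threshold=0.3, rare_threshold=0.05):
    """Flag label pairs that are often confused and labels rarely predicted.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    total = len(predicted_labels)
    if total == 0:
        return {"conflicting_pairs": [], "rarely_predicted": []}

    metrics = confusion_metrics(true_labels, predicted_labels)
    conflicting = sorted(
        pair for pair, rate in metrics.items() if rate >= conflict_threshold
    )

    predicted_counts = Counter(predicted_labels)
    all_labels = set(true_labels) | set(predicted_labels)
    rare = sorted(
        label for label in all_labels
        if predicted_counts[label] / total < rare_threshold
    )
    return {"conflicting_pairs": conflicting, "rarely_predicted": rare}


# Hypothetical validation data: ground-truth labels and model predictions.
true = ["bug"] * 4 + ["feature"] * 4 + ["question"] * 2
pred = ["bug", "bug", "feature", "feature",
        "feature", "feature", "feature", "feature",
        "bug", "feature"]

report = find_problematic_labels(true, pred)
print(report)
```

In this sketch, a flagged pair or label would be a candidate for a resolution step (for example, merging two labels or collecting more training data for a rare one) before the model is retrained; which step applies is left open here, as the abstract does not enumerate them.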
Utility
27 Jan 2020
29 Jul 2021