Microsoft Corporation
IDENTIFYING NOISE IN VERBAL FEEDBACK USING ARTIFICIAL TEXT FROM NON-TEXTUAL PARAMETERS AND TRANSFER LEARNING
Abstract:
Methods and systems are provided for classifying free-text content using machine learning. Free-text content (e.g., customer feedback) is received together with parameter values organized according to a schema. A free-text corpus is generated, and an artificial-text corpus is generated by applying rules to the parameter values: the rules convert the parameter values into a finite set of words, and those words are concatenated into a fixed-sequence wordlist. Feature vectors (e.g., sentence embeddings) based on the free-text corpus and the artificial-text corpus are combined and forwarded to a machine learning model for classification. The machine learning model may be trained with a bias towards a specified metric (e.g., precision, recall, or F1 score). The model may also be trained using transfer learning, with training data drawn from a different category of free-text content (e.g., a different category of customer feedback).
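The artificial-text step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the parameter names (`rating`, `session_seconds`), the bucketing rules, the toy bag-of-words stand-in for a sentence embedding, and the choice of concatenation as the way to combine the two feature vectors.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical rules: each maps one parameter value to one word
# drawn from a small, finite vocabulary.
def rating_word(value: int) -> str:
    if value <= 2:
        return "rating_low"
    if value == 3:
        return "rating_mid"
    return "rating_high"


def session_word(seconds: float) -> str:
    return "session_short" if seconds < 60 else "session_long"


# Hypothetical schema: the order of this list fixes the word sequence.
RULES: List[Tuple[str, Callable]] = [
    ("rating", rating_word),
    ("session_seconds", session_word),
]


def artificial_text(params: Dict[str, float]) -> str:
    """Convert parameter values to words and join them in schema order."""
    return " ".join(rule(params[name]) for name, rule in RULES)


def toy_embedding(text: str, dim: int = 8) -> List[float]:
    """Stand-in for a sentence embedding: deterministic hashed token counts."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(ord(ch) for ch in token) % dim] += 1.0
    return vec


def combined_features(free_text: str, params: Dict[str, float]) -> List[float]:
    """Combine both feature vectors (here, by simple concatenation)."""
    return toy_embedding(free_text) + toy_embedding(artificial_text(params))


params = {"rating": 1, "session_seconds": 30.0}
print(artificial_text(params))  # rating_low session_short
features = combined_features("the app keeps crashing", params)
```

In a real pipeline the toy embedding would be replaced by an actual sentence-embedding model, and `features` would be passed to the classifier. Because the word order follows the schema rather than the input, records with the same parameter values always yield the same artificial text.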
Utility
17 Aug 2020
17 Feb 2022