MEDIUMS, METHODS, AND SYSTEMS FOR CLASSIFYING COLUMNS OF A DATA STORE BASED ON CHARACTER LEVEL LABELING
Abstract:
Exemplary embodiments pertain to new techniques for classifying or labeling organized data. A major impediment to implementing high-quality machine learning is the lack of readily accessible labeled data. In some cases, data can be classified using a classifier, but these solutions can be inaccurate and slow. Exemplary embodiments address the problem of obtaining accurate labeled data in a timely manner by applying a classifier configured to operate on character-level embeddings. Among other advantages, this can help the classifier to recognize information contained within a data unit, such as a cell of a table. The classifier may operate within the organizational structure of the data, such as by operating across a particular row or column of a table. Because data within a particular row or column is often temporally organized (e.g., transactions that are logged in chronological order), row- or column-based approaches can yield more accurate results.
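The idea of labeling at the character level and then deciding per column can be illustrated with a small sketch. This is not the classifier described in the abstract: it substitutes hand-written character labels and a lookup table for learned character-level embeddings, and the names `char_label`, `cell_signature`, `RULES`, and `classify_column` are hypothetical, introduced only for illustration. It shows how information inside a cell (digits, letters, separators) can drive a label, and how voting across a column smooths over individual noisy cells.

```python
from collections import Counter


def char_label(ch: str) -> str:
    """Assign a coarse label to one character.

    Assumption: a toy stand-in for a learned character-level embedding.
    """
    if ch.isdigit():
        return "D"
    if ch.isalpha():
        return "A"
    if ch.isspace():
        return "S"
    return ch  # punctuation keeps its own identity, e.g. "-", "@", "."


def cell_signature(cell: str) -> str:
    """Collapse runs of identical character labels into one symbol each."""
    labels = []
    for ch in cell:
        label = char_label(ch)
        if not labels or labels[-1] != label:
            labels.append(label)
    return "".join(labels)


# Hypothetical signature-to-label table; a trained model would replace this.
RULES = {
    "D-D-D": "date",
    "A@A.A": "email",
    "D.D": "amount",
    "D": "integer",
}


def classify_column(cells):
    """Label a column by majority vote over its non-empty cells."""
    votes = Counter(
        RULES.get(cell_signature(cell), "unknown") for cell in cells if cell
    )
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    column = ["2021-09-10", "2022-03-10", "n/a"]
    print(classify_column(column))
```

In this sketch the stray `n/a` cell is outvoted by its neighbors, which is the intuition behind operating across a column rather than on isolated cells.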
Utility
10 Sep 2021
10 Mar 2022