SAP SE
MACHINE LEARNING ENABLED TEXT ANALYSIS WITH SUPPORT FOR UNSTRUCTURED DATA

Abstract:

A method for analyzing an electronic document including structured data and unstructured data may include applying a machine learning model to determine whether one or more rows of the electronic document correspond to a header row. The machine learning model may be trained to determine whether one or more cells of a row correspond to a header field by determining whether a text value included in each cell corresponds to an entity. A row may be identified as a header row based on an output of the machine learning model indicating that more than a threshold quantity of the cells included in the row correspond to a header field. At least a portion of the structured data included in the electronic document may be extracted based on the entities included in the cells of the row identified as the header row. Related systems and computer program products are also provided.
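
The header-row test described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the trained machine learning model is replaced by a hypothetical vocabulary lookup (`KNOWN_ENTITIES`, `cell_is_header_field`), and the function names, the sample document, and the threshold value are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical entity vocabulary. It stands in for the trained model,
# which the abstract does not describe in detail.
KNOWN_ENTITIES = {"invoice number", "date", "amount", "customer", "currency"}


def cell_is_header_field(text: str) -> bool:
    """Stand-in classifier: does the cell text name a known entity?"""
    return text.strip().lower() in KNOWN_ENTITIES


def find_header_row(
    rows: list[list[str]],
    threshold: int,
    classify: Callable[[str], bool] = cell_is_header_field,
) -> Optional[int]:
    """Return the index of the first row in which more than `threshold`
    cells are classified as header fields, or None if no row qualifies."""
    for index, row in enumerate(rows):
        hits = sum(1 for cell in row if classify(cell))
        if hits > threshold:
            return index
    return None


def extract_records(rows: list[list[str]], header_index: int) -> list[dict[str, str]]:
    """Key each row below the header by the entity named in the header cell."""
    header = [cell.strip().lower() for cell in rows[header_index]]
    return [dict(zip(header, row)) for row in rows[header_index + 1:]]


# Assumed sample document: one unstructured line followed by a table.
document = [
    ["Payment advice from ACME Corp", "", ""],
    ["Invoice Number", "Date", "Amount"],
    ["INV-001", "2021-01-04", "100.00"],
    ["INV-002", "2021-01-05", "250.50"],
]

header_index = find_header_row(document, threshold=2)
records = extract_records(document, header_index) if header_index is not None else []
```

In this sketch the first row is skipped because none of its cells match an entity, while the second row has three matching cells, which exceeds the assumed threshold of two. A real system would pass a model-backed function as `classify` in place of the lookup.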

Status:

Application

Type:

Utility

Filing date:

4 Jan 2021

Issue date:

7 Jul 2022