Datawatch Corporation
SYSTEMS AND METHODS FOR GENERATING TABLES FROM PRINT-READY DIGITAL SOURCE DOCUMENTS

Abstract:

Systems and methods are provided for generating tables from print-ready digital source documents. A document is received, and one or more text fragments are identified on a rendered page of the document. A wrapping region collection comprising one or more wrapping regions is generated. A tabular score, a narrative score, and a label score are generated for each wrapping region, and a block type is assigned to each wrapping region based on these scores. A wrapping region group and a block set comprising one or more blocks are generated. One or more tables are generated based on the text fragments corresponding to one of the blocks, and the text fragments are organized into corresponding fields of the one or more tables.
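
The scoring and block-type assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the scores are computed, so everything here is an assumption made for illustration: the `WrappingRegion` class, the `score_region` and `assign_block_type` functions, and the three toy heuristics (share of numeric fragments, share of long fragments, share of fragments ending in a colon) are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WrappingRegion:
    # Hypothetical container: the text fragments found in one region
    # of a rendered page.
    fragments: List[str]


def _is_number(text: str) -> bool:
    # Treat a fragment as numeric if it parses as a float once
    # thousands separators are removed.
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False


def score_region(region: WrappingRegion) -> Dict[str, float]:
    # Toy heuristics (assumptions, not the claimed method); each score
    # is the fraction of fragments matching a simple pattern.
    n = len(region.fragments) or 1  # avoid division by zero
    tabular = sum(_is_number(f) for f in region.fragments) / n
    narrative = sum(len(f.split()) >= 6 for f in region.fragments) / n
    label = sum(f.rstrip().endswith(":") for f in region.fragments) / n
    return {"tabular": tabular, "narrative": narrative, "label": label}


def assign_block_type(scores: Dict[str, float]) -> str:
    # Pick the block type with the highest score; on a tie, the first
    # key in the dictionary wins.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    regions = [
        WrappingRegion(["12.50", "3", "1,200"]),
        WrappingRegion(["Invoice number:", "Date:"]),
        WrappingRegion(["The quick brown fox jumps over the lazy dog."]),
    ]
    for region in regions:
        print(assign_block_type(score_region(region)), region.fragments)
```

Taking the highest of the three scores is only one possible decision rule; a real system might instead apply thresholds or consider neighbouring regions before assigning a block type.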

Status:

Application

Type:

Utility

Filing date:

9 May 2019

Issue date:

29 Aug 2019