Wipro Limited
System and method for extracting tabular data from a document

Abstract:

The present invention relates to a method for extracting tabular data from a document. The method includes identifying a bordered table or a borderless table in a received document and an image of the document. Tabular data in an identified bordered table is extracted using a first set and a second set of pixel coordinates selected from a plurality of pixel coordinates. When a borderless table is identified in the document, a first set of document coordinates of at least one row of the borderless table is determined, followed by a second set of document coordinates of at least one column corresponding to the at least one row. The tabular data in the identified borderless table is then extracted from the document based on the determined first and second sets of document coordinates.
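
The borderless-table steps in the abstract (row coordinates first, then column coordinates, then cell extraction) can be illustrated with a minimal sketch. This is not the patented method, whose details the abstract does not give. It substitutes a simple projection-gap heuristic and assumes each word already comes with a bounding box in document coordinates. The names `merge_intervals`, `extract_borderless_table`, and `col_gap` are hypothetical.

```python
from typing import List, Tuple

# Assumed input format: (x0, y0, x1, y1, text) in document coordinates.
Word = Tuple[float, float, float, float, str]
Interval = Tuple[float, float]


def merge_intervals(intervals: List[Interval], gap: float = 0.0) -> List[Interval]:
    """Merge 1-D intervals that overlap or lie within `gap` of each other."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def extract_borderless_table(words: List[Word], col_gap: float = 5.0) -> List[List[str]]:
    """Group words into a grid of cells using row and column extents."""
    # First set of coordinates: vertical extent (y-range) of each row.
    rows = merge_intervals([(w[1], w[3]) for w in words])
    # Second set of coordinates: horizontal extent (x-range) of each column.
    # Words closer than `col_gap` are assumed to belong to the same column.
    cols = merge_intervals([(w[0], w[2]) for w in words], gap=col_gap)

    table = [["" for _ in cols] for _ in rows]
    # Visit words top-to-bottom, left-to-right so cell text keeps reading order.
    for x0, y0, x1, y1, text in sorted(words, key=lambda w: (w[1], w[0])):
        r = next(i for i, (a, b) in enumerate(rows) if a <= y0 and y1 <= b)
        c = next(j for j, (a, b) in enumerate(cols) if a <= x0 and x1 <= b)
        table[r][c] = (table[r][c] + " " + text).strip()
    return table


# Illustrative data only, not taken from the patent.
words = [
    (10, 10, 40, 20, "Item"), (100, 10, 130, 20, "Qty"),
    (10, 30, 30, 40, "Blue"), (33, 30, 60, 40, "pen"), (100, 30, 110, 40, "2"),
]
table = extract_borderless_table(words)
```

With the sample words above, "Blue" and "pen" fall into the same cell because the gap between them is smaller than `col_gap`. A real document would likely need a threshold tuned to its font size and layout.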

Status:

Grant

Type:

Utility

Filing date:

5 Nov 2019

Issue date:

29 Jun 2021