Uber Technologies, Inc.
Machine Learned Structured Data Extraction From Document Image
Abstract:
A document transcription application receives an image of a document that comprises structured data. The document transcription application performs optical character recognition upon the image of the document to produce a block of text. The document transcription application applies the block of text to a first machine learning model to determine a heat map for a class of data in the structured data in the image of the document. The document transcription application applies the image of the document and the heat map to a second machine learning model to identify a region of the image of the document representing the class of data. The document transcription application generates, using the identified region and the block of text, a structured data file.
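The flow described in the abstract can be sketched as a short pipeline. This is only an illustration under stated assumptions, not the claimed implementation. Both machine learning models are replaced by trivial stand-ins (a regular-expression scorer and a threshold-based bounding box), the "image" is a toy dictionary that already carries its recognized words, and every name here (`Word`, `run_ocr`, `text_to_heat_map`, `locate_region`, `extract`, the `total` field) is hypothetical.

```python
import json
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


@dataclass
class Word:
    text: str
    box: Box


def run_ocr(image: dict) -> List[Word]:
    # Stand-in for a real OCR engine: the toy image already lists its words.
    return [Word(text, box) for text, box in image["words"]]


def text_to_heat_map(words: List[Word], width: int, height: int,
                     pattern: str) -> List[List[float]]:
    # Stand-in for the first model: a word whose text matches the class
    # pattern contributes a score of 1.0 over the area of its box.
    heat = [[0.0] * width for _ in range(height)]
    for word in words:
        if re.fullmatch(pattern, word.text):
            x0, y0, x1, y1 = word.box
            for y in range(y0, y1):
                for x in range(x0, x1):
                    heat[y][x] = 1.0
    return heat


def locate_region(image: dict, heat: List[List[float]],
                  threshold: float = 0.5) -> Optional[Box]:
    # Stand-in for the second model: the bounding box of all cells above
    # the threshold. A learned model would also use the image pixels;
    # this toy version ignores them.
    hot = [(x, y) for y, row in enumerate(heat)
           for x, value in enumerate(row) if value > threshold]
    if not hot:
        return None
    xs = [x for x, _ in hot]
    ys = [y for _, y in hot]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def overlaps(a: Box, b: Box) -> bool:
    # True when two half-open boxes share at least one cell.
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def extract(image: dict, field: str, pattern: str) -> str:
    # OCR -> heat map -> region -> structured data (serialized as JSON).
    words = run_ocr(image)
    heat = text_to_heat_map(words, image["width"], image["height"], pattern)
    region = locate_region(image, heat)
    value = " ".join(w.text for w in words
                     if region is not None and overlaps(w.box, region))
    return json.dumps({field: value})
```

A call such as `extract(image, "total", r"\$\d+\.\d{2}")` on a toy receipt would return a JSON string keyed by the hypothetical `total` field. The separation into two stages mirrors the abstract: the first stage works only from the recognized text, and the second combines its heat map with the image to settle on a region.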
Utility
1 Mar 2021
2 Sep 2021