Open Text Corporation
Systems and methods for image based content capture and extraction utilizing deep learning neural network and bounding box detection training techniques

Last updated: 28 Jul 2021

Abstract:

Systems, methods and computer program products for image recognition in which instructions are executable by a processor to dynamically generate simulated documents and corresponding images, which are then used to train a fully convolutional neural network. A plurality of document components are provided, and the processor selects subsets of the document components. The document components in each subset are used to dynamically generate a corresponding simulated document and a simulated document image. The convolutional neural network processes the simulated document image to produce a recognition output. Information corresponding to the document components from which the image was generated is used as an expected output. The recognition output and expected output are compared, and weights of the convolutional neural network are adjusted based on the differences between them.
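The training loop described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the component glyphs, the 8x12 image size, the single 3x3 convolution with a sigmoid (a toy stand-in for a fully convolutional network), the per-pixel "ink" mask used as the expected output, and all function names are hypothetical choices made for the example.

```python
import math
import random

# Hypothetical document components: name -> small binary glyph bitmap.
COMPONENTS = {
    "bar": [[1, 1, 1, 1]],
    "block": [[1, 1], [1, 1]],
    "stem": [[1], [1], [1]],
}
H, W = 8, 12  # assumed size of each simulated document image
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def make_simulated_document(rng):
    """Select a subset of components, render an image and its expected output."""
    chosen = rng.sample(sorted(COMPONENTS), k=rng.randint(1, len(COMPONENTS)))
    image = [[rng.uniform(0.0, 0.2) for _ in range(W)] for _ in range(H)]
    expected = [[0.0] * W for _ in range(H)]
    for name in chosen:
        glyph = COMPONENTS[name]
        top = rng.randrange(H - len(glyph) + 1)
        left = rng.randrange(W - len(glyph[0]) + 1)
        for r, row in enumerate(glyph):
            for c, ink in enumerate(row):
                if ink:
                    image[top + r][left + c] = rng.uniform(0.8, 1.0)
                    expected[top + r][left + c] = 1.0  # known from the components
    return image, expected


def forward(image, kernel, bias):
    """One zero-padded 3x3 convolution plus sigmoid: per-pixel recognition output."""
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            s = bias
            for dy, dx in OFFSETS:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    s += kernel[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = 1.0 / (1.0 + math.exp(-s))
    return out


def train_step(image, expected, kernel, bias, lr):
    """Compare output with expected output and adjust weights (cross-entropy)."""
    out = forward(image, kernel, bias)
    n = H * W
    grad_k = [[0.0] * 3 for _ in range(3)]
    grad_b, loss = 0.0, 0.0
    for y in range(H):
        for x in range(W):
            t = expected[y][x]
            p = min(max(out[y][x], 1e-9), 1.0 - 1e-9)
            loss -= (t * math.log(p) + (1.0 - t) * math.log(1.0 - p)) / n
            delta = (out[y][x] - t) / n  # gradient of the loss w.r.t. pre-sigmoid sum
            grad_b += delta
            for dy, dx in OFFSETS:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    grad_k[dy + 1][dx + 1] += delta * image[yy][xx]
    for i in range(3):
        for j in range(3):
            kernel[i][j] -= lr * grad_k[i][j]
    return loss, bias - lr * grad_b


def train(steps=300, lr=2.0, seed=0):
    """Generate a fresh simulated document per step and train on it."""
    rng = random.Random(seed)
    kernel = [[0.0] * 3 for _ in range(3)]
    bias, losses = 0.0, []
    for _ in range(steps):
        image, expected = make_simulated_document(rng)
        loss, bias = train_step(image, expected, kernel, bias, lr)
        losses.append(loss)
    return kernel, bias, losses
```

Because each training image is generated from known components, the expected output comes for free and no manual labeling is needed. A real system would presumably use a much deeper network and richer expected outputs (for example character classes or bounding boxes) than this per-pixel mask.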

Status:

Grant

Type:

Utility

Filing date:

13 Jul 2018

Issue date:

26 Jan 2021

Full patent description

Patent application document

Abstract:
