First American Financial Corporation
OCR error correction

Last updated:

Abstract:

Implementations of the disclosure are directed to OCR error correction systems and methods. In some implementations, a method comprises: obtaining, at a computing device, optical character recognition (OCR) text extracted from a document image, the text comprising a token; searching, at the computing device, based on a token bigram determined from the token and a mapping between words in a corpus and a corpus bigram set comprised of unique bigrams from the beginning or ending of the words in the corpus, the corpus for a best word to replace the token; and replacing, at the computing device, the token with the best word.

Status:
Grant
Type:

Utility

Filling date:

17 Jul 2020

Issue date:

19 Jan 2021