International Business Machines Corporation
Confidence models based on error-to-correction mapping

Abstract:

A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions which are executed by the at least one processor and configure the processor to implement a document processing system. A spell check confidence component executing within the document processing system records a mapping of misspelled words to corrected words for a set of documents. The spell check confidence component generates an error-to-correction frequency model based on the mapping. A parser executing within the document processing system parses an input document to extract words in the error-to-correction frequency model. The spell check confidence component calculates a confidence score for each word in the input document found in the error-to-correction frequency model. The confidence score represents a probability that the extracted word is spelled correctly as intended in the input document. The document processing system generates a confidence model for the input document based on the confidence scores. The document processing system performs a natural language processing operation on the input document based on the confidence model.
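
The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the score is computed, so the formula used here (one minus the fraction of times a typed word was later corrected) is an assumption, as are the function names, the `word_counts` input, and the whitespace tokenizer standing in for the parser.

```python
from collections import Counter


def build_frequency_model(corrections, word_counts):
    """Build a hypothetical error-to-correction frequency model.

    corrections: iterable of (typed_word, corrected_word) pairs logged
        by a spell checker over a set of documents.
    word_counts: mapping typed_word -> total times that word was typed
        in the same documents (assumed to be available).
    Returns a mapping typed_word -> (times_corrected, times_typed).
    """
    corrected = Counter(typed for typed, _ in corrections)
    model = {}
    for word, n in corrected.items():
        # Guard against missing or inconsistent totals.
        total = max(word_counts.get(word, 0), n)
        model[word] = (n, total)
    return model


def confidence(word, model):
    """Assumed score: share of occurrences that were NOT corrected."""
    entry = model.get(word)
    if entry is None:
        return None  # word is not in the frequency model
    times_corrected, times_typed = entry
    return 1.0 - times_corrected / times_typed


def confidence_model(text, model):
    """Score every token of the input text that appears in the model."""
    scores = {}
    for token in text.lower().split():  # stand-in for the parser
        score = confidence(token, model)
        if score is not None:
            scores[token] = score
    return scores


if __name__ == "__main__":
    log = [("form", "from"), ("form", "from"), ("teh", "the")]
    totals = {"form": 8, "teh": 1}
    model = build_frequency_model(log, totals)
    print(confidence_model("Fill the form teh", model))
```

With this toy log, "form" was typed 8 times and corrected twice, so it scores 0.75, while "teh" was corrected every time and scores 0.0. A downstream natural language processing step could then weight or discard tokens according to these scores.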

Status:

Grant

Type:

Utility

Filing date:

10 Aug 2017

Issue date:

17 Aug 2021