International Business Machines Corporation
Extracting Facts from Unstructured Text

Abstract:

A system, computer program product, and method are provided for extraction of factual data from unstructured natural language (NL) text. A detection model is applied to convert unstructured NL text in a first language to annotated NL text. The detection model identifies two or more mentions from the unstructured NL text and a logical position of the mentions. The detection model further identifies a sequential position for each of the mentions and attaches a sequential position identifier. A pattern of rules corresponding with the annotated NL text is identified and applied to the annotated NL text, and one or more facts embedded within the annotated NL text are extracted and converted into structured data.
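The pipeline in the abstract (detect mentions, attach sequential position identifiers, apply a rule pattern, emit structured facts) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the application: the `LEXICON` lookup stands in for the detection model, a character offset stands in for the "logical position", and the single `RULES` entry is a hypothetical rule pattern.

```python
import re
from dataclasses import dataclass


@dataclass
class Mention:
    text: str    # surface form found in the input
    label: str   # mention type, e.g. "PERSON" or "ORG"
    start: int   # character offset (assumed stand-in for "logical position")
    seq: int     # sequential position identifier: 0, 1, 2, ... in text order


# Hypothetical toy "detection model": a fixed lookup table, not a trained model.
LEXICON = {"Alice": "PERSON", "Bob": "PERSON", "IBM": "ORG", "Acme": "ORG"}

# Hypothetical rule pattern: (label pair, connecting phrase, relation name).
RULES = [(("PERSON", "ORG"), "works for", "employed_by")]


def detect_mentions(text):
    """Annotate the text: find known mentions and number them in order."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, LEXICON)) + r")\b")
    return [
        Mention(m.group(1), LEXICON[m.group(1)], m.start(), i)
        for i, m in enumerate(pattern.finditer(text))
    ]


def extract_facts(text):
    """Apply the rules to sequentially adjacent mentions and return facts."""
    mentions = detect_mentions(text)
    facts = []
    for a, b in zip(mentions, mentions[1:]):  # pairs with seq n and n + 1
        between = text[a.start + len(a.text):b.start]
        for labels, phrase, relation in RULES:
            if (a.label, b.label) == labels and phrase in between:
                facts.append(
                    {"subject": a.text, "relation": relation, "object": b.text}
                )
    return facts


facts = extract_facts("Alice works for IBM. Bob works for Acme.")
```

In this sketch the sequential position identifier is what lets a rule reason about adjacent mentions without re-scanning the raw text; a real detection model and rule set would be considerably richer than a lookup table and a substring test.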

Status:

Application

Type:

Utility

Filing date:

30 Dec 2020

Issue date:

30 Jun 2022