International Business Machines Corporation
Data curation for corpus enrichment

Abstract:

Techniques for data curation are provided. A data set is received for ingestion into a question answering system, where the data set includes a first question and a first answer. Relevance of the first question is validated by comparing the first question to a first question cluster in the question answering system, and it is determined that the first answer satisfies predefined security criteria. The data set is evaluated to identify a set of references, and a generalized data set is generated by replacing each respective reference of the set of references with a corresponding entity identifier. The generalized data set is then ingested into the question answering system.
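
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the question is compared to a cluster, what the security criteria are, or how references are identified, so everything here is an assumption made for illustration: the token-overlap (Jaccard) relevance check and its threshold, the regex deny-list used as "security criteria", the lookup table of references, and all names (QAPair, is_relevant, is_secure, generalize, curate).

```python
import re
from dataclasses import dataclass


@dataclass
class QAPair:
    """Hypothetical container for one question/answer data set."""
    question: str
    answer: str


def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_relevant(question, cluster, threshold=0.3):
    """Assumed relevance check: Jaccard token overlap with any cluster question."""
    q = tokens(question)
    for known in cluster:
        k = tokens(known)
        union = q | k
        if union and len(q & k) / len(union) >= threshold:
            return True
    return False


# Assumed stand-in for "predefined security criteria": a small deny-list.
BLOCKED_PATTERNS = (
    re.compile(r"password\s*[:=]", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
)


def is_secure(answer):
    """True when no deny-list pattern occurs in the answer."""
    return not any(p.search(answer) for p in BLOCKED_PATTERNS)


def generalize(text, entities):
    """Replace each known reference with its entity identifier.

    Longer references are replaced first so that a shorter reference
    cannot split a longer one that contains it.
    """
    for ref in sorted(entities, key=len, reverse=True):
        ident = entities[ref]
        text = re.sub(re.escape(ref), lambda _m, i=ident: i, text,
                      flags=re.IGNORECASE)
    return text


def curate(pair, cluster, entities):
    """Return a generalized QAPair ready for ingestion, or None if rejected."""
    if not is_relevant(pair.question, cluster):
        return None
    if not is_secure(pair.answer):
        return None
    return QAPair(generalize(pair.question, entities),
                  generalize(pair.answer, entities))


if __name__ == "__main__":
    cluster = ["How do I reset my router?", "Why is my router offline?"]
    entities = {"Acme X200": "<PRODUCT>", "Jane Doe": "<PERSON>"}
    pair = QAPair("How do I reset my Acme X200 router?",
                  "Jane Doe suggests holding the reset button for ten seconds.")
    print(curate(pair, cluster, entities))
```

In this sketch, a pair that fails either the relevance check or the security check is dropped (curate returns None) and only the generalized pair would be passed on for ingestion. A real system would likely use a learned similarity measure and a named-entity recognizer in place of the token overlap and the lookup table.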

Status:

Grant

Type:

Utility

Filing date:

17 Oct 2019

Issue date:

6 Sep 2022