International Business Machines Corporation
Descriptor Uniqueness for Entity Clustering
Last updated:
Abstract:
A mechanism is provided in a data processing system to implement a cognitive natural language processing (NLP) system with descriptor uniqueness identification to support named entity mention clustering. The mechanism annotates a set of documents from a corpus of documents for entity types and mentions, collects descriptor usages from all documents in the corpus of documents, analyzes the descriptor usages to classify the descriptors as base terms or modifier terms, generates compatibility scores for the descriptors, and performs entity merging of entity clusters based on the compatibility scores.
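The pipeline summarized in the abstract (collect descriptor usages, classify descriptors as base or modifier terms, score compatibility, merge clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the hand-written `MENTIONS` data, the head-position heuristic in `classify`, the fixed score values in `compatibility`, and the 0.5 threshold in `merge` are all hypothetical.

```python
from collections import Counter

# Hypothetical annotated mentions: (entity_type, descriptor tokens).
# A real system would obtain these from an NLP annotator.
MENTIONS = [
    ("PERSON", ("senior", "engineer")),
    ("PERSON", ("engineer",)),
    ("PERSON", ("chief", "engineer")),
    ("PERSON", ("senior", "analyst")),
    ("PERSON", ("analyst",)),
]

def collect_usages(mentions):
    """Count each token's appearances in final (head) vs. earlier positions."""
    head, non_head = Counter(), Counter()
    for _type, tokens in mentions:
        head[tokens[-1]] += 1
        for tok in tokens[:-1]:
            non_head[tok] += 1
    return head, non_head

def classify(head, non_head):
    """Assumed heuristic: tokens seen at least as often as head are base terms."""
    return {
        tok: "base" if head[tok] >= non_head[tok] else "modifier"
        for tok in set(head) | set(non_head)
    }

def compatibility(a, b, labels):
    """Assumed scoring: base terms must overlap; modifiers refine the score."""
    base_a = {t for t in a if labels[t] == "base"}
    base_b = {t for t in b if labels[t] == "base"}
    if not base_a & base_b:
        return 0.0  # different base terms: incompatible
    mod_a, mod_b = set(a) - base_a, set(b) - base_b
    if mod_a & mod_b or (not mod_a and not mod_b):
        return 1.0  # shared modifier, or no modifiers on either side
    if not mod_a or not mod_b:
        return 0.6  # one descriptor is simply less specific
    return 0.2      # conflicting modifiers on the same base term

def merge(descriptors, labels, threshold=0.5):
    """Greedy merge: join the first cluster whose every member is compatible."""
    clusters = []
    for d in descriptors:
        for cluster in clusters:
            if all(compatibility(d, m, labels) >= threshold for m in cluster):
                cluster.append(d)
                break
        else:
            clusters.append([d])
    return clusters

head, non_head = collect_usages(MENTIONS)
labels = classify(head, non_head)
clusters = merge([tokens for _type, tokens in MENTIONS], labels)
```

In this toy run, "engineer" joins "senior engineer" because one descriptor is merely less specific, while "chief engineer" stays separate because its modifier conflicts with "senior". Requiring compatibility with every cluster member is a design choice of the sketch; it avoids chaining two conflicting descriptors together through a generic one.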
Status:
Application
Type:
Utility
Filing date:
17 Feb 2020
Issue date:
19 Aug 2021