International Business Machines Corporation
AUTOMATICALLY EXTENDING A DOMAIN TAXONOMY TO THE LEVEL OF GRANULARITY PRESENT IN GLOSSARIES IN DOCUMENTS

Abstract:

A controller accesses an initial taxonomy for a domain, the taxonomy comprising one or more existing terms for the domain arranged in a hierarchical structure. The controller analyzes a corpus of documents for the domain to identify a selection of one or more documents with glossaries. The controller extracts, from the glossaries, one or more pairs, each comprising a term and a definition. To generate an updated taxonomy for the domain, the controller attempts to map the term of each pair into the initial taxonomy based on the text of that pair's definition.
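The flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed method: the taxonomy contents, the "Glossary" heading convention, the `Term: definition` line format, and the word-overlap mapping heuristic are all assumptions made for illustration, as are the function names `extract_glossary_pairs`, `map_term`, and `extend_taxonomy`.

```python
import re

# Hypothetical initial taxonomy: each term maps to its parent (None marks the root).
initial_taxonomy = {
    "insurance": None,
    "policy": "insurance",
    "claim": "insurance",
}


def extract_glossary_pairs(document_text):
    """Return (term, definition) pairs found after a line reading 'Glossary'.

    Assumes glossary entries are lines shaped like 'Term: definition'.
    Documents with no 'Glossary' heading yield no pairs.
    """
    pairs = []
    in_glossary = False
    for line in document_text.splitlines():
        stripped = line.strip()
        if stripped.lower() == "glossary":
            in_glossary = True
            continue
        if not in_glossary:
            continue
        match = re.match(r"^([^:]+):\s*(.+)$", stripped)
        if match:
            pairs.append((match.group(1).strip().lower(), match.group(2).strip()))
    return pairs


def map_term(definition, taxonomy):
    """Pick as parent the existing single-word term mentioned most often
    in the definition text; return None when no existing term appears."""
    words = re.findall(r"[a-z]+", definition.lower())
    best_parent, best_count = None, 0
    for existing in taxonomy:
        count = words.count(existing)
        if count > best_count:
            best_parent, best_count = existing, count
    return best_parent


def extend_taxonomy(taxonomy, corpus):
    """Attempt to place each glossary term under an existing term."""
    updated = dict(taxonomy)
    for document_text in corpus:
        for term, definition in extract_glossary_pairs(document_text):
            if term in updated:
                continue  # already present in the taxonomy
            parent = map_term(definition, updated)
            if parent is not None:  # the mapping attempt may fail
                updated[term] = parent
    return updated


# Hypothetical two-document corpus; only the second document has a glossary.
corpus = [
    "Intro text without a glossary.",
    "Glossary\n"
    "Deductible: The amount paid by the holder before a claim is settled.\n"
    "Premium: The price of a policy.\n"
    "Actuary: A person who computes risk.",
]

updated_taxonomy = extend_taxonomy(initial_taxonomy, corpus)
```

In this sketch a term whose definition mentions no existing taxonomy term is simply left out, which mirrors the abstract's wording that the controller "attempts" the mapping. A real system would likely replace the word-count heuristic with a stronger text-matching technique.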

Status:

Application

Type:

Utility

Filing date:

21 Feb 2020

Issue date:

26 Aug 2021