International Business Machines Corporation
Generation of domain specific type system
Abstract:
Embodiments provide a computer-implemented method in a data processing system including a processor and a memory, the memory including instructions that are executed by the processor to cause the processor to implement a system for generating a type system. The method includes: receiving a document corpus; identifying frequently occurring words from the document corpus, disregarding stop words; extracting a conceptual text for each frequently occurring word from a structured information database; performing a cluster analysis on each conceptual text to identify possible entity types; performing a frequency analysis on the possible entity types to select at least one entity type; identifying a relation type between entities in the document corpus; and generating the type system including the entity types and relation types.
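The first steps of the method in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`STOP_WORDS`, `CONCEPT_DB`, `frequent_words`, `candidate_type`, `select_entity_types`) is hypothetical, the structured information database is replaced by a small in-memory dictionary, and the cluster analysis step is replaced by a much simpler stand-in: a pattern match that reads the noun after "is a" in each conceptual text. Relation type identification is left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "and", "in", "to", "by", "for", "with"}

# Hypothetical stand-in for the structured information database:
# maps a word to a short conceptual text describing it.
CONCEPT_DB = {
    "aspirin": "aspirin is a drug used to reduce pain",
    "ibuprofen": "ibuprofen is a drug used to reduce inflammation",
    "headache": "headache is a symptom of many conditions",
}


def frequent_words(corpus, min_count=2):
    """Return words occurring at least min_count times, ignoring stop words."""
    counts = Counter(
        tok
        for doc in corpus
        for tok in re.findall(r"[a-z]+", doc.lower())
        if tok not in STOP_WORDS
    )
    return [word for word, count in counts.most_common() if count >= min_count]


def candidate_type(conceptual_text):
    """Assumed stand-in for cluster analysis: take the noun after 'is a'/'is an'."""
    match = re.search(r"\bis an? (\w+)", conceptual_text)
    return match.group(1) if match else None


def select_entity_types(corpus, min_count=2, top_n=1):
    """Frequency analysis over candidate types; keep the top_n most common."""
    type_counts = Counter()
    for word in frequent_words(corpus, min_count):
        text = CONCEPT_DB.get(word)
        if text is None:
            continue  # no conceptual text available for this word
        entity_type = candidate_type(text)
        if entity_type:
            type_counts[entity_type] += 1
    return [entity_type for entity_type, _ in type_counts.most_common(top_n)]


corpus = [
    "Aspirin relieves headache.",
    "Ibuprofen and aspirin treat headache.",
    "The patient took ibuprofen.",
]
print(select_entity_types(corpus))  # prints ['drug']
```

In this toy corpus, "aspirin", "headache", and "ibuprofen" each occur twice, so all three are looked up. Two of them map to the candidate type "drug" and one to "symptom", so the frequency analysis selects "drug".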
Utility
4 Jun 2018
24 Aug 2021