International Business Machines Corporation
FEATURE ENHANCEMENT VIA UNSUPERVISED LEARNING OF EXTERNAL KNOWLEDGE EMBEDDING
Abstract:
A method, computer system, and computer program product for enhancing feature engineering based on unsupervised learning of associated external knowledge embeddings are provided. The embodiment may include receiving, by a processor, input data as a table and a name of a column. The embodiment may also include analyzing the column to identify multisets of concepts or sequences of concepts. The embodiment may further include automatically expanding the column by linking the identified multisets or sequences of concepts with corresponding concepts in an external knowledge graph. The embodiment may also include training a neural network to learn embedding vectors of the concept multisets in the expanded column of the table, wherein the training is unsupervised, with no data labels provided, and the neural network learns an embedding of the multisets of concepts with the objective of minimizing a reconstruction error of the identified multisets of concepts.
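The steps named in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not specify a network architecture, a linking procedure, or a knowledge graph, so the small NumPy autoencoder below is a stand-in, and every name in it (`TOY_KG`, `expand_cell`, `vectorize`, `train_autoencoder`, the `drugs` column, and the toy rows) is a hypothetical chosen for illustration. It assumes each cell holds a comma-separated list of concepts and that linking is an exact string lookup.

```python
from collections import Counter

import numpy as np

# Hypothetical toy knowledge graph (an assumption): concept -> related concepts.
TOY_KG = {
    "aspirin": ["nsaid", "analgesic"],
    "ibuprofen": ["nsaid", "analgesic"],
    "insulin": ["hormone"],
    "metformin": ["antidiabetic"],
}


def expand_cell(cell, kg):
    """Parse a cell into a multiset of concepts and add their linked KG concepts."""
    concepts = Counter(t.strip().lower() for t in cell.split(",") if t.strip())
    expanded = Counter(concepts)
    for concept, count in concepts.items():
        for neighbor in kg.get(concept, []):
            expanded[neighbor] += count
    return expanded


def vectorize(multisets):
    """Encode each multiset as a count vector over the shared concept vocabulary."""
    vocab = sorted({c for m in multisets for c in m})
    index = {c: i for i, c in enumerate(vocab)}
    x = np.zeros((len(multisets), len(vocab)))
    for row, m in enumerate(multisets):
        for concept, count in m.items():
            x[row, index[concept]] = count
    return x, vocab


def train_autoencoder(x, dim=2, epochs=3000, lr=0.05, seed=0):
    """Learn embeddings without labels by minimizing mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    w_enc = rng.normal(0.0, 0.1, (x.shape[1], dim))
    w_dec = rng.normal(0.0, 0.1, (dim, x.shape[1]))
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w_enc)            # embedding of each multiset
        err = h @ w_dec - x               # reconstruction minus input
        losses.append(float(np.mean(err ** 2)))
        g = 2.0 * err / err.size          # gradient of the loss w.r.t. the reconstruction
        grad_dec = h.T @ g
        grad_enc = x.T @ ((g @ w_dec.T) * (1.0 - h ** 2))
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return np.tanh(x @ w_enc), losses


# Input: a table (list of rows) and the name of the column to expand.
table = [
    {"id": 1, "drugs": "aspirin, ibuprofen"},
    {"id": 2, "drugs": "aspirin"},
    {"id": 3, "drugs": "insulin, metformin"},
    {"id": 4, "drugs": "metformin"},
]
column = "drugs"

multisets = [expand_cell(row[column], TOY_KG) for row in table]
x, vocab = vectorize(multisets)
embeddings, losses = train_autoencoder(x)
```

Each row of `embeddings` could then be appended to the table as additional feature columns. Count vectors are used rather than binary indicators so that multiplicity in the multiset is preserved; handling sequences of concepts would require an order-aware encoder, which this sketch does not attempt.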
Utility
17 Nov 2020
19 May 2022