International Business Machines Corporation
VARIABLE-LENGTH WORD EMBEDDING
Last updated:
Abstract:
A data structure is used to configure and transform a computer machine learning system. The data structure has one or more records, where each record is a (vector) representation of a selected object in a corpus. One or more non-zero parameters in the records define the selected object, and the number of the non-zero parameters defines a word length of the record. One or more zero-value parameters are in one or more of the records. The word length of the object representation varies, e.g., can increase, as necessary to accurately represent the object within one or more contexts provided during training of a neural network used to create the database, e.g., as more and more contexts are introduced during the training. Only a minimum number of non-zero parameters is needed, and zero-value parameters can be clustered together and compressed to save large amounts of system storage and shorten execution times.
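The record layout described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names `word_length`, `compress`, and `decompress`, the example vectors, and the choice of run-length encoding as the compression scheme are all assumptions made for illustration. The sketch only shows how a record's word length could be counted from its non-zero parameters and how clustered zero-value parameters could be stored compactly.

```python
from itertools import groupby


def word_length(record):
    """Return the number of non-zero parameters in a record."""
    return sum(1 for value in record if value != 0.0)


def compress(record):
    """Run-length encode runs of zeros (assumed scheme, for illustration).

    Non-zero values are kept verbatim as ("V", value);
    each run of zeros becomes a single ("Z", run_length) entry.
    """
    encoded = []
    for is_zero, run in groupby(record, key=lambda v: v == 0.0):
        run = list(run)
        if is_zero:
            encoded.append(("Z", len(run)))
        else:
            encoded.extend(("V", v) for v in run)
    return encoded


def decompress(encoded):
    """Rebuild the full fixed-width record from its compressed form."""
    record = []
    for tag, payload in encoded:
        if tag == "Z":
            record.extend([0.0] * payload)
        else:
            record.append(payload)
    return record


# Hypothetical records: a word seen in many contexts uses more non-zero
# parameters than a word seen in few contexts.
embeddings = {
    "bank": [0.7, -0.2, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0],
    "the": [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}

lengths = {word: word_length(vec) for word, vec in embeddings.items()}
packed = {word: compress(vec) for word, vec in embeddings.items()}
```

In this sketch the zero-value parameters sit together at the end of each record, so every record compresses to its non-zero values plus one run entry, and `decompress` restores the original vector exactly.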
Utility
18 May 2020
18 Nov 2021