SAP SE
VECTORIZATION OF STRUCTURED DOCUMENTS WITH MULTI-MODAL DATA
Abstract:
Methods, systems, and computer-readable storage media for receiving structured data including a set of columns and a set of rows; determining, for each column, a column width defining a number of characters; providing, for each row, a set of padded values, each padded value corresponding to a column and including a value and one or more padding characters, the value and the one or more padding characters collectively having a length equal to the respective column width; defining a set of strings by, for each row, concatenating the padded values in the set of padded values to provide a string; and training a machine learning (ML) model by providing, for each string in the set of strings, an embedding as an abstract representation of a record of the respective row and processing the embedding through an attention layer of the ML model.
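The padding and concatenation steps described in the abstract can be illustrated with a minimal Python sketch. Several details are assumptions, not taken from the abstract: the column width is set to the length of the longest value in the column, padding is appended on the right, and the padding character is configurable. The function names (column_widths, pad_row, rows_to_strings) and the sample rows are hypothetical. The embedding and attention-layer steps are not shown.

```python
from typing import List, Sequence

# Assumption: the abstract does not say which padding character is used.
PAD_CHAR = " "


def column_widths(rows: Sequence[Sequence[object]]) -> List[int]:
    """Return one width per column, measured in characters.

    Assumed rule: a column is as wide as its longest value.
    """
    n_cols = len(rows[0])
    return [max(len(str(row[c])) for row in rows) for c in range(n_cols)]


def pad_row(row: Sequence[object], widths: Sequence[int],
            pad_char: str = PAD_CHAR) -> List[str]:
    """Pad each value on the right (an assumption) to its column width."""
    return [str(value).ljust(width, pad_char)
            for value, width in zip(row, widths)]


def rows_to_strings(rows: Sequence[Sequence[object]],
                    pad_char: str = PAD_CHAR) -> List[str]:
    """Concatenate the padded values of each row into one fixed-length string."""
    widths = column_widths(rows)
    return ["".join(pad_row(row, widths, pad_char)) for row in rows]


# Hypothetical sample data: two rows, three columns.
rows = [
    ["ACME Corp", "1200.50", "USD"],
    ["Bo", "7", "EUR"],
]

# "_" is used here only to make the padding visible.
strings = rows_to_strings(rows, pad_char="_")
for s in strings:
    print(s)
```

With the sample rows above, the column widths are 9, 7, and 3, so every row becomes a 19-character string: "ACME Corp1200.50USD" and "Bo_______7______EUR". Because each column always occupies the same character positions, a given position in any string refers to the same column.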
Utility
17 Mar 2020
23 Sep 2021