International Business Machines Corporation
Multi-modal document feature extraction

Last updated: 8 Dec 2021

Abstract:

Systems and methods are described for generating a machine learning model for multi-modal feature extraction. The method may include receiving a document in a digital format, where the digital format comprises text information and image information, performing a text extraction function on a first portion of the document to produce a set of text features, performing an image extraction function on a second portion of the document to produce a set of image features, generating a feature tree, wherein a plurality of nodes of the feature tree correspond to the set of text features and the set of image features, and generating an input vector for a machine learning model based on the feature tree. In some cases, the feature tree may be generated synthetically, or modified by a user prior to being converted into the input vector.
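
The flow described in the abstract (extract text features, extract image features, arrange them as nodes of a feature tree, then convert the tree to an input vector) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in and is not taken from the patent: the `FeatureNode` class, the toy feature functions, and the pre-order flattening are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureNode:
    """One node of a hypothetical feature tree (illustrative, not from the patent)."""
    name: str
    modality: str  # "root", "text", or "image"
    values: List[float] = field(default_factory=list)
    children: List["FeatureNode"] = field(default_factory=list)


def extract_text_features(text: str) -> List[float]:
    # Toy text features (assumed): token count and fraction of digit characters.
    tokens = text.split()
    digits = sum(ch.isdigit() for ch in text)
    return [float(len(tokens)), digits / len(text) if text else 0.0]


def extract_image_features(pixels: List[List[int]]) -> List[float]:
    # Toy image features (assumed): height, width, and mean intensity
    # of a grayscale pixel grid.
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    total = sum(sum(row) for row in pixels)
    mean = total / (height * width) if height and width else 0.0
    return [float(height), float(width), mean]


def build_feature_tree(text: str, pixels: List[List[int]]) -> FeatureNode:
    # The root stands for the document; each child holds one modality's features.
    root = FeatureNode("document", "root")
    root.children.append(
        FeatureNode("text_block", "text", extract_text_features(text))
    )
    root.children.append(
        FeatureNode("image_block", "image", extract_image_features(pixels))
    )
    return root


def tree_to_vector(node: FeatureNode) -> List[float]:
    # Pre-order traversal: a node's own values first, then each child's in order.
    vector = list(node.values)
    for child in node.children:
        vector.extend(tree_to_vector(child))
    return vector


tree = build_feature_tree("Invoice 42 total 10", [[0, 255], [255, 0]])
vector = tree_to_vector(tree)
```

Because the vector is derived from the tree, changing a node's `values` before calling `tree_to_vector` changes the resulting input, which loosely mirrors the abstract's remark that the tree may be modified by a user before conversion. A real system would likely use learned or much richer features and a fixed-length encoding.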

Status:

Grant

Type:

Utility

Filing date:

6 Dec 2018

Issue date:

7 Dec 2021
