The Boeing Company
Natural Language Processing (NLP) Pipeline for Automated Attribute Extraction

Abstract:

A method for training a filter-based text recognition system for cataloging image portions associated with files using text from the image portions, the method comprising: receiving a first set of text represented in a first image portion associated with a first file; classifying the first image portion into a predetermined group, wherein the classifying is based at least in part on the first set of text; extracting a first set of features from the first set of text; harmonizing existing data in the predetermined group with the first set of text to modify the first set of features; categorizing the first set of text; and determining analytics-based rules based at least in part on the first set of features.
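The steps named in the abstract can be sketched roughly as follows. This is an illustration only, not the claimed implementation: it assumes the text has already been read from the image portion by an upstream recognition step, and every name in it (GROUP_KEYWORDS, ALIASES, classify, extract_features, harmonize, categorize, derive_rules, process) is hypothetical, as are the keyword lists, the "KEY: value" field layout, and the support threshold used to derive rules.

```python
import re
from collections import Counter, defaultdict

# Hypothetical predetermined groups and their trigger keywords (assumption).
GROUP_KEYWORDS = {
    "fastener": {"bolt", "screw", "rivet", "nut"},
    "wiring": {"wire", "cable", "connector", "harness"},
}

# Hypothetical "existing data" per group: known aliases for attribute names.
ALIASES = {
    "fastener": {"matl": "material", "len": "length"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(text):
    """Pick the group whose keywords overlap most with the text."""
    tokens = set(tokenize(text))
    scores = {g: len(tokens & kws) for g, kws in GROUP_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


def extract_features(text):
    """Read 'KEY: value' fields separated by semicolons (assumed layout)."""
    features = {}
    for field in text.split(";"):
        key, sep, value = field.partition(":")
        if sep:
            features[key.strip().lower()] = value.strip()
    return features


def harmonize(features, aliases):
    """Rename feature keys to the names already used in the group."""
    return {aliases.get(k, k): v for k, v in features.items()}


def categorize(text, group):
    """Label the text with the first group keyword it contains."""
    for token in tokenize(text):
        if token in GROUP_KEYWORDS.get(group, ()):
            return token
    return "other"


def derive_rules(records, min_support=0.5):
    """For each group, list attributes seen in at least min_support of records."""
    by_group = defaultdict(list)
    for record in records:
        by_group[record["group"]].append(record)
    rules = {}
    for group, recs in by_group.items():
        counts = Counter(k for r in recs for k in r["features"])
        rules[group] = sorted(
            k for k, c in counts.items() if c / len(recs) >= min_support
        )
    return rules


def process(text):
    """Run classify, extract, harmonize, and categorize on one set of text."""
    group = classify(text)
    features = harmonize(extract_features(text), ALIASES.get(group, {}))
    return {"group": group, "category": categorize(text, group), "features": features}


if __name__ == "__main__":
    record = process("Bolt, hex head; MATL: Ti-6Al-4V; LEN: 1.25 in")
    print(record)
    print(derive_rules([record]))
```

In this sketch the rule step is a simple frequency count over harmonized attribute names; the abstract does not say which analytics are used, so a real system could derive rules in a quite different way.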

Status:

Application

Type:

Utility

Filing date:

12 Dec 2019

Issue date:

17 Jun 2021