International Business Machines Corporation
Construction of reference database accurately representing complete set of data items for faster and tractable classification usage

Abstract:

For each unique pair of data items in a complete set, a computing device determines a distance between the two data items of the pair. The computing device then repeats the following steps until no data items remain in the complete set. For each data item remaining in the complete set, the computing device determines a similarity subset that includes each other data item whose distance from that data item is less than a target difference threshold. The computing device moves a selected data item from a largest similarity subset to a reference database, which is a subset of the complete set. The computing device then removes from the complete set each data item whose distance from the selected data item is less than the threshold. A new data item can be classified using the reference database.
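The loop described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the patented implementation. The function name `build_reference_database`, the `distance` callable, and the tie-breaking rule are assumptions introduced here. The abstract also leaves open which item is "selected" from a largest similarity subset; this sketch assumes it is the item whose own similarity subset is largest.

```python
from itertools import combinations


def build_reference_database(items, distance, threshold):
    """Greedy sketch: pick representatives until the complete set is empty.

    `items` is a list of data items, `distance` is any symmetric function of
    two items, and `threshold` is the target difference threshold.
    All three names are hypothetical, chosen for illustration.
    """
    # Distance for each unique pair, keyed by (i, j) with i < j.
    dist = {
        (i, j): distance(items[i], items[j])
        for i, j in combinations(range(len(items)), 2)
    }

    def d(i, j):
        return dist[(i, j)] if i < j else dist[(j, i)]

    remaining = set(range(len(items)))
    reference = []

    while remaining:
        # Similarity subset of each remaining item: the other remaining
        # items closer than the threshold.
        subsets = {
            i: {j for j in remaining if j != i and d(i, j) < threshold}
            for i in remaining
        }
        # Assumption: select the item with the largest similarity subset,
        # breaking ties by lowest index so the result is deterministic.
        selected = min(remaining, key=lambda i: (-len(subsets[i]), i))

        # Move the selected item to the reference database, then remove
        # every remaining item within the threshold of it.
        reference.append(items[selected])
        remaining.discard(selected)
        remaining -= subsets[selected]

    return reference


if __name__ == "__main__":
    data = [0, 1, 2, 50, 51, 90]
    print(build_reference_database(data, lambda a, b: abs(a - b), 3))
```

With the sample data, each cluster of nearby values collapses to a single representative, so the reference database is much smaller than the complete set while still covering every item to within the threshold. How a new item is then classified against the reference database is not specified in the abstract and is left out of the sketch.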

Status:

Grant

Type:

Utility

Filing date:

28 Sep 2018

Issue date:

26 Apr 2022