International Business Machines Corporation
COMBINING ENSEMBLE TECHNIQUES AND RE-DIMENSIONING DATA TO INCREASE MACHINE CLASSIFICATION ACCURACY
Abstract:
Classifying unlabeled input data is provided. A Euclidean distance and a cosine similarity are calculated between an unlabeled input data point to be classified and the class label centroid of each class within a set of training data. A confidence value is calculated for each class label centroid based on the Euclidean distance and the cosine similarity between the unlabeled input data point and that class label centroid. The highest confidence value identifies the class label centroid that best matches the unlabeled input data point. The class label centroid having the highest confidence value is selected. A computer classifies the unlabeled input data point using the class label corresponding to the selected class label centroid.
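The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state how the Euclidean distance and the cosine similarity are combined into a confidence value, so the equal-weight average in `confidence` is purely an assumption, as are all function names and the toy training data.

```python
import math
from collections import defaultdict


def class_centroids(points, labels):
    """Return the mean vector (centroid) of the training points of each class."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
        sums[label] = [s + v for s, v in zip(sums[label], point)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def confidence(point, centroid):
    # ASSUMPTION: the abstract gives no formula. Here both measures are
    # mapped to the range [0, 1] and averaged with equal weight.
    closeness = 1.0 / (1.0 + euclidean_distance(point, centroid))
    similarity = (cosine_similarity(point, centroid) + 1.0) / 2.0
    return 0.5 * (closeness + similarity)


def classify(point, centroids):
    """Return the label of the highest-confidence centroid and all scores."""
    scores = {label: confidence(point, c) for label, c in centroids.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy training set with two classes.
train_points = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0),
                (0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

centroids = class_centroids(train_points, train_labels)
label, scores = classify((1.5, 0.2), centroids)
print(label, scores)
```

In this sketch the query point lies both nearer to, and in nearly the same direction as, the centroid of class `a`, so the two measures agree. Other weightings or normalizations of the two measures would fit the abstract equally well.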
Utility
4 May 2020
4 Nov 2021