Fortinet, Inc.
REAL-TIME MINIMAL VECTOR LABELING SCHEME FOR SUPERVISED MACHINE LEARNING
Last updated:
Abstract:
Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a set of un-labeled feature vectors are received. The set of feature vectors are grouped into clusters within a vector space having fewer dimensions than the first set of feature vectors by applying a homomorphic dimensionality reduction algorithm to the set of feature vectors and performing centroid-based clustering. An optimal set of clusters among the clusters is identified by performing a convex optimization process on the clusters. Vector labeling is minimized by selecting ground truth representative vectors including a representative vector from each cluster of the optimal set of clusters. A set of labeled feature vectors is created based on labels received from an oracle for each of the representative vectors. A machine-learning model is trained for multiclass classification based on the set of labeled feature vectors.
Utility
11 Sep 2020
17 Mar 2022