Fortinet, Inc.
CONVEX OPTIMIZED STOCHASTIC VECTOR SAMPLING BASED REPRESENTATION OF GROUND TRUTH

Last updated:

Abstract:

Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a processing resource of a computing system receives a first set of un-labeled feature vectors. The first set feature vectors are homomorphically translated using a T-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm to obtain a second set of feature vectors with reduced dimensionality. The second set of feature vectors are clustered to obtain an initial set of clusters using centroid-based clustering. An optimal set of clusters is identified among the initial set of clusters by performing a convex optimization process on the initial set of clusters. For each cluster of the optimal set of clusters, a representative vector from the cluster is selected for labeling.

Status:
Application
Type:

Utility

Filling date:

11 Sep 2020

Issue date:

17 Mar 2022