Alibaba Group Holding Limited
Sample set processing method and apparatus, and sample querying method and apparatus
Abstract:
Implementations of the present specification provide methods and apparatuses for classifying and indexing a sample set, and methods and apparatuses for querying similar samples. During classification, the samples in the sample set are clustered at two levels, and the clustering results are recorded in a first vector table and a second vector table. During indexing, a two-level index is established for each sample in the sample set: the first-level index points to the coarse cluster center to which the sample belongs, and the second-level index points to a segment cluster center corresponding to a segment vector of the sample. During a query for similar samples, a two-level search is performed for the query sample. The first-level search determines, from the first vector table obtained through classification, a coarse cluster center that is close to the query sample, and obtains the comparison samples that belong to that coarse cluster center. The second-level search selects, from the comparison samples, a sample whose distance to the query sample meets a predetermined criterion as a similar sample. This two-level scheme accelerates the retrieval and query of samples.
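The two-level scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes plain k-means at both levels, equal-length segments, squared Euclidean distance, and "the top_k smallest approximate distances" as the predetermined criterion. It also assumes the second-level distance is approximated by summing per-segment distances to segment cluster centers, which the abstract does not spell out. All function and parameter names (build_index, query, n_probe, top_k, and so on) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the center closest to vec."""
    return min(range(len(centers)), key=lambda j: sq_dist(vec, centers[j]))

def kmeans(data, k, iters=15, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(data, k)]
    for _ in range(iters):
        labels = [nearest(v, centers) for v in data]
        for j in range(k):
            members = [v for v, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(v, centers) for v in data]

def split(vec, n_segments):
    """Cut a vector into n_segments equal-length segment vectors.
    Assumes len(vec) is divisible by n_segments."""
    step = len(vec) // n_segments
    return [vec[i * step:(i + 1) * step] for i in range(n_segments)]

def build_index(samples, n_coarse=4, n_segments=2, n_seg_centers=8):
    """Classification plus indexing: returns both vector tables and both index levels."""
    # Level 1: cluster whole vectors; the centers form the first vector table,
    # and coarse_ids[i] is the first-level index of sample i.
    first_table, coarse_ids = kmeans(samples, n_coarse)
    # Level 2: cluster each segment position separately; the per-segment
    # centers form the second vector table, and seg_ids[i] is the
    # second-level index of sample i (one center id per segment).
    segmented = [split(v, n_segments) for v in samples]
    second_table, seg_ids = [], [[] for _ in samples]
    for m in range(n_segments):
        centers, labels = kmeans([s[m] for s in segmented], n_seg_centers, seed=m + 1)
        second_table.append(centers)
        for i, lab in enumerate(labels):
            seg_ids[i].append(lab)
    return first_table, second_table, coarse_ids, seg_ids

def query(q, index, n_probe=2, top_k=3):
    """Two-level search; returns up to top_k (approx_distance, sample_id) pairs."""
    first_table, second_table, coarse_ids, seg_ids = index
    # First-level search: pick the n_probe coarse centers closest to q and
    # collect the samples indexed under them as comparison samples.
    order = sorted(range(len(first_table)), key=lambda j: sq_dist(q, first_table[j]))
    probe = set(order[:n_probe])
    candidates = [i for i, c in enumerate(coarse_ids) if c in probe]
    # Second-level search: precompute the distance from each query segment to
    # every segment center, then score each comparison sample by table lookup.
    q_segs = split(q, len(second_table))
    tables = [[sq_dist(qs, c) for c in centers]
              for qs, centers in zip(q_segs, second_table)]
    scored = sorted(
        (sum(tables[m][code] for m, code in enumerate(seg_ids[i])), i)
        for i in candidates
    )
    return scored[:top_k]
```

A possible usage, with synthetic data: build samples as a list of 200 eight-dimensional lists, call index = build_index(samples), then query(samples[0], index). In this sketch the speed-up comes from two places: the first-level search discards every sample outside the probed coarse clusters, and the second-level search replaces a full distance computation per comparison sample with a few table lookups.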
Utility
19 May 2020
19 Jan 2021