Microsoft Corporation
Low cardinality bias correction system
Abstract:
A bias correcting system for small number estimators. A computer system includes a distinct value estimator configured to estimate the number of distinct values in a data set. The computer system includes a bias table for the estimator. Each entry in the bias table correlates a bias caused by the distinct value estimator with a corresponding estimated number. The table is optimized by having a set of entries with an optimized number of biases. The biases in the entries are associated with predetermined confidence intervals. The system includes a bias corrector configured to correct the number of distinct values in the multiset data set estimated by the distinct value estimator, using values from the bias table, to produce a corrected value. The system includes a user interface, coupled to the bias corrector, configured to output the corrected value to a user.
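The table lookup and correction step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the names `BIAS_TABLE`, `lookup_bias`, and `correct_estimate` are hypothetical, the table values are invented, and linear interpolation between neighboring entries is just one plausible way to read a bias out of such a table. Confidence intervals and the table optimization are not modeled.

```python
from bisect import bisect_left

# Hypothetical bias table: (raw_estimate, bias) pairs sorted by raw_estimate.
# The numbers are invented for illustration; a real table would be derived
# empirically for a specific distinct value estimator.
BIAS_TABLE = [
    (10.0, 4.0),
    (20.0, 3.0),
    (40.0, 1.0),
    (80.0, 0.0),
]


def lookup_bias(raw_estimate, table=BIAS_TABLE):
    """Return the bias for a raw estimate, interpolating between entries."""
    keys = [key for key, _ in table]
    # Clamp to the first or last entry outside the tabulated range.
    if raw_estimate <= keys[0]:
        return table[0][1]
    if raw_estimate >= keys[-1]:
        return table[-1][1]
    # Find the two neighboring entries and interpolate linearly (assumption).
    i = bisect_left(keys, raw_estimate)
    x0, b0 = table[i - 1]
    x1, b1 = table[i]
    t = (raw_estimate - x0) / (x1 - x0)
    return b0 + t * (b1 - b0)


def correct_estimate(raw_estimate, table=BIAS_TABLE):
    """Subtract the tabulated bias from the raw distinct-value estimate."""
    return raw_estimate - lookup_bias(raw_estimate, table)


if __name__ == "__main__":
    # A raw estimate of 30 falls halfway between the entries at 20 and 40.
    print(correct_estimate(30.0))
```

In this sketch the bias shrinks as the estimate grows, reflecting the abstract's focus on small (low cardinality) estimates; beyond the last table entry the raw estimate is returned unchanged.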
Utility
2 Jan 2018
16 Aug 2022