Meta Platforms, Inc.
Systems and methods for selecting content to send to labelers for prevalence estimation

Abstract:

A method for selecting content to send to labelers for prevalence estimation may include (1) selecting a prevalence estimator, (2) sampling content items from an online system, (3) using, for each of the content items, a model to generate a score for the content item that indicates a likelihood that the content item is of a class of content, (4) generating buckets that each (a) is assigned a range of scores from the model and (b) contains a subset of the content items whose scores fall within the range of scores, (5) determining a sampling rate for each of the buckets that minimizes a variance metric of the estimator, (6) selecting, from each of the buckets, a portion of content items according to the sampling rate of the bucket, and (7) sending the portions to labelers for labeling. Various other methods, systems, and computer-readable media are also disclosed.
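Steps (4) through (6) of the abstract can be illustrated with a minimal sketch. The abstract does not say how the variance-minimizing sampling rates are computed, so this example assumes Neyman allocation for a stratified estimator, with each bucket's mean model score standing in for its unknown prevalence. The function names (`make_buckets`, `neyman_rates`, `select_for_labeling`), the bucket edges, and the labeling budget are all hypothetical and are not taken from the patent.

```python
import math
import random


def make_buckets(scores, edges):
    """Group item indices into buckets by score range [edges[b], edges[b+1])."""
    buckets = [[] for _ in range(len(edges) - 1)]
    for idx, s in enumerate(scores):
        for b in range(len(edges) - 1):
            is_last = b == len(edges) - 2
            # The last bucket also includes its upper edge.
            if edges[b] <= s < edges[b + 1] or (is_last and s == edges[-1]):
                buckets[b].append(idx)
                break
    return buckets


def neyman_rates(buckets, scores, budget):
    """Per-bucket sampling rates under Neyman allocation (an assumption here).

    Each bucket's share of the budget is proportional to N_b * sigma_b, where
    sigma_b = sqrt(p_b * (1 - p_b)) and p_b is approximated by the mean model
    score in the bucket (a stand-in for the unknown true prevalence).
    """
    weights = []
    for items in buckets:
        if not items:
            weights.append(0.0)
            continue
        p = sum(scores[i] for i in items) / len(items)
        weights.append(len(items) * math.sqrt(p * (1 - p)))
    total = sum(weights)
    rates = []
    for items, w in zip(buckets, weights):
        if not items or total == 0:
            rates.append(0.0)
            continue
        n_b = budget * w / total  # labels allotted to this bucket
        rates.append(min(1.0, n_b / len(items)))
    return rates


def select_for_labeling(buckets, rates, rng):
    """Draw a simple random sample from each bucket at its sampling rate."""
    return [
        rng.sample(items, round(rate * len(items)))
        for items, rate in zip(buckets, rates)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [rng.random() for _ in range(1000)]  # stand-in model scores
    buckets = make_buckets(scores, [0.0, 0.2, 0.8, 1.0])
    rates = neyman_rates(buckets, scores, budget=100)
    portions = select_for_labeling(buckets, rates, rng)
    print([len(p) for p in portions])
```

Under this assumption, buckets whose scores sit near 0.5 receive higher sampling rates than buckets near 0 or 1, because label outcomes there are the most uncertain and contribute most to the estimator's variance.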

Status:
Grant
Type:

Utility

Filing date:

31 Jul 2017

Issue date:

6 Oct 2020