Meta Platforms, Inc.
Training a classifier to identify unknown users of an online system
Abstract:
An online system develops a model to predict the identity of unknown users accessing the online system. The online system interacts with users who are known to the online system (e.g., because they are logged in), termed known users, and users who are unknown to the online system. The model attempts to predict the identity of unknown users. To train the model, a set of training data with training weights is generated. The training data includes a set of access events from known users. The set can include access events from unknown users who accessed the system and subsequently became identified (referred to as hindsight events). To account for a distribution in training data, the training data is applied to a scoring model to identify training data that resembles known events. A scaling model then scales the scores to generate training weights. The weights may be higher for access events with characteristics that resemble hindsight events.
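The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify either model, so everything here is an assumption made for illustration: the scoring model is stood in for by a Gaussian similarity to the centroid of the hindsight events, the scaling model by a min-max rescale into a fixed range, and the function names, the `bandwidth` and `floor` parameters, and the two-feature events are all hypothetical.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def score_event(event, reference, bandwidth=1.0):
    """Hypothetical scoring model: Gaussian similarity to a reference point."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(event, reference))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def scale_scores(scores, floor=0.1):
    """Hypothetical scaling model: min-max rescale scores into [floor, 1.0]."""
    lo, hi = min(scores), max(scores)
    span = hi - lo
    if span == 0.0:
        # All events are equally similar, so weight them equally.
        return [1.0 for _ in scores]
    return [floor + (1.0 - floor) * (s - lo) / span for s in scores]

def training_weights(known_events, hindsight_events, floor=0.1):
    """Score each known-user event against the hindsight events, then scale."""
    reference = centroid(hindsight_events)
    scores = [score_event(e, reference) for e in known_events]
    return scale_scores(scores, floor)

# Illustrative feature vectors only; real access events would carry
# features such as device or network characteristics.
known_events = [[0.0, 0.1], [0.2, -0.1], [1.8, 2.1]]
hindsight_events = [[2.0, 2.0], [2.2, 1.9], [1.9, 2.1]]

weights = training_weights(known_events, hindsight_events)
for event, weight in zip(known_events, weights):
    print(event, round(weight, 3))
```

In this sketch the third known event lies close to the hindsight events and so receives the largest weight. The resulting weights would then be supplied as per-example weights when training the identity classifier.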
Utility
2 Aug 2018
16 Feb 2021