Meta Platforms, Inc.
TRAINING DATA QUALITY FOR SPAM CLASSIFICATION

Abstract:

In one embodiment, a method includes accessing a plurality of posts in a social-networking system, each of which is unlabeled with respect to whether it is known to be spam. For each post, the method also includes determining a posting user who submitted the post to the social-networking system and a recipient user to whom the post is addressed. The method further includes determining a first vector representation of the posting user and a second vector representation of the recipient user based on one or more features associated with the post, the posting user, and the recipient user. The method still further includes comparing the vector representations and building a machine learning model for automatically detecting spam posts in the social-networking system, using a subset of the plurality of posts as non-spam training data.
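
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes details the abstract does not state: that the comparison is a cosine similarity, that posts whose posting user and recipient user are sufficiently similar form the non-spam subset, and that a fixed threshold of 0.8 decides membership. The names `cosine_similarity`, `select_presumed_non_spam`, `user_vectors`, and the sample users are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_presumed_non_spam(posts, user_vectors, threshold=0.8):
    """Keep unlabeled posts whose poster and recipient vectors are similar.

    Assumption: high poster/recipient similarity is treated as evidence that
    the post is not spam. The abstract only says the vectors are compared.
    """
    selected = []
    for post in posts:
        poster_vec = user_vectors[post["poster"]]
        recipient_vec = user_vectors[post["recipient"]]
        if cosine_similarity(poster_vec, recipient_vec) >= threshold:
            selected.append(post)
    return selected


# Hypothetical user vectors; in practice these would be derived from
# features of the post, the posting user, and the recipient user.
user_vectors = {
    "alice": [1.0, 0.0, 1.0],
    "bob": [1.0, 0.1, 0.9],
    "mallory": [0.0, 1.0, 0.0],
}

# Unlabeled posts: each records who submitted it and to whom it is addressed.
posts = [
    {"id": 1, "poster": "alice", "recipient": "bob"},
    {"id": 2, "poster": "mallory", "recipient": "bob"},
]

non_spam = select_presumed_non_spam(posts, user_vectors)
non_spam_ids = [p["id"] for p in non_spam]
print(non_spam_ids)
```

Under these assumptions, the selected posts would serve as the non-spam examples when training the spam-detection model. The choice of classifier and the source of spam examples are left open here, as the abstract does not specify them.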

Status:

Application

Type:

Utility

Filing date:

14 Dec 2021

Issue date:

31 Mar 2022