Amazon.com, Inc.
Category-based sampling of machine learning data

Abstract:

A determination is made at a machine learning service that a training data set comprising a majority category of observation records and one or more minority categories of observation records meets a criterion for automated sampling. A sampling ratio to be used for a particular category of the majority category and the one or more minority categories is identified. A selected sampling methodology is applied to the particular category to obtain a sample in accordance with the sampling ratio. A particular machine learning model is trained using a result of applying at least the selected sampling methodology on the particular category.
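
The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented method: the function names (`needs_sampling`, `sample_by_category`), the imbalance threshold of 5.0, and the example ratios are all assumptions introduced here. It assumes the criterion for automated sampling is a simple majority-to-minority count ratio, that a ratio at or below 1.0 means under-sampling without replacement, and that a ratio above 1.0 means over-sampling with replacement.

```python
import random
from collections import Counter, defaultdict


def needs_sampling(labels, imbalance_threshold=5.0):
    """Hypothetical criterion: the data set qualifies for automated sampling
    when the largest category is at least `imbalance_threshold` times the
    size of the smallest one."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values()) >= imbalance_threshold


def sample_by_category(records, label_of, ratios, seed=0):
    """Apply a per-category sampling ratio (assumed semantics).

    ratio <= 1.0: keep that fraction of the category, without replacement.
    ratio >  1.0: keep every record and add duplicates drawn with replacement
                  until the category reaches ratio * original size.
    Categories missing from `ratios` are kept unchanged.
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for record in records:
        by_category[label_of(record)].append(record)

    sampled = []
    for category, items in by_category.items():
        ratio = ratios.get(category, 1.0)
        target = round(len(items) * ratio)
        if ratio <= 1.0:
            sampled.extend(rng.sample(items, target))
        else:
            sampled.extend(items)
            sampled.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(sampled)
    return sampled


# Toy data set: 900 majority ("neg") and 100 minority ("pos") records.
records = [(i, "neg") for i in range(900)] + [(i, "pos") for i in range(100)]
labels = [label for _, label in records]

if needs_sampling(labels):
    # Assumed ratios: under-sample the majority category, keep the minority.
    training_sample = sample_by_category(
        records, label_of=lambda r: r[1], ratios={"neg": 0.2, "pos": 1.0}
    )
else:
    training_sample = records

sample_counts = Counter(label for _, label in training_sample)
```

The resulting `training_sample` would then be passed to whatever model-training routine is in use; that step is left out because the abstract does not specify a model type.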

Status:

Grant

Type:

Utility

Filing date:

14 Aug 2014

Issue date:

23 Nov 2021