Amazon.com, Inc.
Category-based sampling of machine learning data
Last updated:
Abstract:
A determination is made at a machine learning service that a training data set comprising a majority category of observation records and one or more minority categories of observation records meets a criterion for automated sampling. A sampling ratio to be used for a particular category of the majority category and the one or more minority categories is identified. A selected sampling methodology is applied to the particular category to obtain a sample in accordance with the sampling ratio. A particular machine learning model is trained using a result of applying at least the selected sampling methodology on the particular category.
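The flow in the abstract (check a criterion, pick a per-category sampling ratio, sample, then train on the result) can be illustrated with a minimal Python sketch. It is not the patented method. The function names `needs_sampling` and `sample_by_category`, the imbalance threshold, and the choice of random under-sampling and over-sampling with replacement are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def needs_sampling(records, label_of, min_balance=0.5):
    """Hypothetical criterion: flag the data set when the smallest
    category is less than `min_balance` times the largest one."""
    counts = Counter(label_of(rec) for rec in records)
    return min(counts.values()) / max(counts.values()) < min_balance


def sample_by_category(records, label_of, ratios, seed=0):
    """Sample each category at its own ratio (assumed semantics).

    ratio < 1: random under-sampling without replacement.
    ratio > 1: keep every record, then add random duplicates.
    ratio == 1 or category missing from `ratios`: keep unchanged.
    """
    rng = random.Random(seed)

    # Group the observation records by category label.
    groups = defaultdict(list)
    for rec in records:
        groups[label_of(rec)].append(rec)

    sampled = []
    for label, group in groups.items():
        ratio = ratios.get(label, 1.0)
        target = round(len(group) * ratio)
        if ratio < 1.0:
            sampled.extend(rng.sample(group, target))
        elif ratio > 1.0:
            sampled.extend(group)
            sampled.extend(rng.choices(group, k=target - len(group)))
        else:
            sampled.extend(group)

    # Shuffle so categories are interleaved before training.
    rng.shuffle(sampled)
    return sampled
```

With 90 majority records and 10 minority records, ratios of 0.5 for the majority and 2.0 for the minority would yield 45 and 20 records respectively; the returned list would then be passed to whatever training routine is in use.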
Status:
Grant
Type:
Utility
Filing date:
14 Aug 2014
Issue date:
23 Nov 2021