Amazon.com, Inc.
Scalable generation of multidimensional features for machine learning
Abstract:
An approximate count of a subset of records of a data set is obtained using one or more transformation functions. The subset comprises records which contain a first value of one input variable, a second value of another input variable, and a particular value of a target variable. Using the approximate count, an approximate correlation metric for a multidimensional feature and the target variable is obtained. Based on the correlation metric, the multidimensional feature is included in a candidate feature set to be used to train a machine learning model.
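The flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not name its transformation functions, so a count-min sketch built on salted hashes stands in for them, and a simple lift ratio stands in for the correlation metric. The names `CountMinSketch`, `approximate_lift`, the toy records, and the `threshold` value are all hypothetical.

```python
import hashlib


class CountMinSketch:
    """Approximate counter (assumed stand-in); estimates never fall below the true count."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One salted hash per row plays the role of a "transformation function".
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}|{key!r}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row, col in self._columns(key):
            self.rows[row][col] += 1

    def estimate(self, key):
        # The smallest cell across rows is the least inflated by hash collisions.
        return min(self.rows[row][col] for row, col in self._columns(key))


def approximate_lift(sketch, v1, v2, target, target_values, target_totals, n_records):
    """Hypothetical correlation proxy: P(target | v1, v2) / P(target)."""
    joint = sketch.estimate((v1, v2, target))
    pair_total = sum(sketch.estimate((v1, v2, t)) for t in target_values)
    if pair_total == 0 or target_totals[target] == 0:
        return 0.0
    return (joint / pair_total) / (target_totals[target] / n_records)


# Toy records: (input variable A, input variable B, target variable).
records = (
    [("red", "small", 1)] * 40
    + [("red", "small", 0)] * 10
    + [("blue", "large", 1)] * 10
    + [("blue", "large", 0)] * 40
)

sketch = CountMinSketch()
target_totals = {0: 0, 1: 0}
for a, b, y in records:
    sketch.add((a, b, y))  # count the (first value, second value, target value) subset
    target_totals[y] += 1

lift = approximate_lift(sketch, "red", "small", 1, [0, 1], target_totals, len(records))

threshold = 1.2  # illustrative cutoff, not taken from the source
candidate_features = [("A=red", "B=small")] if lift >= threshold else []
```

With exact counts, the lift for this toy data would be (40/50) / (50/100) = 1.6. The sketch can only overestimate a count, so the approximate value may differ slightly when hash collisions occur. The point of the design is that the sketch's memory stays fixed no matter how many distinct value combinations appear in the data.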
Status:
Grant
Type:
Utility
Filing date:
19 Apr 2016
Issue date:
5 Apr 2022