Apple Inc.
Variance-Based Learning Rate Control For Training Machine-Learning Models
Abstract:
A method includes determining a training scale for training a machine-learning model, defining a group of worker nodes having a number of worker nodes that is selected according to the training scale, and determining an average gradient of a loss function during a training iteration using the group of worker nodes. The method also includes determining a variance value for the average gradient of the loss function, determining a gain ratio based on the variance value for the average gradient of the loss function, and determining a learning rate parameter based on a learning rate schedule and the gain ratio. The method also includes determining updated parameters for the machine-learning model using the learning rate parameter and the average gradient of the loss function.
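The steps in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not give the formula for the gain ratio, so the form used here, r = (var + mu2) / (var / S + mu2), where S is the number of worker nodes, var is the estimated per-worker gradient variance, and mu2 is the estimated squared norm of the true gradient, is an assumption and not necessarily the claimed method. The function names (`mean_gradient`, `gain_ratio`, `training_step`), the `schedule` callable, and the idea of advancing the schedule position by the gain ratio are likewise hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mean_gradient(worker_grads: Sequence[Vector]) -> Vector:
    """Average the per-worker gradients element-wise."""
    n = len(worker_grads)
    dim = len(worker_grads[0])
    return [sum(g[i] for g in worker_grads) / n for i in range(dim)]


def sq_norm(v: Vector) -> float:
    """Squared Euclidean norm."""
    return sum(x * x for x in v)


def gain_ratio(worker_grads: Sequence[Vector]) -> float:
    """Assumed gain ratio: (var + mu2) / (var / S + mu2), which lies in [1, S]."""
    s = len(worker_grads)  # training scale = number of worker nodes (assumption)
    if s < 2:
        return 1.0  # variance cannot be estimated from a single worker
    avg = mean_gradient(worker_grads)
    # Unbiased estimate of the per-worker gradient variance (summed over dimensions).
    var = sum(
        sq_norm([gi - ai for gi, ai in zip(g, avg)]) for g in worker_grads
    ) / (s - 1)
    # Estimate of the squared norm of the true gradient, clipped at zero.
    mu2 = max(sq_norm(avg) - var / s, 0.0)
    denom = var / s + mu2
    if denom == 0.0:
        return 1.0  # all gradients are zero; no gain to apply
    return (var + mu2) / denom


def training_step(
    params: Vector,
    worker_grads: Sequence[Vector],
    schedule: Callable[[float], float],
    progress: float,
) -> Tuple[Vector, float]:
    """One update: scale the scheduled learning rate by the gain ratio."""
    avg = mean_gradient(worker_grads)
    r = gain_ratio(worker_grads)
    lr = r * schedule(progress)  # learning rate from schedule and gain ratio
    new_params = [p - lr * g for p, g in zip(params, avg)]
    # Assumption: the schedule position advances by the gain ratio.
    return new_params, progress + r
```

Under this assumed formula, the gain ratio is 1 when all workers report the same gradient (extra workers add no information) and approaches S when the per-worker gradients are dominated by noise, so the effective learning rate grows only as far as averaging actually reduces variance.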
Utility
27 Mar 2020
25 Mar 2021