Apple Inc.
Variance-Based Learning Rate Control For Training Machine-Learning Models

Abstract:

A method includes determining a training scale for training a machine-learning model, defining a group of worker nodes having a number of worker nodes that is selected according to the training scale, and determining an average gradient of a loss function during a training iteration using the group of worker nodes. The method also includes determining a variance value for the average gradient of the loss function, determining a gain ratio based on the variance value for the average gradient of the loss function, and determining a learning rate parameter based on a learning rate schedule and the gain ratio. The method also includes determining updated parameters for the machine-learning model using the learning rate parameter and the average gradient of the loss function.
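The steps in the abstract can be sketched for a single training iteration as follows. This is a minimal illustration under stated assumptions, not the claimed method: the abstract does not give formulas, so the variance estimator and the gain-ratio expression below (a ratio that moves between 1 and the number of worker nodes, in the style of published adaptive scaling rules) are assumptions, as are the function names `gain_ratio` and `update_parameters`. The "training scale" is taken here to be simply the number of per-worker gradients supplied.

```python
import numpy as np


def gain_ratio(worker_grads):
    """Return (average gradient, variance of the average, gain ratio).

    worker_grads: array-like of shape (S, D), one gradient per worker node.
    The formulas are assumed for illustration; the abstract does not specify them.
    """
    grads = np.asarray(worker_grads, dtype=float)
    num_workers = grads.shape[0]
    avg = grads.mean(axis=0)
    if num_workers < 2:
        # With one worker there is no spread to measure; assume no gain.
        return avg, 0.0, 1.0

    # Unbiased estimate of the per-worker gradient variance (summed over dimensions).
    per_worker_var = float(((grads - avg) ** 2).sum() / (num_workers - 1))
    # Averaging S independent gradients divides the variance by S.
    var_of_avg = per_worker_var / num_workers
    # Estimate of the squared norm of the true gradient, clipped at zero.
    mean_sq = max(float(np.sum(avg ** 2)) - var_of_avg, 0.0)

    denom = var_of_avg + mean_sq
    # Assumed gain ratio: near 1 when workers agree, near S when noise dominates.
    gain = (per_worker_var + mean_sq) / denom if denom > 0 else 1.0
    return avg, var_of_avg, gain


def update_parameters(params, worker_grads, scheduled_lr):
    """Scale the scheduled learning rate by the gain ratio, then take a gradient step."""
    avg, _, gain = gain_ratio(worker_grads)
    lr = gain * scheduled_lr
    return np.asarray(params, dtype=float) - lr * avg, lr
```

In this sketch, workers that report identical gradients yield a gain ratio of 1, so the scheduled learning rate is used unchanged; when the per-worker gradients are dominated by noise, the ratio approaches the number of worker nodes and the step is scaled up accordingly.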

Status:

Application

Type:

Utility

Filing date:

27 Mar 2020

Issue date:

25 Mar 2021