International Business Machines Corporation
MODEL TRAINING WITH VARIABLE BATCH SIZING AND GRADIENT CHECKPOINT SEGMENTS
Last updated:
Abstract:
A computer-implemented machine learning model training method and resulting machine learning model. One embodiment of the method may comprise receiving at a computer memory training data; and training on a computer processor a machine learning model on the received training data using a plurality of batch sizes to produce a trained processor. The training may include calculating a plurality of activations during a forward pass of the training and discarding at least some of the calculated plurality of activations after the forward pass of the training.
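The two ideas in the abstract, training with a plurality of batch sizes and discarding activations after the forward pass, can be illustrated with a minimal sketch. This is not the claimed method. It assumes a toy model (a chain of scalar `tanh` layers), a squared-error loss, a fixed segment length, and a hard-coded batch-size schedule. All function names (`forward_checkpointed`, `backward_checkpointed`, `train`) and parameters (`segment_len`, `batch_sizes`, `lr`) are hypothetical and chosen for illustration only.

```python
import math

def layer_forward(w, x):
    # Toy layer (assumption): y = tanh(w * x).
    return math.tanh(w * x)

def layer_backward(w, x, grad_out):
    # Returns (dLoss/dw, dLoss/dx) for one toy layer given dLoss/dy.
    y = math.tanh(w * x)
    d = (1.0 - y * y) * grad_out
    return d * x, d * w

def forward_checkpointed(weights, x, segment_len):
    # Forward pass that keeps only the input to each segment;
    # every other intermediate activation is discarded.
    checkpoints = {}
    for i, w in enumerate(weights):
        if i % segment_len == 0:
            checkpoints[i] = x
        x = layer_forward(w, x)
    return x, checkpoints

def backward_checkpointed(weights, checkpoints, segment_len, grad_out):
    # Backward pass: walk the segments last to first, recompute the
    # discarded activations of each segment from its checkpoint, then
    # backpropagate through that segment.
    grads = [0.0] * len(weights)
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment_len, len(weights))
        acts = [checkpoints[start]]
        for i in range(start, end - 1):
            acts.append(layer_forward(weights[i], acts[-1]))
        for i in range(end - 1, start - 1, -1):
            grads[i], grad_out = layer_backward(weights[i], acts[i - start], grad_out)
    return grads

def train(data, weights, batch_sizes, segment_len=2, lr=0.1):
    # One gradient step per entry in batch_sizes, so the batch size
    # varies from step to step (hypothetical schedule).
    weights = list(weights)
    pos = 0
    for bs in batch_sizes:
        batch = [data[(pos + k) % len(data)] for k in range(bs)]
        pos += bs
        total = [0.0] * len(weights)
        for x, target in batch:
            y, ckpts = forward_checkpointed(weights, x, segment_len)
            # Squared-error loss 0.5 * (y - target)^2 has gradient (y - target).
            g = backward_checkpointed(weights, ckpts, segment_len, y - target)
            total = [a + b for a, b in zip(total, g)]
        weights = [w - lr * g / bs for w, g in zip(weights, total)]
    return weights
```

For example, `train(data, [0.5, -0.3, 0.8, 1.1, -0.7], batch_sizes=[2, 4, 8])` takes three steps with growing batches. With `segment_len=2`, the forward pass stores three activations for the five layers instead of five, and the backward pass recomputes the missing ones within each segment.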
Status:
Application
Type:
Utility
Filing date:
10 Mar 2020
Issue date:
16 Sep 2021