International Business Machines Corporation
MODEL TRAINING WITH VARIABLE BATCH SIZING AND GRADIENT CHECKPOINT SEGMENTS

Last updated:

Abstract:

A computer-implemented machine learning model training method and resulting machine learning model. One embodiment of the method may comprise receiving at a computer memory training data; and training on a computer processor a machine learning model on the received training data using a plurality of batch sizes to produce a trained processor. The training may include calculating a plurality of activations during a forward pass of the training and discarding at least some of the calculated plurality of activations after the forward pass of the training.

Status:
Application
Type:

Utility

Filling date:

10 Mar 2020

Issue date:

16 Sep 2021