International Business Machines Corporation
Optimization of model generation in deep learning neural networks using smarter gradient descent calibration

Abstract:

In training a new neural network, batches of the new training dataset are generated. An epoch of batches is passed through the new neural network using an initial weight (θ). The area (A_i) under the error function curve and the accuracy for the epoch are calculated. It is then determined whether a set of conditions is met, where the set of conditions includes whether A_i is less than the average area (A_avg) recorded from the training of an existing neural network and whether the accuracy is within a predetermined threshold. When the set of conditions is not met, a new θ is calculated by modifying a dynamic learning rate (β) by an amount proportional to the ratio A_i/A_avg and by computing the new θ using the modified β according to

    θ := θ − α·(∂f(θ)/∂θ) − β·∫ f(θ) ∂θ

The process is repeated for the next epoch until the set of conditions is met.
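The loop the abstract describes lends itself to a short sketch. The following Python is a minimal, illustrative implementation under stated assumptions, not the patent's actual method: a toy one-parameter error function f(θ) stands in for the network's loss, per-batch errors are simulated with noise, the epoch area A_i is approximated with the trapezoidal rule over the batch losses, and that same area is reused as the approximation of the integral term ∫ f(θ) ∂θ in the update. The accuracy proxy and all numeric constants are hypothetical.

```python
import numpy as np

def f(theta):
    """Toy one-parameter error function; a stand-in for the
    network's actual loss, which the abstract does not specify."""
    return (theta - 3.0) ** 2 + 1.0

def df(theta):
    """Gradient of the toy error function."""
    return 2.0 * (theta - 3.0)

def calibrated_descent(theta=0.0, alpha=0.05, beta=1e-5, a_avg=40.0,
                       acc_threshold=0.9, batches_per_epoch=32,
                       max_epochs=500):
    rng = np.random.default_rng(0)
    for epoch in range(max_epochs):
        # Simulated per-batch errors across the epoch; the noise
        # stands in for batch-to-batch variation in a real run.
        errors = f(theta) + 0.1 * rng.standard_normal(batches_per_epoch)

        # A_i: area under the epoch's error curve, approximated with
        # the trapezoidal rule over the batch index.
        a_i = np.sum((errors[:-1] + errors[1:]) / 2.0)

        # Hypothetical accuracy proxy; a real run would evaluate
        # the model on held-out data.
        accuracy = 1.0 / (1.0 + abs(theta - 3.0))

        # The two conditions from the abstract: A_i < A_avg and
        # accuracy within the predetermined threshold.
        if a_i < a_avg and accuracy >= acc_threshold:
            return theta, epoch

        # Modify the dynamic learning rate in proportion to A_i/A_avg,
        # then apply the combined update:
        #   theta := theta - alpha * df/dtheta - beta * integral(f)
        # with a_i reused as the integral approximation (an assumption).
        beta_mod = beta * (a_i / a_avg)
        theta = theta - alpha * df(theta) - beta_mod * a_i

    return theta, max_epochs

if __name__ == "__main__":
    theta, epochs = calibrated_descent()
    print(f"converged to theta={theta:.3f} after {epochs} epochs")
```

Run as-is, the sketch stops once both conditions hold. The design point mirrored from the abstract is that β is rescaled every epoch in proportion to A_i/A_avg, so the integral term's influence shrinks as the new network's error curve approaches the reference recorded from the existing network.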

Status: Grant
Type: Utility
Filing date: 30 Apr 2018
Issue date: 14 Sep 2021