International Business Machines Corporation
Variable ISA vector-based compaction in distributed training of neural networks
Last updated:
Abstract:
Using a processor and a memory at a worker machine, a gradient vector is computed corresponding to a set of weights associated with a set of nodes of a neural network instance being trained in the worker machine. In an ISA vector corresponding to the gradient vector, an ISA instruction is constructed corresponding to a gradient in a set of gradients in the gradient vector, wherein a data transmission of the ISA instruction is smaller as compared to a data transmission of the gradient. The ISA vector is transmitted from the worker machine to a parameter server, the ISA vector being responsive to one iteration of a training of the neural network instance, the ISA vector being transmitted instead of the gradient vector to reduce an amount of data transmitted from the worker machine to the parameter server for the one iteration of the training.
Utility
20 Sep 2017
17 Aug 2021