Alibaba Group Holding Limited
HYPER-SQUARE IMPLEMENTATION OF TREE ALLREDUCE ALGORITHM FOR DISTRIBUTED PARALLEL DEEP LEARNING
Abstract:
The present disclosure provides a method for syncing data of a computing task across a plurality of groups of computing nodes. Each group includes a set of computing nodes A-D, a set of intra-group interconnects that communicatively couple computing node A with computing nodes B and C and computing node D with computing nodes B and C, and a set of inter-group interconnects that communicatively couple each of computing nodes A-D with the corresponding computing nodes A-D in each of a plurality of neighboring groups. The method comprises syncing data at a computing node of the plurality of groups of computing nodes using the inter-group interconnects and intra-group interconnects along four different directions relative to the node, and broadcasting the synced data from the node to the plurality of groups of computing nodes using the inter-group interconnects and intra-group interconnects along four different directions relative to the node.
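The two phases described in the abstract (sync at one node, then broadcast from it) can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes the groups sit on a rectangular grid without wraparound, that "neighboring groups" means the grid-adjacent ones, and that the reduction tree is a plain breadth-first spanning tree rooted at the chosen node. The names `build_topology`, `spanning_tree`, and `tree_allreduce` are hypothetical and do not come from the disclosure.

```python
from collections import deque

# Intra-group interconnects from the abstract: A-B, A-C, D-B, D-C.
INTRA = {"A": ("B", "C"), "B": ("A", "D"), "C": ("A", "D"), "D": ("B", "C")}


def build_topology(rows, cols):
    """Return an adjacency dict keyed by (row, col, label).

    Assumption: groups form a rows x cols grid, and each node links to the
    same-labelled node in the grid-adjacent groups (no wraparound).
    """
    adj = {}
    for r in range(rows):
        for c in range(cols):
            for label, peers in INTRA.items():
                links = [(r, c, p) for p in peers]  # intra-group links
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        links.append((nr, nc, label))  # inter-group link
                adj[(r, c, label)] = links
    return adj


def spanning_tree(adj, root):
    """Breadth-first spanning tree: returns (parent map, BFS visit order)."""
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                order.append(nxt)
                queue.append(nxt)
    return parent, order


def tree_allreduce(values, adj, root):
    """Sum all values at the root, then copy the total back to every node."""
    parent, order = spanning_tree(adj, root)
    partial = dict(values)
    # Sync phase: visit children before parents (reverse BFS order).
    for node in reversed(order):
        if parent[node] is not None:
            partial[parent[node]] += partial[node]
    # Broadcast phase: each node receives the total from its tree parent.
    result = {root: partial[root]}
    for node in order[1:]:
        result[node] = result[parent[node]]
    return result


adj = build_topology(2, 2)  # 4 groups x 4 nodes = 16 nodes
values = {node: i for i, node in enumerate(sorted(adj))}
synced = tree_allreduce(values, adj, root=(0, 0, "A"))
```

In this sketch every node ends up holding the same sum. A real system would run the transfers on different links concurrently, which the sequential loops here do not model.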
Utility
30 Jan 2020
5 Aug 2021