Alibaba Group Holding Limited
HETEROGENEOUS DEEP LEARNING ACCELERATOR
Abstract:
Systems and methods for heterogeneous hardware acceleration are disclosed. The systems and methods can include a neural network processing unit comprising compute tiles. Each of a first set of the compute tiles can include a first tensor array configured to support operations in a first number format. Each of a second set of the compute tiles can include a second tensor array configured to support operations in a second number format, the second number format supporting a greater range or a greater precision than the first number format, and a de-quantizer configured to convert data in the first number format to data in the second number format. The systems and methods can include neural network processing units, multi-chip hardware accelerators, and distributed hardware accelerators including low-precision components for performing inference tasks and high-precision components for performing training tasks. Transfer learning tasks can be performed using low-precision components and high-precision components.
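The de-quantizer's role can be illustrated with a minimal software sketch. The abstract does not name the two number formats or the conversion scheme, so everything here is an assumption for illustration only: the first format is taken to be signed 8-bit integers, the second is a Python float, and the conversion is an affine scale/zero-point mapping. The names `QuantParams`, `quantize`, and `dequantize` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QuantParams:
    """Hypothetical affine quantization parameters (not from the patent)."""
    scale: float      # real-valued step between adjacent integer codes
    zero_point: int   # integer code that represents the real value 0.0


def quantize(values: List[float], params: QuantParams) -> List[int]:
    """Map real values to signed 8-bit codes (assumed low-precision format)."""
    codes = []
    for v in values:
        q = round(v / params.scale) + params.zero_point
        # Clamp to the int8 range; values outside it saturate.
        codes.append(max(-128, min(127, q)))
    return codes


def dequantize(codes: List[int], params: QuantParams) -> List[float]:
    """Convert int8 codes back to floats (assumed higher-precision format)."""
    return [(q - params.zero_point) * params.scale for q in codes]


params = QuantParams(scale=0.5, zero_point=0)
codes = quantize([1.0, -2.5, 100.0], params)
restored = dequantize(codes, params)
```

In this sketch the value 100.0 falls outside the range the 8-bit format can represent at this scale, so it saturates at code 127 and de-quantizes to 63.5. This shows why the second format is described as offering a greater range or precision: de-quantization changes the representation but cannot recover information already lost in the low-precision format.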
Utility
25 Oct 2019
29 Apr 2021