Microsoft Corporation
Minimizing memory reads and increasing performance by leveraging aligned blob data in a processing unit of a neural network environment
Abstract:
The performance of a neural network (NN) and/or deep neural network (DNN) can be limited by the number of operations being performed as well as by the management of data among the various memory components of the NN/DNN. By inserting selected padding into the input data to align it in memory, data reads and writes can be optimized for processing by the NN/DNN, thereby enhancing its overall performance. Operatively, an operations controller/iterator can generate one or more instructions that insert the selected padding into the data. The padding can be calculated using various characteristics of the input data, of the NN/DNN, and of the cooperating memory components. Padding on the output data can be utilized to support the data alignment at the memory components and the cooperating processing units of the NN/DNN.
Utility
15 Nov 2017
23 Nov 2021
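
The abstract says padding is calculated from characteristics of the input data and of the memory components, but it gives no formula. The following is a minimal sketch of one common way such a calculation could look, not the patented method: it assumes each row of input data is padded up to a multiple of a fixed memory line width. The names `padding_for_alignment`, `pad_rows`, and `lines_touched`, and the 8-byte line width in the usage note, are hypothetical and chosen only for illustration.

```python
def padding_for_alignment(row_bytes: int, line_bytes: int) -> int:
    """Return how many padding bytes make row_bytes a multiple of line_bytes.

    Assumption: alignment means "row length is a multiple of the memory
    line width"; the source does not specify the rule actually used.
    """
    if row_bytes < 0 or line_bytes <= 0:
        raise ValueError("row_bytes must be >= 0 and line_bytes must be > 0")
    # Python's % always returns a non-negative result for a positive modulus.
    return (-row_bytes) % line_bytes


def pad_rows(rows, line_bytes: int, fill: int = 0):
    """Pad each row (a bytes-like object) so every row starts on a line boundary."""
    padded = []
    for row in rows:
        pad = padding_for_alignment(len(row), line_bytes)
        padded.append(bytes(row) + bytes([fill]) * pad)
    return padded


def lines_touched(offset: int, length: int, line_bytes: int) -> int:
    """Count the memory lines that a read of `length` bytes at `offset` spans."""
    if length <= 0:
        return 0
    first_line = offset // line_bytes
    last_line = (offset + length - 1) // line_bytes
    return last_line - first_line + 1
```

Under these assumptions, with 6-byte rows and an 8-byte line, the second row of an unpadded layout starts at offset 6 and straddles two lines, so `lines_touched(6, 6, 8)` is 2. After padding each row to 8 bytes, the same row starts at offset 8 and `lines_touched(8, 6, 8)` is 1, which illustrates how aligned data can reduce the number of memory reads.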