Apple Inc.
FAST DEEP LEARNING FULLY-CONNECTED INFERENCE
Abstract:
This application relates to performing fully-connected inferences using a convolutional neural network. A method includes receiving a two-dimensional input matrix that includes a plurality of elements. The method further includes identifying a two-dimensional weight matrix corresponding to the two-dimensional input matrix, where the two-dimensional weight matrix includes a plurality of weight values. The method further includes transposing a first column of the two-dimensional weight matrix and storing the transposed first column of the two-dimensional weight matrix in a first register having a first length corresponding to the transposed first column. The method further includes generating a first output element by performing a first dot product operation using a first row of the two-dimensional input matrix and the transposed first column. Finally, the method includes storing the first output element in a first row of a two-dimensional output matrix.
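The steps in the abstract can be illustrated with a minimal sketch in plain Python. This is not the claimed implementation. The function name `fully_connected`, the list-of-lists matrix layout, and the use of an ordinary Python list as a stand-in for the hardware register are all assumptions made for illustration only.

```python
def fully_connected(input_matrix, weight_matrix):
    """Illustrative sketch: output = input_matrix x weight_matrix, computed
    one weight column at a time, as the abstract describes."""
    n_rows = len(input_matrix)        # rows of the input matrix
    n_inner = len(weight_matrix)      # shared dimension
    n_cols = len(weight_matrix[0])    # columns of the weight matrix

    if any(len(row) != n_inner for row in input_matrix):
        raise ValueError("input row length must equal number of weight rows")

    output = [[0] * n_cols for _ in range(n_rows)]

    for j in range(n_cols):
        # Transpose column j of the weight matrix into a contiguous buffer.
        # The list is a hypothetical stand-in for a register whose length
        # matches the transposed column.
        register = [weight_matrix[k][j] for k in range(n_inner)]

        for i in range(n_rows):
            # Dot product of input row i with the transposed column.
            element = sum(x * w for x, w in zip(input_matrix[i], register))
            # Store the output element in row i of the output matrix.
            output[i][j] = element

    return output


result = fully_connected([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Copying each weight column into a contiguous buffer once and reusing it for every input row is the point of the sketch: the inner dot product then reads both operands sequentially, which is the access pattern a vector register would favor.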
Utility
11 Sep 2019
12 Nov 2020