Meta Platforms, Inc.
HIGH THROUGHPUT NEURAL NETWORK OPERATIONS USING INTER-LAYER MEMORY LAYOUT TRANSFORMATION

Last updated:

Abstract:

A microprocessor comprises a shared memory and a processing element. The processing element includes a matrix processor unit, a transpose hardware unit, a scatter hardware unit, and a gather hardware unit. The matrix processor unit is configured to perform a matrix operation. The transpose hardware unit is configured to perform a matrix transpose operation. The scatter hardware unit is configured to place data in the shared memory at locations selected for an output data layout conversion. The gather hardware unit is configured to obtain input data from non-contiguous locations in the shared memory for an input data layout conversion.
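As an illustration only, the data flow described in the abstract can be modeled in software. The sketch below is not the claimed hardware. It assumes a flat Python list standing in for the shared memory, and every function name, address, and matrix size in it is hypothetical. It shows a gather from non-contiguous addresses (a column-major to row-major conversion on input), a transpose, a matrix multiplication, and a scatter that writes the result back in a different layout.

```python
# Software model of the gather -> transpose -> matrix op -> scatter flow.
# All names, addresses, and sizes are assumptions made for illustration.

def gather(shared, indices):
    """Read values from (possibly non-contiguous) shared-memory addresses."""
    return [shared[i] for i in indices]

def scatter(shared, indices, values):
    """Write values to the selected shared-memory addresses."""
    for i, v in zip(indices, values):
        shared[i] = v

def reshape(flat, rows, cols):
    """View a flat row-major list as a rows x cols matrix."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def transpose(m):
    """Matrix transpose."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix multiplication a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def col_major_indices(base, rows, cols):
    """Addresses of a column-major matrix, listed in row-major visiting order."""
    return [base + c * rows + r for r in range(rows) for c in range(cols)]

def run_layer(shared):
    # Input layout conversion: A (2x3) is stored column-major at addresses 0..5.
    a = reshape(gather(shared, col_major_indices(0, 2, 3)), 2, 3)
    # Weights are stored as W^T (2x3, row-major) at addresses 6..11.
    w_t = reshape(gather(shared, range(6, 12)), 2, 3)
    w = transpose(w_t)                      # 3x2
    out = matmul(a, w)                      # 2x2
    # Output layout conversion: store the result column-major at 12..15.
    flat = [v for row in out for v in row]
    scatter(shared, col_major_indices(12, 2, 2), flat)
    return out

# A = [[1, 2, 3], [4, 5, 6]] column-major, then W^T row-major, then output space.
shared = [1, 4, 2, 5, 3, 6,  1, 0, 1, 0, 1, 1,  0, 0, 0, 0]
result = run_layer(shared)
print(result)         # [[4, 5], [10, 11]]
print(shared[12:16])  # [4, 10, 5, 11]
```

In this model the output of one layer lands in memory already arranged in the layout the next layer's gather expects, which is the role the abstract assigns to the scatter and gather units.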

Status:

Application

Type:

Utility

Filing date:

16 May 2019

Issue date:

19 Nov 2020