Intel Corporation
Apparatus and method for adaptable and efficient lane-wise tensor processing

Abstract:

An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.
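The abstract does not include an implementation, but the behavior it describes, per-lane blocked multiply-accumulate with an invariant block broadcast to every lane, can be sketched in plain C. In this hypothetical model, the names `lane_regs_t`, `BLK`, and `LANES` are illustrative choices rather than structures or sizes defined by the patent: each simulated lane holds blocks of A, B, and C in its own "registers", the invariant B block is broadcast once to all lanes, and each lane then accumulates C += A * B independently.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BLK   4    /* block (tile) dimension -- illustrative value */
#define LANES 8    /* number of parallel lanes -- illustrative value */

/* Per-lane register file holding one block of A, one of B,
 * and the C accumulator block. */
typedef struct {
    float a[BLK][BLK];
    float b[BLK][BLK];
    float c[BLK][BLK];
} lane_regs_t;

/* Broadcast an invariant B block into the registers of every lane. */
static void broadcast_b(lane_regs_t lanes[LANES], const float b[BLK][BLK]) {
    for (size_t l = 0; l < LANES; ++l)
        memcpy(lanes[l].b, b, sizeof(lanes[l].b));
}

/* Each lane multiplies its A block by its (broadcast) B block and
 * accumulates the product into its C block: C += A * B. */
static void lane_mma(lane_regs_t *lane) {
    for (size_t i = 0; i < BLK; ++i)
        for (size_t j = 0; j < BLK; ++j)
            for (size_t k = 0; k < BLK; ++k)
                lane->c[i][j] += lane->a[i][k] * lane->b[k][j];
}

int main(void) {
    static lane_regs_t lanes[LANES] = {0};
    float b[BLK][BLK];

    /* Fill the invariant B block and give every lane a distinct A block. */
    for (size_t i = 0; i < BLK; ++i)
        for (size_t j = 0; j < BLK; ++j)
            b[i][j] = (float)(i + j);
    for (size_t l = 0; l < LANES; ++l)
        for (size_t i = 0; i < BLK; ++i)
            for (size_t j = 0; j < BLK; ++j)
                lanes[l].a[i][j] = (float)(l + 1);

    broadcast_b(lanes, b);            /* one broadcast, reused by every lane */
    for (size_t l = 0; l < LANES; ++l)
        lane_mma(&lanes[l]);          /* lanes execute independently */

    printf("lane 0, C[0][0] = %g\n", lanes[0].c[0][0]);
    return 0;
}
```

The point of broadcasting B in this sketch is that the invariant operand is copied into lane-local storage once and then reused by every lane's multiply-accumulate, mirroring the broadcast circuitry described in the abstract.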

Status:

Grant
Type:

Utility

Filing date:

7 Aug 2020

Issue date:

5 Jul 2022