Intel Corporation
Optimized compute hardware for machine learning operations

Abstract:

A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.
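As a rough illustration of the per-lane arithmetic the abstract describes, the following Python sketch models a single 32-bit lane in software. It assumes that the number of parallel multiplies equals the lane width divided by the bit length of the smallest-sized input value (for example, four multiplies for 8-bit inputs, eight for 4-bit inputs), which is one reading of the abstract rather than the claimed hardware design. The names `multiplies_per_lane`, `unpack_signed`, and `dot_accumulate` are hypothetical, and the packing order (least significant element first, two's complement) is also an assumption.

```python
LANE_BITS = 32  # lane width stated in the abstract


def multiplies_per_lane(smallest_bits):
    """Assumed rule: one multiply per smallest-sized element that fits in a lane."""
    if smallest_bits <= 0 or LANE_BITS % smallest_bits != 0:
        raise ValueError("element width must evenly divide the 32-bit lane")
    return LANE_BITS // smallest_bits


def unpack_signed(lane, bits, count):
    """Extract `count` signed `bits`-wide values from a lane, lowest bits first."""
    mask = (1 << bits) - 1
    values = []
    for i in range(count):
        v = (lane >> (i * bits)) & mask
        if v >= 1 << (bits - 1):  # reinterpret as two's complement
            v -= 1 << bits
        values.append(v)
    return values


def dot_accumulate(a_vals, b_vals, acc=0):
    """Multiply element pairs and add the sum of products to the accumulator."""
    if len(a_vals) != len(b_vals):
        raise ValueError("operands must have the same number of elements")
    return acc + sum(x * y for x, y in zip(a_vals, b_vals))


# Example: an 8-bit operand packed in one lane, paired with a narrower operand
# whose values are already unpacked into a list of the same length.
n = multiplies_per_lane(8)
a = unpack_signed(0x04030201, 8, n)
b = [1, -1, 2, -2]
result = dot_accumulate(a, b, acc=10)
```

In this model the multiplies are written as a sequential sum; in hardware they would run in parallel, with a single accumulate per lane.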

Status:

Grant

Type:

Utility

Filing date:

3 Aug 2020

Issue date:

17 May 2022