Advanced Micro Devices, Inc.
PADDED VECTORIZATION WITH COMPILE TIME KNOWN MASKS

Last updated:

Abstract:

A computing system includes a processing unit and a memory storing instructions that, when executed by the processing unit, cause the processing unit to: receive program source code in a compiler; identify in the program source code a set of operations for vectorizing, where each operation in the set specifies a set of one or more operands; and, in response to identifying the set of operations, vectorize the set of operations. Vectorizing includes generating, based on the number of operations in the set and the total number of lanes in a first vector register, a mask indicating a first unmasked lane and a first masked lane in the first vector register; and generating, based on the mask, a set of one or more instructions for loading a first operand of a first operation of the set of operations into the first unmasked lane, and loading the first operand into the first masked lane.
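
The mask generation and padded load described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names `make_lane_mask` and `pack_operands` are hypothetical, the vector register is modeled as a plain Python list, and the sketch assumes that the unmasked lanes are the lowest-numbered ones and that every masked lane is padded with the first operand.

```python
from typing import List, Sequence


def make_lane_mask(num_ops: int, num_lanes: int) -> List[bool]:
    """Build a mask from values known at compile time.

    True marks an unmasked (active) lane, False a masked (padding) lane.
    Assumption: active lanes occupy the lowest lane indices.
    """
    if not 0 < num_ops <= num_lanes:
        raise ValueError("need 1 <= num_ops <= num_lanes")
    return [lane < num_ops for lane in range(num_lanes)]


def pack_operands(operands: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Model loading one operand per operation into a vector register.

    Each unmasked lane receives the operand of its own operation.
    Each masked lane is padded with a copy of the first operand
    (an assumption of this sketch), so every lane holds a defined value.
    """
    first = operands[0]
    remaining = iter(operands)
    return [next(remaining) if active else first for active in mask]


def unpack_results(lanes: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Keep only the results computed in unmasked lanes."""
    return [value for value, active in zip(lanes, mask) if active]
```

For three operations and a four-lane register, `make_lane_mask(3, 4)` marks the first three lanes as unmasked and the last as masked, and `pack_operands` fills that last lane with a copy of the first operand rather than leaving it undefined.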

Status:

Application

Type:

Utility

Filing date:

9 Aug 2019

Issue date:

5 Mar 2020