NVIDIA Corporation
TECHNIQUES FOR TRANSFORMING SERIAL PROGRAM CODE INTO KERNELS FOR EXECUTION ON A PARALLEL PROCESSOR
Abstract:
A compiler generates an accelerated version of a serial computer program that can be executed on a parallel processor. The compiler analyzes the serial computer program and generates a graph of nodes connected by edges. Each node corresponds to an operation or value set forth in the serial computer program. Each incoming edge corresponds to an operand that is specified or generated in the serial computer program. The compiler partitions the graph of nodes into two different types of partitions: a first type of partition includes one or more nodes that correspond to one or more pointwise operations, and a second type of partition includes one node that corresponds to one operation that is performed efficiently via a library. For each partition, the compiler configures a sequence of kernels that can be executed on the parallel processor to perform the operations associated with the computer program in an accelerated fashion.
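The partitioning step described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed method: the names `Node`, `partition`, `POINTWISE_OPS`, and `LIBRARY_OPS` are hypothetical, the specific operations listed in each set are assumed for the example, and the greedy grouping of consecutive pointwise nodes in topological order is a simplification chosen for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical op classification; the source does not enumerate specific ops.
POINTWISE_OPS = {"add", "mul", "relu"}
LIBRARY_OPS = {"matmul", "conv"}


@dataclass
class Node:
    name: str
    op: str  # "input" marks a value node rather than an operation
    inputs: list = field(default_factory=list)  # operand names (incoming edges)


def partition(nodes):
    """Split topologically ordered nodes into two kinds of partitions.

    Assumption: consecutive pointwise nodes are grouped greedily, and each
    library operation forms a partition of exactly one node.
    """
    partitions = []
    current = []  # run of pointwise nodes collected so far
    for node in nodes:
        if node.op in POINTWISE_OPS:
            current.append(node.name)
        elif node.op in LIBRARY_OPS:
            if current:
                partitions.append(("pointwise", current))
                current = []
            partitions.append(("library", [node.name]))
        elif node.op != "input":
            raise ValueError(f"unclassified operation: {node.op}")
        # Value nodes ("input") only carry data and join no partition.
    if current:
        partitions.append(("pointwise", current))
    return partitions


# Graph for the serial expression relu(matmul(x, w) + b).
graph = [
    Node("x", "input"),
    Node("w", "input"),
    Node("b", "input"),
    Node("t0", "matmul", ["x", "w"]),
    Node("t1", "add", ["t0", "b"]),
    Node("t2", "relu", ["t1"]),
]

result = partition(graph)
print(result)  # [('library', ['t0']), ('pointwise', ['t1', 't2'])]
```

In this sketch the matrix multiply becomes a single-node partition that a compiler could hand to a library routine, while the add and the activation fall into one pointwise partition that could be compiled into a fused kernel. Configuring the actual kernel sequence for each partition is outside the scope of the example.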
Utility
10 Dec 2018
12 Sep 2019