Advanced Micro Devices, Inc.
ARCHITECTED LIBRARY INTERFACE FOR KERNEL FUSION
Last updated:
Abstract:
Systems, apparatuses, and methods for implementing an architected library interface for kernel fusion are disclosed. A processor receives a first representation of a neural network and a vendor-supplied library. The vendor-supplied library is associated with a specific hardware target, and the library includes fusing points which allow a kernel to be called within an optimized operation. When a kernel is called using the fusing point within an optimized operation, the kernel performs one or more operations on the data being processed by the optimized operation. This allows multiple kernels to be executed without having to write data back to memory after each individual kernel. The processor generates an optimized version of the neural network by linking to fusing points within the vendor-supplied library. This reduces the number of memory accesses and increases the performance of the optimized version of the neural network when executed on the hardware target.
Utility
24 Sep 2020
24 Mar 2022