Apple Inc.
Low Latency Fetch Circuitry for Compute Kernels

Last updated:

Abstract:

Techniques are disclosed relating to fetching items from a compute command stream that includes compute kernels. In some embodiments, stream fetch circuitry sequentially pre-fetches items from the stream and stores them in a buffer. In some embodiments, fetch parse circuitry iterate through items in the buffer using a fetch parse pointer to detect indirect-data-access items and/or redirect items in the buffer. The fetch parse circuitry may send detected indirect data accesses to indirect-fetch circuitry, which may buffer requests. In some embodiments, execute parse circuitry iterates through items in the buffer using an execute parse pointer (e.g., which may trail the fetch parse pointer) and outputs both item data from the buffer and indirect-fetch results from indirect-fetch circuitry for execution. In various embodiments, the disclosed techniques may reduce fetch latency for compute kernels.

Status:
Application
Type:

Utility

Filling date:

8 Oct 2020

Issue date:

28 Jan 2021