NVIDIA Corporation
Coherent caching of data for high bandwidth scaling
Abstract:
A method, computer readable medium, and system are disclosed for a distributed cache that provides multiple processing units with fast access to a portion of the data stored in local memory. The distributed cache is composed of multiple smaller caches, each of which is associated with at least one processing unit. In addition to a shared crossbar network through which data is transferred between the processing units and the smaller caches, a dedicated connection is provided between two or more smaller caches that form a partner cache set. Transferring data through the dedicated connections reduces congestion on the shared crossbar network, which increases the available bandwidth and allows the number of processing units to scale. A coherence protocol is defined for accessing data stored in the distributed cache and for transferring data between the smaller caches of a partner cache set.
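The sketch below is a minimal software model of the lookup policy the abstract describes: a cache slice first probes its own storage, then its partner slice over a dedicated link, and only falls back to the shared crossbar (represented here by a backing memory) on a miss in both. The abstract does not specify the patented implementation; all names (CacheSlice, LocalMemory, read) and the simple fill policy are hypothetical illustration only.

```cpp
// Hypothetical model of the partner-cache lookup path described in the abstract.
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>

using Address = std::uint64_t;
using Word    = std::uint64_t;

// One slice of the distributed cache, associated with a processing unit.
struct CacheSlice {
    std::unordered_map<Address, Word> lines;  // resident cache lines
    CacheSlice* partner = nullptr;            // dedicated-link partner slice

    std::optional<Word> probe(Address addr) const {
        auto it = lines.find(addr);
        if (it == lines.end()) return std::nullopt;
        return it->second;
    }
};

// Backing store standing in for local memory reached over the crossbar.
struct LocalMemory {
    std::unordered_map<Address, Word> data;
    Word read(Address addr) const {
        auto it = data.find(addr);
        return it == data.end() ? 0 : it->second;
    }
};

// Read path: local slice -> partner slice (dedicated link) -> crossbar.
Word read(CacheSlice& slice, LocalMemory& mem, Address addr) {
    if (auto hit = slice.probe(addr)) {
        return *hit;                              // local hit, no network traffic
    }
    if (slice.partner) {
        if (auto hit = slice.partner->probe(addr)) {
            slice.lines[addr] = *hit;             // fill over the dedicated link
            return *hit;                          // shared crossbar is never used
        }
    }
    Word value = mem.read(addr);                  // miss in both: go over the crossbar
    slice.lines[addr] = value;                    // fill the local slice
    return value;
}

int main() {
    LocalMemory mem;
    mem.data[0x40] = 7;

    CacheSlice a, b;                              // a partner cache set of two slices
    a.partner = &b;
    b.partner = &a;

    b.lines[0x40] = 7;                            // line already resident in the partner

    // Served from the partner slice via the dedicated link, not the crossbar.
    std::cout << read(a, mem, 0x40) << '\n';      // prints 7
}
```

In this toy model, every request satisfied from the partner slice is one fewer transfer on the shared crossbar, which is the congestion-reduction effect the abstract attributes to the dedicated connections.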
Type: Utility
Filed: 18 Sep 2018
Issued: 9 Feb 2021