Intel Corporation
MULTICAST NETWORK AND MEMORY TRANSFER OPTIMIZATIONS FOR NEURAL NETWORK HARDWARE ACCELERATION

Last updated:

Abstract:

In one embodiment, a system to deterministically transfer partitions of contiguous computer readable data in constant time includes a computer readable memory and a modulo address generator. The computer readable memory is organized into D banks, to contain contiguous data including a plurality of data elements of size M which are constituent data elements of a vector with N data elements, the data elements to start at an offset address O. The modulo address generator is to generate the addresses of the data elements of a vector with i data elements stored in the computer readable memory, the modulo address generator including at least one forward permutation to permute data elements with addresses of the form O+M*i where 0<=i<N. Other embodiments are described and claimed.
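The address form O+M*i in the abstract can be illustrated with a minimal sketch. The abstract does not specify how addresses map to the D banks or what the forward permutation does, so the code below assumes a simple low-order interleaving (bank = address mod D) purely for illustration; the function names are hypothetical and are not taken from the application.

```python
def element_addresses(offset, stride, count):
    """Addresses of the form O + M*i for 0 <= i < N (from the abstract)."""
    return [offset + stride * i for i in range(count)]


def bank_of(address, num_banks):
    """ASSUMPTION: low-order interleaving, bank = address mod D.
    The abstract does not state the actual bank mapping."""
    return address % num_banks


def is_conflict_free(offset, stride, count, num_banks):
    """True if every element lands in a different bank under the assumed
    mapping, so all elements could be fetched in one parallel access."""
    banks = [bank_of(a, num_banks)
             for a in element_addresses(offset, stride, count)]
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    # O = 3, M = 5, N = 4, D = 8: stride is coprime with the bank count.
    print(element_addresses(3, 5, 4))
    print(is_conflict_free(3, 5, 4, 8))
    # O = 0, M = 4, N = 4, D = 8: stride shares a factor with the bank count.
    print(is_conflict_free(0, 4, 4, 8))
```

Under this assumed mapping, a stride that shares a factor with the bank count sends several elements to the same bank. That is the kind of collision a permutation stage in an address generator would plausibly be meant to avoid, though the abstract itself does not say so.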

Status:

Application

Type:

Utility

Filing date:

10 Aug 2021

Issue date:

2 Dec 2021