Intel Corporation
AUTONOMOUS ALLOCATION OF DEEP NEURAL NETWORK INFERENCE REQUESTS IN A CLUSTER WITH HETEROGENEOUS DEVICES

Abstract:

Systems, apparatuses and methods include technology that identifies compute capacities of edge nodes and memory capacities of the edge nodes. The technology further identifies a first variant of an Artificial Intelligence (AI) model, and assigns the first variant to a first edge node of the edge nodes based on a compute capacity requirement associated with execution of the first variant, a memory resource requirement associated with execution of the first variant, the compute capacities and the memory capacities.
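
The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the claimed method. It assumes each edge node reports a single scalar compute capacity and a single scalar memory capacity, and that each model variant carries matching scalar requirements. The names `EdgeNode`, `ModelVariant`, and `assign_variant`, the units, and the best-fit tie-break are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class EdgeNode:
    name: str
    compute_capacity: float  # assumed unit, e.g. available GFLOPS
    memory_capacity: float   # assumed unit, e.g. available MiB


@dataclass
class ModelVariant:
    name: str
    compute_requirement: float  # same assumed unit as compute_capacity
    memory_requirement: float   # same assumed unit as memory_capacity


def assign_variant(variant: ModelVariant,
                   nodes: Sequence[EdgeNode]) -> Optional[EdgeNode]:
    """Return a node able to run the variant, or None if none qualifies."""
    # Keep only nodes that satisfy both the compute and memory requirements.
    feasible = [
        n for n in nodes
        if n.compute_capacity >= variant.compute_requirement
        and n.memory_capacity >= variant.memory_requirement
    ]
    if not feasible:
        return None
    # Assumed policy (not from the source): best fit, i.e. the feasible node
    # with the least leftover compute, then the least leftover memory.
    return min(
        feasible,
        key=lambda n: (
            n.compute_capacity - variant.compute_requirement,
            n.memory_capacity - variant.memory_requirement,
        ),
    )


# Hypothetical cluster with heterogeneous devices.
nodes = [
    EdgeNode("cpu-node", compute_capacity=50, memory_capacity=2048),
    EdgeNode("gpu-node", compute_capacity=500, memory_capacity=8192),
    EdgeNode("vpu-node", compute_capacity=100, memory_capacity=512),
]
variant = ModelVariant("variant-a", compute_requirement=80,
                       memory_requirement=1024)
chosen = assign_variant(variant, nodes)
print(chosen.name if chosen else "no feasible node")
```

In this example only `gpu-node` meets both requirements: `cpu-node` lacks compute and `vpu-node` lacks memory. A real scheduler would likely also account for current load, latency targets, and accuracy differences between variants, none of which is modeled here.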

Status:

Application

Type:

Utility

Filing date:

9 Sep 2021

Issue date:

30 Dec 2021