Apple Inc.
EXECUTION OF SEGMENTED MACHINE LEARNING MODELS

Last updated:

Abstract:

A device implementing a system to execute machine learning models from memory includes at least one processor configured to receive a request to provide an input to one or more machine learning (ML) models arranged into a graph of connected layers, the one or more ML models being stored in a first type of memory. The at least one processor is further configured to divide the graph of connected layers into a plurality of segments such that at least two of the plurality of segments concurrently fit within allocated space of a second type of memory. The at least one processor is further configured to cause the input to be processed through a first segment of the plurality of segments using the second type of memory while a second segment of the plurality of segments is concurrently loaded from the first type of memory into the second type of memory.
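
The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes the graph is a simple linear chain of layers, that capping each segment at half of the allocated space is an acceptable way to guarantee two segments fit concurrently, and that a background thread stands in for the transfer between memory types. The names split_into_segments, run_segmented, and load_segment are hypothetical and do not come from the application.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(layer_sizes, budget):
    """Greedily group consecutive layers into segments.

    Assumption: each segment is capped at half the allocated budget, so any
    two segments fit in the second (fast) memory at the same time.
    Returns a list of segments, each a list of layer indices.
    """
    cap = budget // 2
    segments, current, used = [], [], 0
    for index, size in enumerate(layer_sizes):
        if size > cap:
            raise ValueError(f"layer {index} exceeds half of the budget")
        if used + size > cap:
            # Current segment is full: close it and start a new one.
            segments.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        segments.append(current)
    return segments


def run_segmented(x, layer_sizes, budget, load_segment):
    """Process x through all segments, prefetching the next one.

    load_segment(indices) is a hypothetical callback that copies the given
    layers from the first (slow) memory into the second (fast) memory and
    returns them as a list of callables.
    """
    segments = split_into_segments(layer_sizes, budget)
    if not segments:
        return x
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load_segment, segments[0])
        for i in range(len(segments)):
            loaded = pending.result()  # wait until this segment is resident
            if i + 1 < len(segments):
                # Start loading the next segment while this one executes.
                pending = pool.submit(load_segment, segments[i + 1])
            for layer in loaded:
                x = layer(x)
    return x


if __name__ == "__main__":
    # Five toy layers of size 4 each, with 16 units of fast memory allocated.
    toy_layers = [lambda v: v + 1 for _ in range(5)]
    result = run_segmented(
        0, [4] * 5, 16, lambda idx: [toy_layers[i] for i in idx]
    )
    print(result)
```

Capping every segment at half of the budget is the simplest sufficient condition for overlap. A real partitioner could instead require only that each pair of consecutive segments fits together, which would allow uneven segment sizes.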

Status:

Application

Type:

Utility

Filing date:

14 Jun 2021

Issue date:

23 Dec 2021