Amazon.com, Inc.
Auto-scaling hosted machine learning models for production inference
Last updated:
Abstract:
Techniques for auto-scaling hosted machine learning models for production inference are described. A machine learning model can be deployed in a hosted environment such that the infrastructure supporting the machine learning model scales dynamically with demand so that performance is not impacted. The model can be auto-scaled using reactive techniques or predictive techniques.
Status:
Grant
Type:
Utility
Filling date:
24 Nov 2017
Issue date:
21 Sep 2021