Synopsys, Inc.
MIXED-PRECISION NEURAL NETWORKS
Abstract:
Techniques for mixed-precision quantization of a machine learning (ML) model are disclosed. A method includes receiving a target bandwidth increase (302) for the ML model (114), which includes objects of a first data type represented by a first number of bits. The target bandwidth increase relates to changing a first portion of the objects to a second data type represented by a second number of bits different from the first number of bits (310). The method further includes sorting the objects in the ML model based on bandwidth (304). The method further includes identifying the first portion of the objects to change from the first data type to the second data type, based on the target bandwidth increase and the sorting of the objects (508). The method further includes changing the first portion of the objects from the first data type to the second data type (508).
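The steps in the abstract (receive a target, sort by bandwidth, identify a portion, change its data type) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than something the abstract states: the `MLObject` class, the bandwidth proxy (element count times bit width), the ascending sort order, the greedy stopping rule, and the reading of the target as a fraction of baseline bandwidth are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MLObject:
    """Hypothetical stand-in for an object (e.g. a weight tensor) in the model."""
    name: str
    num_elements: int
    bits: int  # bits per element in the current data type


def bandwidth_bits(obj: MLObject) -> int:
    """Assumed bandwidth proxy: total bits moved for this object."""
    return obj.num_elements * obj.bits


def select_objects_to_change(objects, target_increase, second_bits):
    """Identify objects to convert while staying within the target increase.

    target_increase is assumed to be a fraction of the baseline bandwidth
    (0.5 means the model may use up to 50% more bandwidth).
    """
    baseline = sum(bandwidth_bits(o) for o in objects)
    budget = target_increase * baseline
    # Assumption: visit the lowest-bandwidth objects first.
    ordered = sorted(objects, key=bandwidth_bits)
    selected, added = [], 0
    for obj in ordered:
        extra = obj.num_elements * (second_bits - obj.bits)
        if added + extra > budget:
            break  # the next object would exceed the budget
        selected.append(obj)
        added += extra
    return selected, added


def apply_change(selected, second_bits):
    """Switch the selected objects to the second data type's bit width."""
    for obj in selected:
        obj.bits = second_bits
```

With three 8-bit objects of 100, 300, and 600 elements, a target of 0.5, and a 16-bit second data type, this sketch selects the two smallest objects and leaves the largest at 8 bits. A different sort order or stopping rule would select a different portion, and the abstract does not say which is intended.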
Utility
30 Apr 2021
2 Dec 2021