Advanced Micro Devices, Inc.
QUANTIZATION OF NEURAL NETWORK MODELS USING DATA AUGMENTATION
Abstract:
A neural network is trained at a first precision using a training dataset. The neural network is then calibrated using an augmented calibration dataset that includes a first dataset and one or more second datasets produced by modifying the first dataset. A range of values of the activations of nodes in the neural network at the first precision is determined based on inputs to the neural network from the augmented calibration dataset. The activations of the nodes are then quantized to a second precision based on that range. The first precision is higher than the second precision. For example, in some cases the first precision is 32-bit floating-point precision and the second precision is 8-bit integer precision.
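The calibration flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the method as claimed: the single-layer `layer_activations` function, the scaling and offset modifications in `augment`, and the affine min/max mapping in `quantize_int8` are all assumptions chosen for brevity, and every name here is hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def layer_activations(x: Sequence[float]) -> Vector:
    """Hypothetical stand-in for one trained higher-precision layer (fixed weights, ReLU)."""
    weights = [[0.5, -1.0], [1.5, 0.25]]  # assumed values, not from the source
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def augment(first: List[Vector]) -> List[Vector]:
    """Return the first dataset plus second datasets produced by modifying it."""
    scaled = [[1.2 * v for v in x] for x in first]   # assumed modification: scaling
    shifted = [[v + 0.1 for v in x] for x in first]  # assumed modification: offset
    return first + scaled + shifted


def activation_range(data: List[Vector]) -> Tuple[float, float]:
    """Find the min and max activation observed over a calibration dataset."""
    acts = [a for x in data for a in layer_activations(x)]
    return min(acts), max(acts)


def quantize_int8(value: float, lo: float, hi: float) -> int:
    """Map a float activation to int8 with an affine scheme based on [lo, hi]."""
    scale = (hi - lo) / 255.0 if hi > lo else 1.0  # guard against an empty range
    zero_point = round(-128 - lo / scale)
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))  # clamp values outside the calibrated range


if __name__ == "__main__":
    first_dataset = [[1.0, 2.0], [0.5, -0.5]]
    lo, hi = activation_range(augment(first_dataset))
    for sample in first_dataset:
        print([quantize_int8(a, lo, hi) for a in layer_activations(sample)])
```

In this sketch the augmented dataset widens the observed activation range compared with the first dataset alone, so fewer activations are clamped when the range is mapped onto the 256 available int8 levels.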
Utility
23 Sep 2020
30 Dec 2021