NVIDIA Corporation
HYBRID QUANTIZATION OF NEURAL NETWORKS FOR EDGE COMPUTING APPLICATIONS

Last updated:

Abstract:

Apparatuses, systems, and techniques to use low-precision quantization to train a neural network. In at least one embodiment, one or more weights of a trained model are represented by low-bit integer numbers instead of full floating-point precision. Changing precision of the one or more weights is performed by first quantizing all weights and activations of a neural network, except for layers that require finer granularity in representation than an 8-bit quantization can provide, to generate a first trained model. Subsequently, precision of the one or more weights of the first trained model is changed again to generate a second trained model. For the second trained model, the precision of one or more weights of at least one additional layer is changed, in addition to the layers whose precision was previously changed while training the neural network to generate the first trained model.
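The two-pass procedure described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; the function names (`quantize_int8`, `hybrid_quantize`) and the dictionary-of-arrays model representation are assumptions for illustration. The first pass quantizes every layer to 8-bit integers except a set of precision-sensitive layers kept at full floating point; the second pass repeats the step while additionally quantizing at least one of the previously excluded layers.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: returns (int8 weights, scale)."""
    max_abs = np.max(np.abs(w))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def hybrid_quantize(model, keep_fp_layers):
    """Quantize every layer's weights except those named in keep_fp_layers."""
    out = {}
    for name, w in model.items():
        if name in keep_fp_layers:
            out[name] = (w, None)  # left at full floating-point precision
        else:
            out[name] = quantize_int8(w)
    return out

# Hypothetical model: layer name -> weight array.
model = {
    "conv1": np.random.randn(3, 3),
    "fc1": np.random.randn(4, 4),
    "fc_out": np.random.randn(2, 2),
}

# First pass: quantize everything except two precision-sensitive layers.
first = hybrid_quantize(model, keep_fp_layers={"fc1", "fc_out"})

# Second pass: additionally quantize one layer that was previously excluded.
second = hybrid_quantize(model, keep_fp_layers={"fc_out"})
```

In practice the choice of which layers to exclude, and which additional layer to quantize in the second pass, would be guided by per-layer sensitivity analysis rather than fixed by name as above.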

Status: Application
Type: Utility
Filing date: 9 Jun 2021
Issue date: 10 Feb 2022