QUALCOMM Incorporated
Quantization and inferencing for low-bitwidth neural networks
Last updated:
Abstract:
A method for operating a low-bitwidth neural network includes converting a first activation to a non-negative value (e.g., absolute value). The first activation has a signed value. The sign of the activation is used to select a weight value. A product of the non-negative activation and the selected weight value is computed to determine a next activation. The next activation is quantized and supplied to a subsequent layer of the low-bitwidth neural network.
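The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `next_activation`, `quantize`, `w_pos`, and `w_neg` are hypothetical, and both the summation over several inputs and the uniform round-and-clamp quantizer are assumptions that the abstract does not specify.

```python
def quantize(x, num_bits=4, scale=1.0):
    """Assumed quantizer: divide by a step size, round, clamp to a signed range."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    q = round(x / scale)
    return max(qmin, min(qmax, q))


def next_activation(activations, w_pos, w_neg, num_bits=4, scale=1.0):
    """Compute one quantized output from signed input activations.

    w_pos[i] is the weight used when activations[i] is non-negative,
    w_neg[i] the weight used when it is negative (hypothetical layout).
    """
    acc = 0
    for a, wp, wn in zip(activations, w_pos, w_neg):
        magnitude = abs(a)               # convert to a non-negative value
        weight = wn if a < 0 else wp     # the sign selects the weight
        acc += magnitude * weight        # product of magnitude and selected weight
    # Quantize before handing the result to the subsequent layer.
    return quantize(acc, num_bits, scale)


if __name__ == "__main__":
    print(next_activation([3, -2, 0], w_pos=[1, 2, 5], w_neg=[-1, 1, 7]))
```

In this sketch each input carries two candidate weights, so a negative activation can contribute differently from a positive one of the same magnitude while the multiplication itself only ever sees a non-negative operand.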
Status:
Grant
Type:
Utility
Filing date:
11 Mar 2020
Issue date:
20 Sep 2022