QUALCOMM Incorporated
Quantization and inferencing for low-bitwidth neural networks

Abstract:

A method for operating a low-bitwidth neural network includes converting a first activation to a non-negative value (e.g., absolute value). The first activation has a signed value. The sign of the activation is used to select a weight value. A product of the non-negative activation and the selected weight value is computed to determine a next activation. The next activation is quantized and supplied to a subsequent layer of the low-bitwidth neural network.
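The steps in the abstract can be sketched for a single output neuron as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the two weight tables (`weights_pos`, `weights_neg`), the summation over inputs, the signed clamp-and-round quantizer, and all function and parameter names are hypothetical and do not come from the abstract.

```python
def quantize(value, num_bits=4, scale=1.0):
    """Round to the nearest step and clamp to a signed num_bits range.

    The quantization scheme is an assumption; the abstract only says
    that the next activation is quantized.
    """
    q_min = -(2 ** (num_bits - 1))
    q_max = 2 ** (num_bits - 1) - 1
    q = round(value / scale)
    return max(q_min, min(q_max, q))


def next_activation(activations, weights_pos, weights_neg, num_bits=4, scale=1.0):
    """Compute one quantized next activation from signed input activations.

    For each input, the sign selects which weight to use, and the product
    is taken with the non-negative magnitude of the activation.
    """
    total = 0
    for a, w_pos, w_neg in zip(activations, weights_pos, weights_neg):
        magnitude = abs(a)                   # convert to a non-negative value
        weight = w_neg if a < 0 else w_pos   # sign selects the weight
        total += magnitude * weight          # product of magnitude and selected weight
    return quantize(total, num_bits, scale)  # quantize before the next layer
```

For example, `next_activation([3, -2, 0], [1, 2, 5], [-1, 4, 7], num_bits=8)` accumulates 3*1 + 2*4 + 0*5 = 11. With the default 4-bit range the same call saturates at 7. Because the multiplication always sees a non-negative operand, the sign handling reduces to a weight lookup in this sketch.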

Status:

Grant

Type:

Utility

Filing date:

11 Mar 2020

Issue date:

20 Sep 2022