QUALCOMM Incorporated
QUANTIZATION AND INFERENCING FOR LOW-BITWIDTH NEURAL NETWORKS

Abstract:

A method for operating a low-bitwidth neural network includes converting a first activation to a non-negative value (e.g., absolute value). The first activation has a signed value. The sign of the activation is used to select a weight value. A product of the non-negative activation and the selected weight value is computed to determine a next activation. The next activation is quantized and supplied to a subsequent layer of the low-bitwidth neural network.
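The steps in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. The two weight tables `w_pos` and `w_neg`, the summation of products over inputs, the uniform quantizer, and the parameters `scale` and `bits` are all assumptions introduced for illustration; the abstract only states that the sign selects a weight and that the resulting activation is quantized.

```python
def sign_selected_layer(activations, w_pos, w_neg, scale=0.5, bits=4):
    """One layer step: |a| times a weight chosen by sign(a), then quantize.

    w_pos[j][i] and w_neg[j][i] are hypothetical weight tables: the weight
    for output j and input i when input i is non-negative or negative.
    """
    q_max = 2 ** (bits - 1) - 1      # largest signed quantization level
    q_min = -(2 ** (bits - 1))       # smallest signed quantization level
    outputs = []
    for j in range(len(w_pos)):
        acc = 0.0
        for i, a in enumerate(activations):
            magnitude = abs(a)                               # non-negative activation
            weight = w_neg[j][i] if a < 0 else w_pos[j][i]   # sign selects the weight
            acc += magnitude * weight                        # product (summed: an assumption)
        # Assumed uniform quantizer: round to the nearest level, then clip.
        level = max(q_min, min(q_max, round(acc / scale)))
        outputs.append(level * scale)
    return outputs


next_activations = sign_selected_layer(
    [1.0, -2.0],
    w_pos=[[0.5, 0.25]],
    w_neg=[[-0.5, 1.0]],
)
print(next_activations)
```

Because the multiplier only ever sees a non-negative activation, the sign handling moves out of the multiply and into the weight lookup, which is the point the abstract makes. The returned list would be the input to the subsequent layer.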

Status:

Application

Type:

Utility

Filing date:

11 Mar 2020

Issue date:

16 Sep 2021