Alibaba Group Holding Limited
SYSTEMS AND METHODS FOR ACCELERATING SPARSE NEURAL NETWORK EXECUTION

Abstract:

The present disclosure relates to systems and methods for dynamically executing sparse neural networks. In one implementation, a system for providing dynamic sparsity in a neural network may include at least one memory storing instructions and at least one processor configured to execute the instructions to: reduce an input vector and a set of weights of the neural network; execute an input layer of the neural network using the reduced input vector and set of weights to generate a reduced output vector; expand the reduced output vector to a full output vector using first predictable output neurons (PONs); reduce a dimension of the full output vector using a PON map; execute subsequent layers of the neural network using the reduced full output vector to produce a second reduced output vector; and expand the second reduced output vector to a second full output vector using second PONs.
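
The reduce, execute, expand sequence in the abstract can be illustrated with a small sketch. This is not the claimed implementation: it assumes a PON is an output neuron whose post-ReLU value is known in advance to be zero, that the PON maps are supplied by some predictor outside the sketch, and that the first-layer input is reduced by dropping zero entries. All function and variable names (`run_sparse_layer`, `pon_map`, `keep_inputs`, and so on) are hypothetical.

```python
# Illustrative sketch only. The PON criterion (output is zero after ReLU)
# and every name below are assumptions, not taken from the disclosure.

def matvec(weights, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, a) for a in v]

def run_sparse_layer(x, weights, keep_inputs, pon_map):
    """Run one layer on reduced operands, then expand to full size.

    keep_inputs: indices of input entries that are kept (reduction step).
    pon_map[i] is True when output neuron i is predictable; such neurons
    are not computed and are filled with 0.0 during expansion.
    """
    keep_outputs = [i for i, is_pon in enumerate(pon_map) if not is_pon]
    # Reduce: drop skipped input columns and predictable output rows.
    x_red = [x[j] for j in keep_inputs]
    w_red = [[weights[i][j] for j in keep_inputs] for i in keep_outputs]
    # Execute on the reduced problem.
    out_red = relu(matvec(w_red, x_red))
    # Expand: scatter computed values, fill PON positions with 0.0.
    full = [0.0] * len(weights)
    for value, i in zip(out_red, keep_outputs):
        full[i] = value
    return full

def run_dense(x, layers):
    """Reference dense execution used to check the sparse path."""
    for weights in layers:
        x = relu(matvec(weights, x))
    return x

# Toy two-layer network (made-up numbers).
x = [1.0, 0.0, 2.0]
w1 = [[1.0, 5.0, 1.0], [-1.0, 2.0, -1.0], [0.5, 1.0, 0.0], [-2.0, 3.0, 0.5]]
w2 = [[1.0, 9.0, 2.0, 9.0], [-1.0, 9.0, 1.0, 9.0], [0.0, 9.0, 4.0, 9.0]]
pon1 = [False, True, False, True]   # assumed prediction for layer 1
pon2 = [False, True, False]         # assumed prediction for layer 2

# Input layer: reduce the input by dropping zero entries (assumption).
keep1 = [j for j, v in enumerate(x) if v != 0.0]
full1 = run_sparse_layer(x, w1, keep1, pon1)

# Subsequent layer: the first PON map decides which entries of the
# full output vector are kept as inputs.
keep2 = [i for i, is_pon in enumerate(pon1) if not is_pon]
full2 = run_sparse_layer(full1, w2, keep2, pon2)

dense = run_dense(x, [w1, w2])
```

In this toy case the PON maps happen to be exact, so `full2` matches the dense result while each layer multiplies a 2x2 matrix instead of the full one. With an imperfect predictor, the expanded vectors would only approximate the dense outputs.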

Status:

Application

Type:

Utility

Filing date:

5 Sep 2019

Issue date:

7 Jan 2021