NVIDIA Corporation
Systems and methods for pruning neural networks for resource efficient inference

Last updated: 27 Apr 2022

Abstract:

A method, computer readable medium, and system are disclosed for neural network pruning. The method includes the steps of receiving first-order gradients of a cost function relative to layer parameters for a trained neural network and computing a pruning criterion for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network and is associated with the layer parameter. The method includes the additional steps of identifying at least one neuron having a lowest importance and removing the at least one neuron from the trained neural network to produce a pruned neural network.
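The steps in the abstract (score each neuron from first-order gradients, find the least important neuron, remove it) can be illustrated with a minimal sketch. The scoring rule used here, the sum of |parameter × gradient| over a neuron's parameters, is an assumption chosen for illustration; the abstract only states that the criterion is based on the first-order gradient. The names `neuron_importance` and `prune_least_important`, the row-per-neuron layout, and the numeric values are likewise hypothetical.

```python
def neuron_importance(weights, grads):
    """Score each neuron from its parameters and their first-order gradients.

    weights[i][j] is parameter j of neuron i; grads[i][j] is the gradient of
    the cost function with respect to that parameter.
    Assumption: importance = sum of |parameter * gradient| per neuron.
    """
    return [
        sum(abs(w * g) for w, g in zip(w_row, g_row))
        for w_row, g_row in zip(weights, grads)
    ]


def prune_least_important(weights, grads, num_to_remove=1):
    """Remove the num_to_remove neurons with the lowest importance score."""
    scores = neuron_importance(weights, grads)
    # Neuron indices ordered from least to most important.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    removed = set(order[:num_to_remove])
    # Keep every row (neuron) that was not selected for removal.
    pruned = [row for i, row in enumerate(weights) if i not in removed]
    return pruned, sorted(removed), scores


# Toy layer with three neurons and two parameters each (made-up values).
weights = [[0.5, -1.0], [0.1, 0.2], [2.0, 0.0]]
grads = [[0.2, 0.1], [0.01, -0.02], [0.5, 0.3]]

pruned, removed, scores = prune_least_important(weights, grads)
print("removed neuron indices:", removed)
print("pruned layer:", pruned)
```

In this toy layer the second neuron has both small parameters and small gradients, so it receives the lowest score and is the one removed. A real network would also need the next layer's incoming connections for that neuron removed, which this sketch omits.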

Status:

Grant

Type:

Utility

Filing date:

17 Oct 2017

Issue date:

26 Apr 2022
