International Business Machines Corporation
Using gradients to detect backdoors in neural networks

Last updated: 29 Sep 2021

Abstract:

Mechanisms are provided for evaluating a trained machine learning model to determine whether the machine learning model has a backdoor trigger. The mechanisms process a test dataset to generate output classifications for the test dataset, and generate, for the test dataset, gradient data indicating a degree of change of elements within the test dataset based on the output generated by processing the test dataset. The mechanisms analyze the gradient data to identify a pattern of elements within the test dataset indicative of a backdoor trigger. The mechanisms generate, in response to the analysis identifying the pattern of elements indicative of a backdoor trigger, an output indicating the existence of the backdoor trigger in the trained machine learning model.

Status:

Grant

Type:

Utility

Filling date:

16 Apr 2018

Issue date:

28 Sep 2021

Full patent description

Patent application document

International Business Machines Corporation Using gradients to detect backdoors in neural networks

Abstract:

International Business Machines Corporation
Using gradients to detect backdoors in neural networks