International Business Machines Corporation
Quantifying vulnerabilities of deep learning computing systems to adversarial perturbations

Abstract:

Mechanisms are provided for generating an adversarial perturbation attack sensitivity (APAS) visualization. The mechanisms receive a natural input dataset and a corresponding adversarial attack input dataset, where the adversarial attack input dataset comprises perturbations intended to cause a misclassification by a computer model. The mechanisms determine a sensitivity measure of the computer model to the perturbations in the adversarial attack input dataset based on a processing of the natural input dataset and corresponding adversarial attack input dataset by the computer model. The mechanisms generate a classification activation map (CAM) for the computer model based on results of the processing and a sensitivity overlay based on the sensitivity measure. The sensitivity overlay graphically represents different classifications of perturbation sensitivities. The mechanisms apply the sensitivity overlay to the CAM to generate and output a graphical visualization output of the computer model sensitivity to perturbations of adversarial attacks.
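The flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented method. The abstract does not define the sensitivity measure, so this sketch assumes a per-pixel measure: the absolute difference between the CAM computed on a natural input and the CAM computed on its adversarial counterpart. The function names, the three sensitivity classes (low, medium, high), the thresholds, and the color palette are all hypothetical choices made for illustration.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature maps, clipped at zero and scaled to [0, 1].

    features: array of shape (C, H, W); class_weights: array of shape (C,).
    This is the common CAM formulation, assumed here for illustration.
    """
    cam = np.tensordot(class_weights, features, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def sensitivity_overlay(cam_natural, cam_adversarial, thresholds=(0.1, 0.3)):
    """Assumed sensitivity measure: per-pixel absolute change between CAMs.

    Returns the raw sensitivity and an integer class per pixel:
    0 = low, 1 = medium, 2 = high (hypothetical thresholds).
    """
    sensitivity = np.abs(cam_adversarial - cam_natural)
    levels = np.digitize(sensitivity, thresholds)
    return sensitivity, levels


def apply_overlay(cam, levels, alpha=0.5):
    """Blend a grayscale CAM with one color per sensitivity class."""
    # Hypothetical palette: green = low, yellow = medium, red = high.
    palette = np.array([[0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
    gray = np.repeat(cam[..., None], 3, axis=2)  # shape (H, W, 3)
    return (1.0 - alpha) * gray + alpha * palette[levels]
```

In use, the feature maps and class weights would come from the computer model's processing of each natural input and its adversarial counterpart; the resulting RGB array can then be rendered with any image library.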

Status:

Grant

Type:

Utility

Filing date:

8 Mar 2019

Issue date:

18 Jan 2022