International Business Machines Corporation
Detecting and purifying adversarial inputs in deep learning computing systems
Last updated:
Abstract:
Adversarial input detection and purification (AIDAP) preprocessor and deep learning computer model mechanisms are provided. The deep learning computer model receives input data and processes it to generate a first pass output that is output to the AIDAP preprocessor. The AIDAP preprocessor determines a discriminative region of the input data based on the first pass output and transforms a subset of elements in the discriminative region to modify a characteristic of the elements and generate a transformed input data. The deep learning computer model processes the transformed input data to generate a second pass output that is output to the AIDAP preprocessor which detects an adversarial input or not based on a comparison of the first pass and second pass outputs. If an adversarial input is detected, a responsive action that mitigates effects of the adversarial input is performed.
Utility
26 Jun 2019
28 Jun 2022