One Pixel Adversarial Attacks via Sketched Programs (PLDI 2023 - PLDI Research Papers)

Who

Tom Yuviler, Dana Drachsler Cohen

Track

PLDI 2023 PLDI Research Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Tue 20 Jun 2023 10:00 - 10:20 at Royal - PLDI: Synthesis Chair(s): Ilya Sergey

Abstract

Neural networks are successful in various tasks but are also susceptible to adversarial examples. An adversarial example is generated by adding a small perturbation to a correctly-classified input with the goal of causing a network classifier to misclassify. In one pixel attacks, an attacker aims to fool an image classifier by modifying a single pixel. This setting is challenging for two reasons: the perturbation region is very small and the perturbation is not differentiable. To cope, one pixel attacks iteratively generate candidate adversarial examples and submit them to the network until finding a successful candidate. However, existing works require a very large number of queries, which is infeasible in many practical settings, where the attacker is limited to a few thousand queries to the network. We propose a novel approach for computing one pixel attacks. The key idea is to leverage program synthesis and identify an expressive program sketch that enables to compute adversarial examples using significantly fewer queries. We introduce OPPSLA, a synthesizer that, given a classifier and a training set, instantiates the sketch with customized conditions over the input’s pixels and the classifier’s output. OPPSLA employs a stochastic search, inspired by the Metropolis-Hastings algorithm, that synthesizes typed expressions enabling minimization of the number of queries to the classifier. We further show how to extend OPPSLA to compute few pixel attacks minimizing the number of perturbed pixels. We evaluate OPPSLA on several deep networks for CIFAR-10 and ImageNet. We show that OPPSLA obtains a state-of-the-art success rate, often with an order of magnitude fewer queries than existing attacks. We further show that OPPSLA’s programs are transferable to other classifiers, unlike existing one pixel attacks, which run from scratch on every classifier and input.

DOI

https://doi.org/10.1145/3591301

Tom Yuviler

Technion

Dana Drachsler Cohen