
Non-Uniform Adversarial Perturbations for Discrete Tabular Datasets

Jay Nandy
Jatin Chauhan
CIKM (2023)

Abstract

We study adversarial attacks and robustness on tabular datasets with discrete features. The discrete features of a tabular dataset represent high-level, meaningful concepts, each with its own vocabulary, and therefore call for non-uniform robustness. Further, the notion of distance between tabular input instances is not well defined, which makes producing adversarial examples with minor perturbations qualitatively more challenging than in the settings handled by existing methods. To address this, we define distance through the lens of feature embeddings learnt to represent the discrete features. We then formulate the task of generating adversarial examples as a binary set selection problem under non-uniform feature importance. Next, we propose an efficient, approximate gradient-descent-based algorithm, called the Discrete Non-uniform Approximation (DNA) attack, which reformulates the original optimization problem in a continuous domain in order to generate adversarial examples. We demonstrate the effectiveness of the DNA attack on two large real-world discrete tabular datasets from e-commerce domains for binary classification, in which the class distribution is heavily skewed towards one class. We also analyze the challenges that existing adversarial training frameworks face on such datasets under our DNA attack.
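To make the embedding-based notion of distance and the binary selection of features more concrete, the following is a minimal illustrative sketch. It is not the DNA attack and not the authors' formulation. The embedding tables, the feature names, the per-feature weights, and the helper functions `apply_selection` and `embedding_distance` are all hypothetical, chosen only to show one plausible way a weighted distance over discrete features could be computed.

```python
import math

# Hypothetical embedding tables, one per discrete feature. Each vocabulary
# item maps to a small vector; the numbers are made up for illustration.
EMBEDDINGS = {
    "device": {"mobile": [0.0, 0.0], "tablet": [3.0, 4.0], "desktop": [6.0, 8.0]},
    "payment": {"card": [1.0, 0.0], "wallet": [0.0, 0.0]},
}

# Hypothetical non-uniform feature weights (assumption: a larger weight
# means that perturbing this feature is counted as a larger change).
WEIGHTS = {"device": 1.0, "payment": 2.0}


def apply_selection(x, candidate, selected):
    """Return a copy of row x in which only the selected features are
    replaced by the candidate values (the binary selection)."""
    return {f: (candidate[f] if f in selected else v) for f, v in x.items()}


def embedding_distance(x, x_adv):
    """Weighted sum of per-feature Euclidean distances between the
    embeddings of the original and perturbed feature values."""
    total = 0.0
    for feature, table in EMBEDDINGS.items():
        gap = math.dist(table[x[feature]], table[x_adv[feature]])
        total += WEIGHTS[feature] * gap
    return total


x = {"device": "mobile", "payment": "card"}
candidate = {"device": "tablet", "payment": "wallet"}

# Perturb only the "device" feature and measure how far the row moved.
x_adv = apply_selection(x, candidate, {"device"})
print(x_adv, embedding_distance(x, x_adv))
```

In this toy setup, changing a feature to a value whose embedding lies nearby is a smaller perturbation than changing it to a distant one, and the weights let some features count more than others. An actual attack would additionally need a target model and a search or relaxation over the selection, which this sketch leaves out.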
