Search | arXiv e-print repository

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

Authors: Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu, Yaoqing Yang

Abstract: Modern training strategies of deep neural networks (NNs) tend to induce a heavy-tailed (HT) spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs with HT weight spectra tend to generalize well. A prevailing notion for the occurrence of such HT spectra attributes gradient noise during training as a key contributing factor. Our work shows that gradient noise is unn… ▽ More Modern training strategies of deep neural networks (NNs) tend to induce a heavy-tailed (HT) spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs with HT weight spectra tend to generalize well. A prevailing notion for the occurrence of such HT spectra attributes gradient noise during training as a key contributing factor. Our work shows that gradient noise is unnecessary for generating HT weight spectra: two-layer NNs trained with full-batch Gradient Descent/Adam can exhibit HT spectra in their weights after finite training steps. To this end, we first identify the scale of the learning rate at which one step of full-batch Adam can lead to feature learning in the shallow NN, particularly when learning a single index teacher model. Next, we show that multiple optimizer steps with such (sufficiently) large learning rates can transition the bulk of the weight's spectra into an HT distribution. To understand this behavior, we present a novel perspective based on the singular vectors of the weight matrices and optimizer updates. We show that the HT weight spectrum originates from the `spike', which is generated from feature learning and interacts with the main bulk to generate an HT spectrum. Finally, we analyze the correlations between the HT weight spectra and generalization after multiple optimizer updates with varying learning rates. △ Less

Submitted 7 June, 2024; originally announced June 2024.

Comments: 31 pages, 37 figures

arXiv:2312.00359 [pdf, other]

Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

Authors: Yefan Zhou, Tianyu Pang, Keqin Liu, Charles H. Martin, Michael W. Mahoney, Yaoqing Yang

Abstract: Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely ad… ▽ More Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely adopted training strategies basically just define the decay of the learning rate over time. This process can be interpreted as decreasing a temperature, using either a global learning rate (for the entire model) or a learning rate that varies for each parameter. This paper proposes TempBalance, a straightforward yet effective layer-wise learning rate method. TempBalance is based on Heavy-Tailed Self-Regularization (HT-SR) Theory, an approach which characterizes the implicit self-regularization of different layers in trained models. We demonstrate the efficacy of using HT-SR-motivated metrics to guide the scheduling and balancing of temperature across all network layers during model training, resulting in improved performance during testing. We implement TempBalance on CIFAR10, CIFAR100, SVHN, and TinyImageNet datasets using ResNets, VGGs, and WideResNets with various depths and widths. Our results show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization. We also show that TempBalance outperforms a number of state-of-the-art optimizers and learning rate schedulers. △ Less

Submitted 1 December, 2023; originally announced December 2023.

Comments: NeurIPS 2023 Spotlight, first two authors contributed equally

arXiv:2306.01435 [pdf, other]

Improving Adversarial Robustness of DEQs with Explicit Regulations Along the Neural Dynamics

Authors: Zonghan Yang, Peng Li, Tianyu Pang, Yang Liu

Abstract: Deep equilibrium (DEQ) models replace the multiple-layer stacking of conventional deep networks with a fixed-point iteration of a single-layer transformation. Having been demonstrated to be competitive in a variety of real-world scenarios, the adversarial robustness of general DEQs becomes increasingly crucial for their reliable deployment. Existing works improve the robustness of general DEQ mode… ▽ More Deep equilibrium (DEQ) models replace the multiple-layer stacking of conventional deep networks with a fixed-point iteration of a single-layer transformation. Having been demonstrated to be competitive in a variety of real-world scenarios, the adversarial robustness of general DEQs becomes increasingly crucial for their reliable deployment. Existing works improve the robustness of general DEQ models with the widely-used adversarial training (AT) framework, but they fail to exploit the structural uniquenesses of DEQ models. To this end, we interpret DEQs through the lens of neural dynamics and find that AT under-regulates intermediate states. Besides, the intermediate states typically provide predictions with a high prediction entropy. Informed by the correlation between the entropy of dynamical systems and their stability properties, we propose reducing prediction entropy by progressively updating inputs along the neural dynamics. During AT, we also utilize random intermediate states to compute the loss function. Our methods regulate the neural dynamics of DEQ models in this manner. Extensive experiments demonstrate that our methods substantially increase the robustness of DEQ models and even outperform the strong deep network baselines. △ Less

Submitted 2 June, 2023; originally announced June 2023.

Comments: Accepted at ICML 2023. Our code is available at https://github.com/minicheshire/DEQ-Regulating-Neural-Dynamics

arXiv:2306.01429 [pdf, other]

A Closer Look at the Adversarial Robustness of Deep Equilibrium Models

Authors: Zonghan Yang, Tianyu Pang, Yang Liu

Abstract: Deep equilibrium models (DEQs) refrain from the traditional layer-stacking paradigm and turn to find the fixed point of a single layer. DEQs have achieved promising performance on different applications with featured memory efficiency. At the same time, the adversarial vulnerability of DEQs raises concerns. Several works propose to certify robustness for monotone DEQs. However, limited efforts are… ▽ More Deep equilibrium models (DEQs) refrain from the traditional layer-stacking paradigm and turn to find the fixed point of a single layer. DEQs have achieved promising performance on different applications with featured memory efficiency. At the same time, the adversarial vulnerability of DEQs raises concerns. Several works propose to certify robustness for monotone DEQs. However, limited efforts are devoted to studying empirical robustness for general DEQs. To this end, we observe that an adversarially trained DEQ requires more forward steps to arrive at the equilibrium state, or even violates its fixed-point structure. Besides, the forward and backward tracks of DEQs are misaligned due to the black-box solvers. These facts cause gradient obfuscation when applying the ready-made attacks to evaluate or adversarially train DEQs. Given this, we develop approaches to estimate the intermediate gradients of DEQs and integrate them into the attacking pipelines. Our approaches facilitate fully white-box evaluations and lead to effective adversarial defense for DEQs. Extensive experiments on CIFAR-10 validate the adversarial robustness of DEQs competitive with deep networks of similar sizes. △ Less

Submitted 2 June, 2023; originally announced June 2023.

Comments: Accepted at NeurIPS 2022. Our code is available at https://github.com/minicheshire/DEQ-White-Box-Robustness

arXiv:2302.10688 [pdf, other]

On Calibrating Diffusion Probabilistic Models

Authors: Tianyu Pang, Cheng Lu, Chao Du, Min Lin, Shuicheng Yan, Zhijie Deng

Abstract: Recently, diffusion probabilistic models (DPMs) have achieved promising results in diverse generative tasks. A typical DPM framework includes a forward process that gradually diffuses the data distribution and a reverse process that recovers the data distribution from time-dependent data scores. In this work, we observe that the stochastic reverse process of data scores is a martingale, from which… ▽ More Recently, diffusion probabilistic models (DPMs) have achieved promising results in diverse generative tasks. A typical DPM framework includes a forward process that gradually diffuses the data distribution and a reverse process that recovers the data distribution from time-dependent data scores. In this work, we observe that the stochastic reverse process of data scores is a martingale, from which concentration bounds and the optional stopping theorem for data scores can be derived. Then, we discover a simple way for calibrating an arbitrary pretrained DPM, with which the score matching loss can be reduced and the lower bounds of model likelihood can consequently be increased. We provide general calibration guidelines under various model parametrizations. Our calibration method is performed only once and the resulting models can be used repeatedly for sampling. We conduct experiments on multiple datasets to empirically validate our proposal. Our code is at https://github.com/thudzj/Calibrated-DPMs. △ Less

Submitted 29 October, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: NeurIPS 2023

arXiv:2203.06560 [pdf, other]

Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior

Authors: Yinpeng Dong, Shuyu Cheng, Tianyu Pang, Hang Su, Jun Zhu

Abstract: Adversarial attacks have been extensively studied in recent years since they can identify the vulnerability of deep learning models before deployed. In this paper, we consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model. Previous methods attempted to approximate the true gradient either by using the tr… ▽ More Adversarial attacks have been extensively studied in recent years since they can identify the vulnerability of deep learning models before deployed. In this paper, we consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model. Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or based on the feedback of model queries. However, the existing methods inevitably suffer from low attack success rates or poor query efficiency since it is difficult to estimate the gradient in a high-dimensional input space with limited information. To address these problems and improve black-box attacks, we propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging, respectively. Our methods can take the advantage of a transfer-based prior given by the gradient of a surrogate model and the query information simultaneously. Through theoretical analyses, the transfer-based prior is appropriately integrated with model queries by an optimal coefficient in each method. Extensive experiments demonstrate that, in comparison with the alternative state-of-the-arts, both of our methods require much fewer queries to attack black-box models with higher success rates. △ Less

Submitted 12 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE Transactions on Pattern Recognition and Machine Intelligence (TPAMI). The official version is at https://ieeexplore.ieee.org/document/9609659

arXiv:2202.10103 [pdf, other]

Robustness and Accuracy Could Be Reconcilable by (Proper) Definition

Authors: Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan

Abstract: The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still controversial, the prevailing view is that this trade-off is inherent, either empirically or theoretically. Thus, we dig for the origin of this trade-off in adversarial training and find that it may stem from the improperly defined robust error, which imposes an inductive bias of loc… ▽ More The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still controversial, the prevailing view is that this trade-off is inherent, either empirically or theoretically. Thus, we dig for the origin of this trade-off in adversarial training and find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance -- an overcorrection towards smoothness. Given this, we advocate employing local equivariance to describe the ideal behavior of a robust model, leading to a self-consistent robust error named SCORE. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty via robust optimization. By simply substituting KL divergence with variants of distance metrics, SCORE can be efficiently minimized. Empirically, our models achieve top-rank performance on RobustBench under AutoAttack. Besides, SCORE provides instructive insights for explaining the overfitting phenomenon and semantic input gradients observed on robust models. Code is available at https://github.com/P2333/SCORE. △ Less

Submitted 16 June, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: ICML 2022

arXiv:2106.01606 [pdf, other]

Exploring Memorization in Adversarial Training

Authors: Yinpeng Dong, Ke Xu, Xiao Yang, Tianyu Pang, Zhijie Deng, Hang Su, Jun Zhu

Abstract: Deep learning models have a propensity for fitting the entire training set even with random labels, which requires memorization of every training sample. In this paper, we explore the memorization effect in adversarial training (AT) for promoting a deeper understanding of model capacity, convergence, generalization, and especially robust overfitting of the adversarially trained models. We first de… ▽ More Deep learning models have a propensity for fitting the entire training set even with random labels, which requires memorization of every training sample. In this paper, we explore the memorization effect in adversarial training (AT) for promoting a deeper understanding of model capacity, convergence, generalization, and especially robust overfitting of the adversarially trained models. We first demonstrate that deep networks have sufficient capacity to memorize adversarial examples of training data with completely random labels, but not all AT algorithms can converge under the extreme circumstance. Our study of AT with random labels motivates further analyses on the convergence and generalization of AT. We find that some AT approaches suffer from a gradient instability issue and most recently suggested complexity measures cannot explain robust generalization by considering models trained on random labels. Furthermore, we identify a significant drawback of memorization in AT that it could result in robust overfitting. We then propose a new mitigation algorithm motivated by detailed memorization analyses. Extensive experiments on various datasets validate the effectiveness of the proposed method. △ Less

Submitted 12 March, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

Comments: Accepted by ICLR 2022. 24 pages

arXiv:2103.13127 [pdf, other]

Black-box Detection of Backdoor Attacks with Limited Information and Data

Authors: Yinpeng Dong, Xiao Yang, Zhijie Deng, Tianyu Pang, Zihao Xiao, Hang Su, Jun Zhu

Abstract: Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various bac… ▽ More Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various backdoor detection and defense methods have been proposed. However, the existing techniques usually require the poisoned training data or access to the white-box model, which is commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against various backdoor attacks. △ Less

Submitted 24 March, 2021; originally announced March 2021.

arXiv:2010.00467 [pdf, other]

Bag of Tricks for Adversarial Training

Authors: Tianyu Pang, Xiao Yang, Yinpeng Dong, Hang Su, Jun Zhu

Abstract: Adversarial training (AT) is one of the most effective strategies for promoting model robustness. However, recent benchmarks show that most of the proposed improvements on AT are less effective than simply early stopping the training procedure. This counter-intuitive fact motivates us to investigate the implementation details of tens of AT methods. Surprisingly, we find that the basic settings (e.… ▽ More Adversarial training (AT) is one of the most effective strategies for promoting model robustness. However, recent benchmarks show that most of the proposed improvements on AT are less effective than simply early stopping the training procedure. This counter-intuitive fact motivates us to investigate the implementation details of tens of AT methods. Surprisingly, we find that the basic settings (e.g., weight decay, training schedule, etc.) used in these methods are highly inconsistent. In this work, we provide comprehensive evaluations on CIFAR-10, focusing on the effects of mostly overlooked training tricks and hyperparameters for adversarially trained models. Our empirical observations suggest that adversarial robustness is much more sensitive to some basic training settings than we thought. For example, a slightly different value of weight decay can reduce the model robust accuracy by more than 7%, which is probable to override the potential promotion induced by the proposed methods. We conclude a baseline training setting and re-implement previous defenses to achieve new state-of-the-art results. These facts also appeal to more concerns on the overlooked confounders when benchmarking defenses. △ Less

Submitted 31 March, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

Comments: ICLR 2021

arXiv:2007.03317 [pdf, other]

Efficient Learning of Generative Models via Finite-Difference Score Matching

Authors: Tianyu Pang, Kun Xu, Chongxuan Li, Yang Song, Stefano Ermon, Jun Zhu

Abstract: Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive in respect to memory and computation even with automatic differentiation. As a typical example in generative modeling, score matching (SM) involves the optimization of the trace of a Hessian. To improve computing efficiency, we rewrite th… ▽ More Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive in respect to memory and computation even with automatic differentiation. As a typical example in generative modeling, score matching (SM) involves the optimization of the trace of a Hessian. To improve computing efficiency, we rewrite the SM objective and its variants in terms of directional derivatives, and present a generic strategy to efficiently approximate any-order directional derivative with finite difference (FD). Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations. Thus, it reduces the total computational cost while also improving numerical stability. We provide two instantiations by reformulating variants of SM objectives into the FD forms. Empirically, we demonstrate that our methods produce results comparable to the gradient-based counterparts while being much more computationally efficient. △ Less

Submitted 25 November, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

Comments: NeurIPS 2020

arXiv:2003.06814 [pdf, other]

Towards Face Encryption by Generating Adversarial Identity Masks

Authors: Xiao Yang, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu, Yuefeng Chen, Hui Xue

Abstract: As billions of personal data being shared through social media and network, the data privacy and security have drawn an increasing attention. Several attempts have been made to alleviate the leakage of identity information from face photos, with the aid of, e.g., image obfuscation techniques. However, most of the present results are either perceptually unsatisfactory or ineffective against face re… ▽ More As billions of personal data being shared through social media and network, the data privacy and security have drawn an increasing attention. Several attempts have been made to alleviate the leakage of identity information from face photos, with the aid of, e.g., image obfuscation techniques. However, most of the present results are either perceptually unsatisfactory or ineffective against face recognition systems. Our goal in this paper is to develop a technique that can encrypt the personal photos such that they can protect users from unauthorized face recognition systems but remain visually identical to the original version for human beings. To achieve this, we propose a targeted identity-protection iterative method (TIP-IM) to generate adversarial identity masks which can be overlaid on facial images, such that the original identities can be concealed without sacrificing the visual quality. Extensive experiments demonstrate that TIP-IM provides 95\%+ protection success rate against various state-of-the-art face recognition models under practical test scenarios. Besides, we also show the practical and effective applicability of our method on a commercial API service. △ Less

Submitted 16 August, 2021; v1 submitted 15 March, 2020; originally announced March 2020.

Comments: Accepted by ICCV2021

arXiv:2002.08619 [pdf, other]

Boosting Adversarial Training with Hypersphere Embedding

Authors: Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su

Abstract: Adversarial training (AT) is one of the most effective defenses against adversarial attacks for deep learning models. In this work, we advocate incorporating the hypersphere embedding (HE) mechanism into the AT procedure by regularizing the features onto compact manifolds, which constitutes a lightweight yet effective module to blend in the strength of representation learning. Our extensive analys… ▽ More Adversarial training (AT) is one of the most effective defenses against adversarial attacks for deep learning models. In this work, we advocate incorporating the hypersphere embedding (HE) mechanism into the AT procedure by regularizing the features onto compact manifolds, which constitutes a lightweight yet effective module to blend in the strength of representation learning. Our extensive analyses reveal that AT and HE are well coupled to benefit the robustness of the adversarially trained models from several aspects. We validate the effectiveness and adaptability of HE by embedding it into the popular AT frameworks including PGD-AT, ALP, and TRADES, as well as the FreeAT and FastAT strategies. In the experiments, we evaluate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets, which verifies that integrating HE can consistently enhance the model robustness for each AT framework with little extra computation. △ Less

Submitted 25 November, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: NeurIPS 2020

arXiv:2002.05999 [pdf, other]

Adversarial Distributional Training for Robust Deep Learning

Authors: Yinpeng Dong, Zhijie Deng, Tianyu Pang, Hang Su, Jun Zhu

Abstract: Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples. However, most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks. Besides, a single attack algorithm could be insufficient to explore the space of perturbations.… ▽ More Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples. However, most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks. Besides, a single attack algorithm could be insufficient to explore the space of perturbations. In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models. ADT is formulated as a minimax optimization problem, where the inner maximization aims to learn an adversarial distribution to characterize the potential adversarial examples around a natural one under an entropic regularizer, and the outer minimization aims to train robust models by minimizing the expected loss over the worst-case adversarial distributions. Through a theoretical analysis, we develop a general algorithm for solving ADT, and present three approaches for parameterizing the adversarial distributions, ranging from the typical Gaussian distributions to the flexible implicit ones. Empirical results on several benchmarks validate the effectiveness of ADT compared with the state-of-the-art AT methods. △ Less

Submitted 19 November, 2020; v1 submitted 14 February, 2020; originally announced February 2020.

Comments: NeurIPS 2020. The first two authors contribute equally

arXiv:1912.11852 [pdf, other]

Benchmarking Adversarial Robustness

Authors: Yinpeng Dong, Qi-An Fu, Xiao Yang, Tianyu Pang, Hang Su, Zihao Xiao, Jun Zhu

Abstract: Deep neural networks are vulnerable to adversarial examples, which becomes one of the most important research problems in the development of deep learning. While a lot of efforts have been made in recent years, it is of great significance to perform correct and complete evaluations of the adversarial attack and defense algorithms. In this paper, we establish a comprehensive, rigorous, and coherent… ▽ More Deep neural networks are vulnerable to adversarial examples, which becomes one of the most important research problems in the development of deep learning. While a lot of efforts have been made in recent years, it is of great significance to perform correct and complete evaluations of the adversarial attack and defense algorithms. In this paper, we establish a comprehensive, rigorous, and coherent benchmark to evaluate adversarial robustness on image classification tasks. After briefly reviewing plenty of representative attack and defense methods, we perform large-scale experiments with two robustness curves as the fair-minded evaluation criteria to fully understand the performance of these methods. Based on the evaluation results, we draw several important findings and provide insights for future research. △ Less

Submitted 26 December, 2019; originally announced December 2019.

arXiv:1909.11515 [pdf, other]

Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks

Authors: Tianyu Pang, Kun Xu, Jun Zhu

Abstract: It has been widely recognized that adversarial examples can be easily crafted to fool deep networks, which mainly root from the locally non-linear behavior nearby input examples. Applying mixup in training provides an effective mechanism to improve generalization performance and model robustness against adversarial perturbations, which introduces the globally linear behavior in-between training ex… ▽ More It has been widely recognized that adversarial examples can be easily crafted to fool deep networks, which mainly root from the locally non-linear behavior nearby input examples. Applying mixup in training provides an effective mechanism to improve generalization performance and model robustness against adversarial perturbations, which introduces the globally linear behavior in-between training examples. However, in previous work, the mixup-trained models only passively defend adversarial attacks in inference by directly classifying the inputs, where the induced global linearity is not well exploited. Namely, since the locality of the adversarial perturbations, it would be more efficient to actively break the locality via the globality of the model predictions. Inspired by simple geometric intuition, we develop an inference principle, named mixup inference (MI), for mixup-trained models. MI mixups the input with other random clean samples, which can shrink and transfer the equivalent perturbation if the input is adversarial. Our experiments on CIFAR-10 and CIFAR-100 demonstrate that MI can further improve the adversarial robustness for the models trained by mixup and its variants. △ Less

Submitted 20 February, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: ICLR 2020

arXiv:1906.06919 [pdf, other]

Improving Black-box Adversarial Attacks with a Transfer-based Prior

Authors: Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu

Abstract: We consider the black-box adversarial setting, where the adversary has to generate adversarial perturbations without access to the target models to compute gradients. Previous methods tried to approximate the gradient either by using a transfer gradient of a surrogate white-box model, or based on the query feedback. However, these methods often suffer from low attack success rates or poor query ef… ▽ More We consider the black-box adversarial setting, where the adversary has to generate adversarial perturbations without access to the target models to compute gradients. Previous methods tried to approximate the gradient either by using a transfer gradient of a surrogate white-box model, or based on the query feedback. However, these methods often suffer from low attack success rates or poor query efficiency since it is non-trivial to estimate the gradient in a high-dimensional space with limited information. To address these problems, we propose a prior-guided random gradient-free (P-RGF) method to improve black-box adversarial attacks, which takes the advantage of a transfer-based prior and the query information simultaneously. The transfer-based prior given by the gradient of a surrogate model is appropriately integrated into our algorithm by an optimal coefficient derived by a theoretical analysis. Extensive experiments demonstrate that our method requires much fewer queries to attack black-box models with higher success rates compared with the alternative state-of-the-art methods. △ Less

Submitted 26 July, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

Comments: NeurIPS 2019; Code available at https://github.com/thu-ml/Prior-Guided-RGF

arXiv:1905.10626 [pdf, other]

Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness

Authors: Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu

Abstract: Previous work shows that adversarially robust generalization requires larger sample complexity, and the same dataset, e.g., CIFAR-10, which enables good standard accuracy may not suffice to train robust models. Since collecting new training data could be costly, we focus on better utilizing the given data by inducing the regions with high sample density in the feature space, which could lead to lo… ▽ More Previous work shows that adversarially robust generalization requires larger sample complexity, and the same dataset, e.g., CIFAR-10, which enables good standard accuracy may not suffice to train robust models. Since collecting new training data could be costly, we focus on better utilizing the given data by inducing the regions with high sample density in the feature space, which could lead to locally sufficient samples for robust learning. We first formally show that the softmax cross-entropy (SCE) loss and its variants convey inappropriate supervisory signals, which encourage the learned feature points to spread over the space sparsely in training. This inspires us to propose the Max-Mahalanobis center (MMC) loss to explicitly induce dense feature regions in order to benefit robustness. Namely, the MMC loss encourages the model to concentrate on learning ordered and compact representations, which gather around the preset optimal centers for different classes. We empirically demonstrate that applying the MMC loss can significantly improve robustness even under strong adaptive attacks, while keeping state-of-the-art accuracy on clean inputs with little extra computation compared to the SCE loss. △ Less

Submitted 20 February, 2020; v1 submitted 25 May, 2019; originally announced May 2019.

Comments: ICLR 2020

arXiv:1901.08846 [pdf, other]

Improving Adversarial Robustness via Promoting Ensemble Diversity

Authors: Tianyu Pang, Kun Xu, Chao Du, Ning Chen, Jun Zhu

Abstract: Though deep neural networks have achieved significant progress on various tasks, often enhanced by model ensemble, existing high-performance models can be vulnerable to adversarial attacks. Many efforts have been devoted to enhancing the robustness of individual networks and then constructing a straightforward ensemble, e.g., by directly averaging the outputs, which ignores the interaction among n… ▽ More Though deep neural networks have achieved significant progress on various tasks, often enhanced by model ensemble, existing high-performance models can be vulnerable to adversarial attacks. Many efforts have been devoted to enhancing the robustness of individual networks and then constructing a straightforward ensemble, e.g., by directly averaging the outputs, which ignores the interaction among networks. This paper presents a new method that explores the interaction among individual networks to improve robustness for ensemble models. Technically, we define a new notion of ensemble diversity in the adversarial setting as the diversity among non-maximal predictions of individual members, and present an adaptive diversity promoting (ADP) regularizer to encourage the diversity, which leads to globally better robustness for the ensemble by making adversarial examples difficult to transfer among individual members. Our method is computationally efficient and compatible with the defense methods acting on individual networks. Empirical results on various datasets verify that our method can improve adversarial robustness while maintaining state-of-the-art accuracy on normal examples. △ Less

Submitted 29 May, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

Comments: ICML 2019

arXiv:1804.00097 [pdf, other]

Adversarial Attacks and Defences Competition

Authors: Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

Abstract: To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them. In this chapter, we describe the structure and organization of the competition and the solutions developed by several o… ▽ More To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them. In this chapter, we describe the structure and organization of the competition and the solutions developed by several of the top-placing teams. △ Less

Submitted 30 March, 2018; originally announced April 2018.

Comments: 36 pages, 10 figures

arXiv:1710.06081 [pdf, other]

Boosting Adversarial Attacks with Momentum

Authors: Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li

Abstract: Deep neural networks are vulnerable to adversarial examples, which poses security concerns on these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most of existing adversarial attacks can only fool a black-box model with a low success rate. To address th… ▽ More Deep neural networks are vulnerable to adversarial examples, which poses security concerns on these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most of existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating the momentum term into the iterative process for attacks, our methods can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won the first places in NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions. △ Less

Submitted 22 March, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

Comments: CVPR 2018 Spotlight

arXiv:1611.04248 [pdf, ps, other]

Asymptotic Inference for AR(1) Penal Data

Authors: Jianfei Shen, Tianxiao Pang

Abstract: A general asymptotic theory is given for the panel data AR(1) model with time series independent in different cross sections. The theory covers the cases of stationary process, nearly non-stationary process, unit root process, mildly integrated, mildly explosive and explosive processes. It is assumed that the cross-sectional dimension and time-series dimension are respectively $N$ and $T$. The res… ▽ More A general asymptotic theory is given for the panel data AR(1) model with time series independent in different cross sections. The theory covers the cases of stationary process, nearly non-stationary process, unit root process, mildly integrated, mildly explosive and explosive processes. It is assumed that the cross-sectional dimension and time-series dimension are respectively $N$ and $T$. The results in this paper illustrate that whichever the process is, with an appropriate regularization, the least squares estimator of the autoregressive coefficient converges to a normal distribution with rate at least $O(N^{-1/3})$. Since the variance is the key to characterize the normal distribution, it is important to discuss the variance of the least squares estimator. We will show that when the autoregressive coefficient $ρ$ satisfies $|ρ|<1$, the variance declines at the rate $O((NT)^{-1/2})$, while the rate changes to $O(N^{-1/2}T^{-1})$ when $ρ=1$ and $O(N^{-1/2}ρ^{-T+2})$ when $|ρ|>1$. $ρ=1$ is the critical point where the convergence rate changes radically. The transition process is studied by assuming $ρ$ depending on $T$ and going to $1$. An interesting phenomenon discovered in this paper is that, in the explosive case, the least squares estimator of the autoregressive coefficient has a standard normal limiting distribution in panel data case while it may not has a limiting distribution in univariate time series case. △ Less

Submitted 13 November, 2016; originally announced November 2016.

Comments: 16 pages

Showing 1–22 of 22 results for author: Pang, T