Search | arXiv e-print repository

AI-augmented Automation for Real Driving Prediction: an Industrial Use Case

Authors: Romina Eramo, Hamzeh Eyal Salman, Matteo Spezialetti, Darko Stern, Pierre Quinton, Antonio Cicchetti

Abstract: The risen complexity of automotive systems requires new development strategies and methods to master the upcoming challenges. Traditional methods need thus to be changed by an increased level of automation, and a faster continuous improvement cycle. In this context, current vehicle performance tests represent a very time-consuming and expensive task due to the need to perform the tests in real dri… ▽ More The risen complexity of automotive systems requires new development strategies and methods to master the upcoming challenges. Traditional methods need thus to be changed by an increased level of automation, and a faster continuous improvement cycle. In this context, current vehicle performance tests represent a very time-consuming and expensive task due to the need to perform the tests in real driving conditions. As a consequence, agile/iterative processes like DevOps are largely hindered by the necessity of triggering frequent tests. This paper reports on a practical experience of developing an AI-augmented solution based on Machine Learning and Model-based Engineering to support continuous vehicle development and testing. In particular, historical data collected in real driving conditions is leveraged to synthesize a high-fidelity driving simulator and hence enable performance tests in virtual environments. Based on this practical experience, this paper also proposes a conceptual framework to support predictions based on real driving behavior. △ Less

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2312.06205 [pdf, other]

The Journey, Not the Destination: How Data Guides Diffusion Models

Authors: Kristian Georgiev, Joshua Vendrow, Hadi Salman, Sung Min Park, Aleksander Madry

Abstract: Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity. However, attributing these images back to the training data-that is, identifying specific training examples which caused an image to be generated-remains a challenge. In this paper, we propose a framework that: (i) provides a formal notion of data attribution in the context of diff… ▽ More Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity. However, attributing these images back to the training data-that is, identifying specific training examples which caused an image to be generated-remains a challenge. In this paper, we propose a framework that: (i) provides a formal notion of data attribution in the context of diffusion models, and (ii) allows us to counterfactually validate such attributions. Then, we provide a method for computing these attributions efficiently. Finally, we apply our method to find (and evaluate) such attributions for denoising diffusion probabilistic models trained on CIFAR-10 and latent diffusion models trained on MS COCO. We provide code at https://github.com/MadryLab/journey-TRAK . △ Less

Submitted 11 December, 2023; originally announced December 2023.

Comments: 29 pages, 17 figures

arXiv:2311.03071 [pdf, other]

OrthoNets: Orthogonal Channel Attention Networks

Authors: Hadi Salman, Caleb Parks, Matthew Swan, John Gauch

Abstract: Designing an effective channel attention mechanism implores one to find a lossy-compression method allowing for optimal feature representation. Despite recent progress in the area, it remains an open problem. FcaNet, the current state-of-the-art channel attention mechanism, attempted to find such an information-rich compression using Discrete Cosine Transforms (DCTs). One drawback of FcaNet is tha… ▽ More Designing an effective channel attention mechanism implores one to find a lossy-compression method allowing for optimal feature representation. Despite recent progress in the area, it remains an open problem. FcaNet, the current state-of-the-art channel attention mechanism, attempted to find such an information-rich compression using Discrete Cosine Transforms (DCTs). One drawback of FcaNet is that there is no natural choice of the DCT frequencies. To circumvent this issue, FcaNet experimented on ImageNet to find optimal frequencies. We hypothesize that the choice of frequency plays only a supporting role and the primary driving force for the effectiveness of their attention filters is the orthogonality of the DCT kernels. To test this hypothesis, we construct an attention mechanism using randomly initialized orthogonal filters. Integrating this mechanism into ResNet, we create OrthoNet. We compare OrthoNet to FcaNet (and other attention mechanisms) on Birds, MS-COCO, and Places356 and show superior performance. On the ImageNet dataset, our method competes with or surpasses the current state-of-the-art. Our results imply that an optimal choice of filter is elusive and generalization can be achieved with a sufficiently large number of orthogonal filters. We further investigate other general principles for implementing channel attention, such as its position in the network and channel groupings. Our code is publicly available at https://github.com/hady1011/OrthoNets/ △ Less

Submitted 6 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: IEEE BigData 2023

Journal ref: IEEE BigData 2023

arXiv:2307.10163 [pdf, other]

Rethinking Backdoor Attacks

Authors: Alaa Khaddaj, Guillaume Leclerc, Aleksandar Makelov, Kristian Georgiev, Hadi Salman, Andrew Ilyas, Aleksander Madry

Abstract: In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation. Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them. In this work, we present a different approach to the… ▽ More In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation. Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them. In this work, we present a different approach to the backdoor attack problem. Specifically, we show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data--and thus impossible to "detect" in a general sense. Then, guided by this observation, we revisit existing defenses against backdoor attacks and characterize the (often latent) assumptions they make and on which they depend. Finally, we explore an alternative perspective on backdoor attacks: one that assumes these attacks correspond to the strongest feature in the training data. Under this assumption (which we make formal) we develop a new primitive for detecting backdoor attacks. Our primitive naturally gives rise to a detection algorithm that comes with theoretical guarantees and is effective in practice. △ Less

Submitted 19 July, 2023; originally announced July 2023.

Comments: ICML 2023

arXiv:2306.12517 [pdf, other]

FFCV: Accelerating Training by Removing Data Bottlenecks

Authors: Guillaume Leclerc, Andrew Ilyas, Logan Engstrom, Sung Min Park, Hadi Salman, Aleksander Madry

Abstract: We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly mor… ▽ More We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly more efficient, ensuring that GPUs can reach full utilization; and (b) offload as much data processing as possible to the CPU asynchronously, freeing GPU cycles for training. Using FFCV, we train ResNet-18 and ResNet-50 on the ImageNet dataset with competitive tradeoff between accuracy and training time. For example, we are able to train an ImageNet ResNet-50 model to 75\% in only 20 mins on a single machine. We demonstrate FFCV's performance, ease-of-use, extensibility, and ability to adapt to resource constraints through several case studies. Detailed installation instructions, documentation, and Slack support channel are available at https://ffcv.io/ . △ Less

Submitted 21 June, 2023; originally announced June 2023.

arXiv:2302.06588 [pdf, other]

Raising the Cost of Malicious AI-Powered Image Editing

Authors: Hadi Salman, Alaa Khaddaj, Guillaume Leclerc, Andrew Ilyas, Aleksander Madry

Abstract: We present an approach to mitigating the risks of malicious image editing posed by large diffusion models. The key idea is to immunize images so as to make them resistant to manipulation by these models. This immunization relies on injection of imperceptible adversarial perturbations designed to disrupt the operation of the targeted diffusion models, forcing them to generate unrealistic images. We… ▽ More We present an approach to mitigating the risks of malicious image editing posed by large diffusion models. The key idea is to immunize images so as to make them resistant to manipulation by these models. This immunization relies on injection of imperceptible adversarial perturbations designed to disrupt the operation of the targeted diffusion models, forcing them to generate unrealistic images. We provide two methods for crafting such perturbations, and then demonstrate their efficacy. Finally, we discuss a policy component necessary to make our approach fully effective and practical -- one that involves the organizations developing diffusion models, rather than individual users, to implement (and support) the immunization process. △ Less

Submitted 13 February, 2023; originally announced February 2023.

arXiv:2211.02695 [pdf, other]

doi 10.1109/BigData55660.2022.10020665

WaveNets: Wavelet Channel Attention Networks

Authors: Hadi Salman, Caleb Parks, Shi Yin Hong, Justin Zhan

Abstract: Channel Attention reigns supreme as an effective technique in the field of computer vision. However, the proposed channel attention by SENet suffers from information loss in feature learning caused by the use of Global Average Pooling (GAP) to represent channels as scalars. Thus, designing effective channel attention mechanisms requires finding a solution to enhance features preservation in modeli… ▽ More Channel Attention reigns supreme as an effective technique in the field of computer vision. However, the proposed channel attention by SENet suffers from information loss in feature learning caused by the use of Global Average Pooling (GAP) to represent channels as scalars. Thus, designing effective channel attention mechanisms requires finding a solution to enhance features preservation in modeling channel inter-dependencies. In this work, we utilize Wavelet transform compression as a solution to the channel representation problem. We first test wavelet transform as an Auto-Encoder model equipped with conventional channel attention module. Next, we test wavelet transform as a standalone channel compression method. We prove that global average pooling is equivalent to the recursive approximate Haar wavelet transform. With this proof, we generalize channel attention using Wavelet compression and name it WaveNet. Implementation of our method can be embedded within existing channel attention methods with a couple of lines of code. We test our proposed method using ImageNet dataset for image classification task. Our method outperforms the baseline SENet, and achieves the state-of-the-art results. Our code implementation is publicly available at https://github.com/hady1011/WaveNet-C. △ Less

Submitted 12 March, 2024; v1 submitted 4 November, 2022; originally announced November 2022.

Comments: IEEE BigData2022 conference

arXiv:2207.05739 [pdf, other]

A Data-Based Perspective on Transfer Learning

Authors: Saachi Jain, Hadi Salman, Alaa Khaddaj, Eric Wong, Sung Min Park, Aleksander Madry

Abstract: It is commonly believed that in transfer learning including more pre-training data translates into better performance. However, recent evidence suggests that removing data from the source dataset can actually help too. In this work, we take a closer look at the role of the source dataset's composition in transfer learning and present a framework for probing its impact on downstream performance. Ou… ▽ More It is commonly believed that in transfer learning including more pre-training data translates into better performance. However, recent evidence suggests that removing data from the source dataset can actually help too. In this work, we take a closer look at the role of the source dataset's composition in transfer learning and present a framework for probing its impact on downstream performance. Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness as well as detecting pathologies such as data-leakage and the presence of misleading examples in the source dataset. In particular, we demonstrate that removing detrimental datapoints identified by our framework improves transfer learning performance from ImageNet on a variety of target tasks. Code is available at https://github.com/MadryLab/data-transfer △ Less

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2207.02842 [pdf, other]

When does Bias Transfer in Transfer Learning?

Authors: Hadi Salman, Saachi Jain, Andrew Ilyas, Logan Engstrom, Eric Wong, Aleksander Madry

Abstract: Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside. In this work, we demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class. Through a combination of synthetic and natural… ▽ More Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside. In this work, we demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class. Through a combination of synthetic and natural experiments, we show that bias transfer both (a) arises in realistic settings (such as when pre-training on ImageNet or other standard datasets) and (b) can occur even when the target dataset is explicitly de-biased. As transfer-learned models are increasingly deployed in the real world, our work highlights the importance of understanding the limitations of pre-trained source models. Code is available at https://github.com/MadryLab/bias-transfer △ Less

Submitted 6 July, 2022; originally announced July 2022.

arXiv:2204.11233 [pdf]

doi 10.24138/jcomss-2021-0155

Naming the Identified Feature Implementation Blocks from Software Source Code

Authors: Ra'Fat Al-Msie'Deen, Hamzeh Eyal Salman, Anas H. Blasi, Mohammed A. Alsuwaiket

Abstract: Identifying software identifiers that implement a particular feature of a software product is known as feature identification. Feature identification is one of the most critical and popular processes performed by software engineers during software maintenance activity. However, a meaningful name must be assigned to the Identified Feature Implementation Block (IFIB) to complete the feature identifi… ▽ More Identifying software identifiers that implement a particular feature of a software product is known as feature identification. Feature identification is one of the most critical and popular processes performed by software engineers during software maintenance activity. However, a meaningful name must be assigned to the Identified Feature Implementation Block (IFIB) to complete the feature identification process. The feature naming process remains a challenging task, where the majority of existing approaches manually assign the name of the IFIB. In this paper, the approach called FeatureClouds was proposed, which can be exploited by software developers to name the IFIBs from software code. FeatureClouds approach incorporates word clouds visualization technique to name Feature Blocks (FBs) by using the most frequent words across these blocks. FeatureClouds had evaluated by assessing its added benefit to the current approaches in the literature, where limited tool support was supplied to software developers to distinguish feature names of the IFIBs. For validity, FeatureClouds had applied to draw shapes and ArgoUML software. The findings showed that the proposed approach achieved promising results according to well-known metrics in terms of Precision and Recall. △ Less

Submitted 24 April, 2022; originally announced April 2022.

Comments: 10 pages, 8 figures, 9 tables

Journal ref: Journal of Communications Software and Systems, vol. 18, no. 2, pp. 101-110, April 2022

arXiv:2204.08945 [pdf, other]

Missingness Bias in Model Debugging

Authors: Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry

Abstract: Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools. However, in computer vision, pixels cannot simply be removed from an image. One thus tends to resort to heuristics such as blacking out pixels, which may in turn introduce bias into the debugging process. We study such biases and, in particular, show how transformer-based architectures ca… ▽ More Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools. However, in computer vision, pixels cannot simply be removed from an image. One thus tends to resort to heuristics such as blacking out pixels, which may in turn introduce bias into the debugging process. We study such biases and, in particular, show how transformer-based architectures can enable a more natural implementation of missingness, which side-steps these issues and improves the reliability of model debugging in practice. Our code is available at https://github.com/madrylab/missingness △ Less

Submitted 13 June, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: Published at ICLR 2022

arXiv:2204.07862 [pdf, other]

GHM Wavelet Transform for Deep Image Super Resolution

Authors: Ben Lowe, Hadi Salman, Justin Zhan

Abstract: The GHM multi-level discrete wavelet transform is proposed as preprocessing for image super resolution with convolutional neural networks. Previous works perform analysis with the Haar wavelet only. In this work, 37 single-level wavelets are experimentally analyzed from Haar, Daubechies, Biorthogonal, Reverse Biorthogonal, Coiflets, and Symlets wavelet families. All single-level wavelets report si… ▽ More The GHM multi-level discrete wavelet transform is proposed as preprocessing for image super resolution with convolutional neural networks. Previous works perform analysis with the Haar wavelet only. In this work, 37 single-level wavelets are experimentally analyzed from Haar, Daubechies, Biorthogonal, Reverse Biorthogonal, Coiflets, and Symlets wavelet families. All single-level wavelets report similar results indicating that the convolutional neural network is invariant to choice of wavelet in a single-level filter approach. However, the GHM multi-level wavelet achieves higher quality reconstructions than the single-level wavelets. Three large data sets are used for the experiments: DIV2K, a dataset of textures, and a dataset of satellite images. The approximate high resolution images are compared using seven objective error measurements. A convolutional neural network based approach using wavelet transformed images has good results in the literature. △ Less

Submitted 16 April, 2022; originally announced April 2022.

Comments: 13 pages

arXiv:2203.00312 [pdf]

Detecting commonality and variability in use-case diagram variants

Authors: Ra'Fat AL-Msie'deen, Anas H. Blasi, Hamzeh Eyal Salman, Saqer S. Alja'afreh, Ahmad Abadleh, Mohammed A. Alsuwaiket, Awni Hammouri, Asmaa Jameel Al_Nawaiseh, Wafa Tarawneh, Suleyman A. Al-Showarah

Abstract: The use-case diagram is a software artifact. Thus, as with any software artifact, the use-case diagrams change across time through the software development life cycle. Therefore, several versions of the same diagram are existed at distinct times. Thus, comparing all use-case diagram variants to detect common and variable use-cases becomes one of the main challenges in the product line reengineerin… ▽ More The use-case diagram is a software artifact. Thus, as with any software artifact, the use-case diagrams change across time through the software development life cycle. Therefore, several versions of the same diagram are existed at distinct times. Thus, comparing all use-case diagram variants to detect common and variable use-cases becomes one of the main challenges in the product line reengineering field. The contribution of this paper is to suggest an automatic approach to compare a collection of use-case diagram variants and detect both commonality and variability. In our work, every use-case represents a feature. The proposed approach visualizes the detected features using formal concept analysis, where common and variable features are introduced to software engineers. The proposed approach was applied on a mobile media case study to be validated. The findings confirm the importance and the performance of the suggested approach as all common and variable features were precisely detected via formal concept analysis and latent semantic indexing. △ Less

Submitted 1 March, 2022; originally announced March 2022.

Comments: 14 pages, 10 figures, 6 tables

Journal ref: Journal of Theoretical and Applied Information Technology, 28th February 2022, Vol. 100, No. 04, pp. 1113 - 1126, 2022

arXiv:2110.07719 [pdf, other]

Certified Patch Robustness via Smoothed Vision Transformers

Authors: Hadi Salman, Saachi Jain, Eric Wong, Aleksander Mądry

Abstract: Certified patch defenses can guarantee robustness of an image classifier to arbitrary changes within a bounded contiguous region. But, currently, this robustness comes at a cost of degraded standard accuracies and slower inference times. We demonstrate how using vision transformers enables significantly better certified patch robustness that is also more computationally efficient and does not incu… ▽ More Certified patch defenses can guarantee robustness of an image classifier to arbitrary changes within a bounded contiguous region. But, currently, this robustness comes at a cost of degraded standard accuracies and slower inference times. We demonstrate how using vision transformers enables significantly better certified patch robustness that is also more computationally efficient and does not incur a substantial drop in standard accuracy. These improvements stem from the inherent ability of the vision transformer to gracefully handle largely masked images. Our code is available at https://github.com/MadryLab/smoothed-vit. △ Less

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2106.13364 [pdf, other]

CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning

Authors: Daniel McDuff, Yale Song, Jiyoung Lee, Vibhav Vineet, Sai Vemprala, Nicholas Gyde, Hadi Salman, Shuang Ma, Kwanghoon Sohn, Ashish Kapoor

Abstract: The ability to perform causal and counterfactual reasoning are central properties of human intelligence. Decision-making systems that can perform these types of reasoning have the potential to be more generalizable and interpretable. Simulations have helped advance the state-of-the-art in this domain, by providing the ability to systematically vary parameters (e.g., confounders) and generate examp… ▽ More The ability to perform causal and counterfactual reasoning are central properties of human intelligence. Decision-making systems that can perform these types of reasoning have the potential to be more generalizable and interpretable. Simulations have helped advance the state-of-the-art in this domain, by providing the ability to systematically vary parameters (e.g., confounders) and generate examples of the outcomes in the case of counterfactual scenarios. However, simulating complex temporal causal events in multi-agent scenarios, such as those that exist in driving and vehicle navigation, is challenging. To help address this, we present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning in the safety-critical context. A core component of our work is to introduce \textit{agency}, such that it is simple to define and create complex scenarios using high-level definitions. The vehicles then operate with agency to complete these objectives, meaning low-level behaviors need only be controlled if necessary. We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment. Finally, we highlight challenges and opportunities for future work. △ Less

Submitted 24 June, 2021; originally announced June 2021.

arXiv:2106.03805 [pdf, other]

3DB: A Framework for Debugging Computer Vision Models

Authors: Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

Abstract: We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study th… ▽ More We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study their interplay. Finally, we find that the insights generated by the system transfer to the physical world. We are releasing 3DB as a library (https://github.com/3db/3db) alongside a set of example analyses, guides, and documentation: https://3db.github.io/3db/ . △ Less

Submitted 7 June, 2021; originally announced June 2021.

arXiv:2012.12235 [pdf, other]

Unadversarial Examples: Designing Objects for Robust Vision

Authors: Hadi Salman, Andrew Ilyas, Logan Engstrom, Sai Vemprala, Aleksander Madry, Ashish Kapoor

Abstract: We study a class of realistic computer vision settings wherein one can influence the design of the objects being recognized. We develop a framework that leverages this capability to significantly improve vision models' performance and robustness. This framework exploits the sensitivity of modern machine learning algorithms to input perturbations in order to design "robust objects," i.e., objects t… ▽ More We study a class of realistic computer vision settings wherein one can influence the design of the objects being recognized. We develop a framework that leverages this capability to significantly improve vision models' performance and robustness. This framework exploits the sensitivity of modern machine learning algorithms to input perturbations in order to design "robust objects," i.e., objects that are explicitly optimized to be confidently detected or classified. We demonstrate the efficacy of the framework on a wide variety of vision-based tasks ranging from standard benchmarks, to (in-simulation) robotics, to real-world experiments. Our code can be found at https://git.io/unadversarial . △ Less

Submitted 22 December, 2020; originally announced December 2020.

arXiv:2007.08489 [pdf, other]

Do Adversarially Robust ImageNet Models Transfer Better?

Authors: Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, Aleksander Madry

Abstract: Transfer learning is a widely-used paradigm in deep learning, where models pre-trained on standard datasets can be efficiently adapted to downstream tasks. Typically, better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance. In this work, we identify another such aspect: we find that adversarially robust models, whil… ▽ More Transfer learning is a widely-used paradigm in deep learning, where models pre-trained on standard datasets can be efficiently adapted to downstream tasks. Typically, better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance. In this work, we identify another such aspect: we find that adversarially robust models, while less accurate, often perform better than their standard-trained counterparts when used for transfer learning. Specifically, we focus on adversarially robust ImageNet classifiers, and show that they yield improved accuracy on a standard suite of downstream classification tasks. Further analysis uncovers more differences between robust and standard models in the context of transfer learning. Our results are consistent with (and in fact, add to) recent hypotheses stating that robustness leads to improved feature representations. Our code and models are available at https://github.com/Microsoft/robust-models-transfer . △ Less

Submitted 7 December, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: NeurIPS 2020

arXiv:2004.12478 [pdf, other]

Improved Image Wasserstein Attacks and Defenses

Authors: Edward J. Hu, Adith Swaminathan, Hadi Salman, Greg Yang

Abstract: Robustness against image perturbations bounded by a $\ell_p$ ball have been well-studied in recent literature. Perturbations in the real-world, however, rarely exhibit the pixel independence that $\ell_p$ threat models assume. A recently proposed Wasserstein distance-bounded threat model is a promising alternative that limits the perturbation to pixel mass movements. We point out and rectify flaws… ▽ More Robustness against image perturbations bounded by a $\ell_p$ ball have been well-studied in recent literature. Perturbations in the real-world, however, rarely exhibit the pixel independence that $\ell_p$ threat models assume. A recently proposed Wasserstein distance-bounded threat model is a promising alternative that limits the perturbation to pixel mass movements. We point out and rectify flaws in previous definition of the Wasserstein threat model and explore stronger attacks and defenses under our better-defined framework. Lastly, we discuss the inability of current Wasserstein-robust models in defending against perturbations seen in the real world. Our code and trained models are available at https://github.com/edwardjhu/improved_wasserstein . △ Less

Submitted 9 May, 2023; v1 submitted 26 April, 2020; originally announced April 2020.

Comments: Best paper award at ICLR Trustworthy ML Workshop 2020

arXiv:2003.01908 [pdf, other]

Denoised Smoothing: A Provable Defense for Pretrained Classifiers

Authors: Hadi Salman, Mingjie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter

Abstract: We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. This method, for instance, allows public vision API providers and users to seamlessly convert pretrained non-robust classification services into provably robust ones. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effecti… ▽ More We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. This method, for instance, allows public vision API providers and users to seamlessly convert pretrained non-robust classification services into provably robust ones. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be $\ell_p$-robust to adversarial examples, without modifying the pretrained classifier. Our approach applies to both the white-box and the black-box settings of the pretrained classifier. We refer to this defense as denoised smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our approach to provably defend the Azure, Google, AWS, and ClarifAI image classification APIs. Our code replicating all the experiments in the paper can be found at: https://github.com/microsoft/denoised-smoothing. △ Less

Submitted 20 September, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

Comments: 10 pages main text; 29 pages total

arXiv:2002.08118 [pdf, other]

Randomized Smoothing of All Shapes and Sizes

Authors: Greg Yang, Tony Duan, J. Edward Hu, Hadi Salman, Ilya Razenshteyn, Jerry Li

Abstract: Randomized smoothing is the current state-of-the-art defense with provable robustness against $\ell_2$ adversarial attacks. Many works have devised new randomized smoothing schemes for other metrics, such as $\ell_1$ or $\ell_\infty$; however, substantial effort was needed to derive such new guarantees. This begs the question: can we find a general theory for randomized smoothing? We propose a n… ▽ More Randomized smoothing is the current state-of-the-art defense with provable robustness against $\ell_2$ adversarial attacks. Many works have devised new randomized smoothing schemes for other metrics, such as $\ell_1$ or $\ell_\infty$; however, substantial effort was needed to derive such new guarantees. This begs the question: can we find a general theory for randomized smoothing? We propose a novel framework for devising and analyzing randomized smoothing schemes, and validate its effectiveness in practice. Our theoretical contributions are: (1) we show that for an appropriate notion of "optimal", the optimal smoothing distributions for any "nice" norms have level sets given by the norm's *Wulff Crystal*; (2) we propose two novel and complementary methods for deriving provably robust radii for any smoothing distribution; and, (3) we show fundamental limits to current randomized smoothing techniques via the theory of *Banach space cotypes*. By combining (1) and (2), we significantly improve the state-of-the-art certified accuracy in $\ell_1$ on standard datasets. Meanwhile, we show using (3) that with only label statistics under random input perturbations, randomized smoothing cannot achieve nontrivial certified accuracy against perturbations of $\ell_p$-norm $Ω(\min(1, d^{\frac{1}{p} - \frac{1}{2}}))$, when the input dimension $d$ is large. We provide code in github.com/tonyduan/rs4a. △ Less

Submitted 23 July, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

Comments: 9 pages main text, 49 pages total

arXiv:1907.10599 [pdf, other]

A Fine-Grained Spectral Perspective on Neural Networks

Authors: Greg Yang, Hadi Salman

Abstract: Are neural networks biased toward simple functions? Does depth always help learn more complex features? Is training the last layer of a network as good as training all layers? How to set the range for learning rate tuning? These questions seem unrelated at face value, but in this work we give all of them a common treatment from the spectral perspective. We will study the spectra of the *Conjugate… ▽ More Are neural networks biased toward simple functions? Does depth always help learn more complex features? Is training the last layer of a network as good as training all layers? How to set the range for learning rate tuning? These questions seem unrelated at face value, but in this work we give all of them a common treatment from the spectral perspective. We will study the spectra of the *Conjugate Kernel, CK,* (also called the *Neural Network-Gaussian Process Kernel*), and the *Neural Tangent Kernel, NTK*. Roughly, the CK and the NTK tell us respectively "what a network looks like at initialization" and "what a network looks like during and after training." Their spectra then encode valuable information about the initial distribution and the training and generalization properties of neural networks. By analyzing the eigenvalues, we lend novel insights into the questions put forth at the beginning, and we verify these insights by extensive experiments of neural networks. We derive fast algorithms for computing the spectra of CK and NTK when the data is uniformly distributed over the boolean cube, and show this spectra is the same in high dimensions when data is drawn from isotropic Gaussian or uniformly over the sphere. Code replicating our results is available at github.com/thegregyang/NNspectra. △ Less

Submitted 9 April, 2020; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: 8 pages of main text, 19 figures, 51 pages including appendix

arXiv:1906.04584 [pdf, other]

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Authors: Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sebastien Bubeck

Abstract: Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in… ▽ More Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further. Our code and trained models are available at http://github.com/Hadisalman/smoothing-adversarial . △ Less

Submitted 9 January, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

Comments: Spotlight at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada; 9 pages main text; 31 pages total

arXiv:1902.08722 [pdf, other]

A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks

Authors: Hadi Salman, Greg Yang, Huan Zhang, Cho-Jui Hsieh, Pengchuan Zhang

Abstract: Verification of neural networks enables us to gauge their robustness against adversarial attacks. Verification algorithms fall into two categories: exact verifiers that run in exponential time and relaxed verifiers that are efficient but incomplete. In this paper, we unify all existing LP-relaxed verifiers, to the best of our knowledge, under a general convex relaxation framework. This framework w… ▽ More Verification of neural networks enables us to gauge their robustness against adversarial attacks. Verification algorithms fall into two categories: exact verifiers that run in exponential time and relaxed verifiers that are efficient but incomplete. In this paper, we unify all existing LP-relaxed verifiers, to the best of our knowledge, under a general convex relaxation framework. This framework works for neural networks with diverse architectures and nonlinearities and covers both primal and dual views of robustness verification. We further prove strong duality between the primal and dual problems under very mild conditions. Next, we perform large-scale experiments, amounting to more than 22 CPU-years, to obtain exact solution to the convex-relaxed problem that is optimal within our framework for ReLU networks. We find the exact solution does not significantly improve upon the gap between PGD and existing relaxed verifiers for various networks trained normally or robustly on MNIST and CIFAR datasets. Our results suggest there is an inherent barrier to tight verification for the large class of methods captured by our framework. We discuss possible causes of this barrier and potential future directions for bypassing it. Our code and trained models are available at http://github.com/Hadisalman/robust-verify-benchmark . △ Less

Submitted 9 January, 2020; v1 submitted 22 February, 2019; originally announced February 2019.

Comments: Poster at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

arXiv:1810.03256 [pdf, other]

Deep Diffeomorphic Normalizing Flows

Authors: Hadi Salman, Payman Yadollahpour, Tom Fletcher, Kayhan Batmanghelich

Abstract: The Normalizing Flow (NF) models a general probability density by estimating an invertible transformation applied on samples drawn from a known distribution. We introduce a new type of NF, called Deep Diffeomorphic Normalizing Flow (DDNF). A diffeomorphic flow is an invertible function where both the function and its inverse are smooth. We construct the flow using an ordinary differential equation… ▽ More The Normalizing Flow (NF) models a general probability density by estimating an invertible transformation applied on samples drawn from a known distribution. We introduce a new type of NF, called Deep Diffeomorphic Normalizing Flow (DDNF). A diffeomorphic flow is an invertible function where both the function and its inverse are smooth. We construct the flow using an ordinary differential equation (ODE) governed by a time-varying smooth vector field. We use a neural network to parametrize the smooth vector field and a recursive neural network (RNN) for approximating the solution of the ODE. Each cell in the RNN is a residual network implementing one Euler integration step. The architecture of our flow enables efficient likelihood evaluation, straightforward flow inversion, and results in highly flexible density estimation. An end-to-end trained DDNF achieves competitive results with state-of-the-art methods on a suite of density estimation and variational inference tasks. Finally, our method brings concepts from Riemannian geometry that, we believe, can open a new research direction for neural density estimation. △ Less

Submitted 22 November, 2018; v1 submitted 7 October, 2018; originally announced October 2018.

arXiv:1803.01446 [pdf, other]

Learning to Sequence Robot Behaviors for Visual Navigation

Authors: Hadi Salman, Puneet Singhal, Tanmay Shankar, Peng Yin, Ali Salman, William Paivine, Guillaume Sartoretti, Matthew Travers, Howie Choset

Abstract: Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage the efficacy of such behaviors, it is necessary to select and sequence them to achieve a given task. In this paper, we present an approach to both learn and sequence robot behaviors, applied to the problem of visual navigation of mobile robot… ▽ More Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage the efficacy of such behaviors, it is necessary to select and sequence them to achieve a given task. In this paper, we present an approach to both learn and sequence robot behaviors, applied to the problem of visual navigation of mobile robots. We construct a layered representation of control policies composed of low- level behaviors and a meta-level policy. The low-level behaviors enable the robot to locomote in a particular environment while avoiding obstacles, and the meta-level policy actively selects the low-level behavior most appropriate for the current situation based purely on visual feedback. We demonstrate the effectiveness of our method on three simulated robot navigation tasks: a legged hexapod robot which must successfully traverse varying terrain, a wheeled robot which must navigate a maze-like course while avoiding obstacles, and finally a wheeled robot navigating in the presence of dynamic obstacles. We show that by learning control policies in a layered manner, we gain the ability to successfully traverse new compound environments composed of distinct sub-environments, and outperform both the low-level behaviors in their respective sub-environments, as well as a hand-crafted selection of low-level policies on these compound environments. △ Less

Submitted 25 March, 2018; v1 submitted 4 March, 2018; originally announced March 2018.

arXiv:1711.08828 [pdf, other]

A surgical system for automatic registration, stiffness mapping and dynamic image overlay

Authors: Nicolas Zevallos, Rangaprasad Arun Srivatsan, Hadi Salman, Lu Li, Jianing Qian, Saumya Saxena, Mengyun Xu, Kartik Patath, Howie Choset

Abstract: In this paper we develop a surgical system using the da Vinci research kit (dVRK) that is capable of autonomously searching for tumors and dynamically displaying the tumor location using augmented reality. Such a system has the potential to quickly reveal the location and shape of tumors and visually overlay that information to reduce the cognitive overload of the surgeon. We believe that our appr… ▽ More In this paper we develop a surgical system using the da Vinci research kit (dVRK) that is capable of autonomously searching for tumors and dynamically displaying the tumor location using augmented reality. Such a system has the potential to quickly reveal the location and shape of tumors and visually overlay that information to reduce the cognitive overload of the surgeon. We believe that our approach is one of the first to incorporate state-of-the-art methods in registration, force sensing and tumor localization into a unified surgical system. First, the preoperative model is registered to the intra-operative scene using a Bingham distribution-based filtering approach. An active level set estimation is then used to find the location and the shape of the tumors. We use a recently developed miniature force sensor to perform the palpation. The estimated stiffness map is then dynamically overlaid onto the registered preoperative model of the organ. We demonstrate the efficacy of our system by performing experiments on phantom prostate models with embedded stiff inclusions. △ Less

Submitted 23 November, 2017; originally announced November 2017.

Comments: International Symposium on Medical Robotics (ISMR 2018)

arXiv:1711.07063 [pdf, other]

Trajectory-Optimized Sensing for Active Search of Tissue Abnormalities in Robotic Surgery

Authors: Hadi Salman, Elif Ayvali, Rangaprasad Arun Srivatsan, Yifei Ma, Nicolas Zevallos, Rashid Yasin, Long Wang, Nabil Siman, Howie Choset

Abstract: In this work, we develop an approach for guiding robots to automatically localize and find the shapes of tumors and other stiff inclusions present in the anatomy. Our approach uses Gaussian processes to model the stiffness distribution and active learning to direct the palpation path of the robot. The palpation paths are chosen such that they maximize an acquisition function provided by an active… ▽ More In this work, we develop an approach for guiding robots to automatically localize and find the shapes of tumors and other stiff inclusions present in the anatomy. Our approach uses Gaussian processes to model the stiffness distribution and active learning to direct the palpation path of the robot. The palpation paths are chosen such that they maximize an acquisition function provided by an active learning algorithm. Our approach provides the flexibility to avoid obstacles in the robot's path, incorporate uncertainties in robot position and sensor measurements, include prior information about location of stiff inclusions while respecting the robot-kinematics. To the best of our knowledge this is the first work in literature that considers all the above conditions while localizing tumors. The proposed framework is evaluated via simulation and experimentation on three different robot platforms: 6-DoF industrial arm, da Vinci Research Kit (dVRK), and the Insertable Robotic Effector Platform (IREP). Results show that our approach can accurately estimate the locations and boundaries of the stiff inclusions while reducing exploration time. △ Less

Submitted 16 May, 2018; v1 submitted 19 November, 2017; originally announced November 2017.

Comments: 8 pages, ICRA 2018

arXiv:1708.04321 [pdf, other]

doi 10.1089/big.2018.0175

Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier -- A Review

Authors: V. B. Surya Prasath, Haneen Arafat Abu Alfeilat, Ahmad B. A. Hassanat, Omar Lasassmeh, Ahmad S. Tarawneh, Mahmoud Bashir Alhasanat, Hamzeh S. Eyal Salman

Abstract: The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested examples and the training examples. This raises a major question about which distance measures to be used for the KNN classi… ▽ More The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested examples and the training examples. This raises a major question about which distance measures to be used for the KNN classifier among a large number of distance and similarity measures available? This review attempts to answer this question through evaluating the performance (measured by accuracy, precision and recall) of the KNN using a large number of distance measures, tested on a number of real-world datasets, with and without adding different levels of noise. The experimental results show that the performance of KNN classifier depends significantly on the distance used, and the results showed large gaps between the performances of different distances. We found that a recently proposed non-convex distance performed the best when applied on most datasets comparing to the other tested distances. In addition, the performance of the KNN with this top performing distance degraded only about $20\%$ while the noise level reaches $90\%$, this is true for most of the distances used as well. This means that the KNN classifier using any of the top $10$ distances tolerate noise to a certain degree. Moreover, the results show that some distances are less affected by the added noise comparing to other distances. △ Less

Submitted 29 September, 2019; v1 submitted 14 August, 2017; originally announced August 2017.

Comments: 39 pages, 6 figures, 17 tables, revised text and added extra experiments

arXiv:1707.04294 [pdf, other]

Ergodic Coverage In Constrained Environments Using Stochastic Trajectory Optimization

Authors: Elif Ayvali, Hadi Salman, Howie Choset

Abstract: In search and surveillance applications in robotics, it is intuitive to spatially distribute robot trajectories with respect to the probability of locating targets in the domain. Ergodic coverage is one such approach to trajectory planning in which a robot is directed such that the percentage of time spent in a region is in proportion to the probability of locating targets in that region. In this… ▽ More In search and surveillance applications in robotics, it is intuitive to spatially distribute robot trajectories with respect to the probability of locating targets in the domain. Ergodic coverage is one such approach to trajectory planning in which a robot is directed such that the percentage of time spent in a region is in proportion to the probability of locating targets in that region. In this work, we extend the ergodic coverage algorithm to robots operating in constrained environments and present a formulation that can capture sensor footprint and avoid obstacles and restricted areas in the domain. We demonstrate that our formulation easily extends to coordination of multiple robots equipped with different sensing capabilities to perform ergodic coverage of a domain. △ Less

Submitted 22 July, 2017; v1 submitted 13 July, 2017; originally announced July 2017.

Comments: Accepted, IROS 2017

Showing 1–30 of 30 results for author: Salman, H