Showing 1–43 of 43 results for author: Papernot, N

Searching in archive stat. Search in all archives.
  1. arXiv:2402.03540  [pdf, other]

    cs.LG cs.GT stat.ML

    Regulation Games for Trustworthy Machine Learning

    Authors: Mohammad Yaghini, Patty Liu, Franziska Boenisch, Nicolas Papernot

    Abstract: Existing work on trustworthy machine learning (ML) often concentrates on individual aspects of trust, such as fairness or privacy. Additionally, many techniques overlook the distinction between those who train ML models and those responsible for assessing their trustworthiness. To address these issues, we propose a framework that views trustworthy ML as a multi-objective multi-agent optimization p…

    Submitted 5 February, 2024; originally announced February 2024.

  2. arXiv:2307.00310  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD

    Authors: Anvith Thudi, Hengrui Jia, Casey Meehan, Ilia Shumailov, Nicolas Papernot

    Abstract: Differentially private stochastic gradient descent (DP-SGD) is the canonical approach to private deep learning. While the current privacy analysis of DP-SGD is known to be tight in some settings, several empirical results suggest that models trained on common benchmark datasets leak significantly less privacy for many datapoints. Yet, despite past attempts, a rigorous explanation for why this is t…

    Submitted 16 July, 2024; v1 submitted 1 July, 2023; originally announced July 2023.

    Comments: Published in the 33rd USENIX Security Symposium

  3. arXiv:2208.03567  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Proof-of-Learning is Currently More Broken Than You Think

    Authors: Congyu Fang, Hengrui Jia, Anvith Thudi, Mohammad Yaghini, Christopher A. Choquette-Choo, Natalie Dullerud, Varun Chandrasekaran, Nicolas Papernot

    Abstract: Proof-of-Learning (PoL) proposes that a model owner logs training checkpoints to establish a proof of having expended the computation necessary for training. The authors of PoL forego cryptographic approaches and trade rigorous security guarantees for scalability to deep learning. They empirically argued the benefit of this approach by showing how spoofing--computing a proof for a stolen model--is…

    Submitted 17 April, 2023; v1 submitted 6 August, 2022; originally announced August 2022.

    Comments: Published in IEEE EuroS&P 2023

  4. arXiv:2207.12545  [pdf, other]

    cs.LG stat.ML

    $p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations

    Authors: Adam Dziedzic, Stephan Rabanser, Mohammad Yaghini, Armin Ale, Murat A. Erdogdu, Nicolas Papernot

    Abstract: The lack of well-calibrated confidence estimates makes neural networks inadequate in safety-critical domains such as autonomous driving or healthcare. In these settings, having the ability to abstain from making a prediction on out-of-distribution (OOD) data can be as important as correctly classifying in-distribution data. We introduce $p$-DkNN, a novel inference procedure that takes a trained de…

    Submitted 25 July, 2022; originally announced July 2022.

  5. arXiv:2206.14342  [pdf, other]

    cs.LG stat.ML

    Intrinsic Anomaly Detection for Multi-Variate Time Series

    Authors: Stephan Rabanser, Tim Januschowski, Kashif Rasul, Oliver Borchert, Richard Kurle, Jan Gasthaus, Michael Bohlke-Schneider, Nicolas Papernot, Valentin Flunkert

    Abstract: We introduce a novel, practically relevant variation of the anomaly detection problem in multi-variate time series: intrinsic anomaly detection. It appears in diverse practical scenarios ranging from DevOps to IoT, where we want to recognize failures of a system that operates under the influence of a surrounding environment. Intrinsic anomalies are changes in the functional dependency structure be…

    Submitted 28 June, 2022; originally announced June 2022.

  6. arXiv:2205.13532  [pdf, other]

    cs.LG stat.ML

    Selective Classification Via Neural Network Training Dynamics

    Authors: Stephan Rabanser, Anvith Thudi, Kimia Hamidieh, Adam Dziedzic, Nicolas Papernot

    Abstract: Selective classification is the task of rejecting inputs a model would predict incorrectly on through a trade-off between input space coverage and model accuracy. Current methods for selective classification impose constraints on either the model architecture or the loss function; this inhibits their usage in practice. In contrast to prior work, we show that state-of-the-art selective classificati…

    Submitted 12 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  7. arXiv:2203.12748  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning

    Authors: Natalie Dullerud, Karsten Roth, Kimia Hamidieh, Nicolas Papernot, Marzyeh Ghassemi

    Abstract: Deep metric learning (DML) enables learning with less supervision through its emphasis on the similarity structure of representations. There has been much work on improving generalization of DML in settings like zero-shot retrieval, but little is known about its implications for fairness. In this paper, we are the first to evaluate state-of-the-art DML methods trained on imbalanced data, and to sh…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Published as a conference paper at ICLR 2022

  8. arXiv:2110.11891  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning

    Authors: Anvith Thudi, Hengrui Jia, Ilia Shumailov, Nicolas Papernot

    Abstract: Machine unlearning, i.e. having a model forget about some of its training data, has become increasingly more important as privacy legislation promotes variants of the right-to-be-forgotten. In the context of deep learning, approaches for machine unlearning are broadly categorized into two classes: exact unlearning methods, where an entity has formally removed the data point's impact on the model b…

    Submitted 19 February, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: Published in the 31st USENIX Security Symposium

  9. arXiv:2104.10706  [pdf, other]

    stat.ML cs.CR cs.LG

    Dataset Inference: Ownership Resolution in Machine Learning

    Authors: Pratyush Maini, Mohammad Yaghini, Nicolas Papernot

    Abstract: With increasingly more data and computation involved in their training, machine learning models constitute valuable intellectual property. This has spurred interest in model stealing, which is made more practical by advances in learning with partial, little, or no supervision. Existing defenses focus on inserting unique watermarks in a model's decision surface, but this is insufficient: the waterm…

    Submitted 21 April, 2021; originally announced April 2021.

    Comments: Published as a conference paper at ICLR 2021 (Spotlight Presentation)

  10. arXiv:2103.05633  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Proof-of-Learning: Definitions and Practice

    Authors: Hengrui Jia, Mohammad Yaghini, Christopher A. Choquette-Choo, Natalie Dullerud, Anvith Thudi, Varun Chandrasekaran, Nicolas Papernot

    Abstract: Training machine learning (ML) models typically involves expensive iterative optimization. Once the model's final parameters are released, there is currently no mechanism for the entity which trained the model to prove that these parameters were indeed the result of this optimization procedure. Such a mechanism would support security of ML applications in several ways. For instance, it would simpl…

    Submitted 9 March, 2021; originally announced March 2021.

    Comments: To appear in the 42nd IEEE Symposium on Security and Privacy

  11. arXiv:2010.06667  [pdf, other]

    cs.LG cs.CR cs.CY stat.ML

    Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings

    Authors: Vinith M. Suriyakumar, Nicolas Papernot, Anna Goldenberg, Marzyeh Ghassemi

    Abstract: Machine learning models in health care are often deployed in settings where it is important to protect patient privacy. In such settings, methods for differentially private (DP) learning provide a general-purpose approach to learn models with privacy guarantees. Modern methods for DP learning ensure privacy through mechanisms that censor information judged as too unique. The resulting privacy-pres…

    Submitted 13 October, 2020; originally announced October 2020.

  12. arXiv:2007.14321  [pdf, other]

    cs.CR cs.LG stat.ML

    Label-Only Membership Inference Attacks

    Authors: Christopher A. Choquette-Choo, Florian Tramèr, Nicholas Carlini, Nicolas Papernot

    Abstract: Membership inference attacks are one of the simplest forms of privacy leakage for machine learning models: given a data point and model, determine whether the point was used to train the model. Existing membership inference attacks exploit models' abnormal confidence when queried on their training data. These attacks do not apply if the adversary only gets access to models' predicted labels, witho…

    Submitted 5 December, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: 16 pages, 11 figures, 2 tables. Revision 2: 19 pages, 12 figures, 3 tables. Improved text and additional experiments. Final ICML paper

  13. arXiv:2007.14191  [pdf, other]

    stat.ML cs.CR cs.LG

    Tempered Sigmoid Activations for Deep Learning with Differential Privacy

    Authors: Nicolas Papernot, Abhradeep Thakurta, Shuang Song, Steve Chien, Úlfar Erlingsson

    Abstract: Because learning sometimes involves sensitive data, machine learning algorithms have been extended to offer privacy for training data. In practice, this has been mostly an afterthought, with privacy-preserving models obtained by re-running training with a different optimizer, but using the model architectures that already performed well in a non-privacy-preserving setting. This approach leads to l…

    Submitted 28 July, 2020; originally announced July 2020.

  14. arXiv:2006.03463  [pdf, other]

    cs.LG cs.CL cs.CR stat.ML

    Sponge Examples: Energy-Latency Attacks on Neural Networks

    Authors: Ilia Shumailov, Yiren Zhao, Daniel Bates, Nicolas Papernot, Robert Mullins, Ross Anderson

    Abstract: The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While this enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far is on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency…

    Submitted 12 May, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: Accepted at 6th IEEE European Symposium on Security and Privacy (EuroS&P)

  15. arXiv:2004.01832  [pdf, ps, other]

    cs.LG stat.ML

    SOAR: Second-Order Adversarial Regularization

    Authors: Avery Ma, Fartash Faghri, Nicolas Papernot, Amir-massoud Farahmand

    Abstract: Adversarial training is a common approach to improving the robustness of deep neural networks against adversarial examples. In this work, we propose a novel regularization approach as an alternative. To derive the regularizer, we formulate the adversarial robustness problem under the robust optimization framework and approximate the loss function using a second-order Taylor series expansion. Our p…

    Submitted 7 February, 2021; v1 submitted 3 April, 2020; originally announced April 2020.

  16. arXiv:2003.03722  [pdf, other]

    cs.LG cs.CR stat.ML

    On the Robustness of Cooperative Multi-Agent Reinforcement Learning

    Authors: Jieyu Lin, Kristina Dzeparoska, Sai Qian Zhang, Alberto Leon-Garcia, Nicolas Papernot

    Abstract: In cooperative multi-agent reinforcement learning (c-MARL), agents learn to cooperatively take actions as a team to maximize a total team reward. We analyze the robustness of c-MARL to adversaries capable of attacking one of the agents on a team. Through the ability to manipulate this agent's observations, the adversary seeks to decrease the total team reward. Attacking c-MARL is challenging for…

    Submitted 8 March, 2020; originally announced March 2020.

  17. arXiv:2002.12200  [pdf, other]

    cs.CR stat.ML

    Entangled Watermarks as a Defense against Model Extraction

    Authors: Hengrui Jia, Christopher A. Choquette-Choo, Varun Chandrasekaran, Nicolas Papernot

    Abstract: Machine learning involves expensive data collection and training procedures. Model owners may be concerned that valuable intellectual property can be leaked if adversaries mount model extraction attacks. As it is difficult to defend against model extraction without sacrificing significant prediction accuracy, watermarking instead leverages unused model capacity to have the model overfit to outlier…

    Submitted 19 February, 2021; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: Published in the 30th USENIX Security Symposium

  18. arXiv:2002.04599  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

    Authors: Florian Tramèr, Jens Behrmann, Nicholas Carlini, Nicolas Papernot, Jörn-Henrik Jacobsen

    Abstract: Adversarial examples are malicious inputs crafted to induce misclassification. Commonly studied sensitivity-based adversarial examples introduce semantically-small changes to an input that result in a different model prediction. This paper studies a complementary failure mode, invariance-based adversarial examples, that introduce minimal semantic changes that modify an input's true label yet prese…

    Submitted 4 August, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: ICML 2020 (Supersedes the workshop paper "Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness", arXiv:1903.10484)

  19. arXiv:1910.13427  [pdf, other]

    cs.LG stat.ML

    Distribution Density, Tails, and Outliers in Machine Learning: Metrics and Applications

    Authors: Nicholas Carlini, Úlfar Erlingsson, Nicolas Papernot

    Abstract: We develop techniques to quantify the degree to which a given (training or testing) example is an outlier in the underlying distribution. We evaluate five methods to score examples in a dataset by how well-represented the examples are, for different plausible definitions of "well-represented", and apply these to four common datasets: MNIST, Fashion-MNIST, CIFAR-10, and ImageNet. Despite being inde…

    Submitted 29 October, 2019; originally announced October 2019.

  20. arXiv:1910.01177  [pdf, other]

    stat.ML cs.LG

    Improving Differentially Private Models with Active Learning

    Authors: Zhengli Zhao, Nicolas Papernot, Sameer Singh, Neoklis Polyzotis, Augustus Odena

    Abstract: Broad adoption of machine learning techniques has increased privacy concerns for models trained on sensitive data such as medical records. Existing techniques for training differentially private (DP) models give rigorous privacy guarantees, but applying these techniques to neural networks can severely degrade model performance. This performance reduction is an obstacle to deploying private models…

    Submitted 2 October, 2019; originally announced October 2019.

  21. arXiv:1909.01838  [pdf, other]

    cs.LG cs.CR stat.ML

    High Accuracy and High Fidelity Extraction of Neural Networks

    Authors: Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, Nicolas Papernot

    Abstract: In a model extraction attack, an adversary steals a copy of a remotely deployed machine learning model, given oracle prediction access. We taxonomize model extraction attacks around two objectives: *accuracy*, i.e., performing well on the underlying learning task, and *fidelity*, i.e., matching the predictions of the remote victim classifier on any input. To extract a high-accuracy model, we dev…

    Submitted 3 March, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: USENIX Security 2020, 18 pages, 6 figures

  22. arXiv:1909.00056  [pdf, ps, other]

    cs.CY cs.CR stat.ML

    How Relevant is the Turing Test in the Age of Sophisbots?

    Authors: Dan Boneh, Andrew J. Grotto, Patrick McDaniel, Nicolas Papernot

    Abstract: Popular culture has contemplated societies of thinking machines for generations, envisioning futures from utopian to dystopian. These futures are, arguably, here now; we find ourselves at the doorstep of technology that can at least simulate the appearance of thinking, acting, and feeling. The real question is: now what?

    Submitted 30 August, 2019; originally announced September 2019.

  23. arXiv:1905.10900  [pdf, other]

    cs.LG stat.ML

    Rearchitecting Classification Frameworks For Increased Robustness

    Authors: Varun Chandrasekaran, Brian Tang, Nicolas Papernot, Kassem Fawaz, Somesh Jha, Xi Wu

    Abstract: While generalizing well over natural inputs, neural networks are vulnerable to adversarial inputs. Existing defenses against adversarial inputs have largely been detached from the real world. These defenses also come at a cost to accuracy. Fortunately, there are invariances of an object that are its salient features; when we break them it will necessarily change the perception of the object. We fi…

    Submitted 3 December, 2019; v1 submitted 26 May, 2019; originally announced May 2019.

  24. arXiv:1905.02249  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    MixMatch: A Holistic Approach to Semi-Supervised Learning

    Authors: David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, Colin Raffel

    Abstract: Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp. We…

    Submitted 23 October, 2019; v1 submitted 6 May, 2019; originally announced May 2019.

  25. arXiv:1903.10484  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness

    Authors: Jörn-Henrik Jacobsen, Jens Behrmann, Nicholas Carlini, Florian Tramèr, Nicolas Papernot

    Abstract: Adversarial examples are malicious inputs crafted to cause a model to misclassify them. Their most common instantiation, "perturbation-based" adversarial examples introduce changes to the input that leave its true label unchanged, yet result in a different model prediction. Conversely, "invariance-based" adversarial examples insert changes to the input that leave the model's prediction unaffected…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: Accepted at the ICLR 2019 SafeML Workshop

  26. arXiv:1902.06705  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    On Evaluating Adversarial Robustness

    Authors: Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, Alexey Kurakin

    Abstract: Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this pa…

    Submitted 20 February, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Living document; source available at https://github.com/evaluating-adversarial-robustness/adv-eval-paper/

  27. arXiv:1902.01889  [pdf, other]

    stat.ML cs.LG

    Analyzing and Improving Representations with the Soft Nearest Neighbor Loss

    Authors: Nicholas Frosst, Nicolas Papernot, Geoffrey Hinton

    Abstract: We explore and expand the $\textit{Soft Nearest Neighbor Loss}$ to measure the $\textit{entanglement}$ of class manifolds in representation space: i.e., how close pairs of points from the same class are relative to pairs of points from different classes. We demonstrate several use cases of the loss. As an analytical tool, it provides insights into the evolution of class similarity structures durin…

    Submitted 5 February, 2019; originally announced February 2019.

  28. arXiv:1812.06210  [pdf, ps, other]

    cs.LG stat.ML

    A General Approach to Adding Differential Privacy to Iterative Training Procedures

    Authors: H. Brendan McMahan, Galen Andrew, Úlfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz

    Abstract: In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and then isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training a…

    Submitted 4 March, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: Presented at NeurIPS 2018 workshop on Privacy Preserving Machine Learning; Companion paper to TensorFlow Privacy OSS Library

  29. arXiv:1808.01976  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Adversarial Vision Challenge

    Authors: Wieland Brendel, Jonas Rauber, Alexey Kurakin, Nicolas Papernot, Behar Veliqi, Marcel Salathé, Sharada P. Mohanty, Matthias Bethge

    Abstract: The NIPS 2018 Adversarial Vision Challenge is a competition to facilitate measurable progress towards robust machine vision models and more generally applicable adversarial attacks. This document is an updated version of our competition proposal that was accepted in the competition track of 32nd Conference on Neural Information Processing Systems (NIPS 2018).

    Submitted 6 December, 2018; v1 submitted 6 August, 2018; originally announced August 2018.

    Comments: https://www.crowdai.org/challenges/adversarial-vision-challenge

  30. arXiv:1803.04765  [pdf, other]

    cs.LG stat.ML

    Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning

    Authors: Nicolas Papernot, Patrick McDaniel

    Abstract: Deep neural networks (DNNs) enable innovative applications of machine learning like image recognition, machine translation, or malware detection. However, deep learning is often criticized for its lack of robustness in adversarial settings (e.g., vulnerability to adversarial inputs) and general inability to rationalize its predictions. In this work, we exploit the structure of deep learning to ena…

    Submitted 13 March, 2018; originally announced March 2018.

  31. arXiv:1802.08908  [pdf, other]

    stat.ML cs.CR cs.LG

    Scalable Private Learning with PATE

    Authors: Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

    Abstract: The rapid adoption of machine learning has increased concerns about the privacy implications of machine learning models trained on sensitive data, such as medical records or other personal information. To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with in…

    Submitted 24 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICLR 2018

  32. arXiv:1802.08195  [pdf, other]

    cs.LG cs.CV q-bio.NC stat.ML

    Adversarial Examples that Fool both Computer Vision and Time-Limited Humans

    Authors: Gamaleldin F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alex Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein

    Abstract: Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with…

    Submitted 21 May, 2018; v1 submitted 22 February, 2018; originally announced February 2018.

    Journal ref: Advances in Neural Information Processing Systems, 2018

  33. arXiv:1708.08022  [pdf, ps, other]

    stat.ML cs.CR cs.LG

    On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

    Authors: Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Nicolas Papernot, Kunal Talwar, Li Zhang

    Abstract: The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled b…

    Submitted 26 August, 2017; originally announced August 2017.

    Journal ref: IEEE 30th Computer Security Foundations Symposium (CSF), pages 1--6, 2017

  34. arXiv:1705.07204  [pdf, other]

    stat.ML cs.CR cs.LG

    Ensemble Adversarial Training: Attacks and Defenses

    Authors: Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

    Abstract: Adversarial examples are perturbed inputs designed to fool machine learning models. Adversarial training injects such examples into training data to increase robustness. To scale this technique to large datasets, perturbations are crafted using fast single-step methods that maximize a linear approximation of the model's loss. We show that this form of adversarial training converges to a degenerate…

    Submitted 26 April, 2020; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: 22 pages, 5 figures, International Conference on Learning Representations (ICLR) 2018 (amended in April 2020 to include subsequent attacks that significantly reduced the robustness of our models)

  35. arXiv:1705.05264  [pdf, other]

    cs.LG cs.CR stat.ML

    Extending Defensive Distillation

    Authors: Nicolas Papernot, Patrick McDaniel

    Abstract: Machine learning is vulnerable to adversarial examples: inputs carefully modified to force misclassification. Designing defenses against such inputs remains largely an open problem. In this work, we revisit defensive distillation---which is one of the mechanisms proposed to mitigate adversarial examples---to address its limitations. We view our results not only as an effective way of addressing so…

    Submitted 15 May, 2017; originally announced May 2017.

  36. arXiv:1704.03453  [pdf, other]

    stat.ML cs.CR cs.LG

    The Space of Transferable Adversarial Examples

    Authors: Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

    Abstract: Adversarial examples are maliciously perturbed inputs designed to mislead machine learning (ML) models at test-time. They often transfer: the same adversarial example fools more than one model. In this work, we propose novel methods for estimating the previously unknown dimensionality of the space of adversarial inputs. We find that adversarial examples span a contiguous subspace of large (~25)…

    Submitted 23 May, 2017; v1 submitted 11 April, 2017; originally announced April 2017.

    Comments: 15 pages, 7 figures

  37. arXiv:1702.06280  [pdf, other]

    cs.CR cs.LG stat.ML

    On the (Statistical) Detection of Adversarial Examples

    Authors: Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, Patrick McDaniel

    Abstract: Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or Malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards understa…

    Submitted 17 October, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

    Comments: 13 pages, 4 figures, 5 tables. New version: improved writing, incorporating external feedback

  38. arXiv:1702.02284  [pdf, other]

    cs.LG cs.CR stat.ML

    Adversarial Attacks on Neural Network Policies

    Authors: Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel

    Abstract: Machine learning classifiers are known to be vulnerable to inputs maliciously constructed by adversaries to force misclassification. Such adversarial examples have been extensively studied in the context of computer vision applications. In this work, we show adversarial attacks are also effective when targeting neural network policies in reinforcement learning. Specifically, we show existing adver…

    Submitted 7 February, 2017; originally announced February 2017.

  39. arXiv:1610.05755  [pdf, other]

    stat.ML cs.CR cs.LG

    Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

    Authors: Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar

    Abstract: Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees…

    Submitted 3 March, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

    Comments: Accepted to ICLR 17 as an oral

  40. arXiv:1610.00768  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

    Authors: Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, Rujun Long, et al. (1 additional author not shown)

    Abstract: CleverHans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models' performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial exam…

    Submitted 27 June, 2018; v1 submitted 3 October, 2016; originally announced October 2016.

    Comments: Technical report for https://github.com/tensorflow/cleverhans

  41. arXiv:1603.09638  [pdf, other]

    cs.CR cs.LG stat.ML

    Detection under Privileged Information

    Authors: Z. Berkay Celik, Patrick McDaniel, Rauf Izmailov, Nicolas Papernot, Ryan Sheatsley, Raquel Alvarez, Ananthram Swami

    Abstract: For well over a quarter century, detection systems have been driven by models learned from input features collected from real or simulated environments. An artifact (e.g., network event, potential malware sample, suspicious email) is deemed malicious or non-malicious based on its similarity to the learned model at runtime. However, the training of the models has been historically limited to only t…

    Submitted 30 March, 2018; v1 submitted 31 March, 2016; originally announced March 2016.

    Comments: A short version of this paper is accepted to ASIACCS 2018

  42. arXiv:1511.07528  [pdf, other]

    cs.CR cs.LG cs.NE stat.ML

    The Limitations of Deep Learning in Adversarial Settings

    Authors: Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, Ananthram Swami

    Abstract: Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize t…

    Submitted 23 November, 2015; originally announced November 2015.

    Comments: Accepted to the 1st IEEE European Symposium on Security & Privacy, IEEE 2016. Saarbrücken, Germany

  43. arXiv:1511.04508  [pdf, other]

    cs.CR cs.LG cs.NE stat.ML

    Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

    Authors: Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami

    Abstract: Deep learning algorithms have been shown to perform extremely well on many classical machine learning problems. However, recent studies have shown that deep learning, like other machine learning techniques, is vulnerable to adversarial samples: inputs crafted to force a deep neural network (DNN) to provide adversary-selected outputs. Such attacks can seriously undermine the security of the system…

    Submitted 14 March, 2016; v1 submitted 13 November, 2015; originally announced November 2015.