DOI: 10.1145/3474370.3485655

Using Honeypots to Catch Adversarial Attacks on Neural Networks

Published: 15 November 2021
    Abstract

    Deep neural networks (DNN) are known to be vulnerable to adversarial attacks. Numerous efforts either try to patch weaknesses in trained models, or try to make it difficult or costly to compute adversarial examples that exploit them. In our work, we explore a new "honeypot" approach to protect DNN models. We intentionally inject trapdoors, honeypot weaknesses in the classification manifold that attract attackers searching for adversarial examples. Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space. Our defense then identifies attacks by comparing neuron activation signatures of inputs to those of trapdoors.
    In this paper, we introduce trapdoors and describe an implementation of a trapdoor-enabled defense. First, we analytically prove that trapdoors shape the computation of adversarial attacks so that attack inputs will have feature representations very similar to those of trapdoors. Second, we experimentally show that trapdoor-protected models can detect, with high accuracy, adversarial examples generated by state-of-the-art attacks (PGD, optimization-based CW, Elastic Net, BPDA), with negligible impact on normal classification. These results generalize across classification domains, including image, facial, and traffic-sign recognition. We also present significant results measuring trapdoors' robustness against customized adaptive attacks (countermeasures).
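
    As a rough illustration of the detection step described above, the sketch below records a trapdoor "signature" (an averaged internal-layer activation) and flags any input whose activation closely matches it. This is not the authors' implementation: the use of cosine similarity, the choice of layer, and the 0.9 threshold are assumptions made purely for the sketch.

    import numpy as np

    def cosine_similarity(a, b):
        # Similarity between two flattened activation vectors.
        a, b = a.ravel(), b.ravel()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def trapdoor_signature(trapdoor_activations):
        # Average internal-layer activation over inputs stamped with the
        # trapdoor trigger, recorded by the defender after embedding it.
        return trapdoor_activations.mean(axis=0)

    def flag_adversarial(input_activation, signature, threshold=0.9):
        # Flag the input if its activation "gravitates" toward the trapdoor,
        # i.e. matches the recorded signature above the (illustrative) threshold.
        return cosine_similarity(input_activation, signature) >= threshold

    In practice, input_activation would be read from a chosen internal layer (for example the penultimate layer) of the protected model at inference time.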

    Supplementary Material

    MP4 File (MTD21-fp12345.mp4)
    In our work, we explore a new honeypot approach to protect DNN models. We intentionally inject honeypot weaknesses in the classification manifold that attract attackers searching for adversarial examples. Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space. We introduce trapdoors and describe an implementation of a trapdoor-enabled defense. We analytically prove that trapdoors shape the computation of adversarial attacks so that attack inputs will have feature representations very similar to those of trapdoors. We experimentally show that trapdoor-protected models can detect, with high accuracy, adversarial examples generated by state-of-the-art attacks. These results generalize across classification domains, including image, facial, and traffic-sign recognition. We also present significant results measuring trapdoors' robustness against customized adaptive attacks (countermeasures).

    Cited By

    View all
    • (2023) Beating Backdoor Attack at Its Own Game. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4597-4606. DOI: 10.1109/ICCV51070.2023.00426. Online publication date: 1-Oct-2023.
    • (2022) Application of Artificial Intelligence Technology in Honeypot Technology. 2021 International Conference on Advanced Computing and Endogenous Security, pp. 1-9. DOI: 10.1109/IEEECONF52377.2022.10013349. Online publication date: 21-Apr-2022.

    Published In

    MTD '21: Proceedings of the 8th ACM Workshop on Moving Target Defense
    November 2021
    48 pages
    ISBN:9781450386586
    DOI:10.1145/3474370
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 November 2021

    Author Tags

    1. adversarial examples
    2. honeypots
    3. neural networks

    Qualifiers

    • Invited-talk

    Conference

    CCS '21

    Acceptance Rates

    Overall Acceptance Rate 40 of 92 submissions, 43%
