Query strategies for evading convex-inducing classifiers

Authors:

Benjamin I. P. Rubinstein,

Anthony D. Joseph,

J. D. Tygar

The Journal of Machine Learning Research, Volume 13, Issue 1

Pages 1293 - 1332

Published: 01 May 2012

Abstract

Classifiers are often used to detect miscreant activities. We study how an adversary can systematically query a classifier to elicit information that allows the attacker to evade detection while incurring a near-minimal cost of modifying their intended malfeasance. We generalize the theory of Lowd and Meek (2005) to the family of convex-inducing classifiers that partition their feature space into two sets, one of which is convex. We present query algorithms for this family that construct undetected instances of approximately minimal cost using only polynomially many queries in the dimension of the space and in the level of approximation. Our results demonstrate that near-optimal evasion can be accomplished for this family without reverse engineering the classifier's decision boundary. We also consider general ℓ_p costs and show that near-optimal evasion on the family of convex-inducing classifiers is generally efficient for both positive and negative convexity for all levels of approximation if p = 1.
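
To make the query model in the abstract concrete, here is a minimal sketch of a membership-query line search. It is not the paper's algorithm: it assumes the detected (positive) class is convex, that the attacker holds one detected target instance and one known undetected instance, and it only searches the single segment between them, so the point it returns is near-minimal in ℓ_1 cost along that segment rather than over the whole feature space. The names `line_search_evasion`, `l1_cost`, and `toy_detector` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def l1_cost(x: Vector, target: Vector) -> float:
    """l_1 cost of moving from the attacker's target to x."""
    return sum(abs(a - b) for a, b in zip(x, target))


def line_search_evasion(
    is_detected: Callable[[Vector], bool],
    x_target: Vector,
    x_negative: Vector,
    tol: float = 1e-3,
) -> Tuple[List[float], int]:
    """Binary search on the segment from x_target to x_negative.

    Assumption: the detected set is convex, so the labels along the
    segment switch from detected to undetected exactly once.
    Returns an undetected point and the number of queries issued.
    """
    if not is_detected(x_target) or is_detected(x_negative):
        raise ValueError("need a detected target and an undetected endpoint")
    queries = 2
    lo, hi = 0.0, 1.0  # x(lo) is detected, x(hi) is undetected
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        point = [a + mid * (b - a) for a, b in zip(x_target, x_negative)]
        queries += 1
        if is_detected(point):
            lo = mid
        else:
            hi = mid
    best = [a + hi * (b - a) for a, b in zip(x_target, x_negative)]
    return best, queries


def toy_detector(x: Vector) -> bool:
    """Hypothetical classifier: flags points strictly inside the unit ball."""
    return sum(v * v for v in x) < 1.0


best, queries = line_search_evasion(toy_detector, [0.0, 0.0], [2.0, 0.0])
print(best, l1_cost(best, [0.0, 0.0]), queries)
```

Each query halves the uncertainty interval, so the number of queries grows only logarithmically in 1/tol for one direction. Searching a single direction says nothing about cheaper evasions in other directions, which is the gap a multi-directional strategy has to close.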

References

[1]

Dana Angluin. Queries and concept learning. Machine Learning, 2:319-342, 1988.

[2]

Keith Ball. An elementary introduction to modern convex geometry. In Flavors of Geometry, volume 31 of MSRI Publications, pages 1-58. Cambridge University Press, 1997.

[3]

Dimitris Bertsimas and Santosh Vempala. Solving convex programs by random walks. Journal of the ACM, 51(4):540-556, 2004.

[4]

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

[5]

Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[6]

Richard P. Brent. Algorithms for Minimization without Derivatives. Prentice-Hall, 1973.

[7]

Michael Brückner and Tobias Scheffer. Nash equilibria of static prediction games. In Advances in Neural Information Processing Systems (NIPS), volume 22, pages 171-179. 2009.

[8]

Richard L. Burden and J. Douglas Faires. Numerical Analysis. Brooks Cole, 7th edition, 2000.

[9]

Nilesh Dalvi, Pedro Domingos, Mausam, Sumit Sanghai, and Deepak Verma. Adversarial classification. In Proceedings of the 10th International Conference on Knowledge Discovery and Data Mining, pages 99-108, 2004.

[10]

Sanjoy Dasgupta, Adam Tauman Kalai, and Claire Monteleoni. Analysis of perceptron-based active learning. Journal of Machine Learning Research, 10:281-299, 2009.

[11]

Vitaly Feldman. On the power of membership queries in agnostic learning. Journal of Machine Learning Research, 10:163-182, 2009.

[12]

Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer-Verlag, Secaucus, NJ, USA, 2006.

[13]

Lee-Ad Gottlieb, Aryeh Kontorovich, and Elchanan Mossel. VC bounds on the cardinality of nearly orthogonal function classes. Technical Report arXiv:1007.4915v2 [math.CO], arXiv, 2011.

[14]

Donald R. Jones. A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization, 21(4):345-383, 2001.

[15]

Donald R. Jones, Cary D. Perttunen, and Bruce E. Stuckman. Lipschitzian optimization without the Lipschitz constant. Journal of Optimization Theory and Applications, 79(1):157-181, 1993.

[16]

Murat Kantarcioglu, Bowei Xi, and Chris Clifton. Classifier evaluation and attribute selection against active adversaries. Technical Report 09-01, Purdue University, 2009.

[17]

Tamara G. Kolda, Robert Michael Lewis, and Virginia Torczon. Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45(3):385-482, 2003.

[18]

Anukool Lakhina, Mark Crovella, and Christophe Diot. Diagnosing network-wide traffic anomalies. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM), pages 219-230, 2004.

[19]

László Lovász and Santosh Vempala. Simulated annealing in convex bodies and an O*(n⁴) volume algorithm. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS '03), pages 650-659, 2003.

[20]

László Lovász and Santosh Vempala. Hit-and-run from a corner. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 310-314, 2004.

[21]

Daniel Lowd and Christopher Meek. Adversarial learning. In Proceedings of the 11th International Conference on Knowledge Discovery in Data Mining, pages 641-647, 2005.

[22]

John A. Nelder and Roger Mead. A simplex method for function minimization. The Computer Journal, 7(4):308-313, 1965.

[23]

Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, Shing-hon Lau, Steven Lee, Satish Rao, Anthony Tran, and J. D. Tygar. Near-optimal evasion of convex-inducing classifiers. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 549-556, 2010a.

[24]

Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, Steven Lee, Satish Rao, and J. D. Tygar. Query strategies for evading convex-inducing classifiers. Technical Report arXiv:1007.0484v1 [cs.LG], arXiv, 2010b.

[25]

Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, and J. D. Tygar. Classifier evasion: Models and open problems. In Privacy and Security Issues in Data Mining and Machine Learning, volume 6549 of Lecture Notes in Computer Science, pages 92-98. 2011.

[26]

Luis Rademacher and Navin Goyal. Learning convex bodies is hard. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT), pages 303-308, 2009.

[27]

Greg Schohn and David Cohn. Less is more: Active learning with support vector machines. In Proceedings of the 17th International Conference on Machine Learning (ICML), pages 839-846, 2000.

[28]

Burr Settles. Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin-Madison, 2009.

[29]

Robert L. Smith. The hit-and-run sampler: A globally reaching Markov chain sampler for generating arbitrary multivariate distributions. In Proceedings of the 28th Conference on Winter Simulation (WSC '96), pages 260-264, 1996.

[30]

Kymie M. C. Tan, Kevin S. Killourhy, and Roy A. Maxion. Undermining an anomaly-based intrusion detection system using common exploits. In Proceedings of the 5th International Conference on Recent Advances in Intrusion Detection (RAID), volume 2516 of Lecture Notes in Computer Science, pages 54-73, 2002.

[31]

David Wagner and Paolo Soto. Mimicry attacks on host-based intrusion detection systems. In Proceedings of the 9th ACM Conference on Computer and Communications Security, pages 255-264, 2002.

[32]

Ke Wang and Salvatore J. Stolfo. Anomalous payload-based network intrusion detection. In Proceedings of the 7th International Conference on Recent Advances in Intrusion Detection (RAID), volume 3224 of Lecture Notes in Computer Science, pages 203-222, 2004.

[33]

Ke Wang, Janak J. Parekh, and Salvatore J. Stolfo. Anagram: A content anomaly detector resistant to mimicry attack. In Proceedings of the 9th International Conference on Recent Advances in Intrusion Detection (RAID), volume 4219 of Lecture Notes in Computer Science, pages 226-248, 2006.

[34]

Aaron D. Wyner. Capabilities of bounded discrepancy decoding. The Bell System Technical Journal, 44:1061-1122, 1965.

Index Terms

Query strategies for evading convex-inducing classifiers
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        1. Supervised learning by classification
    2. Machine learning approaches
      1. Classification and regression trees
2. Information systems
  1. Information systems applications
    1. Decision support systems
      1. Expert systems

Published In

The Journal of Machine Learning Research Volume 13, Issue 1

January 2012

3712 pages

ISSN:1532-4435

EISSN:1533-7928

Publisher

JMLR.org

Cited By

Gopalakrishna N, Anandayuvaraj D, Detti A, Bland F, Rahaman S, Davis J (2022). "If security is required". Proceedings of the 4th International Workshop on Software Engineering Research and Practice for the IoT, pages 1-8. Online publication date: 19-May-2022.
https://dl.acm.org/doi/10.1145/3528227.3528565

Zhang H, Fu Z, Li G, Ma L, Zhao Z, Yang H, Sun Y, Liu Y, Jin Z (2022). Towards Robustness of Deep Program Processing Models—Detection, Estimation, and Enhancement. ACM Transactions on Software Engineering and Methodology, 31(3):1-40. Online publication date: 9-Apr-2022.
https://dl.acm.org/doi/10.1145/3511887

Madani P, Vlajic N, Maljevic I (2022). Randomized Moving Target Approach for MAC-Layer Spoofing Detection and Prevention in IoT Systems. Digital Threats: Research and Practice, 3(4):1-24. Online publication date: 5-Dec-2022.
https://dl.acm.org/doi/10.1145/3477403

Yang F, Chen Z, Gangopadhyay A (2022). Using Randomness to Improve Robustness of Tree-Based Models Against Evasion Attacks. IEEE Transactions on Knowledge and Data Engineering, 34(2):969-982. Online publication date: 1-Feb-2022.
https://dl.acm.org/doi/10.1109/TKDE.2020.2987299

Madani P, Vlajic N, Sadeghpour S, Maniatakos M, Zhang Y (2020). MAC-Layer Spoofing Detection and Prevention in IoT Systems. Proceedings of the 2020 Joint Workshop on CPS&IoT Security and Privacy, pages 71-80. Online publication date: 9-Nov-2020.
https://dl.acm.org/doi/10.1145/3411498.3419968

Tong L, Li B, Hajaj C, Xiao C, Zhang N, Vorobeychik Y, Heninger N, Traynor P (2019). Improving robustness of ML classifiers against realizable evasion attacks using conserved features. Proceedings of the 28th USENIX Conference on Security Symposium, pages 285-302. Online publication date: 14-Aug-2019.
https://dl.acm.org/doi/10.5555/3361338.3361359

Madani P, Vlajic N (2019). Near-optimal Evasion of Randomized Convex-inducing Classifiers in Adversarial Environments. Proceedings of the 14th International Conference on Availability, Reliability and Security, pages 1-6. Online publication date: 26-Aug-2019.
https://dl.acm.org/doi/10.1145/3339252.3340520

Li K, Gu Y, Zhang P, An W, Li W (2019). Research on KNN Algorithm in Malicious PDF Files Classification under Adversarial Environment. Proceedings of the 4th International Conference on Big Data and Computing, pages 156-159. Online publication date: 10-May-2019.
https://dl.acm.org/doi/10.1145/3335484.3335527

Yang F, Chen Z, Gangopadhyay A, Verma R, Subramaniam D, Sung A, Verma R (2019). Using Randomness to Improve Robustness of Tree-based Models Against Evasion Attacks. Proceedings of the ACM International Workshop on Security and Privacy Analytics, pages 25-35. Online publication date: 13-Mar-2019.
https://dl.acm.org/doi/10.1145/3309182.3309186

Diochnos D, Mahloujifar S, Mahmoody M (2018). Adversarial risk and robustness. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 10380-10389. Online publication date: 3-Dec-2018.
https://dl.acm.org/doi/10.5555/3327546.3327698
