Crafting Adversarial Email Content against Machine Learning Based Spam Email Detection

Authors:

Leah Ding

ASSS '21: Proceedings of the 2021 International Symposium on Advanced Security on Software and Systems

Pages 23 - 28

https://doi.org/10.1145/3457340.3458302

Published: 04 June 2021

Abstract

While machine learning based spam detectors have proven useful, spammers are learning to bypass them by modifying their email content. Adversarial attacks on machine learning models have been observed in domains such as image classification. Applying such adversarial attack algorithms to craft spam emails that evade spam email detectors, however, has limitations. Such algorithms generate adversarial perturbations in the feature space. Unlike with image data, translating the adversarial perturbations from the feature space into text, as in emails, changes their effectiveness and can reduce the attack success rate in spam email detection. In this paper, we study the feasibility of adversarial attacks on machine learning based spam detectors and propose two novel text crafting methods that leverage the perturbations produced by adversarial example generation algorithms to improve attack effectiveness. One method tries to approximate the feature values, and the other adds special words to the original emails. In our experiments, we use PGD as an example algorithm to demonstrate and compare the effectiveness of our attack methods on spam email detectors. We also examine the transferability of the proposed attack methods across different machine learning models.
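
The gap between a feature-space perturbation and text that can actually be written may be easier to see in a small example. The following is a minimal sketch under stated assumptions, not the authors' implementation: the vocabulary, the weights `w`, the bias `b`, and the helper names are all hypothetical, the detector is a toy linear scorer over word counts, and the attacker is assumed to be able only to add words. It shows a PGD-style perturbation, a crude rounding step that approximates the perturbed feature values with realizable counts, and a simple variant of adding ham-indicative words.

```python
import numpy as np

# Hypothetical toy vocabulary and linear spam scorer (score > 0 means spam).
# These weights are invented for illustration only.
vocab = ["free", "winner", "cash", "meeting", "report", "thanks"]
w = np.array([1.2, 1.5, 0.9, -1.1, -0.8, -0.6])
b = -0.5


def score(x):
    """Linear spam score for a vector of word counts."""
    return float(w @ x + b)


def pgd_linear(x, eps, alpha, steps):
    """PGD-style attack inside an L-infinity ball of radius eps.

    For a linear scorer the gradient of the score w.r.t. x is w,
    so each step moves against sign(w) to lower the spam score.
    """
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv - alpha * np.sign(w)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
    return x_adv


def to_realizable_counts(x_orig, x_adv):
    """Approximate x_adv with counts an add-only attacker can realize:
    integers that are never below the original counts (assumption)."""
    return np.maximum(x_orig, np.round(x_adv))


def add_good_words(x, max_added=10):
    """Repeatedly add the most ham-indicative word until the score drops
    to zero or below, or the budget of added words is spent."""
    x_new = x.copy()
    best = int(np.argmin(w))  # index of the most negative weight
    added = 0
    while score(x_new) > 0 and added < max_added:
        x_new[best] += 1
        added += 1
    return x_new, added


x = np.array([2, 1, 1, 0, 0, 0], dtype=float)  # word counts of a spam email
x_adv = pgd_linear(x, eps=1.0, alpha=0.5, steps=4)
x_text = to_realizable_counts(x, x_adv)
x_good, n_added = add_good_words(x)

print("original score:        ", score(x))
print("feature-space score:   ", score(x_adv))
print("realizable text score: ", score(x_text))
print("good-word score:       ", score(x_good), "after adding", n_added, "words")
```

In this toy setup the feature-space point is scored as non-spam, but the rounded, add-only version is still scored as spam, because the attacker cannot remove the spam-indicative words the perturbation tried to reduce. Adding enough copies of a ham-indicative word does push the score below zero. Real detectors and feature extractors (for example TF-IDF) would behave differently in detail.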

Index Terms

1. Computing methodologies
  1. Machine learning
2. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Social engineering attacks
      1. Phishing

Published In

ASSS '21: Proceedings of the 2021 International Symposium on Advanced Security on Software and Systems

June 2021

62 pages

ISBN:9781450384032

DOI:10.1145/3457340

Program Chairs: Weizhi Meng (Technical University of Denmark, Denmark), Li Li (Monash University, Australia)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ASIA CCS '21

Sponsor:

SIGSAC

ASIA CCS '21: ACM Asia Conference on Computer and Communications Security

June 7, 2021

Virtual Event, Hong Kong
