
Using header session messages to anti-spamming

Published: 01 August 2007

Abstract

Email is the major activity on the Internet, but spam has recently become a serious obstacle to its use. Many spam filtering techniques have been implemented, and most of them filter out junk emails based on the subject line and message body. Subject and body content, however, are not the only cues for identifying spam. This investigation presents a statistical analysis of the header session messages of junk and normal emails, and explores whether these messages can be used for spam filtering. The header session, which includes the sender's mail address, the receiver's mail address and the time, is of little interest to most users but provides additional information for anti-spamming purposes. The analysis covers 10,024 junk emails collected from a Spam Archive database, together with 599 regular emails and 635 solicited listserv or commercial emails contributed by volunteers. The results show that up to 92.5% of junk emails are filtered out when the message-ID, mail user agent, and sender and receiver addresses in the header session are used as cues. For the sample used in this investigation, the proposed approach may also yield a low block error rate for normal emails, and this low rate of over-block errors is a significant merit of the approach. Filtering on header session messages can coexist with other anti-spamming approaches, so it does not conflict with existing spam prevention techniques.
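To make the header cues named in the abstract concrete, the following is a minimal sketch that pulls the message-ID, mail user agent, and sender and receiver addresses out of a raw message with Python's standard `email` package. The abstract does not state the actual filtering rules, so the function names `header_cues` and `suspicious_flags`, the sample message, and the three checks (missing or malformed message-ID, missing user agent, sender listed among the receivers) are illustrative assumptions and not the authors' method.

```python
import re
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical sample message, used only for illustration.
RAW = """\
From: Alice <alice@example.org>
To: bob@example.org
Message-ID: <1234.abcd@mail.example.org>
User-Agent: ExampleMail/1.0
Date: Wed, 01 Aug 2007 10:00:00 +0000
Subject: hello

body
"""

# Assumed shape of a well-formed message-ID: <local@domain>.
MESSAGE_ID_RE = re.compile(r"^<[^<>@\s]+@[^<>@\s]+>$")


def header_cues(raw):
    """Extract the header fields the abstract lists as cues."""
    msg = message_from_string(raw)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    to_and_cc = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr.lower() for _, addr in getaddresses(to_and_cc) if addr]
    return {
        "sender": sender,
        "recipients": recipients,
        "message_id": (msg.get("Message-ID") or "").strip(),
        # Mail clients report themselves in User-Agent or X-Mailer.
        "user_agent": msg.get("User-Agent") or msg.get("X-Mailer") or "",
    }


def suspicious_flags(cues):
    """Illustrative checks only; the paper's real rules are not given here."""
    flags = []
    if not cues["message_id"]:
        flags.append("missing_message_id")
    elif not MESSAGE_ID_RE.match(cues["message_id"]):
        flags.append("malformed_message_id")
    if not cues["user_agent"]:
        flags.append("missing_user_agent")
    if cues["sender"] and cues["sender"] in cues["recipients"]:
        flags.append("sender_equals_recipient")
    return flags


if __name__ == "__main__":
    cues = header_cues(RAW)
    print(cues)
    print(suspicious_flags(cues))
```

Because the sketch reads only header fields and never inspects the subject or body, a filter of this kind could run alongside a content-based filter, which is the coexistence property the abstract describes.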

Publisher

Elsevier Advanced Technology Publications, United Kingdom

Author Tags

1. Email address
2. Filter
3. Junk mail
4. Spam
5. Unsolicited email

Qualifiers

• Article

Cited By

• (2023) An Intelligent Framework Based on Deep Learning for SMS and e-mail Spam Detection. Applied Computational Intelligence and Soft Computing, vol. 2023. DOI: 10.1155/2023/6648970. Online publication date: 1-Jan-2023
• (2021) CADUE: Content-Agnostic Detection of Unwanted Emails for Enterprise Security. Proceedings of the 24th International Symposium on Research in Attacks, Intrusions and Defenses, pp. 205-219. DOI: 10.1145/3471621.3471862. Online publication date: 6-Oct-2021
• (2016) An intelligent three-phase spam filtering method based on decision tree data mining. Security and Communication Networks, 9:17, pp. 4013-4026. DOI: 10.1002/sec.1584. Online publication date: 25-Nov-2016
• (2012) Identifying spam e-mail based-on statistical header features and sender behavior. Proceedings of the CUBE International Information Technology Conference, pp. 771-778. DOI: 10.1145/2381716.2381863. Online publication date: 3-Sep-2012
• (2009) An intelligent spam filtering system based on fuzzy clustering. Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7, pp. 515-519. DOI: 10.5555/1802134.1802248. Online publication date: 14-Aug-2009
