Research article. DOI: 10.5555/1667583.1667682

The contribution of stylistic information to content-based mobile spam filtering

Published: 04 August 2009

Abstract

Content-based approaches to detecting mobile spam have so far focused mainly on the topical aspect of an SMS message (what it is about) rather than on its stylistic aspect (how it is written). In this paper, as a preliminary step, we investigate the utility of commonly used stylistic features based on shallow linguistic analysis for learning mobile spam filters. Experimental results show that stylistic information is potentially effective for enhancing the performance of mobile spam filters.
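
The abstract does not list the specific features used. As a rough illustration of what stylistic features based on shallow analysis can look like, the Python sketch below computes a few surface statistics of a message that do not depend on its topic. The function name `stylistic_features`, the feature names, and the tiny function-word list are all assumptions made for this example and are not taken from the paper.

```python
import string

# Hypothetical, deliberately tiny function-word list (illustration only).
FUNCTION_WORDS = {"a", "an", "the", "to", "of", "in", "on", "and", "or", "is", "you", "your"}


def stylistic_features(message: str) -> dict:
    """Compute a few shallow, topic-independent features of one SMS message."""
    n_chars = len(message)
    words = message.split()
    # Strip surrounding punctuation so that "now!" is counted as "now".
    bare = [w.strip(string.punctuation).lower() for w in words]
    bare = [w for w in bare if w]

    def char_ratio(predicate) -> float:
        # Share of characters satisfying the predicate; 0.0 for an empty message.
        return sum(predicate(c) for c in message) / n_chars if n_chars else 0.0

    return {
        "n_chars": n_chars,
        "n_words": len(words),
        "avg_word_len": sum(len(w) for w in bare) / len(bare) if bare else 0.0,
        "upper_ratio": char_ratio(str.isupper),
        "digit_ratio": char_ratio(str.isdigit),
        "punct_ratio": char_ratio(lambda c: c in string.punctuation),
        "function_word_ratio": (
            sum(w in FUNCTION_WORDS for w in bare) / len(bare) if bare else 0.0
        ),
    }


if __name__ == "__main__":
    print(stylistic_features("WIN 100 now!"))
```

Feature dictionaries of this kind could be concatenated with ordinary word-based (topical) features before training a classifier; which stylistic features actually help is the empirical question the paper investigates.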


Published In

ACLShort '09: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
August 2009
390 pages

Publisher

Association for Computational Linguistics

United States

Cited By

  • (2017) Bi-Term Topic Model for SMS Classification. International Journal of Business Data Communications and Networking, 13:2 (28-40). DOI: 10.4018/ijbdcn.2017070103. Online publication date: 1-Jul-2017
  • (2017) Value and Misinformation in Collaborative Investing Platforms. ACM Transactions on the Web, 11:2 (1-32). DOI: 10.1145/3027487. Online publication date: 4-May-2017
  • (2016) A novel feature extraction approach in SMS spam filtering for mobile communication. Security and Communication Networks, 9:17 (4680-4690). DOI: 10.1002/sec.1660. Online publication date: 25-Nov-2016
  • (2015) Spam filtering for short messages in adversarial environment. Neurocomputing, 155:C (167-176). DOI: 10.1016/j.neucom.2014.12.034. Online publication date: 1-May-2015
  • (2012) $100,000 prize jackpot. call now! Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (1175-1176). DOI: 10.1145/2348283.2348526. Online publication date: 12-Aug-2012
  • (2012) SMS spam filtering. Expert Systems with Applications: An International Journal, 39:10 (9899-9908). DOI: 10.1016/j.eswa.2012.02.053. Online publication date: 1-Aug-2012
  • (2011) SMSAssassin. Proceedings of the 12th Workshop on Mobile Computing Systems and Applications (1-6). DOI: 10.1145/2184489.2184491. Online publication date: 1-Mar-2011
  • (2010) A behavior-based SMS antispam system. IBM Journal of Research and Development, 54:6 (651-666). DOI: 10.1147/JRD.2010.2066050. Online publication date: 1-Nov-2010
