DOI: 10.1145/1531914.1531925
research-article

Tag spam creates large non-giant connected components

Published: 21 April 2009

Abstract

Spammers in social bookmarking systems try to mimic the bookmarking behaviour of real users to gain the attention of other users or search engines. Several methods have been proposed for detecting such spam, including domain-specific features (like URL terms) or the similarity of users to previously identified spammers. However, as shown in our previous work, it is possible to identify a large fraction of spam users based on purely structural features. The hypergraph connecting documents, users, and tags can be decomposed into connected components, and in the examined dataset the large but non-giant components turned out to be almost entirely inhabited by spam users. Here, we test to what degree the decomposition of the complete hypergraph is really necessary by examining the component structure of the induced user/document and user/tag graphs. While the user/tag graph's connectivity does not help in classifying spammers, the user/document graph's connectivity is already highly informative. It can, however, be augmented with connectivity information from the hypergraph. In our view, spam detection based on structural features, like the one proposed here, requires complex adaptation strategies from spammers and may complement other, more traditional detection approaches.
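
The structural idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy posts, and the `min_size` threshold for what counts as a "large" component are all assumptions made for illustration, and the sketch only covers the induced user/document and user/tag graphs, not the hypergraph decomposition from the earlier work. Each projection is treated as a bipartite graph, users are grouped by connected component with a union-find structure, and every user in a sufficiently large component other than the largest one is flagged.

```python
from collections import defaultdict


def user_components(pairs):
    """Group users by connected component of a bipartite user/item graph.

    pairs: iterable of (user, item) tuples, where item is a document or a tag.
    Returns a list of user sets, largest component first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    users = set()
    for user, item in pairs:
        users.add(user)
        # Prefix nodes so a user and an item with the same name stay distinct.
        root_u, root_i = find(("u", user)), find(("i", item))
        if root_u != root_i:
            parent[root_u] = root_i

    groups = defaultdict(set)
    for user in users:
        groups[find(("u", user))].add(user)
    return sorted(groups.values(), key=len, reverse=True)


def flag_large_non_giant(components, min_size):
    """Users in any component except the largest that has >= min_size users.

    min_size is a hypothetical threshold; the abstract does not define "large".
    """
    flagged = set()
    for comp in components[1:]:
        if len(comp) >= min_size:
            flagged |= comp
    return flagged


# Made-up (user, tag, document) posts, for illustration only.
posts = [
    ("alice", "python", "d1"), ("bob", "python", "d1"),
    ("bob", "graphs", "d2"), ("carol", "graphs", "d2"),
    ("carol", "web", "d3"), ("dave", "web", "d3"),
    ("s1", "python", "x1"), ("s2", "python", "x1"),
    ("s2", "free", "x2"), ("s3", "free", "x2"),
    ("erin", "misc", "d9"),
]

doc_components = user_components((u, d) for u, _t, d in posts)
tag_components = user_components((u, t) for u, t, _d in posts)

flagged_by_docs = flag_large_non_giant(doc_components, min_size=2)
flagged_by_tags = flag_large_non_giant(tag_components, min_size=2)
```

In this toy data the three made-up accounts `s1`, `s2`, and `s3` bookmark only their own documents, so they form a separate component in the user/document graph and are flagged. Because they reuse a popular tag, the user/tag graph merges them into the largest component and flags nobody, which mirrors the contrast the abstract describes between the two projections.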

Cited By

  • (2018) Accessing Information with Tags: Search and Ranking. Social Information Access, pp. 310-343. DOI: 10.1007/978-3-319-90092-6_9. Online publication date: 3-May-2018
  • (2013) A Local Method for ObjectRank Estimation. Proceedings of International Conference on Information Integration and Web-based Applications & Services, pp. 92-101. DOI: 10.1145/2539150.2539177. Online publication date: 2-Dec-2013
  • (2012) Temporal dynamics of communities in social bookmarking systems. Social Network Analysis and Mining, 2:4, pp. 387-404. DOI: 10.1007/s13278-012-0054-z. Online publication date: 1-Mar-2012

Published In

ACM Other conferences
AIRWeb '09: Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web
April 2009
67 pages
ISBN:9781605584386
DOI:10.1145/1531914
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. connected components
  2. spam detection
  3. tagging

Qualifiers

  • Research-article

Conference

AIRWeb '09
