DOI: 10.1145/1458082.1458272
poster

Online spam-blog detection through blog search

Published: 26 October 2008

Abstract

In this work, we propose a novel post-indexing spam-blog (or splog) detection method that capitalizes on the results returned by blog search engines. More specifically, we analyze the results that a blog search engine returns for a sequence of temporally ordered queries, and we build and maintain blog profiles for those blogs whose posts frequently appear among the top-ranked results. Using these blog profiles, we evaluated four splog scoring functions on real data collected from a popular blog search engine. Our experiments show that the proposed method detects splogs effectively and with high accuracy.
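
The profiling step described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the profile fields, the top-k cutoff, or any of the four scoring functions, so `BlogProfile`, `top_k`, `build_profiles`, and `query_spread_score` are hypothetical names introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BlogProfile:
    """Running statistics for one blog (fields are illustrative assumptions)."""
    hits: int = 0                               # appearances in top-ranked results
    queries: set = field(default_factory=set)   # distinct queries the blog matched


def build_profiles(query_log, top_k=10):
    """Fold a temporally ordered query log into per-blog profiles.

    query_log is a list of (query, ranked_blog_ids) pairs, oldest first.
    Only the first top_k results of each query are counted (assumed cutoff).
    """
    profiles = defaultdict(BlogProfile)
    for query, ranked_blog_ids in query_log:
        for blog_id in ranked_blog_ids[:top_k]:
            profile = profiles[blog_id]
            profile.hits += 1
            profile.queries.add(query)
    return profiles


def query_spread_score(profile, total_queries):
    """Hypothetical score: share of all distinct queries that the blog matched.

    This is NOT one of the paper's four scoring functions; it only shows
    where a scoring function would plug in once profiles exist.
    """
    if total_queries == 0:
        return 0.0
    return len(profile.queries) / total_queries


if __name__ == "__main__":
    # Made-up log: blog "a" surfaces for every query, unrelated topics included.
    log = [
        ("iphone", ["a", "b", "c"]),
        ("election", ["a", "d"]),
        ("olympics", ["a", "b"]),
    ]
    profiles = build_profiles(log, top_k=2)
    total = len({query for query, _ in log})
    for blog_id in sorted(profiles):
        print(blog_id, round(query_spread_score(profiles[blog_id], total), 2))
```

In this made-up log, blog `a` appears in the top results of all three unrelated queries and gets the highest score, which is the kind of signal a profile-based detector could threshold on. Whether the paper uses a comparable signal is not stated in the abstract.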

Published In

CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
October 2008
1562 pages
ISBN: 9781595939913
DOI: 10.1145/1458082
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. spam blog detection

Qualifiers

  • Poster

Conference

CIKM08: Conference on Information and Knowledge Management
October 26 - 30, 2008
Napa Valley, California, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 3
  • Downloads (last 6 weeks): 0
Reflects downloads up to 03 Oct 2024.

Cited By

  • (2019) Based on The Document-Link and Time-Clue Relationships Between Blog Posts to Improve the Performance of Google Blog Search. International Journal on Semantic Web & Information Systems, 15:1 (52-75). DOI: 10.4018/IJSWIS.2019010103. Online publication date: 1-Jan-2019
  • (2019) A novel time-shifting method to find popular blog post topics. Soft Computing. DOI: 10.1007/s00500-019-04485-3. Online publication date: 2-Nov-2019
  • (2017) The Evolution of Data Quality: Understanding the Transdisciplinary Origins of Data Quality Concepts and Approaches. Annual Review of Statistics and Its Application, 4:1 (85-108). DOI: 10.1146/annurev-statistics-060116-054114. Online publication date: 7-Mar-2017
  • (2015) Judging Consistency and Expertise of Blogs. Blogosphere and its Exploration (211-240). DOI: 10.1007/978-3-662-44409-2_16. Online publication date: 2015
  • (2013) Moblog-Based Social Networks. Social Networks: A Framework of Computational Intelligence (75-97). DOI: 10.1007/978-3-319-02993-1_5. Online publication date: 10-Dec-2013
  • (2011) Identifying relevant youtube comments to derive socially augmented user models. Proceedings of the 19th international conference on Advances in User Modeling (71-85). DOI: 10.1007/978-3-642-28509-7_8. Online publication date: 11-Jul-2011
