DOI: 10.1145/1516360.1516442

Efficient identification of starters and followers in social media

Published: 24 March 2009


Activity and user engagement in social media such as web logs, wikis, online forums, and social networks have been increasing at unprecedented rates. In relation to social behavior in various human activities, user activity in social media indicates the existence of individuals who consistently drive or stimulate 'discussions' in the online world. Such individuals are considered 'starters' of online discussions, in contrast to 'followers', who primarily engage in discussions and follow them.
In this paper, we formalize the notions of 'starters' and 'followers' in social media. Motivated by the challenging size of the available information related to online social behavior, we focus on developing random sampling approaches that achieve significant efficiency in identifying starters and followers. In our experimental section, we use BlogScope, our social media warehousing platform under development at the University of Toronto. We demonstrate the scalability and accuracy of our sampling approaches on real data, establishing the practical utility of our techniques in a real social media warehousing environment.






    Published In

    EDBT '09: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
    March 2009
    1180 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


    Association for Computing Machinery

    New York, NY, United States



    • Research-article


    EDBT/ICDT '09: EDBT/ICDT '09 joint conference
    March 24 - 26, 2009
    Saint Petersburg, Russia

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%

