DOI: 10.1145/2806416.2806466
research-article

Profession-Based Person Search in Microblogs: Using Seed Sets to Find Journalists

Published: 17 October 2015

Abstract

We introduce the problem of searching for professionals on microblogging platforms. We describe a study of how a group of professional journalists with some common characteristics (e.g., they work in a specific language, belong to a certain region, or specialize in a particular medium) can be found. Starting from seed sets of different sizes, social network features and profile content features are used to find additional journalists. The results show that combining two social network features, reciprocated mentions and a bidirectional friend/follower graph, provides a stronger signal than either feature taken independently; that both social network and profile content features are useful; and that profile content features are able to find larger numbers of less prominent journalists. We apply our methods to find the Twitter accounts of British and Arab journalists.
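
The abstract does not give the scoring details, so the following is only a minimal sketch of the general idea of expanding a seed set with two mutual-link signals. It assumes mentions and follows arrive as directed (source, target) pairs, and it combines the two signals with a simple weighted sum. The function names, the `weight` parameter, and the toy accounts are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def mutual_edges(directed_edges):
    """Keep only reciprocated pairs, i.e. both a->b and b->a are present."""
    edge_set = set(directed_edges)
    return {frozenset((a, b)) for a, b in edge_set if a != b and (b, a) in edge_set}


def seed_link_counts(mutual, seeds):
    """For each non-seed account, count the seed accounts it is mutually linked to."""
    counts = Counter()
    for pair in mutual:
        a, b = tuple(pair)
        if a in seeds and b not in seeds:
            counts[b] += 1
        elif b in seeds and a not in seeds:
            counts[a] += 1
    return counts


def rank_candidates(mentions, follows, seeds, weight=0.5):
    """Rank non-seed accounts by a weighted sum of two seed-link fractions.

    The weighted sum is an illustrative assumption, not the paper's formula.
    """
    seeds = set(seeds)
    mention_counts = seed_link_counts(mutual_edges(mentions), seeds)
    follow_counts = seed_link_counts(mutual_edges(follows), seeds)
    candidates = set(mention_counts) | set(follow_counts)
    scores = {
        c: weight * mention_counts[c] / len(seeds)
        + (1 - weight) * follow_counts[c] / len(seeds)
        for c in candidates
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: alice and bob are the seed journalists.
mentions = [("alice", "carol"), ("carol", "alice"),
            ("bob", "carol"), ("carol", "bob"),
            ("alice", "dave")]            # one-way mention, ignored
follows = [("alice", "carol"), ("carol", "alice"),
           ("bob", "dave"), ("dave", "bob"),
           ("erin", "alice")]             # one-way follow, ignored
ranked = rank_candidates(mentions, follows, seeds={"alice", "bob"})
print(ranked)  # [('carol', 0.75), ('dave', 0.25)]
```

In this toy example, carol ranks first because she is mutually linked to seeds in both graphs, while dave is supported only by the follow graph. One-way links, such as erin following alice, contribute nothing.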

    Published In

    CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
    October 2015
    1998 pages
    ISBN:9781450337946
    DOI:10.1145/2806416

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. journalists
    2. microblogs
    3. person search

    Qualifiers

    • Research-article

    Funding Sources

    • Qatar National Research Fund

    Conference

    CIKM'15
    Acceptance Rates

    CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2021) Analysis of Users Engaged in Online Discussions about Controversial Covid-19 Treatments. Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization, pp. 162-166. DOI: 10.1145/3450614.3462239. Online publication date: 21-Jun-2021
    • (2019) Journalists on Twitter: self-branding, audiences, and involvement of bots. Journal of Computational Social Science, 3:1, pp. 83-101. DOI: 10.1007/s42001-019-00056-6. Online publication date: 25-Sep-2019
    • (2018) The social silos of journalism? Twitter, news media and partisan segregation. New Media & Society, 21:4, pp. 815-835. DOI: 10.1177/1461444818807133. Online publication date: 25-Oct-2018
    • (2018) On Refining Twitter Lists as Ground Truth Data for Multi-community User Classification. Advances in Information Retrieval, pp. 765-772. DOI: 10.1007/978-3-319-76941-7_74. Online publication date: 1-Mar-2018
    • (2017) An improved Apriori–based algorithm for friends recommendation in microblog. International Journal of Communication Systems, 31:2. DOI: 10.1002/dac.3453. Online publication date: 6-Nov-2017
    • (2016) Does everybody lie? characterizing answerers in health-related CQA. 2016 International FRUCT Conference on Intelligence, Social Media and Web (ISMW FRUCT), pp. 1-6. DOI: 10.1109/FRUCT.2016.7584763. Online publication date: 4-Sep-2016