research-article

Rumor Gauge: Predicting the Veracity of Rumors on Twitter

Published: 14 July 2017

Abstract

The spread of malicious or accidental misinformation on social media, especially in time-sensitive situations such as real-world emergencies, can have harmful effects on individuals and society. In this work, we developed models for automated verification of rumors (unverified information) that propagate through Twitter. To predict the veracity of rumors, we identified salient features of rumors by examining three aspects of information spread: the linguistic style used to express rumors, the characteristics of the people involved in propagating information, and network propagation dynamics. A rumor (a collection of tweets) is represented as a time series of these features, and its veracity is predicted using Hidden Markov Models. The verification algorithm was trained and tested on 209 rumors representing 938,806 tweets collected from real-world events, including the 2013 Boston Marathon bombings, the 2014 Ferguson unrest, and the 2014 Ebola epidemic, as well as many other rumors about real-world events reported on popular websites that document public rumors. The algorithm correctly predicted the veracity of 75% of the rumors faster than any other public source, including journalists and law enforcement officials. The ability to track rumors and predict their outcomes may have practical applications for news consumers, financial markets, journalists, and emergency services, and more generally may help minimize the impact of false information on Twitter.
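
The abstract states that veracity is predicted from a feature time series using Hidden Markov Models, but gives no implementation detail. The following is only a minimal sketch of one common way such a classifier can be set up, not the authors' method: it assumes two small discrete-observation HMMs, one standing in for true rumors and one for false rumors, and assigns a rumor to whichever model gives its sequence the higher likelihood. Every parameter value, the two-symbol encoding (0 = low, 1 = high value of some quantized feature), and the names `log_likelihood`, `TRUE_HMM`, `FALSE_HMM`, and `predict_veracity` are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete observations."""
    n_states = len(start)
    # Forward variables for the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Accumulate the log of each scaling factor to avoid underflow.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# Hypothetical, hand-picked parameters (NOT learned from data, NOT from the paper).
TRUE_HMM = {
    "start": [0.8, 0.2],
    "trans": [[0.9, 0.1], [0.3, 0.7]],
    "emit": [[0.9, 0.1], [0.4, 0.6]],
}
FALSE_HMM = {
    "start": [0.3, 0.7],
    "trans": [[0.6, 0.4], [0.1, 0.9]],
    "emit": [[0.7, 0.3], [0.1, 0.9]],
}


def predict_veracity(obs):
    """Label a quantized feature sequence by comparing the two model likelihoods."""
    ll_true = log_likelihood(obs, **TRUE_HMM)
    ll_false = log_likelihood(obs, **FALSE_HMM)
    return "true" if ll_true >= ll_false else "false"
```

With these illustrative parameters, `predict_veracity([0, 0, 0, 0])` selects the "true" model and `predict_veracity([1, 1, 1, 1])` selects the "false" model. In a real system the model parameters would be estimated from labeled rumors, and the observations would be derived from the linguistic, user, and propagation features the abstract mentions.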

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 4
Special Issue on KDD 2016 and Regular Papers
November 2017
419 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3119906
Editor: Jie Tang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 July 2017
Accepted: 01 March 2017
Revised: 01 October 2016
Received: 01 November 2015
Published in TKDD Volume 11, Issue 4

Author Tags

  1. Twitter
  2. fake news
  3. propagation
  4. rumor
  5. veracity prediction

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Systemization of Knowledge (SoK): Creating a Research Agenda for Human-Centered Real-Time Risk Detection on Social Media Platforms. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-21. DOI: 10.1145/3613904.3642315. Online publication date: 11-May-2024
  • (2024) Unraveling the Tangle of Disinformation: A Multimodal Approach for Fake News Identification on Social Media. Companion Proceedings of the ACM Web Conference 2024, 1849-1853. DOI: 10.1145/3589335.3651972. Online publication date: 13-May-2024
  • (2024) Explainable rumor detection based on grey clustering: Fusion of manual features and deep learning features. Information Sciences 679, 121055. DOI: 10.1016/j.ins.2024.121055. Online publication date: Sep-2024
  • (2024) Toward Detecting Rumor Initiator in Online Social Networks Using Ontology-Driven Model. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-024-08852-7. Online publication date: 11-Mar-2024
  • (2024) A comprehensive overview of fake news detection on social networks. Social Network Analysis and Mining 14:1. DOI: 10.1007/s13278-024-01280-3. Online publication date: 24-Jun-2024
  • (2024) FNNet: a secure ensemble-based approach for fake news detection using blockchain. The Journal of Supercomputing. DOI: 10.1007/s11227-024-06216-4. Online publication date: 29-May-2024
  • (2024) Detection on early dynamic rumor influence and propagation using biogeography-based optimization with deep learning approaches. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18168-1. Online publication date: 12-Mar-2024
  • (2024) Joint rumour and stance identification based on semantic and structural information in social networks. Applied Intelligence 54:1, 264-282. DOI: 10.1007/s10489-023-05170-7. Online publication date: 1-Jan-2024
  • (2024) LIMFA: label-irrelevant multi-domain feature alignment-based fake news detection for unseen domain. Neural Computing and Applications 36:10, 5197-5215. DOI: 10.1007/s00521-023-09340-z. Online publication date: 1-Apr-2024
  • (2024) Loose and Tight: Creative Formation but Rigid Use of Nominal Compounds in Conspiracist Texts. The Journal of Creative Behavior 58:1, 114-127. DOI: 10.1002/jocb.633. Online publication date: 4-Jan-2024