Bursty Event Detection in Twitter Streams

Published: 08 August 2019

Abstract

In recent years, social media have become an invaluable source of information for both public and private organizations, helping them understand people's interests and detect the onset of new events. Twitter, especially, allows news and events to spread quickly as they happen in real time, which can contribute to situation awareness during emergencies and also help identify the trending topics of a period. The article proposes an online algorithm that incrementally groups tweet streams into clusters. The approach summarizes the examined tweets into a cluster centroid by maintaining a number of textual and temporal features that allow the method to effectively discover groups of interest on particular themes. Experiments on messages posted by users addressing different issues, and a comparison with state-of-the-art approaches, show that the method is capable of detecting discussions on topics of interest, and also of distinguishing bursty events revealed by a sudden spread of attention to messages published by users.
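As a rough illustration of the general idea of incrementally grouping a tweet stream into centroid-summarized clusters, the following Python sketch assigns each incoming message to the most similar existing cluster or opens a new one. It is not the article's algorithm: the `Cluster` fields, the whitespace tokenizer, the cosine similarity, and the `threshold` value are all assumptions chosen for illustration only.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical cluster summary: textual and temporal features only."""
    terms: Counter = field(default_factory=Counter)  # summed term counts (centroid direction)
    size: int = 0                                    # number of tweets absorbed so far
    first_seen: float = 0.0                          # timestamp of the first tweet
    last_seen: float = 0.0                           # timestamp of the latest tweet

    def add(self, tokens, timestamp):
        """Fold one tweet into the summary without storing the tweet itself."""
        if self.size == 0:
            self.first_seen = timestamp
        self.terms.update(tokens)
        self.size += 1
        self.last_seen = timestamp


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign(clusters, text, timestamp, threshold=0.3):
    """Assign a tweet to its nearest cluster, or start a new cluster.

    The threshold is an assumed value, not one taken from the article.
    """
    tokens = text.lower().split()  # naive tokenizer, for illustration only
    vector = Counter(tokens)
    best, best_sim = None, 0.0
    for cluster in clusters:
        sim = cosine(vector, cluster.terms)
        if sim > best_sim:
            best, best_sim = cluster, sim
    if best is None or best_sim < threshold:
        best = Cluster()
        clusters.append(best)
    best.add(tokens, timestamp)
    return best


clusters = []
first = assign(clusters, "earthquake hits city center", 1.0)
second = assign(clusters, "strong earthquake hits city", 2.0)
third = assign(clusters, "new phone launch today", 3.0)
```

In this sketch the two earthquake messages end up in the same cluster and the unrelated message opens a second one. The stored `size`, `first_seen`, and `last_seen` values are the kind of temporal summary from which a growth rate could be derived to flag a sudden burst of attention, although how the article actually defines burstiness is not stated in the abstract.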



Published In

ACM Transactions on Knowledge Discovery from Data  Volume 13, Issue 4
August 2019
235 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3343141

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 August 2019
Accepted: 01 May 2019
Revised: 01 March 2019
Received: 01 August 2018
Published in TKDD Volume 13, Issue 4


Author Tags

  1. Twitter
  2. bursty event
  3. event detection
  4. online clustering

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (Last 12 months): 51
  • Downloads (Last 6 weeks): 5

Reflects downloads up to 25 Dec 2024

Cited By

  • (2024) Automatic Construction of Expiration Time Expression Dataset from Retweets. Companion Proceedings of the ACM Web Conference 2024 (545-548). DOI: 10.1145/3589335.3651471. Online publication date: 13-May-2024
  • (2024) conteNXt: A Graph-Based Approach to Assimilate Content and Context for Event Detection in OSN. IEEE Transactions on Computational Social Systems 11:4 (5483-5495). DOI: 10.1109/TCSS.2024.3372399. Online publication date: Aug-2024
  • (2024) Social bots spoil activist sentiment without eroding engagement. Scientific Reports 14:1. DOI: 10.1038/s41598-024-74032-0. Online publication date: 6-Nov-2024
  • (2024) Detection and context reconstruction of sub-events that influence the course of a news event from microblog discussions. Journal of Computational Social Science 7:2 (1483-1517). DOI: 10.1007/s42001-024-00279-2. Online publication date: 26-Apr-2024
  • (2024) Cross-media web video event mining based on multiple semantic-paths embedding. Neural Computing and Applications 36:2 (667-683). DOI: 10.1007/s00521-023-09050-6. Online publication date: 1-Jan-2024
  • (2024) Bursty Event Detection Model for Twitter. Distributed Computing and Intelligent Technology (338-355). DOI: 10.1007/978-3-031-50583-6_23. Online publication date: 17-Jan-2024
  • (2023) Exploring Event-based Dynamic Topic Modeling*. 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) (165-174). DOI: 10.1109/CyberC58899.2023.00036. Online publication date: 2-Nov-2023
  • (2023) WhatsUp. Information Sciences: an International Journal 625:C (553-577). DOI: 10.1016/j.ins.2023.01.001. Online publication date: 1-May-2023
  • (2023) Construction of a high-precision general geographical location words dataset. Computer Standards & Interfaces 84:C. DOI: 10.1016/j.csi.2022.103692. Online publication date: 1-Mar-2023
  • (2022) A Review on the Trends in Event Detection by Analyzing Social Media Platforms’ Data. Sensors 22:12 (4531). DOI: 10.3390/s22124531. Online publication date: 15-Jun-2022
