research-article

Exploiting Real-time Search Engine Queries for Earthquake Detection: A Summary of Results

Published: 25 May 2021

    Abstract

    Online search engines are widely regarded as the most convenient means of acquiring information. The intensive information-seeking behavior of search engine users makes it possible to exploit search engine queries as effective “crowd sensors” for event monitoring. While some researchers have investigated the feasibility of using search engine queries for coarse-grained event analysis, their capability for real-time event detection has been largely neglected. To this end, in this article, we present a large-scale, systematic study on exploiting real-time search engine queries for outbreak event detection, with a focus on rapid earthquake reporting. In particular, we propose a realistic system for real-time earthquake detection that monitors millions of earthquake-related queries from a dominant online search engine in China. Specifically, we first investigate a large set of queries to select representative queries that are highly correlated with the outbreak of earthquakes. Then, based on the real-time streams of the selected queries, we design a novel machine learning-enhanced two-stage burst detection approach for detecting earthquake events. Meanwhile, the location of an earthquake epicenter can be accurately estimated from the spatial-temporal distribution of search engine queries. Finally, in an extensive comparison with earthquake catalogs from the China Earthquake Networks Center, 2015, our system achieves a detection precision of 87.9% and a province-level location accuracy of 95.7%. In particular, 50% of successfully detected results are found within 62 s after the earthquake, and 50% of successful location estimates fall within 25.5 km of the seismic epicenter. Our system also found more than 23.3% extra earthquakes that were felt by people but not publicly released and 12.1% earthquake-like special outbreaks, and it revealed many interesting findings, such as the typical query patterns of earthquake rumors and regular memorial events. Based on these results, our system can feed information back to search engine users in a timely manner according to the case at hand and accelerate the release of information on felt earthquakes.
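    The abstract does not spell out the two-stage detector or the location estimator, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a per-interval count series of earthquake-related queries and flags a burst when a count rises far above a trailing baseline; the machine-learning second stage described in the abstract is omitted. The location step is a plain query-count-weighted centroid, which is a stand-in chosen for illustration. All names and parameters (`detect_bursts`, `weighted_centroid`, `baseline_len`, `threshold`, `min_count`) are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_bursts(counts, baseline_len=30, threshold=4.0, min_count=20):
    """Return indices whose count is a burst relative to a trailing baseline.

    Hypothetical first-stage filter: a point is flagged when it reaches
    `min_count` queries and exceeds the baseline mean by more than
    `threshold` standard deviations. Flagged points are kept out of the
    baseline so that an ongoing burst does not raise its own reference level.
    """
    baseline = deque(maxlen=baseline_len)
    bursts = []
    for i, c in enumerate(counts):
        if len(baseline) == baseline_len:
            mu = mean(baseline)
            # Floor the deviation at 1.0 so a flat baseline does not
            # produce a near-zero divisor.
            sigma = max(pstdev(baseline), 1.0)
            if c >= min_count and (c - mu) / sigma > threshold:
                bursts.append(i)
                continue
        baseline.append(c)
    return bursts


def weighted_centroid(points):
    """Estimate a location from (latitude, longitude, query_count) tuples.

    Assumption: the region is small and does not cross the antimeridian,
    so an arithmetic mean of coordinates is an acceptable approximation.
    """
    total = sum(n for _, _, n in points)
    if total == 0:
        raise ValueError("no queries to locate")
    lat = sum(la * n for la, _, n in points) / total
    lon = sum(lo * n for _, lo, n in points) / total
    return lat, lon


if __name__ == "__main__":
    # Synthetic data only: a quiet baseline followed by a sudden surge.
    series = [5] * 30 + [6, 80, 120, 7]
    print(detect_bursts(series))
    print(weighted_centroid([(30.0, 100.0, 1), (32.0, 104.0, 3)]))
```

    In a real deployment the candidate bursts from such a filter would still need a second check, since rumors and memorial events can produce similar surges, as the abstract notes.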



      Published In

      ACM Transactions on Information Systems, Volume 39, Issue 3
      July 2021
      432 pages
      ISSN:1046-8188
      EISSN:1558-2868
      DOI:10.1145/3450607

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 25 May 2021
      Accepted: 01 March 2021
      Revised: 01 March 2021
      Received: 01 June 2020


      Author Tags

      1. Search engine queries
      2. crowd sensors
      3. real-time event detection
      4. earthquake rapid reporting

      Qualifiers

      • Research-article
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
