DOI: 10.1145/3318236.3318242

Automatic classification of Aurora-related tweets using machine learning methods

Published: 15 March 2019

Abstract

Social media provides a constant flow of information about all sorts of events at high temporal and spatial resolution. Over the past few years, as part of the GeoSocial project, we have been analyzing geological hazards and phenomena such as earthquakes, volcanic eruptions, landslides, floods and the aurora in real time by geolocating keyword-filtered tweets on a web map. To date, however, only keyword-based filtering has been applied, and it does not always remove tweets that are unrelated to hazard events. This work therefore explores five learning-based classification techniques for automatic hazard-event classification of tweets about Aurora sightings: a Linear SVM and four Deep Neural Networks (DNNs), namely a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), an RNN with Long Short-Term Memory (RNN-LSTM) and an RNN with Gated Recurrent Units (GRU). In addition, we also trained the DNNs using pre-trained word2vec word embeddings. Finally, we evaluate the algorithms on two datasets, one from the Aurorasaurus application and one manually labeled at the BGS. We show that the DNNs, and especially the CNN, perform better than the Linear SVM on both datasets, and that there is potential for improvement. Our code is also available online.
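The limitation of keyword-only filtering mentioned in the abstract can be illustrated with a minimal sketch. This is not the GeoSocial implementation: the keyword list, the example tweets and their labels are all invented here for illustration, and the function names are hypothetical.

```python
import re

# Hypothetical keyword list; the project's actual keywords are not given in the abstract.
AURORA_KEYWORDS = {"aurora", "northern lights"}


def keyword_filter(tweet: str, keywords=AURORA_KEYWORDS) -> bool:
    """Return True if any keyword occurs in the tweet as a whole word or phrase."""
    text = tweet.lower()
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)


# Invented tweets with hand-assigned labels (True = genuine aurora sighting).
EXAMPLES = [
    ("Amazing aurora over Tromsø tonight, green bands everywhere!", True),
    ("Traffic is terrible in Aurora, Colorado this morning", False),
    ("Watching Sleeping Beauty again, Princess Aurora is my favourite", False),
    ("Northern lights visible from Shetland right now", True),
]


def false_positives(examples):
    """Tweets that pass the keyword filter but are not genuine sightings."""
    return [text for text, is_sighting in examples
            if keyword_filter(text) and not is_sighting]


if __name__ == "__main__":
    for text in false_positives(EXAMPLES):
        print("passed the filter but unrelated:", text)
```

In this toy setup every tweet passes the keyword filter, including the two that have nothing to do with the phenomenon, which is the kind of case a learned classifier is meant to separate.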

Published In

ICGDA '19: Proceedings of the 2019 2nd International Conference on Geoinformatics and Data Analysis
March 2019
156 pages
ISBN: 9781450362450
DOI: 10.1145/3318236
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • Department of Informatics, University of Oslo

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Aurora detection
  2. Deep neural networks
  3. Twitter
  4. text classification

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICGDA 2019

Article Metrics

  • Total Citations: 0
  • Total Downloads: 109
  • Downloads (Last 12 months): 10
  • Downloads (Last 6 weeks): 2

Reflects downloads up to 04 Oct 2024