DOI: 10.1145/3486635.3491075

Location Classification Based on Tweets

Published: 08 November 2021

Abstract

Location classification assigns types to locations, to enrich maps and support a plethora of geospatial applications that rely on location types. Classification can be performed by humans, but machine learning is more efficient and reacts to changes faster than human classification. It can replace human classification or support it. In this paper, we study the use of machine learning for Geosocial Location Classification, where the type of a site, e.g., a building, is discovered based on social-media posts, e.g., tweets. Our goal is to correctly associate a set of tweets posted in a small radius around a given location with the corresponding location type, e.g., school, church, restaurant or museum. We explore two approaches to the problem: (a) a pipeline approach, where each post is first classified, and then the location type associated with the set of posts is inferred from the individual post labels; and (b) a joint approach, where the individual posts are processed simultaneously to yield the desired location type. We tested the two approaches on a data set of geotagged tweets. Our results demonstrate the superiority of the joint approach. Moreover, we show that due to the unique structure of the problem, where weakly related messages are jointly processed to yield a single final label, linear classifiers outperform deep neural network alternatives.
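
To make the distinction between the two approaches concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear classifier with hand-picked word weights; the `WEIGHTS` table, the two labels, the majority-vote aggregation, and all function names are hypothetical choices made for illustration only. The pipeline variant labels each tweet separately and then votes, while the joint variant pools the words of all tweets and classifies once.

```python
from collections import Counter

# Hypothetical per-word weights for a toy linear classifier.
# These values are illustrative assumptions, not taken from the paper.
WEIGHTS = {
    "school": {"class": 2.0, "teacher": 2.0, "homework": 1.5, "lunch": 0.5},
    "restaurant": {"pizza": 2.0, "menu": 1.5, "lunch": 1.0, "delicious": 1.5},
}


def score(tokens, label):
    """Linear score: sum of the label's weights over the given tokens."""
    weights = WEIGHTS[label]
    return sum(weights.get(tok, 0.0) for tok in tokens)


def classify_tokens(tokens):
    """Return the label with the highest linear score."""
    return max(WEIGHTS, key=lambda label: score(tokens, label))


def pipeline_classify(tweets):
    """(a) Pipeline: label each tweet on its own, then take a majority vote."""
    votes = Counter(classify_tokens(t.lower().split()) for t in tweets)
    return votes.most_common(1)[0][0]


def joint_classify(tweets):
    """(b) Joint: pool all tweets into one bag of words and classify once."""
    pooled = [tok for t in tweets for tok in t.lower().split()]
    return classify_tokens(pooled)


# Three made-up tweets posted around one location.
tweets = [
    "lunch was ok",
    "lunch again",
    "so much homework from my teacher after class",
]

print("pipeline:", pipeline_classify(tweets))
print("joint:", joint_classify(tweets))
```

On this toy input the two variants disagree: two weakly informative tweets outvote one strongly informative tweet in the pipeline, whereas pooling lets the strong evidence dominate. This only illustrates how the approaches can differ; it says nothing about the results reported in the paper.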


Published In

GEOAI '21: Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery
November 2021
77 pages
ISBN: 9781450391207
DOI: 10.1145/3486635

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

SIGSPATIAL '21

Acceptance Rates

Overall Acceptance Rate 17 of 25 submissions, 68%
