A content-driven framework for geolocating microblog users

Published: 01 February 2013

Abstract

Highly dynamic real-time microblog systems have already published petabytes of real-time human sensor data in the form of status updates. However, the lack of user adoption of geo-based features per user or per post signals that the promise of microblog services as location-based sensing systems may have only limited reach and impact. Thus, in this article, we propose and evaluate a probabilistic framework for estimating a microblog user's location based purely on the content of the user's posts. Our framework can overcome the sparsity of geo-enabled features in these services and bring augmented scope and breadth to emerging location-based personalized information services. Three of the key features of the proposed approach are: (i) its reliance purely on publicly available content; (ii) a classification component for automatically identifying words in posts with a strong local geo-scope; and (iii) a lattice-based neighborhood smoothing model for refining a user's location estimate. On average we find that the location estimates converge quickly, placing 51% of users within 100 miles of their actual location.
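To make the general idea concrete, the following is a minimal sketch of content-based location estimation together with the "within 100 miles" check used to report accuracy. It is not the authors' model: the word-to-city probabilities, the city list, and the function names (`estimate_city`, `haversine_miles`, `within_100_miles`) are all hypothetical, and the sketch omits the local-word classifier and the lattice-based neighborhood smoothing described in the abstract.

```python
import math
from collections import Counter

# Hypothetical toy values standing in for p(city | word); a real system
# would estimate these from a large corpus of posts with known locations.
WORD_CITY_PROB = {
    "rockets": {"Houston": 0.8, "New York": 0.2},
    "broadway": {"Houston": 0.1, "New York": 0.9},
}

# Approximate (latitude, longitude) of each city centre, in degrees.
CITY_COORDS = {
    "Houston": (29.7604, -95.3698),
    "New York": (40.7128, -74.0060),
}

EARTH_RADIUS_MILES = 3958.8


def estimate_city(posts):
    """Return the city with the highest summed word evidence, or None."""
    # Naive whitespace tokenization; punctuation is not stripped.
    word_counts = Counter(w for post in posts for w in post.lower().split())
    scores = Counter()
    for word, count in word_counts.items():
        # Words without a known city distribution contribute nothing.
        for city, prob in WORD_CITY_PROB.get(word, {}).items():
            scores[city] += count * prob
    if not scores:
        return None
    return max(scores, key=scores.get)


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def within_100_miles(estimated_city, actual_coords):
    """True if the estimated city's centre is within 100 miles of the truth."""
    return haversine_miles(CITY_COORDS[estimated_city], actual_coords) <= 100.0
```

For example, `estimate_city(["Go Rockets tonight", "rockets win"])` selects "Houston" under these toy probabilities, and `within_100_miles` can then compare that estimate against a user's known coordinates. Summing per-word evidence is only one simple way to combine word distributions; it is used here for brevity.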

      Published In

      ACM Transactions on Intelligent Systems and Technology  Volume 4, Issue 1
      Special section on twitter and microblogging services, social recommender systems, and CAMRa2010: Movie recommendation in context
      January 2013
      357 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/2414425
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 February 2013
      Accepted: 01 July 2012
      Revised: 01 June 2012
      Received: 01 May 2011
      Published in TIST Volume 4, Issue 1

      Author Tags

      1. Microblog
      2. Twitter
      3. location-based estimation
      4. spatial data mining
      5. text mining

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Article Metrics

      • Downloads (Last 12 months): 9
      • Downloads (Last 6 weeks): 2
      Reflects downloads up to 13 Jan 2025

      Cited By

      • (2024) Utilizing External Knowledge to Enhance Location Prediction for Twitter/X Users in Low Resource Settings. ACM Transactions on Spatial Algorithms and Systems 10:3 (1-25). DOI: 10.1145/3673899. Online publication date: 19-Jun-2024
      • (2023) Find You: Multi-View-Based Location Inference for Twitter Users. Applied Sciences 13:21 (11848). DOI: 10.3390/app132111848. Online publication date: 30-Oct-2023
      • (2023) Automated Phrasal Verb and Key-Phrase Checking with LSTM-Based Attention Mechanism. 2023 3rd International Conference on Computing and Information Technology (ICCIT) (285-290). DOI: 10.1109/ICCIT58132.2023.10273937. Online publication date: 13-Sep-2023
      • (2022) Tracking Dengue on Twitter Using Hybrid Filtration-Polarity and Apache Flume. Computer Systems Science and Engineering 40:3 (913-926). DOI: 10.32604/csse.2022.018467. Online publication date: 2022
      • (2022) Who are there. Journal of Network and Computer Applications 199:C. DOI: 10.1016/j.jnca.2021.103302. Online publication date: 1-Mar-2022
      • (2022) Pre-HLSA: Predicting home location for Twitter users based on sentimental analysis. Ain Shams Engineering Journal 13:1 (101501). DOI: 10.1016/j.asej.2021.05.015. Online publication date: Jan-2022
      • (2021) Which portland is it? Proceedings of the 5th ACM SIGSPATIAL International Workshop on Location-based Recommendations, Geosocial Networks and Geoadvertising (1-10). DOI: 10.1145/3486183.3491066. Online publication date: 2-Nov-2021
      • (2021) Heterogeneous Graph Attention Network for User Geolocation. PRICAI 2021: Trends in Artificial Intelligence (433-447). DOI: 10.1007/978-3-030-89363-7_33. Online publication date: 1-Nov-2021
      • (2020) Predicting the Tweet Location Based on KNN-Sentimental Analysis. 2020 15th International Conference on Computer Engineering and Systems (ICCES) (1-6). DOI: 10.1109/ICCES51560.2020.9334566. Online publication date: 15-Dec-2020
      • (2019) Locality-adapted kernel densities of term co-occurrences for location prediction of tweets. Information Processing & Management 56:4 (1280-1299). DOI: 10.1016/j.ipm.2019.02.013. Online publication date: Jul-2019
