Abstract
Recent years have witnessed an explosion of geospatial data, especially in the form of Volunteered Geographic Information (VGI). As a prominent example, OpenStreetMap (OSM) builds a free, editable map of the world from the contributions of a large number of volunteers. Meanwhile, social media platforms such as Twitter and Instagram supply dynamic social feeds at population level. Because much of this data is geo-tagged, there is high potential for integrating social media with OSM to enrich OSM with semantic annotations, which complement the existing annotations oriented toward objective description and so broaden the range of annotations available. In this paper, we propose a comprehensive framework for integrating social media data and VGI data to derive knowledge about geographical objects, specifically the most relevant annotations from tweets for objects in OSM. We first integrate geo-tagged tweets with OSM data using scalable spatial queries running on MapReduce. We then propose a frequency-based method for annotating boundary-based geographic objects (polygons) and a probability-based method for annotating point-based geographic objects (a latitude and longitude), both designed with noise in mind. We evaluate our methods on a large corpus of geo-tagged tweets and representative geographic objects from OSM; ground-truth comparison and case studies show promising results. We are able to produce up to 80% correct names for geographical objects and to discover implicitly relevant information, such as popular exhibitions of a museum, or the nicknames of a tourist attraction and visitors’ impressions of it.
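To make the frequency-based idea for polygon objects concrete, the following is a minimal single-machine sketch, not the framework's actual implementation. It assumes a plain ray-casting point-in-polygon test in place of the MapReduce spatial query, and it scores a term by the number of in-polygon tweets that contain it; the abstract does not specify the exact scoring or noise handling. The names `point_in_polygon`, `top_terms`, `SQUARE`, and `TWEETS` are hypothetical, as is the toy data.

```python
from collections import Counter


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def top_terms(tweets, polygon, k=3):
    """Rank terms by how many tweets inside the polygon mention them.

    tweets is an iterable of (lon, lat, text). Whitespace tokenisation
    and once-per-tweet counting are assumptions made for this sketch.
    """
    counts = Counter()
    for lon, lat, text in tweets:
        if point_in_polygon(lon, lat, polygon):
            counts.update(set(text.lower().split()))
    return counts.most_common(k)


# Hypothetical toy data: a unit-square "building" and three tweets.
SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
TWEETS = [
    (0.5, 0.5, "museum museum art"),
    (0.2, 0.8, "Museum tour"),
    (2.0, 2.0, "stadium match"),  # outside the polygon, ignored
]

print(top_terms(TWEETS, SQUARE, k=1))  # [('museum', 2)]
```

Counting each term once per tweet keeps a single repetitive tweet from dominating the ranking, which is one simple way to limit noise. At the scale described above, the containment test would be replaced by an indexed spatial join rather than a scan over every tweet and polygon pair.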











Change history
11 August 2018
The original version of this article contained two errors.
Notes
OpenStreetMap. www.openstreetmap.org.
Wikimapia API. wikimapia.org/api.
The two original tweets for “architecture” are: #imperialwarmuseumnorth #manchester #salfordquays ... Impressive architecture #lovemanchester https://t.co/eS4tfRkEqo and The walls between art and engineering exist only in our minds #bridge #architecture #manchester https://t.co/SVndM4ARCk
Cite this article
Chen, X., Vo, H., Wang, Y. et al. A framework for annotating OpenStreetMap objects using geo-tagged tweets. Geoinformatica 22, 589–613 (2018). https://doi.org/10.1007/s10707-018-0323-8
DOI: https://doi.org/10.1007/s10707-018-0323-8