DOI: 10.1145/2675133.2675235
research-article

There's No Such Thing as the Perfect Map: Quantifying Bias in Spatial Crowd-sourcing Datasets

Published: 28 February 2015 Publication History

Abstract

Crowd-sourcing has become a popular form of computer-mediated collaborative work, and OpenStreetMap represents one of the most successful crowd-sourcing systems, where the goal of building and maintaining an accurate global map of the world is being accomplished by means of contributions made by over 1.2M citizens. However, within this apparently large crowd, a tiny group of highly active users is responsible for the mapping of almost all the content. One may thus wonder to what extent the information being mapped is biased towards the interests and agenda of this group of users. In this paper, we present a method to quantitatively measure content bias in crowd-sourced geographic information. We then apply the method to quantify content bias across a three-year period of OpenStreetMap mapping in 40 countries. We find almost no content bias in terms of what is being mapped, but significant geographic bias; furthermore, we find that bias in terms of meticulousness varies with culture.

Published In

CSCW '15: Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing
February 2015
1956 pages
ISBN:9781450329224
DOI:10.1145/2675133
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. content bias
  2. cross-cultural
  3. crowd-sourcing
  4. openstreetmap
  5. volunteered geographic information

Qualifiers

  • Research-article

Conference

CSCW '15

Acceptance Rates

CSCW '15 Paper Acceptance Rate 161 of 575 submissions, 28%;
Overall Acceptance Rate 2,235 of 8,521 submissions, 26%

Cited By

  • (2024) Behind the scenes of a crowdmapping tool design and implementation: Guidelines for participatory mapping practices in a multicultural environment. Geographia Polonica 97:1 (5-21). DOI: 10.7163/GPol.0266. Online publication date: 8-Apr-2024
  • (2024) Behaviors and Perceptions of Human-Chatbot Interactions Based on Top Active Users of a Commercial Social Chatbot. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2 (1-28). DOI: 10.1145/3687022. Online publication date: 8-Nov-2024
  • (2024) Co-Designing Location-based Games for Broadband Data Collection. Proceedings of the 2024 ACM Designing Interactive Systems Conference (2057-2072). DOI: 10.1145/3643834.3661502. Online publication date: 1-Jul-2024
  • (2022) Recreating Human Mobility Patterns Through the Lens of Social Media: Using Twitter to Model the Social Ecology of Crime. Crime & Delinquency 70:8 (1943-1970). DOI: 10.1177/00111287221106946. Online publication date: 25-Jun-2022
  • (2022) Mitigating Bias in Algorithmic Systems: A Fish-eye View. ACM Computing Surveys 55:5 (1-37). DOI: 10.1145/3527152. Online publication date: 3-Dec-2022
  • (2022) Impact of Driving Behavior on Commuter's Comfort During Cab Rides: Towards a New Perspective of Driver Rating. ACM Transactions on Intelligent Systems and Technology 13:6 (1-25). DOI: 10.1145/3523063. Online publication date: 22-Sep-2022
  • (2022) Approximating Accessibility of Regions from Incomplete Volunteered Data. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (1-6). DOI: 10.1145/3491101.3519706. Online publication date: 27-Apr-2022
  • (2022) Uncovering commercial activity in informal cities. Royal Society Open Science 9:11. DOI: 10.1098/rsos.211841. Online publication date: 2-Nov-2022
  • (2022) Extending QGIS processing toolbox for assessing the geometrical properties of OpenStreetMap data. Spatial Information Research 31:2 (135-144). DOI: 10.1007/s41324-022-00480-3. Online publication date: 24-Sep-2022
  • (2021) Quantified Cycling Safety: Towards a Mobile Sensing Platform to Understand Perceived Safety of Cyclists. Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (1-6). DOI: 10.1145/3411763.3451678. Online publication date: 8-May-2021
