DOI: 10.1145/2207676.2207709
CHI Conference Proceedings · Research article

Strategies for crowdsourcing social data analysis

Published: 05 May 2012

Abstract

Web-based social data analysis tools that rely on public discussion to produce hypotheses or explanations of the patterns and trends in data rarely yield high-quality results in practice. Crowdsourcing offers an alternative approach in which an analyst pays workers to generate such explanations. Yet asking workers with varying skills, backgrounds, and motivations to simply "explain why a chart is interesting" can result in irrelevant, unclear, or speculative explanations of variable quality. To address these problems, we contribute seven strategies for improving the quality and diversity of worker-generated explanations. Our experiments show that using (S1) feature-oriented prompts, providing (S2) good examples, and including (S3) reference gathering, (S4) chart reading, and (S5) annotation subtasks increases the quality of responses by 28% for US workers and 196% for non-US workers. Feature-oriented prompts improve explanation quality by 69% to 236% depending on the prompt. We also show that (S6) pre-annotating charts can focus workers' attention on relevant details, and demonstrate that (S7) generating explanations iteratively increases explanation diversity without increasing worker attrition. We used our techniques to generate 910 explanations for 16 datasets, and found that 63% were of high quality. These results demonstrate that paid crowd workers can reliably generate diverse, high-quality explanations that support the analysis of specific datasets.

Supplementary Material

MOV File (paperfile554-3.mov)
Supplemental video for “Strategies for crowdsourcing social data analysis”


Cited By

  • (2023) CrowdIDEA: Blending Crowd Intelligence and Data Analytics to Empower Causal Reasoning. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. doi:10.1145/3544548.3581021
  • (2023) Normative arguments from experts and peers reduce delay discounting. Judgment and Decision Making 7:5, 568-589. doi:10.1017/S1930297500006306
  • (2022) Augmented Chironomia for Presenting Data to Remote Audiences. Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-14. doi:10.1145/3526113.3545614
  • (2021) Narratives + Diagrams: An Integrated Approach for Externalizing and Sharing People's Causal Beliefs. Proceedings of the ACM on Human-Computer Interaction 5:CSCW2, 1-27. doi:10.1145/3479588
  • (2021) How We Write with Crowds. Proceedings of the ACM on Human-Computer Interaction 4:CSCW3, 1-31. doi:10.1145/3432928
  • (2021) Ask Me or Tell Me? Enhancing the Effectiveness of Crowdsourced Design Feedback. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-12. doi:10.1145/3411764.3445507
  • (2020) Exploratory Causal Analysis of Open Data: Explanation Generation and Confounder Identification. Journal of Advanced Computational Intelligence and Intelligent Informatics 24:1, 142-155. doi:10.20965/jaciii.2020.p0142
  • (2020) Rare, but Valuable: Understanding Data-centered Talk in News Website Comment Sections. Proceedings of the ACM on Human-Computer Interaction 4:CSCW2, 1-27. doi:10.1145/3415245
  • (2020) Shifting forms of Engagement: Volunteer Learning in Online Citizen Science. Proceedings of the ACM on Human-Computer Interaction 4:CSCW1, 1-19. doi:10.1145/3392841
  • (2020) Distributed Synchronous Visualization Design: Challenges and Strategies. 2020 IEEE Workshop on Evaluation and Beyond - Methodological Approaches to Visualization (BELIV), 1-10. doi:10.1109/BELIV51497.2020.00008


    Published In

    CHI '12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
    May 2012, 3276 pages
    ISBN: 9781450310154
    DOI: 10.1145/2207676

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. crowdsourcing
    2. information visualization
    3. social data analysis


    Conference

    CHI '12

    Acceptance Rates

    Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
