DOI: 10.1145/2207676.2207709
CHI Conference Proceedings · Research article

Strategies for crowdsourcing social data analysis

Published: 05 May 2012

Abstract

Web-based social data analysis tools that rely on public discussion to produce hypotheses or explanations of the patterns and trends in data rarely yield high-quality results in practice. Crowdsourcing offers an alternative approach in which an analyst pays workers to generate such explanations. Yet asking workers with varying skills, backgrounds, and motivations to simply "explain why a chart is interesting" can result in irrelevant, unclear, or speculative explanations of variable quality. To address these problems, we contribute seven strategies for improving the quality and diversity of worker-generated explanations. Our experiments show that using (S1) feature-oriented prompts, providing (S2) good examples, and including (S3) reference gathering, (S4) chart reading, and (S5) annotation subtasks increases the quality of responses by 28% for US workers and 196% for non-US workers. Feature-oriented prompts improve explanation quality by 69% to 236% depending on the prompt. We also show that (S6) pre-annotating charts can focus workers' attention on relevant details, and demonstrate that (S7) generating explanations iteratively increases explanation diversity without increasing worker attrition. We used our techniques to generate 910 explanations for 16 datasets, and found that 63% were of high quality. These results demonstrate that paid crowd workers can reliably generate diverse, high-quality explanations that support the analysis of specific datasets.

Supplementary Material

MOV File (paperfile554-3.mov)
Supplemental video for “Strategies for crowdsourcing social data analysis”


Cited By

  • (2023) CrowdIDEA: Blending Crowd Intelligence and Data Analytics to Empower Causal Reasoning. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. doi:10.1145/3544548.3581021
  • (2023) Normative arguments from experts and peers reduce delay discounting. Judgment and Decision Making 7:5, 568-589. doi:10.1017/S1930297500006306
  • (2022) Augmented Chironomia for Presenting Data to Remote Audiences. Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-14. doi:10.1145/3526113.3545614
  • (2021) Narratives + Diagrams: An Integrated Approach for Externalizing and Sharing People's Causal Beliefs. Proceedings of the ACM on Human-Computer Interaction 5:CSCW2, 1-27. doi:10.1145/3479588
  • (2021) How We Write with Crowds. Proceedings of the ACM on Human-Computer Interaction 4:CSCW3, 1-31. doi:10.1145/3432928
  • (2021) Ask Me or Tell Me? Enhancing the Effectiveness of Crowdsourced Design Feedback. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-12. doi:10.1145/3411764.3445507
  • (2020) Exploratory Causal Analysis of Open Data: Explanation Generation and Confounder Identification. Journal of Advanced Computational Intelligence and Intelligent Informatics 24:1, 142-155. doi:10.20965/jaciii.2020.p0142
  • (2020) Rare, but Valuable: Understanding Data-centered Talk in News Website Comment Sections. Proceedings of the ACM on Human-Computer Interaction 4:CSCW2, 1-27. doi:10.1145/3415245
  • (2020) Shifting forms of Engagement: Volunteer Learning in Online Citizen Science. Proceedings of the ACM on Human-Computer Interaction 4:CSCW1, 1-19. doi:10.1145/3392841
  • (2020) Distributed Synchronous Visualization Design: Challenges and Strategies. 2020 IEEE Workshop on Evaluation and Beyond - Methodological Approaches to Visualization (BELIV), 1-10. doi:10.1109/BELIV51497.2020.00008


    Published In

    CHI '12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
    May 2012, 3276 pages
    ISBN: 9781450310154
    DOI: 10.1145/2207676

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. crowdsourcing
    2. information visualization
    3. social data analysis


    Conference

    CHI '12

    Acceptance Rates

    Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
