research-article
Open access

Documenting Data Production Processes: A Participatory Approach for Data Work

Published: 11 November 2022

Abstract

The opacity of machine learning data is a significant threat to ethical data work and intelligible systems. Previous research has addressed this issue by proposing standardized checklists to document datasets. This paper expands that field of inquiry by proposing a shift of perspective: from documenting datasets towards documenting data production. We draw on participatory design and collaborate with data workers at two companies located in Bulgaria and Argentina, where the collection and annotation of data for machine learning are outsourced. Our investigation comprises 2.5 years of research, including 33 semi-structured interviews, five co-design workshops, the development of prototypes, and several feedback instances with participants. We identify key challenges and requirements related to the integration of documentation practices in real-world data production scenarios. Our findings comprise important design considerations and highlight the value of designing data documentation based on the needs of data workers. We argue that a view of documentation as a boundary object, i.e., an object that can be used differently across organizations and teams but holds enough immutable content to maintain integrity, can be useful when designing documentation to retrieve heterogeneous, often distributed, contexts of data production.

References

[1]
[n.d.]. AI FactSheets 360. https://aifs360.mybluemix.net/
[2]
[n.d.]. Call For Datasets Benchmarks. https://neurips.cc/Conferences/2021/CallForDatasetsBenchmarks
[3]
[n.d.]. Google Cloud Model Cards. https://modelcards.withgoogle.com/about
[4]
Rikke Aarhus and Stinne Aaløkke Ballegaard. 2010. Negotiating boundaries: managing disease at home. In Proceedings of the 28th international conference on Human factors in computing systems - CHI '10. ACM Press, Atlanta, Georgia, USA, 1223. https://doi.org/10.1145/1753326.1753509
[5]
Mark S. Ackerman and Christine Halverson. 1998. Considering an organization's memory. In Proceedings of the 1998 ACM conference on Computer supported cooperative work - CSCW '98. ACM Press, Seattle, Washington, United States, 39--48. https://doi.org/10.1145/289444.289461
[6]
Shazia Afzal, C Rajmohan, Manish Kesarwani, Sameep Mehta, and Hima Patel. 2021. Data Readiness Report. In 2021 IEEE International Conference on Smart Data Services (SMDS). IEEE, Chicago, IL, USA, 42--51. https://doi.org/10.1109/ SMDS53860.2021.00016
[7]
Mariam Asad, Christopher A. Le Dantec, Becky Nielsen, and Kate Diedrick. 2017. Creating a Sociotechnical API: Designing City-Scale Community Engagement. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI '17). Association for Computing Machinery, New York, NY, USA, 2295--2306. https://doi.org/10.1145/3025453.3025963
[8]
Agathe Balayn, Bogdan Kulynych, and Seda Gürses. 2021. Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision. (2021), 8.
[9]
Liam Bannon and Susanne Bødker. 1997. Constructing Common Information Spaces. In Proceedings of the Fifth European Conference on Computer Supported Cooperative Work. Springer Netherlands, Dordrecht, 81--96. https: //doi.org/10.1007/978--94-015--7372--6_6
[10]
Emily M. Bender and Batya Friedman. 2018. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Transactions of the Association for Computational Linguistics 6 (2018), 587--604. https://doi.org/10.1162/tacl_a_00041
[11]
Dane Bertram, Amy Voida, Saul Greenberg, and Robert Walker. 2010. Communication, collaboration, and bugs: the social nature of issue tracking in small, collocated teams. In Proceedings of the 2010 ACM conference on Computer supported cooperative work - CSCW '10. ACM Press, Savannah, Georgia, USA, 291. https://doi.org/10.1145/1718918. 1718972
[12]
Susanne Bødker and Morten Kyng. 2018. Participatory Design That Matters-Facing the Big Issues. ACM Trans. Comput.-Hum. Interact. 25, 1, Article 4 (feb 2018), 31 pages. https://doi.org/10.1145/3152421
[13]
Claus Bossen, Lotte Groth Jensen, and Flemming Witt. 2012. Medical secretaries' care of records: the cooperative work of a non-clinical group. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW '12. ACM Press, Seattle, Washington, USA, 921. https://doi.org/10.1145/2145204.2145341
[14]
Claus Bossen, Lotte Groth Jensen, and Flemming Witt Udsen. 2014. Boundary-Object Trimming: On the Invisibility of Medical Secretaries' Care of Records in Healthcare Infrastructures. Computer Supported Cooperative Work (CSCW) 23, 1 (Feb. 2014), 75--110. https://doi.org/10.1007/s10606-013--9195--5
[15]
Geoffrey C. Bowker and Susan Leigh Star. 1999. Sorting things out: classification and its consequences. MIT Press, Cambridge, Mass. https://mitpress.mit.edu/books/sorting-things-out
[16]
Eva Brandt, Thomas Binder, and Elizabeth Sanders. 2012. Tools and techniques: Ways to engage telling, making and enacting. In Routledge Handbook of Participatory Design, Jesper Simonsen and Toni Robertson (Eds.). Routledge, 145--181.
[17]
Tone Bratteteig, Keld Bødker, Yvonne Dittrich, Preben H. Mogensen, and Jesper Simonsen. 2012. Methods. Organising principles and general guidelines for Participatory Design projects. In Routledge Handbook of Participatory Design, Jesper Simonsen and Toni Robertson (Eds.). Routledge, 117--144.
[18]
Tone Bratteteig and Ina Wagner. 2016. Unpacking the Notion of Participation in Participatory Design. Computer Supported Cooperative Work 25, 6 (dec 2016), 425--475. https://doi.org/10.1007/s10606-016--9259--4
[19]
Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3 (01 2006), 77--101. https://doi.org/10.1191/1478088706qp063oa
[20]
Virginia Braun, Victoria Clarke, Nikki Hayfield, and Gareth Terry. 2019. Thematic Analysis. In Handbook of Research Methods in Health Social Sciences, Pranee Liamputtong (Ed.). Springer Singapore, Singapore, 843--860. https://doi.org/10.1007/978--981--10--5251--4_103
[21]
Anna Brown, Alexandra Chouldechova, Emily Putnam-Hornstein, Andrew Tobin, and Rhema Vaithianathan. 2019. Toward Algorithmic Accountability in Public Services: A Qualitative Study of Affected Community Perspectives on Algorithmic Decision-Making in Child Welfare Services. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI '19). Association for Computing Machinery, New York, NY, USA, 1--12. https://doi.org/10.1145/3290605.3300271
[22]
Paul Carlile. 2002. A Pragmatic View of Knowledge and Boundaries: Boundary Objects in New Product Development. Organization Science 13 (08 2002), 442--455. https://doi.org/10.1287/orsc.13.4.442.2953
[23]
Kathy Charmaz. 2006. Constructing Grounded Theory: A Practical Guide through Qualitative Analysis. Sage Publications, London ; Thousand Oaks, Calif.
[24]
Michael Chibnik. 2020. Practical and Ethical Complications of Participatory Research. Annals of Anthropological Practice 44, 2 (Nov. 2020), 208--212. https://doi.org/10.1111/napa.12153
[25]
Eric Corbett and Christopher Le Dantec. 2019. Towards a Design Framework for Trust in Digital Civics. In Proceedings of the 2019 on Designing Interactive Systems Conference (San Diego, CA, USA) (DIS '19). Association for Computing Machinery, New York, NY, USA, 1145--1156. https://doi.org/10.1145/3322276.3322296
[26]
Terry Costantino, Steven LeMay, Linnea Vizard, Heather Moore, Dara Renton, Sandra Gornall, and Ian Strang. 2014. Participatory Design of Public Library E-Services. In Proceedings of the 13th Participatory Design Conference: Short Papers, Industry Cases, Workshop Descriptions, Doctoral Consortium Papers, and Keynote Abstracts - Volume 2 (Windhoek, Namibia) (PDC '14). Association for Computing Machinery, New York, NY, USA, 133--136. https: //doi.org/10.1145/2662155.2662232
[27]
Sasha Costanza-Chock. 2020. Design Justice: Community-Led Practices to Build the Worlds We Need. The MIT Press, Cambridge, MA. https://design-justice.pubpub.org/
[28]
Kate Crawford and Trevor Paglen. 2019. Excavating AI: The Politics of Images in Machine Learning Training Sets. https://www.excavating.ai tex.ids: zotero-3263.
[29]
Emily Denton, Alex Hanna, Razvan Amironesei, Andrew Smart, and Hilary Nicole. 2021. On the genealogy of machine learning datasets: A critical history of ImageNet. Big Data & Society 8, 2 (July 2021), 205395172110359. https://doi.org/10.1177/20539517211035955
[30]
Emily Denton, Alex Hanna, Razvan Amironesei, Andrew Smart, Hilary Nicole, and Morgan Klaus Scheuerman. 2020. Bringing the People Back In: Contesting Benchmark Machine Learning Datasets. arXiv:2007.07399 [cs] (July 2020). http://arxiv.org/abs/2007.07399 arXiv: 2007.07399.
[31]
Catherine D'Ignazio and Lauren F. Klein. 2020. Data feminism. The MIT Press, Cambridge, Massachusetts. https: //mitpress.mit.edu/books/data-feminism
[32]
Pelle Ehn. 2008. Participation in Design Things. In Proceedings of the Tenth Anniversary Conference on Participatory Design 2008 (Bloomington, Indiana) (PDC '08). Indiana University, USA, 92--101.
[33]
Jordan Famularo, Betty Hensellek, and Philip Walsh. 2021. Data Stewardship: A Letter to Computer Vision from Cultural Heritage Studies. (2021), 11.
[34]
Shaoyang Fan, Ujwal Gadiraju, Alessandro Checco, and Gianluca Demartini. 2020. CrowdCO-OP: Sharing Risks and Rewards in Crowdsourcing. Proceedings of the ACM on Human-Computer Interaction 4, CSCW2 (Oct. 2020), 1--24. https://doi.org/10.1145/3415203
[35]
Casey Fiesler, Jed R. Brubaker, Andrea Forte, Shion Guha, Nora McDonald, and Michael Muller. 2019. Qualitative Methods for CSCW: Challenges and Opportunities. In Conference Companion Publication of the 2019 on Computer Supported Cooperative Work and Social Computing (Austin, TX, USA) (CSCW '19). Association for Computing Machinery, New York, NY, USA, 455--460. https://doi.org/10.1145/3311957.3359428
[36]
Asbjørn Ammitzbøll Flügge. 2021. Perspectives from Practice: Algorithmic Decision-Making in Public Employment Services. In Companion Publication of the 2021 Conference on Computer Supported Cooperative Work and Social Computing. Association for Computing Machinery, New York, NY, USA, 253--255. https://doi.org/10.1145/3462204. 3481787
[37]
Timnit Gebru, Jamie Morgenstern, Briana Vecchione, JenniferWortman Vaughan, HannaWallach, Hal Daumé III, and Kate Crawford. 2021. Datasheets for Datasets. Commun. ACM 64, 12 (nov 2021), 86--92. https://doi.org/10.1145/3458723
[38]
R. Stuart Geiger, Kevin Yu, Yanlai Yang, Mindy Dai, Jie Qiu, Rebekah Tang, and Jenny Huang. 2020. Garbage in, Garbage out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* '20). Association for Computing Machinery, New York, NY, USA, 325--336. https://doi.org/10.1145/ 3351095.3372862
[39]
Yolanda Gil, Cédric H. David, Ibrahim Demir, Bakinam T. Essawy, Robinson W. Fulweiler, Jonathan L. Goodall, Leif Karlstrom, Huikyo Lee, Heath J. Mills, Ji-Hyun Oh, Suzanne A. Pierce, Allen Pope, Mimi W. Tzeng, Sandra R. Villamizar, and Xuan Yu. 2016. Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance. Earth and Space Science 3, 10 (2016), 388--415. https: //doi.org/10.1002/2015EA000136 arXiv:https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1002/2015EA000136
[40]
Alyssa Goodman, Alberto Pepe, Alexander W. Blocker, Christine L. Borgman, Kyle Cranmer, Merce Crosas, Rosanne Di Stefano, Yolanda Gil, Paul Groth, Margaret Hedstrom, David W. Hogg, Vinay Kashyap, Ashish Mahabal, Aneta Siemiginowska, and Aleksandra Slavkovic. 2014. Ten Simple Rules for the Care and Feeding of Scientific Data. PLOS Computational Biology 10, 4 (04 2014), 1--5. https://doi.org/10.1371/journal.pcbi.1003542
[41]
Mark Graham, Isis Hjorth, and Vili Lehdonvirta. 2017. Digital labour and development: impacts of global digital labour platforms and the gig economy on worker livelihoods. Transfer: European Review of Labour and Research 23, 2 (may 2017), 135--162. https://doi.org/10.1177/1024258916687250
[42]
Mary L. Gray and Siddharth Suri. 2019. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Houghton Mifflin Harcourt, Boston.
[43]
Kotaro Hara, Abigail Adams, Kristy Milland, Saiph Savage, Chris Callison-Burch, and Jeffrey P. Bigham. 2018. A Data-Driven Analysis of Workers' Earnings on Amazon Mechanical Turk. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, Montreal QC Canada, 1--14. https://doi.org/10.1145/3173574.3174023
[44]
Lisa Haskel. 2017. Participatory design and free and open source software in the not for profit sector: the Hublink Project. Ph.D. Dissertation. Bournemouth University.
[45]
Beverley Hawkins, Annie Pye, and Fernando Correia. 2017. Boundary objects, power, and learning: The matter of developing sustainable practice in organizations. Management Learning 48, 3 (July 2017), 292--310. https: //doi.org/10.1177/1350507616677199
[46]
Sarah Holland, Ahmed Hosny, Sarah Newman, Joshua Joseph, and Kasia Chmielinski. 2018. The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards. arXiv:1805.03677 (2018). .http://arxiv.org/abs/1805.03677
[47]
Ben Hutchinson, Andrew Smart, Alex Hanna, Emily Denton, Christina Greer, Oddur Kjartansson, Parker Barnes, and Margaret Mitchell. 2021. Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM, Virtual Event Canada, 560--575. https://doi.org/10.1145/3442188.3445918
[48]
Isto Huvila. 2011. The politics of boundary objects: Hegemonic interventions and the making of a document. Journal of the American Society for Information Science and Technology 62, 12 (2011), 2528--2539. https://doi.org/10.1002/asi.21639 arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/asi.21639
[49]
Lilly Irani. 2015. The cultural work of microwork. New Media & Society 17, 5 (2015), 720--739. https://doi.org/10. 1177/1461444813511926
[50]
Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: interrupting worker invisibility in amazon mechanical turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). Association for Computing Machinery, Paris, France, 611--620. https://doi.org/10.1145/2470654.2470742
[51]
Eun Seo Jo and Timnit Gebru. 2020. Lessons from archives: strategies for collecting sociocultural data in machine learning. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM, Barcelona Spain, 306--316. https://doi.org/10.1145/3351095.3372829
[52]
Michael Katell, Meg Young, Dharma Dailey, Bernease Herman, Vivian Guetler, Aaron Tam, Corinne Bintz, Daniella Raz, and P. M. Krafft. 2020. Toward Situated Interventions for Algorithmic Equity: Lessons from the Field. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* '20). Association for Computing Machinery, New York, NY, USA, 45--55. https://doi.org/10.1145/3351095.3372874
[53]
Gunay Kazimzade and Milagros Miceli. 2020. Biased Priorities, Biased Outcomes: Three Recommendations for Ethics-oriented Data Annotation Practices. In Proceedings of the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society. (AIES '20). Association for Computing Machinery, New York, NY, USA, 1--7. https://doi.org/10.1145/ 3375627.3375809 tex.ids: kazimzade2020a, kazimzade2020b.
[54]
Sandjar Kozubaev, Fernando Rochaix, Carl DiSalvo, and Christopher A. Le Dantec. 2019. Spaces and Traces: Implications of Smart Technology in Public Housing. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1--13. https://doi.org/10.1145/3290605.3300669
[55]
Jacob Leon Kröger, Milagros Miceli, and Florian Müller. 2021. How Data Can Be Used Against People: A Classification of Personal Data Misuses. https://papers.ssrn.com/abstract=3887097
[56]
Diana S. Kusunoki and Aleksandra Sarcevic. 2015. Designing for Temporal Awareness: The Role of Temporality in Time-Critical Medical Teamwork. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing (Vancouver, BC, Canada) (CSCW '15). Association for Computing Machinery, New York, NY, USA, 1465--1476. https://doi.org/10.1145/2675133.2675279
[57]
Christopher A. Le Dantec and Sarah Fox. 2015. Strangers at the Gate: Gaining Access, Building Rapport, and Co-Constructing Community-Based Research. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (Vancouver, BC, Canada) (CSCW '15). Association for Computing Machinery, New York, NY, USA, 1348--1358. https://doi.org/10.1145/2675133.2675147
[58]
Charlotte P. Lee. 2005. Between Chaos and Routine: Boundary Negotiating Artifacts in Collaboration. In ECSCW 2005, Hans Gellersen, Kjeld Schmidt, Michel Beaudouin-Lafon, and Wendy Mackay (Eds.). Springer-Verlag, Berlin/Heidelberg, 387--406. https://doi.org/10.1007/1--4020--4023--7_20
[59]
Charlotte P. Lee. 2007. Boundary Negotiating Artifacts: Unbinding the Routine of Boundary Objects and Embracing Chaos in Collaborative Work. Computer Supported Cooperative Work (CSCW) 16, 3 (June 2007), 307--339. https: //doi.org/10.1007/s10606-007--9044--5
[60]
Wayne G. Lutters and Mark S. Ackerman. 2002. Achieving safety: a field study of boundary objects in aircraft technical support. In Proceedings of the 2002 ACM conference on Computer supported cooperative work - CSCW '02. ACM Press, New Orleans, Louisiana, USA, 266. https://doi.org/10.1145/587078.587116
[61]
Wayne G. Lutters and Mark S. Ackerman. 2007. Beyond Boundary Objects: Collaborative Reuse in Aircraft Technical Support. Computer Supported Cooperative Work (CSCW) 16, 3 (June 2007), 341--372. https://doi.org/10.1007/s10606- 006--9036-x
[62]
Michael A. Madaio, Luke Stark, Jennifer Wortman Vaughan, and Hanna Wallach. 2020. Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1--14. https://doi.org/10.1145/3313831.3376445
[63]
Henry Mainsah and Andrew Morrison. 2014. Participatory Design through a Cultural Lens: Insights from Postcolonial Theory. In Proceedings of the 13th Participatory Design Conference: Short Papers, Industry Cases, Workshop Descriptions, Doctoral Consortium Papers, and Keynote Abstracts - Volume 2 (Windhoek, Namibia) (PDC '14). Association for Computing Machinery, New York, NY, USA, 83--86. https://doi.org/10.1145/2662155.2662195
[64]
David Martin, Benjamin V. Hanrahan, Jacki O'Neill, and Neha Gupta. 2014. Being a turker. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing. ACM, Baltimore Maryland USA, 224--235. https://doi.org/10.1145/2531602.2531663
[65]
Donald Martin Jr, Vinodkumar Prabhakaran, Jill Kuhlberg, Andrew Smart, and William S Isaac. 2020. Participatory problem formulation for fairer machine learning through community based system dynamics. arXiv preprint arXiv:2005.07572 (2020).
[66]
Laurie McLeod and Bill Doolin. 2010. Documents As Mediating Artifacts in Contemporary IS Development. In Proceedings of the 43rd Hawaii International Conference on System Sciences. IEEE, Honolulu, HI, USA, 1--10. https: //doi.org/10.1109/HICSS.2010.155
[67]
Milagros Miceli and Julian Posada. 2022. The Data-Production Dispositif. arXiv. https://doi.org/10.48550/arXiv.2205. 11963 arXiv:2205.11963 [cs] type: article.
[68]
Milagros Miceli, Julian Posada, and Tianling Yang. 2022. Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power? Proc. ACM Hum.-Comput. Interact. 6, Article 34 (Jan. 2022), 14 pages. https://doi.org/10.1145/ 3492853
[69]
Milagros Miceli, Martin Schuessler, and Tianling Yang. 2020. Between Subjectivity and Imposition: Power Dynamics in Data Annotation for Computer Vision. Proceedings of the ACM on Human-Computer Interaction 4, CSCW2 (Oct. 2020), 1--25. https://doi.org/10.1145/3415186
[70]
Milagros Miceli, Tianling Yang, Laurens Naudts, Martin Schuessler, Diana Serbanescu, and Alex Hanna. 2021. Documenting Computer Vision Datasets: An Invitation to Reflexive Data Practices. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM, Virtual Event Canada, 161--172. https: //doi.org/10.1145/3442188.3445880
[71]
Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). Association for Computing Machinery, 220--229. https: //doi.org/10.1145/3287560.3287596
[72]
Michael Muller and Angelika Strohmayer. 2022. Forgetting Practices in the Data Sciences. (2022), 30.
[73]
Michael Muller, Christine T Wolf, Josh Andres, Zahra Ashktorab, Narendra Nath Joshi, Michael Desmond, Aabhas Sharma, Kristina Brimijoin, Qian Pan, Evelyn Duesterwald, and Casey Dugan. 2021. Designing Ground Truth and the Social Life of Labels. (2021), 17.
[74]
Michael J. Muller. 2009. Participatory Design: the third space in HCI. In Human-computer interaction. CRC press, 181--202.
[75]
Samir Passi. 2018. Collaboration as Participation: The Many Faces in a Corporate Data Science Project. In The Changing Contours of "Participation" in Data-driven Algorithmic Ecosystems: Challenges, Tactics, and an Agenda' workshop in the 2018 ACM CSCW. https://www.samirpassi.com/pubs/working-papers/SamirPassi-CollaborationAsParticipation.pdf
[76]
Samir Passi and Steven J. Jackson. 2018. Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 136 (nov 2018), 28 pages. https://doi.org/10.1145/3274405
[77]
Samir Passi and Phoebe Sengers. 2020. Making data science systems work. Big Data & Society 7, 2 (July 2020), 205395172093960. https://doi.org/10.1177/2053951720939605
[78]
Amandalynne Paullada, Inioluwa Deborah Raji, Emily M. Bender, Emily Denton, and Alex Hanna. 2020. Data and its (dis)contents: A survey of dataset development and use in machine learning research. arXiv:2012.05345 [cs] (Dec. 2020). http://arxiv.org/abs/2012.05345 arXiv: 2012.05345.
[79]
Laura R. Pina, Sang-Wha Sien, Teresa Ward, Jason C. Yip, Sean A. Munson, James Fogarty, and Julie A. Kientz. 2017. From Personal Informatics to Family Informatics: Understanding Family Practices around Health Monitoring. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (Portland, Oregon, USA) (CSCW '17). Association for Computing Machinery, New York, NY, USA, 2300--2315. https://doi.org/ 10.1145/2998181.2998362
[80]
Julian Posada. 2020. The Future of Work Is Here: Toward a Comprehensive Approach to Artificial Intelligence and Labour. Ethics in Context (2020).
[81]
Julian Posada. 2022. Embedded Reproduction in Platform Data Work. Information, Communication & Society (2022).
[82]
Mahima Pushkarna, Andrew Zaldivar, and Oddur Kjartansson. 2022. Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI. arXiv:2204.01075 [cs] (April 2022). http://arxiv.org/abs/2204.01075 arXiv: 2204.01075.
[83]
Samantha Robertson and Niloufar Salehi. 2020. What If I Don't Like Any Of The Choices? The Limits of Preference Elicitation for Participatory Algorithm Design. https://doi.org/10.48550/ARXIV.2007.06718
[84]
Toni Robertson and Jesper Simonsen. 2012. Participatory Design: An introduction. In Routledge International Handbook of Participatory Design. Routledge, 1--18.
[85]
T. Robertson and I. Wagner. 2012. Ethics: engagement, representation and politics-in-action. In Routledge Handbook of Participatory Design, Jesper Simonsen and Toni Robertson (Eds.). Routledge, 64--85.
[86]
Alex Rosenblat and Luke Stark. 2016. Algorithmic Labor and Information Asymmetries: A Case Study of Uber's Drivers. International Journal Of Communication 10, 27 (2016), 3758--3784. https://doi.org/10.2139/ssrn.2686227
[87]
Joel Ross, Lilly Irani, M. Six Silberman, Andrew Zaldivar, and Bill Tomlinson. 2010. Who are the crowdworkers?: shifting demographics in mechanical turk. In CHI '10 Extended Abstracts on Human Factors in Computing Systems. ACM, Atlanta Georgia USA, 2863--2872. https://doi.org/10.1145/1753846.1753873
[88]
M. J. Rothmann, D. B. Danbjørg, C. M. Jensen, and J. Clemensen. 2016. Participatory Design in Health Care: Participation, Power and Knowledge. In Proceedings of the 14th Participatory Design Conference: Short Papers, Interactive Exhibitions, Workshops - Volume 2 (Aarhus, Denmark) (PDC '16). Association for Computing Machinery, New York, NY, USA, 127--128. https://doi.org/10.1145/2948076.2948106
[89]
Niloufar Salehi, Lilly C. Irani, Michael S. Bernstein, Ali Alkhatib, Eva Ogbe, Kristy Milland, and Clickhappier. 2015. We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, Seoul Republic of Korea, 1621--1630. https://doi.org/10.1145/2702123.2702508
[90]
Nithya Sambasivan. 2022. All Equation, No Human: The Myopia of AI Models. Interactions 29, 2 (mar 2022), 78--80. https://doi.org/10.1145/3516515
[91]
Nithya Sambasivan, Shivani Kapania, Hannah Highfill, Diana Akrong, Praveen Paritosh, and Lora M Aroyo. 2021. ?Everyone wants to do the model work, not the data work": Data Cascades in High-Stakes AI. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama Japan, 1--15. https://doi.org/10.1145/ 3411764.3445518
[92]
Elizabeth B.-N. Sanders and Pieter Jan Stappers. 2008. Co-creation and the new landscapes of design. CoDesign 4, 1 (2008), 5--18. https://doi.org/10.1080/15710880701875068 arXiv:https://doi.org/10.1080/15710880701875068
[93]
Devansh Saxena and Shion Guha. 2020. Conducting Participatory Design to Improve Algorithms in Public Services: Lessons and Challenges. In Conference Companion Publication of the 2020 on Computer Supported Cooperative Work and Social Computing. Association for Computing Machinery, New York, NY, USA, 383--388. https://doi.org/10.1145/ 3406865.3418331
[94]
Morgan Klaus Scheuerman, Emily Denton, and Alex Hanna. 2021. Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development. arXiv:2108.04308 [cs] (Sept. 2021). https://doi.org/10.1145/3476058
[95]
Morgan Klaus Scheuerman, Kandrea Wade, Caitlin Lustig, and Jed R Brubaker. 2020. How We've Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis. Proc. ACM Hum.-Comput. Interact. 4, CSCW1 (2020). https://doi.org/10.1145/3392866 Article 058.
[96]
Kjeld Schmidt. 2008. Taking CSCW Seriously: Supporting Articulation Work (1992). In Cooperative Work and Coordinative Practices. Springer London, London, 45--71. https://doi.org/10.1007/978--1--84800-068--1_3
[97]
Kristen M. Scott, Sonja Mei Wang, Milagros Miceli, Pieter Delobelle, Karolina Sztandar-Sztanderska, and Bettina Berendt. 2022. Algorithmic Tools in Public Employment Services: Towards a Jobseeker-Centric Perspective. In 2022 ACM Conference on Fairness, Accountability, and Transparency (Seoul, Republic of Korea) (FAccT '22). Association for Computing Machinery, New York, NY, USA, 2138--2148. https://doi.org/10.1145/3531146.3534631
[98]
Shilad Sen, Margaret E. Giesel, Rebecca Gold, Benjamin Hillmann, Matt Lesicko, Samuel Naden, Jesse Russell, Zixiao (Ken) Wang, and Brent Hecht. 2015. Turkers, Scholars, "Arafat" and "Peace": Cultural Communities and Algorithmic Gold Standards. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW '15). Association for Computing Machinery, New York, NY, USA, 826--838. https: //doi.org/10.1145/2675133.2675285 tex.ids= sen2015a.
[99]
Mona Sloane, Emanuel Moss, Olaitan Awomolo, and Laura Forlano. 2020. Participation is not a Design Fix for Machine Learning. arXiv:2007.02423 [cs] (Aug. 2020). http://arxiv.org/abs/2007.02423 arXiv: 2007.02423.
[100]
Susan Leigh Star. 2010. This is Not a Boundary Object: Reflections on the Origin of a Concept. Science, Technology, & Human Values 35, 5 (Sept. 2010), 601--617. https://doi.org/10.1177/0162243910377624
[101]
Susan Leigh Star and James R. Griesemer. 1989. Institutional Ecology, ?Translations' and Boundary Objects: Amateurs and Professionals in Berkeley's Museum of Vertebrate Zoology, 1907--39. Social Studies of Science 19, 3 (Aug. 1989), 387--420. https://doi.org/10.1177/030631289019003001
[102]
David Stark. 2021. Algorithmic Management in the Platform Economy. 3, 2020 (2021), 47--72.
[103]
Marc Steen. 2013. Co-Design as a Process of Joint Inquiry and Imagination. Design Issues 29, 2 (04 2013), 16--28. https://doi.org/10.1162/DESI_a_00207 arXiv:https://direct.mit.edu/desi/article-pdf/29/2/16/1715163/desi_a_00207.pdf
[104]
Lucy Suchman. 2000. Embodied Practices of Engineering Work. Mind, Culture, and Activity 7, 1--2 (Jan. 2000), 4--18. https://doi.org/10.1080/10749039.2000.9677645
[105]
Lucy Suchman. 2002. Located accountabilities in technology production. Scandinavian journal of information systems 14, 2 (2002), 91--105. http://aisel.aisnet.org/sjis/vol14/iss2/7
[106]
Christine T.Wolf and Jeanette L. Blomberg. 2020. Ambitions and Ambivalences in Participatory Design: Lessons from a Smart Workplace Project. In Proceedings of the 16th Participatory Design Conference 2020 - Participation(s) Otherwise - Volume 1 (Manizales, Colombia) (PDC '20). Association for Computing Machinery, New York, NY, USA, 193--202. https://doi.org/10.1145/3385010.3385029
[107]
Pascale Trompette and Dominique Vinck. 2009. Revisiting the notion of Boundary Object. Revue d'anthropologie des connaissances 3, 1, 1 (2009), 3. https://doi.org/10.3917/rac.006.0003
[108]
Paola Tubaro and Antonio A. Casilli. 2019. Micro-work, artificial intelligence and the automotive industry. Journal of Industrial and Business Economics (2019). https://doi.org/10.1007/s40812-019-00121--1 ISBN: 4081201900 Publisher: Springer International Publishing.
[109]
Paola Tubaro, Antonio A. Casilli, and Marion Coville. 2020. The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence. Big Data & Society 7, 1 (2020). https://doi.org/10.1177/ 2053951720919776
[110]
Gabriela Vargas-Cetina. 2020. Do Locals Need Our Help? On Participatory Research in Anthropology. Annals of Anthropological Practice 44, 2 (Nov. 2020), 202--207. https://doi.org/10.1111/napa.12152
[111]
James R. Wallace, Saba Oji, and Craig Anslow. 2017. Technologies, Methods, and Values: Changes in Empirical Research at CSCW 1990 - 2015. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 106 (dec 2017), 18 pages. https://doi.org/10.1145/3134741
[112]
Heike Winschiers. 2006. The challenges of participatory design in a intercultural context: designing for usability in Namibia. In PDC. 73--76.
[113]
Christine T. Wolf. 2019. Conceptualizing Care in the Everyday Work Practices of Machine Learning Developers. In Companion Publication of the 2019 on Designing Interactive Systems Conference 2019 Companion. Association for Computing Machinery, New York, NY, USA, 331--335. https://doi.org/10.1145/3301019.3323879
[114]
Christine T. Wolf, Julia Bullard, Stacy Wood, Amelia Acker, Drew Paine, and Charlotte P. Lee. 2019. Mapping the "How" of Collaborative Action: Research Methods for Studying Contemporary Sociotechnical Processes. In Conference Companion Publication of the 2019 on Computer Supported Cooperative Work and Social Computing (Austin, TX, USA) (CSCW '19). Association for Computing Machinery, New York, NY, USA, 528--532. https://doi.org/10.1145/3311957.3359441
[115]
Marisol Wong-Villacres, Carl DiSalvo, Neha Kumar, and Betsy DiSalvo. 2020. Culture in Action: Unpacking Capacities to Inform Assets-Based Design. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1--14. https://doi.org/10.1145/3313831.3376329
[116]
Alex J. Wood, Mark Graham, Vili Lehdonvirta, and Isis Hjorth. 2018. Good Gig, Bad Gig: Autonomy and Algorithmic Control in the Global Gig Economy. Work, Employment and Society 00, 0 (2018), 1--20. https://doi.org/10.1177/0950017018785616
[117]
Allison Woodruff, Sarah E. Fox, Steven Rousso-Schindler, and Jeffrey Warshaw. 2018. A Qualitative Exploration of Perceptions of Algorithmic Fairness. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1--14. https://doi.org/10.1145/3173574.3174230
[118]
Xiaomu Zhou, Mark Ackerman, and Kai Zheng. 2011. CPOE workarounds, boundary objects, and assemblages. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, Vancouver BC Canada, 3353--3362. https://doi.org/10.1145/1978942.1979439
[119]
Carsten Østerlund. 2008. The Materiality of Communicative Practices. Scandinavian Journal of Information Systems 20, 1 (Jan. 2008). https://aisel.aisnet.org/sjis/vol20/iss1/4


Published In

Proceedings of the ACM on Human-Computer Interaction  Volume 6, Issue CSCW2
CSCW
November 2022
8205 pages
EISSN:2573-0142
DOI:10.1145/3571154
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2022
Published in PACMHCI Volume 6, Issue CSCW2


Author Tags

  1. data annotation
  2. data labeling
  3. data production
  4. data work
  5. dataset documentation
  6. machine learning
  7. transparency

Qualifiers

  • Research-article
