Abstract
Data are essential products of scientific work that move among and through research infrastructures over time. Data constantly change due to evolving practices and knowledge, requiring improvisational work by scientists to determine the effects on analyses. Today, for end users of datasets, much of the information about changes, and the processes leading to them, is invisible, embedded elsewhere in the work of a collaboration. At the same time, scientists use increasing quantities of data, making ad hoc approaches to identifying change difficult to scale effectively. Our research investigates data change by examining how scientists make sense of change in datasets being created and sustained by the collaborative infrastructures they engage with. We examine two forms of change, then consider how trust and project rhythms influence a scientist’s notion that the newest available data are the best. We explore the opportunity to design tools and practices to support user examinations of data change and surface key provenance information embedded in research infrastructures.
Notes
- 1.
This work is part of the Deduce project (http://deduce.lbl.gov). The goal of the Deduce project is to develop methods and tools that support data change exploration and management in the context of data analysis pipelines.
Acknowledgements
The authors thank the members of the Deduce project, the study participants, and the anonymous reviewers of this work. This work is supported by the U.S. Department of Energy, Office of Science and Office of Advanced Scientific Computing Research (ASCR) under Contract No. DE-AC02-05CH11231.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Paine, D., Ramakrishnan, L. (2019). Surfacing Data Change in Scientific Work. In: Taylor, N., Christian-Lamb, C., Martin, M., Nardi, B. (eds) Information in Contemporary Society. iConference 2019. Lecture Notes in Computer Science, vol 11420. Springer, Cham. https://doi.org/10.1007/978-3-030-15742-5_2
DOI: https://doi.org/10.1007/978-3-030-15742-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15741-8
Online ISBN: 978-3-030-15742-5
eBook Packages: Computer Science, Computer Science (R0)