Abstract
Increasingly, research and management in natural resource science rely on very large datasets compiled from multiple sources. While it is generally good to have more data, utilizing large, complex datasets has introduced challenges in data sharing, especially for collaborating researchers in disparate locations (“distributed research teams”). We surveyed natural resource scientists about common data-sharing problems. When providing data, the major issue identified by our survey respondents (n = 118) was a lack of clarity in the data request (including the format of the data requested). When receiving data, respondents reported various insufficiencies in the documentation describing the data (e.g., no data collection description or protocol, or data aggregated or summarized without explanation). Because insufficient metadata, or “information about the data,” is a central obstacle to efficient data handling, we suggest that documenting metadata through data dictionaries, protocols, read-me files, explicit null value documentation, and process metadata is essential to any large-scale research program. We urge all researchers, but especially those involved in distributed teams, to alleviate these problems with several readily available communication strategies, including organizational charts to define roles, data flow diagrams to outline procedures and timelines, and data update cycles to guide data-handling expectations. In particular, we argue that distributed research teams magnify data-sharing challenges, making data management training even more crucial for natural resource scientists. If natural resource scientists fail to overcome communication and metadata documentation issues, then negative data-sharing experiences will likely continue to undermine the success of many large-scale collaborative projects.
Acknowledgments
This work was supported by the Integrated Status and Effectiveness Monitoring Program (funded by Bonneville Power Administration, 2003-017-00), the National Research Council, and the Northwest Fisheries Science Center (NOAA-Fisheries). Chris Jordan, Steve Rentmeester, and Andy Albaugh provided valuable insight and experiences in the development of this manuscript.
Cite this article
Volk, C.J., Lucero, Y. & Barnas, K. Why is Data Sharing in Collaborative Natural Resource Efforts so Hard and What can We Do to Improve it?. Environmental Management 53, 883–893 (2014). https://doi.org/10.1007/s00267-014-0258-2
DOI: https://doi.org/10.1007/s00267-014-0258-2