Biodiversity Informatics at the Natural History Museum
Ed Baker
Terrestrial Invertebrates, Department of Life Sciences & NHM Informatics Initiative
http://dx.doi.org/10.6084/m9.figshare.722897
Science as a Slow Cooker
• Only the surface visible
• Lid kept on for extended periods of time
• Uses cheap cuts of raggy meat
• Ingredients lose their nutritional value
• Children at risk due to high temperatures
http://ispiders.blogspot.co.uk/2011/11/realtime-web.html
We like data
• 70 million+ specimens collected over 400 years
• 350,000+ books
• ??? Unpublished datasets in archives, notebooks, computers
• ??? In the minds of staff
How do we provide access?
• Digitisation of specimens and associated data
• Scanning and transcribing books, journals, archives
• Providing tools for managing the data life cycle
• Changing the way we publish: data publication
Flowing Data
Publication
Collection Curation Use
Flowing Data
Collection Curation
Somebody retires Somebody dies Project is cancelled
Sits in desk drawer or on a hard drive until….
Flowing Data
Collection Curation Use
Data Publication
Re-use
Publication
Re-use Re-use Re-use
Flowing Data: from collection to reuse
Collection Curation Use
Data Publication
Re-use
Publication
Re-use Re-use Re-use
Collection
Citizen Science
Automated identification and monitoring
Traditional taxonomic sources
Flowing Data: from collection to reuse
Curation Use
Data Publication
Re-use
Publication
Re-use Re-use Re-use
Curation
Websites for communities to publish and curate:
• Taxonomy / nomenclature
• Bibliographies
• Specimen information
• Character matrices
Flowing Data: from collection to reuse
Use
Data Publication
Re-use
Publication
Re-use Re-use Re-use
Use: Oboe
Flowing Data: from collection to reuse
Data Publication
Re-use
Publication
Re-use Re-use Re-use
Publication (Data)
• Datasets
• Single species descriptions
• Checklists
• Software
Flowing Data: from collection to reuse
Re-use
Publication
Re-use Re-use Re-use
Publication (Research)
• Traditional research
• Systematic zoology
• Phylogeny
• Biogeography
Flowing Data: from collection to reuse
Re-use Re-use Re-use Re-use
The Problem of Scale
Data is being generated by tens of thousands of researchers, in thousands of institutions
• Hard to find what you need
• Hard to know if what you need actually exists
• Impossible to go through researcher by researcher
NHM Data Portal
• Aggregator for NHM science data
• Visualisation tools for datasets
• Allows export of NHM data for re-use
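A minimal sketch of the aggregate, search and export idea behind a portal like this. Everything here is hypothetical and for illustration only: the `SpecimenRecord` fields, the function names and the sample records are assumptions, not the NHM Data Portal's actual schema or API.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecimenRecord:
    # Hypothetical minimal record; real portal schemas are much richer.
    source: str            # dataset the record came from
    scientific_name: str
    country: str


def aggregate(*datasets):
    """Merge several datasets into one list, dropping exact duplicates."""
    seen = set()
    merged = []
    for dataset in datasets:
        for record in dataset:
            if record not in seen:
                seen.add(record)
                merged.append(record)
    return merged


def search(records, name_fragment):
    """Case-insensitive substring search on the scientific name."""
    needle = name_fragment.lower()
    return [r for r in records if needle in r.scientific_name.lower()]


def export_csv(records):
    """Serialise records to CSV text so they can be re-used elsewhere."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "scientific_name", "country"])
    for r in records:
        writer.writerow([r.source, r.scientific_name, r.country])
    return buffer.getvalue()


# Made-up example data, for illustration only.
entomology = [SpecimenRecord("entomology", "Gryllus campestris", "United Kingdom")]
botany = [SpecimenRecord("botany", "Quercus robur", "United Kingdom")]

# The repeated dataset is deduplicated during aggregation.
everything = aggregate(entomology, botany, entomology)
print(export_csv(search(everything, "gryllus")))
```

The point of the sketch is the shape of the workflow: one place to look across datasets, and a plain export format that others can pick up and re-use.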
The Informatics Landscape
>18K specimen records (local small-scale coverage)
>276M specimen records (worldwide coverage)
The Informatics Landscape
A webpage for every species
Aggregate specimen and observation data globally
Wikimedian in Residence
• Make NHM content available under open licenses for use on Wikimedia projects (and elsewhere)
• Reach of Wikipedia: BBC, Encyclopedia of Life
• Wikisource: Transcription and translation crowd-sourcing
Flowing Data: from collection to reuse
?
"Everybody makes mistakes. And if you don't expose your raw data, nobody will find your mistakes."
Jean-Claude Bradley
http://bit.ly/146ugIv