Introduction to DataCite
Martin Fenner, DataCite Technical Director
http://orcid.org/0000-0003-1419-2405  
DataCite ‐ International Data Citation
DataCite is the leading global
provider of DOIs for research
data, enabling users to register,
find, use, connect and track
research data.
http://researchgraph.org/schema/
DataCite was founded in 2009
as a German non-profit
membership organization
8,164,272 DOIs
47 members
1,249 data centers
5 staff members
 
Community and
Use Cases
Members
DOI Services
DOI Registration agencies (Crossref, DataCite, KISTI, EIDR, …)
DOI International DOI Foundation (IDF)
A DOI is not a DOI
DataCite DOI Services
Register
Create and manage DOIs and metadata.
Find
Find resources with DOIs and associated information.
Use
Access the content that was registered, and
information on how it can be used (licenses).
Connect
Link data to articles (Crossref), people (ORCID),
funding (Crossref), etc.
Track
Track data citations, data usage, and other indicators.
Advocate
Promote data sharing, persistent identifiers, and
open science.
Support
Provide reliable, well-documented services and
help members and users to use them.
Some DataCite Members
https://www.datacite.org/members.html
Challenges for the DataCite Community
Data as first-class research output
Data versioning
Dynamic data citation
Clinical trial data
Software citation
Data as first-class research output
Data should be considered legitimate,
citable products of research. Data citations
should be accorded the same importance
in the scholarly record as citations of other
research objects, such as publications.
https://www.force11.org/group/joint-declaration-data-citation-principles-final
Data as first-class research output
Force11 Data Citation Implementation Pilot
(DCIP), with specific recommendations for data
repositories:
• use persistent identifiers
• provide citation metadata
• embed metadata in dataset landing pages
https://doi.org/10.1101/097196
https://www.force11.org/group/dcip
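The third recommendation, embedding metadata in dataset landing pages, could look roughly like the sketch below. It assumes schema.org JSON-LD as the embedding format, and the DOI, names, and helper functions (`landing_page_jsonld`, `script_tag`) are hypothetical placeholders, not part of any DataCite or DCIP tooling.

```python
import json


def landing_page_jsonld(doi, title, creators, publisher, year):
    """Build a schema.org Dataset description as a JSON-LD string.

    Assumption: the landing page exposes citation metadata as schema.org
    JSON-LD; the slide itself only asks for embedded metadata.
    """
    doi_url = f"https://doi.org/{doi}"  # persistent identifier as a URL
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": doi_url,
        "identifier": doi_url,
        "name": title,
        "creator": [{"@type": "Person", "name": name} for name in creators],
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": str(year),
    }
    return json.dumps(record, indent=2)


def script_tag(jsonld):
    """Wrap the JSON-LD in the script element a landing page would carry."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


# Hypothetical dataset; 10.1234 is a placeholder prefix.
example = landing_page_jsonld(
    "10.1234/example-dataset",
    "Example dataset",
    ["Doe, Jane"],
    "Example Repository",
    2017,
)
print(script_tag(example))
```

The same record covers the first two recommendations as well: the DOI is the persistent identifier, and the remaining fields are the citation metadata.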
Data as first-class research output
The RDA/WDS Scholarly Link Exchange Working Group
aims to enable a comprehensive global view of the links
between scholarly literature and data. The working group will
leverage existing work and international initiatives to work
towards a global information commons by establishing:
• Pathfinder services and enabling infrastructure
• An interoperability framework with guidelines and standards
(see also www.scholix.org)
• A significant consensus
• Support for communities of practice and implementation
https://www.rd-alliance.org/groups/rdawds-scholarly-link-exchange-scholix-wg
Data as first-class research output
The Make Data Count (MDC) project is an initiative by
the California Digital Library, DataONE and DataCite,
funded by the Sloan Foundation and started in May
2017. One of the main goals is to develop standards for
measuring and reporting data usage statistics, in
collaboration with COUNTER.
https://makedatacount.org/
Data Versioning
[Figure: version sequences numbered 0 to 4, labeled Software and Dynamic Data]
Version 4.1 of the DataCite schema (end of 2017) will
support hasVersion/isVersionOf relation types.
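A minimal sketch of how these two relation types might be used: one parent DOI points to each version with HasVersion, and each version points back with IsVersionOf. The capitalized spellings and the `relatedIdentifier` field names are assumed to match the schema's controlled vocabulary; the DOIs and the `version_relations` helper are hypothetical.

```python
def version_relations(parent_doi, version_dois):
    """Return related-identifier entries linking a parent DOI and its versions.

    Assumption: one parent record represents all versions. Other modelling
    choices (for example chaining versions to each other) are possible.
    """
    parent_entries = [
        {
            "relatedIdentifier": doi,
            "relatedIdentifierType": "DOI",
            "relationType": "HasVersion",
        }
        for doi in version_dois
    ]
    version_entries = {
        doi: [
            {
                "relatedIdentifier": parent_doi,
                "relatedIdentifierType": "DOI",
                "relationType": "IsVersionOf",
            }
        ]
        for doi in version_dois
    }
    return parent_entries, version_entries


# Placeholder DOIs under a made-up prefix.
parent, versions = version_relations(
    "10.1234/software",
    ["10.1234/software.v1", "10.1234/software.v2"],
)
for entry in parent:
    print(entry["relationType"], entry["relatedIdentifier"])
```

Keeping the two directions in one function makes it harder for the parent and version records to drift out of sync.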
Dynamic Data Citation
DataCite works with the RDA Data Citation WG, which
published recommendations for dynamic data citation
in 2015.
Recommendations are based on time-stamped query
strings, and the registration of a persistent identifier and
metadata when data are reused.
These recommendations can be implemented using DataCite DOIs.
https://doi.org/10.15497/RDA00016
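The idea of a time-stamped query plus a persistent identifier can be sketched as a small record that is stored whenever a subset of the data is reused. This is only an illustration under assumptions: the field names, the whitespace normalization, the result fingerprint, and the placeholder DOI are not taken from the recommendations themselves.

```python
import hashlib
from datetime import datetime, timezone


def normalize_query(query):
    """Collapse whitespace so trivially different spellings of a query match."""
    return " ".join(query.split())


def result_fingerprint(rows):
    """Hash the result rows in sorted order, so row order does not matter."""
    joined = "\n".join(sorted(rows))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def citation_record(query, rows, executed_at, identifier=None):
    """Bundle what would be registered alongside a persistent identifier."""
    return {
        "query": normalize_query(query),
        "executed_at": executed_at.isoformat(),
        "result_sha256": result_fingerprint(rows),
        "identifier": identifier,  # e.g. a DOI, assigned at registration time
    }


# Hypothetical query, rows, and DOI.
record = citation_record(
    "SELECT *   FROM readings WHERE station = 'A1'",
    ["2017-01-02,A1,4.2", "2017-01-01,A1,3.9"],
    datetime(2017, 6, 1, tzinfo=timezone.utc),
    identifier="10.1234/example-query",
)
print(record["query"])
print(record["executed_at"])
```

Re-running the stored query against the data as of the stored timestamp and comparing fingerprints is one way to check that a cited subset can still be reproduced.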
Clinical Trial Data
CORBEL consensus document on providing access to
individual participant data from clinical trials almost
ready.
P1: The provision of individual-participant data should be
promoted, incentivized, and resourced so that it
becomes the norm in clinical research.
http://www.ecrin.org/
Data Authorship as an Incentive to Data Sharing
The New England Journal of Medicine
contribution and an explanation of its significance.
Similarly, nothing would prevent investigators
from including data authorship as one of
their substantive contributions for consideration
We acknowledge that there are many unanswered
questions. The affirmative standards and
responsibilities for the integrity and curation of
the data set may need to be further elucidated.
Figure 1. Credit for Data Sharing and Tracing the Data Set.
An individual researcher (indicated by letters A through F) may be designated and credited as an author, a data author, or both, depending
on the person’s contribution to the data and analysis in the published work. DOI denotes digital object identifier, and MS manuscript.
[Figure 1 diagram; recoverable labels follow]
Original data set; data authors designated (A, B); Data set 1 DOI
1. Primary authors generate data set and designate “data authors,” DOI is assigned, and primary publication occurs: MS 1 DOI, 3 authors (A, B, C), 2 data authors via Data set 1 DOI (A, B)
2. Secondary analysis by members of primary group: MS 2 DOI, 2 authors (A, C), 2 data authors via Data set 1 DOI (A, B)
3. Secondary analysis with collaborator: MS 3 DOI, 2 authors (A, D), 2 data authors via Data set 1 DOI (A, B)
4. Secondary analysis by independent investigator: MS 4 DOI, 1 author (E), 2 data authors via Data set 1 DOI (A, B)
5. Combination of data set with other data for new analysis by independent investigator: MS 5 DOI, 1 author (F), 3 data authors via Data set 1 DOI (A, B) and Data set 2 DOI (F)
https://doi.org/10.1056/NEJMsb1616595
https://doi.org/10.12688/f1000research.9448.1
https://doi.org/10.24433/CO.a491b0a8-124f-4448-b2c8-b850b5b2aa33
Zenodo DOIs for Software
[Chart: DOIs per year, 2013 to 2017, including a projected value; vertical axis 0 to 16,000]
https://search.datacite.org/works?resource-type-id=software&data-center-id=cern.zenodo
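The search link on this slide is a base address plus two query parameters, one for the resource type and one for the data center. A small sketch of assembling such a link follows; the parameter names are copied from the slide, the `search_url` helper is hypothetical, and whether the service still accepts this form is not something the slide confirms.

```python
from urllib.parse import urlencode

# Base address as it appears on the slide.
BASE = "https://search.datacite.org/works"


def search_url(resource_type_id, data_center_id):
    """Build a search link filtered by resource type and data center."""
    params = {
        "resource-type-id": resource_type_id,
        "data-center-id": data_center_id,
    }
    return f"{BASE}?{urlencode(params)}"


print(search_url("software", "cern.zenodo"))
```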