Presentation about the agINFRA Germplasm Working Group (http://wiki.aginfra.eu/index.php/Germplasm_Working_Group). Presented during Session 1 of the 1st International e-Conference on Germplasm Data Interoperability (https://sites.google.com/site/germplasminteroperability/)
The agINFRA Germplasm Working Group
1. The Germplasm Working Group
Dr. Vassilis Protonotarios
Agricultural Biotechnologist, PhD
Agro-Know Technologies, Greece
e-Conference on Germplasm Data Interoperability
Session 1: “The vision of Linked Germplasm Data”
2. Structure of the presentation
1. Background
– About the agINFRA project
– Issues related to data sharing
2. The Germplasm Working Group
– Objectives
– Wiki
– Link with RDA
3. The next steps
4. The agINFRA project
• A project funded under the FP7 programme of the European Commission (EC)
• Consortium with expertise on
– Technology / infrastructures
– Data / data management
Combined to facilitate agricultural data sharing
More info at:
www.aginfra.eu
5. The agINFRA project
• Aims to enhance interoperability between agricultural data sources
– Data sharing by
• Metadata aggregation & linking data
• Design and deploy the linked ag-data framework
– Methodology for linking data
– Provide the infrastructure needed
• Both cloud- and grid-based services
• Tools, APIs etc.
6. agINFRA major data types
• Bibliographic
• Agri Statistics & Economics
• Raw data
• Profiles
• Educational
• Germplasm
• Soil data
• Other?
7. agINFRA major data sources
Data type: data provider(s)
• Bibliographic
– FAO AGRIS
– CASDD (CAAS)
• Educational
– Organic.Edunet
– Green Learning Network
– LAFLOR
• Germplasm
– Chinese Crop Germplasm Information System (CAAS)
– Italian National Germplasm Database (CRA)
• Soil data
– Italian National Center for Soil Mapping
• Statistical
– FAOSTAT
– CountrySTAT
• Researchers’ profiles, organizations & events
– AGRIVIVO
10. The issue?
• Heterogeneity!
– Data types
– Data formats
– Data management workflows
– Standards used
– Metadata exposure options
– ….
• Lack of connectivity with other data sources
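To make the heterogeneity issue above concrete, the following is a minimal sketch of how records from two providers with different field names could be brought onto one shared set of fields before aggregation. The provider names, field names, and the crosswalk itself are hypothetical and chosen only for illustration; they do not describe the actual agINFRA data sources.

```python
# Hypothetical crosswalks: provider-specific field name -> shared field name.
CROSSWALKS = {
    "provider_a": {"acc_no": "accession_id", "sci_name": "scientific_name"},
    "provider_b": {"AccessionID": "accession_id", "Taxon": "scientific_name"},
}


def normalize(record, provider):
    """Rename a record's fields to the shared names; unmapped fields are dropped."""
    crosswalk = CROSSWALKS[provider]
    row = {shared: record[local] for local, shared in crosswalk.items() if local in record}
    row["source"] = provider  # keep provenance so each record stays traceable
    return row


def aggregate(batches):
    """Flatten {provider: [records]} into one list of normalized records."""
    return [normalize(rec, provider) for provider, recs in batches.items() for rec in recs]


# Illustrative input: same kind of information, different field names.
batches = {
    "provider_a": [{"acc_no": "A-001", "sci_name": "Triticum aestivum", "notes": "x"}],
    "provider_b": [{"AccessionID": "B-17", "Taxon": "Oryza sativa"}],
}
merged = aggregate(batches)
```

A per-provider crosswalk keeps the source data untouched and makes every mapping decision explicit, which is helpful when mappings need to be revised later.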
12. The Germplasm Working Group
• Created in the context of the agINFRA project
• Initially included agINFRA stakeholders
– now expanded to host all stakeholders
• The group is NOT a group of experts on
germplasm data!
13. The scope of the Germplasm WG
• Aims to enable/enhance interoperability between
germplasm databases
– By developing the services for
• exchanging their data and
• delivering their data to other partners
• Focusing on three actions:
1. IDENTIFY
2. ORGANIZE
3. PROPOSE
14. Germplasm WG objectives
• IDENTIFY: collect all information related to germplasm data
– People/groups
– Namespaces (metadata, KOS)
– Standards
– Workflows
– Events
• ORGANIZE: engage all stakeholders & available resources, analyze existing standards, facilitate collaboration
• PROPOSE: linked data framework to connect data sources
– facilitate data sharing between germplasm data sources
17. Proposed methodology
1. Analyze metadata schemas & knowledge organization systems (KOSs) used to
describe germplasm resources
2. Define attributes & vocabularies that can be
used to expose germplasm resources in linked
data format.
3. Provide a set of recommendations for the
exposure of germplasm resources as linked data
4. Embed the recommendations in the data
infrastructure of agINFRA
– to allow the exposure of germplasm resources as
Linked Open Data (LOD).
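As a rough illustration of step 2 of the methodology above, the sketch below maps the attributes of one germplasm record to vocabulary terms and serializes them as N-Triples. It is not the working group's recommendation: the base URI, the record field names, and the choice of Darwin Core term URIs as the target vocabulary are all assumptions made for the example.

```python
from urllib.parse import quote

DWC = "http://rs.tdwg.org/dwc/terms/"        # Darwin Core, used here only as an example vocabulary
BASE = "http://example.org/accession/"       # hypothetical base URI for accession resources

# Hypothetical mapping: local field name -> property URI.
PROPERTY_MAP = {
    "genus": DWC + "genus",
    "species": DWC + "specificEpithet",
    "origin_country": DWC + "countryCode",
    "holding_institute": DWC + "institutionCode",
}


def escape_literal(value):
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def to_ntriples(record):
    """Return one N-Triples line per mapped, non-empty field of the record."""
    subject = "<" + BASE + quote(record["accession_id"], safe="") + ">"
    lines = []
    for field, predicate in PROPERTY_MAP.items():
        value = record.get(field)
        if value:
            lines.append(f'{subject} <{predicate}> "{escape_literal(str(value))}" .')
    return lines


# Illustrative record; the values are made up.
record = {"accession_id": "ACC 001", "genus": "Triticum", "species": "aestivum"}
triples = to_ntriples(record)
```

Plain string literals are used throughout to keep the sketch short; a real exposure would likely link country and institution values to URIs rather than repeat them as text.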
18. The Germplasm WG wiki
• Central point of reference
http://wiki.aginfra.eu/index.php/Germplasm_Working_Group
• Freely accessible (no login required)
19. Information available so far
• Vision
• Activities
• Outcomes
• Participants
• Next steps
• Useful resources
– Data sources
– Standards
– Services
– Stakeholders
• Events
21. Key outcomes of the group
• Dossier on Germplasm Information:
– Major programs
– Major information systems and services
– agINFRA germplasm data sources (CGRIS & CRA)
– Core standards for germplasm information
– Plant nomenclature, taxonomies and ontologies
– Plant genomic resources
– Related references and links
• Freely available from the Germplasm Group wiki
24. Our wish list (tentative list)
Reusing experiences from
…and working closely with
25. Connection with RDA
• RDA: Research Data Alliance (https://rd-alliance.org)
• Aims to “accelerate and facilitate research
data sharing and exchange”
• Structure:
– Interest Groups: Cover wider topics
– Working Groups: Working on focused topics
27. Connection with RDA
• Representation of agINFRA Germplasm WG in
– 1st RDA Plenary Meeting (March 2013,
Gothenburg, Sweden)
– 2nd RDA Plenary Meeting (September 2013,
Washington D.C., USA)
• Suggestion for a Germplasm WG in RDA
29. Link between WG and RDA Groups
agINFRA WG
RDA IG/WG
• Interactions with data providers
• Collection of large-scale data
• Collection of requirements
• Two (2) case studies
• Development of Best Practices
• Analysis of existing standards
• Collection of requirements
• Definition of data management workflows
• Interaction with other IGs/WGs (e.g. metadata, LD)
• Application in more cases
• Wider exposure of outcomes
• Development & adaptation of tools and services
• Development of Best Practices
• Development of Best Practices
31. Towards the linking of
germplasm data sources
1. Definition and application of the linked data approach
for the agINFRA germplasm data sources
2. Recording and documentation of the process
3. Identification of issues
4. Suggestion for solutions to these issues
5. Fine-tuning of workflow
6. Development of Best Practices
32. …and more next steps
• Update the existing analysis with new data
• Collect new user requirements
• (re)define the mappings between metadata
schemas and KOSs
• Fine-tune the linked data approach
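When mappings between metadata values and KOS concepts are (re)defined, one practical check is to see which values in the data still have no concept assigned. The sketch below shows one way this could be done; the concept URIs and the crop names are hypothetical placeholders, not entries from any real KOS.

```python
# Hypothetical mapping: normalized local term -> KOS concept URI.
CONCEPTS = {
    "wheat": "http://example.org/kos/wheat",
    "rice": "http://example.org/kos/rice",
}


def split_by_mapping(values, concept_map):
    """Return (mapped, unmapped): values with a concept URI, and values still lacking one."""
    mapped, unmapped = {}, []
    for value in values:
        key = value.strip().lower()  # tolerate case and whitespace differences
        if key in concept_map:
            mapped[value] = concept_map[key]
        elif value not in unmapped:
            unmapped.append(value)   # report each missing term once
    return mapped, unmapped


# Illustrative values as they might appear in a crop-name field.
mapped, unmapped = split_by_mapping(["Wheat", " rice ", "teff", "teff"], CONCEPTS)
```

The list of unmapped values gives a simple worklist for the next revision of the mapping.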