SlideShare uma empresa Scribd logo
1 de 14
Cynthia Parr @cydparr
US Department of Agriculture
National Agricultural Library
30 September 2015
Ag Data Commons
Adding value to
open agricultural research data
Federal directives: Public access to
open, machine-readable data
The problems in agricultural data
• Broad subject areas
• Journals not integrated with repositories like
Dryad
• Too many existing databases & web distribution
points
• Lack of infrastructure for long-tail data
• Lack of a neutral, sustainable solution for long-
term multi-institutional projects
3
• Supports Public Access mandates
• Holds agricultural research data
• Primary audience: researchers
• Holds metadata for data held elsewhere
• Starting with USDA data but will broaden
• Both human and machine access
• Can include unpublished data that is ready
for release
Ag Data Commons Prototyping FY 2015
A proposed solution
Search &
Knowledge
Discovery
Thesaurus &
Indexing
Ag Data
Commons
Repository
Organization &
Curation
Grant
management
systems
INGESTION DISSEMINATION
PubAg
Dataset
Submissio
n
Analytics &
Tools
Data.gov
Ag Data
Commons
Catalog
Legend
Building
Adapting
Existing
Distributed
repositories
Forest Service
Geospatial
Adding value
6
Metadata +
data package
DOI
Links
Thesaurus tags
Idiosyncratic
data
dictionary
Search, services,
compliance checking
DKAN http://nucivic.com/dkan/
PRO
• Open source community
• Drupal modules for basic
CMS functions
• Integrated CKAN catalog
• Feeds Data.gov
• Basic metadata already
supported
CON
• Not designed for scientific
data or scientists
• No links to literature
• No Digital Object
Identifiers
• Doesn’t handle dataset
relationships
• Metadata inadequate for
compliance checking &
re-use
7
Metadata Standards
Core Metadata Schema
POD 1.1 (Project Open Data)
https://project-open-data.cio.gov/
Related Scientific Metadata & Data Standards (e.g.)
ISO 19115 (GIS Data, FGDC)
https://www.iso.org
Darwin Core (Biodiversity standards)
http://rs.tdwg.org/dwc/
EML (Ecological Metadata Language)
https://knb.ecoinformatics.org/#tools/eml
MiXS GSC (Genomic Standards Consortium)
http://gensc.org/projects/mixs-gsc-project/
Controlled Vocabularies
• NALT – National Agricultural Library Thesaurus
http://agclass.nal.usda.gov
 GACS Global Agricultural Concept Scheme
• Taxonomy
• Gene Ontology (GO)
http://geneontology.org/
• ENVO, ecological, economic, etc.
Relevant for Agriculture
• Help create a semantic web
• SKOS (Simple Knowledge Organization System): W3C
recommendation, or RDF
Credit: AIMS--FAO
https://data.nal.usda.gov/
Launching
next week
Adding even more value
12
Structured
methods
metadata
Shared
data
dictionary
Semantic
data
dictionary
Adding even more value
13
Assist
application
launch
Find related
data
Integrate/link
related data
= help build the knowledge graph
Acknowledgements
Cynthia.Parr@ars.usda.gov
Susan McCarthy, NAL – KSD
Ursula Pieper, NAL – ISD
Qing Qu, NAL – KSD contractor
Jeff Campbell – NAL – KSD
Jaylen Nathwani, NAL – student intern
NüCivic, Angry Cactus Team
Jocelyn McNamara -- NAL – KSD contractor
Kerry Huller – UMD graduate fellow
Erin Antognoli – UMD graduate fellow

Mais conteúdo relacionado

Mais procurados

From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...
Catherine Canevet
 

Mais procurados (20)

Creating impact with accessible data in agriculture and nutrition: sharing da...
Creating impact with accessible data in agriculture and nutrition: sharing da...Creating impact with accessible data in agriculture and nutrition: sharing da...
Creating impact with accessible data in agriculture and nutrition: sharing da...
 
Unlocking Thesis Data - Stephen Grace, University of East London
Unlocking Thesis Data - Stephen Grace, University of East LondonUnlocking Thesis Data - Stephen Grace, University of East London
Unlocking Thesis Data - Stephen Grace, University of East London
 
SGCI and Globus: Partners for Acceleration of Science
SGCI and Globus: Partners for Acceleration of ScienceSGCI and Globus: Partners for Acceleration of Science
SGCI and Globus: Partners for Acceleration of Science
 
RSpace - Rory Macneil at Repository Fringe 2015
RSpace - Rory Macneil at Repository Fringe 2015RSpace - Rory Macneil at Repository Fringe 2015
RSpace - Rory Macneil at Repository Fringe 2015
 
Big Data R&D Strategy - Ensure the long term sustainability, access, and deve...
Big Data R&D Strategy - Ensure the long term sustainability, access, and deve...Big Data R&D Strategy - Ensure the long term sustainability, access, and deve...
Big Data R&D Strategy - Ensure the long term sustainability, access, and deve...
 
Tools for improving data publication and use
Tools for improving data publication and useTools for improving data publication and use
Tools for improving data publication and use
 
Research data discovery in OpenAIRE (Presentation by Paolo Manghi at DI4R2018)
Research data discovery in OpenAIRE (Presentation by Paolo Manghi at DI4R2018)Research data discovery in OpenAIRE (Presentation by Paolo Manghi at DI4R2018)
Research data discovery in OpenAIRE (Presentation by Paolo Manghi at DI4R2018)
 
Fsci 2018 tuesday31_july_am6
Fsci 2018 tuesday31_july_am6Fsci 2018 tuesday31_july_am6
Fsci 2018 tuesday31_july_am6
 
Clarivate ERA Supplier rscd2018
Clarivate ERA Supplier rscd2018Clarivate ERA Supplier rscd2018
Clarivate ERA Supplier rscd2018
 
FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018FAIRsharing and FAIRmetrics - RDA, March 2018
FAIRsharing and FAIRmetrics - RDA, March 2018
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...
 
Pre-open access data sharing
Pre-open access data sharingPre-open access data sharing
Pre-open access data sharing
 
The FAIR Cookbook in a nutshell
The FAIR Cookbook in a nutshellThe FAIR Cookbook in a nutshell
The FAIR Cookbook in a nutshell
 
Research data sharing
Research data sharingResearch data sharing
Research data sharing
 
AgriSchemas Progress Report
AgriSchemas Progress ReportAgriSchemas Progress Report
AgriSchemas Progress Report
 
Big data analysis and Integration of Geophysical information from the Catalan...
Big data analysis and Integration of Geophysical information from the Catalan...Big data analysis and Integration of Geophysical information from the Catalan...
Big data analysis and Integration of Geophysical information from the Catalan...
 
CDL research lifecycle
CDL research lifecycleCDL research lifecycle
CDL research lifecycle
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
 
2013 04 g8opendata-ag_infra
2013 04 g8opendata-ag_infra2013 04 g8opendata-ag_infra
2013 04 g8opendata-ag_infra
 
Introduction to Big data
Introduction to Big dataIntroduction to Big data
Introduction to Big data
 

Semelhante a Ag Data Commons: Adding Value to open agricultural research data

Scholze liber 2015-06-25_final
Scholze liber 2015-06-25_finalScholze liber 2015-06-25_final
Scholze liber 2015-06-25_final
Karlsruhe Institute of Technology (KIT)
 

Semelhante a Ag Data Commons: Adding Value to open agricultural research data (20)

re3data.org – Registry of Research Data Repositories
re3data.org – Registry of Research Data Repositoriesre3data.org – Registry of Research Data Repositories
re3data.org – Registry of Research Data Repositories
 
Scholze liber 2015-06-25_final
Scholze liber 2015-06-25_finalScholze liber 2015-06-25_final
Scholze liber 2015-06-25_final
 
Ag Data Commons: A new USDA catalog and repository for agricultural research ...
Ag Data Commons: A new USDA catalog and repository for agricultural research ...Ag Data Commons: A new USDA catalog and repository for agricultural research ...
Ag Data Commons: A new USDA catalog and repository for agricultural research ...
 
re3data.org – a Registry of Research Data Repositories
re3data.org – a Registry of Research Data Repositoriesre3data.org – a Registry of Research Data Repositories
re3data.org – a Registry of Research Data Repositories
 
Public access to research results at USDA
Public access to research results at USDAPublic access to research results at USDA
Public access to research results at USDA
 
Research Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural HeritageResearch Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural Heritage
 
2014 10 china-nsl
2014 10 china-nsl2014 10 china-nsl
2014 10 china-nsl
 
IEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGUIEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGU
 
Open Science - Global Perspectives/Simon Hodson
Open Science - Global Perspectives/Simon HodsonOpen Science - Global Perspectives/Simon Hodson
Open Science - Global Perspectives/Simon Hodson
 
RDA Update
RDA UpdateRDA Update
RDA Update
 
Rolle und Perspektive von re3data.org bei der Förderung von Open Science
Rolle und Perspektive von re3data.org bei der Förderung von Open ScienceRolle und Perspektive von re3data.org bei der Förderung von Open Science
Rolle und Perspektive von re3data.org bei der Förderung von Open Science
 
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
 
Tools für das Management von Forschungsdaten
Tools für das Management von ForschungsdatenTools für das Management von Forschungsdaten
Tools für das Management von Forschungsdaten
 
Making Research Data Repositories Visible – The re3data.org Registry
Making Research Data Repositories Visible – The re3data.org RegistryMaking Research Data Repositories Visible – The re3data.org Registry
Making Research Data Repositories Visible – The re3data.org Registry
 
Open Access as a Means to Produce High Quality Data
Open Access as a Means to Produce High Quality DataOpen Access as a Means to Produce High Quality Data
Open Access as a Means to Produce High Quality Data
 
Praetzellis "Data Management Planning and Tools"
Praetzellis "Data Management Planning and Tools"Praetzellis "Data Management Planning and Tools"
Praetzellis "Data Management Planning and Tools"
 
Introduction to Data Management Planning at Alien Challenge COST workshop
Introduction to Data Management Planning at Alien Challenge COST workshopIntroduction to Data Management Planning at Alien Challenge COST workshop
Introduction to Data Management Planning at Alien Challenge COST workshop
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
The agINFRA Germplasm Working Group
The agINFRA Germplasm Working GroupThe agINFRA Germplasm Working Group
The agINFRA Germplasm Working Group
 
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09...
 

Mais de Cyndy Parr

Encyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypesEncyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypes
Cyndy Parr
 

Mais de Cyndy Parr (20)

Open data and the ag data commons
Open data and the ag data commonsOpen data and the ag data commons
Open data and the ag data commons
 
Biodiversity informatics and the agricultural data landscape
Biodiversity informatics and the agricultural data landscapeBiodiversity informatics and the agricultural data landscape
Biodiversity informatics and the agricultural data landscape
 
Ag Data Commons: Agricultural research metadata and data
Ag Data Commons: Agricultural research metadata and dataAg Data Commons: Agricultural research metadata and data
Ag Data Commons: Agricultural research metadata and data
 
Preparing for data-intensive science across domains.
Preparing for data-intensive science across domains.Preparing for data-intensive science across domains.
Preparing for data-intensive science across domains.
 
TDWG 2014 opening talk: Chair's Welcome
TDWG 2014 opening talk: Chair's WelcomeTDWG 2014 opening talk: Chair's Welcome
TDWG 2014 opening talk: Chair's Welcome
 
Behavior ontology workshop princeton
Behavior ontology workshop princetonBehavior ontology workshop princeton
Behavior ontology workshop princeton
 
iEvoBio Keynote: Frontiers of discovery with Encyclopedia of Life -- TRAITBANK
iEvoBio Keynote: Frontiers of discovery with Encyclopedia of Life -- TRAITBANK iEvoBio Keynote: Frontiers of discovery with Encyclopedia of Life -- TRAITBANK
iEvoBio Keynote: Frontiers of discovery with Encyclopedia of Life -- TRAITBANK
 
Frontiers of discovery with Encyclopedia of Life
Frontiers of discovery with Encyclopedia of LifeFrontiers of discovery with Encyclopedia of Life
Frontiers of discovery with Encyclopedia of Life
 
Practical interoperability across semantic stores of data for ecological, tax...
Practical interoperability across semantic stores of data for ecological, tax...Practical interoperability across semantic stores of data for ecological, tax...
Practical interoperability across semantic stores of data for ecological, tax...
 
Using and extending Darwin Core for structured attribute data
Using and extending Darwin Core for structured attribute dataUsing and extending Darwin Core for structured attribute data
Using and extending Darwin Core for structured attribute data
 
How the Encyclopedia of Life is wrangling organismal attribute data
How the Encyclopedia of Life is wrangling organismal attribute dataHow the Encyclopedia of Life is wrangling organismal attribute data
How the Encyclopedia of Life is wrangling organismal attribute data
 
The Road to TraitBank: What's Next for the Encyclopedia of Life
The Road to TraitBank: What's Next for the Encyclopedia of LifeThe Road to TraitBank: What's Next for the Encyclopedia of Life
The Road to TraitBank: What's Next for the Encyclopedia of Life
 
Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity ...
Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity ...Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity ...
Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity ...
 
Encyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypesEncyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypes
 
Species pages and portals
Species pages and portals Species pages and portals
Species pages and portals
 
Building EOL species pages
Building EOL species pagesBuilding EOL species pages
Building EOL species pages
 
Leveraging an international infrastructure: Case studies from the Encyclopeda...
Leveraging an international infrastructure: Case studies from the Encyclopeda...Leveraging an international infrastructure: Case studies from the Encyclopeda...
Leveraging an international infrastructure: Case studies from the Encyclopeda...
 
Introduction to EOL.org for scientists
Introduction to EOL.org for scientistsIntroduction to EOL.org for scientists
Introduction to EOL.org for scientists
 
EOL and Science: Yes we can!
EOL and Science: Yes we can!EOL and Science: Yes we can!
EOL and Science: Yes we can!
 
EOL China Center status
EOL China Center statusEOL China Center status
EOL China Center status
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Ag Data Commons: Adding Value to open agricultural research data

  • 1. Cynthia Parr @cydparr US Department of Agriculture National Agricultural Library 30 September 2015 Ag Data Commons Adding value to open agricultural research data
  • 2. Federal directives: Public access to open, machine-readable data
  • 3. The problems in agricultural data • Broad subject areas • Journals not integrated with repositories like Dryad • Too many existing databases & web distribution points • Lack of infrastructure for long-tail data • Lack of a neutral, sustainable solution for long- term multi-institutional projects 3
  • 4. • Supports Public Access mandates • Holds agricultural research data • Primary audience: researchers • Holds metadata for data held elsewhere • Starting with USDA data but will broaden • Both human and machine access • Can include unpublished data that is ready for release Ag Data Commons Prototyping FY 2015 A proposed solution
  • 5. Search & Knowledge Discovery Thesaurus & Indexing Ag Data Commons Repository Organization & Curation Grant management systems INGESTION DISSEMINATION PubAg Dataset Submissio n Analytics & Tools Data.gov Ag Data Commons Catalog Legend Building Adapting Existing Distributed repositories Forest Service Geospatial
  • 6. Adding value 6 Metadata + data package DOI Links Thesaurus tags Idiosyncratic data dictionary Search, services, compliance checking
  • 7. DKAN http://nucivic.com/dkan/ PRO • Open source community • Drupal modules for basic CMS functions • Integrated CKAN catalog • Feeds Data.gov • Basic metadata already supported CON • Not designed for scientific data or scientists • No links to literature • No Digital Object Identifiers • Doesn’t handle dataset relationships • Metadata inadequate for compliance checking & re-use 7
  • 8. Metadata Standards Core Metadata Schema POD 1.1 (Project Open Data) https://project-open-data.cio.gov/ Related Scientific Metadata & Data Standards (e.g.) ISO 19115 (GIS Data, FGDC) https://www.iso.org Darwin Core (Biodiversity standards) http://rs.tdwg.org/dwc/ EML (Ecological Metadata Language) https://knb.ecoinformatics.org/#tools/eml MiXS GSC (Genomic Standards Consortium) http://gensc.org/projects/mixs-gsc-project/
  • 9. Controlled Vocabularies • NALT – National Agricultural Library Thesaurus http://agclass.nal.usda.gov  GACS Global Agricultural Concept Scheme • Taxonomy • Gene Ontology (GO) http://geneontology.org/ • ENVO, ecological, economic, etc. Relevant for Agriculture • Help create a semantic web • SKOS (Simple Knowledge Organization System): W3C recommendation, or RDF Credit: AIMS--FAO
  • 11.
  • 12. Adding even more value 12 Structured methods metadata Shared data dictionary Semantic data dictionary
  • 13. Adding even more value 13 Assist application launch Find related data Integrate/link related data = help build the knowledge graph
  • 14. Acknowledgements Cynthia.Parr@ars.usda.gov Susan McCarthy, NAL – KSD Ursula Pieper, NAL – ISD Qing Qu, NAL – KSD contractor Jeff Campbell – NAL – KSD Jaylen Nathwani, NAL – student intern NüCivic, Angry Cactus Team Jocelyn McNamara -- NAL – KSD contractor Kerry Huller – UMD graduate fellow Erin Antognoli – UMD graduate fellow

Notas do Editor

  1. Title Ag Data Commons: adding value to open agricultural research data     Public access to results of federally-funded research is a new mandate for large departments of the United States government. Public access to scholarly literature from U.S. investments is straightforward, with policies and systems like PubMed Central and PubAg (http://pubag.nal.usda.gov) already implemented. However, research data release is a more complex undertaking. Agricultural researchers make their data available in a patchwork of locations, if they share it at all, and metadata and data formats are far from standardized. Many data types overlap with basic science domains that have standards (e.g. biodiversity, genomics, hydrology) but have little in common with each other and are not tailored for agriculture. U.S. Department of Agriculture's prototoype system, the Ag Data Commons (http://data.nal.usda.gov), will meet the requirements of public access but should also go further to facilitate novel, data-intensive science. Aimed at researchers, Ag Data Commons uses DKAN, a Drupal-based catalog and repository (http://nucivic.com/dkan/), to enhance discoverability and access to well-curated resources (data files, databases, software) deposited in the system or held elsewhere. Core metadata fields are from Project Open Data v.1.1 (a requirement of the U.S. open data catalog athttp://data.gov) but we added fields and features to support scholarly research. We issue DataCite Digital Object Identifiers (DOIs), accept author ORCIDs (http://orcid.org/), apply National Agricultural Library thesaurus terms, and encourage citation of literature and linkage with related datasets and other online resources. While extremely detailed metadata are impractical given the breadth of agricultural domains, we can extract fields from sophisticated ISO 19115 geographic information metadata and extended metadata files can be posted and will be indexed. We are piloting the harvest of distributed metadata records. Towards data integration and standardization, we are developing guidelines for machine-readable data dictionaries, manifests of data elements in datasets not unlike Darwin Core Archives. We are exploring ways to enable basic interactive visualizations. Metadata are available in JSON (http://json.org/) and RDF (http://www.w3.org/RDF/), with dedicated feeds for publication links and (eventually) compliance checking. Many challenges remain before we can move from prototype to production. Among the challenges are how to provide easy API (application program interface) access to elements in data files, how to interface with related systems (e.g. Dryad, DataONE, EcoInforma, iPlant), how to leverage methods metadata and semantics, how to better support provenance and impact tracking, and how to ease the pain of both working with and preserving big data for high performance computing.
  2. This plan is in a learning and pilot phase now. Policies are being developed to be available in the next fiscal year. New projects in 2016-1017 will be expected to be in full compliance with policies, that means data management plans up front that result in publicly released scientific data according to policy. .So we have a little time to work out the details and influence the policies. We can have conversations now on best practices that may guide the policy makers.
  3. Dark Blue: develop as part of AgDatacCommons Light blue:Enhance existing systems. Gray: Already exist
  4. Drupal Knowledge Archive Network
  5. Phase II prototype Launching next week! Data submission for outside personnel Automate DOI submission Support for compliance checking Embargo support Support for methods & software metadata
  6. Scheduled harvesting from external repositories (including geospatial ISO metadata)