SlideShare uma empresa Scribd logo
1 de 46
Baixar para ler offline
Chris Erdmann
Judy Ruttenberg
Todd Vision
NISO Virtual Conference: Open Data Projects
June 13, 2018
Community approaches to
open data at scale
Chris Erdmann
The Carpentries/California Digital Library
Metadata 2020 Participant
@libcce / chris@carpentries.org
Metadata 2020:
Who, what, when,
where, why?
As a researcher…I’m a bit bloody
fed up with Data Management -
Cameron Neylon
What is Metadata 2020?
Metadata 2020 is a collaboration that
advocates richer, connected, and reusable,
open metadata for all research outputs, which
will advance scholarly pursuits for the benefit of
society.
COMMUNITY GROUPS
RESEARCHERS
Cameron Neylon, Curtin (Chair), Bethany Drehman, FASEB, Ernesto Priego, University of London, Eva Mendez,
UC3M/OSPP, Juan Pablo Alperin, Public Knowledge Project, L.K. Williams, Interfolio...
SERVICE PROVIDER/PLATFORMS AND TOOLS
Marianne Calilhanna, Cenveo Publisher Services (Chair), Adrian-Tudor Pănescu, Figshare, Bob Kasenchak, Access
Innovations, Dan Nigloschy, XML workflow solutions architect...
FUNDERS
Ross Mounce, Arcadia Fund
PUBLISHERS
Daniel Shanahan, F1000 (Chair), Fiona Counsell, Taylor & Francis, Christina Gifford, Elsevier, Christina
Hoppermann, Springer Nature, Concetta La Spada, Cambridge University Press…
LIBRARIANS
Juliane Schneider, Harvard Catalyst (Chair), Christopher Erdmann, North Carolina State University, Ebe Kartus,
University of New England, Eva Mendez, UC3M/OSPP...
DATA PUBLISHERS AND REPOSITORIES
John Chodacki, CDL and DataCite (Chair), Barbara Chen, Modern Language Association, Jennifer Lin, Crossref, Scott
Plutchak, University of Alabama at Birmingham (retired)...
● Each group has met 5 times
● They have defined their community problem
statements, outlining challenges and opportunities
● Ideas that arose from multiple meetings are now
resulting in specific cross-community projects
Group Work
Problem Statements, Challenges & Opportunities
Example:
Researchers have a major issue with time. Metadata entry
upon submission of research takes time, and this metadata is
often required to be entered multiple times. Streamlining is
needed. Researchers in different fields have different metadata
needs and ways of talking about metadata. There is also a lack
of knowledge surrounding the importance of complete and
accurate metadata, and the value and uses of that metadata
upstream in the research product life cycle.
Projects 1-3
1. Researcher Communications: Increase the impact
and consistency of communication with researchers
about metadata
2. Metadata Recommendations and Element
Mappings: Shared set of recommended metadata
concepts/related mappings
3. Defining the Terms We Use About Metadata:
Develop a glossary of words associated with metadata,
for core concepts and disciplinary areas
Projects 4-6
4. Incentives for Improving Metadata Quality: Stories
to demonstrate how better metadata will meet
researcher goals
5. Shared Best Practices and Principles: High level best
practices for using metadata across the scholarly
communication cycle, to facilitate interoperability,
exchange
6. Metadata Evaluation and Guidance: Identify and
compare existing metadata evaluation tools and
mechanisms to inform clear community guidance
In our discussions...
Talks: SHARE & Dryad
Improving the metadata curation pipeline to SHARE
Judy Ruttenberg, Program Director for Strategic Initiatives, ARL
SHARE is a community open-source initiative developing tools and services to connected related, yet distributed,
research outputs, enabling new kinds of scholarly discovery. This talk will provide an overview of SHARE's current
development priorities to move to distributed, institutionally-based infrastructure supporting local priorities, as well
as critical improvements to SHARE's harvesting framework and metadata curation pipeline.
Dryad and the evolution of metadata curation at a generalist data
repository
Todd Vision, PI, Dryad
Dryad is a generalist data repository underlying the scientific and medical literature, with data underlying articles
from hundreds of journals and authors at hundreds of institutions. In this talk, I will describe how Dryad's workflow
for metadata curation has evolved over time and contemplate how institutions and data repositories might better
interface with one another and with the world of STM publishing.
Some feedback from NASIG so far...
Focus on identifiers: they keep coming up as the source of many problems,
specifically when they’re used either inconsistently or incorrectly.
Don’t start with single volume monographs and assume serials will fit in
eventually. Many existing data models start there and get too far along before
they realize that it doesn’t quite work with serials and then the serials community
is left trying to figure out how to work around the model/system.
Realize that the volume/proliferation of materials means that a lot of libraries,
large and small, rely (to varying degrees) on vendor-provided data. We need
them to take ownership of that data and work on ways to ensure at least accuracy
(identifiers, spelling, website urls, etc.).
*Thanks to Juliane Schneider & NASIG Metadata 2020 contributors
Can you help?
● Contribute to Metadata 2020 projects! Email
Clare Dean at cdean@metadata2020.org for details, or
sign up here.
● Help promote our efforts to the wider community
through your organizations, word of mouth, and social
media
● Find us on @Metadata2020 Twitter, Facebook,
LinkedIn, and at metadata2020.org
Metadata2020.org
@metadata2020
info@metadata2020.org
Thank you!
Questions?
SHARE is a community open-source initiative
developing tools and services to connect
related, yet distributed, research outputs,
enabling new kinds of scholarly discovery.
@SHARE_research
www.share-research.org
Metadata is data
Rich metadata ...
● Facilitates discovery
● Exposes research assets
● Contributes to meta-scholarship and
meta-analysis
Links and relationships can be analyzed from
this data
Dataset
Harvesting Framework
Aggregator: OSF Preprints
Institutional focus: Dashboard
Lessons learned
Digital Humanities exploration
Dataset
Harvesting Framework
Aggregator: OSF Preprints
Institutional focus: Dashboard
Lessons learned
Digital Humanities exploration
Dataset
Harvesting Framework
Aggregator: OSF Preprints
Institutional focus: Dashboard
Lessons learned
Digital Humanities exploration
Dataset & Harvesting Framework
168+ data sources
● Registries (e.g. CrossRef, DataCite)
● Disciplinary repositories and preprint services
● Data repositories
● Institutional repositories
● Agency repositories (e.g. DOE SciTech Connect)
55+ million metadata records
https://share.osf.io/discover
SHARE metadata priorities
● Institutional identifier
● Person identifier
● Source of funding
● Exchange across systems & borders: CC0
● Reference lists
● URI values - mapping to common values
making them transferrable
Rich metadata, new discovery
Rich metadata, rich storytelling
Lessons learned
● Move to distributed infrastructure
● Invest more in relationship mapping among
objects in the dataset
● Build on work at the institution level
● Shared service AND reusable solutions
Decentralization of SHARE
Under development:
● Template to make writing harvester code
easy, using Node-RED
● Distributed framework for harvesting data
● Editor to clean, remediate, link harvested
data
Community, open-source software
development to solve local problems
Use case: Research Intelligence
“Aggregation, curation, and utilization of
metadata about research activities. [RIMs} …
help reliably connect a complex scholarly
communications landscape of researchers,
affiliations, publications, datasets, grants,
projects, and their persistent identifiers.”
OCLC Research Library Partnerships:
https://www.oclc.org/research/themes/research-collections/rim.html
VIVO - June 2018 - Durham, NC
The evolution of
metadata at a
generalist data
repository
Todd Vision
Associate Prof, Department of Biology
Adjunct, School of Information & Library Science
University of North Carolina at Chapel Hill
With thanks to
Dryad staff
Jane Greenberg, and the UNC/Drexel Metadata Research Center
The long tail of orphan dataVolume
Rank frequency of datatype
Specialized repositories
(e.g. GenBank)
Orphan data
After Heidorn (2008) http://hdl.handle.net/2142/9127
Bumpus HC (1898) The Elimination of the Unfit as
Illustrated by the Introduced Sparrow, Passer
domesticus. A Fourth Contribution to the Study of
Variation. pp. 209-226 in Biological Lectures from the
Marine Biological Laboratory, Woods Hole, Mass.
VIVO - June 2018 - Durham, NC
InformationContent
Time
Time of publication
Specific details
General details
Accident
Retirement or
career change
Death
Michener, W. K., J. W. Brunt, J. Helly, T. B. Kirchner, and S. G. Stafford. 1997.
Non-geospatial metadata for the ecological sciences. Ecological Applications 7:330-342.
Data and metadata entropy
VIVO - June 2018 - Durham, NC
Joint Data Archiving Policy
Data are important products of the scientific enterprise,
and they should be preserved and usable for decades in
the future.
As a condition for publication, data supporting the results
in the article should be deposited in an appropriate
public archive.
Authors may elect to embargo access to the data for a
period up to a year after publication.
Exceptions may be granted at the discretion of the editor,
especially for sensitive information.
http://datadryad.org/pages/jdap
VIVO - June 2018 - Durham, NC
VIVO - June 2018 - Durham, NC
Integration of manuscript and data submission
VIVO - June 2018 - Durham, NC
A data “package”
VIVO - June 2018 - Durham, NC
Supplementary documentation
VIVO - June 2018 - Durham, NC
Interoperability
VIVO - June 2018 - Durham,
Interoperability
VIVO - June 2018 - Durham, NC
Interoperability
VIVO - June 2018 - Durham, NC
Data citation
VIVO - June 2018 - Durham, NC
Data Curation Network
Uncurated Data
Presenting scale and
expertise challenges
to individual
institutions
Curated Data
at scale through shared
Data Curation Network
Appraise
and Select
Ingest Preserve
Long-Term
Facilitate
Access
DCN
Review Assign CURATE Mediate Approve
Check files
and
metadata
Understand
and run files
Request
missing
information
Augment
metadata
Transform
file formats
Evaluate for
FAIRness
C U R A T E
The Data Curation Network
VIVO - June 2018 - Durham, NC
DCN – planning phase (2016-2017)
• Collaboration of six academic libraries
• Can data curation staff be shared among institutions?
• Questions
– How to address policy differences?
– What do researchers actually need help with?
– Will researchers care if curation is distributed?
– Can issues of trust and quality control be solved?
– What skills and workflows are needed?
Lisa Johnston et al. (2017) Data Curation Network: A Cross-Institutional
Staffing Model for Curating Research Data
http://hdl.handle.net/11299/188654
VIVO - June 2018 - Durham, NC
DCN - pilot phase (2018-2020)
VIVO - June 2018 - Durham, NC
Partnership with
VIVO - June 2018 - Durham, NC
Make Data Count
VIVO - June 2018 - Durham, NC
datadryad.org / @datadryad
datacurationnetwork.org

Mais conteúdo relacionado

Mais procurados

LIBER Webinar: 23 Things About Research Data Management
LIBER Webinar: 23 Things About Research Data ManagementLIBER Webinar: 23 Things About Research Data Management
LIBER Webinar: 23 Things About Research Data ManagementLIBER Europe
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...Armin Haller
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13DataDryad
 
Data Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor IdentifiersData Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor IdentifiersMicah Altman
 
Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Philipp Zumstein
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories ImpactMerce Crosas
 
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017dkNET
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...dkNET
 
Analysing & Improving Learning Resources Markup on the Web
Analysing & Improving Learning Resources Markup on the WebAnalysing & Improving Learning Resources Markup on the Web
Analysing & Improving Learning Resources Markup on the WebStefan Dietze
 

Mais procurados (20)

Think like a Digital Curator
Think like a Digital CuratorThink like a Digital Curator
Think like a Digital Curator
 
FAIR data overview
FAIR data overviewFAIR data overview
FAIR data overview
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
LIBER Webinar: 23 Things About Research Data Management
LIBER Webinar: 23 Things About Research Data ManagementLIBER Webinar: 23 Things About Research Data Management
LIBER Webinar: 23 Things About Research Data Management
 
Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
 
Putnam Data Quality and the IR
Putnam Data Quality and the IRPutnam Data Quality and the IR
Putnam Data Quality and the IR
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016
 
Big Data for Library Services (2017)
Big Data for Library Services (2017)Big Data for Library Services (2017)
Big Data for Library Services (2017)
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
 
NISO Training Thursday Crafting a Scientific Data Management Plan
NISO Training Thursday Crafting a Scientific Data Management PlanNISO Training Thursday Crafting a Scientific Data Management Plan
NISO Training Thursday Crafting a Scientific Data Management Plan
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13
 
Data Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor IdentifiersData Citation, The Dataverse Network ®, and Contributor Identifiers
Data Citation, The Dataverse Network ®, and Contributor Identifiers
 
Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
 
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017
dkNET-NURSA Challenge Kick-Off Webinar 04/27/2017
 
Mendeley Data FAIR hackathon
Mendeley Data FAIR hackathonMendeley Data FAIR hackathon
Mendeley Data FAIR hackathon
 
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
 
Analysing & Improving Learning Resources Markup on the Web
Analysing & Improving Learning Resources Markup on the WebAnalysing & Improving Learning Resources Markup on the Web
Analysing & Improving Learning Resources Markup on the Web
 

Semelhante a Full Erdmann Ruttenberg Community Approaches to Open Data at Scale

Metadata 2020 Vivo Conference 2018
Metadata 2020 Vivo Conference 2018 Metadata 2020 Vivo Conference 2018
Metadata 2020 Vivo Conference 2018 Clare Dean
 
INSERM - Data Management & Reuse of Health Data - May 2017
INSERM - Data Management & Reuse of Health Data - May 2017INSERM - Data Management & Reuse of Health Data - May 2017
INSERM - Data Management & Reuse of Health Data - May 2017Susanna-Assunta Sansone
 
My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018Susanna-Assunta Sansone
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?Elena Simperl
 
Open Data and Institutional Repositories
Open Data and Institutional RepositoriesOpen Data and Institutional Repositories
Open Data and Institutional RepositoriesRobin Rice
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseRinke Hoekstra
 
Open government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactOpen government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactElena Simperl
 
Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Talis Consulting
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - finalKathy Fontaine
 
Metadata 2020 at APE 2018
Metadata 2020 at APE 2018Metadata 2020 at APE 2018
Metadata 2020 at APE 2018Clare Dean
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonAfrican Open Science Platform
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data ManagementCarole Goble
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystemMaryann Martone
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...African Open Science Platform
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonAfrican Open Science Platform
 
Edinburgh DataShare: Tackling research data in a DSpace institutional repository
Edinburgh DataShare: Tackling research data in a DSpace institutional repositoryEdinburgh DataShare: Tackling research data in a DSpace institutional repository
Edinburgh DataShare: Tackling research data in a DSpace institutional repositoryRobin Rice
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries? Robin Rice
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott LibraryRebekah Cummings
 
Ross Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy DevelopmentsRoss Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy DevelopmentsWiley
 

Semelhante a Full Erdmann Ruttenberg Community Approaches to Open Data at Scale (20)

Metadata 2020 Vivo Conference 2018
Metadata 2020 Vivo Conference 2018 Metadata 2020 Vivo Conference 2018
Metadata 2020 Vivo Conference 2018
 
INSERM - Data Management & Reuse of Health Data - May 2017
INSERM - Data Management & Reuse of Health Data - May 2017INSERM - Data Management & Reuse of Health Data - May 2017
INSERM - Data Management & Reuse of Health Data - May 2017
 
My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?
 
Open Data and Institutional Repositories
Open Data and Institutional RepositoriesOpen Data and Institutional Repositories
Open Data and Institutional Repositories
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS case
 
Open government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactOpen government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impact
 
Open Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon HodsonOpen Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon Hodson
 
Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Linked Data Workshop Stanford University
Linked Data Workshop Stanford University
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - final
 
Metadata 2020 at APE 2018
Metadata 2020 at APE 2018Metadata 2020 at APE 2018
Metadata 2020 at APE 2018
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data Management
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystem
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
 
Edinburgh DataShare: Tackling research data in a DSpace institutional repository
Edinburgh DataShare: Tackling research data in a DSpace institutional repositoryEdinburgh DataShare: Tackling research data in a DSpace institutional repository
Edinburgh DataShare: Tackling research data in a DSpace institutional repository
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott Library
 
Ross Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy DevelopmentsRoss Wilkinson - Data Publication: Australian and Global Policy Developments
Ross Wilkinson - Data Publication: Australian and Global Policy Developments
 

Mais de National Information Standards Organization (NISO)

Mais de National Information Standards Organization (NISO) (20)

Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
 
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
 
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
 
Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"
 
Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"
 
Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"
 
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
 
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
 

Último

The Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsThe Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsRommel Regala
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationRosabel UA
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...JojoEDelaCruz
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxElton John Embodo
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 

Último (20)

The Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsThe Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World Politics
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 

Full Erdmann Ruttenberg Community Approaches to Open Data at Scale

  • 1. Chris Erdmann Judy Ruttenberg Todd Vision NISO Virtual Conference: Open Data Projects June 13, 2018 Community approaches to open data at scale
  • 2. Chris Erdmann The Carpentries/California Digital Library Metadata 2020 Participant @libcce / chris@carpentries.org Metadata 2020: Who, what, when, where, why?
  • 3. As a researcher…I’m a bit bloody fed up with Data Management - Cameron Neylon
  • 4. What is Metadata 2020? Metadata 2020 is a collaboration that advocates richer, connected, and reusable, open metadata for all research outputs, which will advance scholarly pursuits for the benefit of society.
  • 5. COMMUNITY GROUPS RESEARCHERS Cameron Neylon, Curtin (Chair), Bethany Drehman, FASEB, Ernesto Priego, University of London, Eva Mendez, UC3M/OSPP, Juan Pablo Alperin, Public Knowledge Project, L.K. Williams, Interfolio... SERVICE PROVIDER/PLATFORMS AND TOOLS Marianne Calilhanna, Cenveo Publisher Services (Chair), Adrian-Tudor Pănescu, Figshare, Bob Kasenchak, Access Innovations, Dan Nigloschy, XML workflow solutions architect... FUNDERS Ross Mounce, Arcadia Fund PUBLISHERS Daniel Shanahan, F1000 (Chair), Fiona Counsell, Taylor & Francis, Christina Gifford, Elsevier, Christina Hoppermann, Springer Nature, Concetta La Spada, Cambridge University Press… LIBRARIANS Juliane Schneider, Harvard Catalyst (Chair), Christopher Erdmann, North Carolina State University, Ebe Kartus, University of New England, Eva Mendez, UC3M/OSPP... DATA PUBLISHERS AND REPOSITORIES John Chodacki, CDL and DataCite (Chair), Barbara Chen, Modern Language Association, Jennifer Lin, Crossref, Scott Plutchak, University of Alabama at Birmingham (retired)...
  • 6. ● Each group has met 5 times ● They have defined their community problem statements, outlining challenges and opportunities ● Ideas that arose from multiple meetings are now resulting in specific cross-community projects Group Work
  • 7. Problem Statements, Challenges & Opportunities Example: Researchers have a major issue with time. Metadata entry upon submission of research takes time, and this metadata is often required to be entered multiple times. Streamlining is needed. Researchers in different fields have different metadata needs and ways of talking about metadata. There is also a lack of knowledge surrounding the importance of complete and accurate metadata, and the value and uses of that metadata upstream in the research product life cycle.
  • 8. Projects 1-3 1. Researcher Communications: Increase the impact and consistency of communication with researchers about metadata 2. Metadata Recommendations and Element Mappings: Shared set of recommended metadata concepts/related mappings 3. Defining the Terms We Use About Metadata: Develop a glossary of words associated with metadata, for core concepts and disciplinary areas
  • 9. Projects 4-6 4. Incentives for Improving Metadata Quality: Stories to demonstrate how better metadata will meet researcher goals 5. Shared Best Practices and Principles: High level best practices for using metadata across the scholarly communication cycle, to facilitate interoperability, exchange 6. Metadata Evaluation and Guidance: Identify and compare existing metadata evaluation tools and mechanisms to inform clear community guidance
  • 11. Talks: SHARE & Dryad Improving the metadata curation pipeline to SHARE Judy Ruttenberg, Program Director for Strategic Initiatives, ARL SHARE is a community open-source initiative developing tools and services to connected related, yet distributed, research outputs, enabling new kinds of scholarly discovery. This talk will provide an overview of SHARE's current development priorities to move to distributed, institutionally-based infrastructure supporting local priorities, as well as critical improvements to SHARE's harvesting framework and metadata curation pipeline. Dryad and the evolution of metadata curation at a generalist data repository Todd Vision, PI, Dryad Dryad is a generalist data repository underlying the scientific and medical literature, with data underlying articles from hundreds of journals and authors at hundreds of institutions. In this talk, I will describe how Dryad's workflow for metadata curation has evolved over time and contemplate how institutions and data repositories might better interface with one another and with the world of STM publishing.
  • 12. Some feedback from NASIG so far... Focus on identifiers: they keep coming up as the source of many problems, specifically when they’re used either inconsistently or incorrectly. Don’t start with single volume monographs and assume serials will fit in eventually. Many existing data models start there and get too far along before they realize that it doesn’t quite work with serials and then the serials community is left trying to figure out how to work around the model/system. Realize that the volume/proliferation of materials means that a lot of libraries, large and small, rely (to varying degrees) on vendor-provided data. We need them to take ownership of that data and work on ways to ensure at least accuracy (identifiers, spelling, website urls, etc.). *Thanks to Juliane Schneider & NASIG Metadata 2020 contributors
  • 13. Can you help? ● Contribute to Metadata 2020 projects! Email Clare Dean at cdean@metadata2020.org for details, or sign up here. ● Help promote our efforts to the wider community through your organizations, word of mouth, and social media ● Find us on @Metadata2020 Twitter, Facebook, LinkedIn, and at metadata2020.org
  • 15.
  • 16. SHARE is a community open-source initiative developing tools and services to connect related, yet distributed, research outputs, enabling new kinds of scholarly discovery. @SHARE_research www.share-research.org
  • 17.
  • 18. Metadata is data Rich metadata ... ● Facilitates discovery ● Exposes research assets ● Contributes to meta-scholarship and meta-analysis Links and relationships can be analyzed from this data
  • 19. Dataset Harvesting Framework Aggregator: OSF Preprints Institutional focus: Dashboard Lessons learned Digital Humanities exploration
  • 20. Dataset Harvesting Framework Aggregator: OSF Preprints Institutional focus: Dashboard Lessons learned Digital Humanities exploration
  • 21. Dataset Harvesting Framework Aggregator: OSF Preprints Institutional focus: Dashboard Lessons learned Digital Humanities exploration
  • 22. Dataset & Harvesting Framework 168+ data sources ● Registries (e.g. CrossRef, DataCite) ● Disciplinary repositories and preprint services ● Data repositories ● Institutional repositories ● Agency repositories (e.g. DOE SciTech Connect) 55+ million metadata records https://share.osf.io/discover
  • 23. SHARE metadata priorities ● Institutional identifier ● Person identifier ● Source of funding ● Exchange across systems & borders: CC0 ● Reference lists ● URI values - mapping to common values making them transferrable
  • 24. Rich metadata, new discovery
  • 25. Rich metadata, rich storytelling
  • 26. Lessons learned ● Move to distributed infrastructure ● Invest more in relationship mapping among objects in the dataset ● Build on work at the institution level ● Shared service AND reusable solutions
  • 27. Decentralization of SHARE Under development: ● Template to make writing harvester code easy, using Node-RED ● Distributed framework for harvesting data ● Editor to clean, remediate, link harvested data Community, open-source software development to solve local problems
  • 28. Use case: Research Intelligence “Aggregation, curation, and utilization of metadata about research activities. [RIMs} … help reliably connect a complex scholarly communications landscape of researchers, affiliations, publications, datasets, grants, projects, and their persistent identifiers.” OCLC Research Library Partnerships: https://www.oclc.org/research/themes/research-collections/rim.html
  • 29.
  • 30. VIVO - June 2018 - Durham, NC The evolution of metadata at a generalist data repository Todd Vision Associate Prof, Department of Biology Adjunct, School of Information & Library Science University of North Carolina at Chapel Hill With thanks to Dryad staff Jane Greenberg, and the UNC/Drexel Metadata Research Center
  • 31. The long tail of orphan dataVolume Rank frequency of datatype Specialized repositories (e.g. GenBank) Orphan data After Heidorn (2008) http://hdl.handle.net/2142/9127 Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. A Fourth Contribution to the Study of Variation. pp. 209-226 in Biological Lectures from the Marine Biological Laboratory, Woods Hole, Mass. VIVO - June 2018 - Durham, NC
  • 32. InformationContent Time Time of publication Specific details General details Accident Retirement or career change Death Michener, W. K., J. W. Brunt, J. Helly, T. B. Kirchner, and S. G. Stafford. 1997. Non-geospatial metadata for the ecological sciences. Ecological Applications 7:330-342. Data and metadata entropy VIVO - June 2018 - Durham, NC
  • 33. Joint Data Archiving Policy Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. As a condition for publication, data supporting the results in the article should be deposited in an appropriate public archive. Authors may elect to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information. http://datadryad.org/pages/jdap VIVO - June 2018 - Durham, NC
  • 34. VIVO - June 2018 - Durham, NC
  • 35. Integration of manuscript and data submission VIVO - June 2018 - Durham, NC
  • 36. A data “package” VIVO - June 2018 - Durham, NC
  • 37. Supplementary documentation VIVO - June 2018 - Durham, NC
  • 39. Interoperability VIVO - June 2018 - Durham, NC
  • 40. Interoperability VIVO - June 2018 - Durham, NC
  • 41. Data citation VIVO - June 2018 - Durham, NC
  • 42. Data Curation Network Uncurated Data Presenting scale and expertise challenges to individual institutions Curated Data at scale through shared Data Curation Network Appraise and Select Ingest Preserve Long-Term Facilitate Access DCN Review Assign CURATE Mediate Approve Check files and metadata Understand and run files Request missing information Augment metadata Transform file formats Evaluate for FAIRness C U R A T E The Data Curation Network VIVO - June 2018 - Durham, NC
  • 43. DCN – planning phase (2016-2017) • Collaboration of six academic libraries • Can data curation staff be shared among institutions? • Questions – How to address policy differences? – What do researchers actually need help with? – Will researchers care if curation is distributed? – Can issues of trust and quality control be solved? – What skills and workflows are needed? Lisa Johnston et al. (2017) Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data http://hdl.handle.net/11299/188654 VIVO - June 2018 - Durham, NC
  • 44. DCN - pilot phase (2018-2020) VIVO - June 2018 - Durham, NC
  • 45. Partnership with VIVO - June 2018 - Durham, NC Make Data Count
  • 46. VIVO - June 2018 - Durham, NC datadryad.org / @datadryad datacurationnetwork.org