This presentation was provided by Clara Llebot of Oregon State University, during the NISO hot topic virtual conference "Effective Data Management," which was held on September 29, 2021.
5.
Who am I talking to?
● Grad students and early-career researchers, in classes and workshops
● Consultations
● Data management plans
● Deposit of datasets in the institutional repository
6.
What is their data like?
● Small datasets
● Disciplines without well-established metadata standards; often interdisciplinary
7.
Challenges
(Image: Kirby Lee / USA TODAY Sports)
8.
Challenges
1. Enough metadata to ensure a robust scientific process
2. Reproducibility and reuse
3. FAIR data
9.
1. Metadata for a robust scientific process
• Concept vs. application
• Now vs. later
• Intentionally, thoroughly, systematically
• README templates (see the sketch below)
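The deck points researchers to README templates but does not reproduce one. As an illustrative sketch only (not the actual template OSU distributes), a dataset README typically collects fields like these:

    TITLE:            <dataset title>
    AUTHORS/CONTACT:  <names, ORCIDs, email addresses>
    DESCRIPTION:      <what was measured, where, when, and why>
    METHODS:          <instruments, protocols, processing steps>
    FILE LIST:        <one line per file: contents and format>
    VARIABLES:        <name, definition, units, missing-data codes per column>
    LICENSE/RIGHTS:   <e.g., CC0, CC BY>
    RELATED WORK:     <DOIs of articles, dissertations, or code>

Filling these in during the project, rather than reconstructing them at deposit time, is the "now vs. later" point above.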
10.
2. Reproducibility and reusability
11.
2. Reproducibility and reusability
1. Context: premise of the study
● We asked researchers to tell us how they interpret datasets, through a peer-review-like process.
● Peer reviewers and librarians evaluate a dataset: how different are their interpretations of quality?
● Does (and should) this lead to a revision of our curation methods and best practices?
(Image: Flickr/AJ Cann, CC BY-SA)
12.
2. Reproducibility and reusability
● Datasets from ScholarsArchive@OSU, the institutional repository
● All datasets go through a review process; documentation is mandatory
● 8 datasets reviewed by 11 reviewers
13.
2. Reproducibility and reusability
RECORD
● Is the record sufficiently descriptive? Title, abstract, keywords.
● Are there other elements that could be added?
DATA (checks sketched below)
● Are the data easily readable (e.g., in community formats)?
● Are the data of high quality?
● Are the values physically possible and plausible?
● Are there missing data?
DOCUMENTATION
● Contact information?
● Contextual information?
● Comprehensive description of all the data that is there?
● Are the methods well described and reproducible?
● Are internal references available?
● Rights to use the dataset?
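Some of the DATA questions lend themselves to mechanical checks. A minimal sketch in Python, assuming a hypothetical CSV with a temperature_c column; the file name, column name, and physical bounds are illustrative, not part of the review protocol described here:

    import pandas as pd

    df = pd.read_csv("dataset.csv")  # hypothetical deposited file

    # Are there missing data? Count missing values per column.
    print(df.isna().sum())

    # Are the values physically possible and plausible?
    # Bounds are illustrative: surface air temperature in degrees Celsius.
    implausible = df[(df["temperature_c"] < -90) | (df["temperature_c"] > 60)]
    print(f"{len(implausible)} implausible temperature values")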
14.
3. Results
INSUFFICIENT DESCRIPTION
● Descriptive information is critical to a user's ability to understand what the data are and whether they are potentially useful.
● Deficiencies limit the potential reusability of the dataset.
● Areas of description work together to create a more complete description of the dataset.
LINKS
● Information is often provided via links to other sources: articles, dissertations.
● Researchers are comfortable using related articles. Librarians value dataset-specific documentation more highly than most reviewers do.
● Librarians took into consideration whether links were accessible and open.
15.
3. Results
DUPLICATION OF EFFORT
● We ask for the same information in multiple locations (record metadata, documentation, and the dataset itself). Sometimes it is in articles too.
● It is not clear how this duplication of effort affects submission quality: the combination was typically enough for the reviewer or librarian to understand the dataset in detail.
DOMAIN EXPERTISE
● Domain expertise was important across all areas of dataset review. The curating librarians do not have sufficient domain expertise to properly evaluate the quality of the data or metadata.
● Reviewers were confused about licensing, rights statements, persistent identifiers, and where specific types of information belong: the librarians' expertise.
16.
3. FAIR data
• F2. Data are described with rich metadata.
• A2. Metadata are accessible, even when the data are no longer available.
• I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation (see the sketch below).
• R1.3. (Meta)data meet domain-relevant community standards.
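As a hedged illustration of I1 (my example, not one from the presentation), metadata can be expressed in a shared, machine-readable vocabulary such as schema.org's Dataset type, serialized here from Python; every field value is invented:

    import json

    # Illustrative only: schema.org Dataset terms with invented values.
    metadata = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": "Example stream temperature observations",
        "description": "Hourly stream temperature, 2019-2021 (hypothetical).",
        "license": "https://creativecommons.org/publicdomain/zero/1.0/",
        "creator": {"@type": "Person", "name": "Jane Researcher"},
        "identifier": "https://doi.org/10.0000/example",  # placeholder DOI
    }

    # Write machine-readable metadata alongside the data files.
    with open("metadata.json", "w") as f:
        json.dump(metadata, f, indent=2)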
17.
3. FAIR data
Greatest disconnect between researchers and metadata:
● Tools, tools, tools
● Most standards are made for metadata specialists, not for researchers
● Support
18.
3. FAIR data
• FAIR principles are aspirational.
• Disciplines are at different points in developing standards and tools. What are choices for some are challenges for others (Jacobsen et al., 2020).
• A lot is being done, but convergence may take time.
19.
Conclusions
● Basics of metadata: training and teaching that can be done with support (e.g., libraries)
● Tools and translation of concepts: organizations and communities that maintain specifications and standards
● Convergence of standards: organizations and researchers talking about metadata
20.
Clara Llebot Lorente | Data Management Specialist
clara.llebot@oregonstate.edu
ResearchDataServices@oregonstate.edu
http://bit.ly/OSUData
This presentation is licensed under a CC0 license.
Editor's Notes
Statistical tool that converts a set of interrelated variables into another set of independent variables that account for as much of the variability of the sample as possible (see the sketch below).
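This note appears to describe principal component analysis (the slide it annotates is not in the extracted text, so the technique name is my inference). A minimal sketch in Python with synthetic stand-in data:

    import numpy as np

    # Synthetic stand-in: 100 samples of 5 interrelated variables,
    # built from 2 latent factors plus noise.
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(100, 2))                 # latent factors
    A = rng.normal(size=(2, 5))                   # mixing matrix
    X = Z @ A + 0.1 * rng.normal(size=(100, 5))

    Xc = X - X.mean(axis=0)              # center each variable
    cov = np.cov(Xc, rowvar=False)       # covariance of the original variables
    vals, vecs = np.linalg.eigh(cov)     # eigendecomposition (ascending eigenvalues)
    order = np.argsort(vals)[::-1]       # largest explained variance first

    # New variables: uncorrelated, ordered by how much of the sample
    # variability each one accounts for.
    components = Xc @ vecs[:, order]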
A research-intensive university.
I will talk about my perception of the challenges experienced by researchers, and I just want to acknowledge that many are probably doing a wonderful job, and I never interact with them because of that! (Image: Kirby Lee / USA TODAY Sports)
Low-hanging fruit: metadata during the research process. Concept vs. application: researchers understand well what metadata is and why we should record it, but when you ask what metadata they will collect, they say their project does not need metadata. Researchers writing DMPs leave the metadata section blank because they do not know what to write.
Image source: Flickr/AJ Cann, CC BY-SA in http://theconversation.com/explainer-what-is-peer-review-27797
This is a summary of the questions we asked
Reviewers reported missing methodology, information about the authors and their contact information, licenses, and a URL for the dataset.
The FAIR principles add a step, because now we are considering reusability not only by humans but also by machines. The FAIR principles talk about metadata pretty much everywhere. I chose four subprinciples, one from each principle, to discuss in this presentation. I think the interoperability criterion is the most challenging, and also the one that really makes a difference. For metadata, this means the use of standards, which I haven't talked about.
Giving support is challenging from the perspective of a