SCIENTIFIC DATA DESCRIPTORS PROMOTE REPRODUCIBILITY
1. SCIENTIFIC DATA
Andrew L. Hufton
Managing Editor
Susanna-Assunta Sansone, University of Oxford (@biosharing)
Honorary Academic Editor
Varsha Khodiyar
Data Curation Editor
Iain Hrynaszkiewicz
Head of Data and HSS Publishing
Emerging Issues Forum on "Data Review",
Dryad Community Meeting, 27 May 2015
@ScientificData
www.nature.com/sdata
2. A new category of publication that provides detailed
descriptors of scientifically valuable datasets,
whether or not they are associated with traditional article(s)
3. Scope: data reproducibility and (re)use
• Title
• Abstract
• Background & Summary
• Methods
• Data Records
• Technical Validation
• Usage Notes
• Figures & Tables
• References
• Data Citations
Detailed description of the methods and technical analyses supporting
the quality of the measurements; no scientific hypotheses
4. Data discoverability
• Hosted on Nature.com
• Descriptors are uniformly searchable and
discoverable
• Clear linking to related journal articles and
repository records
• Friendly to data and text miners through
machine-readable metadata
5. Data curation
• The Data Curation Editor is responsible
for creating the machine-readable
metadata files associated with articles.
• This facilitates their integration with the
underlying datasets stored in community
repositories.
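As a rough illustration of what "machine-readable metadata" linking a descriptor to repository records could look like, here is a minimal Python sketch. The field names, the placeholder identifiers, and both helper functions are assumptions made up for this example; they are not the journal's actual metadata format.

```python
import json

# Hypothetical descriptor record: all field names and identifiers below are
# illustrative placeholders, not the journal's real metadata schema.
descriptor = {
    "title": "Example Data Descriptor",
    "related_articles": ["placeholder-article-id"],
    "data_records": [
        {"repository": "Example Repository", "identifier": "placeholder-dataset-id"},
    ],
}


def to_machine_readable(record):
    """Serialize a descriptor record to JSON text with a stable key order."""
    return json.dumps(record, indent=2, sort_keys=True)


def linked_identifiers(record):
    """Collect the identifiers that tie the descriptor to articles and datasets."""
    ids = list(record.get("related_articles", []))
    ids += [r["identifier"] for r in record.get("data_records", [])]
    return ids


serialized = to_machine_readable(descriptor)
print(serialized)
```

Keeping the links to articles and repository records as explicit fields is what would let a text or data miner follow them without parsing the prose of the descriptor.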
6. Data peer review
• Completeness = can others reproduce?
• Consistency = were community standards
followed?
• Integrity = are data in the best repository?
• Experimental rigour and technical quality =
were the methods sound?
Does not focus on the perceived impact, importance,
size, or complexity of the data
7. Peer review - what have we learned (1)
• Access to data: make it easy for reviewers
o our peer-review reports suggest that more than
30% of reviewers look at the raw data
o we provide reviewers with links to the data
prominently on the cover page of the
manuscripts
o where data is stored, and how it is curated and
accessed, does matter; selecting the right
repositories to recommend and collaborate
with is therefore pivotal
8. Our criteria for selecting repositories
1. Recognized within their scientific community
2. Long-term preservation of datasets
3. Expert curation
4. Implement relevant reporting standards
5. Allow confidential review of submitted datasets
6. Stable identifiers for submitted datasets
7. Allow public access to data without unnecessary restrictions
See our website for the list of recommended databases and the
questionnaire for new repositories requesting listing
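The seven repository criteria on this slide can be read as a simple checklist. The sketch below shows one way to record them in Python. The key names, the `unmet_criteria` helper, and the example repository profile are all hypothetical, and this is not the journal's actual evaluation procedure.

```python
# The seven criteria from the slide, expressed as hypothetical checklist keys.
CRITERIA = [
    "recognized_in_community",
    "long_term_preservation",
    "expert_curation",
    "reporting_standards",
    "confidential_review",
    "stable_identifiers",
    "public_access",
]


def unmet_criteria(repository):
    """Return the criteria a repository profile does not satisfy, in list order.

    A criterion that is missing from the profile counts as unmet.
    """
    return [c for c in CRITERIA if not repository.get(c, False)]


# Made-up repository profile that meets everything except confidential review.
example_repo = {c: True for c in CRITERIA}
example_repo["confidential_review"] = False

print(unmet_criteria(example_repo))  # ['confidential_review']
```

Treating a missing answer as unmet keeps the check conservative: a repository is only considered a match once every criterion has been explicitly confirmed.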
9. Peer review - what have we learned (2)
• Data review: a constructive value-added service
o not all data is valid, but very few repositories
actually review data quality, validity, experimental
design, etc.
o our peer-review process is very concrete and
usually constructive: a service complementary
to repository curation/review (when it occurs)
o rejection after peer review is rare; when it happens,
it is often because of flaws in experimental design or
the lack of key controls
10. Peer review - what have we learned (3)
• Data curation editor: a new profile needed
o data understandability and reusability require
available and ‘intelligently open’ information
o breadth and depth of the information reported
(experimental metadata) are vital
o the growing complexity and diversity of data require
an enhanced process, e.g. mediation with more than one
repository and several reviewers
o in-house expertise in data reporting and
representation standards is a must
11. Link, complement and add value:
credit for sharing and data peer-review
[Diagram: Research papers, Data records, Data Descriptors]