1. HOW CAN JOURNALS HELP ENCOURAGE DATA SHARING?
Varsha Khodiyar
Data Publishing Manager, F1000Research
Share and Flourish workshop
Leiden, August 2014
4. SMALL DATASETS (THE LONG-TAIL OF SCIENCE)
http://robertanagourney.wordpress.com/2014/07/09/with-cancer-dont-ask-the-experts/
• Number of samples/patients?
• Statistical power?
• Supports prevailing hypothesis?
5. RESEARCH BECOMES HARDER TO ACCESS WITH AGE
• “We examined the availability of data from 516 studies between 2 and 22 years old
• The odds of a data set being reported as extant fell by 17% per year
• Broken e-mails and obsolete storage devices were the main obstacles to data sharing
• Policies mandating data archiving at publication are clearly needed”
Vines TH et al. The availability of research data declines rapidly with article age. Curr Biol 24, 94–7 (2014)
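A 17% yearly fall in odds is not the same as a 17% yearly fall in probability. The sketch below illustrates the difference, assuming the drop acts as a constant multiplicative factor on the odds each year. The baseline odds of 4:1 and the function names are hypothetical, chosen only for illustration; they are not figures or methods from Vines et al.

```python
def odds_after(years, baseline_odds, yearly_drop=0.17):
    """Odds after `years`, assuming a constant multiplicative drop each year."""
    return baseline_odds * (1.0 - yearly_drop) ** years


def odds_to_probability(odds):
    """Convert odds (p / (1 - p)) back to a probability p."""
    return odds / (1.0 + odds)


# HYPOTHETICAL baseline: odds of 4:1 (probability 0.8) at year 0.
# This number is an assumption for illustration, not a figure from the paper.
BASELINE_ODDS = 4.0

for years in (0, 5, 10, 20):
    odds = odds_after(years, BASELINE_ODDS)
    prob = odds_to_probability(odds)
    print(f"{years:>2} years: odds {odds:.2f}, probability {prob:.2f}")
```

Under these assumptions the probability falls slowly while it is high and faster as the odds approach 1:1, which is why the decline is reported in odds rather than in percentage points.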
7. OPEN AND POST-PUBLICATION PEER REVIEW
@vkf1000 | @f1000research
• Approved
• Approved with reservations
• Not approved
8. REFEREE REPORTS ARE PUBLIC
All referee names are visible.
Articles with sufficient positive evaluations are indexed in PubMed, Scopus and Embase.
Referee reports and other comments are visible to everyone.
11. As a field we have the opportunity to compete or
collaborate. I hope that these data facilitate cross-laboratory
collaboration where two groups are
reticent to share their own data…I applaud the
authors for this unprecedented act of scientific
altruism. I hope this will be a platform that
accelerates our understanding of the entorhinal-hippocampal
circuitry through collaboration.
14. DATA AND METHODS ARE REQUIRED TO ALLOW REPLICATION
“[W]e evaluated the replication of data analyses in 18 articles
on microarray-based gene expression profiling published in
Nature Genetics in 2005–2006...We reproduced two analyses
in principle and six partially or with some discrepancies; ten
could not be reproduced. The main reason for failure to
reproduce was data unavailability.”
Ioannidis JPA et al. Repeatability of published microarray gene expression analyses. Nature Genetics 41, 149–55 (2009)
15. DO DATA PRODUCERS NEED ADDITIONAL CREDIT?
Traditional research paper by
ISS, MSD and JWH
Author contributions
ISS and JWH designed the experiments. ISS with help
from MSD carried out the experiments. ISS and JWH
prepared the manuscript. All authors approved the
final content of the manuscript.
19. SOFTWARE IS IMPORTANT TOO
Archive published source code with a DOI
Update article after it has passed peer review, e.g. with release of a new version of the software
Data and Software availability section
Request the inclusion of source code for software developed for the project and required to analyse/view the data or replicate the work
21. HOW CAN JOURNALS HELP ENCOURAGE DATA SHARING?
Email: varsha.khodiyar@f1000.com
Twitter: @vkf1000 / @f1000research
Editor's Notes
Large scale datasets e.g. Human genome project
Data more likely to be shared due to collaborative nature of data producing phase
Lots of data produced
Data shared through centrally funded database, e.g. GenBank for gene sequences.
Journals enforce use of these databases for publications with gene sequences.
F1000Research is making it possible to get a paper online within days, using post-publication peer review.
On average, F1000Research articles are published online within 6 working days, after an in-house pre-refereeing check.
Peer review and revisions are carried out publicly.
Articles with sufficient positive referee reports are indexed.
Referee reports on all papers are visible to anyone reading the article, and include the referee name. Author responses and any additional comments are visible as well.
A dataset (or set of datasets) together with the associated methods/protocol used to create the data. No analysis of the data, results or conclusions should be included.
Limited space in a traditional full paper for detailed methods, data and author credit for data producers.
MSD is essential for this set of experiments, i.e. to generate this data.
In a data paper, MSD would be first author, as they have the knowledge and skill to generate that dataset.
Journals such as F1000Research are working to make it easier for researchers to share and use published data
All papers have underlying data freely accessible as a condition of publication