Report on key findings of a Wellcome-commissioned study investigating current practices for paper, data and code sharing among Wellcome- and ESRC-funded researchers, and any barriers they encounter. Presented by Gareth Knight at a CPD25 Open Access workshop at the Foundling Museum in London on 26 April 2017.
Towards Open Research: practices, experiences, barriers and opportunities
1. Towards Open Research: practices, experiences, barriers and opportunities
CPD25: Open Access and Repositories
26 April 2017
Veerle Van den Eynden, UK Data Service, University of Essex
Gareth Knight (Presenter), London School of Hygiene & Tropical Medicine
Anca Vlad, UK Data Service, University of Essex
2. Open Research study
• Researchers funded by Wellcome Trust and ESRC: biomedical, clinical, population health, humanities, social sciences
• Current attitudes and practices related to sharing of:
– Publications
– Data
– Code
• Barriers that inhibit or prevent researchers from sharing
• Identification of action that funders can take to encourage good practice and mitigate issues
• Survey (N=583 + 259), focus groups (N=22)
Van den Eynden, Veerle et al. (2016) Towards Open Research: Practices, Experiences, Barriers and Opportunities. Wellcome Trust. https://doi.org/10.6084/m9.figshare.4055448
3. Article publishing
• Respondents published an average of 18 peer-reviewed papers during the past 5 years
– 30% published all papers as OA
• Factors that affect ability to publish OA:
– Journal lacks OA option (31%)
– Lack of funds to cover APCs (30%)
– Papers uploaded to social network (8%)
– Lead author decided against OA (4%)
• 50% of respondents use WT funds for APCs:
– Humanities & social scientists less likely than biomedical & clinical scientists
– Early-career less likely than more established researchers
Open access cookie (CC BY-NC-SA 2.0)
https://www.flickr.com/photos/biblioteekje/6325328112/
4. Data sharing
95% of respondents generate research data, of whom 52% made it available in the last 5 years
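The two percentages on this slide can be combined into a single share of all respondents. The following is a minimal Python sketch, assuming the 52% applies only to the 95% who generate data; the function name `share_of_all` is illustrative and does not come from the study.

```python
def share_of_all(outer_pct: float, inner_pct: float) -> float:
    """Return the percentage of all respondents in the inner group,
    where inner_pct is expressed relative to the outer group."""
    return outer_pct * inner_pct / 100.0


# Assumption: 52% is a share of the 95% who generate research data.
data_sharers = share_of_all(95, 52)
print(f"{data_sharers:.1f}% of all respondents shared data")  # 49.4%
```

Under that assumption, roughly half of all respondents made data available in the period covered.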
5. Data sharing methods
414 respondents share data:
• Full dataset (51%)
• Data subset linked to paper (38%)
• Other subset of data (37%)
Via:
• Community repositories (42%)
• Institutional repositories (37%)
• Project/private repositories (15%)
• General purpose repositories (13%)
• Journal supplementary (10%)
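The percentages on this slide add up to more than 100% in both groups, which suggests respondents could select several options. The sketch below checks this and converts the percentages to approximate head counts. It assumes that every percentage uses the 414 data-sharing respondents as its base, which the slide does not state explicitly; the variable and function names are illustrative.

```python
SHARERS = 414  # respondents who share data, from the slide

shared_what = {
    "Full dataset": 51,
    "Data subset linked to paper": 38,
    "Other subset of data": 37,
}
shared_via = {
    "Community repositories": 42,
    "Institutional repositories": 37,
    "Project/private repositories": 15,
    "General purpose repositories": 13,
    "Journal supplementary": 10,
}


def approx_counts(percentages: dict, total: int) -> dict:
    """Convert percentages to rounded head counts (assumed base: total)."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


def allows_multiple_answers(percentages: dict) -> bool:
    """A total above 100% implies overlapping (multi-select) answers."""
    return sum(percentages.values()) > 100


for name, group in (("What", shared_what), ("Via", shared_via)):
    print(name, sum(group.values()), allows_multiple_answers(group))
    print(approx_counts(group, SHARERS))
```

The counts are approximate because the published percentages are already rounded.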
6. Reasons to share data
[Stacked bar chart: importance rating given to each reason, 0% to 100% of respondents]
• My funder requires me to share my data (N=273)
• Journal expects data underpinning findings to be accessible (N=273)
• My research community expects data sharing (N=274)
• It is good research practice to share research data (N=277)
• It enables collaboration and contribution by other researchers (N=274)
• It has public health benefits, e.g. disease outbreaks (N=265)
• Ability to respond rapidly to public health emergencies (N=263)
• Ethical obligation towards research participants to maximize benefits for society (N=266)
• Contributes to academic credentials (N=273)
• Enables validation and/or replication of my research (N=275)
• Improved visibility for my research (N=273)
• I can get credit and more citations by sharing data (N=267)
Rating scale: Not at all important, Slightly important, Moderately important, Very important, Extremely important
Source: Wellcome survey results
7. Barriers to data sharing
[Stacked bar chart: importance rating given to each barrier, 0% to 100% of respondents]
• I may lose publication opportunities if I share data (N=517)
• Others may misuse or misinterpret my data (N=519)
• I have insufficient skills to prepare the data (N=505)
• It requires time/effort to prepare my data for deposit (N=520)
• I do not have sufficient funding to prepare data for sharing (N=509)
• I do not have permission (consent) from my research participants to share data (N=510)
• Data contain confidential / sensitive information and cannot be de-identified (N=504)
• My data are commercially sensitive or has commercial value (N=501)
• There are third party rights in my data (N=499)
• No suitable repository exists for my data (N=502)
• Country-specific regulations do not allow sharing (N=486)
Rating scale: Not at all important, Slightly important, Moderately important, Very important, Extremely important
Source: Wellcome survey results
9. Significant differences in motivation
MORE IMPORTANT / LESS IMPORTANT
• Extra funding to cover costs
– established researchers | ~ | cell, development and physical science, genetic and molecular science, neuroscience and mental health, population health | infection and immunobiology
• Enhanced academic reputation
– early career researchers | ~ | researchers not sharing data now
• Co-authorship on reuse papers
– early career researchers | clinical, population health, social science researchers | cell, development and physical science, neuroscience and mental health | biomedical and humanities researchers, genetic and molecular science, infection and immunobiology
• Case study that showcases data
– LMIC researchers | ~ | humanities, infection and immunobiology, population health | cell, development and physical science, genetic and molecular science, neuroscience and mental health
• Data deposit leads to data paper publication
– early career researchers; LMIC researchers | ~ | cell, development and physical science, infection and immunobiology, neuroscience and mental health | genetic and molecular science, humanities and social sciences
• Considered favourably in funding and promotion decisions
– UK-based researchers | ~ | cell, development and physical science, genetic and molecular science, neuroscience and mental health | population health
• Ability to limit data access to specific purposes or individuals
– LMIC researchers | ~ | clinical, population health and social science researchers | biomedical researchers
• Assistance from institution or funder to prepare data
– clinical, population health and social science researchers | biomedical and humanities researchers
10. Code sharing
40% of respondents generate code:
• Researchers performing surveys, observations, experiments, secondary analysis & simulations more likely to produce code
43% of these shared code in last 5 years:
• Researchers performing simulations and secondary analysis more likely to share code
• Researchers applying qualitative methods less likely to share code
37% reuse existing code:
• Obtain from colleagues, collaborators & community repositories
• Influencing factors in code reuse: good documentation, reputable source, and open availability
Shared via institutional, community & journal services
11. Reasons to share code
[Stacked bar chart: importance rating given to each reason, 0% to 100% of respondents]
• My funder requires me to share my code (N=97)
• Journal expects code to be accessible (N=97)
• My research community expects code sharing (N=97)
• It is good research practice to share code (N=101)
• To enable collaboration and contribution (N=98)
• Contributes to my academic credentials (N=95)
• Enables validation of my research (N=97)
• Enables replication of my research (N=96)
• Improved visibility for my research (N=95)
• I can get credit and more citations by sharing code (N=91)
Rating scale: Not at all important, Slightly important, Moderately important, Very important, Extremely important
Source: Wellcome survey results
12. Code sharing benefits
[Bar chart: code sharing benefits, horizontal axis 0 to 40]
• Career benefits
• More publications
• Higher citation rate
• New collaborations
• More funding opportunities
• Financial benefit
• New patents
• Improvements to public health
• Use in health emergencies
• None
• Other
Source: Wellcome survey results
13. Code sharing barriers
[Stacked bar chart: importance rating given to each barrier, 0% to 100% of respondents]
• Desire to patent (N=210)
• Protecting intellectual property (N=213)
• Software and systems dependencies (N=213)
• I may lose publication opportunities if I share code (N=210)
• Others may misuse or misinterpret my code (N=211)
• Insufficient skills to prepare the code for public use (N=213)
• It requires time/effort to prepare my code for deposit (N=217)
• Insufficient funding to prepare code for public use (N=211)
• My code has commercial value (N=207)
• There are third party rights in my code (N=206)
• No suitable repository exists for my code (N=197)
Rating scale: Not at all important, Slightly important, Moderately important, Very important, Extremely important
Source: Wellcome survey results
14. Motivations for more code sharing
[Bar chart: motivations for more code sharing, horizontal axis 0 to 60]
• Financial incentive from my institution
• Extra funding to cover the costs
• Enhanced academic reputation
• Code access and metrics
• Knowing how others use my code
• Co-authorship on papers resulting from reuse
• Case study that showcases my code
• It is looked on more favourably in funding and promotion decisions
• Evidence of code citation
• Assistance from institution/funder staff to prepare code
• Nothing motivates me
Source: Wellcome survey results
15. Recommendations
Funding:
• Dedicated funding streams for data/code preparation
• Guidelines for describing code development & sharing in funding bid (Software Management Plan?)
• Demand for investment in support staff to help with data/code preparation
Rewards:
• Recognise data & code sharing in career progress evaluation
• Citations and co-authorship for new publications based upon shared data/code
• Build evidence of good practice – case studies.
Infrastructure:
• Utilise existing infrastructure where possible, e.g. GitHub, SourceForge, CRAN for R code, etc.
• Enhance functionality - granular access controls, big data, enhanced citation and reuse metrics
Support:
• Enhance networking / support opportunities for data/code creators and re-users
• Develop training, e.g. Software Carpentry, Software Sustainability Institute
17. Thanks to:
All researchers who contributed to the surveys and focus groups
Wellcome Trust:
David Carr
Robert Kiley
Expert advisors:
Barry Radler (University of Wisconsin)
Carol Tenopir (University of Tennessee)
David Leon, Jimmy Whitworth (LSHTM)
Frank Manista (Jisc)
Louise Corti (UK Data Service)