Presentation on the Value and Impact of Social Science Data Archives and the CESSDA SaW Toolkit
A set of 38 slides used for the Focus Group Cost-Benefit Funding Advocacy Program (Task 4.6) session at the CESSDA SaW Workshop in The Hague, 16-17 June 2016.
This was an interactive focus group repeated over two parallel sessions. It was aimed at European social science data archive staff with responsibility for bidding for funding or promotion and advocacy of the archive to key stakeholders.
The presentation covers some of the key ideas on how the CESSDA SaW funding advocacy toolkit will be structured, its components, and the key facts and approaches it will include.
We expect the cost-benefit funding advocacy toolkit under development to support the negotiation with ministries and funding organisations across Europe.
The results of the toolkit user requirements survey with responses from 24 European social science archives were presented and discussed, together with suggested approaches and content for the toolkit. 22 people attended the two sessions overall, representing a mix of countries at different stages on the development path for social science archives (none, new/emerging, mature). There was strong interest and support for the emerging toolkit together with open discussion of how it can be applied in the specific political and administrative context of different European countries.
The slide set presented here is an extended version that includes a number of hidden background/reference slides not used in the presentation. The focus group is one of a series guiding further development of the toolkit and its adoption. Each focus group is given to either (a) social science data archive staff or (b) their key stakeholders (senior management in their universities, research councils and academies, funding ministries, national statistics offices, research users and depositors).
CESSDA is the Consortium of European Social Science Data Archives. The CESSDA SaW project “Strengthening and widening the European infrastructure for social science data archives” is funded by the European Commission as part of its Horizon 2020 programme.
1. This project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement number 674939.
The Hague Workshop June 2016 - Neil Beagrie (Charles Beagrie Ltd)
Focus Group – The Value and Impact of (Social Science)
Research Data Infrastructure
Illustration by Jørgen Stamp digitalbevaring.dk CC BY 2.5 Denmark
2.
Introduction (Neil Beagrie)
• 2008-2011 Keeping Research Data Safe (KRDS 1/2/3) with UK universities, data archives and OCLC Research as partners
Preservation Cost Model for UK Universities; Longitudinal cost data + Benefits Framework; Benefits Analysis Toolkit + KRDS Factsheet & User Guide
• 2012-2016 (Economic) Value & Impact Studies with John Houghton
Economic and Social Data Service; British Atmospheric Data Centre; Archaeology Data Service; Synthesis of 3 studies; European Bioinformatics Institute
• 2015-2017 CESSDA SaW
Horizon 2020 European Project. The value and economic impact of social science data services: funding advocacy toolkit.
3.
Neil Beagrie (Charles Beagrie Ltd)
Task 4.6: Funding Advocacy Toolkit
Illustration by Jørgen Stamp digitalbevaring.dk CC BY 2.5 Denmark
4.
CESSDA-SaW Task 4.6
Capturing and communicating the value and economic impact of social science data services. Develop a benefit/cost advocacy programme and supporting tools; assemble an evidence base to support the negotiation with ministries and funding organisations; support advocacy with other core stakeholders such as data creators and data users.
Timetable
Milestone number   Milestone name                                Due date
MS26               Survey to gather and validate requirements    April 2016
MS27               Draft Toolset and testing                     Oct 2016
MS28               Focus groups and case studies                 Jan 2017
Early days…. First phase focus group today
5.
Presentation Overview
• Measuring Value and Impact of Research Data Infrastructure (briefly)
• ESDS Impact Study (briefly)
• User requirements survey (briefly)
• Toolkit ideas (briefly)
• Perception and discussion
6.
Measuring Value and Economic Impact of Research Data Infrastructure
7.
European Research Data Open Access
fuller and wider access to scientific publications and data
can help to accelerate innovation (faster to market =
faster growth); foster collaboration and avoid duplication
of effort (greater efficiency); build on previous research
results (improved quality of results); involve citizens and
society (improved transparency of the scientific
process). What is at stake is the speed of scientific
progress and the return on R&D investment
(FAQ open access to publications & data in Horizon 2020)
8.
European Research Data Open Access
Project results which are related to privacy, trade
secrets, national security, legitimate commercial
interests and to intellectual property rights shall not be
requested in open access mode. Additionally, any data,
know-how and/or information whatever their form or
nature which are held by private parties in a joint
public/private partnership prior to the research action
and have been identified as such shall also not fall under
such an open access obligation.
(FAQ open access to publications & data in Horizon 2020)
9.
Research Infrastructures and Impact: tangible and intangible capital
[Diagram labels: Buildings / Equipment; Grid / ICT Networks; Staff; Data; Intellectual Capital; Training & Skills; Human Capital; Technical / Organisational Environment; Organisational Capital; Professional Networks; Relationship Capital]
10.
Previous Work
Big Science and Innovation Study for BIS July 2013
• Desk review of c. 100 studies internationally;
• 3 studies highlighted to BIS as being particularly good
examples of ‘good practice’ in the measurement of
economic impacts:
• Berkeley Lab 2010
• Human Genome Project 2011
• Economic and Social Data Service 2012 (authors)
11.
Best Practice from ESDS study
• Applies range of methods;
• Includes counter-factual;
• Data collection tailored to different stakeholders: depositors, users, research, teaching;
• Data weighting - survey value responses weighted to reflect the overall pattern of use from weblogs;
• Case studies/KRDS benefits illustrate benefits and impact pathways.
12.
Economic measures of value
• Investment value: annual operational funding plus the costs that depositors face in preparing data for deposit and in making those deposits
• Use value: weighted average user access costs multiplied by the number of accesses
• Contingent value: the amount users are “willing to pay” for access, or “willing to accept” in return for giving up access
• Efficiency gain: user estimates of time saved by using the Data Service resources
• Return on investment in the data: the estimated increase in return on investment in the data and services due to the additional use facilitated by the Data Service (counter-factual/“cost of inaction”)
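The first two measures are simple arithmetic once the inputs are known. The Python sketch below is illustrative only: the class name, field names and every figure in it are hypothetical and are not taken from the ESDS study. It shows how investment value and use value could be computed, plus a generic benefit/cost ratio helper.

```python
from dataclasses import dataclass


@dataclass
class ArchiveYear:
    """Hypothetical annual inputs for one data service (illustrative only)."""
    operational_funding: float   # annual operational funding
    depositor_costs: float       # depositors' costs of preparing and making deposits
    avg_access_cost: float       # weighted average user access cost per access
    accesses: int                # number of accesses in the year

    def investment_value(self) -> float:
        # Investment value = operational funding + depositor costs
        return self.operational_funding + self.depositor_costs

    def use_value(self) -> float:
        # Use value = weighted average access cost x number of accesses
        return self.avg_access_cost * self.accesses


def benefit_cost_ratio(benefit: float, cost: float) -> float:
    """Generic ratio of a benefit estimate to a cost figure."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return benefit / cost


# Made-up figures, chosen only to keep the arithmetic easy to follow.
year = ArchiveYear(
    operational_funding=1_000_000.0,
    depositor_costs=250_000.0,
    avg_access_cost=50.0,
    accesses=40_000,
)
print(year.investment_value())                                          # 1250000.0
print(year.use_value())                                                 # 2000000.0
print(benefit_cost_ratio(year.use_value(), year.operational_funding))   # 2.0
```

Contingent value and efficiency gain depend on survey responses, so they are left out of this sketch rather than guessed at.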
13.
ESDS Study: Context
• UK Data Archive one of the largest European social science data archives: perhaps the largest
• UK Data Archive established for 40 years: has built up collections and users over time
• Includes Government datasets: but not all in scope of the ESDS study
• ESDS study: excluded undergraduates
• Only economic impact study for any social science data archive to date
14.
ESDS Study: Return on Investment (ROI)
Benefit/cost ratio of net economic value to ESDS operational costs: 5.4 to 1
Increase in returns on investment in data and related infrastructure arising from additional use facilitated by ESDS (counter-factual): up to 10 to 1
15.
ESDS Study: Researcher Efficiency Gains
[Bar chart. Categories: No change, 1-10%, 10-25%, 26-49%, 50-75%, >75%. Bar labels as extracted: 8%, 13%, 19%, 20%, 22%, 17%.]
16.
ESDS study: User Time/Money Savings
[Bar chart. Bar labels as extracted: 1%, 10%, 22%, 22%, 30%, 32%, 32%, 39%, 40%, 58%, 66%, 81%.]
Categories as extracted:
• Other (Please Specify)
• ESDS provided training
• Time/resources in teaching/training
• Data licences and charges from other providers
• User support materials and help desks
• Re-using data avoiding additional surveying and field testing
• Single licence point for multiple data types
• Long-term preservation of data
• Hosting and provision of data
• Data beyond scope to collect myself
• Data quality: preparation, validation and documentation of data
• Ability to find data from single point of access
17.
18.
KRDS Benefits Framework
• Framework arranged on 3 dimensions with two sub-divisions each; pick list of common generic benefits
• Individual benefits identified and assigned within this framework
[Diagram: Benefit from Curation of Research Data. Dimension shown: WHO BENEFITS? (Internal / External)]
19.
A4 Summary: Benefits Summary for ESDS
20.
ESDS Study - conclusions
• Economic benefits exceed the operational costs
• A mix of qualitative and quantitative methods is important
• But having economic numbers is seen as critical
• Results active in later funding bids for c.10 years
• How to extend work for other social science archives?
21.
CESSDA SaW Toolkit
User Requirements Survey
22.
Survey Q1
IN WHICH COUNTRY IS THE DATA SERVICE BASED?
23.
Survey Q2
THE DATA SERVICE'S CURRENT STAFFING IS
APPROXIMATELY:
24.
Survey Q3
THE DEVELOPMENT OF THE DATA SERVICE IS:
25.
Survey Q6
CAN YOU GIVE EXAMPLES OF THE ARGUMENTS OR FACTS YOU USE IN
ADVOCACY TO MAKE YOUR CASE FOR FUNDING THE DATA SERVICE?
Arguments used in advocacy (themes)
[Bar chart. Counts as extracted: 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 7, 7, 9, 10, 11.]
Themes as extracted:
• Big data
• Comparison with other countries
• Declarations of support from the community
• Project funding criteria/obligations
• Case studies for use/impact
• European funding opportunities
• ESFRI/National roadmap
• Quality of resources for teaching
• Quality of research data
• Membership/legal obligations
• Uniqueness/impact of no service
• Standards, trust, reputation
• Data sharing and collaboration
• Wider Benefits e.g. evidence-based public policy
• Benefits to Research community: Discovery and Access, support services
• Open data/access/science
• Data re-use is cost-effective
26.
Survey Q10
HOW USEFUL WOULD YOU FIND THE FOLLOWING TOOLS IN BUILDING A CASE TO
DEMONSTRATE THE ACTUAL OR POTENTIAL IMPACT OF THE DATA SERVICE TO
POLICY MAKERS AND FUNDING DEPARTMENTS?
27.
Toolkit Ideas
28.
Toolkit Ideas
• Factsheets (costs, benefits, ROI)
• User Guide
• Surveys
• Case studies
• Extras (tbc)
29.
Tool Effort Grading Levels
hours days months
30.
Effort and Use Knowledge Pyramid: Costs Example
Cost Models: e.g. KRDS, 4C, DANS
Cost Data: e.g. KRDS costs survey, 4C Costs of Curation Calculator
“Rules of Thumb”: ‘KRDS’ Law, Kryder’s Law, etc.
31.
Costs Rules of Thumb (1)
• “Kryder’s Law” – disk storage roughly halving in cost every year (comparable to Moore’s Law for processing power)
• Rules for a prolonged but not eternal period of time (“Laws”)
[Figure: Kryder slowdown. David Rosenthal. Chart by Preeti Gupta at UCSC]
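As a rough illustration of the rule of thumb above, the Python sketch below projects a unit storage cost forward under a constant annual cost factor. The function names and the starting cost are hypothetical. The default factor of 0.5 encodes the "halving every year" reading of Kryder's Law, and it is left as a parameter because the slide itself notes that such "laws" only hold for a limited period.

```python
def projected_storage_cost(cost_now: float, years: int,
                           annual_factor: float = 0.5) -> float:
    """Unit storage cost after `years`, assuming a constant annual cost factor.

    annual_factor=0.5 means the cost halves every year (assumed, not measured).
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    if not 0 < annual_factor <= 1:
        raise ValueError("annual_factor must be in (0, 1]")
    return cost_now * annual_factor ** years


def cumulative_storage_cost(cost_now: float, years: int,
                            annual_factor: float = 0.5) -> float:
    """Total paid over `years` when the cost is re-paid at each year's price."""
    return sum(projected_storage_cost(cost_now, y, annual_factor)
               for y in range(years))


# Hypothetical starting cost of 100 currency units per unit of storage.
print(projected_storage_cost(100.0, 3))    # 12.5
print(cumulative_storage_cost(100.0, 3))   # 175.0  (100 + 50 + 25)
```

A slower decline can be explored by passing a larger factor, for example `annual_factor=0.8`.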
32.
Costs Rules of Thumb (2)
“KRDS Law”: ingest costs about half, preservation about a sixth, access about a third, of the data archive life-cycle costs
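The proportions in this rule of thumb can be applied to any life-cycle total. The Python sketch below is a minimal illustration: the function name and the example total are hypothetical, and the shares are the approximate fractions quoted on the slide (a half, a sixth and a third), not measured values for any particular archive.

```python
from fractions import Fraction

# Approximate shares quoted in the "KRDS Law" rule of thumb.
KRDS_SHARES = {
    "ingest": Fraction(1, 2),
    "preservation": Fraction(1, 6),
    "access": Fraction(1, 3),
}

# The three shares account for the whole life-cycle cost.
assert sum(KRDS_SHARES.values()) == 1


def split_lifecycle_cost(total: float) -> dict:
    """Split a life-cycle cost total into ingest, preservation and access."""
    if total < 0:
        raise ValueError("total must be non-negative")
    return {
        activity: total * share.numerator / share.denominator
        for activity, share in KRDS_SHARES.items()
    }


# Hypothetical life-cycle total of 600 currency units.
print(split_lifecycle_cost(600.0))
# {'ingest': 300.0, 'preservation': 100.0, 'access': 200.0}
```

Using exact fractions for the shares keeps the check that they sum to one free of floating-point rounding.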
33.
Effort and Use Knowledge Pyramid: ROI Example
Impact Studies
Benefit-Cost Ratio
“Cost of Inaction” (counter-factual)
34.
Cost of Inaction (1)
Academic Sociology research and teaching user (ESDS Report interviewee)
Thinking about the last time you accessed ESDS data how long do you estimate it took
you to find and access it?
Only one or two minutes...
If there was no ESDS and the government did not deposit datasets, his work could not be
done at all. The cost for each ‘one-off’ access would be substantial – several thousand
pounds... Calculating any access costs and the time taken to negotiate licences etc. he
estimated ‘maybe’ £5,000 - £10,000 (assuming access could only be given by the data
providers themselves)
…What would you do and how long would it take if ESDS did not exist? He calculated
three months – if you knew there was a dataset and you were trying to get access.
35.
Cost of Inaction (2)
Wicherts and colleagues emailed requests for data from the
corresponding authors of all 141 empirical articles that had been
published in the most recent two issues of four American Psychological
Association (APA) journals. Their aim was to re-analyze the data and
assess the robustness of the research findings to outliers.
"6 months later, after writing more than 400 e-mails–and sending some
corresponding authors detailed descriptions of our study aims, approvals
of our ethical committee, signed assurances not to share data with
others, and even our full resumes-we ended up with a meager 38
positive reactions and the actual data sets from 64 studies (25.7% of the
total number of 249 data sets). This means that 73% of the authors did
not share their data."
Wicherts et al.(2006) The poor availability of psychological research data
for reanalysis. American Psychologist 61, 726–728.
36.
Value + Impact Analysis of Data Services
John Houghton (Victoria University) + Neil Beagrie (Charles Beagrie Ltd): 4 joint studies to date.
Methods applied to:
Economic & Social Data Service (ESDS)
Archaeology Data Service
British Atmospheric Data Centre
European Bioinformatics Institute
37.
Perceptions and Discussion
Relevance? (Nationally)
Relevance? (CESSDA/Europe)
38. Thanks for your attention!
website: www.cessda.net / twitter: @CESSDA_Data