SlideShare a Scribd company logo
1 of 67
A centre of expertise in digital information management
www.ukoln.ac.uk
UKOLN is supported by:
Facing the Data Challenge
: Institutions, Disciplines,
Services & Risks
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
Associate Director, UK Digital Curation Centre
1st
DCC Regional Roadshow, Bath November 2010
This work is licensed under a Creative Commons Licence
Attribution-ShareAlike 2.0
Overview
1. Facing the data challenge :
Requirements, Risks, Costs
2. Reviewing Data Support Services :
Analysis, Assessment, Priorities
3. Building Capacity & Capability :
Skills Audit
4. Developing a Strategic Plan :
Actions and Timeframe
Institutional Diversity
http://www.flickr.com/photos/mintchocicecream/7491707/
Facing the Data Challenge
Case studies
Oxford, Cambridge,
Edinburgh, Southampton
Based on DCC Curation Lifecycle Model
Disciplinary Diversity
eScience
Case studies
http://www.flickr.com/photos/30435752@N08/2892112112/
SCARP Case studies
• Atmospheric data
• Neuro-imaging
• Tele-health
• Architecture
• Mouse Atlas
http://www.dcc.ac.uk/sites/default/files/documents/publications/SCARP%20SYNTHESIS.pdf
Recommendations:
• JISC
• HE & Research
funders
• Publishers &
Learned societies
• HEIs and research
institutions
• Researchers &
scholars
http://www.data-archive.ac.uk/media/203597/datamanagement_socialsciences.pdf
http://opus.bath.ac.uk/20896/1/erim2rep100420mjd10.pdf
• Quick &
simple deposit
• Software
tools
• Laboratory
archive
• Crystallography
community engaged
• ‘Embargo’ facility
• Structured
foundations
• Discoverable &
harvestable
Data Curation Profiles
Exercise 1a: Gathering requirements
• What are the researchers’ data requirements?
• What datasets exist already? Standards?
• What are their data priorities? Skills?
• Research methodologies? Plans?
• Equipment and instrumentation? Formats?
• Where are the “pain points”?
• How will you find out? Approaches to use?
• How will you use the information?
Exercise 1b: Motivation, benefits, risks
• What are the RDM drivers and enablers for
research staff and post-grad students?
• RDM drivers and enablers for Libraries / IT /
Computing Services / Information Services?
• RDM drivers and enablers for the institution?
• What are the barriers? What are the risks?
• How will you articulate the benefits?
• How will you find out? Approaches to use?
• How will you use the information?
Exercise 1c: Costs & sustainability
• What are the costs associated with RDM?
• For the researcher?
• For the institution?
• Direct / indirect costs? Fixed / variable costs?
• What cost data already exists?
• What time horizon are you considering?
• How will you find out? Approaches to use?
• How will you use the information?
• Survey e.g. Oxford, Parse.Insight
• Focus groups : semi-structured interviews
• Case studies departmental / disciplinary
• Joint R&D projects
• Data champions in departments
• Data Preservation readiness : AIDA tool
• Data audit / assessment : DAF tool
Requirements gathering:
Approaches and tools
Benefits:
Prioritisation of resources
Capacity development and planning
Efficiency savings – move data to more
cost-effective storage
Manage risks associated with data loss
Realise value through improved access
& re-use
Scale:
Departments, institutions
Dealing with Data Report : Rec 4
• DAF Implementation
Guide October 2009
• Collating lessons of
pilot studies
• Practical examples
of questionnaires and
interview frameworks
• DAF online tool
autumn 2010
http://www.data-audit.eu/docs/DAF_Implementation_Guide.pdf
Methodology
http://www.data-audit.eu/DAF_Methodology.pdf
Data Audit /
Asset Framework
pilots May-July
2008
http://sudamih.oucs.ox.ac.uk/docs/Use%20of%20the%20DAF.pdf
http://eprints.ucl.ac.uk/15053/1/15053.pdf
Some lessons learned….
“CeRch had four false starts before finding a willing audit partner”
“Pick your moment” ….“Timing is key” (avoid exams, field trips,
Boards…)
Plan well in advance!
“Be prepared to badger senior management”
Little documentation/knowledge of what exists:“a nightmare”
Defining the scope and granularity is
crucial
Collect as much information as possible in interviews/surveys
Variable openness of staff and their data
Identifying risks
• Data loss (institution, research group,
individual)
• Increased costs (lack of planning, service
inefficency, data loss)
• Legal compliance (research funder, H&S,
ethics, FoI)
• Reputation (institution, unit, individual)
http://foiresearchdata.jiscpress.org/
Freedom of Information FAQ (Draft)
Sustainability:
Who owns?
Who benefits?
Who selects?
Who preserves?
Who pays?
Dimension 1
Direct Indirect (costs avoided)
Dimension 2
Near-term Long-term
Dimension 3
Private Public
Benefits Taxonomy: Summary
Keeping Research Data Safe2 Report: April 2010
KRDS
• Which costs?
• Effect over time?
• Benefits taxonomy
• Repository models
• Case studies
• Key cost variables
• Recommendations
• User Guide,
business templates
forthcoming 2010
Reviewing Data Support Services
Analysis, Assessment, Priorities
1. Scale, Complexity, Predictive
Potential
2. Continuum of Openness
3. Citizen Science
4. Credentials, Incentives,
Rewards
5. Institutional Readiness &
Response
6. Data Informatics Capacity &
Capability
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/publications.html#november-
2009
•Open Science at Web-Scale
A centre of expertise in digital information management
www.ukoln.ac.uk
1. Leadership
2. Policy
3. Planning
4. Audit
6. Repositories &
Quality assurance
8. Access &
Re-use
10. Community
building
9. Training & skills
Data
Informatics
Top 10
5. Engagement
7. Sustainability
Exercise 2:
Analysis, Assessment, Priorities
• Institutional stakeholders?
• Data support services?
• Range, scope, coverage?
• Gaps?
• Fitness for purpose?
• Timeliness?
• Resources?
• Skills?
• SWOT
Strengths Weaknesses (Gaps)
ThreatsOpportunities
Digital Preservation
Policies Study
High-level pointers
and guidance
Outline policy
model/framework
Mappings to
institutional
strategies
Exemplars
Report October 2008
State-of-the-Art Report :
Models & Tools (Alex Ball, June 2010)
• Data Lifecycles
• Data Policies (UK) incl DMP
• Standards & tools
• Data Asset Framework (DAF)
• DANS Seal of Approval
• Preservation metadata
• Archive management tools
• Cost / benefit tools
Jeff Haywood, RDMF V October 2010
http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
Jeff Haywood, RDMF V October 2010
http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
Jeff Haywood, RDMF V October 2010
http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
Assessing cloud options
3 JISC Reports in 2010 :
• Technical Review
• Cloud computing for
research
• Environmental &
Organisational issues
• North Carolina
universities
• Cyber-
infrastructure project
• Data cloud across
three campuses
• “regional”
•
Policy
• Data types, formats, standards, capture
• Ethics and Intellectual Property
• Access, sharing and re-use
• Short-term storage & data management
• Deposit & long-term preservation
• Adherence and review
Planning Dealing with Data Report : Rec 9
http://www.dcc.ac.uk/dmponline
DMP Online
Currently updating Version 1.0
Checklist questions mapped to funder’s data requirements
Checklist for a Data Management Plan
Slide : Martin Donnelly, DCC
DMP Online v2.0 (coming soon)
• Cleaner interface
• Funder-specific
guidance
• Versioning feature
• CSV output
Slide : Martin Donnelly, DCC
http://www.dcc.ac.uk/dmponline
DMPs next steps?
• Embed DMPs in funder policies &
research lifecycles as the norm
• Code of Conduct for Research
• Assess & review DMPs (not just
the science content of proposals)
• Educate reviewers (DCC guidance
for social science in prep)
• Manage compliance of researchers
• Infrastructure to share DMPs
• Integrate in institution research
management information system
Building a University Data registry…
Building Capacity & Capability
Data challenges?
1. Data management plans
2. Appraisal: selection criteria
3. Data retention and handover
4. Data documentation: metadata,
schema, semantics
5. Data formats: applying standards
6. Instrumentation: proprietary formats
7. Data provenance: authenticity
8. Data citation & versions: persistent IDs
9. Data validation and reproducibility
10. Data access: embargo policy
11. Data licensing
12. Data linking: text, images, software
Exercise 3: Skills Audit
• What skills do you have in house?
• What are your strengths? Core data
skills?
• Gaps? Do these matter?
• Can / should they be developed?
• How? Resource implications?
• Other sources of expertise?
• Key partnerships?
• Team science roles?
Skills Audit
Skill Source / Gap Comment
• Be specific
• Prioritise core skills
Data Access & Re-use
“Community Criteria for Interoperability”
(Scaling Up Report 2008)
• Domain data format standard: CIF
• Domain data validation standard: CheckCIF
• Metadata schema: eCrystals Application Profile
http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
• Crystallography Data Commons:
TIDCC Data Model in development
• Domain identifier: International Chemical Identifier
• Citation & linking: DOI
http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
• Embargo & Rights http://ecrystals.chem.soton.ac.uk/rights.html
Data Licensing
• Bespoke licences
• Standard licences
• Multiple licensing
• Licence mechanisms
• Forthcoming 2010
What to keep?
Repositories
Quality Assurance
Trust
Standards
Audit and certification tools
• TRAC
• DRAMBORA
• PLATTER
• NESTOR
• DANS Data Seal of Approval
Sustainability
PREMIS Data Dictionary
OAIS
• Representation Information
• Registry/Repository RRORI
http://www.loc.gov/standards/premis/
Data citation
Training
• Consortial
• Institutional
• Departmental
• Laboratory
• Project
• Library
• Computing Services
• Research staff / postdocs
• Postgraduate students
“excellent : probably the
best course I have been on
since starting my role as an
Informatics Liaison Officer”
Research Data Management Forum
http://www.dcc.ac.uk/data-forum/
1) Roles & Responsibilities
2) Value & Benefits
3) Sensitive Data: Ethics, Security, Trust
4) Economics of Applying & Sustaining
digital curation
• Online resources
• Includes training for
• Data handling
• Software
• SPSS, NVIVO
• Live arts
• Department of
Drama
• Researcher-
practitioner focus
Embedding data informatics education
...faculty & LIS...
Doctoral Training Centres
Developing a Strategic Plan
Optimising organisational support
• Organisational structures
• Library / IT / IS / research support structure
• Where does data management fit?
• Leadership?
• Co-ordination?
• Roles : data librarian, data manager,
research support officer, data scientist, data
curator...
• New roles?
New data support
structures
Exercise 4: Actions and Timeframe
• Vision and Objectives: Are they clear?
• Organisational structures: Fit for purpose?
• Library / IT / IS structure : Is it optimal?
• Roles : who is best placed to take action?
• Responsibility : for each service / activity?
• Priorities : what will you stop doing?
• Resources : Do you need to bid for
funding?
• Partnerships : Who do you need to talk to?
• Plan: What? Who? How? When?
Actions and Timeframe
Short-term
0-12 months
Medium-term
12-36 months
Long-term
>3 years
• Identify quick wins
• What can you do tomorrow?
Take homes
1. Understand the research data
requirements of your
campus / institutional
consumers
2. Agree research data service
delivery priorities
3. Define data roles and
responsibilities
4. Collaborate and strengthen
the data support provided
5. Be pro-active! Engage! Be
part of team science!
http://www.pnl.gov/science/images/highlights/computing/biopilotlg.jpg
Chicago Mart Plaza, 6-8 December 2010

More Related Content

What's hot

Research Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesResearch Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesMarieke Guy
 
Martin Lewis and Stephen Pinfield Research Data Management - where should col...
Martin Lewis and Stephen Pinfield Research Data Management - where should col...Martin Lewis and Stephen Pinfield Research Data Management - where should col...
Martin Lewis and Stephen Pinfield Research Data Management - where should col...Jisc
 
The changing world of research evaluation
The changing world of research evaluationThe changing world of research evaluation
The changing world of research evaluationJisc
 
RDM for librarians
RDM for librariansRDM for librarians
RDM for librariansSarah Jones
 
Research data management and university libraries
Research data management and university librariesResearch data management and university libraries
Research data management and university librariesJisc
 
Data Management Planning at Edinburgh
Data Management Planning at EdinburghData Management Planning at Edinburgh
Data Management Planning at EdinburghSarah Jones
 
H2020 data pilot openaire
H2020 data pilot openaireH2020 data pilot openaire
H2020 data pilot openaireSarah Jones
 
Other DCC tools and resources: a quick overview
Other DCC tools and resources: a quick overviewOther DCC tools and resources: a quick overview
Other DCC tools and resources: a quick overviewMarieke Guy
 
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Jisc
 
UK data management environment and support
UK data management environment and supportUK data management environment and support
UK data management environment and supportJisc
 
H2020 open-data-pilot
H2020 open-data-pilotH2020 open-data-pilot
H2020 open-data-pilotSarah Jones
 
DMP health sciences
DMP health sciencesDMP health sciences
DMP health sciencesSarah Jones
 
EC Open Access Co-ordination workshop - 4th May 2011
EC Open Access Co-ordination workshop - 4th May 2011EC Open Access Co-ordination workshop - 4th May 2011
EC Open Access Co-ordination workshop - 4th May 2011Jisc
 
Open science: what does success look like, and how would we know?
Open science: what does success look like, and how would we know?Open science: what does success look like, and how would we know?
Open science: what does success look like, and how would we know?Jisc
 
RDM policy and recovering costs
RDM policy and recovering costsRDM policy and recovering costs
RDM policy and recovering costsSarah Jones
 
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...RDAP14 Poster: The DCC’s institutional engagement program: changing approache...
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...ASIS&T
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansAndrew Sallans
 
Objects in Motion The Institutional Repositories Landscape
Objects in Motion The Institutional Repositories LandscapeObjects in Motion The Institutional Repositories Landscape
Objects in Motion The Institutional Repositories LandscapeGaz Johnson
 

What's hot (20)

Enhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning ProcessEnhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning Process
 
Research Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesResearch Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford Brookes
 
Martin Lewis and Stephen Pinfield Research Data Management - where should col...
Martin Lewis and Stephen Pinfield Research Data Management - where should col...Martin Lewis and Stephen Pinfield Research Data Management - where should col...
Martin Lewis and Stephen Pinfield Research Data Management - where should col...
 
The changing world of research evaluation
The changing world of research evaluationThe changing world of research evaluation
The changing world of research evaluation
 
RDM for librarians
RDM for librariansRDM for librarians
RDM for librarians
 
Research data management and university libraries
Research data management and university librariesResearch data management and university libraries
Research data management and university libraries
 
Data Management Planning at Edinburgh
Data Management Planning at EdinburghData Management Planning at Edinburgh
Data Management Planning at Edinburgh
 
H2020 data pilot openaire
H2020 data pilot openaireH2020 data pilot openaire
H2020 data pilot openaire
 
Other DCC tools and resources: a quick overview
Other DCC tools and resources: a quick overviewOther DCC tools and resources: a quick overview
Other DCC tools and resources: a quick overview
 
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015
 
UK data management environment and support
UK data management environment and supportUK data management environment and support
UK data management environment and support
 
H2020 open-data-pilot
H2020 open-data-pilotH2020 open-data-pilot
H2020 open-data-pilot
 
DMP health sciences
DMP health sciencesDMP health sciences
DMP health sciences
 
EC Open Access Co-ordination workshop - 4th May 2011
EC Open Access Co-ordination workshop - 4th May 2011EC Open Access Co-ordination workshop - 4th May 2011
EC Open Access Co-ordination workshop - 4th May 2011
 
Supporting-DMPs
Supporting-DMPsSupporting-DMPs
Supporting-DMPs
 
Open science: what does success look like, and how would we know?
Open science: what does success look like, and how would we know?Open science: what does success look like, and how would we know?
Open science: what does success look like, and how would we know?
 
RDM policy and recovering costs
RDM policy and recovering costsRDM policy and recovering costs
RDM policy and recovering costs
 
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...RDAP14 Poster: The DCC’s institutional engagement program: changing approache...
RDAP14 Poster: The DCC’s institutional engagement program: changing approache...
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for Librarians
 
Objects in Motion The Institutional Repositories Landscape
Objects in Motion The Institutional Repositories LandscapeObjects in Motion The Institutional Repositories Landscape
Objects in Motion The Institutional Repositories Landscape
 

Viewers also liked

Codes, Clouds & Constellations: Open Science in the Data Decade
Codes, Clouds & Constellations: Open Science in the Data DecadeCodes, Clouds & Constellations: Open Science in the Data Decade
Codes, Clouds & Constellations: Open Science in the Data DecadeLizLyon
 
Tax Assist Budget Summary2011
Tax Assist Budget Summary2011Tax Assist Budget Summary2011
Tax Assist Budget Summary2011Paul_Chillman
 
CMOs Share How The Job Has Gotten Easier
CMOs Share How The Job Has Gotten EasierCMOs Share How The Job Has Gotten Easier
CMOs Share How The Job Has Gotten EasierMargaret Molloy
 
Open Science at Genome Scale
Open Science at Genome ScaleOpen Science at Genome Scale
Open Science at Genome ScaleLizLyon
 
Top marketing reading list
Top marketing reading listTop marketing reading list
Top marketing reading listMargaret Molloy
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeLizLyon
 
TOP brand challenge facing B2B companies
TOP brand challenge facing B2B companies TOP brand challenge facing B2B companies
TOP brand challenge facing B2B companies Margaret Molloy
 

Viewers also liked (7)

Codes, Clouds & Constellations: Open Science in the Data Decade
Codes, Clouds & Constellations: Open Science in the Data DecadeCodes, Clouds & Constellations: Open Science in the Data Decade
Codes, Clouds & Constellations: Open Science in the Data Decade
 
Tax Assist Budget Summary2011
Tax Assist Budget Summary2011Tax Assist Budget Summary2011
Tax Assist Budget Summary2011
 
CMOs Share How The Job Has Gotten Easier
CMOs Share How The Job Has Gotten EasierCMOs Share How The Job Has Gotten Easier
CMOs Share How The Job Has Gotten Easier
 
Open Science at Genome Scale
Open Science at Genome ScaleOpen Science at Genome Scale
Open Science at Genome Scale
 
Top marketing reading list
Top marketing reading listTop marketing reading list
Top marketing reading list
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and Practice
 
TOP brand challenge facing B2B companies
TOP brand challenge facing B2B companies TOP brand challenge facing B2B companies
TOP brand challenge facing B2B companies
 

Similar to Facing the Data Challenge: Institutions, Disciplines, Services and Risks

The DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationThe DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationJisc
 
Research support-challenges
Research support-challengesResearch support-challenges
Research support-challengesSarah Jones
 
Challenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceChallenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceGarethKnight
 
RDM in higher education
RDM in higher educationRDM in higher education
RDM in higher educationSarah Jones
 
Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Keith Webster
 
RDM LIASA webinar
RDM LIASA webinarRDM LIASA webinar
RDM LIASA webinarSarah Jones
 
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...LIBER Europe
 
Supporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingSupporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingLisa Haddow
 
Educause 2015 RDM Maturity
Educause 2015 RDM Maturity Educause 2015 RDM Maturity
Educause 2015 RDM Maturity ResearchSpace
 
Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)IUPUI
 
RDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the DataRDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the DataRobin Rice
 
Dorothy Byatt JIBS-RLUK event July 2012
Dorothy Byatt JIBS-RLUK event July 2012Dorothy Byatt JIBS-RLUK event July 2012
Dorothy Byatt JIBS-RLUK event July 2012sherif user group
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopMike Moore
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...Sarah Anna Stewart
 

Similar to Facing the Data Challenge: Institutions, Disciplines, Services and Risks (20)

RDM @ UoE
RDM @ UoERDM @ UoE
RDM @ UoE
 
The DCC: Helping you curate your reputation
The DCC: Helping you curate your reputationThe DCC: Helping you curate your reputation
The DCC: Helping you curate your reputation
 
Research support-challenges
Research support-challengesResearch support-challenges
Research support-challenges
 
Challenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceChallenges in setting up an RDM Support Service
Challenges in setting up an RDM Support Service
 
RDM in higher education
RDM in higher educationRDM in higher education
RDM in higher education
 
Research Data Management Roadmap@Edinburgh
Research Data Management Roadmap@EdinburghResearch Data Management Roadmap@Edinburgh
Research Data Management Roadmap@Edinburgh
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
 
Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...Immersive informatics - research data management at Pitt iSchool and Carnegie...
Immersive informatics - research data management at Pitt iSchool and Carnegie...
 
RDM LIASA webinar
RDM LIASA webinarRDM LIASA webinar
RDM LIASA webinar
 
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in...
 
Supporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingSupporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of Stirling
 
Educause 2015 RDM Maturity
Educause 2015 RDM Maturity Educause 2015 RDM Maturity
Educause 2015 RDM Maturity
 
Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)
 
RDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the DataRDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the Data
 
Dorothy Byatt JIBS-RLUK event July 2012
Dorothy Byatt JIBS-RLUK event July 2012Dorothy Byatt JIBS-RLUK event July 2012
Dorothy Byatt JIBS-RLUK event July 2012
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference Workshop
 
RDM@Edinburgh
RDM@EdinburghRDM@Edinburgh
RDM@Edinburgh
 
RDM@Edinburgh
RDM@EdinburghRDM@Edinburgh
RDM@Edinburgh
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
 
EDINA / Data Library Overview
EDINA / Data Library OverviewEDINA / Data Library Overview
EDINA / Data Library Overview
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

Facing the Data Challenge: Institutions, Disciplines, Services and Risks

  • 1. A centre of expertise in digital information management www.ukoln.ac.uk UKOLN is supported by: Facing the Data Challenge : Institutions, Disciplines, Services & Risks Dr Liz Lyon, Director, UKOLN, University of Bath, UK Associate Director, UK Digital Curation Centre 1st DCC Regional Roadshow, Bath November 2010 This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
  • 2. Overview 1. Facing the data challenge : Requirements, Risks, Costs 2. Reviewing Data Support Services : Analysis, Assessment, Priorities 3. Building Capacity & Capability : Skills Audit 4. Developing a Strategic Plan : Actions and Timeframe
  • 5. Based on DCC Curation Lifecycle Model
  • 7. http://www.flickr.com/photos/30435752@N08/2892112112/ SCARP Case studies • Atmospheric data • Neuro-imaging • Tele-health • Architecture • Mouse Atlas
  • 8. http://www.dcc.ac.uk/sites/default/files/documents/publications/SCARP%20SYNTHESIS.pdf Recommendations: • JISC • HE & Research funders • Publishers & Learned societies • HEIs and research institutions • Researchers & scholars
  • 10. • Quick & simple deposit • Software tools • Laboratory archive • Crystallography community engaged • ‘Embargo’ facility • Structured foundations • Discoverable & harvestable
  • 12. Exercise 1a: Gathering requirements • What are the researchers’ data requirements? • What datasets exist already? Standards? • What are their data priorities? Skills? • Research methodologies? Plans? • Equipment and instrumentation? Formats? • Where are the “pain points”? • How will you find out? Approaches to use? • How will you use the information?
  • 13. Exercise 1b: Motivation, benefits, risks • What are the RDM drivers and enablers for research staff and post-grad students? • RDM drivers and enablers for Libraries / IT / Computing Services / Information Services? • RDM drivers and enablers for the institution? • What are the barriers? What are the risks? • How will you articulate the benefits? • How will you find out? Approaches to use? • How will you use the information?
  • 14. Exercise 1c: Costs & sustainability • What are the costs associated with RDM? • For the researcher? • For the institution? • Direct / indirect costs? Fixed / variable costs? • What cost data already exists? • What time horizon are you considering? • How will you find out? Approaches to use? • How will you use the information?
  • 15. • Survey e.g. Oxford, Parse.Insight • Focus groups : semi-structured interviews • Case studies departmental / disciplinary • Joint R&D projects • Data champions in departments • Data Preservation readiness : AIDA tool • Data audit / assessment : DAF tool Requirements gathering: Approaches and tools
  • 16. Benefits: Prioritisation of resources Capacity development and planning Efficiency savings – move data to more cost-effective storage Manage risks associated with data loss Realise value through improved access & re-use Scale: Departments, institutions Dealing with Data Report : Rec 4
  • 17. • DAF Implementation Guide October 2009 • Collating lessons of pilot studies • Practical examples of questionnaires and interview frameworks • DAF online tool autumn 2010 http://www.data-audit.eu/docs/DAF_Implementation_Guide.pdf
  • 19. Data Audit / Asset Framework pilots May-July 2008
  • 21. Some lessons learned…. “CeRch had four false starts before finding a willing audit partner” “Pick your moment” ….“Timing is key” (avoid exams, field trips, Boards…) Plan well in advance! “Be prepared to badger senior management” Little documentation/knowledge of what exists:“a nightmare” Defining the scope and granularity is crucial Collect as much information as possible in interviews/surveys Variable openness of staff and their data
  • 22. Identifying risks • Data loss (institution, research group, individual) • Increased costs (lack of planning, service inefficency, data loss) • Legal compliance (research funder, H&S, ethics, FoI) • Reputation (institution, unit, individual)
  • 24. Sustainability: Who owns? Who benefits? Who selects? Who preserves? Who pays?
  • 25. Dimension 1 Direct Indirect (costs avoided) Dimension 2 Near-term Long-term Dimension 3 Private Public Benefits Taxonomy: Summary Keeping Research Data Safe2 Report: April 2010
  • 26. KRDS • Which costs? • Effect over time? • Benefits taxonomy • Repository models • Case studies • Key cost variables • Recommendations • User Guide, business templates forthcoming 2010
  • 27. Reviewing Data Support Services Analysis, Assessment, Priorities
  • 28. 1. Scale, Complexity, Predictive Potential 2. Continuum of Openness 3. Citizen Science 4. Credentials, Incentives, Rewards 5. Institutional Readiness & Response 6. Data Informatics Capacity & Capability http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/publications.html#november- 2009 •Open Science at Web-Scale
  • 29. A centre of expertise in digital information management www.ukoln.ac.uk 1. Leadership 2. Policy 3. Planning 4. Audit 6. Repositories & Quality assurance 8. Access & Re-use 10. Community building 9. Training & skills Data Informatics Top 10 5. Engagement 7. Sustainability
  • 30. Exercise 2: Analysis, Assessment, Priorities • Institutional stakeholders? • Data support services? • Range, scope, coverage? • Gaps? • Fitness for purpose? • Timeliness? • Resources? • Skills? • SWOT
  • 32. Digital Preservation Policies Study High-level pointers and guidance Outline policy model/framework Mappings to institutional strategies Exemplars Report October 2008
  • 33. State-of-the-Art Report : Models & Tools (Alex Ball, June 2010) • Data Lifecycles • Data Policies (UK) incl DMP • Standards & tools • Data Asset Framework (DAF) • DANS Seal of Approval • Preservation metadata • Archive management tools • Cost / benefit tools
  • 34. Jeff Haywood, RDMF V October 2010 http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
  • 35. Jeff Haywood, RDMF V October 2010 http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
  • 36. Jeff Haywood, RDMF V October 2010 http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF5/Haywood.pdf
  • 37. Assessing cloud options 3 JISC Reports in 2010 : • Technical Review • Cloud computing for research • Environmental & Organisational issues
  • 38. • North Carolina universities • Cyber- infrastructure project • Data cloud across three campuses • “regional” •
  • 40. • Data types, formats, standards, capture • Ethics and Intellectual Property • Access, sharing and re-use • Short-term storage & data management • Deposit & long-term preservation • Adherence and review Planning Dealing with Data Report : Rec 9
  • 42. Checklist questions mapped to funder’s data requirements Checklist for a Data Management Plan Slide : Martin Donnelly, DCC
  • 43. DMP Online v2.0 (coming soon) • Cleaner interface • Funder-specific guidance • Versioning feature • CSV output Slide : Martin Donnelly, DCC http://www.dcc.ac.uk/dmponline
  • 44. DMPs next steps? • Embed DMPs in funder policies & research lifecycles as the norm • Code of Conduct for Research • Assess & review DMPs (not just the science content of proposals) • Educate reviewers (DCC guidance for social science in prep) • Manage compliance of researchers • Infrastructure to share DMPs • Integrate in institution research management information system
  • 45. Building a University Data registry…
  • 46. Building Capacity & Capability
  • 47. Data challenges? 1. Data management plans 2. Appraisal: selection criteria 3. Data retention and handover 4. Data documentation: metadata, schema, semantics 5. Data formats: applying standards 6. Instrumentation: proprietary formats 7. Data provenance: authenticity 8. Data citation & versions: persistent IDs 9. Data validation and reproducibility 10. Data access: embargo policy 11. Data licensing 12. Data linking: text, images, software
  • 48. Exercise 3: Skills Audit • What skills do you have in house? • What are your strengths? Core data skills? • Gaps? Do these matter? • Can / should they be developed? • How? Resource implications? • Other sources of expertise? • Key partnerships? • Team science roles?
  • 49. Skills Audit Skill Source / Gap Comment • Be specific • Prioritise core skills
  • 50. Data Access & Re-use “Community Criteria for Interoperability” (Scaling Up Report 2008) • Domain data format standard: CIF • Domain data validation standard: CheckCIF • Metadata schema: eCrystals Application Profile http://www.ukoln.ac.uk/projects/ebank-uk/schemas/ • Crystallography Data Commons: TIDCC Data Model in development • Domain identifier: International Chemical Identifier • Citation & linking: DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145 • Embargo & Rights http://ecrystals.chem.soton.ac.uk/rights.html
  • 51. Data Licensing • Bespoke licences • Standard licences • Multiple licensing • Licence mechanisms • Forthcoming 2010
  • 54. Quality Assurance Trust Standards Audit and certification tools • TRAC • DRAMBORA • PLATTER • NESTOR • DANS Data Seal of Approval
  • 55. Sustainability PREMIS Data Dictionary OAIS • Representation Information • Registry/Repository RRORI http://www.loc.gov/standards/premis/
  • 57. Training • Consortial • Institutional • Departmental • Laboratory • Project • Library • Computing Services • Research staff / postdocs • Postgraduate students “excellent : probably the best course I have been on since starting my role as an Informatics Liaison Officer” Research Data Management Forum http://www.dcc.ac.uk/data-forum/ 1) Roles & Responsibilities 2) Value & Benefits 3) Sensitive Data: Ethics, Security, Trust 4) Economics of Applying & Sustaining digital curation
  • 58. • Online resources • Includes training for • Data handling • Software • SPSS, NVIVO
  • 59. • Live arts • Department of Drama • Researcher- practitioner focus
  • 60. Embedding data informatics education ...faculty & LIS... Doctoral Training Centres
  • 62. Optimising organisational support • Organisational structures • Library / IT / IS / research support structure • Where does data management fit? • Leadership? • Co-ordination? • Roles : data librarian, data manager, research support officer, data scientist, data curator... • New roles?
  • 64. Exercise 4: Actions and Timeframe • Vision and Objectives: Are they clear? • Organisational structures: Fit for purpose? • Library / IT / IS structure : Is it optimal? • Roles : who is best placed to take action? • Responsibility : for each service / activity? • Priorities : what will you stop doing? • Resources : Do you need to bid for funding? • Partnerships : Who do you need to talk to? • Plan: What? Who? How? When?
  • 65. Actions and Timeframe Short-term 0-12 months Medium-term 12-36 months Long-term >3 years • Identify quick wins • What can you do tomorrow?
  • 66. Take homes 1. Understand the research data requirements of your campus / institutional consumers 2. Agree research data service delivery priorities 3. Define data roles and responsibilities 4. Collaborate and strengthen the data support provided 5. Be pro-active! Engage! Be part of team science! http://www.pnl.gov/science/images/highlights/computing/biopilotlg.jpg
  • 67. Chicago Mart Plaza, 6-8 December 2010

Editor's Notes

  1. ...and then input it all into the new DMP Online system.