A centre of expertise in digital information management
www.ukoln.ac.uk
UKOLN is supported by:
Digital Preservation
Michael Day
Digital Curation Centre
UKOLN, University of Bath
m.day@ukoln.ac.uk
Information Systems and Services, UWE, Bristol, 19 February 2013
Presentation outline
• Digital preservation overview
– Some definitions
– Technical challenges
– Organisational challenges
• Approaches to solving the problem
– Preservation Strategies
– Tools for:
• Format characterisation
• Preservation Planning
– The OAIS model:
• Preservation metadata
• Repository audit frameworks (TRAC, DRAMBORA)
• Institutional assessment tools: (DAF, CARDIO)
• Research Data Management
Definitions
• Digital preservation:
– Is mainly concerned with the sustainability of “content” for
a given period of time (probably not forever)
– Largely about ensuring “continued access” to content
– “The series of managed activities necessary to ensure
continued access to digital materials for as long as
necessary” - Digital Preservation Coalition (DPC) Digital
Preservation Definitions and Concepts list:
http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts?q=definitions
– A combination of technical, organisational and legal
challenges
Digital preservation basics
• An ongoing (lifecycle) approach to managing digital
content based on:
– The identification and adoption of appropriate
preservation strategies for content
– The collection and management of appropriate metadata
(explicit and implicit knowledge, contexts)
– The ongoing monitoring of technical contexts and the
application of preservation planning techniques
– Continual monitoring of the organisation (audit)
– Not about keeping everything, forever
A multi-faceted set of challenges
• Technical
– Strategies needed to
deal with ongoing
obsolescence and
scale
• Organisational
– Access and reuse
– Authenticity and
integrity
– Sustainability (costs)
– Legal
– Deciding what needs to
be retained
Technical challenges (1)
• Physical
– Bits stored on a physical medium (or in the cloud?)
– Focus 20 years ago was on new media types (e.g. optical
storage technologies) as a panacea
– Bit-level preservation is still important – the first layer in a
viable preservation strategy
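Bit-level preservation is usually checked with fixity information: a checksum recorded when content is stored and recomputed later to confirm the bits are unchanged. The following is a minimal sketch only. The choice of SHA-256 and the function names `compute_checksum` and `verify_fixity` are assumptions for illustration; the slides do not prescribe an algorithm or a tool.

```python
import hashlib


def compute_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return compute_checksum(path) == expected_checksum
```

In practice the recorded checksum would be stored separately from the content (for example in preservation metadata) so that both cannot be damaged by the same failure.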
Obsolete media
Image courtesy of Frank Carey
Exhibition at NASA White
Sands Test Facility, 2009
Technical challenges (2)
• Hardware and software dependence
– Most digital objects are dependent on particular
configurations of hardware and software
– Relatively short obsolescence cycles
Hardware and software dependence
Image courtesy of Frank Carey
Exhibition at NASA White
Sands Test Facility, 2009
Conceptual challenges (1)
• What is a digital object?
– Some are analogues of traditional objects, e.g. meeting
minutes, research papers
– Others are not, e.g. Web pages, blogs, GIS, 3D models
of chemical structures, research data more generally
• Complexity
• Dynamic nature
• Interactivity
– Born digital vs. product of digitisation initiatives
– Logical layer between physical storage of bits and the
conceptual objects that need preservation (includes data
types, formats, etc.)
Conceptual challenges (2)
• Need to identify and document the “significant
properties” (or characteristics) of content:
– Recognises that preservation is context dependent, even
user specific (OAIS concept of 'designated community')
– Helps with choosing an acceptable preservation strategy
• Compare the ‘performance model’ developed by the
National Archives of Australia (2002) - “The source of
a record is a fixed message that interacts with
technology. This message provides the record’s
unique meaning, but by itself is meaningless to
researchers since it needs to be combined with
technology in order to be rendered as its creator
intended. The process is the technology required to
render meaning from the source”
– Focus on re-use (e.g., data curation)
Organisational challenges (1)
• Sustainability:
– Ultimately the sustainability of content depends upon the
long-term sustainability of organisations
• Focus on business models
• Embedding preservation into the core task of organisations
– Organisational commitment:
• “An institutional repository needs to be a service with
continuity behind it … Institutions need to recognise that
they are making commitments for the long term” Clifford
Lynch
• Need for policy development
– Incentives for preservation:
• Clarity on roles and responsibilities needed
• Who benefits? Who pays? “Free riding?”
Organisational challenges (2)
• Economic perspectives:
– Blue Ribbon Task Force on Sustainable Digital
Preservation and Access: http://brtf.sdsc.edu/
• Final report (Feb 2010) “Ensuring that valuable digital
assets will be available for future use is not simply a
matter of finding sufficient funds. It is about mobilizing
resources - human, technical, and financial - across a
spectrum of stakeholders diffuse over both space and
time. But questions remain about what digital
information we should preserve, who is responsible
for preserving, and who will pay.”
– JISC-funded LIFE (Life Cycle Information for E-Literature)
has developed a predictive costing tool:
http://www.life.ac.uk/
Organisational challenges (3)
• The challenge of scale:
– The Web
– Digitised “textual” content:
• Google Books
• DPLA / Europeana
– The “data deluge” in e-Science:
• New generations of instruments, computer
simulations
• Many terabytes generated per day, petabyte scale
computing (and growing)
• Cory Doctorow, “Welcome to the petacentre.” Nature,
455, pp 17-21, 4 Sep 2008
Organisational challenges (4)
• The need for collaboration:
– Need for 'deep-infrastructure' for preservation recognised
as far back as 1996 by the Task Force on Archiving of
Digital Information
• Digital preservation involves the "grander problem of
organizing ourselves over time and as a society ... [to
manoeuvre] effectively in a digital landscape" (p. 7)
– Building on existing networks
– Role for national-level co-ordination:
• Digital Preservation Coalition (DPC), nestor
(Germany), National Digital Information Infrastructure
and Preservation Program (NDIIPP)
Organisational challenges (5)
• Learn the lessons from
the past:
– Things will go wrong
– Do what you can to
enable recovery from
disaster
– Digital technologies
support replication
(avoid relying on a
single point of failure)
Digital preservation strategies (1)
• Main approaches:
– Technology preservation (e.g., computing museums)
– Digital archaeology (a post hoc approach)
– Emulation (focusing on the environment, often used
where look-and-feel is important, e.g. computer games)
– Migration (focusing on the content)
• A mature approach: A set of organised tasks
designed to achieve the periodic transfer of digital
information from one hardware and software
configuration to another, or from one generation of
computer technology to a subsequent one - CPA/RLG
report (1996)
Digital preservation strategies (2)
• Preservation strategies are not in competition
– Different strategies can work together; there may be value in
diversification
– Migration strategies mean difficult choices need to be
made about target formats
• But the strategy chosen has implications for:
– The technical infrastructure required (and metadata)
– Collection management priorities
– Rights management
• Owning the rights to re-engineer software
– Costs
Digital preservation strategies (3)
• Tools for format characterisation and validation
– DROID - Digital Record Object Identification (based on
the PRONOM registry)
• Very important to know what types (formats) of
content exist in a particular collection (e.g.,
institutional repository or Web archive)
• Performs batch identification of file formats
• http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
– JHOVE - JSTOR/Harvard Object Validation Environment
• Used for format validation
• http://hul.harvard.edu/jhove/
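Batch format identification of the kind described above can be illustrated with a toy signature matcher. This is a sketch under stated assumptions, not DROID's implementation or API: the small `SIGNATURES` table is hand-written from well-known file "magic numbers" rather than taken from the PRONOM registry, and the function names are hypothetical.

```python
from collections import Counter

# A tiny, hand-written signature table (assumption: real tools use a far
# larger registry with offsets, versions and container handling).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data):
    """Guess a format name from the leading bytes of a file."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def summarise_formats(contents):
    """Count identified formats across a mapping of name -> leading bytes."""
    return Counter(identify_format(data) for data in contents.values())
```

A format profile like the one returned by `summarise_formats` is the sort of collection-level summary that helps decide which formats need attention first. Note that identification is not validation: checking that a file conforms to its format specification is the role the slides assign to JHOVE.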
Digital preservation strategies (4)
• Plato preservation planning tool
– Developed by EU Planets project
– A decision support tool that helps users explore the
evaluation of potential preservation solutions against
specific requirements and for building a plan for
preserving a given set of objects
– Integrates file format identification (using DROID); some
migration services; XML-based generic format
characterisation using XCL (eXtensible Characterisation
Languages)
– More info: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
– Integration with repositories tested by JISC KeepIt
project: http://preservation.eprints.org/keepit/
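Evaluating candidate preservation solutions against specific requirements can be sketched as a weighted comparison. The example below is an assumption-laden illustration, not Plato's actual algorithm: it uses a simple weighted-sum score, and the criteria, weights and function names are hypothetical.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores into one value using relative weights."""
    total_weight = sum(weights.values())
    return sum(scores[criterion] * weight
               for criterion, weight in weights.items()) / total_weight


def rank_alternatives(alternatives, weights):
    """Return alternative names ordered from best to worst weighted score."""
    return sorted(
        alternatives,
        key=lambda name: weighted_score(alternatives[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical criteria and scores (0 = unacceptable, 5 = excellent).
    weights = {"content_preserved": 3, "cost": 1, "tool_support": 2}
    alternatives = {
        "migrate_to_pdfa": {"content_preserved": 4, "cost": 3, "tool_support": 5},
        "keep_original": {"content_preserved": 5, "cost": 5, "tool_support": 1},
    }
    for name in rank_alternatives(alternatives, weights):
        print(name, round(weighted_score(alternatives[name], weights), 2))
```

The value of writing requirements down this way is less the final number than the documented, repeatable reasoning behind the chosen plan.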
OAIS Reference Model (ISO 14721)
OAIS Functional Entities (Figure 4-1)
http://public.ccsds.org/publications/archive/650x0m2.pdf
Preservation metadata
• Metadata and documentation is vitally important
– Relates to OAIS concepts like Representation
Information and Preservation Description Information
– Functions:
• Enables resource discovery - supports the
development of finding aids
• Records meaning (structure and semantics)
• Records context and provenance (authenticity)
– Standards that support digital preservation activities:
• PREMIS Data Dictionary (for core metadata):
http://www.loc.gov/standards/premis/
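Recording context and provenance usually means logging what was done to an object, when, by whom, and with what outcome. The sketch below is loosely inspired by the idea of preservation events; the field names and helper functions are illustrative assumptions and are not the exact semantic units defined in the PREMIS Data Dictionary.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PreservationEvent:
    object_id: str   # identifier of the digital object acted upon
    event_type: str  # e.g. "ingest", "fixity check", "migration"
    outcome: str     # e.g. "pass", "fail"
    agent: str       # person or software responsible
    timestamp: str   # ISO 8601 date and time of the event


def record_event(object_id, event_type, outcome, agent, when=None):
    """Create an event record, defaulting to the current UTC time."""
    when = when or datetime.now(timezone.utc)
    return PreservationEvent(object_id, event_type, outcome, agent,
                             when.isoformat())


def to_json(event):
    """Serialise an event so it can be stored alongside the object."""
    return json.dumps(asdict(event), sort_keys=True)
```

An accumulated list of such records gives an audit trail that supports claims about authenticity.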
Repository audit frameworks (1)
• Repository audit frameworks first developed out of the
OAIS Reference Model (ISO 14721)
– OAIS Mandatory Responsibilities (only six of them):
• The main focus was on technical and organisational
aspects, e.g.:
– That repositories ensure that preserved
information (content) can be understood
(independently understandable)
– That documented policies and procedures are
being followed
• No clear concept of OAIS “compliance”
Repository audit frameworks (2)
• ISO 16363:2012 -- Audit and certification of trustworthy digital
repositories
– Trusted Repositories Audit and Certification (TRAC)
– Criteria cover three main aspects:
• Organisational Infrastructure
– Governance and viability, structure and staffing,
financial sustainability, contracts, etc.
• Digital Object Management
– Ingest, preservation planning, archival storage, etc.
• Infrastructure and security risk management
– Systems and infrastructure, etc.
– A basis for certification
– http://public.ccsds.org/publications/archive/652x0m1.pdf
TRAC Checklist example page
Repository audit frameworks (3)
• DRAMBORA (Digital Repository Audit Method Based on Risk
Assessment)
– Developed by the Digital Curation Centre and Digital
Preservation Europe
– “Presents a methodology for self-assessment, encouraging
organisations to establish a comprehensive self-awareness of
their objectives, activities and assets before identifying,
assessing and managing the risks implicit within their
organisation”
– Identifying risks and scoring each one on likelihood and impact
– Covers: organisational context, policies, assets, risks, etc.
– Online tool: http://www.repositoryaudit.eu/
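Scoring each risk on likelihood and impact can be sketched as a simple risk register. The 1 to 5 scales, the multiplication of the two values, and the names `Risk` and `prioritise` are assumptions for illustration; the slides do not give DRAMBORA's own scales or formula.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self):
        """Combined severity: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritise(risks):
    """Return risks ordered from highest to lowest combined score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


if __name__ == "__main__":
    # Hypothetical register entries.
    register = [
        Risk("Loss of key staff", likelihood=3, impact=4),
        Risk("Storage media failure", likelihood=2, impact=5),
        Risk("Format obsolescence", likelihood=4, impact=2),
    ]
    for risk in prioritise(register):
        print(risk.score, risk.name)
```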
Repository audit frameworks (4)
• A means of "asking the right questions" about repositories
(and the wider organisation) and documenting appropriate
procedures and risks
• More than one role:
– External badge of quality (a "certified preservation
repository")
• DINI-Zertifikat für Dokumenten- und Publikationsservices:
http://www.dini.de/english/dini-certificate/
• ISO 16363
– Management tool for self assessment
Core repository principles (1)
• Ten Principles - agreed 2007 by CRL (US), Digital Curation
Centre (UK), Nestor (Germany) and Digital Preservation
Europe
– The repository commits to continuing maintenance of digital
objects for identified community/communities.
– Demonstrates organizational fitness (including financial,
staffing structure, and processes) to fulfill its commitment.
– Acquires and maintains requisite contractual and legal rights
and fulfills responsibilities.
– Has an effective and efficient policy framework.
– Acquires and ingests digital objects based upon stated criteria
that correspond to its commitments and capabilities.
Core repository principles (2)
• Ten principles (continued)
– Maintains/ensures the integrity, authenticity and usability of
digital objects it holds over time.
– Creates and maintains requisite metadata about actions taken
on digital objects during preservation as well as about the
relevant production, access support, and usage process
contexts before preservation.
– Fulfills requisite dissemination requirements.
– Has a strategic program for preservation planning and action.
– Has technical infrastructure adequate to continuing
maintenance and security of its digital objects.
• Available: http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying/core-re
Digital preservation basics (reprise)
• An ongoing (lifecycle) approach to managing digital
content based on:
– The identification and adoption of appropriate
preservation strategies for content
– The collection and management of appropriate metadata
(explicit and implicit knowledge, contexts)
– The ongoing monitoring of technical contexts and the
application of preservation planning techniques
– Continual monitoring of the organisation (audit)
– Not about keeping everything, forever
“It is always a mistake for a historian to try and predict the future.
Life, unlike science, is simply too full of surprises” - Richard J.
Evans, In defence of history (1997, p. 62)
Further reading
– DPC Technology Watch reports:
http://www.dpconline.org/advice/technology-watch-reports
– Blue Ribbon Task Force on Sustainable Digital Preservation
and Access, Final Report (NSF, 2010) http://brtf.sdsc.edu/
– Digital Preservation Coalition, Digital preservation handbook:
http://www.dpconline.org/advice/preservationhandbook/
– Marieke Guy, JISC Beginner’s Guide to Digital Preservation
(UKOLN, 2010) http://blogs.ukoln.ac.uk/jisc-beg-dig-pres/
– Digital Preservation Coalition and Digital Curation Centre,
What’s New (monthly current awareness bulletin):
http://www.dpconline.org/newsroom/whats-new
– JISC infoNet, Digital repositories infoKit:
http://www.jiscinfonet.ac.uk/infokits/repositories
– Paradigm Project, Workbook on Digital Private Papers:
http://www.paradigm.ac.uk/workbook/index.html
Web links:
– Digital Preservation Coalition: http://www.dpconline.org/
– Abby Smith talk (2011) at Yale: http://youtu.be/Yk9ccNP9xTk
– Plato Preservation Planning tool:
http://www.ifs.tuwien.ac.at/dp/plato/intro.html
– RSP briefing paper on preservation and storage formats:
http://www.rsp.ac.uk/pubs/briefingpapers-docs/technical-preservformats.pdf
– PRESERV project: http://preservation.eprints.org/
– KeepIt project: http://preservation.eprints.org/keepit/
– WePreserve cartoons at:
http://www.youtube.com/user/wepreserve
Available: http://youtu.be/PGFOZLecjTc
Research Data Management:
activities, roles and requirements
Introduction and overview
• What is research data management?
– Caring for,
– Facilitating access to,
– Preserving and
– Adding value to digital research data throughout its
lifecycle.
• Rationale (researchers, institutions)
• Who is involved and how?
• Roles and responsibilities?
Researcher perspectives (1)
• Managing and sharing data is simply part of good
research practice:
– Adhering to disciplinary and/or institutional codes of
practice and policies
– Has been practiced since the advent of modern science,
but not always consistently; data intensive research
makes it even more critical
– Meeting the specific requirements of funding bodies
– Reputational risks if data management is not handled
properly
Researcher perspectives (2)
• Potential benefits:
– Scholarly communication/access to data
– Re-purposing and re-use of data
– Stimulating new networks/collaborations & new research
– Knowledge transfer to industry
– Verification of research/research integrity
– Re-purposing data for new audiences
– Secure storage for data intensive research
– Availability of data underpinning journal articles
– Increased visibility/citation
Keeping Research Data Safe Factsheet
http://www.beagrie.com/KRDS_Factsheet_0910.pdf
Institutional perspectives
• Institutional drivers
– Safeguarding research integrity
– Increasing number of FOI requests for data
– Adhering to existing codes of research practice and
ethics
– Developing new institution-wide strategies, policies and
services for data storage and management
– Increased institutional focus on research management
(e.g., in response to REF)
– Benchmarking – self-assessing infrastructure and
planning for improvement
– More demands but fewer resources to work with
Codes of practice for research
• UK Research Integrity Office Code of Practice for Research (2009)
– Data management planning is an essential part of research design
– Organisations should have in place procedures, resources (including
physical space) and administrative support to assist researchers in the
accurate and efficient collection of data and its storage in a secure and
accessible form [3.12.5]
• RCUK Code of Conduct on the Governance of Good Research Conduct
(2011)
– Primary data and research evidence [should be made] accessible to
others for reasonable periods after the completion of the research: data
should normally be preserved and accessible for 10 yrs (in some cases
20 yrs or longer)
– Responsibility for proper management and preservation of data and
primary materials is shared between the researcher and the research
organisation [although deposit within national collections is endorsed]
Funding body perspectives (1)
• UK Research Councils
– Help fund some data archives, e.g.:
• Archaeology Data Service, European Bioinformatics
Institute, the NERC data centres, UK Data Archive
– Support for JISC (and DCC)
– RCUK Common Principles on Data Policy
• Recognises that data are a critical output of the
research process
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
Funding body perspectives (2)
• RCUK Common Principles on Data (in a nutshell)
– Publicly funded research data should be made openly available
– Data with acknowledged long-term value should be preserved
and remain accessible and usable for future research
– Sufficient metadata should be recorded to enable other
researchers to find and understand the research to enable
re-use; published results should always include information on
how to access the supporting data
– Recognition that there may be legal, ethical and commercial
constraints
– Recognition that researchers may need privileged use of data
for a limited period
– All users of research data should acknowledge their sources
– Appropriate to use public funds to support the management of
research data (MRD)
Funding body perspectives (3)
• Changing expectations of funding bodies:
– Institutions need to inform themselves about main funder
policies (mandates) with respect to research data
management
– There is an explicit link between research income and
appropriate data management infrastructures
Funding body perspectives (4)
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
EPSRC expectations (1)
• EPSRC policy (2011) expected all institutions receiving
grant funding:
– To develop a roadmap aligning their policies and
processes with EPSRC’s expectations by 1st May 2012
– To be fully compliant with these expectations by 1st May
2015
EPSRC expectations (2)
• Examples:
– Appropriate metadata (including unique IDs) to be made
freely available on the Internet within 12 months of data
generation
– Data not generated in digital format should be stored in a
manner to facilitate it being shared
– Data should be securely preserved for a minimum of 10
years after privileged access expires or the last date
access was requested by a third party
– Adequate resources from existing funding streams
– EPSRC will monitor progress and compliance, and
reserves the right to impose appropriate sanctions
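The 10-year retention expectation above can be turned into a simple date calculation. This is a sketch under an explicit assumption: the "or" is read as "whichever date is later", so a third-party access request restarts the clock. The function names are hypothetical, and institutions should check the funder's own wording.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum period stated in the expectation above


def add_years(start, years):
    """Add whole years to a date, mapping 29 February to 28 February."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return start.replace(year=start.year + years, day=28)


def earliest_disposal_date(privileged_access_expiry,
                           last_third_party_request=None):
    """Earliest date on which the minimum retention period has elapsed."""
    start = privileged_access_expiry
    if (last_third_party_request is not None
            and last_third_party_request > start):
        # Assumption: a later access request resets the retention clock.
        start = last_third_party_request
    return add_years(start, RETENTION_YEARS)
```

For example, under this reading, data whose privileged access period ended on 1 May 2014 and which was last requested on 10 March 2017 would need to be kept until at least 10 March 2027.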
Funding body perspectives (5)
• Implications for researchers and institutions:
– Increasing number of research councils and funding bodies with
data management and sharing requirements
– Potential loss of research income if these mandates are not met
– Need to determine the costs associated with short- and
longer-term management and curation, and to request funds as
part of the grant
– Responsibility for infrastructure shifting more to HEIs and less
to centralised data archives, but institutional infrastructures and
services are still emerging
– Need guidance - some good external support
– But also need more local support; often fragmented (need to
draw upon existing channels within institutions wherever
possible)
Who needs to be involved?
• Funding bodies
• Archives / long-term data repositories
• At institutions:
– Senior management
– Researcher(s)
– Research support officers / project staff
– Lab technicians
– Librarians / Data Centre staff
– Faculty ethics committees
– Institutional legal / IP advisors
– FOI officer / DPA officer / records manager
– Computing support
– Institutional compliance officers
Activities, roles, requirements (1)
• Requirements gathering
– Identifying researchers’ data requirements
– Developing a shared understanding of what needs to be
done (e.g., identifying where data exist, its form and
scale, any existing retention requirements)
– Identifying good practice within the institution (and the
opposite)
– Methods: surveys, focus groups, case studies, joint R&D
projects, assessment tools (e.g. DCC Data Asset
Framework)
Activities, roles, requirements (2)
• Identifying motivations and benefits
– For researchers, support services, the institution
• Identifying risks
– Data loss (institution, research group, individual)
– Increased costs (lack of planning, service inefficiency,
data loss)
– Legal compliance (research funder, H&S, ethics, FoI)
– Reputation (institution, unit, individual)
• Identifying costs
– Keeping Research Data Safe (KRDS) toolkit
Activities, roles, requirements (3)
• Assessing institutional preparedness
– Identifying institutional stakeholders, existing data support
services, gaps
– Benchmarking and planning for the future
– Skills audit
– DCC CARDIO tool
• Policy development
– Policies – approval by senior management is just the start;
policies need to be embedded in research practice and
responsive to changing requirements
• Data management planning
– DMP online, DCC How-to Develop a Data Management Plan
guide
Activities, roles, requirements (4)
• Implementation and service development
– Integrating where possible with existing services, e.g. IR,
CRIS, VRE, HPC, cloud services, social media, etc.
– Appraisal, deciding what needs to be kept and for how
long
– Storage choices – no one-size-fits-all solution, e.g.
Bristol’s BluePeta petascale storage facility, Bath’s X-Drive
approach, cloud approaches
– Data documentation and metadata – layered
approaches: top-level discovery (core metadata,
collection/experiment-level?), role of standards like
DCMI, CERIF, DDI, etc.
Activities, roles, requirements (5)
• Data issues:
– Appraisal: selection criteria, retention periods (who
decides?)
• DCC How to appraise and select research data for
curation guide
– Documentation: metadata, schema, semantics
– Formats: proprietary formats, community standards, etc.
– Provenance and authenticity
– Citation (assignment of persistent IDs?)
– Access (embargo policies?)
– Licensing
• DCC How to license research data guide
DCC institutional assessment tools
• Data Asset Framework: http://www.data-audit.eu/
– Analysing institutional requirements and holdings
– Find out what data exists, where it is stored, formats, metadata,
etc.
• CARDIO (Collaborative Assessment of Research Data Infrastructure):
http://cardio.dcc.ac.uk/
– Evaluating data management requirements, activity, and capacity
– Building consensus between data creators, information managers and
service providers
– Identifying practical goals for improvement in data management
provision and support
– Identifying operational inefficiencies and potential opportunities for cost
saving
– Making a case to senior managers for investment in
data management support
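The first step of a data audit, finding out what data exists, where it is stored and in which formats, can be partly automated for file-based holdings. The sketch below is an illustration only and is not part of the Data Asset Framework or CARDIO tools. It assumes that file extensions are an acceptable first proxy for format, and the function name `inventory` is hypothetical.

```python
import os
from collections import defaultdict


def inventory(root):
    """Summarise files under a directory by extension: count and total bytes."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for folder, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            # Files without an extension are grouped under "(none)".
            extension = os.path.splitext(filename)[1].lower() or "(none)"
            entry = summary[extension]
            entry["files"] += 1
            entry["bytes"] += os.path.getsize(os.path.join(folder, filename))
    return dict(summary)
```

A summary like this only covers the technical side; interviews and surveys are still needed to learn who owns the data, what it means and how long it must be kept.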
Further reading (research data)
– Digital Curation Centre briefing papers and How-to-Guides:
http://www.dcc.ac.uk/resources/how-guides
– Royal Society, Science as an open enterprise (June 2012):
http://royalsociety.org/policy/projects/science-public-enterprise/report/
– Graham Pryor, (ed.) Managing research data (London: Facet
Publishing, 2012). ISBN: 978-1-85604-756-2
– Neil Beagrie, Brian Lavoie and Matthew Woollard, Keeping research
data safe 2 (JISC, 2010): http://www.beagrie.com/publications.php
– Neil Beagrie, Julia Chruszcz, and Brian Lavoie, Keeping research data
safe: a cost model and guidance for UK universities (JISC, 2008):
http://www.beagrie.com/publications.php
– Liz Lyon, Dealing with data; roles, rights, responsibilities and
relationships (JISC, 2007): http://opus.bath.ac.uk/412/
– National Science Board, Long-lived digital data collections: enabling
research and education in the 21st century (NSF, 2005):
http://www.nsf.gov/pubs/2005/nsb0540/
Questions?
Acknowledgments
• The Digital Curation Centre (DCC) is a world-leading centre
of expertise in digital information curation with a focus on
building capacity, capability and skills for research data
management across the UK's higher education research
community. The DCC is funded by JISC.
• More information is available from:
http://www.dcc.ac.uk/
• UKOLN receives support from JISC and the University of
Bath, where it is based.
• More information is available from:
http://www.ukoln.ac.uk/
Thank you!

Building a Collection of the Historical UK Web for scholarly use
 

Semelhante a Digital Preservation (UWE)

Developing a Community Capability Model Framework for data-intensive research
Developing a Community Capability Model Framework for data-intensive researchDeveloping a Community Capability Model Framework for data-intensive research
Developing a Community Capability Model Framework for data-intensive research
Michael Day
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
Smita Chandra
 
Approaches to dig_mg
Approaches to dig_mgApproaches to dig_mg
Approaches to dig_mg
Marieke Guy
 

Semelhante a Digital Preservation (UWE) (20)

Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Developing a Community Capability Model Framework for data-intensive research
Developing a Community Capability Model Framework for data-intensive researchDeveloping a Community Capability Model Framework for data-intensive research
Developing a Community Capability Model Framework for data-intensive research
 
An Introduction to Digital Preservation
An Introduction to Digital PreservationAn Introduction to Digital Preservation
An Introduction to Digital Preservation
 
Digital preservation: an introduction
Digital preservation: an introductionDigital preservation: an introduction
Digital preservation: an introduction
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Corrado -- Establishing the Landscape
Corrado -- Establishing the LandscapeCorrado -- Establishing the Landscape
Corrado -- Establishing the Landscape
 
Hans Hofman - European Perspectives on Digital Preservation
Hans Hofman - European Perspectives on Digital PreservationHans Hofman - European Perspectives on Digital Preservation
Hans Hofman - European Perspectives on Digital Preservation
 
UKOLN Programme Support for the JISC Research Information Management Programme
UKOLN Programme Support for the JISC Research Information Management ProgrammeUKOLN Programme Support for the JISC Research Information Management Programme
UKOLN Programme Support for the JISC Research Information Management Programme
 
Brief Introduction to Digital Preservation
Brief Introduction to Digital PreservationBrief Introduction to Digital Preservation
Brief Introduction to Digital Preservation
 
Data curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation CentreData curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation Centre
 
Approaches to dig_mg
Approaches to dig_mgApproaches to dig_mg
Approaches to dig_mg
 
Preservation Issues: Other Sources of Information and Next Steps
Preservation Issues:Other Sources of Information and Next StepsPreservation Issues:Other Sources of Information and Next Steps
Preservation Issues: Other Sources of Information and Next Steps
 
Digital Preservation: Other Sources of Information
Digital Preservation: Other Sources of InformationDigital Preservation: Other Sources of Information
Digital Preservation: Other Sources of Information
 
Preservation Issues: Other Sources of Information and Next Steps
Preservation Issues:Other Sources of Information and Next StepsPreservation Issues:Other Sources of Information and Next Steps
Preservation Issues: Other Sources of Information and Next Steps
 
Trm Trusted Repositories
Trm Trusted RepositoriesTrm Trusted Repositories
Trm Trusted Repositories
 
The Dark Side of Digital Preservation: Distributed Digital Preservation
The Dark Side of Digital Preservation: Distributed Digital PreservationThe Dark Side of Digital Preservation: Distributed Digital Preservation
The Dark Side of Digital Preservation: Distributed Digital Preservation
 
Digitization for Access and Preservation: The Evolving Debate in the Cultural...
Digitization for Access and Preservation: The Evolving Debate in the Cultural...Digitization for Access and Preservation: The Evolving Debate in the Cultural...
Digitization for Access and Preservation: The Evolving Debate in the Cultural...
 
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
 

Mais de Michael Day

Mais de Michael Day (19)

What can libraries do for researchers?
What can libraries do for researchers?What can libraries do for researchers?
What can libraries do for researchers?
 
Digital Curation 101 (University of Glamorgan)
Digital Curation 101 (University of Glamorgan)Digital Curation 101 (University of Glamorgan)
Digital Curation 101 (University of Glamorgan)
 
Continuity and change: Opportunities and challenges for the future of researc...
Continuity and change: Opportunities and challenges for the future of researc...Continuity and change: Opportunities and challenges for the future of researc...
Continuity and change: Opportunities and challenges for the future of researc...
 
Introduction to research data management
Introduction to research data managementIntroduction to research data management
Introduction to research data management
 
Introduction to Research Data Management: activities, roles and requirements
Introduction to Research Data Management: activities, roles and requirementsIntroduction to Research Data Management: activities, roles and requirements
Introduction to Research Data Management: activities, roles and requirements
 
UKOLN activities on research information management
UKOLN activities on research information managementUKOLN activities on research information management
UKOLN activities on research information management
 
EASTER project
EASTER projectEASTER project
EASTER project
 
Digital preservation exercises
Digital preservation exercisesDigital preservation exercises
Digital preservation exercises
 
Curation of Research Data
Curation of Research DataCuration of Research Data
Curation of Research Data
 
Digital preservation from a records management perspective
Digital preservation from a records management perspectiveDigital preservation from a records management perspective
Digital preservation from a records management perspective
 
The Improving Access to Text (IMPACT) project and other European initiatives
The Improving Access to Text (IMPACT) project and other European initiativesThe Improving Access to Text (IMPACT) project and other European initiatives
The Improving Access to Text (IMPACT) project and other European initiatives
 
Repositories and digital preservation
Repositories and digital preservationRepositories and digital preservation
Repositories and digital preservation
 
Enhancing social tagging with a knowledge organization system
Enhancing social tagging with a knowledge organization systemEnhancing social tagging with a knowledge organization system
Enhancing social tagging with a knowledge organization system
 
Disciplinary and institutional perspectives on digital curation
Disciplinary and institutional perspectives on digital curationDisciplinary and institutional perspectives on digital curation
Disciplinary and institutional perspectives on digital curation
 
Introduction to digital curation
Introduction to digital curationIntroduction to digital curation
Introduction to digital curation
 
DCC 101: Preservation
DCC 101: PreservationDCC 101: Preservation
DCC 101: Preservation
 
Digital Curation 101: Preserve
Digital Curation 101: PreserveDigital Curation 101: Preserve
Digital Curation 101: Preserve
 
Moving OA to the scientific enterprise
Moving OA to the scientific enterpriseMoving OA to the scientific enterprise
Moving OA to the scientific enterprise
 
Metadata for digital long-term preservation
Metadata for digital long-term preservationMetadata for digital long-term preservation
Metadata for digital long-term preservation
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Digital Preservation (UWE)

  • 1. Digital Preservation
      Michael Day, Digital Curation Centre, UKOLN, University of Bath (m.day@ukoln.ac.uk)
      Information Systems and Services, UWE, Bristol, 19 February 2013
      UKOLN: a centre of expertise in digital information management (www.ukoln.ac.uk)
  • 2. Presentation outline
      • Digital preservation overview
          – Some definitions
          – Technical challenges
          – Organisational challenges
      • Approaches to solving the problem
          – Preservation strategies
          – Tools for:
              • Format characterisation
              • Preservation planning
          – The OAIS model:
              • Preservation metadata
              • Repository audit frameworks (TRAC, DRAMBORA)
              • Institutional assessment tools (DAF, CARDIO)
      • Research Data Management
  • 3. Definitions
      • Digital preservation:
          – Is mainly concerned with the sustainability of “content” for a given period of time (probably not forever)
          – Largely about ensuring “continued access” to content
          – “The series of managed activities necessary to ensure continued access to digital materials for as long as necessary” (Digital Preservation Coalition (DPC), Digital Preservation Definitions and Concepts list: http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts?q=definitions)
          – A combination of technical, organisational and legal challenges
  • 4. Digital preservation basics
      • An ongoing (lifecycle) approach to managing digital content based on:
          – The identification and adoption of appropriate preservation strategies for content
          – The collection and management of appropriate metadata (explicit and implicit knowledge, contexts)
          – The ongoing monitoring of technical contexts and the application of preservation planning techniques
          – Continual monitoring of the organisation (audit)
          – Not about keeping everything, forever
  • 5. A multi-faceted set of challenges
      • Technical
          – Strategies needed to deal with ongoing obsolescence and scale
      • Organisational
          – Access and reuse
          – Authenticity and integrity
          – Sustainability (costs)
          – Legal
          – Deciding what needs to be retained
  • 6. Technical challenges (1)
      • Physical
          – Bits stored on a physical medium (or in the cloud?)
          – Focus 20 years ago was on new media types (e.g. optical storage technologies) as a panacea
          – Bit-level preservation is still important: it is the first layer in a viable preservation strategy
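Bit-level preservation is usually checked with fixity information: a checksum recorded at ingest and recomputed later. The sketch below is a minimal illustration under that assumption; the function names are hypothetical and SHA-256 is one reasonable choice of digest, not something the slides prescribe.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """True if the file's current digest matches the one recorded at ingest."""
    return sha256_of(path) == recorded_digest
```

A repository would typically store the recorded digest alongside the object's metadata and rerun the check on a schedule; a mismatch signals that the stored bits have changed.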
  • 7. Obsolete media
      [Image: exhibition at NASA White Sands Test Facility, 2009. Image courtesy of Frank Carey]
  • 8. Technical challenges (2)
      • Hardware and software dependence
          – Most digital objects are dependent on particular configurations of hardware and software
          – Relatively short obsolescence cycles
  • 9. Hardware and software dependence
      [Image: exhibition at NASA White Sands Test Facility, 2009. Image courtesy of Frank Carey]
  • 10. Conceptual challenges (1)
      • What is a digital object?
          – Some are analogues of traditional objects, e.g. meeting minutes, research papers
          – Others are not, e.g. Web pages, blogs, GIS, 3D models of chemical structures, research data more generally
              • Complexity
              • Dynamic nature
              • Interactivity
          – Born digital vs. product of digitisation initiatives
          – Logical layer between physical storage of bits and the conceptual objects that need preservation (includes data types, formats, etc.)
  • 11. Conceptual challenges (2)
      • Need to identify and document the “significant properties” (or characteristics) of content:
          – Recognises that preservation is context dependent, even user specific (OAIS concept of 'designated community')
          – Helps with choosing an acceptable preservation strategy
              • Compare the ‘performance model’ developed by the National Archives of Australia (2002): “The source of a record is a fixed message that interacts with technology. This message provides the record’s unique meaning, but by itself is meaningless to researchers since it needs to be combined with technology in order to be rendered as its creator intended. The process is the technology required to render meaning from the source”
          – Focus on re-use (e.g., data curation)
  • 12. Organisational challenges (1)
      • Sustainability:
          – Ultimately the sustainability of content depends upon the long-term sustainability of organisations
              • Focus on business models
              • Embedding preservation into the core task of organisations
          – Organisational commitment:
              • “An institutional repository needs to be a service with continuity behind it … Institutions need to recognise that they are making commitments for the long term” (Clifford Lynch)
              • Need for policy development
          – Incentives for preservation:
              • Clarity on roles and responsibilities needed
              • Who benefits? Who pays? “Free riding?”
  • 13. Organisational challenges (2)
      • Economic perspectives:
          – Blue Ribbon Task Force on Sustainable Digital Preservation and Access: http://brtf.sdsc.edu/
              • Final report (Feb 2010): “Ensuring that valuable digital assets will be available for future use is not simply a matter of finding sufficient funds. It is about mobilizing resources - human, technical, and financial - across a spectrum of stakeholders diffuse over both space and time. But questions remain about what digital information we should preserve, who is responsible for preserving, and who will pay.”
          – JISC-funded LIFE (Life Cycle Information for E-Literature) has developed a predictive costing tool: http://www.life.ac.uk/
  • 14. Organisational challenges (3)
      • The challenge of scale:
          – The Web
          – Digitised “textual” content:
              • Google Books
              • DPLA / Europeana
          – The “data deluge” in e-Science:
              • New generations of instruments, computer simulations
              • Many terabytes generated per day, petabyte scale computing (and growing)
              • Cory Doctorow, “Welcome to the petacentre.” Nature, 455, pp 17-21, 4 Sep 2008
  • 15. Organisational challenges (4)
      • The need for collaboration:
          – Need for 'deep-infrastructure' for preservation recognised as far back as 1996 by the Task Force on Archiving of Digital Information
              • Digital preservation involves the "grander problem of organizing ourselves over time and as a society ... [to manoeuvre] effectively in a digital landscape" (p. 7)
          – Building on existing networks
          – Role for national-level co-ordination:
              • Digital Preservation Coalition (DPC), nestor (Germany), National Digital Information Infrastructure and Preservation Program (NDIIPP)
  • 16. Organisational challenges (5)
      • Learn the lessons from the past:
          – Things will go wrong
          – Do what you can to enable recovery from disaster
          – Digital technologies support replication (keep more than one copy, so there is no single point of failure)
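Replication only helps recovery if the copies are compared from time to time. The sketch below assumes each storage location reports a checksum for its copy and treats the most common checksum as the reference; the function name and location labels are hypothetical, and it presumes a clear majority of copies is intact.

```python
from collections import Counter
from typing import Dict, List, Tuple


def audit_replicas(digests: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (majority_digest, locations whose copy disagrees with it).

    `digests` maps a storage location name to the checksum of its copy.
    """
    if not digests:
        raise ValueError("at least one replica is required")
    # The most frequently reported digest is taken as the reference copy.
    majority, _count = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(loc for loc, d in digests.items() if d != majority)
    return majority, damaged
```

Locations listed as damaged would be candidates for repair from one of the agreeing copies.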
  • 17. Digital preservation strategies (1)
      • Main approaches:
          – Technology preservation (e.g., computing museums)
          – Digital archaeology (a post hoc approach)
          – Emulation (focusing on the environment, often used where look-and-feel is important, e.g. computer games)
          – Migration (focusing on the content)
              • A mature approach: A set of organised tasks designed to achieve the periodic transfer of digital information from one hardware and software configuration to another, or from one generation of computer technology to a subsequent one (CPA/RLG report, 1996)
  • 18. Digital preservation strategies (2)
      • Preservation strategies are not in competition
          – Different strategies will work together; there may be value in diversification
          – Migration strategies mean difficult choices need to be made about target formats
      • But the strategy chosen has implications for:
          – The technical infrastructure required (and metadata)
          – Collection management priorities
          – Rights management
              • Owning the rights to re-engineer software
          – Costs
  • 19. Digital preservation strategies (3)
      • Tools for format characterisation and validation
          – DROID - Digital Record Object Identification (based on the PRONOM registry)
              • Very important to know what types (formats) of content exist in a particular collection (e.g., institutional repository or Web archive)
              • Performs batch identification of file formats
              • http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
          – JHOVE - JSTOR/Harvard Object Validation Environment
              • Used for format validation
              • http://hul.harvard.edu/jhove/
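To illustrate what batch format identification involves, the sketch below matches the leading bytes of each file against a handful of well-known signatures. It is a toy stand-in, not DROID's algorithm or the PRONOM registry: the signature table, labels and function names are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A tiny, illustrative signature table (leading "magic" bytes -> label).
SIGNATURES: List[Tuple[bytes, str]] = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> str:
    """Return a format label based on the first bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def profile(collection: Dict[str, bytes]) -> Counter:
    """Count how many objects of each identified format a collection holds."""
    return Counter(identify(content) for content in collection.values())
```

A format profile of this kind shows at a glance which formats dominate a collection and how much of it cannot be identified at all.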
  • 20. Digital preservation strategies (4)
      • Plato preservation planning tool
          – Developed by EU Planets project
          – A decision support tool that helps users explore the evaluation of potential preservation solutions against specific requirements and for building a plan for preserving a given set of objects
          – Integrates file format identification (using DROID); some migration services; XML-based generic format characterisation using XCL (eXtensible Characterisation Languages)
          – More info: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
          – Integration with repositories tested by JISC KeepIt project: http://preservation.eprints.org/keepit/
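Evaluating candidate solutions against requirements can be pictured as a weighted comparison. The sketch below is a simplified weighted-sum illustration, not Plato's implementation; the criteria, weights, scores and function names are all hypothetical.

```python
from typing import Dict, List


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def rank_alternatives(
    alternatives: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[str]:
    """Return alternative names ordered from best to worst weighted score."""
    return sorted(
        alternatives,
        key=lambda name: weighted_score(alternatives[name], weights),
        reverse=True,
    )


# Hypothetical requirements and 0-5 scores for two candidate strategies.
weights = {"preserves_layout": 0.5, "open_format": 0.3, "low_cost": 0.2}
alternatives = {
    "migrate_to_new_format": {"preserves_layout": 4, "open_format": 5, "low_cost": 3},
    "keep_original_format": {"preserves_layout": 5, "open_format": 1, "low_cost": 5},
}
ranking = rank_alternatives(alternatives, weights)
```

Making the weights explicit is the useful part: it documents why one strategy was preferred, which matters when the plan is reviewed later.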
  • 21. OAIS Reference Model (ISO 14721)
      [Figure: OAIS Functional Entities (Figure 4-1), http://public.ccsds.org/publications/archive/650x0m2.pdf]
  • 22. Preservation metadata
      • Metadata and documentation are vitally important
          – Relates to OAIS concepts like Representation Information and Preservation Description Information
          – Functions:
              • Enables resource discovery - supports the development of finding aids
              • Records meaning (structure and semantics)
              • Records context and provenance (authenticity)
          – Standards that support digital preservation activities:
              • PREMIS Data Dictionary (for core metadata): http://www.loc.gov/standards/premis/
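As a rough picture of what such metadata records, the sketch below pairs an object description with a log of events that happened to it. It is loosely inspired by the PREMIS distinction between Objects and Events, but the class and field names are hypothetical simplifications, not PREMIS semantic units.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Something that happened to an object (provenance)."""
    event_type: str   # e.g. "ingest", "fixity check", "migration"
    timestamp: str    # ISO 8601 string, supplied by the caller
    outcome: str      # e.g. "success", "failure"


@dataclass
class ObjectRecord:
    """Minimal description of a preserved object plus its event history."""
    identifier: str
    format_name: str  # structure: what kind of file this is
    sha256: str       # fixity value recorded at ingest
    events: List[Event] = field(default_factory=list)

    def log(self, event_type: str, timestamp: str, outcome: str) -> None:
        """Append a provenance event to the object's history."""
        self.events.append(Event(event_type, timestamp, outcome))
```

The event history is what lets a later user judge authenticity: it shows what was done to the object, when, and with what result.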
  • 23. Repository audit frameworks (1)
      • Repository audit frameworks first developed out of the OAIS Reference Model (ISO 14721)
          – OAIS Mandatory Responsibilities (only six of them):
              • The main focus was on technical and organisational aspects, e.g.:
                  – That repositories ensure that preserved information (content) can be understood (independently understandable)
                  – That documented policies and procedures are being followed
      • No clear concept of OAIS “compliance”
  • 24. Repository audit frameworks (2)
      • ISO 16363:2012, Audit and certification of trustworthy digital repositories
          – Trusted Repositories Audit and Certification (TRAC)
          – Criteria cover three main aspects:
              • Organisational Infrastructure
                  – Governance and viability, structure and staffing, financial sustainability, contracts, etc.
              • Digital Object Management
                  – Ingest, preservation planning, archival storage, etc.
              • Infrastructure and security risk management
                  – Systems and infrastructure, etc.
          – A basis for certification
          – http://public.ccsds.org/publications/archive/652x0m1.pdf
  • 25. TRAC Checklist example page
  • 26. Repository audit frameworks (3)
      • DRAMBORA (Digital Repository Audit Method Based on Risk Assessment)
          – Developed by the Digital Curation Centre and Digital Preservation Europe
          – “Presents a methodology for self-assessment, encouraging organisations to establish a comprehensive self-awareness of their objectives, activities and assets before identifying, assessing and managing the risks implicit within their organisation”
          – Identifying risks and scoring each one on likelihood and impact
          – Covers: organisational context, policies, assets, risks, etc.
          – Online tool: http://www.repositoryaudit.eu/
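Scoring risks on likelihood and impact lends itself to a small worked example. The sketch below multiplies the two scores to rank risks, which is a common risk-register convention assumed here; the 1 to 5 scale, class name and example risks are hypothetical and not taken from the DRAMBORA toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def severity(self) -> int:
        """Combined score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritise(risks: List[Risk]) -> List[Risk]:
    """Return risks ordered from highest to lowest severity."""
    return sorted(risks, key=lambda r: r.severity, reverse=True)


# Hypothetical register entries for illustration only.
register = [
    Risk("Loss of key staff", likelihood=3, impact=3),
    Risk("Storage media failure", likelihood=4, impact=4),
    Risk("File format obsolescence", likelihood=2, impact=5),
]
ordered = prioritise(register)
```

The ordering gives a starting point for deciding which risks to manage first; the scores themselves still come from the organisation's own judgement.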
  • 27. Repository audit frameworks (4)
      • A means of "asking the right questions" about repositories (and the wider organisation) and documenting appropriate procedures and risks
      • More than one role:
          – External badge of quality (a "certified preservation repository")
              • DINI-Zertifikat für Dokumenten- und Publikationsservices: http://www.dini.de/english/dini-certificate/
              • ISO 16363
          – Management tool for self assessment
  • 28. Core repository principles (1)
      • Ten Principles - agreed 2007 by CRL (US), Digital Curation Centre (UK), Nestor (Germany) and Digital Preservation Europe
          – The repository commits to continuing maintenance of digital objects for identified community/communities.
          – Demonstrates organizational fitness (including financial, staffing structure, and processes) to fulfill its commitment.
          – Acquires and maintains requisite contractual and legal rights and fulfills responsibilities.
          – Has an effective and efficient policy framework.
          – Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities.
  • 29. Core repository principles (2)
      • Ten principles (continued)
          – Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
          – Creates and maintains requisite metadata about actions taken on digital objects during preservation as well as about the relevant production, access support, and usage process contexts before preservation.
          – Fulfills requisite dissemination requirements.
          – Has a strategic program for preservation planning and action.
          – Has technical infrastructure adequate to continuing maintenance and security of its digital objects.
      • Available: http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying/core-re
  • 30. Digital preservation basics (reprise)
      • An ongoing (lifecycle) approach to managing digital content based on:
          – The identification and adoption of appropriate preservation strategies for content
          – The collection and management of appropriate metadata (explicit and implicit knowledge, contexts)
          – The ongoing monitoring of technical contexts and the application of preservation planning techniques
          – Continual monitoring of the organisation (audit)
          – Not about keeping everything, forever
  • 31. “It is always a mistake for a historian to try and predict the future. Life, unlike science, is simply too full of surprises” - Richard J. Evans, In defence of history (1997, p. 62)
  • 32. Further reading
          – DPC Technology Watch reports: http://www.dpconline.org/advice/technology-watch-reports
          – Blue Ribbon Task Force on Sustainable Digital Preservation and Access, Final Report (NSF, 2010): http://brtf.sdsc.edu/
          – Digital Preservation Coalition, Digital preservation handbook: http://www.dpconline.org/advice/preservationhandbook/
          – Marieke Guy, JISC Beginner’s Guide to Digital Preservation (UKOLN, 2010): http://blogs.ukoln.ac.uk/jisc-beg-dig-pres/
          – Digital Preservation Coalition and Digital Curation Centre, What’s New (monthly current awareness bulletin): http://www.dpconline.org/newsroom/whats-new
          – JISC infoNet, Digital repositories infoKit: http://www.jiscinfonet.ac.uk/infokits/repositories
          – Paradigm Project, Workbook on Digital Private Papers: http://www.paradigm.ac.uk/workbook/index.html
  • 33. Web links:
          – Digital Preservation Coalition: http://www.dpconline.org/
          – Abby Smith talk (2011) at Yale: http://youtu.be/Yk9ccNP9xTk
          – Plato Preservation Planning tool: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
          – RSP briefing paper on preservation and storage formats: http://www.rsp.ac.uk/pubs/briefingpapers-docs/technical-preservformats.pdf
          – PRESERV project: http://preservation.eprints.org/
          – KeepIt project: http://preservation.eprints.org/keepit/
          – WePreserve cartoons at: http://www.youtube.com/user/wepreserve
  • 34. Available: http://youtu.be/PGFOZLecjTc
  • 35. Research Data Management: activities, roles and requirements
  • 36. Introduction and overview
    • What is research data management?
      – Caring for, facilitating access to, preserving and adding value to digital research data throughout its lifecycle
    • Rationale (researchers, institutions)
    • Who is involved and how?
    • Roles and responsibilities?
  • 37. Researcher perspectives (1)
    • Managing and sharing data is simply part of good research practice:
      – Adhering to disciplinary and/or institutional codes of practice and policies
      – Has been practised since the advent of modern science, but not always consistently; data-intensive research makes it even more critical
      – Meeting the specific requirements of funding bodies
      – Reputational risks if data management is not handled properly
  • 38. Researcher perspectives (2)
    • Potential benefits:
      – Scholarly communication/access to data
      – Re-purposing and re-use of data
      – Stimulating new networks/collaborations & new research
      – Knowledge transfer to industry
      – Verification of research/research integrity
      – Re-purposing data for new audiences
      – Secure storage for data-intensive research
      – Availability of data underpinning journal articles
      – Increased visibility/citation
    Keeping Research Data Safe Factsheet: http://www.beagrie.com/KRDS_Factsheet_0910.pdf
  • 39. Institutional perspectives
    • Institutional drivers
      – Safeguarding research integrity
      – Increasing number of FOI requests for data
      – Adhering to existing codes of research practice and ethics
      – Developing new institution-wide strategies, policies and services for data storage and management
      – Increased institutional focus on research management (e.g., in response to REF)
      – Benchmarking – self-assessing infrastructure and planning for improvement
      – More demands but fewer resources to work with
  • 40. Codes of practice for research
    • UK Research Integrity Office Code of Practice for Research (2009)
      – Data management planning is an essential part of research design
      – Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form [3.12.5]
    • RCUK Code of Conduct on the Governance of Good Research Conduct (2011)
      – Primary data and research evidence [should be made] accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 years (in some cases 20 years or longer)
      – Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation [although deposit within national collections is endorsed]
  • 41. Funding body perspectives (1)
    • UK Research Councils
      – Help fund some data archives, e.g.:
        • Archaeology Data Service, European Bioinformatics Institute, the NERC data centres, UK Data Archive
      – Support for JISC (and DCC)
      – RCUK Common Principles on Data Policy
        • Recognises that data are a critical output of the research process
    http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
  • 42. Funding body perspectives (2)
    • RCUK Common Principles on Data (in a nutshell)
      – Publicly funded research data should be made openly available
      – Data with acknowledged long-term value should be preserved and remain accessible and usable for future research
      – Sufficient metadata should be recorded to enable other researchers to find and understand the research and so enable re-use; published results should always include information on how to access the supporting data
      – Recognition that there may be legal, ethical and commercial constraints
      – Recognition that researchers may need privileged use of data for a limited period
      – All users of research data should acknowledge their sources
      – Appropriate to use public funds to support managing research data (MRD)
  • 43. Funding body perspectives (3)
    • Changing expectations of funding bodies:
      – Institutions need to inform themselves about main funder policies (mandates) with respect to research data management
      – There is an explicit link between research income and appropriate data management infrastructures
  • 44. Funding body perspectives (4)
    http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
  • 45. EPSRC expectations (1)
    • EPSRC policy (2011) expected all institutions receiving grant funding:
      – To develop a roadmap aligning their policies and processes with EPSRC’s expectations by 1st May 2012
      – To be fully compliant with these expectations by 1st May 2015
  • 46. EPSRC expectations (2)
    • Examples:
      – Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation
      – Data not generated in digital format should be stored in a manner to facilitate it being shared
      – Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party
      – Adequate resources from existing funding streams
      – EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
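The 10-year retention example on the EPSRC slide can be sketched as a small date calculation. This is only an illustration, not EPSRC's own wording or tooling: the function and parameter names are hypothetical, and it assumes the retention clock runs from whichever of the two events (end of privileged access, or the last third-party access request) is later.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # minimum retention period quoted on the slide


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(privileged_access_ends: date,
                           last_third_party_request: Optional[date] = None) -> date:
    """Earliest date on which a dataset could be reviewed for disposal.

    Assumption (not stated on the slide): the clock starts from the later
    of the two events.
    """
    start = privileged_access_ends
    if last_third_party_request is not None and last_third_party_request > start:
        start = last_third_party_request
    return add_years(start, RETENTION_YEARS)


if __name__ == "__main__":
    # Hypothetical dataset: privileged access ended on 1 May 2014,
    # and a third party last requested access on 29 February 2016.
    print(earliest_disposal_date(date(2014, 5, 1), date(2016, 2, 29)))
```

Each new third-party request pushes the date forward under this reading, so an institution would need to log access requests alongside the data in order to apply the rule.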
  • 47. Funding body perspectives (5)
    • Implications for researchers and institutions:
      – Increasing number of research councils and funding bodies with data management and sharing requirements
      – Potential loss of research income if these mandates are not met
      – Need to determine the costs associated with short- and longer-term management and curation and to request funds as part of grant
      – Responsibility for infrastructure shifting more to HEIs and less to centralised data archives, but institutional infrastructures and services are still emerging
      – Need guidance - some good external support
      – But also need more local support; often fragmented (need to draw upon existing channels within institutions wherever possible)
  • 48. Who needs to be involved?
    • Funding bodies
    • Archives / long-term data repositories
    • At institutions:
      – Senior management
      – Researcher(s)
      – Research support officers / project staff
      – Lab technicians
      – Librarians / Data Centre staff
      – Faculty ethics committees
      – Institutional legal / IP advisors
      – FOI officer / DPA officer / records manager
      – Computing support
      – Institutional compliance officers
  • 49. Activities, roles, requirements (1)
    • Requirements gathering
      – Identifying researchers’ data requirements
      – Developing a shared understanding of what needs to be done (e.g., identifying where data exist, its form and scale, any existing retention requirements)
      – Identifying good practice within the institution (and the opposite)
      – Methods: surveys, focus groups, case studies, joint R&D projects, assessment tools (e.g. DCC Data Asset Framework)
  • 50. Activities, roles, requirements (2)
    • Identifying motivations and benefits
      – For researchers, support services, the institution
    • Identifying risks
      – Data loss (institution, research group, individual)
      – Increased costs (lack of planning, service inefficiency, data loss)
      – Legal compliance (research funder, H&S, ethics, FoI)
      – Reputation (institution, unit, individual)
    • Identifying costs
      – Keeping Research Data Safe (KRDS) toolkit
  • 51. Activities, roles, requirements (3)
    • Assessing institutional preparedness
      – Identifying institutional stakeholders, existing data support services, gaps
      – Benchmarking and planning for the future
      – Skills audit
      – DCC CARDIO tool
    • Policy development
      – Policies – approval by senior management is just the start; policies need to be embedded in research practice and responsive to changing requirements
    • Data management planning
      – DMP online, DCC How-to Develop a Data Management Plan guide
  • 52. Activities, roles, requirements (4)
    • Implementation and service development
      – Integrating where possible with existing services, e.g. IR, CRIS, VRE, HPC, cloud services, social media, etc.
      – Appraisal, deciding what needs to be kept and for how long
      – Storage choices – no one-size-fits-all solution, e.g. Bristol’s BluePeta petascale storage facility, Bath’s X-Drive approach, cloud approaches
      – Data documentation and metadata – layered approaches: top-level discovery (core metadata, collection/experiment-level?), role of standards like DCMI, CERIF, DDI, etc.
  • 53. Activities, roles, requirements (5)
    • Data issues:
      – Appraisal: selection criteria, retention periods (who decides?)
        • DCC How to appraise and select research data for curation guide
      – Documentation: metadata, schema, semantics
      – Formats: proprietary formats, community standards, etc.
      – Provenance and authenticity
      – Citation (assignment of persistent IDs?)
      – Access (embargo policies?)
      – Licensing
        • DCC How to license research data guide
  • 54. DCC institutional assessment tools
    • Data Asset Framework: http://www.data-audit.eu/
      – Analysing institutional requirements and holdings
      – Finding out what data exists, where it is stored, formats, metadata, etc.
    • CARDIO (Collaborative Assessment of Research Data Infrastructure): http://cardio.dcc.ac.uk/
      – Evaluating data management requirements, activity, and capacity
      – Building consensus between data creators, information managers and service providers
      – Identifying practical goals for improvement in data management provision and support
      – Identifying operational inefficiencies and potential opportunities for cost saving
      – Making a case to senior managers for investment in data management support
  • 55. Further reading (research data)
    – Digital Curation Centre briefing papers and How-to-Guides: http://www.dcc.ac.uk/resources/how-guides
    – Royal Society, Science as an open enterprise (June 2012): http://royalsociety.org/policy/projects/science-public-enterprise/report/
    – Graham Pryor (ed.), Managing research data (London: Facet Publishing, 2012). ISBN: 978-1-85604-756-2
    – Neil Beagrie, Brian Lavoie and Matthew Woollard, Keeping research data safe 2 (JISC, 2010): http://www.beagrie.com/publications.php
    – Neil Beagrie, Julia Chruszcz and Brian Lavoie, Keeping research data safe: a cost model and guidance for UK universities (JISC, 2008): http://www.beagrie.com/publications.php
    – Liz Lyon, Dealing with data: roles, rights, responsibilities and relationships (JISC, 2007): http://opus.bath.ac.uk/412/
    – National Science Board, Long-lived digital data collections: enabling research and education in the 21st century (NSF, 2005): http://www.nsf.gov/pubs/2005/nsb0540/
  • 56. Questions?
  • 57. Acknowledgments
    • The Digital Curation Centre (DCC) is a world-leading centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community. The DCC is funded by JISC.
    • More information is available from: http://www.dcc.ac.uk/
    • UKOLN receives support from JISC and the University of Bath, where it is based.
    • More information is available from: http://www.ukoln.ac.uk/
  • 58. Thank you!

Editor's Notes

  1. Image from: http://www.bradfordschools.net/curriculumict/
  2. Image courtesy of Frank Carey: http://www.flickr.com/photos/dolor_ipsum/3262262068/in/photostream/
  3. Image courtesy of Frank Carey: http://www.flickr.com/photos/dolor_ipsum/3262262008/in/photostream/
  4. Reference: Thibodeau, K. (2002). "Overview of technological approaches to digital preservation and challenges in coming years." In: The state of digital preservation: an international perspective. Washington, D.C.: Council on Library and Information Resources. Available: http://www.clir.org/pubs/abstract/pub107abst.html
  5. National Archives of Australia, An Approach to the Preservation of Digital Records (2002): http://www.naa.gov.au/images/an-approach-green-paper_tcm2-888.pdf
  6. Image from Mary Beard’s blog: http://timesonline.typepad.com/dons_life/2011/02/where-does-king-tut-belong.html#more
  7. http://public.ccsds.org/publications/archive/650x0b1.PDF
  8. References:
  9. http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying
  10. Given the audience, I’ll reflect on two pieces of DCC work: the DAF tool, which has been used primarily by service providers or intermediaries to investigate what’s happening in terms of data management at the coalface and to explore service gaps to see what support researchers need; and research funders’ policies, specifically in terms of data management and sharing plan requirements, as these are directly relevant to researchers.