This presentation describes the Data Asset Framework (DAF) as a tool for scoping data content for institutional repositories. It was given as part of module 1 of a 5-module course on digital preservation tools for repository managers, presented by the JISC KeepIt project. For more on this and other presentations in this course look for the tag 'KeepIt course' in the project blog http://blogs.ecs.soton.ac.uk/keepit/
- I’ll start off with some background context / an overview of DAF
- Harry will then explain how it’s been used at Southampton
- Then we’ll do a group exercise.
DAF was established in response to a recommendation in the Dealing with Data report. This recognised a lack of awareness as to what data were held within HE institutions and how they were being managed. How can universities make the most of their research data when it is unclear what there is, where these data are, how they’re being managed, what the options for reuse are, etc.? DAF tries to help users find these things out. It can be a useful tool for repositories to identify data for ingest, or to see what support is required by researchers left curating data without the necessary resources / skills.
Five projects were funded by JISC over a 6-month period in 2008:
- Development project to come up with the methodology and develop an online tool.
- Implementation projects to test this out and investigate the research data challenge.
The methodology has four incremental stages: one for planning, one for wrap-up and two main audit stages. Stages 2 & 3 pick up directly on the two aspects in the original recommendation, i.e. what data exist (inventory stage) and what’s happening to them (assessment stage).
- Planning: define scope / expected outcomes of the survey; conduct preliminary research; set up interviews / questionnaires.
- Identifying data: collect basic information (name, description, creator, location); broad mapping to get a feel for the extent of data holdings; classification helps refine the scope of the next stage.
- Assessing data: look into a few datasets / collections in more depth to identify weaknesses in data management and risks; consider the whole lifecycle.
- Reporting: collate and analyse information collected; make recommendations on how to improve data management.
Information was typically collected by a mix of questionnaires and interviews.
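As an illustration only, here is a minimal Python sketch of how the basic information collected in the inventory stage might be recorded and then narrowed down for the assessment stage. The four core fields (name, description, creator, location) come from the stage description above. The class name, the function, the classification labels and the sample entries are all made up for this example and are not part of the DAF methodology or its online tool.

```python
from dataclasses import dataclass


@dataclass
class InventoryRecord:
    """One entry from the 'identifying data' stage (hypothetical structure)."""
    name: str
    description: str
    creator: str
    location: str
    # Hypothetical label: the notes say only that classification helps
    # refine the scope of the next stage, not which labels to use.
    classification: str = "unclassified"


def select_for_assessment(records, wanted_classification):
    """Return the records to look at in more depth in the assessment stage."""
    return [r for r in records if r.classification == wanted_classification]


# Placeholder entries, invented purely to show the shape of an inventory.
inventory = [
    InventoryRecord("Survey 2007", "Questionnaire responses", "A. Researcher",
                    "departmental file server", classification="vital"),
    InventoryRecord("CAD drawings", "Design files", "B. Researcher",
                    "desktop PC", classification="minor"),
]

shortlist = select_for_assessment(inventory, "vital")
print([r.name for r in shortlist])
```

In practice this information was gathered through questionnaires and interviews, so a spreadsheet would serve equally well. The point is only that a consistent set of fields per dataset makes the later assessment and reporting stages easier.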
Themes covered all activities in the data lifecycle. Some found this model useful as a way to guide discussion. Across all themes there was a tendency to unpick issues and concerns.
Pilots were in a mixture of disciplines and sizes of organisation (research groups, departments, schools etc.). The focus of implementations differed slightly too. Some were more repository based, e.g. Imperial College was more concerned with capacity planning so asked questions about data size, growth rates, planned retention, formats… The DataShare examples were undertaken to identify suitable data for ingest in light of a lack of voluntary deposits.
- Lots of data – often complex: survey data and 3D visualisations, CAD drawings.
- Didn’t come across many policies – very ad hoc.
- People didn’t know what to do – wanted support – but were also unaware of where they could turn, e.g. to the repository.
- Often nowhere for data to go – didn’t always have data centres in their subject area, or the ability to deposit their data in the institutional repositories.
Researchers wanted to keep and reuse data but didn’t have the time or skills to do it themselves – need for data curation infrastructure. Role for IRs here.
We had a workshop in 2009 to collate lessons from pilots and decide next steps for DAF. These were the three main recommendations made.
1. Most institutions were still in the early stages of developing infrastructure so the approach was more useful for gathering requirements than identifying data to manage. DAF has been suggested as a tool for new JISC data management infrastructure projects to use for scoping requirements. The exercise today will focus on this usage too – scoping data & gathering requirements for the repository’s role in data management.
2. Lessons / approaches from the pilots have been brought together to help others – see the implementation guide.
3. Some new work has been funded (JISC IDMP project) to see how DAF and other tools can be brought together to help institutions develop their data management strategy.