Data management plans existed long before the NSF started requiring them. DMPs have inherent value, even though many researchers were unaware of them until recently. A proper, thorough data management plan can be a major time saver and a huge asset for a project. In this webinar, we will cover how to go beyond funder requirements and develop more thorough DMPs. The Gulf of Mexico Research Initiative requires an extensive data management plan for projects it funds; we will hear about its efforts and how it plans to use the DMPTool going forward.
1. Logistics for Webinar
You must call in for audio:
866-740-1260 access code 9870179#
Participants muted
Ask questions in chat any time
20 minutes for Q&A
Recording & slides, schedule of webinars:
blog.dmptool.org/webinar-series
DMPTool Webinar Series 10: More Extensive DMPs
Sponsored by IMLS
1 October 2013
2. 28 May Introduction to the DMPTool
4 June Learning about data management: Resources, tools, materials
18 June Customizing the DMPTool for your institution
25 June Environmental Scan: Who's important at your campus
9 July Promoting institutional services; EZID Outreach Made Simple!
16 July Health Sciences & DMPTool - Lisa Federer, UCLA
23 July Digital humanities and the DMPTool - Miriam Posner, UCLA
13 Aug Data curation profiles and the DMPTool – Jake Carlson, Purdue
27 Aug Talking Points for Meeting with Institutional Stakeholders
1 Oct Beyond Funder Requirements: More Extensive DMPs
15 Oct Tools and resources that complement the DMPTool
5 Nov Case studies – Librarians successfully supporting data
blog.dmptool.org/webinar-series
3. Data Management:
Beyond Funder
Requirements
Sherry Lake | @shlakeuva
Data Management Consultant, University of Virginia Library
From Flickr by jdretro
5. Webinar Goals
• Differentiate between Funder Data Management Plan and Operational Data Management
• Understand the parts of a Data Management Workbook
• Introduce the institutional data management requirements in DMPTool2
Personal Collection: Sherry Lake
7. Parts of a (Generic) NSF Data Management Plan
I. Products of the Research: The types of data, samples, physical collections,
software, curriculum materials, and other materials to be produced in the course
of the project.
II. Data Formats: The standards to be used for data and metadata format and
content (where existing standards are absent or deemed inadequate, this should
be documented along with any proposed solutions or remedies).
III. Access to Data and Data Sharing Practices and Policies: Policies for access and
sharing including provisions for appropriate protection of privacy, confidentiality,
security, intellectual property, or other rights or requirements.
IV. Policies for Re-Use, Re-Distribution, and Production of Derivatives.
V. Archiving of Data: Plans for archiving data, samples, and other research
products, and for preservation of access to them.
Grant Proposal Guide (GPG) Chapter II.C.2.j
http://www.nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_2.jsp#dmp
8. Specific Requirements
• Roles and Responsibilities
– outline the rights and
obligations of all parties
• Project Data Management
– how data is managed and
maintained during project
“lifetime”
From Flickr by jesseFewell
9. Funder Data Management Plan
Obligations
• Plan monitored via annual
and final report
• Outcomes reported in
subsequent proposals
– Under “Results of Prior
NSF support”
From Flickr by starheadboy
11. What is a Data Management Plan?
• Brief description of how you will comply with
funder’s data sharing policy
• Reviewed as part of a grant application
AND
• A comprehensive plan of how you will manage
your research data throughout the lifecycle of
your research project
12. Lifecycle Data Management
[Figure: research and data life cycle diagram. Research stages shown: Proposal Planning/Writing, Project Start Up, Data Collection, Data Analysis, Data Sharing, End of Project. Data Life Cycle stages shown: Data Discovery, Deposit, Data Archive, Re-Use, Re-Purpose.]
Research Lifecycle: UVa Library Data Management Consulting Group
http://dmconsult.library.virginia.edu/lifecycle/
13. Operational Data Management
• Appoint Data Manager Contact
• Describe data to be collected and
methodology
• Include guidelines on data
documentation
• Plan security and backup
procedures
• Plan sharing of data for public
use
• Include preservation plans
• Document copyright and
intellectual property rights
Ted Nguyens USA Blog: http://www.tednguyenusa.com
14. Methods of Data Management
• Data Organization
– Working more efficiently with data
• Data Administration
– Protecting data and ensuring quality
• Data Archiving and Sharing
– Archiving for preservation and sharing to promote re-use
16. Parts of the Workbook
1. Project description
2. Survey of existing data
3. Data to be created
i. Data organization methods
4. Data administration issues:
i. Funding and institutional requirements
ii. Data owners and stakeholders
iii. Access and security
iv. Privacy/Confidentiality
v. Backups
5. Data sharing and archiving
6. Responsibilities
7. Data documentation and metadata
8. Budget
22. When a Plan Comes Together
• Consistent Data Management
• Can be used for training
• Use parts for Data Management Reports
• Easy to record changes (i.e., policy)
Rich Murnane Blog: http://richmurnane.blogspot.com/
27. For More Information
ANU Data Management Manual: Managing Digital Research Data at the Australian National University (6th Edition): http://tinyurl.com/ANU-DMP
Managing and Sharing Data: Best Practices for Researchers (3rd Edition): http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
University of Virginia Data Management Planning Workbook, e-mail: shlake@virginia.edu
Flickr by: SpacemanSpiff, intergalactic explorer!
29. DMP: Beyond the 2-Pager
James Gibeaut, PhD
GRIIDC Director and Endowed Chair for Geospatial Sciences
Harte Research Institute for Gulf of Mexico Studies
Texas A&M University – Corpus Christi, TX
(email: james.gibeaut@tamucc.edu)
Felimon Gayanilo
Systems Architect
Harte Research Institute for Gulf of Mexico Studies
Texas A&M University – Corpus Christi, TX
(email: felimon.gayanilo@tamucc.edu)
30. Mission
The mission of Gulf of Mexico Research Initiative
(GoMRI) is to implement an independent research
program that will
(1) Study the effects of the Deepwater Horizon
incident and the potential associated impact of
this and similar incidents on the environment
and public health and
(2) Develop improvements for spill mitigation, oil
detection and characterization, and advanced
remediation technologies.
Presenter: J. Gibeaut
31. Goal
The ultimate goal of the GoMRI is to improve
society’s ability to understand, respond to and
mitigate the impacts of petroleum pollution and
related stressors of the marine and coastal
ecosystems, with an emphasis on conditions found in
the Gulf of Mexico. Knowledge accrued will be applied
to restoration and to improving the long-term
environmental health of the Gulf of Mexico.
Presenter: J. Gibeaut
32. Research Themes
1. Physical distribution and ultimate fate of contaminants
associated with the Deepwater Horizon incident;
2. Chemical evolution and biological degradation of the
contaminants;
3. Environmental effects of the contaminants on Gulf of
Mexico ecosystems, and the science of ecosystem recovery;
4. Technology developments for improved detection,
characterization, mitigation, and remediation of offshore
oil spills; and
5. Impacts of oil spills on public health.
Presenter: J. Gibeaut
34. Data Management Policy
Guiding Principles
• Timely submission of data and model outputs to
GRIIDC and to national repositories
• Use of existing data standards and management
systems as much as practicable
• Employ best practices for data policy and data
management as elucidated by NSF and NOAA or other
agency appropriate for the topic of study
• A strong commitment to data management by
each PI
Presenter: J. Gibeaut
35. Request for Proposal and the DMP
• RFPs are drafted by the GoMRI Research Board and announced through the GoMRI website (see http://gulfresearchinitiative.org/request-for-proposals/rfp-iv-save-the-date/ for the next RFP)
• Includes pre-proposal and full proposal requirements with an abbreviated DMP.
Post-Award: Revisiting the DMP
• GoMRI/GRIIDC Template (http://data.gulfresearchinitiative.org/docs/DMP/Gulf-of-Mexico-Research-Initiative-Data-Management-Plan-Template.pdf)
• DMPTool (https://dmp.cdlib.org/; GoMRI template is also available)
• Example DMPs are also posted on the GRIIDC site
Presenter: J. Gibeaut
40. Dataset Information Form (DIF)
• Inform the community of what is
planned to be collected, and
• Inform GRIIDC of what is expected to
come.
Presenter: F. Gayanilo
48. Thank you!
49. blog.dmptool.org/webinar-series
In 2 weeks: Complementary Tools
Presenters: Carly Strasser, Perry Willett, Marisa Strong, Center for Open Science
Tuesday 15 Oct @ 10am PT
From Flickr by Jeff Keacher
Hopefully you are familiar with the NSF DMP requirement. NSF is the funder that I'm going to focus on in this first section. These are the parts that "may" be included in an NSF Data Management Plan. NSF defines a Data Management Plan as: "Plans for data management and sharing of the products of research. … This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results."
Divisions and specific directorates in the NSF have refined the requirements and have asked researchers to include more information in those specific data management plans. Roles (ENG, SBE, CISE, BIO, others). CISE (Directorate for Computer & Information Science & Engineering), Roles: This should include time allocations, project management of technical aspects, training requirements, and contributions of non-project staff; individuals should be named where possible. EHR (Directorate for Education & Human Resources): Specifies that you identify your methods for collecting data, along with how data is managed and maintained during the project's lifetime until it is "shared."
So you can't just write a DM plan and be done with it. You will have to report on your data management plan (and state any changes) in your annual reports. Then, at the end of your funding, you will have to report how and where you have shared your data per NSF requirements. We haven't seen this in action yet, but in order to get future NSF funding, you need to list your shared data (where it is archived, what you archived) under "Results of Prior NSF support".
Next I'm going to talk about a concept that may be new to some: Operational Data Management, and how it differs from the Funder DMP requirement.
I want to step back and ask the "Big" question: What is a Data Management Plan? We have already seen that it is … [read 1st 2 bullets] AND now I'm going to talk about "THE" or "BIG" Data Management Plan … [read last bullet]
Data management is managing the lifecycle of data, including collection, formatting, organizing, documenting, security, updating, access and sharing (during and after), quality control, transformations and destruction. That means managing how data is collected in one or many systems and how it is represented and arranged in database systems. It also means managing how these bits of information are thoroughly documented, backed up safely, monitored over time, protected from unauthorized access or changes, shared with other people or systems, updated with new information, checked for quality and corrected if errors are found, converted for different uses, and finally destroyed. Remember: managing data in a research project is a process that runs throughout the project. Good data management is the foundation for good research, especially if you are going to share your data. This is the "main" reason for the funder requirements: to share data that has been funded with public money. Good management is essential to ensure that data can be preserved and remain accessible in the long term, so it can be re-used and understood by other researchers. When managed and preserved properly, research data can be successfully used for future scientific purposes.
So a data management plan is not just for the proposal. It is a plan that a research project should implement during the project; here I call it "Operational" to differentiate it from the funder "data management plan". An operational plan provides: information for the "team" about the specifics of the plan, so everyone knows what is required of each other; consistency, to ensure that all "operations" are done the same way within the team or lab; documented methodology and guidelines that can be used for training when new team members join; a historical record of the how, why and when of steps in a process, for use when modifications are made to that process; and an explanation of steps in a process that can be reviewed and included in annual and final reports.
The overall Operational Data Management can be divided into 3 main sections. [read bullets]
Now let's put some Operational Data Management specifics in a "workbook".
We at UVa have created a Data Management Workbook (or Notebook, or Manual). We are still debating what we should call it. To create our workbook, we used the Australian National University's Data Management Manual as a model. This is a great resource and I have the link at the end of my slides. As you can see, it contains more than the 3 Operational Data Management sections on a previous slide. Remember, the purpose of the Workbook is to document all parts of the project from funding (Budget) to archiving. Remember also that a data management plan is a living document and should be reviewed and updated regularly, especially if unforeseen data is collected. Use the sections in the workbook to document the management of your data.
For each section, the researcher is going to "write" in the book (that's why we are calling it a workbook or notebook), and what is written can (and will) change over time. We have added a table like this in our book. You should list all the data that will be created during the project. The remainder of the DMP then deals with how each item of data will be managed.
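The "data to be created" table could also be kept in machine-readable form alongside the workbook. The following is only a minimal sketch under stated assumptions: the `DataItem` fields, the two sample entries, and the size figures are hypothetical illustrations, not the columns of the UVa workbook table.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """One row of a hypothetical 'data to be created' inventory."""
    name: str
    description: str
    file_format: str
    estimated_size_gb: float
    responsible: str


def inventory_table(items):
    """Render the inventory as a plain-text table, one row per data item."""
    header = f"{'Name':<20}{'Format':<10}{'Size (GB)':>10}  Responsible"
    rows = [header, "-" * len(header)]
    for item in items:
        rows.append(
            f"{item.name:<20}{item.file_format:<10}"
            f"{item.estimated_size_gb:>10.1f}  {item.responsible}"
        )
    return "\n".join(rows)


# Illustrative entries only; a real project would list its own data.
items = [
    DataItem("survey_responses", "Raw questionnaire answers", "CSV", 0.5, "Data manager"),
    DataItem("interview_audio", "Recorded interviews", "WAV", 40.0, "Project lead"),
]

print(inventory_table(items))
total = sum(item.estimated_size_gb for item in items)
print(f"Estimated total storage: {total:.1f} GB")
```

Keeping the list in one place like this makes it easier to update as unforeseen data is collected, which is the point of treating the workbook as a living document.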
Also in the "Data to be Created" section, directory and file name conventions can be detailed. How will the lab track file versions (via filenames or with software)? Organization methods include directory structure, file naming conventions, and file version control, which records changes to a file or set of files over time so that you can recall specific versions later.
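As an illustration of a file naming convention with the version carried in the filename, here is a minimal sketch. The pattern `<project>_<description>_<YYYYMMDD>_v<NN>.<ext>` and the helper names are assumptions made for this example; the workbook does not prescribe this particular convention.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9-]+_\d{8}_v\d{2}\.[a-z0-9]+$")


def build_filename(project, description, day, version, ext):
    """Build a filename that follows the hypothetical convention above."""
    # Lowercase the description and collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{project.lower()}_{slug}_{day:%Y%m%d}_v{version:02d}.{ext.lower()}"


def is_valid(filename):
    """Check whether an existing filename follows the convention."""
    return NAME_PATTERN.match(filename) is not None


name = build_filename("gulf", "Water Samples", date(2013, 10, 1), 2, "csv")
print(name)
print(is_valid(name))
print(is_valid("Final data (new).xlsx"))
```

A check like `is_valid` can be run over a shared directory to spot files that drifted from whatever convention the team agreed on. Version control software is the alternative the notes mention, and it removes the need to encode versions in filenames.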
The Data Administration section should include specifics about access to the data, how it will be protected if needed, and specific information on backups, including who is responsible. How (if at all) will data be shared during the project? What media will the group use? Here I show a couple of pages from our workbook. For security, researchers can "check" which security measures will be required. We also have room in the workbook to write about the specific policies. The same goes for backups.
The data sharing and archiving part of the Data Management Workbook might not get filled out until near the end of the project. Here researchers can list the data that they are planning to share and list the formats of the files. We offer tips on sharing and archiving. The workbook lets us include local policies and information. Here we would include URLs to local storage options, local policies on data retention and archiving options.
There are other key sections needed to complete a Data Management Workbook. Responsibilities: who's in charge of what during the project. A nice chart listing roles and people would be great to include in this section. Data Documentation and Metadata will be a HUGE section in the workbook. In fact, this is not a section in the Australian National University's manual. Creating it could be its own webinar. I included it here to make sure it gets attention and is not forgotten or left for the last minute. Data documentation and metadata need to be included with the data that you are archiving and sharing. In our workbook, we include a sample "Readme" file from our Institutional Repository's readme template. Budget is the last part of the manual: now that the data management methods and responsibilities have been established, you can estimate the costs of data management for your project. Often the time involved in documenting, writing metadata, and archiving is underestimated. Make note of any costs associated with using data management services or purchasing equipment (such as fileservers, backup media, software, etc.) used for data management.
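A rough budget estimate can be worked out from staff time plus purchases. The sketch below only shows the arithmetic; every figure in it (hours, hourly rate, prices) is a made-up placeholder, and `estimate_cost` is a hypothetical helper, not part of the workbook.

```python
def estimate_cost(staff_hours, hourly_rate, purchases):
    """Sum labor (hours x rate) and purchases into a simple cost estimate."""
    labor = sum(staff_hours.values()) * hourly_rate
    equipment = sum(purchases.values())
    return {"labor": labor, "equipment": equipment, "total": labor + equipment}


# Placeholder figures for illustration only.
staff_hours = {"documentation": 40, "metadata": 25, "archiving": 15}
purchases = {"backup media": 300.0, "file server share": 1200.0}

estimate = estimate_cost(staff_hours, 30.0, purchases)
for category, amount in estimate.items():
    print(f"{category:<10} {amount:>10.2f}")
```

Listing documentation, metadata, and archiving hours as separate entries makes it harder to overlook the time they take, which is the underestimation the notes warn about.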
So how do I get started managing data? Why is a Data Management Plan beyond the funder requirements needed? I hope I have enlightened you on the necessity and the contents of a DM Workbook. This is at the overall project level. The best time to develop your data management plan is at the beginning of your research. Take the plan you created for your proposal and expand upon it. The plan can be used for training new people.
Do you know if your institution has local data management requirements or policies? If so, or if your institution is considering such policies or requirements, you will be able to include them in DMPTool, Version 2. Let me show you how …
In DMPTool2 we have new roles. One is the institutional administrator, who can customize the DMPTool with local customizations (resource links, local help, etc.) AND, as a new customization, can create institutional DMP templates. Here's a screenshot of DMPTool2. It's not the final look and feel, but I want to focus on the functionality of creating Institutional Requirements. The institutional administrator can also assign others to be DMP template editors.
This next page is where the template is created (per local institution). Follow the development of DMPTool on the DMPTool2 Project page: https://bitbucket.org/dmptool/main/wiki/DMPTool2Project