An introduction to data management and how to prepare and write a data management plan. Discusses ways to meet funding agency requirements as well as best practices and local solutions. The video on slide 10 is available at: http://youtu.be/N2zK3sAtr-4
3. Data Management: Our Goals Today
• Why should we care about data management?
• What resources are available to help and assist data management and data management planning?
• What are Best Practices for data management?
• What tools are available for developing Data Management Plans (DMPs)?
• What ISU resources are available?
4. Why should we care about data management?
• Requirement of funding agencies
• NSF requires a data management plan in all proposals
• Data sharing is a component
• NIH has data sharing policy
• Other agencies likely to add or refine requirements in the future
• NSF awards support ISU research:
• In FY 2013 >$38,000,000
• Top 9 departments represent three STEM colleges
• ECpE, ME, GDCB, EEOB, Chem, CompSci, BBMB, Geol/Atmos Sci, Agron
5. Why should we care about data management?
• As scientists, we need to be able to use our data now and in the future
• Scientific findings built on data
• We are experiencing an explosion of data and information
• To use data now and in the future, it needs to be managed
6. Data Management: Our Goals Today
• Why should we care about data management?
• What resources are available to help and assist data management and data management planning?
• Why the Library?
• Library Guide for Data Management
8. Data Management: Our Goals Today
• Why should we care about data management?
• What resources are available to help and assist data management and data management planning?
• What are Best Practices for data management?
9. Library Guide: Best Practices
• Know your data
• Document your data
• Make your data and notes understandable
• Keep your data organized
• Keep copies and make backups
• Secure your data
• Resources: DataONE Primer on data management and DataONE Best Practices
11. Library Guide: Best Practices
• Know your data
• Document your data
• Keep your data organized
• Keep copies and make backups
• Make your data and notes understandable
• Secure your data
12. Data Management: Our Goals Today
• Why should we care about data management?
• What resources are available to help and assist data management and data management planning?
• What are Best Practices for data management?
• What tools are available for developing Data Management Plans?
13. Library Guide: DMP Checklists
• Boilerplate language for CyBox and DR@ISU
• Short checklist
• Full checklist
• Links to tools and resources for writing and developing a DMP
14. The Checklists
Short
• Only contains the “bare essentials”
• Is focused on the big picture
• 6 sections with 3 questions each
• A good place to start
Full
• Drills down to the details
• Contains links to “more information”
• 7 sections; variable number of questions per section
• Will help you develop a comprehensive DMP
15. Full Checklist – section 0:
Describe the Research Project
• Recap only the aspects that apply to data management.
• Who is involved?
• What’s the goal?
• What are your expected research products? Databases? Images? Code/Software?
Image credits: Sean MacEntee (databases); NIAID (SEM image); James Cridland (code) ; all via Flickr
16. Full Checklist – section 1:
Data Description & Identification
Describe and identify the data products of your proposal.
• Formats
• Digital: .jpeg, .pdf, .csv, webpages, etc.
• Analog: lab notebooks, surveys, specimens/artifacts, etc.
• Kinds of data
• Observational, experimental, simulation, derived?
• Methods of collection
• Surveys, direct observation, remote sensors?
• Expected sizes
17. Full Checklist – section 2:
Data Creation & Organization
How will your data be managed and what steps are being taken to ensure quality?
• File naming and organization systems
• Quality assurance
• File versioning
Image credits: PhD Comics: “Final”.doc
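One way to make a file naming and versioning system concrete is to generate names from a fixed pattern instead of typing them by hand. The sketch below is only an illustration: the pattern (project, description, ISO date, zero-padded version) and the function name `build_filename` are assumptions made for this example, not a convention taken from the Library Guide.

```python
import re
from datetime import date


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Return a name of the form project_description_YYYY-MM-DD_vNN.ext.

    The pattern is a hypothetical convention chosen for illustration.
    """
    def slug(text: str) -> str:
        # Lowercase, and collapse anything that is not a letter or digit
        # into a single hyphen so names are safe on any file system.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{when.isoformat()}_v{version:02d}.{ext.lstrip('.')}")


print(build_filename("Soil Survey", "Plot counts", date(2014, 3, 5), 1, "csv"))
# prints: soil-survey_plot-counts_2014-03-05_v01.csv
```

Putting the date in ISO order and padding the version number means that an alphabetical file listing is also a chronological one, which avoids the “Final”.doc problem of guessing which copy is newest.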
18. Full Checklist – section 3:
Data Documentation & Metadata
For data to be useful to you, your colleagues, and other researchers, it needs to be carefully documented and described.
• What is metadata? (not just for use by the NSA)
• “Data about data” which provides descriptive information.
• Why is metadata important?
• It facilitates reuse by other researchers (including other members of your research team).
• Discoverability – it lets others find your data.
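As a small illustration of “data about data”, the sketch below writes a descriptive record to a JSON file stored next to the data file it describes. The field names (`title`, `creator`, `variables`, and so on) and the function `write_metadata_sidecar` are assumptions made up for this example; they do not follow any particular metadata standard, and a discipline-specific schema should be preferred where one exists.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(data_file, title, creator, description,
                           collected_on, variables):
    """Write '<name>.metadata.json' beside data_file and return its path.

    All field names are hypothetical, chosen only for illustration.
    """
    data_file = Path(data_file)
    record = {
        "title": title,
        "creator": creator,
        "description": description,
        "collected_on": collected_on,  # ISO date string, e.g. "2014-03-05"
        "file": data_file.name,
        "variables": variables,        # column name -> meaning and units
    }
    sidecar = data_file.with_name(data_file.stem + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "plot-counts.csv"
    data.write_text("plot_id,count\n1,14\n", encoding="utf-8")
    sidecar = write_metadata_sidecar(
        data,
        title="Example plot counts",
        creator="Example Lab",
        description="Counts per plot from a made-up survey.",
        collected_on="2014-03-05",
        variables={"plot_id": "plot identifier", "count": "individuals counted"},
    )
    print(sidecar.read_text(encoding="utf-8"))
```

Keeping the record in a plain-text file beside the data means the description travels with the file when it is copied, shared, or deposited.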
19. Full Checklist – section 4:
Data Storage, Backup, & Security
Technical information about the machines, software, and systems used to create, back up, and store your data.
• Locations
• Physical and digital
• Security
• Who has access? How is it secured?
• Disaster Planning
• How often do you make backups? Where are they stored?
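A backup is only useful if the copy matches the original. One common way to check this is to compare checksums of the two files. The sketch below uses SHA-256 from Python's standard `hashlib` module; the helper names `sha256_of` and `backup_is_intact` are assumptions for this example, and it is not a description of how CyBox or any ISU service verifies files.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original, backup):
    """True when both files have identical contents (same checksum)."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "readings.csv"
    original.write_text("t,temp_c\n0,21.5\n", encoding="utf-8")
    backup = Path(tmp) / "readings_backup.csv"
    shutil.copy2(original, backup)
    print(backup_is_intact(original, backup))  # prints: True

    # Simulate a damaged backup by changing its contents.
    backup.write_text("t,temp_c\n0,99.9\n", encoding="utf-8")
    print(backup_is_intact(original, backup))  # prints: False
```

Recording the checksum at the time the data is created also makes it possible to detect later corruption of the original, not just differences between copies.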
20. Full Checklist – section 5:
Data Sharing & Ethics
What are your plans for data sharing and distribution?
• Can the data be shared?
• Are there any legal or ethical restrictions that prevent sharing?
• How will others find, locate, and access the data?
• How long will it be available?
• Will the data be stored in a repository? (more on this soon)
23. Library Guide: Repositories
• Data & Text Repositories
• Data Repositories:
• Dryad, GenBank, ICPSR, TreeBASE, Dataverse Network, FigShare, etc.
• Find a Data Repository
• Databib, Re3data.org, DataONE, etc.
• Text Repositories
• Institutional:
• Digital Repository @ Iowa State University
• Disciplinary:
• arXiv, CogPrints, Earth-Prints, etc.
24. Data Repository: Dryad
• Tracks downloads
• Assigns a DOI to datasets.
• Includes instructions for citation
• Packages data with metadata.
• Preservation actions
25. Library Guide: FAQs
• Targeted answers on specific topics such as:
• What counts as data?
• What is metadata?
• What if my research doesn’t produce data?
• What if I cannot share my data?
• And more!
26. Data Management: Our Goals Today
• Why should we care about data management?
• What resources are available to help and assist data management and data management planning?
• What are Best Practices for data management?
• What tools are available for developing Data Management Plans?
• What ISU resources are available?
27. Library Guide: ISU Resources
• ITS Policies and Standards
• Digital Repository @ Iowa State University
• ISU’s open access repository for scholarly and creative works
• Can be used to fulfill open access mandates for research papers
• Data Storage Services
• CyBox
• Cloud storage, file versioning, syncs to multiple machines, 30-day file recovery, etc.
• CyFiles / OrgFiles
28. Questions?
• Contact information:
• Megan O’Donnell – mno@iastate.edu
• Bonnie Bowen – bsbowen@iastate.edu
• Feedback form on the guide
• Workshop evaluation form – please fill out today
Editor's Notes
Not just the PIs. How would good data management advance your goals?
Format impacts accessibility, sharing and preservation. Proprietary formats? Standard formats for your discipline? Analog formats (paper)? What kind of data will you be working with? Observational, experimental, simulation, derived, etc. How are they collected or gathered? Surveys, direct observation, images, audio analysis, etc. What is the expected size of the data? Will the size impact other parts of the project?
Metadata, commonly called "data about data", is information which describes data. Good metadata enables others to understand and reuse data that they themselves did not create. Metadata elements should be agreed upon and implemented before starting data collection. Data collection and documentation are easier if you know what you need to record, and knowing this helps maintain consistency and quality.
Can help you locate an appropriate repository. Will broaden your research’s impact.
Dryad allows you to track downloads of the data (research impact). Assigns a DOI for tracking and locating. Packages data with metadata for easy reuse. Performs preservation actions like validating checksums.
More detailed answers for questions related to DMPs.