How to write a data management plan
C. Tobin Magle, PhD
Jan 24, 2017
10:00-11:30 a.m.
Morgan Library Computer Classroom 173
*inspired by content from CU Boulder research computing
The research cycle
[Figure: the research cycle. Labels: Hypothesis, Experimental design, Raw data, Tidy data, Results, Article, Data management plans, Cleaning, Analysis, Sharing, Archiving, Open data, Closed data, Code, Reproducible research, Reuse]
What is research data?
• “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings”
- White House Office of Management and Budget
• Reality: anything that is a (digital) product of your research
What is a data management plan?
A description of how you plan to describe, preserve and share your research data.
Often required by funding agencies
Successful DMPs include
• A data inventory, including type(s) and size
• A strategy for describing the data
• A plan for preserving the data long term
• A method for access to the data
Always make sure to follow funder requirements
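The four components above can be treated as a checklist while drafting. The sketch below is only an illustration: the section names and the draft plan are hypothetical, and a real plan should follow the funder's own template.

```python
# Hypothetical section names mirroring the four components listed above.
REQUIRED_SECTIONS = ("data_inventory", "description", "preservation", "access")


def missing_sections(plan):
    """Return the required sections that are absent or empty in a draft plan."""
    return [name for name in REQUIRED_SECTIONS if not plan.get(name, "").strip()]


# Example draft (contents are illustrative only).
draft = {
    "data_inventory": "miRNA sequences as FASTQ files, about 200 GB in total",
    "description": "README text file plus a discipline-specific template",
    "preservation": "",  # not written yet
}

print(missing_sections(draft))  # ['preservation', 'access']
```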
Data inventory
• What type of data are you going to collect?
• What file type will be produced?
• What size will these files be? How many files?
• What other research outputs will be produced?
• Code/Software?
• Templates/protocols?
Data inventory example
• Data type: miRNA sequences
• File type: FASTQ files
• Size: 1 GB per file x 64 strains x 3 replicates = ~200 GB
• Other research outputs: R scripts for analysis and visualization, data use tutorials
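The size estimate in the example is simple arithmetic and can be scripted so it is easy to redo when the design changes. This is a minimal sketch; the function name is hypothetical and it assumes one file per strain per replicate, as in the example.

```python
def estimate_storage_gb(gb_per_file, n_strains, n_replicates):
    """Return (number of files, total size in GB), assuming one file per strain per replicate."""
    n_files = n_strains * n_replicates
    return n_files, n_files * gb_per_file


n_files, total_gb = estimate_storage_gb(gb_per_file=1, n_strains=64, n_replicates=3)
print(f"{n_files} files, {total_gb} GB")  # 192 files, 192 GB
```

The exact figure is 192 GB, which the example rounds up to about 200 GB.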
Data formats
• Avoid proprietary formats
• Know what software can read your data
Proprietary format                 Alternative format
Excel (.xls, .xlsx)                Comma-separated values (.csv)
Word (.doc, .docx)                 Plain text (.txt)
PowerPoint (.ppt, .pptx)           PDF/A (.pdf)
Photoshop (.psd)                   TIFF (.tif, .tiff)
QuickTime (.mov)                   MPEG-4 (.mp4)
MPEG-4 protected audio (.m4p)      MP3 (.mp3)
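One way to apply the table is to scan a list of project files and flag those in a proprietary format. The sketch below only inspects file extensions; the function name and the example file names are hypothetical.

```python
from pathlib import PurePath

# Mapping taken from the table above: proprietary extension -> suggested alternative.
ALTERNATIVES = {
    ".xls": ".csv", ".xlsx": ".csv",
    ".doc": ".txt", ".docx": ".txt",
    ".ppt": ".pdf (PDF/A)", ".pptx": ".pdf (PDF/A)",
    ".psd": ".tif",
    ".mov": ".mp4",
    ".m4p": ".mp3",
}


def flag_proprietary(filenames):
    """Return {filename: suggested alternative} for files in a proprietary format."""
    flagged = {}
    for name in filenames:
        ext = PurePath(name).suffix.lower()  # case-insensitive extension match
        if ext in ALTERNATIVES:
            flagged[name] = ALTERNATIVES[ext]
    return flagged


files = ["counts.xlsx", "protocol.docx", "reads_01.fastq", "notes.txt"]
print(flag_proprietary(files))  # {'counts.xlsx': '.csv', 'protocol.docx': '.txt'}
```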
Exercise: Data Inventory
What kind of data are you going to collect?
What file type will be produced?
What size will these files be? How many files?
What other research outputs will be produced?
A strategy for describing the data
• Metadata: Relevant information for re-creation and re-use
• Contact info
• How data was collected
• Details about collection
• Date, location of collection
• Units
• Can be as simple as a text file
Genomics example (README)
This project contains next-generation miRNA sequencing data from 64 mouse strains.
Brain tissue from 10-week-old male mice was harvested and stored in RNAlater. RNA was
extracted using an RNeasy kit, and miRNA libraries were produced using an Illumina kit.
The libraries were run on an Illumina MiSeq sequencer. The FASTQ files produced were analyzed
in R using Bioconductor.
The data and descriptive metadata will be made available on NCBI in the BioProject (PRJXXXX). The
scripts used to analyze the data are available on GitHub (URL). Tutorials for data use will
be made available in the Digital Collections of Colorado (handle).
Contact Tobin Magle (tobin.magle@colostate.edu) for more information.
http://orcid.org/0000-0003-3185-7034
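A README like this one can also be generated from a small set of fields, which makes it harder to forget one. The sketch below is an assumption about how that might look: the field labels follow the metadata bullet list earlier in the deck, the function name is hypothetical, and the placeholder values must be filled in by the researcher.

```python
# Field labels follow the metadata bullet list earlier in the deck (hypothetical wording).
REQUIRED_FIELDS = [
    "Contact",
    "Collection method",
    "Collection date",
    "Collection location",
    "Units",
]


def build_readme(fields):
    """Render metadata fields as plain text, one 'Label: value' per line."""
    missing = [label for label in REQUIRED_FIELDS if label not in fields]
    if missing:
        raise ValueError(f"Missing metadata fields: {', '.join(missing)}")
    return "\n".join(f"{label}: {value}" for label, value in fields.items()) + "\n"


readme_text = build_readme({
    "Contact": "Tobin Magle (tobin.magle@colostate.edu)",
    "Collection method": "miRNA sequencing of mouse brain tissue",
    "Collection date": "(to be filled in)",
    "Collection location": "(to be filled in)",
    "Units": "(to be filled in)",
})
print(readme_text)
```

Raising an error on a missing field is a design choice: an incomplete README fails loudly instead of being published with gaps.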
Metadata standards
• Dublin Core: http://dublincore.org/documents/dcmi-terms/
• Can be applied to anything
• Many discipline specific metadata standards
• EML: https://knb.ecoinformatics.org/#external//emlparser/docs/index.html
• MIAME: http://fged.org/projects/miame/
• Search for other standards:
• http://www.dcc.ac.uk/resources/metadata-standards
• https://biosharing.org/standards/
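As an illustration of a general-purpose standard, the sketch below builds a small Dublin Core record as XML using Python's standard library. The `metadata` wrapper element and the function name are assumptions made for this example, and the values reuse the genomics README example, including its PRJXXXX placeholder.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <metadata> element with one dc:* child per (element, value) pair."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


record = dublin_core_record({
    "title": "miRNA sequencing data from 64 mouse strains",
    "creator": "Tobin Magle",
    "format": "FASTQ",
    "identifier": "PRJXXXX",  # placeholder, as in the README example
})
print(ET.tostring(record, encoding="unicode"))
```

A discipline-specific standard such as EML or MIAME would define its own required fields. Check the standard's documentation before adopting a layout like this one.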
Genomics example (NCBI template)
Exercise: Describe your data
What do people need to know to reuse your data?
Are there any discipline-specific metadata standards?
What format will you describe your data in (text, XML, tabular)?
What fields will you include (author, date, format, identifier?)
A plan for preserving the data long term
• What will you do to ensure data are properly stored and preserved?
• Include metadata and other products needed for reuse
• Might change over course of the project
Preservation questions
• What will you store?
• Who will be in charge?
• How long will you store it?
• Where will you store it?
• Multiple copies
Recommendations for backing up data
• Store in geographically distinct locations
• Automation: Will you remember to do it manually?
• Security: Are you working with PHI?
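The multiple-copies and automation recommendations can be combined in a short script that copies a file to several locations and checks each copy. This is a minimal sketch using only the standard library: the function names are hypothetical, and the backup directories stand in for whatever geographically distinct storage is actually available. It does not address security requirements for PHI.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dirs):
    """Copy source into each backup directory and verify every copy by checksum."""
    source = Path(source)
    expected = sha256sum(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves file timestamps
        if sha256sum(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Running a script like this from a scheduler, not by hand, addresses the automation question above.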
Exercise: Preservation plan
What will you store?
Who will be responsible for the data (person or position)?
How long will you store it?
Where will you store it?
How will you back it up?
A method to access the data
• Important to funding agencies
• Reproduce existing research
• Promote further research
• Must be easily available:
• No “by request only”
• Embargoes are “ok”
• Data security: consider privacy and IP issues before sharing
Data access and sharing best practices
• Non-proprietary formats
• Include metadata
• Proper storage
• Stable identifier
• Licensing: conditions for reuse
Trusted Repositories: store and share
• Discipline specific repositories
• Search: http://service.re3data.org/browse/by-subject/
• Generic:
• Figshare - https://figshare.com/
• Dryad - http://datadryad.org/
• CSU Digital Repository:
• http://lib.colostate.edu/digital-collections/
Data archiving service
• Finished products for sharing
• CSU Digital Repository
• Over 100 datasets
• Satisfy requirements for manuscripts and grants
• No cost under 1 TB
• $150/TB for 5 years
• $300/TB for >5 years
Stable identifiers
• URLs break
• Stable identifiers are permanent in a database
• Some provide linking capabilities
• DOI: https://doi.org/10.1109/5.771073
• Handle: http://hdl.handle.net/10217/177356
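Both example links above are a resolver address followed by the identifier itself, so a link can be rebuilt from the bare identifier. The sketch below assumes only the two resolver addresses shown in the examples; the function name is hypothetical.

```python
# Resolver base URLs as they appear in the examples above.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "http://hdl.handle.net/",
}


def resolver_url(kind, identifier):
    """Build a resolvable link for a DOI or Handle identifier."""
    try:
        base = RESOLVERS[kind.lower()]
    except KeyError:
        raise ValueError(f"Unknown identifier type: {kind}") from None
    return base + identifier.strip()


print(resolver_url("doi", "10.1109/5.771073"))  # https://doi.org/10.1109/5.771073
print(resolver_url("handle", "10217/177356"))   # http://hdl.handle.net/10217/177356
```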
Licensing
• State your conditions for reuse
• Paper citation?
• Disclaimers
• Must justify limitations, describe how you’ll advertise them
• Creative Commons licenses are a good starting point
Exercise: Access methods
Where will people be able to access the data?
Does your discipline have a repository?
What kind of stable identifier will it have?
What are the conditions for reuse?
Are there any limitations to use of these data? Why?
DMPTool
• Review requirements from different agencies
• https://dmptool.org/guidance
• Create new DMPs based on funding agency templates
• Search public DMPs
Need help?
• Email: tobin.magle@colostate.edu
• DMPTool: http://dmptool.org/
• Data Management Services website: http://lib.colostate.edu/services/data-management

Data and Donuts: How to write a data management plan

  • 1. How to write a data management plan C. Tobin Magle, PhD Jan 24, 2017 10:00-11:30 a.m. Morgan Library Computer Classroom 173 *inspired by content from CU Boulder research computing
  • 3. What is research data? • “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings” - White House Office of Management and Budget • Reality: anything that is a (digital) product of your research
  • 4. What is a data management plan? A description of how you plan to describe, preserve and share your research data. Often required by funding agencies
  • 5. Successful DMPs include • A data inventory, including type(s) and size • A strategy for describing the data • A plan for preserving the data long term • A method for access to the data Always make sure to follow funder requirements
  • 6. Data inventory • What type of data are you going to collect? • What file type will be produced? • What size will these files be? How many files? • What other research outputs will be produced? • Code/Software? • Templates/protocols?
  • 7. Data inventory • What type of data are you going to collect? miRNA sequences • What file type will be produced? FASTQ files • What size will these files be? How many files? 1 GB per file x 64 strains x 3 replicates = ~200 GB • What other research outputs will be produced? • Code/Software? R scripts for analysis and visualization • Templates/protocols? Data use tutorials
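
The size estimate on this slide is simple arithmetic, and it can help to keep it as a small script so the number is easy to update when the design changes. The sketch below assumes one file per strain per replicate; the function name `estimate_total_gb` is made up for illustration.

```python
def estimate_total_gb(gb_per_file, n_strains, n_replicates):
    """Return (number of files, total size in GB).

    Assumes exactly one file per strain per replicate.
    """
    n_files = n_strains * n_replicates
    return n_files, gb_per_file * n_files


# Numbers from the slide: 1 GB per file, 64 strains, 3 replicates.
n_files, total_gb = estimate_total_gb(gb_per_file=1, n_strains=64, n_replicates=3)
print(f"{n_files} files, {total_gb} GB")  # 192 files, 192 GB
```

The exact product is 192 GB, which the slide rounds to ~200 GB.
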
  • 8. Data formats • Avoid proprietary formats • Know what software can read your data • Proprietary format → alternative format: • Excel (.xls, .xlsx) → Comma Separated Values (.csv) • Word (.doc, .docx) → plain text (.txt) • PowerPoint (.ppt, .pptx) → PDF/A (.pdf) • Photoshop (.psd) → TIFF (.tif, .tiff) • Quicktime (.mov) → MPEG-4 (.mp4) • MPEG 4 Protected audio (.m4p) → MP3 (.mp3)
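
One way to apply the format table during a data inventory is to scan file names and flag proprietary extensions. This is a minimal sketch: the mapping is copied from the table on the slide, while the function name `suggest_format` and the example file names are hypothetical.

```python
from pathlib import Path

# Proprietary extension -> suggested alternative, taken from the slide's table.
ALTERNATIVES = {
    ".xls": ".csv", ".xlsx": ".csv",
    ".doc": ".txt", ".docx": ".txt",
    ".ppt": ".pdf", ".pptx": ".pdf",
    ".psd": ".tif",
    ".mov": ".mp4",
    ".m4p": ".mp3",
}


def suggest_format(filename):
    """Return the suggested alternative extension, or None if none is listed."""
    return ALTERNATIVES.get(Path(filename).suffix.lower())


# Hypothetical file names, for illustration only.
for name in ["counts.xlsx", "protocol.docx", "reads.fastq"]:
    print(name, "->", suggest_format(name))
```

The script only reports a suggestion; the conversion itself still has to be done in software that can read the original format.
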
  • 9. Exercise: Data Inventory What kind of data are you going to collect? What file type will be produced? What size will these files be? How many files? What other research outputs will be produced?
  • 10. A strategy for describing the data • Metadata: Relevant information for re-creation and re-use • Contact info • How data was collected • Details about collection • Date, location of collection • Units • Can be as simple as a text file
  • 11. Genomics example (README) This project contains next-generation miRNA sequencing data from 64 mouse strains. Brain tissue from 10-week-old male mice was harvested and stored in RNAlater. RNA was extracted using an RNeasy kit, and miRNA libraries were produced using an Illumina kit. The libraries were run on an Illumina MiSeq sequencer. The FASTQ files produced were analyzed in R using Bioconductor. The data and descriptive metadata will be made available on NCBI in the BioProject (PRJXXXX). The scripts used to analyze the data are available on GitHub (URL). Tutorials for data use will be made available in the Digital Collections of Colorado (handle). Contact Tobin Magle (tobin.magle@colostate.edu) for more information. http://orcid.org/0000-0003-3185-7034
  • 12. Metadata standards • Dublin Core: http://dublincore.org/documents/dcmi-terms/ • Can be applied to anything • Many discipline-specific metadata standards • EML: https://knb.ecoinformatics.org/#external//emlparser/docs/index.html • MIAME: http://fged.org/projects/miame/ • Search for other standards: • http://www.dcc.ac.uk/resources/metadata-standards • https://biosharing.org/standards/
  • 14. Exercise: Describe your data What do people need to know to reuse your data? Are there any discipline-specific metadata standards? What format will you describe your data in (text, XML, tabular)? What fields will you include (author, date, format, identifier)?
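
For the XML option in this exercise, a record using Dublin Core element names might look like the sketch below. It is an illustration, not a complete or validated Dublin Core record: the wrapper element `metadata`, the function name `dublin_core_record`, and the field values (loosely based on the genomics README example, with a placeholder date and identifier) are all assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build a small XML record with one dc:<name> element per field."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper name is an arbitrary choice
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "miRNA sequencing data from 64 mouse strains",
    "creator": "Tobin Magle",
    "date": "2017-01-24",     # placeholder date
    "format": "FASTQ",
    "identifier": "PRJXXXX",  # placeholder, as in the README example
})
print(record)
```

The same fields could just as well go in a plain-text README or a one-row table; the format matters less than recording the fields consistently.
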
  • 15. A plan for preserving the data long term • What will you do to ensure data are properly stored and preserved? • Include metadata and other products needed for reuse • Might change over course of the project
  • 16. Preservation questions • What will you store? • Who will be in charge? • How long will you store it? • Where will you store it? • Multiple copies
  • 17. Recommendations for backing up data • Store in geographically distinct locations • Automation: Will you remember to do it manually? • Security: Are you working with protected health information (PHI)?
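
An automated backup is only useful if the copy matches the original. The sketch below shows one possible approach, not something the slides prescribe: copy a file, then compare SHA-256 checksums of the source and the copy. The function names and the throwaway demo file are hypothetical, and a real plan would point `backup_dir` at a second, geographically distinct location.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy source into backup_dir and check that the copy is byte-identical."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    return sha256_of(source) == sha256_of(target)


# Demo on a throwaway file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "sample.fastq"
    original.write_text("@read1\nACGT\n+\nIIII\n")
    demo_ok = backup_and_verify(original, Path(tmp) / "backup")
    print("backup verified:", demo_ok)
```

A script like this can be run on a schedule, which addresses the automation question on the slide. It does not address security requirements for sensitive data.
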
  • 18. Exercise: Preservation plan What will you store? Who will be responsible for the data (person or position)? How long will you store it? Where will you store it? How will you back it up?
  • 19. A method to access the data • Important to funding agencies • Reproduce existing research • Promote further research • Must be easily available: • No “by request only” • Embargoes are “ok” • Data security: consider privacy and IP issues before sharing
  • 20. Data access and sharing best practices • Non-proprietary formats • Include metadata • Proper storage • Stable identifier • Licensing: conditions for reuse
  • 21. Trusted Repositories: store and share • Discipline-specific repositories • Search: http://service.re3data.org/browse/by-subject/ • Generic: • Figshare - https://figshare.com/ • Dryad - http://datadryad.org/ • CSU Digital Repository: • http://lib.colostate.edu/digital-collections/ http://67.media.tumblr.com/6228cbe58a9652f1a85e8ab1ed08d715/tumblr_inline_n6oukhNlZW1qf11bs.png
  • 22. Data archiving service • Finished products for sharing • CSU Digital Repository • Over 100 datasets • Satisfy requirements for manuscripts and grants • No cost for <1 TB • $150/TB for 5 years • $300/TB for >5 years
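
The pricing tiers above can be turned into a quick budget estimate for a grant proposal. The slide does not spell out the billing details, so this sketch makes three labeled assumptions: anything under 1 TB is free regardless of duration, larger datasets are charged on their full size without rounding, and the $150/TB rate covers any term up to and including 5 years. The function name `archiving_cost` is hypothetical; confirm actual charges with the repository.

```python
def archiving_cost(size_tb, years):
    """Estimate the archiving cost in dollars under the assumptions above."""
    if size_tb < 1:
        return 0  # assumed free below 1 TB, whatever the duration
    rate_per_tb = 150 if years <= 5 else 300
    return rate_per_tb * size_tb


# The ~200 GB genomics example (0.2 TB) falls under the free tier.
print(archiving_cost(0.2, 10))  # 0
# A hypothetical 2 TB dataset, for 5 years and for 10 years.
print(archiving_cost(2, 5))     # 300
print(archiving_cost(2, 10))    # 600
```
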
  • 23. Stable identifiers • URLs break • Stable identifiers are permanent in a database • Some provide linking capabilities • DOI – https://doi.org/10.1109/5.771073 • Handle – http://hdl.handle.net/10217/177356
  • 24. Licensing • State your conditions for reuse • Paper citation? • Disclaimers • Must justify limitations, describe how you’ll advertise them • Creative Commons licenses are a good starting point
  • 25. Exercise: Access methods Where will people be able to access the data? Does your discipline have a repository? What kind of stable identifier will it have? What are the conditions for reuse? Are there any limitations to use of these data? Why?
  • 26. DMPTool • Review requirements from different agencies • https://dmptool.org/guidance • Create new DMPs based on funding agency templates • Search public DMPs
  • 27. Need help? • Email: tobin.magle@colostate.edu • DMPTool: http://dmptool.org/ • Data Management Services website: http://lib.colostate.edu/services/data-management