Data Management for
Undergraduate
Researchers
Office of Undergraduate Research Seminar and Workshop Series
Rebekah Cummings, Research Data Management Librarian
J. Willard Marriott Library, University of Utah
September 21, 2015
• Introductions
• What are data?
• Why manage data?
• Data Management Plans
• Data Organization
• Metadata
• Storage and Archiving
• Questions
Name
Major
Research Project
What is data management?
The process of controlling the
information (read: data) generated
during a research project.
https://www.libraries.psu.edu/psul/pubcur/what_is_dm.html
What are data?
“The recorded factual material
commonly accepted in the research
community as necessary to validate
research findings.”
- U.S. OMB Circular A-110
Data are diverse
Data are messy
Why manage data?
• Save time and increase efficiency
• Meet grant requirements
• Promote reproducible research
• Enable new discoveries from your data
• Make the results of publicly funded research
publicly available
We are trying to avoid
this scenario…
Two bears data
management problems
1. Didn’t know where he stored the data
2. Saved one copy of the data on a USB drive
3. Data was in a format that could only be read by
outdated, proprietary software
4. No codebook to explain the variable names
5. Variable names were not descriptive
6. No contact information for the co-author Sam Lee
Scenario
You develop a research project during your
undergraduate experience. You write up the
results, which are accepted by a reputable
journal. People start citing your work! Three
years later someone accuses you of falsifying
your work.
Scenario adapted from MANTRA training
module
• Would you be able to prove you did the
work as you described in the article?
• What would you need to prove you hadn’t
falsified the data?
• What should you have done throughout
your research study to be able to prove
you did the work as described?
Data Management Plans
• What data are generated by your research?
• What is your plan for managing the data?
• How will your data be shared?
Research Data Lifecycle
Courtesy of the UK Data Archive
http://www.data-archive.ac.uk/create-manage/life-cycle
• Types of data
• Data description
• Data storage
• Data sharing
• Data archiving and
responsibility
• Data management costs
Data organization
File naming
MyData.xls
MeetingNotes.doc
Presentation.ppt
Assignment1.pdf
File naming best practices
1. Be descriptive
2. Don’t be generic
3. Appropriate length
4. Be consistent
5. Think critically about
your file names
File naming best practices
• Files should include only letters,
numbers, and underscores/dashes.
• No special characters
• No spaces; Use dashes, underscores, or
camel case (like-this or likeThis)
• Not all systems are case sensitive.
Assume this, THIS, and tHiS are the
same.
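The naming rules above can be checked automatically. The following is a minimal sketch, not part of the original workshop: the function names `is_safe_filename` and `find_case_collisions` and the 64-character length limit are assumptions chosen for illustration.

```python
import re

# Letters, numbers, underscores, and dashes only, followed by one extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def is_safe_filename(name: str, max_length: int = 64) -> bool:
    """Return True if the name has no spaces or special characters.

    max_length is an assumed limit; pick one that suits your project.
    """
    return len(name) <= max_length and SAFE_NAME.fullmatch(name) is not None


def find_case_collisions(names):
    """Return groups of names that differ only by letter case."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(is_safe_filename("20140724_NSF_SoilSamples_Cummings.xlsx"))  # True
print(is_safe_filename("July 24 2014_SoilSamples%_v6.xlsx"))       # False
print(find_case_collisions(["this.txt", "THIS.txt", "other.txt"]))
```

Running a check like this over a project folder before sharing it is one way to catch names that may break on another operating system.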
Version Control - Numbering
001
002
003
009
010
099
Use leading zeros for
scalability
Bonus Tip: Use ordinal numbers (v1, v2, v3) for major version
changes and decimals for minor changes (v1.1, v2.6).
1
10
2
3
9
99
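The two number lists on this slide show how a file browser sorts names as text. A short sketch of the effect, assuming Python; the helper name `padded` and the default width of three digits are illustrative choices:

```python
def padded(number: int, width: int = 3) -> str:
    """Format a sequence number with leading zeros, e.g. 9 -> '009'."""
    return f"{number:0{width}d}"


numbers = [1, 2, 3, 9, 10, 99]

# Sorting as text, the way most file browsers order names.
unpadded_names = sorted(str(n) for n in numbers)
padded_names = sorted(padded(n) for n in numbers)

print(unpadded_names)  # ['1', '10', '2', '3', '9', '99']
print(padded_names)    # ['001', '002', '003', '009', '010', '099']
```

Choose the width up front: three digits cover up to 999 files before the ordering breaks again.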
Version Control - Dates
If using dates, use YYYYMMDD
June2015 = BAD!
06-18-2015 = BAD!
20150618 = GREAT!
2015-06-18 = This is fine too 
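Date stamps in this format can be generated rather than typed by hand. A minimal sketch using Python's standard `datetime` module; the function name `date_stamp` is an assumption for illustration:

```python
from datetime import date


def date_stamp(day: date, dashed: bool = False) -> str:
    """Return YYYYMMDD, or YYYY-MM-DD when dashed is True."""
    return day.strftime("%Y-%m-%d" if dashed else "%Y%m%d")


interview_day = date(2015, 6, 18)
print(date_stamp(interview_day))               # 20150618
print(date_stamp(interview_day, dashed=True))  # 2015-06-18

# Year-first stamps sort chronologically even when sorted as plain text.
stamps = ["20150618", "20141231", "20150102"]
print(sorted(stamps))  # ['20141231', '20150102', '20150618']
```

This is why year-first dates work well in file names: alphabetical order and chronological order are the same.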
From a DMP…
“Each file name, for all types of data, will
contain the project acronym PUCCUK; a
reference to the file content (survey,
interview, media) and the date of an event
(such as the date of an interview).”
• PLPP_EvaluationData_Workshop2_2014.xlsx
• MyData.xlsx
• publiclibrarypartnershipsprojectevaluationdataworkshop22014CummingsHelenaMontana.xlsx
Who filed better?
Who filed better?
• July 24 2014_SoilSamples%_v6
• 20140724_NSF_SoilSamples_Cummings
• SoilSamples_FINAL
File organization best
practices
• Top level folder should include project title
and date.
• Sub-structure should have a clear and
consistent naming convention.
• Document your structure in a README
text file.
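These three practices can be set up in one step at the start of a project. The sketch below is one possible approach, not a prescribed layout: the function name `create_project_skeleton`, the project title, and the subfolder names are all hypothetical.

```python
import tempfile
from pathlib import Path


def create_project_skeleton(root: Path, title: str, year: int, subfolders) -> Path:
    """Create a top-level folder named title_year with documented subfolders."""
    project_dir = root / f"{title}_{year}"
    project_dir.mkdir(parents=True, exist_ok=True)
    for name in subfolders:
        (project_dir / name).mkdir(exist_ok=True)

    # Record the structure in a plain-text README.
    readme_lines = [f"Project: {title} ({year})", "", "Folders:"]
    readme_lines += [f"  {name}/" for name in subfolders]
    (project_dir / "README.txt").write_text(
        "\n".join(readme_lines) + "\n", encoding="utf-8"
    )
    return project_dir


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    project = create_project_skeleton(
        Path(scratch), "SoilSamples", 2014, ["data_raw", "data_clean", "docs"]
    )
    print((project / "README.txt").read_text(encoding="utf-8"))
```

Keeping the README next to the data means the explanation travels with the files when the folder is copied or archived.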
File organization exercise
Describing data
Research Documentation
• Grant proposals and related reports
• Applications and approvals (e.g. IRB)
• Codebooks, data dictionaries
• Consent forms
• Surveys, questionnaires, interview protocols
• Transcripts, hard copies of audio and video files
• Any software or code you used (no matter how
insignificant or buggy)
Three levels of
documentation
• Project level – what the study set out to do, research
questions, methods, sampling frames, instruments,
protocols, members of the research team
• File or database level – How all the files relate to
one another. A README file is a classic way of capturing
this information.
• Variable or item level – Full label explaining the
meaning of each variable.
http://datalib.edina.ac.uk/mantra/documentation_metadata_citation/
IJ?
XVAR?
FNAME?
Metadata
Unstructured
Data
Structured
Data
There was a study put out by Dr. Gary
Bradshaw from the University of
Nebraska Medical Center in 1982
called “Growth of Rodent Kidney
Cells in Serum Media and the Effect of
Viral Transformation On Growth”. It
concerns the cytology of kidney cells.
Title: Growth of rodent kidney cells in serum media and the effect of viral transformations on growth.
Author: Gary Bradshaw
Date: 1982
Publisher: University of Nebraska Medical Center
Subject: Kidney -- Cytology
Dublin Core
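The same record can be written in machine-readable form. This is a minimal sketch using Python's standard library and the Dublin Core element namespace. Dublin Core calls the author field `creator`; the wrapper element name `metadata` and the function name `to_dublin_core_xml` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

record = {
    "title": (
        "Growth of rodent kidney cells in serum media "
        "and the effect of viral transformations on growth."
    ),
    "creator": "Gary Bradshaw",  # the "Author" field on the slide
    "date": "1982",
    "publisher": "University of Nebraska Medical Center",
    "subject": "Kidney -- Cytology",
}


def to_dublin_core_xml(fields: dict) -> str:
    """Serialize a flat dict as Dublin Core elements inside a wrapper."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


print(to_dublin_core_xml(record))
```

Because each value sits in a named field, software can search, sort, and exchange the record, which is not possible with the free-text paragraph.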
Disciplinary Metadata
Digital Curation Centre’s list of subject-specific metadata
schemas -
http://www.dcc.ac.uk/resources/metadata-standards
Data Storage
LOCKSS (Lots of
Copies Keeps
Stuff Safe)
Options for data
storage
• Personal computers or laptops
• Networked drives
• External storage devices
Ubox – box.utah.edu
Language from a DMP
“All data files will be stored on the University server that is backed
up nightly. The University's computing network is protected from
viruses by a firewall and anti-virus software. Digital recordings will
be copied to the server each day after interviews.
Signed consent forms will be stored in a locked cabinet in the
office. Interview recordings and transcripts, which may contain
personal information, will be password protected at file-level and
stored on the server.
Original versions of the files will always be kept on the server. If
copies of files are held on a laptop and edits made, their file names
will be changed.”
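Copying files to a second location each day, as this plan describes, is easy to script. The sketch below assumes Python and is not the University's actual backup process: the function names and the demo file name are hypothetical. It copies one file and confirms that the copy matches the original by comparing SHA-256 checksums.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


# Demo with a throwaway file and folder (hypothetical names).
with tempfile.TemporaryDirectory() as scratch:
    original = Path(scratch) / "20150618_interview_01.txt"
    original.write_text("example transcript\n", encoding="utf-8")
    copy = backup_file(original, Path(scratch) / "server_copy")
    print(copy.name, sha256_of(copy) == sha256_of(original))
```

A copy on the same drive does not protect against drive failure, so the backup folder should be on a different device or service.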
Thinking long-term
Archiving options
• Domain-specific repository
• General Purpose Data Repository
• Institutional repository
Major takeaways
• Data management starts at the beginning of
a project
• Document your data so that someone else
could understand it
• Have more than one copy of your data
• Consider archiving options when you are
done with your project
Questions?
rebekah.cummings@utah.edu
(801) 581-7701
Marriott Library, 1705Y
…or ask now!

Holdier Curriculum Vitae (April 2024).pdf
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 

Data Management for Undergraduate Researchers

  • 1. Data Management for Undergraduate Researchers Office of Undergraduate Research Seminar and Workshop Series Rebekah Cummings, Research Data Management Librarian J. Willard Marriott Library, University of Utah September 21, 2015
  • 2. • Introductions • What are data? • Why manage data? • Data Management Plans • Data Organization • Metadata • Storage and Archiving • Questions
  • 4. What is data management? The process of controlling the information (read: data) generated during a research project. https://www.libraries.psu.edu/psul/pubcur/what_is_dm.html
  • 5. What are data? “The recorded factual material commonly accepted in the research community as necessary to validate research findings.” - U.S. OMB Circular A-110
  • 8. Why manage data? • Save time and efficiency • Meet grant requirements • Promote reproducible research • Enable new discoveries from your data • Make the results of publicly funded research publicly available
  • 9. We are trying to avoid this scenario…
  • 10. Two bears data management problems 1. Didn’t know where he stored the data 2. Saved one copy of the data on a USB drive 3. Data was in a format that could only be read by outdated, proprietary software 4. No codebook to explain the variable names 5. Variable names were not descriptive 6. No contact information for the co-author Sam Lee
  • 11. Scenario You develop a research project during your undergraduate experience. You write up the results, which are accepted by a reputable journal. People start citing your work! Three years later someone accuses you of falsifying your work. Scenario adapted from MANTRA training module
  • 12. • Would you be able to prove you did the work as you described in the article? • What would you need to prove you hadn’t falsified the data? • What should you have done throughout your research study to be able to prove you did the work as described?
  • 13. Data Management Plans • What data are generated by your research? • What is your plan for managing the data? • How will your data be shared?
  • 14. Research Data Lifecycle Courtesy of the UK Data Archive http://www.data-archive.ac.uk/create-manage/life-cycle • Types of data • Data description • Data storage • Data sharing • Data archiving and responsibility • Data management costs
  • 18. File naming best practices 1. Be descriptive 2. Don’t be generic 3. Appropriate length 4. Be consistent 5. Think critically about your file names
  • 19. File naming best practices • Files should include only letters, numbers, and underscores/dashes. • No special characters • No spaces; use dashes, underscores, or camel case (like-this or likeThis) • Not all systems are case sensitive. Assume this, THIS, and tHiS are the same.
  • 20. Version Control - Numbering With leading zeros: 001 002 003 009 010 099 Without leading zeros: 1 10 2 3 9 99 Use leading zeros for scalability Bonus Tip: Use ordinal numbers (v1, v2, v3) for major version changes and decimals for minor changes (v1.1, v2.6)
  • 21. Version Control - Dates If using dates, use YYYYMMDD • June2015 = BAD! • 06-18-2015 = BAD! • 20150618 = GREAT! • 2015-06-18 = This is fine too
  • 22. From a DMP… “Each file name, for all types of data, will contain the project acronym PUCCUK; a reference to the file content (survey, interview, media) and the date of an event (such as the date of an interview).”
  • 23. Who filed better? • PLPP_EvaluationData_Workshop2_2014.xlsx • MyData.xlsx • publiclibrarypartnershipsprojectevaluationdataworkshop22014CummingsHelenaMontana.xlsx
  • 24. Who filed better? • July 24 2014_SoilSamples%_v6 • 20140724_NSF_SoilSamples_Cummings • SoilSamples_FINAL
  • 25. File organization best practices • Top level folder should include project title and date. • Sub-structure should have a clear and consistent naming convention. • Document your structure in a README text file.
  • 28. Research Documentation • Grant proposals and related reports • Applications and approvals (e.g. IRB) • Codebooks, data dictionaries • Consent forms • Surveys, questionnaires, interview protocols • Transcripts, hard copies of audio and video files • Any software or code you used (no matter how insignificant or buggy)
  • 29. Three levels of documentation • Project level – what the study set out to do, research questions, methods, sampling frames, instruments, protocols, members of the research team • File or database level – How all the files relate to one another. A README file is a classic way of capturing this information. • Variable or item level – Full label explaining the meaning of each variable. http://datalib.edina.ac.uk/mantra/documentation_metadata_citation/
  • 32. Metadata Unstructured Data: There was a study put out by Dr. Gary Bradshaw from the University of Nebraska Medical Center in 1982 called “Growth of Rodent Kidney Cells in Serum Media and the Effect of Viral Transformation On Growth”. It concerns the cytology of kidney cells. Structured Data: • Title: Growth of rodent kidney cells in serum media and the effect of viral transformations on growth. • Author: Gary Bradshaw • Date: 1982 • Publisher: University of Nebraska Medical Center • Subject: Kidney -- Cytology
  • 34. Disciplinary Metadata Digital Curation Centre’s list of subject-specific metadata schemas - http://www.dcc.ac.uk/resources/metadata-standards
  • 36. LOCKSS (Lots of Copies Keeps Stuff Safe)
  • 37. Options for data storage • Personal computers or laptops • Networked drives • External storage devices
  • 39. Language from a DMP “All data files will be stored on the University server that is backed up nightly. The University's computing network is protected from viruses by a firewall and anti-virus software. Digital recordings will be copied to the server each day after interviews. Signed consent forms will be stored in a locked cabinet in the office. Interview recordings and transcripts, which may contain personal information, will be password protected at file-level and stored on the server. Original versions of the files will always be kept on the server. If copies of files are held on a laptop and edits made, their file names will be changed.”
  • 41. Archiving options • Domain-specific repository • General Purpose Data Repository • Institutional repository
  • 42. Major takeaways • Data management starts at the beginning of a project • Document your data so that someone else could understand it • Have more than one copy of your data • Consider archiving options when you are done with your project
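
The file naming and version control rules from slides 19 to 21 can be sketched in code. The following Python example is an illustration only and is not part of the original presentation: the function names (`is_safe_file_name`, `build_file_name`) and the `PROJECT_Content_YYYYMMDD_v###.ext` pattern are assumptions chosen for the example, and the sample values are borrowed from the "Who filed better?" slides.

```python
import re
from datetime import date

# Slide 19: letters, numbers, underscores, and dashes only, with an
# optional file extension. No spaces and no special characters.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def is_safe_file_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return SAFE_NAME.fullmatch(name) is not None


def build_file_name(project: str, content: str, day: date,
                    version: int, ext: str) -> str:
    """Build a name shaped like PROJECT_Content_YYYYMMDD_v###.ext.

    The pattern itself is a hypothetical convention, not one prescribed
    by the slides; adapt the parts to your own project.
    """
    stamp = day.strftime("%Y%m%d")  # slide 21: YYYYMMDD sorts chronologically
    # Slide 20: leading zeros keep version numbers in order when sorted.
    name = f"{project}_{content}_{stamp}_v{version:03d}.{ext}"
    if not is_safe_file_name(name):
        raise ValueError(f"unsafe file name: {name!r}")
    return name


if __name__ == "__main__":
    print(build_file_name("PLPP", "SoilSamples", date(2014, 7, 24), 6, "xlsx"))
    print(is_safe_file_name("July 24 2014_SoilSamples%_v6"))
    # Plain text sorting shows why leading zeros matter.
    print(sorted(["1", "10", "2"]), sorted(["001", "010", "002"]))
```

A check like this could be run over a project folder before archiving to flag names that contain spaces or special characters; whether to rename automatically or only report is a choice left to the research team.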

Editor's Notes

  1. Specifically, we are going to be talking about data management of your research data, but some of the principles will help you when thinking about the organization of any digital materials: your notes, your PowerPoints, your grocery lists… Most of these concepts are pretty straightforward and almost seem like common sense, but the reality is that very few people manage their data well, and if you do, you will be at a big advantage.
  2. Overview of what we will be covering in this session. Each of these could be a one hour course, but we are going to hit the highlights so to speak.
  3. Introductions Name Major Are you working on a research project?
  4. Let’s start at the very beginning… This definition is from the Penn State Library website. Controlling often means planning, organizing, and sharing that data effectively. It means being thoughtful and deliberate in your data practices. It is really about being informed, deliberate, and in control of the lifecycle of your data.
  5. This is the most commonly cited definition when someone wants to pin a definition on data, which is surprisingly difficult to do. What data really is is evidence. Or as Michael Buckland puts it “alleged evidence”. It’s what you are putting forth as evidence for your research findings. “We’ve looked at all this stuff” using these methods and here are our conclusions. Research papers often give methods and conclusions but what they don’t usually contain is the underlying data or evidence. So what is data – EVIDENCE FOR YOUR RESEARCH
  6. One of the characteristics of data is that it tends to be incredibly diverse. Scientific data – observations, computational models, lab notebooks. Social sciences – results of surveys, video recordings, field notes. Humanities – text mining, newspapers, records of human history.
  7. Another attribute of data is that it tends to get messy. Most of us just don’t realize this because our messy, disorganized files are locked up in a neat little box called a computer. Don’t believe me? How long would it take you to find a photo from five years ago on your computer? Here is a hint: if your image files start with DSC_ or IMG_ followed by some number, it will probably take you a very long time. If most people’s digital files were analog, this is exactly what they would look like.
  8. The main reason you should manage your data is for yourself and for your own research team. Data management is one of those essential skills you need to acquire, just like learning how to manage citations or understand research methods. It can feel a bit boring, like filing. But six months later, when you want to locate a file, or even understand your file, your future self will thank you. Plus, unlike research methods and managing citations, this is something that even seasoned scientists are not very good at. So you will have something to offer your research team in the future even as a young scientist.
  9. https://www.youtube.com/watch?v=N2zK3sAtr-4
  10. Hopefully by now you can all see why data management is important. Now we’re going to think a little more deeply about how we can avoid the “Two bears” situation. Let’s look at this scenario…
  11. Get in teams of 2-3 and discuss… Retain the data for a period of at least five years. Put your data in a repository. Keep multiple copies. Keep excellent documentation of your data practices, methods, workflows, and sources, plus any code you may have generated.
  12. The most important thing you can do is to have and follow a data management plan. Next we are going to move on and talk a little bit about these data management plans that funding agencies are requiring (and I am promoting as a good idea in general!!) Your DMP should answer three main questions…
  13. For all the reasons we have talked about, many agencies are now requiring data management plans at the start of a research project. This means when you apply for funding for a project, you will have to have a two-page data management plan as part of your proposal. That plan is going to talk about the “lifecycle” of your data throughout the course of the project. How many of you plan on applying for a grant at some point in your careers? Introduce data lifecycle. Funders know that the earlier you start thinking about your data, the better. It’s much more likely that the results of your research will be reproducible, it helps avoid data loss, and increases the value of your research.
  14. We’ve talked in broad strokes about data management, but now we are going to focus in on some of the more specific aspects of managing data well. One of the simplest things that you can do is to be more consistent with file naming, version control, and folder structures. This section has a lot to do with organizing and naming your research materials so that you can find them later and so they will open in any environment.
  15. We’ve talked about data management at kind of a high level. What is data? Why should you manage it well? Now we are going to talk about some of the nuts and bolts of data management, starting with file naming. How do you currently name files? Do you have a system? To some extent we are all guilty of bad file naming, but when it comes to your research it is important to create a system that makes sense not just to you, but to other people as well.
  16. Here are some examples of bad file names because they aren’t descriptive and don’t help us find the file later, and also because there is a possibility that these files will be overwritten the next time you name a file the same thing.
  17. File names should reflect the contents of a file and include enough information to uniquely identify the data file without getting way too long. Don’t be generic in your file names. Be consistent!!!! Your file name may include project acronym, location, investigator, date of data collection, data type, and version number. Whatever will help you or someone else uniquely identify that file in the future. Think about what can be added and what can be omitted in your file names. If you are the only person on a project, you probably don’t need your name. If there are going to be multiple versions of a file, make sure you add a version number or a date to differentiate.
  18. Here are some file naming best practices that will make sure your file will open in any environment. Special characters can have special meaning in certain programming languages and operating systems and can be misinterpreted in file names. Uppercase lettering can affect numbering. Ex: $ = beginning of a variable name in PHP. A backslash designates file path locations in the Windows operating system. Spaces make things easier for humans to read, but some browsers and software don’t know how to interpret spaces. Sometimes a program only reads a file name up to the space, which can cause problems.
  19. There are also best practices around version control and numbering. Version control is often achieved by using dates or a standard numbering system
  20. #1 is the best one. Descriptive. Not too long, not too short.
  21. #2 is the best choice here. The first example has spaces, irregular dates that won’t line up in order, and special characters. The third example may not be descriptive enough for a secondary user. Also, beware of using “FINAL” as opposed to a standardized numbering system.
  22. That is how to name an individual file. What about your whole file structure? All your research materials need to be in one folder. The top level folder should include the project title and year. If it is a multi-year project, include the first and last year in the title. The substructures should have a clear and consistent naming convention that is documented in a README file.
  23. Exercise!! Possible solutions: Organize by type of file (all transcripts in one folder, all audio recordings in another). Organize by person (have a Cliff Barrett folder and a Robert Bennett folder). Problems with file names: Dates are not standardized. Special characters/spaces. File type in the file name, which is unnecessary. Unnecessary information in the file name – “found on Internet, think okay, better than mine” picture. NO consistency to file naming.
  24. Next we are going to talk about data description. A third characteristic of data is that it often needs context in order to be understandable. If you have a spreadsheet of survey responses, you need to have the survey to understand the responses. You also need the codebook that explains your variable names, the values that you used, and how you cleaned your data. Once again, try to think how a secondary user would interpret your data. When we say metadata we are really talking about two things: human-readable documentation and machine-readable metadata. The importance of documenting your data throughout your research project cannot be overestimated. Document your data with a certain level of reuse in mind. Replication? Verification? Inspection?
  25. First and foremost, metadata includes any surrounding documentation you may need to make sense of your data. An Excel spreadsheet of survey responses is fairly useless if you haven’t kept the survey that generated those responses.
  26. If you are working in spreadsheets, there are three levels of documentation. http://datalib.edina.ac.uk/mantra/documentation_metadata_citation/ FRLP – Free reduced lunch program
  27. You must make a codebook and include it in your documentation. This is documenting at a variable level. It’s just as important that you document at a Project and file level as well.
  28. If you want to learn more about codebooks and how to create good ones, once again I highly recommend going to the ICPSR website and looking at their Guide to Codebooks
  29. Metadata is very, very important for other people looking to use your project. Often called data about data. Structured information about an object. Mention that there are standards for creating metadata (Dublin Core), including subject-specific standards.
  30. Simple standard, low barrier to entry
  31. Through the course of your research your data needs to be stored securely, backed up, and maintained regularly. Once again this sounds like common sense, but you will be happy you paid some attention to it (e.g. when your laptop crashes or is stolen). I’m going to play a short video clip that has nothing to do with research data, but I think it perfectly captures the way we approach the storage aspect of data management. https://www.youtube.com/watch?v=QyMgNZHtdk8
  32. #1 rule of data storage – never just keep your data on one device. You are one dropped computer, one spilled glass of water, one unscrupulous thief away from losing all of your data. Every single day I go to Mom’s Café and see people leave their computers at their table while they go to the bathroom or grab a cup of coffee. LOCKSS - There should never just be one copy of your data. Do you back up your data? It is the most important data management task. Keep no fewer than two, preferably three, copies of research data. How well are you covered against unexpected loss? Make sure that when disaster strikes, it isn’t a disaster.
  33. There are three options for data storage. Personal computers and laptops – Convenient for storing your data while in use. Should not be used for storing master copies of your data. Networked drives – Highly recommended. You can share data. Your data is stored in a single place and backed up regularly. Available to you from any place at any time. If you use a department drive or Box, your data is stored securely, minimizing the risk of loss, theft, or unauthorized access. BEST!!! External storage devices – thumb drives, flash drives, external hard drives. Cheap, easy to store and pass around. You may feel better knowing the data is in your hands where you can see it. Not recommended for the long-term storage of your data.
  34. 50 GB of free storage and an additional 50 GB if you are on a sponsored project. Free! Secure! When you leave you can take a copy with you or create a new account
  35. This is an example of social science research where the data are interview recordings and transcripts.
  36. Another area of data management that you will have to consider is data archiving. Archiving is not the same thing as storage. Archiving adds additional value to your data: long-term preservation; metadata; sharable, usually through a persistent identifier; makes data citable.
  37. There are lots of archiving options for your data. Some people choose to put their data on their website which is an option, but not a best practice.