GOOD DATA PRACTICES FOR RESEARCH

Graduate Office Student Success Series
January 12, 2012
Heather Coates, MLS, MS | Digital Scholarship & Data Management Librarian
CONTEXT: DATA LIFECYCLE




Source: DDI Structural Reform Group. “DDI Version 3.0 Conceptual Model." DDI Alliance. 2004. Accessed on 11 August 2008.
<http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
OVERVIEW

Planning

Describing the data

Handling data files

Storage & backup
PLAN AHEAD: BEYOND THE PROTOCOL

 Plan early, before data collection

 Identify ethical and legal issues

 Define the data model

 Think about a data organization strategy

 Identify the most appropriate tools: instruments & software
ETHICAL & LEGAL ISSUES

 Privacy
   Are there people (human subjects) involved in your project? Animals?
   Does the study involve personal or health information? Can it be used to
    identify an individual?


 Copyright
   Are you using copyrighted data?
   Have you sought permission?


 Intellectual Property
   You should cite any product that you use for your project: data,
    publications, software, etc.
DESCRIBING YOUR DATA

 Describe the research project
 Describe the overall organization of your dataset
 Describe your data files
 Describe the methods used to create your data
     Describe measurement techniques (protocols, instruments)
     Data processing – why, how, assumptions
     Sensor network, taxonomic information, spatial location
 Choose & use standard terminology (concepts, methods, tools)
     Identify and use relevant metadata standards
 Data citation
 Describe the timeframe
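The checklist above could be captured in a plain-text readme kept next to the data. The following is only a minimal sketch: the function name `build_readme`, the field layout, and the sample project details are all hypothetical and are not taken from the presentation.

```python
def build_readme(project, timeframe, files, methods):
    """Return plain-text readme content covering project, timeframe, files, and methods."""
    lines = [
        f"Project: {project}",
        f"Timeframe: {timeframe}",
        "",
        "Files:",
    ]
    # One line per data file: name followed by a short description.
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    lines += ["", "Methods:"]
    # One line per method or processing step, in the order performed.
    for step in methods:
        lines.append(f"  - {step}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
readme_text = build_readme(
    project="Soil moisture pilot study (hypothetical)",
    timeframe="2011-06-01 to 2011-12-15",
    files={
        "soilstudy_moisture_2011-06-01_v01.csv":
            "Daily moisture readings, percent by volume",
    },
    methods=[
        "Readings taken with a handheld probe at 10 cm depth",
        "Values averaged over three readings per site",
    ],
)
print(readme_text)
```

Keeping the readme as plain text means it stays readable without special software and can later be reused when drafting a methods section.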
HANDLING DATA FILES

 Create, manage, and document your data storage system
   Use descriptive file names
   Define
         Formats for date and time
         Units of measurement
         Parameters
         Missing value codes
         Values that are estimated
     Use consistent codes
     Use appropriate field delimiters
     Store data values separately from data annotations or notes
     Store data at the right level of precision
 Quality assurance & data integrity
 Version control & authenticity
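Several of the file-handling points above can be illustrated with a short sketch. It assumes a naming pattern of project, content, ISO 8601 date, and version, and a missing value code of -999; both are hypothetical choices made for this example, not conventions prescribed by the presentation.

```python
import csv
import io
from datetime import date

# Hypothetical missing value code; document whichever code you choose.
MISSING_CODE = "-999"


def make_filename(project, content, collected_on, version, ext="csv"):
    """Build a descriptive file name with an ISO 8601 date and a zero-padded version."""
    return f"{project}_{content}_{collected_on.isoformat()}_v{version:02d}.{ext}"


def read_values(text, column):
    """Read one numeric column, turning the documented missing code into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        None if row[column] == MISSING_CODE else float(row[column])
        for row in reader
    ]


name = make_filename("soilstudy", "moisture", date(2012, 1, 12), 2)

# Annotations live in their own "note" column, separate from the data values.
sample = (
    "date,moisture_pct,note\n"
    "2012-01-10,23.5,\n"
    "2012-01-11,-999,sensor offline\n"
)
values = read_values(sample, "moisture_pct")

print(name)
print(values)
```

Because the missing code is defined once and converted on read, a placeholder such as -999 is less likely to be averaged into results by mistake.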
STORAGE & BACKUP

 Back up your data: regular intervals, 3 copies
   Local
   Semi-local
   Remote
 Document your backup strategy
 Make sure backup locations are secure and accessible
 Use standard file formats
     Non-proprietary, open format
     Commonly used in your community
     Unencrypted*
     Uncompressed*
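One way to check that backup copies are intact is to compare checksums after copying. The sketch below uses temporary folders to stand in for the local, semi-local, and remote locations; the helper names `sha256_of` and `back_up` and the folder names are hypothetical, and a real remote copy would normally go through a storage service rather than a local folder.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify each copy by checksum."""
    expected = sha256_of(source)
    results = {}
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy_path = Path(shutil.copy2(source, dest_dir))
        # True when the copy's checksum matches the original's.
        results[str(copy_path)] = sha256_of(copy_path) == expected
    return results


# Temporary folders stand in for the three backup locations.
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    source = root / "working" / "data.csv"
    source.parent.mkdir()
    source.write_text("id,value\n1,3.2\n")
    results = back_up(
        source, [root / "local", root / "semi_local", root / "remote"]
    )

print(results)
```

Recording the checksum alongside the backup strategy also gives a simple way to confirm later that a file has not changed.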
PROCESSING & ANALYSIS

 Defining your research questions and documenting your data are
  iterative processes
   Inform each other
   Are never done until the project is complete
   Developing good documentation will make analysis easier and more
    efficient
 Having good documentation will make writing your
  paper/thesis/dissertation much easier
   Use your readme or codebook files as source documents for your
    methods sections
 Having good documentation will identify problems sooner, when
  it may be possible to resolve them or minimize the damage to
  your data
RESOURCES

 @IU
     IUWare
     IUanyWARE
     StatMath
     ITTraining
     RFS & SDA

 Open access/public use data sets
   DataCite
   ICPSR
   Data.gov

 Subject liaison librarians can assist in locating data on your topic
THANK YOU

Find us at http://ulib.iupui.edu/digitalscholarship

Heather Coates, MLS, MS
Digital Scholarship & Data Management Librarian
hcoates@iupui.edu
317-278-7125

 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Good data practices for graduate students

  • 1. Graduate Office Student Success Series GOOD DATA PRACTICES FOR RESEARCH January 12, 2012 Heather Coates, MLS, MS | Digital Scholarship & Data Management Librarian
  • 2. CONTEXT: DATA LIFECYCLE
      Source: DDI Structural Reform Group. “DDI Version 3.0 Conceptual Model.” DDI Alliance. 2004. Accessed on 11 August 2008. <http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
  • 4. PLAN AHEAD: BEYOND THE PROTOCOL
      Plan early, before data collection
      - Identify ethical and legal issues
      - Define the data model
      - Think about a data organization strategy
      - Identify the most appropriate tools: instruments & software
  • 5. ETHICAL & LEGAL ISSUES
      - Privacy
          - Are there people (human subjects) involved in your project? Animals?
          - Does the study involve personal or health information? Can it be used to identify an individual?
      - Copyright
          - Are you using copyrighted data?
          - Have you sought permission?
      - Intellectual Property
          - You should cite any product that you use for your project: data, publications, software, etc.
  • 6. DESCRIBING YOUR DATA
      - Describe the research project
      - Describe overall organization of your dataset
      - Describe your data files
      - Describe the methods used to create your data
          - Describe measurement techniques (protocols, instruments)
          - Data processing – why, how, assumptions
          - Sensor network, taxonomic information, spatial location
      - Choose & use standard terminology (concepts, methods, tools)
      - Identify and use relevant metadata standards
      - Data citation
      - Describe the timeframe
  • 7. HANDLING DATA FILES
      - Create, manage, and document your data storage system
      - Use descriptive file names
      - Define
          - Formats for date and time
          - Units of measurement
          - Parameters
          - Missing code values
          - Values that are estimated
      - Use consistent codes
      - Use appropriate field delimiters
      - Store data values separately from data annotations or notes
      - Store data at the right level of precision
      - Quality assurance & data integrity
      - Version control & authenticity
  • 8. STORAGE & BACKUP
      - Back up your data: regular intervals, 3 copies
          - Local
          - Semi-local
          - Remote
      - Document your backup strategy
      - Make sure backup locations are secure and accessible
      - Use standard file formats
          - Non-proprietary, open format
          - Commonly used in your community
          - Unencrypted*
          - Uncompressed*
  • 9. PROCESSING & ANALYSIS
      - Defining your research questions and documenting your data are iterative processes
          - Inform each other
          - Are never done until the project is complete
      - Developing good documentation will make analysis easier and more efficient
      - Having good documentation will make writing your paper/thesis/dissertation much easier
          - Use your readme or codebook files as source documents for your methods sections
      - Having good documentation will identify problems sooner, when it may be possible to resolve them or minimize the damage to your data
  • 10. RESOURCES
      - @IU
          - IUWare
          - IUanyWARE
          - StatMath
          - ITTraining
          - RFS & SDA
      - Open access/public use data sets
          - DataCite
          - ICPSR
          - Data.gov
      - Subject liaison librarians can assist in locating data on your topic
  • 11. THANK YOU Find us at http://ulib.iupui.edu/digitalscholarship Heather Coates, MLS, MS Digital Scholarship & Data Management Librarian hcoates@iupui.edu 317-278-7125
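
Slide 7 recommends descriptive file names and a defined format for dates. The following is a minimal sketch of one way to apply both in Python. The function name `build_file_name`, the underscore-separated pattern (project, contents, date, version), and the choice of ISO 8601 dates are illustrative assumptions, not something prescribed in the slides; adapt them to the conventions of your own research community.

```python
import re
from datetime import date


def build_file_name(project: str, contents: str, collected: date,
                    version: int, ext: str = "csv") -> str:
    """Build a short, descriptive file name (hypothetical naming scheme).

    Pattern assumed here: <project>_<contents>_<YYYY-MM-DD>_v<NN>.<ext>
    """
    def slug(text: str) -> str:
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names stay portable.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # date.isoformat() gives YYYY-MM-DD, which sorts chronologically
    # and does not depend on the file system's own timestamps.
    return (f"{slug(project)}_{slug(contents)}_"
            f"{collected.isoformat()}_v{version:02d}.{ext}")


if __name__ == "__main__":
    print(build_file_name("Sleep Study", "Survey raw", date(2012, 1, 12), 1))
```

Writing the date into the name itself means the file stays identifiable even if it is copied or its system metadata changes. Whatever scheme you choose, record it in your readme so collaborators can decode the names.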

Editor's Notes

  1. Be aware of the research process, so you have some context for your experience. This can also help you organize your thoughts about executing/carrying out your projects.
  2. Goal: help you translate your research protocol into a practical plan to carry out your project/study. Although these things take some extra time at the beginning of your project, they will make analysis and writing much easier because you will be clear about what was done.
  3.
     - data model: map out the relationships between data, especially aggregated or calculated variables; translate research questions into analyses, then map those to the data to be used. This can be particularly important if you are integrating data from multiple sources or have large quantitative datasets.
     - data organization strategy: it should be part of the planning process and answer where, when, and how; will talk more about this in the next slide.
     - software: IUWare, IUanyWare, StatMath, RFS, SDA (links on handout).
     - ethical & legal issues: confidentiality, privacy, HIPAA, intellectual property, and copyright issues may arise; discuss these potential problems with your advisor (links for further information on handout).
  4. Although facts cannot be copyrighted, specific instances of them (such as a database) can be.
  5.
     - research project: one option is to write a structured abstract (see handout).
     - dataset organization: use your plan and update it as things change (more on the next slide).
     - describe your data files: what do you need to know to interpret the data? Parameters, units, definitions of coded values, definitions of missing values.
     - methods:
     - standards: don't deviate from standards in your discipline or research community unless you have a good reason for doing so; these standards reflect a common understanding and help to make data interoperable.
     - citation: if you use someone else's data, you should document and cite it: source, URL/DOI, detailed title of dataset, version information, date retrieved, authors/creators, brief description.
     - timeframe: this needs to be documented clearly, particularly if you're using data from multiple sources or collecting data over a period of time.
  6.
     - data typing: use the appropriate field type for the data, such as a date field for dates; put comments in a separate column.
     - document your folder structure & file naming system:
         - don't rely on the computer's time and date metadata; it's not reliable and can be manipulated.
         - keep file names short but descriptive; use a coding system to include project name, file contents, date, etc.
     - QA & data integrity: minimize the opportunity to introduce human error, automate processing, and check and verify periodically.
     - version control & authenticity: especially important if multiple people are working on the same dataset. Keep copies of your data before and after each major processing step; this saves a lot of work if errors creep in, because you won't have to start over from the raw data. Document how this is done.
  7.
     - backup strategy: a quick and dirty way to verify a backup is to compare file quantity and file size, and to randomly check values in the original and the copies.
     - if you need to share or transfer files, use Slashtmp instead of a flash drive, especially if the files contain human subjects data.
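
The quick backup check described in note 7 (compare file quantity and file size, then spot-check a random sample) can be scripted. The sketch below is one possible approach in Python, not a tool referenced in the presentation. The function names are hypothetical, and comparing SHA-256 checksums of the sampled files stands in for manually comparing values; that substitution is an assumption made here.

```python
import hashlib
import random
from pathlib import Path


def file_sizes(root: Path) -> dict[str, int]:
    """Map each file's path (relative to root) to its size in bytes."""
    return {p.relative_to(root).as_posix(): p.stat().st_size
            for p in root.rglob("*") if p.is_file()}


def sha256(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original: Path, backup: Path,
                  sample_size: int = 5, seed=None) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    orig, copy = file_sizes(original), file_sizes(backup)

    # Check 1: file quantity.
    if len(orig) != len(copy):
        problems.append(f"file count differs: {len(orig)} vs {len(copy)}")
    for name in sorted(orig.keys() - copy.keys()):
        problems.append(f"missing from backup: {name}")

    # Check 2: file size, for every file present in both places.
    shared = sorted(orig.keys() & copy.keys())
    for name in shared:
        if orig[name] != copy[name]:
            problems.append(f"size differs: {name}")

    # Check 3: spot-check the contents of a random sample.
    rng = random.Random(seed)
    for name in rng.sample(shared, min(sample_size, len(shared))):
        if sha256(original / name) != sha256(backup / name):
            problems.append(f"content differs: {name}")
    return problems
```

A call such as `verify_backup(Path("project_data"), Path("/mnt/backup/project_data"))` (both paths are placeholders) returns an empty list when the copy passes all three checks. Because only a sample is checksummed, a clean result lowers the risk of a bad copy but does not prove the backup is identical; raise `sample_size` for a more thorough check.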