IUPUI University Library Center for Digital Scholarship
Data Management Lab: Spring 2014
Data Entry Best Practices
Data Entry
1. Dataset creation and integrity
a. Separate the coding and data entry tasks as much as possible
b. Coding should be performed so that distractions to coding tasks are minimized
c. Arrange for particularly complex tasks to be carried out by people specially trained for
the task
d. Use a data-entry program that is designed to catch typing errors (i.e., one that's
pre-programmed to detect out-of-range values)
e. Perform double entry of data
f. Carefully check the first 5-10 percent of the data records created, then choose random
records for quality-control checks throughout the process
g. Let the computer do complex coding and recoding, if possible
2. Things to check
a. Wild codes and out-of-range values
b. Consistency checks - comparisons across variables
c. Record matches and counts - relevant in longitudinal studies where subjects may have
more than one record and varying numbers of records
3. Variable names
a. A prefix-root-suffix system is a more systematic approach (compared to one-up numbers,
question numbers, and mnemonic names)
4. Variable labels
a. Should provide three pieces of information
i. The item or question number in the original data collection instrument
ii. A clear indication of the variable's content
iii. An indication of whether the variable is constructed from other items
5. Variable groups
a. Groups are recommended if a dataset contains a large number of variables
b. Can effectively organize a dataset and enable secondary analysts to get an overview of a
dataset quickly
6. Over the long term, store data in a consistent format
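
The checks in items 1 and 2 above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method prescribed by the handout or its sources: the codebook, the variable names (age, sex, employed, hours_worked), the valid ranges, and the (subject_id, wave) record key are all hypothetical and would come from a project's own codebook.

```python
from collections import Counter

# Hypothetical codebook: the allowed values for each variable (assumed, not from the handout).
CODEBOOK = {
    "age": range(18, 100),           # valid ages 18-99
    "sex": {1, 2, 9},                # 9 = missing
    "employed": {0, 1},
    "hours_worked": range(0, 169),   # hours in one week
}


def wild_codes(record):
    """Return the names of variables whose value is not allowed by the codebook."""
    return [name for name, allowed in CODEBOOK.items()
            if record.get(name) not in allowed]


def inconsistencies(record):
    """Cross-variable check: a person coded as not employed should report 0 hours."""
    problems = []
    hours = record.get("hours_worked") or 0
    if record.get("employed") == 0 and hours > 0:
        problems.append("hours_worked > 0 but employed == 0")
    return problems


def double_entry_diff(first_pass, second_pass):
    """Compare two independent keyings of the same records.

    Each pass maps a (subject_id, wave) key to a dict of variable values.
    Returns (key, variable, first_value, second_value) for every mismatch;
    a record absent from one pass shows up as None values for that pass.
    """
    diffs = []
    for key in sorted(first_pass.keys() | second_pass.keys()):
        a = first_pass.get(key, {})
        b = second_pass.get(key, {})
        for name in sorted(a.keys() | b.keys()):
            if a.get(name) != b.get(name):
                diffs.append((key, name, a.get(name), b.get(name)))
    return diffs


def records_per_subject(keys):
    """Count records per subject, for longitudinal files with several records each."""
    return Counter(subject_id for subject_id, _wave in keys)


if __name__ == "__main__":
    record = {"age": 17, "sex": 3, "employed": 0, "hours_worked": 40}
    print(wild_codes(record))
    print(inconsistencies(record))

    pass_1 = {(1, 1): {"age": 34}, (1, 2): {"age": 35}}
    pass_2 = {(1, 1): {"age": 43}, (1, 2): {"age": 35}}
    print(double_entry_diff(pass_1, pass_2))
    print(records_per_subject(pass_1.keys()))
```

Keeping the allowed values in one codebook structure means the same definitions can drive both the entry-time validation and the later wild-code check, so the two cannot drift apart.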
References
1. ICPSR. (2012). Guide to Social Science Data Preparation and Archiving, University of Michigan,
Ann Arbor, MI. From http://www.icpsr.umich.edu/files/deposit/dataprep.pdf.
2. Scott, T. (2012). Guidelines for data collection and entry.
From http://www.mc.vanderbilt.edu/gcrc/workshop_files/2012-09-07.pdf
3. DataONE Education Module: Data Entry and Manipulation. DataONE.
From http://www.dataone.org/sites/all/documents/L04_DataEntryManipulation.pptx
Heather Coates, 2013
