University of Sindh
FACULTY OF EDUCATION
What is Reliability and its Types?
Name: Miss Kanwal Naz    Roll No: 72    Class: B.Ed Evening (1.5)    Course: EA & E
Submitted to: Respected Dr. Amjad Ali Arain
Reliability is the degree to which an assessment tool produces stable and consistent results.
For example, test-retest reliability is a measure of reliability obtained by administering the
same test twice, over a period of time, to the same group of individuals.
Types of Reliability
There are four general classes of reliability estimates, each of which estimates reliability in a
different way. They are:
1. Inter-Rater or Inter-Observer Reliability
Used to assess the degree to which different raters/observers give consistent estimates
of the same phenomenon.
Whenever you use humans as a part of your measurement procedure, you have to
worry about whether the results you get are reliable or consistent. People are
notorious for their inconsistency. We are easily distractible. We get tired of doing
repetitive tasks. We daydream. We misinterpret. So how do we determine whether
two observers are being consistent in their observations? You probably should
establish inter-rater reliability outside of the context of the measurement in your
study. After all, if you use data from your study to establish reliability, and you find
that reliability is low, you're kind of stuck. Probably it's best to do this as a side study
or pilot study. And, if your study goes on for a long time, you may want to reestablish
inter-rater reliability from time to time to assure that your raters aren't changing.
There are two major ways to actually estimate inter-rater reliability. If your
measurement consists of categories -- the raters are checking off which category each
observation falls in -- you can calculate the percent of agreement between the raters.
For instance, let's say you had 100 observations that were being rated by two raters.
For each observation, the rater could check one of three categories. Imagine that on 86
of the 100 observations the raters checked the same category. In this case, the percent
of agreement would be 86%. OK, it's a crude measure, but it does give an idea of how
much agreement exists, and it works no matter how many categories are used for each
observation. The other major way to estimate inter-rater reliability is appropriate
when the measure is a continuous one. There, all you need to do is calculate the
correlation between the ratings of the two observers. For instance, they might be
rating the overall level of activity in a classroom on a 1-to-7 scale. You could have
them give their rating at regular time intervals (e.g., every 30 seconds). The
correlation between these ratings would give you an estimate of the reliability or
consistency between the raters. You might think of this type of reliability as
"calibrating" the observers. There are other things you could do to encourage
reliability between observers, even if you don't estimate it. For instance, I used to
work in a psychiatric unit where every morning a nurse had to do a ten-item rating of
each patient on the unit. Of course, we couldn't count on the same nurse being present
every day, so we had to find a way to assure that any of the nurses would give
comparable ratings. The way we did it was to hold weekly "calibration" meetings
where we would review all of the nurses' ratings for several patients and discuss why
they chose the specific values they did. If there were disagreements, the nurses would
discuss them and attempt to come up with rules for deciding when they would give a
"3" or a "4" for a rating on a specific item. Although this was not an estimate of
reliability, it probably went a long way toward improving the reliability between
raters.
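As a rough illustration (not part of the original notes), the Python sketch below shows the two estimation approaches described above: percent agreement for categorical ratings and the correlation between two observers for continuous ratings. All of the data and variable names are invented.

```python
# Illustrative sketch: estimating inter-rater reliability two ways.
import numpy as np

# Categorical case: two raters assign each of 100 observations to category 1, 2 or 3.
rater_a = np.random.randint(1, 4, size=100)
agree_mask = np.random.rand(100) < 0.86          # raters mostly agree, by construction
rater_b = np.where(agree_mask, rater_a, np.random.randint(1, 4, size=100))

percent_agreement = np.mean(rater_a == rater_b) * 100
print(f"Percent agreement: {percent_agreement:.0f}%")

# Continuous case: two observers rate classroom activity on a 1-to-7 scale
# at regular intervals; the correlation between their ratings estimates reliability.
obs_1 = np.array([4, 5, 3, 6, 5, 4, 2, 5, 6, 4], dtype=float)
obs_2 = np.array([4, 6, 3, 5, 5, 4, 3, 5, 6, 5], dtype=float)
inter_rater_r = np.corrcoef(obs_1, obs_2)[0, 1]
print(f"Inter-rater correlation: {inter_rater_r:.2f}")
```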
2. Test-Retest Reliability
Used to assess the consistency of a measure from one time to another.
We estimate test-retest reliability when we administer the same test to the
same sample on two different occasions. This approach assumes that there is no
substantial change in the construct being measured between the two occasions. The
amount of time allowed between measures is critical. We know that if we measure the
same thing twice, the correlation between the two observations will depend in part
on how much time elapses between the two measurement occasions. The shorter the
time gap, the higher the correlation; the longer the time gap, the lower the correlation.
This is because the two observations are related over time -- the closer in time we get
the more similar the factors that contribute to error. Since this correlation is the test-
retest estimate of reliability, you can obtain considerably different estimates
depending on the interval.
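The calculation itself is simply a correlation between the two administrations. Here is a minimal Python sketch with invented scores; the names and numbers are illustrative only.

```python
# Test-retest reliability: correlate the same test given to the same sample twice.
import numpy as np

scores_time1 = np.array([72, 85, 90, 65, 78, 88, 70, 95], dtype=float)
scores_time2 = np.array([75, 83, 92, 68, 74, 90, 72, 93], dtype=float)

test_retest_r = np.corrcoef(scores_time1, scores_time2)[0, 1]
print(f"Test-retest reliability estimate: {test_retest_r:.2f}")
```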
3. Parallel-Forms Reliability
Used to assess the consistency of the results of two tests constructed in the same way
from the same content domain.
In parallel forms reliability you first have to create two parallel forms. One way to
accomplish this is to create a large set of questions that address the same construct
and then randomly divide the questions into two sets. You administer both
instruments to the same sample of people. The correlation between the two parallel
forms is the estimate of reliability. One major problem with this approach is that you
have to be able to generate lots of items that reflect the same construct. This is often
no easy feat. Furthermore, this approach makes the assumption that the randomly
divided halves are parallel or equivalent. Even by chance this will sometimes not be
the case. The parallel forms approach is very similar to the split-half reliability
described below. The major difference is that parallel forms are constructed so that
the two forms can be used independent of each other and considered equivalent
measures. For instance, we might be concerned about a testing threat to internal
validity. If we use Form A for the pretest and Form B for the posttest, we minimize
that problem. It would even be better if we randomly assign individuals to receive
Form A or B on the pretest and then switch them on the posttest. With split-half
reliability we have an instrument that we wish to use as a single measurement
instrument and only develop randomly split halves for purposes of estimating
reliability.
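As a hedged sketch of the procedure described above (randomly dividing an item pool into two forms and giving both to the same sample), the Python example below simulates a response matrix and correlates the two form totals. The data and parameters are invented, and in practice the two forms would be assembled and administered as separate instruments.

```python
# Parallel-forms idea: split a pool of items into Form A and Form B at random,
# total each form for every person, and correlate the two totals.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_items = 50, 12
ability = rng.normal(size=(n_people, 1))                       # one underlying construct
responses = ability + rng.normal(scale=0.7, size=(n_people, n_items))  # people x items

item_order = rng.permutation(n_items)
form_a = responses[:, item_order[: n_items // 2]].sum(axis=1)
form_b = responses[:, item_order[n_items // 2 :]].sum(axis=1)

parallel_forms_r = np.corrcoef(form_a, form_b)[0, 1]
print(f"Parallel-forms reliability estimate: {parallel_forms_r:.2f}")
```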
4. Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
In internal consistency reliability estimation we use our single measurement
instrument administered to a group of people on one occasion to estimate reliability.
In effect we judge the reliability of the instrument by estimating how well the items
that reflect the same construct yield similar results. We are looking at how consistent
the results are for different items for the same construct within the measure. There are
a wide variety of internal consistency measures that can be used.
Average Inter-item Correlation
The average inter-item correlation uses all of the items on our instrument that are designed to
measure the same construct. We first compute the correlation between each pair of items, as
illustrated in the figure. For example, if we have six items we will have 15 different item
pairings (i.e., 15 correlations). The average inter-item correlation is simply the average or
mean of all these correlations. In the example, we find an average inter-item correlation of
.90 with the individual correlations ranging from .84 to .95.
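A minimal Python sketch of this computation is shown below, using simulated responses for six items; with six items there are 6 x 5 / 2 = 15 pairings, as stated above. The data are invented, so the resulting average will not match the .90 in the example.

```python
# Average inter-item correlation for a six-item instrument (simulated data).
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
ability = rng.normal(size=(100, 1))                          # 100 respondents
responses = ability + rng.normal(scale=0.5, size=(100, 6))   # 100 people x 6 items

corr_matrix = np.corrcoef(responses, rowvar=False)           # 6 x 6 item-correlation matrix
pair_r = [corr_matrix[i, j] for i, j in combinations(range(6), 2)]

print(f"Item pairings: {len(pair_r)}")                       # 15
print(f"Average inter-item correlation: {np.mean(pair_r):.2f}")
```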
Average Item-total Correlation
This approach also uses the inter-item correlations. In addition, we compute a total score for
the six items and use that as a seventh variable in the analysis. The figure shows the six item-
to-total correlations at the bottom of the correlation matrix. They range from .82 to .88 in this
sample analysis, with the average of these at .85.
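The sketch below extends the same idea in Python: the six-item total is added as a seventh variable and each item is correlated with it. The data are simulated, and this simple version does not drop the item from its own total.

```python
# Average item-total correlation (simulated six-item data).
import numpy as np

rng = np.random.default_rng(2)
responses = rng.normal(size=(100, 1)) + rng.normal(scale=0.5, size=(100, 6))
total = responses.sum(axis=1)                                # seventh variable: total score

item_total_r = [np.corrcoef(responses[:, j], total)[0, 1] for j in range(6)]
print("Item-total correlations:", np.round(item_total_r, 2))
print(f"Average item-total correlation: {np.mean(item_total_r):.2f}")
```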
Split-Half Reliability
In split-half reliability we randomly divide all items that purport to measure the same
construct into two sets. We administer the entire instrument to a sample of people and
calculate the total score for each randomly divided half. The split-half reliability estimate, as
shown in the figure, is simply the correlation between these two total scores. In the example it
is .87.
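In Python this amounts to one random split of the item columns and a single correlation; the sketch below uses simulated responses and is only an illustration of the procedure described above.

```python
# Split-half reliability: random halves of the items, correlate the half-totals.
import numpy as np

rng = np.random.default_rng(3)
responses = rng.normal(size=(100, 1)) + rng.normal(scale=0.5, size=(100, 6))

order = rng.permutation(6)
half_1 = responses[:, order[:3]].sum(axis=1)
half_2 = responses[:, order[3:]].sum(axis=1)

split_half_r = np.corrcoef(half_1, half_2)[0, 1]
print(f"Split-half reliability estimate: {split_half_r:.2f}")
```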
Cronbach's Alpha (α)
Imagine that we compute one split-half reliability and then randomly divide the items into
another set of split halves and recompute, and keep doing this until we have computed all
possible split half estimates of reliability. Cronbach's Alpha is mathematically equivalent to
the average of all possible split-half estimates, although that's not how we compute it. Notice
that when I say we compute all possible split-half estimates, I don't mean that each time we
go and measure a new sample! That would take forever. Instead, we calculate all split-half
estimates from the same sample. Because we measured all of our sample on each of the six
items, all we have to do is have the computer analysis do the random subsets of items and
compute the resulting correlations. The figure shows several of the split-half estimates for our
six-item example and lists them as SH with a subscript. Just keep in mind that although
Cronbach's Alpha is equivalent to the average of all possible split-half correlations, we would
never actually calculate it that way. Some clever mathematician (Cronbach, I presume!)
figured out a way to get the mathematical equivalent a lot more quickly.
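In practice, alpha is computed directly from the item variances and the variance of the total score, using alpha = k/(k-1) * (1 - sum of item variances / variance of total), where k is the number of items. The Python sketch below applies that standard formula to simulated six-item data; the data and function name are invented for illustration.

```python
# Cronbach's alpha from the standard variance formula (simulated data).
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: 2-D array, rows = respondents, columns = items."""
    k = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1)
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

rng = np.random.default_rng(4)
responses = rng.normal(size=(100, 1)) + rng.normal(scale=0.5, size=(100, 6))
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```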
