Accuracy and errors

CHAPTER 17: ACCURACY AND ERROR
Reporter: SHELAMIE M. SANTILLAN - EDUC 243 student
2nd Sem. S.Y. 2016-
When is a test score inaccurate?
Almost always. All tests and scores are imperfect and are subject to error.
Error – What is it?
• No test measures perfectly, and many tests fail to measure as well as we would like them to.
• Tests make “mistakes”. They are always associated with some degree of error.
• Think about the last test you took.
• Did you obtain exactly the score you thought or knew you deserved?
Examples of errors that lower your obtained score:
• When you couldn’t sleep the night before the test
• When you were sick but took the test anyway
• When the essay test you were taking was so poorly constructed that it was hard to tell what was being tested
• When the test had a 45-minute time limit but you were allowed only 38 minutes
• When you took a test that had multiple defensible answers
Examples of errors (or situations) that raise your obtained score:
• The time you just happened to see the answers on your neighbor’s paper
• The time you got lucky guessing
• The time you had 52 minutes for a 45-minute test
• The time the test was so full of unintentional clues that you were able to answer several questions based on the information given in other questions
Then how does one go about discovering one’s true score?
Unfortunately, we don’t have an answer. The true score and the error score are both theoretical or hypothetical values.
Why bother with the true score or error score?
Because they allow us to illustrate some important points about test score reliability and test score accuracy.
Simply keep in mind:
Obtained Score = True Score + Error Score
Table 17.1 The Relationship among Obtained Scores, Hypothetical True Scores, and Hypothetical Error Scores for a Ninth-Grade Math Test

Student   Obtained Score   True Score*   Error Score*
Donna     91               88            +3
Jack      72               79            -7
Phyllis   68               70            -2
Gary      85               80            +5
Marsha    90               86            +4
Milton    75               78            -3

* Hypothetical values
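The relationship Obtained Score = True Score + Error Score can be checked against the table with a short Python sketch. The variable names are illustrative, and Milton's row is taken from the fuller copy of the table used in the worked steps that follow.

```python
# Rows of Table 17.1: (student, obtained score, hypothetical true score).
# Milton's row comes from the fuller copy of the table shown with Step 1.
scores = [
    ("Donna", 91, 88),
    ("Jack", 72, 79),
    ("Phyllis", 68, 70),
    ("Gary", 85, 80),
    ("Marsha", 90, 86),
    ("Milton", 75, 78),
]

# Error score = obtained score - true score
error_scores = {name: obtained - true for name, obtained, true in scores}

for name, err in error_scores.items():
    print(f"{name}: {err:+d}")
```

A positive error score means the obtained score overstates the true score; a negative one means it understates it.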
The Standard Error of Measurement (abbreviated Sm) is the standard deviation of the error scores of a test.
We will use the error scores from Table 17.1 (3, -7, -2, 5, 4, -3).
Step 1: Determine the mean.
$M = \frac{\sum X}{N} = \frac{0}{6} = 0$

Student   Obtained Score   True Score   Error Score
Donna     91               88           +3
Jack      72               79           -7
Phyllis   68               70           -2
Gary      85               80           +5
Marsha    90               86           +4
Milton    75               78           -3
Step 2: Subtract the mean from each error score to arrive at the deviation scores. Square each deviation score and sum the squared deviations.

X - M = x        x²
+3 - 0 =  3       9
-7 - 0 = -7      49
-2 - 0 = -2       4
+5 - 0 =  5      25
+4 - 0 =  4      16
-3 - 0 = -3       9

$\sum x^2 = 112$
Step 3: Plug the $\sum x^2$ value into the formula and solve for the standard deviation.
Error Score SD $= \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{112}{6}} = \sqrt{18.67} = 4.32$
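The three steps can be sketched in Python as follows. The sketch divides by N rather than N - 1, since that is what reproduces the Sm of 4.32 quoted for this test in the figure captions; the variable names are illustrative.

```python
import math

# Error scores for the six students in Table 17.1
error_scores = [3, -7, -2, 5, 4, -3]

n = len(error_scores)

# Step 1: mean of the error scores (0 / 6 = 0)
mean = sum(error_scores) / n

# Step 2: deviation scores, squared and summed (9 + 49 + 4 + 25 + 16 + 9 = 112)
squared_deviations = [(x - mean) ** 2 for x in error_scores]
sum_of_squares = sum(squared_deviations)

# Step 3: standard deviation of the error scores = sqrt(112 / 6)
error_sd = math.sqrt(sum_of_squares / n)

print(round(error_sd, 2))  # 4.32
```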
Fortunately, a rather simple statistical formula can be used to estimate this standard deviation (Sm) without actually knowing the error scores:
$S_m = SD\sqrt{1 - r}$
where r is the reliability of the test and SD is the test’s standard deviation.
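As a rough illustration, the estimate can be written as a small Python function, assuming the usual formula Sm = SD × sqrt(1 − r). The function name is hypothetical; the sample values (SD = 10, r = .91) are the ones assumed in the band interpretation example later in the deck.

```python
import math


def standard_error_of_measurement(sd: float, r: float) -> float:
    """Estimate Sm from a test's standard deviation and score reliability.

    Assumes the usual formula Sm = SD * sqrt(1 - r).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability r must be between 0 and 1")
    return sd * math.sqrt(1.0 - r)


# SD = 10 and r = .91 give 10 * sqrt(.09), which is 3 up to floating-point rounding
print(round(standard_error_of_measurement(10, 0.91), 2))
```

The formula also shows why reliability matters: as r approaches 1, Sm shrinks toward zero, and as r approaches 0, Sm approaches the full standard deviation of the test.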
USING THE STANDARD ERROR OF MEASUREMENT
In summary, then, we know that error scores:
1. are normally distributed
2. have a mean of zero
3. have a standard deviation called the standard error of measurement
Figure 17.1 The error score distribution (shown with Table 17.1)
This figure tells us that the distribution of error scores is a normal distribution.
Figure 17.2 The error score distribution for the test depicted in Table 17.1 (axis label: error score of the ninth-grade math test)
Fig. 17.3 The error score distribution for the test depicted in Table 17.1, with approximate normal curve percentages.
Let’s use the following number line to represent an individual’s obtained score, which we will simply call X:
Fig. 17.4 The error distribution around an obtained score of 90 for a test with Sm = 4.32
Fig. 17.5 The error distribution around an obtained score of 75 for a test with Sm = 4.32
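The bands these two figures describe can be computed directly. The following is a minimal Python sketch, assuming normally distributed error, so that roughly 68% of cases fall within 1 Sm and roughly 95% within 2 Sm; the function name and the `k` parameter are illustrative.

```python
def score_band(obtained: float, sm: float, k: int = 1) -> tuple[float, float]:
    """Return the (lower, upper) limits of the band obtained +/- k * Sm."""
    return (round(obtained - k * sm, 2), round(obtained + k * sm, 2))


SM = 4.32  # standard error of measurement for the ninth-grade math test

for obtained in (90, 75):
    band_68 = score_band(obtained, SM)       # about 68% band
    band_95 = score_band(obtained, SM, k=2)  # about 95% band
    print(obtained, band_68, band_95)
```

For the obtained score of 90 the 68% band runs from 85.68 to 94.32, and for the obtained score of 75 it runs from 70.68 to 79.32.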
Standard Deviation or Standard Error of Measurement?

Standard Deviation (SD)
• Is the variability of raw scores.
• Tells us how spread out the scores are in a distribution of raw scores.
• Is based on a group of raw scores.

Standard Error of Measurement (Sm)
• Is the variability of error scores.
• Is based on a group of scores that is hypothetical.
Why all the fuss about error?
For two reasons:
1. We want to make you aware of the fallibility of test scores.
2. We want to sensitize you
Classification of sources of error
1. Test Takers.
2. The test itself.
3. Test administration.
4. Test scoring.
Test Takers:
Factors that would likely result in an obtained score lower than a student’s true score:
• Fatigue and illness
• Accidentally seeing another
The test itself:
• Trick questions
• Reading level that is too high
• Ambiguous questions
• Items that are too difficult
Test Administration:
• Physical comfort
• Instructions and explanations
• Test administrator attitudes
Error in Scoring:
• When computer scoring is used, error can occur.
• When tests are hand scored, the likelihood of error increases greatly.
Sources of Error Influencing Various Reliability Coefficients
• Test-Retest
• Alternate Forms
• Internal Consistency
Test-Retest
• Short-interval test-retest coefficients are not likely to be affected greatly by within-student error.
• Any problems that do exist in the test are present at both the first and second administrations, affecting scores the same way each time the test is administered.
Alternate Forms
• Since alternate-forms reliability is determined by administering two different forms or versions of the same test to the same group close together in time, the effects of within-student error are negligible.
• Error within the test, however, has a significant effect on alternate-forms reliability.
• As with the test-retest method, alternate-forms score reliability is not greatly affected by error in administering or scoring the test, as long as similar
Internal Consistency
BAND INTERPRETATION
• Uses the standard error of measurement to more realistically interpret and report groups of test scores.
Formula to compute the reliability of the
difference score is as follows:
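The formula itself is not reproduced in this text. As an assumption rather than a restatement of the slide, a commonly cited form for two scores with equal variances is:

```latex
r_{\text{diff}} = \frac{\dfrac{r_{11} + r_{22}}{2} - r_{12}}{1 - r_{12}}
```

Here $r_{11}$ and $r_{22}$ are the reliabilities of the two tests and $r_{12}$ is the correlation between them. The more highly the two tests correlate, the less reliable the difference between their scores becomes.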
Step 1: List Data
Let’s assume M = 100, SD = 10, and score reliability = .91 for all subtests.
Here are the subtest scores for John:
Step 2: Determine Sm (standard error of measurement)
Since SD and r are the same for each subtest in this example, the standard error of measurement will be the same for each subtest.
Step 2 (continued): Add and Subtract Sm
Add Sm to and subtract Sm from each obtained subtest score to get the upper and lower limits of each band.
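The Sm computation and the add-and-subtract step can be sketched in Python as below, assuming the formula Sm = SD × sqrt(1 − r). John's actual subtest scores are not reproduced in this text, so the subtest names and scores in `john_scores` are hypothetical placeholders.

```python
import math

SD = 10.0   # standard deviation assumed for every subtest
R = 0.91    # score reliability assumed for every subtest

# Sm = SD * sqrt(1 - r) = 10 * sqrt(.09), i.e. 3 points
SM = SD * math.sqrt(1.0 - R)

# Hypothetical subtest scores, for illustration only (not John's real scores)
john_scores = {"Reading": 103, "Writing": 105, "Math": 113}

# Band for each subtest: obtained score minus Sm to obtained score plus Sm
bands = {
    name: (round(score - SM, 1), round(score + SM, 1))
    for name, score in john_scores.items()
}

for name, (low, high) in bands.items():
    print(f"{name}: {low} to {high}")
```

With an Sm of 3, each band is 6 points wide, so a score of 103 is reported as the range 100 to 106.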
Step 3: Graph the Results
Shade in the bands to represent the range of scores that has a 68% chance of capturing John’s true score.
Step 4: Interpret the Bands
• Interpret the profile of bands by visually inspecting the bars to see which bands overlap and which do not.
• Those that overlap probably represent differences that likely occurred by chance.
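The visual inspection can also be done programmatically. This is a minimal sketch with hypothetical bands and a hypothetical helper, `bands_overlap`. It treats bands that merely touch at an endpoint as overlapping, which is a choice made here and not something the slides specify.

```python
from itertools import combinations


def bands_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Return True if two (lower, upper) bands share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical 68% bands (obtained score +/- Sm of 3), for illustration only
bands = {
    "Reading": (100, 106),
    "Writing": (102, 108),
    "Math": (110, 116),
}

for (name1, band1), (name2, band2) in combinations(bands.items(), 2):
    if bands_overlap(band1, band2):
        verdict = "overlap: the difference may be due to chance"
    else:
        verdict = "no overlap: the difference is probably real"
    print(f"{name1} vs {name2}: {verdict}")
```

In this illustration, Reading and Writing overlap, while Math stands apart from both.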
Final Word:
• Technically, there are more accurate statistical procedures for determining real differences between an individual’s test scores than the ones we have been able to present here. These procedures, however, are time-consuming, complex, and overly specific for the typical teacher.
• Within the classroom, band interpretation, properly used, makes for a practical alternative to those more advanced

Editor's Notes

  1. Think about the last test you took. Did you obtain exactly the score you thought or knew you deserved? Was your score higher than you expected? Was it lower than you expected? What about your obtained scores on all the other tests you have taken? Did they truly reflect your skill, knowledge, or ability, or did they sometimes underestimate your knowledge, ability, or skill? Or did they overestimate? If your obtained test scores did not always reflect your true ability, they were associated with some error.
  2. Your obtained scores may have been lower or higher than they should have been. In short, an obtained score has a true component (actual level of ability, knowledge) and an error component (which may act to lower or raise the obtained score).
  3. We never actually know an individual’s true score or error score.
  4. They are important concepts because they allow us to illustrate some important points about test score reliability and test score accuracy.
  5. The standard deviation of the error score distribution, also known as the standard error of measurement, is 4.32. If we could know what the error scores are for each test we administer, we could compute Sm in this manner. But, of course, we never know these error scores. If you are following so far, your next question should be, “But how in the world do you determine the standard deviation of the error scores if you never know the error scores?”
  6. Error scores are assumed to be random. As such, they cancel each other out. That is, obtained scores are inflated by random error to the same extent as they are deflated by error. Another way of saying this is that the mean of the error scores for a test is zero. The distribution of the error scores is also important, since it approximates a normal distribution closely enough for us to use the normal distribution to represent it.
  7. Returning to our example from the ninth-grade math test in Table 17.1, we recall that we obtained an Sm of 4.32 for the data provided.
  8. Figure 17.2 illustrates the distribution of error scores for these data. What does the distribution in Fig. 17.2 tell us? Before you answer, consider this: The distribution of error scores is a normal distribution. This is important since, as you learned in Chapter 13, the normal distribution has characteristics that enable us to make decisions about scores that fall between, above, or below different points in the distribution. We are able to do so because fixed percentages of scores fall between various score values in a normal distribution.
  9. (Fig. 17.3 should refresh your memory.) We listed along the baseline the standard deviation of the error score distribution. This is more commonly called the standard error of measurement (Sm) of the test. Thus we can see that 68% of the error scores for the test will be no more than 4.32 points higher or 4.32 points lower than the true scores. That is, if there were 100 obtained scores on this test, 68 of these scores would not be “off” their true scores by more than 4.32 points. The Sm, then, tells us about the distribution of obtained scores around true scores. By knowing an individual’s true score we can predict what his or her obtained score is likely to be.
  10. The careful reader may be thinking, “That’s not very useful information. We can never know what a person’s true score is, only their obtained score.” This is correct. As test users, we work only with obtained scores. However, we can follow our logic in reverse. If 68% of obtained scores fall within 1 Sm of their true scores, then 68% of true scores must fall within 1 Sm of their obtained scores. Strictly speaking, this reverse logic is somewhat inaccurate, although it would be true 99% of the time (Gulliksen, 1987). Therefore the Sm is often used to determine how test error is likely to have affected individual obtained scores. That is, X plus or minus 4.32 (±4.32) defines the range or band
  11. Why all the fuss? Remember our original point. All test scores are fallible (tending to err); they contain a margin of error. The Sm is a statistic that estimates this margin for us. We are accustomed to reporting a single test score. In education, we have long had a tendency to overinterpret small differences in test scores since we too often consider obtained scores to be completely accurate. Incorporating the Sm in reporting test scores greatly minimizes the likelihood of overinterpretation and forces us to consider how fallible our test scores are. After considering the Sm from a slightly different angle, we will show how to incorporate it to make comparisons among test scores. This procedure is called band interpretation.
  12. You learned to compute and interpret SD in Chapter 13.
  13. In reality, an individual’s obtained score is the best estimate of an individual’s true score. That is, in spite of the foregoing discussion, we usually use the obtained score as our best guess of a student’s true level of ability. Well, why all the fuss about error then?
  14. Generally, error due to within-student factors is beyond our control.
  15. Physical Comfort: Room temperature, humidity, lighting, noise, and seating arrangement are all potential sources of error for the test taker. Instructions and Explanations: Different test administrators provide differing amounts of information to test takers. Some spell words, provide hints, or tell whether it’s better to guess or leave blanks, while others remain fairly distant. Naturally, your score may vary depending on the amount of information you are provided. Test Administrator Attitudes: Administrators differ in the notions they convey about the importance of the test, the extent to which they are emotionally supportive of students, and the way in which they monitor the test. To the extent that these variables affect students differently, test score reliability and accuracy will be impaired.
  16. The computer, a highly reliable machine, is seldom the cause of such errors. But teachers and other test administrators prepare the scoring keys, introducing possibilities for error. And students sometimes fail to use No. 2 pencils or make extraneous marks on answer sheets, introducing another potential source of scoring error. Needless to say, when tests are hand scored, as most classroom tests are, the likelihood of error increases greatly. In fact, because you are human, you can be sure that you will make some scoring errors in grading the tests you give.
  17. With test-retest and alternate-forms reliability, within-student factors affect the method of estimating score reliability, since changes in test performance due to such problems as fatigue, momentary anxiety, illness, or just having an “off” day can be doubled because there are two separate administrations of the test. If the test is sensitive to those problems, it will record different scores from one administration to another, lowering the reliability (or correlation coefficient) between them. Obviously, we would prefer that the test not be affected by those problems. But if it is, we would like to know about it.
  18. List subtests and scores and the M, SD, and reliability (r) for each subtest. For purposes of illustration, let’s assume that the mean is 100, the standard deviation is 10, and the score reliability is .91 for all the subtests.
  19. Since SD and r are the same for each subtest in this example, the standard error of measurement will be the same for each subtest.
  20. To identify the band or interval of scores that has a 68% chance of capturing John’s true score, add Sm to and subtract Sm from each subtest score. If the test could be given to John 100 times (without John learning from taking the test), 68 out of 100 times John’s true score would be within the following bands:
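
The band-interpretation steps in the notes above (list each subtest with its SD and r, compute Sm, then add and subtract Sm) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard formula Sm = SD × √(1 − r), the SD of 10 and r of .91 given in the notes, and subtest names and obtained scores that are hypothetical, since John’s actual scores are not listed here. The function names are also made up for this sketch.

```python
import math
from statistics import NormalDist


def standard_error_of_measurement(sd: float, r: float) -> float:
    """Standard error of measurement: Sm = SD * sqrt(1 - r)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability r must be between 0 and 1")
    return sd * math.sqrt(1.0 - r)


def score_band(obtained: float, sm: float, n_sm: float = 1.0) -> tuple[float, float]:
    """Band of obtained score plus or minus n_sm standard errors."""
    return (obtained - n_sm * sm, obtained + n_sm * sm)


def band_coverage(n_sm: float = 1.0) -> float:
    """Share of normally distributed error scores within +/- n_sm Sm of zero."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(n_sm) - z.cdf(-n_sm)


# Values from the notes: SD = 10 and r = .91 for every subtest.
SM = standard_error_of_measurement(10.0, 0.91)  # about 3.0

# Hypothetical subtests and obtained scores (assumption, not from the source).
HYPOTHETICAL_SCORES = {"Reading": 103, "Math": 95, "Science": 110}

# 68% bands: obtained score minus Sm to obtained score plus Sm.
BANDS = {name: score_band(score, SM) for name, score in HYPOTHETICAL_SCORES.items()}

if __name__ == "__main__":
    print(f"Sm = {SM:.2f}")
    print(f"Share of error scores within 1 Sm: {band_coverage():.1%}")
    for name, (low, high) in BANDS.items():
        print(f"{name}: {low:.1f} to {high:.1f}")
```

With SD = 10 and r = .91, Sm comes out to about 3, so each 68% band is roughly the obtained score plus or minus 3 points. Passing `n_sm=2.0` to `score_band` widens the band to about 95% coverage, at the cost of a less precise interval.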