1. Item Analysis
• A technique for the guidance and
improvement of instruction.
• The items analyzed must be valid
measures of instructional objectives.
• The items must be diagnostic in that the
knowledge of which incorrect options
students select are a clue to the nature of
misunderstanding and are prescriptive of
appropriate actions by the instructor.
2. Usefulness of Data
• Instructors who construct their own test
and examinations may greatly improve the
effectiveness of test items and the validity
of test scores if they select and rewrite
their items on the basis of item
performance data.
3. Difficulty Index
• Used to determine the difficulty level of
test items.
• Asks instructors to calculate the proportion
of students who answered the test item
accurately.
• Looking at alternative answers can also
find out if there are answer choices that
should be replaced.
4. Difficulty Index
Example: The following table illustrates how many students selected
each answer choice for questions 1 and 2 in a multiple choice test.
For question 1 A was not a good distracter, no one selected the answer.
We can compute the difficulty of an item by dividing the number of
students who choose the correct answer (24) by the total number of
students (30). Using this formula, the difficulty of question 1 (p) is equal
to 24/30 or .80.
Question A B C D
#1 0 3 24* 3
#2
*= Correct
answer.
12* 13 3 2
5. Item Difficulty
• A rule-of-thumb is that if the item difficulty is
more than .75 it is an easy item.
• If the difficulty is below .23 it is a difficult item.
• Given these parameters, this item could be
regarded as moderately easy; many students
(80%) got it correct.
Question A B C D
#1 0 3 24* 3
#2
*= Correct
answer.
12* 13 3 2
6. Item Difficulty
• In contrast question 2 is much more difficult (12/30=.40).
• More students selected the incorrect answer (B) in
question 2 than selected the correct answer (A).
• This item should be carefully analyzed to ensure that B
is an appropriate distractor.
Question A B C D
#1 0 3 24* 3
#2
*= Correct
answer.
12* 13 3 2
7. Discrimination Index
• How well an assessment differentiates between high
and low scorers.
• You should be able to expect that the high performing
students would select the correct answer for each
question more often than the low performing students.
• If this is true then the assessment is said to have a
positive discrimination index (between 0 and 1)
indicating that students who received a high total
score chose the correct answer for a specific item
more often than the students who had a lower overall
score.
• However, if you find that more of the low performing
students got a specific item correct then the item has a
negative discrimination index (between -1 and 0).
8. Discrimination Index
Student Total Score (%) Questions
1 2 3
Alan 90 1 0 1
Sam 90 1 0 1
Jill 80 0 0 1
Charles 80 1 0 1
Sonia 70 1 0 1
Roger 60 1 0 0
Clayton 60 1 0 1
Kelly 50 1 1 0
Justin 50 1 1 0
Cathy 40 0 1 0
9. Discrimination Index
• This table displays the result of ten
questions on a test. Note that the students
are arranged with the top overall scorers
at the top of the table.
• The following steps can be followed to
determine the Difficulty Index and the
Discrimination Index.
10. Calculating Indices
• 1. Arrange students with the highest overall
scores at the top then count the number of
students in the upper and lower group who
got each item correct. For Question 1 there
were 4 students in the top half who got it
correct and 4 students in the bottom half.
• 2. Determine the Difficulty Index by dividing
the number who got it correct by the total
number of students. For Question 1 this
would be 8/10 or p=.80.
11. Calculating Indices
• 3. Determine the Discrimination Index by
subtracting the number of students in the
lower group who got the item correct from
the number of students in the upper group
who got the item correct. Then divide by the
number of students in each group (in this
case there were 5 in each group). For
Question 1 that means you would subtract 4
from 4, and divide by 5, which results in a
Discrimination Index of 0.
• These answers are given in the next table.
13. Calculating Indices
• What does this table tell us?
• Question 2 had a difficulty index of .30
meaning it was quite difficult, and a negative
discrimination index of -0.6 meaning low
performing students were more likely to get
this item correct. This question should be
carefully analyzed and probably deleted or
changed.
• Our best overall question is question 3 which
had a moderate difficulty level (.60) and
discriminated extremely well (0.8).
14. Cognitive Level
• Consider what cognitive level the item is
assessing.
• The questions may be based on Bloom’s
taxonomy, grouping those by level of difficulty.
• In this way you can easily note which questions
demand higher level thinking skills and may be
too difficult, or do not discriminate well. Improve
these questions and focus instructional
strategies on higher level skills.
15. Biserial Correlation
• The correlation between a student’s
performance on an item (right or wrong) and
his or her total score on the test.
• Assumes the distribution of test scores is
normal and there is a normal distribution
underlying the right/wrong dichotomy.
• Biserial correlation has the characteristic of
having maximum values greater than unity.
• There is no exact test for the statistical
significance of the biserial correlation
coefficient.
16. Biserial Correlation
• Point biserial correlation is also a
correlation between student performance
on an item (right or wrong) and test score.
• It assumes the test score distribution is
normal and the division on item
performance is a natural dichotomy.
• The possible range of values for the point
biserial correlation is +1 to -1.
17. Guidelines for Item Development
• 1. Items that correlate less than .15 with
total test score should be restructured.
These items usually do not measure the
same skill or ability as does the test on the
whole or that they are confusing or
misleading to students. A test is better
(more reliable) the more homogeneous the
items.
18. Guidelines for Item Development
• 2. Distractors that are not chosen by any students should
be replaced or eliminated. They do not contribute to the
test’s ability to discriminate the good students from the
poor students. Do not be concerned if each distractor is
not chosen by the same number of students. The fact
that a majority of students miss an item does not imply
that the item should be changed, although such items
should be double-checked for accuracy. Be suspicious
about the correctness of any item in which a single
distractor is chosen more often than all other options,
including the answer, and especially if the distractor’s
correlation with the total score is positive.
19. Guidelines for Item Development
• 3. Items that virtually everyone gets right
are useless for discriminating among
students and should be replaced by more
difficult items. This is particularly true if you
adopt the traditional attitude toward letter
grade assignments that letter grades more
or less fit a predetermined distribution.