2. Reliability
• A reliable test provides consistent results no matter how many times the
student takes it
• A measurement tool is reliable when it consistently gives the same answer
• Does the person get a similar test score each time, or a different score?
• The more similar the scores are, the more reliable the test is.
• Would you trust something that describes your personality one way today
but tells you something different the next day?
3. Test performance can be influenced by
• A person’s psychological or physical state at the time of
testing: Motivation, fatigue, anxiety
• Environmental factors: temperature, noise, light, test
administrator.
4. Look at Tables 1a and 1b
• A group of students were supposed to take a test on
Thursday, but they took it on Wednesday
• Look at the scores
• Which test seems to be more reliable?
5. The reliability coefficient
• The reliability of a test is indicated by the reliability
coefficient.
• It is written “r” and expressed as a number ranging from 0 to 1
• r = 0 (no reliability) to r = 1 (perfect reliability)
• Do not select or reject a test based on the reliability
coefficient alone: consider the type of test, the type of
reliability estimate reported, and the context in which the test will
be used
• Good vocabulary, structure and reading tests are usually in the .90
to .99 range
• Auditory comprehension tests are more often in the .80 to .89
range
• Oral production tests may be in the .70 to .79 range
6. Test – retest method
• The same test is given to the same people after a period
of time.
• The reliability of the test can be estimated by the
consistency of the scores between the two tests
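One common way to put a number on that consistency is to correlate the two sets of scores. The sketch below is only an illustration: the Pearson correlation is assumed here as the estimate, and the scores for the five students are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores: the same five students on two sittings of the same test
first_sitting = [68, 46, 19, 89, 43]
second_sitting = [70, 45, 22, 86, 45]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest estimate: r = {r:.2f}")
```

A value close to 1 means the students kept roughly the same rank order and spacing across the two sittings.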
Alternate form reliability
• It indicates how consistent test scores are if a person takes
two or more forms of a test
• Two forms of the test are given to the same people. Both are
designed to measure the same thing
7. Coefficient of internal consistency
• It indicates the extent to which the items on a test are similar to
each other in content
Split half method
• The test is split into two halves through the
careful matching of items
• It measures multiple characteristics
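The split half idea can be sketched in a few lines. Two things here are assumptions rather than part of the slide: an odd/even split stands in for the careful matching of items, and the Spearman-Brown correction, 2r / (1 + r), is applied to step the half-test correlation up to the full test length. The response data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """item_scores: one list per test taker, one 0/1 entry per item.

    Returns (correlation between halves, Spearman-Brown corrected estimate).
    """
    # Assumption: odd/even items form the two halves (a stand-in for matching)
    half_a = [sum(person[0::2]) for person in item_scores]
    half_b = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(half_a, half_b)
    # Spearman-Brown correction for a test twice as long as each half
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical right/wrong responses: five test takers, six items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(responses)
print(f"half-test r = {r_half:.2f}, corrected estimate = {r_full:.2f}")
```

The corrected figure is higher than the raw correlation because each half is only half as long as the real test.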
8. Standard error of measurement
• It gives the margin of error that you should expect in an
individual test score
• An SEM of “2” indicates that the test taker’s “true” score
probably lies within 2 points in either direction of the
score he or she receives on the test.
• If a person receives a 91 on the test, there is a chance
that the person’s true score lies somewhere between 89
and 93
• The smaller the SEM, the more accurate the
measurements
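The arithmetic behind the 91 example can be written out as a short sketch. The band function follows the slide directly; the formula SEM = SD × √(1 − r) is the usual textbook relation and is added here as an assumption, with a hypothetical standard deviation of 10 and reliability of .96.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and the reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)

def score_band(observed, sem):
    """Range in which the true score probably lies: observed score +/- one SEM."""
    return observed - sem, observed + sem

# The example from the slide: a score of 91 with an SEM of 2
print(score_band(91, 2))  # (89, 93)

# Hypothetical test: SD of 10 and r = .96 give an SEM of about 2
sem = standard_error_of_measurement(sd=10, reliability=0.96)
print(round(sem, 2))
```

Raising the reliability in the second call shrinks the SEM, which narrows the band around each score.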
9. Scorer reliability
• Scorer reliability refers to how closely different
people who score the same test agree
• It indicates how consistent test scores are if the test is
scored by two or more people
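A simple way to check this is to count how often two scorers give the same script the same score. The sketch below assumes a plain agreement rate as the measure (other statistics exist), and the two sets of scores on a 1 to 5 scale are hypothetical.

```python
def agreement_rate(scores_a, scores_b, tolerance=0):
    """Share of scripts on which two scorers differ by at most `tolerance` points."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty, equal-length lists of scores")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tolerance)
    return matches / len(scores_a)

# Hypothetical scores given by two scorers to the same six compositions
scorer_1 = [4, 3, 5, 2, 4, 3]
scorer_2 = [4, 3, 4, 2, 5, 3]

exact = agreement_rate(scorer_1, scorer_2)                  # identical scores only
adjacent = agreement_rate(scorer_1, scorer_2, tolerance=1)  # within one point
print(f"exact agreement: {exact:.2f}, within one point: {adjacent:.2f}")
```

Here the scorers give identical scores on four of the six scripts and never differ by more than one point.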
10. How to make a test more reliable
• Take enough samples of behavior
• The more items on a test, the more reliable the test will be
• Items should be independent of each other
• The test should not be too long
• Exclude items which do not discriminate well between
weaker and stronger students
• Items on which weaker and stronger students perform with similar
degrees of success contribute little to the reliability of the test
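Whether an item discriminates can be checked by comparing how the strongest and weakest candidates did on it. The sketch below assumes a simple upper-group minus lower-group index; the one-third group size is an arbitrary illustrative choice, and the scores are hypothetical.

```python
def discrimination_index(item_correct, total_scores, fraction=1 / 3):
    """Proportion correct in the top group minus proportion correct in the bottom group.

    item_correct: 0/1 per candidate for one item
    total_scores: total test score per candidate, in the same order
    """
    n = len(total_scores)
    if n != len(item_correct) or n < 2:
        raise ValueError("need matching lists for at least 2 candidates")
    group_size = max(1, round(n * fraction))
    # Candidate indices ordered from weakest to strongest total score
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    p_upper = sum(item_correct[i] for i in upper) / group_size
    p_lower = sum(item_correct[i] for i in lower) / group_size
    return p_upper - p_lower

# Hypothetical total scores for six candidates and their results on two items
totals = [12, 25, 31, 38, 44, 50]
item_good = [0, 0, 1, 0, 1, 1]  # mostly the stronger candidates get it right
item_flat = [1, 1, 0, 1, 1, 1]  # weak and strong candidates do equally well

print(discrimination_index(item_good, totals))  # 1.0
print(discrimination_index(item_flat, totals))  # 0.0
```

An index near zero, as for the second item, marks an item that is a candidate for exclusion.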
11. How to make a test more reliable
• Do not allow candidates too much freedom
• Write a composition on tourism
• Write a composition on tourism in this country
• Write a composition on how we might develop the tourist industry in
this country
• Discuss the following measures intended to increase the number of
foreign tourists coming to this country
• Better advertising or information (Where? What form should it take?)
• Improved facilities (hotels, transportation, communication)
• Training of personnel (guides, hotel managers, etc.)
12. How to make a test more reliable
• Write unambiguous items
• The fact that an individual candidate might interpret the question in
different ways on different occasions means that an item is not
contributing fully to the reliability of the test
• Provide clear and explicit instructions
• Written and oral instructions
• Spoken instructions should always be read from a prepared text
13. How to make a test more reliable
• Make candidates familiar with format and testing
techniques
• Sample tests
• Provide uniform and non-distracting conditions of
administration
• Timing should be specified
• The acoustic conditions should be similar for all administrations of
a listening test
14. How to make a test reliable
• Use items that permit scoring which is as objective as
possible
• Multiple choice items
• Open-ended items
• Make comparisons between candidates as direct as
possible
• Scoring compositions that are all on one topic will be more reliable
than if the candidates are allowed to choose from six topics
15. How to make a test reliable
• Provide a detailed scoring key
• The key should be as detailed as possible in its assignment of
points
• Train scorers
• They should have learned to score accurately
• Agree on acceptable responses and appropriate scores
at the outset of scoring
16. How to make a test reliable
• Identify candidates by number, not name
• Scorers have expectations of candidates they know, which can bias scoring
• Employ multiple, independent scoring
• Where testing is subjective, all scripts should be scored by at least
two independent scorers
17. Reliability and validity
• A valid test must be reliable, but a reliable test may not be
valid at all.
• A test can be internally consistent (reliable) but still not be
an accurate measure of what you claim to be measuring
(valid)