LANGUAGE TESTING
SECOND BIMESTER

SCHOOL: ENGLISH

NAMES: MARIA ARIAS CORDOVA
DAVID JAMES AXELSON

DATE: APRIL – AUGUST 2009

WHAT IS TESTING?
Testing is a matter of using data
to establish evidence of learning.
Evidence, however, does not occur
concretely in the natural state;
it is an abstract inference. It is
a matter of judgment.

THE PURPOSE OF
VALIDATION
The purpose of validation in
Language Testing is to ensure the
defensibility and fairness of
interpretations based on test
performance.

THE PURPOSE OF
VALIDATION
In a court case, the scrutiny of such procedures
involves both reasoning and examination
of the facts.
The reasoning may involve legal
argumentation, and appeals to the
common sense, insight, and human
understanding of the jury members, as
well as careful examination of the
evidence.

THE PURPOSE OF
VALIDATION
Test validation similarly involves
thinking about the logic of the test,
particularly its design and its intentions,
and also involves looking at empirical
evidence (the hard facts) emerging from
data from test trials or operational
administrations.

QUALITIES OF A GOOD
TEST
A good test has the following
qualities:
- It is valid.
- It is reliable.
- It is practical.
- It has positive effects on the
teaching program.

PRACTICALITY
A good test is practical when it
stays within financial limitations
and time constraints, and when it is
easy to administer, score, and
interpret.

PRACTICALITY
- A test that is prohibitively
expensive is not practical.
- A test of language proficiency that
takes a student ten hours to
complete is impractical.
- A test that takes a student only a few
minutes to complete is impractical if it
then takes an examiner several hours to
evaluate.

PRACTICALITY
- A test that takes several hours for
an examiner to evaluate is
impractical.
- A test that requires individual
one-to-one proctoring is impractical.

PRACTICALITY
The extent to which a test is practical
sometimes hinges on whether it is
designed to be norm-referenced or
criterion-referenced. In norm-referenced
tests, each test-taker’s score
is interpreted in relation to a mean,
median, standard deviation, and/or
percentile rank. The purpose of such
tests is to place test-takers along a
mathematical continuum in rank order.
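
As a minimal sketch of norm-referenced interpretation, the Python snippet below places each score relative to the group by means of the mean, the standard deviation, and a percentile rank. The scores and the helper name `percentile_rank` are hypothetical, and the percentile rank is taken here as the percentage of scores strictly below a given score, which is only one of several conventions in use.

```python
from statistics import mean, median, pstdev

# Hypothetical raw scores for a group of test-takers (not real data).
scores = [12, 14, 14, 17, 18, 20, 20, 20]

def percentile_rank(score, group):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in group if s < score)
    return 100 * below / len(group)

group_mean = mean(scores)
group_median = median(scores)
group_sd = pstdev(scores)  # population standard deviation of the group

# Report each distinct score relative to the group, highest first.
for s in sorted(set(scores), reverse=True):
    z = (s - group_mean) / group_sd  # distance from the mean in SD units
    print(f"score {s}: z = {z:+.2f}, percentile rank = {percentile_rank(s, scores):.1f}")
```

The point of the sketch is that a raw score such as 17 means little by itself in a norm-referenced test; it is the position relative to the other test-takers that is reported.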

PRACTICALITY
Typical norm-referenced tests are
standardized tests intended to be
administered to large audiences, with
results quickly disseminated to test-takers.
Such tests must have fixed, predetermined
responses in a form that can be
electronically scanned. Practicality is a
primary issue.

The most important quality of any test is
how practical it is to administer.

RELIABILITY
In general terms, reliability is the
ability of a person or system to perform
and maintain its functions in routine
circumstances, as well as in hostile
or unexpected circumstances.

VALIDITY

The most complex criterion
of a good test is validity, the
degree to which the test
actually measures what it is
intended to measure.

FACE VALIDITY
A test has face validity when it
appears valid to the examinees who
take it, the personnel who administer
it, and other untrained observers.

RELIABILITY
In some fields, the reliability of a
test is expressed as a pair of
probabilities, the definition of which
varies by field. In medicine,
sensitivity and specificity are
conventionally used. In other fields,
the probabilities of detection and
false call are conventionally used.
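
To make the medical pair of probabilities concrete, the sketch below computes sensitivity and specificity from four outcome counts. The function name and the counts are hypothetical and serve only as an illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four outcome counts."""
    sensitivity = tp / (tp + fn)  # share of true cases the test detects
    specificity = tn / (tn + fp)  # share of non-cases the test correctly clears
    return sensitivity, specificity

# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=90, fp=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```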

RELIABILITY
If you give the same test to the
same subject or matched
subjects on two different
occasions, the test itself should
yield similar results; it should
have test reliability.
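
One common way to express how similar the results of two administrations are is the correlation between the two sets of scores. The sketch below computes Pearson's correlation by hand; the scores are hypothetical, and treating this coefficient as the reliability estimate is an assumption of the example rather than something the slides prescribe.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same five students on two occasions.
first_sitting = [15, 17, 12, 20, 18]
second_sitting = [14, 18, 12, 19, 17]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests that the test ranks the students in much the same way on both occasions.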

RELIABILITY
Means:

- dependability
- trustworthiness
- precision

THREATS TO TEST
VALIDITY
Why is face validity not enough?
What can threaten the validity of
assessments (scores, ratings), that is, their:
- meaningfulness
- interpretability
- fairness

THREATS TO TEST VALIDITY

Possible problem areas:
- Test content
- Test method and
- Test construct

CONTENT VALIDITY
A test has content validity if it
measures knowledge of the content
domain it was designed to measure.
In other words, content validity
concerns, primarily, how adequately
and representatively the test items
sample the content area to be measured.

CONTENT VALIDITY
For example: a comprehensive math
achievement test would lack content
validity if good scores depended primarily
on knowledge of English, or if it only had
questions about one aspect of math (e.g.,
algebra). Content validity is primarily an
issue for educational tests, certain
industrial tests, and other tests of content
knowledge like the Psychology Licensing
Exam.

TEST METHOD
A test method is a definitive procedure
that produces a test result (ASTM
definition).
The test result can be qualitative
(yes/no), categorical, or quantitative (a
measured value). It can be a personal
observation or the output of a precision
measuring instrument.

TEST CONSTRUCT
Test Construct refers to those aspects of
knowledge or skill possessed by the
candidate which are being measured.

Test Construct involves being clear about
what knowledge of language consists of,
and how that knowledge is deployed in
actual performance.

THREATS TO TEST VALIDITY
Possible problem areas:
Test content: What the test contains.

Test method: The way in which the candidate is
asked to engage with the materials and tasks in
the test, and how these responses will be scored.

Test construct: The underlying ability being
captured by the test.

ESSAY TESTS
Writing compositions or essay tests
seems very easy; much easier, for
example, than writing multiple-choice
questions. All one seems to have to do
is write a topic and leave the student
to compose an answer. The following
prompt is very common:
“HEALTHY FOOD.” Discuss.

ESSAY TESTS
Format:
 Introduction. Introduce your topic
 Background. Give historical or philosophical
background data to orient the reader to the
topic.
 Thesis and arguments. State the main points
including causes and effects, methods used,
dates, places, results.
 Conclusion. Include the significance of each event
and finish up with a summary.

INTRODUCTION
The business practices of the Intel
Corporation, a technology company best
known for the production of microprocessors
for computers, illustrate the importance of
brand marketing. Intel was able to achieve a
more than 1,500 percent increase in sales,
moving from $1.2 billion in sales to more than
$33 billion, in a little more than 10 years. Although
the explosion of the home-computer market
certainly accounted for some of this dramatic
increase, the brilliance of its branding strategy
also played a significant role.

BACKGROUND
Intel became a major producer of microprocessor
chips in 1978, when its 8086 chip was selected by IBM for
use in its line of home computers. The 8086 chip and its
successors soon became the industry standard, even as
Intel’s competitors sought to break into this potentially
lucrative market. Intel’s main problem in facing its
competitors was its lack of trademark protection for its
series of microchips. Competitors were able to exploit
this lack by introducing clone products with similar
sounding names, severely inhibiting Intel’s ability to
create a brand identity.

THESIS AND ARGUMENTS
In an effort to save its market share, Intel embarked
on an ambitious branding program in 1991. The
corporation’s decision to invest more than $100
million in this program was greeted with skepticism and
controversy. Many within the company argued that the
money could be better spent researching and
developing new products, while others argued that a
company that operated within such a narrow consumer
niche had little need for such an aggressive branding
campaign. Despite these misgivings, Intel went ahead
with its strategy, which in a short time became a
resounding success.

CONCLUSION
Ironically, the success of Intel’s branding strategy
led to a marketing dilemma for the company. In 1992,
Intel was prepared to unveil its new line of
microprocessors. However, the company faced a difficult
decision: release the new product under the current
brand logo and risk consumer apathy or give the product a
new name and brand and risk undoing all the work put
into the branding strategy. In the end, Intel decided to
move forward with a new brand identity. It was a
testament to the strength of Intel’s earlier branding
efforts that the new product line was seamlessly
integrated into the public consciousness.

TOPICS
 Some people like doing work by hand. Others
prefer using machines. Which do you prefer?
Use specific reasons and examples to support
your answer.
 Some people think that children should begin
their formal education at a very early age and
should spend most of their time on school
studies. Others believe that young children
should spend most of their time playing.
Compare these two views. Which view do you
agree with? Why?

TOPICS
 Some people think that the family is the most
important influence on young adults. Other
people think that friends are the most
important influence on young adults. Which
view do you agree with? Use examples to
support your position.

 Some students prefer to study alone. Others
prefer to study with a group of students. Which
do you prefer? Use specific reasons and
examples to support your answer.

KINDS OF ESSAY TESTS
 ORAL INTERVIEWS

 SUMMARIES

 INFORMATION GAP ACTIVITIES

ORAL INTERVIEWS
J. B. Heaton explains that in real life the
two skills of listening and speaking are
fully integrated in most everyday
situations involving communication.
Consequently, an excellent way of testing
speaking is the oral interview since
listening and speaking can be assessed in
a natural situation.

SUMMARIES
Summaries are used most often to test
reading or listening comprehension and
writing skills. Writing summaries may
closely replicate many real-life activities.

INFORMATION GAP ACTIVITIES
Work out what the differences are

TESTING READING SKILLS
VOCABULARY TESTS often provide a good guide
to reading ability. It is usually necessary for
students to demonstrate not only a knowledge
of the meaning of a particular word but also an
awareness of the other words with which it is
generally used. However, in addition to their
usefulness in proficiency tests, vocabulary tests
are also useful in progress tests as they lend
themselves to follow up work in class.

TRUE / FALSE ITEMS
1. ___ Children learn to recognize and produce
the sounds of the language by listening to its
spoken form.
2. ___ One remarkable thing about first language
acquisition is the low degree of similarity which
we see in the early language of children all
over the world.
3. ___ Many sentences such as “Mummy juice”
and “baby fall down” are known as telegraphic
speech.

MULTIPLE-CHOICE ITEMS

Writing multiple-choice items is not too difficult
after you have had a little practice. For most
purposes three options are enough. Remember
that the distractors should appear correct to any
students who are not sure of the answer. Avoid
writing absurd distractors which everyone can
easily see are wrong. At the same time,
all the distractors should be written
within the student’s range of proficiency and at
the same level as the correct answer.

MULTIPLE-CHOICE ITEM
Example:
According to the author, one cause of mountain
formation is the
a. effect of the climate change on sea level
b. slowing down of volcanic activity
c. force of Earth’s crustal plates hitting each other
d. replacement of sedimentary rock with volcanic
rock
Correct answer: c

MATCHING ITEMS
 Matching items are also very useful for
testing vocabulary in context. It is
necessary to instruct the students to write
the correct word from the story at the side
of each word listed below it.

MATCHING ITEM
Example:
Column A        Column B
3. shy          a. cheerful
4. happy        b. thin
5. sad          c. become scared
6. slim         d. sorrowful

TESTING WRITING
SKILLS

 Jeremy Harmer explains
that like many other
aspects of English
language teaching, the type
of writing we get students
to do will depend on their
age, interests and level.

TESTING WRITING
SKILLS
 GRAMMAR AND STRUCTURE
- Multiple-choice
- Error recognition
- Re-arrangement
- Changing words
- Blank-filling

TESTING WRITING
SKILLS
Controlled Writing
 Transformation
 Broken Sentences
 Notes and Diaries
 Free writing

TESTING WRITING
SKILLS

 GRAMMAR AND STRUCTURE
- Multiple-choice items
Multiple-choice items test an ability to
recognize sentences which are
grammatically correct.

TESTING WRITING
SKILLS

 ERROR RECOGNITION

Students must choose the underlined
word or phrase which is incorrect.

TESTING WRITING
SKILLS
 RE-ARRANGEMENT

Students are required to unscramble
sentences. They must write out each
sentence, putting the words and phrases
in their correct order. This type of item
is useful for testing awareness of the
order of adjectives, the position of
adverbs, inversion and other areas of
grammar.

TESTING WRITING
SKILLS

 CHANGING WORDS
A completely different type of question
requires students to put verbs into their
correct tense or voice. This type of question is
quite easy and straightforward to
construct. However, it is important to
provide an interesting context.

TESTING WRITING
SKILLS
 BLANK-FILLING
Blank-filling items should consist of
paragraphs providing an interesting and
relevant context. It is important to
choose the words to omit very carefully
so that they are all grammatical words
(e.g. to, in, is, the).
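
The idea of omitting only grammatical words can be sketched in a few lines of Python. The word list, the passage, and the function name `make_blank_filling` are all hypothetical; a real item writer would choose the omitted words by hand rather than blank out every match.

```python
import re

# A small, hypothetical set of grammatical (function) words to blank out.
GRAMMATICAL_WORDS = {"to", "in", "is", "the", "a", "an", "of", "on", "at"}

def make_blank_filling(text):
    """Replace each grammatical word with a numbered blank; return (item_text, key)."""
    key = []

    def blank_out(match):
        word = match.group(0)
        if word.lower() not in GRAMMATICAL_WORDS:
            return word          # content words stay in the passage
        key.append(word)         # remember the omitted word for the answer key
        return f"({len(key)}) ____"

    item_text = re.sub(r"[A-Za-z']+", blank_out, text)
    return item_text, key

passage = "The bus is late, so Ana walks to school in the rain."
item, answers = make_blank_filling(passage)
print(item)
print(answers)
```

Because the answer key is built while the blanks are created, the numbering of the blanks and the key cannot drift apart.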

CENTRAL TENDENCY
The Central Tendency of a
distribution is an estimate of
the “center” of a
distribution of values.

CENTRAL TENDENCY
 There are three major types of
estimates of Central Tendency:
 - Mean
 - Median
 - Mode

CENTRAL TENDENCY
The Mean or average is probably the most
commonly used method of describing
central tendency.

CENTRAL TENDENCY

 The Mean
 To compute the mean, add up all the
values and divide by the number of values.

CENTRAL TENDENCY
 The Mean
 For example:
 20, 20, 20, 18, 17, 14, 14, 12
 The sum of these 8 values is 135, so the
mean is 135 / 8 = 16.875.

CENTRAL TENDENCY
 The Median
 The median is the score found at the exact
middle of the set of values. One way to compute
the median is to list all scores in
numerical order, and then locate the
score in the center of the sample.

CENTRAL TENDENCY

 The Median
 For example:
 15, 15, 15, 15, 15, 17, 18, 20
 There are 8 scores, and scores #4 and #5
represent the halfway point. Since both
of these scores are 15, the median is 15.

CENTRAL TENDENCY
 The Median
 If the two middle scores have different
values, you would have to interpolate to
determine the median.
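
The three estimates can be checked with a short Python sketch. The first list assumes that the eight scores in the mean example are 20, 20, 20, 18, 17, 14, 14 and 12 (the value 12 is an assumption chosen so that the scores sum to 135); the third list is hypothetical and shows the case where the two middle scores differ, in which case `statistics.median` takes the point halfway between them as the simplest form of interpolation.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return every value that occurs most often (there can be more than one)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

scores_a = [20, 20, 20, 18, 17, 14, 14, 12]  # eight scores summing to 135 (12 is assumed)
scores_b = [15, 15, 15, 15, 15, 17, 18, 20]  # median example from the slides
scores_c = [12, 14, 15, 16, 18, 20]          # hypothetical: the middle scores differ

print(mean(scores_a))    # 135 divided by 8
print(median(scores_b))  # scores #4 and #5 are both 15
print(median(scores_c))  # halfway between 15 and 16
print(modes(scores_a))   # the most frequent score(s)
```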

DO’S AND DON’TS IN WRITING FOR
READING COMPREHENSION
 General Concerns:
 - The candidate or the student should be able to answer the questions
on the basis of what is in the passage; the questions should not require
outside knowledge.
 - Questions should cover all the important parts of the passage.
Questions should not be asked exclusively about one section of the
passage while other sections are neglected.
 - Overlap among questions should be avoided. With many questions
based on one passage, it is inevitable that more than one question may
relate to a particular portion or aspect of the passage; care should be
taken, however, that such questions explore different perspectives on
the material.

DO’S AND DON’TS IN WRITING
FOR READING COMPREHENSION
 The stem:
 - The stem should formulate the question or
problem as simply and directly as possible.
Avoid irrelevant verbiage.
 - The stem should be as directed as possible;
that is, it should have a focus and should
clearly identify the problem. The candidate
should not have to read all of the options to
see what the question is asking.

 - Capitalize words such as NOT, LEAST,
EXCEPT, etc., when they are used in the
stem to call for a negative or unexpected
response.
 - If a word or phrase is used at the
beginning of each option, move that word or
phrase to the stem to avoid unnecessary
repetition.

 - Refer to the passage as “the passage,” not
“selection,” “excerpt,” etc.

 - Use specific line references when questions
refer to specific words, phrases, or
arguments in the passage.

 The Key:
 - There should be one and only one correct or clearly
best answer.
 - The key should not stand out from the other options in any
way, e.g., by length, degree of precision, or language.
Item writers often submit questions with the key so
carefully qualified that it is twice as long as the
distractors; it may help to write the key first so that
the distractors can be tailored to be parallel.

 The options:
 - Do not use “All of the above” or “None of
the above” as options. The need to use “All
of the above” may be an indication that a
Roman numeral format would be
appropriate.
 - All options should be as parallel as possible
in grammatical structure, diction, and
length.
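
Two of these guidelines lend themselves to a quick mechanical check before an item is reviewed by hand. The sketch below is only an illustration: the function name `check_options`, the sample items, and the length threshold of 1.5 are assumptions, not part of any published guideline.

```python
def check_options(key, distractors, max_length_ratio=1.5):
    """Return a list of warnings for one multiple-choice item."""
    warnings = []
    banned = {"all of the above", "none of the above"}
    for option in [key, *distractors]:
        if option.strip().lower().rstrip(".") in banned:
            warnings.append(f'avoid the option "{option}"')
    # Flag a key that is conspicuously longer than every distractor.
    longest_distractor = max(len(d) for d in distractors)
    if len(key) > max_length_ratio * longest_distractor:
        warnings.append("the key is much longer than every distractor")
    return warnings

# Hypothetical item with two problems: a banned option and an over-long key.
flawed = check_options(
    "a gradual, carefully documented rise in regional sea level",
    ["volcanic activity", "erosion", "None of the above"],
)
# Hypothetical item whose options are roughly parallel in length.
clean = check_options(
    "force of crustal plates colliding",
    ["effect of climate change on sea level",
     "slowing down of volcanic activity",
     "replacement of sedimentary rock"],
)
print(flawed)
print(clean)
```

A check like this cannot judge grammatical structure or diction; it only catches the most visible giveaways.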

DO’S AND DON’TS IN WRITING FOR
READING COMPREHENSION
 Unacceptable Sample Options:
The passage implies that an advantage of
adopting the author’s theories is that we
would increase our knowledge of
 atmospheric processes
 national survival
 the formulation of a set of hypotheses
regarding motion in space

 The options:
 - All options must fit the stem, e.g.,
they should not be easily identifiable as
incorrect responses simply because they
make no sense grammatically or
idiomatically.
 - Avoid options that overlap or
subsume each other, or options that give
away the answers to other questions.

 - Avoid using a pair of opposites in the
options if one of the pair is the key. If such a
pair of opposites is used, the item is likely to
operate as a two-choice rather than a four-choice
item, and the probability of guessing the
correct answer is increased.
 - Arrange options in logical order, if one
exists, or according to length (for example,
shortest to longest).
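
The effect of a pair of opposites on guessing can be shown with a small calculation. It assumes, purely for illustration, that a candidate who does not know the answer guesses blindly and with equal probability among the options that look plausible.

```python
from fractions import Fraction

def chance_of_guessing(plausible_options):
    """Probability of picking the key by blind guessing among plausible options."""
    return Fraction(1, plausible_options)

four_choice = chance_of_guessing(4)  # all four options look plausible
two_choice = chance_of_guessing(2)   # only the pair of opposites looks plausible

print(four_choice, two_choice)
```

Under this assumption the chance of a lucky guess doubles, from one in four to one in two.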

Computers and Language
Testing
 Rapid developments in computer technology have
had a major impact on test delivery. Already,
many important national and international
language tests, including TOEFL, are moving to
computer-based testing (CBT). Stimulus texts and
prompts are presented not in examination
booklets but on the screen, with candidates being
required to key in their responses. The advent of
CBT has not necessarily involved any change in
the test content, which may remain quite
conservative in its assumptions, but often simply
represents a change in test method.


Computers and Language
Testing
 The proponents of computer-based testing
can point to a number of advantages. First,
scoring of fixed response items can be done
automatically, and the candidate can be
given a score immediately. Second, the
computer can deliver tests that are tailored
to the particular abilities of the candidate.
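
Automatic scoring of fixed response items amounts to comparing the keyed-in responses with an answer key. The following is a minimal sketch; the item labels, the key, and the responses are hypothetical.

```python
def score_responses(answer_key, responses):
    """Count the items on which the candidate's response matches the key."""
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item, "").strip().lower() == correct
    )

# Hypothetical key and one candidate's keyed-in responses.
answer_key = {"q1": "c", "q2": "a", "q3": "d", "q4": "b"}
responses = {"q1": "C", "q2": "b", "q3": "d"}  # q4 left unanswered

raw_score = score_responses(answer_key, responses)
print(f"{raw_score} / {len(answer_key)}")
```

Because no human judgment is involved, the score can be shown to the candidate as soon as the last response is entered.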


Computers and Language
Testing
 It seems inefficient for all candidates to
take all the questions on a test; clearly some
are so easy for some candidates that they
provide little information on their abilities;
others are too hard to be of use. It makes
sense to use the very limited time available
for testing to focus on those items that are
just within, and just beyond, a candidate’s
threshold of ability.
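
The idea of concentrating on items near the candidate's threshold can be sketched with a very simple up-and-down rule: choose the unused item whose difficulty is closest to the current estimate, then raise the estimate after a correct answer and lower it after an error. This is a deliberately simplified illustration, not the procedure used by any operational test; the item bank, the difficulty scale, and the simulated candidate are all hypothetical.

```python
def next_item(item_bank, ability, used):
    """Pick the unused item whose difficulty is closest to the ability estimate."""
    candidates = [item for item in item_bank if item["id"] not in used]
    return min(candidates, key=lambda item: abs(item["difficulty"] - ability))

def run_adaptive_test(item_bank, answers_correctly, n_items=4, start=0.0, step=1.0):
    """Administer n_items, moving the estimate up after a correct answer, down after an error."""
    ability, used = start, []
    for _ in range(min(n_items, len(item_bank))):
        item = next_item(item_bank, ability, used)
        used.append(item["id"])
        ability += step if answers_correctly(item) else -step
        step /= 2  # smaller adjustments as evidence accumulates
    return ability, used

# Hypothetical item bank with difficulties on an arbitrary scale.
difficulties = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
item_bank = [{"id": n, "difficulty": d} for n, d in enumerate(difficulties)]

TRUE_ABILITY = 1.2  # the simulated candidate answers correctly up to this difficulty

def simulated_candidate(item):
    return item["difficulty"] <= TRUE_ABILITY

estimate, administered = run_adaptive_test(item_bank, simulated_candidate)
print(estimate, administered)
```

Only four of the nine items are administered, and the very easy items are never presented to this candidate.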

Computers and Language
Testing
The use of computers for the delivery of test
materials raises questions of validity. For
example, different levels of familiarity
with computers will affect people’s
performance with them, and interaction
with the computer may be a stressful
experience for some students or
candidates (McNamara, 2000, pp. 79-81).

LEARNING THEORY:
Intrinsic motivation / Teacher extrinsic
motivation
 Structure: focused practice / lots of oral
practice
 Sequence: learn well before moving on to the
next point.
 Reinforcement: correction, review, feedback
(Diagram: input, process, and output)

LEARNING THEORY

LOCAL ERRORS

 ERROR
 CORRECTION
 FEEDBACK
GLOBAL ERRORS

Consulted Bibliography
 McNamara, Tim (2000). Language Testing. Oxford
University Press.
 Heaton, J. B. (1998). Classroom Testing. Longman Keys to
Language Teaching. Longman, London and New York.
 Richards, Jack C. (2005). Communicative Language
Teaching. Cambridge University Press.
 Brown, Douglas (2001). Teaching by Principles. Longman,
United States.
 IBT Tests (2004). McGraw-Hill.
 Freeman, Donald; Richards, Jack C. (2001). Teacher
Learning in Language Teaching.
