Topic 2: Measurement terms
McNamara (2000), Bachman (1990)
Hoa Nguyen
Test Validity
• Measurement investigates the quality of the assessment process
by looking at scores.
• Basic measurement terms derived from the data matrix (score data)
are:
1. Validity: meaningfulness and fairness of the conclusions reached
about individual candidates
2. Quality control of raters
• Rater A
• Rater B
– Correlation coefficient (r): the extent to which one set of scores is
predictable from another (reported here from 0 to 1; 0 = no correspondence,
1 = perfect correspondence; in general r can range from -1 to 1)
– Reliability coefficient (inter-rater reliability): degree of inter-rater agreement
(benchmarks = 0.7 to 0.9)
– Single classification vs. more than two classification categories
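As a rough illustration of the correlation coefficient between Rater A and Rater B, the sketch below computes Pearson's r for two sets of ratings and compares it with the lower benchmark of 0.7. The ratings, the function name `pearson_r` and the choice of Pearson's formula are assumptions made for this example; the slides do not say which coefficient is intended.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined when one rater gives every candidate the same score")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of the same six candidates (illustrative data only).
rater_a = [3, 4, 2, 5, 4, 1]
rater_b = [3, 5, 2, 4, 4, 2]

r = pearson_r(rater_a, rater_b)
meets_benchmark = r >= 0.7  # lower end of the 0.7 to 0.9 benchmark range
print(f"r = {r:.2f}, meets benchmark: {meets_benchmark}")
```

A value near 1 would suggest the two raters rank candidates in much the same way; it does not show that they award identical scores.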
Test reliability
3. Properties of individual items
– Item analysis (analysis of score patterns on each of the test
items)
• item facility (item difficulty): proportion of test takers who got the right
answer to a given item (acceptable = 0.33 to 0.67, ideal = 0.5)
• item discrimination: how consistently candidates perform
across items, which underpins test reliability
– Test reliability: overall capacity of a multi-item test (such as
a comprehension test or a test of grammar or vocabulary) to define
levels of knowledge or ability among candidates consistently.
E.g.: a reliability coefficient of 0.9 means scores on
the test are providing about 80% reliable information (0.9 squared is 0.81) on
candidates’ abilities, with about 20% attributable to
randomness or error.
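The item analysis terms above can be sketched on a small response matrix. The data are hypothetical, and the discrimination measure used here (facility in the top-scoring half minus facility in the bottom-scoring half) is only one common index, chosen as an assumption for this example; the slides do not specify a formula.

```python
def item_facility(responses, item):
    """Proportion of candidates who answered the given item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item):
    """Upper-lower index: facility in the top half minus facility in the bottom half."""
    # Rank candidates by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    half = len(ranked) // 2
    top, bottom = ranked[:half], ranked[-half:]
    return item_facility(top, item) - item_facility(bottom, item)


# Hypothetical data: 6 candidates (rows) x 4 items (columns); 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

n_items = len(responses[0])
facilities = [item_facility(responses, i) for i in range(n_items)]
# Items whose facility falls inside the acceptable 0.33 to 0.67 band.
acceptable_items = [i for i, f in enumerate(facilities) if 0.33 <= f <= 0.67]
print(facilities, acceptable_items)
```

In this made-up data set the first item is too easy (five of six candidates get it right) and the last is too hard, so only the two middle items fall inside the acceptable band.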
Norm-referenced
and criterion-referenced measurement
4. Norm-referenced and criterion-referenced measurement
– Norm-referenced measurement: comparison of scores
between individuals (How good was an individual test taker’s
score compared with the performance of others?).
• E.g. tests involving multiple items, which yield a range of possible
total scores, such as tests of comprehension, grammar or
vocabulary
• Normal distribution = bell-shaped curve
– Criterion-referenced measurement: individual performances
are evaluated against a verbal description of a satisfactory
performance at a given level (Did an individual test taker’s
score meet what was required?)
• E.g. a test of course content.
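The two readings of a score can be contrasted on the same data. In the sketch below, the candidate names, the scores and the cut score of 70 are all hypothetical. The percentile rank stands in for a norm-referenced reading, and a simple numeric cut score stands in for the verbal description of satisfactory performance used in criterion-referenced measurement.

```python
def percentile_rank(score, all_scores):
    """Norm-referenced reading: percentage of the group scoring below this score."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


def meets_criterion(score, cut_score):
    """Criterion-referenced reading: did the score reach the required level?"""
    return score >= cut_score


# Hypothetical candidates and scores (illustrative data only).
scores = {"An": 62, "Binh": 75, "Chi": 48, "Dung": 75, "Em": 90}
CUT_SCORE = 70  # hypothetical stand-in for "what was required"

group = list(scores.values())
report = {
    name: (percentile_rank(score, group), meets_criterion(score, CUT_SCORE))
    for name, score in scores.items()
}
for name, (rank, passed) in report.items():
    print(f"{name}: percentile rank {rank:.0f}, meets criterion: {passed}")
```

The norm-referenced figure changes whenever the comparison group changes, while the criterion-referenced decision depends only on the candidate's own score and the stated requirement.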
Norm-referenced
and criterion-referenced measurement
• Distinguish norm-referenced from criterion-referenced measurement
according to two categories suggested by Bachman (1990):
i. Design, construction and development
ii. Scales and interpretation of scales
a. maximizing distinctions among individual test takers
b. scores being interpreted with reference to the performance of other individuals
on the test
c. representing specified levels of ability or domain of content
d. scores being interpreted as a level of ability or degree of mastery of the content
domain
Norm-referenced
and criterion-referenced measurement
– Differences between norm-referenced measurement and criterion-referenced
measurement
• Design, construction and development
– Norm-referenced: maximizing distinctions among individual test takers
– Criterion-referenced: representing specified levels of ability or domain of content
• Scales and interpretation of scales
– Norm-referenced: scores being interpreted with reference to the performance of
other individuals on the test
– Criterion-referenced: scores being interpreted as a level of ability or degree of
mastery of the content domain
