Dissertation Proposal

The Development and
Validation of All Four TRAILS
Tests for K-12 Students
Joseph A. Salem, Jr.
Dissertation Proposal Defense
April 29, 2014

Agenda
• Background on the TRAILS Project and
Assessments
• Research Questions for the Proposed
Study
• Methodology of the Proposed Study

TRAILS Project
• Developed at KSU as part of the Institute
for Library and Information Literacy
Education
• Freely available assessments of
information literacy at four grade levels:
– 3rd
– 6th
– 9th
– 12th

Assessments
• Two general assessments at each grade
level, with lengths that vary by grade:
– 3rd grade: 15 items
– 6th grade: 20 items
– 9th grade: 25 items
– 12th grade: 30 items
• Five subscale assessments (10 items)
offered at each grade level

The Problem
• The assessments have never undergone
a thorough validation study
• Current general assessments do not meet
the 0.70 – 0.80 coefficient alpha threshold
for large-scale tests in the social sciences
(Nunnally & Bernstein, 1994)
• Current general assessments vary in their
coefficient alpha and are likely not parallel
(Crocker & Algina, 2008)
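
For reference, coefficient alpha can be computed from a students-by-items score matrix as k/(k-1) times (1 minus the sum of item variances over the variance of total scores). The sketch below is a minimal illustration of that standard formula; the function name and the response matrix are hypothetical and are not TRAILS data.

```python
import numpy as np

def coefficient_alpha(scores):
    """Coefficient (Cronbach's) alpha for a students-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # sample variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical 0/1 responses: 6 students x 4 items (illustration only).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 3))
```

A value computed this way would be compared against the 0.70 – 0.80 threshold cited above.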

Coefficient Alphas for Current
General Assessments
Grade  Alpha: GA 1  Items  n       Alpha: GA 2  Items  n
3      0.638        15     13,527  0.710        15     1,319
6      0.686        20     34,175  0.654        19     5,570
9      0.661        25     24,451  0.712        24     6,688
12     0.665        30     3,769   0.578        30     1,166

Research Questions for the
Proposed Study
1. Can the current TRAILS item bank be
used to create an efficient, reliable, valid,
and fair test of IL for each grade level?
2. What evidence exists for construct
validity?
3. What score will a student need to achieve
in order to demonstrate proficiency in IL
on the TRAILS test at each grade level?

Methodology for the Proposed
Study: Research Question 1
• Administer the entire item bank at each
grade level to a target sample of 560
students in the associated grade in
• Participant schools will be recruited
throughout the U.S. via professional
listservs, the TRAILS Web site, and direct
contact with TRAILS administrators

Methodology for the Proposed
Study: Research Question 1
• The tests will be assembled based on their
item level and scale level psychometric
properties using the Rasch Item
Response Theory Model:
• Item: fit to scale
• Item: correlation to scale
• Item: lack of gender or racial/ethnic bias
• Item: distractor operation
• Scale: reliability
• Scale: difficulty spread

• Sample Size Target: 560
• Desired Sample Size: 500 (allows for 10%
of participants to be eliminated in each
grade level due to being outside of the
associated grade)
• 500 is recommended for robust calibration
estimates (Linacre, 1994) and exceeds the
8:1 participant to item recommendation by
De Ayala (2009)
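
The sample-size arithmetic above can be checked with a short sketch. The function names are hypothetical, and the item count printed at the end is only the largest bank size that 500 participants would support at an 8:1 ratio, not the actual size of the TRAILS item bank.

```python
def usable_sample(target, percent_removed):
    """Participants left after removing a whole-number percentage of the target."""
    return target * (100 - percent_removed) // 100

def max_items_at_ratio(participants, ratio):
    """Largest item count that still meets a participants-per-item ratio."""
    return participants // ratio

remaining = usable_sample(560, 10)   # target sample with 10% removed
print(remaining)                     # participants expected to remain
print(remaining >= 500)              # does the desired sample size survive?
print(max_items_at_ratio(500, 8))    # items supported by 500 participants at 8:1
```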

Grade  Alpha: GA 1  Current n  Estimated n  Alpha: GA 2  Current n  Estimated n
3      0.638        15         34           0.710        15         25
6      0.686        20         36           0.654        19         40
9      0.661        25         51           0.710        24         38
12     0.665        30         60           0.578        30         87
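
The slides do not state how the estimated n values were obtained. One standard way to project the test length needed to reach a target reliability is the Spearman-Brown prophecy formula, sketched below. Two assumptions are made here: that "n" in this table counts items, and that the target alpha is 0.80, the upper end of the threshold cited earlier. Under those assumptions the GA 1 projections come out within about one item of the tabled values, but this is an illustration and not a confirmed account of the author's calculation.

```python
def lengthening_factor(current_alpha, target_alpha):
    """Spearman-Brown factor by which a test must be lengthened."""
    return (target_alpha * (1 - current_alpha)) / (current_alpha * (1 - target_alpha))

def projected_length(current_items, current_alpha, target_alpha=0.80):
    """Projected item count needed to reach target_alpha (0.80 is an assumption)."""
    return current_items * lengthening_factor(current_alpha, target_alpha)

# GA 1 alphas and item counts taken from the table above.
ga1 = [(3, 0.638, 15), (6, 0.686, 20), (9, 0.661, 25), (12, 0.665, 30)]
for grade, alpha, items in ga1:
    print(grade, round(projected_length(items, alpha), 1))
```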

• Two proposed methods
– Content expert rating of the items
– Correlation study of the amount of reading
and item difficulty

• Content Expert Rating
– Five raters will be sought for each test based
on the recommended n for a suitable reliability
as measured by intraclass correlation (Walter,
Eliasziw, & Donner, 1998)
– Raters will be school librarians and classroom
teachers in the associated grade level

• Raters will be asked to determine the
degree to which each item measures the
TRAILS objective with which it is
associated on a three-point scale:
– Yes
– Yes with revisions
– No
• A comment option will be offered for each
item

• An intraclass correlation will be calculated
for each test as an estimate of the reliability of
the ratings
• Based on a similar method employed in
the validation study of the Information
Literacy Test (ILT), the frequencies and
agreement will be reported (Cameron,
Wise, & Lottridge, 2007)
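
A minimal sketch of an intraclass correlation for an items-by-raters matrix follows. The slides do not say which ICC form will be used; this example assumes the one-way random-effects form, ICC(1,1), and a hypothetical numeric coding of the scale (2 = Yes, 1 = Yes with revisions, 0 = No). The ratings are invented for illustration.

```python
import numpy as np

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for an items-by-raters matrix."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                                   # n items, k raters
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Mean square between items and mean square within items.
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical ratings: 6 items x 5 raters (assumed coding 2/1/0).
ratings = [
    [2, 2, 2, 2, 1],
    [2, 2, 1, 2, 2],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 2, 1],
    [2, 2, 2, 2, 2],
    [0, 1, 0, 0, 0],
]
icc = icc_oneway(ratings)
print(round(icc, 3))
```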

• The number of alphanumeric characters
for each item will be input as a variable as
well as the Rasch estimate of its difficulty
• The correlation between these two will be
estimated
• A strong positive correlation will be
interpreted as evidence of an
unanticipated relationship and cause for
further investigation
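
The correlation described above can be sketched as follows. A Pearson correlation is assumed, since the slides do not name the coefficient; the character counts and Rasch difficulty estimates are hypothetical values for illustration only.

```python
import numpy as np

def length_difficulty_correlation(char_counts, difficulties):
    """Pearson correlation between item length and Rasch item difficulty."""
    return float(np.corrcoef(char_counts, difficulties)[0, 1])

# Hypothetical items: alphanumeric character counts and difficulties in logits.
char_counts = [120, 85, 200, 150, 95, 240]
rasch_difficulty = [-0.4, -1.1, 0.6, 0.2, -0.8, 1.3]

r = length_difficulty_correlation(char_counts, rasch_difficulty)
print(round(r, 3))
```

With real data, a large positive r would suggest that longer items tend to be harder, which is the unanticipated relationship the slide flags for further investigation.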

Methodology for the Proposed
Study: Research Question 3
• A modified bookmarking standard setting
method will be used to set proficiency
scores on all four tests (Cizek & Bunch,
2007)

• Three rounds of data gathering are used:
– Individuals bookmark the point(s) in the test
that meet the decision criteria
– A small group discusses the first round and
sets the cut score(s)
– The same group, or a group of small groups,
discusses the second round and finalizes the
score(s)

• Modifications
– Rounds 1 and 2 will involve 5 – 8 school
librarians or classroom teachers working near
the associated grade
– Round 3 will involve the TRAILS team to
facilitate one group looking at all four tests
– Rounds 1 and 2 will be conducted virtually
through e-mail and Web conferencing
– Round 3 will be conducted in person

• Decision criteria for round 1
– Participants will select the most difficult item
that gives a proficient student in the
associated grade a 50% probability of
answering correctly (Wang, 2003)
• Sample size
– 5-8 participants sought based on a
recommendation for noncommercial focus
groups (Krueger & Casey, 2009)
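
Under the Rasch model, the probability of a correct answer is 1 / (1 + exp(-(theta - b))), so a student has a 50% chance on an item exactly when ability theta equals item difficulty b. The sketch below illustrates how the round 1 criterion maps a bookmarked item to a cut point in logits. The item difficulties and bookmarks are hypothetical, and taking the median across participants is an assumption for illustration, not the procedure specified in the slides.

```python
import math
import statistics

def rasch_probability(theta, b):
    """Rasch probability that a person of ability theta answers an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def cut_from_bookmark(difficulties, bookmarked_position):
    """Ability (logits) at which the bookmarked item is answered correctly 50% of the time."""
    ordered = sorted(difficulties)            # ordered item booklet, easiest first
    return ordered[bookmarked_position]       # at 50%, theta equals the item's difficulty

# Hypothetical item difficulties in logits.
difficulties = [-1.5, -0.9, -0.2, 0.3, 0.7, 1.4]

# Hypothetical bookmarks from five participants (0-based positions in the booklet).
bookmarks = [3, 4, 3, 2, 4]
cuts = [cut_from_bookmark(difficulties, pos) for pos in bookmarks]
panel_cut = statistics.median(cuts)           # median is an assumed aggregation rule

print(rasch_probability(panel_cut, panel_cut))  # probability at the cut point itself
print(panel_cut)
```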

References
• Cizek, G. J., & Bunch, M. B. (2007). Standard setting: A guide to
establishing and evaluating performance standards on tests. Thousand
Oaks, CA: Sage Publications.
• Crocker, L., & Algina, J. (2008). Introduction to classical & modern test
theory. Mason, OH: Cengage Learning.
• De Ayala, R. J. (2009). The theory and practice of item response theory.
New York, NY: Guilford Press.
• Krueger, R. A., & Casey, M. A. (2009). Focus groups: A practical guide for
applied research (4th ed.). Los Angeles, CA: Sage.
• Linacre, J. M. (1994). Sample size and item calibration stability. Rasch
Measurement Transactions, 7(4), 328.
• Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.).
New York, NY: McGraw-Hill.
• Walter, S. D., Eliasziw, M., & Donner, A. (1998). Sample size and optimal
designs for reliability studies. Statistics in Medicine, 17(1), 101-110.
• Wang, N. (2003). Use of the Rasch IRT model in standard setting: An item-
mapping method. Journal of Educational Measurement, 40(3), 231-253.

Questions?