Presented by Ainslie Nibert, Associate Dean and Associate Professor, Texas Woman's University College of Nursing
This webinar assists with creating critical-thinking test items for all of your exams. You’ll obtain valuable student response data from these new questions that can guide future editing and help you gain the greatest benefit from your authoring efforts. By performing a systematic item analysis after each exam, you can pinpoint students’ knowledge gaps, which will help you focus your item writing on those course objectives that are globally misunderstood or ignored. In addition to reviewing item-writing techniques, we’ll also cover the advantages of using electronic test blueprints to establish test validity and tie your assessments to your overall program objectives.
Improve your test item writing skills to help create better nursing exams
IMPROVING PROFICIENCY WITH
ITEM WRITING AND EXAM
CREATION
Ainslie T. Nibert, PhD, RN, FAAN
Consultant
Email – anibert@comcast.net
OBJECTIVES
Create/Edit test items that assess for application and analysis.
Critique exams for alignment with the NCLEX-RN® Test Plan.
Evaluate the relevance and effectiveness of current testing policies.
RESOURCES FOR DEVELOPING CRITICAL THINKING TEST ITEMS
AND ALTERNATE FORMAT ITEMS:
NATIONAL COUNCIL WEBSITE
www.ncsbn.org
NCLEX Test Plans
2016 RN
Candidate FAQ
Alternate item formats FAQ
Exam Development FAQ
Source: https://www.ncsbn.org/9010.htm
WHAT IS THE NEXT-GEN NCLEX®?
RELATIONSHIP BETWEEN
TESTING & THE CURRICULUM
Focus today: Internal Curriculum Evaluation (Teacher-made Tests)
Writing Critical Thinking Test Items
Item Analysis Software & Blueprinting
Test Item Banking & Exam Delivery
Internal Evaluation: evaluation of course objectives (faculty designed)
Five Guidelines to Developing
Effective Critical Thinking Exams
Assemble the “basics.”
Write critical thinking test items.
Pay attention to housekeeping duties.
Develop a test blueprint.
Scientifically analyze all exams.
Critical Thinking Test Items
Contain Rationale
Written at the Application Level or Above
Require Multilogical Thinking to Answer
Ask for High Level of Discrimination
Source: Morrison, Nibert, & Flick (2006)
Housekeeping Tips
Rules
Get rid of names
Get rid of ‘multiple’ multiples
Use a non-sexist writing style
Develop a parsimonious writing style
Delete scenarios
Write items independent of each other
Eliminate “of the following…”
… and More Rules
Use a question format when possible
Make distracters plausible and homogeneous
Include in the stem any words that repeat across the responses
… and More Rules
Eliminate “all of the above” and “none of the above”
Rewrite any “all except” questions
Ensure that alternatives do not overlap
Present choices in a logical order
Vary correct answer
3. 3/12/2018
3
… and the MOST IMPORTANT Rule
Develop written testing policy
Writing style
Format
Does the test measure what it claims to measure?
Content Validity
Use a Blueprint to Assess a
Test’s Validity
Test Blueprint
Reflects Course Objectives
Rational/Logical Tool
Testing Software Program
Storage of item analysis data (Last & Cumulative)
Storage of test item categories
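To make the blueprint idea concrete, here is a minimal Python sketch (not from the webinar; the objectives, cognitive levels, and counts are all hypothetical) of an electronic blueprint as a mapping from course objective and cognitive level to a planned item count, with a simple coverage check against the items actually written:

    # Hypothetical electronic test blueprint: planned item counts per
    # (course objective, cognitive level) cell, checked against the items
    # actually placed on the exam. All names and numbers are illustrative.
    from collections import Counter

    blueprint = {
        ("Oxygenation", "Application"): 6,
        ("Oxygenation", "Analysis"): 4,
        ("Perfusion", "Application"): 5,
        ("Perfusion", "Analysis"): 5,
    }

    # Each exam item is tagged with its objective and cognitive level.
    exam_items = [
        ("Oxygenation", "Application"),
        ("Oxygenation", "Analysis"),
        ("Perfusion", "Application"),
        # ... remaining tagged items ...
    ]

    actual = Counter(exam_items)
    for cell, planned in blueprint.items():
        written = actual.get(cell, 0)
        if written != planned:
            print(f"{cell}: planned {planned}, written {written}")

A report like this shows at a glance which course objectives the exam over- or under-samples, which is how the blueprint supports content validity.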
Consistency of Scores
RELIABILITY TOOLS
Kuder-Richardson Formula 20 (KR20) — EXAM
Range: –1 to +1
Point Biserial Correlation Coefficient (PBCC) — TEST ITEMS
Range: –1 to +1
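For readers who want to see the arithmetic, here is a minimal Python sketch (not from the webinar) of both statistics computed from a students-by-items matrix of 0/1 scores; the sample data are invented:

    # KR20 (exam-level reliability) and point biserial (item-level
    # discrimination) from a 0/1-scored response matrix. Illustrative only.
    import numpy as np

    def kr20(responses):
        """KR20 = (k/(k-1)) * (1 - sum(p*q) / variance of total scores)."""
        k = responses.shape[1]                 # number of items
        p = responses.mean(axis=0)             # proportion correct per item
        q = 1.0 - p
        totals = responses.sum(axis=1)         # each student's raw score
        return (k / (k - 1)) * (1.0 - (p * q).sum() / totals.var(ddof=1))

    def pbcc(responses, item):
        """Pearson correlation between one item's 0/1 score and the
        total test score."""
        totals = responses.sum(axis=1)
        return float(np.corrcoef(responses[:, item], totals)[0, 1])

    # Invented data: 5 students x 4 items (item 1 is column index 0).
    scores = np.array([[1, 1, 0, 1],
                       [1, 0, 0, 1],
                       [1, 1, 1, 1],
                       [0, 0, 0, 1],
                       [1, 1, 0, 0]])
    print(f"KR20 = {kr20(scores):.2f}")
    print(f"PBCC (item 1) = {pbcc(scores, 0):.2f}")

Some testing programs use a “corrected” point biserial that correlates the item with the total score excluding that item; the difference matters most on short exams.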
Standards of Acceptance
Item difficulty: 30%–90%
Item Discrimination Ratio: 25% and above
PBCC: 0.20 and above
KR20: 0.70 and above
ONE “ABSOLUTE” RULE ABOUT ITEM DIFFICULTY
TEST ITEMS ANSWERED CORRECTLY BY 30% or LESS of the examinees should always be considered too difficult, and the instructor must take action.
…BUT WHAT ABOUT HIGH
DIFFICULTY LEVELS?
Test items with high difficulty levels (>90%) often yield poor discrimination values.
Is there a situation where faculty can legitimately expect that 100% of the class will answer a test item correctly, and be pleased when this happens?
RULE OF THUMB ABOUT MASTERY ITEMS: Due to their negative impact on test discrimination and reliability, they should comprise no more than 10% of the test.
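As a sketch of how these difficulty rules might be automated (hypothetical code; the 30% floor, 90% ceiling, and 10% mastery cap come from the slides, while the item IDs and function name are invented):

    # Flag items by difficulty, i.e. the proportion answering correctly.
    def flag_difficulty(difficulties, mastery_items=frozenset()):
        """difficulties: item id -> proportion correct (0.0-1.0).
        mastery_items: ids faculty intentionally wrote as mastery items."""
        for item, p in sorted(difficulties.items()):
            if p <= 0.30:
                print(f"Item {item}: {p:.0%} correct -- too difficult, take action")
            elif p > 0.90 and item not in mastery_items:
                print(f"Item {item}: {p:.0%} correct -- too easy unless meant as mastery")
        # Rule of thumb: mastery items should be at most 10% of the test.
        if len(mastery_items) > 0.10 * len(difficulties):
            print("Warning: mastery items exceed 10% of the exam")

    flag_difficulty({1: 0.28, 2: 0.75, 3: 0.95}, mastery_items={3})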
THINKING MORE ABOUT ITEM
DISCRIMINATION STATISTICS
ON TEACHER-MADE TESTS…
The IDR can be calculated quickly, but it does not consider the variance of the entire group. Use it to quickly identify items with zero or negative discrimination values, since these need to be edited before being used again.
The PBCC is a more powerful measure of discrimination.
It correlates the correct answer to a single test item with the student’s total test score.
It considers the variance of the entire student group, not just the lower and upper 27% groups.
For a small ‘n,’ consider referencing the cumulative value.
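Here is a minimal sketch of the IDR as described above (assuming 0/1-scored responses and the conventional upper/lower 27% split; the data are the same invented matrix used earlier):

    # Item Discrimination Ratio: proportion correct in the upper 27% of
    # students (ranked by total score) minus that in the lower 27%.
    import numpy as np

    def idr(responses, item, frac=0.27):
        totals = responses.sum(axis=1)          # total score per student
        order = np.argsort(totals)              # students ranked low -> high
        k = max(1, int(round(frac * len(totals))))
        low, high = order[:k], order[-k:]
        return responses[high, item].mean() - responses[low, item].mean()

    scores = np.array([[1, 1, 0, 1],
                       [1, 0, 0, 1],
                       [1, 1, 1, 1],
                       [0, 0, 0, 1],
                       [1, 1, 0, 0]])
    print(f"IDR (item 1) = {idr(scores, 0):.2f}")

A zero or negative IDR means low scorers did as well as, or better than, high scorers on the item, which is the quick signal to pull it for editing.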
… WHAT DECISIONS NEED TO
BE MADE ABOUT TEST ITEMS?
When a test item has poor difficulty and/or discrimination values, action is needed. All of the following actions require that the exam be rescored:
Credit can be given for more than one choice.
Test item can be nullified.
Test item can be deleted.
Each of these actions has a consequence, so faculty need to consider the consequences carefully when choosing an action. Faculty judgment is crucial when determining actions affecting test scores.
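A small sketch of the three rescoring actions can make the consequences visible (hypothetical code; the student IDs, option letters, and key are invented):

    # Point adjustments for one flawed item under each action.
    # answers: student id -> chosen option; keys: options now given credit.
    def rescore_item(answers, action, keys=("B", "C")):
        if action == "multi_key":              # credit more than one choice
            return {s: int(c in keys) for s, c in answers.items()}
        if action == "nullify":                # every student earns the point
            return {s: 1 for s in answers}
        if action == "delete":                 # item removed from the exam;
            return {s: 0 for s in answers}     # total possible drops by 1
        raise ValueError(f"unknown action: {action}")

    answers = {"s1": "A", "s2": "B", "s3": "C"}
    print(rescore_item(answers, "multi_key"))   # {'s1': 0, 's2': 1, 's3': 1}

Note the different consequences: nullifying raises every raw score while keeping the denominator, whereas deleting shrinks the denominator, so both shift every student’s percentage; this is why faculty judgment matters.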
STANDARDS OF ACCEPTANCE: NURSING
Nursing PBCC: 0.15 and above
Nursing KR20: 0.60–0.65 and above
3-Step Method for
Item Analysis
1. Review Difficulty Level
2. Review Discrimination Data
Item Discrimination Ratio (IDR)
Point Biserial Correlation Coefficient (PBCC)
3. Review Effectiveness of Alternatives
Response Frequencies
Non-distracters
… AND A WORD ABOUT USING RESPONSE FREQUENCIES
A review of the response frequency data can focus your editing.
For items where 100% of students answer correctly and no other options were chosen, make sure that this is indeed intentional (MASTERY ITEM) and not just reflective of an item that is too easy (>90% difficulty).
Target re-writing the “zero” distracters – those options that are ignored by students. Replacing “zeros” with plausible options will immediately improve item DISCRIMINATION.
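As a sketch, a response-frequency tally that flags “zero” distracters might look like this (the option letters, key, and sample responses are invented):

    # Tally how often each option was chosen for one item and return the
    # distracters no student selected.
    from collections import Counter

    def zero_distracters(choices, options=("A", "B", "C", "D"), key="B"):
        """choices: each student's selected option for the item."""
        freq = Counter(choices)
        for opt in options:
            tag = " (key)" if opt == key else ""
            print(f"{opt}{tag}: {freq.get(opt, 0)}")
        return [o for o in options if o != key and freq.get(o, 0) == 0]

    # Option D was never chosen, so it should be rewritten.
    print(zero_distracters(["A", "B", "B", "C", "B", "B"]))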
FAIR/COMMON UNIVERSAL LANGUAGE
Examples of colloquial or culturally specific phrasing to revise:
The client is running late for an appointment.
The client understands Buddhist practices are peaceful.
The client is on five different medications.
The client ate a submarine sandwich.
The alcoholic client with delirium tremens is agitated.
After the client sneezed, the nurse said “bless you.”
The nurse is giving a report on the client.
The nursing unit is working shorthanded.
CRITICAL-THINKING QUESTIONS
Which intervention is most important?
Which intervention, plan, or assessment data is/are most critical to developing a plan of care?
Which intervention should be done first?
What action should the nurse take first?
Which intervention, plan, or nursing action has the highest priority?
What response is best?
LATEST NCLEX® TEST ITEM
FORMAT CONSIDERATIONS
Units of Measure
•International System of Units (SI)
•Metric
•Imperial Measurement
Generic vs. Trade Names for Medications
•Generic names only in most cases
•References to general classifications of medications
Item Writing Tools for
Success …
Knowledge
Test Blueprint
Testing Software
REFERENCES
Morrison, S., Nibert, A., & Flick, J. (2006). Critical thinking and test item writing (2nd ed.). Houston, TX: Health Education Systems, Inc.
National Council of State Boards of Nursing. (2016). 2016 NCLEX-RN test plan. Chicago, IL: National Council of State Boards of Nursing. https://www.ncsbn.org/RN_Test_Plan_2016_Final.pdf
National Council of State Boards of Nursing. (2018). Next Generation NCLEX Project. Chicago, IL: National Council of State Boards of Nursing. https://www.ncsbn.org/next-generation-nclex.htm
Nibert, A. (2010). Benchmarking for student progression throughout a nursing program: Implications for students, faculty, and administrators. In L. Caputi (Ed.), Teaching nursing: The art and science (2nd ed., Vol. 3, pp. 45-64). Chicago, IL: College of DuPage Press.
QUESTIONS?
Ainslie T. Nibert, PhD, RN, FAAN
anibert@comcast.net
713.997.9750
Thanks for your time & attention!