Gaetano Bruno Ronsivalle
bruno.ronsivalle@uniroma1.it
“Congratulations!
You just passed the final test!”
CRITICAL NOTES AGAINST THE PASSING SCORE CRITERION IN ASSESSMENT
Passing
Threshold
Assessment
Success
Performance:
bad or good?
How?
Which criteria?
School and University
Traditional criteria
Assessment choices
Threshold
Theoretical Frame
60 /100 correct answers
“Sufficiency”
criterion
A law of
nature?
No law to determine a
“Passing Score”
value
Nothing can be “a priori”
established
Passing Score criterion is
false, unfounded,
and meaningless
…what is the
rational
explanation of the
“sufficiency” criterion?
In fact…
Just common sense.
Alternative
method
to define the
test passing
thresholds
Identifying a passing
threshold means
determining…
… a Balance Point
Acquisition of competencies
Certification
Test Validity!
Problems
Random answers
Inadequate Passing threshold
Model:
Conditional Probabilities
The significance of a test is conditioned by
at least four factors.
1. The expected level of competence
2. The test soundness
3. The “Luck” factor
4. The number of items
1. The expected level of competence
Before the test…
P(A)
Marginal probability
Competence “A”…
2. The test soundness
Competence “A”, Test “B”
We have to verify…
The probability P(B|A) expresses…
Value from 0 to 1
Validity of our measurement tool
3. The “Luck” factor
What about “lucky”
student?
Competence “not A”, Test “B”
The probability P(B|not A) expresses…
Item types
Number of wrong options
2 options (true/false): P(B|not A) = 0,5
4 options (multiple choice): P(B|not A) = 0,25
4. The number of items
To verify the Competence
“A” …
100 multiple-choice items
Item i = Competence Ai
Expected competence = ??
Test B item validity = ??
P(Ai) = 0,5
P(Bi|Ai) = 0,5
P(Bi|not Ai) = 1/4 = 0,25
The Model
Bayes Theorem:
conditional
probabilities
Balance point
Conditions
Test typology and
Calculation
Define the minimum
threshold value
Test B Competence A
Four steps
1. P(Ai|Bi)
2. P(Ai|not Bi)
3. P(A|B)
4. k
Step 1: P(Ai|Bi)
Bi = Correct Answer
Competence “Ai”

|        | Ai                      | not Ai                       | Sum               |
| Bi     | P(Ai and Bi) = 0,25     | P(not Ai and Bi) = 0,125     | P(Bi) = 0,375     |
| not Bi | P(Ai and not Bi) = 0,25 | P(not Ai and not Bi) = 0,375 | P(not Bi) = 0,625 |
| Sum    | P(Ai) = 0,5             | P(not Ai) = 0,5              | 1                 |
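$$P(A_i \mid B_i) = \frac{P(A_i \text{ and } B_i)}{P(B_i)} = \frac{0{,}25}{0{,}375} = 0{,}\overline{6}$$

Step 2: P(Ai|not Bi)

$$P(A_i \mid \text{not } B_i) = \frac{P(A_i \text{ and not } B_i)}{P(\text{not } B_i)} = \frac{0{,}25}{0{,}625} = 0{,}4$$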
Step 3: P(A|B)
Competence “A”, Test B

$$P(A \mid B) = \frac{P[B \mid A] \cdot P(A)}{P(B)} = \frac{P[A \text{ and } B]}{P(B)}$$
{A1, A2, A3, …, Ak}
Step 3: P(A|B)

Table 4 (columns: A, not A, Sum; row: B)
A and B: [P(Ai|Bi) * … * P(Ak|Bk)] * [P(Ai|not Bi) * … * P(A(n-k)|not B(n-k))]
not A and B: [P(not Ai|Bi) * … * P(not Ak|Bk)] * [P(not Ai|not Bi) * … * P(not A(n-k)|not B(n-k))]
Sum: P(B)
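$$P(A \mid B) = \frac{P(A_i \mid B_i)^{k} \cdot P(A_i \mid \text{not } B_i)^{(100-k)}}{\left[P(A_i \mid B_i)^{k} \cdot P(A_i \mid \text{not } B_i)^{(100-k)}\right] + \left[P(\text{not } A_i \mid B_i)^{k} \cdot P(\text{not } A_i \mid \text{not } B_i)^{(100-k)}\right]}$$

n = number of items (100 in this example)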
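Step 4: k

$$P(A \mid B) = \frac{0{,}\overline{6}^{\,k} \cdot 0{,}4^{(100-k)}}{\left[0{,}\overline{6}^{\,k} \cdot 0{,}4^{(100-k)}\right] + \left[0{,}\overline{3}^{\,k} \cdot 0{,}6^{(100-k)}\right]}$$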
Minimum Passing Threshold
k < 100
P(A|B) = 1
k = min
Step 4: k = 69
Thank you!


Editor's Notes

  1. Good morning, ladies and gentlemen. I am Bruno Ronsivalle, R&D manager at ABI, the Italian Banking Association. Today we are going to talk about a calculation model for defining passing thresholds in evaluation.
  2. The passing threshold almost always determines the success of an assessment test. This value is the limit that establishes whether a performance is bad or good.
  3. But how can this value be identified? Which criteria establish how many correct answers a candidate must give to demonstrate that the competencies have been attained?
  4. Many trainers base their assessment strategies on traditional school and university criteria. But very often instructional designers can hardly describe how they make their assessment choices, why they opt for a certain passing threshold, or what their theoretical frame of reference is.
  5. Most of them adopt the “sufficiency” criterion: a student passes the test only if he or she correctly answers 60 of 100 questions. The belief is that answering 60% of the test correctly is evidence that the student has the competence under consideration, as if it were a law of nature!
  6. Clearly, this criterion is not based on a scientific argument, because there is no law of nature that can determine a reliable “passing score” value.
  7. We believe that nothing can be established “a priori” and that the “passing score criterion” is not only unfounded but in some cases false and meaningless.
  8. In fact, every time I ask for a rational explanation of this assumption, I get the same answer:
  9. “Just common sense”. Isn’t that simple?
  10. To get a less elusive answer to this question, we are going to show an alternative and more rigorous method for defining test passing thresholds.
  11. Let’s start by defining “passing threshold”. What does “identifying a passing threshold” mean? It means determining a balance point at which the probabilistic values about students' acquisition of competencies can be formally certified.
  12. But several problems can affect the objectivity of the “passing score criterion”. For example, students could answer at random, or the passing threshold could be inadequate. All these variables affect the test validity. So how can we determine this balance point?
  13. The method I propose is based on the calculation of conditional probabilities and rests on a general premise: it is not possible to define “a priori” a universal, always valid passing threshold, because the significance of a test is conditioned by at least four factors.
  14. The expected level of competence; the test soundness; the “luck” factor; the number of items.
  15. Let’s start with the first factor: the expected level of competence.
  16. What do we know about our students’ competencies before administering the test? What data are our expectations based on?
  17. We can reformulate this variable more schematically: given a competence “A”, that is, the knowledge and abilities into which this competence can be broken down (for example A1, A2, A3, …, An), P(A) expresses the marginal probability that a randomly selected examinee has the competence A. [Be careful: we speak of “marginal probability” because we are referring to an “a priori” probability, defined before the test is administered through the statistical analysis of the results of similar tests.]
  18. The second factor refers to the test's capability to effectively measure the competence “A”.
  19. In other words, we have to verify whether an examinee who actually has the competence “A” is also able to correctly answer any question of a test “B”.
  20. The value of the probability P(B|A) expresses a degree of certainty (from 0 to 1) about the “validity” of our measurement tool. To determine this value, we need to administer the test to a sample of examinees and conduct an item analysis to interpret the data.
  21. A third element is the so-called “luck” factor.
  22. Are we completely sure that students who answer the test correctly aren’t just “lucky”? This variable can be expressed by the conditional probability P(B|not A) that non-competent students (not A) solve any item of the test by randomly choosing the correct answer.
  23. The value of P(B|not A) depends on the item type and is directly tied to the number of wrong options, which make the test more or less difficult to solve. For items with two options (for example, a true/false item) this probability equals 0,5; if we use only multiple-choice items with four options (only one of them correct), the probability is 0,25.
  24. There is a fourth relevant factor: the number of items.
  25. The steps defined to observe the competence A correspond to specific knowledge and abilities related to that competence. Their number is essential to the accuracy and completeness of the measurement of the competence A: the larger the number of test items, the higher the probability of correctly measuring the coherence of the assessment system.
  26. So, let’s imagine a test composed of 100 multiple-choice items, each with three wrong options. Each item is related to just one knowledge element or ability “Ai” within the competence “A”. Let’s also imagine that, since we have never administered the test before, we know nothing about the expected level of competence or about the item validity.
  27. On these premises: the marginal probability that students already have the knowledge “Ai” equals the maximum degree of uncertainty, hence P(Ai) = 0,5; the conditional probability that students who have the knowledge/ability (Ai) correctly answer the corresponding item (Bi) also equals the maximum degree of uncertainty, hence P(Bi|Ai) = 0,5; the conditional probability that students who lack the knowledge/ability (Ai) give the correct answer to the corresponding item (Bi) at random equals the ratio between the number of attempts, 1, and the number of available options, 4, hence P(Bi|not Ai) = 1/4 = 0,25.
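The starting assumptions of the example can be written down as a minimal Python sketch. The function name `guess_probability` and the variable names are hypothetical, chosen only for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def guess_probability(num_options: int) -> Fraction:
    """Chance of picking the single correct option purely at random."""
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return Fraction(1, num_options)


# With no previous data, the example assigns maximum uncertainty (0.5)
# to both the expected competence and the item validity.
p_a = Fraction(1, 2)                    # P(Ai): examinee has the ability
p_b_given_a = Fraction(1, 2)            # P(Bi|Ai): competent examinee answers correctly
p_b_given_not_a = guess_probability(4)  # P(Bi|not Ai): lucky guess among 4 options

print(float(p_a), float(p_b_given_a), float(p_b_given_not_a))  # 0.5 0.5 0.25
print(float(guess_probability(2)))  # true/false item: 0.5
```

If item statistics from earlier administrations were available, the two 0.5 values would presumably be replaced by estimates from that data.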
  28. At this point, let’s describe the calculation model for determining the passing threshold. First of all, the model is based on Bayes’ theorem on conditional probabilities. Following Bayes, the balance value of the passing threshold must combine several conditions in order to effectively check the chances that a student has actually attained the required competencies. This balance point has to be related to the test type and recalculated each time, depending on the factors that may perturb the assessment phase.
  29. Now, back to our example. We have to define the minimum threshold value for passing a test composed of 100 multiple-choice items, and so identify the examinees who have the competence A. How do we do that?
  30. The model we propose is a four-step method that calculates and combines the different probabilistic conditions to obtain the test passing threshold.
  31. In the first step we calculate the probability that a student who gives the correct answer (Bi) has the ability Ai.
  32. First we need to make all possible combinations explicit. The table shows four combinations: Ai and Bi: the examinee gives the correct answer and has the competence. not Ai and Bi: the examinee gives the correct answer but does not have the competence. Ai and not Bi: the examinee gives the wrong answer and has the competence. not Ai and not Bi: the examinee gives the wrong answer and does not have the competence.
  33. After that, we can determine the probability of each combination.
  34. Finally, we can calculate the probability that a student who gives the correct answer Bi has the ability Ai.
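The first step can be reproduced with a short Python sketch that builds the four joint probabilities and then conditions on a correct answer. The helper `joint_table` and the dictionary keys are hypothetical names; the input values are the ones assumed in the example (0.5, 0.5, 0.25).

```python
from fractions import Fraction

p_a = Fraction(1, 2)              # P(Ai)
p_b_given_a = Fraction(1, 2)      # P(Bi|Ai)
p_b_given_not_a = Fraction(1, 4)  # P(Bi|not Ai)


def joint_table(p_a, p_b_given_a, p_b_given_not_a):
    """Joint probabilities of the four (competence, answer) combinations."""
    p_not_a = 1 - p_a
    return {
        ("A", "B"): p_a * p_b_given_a,
        ("not A", "B"): p_not_a * p_b_given_not_a,
        ("A", "not B"): p_a * (1 - p_b_given_a),
        ("not A", "not B"): p_not_a * (1 - p_b_given_not_a),
    }


table = joint_table(p_a, p_b_given_a, p_b_given_not_a)

# Marginal probability of a correct answer: sum of the "B" row.
p_b = table[("A", "B")] + table[("not A", "B")]

# Probability of having the ability given a correct answer.
p_a_given_b = table[("A", "B")] / p_b

for combination, probability in table.items():
    print(combination, float(probability))
print("P(Bi) =", float(p_b))        # 0.375
print("P(Ai|Bi) =", p_a_given_b)    # 2/3
```

The exact result is 2/3, i.e. 0,6 with the 6 repeating.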
  35. In the second step, on the basis of the previous table, Bayes’ theorem helps us calculate the probability that an examinee who gives a wrong answer (not Bi) nevertheless has the knowledge Ai.
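The same item-level values can also be obtained by applying Bayes’ theorem directly rather than reading them off the table. The sketch below assumes the example inputs (0.5, 0.5, 0.25); the function name `posterior` is hypothetical.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_if_not):
    """Bayes' theorem: P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|not H) P(not H)]."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


p_a = Fraction(1, 2)              # P(Ai)
p_b_given_a = Fraction(1, 2)      # P(Bi|Ai)
p_b_given_not_a = Fraction(1, 4)  # P(Bi|not Ai)

# Evidence = correct answer Bi.
p_a_given_b = posterior(p_a, p_b_given_a, p_b_given_not_a)

# Evidence = wrong answer (not Bi): use the complementary likelihoods.
p_a_given_not_b = posterior(p_a, 1 - p_b_given_a, 1 - p_b_given_not_a)

print("P(Ai|Bi)         =", p_a_given_b)          # 2/3
print("P(Ai|not Bi)     =", p_a_given_not_b)      # 2/5
print("P(not Ai|Bi)     =", 1 - p_a_given_b)      # 1/3
print("P(not Ai|not Bi) =", 1 - p_a_given_not_b)  # 3/5
```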
  36. Having defined the probabilities for the single item, the third step is to define the formula for calculating the probability that a student who gives the correct answers to the test B has the competence A. In this step the analysis must be extended to the whole test. How?
  37. Our goal is to define an algorithm, based on Bayes’ theorem, to calculate the probability P(A|B), where B is the composite event comprising the k correct answers to the test items.
  38. So it will be necessary to: determine the possible combinations of the variable B with “A” and “not A”; determine the formula and define the probability of every single combination;
  39. and, finally, determine the general algorithm for calculating P(A|B). In this specific case, the formula will be this one. “n” is the number of items in our test: in this case, 100 multiple-choice items.
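As a rough illustration, the whole-test formula can be sketched in Python as a function of the number of correct answers k. The function name `prob_competent_given_k` and its parameter names are hypothetical; the defaults are the item-level values of the example (P(Ai|Bi) = 2/3, P(Ai|not Bi) = 2/5), and the sketch assumes, as the formula does, that every item shares the same probabilities.

```python
from fractions import Fraction


def prob_competent_given_k(k, n=100,
                           p_a_given_b=Fraction(2, 3),
                           p_a_given_not_b=Fraction(2, 5)):
    """P(A|B) for a test of n items with k correct and n - k wrong answers."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    # Numerator: the examinee has the ability behind every item.
    competent = p_a_given_b ** k * p_a_given_not_b ** (n - k)
    # Second term of the denominator: the examinee lacks every ability.
    not_competent = (1 - p_a_given_b) ** k * (1 - p_a_given_not_b) ** (n - k)
    return competent / (competent + not_competent)


for k in (30, 40, 50, 60, 70):
    print(k, float(prob_competent_given_k(k)))
```

Under these assumptions the value grows with every additional correct answer, which is what allows a minimum threshold to be searched for in the last step.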
  40. In the last step we have to determine the threshold “k”. Once the formula for P(A|B) is defined, we need to identify the value of “k” that is the minimum passing threshold. This value must satisfy some conditions: it must be lower than “n”, the total number of items (in this case 100); the passing threshold must guarantee that the examinee certainly has the competence A, which means that P(A|B) equals 1; finally, k must be the minimum number of correct answers among the set of values for which P(A|B) equals 1. The general formula will be this one. But how can we identify the exact point where P(A|B) equals 1?
  41. Thanks to the graphical representation of the function f(k), we can identify geometrically the exact point where P(A|B) reaches the value 1. In this case the point “k” lies in the interval between 68 and 70. That means every examinee must give at least 69 correct answers out of 100 to show for certain that he or she has the competence A. The problem has been solved!
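The search for k can also be done numerically instead of graphically. The sketch below is only one possible reading of the last step: since the formula approaches 1 without reaching it exactly for k below 100, the code assumes that “equals 1” means “differs from 1 by less than a tolerance”, and the tolerance of 1e-15 is an assumption of this sketch, not a value given in the presentation. The function name `min_passing_threshold` is hypothetical.

```python
from fractions import Fraction


def min_passing_threshold(n=100,
                          p_a_given_b=Fraction(2, 3),
                          p_a_given_not_b=Fraction(2, 5),
                          tolerance=Fraction(1, 10 ** 15)):
    """Smallest k for which 1 - P(A|B) falls below the (assumed) tolerance."""
    for k in range(n + 1):
        competent = p_a_given_b ** k * p_a_given_not_b ** (n - k)
        not_competent = (1 - p_a_given_b) ** k * (1 - p_a_given_not_b) ** (n - k)
        p = competent / (competent + not_competent)
        if 1 - p < tolerance:
            return k
    return None  # no k reaches the requested level of certainty


print(min_passing_threshold())  # 69
```

With this tolerance the sketch returns 69, in line with the interval between 68 and 70 read from the graph. The result depends on the tolerance chosen: a looser one, such as 1e-6, gives a lower threshold.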
  42. For further information you can write to me at this e-mail address. Thank you very much for your attention!