Scaling Assessment with
Adaptive Comparative Judgement
Sarah Honeychurch
Niall Barr
Jeremy Singer
Steve Draper
Adaptive Comparative Judgement
A method of ranking artefacts by
making comparative judgements,
rather than absolute ones.
• Intuitively plausible
• Removes pretence of expert,
objective standards
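To make the idea concrete, here is a minimal sketch of turning pairwise judgements into a ranking. It assumes a Bradley-Terry model, a logistic relative of Thurstone's law of comparative judgment; this is an illustrative choice and not necessarily the model used by the software described in this talk. The function name `bradley_terry`, the `prior` parameter (half a virtual win and half a virtual loss against a reference artefact of strength 1, so that undefeated artefacts keep finite scores) and the sample data are all hypothetical.

```python
def bradley_terry(judgements, iterations=200, prior=0.5):
    """Estimate a strength per artefact from (winner, loser) pairs.

    Assumption: Bradley-Terry model fitted with the standard iterative
    (minorise-maximise) update, regularised by `prior` virtual wins and
    losses against a reference artefact whose strength is fixed at 1.
    """
    items = sorted({name for pair in judgements for name in pair})
    wins = {i: 0 for i in items}
    met = {}  # number of comparisons per unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        key = frozenset((winner, loser))
        met[key] = met.get(key, 0) + 1

    strength = {i: 1.0 for i in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # virtual games against the reference artefact (strength 1)
            denom = 2 * prior / (strength[i] + 1.0)
            for j in items:
                if j != i:
                    n_ij = met.get(frozenset((i, j)), 0)
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = (wins[i] + prior) / denom
        strength = updated
    return strength


# Hypothetical judgements: each tuple is (preferred artefact, other artefact).
sample = [("A", "B"), ("A", "C"), ("B", "C"),
          ("A", "B"), ("B", "C"), ("C", "B")]
strengths = bradley_terry(sample)
ranking = sorted(strengths, key=strengths.get, reverse=True)
```

No judge ever assigns an absolute mark here: the ranking comes only from who was preferred to whom.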
A radically different approach to grading
• Produces a fully ranked set of
scripts
• Allows for separate consideration
about where to insert grade
boundaries
• Marking to a curve
• Marking to rigid standards (e.g. ILOs)
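Once a full ranking exists, placing grade boundaries is a separate step. The sketch below shows two hypothetical ways of doing it: `grade_to_curve` gives fixed shares of the cohort to each grade, and `grade_by_boundary_scripts` places each boundary at a named script judged to be the lowest one still meeting a standard. Both function names, the grade labels and the sample scripts are assumptions for illustration, not part of the software described here.

```python
def grade_to_curve(ranked, curve):
    """ranked: scripts best-first; curve: [(grade, share of cohort), ...]."""
    n = len(ranked)
    grades, start, cumulative = {}, 0, 0.0
    for grade, share in curve:
        cumulative += share
        end = min(n, round(cumulative * n))
        for script in ranked[start:end]:
            grades[script] = grade
        start = end
    for script in ranked[start:]:  # rounding leftovers get the last grade
        grades[script] = curve[-1][0]
    return grades


def grade_by_boundary_scripts(ranked, boundaries, bottom_grade):
    """boundaries: [(grade, lowest-ranked script still earning it), ...],
    listed best grade first; everything below the last boundary gets
    bottom_grade."""
    grades, start = {}, 0
    for grade, script in boundaries:
        end = ranked.index(script) + 1
        for s in ranked[start:end]:
            grades[s] = grade
        start = end
    for s in ranked[start:]:
        grades[s] = bottom_grade
    return grades


# Hypothetical ranked cohort of ten scripts, best first.
ranked = [f"s{i}" for i in range(1, 11)]
curved = grade_to_curve(ranked, [("A", 0.2), ("B", 0.3), ("C", 0.5)])
standards = grade_by_boundary_scripts(ranked, [("A", "s3"), ("B", "s7")], "C")
```

The same ranking feeds both functions; only the policy for drawing the boundaries differs.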
A radically different approach to grading
• Uses a single, implicit criterion
rather than a complex, explicit set
of ILOs
• Can be used both for questions that
have a single correct answer and
for those that don't
Distinctive Benefits of Pollitt’s ACJ Approach
• Method “scales”
• Compelling naturalness
• Can be used with sets of markers
• Can be used for peer review
• Can easily mark cross-media (& multi-media)
• Can easily be used for/with unusual, subjective,
and implicit marking criteria
• Can be used by matching against exemplars
• http://www.psy.gla.ac.uk/~steve/apr/apr.html#usp
Adaptive Comparative Judgement
• The software has been built, tested, and used by
more than one person / organisation. (It has also been
used for conference talk refereeing at UofG.)
• A major experiment has been done and published,
using professional markers, supporting the key
claims (Pollitt, 2012).
• That paper additionally reports an important
qualitative datum: the markers were highly
sceptical (they did the experiment for the money, at
standard professional rates for marking) but came
to see it as better as well as faster than their
traditional way of marking.
Our ACJ Implementation: the software
• A simple IMS LTI application that can be linked
from Moodle, FutureLearn or any other LTI host.
• Submissions can be text, source code, PDFs,
images or YouTube URLs.
• Submissions can be added by staff for a
review-only exercise, or by each student.
• Like Moodle Workshop and Aropä, it has
separate submission and review phases.
Our ACJ Implementation: the algorithm
• Sorting is done in ‘rounds’
• A new pairing is allocated at the start of each round
• Three different phases, each with a different
‘scoring’ method as the sort improves
• A simulation (using random errors in
comparison) was used to refine the
algorithm
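A simulation of this kind can be sketched in a few lines. The version below is an assumption-laden illustration, not the project's actual three-phase algorithm: it pairs neighbours in the current order, alternates the pairing offset each round (an odd-even transposition sort), and lets a simulated judge give the wrong answer with probability `error_rate`. All names (`noisy_prefers`, `run_round`, `misplaced_pairs`, `simulate`) are hypothetical.

```python
import random


def noisy_prefers(a, b, error_rate, rng):
    """Simulated judge: True if a is judged better than b.
    a and b are 'true quality' numbers; the judge errs at error_rate."""
    correct = a > b
    return correct if rng.random() >= error_rate else not correct


def run_round(order, round_no, error_rate, rng):
    """One round: compare neighbouring pairs and swap on the judge's say-so.
    The pairing offset alternates so every neighbour pair is visited."""
    order = list(order)
    for i in range(round_no % 2, len(order) - 1, 2):
        if noisy_prefers(order[i + 1], order[i], error_rate, rng):
            order[i], order[i + 1] = order[i + 1], order[i]
    return order


def misplaced_pairs(order):
    """Number of pairs in the wrong relative order (0 = sorted best-first)."""
    return sum(1 for i in range(len(order))
               for j in range(i + 1, len(order)) if order[i] < order[j])


def simulate(n_items=20, rounds=40, error_rate=0.1, seed=1):
    rng = random.Random(seed)
    order = list(range(n_items))  # item value doubles as its true quality
    rng.shuffle(order)            # random initial order
    history = [misplaced_pairs(order)]
    for round_no in range(rounds):
        order = run_round(order, round_no, error_rate, rng)
        history.append(misplaced_pairs(order))
    return order, history


clean_order, clean_history = simulate(error_rate=0.0)
noisy_order, noisy_history = simulate(error_rate=0.1)
```

Comparing `clean_history` with `noisy_history` shows how judging errors slow or limit convergence, which is the kind of evidence a simulation can provide when tuning the pairing and scoring rules.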
Our ACJ Implementation: the process
[Figure: six numbered artefacts (1 to 6) shown at three stages: Random order, First sort, and Second sort (Round 2).]
Phase 1: Random Initial Order, Neighbour Comparison, Quartile Bins
[Plot; x-axis: round #]
Phase 2: Using Earlier Judgments to Select New Comparisons
[Plot; x-axis: round #]
Phase 3: More Refined Comparison with Near Neighbours
[Plot; x-axis: round #]
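The idea behind Phase 2, using earlier judgements to choose the next comparisons, can be illustrated with a greedy pairing rule. This is a hypothetical sketch and not the rule used in the software: artefacts are ranked by their current score (here simply a win count), and each one is paired with the closest-ranked artefact it has not already been compared with. The name `select_pairs` and the sample scores are assumptions.

```python
def select_pairs(scores, already_compared):
    """Pick this round's comparisons.

    scores: {artefact: current score}, higher is better.
    already_compared: set of frozenset({a, b}) for pairs judged earlier.
    Returns a list of (a, b) pairs; an artefact with no fresh partner
    sits the round out.
    """
    ranking = sorted(scores, key=lambda item: (-scores[item], item))
    unpaired = list(ranking)
    pairs = []
    while len(unpaired) > 1:
        first = unpaired.pop(0)
        for candidate in unpaired:
            if frozenset((first, candidate)) not in already_compared:
                unpaired.remove(candidate)
                pairs.append((first, candidate))
                break  # stop scanning once a partner is found
    return pairs


# Hypothetical state after a few rounds.
scores = {"a": 3, "b": 3, "c": 2, "d": 1, "e": 0, "f": 0}
seen = {frozenset(("a", "b"))}
next_round = select_pairs(scores, seen)
```

Pairing close neighbours concentrates judging effort where the order is still uncertain, instead of repeating comparisons whose outcome is already clear.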
Demonstration of Scaling
• The same simulation with 600 ‘artefacts’
• After 17 rounds, sorting is very good
• (Image shows middle ~1/3 with one ‘artefact’
highlighted)
Adaptive Comparative Judgement
https://learn.gla.ac.uk/acjdemo/
This demonstration lets you try out ACJ by
comparing photographs of wildlife and flowers. (It
uses a development version of the software that
doesn’t require a login.)
Case Study
• FutureLearn MOOC (n=1000)
• COMPSCI4021 (n=80)
Functional Programming in Haskell: Supercharge Your Coding
Case Study Continued
In the Haskell MOOC, we asked students to
peer assess using ACJ.
Students received:
1. Problem spec (to implement)
2. Quality guidelines as judgment criterion
3. Peers’ solutions (to compare)
4. Ranking of their own work (quartile bin)
5. A sample solution
Student comments
• I can see different ways of thinking and I try to understand
which one is better(more efficient) and I hope that I will be able
to make my own codes more efficient in the future.
• The approach forces you to think differently. This can only be
trained by doing it.
• Being able to compare your own work against lots of others lets
you see roughly how well/poorly you are progressing in the
course compared to your classmates as a whole.
• I think that it is a very useful exercise (both writing a code and
comparing the codes of other students) and it is organised in a
great way. I would like to thank the course educators.
• As you start comparing you can see the different approaches
students started using and everything could be compared
faster.
Interesting statistics
Can be set up to produce reports:
• Who was the most deviant marker?
• Which submission was the most divisive?
• How converged were the judgements?
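Reports like these can be derived directly from the stored judgements. The sketch below uses hypothetical definitions, since the slides do not say how the software computes them: a judgement "disagrees" when its winner has a lower consensus score than its loser; the most deviant marker is the one with the highest disagreement rate; the most divisive submission is the one involved in the highest share of disagreeing judgements; and overall agreement stands in for convergence. The name `judgement_report` and the sample data are assumptions.

```python
from collections import defaultdict


def judgement_report(judgements, score):
    """judgements: [(marker, winner, loser), ...]
    score: {submission: consensus score}, higher is better."""
    if not judgements:
        raise ValueError("no judgements to report on")
    by_marker = defaultdict(lambda: [0, 0])      # [disagreements, total]
    by_submission = defaultdict(lambda: [0, 0])
    disagreements = 0
    for marker, winner, loser in judgements:
        against = 1 if score[winner] < score[loser] else 0
        disagreements += against
        for tally in (by_marker[marker],
                      by_submission[winner],
                      by_submission[loser]):
            tally[0] += against
            tally[1] += 1

    def rate(tally):
        return tally[0] / tally[1]

    return {
        "most_deviant_marker":
            max(by_marker, key=lambda m: rate(by_marker[m])),
        "most_divisive_submission":
            max(by_submission, key=lambda s: rate(by_submission[s])),
        "agreement": 1 - disagreements / len(judgements),
    }


# Hypothetical consensus scores and judgement log.
consensus = {"x": 3, "y": 2, "z": 1}
log = [("m1", "x", "y"), ("m1", "y", "z"),
       ("m2", "y", "x"), ("m2", "x", "z"),
       ("m3", "x", "z")]
report = judgement_report(log, consensus)
```

A high disagreement rate is not necessarily a fault: it may flag a careless marker, or a submission whose quality is genuinely contested.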
Where next?
• Still a development / pilot tool
− Further refinement possible
• Could this be useful in your teaching?
− Scholarship / research
− Not yet a ‘Service’ at UofG
References
• Dale, V.H.M., & Singer, J., 2019. Learner experiences of a
blended course incorporating a MOOC on Haskell
functional programming. Research in Learning
Technology, 27. DOI: 10.25304/rlt.v27.2248
• Pollitt, A., 2012. The method of Adaptive Comparative
Judgement. Assessment in Education: Principles, Policy
& Practice, 19(3), pp.281–300.
• Thurstone, L. L., 1927. A law of comparative
judgment. Psychological Review, 34(4), pp.273–286.
http://dx.doi.org/10.1037/h0070288
Pointers
Sarah.Honeychurch@glasgow.ac.uk @NomadWarMachine
Niall.Barr@glasgow.ac.uk @niall_barr
Jeremy.Singer@glasgow.ac.uk
Steve.Draper@glasgow.ac.uk
Source code: https://github.com/niallb/ACJ-LTI
Further notes: http://www.psy.gla.ac.uk/~steve/apr/apr.html
This talk: http://www.psy.gla.ac.uk/~steve/talks/apr4.html

From a thousand learners to a thousand markers: Scaling peer feedback with Adaptive Comparative Judgement: ALT-C 2019
