SlideShare a Scribd company logo
1 of 16
Download to read offline
MOOC Dropout Prediction Using Machine Learning
Techniques: Review and Research Challenges
Fisnik Dalipi
Linnaeus University, Sweden & University College of Southeast Norway
Ali Shariq Imran, Zenun Kastrati
Norwegian University of Science and Technology
This is a presentation of the paper: “MOOC Dropout Prediction Using Machine
Learning Techniques: Review and Research Challenges”, which was accepted and
presented at the
EDUCON’2018 – IEEE Global Engineering Education Conference
The full paper can be downloaded from the IEEEXplore website: https://ieeexplore.ieee.org/
Introduction
Massive Open Online Courses (MOOCs) are now the most recent topic within the field of e-
learning
Some researchers even predict MOOC will represent the fourth stage of online education
evolution, after the third stage of LMS
MOOCs are progressively becoming an integral part of a learning process in higher education
settings
Also proven that the MOOC approach when complemented with a flipped classroom pedagogy
would also bring advantages toward enhancing the learning experience of students substantially
But …
The biggest threat to MOOCs
Reasons behind the dropoutStudentrelatedMOOCrelated
Lack of motivation
Lack of time
Insufficient background
Course design
Lack of interactivity
Hidden costs
State-of-the-art: Frequency of occurrences of
ML for MOOC dropout prediction
0
1
2
3
4
5
6
7
8
9
10
Logistic
Regression
Deep Neural
Network
Support Vector
Machine
Hidden Markov
Models
Recurrent
Neural Network
Natural
Language
Processing
Technique
Decision Tree Survival
Analysis
Bayesian
Network
Challenges of dropout prediction using ML
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
oFor unbiased classification results
oLack of negative samples, i.e. HarvardX
o E.g. Out of 641138 registered students only 17687
obtained certification
oHowever, the effectiveness of the ML relies on the
availability of enormous amount of both positive
and negative samples
oA significant difference in observable sample data
affects
othe generalization of the deep neural network
model during the training step, but also
ocomprises the classification accuracy.
Challenges of dropout prediction using ML
oMOOC comprises of big masses of unstructured
data missing data occurs and is hard to validate
oA multitude of ML techniques can’t be applicable
oE.g. Hidden Markov Models, since
oit relies on a finite data set, and
ocannot be applied if observations are missing
oResearchers reaction: replacing missing
observations with mean values.
oNot a realistic choice for many of the observed
features
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
Challenges of dropout prediction using ML
oIn MOOC's self-paced learning pedagogy, students
have the freedom to decide what, when, and how
to study.
oThis might lead to considerable data variance,
which may produce less accurate and reliable ML
models
oParticularly for SVM and Naïve Bayes, since
otheir performance can quickly turn poor when
dealing with imbalanced classes
oOverall, the high data imbalance may result in
biasing of the classifier towards the majority of the
class.
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
Challenges of dropout prediction using ML
oMOOC platforms are often reluctant to publish the
data due to confidentiality and privacy concerns
oThe de-identified information which is made public
is also restricted to system generated events
(clickstream data) only, while most of the user-
provided data is omitted.
oThe lack of user-provided data can be a hindrance
in successfully estimating the reasons behind the
dropouts.
Challenges of dropout prediction using ML
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
oThe user data MOOC platforms gather and the
clickstream data they generate doesn’t necessarily
conform to a standard format, due to lack of available
standards - such as IEEE LOM for learning objects
oThe lack of a standard for creating and representing
clickstream data for MOOC implies that
oa network trained and validated for one MOOC on a
given platform might not be interoperable across
different MOOC on another platform, or even on the
same platform if a MOOC gathers and collects data
differently
oLack of standards gives rise to another challenge, i.e., how
to create compelling ML models and predictive solutions
that can be generalized across different MOOCs?
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing
clickstream data
Student schedule related challenges
Challenges of dropout prediction using ML
oStudents want to construct timetable based on
their preferences
oSatisfying such preferences of students to be
accommodated on different timetables is a
complicated problem which can be solved using
o Artificial intelligence (AI) optimization techniques.
o These AI techniques are specialized to be used for
scheduling and thus students’ time constraints can
be solved using these techniques.
Lack of enough sample data
Managing big masses of unstructured data
Data variance
High data imbalance
Availability of publicly accesible dataset
Lack of standard for creating and representing clickstream data
Student schedule related challenges
Challenges of dropout prediction using ML
Towards useful and effective predictive
solutions
Useful
&
Effective
solutions
Clickstream
data
standards
Student
provided
data
Feature
engineering
techniques
Engagement
system
Evaluating
models and
predictors
Improving
student
social
engagement
Unification/standardization of a
clickstream data
Predicting models can benefit from the
data provided by the students that
don't reveal the identity of the user
To analyze posts, feedback, and
comments shared by the candidate on
the discussion blogs and social forums
of the MOOC
Encouraging social interactions and
simulating peer-evaluation activities
could also potentially represent an
efficient way to mitigate the dropout
To transfer and evaluate models from
one MOOC to another, and to perform
real-time prediction with ongoing
courses.
To allow instructors track and monitor
students activity across different
modules of the course
Conclusions
This paper provides a comprehensive review of the most recent and relevant research
endeavors on machine learning application toward predicting, explaining and solving the
problem of student dropout in MOOCs
It also identifies some of the critical challenges associated with student dropout prediction and
provides recommendations and proposal to assist researchers employing various machine
learning techniques in solving it timely and efficiently.
Additionally, this paper floats the idea of unification of a clickstream data and student-provided
data as a standard, similar to various learning objects metadata standards.
Future work will follow on data collection for the development and analysis of deep learning
algorithms towards solving the MOOC student dropout.
Thank You!

More Related Content

What's hot

Massive Open Online Course ( MOOC )
Massive Open Online Course ( MOOC )Massive Open Online Course ( MOOC )
Massive Open Online Course ( MOOC )Mabusela M.G
 
SMRS B.Ed College Principal NAAC PPT (2)
SMRS B.Ed College Principal NAAC PPT (2)SMRS B.Ed College Principal NAAC PPT (2)
SMRS B.Ed College Principal NAAC PPT (2)Omprakash H M H M
 
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)AksharaDandgaval
 
New gen web browsers and search tools
New gen web browsers and search toolsNew gen web browsers and search tools
New gen web browsers and search toolssushiaaron
 
Grounded theory methodology of qualitative data analysis
Grounded theory methodology of qualitative data analysisGrounded theory methodology of qualitative data analysis
Grounded theory methodology of qualitative data analysisDr. Shiv S Tripathi
 
Educational policies of Pakistan 1998 to 2010.pptx
Educational policies of Pakistan 1998 to 2010.pptxEducational policies of Pakistan 1998 to 2010.pptx
Educational policies of Pakistan 1998 to 2010.pptxBushraHanif11
 
Triangulation: An Approach to establish Credibility and Dependability of Qual...
Triangulation: An Approach to establish Credibility and Dependability of Qual...Triangulation: An Approach to establish Credibility and Dependability of Qual...
Triangulation: An Approach to establish Credibility and Dependability of Qual...sankarprasadmohanty
 
Quality in-higher-education
Quality in-higher-educationQuality in-higher-education
Quality in-higher-educationRao Ahmad
 
Online education a new way of learning
Online education a new way of learningOnline education a new way of learning
Online education a new way of learningHimanshu Gupta
 
Online examination ppt
Online examination pptOnline examination ppt
Online examination pptAmit Kumar
 
Types of Online Learning and Classes
Types of Online Learning and ClassesTypes of Online Learning and Classes
Types of Online Learning and ClassesBeth Backes
 

What's hot (20)

Distance Education
 Distance  Education  Distance  Education
Distance Education
 
Six months progress review (PhD work)
Six months progress review (PhD work)Six months progress review (PhD work)
Six months progress review (PhD work)
 
Massive Open Online Course ( MOOC )
Massive Open Online Course ( MOOC )Massive Open Online Course ( MOOC )
Massive Open Online Course ( MOOC )
 
NPTEL
NPTELNPTEL
NPTEL
 
SMRS B.Ed College Principal NAAC PPT (2)
SMRS B.Ed College Principal NAAC PPT (2)SMRS B.Ed College Principal NAAC PPT (2)
SMRS B.Ed College Principal NAAC PPT (2)
 
Online learning
Online learningOnline learning
Online learning
 
E-learning system
E-learning systemE-learning system
E-learning system
 
Introduction to SWAYAM
Introduction to SWAYAMIntroduction to SWAYAM
Introduction to SWAYAM
 
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)
E-Learning and Types of E-Learning (Asynchronous and synchronous e learning)
 
New gen web browsers and search tools
New gen web browsers and search toolsNew gen web browsers and search tools
New gen web browsers and search tools
 
Grounded theory methodology of qualitative data analysis
Grounded theory methodology of qualitative data analysisGrounded theory methodology of qualitative data analysis
Grounded theory methodology of qualitative data analysis
 
Educational policies of Pakistan 1998 to 2010.pptx
Educational policies of Pakistan 1998 to 2010.pptxEducational policies of Pakistan 1998 to 2010.pptx
Educational policies of Pakistan 1998 to 2010.pptx
 
Triangulation: An Approach to establish Credibility and Dependability of Qual...
Triangulation: An Approach to establish Credibility and Dependability of Qual...Triangulation: An Approach to establish Credibility and Dependability of Qual...
Triangulation: An Approach to establish Credibility and Dependability of Qual...
 
Quality in-higher-education
Quality in-higher-educationQuality in-higher-education
Quality in-higher-education
 
Online education a new way of learning
Online education a new way of learningOnline education a new way of learning
Online education a new way of learning
 
Neethu nptel
Neethu nptelNeethu nptel
Neethu nptel
 
Online examination ppt
Online examination pptOnline examination ppt
Online examination ppt
 
Learning Managment System
Learning Managment SystemLearning Managment System
Learning Managment System
 
Types of Online Learning and Classes
Types of Online Learning and ClassesTypes of Online Learning and Classes
Types of Online Learning and Classes
 
E learning
E learningE learning
E learning
 

Similar to MOOC Dropout Prediction Using ML

Using Ontology in Electronic Evaluation for Personalization of eLearning Systems
Using Ontology in Electronic Evaluation for Personalization of eLearning SystemsUsing Ontology in Electronic Evaluation for Personalization of eLearning Systems
Using Ontology in Electronic Evaluation for Personalization of eLearning Systemsinfopapers
 
A Systematic Literature Review Of Student Performance Prediction Using Machi...
A Systematic Literature Review Of Student  Performance Prediction Using Machi...A Systematic Literature Review Of Student  Performance Prediction Using Machi...
A Systematic Literature Review Of Student Performance Prediction Using Machi...Angie Miller
 
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...Maha Al-Freih
 
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...Assignments As Influential Factor To Improve The Prediction Of Student Perfor...
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...Kate Campbell
 
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...eMadrid network
 
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...DataScienceConferenc1
 
Utilising learning styles
Utilising learning stylesUtilising learning styles
Utilising learning stylesarteimi
 
Meta-review of recognition of learning in LMS and MOOCs - Ruth Cobos
Meta-review of recognition of learning in LMS and MOOCs - Ruth CobosMeta-review of recognition of learning in LMS and MOOCs - Ruth Cobos
Meta-review of recognition of learning in LMS and MOOCs - Ruth CoboseMadrid network
 
Macfadyen usc tlt keynote 2015.pptx
Macfadyen usc tlt keynote 2015.pptxMacfadyen usc tlt keynote 2015.pptx
Macfadyen usc tlt keynote 2015.pptxLeah Macfadyen
 
Predicting student performance using aggregated data sources
Predicting student performance using aggregated data sourcesPredicting student performance using aggregated data sources
Predicting student performance using aggregated data sourcesOlugbenga Wilson Adejo
 
Learning Analytics
Learning AnalyticsLearning Analytics
Learning AnalyticsViplav Baxi
 
Expert Perceptions of the Feasibility of MOOCs
Expert Perceptions of the Feasibility of MOOCsExpert Perceptions of the Feasibility of MOOCs
Expert Perceptions of the Feasibility of MOOCsShalin Hai-Jew
 
Dissertation Defense Powerpoint presented Aug. 8th, 2015
Dissertation Defense Powerpoint presented Aug. 8th, 2015Dissertation Defense Powerpoint presented Aug. 8th, 2015
Dissertation Defense Powerpoint presented Aug. 8th, 2015KJ Slyusar
 
Dissertation Defense presented August 8th 2015 Dr. Koffajuah Toeque Ed.D
Dissertation Defense  presented August 8th 2015 Dr. Koffajuah Toeque Ed.DDissertation Defense  presented August 8th 2015 Dr. Koffajuah Toeque Ed.D
Dissertation Defense presented August 8th 2015 Dr. Koffajuah Toeque Ed.DDr. Koffajuah Toeque Ed.D
 
DATA ANALYTICS FOR HIGHER EDUCATION
 DATA ANALYTICS FOR HIGHER EDUCATION DATA ANALYTICS FOR HIGHER EDUCATION
DATA ANALYTICS FOR HIGHER EDUCATIONSamantha Suraweera
 
Poster: Very Open Data Project
Poster: Very Open Data ProjectPoster: Very Open Data Project
Poster: Very Open Data ProjectEdward Blurock
 

Similar to MOOC Dropout Prediction Using ML (20)

Using Ontology in Electronic Evaluation for Personalization of eLearning Systems
Using Ontology in Electronic Evaluation for Personalization of eLearning SystemsUsing Ontology in Electronic Evaluation for Personalization of eLearning Systems
Using Ontology in Electronic Evaluation for Personalization of eLearning Systems
 
A Systematic Literature Review Of Student Performance Prediction Using Machi...
A Systematic Literature Review Of Student  Performance Prediction Using Machi...A Systematic Literature Review Of Student  Performance Prediction Using Machi...
A Systematic Literature Review Of Student Performance Prediction Using Machi...
 
Education 11-00552
Education 11-00552Education 11-00552
Education 11-00552
 
Learning and Educational Analytics
Learning and Educational AnalyticsLearning and Educational Analytics
Learning and Educational Analytics
 
Learning Analytics for MOOCs: EMMA case
Learning Analytics for MOOCs: EMMA caseLearning Analytics for MOOCs: EMMA case
Learning Analytics for MOOCs: EMMA case
 
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...
Increasing Learners' Retention and Persistence in MOOCs: Design-Based Researc...
 
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...Assignments As Influential Factor To Improve The Prediction Of Student Perfor...
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...
 
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...
VII Jornadas eMadrid "Education in exponential times". Mesa redonda eMadrid L...
 
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
 
Utilising learning styles
Utilising learning stylesUtilising learning styles
Utilising learning styles
 
Meta-review of recognition of learning in LMS and MOOCs - Ruth Cobos
Meta-review of recognition of learning in LMS and MOOCs - Ruth CobosMeta-review of recognition of learning in LMS and MOOCs - Ruth Cobos
Meta-review of recognition of learning in LMS and MOOCs - Ruth Cobos
 
Macfadyen usc tlt keynote 2015.pptx
Macfadyen usc tlt keynote 2015.pptxMacfadyen usc tlt keynote 2015.pptx
Macfadyen usc tlt keynote 2015.pptx
 
Predicting student performance using aggregated data sources
Predicting student performance using aggregated data sourcesPredicting student performance using aggregated data sources
Predicting student performance using aggregated data sources
 
Learning Analytics
Learning AnalyticsLearning Analytics
Learning Analytics
 
Mecar2010
Mecar2010Mecar2010
Mecar2010
 
Expert Perceptions of the Feasibility of MOOCs
Expert Perceptions of the Feasibility of MOOCsExpert Perceptions of the Feasibility of MOOCs
Expert Perceptions of the Feasibility of MOOCs
 
Dissertation Defense Powerpoint presented Aug. 8th, 2015
Dissertation Defense Powerpoint presented Aug. 8th, 2015Dissertation Defense Powerpoint presented Aug. 8th, 2015
Dissertation Defense Powerpoint presented Aug. 8th, 2015
 
Dissertation Defense presented August 8th 2015 Dr. Koffajuah Toeque Ed.D
Dissertation Defense  presented August 8th 2015 Dr. Koffajuah Toeque Ed.DDissertation Defense  presented August 8th 2015 Dr. Koffajuah Toeque Ed.D
Dissertation Defense presented August 8th 2015 Dr. Koffajuah Toeque Ed.D
 
DATA ANALYTICS FOR HIGHER EDUCATION
 DATA ANALYTICS FOR HIGHER EDUCATION DATA ANALYTICS FOR HIGHER EDUCATION
DATA ANALYTICS FOR HIGHER EDUCATION
 
Poster: Very Open Data Project
Poster: Very Open Data ProjectPoster: Very Open Data Project
Poster: Very Open Data Project
 

Recently uploaded

Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 

Recently uploaded (20)

Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 

MOOC Dropout Prediction Using ML

  • 1. MOOC Dropout Prediction Using Machine Learning Techniques: Review and Research Challenges Fisnik Dalipi Linnaeus University, Sweden & University College of Southeast Norway Ali Shariq Imran, Zenun Kastrati Norwegian University of Science and Technology
  • 2. This is a presentation of the paper: “MOOC Dropout Prediction Using Machine Learning Techniques: Review and Research Challenges”, which was accepted and presented at the EDUCON’2018 – IEEE Global Engineering Education Conference The full paper can be downloaded from the IEEEXplore website: https://ieeexplore.ieee.org/
  • 3. Introduction Massive Open Online Courses (MOOCs) are now the most recent topic within the field of e- learning Some researchers even predict MOOC will represent the fourth stage of online education evolution, after the third stage of LMS MOOCs are progressively becoming an integral part of a learning process in higher education settings Also proven that the MOOC approach when complemented with a flipped classroom pedagogy would also bring advantages toward enhancing the learning experience of students substantially But …
  • 5. Reasons behind the dropoutStudentrelatedMOOCrelated Lack of motivation Lack of time Insufficient background Course design Lack of interactivity Hidden costs
  • 6. State-of-the-art: Frequency of occurrences of ML for MOOC dropout prediction 0 1 2 3 4 5 6 7 8 9 10 Logistic Regression Deep Neural Network Support Vector Machine Hidden Markov Models Recurrent Neural Network Natural Language Processing Technique Decision Tree Survival Analysis Bayesian Network
  • 7. Challenges of dropout prediction using ML Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges
  • 8. Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges oFor unbiased classification results oLack of negative samples, i.e. HarvardX o E.g. Out of 641138 registered students only 17687 obtained certification oHowever, the effectiveness of the ML relies on the availability of enormous amount of both positive and negative samples oA significant difference in observable sample data affects othe generalization of the deep neural network model during the training step, but also ocomprises the classification accuracy. Challenges of dropout prediction using ML
  • 9. oMOOC comprises of big masses of unstructured data missing data occurs and is hard to validate oA multitude of ML techniques can’t be applicable oE.g. Hidden Markov Models, since oit relies on a finite data set, and ocannot be applied if observations are missing oResearchers reaction: replacing missing observations with mean values. oNot a realistic choice for many of the observed features Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges Challenges of dropout prediction using ML
  • 10. oIn MOOC's self-paced learning pedagogy, students have the freedom to decide what, when, and how to study. oThis might lead to considerable data variance, which may produce less accurate and reliable ML models oParticularly for SVM and Naïve Bayes, since otheir performance can quickly turn poor when dealing with imbalanced classes oOverall, the high data imbalance may result in biasing of the classifier towards the majority of the class. Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges Challenges of dropout prediction using ML
  • 11. oMOOC platforms are often reluctant to publish the data due to confidentiality and privacy concerns oThe de-identified information which is made public is also restricted to system generated events (clickstream data) only, while most of the user- provided data is omitted. oThe lack of user-provided data can be a hindrance in successfully estimating the reasons behind the dropouts. Challenges of dropout prediction using ML Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges
  • 12. oThe user data MOOC platforms gather and the clickstream data they generate doesn’t necessarily conform to a standard format, due to lack of available standards - such as IEEE LOM for learning objects oThe lack of a standard for creating and representing clickstream data for MOOC implies that oa network trained and validated for one MOOC on a given platform might not be interoperable across different MOOC on another platform, or even on the same platform if a MOOC gathers and collects data differently oLack of standards gives rise to another challenge, i.e., how to create compelling ML models and predictive solutions that can be generalized across different MOOCs? Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges Challenges of dropout prediction using ML
  • 13. oStudents want to construct timetable based on their preferences oSatisfying such preferences of students to be accommodated on different timetables is a complicated problem which can be solved using o Artificial intelligence (AI) optimization techniques. o These AI techniques are specialized to be used for scheduling and thus students’ time constraints can be solved using these techniques. Lack of enough sample data Managing big masses of unstructured data Data variance High data imbalance Availability of publicly accesible dataset Lack of standard for creating and representing clickstream data Student schedule related challenges Challenges of dropout prediction using ML
  • 14. Towards useful and effective predictive solutions Useful & Effective solutions Clickstream data standards Student provided data Feature engineering techniques Engagement system Evaluating models and predictors Improving student social engagement Unification/standardization of a clickstream data Predicting models can benefit from the data provided by the students that don't reveal the identity of the user To analyze posts, feedback, and comments shared by the candidate on the discussion blogs and social forums of the MOOC Encouraging social interactions and simulating peer-evaluation activities could also potentially represent an efficient way to mitigate the dropout To transfer and evaluate models from one MOOC to another, and to perform real-time prediction with ongoing courses. To allow instructors track and monitor students activity across different modules of the course
  • 15. Conclusions This paper provides a comprehensive review of the most recent and relevant research endeavors on machine learning application toward predicting, explaining and solving the problem of student dropout in MOOCs It also identifies some of the critical challenges associated with student dropout prediction and provides recommendations and proposal to assist researchers employing various machine learning techniques in solving it timely and efficiently. Additionally, this paper floats the idea of unification of a clickstream data and student-provided data as a standard, similar to various learning objects metadata standards. Future work will follow on data collection for the development and analysis of deep learning algorithms towards solving the MOOC student dropout.