Education's Clarion Call: Strata, Santa Clara, 2013
1. EduDataScience
Teaming to Improve US Education with Big Data Science
Marie Bienkowski
SRI International
marie.bienkowski@sri.com
February 27, 2013
O’Reilly Strata, Santa Clara, CA
2. • Hour-long classes, “seat time”
requirements
• Students grouped by age
• Lecture-based teaching
• Paper textbook as primary learning
resource; No cell phones in class
• Small, delayed and disconnected
data: some testing feedback, reports
(midterm, final), attendance, free
lunch eligible
4. Deeply Digital Learning
• Flipped classroom w/online
practice and homework via
adaptive tutors
• More engaging and inspiring 24/7
learning: games, projects, badges
for competencies
• Learners collaborate by ability,
interest
• Digital media/platforms for open
or personalized learning
• Data ecosystems including the
Internet of Learning Things
5. K-12 In-School Time is 1 Million Minutes
http://life-slc.org
Many orders of magnitude more learning data with digital learning:
big data will be available – 55M K-12 students; 77M total in K-college
6. Analytics and Data Mining
• Continuously improving courses, curricula, and apps
• Continuous and stealth testing
• Personalized, adaptive learning
pathways, including recommended
online learning resources
• Support students to succeed with right
challenge, right encouragement, and
right engagement
• Interactive data visualization systems
(aka “dashboard”) for learners,
teachers, leaders
www.knewton.com
8. Paradigms of Scientific Discovery
• Empirical – started thousands of years ago
• Theoretical – last few hundred years
• Computational – last 30 – 40 years
• Data Exploration (eScience)
John Stamper, DataShop
9. EduDataScience is about Discovery
• Automated assessment of student skill, mastery learning, efficient and effective learning (Corbett, 2001)
– You can test students or watch them as they learn to see what they know
• By discovering knowledge models automatically using data mining, student time can be used more effectively (Cen et al 2008, Stamper et al AIED 2011)
– If you know what students need, you can give it to them and they will learn better
10. EduDataScience is about Discovery
• Conducting research on disengaged behaviors (McQuiggan, Rowe, Lee, & Lester 2008; Rowe, McQuiggan, Robison, & Lester 2009) led to tightened and improved narrative, leading to positive learning outcomes (Rowe, Shores, Mott, & Lester 2011)
– Students can learn from games that have stories
11. EduDataScience is about Discovery
• By automatically detecting when students “game the system” (cf. Baker et al., 2004; Walonoski & Heffernan, 2006a; Johns & Woolf, 2006), it was possible to build automated interventions that reduce gaming and improve learning (Baker et al., 2006; Walonoski & Heffernan, 2006b; Arroyo et al., 2007)
– You can tell when students cheat and make them stop
Examples courtesy Ryan S.J.d. Baker
12. Panelists
• Zachary Pardos – MIT
• Jace Kohlmeier – Khan Academy
• Sharren Bates – inBloom
Learn more! Zach and I will hold Office Hours tomorrow at 10am
13. The Data Zeitgeist in Education
And the Discoveries We Need to Succeed
Zachary A. Pardos, Ph.D.
pardos@mit.edu
14. The Data Zeitgeist in Education
• Impetus to use technology and data to reform education
• Growth of computer tutoring systems
• The same classroom paradigm has existed for centuries
• Data has been used in almost all other industries to optimize outcomes
• Bioinformatics
• Financial analysis
• Statistical methods in particle physics
• Why not education?
UMAP 2011
Zach Pardos Strata 2013 - Santa Clara, CA February 27th, 2013
15. The Data Zeitgeist in Education
• Impetus to use technology and data to reform education
• Growth of educational-technology systems
• Major increases in funding
16. The Data Zeitgeist in Education
• Using technology and data to reform education
• Growth of educational-technology systems
• Major increases in funding
• Produces the Cognitive Tutor
– Used by over 600,000 students per year
– Recently acquired by the Apollo Group for $75m
– Apollo Group owns University of Phoenix
• Largest online university (500k students)
17. The Data Zeitgeist in Education
• Using technology and data to reform education
• Growth of educational-technology systems
• Major increases in funding
• Has tripled its daily student usage every year
• Was in the running for part of a $4.35b federal initiative to reform education in MA
18. The Data Zeitgeist in Education
• Using technology and data to reform education
• Growth of educational-technology systems
• Major increases in funding
• National standardized test being deployed in the 2014-2015 school year
– Two versions of the test
– One will be computer adaptive
– Tens of millions of students’ data per year
– Districts, States will be seeking big data solutions
19. The Data Zeitgeist in Education
• Using technology and data to reform education
• Growth of educational-technology systems
• Major increases in funding
• Started with Stanford AI course
• Nearly 3m registrants since 2011
• 100s of college courses (growing)
20. • Joint venture between MIT and
Harvard to build a platform to host
massive open-access online college
courses (MOOC)
• Additional Universities joining
steadily
• High enrollments (30k-154k)
21. The data
Student participation
• 154,000 enrolled
• 108,000 entered class
• 7,000 received certificate
[Screenshot: course interface]
Course components
• 434 lecture videos
• 37 homework problems
• 105 lecture problems
• 1009 book pages
• 14 labs
• 145 tutorial videos
• 2 exams
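The participation counts above imply a steep funnel: roughly 70% of enrollees entered the class and about 4.5% earned a certificate. A minimal sketch of that arithmetic, using only the three counts from the slide (the variable and function names are illustrative, not from the talk):

```python
# Counts taken from the slide; names are illustrative.
enrolled = 154_000
entered = 108_000
certified = 7_000


def rate(part, whole):
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


entry_rate = rate(entered, enrolled)                  # entered class / enrolled
certificate_rate = rate(certified, enrolled)          # certificates / enrolled
certificate_rate_of_entrants = rate(certified, entered)  # certificates / entered

print(f"entered / enrolled:     {entry_rate:.1%}")
print(f"certified / enrolled:   {certificate_rate:.1%}")
print(f"certified / entered:    {certificate_rate_of_entrants:.1%}")
```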
22. The data
What resources are working?
- post-tests are too far apart
- prediction of performance alone not adequate
- in need of a model of learning
The Approach
- adapt a Bayesian model of learning
- hypothesize that resources influence learning
- see if hypothesis generalizes to new students
Knowledge Tracing
Model Parameters
P(L0) = Probability of initial knowledge
P(T) = Probability of learning
P(G) = Probability of guess
P(S) = Probability of slip
Nodes representation
K = knowledge node
Q = question node
Node states
K = two state (0 or 1)
Q = two state (0 or 1)
[Diagram: chain of knowledge nodes K, K, K starting from P(L0) and linked by P(T), annotated with resource types {video}, {book}, {answer}; each K connects to a question node Q through P(G) and P(S); example responses 0, 1, 1]
(Pardos et al, Educational Data Mining, 2013 (under review))
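The four knowledge tracing parameters on this slide can be turned into a small update loop. The following is a minimal sketch of standard Bayesian knowledge tracing with one assumption layered on top: P(T) is looked up per resource type, which is one way to express the hypothesis that resources influence learning. The parameter values, the `KTParams` class, and the event format are hypothetical and are not taken from the paper under review.

```python
from dataclasses import dataclass


@dataclass
class KTParams:
    p_l0: float  # P(L0): probability of initial knowledge
    p_t: dict    # P(T) per resource type (assumed, made-up values)
    p_g: float   # P(G): probability of guess
    p_s: float   # P(S): probability of slip


def p_correct(p_know, params):
    """P(Q = 1) given the current belief P(K = 1)."""
    return p_know * (1.0 - params.p_s) + (1.0 - p_know) * params.p_g


def observe(p_know, correct, params):
    """Bayes update of P(K = 1) after seeing one answer (Q = 1 or Q = 0)."""
    if correct:
        num = p_know * (1.0 - params.p_s)
        den = num + (1.0 - p_know) * params.p_g
    else:
        num = p_know * params.p_s
        den = num + (1.0 - p_know) * (1.0 - params.p_g)
    return num / den


def learn(p_know, resource, params):
    """Transition step: chance of moving from K = 0 to K = 1 via a resource."""
    return p_know + (1.0 - p_know) * params.p_t[resource]


def trace(events, params):
    """events: list of (resource, correct) pairs; correct is None if unanswered."""
    p_know = params.p_l0
    history = []
    for resource, correct in events:
        if correct is not None:
            p_know = observe(p_know, correct, params)
        p_know = learn(p_know, resource, params)
        history.append(p_know)
    return history


# Made-up parameters and a response sequence 0, 1, 1 after watching a video.
params = KTParams(p_l0=0.2, p_t={"video": 0.10, "book": 0.05, "answer": 0.15},
                  p_g=0.25, p_s=0.10)
history = trace([("video", None), ("answer", False),
                 ("answer", True), ("answer", True)], params)
```

Comparing fitted P(T) values across resource types, and checking predictions of `p_correct` on students held out from fitting, is one way to read the "generalizes to new students" step.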
23. Other factors in learning
• Summarizing student affect over two
school years by analyzing tutor log
data
• Correlated to State Test Outcome
• Positive correlation: Frustration,
Concentration, Confusion (while
receiving tutor help)
Pardos, Baker et al. (Learning Analytics & Knowledge, 2013)
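As a rough illustration of the analysis described above, the sketch below summarizes each student's share of logged observations in one affect state and correlates that share with a test score using Pearson's r. The log format, student IDs, labels, and scores are all made up for the example; the actual study's detectors and data are not reproduced here.

```python
from collections import Counter
from math import sqrt


def affect_proportions(log, state):
    """Fraction of each student's logged observations labeled `state`."""
    totals = Counter()
    hits = Counter()
    for student, label in log:
        totals[student] += 1
        if label == state:
            hits[student] += 1
    return {s: hits[s] / totals[s] for s in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up tutor log entries (student, affect label) and made-up test scores.
log = [("s1", "concentration"), ("s1", "other"),
       ("s2", "concentration"), ("s2", "concentration"),
       ("s3", "other"), ("s3", "other")]
scores = {"s1": 230, "s2": 240, "s3": 210}

props = affect_proportions(log, "concentration")
students = sorted(props)
r = pearson([props[s] for s in students], [scores[s] for s in students])
```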
24. Exploring interaction of other factors
• Can non-cognitive contextual information about the student help explain efficacy?
• In order to investigate many factors, we need to be looking beyond a single course of data.
• Live analysis of efficacy trends
Knowledge Tracing
Model Parameters
P(L0) = Probability of initial knowledge
P(T) = Probability of learning
P(G) = Probability of guess
P(S) = Probability of slip
Nodes representation
K = knowledge node
Q = question node
Node states
K = two state (0 or 1)
Q = two state (0 or 1)
[Diagram: chain of knowledge nodes K, K, K starting from P(L0) and linked by P(T), annotated with resource types {video}, {book} and student context {confused}, {confused}; each K connects to a question node Q through P(G) and P(S); example responses 0, 1, 1]
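One way to read the {confused} annotations is that the learning probability P(T) depends on both the resource and the student's context. A minimal sketch under that assumption follows; the table keys, the `not_confused` label, and every rate are placeholders invented for illustration, not estimates from any data.

```python
# Hypothetical learning rates P(T), keyed by (resource, context).
P_T = {
    ("video", "confused"): 0.05,
    ("video", "not_confused"): 0.12,
    ("book", "confused"): 0.10,
    ("book", "not_confused"): 0.08,
}


def learn(p_know, resource, context, p_t=P_T):
    """Transition step of knowledge tracing with a context-dependent P(T)."""
    rate = p_t[(resource, context)]
    return p_know + (1.0 - p_know) * rate


# Belief in knowledge after a confused student watches a video.
p_after = learn(0.5, "video", "confused")
```

Keying the table by a tuple keeps the rest of the knowledge tracing update unchanged, so more context factors can be added without touching the guess and slip logic.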
25. What We Need
• Increased capability in analyzing continuous streams of big data
• Operationalizing learner analytics
• Problem solvers who want to make an impact
Join us!
Zach Pardos
pardos@mit.edu
27. Big problems…
• >1,000,000,000 school-aged children around the world
• 142,800,000 school-aged children not in school
• Only 85% of primary school students worldwide graduate from primary school
• 25% of US college freshmen need remedial classes, costing $3 billion annually
Statistics from UNESCO Institute for Statistics (UIS); National Center for Education Statistics; Complete College America
28. … big data
• >400 million lessons delivered
• 60 million users to date
• >1 billion problems answered
• >5 million unique users / month
• 216 countries
• 15,000 classrooms around the world
[Chart: Cumulative visits to Khan Academy (Millions)]
36. What Data? For what purpose?
• Big Data not yet working in K-12
• At the policy level, collecting information about student background and
achievement has become a once-a-year practice between schools, districts and
state education organizations
• This has given great insight into achievement gaps and underserved communities
and schools
• However, it’s difficult to connect those problems to data-driven solutions
• Individual companies and research institutions have advanced the field of learning
analytics but only with intense, expensive research efforts
37. Enabling Great Teaching and Learning with Data
• For teachers, differentiated or customized instruction is a common goal
• Teachers are expected to understand exactly what each of their students needs, then discover and successfully
deliver those educational experiences across a student population of up to 200 kids a day
• As tech professionals, we all can think of zillions of opportunities for data-driven tools to support these
instructional processes
– Dashboards and data analysis tools
– Recommendation engines
– Early warning systems
– Communication tools
– Dynamic scheduling
– Teacher development
– x 1zillion
• Big Data should be powering personalized learning at scale, helping teachers, students and families to
pursue the best possible learning opportunities for the best possible education and life outcomes
38. Current State and Complicating Factors
• While there are innovative products available, it is incredibly difficult for education agencies to successfully
implement them with a product portfolio approach
• State and school district customers don’t always know how to successfully map instructional processes to
requirements, set expectations for continuous improvement, select tools to successfully support process and
insist on future-friendly data and network infrastructure
• Why?
– $
– Capacity
– High-risk regulatory framework
– Highly structured budgets and contract requirements
– Expense of one-off data integrations
– Existing large-footprint software bundles that address multiple processes
– Legal requirements around evaluating teachers
– Complicated relationships between school districts and states
39. Meanwhile in Classrooms
• This leaves teachers in one of two bad scenarios:
– Limited set of tools where the district has not made an investment
– Large set of high-quality tools that do not interoperate – making it nearly impossible to use the tools
successfully
• It’s even worse for students:
– Students with access to tech at home experience a huge difference in how they use tools and strategies in
and outside of the classroom
– Students without access to tech at home miss out on whole new ways to experience the world
• Personalized Learning remains a theoretically good idea that can’t get to scale
– Missed opportunity of months of classroom time spent reviewing last year’s subject matter to figure out where
kids are
– Thriving kids unable to push farther than their classroom curriculum
– Struggling kids not making the progress they need to in order to succeed
40. inBloom and Big Data
• inBloom supports the K-12 community’s move towards great data-driven tools for classroom use built on an
interoperable data and content architecture
• We support states and districts who are taking a more process- and quality-based approach to launching
tech initiatives
• Our success is determined by the success of partners – software providers who launch great data-driven
tools
• If the learning applications and tools our students, teachers and families use together can get all the data
they need to be successful and report back their outcomes, K-12 can finally join the big data movement
• Big Data = personalization of education opportunities, continuous improvement of tools and strategies,
improved student outcomes
41. Find Out More
• inBloom Strata Booth
• inBloom.org
• sharren.bates@inbloom.org
• @sharrensharren
• SXSW EDU NEXT WEEK
Editor's Notes
Get inBloom, WPI, and Khan Academy logos. These folks will be talking next.