Learning analytics: progress and solutions - Niall Sclater and Michael Webb, both Jisc
Reading analytics - Clifford Lynch, CNI
Sharing data safely and its re-use for analytics – David Fergusson, Francis Crick
Jisc and CNI conference, 6 July 2016
5. “learning analytics is the measurement,
collection, analysis and reporting of data
about learners and their contexts, for
purposes of understanding and optimising
learning and the environments in which it
occurs”
SoLAR – Society for Learning Analytics Research
06/07/2016 Learning analytics: progress & solutions 5
6. Early alert and student success
» Problems identified in 2nd week of semester
» Interventions include:
› Posting signal on student’s home page
› Emailing or texting them
› Arranging a meeting
» Courses that deploy Signals see consistently better grades
» Students on Signals sought help earlier and more frequently
9. Curriculum design
» A key piece of learning content is not being accessed by most
students
» Some students are not participating well in collaborative work
» A particular minority group is underperforming in an aspect of
the curriculum
» Students across several discussion groups are making only
minimal contributions to their forums
10. » Total hits is strongest predictor of success
» Assessment activity hits is second
» Metrics relating to current effort (esp. VLE
usage) are much better predictors of
success than historical or demographic
data
(John Whitmer)
California State University - Chico
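One way to compare candidate predictors like those on slide 10 is to rank them by the strength of their correlation with the final grade. The sketch below is illustrative only: the five-student data set and the names `total_hits`, `assessment_hits` and `prior_gpa` are made up, and it does not reproduce the analysis carried out at California State University - Chico.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy values for five hypothetical students (NOT real data).
final_grade = [40, 50, 60, 70, 80]
predictors = {
    "total_hits": [100, 200, 300, 400, 500],   # current effort: all VLE hits
    "assessment_hits": [10, 30, 20, 40, 50],   # current effort: assessment activity
    "prior_gpa": [3.0, 2.0, 3.5, 2.5, 3.0],    # historical data
}

# Rank predictors from strongest to weakest absolute correlation with the grade.
ranking = sorted(
    predictors,
    key=lambda name: abs(pearson(predictors[name], final_grade)),
    reverse=True,
)
print(ranking)
```

A real study would use many more students and a multivariate model; simple pairwise correlation is used here only because it is easy to read.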
11. “a student with average intelligence who
works hard is just as likely to get a good grade
as a student that has above-average
intelligence but does not exert any effort”
(Pistilli & Arnold, 2010)
12. » Predictive early alert model transferred to different institutions
» Around 75% of at-risk students were identified
» Most significant predictors were:
› Marks on course so far
› GPA
› Current academic standing
(Jayaprakash et al.)
Marist College, New York
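The "around 75% of at-risk students were identified" figure on slide 12 reads as a recall (sensitivity) measure: of the students who really were at risk, what share did the model flag? A minimal sketch, assuming that interpretation and using made-up counts:

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of genuinely at-risk students that the model flagged."""
    total_at_risk = true_positives + false_negatives
    if total_at_risk == 0:
        raise ValueError("no at-risk students in the sample")
    return true_positives / total_at_risk


# Hypothetical cohort: 200 students were truly at risk; the model flagged 150.
flagged, missed = 150, 50
print(f"recall = {recall(flagged, missed):.0%}")
```

Recall says nothing about false alarms, so an evaluation would normally report it alongside the false-positive rate or precision.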
13. Retention in England
» 178,100 students aged 16-18 failed to finish post-secondary school qualifications
they started in the 2012/13 academic year
› costing £814 million a year - 12 per cent of all government spending on post-16
education and skills (Centre for Economic and Social Inclusion)
» 8% of undergraduates drop out in their first year of study
› This costs universities around £33,000 per student
» Students with 340 UCAS points or above were considerably less likely (4%) than
those with fewer UCAS points (9%) to leave their courses without their award
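The retention figures on slide 13 imply some per-student numbers that are easy to derive. The sketch below uses only the figures quoted on the slide, except for `cohort_size`, which is a hypothetical institution size chosen for illustration.

```python
# Figures quoted on the slide.
non_completers = 178_100          # 16-18 year olds not finishing, 2012/13
annual_cost_gbp = 814_000_000     # cost per year
share_of_post16_budget = 0.12     # 12 per cent of post-16 spending

cost_per_non_completer = annual_cost_gbp / non_completers
implied_post16_budget = annual_cost_gbp / share_of_post16_budget

print(f"cost per non-completing student: £{cost_per_non_completer:,.0f}")
print(f"implied post-16 budget: £{implied_post16_budget / 1e9:.2f}bn")


def first_year_dropout_cost(cohort_size: int,
                            dropout_rate: float = 0.08,
                            cost_per_dropout_gbp: float = 33_000) -> float:
    """Estimated cost to a university of first-year undergraduate drop-out."""
    return cohort_size * dropout_rate * cost_per_dropout_gbp


# cohort_size is an assumption, not a figure from the slides.
cohort_size = 5_000
print(f"first-year drop-out cost: £{first_year_dropout_cost(cohort_size):,.0f}")
```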
14. Attainment in England
» 70% of students reporting a parent with HE qualifications
achieved an upper degree, as against 64% of students
reporting no parent with HE qualifications
» Overall, 70% of White students and 52% of BME students
achieved an upper degree
18. Group | Name | Question | Main type | Importance | Responsibility
2 Consent | Adverse impact of opting out on individual | If a student is allowed to opt out of data collection and analysis, could this have a negative impact on their academic progress? | Ethical | 1 | Analytics Committee
7 Action | Conflict with study goals | What should a student do if the suggestions are in conflict with their study goals? | Ethical | 3 | Student
8 Adverse impact | Oversimplification | How can institutions avoid overly simplistic metrics and decision making which ignore personal circumstances? | Ethical | 1 | Educational researcher
86 issues in 9 groups
Available from Effective learning analytics blog: analytics.jiscinvolve.org
19. Group | Name | Question | Main type | Importance | Responsibility
2 Consent | Adverse impact of opting out on individual | If a student is allowed to opt out of data collection and analysis, could this have a negative impact on their academic progress? | Ethical | 1 | Analytics Committee
7 Action | Conflict with study goals | What should a student do if the suggestions are in conflict with their study goals? | Ethical | 3 | Student
8 Adverse impact | Oversimplification | How can institutions avoid overly simplistic metrics and decision making which ignore personal circumstances? | Ethical | 1 | Educational researcher
86 issues
jisc.ac.uk/guides/code-of-practice-for-learning-analytics
21. ECAR Analytics Maturity Index for Higher Education
Discovery Phase
22. Implementation process
1. Workshop
2. Self-assessment
3. Institutional readiness
4. Signed up for service
5. Implementation support
» 2016-17
23. Discovery readiness questionnaire
• Culture and Vision
• Strategy and Investment
• Structure and governance
• Technology and data
• Skills
24. Guidelines / checklist
Culture and organisation setup
• Decide on institutional aims for learning analytics
• Senior management approval obtained and a project lead nominated
• Undertake the readiness assessment
• Decision on learning analytics products to pilot
• Legal and ethical considerations in hand
• Address readiness recommendations
• Data processing agreement signed
• Select student groups for the pilot and engage staff/students
Technical setup
• Learning records warehouse (LRW) set up
• Extract student data to UDD and upload to LRW
• Historical data extracted from the VLE and SRS and uploaded to the LRW
• VLE plugin installed and live data being uploaded
• View in data explorer to check validity
• Contact Jisc to start implementation
25. ECAR Analytics Maturity Index for Higher Education
Architecture
29. Service: Dashboards
Visual tools to allow lecturers, module
leaders, senior staff and support staff
to view:
» student engagement
» cohort comparisons
» etc…
Based on either commercial tools from
Tribal (Student Insight) or open source
tools from Unicon/Marist (OpenDashboard)
30. Service: Alert and intervention system
Tools to allow management of
interactions with students once risk
has been identified:
» case management
» intervention management
» data fed back into model
» etc…
Based on open source tools from
Unicon/Marist (Student Success Plan)
45. Challenges for “big data” science in the UK
Distributed Data Sets
Distributed computing resources
Separate authentication/authorization mechanisms
Researchers want to combine and synthesise data
How do we do this?
46. Example
Dr David Fergusson,
Head of Scientific Computing,
Francis Crick Institute
Challenges of providing shared platforms
for staff from existing institutes
– CRUK London Research Institute
– National Institute for Medical Research
Compute and data requirements for 1,250 scientists
working in biomed
– In a central London building
Direction of travel towards more and wider
collaboration, requirement for controlled sharing of
sensitive data
Photo credit: Francis Crick Institute
47. Example
Dr Jeremy Yates, STFC DiRAC & SKA:
› The National e-Infrastructure for research & innovation
– A 60,000 foot view
– Democratisation & Aggiornamento
› Moving to a more cloud-centric view of
scientific computing
› Scientific computing that is not just “HPC”
› Changing the culture around Research
Software Engineering
› Making industrial access to facilities the norm
› Inter-disciplinary science – blockers and enablers
Image credit: Courtesy of EPSRC
48. Addressing the problem
SafeShare – shared secure authorisation/authentication
Shared Data Centre(s) – avoid costly/insecure moving of data
eMedlab – collaborative science/shared operations model
51. What has worked?
Consolidation through collaboration
Swansea: One system supporting Farr Wales, ADRC Wales, MRC CLIMB,
Dementia Platform UK
Scotland: EPCC supporting Farr Scotland and ADRC Scotland, leveraging
expertise from Archer, UK-RDF
Leeds: ARC supporting Farr HeRC, Leeds Med Bio, Consumer Data RC
Slough DC: eMedLab, Imperial Med Bio, KCL bio cluster
Jisc network: Safe Share
54. About Jisc » Assent
Assent:
Single, unifying technology that enables
you to effectively manage and control
access to a wide range of web and non-
web services and applications.
These include cloud infrastructures,
High Performance Computing, Grid
Computing and commonly deployed
services such as email, file store,
remote access and instant messaging
55. About Jisc » Safe Share
Safe Share:
Providing and building services on
encrypted VPN infrastructure between
organisations
Enhanced confidentiality and integrity
requirements per ISO27001
Requirement to move electronic health
data securely and support research
collaboration
Working with biomedical researchers at
Farr Institute, MRC Medical Bioinformatics
initiative, ESRC Administrative Data
Centres
56. The safe share project
• What: a pilot project enabling the secure exchange of data collected by
Government and the NHS, using an encrypted overlay over the Janet
network to facilitate appropriate analysis between project sites,
and reusing existing services to strengthen authentication for researchers
• Why: easier, secure access to research data to further knowledge of diseases
and ill health to improve medical treatments in the long-term
• When: running from November 2014 – March 2017
57. The safe share project
Background
• Substantial investment in medical and administrative data research to
generate benefits to society from the appropriate analysis of data collected
by Government and the NHS
• E.g. to further knowledge of disease and ill health to improve medical
treatments
Challenges
• Health data, and other routinely collected data on people’s lives, are very
personal and sensitive
• Significant numbers of ethical, consensual and practical hurdles to making
appropriate use of the sensitive data for research
58. The safe share project
Drivers
• Requirement for connectivity to move and access electronic health data
securely
• Challenge to give public confidence that data is appropriately protected
• Provide economies of scale in secure connectivity
The safe share project
• Jisc management and funding of £960k to pilot potential solutions with the
aim of developing a service in 2016/17
59. Partners
The Farr Institute
The MRC Medical Bioinformatics initiative
The Administrative Data Research Network
University of Bristol
Cardiff University
University of Edinburgh
Francis Crick Institute
University of Leeds
UCL
University of Manchester
University of Oxford
University of St Andrews
University of Southampton
Swansea University
60. The safe share project
Authentication, Authorisation and Accounting Infrastructure (AAAI)
Use Cases:
• HeRC, N8 HPC – access between facilities using home institution
credentials
• eMedLab – partners will be able to use a common AAAI to access this new
system (for analysis of for instance human genome data, medical images,
clinical, psychological and social data)
• Swansea University Health Informatics Group – investigating Moonshot as an
authentication mechanism to allow use of home institution credentials
• University of Oxford: to enable researchers to use home institution
credentials for authentication to request access to datasets for studies e.g.
61. The safe share project
Example “service slice”: Farr
[Diagram labels: Institution LAN; safe share router at edge; Janet, internet or other network; safe share core; Farr trusted environments]
64. Shared data centre
£900K investment from HEFCE
Anchor tenants:
– Francis Crick Institute
– King’s College London
– London School of Economics
– Queen Mary University of London
– Wellcome Trust Sanger Institute
– University College London
65. Potential cost-saving/resource benefits
Jisc Shared Datacentre is already a cost saving
eMedLab award, and need for quick spend, gave impetus to UCL, KCL,
QMUL, Sanger, LSE and Crick to identify off-site datacentre hosting (Slough)
– Anchor tenants get price reduction based on volume of space used
Procurement led by Jisc
Datacentre connected to Janet network (Jisc investment)
Improved PUE (power usage effectiveness): Slough 1.25, cf. ~2 for an HEI datacentre (UCL saves ~£2M p.a.)
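PUE is total facility energy divided by the energy delivered to IT equipment, so for the same IT load the facility energy scales directly with PUE. A minimal sketch of what the move from ~2 to 1.25 means in energy terms; the function name is hypothetical, and the slide's ~£2M figure depends on load and tariff data not given here:

```python
def energy_saving_fraction(pue_old: float, pue_new: float) -> float:
    """Fraction of total facility energy saved for an unchanged IT load."""
    if pue_old < 1 or pue_new < 1:
        raise ValueError("PUE cannot be below 1")
    return 1 - pue_new / pue_old


typical_hei = 2.0   # approximate on-campus figure quoted on the slide
slough = 1.25       # shared datacentre figure quoted on the slide
print(f"energy saved: {energy_saving_fraction(typical_hei, slough):.1%}")
```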
68. Objectives - Flexibility
• To help generate new insights and clinical outcomes by
combining data from diverse sources and disciplines
• Bring computing workloads to the data, minimising the
need for costly data movements
• To allow customised use of resources
• To enable innovative ways of working collaboratively
• To allow a distributed support model
70. Support team
eMedLab academy
• Training via CDFs and courses
• Promote collaborations via “Labs”
eMedLab infrastructure
• Shared computer cluster
• Integrate and exchange heterogeneous data
• Methods and insights across diseases
71. eMedLab is a hub
6+1 partners
3 data types: electronic health records, genomic, images
3 areas of expertise: clinician scientists, analytics, basic science
3 disease areas: rare, cancer, cardio
>6M patients
73. Distributed/Federated support
(What has worked/savings ..)
eMedLab Ops team (shared team)
Knowledge sharing/transfer (inc. developing UK industrial capacity –
[Diagram: six boxes, each labelled “Support”]
74. Many projects, same challenges
Information governance
Secure data transfer
User management
AAAI
Working with Janet to explore how to support most/all projects
75. Cultural barriers and challenges
Finance – government funding with spend window of 1 year only
+ Mitigated by use of efficient procurement teams and framework agreements
+ Working closely with vendors to ensure tight time targets met
- Drain on (unfunded) project management and finance team resources
Regulatory challenge
+ Mitigated by clear policies, governance, supported by training
+ Changing EU data protection legislation
- Risk of bad PR and/or data leaks
People
+ Everyone is open, collaborative, generous with time and knowledge
76. eMedLab production service
Projects
• UCL & WTSI - Enabling Collaborative Medical Genomics Analysis Using Arvados – Javier Herrero
• Crick KCL UCL - A scalable and flexible collaborative eMedLab cancer genomics cluster to share
large-scale datasets and computational resources – Peter van Loo
• UCL QMUL Farr - Creating and exploiting research datamart using i2b2 and novel data-driven
methods - Spiros Denaxas
• LSHTM & QMUL - An evaluation of a genomic analysis tools VM on eMedLab, applied to
infectious disease projects at the LSHTM using data from EBI and Sanger & Genetic Analysis of UK
Biobank Data - Taane Clark & Helen Warren
• UCL & ICH - The HIGH-5 Programme - High definition, in-depth phenotyping at GOSH, plus related
projects - Phil Beales & Hywel Williams & Chela James
77. eMedLab enables projects
eMedLab brings data and expertise
together across diseases
(potential)
• Mechanisms of cancer diversity and genome instability
• Better understanding of biomarkers
• DARWIN Clinical Trial to target clonal drivers
Cancer evolution and heterogeneity (Swanton & Van Loo)
• Cancers evolve heterogeneously
• Diverse driver mutations and instability mechanisms
• TracerX: Track lung cancer evolution
• Data: genomes, MRI, molecular pathology
• Who: clinicians, statisticians, evolutionary biologists
78. People
Alan Real, Bob Day, Bruno Silva, Clare Gryce, David Fergusson, Emily
Jefferson, Jacky Pallas, Jeremy Sharp, John Ainsworth, John Chapman,
Jonathan Monk, Mark Parsons, Ric Passey, Richard Christie, Rhys Smith,
Simon Thompson, Simon Thompson, Spiros Denaxas, Stephen
Newhouse, Steve Pavis, Tanvi Desai, Tim Cutts and others …
80. Data sharing and analytics in research
and learning
Chair: Phil Richards, Jisc
14/07/2016
The majority of the projects involve consortia of universities and research institutes. Given the lack of opex, we have had to consolidate and build on existing capacity. Everyone has done this, and done it well.
“Anchor tenants” for the trusted club of research centres for using sensitive data in a secure way across the UK.
Demonstrating the commitment to work as part of a virtual organisation such as the Farr Institute or ADRN
Creating and influencing e-infrastructure standard approaches that funders and researchers understand and that have external verification.
Improved potential for economies of scale in the e-infrastructure for research and re-usability between different projects
Opportunity for visibility as thought leaders and champions for e-infrastructures for research.
Benefits
Reduction in duplication of effort as a solution is needed by everyone
Avoidance of potential competing incompatible solutions in different centres
Support for RCUK and Government strategies for research with sensitive data
Co-ordinated partnership that can help support UK research into disease and public health
Improved knowledge and a scalable solution providing benefits for other members of the community
We are already seeing cost-savings as a result of working together.
The last point is the important one – this would never have worked without the tech community coming together in such a positive way