Open edu data wrangling

Working with #opendata
Tony Hirst
@psychemedia

DATA
USERS
Educators
Learners
Planners
Marketers
Policymakers
Researchers
Press
NGOs
“DEVELOPERS”

Access/obtain data
Make sense of data
Ask specific questions of data
Communicate in a data-centric way

Load data
Clean data
Merge/enrich data

A barrier to access
(for the tool user) is
data format

JSON
XML
CSV
XLS
TSV
.db
HTML
PDF
DOC
TXT
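
A minimal Python sketch of why format is a barrier: the same two records arrive as CSV, TSV and JSON, and each needs its own loader before a tool can use them. The sample data and the `load_records` helper are illustrative assumptions, not taken from the slides.

```python
import csv
import io
import json

# Hypothetical sample data: the same two records in three formats.
CSV_TEXT = "name,students\nOpen University,100\nExample College,50\n"
TSV_TEXT = "name\tstudents\nOpen University\t100\nExample College\t50\n"
JSON_TEXT = (
    '[{"name": "Open University", "students": "100"},'
    ' {"name": "Example College", "students": "50"}]'
)


def load_records(text, fmt):
    """Return a list of dicts, whatever the source format (hypothetical helper)."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "tsv":
        return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    if fmt == "json":
        return json.loads(text)
    raise ValueError(f"unsupported format: {fmt}")


records_csv = load_records(CSV_TEXT, "csv")
records_tsv = load_records(TSV_TEXT, "tsv")
records_json = load_records(JSON_TEXT, "json")
```

Once loaded, all three sources share one shape (a list of dicts), so later steps need not care about the original format. Formats such as PDF or DOC usually need extra extraction tools before this step is possible.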

=importHTML(URL, "table", N)
HTML
QUERYABLE
DATA
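
For readers outside a spreadsheet, the same idea can be sketched in Python: pull the N-th table out of a page and turn it into records that can be queried. This is a rough stand-in using only the standard library, not how importHTML itself works. `TableExtractor`, `import_html_table` and the sample HTML are assumptions made for illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, n):
    """Return the n-th table (1-based, as in importHTML) as a list of rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[n - 1]


# Hypothetical page with two tables; we want the second one.
SAMPLE_HTML = """
<table><tr><th>ignored</th></tr></table>
<table>
  <tr><th>name</th><th>students</th></tr>
  <tr><td>Open University</td><td>100</td></tr>
</table>
"""

rows = import_html_table(SAMPLE_HTML, 2)
header, body = rows[0], rows[1:]
records = [dict(zip(header, row)) for row in body]
```

With the header row zipped onto each data row, `records` can be filtered and sorted like any other list of dicts.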

=importHTML(URL, "table", N)
HTML
INTERACTIVE
DASHBOARD
Google Charts

A barrier to access
is data shape
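
One common shape problem, sketched in Python: a "wide" table with one column per year, where a tool expects one row per observation. The data and the `wide_to_long` helper are hypothetical, chosen only to show the reshaping step.

```python
# Hypothetical "wide" table: one row per institution, one column per year.
wide = [
    {"institution": "A", "2011": 10, "2012": 12},
    {"institution": "B", "2011": 7, "2012": 9},
]


def wide_to_long(rows, id_field, var_name, value_name):
    """Melt wide rows into one record per (id, variable) pair."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue  # the id column is carried along, not melted
            long_rows.append(
                {id_field: row[id_field], var_name: key, value_name: value}
            )
    return long_rows


tidy = wide_to_long(wide, "institution", "year", "value")
```

The content is unchanged; only the layout differs, and that is often what decides whether a tool will accept the data.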

A barrier to access
is data cleanliness

The Open University
Open University
OU
Open Uni
Open University, UK
NORMALISATION/RECONCILIATION

Reconciliation to
a canonical name
and/or to a
unique identifier
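
A minimal sketch of reconciliation in Python, using the name variants from the earlier slide. The lookup table is hand-built and the identifier "ORG-0001" is made up for illustration; in practice the table would come from a reference dataset or a reconciliation service.

```python
import re

# Hypothetical lookup: normalised variant -> (canonical name, identifier).
# The identifier "ORG-0001" is invented for this example.
OU = ("The Open University", "ORG-0001")
CANONICAL = {
    "the open university": OU,
    "open university": OU,
    "ou": OU,
    "open uni": OU,
    "open university uk": OU,
}


def normalise(name):
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(cleaned.split())


def reconcile(name):
    """Return (canonical name, identifier), or None if the variant is unknown."""
    return CANONICAL.get(normalise(name))


variants = [
    "The Open University",
    "Open University",
    "OU",
    "Open Uni",
    "Open University, UK",
]
reconciled = [reconcile(v) for v in variants]
```

Every variant resolves to the same pair, so later joins can use the identifier rather than the free-text name.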

A stumbling block
(for the data user)
is data enrichment

A stumbling block
(for the data user)
is joining datasets
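
Joining on a shared key might look like the following in Python. The two datasets, their `id` values and the `left_join` helper are hypothetical.

```python
# Hypothetical datasets sharing an "id" key.
institutions = [
    {"id": "ORG-0001", "name": "The Open University"},
    {"id": "ORG-0002", "name": "Example College"},
]
enrolments = [
    {"id": "ORG-0001", "students": 100},
    {"id": "ORG-0003", "students": 30},
]


def left_join(left, right, key):
    """Keep every left row; add right-hand fields where the key matches."""
    index = {row[key]: row for row in right}
    return [{**row, **index.get(row[key], {})} for row in left]


joined = left_join(institutions, enrolments, "id")
```

Rows with no match on the right are kept but gain no new fields, which makes the gaps in the second dataset easy to spot.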

A stumbling block
(for the data user)
is joining partially
matched data
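
When keys only partially match, one option is approximate string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library. The names, the `fuzzy_match` helper and the 0.8 cutoff are assumptions for illustration, and any real cutoff needs checking by eye against the data.

```python
import difflib

# Hypothetical name columns from two datasets that should line up.
left_names = ["Open University, UK", "Example Colege"]
right_names = ["Open University", "Example College", "Another Institute"]


def fuzzy_match(name, candidates, cutoff=0.8):
    """Return the closest candidate scoring at least `cutoff`, or None."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Map each left-hand name to its best right-hand match (or None).
matches = {name: fuzzy_match(name, right_names) for name in left_names}
```

Fuzzy matches are suggestions, not facts: a low cutoff pairs up names that are not the same thing, so unmatched and borderline rows are worth reviewing manually.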

Rolling your own
interactive data
exploration tools

Many chart tools
do the work for
you if the data is
in the right shape

blog.ouseful.info
@psychemedia
