3. In the beginning… We came up with a relational model from which we can build an EPG.
4. So, what is the problem? An EPG looks like a good fit for a relational database? An EPG, maybe. But we have potentially 1.5 million EPGs…
5. We also have a lot of available data sources…
zeebox analyses live TV
to understand the context
of what’s on air…
+ lots lots more…
6. With some smart implementation it all worked well, but it suffers from limitations, including how often we can update those EPGs. And we are/want to be much, much more than just an EPG.
7. One problem to solve… What channel? What time? Can Ian even see it?
8. Now it can be done but it’s ugly and it’s slow.
9.
10. We have all the data. Is there a smarter way to structure it?
http://www.fanpop.com/clubs/lt-commander-data/images/31158615/title/data-photo
11. Need to handle:
Structured and Semi-Structured
Densely Connected
High Read Rates
Relatively Low Write Rates
17. Dr Who. S1 EP1
[Diagram: (Episode "Dr Who. S1 EP1") -AIRED_ON-> (Broadcast), twice; (Broadcast) -AIRED_AT-> (Time); (Broadcast) -AIRED_BY-> (Channel) -AVAILABLE_ON-> (Provider)]
MATCH (b:Broadcast)<-[:AIRED_ON]-(ep:Episode)
      -[:AIRED_ON]->(d:Broadcast)
      -[:AIRED_BY]->(c:Channel)
      -[:AVAILABLE_ON]->(p:Provider),
      (d:Broadcast)-[:AIRED_AT]->(t:Time)
WHERE b.broadcast_id = {JIMs BROADCAST}
  AND p.provider_id = {IANs PROVIDER}
RETURN d,c,t;
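To make the shape of that query concrete, here is a minimal sketch in plain Python rather than Cypher or the Neo4j API. It walks the same relationship types (AIRED_ON, AIRED_BY, AVAILABLE_ON, AIRED_AT) over a tiny in-memory graph. The `Graph` class, the `other_airings` function and every id in the sample data are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory directed graph with typed relationships (illustrative only)."""

    def __init__(self):
        self.out = defaultdict(list)  # (node, rel_type) -> list of target nodes
        self.inc = defaultdict(list)  # (node, rel_type) -> list of source nodes

    def add(self, src, rel, dst):
        self.out[(src, rel)].append(dst)
        self.inc[(dst, rel)].append(src)


def other_airings(g, broadcast_id, provider_id):
    """Find other broadcasts of the same episode on channels the provider carries.

    Returns a list of (broadcast, channel, time) tuples, like RETURN d,c,t.
    """
    results = []
    for ep in g.inc[(broadcast_id, "AIRED_ON")]:        # (b)<-[:AIRED_ON]-(ep)
        for d in g.out[(ep, "AIRED_ON")]:               # (ep)-[:AIRED_ON]->(d)
            if d == broadcast_id:
                continue  # skip the starting broadcast itself
            for c in g.out[(d, "AIRED_BY")]:            # (d)-[:AIRED_BY]->(c)
                if provider_id not in g.out[(c, "AVAILABLE_ON")]:
                    continue  # the provider does not carry this channel
                for t in g.out[(d, "AIRED_AT")]:        # (d)-[:AIRED_AT]->(t)
                    results.append((d, c, t))
    return results


# Hypothetical sample data: one episode, three broadcasts, two providers.
g = Graph()
for b, ch, t in [("b1", "ch:A", "t1"), ("b2", "ch:B", "t2"), ("b3", "ch:C", "t3")]:
    g.add("ep:drwho-s1e1", "AIRED_ON", b)
    g.add(b, "AIRED_BY", ch)
    g.add(b, "AIRED_AT", t)
g.add("ch:A", "AVAILABLE_ON", "prov:jim")
g.add("ch:B", "AVAILABLE_ON", "prov:ian")
g.add("ch:C", "AVAILABLE_ON", "prov:jim")

airings_for_ian = other_airings(g, "b1", "prov:ian")
```

With this sample data, Jim's broadcast `b1` leads to `b2` on `ch:B` as the one airing Ian's provider carries. Each step only follows relationships from the node already in hand, which is the property the graph model relies on.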
18. But what are the numbers? So, some early benchmarks based on 7 days' worth of broadcast data. Running on a 2011 MBP, 2.3GHz, 8GB with SSD. Neo 1.9, MySQL 5.1
MySQL                  80 seconds
Cypher 1st Attempt     6 seconds
Cypher after tuning    190 milliseconds
Traversal*             42 milliseconds
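The ratios implied by those four timings can be worked out directly. The short sketch below uses only the numbers quoted on the slide; the variable names are illustrative, and the results say nothing beyond this one early benchmark.

```python
# Timings as quoted on the slide, converted to seconds.
timings_s = {
    "MySQL": 80.0,
    "Cypher 1st Attempt": 6.0,
    "Cypher after tuning": 0.190,
    "Traversal": 0.042,
}

baseline = timings_s["MySQL"]

# Speed-up of each approach relative to the MySQL baseline.
speedups = {name: baseline / seconds for name, seconds in timings_s.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

Relative to MySQL this gives roughly 13x for the first Cypher attempt, about 420x after tuning, and about 1,900x for the traversal.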
19. So that’s cool, you can make your system faster, nice. But that’s not actually the really good bit.
20. Deeper questions of the data…
[Diagram: (Tom Baker) -NARRATED-> (Little Britain); (Tom Baker) -APPEARED_IN-> (Dr Who.); (Tom Baker) -APPEARED_IN-> (Black Adder)]
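One way to read a "deeper question" here is: starting from one show, which other shows are linked to it through a person? The sketch below is plain Python over an edge list, not Cypher. The edge directions are inferred from the slide's labels, and `EDGES` and `connected_shows` are hypothetical names used only for illustration.

```python
# Edge list inferred from the slide's diagram: (source, relationship, target).
EDGES = [
    ("Tom Baker", "NARRATED", "Little Britain"),
    ("Tom Baker", "APPEARED_IN", "Dr Who."),
    ("Tom Baker", "APPEARED_IN", "Black Adder"),
]


def connected_shows(edges, start_show):
    """List shows linked to start_show through a shared person.

    Each result is (start_show, relationship_in, person, relationship_out, other_show).
    """
    # People with any relationship pointing at the starting show.
    people = {(src, rel) for src, rel, dst in edges if dst == start_show}
    found = []
    for person, via in sorted(people):
        for src, rel, dst in edges:
            if src == person and dst != start_show:
                found.append((start_show, via, person, rel, dst))
    return found


links = connected_shows(EDGES, "Dr Who.")
for link in links:
    print(link)
```

Starting from "Dr Who.", this finds Little Britain (via NARRATED) and Black Adder (via APPEARED_IN). The relationship type is kept in each result, so the answer can distinguish narrating from appearing.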
22. Flexibility of the data…
[Diagram nodes: Image; Dr Who.; Dr Who. S1; Dr Who. S1 EP1; Image]
23. Represent the data as it is, no “wedging”
[Diagram nodes: FRANCHISE Star Trek; EPISODE Star Trek S1 EP1; MOVIE Star Trek VI; LIVE EVENT Star Trek Con. Live; Broadcast; Broadcast; Broadcast]