Problems
• Request-response model
• Long innovation cycle
• EAAB (engineer as a bottleneck)
• HDC (HiPPO-driven company)
• Lack of speed
• => We are not alone (500px)
https://medium.com/@samson_hu/building-analytics-at-500px-92e9a7005c83
Table of Contents
• Problems
• Solution Requirements
• Elasticsearch & Kibana
• In Gogolook
• Future
Possible solutions
• Approach 1: SQL monkey zoo
• Approach 2: Provide limited yet easy visualization
http://www.slideshare.net/GloriaLau1/keynote-at-spark-summit/5
Requirements
• Easy: Even the CEO can use it
• Fast: Must be interactive
• Export: Provide CSV file export
• Big: Must be scalable
• 80-20: Solves 80% of the problems
Elasticsearch
• Lucene-based search engine
• Document storage (JSON)
• Distributed, scalable
• Serves search requests in milliseconds
• Builds an index for every field
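To make "an index for every field" concrete, here is a toy sketch in Python of a per-field inverted index over flat JSON-like documents. It only illustrates the idea and is not how Elasticsearch or Lucene is implemented; the class `TinyIndex`, its methods, and the sample documents are all hypothetical.

```python
from collections import defaultdict


class TinyIndex:
    """Toy per-field inverted index (illustration only, not Elasticsearch)."""

    def __init__(self):
        self.docs = {}  # doc_id -> original document
        # field name -> term -> set of doc ids containing that term
        self.postings = defaultdict(lambda: defaultdict(set))

    def index(self, doc_id, doc):
        """Store a flat JSON-like dict and index every one of its fields."""
        self.docs[doc_id] = doc
        for field, value in doc.items():
            for term in str(value).lower().split():
                self.postings[field][term].add(doc_id)

    def search(self, field, term):
        """Return sorted ids of documents whose `field` contains `term`."""
        return sorted(self.postings.get(field, {}).get(term.lower(), set()))


# Hypothetical sample documents.
idx = TinyIndex()
idx.index(1, {"event": "button_click", "button": "block", "country": "TW"})
idx.index(2, {"event": "button_click", "button": "report", "country": "JP"})
idx.index(3, {"event": "search", "country": "TW"})

print(idx.search("country", "tw"))    # lookup by one field
print(idx.search("button", "block"))  # lookup by another field, no extra setup
```

Because every field gets its own term-to-documents map at write time, a lookup on any field is a dictionary access rather than a scan over all documents, which is the property that makes interactive exploration possible.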
In Gogolook (Aug. 2015)
• 200M+ data points daily
• 150GB+ of data daily
• 24 dashboards, 160 visualizations
• Service status, e.g. requests_total
• Application data, e.g. tag_total
• Log data, e.g. button_ctr
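For a rough sense of scale, the daily figures above can be turned into per-second averages. This is a back-of-the-envelope sketch: it takes the lower bounds from the slide (200M and 150GB), assumes decimal gigabytes, and assumes both numbers describe the same set of data points.

```python
# Lower-bound figures from the slide (Aug. 2015).
DATA_POINTS_PER_DAY = 200_000_000
BYTES_PER_DAY = 150 * 10**9  # assumption: decimal gigabytes

SECONDS_PER_DAY = 24 * 60 * 60

avg_points_per_second = DATA_POINTS_PER_DAY / SECONDS_PER_DAY
avg_bytes_per_point = BYTES_PER_DAY / DATA_POINTS_PER_DAY
avg_megabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6

print(f"{avg_points_per_second:,.0f} data points/s on average")
print(f"{avg_bytes_per_point:.0f} bytes per data point on average")
print(f"{avg_megabytes_per_second:.2f} MB/s on average")
```

Under those assumptions this works out to roughly 2,300 data points per second, about 750 bytes per data point, and about 1.7 MB/s sustained. Real traffic is unlikely to be flat over the day, so peaks would be higher.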
In Gogolook (currently)
• Log user behavior on features
• Log your own logs (Planner/PM)
• Build your own boards (everyone)
• Monitor performance from day 1
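As a sketch of what logging user behavior on a feature could look like, the Python snippet below builds flat JSON events and derives a click-through rate from them. The schema (`@timestamp`, `feature`, `action`, `user_id`), the helper names, and the CTR definition (clicks divided by impressions) are assumptions for illustration, not Gogolook's actual log format or metric.

```python
import json
from datetime import datetime, timezone


def make_event(feature, action, user_id, **extra):
    """Build one flat, JSON-serializable behavior event (hypothetical schema)."""
    event = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "feature": feature,
        "action": action,
        "user_id": user_id,
    }
    event.update(extra)
    return event


def click_through_rate(events, feature):
    """Assumed CTR: clicks / impressions for one feature; None if never shown."""
    shown = sum(
        1 for e in events
        if e["feature"] == feature and e["action"] == "impression"
    )
    clicked = sum(
        1 for e in events
        if e["feature"] == feature and e["action"] == "click"
    )
    return clicked / shown if shown else None


# Hypothetical events for two features.
events = [
    make_event("block_button", "impression", "u1"),
    make_event("block_button", "impression", "u2"),
    make_event("block_button", "click", "u2"),
    make_event("report_button", "impression", "u1"),
]

print(json.dumps(events[0]))  # one JSON document per event
print(click_through_rate(events, "block_button"))
```

Keeping each event flat and self-describing means whoever owns a feature can define its log fields, and anyone can later build a board on those fields without asking an engineer for a new query.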