© King.com Ltd 2013 – Public
Agenda
• Welcome!
• A brief history of King
• King data platform evolution
• Enter Hive
• Hive + DB
• Hive + better DB
• Questions?
Web, social, mobile
A brief history of King
King in numbers
• 100 million daily active users
• 1 billion game plays per day
• 8 offices
• 10 billion events per day
• Lots and lots of data…
A brief history of King
A brief history of me
andy.done@king.com
The road to big
Enter Hive
[Chart: compressed events in gigabytes/day (axis 0 to 350) for Browser and Mobile, 2011-02-16 to 2013-04-26. Annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no".]
Data exploration
• COUNT(*)
• SELECT DISTINCT
• COUNT, SUM… GROUP BY date
Enter Hive
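The three query shapes listed on this slide can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Hive, and the `events` table, its columns, and its rows are hypothetical and chosen only for illustration; they are not King's schema or data.

```python
import sqlite3

# Hypothetical event table; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "event_date TEXT, user_id INTEGER, event_type TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("2013-04-25", 1, "game_start", 0),
        ("2013-04-25", 2, "purchase", 3),
        ("2013-04-26", 1, "purchase", 5),
        ("2013-04-26", 3, "game_start", 0),
    ],
)

# 1. COUNT(*): how many events are there in total?
total_events = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# 2. SELECT DISTINCT: which event types occur at all?
event_types = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT event_type FROM events ORDER BY event_type"
    )
]

# 3. COUNT, SUM ... GROUP BY date: daily aggregates.
daily = conn.execute(
    "SELECT event_date, COUNT(*), SUM(amount) "
    "FROM events GROUP BY event_date ORDER BY event_date"
).fetchall()

print(total_events, event_types, daily)
```

The SQL text itself is deliberately plain, so the same three statements should carry over to HiveQL with little or no change, although that is an assumption rather than something the slides state.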
Data platform 1.0
Hive + DB
[Diagram components: Games, Event data, Hive, ETL, Reports, Data scientists]
Data platform 1.5
Hive + DB
[Diagram components: Games, Event data, Hive, DB, ETL, Reports, Data scientists]
Selection criteria
• ‘Accessible’ pricing (free?)
• Single node
• Easy to set up
• Low maintenance
Hive + DB
Contenders ready
• Infobright
  • Columnar MySQL engine
  • Light tuning and hinting
• InfiniDB
  • Columnar MySQL engine
  • Tuning-less
  • Faster for our use case
How’s that work out?
• Paid its way
• Popular
• 100s of queries / day
• Stability
• Ceilings
• Screwed by mobile
The road to big
Enter Hive
[Chart: compressed events in gigabytes/day (axis 0 to 350) for Browser and Mobile, 2011-02-16 to 2013-04-26. Annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no", InfiniDB.]
Data platform 2.0
Hive + better DB
[Diagram components: Game, Event data, Hive, Better DB, ETL, Reports, Data scientists]
State of the market Jan 2013
• Hadoop on steroids
  • Hadapt…
  • Impala
  • Nouvaeu Data
  • Platfora
  • SiSense
• MPP analytics databases
  • Vertica
  • ExaSol
Hive + better DB
Contenders ready
Hive + better DB
Feature          ExaSol          Vertica
Processing       In memory       Disc optimised
Administration   Web based       Command line
Backup           Web based       Command line
Resiliency       Hot spare       Gradual degradation
Tuning           Self tuning     User tuning
Licensing        Allocated RAM   Total storage
Vendor           Smaller         Larger
Disclaimers
• Our data
• Our queries
• Our use case
• Our results
Hive + better DB
This is our data
Hive + better DB
Table              Row count
Mobile dimension   161 million
Social dimension   600 million
Mobile facts       1 billion
Social facts       6.7 billion
Cluster stats
Hive + better DB
                      Vertica   ExaSol   Hive      InfiniDB
Nodes                 4         4        19        1
Cores                 64        48       228       32
RAM                   512 GB    288 GB   1216 GB   300 GB
Discs                 96        32       76        4
Hardware cost / USD   $$$$      $$       $$        $
Total cost / USD      $$$$$$    $$$$$    $$        $$
Picture:words
Hive + better DB
$1.9m = 4 ExaSol nodes
420 Hive nodes
This is a test
• Ad hoc query tests
• DML
  • INSERTs
  • UPDATEs
  • DELETEs
Hive + better DB
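A test plan like the one above (one ad hoc query plus INSERT, UPDATE and DELETE statements) could be timed with a small harness. The following is only a sketch under stated assumptions: it runs against an in-memory sqlite3 database rather than any of the products compared in this deck, and the `timed` helper, the `facts` table and its contents are hypothetical.

```python
import sqlite3
import time


def timed(conn, label, sql, params=()):
    """Run one statement and return its wall-clock time and row counts."""
    start = time.perf_counter()
    cursor = conn.execute(sql, params)
    rows = cursor.fetchall()  # empty list for DML statements
    conn.commit()
    elapsed = time.perf_counter() - start
    return {
        "label": label,
        "seconds": elapsed,
        "rowcount": cursor.rowcount,  # -1 for SELECT in sqlite3
        "rows": rows,
    }


# Hypothetical fact table: 1000 rows, scores 0..9 with 100 rows each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)", [(i, i % 10) for i in range(1000)]
)
conn.commit()

results = [
    timed(conn, "ad hoc query",
          "SELECT score, COUNT(*) FROM facts GROUP BY score"),
    timed(conn, "insert", "INSERT INTO facts VALUES (?, ?)", (1000, 0)),
    timed(conn, "update",
          "UPDATE facts SET score = score + 1 WHERE score = 0"),
    timed(conn, "delete", "DELETE FROM facts WHERE score = 9"),
]
remaining = conn.execute("SELECT COUNT(*) FROM facts").fetchone()[0]

for r in results:
    print(f"{r['label']}: {r['seconds']:.6f} s")
```

Timings from a toy database say nothing about the systems in the comparison. The point is only the shape of the test: the same statements, run the same way, against each candidate.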
And in the real world
• Faster processing times
  • 4.5 hours to 20 minutes
• Happier analysts
• Happier data warehouse engineers
• Happier ops
Hive + better DB
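To put the "4.5 hours to 20 minutes" figure in perspective, the arithmetic can be worked through as below. Only the two durations come from the slide; the variable names are illustrative.

```python
# Durations quoted on the slide, converted to minutes.
before_minutes = 4.5 * 60  # 4.5 hours = 270 minutes
after_minutes = 20

# Speedup factor: how many times faster the new run is.
speedup = before_minutes / after_minutes

# Share of the original processing time that was eliminated, in percent.
reduction_pct = (1 - after_minutes / before_minutes) * 100

print(f"speedup: {speedup:.1f}x, time saved: {reduction_pct:.1f}%")
```

That works out to a 13.5x speedup, or roughly 93% less processing time.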
Conclusions
• For structured workloads, consider a good analytic database to complement your Hadoop infrastructure
• ExaSol was an excellent fit for our use case
• We’ll let you know how we get on!
Hive + better DB