HBase and Phoenix usage at eHarmony: a talk on the Lambda architecture and the implementation of HBase and Phoenix at eHarmony, presented at Apache PhoenixCon 2016.
4. EHARMONY CREATES THE HAPPIEST, MOST PASSIONATE AND MOST FULFILLING RELATIONSHIPS*
*according to a recent study
8. MATCHING SYSTEM
Compatibility Matching System®
Compatibility Matching
Affinity Matching
Match Distribution
12. MATCH DELIVERY (V1)
Map-side joins, (TB) scoring
Voldemort
Match Data Service
Matching System
30+ million events
65+ million users
30+ billion records
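The batch path above relies on map-side joins for its TB-scale scoring. A minimal sketch of the technique, assuming illustrative record shapes (the real job runs on Hadoop over billions of records): the smaller side is loaded into memory and the large side is joined in a single map pass, avoiding a reduce-side shuffle.

```python
def map_side_join(matches, profiles):
    """Broadcast-join sketch: small side in memory, one pass over the large side."""
    # Build an in-memory lookup from the small side, keyed by user id.
    by_user = {p["user_id"]: p for p in profiles}
    # Single pass over the large side; each record is enriched as it streams by.
    for m in matches:
        profile = by_user.get(m["user_id"])
        if profile is not None:
            yield {**m, "region": profile["region"]}

# Illustrative data, not eHarmony's actual schema.
profiles = [{"user_id": 1, "region": "US"}, {"user_id": 2, "region": "UK"}]
matches = [{"user_id": 1, "match_id": 10}, {"user_id": 2, "match_id": 11}]
joined = list(map_side_join(matches, profiles))
```

The shuffle-free pass is what makes this attractive at terabyte scale, at the cost of the small side having to fit in each mapper's memory.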
13. VOLDEMORT?
That name sounds familiar
14. VOLDEMORT
Key-value Dynamo
Auto partitioning
Pluggable serialization
Auto replication
Gossip
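Voldemort's auto partitioning and auto replication follow the Dynamo model: keys are hashed onto a ring of partitions, and each key is stored on N consecutive partitions. A minimal sketch, with partition count and replication factor chosen purely for illustration:

```python
import hashlib

# Dynamo-style placement sketch: illustrative parameters, not Voldemort's defaults.
NUM_PARTITIONS = 8
REPLICATION = 2

def partitions_for(key: str) -> list:
    """Hash the key onto the ring, then take REPLICATION consecutive partitions."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    start = h % NUM_PARTITIONS
    return [(start + i) % NUM_PARTITIONS for i in range(REPLICATION)]

replicas = partitions_for("user:42")
```

Because placement is a pure function of the key, any node can route a request without a central directory, which is what the gossip protocol's membership view makes possible.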
15. USER UPDATES: O(N^2)
*user updated the address
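The O(N^2) cost comes from denormalization: each user's attributes are copied into every match record that references them, so one address change fans out to all of that user's matches, and across N users with up to N matches each the aggregate write volume grows quadratically. A sketch with illustrative record shapes:

```python
def fanout_writes(matches, user_id, new_address):
    """Return the match records that must be rewritten for one profile update."""
    touched = []
    for m in matches:
        if user_id in (m["user_a"], m["user_b"]):
            m["address"][user_id] = new_address  # rewrite the denormalized copy
            touched.append(m["match_id"])
    return touched

# Illustrative data: user 1 appears in two matches, so one update = two writes.
matches = [
    {"match_id": 1, "user_a": 1, "user_b": 2, "address": {1: "LA", 2: "NY"}},
    {"match_id": 2, "user_a": 1, "user_b": 3, "address": {1: "LA", 3: "SF"}},
]
rewritten = fanout_writes(matches, 1, "Seattle")
```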
16. NEED FOR SCALABILITY
Voldemort
30+ million match events / day
30+ billion match records
Millions of user-generated events / day
Low-latency user requests
1.4 GB/min
17. NEED FOR SCALABILITY
GetMatches response times (chart)
18. DATA STORE NEEDS
Queries
Low latency
CRUD operations
Filtering
Throughput: 40+ million writes, 30+ billion records
19. DATA STORE NEEDS
Easy to maintain
Consistency
Availability
Partition tolerance
21. LAMBDA ARCHITECTURE
• Robust and fault-tolerant system
• Serves a wide range of workloads and use cases
• Linearly scalable
• Layered architecture:
Batch Layer
Query Layer
Speed Layer
- Nathan Marz
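The layered read path can be sketched in a few lines: the query layer merges a complete-but-stale batch view with a recent speed view, and the speed layer wins for any key present in both. Views are plain dicts here for illustration.

```python
def query(key, batch_view, speed_view):
    """Lambda-architecture read: prefer recent speed-layer data, fall back to batch."""
    if key in speed_view:        # recent writes not yet absorbed by the batch recompute
        return speed_view[key]
    return batch_view.get(key)   # recomputed from the master dataset

# Illustrative views: the batch view is stale for match:1.
batch_view = {"match:1": {"status": "delivered"}}
speed_view = {"match:1": {"status": "communicated"}}
```

When the next batch recompute finishes, the speed view's entries are dropped and the same query is answered from the fresh batch view, which is what makes the system robust to bugs in the speed path.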
22. LAMBDA ARCHITECTURE
Matching System → Message Broker
Batch layer (map-side joins, (TB) scoring) → Batch Storage
Speed/save layer → Speed Storage
Query layer: merge of Batch Storage and Speed Storage
24. HBASE AS BATCH STORE
CRUD operations
Throughput: 40+ million writes, 30+ billion records
Easy to maintain
Consistency
Partition tolerance
Availability
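A batch store at 30+ billion records lives or dies by its row-key design. A minimal sketch of a composite, salted HBase row key for match records, assuming a `salt|userId|matchId` layout (an assumption for illustration, not eHarmony's actual schema): salting spreads sequential user ids across region servers to avoid hotspots, while keeping the user id as the prefix makes "all matches for a user" a contiguous scan within a salt bucket.

```python
import hashlib

SALT_BUCKETS = 16  # illustrative bucket count

def row_key(user_id: int, match_id: int) -> bytes:
    """Build a salted composite row key: salt|userId|matchId, zero-padded for sort order."""
    salt = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % SALT_BUCKETS
    return b"%02d|%012d|%012d" % (salt, user_id, match_id)

k1 = row_key(42, 1)
k2 = row_key(42, 2)
```

Zero-padding matters because HBase sorts row keys lexicographically as bytes; without it, match 10 would sort before match 2.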
25. KAFKA AS BROKER
Throughput: 40+ million writes
Consistency
Availability
Partition tolerance
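One property that matters for a match-event stream is per-user ordering, which Kafka gives by hashing the message key onto a partition so that all of one user's events land on the same partition and are consumed in order. A sketch of that routing (partition count is illustrative, and Kafka's real default partitioner uses murmur2, not md5):

```python
import hashlib

NUM_PARTITIONS = 12  # illustrative

def partition_for(key: str) -> int:
    """Route a keyed message to a partition, as a key-hash partitioner would."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS

# Illustrative event stream: user:7's two events must stay in order.
events = [("user:7", "match_created"), ("user:9", "match_viewed"),
          ("user:7", "match_closed")]
logs = {}
for key, event in events:
    logs.setdefault(partition_for(key), []).append((key, event))
```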
26. REDIS AS SPEED STORAGE
Low latency
Consistency
Availability
Partition tolerance
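The speed layer only needs to hold recent writes: once the next batch recompute absorbs them, they can expire, which maps naturally onto Redis TTLs (`EXPIRE`). A sketch with a dict standing in for Redis and an injectable clock for testing; the TTL value is illustrative.

```python
import time

class SpeedStore:
    """Dict-backed stand-in for a Redis speed store with per-key TTL expiry."""
    def __init__(self, ttl_seconds=86400, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.data = {}

    def set(self, key, value):
        # Store the value with its expiry time, as SET + EXPIRE would.
        self.data[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.data[key]  # lazily drop expired entries
            return None
        return value
```

Expiry bounds the speed store's size to roughly one batch cycle of writes, which is what keeps it cheap to run in memory.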
27. PHOENIX AS SQL LAYER
CRUD operations
Easy to maintain
Queries
Indexing
Transactions
Multi-tenancy
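The SQL layer's value is server-side querying and filtering over the HBase rows. A sketch of the kind of filtered match query it serves, using Python's `sqlite3` as an in-process stand-in for Phoenix (Phoenix speaks SQL over JDBC and writes with `UPSERT` rather than `INSERT`; the table and column names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key mirrors a userId|matchId HBase row key.
conn.execute("CREATE TABLE matches (user_id INT, match_id INT, status TEXT, "
             "PRIMARY KEY (user_id, match_id))")
conn.executemany("INSERT INTO matches VALUES (?, ?, ?)",
                 [(1, 10, "NEW"), (1, 11, "ARCHIVED"), (2, 12, "NEW")])
# Server-side filtering: only user 1's non-archived matches come back.
rows = conn.execute("SELECT match_id FROM matches "
                    "WHERE user_id = ? AND status != 'ARCHIVED'", (1,)).fetchall()
```

Pushing the filter to the server is the point: the client never pulls 30+ billion rows to discard most of them.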
28. PHO LIBRARY
CRUD operations
Easy to maintain
Queries
Filtering
32. LAMBDA ARCHITECTURE
Matching System → Kafka
Batch layer (map-side joins, (TB) scoring)
Query layer
Speed/save layer
33. PERFORMANCE
HBase cutover (50% → 100%): SaveMatch response times (chart)
HBase cutover (100%): GetMatches response times (chart)