Pintrace: Distributed tracing@Pinterest
1. PINTRACE
DISTRIBUTED TRACING @ PINTEREST
SUMAN KARUMURI
2. ABOUT ME
• Passionate about distributed tracing, monitoring and
cloud infrastructure.
• Lead on Visibility team at Pinterest.
• Lead for Zipkin project at Twitter (briefly).
• Author of “Distributed tracing” (upcoming) from
O’Reilly.
• Ex-(Twitter, Facebook, Amazon, Yahoo, Goldman Sachs).
5. MICRO-SERVICES BROKE OUR TOOLS
HOW DID THIS REQUEST EXECUTE?
6. AGGREGATE EVENTS PER SERVICE
UNDERSTAND TRENDS AND ALERTS
CHEAP
SERVICE LEVEL OVERVIEW
NO PER REQUEST OVERVIEW
METRICS
7. RECORD DISCRETE EVENTS
MANUAL CORRELATION
EXPENSIVE
FLEXIBLE BUT VERY BRITTLE
LOGS
8. PROJECT PRESTIGE
PINPOINT
MANUAL TRACING
9. RECORD EVENTS IN A REQUEST WITH CAUSAL ORDERING
What is Distributed Tracing?
10. STRUCTURED LOGGING ON STEROIDS
ANNOTATION, SPAN, TRACE
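A minimal sketch of how the three terms on this slide relate, assuming the usual Zipkin-style meaning: an annotation is a timestamped event, a span is one unit of work that carries annotations and a pointer to its parent, and a trace is every span sharing a trace ID. The class and field names below are illustrative assumptions, not Pintrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """A timestamped event recorded inside a span."""
    timestamp_us: int
    value: str


@dataclass
class Span:
    """One unit of work in a request; parent_id gives the causal ordering."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def duration_us(self) -> int:
        """Time between the first and last annotation, 0 if there are none."""
        stamps = [a.timestamp_us for a in self.annotations]
        return max(stamps) - min(stamps) if stamps else 0


def group_into_traces(spans: List[Span]) -> Dict[str, List[Span]]:
    """A trace is simply all spans that share a trace_id."""
    traces: Dict[str, List[Span]] = {}
    for span in spans:
        traces.setdefault(span.trace_id, []).append(span)
    return traces
```

Structured logs become a trace once every record carries the trace ID and parent ID, which is what removes the manual correlation step.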
What is Distributed Tracing?
11. TRACE REQUESTS: RECORD EVENTS IN A REQUEST WITH CAUSAL ORDERING.
ACROSS MOBILE CLIENTS, BACKEND SERVICES AND DATABASES
ZIPKIN BASED TRACING SOLUTION
MORE EXPENSIVE
PINTRACE
12. BUILDING PINTRACE: 5 CHALLENGES
13. BUILD INSTRUMENTATION
CHALLENGE 1
HARD & TEDIOUS
ONE INSTRUMENTATION PER (LANGUAGE, FRAMEWORK, THREAD POOL, PROTOCOL) COMBINATION.
OPENTRACING PYTHON TRACER, FINAGLE ZIPKIN TRACER
14. SPAN REPORT AND AGGREGATION
CHALLENGE 2
First company-wide span aggregation pipeline.
15. DEPLOY INSTRUMENTATION
CHALLENGE 3
3 instrumentations.
100+ services
40 teams
Sampling <1% traffic
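One common way to sample a small fraction of traffic is to decide from the trace ID, so that every service makes the same keep/drop choice for a given request. The sketch below assumes that approach; the slides do not say how Pintrace samples, and `should_sample` and the 1% default are illustrative.

```python
import hashlib


def should_sample(trace_id: str, rate: float = 0.01) -> bool:
    """Deterministically keep roughly `rate` of all traces, keyed on trace_id."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000          # stable bucket in 0..9999
    return bucket < round(rate * 10_000)
```

Because the decision depends only on the trace ID, a sampled trace is complete: no service drops spans that another service kept.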
16. TRACE PROCESSING AND STORAGE
CHALLENGE 4
Open sourced our streaming pipeline:
github.com/openzipkin/zipkin-sparkstreaming
17. TRACE VISUALIZATION
CHALLENGE 5
Pintrace architecture
19. APPLICATIONS OF TRACE DATA
UNDERSTAND, DEBUG AND TUNE DISTRIBUTED SYSTEMS.
20. IDENTIFYING SERVICES INTERACTING WITH A REQUEST
UNDERSTAND REQUEST TIMELINE
21. IDENTIFYING DUPLICATE COMPUTATION
UNDERSTAND REQUEST TIMELINE
5% lower latency (a 20 ms improvement) while halving the load
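Duplicate computation shows up in a trace as the same call repeated with the same arguments. A rough sketch of spotting it programmatically, assuming each span is a dict with hypothetical `service`, `name` and `args` keys:

```python
from collections import Counter


def find_duplicate_calls(spans):
    """Return {(service, name, args): count} for calls made more than once."""
    counts = Counter((s["service"], s["name"], s.get("args")) for s in spans)
    return {call: n for call, n in counts.items() if n > 1}


# Usage with made-up spans from a single trace.
trace = [
    {"service": "api", "name": "get_home", "args": "user=1"},
    {"service": "pins", "name": "get_board", "args": "board=7"},
    {"service": "pins", "name": "get_board", "args": "board=7"},
    {"service": "pins", "name": "get_board", "args": "board=9"},
]
duplicates = find_duplicate_calls(trace)
```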
22. WHICH CLUSTER SERVED THIS REQUEST?
DEBUG DISTRIBUTED SYSTEM
23. CUSTOM APPLICATION SPANS
DEBUG DISTRIBUTED SYSTEM
24. IDENTIFY CLOCK SKEW
DEBUG DISTRIBUTED SYSTEM
Clock skew is very common in cloud environments.
Easily identified in a trace.
Zipkin UI corrects for clock skew.
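A child span cannot really start before the parent that caused it, so a child on another host with an earlier start time is a sign of skew. The sketch below flags such spans. It is an illustrative heuristic with hypothetical field names, not the correction logic the Zipkin UI uses.

```python
def find_skewed_spans(spans):
    """Return (span_id, skew_us) for cross-host children that start before their parent."""
    by_id = {s["id"]: s for s in spans}
    skewed = []
    for span in spans:
        parent = by_id.get(span.get("parent_id"))
        if parent is None or parent["host"] == span["host"]:
            continue  # root span, missing parent, or same clock
        if span["start_us"] < parent["start_us"]:
            skewed.append((span["id"], parent["start_us"] - span["start_us"]))
    return skewed


# Usage with made-up spans: "b" appears to start 100 us before its caller.
trace = [
    {"id": "a", "parent_id": None, "host": "web-1", "start_us": 1000},
    {"id": "b", "parent_id": "a", "host": "db-1", "start_us": 900},
    {"id": "c", "parent_id": "a", "host": "cache-1", "start_us": 1200},
]
skewed = find_skewed_spans(trace)
```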
25. IDENTIFYING SERIAL EXECUTION
TUNE DISTRIBUTED SYSTEM
A step pattern in a trace signifies serial execution.
Parallel get_many after the bug fix.
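The step pattern can also be detected in code: sibling spans form steps when each one starts only after the previous one has finished. A minimal sketch, assuming sibling spans are dicts with hypothetical `start_us` and `duration_us` keys:

```python
def is_step_pattern(children):
    """True if sibling spans run strictly one after another (no overlap)."""
    ordered = sorted(children, key=lambda s: s["start_us"])
    return len(ordered) > 1 and all(
        nxt["start_us"] >= cur["start_us"] + cur["duration_us"]
        for cur, nxt in zip(ordered, ordered[1:])
    )


def potential_saving_us(children):
    """Upper bound on time saved if the serial calls ran fully in parallel."""
    durations = [s["duration_us"] for s in children]
    return sum(durations) - max(durations) if durations else 0


# Usage with made-up sibling spans.
serial = [
    {"start_us": 0, "duration_us": 10},
    {"start_us": 10, "duration_us": 10},
    {"start_us": 20, "duration_us": 10},
]
parallel = [
    {"start_us": 0, "duration_us": 10},
    {"start_us": 1, "duration_us": 10},
]
```

The saving is an upper bound: it assumes the calls are independent and the backend can absorb them concurrently.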
26. MORE APPLICATIONS OF TRACE DATA
• Tracking down p99 latencies.
• Identify architectural optimizations.
• Latency pipeline.
• Service dependency analysis.
• Improve time to triage.
• Automated root cause analysis.
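As an illustration of service dependency analysis, the sketch below derives caller-to-callee edges from parent/child span relationships. The span fields and the `dependency_edges` name are assumptions for the example, not the Pintrace pipeline.

```python
from collections import Counter


def dependency_edges(spans):
    """Count calls between services: {(caller_service, callee_service): n}."""
    by_id = {s["id"]: s for s in spans}
    edges = Counter()
    for span in spans:
        parent = by_id.get(span.get("parent_id"))
        if parent is not None and parent["service"] != span["service"]:
            edges[(parent["service"], span["service"])] += 1
    return edges


# Usage with made-up spans.
trace = [
    {"id": "a", "parent_id": None, "service": "api"},
    {"id": "b", "parent_id": "a", "service": "pins"},
    {"id": "c", "parent_id": "a", "service": "pins"},
    {"id": "d", "parent_id": "b", "service": "mysql"},
    {"id": "e", "parent_id": "a", "service": "api"},
]
edges = dependency_edges(trace)
```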
27. LESSONS LEARNED
• User awareness and education are very important to
make tracing successful.
• Begin with the end in mind.
• Trace the most valuable paths in the application.
• The distributed tracing landscape is confusing.
• Quality of traces is more important than quantity.
29. QUESTIONS?
https://tinyurl.com/pintrace-architecture
https://tinyurl.com/pintrace-applications
skarumuri@pinterest.com
twitter: @mansu