Brought to you by
Continuous Go Profiling &
Observability
Felix Geisendörfer
Staff Engineer at Datadog
■ Go developers and operators of Go applications
■ Interested in reducing costs and latency, or debugging problems such as
memory leaks, infinite loops and performance regressions
■ Focus is on Go’s built-in tools, but we’ll also cover Linux perf and eBPF
Target Audience
Felix Geisendörfer
Staff Engineer at Datadog
■ Working on continuous Go profiling as a product
■ Previous 6.5 years working for Apple (Factory Traceability)
■ Open Source Contributor (node.js, Go): github.com/felixge
https://dtdg.co/p99-go-profiling
Slides
What is profiling?
■ Anything that produces a weighted list of stack traces
■ Example: a CPU profiler that interrupts the process after every 10ms of CPU time,
captures a stack trace, and aggregates the counts per stack trace
stack trace     count
main;foo        5
main;foo;bar    4
main;foobar     4
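To make the "weighted list of stack traces" idea concrete, here is a minimal Go sketch of the aggregation step. The sample data and the aggregate helper are illustrative assumptions, not part of any real profiler.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// aggregate turns raw stack samples into a weighted list: one count per
// distinct stack trace, keyed in folded "root;...;leaf" form.
func aggregate(samples [][]string) map[string]int {
	counts := make(map[string]int)
	for _, frames := range samples {
		counts[strings.Join(frames, ";")]++
	}
	return counts
}

func main() {
	// Hypothetical samples, as if one were captured per 10ms of CPU time.
	samples := [][]string{
		{"main", "foo"},
		{"main", "foo", "bar"},
		{"main", "foo"},
		{"main", "foobar"},
	}
	counts := aggregate(samples)

	// Sort the keys so the output order is stable.
	stacks := make([]string, 0, len(counts))
	for s := range counts {
		stacks = append(stacks, s)
	}
	sort.Strings(stacks)
	for _, s := range stacks {
		fmt.Printf("%-14s %d\n", s, counts[s])
	}
}
```

Multiplying each count by the sampling interval (10ms here) gives an estimate of the CPU time spent in that stack.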
What is Continuous Profiling?
■ Profiling in production
■ Continuously upload profiles to a backend for later analysis
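A continuous profiler can be sketched as a loop around Go's built-in CPU profiler. The profileLoop function and its upload callback below are hypothetical; a real agent would send the bytes to a backend and handle retries, other profile types, and shutdown.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// profileLoop captures `rounds` CPU profiles of length `window` and hands
// each one to upload. The upload callback is a hypothetical stand-in for
// sending the profile to a backend for later analysis.
func profileLoop(rounds int, window time.Duration, upload func([]byte) error) error {
	for i := 0; i < rounds; i++ {
		var buf bytes.Buffer
		if err := pprof.StartCPUProfile(&buf); err != nil {
			return err
		}
		time.Sleep(window)
		// StopCPUProfile returns only after the profile has been written.
		pprof.StopCPUProfile()
		if err := upload(buf.Bytes()); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	uploaded := 0
	err := profileLoop(2, 100*time.Millisecond, func(p []byte) error {
		uploaded++
		fmt.Printf("profile %d: %d bytes\n", uploaded, len(p))
		return nil
	})
	if err != nil {
		fmt.Println("profiling failed:", err)
	}
}
```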
Why profile in production?
■ Data distributions have a big impact on performance
■ Production profiles can help mitigate and root cause incidents
■ Profiling is usually low overhead (1-10%)
About Go
■ Compiled language like C/C++/Rust
■ Should work well with industry standard observability tools … right?
Does Go pass the Duck Test?
Goroutines
■ Green threads scheduled onto OS threads by the Go runtime
■ Tightly integrated with Go’s network stack (epoll on Linux)
■ Tiny 2 KiB stacks that grow dynamically
■ Fast context switching (~170ns), 10x faster than Linux threads
see https://dtdg.co/3n6kBoC
■ Data sharing via mutexes and channels (CSP)
The trouble with goroutines
uprobe:./example:main.Foo {
  @start[tid] = nsecs;
}
uretprobe:./example:main.Foo {
  @msecs = hist((nsecs - @start[tid]) / 1000000);
  delete(@start[tid]);
}
END {
  clear(@start);
}
uretprobes + dynamic stacks = 💣
$ sudo bpftrace -c ./example funclatency.bpf
Attaching 3 probes...
SIGILL: illegal instruction
PC=0x7fffffffe001 m=4 sigcode=128
instruction bytes: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0
goroutine 1 [running]:
runtime: unknown pc 0x7fffffffe001
stack: frame={sp:0xc00006cf70, fp:0x0} stack=[0xc00006c000,0xc00006d000)
000000c00006ce70: 000000c000014010 0000000000000010
000000c00006ce80: 000000c000018000 000000000000004b
000000c00006ce90: 000000c00001a000 0000000000000013
see: runtime: ebpf uretprobe support #22008: https://dtdg.co/3s4vnfn
Thread IDs? Goroutine IDs!
uprobe:./example:main.Foo {
  @start[tid] = nsecs;
}
uretprobe:./example:main.Foo {
  @msecs = hist((nsecs - @start[tid]) / 1000000);
  delete(@start[tid]);
}
END {
  clear(@start);
}
Thread IDs? Goroutine IDs!
struct stack {
  uintptr_t lo;
  uintptr_t hi;
}
struct gobuf {
  uintptr_t sp;
  uintptr_t pc;
  uintptr_t g;
  uintptr_t ctxt;
  uintptr_t ret;
  uintptr_t lr;
  uintptr_t bp;
}
struct g {
  struct stack stack;
  uintptr_t stackguard0;
  uintptr_t stackguard1;
  uintptr_t _panic;
  uintptr_t _defer;
  uintptr_t m;
  struct gobuf sched;
  uintptr_t syscallsp;
  uintptr_t syscallpc;
  uintptr_t stktopsp;
  uintptr_t param;
  uint32_t atomicstatus;
  uint32_t stackLock;
  uint64_t goid;
}
uprobe:./example:runtime.execute {
  @gids[tid] = ((struct g *)sarg0)->goid;
}
■ Does not follow System V AMD64 ABI 🙈
■ Arguments are passed on the stack rather than using registers (slowish)
■ Go 1.17 switched to a register calling convention, but still idiosyncratic (to
support goroutine scalability, multiple return arguments, etc.)
■ ABI0 remains in use to support legacy assembly code
Go’s Calling Convention
See Proposal: Register-based Go calling convention: https://dtdg.co/2VIPOSV
■ Requires a separate stack for C call frames, which need to be static
■ High complexity and some overhead (~60ns) to switch between stacks
see https://dtdg.co/2X1HvTq
Calling C Code
■ Go pushes frame pointers onto the stack, has no -fomit-frame-pointer
■ Go also generates DWARF unwind/symbol tables by default
■ Leads to good interoperability with tools such as Linux perf
■ Go runtime uses idiosyncratic gopclntab unwinding and symbol tables
(DWARF is strippable and $@!%^# turing complete, so this is good)
Less odd: Stack Traces
Duck Test: Go is an odd duck
Pay attention when using 3rd party tools in production
Ashley Willis (CC BY-NC-SA 4.0)
■ Quirky runtime, pedestrian language, limited type system, but ...
■ What Go lacks as a language, it makes up for in tooling
■ Built-in documentation, testing, benchmarking, code formatting, tracing,
profiling and more!
So why bother with Go?
■ Five different profilers: CPU, Heap, Mutex, Block, Goroutine
go test -cpuprofile cpu.prof -memprofile mem.prof -bench .
■ pprof visualization and analysis tool
go tool pprof -http=:6060 cpu.prof
Built-in observability tools
Built-in observability tools
■ Runtime execution tracer (⚠ overhead can be > 10%)
go test -trace trace.out -bench .
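Outside of go test, a trace can also be recorded programmatically with the standard runtime/trace package. The traceWork helper below is an illustrative assumption; a minimal sketch that keeps the trace in memory rather than writing trace.out:

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/trace"
	"sync"
)

// traceWork records an execution trace while fn runs and returns the raw
// trace bytes.
func traceWork(fn func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	fn()
	// Stop returns only after all trace data has been written.
	trace.Stop()
	return buf.Bytes(), nil
}

func main() {
	data, err := traceWork(func() {
		var wg sync.WaitGroup
		for i := 0; i < 4; i++ {
			wg.Add(1)
			go func() { defer wg.Done() }()
		}
		wg.Wait()
	})
	if err != nil {
		fmt.Println("trace failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of trace data\n", len(data))
}
```

Written to a file, the same bytes can be opened with go tool trace.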
Built-in Profilers
■ Three profilers that measure time:
● CPU
● Block
● Mutex
Profilers measuring time
CPU Profiler
■ Annotate goroutines with arbitrary key/value pairs
■ Understand CPU consumption of individual requests, users, endpoints, etc.
CPU Profiler: Labels
labels := pprof.Labels("user_id", "123")
pprof.Do(ctx, labels, func(ctx context.Context) {
    // handle request
    go update(ctx) // child goroutine inherits labels
})
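The snippet above can be expanded into a small runnable program. The handleRequest function is a hypothetical stand-in for a request handler; it reads the label back with pprof.Label to show that the label travels with the context.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// handleRequest runs its work with a "user_id" label attached and returns
// the label value observed inside the labeled section.
func handleRequest(userID string) string {
	ctx := context.Background()
	var seen string
	pprof.Do(ctx, pprof.Labels("user_id", userID), func(ctx context.Context) {
		// CPU samples taken while this function runs carry the label.
		// pprof.Label reads a label back from the context.
		seen, _ = pprof.Label(ctx, "user_id")
	})
	return seen
}

func main() {
	fmt.Println(handleRequest("123"))
}
```

In the pprof tool, labeled samples can then be broken down or filtered by label, for example with the -tagfocus option.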
■ Uses setitimer(2) to receive SIGPROF signal for every 10ms of CPU time
■ Signal handler takes stack traces and aggregates them into a profile
■ setitimer(2) has thread delivery bias and can’t keep up when utilizing more
than 2.5 cores 🙄
■ Rhys Hiltner (Twitch) and I are working on an upstream patch to use
timer_create(2)
See: runtime/pprof: Linux CPU profiles inaccurate beyond 250% CPU use #35057: https://dtdg.co/3CAeApm
CPU Profiler: Implementation Details
■ Samples mutex wait (both) and channel wait (block profiler) events
■ Why the overlap?
● Block captures Lock(), i.e. the blocked mutexes
● Mutex captures Unlock(), i.e. the mutexes doing the blocking
■ Block profile used to be biased. Fix contributed for Go 1.17.
see https://go-review.googlesource.com/c/go/+/299991
Mutex & Block Profiler
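Both profilers are off by default and have to be enabled at runtime. The sketch below turns them on at the most detailed setting and provokes one contention event; the enableProfilers, contend, and dump helpers are illustrative names, and a production service would likely pick coarser sampling rates.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// enableProfilers turns on the block and mutex profilers.
func enableProfilers() {
	runtime.SetBlockProfileRate(1)     // record every blocking event
	runtime.SetMutexProfileFraction(1) // record every contention event
}

// contend makes one goroutine wait on a mutex held by another, giving the
// block profile (the waiting Lock) and the mutex profile (the Unlock that
// releases the waiter) an event to record.
func contend() {
	var mu sync.Mutex
	var wg sync.WaitGroup
	mu.Lock()
	wg.Add(1)
	go func() {
		defer wg.Done()
		mu.Lock() // blocks until the first goroutine unlocks
		mu.Unlock()
	}()
	time.Sleep(20 * time.Millisecond)
	mu.Unlock()
	wg.Wait()
}

// dump renders a named built-in profile in its text (debug=1) form.
func dump(name string) string {
	var buf bytes.Buffer
	if p := pprof.Lookup(name); p != nil {
		_ = p.WriteTo(&buf, 1)
	}
	return buf.String()
}

func main() {
	enableProfilers()
	contend()
	fmt.Println(dump("block"))
	fmt.Println(dump("mutex"))
}
```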
Recap: Profilers measuring time
Allocation & Heap Profiler
func malloc(size):
    object = ... // alloc magic
    if poisson_sample(size):
        s = stacktrace()
        profile[s].allocs++
        profile[s].alloc_bytes += sizeof(object)
        track_profiled(object, s)
    return object
func sweep(object):
    // do gc stuff to free object
    if is_profiled(object):
        s = alloc_stacktrace(object)
        profile[s].frees++
        profile[s].free_bytes += sizeof(object)
    return object
■ Allocations per stack trace
■ Memory remaining inuse on the heap (allocs-frees)
■ Can identify the source of memory leaks, but not the refs retaining things
Allocation & Heap Profiler
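The bookkeeping in the pseudocode above can be modeled in a few lines of Go. This is a simplified sketch, not the runtime's implementation: the siteStats and profile types are assumptions for illustration, and sampling is left out so that only the allocs-frees arithmetic remains.

```go
package main

import "fmt"

// siteStats mirrors the per-stack counters from the pseudocode.
type siteStats struct {
	allocs, frees         int
	allocBytes, freeBytes int
}

// inUse returns the objects and bytes still live for a site (allocs - frees).
func (s siteStats) inUse() (objects, bytes int) {
	return s.allocs - s.frees, s.allocBytes - s.freeBytes
}

// profile maps a folded allocation stack trace to its counters.
type profile map[string]*siteStats

// site returns the counters for a stack, creating them on first use.
func (p profile) site(stack string) *siteStats {
	if p[stack] == nil {
		p[stack] = &siteStats{}
	}
	return p[stack]
}

// recordAlloc is what malloc would call for a sampled object.
func (p profile) recordAlloc(stack string, size int) {
	s := p.site(stack)
	s.allocs++
	s.allocBytes += size
}

// recordFree is what sweep would call for a profiled object.
func (p profile) recordFree(stack string, size int) {
	s := p.site(stack)
	s.frees++
	s.freeBytes += size
}

func main() {
	p := profile{}
	p.recordAlloc("main;loadCache", 4096) // hypothetical allocation site
	p.recordAlloc("main;loadCache", 4096)
	p.recordFree("main;loadCache", 4096)
	objects, bytes := p.site("main;loadCache").inUse()
	fmt.Println(objects, bytes) // 1 4096
}
```

Frees are attributed to the stack that allocated the object, which is why the profile can point at the source of a leak but not at the reference that keeps the object alive.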
■ Can sometimes guide CPU optimizations better than CPU profiler
Allocation & Heap Profiler
■ Second-Order Effects: Reducing allocs can make unrelated code faster (!)
■ 💡 Reduce allocations and number of pointers on the heap
Allocation & Heap Profiler
■ Briefly stops all goroutines and captures their stack traces (⚠ Latency)
■ Useful for debugging goroutine leaks
■ Text output format also includes waiting times for debugging “stuck
programs” (block/mutex don’t show this until the blocking event has finished)
■ fgprof captures goroutine profiles at 100 Hz -> Wallclock Profile
https://github.com/felixge/fgprof
Goroutine Profiler
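A goroutine leak shows up as a growing count in the goroutine profile. The sketch below leaks a few goroutines on purpose and then prints the text form of the profile; the leak and goroutineCount helpers are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// leak starts n goroutines that block forever on a channel nobody writes to.
func leak(n int) {
	for i := 0; i < n; i++ {
		go func() {
			ch := make(chan struct{})
			<-ch
		}()
	}
}

// goroutineCount returns the number of goroutines in the goroutine profile.
func goroutineCount() int {
	return pprof.Lookup("goroutine").Count()
}

func main() {
	before := goroutineCount()
	leak(10)
	time.Sleep(10 * time.Millisecond)
	fmt.Println("leaked:", goroutineCount()-before)

	// debug=2 prints one stack per goroutine with its wait reason; once a
	// goroutine has been blocked for a minute or more, the wait duration is
	// shown as well.
	var buf bytes.Buffer
	_ = pprof.Lookup("goroutine").WriteTo(&buf, 2)
	fmt.Print(buf.String())
}
```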
Bonus: Linux perf & eBPF
■ Frame pointers & DWARF tables lead to good interoperability
■ perf offers better accuracy (but accuracy of builtin profilers is decent enough)
■ Deals with dual Go and C stacks (no need for runtime.SetCgoTraceback())
■ Downsides: Linux only, Security, Permissions, Lack of Profiler Labels
■ Example: perf record -F 99 -g ./myapp && perf report
Linux perf
■ Example: bpftrace -e 'profile:hz:99 { @[ustack()] = count(); }' -c ./myapp
■ Should require less context switching, stacks aggregated in kernel
■ Otherwise similar caveats to Linux perf
eBPF (bpftrace)
Recap
■ Go is a bit odd for a compiled language, but ...
■ Wide variety of profiling and observability tools can be used
■ Most should be safe for production (⚠ goroutine profiler, execution tracer,
uretprobes)
■ Continuous Profiling makes sure you always have the data at your fingertips
Recap
Check out
github.com/DataDog/go-profiler-notes
for more in-depth Go profiling research
Brought to you by
Felix Geisendörfer
p99@felixge.de
@felixge

 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Continuous Go Profiling & Observability

  • 1. Brought to you by Continuous Go Profiling & Observability Felix Geisendörfer Staff Engineer at
  • 2. ■ Go developers and operators of Go applications ■ Interested in reducing costs and latency, or debugging problems such as memory leaks, infinite loops and performance regressions ■ Focus is on Go’s built-in tools, but we’ll also cover Linux perf and eBPF Target Audience
  • 3. Felix Geisendörfer Staff Engineer at Datadog ■ Working on continuous Go profiling as a product ■ Previous 6.5 years working for Apple (Factory Traceability) ■ Open Source Contributor (node.js, Go): github.com/felixge
  • 5. What is profiling? ■ Anything that produces a weighted list of stack traces ■ Example: a CPU profiler that interrupts the process every 10ms of CPU time, captures a stack trace and aggregates the counts
        stack trace      count
        main;foo         5
        main;foo;bar     4
        main;foobar      4
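To make the "weighted list of stack traces" idea concrete, here is a minimal Go sketch. The `Profile` type and its methods are hypothetical names invented for illustration, not part of any Go package, and the samples are hard-coded instead of coming from a real signal handler.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Profile maps a semicolon-joined stack trace (root frame first) to the
// number of times that stack was sampled. Hypothetical type for illustration.
type Profile map[string]int

// Add records one sampled stack trace.
func (p Profile) Add(frames ...string) {
	p[strings.Join(frames, ";")]++
}

// Sorted returns the stack traces ordered by descending count, then by name.
func (p Profile) Sorted() []string {
	keys := make([]string, 0, len(p))
	for k := range p {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if p[keys[i]] != p[keys[j]] {
			return p[keys[i]] > p[keys[j]]
		}
		return keys[i] < keys[j]
	})
	return keys
}

func main() {
	p := Profile{}
	// Made-up samples standing in for "one interrupt per 10ms of CPU time".
	for i := 0; i < 5; i++ {
		p.Add("main", "foo")
	}
	for i := 0; i < 4; i++ {
		p.Add("main", "foo", "bar")
		p.Add("main", "foobar")
	}
	for _, stack := range p.Sorted() {
		fmt.Println(stack, p[stack])
	}
}
```

With a 10ms sampling interval, a count of 5 corresponds to roughly 50ms of CPU time attributed to that stack.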
  • 6. What is Continuous Profiling? ■ Profiling in production ■ Continuously upload profiles to a backend for later analysis
  • 7. Why profile in production? ■ Data distributions have a big impact on performance ■ Production profiles can help mitigate and root cause incidents ■ Profiling is usually low overhead (1-10%)
  • 8. About Go ■ Compiled language like C/C++/Rust ■ Should work well with industry standard observability tools … right?
  • 9. Does Go pass the Duck Test?
  • 10. Goroutines ■ Green threads scheduled onto OS threads by the Go runtime ■ Tightly integrated with Go’s network stack (epoll on Linux) ■ Tiny 2 KiB stacks that grow dynamically ■ Fast context switching (~170ns), 10x faster than Linux threads, see https://dtdg.co/3n6kBoC ■ Data sharing via mutexes and channels (CSP)
  • 11. The trouble with goroutines
        uprobe:./example:main.Foo {
          @start[tid] = nsecs;
        }
        uretprobe:./example:main.Foo {
          @msecs = hist((nsecs - @start[tid]) / 1000000);
          delete(@start[tid]);
        }
        END {
          clear(@start);
        }
  • 12. uretprobes + dynamic stacks = 💣
        $ sudo bpftrace -c ./example funclatency.bpf
        Attaching 3 probes...
        SIGILL: illegal instruction
        PC=0x7fffffffe001 m=4 sigcode=128
        instruction bytes: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
        goroutine 1 [running]:
        runtime: unknown pc 0x7fffffffe001
        stack: frame={sp:0xc00006cf70, fp:0x0} stack=[0xc00006c000,0xc00006d000)
        000000c00006ce70: 000000c000014010 0000000000000010
        000000c00006ce80: 000000c000018000 000000000000004b
        000000c00006ce90: 000000c00001a000 0000000000000013
        see: runtime: ebpf uretprobe support #22008: https://dtdg.co/3s4vnfn
  • 13. Thread IDs? Goroutine IDs!
        uprobe:./example:main.Foo {
          @start[tid] = nsecs;
        }
        uretprobe:./example:main.Foo {
          @msecs = hist((nsecs - @start[tid]) / 1000000);
          delete(@start[tid]);
        }
        END {
          clear(@start);
        }
  • 14. Thread IDs? Goroutine IDs!
        struct stack {
          uintptr_t lo;
          uintptr_t hi;
        }
        struct gobuf {
          uintptr_t sp;
          uintptr_t pc;
          uintptr_t g;
          uintptr_t ctxt;
          uintptr_t ret;
          uintptr_t lr;
          uintptr_t bp;
        }
        struct g {
          struct stack stack;
          uintptr_t stackguard0;
          uintptr_t stackguard1;
          uintptr_t _panic;
          uintptr_t _defer;
          uintptr_t m;
          struct gobuf sched;
          uintptr_t syscallsp;
          uintptr_t syscallpc;
          uintptr_t stktopsp;
          uintptr_t param;
          uint32_t atomicstatus;
          uint32_t stackLock;
          uint64_t goid;
        }
        uprobe:./example:runtime.execute {
          @gids[tid] = ((struct g *)sarg0)->goid;
        }
  • 15. Go’s Calling Convention ■ Does not follow System V AMD64 ABI 🙈 ■ Arguments are passed on the stack rather than using registers (slowish) ■ Go 1.17 switched to a register calling convention, but still idiosyncratic (to support goroutine scalability, multiple return arguments, etc.) ■ ABI0 remains in use to support legacy assembly code. See Proposal: Register-based Go calling convention: https://dtdg.co/2VIPOSV
  • 16. Calling C Code ■ Requires a separate stack for C call frames, which need to be static ■ High complexity and some overhead (~60ns) to switch between stacks, see https://dtdg.co/2X1HvTq
  • 17. Less odd: Stack Traces ■ Go pushes frame pointers onto the stack, has no -fomit-frame-pointer ■ Go also generates DWARF unwind/symbol tables by default ■ Leads to good interoperability with tools such as Linux perf ■ Go runtime uses idiosyncratic gopclntab unwinding and symbol tables (DWARF is strippable and $@!%^# Turing-complete, so this is good)
  • 18. Duck Test: Go is an odd duck. Pay attention when using 3rd party tools in production. (Ashley Willis, CC BY-NC-SA 4.0)
  • 19. So why bother with Go? ■ Quirky runtime, pedestrian language, limited type system, but ... ■ What Go lacks as a language, it makes up for in tooling ■ Built-in documentation, testing, benchmarking, code formatting, tracing, profiling and more!
  • 20. Built-in observability tools ■ Five different profilers: CPU, Heap, Mutex, Block, Goroutine: go test -cpuprofile cpu.prof -memprofile mem.prof -bench . ■ pprof visualization and analysis tool: go tool pprof -http=:6060 cpu.prof
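Outside of `go test`, the same CPU profiler can be driven from code through the standard `runtime/pprof` package. The sketch below is one way to do that; `busyWork` and `captureCPUProfile` are hypothetical helper names, and the profile is kept in memory here where a real program would more likely write it to a file for `go tool pprof`.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// busyWork burns CPU for roughly d so the profiler has something to sample.
func busyWork(d time.Duration) int {
	sum := 0
	for start := time.Now(); time.Since(start) < d; {
		for i := 0; i < 1000; i++ {
			sum += i
		}
	}
	return sum
}

// captureCPUProfile runs fn under the CPU profiler and returns the profile
// in pprof's gzip-compressed protobuf format.
func captureCPUProfile(fn func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err // e.g. a CPU profile is already running
	}
	fn()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	data, err := captureCPUProfile(func() { busyWork(200 * time.Millisecond) })
	if err != nil {
		fmt.Println("could not start CPU profile:", err)
		return
	}
	fmt.Printf("captured %d bytes of profile data\n", len(data))
}
```

Writing the returned bytes to a file such as `cpu.prof` should make them usable with the `go tool pprof` command shown on the slide.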
  • 21. Built-in observability tools ■ Runtime execution tracer (⚠ overhead can be > 10%): go test -trace trace.out -bench .
  • 23. Profilers measuring time ■ Three profilers that measure time: CPU, Block, Mutex
  • 25. CPU Profiler: Labels ■ Annotate goroutines with arbitrary key/value pairs ■ Understand CPU consumption of individual requests, users, endpoints, etc.
        labels := pprof.Labels("user_id", "123")
        pprof.Do(ctx, labels, func(ctx context.Context) {
          // handle request
          go update(ctx) // child goroutine inherits labels
        })
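A self-contained variant of the label snippet might look like the sketch below. `labelSeenByChild` is a hypothetical helper, and it reads the label back through the context with `pprof.Label` only to show that the label set is reachable from the child; the profiler itself attaches the labels to samples without any such lookup.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
	"sync"
)

// labelSeenByChild runs a child goroutine inside pprof.Do and returns the
// "user_id" label that the child can read from the context it was given.
func labelSeenByChild(userID string) string {
	var seen string
	labels := pprof.Labels("user_id", userID)
	pprof.Do(context.Background(), labels, func(ctx context.Context) {
		var wg sync.WaitGroup
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The label set travels with ctx. Goroutines started inside
			// pprof.Do also inherit the labels for profiling purposes.
			seen, _ = pprof.Label(ctx, "user_id")
		}()
		wg.Wait()
	})
	return seen
}

func main() {
	fmt.Println("child goroutine sees user_id =", labelSeenByChild("123"))
}
```

Label values with high cardinality, such as raw user IDs, can make profiles larger, so coarser keys like endpoint names may be a safer default.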
  • 26. CPU Profiler: Implementation Details ■ Uses setitimer(2) to receive a SIGPROF signal for every 10ms of CPU time ■ Signal handler takes stack traces and aggregates them into a profile ■ setitimer(2) has thread delivery bias and can’t keep up when utilizing more than 2.5 cores 🙄 ■ Rhys Hiltner (Twitch) and I are working on an upstream patch to use timer_create(2). See: runtime/pprof: Linux CPU profiles inaccurate beyond 250% CPU use #35057: https://dtdg.co/3CAeApm
  • 27. Mutex & Block Profiler ■ Samples mutex wait (both) and channel wait (block profiler) events ■ Why the overlap? ● Block captures Lock(), i.e. the blocked mutexes ● Mutex captures Unlock(), i.e. the mutexes doing the blocking ■ Block profile used to be biased. Fix contributed for Go 1.17, see https://go-review.googlesource.com/c/go/+/299991
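Both profilers are off by default and have to be enabled at runtime. The sketch below turns them on with the most aggressive sampling settings, creates a small amount of mutex contention, and reports how many distinct stacks each profile recorded. `contend` is a hypothetical helper, and a rate of 1 is chosen only to make the demo reliable; production settings would normally sample far less.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contend enables the block and mutex profilers, forces one goroutine to
// wait on a held mutex, and returns the number of stacks in each profile.
func contend() (blockStacks, mutexStacks int) {
	runtime.SetBlockProfileRate(1)             // record every blocking event
	prev := runtime.SetMutexProfileFraction(1) // record every contention event
	defer runtime.SetBlockProfileRate(0)
	defer runtime.SetMutexProfileFraction(prev)

	var mu sync.Mutex
	mu.Lock()
	ready := make(chan struct{})
	done := make(chan struct{})
	go func() {
		close(ready)
		mu.Lock() // blocks here: shows up in the block profile
		mu.Unlock()
		close(done)
	}()
	<-ready
	time.Sleep(20 * time.Millisecond) // keep the waiter blocked for a while
	mu.Unlock()                       // releases the waiter: mutex profile side
	<-done

	return pprof.Lookup("block").Count(), pprof.Lookup("mutex").Count()
}

func main() {
	block, mutex := contend()
	fmt.Printf("block profile stacks: %d, mutex profile stacks: %d\n", block, mutex)
}
```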
  • 29. Allocation & Heap Profiler
        func malloc(size):
          object = ... // alloc magic
          if poisson_sample(size):
            s = stacktrace()
            profile[s].allocs++
            profile[s].alloc_bytes += sizeof(object)
            track_profiled(object, s)
          return object

        func sweep(object):
          // do gc stuff to free object
          if is_profiled(object):
            s = alloc_stacktrace(object)
            profile[s].frees++
            profile[s].free_bytes += sizeof(object)
          return object
  • 30. Allocation & Heap Profiler ■ Allocations per stack trace ■ Memory remaining in use on the heap (allocs - frees) ■ Can identify the source of memory leaks, but not the refs retaining things
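The allocs-minus-frees bookkeeping is visible through `runtime.MemProfile`, whose records expose `InUseBytes()` per allocation stack. The following sketch sums that value over all records. `inUseAfterAlloc` and the `retained` variable are hypothetical names, and setting `runtime.MemProfileRate = 1` (sample every allocation) is an assumption made to keep the demo deterministic; it is far too expensive for production use.

```go
package main

import (
	"fmt"
	"runtime"
)

// retained keeps the allocations reachable so they count as "in use".
var retained [][]byte

// inUseAfterAlloc allocates n buffers of the given size, keeps them alive,
// and returns the sampled in-use bytes (allocs minus frees) summed over all
// allocation stacks. It returns -1 if the profile could not be read.
func inUseAfterAlloc(n, size int) int64 {
	runtime.MemProfileRate = 1 // assumption for the demo: sample everything
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, size))
	}
	runtime.GC() // heap profile data is published as GC cycles complete

	count, _ := runtime.MemProfile(nil, true)
	records := make([]runtime.MemProfileRecord, count+64) // headroom for new stacks
	count, ok := runtime.MemProfile(records, true)
	if !ok {
		return -1
	}
	var total int64
	for _, r := range records[:count] {
		total += r.InUseBytes() // AllocBytes - FreeBytes for this stack
	}
	return total
}

func main() {
	fmt.Println("sampled in-use bytes:", inUseAfterAlloc(8, 1<<20))
}
```

The records identify which stacks allocated the surviving memory, which matches the slide's caveat: they do not say which references keep it alive.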
  • 31. Allocation & Heap Profiler ■ Can sometimes guide CPU optimizations better than the CPU profiler
  • 32. Allocation & Heap Profiler ■ Second-Order Effects: Reducing allocs can make unrelated code faster (!) ■ 💡 Reduce allocations and number of pointers on the heap
  • 33. Goroutine Profiler ■ Briefly stops all goroutines and captures their stack traces (⚠ Latency) ■ Useful for debugging goroutine leaks ■ Text output format also includes waiting times for debugging “stuck programs” (block/mutex don’t show this until the blocking event has finished) ■ fgprof captures goroutine profiles at 100 Hz -> Wallclock Profile, see https://github.com/felixge/fgprof
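As a rough illustration of using the goroutine profile to spot leaks, the sketch below starts a few goroutines that stay blocked on a channel and compares the profile's goroutine count before and after. `leak` and `goroutineCount` are hypothetical helpers; the `pprof.Lookup("goroutine")` calls are the standard `runtime/pprof` API.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// leak starts n goroutines that block until the returned channel is closed,
// standing in for goroutines that a buggy program forgot to stop.
func leak(n int) chan struct{} {
	stop := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-stop }()
	}
	return stop
}

// goroutineCount returns the number of goroutines in the goroutine profile.
func goroutineCount() int {
	return pprof.Lookup("goroutine").Count()
}

func main() {
	before := goroutineCount()
	stop := leak(3)
	fmt.Println("new goroutines:", goroutineCount()-before)

	// debug=1 prints stack traces in text form, grouped by identical stacks.
	if err := pprof.Lookup("goroutine").WriteTo(os.Stdout, 1); err != nil {
		fmt.Println("could not write goroutine profile:", err)
	}
	close(stop)
}
```

Passing `2` as the debug level to `WriteTo` should instead produce one entry per goroutine in the same format as an unrecovered panic.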
  • 35. Linux perf ■ Frame pointers & DWARF tables lead to good interoperability ■ perf offers better accuracy (but accuracy of builtin profilers is decent enough) ■ Deals with dual Go and C stacks (no need for runtime.SetCgoTraceback()) ■ Downsides: Linux only, Security, Permissions, Lack of Profiler Labels ■ Example: perf record -F 99 -g ./myapp && perf report
  • 36. eBPF (bpftrace) ■ Example: bpftrace -e 'profile:hz:99 { @[ustack()] = count(); }' -c ./myapp ■ Should require less context switching, stacks aggregated in kernel ■ Otherwise similar caveats as Linux perf
  • 37. Recap
  • 38. Recap ■ Go is a bit odd for a compiled language, but ... ■ Wide variety of profiling and observability tools can be used ■ Most should be safe for production (⚠ goroutine profiler, execution tracer, uretprobes) ■ Continuous Profiling makes sure you always have the data at your fingertips
  • 40. Brought to you by Felix Geisendörfer p99@felixge.de @felixge