Tracer Evaluation

1. Understanding Linux Kernel Behaviour with Off-CPU Analysis
   Han Qiao, Samsung Research UK
2. Why is my thread blocked? (red dotted line)
   (ARM DS-5 system profiler)
3. What happens off-CPU, i.e. while blocked?
   1. Acquiring a lock
   2. Waiting for I/O
   3. Sleeping voluntarily
4. Limitation of a sampling-based profiler
5. Off-CPU tracing method
   http://brendangregg.com/offcpuanalysis
6. Off-CPU wakeup path analysis
   ● Requirements
     ○ Function argument (target: who I'm waking up)
     ○ Stack trace (cause: who's the caller)
     ○ Context information (process id, cpu id, timing, etc)
   ● Benefit
     ○ Identify performance bottlenecks
     ○ Account for all running applications system wide
     ○ Real time
   http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
7. Program Specification
   int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
   1. Function argument (target: who I'm waking up)
   2. Context information (source process id, cpu id, timing, etc)
   3. Stack trace (cause: why am I called)
8. Naive jprobe
   ● Based on kprobe, but more accessible
   ● Attachable at any kernel function
   ● Keeps original function arguments
   ● Calls Linux functions:
     ○ save_stack_trace
     ○ trace_printk
   http://www.cs.dartmouth.edu/~reeves/kprobes-2016.pdf
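For reference, a minimal sketch of such a jprobe module attached to try_to_wake_up, assuming a pre-4.15 kernel (jprobes have since been removed); the module and handler names, and the exact fields printed, are illustrative rather than the author's code:

    /* Illustrative jprobe module (not the author's exact code). */
    #include <linux/module.h>
    #include <linux/kprobes.h>
    #include <linux/stacktrace.h>
    #include <linux/sched.h>
    #include <linux/smp.h>

    #define MAX_TRACE 16

    /* The jprobe handler is entered with the same arguments as try_to_wake_up. */
    static int jtry_to_wake_up(struct task_struct *p, unsigned int state,
                               int wake_flags)
    {
            unsigned long entries[MAX_TRACE];
            struct stack_trace trace = {
                    .max_entries = MAX_TRACE,
                    .entries     = entries,
            };
            unsigned int i;

            save_stack_trace(&trace);          /* cause: who issued the wakeup */
            trace_printk("wakee=%d waker=%d cpu=%d\n",
                         p->pid, current->pid, smp_processor_id());
            for (i = 0; i < trace.nr_entries; i++)
                    trace_printk("  %pS\n", (void *)entries[i]);

            jprobe_return();                   /* mandatory: a jprobe handler never returns normally */
            return 0;
    }

    static struct jprobe wakeup_jprobe = {
            .entry          = jtry_to_wake_up,
            .kp.symbol_name = "try_to_wake_up",
    };

    static int __init wakeup_init(void)  { return register_jprobe(&wakeup_jprobe); }
    static void __exit wakeup_exit(void) { unregister_jprobe(&wakeup_jprobe); }
    module_init(wakeup_init);
    module_exit(wakeup_exit);
    MODULE_LICENSE("GPL");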
9. 24x overhead of profiling using the naive jprobe
10. Optimizations
    1. Jmp-optimized kprobe (https://lwn.net/Articles/370995/)
    2. Custom stack walker (pointer chasing)
    3. Less printing (in-kernel aggregation)
11. Jmp-optimized kprobe
    int kprobe_handler(struct kprobe *p, struct pt_regs *regs)
    https://lwn.net/Articles/132196/
12. Inspect registers*
    struct pt_regs {
        long rbx;
        long rdi;
        long rbp;
        long rax;
        long rflags;
        …
    *architecture dependent
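A plain kprobe does not preserve the function arguments the way a jprobe does, so the pre-handler recovers them from the saved registers. A sketch assuming x86-64 (the first argument arrives in rdi, exposed as the di field of the kernel's struct pt_regs); the handler and variable names are invented for illustration:

    #include <linux/kernel.h>
    #include <linux/kprobes.h>
    #include <linux/ptrace.h>
    #include <linux/sched.h>

    static int wakeup_pre_handler(struct kprobe *kp, struct pt_regs *regs)
    {
            /* x86-64 calling convention: first argument is in %rdi,
             * saved by the probe as regs->di. */
            struct task_struct *wakee = (struct task_struct *)regs->di;

            trace_printk("wakee=%d waker=%d cpu=%d\n",
                         wakee->pid, current->pid, smp_processor_id());
            return 0;
    }

    static struct kprobe wakeup_kprobe = {
            .symbol_name = "try_to_wake_up",
            .pre_handler = wakeup_pre_handler,
    };

    /* register_kprobe(&wakeup_kprobe) from module init; when the probe site is
     * safe to optimize, the kprobes framework replaces the breakpoint with a jump. */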
13. Custom stack walker
    ret = (void *)(*bp+8)
    *Omits handler code from backtrace
    http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/
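A sketch of that pointer chasing, assuming an x86-64 kernel built with frame pointers: each frame stores the saved rbp with the return address one word above it, so the walker reads *(bp + 8) and then follows *bp to the caller's frame. The function name and the use of probe_kernel_read (available on pre-5.8 kernels) are assumptions of this sketch:

    #include <linux/uaccess.h>   /* probe_kernel_read on pre-5.8 kernels */

    static int walk_frame_pointers(unsigned long bp, unsigned long *entries, int max)
    {
            int n = 0;

            while (n < max && bp) {
                    unsigned long ret;

                    /* return address sits one word above the saved frame pointer */
                    if (probe_kernel_read(&ret, (void *)(bp + 8), sizeof(ret)))
                            break;
                    entries[n++] = ret;

                    /* follow the saved rbp to the caller's frame */
                    if (probe_kernel_read(&bp, (void *)bp, sizeof(bp)))
                            break;
            }
            return n;
    }

Started from the bp saved at the probe point, this yields the wakeup's caller chain while leaving the probe handler's own frames out of the backtrace.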
14. Less printing
    ● Resolve kernel debug symbols
      trace_printk("%pfn", (void *) stack_entries[i]);
    ● In-kernel aggregation (future work)
15. 4x overhead of profiling using the kprobe kernel module
16. Alternatives
    ● ftrace (sched_waking static tracepoint, > 3.18)
    ● eBPF (in-kernel virtual machine JIT interpreter, > 4.4)
    ● perf
    ● SystemTap
    ● LTTng
    ● HTrace
    ● systrace
17. ftrace
    sudo ls /sys/kernel/debug/tracing/
    available_tracers         kprobe_profile   set_ftrace_notrace   trace_options
    buffer_total_size_kb      options          set_graph_function   trace_stat
    current_tracer            per_cpu          set_graph_notrace    tracing_cpumask
    dyn_ftrace_total_info     printk_formats   snapshot             tracing_max_latency
    enabled_functions         README           stack_max_size       tracing_on
    function_profile_enabled  set_event        trace                …
18. ftrace
    sudo ls /sys/kernel/debug/tracing/
    (same listing as the previous slide)
19. Available Tracers
    cat available_tracers
    blk mmiotrace function_graph wakeup_dl wakeup_rt wakeup function nop
20. function tracer
    sudo ls /sys/kernel/debug/tracing/
    available_events            instances        set_event_pid        trace_clock
    available_filter_functions  kprobe_events    set_ftrace_filter    trace_marker
    available_tracers           kprobe_profile   set_ftrace_notrace   trace_options
    buffer_size_kb              max_graph_depth  set_ftrace_pid       trace_pipe
    buffer_total_size_kb        options          set_graph_function   trace_stat
    function_profile_enabled    set_event        trace
21. function tracer
    echo try_to_wake_up > set_ftrace_filter
    kworker/2:1-74 [002] d... 1025.046785: try_to_wake_up (target?)
22. function tracer
    echo 1 > options/func_stack_trace
    kworker/2:1-74 [002] d... 1025.046787: <stack trace>
    => pollwake
    => __wake_up_common
    => __wake_up
    => n_tty_receive_buf_common
    => n_tty_receive_buf2
    …
23. nop tracer
    echo sched:sched_waking > set_event
    writebench.o-29118 [000] 2694376.974316: sched_waking: comm=run.sh pid=29115 prio=120 target_cpu=001
24. eBPF
    ● Extended Berkeley Packet Filter
    ● Subset of C that compiles to virtual machine bytecode via llvm
    ● Verifiably safe, no loops
    ● Extended from two registers to ten
    ● Originally used in network filters
    ● Easy loading and compilation with bcc (https://github.com/iovisor/bcc)
25. No more tree walker
    McCanne, S., & Jacobson, V. (1993, January). The BSD Packet Filter: A New Architecture for User-level Packet Capture. In USENIX Winter (Vol. 46).
26. bpf source code
    ● Attaching to kprobe
      SEC("kprobe/try_to_wake_up")
      int bpf_prog1(struct pt_regs *ctx)
    ● Reading pointer value
      bpf_probe_read(&ret, sizeof(ret), (void *)(*bp+8));
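Putting those fragments together, a sketch of how the complete BPF object could look in the samples/bpf style on a >= 4.4 kernel, aggregating wakeups in a map instead of printing each event; the map name, the PT_REGS_FP macro and the helper header are assumptions of this sketch rather than the author's exact program:

    #include <linux/ptrace.h>
    #include <linux/version.h>
    #include <uapi/linux/bpf.h>
    #include "bpf_helpers.h"          /* SEC(), map/helper stubs from samples/bpf */

    struct bpf_map_def SEC("maps") wakeup_counts = {
            .type        = BPF_MAP_TYPE_HASH,
            .key_size    = sizeof(long),      /* caller address used as aggregation key */
            .value_size  = sizeof(long),      /* number of wakeups seen */
            .max_entries = 4096,
    };

    SEC("kprobe/try_to_wake_up")
    int bpf_prog1(struct pt_regs *ctx)
    {
            long ret = 0, one = 1, *count;
            long bp = PT_REGS_FP(ctx);        /* frame pointer at the probe point */

            /* one frame-pointer hop: a return address sits at bp + 8 (slide 13) */
            bpf_probe_read(&ret, sizeof(ret), (void *)(bp + 8));

            /* aggregate in the kernel rather than printing every event */
            count = bpf_map_lookup_elem(&wakeup_counts, &ret);
            if (count)
                    *count += 1;
            else
                    bpf_map_update_elem(&wakeup_counts, &ret, &one, BPF_ANY);
            return 0;
    }

    char _license[] SEC("license") = "GPL";
    u32 _version SEC("version") = LINUX_VERSION_CODE;

User space then reads wakeup_counts at analysis time, which is the in-kernel aggregation idea from slide 14.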
27. Micro benchmark
    ● 10 million syscalls, each writing 512 bytes of zeros to /dev/null
      ○ Consistent measure of syscall overhead
      ○ Baseline ~92 nsec
    ● Try it yourself
      ○ dd if=/dev/zero of=/dev/null bs=512 count=1000k
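For completeness, an illustrative C equivalent of that benchmark loop (not the author's harness), timing 10 million write() calls and reporting the mean cost per syscall:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define ITERS 10000000UL   /* 10 million write() syscalls */

    int main(void)
    {
            char buf[512];
            struct timespec t0, t1;
            unsigned long i;
            int fd = open("/dev/null", O_WRONLY);

            if (fd < 0)
                    return 1;
            memset(buf, 0, sizeof(buf));     /* 512 bytes of zeros */

            clock_gettime(CLOCK_MONOTONIC, &t0);
            for (i = 0; i < ITERS; i++)
                    write(fd, buf, sizeof(buf));
            clock_gettime(CLOCK_MONOTONIC, &t1);

            /* average cost per syscall in nanoseconds */
            printf("%.1f ns/syscall\n",
                   ((t1.tv_sec - t0.tv_sec) * 1e9 +
                    (t1.tv_nsec - t0.tv_nsec)) / (double)ITERS);
            close(fd);
            return 0;
    }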
28. Results
29. Conclusion
    ● Optimized kprobe outperforms the other implementations by 3-6x
      ○ with a combined overhead of slightly under 400 ns per syscall
    ● In-kernel aggregation could further reduce overhead to 200 ns
      ○ by deferring the printing cost to analysis time
    ● With sub-microsecond overhead, tracing can be enabled on production systems
      without sampling, to capture hard-to-reproduce bugs
      ○ lock contention
      ○ I/O latency
30. Future work
    ● Implement in-kernel aggregation with persistent tracing
    ● Integrate with ARM devices by applying the kprobe patchset (https://lwn.net/Articles/676434/)
    ● Investigate perf integration with eBPF
      (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f45b1d49073541947193bd7dac9e904142576aa)
31. Thank you!
