
Linux Systems Performance Tracing, Profiling, and Visualization
G. Amadio (EP-SFT)
Nov, 2020

Performance is challenging

Some things that can go wrong:
● Measurement
  ○ Overhead from instrumentation and measurement
  ○ Variable CPU frequency scaling (turbo boost, thermal throttling)
  ○ Missing symbols (JIT, interpreted languages, stripped binaries)
  ○ Broken stack unwinding (deep call stacks, inlining, missing frame pointer)
● Optimization and Tuning
  ○ Concurrency issues (shared resources with hyperthreading, contention)
  ○ Compiler optimizations (exceptions vs vectorization, denormals, dead code)
  ○ Memory alignment, access patterns, fragmentation
  ○ Addressing the wrong issue (optimizing compute for memory bound problem and vice-versa)

A pinch of UNIX wisdom – on handling complexity

Rule 1 You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.

Rule 2 Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.

Rule 3 Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.)

Rule 4 Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.

Rule 5 Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

Rule 6 There is no Rule 6.
from “Notes on C Programming”, by Rob Pike

System Observability Tools

Linux Observability Tools

top – display Linux processes
htop – interactive process viewer
latencytop – tool to visualize system latencies
powertop – power consumption diagnosis tool
mpstat – report processor statistics
iostat – report CPU and I/O statistics
vmstat – report virtual memory statistics
perf stat – report performance counter statistics

Tracing Tools

strace – trace system calls and signals

bash ~ $ strace cat /dev/null
execve("/bin/cat", ["cat", "/dev/null"], 0x7ffc38125548 /* 76 vars */) = 0
brk(NULL) = 0x55b0fade6000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=300584, ...}) = 0
mmap(NULL, 300584, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7efde08c9000
close(3) = 0
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300?\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=18996000, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efde08c7000
mmap(NULL, 1873248, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7efde06fd000
mprotect(0x7efde071f000, 1695744, PROT_NONE) = 0
mmap(0x7efde071f000, 1404928, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7efde071f000
mmap(0x7efde0876000, 286720, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x179000) = 0x7efde0876000
mmap(0x7efde08bd000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7efde08bd000
mmap(0x7efde08c3000, 13664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7efde08c3000
close(3) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efde06fb000
arch_prctl(ARCH_SET_FS, 0x7efde08c85c0) = 0
mprotect(0x7efde08bd000, 16384, PROT_READ) = 0
mprotect(0x55b0f9bd2000, 4096, PROT_READ) = 0
mprotect(0x7efde093e000, 4096, PROT_READ) = 0
munmap(0x7efde08c9000, 300584) = 0
brk(NULL) = 0x55b0fade6000
brk(0x55b0fae07000) = 0x55b0fae07000
openat(AT_FDCWD, "/usr/lib64/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=4566336, ...}) = 0
mmap(NULL, 4566336, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7efde02a0000
close(3) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x12), ...}) = 0
openat(AT_FDCWD, "/dev/null", O_RDONLY) = 3
fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efde08f1000
read(3, "", 131072) = 0
munmap(0x7efde08f1000, 139264) = 0
close(3) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++

bash ~ $ strace -e 'openat' --failed-only -- root.exe -l -q 2>&1 | grep ENOENT
openat(AT_FDCWD, "/home/amadio/.rootrc", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/env.d/gcc/config-x86_64-linux-gnu", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/env.d/gcc/config-x86_64-unknown-linux-gnu", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/redhat-release", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/debian_version", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/SuSE-release", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/new", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/new", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/include/new", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "./Rtypes.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/Rtypes.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/Rtypes.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "./TError.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/TError.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/TError.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/string", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/string", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/include/string", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/cassert", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/cassert", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/include/cassert", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/8.4.0/include/g++-v8/bits/c++config.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/etc//cling/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/root/6.22/include/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/8.4.0/include/g++-v8/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/8.4.0/include/g++-v8/x86_64-pc-linux-gnu/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/8.4.0/include/g++-v8/backward/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/include/assert.h", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

ltrace – library call tracer
uftrace – function (graph) tracer for userspace https://github.com/namhyung/uftrace
bpftrace – the eBPF tracing language & frontend
cpudist – summarize on- and off-CPU time per task

Linux eBPF-based Observability Tools

Measuring Performance

perf – Performance analysis tools for Linux
● Official Linux profiler (source code is part of the kernel itself)
● Both hardware and software based performance monitoring
● Much lower overhead compared with instrumentation-based profiling
● Kernel and user space
● Counting and Sampling
  ○ Counting: count occurrences of a given event (e.g. cache misses)
  ○ Event-based Sampling: a sample is recorded when a threshold of events has occurred
  ○ Time-based Sampling: samples are recorded at a given fixed frequency
  ○ Instruction-based Sampling: processor follows instructions and samples events they create
● Static and Dynamic Tracing
  ○ Static: pre-defined tracepoints in software
  ○ Dynamic: tracepoints created using uprobes (user) or kprobes (kernel)

perf – subcommands

bash ~ $ perf
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

The most commonly used perf commands are:
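
As an aside on the counting and event-based sampling modes listed under "Counting and Sampling": the difference can be illustrated with a toy simulation. This is only a sketch and not how perf is implemented. The trace, the function names (parse, compute, write), and the period of 100 events are all hypothetical values chosen for illustration.

```python
from collections import Counter


def count_events(trace):
    """Counting mode: return only the total number of events, with no attribution."""
    return sum(n for _, n in trace)


def sample_events(trace, period):
    """Event-based sampling: record one sample every `period` events.

    Each sample is attributed to the function that was running when the
    (simulated) event counter reached the threshold.
    """
    samples = Counter()
    pending = 0  # events accumulated since the last sample
    for func, n in trace:
        pending += n
        while pending >= period:
            samples[func] += 1
            pending -= period
    return samples


# Hypothetical trace: (function name, events generated while it ran),
# e.g. cache misses. These numbers are made up for illustration.
TRACE = [("parse", 30), ("compute", 250), ("parse", 20), ("compute", 300), ("write", 10)]

total = count_events(TRACE)
samples = sample_events(TRACE, period=100)

print("total events:", total)
for func, n in samples.most_common():
    print(f"{func}: {n} samples")
```

Counting yields a single exact total, while sampling yields an approximate per-function breakdown. Functions that generate few events, like the hypothetical write step here, may receive no samples at all.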