Linux Performance Profiling Tool

Minsoo Ryu
Real-Time Computing and Communications Lab., Hanyang University
http://rtcc.hanyang.ac.kr

Outline
Example source
Profiling
- Perf
- Gprof
- Oprofile
Tracing
- Strace
- Ltrace
- Ftrace

Example Source

UDP program
- Server
  • ./UDP_server [port]
- Client1 / client2
  • ./UDP_client [ip address] [port] [user name]

Figure: Procedure of UDP socket programming (server, client, data send/receive, exit)

Profiling

Perf

What is perf
- Performance counters for Linux
- Profilers in general collect data through a variety of techniques
  • Hardware interrupts, code instrumentation, instruction set simulation, operating system hooks, performance counters
- Perf relies on information from the CPU's Performance Monitoring Unit (PMU)
  • This is why the user-level perf tool is included in the kernel source tree
  • Perf is closely tied to the kernel ABI

Perf install
- sudo apt-get install linux-tools-common
- sudo apt-get install linux-tools-3.19.0-25-generic
  • linux-cloud-tools-3.19.0-25-generic

Usage of perf
- perf <command> [options]

  Command    Description
  list       List all symbolic event types
  stat       Run a command and gather performance counter statistics
  top        Generate and display a performance counter profile
  record     Run a command and record its profile into perf.data
  report     Read perf.data and display the profile
  annotate   Read perf.data and display annotated code
  diff       Read two perf.data files and display the differential profile

- A modifier can be appended to a hardware event to limit its scope
  • u: count events at user level
  • k: count events at kernel level
  • h: count events in the hypervisor
  • H: count events on the host machine
  • G: count events in the guest machine

Perf stat
- task-clock
  • 5260.179993 ms of CPU time were used; on average 0.134 CPUs were utilized
- context-switches
  • 1,592 context switches (0.303 K/sec)
- page-faults
  • 98 page faults (0.019 K/sec)

Perf top
- Analyzes the system at run time
  • Similar to the Linux top command
- Provides real-time system monitoring

    # perf top

Perf record
- Records events
- Recorded data is saved as perf.data by default
- Use cases
  • Record a specific command in detail
  • Analyze a suspicious process in detail
  • Determine the cause of a process's poor performance

    # perf record [option] [execute file]
    # ls
    perf.data

Perf report
- Displays the data recorded in the perf.data file

    # perf report

Perf annotate
- Reads perf.data and displays annotated disassembled code
- Use cases
  • Identify the time-consuming parts of the source code

    # perf annotate

Perf diff
- Reads two perf.data files and displays the difference between the two profiles
- Use cases
  • Compare an updated perf.data with an older one

    # perf diff

Assignment 1
Analyze the client1 sample application using perf
- Using perf stat
- Using perf record / report
- Using perf top
Submit the recorded perf.data file and a screenshot
Submit screenshots of perf stat and perf top

Gprof

What is gprof
- Gprof is distributed with the GNU toolchain (binutils package)
- Shows which functions account for most of the load

Gprof package install
- sudo apt-get install binutils
  • Binutils dependencies: Bash, coreutils, diffutils, gcc, gettext, glibc, grep, make, perl, sed, texinfo

How gprof works
- Gprof provides statistical information
  • Records the time each function takes from entry to exit
      - The number of calls to every function during program execution is recorded
      - -pg: compiler option that inserts the profiling function (mcount)

Internal operating concept
- Timer
  • Samples the program counter (PC) at a fixed interval (typically 10 ms)
  • Each tick raises a SIGPROF signal
      - setitimer is called before the main function
  • The signal handler increments the counter for the sampled PC
- Enter/exit hooking
  • The mcount function is called on each function call
  • Builds the call graph
      - Uses the PC values before and after a function call
      - Records the exact function calls

Gprof profile categories
- Flat profile
  • Shows the CPU time used by each function and the number of times it was called
  • Summarizes the overall profiling information
  • Helps decide whether a function needs to be modified to improve performance
      - Shows the execution time and the call count of each function in the program
- Call graph
  • Suggests which function calls could be eliminated or replaced with more efficient alternatives
  • Shows the relationships between functions and can expose hidden bugs
  • Helps decide which code paths to optimize after inspecting them
  • Shows the details
      - Call frequency, time spent in subroutines, and so on

How to run gprof

    # gcc -o a.out a.c -pg
    # ./a.out
    # ls
    gmon.out
    # gprof a.out gmon.out

  • -pg: adds the profiling function to each function
- Additional options
  • -l: line-by-line timing of the source code
  • -l -A -x: print the source code annotated with line-by-line timing
  • -F: print the call graph of a particular function

Gprof flat profile
- The slow function accounts for most of the time
- Function f
  • Called 1 time
  • 3.26 milliseconds per call on average
- Function g
  • Called 1 time
  • 13.02 milliseconds per call on average

Gprof contents
- % time
  • The percentage of the total running time of the program used by this function
- Cumulative seconds
  • A running sum of the number of seconds accounted for by this function and those listed above it
- Self seconds
  • The number of seconds accounted for by this function alone
      - This is the major sort for this listing
- Calls
  • The number of times this function was invoked
      - Shown if this function is profiled, else blank
- Self ms/call
  • The average number of milliseconds spent in this function per call
      - Shown if this function is profiled, else blank
- Total ms/call
  • The average number of milliseconds spent in this function and its children per call
      - Shown if this function is profiled, else blank
- Name
  • The name of the function; the index shows the location of the function in the gprof listing
      - If the index is in parentheses, it shows where the function would appear in the gprof listing if it were printed
      - This is the minor sort for this listing

Gprof call graph profile

The meaning of each field in a child row
- When the current function calls the child function
  • Self: the time spent in the child function itself that is attributed to the current function
  • Children: the time spent in the child's own children that is attributed to the current function
  • Called: the number of times the current function called the child / the total number of times the child was called
      - Recursive calls are not included in the total number of calls to the child
  • Name: the name of the child function

The meaning of each field in a parent row
- When the parent function calls the current function
  • Self: the time spent in the current function itself that is attributed to this parent
  • Children: the time spent in the children of the current function that is attributed to this parent
  • Called: the number of times this parent called the current function / the total number of times the current function was called
      - Recursion
