SOLARIS™ DTRACE INTRODUCTION

Satyajit Tripathi ISV-Engineering,

1 Audience
• Chief Executives (CXO, CTO)
• Software Developer
• System Administrator
• Support and Service Engineer
• Rest Of The World

2 Agenda
• Troubleshooting Scenarios
• What is Solaris™ DTrace
• DTrace Architecture and Framework
• DTrace Components (Need-To-Know)
• Additional Privileges for DTrace (How-To)
• Useful Links and Resources
• Exercises with DTrace (Touch 'n' Feel)
• Open Forum

3 Troubleshooting Scenarios
• Fatal, Reproducible, and Test Case available
  > Traditional debugging techniques. Provide a patch.
• Fatal, Non-Reproducible
  > Crash dump, analyze the core. Use mdb(1), dbx(1)
  > Postmortem: speculate from a static snapshot. Iterative process of providing debug binaries, studying log files, etc.
• Non-Fatal, Transient Failure and Unacceptable QoS
  > Traditional techniques using truss(1), mdb(1)
  > Challenging when detecting a dynamic problem requires invasive tracing, embedded debug code, or data aggregation in production
  > Here, DTrace can be the Savior on Solaris 10 or later

4 What Is Solaris™ DTrace
• Dynamic tracing framework for Solaris 10
• Unique in its focus on Production systems, and in its integration of User-level and Kernel-level tracing
• Allows tracing of arbitrary data and arbitrary expressions using the D Language
• Near-zero overhead on System Performance; probes that are not enabled cost nothing
• 50,000+ Probes available on a System by default
• No re-compilation of the application required
• Safe, Powerful, Flexible, and Easy-to-Use
• DTrace is Open Source under the CDDL License

5 Dynamic Tracing (DTrace)
• A Comprehensive Framework for the Solaris™ Environment
  > To Implement New DTrace Providers
  > To Implement fully Configurable DTrace Probes
  > To Implement New DTrace Consumers and Data Display
• Observability with DTrace
  > Aggregate arbitrary behavior of the OS and User programs
  > Dynamically Enable and Manage Probes
  > Dynamically Associate Predicates and Actions with Probes
  > Dynamically Manage Trace Buffers and Probe Overhead
  > Examine a Live Production System or a Crash Dump

6 DTrace Architecture and Framework
• Need To Know
  > D Language
  > DTrace Probes
  > DTrace Provider
  > Actions for Probes
  > DTrace Consumer


7 D Language
• The 'D' Language is like 'C', with a program structure similar to awk(1)
• Based on Actions and Predicates
• Access to Global, Thread-local and Probe-local (clause-local) variables
• Rich built-in Variable set. Associative Arrays
• Access to global Kernel variables and structures
• Support for ANSI-C Operators and Data Aggregation
• Example : Trace the pid of the process "date" when it calls open(2)

#!/usr/sbin/dtrace -s
syscall::open:entry        /* Probe Name: entry point of open */
/execname == "date"/       /* Predicate: process is named "date" */
{
    trace(pid);            /* Action: print the process ID */
}
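The thread-local variables and data aggregation listed on this slide can be sketched together in one short script. This is an illustrative example rather than part of the original deck: the variable name self->start and the aggregation name @latency are made up here, and the script assumes the syscall provider is available and that you have the privileges to use it.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Remember, per thread, when read(2) was entered. */
syscall::read:entry
{
    self->start = timestamp;          /* thread-local variable */
}

/* On return, record the elapsed nanoseconds, keyed by process name. */
syscall::read:return
/self->start/
{
    @latency[execname] = quantize(timestamp - self->start);
    self->start = 0;                  /* release the thread-local storage */
}
```

Saved under a name of your choice and run with dtrace -s, it should print one power-of-two latency distribution per process name when you stop it with Ctrl-C.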

8 DTrace Probe
• Defined by a point of Instrumentation within the OS Kernel
• A Probe has a Name, identifies the Function and Module it instruments, and is accessible through a Provider
• Four Attributes (Provider, Module, Function and Name) form a tuple that uniquely identifies a Probe

probe description (provider:module:function:name)
/ predicate /
{
    action statements
}
• Each Probe is Assigned an Integer Identifier
• Example : Type command dtrace -l
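To see the four-part tuple and the integer identifier from inside a running probe, a minimal sketch can print the built-in variables id, probeprov, probemod, probefunc and probename. The choice of syscall::open*:entry is only for illustration, and for syscall probes the module field may print as empty.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Print the identity of each probe as it fires. */
syscall::open*:entry
{
    printf("id=%d  %s:%s:%s:%s\n",
        id, probeprov, probemod, probefunc, probename);
}
```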

9 DTrace Provider
• A Methodology for Instrumenting Probes
• A Provider registers its Probes in the system
• DTrace informs the Provider when to Enable a Probe, and the Provider transfers Control to DTrace when the Probe fires
• Verify Providers : dtrace -l | more

   ID   PROVIDER   MODULE   FUNCTION   NAME
    1     dtrace                       BEGIN
    2     dtrace                       END
    3     dtrace                       ERROR
...Output truncated

• Example of Providers :
  > syscall : Traces syscalls
  > fbt : Traces in-kernel functions
  > pid : Traces application functions
  > sched : Traces scheduling events
  > io : Traces system IO
  > proc : Process/Thread creation, termination, signals
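As a hedged illustration of the pid provider from the list above, the sketch below traces malloc() calls in one user process. The aggregation name and key string are made up for this example, the module field is left blank so that any malloc in the process matches, and it assumes the target process actually calls malloc.

```dtrace
#!/usr/sbin/dtrace -s

/*
 * $target expands to the PID of the process started with -c
 * or attached to with -p on the dtrace command line.
 */
pid$target::malloc:entry
{
    /* arg0 is malloc's first argument: the requested size in bytes. */
    @sizes["malloc request size (bytes)"] = quantize(arg0);
}
```

It would be run as dtrace -s followed by the script name and either -c with a command or -p with a PID, which supplies the value of $target.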

10 Actions for Probes
• Actions are taken when a Probe is triggered
• Actions are Programmable (Very useful feature)
• Most Actions record a specified State of the system
• Expressions in the D Language are acceptable as Action Parameter(s)
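A small sketch of several actions in one clause, with D expressions passed as action parameters. The process name "bash" and the stack depth of 5 are arbitrary choices for illustration.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

syscall::write:entry
/execname == "bash"/
{
    /* For write(2), arg0 is the file descriptor and arg2 the byte count. */
    printf("%s[%d] write(fd=%d, %d bytes)\n", execname, pid, arg0, arg2);

    /* Record up to five frames of the user-level stack. */
    ustack(5);
}
```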

11 DTrace Consumer
• A Process that interacts with DTrace; it can be driven from the Command line or by a Script (script.d)
• DTrace handles multiplexing and Supports concurrent Consumers
• dtrace(1M) is a DTrace Consumer
• D Script Construction: Probe description, Predicate and Action

Create Filename : myscript.d as below

#!/usr/sbin/dtrace -s
syscall::write:entry
/execname == "bash"/
{
    printf("bash with pid %d called write system call\n", pid);
}

Run the myscript.d as below
# dtrace -s myscript.d
dtrace: script 'myscript.d' matched 1 probe

12 Additional Privileges for DTrace
• All Privileges for the superuser root
• Selective Privileges for non-root users
• Privilege Groups provide Selective Access
  > dtrace_user : Providers syscall and profile
  > dtrace_proc : Providers pid and usdt
  > dtrace_kernel : Provider fbt and Kernel data structures
  > proc_owner : Probe other users' Processes whose privileges are a sub-set of your own

NOTE : You can observe only those processes for which you have the privilege, which makes it Safe!

13 Privileges How-To
• Temporary Privilege to a running Process with PID 869
  > Command ppriv -s A+dtrace_user 869
• Permanent Privilege to a User Account
  > Modify the file /etc/user_attr as jack::::defaultpriv=basic,dtrace_user
  > Or use the Command usermod -K defaultpriv=basic,dtrace_user jack
• Verify assigned Privileges
  > Command ppriv $$

Output
869: bash
flags =
    E: all
    I: basic,dtrace_user
    P: all
    L: all

14 Useful Links and Resources
• Solaris Dynamic Tracing Guide (docs.sun.com)
• BigAdmin System Administration Portal : DTrace
• SDN Member Access, How To Use DTrace from a Solaris 10 System (Excellent head start)
• Advanced DTrace Tips, Tricks & Gotchas (Preso)
• NetBeans or Sun Studio DTrace GUI Plugin (Refer)
• DTrace Toolkit (Download)
• Refer FAQ. Contact Sun PDS [email protected]
• I've started blogging at http://blogs.sun.com/stripathi

15 Exercise 1: System Calls
More Advanced than Traditional Tools
• Look for System Call Errors
# dtrace -n 'syscall:::return /errno/ {trace(execname);trace(pid);trace(errno)}'

  0    318    pollsys:return   Xorg              408    4
  0     12       read:return   gnome-terminal    660   11
  0     12       read:return   Xorg              408   11
  0     12       read:return   nautilus          650   11
...Output truncated

• Why not use truss
  > truss is much easier to use, and provides better information
  > truss will be Invasive for Production, and may not be suitable
  > truss only looks at one Process
  > DTrace looks at System-wide Events
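On a busy system the one-liner above prints a line per failing call. A possible variation, sketched here with an aggregation name chosen only for illustration, counts the failures instead and prints a summary when tracing stops.

```dtrace
#!/usr/sbin/dtrace -s

/* Count failed system calls by process name, syscall name and errno. */
syscall:::return
/errno != 0/
{
    @failures[execname, probefunc, errno] = count();
}
```

Pressing Ctrl-C ends the trace and prints one row per (process, syscall, errno) combination, sorted by count.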

16 Exercise 2: Short-lived Processes
Download the Latest DTrace Toolkit
• Watch for Short-lived Processes started by the Application. Use the ready scripts in DTraceToolkit-0.99.tar.gz
# dtrace -s DTraceToolkit/Proc/shortlived.d -c

dtrace: script 'shortlived.d' matched 12 probes
CPU     ID                    FUNCTION:NAME
  0      1                           :BEGIN Tracing... Hit Ctrl-C to stop.
^C
  1      2                             :END
short lived processes:      0.052 secs
total sample duration:     11.837 secs

Total time by process name,
           firefox     0 ms
    run-mozilla.sh     1 ms
               pwd     2 ms
               awk     3 ms
                ls     3 ms
           dirname     8 ms
          basename     9 ms
               sed    13 ms

Total time by PPID,
              6417     0 ms
              6419     0 ms
[1]+ Done firefox
...Output truncated

17 Exercise 3: Who's Doing I/O
Use Default DTrace Demo Scripts or Customize
• To find out who is doing I/O on the System, use the ready script /usr/demo/dtrace/whoio.d

# dtrace -s whoio.d
^C
DEVICE   APP             PID      BYTES
sd0      picld           168       1280
sd0      fsflush           3       3072
sd0      sched             0     295936
nfs2     sched             0     786432
sd0      soffice.bin    6070    2242048

18 Exercise 4: Syscalls by the Application
Use Options -c or -p for PID Providers or the $target Macro in Scripts
• To find which Syscalls are made by the Application, use the ready script /usr/demo/dtrace/syscall.d
# dtrace -s syscall.d -c /usr/bin/ls

dtrace: script 'syscall.d' matched 232 probes
applicat.d   howlong.d    pri.d      spec.d       whoio.d
badopen.d    index.html   printa.d   specopen.d   whopreempt.d
dtrace: pid 1053 has exited

  fcntl          1
  fsat           1
  getpid         1
  getrlimit      1
  gtime          1
  lstat64        1
  rexit          1
  stat           1
  close          2
  fstat64        2
  getdents64     2
  mmap           2
  munmap         2
  setcontext     2
  ioctl          3
  brk            6
  write         21

19 Exercise 5: File system Workload
Use Pragma options, Macros and Functions like C Programs
• Review the script /usr/demo/dtrace/io.d
# cat io.d

#!/usr/sbin/dtrace -s
#pragma D option quiet

BEGIN
{
    printf("%-10s %10s %10s %3s %s\n",
        "Device", "Program", "I/O Size", "R/W", "Path");
}

io:::start
{
    printf("%-10s %10s %10d %3s %s\n",
        args[1]->dev_statname, execname, args[0]->b_bcount,
        args[0]->b_flags & B_READ ? "R" : "W", args[2]->fi_pathname);
    @[execname, pid, args[2]->fi_pathname] = sum(args[0]->b_bcount);
}

END
{
    printf("%-10s %8s %10s %s\n", "Program", "PID", "Total", "Path");
    printa("%-10s %8d %10@d %s\n", @);
}

20 Exercise contd: File system Workload
Dynamic Tracing and Monitoring
• Use the ready script /usr/demo/dtrace/io.d
# ./io.d
On a different terminal, run mkfile 2m /demo/foo

Device        Program   I/O Size R/W Path
ramdisk0       mkfile       8192   R
ramdisk0       mkfile       8192   W /demo/foo
ramdisk0       mkfile       8192   W /demo/foo
...Output truncated
^C
Program         PID      Total Path
mkfile        13262       8192
fsflush           3      25088
sched             0      33792
mkfile        13262    2105344 /demo/foo

21 Exercise 6: Process Opening Files
Command line to Enable Probes and Format Output(s)
• Shows the files opened, by Process name
# dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }'

dtrace: description 'syscall::open*:entry ' matched 2 probes

CPU     ID                    FUNCTION:NAME
  0   2596                       open:entry df /var/ld/ld.config
  0   2596                       open:entry df /usr/lib/libcmd.so.1
  0   2596                       open:entry df /usr/lib/libc.so.1
  0   2596                       open:entry df /etc/mnttab
^C

22 Exercise 7: Disk Size by Process
Using System Probe to Trace System I/O
• To record Actual disk I/O Requests
• The Application may be doing a lot more I/O, which may get absorbed by the File system cache and not result in Actual Disk I/O

# dtrace -n 'io:::start { printf("%d %s %d",pid,execname,args[0]->b_bcount); }'

dtrace: description 'io:::start ' matched 6 probes
CPU     ID                    FUNCTION:NAME
  0  49944            bdev_strategy:start 3 fsflush 512
  0  49944            bdev_strategy:start 0 sched 512
  0  49944            bdev_strategy:start 0 sched 1024

23 Exercise 8: Write Size by Processes
Use Data Manipulation and Display
• Identify the Write Size Distribution by Process on the System
# dtrace -n 'sysinfo:::writech { @dist[execname] = quantize(arg0); }'
^C
  in.telnetd
           value  ------------- Distribution ------------- count
               1 |                                         0
               2 |@@@@@@@@@@@@@@@@@                        8
               4 |@@                                       1
               8 |@@@@@@@@@@@@@                            6
              16 |                                         0
              32 |                                         0
              64 |                                         0
             128 |                                         0
             256 |@@@@@@@@                                 4
  svc.configd
           value  ------------- Distribution ------------- count
...Output truncated

24 Exercise 9: Disk Size Aggregation
Use Data Aggregation Functions
• Using Aggregation for a Summarized View
# dtrace -n 'io:::start { @size[execname] = quantize(args[0]->b_bcount); }'

dtrace: description 'io:::start ' matched 6 probes
^C

           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@                                       37
            2048 |@@@@@@@                                  114
            4096 |@@@@@@@                                  116
            8192 |@@@@@@@@@@@@@@@@@                        286
           16384 |@@                                       33
           32768 |@@@@@                                    87
           65536 |                                         0
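quantize() is one of several aggregating functions. As a hedged sketch, the script below applies count(), sum() and max() to the same io:::start probe; the aggregation names are chosen only for this example.

```dtrace
#!/usr/sbin/dtrace -s

io:::start
{
    @requests[execname] = count();                    /* number of I/O requests */
    @bytes[execname]    = sum(args[0]->b_bcount);     /* total bytes requested */
    @largest[execname]  = max(args[0]->b_bcount);     /* largest single request */
}
```

When tracing stops, each aggregation is printed as its own table keyed by process name.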

25 Other Performance Tools

• Process Stats
  cputrack : per-processor hardware counter
  pargs : process arguments
  pflags : process flags
  pcred : process credentials
  pldd : process library dependency
  psig : process signal disposition
  pstack : process stack dump
  pmap : process memory map
  pfiles : open files and names
  prstat : process statistics
  ptree : process tree
  ptime : process micro-state times
  pwdx : process working directory
• Process Tracing & Debugging
  abitrace : trace ABI interfaces
  dtrace : trace the world
  mdb : debug/control processes
  truss : trace functions, system calls
• System Stats
  acctcom : process accounting
  busstat : Bus hardware counters
  cpustat : CPU hardware counters
  iostat : IO & NFS statistics
  kstat : display kernel statistics
  mpstat : processor statistics
  netstat : network statistics
  nfsstat : nfs server stats
  sar : kitchen sink utility
  vmstat : virtual memory stats
• Process Control
  pgrep : grep for processes
  pkill : kill processes list
  pstop : stop processes
  prun : start processes
  prctl : view/set process resources
  pwait : wait for process
  preap : reap a zombie process
• Kernel Tracing & Debugging
  dtrace : trace and monitor kernel
  lockstat : monitor locking statistics
  lockstat -k : profile kernel
  mdb : debug live and kernel cores

26 SOLARIS™ DTRACE

Satyajit Tripathi http://blogs.sun.com/stripathi
