Debugging and Profiling on Intel® Xeon Phi™
PRACE Summer School, CINECA, 8-11 July 2013
Hans Pabst, July 2013
Software and Services Group, Intel Corporation

Agenda
Debugging
• Compiler Debug Features
• GNU* Project Debugger
• Intel® Inspector
Profiling
• Compiler Profiling Features
• Intel® VTune™ Amplifier
Demonstration

Copyright© 2013, Intel Corporation. All rights reserved. 10.07.2013
*Other brands and names are the property of their respective owners.

Compiler Debug Features
Static Analysis (SA): icc -diag-enable scn
• Customize analysis level (and other adjustments)
• Textual and Inspector-based reports
• Issue tracking via Inspector GUI
Pointer Checker (PL): icc -check-pointers=rw
• Further option adjustments possible
• No ABI changes despite the added bounds information
• Intrinsics / API for custom memory allocation
• Rigorous checks; failure behavior adjustable
• Debugger integration

Intel® Inspector: Static Analysis
Analysis: 250 error types
• Incorrect directives
• Security errors
Reports and collaboration
• Choose your priority:
  - Minimize false errors
  - Maximize error detection
• Hierarchical navigation
• Share comments with team
Code Complexity Metrics
• Find code likely to be less reliable
The GNU* Project Debugger (GDB)
An enhanced build of GNU* GDB 7.5 is included in the Intel® Manycore Platform Software Stack (Intel® MPSS)
• http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss
• Source code available via installation option
• Released back to the GNU* community
• Native and cross/remote debugger versions
• C/C++ support, improved Fortran support
• Intel® Parallel Debugger Extension

Eclipse* Integration
Eclipse* IDE integration
• Seamless debugging of host and coprocessor
• Simultaneous view of host and coprocessor threads
• Supports offload language extensions (auto-attach to offload process)
• Supports multiple coprocessor cards
• Supports C/C++ and Fortran
Simultaneous and seamless thread debugging.

Eclipse* Integration (cont.)
Install Eclipse IDE for C/C++ Developers
• Available from http://www.eclipse.org
Integrate Intel® Compilers and Intel® Xeon Phi™
• Start Eclipse, select Help > Install New Software
• Uncheck "Group items by category"
• Use Add, then Local, and select the folder:
    /opt/intel/composerxe/eclipse_support/cdt8.0/eclipse
• Click Select All and Finish.
Intel® Xeon Phi™ Coprocessor Architecture Features
List all new vector and mask registers
    (gdb) info registers zmm
    k0     0x0  0
    ⋮
    zmm31  {v16_float = {0x0 <repeats 16 times>},
            v8_double = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
            v64_int8 = {0x0 <repeats 64 times>},
            v32_int16 = {0x0 <repeats 32 times>},
            v16_int32 = {0x0 <repeats 16 times>},
            v8_int64 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
            v4_uint128 = {0x0, 0x0, 0x0, 0x0}}
Disassemble instructions
    (gdb) disassemble $pc, +10
    Dump of assembler code from 0x11 to 0x24:
    0x0000000000000011 <foobar+17>: vpackstorelps %zmm0,-0x10(%rbp){%k1}
    0x0000000000000018 <foobar+24>: vbroadcastss -0x10(%rbp),%zmm0

GDB: Intel® Xeon Phi™ Coprocessor
Debug server and host debugger
    /usr/linux-k1om-4.7/linux-k1om/usr/bin/gdbserver
    /usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gdb
Native debugger (no Parallel DeBug eXtension)
    /usr/linux-k1om-4.7/linux-k1om/usr/bin/gdb
Host debugger
    /opt/intel/mic/bin/gdb

GDB: Native Debugging
Run GDB* on the Intel® Xeon Phi™ Coprocessor
    ssh -t mic0 /path/on/mic/to/gdb
To attach to a running application via the process ID
    (gdb) shell pidof my_application
    42
    (gdb) attach 42
To run an application directly from GDB*
    (gdb) file /target/path/to/application
    (gdb) start
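
Attaching by process ID is easier when the application pauses early and reports its own PID. The following is a minimal sketch of such a hook, not part of the original slides: the names `debugger_wait` and `wait_for_debugger` are hypothetical, and the one-second polling interval is an arbitrary choice.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag: stays 1 until a debugger (or the program) clears it. */
volatile int debugger_wait = 1;

/* Print the PID, then poll the flag once per second until it is cleared
 * or max_seconds have elapsed (max_seconds == 0 means wait indefinitely).
 * Returns the number of seconds spent waiting. */
unsigned wait_for_debugger(unsigned max_seconds)
{
    unsigned waited = 0;

    fprintf(stderr, "pid %ld: waiting for debugger\n", (long)getpid());
    while (debugger_wait && (max_seconds == 0 || waited < max_seconds)) {
        sleep(1);
        ++waited;
    }
    return waited;
}
```

Calling `wait_for_debugger(0)` at the start of `main` keeps the process alive until it is released. After `(gdb) attach <pid>`, the standard GDB commands `set var debugger_wait = 0` and `continue` should let execution proceed.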
GDB: Remote Debugging
Run GDB* on your local host
    /usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gdb
Start gdbserver on the coprocessor
- To remote debug using standard I/O redirection
    (gdb) target extended-remote | ssh -T mic0 gdbserver --multi -
- To set a custom environment, replace gdbserver by e.g.:
    env LD_LIBRARY_PATH=/tmp:$LD_LIBRARY_PATH gdbserver
Attach to a running application via process ID (PID)
    (gdb) file /local/path/to/application
    (gdb) attach <remote-pid>
Run an application directly
    (gdb) file /local/path/to/application
    (gdb) set remote exec-file /target/path/to/application

GDB: Offload Debugging
Stepping into an offloaded code section on the host does not "switch" to a debugger on the target
• No debug synchronization (host / coprocessor)
• GUI integration will provide this "glue" logic; see "Eclipse* Integration"
Debugging offloaded code via command line
1. Wait within the offloaded code section
    volatile int loop = 1;
    do {
        volatile int a = 1;   /* spin until the debugger sets loop to 0 */
    } while (loop);
2. Attach to the offload process on the coprocessor via PID
Note: cross-compiling the entire application and debugging the previously offloaded section natively might be easier.

GDB: Detect and Debug Data Races
Data race: a data race occurs when two ordinary simultaneous accesses to the same scalar, at least one of which is a write, execute in different parallel regions. [Hans-J. Boehm, WG21/N2480]
Tools are needed to detect and debug data races*:
• GNU* GDB with parallel debug extension
• Intel® Inspector
* Remember: single-threaded (sequential) execution cannot reproduce data races, whereas multiple threads can, even on a single core.
GDB: Data Race Symptoms
Varying or wrong results*
• One of the possible (but different) ways to interleave the instructions between parallel sections reproduces the actual result ("sequential consistency")
Memory corruption or crash
• An index (or pointer data) is subject to a data race, e.g., a bookkeeping structure is concurrently modified and left in an inconsistent state (mix of different updates)
* Absence of data races is a prerequisite for reproducible numerical results, but it is not sufficient: e.g., a deterministic execution order is needed as well.

GDB: Data Race Example
Given: global variables
    a = 1
    b = 2
Given: two threads
    T1: x = a + b
    T2: b = 42
Value of x depends on execution order:
    If T1 runs before T2: x = 3
    If T2 runs before T1: x = 43
Data race, e.g., "read-write": T2's update was not visible to T1's calculation

GDB: Data Race Detection
How to detect data races?
Compile with the Intel Compiler:
    icpc -debug parallel
Debugger breaks when a race has been detected:
    (gdb) pdbx enable
    (gdb) run
    data race detected
    1: write shared, 4 bytes from foo.c:36
    3: read shared, 4 bytes from foo.c:40
    Breakpoint -11, 0x401515 in L_test_..._21 () at foo.c:36
    *var = 42; /* bp.write */
Execution stops in the context of the access that triggers the race condition
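
The two-thread example above (a = 1, b = 2) can be sketched in C with POSIX threads. This is an illustrative sketch rather than code from the slides: the function name `run_in_order` is hypothetical, and it deliberately forces each execution order by joining the first thread before starting the second, so that both possible values of x can be observed deterministically.

```c
#include <pthread.h>

/* Shared globals, as in the example: a = 1, b = 2. */
static int a, b, x;

/* T1: x = a + b */
static void *t1(void *arg) { (void)arg; x = a + b; return NULL; }

/* T2: b = 42 */
static void *t2(void *arg) { (void)arg; b = 42; return NULL; }

/* Run T1 and T2 strictly one after the other and return x.
 * t2_first != 0 runs T2 before T1; otherwise T1 runs before T2.
 * Joining the first thread before creating the second imposes the order. */
int run_in_order(int t2_first)
{
    pthread_t first, second;

    a = 1; b = 2; x = 0;
    pthread_create(&first, NULL, t2_first ? t2 : t1, NULL);
    pthread_join(first, NULL);
    pthread_create(&second, NULL, t2_first ? t1 : t2, NULL);
    pthread_join(second, NULL);
    return x;
}
```

Here `run_in_order(0)` corresponds to "T1 runs before T2" and `run_in_order(1)` to "T2 runs before T1". Creating both threads before joining either would remove the imposed order and reintroduce the read-write race on b.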
GDB: Data Race Debugging
Fine-tune detection and analysis via filter sets
• Add filter to selected filter set
    (gdb) pdbx filter line foo.c:36
    (gdb) pdbx filter code 0x40518..0x40524
    (gdb) pdbx filter var shared
    (gdb) pdbx filter data 0x60f48..0x60f50
    (gdb) pdbx filter reads    # read accesses
• Ignore events specified by filters (default behavior)
    (gdb) pdbx fset suppress
• Ignore events not specified by filters
    (gdb) pdbx fset focus
• Get debug command help (pdbx)
    (gdb) help pdbx
Use cases
• Focused debugging, e.g., debug a single symptom
• Limit overhead and control false positives

GDB: Data Race Detection Limitations
Data race detection needs instrumented threading runtimes in order to capture synchronization