Intel® Software Development Tools Intel® Parallel Studio Seminar
From Serial to Parallel
Intel® Software Products for HPC
Hubert Haberstock, Technical Consulting Engineer

Software & Services Group, Developer Products Division
Copyright © 2010, Intel Corporation. All rights reserved.
*Other brands and names are the property of their respective owners.

Agenda
09:15 Welcome and opening remarks (Assintel)
09:30 Parallel architecture: hardware development (Intel Italy)
10:00 Parallel Programming, today and tomorrow (Intel)
11:05 From serial to parallel
      Intel High-Performance Tools (Intel)
      Intel Parallel Studio (C. Fiorillo)
13:30 A case study (C. Fiorillo)
14:15 Parallel programming methods and tools (Intel)
15:00 Application optimization (C. Fiorillo)
16:00 Wrap up, Q&A, seminar evaluation

Intel Software Tools - Parallel Design Cycle
From Serial to Parallel, in four phases:
• Architectural Analysis
  – Visualization of applications and the system
• Introducing Parallelism
  – Highly optimizing compilers delivering scalable solutions
• Validating Correctness
  – Detect latent programming to address unique challenges
• Performance Tuning
  – Tune for performance and scalability

Intel® VTune™ Analyzer 9.1
Identifies hard to find performance bottlenecks

"The Intel VTune Performance Analyzer took a multi-day task and turned it into a sub-day task."
Randy Camp, VP, Software R&D, MUSICMATCH Inc.

• Features
  – Tune process or thread parallel code
  – Low overhead sampling
  – Graphical call graph
  – View results on source or assembly
• Applications
  – System-wide Analysis
  – Finding hotspots
  – Tuning libraries, drivers and applications
  – Remote Data Collector for Windows*/Linux*
  – Programming Language and Compiler Independent
  – Supports latest Intel Processors

Windows* Linux* Mac* IA32 Intel64 IA64 Multicore
√ √ √ √ √ √

Intel Software Tools - Parallel Design Cycle
Phases marked "New" on this slide: Introducing Parallelism, Validating Correctness, Performance Tuning.

Intel® C++ & Fortran Compiler Professional Editions 11.1
• The Best C++ & Fortran Development Solutions for Today’s Multi-core World
  – Optimized for latest Intel Processor Architectures including Core i7, Atom
  – Initial implementation of future instruction set extensions
  – Best support for creating multi-threaded applications with performance libraries, OpenMP* and C++ libraries for parallelism
  – Windows*
    – Plug-in compatibility with Microsoft Visual Studio*
    – Compatibility with Microsoft Visual C++* & Compaq Visual Fortran*
    – Standalone version of Intel® Visual Fortran now includes Microsoft Visual Studio
  – Linux*
    – Command line, source and binary compatibility with GCC
    – Integration with Eclipse 4.0/CDT
  – Mac OS* X
    – Command line compatibility with GCC
    – Integrates in XCode development environment

“In certain stand-alone tests, such as linear algebra matrix multiplication, the Intel C++ compiler 10.0 is up to 4 times faster than 9.1, due to improved automatic parallelization and automatic vectorization with “unroll and jam” that fits hand in glove with Intel Core 2 microarchitecture.”
Gunnar Staff & Lars Petter Endresen, SPT Group

Supports                  Windows*  Linux*  Mac*  64-Bit  Multicore  AMD*
Intel® C++ Compiler       √         √       √     √       √          √
Intel® Fortran Compiler   √         √       √     √       √          √

Intel® Math Kernel Library 10.2
Flagship math processing library
• Features
  – Multi-core ready with excellent scaling
  – Highly optimized, extensively threaded math routines for science, engineering and financial applications for maximum performance
  – Automatic runtime processor detection ensures great performance on whatever processor your application is running on
  – Support for C and Fortran
  – Optimizations for latest Intel processors including Core i7 processors

Windows*  Linux*  Mac*  IA32  Intel64  IA64  Multicore
√         √       √     √     √        √     √

Intel® Integrated Performance Primitives (IPP) 6.1
Multicore Power for Multimedia and Data Processing
• Features
  – Rapid Application Development
  – Cross-platform Compatibility & Code Re-Use
  – Highly optimized functions from 15 Domains
    – Images and Video
    – Communications and Signal Processing
    – Data Processing
  – Performance optimizations for latest Intel processors incl. Core i7 and Atom processors

Windows*  Linux*  Mac*  IA32  Intel64  IA64  Multicore
√         √       √     √     √        √     √

Intel® Threading Building Blocks 2.2
Extend C++ for parallelism
• Features
  – A C++ runtime library that uses familiar task patterns, not threads (STL style)
  – A high level abstraction requiring less code for threading without sacrificing performance
  – Appropriately scales to the number of cores available
  – The thread library API is portable across Linux, Windows, or Mac OS platforms
  – Works with all C++ compilers (e.g. Microsoft, GNU and Intel)
  – Auto_partitioner for better parallel algorithms
  – Lambda support to match 11.1 Compiler
  – Open source version available at www.threadingbuildingblocks.org

Windows*  Linux*  Mac*  IA32  Intel64  IA64  Multicore
√         √       √     √     √        √     √

Intel® Threading Building Blocks
High Performance/Scalability
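
The "task patterns, not threads" idea on the TBB slide can be illustrated with a small sketch. This one deliberately uses only the C++ standard library (`std::async`), not the TBB API, so it shows the pattern rather than TBB itself: an index range is split into contiguous chunks, each chunk is processed as a task, and the partial results are combined. The helper name `chunked_sum` and its `chunks` parameter are hypothetical. In TBB the splitting is normally left to the library (for example through `tbb::parallel_for` over a `tbb::blocked_range`) instead of being hand-written as below.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper (not a TBB API): sum `data` by splitting the index
// range into `chunks` contiguous sub-ranges and summing each one as a task.
long long chunked_sum(const std::vector<int>& data, std::size_t chunks) {
    assert(chunks > 0);
    const std::size_t n = data.size();
    std::vector<std::future<long long>> parts;
    parts.reserve(chunks);

    for (std::size_t c = 0; c < chunks; ++c) {
        // Sub-range [begin, end) for task c; the sub-ranges do not overlap
        // and together cover [0, n).
        const auto begin = static_cast<std::ptrdiff_t>(n * c / chunks);
        const auto end = static_cast<std::ptrdiff_t>(n * (c + 1) / chunks);
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            return std::accumulate(data.begin() + begin,
                                   data.begin() + end, 0LL);
        }));
    }

    // Combine the partial sums; get() waits for each task to finish.
    long long total = 0;
    for (auto& p : parts) {
        total += p.get();
    }
    return total;
}
```

Each task only reads from the shared input and writes to its own result, so no locks are needed. The chunk count is fixed by the caller here, whereas a partitioner (such as the Auto_partitioner named on the slide) would choose the granularity at run time.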

Non deterministic Error Sources
– Shared resources require locks
– Locks can
  – "serialize" a program
  – lead to deadlocks

Diagram (time runs downward):
  Shared memory:  X=0
  Thread1:        X=X+1
  Thread2:        X=X+1
  Shared memory:  X=1
  Wrong result (X should be 2)

Intel® Thread Checker v3.1
Confidently pinpoint threading errors

“We couldn’t have gotten the networking up and running as quickly and as efficiently without Thread Checker. Thread Checker is simply an awesome tool and we are not going to develop multi-threaded code without it.”
Doug Service, Dir. of Tech. Dev.
Chris Stark, Software Engineer
Ritual Entertainment

• Features
  – Detects challenging data races and deadlocks
  – Pinpoints errors to the source code line
  – Batch scripts integration for regression test runs
  – Command line interface for Windows and Linux
  – Works on standard debug builds without recompiling
  – Drill down to source code
  – Intel Fortran/C++ Compilers, Microsoft Compilers, GNU C++ Linux Compilers
  – Windows/POSIX* threads
  – VTune/Visual Studio integration (Windows* only)

Windows* Linux* Mac* IA32 Intel64 IA64 Multicore
√ √ √ √ √
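
The lost update from the "Non deterministic Error Sources" diagram (two threads each executing X=X+1 and ending with X=1) can be sketched in standard C++. A real data race is undefined behaviour and does not fail reliably, so the first function below only replays the bad interleaving sequentially. The second function runs two actual threads and protects the increment with a lock. The function names are illustrative and do not come from the slides.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Deterministic replay of the interleaving in the diagram: both "threads"
// read X before either one writes it back, so one increment is lost.
int lost_update_replay() {
    int x = 0;
    const int t1_read = x;  // Thread1 reads X (0)
    const int t2_read = x;  // Thread2 reads X (0)
    x = t1_read + 1;        // Thread1 writes 1
    x = t2_read + 1;        // Thread2 writes 1, overwriting Thread1's update
    return x;
}

// The same two increments on real threads. The mutex makes each
// read-modify-write step exclusive, so no update can be lost.
int locked_increment() {
    int x = 0;
    std::mutex m;
    auto work = [&x, &m] {
        std::lock_guard<std::mutex> guard(m);
        x = x + 1;
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return x;
}

// Small self-check showing the two outcomes side by side.
void check_examples() {
    assert(lost_update_replay() == 1);  // wrong result, as in the diagram
    assert(locked_increment() == 2);    // expected result
}
```

The lock also illustrates the trade-off named on the slide: while one thread holds the mutex the other waits, so the protected region runs serially.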

New Linux* Intel® Debugger (IDB) GUI and Support for Debugging Parallel Code
• Thread Shared Data Event Detection
  - Break on Thread Shared Data Access (read/write)
• Re-entrant Function Detection
• SIMD SSE Registers Window
• Enhanced OpenMP* Support
  - Serialize OpenMP threaded application execution on the fly
  - Insight into thread groups, barriers, locks, wait lists etc.

Windows* Linux* Mac* IA32 Intel64 IA64 Multicore
VS add-in √ √ √ √

IDB Parallel Run-Control Use Cases
• Stepping parallel loops
  - Problem: State investigation difficult. Threads stop at arbitrary positions (red line)
  - Parallel Debugger Support: Syncpoint
• Serial Execution
  - Problem: Parallel loop computes a wrong result. Is it a concurrency or algorithm issue?
  - Parallel Debugger Support: Runtime access to the OpenMP num_thread property. Set to 1 for serial
(Diagram labels: Breakpoint, …, Normal Step, Disable parallel)