Intel(R) Math Kernel Library for Linux* User's Guide


October 2007
Document Number: 314774-005US
World Wide Web: http://developer.intel.com

Version Information

-001 (September 2006): Original issue. Documents Intel® Math Kernel Library (Intel® MKL) 9.0 gold release.

-002 (January 2007): Documents Intel® MKL 9.1 beta release. “Getting Started”, “LINPACK and MP LINPACK Benchmarks” chapters and “Support for Third-Party and Removed Interfaces” appendix added. Existing chapters extended. Document restructured. List of examples added.

-003 (June 2007): Documents Intel® MKL 9.1 gold release. Existing chapters extended. Document restructured. More aspects of ILP64 interface discussed. Section “Configuring Eclipse CDT to Link with Intel MKL” added to chapter 3. Cluster content is organized into one separate chapter 9 “Working with Intel® Math Kernel Library Cluster Software” and restructured, appropriate links added.

-004 (September 2007): Documents Intel® MKL 10.0 Beta release. Layered design model has been described in chapter 3 and the content of the entire book adjusted to the model. Automation of setting environment variables at startup has been described in chapter 4. New Intel MKL threading controls have been described in chapter 6. The User’s Guide for Intel MKL merged with the one for Intel MKL Cluster Edition to reflect consolidation of the respective products.

-005 (October 2007): Documents Intel® MKL 10.0 Gold release. Configuring of Eclipse CDT 4.0 to link with Intel MKL has been described in chapter 3. Intel® Compatibility OpenMP* run-time compiler library (libiomp) has been described.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT.
EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details. 
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.

* Other names and brands may be claimed as the property of others.

Copyright © 2006 - 2007, Intel Corporation. All rights reserved.

Contents

Chapter 1 Overview
  Technical Support
  About This Document
    Purpose
    Audience
    Document Organization
    Notational Conventions

Chapter 2 Getting Started
  Checking Your Installation
  Obtaining Version Information
  Compiler Support
  Before You Begin Using Intel MKL

Chapter 3 Intel® Math Kernel Library Structure
  High-level Directory Structure
  Layered Model Concept
    Layers
  Sequential Version of the Library
  Support for ILP64 Programming
  Intel® MKL Versions
  Directory Structure in Detail
    Dummy Libraries
    Contents of the Documentation Directory

Chapter 4 Configuring Your Development Environment
  Setting Environment Variables
    Automating the Process
  Configuring Eclipse CDT to Link with Intel MKL
    Configuring Eclipse CDT 4.0
    Configuring Eclipse CDT 3.x
  Customizing the Library Using the Configuration File

Chapter 5 Linking Your Application with Intel® Math Kernel Library
  Selecting Between Linkage Models
    Static Linking
    Dynamic Linking
    Making the Choice
    Intel MKL-specific Linking Recommendations
  Link Command Syntax
  Selecting Libraries to Link
    Linking with Threading Libraries
    More Linking Examples
    Notes on Linking
  Building Custom Shared Objects
    Intel MKL Custom Shared Object Builder
    Specifying Makefile Parameters
    Specifying List of Functions

Chapter 6 Managing Performance and Memory
  Using Intel® MKL Parallelism
    Techniques to Set the Number of Threads
    Avoiding Conflicts in the Execution Environment
    Setting the Number of Threads Using OpenMP Environment Variable
    Changing the Number of Threads at Run Time
    Using Additional Threading Control
  Tips and Techniques to Improve Performance
    Coding Techniques
    Hardware Configuration Tips
    Managing Multi-core Performance
    Operating on Denormals
    FFT Optimized Radices
  Using Intel® MKL Memory Management
    Redefining Memory Functions

Chapter 7 Language-specific Usage Options
  Using Language-Specific Interfaces with Intel® MKL
  Mixed-language programming with Intel® MKL