Best Practice Guide - Knights Landing Vali Codreanu, Surfsara Jorge Rodríguez, BSC Ole Widar Saastad (Editor), University of Oslo 31-01-2017

Best Practice Guide - Knights Landing Vali Codreanu, Surfsara Jorge Rodríguez, BSC Ole Widar Saastad (Editor), University of Oslo 31-01-2017

Best Practice Guide - Knights Landing Vali Codreanu, SURFsara Jorge Rodríguez, BSC Ole Widar Saastad (Editor), University of Oslo 31-01-2017 1 Best Practice Guide - Knights Landing Table of Contents 1. Introduction .............................................................................................................................. 5 2. System Architecture / Configuration ............................................................................................. 5 2.1. Processor Architecture / MCM Architecture ......................................................................... 6 2.1.1. Instruction Set Architecture .................................................................................... 7 2.2. Memory Architecture ....................................................................................................... 9 2.2.1. Flat mode .......................................................................................................... 10 2.2.2. Cache mode ....................................................................................................... 10 2.2.3. Hybrid mode ...................................................................................................... 13 2.2.4. Optimal Settings ................................................................................................. 13 2.2.5. Example: Programming with the memkind library ..................................................... 13 2.3. Clustering modes .......................................................................................................... 13 2.3.1. All to all mode ................................................................................................... 14 2.3.2. Quadrant mode ................................................................................................... 14 2.3.3. Sub NUMA mode ............................................................................................... 14 2.3.4. Performance using the different modes ................................................................... 15 2.3.5. Example: Programming with sub-NUMA clusters ..................................................... 15 2.3.6. BIOS settings for clustering, memory etc ................................................................ 15 2.4. (Node) Interconnect ....................................................................................................... 16 2.5. I/O Subsystem Architecture ............................................................................................ 16 3. Programming Environment / Basic Porting ................................................................................... 16 3.1. Default System settings .................................................................................................. 16 3.2. Available Compilers ...................................................................................................... 16 3.2.1. Compiler Flags ................................................................................................... 17 3.3. Available (Vendor Optimized) Numerical Libraries ............................................................. 19 3.3.1. Math Kernel Library, MKL .................................................................................. 19 3.3.2. Intel Performance Primitives, IPP .......................................................................... 21 3.3.3. Math library, libimf ............................................................................................. 22 3.3.4. Short vector math library, libsvml .......................................................................... 22 3.4. Available MPI Implementations ....................................................................................... 23 3.5. OpenMP ...................................................................................................................... 24 3.5.1. Compiler Flags ................................................................................................... 24 4. Benchmark performance ........................................................................................................... 24 4.1. STREAM benchmark ..................................................................................................... 25 4.2. NAS kernel benchmark, NPB .......................................................................................... 26 4.2.1. Scaling and Speedup test ..................................................................................... 27 4.2.2. Performance ....................................................................................................... 28 4.3. HYDRO benchmark ...................................................................................................... 29 4.3.1. Scaling and Speedup test ..................................................................................... 29 4.3.2. Performance ....................................................................................................... 30 5. Application Performance ........................................................................................................... 31 5.1. ALYA ......................................................................................................................... 31 5.1.1. Building and compiling ....................................................................................... 31 5.1.2. Scaling and speedup ............................................................................................ 32 5.1.3. Performance ....................................................................................................... 32 5.2. GROMACS .................................................................................................................. 33 5.2.1. Building and compiling ....................................................................................... 33 5.2.2. Scaling and speedup ............................................................................................ 34 5.2.3. Performance ....................................................................................................... 35 5.3. Bifrost (Stellar atmosphere simulation code) ...................................................................... 35 5.3.1. Building and compiling ....................................................................................... 36 5.3.2. Vectorization and scaling ..................................................................................... 36 5.3.3. Performance ....................................................................................................... 38 5.4. Dalton and LS-Dalton (molecular electronic structure program) ............................................. 38 5.4.1. Building and compiling ....................................................................................... 38 5.4.2. Effect of vectorization ......................................................................................... 39 2 Best Practice Guide - Knights Landing 5.4.3. Scaling and speedup ............................................................................................ 40 5.4.4. Performance ....................................................................................................... 41 6. Performance Analysis ............................................................................................................... 42 6.1. Performance Monitoring in the Knights Landing Tile ........................................................... 42 6.2. Available Performance Analysis Tools .............................................................................. 43 6.2.1. Intel Application Performance Snapshot .................................................................. 43 6.2.2. Allienea Performance Reports ............................................................................... 44 6.2.3. Intel Software Development Emulator .................................................................... 46 6.3. Hints for Interpreting Results .......................................................................................... 47 7. Tuning ................................................................................................................................... 47 7.1. Guidelines for tuning ..................................................................................................... 47 7.1.1. Top down .......................................................................................................... 47 7.1.2. Bottom up ......................................................................................................... 48 7.2. Intel tuning tools ........................................................................................................... 49 7.2.1. Intel compiler .................................................................................................... 49 7.2.2. Intel MPI library ................................................................................................ 52 7.2.3. Intel XE-Advisor ................................................................................................ 54 7.2.4. Intel XE-Inspector .............................................................................................. 55 7.2.5. Intel VTune Amplifier ......................................................................................... 55 7.2.6. Intel Trace Analyzer ...........................................................................................

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    85 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us