
IBM XL Fortran for AIX, V16.1
Optimization and Programming Guide
Version 16.1
SC27-8064-00

Note
Before using this information and the product it supports, read the information in “Notices” on page 339.

First edition
This edition applies to IBM XL Fortran for AIX, V16.1 (Program 5765-J14, 5725-C74) and to all subsequent releases and modifications until otherwise indicated in new editions. Make sure you are using the correct edition for the level of the product.

© Copyright IBM Corporation 1990, 2018.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

About this document ... vii
  Who should read this document ... vii
  How to use this document ... vii
  How this document is organized ... vii
  Conventions ... viii
  Related information ... xii
  Available help information ... xii
  Standards and specifications ... xiv
  Technical support ... xiv
  How to send your comments ... xv

Chapter 1. Optimizing your applications ... 1
  Distinguishing between optimization and tuning ... 1
  Steps in the optimization process ... 2
  Basic optimization ... 2
  Optimizing at level 0 ... 3
  Optimizing at level 2 ... 3
  Advanced optimization ... 4
  Optimizing at level 3 ... 5
  An intermediate step: adding -qhot suboptions at level 3 ... 6
  Optimizing at level 4 ... 7
  Optimizing at level 5 ... 8
  Specialized optimization techniques ... 8
  High-order transformation (HOT) ... 9
  Interprocedural analysis (IPA) ... 11
  Profile-directed feedback ... 15
  Handling table of contents (TOC) overflow ... 22
  Vector technology ... 25
  Using compiler reports to diagnose optimization opportunities ... 29
  Tracing procedures in your code ... 31
  Getting more performance ... 35
  Beyond performance: effective programming techniques ... 36

Chapter 2. Tuning XL compiler applications ... 37
  Tuning for your target architecture ... 37
  Using -qarch ... 38
  Using -qtune ... 40
  Using -qcache ... 40
  Before you finish tuning ... 41
  Further option driven tuning ... 41
  Options for providing application characteristics ... 41
  Options to control optimization transformations ... 44
  Options to assist with performance analysis ... 45
  Options that can inhibit performance ... 46

Chapter 3. Advanced optimization concepts ... 47
  Aliasing ... 47
  Inlining ... 47
  Finding the right level of inlining ... 48

Chapter 4. Managing code size ... 51
  Steps for reducing code size ... 52
  Compiler option influences on code size ... 52
  The -qipa compiler option ... 52
  The -qinline inlining option ... 52
  The -qhot compiler option ... 53
  The -qcompact compiler option ... 53
  Other influences on code size ... 53
  High activity areas ... 53
  Computed GOTOs and CASE constructs ... 54
  Code size with dynamic or static linking ... 54

Chapter 5. Debugging optimized code ... 57
  Understanding different results in optimized programs ... 58
  Debugging in the presence of optimization ... 58
  Using -qoptdebug to help debug optimized programs ... 60

Chapter 6. Compiler-friendly programming techniques ... 63
  General practices ... 63
  Variables and pointers ... 63
  Arrays ... 64
  Choosing appropriate variable sizes ... 64
  Submodules (Fortran 2008) ... 65

Chapter 7. Exploiting POWER9 technology ... 69
  POWER9 compiler options ... 69
  High-performance libraries that are tuned for POWER9 ... 69
  POWER9 intrinsic procedures ... 69

Chapter 8. High performance libraries ... 73
  Using the Mathematical Acceleration Subsystem (MASS) libraries ... 73
  Using the scalar library ... 74
  Using the vector libraries ... 76
  Using the SIMD libraries ... 81
  Compiling and linking a program with MASS ... 86
  Using the Basic Linear Algebra Subprograms - BLAS ... 87
  BLAS function syntax ... 88
  Linking the libxlopt library ... 90

Chapter 9. Parallel programming with XL Fortran ... 91
  Compiling your parallelized code ... 91
  The _OPENMP C preprocessor macro and conditional compilation ... 91
  Setting runtime options ... 92
  XLSMPOPTS ... 92
  Environment variables for OpenMP ... 99
  Optimizing your SMP code ... 106
  Developing and running SMP applications ... 107
  Parallelization directives ... 107
  Summary of supported parallelization directives ... 108
  Detailed descriptions of parallelization directives ... 110
  Data sharing attribute rules ... 161
  Directive clauses ... 163
  Routines for OpenMP ... 187
  omp_destroy_lock(svar) ... 188
  omp_destroy_nest_lock(nvar) ... 189
  omp_get_active_level() ... 189
  omp_get_ancestor_thread_num(level) ... 190
  omp_get_dynamic() ... 190
  omp_get_level() ... 191
  omp_get_max_active_levels() ... 191
  omp_get_max_threads() ... 191
  omp_get_nested() ... 192
  omp_get_num_procs() ... 192
  omp_get_num_threads() ... 193
  omp_get_schedule(kind, modifier) ... 194
  omp_get_team_size(level) ... 194
  omp_get_thread_limit() ... 195
  omp_get_thread_num() ... 195
  omp_get_wtick() ... 196
  omp_get_wtime() ... 197
  omp_in_final() ... 198
  omp_in_parallel() ... 198
  omp_init_lock(svar) ... 199
  omp_init_nest_lock(nvar) ... 200
  omp_set_dynamic(enable_expr) ... 201
  omp_set_lock(svar) ... 201
  omp_set_max_active_levels(max_levels) ... 202
  omp_set_nested(enable_expr) ... 203
  omp_set_nest_lock(nvar) ... 204
  omp_set_num_threads(number_of_threads_expr) ... 204
  omp_set_schedule(kind, modifier) ... 205
  omp_test_lock(svar) ... 206
  omp_test_nest_lock(nvar) ... 207
  omp_unset_lock(svar) ... 207
  omp_unset_nest_lock(nvar) ... 208
  Pthreads Library Module ... 209
  Pthreads data structures, functions, and subroutines ... 210
  f_maketime(delay) ... 212
  f_pthread_attr_destroy(attr) ... 213
  f_pthread_attr_getdetachstate(attr, detach) ... 213
  f_pthread_attr_getguardsize(attr, guardsize) ... 214
  f_pthread_attr_getinheritsched(attr, inherit) ... 215
  f_pthread_attr_getschedparam(attr, param) ... 215
  f_pthread_attr_getschedpolicy(attr, policy) ... 216
  f_pthread_attr_getscope(attr, scope) ... 217
  f_pthread_attr_getstackaddr(attr, stackaddr) ... 217
  f_pthread_attr_getstacksize(attr, ssize) ... 218
  f_pthread_attr_init(attr) ... 218
  f_pthread_attr_setdetachstate(attr, detach) ... 219
  f_pthread_attr_setguardsize(attr, guardsize) ... 220
  f_pthread_attr_setinheritsched(attr, inherit) ... 220
  f_pthread_attr_setschedparam(attr, param) ... 221
  f_pthread_attr_setschedpolicy(attr, policy) ... 222
  f_pthread_attr_setscope(attr, scope) ... 223
  f_pthread_attr_setstackaddr(attr, stackaddr) ... 223
  f_pthread_attr_t ... 224
  f_pthread_cancel(thread) ... 225
  f_pthread_cleanup_pop(exec) ... 225
  f_pthread_cleanup_push(cleanup, flag, arg) ... 226
  f_pthread_cond_broadcast(cond) ... 227
  f_pthread_cond_destroy(cond) ... 228
  f_pthread_cond_init(cond, cattr) ... 228
  f_pthread_cond_signal(cond) ... 229
  f_pthread_cond_t ... 229
  f_pthread_cond_timedwait(cond, mutex, timeout) ... 230
  f_pthread_cond_wait(cond, mutex) ... 230
  f_pthread_condattr_destroy(cattr) ... 231
  f_pthread_condattr_getpshared(cattr, pshared) ... 232
  f_pthread_condattr_init(cattr) ... 232
  f_pthread_condattr_setpshared(cattr, pshared) ... 233
  f_pthread_condattr_t ... 234
  f_pthread_create(thread, attr, flag, ent, arg) ... 234
  f_pthread_detach(thread) ... 235
  f_pthread_equal(thread1, thread2) ... 236
  f_pthread_exit(ret) ... 236
  f_pthread_getconcurrency() ... 237
  f_pthread_getschedparam(thread, policy, param) ... 237
  f_pthread_getspecific(key, arg) ... 238
  f_pthread_join(thread, ret) ... 239
  f_pthread_key_create(key, dtr) ... 239
  f_pthread_key_delete(key) ... 240
  f_pthread_key_t ... 241
  f_pthread_kill(thread, sig) ... 241
  f_pthread_mutex_destroy(mutex) ... 242
  f_pthread_mutex_getprioceiling(mutex, old) ... 242
  f_pthread_mutex_init(mutex, mattr) ... 243
  f_pthread_mutex_lock(mutex) ... 243
  f_pthread_mutex_setprioceiling(mutex, new, old) ... 244
  f_pthread_mutex_t ... 244
  f_pthread_mutex_trylock(mutex) ... 245
  f_pthread_mutex_unlock(mutex) ... 245
  f_pthread_mutexattr_destroy(mattr) ... 246
  f_pthread_mutexattr_getprioceiling(mattr, ceiling) ... 246
  f_pthread_mutexattr_getprotocol(mattr, proto) ... 247
  f_pthread_mutexattr_getpshared(mattr, pshared) ... 247
  f_pthread_mutexattr_gettype(mattr, type) ... 248
  f_pthread_mutexattr_init(mattr) ... 249
  f_pthread_mutexattr_setprioceiling(mattr, ceiling) ... 250
  f_pthread_mutexattr_setprotocol(mattr, proto) ... 250
  f_pthread_mutexattr_setpshared(mattr, pshared) ... 251
  f_pthread_mutexattr_settype(mattr, type) ... 251
  f_pthread_mutexattr_t ... 252
  f_pthread_once(once, initr) ... 253
  f_pthread_once_t ... 253
  f_pthread_rwlock_destroy(rwlock) ... 253
  f_pthread_rwlock_init(rwlock, rwattr) ... 254
  f_pthread_rwlock_rdlock(rwlock) ... 255
  f_pthread_rwlock_t ... 255
  f_pthread_rwlock_tryrdlock(rwlock) ... 256
  f_pthread_rwlock_trywrlock(rwlock) ... 256
  f_pthread_rwlock_unlock(rwlock) ... 257
  f_pthread_rwlock_wrlock(rwlock) ....