Some training course practicalities

Each lecture is about 45min – 1h Between lectures usually an exercise session, ca. 1h A training day ends at always 4pm Lunch breaks at noon with lectures/exercises continuing at 1pm Prerequisites: good knowledge of Introduction to Advanced Fortran and Unix/Linux Exercises will be carried out using CSC’s Cray XC40 (Sisu) supercomputer

1 2

Introduction to Advanced Fortran A brief overview of this 3-day training course

A brief overview of the contents of this training course Advanced Fortran concentrates on F2003/F2008 aspects How do we teach? – some Fortran95 less well known features are covered, too – An example session with “Operator overloading” Our major F2003/F2008 subjects in this course are A quick history of Fortran standards over the last decades – Object Oriented Fortran (OOF) – From FORTRAN66 to Fortran 2008 – Co-Array Fortran (CAF) Parallel processing aspects that often improve program Minor subjects cover performance: paving way to High Performance Computing – Developments in ALLOCATABLE and POINTER variables – Co-Array Fortran (CAF) is part of F2008 standard – Interfacing with —programs / libraries (ISO C bindings) § With a fun demo-session on explicit dynamic linking

3 4 How do we teach? An example session: “Operator overloading”

We will try to make potentially difficult things look To improve readability of Fortran source code it is relatively easy to learn and understand possible to overload existing operators – or even create your own ones (not covered here) We tend to skip items that – we think – have less significance in everyday modern Fortran programming Operator overloading applies to the mathematical operations like ‘+’, ‘-’, ‘*’, ‘/’ and assignment ‘=’ A typical lecture contains many smaller subjects, which Binary operators involve two operands – residing in the often contain the syntactical point of view followed by left and right hand sides (LHS & RHS) of the operator simplified code snippets – Unary operator (e.g. minus sign) only has one operator An exercise session followed by lecture session, allow (RHS) students to practice with the just learned stuff Typical operator overloading usages are found in the field of vector and matrix computation

5 6

Operator overloading ... Overloading summation operator ‘+’ for vectors

A practical implementation of operator overloading Whilst it is usually easier to code this directly without any usually involve creating a Fortran MODULE –file where all extra code for implementing operator overloading, it is the necessary components are placed pedagogically a useful case to go through here A MODULE contains derived data TYPE definition of the To start with, lets introduce a simple vector type: TYPE vector_t associated elements and creation of appropriate REAL, allocatable :: v(:) INTERFACE blocks to enable compiler to map the END TYPE vector_t references to the new operators In a more generalized case this TYPE may contain many Operator overloading can optionally be implemented to more components than shown here cover several (intrinsic) data types, or a mixture of types – We may also want to sum over mixture of various REAL precisions (“kinds”) and/or perhaps over INTEGERs, too

7 8 Overloading summation operator ‘+’... Overloading summation operator ‘+’ ...

MODULE overload TYPE vector_t All the necessary code fractions The next step is to introduce a summation function, real, allocatable :: x(:) are now placed in a MODULE –file END TYPE vector_t … and referenced from the user which does the actual adding up of two vectors INTERFACE operator(+) MODULE PROCEDURE add source code, e.g. : – We need to introduce the summation OPERATOR itself END INTERFACE INTERFACE assignment(=) PROGRAM main and INTERFACE blocks to enable the access MODULE PROCEDURE asgn_int USE overload END INTERFACE TYPE(vector_t) :: x1, x2, out We also would like have a flexible initialization of the CONTAINS FUNCTION add(v1, v2) x1 = [ 1, 2, 3 ]; vectors to be summed up – we take an opportunity to TYPE(vector_t) :: add x2 = [ 10, 20, 30 ]; TYPE(vector_t), intent(in) :: v1, v2 out = x1 + x2 introduce an assignment OPERATOR, too add%x = v1%x + v2%x print '(a,3(1x,g0))','out=',out%x END FUNCTION add END PROGRAM main – In the example our initial values are all INTEGERs, so we asgn_int(out,in) TYPE(vector_t), intent(out) :: out only need to map them to our vector TYPE INTEGER, intent(in) :: in(:) The output is: out%x = in out= 11.00000 22.00000 33.00000 END SUBROUTINE asgn_int END MODULE overload

9 10

A bit of Fortran history Evolution of Fortran standards over the few decades As many know the name “Fortran” stands for “Formula Mid-60’s the most significant version was labeled as ANSI translating system” standard, called FORTRAN66 (“Fortran” in capital letters J) – The 1st version became standard by John Backus (IBM/1957) In 1978 the former “de-facto” FORTRAN77 standard was Well suited for numerical and scientific computing approved and quickly became popular, contained e.g.: – Often emphasis in HPC – High Performance Computing – IMPLICIT statement A language for computationally intensive areas like – IF – THEN – ELSE IF – ELSE – ENDIF – Computational physics & computational chemistry – CHARACTER data type – Finite element analysis & computational fluid dynamics – PARAMETER statement for specifying constants – Numerical weather prediction & climate research – Generic names for intrinsic functions

11 12 Evolution of Fortran standards… Fortran90

An important extension to FORTRAN77 was release of A major step in keeping Fortran programming alive was “military standard” Fortran enhancements (also in 1978) introduction of Fortran90 standard in ’91 (ISO), ’92 (ANSI) by US DoD, and adopted by most FORTRAN77 compilers – Free format source input (up to 132 characters per line) – IMPLICIT NONE – Dynamic memory handling via – DO – ENDDO § ALLOCATE / DEALLOCATE for ALLOCATABLE –arrays § Automatic array sizing in procedures from arguments / modules – INCLUDE statement – Bit-manipulation functions Marked a number of features obsolescent, but did not delete any features – yet, e.g.: All these were eventually incorporated to the next major – Arithmetic IF and computed GOTO –statements official standard release – Fortran90 – Alternate RETURN, and use of H(ollerith) in FORMAT –statements

13 14

Fortran95 Fortran 2003 (F2003)

A minor revision in 1997 (ISO) some features were taken A significant revision of the Fortran95 standard (in 2004) from High Performance Fortran (HPF) specification, e.g. – Parameterized derived TYPEs – FORALL and nested WHERE clauses to improve vectorization – Object-Oriented Programming (OOP), e.g. – User-defined PURE and ELEMENTAL procedures § Type extension, (single) inheritance, polymorphism, … – Automatic DEALLOCATE when ALLOCATABLE out of scope – Procedure POINTERs – POINTER and TYPE components default initializations – Interoperability with C (and other) languages Deleted some features previously marked as obsolescent – O/S interfacing: command line arguments & env. variables – – Use of non-INTEGERs as DO-loop parameters ALLOCATABLE improvements – – H(ollerith) edit descriptor in FORMATs POINTER improvements – – ASSIGN and assigned GOTO I0 (eye-zero) edit descriptor for INTEGERs in FORMAT statements

15 16 Fortran 2008 (F2008) Recap A minor revision of Fortran 2003 – in 2010 The Fortran standard has evolved from its early days, late – Certain new features still not fully supported by MOST 1950s, and contains many modern language features, like compiler makers L Object Oriented Programming (OOP – with a twist “called” New features include OOF) as well as Co-Array Fortran (CAF) to express (massive) – Co-arrays parallelism – Sub-modules Interfacing with C-language is now standardized, so is the – CONTIGUOUS –attribute interface with operating system (command line arguments, – BLOCK –construct obtaining environment variables and running Unix- – newunit= in OPEN –statement commands from Fortran) – DO CONCURRENT –construct Developments in ALLOCATABLE and POINTER variables are – G0 edit descriptor for REALs in FORMAT statements also noticeable improvements

17 18 Useful features since Fortran90

Interfacing the operating system – Accessing command line arguments and environment variables – Running operating system commands from Fortran Enhancements in use of ALLOCATABLE –variables – Derived data type can have ALLOCATABLE components – Automatic DEALLOCATE when variable goes out of scope – ALLOCATABLEs as dummy arguments and function can return ALLOCATABLEs – Automatic resizing of ALLOCATABLE variables – ALLOCATABLE scalars: variable length CHARACTER strings – Two essential POINTER assignment improvements Useful new features CONTIGUOUS attribute Asynchronous I/O I0 and G0 edit descriptors allow dynamic output FORMATting

19 20

Command line arguments Command line arguments …

Parameters to a program are very often given to programs as Access separate command line arguments command line arguments call get_command_argument(number[,value][,length][,status]) – Input file(s), modify program behavior, etc. – number is of type integer and denotes which argument to get – value is of type character string and contains the value of the requested Fortran 2003 has (finally!) a standardized method for reading argument on return. If the actual argument is too short or long it is padded command line arguments with blanks or truncated, respectively (optional) – No need to use extensions such as GETARG and IARGC – length is of type integer and contains the length of the requested argument on return (optional) The standard function calls are, respectively – status is of type integer. On successful return status is 0, -1 if value was – get_command_argument and command_argument_count too short to contain actual argument and 1 if argument could not be returned In addition, in order to access the whole command line, use (optional) – get_command Get the number of command line arguments integer :: command_argument_count()

21 22 Command line arguments … Command line arguments: example

Access the whole command line program commandline call get_command(command[,length][,status]) implicit none – command is of type character string and contains the value of the character(len=256) :: line command line on return. If the the actual command line is too short or integer :: i, iarg, stat, clen long it is padded with blanks or truncated iarg = command_argument_count() – length is of type integer and contains the length of the command do i=1,iarg line on return or zero if the length cannot be determined (optional) call get_command_argument(i,line,clen,stat) – status is of type integer. On successful return status is 0, -1 if value write (*,'(I0,A,A)') i,': ',line(1:clen) was too short to contain actual argument and 1 if argument could not end do be returned (optional) call get_command(line,clen,stat) write (*,'(A)') line(1:clen) end program commandline

23 24

Environment variables Environment variables

Besides command line arguments, environment variables Access a value of an environment variable are a common way to modify program behaviour call get_environment_variable(name,value[,length] [,status][,trim_name]) Fortran 2003 has (finally!) a standardized method for – name is of type character string and contains the name of the requested variable accessing values of environment variables – value is of type character string and contains the value of the requested variable. If the the actual variable value is too short or long it is padded with blanks or truncated. If the In Fortran 77/90/95 accessing getenv from C standard variable has no value or does not exist, value is set to blanks – length is of type integer and contains the length of the requested variable on return if library requires passing character strings from Fortran to the variable exists and has a value and zero otherwise (optional) C and back – status is type integer. If requested variable does not exist status is 1. If value was too short status is -1 and zero otherwise. For other return codes, see docs (optional) – Without ISO-C bindings (only in Fortran 2003/2008), error – trim_name is of type logical and sets if trailing blanks are allowed in variable names or prone and nonportable methods needed not (optional)

25 26 Environment variables: example Executing commands

Invoking external programs from within a program is program environment occasionally needed implicit none character(len=256) :: enval – No source nor library API available for a useful program integer :: len,stat Fortran 2008 has (finally!) a standardized method for call get_environment_variable('HOSTNAME', enval, len, stat) invoking an external command if (stat == 0) write(*,'(A,A)') 'Host=', enval(1:len) call get_environment_variable('USER', enval, len, stat) In Fortran 77/90/95 accessing system from C standard if (stat == 0) write(*,'(A,A)') 'User=', enval(1:len) library requires passing character strings from Fortran to end program environment C and back – With Fortran 2003, ISO-C bindings can be used

27 28

Executing commands Executing commands: example

Execute a command line call execute_command_line(command[,wait][,exitstat] [,cmdstat][,cmdmsg]) program execcommand – command is a character string containing the command to be invoked implicit none – wait is logical value indicating if command termination is to be waited integer :: estat, cstat (.true., the default) or if the command is to be executed call execute_command_line('ls -al', .TRUE., estat, cstat) asynchronously (.false.) (optional) if (estat==0) write (*,'(A)') 'Command completed successfully’ – exitstat is an integer value containing the return value of the end program execcommand command if wait=.true. (optional) – cmdstat is an integer value. It is assigned a value of zero if command executed successfully. For other return codes, see docs (optional) – cmdmsg is a character string containing explanatory message for positive values of cmdstat (optional)

29 30 Enhancements to ALLOCATABLE –variables Automatic DEALLOCATE

There are plenty of new features to dynamic memory SUBROUTINE SUB1 An ALLOCATABLE variable, INTEGER, ALLOCATABLE :: A(:) allocation through ALLOCATABLE –variables since ALLOCATE(A(1000)) which is declared as a local introduction of Fortran90 standard in early 1990s CALL USE_ARRAY( A ) variable in procedure and ! DEALLOCATE(A) ! Not needed gets ALLOCATEd, will be The two most common, which may already have used, END SUBROUTINE SUB1 perhaps without notice, are automatically DEALLOCATEd SUBROUTINE SUB2 upon returning back from – Automatic DEALLOCATE when ALLOCATABLE goes out of scope INTEGER, ALLOCATABLE, SAVE :: A(:) routine – Use of ALLOCATABLE components in derived data TYPEs ALLOCATE(A(1000)) CALL USE_ARRAY( A ) This is out of scope –rule But there are more, less well known cases DEALLOCATE(A) ! Is needed holds as long as variables END SUBROUTINE SUB2 – We will start from the most common ones don’t have SAVE –attribute as well

31 32

ALLOCATABLE as derived data TYPE component ALLOCATABLE as dummy argument

PROGRAM TEST MODULE my_mod Fortran90 standard INTEGER :: N An ALLOCATABLE variable TYPE my_type introduced user defined REAL, ALLOCATABLE :: A(:) can now appear in the REAL, ALLOCATABLE :: ARRAY(:) CALL SUB(A, N) procedure argument list END TYPE my_type data types, but dynamic CALL USE_ARRAY (A, N) END MODULE my_mod components were restricted DEALLOCATE(A) Enables dynamic memory END PROGRAM TEST SUBROUTINE SUB (N) to POINTERs only allocation in the callee based on sizing information USE my_mod The restriction was removed SUBROUTINE SUB(A, N) IMPLICIT NONE REAL, ALLOCATABLE, INTENT(OUT) :: A(:) Allocated (and perhaps INTEGER, intent(in) :: N in Fortran 2003 INTEGER, INTENT(OUT) :: N initialized) data is returned TYPE(my_type) :: T Allocation remains intact READ *, N : ALLOCATE( A(N) ) to the caller ALLOCATE(T % ARRAY (N) ) even after going out of READ *, A The caller must remember : scope, if TYPE was defined in ! Note : No automatic DEALLOCATE here ! END SUBROUTINE SUB a MODULE END SUBROUTINE SUB to DEALLOCATE

33 34 Function return value as ALLOCATABLE Assignment to an ALLOCATABLE variable

PROGRAM TEST A function can now have Given an ALLOCATABLE array, which REAL :: UNPACKED(1000) PROGRAM TEST has not yet been explicitly allocated REAL, ALLOCATABLE :: A(:) ALLOCATABLE –attribute REAL, ALLOCATABLE :: A(:), B(:) Assigning values to it via data or A = PACK_DATA (UNPACKED) Works similarly as when REAL :: INPUT(3) CALL USE_PACKED_DATA (A) READ *,INPUT ! Read 3 values from STDIN from another array triggers DEALLOCATE(A) ALLOCATABLE is a dummy A = INPUT ! Automatic ALLOCATE automatic allocation CONTAINS argument ! A(:) = INPUT(:) would be illegal ! If array was already allocated , but FUNCTION PACK_DATA(X) RESULT (PK) PRINT *,size(A) ! Gives 3 REAL, ALLOCATABLE :: PK(:) Particularly useful in B = (/ 1,2,4,8,16 /) ! Automatic alloc. its size was not sufficient, the extent REAL, INTENT(IN) :: X(:) returning dynamically sized A = [ A, B ] ! Automatic extent ok size will be allocated automatically INTEGER :: NPK PRINT *,size(A) ! Size now = 8 Use of this feature currently NPK = NUM_DISTINCT_VALUES(X) array as a result of ! DEALLOCATEs legal, but not necessary requires additional compiler flags, ALLOCATE(PK(NPK)) DEALLOCATE(A, B) CONTAINS –function value –assume realloc_lhs CALL GET_DISTINCT_VALUES(X,...,PK,NPK) END PROGRAM TEST like (Intel) END FUNCTION PACK_DATA evaluation and –ew flag with Cray compiler END PROGRAM TEST

35 36

ALLOCATABLE scalar Transferring allocation : MOVE_ALLOC

PROGRAM TEST These are especially useful PROGRAM TEST Provides ALLOCATABLE CHARACTER(:), ALLOCATABLE :: CH, S with Fortran CHARACTER INTEGER, ALLOCATABLE :: A(:), B(:) equivalent of POINTER INTEGER :: LENCH strings, whose length is ALLOCATE (A(1:5)) ; A(:) = 0 assignment: moves allocation LENCH = LEN('Hello World!') A(3) = 3 ! Explicit allocation decided at run time PRINT *, 'A=',A ! Output: 0 0 3 0 0 to another memory location ALLOCATE(character(LENCH) :: CH) CALL MOVE_ALLOC(A,B) CH(:) = 'Hello World!' Allocation can either explicit As result previous allocation ! A is now DEALLOCATED & B ALLOCATED ! Implicit alloc. (Using S(:) = ... fails) or automatic (as per previous IF (ALLOCATED(A)) PRINT *, '>A=',A becomes unallocated and new S = 'Hello World!' IF (ALLOCATED(B)) PRINT *, '>B=',B PRINT *,LEN(CH),LEN(S) ! Gives 12 12 page) allocation holds previous data ! No need for explicit DEALLOCATEs here ! DEALLOCATEs: legal, but not required Again, requires additional Arrays to (=A) and from (=B) DEALLOCATE(CH, S) DEALLOCATE(B) END PROGRAM TEST compiler flags due to implicit ! Would be illegal: DEALLOCATE(A) must have the same type and assignments of ALLOCATABLEs END PROGRAM TEST rank

37 38 Two improvements to the POINTER assignments Two improvements to the POINTER ...

The first improvement is that it now possible to set the The 2nd improvement is such that the TARGET of a multi- desired lower bounds of a POINTER to any value, e.g. : dimensional array POINTER can be a one-dim. array, e.g. : REAL, target :: state_budget(1917:2013) REAL, pointer :: memory(:), Matrix(:,:), Diagonal(:) REAL, pointer, dimension(:) :: A, B, C ALLOCATE(memory(N * N)) ! Space for N-by-N matrix A => state_budget Matrix (1:N, 1:N) => memory ! New stuff B => state_budget (1939:1945) Diagonal => memory(::N+1) ! ... or: memory(1:N*N:N+1) C(1939:) => state_budget (1939:1945) ! New After this the POINTER-array Matrix operates across all The bounds of A are (1917,2013), but B has just (1,7) the memory region that has been allocated The new LHS pointer, however, has bounds (1939,1945) Diagonal POINTER refers to the diagonal elements

39 40

CONTIGUOUS attribute CONTIGUOUS attribute …

Depending on the hardware architecture, a unit stride In Fortran 2008, an assumed shape or pointer variable can be data access (contiguous) can be two to three times faster declared contiguous than a constant non-unit stride (non-contiguous) – CONTIGUOUS type attribute, for example real, contiguous :: x(:) Typical optimization: use structure of arrays in stead of real, pointer, contiguous :: y(:,:) arrays of structures – CONTIGUOUS statement In Fortran, having non-contiguous data is possible due to contiguous [::] object-name-list non-unit stride array indexing, for example: Testing contiguity vector(::3) ! Every third element logical :: is_contiguous(arr) where arr is an array of any type. The function returns .TRUE. if arr is contiguous and .FALSE. otherwise.

41 42 CONTIGUOUS attribute … CONTIGUOUS attribute …

Potential performance benefits program contigtest implicit none – Array traversal and element address computation is simplified integer, parameter :: m = 6, n = 5 – Better vectorization properties real, target :: A(m,n) real, pointer :: A_flat(:), A_diag(:) – No need to generate auto-dispatch code for contiguous data real, allocatable, contiguous :: b(:) separately A = 0 ! A is simply contiguous A_flat(1:m*n) => A ! A_flat is simply contiguous Simply contiguous: no need to declare the array as A_diag => A_flat(::m+1) ! A_diag is non-contiguous contiguous. Some examples: A_diag = 1 allocate(b(m)) ! b is contiguous – A whole array of that is not of assumed shape or pointer ! Outputs 'T,T,F,T' – Continuous section of a (simply) contiguous array write (*,'(L,L,L,L)') is_contiguous(A), is_contiguous(A_flat),& is_contiguous(A_diag), is_contiguous(b) end program contigtest

43 44

Asynchronous I/O Asynchronous I/O …

Both input and output I/O INPUT Ts + 1 PROGRAM async_io When all I/O statements are performed in the same routine, can be asynchronous i.e. INTEGER :: id then the arrays / variables need – Other statements may be Compute f(Ts) REAL :: A(1000000), B(1000) OPEN(10, ..., asynchronous='yes') not to be declared with executed whilst I/O is in asynchronous –attribute progress in the background READ(10, id=id, asynchronous='yes') A OUTPUT Ts CALL do_something_with(B, size(B)) – Only in OPEN/READ/WRITE Caveat: it is an WAIT(10, id=id) ! Blocks until A read in You must also not access array A implementation dependent WAIT INPUT CLOSE(10) whilst it is being processed by factor whether I/O actually CALL do_something_with(A,size(A)) async I/O END PROGRAM async_io is performed asynchronously Prepare f(Ts+1) It is a good practice to use L identifier id for each WAIT OUTPUT asynchronous READ/WRITE and pass it to the WAIT statement

45 46 Asynchronous I/O … Non-blocking asynchronous I/O

PROGRAM async_io INTEGER :: id, IA(1000000) If you OPEN your file for By calling WAIT –function, you may potentially wait for a very CALL initialize(IA,size(IA)) asynchronous I/O, but long time before your I/O operation(s) are complete OPEN(10, ..., asynchronous='yes') perform actual Instead, you can also periodically call the INQUIRE –function CALL async_write(10,id,IA,size(IA)) CALL do_something_else_here( ) operations (e.g. WRITE) and request the status of your pending asynchronous I/O WAIT(10, id=id) in other procedure, then operation(s) – and in the meantime do something else END PROGRAM async_io the arrays involved in LOGICAL :: status SUBROUTINE async_write(iu, id, ia, n) the I/O must also be INQUIRE(unit = iu, id = myid, pending = status) INTEGER, intent(in) :: iu, n INTEGER, intent(in), asynchronous :: ia(n) declared with If status==.FALSE. à outstanding I/O with myid not yet complete INTEGER, intent(out) :: id asynchronous –attribute – Unfortunately INQUIRE (…, pending = …) appears to be blocking rather WRITE(iu, id=id, asynchronous='yes') ia than non-blocking even with id= parameter given : thus equals to WAIT END SUBROUTINE async_write

47 48

I0 and G0 edit descriptors in FORMAT Summary

Dynamic sizing of REAL and INTEGER valued output It is now possible to obtain command line arguments, access – I0 appeared in F2003 and G0 was introduced in F2008 environment variables and run operating system commands Correspond to C-language % and %g fprintf –formats in a standardized way Output fields become left justified with all the unnecessary Dynamic memory allocation has also been improved by leading blanks (and precision for REALs) removed means of enhancements to ALLOCATABLE variables INTEGER :: I = 12345 REAL(kind=4) :: SP = 1.23e0 Two new POINTERs improvements were introduced REAL(kind=8) :: DP = 1.234567890d0 CONTIGUOUS –attribute for arrays enables more optimization WRITE(*,fmt='("")') I,SP,DP Asynchronous I/O enables overlap with computation Output is (the “˽” denotes a space character) New edit descriptors enable dynamic sizing of output

49 50 DERIVED TYPES IN FORTRAN 2003/2008

Types and procedure pointers

51 52

Derived types Derived types: declaration type person For large applications, Derived type declaration in Fortran 90/95: integer :: age character(len=20) :: name using only intrinsic types is type [[,access]:: ] type_name end type person often insufficient [private] type(person_type) :: p1 To have the data as larger def_stmt integer :: p1_age [def_stmt] character(len=20) :: p1_name objects, it is beneficial to group the data together as end type [type_name] p1_age = 20 p1_name = 'John' larger objects access field can be public or private p1 = person(p1_age, p1_name) – Code becomes easier to call print_person_bad(p1_age,p1_name) Inside the type, private defines private component call print_person_good(p1) read and maintain access – Cleaner interfaces – Encapsulation of data

53 54 Derived types: declaration Derived types: examples of declaration

Derived type fields defined by def_stmt: type base_type type [[, attr_list]::] decl_list integer :: field = 1 attr_list is composed of pointer or allocatable and end type base_type dimension attributes type :: array_type integer, pointer :: field1(:) decl_list is composed of: ! Equivalently integer, dimension(:), pointer :: field1 name [(bounds)][*char_len] [init] integer :: field2(10) end type array_type In case of pointer attributes, type can point to enclosing type type recursive_type type(recursive_type), pointer :: next bounds is : for allocatable and pointer components and end type recursive_type [lower:]upper for all others

55 56

Derived types: component visibility Derived types: component visibility type person_exposed Restricting component A derived type is fully visible within the host module integer :: age character(len=20) :: name access enforces good For other modules (after use or import -statement) end type person_exposed programming practice type person_encapsulated – public types are fully visible private – Clearer division of – components of public types with private specifier are integer :: age responsibilities (who character(len=20) :: name hidden end type person_encapsulated modifies what) ! ... – Better encapsulation – All private types are hidden ! In other module type(person_exposed) :: p1 Recommendation: restrict type(person_encapsulated) :: p2 ! Exposed components access to components of p1 % age = p1 % age + 1 derived types whenever ! Encapsulated components call increment_age(p2) possible

57 58 Types: parameterization Types: parameterization

Fortran 2003 allows parameterization of type mycomplexvec(rk, n) There is often need to integer, kind :: rk change the kind – Intrinsic types with KIND with and LEN attributes integer, len :: n real(kind=rk) :: realpart(n) parameter or the length of – Derived types with integer KIND and LEN attributes real(kind=rk) :: imagpart(n) data end type mycomplexvec Type parameters – Solution: create a integer, parameter :: & ”template” where the – Kind parameter KIND must be known compile time sp = selected_real_kind(4) data fits allowing code – Length parameter LEN can be deferred until runtime type(mycomplexvec(sp, 3)) :: cvec re-use and better Type enquiry ! Initialize compiler optimizations cvec % realpart=[1E0,0E0,-1E0] – Variable: via implicitly declared LEN and KIND components cvec % imagpart=[0E0,1E0,0E0] Idea similar to C++ templates – Derived type: via LEN and KIND –components of the type

59 60

Derived types: parameterization Derived types: parameterization example

Parameterized type_name definition becomes: ! Type definition ! Declaration type matrix(rk, m, n) integer, parameter :: & type_name (type_spec_list) integer, kind :: rk dp = selected_real_kind(8) integer, len :: m, n type(matrix(dp,10,10)) :: m1 List of type specifiers are of the form type_spec: real(kind=rk) :: value(m,n) type(matrix(rk=dp,m=20,n=20)) :: m2 [keyword =] type_param_value end type matrix type(matrix(dp,:,:)), pointer :: m3 The definitions for keyword = clause are consistent with ! Allocation allocate(matrix(rk=dp,m=30,n=30)::m3) those of intrinsic types: any further type parameter ! KIND and LEN inquiry specifications must include a keyword write (*,*) m1 % rk, m1 % m, m1 % n Default values for type parameters with default initialization syntax

61 62 Derived types: allocatable components Derived types: initialization

Fortran 2003 allows derived types to have components In structure constructors for derived types with the ALLOCATABLE attribute – Pointer components: must be matched with target, – Derived type fields defined by def_stmt pointer or null() type [[, attr_list]::] decl_list, – Allocatable components:must be matched with data item where attr_list is composed of pointer, or null() allocatable, dimension and contiguous (F08) Initial state for allocatable components is unallocated attributes Enhancements of F03 allocatable semantics apply to allocatable components of derived types Recursive reference to enclosing type is allowed in F08

63 64

Derived types: initialization example Derived types: assignment semantics

! Type definitions ! Type allocations Assignment semantics differ by component type type pointer_type type(body) :: data1, data2 type(body) :: data type(pointer_type) :: r1 – Pointer components: shallow copy, i.e., pointer assignment type(pointer_type), pointer :: next type(pointer_type), target :: r2 end type pointer_type type(alloc_type) :: a1, a2 is performed type alloc_type ! Pointer components – Allocatable components: deep copy, i.e., the required character(len=3) :: type r2 = pointer_type(data2, null()) real, allocatable :: data(:) r1 = pointer_type(data1, r2) storage is automatically allocated and data is copied end type alloc_type ! Allocatable components a1 = alloc_type('nul',null()) a2 = alloc_type('vec', [1.2, 2.3, 4.5])

65 66 Derived types: assignment semantics ASSOCIATE construct type(pointer_type) :: r1, r2 ! Compute y = x + 1 Deeply nested types have type(alloc_type) :: a1, a2 some % nested % type % y = & some % nested % type % x + 1 a drawback of making the r1 = r2 ! Equivalent to code more difficult to ! r1 % data = r2 % data associate(x=>some % nested % type %x, ! r1 % next => r2 % next y=>some % nested % type %y) understand ! Compute y = x + 1 y = x + 1 Not just a code readability a1 = a2 ! Equivalent to end associate problem: the compiler ! a1 % type = a2 % type ! if (allocated(a1 % data)) deallocate(a1 % data) may not be able to ! allocate(a1 % data(size(a2 % data))) perform certain ! a1 % data = a2 % data optimizations when nested types are used

67 68

ASSOCIATE construct ASSOCIATE construct

ASSOCIATE construct allows defining aliases to lenghty Within the block, associate_name takes the type, type descriptions of variables for the duration of the associate parameters and rank from the association block. – Inherited: polymorphism, type, array rank and shape. Also associate (association_list) asynchronous, target and volatile attributes block – Not inherited: pointer, allocatable and optional attributes end associate If associate_name is a variable, it can be used as a List of associations is of the form association: variable within the block and is assignable associate_name => variable If associate_name is an expression, association is used or for the expression value within the block and is not associate_name => expression assignable

69 70 ASSOCIATE construct example associate( x => some % nested % type % x, y => some % nested % type % y, z => some % nested % type % z) ! Compute z = sqrt(x**2+y**2) z = sqrt(x**2+y**2) ! Possibly other computations involving x, y and z ABSTRACT INTERFACES AND POINTERS end associate

71 72

Interfaces Interfaces interface Programs may have to call For external procedures, interfaces determine the type subroutine not_dangerous(a, b, c) integer :: a, b, c routines outside the and properties of arguments and return values end subroutine not_dangerous program itself (external end interface Defined by INTERFACE block: libraries) or provide interface integer :: x, y, z services to others APIs x = 1; y = 1; z = 1 interface-body ! Call external subroutine without Wrong calling arguments end interface ! an interface may lead to errors during call dangerous(x,y,z) The interface-body matches the subprogram header ! Call external subroutine with the executable linking ! an interface – position, rank and type of arguments call not_dangerous(x,y,z) phase or even when the executable is being run – return value type and rank (for functions) Interface needed when procedures used as arguments

73 74 Interfaces example Abstract interfaces

! LU decomposition from LAPACK interface Different procedures with a similar interface must be subroutine DGETRF(M, N, A, LDA, IPIV, INFO) declared separately, leading to repeated code integer :: INFO, LDA, M, N integer :: IPIV( * ) Declaration of a generic interface: double precision :: A( LDA, * ) end subroutine DGETRF abstract interface end interface abstract-interface-body ! Euclidean norm from BLAS end interface interface function DNRM2( N, X, INCX ) result(double precision) Routines declared in abstract-interface-body integer :: N, INCX must not have an actual implementation in the current double precision :: X( * ) end function DNRM2 scope end interface

75 76

PROCEDURE statement Abstract interface example

Abstract interface can be referenced with procedure ! Abstract interface definition ! Procedure definition abstract interface procedure(sub_with_no_args) :: sub1 statement subroutine sub_with_no_args() procedure(trig) :: mysin, mycos procedure ([iface]) [[, spec … ] ::] decl-list end subroutine sub_with_no_args function trig(x) result(y) ! Implicit interfaces where spec is one of: public, private, bind, real, intent(in) :: x ! No known return value nor real :: y ! arguments intent, optional, pointer, save and list of end function trig procedure() :: x declarations decl: end interface ! Real return value, arguments ! unknown name [=> null-init] procedure(real) :: y null-init references intrinsic function null() with no arguments and may only appear with pointer Implicit interfaces similarly to external statement

77 78 Procedure pointers Procedure pointers example

Procedure statement allows definition of pointers to ! Interface definition type base abstract interface procedure(sub), pointer, pass(x) :: p procedures (explicit or implicit interface) function trig(x) result(y) end type base real, intent(in) :: x abstract interface A derived type can have procedure pointers as real :: y subroutine sub(a,b,x) end function trig import base components end interface real :: a,b class(base) :: x procedure(trig), pointer :: p1 end subroutine sub When a procedure pointer is invoked as a component of p1 => mysin end interface a derived type, the object itself is passed to the write (*,*) p1(x) ! Prints mysin(x) type(base) :: t1 procedure as the first actual argument t1 % p => mysub ! Equivalent to call mysub(1.0,2.0,t1) – pass argument allows passing invoking object as any call t1 % p(1.0,2.0) dummy argument – nopass argument allows not passing invoking object at all

79 80

Summary

Derived types are a natural way to represent data Encapsulation of derived type components is possible with the PRIVATE attribute Parameterization of derived types is possible with KIND and LEN attributes Abstract interfaces and PROCEDURE statement provide type-safe access to external routines Procedure pointers provide functionality similar to C- function pointers and can be bound to derived types

81 What is object-oriented programming?

Program is separated into interacting objects Objects couple the data and the methods operating on the data Generic programming: the actual type of data and the associated implementation may be encapsulated and abstracted Maintainability, readability and modifiability of the code Object-oriented programming are improved Performance penalty of a well-designed object oriented code is usually negligible

82 83

Class diagrams Class diagrams: class members

In OO programming, UML (Unified Modeling Language) Name describes class name class diagrams are typically used to describe relationships Description of fields and between classes methods visibility, type and Name Used for conceptual and data modelling of the arguments +field1 : type application structure Visibility specifiers #field2 : type -field3 : type Automatic generation of code from class diagrams – ”+ ” public _field4 : type possible (with Java and C++) – ”#” protected +method1(arg: type) – ”-” private #method2(arg: type) – _ -method3(arg: type) ” ” static _method4(arg: type)

84 85 Class diagrams: instance relationships Class diagrams: class relationships

Association Generalization Class 1 Class 2 – ”relates to”, link between … … – ”is a”, parent class Class 1 Class 2 … … … … two or more classes generalizes the extending … … Aggregation class; inheritance – ”has a”, part-whole or part- Class 1 Class 2 Realization … … of relationship between … … – ”uses”, a use relationship Class 1 Class 2 … … two classes between two or more … … Composition classes Class 1 Class 2 Multiplicity 0 … 1 One or zero instances – ”owns a”, a lifecycle … … 1 0 … 1 … … aggregation between two – ”count”, countability of the * Any number of instances classes relationship 1 … * More than one instance

86 87

Object-oriented programming in Fortran

Fortran 2003/2008 supports object-oriented (OO) programming – Type extensions, polymorphism (single inheritance) OBJECT-ORIENTED FORTRAN: BASIC TOPICS – Type and- object-bound procedures, finalizers, type-bound generics – Abstract interfaces and types Object model designed to maintain backwards compatibility with Fortran 95

88 89 IMPORT statement Type extension

Interface does not access its environment via host type :: base Type extensions are used integer :: field1 association, i.e., named constants and derived types end type base to extend the functionality from enclosing module not available of an existing type – type, extends(base) :: extended USE statement requires breaking of encapsulation integer :: field2 Type extensions are Fortran 2003 addresses this with import statement: end type extended backwards compatible, import [ [::] import-name-list] because every instance of Base extended is also an where import-name is an entity to be accessed from +field1 : integer … instance of base the host import without arguments makes all host entities Extended accessible +field2 : integer …

90 91

Type extension Type extension: EXTENDS

To be eligible for type extension, parent type must be Type definition statement with extends attribute extensible, i.e., be a derived type with neither sequence type [[,access][,extends(pname)] :: ] tname or bind(C) attributes … Extensions can have zero additional components end type [tname] New type components or type parameters cannot have pname defines an existing derived type to be extended the same name as inherited ones All components as well as type parameters of the parent Overriding of old type-bound procedures is allowed type pname are inherited by extended type tname Type tname has an additional parent component of the derived type pname with the name pname

92 93 Type extension example Polymorphism

! Type describing a person type :: base Polymorphism = ”The type :: person Person integer :: field1 character(len=10) :: name name : character end type base provision of a single integer :: age age : integer interface to entities of end type person … type, extends(base) :: extended integer :: field2 different types”, Bjarne ! Employee has fields: end type extended Stroustrup, creator of C++ ! name, age, salary, person Employee type, extends(person) :: employee type(extended), target :: t1 Object polymorphism: a integer :: salary salary : integer class(base), pointer :: p1 end type employee … variable can have t1 = extended(1,2) instances from different ! Staff has fields: ! p1 is of class base ! name, age, salary, employee Staff p1 => t1 classes as long as they are type, extends(employee) :: staff … … related by a common base end type staff class

94 95

Polymorphism: CLASS Polymorphism limitations

Type of a polymorphic variable may vary at run time Polymorphic variable can only appear in an input/output Declaration of a polymorphic variable: list if it is processed by derived type output class(type_name)[, class_attr ] :: name Intrinsic assignment to polymorphic variable is not where class_attr may be omitted for dummy data allowed unless it is associated with a non-polymorphic objects, and is either pointer or allocatable variable (Fortran 2008 allows assignment of otherwise allocatable variables) Polymorphic variables are type-compatible with Arrays of polymorphic variables are always homogeneous type_name and all extensions of type_name – Use an array of derived type having polymorphic pointer or allocatable component

96 97 Polymorphism example Polymorphic object initialization subroutine write_name(this) type :: base Actual type of a class(person) :: this integer :: field1 write (*,*) this % name end type base polymorphic object is only end subroutine write_name known during runtime type, extends(base) :: extended type(person) :: p1 integer :: field2 Polymorphic objects can type(staff), target :: p2 end type extended class(employee), pointer :: p3 be allocated only integer, parameter :: n = 5 dynamically p1 = person(name='Joe',age=20) integer :: i p2 = staff(name='Mike',age=42,salary=2000) type(extended) :: t1(n) Polymorphic object p3 => p2 class(base), allocatable :: a1(:) allocation may be based call write_name(p1) ! Prints 'Joe' t1 = (/ (extended(i,i**2),i=1,n) /) on previously allocated call write_name(p3) ! Prints 'Mike' allocate(a1,source=t1) type-compatible data

98 99

Polymorphic object initialization Sourced object initialization

ALLOCATE statement extended to allow definition of Values may be initialized based on a SOURCE object type, type parameter values, type values and size (for (single object in Fortran 2003, multiple objects in Fortran array objects) 2008) Fortran 2003/2008 ALLOCATE statement allocate(alloc [source=expr] [,stat=astat]) allocate([type_spec::] alloc_list [,stat=astat]) SOURCE object expression expr is type-compatible with where type_spec is type_name followed by type alloc. For arrays, expr and alloc must be of the same parameter values in parentheses rank In Fortran 2008 omitting array bounds from both alloc and source expressions is allowed

100 101 Sourced object initialization example Molded object initialization integer, parameter :: n = 5 integer :: i Fortran 2008 allows objects to have the shape, type and real :: a(n) type parameters without copying the object values by real, allocatable :: b(:), c(:) using MOLD a = (/ (i, i=1,n) /) allocate(alloc [mold=expr] [,stat=astat]) allocate(b(lbound(a,1):ubound(a,1)),source=a) ! Fortran 2003 allocate(c,source=b) ! Fortran 2008 MOLD object expression expr must be type-compatible ! a is [1, 2, 3, 4, 5] with alloc. For arrays, expr and alloc must be of the ! b is [1, 2, 3, 4, 5] same rank ! c is [1, 2, 3, 4, 5] After the allocation any default initialization values are applied to alloc

102 103

Molded object initialization example SELECT TYPE construct type(person) :: p1 subroutine write_output(obj) SELECT TYPE –construct type(employee) :: p2 implicit none class(person), allocatable :: p3,p4 class(base) :: obj allows the programmer to determine the type of a p1 = person('Joe', 10) select type(obj) type dynamically during p2 = employee('Mike', 42, 2000) type is (extended) allocate(p3, source=p1) ! p3 is a copy of p1 write (*,*) obj % field1, & runtime allocate(p4, mold=p2) ! p4 has the same type as p2 & obj % field2 Type determination is call write_name(p1) ! Prints 'Joe' type is (base) call write_name(p2) ! Prints 'Mike' write (*,*) obj % field1 relatively expensive – call write_name(p3) ! Prints 'Joe' end select avoid using in call write_name(p4) ! Prints ' ' end subroutine write_output performance critical parts of the program

104 105 SELECT TYPE construct SELECT TYPE construct cont.

Type of polymorphic objects may be determined at Each guard_stmt is runtime by using SELECT TYPE construct type is (type_spec) [name:] select type ([assoc_name =>] selector) type is (intrinsic_type [(param_value_list)]) [guard-stmt [name] class is (type_spec) block ] class default … with type specifier type_spec is defined before end select [name] type_spec needs is required to be type compatible with The selector may be named with assoc_name selector similarly to ASSOCIATE construct Intrinsic type specifiers are allowed only when selector is unlimited polymorphic

106 107

SELECT TYPE construct cont. SELECT TYPE example

SELECT TYPE matches the block in the following way: subroutine write_person(this) class(person), intent(in) :: this i. If derived type and type parameters match exactly, the select type (this) block is executed type is (staff) write (*,*) 'Staff (name,age,salary):', this % name, & ii. If derived type class and type parameters match, the this % age, this % salary block is executed. If more than one block matches, the type is (employee) write (*,*) 'Employee (name,age,salary):', this % name, & block which is an extension of the type of all the others this % age, this % salary is executed (most closely matching type) class is (person) block class default write (*,*) 'Person (name,age):',this % name, this % age iii. The following is executed class default Within the matched block, selector type is stop 'Unknown person type' end select type_spec end subroutine write_person

108 109 Procedures as type components Object-bound procedures

In OO programming, procedures acting on the data are Object-bound procedures can be added as procedure tied to the data pointer type components Invocation of procedures for polymorphic variables via procedure ([iface]) , pointer,[[, spec] ::] decl-list where iface defines the procedure interface to be used dynamic dispatch (runtime decision) and spec is defined as previously In Fortran 2003 procedures can be tied to Unless initialized to be bound to an existing procedure, – objects (object-bound) procedure pointer has to be initialized before it can be – types (type-bound) used The use of type-bound procedures is encouraged, as they enforce the object encapsulation and can be overriden

110 111

Object-bound procedures example Object-bound procedures example cont. type :: person subroutine print_per(this) type(person) :: per character(len=10) :: name class(person) :: this type(employee) :: emp integer :: age write (*,*) this % name, & ! Initialize per and emp procedure(print_per),pointer ::& this % age per = person(name='Joe',age=10) Person print_person => null() end subroutine emp = employee(name='Mike',age=42,& name : character print_per end type person salary=2000) age : integer subroutine print_emp(this) … type, extends(person) :: employee class(employee) :: this per % print_person => print_per integer :: salary write (*,*) this % name, & emp % print_person => print_per procedure(print_emp),pointer ::& this % age, & emp % print_employee => print_emp Employee print_employee => null() this % salary end type employee end subroutine call per % print_person() salary : integer print_emp call emp % print_employee() …

112 113 Type-bound procedures Type-bound procedures type :: base Use of SELECT TYPE – With type-bound procedures, type definition statement integer :: field1 contains construct can be avoided includes a CONTAINS part procedure :: & by overriding type-bound type [[,access]:: ] type_name write_output => write_base end type base procedures … contains type, extends(base) :: extended Type-bound procedures integer :: field2 are dynamically deter- proc_def_stmt contains end type [type_name] procedure :: & mined for polymorphic write_output => write_extended variables during runtime where proc_def_stmt defines the procedure or generic end type extended and have a smaller tied to the type ! Implementations of write_base and performance penalty than ! write_extended omitted a full type determination

114 115

Type-bound procedures cont. Procedure overriding

Declaring a type-bound procedure with proc_def_stmt A way to add extend also the functionality of an procedure[(iface)][[,attr_list]::] tbpname [=>pname] extended type is to override the procedures defined by where each attribute attr in attr_list is one of the base type public or private When extending a type, a procedure can be overriden by deferred simply defining a new proceduce with the same name non_overridable nopass or pass[(arg_name)] Apart from the type of the passed-object dummy argument which must be of the extended type, Procedure interface name iface defines the procedure overriding procedure must have exactly the same interface to be used and appears if and only if deferred interface as the overridden procedure attribute appears NON_OVERRIDABLE attribute to prevent overriding

116 117 Type-bound procedures example Type-bound procedures example cont. type :: person subroutine print_per(this) type(person) :: per character(len=10) :: name class(person) :: this type(employee) :: emp Person integer :: age write (*,*) this % name, & ! Initialize per and emp name : character contains this % age per = person(name='Joe',age=10) procedure, non_overridable :: & end subroutine emp = employee(name='Mike',age=42,& age : integer print_person => print_per salary=2000) print_person procedure :: & subroutine print_emp(this) print_info print_info => print_per class(employee) :: this ! Calls print_per end type person write (*,*) this % name, & call per % print_info() this % age, & ! Calls print_emp type, extends(person) :: employee this % salary call emp % print_info() integer :: salary end subroutine Employee contains procedure, non_overridable :: & salary : integer print_employee => print_emp procedure :: & print_employee print_info => print_emp print_info end type employee

118 119

Summary

In the object-oriented programming paradigm data is coupled with the methods operating on the data Fortran 2003/2008 have limited support the object- oriented programming model – Type extensions – Type-bound procedures – SELECT TYPE -construct for determining the type during runtime

120 OBJECT-ORIENTED FORTRAN: ADVANCED TOPICS

Object-oriented programming

121 122

Type access control Type component access control in Fortran 2003

In Fortran 95, outside the defining module, access to a Fortran 2003 allows mixed component accessibility for type can be derived types. Type component definition statement i. All public, where the type and all of its components def_stmt: are accessible type [[, attr_list] [,access]::] decl_list ii. For a type public, but private to the components of where access is public or private. the type Defining component accessibility overrides any default iii. All private, where both the type and all of its component access set to the type components are hidden Accessibility for type-bound procedures in the contains Using at least ii) is recommended for better –section of a derived type is declared independently of encapsulation of the data the component accessibility

123 124 Type access control example Module access control in Fortran 2003 module people_module type person In addition to public and private, Fortran 2003 allows private module variables to be defined protected character(len=10), public :: name integer :: age A protected variable has contains procedure :: & – public visibility print_info => print_per ! Access to print_per via type is public end type person – Protection against modification outside the defining private :: print_per ! Access to print_per via module is private module similar to intent(in) attribute contains ! Definition of print_per as before Can also be defined to be the default accessibility for all end module people_module variables in a module similarly to public and private

125 126

Module access control example Generic interfaces module access_module module mymodule Modules can contain private ! Module default access is private type myreal integer, protected :: v1 ! v1 is protected real :: value generic interfaces to integer :: v2 ! v2 is private end type procedures and operators integer, public :: v3 ! v3 is public interface operator(+) public :: sub1 module procedure myadd Module operator contains end interface subroutine sub1(…) ! sub1 is public ! Other interfaces to operations overloading allows natural end subroutine sub1 contains use of types subroutine sub2(…) ! sub2 is private function myadd(x,y) result(z) end subroutine sub2 implicit none – Automatic determination subroutine sub3(…) ! sub3 is private type(myreal), intent(in) :: x, y of routines to call already end subroutine sub3 type(myreal) :: z end module access_module z % value = x % value + y % value during compile time end function myadd ! Other operations with myreal end module mymodule

127 128 Generic interfaces Generic interfaces cont.

Fortran 95 allows defining and overloading generic module procedure statement can only be present names for procedures with module interface blocks when generic_spec is present interface [generic_spec] If generic_spec is specified as operator, all the [interface_body] procedures within the block must be functions with one [module procedure proc_name_list] or two non-optional arguments with intent in end interface [generic_spec] If generic_spec is specified as assignment, all the where generic_spec is generic_name, procedures must be with two non-optional operator(op) or assignment(=) arguments, the first having intent out or inout and the Visibility of the generic interface is always public second intent in Procedure arguments must be uniquely determinable

129 130

Generic interface example GENERIC type-bound procedures module access_module module mymodule In Fortran 2003 types can private ! Module default access is private type myreal … real :: value contain generic interfaces public :: sub1 contains to procedures and interface subs ! subs interface is public procedure :: add => myadd module procedure sub2, sub3 generic :: operator(+) => add operators end interface subs end type contains ! Other interfaces to operations When a type is accessible, subroutine sub1(…) ! sub1 is public contains all of its operators are end subroutine sub1 function myadd(x,y) result(z) subroutine sub2(…) ! sub2 is private implicit none accessible end subroutine sub2 class(myreal), intent(in) :: x, y – Avoids programming subroutine sub3(…) ! sub3 is private type(myreal) :: z end subroutine sub3 z % value = x % value + y % value errors from incorrect use end module access_module end function myadd of the USE statement ! Other operations with myreal end module mymodule

131 132 GENERIC type-bound procedures GENERIC procedures example

Type-bound procedures may be defined as generic with module person_module function create_person(myname, & type person myage) result(per) the GENERIC statement private character(len=10) :: myname character(len=10) :: name integer :: myage generic [[,access] ::] generic_spec => tbp_name_list integer :: age type(person) :: per contains per = person(name=myname,age=myage) where generic_spec is as before and each tbp_name private end function defines a type-bound procedure to be included in the procedure :: myname => get_name subroutine get_name(this, name) procedure :: myage => get_age class(person), intent(in) :: this generic set generic, public :: & character(len=10) :: name get => myname, myage name = this % name Procedures associated with the generic must not have end type person end subroutine get_name subroutine get_age(this, age) the nopass attribute interface person class(person), intent(in) :: this module procedure create_person integer :: age end interface age = this % age The extending type always extends the generic set contains end subroutine get_age end module person_module

133 134

GENERIC procedures example cont. Unlimited polymorphic entities use person_module type(person) :: t1 References to any type (including intrinsic type) can be character(len=10) :: name realized with unlimited polymorphic (UP) pointer integer :: age class(*), pointer :: name t1 = person('Joe',10) ! Invokes create_person call t1 % get(name) ! Invokes get_name UP pointer can only be used as an actual argument, call t1 % get(age) ! Invokes get_age pointer target or selector in select type statement When an allocate statement is used with an UP pointer, required type and type parameters must be specified

135 136 Unlimited polymorphic entities example Abstract types class(*), pointer :: v1, v2(:) type, abstract :: abs_base Abstract type creates a type(person), target :: p1 end type abs_base real, pointer :: r1(:) base type for all the type, extends(abs_base) :: extended extending types to build v1 => p1 integer :: field1 allocate(real :: v2(10)) end type extended upon select type (v2) type(extended), target :: t1 Abstract type can be used type is (real) class(abs_base), pointer :: p1 to create common r1 => v2 class default t1 = extended(i) interfaces for all the stop 'Error in type selection' p1 => t1 extending classes to end select implement Abstract types cannot be instantiated

137 138

Abstract types Abstract types cont.

An type can be abstract with deferred procedures Extensions of an abstract type may be non-abstract having no actual implementation A polymorphic variable can be declared to be of abstract type, abstract :: type_name type, but a variable cannot have an abstract type as its … contains actual type [procedure(iface),deferred,[[,attr_list]::] tbpname] Abstract types are commonly used in OO programming … end type type_name model to create common interfaces or to create programming hooks Type can be abstract with no deferred procedures, but a type with deferred procedures must be abstract

139 140 Abstract types example Abstract types example cont. type, abstract :: abstract_person type, extends(abstract_person):: person use person_module private contains class(abstract_person), & abstract_person character(len=10) :: name procedure :: & allocatable :: p1 name : character integer :: age print_person => print_per contains end type person p1 = person(name=’Joe’,age=10)) age : integer procedure(abs_print_if), deferred & call p1 % print_person() :: print_person subroutine print_per(this) print_person end type abstract_person class(person) :: this abstract interface write (*,*) this % name, & subroutine abs_print_if(this) this % age import :: abstract_person end subroutine class(abstract_person) :: this end subroutine abs_print_if person end interface

print_person

141 142

FINAL components FINAL components type handle Housekeeping (closing file Subroutines of a derived type can be declared final private ! Data containing handle handles, deallocating final [::] subroutine_name_list contains pointer components etc.) where subroutine_name defines a set of subroutines final :: close_handle end type handle may be needed when a one of which is run when the derived type is deallocated type is deallocated subroutine close_handle(this) subroutine_name must be a subroutine with a single implicit none FINAL methods of a class(handle) :: this argument of the derived type, with uniquely defined ! Implementation omitted derived type are parameter values and rank end subroutine close_handle automatically called whenever the type is Final subroutines are not type-bound procedures and deallocated cannot be accessed through a type

143 144 Final components cont. Final components example

module final_module An object is finalizable if it has a final subroutine type :: final_type which matches the object … contains A non-pointer object will be finalized when it is final :: destroy_scalar_final, destroy_rank1_final end type final_type deallocated, goes out of scope, is passed to as an intent contains out argument or used as a left-hand side of an intrinsic subroutine destroy_scalar_final(this) type(final_type) :: this assignment statement … end subroutine destroy_scalar_final Derived type containing finalizable components will subroutine destroy_rank1_final(this) type(final_type) :: this(:) be finalized before the individual components … end subroutine destroy_rank1_final Termination of a program does not invoke any final end module final_module subroutines

145 146

Final components example cont. FINAL components and extended types subroutine sub1(n) integer :: n Extended derived types do not inherit any of the final type(final_type) :: f1, f2(n) subroutines of the old type … ! At the end of subroutine, destroy_scalar_final Finalizing an extended derived type causes the parent ! will be called for f1 and destroy_rank1_final for f2 end subroutine sub1 type to be recursively finalized after the extended type has been finalized

147 148 FINAL components and extended types example FINAL comp. of extended types example cont. module ext_final_module subroutine sub1(n) use final_module integer :: n type, extends(final_type) :: ext_final_type type(ext_final_type) :: f1, f2(n) contains … final :: destroy_scalar_ext_final, destroy_rank1_ext_final ! At the end of subroutine, destroy_scalar_ext_final end type ext_final_type ! will be called for f1, then destroy_scalar_final for contains ! f1 % final_type, destroy_rank1_ext_final for f2 and subroutine destroy_scalar_ext_final(this) ! destroy_rank1_final for f2 % final_type type(ext_final_type) :: this end subroutine sub1 … end subroutine destroy_scalar_ext_final subroutine destroy_rank1_ext_final(this) type(ext_final_type) :: this(:) … end subroutine destroy_rank1_ext_final end module final_module

149 150

Summary

In Fortran 2003/2008, mixed access control is allowed for better encapsulation of the methods of derived types GENERIC interfaces can be added to modules and derived types to make use of mixed types easier and the code more readable A polymorphic variable can be declared unlimited polymorphic to be type-compatible with any data A derived type can be declared abstract A derived type can have FINAL components

151 Outline

Language interoperability issues Module ISO_C_BINDING – Mapping of C intrinsic data types in Fortran – Mapping of C derived data types in Fortran – Utility functions Calling C routines from Fortran Language interoperability Mapping and accessing data from Fortran I/O interoperability

152 153

Language interoperability issues Traditional interoperability, i.e., “do not use this”

The problem of language interoperability has been By using the facts that Fortran compilers... present for a long time – Pass all procedure arguments by-reference (i.e. by-address) Consider the following scenario: – Usually refer to functions and subroutines in lowercase letters – An application program is written entirely in Fortran and by adding an additional underscore after, e.g. Func – A library, written entirely in C by some third party, would becomes func_ to be found by the linker be useful in the Fortran program – Usually pass function return values via stack – How to exploit the library written in C in a portable way? – Usually pass CHARACTER-strings by-reference, with an additional hidden length argument, passed by value. NOTE: Fortran CHARACTER-strings are not null-character terminated …C code can be referenced from Fortran

154 155 Traditional interoperability Traditional interoperability: example

Consider a case, where CBLAS-function is to be called in ! F_calls_C.F90 /* cblas_sdot_wrap.c */ program F_calls_C /* This approach works in 99% of the order to calculate a dot product in single precision implicit none cases in Unix. */ integer, parameter :: N=100 #include The C function prototype for “cblas_sdot“ from real :: X(N), Y(N), SDOT float sdot_func_(const float *X, REAL, external :: SDOT_FUNC const float *Y, is: const int *N) ! Init X & Y float cblas_sdot( const int N, { /* Wrapper-code */ const float *X, const int incX, X(:) = 1.5e0; Y(:) = 2.5e0 ! Wrapper call /* Actual call to CBLAS-routine */ const float *Y, const int incY) SDOT = SDOT_FUNC(X, Y, N) float sdot = cblas_sdot(*N, write (*,*) 'SDOT=', SDOT X, 1, In order to call this from Fortran – using traditional, not- Y, 1); end program F_calls_C return sdot; recommended approach – an extra wrapper layer is } needed

156 157

Interoperability with Fortran 2003 Interoperability with Fortran 2003: example

program F_calls_C The traditional way to have interoperability with C is a use, intrinsic :: ISO_C_BINDING implicit none non-portable hack, i.e., it requires a-priori knowledge of integer(kind=C_INT), PARAMETER :: N = 100 lowercase and underscore policy used by the compiler real(kind=C_FLOAT) :: X(N), Y(N), SDOT interface function cblas_example(N, X, INCX, Y, INCY) & Complex cases, such as passing CHARACTER–strings or bind(c,name='cblas_sdot') RESULT(anumber) use, intrinsic :: ISO_C_BINDING passing arguments by value, are generally very error integer(kind=C_INT), value :: N, INCX, INCY ! Pass by value prone and may lead to catastrophic errors at runtime real(kind=C_FLOAT), intent(in) :: X(*), Y(*) ! intent(in)=const float * real(kind=C_FLOAT) :: anumber end function cblas_example With Fortran 2003, interoperability with C is defined in end interface ! Init X & Y the standard by using INTERFACE-blocks with the X(:) = 1.5e0; Y(:) = 2.5e0 bind(C)-attribute SDOT = cblas_example(N, X, 1, Y, 1) ! A direct call to ’cblas_sdot’ write (*,*) 'SDOT=',SDOT end program F_calls_C

158 159 The ISO_C_BINDING module Calling C routines

Fortran 2003 intrinsic module ISO_C_BINDING is used – A Fortran SUBROUTINE maps to a C-function with void as we have already seen – with result USE, INTRINSIC :: ISO_C_BINDING A Fortran FUNCTION maps to a C-function returning a Module contains value – Access to named constants that represent kind type Binding label in bind(c, name=

Mapping of C intrinsic data types VALUE attribute

Interoperable mappings for the most commonly used C For scalar dummy arguments, VALUE attribute can be intrinsic data types used to pass the value to the procedure – No copy is made from the procedure to the caller when Traditional “old” Fortran Fortran declaration C data type the call returns INTEGER*2 INTEGER(c_short) short int – Argument must not be a POINTER, ALLOCATABLE, have INTEGER*4 INTEGER(c_int) int intent(IN) or intent(INOUT), be a procedure or have INTEGER*8 INTEGER(c_long_long) long long int the VOLATILE attribute REAL*4 REAL(c_float) float REAL*8 REAL(c_double) double NOTE: VALUE attribute is not limited to procedures with CHARACTER*1 CHARACTER(1,c_char) char the bind attribute

162 163 Mapping of derived data types Mapping of derived data types

In many cases it is possible to describe Fortran derived For a derived type to be interoperable between C and data types in terms of C data structures (and vice versa) Fortran, variable ordering, data types and array sizes To be interoperable, Fortran derived type must have the must be identical bind(c) attribute – Variable names do not need to be identical – sequence or extends keywords are forbidded Typical usage comes through function calls, for instace a Individual Fortran components in the data type must be Fortran routine extracting information from a C routine of an interoperable type – zero-sized array components are forbidden – ALLOCATABLE and POINTER components are forbidden C types cannot be unions nor structures with bit-fields

164 165

Mapping of derived data types: example Mapping of array data

/* C data structure */ program typedeftest ! Fortran array declarations Array indexing struct { use, intrinsic :: ISO_C_BINDING ! compatible with C int count; implicit none REAL (c_double) :: z1(5), z2(3:5,17) – type, bind(c) :: C_type_in_Fortran C: starts from zero (0) double d[100]; INTEGER (c_int) :: ivec(-4:7) } C_type; integer(kind=c_int) :: count – Fortran: by default, starts real(kind=c_double) :: d(100) end type C_type_in_Fortran from one (1) /* C-function example */ interface void C_func(C_type *p) { subroutine testf(P) & Multidimensional ordering p->count = 1; bind(c, name='C_func') p->d[0] = 1.23; import – C: “row-major”, grow /* Corresponding C-declarations */ } type(C_type_in_Fortran) :: P end subroutine testf fastest along the last end interface double z1[5], z2[17][3]; dimension type(C_type_in_Fortran) :: X int ivec[12]; call testf(X) – Fortran: “column-major”, write (*,*) X % count, X % d(1) grow fastest along the end program typedeftest first dimension

166 167 Mapping of character data Mapping of character data: example

program F_atoi C procedures expect character strings to be null use, intrinsic :: ISO_C_BINDING terminated implicit none interface – A null character has to be appended to a string before ! C-prototype: int atoi(const char *num); calling the C procedure function atoi(num) bind(c, name='atoi') RESULT(x) use, intrinsic :: ISO_C_BINDING Module ISO_C_BINDING contains many character character(kind=c_char), INTENT(IN) :: num(*) integer(kind=c_int) :: x constants: C_ALERT, C_BACKSPACE, C_FORM_FEED, end function atoi end interface C_CARRIAGE_RETURN, C_HORIZONTAL_TAB, character*(*), parameter :: year = '2013' C_NULL_CHAR, C_NEW_LINE, C_VERTICAL_TAB character(:,kind=c_char), allocatable :: number allocate(character(len=len(YEAR)+1)::number) ! Space for c_null_char number = trim(year) // c_null_char write (*,'(A,I0)') 'atoi('//year//')=',atoi(number) end program F_atoi

168 169

Accessing global data Accessing global data: example

Global data: /* Global C data */ ! Fortran global data must be ! defined in a module – Fortran: is declared in modules or in COMMON blocks int number; module something float Array[8]; use, intrinsic :: ISO_C_BINDING implicit none – C: is declared outside any local scope and referenced via double slice[3][7]; extern elsewhere integer(c_int), bind(c) :: number struct coord { real(c_float) :: my_array(8) C global data is accessed in Fortran with float x, y, z; bind(c, name=‘Array’) :: my_array }; bind(c, name=

170 171 Interoperability with C pointers Interoperability with C pointers

Pointer concept is completely different in Fortran and C ISO_C_BINDING contains inquiry and manipulation – Fortran: POINTER is a storage descriptor which holds an functions for TYPE(C_PTR) and TYPE(C_FUNPTR) address of the TARGET object and, for arrays, the related When calling C routines with pointer arguments, C bounds and strides storage locations are sometimes needed. These can be – C: pointer is an address to a memory location acquired with c_loc and c_funloc: Module ISO_C_BINDING contains Fortran type TYPE(C_PTR) FUNCTION c_loc(x) definitions for C pointers and C function pointers TYPE(C_FUNPTR) FUNCTION c_funloc(x) where x in interoperable. – TYPE(C_PTR) abstracts C pointers – For an interoperable Fortran entity x, return – TYPE(C_FUNPTR) abstracts C function pointers TYPE(C_PTR) (or TYPE(C_FUNPTR)) containing the C address of x

172 173

Interoperability with C pointers Interoperability with C pointers

Checking association status of C pointer can be done with When access to the actual storage elements is needed, c_associated function: the contents of C pointer must be first transferred to LOGICAL FUNCTION c_associated(ptr1[, ptr2]) Fortran form. This can be done with c_f_pointer – where ptr1 and ptr2 are TYPE(C_PTR) or routine: TYPE(C_FUNPTR) SUBROUTINE c_f_pointer(c_p, f_p [, shape]) – Return .FALSE. if ptr1 is a NULL C pointer or if ptr2 is where c_p is scalar TYPE(C_PTR), f_p is a Fortran present and has a different value. Otherwise return POINTER. If present shape is a rank-1 integer array with size .TRUE. equal to rank of f_p – Set f_p to point to point to c_p. If f_p is an array, shape must be present. When f_p is an array, its lower bounds are set to 1 and upper bounds according to shape

174 175 Interoperability with C pointers: example Interoperability with C pointers: example

/* C data structure */ MODULE my_typedef PROGRAM mapcpointer typedef struct { USE, INTRINSIC :: ISO_C_BINDING USE my_typedef int count; IMPLICIT NONE TYPE(C_dynamic) :: X double *d; /* dynamic */ TYPE, bind(c) :: C_dynamic INTEGER(c_int), PARAMETER :: N = 10 } C_type; integer(kind=c_int) :: count REAL(c_double), POINTER :: xd(:) type(kind=c_ptr) :: d CALL C_FUNC(X,N) END TYPE C_dynamic /* C-function example */ INTERFACE CALL C_F_pointer(x % d, xd, (/ N /)) ! Map data to Fortran pointer #include // for malloc SUBROUTINE C_FUNC(P,N) & PRINT *,X % count, xd(1) void C_func(C_type *p, int n) { bind(c,name='C_func') END PROGRAM mapcpointer p->count = n; IMPORT p->d = malloc(n * sizeof(*p->d)); TYPE(C_dynamic) :: P if (n > 0) p->d[0] = 1.23; INTEGER(c_int), value :: N } END SUBROUTINE C_FUNC END INTERFACE END MODULE my_typedef

176 177

Interoperating with C binary I/O Interoperating with C binary I/O: example

Interoperability problems extend to binary (unformatted) INTEGER(kind = 4) :: ARRAY(100) Data written from Fortran OPEN(1, FILE='file.bin', & I/O form='unformatted', & as unformatted sequential access='sequential', & file – Fortran: unformatted I/O files contain data and the status='unknown') associated record delimiters (4 or 8 bytes long) Contains record ! Write 400 bytes + 8 – C: binary I/O files contain data WRITE (1) ARRAY(1:100) delimiters, usually 4 or 8- ! Write 200 bytes + 8 Writing and reading files with STREAM I/O in Fortran bytes at the beginning and WRITE (1) ARRAY(1:50) end of each record solves data interoperability problems CLOSE (1) Corresponding C-code has – Stream I/O files do not have the facility to backspace or to ”decipher” these skip over records File ‘file.bin’ in a binary format: delimiters 400 | ... ARRAY(1:100) ...... | 400 200 | ... ARRAY(1:50) ... | 200

178 179 Interoperating with C binary I/O: example Summary

USE, intrinsic :: ISO_C_BINDING When using With ISO_C_BINDING, Fortran has standardized INTEGER(kind = c_int) :: ARRAY(100) OPEN (1, FILE='file.bin', & access='STREAM', no mechanisms to access source code and libraries written form='unformatted', & additional record in C, as well as define Fortran accessible from C access='STREAM', & status='unknown') delimiters are output Interoperability should be used with well-defined ! Write 400 bytes interfaces WRITE (1) ARRAY(1:100) /* The corresponding C-reader is now ! Write 200 bytes easy to implement */ – Complicated structures or calling sequences are not WRITE (1) ARRAY(1:50) int array[100]; recommended FILE *fp = fopen (”file.bin”, ”r”); CLOSE (1) Use of STEAM I/O is recommended if data is to be fread (array, sizeof(*array), 100, fp); File ‘file.bin’ in a binary format: fread (array, sizeof(*array), 50, fp); interoperated from C ... ARRAY(1:100) ...... ARRAY(1:50) ... fclose (fp)

180 181 What is dynamic linking?

A mechanism to implement runtime callable functions – A kind of plug-in Allows existing programs to link and execute new functions without a-priori linkage to them For example – Link on demand i.e. decide which function(s) to use, but only at runtime, based on input parameters (e.g. GUIs) Demo: Dynamic linking – It is possible to generate new (source) code based on input data and then bind (= compile and link) it into your running program

182 183

Prerequisites Connection to Fortran?

The running program has to be a dynamic executable (as Typically plug-ins are used from C/C++ programs as the opposed to a static executable) DL-library interface is C-oriented (C-callable) The runtime functions need to be compiled in a position We will show the Fortran-module dlfcn.F90, which binds independent mode (“PIC”) and … to the DL-library interface into a Fortran callable one … linked dynamically into a shareable object or library All we need now is a test case to demonstrate that the (often libXYZ.so) using functions from DL-library concept actually works (libdl.so) Demo: write “evaluate” –command which reads the – dlopen, dlsym, dlclose, dlerror function definition to be performed (=plug-in) and its The discussion here applies only to ELF-executables and argument range from the command line linkable formats used in Linux systems

184 185 The steps needed to make the command A flow chart of evaluate.F90 “evaluate” Main Program Create Fortran main program to read in command line Function specification arguments (evaluate.F90) Create Fortran module for abstract interface to describe User formula f(x) declaration Dynamic linking our calling interface to a function with a single argument (mod_func.F90) Compile & Link (=bind) f(x) Create Fortran module to handle dynamic linking i.e. Function evaluations Evaluate f(x) values bringing the plug-in alive (mod_bind.F90) – Generate function’s source code & bind it to the program Undo dynam. linking Unbind f(x) Introducing also the DL-library interface module (dlfcn.F90)

186 187

The main program evaluate.F90 The main program evaluate.F90

program evaluate formula = 'sin(x)' myfun => bind(h, trim(formula)) use mod_func Refers to abstract begin = '0.0' write(0,'(a15,2x,a20)') 'x',trim(formula) Dynamic linking use mod_bind interface in end = '3.14' implicit none (mod_func) incr = end rin = rbegin do while (rin <= rend) numargs = command_argument_count() rout = myfun(rin) Formula evaluation procedure(func), pointer :: & if (numargs >= 1) call & write(0,'(g20.14,2x,g20.14)') rin,rout myfun => null() get_command_argument(1,formula) rin = rin + rincr real(kind=dp) :: rin, rout if (numargs >= 2) call & enddo type(dlfcn_handle) :: h get_command_argument(2,begin) Undo dynam. linking if (numargs >= 3) call & From dlfcn get_command_argument(3,end) myfun => unbind(h) (mod_bind) if (numargs >= 4) call & end program evaluate get_command_argument(4,incr) integer :: numargs character(len=1024) :: formula read(begin,*) rbegin character(len=32) :: begin,end,incr read(end,*) rend real(kind=dp) :: rbegin,rend,rincr read(incr,*) rincr

188 189 The abstract function interface The binding module mod_bind.F90 (mod_func.F90) module mod_func module mod_bind function bind(h, formula) result(funcp) implicit none use dlfcn implicit none integer, parameter :: dp = selected_real_kind(8) use mod_func type(dlfcn_handle) :: h implicit none character(len=*) :: formula abstract interface A prototype for formula private ! Function name public :: bind,unbind,dlfcn_handle character(len=*), parameter :: & function func(x) result(r) function f(x) funcname = "f" import contains real(kind=dp), intent(in) :: x type(c_funptr) :: cptr real(kind=dp) :: r procedure(func), pointer :: funcp function unbind(h) result(funcp) integer :: iu end function func implicit none ! generate Fortran function for formula end interface type(dlfcn_handle) :: h ! compile & create shareable object procedure(func), pointer :: funcp h=dlfcn_open( & integer :: istat end module mod_func "./lib"//funcname//".so", & istat = dlfcn_close(h) rtld_now) nullify(funcp) cptr = dlfcn_symbol(h, funcname) end function unbind call c_f_procpointer(cptr, funcp) end function bind end module mod_bind

190 191

Generating Fortran source code: Creation of a shareable object Fortran representation of the user formula f(x)

! Function call interface is similar The first step in Given the just generated FUNCTION f(x) source file ! to abstract function prototype mod_func.F90 is to create f.f90 FUNCTION f(x) bind(c) result(r) equivalent Fortran code that use, intrinsic :: iso_c_binding implicit none represent the formula The following steps take place behind the scenes of the real(kind=c_double) :: x Since DL-library is provided via module mod_func, in function bind( … ): real(kind=c_double) :: r C-interface (see later), our r = cos(2*x) + x – Compile the file f.f90 into object file f.o END FUNCTION f automatically generated Fortran code must be adhering § Remember position independency (“PIC”) ISO C-bindings, e.g. for formula – Create shareable image file f.so from the object f.o : f(x) = cos(2*x) + x – Call DL-library routine dlopen to open up the file f.so – Create a pointer to the function symbol “f” with dlsym – Cast C function pointer into a Fortran procedure pointer

192 193 Compiling and creating a shared object file The DL-library interface module (dlfcn.F90)

module dlfcn ! The actual DL-library prototypes List of compilation commands to create shared use, intrinsic :: iso_c_binding public :: dlfcn_open, & implicit none dlfcn_close, & object files and executables with some commonly private dlfcn_symbol ! the following is probably ! Needed function and procedure used compilers ! unportable and may need system- ! Pointers ! specific adjustment public :: c_funptr, c_f_procpointer Platform Creation of the Compilation Creation of the shared integer, parameter :: strmax = 1024 ! Handle type (a necessary burden) executable object integer(c_int), public, & type, public :: dlfcn_handle parameter :: rtld_lazy = 1 private Cray CCE ftn -dynamic ftn -c -hPIC -shared f.o -o f.so integer(c_int), public, & type(c_ptr) :: handle Intel ifort -ldl ifort -c -fPIC -shared f.o -o f.so parameter :: rtld_now = 2 end type dlfcn_handle integer(c_int), public, & GNU gfortran -ldl gfortran -c -fPIC -shared f.o -o f.so parameter :: rtld_local = 0 integer(c_int), public, & PGI pgf95 -ldl pgf95 -c -fPIC -shared f.o -o f.so parameter :: rtld_global = 256

194 195

The DL-library interface module (dlfcn.F90) The DL-library interface module (dlfcn.F90) interface contains function dlopen(file, mode) & function dlsym(handle, name) & function dlfcn_open(file, mode) & function dlfcn_symbol(handle,name)& result(h) bind(c) result(p) bind(c) result(h) result(fn) import :: c_char, c_int, c_ptr import :: c_char, c_ptr, c_funptr character(kind=c_char,len=*), & type(dlfcn_handle),intent(in) :: & character(kind=c_char) :: file(*) type(c_ptr), value :: handle intent(in) :: file handle integer(c_int), value :: mode character(kind=c_char) :: name(*) integer(c_int) :: mode character(kind=c_char,len=*), & type(c_ptr) :: h type(c_funptr) :: p type(dlfcn_handle) :: h intent(in) :: name end function dlopen end function dlsym type(c_funptr) :: fn h%handle = & function dlclose(handle) & function dlerror() & dlopen(file//c_null_char, mode) fn = dlsym(handle%handle, & result(s) bind(c) result(c) bind(c) end function dlfcn_open name//c_null_char) import :: c_char, c_int, c_ptr import :: c_ptr end function dlfcn_symbol type(c_ptr), value :: handle type(c_ptr) :: c function dlfcn_close(handle) & integer(c_int) :: s end function dlerror result(s) function dlfcn_error() result(c) end function dlclose end interface type(dlfcn_handle) :: handle ! Implementation omitted integer(c_int) :: s end function dlfcn_error

s = dlclose(handle%handle) end module dlfcn end function dlfcn_close

196 197 Demo: evaluate -command Summary

Usage: This demo-lecture explained how dynamic linking works evaluate “formula” lower_bound upper_bound increment Sometimes it can be a useful to feature to introduce new For example: functions and evaluate their values as they come along evaluate “sin(x) + 5” 0 10 2.5 evaluate “acos(2*x+1) * 360/3.14” -4 +4 0.5 – Interactive graphical user interfaces (GUI) evaluate “x**2 + 2*x + erf(x)” 0.1 4 0.2 – In code development: to avoid rebuilding of the software However, this feature is seldom used in connection with Fortran programs as there were portability issues – Now mainly thanks to ISO C-bindings also the Fortran programs can take advantage of this “plug-in” feature

198 199 What are Fortran coarrays (CAF) about ?

Adds parallel processing as part of Fortran language – Only small changes required to convert existing Fortran code to support a robust and potentially efficient parallelism A Partitioned Global Address Space (PGAS) language – Unlike OpenMP, which operates over shared memory, the CAF extensions implement parallelism over “distributed shared memory” à CAF is potentially massively parallel Introduction to Fortran coarrays Has been integrated into Fortran 2008 standard – Caveat: only few compilers so far support the CAF syntax – never mind producing efficient object code L

200 201

Some history CAF does…

Coarray Fortran (formerly known as F-- i.e. Fortran- Add only a few new rules to the Fortran language minus-minus as opposed to C++) was created by Bob Provide mechanisms to allow Numrich and John Reid in the 1990s – SPMD (Single Program, Multiple Data) style of explicitly Reached the current form in 1998 as a simple extension parallel programming to Fortran95 for parallel processing – Data distribution over partitioned memory (you can think CAF has many years experience mainly on Cray hardware about “distributed shared memory” here) A set of CAF’s core features are now part of the Fortran – Guard against race conditions (in e.g. variable value 2008 standard (ISO/IEC 1539-1:2010) assignments) by using synchronization – Memory management for dynamic shared entities

202 203 Execution model Time for “Hello World” in CAF !

Upon startup a Fortran coarrays program gets replicated ! Your very first Fortran coarrays program ! into a number of copies called images (i.e. processes) PROGRAM HELLO_WORLD IMPLICIT NONE – The number of images is usually decided at the execution WRITE(*,’(“Hello world from image “,i0,” out of “,i0)’) & this_image(), num_images() time, but (in rare cases) can also be fixed at compile time END PROGRAM HELLO_WORLD Each “replica” (image) runs asynchronously in a loosely num_images() returns the number of images in use for this coupled way until program controlled synchronization run (usually set outside the program, by the environment) Image’s (local) data are visible within the image only – this_image() returns the image number in concern – except for data declared as special arrays i.e. coarrays numbered from one (not zero) to num_images() – One-sided data communication enables movement of This program is a trivially parallel i.e. each image does not coarray data across different images of a CAF program explicitly share any data and runs seemingly independently

204 205

Data distribution through coarrays Data distribution through coarrays …

CAF programs become meaningful in parallel programming The square brackets [*] denote allocation of special arrays context, when their data are remotely accessible by its images over allocated images (decided upon program startup) Accomplished through additional Fortran syntax for coarrays – Think of the asterisk “*” being replaced by the num_images() for Fortran arrays or scalars, for example : – The round brackets “( )” mean local array accesses, and the “[ ]”

INTEGER, CODIMENSION[*] :: scalar are meant for remote data array accesses only INTEGER :: scalar[*] INTEGER :: GLOBAL(3)[*], LOCAL(3), NPES REAL, DIMENSION(64), CODIMENSION[*] :: vector NPES = num_images() ! Consider NPES == 2 for this example REAL :: vector(64)[*] GLOBAL(:) = this_image() * (/ 1, 2, 3 /) ! Local initialization LOCAL(:) = GLOBAL(:)[NPES] ! A copy (“broadcast”) from image number “NPES” to each image Declares a scalar with one “incarnation” on every image Image # 1 Image # 2 Declares a vector with 64 elements on every image GLOBAL 1 2 3 GLOBAL 2 4 6 LOCAL 2 4 6 LOCAL 2 4 6

206 207 A further peek into vector(64)[*] Synchronization – avoiding race conditions Each image has a local vector with 64 elements We need to be careful when updating coarrays – A program has total NPES = num_images() images – Is the remote data we are copying valid e.g. up to date? For image “me = this_image()”, local access to the – Could another image overwrite our data without notice? element j either by “vector(j)” or “vector(j)[me]” – How do we know if data sent to us has already arrived? Local memory CAF provides synchronization statements, e.g. adds vector(1)[1] vector(1)[2] region for vector(1)[NPES] image#2 vector(2)[1] vector(2)[2] vector(2)[NPES] barrier for synchronization of all images: : : … : SYNC ALL

vector(64)[1] vector(64)[2] vector(64)[NPES] To be absolutely sure we are getting correct result, we need to modify our previous copying example a little…

208 209

Synchronization : corrected remote copy Another synchronization example : data input

We need to add barrier synchronization of all images before CAF assumes by default that standard input is connected to the copying takes place to be absolutely sure we are getting the first image only – other images open /dev/null for input the most up to date copy of GLOBAL(:)[NPES] In order to replicate input data (e.g. a scalar input integer), ... GLOBAL(:) = this_image() * (/ 1, 2, 3 /) ! Local initialization the following code snippet becomes handy – but remember to SYNC ALL synchronize before starting to use your (now local) data LOCAL(:) = GLOBAL(:)[NPES] ! A copy (“broadcast”) from image number “NPES” to each image REAL(kind = 8) :: PARAM[*] !A scalar – one value per image In this particular case – since only the image # NPES – is IF ( this_image() == 1 ) THEN !Only the “master” image has access to standard input READ *, PARAM !Read just one value i.e. PARAM[1] involved, we could use an alternative form of synchronization: DO jroc = 2, num_images() PARAM[jroc] = PARAM !The same as PARAM[jroc] = PARAM[1] , but “faster” ... ENDDO GLOBAL(:) = this_image() * (/ 1, 2, 3 /) ! Local initialization ENDIF SYNC IMAGES ( NPES ) ! Don’t forget the following barrier synchronization or your data maybe out of sync LOCAL(:) = GLOBAL(:)[NPES] ! A copy (“broadcast”) from image number “NPES” to each image SYNC ALL SOME_VALUE = PARAM * 3.14

210 211 Viewing global data with coarrays Global view : linear mapping across images

In many computational problems it is handy to think of Create an example coarray with 3 elements per image having large global array(s), where important data reside We pretend it being global in the sense that the first 3 Since Fortran does not provide global view, a programmer elements are on image#1, the next 3 on image#2 and so on needs to be creative: CAF offers a fantastic help here We want to access global data in a linear fashion, like In the next slides we show how to map into a global view GA(global_index):=CAF_GA(local_index)[owner_image] (“global array”) in a linear fashion To accomplish this kind of virtual global array of GA, we You must have noticed already at this point, that a coarray need to compute (local_index,owner_image) –pair in each image ought to have the same (local) size! from each given global_index – and uniquely – In another lecture we circumvent this by use of Fortran TYPEs

212 213

Global view: linear mapping across images… Global view : linear mapping across images… The following table illustrates mapping of global index Assuming 4 images for simplicity, 3 elements on each image – our mapping from coarray to global array would then be: Owner_image Local_index Global_index Data values 1 1 1 1 Image #1 Image #2 Image #3 Image #4 1 2 2 2 1 3 3 3 1 2 3 2 4 6 369 48 12 2 1 4 2 2 2 5 4 CAF_GA(:)[1] CAF_GA(:)[2] CAF_GA(:)[3] CAF_GA(:)[4] 2 3 6 6 3 1 7 3 3 2 8 6 3 3 9 9 GA := 1232463694 8 12 4 1 10 4 4 2 11 8 4 3 12 12

214 215 Global view : linear mapping across images … Summary INTEGER, PARAMETER :: NEL = 3 ! Number of elements on each image In this lecture we covered basic aspects of coarrays: REAL(KIND=8) :: CAF_GA(NEL)[*] ! Our “global” array, but defined as a coarray INTEGER :: NPES ! Number of images – INTEGER :: NGA ! Global length of GA i.e. NEL x NPES How to declare codimensional arrays and access data INTEGER :: GLOBAL_INDEX ! Global index, between 1 and NGA – INTEGER :: LOCAL_INDEX ! Local indexing, on each image Concept of images and some related functions INTEGER :: OWNER_IMAGE ! Image that “owns” global index – NPES = NUM_IMAGES() Importance of synchronization NGA = NEL * NPES – ! Initialize CAF_GA somehow on each image first and then remember to synchronize How to implement simple linear global view of data CAF_GA(1:NEL) = (/ (LOCAL_INDEX, LOCAL_INDEX=1,NEL) /) * THIS_IMAGE() SYNC ALL We also touched how to use standard input and output ! Access some global data as if we were accessing global array “GA(global_index)“ GLOBAL_INDEX = 7 – So it is now possible to write simple CAF programs! IF (GLOBAL_INDEX >= 1 .and. GLOBAL_INDEX <= NGA) THEN LOCAL_INDEX = MOD(GLOBAL_INDEX - 1,NEL) + 1 In the subsequent lecture we will also get acquainted OWNER_IMAGE = (GLOBAL_INDEX + NPES – 1)/NPES SOME_GA = CAF_GA(LOCAL_INDEX)[OWNER_IMAGE] ourselves to multidimensional and allocatable coarrays ENDIF

216 217 More features of Fortran coarrays

Operating with multiple codimensions Coarray subscripts, functions lcobound( ) & ucobound( ) Use of image_index( ) –function, and more on this_image( ) Dynamic allocation across codimensions Variable length coarrays More features of Fortran coarrays Coarrays in procedures

218 219

Recap : 1-dimensional coarray Multiple codimensions

We refer to variables with codimension (or corank) being 1 Now corank > 1 , thus there is more than one codimension The declaration statement was either of the following: – Things are getting (much) more complicated to figure out INTEGER :: A(N)[*] – The last codim. must always be declared with asterisk “*” INTEGER, codimension[*] :: A(N) Sum of all dimensions (rank plus corank) must be ≤ 15 The size of array (the “N”) had to be the same on each image For simplicity, lets declare a coarray scalar with two The coindex [ ] ends up being trivially the same as image index codimensions i.e. corank is 2, and the first codimension as 4 Image#1 Image#2 Image#3 … The syntax is either of the following: A(1) [1] A(1) [2] A(1) [3] REAL :: S[4, *] A(2) [1] A(2) [2] A(2) [3] REAL, codimension[4,*] :: S : : :

220 221 REAL :: S[4,*] Multiple codimensions …

Suppose we have allocated 10 images for the CAF-program Instead of a scalar valued Numbers in the green slots denote images holding data coarray we could have a multi- 1 2 3 References to the red ones produce invalid coindices (= 0) dimensional (regular) Fortran array 1 1 5 9 1st codim. 1 2 3 2nd codimension – Each box would then 2 2 6 10 A(:,:)[3,2] 1 1 5 9 represent image’s local array 3 3 7 0 The syntax is again either of 2 2 6 10 S[3,2] - a single scalar value of S the following, f.ex.: 4 4 8 0 3 3 7 0 S[3,3] & S[3,4] are not defined REAL :: A(3,3)[4, *] 4 4 8 0 REAL, codimension[4,*] :: A(3,3)

222 223

Coarray subscripts Array and coarray declarations with subscripts…

Coarray declaration by default assumes normal Fortran Fortran INTEGER array, with 2 ranks, lower bounds at (1,1) and upper array declaration rules bounds at (7,3), and shape of (7,3), is declared in usual way: – Lower cobound for each dimension starts from 1 INTEGER, dimension(7,3) :: IA – Upper cobound denotes the size of corresponding codimension A REAL-valued coarray scalar, with corank 1, lower cobound of zero, and upper cobound no more than num_images( ): But as with conventional Fortran arrays we can declare the REAL, codimension[0:*] :: Scalar lower cobound be different from 1 A REAL-valued coarray, with rank of 4 and corank 3 ; lower bounds at The upper cobound – as long as it is not the last (-2,0,1,-1) and lower cobounds at (-3,2,10) ; upper bounds at (5,5,7,4) codimension – can also be different from size and upper cobounds at (6,4,10-1+n) ; shape of (8,6,7,6) and coshape – Upper bound for the last codimension must always be “*” of (10,3,n), with n = ceiling(num_images( )/(10*3)) REAL :: Array(-2:5,0:5,7,-1:4) [-3:6, 2:4, 10:*]

224 225 Obtaining lower and upper bounds for coarrays Diving into the last codimension

Often complicated to figure out upper/lower cobounds Consider a coarray with corank of 4 Luckily there are query functions for obtaining them : INTEGER :: Scalar [p,q,r,*] lcobound(coarray [, dim]) Lower cobounds would all be equal to 1 ucobound(coarray [, dim]) All but the last codimension has upper cobounds of They return an integer array, or in case of a single (p,q,r) dimensional query (with optional “dim”) – a scalar The last codimension acts like an assumed shape array For the Array from the previous page : i.e. the asterisk “*” can be replaced by a ceiling value of lcobound(Array) produces output -3 2 10 num_images( ) / (p * q * r) ucobound(Array, dim=2) gives 4 This is defined at runtime and always becomes ≥ 1

226 227

Use of image_index( ) and this_image( ) image_index( )

So far we have seen this_image() image_index( ) performs the reverse of this_image( ) function been used without arguments 3 4 5 Takes coindices (e.g. [2, 4] on A-array from previous slide) – Returns the image (“process”) number 0 1 5 9 – Returns the owner image number in concern (e.g. image#7) where the CAF-program is being run 1 2 6 10 – Useful in synchronization, e.g. SYNC IMAGE (image_list) In its another calling form it takes a coarray as an argument, e.g. 2 3 7 0 In particular, returns zero (0), when the given coindices are REAL :: A(7,7) [0:3,3:*] 3 4 8 0 out of bounds i.e. they do not belong to any image print *, this_image(A) For example INTEGER :: img Running with 10 images again it returns A(:,:)[2,4] (say) for image#7 a vector [ 2, 4 ] i.e. img = image_index(A, (/ 2, 4 /)) ! Will return 7 coindices of CAF [ ] –brackets expression img = image_index(A, [3, 5]) ! Returns 0 – no owner image

228 229 ALLOCATABLE coarrays About POINTERs with coarrays

It is possible to define fully dynamic coarrays, where both shape A co-array declaration cannot have a POINTER attribute (i.e. locally owned part) and coshape are dynamic, e.g. an Thus the following would be illegal coarray declaration: allocatable with deferred shapes of rank 1 and corank 3: REAL, POINTER :: ptr[:,:] ! This is invalid Fortran! INTEGER , allocatable :: A(:)[:, :, :] ALLOCATE (A(1000)[4,20,-5:*]) However, we can define a new data TYPE, where type DEALLOCATE (A) component(s) have POINTER (or ALLOCATABLE) attributes ALLOCATE and DEALLOCATE imply implicit synchronization – all images must participate i.e. an implicit SYNC ALL occurs – And then define a coarray being of that TYPE The local size (here: 1000) must be the same on each image Used in dynamic dimensioning of coarrays The last codimension must have an asterisk “*” – This way local sizes on every image can be different

230 231

Variable length coarrays via ALLOCATABLE Variable length coarrays via POINTER Create a new Fortran data type with a POINTER component Create a new Fortran data type with ALLOCATABLE component in it – and place it in TYPE mytype a MODULE –file : REAL, POINTER :: data(:) TYPE mytype END TYPE mytype ) a

REAL, ALLOCATABLE :: data(:) t Then declare a coarray of that type, e.g. corank of 1 a END TYPE mytype d TYPE (mytype) :: co[*] ! A fully legal construct % o c

Then declare a coarray of that type, e.g. ( REAL, TARGET :: chunk(1000) e z i

TYPE (mytype) :: co[*] s Make co%data to point different chunks of local data ALLOCATE on each image, but different size CALL get_my_range(size(chunk), istart, iend) this_image() ALLOCATE(co % data (10 * this_image())) co % data => chunk(istart:iend)

232 233 Coarrays in procedures Coarrays in procedures…

SUBROUTINE SOME_ROUTINE (N, ARRAY_1, CO_ARRAY_1, ARRAY_2, CO_ARRAY_2, CO_ARRAY_3) When declared in a subroutine or function, a coarray IMPLICIT NONE !-- Procedure arguments (all, but one are ok declarations !!) must be one the following INTEGER, INTENT(IN) :: N INTEGER, INTENT(INOUT) :: ARRAY_1(N) ! Explicit shape – Declared as a dummy argument to the procedure INTEGER, INTENT(INOUT) :: CO_ARRAY_1(N)[*] ! Explicit shape INTEGER, INTENT(INOUT) :: ARRAY_2(:) ! Assumed shape – Have ALLOCATABLE and/or SAVE attribute INTEGER, INTENT(INOUT) :: CO_ARRAY_2(:)[*] ! Assumed shape INTEGER, INTENT(INOUT) :: CO_ARRAY_3[:] ! Illegal : assumed coshape Re-mapping of corank is also allowed !-- Procedure variable declarations (not all ok – see below) INTEGER :: LOCAL_ARRAY_1(N) ! Ok, an automatic (regular) array INTEGER :: LOCAL_ARRAY_2(1000) ! Ok, local (regular) array A coarray in procedure cannot be an automatic array – INTEGER :: LOCAL_CO_ARRAY_1(N)[*] ! Invalid : coarray can’t be automatic like by nature : the “limitation” stems from the fact local INTEGER :: LOCAL_CO_ARRAY_2(1000)[*] ! Invalid : SAVE-attribute missing INTEGER, SAVE :: LOCAL_CO_ARRAY_3(1000)[*] ! Ok, coarray with SAVE-attr size of a coarray on each image must be the same INTEGER, ALLOCATABLE :: LOCAL_CO_ARRAY_4(:)[:] ! Ok, coarray with ALLOCATABLE INTEGER, POINTER :: LOCAL_CO_ARRAY_5(:)[:] ! Invalid : coarray can’t have POINTER END SUBROUTINE SOME_ROUTINE

234 235

Assumed size coarrays Summary

Coarrays can also be declared as assumed size i.e. In this lecture we have dived deeper into the CAF world codimension(s) can be re-mapped to have a different corank Use of multiple codimensions can be hard to understand in the procedure than was declared in the caller, e.g. – Using CAF specific intrinsic functions lcobound, ucobound, PROGRAM PIC_2D IMPLICIT NONE this_image and image_index usually clarify the situation INTEGER (KIND=2) :: PIXELS(100,100)[*] coarrays can also be dynamic through ALLOCATABLE INTEGER :: NPES_X, NPES_Y CALL BEST_2D_DECOMP(num_images(), NPES_X, NPES_Y) – Direct use of POINTER attribute is, however, forbidden CALL CREATE_PICTURE(PIXELS, NPES_X, NPES_Y) END PROGRAM PIC_2D Variable size coarrays can be defined via TYPE definition trick:

SUBROUTINE CREATE_PICTURE(P, M, N) use of POINTER as TYPE component is then allowed IMPLICIT NONE coarrays can also appear in procedures, but specific rules INTEGER, INTENT(IN) :: M, N INTEGER (KIND=2), INTENT(INOUT) :: P(100,100)[M,*] ! Assumed, re-mapped shape must be adhered, e.g. adding SAVE/ALLOCATABLE attributes END SUBROUTINE CREATE_PICTURE

236 237 Advanced CAF features

Is the array-syntax over codimensions supported? More advanced synchronization clauses Miscellaneous CAF features

Advanced features of Fortran coarrays

238 239

Array-syntax across codimensions? Array-syntax [:] across codimensions is not allowed

One clearly strange thing in CAF-syntax is that use of array Instead – an explicit looping over images is mandatory syntax over codimensions is not allowed REAL :: global_data[*] – It used to be possible prior to CAF becoming part of Fortran2008 std! INTEGER :: jroc REAL, allocatable :: local(:) Suppose we want to gather data over all images into a local ALLOCATE(local(num_images())) array, one element per image. It would be tempting to do: CALL initialize_somehow(global_data) SYNC ALL ! With some further optimization REAL :: global_data[*] DO jroc = 1, this_image() - 1 local(jroc) = global_data[jroc] REAL :: local(num_images()) ! Even this is illegal ENDDO local(this_image()) = global_data CALL initialize_somehow(global_data) DO jroc = this_image() + 1, num_images() SYNC ALL local(jroc) = global_data[jroc] ENDDO local(:) = global_data[:] ! Gather: Array syntax [:] illegal

240 241 More advanced synchronization CRITICAL sections

CRITICAL sections PROGRAM CRIT At times it is necessary to IMPLICIT NONE SYNC IMAGES INTEGER :: MYSUM[*] = 0 execute a block of code by INTEGER :: ANS ! The reference result one image at a time to avoid SYNC MEMORY INTEGER :: ME, NPES, P P = 1 ! Fixed image# [P] race conditions whilst Atomic updates with atomic_define and atomic_ref ME = THIS_IMAGE() updating coarrays at a fixed NPES = NUM_IMAGES() LOCK variables ANS = (NPES * (NPES + 1))/2 image location, e.g. [P] ! SYNC ALL : usually needed, but not here CRITICAL – END CRITICAL is the CRITICAL MYSUM[P] = MYSUM[P] + ME construct to use END CRITICAL Remember to synchronize SYNC ALL !! Must have !! PRINT *,ME,': MYSUM[P], ANS=',MYSUM[P],ANS especially after the section END PROGRAM CRIT

242 243

SYNC IMAGES SYNC IMAGES –examples

There are situations when only a subset of images need The 1st image synchronizes Only the two first images to synchronize; use of SYNC ALL would be an overkill with all the other images in (an image set) synchronize – SYNC IMAGES(image_list) provides a desired interface a pair-wise manner other images are not involved at all in (this) – Not all images need to call this synchronization The image_list can be one or more image ids, or “*” IF (this_image() < 3) THEN – The list does not have to be same on every image IF (this_image() == 1) THEN SYNC IMAGES(*) SYNC IMAGES( (/ 1, 2/) ) Special image_list is asterisk “*” and denotes all images ELSE ENDIF – But the SYNC IMAGES(*) is not SYNC ALL SYNC IMAGES(1) ENDIF – Image pair-wise synchronization is faster than SYNC ALL

244 245 SYNC MEMORY ATOMIC-updates

PROGRAM SPIN_WAIT USE, INTRINSIC :: ISO_FORTRAN_ENV IMPLICIT NONE The standard includes two Enforces any pending LOGICAL (ATOMIC_LOGICAL_KIND) :: FLAG[*] = .FALSE. INTEGER :: A[*] LOGICAL :: VALUE functions for ATOMIC coarray writes to finish INTEGER :: NPES, ME, P, Q NPES = NUM_IMAGES() updates in coarrays : IF (this_image() == 1) THEN before proceeding ME = THIS_IMAGE() A[2] = 123 ! Put 123 to image#2 IF (NPES > 1) THEN ATOMIC_DEFINE Waits until any preceding P = 1 ! Image#2 may not yet see the update Q = NPES ATOMIC_REF SYNC MEMORY writes are visible to the IF (ME == P) THEN PRINT *,ME,': (P,Q) ARE ',P,Q ! Image#2 now sees the update images in concern SYNC MEMORY Use with SYNC MEMORY to ENDIF CALL ATOMIC_DEFINE (FLAG[Q], .TRUE.) ! FLAG[Q] := .TRUE. Flushes also any cached ELSE IF (ME == Q) THEN implement spin-wait VALUE = .FALSE. reads on image’s coarray(s) DO WHILE (.NOT. VALUE) ! SPIN UNTIL "VALUE" .TRUE. between images P and Q ! A bad example, since image#2 actually CALL ATOMIC_REF (VALUE, FLAG) ! VALUE := FLAG[Q] ! has no idea when it can access its A[2] !! ENDDO Not actual synchronization SYNC MEMORY – A “producer-consumer” ! See next slide for use of ATOMIC-routines at all, however SYNC ALL / ELSE ! with SYNC MEMORY CONTINUE ! DON’T GET INVOLVED UNLESS ME == (P,Q) Take a notice of ISO Fortran ENDIF SYNC IMAGES also imply PRINT *,ME,': FLAG=',FLAG ENDIF environment –module SYNC MEMORY END PROGRAM SPIN_WAIT

246 247

LOCK variables LOCK variables

PROGRAM SPIN_WAIT USE, INTRINSIC :: ISO_FORTRAN_ENV USE, INTRINSIC :: ISO_FORTRAN_ENV In fact it is recommended to TYPE (LOCK_TYPE) :: LOCKVAR[*] Use of LOCK variables IMPLICIT NONE LOGICAL :: SUCCESS TYPE (LOCK_TYPE) :: LOCKVAR[*] avoid both SYNC MEMORY IF (THIS_IMAGE() < 10) THEN resembles entering to a LOGICAL :: FLAG[*] = .FALSE. LOCK (LOCKVAR[1]) INTEGER :: NPES, ME, P, Q ! Do something under the lock#1 CRITICAL section, except that NPES = NUM_IMAGES() and ATOMIC–update ! This could be on any image < 10, not only image#1 ME = THIS_IMAGE() it is more flexible, since you IF (NPES > 1) THEN routines and use LOCK – UNLOCK (LOCKVAR[1]) P = 1 ELSE can have many distinct Q = NPES UNLOCK –constructs LOCK (LOCKVAR[2]) IF (ME == P) THEN ! Images >= 10 enter this lock in turn (any order) LOCKs active simultaneously PRINT *,ME,': (P,Q) ARE ',P,Q The aforementioned spin- ! The lock#2 is active whilst lock#1 could be, too – unlike just one with a LOCK(LOCKVAR[P]) ! .. or LOCKVAR[Q] UNLOCK (LOCKVAR[2]) FLAG[Q] = .TRUE. ! “atomic” update at image [Q] wait program now looks ENDIF UNLOCK(LOCKVAR[P]) ! .. or LOCKVAR[Q] CRITICAL section SYNC IMAGES (Q) ! “atomic” update propagated much cleaner ELSE IF (ME == Q) THEN spin: DO You can also acquire a LOCK SYNC IMAGES (P) ! “atomic” update received LOCK (LOCKVAR[2], acquire_lock = SUCCESS) ELSE The LOCK_TYPE is taken IF (SUCCESS) EXIT spin in a non-blocking manner by CONTINUE ! DON’T GET INVOLVED UNLESS ME == (P,Q) ! Do something before trying to get the lock#2 again requesting “acquire_lock” ENDIF again from ISO Fortran env. ENDDO spin PRINT *,ME,': FLAG=',FLAG ! Do something else here and then release the lock#2 ENDIF from LOCK construct END PROGRAM SPIN_WAIT UNLOCK (LOCKVAR[2])

248 249 Miscellaneous CAF features I/O conventions in CAF

I/O conventions in CAF Each image has its own, independent set of Fortran input and output units Program termination The default input (“stdin”) (i.e. READ *, READ(*,…), READ(5,..)) is – Including STAT= and ERRMSG= specifiers in synchronization pre-connected to the master image (image#1) only – Attempts to READ from “stdin” for images > 1 would fail unless correct Future CAF safety measures are taken – Technical specification (TS) PROGRAM SAFE_AND_CORRECT_STDIN INTEGER :: flag[*] =0 IF(this_image() == 1 ) THEN READ *,flag ! .. alias READ(5,*) flag DO JROC = 2, num_images() flag[JROC] = flag ! := flag[1] ENDDO ENDIF SYNC ALL END PROGRAM SAFE_AND_CORRECT_STDIN 250 251

I/O conventions in CAF … Program termination

The default output (“stdout”) is connected to all images Normal termination of a CAF program is initiated by STOP and END PROGRAM statements on all images – Output is merged (in any order) into one output stream In problematic cases, however, one or more images “in trouble” – PRINT *, WRITE(*,…), WRITE(6,…) can request for termination, whilst other images continue to run The standard error (“stderr”) is redirected to the “stdout” – until reaching some synchronization point – WRITE(0,…) – Image data of the images being in an abnormal termination state are still available and can be requested by other images The remaining images could test the termination status of the other image(s) by using STAT= specifier in synchronization statements, and also get ERRMSG –string ERROR STOP statement on one or all images will initiate an error termination and all images are expected to be killed shortly

252 253 Program termination: Making CAF be fail-safe? Technical specification (TS)

PROGRAM ALLOC USE ISO_FORTRAN_ENV INTEGER :: ME, ISTAT, N A run, where the 1st image Fortran coarrays continue to evolve CHARACTER(LEN=80) :: ERRMSG REAL(KIND=8), ALLOCATABLE :: LOCAL(:,:) INTEGER :: CO[*] is in trouble in ALLOCATE Technical specification outlines the future developments ME = THIS_IMAGE() CO = ME ! COARRAY DATA INITIALIZATION Test image status against IF (ME == 1) THEN A useful document to read N = 1000000000 ! 1,000,000,000 ! STAT_STOPPED_IMAGE PRINT *,ME,': ALLOCATING N X 1000 MATRIX, N=',N http://www.pgas2013.org.uk/sites/default/files/finalpapers/Day1/R1/3_paper14.pdf ALLOCATE(LOCAL(N,1000), STAT=ISTAT, ERRMSG=ERRMSG) (from ISO Fortran env.) IF (ISTAT /= 0) THEN In the pipeline are the following extensions to the CAF PRINT *,ME,': COULD NOT ALLOCATE. '//TRIM(ERRMSG) STOP – Available in SYNC ALL, ENDIF – Image teams ENDIF SYNC IMAGES, LOCK, SYNC IMAGES ( 1, STAT = ISTAT, ERRMSG = ERRMSG ) – IF (ISTAT == STAT_STOPPED_IMAGE ) THEN ! FAIL-SAFETY? UNLOCK&SYNC MEMORY Failed images PRINT *,ME,': IMAGE#1 WAS STOPPED: '//TRIM(ERRMSG) IF (ME == 2) THEN ! IMAGE#2 SALVAGES COARRAY FROM #1 – Collective intrinsic functions for coarrays PRINT *,ME,': SALVAGING CO[1] = ',CO[1] The remaining image can ENDIF CALL EXIT(ISTAT) ! ENSURE NON-ZERO EXIT CODE now proceed with fetching – Improvements in atomic updates ENDIF END PROGRAM ALLOC CAF data – still available

254 255

TS: Image teams TS: Failed images

Learning lessons from MPI–standard and its An image may have died during the run – consequently a communicators, it is useful to have subsets of images SYNC IMAGES would fail (or stall) if it contains the failed forming “teams” of processes image SYNC IMAGES (image_list, stat = st) Define a new (“sub”-)team at runtime IF (st == stat_failed_image) THEN ... – All images with the same “id” e.g. mod(this_image(),2) failed_images() function returns the count of failed join images at the given moment FORM SUBTEAM (id, Ateam) – Another possibility is to use num_images(failed = .TRUE.) Synchronize within a team only Used with care – and together with the new team SYNC TEAM (Ateam) capability, a fail safe computing becomes possible (still non-trivial)

256 257 TS: Collectives TS: Atomic operations

Collective operations are essential part of //-programs CRITICAL or LOCK statements often an overkill for – Broadcast data count[1] = count[1] + 1 – Calculate global sums, maximums, etc. For example OpenMP enables atomic operations Available in MPI and OpenMP standards !$OMP atomic – MPI_Allreduce and OpenMP reduction operations shared_count = shared_count + 1 Routines for example Proposed coding: – CO_BROADCAST INTEGER (atomic_int_kind) :: count[*] – CO_SUM, CO_MIN, CO_MAX CALL atomic_define(count[1], 0) CALL atomic_add(count[1], 1) – CO_REDUCE CALL atomic_ref(thevalue, count[1])

258 259

Summary

We have now completed our excursion to CAF world In the last “chapter” we were told that the common array syntax is not extended to codimensions – which is odd We got better insight to more advanced synchronization across images We also had a recap on Fortran I/O and standard channels We learned how to terminate images abnormally and yet gain access to almost destroyed coarray data: fail safety Finally, we had a browse on to future of CAF and its TS

260