<<

Technische Universitat

DFGForschergruppe SPC Fakultatf ur Mathematik

Thomas Ap el Frank Milde Michael The

SPCPM Po D

Programmers Manual

Acknowledgement The package SPCPM Po D has b een developed in the re

search group SPC at the Fakultat f ur Mathematik of the Technische Universitat

ChemnitzZwickau under the sup ervision of A Meyer and Th Ap el Other main

contributors are G Globisch D Lohse M Meyer F Milde M Pester and M The

Chapter of this manual was written together with M Pester The authors thank

M Jung and A Meyer for their pro ofreading and helpful comments U Reichel

typed the manuscript

The research group SPC is supp orted by Deutsche Forschungsgemeinschaft Ger

man Research Foundation No La

All this collab oration and supp ort is gratefully acknowledged

PreprintReihe der Chemnitzer DFGForschergruppe

Scientic Parallel Computing

SPC Dezember

Authors addresses

Thomas Ap el

Technische UniversitatChemnitzZwickau

Fakultatf ur Mathematik

D Chemnitz

Germany

email ap elmathematiktuchemnitzde

Frank Milde

Technische UniversitatChemnitzZwickau

Fakultatf ur Naturwissenschaften Institut f ur Physik

D Chemnitz

Germany

email mildephysiktuchemnitzde

Michael The

Technische UniversitatChemnitzZwickau

Fakultatf ur Mathematik

D Chemnitz

Germany

email thessmathematiktuchemnitzde

Contents

Overview

Introduction

The b oundary value problems

Data structure F Milde

General remarks

Full data structure FDS

Volumes

Faces

Edges

Co ordinates of the no des

Global Crossp oint Names IGLOB

DirichletNeumann data

Kette data

CHAIN

REGION

Reduced data structure RDS

Volumes

Dirichlet data

Neumann data

Hierarchical List

TETF

No de Co ordinatesIGLOBKette DataCHAINREGION

INCLUDEFilesCOMMONBlo cks

netddatinc

com probinc

lenameinc

standardinc

trnetinc

Hierarchical Mesh Renement F Milde

Parameters of NETMAKE

Coarse mesh input and distribution

Renement

The pro cedure

An example

Reducing data

Parameters of the output to ol AUSGABE

Release of Programmers Manual page i

ii

Tree structure of the routines

NETMAKE for tetrahedral meshes

NETMAKE for brick meshes

AUSGABE

Short description of the routines in libNAa

Short description of the routines in libNTa

Short description of the routines in libNQa

Assembly of the equation system and error estimation Th Ap el

General remarks

Numerical integration and shap e functions

Mathematical Background

The arrays QGST QGST SHP and SHP

A mo dication of the arrays for the use in the error estimator

Assembly of the stiness matrix and the right hand side

Main steps

The element routine ELS in the case of the Poisson equation

The element routine ELAST for the Lamesystem

The surface integrals for inhomogeneous Neumann b oundary conditions

The coarse grid matrix

Calculation and estimation of error norms

The computed values

Exact error norms

Tree structures

Short description of the subroutines

Parallel Preconditioned Conjugate Gradient Metho d PPCG MThe

The Parallel CGMetho d

The Jacobi preconditioner

The Yserentant preconditioner

The BPX preconditioner

Tree structure of the routines

Description of the routines

General libraries

libKLZa for matrix op erations with compactly stored matrices

libCub ecoma with basic communication routines

libvbasmoda with basic vector op erations

libMbasmo da for matrix routines

libDDCMcoma for DD and coarse matrix routines

The libraries libGrafa and libNoGrafa

libToolsa for auxiliary routines

Bibliography

Index

Release of Programmers Manual

Chapter

Overview

Introduction

SPCPM Po D is a computer program to solve the Poisson equation or the Lamesystem

of linear elasticity over a threedimensional domain on a MIMD parallel computer It is

b eing developed in the research group SPC Scientic Parallel Computing at the Fakultat

f ur Mathematik of the Technische Universitat ChemnitzZwickau under the sup ervision of

Prof A Meyer and Dr Th Ap el Other main contributors are Dr M Meyer F Milde

Dr M Pester and M The

The historical ro ots of the program are at one hand in several parallel programs for

solving problems over twodimensional domains using domain decomp osition techniques

These co des have b een developed since ab out by A Meyer M Pester and other

collab orators On the other hand Th Ap el developed a sequential program for

the solution of the Poisson equation over threedimensional domains which was extended

together with F Milde

For an introduction of the capabilities of the program its installation and utilization

we refer to the Users Manual The aim of this Programmers Manual is to provide a

description of the algorithms and their realization It is written for those who are interested

in a deep er insight into the co de for example for improving and extending

The do cumentation is organized as follows In the next section we describ e the b oundary

value problems that can b e solved and the nite elements that are used Chapter is

concerned with the data structure The main part of this do cumentation consists of the

description of the sp ecic libraries for SPCPM Po D libNAa libNTa and libNQa for

hierarchical mesh renement of tetrahedral and cub oidal hexahedral meshes Chapter

libAa for assembly of the equation system and for error assessing Chapter and libSa

for solving this system with a preconditioned CG algorithm Chapter

Chapter provides a short description of some general libraries that are used also in

other program systems developed by the research group SPC and which are describ ed in

more detail elsewhere libKLZa libvbasmoda libMbasmo da and libDDCMcoma

for basic matrix and vector op erations libCub ecoma for basic communication routines

libGrafa for graphical representation as well as libToolsa with some auxiliary routines

In this do cumentation we use slanted style for real existing paths and lenames emphasis

style for program parameters sans serif style to characterize buttons and menu items of

programs with a graphical user interface and typewriter style for the names of variables

Release of Programmers Manual page

CHAPTER OVERVIEW

The b oundary value problems

Consider the Poisson problem in the notation

u f in IR

u u on

u

g on

n

u

on n n

n

T

u u u or the Lameproblem for u

u grad div u f in IR

i i

i

u u on i

i

i

i

t g on i

i i

i i

t on n n i

T

is t t t S u n is the normal stress the stress tensor S u s where t

ij

ij

T

x x x by dened with x

i j

u u

s r u

ij ij

j i

x x

is the outward normal and is the Kronecker delta The domain IR must b e n

ij

b ounded In the present version curved b oundaries can not b e treated by the renement

pro cedure thus is restricted to b e a p olyhedron

The b oundary value problem is solved by a standard nite element metho d using either

tetrahedral or brick elements with linear or quadratic shap e functions of the serendipity

class see Figure

Figure Finite elements implemented in SPCPM Po D

Release of Programmers Manual

Chapter

Data structure

General remarks

The program is working in the SPMD mo de that means single program multiple data

Consequently all data describ ed are lo cal data p ossibly with dierent length on every pro

cessor The connection b etween these lo cal data is co ded in the arrays IGLOB KETTED and

KETTED see Subsections and this information is sucient for the communication

nite element accumulation

In FORTRAN it is imp ossible to allo cate memory during the run of the program

but there are several large arrays in our FEM program which are used only for a certain

time So it is necessary to have a dynamic memory management To solve this problem in

FORTRAN we have a very large workspace vector as large as p ossible in our program

to use parts of it as arrays in the subroutines There are several p ointer variables which

determine the array index on which data start We have our own memory management and

must take care of calculating these p ointers to avoid overlaps

In our FEM program we have two data structures First the mesh is rened with the

full data structure FDS see b ecause of its greater variability After the change to

the more typical reduced data structure RDS see the assembly and solution of the

equation system are p erformed

There are a few general variables

NDF number of degrees of freedom p er no de

NEND number of no des p er face

NEND number of no des p er volume

In and we describ e the arrays in the following general form

general description of the array

name and dimension of the array

structure of a data blo ck of the array

additional information

For some arrays there are p ointers within the data blo cks which determine the p ositions

of data Most of the dimensions of the arrays are also variablesparameters which are lo cated

in COMMON blo cks in the source le netddatinc It is b etter to use these variables instead

of hard numbers b ecause of p ossible evolution of the data structure

Release of Programmers Manual page

CHAPTER DATA STRUCTURE

Full data structure FDS

In the FDS volumes are represented by a number of faces faces by a number of edges and

edges by a number of no des

All arrays except the co ordinate array and the kette data have the same coarse structure

representing the hierarchical character of the renement The data of all levels of renement

including the coarse mesh are stored in the following way

coarse mesh data level level level N

For each array there is a variable which p oints to the rst entry of level N Its name is

in general of the form Fname of the array for instance FVOL is the rst entry of level N

in VOL

Volumes

Each volume is describ ed by its j faces for a tetrahedron for a brick

VOLDIMVOL DIMVOLj

j faces

z

face face

Face i is a face number

Faces

Each face is describ ed by its j edges for a triangle for a quadrilateral

FACEDIMFACE DIMFACE j

j edges

z

edge

edge type number of rst son

Edge i is a edge number

Until now there is only one type which means plane face All sons are numbered

consecutively Up to now the number of sons is always

The faces of each level are sorted in the following way

coupling faces

other faces

z z

z

level i level i level i

By our denition all faces of the coarse mesh and their sons are coupling faces even

if they are not contained in interprocessor b oundaries

It is recommended to use the following p ointer variables for this dataset

FZEIG p osition of the type in FACE currently j

FCHIELD p osition of the number of rst son currently j

Release of Programmers Manual

FULL DATA STRUCTURE FDS

Edges

Each edge is describ ed by its vertices and the middle no de only quadratic case

KANTEDIMKANTE DIMKANTE

j vertex j vertex j middle no de j type j number of rst son j

Vertex i is a vertex number

Until now there are two p ossible values of type and The type is used for

dummy edges which are necessary to generate the hierarchical list but no face p oints

to it

The coupling edges are lo cated at the b eginning of each level followed by the other

edges of coupling faces

coupling edges

edges on coupling faces other edges

z z

z

level i level i level i

By our denition all edges of the coarse mesh and their sons are coupling edges even

if they are not contained in interprocessor b oundaries

It is recommended to use the following p ointer variables for this dataset

KZEIG p osition of the type currently

KCHIELD p osition of the number of rst son currently

Co ordinates of the no des

Each no de is represented by its three Euclidean co ordinates

COOR

j X j Y j Z j

i i i

The no des are placed in COOR in the following way

cross p oints CE CF

CE CE CF CF inner no des

z z

D kettes D kettes

Each D kette CE is a blo ck of no des which b elong to sons of a coupling edge of

i

the coarse mesh By analogy each D kette is a blo ck of interior no des of a coupling

face of the coarse mesh

The structure of these kettes is more complicated It is shown in all details with an

example in

Global Crossp oint Names IGLOB

To identify the lo cal crossp oints their global name is stored

IGLOB

IGLOBI global name of the no de I which is an crossp oint where I is the lo cal

number

Release of Programmers Manual

CHAPTER DATA STRUCTURE

DirichletNeumann data

The DirichletNeumann data are asso ciated with faces They have b oth the same

data structure

DIRDVDIR DVDIR NDF NDIRREAL

NEUMDVNEUM DVNEUM NDF NNEUMREAL

NDF data blo cks

z

number of face type data DF

type data DF

DF k means the k th degree of freedom

The data are NDIRREAL NNEUMREAL real parameters RP describing the b ound

ary condition

Possible values of the type of the b oundary condition data are

none

constant f RP

linear function f RP X RP Y RP Z RP

function call f uX Y Z from Assembspf

rd

RP has b een planned for the co ecient in b oundary conditions of kind but

this is not implemented yet

Kette data

The purp ose of the kette data is the optimization of the communication Every cou

pling faceedge of the coarse mesh is referred in the kette data by its global names of

vertices All interior no des of these facesedges have consecutive numbers and form a

so called kette see Thus they can b e describ ed by a p ointer to the rst no de

and the number of no des length in this blo ck

There are two dierent kette data KETTED for edges and KETTED for faces which

have the same data structure For more information see

KETTEDKDDIMKETTEDKDDIM KDDIM KDDIM

or j vertices

z

p ointer length pathID vertex vertex

Vertex i is a vertex number

Note that the vertex numbers here are global crossp oint names For an explanation

of pathID see

It is recommended to use the following p ointer variables for this dataset

PKZEIG p osition of the p ointer currently

PKLENG p osition of the length currently

PWEGID p osition of the pathID currently

PKDAT p osition of the data rst no de currently

Release of Programmers Manual

REDUCED DATA STRUCTURE RDS

CHAIN

The array CHAIN is very similar to the KETTED data The dierence is that it do es

not contain p ointers to no de numbers but to the numbers of subfaces Note that all

subfaces which form a coupling face have consecutive face numbers

CHAINKDDIM KDDIM

j vertices

z

p ointer length pathID vertex vertex

Vertex i is a vertex number

Note that the vertex numbers here are global crossp oint names

There are the same p ointers PKZEIG PKLENG PWEGID and PKDAT as for the kette data

REGION

The vector REGION contains an integer for each volume This integer is read from the

input le and passed on to the son elements during the renement pro cess Currently

it is used as a material index

REGION

jnumberj

Reduced data structure RDS

The RDS is more typically for nite element applications The basic entities are the no des

and all other ob jects volumes b oundary faces are represented by numbers of no des

The arrays of faces and edges disapp ear

Volumes

Each volume consists of NEND no des

VOLNEND

NEND no des

z

no de

no de

Dirichlet data

Dirichlet data are given on faces A face consists of NEND no des and thus the Dirichlet

data consists of the no de numbers and the Dirichlet values of every degree of freedom

at these no des The face number FDS is an additional information for the error

estimator

DIRDRDIR DRDIR NEND NDF

Release of Programmers Manual

CHAPTER DATA STRUCTURE

NEND no des NDF data blo cks of length NEND

z z

number of face no de no de IFG Dir values DF Dir values DF

In the case of more than one degree of freedom it may happ en that not all degrees of

freedom have a Dirichlet condition The Information ab out this is co ded bitwise in

the entry IFG

It is recommended to use the following p ointer variables for this dataset

DRNODES p osition of the no des

DRIFG p osition of IFG

DRDAT p osition of the data

Neumann data

The Neumann data are used in a dierent way then the Dirichlet data so the only

change from the full to the reduced structure is to add the no de numbers which

represent the face The face number FDS is again used for error estimation

NEUMDRNEUM DRNEUM NEND NDF NNEUMREAL

NEND no des NDF data blo cks

z z

number of face no de no de type data DF type data DF

DF k means the k th degree of freedom

There are two p ointer variables for this dataset

NRNODES p osition of the no des

NRDAT p osition of the data

Hierarchical List

The hierarchical list connects all no des with its father no des

LC

j father j factor j j no de j father

The factor factor describ es the relative p osition of the no de at the edge

factor COORfather COORno de factor COORfather

The entries in LC are ordered such that the fathers are included b efore their sons

Note that the entries in COOR are ordered in another way index equals number of the

no de for the numbering of the no des see Section Note that father father

if the no de is a crossp oint

TETF

TETF is the old array of volumes VOL from the FDS see It is used only in the error

estimator Chapter

Release of Programmers Manual

INCLUDEFILESCOMMONBLOCKS

No de Co ordinatesIGLOBKette DataCHAINREGION

The data structure is the same as in the FDS see

INCLUDEFilesCOMMONBlo cks

There is a number of COMMONBlo cks in our program Most of them are lo cated in

INCLUDEFiles Moreover some parameters are determined in these les

netddatinc

This INCLUDEFile contains a number of variablesparameters which determines dimen

sions of data esp ecially these which dep end on the type of the mesh All variables are in

these COMMONBlo cks

NENXD

NEND number of no des p er face see

NEND number of no des p er volume see

NETDIM

DIMVOL dimension of the array of volumes FDS see

DIMFACE dimension of the array of face FDS see

FCHIELD p ointer to the number of the rst subface see

FZEIG p ointer to the type of the face see

SUB name of the sub directory with the meshes

RB

NDF number of degrees of freedom see

DVDIR dimension of the array of Dirichlet data FDS see

DRDIR dimension of the array of Dirichlet data RDS see

DRNODES p osition of the no des RDS see

DRIFG p osition of IFG RDS see

DRDAT p osition of the data RDS see

DVNEUM dimension of the array of Neumann data FDS see

DRNEUM dimension of the array of Neumann data RDS see

NRNODES p osition of the no des RDS see

NRDAT p osition of the data RDS see

The subroutine SET RBCOM sets all these Variables NDF NEND and NEND must b e

correct when calling this routine

Moreover there are the following parameters via FORTRAN parameter statement

Release of Programmers Manual

CHAPTER DATA STRUCTURE

DIMKANTE dimension of the array of edges currently

KZEIG p osition of the type of the edge currently

KCHIELD p osition of the child of the edge currently

KDDIM dimension of the array of D kettes currently

KDDIM dimension of the array of D kettes currently

PKZEIG p osition of the p ointer in the kette data currently

PKLENG p osition of the blo ck length in the kette data currently

PWEGID p osition of the path identier in the kette data currently

PKDAT p osition of the data in the kette data currently

NDIRREAL number of Dirichlet real parameters currently

NNEUMREAL number of Neumann real parameters currently

com probinc

There is a number of variables with information concerning the mesh

PROBLEM

Nk number of no des lo cal on the pro cessor

NCrossG number of crossp oints global

NCrossL number of crossp oints lo cal

NKettSum number of all coupling no des lo cal

NC NKettSum NCrossL

NI number of interior no des lo cal

NanzKD number of D kettes

NanzKD number of D kettes

NanzK NanzKD NanzKD

LinkLevel auxiliary variable for communication

The subroutine COM PROB sets most of these variables

lenameinc

FILENAME

File name of the standard le without std

Length length of File

Nlevl not used

itri not used

Lunit not used

Fullname name of the standard le including sub directory with the meshes

and std

To get the lename and to set these variables the subroutine SETFILE is used

standardinc

This INCLUDEFile contains some program control variables which can b e changed with

the les controlquad or controltet without compiling the program anew see Section

Release of Programmers Manual

INCLUDEFILESCOMMONBLOCKS

standard

vertvar kind of coarse grid partitioning

femakkvar variant of accumulation of distributed data see

loesvar choice of the preconditioner

Nintass number of the quadrature formula used for assembling Neumann

b oundary data

Nintass number of quadrature formula for D elements used in the assembling

Ninterror as nintass but used in the error estimator for the integration of the

jump of the normal derivatives

Ninterror as nintass but used for the integration of D integrals in the error

calculation

Epsilon stop criterion for the CG relative decrease of the norm of the residual

Iter maximal number of iterations in the CG algorithm

NDiag upp er estimate for the number of nonzero entries in any row of the

stiness matrix

Verf mesh renement parameter for a certain class of examples see

Subsection

lin quad kind of shap e functions

trnetinc

There are some variables with information concerning the parallel computer compare

Section

TrNet

NCUBE dimension of the hypercub e

ICH number of the pro cessor in the hypercub e top ology

NODENR number of the no de the pro cessor is lo cated on

TrRing

NPROC number of pro cessors

ICHRING number of the pro cessor in ring top ology

Lforw number of the link that leads to the successor within the ring

Lback number of the link that leads to the predecessor within the ring

Release of Programmers Manual

CHAPTER DATA STRUCTURE

Release of Programmers Manual

Chapter

Hierarchical Mesh Renement

Parameters of NETMAKE

The pro cedure NETMAKE generates the mesh It reads the coarse mesh from a le distributes

it to the pro cessors p erforms the renement with the full data structure and reduces the

data to the reduced data structure

SUBROUTINE NETMAKEAJCOORNUMNPJDIRNDIRJNEUMNNEUMJVOL

JREGIONNUMELJIGLOBJKETTEDJKETTED

JCHAINJFREILAENGEIERVFSJTETFJLCSTEUER

A IO workspace vector

JCOOR O p ointer to array of no de co ordinates COOR

NUMNP O NUMb er of No dal Points

JDIR O p ointer to the Dirichlet data DIR

NDIR O number of Dirichlet faces

JNEUM O p ointer to the Neumann data NEUM

NNEUM O number of Neumann faces

JVOL O p ointer to array of volumes VOL

JREGION O p ointer to the region data REGION

NUMEL O NUMb er of of ELements volumes

JIGLOB O p ointer to array of global crossp oint names IGLOB

JKETTED O p ointer to array of D kette data KETTED

JKETTED O p ointer to array of D kette data KETTED

JCHAIN O p ointer to array CHAIN

JFREI IO rst unused p osition of A

LAENGE I length of A

IER O error parameter

VFS O number of renement steps

JTETF O p ointer to the array of volumes of the full data structure

JLC O p ointer to the hierarchical list

STEUER O array of logical status parameters used in the output to ol see

To optimize the communication the no des which b elong to coupling facesedges stand

together in the array of no de co ordinates and the order of these p oints is the same on every

pro cessor They form a so called kette To realize this it is useful to order the edges and

faces to o

Release of Programmers Manual page

CHAPTER HIERARCHICAL MESH REFINEMENT

Coarse mesh input and distribution

The coarse mesh will b e read from a standardized le compare Section These les

are lo cated in the sub directories mesh tetrahedral meshes or mesh brick meshes

Only pro cessor zero reads the mesh and later it is given to all other pro cessors There are

two metho ds to determine the owner of the volumes

The rst metho d is very simple Each pro cessor should own the same number of volumes

and they are distributed simply as they are numbered linear distribution The next step

is to delete all volumes which are not owned by the pro cessor and all other unused data

The second metho d is much more complicated and should optimize the later computa

tion This metho d is known as Recursive Sp ectral Bisection RSB and is based on results

of graph theory It op erates on the dual graph to the coarse mesh

Sp ectral bisection calculates the eigenvector corresp onding to the secondsmallest eigen

value of the dual graphs Laplacian matrix To each no de in the dual graph naturally

corresp onds a comp onent of the eigenvector To partition the dual graph into two nearly

equalsized subgraphs and the algorithm computes the median value of the eigenvector

comp onents Those no des with eigenvector comp onents smaller than the median value are

assigned to subgraph the remaining no des to subgraph

The routine runs parallel on all pro cessors where the elements corresp onding to subgraph

remain at the rst half of pro cessors and the others at the second half The not assigned

data is deleted on each half This is recursively applied to the mesh according to the

dimension of the hypercub e of the pro cessors used where each half is divided into two new

parts until each part contains only one pro cessor For a detailed description see

Here we use a program written by Clemens Brand MontanUniversitatLeob en Austria

with slight mo dications by Uwe Reichel

At last a number of variables and arrays are initialized with its start values An imp or

tant fact to notice is that all facesedges of the coarse mesh are assumed to b e coupling

facesedges no matter if they really connect the submeshes of two pro cessors or if they are

only within one submesh

Renement

The pro cedure

In the renement step every volume is divided in subvolumes every face in subfaces and

every edge in two sub edges

There are in general steps in the renement pro cedure FEIN

Preparation of the data array COOR for the new no des The no des which b elong to

coupling facesedges of the coarse mesh stand together in COOR in a well dened way

Namely all no des on coupling edges including the edges on coupling faces stand

together in that order they are lo cated on the edge For instance the no des on a

coupling edge with vertices A and B which was rened three times are lo cated on

these edge and in COOR in that way

A P P P P P P P B

n n n n n n n

The middle no de P is from the rst renement step The no des P and P

n n n

were generated in the second renement step all other P in the third one In the

i

Release of Programmers Manual

REFINEMENT

quadratic case it lo oks identically but after the second renement step So it is clear

that no des must b e moved in the array COOR b efore the renement

The kette data were also up dated in this step

Renement of the edges All sub edges which b elong to coupling facesedges stand

NR

together at the b eginning of the data of the level There are X sub edges of a NR

times rened edge These X sub edges stand together ordered as they are lo cated on

the old father edge The X linear case X quadratic case new no des get their

correct p osition in COOR as describ ed ab ove The new sub edges are app ended to the

list of edges in that order they were generated Because of the sp ecial order of the

father edges the describ ed order principle is conserved

Renement of the faces The subfaces of coupling faces stand together New faces

are app ended In the quadratic case the middle no des of the new inner edges are

generated and in the case of brick meshes the middle no de of the face to o

Renement of the volumes For tetrahedral meshes the rule after Bey see is used

to avoid the degeneration of the mesh By analogy to the renement of the edges

the middle no des of the new edges only quadratic case and the middle no des of the

brick are generated Note that also new faces have to b e introduced

Up date of the b oundary condition data The new subfaces with b oundary conditions

were added

There are some dierences in details b etween the dierent mesh types tetrahedralbrick

linearquadratic

For tetrahedral meshes it is easy The dierence b etween the quadratic and the linear

case is whether the middle no des of the edges are generated or not That causes dierences

in the preparation and the edge renement step

For the brick meshes it is more complicated b ecause of the new middle no des of the

facesbricks In the coarse mesh preparation step the middle no des of the edges where

generated So it is an quadratic mesh of serendipity type The normal renement is also

quadratic In the linear case N quadratic renement steps were p erformed followed by

an step to change the quadratic bricks to linear ones It is the same pro cedure FEIN there

is an switch parameter with the dierence that now no new no des on edges were generated

There is an third case with no de bricks the middle no des of the faces and the of brick

are included and the order of the no des of an volume is slightly dierent After the normal

quadratic renement these additional no des were generated in the subroutine STROEM

An example

In general all no des on coupling facesedges stand together and form so called kettes The

no des of Dkettes are ordered as they are lo cated on the edges they b elong to The same

is true for inner edges of Dkettes But the vertices of these edges are not included in

these edge no de blo cks b ecause they are crossp oints vertices of coupling edges no des on

coupling edges or middle p oints of subfaces vertices of edges on coupling faces compare

Figure

To describ e the structure of the array of no de co ordinates with all details the renement

of a single brick is considered It is shown what happ ens with face number

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

The coarse mesh see Figure

There are only the crossp oints in COOR

Crossp oints

After the rst renement step see Figure

Now there are in COOR b eside the crossp oints also the middle p oints of the coupling

edges and faces and the rst interior p oint in the center of the brick

Crossp oints

middle p oints of the coupling edges

the middle p oints of coupling faces

the middle p oint of the brick

After the second renement step see Figure

Now the coupling edges are two times rened and there are no des on it The

middle p oints of the new subfaces and of the edges on the coupling face

app ear

Crossp oints

CE

CE

CE

CE no des on coupling edges

CE

CE

CE

CF

middle no des of the subfaces

the middle no de of the face CF no des on coupling faces

one time rened edges

CF

interior no des

After the third renement step see Figure

Now the coupling edges are three times rened and there are no des on it Moreover

there were created the middle p oints of subfaces from level and the middle p oints

of the edges which where created in the previous step compare the table b elow

Note that the middle no de and no des of the four edges of every rened face stand

together for instance no des

Release of Programmers Manual

REFINEMENT

CE

CF CE CE

CE

Figure One renement step

Figure The coarse mesh

Figure Second renement step

Figure Third renement step

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

Crossp oints

CE

CE

CE

CE no des on coupling edges

CE

CE

CE

CF

middle no des of the subfaces

middle no de generated in step

one time rened edges

middle no de generated in step

one time rened edges

middle no de generated in step CF no des on coupling faces

one time rened edges

middle no de generated in step

one time rened edges

the middle no de of the face

two times rened edges

CF

interior no des

Note

There are no middle no des of faces in tetrahedral meshes

In the quadratic case there is the same structure of COOR but all edges have their

middle p oints So the edges app ear one step earlier and they lo ok like one time more

rened linear edges

Reducing data

The change of the data structure is easy It is only necessary to determine the no des a

volumeface consists of and to calculate the Dirichlet values on some no des

Parameters of the output to ol AUSGABE

The subroutine AUSGABE is an output to ol for several mesh data It works with the FDS

as well as with the RDS There is an array of logical variables STEUER which determines the

status of the data FDSRDSsolved STEUER

Features of AUSGABE

graphical output of mesh data GRAPE

tabular output of mesh data

tabular output of kette data

Release of Programmers Manual

PARAMETERS OF THE OUTPUT TOOL AUSGABE

tabular output of the solutionerror

tabular output of error norms

output of the mesh as standardized le std works only as one pro cessor version

SUBROUTINE AUSGABESTEUERCOORZKPZFPZIPNUMNPKANTEFKANTE

NKANTEFACEFFACENFACEVOLREGIONFVOL

NUMELNKKNKFDIRFDIRNDIRNEUMFNEUM

NNEUMKETTEDKETTEDCHAINVFSXLCIER

IGLOBFREILAEFREI

STEUER mesh status

COOR array of no de co ordinates

ZKP FDS p ointer in COOR rst no de on D kette

ZFP FDS p ointer in COOR rst no de on D kette

ZIP FDS p ointer in COOR rst inner no de

NUMNP number of no dal p oints

KANTE FDS array of edges

FKANTE FDS p ointer to the last level in KANTE

NKANTE FDS number of edges only last level

FACE FDS array of faces

FFACE FDS p ointer to the last level in FACE

NFACE FDS number of faces only last level

VOL array of volumes

REGION array of region data

FVOL FDS p ointer to the last level in VOL

NUMEL number of volumes only last level

NKK number of D kettes

NKF number of D kettes

DIR array of Dirichlet data

FDIR FDS p ointer to the last level in DIR

NDIR number of Dirichlet faces only last level

NEUM array of Neumann data

FNEUM FDS p ointer to the last level in NEUM

NNEUM number of Neumann faces only last level

KETTED array of D kette data

KETTED array of D kette data

CHAIN array of chain data

VFS number of renement steps

X RDS solution

LC RDS hierarchical list

IER error parameter

IGLOB array of global crossp oint names

FREI workspace array

LAEFREI length of the workspace array

All variables except the error parameter IER are input

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

Tree structure of the routines

Tree substructures of subroutines marked with the symbol are describ ed b efore in the list

NETMAKE for tetrahedral meshes

! LCZSTEP NETMAKE

! MATVEC ! GROBNETZ

! TRIDEV ! READDATTR

! DELTA ! SIMP MARK

! BOUNDS ! VERTEILEN

! FIEDLER ! VERTEILEN

! INVITER ! FEINCALL

! TRIDSOL ! FEIN

! COM PROB

FEIN

! MOVE

! COORPLATZ

! STAUCHEN

! KPLATZ

! TEILEKANTEN VERTEILEN

! TKANTEN ! SET RBCOM

! PCORECT ! DAT DOWN

! KWRITE ! TKUERZ

! TEILEFLAECHEN ! KUERZEN

! TFLAECHEN ! ZUERST

! PCORECT ! PCORRECT

! KWRITE ! PFACE

! SCHREIBEDREIECK ! GEMPKT

! ECKPUNKTE ! COM PROB

! GEMPKT

VERTEILEN

! GEMPKT

! SET RBCOM

! MITTE

! DAT DOWN

! KWRITE

! RSB

! SCHREIBEDREIECK

! TKUERZ

! TETSCHREIBEN

! KUERZEN

! RBNEU

! ZUERST

STAUCHEN PROB ! COM

! DIRSTAUCHEN

RSB

! PFACE

! SPEBIS

! DIRINTPO

! MEDIAN

! NEUMSTAUCHEN

! OUTINT

! PFACE

! BUILDHV

! VOLPUNKTE

! MATRIX

! ECKPUNKTE

! EIGVAL

! GEMPKT

! INILANC

! PBRICK

! SEED

! GEMKANTE

! RANDOM

! GEMPKT

! MATVEC

! MAKE LC

Release of Programmers Manual

TREE STRUCTURE OF THE ROUTINES

NETMAKE for brick meshes

! PCORECT NETMAKE

! KWRITE ! GROBNETZ

! FAWRITE ! READDATQ

! KWRITE ! SIMPMARK

! OFDAT ! VERTEILEN

! GDKANTE ! FEINCALL

! GTKANTE ! FEIN

! KWRITE ! COM PROB

! FAWRITE ! STROEM

! VWRITE ! MOVE

! RBNEU ! STAUCHEN

STROEM VERTEILEN

! FMPLATZ ! SET RBCOM

! FMITTEN ! DAT DOWN

! FMID ! TKUERZ

! KWRITE ! KUERZEN

! VMITTEN ! FORIENT

! KWRITE ! VORIENT

! ZUERST

STAUCHEN

! PCORRECT

! DIRSTAUCHEN

! FEBRICK

! PFACE

! GEMPKT

! DIRINTPO

! COM PROB

! NEUMSTAUCHEN

! PFACE FEIN

! VOLPUNKTE ! COORPLATZ

! ECKPUNKTE ! KPLATZ

! GEMPKT ! KFTEILEN

! PBRICK ! KTEILEN

! PFACE ! PCORECT

! FEBRICK ! KWRITE

! MAKE ! FFTEILEN LC

! FTEILEN

! LCPRINT AUSGABE

! NETZDRUCK

! FSTRADDI

AUSGABE

! FSTRADDR

! IAUS

! VRBPRINT

! GEBGRAPE

! KETPOUT

! NETZREDOUT

! WTABX

! FSTRADDI

! PWTABX

! FSTRADDR

! FNTAB

! RDIRPRINT

OUT ! STDF

! VRBPRINT

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

Short description of the routines in libNAa

The following FORTRAN sources are lo cated in NetzAllgemein

AUSGABE netzdruckf frame for the output of several data mesh solution

error estimates

BOUNDS rsbf a b ound on the RaleighRitz approximation to the

j

maximum eigenvalue is computed

BUILDHV rsbf accumulation of an auxiliary array for the sp ectral

bisection

COM PROB com probf sets the variables of the common blo ck in com probinc

DAT DOWN dat downf distributes mesh data from pro cessor to all other pro

cessors

DATRED kuerzenf deletes facesedgesno des which are not referred in the

volumesfacesedges generates IGLOB and deletes un

used b oundary condition data

DELTA rsbf function to calculate a sp ecial determinant during RSB

DIRINTPO stauchenf computes the Dirichlet value at a no de

DIRSTAUCHEN stauchenf changes the Dirichlet data to the reduced data structure

DREGIO stdwritef determines the names of regions and the number of vol

umes p er region

ECKPUNKTE AUpfeinf determines the vertices of a tetrahedron

EIGVAL rsbf computes the second smallest eigenvalue of the Lapla

cian matrix

FEBRICK AUpfeinf determines the vertices of a quadrilateral

FIEDLER rsbf computes the edler vector ie the eigenvector corre

sp onding to the second largest eigenvalue of the Lapla

cian matrix

FSTRADDI netzdruckf generates a format string

FSTRADDR netzdruckf generates a format string

GEMKANTE AUpfeinf determines the common edge of two faces

GEMPUNKT AUpfeinf determines the common no de of two edges

GETDOFS getdofsf description of the degrees of freedom for the visualiza

tion to ol GRAPE

IAUS netzdruckf displays a menu for the dierent output routines

INILANC rsbf initialization of the Lanczosmetho d

INVITER rsbf inverse iteration for calculation of the eigenvector to a

given eigenvalue

KETPOUT netzdruckf output of kette data for kette see

KPLATZ AUpfeinf copies the no des of an edge to the new places in COOR

b efore the renement step

KUERZEN kuerzenf deletes unused mesh data and p erforms the necessary

renumbering generates the array of global crossp oint

names IGLOB Note that DAT DOWN distributes the whole

coarse mesh then some elements are marked and

KUERZEN deletes all elements not marked

KWRITE AUpfeinf writes an edge into the array of edges

LCPRINT netzdruckf output of one row of the hierarchical list

Release of Programmers Manual

SHORT DESCRIPTION OF THE ROUTINES IN LIBNAA

LCZSTEP rsbf iteration step of the Lanczosmetho d

LIES standardf reads and analyzes a row of the le of program control

variables controlquad or controltet resp ectively

LC stauchenf generates the hierarchical list MAKE

MAKREGIO stdwritef prepares the REGION data for STDWRITE

MATRIX rsbf generates the Laplacian matrix of element connectivity

MATVEC rsbf matrix vector multiplication

MEDIAN rsbf calculates the median of a set array of numbers

MOVE cnetzf realization of a co ordinate transformation for sp ecial

applications

NEUMSTAUCHEN stauchenf changes the Neumann data to the reduced data struc

ture

NETZDRUCK netzdruckf output of the full data structure

NETZREDOUT netzdruckf output of the reduced data structure

OUT netzdruckf displays the part of the solution vector of one pro cessor

OUTINT rsbf auxiliary routine for integer output

OUTKETTE netzdruckf displays the part of the kette data for kette see of

one pro cessor

OUTSTANDARD standardf displays program control variables

PBRICK AUpfeinf determines the no des of an linearquadratic brick

PCORECT p corectf determines the middle p oint of an edge

PFACE AUpfeinf determines the no des of a face

PWTABX netzdruckf displays one row of the table of the solutionerror

RANDOM rsbf calculates a random value

RBNEU AUpfeinf copies the b oundary condition data from father faces to

the new subfaces after a mesh renement step

RDIRPRINT netzdruckf output of Dirichlet data reduced data structure

RNDPRINT stdwritef output of b oundary condition data

RSB rsbf computes the sp ectral bisection for a given mesh

SEED rsbf a very basic random number generator uses RANDOM

SETFILE setlef input of the lename

SET RBCOM set rb comf sets the values of the variables in the common blo ck RB

in the le netddatinc

SETSTANDARD standardf sets the program control variables using le controltet

controlquad

SIMP MARK simp markf marks the volumes with the number of the pro cessor

which should own them very simple metho d

SPEBIS rsbf the median value of the edler vector is computed all

elements of the edler lower then the median are marked

STAUCHEN stauchenf reduction of the full data structure to the reduced data

structure

OUT stdwritef frame for the output of the full data structure as a stan STDF

dard le std

STDWRITE stdwritef output of the full data structure as a standard le

TKUERZ kuerzenf deletes all volumes which are not marked with the own

pro cessor number array MARK

TRIDEV rsbf computes largest eigenvalue of the tridiagonal matrix

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

TRIDSOL rsbf solver for a tridiagonal symmetric p ositive denite

matrix

VERSION versionf displays the title of the program

VOLPUNKTE stauchenf determines the vertices of a volume

VRBPRINT netzdruckf output of Dirichlet data full data structure

VREGIO stdwritef determines the names of the volumes in the regions

WTABX netzdruckf table of the solution

YSFAKTOR cnetzf determines the relative length of the sub edges

ZWEIIWERTE standardf reads two integer values from an string variable

Short description of the routines in libNTa

The FORTRAN sources are lo cated in NetzTetraeder

COORPLATZ upfeinf prepares COOR for the following mesh renement step

The no des in COOR are ordered A renumbering is nec

essary b ecause some of the new p oints will b e placed

b etween old p oints

GROBNETZ grobnetzf provides the coarse mesh

FEIN upfeinf hierarchical mesh renement of tetrahedral meshes

FEINCALL upfeinf frame for the hierarchical mesh renement

NETMAKE netmakef frame for the mesh generation

MITTE upfeinf determines the middle p oint of an edge of an face

SCHREIBEDREIECK upfeinf writes one triangle into the array of faces

SET NETDIM controlf sets constants esp ecially array dimensions for tetrahe

dral meshes

STWERTE controlf presets the program control variables with standard val

ues and op ens the le controltet

TEILEFLAECHEN upfeinf divides the faces

TEILEKANTEN upfeinf divides the edges

TETSCHREIBEN upfeinf writes one tetrahedron into the array of volumes

TFLAECHEN upfeinf divides one face

TKANTEN upfeinf divides one edge

VERTEILEN grobnetzf sends the coarse mesh data to all pro cessors deletes

all unused data and prepares the mesh for renement

simple spreading of the volumes

VERTEILEN grobnetzf sends the coarse mesh data to all pro cessors deletes

all unused data and prepares the mesh for rene

ment spreading of the volumes with recursive sp ectral

bisection

ZUERST grobnetzf prepares the coarse mesh for the renement sets initial

values of variables etc

Short description of the routines in libNQa

The FORTRAN sources are lo cated in NetzQuader

Release of Programmers Manual

SHORT DESCRIPTION OF THE ROUTINES IN LIBNQA

COORPLATZ upfeinf prepares COOR for the following mesh renement step The

no des in COOR are ordered A renumbering is necessary b e

cause some of the new p oints will b e placed b etween old

p oints

F upfeinf determines face number used in VORIENT

FAWRIT upfeinf writes one quadrilateral into the array of faces

FEIN upfeinf hierarchical mesh renement of brick meshes

FEINCALL upfeinf frame for the hierarchical mesh renement

FFTEILEN upfeinf divides the faces

FORIENT upfeinf orders the four edges of an quadrilateral the rst is connected

with the second which is connected with the third which is

connected with the fourth which is connected with the rst

FTEILEN upfeinf divides one face

FMID stro emf determines the middle no des of one face

FMITTEN stro emf determines the middle no des of the faces

FMPLATZ stro emf prepares the data array COOR for the new middle no des of the

faces and volumes for no de bricks

GDKANTE upfeinf determines the relative p osition of subfaces of two father brick

faces

GROBNETZ grobnetzf provides the coarse mesh

GTKANTE upfeinf determines the common edge of two subfaces used in subrou

tine GDKANTE

KTEILEN upfeinf divides one edge

KFTEILEN upfeinf divides the edges

NETMAKE netmakef frame for the mesh generation

OFDAT upfeinf determines the subfaces and inner sub edges of an father brick

face

SET NETDIM controlf sets constants for brick meshes

STROEM stro emf determines the middle no des of the faces and volumes for

no de bricks

STWERTE controlf presets the program control variables with standard values

and op ens the le controlquad

TAUSCHE upfeinf exchange of the values of two integer variables

VERTEILEN grobnetzf sends the coarse mesh data to all pro cessors deletes all unused

data and prepares the mesh for renement simple spreading

of the volumes

VMITTEN stro emf determines the middle no des of the volumes

VORIENT upfeinf orders the six faces of a brick the rst is not connected with

the th and the other four faces are connected as describ ed

for the edges of a quadrilateral ab ove

VWRITE upfeinf writes a brick into the array of volumes

ZUERST grobnetzf prepares the coarse mesh for the renement sets initial values

of variables etc

Release of Programmers Manual

CHAPTER HIERARCHICAL MESH REFINEMENT

Release of Programmers Manual

Chapter

Assembly of the equation system and

error estimation

General remarks

The assembly of the equation system and the error calculation have the common feature

of the computation of integrals over volumes elements and faces Moreover values of

shap e functions are needed only in these two parts of the program That is why we put the

sources of the related subroutines together in the directory Assem and the ob ject les form

the library libAarchia The aim of this chapter is to describ e the realization of these

routines The hop e is that many of them can b e used in further extensions of the program

system SPCPM Po D or in other related programs

Many routines have b een taken from the sequential co de FEMPSD A brief description

of this co de is contained in But there are also some signicant dierences

In FEMPSD only the Poisson equation can b e solved while in SPCPM Po D also

the Lamesystem is included

In FEMPSD the error computation is restricted to linear tetrahedral elements here

this has b een extended to the full scale of elements treatable see b elow Moreover

some communication is necessary

The storage of the system matrix is slightly dierent the storage and the treatment of

b oundary conditions is fully dierent due to the inuence of the co de SPCPM Po D

Because of these changes the parameter lists of many of the subroutines have b een changed

thus it is not suggested to mix the subroutines from the serial and the parallel program

Essentially there are two subroutines which are called from programs outside the library

libAa ASSLOES and FNTAB In b oth parts numerical integration is used we describ e the

corresp onding routines in Section

ASSLOES realizes the frame for the assembly and solution of the equation system Given

the nite element mesh some control parameters and auxiliary memory the routine returns

the nite element solution The matrix and the right hand side are lo cal in ASSLOES The

main ingredients are initialization the call of ASSEM and COARSMAT the call of several

routines for the solution of the equation system see Chapter and time measurement

The stiness matrix and the right hand side are assembled in ASSEM the main steps are

Release of Programmers Manual page

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

describ ed in Sections Moreover a coarse grid matrix is assembled in COARSMAT

see

The calculation of the error norms and their output for each sub domain pro cessor

and global is done by FNTAB The dierent error measures are the discrete maximum norm

maximal dierence of u and u in no dal p oints and the H and L seminorms of u u

h h

calculated via numerical integration

Of course these norms can only b e calculated if the exact solution is programmed Note

that the mo dule bspf supplies function routines for u its derivatives and the right hand

sides f and g They have to b e programmed by the user The realization of the error norms

is describ ed in Section

For the tree structure of ASSLOESASSEM and FNTAB see Section all routines are

describ ed shortly in Section

Numerical integration and shap e functions

Mathematical Background

The assembly of the stiness matrix and the right hand side as well as the calculationes

timation of error norms require the numerical integration of integrals over volume elements

or faces This integration is realized by a transformation to reference elements that means

a co ordinate transformation Let b e a volume element tetrahedron or hexahedron with

no des x IR i NEND Moreover let b e the reference element with no des x

i i

IR i NEND and shap e functions x with x then the corresp onding

i i j ij

co ordinate transformation is

NEND

X

x x x

i i

i

and the volume integral is transformed and approximated by

Z Z

NQP

X

f x dx f x j detJ x j dx f y j detJ y j

j j j

j

where y are the integration p oints are the weights and J is the Jacobian functional

j j

matrix

The twodimensional case is treated by analogy Let G b e a face triangle or quadrilat

eral in IR with no des x IR i NEND and let G b e the reference element with

i

no des x IR i NEND and shap e functions x then the relations are

i i

NEND

X

x x x

i i

i

Z Z

NQP

q

X

p

f x dx f x EC F dx E y C y F y f y

j j j j j

j

G

G

i i i i

X X X

x x x x

C x F x where E x

x x x x

i i i

Release of Programmers Manual

NUMERICAL INTEGRATION AND SHAPE FUNCTIONS

and x x x x x x x

Thus it is useful to supply arrays QGSTNQP and QGSTNQP with the quadra

ture p oints and the corresp onding weights as well as arrays SHPNENDNQP and

SHPNENDNQP with the values of the shap e functions and their derivatives in the

quadrature p oints

The arrays QGST QGST SHP and SHP

Actually these arrays dep end on the element type triangle quadrilateral tetrahedron

p entahedron hexahedron In the sequential co de FEMPSD see there has b een no

restriction to use one type of elements exclusively Thus all arrays could have b een necessary

in one run of the program So it was decided to supply the arrays mentioned ab ove for all

element types This means

NQP and NQP are vectors of length and resp ectively

NQP number of quadrature p oints for quadrilaterals

NQP number of quadrature p oints for triangles

NQP number of quadrature p oints for hexahedra

NQP number of quadrature p oints for p entahedra

NQP number of quadrature p oints for tetrahedra

These arrays are assigned in ELEHF b ecause the user supplies only the number of the

formula see b elow and not the corresp onding number of quadrature p oints

The actual length of the arrays are

array length

QGST NQP NQP

QGST NQP NQP NQP

NQP NQP for linear elements

SHP

NQP NQP for quadratic elements

NQP NQP NQP for linear elements

SHP

NQP NQP NQP for quadratic elements

Of course one can call subroutines with a part of the array and use the array as

introduced in These arrays are allo cated in ELEHF

The quadrature formulae programmed are given in Tables to They are realized

in the subroutines EINTG and EINTG on information from the b o oks for a

description of the routines see Section

The arrays SHP and SHP are assigned in ESHAP and ESHAP using the arrays QGST

and QGST as well as subroutines where the shap e functions and their derivatives for the

dierent cases are programmed PL PQ for triangles PHIL PHIQ for quadrilateral PTL

PTQ for tetrahedra PL PQ for p entahedra and PHIL PHIQ for hexahedra The

subroutines mentioned in this paragraph are contained in the le up eef

A mo dication of the arrays for the use in the error esti

mator

An additional case must b e considered for the error estimator compare Section Here

the values of the D shap e functions and their derivatives are used in the quadrature p oints

Release of Programmers Manual

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

Formula Number of exact for

Description

i j

number p oints x y with

midp oint center of gravity i j

x Gaussian p oints i j

x Gaussian p oints i j

Table Quadrature formulas for quadrilaterals

Formula Number of exact for

Description

i j

number p oints x y with

center of gravity i j

midp oints of the edges i j

Gaussian p oints i j

Gaussian p oints i j

Table Quadrature formulas for triangles

Formula Number of exact for

Description

i j k

number p oints x y z with

center of gravity i j k

Gaussian p oints i j k

Gaussian p oints i j k

Gaussian p oints i j k

Gaussian p oints i j k

Table Quadrature formulas for tetrahedra

Formula Number of the formula is a cross pro duct of the formulas exact for

i j k

number p oints for triangle for interval zdirection x y z with

center of gravity midp oint i j k

midp oints of edges midp oint i j k

Gaussian p oints midp oint i j k

midp oints of edges Gaussian p oints i j k

Gaussian p oints Gaussian p oints i j k

Gaussian p oints Gaussian p oints i j k

Gaussian p oints Gaussian p oints i j k

Gaussian p oints Gaussian p oints i j k

Table Quardature formulas for p entahedra

Formula Number of exact for

Description

i j k

number p oints x y z with

midp oint center of gravity i j k

xx Gaussian p oints i j k

xx Gaussian p oints i j k

midp oints of the faces i j k

Irons formula i j k

Table Quadrature formulas for hexahedra

Release of Programmers Manual

ASSEMBLY OF THE STIFFNESS MATRIX AND THE RIGHT HAND SIDE

of the faces of the elements This is still under development

Assembly of the stiness matrix and the right

hand side

Main steps

The stiness matrix A and the right hand side F are assembled in the subroutine ASSEM

Only the nonzero elements of A are stored see The main steps are the following

Allo cation of memory for the matrix A its index vector LAN the right hand side

FN and the solution XN where N NUMNP NDF is the number of unknowns NUMNP

is the number of no dal p oints and NDF is the number of degrees of freedom p er no de

Our approach is not to determine the structure of A a priori but to use a rectangular

matrix ANDIAGN preliminary The maximal number NDIAG of nonzero entries in a

row of the matrix must b e estimated b efore see Section of the Users manual

If it proves to b e to o small in step then it is increased iteratively

Initialisation of A LA F and X The subroutine MAKEKZU initializes the index vector

LA as if the matrix has exactly NDIAG nonzero elements p er row

Allo cation and calculation of the arrays QGST QGST SHP and SHP see Section

Lo op over all elements The element stiness matrix S and the element right hand

side P are computed in subroutine ELEMENT then they are accumulated using AKKU

AKKUEL and FAKKUFAKKUEL Note that the rst parameter in ELEMENT is the equation

number If it is equal to then the Poisson equation is taken as the basis and

internally the subroutine ELS is called see In the case of linear elasticity the

parameter is and the routine ELAST is used see In ELEMENT there are also the

element types tetrahedron distinguished via the auxiliary routine IHPT

Treatment of Dirichlet data Dirichlet data are stored in an array DIRF see Section

Dep ending on the indicator in DIRFNEND the values are copied to the appro

priate p osition in the vector X and the corresp onding diagonal element of A is set to

D The realization of the CGalgorithm ensures the correct handling of these

equations

Treatment of inhomogeneous Neumann data They are stored in the array NEUMF

see Section In a lo op over all Neumann faces the integrals for the element right

hand side are calculated via subroutines NEUMANNERSOB and then accumulated via

FAKKUFAKKUEL Note that some faces may have Dirichlet and Neumann data for

dierent degrees of freedom Then they are included b oth in DIRF and NEUMF

In a last step the matrix A is compressed by deleting all zero entries subroutine

PACKKLZ Moreover the elements in each row are now sorted by its index of the

column subroutine SORTKZU

Release of Programmers Manual

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

The element routine ELS in the case of the Poisson equation

We are going now to describ e the element routine ELS which calculates the element stiness

NEND NEND

matrix S s and the right hand side P p in the case of the Poisson

k k

k k

equation With the notation from there holds

Z Z

T T T T

r r dx J r J r j det J j dx s

k k k

NQP

X

T

T T

J y r y J y r y j det J y j

j k j j j j j

j

Z Z

p f x x dx f x x j det J x j dx

k k k

NQP

X

f y y j det J y j

j k j j j

j

Thus after initialization of S and P with zeros a lo op over all quadrature p oints y is

j

p erformed where the following steps are carried out

J and j det J j with Calculation of J J y J

j mn

mn

NEND NEND

m

X X

x

i

m m

x x x J

mn

i i

n n n

x x x

i i

y

i j

are contained in the array SHP see The values of

(n)

x

Note that in the case of a linear tetrahedron the matrix J is indep endent of x and this

step can b e executed outside the lo op More than one integration p oint can b e useful

for linear tetrahedra in the case of nonconstant right hand side f

NEND

NEND

T

Preparation of an array D d J r

mi k

mi

k

T

S S DD j det J j QGST j

j j

Transformation of the quadrature p oint

NEND

X

x y y

i i j j

i

p p f y y j det J j

k k j k j j

Note that f y f y but only f is given as a function

j j

The element routine ELAST for the Lame system

The element routine ELAST for linear elasticity is programmed with similar steps as in



NEND NEND

Denote by S S the element stiness matrix with S IR and by P P

k k k

k k

Release of Programmers Manual

ASSEMBLY OF THE STIFFNESS MATRIX AND THE RIGHT HAND SIDE

P IR the element right hand side Then one nds that

k

Z

T T

S T T trT I T r r dx

k k k k k

k

Z

f x x dx f x IR P

k k

which can b e calculated similary to Thus the steps in the lo op are

Calculation of J J and j det J j

NEND

T NEND

Calculation of D J r IR

k

k

For k NEND NEND do

T

T D D where D r y IR is the kth column of D

k k k k j

T

S S T T trT I j det J j

k k k k j

k

NEND

P

x y Calculation of y

i i j j

i

For k NEND do

P P f y y j det J j

k k j k j j

where f y IR is calculated in the function routine

j

The surface integrals for inhomogeneous Neumann b ound

ary conditions

NDF

NEND

P IR for The subroutine ERSOB calculates the element right hand side P P

k k

k

inhomogeneous Neumann b oundary conditions that means

Z Z

p

p g x x dx gx x E x C x F x dx

k k k

G

G

NQP

q

X

E y C y F y gy y

j j j j j k j

i

where

NEND

m

X X X

x x

i

m

x E x

i

x x

m m

i

NEND

m

X X X

x x

i

m

C x x

i

x x

m m

i

NEND NEND

m m

X X X X

x x x x

i i

m m

F x x x

i i

x x x x

m m

i i

Thus a lo op over all quadrature p oints y is p erformed with the following steps

j

Release of Programmers Manual

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

NEND

X

i

m

Calculation of J J y J with J x y

j mn mn j

mn i

n

x

i

P P P

Determining E J C J and F J J

m m

m m

m m m

NEND

P

Calculation of y x y

j i i j

i

p

NDF

E y C y F y where g y IR NDF is the number P P g y y

j j j j j k k j j

of the degrees of freedom

The coarse grid matrix

For the coarse grid solver there is a coarse grid matrix necessary Theoretically this

matrix shall b e obtained by accumulating

I I

NDF NDF

M

I I

NDF NDF

NDFNDF

over all edges of the coarse grid I IR is the identity matrix Moreover this

NDF

matrix is used only by pro cessor

Thus we send all DKettes for Kette see to pro cessor which assembles then the

coarse grid matrix by going through this list of Kettes This pro cedure shall b e improved

in the future b ecause all Kettes which b elong to several pro cessors are treated more than

once Note that the coarse grid matrix is stored in its prole structure

Calculation and estimation of error norms

The computed values

Error norms are calculated and or estimated element by element using quadrature rules

as in the assembly of the stiness matrix right hand side Thats why these routines are

included in the same library libAa

We will understand under error calculation the elementwise computation of the integrals

Z

NQP NEN

X X

u r x jru u j dx rux ku u H k

i i j h j j h

j i

Z

NQP NEN

X X

ku u L k u u dx ux u x

h h j i i j j

j i

where we assume that u and ru are programmed functions

For error estimation a residual type error estimator is under development

Release of Programmers Manual

TREE STRUCTURES

Exact error norms

The exact error norms are computed similarly to using the subroutine FEHLER The

main steps are the allo cation and calculation of the arrays QGST and SHP see Section

and a lo op over the elements In this lo op the lo cal error contributions are computed

using the subroutines ELNORM and ELN and added to a global norm In ELNORM the dierent

element types are distinguished and ELN do es the actual computations

By analogy to the lo cal terms are calculated by a lo op over the quadrature

p oints doing

Calculation of J J and j det J j

T NEND NEND

Calculation of D J r d d r x

k k k k j

k k

NEND

P

Calculation of y x y

j i i j

i

NEND

P

HLOC HLOC rux u x d j det J j

j h i i j

i

NEND

P

j det J j u x x LLOC LLOC ux

j h i k j j

i

Tree structures

ASSLOES

! KLZ INIT

! ASSEM

! VDCOPY

! MAKEKZU

! ELEHF

! EINTG

! ESHAP

! PHIL PHIQ PL PQ

! EINTG

! ESHAP

! PHIL PHIQ PL PQ PTL PTQ

! ELEMENT

! IHPT

! ELS ELAST

! VDCOPY

! JACOBIAN

! F

! AKKUS AKKUEL

! AKKUIJ

! FAKKU FAKKUEL

! AKKUIJ

1

in libKLZa see Section

2

in libvbasmoda see Section

3

to b e supplied in bspf

Release of Programmers Manual

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

! NEUMANN

FACE GET NEUM ! GET

! ERSOB

! G

! PACKKLZ SORTKZU

! KETTDAKK VOR KETTDAKK VOR KETTDAKK VOR

! COARSMAT

UP ! TREE

! ASSCOARS

! HDIAGKZU

! AKKUS AKKUEL

! PACKKLZ SORTKZU

! CVBKLZ

! VDCOPY

! STARTWRD STAVE

! HBBPX HSTH

! PPCGM

TIMES TIME GRAF ! GET

FNTAB

! FEHLER

! ELEHF

! EINTG

! ESHAP

! PHIL PHIQ PL PQ PTL PTQ

! ELNORM

! IHPT

! ELN

! JACOBIAN

! U UX UY UZ

! CUBE DOD

Short description of the subroutines

AKKUEL akkuf accumulates the element stiness matrix to the global matrix

elasticity

AKKUS akkuf accumulates the element stiness matrix to the global matrix

Poisson

ASSCOARS coarsef assembles the coarse grid matrix

ASSEM assemf assembles the equation system

ASSLOES asslo esf frame for the assembly and solving the equation system

COARSMAT coarsef frame for ASSCOARS

EINTG up eef determines integration p oints and weights D

4

in libDDCMcoma see Section

5

in libSa see Section

6

in libCub ecoma see Section

Release of Programmers Manual

SHORT DESCRIPTION OF THE SUBROUTINES

ESHAP up eef determines the shap e functionsderivatives in the integration

p oints D

EINTG up eef determines integration p oints and weights D

ELEHF up eef allo cates memory for the arrays QGST QGST SHP SHP S

and P

ERSOB neumannf calculates surface integrals Neumann b oundary conditions

ESHAP up eef determines the shap e functionsderivatives in the integration

p oints D

ELAST elastf computes the element stiness matrix and the right hand side

elasticity

ELEMENT elementf frame for ELAST ELS

ELN fehlerf computes element errors

ELNORM fehlerf frame for ELN

ELS elementf computes element stiness matrix and the right hand side

Poisson

F fehlerf computes the right hand side equation in a given p oint user

supplied

FAKKU akkuf accumulates element right hand side to the global vector

Poisson

FAKKUEL akkuf accumulates element right hand side to the global vector

elasticity

FEHLER fehlerf frame for calculating the error norms

FNTAB fehlerf calls FEHLER and prints the errors

G neumannf computes right hand side b oundary condition in a given

p oint user supplied

GET FACE neumannf extracts the no de numbers and co ordinates out of NEUMF and

COOR

GET NEUM neumannf extracts Neumann data out of NEUMF

IHPT ihptf integer function determines whether an element is a hexahe

dron a p entahedron or a tetrahedron

IVD ihptf integer function determines whether an face is a quadrilateral

or a triangle

JACOBIAN elementf determines the Jacobian functional matrix J its inverse J

and its determinant for one integration p oint in an element

NEUMANN neumannf frame for ERSOB

PL up eef computes the values of all shap e functionsderivatives in a

p oint linear triangle

PQ up eef computes the values of all shap e functionsderivatives in a

p oint quadratic triangle

PL up eef computes the values of all shap e functionsderivatives in a

p oint linear p entahedron

PQ up eef computes the values of all shap e functionsderivatives in a

p oint quadratic p entahedron

PHIL up eef computes the values of all shap e functionsderivatives in a

p oint linear quadrilateral

PHIQ up eef computes the values of all shap e functionsderivatives in a

p oint quadratic quadrilateral

Release of Programmers Manual

CHAPTER ASSEMBLY OF THE EQUATION SYSTEM AND

PHIL up eef computes the values of all shap e functionsderivatives in a

p oint linear hexahedron

PHIQ up eef computes the values of all shap e functionsderivatives in a

p oint quadratic hexahedron

PTL up eef computes the values of all shap e functionsderivatives in a

p oint linear tetrahedron

PTQ up eef computes the values of all shap e functionsderivatives in a

p oint quadratic tetrahedron

U bspf computes the solution u in a p oint user supplied

u

UX bspf computes the derivative in a p oint user supplied

x

u

UY bspf computes the derivative in a p oint user supplied

y

u

in a p oint user supplied UZ bspf computes the derivative

z

Release of Programmers Manual

Chapter

Parallel Preconditioned Conjugate

Gradient Metho d PPCG

The Parallel CGMetho d

The PPCG metho d is realized in the subroutine PPCGM The parallelization of the PCG

metho d is based on the nonoverlapping domain decomp osition in connection with the fol

lowing two types of data storage

type I the solution vector u is represented lo cally on each pro cessor i by the vector

A u u

i

i

type I I the right hand side vector f is represented lo cally on each pro cessor i by f

i

with

p

X

T

f A f

i

i

i

where p is the number of pro cessors

A is the sup er element connectivity matrix of the sub domain lo cated on the ith pro ces

i i

sor with the dimension N N N number of unknowns of the global problem N number

i i

N

of unknowns on the sub domain which maps a global vector g IR on a lo cal vector

i

N

i

These data types arise from the assembly of the equation system The lo cal g IR

i

assembly of the right hand side and the stiness matrix leads to the additivity of the right

hand side type I I while the solution type I is assumed to contain the true values on each

sub domain Starting from this domain decomp osition we easily get the parallelization of

the CG metho d which is describ ed in the literature for example in

Currently there are three types of preconditioners used in the PCG algorithm the

Jacobi the Yserentant and the BPXpreconditioner which will b e describ ed in Sec

tions

Let us consider some technical details The solution vector is initialized by the subroutine

STAVE At rst the subroutine assigns the value zero to all initial vector entries Then the

p oints of the Dirichlet b oundary conditions of this vector get their correct b oundary values

In order to guarantee a vector of type I this requires some data exchange

The subroutine STARTWRD serves as an interactive input routine for the control param

eters of the CG algorithm The user can choose the options given in Table compare

also Section

Release of Programmers Manual page

CHAPTER PARALLEL PRECONDITIONED CG METHOD

Option Description

v variant of preconditioning v Jacobi

v Yserentant without coarse grid solver

v Yserentant with coarse grid solver

v BPX without coarse grid solver

v BPX with coarse grid solver

i iter maximal number of iterations

e epsilon termination criterion for the relative error norm in the CG

algorithm

d delta scaling factor for the coarse grid matrix

z control of the amount of screen output see ion in Table

Table Control parameters for the solver

Some sp ecic initializations for the CG metho d and the preconditioners are realized in

the subroutine PREVOR First the subroutine D OUT KLZ see extracts the main diagonal

D of the stiness matrix lo cally on each sub domain If the coarse grid solver is used in

the preconditioner Section and the crossp oint values of the main diagonals of

each pro cessors stiness matrix are sent to pro cessor which mo dies the global coarse

grid stiness matrix with resp ect to the Dirichlet b oundary conditions and computes the

Cholesky factorization of this matrix

At the end the subroutine PREVOR makes some sp ecial initializations dep ending on the

kind of the chosen preconditioner In particular the inverse entries of D are stored b ecause

only D is used subsequently Here the information on Dirichlet b oundary conditions is

introduced by setting the inverse of D to zero see Step

After nishing the subroutine PREVOR the PCG iteration starts

The Jacobi preconditioner

The Jacobi preconditioner is the simplest preconditioner of all It only consists of a multi

plication of the residual vector r with the inverse D of the main diagonal of the stiness

matrix This preconditioning is realized by the subroutine JACOBI

After this vector multiplication the subroutine transfers the resulting vector w D r

from data type I I to data type I using the subroutine FEMAKK see This necessity follows

from the data type structure of the PCG metho d Therefore the communication cost of

the Jacobi preconditioned CG is the same as that of a unpreconditioned CG and only N

i

essential arithmetical op erations p er step are needed on pro cessor i

The condition number of C K D K equals O h where K is the stiness matrix

of our global problem and h is the discretization parameter but the p erformance is b etter

than without preconditioning b ecause the sums of the elements in the rows of the matrix

are now nearly equilibrated

Release of Programmers Manual

THE YSERENTANT PRECONDITIONER

The Yserentant preconditioner

The Yserentant preconditioner is based on a hierarchy of the nite element meshes It

can b e written in the following form

T

C SS

Here S is the basis transformation matrix which transfers the usual no dal basis to the

q

hhierarchical basis For the q th level we can write S S S S with

q

q

if i j i j N

q

i

if j i and j i where P is the middle p oint

k

i i

1 2

b etween P and P which are the end p oints of an S

k

ij

edge of a tetrahedron from the mesh T

k

else

If we use the coarse grid solver we get the following form

T

C SA S with

T

LL on the coarse grid

A

I else

T

LL is the Cholesky decomp osition of the matrix C and C is the nite element assembly

of over all pairs of crossp oints having a common edge in the coarse grid has

b een found empirically a go o d value is

If we have strong oscillating co ecients in the dierential equation a Jacobi mo dication

of the form

1 1

T

2 2

C SD D A S

is helpful D is the diagonal matrix extracted from the stiness matrix whose elements are

scaled with the mesh size h of the level i of the p oint it b elongs to

i

While the communication cost of the Yserentant preconditioner is nearly as low as with

out it the condition number C K is equal to O h in the threedimensional case This

is an improvement in comparison to the Jacobi preconditioner but it still cannot satisfy

The Yserentant preconditioning is realized by the subroutine YSERENT The transfor

T

mation with the matrices S and S is carried out in the subroutines HiSmulYser and

HSTmulYser resp ectively

Routine Description

HiSmulYserNfgNkXHielis X SX

T

HSTmulYserNfgNkXHielis X S X

Here Nfg denotes the number of degrees of freedom on each no de Nk is the number of

no des on the sub domain k X is the vector of the length N Nk Nfg and Hielis is the

hierarchical list on the sub domain which is generated by the mesh renement pro cedures

see Chapter Hielis is a twodimensional array of the following form

array Description

HielisNk Hielis no de number

Hielis left father

Hielis right father

Hielis co ecient

Release of Programmers Manual

CHAPTER PARALLEL PRECONDITIONED CG METHOD

The last co ecient denes the basis transformation matrix S In our denition and

in the most cases it is

The routine YSERENT copies rst the residual vector to a working vector w setting the

T

Dirichlet values to zero Then the multiplication w S w is carried out Now the resulting

1

2

therefore in the subroutine PREVOR the square ro ot of D vector w is multiplied with D

is computed

In the next step we have to transform w from data type I I to type I Here communication

is necessary which b ecomes somewhat complicated if we include a coarse grid solver Because

our coarse grid solver is based on a Cholesky factorization computed by pro cessor in a

rst communication step all pro cessors have to send their crossp oint values to pro cessor

While this pro cessor computes the coarse grid solution the other pro cessors start the

communication with resp ect to their edges and faces In the last communication step all

pro cessors receive their parts of the coarse grid solution Nevertheless at the end the amount

of communication is only slightly higher than that without any coarse grid solution

1

2

In coincidence with equation we compute w D w once again and after this

w S w We set the values at the Dirichlet p oints in the resulting vector to zero and nish

the Yserentant preconditioning step

The BPX preconditioner

The BPX preconditioner is also a hierarchical preconditioner It can b e written in the

following form

T

C S S

Here S is a transformation matrix which transforms the normal no dal basis of the space V

q

E

V V V with the into the generating system of the Cartesian pro duct space V

q

q

no dal basis spaces V V V

i i i

j q q q q q q

I I I I j I j j I j I For the q th level we can write S S I

q q

q q q

j j

with

if i j i j N

k

i

if j i and j i where P is the middle p oint

k

i i

1 2

b etween P and P which are the end p oints of an I

k

ij

edge of a tetrahedron from the mesh T

k

else

If we include a coarse grid solver we get the following for

T

C S A S with

T

LL on the grid of V

A

I else

T

For LL see Section

In the case of strong oscillating co ecients in the dierential equation a Jacobi mo di

cation is helpful This mo dication has the form

1 1

T

2 2

C S D A D S

E

where D is the extracted main diagonal of the stiness matrix corresp onding to V Its

q

elements are scaled with the mesh size h of the zone i of the p oint it b elongs to

i

Release of Programmers Manual

THE BPX PRECONDITIONER

III III

Lev Level Level Level

n n

hierarchical list

III IIIII II IIII

Lev Level Level Level Lev Level Level Lev Level Lev

n n n

z z z z

zone n zone n zone zone

BPX list

Figure List extension for the BPX preconditioner

E

the amount Due to the fact that we must communicate in the space corresp onding to V

q

of communication data of the BPX preconditioner is higher than that of the preconditioners

mentioned b efore But at the other hand the condition number of C K is O for the

BPX preconditioner

Before we can use the subroutine BPXLOES for the BPX preconditioning some additional

initialization steps are necessary At rst we have to extend the hierarchical list in the way

shown by Figure In the new BPX list zone corresp onds with V zone with V

zone n with V q n The length of the BPX list is denoted by nbpx Caused by the

q

communication over all zones we also have to extend the arrays KETTED and KETTED with

E

i q These arrays contain some parameters for the communication resp ect to V

i

related to the edges and faces resp ectively All this is done by the subroutine HBBPX

HBBPXNLevnkettKettnLevzonehelp

input output

N Nk nbpx

Lev hierarchical list bpx list

nkett length of the Kett list new length of the extended Kett list

Kett Kett list extended Kett list

nLev number of levels

zone array of p ointers to zones

Keep in mind that b efore using HBBPX there must b e provided enough memory for

the extended BPX and Kett list The same applies to the auxiliary vector w in the BPX

preconditioner and the vector of the main diagonal of the stiness matrix b etter its square

1

E

2

ro ot D which also have to b e extended according to V

q

T

The application of the transformation matrices S and S is done by the subroutines

HiSmulBPX and HSTmulBPX

Routine Describtion

HiSmulBPXNfgNkXBPXlis X SX

T

HSTmulBPXNfgNkXBPXlis X S X

Release of Programmers Manual

CHAPTER PARALLEL PRECONDITIONED CG METHOD

Like the subroutine YSERENT the subroutine BPXLOES copies rst the residual vector to

T

the working vector w setting the Dirichlet values to zero Then the multiplication w S w

1

2

D is takes place As the result we get the extended vector w which is multiplied with D

the extended main diagonal of the stiness matrix

Now we have to transfer w from data type I I to type I This communication concerns all

zones We start with the crossp oint communication where we communicate from the highest

down to the lowest zone If a coarse grid solver is included then after arriving at zone

all pro cessors send their crossp oint values of zone to pro cessor While this pro cessor

is computing the coarse grid solution the other pro cessors start communication over their

edges and faces from zone up to the highest zone In the last communication step all

pro cessors receive their part of the coarse grid solution from pro cessor

1

2

Finally we compute w D w once more and w S w reduces our vector w to the

length Nk After inserting the Dirichlet b oundary conditions the BPX preconditioning step

ends

Tree structure of the routines

In the case of a BPX preconditioning the initialization subroutine HBBPX is called in the

subroutine ASSLOES see Section The PCG metho d is realized by the subroutine PPCGM

! VDMULT PPCGM

! HISMULYSER ! PREVOR

! VDMUL OUT KLZ ! D

! BPX ! TREEUP DOD

! VDMUL ! OXCOPYVBZ

! HSTMULBPX ! CHOVBZ

! VDMULT AKK ! FEM

! TREE DOWN DOD ! TREE

! HISCALED ! TREEUP DOD

! HSTCOP ! RUEVBZ

! AXMKLZ ! VORVBZ

! VDMINUS ! KETTAKK

! JACOBI ! TREE DOWN

! VDMULT ! VDMULT

! FEMAKK ! HISMULBPX

! YSERENT ! VDMUL

! VDMUL ! DSCAPR

! HSTMULYSER ! TREE DOD

! VDMULT ! VDAXPY

! TREEUP ! AXMKLZ DOD

! DSCAPR ! RUEVBZ

DOD ! TREE ! VORVBZ

! VDAXPY ! FEMAKK

! ZWISCH DOWN ! TREE

1

in libKLZa see Section

2

in libCub ecoma see Section

3

in libMbasmo da see Section

4

in libDDCMcoma see Section

5

in libvbasmoda see Section

Release of Programmers Manual

DESCRIPTION OF THE ROUTINES

Description of the routines

The following FORTRAN sources are lo cated in the sub directory solve

BPX bpxf BPX preconditioning

HBBPX hbbpxf extends the hierarchical and the KETT lists with resp ect to the

BPX data structure

HISCALED hiemulf scaling of the main diagonal elements with the mesh size of the

corresp onding zone

HISMULBPX hiemulf multiplication with the transformation matrix S

HISMULYSER hiemulf multiplication with the transformation matrix S

HSTCOP hiemulf extends the main diagonal with resp ect to the BPX data

structure

T

HSTMULBPX hiemulf multiplication with the transformation matrix S

T

HSTMULYSER hiemulf multiplication with the transformation matrix S

JACOBI jacobif Jacobi preconditioning

PPCGM pp cgmf parallel preconditioned conjugate gradient metho d

PREVOR prevorf initializations dep ending on the kind of the chosen preconditioner

YSERENT yserentf Yserentant preconditioning

ZWISCH zwischf displays the values of the CG parameters

Release of Programmers Manual

CHAPTER PARALLEL PRECONDITIONED CG METHOD

Release of Programmers Manual

Chapter

General libraries

libKLZa for matrix op erations with compactly

stored matrices

Discretization metho ds as the nite element metho d lead to systems of equations with sparse

matrices For an economic use of the memory of the computer and for a decrease of the

number of arithmetic op erations it is necessary to use sp ecial storage techniques to exploit

the sparsity of the matrix We use the most economic way namely to store only the nonzero

elements of the matrix

This metho d has b een implemented and used in dierent applications for many years at

the Technische UniversitatChemnitzZwickau and the most imp ortant routines are available

in the library libKLZa A description of the storage and of the routines in libKLZa is given

in

libCub ecoma with basic communication routines

MIMD parallel computers consists of a number of pro cessors with lo cal memory which

exchange data via interprocessor communication Unfortunately the system routines for

the communication are dep endent on the hardware and the op erating system Thus it is very

comfortable from the p oint of view of the programmer to use standardized communication

routines which are indep endent of the system

The library libCub ecoma consists of a number of routines for the typical communications

within a MIMD parallel computer The consequent use of these routines leads to p ortability

of parallel programs Only some basic routines have to b e adapted This p ortability has b een

tested for transputer nCub e KSR TCGMSG Paramid i Linux and workstation

cluster of dierent pro ducers A detailed description of the library is given by the developers

in

libvbasmoda with basic vector op erations

The library libvbasmoda contains routines for vector op erations The use of these routines

instead of DOlo ops has three advantages

The program b ecomes clearer and b etter readable

Release of Programmers Manual page

CHAPTER GENERAL LIBRARIES

The realization of the op eration is optimized unrolled lo ops usage of BLASrouti

nes

Optimizations can b e done once but eventually also system dep endent

A description of the routines and examples for useful applications are given in

libMbasmo da for matrix routines

This library provides a couple of matrixoriented subroutines for lo cal op erations on each

pro cessor or on single pro cessor machines

There are matrixvector op erations for a sp ecial kind of matrix storing schemes called

VBZ storing a prole with rowwire variable bandwidth a set of subroutines for Fast

Fourier Transform and some routines for Schur complement preconditioners see A

more detailed description should b e published later

libDDCMcoma for DD and coarse matrix routines

omain Decomp osition Coarse Mesh COMmunication library do es what its name says The D

The kernel of this library is a set of subroutines for FEM accumulation via an eective

communication across the coupling edges and coupling faces in the D case

The libraries libGrafa and libNoGrafa

The library libGrafa contains routines for simple graphical output which are describ ed

in Based on them some sp ecic routines are included for graphical output of two

and threedimensional nite element applications including meshes and isolines

The only subroutine which is directly related to SPCPM Po D is GEBGRAPE which is

the interface to the package GRAPE The call of this routine the idea of the interface

and an example are describ ed in Note that the user has to supply a subroutine

GETDOFS for a correct lab eling of the buttons which are related to the degrees of freedom

For SPCPM Po D this is realized in NetzAllgemeingetdofsf

Occasionally the user might wish to gain memory by linking without graphics For this

case the library libNoGrafa may replace the library libGrafa in the link step It supplies

a set of dummy routines which are called instead of the graphic routines Set the variable

GRAF in Makedirmakelearchi to Graf or NoGraf to link the desired library see

Section

libToolsa for auxiliary routines

The library libToolsa contains some auxiliary routines of general interest most of them for

the programuserdialog and for le and string manipulations see Section Examples

to get a impression of the routines

Release of Programmers Manual

LIBTOOLSA FOR AUXILIARY ROUTINES

BEEPn Output of an acoustic signal the integer parameter n determines

its length

ENTER Pro cessor is waiting for an ENTER and the program continues

after a tree down

FLUSHlog unit Forces the output of the buer of log unit This is essential alter an

input request when the program is started via rsh remote shell

LEN TRIMstring Returns the length of string

UPCASEstring Change of all lower case letters in string to upp er case

YESstring Output of string and input request JN The return value is

TRUE when J j Y or y is entered and FALSE in all

other cases

Release of Programmers Manual

CHAPTER GENERAL LIBRARIES

Release of Programmers Manual

Bibliography

J Altenbach and A S Sacharov Die Methode der niten Elemente in der Festkorper

mechanik Fachbuchverlag

Th Ap el Programmbeschreibung zum DFiniteElementeProgrammsystem

FEMPSD Technical rep ort TU Chemnitz Unpublished

Th Ap el SPCPM Po D Users Manual Preprint SPC TU Chemnitz

Zwickau

Th Ap el G Haase A Meyer and M Pester Parallel solution of nite element

equation systems ecient interprocessor communication Preprint SPC TU

ChemnitzZwickau

Th Ap el and F Milde Realization and comparison of various mesh renement strate

TU ChemnitzZwickau gies near edges Preprint SPC

J Bey Der BPXVorkonditionierer in Dimensionen GitterVerfeinerung Paral

lelisierung und Simulation Preprint UniversitatHeidelb erg IWR

J H Bramble J E Pasciak and A H Schatz The construction of preconditioners

for elliptic problems by substructuring IIV Math Comp

J H Bramble J E Pasciak and J Xu Parallel multilevel preconditioners Math

Comp

G Haase Th Hommel A Meyer and M Pester Bibliotheken zur Entwicklung paral

leler Algorithmen Preprint SPC TU ChemnitzZwickau Up dated version

and SPC of SPC

G Haase U Langer and A Meyer The approximate Dirichlet domain decomp osi

tion metho d Part I An algebraic approach Part I I Applications to ndorder elliptic

b oundary value problems Computing Part I Part I I

G Haase U Langer and A Meyer Parallelisierung und Vorkonditionierung des

CGVerfahrens durch Gebietszerlegung In G Bader R Rannacher and G Wittum

editors Numerische Algorithmen auf TransputerSystemen pages Stuttgart

Teubner Pro ceedings of the GAMMSeminar Heidelb erg

G Kunert Ein Residuenfehlerschatzerf uranisotrop e Tetraedernetze und Dreiecksnetze

in der FiniteElementeMethode Preprint SPC TU ChemnitzZwickau

Release of Programmers Manual page

CHAPTER GENERAL LIBRARIES

A Meyer and M Pester Verarbeitung von SparseMatrizen in Kompaktsp eicherform

TU ChemnitzZwickau KLZKZU Preprint SPC

M Meyer GrakAusgab e vom Parallelrechner f ur DGebiete Preprint SPC

TU ChemnitzZwickau

I P Mysovskikh Interpolating cubature formulae Nauka Moskow Russian

M Pester GrakAusgab e vom Parallelrechner f ur DGebiete Preprint SPC

TU ChemnitzZwickau

H D Simon Partitioning of unstructured problems for parallel pro cessing Computing

Systems in Engineering

A Wierse and M Rumpf GRAPE Eine ob jektorientierte Visualisierungs und Nu

merikplattform Informatik Forsch Entw

H Yserentant Ub er die Aufspaltung von FiniteElementeRaumen in Teilraumever

schiedener Verfeinerungsstufen Habilitationsschrift RWTH Aachen

O C Zienkiewicz Methode der niten Elemente Fachbuchverlag Leipzig

Release of Programmers Manual

Index

Index

The italic numbers denote the pages where the corresp onding entry is describ ed numbers

underlined p oint to the denition all others indicate the places where it is used

Symbols DRDIR FKANTE

FNEUM

GRAF DRIFG

D kette FNTAB 37

DRNEUM

Full data structure 4

D kette

DRNODES

Fullname

DVDIR

A

FVOL

DVNEUM

ASSEM 36

FZEIG

assembly 31

E

G

ASSLOES 36

EINTG 36

G 37

AUSGABE 18

ESHAP 37

GEBGRAPE

EINTG 37

B

GETDOFS 22

ELEHF

BPX 45

GRAPE

ERSOB 37

BPXLOES

GROBNETZ 24

ESHAP 37

brick

Edges 5

H

brick elements

ELAST 32 37

HBBPX 45

brick mesh

ELEMENT 37

Hielis

element types

hierarchical list

C

ELNORM 37

HISCALED 45

CHAIN 7

ELS 32 37

HiSmulBPX

coarse grid

Epsilon

HiSmulYser

coarse grid matrix 34

equation system

HSTmulBPX

coarse grid solver

error norm 34

HSTmulYser

coarse grid stiness matrix

H seminorm

coarse mesh

I

L norm

COARSMAT 36

ICH

discrete maximum

PROB COM

ICHRING

norm

com probinc

IFG

exact solution

communication

IGLOB 5

COOR

IHPT 37

F

coupling edge

Iter

F 37

coupling face

itri

FACE

IVD 37

Faces 4

D

FCHIELD

D OUT KLZ

J

FDIR

data storage

JACOBI 45

FDS 4

type I

JACOBIAN 37

FEHLER 37

type I I

JFREI

FEIN 24

DIMFACE

FEINCALL 24

DIMKANTE

K

FEM AKK

DIMVOL

KDDIM

FEMAKK

DIR

KDDIM

femakkvar

DIRF

KANTE

FFACE

Dirichlet b oundary condi

KCHIELD

File

tions

VOR KETTDAKK

Dirichlet data 6 FILENAME

KETTDAKK VOR

VOR KETTDAKK DRDAT lenameinc

Release of Programmers Manual

Index

NENXD Yserentant 41 KETTAKK

PREVOR 45 kette

netddatinc 9

PROBLEM Kette data 6

NETDIM

KETTED

PWEGID

NETMAKE 13 20

KETTED

NEUM

Q

KUERZEN 22

NEUMANN 37

QGST

KZEIG

Neumann b oundary condi

QGST 29

tions

L

QGST 29

Neumann data 6

LAENGE

Quadrature formulas

NEUMF

Lameproblem

quadrature p oints

NFACE

Lamesystem

Nfg

R

Lback

NI

RB

LC

Nintass

RDS 7

Length

Ninterror

Recursive Sp ectral Bisec

Lforw

Nintass

tion

libCub ecoma 47

Ninterror

Reduced data structure 7

libDDCMcoma 48

Nk

Reducing data 18

libGrafa

NKANTE

REGION

libKLZa 47

NKettSum

right hand side

libMbasmo da 48

NKF

RSB

libNAa

NKK

RSB

libNoGrafa 48

Nlevl

libToolsa

NNEUM

S

libvbasmoda 47

SET RBCOM NNEUMREAL

lin quad

NODENR SETSTANDARD 23

linear elasticity

no des shap e functions

LinkLevel

linear

NPROC

loesvar

quadratic

NQP

Lunit

SHP 29

NQP

SHP 29

NRDAT

M

solution of the equation sys

NRNODES

MAKEKZU

tem

NUMEL

memory management

SORTKZU

mesh renement 14 numerical integration

standard

NUMNP

N

standardinc

N

O

STARTWRD

NanzK

OXCOPYVBZ

STAUCHEN 23

NanzKD

STAVE

P

NanzKD

STEUER

PACKKLZ

NC

stiness matrix

PKDAT

NCrossG

STROEM

PKLENG

NCrossL

STWERTE 24

PKZEIG

NCUBE

SUB

Poisson equation

NDF

surface integral 33

Poisson problem

NDIAG

PPCGM 45

T

NDiag

preconditioner

TETF 8

NDIR

BPX 42

NDIRREAL tetrahedral elements

BPX list tetrahedral mesh

NEND

NEND Jacobi 40 tetrahedron

Release of Programmers Manual

Index

VOL UZ 38 TKUERZ 23

TrNet

Volumes FDS 4

V

trnetinc

Volumes RDS 7

Verf

TrRing

VERSION 24

VERTEILEN 24 U

Y

VERTEILEN 24 U 38

YSERENT 45

vertvar UX 38

VFS YSFAKTOR 24 UY 38

Release of Programmers Manual