High Performance Computing and Communications Glossary Release 1.3a

K.A.Hawick [email protected]
Northeast Parallel Architectures Center, Syracuse University
24 July 1994

This is a glossary of terms on High Performance Computing and Communications (HPCC) and is essentially a roadmap or information integration system for HPCC technology. The master copy of this document is maintained online as HTML hypertext, where cross references are hypertext links to other entries in the document. Some entries, such as that for the ScaLAPACK software, point to the software source itself, available on the internet from sites like the National Software Exchange. In this "paper" version, entry headings are in bold, and cross reference entries are in italics.

The maintenance of this glossary is of necessity an ongoing process in an active field like HPCC, and so entries are being updated and augmented periodically. Check the version numbers. The master hypertext copy is stored at the Northeast Parallel Architectures Centre (NPAC) at URL http://www.npac.syr.edu/nse/hpccgloss and NPAC's main URL is http://www.npac.syr.edu. An additional copy is also available in the Information Databases section of the National Software Exchange, which can be accessed from the URL http://www.netlib.org/nse/home.html.

Several people have contributed to the entries in this glossary, especially Jack Dongarra and Geoffrey Fox and many of my colleagues at Edinburgh and Syracuse. Any further contributed entries, suggested revisions, or general comments will be gladly received at the email address above.

This paper document is also available by anonymous ftp (ftp.npac.syr.edu) as Technical Report SCCS-629, or by request from NPAC at 111 College Place, Syracuse NY 13244-4100, USA.

A

acyclic graph (n.) a graph without any cycles.

adaptive (adj.) Taking available information into account.
For example, an adaptive mesh-generating algorithm generates a finer resolution mesh near discontinuities such as boundaries and corners. An adaptive routing algorithm may send identical messages in different directions at different times, depending on the local density information it has on message traffic. See also oblivious.

address generation (n.) a cycle during execution of an instruction in which an effective address is calculated by means of indexing or indirect addressing.

address space (n.) A region of a computer's total memory within which addresses are contiguous and may refer to one another directly. A shared memory computer has only one user-visible address space; a disjoint memory computer can have several.

adjacency matrix (n.) a Boolean matrix indicating for each pair of vertices I and J whether there is an edge from I to J.

all-pairs shortest-path problem (n.) given a weighted graph, find the shortest path between every pair of vertices.

alpha-beta pruning (n.) an algorithm used to determine the minimax value of a game tree. The alpha-beta algorithm avoids searching subtrees whose evaluation cannot influence the outcome of the search.

Amdahl's Law (n.) A rule first formalised by Gene Amdahl in 1967, which states that if F is the fraction of a calculation that is serial and 1-F the fraction that can be parallelized, then the speedup that can be achieved using P processors is 1/(F + (1-F)/P), which has a limiting value of 1/F for an infinite number of processors. Thus no matter how many processors are employed, if a calculation has a 10% serial component the maximum speedup obtainable is 10.

AND tree (n.) a search tree whose nonterminal nodes are all AND nodes.

AND-parallelism (n.) A form of parallelism exploited by some implementations of parallel logic programming languages, in which the terms in conjunctions are evaluated simultaneously, and parent computations are allowed to proceed only once their immediate children have completed.
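The alpha-beta pruning entry above can be illustrated with a minimal sketch over a toy game tree. The `alphabeta` function and the list-based tree encoding (a leaf is a number, an internal node is a list of subtrees) are illustrative assumptions, not part of the glossary:

```python
# Minimal alpha-beta pruning over a toy game tree.
# A tree is either a number (leaf value) or a list of subtrees.

def alphabeta(tree, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(tree, list):      # leaf: return its static value
        return tree
    if maximizing:
        value = float("-inf")
        for child in tree:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:           # beta cutoff: this subtree cannot
                break                   # influence the minimax value
        return value
    else:
        value = float("inf")
        for child in tree:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:           # alpha cutoff
                break
        return value
```

For instance, `alphabeta([[3, 5], [6, 9]])` evaluates the two minimizing subtrees to 3 and 6 and returns 6; in a tree such as `[[3, 5], [2, 9]]` the leaf 9 is never examined, since once the second subtree's value drops to 2 it cannot beat the 3 already secured.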
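The Amdahl's Law formula above is easy to evaluate directly; a minimal sketch (the function name is illustrative):

```python
# Amdahl's Law: predicted speedup on P processors when a fraction F
# of the work is inherently serial.
def amdahl_speedup(F, P):
    return 1.0 / (F + (1.0 - F) / P)

# A 10% serial fraction caps the speedup at 1/F = 10, however large P grows:
print(amdahl_speedup(0.10, 10))       # about 5.26 on 10 processors
print(amdahl_speedup(0.10, 10**6))    # approaches, but never reaches, 10
```

This makes the law's pessimistic message concrete: even a million processors cannot push a 10%-serial calculation past a tenfold speedup.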
See also OR-parallelism.

AND/OR tree (n.) a search tree with both AND and OR nonterminal nodes.

antidependence (n.) See dependence.

applicative language (n.) a language that performs all computations by applying functions to values.

architecture (n.) The basic plan along which a computer has been constructed. Popular parallel architectures include processor arrays, bus-based multiprocessors (with caches of various sizes and structures) and disjoint memory multicomputers. See also Flynn's taxonomy.

arity (n.) The number of arguments taken by a function. Arity is also sometimes used in place of valence.

ARPAnet (n.) The US Advanced Research Projects Agency network, which was the first large multi-site computer network.

array constant (n.) an array reference within an iterative loop, all of whose subscripts are invariant.

array processor (n.) Any computer designed primarily to perform data parallel calculations on arrays or matrices. The two principal architectures used are the processor array and the vector processor.

artificial intelligence (n.) A class of problems such as pattern recognition, decision making and learning in which humans are clearly very proficient compared with current computers.

associative address (n.) a method whereby access is made to an item whose key matches an access key, as distinct from an access to an item at a specific address in memory. See associative memory.

associative memory (n.) Memory that can be accessed by content rather than by address; content addressable is often used synonymously. An associative memory permits its user to specify part of a pattern or key and retrieve the values associated with that pattern. The tuple space used to implement the generative communication model is an associative memory.

associative processor (n.) a processor array with associative memory.

asynchronous (adj.) Not guaranteed to enforce coincidence in clock time.
In an asynchronous communication operation the sender and receiver may or may not both be engaged in the operation at the same instant in clock time. See also synchronous.

asynchronous algorithm (n.) See relaxation method.

asynchronous message passing (n.) See message passing.

atomic (adj.) Not interruptible. An atomic operation is one that always appears to have been executed as a unit.

attached vector-processor (n.) A specialised processor for vector computations, designed to be connected to a general purpose host processor. The host processor supplies I/O functions, a file system and other aspects of a computing system environment.

automatic vectorization (n.) A compiler technique that takes code written in a serial language (often Fortran or C) and translates it into vector instructions. The vector instructions may be machine specific, or in a source form such as array extensions or subroutine calls to a vector library. See also compiler optimization.

auxiliary memory (n.) Memory that is usually large in capacity but slow and inexpensive, often a rotating magnetic or optical memory, whose main function is to store large volumes of data and programs not being actively used by the processors.

AVL tree (n.) a binary tree having the property that for any node in the tree, the difference in height between the left and right subtrees of that node is no more than 1.

B

back substitution (n.) See LU decomposition.

banded matrix (n.) a matrix in which the non-zero elements are clustered around the main diagonal. If all the non-zero elements in the matrix are within m columns of the main diagonal, then the bandwidth of the matrix is 2m+1.

bandwidth (n.) A measure of the speed of information transfer, typically used to quantify the communication capability of multicomputer and multiprocessor systems. Bandwidth can express point to point or collective (bus) communications rates. Bandwidths are usually expressed in megabytes per second.
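The banded matrix definition above (all non-zeros within m columns of the diagonal giving bandwidth 2m+1) can be sketched for a dense square matrix; the function name and list-of-lists representation are illustrative assumptions:

```python
# Compute the bandwidth of a dense square matrix, per the banded matrix
# entry: if every non-zero element lies within m columns of the main
# diagonal, the bandwidth is 2m + 1.

def matrix_bandwidth(A):
    m = 0
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            if a != 0:
                m = max(m, abs(i - j))  # distance from the main diagonal
    return 2 * m + 1

# A tridiagonal matrix has m = 1, hence bandwidth 3:
tridiag = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
]
print(matrix_bandwidth(tridiag))  # 3
```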
Another usage of the term bandwidth is literally as the width of a band of elements in a matrix. See banded matrix.

bank conflict (n.) A bank "busy-wait" situation. Memory chip speeds are relatively slow when required to deliver a single word, so supercomputer memories are placed in a large number of independent banks (usually a power of 2). A vector of data laid out contiguously in memory, with one component per successive bank, can be accessed at one word per cycle (despite the intrinsic slowness of the chips) through the use of pipelined delivery of vector-component words at high bandwidth. When the number of banks is a power of two, vectors requiring strides of a power of 2 can run into a bank conflict.

bank cycle time (n.) The time, measured in clock cycles, taken by a memory bank between the honoring of one request to fetch or store a data item and accepting another such request. On most supercomputers this value is either four or eight clock cycles.

barrier (n.) A point in a program at which barrier synchronization occurs. See also fuzzy barrier.

barrier synchronization (n.) An event in which two or more processes belonging to some implicit or explicit group block until all members of the group have blocked. They may then all proceed. No member of the group may pass a barrier until all processes in the group have reached it. See also fuzzy barrier.

basic block (n.) A section of program that does not cross any conditional branches, loop boundaries or other transfers of control.
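The barrier synchronization entry above can be sketched with Python's `threading.Barrier`, used here as a thread-level stand-in for the process group described in the glossary:

```python
# Barrier synchronization: no member of the group passes the barrier
# until every member has reached it.
import threading

N = 4
barrier = threading.Barrier(N)
order = []                   # records each thread's phase transitions
lock = threading.Lock()

def worker(i):
    with lock:
        order.append(("before", i))
    barrier.wait()           # block until all N threads arrive
    with lock:
        order.append(("after", i))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every "before" record precedes every "after" record, regardless of
# how the scheduler interleaved the threads.
assert all(tag == "before" for tag, _ in order[:N])
assert all(tag == "after" for tag, _ in order[N:])
```

The final assertions hold on every run: a thread can only append an "after" record once `barrier.wait()` returns, and that cannot happen until all N threads have already appended their "before" records.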