Computer Science Technical Report

A New Fixed Degree Regular Network for Parallel

Processing

Shahram Lati® Pradip K. Srimani Department of Electrical Engineering Department of Computer Science University of Nevada Colorado State University Las Vegas, NV 89154 Ft. Collins, CO 80523 lati®@ee.unlv.edu [email protected]

Technical Report CS-96-126

Computer Science Department Colorado State University Fort Collins, CO 80523-1873

Phone: (970) 491-5792 Fax: (970) 491-2466

WWW: http://www.cs.colostate.edu



oceedings of Eighth IEEE Symposium on Paral lel

To appear in the Pr

Distributed Processing and , New orleans, October 1996. A New Fixed Degree Regular Network for Parallel Processing 

Shahram Lati® Pradip K. Srimani Department of Electrical Engineering Department of Computer Science University of Nevada Colorado State University Las Vegas, NV 89154 Ft. Collins, CO 80523 lati®@ee.unlv.edu [email protected]

Abstract ameter, low degree, high fault tolerance etc. with the well known hypercubes (which are also Cayley graphs). We propose a family of regular Cayley network graphs All of these Cayley graphs are regular, i.e., each vertex of degree three based on permutation groups for design of has the same degree, but the degree of the vertices increases massively parallel systems. These graphs are shown to be with the size of the graph (the number of vertices) either based on the shuf¯e exchange operations, to have logarith- logarithmically or sub logarithmically. This property can mic diameter in the number of vertices, and to be maximally make the use of these graphs prohibitive for networks with fault tolerant. We investigate different algebraic properties large number of vertices [6]. Fixed vertex-degree networks of these networks (including fault tolerance) and propose are also very important from VLSI implementation point of a simple routing algorithm. These graphs are shown to be view [7]; there are applications where the computing ver- able to ef®ciently simulate other permutation group based tices in the interconnection network can have only a ®xed graphs; thus they seem to be very attractive for VLSI imple- number of I/O ports [6]. As a matter of fact a large multipro- mentation and for applications requiring bounded number cessor (with 8096 vertices) is being built by JPL around the of I/O ports as well as to run existing applications for other binary De Bruijn graphs [8], which are almost regular net- permutation group based architectures. work graphs of ®xed vertex degree 4. There are also other graphs in the literature that have almost constant vertex de- grees like the Moebius graphs [9] of degree 3. But neither 1 Introduction the De Bruijn graphs [8] nor the Moebius graphs [9] are reg- ular (and none are Cayley graphs). Also, these graphs have a low vertex connectivity of only 2 although most of the ver- Performance of any distributed system is signi®cantly de- tices in those graphs have degree larger than 2. Some ®xed termined by the choice of the underlying network topology. degree regular networks have been studied in [10, 11, 12]. One of the most ef®cient interconnection networks has been In this paper, we introduce a for intercon- the well known binary n-cubes or hypercubes; they have necting a large number of processors but with a constant de- been used to design various commercial multiprocessor ma- gree of the vertex. The proposed family of graphs is reg- chines and they have been extensively studied. For the past ular of degree 3 independent of the size of the graph, has several years, there has been a spurt of research on a class a logarithmic diameter in the number of vertices and has a of graphs called Cayley graphs which are very suitable for vertex connectivity of 3, i.e., the graphs are maximally fault designing interconnection networks. These Cayley graphs tolerant (vertex connectivity cannot exceed the vertex de- are based on permutation groups and they include a large gree). The proposed family of graphs offers a better and number of families of graphs like star graphs [1, 2], hy- more attractive alternative to De Bruijn graphs or Moebius percubes [3], pancake graphs [1, 4] and others [5]. These graphs for VLSI implementation in terms of regularity and graphs are symmetric (edges are bidirectional), regular and greater fault tolerance. Another interesting feature of the

seem to share many of the desirable properties like low di- proposed graphs is that it can easily simulate other permu-

 tation group based graphs and hence parallel algorithms de-

oceedings of Eighth IEEE Sym- To appear in the Pr

signed for those topologies can be ef®ciently run on the new

osium on Paral lel and Distributed Processing p , New orleans, October 1996. architecture with very minimal change. 2 Shuf¯e-Exchange Permutation (SEP) Proof : The proof follows from the fact that any basic 2-

Graph cycle of a permutation can always be written in terms of the 2

swap and the n-cycle; for details, see [13]. !

A Cayley graph is a vertex-symmetric graph often with n

SEP n  2

vertices, each assigned a label which is a distinct permuta- Lemma 2 n,, is a Cayley graph.

1; 2;:::;ng

tion of the set f . The adjacency among the ver-

SEP n  2 G =

tices is de®ned based on a set of permutations referred to as Proof : The set of generators for n,,

fg ;g ;g g

L R generators; the neighbors of a vertex are obtained by com- 12 is closed under the inverse operation and the

posing the generators with the label of the vertex. The set generator set can generate all the elements in the vertex set

SEP 2 of generators de®ned for a Cayley graph must be closed un- of n.

der the inverse operation to guarantee that the graph will be

g g

L

;2

undirected. Furthermore, the set of generators together with Remark 3 Note that the two generators 1 and could

S

n!

the Identity permutation must produce all permutations produce all the elements of the group n and so could the

g g

n S

R

;2 two generators 1 and . The third generator is in- on integers (which collectively belong to set n ), or a sub-

S cluded with two objectives: to make the generator set closed group of n when the generators are repeatedly applied to

the already generated vertices. under inversion (a requirement for the graph to be a Cay- SEP

An n-dimensional graph is an undirected regular ley graph) and to make the graph undirected (bidirectional)

= n!

graph with N vertices, each vertex corresponding to since the bandwidth of a link incident on a vertex can be

g g

f1; 2; ;ng R

a distinct permutation of the set . Two vertices shared between the incoming traf®c activated by L (or )

g g L are directly connected if and only if the label (permutation) and the outgoing traf®c activated by R (or ).

of one is obtained from the label of the other by one of the

n n  2 SEP following operations: Theorem 1 For any , , the graph n:(1)isa symmetric (undirected) of degree 3; (2) has

 Swapping the ®rst two digits (the leftmost digit is

! 3n!=2 n vertices; and (3) has edges. ranked as ®rst).

 Shifting cyclically to left (or right) by one digit. Proof : (1) follows from Remark 1. (2) follows from 2

Formally, consider a set of n distinct symbols (we use En- Lemma 1. (3) follows from (1) and (2).

= 4

glish numerals as symbols; for example, when n ,the g

De®nition 1 The edges which correspond to 12 are called

; 2; 3; 4 n

symbols are 1 ). Thus, for distinct symbols, there

E g g R

Exchange or -edges and those corresponding to L (or !

are exactly n different permutation of the symbols or the S

) are referred to as Shuf¯e or S -edges; an -edge may be

SEP n! a a a

1 2 n number of vertices in n is ;using as the symbolic representation of an arbitrary vertex, the edges of referred to either as an L-edge (corresponding to left shift)

or an R -edge (corresponding to right shift). SEP

n are de®ned by the following three generators in the graph:

Remark 4 This is similar to the familiar shuf¯e-exchange

g a a a =a a a a

1 2 n 2 3 n n

L network where adjacent labels are obtained by either shift-

g a a a =a a a a

1 2 n n 1 2 n1

R ing the label cyclically to left by one digit, or by complement-

g a a a =a a a a

1 2 n 2 1 3 n 12 ing the rightmost digit of the label (instead of swapping).

Remark 1 (i) The SEP is an undirected graph as swapping



E ; L; R  De®nition 2 We use the regular expression  to the ®rst two digits is a symmetric operation; and if a vertex

denote any sequence of generators; generator sequences v u is obtained from another vertex by shifting to left, vertex

will be used to specify vertex to vertex paths in the graph.

u u

v can be obtained from by shifting to right; thus, if is

v v u g

adjacent to ,sois to , (ii) The generator 12 is its own

1 1 1

g g = g g = g = g

R L

inverse, i.e., 12 and ( ), (ii) 3 Simple Routing Algorithm

12

L R SEP

Figure 1 shows the proposed trivalent Cayley graphs 3

SEP 3 4 4 and of dimensions and respectively. SEP

Since n is a Cayley graph, it is vertex symmetric [1],

i.e., we can always view the distance between any two arbi- g

Remark 2 (i) The generator 12 performs the transposition

trary vertices as the distance between the source vertex and

12 g g R on the permutations and (ii) either L or performs

the identity permutation by suitably renaming the digits rep-

1; 2; ;n the n-cycle on the permutations. resenting the permutations. Thus, in our subsequent discus-

Lemma 1 The transpo- sion about a path from a source vertex to a destination ver-

n 1; 2;:::;n

sition 12 and the -cycle together generate tex, the destination vertex is always assumed to be the iden-

S n

n

= 12 n , the group of all permutations of distinct elements. tity vertex I without any loss of generality. The 4321

3421 4213 1342 123 2134

213 231 1234 2413 3142 2341 1432 4132 3241 4123 1423 2314 3214

1324 4231 3412 321 312

4312 132 3124 2431

1243

2143

=3 n =4

Figure 1. Example Graphs for n and

2g i > 1 g

following algorithm Simple Route computes a path from an times; if , then apply 12 ;

u I g ;g  12

arbitrary source vertex to the identity vertex . apply the sequence of generators R

f0; i 2g

The algorithm de®nes the moves to be taken at each step max times followed by a sin- g

(either of the three generators). Each move takes us to a new gle L .

ij1  i  ng

vertex. The symbol set is given by f and for g

Step 4: Apply one L to reach the identity vertex.

i i any digit i,weuse to denote the position of the digit

in the current vertex (the leftmost digit in a permutation has

2143 SEP

Example: Consider the vertex in 4. The path

5=3 23541 the position 1). For example,  in the vertex

from this vertex to the identity vertex is computed by the 1=5

while  in the same vertex.

g g g

L L L

! 1432 ! 4321 !

algorithm as follows: 2143

g g g g g

12 R 12 L 12

! 2314 ! 4231 ! 2431 ! 4312 !

Algorithm Simple Route 3214

g g

L L

! 4123 ! 1234

3412 . Thus the path length is 8.

1 > bn=2c g n 1

Step 1: If , then apply R

g 1 times; else apply L times. [Digit ª1º is Remark 5 The algorithm is not optimal; note that the ver-

now at the rightmost position.

2143 SEP

tex node in 4 can reach the identity vertex in

g g g

12 L L

2134 ! 1243 ! 2431 !

2  i bn=2c

Step 2: For i, , do the following: 6 steps as follows:

g g g

12 L L

! 3412 ! 4123 ! 1234

4312 . Nevertheless it

n i+1 <i1 g

(A) If , then apply R does establish a upper bound on the diameter of the graph,

n i+1

 times; apply the sequence as we see below.

g ;g ni L

of generators 12 times g

followed by a single L ; We make the following observations about the worst case

maxf0; i 2g g behavior of the above routing algorithm:

(B) otherwise, apply L

i > 1 g

times; if , then apply 12 ;ap-

bn=2c

 Step 1 takes moves.

g ;g  12

ply the sequence of generators R

f0; i 2g

max times followed by a

bn=2c 1

 There are loops in Step 2. Consider an g

single L .

arbitrary value of i in the given range. Number of

n+2

i = b c moves is maximum when  (using ei- [Digits ª1º through ªiº are now at the right- 2

most positions of the current vertex] ther subcase (A) or (B)). Number of moves for putting

3i 2 + 1+1

digit i in its correct place is

3n2

bn=2c < i  n 1 = 3i4 =

Step 3: For digits i, ,dothe . Note that for each value of

2

g maxf0; maxf0; i i

following: apply L within the range speci®ed in Step 2, this worst case

i  s SEP

value of is possible. Thus, maximum possible to- For each -cycle in n, the unique vertex start-

1

n 23n 2 = t

tal number of moves in Step 2 is ing with 1 is called the leader vertex. Since there are

4

1

2

3n 8n +4 n t

. symbols and the leader nodes start with 1 ,there

4

n 1! SEP

are leader nodes in n which is equal to

i=ni+1

 Worst case in Step 3 occurs when for

s SEP

the number of -cycles in n. For example, the

n=2c

b and note that this possible. Thus, 1243 leader node of the s-cycle cited in example 1 is . maximum possible total number of moves in Step 3 is

 Note that each leader node permutation starts with

n1

X the digit ª1º. If we disregard the digit ª1º in repre-

f1 + 3n j +12 + 1g

n 1!

senting the leader nodes, each leader node (

j =n=2+1

n 1

of them) is labeled by a distinct permutation of 

f2; 3; ;ng

21

n= distinct digits .

X

1

2

= 3j 1 = 3n 10n +8

s s s j

De®nition 4 Two -cycles, say i and , are said to be ad-

8

j =1

v 2 s u 2 s j

jacent if there exists a vertex i and a vertex

v = g u u = g v  12 such that 12 or .

 Step 4 always takes one move.

s G n s

Theorem 3 Each -cycle in n is adjacent to different -  In the worst case, the total number of moves made by

the entire routing algorithm is given by (by summing cycles.

1

2

n 22n + 24

up the moves from different steps) 9 . 8

Proof : Consider an arbitrary s-cycle with the leader

SEP

n

= a a a a = t g v  =

Remark 6 Obviously, the diameter of for any given v

2 n 1 1 12

1 ,where .Now,

1

2

n n  3 9n 22n + 24

a a = y y f

, , is upper bounded by .We a

2 1 n 0 0

8 and the node belongs to the -cycle

a a a y

strongly suspect that the actual diameter would be much less a

3 n 2 i

with leader 1 . Then, consider the nodes ,

2

O n 

 i < n y = g v  v =

although most possibly . 1

12 i i

, such that i where

i

g v  y = g a a a a a a =

12 i+1 i+2 n 1 2 i

. We have i

L

a a a a a a a  y

i+2 i+1 i+3 n 1 2 i i SEP ; this node belongs to a

4 Algebraic Properties of n

s a a a a a a a

2 i i+2 i+1 i+3 n

-cycle with the leader 1 .Ob-

y 0  i < n s In this section we investigate different interesting struc- viously, the nodes i , , belong to different -

tural properties of the proposed 3-regular Cayley graphs. cycles (they have different leaders) and hence any s-cycle

n s SEP 2

is adjacent to different -cycles in n.

SEP s

De®nition 3 Any cycle in n consisting of only the -

n1

v 2 SEP LE  v  = v

Theorem 4 For any vertex n,

g g R

edges (induced by the symmetric functions L or )is

2n 1 SEP

and this de®nes a cycle of length in n.

called an s-cycle.

n1

v 2 SEP RE  v =v

Corollary 1 For any vertex n,

4312; 3124; 1243; 2431g

Example: In Figure 1, the cycle f

2n 1 SEP

and this de®nes a cycle of length in n.

is an s-cycle.

v 2 SEP n > 3

Theorem 5 For any vertex n,,

3

n! SEP

Theorem 2 Allofthe vertices of n of dimension

EREL v = v 12

 and this de®nes a cycle of length in

s n

n are partitioned into vertex disjoint -cycles of length ; SEP

n.

s SEP n 1!

number of -cycles in n is . SEP

5 Fault Tolerance of n

v = a a a

2 n

Proof : Consider an arbitrary vertex 1 in

i1

i

SEP i i  1 g v =g g v n

. For any , ,let L ,where L

L The node fault tolerance of an undirected graph is mea-

1 n

g g v  = g v  v  = v

L . It is easy to observe that and

L L

sured by the vertex connectivity of the graph. A graph G

j

i

g 1  i; j  n v  6= g v 

also  for . Thus, from an ar-

L

 G

L is said to have a vertex connectivity if the graph re-

v g g R bitrary vertex if the L (or the ) function is repeatedly

mains connected when an arbitrary set of less than  nodes

n SEP applied, a cycle of length is traced in the graph n.That are faulty. Obviously, the vertex connectivity of a graph G

these s-cycles are vertex disjoint follows from the fact that

cannot exceed the minimum degree of a node in G; thus

g v =g v  v = v 2

1 L 2 1 2

L , if and only if .

 SEP   3 SEP n n since is a 3-regular graph for all val-

ues of n. A graph is called maximally fault tolerant if ver- Remark 7

tex connectivity of the graph equals the minimum degree of

 ft ;t ;;t g SEP

2 n n

Consider the symbol set 1 for .a node. Our purpose in this section is to show that the pro-

k 1  k  n s SEP SEP n

For all , ,each -cycle in n has a posed graph has a vertex connectivity of 3 and hence

t 1  k  n

unique vertex starting with k , . these graphs are maximally fault tolerant. :::n we get the corresponding leader nodes 123 and

234

134:::n2 SEP s s 1 s 2

(in n) of two -cycles and say;

s 2134:::n 123:::n

2 has a node which is connected to

g SEP s 1 s

n 2 by an 12 edge in and hence and are

324 243

u = g v 

adjacent. Similar argument holds if R .If

= E i; i + 1v  v = 2:::n u = 2::i

u ,let and

i +1ii+1::n

1 . Again restoring digit ª1º, we

:::n 12::i

get the corresponding leader nodes 12 and

1i +1ii+1::n s s 1 s

of two -cycles and 2 say;

ii +1::n12::i 1 g

342 423 s 12

1 has a node which has an

SEP i +1ii+2::n12::i 1

edge in n to the node

s s s 1 s 2 that belongs to the -cycle 2 ; thus and are ad-

432 jacent.

 s s s SEP

2 n

If two -cycles 1 and are adjacent in ,their

RS

1 corresponding leader nodes are adjacent in n ;

similar arguments, as before, hold in reverse direc- RS

Figure 2. Reduced Graph 3 corresponding tion. SEP

to 4

2

RS

1

Remark 9 The reduced graph n contains the bubble- SEP

De®nition 5 For a given n compute the reduced graph

B n 1

1

sort graph n of dimension as a subgraph; a

RS s

1

n in the following way: condense each -cycle into a

B n 1 n 2

1 bubble-sort graph n of dimension is a -

single node and label that node with the leader node of the

n 1! regular Cayley graph with  vertices, each labeled

s-cycle without the digit ª1º (see Remarks 7); connect two

1;:::;n with a distinct permutation of the set of integers f

arbitrary vertices by an undirected edge iff the correspond-

g G = fg i; i +1;i =

1 with a set of generators de®ned as:

s SEP

ing -cycles are adjacent in n.

;2;:::n 1g

1 [1].

B

1

Lemma 4 Vertex connectivity of a bubble-sort graph n RS

Remark 8 Figure 2 shows the reduced graph 3 corre-

2

is n [1].

SEP RS

n1

sponding to 4. Each vertex in is labeled with a

n 1 RS

RS n  3

1 distinct permutation of distinct symbols; thus n

Lemma 5 Vertex connectivity of n , ,isatleast3.

n 1! n

has  vertices each with a degree (each vertex in

RS s SEP

1 n

n corresponds to a distinct -cycle in ); each

n  4 RS B n

Proof : For , n contains with connectivity

s SEP n

-cycle in n is of length , has a leader node (digit

n 1 RS

(Lemma 4) and hence n has vertex connectivity at

1

ª1º followed by a distinct permutation of n digits, =3

least 3. For n , exhaustive enumeration shows the result.

2; 3; ;ng n s

f ), and is adjacent to distinct other -cycles.

2

RS

1

Lemma 3 The reduced graph n (corresponding to

u v SEP

Lemma 6 Consider two arbitrary nodes and in n

SEP n n 1!

n)isa regular graph with nodes (each

v s

such that u and belong to different -cycles. Then there

1

node representing a distinct permutation of n digits v

exist three vertex disjoint paths between u and .

f2; 3; ;ng g g E i; i +1 R

) with the generators L , , and ,

 i

for 1 ,where indicates swapping of two

u 2 s v 2 s i 6= j j

Proof : Let i and where . Consider the

i +1

adjacent digits i and .

s s

following three -cycles adjacent to i :

g u=u 2 s ;g g u=u 2s ; g g u=u 2 s

SEP n 1! s

12 1 i1 12 L 3 i2 12 R 4 i3

Proof : n has many -cycles, each with

1

a leader denoting a distinct permutation of n digits

 3 s

It is easy to see that for n these three -cycles are dis-

f2; 3; ;ng RS n 1!

1 and hence n has nodes. It re-

tinct. Similarly, the node v can reach (by vertex disjoint

mains to show the adjacency of the s-cycles is achieved by

s s ;s ; and s

1 j2 j 3

the said generators and nothing more. paths) to three distinct -cycles, say j .Note

i j s = s ` jk

that it is possible for given and that i` for some

 u v RS k

1

If two nodes and are adjacent in n , they cor- and . By Menger's theorem [14], given two sets of nodes

s SEP u = g v  V U jV j = jfv ;v ;v gj = jU j =

L 1 2 n

respond to adjacent -cycles in n.If , and such that

v = 23::n u = 34:::n2 jfu ;u ;;u gj = n n

2 n let and ; restoring digit ª1º, 1 in a -connected graph, there are

n v ; u  i

vertex disjoint paths connecting the nodes i , 6.1 Star, Bubble-sort and Pancake

1  i  n RS

1

. The reduced graph n corresponding to

SEP 3 n  3 n is -connected by Lemma 5; hence for there We choose three networks, e.g., Star, Bubble-sort and

are 3 vertex disjoint paths connecting the two sets of 3 dis- Pancake, since each of these networks is a Cayley graph

s !

tinct -cycles. Thus, there exist three vertex disjoint paths with n vertices, each vertex labeled with a permutation of

u v 2 between and . the ®rst n non-zero positive integers. The signi®cant dif-

ference between any of these networks and our proposed

u v SEP

Lemma 7 Consider two arbitrary nodes and in n

SEP network is that degree requirement of the nodes in each

v s such that u and belong to the same -cycle. Then there

one of them increases with the size of the network while v exist three vertex disjoint paths between u and . SEP is a ®xed degree network for any size; this makes SEP

much more attractive especially for VLSI implementations

0

v s s Proof : Since u and belong to the same -cycle ,we

and for applications where the number of I/O ports per node v directly get two vertex disjoint paths between u and along

is bounded and at the same time applications developed for

0

s s the given s-cycle . Consider the following two -cycles:

other networks can run on SEP with minimal modi®cation.

g u=u 2 s ; and g v =v 2s

1 1 12 1 2

12 Any Cayley graph is uniquely identi®ed by the set of gener-

= I =12:::n

ators and the Identity element ( I ).

s s n  3 2 Note that 1 and are distinct for and neither of them

We need three different kind of generators to de®ne the

0 3

is the same as s . Now, by the -connectivity of the reduced g

networks under study: (1) transposition type generators ij

RS SEP

1 n

graph n corresponding to , there exist 3 vertex j

which works on a permutation by swapping the ith and th

s s 2

disjoint paths between 1 and ; at least two of them cannot

g g R

elements in the label; (2) shift type generators L or

0 s go via the s-cycle . Hence, there exists a path between the

which gives a cyclic shift to the elements in the label by one

u v SEP

1 n

nodes 1 to in that does not go through any of the

f 2  p 

digit to left (right); (3) ¯ipping type generators p (

u v

nodes in the s-cycle and belong to. Thus there exist three p

n), that reverses the ®rst elements of the node permutation.

v 2

vertex disjoint paths between u and .

g f

3

;5

For instance, the application of 2 and to the permuta-

I = 12345 15342 32145 SEP

Theorem 6 The graph nis 3-connected for any given tion will yield permutations and ,

g g I 23451

n n  3 R , . respectively. Application of L ( )to will yield

( 51234). We now brie¯y de®ne the three Cayley networks under study:

Proof : Obvious from the previous two theorems. 2

S G = fg ; 2  i  ng

n star

;i

6 Network Simulations Star Graph : 1 ,

g

;i where 1 is a generator that

swaps the ®rst and ith element. In this section we consider the simulation of three Cay-

ley networks namely, Bubble-sort, Pancake, and Star by

B G = fg ; 1 

n B sor t

i;i+1 Bubble-sort Graph : 

SEP.Ef®cient network simulation strategies are necessary to g i

assume that the networks operate in the SIMD mode. At any

P G = ff ; 2  i 

P ancak e i

given step of an algorithm, the controller broadcasts the in- Pancake Graph n :

ng f struction to be executed to all the processors. The proces- ; a generator i in this graph

sors then apply the same instruction to their own sets of data ¯ips the leftmost i digits.

concurrently. In each case we determine the optimal number

SEP G = fg ;g ;g g

n SEP L R

;2

of steps required for one network to simulate the other. For SEP Graph : 1 .

B A network A to simulate network , should be able to per-

form each interconnection function offered by B in a ®nite 6.2 Basic Principle of Simulation number of steps. This simulation takes place by executing a

sequence of interconnection functions provided by A.The Note that there are two ways to specify a generator. One

A A ! B simulation of B by is denoted by (i.e., A sim- way is naturally to use the permutation obtained after the

ulates B). Programs, developed to run on network architec- generator is applied to the Identity element I (absolute rep- B ture B (i.e., that utilize the interconnection functions of ), resentation). The second way is to express this represen-

cannowrunonnetworkA by translating each interconnec- tation as a set of cycles (cycle representation). The latter

tion function of B required by the program into an equiva- uses the property that any permutation can be viewed as a lent (but minimal) sequence of interconnection functions of set of cycles, i.e. cyclically ordered sets of digits with the

A. property that each digit's desired position is that occupied

k 1

by the next digit in the set. For instance, the abstract rep- The ®rst simulation takes 2 steps whereas the 2nd

g 2; 3 :::n;1 2n 2k +3

resentation of L is: but its cycle representa- simulation takes steps. It follows that if

1; 2 :::n g k  bn=2c +1

i;j 

tion is: . Similarly, the generator  has the , the ®rst approach should be used. Other-

i; j 

cycle representation of  which indicates the fact that in wise, the second approach will result in a faster simulation.

j n +1 the permutation label the digits in positions i and must be The simulation time is which is optimal. The opti- swapped. The cycle representation for other generators will mality follows from the fact that in order to swap two adja-

be derived later. The decomposition of a cycle representa- cent digits, they must be ®rst brought into 1st and 2nd posi-

g

;2

tion can be easily transformed to that of its corresponding tions where the swap is allowed (i.e. 1 ). After the swap

i; j  = 1;i1;j1;i

generator. For example,  implies the digits must be put back in place by successive appropri-

g = g g g

i;j  1;i 1;j  1;i  . ate (left or right) shifts.

In the current context, the interconnection functions of ! a network are essentially its generators which are permuta- 6.4 SEP Pancake tions themselves. Thus, the simulation problem can be re-

duced to one of expressing (or decomposing) the generators We need to use an inductive approach. More speci®cally,

B A

of in terms of generators of . The following assumptions f

1

assuming that the simulation of i is known, the steps to f

are made: f 3

simulate i will be given. We use the simulation of as the f

induction base. The simulation of 3 requires 5 steps and is

N = n!  Each simulation involves movement of data given by:

items among N processors.

f = g g g g g

3 L R

1;2 1;2 1;2 f  When simulating interconnection , the data origi-

nally in processor p must be transferred to processor

f

1

Now assuming the simulation of i is

p 1  p  N

f , .

f f

i1

known, we will simulate i by ®rst simulating which

a a :::a a a :::a

 t

1 2 i1 i i+1 n

The simulation time ( simulation ) is in terms of the will change the original label of

a a :::a a a a :::a a

1 i2 2 1 i i+1 n i

number of executions required to perform the simu- to i . We need only to bring

a ;a ;:::;a a

1 i2 2 1 lation. to the ®rst position, and shift i ,and to the right by one position. It is left to the reader to verify that

 The particular interconnection function to be simu- the following sequence will optimally make the change in lated that require the most time to simulate will deter- the label as required.

mine the simulation time.

i2

i2

g g g g 

R

1;2 1;2 Observe that no two (or more ) processors can send data L

to the same processor when N data items are moved across

i The total number of steps in the above sequence is: 3 any of the four networks in parallel and according to a single

5. Thus, the simulation time can be obtained by solving a

interconnection function. This is because every intercon-

S i f

recurrence on , the number of steps to simulate i .We

nection function is indeed a permutation. Applying this per-

i +1= Si+3i5 S 3=5 have S with . Solution of

mutation to a distinct label of a given vertex can only yield this recurrence yields;

g P =

1

i;j 

a distinct label of another vertex. Formally, if 

2

g P P = P P P

3i 13i +22

2 1 2 1 2

i;j 

 then ,where and are permutations

S i= ; i  3

g

i;j 

representing the labels of two vertices and  is a gener- 2

ator representing an interconnection function. The potential

2 = 1

of data con¯ict, however, exists when various interconnec- Coincidentally, from the above S which is true

f = g S n

2

;2 tion functions are applied to different subnetworks. since: 1 .Since will determine the maxi-

mum number of steps to simulate any pancake function, the

2

3n 7n+4

t =

simulation ! simulation time is given by .The

6.3 SEP Bubble-sort 2

simulation time, in this case, is large, but optimal; This is

k; k +1 The cycle  can be decomposed as follows: probably due to the nature of speci®c generators of the Pan-

cake graphs.

k1 nk+1

k; k +1= 1;2;:::;n 1; 21; 2;:::;n !

The superscripts imply the repetition of the cycles. The gen- 6.5 SEP Star

g

k;k+1

erator  can thus be expressed in one of the following

g 2  i  n

;i

ways: We need to simulate the generators 1 , .To

g i

;i

simulate 1 , the digit in the th position is ®rst brought

k 1 k 1

g = g g g ; or

k;k+1 1;2

L R

a a :::a

2 n

to the ®rst position such that the label 1 be trans-

nk +1 nk +1

g = g a a a :::a a a :::a g g ; 2  k

1 i 2 i2 i1 i+1 n

k;k+1 ;2

1 formed to . This can be done

R L

g g i 2

L

;2

by applying the sequence 1 , times. Then [5] K. Day and A. Tripathi. Arrangement graphs: a class

g

;2

the 1 can be applied to obtain of generalized star graphs. Information Processing

a a a :::a a a :::a

1 2 i2 i1 i+1 n

i . Applying the sequence of Letters, 42:235±241, July 1992.

g g i 2 a i

R 1

;2 1 times will take to position ;i.e.the

[6] C. Chen, D. P. Agrawal, and J. R. Burke. dBCube:

a a :::a a a :::a

2 i1 1 i+1 n label i will be obtained. Thus,

a new class of hierarchical multiprocessor intercon-

i2 i2

g =g g  g g g 

L R

;i 1;2 1;2 1;2 1 nection networks with area ef®cient layout. IEEE

Transactions on Parallel and Distributed Systems, g

The simulation can also be done using the SEP generator R

4(12):1332±1344, December 1993. g

instead of L and then the sequence will look like

i+1 ni

n [7] M. R. Samatham and D. K. Pradhan. The De Bruijn

g =g g  g g  g

R L L

;i 1;2 1;2 1 multiprocessor network: a versatile parallel process-

Depending on the location of i in the node label, one of the ing and sorting network for VLSI. IEEE Transactions

above schemes takes fewer steps, i.e., results in a faster sim- on Computers, 38(4):567±581, April 1989.

2n+5 = ulation. Th e worst case occurs when i .Thissim-

4 [8] D. K. Pradhan and S. M. Reddy. A fault tolerant

g

;i ulation is optimal as no other sequence can simulate 1

in fewer steps. Thus, the simulation time is computed at communication architecture for distributed systems.

n+5

2 IEEE Transactions on Computers, C-31(9):863±870,

i = t =2n2

and is given by simulation . 4 September 1982. 7 Conclusion [9] W. E. Leland and M. H. Solomon. Dense trivalent graphs for processor interconnection. IEEE Transac- We have proposed a new family of regular Cayley net- tions on Computers, 31(3):219±222, March 1982. works which is shown to be very competitive to build mas- sively parallel systems. The proposed networks are maxi- [10] S. B. Akers and B. Krishnamurthy. Group graphs as mally fault tolerant and hence are more resilient to node and interconnection networks. In Proceedings of FTCS-14, link failures than the almost regular networks like DeBruijn pages 422±427, 1984. graphs or MÈobius graphs. Another interesting feature of the [11] S. Lati®, M. M. Azevedo, and N. Bagherzadeh. The proposed family of networks is that it can ef®ciently simu- star connected cycles: a ®xed degree network for par- late many of the popular permutation group based networks allel processing. In Proceedings of the International like star graphs, bubble-sort or pancake graphs. Thus, algo- Conference on Parallel Processing, volume 1, pages rithms originally designed for those graphs can still run on 91±95, 1993. the proposed networks and one can still get the advantage of the ®xed degree of the network (independent of the size). [12] F. Preparata and J. Vuillemin. The cube-connected cycles: a versatile network for parallel computation. References Communications of ACM, 24(5):30±39, May 1981. [13] M. A. Armstrong. Groups and Symmetry. Springer- [1] S. B. Akers and B. Krishnamurthy. A group-theoretic Verlag, 1988. model for symmetric interconnection networks. IEEE Transactions on Computers, 38(4):555±566, April [14] F. Harary. . Addison-Wesley, Reading, 1989. MA, 1972. [2] S. B. Akers and B. Krishnamurthy. The star graph: an attractive alternative to n-cube. In Proceedings of In- ternational Conference on Parallel Processing (ICPP- 87), pages 393±400, St. Charles, Illinois, August 1987. [3] L. Bhuyan and D. P. Agrawal. Generalized hypercube and hyperbus structure for a computer netwrk. IEEE Transactions on Computers, 33(3):323±333, March 1984. [4] K. Qiu, S. G. Akl, and H. Meijer. On some proper- ties and algorithms for the star and pancake intercon- nection networks. Journal of Parallel and Distributed Computing, 22(1), July 1994.