<<

Parallel A Simple Model for Parallel Processing

4 several operations can be executed at the same time • Parallel Random Access Machine (PRAM) model 4 many problems are most naturally modeled with 4a number of processors all can access parallelism 4a large share memory • A Simple Model for Parallel Processing 4all processors are synchronized • Approaches to the design of parallel algorithms 4all processor running the same program each processor has an unique id, pid. and • and Efficiency of parallel algorithms f f may instruct to do different things depending on their pid • A class of problems NC • Parallel algorithms, e.g. TECH

Approaches to PRAM models the design of parallel algorithms • PRAM models vary according • Modify an existing sequential 4how they handle write conflicts 4exploiting those parts of the algorithm that are 4The models differ in how fast they can solve various naturally parallelizable. problems. • Design a completely new parallel algorithm that • Concurrent Read Exclusive Write (CREW) 4may have no natural sequential analog. 4only one processor are allow to write to 4one particular memory cell at any one step • Brute force Methods for parallel processing: • Concurrent Read Concurrent Write (CRCW) 4Using an existing sequential algorithm but • Algorithm works correctly for CREW f each processor using differential initial conditions 4will also works correctly for CRCW 4Using compiler to optimize sequential algorithm 4but not vice versa 4Using advanced CPU to optimize code

Speedup and Efficiency of parallel algorithms A class of problems NC 4Let T*(n) be the of a sequential • The class NC consists of problems that algorithm to solve a problem P of input size n 4can be solved by parallel algorithm using 4Let T (n) be the time complexity of a parallel algorithm p f polynomially bounded number of processors p(n) to solves P on a parallel computer with p processors k f p(n) ∈ O(n ) for problem size n and some constant k • Speedup 4the number of time steps bounded by a polynomial in the logarithm of the problem size n 4Sp(n) = T*(n) / Tp(n) f T(n) ∈ O( (log n)m ) for some constant m 4Sp(n) <= p

4Best possible, Sp(n) = p f when Tp(n) = T*(n)/p • Theorem: • Efficiency 4NC ⊆ P

4Ep(n) = T1(n) / (p Tp(n)) f where T1(n) is when the parallel algorithm run in 1 processor

4Best possible, Ep(n) = 1

1 Parallel algorithms, e.g. Binary Fan-In Technique Algorithm: Parallel Tournament for Max • Algorithm:Parallel Tournament for Maximum • Input: Keys x[0],x[1],....x[n-1], • initially in memory cells M[0] ,...,M[n-1], and integer n. • Output:The largest key will be left in M[0]. • parTournamentMax(M,n) • int incr • Write -(some very small value) into M[n+pid] • incr=1; • while(incr

• Analysis: Use n processor and θ(log n) time

Algorithm: Finding Max in Constant Time Algorithm: Common-Write Max of n Keys • CRCW method • Uses n2 processors, does only three read/write steps!

2