Monte Carlo Method on Parallel Architectures

Jongsoon Kim

Introduction

- Monte Carlo methods
  - Utilize random numbers to perform a statistical simulation of a physical problem
  - Extremely time-consuming
  - Inherently parallel: each particle history is simulated independently
- With the recent increase in the accessibility of advanced computers, parallel Monte Carlo methods have attracted interest

Multiple instruction, multiple data (MIMD)

- Several processors operate in parallel
- Each processor has its own instruction stream and its own data stream
- They operate asynchronously
- Two classes, depending on memory structure:
  - Distributed-memory parallel processors
  - Shared-memory parallel processors

Distributed-memory parallel processors

- A regular array of a large number of processors
- Each processor has its own private memory
- Processors are interconnected by communication links
- They communicate by passing messages along those links

Shared-memory parallel processors

- Utilize a shared bus to connect processors with a global memory
- Some use a high-speed bus
- Among the most successful commercial parallel processors

Limiting factors of parallel algorithms

- Balanced workload
  - The obvious goal for an efficient algorithm
  - All processors should be kept busy
- Communications
  - A serious concern for distributed-memory processors: relatively slow inter-processor communication
  - For shared-memory processors: the memory interface

Limiting factors of parallel algorithms (cont.)

- Synchronization
  - Leads to inefficiencies: processors are left waiting for one another
  - Unavoidable

Monte Carlo method

- Radiation-transport Monte Carlo
  - A particle is emitted from a source routine
  - It is transported through the medium of interest
  - It is processed through whatever collisions or interactions occur
  - A history continues until the particle is terminated: absorbed, escapes, etc.
  - As each history finishes, the results of the simulation are accumulated (tallies)
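The history loop described above can be sketched in a few lines. This is a toy one-dimensional model, not code from the talk: the slab thickness, the absorption probability, and the forward-only transport are all illustrative assumptions.

```python
import random

# Toy model: a particle emitted at the face of a slab 5 mean free paths
# thick; each collision absorbs it with probability 0.3, otherwise it
# scatters and transport continues (direction changes are ignored here).
SLAB_THICKNESS = 5.0   # in mean free paths (assumed)
P_ABSORB = 0.3         # absorption probability per collision (assumed)

def run_history(rng):
    """Simulate one particle history; return how it terminated."""
    x = 0.0  # particle emitted from the source at the slab face
    while True:
        x += rng.expovariate(1.0)      # distance to the next collision
        if x >= SLAB_THICKNESS:
            return "escaped"           # particle leaves the medium
        if rng.random() < P_ABSORB:
            return "absorbed"          # collision terminates the history
        # otherwise: scattered, the history continues

def run_histories(n, seed=0):
    """Accumulate tallies over n independent histories."""
    rng = random.Random(seed)
    tally = {"absorbed": 0, "escaped": 0}
    for _ in range(n):
        tally[run_history(rng)] += 1
    return tally

print(run_histories(10000))
```

Each call to `run_history` is independent of every other one given its random numbers, which is exactly the property the parallel algorithms below exploit.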

Monte Carlo on parallel architectures

- The Monte Carlo method is inherently parallel
- A parallel algorithm can be developed with minimal changes to a conventional Monte Carlo code
- Critical parts of a parallel algorithm
  - Sufficient memory
  - A parallel random-number generator, which must provide uncorrelated random numbers
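Because histories are independent, the minimal change to a conventional code is to farm batches of histories out to workers, each with its own random stream. A sketch, using Python's process pool as a stand-in for message-passing workers; the per-worker seeding here only makes the streams differ, it does not guarantee the uncorrelated streams discussed below:

```python
import random
from multiprocessing import Pool

def batch(args):
    """Run a batch of toy histories on one worker; return its tally."""
    seed, n = args
    rng = random.Random(seed)               # this worker's private stream
    # toy "history": a single collision, absorbed with probability 0.3
    return sum(1 for _ in range(n) if rng.random() < 0.3)

if __name__ == "__main__":
    # Four workers, 1000 histories each; tallies are summed at the end,
    # mirroring the accumulate-then-combine structure of parallel MC.
    with Pool(4) as pool:
        tallies = pool.map(batch, [(seed, 1000) for seed in range(4)])
    print(sum(tallies), "of 4000 toy histories absorbed")
```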

Domain decomposition scheme

- Used to reduce excessive memory demand
  - Partitioning by geometry: assign specific zones to processors
  - Partitioning by energy: assign specific energy groups to processors
- Gives a substantial saving in memory
- But increases inter-processor communication
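A minimal sketch of partitioning by energy: each processor holds the cross-section data for only its own band of energy groups. The group count, processor count, and block mapping are illustrative assumptions, not values from the talk.

```python
NUM_GROUPS = 16   # energy groups in the problem (assumed)
NUM_PROCS = 4     # processors sharing the data (assumed)

def owner_of_group(group):
    """Processor that owns a given energy group (block partition)."""
    return group * NUM_PROCS // NUM_GROUPS

# A particle that scatters from group 3 to group 9 must be handed from
# processor 0 to processor 2 -- the inter-processor communication cost
# that offsets the memory saving.
print(owner_of_group(3), owner_of_group(9))  # -> 0 2
```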

Random number generator

- The most critical part of parallel Monte Carlo
- Must ensure statistical independence between streams
- Two alternative approaches
  - Parameterization of the recursion: linear congruential generators, shift-register generators, and lagged-Fibonacci generators
  - Splitting: one long period is split into a number of substreams
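The splitting approach can be sketched with the leapfrog technique for a linear congruential generator: substream j takes elements j, j+K, j+2K, ... of the single long stream, which an LCG can do in one modular step. The generator constants below are the classic 32-bit Numerical Recipes LCG, chosen only for illustration.

```python
A, C, M = 1664525, 1013904223, 2**32   # illustrative LCG constants

def lcg(x):
    """One step of the underlying generator: x -> (A*x + C) mod M."""
    return (A * x + C) % M

def leapfrog_params(k):
    """Multiplier and increment that advance the LCG k steps at once."""
    a_k = pow(A, k, M)
    c_k = C * sum(pow(A, i, M) for i in range(k)) % M
    return a_k, c_k

def take(n, x0, step):
    """First n outputs of a generator, given its step function."""
    out, x = [], x0
    for _ in range(n):
        x = step(x)
        out.append(x)
    return out

K = 2                                  # split into K interleaved substreams
AK, CK = leapfrog_params(K)
full = take(6, 12345, lcg)                          # the single long stream
sub0 = take(3, 12345, lambda x: (AK * x + CK) % M)  # one substream
print(full[1::2] == sub0)  # True: the substream is every K-th element
```

Because the substreams partition one well-understood sequence, they inherit its period and cannot overlap, which is the appeal of splitting over ad hoc per-processor seeding.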

Parallel Virtual Machine (PVM)

- A portable message-passing programming system
- Links separate machines to create a virtual machine: a single, manageable computing resource

Massively Parallel Processors (MPP)

[Diagram: an MPP as a mesh of nodes, each containing processing elements (PE); some nodes carry I/O processors (IOP) and one a management processor (MP)]

Massively Parallel Processors (MPP) (cont.)

Speedup is the ratio of the CPU time taken on a single processor to that taken on N processors:

    S_N = T_1 / T_N

Speedup achieves nearly linear performance.
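A worked example of the formula, with assumed timings rather than measurements from the talk; near-linear speedup means S_N stays close to N (efficiency S_N / N close to 1):

```python
def speedup(t1, tn):
    """S_N = T_1 / T_N for single- and N-processor CPU times."""
    return t1 / tn

T1 = 1200.0                      # seconds on one processor (assumed)
timings = {4: 310.0, 8: 160.0}   # seconds on N processors (assumed)
for n, tn in timings.items():
    s = speedup(T1, tn)
    print(f"N={n}: S_N={s:.2f}, efficiency={s / n:.2f}")
```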

Schematic view of cluster system

[Diagram: six compute nodes joined by an 8x8 HS-link router; an I/O node with a 1 GB disk on SCSI-2; an entry node with PCI Ethernet and a 2 GB disk on SCSI-1, connected to the LAN]

Cluster system (cont.)

[Plot: speedup (4 to 12) versus number of computing nodes (3 to 12), showing a linear decrease of computing time with the number of computing nodes]

Architecture of network of workstations

[Diagram: eight nodes, each with a processing element, cache, and memory on a local bus, connected by a 100 Mb Fast Ethernet network]

Conclusion

- Monte Carlo methods are necessary tools in radiation dosimetry and shielding design
- Both parallel architectures (MPP and networks of distributed workstations) are suitable and effective
- A mixture of PCs and workstations may be even cheaper