Distributed Shared Memory

Sharing memory in MIMD machines

- Multiprocessors (shared memory)
  - Any process can use usual load / store operations to access any memory word
  - Pro: simple communication between processes — shared memory locations
  - Pro: synchronization is well understood, uses classical techniques (semaphores, etc.)
  - Con: complex hardware — the bus becomes a bottleneck with more than 10-20 CPUs
- Multicomputers (no shared memory)
  - Each CPU has its own private memory
  - Pro: simple hardware with a network connection
  - Con: complex communication — processes have to use message passing, and have to deal with lost messages, blocking, etc.
  - RPCs help, but aren't perfect (no globals, can't pass large data structures, etc.)

Multicomputers and Multiprocessors (outline)

- Multiprocessors
  - UMA
  - NUMA
- Multicomputers — distributed shared memory (DSM)
  - Motivation
  - Consistency models
  - Implementation
    - Implementation problems: granularity, page replacement
    - Implementation approaches
  - Thrashing

Bus-based multiprocessors

[Figure: several CPUs, each with a cache, sharing a single bus to memory]

- Symmetric Multiprocessor (SMP)
  - Multiple CPUs (2-30), one shared physical memory, connected by a bus
- Caches must be kept consistent
  - Each CPU has a "snooping" cache, which "snoops" on what's happening on the bus
    - On a read hit, fetch data from the local cache
    - On a read miss, fetch data from memory and store it in the local cache
      - The same data may be in multiple caches
    - (Write through) On a write, store in memory and in the local cache
      - Other caches are snooping; if they have that word they invalidate their cache entry
      - After the write completes, memory is up-to-date and the word is in only one cache

NUMA Machines

[Figure: CPU + memory nodes, each on a local bus, connected by a LAN]

- NUMA = Non-Uniform Memory Access
- NUMA multiprocessor
  - Multiple CPUs, each with its own memory
  - CPUs share a single virtual memory
  - Accesses to local memory locations are much faster (maybe 10x) than accesses to remote memory locations
- No hardware caching, so it matters which data is stored in which memory
- A reference to a remote page causes a hardware page fault, which traps to the OS, which decides whether or not to move the page to the local machine

Distributed Shared Memory (DSM) overview

- Basic idea (Kai Li, 1986)
  - Collection of workstations, connected by a LAN, all sharing a single paged, virtual address space
  - Each page is present on exactly one machine, and a reference to a local page is done in the usual fashion
  - A reference to a remote page causes a hardware page fault, which traps to the OS, which sends a message to the remote machine to get the page; the faulting instruction can then complete
    - To the programmer, a DSM machine looks like a conventional shared-memory machine
  - Pro: easy to modify old programs
  - Con: poor performance — lots of pages being sent back and forth over the network

Advantages of DSM

- Simpler abstraction — the programmer does not have to worry about data movement; may be easier to implement than RPC since the address space is the same
- Easier portability — sequential programs can in principle be run directly on DSM systems
- Possibly better performance
  - Locality of data — data is moved in large blocks, which helps programs with good locality of reference
  - On-demand data movement
  - Larger memory space — no need to do paging to disk
- Flexible communication — no need for sender and receiver to both exist; machines can join and leave the DSM system without affecting the others
- Process migration simplified — a process can easily be moved to a different machine, since all machines share the address space

Comparison of shared memory systems

- SMP = single-bus multiprocessor
  - Hardware access to all of memory, hardware caching of blocks
- NUMA
  - Hardware access to all of memory, software caching of pages
- Distributed Shared Memory (DSM)
  - Software access to remote memory, software caching of pages

Maintaining memory coherency

- DSM systems allow concurrent access to shared data
- Concurrency may lead to unexpected results — what if a read does not return the value stored by the most recent write (the write did not propagate)?
- Memory is coherent if the value returned by a read operation is always the value the programmer expected
- To maintain coherency of shared data, a mechanism that controls (and synchronizes) memory accesses is used
- This mechanism allows only a restricted set of memory access orderings
- Memory consistency model — the set of allowable memory access orderings

Consistency models

- Notation: rij[a] — operation number j (a read) performed by process Pi on memory item a
- Strict consistency (strongest model)
  - The value returned by a read operation is always the value written by the most recent write operation
    - Writes become instantly visible to all processes
  - Requires absolute global time to correctly order operations — not possible
- Sequential consistency (Lamport 1979)
  - All processes see all memory access operations in the same order
    - The interleaving of operations doesn't matter, as long as all processes see the same ordering
  - A read operation may not return the result of the most recent write operation!
    - Running a program twice may give different results each time
  - If order matters, use semaphores!

Consistency models (cont.)

- Causal consistency (Hutto and Ahamad 1990)
  - Writes that are potentially causally related are seen by all processes in the same order
  - A read and a subsequent write (or two writes) on the same item are causally related
  - Causality is transitive — if a process carries out an operation B that causally depends on a preceding operation A, all subsequent operations by that process are causally related to A (even if they are on different items)
- PRAM consistency (Lipton & Sandberg 1988)
  - All processes see the memory writes done by a single process in the same (program) order
  - PRAM = pipelined RAM
    - Writes done by a single process can be pipelined; it doesn't have to wait for one write to finish before starting another
    - Writes by different processes may be seen in a different order by a third process
  - Easy to implement — order the writes on each processor independently of all others

Consistency models (cont.)

- Processor consistency (Goodman 1989)
  - PRAM + coherency on the same data item — all processes agree on the order of write operations to the same data item
- Weak consistency (Dubois 1988)
  - Consistency need only apply to a group of memory accesses rather than to individual memory accesses
  - Uses synchronization variables to make all memory changes visible to all other processes (e.g., on exiting a critical section)
    - All accesses to synchronization variables must be sequentially consistent
    - Write operations are completed before any access to a synchronization variable
    - Access to a non-synchronization variable is allowed only after the synchronization-variable access is completed
- Release consistency (Gharachorloo 1990)
  - Two synchronization operations
    - acquire — all changes to synchronized variables are propagated to the acquiring process
    - release — all changes to synchronized variables are propagated to the other processes
    - The programmer has to write the accesses to these variables explicitly

Comparison of consistency models

- Models differ in how restrictive they are, difficulty of implementation, ease of programming, and performance
- Strict consistency — most restrictive, but impossible to implement
- Sequential consistency — widely used; intuitive semantics; not much extra burden on the programmer
  - But does not allow much concurrency
- Causal & PRAM consistency — allow more concurrency, but have non-intuitive semantics and put more of a burden on the programmer to avoid doing things that require stronger consistency
- Weak and release consistency — intuitive semantics, but put an extra burden on the programmer

Implementation issues

- How to keep track of the location of remote data
- How to overcome the communication delays and high overhead associated with execution of communication protocols
- How to make shared data concurrently accessible at several nodes to improve system performance

Granularity

- Granularity — the size of the shared memory unit
- Page-based DSM
  - Single page — simple to implement
  - Multiple pages — take advantage of locality of reference; amortize network overhead over multiple pages
    - Disadvantage — false sharing
- Shared-variable DSM
  - Share only those variables that are needed by multiple processes
  - Updating is easier and false sharing can be avoided, but more burden is put on the programmer
- Object-based DSM
  - Retrieve not only the data but the entire object — data, methods, etc.
  - Old programs have to be heavily modified

Implementing sequential consistency on page-based DSM

- Can a page move? …be replicated?
- Nonreplicated, nonmigrating pages
  - All requests for the page have to be sent to the owner of the page
  - Easy to enforce sequential consistency — the owner orders all access requests
  - No concurrency
- Nonreplicated, migrating pages
  - All requests for the page have to be sent to the owner of the page
  - Each time a remote page is accessed, it migrates to the processor that accessed it
  - Easy to enforce sequential consistency — only processes on that processor can access the page
  - No concurrency

Implementing sequential consistency on page-based DSM (cont.)

- Replicated, migrating pages
  - All requests for the page have to be sent to the owner of the page
  - Each time a remote page is accessed, it's copied to the processor that accessed it
  - Multiple read operations can be done concurrently
  - Hard to enforce sequential consistency — must invalidate (the most common approach) or update the other copies of the page during a write operation
- Replicated, nonmigrating pages
  - Replicated at fixed locations
  - All requests to the page have to be sent to one of the owners of the page
  - Hard to enforce sequential consistency
