Operating Systems I

Virtual Memory

Swapping
• Active processes use more physical memory than system has
• Address binding can be fixed or relocatable at runtime
[Figure: main memory (OS plus user space) and a backing store (swap space); processes P1 and P2 are swapped out to the backing store]

Swapping
• Consider 100K proc, 1MB/s disk, 8ms seek
  – about 100 ms transfer + 8 ms seek = 108 ms; swap out plus swap in: 108 ms x 2 = 216 ms
  – If used for context switch, want large quantum!
• Small processes faster
• Pending I/O (DMA)
  – don't swap
  – DMA to OS buffers
• Unix uses swapping variant
  – Each process has "too large" address space
  – Demand Paging

Motivation
• Logical address space larger than physical memory
  – "Virtual Memory"
  – on special disk
• Abstraction for programmer
• Performance ok?
  – Error handling not used
  – Maximum arrays

Demand Paging
• Less I/O needed
• Less memory needed
• Faster response
• More users
• No pages in memory initially
  – Pure demand paging
[Figure: pages A1, A2, A3 and B1, B2 of two processes; pages are paged out of main memory]

Paging Implementation
• Validation bit in each page table entry: v = valid (page in memory), i = invalid (page not in memory)

  Logical memory   Page table entry   Frame   Bit
  Page 0           0                  1       v
  Page 1           1                  0       i
  Page 2           2                  3       v
  Page 3           3                  0       i

• Physical memory: frame 1 holds Page 0, frame 3 holds Page 2

Page Fault
• Page not in memory
  – interrupt OS => page fault
• OS looks in table:
  – invalid reference? => abort
  – not in memory? => bring it in
• Get empty frame (from list)
• Swap page into frame
• Reset tables (valid bit = 1)
• Restart instruction

Performance of Demand Paging
• Page Fault Rate
  0 ≤ p ≤ 1.0 (from no page faults to every reference is a fault)
• Effective Access Time
  = (1-p) x (memory access) + p x (page fault overhead)
• Page Fault Overhead
  = swap page out + swap page in + restart

Performance Example
• memory access time = 100 nanoseconds
• page fault overhead = 25 msec
• page fault rate = 1/1000
• EAT (in nanoseconds) = (1-p) x 100 + p x (25 msec)
    = (1-p) x 100 + p x 25,000,000
    = 100 + 24,999,900 x p
    = 100 + 24,999,900 x 1/1000 ≈ 25 microseconds!
• Want less than 10% degradation
    110 > 100 + 24,999,900 x p
    10 > 24,999,900 x p
    p < .0000004, or 1 fault in 2,500,000 accesses!

Page Replacement
• Page fault => What if no free frames?
  – terminate user process (ugh!)
  – swap out process (reduces degree of multiprogramming)
  – replace other page with needed page
• Page replacement:
  – if free frame, use it
  – otherwise, use algorithm to select victim frame
  – write page to disk, changing tables
  – read in new page
  – restart process

Page Replacement
• "Dirty" bit: avoid the page out when the victim page has not been modified
[Figure: logical memory, page tables, and physical memory; a victim frame is chosen and, in steps (0) to (4), the victim's page table entry changes from v to i and the new page's entry from i to v]

Page Replacement Algorithms
• Every system has its own
• Want lowest page fault rate
• Evaluate by running it on a particular string of memory references (reference string) and computing number of page faults
• Example: 1,2,3,4,1,2,5,1,2,3,4,5

First-In-First-Out (FIFO)
• Reference string: 1,2,3,4,1,2,5,1,2,3,4,5
• 3 frames / process: 9 page faults
    frame 1: 1, 4, 5
    frame 2: 2, 1, 3
    frame 3: 3, 2, 4
• 4 frames / process: 10 page faults!
    frame 1: 1, 5, 4
    frame 2: 2, 1, 5
    frame 3: 3, 2
    frame 4: 4, 3
• Belady's Anomaly (more frames, yet more page faults)

Optimal
• Replace the page that will not be used for the longest period of time
• Reference string: 1,2,3,4,1,2,5,1,2,3,4,5
• 4 frames / process: 6 page faults
    frame 1: 1, 4
    frame 2: 2
    frame 3: 3
    frame 4: 4, 5
• How do we know this? (It requires knowing future references.)
• Use as benchmark

Least Recently Used
• Replace the page that has not been used for the longest period of time
• Reference string: 1,2,3,4,1,2,5,1,2,3,4,5
• 4 frames / process: 8 page faults
    frame 1: 1, 5
    frame 2: 2
    frame 3: 3, 5, 4
    frame 4: 4, 3
• No Belady's Anomaly
  – "Stack" algorithm
  – pages held with N frames are a subset of those held with N+1

LRU Implementation
• Counter implementation
  – every page has a counter; every time page is referenced, copy clock to counter
  – when a page needs to be changed, compare the counters to determine which to change
• Stack implementation
  – keep a stack of page numbers
  – page referenced: move to top
  – no search needed for replacement

LRU Approximations
• LRU good, but hardware support expensive
• Some hardware support by reference bit
  – with each page, initially = 0
  – when page is referenced, set = 1
  – replace the one which is 0 (no order)
  – enhance by having 8 bits and shifting
  – approximate LRU

Second-Chance
• FIFO replacement, but …
  – Get first in FIFO
  – Look at reference bit
    · bit == 0 then replace
    · bit == 1 then set bit = 0, get next in FIFO
• If page referenced enough, never replaced
• Implement with circular queue

Second-Chance
[Figure: circular queue of pages with reference bits in two states, (a) and (b), and a "next victim" pointer]

  Page   Reference bit (a)   Reference bit (b)
  1      1                   0
  2      0                   0
  3      1                   0
  4      1                   0

• If all 1, degenerates to FIFO

Enhanced Second-Chance
• 2 bits: reference bit and modify bit
• (0,0) neither recently used nor modified
  – best page to replace
• (0,1) not recently used but modified
  – needs write-out
• (1,0) recently used but clean
  – probably used again soon
• (1,1) recently used and modified
  – used soon, needs write-out
• Circular queue in each class (Macintosh)

Counting Algorithms
• Keep a counter of number of references
  – LFU: replace page with smallest count
    · if a page does all its references in the beginning, it won't be replaced
    · decay values by shift
  – MFU: replace page with largest count; the page with the smallest count was probably just brought in and will probably be used
• Not too common (expensive) and not too good

Page Buffering
• Pool of frames
  – start new process immediately, before writing old
    · write out when system idle
  – list of modified pages
    · write out when system idle
  – pool of free frames, remember content
    · page fault => check pool

Allocation of Frames
• How many fixed frames per process?
• Two allocation schemes:
  – fixed allocation
  – priority allocation

Fixed Allocation
• Equal allocation
  – ex: 93 frames, 5 procs = 18 per proc (3 in pool)
• Proportional allocation
  – number of frames proportional to size
  – ex: 64 frames, s1 = 10, s2 = 127
    · f1 = 10 / 137 x 64 ≈ 5
    · f2 = 127 / 137 x 64 ≈ 59
• Treat processes equal

Priority Allocation
• Use a proportional scheme based on priority
• If process generates a page fault
  – select for replacement a frame from a process with lower priority
• "Global" versus "Local" replacement
  – local consistent (not influenced by others)
  – global more efficient (used more often)

Thrashing
• If a process does not have "enough" pages, the page-fault rate is very high
  – low CPU utilization
  – OS thinks it needs increased multiprogramming
  – adds another process to system
• Thrashing is when a process is busy swapping pages in and out

Thrashing
[Figure: CPU utilization versus degree of multiprogramming]

Cause of Thrashing
• Why does paging work?
  – Locality model
    · process migrates from one locality to another
    · localities may overlap
• Why does thrashing occur?
  – sum of localities > total memory size
• How do we fix thrashing?
  – Working Set Model
  – Page Fault Frequency

Working-Set Model
• Working set window W = a fixed number of page references
  – total number of pages referenced in time T
• D = sum of size of W's

Working Set Example
• T = 5
• 1 2 3 2 3 1 2 4 3 4 7 4 3 3 4 1 1 2 2 2 1
    W={1,2,3}   W={3,4,7}   W={1,2}
  – if T too small, will not encompass locality
  – if T too large, will encompass several localities
  – if T => infinity, will encompass entire program
• if D > m (available frames) => thrashing, so suspend a process
• Modify LRU approximation to include Working Set

Page Fault Frequency
[Figure: page fault rate versus number of frames, with an upper bound (increase number of frames) and a lower bound (decrease number of frames)]
• Establish "acceptable" page-fault rate
  – If rate too low, process loses frame
  – If rate too high, process gains frame

Prepaging
• Pure demand paging has many page faults initially
  – use working set
  – does cost of prepaging unused frames outweigh cost of page-faulting?

Page Size
• Old: page size fixed. New: choose page size
• How do we pick the right page size? Tradeoffs:
  – Fragmentation
  – Table size
  – Minimize I/O
    · transfer small (.1ms), latency + seek time large (10ms)
  – Locality
    · small gives finer resolution, but more faults
    · ex: 200K process (1/2 used): 1 fault with a 200K page, 100K faults with a 1-byte page
• Historical trend towards larger page sizes
  – CPU, mem faster proportionally than disks

Program Structure
• consider:
    int A[1024][1024];
    for (j=0; j<1024; j++)
      for (i=0; i<1024; i++)
        A[i][j] = 0;
• suppose:
  – process has 1 frame
  – 1 row per page
  – => 1024x1024 page faults!

Program Structure
    int A[1024][1024];
    for (i=0; i<1024; i++)
      for (j=0; j<1024; j++)
        A[i][j] = 0;
• 1024 page faults
• stack vs. hash table
• Compiler
  – separate code from data
  – keep routines that call each other together
• LISP (pointers) vs. Pascal (no pointers)

Priority Processes
• Consider
  – low priority process faults,
    · bring page in
  – low priority process in ready queue for a while, waiting while high priority process runs
  – high priority process faults
    · low priority page clean, not used in a while => perfect!
• Lock-bit (like for I/O) until used once

Real-Time Processes
• Real-time
  – bounds on delay
  – hard real-time: systems crash, lives lost
    · air-traffic control, factory automation
  – soft real-time: application performs badly
    · audio, video
• Paging adds unexpected delays
  – don't do it
  – lock bits for real-time processes

Virtual Memory and WinNT
• Page Replacement Algorithm
  – FIFO
  – Missing page, plus adjacent pages
• Working set
  – default is 30
  – take victim frame periodically
  – if no fault, reduce set size by 1
• Reserve pool
  – hard page faults
  – soft page faults

Virtual Memory and Linux
• Regions of virtual memory
  – paging disk (normal)
  – file (text segment, memory mapped file)
• New Virtual Memory
  – exec() creates new page table
  – fork() copies page table
    · reference to common pages
    · if written, then copied
• Page Replacement Algorithm
  – second chance (with more bits)

Virtual Memory and WinNT
• Shared pages
  – level of indirection for easier updates
  – same virtual entry
• Page File
  – stores only modified logical pages
  – code and memory mapped files on disk already

Application Performance Studies and Demand Paging in Windows NT
Mikhail Mikhailov, Saqib Syed, Ganga Kannan, Divya Prakash, Mark Claypool, David Finkel, Sujit Kumar
WPI and BMC Software, Inc.

Capacity Planning Then and Now
• Capacity Planning in the good old days
  – used to be just mainframes
  – simple CPU-load based queuing theory
  – Unix
• Capacity Planning today
  – distributed systems
  – networks of workstations
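
Worked example: effective access time
The arithmetic on the "Performance Example" slide can be checked with a short sketch. The function names and the `slowdown` parameter are illustrative assumptions, not part of the slides. All times are in nanoseconds.

```c
#include <assert.h>

/* Effective access time in ns for page-fault rate p (0 <= p <= 1):
   EAT = (1 - p) * memory access + p * page fault overhead. */
double effective_access_time_ns(double p, double mem_ns, double fault_ns)
{
    assert(p >= 0.0 && p <= 1.0);
    return (1.0 - p) * mem_ns + p * fault_ns;
}

/* Largest fault rate that keeps EAT within (1 + slowdown) * mem_ns.
   From (1 - p) * m + p * F <= (1 + s) * m it follows p <= s * m / (F - m). */
double max_fault_rate(double mem_ns, double fault_ns, double slowdown)
{
    assert(fault_ns > mem_ns && slowdown >= 0.0);
    return slowdown * mem_ns / (fault_ns - mem_ns);
}
```

With the slide's numbers, `effective_access_time_ns(0.001, 100.0, 25e6)` gives about 25,100 ns (roughly 25 microseconds), and `max_fault_rate(100.0, 25e6, 0.10)` gives about 0.0000004, that is, one fault in about 2,500,000 accesses.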
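
Worked example: FIFO and Optimal fault counts
The fault counts on the "First-In-First-Out (FIFO)" and "Optimal" slides can be reproduced with a small simulation. This is a minimal sketch: `MAX_FRAMES`, `find_page`, `fifo_faults`, and `opt_faults` are names chosen here for illustration, and the optimal policy is done by a plain forward scan rather than anything a real kernel could do.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16   /* illustrative upper limit on frames per process */

/* Returns the index of page in frames[0..used), or -1 if it is not resident. */
static int find_page(const int *frames, int used, int page)
{
    for (int i = 0; i < used; i++)
        if (frames[i] == page)
            return i;
    return -1;
}

/* Counts page faults under FIFO replacement. */
int fifo_faults(const int *refs, size_t n, int nframes)
{
    int frames[MAX_FRAMES];
    int used = 0, oldest = 0, faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t t = 0; t < n; t++) {
        if (find_page(frames, used, refs[t]) >= 0)
            continue;                        /* hit */
        faults++;
        if (used < nframes) {
            frames[used++] = refs[t];        /* free frame available */
        } else {
            frames[oldest] = refs[t];        /* evict the oldest page */
            oldest = (oldest + 1) % nframes;
        }
    }
    return faults;
}

/* Counts page faults under optimal replacement (needs the whole future). */
int opt_faults(const int *refs, size_t n, int nframes)
{
    int frames[MAX_FRAMES];
    int used = 0, faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t t = 0; t < n; t++) {
        if (find_page(frames, used, refs[t]) >= 0)
            continue;                        /* hit */
        faults++;
        if (used < nframes) {
            frames[used++] = refs[t];
            continue;
        }
        /* Victim: the resident page whose next use is farthest away
           (a page never used again counts as distance n). */
        int victim = 0;
        size_t farthest = 0;
        for (int i = 0; i < nframes; i++) {
            size_t next = t + 1;
            while (next < n && refs[next] != frames[i])
                next++;
            if (next > farthest) {
                farthest = next;
                victim = i;
            }
        }
        frames[victim] = refs[t];
    }
    return faults;
}
```

On the reference string 1,2,3,4,1,2,5,1,2,3,4,5, `fifo_faults` returns 9 with 3 frames and 10 with 4 frames (Belady's Anomaly), while `opt_faults` returns 6 with 4 frames.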
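
Worked example: LRU with the counter implementation
The counter scheme from the "LRU Implementation" slide (copy the clock to the page's counter on every reference, replace the page with the smallest counter) might look like the sketch below. The names `lru_faults`, `last_used`, and `MAX_FRAMES` are assumptions made for illustration. Real hardware would update the counter itself, where this sketch uses the reference index as the clock.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16   /* illustrative upper limit on frames per process */

/* Counts page faults under LRU using one counter (timestamp) per frame. */
int lru_faults(const int *refs, size_t n, int nframes)
{
    int page[MAX_FRAMES];
    size_t last_used[MAX_FRAMES];    /* clock value copied on each reference */
    int used = 0, faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);
    for (size_t now = 0; now < n; now++) {
        int slot = -1;
        for (int i = 0; i < used; i++)
            if (page[i] == refs[now])
                slot = i;                    /* hit */
        if (slot < 0) {
            faults++;
            if (used < nframes) {
                slot = used++;               /* free frame available */
            } else {
                slot = 0;                    /* compare counters: smallest loses */
                for (int i = 1; i < nframes; i++)
                    if (last_used[i] < last_used[slot])
                        slot = i;
            }
            page[slot] = refs[now];
        }
        last_used[slot] = now;               /* copy clock to counter */
    }
    return faults;
}
```

On the reference string 1,2,3,4,1,2,5,1,2,3,4,5 this gives 8 faults with 4 frames, matching the slide, and 10 faults with 3 frames. Adding a frame never increases the count, as expected for a stack algorithm.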
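
Worked example: Second-Chance with a circular queue
The steps on the "Second-Chance" slide (look at the reference bit; if 0, replace; if 1, clear it and move on) can be sketched as follows. `struct clock_state` and `clock_reference` are hypothetical names. Following the "LRU Approximations" slide, a newly loaded page starts with its bit at 0, although some systems start it at 1.

```c
#include <assert.h>

#define MAX_FRAMES 16   /* illustrative upper limit on frames per process */

struct clock_state {
    int page[MAX_FRAMES];
    int ref_bit[MAX_FRAMES];
    int nframes;         /* frames available to the process */
    int used;            /* frames currently filled */
    int hand;            /* next candidate victim in the circular queue */
};

/* References a page; returns 1 on a page fault, 0 on a hit. */
int clock_reference(struct clock_state *s, int page)
{
    assert(s->nframes > 0 && s->nframes <= MAX_FRAMES);
    for (int i = 0; i < s->used; i++) {
        if (s->page[i] == page) {
            s->ref_bit[i] = 1;       /* hardware would set this bit */
            return 0;
        }
    }
    if (s->used < s->nframes) {      /* free frame: no victim needed */
        s->page[s->used] = page;
        s->ref_bit[s->used] = 0;
        s->used++;
        return 1;
    }
    while (s->ref_bit[s->hand]) {    /* bit == 1: clear it, get next in FIFO */
        s->ref_bit[s->hand] = 0;
        s->hand = (s->hand + 1) % s->nframes;
    }
    s->page[s->hand] = page;         /* bit == 0: replace this page */
    s->ref_bit[s->hand] = 0;
    s->hand = (s->hand + 1) % s->nframes;
    return 1;
}
```

Starting from a zero-initialized state with `nframes = 3`, the references 1, 2, 3, 1, 4 leave page 1 resident (its bit was set by the second reference to it) and replace page 2 instead. If every resident page has its bit set, the hand clears them all and evicts the page it started at, which is the FIFO behaviour the slide mentions.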
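
Worked example: equal and proportional allocation
The two examples on the "Fixed Allocation" slide can be reproduced with the sketch below. The function names are illustrative, and rounding to the nearest frame is an assumption: the slides only show the rounded results, and in general rounded shares need not add up exactly to the total.

```c
#include <assert.h>
#include <stddef.h>

/* Equal allocation: returns frames per process, leftover frames go to the pool. */
int equal_allocation(int total_frames, int nprocs, int *pool_out)
{
    assert(nprocs > 0 && total_frames >= 0);
    *pool_out = total_frames % nprocs;
    return total_frames / nprocs;
}

/* Proportional allocation: process i gets about
   size[i] / (sum of sizes) * total_frames, rounded to the nearest frame. */
void proportional_allocation(const int *size, size_t nprocs,
                             int total_frames, int *frames_out)
{
    long total_size = 0;

    for (size_t i = 0; i < nprocs; i++)
        total_size += size[i];
    assert(total_size > 0 && total_frames >= 0);
    for (size_t i = 0; i < nprocs; i++) {
        long scaled = (long)size[i] * total_frames;
        /* integer round-to-nearest of scaled / total_size */
        frames_out[i] = (int)((2 * scaled + total_size) / (2 * total_size));
    }
}
```

For 93 frames and 5 processes, `equal_allocation` returns 18 with 3 frames left in the pool. For 64 frames and sizes 10 and 127, `proportional_allocation` yields 5 and 59 frames.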
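
Worked example: working set size and the thrashing test
The "Working Set Example" slide (T = 5) and the rule "if D > m => thrashing" can be sketched as below. `working_set_size` and `is_thrashing` are hypothetical helper names. The quadratic scan is chosen for clarity and is not how a kernel would track working sets.

```c
#include <assert.h>
#include <stddef.h>

/* Number of distinct pages among the `window` references that end at
   index `end` (inclusive); fewer are examined near the start of the string. */
size_t working_set_size(const int *refs, size_t end, size_t window)
{
    size_t start, distinct = 0;

    assert(window > 0);
    start = (end + 1 > window) ? end + 1 - window : 0;
    for (size_t i = start; i <= end; i++) {
        int seen = 0;
        for (size_t j = start; j < i; j++)
            if (refs[j] == refs[i])
                seen = 1;
        if (!seen)
            distinct++;
    }
    return distinct;
}

/* Returns 1 when total demand D (sum of working-set sizes) exceeds m frames. */
int is_thrashing(const size_t *wss, size_t nprocs, size_t m)
{
    size_t demand = 0;

    for (size_t i = 0; i < nprocs; i++)
        demand += wss[i];
    return demand > m;
}
```

On the slide's reference string with a window of 5, the working set after the 5th reference is {1,2,3} (size 3), after the 12th it is {3,4,7} (size 3), and at the end it is {1,2} (size 2). If three processes had those working-set sizes, 7 frames would not be enough (D = 8 > m = 7), so one process would be suspended.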
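
Worked example: loop order and page faults
The two "Program Structure" slides compare column-first and row-first traversal of `A[1024][1024]` under the stated assumptions (1 frame, 1 row per page). The sketch below counts faults for both orders without allocating the array. `array_sweep_faults` and its `column_first` flag are illustrative names.

```c
#include <assert.h>

/* Counts page faults for zeroing an n x n array when each row is one page
   and the process holds a single frame. */
long array_sweep_faults(int n, int column_first)
{
    long faults = 0;
    int resident_row = -1;           /* row (page) currently in the one frame */

    assert(n > 0);
    for (int outer = 0; outer < n; outer++) {
        for (int inner = 0; inner < n; inner++) {
            /* A[i][j] lives on the page of row i. In column-first order the
               row index i is the inner loop variable, so it changes on
               every access. */
            int row = column_first ? inner : outer;
            if (row != resident_row) {
                faults++;
                resident_row = row;
            }
        }
    }
    return faults;
}
```

`array_sweep_faults(1024, 1)` returns 1,048,576 (1024 x 1024) and `array_sweep_faults(1024, 0)` returns 1024, the two figures given on the slides.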
