Quick viewing(Text Mode)

A Better Signal()POSIX Sigaction()

A Better Signal()POSIX Sigaction()

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth ADRIAN PERRIG & TORSTEN HOEFLER Oldskool: () Networks and Operating Systems (252-0062-00) Chapter 5: Memory Management void (*signal( sig, void (*handler)(int))) (int);

. Unpacking this: . A handler looks like void my_handler(int); . Signal takes two arguments… An integer (the signal type, e.g., SIGPIPE) A pointer to a handler function http://support.apple.com/kb/HT5642 (Jul. 2015) . … and returns a pointer to a handler function “Description: The iOS kernel has checks to validate that the user-mode The previous handler, pointer and length passed to the copyin and copyout functions would not result in a user-mode being able to directly access kernel memory. The checks were not being used if the length was smaller than . “Special” handler arguments: one page. This issue was addressed through additional validation of the . SIG_IGN (ignore), SIG_DFL (default), SIG_ERR (error code) arguments to copyin and copyout.”

2

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Unix signal handlers Implications

. Signal handler can be called at any time! . There is very little you can safely do in a signal handler! . Can’t safely access program global or static variables . Executes on the current user stack . Some system calls are re-entrant, and can be called . If process is in kernel, may need to retry current . Many C calls cannot (including _r variants!) . Can also be set to run on a different (alternate) stack . Can sometimes execute a longjmp if you are careful . With signal, cannot safely change signal handlers…

 User process is in undefined state when signal delivered . What happens if another signal arrives?

3 4

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Multiple signals A better signal()POSIX sigaction()

. If multiple signals of the same type are to be delivered, will New action for discard all but one. signal signo #include . If signals of different types are to be delivered, Unix will deliver Previous action them in any order. int sigaction(int signo, is returned const struct sigaction *act, struct sigaction *oldact); . Serious concurrency problem: Signal How to make sense of this? struct sigaction { handler void (*sa_handler)(int); Signals to be blocked in this sigset_t sa_mask; handler (cf., fd_set) int sa_flags; void (*sa_sigaction)(int, siginfo_t *, void *); }; More sophisticated signal handler (depending on flags)

5 6 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Signals as upcalls Our Small Quiz

. Particularly specialized (and complex) form of Upcall . True or false (raise hand) . Kernel RPC to user process . Mutual exclusion on a multicore can be achieved by disabling . Test and set can be used to achieve mutual exclusion . Other OSes use upcalls much more heavily . Test and set is more powerful than compare and swap . The CPU retries load-linked/store conditional instructions after a conflict . Including Barrelfish . The best spinning time is 2x the context switch time . “Scheduler Activations”: dispatch every process using an upcall instead of return . Priority inheritance can prevent priority inversion . The receiver never blocks in asynchronous IPC . Very important structuring concept for systems! . The sender blocks in synchronous IPC if the receiver is not ready . A pipe file descriptor can be sent to a different process . Pipes do not guarantee ordering . Named pipes in Unix behave like files . A process can catch all signals with handlers . Signals always trigger actions at the signaled process . One can implement a user-level tasking library using signals . Signals of the same type are buffered in the kernel

7 8

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Goals of Memory Management

. Allocate physical memory to applications . Protect an application’s memory from others . Allow applications to share areas of memory . Data, code, etc. . Create illusion of a whole address space . Virtualization of the physical address space . Create illusion of more memory than you have . Tomorrow – demand paging

. Book: OSPP Chapter 8

9 http://en.wikipedia.org/wiki/Imagery 10

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

In CASP last semester we saw: What’s new this semester?

. Assorted uses for virtual memory . Wider range of memory management hardware . paging . Base/limit, segmentation . Page table format . Inverted page tables, etc. . Translation process . How the OS uses the hardware . Translation lookaside buffers (TLBs) . Demand paging and swapping . Interaction with caches . Page replacement algorithms . Performance implications . Frame allocation policies . For application code, e.g., matrix multiply

11 12 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Terminology Memory management

. Physical address: address as seen by the memory unit 1. Allocating physical addresses to applications . Virtual or Logical address: address issued by the processor 2. Managing the name translation of virtual addresses to physical . Loads addresses . Stores 3. Performing access control on memory access . Instruction fetches . Possible others (e.g., TLB fills)… . Functions 2 & 3 usually involve the hardware Memory Management Unit (MMU)

13 14

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Base and Limit Registers

. A pair of base and limit registers define the logical address space

0x0000000 Operating System Simple scheme: partitioned memory 0x1000000 Process 0x5600ba0 0x5600ba0 base Process

0x39ef470 0x8ff0010 Process limit

0xB000000

0xfffffff

15 16

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Issue: address binding Dynamic relocation using a relocation register

. Base address isn’t known until load time . Options: 1. Compiled code must be completely position-independent, or 2. Relocation Register maps compiled addresses dynamically to physical addresses Relocation register 14000 Logical Physical address address CPU + Memory 346 14346

MMU

17 18 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Contiguous allocation Hardware Support for Relocation and Limit Registers

. Main memory usually into two partitions: . Resident OS, usually in low memory with vector . User processes in high memory . Relocation registers protect user processes from

1. each other Limit Relocation 2. changing operating-system code and data register register . Registers: . Base register contains value of smallest physical address Logical Physical . Limit register contains range of logical addresses address yes address CPU Memory each logical address must be less than the limit register < + . MMU maps logical address dynamically no

trap: addressing error

19 20

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Base & Limit summary

. Simple to implement (addition & compare) . Physical memory fragmentation . Only a single contiguous address range . How to share data between applications? . How to share program text (code)? . How to load code dynamically? . Total logical address space ≤ physical memory Segmentation

21 22

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Segmentation User’s View of a Program

. Generalize base + limit: . Physical memory divided into segments . Logical address = (segment id, offset) stack . Segment identifier supplied by: shared library . Explicit instruction reference . Explicit processor segment register symbol . Implicit instruction or process state table

heap main program

logical address

23 24 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Segmentation Reality Segmentation Hardware

1

3 limit base 2 1

2 Segment 3 table CPU s d Physical Memory 0 4 yes address 4 < + 0 no

logical address trap: addressing error physical memory

25 26

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Segmentation Reality Segmentation Architecture

6000 . Segment table – each entry has: 1 . base – starting physical address of segment 5000 . limit – length of the segment 3 2 4600 . Segment-table base register (STBR) limit base 1 0 300 1500 . Current segment table location in memory 1 1000 5000 3800 2 400 3400 2 . Segment-table length register (STLR) 3 3 400 4600 3400 4 1000 1800 . Current size of segment table segment 2800 table segment number s is legal if s < STLR 0 4 4 1800 0 1500

logical address physical memory

27 28

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Segmentation Summary

. Fast context switch . Simply reload STBR/STLR . Fast translation . 2 loads, 2 compares . Segment table can be cached . Segments can easily be shared . Segments can appear in multiple segment tables Paging . Physical layout must still be contiguous . (External) fragmentation still a problem

29 30 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Paging Page table jargon

. Solves contiguous physical memory problem . Page tables maps VPNs to PFNs . Process can always fit if there is available free memory . Page table entry = PTE . Divide physical memory into frames . VPN = Virtual Page Number . Size is power of two, e.g., 4096 bytes . Upper bits of virtual or logical address . Divide logical memory into pages of the same size . PFN = Page Frame Number . For a program of n pages in size: . Upper bits of physical or logical address . Find and allocate n frames . Same number of bits (usually). . Load program . Set up page table to translate logical pages to physical frames

31 32

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Recall: P6 Page tables (32bit) x86-64 Paging

Virtual address 9 9 9 9 12 20 12 VPN1 VPN2 VPN3 VPN4 VPO Logical address: VPN VPO Page Directory Page 20 12 Pointer VPN1 VPN2 PPN PPO Page Map Directory Page Table Table Table Table

PDE p=1 PTE p=1 data

PDBR PM4LE PDPE PDE PTE Page Page table Data page directory BR

. Pages, page directories, page tables all 4kB

40 12 PPN PPO Physical address

33 34

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Problem: performance Translating with the P6 TLB

. Every logical memory access needs more than two physical memory accesses 1. Partition VPN into . Load page table entry → PFN CPU TLBT and TLBI. . Load desired location 2. Is the PTE for VPN 20 12 virtual address . Performance  half as fast as with no translation VPN VPO cached in set . Solution: cache page table entries TLBI? 16 4 3. Yes: Check TLBT TLBI 1 2 permissions, build physical address TLB TLB miss PDE PTE hit 3 4. No: PTE (and ... 20 12 PDE if not cached) partial PPN PPO TLB hit from memory and physical build physical page table translation address 4 address

35 36 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

In fact, x86 combines segmentation and paging Effective Access Time

. Segments do (still) have uses . Associative Lookup =  time units . -local state . Assume memory cycle time is 1 time unit . Sandboxing (Google NativeClient, etc.) . Hit ratio  = . Virtual machine monitors (Xen, etc.) . % time that a page number is found in the TLB; . Depends on locality and TLB entries (coverage) segmentation paging physical CPU logical unit linear unit physical memory address address address Then Effective Access Time:

EAT = (1 + )  + (2 + )(1 – ) = 2 +  –  Assuming single- level page table.

Exercise: work this out for the P6 2-level table 37 38

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Memory Protection

. Associate protection info with each frame . Actually no - with the PTE. . Valid-invalid bit . “valid”  page mapping is “legal” . “invalid”  page is not part of address space, i.e., entry does not exist Page Protection . Requesting an “invalid” address  “fault” . A “”, or….

39 40

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Remember the P6 Page Table Entry (PTE) P6 protection bits

31 12 11 9 8 7 6 5 4 3 2 1 0 31 12 11 9 8 7 6 5 4 3 2 1 0 Page physical base address Avail G 0 D A CD WT U/S R/W P=1 Page physical base address Avail G 0 D A CD WT U/S R/W P=1

Page base address: 20 most significant bits of physical page address (forces Page base address: 20 most significant bits of physical page address (forces pages to be 4 KB aligned) pages to be 4 KB aligned) Avail: available for system programmers Avail: available for system programmers G: global page (don’t evict from TLB on task switch) G: global page (don’t evict from TLB on task switch) D: dirty (set by MMU on writes) D: dirty (set by MMU on writes) A: accessed (set by MMU on reads and writes) A: accessed (set by MMU on reads and writes) CD: cache disabled or enabled CD: cache disabled or enabled WT: -through or write-back cache policy for this page WT: write-through or write-back cache policy for this page U/S: user/supervisor U/S: user/supervisor R/W: read/write R/W: read/write P bit can be used to trap on any access (read or P: page is present in physical memory (1) or not (0) P: page is present in physical memory (1) or not (0) write)

41 42 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Protection information

. Protection information typically includes: . Readable . Writeable . Executable (can fetch to i-cache) . Reference bits used for demand paging

. Observe: same attributes can be (and are) associated with Page sharing segments as well

43 44

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Shared Pages Example Shared Pages Example

text 1 text 1 3 0 3 0 text 2 4 text 2 4 1 data 1 1 data 1 text 3 6 text 3 6 1 2 1 2 data 1 page table data 1 page table 3 code 1 3 code 1 for P1 for P1 text 1 Process P1 Process P1 4 code 2 3 4 code 2 text 2 4 5 text 3 6 5 7 6 code 3 6 code 3 data 2 page table 7 for P2 7 data 2 Process P2 8 8

9 9

10 10

11 11

45 46

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Shared Pages Example Shared Pages

text 1 . Shared code 3 0 . One copy of read-only code shared among processes text 2 4 . Shared code (often) appears in same location in the 6 1 data 1 text 3 logical address space of all processes 1 2 data 3 data 1 page table . Data segment is not shared, different for each process for P 3 code 1 1 text 1 . But still mapped at same address (so code can find it) Process P1 3 4 code 2 text 2 4 text 3 6 5 . Private code and data 7 6 code 3 . Allows code to be relocated anywhere in address space data 2 page table text 1 for P2 7 data 2 3 Process P2 text 2 4 8 text 3 6 2 9 data 3 page table 10 for P3 Process P3 11

47 48 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Per-process protection

. Protection bits are stored in page table . Plenty of bits available in PTEs .  independent of frames themselves . Different processes can share pages . Each page can have different protection to different processes . Many uses! E.g., debugging, communication, copy-on-write, etc. Page Table Structures

49 50

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

Page table structures Page table structures

. Problem: simple linear page table is too big . Problem: simple linear page table is too big . Solutions: . Solutions: 1. Hierarchical page tables 1. Hierarchical page tables Saw these last Semester. 2. Virtual memory page tables 2. Virtual memory page tables (VAX) 3. Hashed page tables 3. Hashed page tables 4. Inverted page tables 4. Inverted page tables

51 52

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

#3 Hashed Page Tables Hashed Page Table

. VPN is hashed into table . Hash bucket has chain of logical->physical page mappings . Hash chain is traversed to find match. physical . Can be fast, but can be unpredicable address r d logical p d . Often used for address . Portability . Software-loaded TLBs (e.g., MIPS)

hash physical q s • p r • memory function • • •

hash table

53 54 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

#4 Inverted Page Table Inverted Page Table Architecture

. One system-wide table now maps PFN -> VPN . One entry for each real page of memory . Contains VPN, and which process owns the page . Bounds total size of all page information on machine . Hashing used to locate an entry efficiently logical physical address address CPU pid p d i d Physical . Examples: PowerPC, ia64, UltraSPARC memory

search i

pid p

page table

55 56

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

The need for more bookkeeping

. Most OSes keep their own translation info . Per-process hierarchical page table () . System wide inverted page table (, MacOS) . Why? . Portability . Tracking memory objects . Software virtual  physical translation TLB shootdown . Physical  virtual translation

57 58

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

TLB management TLB management

. Recall: the TLB is a cache. . Machines have many MMUs on many cores  many TLBs Process ID VPN PPN acce ss . Problem: TLBs should be coherent. Why? . Security problem if mappings change Core 1 0 0x0053 0x03 r/w . E.g., when memory is reused TLB: 1 0x20f8 0x12 r/w

Core 2 0 0x0053 0x03 r/w TLB: 1 0x0001 0x05 read

Core 3 0 0x20f8 0x12 r/w TLB: 1 0x0001 0x05 read

59 60 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

TLB management TLB management

Process ID VPN PPN acce Process ID VPN PPN acce ss ss Change Change to read to read Core 1 0 0x0053 0x03 r/w Core 1 0 0x0053 0x03 r/w only only TLB: 1 0x20f8 0x12 r/w TLB: 1 0x20f8 0x12 r/w

Core 2 0 0x0053 0x03 r/w Core 2 0 0x0053 0x03 r/w TLB: 1 0x0001 0x05 read TLB: 1 0x0001 0x05 read

Core 3 0 0x20f8 0x12 r/w Core 3 0 0x20f8 0x12 r/w TLB: 1 0x0001 0x05 read TLB: 1 0x0001 0x05 read

61 62

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth

TLB management Keeping TLBs consistent

1. Hardware TLB coherence . Integrate TLB mgmt with cache coherence Process ID VPN PPN acce . Invalidate TLB entry when PTE memory changes ss . Rarely implemented Change to read 2. Virtual caches Core 1 0 0x0053 0x03 r/w only . Required cache flush / invalidate will take care of the TLB TLB: 1 0x20f8 0x12 r/w . High context switch cost!  Most processors use physical caches Core 2 0 0x0053 0x03 r/w 3. Software TLB shootdown TLB: 1 0x0001 0x05 read . Most common . OS on one core notifies all other cores - Typically an IPI Core 3 0 0x20f8 0x12 r/w . Each core provides local invalidation TLB: 1 0x0001 0x05 read 4. Hardware shootdown instructions . Broadcast special address access on the bus . Interpreted as TLB shootdown rather than cache coherence message Process 0 on core 1 can only continue once shootdown is complete! . E.g., PowerPC architecture

63 64

spcl.inf.ethz.ch @spcl_eth

Tomorrow: demand paging

65