Administrivia

• Guest lecture Thursday
  - Mark Lentczner (Google) on the Belay project
  - Please attend lecture if at all possible
• Last project due Thursday
  - No extensions unless all non-SCPD group members at lecture
  - If staff grants you an extension, it is valid only if you attend lecture
  - We will have a more stringent enforcement mechanism
• Final Exam
  - Wednesday March 16, 12:15-3:15pm
  - Open book, covers all 19 lectures (possibly including topics already on the midterm)
• Televised final review session Friday
  - Bring questions on lecture material

Confining code with legacy OSes

• Often want to confine code on legacy OSes
• Analogy: Firewalls
  [Figure: a firewall shields a "Hopelessly Insecure Server" from attackers]
  - Your machine runs hopelessly insecure software
  - Can't fix it—no source or too complicated
  - Can reason about network traffic
• Similarly, block untrusted code within a machine
  - By limiting what it can interact with


Using chroot

• chroot (char *dir) "changes root directory"
  - Kernel stores root directory of each process
  - File name "/" now refers to dir
  - Accessing ".." in dir now returns dir
• Need root privs to call chroot
  - But subsequently can drop privileges
• Ideally "chrooted process" wouldn't affect parts of the system outside of dir
  - Even process still running as root shouldn't escape chroot
• In reality, many ways to cause damage outside dir

Escaping chroot

• Re-chroot to a lower directory, then chroot ..
  - Each process has one root directory, so chrooting to a new directory can put you above your new root
• Create devices that let you access raw disk
• Send signals to or ptrace non-chrooted processes
• Create setuid program for non-chrooted proc. to run
• Bind privileged ports, mess with clock, reboot, etc.
• Problem: chroot was not originally intended for security
  - FreeBSD jail, vserver have tried to address problems


System call interposition

• Why not use ptrace or other debugging facilities to control untrusted programs?
• Almost any "damage" must result from a system call
  - delete files → …
  - overwrite files → open/…
  - attack over network → socket/bind/connect/send/recv
  - leak private data → open/read/socket/connect/write. . .
• So enforce policy by allowing/disallowing each syscall
  - Theoretically much more fine-grained than chroot
  - Plus don't need to be root to do it
• Q: Why is this not a panacea?

Limitations of syscall interposition

• Hard to know exact implications of a syscall
  - Too much context not available outside of kernel (e.g., what does this file descriptor number mean?)
  - Context-dependent (e.g., /proc/self/cwd)
• Indirect paths to resources
  - Descriptor passing, core dumps, "unhelpful processes"
• Race conditions
  - Remember difficulty of eliminating TOCTTOU bugs?
  - Now imagine malicious application deliberately doing this
  - Symlinks, directory renames (so ".." changes), . . .
• See [Garfinkel] for a more detailed discussion

Review: What is an OS

• OS is software between applications and reality
  - Abstracts hardware and makes portable
  - Makes finite into (near) infinite
  - Provides protection

What if. . .

• The process abstraction looked just like hardware?


How is a process different from HW?

Process:
• CPU – Non-privileged registers and instructions
• Memory – Virtual memory
• Exceptions – Signals, errors
• I/O – Directory, files, raw devices

Hardware:
• CPU – All registers and instructions
• Memory – Both virtual and physical memory, memory management, TLB/page tables, etc.
• Exceptions – Trap architecture, interrupts, etc.
• I/O – I/O devices accessed using programmed I/O, DMA, interrupts

Virtual Machine Monitor

• Thin layer of software that virtualizes the hardware
  - Exports a virtual machine abstraction that looks like the hardware


Old idea from the 1960s

• See [Goldberg] from 1974
• IBM VM/370 – A VMM for IBM mainframes
  - Multiplex multiple OS environments on expensive hardware
  - Desirable when few machines around
• Interest died out in the 1980s and 1990s
  - Hardware got cheap
  - Compare Windows NT vs. N DOS machines
• Interesting again today
  - Different problems today – software management
  - VMM attributes still relevant

VMM benefits

• Software compatibility
  - Runs pretty much all software
  - Trick: Make virtual hardware match real hardware
• Can get low overheads/high performance
  - Near "raw" machine performance for many workloads
  - With tricks can have direct execution on CPU/MMU
• Isolation
  - Seemingly total data isolation between virtual machines
  - Use hardware protection
• Encapsulation
  - Virtual machines are not tied to physical machines
  - Checkpoint/Migration

OS backwards compatibility

• Backward compatibility is bane of new OSes
  - Huge effort required to innovate but not break
• Security considerations may make it impossible
  - Choice: Close security hole and break apps, or be insecure
• Example: Not all WinNT applications run on WinXP, or XP applications on Vista
  - In spite of a huge compatibility effort
  - Given the number of applications that ran on WinNT, practically any change would break something
  - if (OS == WinNT) . . .
• Solution: Use a VMM to run both WinNT and WinXP
  - Obvious for OS migration as well: Windows → Linux

Logical partitioning of servers

• Run multiple servers on same box (e.g., Amazon EC2)
  - Ability to give away less than one machine
  - Modern CPUs more powerful than most services need
  - 0.10U rack space machine – less power, cooling, space, etc.
  - Server consolidation trend: N machines → 1 real machine
• Isolation of environments
  - Printer server doesn't take down Exchange server
  - Compromise of one VM can't get at data of others (though in practice not so simple because of side-channel attacks [Ristenpart])
• Resource management
  - Provide service-level agreements
• Heterogeneous environments
  - Linux, FreeBSD, Windows, etc.


Complete Machine Simulation

• Simplest VMM approach, used by bochs
• Build a simulation of all the hardware
  - CPU – A loop that fetches each instruction, decodes it, simulates its effect on the machine state
  - Memory – Physical memory is just an array; simulate the MMU on all memory accesses
  - I/O – Simulate I/O devices, programmed I/O, DMA, interrupts
• Problem: Too slow!
  - 100x slowdown makes it not too useful
  - CPU/Memory – 100x for CPU/MMU simulation
  - I/O devices – < 2× slowdown
• Need faster ways of emulating CPU/MMU

Virtualizing the CPU

• Observation: Most instructions are the same regardless of processor privilege level
  - Example: incl %eax
• Why not just give instructions to CPU to execute?
  - One issue: Safety – How to get the CPU back? Or stop it from stepping on us? How about cli/halt?
  - Solution: Use protection mechanisms already in CPU
• Run virtual machine's OS directly on CPU in unprivileged user mode
  - "Trap and emulate" approach
  - Most instructions just work
  - Privileged instructions trap into monitor, which runs the simulator on the instruction
  - Makes some assumptions about architecture

Virtualizing traps

• What happens when an interrupt or trap occurs?
  - Like normal kernels: we trap into the monitor
• What if the interrupt or trap should go to guest OS?
  - Example: Page fault, illegal instruction, system call, interrupt
  - Re-start the guest OS, simulating the trap
• x86 example:
  - Give CPU an IDT that vectors back to VMM
  - Look up trap vector in VM's "virtual" IDT
  - Push virtualized %cs, %eip, %eflags on stack
  - Switch to virtualized privileged mode

Virtualizing memory

• Basic MMU functionality:
  - OS manages physical memory (0. . . MAX_MEM)
  - OS sets up page tables mapping VA → PA
  - CPU accesses to VA should go to PA (Paging off: PA = VA)
  - Used for every instruction fetch, load, or store
• Need to implement a virtual "physical memory"
  - Logically need additional level of indirection: VM's VA → VM's PA → machine address
  - Note "physical memory" no longer means hardware bits – machine memory is hardware bits
• Trick: Use hardware MMU to simulate virtual MMU
  - Can be folded into page tables: VA → machine address

Shadow page tables

• Monitor keeps shadow of VM's page table
  - Shadow PT is map from VA → machine address
• Treat shadow page tables as a cache
  - Have true page faults when a page not in VM's page table
  - Have hidden page faults when access just misses in shadow page table
• On a page fault, VMM must:
  - Look up VPN → PPN in VM's (guest OS's) page table
  - Determine where PPN is in machine memory (MPN)
  - Insert VPN → MPN mapping in shadow page table
  - Note: Monitor can demand-page the virtual machine
• Uses hardware protection
  - Monitor never maps itself into VM's page table
  - Never maps other VMs' memory into VM's page table

Shadow PT issues

• Hardware only ever sees shadow page table
  - Guest OS only sees its own VM page table, never shadow PT
• Consider the following:
  - OS has a page table T mapping V_U → P_U
  - T itself resides at physical address P_T
  - Another page table maps V_T → P_T
  - VMM stores P_U at machine address M_U and P_T at M_T
• What can VMM put in shadow page table?
  - Safe to map V_U → M_U or V_T → M_T
• Not safe to map both simultaneously!
  - If OS writes to P_T, may make V_U → M_U in shadow PT incorrect
  - If OS reads/writes V_U, may require accessed/dirty bits to be changed in P_T (hardware can only change shadow PT)


Tracing

• VMM needs to get control on some memory accesses
• Guest OS changes VM page table
  - OS should use invlpg instruction, which would trap to VMM – but in practice many/most OSes are sloppy about this
  - Must invalidate stale mapping in shadow page table
• Guest OS accesses page when VM PT accessible
  - Accessed/dirty bits in VM PT will no longer be correct
  - Must make VM PT inaccessible in shadow PT
• Solution: Tracing
  - To track page access, make VPN(s) invalid in shadow PT
  - If guest OS accesses page, will trap to VMM w. page fault
  - VMM can emulate the result of memory access & restart guest OS, just as an OS restarts a process after a page fault

Tracing vs. hidden faults

• Suppose VMM never allowed access to VM PTs?
  - Every PTE access would incur the cost of a tracing fault
  - Very expensive when OS changes lots of PTEs
• Suppose VMM allowed access to most page tables (except very recently accessed regions)?
  - Now lots of hidden faults when accessing new region
  - Plus overhead to pre-compute accessed/dirty bits from shadow page tables when switching between VMs
• Makes for complex trade-offs
  - But adaptive binary translation (later) can make this better


I/O device virtualization

• Type of communication:
  - Special instruction – in/out
  - Memory-mapped I/O (PIO)
  - Interrupts
  - DMA
• Virtualization:
  - Make in/out and PIO trap into monitor
  - Run simulation of I/O device
• Simulation:
  - Interrupt – Tell CPU simulator to generate interrupt
  - DMA – Copy data to/from physical memory of virtual machine

CPU virtualization requirements

• Need protection levels to run VMs and monitors
• All unsafe/privileged operations should trap
  - Example: disable interrupt, access I/O dev, . . .
  - x86 problem: popfl (different semantics in different rings)
• Privilege level should not be visible to software
  - Software shouldn't be able to query and find out it's in a VM
  - x86 problem: movw %cs, %ax
• Trap should be transparent to software in VM
  - Software in VM shouldn't be able to tell if instruction trapped
  - x86 problem: traps can destroy machine state (e.g., if internal segment register was out of sync with GDT)
• See [Goldberg] for a discussion
  - Lost art with modern hardware

Binary translation

• Cannot directly execute guest OS kernel code on x86
  - Can maybe execute most user code directly
  - But how to get good performance on kernel code?
• VMware solution: binary translation
  - Don't run slow instruction-by-instruction emulator
  - Instead, translate guest kernel code into code that runs in fully-privileged monitor mode (actually CPL 1, so that the VMM has its own exception stack)
• Challenges:
  - Don't know the difference between code and data (guest OS might include self-modifying code)
  - Translated code may not be the same size as original
  - Prevent translated code from messing with VMM memory
  - Performance, performance, performance, . . .

VMware binary translator (details/examples from [Adams & Agesen])

• VMware translates kernel dynamically (like a JIT)
  - Start at guest eip
  - Accumulate up to 12 instructions until next control transfer
  - Translate into binary code that can run in VMM context
• Most instructions translated identically
  - E.g., regular movl instructions
• Use segmentation to protect VMM memory
  - VMM located in high virtual addresses
  - Segment registers "truncated" to block access to high VAs
  - gs segment not truncated; use it to access VMM data
  - Any guest use of gs (rare) can't be identically translated

Control transfer

• All branches/jumps require indirection
• Original:

      isPrime: mov %edi, %ecx   ; %ecx = %edi (a)
               mov $2, %esi     ; i = 2
               cmp %ecx, %esi   ; is i >= a?
               jge prime        ; jump if yes
               ...

• Translated:

      isPrime': mov %edi, %ecx  ; IDENT
                mov $2, %esi
                cmp %ecx, %esi
                jge [takenAddr]  ; JCC
                jmp [fallthrAddr]

• Source:

      int isPrime (int a)
      {
          for (int i = 2; i < a; i++) {
              if (a % i == 0) return 0;
          }
          return 1;
      }

• Brackets ([. . . ]) indicate continuations
  - First time jumped to, target untranslated; translate on demand
  - Then fix up continuation to branch to translated code
  - Can elide [fallthrAddr] if fallthrough next translated

Non-identically translated code

• PC-relative branches & direct control flow
  - Just compensate for output address of translator on target
  - Insignificant overhead
• Indirect control flow
  - E.g., jump through register (function pointer) or ret
  - Can't assume code is "normal" (e.g., must faithfully ret even if stack doesn't have return address)
  - Look up target address in hash table to see if already translated
  - "Single-digit percentage" overhead
• Privileged instructions
  - Appropriately modify VMM state
  - E.g., cli ⇒ vcpu.flags.IF = 0

Adaptive binary translation

• One remaining source of overhead is tracing faults
  - E.g., when modifying page table or descriptor table
• Idea: Use binary translation to speed up
  - E.g., translate write of PTE into write of guest & shadow PTE
  - Translate PTE read to get accessed & dirty bits from shadow
• Problem: Which instructions to translate?
• Solution: "innocent until proven guilty" model
  - Initially always translate as much code identically as possible
  - Track number of tracing faults caused by an instruction
  - If high number, re-translate to non-identical code
  - May call out to interpreter, or just jump to new code
  - Can be faster than original!

Hardware-assisted virtualization

• Both Intel and AMD now have hardware support
  - Different mechanisms, similar concepts
  - Will discuss AMD in this lecture (see [AMD Vol 2], Ch. 15)
  - For Intel details, see [Intel Vol 3b]
• VM-enabled CPUs support new guest mode
  - This is separate from kernel/user modes in bits 0–1 of %cs
  - Less privileged than host mode (where VMM runs)
  - Some sensitive instructions trap in guest mode (e.g., load %cr3)
  - Hardware keeps shadow state for many things (e.g., eflags)
• Enter guest mode with vmrun instruction
  - Loads state from hardware-defined 1-KiB VMCB data structure
• Various events cause EXIT back to host mode
  - On EXIT, hardware saves state back to VMCB

VMCB control bits

• Intercept vector specifies what ops should cause EXIT
  - One bit for each of %cr0–%cr15 to say trap on read
  - One bit for each of %cr0–%cr15 to say trap on write
  - 32 analogous bits for the debug registers (%dr0–%dr15)
  - 32 bits for whether to intercept exception vectors 0–31
  - Bits for various other events (e.g., NMI, SMI, . . . )
  - Bit to intercept writes to sensitive bits of %cr0
  - 8 bits to intercept reads and writes of IDTR, GDTR, LDTR, TR
  - Bits to intercept rdtsc, rdpmc, pushf, popf, vmrun, hlt, invlpg, int, iret, in/out (to selected ports), . . .
• EXIT code and reason (e.g., which inst. caused EXIT)
• Other control values
  - Pending virtual interrupt, event/exception injection

Guest state saved in VMCB

• Saved guest state
  - Full segment registers (i.e., base, lim, attr, not just selectors)
  - Full GDTR, LDTR, IDTR, TR
  - Guest %cr3, %cr2, and other cr/dr registers
  - Guest %eip and %eflags (%rip & %rflags for 64-bit processors)
  - Guest %rax register
• Entering/exiting VMM more expensive than syscall
  - Have to save and restore large VM-state structure

Hardware vs. Software virtualization

• HW VM makes implementing VMM much easier
  - Avoids implementing binary translation (BT)
• Hardware VM is better at entering/exiting kernel
  - E.g., Apache on Windows benchmark: one address space, lots of syscalls; hardware VM does better [Adams]
  - Apache on Linux w. many address spaces: lots of context switches, tracing faults, etc.; software faster [Adams]
• Fork with copy-on-write bad for both HW & BT
  - [Adams] reports fork benchmark where BT-based virtualization 37× and HW-based 106× slower than native!
• Newer CPUs support nested paging
  - Eliminates tracing faults (& simplifies implementation)
  - But dramatically increases cost of TLB misses—might be a wash


ESX mem. mgmt. [Waldspurger]

• Virtual machines see virtualized physical memory
  - Can let VMs use more "physical" memory than in machine
• How to apportion memory between machines?
• VMware ESX has three parameters per VM:
  - min – Don't bother running w/o this much machine memory
  - max – Amount of "physical" memory VM OS thinks exists
  - share – How much mem. to give VM relative to other VMs
• Straw man: Allocate based on share, use LRU paging
  - OS already uses LRU ⇒ double paging
  - OS will re-cycle whatever "physical" page VMM just paged out
  - So better to do random eviction
• Next: 3 cool memory management tricks

Reclaiming pages

• Idea: Have guest OS return memory to VMM
  - Then VMM doesn't have to page memory to disk
• Normally OS just uses all available memory
  - But some memory much more important than other memory
  - E.g., buffer cache may contain old, clean buffers; OS won't discard if it doesn't need memory. . . but VMM may need memory
• ESX trick: Balloon driver
  - Special pseudo-device driver in supported guest OS kernels
  - Communicates with VMM through special interface
  - When VMM needs memory, balloon driver allocates many pages in guest OS
  - Balloon driver tells VMM to re-cycle its private pages

Sharing pages across VMs

• Often run many VMs with same OS, programs
  - Will result in many machine pages containing same data
• Idea: Use 1 machine page for all copies of phys. page
• Keep big hash table mapping: Hash(contents) → info
  - If machine page mapped once, info is VM/PPN where mapped. In that case, Hash is only a hint, as page may have changed
  - If machine page mapped copy-on-write as multiple physical pages, info is just reference count
• Scan OS pages randomly to populate hash table
• Always try sharing a page before paging it out

Idle memory tax

• Need machine page? What VM to take it from?
• Normal proportional share scheme
  - Reclaim from VM with lowest "shares-to-pages" (S/P) ratio
  - If A & B both have S = 1, reclaim from larger VM
  - If A has twice B's share, can use twice the machine memory
• High-priority VMs might get more mem. than needed
• Solution: Idle-memory tax
  - Use statistical sampling to determine a VM's % idle memory (randomly invalidate pages & count the number faulted back)
  - Instead of S/P, reclaim from VM w. lowest S/(P(f + k(1 − f))); f = fraction of non-idle pages, k = "idle page cost" parameter
  - Be conservative & overestimate f to respect priorities (f is max of slow, fast, and recent memory usage samples)