Operating Systems Engineering


Recitation 2: Boot and first process
Based on MIT 6.828 (2014, lec3-5)

Focus on xv6
• An educational OS based on UNIX V6
• Only a few abstractions / services
  • Processes
  • File system
  • I/O (via file descriptors)

Power on (“boot”) the machine
● Initial state
  – nothing in RAM, need to read kernel from disk
● BIOS in charge, starts bootloader (sheets 84-85)
  – copies first “boot” sector to 0x7c00 in memory
  – boot sector = bootasm.S + bootmain.c
  – executing at end of bootasm.S
  – stack below 0x7c00, so we can call into C
  – call bootmain

xv6 – bootloader (1)
● Bootloader in charge:
  – bootmain() – sheet 85
● Two jobs:
  – copy kernel from disk to memory (0x100000 phys addr)
    ● not a file: raw disk sectors
    ● linker writes the kernel in ELF format
  – jump to kernel's first instruction
    ● elf->entry
    ● objdump -f kernel or readelf -e kernel
    ● kernel.ld and entry.S

xv6 – bootloader (2)
● Why phys address 0x100000 (1MB)?
  – can't use 0x0: memory-mapped devices in 0x0–1MB
  – could we use 0x200000 (2MB)?
    ● yes, it is possible
● Bootloader can load the kernel at a physical address if
  – it is a DRAM address
  – the kernel is able to find itself there
● 0x100000 satisfies both conditions

xv6 – bootloader (3)
● Where does bootmain jump?
  – elf->entry from the ELF header
  – not 0x100000 but 0x10000c
    ● this is "start", in entry.S, sheet 10
  – linker put 0x10000c in the ELF header
    ● (gdb) b *0x10000c (entry.S)
    ● (gdb) si
    ● continue and then si to jmp *%eax
    ● (gdb) #0 0x801033b2 in main () at main.c:19

xv6 – processes
• Process has user space memory:
  – instructions – actual computation flow
  – data – variables used in computation
  – stack – organizes procedure calls
• Per-process state private to the kernel
  – page table
  – kernel stack
  – file descriptor table

xv6 – Process Isolation
• Prevent process X from spying on Y
• Prevent process X from corrupting Y
  – Separated memory, file descriptors
  – Prevent resource exhaustion (fairness)
• Protect kernel from processes
• Defensive tactic
  – Against buggy programs
  – Against malicious programs

xv6 – Isolation Mechanisms
• User/Kernel mode flag
• System call abstraction
• Address spaces
• Timeslicing

User/Kernel Mode Flag
• Called CPL in x86
• Bottom two bits of the cs register (cs: CPL)
• CPL=0 – kernel mode – privileged
• CPL=3 – user mode – not privileged

User/Kernel Mode Flag
● CPL is the basis of almost every isolation mechanism
  – CPL in low 2 bits of cs
  – CPL=0 -> can modify cr*, devices, can use any PTE
  – CPL=3 -> can't modify cr*, or use devices, and PTE_U enforced
● CPL controls:
  – Writes to control registers (cr*) and direct changes to CPL in cs
  – Writes to certain flags
  – Memory access
  – I/O port access
● However, setting CPL=3 is not enough
● Kernel needs to manage policy

System calls
● Call from user to kernel
  – needs to change CPL
● Can this be done?
  – set CPL=0
  – jump sys_open()
● How about a combined instruction that forces the user to jump to a kernel address?

System calls - x86 solution
● Kernel sets allowed entry points
● int instruction sets CPL=0 and jumps
  – saves the values of cs and eip on stack
  – system call returns with iret
  – restores old cs and eip
● Should these instructions be privileged?
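
The CPL rule above (CPL lives in the low two bits of cs; 0 is kernel, 3 is user) can be sketched in C. This is a minimal illustration, not xv6 code: the helper names cpl_of, is_user_mode and make_selector are hypothetical, and only the selector bit layout (table index in bits 3 and up, table indicator in bit 2, privilege level in bits 0-1) comes from the x86 architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPL_MASK   0x3u  /* low two bits of a segment selector */
#define CPL_KERNEL 0u    /* privileged */
#define CPL_USER   3u    /* not privileged */

/* Hypothetical helper: privilege level encoded in a cs selector value. */
static inline unsigned cpl_of(uint16_t cs)
{
    return cs & CPL_MASK;
}

/* Hypothetical helper: true when the selector denotes user mode. */
static inline bool is_user_mode(uint16_t cs)
{
    return cpl_of(cs) == CPL_USER;
}

/* Hypothetical helper: build a GDT selector (table-indicator bit = 0)
 * from a descriptor index and a privilege level. */
static inline uint16_t make_selector(unsigned gdt_index, unsigned pl)
{
    assert(pl <= CPL_MASK);
    return (uint16_t)((gdt_index << 3) | pl);
}
```

For instance, make_selector(1, CPL_KERNEL) yields 0x08 and make_selector(3, CPL_USER) yields 0x1B. User code cannot simply write such a value into cs to gain privilege; the hardware only changes CPL through controlled paths such as int and iret.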
xv6 – First Process (1)
● Each process's state is kept in struct proc (2103)
● Process states
  – UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE
● Each process address space maps
  – program memory (<0x80000000)
  – kernel instructions and data (>0x80100000)

xv6 – First Process
● Each process has two stacks: user and kernel
● Thread of execution (aka thread)
  – p->kstack + code
  – p->state
  – p->pgdir
● When in user mode, the kernel stack is empty
● When in kernel (syscall), the user stack contains data but is not used
  – thread state stored in kernel stack:
    ● local variables
    ● return address

xv6 – Virtual Memory Layout
● 0x80000000=KERNBASE
● memlayout.h (0207)

xv6 – First Address Space
● main.c → main() sheet 12
● First process (see userinit, 2252)
  – allocproc() sheet 22, sets up stack for "returning" to user space
    ● save address of trapret (3027)
    ● p->context->eip = forkret (2533), so ret will run forkret
    ● forkret will return to trapret (3027), then to user space
  – Fill in kernel part of address space (setupkvm)
  – Fill in user part of address space
    ● 1 page containing initcode (see initcode.S)
  – Set up trapframe to exit kernel
    ● User-mode bit
    ● tf->eip = 0 (beginning of initcode.S)
    ● User stack lives at top of the 1 page of initcode
  – Set process to runnable

xv6 – trap frame
● Kernel stack after allocproc()
● Ready to return to user space
● trapasm.S (3027):
    trapret:
      popal
      popl %gs
      popl %fs
      popl %es
      popl %ds
      addl $0x8, %esp  # trapno and errcode
      iret

xv6 – First System Call exec()
● initcode.S (7708) calls exec “/init”
● Instruction int enters kernel again
● System call exec (3207)
  – replaces initcode with /init binary
  – runs /init which
    ● creates new console
    ● starts shell
    ● handles orphaned zombies
● system is up

Questions?
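
The trapframe setup listed under “xv6 – First Address Space” (user-mode bit, tf->eip = 0, user stack at the top of the single initcode page) can be sketched as follows. This is a simplified illustration rather than the xv6 source: struct mini_trapframe and init_first_trapframe are hypothetical names, the struct holds only the fields discussed on the slides, and the GDT indices SEG_UCODE and SEG_UDATA are assumed values that differ between xv6 revisions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE    4096u        /* one page */
#define FL_IF     0x00000200u  /* x86 EFLAGS interrupt-enable bit */
#define DPL_USER  0x3u         /* user privilege level */
/* Assumed GDT indices; the real values depend on the xv6 revision. */
#define SEG_UCODE 4u
#define SEG_UDATA 5u

/* Hypothetical, trimmed-down trapframe: only the fields that matter
 * for the first return to user space. */
struct mini_trapframe {
    uint32_t eip;     /* where user execution resumes */
    uint16_t cs;      /* code selector, low 2 bits = CPL after iret */
    uint32_t eflags;
    uint32_t esp;     /* user stack pointer */
    uint16_t ss, ds, es;
};

/* Fill the trapframe so that iret "returns" into initcode at address 0
 * in user mode, with the stack at the top of the single user page. */
static void init_first_trapframe(struct mini_trapframe *tf)
{
    assert(tf != NULL);
    memset(tf, 0, sizeof *tf);
    tf->cs = (uint16_t)((SEG_UCODE << 3) | DPL_USER); /* user-mode bit */
    tf->ds = (uint16_t)((SEG_UDATA << 3) | DPL_USER);
    tf->es = tf->ds;
    tf->ss = tf->ds;
    tf->eflags = FL_IF;  /* interrupts enabled in user mode */
    tf->esp = PGSIZE;    /* stack grows down from the top of the page */
    tf->eip = 0;         /* beginning of initcode.S */
}
```

The values popped by trapret and iret come straight from this structure, which is why a process that has never run in user space can still "return" to it.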
Backup Slides

xv6 – A Monolithic Kernel
• Kernel is a big program
• Contains all services and low-level hardware mechanisms
• Entire kernel runs with full privileges
• Pros
  – easy kernel subsystem interactions
• Cons
  – complex interactions => bugs => system crash
  – no isolation in the kernel
• Examples: Unix, Linux, BSD family, Solaris, xv6

Micro Kernel
• Kernel is a small program
  – A microkernel tries to run most services as daemons in user space.
• Only the kernel runs with full privileges
• A microkernel is essentially a high-speed context-switching engine
• Pros
  – complex service interactions => bugs => service crash, but system stays alive!
  – kernel isolated from services, services isolated from user
• Cons
  – complex OS subsystem interactions using IPC
  – a lot of messaging and context switching involved
● Examples: MINIX, QNX, L4

Exo Kernel
• Kernel is a very small program
  – the concept of an exokernel is orthogonal to that of micro- vs. monolithic kernels
  – there are no forced abstractions
  – security separated from abstraction
• Kernel and users can run with full privileges
• Pros
  – simplicity and performance
  – freedom: users can implement their own optimal subsystems
• Cons
  – additional effort from users and system maintainers
● Examples: JOS, Nonkernel, BareMetal OS