CS 350 Operating Systems Spring 2021

1. Operating systems overview

What is an OS?

[Figure: applications running directly on hardware (Von Neumann model).]

• Can applications/programs run directly on hardware?
  – Yes, but normally they do not. Why?
  – Because it is too complex: developers would have to deal with the hardware themselves.
  – It also limits the applications’ portability.

• The code for hardware interaction is used repeatedly by every application running on the same hardware.
• Is there a more efficient design?
• How about separating this code from the applications?

[Figure: the OS as a layer between the applications and the hardware.]

• An OS is a software layer between the hardware and the applications/programs.
  – Its purpose is to make it easy to run applications/programs.
  – But how?

[Figure: App1, App2, and App3 running on an OS that provides CPU virtualization, memory virtualization, and disk virtualization on top of the hardware.]

• A general technique in an OS: Virtualization
  – The OS takes a physical resource (e.g., processor, memory, or a disk) and transforms it into a more general, powerful, and easy-to-use virtual form of itself.
  – Hardware details are abstracted away and hidden from application developers.

[Figure: App1, App2, and App3 reach the OS through interfaces or system calls; the OS provides CPU virtualization, memory virtualization, and disk virtualization on top of the hardware.]

• A general technique in an OS: Virtualization
  – To let applications make use of virtual resources (e.g., running a program, allocating memory, or accessing a file), the OS also provides some interfaces (system calls).
  – A typical OS exports a few hundred system calls that are available to applications.
  – In this sense, an OS provides a standard library to applications.
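As a rough illustration of applications reaching virtual resources through OS interfaces, the sketch below uses Python's `os` and `mmap` modules, whose functions are thin wrappers around system calls on typical systems. The function name `use_os_interfaces` and the sample message are hypothetical, chosen only for this example.

```python
import mmap
import os


def use_os_interfaces(message: bytes) -> dict:
    """Exercise a few OS services through Python's system-call wrappers."""
    # Ask the OS for the identity of the running process.
    pid = os.getpid()

    # Ask the OS for one page of anonymous memory (not backed by a file).
    page = mmap.mmap(-1, mmap.PAGESIZE)
    page[: len(message)] = message
    from_memory = bytes(page[: len(message)])
    page.close()

    # Ask the OS for a pipe, then move bytes through it with write and read.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, message)
    os.close(write_fd)
    from_pipe = os.read(read_fd, len(message))
    os.close(read_fd)

    return {"pid": pid, "from_memory": from_memory, "from_pipe": from_pipe}


if __name__ == "__main__":
    print(use_os_interfaces(b"hello, OS"))
```

In each step the program never touches the hardware itself; it only asks the OS for a service and receives a result.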

[Figure: App1 and App2 share the CPU through the OS (interfaces or system calls, CPU virtualization) on top of the hardware.]

• The second technique/feature in an OS: Concurrency
  – An OS serves as a resource manager that allows concurrent programs/users to share the hardware resources.
  – It allocates resources to each application and guarantees isolation among the applications with concurrency-control mechanisms.

[Figure: App1, App2, and App3 reach the OS through interfaces or system calls; the OS provides memory virtualization over volatile hardware and disk virtualization over persistent hardware.]

• The third technique in an OS: Persistence
  – Data can be easily lost, as devices such as DRAM store values in a volatile manner.
  – We need hardware (e.g., HDD or SSD) and software (e.g., file systems) to be able to store data persistently.
  – Persistence is about storing data in an organized, meaningful manner that is easy for applications/users to access and that protects the data from corruption and loss.

In summary: What is an OS?

• A software layer between the hardware and the application programs, which virtualizes hardware resources and provides a virtualization interface.
• A resource manager that allows concurrent programs to share the hardware resources.
• It also ensures that information is stored persistently in the computer.
• Three themes of operating systems: virtualization, concurrency, and persistence.

Theme 1: Virtualization

– Make one physical CPU look like multiple virtual CPUs
  • Via the abstraction of “process”
– Make physical memory (RAM) look like multiple virtual memory spaces
  • Via the abstraction of “address space”
– Make a physical disk look like a file system
  • Physical disk = raw bytes.
  • File system = user’s view of data on disk.

Virtualization example – Virtualizing the CPU

[Code listing: cpu. (simple code that loops and prints)]

Run one instance of the “cpu” program on a single-core CPU: the CPU runs the program, and the messages the program wants to print appear on the terminal.

Run many instances of the “cpu” program on a single-core CPU: even though we have only one processor, somehow all four of these programs seem to be running at the same time! How?

The OS, with some help from the hardware, turns (or virtualizes) a single CPU (or a small set of them) into a seemingly infinite number of virtual CPUs, thus allowing many programs to seemingly run at once → virtualizing the CPU.
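The slide's “cpu” listing did not survive extraction, so the following is only a stand-in sketch, not the original program. It starts several child processes at once and records their output in arrival order. The names `CHILD_CODE` and `run_instances`, the labels, and the 0.05-second sleep are all assumptions made for illustration (the original program presumably loops and prints, as the slide title says).

```python
import os
import subprocess
import sys

# Child program: a rough stand-in for the slide's "cpu" program.
# It prints its label a fixed number of times, pausing between prints.
CHILD_CODE = (
    "import sys, time\n"
    "for _ in range(int(sys.argv[2])):\n"
    "    print(sys.argv[1], flush=True)\n"
    "    time.sleep(0.05)\n"
)


def run_instances(labels, repeats=3):
    """Start one process per label at the same time; return lines as they arrived."""
    read_fd, write_fd = os.pipe()
    children = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, label, str(repeats)],
            stdout=write_fd,  # every child writes into the same pipe
        )
        for label in labels
    ]
    os.close(write_fd)  # the parent keeps only the read end
    with os.fdopen(read_fd) as shared_output:
        lines = shared_output.read().split()  # reads until all children exit
    for child in children:
        child.wait()
    return lines


if __name__ == "__main__":
    print(run_instances(["A", "B", "C", "D"]))
```

On most machines the returned list mixes the labels rather than showing all of one program's output first, which is the "running at the same time" effect described above. The exact order depends on the OS scheduler and can change from run to run.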

Virtualization example – Virtualizing Memory

Where is the variable p stored?

Run a single instance of the “mem” program, then run two instances of the “mem” program.

In the two-instance case, each running program has allocated memory at the same address (00200000), and each seems to be updating the value at 00200000 independently!
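The “mem” listing is not recoverable from the slides, so here is a hedged sketch of the same observation using a different technique: instead of launching two separate instances, it calls `os.fork()` (POSIX only) so that parent and child hold a variable at the same virtual address. The function name `fork_and_update` and the increments of 1 and 100 are hypothetical.

```python
import ctypes
import os


def fork_and_update():
    """Return (parent_addr, parent_value, child_addr, child_value)."""
    p = ctypes.c_int(0)          # a C int living at some virtual address
    parent_addr = ctypes.addressof(p)
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: update its own copy and report address and value to the parent.
        try:
            os.close(read_fd)
            p.value += 100
            report = f"{ctypes.addressof(p)} {p.value}".encode()
            os.write(write_fd, report)
            os.close(write_fd)
        finally:
            os._exit(0)          # leave immediately, skipping parent-only cleanup

    # Parent: collect the child's report, then update its own copy.
    os.close(write_fd)
    data = b""
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    p.value += 1

    child_addr, child_value = (int(field) for field in data.split())
    return parent_addr, p.value, child_addr, child_value


if __name__ == "__main__":
    print(fork_and_update())
```

Both processes report the same address, yet the parent ends with 1 and the child with 100: each update went to that process's own private copy behind the shared virtual address.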

Virtualizing memory:
- Each process accesses its own private virtual memory address space (sometimes just called its address space), which the OS maps onto the physical memory of the machine.
- A memory reference within one running process does not affect the address space of other processes (or the OS itself); as far as the running process is concerned, it has physical memory all to itself.

Theme 2: Concurrency

• Virtualization allows multiple concurrent programs to share the virtualized hardware resources, which brings up another class of topics: concurrency.
• Examples
  – One physical CPU runs many processes
  – One process runs many threads
  – One OS juggles process execution, system calls, interrupts, exceptions, etc.
• There’s a LOT of concurrency in modern computer systems.
• And it’s the source of most of the system complexity.

Concurrency Example

[Code listing: a function invoked by the threads increments a shared counter (counter++); main creates and waits for two threads, p1 and p2.]

Steps in main:
• Create the first thread, p1
• Create the second thread, p2
• Wait till the end of thread p1
• Wait till the end of thread p2
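Since the slide's listing is lost, the sketch below only mirrors the steps described for main, using Python's `threading` module rather than the original code. The names `worker` and `run_threads`, the `loops` parameter, and the optional lock are assumptions added for illustration.

```python
import threading

counter = 0  # shared by both threads


def worker(loops, lock):
    """Function invoked by each thread: increment the shared counter."""
    global counter
    for _ in range(loops):
        if lock is None:
            counter += 1          # unsynchronized read-modify-write
        else:
            with lock:            # only one thread at a time may update
                counter += 1


def run_threads(loops, use_lock):
    """Create p1 and p2, wait for both, and return the final counter."""
    global counter
    counter = 0
    lock = threading.Lock() if use_lock else None
    p1 = threading.Thread(target=worker, args=(loops, lock))  # first thread
    p2 = threading.Thread(target=worker, args=(loops, lock))  # second thread
    p1.start()
    p2.start()
    p1.join()  # wait till the end of thread p1
    p2.join()  # wait till the end of thread p2
    return counter


if __name__ == "__main__":
    print("with lock:   ", run_threads(100_000, use_lock=True))
    print("without lock:", run_threads(100_000, use_lock=False))
```

With the lock, the result is always twice `loops`. Without it, the result may or may not fall short, depending on the interpreter version and on how the threads happen to be scheduled.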

Expected output?

Reality?

What happened? A key part of the program above, where the shared counter is incremented, takes three instructions: one to load the value of the counter from memory into a register, one to increment it, and one to store it back into memory. Because these three instructions do not execute atomically (all at once), strange things can happen.

• What happened? The statement counter++ takes three instructions:
  – Inst1: Load counter from memory to CPU register
  – Inst2: Increment counter by 1
  – Inst3: Store it back from CPU register to memory

With counter = 100 initially, the instructions of threads P1 and P2 can interleave as follows (Register is the register of the thread executing the step; MEM is counter in memory after the step):

Step        Register    MEM
P1: Inst1   100         100
P2: Inst1   100         100
P1: Inst2   100->101    100
P2: Inst2   100->101    100
P1: Inst3   101         101
P2: Inst3   101         101

Both threads loaded 100, so both store 101: one increment is lost, and counter ends at 101 instead of 102. Because the three instructions do not execute atomically (all at once), “strange” things can happen.
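The interleaving described here can be replayed deterministically with a small simulation. This is a sketch: `run_schedule`, `INTERLEAVED`, and `SERIAL` are hypothetical names, and a dictionary stands in for memory and for the per-thread registers.

```python
def run_schedule(schedule, counter=100):
    """Replay (thread, instruction) steps; return the final counter in memory."""
    memory = {"counter": counter}
    registers = {}  # one private register per thread
    for thread, instruction in schedule:
        if instruction == "load":         # Inst1: memory -> register
            registers[thread] = memory["counter"]
        elif instruction == "increment":  # Inst2: register + 1
            registers[thread] += 1
        elif instruction == "store":      # Inst3: register -> memory
            memory["counter"] = registers[thread]
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory["counter"]


# The order from the slide: both threads load before either one stores.
INTERLEAVED = [
    ("P1", "load"), ("P2", "load"),
    ("P1", "increment"), ("P2", "increment"),
    ("P1", "store"), ("P2", "store"),
]

# One thread finishes all three instructions before the other starts.
SERIAL = [
    ("P1", "load"), ("P1", "increment"), ("P1", "store"),
    ("P2", "load"), ("P2", "increment"), ("P2", "store"),
]

if __name__ == "__main__":
    print(run_schedule(INTERLEAVED))  # 101: one increment is lost
    print(run_schedule(SERIAL))       # 102: both increments take effect
```

The only difference between the two schedules is the order of the steps, which is why making the load-increment-store sequence atomic restores the expected result.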

One key question in the concurrency case is how to ensure correct execution.

Theme 3: Persistence

• Storing data “forever”
  – On hard disks, SSDs, CDs, floppy disks, tapes, phono discs, paper!
• But it’s not enough to just store raw bytes
• Users want to
  – Organize data (via file systems)
  – Share data (via network or cloud)
  – Access data easily
  – Recover data when lost
  – Protect data from being stolen

Persistence Example

[Code listing: create a file, write data to the file, close the file.]

On the one hand, the file system needs to figure out where on disk this new data will reside and then keep track of it in various on-device structures the file system maintains. On the other hand, the file system provides a standard and simple way to access devices through its system calls.
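The create, write, and close steps can be sketched with Python's `os` module, whose `open`, `write`, `fsync`, and `close` functions wrap the corresponding system calls. This is not the slide's original listing; the function name `write_file`, the file name `demo.txt`, and the added `fsync` step are assumptions for illustration.

```python
import os
import tempfile


def write_file(path, data: bytes) -> int:
    """Create (or truncate) a file, write data to it, and close it."""
    # Create the file; 0o600 gives the owner read and write permission.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, data)  # write data to the file
        os.fsync(fd)                  # ask the OS to push it to the device
    finally:
        os.close(fd)                  # close the file
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "demo.txt")
        print(write_file(target, b"hello world\n"), "bytes written")
```

The program only names the file and hands over bytes; deciding where on the device the data lives, and tracking it afterwards, is left entirely to the file system.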

History of OSes

• 1950s: Early operating systems were simple batch processing systems.
• 1960s to 1970s: Multi-programming on mainframes
  – Concurrency, kernel mode, system calls, hardware privilege levels, trap handling
  – Early systems included Multics and the OSes on IBM mainframes
  – Which led to the first UNIX OS, which pioneered file systems, shell, pipes, and the C language.
• 1980s: Personal computing era
  – MacOS, IBM PC and its DOS, Windows, and so forth.
  – Unfortunately, many lessons from the earlier multi-programming era were forgotten and had to be re-learned (painfully).
  – The 1980s also saw fragmentation into many proprietary OSes.
• 1990s: BSD and Linux
  – Open source.
  – Led the way to modern OSes and cloud platforms
  – Wider adoption of threads and parallelism
• 2000s: Mobile device OSes and hypervisors
  – Android, iOS
  – VMWare ESX, KVM, etc.

Evolution of Operating Systems

A major OS will evolve over time for a number of reasons:
• Hardware upgrades
• New types of hardware
• New services
• Fixes

The need to change an OS regularly places certain requirements on its design. An obvious statement is that the system should be modular in construction, with clearly defined interfaces between the modules.

Architecting OSes

• Three basic approaches
1. Monolithic kernels
  • All functionalities are compiled together
  • All code runs in privileged kernel-space
2. Microkernels
  • Only essential functionalities are compiled into the kernel
  • All other functionalities run in unprivileged user-space
3. Hybrid kernels
  • Most functionalities are compiled into the kernel
  • Some functions are loaded dynamically
  • Typically, all functionality runs in kernel-space

Monolithic Kernel

[Figure: kernel space holds the monolithic kernel code (CPU scheduling, memory manager, program loader, error handling, security policies, system APIs, file systems, device drivers); user space holds the user program.]

Advantages
• Single code base eases kernel development
• Robust APIs for application developers
• No need to find separate device drivers
• Fast performance due to tight coupling

Disadvantages
• Large code base, hard to check for correctness
• Bugs crash the entire kernel (and thus, the machine)

Microkernel

[Figure: kernel space holds the kernel code (memory manager, CPU scheduling, interprocess communication); user space holds the file system, disk driver, networking service, network card driver, user program 1, and user program 2.]

Advantages
• Small code base, easy to check for correctness
• Excellent for high-security systems
• Extremely modular and configurable
  – Choose only the pieces you need for embedded systems
  – Easy to add new functionality (e.g., a new file system)
• Services may crash, but the system will remain stable

Disadvantages
• Performance is slower: many context switches
• No stable APIs, more difficult to write applications

Hybrid Kernel

[Figure: kernel space holds the kernel code (memory manager, CPU scheduling, security policies, error handling, file system, system APIs, program loader) together with third-party code (device drivers, file systems); user space holds the user program.]

• Microkernels: small code base, few features
  – Research kernels: L4, GNU Hurd
  – Kernels for embedded systems: QNX
• Hybrid kernels: pretty large code base, some features delegated
• Monolithic kernels: huge code base, many features

Summary

• Three themes of operating systems: virtualization, concurrency, and persistence.
• History and evolution of operating systems
• Three approaches to designing kernels (the “live” portion of an operating system).

References

• Operating Systems: Three Easy Pieces, by Prof. Remzi Arpaci-Dusseau and Prof. Andrea Arpaci-Dusseau. http://pages.cs.wisc.edu/~remzi/OSTEP/intro.pdf
