(Kernel) System Call
CS 350 Operating Systems, Spring 2021
7-8 System Call

Process Execution
• The execution of a process is made up of two parts:
  • the program's code
  • the system's code, reached through a set of system calls
• A view of an OS:
  • It serves as a library providing system-level services
  • It is part of the running processes
[Figure: processes, each with a program counter (PC), calling fork() and pipe() in the operating system]

Process Execution
• However, the program's code and the OS's code need to be treated differently while being executed
  • The program's code runs in user mode
  • The OS's code runs in kernel mode
• User mode and kernel mode are two processor modes
  • They specify what types of instructions can be executed
  • They determine which memory regions can be accessed
• It is the "system call" interface that "stitches" together a process's code and the OS code (and the two modes), forming the complete execution flow of a process.
[Figure: timeline of the CPU alternating between program code in user mode and OS code in kernel mode; each system call enters the kernel and each ret returns to user mode]

The need for protection
When running user-provided processes, the OS needs to protect itself and other system components
• for reliability: buggy user programs
  • Exceptions (division by 0, buffer overrun, NULL dereference, …)
  • Resource hogs (infinite loops, memory exhaustion, …)
  • Despite these, the OS cannot crash; it must keep serving other processes and users
• for security: malicious user programs
  • Reading or writing the OS's or another process's data without permission
  • The OS must check, and the checking code must not be open to tampering

Dual-mode operation
• Allows the OS to protect itself and other system components
• User mode and kernel mode (two modes of the processor)
  • User-provided code runs in user mode, with low privilege
  • Kernel-related tasks run in kernel mode, with high privilege
• How?
  • 1) Mode bit(s) for the processor
    • Provide the ability to distinguish whether the processor is running user code or kernel code
  • 2) Protection bit(s) for memory
    • Provide the ability to distinguish whether memory holds user instructions/data or kernel instructions/data
  • 3) Instructions
    • Privileged instructions: those accessing or changing system state or critical resources
    • All others are normal instructions

Dual-mode operation
• Two main protection policies are enforced:
  • Privileged instructions cannot be executed in user mode
  • User code cannot access kernel memory
• Any violation is detected by hardware, automatically
  • The hardware checks privileges before execution
  • Usually, a violation causes the process to be killed (by the OS)
[Figure: the CPU in user mode and in kernel mode, shown against privileged instructions, normal instructions, user memory, and kernel memory]

Dual-mode operation
• But what should a user process do when it wishes to perform privileged operations, such as file management, process creation, or shared-memory setup?
• To perform privileged operations, a process must transition into the OS through well-defined interfaces: system calls
• System calls allow the kernel to carefully expose certain key pieces of functionality to user programs
• System calls stitch together the execution of the user process and the control of the OS
[Figure: processes Proc1 to Proc4 above the system call interface; below it, operating system services: file systems, memory management, fork, IPC]

More about system calls

System calls
• A special type of protected procedure call that allows user-level processes to request services from the kernel.
• A way to control how critical resources are accessed and shared among multiple processes
• System calls provide:
  • An abstraction layer between processes and hardware
  • The kernel provides and controls resource access
  • These kernel services are exposed to user processes through a well-defined interface
• Advantages of this design
  • Protection: hardware resources can be safely shared among multiple processes
  • Reusability: the same functionality can be shared by independent programs
  • Reliability: developed by experts

Invoking system calls
[Figure: in user mode (restricted privileges), an application calls xyz(), a wrapper routine in the standard C library that sets %eax to the system call number and executes int 0x80; in kernel mode (unrestricted privileges), the system call handler system_call invokes the service routine sys_xyz() and returns with iret]

Points to note
• System calls are actually invoked inside C library wrapper functions (ordinary procedure calls)
• These wrappers prepare the system call arguments and finally invoke a "trap" instruction (e.g., int)
• The trap instruction jumps into the kernel and raises the privilege level to kernel mode
• Finally, the trap instruction locates the system call entry function
• Once in the kernel, the system can perform whatever privileged operations are needed, and thus do the required work for the calling process.
• When finished, the OS executes a special "return-from-trap" instruction (e.g., iret), which returns into the calling user process while simultaneously reducing the privilege level back to user mode.
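
The wrapper-then-trap flow described above can be observed from user space. The following is a minimal sketch, assuming a Linux system whose C library provides the generic syscall() function and the SYS_getpid constant in <sys/syscall.h>. The helper names raw_getpid and wrapper_matches_raw are hypothetical and chosen only for illustration. Note also that on x86-64 Linux the trap is typically the syscall instruction rather than int 0x80, so this shows the idea, not the exact instruction sequence from the slides.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid: the system call number */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* getpid(), syscall() */

/* Hypothetical helper: skip the dedicated getpid() wrapper and pass
 * the system call number to the generic syscall() entry point, which
 * places it in the number register and executes the trap. */
static pid_t raw_getpid(void) {
    return (pid_t)syscall(SYS_getpid);
}

/* Returns 1 when the library wrapper and the raw call agree,
 * i.e. both reach the same kernel service routine. */
static int wrapper_matches_raw(void) {
    return getpid() == raw_getpid();
}
```

Calling wrapper_matches_raw() from a program's main() is one way to check that the library wrapper is only a thin layer over the numbered system call; both paths return the caller's process ID.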
The system-call jump-table (system call table)
• There are more than 300 system calls in recent Linux 5.x kernels
• A specific system call is identified by a unique ID number (i.e., the system call number, which the wrapper function places into register %eax)
• To locate the system-call handler in the kernel, an in-kernel data structure, the system-call table, is used
• A system-call table is an array of function pointers (e.g., "sys_call_table[]" in the Linux kernel)

An example: xv6 system call table
[Figure: xv6 system call numbers mapped to system call function pointers, defined in syscall.c]

The "jump-table" idea
[Figure: the system call table sys_call_table in the kernel address space, with .section .data and .section .text marked]
  0  sys_restart_syscall
  1  sys_exit
  2  sys_fork
  3  sys_read
  4  sys_write
  5  sys_open
  6  sys_close
  7, 8, …  …etc…

Recall: System call
• The execution of a process is made up of two parts: (1) the program's code; and (2) the OS's code
• The OS needs protection from user programs, for reliability and security
• Dual-mode operation (supported by hardware):
  • User programs' code runs in user mode (with low privilege)
  • OS code runs in kernel mode (with high privilege)
• Policies under dual mode:
  • User code cannot execute privileged instructions
  • User code cannot access kernel memory
• Violations can be (easily) detected by hardware
  • Before executing each instruction, the hardware (logic) checks the current mode to see whether it has sufficient privilege

Recall: System call implementation
[Figure: an application in user mode calls xyz(); the C library wrapper sets %eax to the system call number and executes the trap instruction int 0x80; in kernel mode, system_call invokes sys_xyz(); iret is the return-from-trap]

Recall: More about trap
• How does the "trap" instruction locate the "entry function" of system calls?
  • int 0x80 → IDT (interrupt descriptor table, located through idtr) → system_call()
• How does the "system call" function locate the concrete system call handler?
  • %eax = #syscall (user)
  • syscall_table[#syscall] (kernel)
[Figure: system call "jump" table with entries fork, exit, wait, …]

Discussion 1
• Instead of using the system-call table approach, can we use if-else tests or a switch statement to locate the service routine's handler?

Designing the syscall interface
• It is important to keep the interface small and stable (for binary and backward compatibility)
• The kernel code should verify that system calls are legally invoked (e.g., check the arguments carefully to avoid errors)
• Syscall numbers should not be reused (!)
  • Why?
  • Deprecated syscalls are implemented by a special "not implemented" syscall (sys_ni)

Discussion 2
• Consider a hypothetical system call, zeroFill, which fills a user buffer with zeroes:
    zeroFill(char* buffer, int bufferSize);
• The following kernel implementation of zeroFill contains a security vulnerability. What is the vulnerability, and how would you fix it?
    void sys_zeroFill(char* buffer, int bufferSize) {
      for (int i = 0; i < bufferSize; i++) {
        buffer[i] = 0;
      }
    }

Discussion 2
• The user buffer pointer is untrusted and could point anywhere. In particular, it could point inside the kernel address space. This could lead to a system crash or a security breach.
• Fix: verify that the pointer, and the whole range from buffer to buffer + bufferSize, is a valid user address

Discussion 3
• Is it a security risk to execute the zeroFill function in user mode?
    void zeroFill(char* buffer, int bufferSize) {
      for (int i = 0; i < bufferSize; i++) {
        buffer[i] = 0;
      }
    }

Discussion 3
• No. User-mode code does not have permission to access the kernel's address space. If it tries, the hardware raises an exception, which is safely handled by the OS
• More generally, no user-mode code should ever be a security vulnerability.
  • Unless the OS has a bug…

Discussion 4
• To a programmer, a system call looks like any other procedure call to a library function.
• Is it important that a programmer knows which library procedures result in system calls?

Discussion 4
• Yes, if performance is an issue: if a task can be accomplished without a system call, the program will run faster.
• Every system call incurs overhead in switching from the user context to the kernel context and then switching back.
• Furthermore, on a multiuser system the operating system may schedule another process to run when a system call completes, further slowing the execution of the calling process.
➔ Library calls are much