CS 350 Operating Systems Spring 2021


4. Process II

Discussion 1: Question?
• Why do UNIX OSes use the combination of “fork() + exec()” to create a new process (from a program)?
• Can we directly call exec() to create a new process? Or do you have better ideas/designs for creating a new process?
• Hint: to answer this question, think about the possible advantages of this solution.

[Figure: Fork → Parent process / Child process; Exec → New program image in execution]

Discussion 1: Background
• Motivated by the UNIX shell
    $ echo "Hello world!"
    Hello world!
• The shell figures out where in the file system the executable “echo” resides, calls fork() to create a new child process, calls exec() to run the command, and then waits for the command to complete by calling wait().
• So fork(), exec(), and wait() are enough to implement a shell program (our project!).
• But what is the benefit of separating fork() and exec() when creating a new process?

Discussion 1: Case study
• The following program implements the command:
    $ wc p3.c > p4.output    # wc counts the lines, words, and bytes in p3.c
                             # the result is redirected to p4.output
[Program listing not preserved in this copy. Its annotations: “Close default standard output (to terminal)” and “open file p4.output”.]

Discussion 1: Case study
• $ wc p3.c    # wc writes its result to the default standard output, specified by STDOUT_FILENO
• By default, STDOUT_FILENO points to the terminal output.
• However, we close this default standard output: close(STDOUT_FILENO);
• Then, when we open the file “p4.output”:
  • A file descriptor is assigned to this file.
  • UNIX systems look for a free file descriptor starting from the beginning of the file descriptor table (i.e., at zero). In this case, STDOUT_FILENO is the first free one, so the following open() returns it.
  • STDOUT_FILENO -> file “p4.output”
• Thus, subsequent writes by the child process (wc) to the standard output file descriptor are routed transparently to the newly opened file “p4.output”.

[Figure: two copies of the file descriptor table of wc, with entries 0 STDINPUT, 1 STDOUT, 2 STDERR, ..., n FDn]

Discussion 1: Case study
• What is happening, and what do you take away from the above example?
• What is the benefit of using the combination of fork() and exec() in the above example?
• What is the fundamental reason behind this benefit?
• If you have a different way of creating a new process, can your solution achieve the same goal?

Discussion 2: Memory Layout
• A general memory layout (address space)
• It is the process’s view of memory in the system
• and how the process references data and instructions while being executed

    MAX 0xffffffff
      Stack   Function call arguments, return address, return values, and temporary variables
      Gap
      Heap    Dynamically allocated memory (e.g., malloc())
      Data    Global variables, constants, etc.
      Text    Program code
    0x00000000

Discussion 2: Memory Layout
• After fork(), the child process has its own, separate address space, whose initial content is the same as the parent process’s.

[Figure: parent and child process address spaces side by side, each with Text (R/O), Data (int g = 2;), Heap, and Stack, marked with addresses 0, 0x08048000, and 0xC0000000]

Discussion 2: Connect the dots
• An address space is a virtual thing (a memory abstraction for a process).
• Components of an address space are further mapped to the real physical locations where data/instructions are stored.
[Figure: an ELF program (ELF header, .text, .rodata, .data, .bss) is loaded by exec() (the loader) into the address space (virtual view: Stack, Heap, Data, Code), which is mapped to physical memory]

Discussion 2: Memory Layout
• When a parent process uses fork() to create a child process, the two processes have
  • separate address spaces, i.e., separate copies of the data, stack, and heap segments,
  • but the contents of these segments are the same.
• The child’s stack, data, and heap segments are initially exact duplicates of the corresponding parts of the parent’s memory.
• After the fork(), each process can modify the variables in its data, stack, and heap segments without affecting the other process.
• This is because they have separate address spaces.

Discussion 2: An example

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    static int idata = 111;             /* Allocated in data segment */

    int main(int argc, char *argv[])
    {
        int istack = 222;               /* Allocated in stack segment */
        pid_t childPid;

        childPid = fork();
        if (childPid == -1) {
            exit(-1);
        } else if (childPid == 0) {     /* Child process: modify data */
            idata *= 3;
            istack *= 3;
        } else {                        /* Parent process: give child a chance to execute */
            sleep(3);
        }

        /* Both parent and child come here */
        printf("PID=%ld %s idata=%d istack=%d\n", (long) getpid(),
               (childPid == 0) ? "(child) " : "(parent)", idata, istack);
        exit(0);
    }

What are the results?

Discussion 2: Question for fork() Implementation
• To make the parent and the child process have separate copies of the data, stack, and heap segments after forking:
  • A straightforward fork() solution:
    • Directly copy everything from the parent address space to the child address space.
    • This was actually the implementation used by some early UNIX systems, but it is not used anymore.
• What are the problems of the “direct copy” approach? Do you have ideas to improve it?
[Figure: parent and child process virtual memory (each with Stack, Heap, Data, and Text (R/O), marked with addresses 0, 0x08048000, and 0xC0000000) mapped to physical memory, with a “copy” label]

Hints
• As an address space maps to physical memory, shared parts from different address spaces can map to the same physical location (e.g., instructions).
• Thus, direct copy should only apply to the stack, heap, and data segments.

Discussion 2: Question for fork() Implementation
• But directly copying the parent’s memory when forking has problems:
  • Slow process creation: the child process won’t be ready until all the memory copying is done.
  • Waste of resources and time:
    • Many resources could be shared between parent and child.
    • fork() is usually followed immediately by an exec() call to run a different program, which replaces the child process’s text segment and re-initializes its data, stack, and heap segments, so the copying done at fork time is wasted.
• Do you have a better solution?
  • Refer to online materials about “copy-on-write”, a general programming technique.
  • Justify your solution, e.g., whether and to what degree your fork() solution addresses the problems caused by the “direct copy” approach.

Discussion 3: Who runs first?
• After fork(), either the parent or the child can be scheduled to run (assume we have one CPU core).
• A test of executing fork() multiple times on Linux 2.2.19: in 99.97% of a million executions, the parent ran first.
• In Linux 2.4, a change was made to have the child run first most of the time, but this change was later dropped from the 2.4 kernel series.
• In Linux 2.6, the change to have the child run first was adopted.
• Does it really matter whether the child or the parent runs first?

Discussion 3: Who runs first?
• If we focus on locality
  • Instructions and data are usually cached in CPU caches (L1, L2, LLC).
  • On a cache hit (the instructions or data are already in the CPU caches), execution is very fast.
  • For better locality, which policy is better: scheduling the parent process first or the child process first?
• If we focus on data copy, aiming to reduce data duplication
  • If exec() immediately follows fork() in the child process, which policy is better?

Group written assignments
• Three (open) questions to answer:
  1. What is the advantage of using the combination of “fork() + exec()” to create a new process to execute a program? Do you have other solution(s) for process creation? Compare your solution(s) with the existing one and show the pros and cons.
  2. To implement a fork() system call, instead of directly copying the data, heap, and stack of the parent process to the child process’s address space, do you have a more efficient approach? Why is your approach superior to the “direct copy” approach?
  3. If you were the OS scheduler, which would you schedule first after a fork() system call, the child or the parent? Please justify your choice.

Submission Policy
• Work together with your group members (via Zoom meetings).
• Write down the answers/solutions together; each of you should contribute to each of the questions.
• One submission per group: the group leader creates a shared Google document for all answers and submits the sharable link to myCourse (due by 23:59:59, March 7th).
• More tips:
  • As these questions are open-ended, rather than producing a “correct” solution, you are expected to reason about your solutions: provide the justification needed to show why your solutions work and are better than others.
  • Working as a group, please tolerate, respect, and inspire each other. Try to get on the same page in understanding the questions and reach some level of consensus on your answers.
• Enjoy!