
9/17/15

DISTRIBUTED SYSTEMS

PROCESSES AND THREADS

Dr. Jack Lange, Department of Computer Science, University of Pittsburgh

Fall 2015

Outline

- Heavy Weight Processes
- Threads and Thread Implementation
- User Level Threads
- Kernel Level Threads
- Light Weight Processes
- Scheduler Activation
- Client/Server Implementation
- Multithreaded Server
- X Window
- Server General Design Issues


Process and Process Management: PROCESS CONCEPT

Process Concept

- A primary function of an operating system is to execute and manage processes
- What is a program and what is a process?
- A program is a specification, which contains data type definitions, how data can be accessed, and a set of instructions that when executed accomplish useful "work"
- A sequential process is the activity resulting from the execution of a program with its data by a sequential processor
- A program is passive, while a process is active
- A process is an instance of a program in execution


Traditional Process Structure

- Process address space
  - Code – binary code
  - Static and dynamic data
  - Execution stack – contains data such as parameters, return addresses, and temporary variables
- Process CPU state
  - Process Status Word (PSW) – execution mode, last operation outcome, interrupt level, …
  - Instruction Register (IR) – current instruction being executed
  - Program Counter (PC) – next instruction to be executed
  - Stack Pointer (SP)
  - General purpose registers

Process and Process Management PROCESS ADDRESS SPACE


Memory Layout

Split                            Kernel space                    User mode space
32-bit kernel/user default       1 GB (0xc0000000–0xffffffff)    3 GB (0x0–0xbfffffff)
32-bit Windows default           2 GB (0x80000000–0xffffffff)    2 GB (0x0–0x7fffffff)

Process Address Space

Layout from the highest address (Max) down to 0: Stack at the top, then Heap, Data, and Text at address 0.

Name                 Interpretation
Code Segment (Text)  Program executable statements
Data Segment         1. Initialized data – static or global data initialized with non-zero values
                     2. Zero-initialized, or Block Started by Symbol (BSS) – uninitialized static or global data
                     Both categories occupy a single contiguous area
Heap                 Dynamic memory allocation
Stack Segment        Local variables of the current scope; function parameters


Process and Process Management

Process States

- A process is in one of five states: Created, Ready, Running, Blocked (waiting), Exit
- Transitions between states:
  1. Process enters the ready queue
  2. Scheduler picks this process
  3. Scheduler picks a different process
  4. Process waits for an event (such as I/O)
  5. Event occurs
  6. Process exits
  7. Process is ended by another process


Processes Representation

- To users and to other processes, a process is identified by its unique Process ID (PID)
  - PID <= 30,000 in Unix
- In the OS, processes are represented by entries in a Process Table (PT)
  - The PID is an index to (or pointer to) a PT entry
  - PT entry = Process Control Block (PCB)
- The PCB is a large data structure that contains or points to all information about the process
  - Linux: defined in task_struct – it contains about 70 fields
  - Windows NT: defined in EPROCESS – it contains about 60 fields

What’s In A Process Table Entry?

Process management: registers (may be stored on the stack); program counter; CPU status word; stack pointer; process state; priority / scheduling parameters; process ID; parent process ID; signals; process start time; total CPU usage
Memory management: pointers to the text, data, and stack segments
File management: working (current) directory; file descriptors; user ID; group ID


Process Information

Per-process information includes: process ID, UID, GID, EUID, EGID, CWD, memory, priority, signal dispatch table, signal mask, stack, file descriptors, and CPU state.

Process CPU State

- The CPU state is defined by the registers' contents:
  - Process Status Word (PSW) – execution mode, last operation outcome, interrupt level
  - Instruction Register (IR) – current instruction being executed
  - Program Counter (PC)
  - Stack Pointer (SP)
  - General purpose registers


Process control block (PCB)

The PCB holds kernel-maintained state – CPU state (PSW, IR, PC, SP, general purpose registers), memory, files, accounting, priority, and user storage – describing a user address space made up of text, data, heap, and stack.

Operating System Control Structure

The OS maintains a memory table, a process table, a device table, and a file table; each process's PCB links its process-table entry to the memory, devices, and files that the process uses.


Process and Process Management: CONTEXT SWITCHING

Processes in the OS

- There are two "layers" for processes
- The lowest layer of a process-structured OS (the scheduler) handles interrupts and scheduling
- Above that layer are the sequential processes (0, 1, …, N-2, N-1)
- Processes are tracked in the process table; each process has a process table entry


Process Management

- In multi-programming and time-sharing environments, process management is required
- A program may require multiple processes
- Many processes may be instances of the same program
- Many processes from different programs can run simultaneously
- How does the OS manage multiple processes running concurrently?
  - What kind of information must be kept?
  - What functions must the OS perform to run processes correctly?

Switching Between Processes

- An important task of the OS is to manage CPU allocation among concurrent processes
- When the OS takes the CPU away from a running process, it must perform a switch between the processes
  - This is referred to as a context switch
  - It is also referred to as a "state save" and "state restore"
- What is a context?
- What operations must take place to achieve this switch?
- How can the OS guarantee that the interrupted process can resume correctly?


Process Context

- The process context is the information about the current process's operational state that must be stored in order to later resume operation from a position identical to where the process was interrupted
- Context information is system dependent and typically includes:
  - User-level context
  - Register-level context
  - System-level context
- Address space, stack space, Program Counter (PC), Stack Pointer (SP), Instruction Register (IR), Program Status Word (PSW) and other general processor registers, profiling or accounting information, associated kernel data structures, and the current state of the process

CPU Switch From Process to Process


Process Context Switch

1. Save the processor context, including the program counter and other registers
2. Update the process control block with the new state and any accounting information
3. Move the process control block to the appropriate queue (ready, blocked)
4. Select another process for execution
5. Update the process control block of the selected process
6. Update memory-management data structures
7. Restore the context of the selected process and set the PC to that process's code – typically implicit in the interrupt return instruction

Context Switch Design Issues

- Context switching can be costly and must be minimized
- A context switch is pure system overhead, as no "useful" work is accomplished during it
- The actual cost depends on the OS and on the support provided by the hardware
  - The more complex the OS and the PCB, the longer the context switch
  - Some hardware provides multiple sets of registers per CPU, allowing multiple contexts to be loaded at once
- A "full" process switch may require a significant number of instruction executions


What Happens on a Trap/Interrupt?

1. Hardware saves the program counter (on the stack or in a special register)
2. Hardware loads a new PC and identifies the interrupt
3. Assembly language routine saves the registers
4. Assembly language routine sets up a new stack
5. Assembly language routine calls the (C) service routine
6. Service routine calls the scheduler
7. Scheduler selects a process to run next (might be the one interrupted…)
8. Assembly language routine loads the PC & registers for the selected process

Process Concept Revisited

Heavy and Light Weight Process Model


Process Characteristics

- In modern operating systems, the two main characteristics of a process are treated separately
- The unit of resource ownership is usually referred to as a process or task
- The unit of execution is usually referred to as a thread or a "lightweight process"

Heavyweight Processes

- A virtual address space which holds the process image
- Protected access to processors, files, and other I/O resources
- Uses interprocess communication (IPC) to communicate with other processes

A single-threaded process: one thread, with its registers and stack, sharing the process's code, data, and files.


“Heavyweight” Process Model

- Simple, uni-threaded model
- Security provided by address space boundaries
- High cost for context switch
- Coarse granularity limits the degree of concurrency

Threads

User and Kernel Level Threads


Threads: “Processes” Sharing Memory

- Process == address space
- Thread == program counter / stream of instructions
- Two examples:
  - Three processes, each with one thread
  - One process with three threads


Process and Thread Information

Per-process items: address space; open files; child processes; signals & handlers; accounting info; global variables

Per-thread items: program counter; registers; stack & stack pointer; state


Threads & Stacks


=> Each thread has its own stack!

Why Use Threads?

- Allow a single application to do many things at once
- Simpler programming model
- Less waiting
- Threads are faster to create or destroy than processes
- No separate address space
- Overlap computation and I/O
  - Could be done without threads, but it's harder
- Example: word processor
  - Thread to read from keyboard
  - Thread to format document
  - Thread to write to disk


Implementing threads

In a user-level implementation, each process contains a run-time system with its own thread table, and the kernel sees only the process table. In a kernel-level implementation, the kernel maintains both the process table and the thread table.

Scheduling User-Level Threads

- The kernel picks a process to run next
- The run-time system (at user level) schedules that process's threads
  - Each thread runs for less than the process quantum
  - For example, processes are allocated 40 ms each, and threads get 10 ms each
- Example schedule: A1, A2, A3, A1, B1, B3, B2, B3
- Not possible: A1, A2, B1, B2, A3, B3, A2, B1


Scheduling Kernel-Level Threads

- The kernel schedules each thread
- No restrictions on ordering
- May be more difficult for each process to specify priorities
- Example schedule: A1, A2, A3, A1, B1, B3, B2, B3
- Also possible: A1, A2, B1, B2, A3, B3, A2, B1

ULTs Pros and Cons

Advantages:
- Thread switching does not involve the kernel – no need for mode switching, so context switch times are fast
- Thread semantics are defined by the application
- Scheduling can be application specific – the best algorithm can be selected for each application
- ULTs are highly portable – only a thread library is needed

Disadvantages:
- Most system calls are blocking for the whole process: all threads within the process are implicitly blocked, wasting resources and decreasing performance
- The kernel can only assign processors to processes: two threads within the same process cannot run simultaneously on two processors


KLT Pros and Cons

Advantages:
- The kernel can schedule multiple threads of the same process on multiple processors
- Blocking happens at the thread level, not the process level: if a thread blocks, the CPU can be assigned to another thread in the same process
- Even kernel routines can be multithreaded

Disadvantages:
- Thread switching always involves the kernel: two mode switches per thread switch are required
- KLT switching is slower compared to ULTs (though still faster than a full process switch)

Combined ULTs and KLTs

Solaris Approach


Combined ULT/KLT Approaches

- Thread creation is done in user space
- The bulk of thread scheduling and synchronization is done in user space
- ULTs are mapped onto KLTs
- The programmer may adjust the number of KLTs
- KLTs may be assigned to processors
- Combines the best of both approaches: the "Many-to-Many" model

Solaris Process Structure

- A process includes the user's address space, stack, and process control block
- User-level threads – threads library
  - The library provides support for application parallelism
  - Invisible to the OS
- Kernel threads
  - Visible to the kernel
  - Represent a unit that can be dispatched on a processor
- Lightweight processes (LWPs)
  - Each LWP supports one or more ULTs and maps to exactly one KLT


Solaris Threads

- Task 2 is equivalent to a pure ULT approach – the traditional process structure
- Tasks 1 and 3 map one or more ULTs onto a fixed number of LWPs, which in turn map onto KLTs
- Note how task 3 maps a single ULT to a single LWP bound to a CPU (a "bound" thread)

Solaris – User Level Threads

- Share the execution environment of the task
- Same address space, instructions, data, files (if any thread opens a file, all threads can read it)
- Can be tied to an LWP or multiplexed over multiple LWPs
- Represented by data structures in the address space of the task – but the kernel knows about them indirectly via LWPs


Solaris – Kernel Level Threads

- The only objects scheduled within the system
- May be multiplexed on the CPUs or tied to a specific CPU
- Each LWP is tied to a kernel-level thread

Solaris – Versatility

- ULTs can be used when logical parallelism does not need to be supported by hardware parallelism
  - Eliminates mode switching (e.g., multiple windows, but only one active at any one time)
- If ULTs can block, then more LWPs can be added to avoid blocking the whole application
- High versatility: the system can operate in a Windows style or a conventional Unix style, for example


ULTs, LWPs and KLTs

- An LWP can be viewed as a virtual CPU
- The kernel schedules an LWP by scheduling the KLT that it is attached to
- The run-time library (RTL) handles the multiplexing of threads onto LWPs
- When a thread invokes a system call, the associated LWP makes the actual call
  - The LWP blocks, along with all the threads tied to that LWP
  - Any other thread in the same task will not block

Thread Implementation

Scheduler Activation


Scheduler Activation – Motivation

- The application has knowledge of user-level thread state but has little knowledge of, or influence over, critical kernel-level events – by design, to achieve the abstraction
- The kernel has inadequate knowledge of user-level thread state to make optimal scheduling decisions

Underscores the need for a mechanism that facilitates exchange of information between user-level and kernel-level mechanisms.

A general system design problem: communicating information and control across layer boundaries while preserving the inherent advantages of layering and abstraction.

Scheduler Activations: Structure

At user level, the thread library notifies the kernel of changes in processor requirements. At kernel level, scheduler activations notify the thread library of changes in processor allocation and in thread status.


Communication via Upcalls

- The kernel-level scheduler activation mechanism communicates with the user-level thread library by a set of upcalls:
  - Add this processor (processor #)
  - Processor has been preempted (preempted activation #, machine state)
  - Scheduler activation has blocked (blocked activation #)
  - Scheduler activation has unblocked (unblocked activation #, machine state)
- The thread library must maintain the association between a thread's identity and the thread's scheduler activation number.

Role of Scheduler Activations

Abstraction: the user process sees a virtual multiprocessor of processors P1, P2, …, Pn running its user-level threads. Implementation: the thread library runs those threads on scheduler activations (SAs) supplied by the kernel.

Invariant: There is one running scheduler activation (SA) for each processor assigned to the user process. Virtual Multiprocessor


Avoiding Effects of Blocking

With kernel threads: a thread makes a system call (1) and blocks in the kernel (2); the user level only learns of this when the call returns. With scheduler activations: when an activation blocks (2) on a system call (1), the kernel creates a new activation (3), makes an upcall into the thread library (4), and the library starts another user-level thread on it (5).

Resuming Blocked Thread

When the blocked activation's request completes, the kernel unblocks it (1), preempts a running activation (2), and makes an upcall (3); the library can then preempt a user-level thread (4) and resume the previously blocked one (5).


Scheduler Activations – Summary

- Threads are implemented entirely in user-level libraries
- Upon calling a blocking system call, the kernel makes an upcall to the threads library
- Upon completing a blocking system call, the kernel makes an upcall to the threads library
- Is this the best possible solution? Why not?
- Specifically, what popular principle does it appear to violate?

Servers and Clients


Threads in distributed systems – clients

- Client-side threads are mainly used to hide network latency
- E.g., a multithreaded web client:
  - The web browser scans an incoming HTML page and finds additional objects to fetch
  - Each file is fetched by a separate thread, each of which performs a blocking I/O operation
  - As files come in, the browser displays them
- Multiple request-response calls to remote systems:
  - A client does several RPC calls at the same time, each using a separate thread
  - It waits until all results have been returned
  - Note: if the calls are to different servers, we may get a linear speed-up compared to doing the calls one after another

Threads in distributed systems – servers

- In servers, the main issues are improved performance and better structure
- Improved performance:
  - Starting a thread to handle an incoming request is much cheaper than starting a new process
  - A single-threaded server cannot simply be scaled to a multiprocessor system
  - As with clients: hide network latency by reacting to the next request while the previous one is being replied to
- Better structure:
  - Most servers have high I/O demands; using simple, well-understood blocking calls simplifies the overall structure
  - Multithreaded programs tend to be smaller and easier to understand due to simplified flow of control


Multithreaded Web server

A dispatcher thread hands incoming requests from the network connection to worker threads, which consult the web page cache before going to disk.

Dispatcher thread:

    while (TRUE) {
        getNextRequest(&buf);
        handoffWork(&buf);
    }

Worker thread:

    while (TRUE) {
        waitForWork(&buf);
        lookForPageInCache(&buf, &page);
        if (pageNotInCache(&page)) {
            readPageFromDisk(&buf, &page);
        }
        returnPage(&page);
    }

Multithreaded Servers (1)

n A multithreaded server organized in a dispatcher/worker model.


Three ways to build a server

- Thread model
  - Parallelism
  - Blocking system calls
- Single-threaded process: slow, but easier to build
  - No parallelism
  - Blocking system calls
- Finite-state machine
  - Each activity has its own state
  - States change when system calls complete or interrupts occur
  - Parallelism
  - Non-blocking system calls
  - Interrupts

Multithreaded Servers (2)

Model                   | Characteristics
Threads                 | Parallelism, blocking system calls
Single-threaded process | No parallelism, blocking system calls
Finite-state machine    | Parallelism, nonblocking system calls


Out-of-band communication

- How do we interrupt a server once it has accepted (or is in the process of accepting) a service request?
- Solution 1: use a separate port for urgent data (possibly per service request)
  - The server has a separate thread (or process) waiting for incoming urgent messages
  - When an urgent message comes in, the associated request is put on hold
  - Requires that the OS support high-priority scheduling of specific threads or processes
- Solution 2: use the out-of-band communication facilities of the transport layer
  - E.g., TCP allows urgent messages to be sent within the same connection
  - Urgent messages can be caught using OS signaling techniques

Kinds of Servers

- Iterative vs. concurrent
- Iterative servers:
  - Receive request
  - Perform service
  - Reply if necessary
  - Process next request
- Concurrent servers:
  - Receive request
  - Pass it to a separate process/thread
  - Receive next request


Servers and state

- Stateless servers: never keep accurate information about the status of a client after having handled a request
  - Don't record whether a file has been opened (simply close it again after access)
  - Don't promise to invalidate a client's cache
  - Don't keep track of clients
- Consequences:
  - Clients and servers are completely independent
  - State inconsistencies due to client or server crashes are reduced
  - Possible loss of performance because, e.g., the server cannot anticipate client behavior (think of prefetching file blocks)

Servers and state

- Stateful servers: keep track of the status of their clients
  - Record that a file has been opened, so that prefetching can be done
  - Know which data a client has cached, and allow clients to keep local copies of shared data
- Observation: the performance of stateful servers can be extremely high, provided clients are allowed to keep local copies. As it turns out, reliability is not a major problem.


Thin-client computing

- Thin client
  - Client and server communicate over a network using a remote display protocol
  - The client sends user input; the server returns screen updates
  - The graphical display can be virtualized and served to a client
  - Application logic is executed on the server
- Technology enablers
  - Improvements in network bandwidth, cost, and ubiquity
  - High total cost of ownership for desktop computing

The X Window System

- The basic organization of the X Window System
- The position of the window manager affects/reflects the sophistication of the "client" (which runs the X "server")


Servers: General Design Issues

a) Client-to-server binding using a daemon, as in the Distributed Computing Environment (DCE)

b) Client-to-server binding using a superserver as in UNIX

Client-Side Software for Distribution Transparency

- A possible approach to transparent replication of a remote object using a client-side solution.


Conclusion

- Heavy Weight Processes
- Threads and Thread Implementation
- User Level Threads
- Kernel Level Threads
- Light Weight Processes
- Scheduler Activation
- Client/Server Implementation
- Multithreaded Server
- X Window
- Server General Design Issues
