Rethinking OS Design

[Diagram labels: Productivity applications; Personal (PDAs), Embedded; Workload control; Services & API; Internal Structure; Policies / Mechanisms; Metrics: Energy efficiency; a "You are here" marker]

Hardware Resources: Processor, Memory, Disks (?), Wireless & IR, Keyboard (?), Display (?), Mic & Speaker, Motors & Sensors, GPS, Camera, Batteries

(Traditional) Abstractions

• Processes - thread of control with context
• Files - a named linear stream of data bytes
• Sockets - endpoints of communication between unrelated processes

Unix Process Control

• The fork syscall returns a zero to the child and the child process ID to the parent.
• Fork creates an exact copy of the parent process.
• Parent uses wait to sleep until the child exits; wait returns child pid and status.
• Wait variants allow wait on a specific child, or notification of stops and other signals.

    int pid;
    int status = 0;

    if (pid = fork()) {
        /* parent */
        …..
        pid = wait(&status);
    } else {
        /* child */
        …..
        exit(status);
    }

Fork/Exit/Wait Example

• fork parent / fork child: child starts as clone of parent: increment refcounts on shared resources (OS resources).
• Parent and child execute independently: memory states and resources may diverge.
• On exit, release memory and decrement refcounts on shared resources.
• wait / exit: parent sleeps in wait until child stops or exits ("join").
• Child enters zombie state: process is dead and most resources are released, but process descriptor remains until the parent reaps it via wait.
• Child process passes status back to parent on exit, to report success/failure.
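The fork/exit/wait sequence above can be sketched as one small function. This is a minimal illustration, not code from the slides: the name `run_child_and_wait` is made up, and it uses `waitpid` (one of the "wait variants") so that the parent joins a specific child.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with `code`; return the exit
 * status the parent collects, or -1 on error. The function name is
 * illustrative only. */
int run_child_and_wait(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed: no child was created */

    if (pid == 0)
        _exit(code);            /* child: pass status back to the parent */

    /* parent: sleep until this specific child exits, then reap it
     * so that it does not linger as a zombie */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_child_and_wait(7)` should hand the value 7 back to the parent, since only the low 8 bits of the exit code travel through `wait`.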

Join Scenarios

• Several cases must be considered for join (e.g., exit/wait).
  – What if the child exits before the parent joins?
    • "Zombie" process object holds child status and stats.
  – What if the parent continues to run but never joins?
    • How not to fill up memory with zombie processes?
  – What if the parent exits before the child?
    • Orphans become children of init (process 1).
  – What if the parent can't afford to get "stuck" on a join?
    • Unix makes provisions for asynchronous notification.

Signals

• Signals notify processes of internal or external events.
  – the Unix software equivalent of interrupts/exceptions
  – only way to do something to a process "from the outside"
  – Unix systems define a small set of signal types
• Examples of signal generation:
  – keyboard ctrl-c and ctrl-z signal the foreground process
  – synchronous fault notifications, syscall errors
  – asynchronous notifications from other processes via kill
  – IPC events (SIGPIPE, SIGCHLD)
  – alarm notifications

signal == "upcall"
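For the case where the parent can't afford to get "stuck" on a join, one option (alongside SIGCHLD notification) is a non-blocking poll. The sketch below assumes POSIX `waitpid` with `WNOHANG`; the helper name `poll_child` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking join. Returns 1 if child `pid` has terminated (it is
 * reaped and *exit_code is filled in), 0 if it is still running, and
 * -1 on error. *exit_code is set to -1 if the child did not exit
 * normally (for example, it was killed by a signal). */
int poll_child(pid_t pid, int *exit_code)
{
    int status = 0;
    pid_t r = waitpid(pid, &status, WNOHANG);

    if (r == 0)
        return 0;               /* child exists but has not exited yet */
    if (r != pid)
        return -1;              /* no such child, or another error */

    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return 1;
}
```

A parent can call this from its main loop between other work; once it returns 1 the zombie has been reaped.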

Using Signals

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    volatile sig_atomic_t alarmflag = 0;

    void alarmHandler(int sig)
    {
        printf("An alarm clock signal was received\n");
        alarmflag = 1;
    }

    int main(void)
    {
        signal(SIGALRM, alarmHandler);   /* sets up signal handler */
        alarm(3);                        /* instructs kernel to send SIGALRM in 3 seconds */
        printf("Alarm has been set\n");
        while (!alarmflag)
            pause();                     /* suspends caller until signal */
        printf("Back from alarm signal handler\n");
        return 0;
    }

Process Handling of Signals

1. Each signal type has a system-defined default action.
   – abort and dump core (SIGSEGV, SIGBUS, etc.)
   – ignore, stop, exit, continue
2. A process may choose to block (inhibit) or ignore some signal types.
3. The process may choose to catch some signal types by specifying a (user mode) handler procedure.
   – specify alternate signal stack for handler to run on
   – system passes interrupted context to handler
   – handler may munge and/or return to interrupted context
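Points 2 and 3 (blocking and catching) can be combined to close a small race in the flag-and-pause pattern: if the signal arrives between the flag test and `pause()`, the process sleeps forever. The sketch below is one common POSIX idiom, not the slide author's code. It assumes `sigaction`, `sigprocmask` and `sigsuspend`, and the names `on_alarm` and `wait_for_alarm` are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;            /* keep handlers to async-signal-safe work */
}

/* Catch SIGALRM, block it while arming the timer, then atomically
 * unblock and sleep until it arrives. Returns 1 on success, -1 on error. */
int wait_for_alarm(unsigned int seconds)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;

    if (seconds == 0)
        return -1;              /* alarm(0) would only cancel a pending alarm */

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
        return -1;

    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGALRM);     /* mask used while sleeping */

    alarm_fired = 0;
    alarm(seconds);
    while (!alarm_fired)
        sigsuspend(&wait_mask);         /* unblock + sleep in one step */

    sigprocmask(SIG_SETMASK, &old_mask, NULL);  /* restore caller's state */
    sigaction(SIGALRM, &old_sa, NULL);
    return 1;
}
```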

Yet Another User's View

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    void childhandler(int sig)
    {
        int childPid, childStatus;

        childPid = wait(&childStatus);   /* collects status */
        printf("child done in time\n");
        exit(0);
    }

    int main(int argc, char *argv[])
    {
        int pid;

        /* SIGCHLD sent by child on termination; if SIG_IGN, dezombie */
        signal(SIGCHLD, childhandler);
        pid = fork();
        if (pid == 0) {                  /* child */
            execvp(argv[2], &argv[2]);
        } else {
            sleep(5);
            printf("child too slow\n");
            kill(pid, SIGINT);
        }
        return 0;
    }

File System Calls

    char buf[BUFSIZE];
    int fd;
    int n;

    if ((fd = open("../zot", O_TRUNC | O_RDWR)) == -1) {
        perror("open failed");
        exit(1);
    }
    /* copy stdin to the file, writing only the bytes actually read */
    while ((n = read(0, buf, BUFSIZE)) > 0) {
        if (write(fd, buf, n) != n) {
            perror("write failed");
            exit(1);
        }
    }

• Open files are named by an integer file descriptor.
• Pathnames may be relative to process current directory.
• Process passes status back to parent on exit, to report success/failure.
• Process does not specify current file offset: the system remembers it.
• Standard descriptors (0, 1, 2) for input, output, error messages (stdin, stdout, stderr).
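The point that the system remembers the current file offset can be checked directly. A minimal sketch, assuming a writable /tmp and using a scratch file; the function name and file name template are illustrative:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write two chunks to a scratch file without ever passing an offset,
 * then ask the kernel where the descriptor currently points.
 * Returns the offset, or -1 on error. */
long offset_after_two_writes(void)
{
    char path[] = "/tmp/offset_demo_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    off_t pos;

    if (fd == -1)
        return -1;
    unlink(path);               /* file is removed once fd is closed */

    if (write(fd, "hello", 5) != 5 || write(fd, " world", 6) != 6) {
        close(fd);
        return -1;
    }

    pos = lseek(fd, 0, SEEK_CUR);   /* query the offset, do not move it */
    close(fd);
    return (long)pos;
}
```

The two writes total 11 bytes, so the offset reported by `lseek` should be 11.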

File Sharing Between Parent/Child

    main(int argc, char *argv[]) {
        char c;
        int fdrd, fdwt;

        if ((fdrd = open(argv[1], O_RDONLY)) == -1)
            exit(1);
        if ((fdwt = creat(argv[2], 0666)) == -1)
            exit(1);

        fork();

        for (;;) {
            if (read(fdrd, &c, 1) != 1)
                exit(0);
            write(fdwt, &c, 1);
        }
    }

[Bach]

Sharing Open File Instances

[Figure: parent and child process descriptors (user ID, process ID, process group ID, parent PID, signal state, siblings, children) refer through their process file descriptors to the same entry in the system open file table. That entry holds the shared seek offset and refers to the shared file (inode or vnode).]
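Because parent and child share one open file table entry, a write by the child moves the offset the parent sees. A minimal sketch of that effect, assuming a writable /tmp; the function name is made up:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open a scratch file, fork, let the child write 3 bytes through the
 * inherited descriptor, and return the offset the parent observes
 * afterwards (or -1 on error). */
long offset_seen_by_parent(void)
{
    char path[] = "/tmp/share_demo_XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    int status = 0;
    pid_t pid;
    off_t pos;

    if (fd == -1)
        return -1;
    unlink(path);

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)               /* child: advances the shared offset */
        _exit(write(fd, "abc", 3) == 3 ? 0 : 1);

    if (waitpid(pid, &status, 0) != pid) {
        close(fd);
        return -1;
    }

    pos = lseek(fd, 0, SEEK_CUR);   /* parent never wrote, yet offset moved */
    close(fd);
    return (long)pos;
}
```

This is also why the copy loop on the slide works with two processes: each byte is read by exactly one of them, though the order of writes is not guaranteed.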

File Directories

• Directories are (guess what?) a type of file.
• A hierarchy of directories - a filesystem - has a root (/)
• Pathnames are absolute or relative to working directory, ., ..
• root filesystem may have roots of other filesystems mounted into the hierarchy.
• Directories manipulated by link(), unlink(), mkdir(), rmdir().

Devices

Various devices are abstracted as special files.
• Named by a filename.
• Accessed via open(), close(), read(), and write()
• Idiosyncratic operations of the device are accessed through ioctl() calls.
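The directory calls listed above can be exercised in a few lines. The sketch below creates a scratch directory, gives one file a second name with `link()`, and reads back the link count. It assumes a /tmp filesystem that supports hard links; the function and path names are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the link count of a file after adding one hard link
 * (2 if everything worked), or -1 on error. Cleans up after itself. */
int link_count_demo(void)
{
    char dir[] = "/tmp/dir_demo_XXXXXX";       /* assumed scratch location */
    char a[64], b[64];
    struct stat st;
    int fd, nlink = -1;

    if (mkdtemp(dir) == NULL)                  /* mkdir with a unique name */
        return -1;
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    fd = open(a, O_CREAT | O_WRONLY, 0600);
    if (fd != -1) {
        close(fd);
        /* second directory entry for the same file */
        if (link(a, b) == 0 && stat(a, &st) == 0)
            nlink = (int)st.st_nlink;
        unlink(b);                             /* remove both names */
        unlink(a);
    }
    rmdir(dir);                                /* directory must be empty */
    return nlink;
}
```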

OSDM: OnNow

[Figure: layers from top to bottom: Applications, OnNow WIN32 ext, OS, ACPI Spec, HW]

• SetSystemPowerState – initiate sleep state, query apps(?)
• SetThreadExecutionState – specifies level of support needed (e.g. display required)
• WM_POWERBROADCAST – a message notifying of power state changes to which applications can respond
• SetWaitableTimer – ensure PC is awake at scheduled time
• RequestDeviceWakeup
• RequestWakeupLatency - to specify latency requirements
• GetSystemPowerStatus and GetDevicePowerState

Framework for Adaptation

• Odyssey project - Satya (CMU)
• Odyssey is an attempt to incorporate application-aware adaptation
• Noble et al, Agile application-aware adaptation for mobility, SOSP 97 (network bandwidth examples)
• Flinn and Satya, Energy-aware adaptation for mobile applications, SOSP 99 (energy usage examples)

Odyssey Provides

• API - new syscalls to register a window of tolerance for a variable resource (e.g. network bandwidth)
• Notifications of change (upcalls)
  – Implies detection of changes. Mechanisms needed.
• Typing - Wardens which handle type-specific functionality
  – Type awareness necessary to evaluate tradeoffs
• Viceroy – centralized resource coordination

Architecture of Odyssey Client

[Figure labels: Application, Viceroy, Warden, Warden, Cache Mgt, Kernel]

API Example

• Each movie in multiple tracks at different fidelity levels
• Warden can switch between tracks to fit bandwidth requirements
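The track-switching decision might look like the selection function below. This is a sketch under stated assumptions, not Odyssey's actual warden code: the `struct track` layout, the bandwidth figures a caller supplies, and the name `pick_track` are all hypothetical.

```c
#include <stddef.h>

/* One encoded version of a movie (hypothetical layout). */
struct track {
    const char *name;   /* label for the fidelity level */
    double kbps;        /* bandwidth needed to play this track */
};

/* Return the index of the highest-bandwidth (highest-fidelity) track
 * that fits within available_kbps. If none fits, fall back to the
 * track that needs the least bandwidth. Requires n >= 1. */
size_t pick_track(const struct track *tracks, size_t n, double available_kbps)
{
    size_t best = 0, lowest = 0, i;
    int found = 0;

    for (i = 0; i < n; i++) {
        if (tracks[i].kbps < tracks[lowest].kbps)
            lowest = i;
        if (tracks[i].kbps <= available_kbps &&
            (!found || tracks[i].kbps > tracks[best].kbps)) {
            best = i;
            found = 1;
        }
    }
    return found ? best : lowest;
}
```

In this reading, an upcall reporting that bandwidth has left the registered window of tolerance would lead the warden to call something like `pick_track` again with the new estimate.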

Energy Resource

• Monitoring to detect resource availability
  – Powerscope
• Using Odyssey for adaptation in this domain.

PowerScope [Flinn] as a Tool

• Statistical sampling approach
  – Program counter/process (PC/PID) + correlated current readings.
  – Off-line analysis to generate profile
• Causality
  – Goal is to assign energy costs to specific application events / program structure
  – Mapped down to procedure level
  – System-wide. Includes all processes, including kernel

Experimental Setup

[Figure: System Monitor and Profiling computer, linked by trigger lines]
• System Monitor: multimeter's clock drives sampling at period of 1.6ms; takes current sample
• -> Trigger: interrupt causes PC/PID sample to be buffered on the profiling computer
• <- Trigger next sample
• User-level daemon writes to disk when buffer 7/8 full

Data Gathering

Kernel Mods
• recording of PC and PID
• fork(), exec(), exit() instrumented to record pathname associated with process
• new system calls to control profiling
• pscope_init(), pscope_start(), pscope_stop(), pscope_read() (user-level daemon, to disk)

Energy Analyzer

• Voltage essentially constant, only current recorded.
• Each sample is binned into process bucket and procedure within process bucket.
• Energy calculated by summing each bucket

$$E = V_{meas} \sum_{t=0}^{n} I_t \, \Delta t$$
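The binning and summation can be sketched as follows: each current sample carries a bucket index, and every bucket accumulates V_meas times I_t times Δt. The `struct sample` layout and the function name are assumptions for illustration; the caller supplies the measured voltage and the sampling period.

```c
#include <stddef.h>

/* One correlated sample: which bucket (process or procedure) was
 * running, and the current drawn at that instant. Hypothetical layout. */
struct sample {
    size_t bucket;
    double amps;
};

/* Accumulate energy per bucket: E_b = V_meas * sum(I_t * dt) over the
 * samples that fall in bucket b. energy[] must have nbuckets entries;
 * samples with an out-of-range bucket are skipped. */
void energy_by_bucket(const struct sample *s, size_t n,
                      double v_meas, double dt,
                      double *energy, size_t nbuckets)
{
    size_t i;

    for (i = 0; i < nbuckets; i++)
        energy[i] = 0.0;

    for (i = 0; i < n; i++) {
        if (s[i].bucket < nbuckets)
            energy[s[i].bucket] += v_meas * s[i].amps * dt;
    }
}
```

With volts, amps and seconds as units, each entry of `energy` comes out in joules.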

Fidelity for Energy?

• Before investing in incorporating energy into Odyssey for adaptation, first determine whether Odyssey's model of fidelity as the way to adapt has potential for energy savings.
• Experiments showing that potential, hand-tuned based on Powerscope information.

Case Study

Video application: original 12.1MB
• Step 1: lossy compression
  – B: 7MB, C: 2.8MB
• Step 2: display size reduced from 320x240 to 160x120
  – Asmall: 4.9MB, Csmall: 1MB
• Step 3: WaveLAN put into standby mode when not used
• Step 4: Disk powered off

[Figures: Base case; Every optimization]

Conclusions about Fidelity as Energy Saving Adaptation

• Significant variation in effectiveness of fidelity reduction among objects
• And among applications
• Combining hardware power management with fidelity reductions is good.

Can Odyssey Automate This?

• User specifies target battery lifetime.
• Odyssey is to monitor energy supply and demand
• Notify applications to change fidelity if estimated future demand and supply don't match to achieve desired lifetime.

Goal-directed Energy Adaptation

• On-line version of Powerscope (assume this will be built-in)
• Smoothed observations of past consumption as estimate of future.
• Odyssey's own criteria for notification

EstDemand = ((1-a)(sample) + (a)(old)) * time_remaining

Results

• Goals:
  – Meet specified battery lifetime
  – Highest fidelity within that constraint
  – Infrequent adaptations
  – Small leftover battery capacity at end of period.
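The EstDemand formula is an exponentially weighted average of observed power draw, scaled by the time left until the lifetime goal. A minimal sketch follows. The function names, the units (watts, seconds, joules) and the simple "demand exceeds supply" test are assumptions; the slides do not spell out Odyssey's actual notification criteria.

```c
/* Blend the newest power sample with the previous smoothed value.
 * a close to 1 favors history; a close to 0 favors the new sample. */
double smooth_power(double old, double sample, double a)
{
    return (1.0 - a) * sample + a * old;
}

/* EstDemand: energy needed if the smoothed draw continues for the
 * remaining time (watts * seconds = joules). */
double est_demand(double smoothed_power, double time_remaining)
{
    return smoothed_power * time_remaining;
}

/* Returns 1 if predicted demand exceeds the remaining supply, which
 * in this sketch is the cue to notify applications to lower fidelity. */
int should_lower_fidelity(double demand, double supply)
{
    return demand > supply;
}
```

For example, with a = 0.5, an old estimate of 10 W and a new sample of 6 W, the smoothed draw is 8 W; over 100 s that predicts 800 J, so a 700 J residual supply would trigger a notification.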
