Rethinking OS Design

[Diagram: layers of OS design, with a "You are here" marker]
• Workload: productivity applications, process control, personal (PDAs), embedded
• Services & API
• Internal Structure
• Policies / Mechanisms
• Metrics: energy efficiency
• Hardware Resources: Processor, Memory, Disks (?), Wireless & IR, Keyboard (?), Display (?), Mic & Speaker, Motors & Sensors, GPS, Camera, Batteries

(Traditional) Unix Abstractions

• Processes - thread of control with context
• Files - a named linear stream of data bytes
• Sockets - endpoints of communication between unrelated processes
Unix Process Control

The fork syscall returns a zero to the child and the child process ID to the parent.

Fork creates an exact copy of the parent process.

    int pid;
    int status = 0;

    if (pid = fork()) {
        /* parent */
        …..
        pid = wait(&status);
    } else {
        /* child */
        …..
        exit(status);
    }

Parent uses wait to sleep until the child exits; wait returns child pid and status.

Wait variants allow wait on a specific child, or notification of stops and other signals.

Child process passes status back to parent on exit, to report success/failure.

Fork/Exit/Wait Example

[Diagram: timeline of parent and child processes and their shared OS resources, from fork through exit and wait ("join")]

fork: Child process starts as clone of parent: increment refcounts on shared resources.

Parent and child execute independently: memory states and resources may diverge.

exit: On exit, release memory and decrement refcounts on shared resources.

wait ("join"): Parent sleeps in wait until child stops or exits.

Child enters zombie state: process is dead and most resources are released, but process descriptor remains until parent reaps exit status via wait.
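The fork/exit/wait cycle can be sketched as one small C function. This is a minimal illustration, not code from the slides: the name run_child_and_wait is hypothetical, and it uses waitpid (one of the wait variants) so that it collects this particular child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; the parent waits for it and
 * returns the exit status it collected, or -1 on error. */
int run_child_and_wait(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed: no child was created */

    if (pid == 0)
        _exit(code);               /* child: pass status back via exit */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)   /* parent: sleep until child exits */
        return -1;

    /* The child was a zombie between its exit and this reap. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_and_wait(7) from a parent should return 7, since only the low 8 bits of the exit code travel back through wait.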
Join Scenarios

• Several cases must be considered for join (e.g., exit/wait).
  – What if the child exits before the parent joins?
    • "Zombie" process object holds child status and stats.
  – What if the parent continues to run but never joins?
    • How not to fill up memory with zombie processes?
  – What if the parent exits before the child?
    • Orphans become children of init (process 1).
  – What if the parent can't afford to get "stuck" on a join?
    • Unix makes provisions for asynchronous notification.

Signals

• Signals notify processes of internal or external events.
  – the Unix software equivalent of interrupts/exceptions
  – only way to do something to a process "from the outside"
  – Unix systems define a small set of signal types
• Examples of signal generation:
  – keyboard ctrl-c and ctrl-z signal the foreground process
  – synchronous fault notifications, syscall errors
  – asynchronous notifications from other processes via kill
  – IPC events (SIGPIPE, SIGCHLD)
  – alarm notifications

signal == "upcall"
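For the case where the parent cannot afford to get stuck on a join, one option is to poll with waitpid and WNOHANG. The sketch below is an illustration under that assumption; spawn_children and reap_nonblocking are hypothetical helper names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start `n` children that exit immediately; returns how many were started. */
int spawn_children(int n)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);              /* child: nothing to do */
        if (pid > 0)
            started++;
    }
    return started;
}

/* Reap every child that has already exited, without blocking.
 * Returns the number of zombies collected by this call. */
int reap_nonblocking(void)
{
    int reaped = 0;
    /* WNOHANG makes waitpid return 0 instead of sleeping when children
     * exist but none has exited yet; -1 means there are no children. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

A parent could call reap_nonblocking() from its main loop, or after a SIGCHLD notification, so that zombies do not accumulate while it keeps running.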
Using Signals

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    volatile sig_atomic_t alarmflag = 0;

    /* printf is not async-signal-safe; it is kept here only for the demo */
    void alarmHandler(int signo)
    {
        printf("An alarm clock signal was received\n");
        alarmflag = 1;
    }

    int main(void)
    {
        signal(SIGALRM, alarmHandler);   /* sets up signal handler */
        alarm(3);                        /* instructs kernel to send SIGALRM in 3 seconds */
        printf("Alarm has been set\n");
        while (!alarmflag)
            pause();                     /* suspends caller until signal */
        printf("Back from alarm signal handler\n");
        return 0;
    }

Process Handling of Signals

1. Each signal type has a system-defined default action.
   • abort and dump core (SIGSEGV, SIGBUS, etc.)
   • ignore, stop, exit, continue
2. A process may choose to block (inhibit) or ignore some signal types.
3. The process may choose to catch some signal types by specifying a (user mode) handler procedure.
   • specify alternate signal stack for handler to run on
   • system passes interrupted context to handler
   • handler may munge and/or return to interrupted context
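The difference between blocking and catching a signal can be shown with sigprocmask and sigaction. This is a minimal sketch, assuming a single-threaded process; block_then_deliver and on_usr1 are hypothetical names, and SIGUSR1 is chosen only because it has no other meaning here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;                  /* only set a flag: async-signal-safe */
}

/* Returns 1 if SIGUSR1 stayed pending while blocked and the handler
 * ran once it was unblocked; 0 otherwise; -1 on setup failure. */
int block_then_deliver(void)
{
    struct sigaction sa;
    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)   /* catch: install handler */
        return -1;

    sigset_t set, pending;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)   /* block (inhibit) */
        return -1;

    got_usr1 = 0;
    raise(SIGUSR1);                /* generated, but delivery is held back */
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGUSR1) == 1;
    int ran_early = got_usr1;

    sigprocmask(SIG_UNBLOCK, &set, NULL);   /* pending signal is delivered */

    return was_pending && !ran_early && got_usr1 == 1;
}
```

A blocked signal is not lost: it stays pending and the handler runs as soon as the signal is unblocked, which is what separates blocking from ignoring.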
Yet Another User's View

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* SIGCHLD handler: collects status */
    void childhandler(int signo)
    {
        int childPid, childStatus;
        childPid = wait(&childStatus);
        printf("child done in time\n");
        exit(0);
    }

    int main(int argc, char *argv[])
    {
        int pid;
        signal(SIGCHLD, childhandler);
        pid = fork();
        if (pid == 0) {   /* child */
            execvp(argv[2], &argv[2]);
            exit(1);      /* reached only if execvp fails */
        } else {
            sleep(5);
            printf("child too slow\n");
            kill(pid, SIGINT);
        }
    }

SIGCHLD is sent to the parent when the child terminates; if the parent sets SIGCHLD to SIG_IGN, the child is reaped automatically and never becomes a zombie.

File System Calls

• Open files are named by an integer file descriptor.
• Pathnames may be relative to process current directory.
• Process does not specify current file offset: the system remembers it.
• Standard descriptors (0, 1, 2) for input, output, error messages (stdin, stdout, stderr).
• Process passes status back to parent on exit, to report success/failure.

    char buf[BUFSIZE];
    int fd;
    ssize_t n;

    if ((fd = open("../zot", O_TRUNC | O_RDWR)) == -1) {
        perror("open failed");
        exit(1);
    }
    while ((n = read(0, buf, BUFSIZE)) > 0) {
        /* write only the bytes that read actually returned */
        if (write(fd, buf, n) != n) {
            perror("write failed");
            exit(1);
        }
    }
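A read/write copy loop has to cope with short reads and short writes, because both calls may transfer fewer bytes than requested. The following sketch shows one way to do that; write_all and copy_fd are hypothetical helper names, not part of the slides.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly `len` bytes, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;                  /* skip what was written */
        len -= (size_t)n;
    }
    return 0;
}

/* Copy from `in` to `out` until end of file.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            return total;          /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out, buf, (size_t)n) != 0)
            return -1;
        total += n;
    }
}
```

Because descriptors 0 and 1 are already open, copy_fd(0, 1) behaves like a minimal cat; neither side ever names a file offset.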
File Sharing Between Parent/Child

    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        char c;
        int fdrd, fdwt;

        if ((fdrd = open(argv[1], O_RDONLY)) == -1)
            exit(1);
        if ((fdwt = creat(argv[2], 0666)) == -1)
            exit(1);

        fork();

        /* parent and child both copy, sharing one offset per open file */
        for (;;) {
            if (read(fdrd, &c, 1) != 1)
                exit(0);
            write(fdwt, &c, 1);
        }
    }

[Bach]

Sharing Open File Instances

[Diagram: parent and child process objects, each with its own process file descriptors that point to the same entry in the system open file table]
• Process objects (parent, child): user ID, process ID, process group ID, parent PID, signal state, siblings, children
• Process file descriptors
• System open file table: shared seek offset in shared file table entry
• Shared file (inode or vnode)
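The shared seek offset can be observed directly: if the child reads from an inherited descriptor, the parent's offset moves too. This is a minimal sketch under the assumption that /tmp is writable; offset_after_child_read is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open a scratch file, let a forked child read `nbytes` from the
 * inherited descriptor, and return the parent's file offset afterwards.
 * Returns -1 on error. */
long offset_after_child_read(int nbytes)
{
    char path[] = "/tmp/share_offset_XXXXXX";   /* assumes /tmp is writable */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                              /* file lives until fd closes */

    const char data[] = "0123456789abcdef";
    if (nbytes < 0 || nbytes > (int)sizeof data - 1 ||
        write(fd, data, sizeof data - 1) != (ssize_t)(sizeof data - 1) ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        char buf[16];
        ssize_t n = read(fd, buf, (size_t)nbytes);   /* advances the shared offset */
        _exit(n == nbytes ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    long off = (long)lseek(fd, 0, SEEK_CUR);   /* parent never read anything */
    close(fd);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? off : -1;
}
```

The parent performs no reads of its own, so a nonzero result can only come from the child's read on the shared open file table entry.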
File Directories

• Directories are (guess what?) a type of file.
• A hierarchy of directories - a filesystem - has a root (/)
• Pathnames are absolute or relative to working directory, ., ..
• root filesystem may have roots of other filesystems mounted into the hierarchy.
• Directories manipulated by link(), unlink(), mkdir(), rmdir().

Devices

Various devices are abstracted as special files.
• Named by a filename.
• Accessed via open(), close(), read(), and write()
• Idiosyncratic operations of the device are accessed through ioctl() calls.
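The directory calls can be exercised together in a scratch directory. The sketch below assumes /tmp is writable and that its filesystem supports hard links; link_count_demo is a hypothetical name, and mkdtemp is used in place of mkdir only to get a unique directory name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a scratch directory, give one file two names with link(),
 * and return the file's link count (2 expected). Cleans up after itself.
 * Returns -1 on error. */
int link_count_demo(void)
{
    char dir[] = "/tmp/dirdemo_XXXXXX";    /* assumes /tmp is writable */
    if (mkdtemp(dir) == NULL)
        return -1;

    char a[64], b[64];
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    int result = -1;
    int fd = open(a, O_CREAT | O_WRONLY, 0600);
    if (fd >= 0) {
        close(fd);
        if (link(a, b) == 0) {             /* second directory entry, same file */
            struct stat st;
            if (stat(b, &st) == 0)
                result = (int)st.st_nlink;
            unlink(b);                     /* remove one name */
        }
        unlink(a);                         /* remove the last name */
    }
    rmdir(dir);                            /* succeeds only once the directory is empty */
    return result;
}
```

The link count of 2 reflects that a directory entry is just a name for a file: link() adds a name, unlink() removes one, and the file itself goes away when the last name and the last open descriptor are gone.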
OSDM: OnNow

[Diagram: layers from top to bottom: Applications, OnNow WIN32 ext, OS, ACPI Spec, HW]

• SetSystemPowerState – initiate sleep state, query apps(?)
• SetThreadExecutionState – specifies level of support needed (e.g. display required)
• WM_POWERBROADCAST – a message notifying of power state changes to which applications can respond
• SetWaitableTimer – ensure PC is awake at scheduled time
• RequestDeviceWakeup
• RequestWakeupLatency - to specify latency requirements
• GetSystemPowerStatus and GetDevicePowerState

Framework for Adaptation

• Odyssey project - Satya (CMU)
• Odyssey is an attempt to incorporate application-aware adaptation
• Noble et al, Agile application-aware adaptation for mobility, SOSP 97 (network bandwidth examples)
• Flinn and Satya, Energy-aware adaptation for mobile applications, SOSP 99 (energy usage examples)
Odyssey Provides

• API - new syscalls to register a window of tolerance for a variable resource (e.g. network bandwidth)
• Notifications of change (upcalls)
  – Implies detection of changes. Mechanisms needed.
• Typing - Wardens which handle type-specific functionality
  – Type awareness necessary to evaluate tradeoffs
• Viceroy – centralized resource coordination

Architecture of Odyssey Client

[Diagram: Application, Viceroy, Wardens, Cache Mgt, Kernel]
API Example
• Each movie in multiple tracks at different fidelity levels
• Warden can switch between tracks to fit bandwidth requirements
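The window-of-tolerance idea and the track switch can be sketched in C. Every name here (tolerance_window, check_window, pick_track) is hypothetical and chosen for illustration; this is not the Odyssey API, only a guess at the shape of the logic described on the slides.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not the Odyssey API. */
typedef struct {
    double lo, hi;                         /* window of tolerance */
    void (*upcall)(double level, void *ctx);
    void *ctx;
} tolerance_window;

/* Deliver an upcall if the new estimate falls outside the window.
 * Returns 1 if the window was violated, 0 otherwise. */
int check_window(const tolerance_window *w, double level)
{
    if (level >= w->lo && level <= w->hi)
        return 0;
    if (w->upcall != NULL)
        w->upcall(level, w->ctx);
    return 1;
}

/* Tracks are assumed sorted by ascending bandwidth need (and fidelity).
 * Returns the index of the highest-fidelity track that fits, or -1. */
int pick_track(const double *needs, size_t n, double bandwidth)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (needs[i] <= bandwidth)
            best = (int)i;
    return best;
}
```

In this sketch the application registers a window around the bandwidth its current track needs; when the estimate leaves the window, the upcall would call pick_track and register a new window around the chosen track.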
Energy Resource

• Monitoring to detect resource availability
  – Powerscope
• Using Odyssey for adaptation in this domain.

PowerScope [Flinn] as a Tool

• Statistical sampling approach
  – Program counter/process (PC/PID) + correlated current readings.
  – Off-line analysis to generate profile
• Causality
  – Goal is to assign energy costs to specific application events / program structure
  – Mapped down to procedure level
  – System-wide. Includes all processes, including kernel
Experimental Setup

[Diagram: multimeter and profiling computer, linked by "Trigger" and "Trigger next sample" signals]
• Multimeter's clock drives sampling at period of 1.6ms
• Multimeter takes current sample
• Interrupt causes PC/PID sample to be buffered on the profiling computer
• User-level daemon writes to disk when buffer 7/8 full

Data Gathering

System Monitor

Kernel Mods
• recording of PC and PID
• fork(), exec(), exit() instrumented to record pathname associated with process
• new system calls to control profiling
• pscope_init(), pscope_start(), pscope_stop(), pscope_read() (user-level daemon, to disk)
Energy Analyzer

• Voltage essentially constant, only current recorded.
• Each sample is binned into process bucket and procedure within process bucket.
• Energy calculated by summing each bucket
\[ E = V_{meas} \sum_{t=0}^{n} I_t \, \Delta t \]
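The per-bucket energy sum can be written out in a few lines of C. This is a sketch under assumptions: the sample struct and the energy_for_pid name are hypothetical, and samples are binned by process only, not by procedure.

```c
#include <stddef.h>

/* One sample: which process was running and the current drawn (amperes).
 * Hypothetical layout for illustration. */
typedef struct {
    int pid;
    double current_a;
} sample;

/* Energy in joules attributed to `pid`: V * sum(I_t) * dt over the
 * samples binned to that process. */
double energy_for_pid(const sample *s, size_t n, int pid,
                      double volts, double dt_seconds)
{
    double sum_current = 0.0;
    for (size_t i = 0; i < n; i++)
        if (s[i].pid == pid)
            sum_current += s[i].current_a;
    return volts * sum_current * dt_seconds;
}
```

Because the voltage and the sampling interval are constant, they factor out of the sum, which is why only the current has to be recorded per sample.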
Fidelity for Energy?

• Before investing in incorporating energy into Odyssey for adaptation, first determine whether Odyssey's model of fidelity as the way to adapt has potential for energy savings.
• Experiments showing that potential, hand-tuned based on Powerscope information.

Case Study

Video application
• Original: 12.1MB
• Step 1: lossy compression
  – B: 7MB, C: 2.8MB
• Step 2: display size reduced from 320x240 to 160x120
  – Asmall: 4.9MB, Csmall: 1MB
• Step 3: WaveLAN put into standby mode when not used
• Step 4: Disk powered off
[Figures: "Base case" and "Every optimization"]
Conclusions about Fidelity as Energy Saving Adaptation

• Significant variation in effectiveness of fidelity reduction among objects
• And among applications
• Combining hardware power management with fidelity reductions is good.

Can Odyssey Automate This?

• User specifies target battery lifetime.
• Odyssey is to monitor energy supply and demand
• Notify applications to change fidelity if estimated future demand and supply don't match to achieve desired lifetime.
Goal-directed Energy Adaptation

• On-line version of Powerscope (assume this will be built-in)
• Smoothed observations of past consumption as estimate of future.

  EstDemand = ((1 - a) * sample + a * old) * time_remaining

• Odyssey's own criteria for notification

Results

• Goals:
  – Meet specified battery lifetime
  – Highest fidelity within that constraint
  – Infrequent adaptations
  – Small leftover battery capacity at end of period.
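The EstDemand formula from the Goal-directed Energy Adaptation slide is an exponentially weighted average of power scaled by the time left. The sketch below follows that formula directly; the function names are hypothetical, and the margin in adaptation_hint is an assumption added to avoid flip-flopping, not something the slides specify.

```c
/* Exponentially smoothed power estimate: a weights the old estimate,
 * (1 - a) weights the newest sample. */
double smooth_power(double old, double sample, double a)
{
    return (1.0 - a) * sample + a * old;
}

/* Predicted energy demand for the rest of the target lifetime. */
double est_demand(double old, double sample, double a, double time_remaining)
{
    return smooth_power(old, sample, a) * time_remaining;
}

/* Returns 1 if applications should lower fidelity (demand exceeds supply),
 * -1 if there is slack to raise it, 0 if within `margin` of supply.
 * The margin is an assumption for this sketch. */
int adaptation_hint(double demand, double supply, double margin)
{
    if (demand > supply)
        return 1;
    if (demand < supply - margin)
        return -1;
    return 0;
}
```

A larger a makes the estimate smoother and adaptations less frequent, at the cost of reacting more slowly to a real change in consumption.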