VESPER (Virtual Embraced Space ProbER)

Sungho Kim, Hitachi, Ltd., Systems Development Lab, [email protected]
Satoru Moriya, Hitachi, Ltd., Systems Development Lab, [email protected]
Satoshi Oshima, Hitachi, Ltd., Systems Development Lab, [email protected]

Abstract

This paper describes VESPER (Virtual Embraced Space ProbER), a framework that gathers guest information effectively in a virtualized environment. VESPER is designed to provide evaluation criteria for system reliability and serviceability, used in decision-making for system switching or migration by a cluster manager. In general, the cluster manager exchanges messages between underlying nodes to check their health through the network. In this way, however, the manager can neither discover a fault immediately nor get detailed information about faulty nodes. VESPER injects Kprobes into the guest to gather the guest's detailed information. By communicating with the guest's Kprobes through the VMM infrastructure, VESPER can provide the manager with fault information much more promptly.

In this paper, we explain how VESPER injects Kprobes into a guest and clarify the benefits by showing a use case on Xen. As VESPER is not strongly coupled to a specific VMM, we also show its portability to KVM and lguest.

1 Introduction

Recently, the trend of applying virtualization technology to enterprise server systems has become much more noticeable, for example in server consolidation. The technology enables a single server to execute multiple tasks which would usually run on multiple physical machines. From that point of view, applying virtualization technology to cluster computing is very attractive and worth considering as well, in terms of efficient resource utilization and system dependability. Furthermore, use of virtualized environments for cluster systems allows us to make improvements in several areas of general clustering technology—especially, fail-over response latency in high-availability clusters.

Even in a virtualized environment, a cluster manager such as Heartbeat [1][2] delivers messages between underlying nodes periodically to check their health through the network (known as a heartbeat). If any node fails to reply in a certain time, the manager will assign the service which the faulty node was providing to another node. This amount of time before switching to another node is called the deadtime, and it is key to ascertaining node death in Heartbeat. With this approach, however, the manager can not immediately determine faults, nor get detailed information about faulty nodes; this results in fail-over response latency. So we have focused on improving this latency, and on the failure analysis the manager should facilitate in clustering virtual machines, by considering features of virtualization technology.

One solution to the latency and failure analysis issues is to add dynamic probing technology into virtual machines. This allows the cluster manager to have:

• Dynamic probe insertion, to guarantee service availability while inserting a probe.

• Arbitrary probe insertion to any process address, to hold versatile probing capability.

• Prompt notification when corresponding events happen around a probe.

Utilizing this probe technology to examine the health of a virtual cluster member machine could lead to faster and more efficient evaluation criteria for system switching or migration than a simple, periodic message delivery mechanism.


Speaking of probe technology, Kprobes [3] is available in the kernel community. Kprobes can insert a probe dynamically at a given address in the running kernel of a targeted virtual machine. Its execution of the probed instruction in the kernel makes it capable of dealing with detailed information on the targeted virtual machine in an event-driven fashion.

From a system management point of view, however, one privileged system running the cluster manager (called the host hereafter) should obtain all probed data from the targeted virtual machine (called the guest hereafter). So, there needs to be a mechanism to insert probes from the host into the guests to improve the manageability of the manager, which Kprobes does not take into consideration.

Xenprobes [4] has already addressed this topic. Xenprobes is newly devised for probing virtual machines, but adopts Kprobes's concept of inserting breakpoints where one needs a probe. However, Xenprobes needs the help of a debugging mechanism furnished by the VMM [5] and must stop the guest to insert the probes. Moreover, every breakpoint causes the VMM to give execution control to the host, especially in Xen [6] technology. These could be significant problems for service availability.

In this paper, therefore, we propose the framework named VESPER (Virtual Embraced Space ProbER), which gathers guest information effectively in a virtualized environment, taking advantage of the full features of Kprobes, adopted as its probing component. In contrast to Xenprobes, VESPER never gets involved with probe handlers; this avoids unnecessary probing overhead and improves service availability. VESPER simply transfers Kprobes generated in the host to the targeted guest. The transferred Kprobes do all the necessary probing work themselves in the guest, and VESPER simultaneously obtains the results of the Kprobes through shared buffers (such as relayfs) built across the host and guest.

We will describe how VESPER injects the probes into the guest and provides the solution to fail-over response latency with failure analysis in a virtualized environment by showing a use case with Xen, in the following sections of this paper. Section 2 briefly describes the design requirements of VESPER; Section 3 presents the architecture of VESPER. Section 4 presents implementation details of VESPER, while Section 5 shows some benefits of VESPER for simple web services. Finally, we conclude this paper in Section 6.

2 Design overview

In designing VESPER, we took special interest in simplicity of implementation and robustness against disasters in userland. We thus set up some design requirements for VESPER as follows, to reflect our concepts.

1. No modifications to host or guest kernels. A lightweight implementation as a kernel module could assure better usability and availability of VESPER in systems, due to the dynamic loading and unloading features of kernel modules.

2. Only the host can insert probes into the guest, and the guest itself loads them using guest kernel space only. As mentioned in the previous section, the cluster manager running on the host might as well control probes into the guest from a management point of view. In addition, the guest itself loads the probes inserted by the host into its kernel space dynamically to keep its serviceability. Furthermore, if some disaster should mangle user space but not kernel space in the guest, VESPER should still be available to find out what the problem is.

3. All probed data from the guest is sent to the host as quickly as possible. Receiving prompt alerts via probed data is the main purpose of VESPER, and is what improves fail-over response latency.

To satisfy the requirements above, VESPER mainly targets Xen and KVM as VMMs, because of the popularity of Xen and KVM [7] in the OSS (Open Source Software) community at the time of this writing. Firstly, to address Requirement 1 above, VESPER is, by preference, developed as a virtual device driver. Requirements 2 and 3 dictate that VESPER communicate between host and guest. As a matter of fact, Xen and KVM technology provide a well-defined device driver model, the split device driver, and communication infrastructure between host and guest. Xenbus is infrastructure specific to Xen technology, whereas virtio is applicable to Xen and KVM as well. Notably, virtio is available in the 2.6.24 mainline Linux kernel. VESPER uses hierarchical layers in its software structure to accommodate various infrastructures, making it more available and portable.

The layer dependent on a certain infrastructure prepares the set of functions and data items for the use of the layers independent of the VMM architecture. Details of that structure and implementation will be discussed in the upcoming sections.
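To illustrate that separation (our sketch, not VESPER's actual interface), the VMM-dependent part can be captured as an operations table that each Communication Layer backend fills in:

#include <linux/mm.h>

/* Hypothetical ops table (names are ours): the VMM-independent
 * layers call only through these hooks, and each Communication
 * Layer backend (Xenbus-based, virtio-based, ...) supplies its
 * own implementation. */
struct vesper_comm_ops {
	int  (*send_msg)(const void *msg, size_t len);
	int  (*recv_msg)(void *msg, size_t len);
	int  (*share_pages)(struct page **pages, unsigned int nr);
	void (*unshare_pages)(struct page **pages, unsigned int nr);
};

/* A backend registers its table once at initialization time. */
int vesper_register_comm_ops(const struct vesper_comm_ops *ops);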

3 Architecture and Semantics of VESPER

For probing the guest, VESPER uses Kprobes to hook into the guest Linux kernel, and uses relayfs to record probed data in the probe handler of Kprobes. In this section, we take a brief look at the VESPER architecture featuring a 3-layered structure, and then we present its semantics.

[Figure 1: Architecture of VESPER]

3.1 VESPER Architecture

As mentioned before, VESPER uses Kprobes to hook into the guest kernel. However, because Kprobes is the probing interface for the local system, it can not implant probes into a remote system directly. Besides, in the typical use case of Kprobes, the probe handler comes in the form of a kernel module. Therefore, in using Kprobes to hook into the guest kernel, VESPER should be able to load probing kernel modules, on which the handlers to probe are implemented, from host to guest. This loading capability of VESPER is implemented as split drivers and named the Probe Loader.

In VESPER, the probing modules use relay buffers to record data in the probe handlers. At this point, the probing modules are in the guest; thus, VESPER needs to transfer the buffer data from the guest to the host. This relayed data transfer capability is also implemented as split drivers and named the Probe Listener.

Figure 1 is the block diagram of the VESPER components.

As just described, VESPER contains two pairs of split drivers. These drivers are implemented for each VMM because they strongly depend on the underlying VMM. So, we divide VESPER into three layers (shown in Figure 2): the UI Layer, the Action Layer, and the Communication Layer, in order to localize VMM-dependent code. This structure lets VESPER run on Xen and KVM by replacing only the VMM architecture-dependent layer—the Communication Layer.

[Figure 2: Layers of VESPER]

1. UI Layer. This is the interface layer between VESPER and the user applications which manage the guest from the host. This layer provides interfaces for loading and unloading probing modules to/from the guest and for accessing probed data recorded in the guest. These are presented in detail in Section 5.

2. Action Layer. This is the worker layer which processes requests from the UI and Communication Layers. Concretely speaking, in this layer VESPER does the actual work of loading/unloading probing modules to/from the guest and of sharing probed data between host and guest.

3. Communication Layer. This is the layer which provides communication channels between host and guest. It strongly depends on the VMM architecture. By implementing this layer with respect to each VMM, VESPER confines the differences between VMM environments to this layer.

3.2 VESPER Semantics

In order to implant probes into guests, VESPER must load probing modules from the host into the guest. And then, VESPER sends probed data from the guest to the host. Figure 3 illustrates an overview of the semantics of VESPER.

3.2.1 Module Loading

The first step in probing the guest kernel is to load the probing module onto the guest.

0. Make Module. First of all, one should make a probing module which uses Kprobes and relayfs.

A1. Module Load Command. Execute the probing module insertion via interfaces provided by the probe loader. One can also specify module parameters, if needed.

A2. Obtain Module Information. On the host-side Action Layer of the probe loader, from user space, VESPER obtains the information about the module to insert, such as the module's name, its size, and its address, along with anything related to module parameters, if any.

A3. Send/Receive Request. Through the interface provided by the VMM, the probe loader transfers the probing module's information between the host and guest.

A4. Load Module. In the Action Layer, the probe loader on the guest side loads the module without userspace help.

A5. Share Relay Buffer. In the Action Layer, the guest's probe listener gets relayfs buffer information such as the read index, buffer ID, etc., from the probe modules; it then exports the buffers to the host.

A6. Send/Receive Buffer Information. The Communication Layer of the guest probe listener transfers the shared buffer information to the host via the VMM interface, and then the host probe listener receives it.

A7. Setup Relayfs Structure. The Action Layer of the host's probe listener builds the relayfs structure based on the information received from the guest.

A8. Analyze/Offer Probed Data. One can read probed data through the UI Layer of the probe listener.
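Returning to Step 0: here is a minimal sketch of such a probing module, pairing a Kprobes handler with a relay channel. This is our illustration rather than VESPER's shipped code; the probe point, file name, and buffer sizes are arbitrary, and the relay_open() signature shown is the 2.6.24-era one.

#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/relay.h>
#include <linux/debugfs.h>

static struct rchan *chan;

/* relayfs needs these callbacks to create/remove the buffer files. */
static struct dentry *create_buf_file(const char *filename,
				      struct dentry *parent, int mode,
				      struct rchan_buf *buf, int *is_global)
{
	return debugfs_create_file(filename, mode, parent, buf,
				   &relay_file_operations);
}

static int remove_buf_file(struct dentry *dentry)
{
	debugfs_remove(dentry);
	return 0;
}

static struct rchan_callbacks relay_cb = {
	.create_buf_file = create_buf_file,
	.remove_buf_file = remove_buf_file,
};

/* The probe handler records an event; a real handler would log
 * registers, a timestamp, or a call stack. */
static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
	relay_write(chan, "panic path hit", 15);
	return 0;
}

static struct kprobe kp = {
	.symbol_name = "panic",	/* the Section 6 use case probes the panic path */
	.pre_handler = handler_pre,
};

static int __init probing_module_init(void)
{
	chan = relay_open("probe", NULL, 4096, 8, &relay_cb, NULL);
	if (!chan)
		return -ENOMEM;
	return register_kprobe(&kp);
}

static void __exit probing_module_exit(void)
{
	unregister_kprobe(&kp);
	relay_close(chan);
}

module_init(probing_module_init);
module_exit(probing_module_exit);
MODULE_LICENSE("GPL");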

3.2.2 Probing

After finishing the procedures in Section 3.2.1, the host and guest probe listeners share the relay buffers of the probing module. Consequently, it is not necessary to transfer all the recorded data from guest to host; it is only necessary to transfer index information about the shared relay buffers (where the data is), so that the host gets the start index for actual access to the relay buffers.

B1. Gather Guest Kernel Data. Once loaded, the probing module puts data into the relay buffer in the probe handler.

B2. Get Index Information. When a change occurs in a relay sub-buffer, the Action Layer of the guest probe listener gets the index information and creates a message to notify the host.

B3. Send/Receive Message. Through the Communication Layer, the host probe listener is notified of the index data from the guest.

B4. Update Index. The Action Layer function of the host probe listener updates the index of relayfs in the host, based on the received message.

3.2.3 Module Unloading

Basically, unloading a probing module (below) is similar to loading one. The significant difference is that the unloading process takes two steps, because before removing the relay buffer in the handler in the guest, the exported user interface for the buffer in the host should be dropped.

[Figure 3: Process Flow of VESPER]

C1. Module Unload Command

C2. Obtain Module Information

C3. Send/Receive Request

C4. Unload Module (step 1)

C5. Stop Sharing Relay Buffer

C6. Send/Receive Buffer Information

C7. Destroy Relayfs Buffer

C8. Unload Module (step 2)

In the next section, we describe the detailed implementation of the probe loader and listener.

4 Implementation of VESPER

As previously described, VESPER consists of two components, named the probe loader and the probe listener. Each component is split into three parts, the UI Layer, Action Layer, and Communication Layer, to confine the dependency on the underlying VMM. In this section, we present the implementation of VESPER from the viewpoint of the Communication Layer on Xen.

4.1 Probe Loader

The Probe Loader loads probing modules from host to guest without using the guest's user space. To realize this concept, the probe loader needs to perform two functions. One is to transfer the probing module from the host to the guest, and the other is to load the module in the guest without involving user space.

4.1.1 Module Transfer Function

In order to implant the probing module from the host into the guest memory space, VESPER needs to be able to transfer the module image somehow. Generally, a VMM provides I/O infrastructure between host and guest using a shared memory mechanism. In the Xen environment, the grant table and I/O ring are provided for that. To use them for transferring the module, the probe loader is implemented as split drivers—a frontend driver in the guest, and a backend driver in the host, on Xenbus.

The backend driver allocates shared memory using the grant table and writes the module image into it. Then, the driver pushes a request onto the I/O ring for the frontend driver. After receiving the request through the I/O ring, the frontend driver maps the shared memory into its virtual memory space.

For security reasons, it is normally the frontend driver that requests I/O from the backend driver. If, instead, the backend driver issued I/O requests to the frontend driver, then because the frontend driver would need read/write permissions on the host to accomplish the requests, the guest would be able to change the contents of the host's pages.

In VESPER, however, as mentioned above, the host must issue requests to the guest for loading the module. From a security perspective, we must ensure that VESPER's communication protocol between host and guest still follows the normal pattern—that is, that the frontend driver requests, and the backend driver responds. To keep this pattern, the frontend driver first issues a dummy request to the backend driver. Then, when the UI Layer in the host triggers it, the backend driver issues the module loading request as a response to that dummy request. At this phase, the frontend driver has received all the information about the module to insert, and it then allocates grant-table pages for the module image. After the grant table allocation, the frontend driver issues the real module loading request. With that request, the backend driver does copy_from_user into the grant-table pages with respect to the module for the response. Finally, the result of loading the module is sent as a new dummy request from the frontend driver. When the next request is triggered from the host, the backend driver simply issues a new request as the response to the last dummy request, starting the next round. The steps above are involved for every module insertion request in the host. The details of the protocol follow.

1. Frontend driver issues a dummy request.

2. Application triggers the module loading into the guest.

3. Backend driver issues the module loading request as a response to the dummy request.

4. Frontend driver issues a real request with a grant table allocated.

5. Backend driver copies the module into the grant table.

6. Frontend driver loads the module.

7. Frontend driver issues a new dummy request with the result of the last module load.

8. Backend driver hands the result of the module loading to the application.

9. Repeat Steps 2 through 8 for every request from the application.

With this protocol, we sacrifice only one dummy request, incurred in the initialization phase of the drivers, which is very trivial compared to the security hole it avoids.
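The inversion is easiest to see in code. The toy model below is ours (real VESPER drives this over the Xen I/O ring); it only shows how every host-initiated action rides on the response to an already-pending frontend request:

#include <stdio.h>

/* Toy model of the reversed protocol: the backend (host) may only
 * answer outstanding frontend requests, so the frontend keeps one
 * dummy request pending at all times. */
static int dummy_pending = 1;		/* Step 1: initial dummy request */

static void backend_push_load_command(void)
{
	/* Steps 2-3: the load command travels as the *response*
	 * to the pending dummy request. */
	dummy_pending = 0;
	printf("backend: respond(load module X)\n");
}

static void frontend_do_load(void)
{
	/* Steps 4-6: real request with grant pages, backend copies
	 * the image, frontend loads it (details elided here). */
	printf("frontend: request(grant pages) / load module\n");

	/* Step 7: the result rides on a fresh dummy request,
	 * re-arming the protocol for the next module (Step 9). */
	dummy_pending = 1;
	printf("frontend: request(dummy, result=ok)\n");
}

int main(void)
{
	backend_push_load_command();
	frontend_do_load();
	return 0;
}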
4.1.2 Module Load Function

Sometimes a user program goes out of control. In such cases, the user program cannot be executed, but a kernel program can. If it is possible to load the module without user-space help, one is able to analyze system faults even in the case above. Thus, in VESPER, the module load function is implemented without using a user-space program.

Currently, the core of the module loader in Linux, such as load_module, calls copy_from_user to get module images, because it assumes that module loading is executed from user space. Inside copy_from_user, access_ok is called to verify the memory address. However, it never checks whether the calling function is executed in user space; it only checks whether the address is within the limit of its process address space. Hence, we implemented the module load function as a kernel thread in the Action Layer of the guest probe loader. This kernel thread calls sys_init_module, which calls load_module. Because the module is already copied from the host via the module transfer function described above, copy_from_user works properly.

Nevertheless, it is impossible to invoke sys_init_module and load_module from external kernel modules, because they are not exported by EXPORT_SYMBOL. To address this problem, in VESPER, we get the addresses of these symbols from the guest kernel symbol table, and then pass them as parameters to the user interface the probe loader provides for loading.
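A sketch of that kernel thread follows. This is our illustration: the parameter and structure names are invented, and the sys_init_module prototype is the 2.6.24-era one.

#include <linux/module.h>
#include <linux/kthread.h>

/* 2.6.24-era prototype of the unexported syscall we are handed. */
typedef long (*init_module_fn)(void __user *umod, unsigned long len,
			       const char __user *uargs);

/* Address found in the guest kernel symbol table and passed in
 * through the probe loader's interface (illustrative name). */
static unsigned long init_module_addr;
module_param(init_module_addr, ulong, 0);

struct vesper_load_work {
	void *image;		/* module image, already mapped via grant table */
	unsigned long len;
};

static int vesper_loader_thread(void *data)
{
	struct vesper_load_work *w = data;
	init_module_fn do_init = (init_module_fn)init_module_addr;

	/* access_ok() inside copy_from_user() only checks the address
	 * limit, so feeding load_module() an in-kernel image from a
	 * kernel thread succeeds, exactly as described above. */
	return do_init(w->image, w->len, "");
}

/* Started from the guest Action Layer, e.g.:
 *	kthread_run(vesper_loader_thread, &work, "vesper-loader");
 */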

4.2 Probe Listener

The Probe Listener retrieves the probed data from the guest and makes it available to user-space applications in the host. Probing modules use relayfs to record the probed data, which describes the behavior of the guest kernel around the probe points.

In fact, relayfs tends to allocate its buffers by pages, and the grant table is also a page-oriented mechanism. So, provided that the buffers controlled by relayfs are allocated via the grant table in the guest, a copy process between host and guest is definitely unnecessary for sharing the data, because the grant table is transparent to both host and guest. The design decision to use the grant table for the relayfs buffers also eliminates the need for any control mechanism over the buffers other than the relayfs rchan.

To share the probed data in the buffers, VESPER must also share and update some information about the buffers, such as the read index and padding value, for relayfs in the host to access the buffers.

As a result, the Probe Listener consists of two components, which are a buffer share function and an index update function, and it is implemented as split drivers, just like the Probe Loader.

4.2.1 Buffer Share Function

As mentioned before, because the probing module records probed data into relay buffers, the Probe Listener shares them between host and guest. Sharing the buffers is implemented by using the grant table, like the module transfer function in the Probe Loader. Similarly, information about which buffers need to be shared is provided from guest to host by using the I/O ring.

Once the relay buffers are exported by the guest and their information is received by the host, the relayfs structure is built on the host to provide the probed data in the buffers to user-space applications. At this point, the Probe Listener does not call relay_open to create an rchan structure, which is the control structure of relayfs. This is because the Probe Listener does not need to newly allocate pages for the relay buffers, but should just map the pages exported by the guest. Therefore, the Probe Listener sets up the rchan structure manually.

After the rchan is set up, the interface to read the relay buffers is created under /sys/kernel/debug/vesper/domid/modname/ like other subsystems which use relayfs. User applications can read this interface directly, or use APIs abstracted by VESPER.

Finally, to stop sharing the buffers, the probe listener executes the above process in reverse: removing the relay structure is done first, and then the export of the relay buffers is stopped.

4.2.2 Index Update Function

When the probe listener shares the relay buffers between host and guest, it must synchronize some buffer information, such as the read index, between both rchan structures. If it does not, the user application on the host cannot read the probed data correctly. The Probe Listener uses the I/O ring for this information transfer. The guest's probe listener creates a message including the information, pushes it onto the I/O ring, and then notifies the host's probe listener. The host's probe listener gets the message from the I/O ring and updates its own relay buffer information with the message.

Ideally, the probe listener should update that information immediately whenever the guest rchan changes. However, message passing over the I/O ring is too expensive to do on every update, due to the intervention of the interrupt mechanism that notifies the host of pending messages. Hence, the Probe Listener updates the buffer information when a switch to a new sub-buffer occurs. In doing so, the probe listener updates the buffer control information, and the user application can still get the latest data probed from the guest.
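As an illustration, the guest-side hook could look like the sketch below. The message structure and notify helper are invented stand-ins for the I/O-ring path described above; only the callback signature is relayfs's own subbuf_start.

#include <linux/relay.h>

/* Hypothetical index message; the field names are ours. */
struct vesper_index_msg {
	unsigned int buf_id;	/* which relay buffer switched */
	size_t padding;		/* unused bytes at the end of the old sub-buffer */
};

void vesper_notify_host(struct vesper_index_msg *msg);	/* I/O-ring send, elided */

static int vesper_subbuf_start(struct rchan_buf *buf, void *subbuf,
			       void *prev_subbuf, size_t prev_padding)
{
	struct vesper_index_msg msg = {
		.buf_id  = 0,
		.padding = prev_padding,
	};

	vesper_notify_host(&msg);	/* one message per sub-buffer switch */
	return 1;			/* non-zero: accept the switch, keep recording */
}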
for the application to check if some error occurs around After rchan is set up, the interface to read this relay the probed point. buffers is created on /sys/kernel/debug/vesper/ domid/modname/ like other subsystems which use re- layfs. User applications can read this interface directly, 5.2 The VESPER Module API or use APIs abstracted by VESPER. VESPER defines an API to export relay buffers of the Finally, to stop sharing the buffer, the probe listener ex- probing module to the host. Additionally, VESPER pro- ecutes the above process in reverse. Removing the relay vides a callback function for use when the sub-buffer of 236 • VESPER (Virtual Embraced Space ProbER)

5.2 The VESPER Module API

VESPER defines an API to export the relay buffers of the probing module to the host. Additionally, VESPER provides a callback function for use when a sub-buffer of the relay is changed. All probing modules should call the exported start function after relay_open, and the stop-export function before relay_close. The callback function is also set as subbuf_start, the member of struct rchan_callbacks. Figure 5 shows the prototype of the module API.

int relay_export_start(
        struct rchan *rchan,
        const char *modname);

void relay_export_stop(
        const char *modname);

int virtrelay_subbuf_start_callback(
        struct rchan_buf *buf,
        void *subbuf,
        void *prev_subbuf,
        size_t prev_padding);

Figure 5: Prototype of the VESPER module API
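Tying Figures 4 and 5 together, a probing module would use the export API roughly as below. This is our sketch: the channel setup mirrors the Step 0 example from Section 3.2, and the module name is illustrative.

#include <linux/module.h>
#include <linux/relay.h>

/* VESPER module API, as declared in Figure 5. */
extern int relay_export_start(struct rchan *rchan, const char *modname);
extern void relay_export_stop(const char *modname);
extern int virtrelay_subbuf_start_callback(struct rchan_buf *buf,
					   void *subbuf, void *prev_subbuf,
					   size_t prev_padding);

static struct rchan *chan;

static struct rchan_callbacks relay_cb = {
	.subbuf_start = virtrelay_subbuf_start_callback, /* index updates for the host */
	/* .create_buf_file / .remove_buf_file as in the Section 3.2 sketch */
};

static int __init probe_init(void)
{
	chan = relay_open("probe", NULL, 4096, 8, &relay_cb, NULL);
	if (!chan)
		return -ENOMEM;
	return relay_export_start(chan, "panic_probe");	/* export after relay_open */
}

static void __exit probe_exit(void)
{
	relay_export_stop("panic_probe");	/* stop the export first... */
	relay_close(chan);			/* ...then close the channel */
}

module_init(probe_init);
module_exit(probe_exit);
MODULE_LICENSE("GPL");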

6 Evaluation

In this section, we examine the benefits of VESPER with a webserver running on a Xen-based guest, and explain how VESPER works with Heartbeat to eliminate the latency, using the test case we plan to build.

6.1 Test environment

For this experiment, we plan to prepare two physical machines. On each physical machine, we set up two guests as resources managed by the LRM (Local Resource Manager) of Heartbeat. In fact, it is a controversial issue how a guest should be treated in the cluster: as one of the cluster nodes, or as a resource like an IP address. However, we will treat guests as resources, to avoid the complexity of management that would be caused by the difficulty of distinguishing hosts and guests among all the nodes if a guest were treated as a node. On each physical machine, the webserver on one guest, call it VM1, is actively performing web service. On the other hand, the webserver in the other guest, call it VM2, is inactive. Figure 6 depicts the details.

When something goes wrong on VM1, Heartbeat lets VM2 take over all of the roles which VM1 was performing. However, one physical machine, P1, runs Heartbeat's LRM without the involvement of VESPER, to show how Heartbeat works in the usual way, while the other physical machine, P2, has LRM cooperating with VESPER. Here, since we treat the guest as a resource, we plan to add a virtual machine resource plugin conforming to OCF (Open Cluster Framework) to LRM, to make Heartbeat and LRM recognize a virtual machine as a resource. The plugin has interfaces to start, stop, and monitor (a.k.a. status) the resource. In implementing the resource plugin, the monitor interface of the plugin will invoke VESPER for LRM on P2 only.

6.2 Implementation of virtual machine resource

In Heartbeat, the CRM (Cluster Resource Manager) coordinates which resources ought to run where, and in which status they are running, working with the resource configuration it maintains. It then commands LRM to achieve all of this. LRM searches for the proper resource to handle by way of the PILS subsystem of Heartbeat, and calls the interfaces exported by the resource to execute CRM requests: start, stop, monitor, etc. We newly define a virtual machine resource, which exports start/stop/monitor interfaces implemented as follows.

[Figure 6: Test environment to demonstrate the usability of VESPER]

• start. The resource invokes the xm command (Xen tool) to start a specific guest.

• stop. The resource also invokes xm (Xen tool) to stop a specific guest.

• monitor. The resource invokes the xm command (Xen tool) to get the status of a specific guest, as well as a monitor procedure, prepared for this test, to get the status of the services running on that guest. What is more, the resource on P2 also invokes the VESPER interface and takes a logical OR of the results of the above two as the return value.
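A sketch of that P2 monitor logic is shown below. The helper names, the probing module name, and the OCF return-code mapping are our assumptions, not part of the paper's implementation.

#include <stdbool.h>
#include <stdlib.h>

/* From the VESPER user API (Figure 4). */
bool virt_is_alive(const int target_guest, const char *modname);

/* First input: a crude xm-based status check of the guest. */
static bool xm_guest_running(void)
{
	return system("xm list VM1 > /dev/null 2>&1") == 0;
}

/* OCF convention: 0 = running, 7 = not running. */
static int vm_monitor(int guest)
{
	/* "Logical OR on the results": report failure if either the
	 * xm check or VESPER's probe-based check reports failure. */
	bool up = xm_guest_running() &&
		  virt_is_alive(guest, "panic_probe.ko");

	return up ? 0 : 7;
}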

6.3 Expected result and discussion

For simplicity, we insert Kprobes around the panic path of the guest kernel via VESPER. The Kprobes thus inserted will put clues, such as the call stack leading to the panic, into relayfs when a panic occurs. Then, we cause a panic intentionally on VM1 and measure the recovery time on P1 and P2. We can easily expect that P2 will recognize what happened to its VM1 as soon as VM1 panics, because of the prompt notification done by VESPER. Obviously, the service recovery time is expected to improve dramatically, by as much as the deadtime we set for Heartbeat. Moreover, one can find out later why VM1 on P2 panicked, through the probed data in relayfs.

Through the experiment, we expect to verify the better response latency and the usability of failure analysis provided by VESPER.

However, special care should be taken with two issues regarding probes. One is which probe points are suitable for proper monitoring. If the necessary probe points are missed, no improvement over the usual Heartbeat can be expected. Actually, the problem of where probe points should be inserted seems very tricky to handle, because selecting optimal probe points requires developers or system administrators highly experienced with the kernel context and the applications running on the server. The other issue is the overhead produced by the execution of probes. One should adjust the overhead according to the required service performance. The two issues pull against each other: more probes inserted to catch fine-grained events obviously cause more probing overhead. Some mechanism to help one select optimal probes could be needed. Some suggestions for these issues are mentioned among the future works of VESPER in the next section.

7 Conclusion and future works

In this paper, we proposed VESPER as a framework to insert probes in virtualized environments and discussed what problems VESPER can solve in cluster computing. After that, we described the design and the implementation of VESPER. Then we suggested a test bed to show the improvement in fail-over response latency and failure analysis. Finally, we discussed some considerations on the placement and overhead of probing. To address these considerations, we have some plans for future VESPER developments.

7.1 Probing aid subsystem

For ease of use by the cluster manager or other applications, we plan to develop a probing aid subsystem. Probing points can be classified into several groups based on their functionality. The subsystem can thus pre-define several groups of probe points and abstract them for its clients or applications—like a memory group, network group, block-I/O group, etc. The clients just select one of the groups, and the subsystem will generate all the needed probes and relay them to VESPER. Fine-grained selection from several groups will also be supported by the subsystem.
7.2 SystemTap Enhancement

We plan to integrate VESPER's features for virtualization into SystemTap [8], for versatile usage in the native kernel as well as in virtualized kernels.

7.3 Virtio support and evaluation on KVM and lguest

Virtio is likely to become one standard solution for host-guest communication infrastructure across various VMMs. VESPER should support virtio and be evaluated on KVM and lguest [9] to verify its portability and usability.

The next two items are not related directly to the probing technique, but are worth examining as enhancements to the functionality of virtual machine resources in clusters, in terms of virtualization technology.

7.4 Virtual machine resource support for LRM

Arguably, there is a question of how services running on virtual machines should be treated if a virtual machine is treated as a resource. Should the services be handled at the same level as the virtual machine itself in LRM? When a service fails, only the failed service should be moved to another virtual machine on the same physical machine, rather than switching the virtual machine itself to another virtual machine on a different physical machine. The next version of the interface VESPER exports will suggest a solution to this.

7.5 Precaution capability on collapse of the host

If the host collapsed, all services running on the guests would be lost. That is obviously a big problem. Therefore, VESPER should probe the host simultaneously to check whether the host is in good condition. When VESPER catches a sign of the host's collapse, the cluster manager notified by VESPER could take necessary action, such as live migration to another host.

8 Acknowledgments

We would like to thank Yumiko Sugita and our colleagues for reviewing this paper.

9 Legal Statements

Linux is a registered trademark of Linus Torvalds. Other company, product, and service names may be trademarks or service marks of others.

References

[1] Alan Robertson, "Linux-HA Heartbeat Design," in Proceedings of the 4th International Linux Showcase and Conference, 2000.

[2] Heartbeat, http://linux-ha.org.

[3] Ananth N. Mavinakayanahalli et al., "Probing the Guts of Kprobes," in Proceedings of the Linux Symposium, Ottawa, Canada, 2006.

[4] Nguyen A. Quynh et al., "Xenprobes, A Lightweight User-space Probing Framework for Xen Virtual Machine," in USENIX Annual Technical Conference Proceedings, 2007.

[5] Nitin A. Kamble et al., "Evolution in Kernel Debugging using Hardware Virtualization With Xen," in Proceedings of the Linux Symposium, Ottawa, Canada, 2006.

[6] The Xen virtual machine monitor, http://www.cl.cam.ac.uk/Research/SRG/netos/xen/.

[7] KVM, http://kvm.qumranet.com/.

[8] SystemTap, http://sourceware.org/systemtap/.

[9] Rusty Russell, "lguest: Implementing the little Linux hypervisor," in Proceedings of the Linux Symposium, Ottawa, Canada, 2007.

[10] VESPER, http://vesper.sourceforge.net/.
