Virtual Square: all the virtuality you always wanted but you were afraid to ask.

Renzo Davoli
Computer Science Department
ALMA MATER STUDIORUM: University of Bologna

WorkShop 2007 sul Calcolo e Reti dell'INFN
Rimini, 10 May 2007

Virtual Square © 2007 Copyleft, Renzo Davoli

Virtual Square

[Figure: title graphic built from the words VIRTUAL, VIRTUAL SQUARED and VIRTUAL SQUARE]

VIRTUALITY today

● Virtual Machines
– historical topic
– lots of papers
– lots of tools
– ... but something is still missing
● Virtual Networking
– less historical
– several papers

Virtual Square

Virtualization concepts and tools are disconnected.

There is a world of new applications that can be realized by interoperating, integrated virtuality.

UNIFICATION IS NEEDED

Virtual Square

Some Examples of VM (free software)

● Qemu: PVM or SVM, User-Mode User Access (or dual-mode with KQEMU, proprietary sw).
– cross emulation platform (ia32, ia64, ppc, m68k, sparc, arm...)
– dynamic translation
● Xen: SVM, Native.
– xen uses para-virtualization (O.S. in domain0 has the real device drivers).
– (xen ideas come from the Denali project: SVM, Native, real virtualization).

Some Examples of VM (free software)

● User-Mode Linux (U-ML): SCVM, User-Mode User Access.
– Same ABI/Same ISA: Linux on Linux.
– Uses the debugging interface ptrace to capture the system calls.
● OpenVZ: Native, SVM (but all the VMs share the same kernel; this category is also known as Operating System Level Virtualization).
– derived from the proprietary product Virtuozzo.
– the kernel itself provides different resources and different permissions to virtual instances of Linux.

Some Other Examples

● Mac-On-Linux: (free SW, SVM, Dual)
● VMware: (proprietary, SCVM, Dual)
● VirtualPC/Virtual Server: (proprietary, SVM, Dual)
● PearPC: (free, SVM, User-Mode User-Access)
● Solaris “zones”: proprietary, Native, SVM (OSLV).

There are other kinds of Virtuality!

The “classical” definition of Virtual Machines does not catch all the possible “Virtualities”...
● Virtual Network Interfaces:
– tuntap
● Virtual Networks:
– VPN, Local VN provided with Virtual Machines
● Virtual Running Environments:
– chroot (POSIX syscall), fakeroot (UNIX utility), Linux Vserver
● Virtual File Systems
– Fuse, UnionFS, PlasticFS (uses LD_PRELOAD)

Other “Virtuality”

● Environmental Subsystems.
– system call conversion: e.g. Posix -> Win32
● System Call Interposition.
– Systrace: system call access manager
– Janus, Ostia: application firewall.

Virtual Square Goals

● We have seen a wide (wild ;-) world of different kinds of virtuality based on different ideas, several tools
● Keyword #1: communication
– different VMs must be interconnected
● Keyword #2: integration
– different VMs can be seen as special cases of a broader idea of VM
● Keyword #3: extension
– several needs could be captured by some VM abstraction, but there is no specific model of VM, nor the tool!

2 ½ Years!

● Ideas, tools... more than 100,000 lines of free SW.

Keyword #1: Communication

● Virtual Distributed Ethernet:
– VDE clients: User-Mode Linux, Qemu, virtual tap (all appl. using tap), tuntap (in linux kernel), slirp.
– VDE appears as a Local Area Network (Ethernet compliant) from all the connected entities.
– The machines interconnected by a VDE may be distributed on different hosts (no limits of “distance”, km or hops).

VDE components

[Figure labels: SWITCH, SWITCH, CROSS CABLE; VDE SWITCH, VDE SWITCH; VM (e.g. QEMU), VDE plug, VdeWire (e.g. ssh), VDE plug, VM (e.g. U-ML), TunTap Linux Module]

VDE: Related Work

● VPN: (OpenVPN) point2point, for real machines
● Overlay Networks: specific for application (peer to peer, Akamai).
● VM networking: (tools provided with VM, e.g. uml-switch) specific for VM

VDE:
● multipoint, general mesh
● no need for root (administration) access
● heterogeneous VM and non VM connected

VDEv2: advertisement

● VDEv2:
– modular design
– compatible with user-mode linux, qemu, tuntap, (plex86), umview/lwipv6
– through the vdetaplib, potentially compatible with any application using tap
– VLAN (802.1Q)
– FST (fast spanning tree)
– run-time manageable via unixterm (telnet or web with vdetelweb)
– includes slirpvde and wirefilter
– status debug (NEW!)
– plugin support (NEW!)

VDEv2

● VDE-Switch
– number of ports configurable on command line
– port0 is reserved for management clients, n-1 ports are available for connections.
– management UNIX socket for management clients
  ● self-describing SMTP-like protocol
– modules: datasock (VM conn), tuntap, consmgmt (management)

VDE cables

● VDE-plug
– is a VM that converts the Ethernet packets of a VDE port into a stream connection (stdin-stdout)
● VDE-wire
– can be any application able to give a stdin/stdout stream connection

Dual Pipe

● dpipe is a new (general purpose) command we have added.
● Pipes are well-known abstractions. The following command prints the listing of the current directory:
ls | lpr
● dpipe creates a bi-directional connection between the processes cmd1 and cmd2:
dpipe cmd1 = cmd2

VDE cables, plugs, wires and dpipe
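The bi-directional connection that dpipe sets up between cmd1 and cmd2 can be sketched with two ordinary pipes. This is a minimal illustration in C under stated assumptions, not the actual dpipe source: the names dual_pipe, spawn_end and exited_ok are hypothetical, and the sketch handles only the two-command case.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one end of the cable: the child reads stdin from in_fd and
 * writes stdout to out_fd. Returns the child's pid, or -1 on error. */
static pid_t spawn_end(char *const argv[], int in_fd, int out_fd,
                       const int fds[4])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or fork failure (-1) */
    dup2(in_fd, STDIN_FILENO);
    dup2(out_fd, STDOUT_FILENO);
    for (int i = 0; i < 4; i++)     /* drop the original pipe descriptors */
        close(fds[i]);
    execvp(argv[0], argv);
    _exit(127);                     /* reached only if exec failed */
}

/* Wait for one child; true only if it exited with status 0. */
static int exited_ok(pid_t pid)
{
    int status;
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Hypothetical equivalent of "dpipe cmd1 = cmd2":
 * stdout of cmd1 feeds stdin of cmd2, and stdout of cmd2 feeds stdin of cmd1.
 * Returns 0 if both commands exit successfully, -1 otherwise. */
int dual_pipe(char *const cmd1[], char *const cmd2[])
{
    int a[2], b[2];                 /* a: cmd1 -> cmd2, b: cmd2 -> cmd1 */
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    const int fds[4] = { a[0], a[1], b[0], b[1] };
    pid_t p1 = spawn_end(cmd1, b[0], a[1], fds);
    pid_t p2 = spawn_end(cmd2, a[0], b[1], fds);
    for (int i = 0; i < 4; i++)     /* the parent keeps no pipe end open */
        close(fds[i]);
    int ok1 = exited_ok(p1);
    int ok2 = exited_ok(p2);
    return (ok1 && ok2) ? 0 : -1;
}
```

Closing every pipe descriptor in the parent matters: otherwise neither command would ever see end-of-file when its peer terminates.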

● dpipe is used to create VDE-cables:
dpipe vde_plug = ssh vde.students.cs.unibo.it vde_plug
● this command connects by a dpipe the local vde_plug with a vde_plug running on a remote host (the wire is ssh)
● other applications can be used as wire (e.g. netcat)
● In the example vde_plug refers to the default switch. It is possible to run several switches on the same host; an extra option is needed in this case.

wirefilter

● wirefilter can be put on a cable (e.g. for network testing)
dpipe vde_plug /tmp/s1 = wirefilter -m /tmp/m = vde_plug /tmp/s2
● packet loss, delays, dup, speed, noise figures, mtu, fifoness properties of the line can be changed with command line options or in real time via a management socket.

SlirpVDE

[Figure labels: VDE SWITCH, VDE SWITCH; VDE plug, VdeWire (e.g. ssh), VDE plug; VM (e.g. QEMU) at 10.0.2.15, VM (e.g. U-ML) at 10.0.2.16; SlirpVDE at 10.0.2.2; Firefox]
– http connection from firefox running on 10.0.2.15 to www.cs.unibo.it -> 130.136.1.110
– http connection from slirpVDE running on the hosting O.S. to www.cs.unibo.it -> 130.136.1.110

Note: it supports IPv4 only. IPv6 is not yet supported.

vde_cryptcab

● Coded by Daniele Lacamera (danielinux)
● A vde_cryptcab is a distributed cable manager for VDE switches.
● Server side
vde_cryptcab -s /tmp/vde2.ctl -p 2100
● Client side
vde_cryptcab -s /tmp/vde2.ctl -c [email protected]:2100
● uses a blowfish channel (random key exchanged by scp).

LWIPv6

● It is a LWIPv4/v6 stack implemented as a library.
● Fork of the LWIP project (Adam Dunkels)
● Can be connected to any number of VDE, TUN, TAP interfaces.
● It is a hybrid stack (not a dual-stack). One single IPv6 “engine” is also able to manage IPv4 packets in compatibility mode (130.136.1.110 is managed as 0::ffff:130.136.1.110).

LWIPv6
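The compatibility mode mentioned above, where 130.136.1.110 is managed as 0::ffff:130.136.1.110, corresponds to the standard IPv4-mapped IPv6 address layout: ten zero bytes, two 0xff bytes, then the four IPv4 bytes. The sketch below builds such an address with the ordinary sockets headers; it is not LWIPv6 code, and the function name ipv4_to_mapped is a hypothetical helper chosen for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d from a dotted-quad string.
 * Returns 0 on success, -1 if the input is not a valid IPv4 address. */
int ipv4_to_mapped(const char *dotted, struct in6_addr *out)
{
    struct in_addr v4;

    if (inet_pton(AF_INET, dotted, &v4) != 1)
        return -1;

    memset(out, 0, sizeof *out);               /* bytes 0..9 stay zero */
    out->s6_addr[10] = 0xff;
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4.s_addr, 4);  /* already in network byte order */
    return 0;
}
```

With this representation a single IPv6 code path can carry both families: an address is treated as IPv4 exactly when it matches the ::ffff:0:0/96 prefix, which the standard macro IN6_IS_ADDR_V4MAPPED checks.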

● PF_INET, PF_INET6
● PF_PACKET for raw packet management
– support for user-level network analysis tools (e.g. sniffers, ethereal)
– support for user-level dhcp clients.
● PF_NETLINK for configuration
● Packet filtering

LWIPv6 interface definition API

struct netif *lwip_vdeif_add(void *arg);
struct netif *lwip_tapif_add(void *arg);
struct netif *lwip_tunif_add(void *arg);

int lwip_add_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);
int lwip_del_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);

int lwip_add_route(struct ip_addr *addr, struct ip_addr *netmask, struct ip_addr *nexthop, struct netif *netif, int flags);
int lwip_del_route(struct ip_addr *addr, struct ip_addr *netmask, struct ip_addr *nexthop, struct netif *netif, int flags);

int lwip_ifup(struct netif *netif);
int lwip_ifdown(struct netif *netif);

LWIPv6 socket API (just put lwip_ ahead)

int lwip_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
int lwip_bind(int s, struct sockaddr *name, socklen_t namelen);
int lwip_shutdown(int s, int how);
int lwip_getpeername(int s, struct sockaddr *name, socklen_t *namelen);
int lwip_getsockname(int s, struct sockaddr *name, socklen_t *namelen);
int lwip_getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
int lwip_setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
int lwip_close(int s);
int lwip_connect(int s, struct sockaddr *name, socklen_t namelen);
int lwip_listen(int s, int backlog);
int lwip_recv(int s, void *mem, int len, unsigned int flags);
int lwip_read(int s, void *mem, int len);
int lwip_recvfrom(int s, void *mem, int len, unsigned int flags, struct sockaddr *from, socklen_t *fromlen);
int lwip_send(int s, void *dataptr, int size, unsigned int flags);
int lwip_sendto(int s, void *dataptr, int size, unsigned int flags, struct sockaddr *to, socklen_t tolen);
int lwip_socket(int domain, int type, int protocol);
int lwip_write(int s, void *dataptr, int size);
int lwip_select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, struct timeval *timeout);
int lwip_ioctl(int s, long cmd, void *argp);

VDETELWEB

● It is the Web/Telnet Server for VDE switch configuration.

● It uses the LWIPv6 library

● It has two connections to the controlled VDE switch:
– management socket to give commands
– port0: the ethernet port used by the TCP-IP stack.
● It reads the set of commands, descriptions, arguments from the switch itself.
● Telnet has history/command editing and support for asynch debug output (NEW)

VDETELWEB: telnet

VDETELWEB: Web Interface

Keywords #2/#3: Integration and Extension

● Questions:
– is there a model able to capture several different types of “virtuality”?
  ● File System virtuality (Virtual FS: FUSE, chroot, unionfs, fakeroot, etc)
  ● Networking virtuality (VPN, VDE)
  ● ...
– are there practical problems that cannot be solved with current models of Virtual Machines?

View-OS

A process with a view...

View

● Each process has a view of the “world”.
– meaning of a path
– IP address, routing
– ...
● Global View assumption: all the processes running on the same computer share the same view.
● (Almost) true in POSIX. Some notable exceptions: chroot and... virtual machines.

View and Virtual Machines

● SVMs require the boot of an entire OS.
● User-Mode Linux is a SCVM, but it boots a Linux kernel.
● Fakeroot, chroot, FUSE, systrace etc. are specific virtualizations.
● chroot, FUSE, tuntap etc. require root access to run

Global effects of commands

● A command has a global effect when it changes the view of all the processes running on the same machine.
– e.g. mount of a filesystem
– e.g. changing the IP address
● Some operations are forbidden to ordinary users just because the operating system is not able to give local effects to commands:
– e.g. mount of a disk image
– e.g. define a VPN for a user
– e.g. change IP address/routing for a process.

View-OS breaks the Global View Assumption

● Each process can have its own “view” of the world.
– mount of filesystems
– redefinition of access permissions to resources
– definition of interfaces/IP addresses/routing
– definition of devices
– ...

View-OS Goals

● Redefinition of System Call behavior
● Definition of new System Calls
● Binary Compatibility with existing software
● Modularity and Composition of local view definition
● Non-privileged use.

View-OS and security

● A process can change its view with no security risks.
– local mount of a disk image. It is just a sophisticated access to a file!
– definition of a local VPN. The virtual interface used by the process is just a network connection for the hosting computer
● All these operations can be done by running a SVM or User-Mode Linux.
– no extra risks,
– waste of resources and time: boot of an entire kernel!

View-OS and Security (the other way round)

● Loopback mount of a personal disk image (with private data on it) requires setting the access permissions carefully (as we'd want a local effect but the kernel can only provide a global mount)
● Testing an unknown program/browsing dangerous edges of the internet can be very risky. Most hackers have “fake” user accounts to do such experiments. The definition of a very limited view of the filesystem/networking around the program/browser could act like a sandbox.

View-OS implementation

● Specific Kernel:
– need a lot of time to reach a usable level
– need even more time to reach a suitable level of reliability.
●
– this is a new model of virtual machine: Partial Virtual Machines (ParVM)

Partial Virtual Machine (ParVM)

[Figure labels: VIRTUAL SQUARE, VIEW-OS, PARTIAL VIRTUAL MACHINE]

● A Partial Virtual Machine
– supports the same ISA as the hosting machine
– supports the same ABI as the hosting machine
– can redefine some calls.
● ParVMs can be composed.

UMVIEW=VIEW-OS as a ParVM

[Figure: two stacks side by side, each Application Process / VIEW-OS / OS / Hardware; one labelled VIEW-OS (no mod), the other VIEW-OS with a mod]

● UMVIEW is a generic ParVM (SCVM).
– UMVIEW loads modules for specific virtualizations
– UMVIEW running without modules is an empty ParVM (each call is directly mapped on the same call of the hosting machine).

Composition of Modules
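The behaviour of a ParVM with and without modules can be pictured as a routing table consulted on every call. The sketch below is only a toy model in C, with assumptions stated loudly: the names load_module and route_call are hypothetical, a module claims calls by pathname prefix, and the most recently loaded module wins. It does not reproduce UMVIEW's real module interface.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 8

/* Hypothetical module descriptor: a name and the subtree it claims. */
struct module {
    const char *name;
    const char *prefix;
};

static struct module loaded[MAX_MODULES];
static size_t nloaded;

/* Register a module; later modules overlay earlier ones. */
int load_module(const char *name, const char *prefix)
{
    if (nloaded == MAX_MODULES)
        return -1;
    loaded[nloaded].name = name;
    loaded[nloaded].prefix = prefix;
    nloaded++;
    return 0;
}

/* True if path is the prefix itself or lies below it. */
static int claims(const struct module *m, const char *path)
{
    size_t n = strlen(m->prefix);
    return strncmp(path, m->prefix, n) == 0 &&
           (path[n] == '\0' || path[n] == '/');
}

/* Decide who serves a call on path: the newest module that claims it,
 * or "host" when no module does (the empty-ParVM pass-through case). */
const char *route_call(const char *path)
{
    for (size_t i = nloaded; i > 0; i--)
        if (claims(&loaded[i - 1], path))
            return loaded[i - 1].name;
    return "host";
}
```

With no module loaded every path is routed to "host", which mirrors the empty ParVM; loading mod1 on /mnt and then mod2 on /mnt/img sends /mnt/img/x to mod2 while /mnt/a still goes to mod1, in the manner of superimposed overlays.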

[Figure labels: UMVIEW, mod1, mod2, Internal System Call Capture]

● Modules can be composed.
● It is like superimposing overlays
● Module composition can be done by one instance of umview. Internal calls must be captured (PURELIBC)

UNIX is file system centric

● Pathname = Global Naming Scheme
● /dev, fifo, UNIX socket, /proc
● Advantages:
– no specific system calls
– inheritance of many “methods”, e.g. access prot.
● Overlay of file system subtrees = redefinition of what got the name from the subtree
● Basic operation: mount

UMVIEW mount

● Mount is a “view transformation” function: $v_1 = m_1(v_0)$
● A further mount “sees” the modified view: $v_2 = m_2(v_1) = m_2(m_1(v_0)) = (m_2 \circ m_1)(v_0)$
● UMVIEW can mount an image hidden by the mount operation itself
● UMVIEW can mount files (not only directories)
● UMVIEW mount can change the view outside the mountpoint
● UMVIEW can mount on a non-existing mountpoint

Nested UMVIEW

● What happens when a user starts UMVIEW inside UMVIEW?
umview xterm &
● At the time of activation both ParVM instances share the view, then each one can do further “mount”s
● It is a “virtual” instance: the same UMVIEW takes into account the “view split” (by the treepoch data struct).

Unfortunately...

● there are exceptions to the file-system naming scheme
● the most notable exception is networking:
– network interfaces are devices but do not appear in /dev
– there is one stack per protocol family in the kernel, no file system naming for stacks.
– the “Berkeley socket” API does not (yet?) support multiple stacks.

UMVIEW structure

UMVIEW: capture of process syscalls

● UMVIEW uses ptrace to capture process syscalls. Ptrace was created for debuggers. The debugger (UMVIEW in our case) receives a signal each time a controlled process issues a system call, and the controlled process stops. The debugger can inspect and change the memory of the process and then restart it.
● It is the same technique used by user-mode linux.

UMVIEW: compatibility and performance

● ptrace is provided as a standard feature of the linux kernel

● UMVIEW runs linux unmodified binaries on any linux kernel (2.6)

● We provide patches for the kernel of the hosting system to increase the performance
– PTRACE_MULTI: executes several ptrace requests in a single call (including load/store of large chunks of data)
– PTRACE_SYSVM: provides a method to skip useless upcalls.
● ptrace was created for debugging. There is an international discussion about the best kernel interface for virtual machine support.

Internal System Call Capture problem

● System calls generated by modules must be captured.

● Modules use libraries (that can do system calls!)
● Modules are loaded as shared libraries (for performance) and thus run in the same address space
● There is the need for a “self-inspection” tool: a process should be able to capture the system calls generated by itself!

[Figure labels: UMVIEW, mod1, mod2, Internal System Call Capture]

PURE_LIBC

● Is an overlay library. It uses GLIBC and redefines some GLIBC calls. (e.g. syscalls, stdio)

● When loaded with LD_PRELOAD, it captures the calls from the process involving system call.

● The interface is made by 4 lines of code:
typedef long int (*sfun)(long int __sysno, ...);
extern sfun _pure_syscall;
extern sfun _pure_socketcall;
extern sfun _pure_native_syscall;

_pure_syscall and _pure_socketcall can be set to capture functions
_pure_native_syscall is the way to call the real syscall hidden by the library.

UMVIEW modules

● A module implements one specific abstraction
– umfuse: virtual file systems
– umdev: virtual devices
– lwipv6: (umlwip) virtual networking
....
● The interface for modules is composed by:
– init: sets a structure with all the managing functions
– a “choice” function (the module answers the question: “is this path/mounttype/protocol etc... up to you?”); it returns a timestamp
– one function for each managed system call, with the same interface as the system call.
– one upcall function registered for select/poll.

UMFUSE

● It provides the virtual file system abstraction
● It needs submodules, user-mode implementations of the file system structures.

● It is a compatibility layer between UMVIEW and FUSE (M. Szeredi, fuse.sf.net, in the linux kernel since 2.6.14).
– It is a source level compatibility: source code of FUSE modules must be compiled with different libraries to run with UMFUSE.

UMFUSE FS modules

● UMFUSEEXT2(*): ext2, ext3 file systems
– uses the ext2fs lib
● UMFUSEISO9660(*): iso9660 (rock_ridge, joliet)
– mounts also compressed iso images
– uses the libcdio/libiso9660 libs
● UMFUSEENCFS: encoded FS
● UMFUSEFSSH: gives FS view to an ssh session
● UMFUSECRAM: cram (initrd), auto endianness(*).
● UMFUSEFSFS(*): Fast Secure File System (Cocchiaro, 2006), experimental

(*): developed by the V2 project, FUSE compatible.

One word on FSFS

● Fast Secure File System is a secure network file system.
– Encrypted NFS requires a lot of processing power on the server (especially with a large number of clients)
– Users may want (need) their data to be stored in encrypted formats on hard disks (privacy disclosure for hard disk maintenance)
– Solution: data is encrypted on disks, encrypted files are sent on a clear channel to the client. The cost of encryption is moved to the client; data is stored and transmitted in an encrypted form.
● It is a bit more complex than this: control streams must be encrypted and some measures must be taken against “record&playback” attacks....

UMFUSE @ work

● The use of umfuse is very simple.
– note that the example uses the standard GNU-linux commands. (it is the same “mount” used by root)

UMFUSE @ hard work!

● In this example a file system mounted by ssh contains an encrypted fs. The encfs is mounted and there is an ext2 image inside. The third mount is on the same target dir of the first!

Passwords are entered in the UMVIEW console window.

UMDEV

● UMDEV implements virtual devices.
● The naming item for a device is a special file (some devices can define several special files).
● From the process “view”, access to devices is a standard file access (open/read/write/close...) + specific ioctl of the device.
● UMDEV is able to give the same “perception” to processes.

UMVIEW and ioctl

● UMDEV provides ioctl virtualization
● ioctl is a “last resort” call able to manage hundreds of different requests with different arguments.
● ioctl itself is fd based so it is routed to the module that managed the open.
● there is a special call of the choice function (CHECKIOCTLPARMS) to define size/management/direction (input/output/I-O) of the args.

UMDEV testmodules

● UMDEVNULL: test module. redefines a /dev/null device, echoes what is sent to it
● UMTRIVHD: creates a ramdisk: a chunk of malloc-ed memory is given as a block device.
– it is possible to create and mount a filesystem on it.

UMDEVMBR

● This UMDEV submodule manages the MBR (EMBR) structure of disk images.
● When a disk image is mounted on a file (say /tmp/disk) it creates the special files to access all the partitions (/tmp/disk1 ... /tmp/disk63)
● It is possible to use fdisk (ddisk on linux) to repartition the hard disk; after the partition map reload, the new partitions are ready.

UMDEV @ Work

● In this example:
– I tried to partition my real HD /dev/hda as a user (permission denied)
– I mounted a disk image on /dev/hda
– In the new view I could change the partition table
– I created an ext3 partition on /dev/hda5
– I mounted the new partition on /mnt (umfuseext2)

UMDEVTAP

● Virtual TAP
● emulates the standard tuntap interface (/dev/net/tun)
● Net packets get sent to a vde switch

UMBINFMT

● It implements the same interface as the BINFMT kernel module.
● It is possible to choose the “interpreter” for an executable depending on file extensions or magic numbers (pattern matching).

UMBINFMT @ Work

In this example
– on a linuxppc I show two executables for other architectures
– I load umbinfmt
– I mount it on the same dir of the kernel one (can be mounted elsewhere, scripts are compatible if the same dir is used)
– I register the interpreter (QEMU)
– The executables run (ls i386 is a real ls, ls on arm is busybox command)

UMLWIP

● It is the virtual networking module
● UMLWIP uses LWIPV6
● The module to load is named lwipv6.so
● It creates virtual interfaces.
● Compatible with the modern iputils.
● After the module has been loaded the interfaces can be configured and used.

UMLWIP @ Work

● lwipv6.so without extra options creates two interfaces lo0 and vd0 (vde)
● vd0 is configured using iputils (IPv6 gets auto-configuration)
● ssh to a remote host shows we are using the virtual network

VIEWFS (under development)

● ViewFS is a File system remapping module
● Some features:
– Directory merge,
– WORM access,
– file and directory hiding

UMMISC

● This module virtualizes several concepts like:
– process ids
– time
– system id
– user/group id

The Operating System ZOO
www.oszoo.org

OSZOO

● OSZOO is a repository of Disk Images for VM.
● It is possible to test an O.S. just by downloading the image and running it.
● html and bittorrent access to images.

LIVE OSZOO

● Now it is possible to run OS images directly on our servers.
● Users need just a browser with jvm support.

Fedora 5 on Live OSZOO

This is the example of a GNU-Linux Fedora Core 5 loaded on a Live OSZOO virtual machine.

Virtual Square usage scenarios

● .... are so many that this should be a specific seminar, not a slide.

● Prototyping

● Security
● in general “user freedom” ;-)
● Education: Virtual Square can be used for lab exercises in System Architecture, Operating Systems, Networking, Security, Distributed Systems etc...
● Virtual Square in Education is the topic of an international cooperation with Prof. Michael Goldweber, Xavier Univ. Cincinnati, OH.

Virtual Square: what is next?

● LWIPv6 internal dhcp client

● VDE-XEN gateway

● SNMP support for VDE

● Mobility support for VDE

● UMVIEW Modules:
– Viewfs
– RemoteSyscall
– Network multistack (recursive LWIP)
– Inter module communication
● UMFUSEFAT
● Extension to the model
– consistent view of virtual users

[Figure label: MISSING BLOCKS/TILES]

Still Missing...

● Support for Klik filesystem
● Cross Platform local Procedure Call.
● Virtual SysV IPC
● WWFX
● Porting to other OS (MacOSX)
● VDE SlirpV6

[Figure label: MISSING BLOCKS/TILES]

Open Problems...

● Native Kernel support for VM
● mmap (write access)
● Museum (licenses!)
● Bugs... (there is always one around the corner)

[Figure label: Open Problems on BLOCKS/TILES]

Kernel, μKernel and mKernel!

● (Monolithic) Kernels
– bad design, efficient
– Linux is very well supported and widely applied
● Micro Kernels
– nice design, inefficient
– Lack of drivers, rarely applied outside research
  ● (or used with monolithic kernels loaded as servers)
● MilliKernel (NEW!)
– modular monolithic kernel able to use external servers for hi-level structures (File System, Networking, Memory Management Policies) or for lo-level structures (drivers).
– It is up to the user/administrator to decide what should be loaded as server or inside the kernel (efficiency vs. reliability+flexibility).

Idea...

VIEW-OS/UMVIEW modules seem to be modules for a Linux Millikernel structure...

The Virtual Square TEAM

● Renzo Davoli – rd235, reenzo (designer, main developer)
● Ludovico Gardenghi – garden (viewfs, optimization, umview inter module communication, logging interface)
● Filippo Giunchedi – godog (maintainer, vde-snmp)
● Luca Bigliardi – shammash (libcomm, autolink)
● Diego Billi (lwipv6 filtering, optimization)
● Andrea Gasparini – gaspa (nesting, kernel-patches)
● Daniele Lacamera – danielinux (vde_cryptcab, debian packages, vde-l3).
● Mattia Gentilini – MG55 (freelive oszoo)
● Andrea Forni – Gendag (remote procedure call, Slirpv6)
● Andrea Saraghiti, Mattia Belletti,..... and many others.

[email protected]

To be continued...