Virtual Square (V²) in Computer Science Education ∗

Renzo Davoli, University of Bologna, [email protected]
Michael Goldweber, Xavier University, [email protected]

ABSTRACT

It is common to name as virtual the imaginary space that can be created by using computers and networks. This space is not only a set of processing and communications means and methods but it is also a space where humans can “meet,” exchange ideas, leave messages etc. Students in computer science must learn how to design, implement, manage and debug the systems and networks that create this virtual space. Furthermore, CS students need an experimental environment, a playground, where they can develop their skills at creating and supporting these virtual environments.

For this “playground” we propose a virtual world made up of emulated computer systems and emulated networks. This emulated world will be the students’ testing environment, where they can run their own services, administer their own machines and set up security attacks without any danger to real networks and systems. It is a virtual space based on virtual machines and virtual networks but it is also a meeting place for computer science students, where they can test the effectiveness of their ideas.

This “space” is therefore a twice virtual space, which we call virtual to the second power or virtual squared (V²). It is also a virtual location (i.e. a town square) where different real computers, virtual systems and people can meet and communicate.

Categories and Subject Descriptors

K.3.2 [Computers and Education]: Computer and Information Science Education - Computer science education; C.2.5 [Computer Systems Organization]: Local and Wide-Area Networks - Ethernet; D.4.6 [Operating Systems]: Security and Protection - Access controls; D.4.4 [Operating Systems]: Communications Management - Network communication

General Terms

Experimentation, Security, Performance

Keywords

Teaching, Operating Systems, Networking, Administration, Security, Laboratory

∗ This work was partially supported by the WebMinds FIRB project of the Italian Ministry of University, Research and Education.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ITiCSE'05, June 27–29, 2005, Monte de Caparica, Portugal. Copyright 2005 ACM 1-59593-024-8/05/0006 ...$5.00.

1. DEFINITION OF A VIRTUAL SQUARE (V²) SYSTEM

A Virtual Square (V²) system [3] consists of virtual or emulated machines connected together by virtual or emulated networks.

The basic characteristics of a V² system are the following:

• Consistency of the emulation. The overall system should behave as a real system of computers and networks. The extra layer of virtuality introduced by V² can reduce the performance of the system. All the virtual computers and networks behave as real computers and networks, albeit as slower devices. Naturally, V² networks are effective only if the speed of the emulation and the processing and communication power of the underlying real distributed system are sufficient to preserve usability, in terms of responsiveness for the users.

• Possibility to integrate V² systems and real systems, or to keep V² systems completely disjoint if needed. The consistency of the emulation must therefore be at the internal processing level. If so, it is then possible to interoperate between real systems and virtual systems and to forward, switch or route packets between virtual networks and real networks. On the other hand, there can be cases in which it is desirable to have disjoint, non-intercommunicating V² systems, unable to exchange any data with real systems and networks.

• Safety. V² virtual machines and networks must run as standard user programs, with no need for dangerous kernel modules or specific root-required configurations in the underlying host systems and networks. Clearly, when V² networks are interfaced with real infrastructures, the real systems on the boundary may need some root-required configuration.

There are two components to a V² system: emulated hosts and emulated networks. There are several currently available free or open source technologies that can be used as V² emulated hosts. The only V² emulated/virtual networking environment (VDE) was developed as part of the V² initiative.

V² systems have several applications. For example, they have been used in research areas as diverse as security, privacy, mobility, and software development. This paper focuses on V² system applications in computer science education. Interested readers are referred to the Virtual Square Project Home Page [3] for information about these other uses for V² systems.

2. V² EMULATED HOSTS

Currently several different virtual machines can be used as the nodes in a V² system.

User-Mode Linux (U-ML) [13]. This is a project that realizes a complete system virtualization through system call trapping. It is a set of patches for the Linux kernel which defines a new virtual “um” hardware architecture. A kernel for the “um” architecture is just an executable for the host computer which includes the I/O virtualization routines as well as the kernel itself. Since it runs at user level it does not require any specific kernel support from the host machine. Special attention is given to both security concerns and performance; e.g. the number of threads is purposely kept low and the address space of the emulated kernel is inaccessible by the emulated tasks.

Qemu [11]. Quoting its author’s web site: “Qemu is a FAST! processor emulation using dynamic translation to achieve good emulation speed.” Qemu is able to run just as a processor emulator or as a complete system virtualizer. In the first mode it is possible to run single executables compiled for different processor architectures in a Linux environment. Furthermore, it is possible to start a virtual machine and boot an entire operating system. Qemu runs on a number of different hardware architectures, allowing for the running of i386, ppc, arm and sparc executables. Qemu also provides a virtual machine emulating i386 and ppc based architectures. This project is very active, with new ports and features announced on a daily basis. Finally, Qemu runs at user level and completely virtualizes the processor architecture.

Bochs [9]. Bochs is a historically important virtual machine project. Bochs runs on several host platforms (Linux, MacOS 9/X, and Windows) where it is able to create a complete system virtualization of an i386 architecture. Bochs relies on standard emulation techniques, thus it is quite slow when compared to modern virtual machines. Bochs runs at user level and completely virtualizes the processor architecture.

PearPC [7]. This project is conceptually similar to Bochs but implements a PPC architecture instead of an i386 PC. Thus it creates a complete system virtualized PPC box able to run several OS’s including Linux and MacOS 9/X. It runs on several architectures but there are special performance optimizations tailored for i386 host machines.¹ Like Qemu and Bochs it runs at user level, achieving a complete processor virtualization.

¹ PearPC’s authors claim that it runs 40 times slower than real host performance on an i386 and 500 times slower on other CPUs, but these absolute figures are not related to any well-known benchmark.

MPS/µMPS [10, 8]. MPS and µMPS were designed for educational purposes. Like Qemu, Bochs and PearPC, MPS/µMPS are complete virtual systems. MPS emulates a MIPS based computer (at user level, with complete processor emulation). It is a workbench for computer science students to run their experimental operating systems in a real-world consistent virtual computer while stripping off unnecessary complexities. µMPS is an MMU-simplified version of MPS designed to be more accessible for undergraduate experimentation. Both projects provide network interface support.

Ale4NET [2]. Application Level Environment for Networking (Ale4NET) has just been released in alpha version. It is an I/O virtualization only system: with Ale4NET, Unix processes (or groups of processes) can join a virtual network. Ale4NET provides neither processor nor system emulation. Instead, network calls are diverted to an Ale4NET daemon that gives a completely different perspective of the connectivity. Ale4NET can be used as a bridge to enter a V² system from a host machine at the user level. In fact, unlike a tuntap based solution, Ale4NET virtualizes the network instead of creating an OS accessible interface. Therefore, there is no need for root access to set up network connectivity. Ale4NET traps system calls via the dynamic library preloading technique: the libc interface routines used to access system calls are overridden by Ale4NET functions that trap network accesses. Ale4NET is IPv4 and IPv6 compatible, including a single hybrid stack able to run both families of protocols.

3. V² VIRTUAL NETWORKS

V² is able to use several networking tools.

1. VDE: Virtual Distributed Ethernet [5]. This is the primary glue for a Virtual Square solution/environment. Based on the idea of virtual switches and virtual crossed cables, VDE is able to create virtual Ethernet compliant distributed networks. VDE supports several kinds of V² machines (User-Mode Linux, Qemu, Bochs, MPS/µMPS, Ale4NET) and can be interfaced to real systems and networks through a tuntap interface or via slirp support. VDE runs at user level (it needs root access only when a tuntap interface is required). VDE can be used as a general tunnel, a VPN, a tool for mobility, or as a way to create a closed encrypted distributed network. Furthermore, it is network protocol transparent: any protocol able to run on an Ethernet can be supported by VDE.

2. Tuntap kernel support. Tuntap is a general virtual interface for the Linux kernel. Quoting the Linux kernel documentation file: “tuntap provides packet reception and transmission for user space programs.” It can be viewed as a simple Point-to-Point or Ethernet device which, instead of receiving packets from a physical media, receives them from a user space program, and instead of sending packets via physical media, writes them to the user space program. With tuntap it is possible to create interfaces that are seen by the kernel as real network interfaces, even though all the data sent or received through tuntap is processed by applications at the user level (i.e. not in the kernel). A tun interface has the behavior of a point-to-point network device while tap is a virtual Ethernet device. Unfortunately, tuntap needs to be part of the kernel. A port for tuntap support has been created for win32 and (partially) for MacOSX environments.

3. Slirp. Slirp is a tool by Danny Gasparovski, dating back to 1995. At that time Internet providers proposed two different kinds of contracts: a cheap remote terminal connection and an expensive ppp/slip service. Danny created a tool that was able to convert a terminal line into a ppp/slip access line for client applications. Slirp runs completely at user level: whenever a client application tries to open a new network connection, slirp catches the connect request. Slirp performs the connect for the internal application and then forwards all the packets. From the Internet’s (and from the host computer’s operating system’s) point of view it acts as if all the connections were initiated by slirp itself. Slirp has also been integrated within a VDE client.

4. HOW TO USE V² IN COMPUTER SCIENCE EDUCATION

The following list includes several laboratory exercises and experiences that can be based on V². Exercises have been divided into different areas that can be related to specific courses within the Bachelors and Masters degrees in Computer Science.

Computer Architecture: With Qemu and the GNU cross compiling tools it is currently possible for students to test C language and assembly programs for i386, sparc, powerpc and arm processors.

Practice in Operating Systems: Students can run and test several operating systems, practicing OS user interfaces and services as well as studying OS usability. Qemu currently supports several distributions of GNU-Linux (Fedora, ...), various flavors of BSD systems (OpenBSD, NetBSD, ...) and also proprietary systems (like MS products). It is also possible to test live distributions like Knoppix. On powerpc Qemu virtual machines it is possible to boot Darwin and Linux. PearPC is able to run both Darwin and MacOSX systems on i386 and powerpc host architectures.²

² While it is technically possible to run MacOSX on i386 hardware, it is forbidden by a licence condition restricting MacOSX from running on non “Apple labelled” machines.

These are just some examples of OS’s that have been tested on V² machines. Potentially all the OS’s compatible with the emulated architecture can run. There is a project named FreeOSZoo, where images of free and open source OS’s can be downloaded, ready for booting a virtual machine. Several contributors are submitting images to the project, so the set of OS’s in the Zoo is growing quickly.

While students are exposed to several different OS’s, there is almost no cost for system management for the University. Students can reboot their virtual machines and restart from a clean state by themselves at any time. The hardware investment, although higher than for a standard lab because it requires powerful machines, can be justified by the ability to test different architectures without having to stock multiple heterogeneous labs. The obsolescence of an emulated hardware architecture or OS is independent of the obsolescence of the host computer it is running on. It is possible to run a set of identical virtual computers on a distributed system including machines with different hardware architectures. There are no issues related to the different models of computers, or compatibility with the OS etc. Vice versa, when a new model of virtual machine is installed, maybe with a different hardware configuration from the previous one, it is immediately available on all the computers in the lab, even the oldest, provided they have sufficient speed to keep the emulation usable.

Operating System Administration: V² provides several solutions for teaching O.S. Administration. Qemu can provide a test-bed, where an OS can be installed from scratch and administered by students. It is possible for students to test, study and compare potentially every operating system running on the i386 architecture, in addition to several OS’s for the powerpc platform. In this perspective PearPC can be an alternative for running PowerPC OS’s on an i386 host. Using User-Mode Linux [4, 1] the virtual machine is almost as powerful as the host computer but only the Linux kernel can boot. U-ML is strictly consistent in its behavior with a standard GNU-Linux based computer; the only difference is in the OS installation.

Operating System Programming: V² allows students to run several different operating systems, each one configured to include a programming environment. It is thus possible to write programs that make use of system calls and system libraries. Students can use privileged calls in their programs, which are usually restricted to the administrator user (root).

Kernel Implementation and Programming: There have been several teaching experiences where students learn the fundamentals of kernel structure by implementing their own tiny OS from scratch [6]. MPS and µMPS have been designed as experimental virtual machines to create educational operating systems which can also be integrated within a virtual square system. µMPS in particular supports a VDE-compatible network interface. User-Mode Linux can be used for more complex exercises involving the modification of the Linux kernel (maybe for more advanced courses or as a thesis). This can be used to teach real kernel hacking techniques. In fact, several features included in the current version of the Linux kernel have been implemented and debugged using U-ML (e.g. the new virtual memory management of Linux v.2.6).

Network Administration: VDE is a powerful tool for teaching network administration. A first set of exercises can be set up using a single VDE network.

[Figure 1: An example of Virtual Square usage. Upper panel, “Real infrastructure for a Virtual LAB”: servers S1 to S4, Linux workstations WS 1 and WS 2, a router and the Internet. Lower panel, “A virtual square LAB at work”: users Alice@WS 1 and Bob@WS2 and Charlie’s laptop, with labels for the virtual machines hosted on the servers, including Darwin running an Apache Web server on S1, Debian Linux on S2, OpenBSD, MacOS, Fedora, MS W2K and a W98 victim for security testing.]
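Section 3 describes tuntap as the point where a user-level program takes the place of a physical medium, and notes that this is the one step for which VDE needs root access. As a rough illustration only, the sketch below shows how a program on a Linux host could request a tap interface. The constants are taken from the Linux header `<linux/if_tun.h>`; the `TUNSETIFF` value and the 40-byte `struct ifreq` size are assumptions that hold on common 64-bit Linux systems, and the helper names `build_ifreq` and `open_tap`, as well as the interface name `tap0`, are hypothetical and not part of VDE.

```python
import fcntl
import os
import struct

# Constants from the Linux header <linux/if_tun.h>.
TUNSETIFF = 0x400454CA   # ioctl request (value assumed for x86/ARM Linux)
IFF_TUN = 0x0001         # point-to-point device carrying IP packets
IFF_TAP = 0x0002         # virtual Ethernet device carrying Ethernet frames
IFF_NO_PI = 0x1000       # do not prepend the packet-information header

IFNAMSIZ = 16            # size of the interface-name field in struct ifreq
IFREQ_SIZE = 40          # sizeof(struct ifreq) on 64-bit Linux (assumption)


def build_ifreq(name: str, flags: int) -> bytes:
    """Pack an interface name and flags in the layout TUNSETIFF reads."""
    raw = name.encode("ascii")
    if len(raw) >= IFNAMSIZ:
        raise ValueError("interface name too long")
    # 16-byte NUL-padded name, a 16-bit flags field, then zero padding.
    return struct.pack(f"{IFNAMSIZ}sH", raw, flags).ljust(IFREQ_SIZE, b"\0")


def open_tap(name: str = "tap0") -> int:
    """Return a descriptor attached to tap interface `name`.

    Needs root (or CAP_NET_ADMIN) and tuntap support in the host kernel.
    """
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, IFF_TAP | IFF_NO_PI))
    except OSError:
        os.close(fd)
        raise
    return fd
```

With a descriptor obtained this way, each `os.read` on it should return one Ethernet frame that the kernel sent through the interface, and each `os.write` should hand one frame to the kernel. This is the kind of descriptor a user-level switch can forward frames to and from, which is why only this boundary step, and not the rest of the virtual network, needs privileges.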

With a single VDE network, all the virtual computers connected to it behave as if they were on the same LAN, so all LAN services can be configured and tested on that network. Students can be requested to install, configure and run file system sharing facilities like NFS for Unix, SMB (for MS OS’s) or Samba (for interoperability). Other interesting services that can be chosen as objects for exercises could be NIS (formerly Yellow Pages), MS-Domain or LDAP for cluster information management, distributed printing services (such as SMB or lpd services), DHCP or IPv6 auto-configuration. VDE networks can be interconnected using virtual routers (e.g. U-ML machines with several virtual interfaces running Zebra, or simply by using kernel packet forwarding and routing facilities). It is thus possible to create internetworks with arbitrary topology, configure routers between them and test intra and inter-autonomous systems routing protocols (as in [12]).

Internet Services: With a virtual network that behaves as an IP internetwork and which can also be connected by a router to the Internet, students can solve exercises relating to Internet services. For example, it is possible for them to run their own web server. They can install and run an existing software product (like Apache) or they can design and implement a tiny web server on their own. The same idea can be applied to mailers, FTP servers, proxy servers, messaging systems, DNS and so on. If the virtual network is connected to the Internet and virtual machines are assigned real IP addresses, the student services can even be accessed from the Internet. Of course it is also possible to keep the student experimental internetwork disjoint from the Internet. In the latter case the student provided services can only be accessed from the lab or by extending the virtual network outside with vde-cable based tunnels. It is very satisfying for students to see that their services are completely indistinguishable from real services, both from the perspective of the user and the maintainer.

Network Programming: In V² it is possible to design and implement network applications and to design new application layer protocols. Exercises of this kind can also be solved on real systems, provided that unprivileged ports (port number ≥ 1024) are used. With V² it is also possible to use privileged ports and to test the network support provided by different OS’s.

What is really innovative, however, is that using V² it is also possible to design and implement network layer protocols. Exercises include the implementation of a tiny IP stack from scratch, the modification of IP (or other IETF RFC-based protocols) to provide different functions, or the implementation of an entirely new protocol stack starting at the network layer.

System and Network Security: When a test system has been expressly designed for network security testing, everything is allowed. It is like using weapons on a firing range. A network for security testing must be disjoint from real networks (like a firing range with high walls around it), and everybody must be aware that no personal data or valuable information can be transmitted or stored on virtual machines connected to the network. In this environment students can test how to sniff the networks, use password catching techniques, and perform general system and network cracking activities. They can also run network attacks of any kind, e.g. denial of service or man in the middle. It is also possible to deliberately spread worms or viruses, since it is a controlled environment. Students behave like system and network crackers in solving these exercises (an enjoyable learning experience for them), but at the same time they can test countermeasures, like firewalls, anti-viral software, intrusion detection systems, etc. As a trivial but effective exercise, a class could be divided into two groups and play something similar to cops vs. robbers. At different times or on different testing networks, one group can try to breach systems and networks managed by the other, and vice versa. In this way, students can understand (on a virtual battlefield) what the risks are, and how to protect real systems and networks.

5. CONCLUSIONS

V² is a powerful playground for both computer science students and for computer scientists in general. Students can play at being system and network administrators and developers, Internet service designers and maintainers, protocol designers, operating system maintainers or authors, and security experts, in a game where the rules are exactly the same as in the real world.

Not all of these tools were created by the V² development team: V² is possible through a combination of existing tools and the ones developed for it. VDE and Ale4NET have been designed and implemented for this project. These network tools are in fact the glue that provides the means of intercommunication and unifies several virtual machines and emulation tools inside a single framework. MPS is a tool designed before V² by the same research team. All the other tools must be credited to their designers and developers. In several cases our team has cooperated with the development teams of the other tools to include compatibility code or entire software interfaces for VDE and Virtual Square.

V² has been used in Bologna for the “Practicum in Operating Systems” and “Operating Systems Design” courses (Spring term 2004). Students have learnt how to administer GNU-Linux machines and have then built a startup procedure based on the make utility, to be called by the Unix init process instead of the rc.x scripts. make is able to track all the dependences between services, so start and stop requests can start other services if necessary or stop other services when they are unused. We have also set up a VDE tunnel broker to join the experimental network from home. 55 groups out of 64 have already submitted their work (groups are composed of 3 or 4 people); however, the final deadline for the submission has not expired yet.

V² is going to be used more extensively starting from the Fall term for several computer science degree courses and for the new Master in Free and Open Source Software Technology.

6. ACKNOWLEDGMENTS

This work is based on Free and Open Source Software. It would have been very hard, if not impossible, to do something similar based on proprietary solutions: it would have required a host of non-disclosure agreements, guarantees regarding access to interesting source code, possibly a promise to avoid integrating code owned by competitors, and the asking of permission to publish the research results.

We feel that today Free and Open Source software is the key for authentic, open minded, long term research in computer science. Therefore we want to use the opportunity to thank the whole community that develops, debugs and broadcasts Free and Open Source Software and ideas.

7. REFERENCES

[1] R. R. Adams and C. Erickson. Linux in education: Teaching system administration with linux. Linux Journal, 2001.
[2] R. Davoli. Ale4net home page. http://ale4net.sourceforge.net.
[3] R. Davoli. Virtual square home page. http://www.virtualsquare.org/.
[4] R. Davoli. Teaching operating systems administration with user-mode linux. In Proceedings of the 9th ITiCSE Conference on Innovation and Technology in Computer Science Education, pages 102–106, Leeds, UK, June 2004.
[5] R. Davoli. VDE: Virtual distributed ethernet. Technical report, University of Bologna, Dept. of Computer Science, 2004. Technical Report UBLCS 2004-12.
[6] R. Davoli and M. Goldweber. New directions in operating systems courses using hardware simulators. In Proc. of International Conference on Simulation and Multimedia in Engineering Education (ICSEE), Orlando, 2003.
[7] S. B. et al. Pearpc home page. http://pearpc.sourceforge.net.
[8] M. Goldweber, R. Davoli, and M. Morsiani. The Kaya OS project and the µMPS hardware simulator. In Proceedings of the 10th ITiCSE Conference on Innovation and Technology in Computer Science Education, 2005.
[9] K. Lawton. Bochs project home page. http://bochs.sourceforge.net.
[10] M. Morsiani and R. Davoli. Learning operating system structure and implementation through the MPS computer system simulator. In Proceedings of the 30th SIGCSE Technical Symposium on Computer Science Education, pages 63–67, New Orleans, 1999.
[11] Qemu CPU emulator. http://fabrice.bellard.free.fr/qemu/index.org.html.
[12] The Computer Networks Research Group, University of Rome 3. Netkit: The poor man system for experimenting computer networks. http://www.netkit.org/.
[13] User-mode linux. http://www.usermodelinux.org/.