AWS Firecracker VMM 之 ⼤熱天捲起袖⼦動⼿玩
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Iocage Documentation Release 1.2
iocage Documentation Release 1.2 Brandon Schneider and Peter Toth Sep 26, 2019 Contents 1 Documentation: 3 1.1 Install iocage...............................................3 1.2 Basic Usage...............................................6 1.3 Plugins.................................................. 10 1.4 Networking................................................ 11 1.5 Jail Types................................................. 14 1.6 Best Practices............................................... 15 1.7 Advanced Usage............................................. 16 1.8 Using Templates............................................. 20 1.9 Create a Debian Squeeze Jail (GNU/kFreeBSD)............................ 21 1.10 Known Issues............................................... 22 1.11 FAQ.................................................... 23 1.12 Indices and tables............................................ 24 Index 25 i ii iocage Documentation, Release 1.2 iocage is a jail/container manager written in Python, combining some of the best features and technologies the FreeBSD operating system has to offer. It is geared for ease of use with a simplistic and easy to learn command syntax. FEATURES: • Templates, basejails, and normal jails • Easy to use • Rapid thin provisioning within seconds • Automatic package installation • Virtual networking stacks (vnet) • Shared IP based jails (non vnet) • Dedicated ZFS datasets inside jails • Transparent ZFS snapshot management • Binary updates • Export and import • And many more! Contents 1 iocage Documentation, -
Freebsd's Jail(2) Facility
Lousy virtualization, Happy users: FreeBSD's jail(2) facility Poul-Henning Kamp [email protected] CHROOT(2) FreeBSD System Calls Manual CHROOT(2) NAME chroot -- change root directory LIBRARY Standard C Library (libc, -lc) SYNOPSIS #include <unistd.h> int chroot(const char *dirname); Calling chroot(2) in ftpd(1) implemented ”anonymous FTP” without the hazzle of file/pathname parsing and editing. ”anonymous FTP” became used as a tool to enhance network security. By inference, chroot(2) became seen as a security enhancing feature. ...The source were not strong in those. Exercise 1: List at least four ways to escape chroot(2). Then the Internet happened, ...and web-servers, ...and web-hosting Virtual hosts in Apache User get their own ”virtual apache” but do do not get your own machine. Also shared: Databases mailprograms PHP/Perl etc. Upgrading tools (PHP, mySQL etc) on virtual hosting machines is a nightmare. A really bad nightmare: Cust#1 needs mySQL version > N Cust#2 cannot use mySQL version <M (unless PHP version > K) Cust#3 does not answer telephone Cust#4 has new sysadmin Cust#5 is just about ready with new version Wanted: Lightweight virtualization Same kernel, but virtual filesystem and network address plus root limitations. Just like chroot(2) with IP numbers on top. Will pay cash. Close holes in chroot(2) Introduce ”jail” syscall + kernel struct Block jailed root in most suser(9) calls. Check ”if jail, same jail ?” in strategic places. Fiddle socket syscall arguments: INADDR_ANY -> jail.ip INADDR_LOOPBACK -> jail.ip Not part of jail(2): Resource restriction Hardware virtualization Covert channel prevention (the hard stuff) Total implementation: 350 changed source lines 400 new lines of code FreeBSD without jail usr Resources of various sorts / home var process process process process process process Kernel FreeBSD with jail usr Resources of various sorts / home var process process process process* process process Kernel error = priv_check_cred( cred, PRIV_VFS_LINK, SUSER_ALLOWJAIL); if (error) return (error); The unjailed part One jailed part of the system. -
Sandboxing 2 Change Root: Chroot()
Sandboxing 2 Change Root: chroot() Oldest Unix isolation mechanism Make a process believe that some subtree is the entire file system File outside of this subtree simply don’t exist Sounds good, but. Sandboxing 2 2 / 47 Chroot Sandboxing 2 3 / 47 Limitations of Chroot Only root can invoke it. (Why?) Setting up minimum necessary environment can be painful The program to execute generally needs to live within the subtree, where it’s exposed Still vulnerable to root compromise Doesn’t protect network identity Sandboxing 2 4 / 47 Root versus Chroot Suppose an ordinary user could use chroot() Create a link to the sudo command Create /etc and /etc/passwd with a known root password Create links to any files you want to read or write Besides, root can escape from chroot() Sandboxing 2 5 / 47 Escaping Chroot What is the current directory? If it’s not under the chroot() tree, try chdir("../../..") Better escape: create device files On Unix, all (non-network) devices have filenames Even physical memory has a filename Create a physical memory device, open it, and change the kernel data structures to remove the restriction Create a disk device, and mount a file system on it. Then chroot() to the real root (On Unix systems, disks other than the root file system are “mounted” as a subtree somewhere) Sandboxing 2 6 / 47 Trying Chroot # mkdir /usr/sandbox /usr/sandbox/bin # cp /bin/sh /usr/sandbox/bin/sh # chroot /usr/sandbox /bin/sh chroot: /bin/sh: Exec format error # mkdir /usr/sandbox/libexec # cp /libexec/ld.elf_so /usr/sandbox/libexec # chroot /usr/sandbox -
Institutionalizing Freebsd Isolated and Virtualized Hosts Using Bsdinstall(8), Zfs(8) and Nfsd(8)
Institutionalizing FreeBSD Isolated and Virtualized Hosts Using bsdinstall(8), zfs(8) and nfsd(8) [email protected] @MichaelDexter BSDCan 2018 Jails and bhyve… FreeBSD’s had Isolation since 2000 and Virtualization since 2014 Why are they still strangers? Institutionalizing FreeBSD Isolated and Virtualized Hosts Using bsdinstall(8), zfs(8) and nfsd(8) Integrating as first-class features Institutionalizing FreeBSD Isolated and Virtualized Hosts Using bsdinstall(8), zfs(8) and nfsd(8) This example but this is not FreeBSD-exclusive Institutionalizing FreeBSD Isolated and Virtualized Hosts Using bsdinstall(8), zfs(8) and nfsd(8) jail(8) and bhyve(8) “guests” Application Binary Interface vs. Instructions Set Architecture Institutionalizing FreeBSD Isolated and Virtualized Hosts Using bsdinstall(8), zfs(8) and nfsd(8) The FreeBSD installer The best file system/volume manager available The Network File System Broad Motivations Virtualization! Containers! Docker! Zones! Droplets! More more more! My Motivations 2003: Jails to mitigate “RPM Hell” 2011: “bhyve sounds interesting...” 2017: Mitigating Regression Hell 2018: OpenZFS EVERYWHERE A Tale of Two Regressions Listen up. Regression One FreeBSD Commit r324161 “MFV r323796: fix memory leak in [ZFS] g_bio zone introduced in r320452” Bug: r320452: June 28th, 2017 Fix: r324162: October 1st, 2017 3,710 Commits and 3 Months Later June 28th through October 1st BUT July 27th, FreeNAS MFC Slips into FreeNAS 11.1 Released December 13th Fixed in FreeNAS January 18th 3 Months in FreeBSD HEAD 36 Days -
Thread Scheduling in Multi-Core Operating Systems Redha Gouicem
Thread Scheduling in Multi-core Operating Systems Redha Gouicem To cite this version: Redha Gouicem. Thread Scheduling in Multi-core Operating Systems. Computer Science [cs]. Sor- bonne Université, 2020. English. tel-02977242 HAL Id: tel-02977242 https://hal.archives-ouvertes.fr/tel-02977242 Submitted on 24 Oct 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Ph.D thesis in Computer Science Thread Scheduling in Multi-core Operating Systems How to Understand, Improve and Fix your Scheduler Redha GOUICEM Sorbonne Université Laboratoire d’Informatique de Paris 6 Inria Whisper Team PH.D.DEFENSE: 23 October 2020, Paris, France JURYMEMBERS: Mr. Pascal Felber, Full Professor, Université de Neuchâtel Reviewer Mr. Vivien Quéma, Full Professor, Grenoble INP (ENSIMAG) Reviewer Mr. Rachid Guerraoui, Full Professor, École Polytechnique Fédérale de Lausanne Examiner Ms. Karine Heydemann, Associate Professor, Sorbonne Université Examiner Mr. Etienne Rivière, Full Professor, University of Louvain Examiner Mr. Gilles Muller, Senior Research Scientist, Inria Advisor Mr. Julien Sopena, Associate Professor, Sorbonne Université Advisor ABSTRACT In this thesis, we address the problem of schedulers for multi-core architectures from several perspectives: design (simplicity and correct- ness), performance improvement and the development of application- specific schedulers. -
April 2006 Volume 31 Number 2
APRIL 2006 VOLUME 31 NUMBER 2 THE USENIX MAGAZINE OPINION Musings RIK FARROW OpenSolaris:The Model TOM HAYNES PROGRAMMING Code Testing and Its Role in Teaching BRIAN KERNIGHAN Modular System Programming in MINIX 3 JORRIT N. HERDER, HERBERT BOS, BEN GRAS, PHILIP HOMBURG, AND ANDREW S. TANENBAUM Some Types of Memory Are More Equal Than Others DIOMEDIS SPINELLIS Simple Software Flow Analysis Using GNU Cflow CHAOS GOLUBITSKY Why You Should Use Ruby LU KE KANIES SYSADMIN Unwanted HTTP:Who Has the Time? DAVI D MALONE Auditing Superuser Usage RANDOLPH LANGLEY C OLUMNS Practical Perl Tools:Programming, Ho Hum DAVID BLANK-EDELMAN VoIP Watch HEISON CHAK /dev/random ROBERT G. FERRELL STANDARDS USENIX Standards Activities NICHOLAS M. STOUGHTON B O OK REVIEWS Book Reviews ELIZABETH ZWICKY, WITH SAM STOVER AND RI K FARROW USENIX NOTES Letter to the Editor TED DOLOTTA Fund to Establish the John Lions Chair C ONFERENCES LISA ’05:The 19th Large Installation System Administration Conference WORLDS ’05: Second Workshop on Real, Large Distributed Systems FAST ’05: 4th USENIX Conference on File and Storage Technologies The Advanced Computing Systems Association Upcoming Events 3RD SYMPOSIUM ON NETWORKED SYSTEMS 2ND STEPS TO REDUCING UNWANTED TRAFFIC ON DESIGN AND IMPLEMENTATION (NSDI ’06) THE INTERNET WORKSHOP (SRUTI ’06) Sponsored by USENIX, in cooperation with ACM SIGCOMM JULY 6–7, 2006, SAN JOSE, CA, USA and ACM SIGOPS http://www.usenix.org/sruti06 MAY 8–10, 2006, SAN JOSE, CA, USA Paper submissions due: April 20, 2006 http://www.usenix.org/nsdi06 2006 -
Building a Virtualisation Appliance with Freebsd/Bhyve/Openzfs Jason Tubnor ICT Senior Security Lead Introduction
Building a virtualisation appliance with FreeBSD/bhyve/OpenZFS Jason Tubnor ICT Senior Security Lead Introduction Building an virtualisation appliance for use within a NGO/NFP Australian Health Sector About Me Latrobe Community Health Service (LCHS) Background Problem Concept Production Reiteration About Me 26 years of IT experience Introduced to Open Source in the mid 90’s Discovered OpenBSD in 2000 A user and advocate of OpenBSD and FreeBSD Life outside of computers: Ultra endurance gravel cycling Latrobe Community Health Service (LCHS) Originally a Gippsland based NFP/NGO health service ICT manages 900+ users Servicing 51 sites across Victoria, Australia Covering ~230,000km2 Roughly the size of Laos in Aisa or Minnesota in USA “Better health, Better lifestyles, Stronger communities” Background First half of 2016 awarded contract to provide NDIS services Mid 2016 – deployment of initial infrastructure MPLS connection L3 switch gear ESXi host running a Windows Server 2016 for printing services Background – cont. Staff number grew We hit capacity constraints on the managed MPLS network An offloading guest was added to the ESXi host VPN traffic could be offloaded from the main network Using cheaply available ISP internet connection Problem Taking stock of the lessons learned in the first phase We needed to come up with a reproducible device Device required to be durable in harsh conditions Budget constraints/cost savings Licensing model Phase 2 was already being negotiated so a solution was required quickly Concept bhyve [FreeBSD] was working extremely well in testing Excellent hardware support Liberally licensed OpenZFS Simplistic Small footprint for a type 2 hypervisor Hardware discovery phase FreeBSD Required virtualisation components in CPU Concept – cont. -
Bhyve - Improvements to Virtual Machine State Save and Restore
bhyve - Improvements to Virtual Machine State Save and Restore Darius Mihai Mihai Carabas, University POLITEHNICA of Bucharest University POLITEHNICA of Bucharest Splaiul Independent, ei 313, Bucharest, Romania, 060042 Splaiul Independent, ei 313, Bucharest, Romania, 060042 Email: [email protected] Email: [email protected] Abstract—As more complex tasks are delegated to distributed Regardless of their exact function, a periodic task is a routine servers, virtual machine hypervisors need to adapt and provide that will have to be called at (or as close as possible) a set features that allow redundancy and load balancing. One such interval. mechanism is the virtual machine save and restore through sys- sleep tem snapshots. A snapshot should allow the complete restoration For example, the basic Unix command can be used of the state that the virtual machine was in when the snapshot to perform an operation every N seconds if sleep $N is was created. Since the snapshot should encapsulate the entire called in a script loop. Since it is safe to assume that all state of the virtualized system, the guest system should not be modern processors have hardware timekeeping components able to differentiate between the moment a snapshot was created implemented, sleep will request from the operating system and the moment when the system was restored, regardless of how much real time has passed between the two events. This that a software timer (i.e., one that is implemented by the paper will present how the time management and block devices operating system as an abstraction [1]) to be set for N are saved and restored for bhyve, FreeBSD’s virtual machine seconds in the future and will yield the processor, without hypervisor. -
In PDF Format
Arranging Your Virtual Network on FreeBSD Michael Gmelin ([email protected]) January 2020 1 CONTENTS CONTENTS Contents Introduction 3 Document Conventions . .3 License . .3 Plain Jails 4 Plain Jails Using Inherited IP Configuration . .4 Plain Jails Using a Dedicated IP Address . .5 Plain Jails Using a VLAN IP Address . .6 Plain Jails Using a Loopback IP Address . .7 Adding Outbound NAT for Public Traffic . .7 Running a Service and Redirecting Traffic to It . .9 VNET Jails and bhyve VMs 10 VNET Jails Using sysutils/pot ............................. 10 VNET Jails Using sysutils/iocage ............................ 13 Managing Bridges . 13 Adding bhyve VMs and DHCP to the Mix . 16 Preventing Traffic Between VNET Jails/VMs . 17 Firewalling Inside VNET Jails/VMs . 20 VXLAN 22 VXLAN Example Overview . 22 Gateway Configuration . 24 Jailhost-a . 25 Network Configuration (jailhost-a) . 25 VM Configuration (jailhost-a) . 26 Jail Configuration (jailhost-a) . 27 Jailhost-b . 27 Network Configuration (jailhost-b) . 27 Plain Jail Configuration (jailhost-b) . 28 Network Switch Setup (jailhost-b) . 29 VNET Jail Configuration (jailhost-b) . 30 VM Configuration (jailhost-b) . 30 VXLAN Multicast Troubleshooting . 31 Conclusion and Further Reading 33 2020-01-08 (final) 2 CC BY 4.0 INTRODUCTION Introduction Modern FreeBSD offers a range of virtualization options, from the traditional jail environment sharing the network stack with the host operating system, over vnet jails, which allow each jail to have its own network stack, to bhyve virtual machines running their own kernels/operating systems. Depending on individual requirements, there are different ways to configure the virtual network. Jail and VM management tools can ease the process by abstracting away (at least some of) the underlying complexities. -
Running Untrusted Application Code: Sandboxing Running Untrusted Code
Running Untrusted Application Code: Sandboxing Running untrusted code We often need to run buggy/unstrusted code: programs from untrusted Internet sites: toolbars, viewers, codecs for media player old or insecure applications: ghostview, outlook legacy daemons: sendmail, bind honeypots Goal : if application “misbehaves,” kill it Approach: confinement Confinement : ensure application does not deviate from pre-approved behavior Can be implemented at many levels: Hardware : run application on isolated hw (air gap) difficult to manage Virtual machines : isolate OS’s on single hardware System call interposition : Isolates a process in a single operating system Isolating threads sharing same address space: Software Fault Isolation (SFI) Application specific: e.g. browser-based confinement Implementing confinement Key component: reference monitor Mediates requests from applications Implements protection policy Enforces isolation and confinement Must always be invoked: Every application request must be mediated Tamperproof : Reference monitor cannot be killed … or if killed, then monitored process is killed too Small enough to be analyzed and validated A simple example: chroot Often used for “guest” accounts on ftp sites To use do: (must be root) chroot /tmp/guest root dir “/” is now “/tmp/guest” su guest EUID set to “guest” Now “/tmp/guest” is added to file system accesses for applications in jail open(“/etc/passwd”, “r”) ⇒⇒⇒ open(“/tmp/guest/etc/passwd”, “r”) ⇒ application cannot access files outside of jail Jailkit Problem: all utility progs (ls, ps, vi) must live inside jail • jailkit project: auto builds files, libs, and dirs needed in jail environment • jk_init : creates jail environment • jk_check: checks jail env for security problems • checks for any modified programs, • checks for world writable directories, etc. -
User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines
User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines A dissertation presented by Kapil Arya to the Faculty of the Graduate School of the College of Computer and Information Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy Northeastern University Boston, Massachusetts August 2014 Copyright c August 2014 by Kapil Arya NORTHEASTERN UNIVERSITY GRADUATE SCHOOL OF COMPUTER SCIENCE Ph.D. THESIS APPROVAL FORM THESIS TITLE: User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines AUTHOR: Kapil Arya Ph.D. Thesis approved to complete all degree requirements for the Ph.D. degree in Computer Science Distribution: Once completed, this form should be scanned and attached to the front of the electronic dissertation document (page 1). An electronic version of the document can then be uploaded to the Northeastern University-UMI website. Abstract Checkpoint-Restart is the ability to save a set of running processes to a check- point image on disk, and to later restart them from the disk. In addition to its traditional use in fault tolerance, recovering from a system failure, it has numerous other uses, such as for application debugging and save/restore of the workspace of an interactive problem-solving environment. Transparent checkpointing operates without modifying the underlying application pro- gram, but it implicitly relies on a “Closed World Assumption” — the world (including file system, network, etc.) will look the same upon restart as it did at the time of checkpoint. This is not valid for more complex programs. Until now, checkpoint-restart packages have adopted ad hoc solutions for each case where the environment changes upon restart. -
Jails and Unikernels
Virtualisation: Jails and Unikernels Advanced Operating Systems Lecture 18 Colin Perkins | https://csperkins.org/ | Copyright © 2017 | This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Lecture Outline • Alternatives to full virtualisation • Sandboxing processes: • FreeBSD jails and Linux containers • Container management • Unikernels and library operating systems Colin Perkins | https://csperkins.org/ | Copyright © 2017 2 Alternatives to Full Virtualisation • Full operating system virtualisation is expensive • Need to run a hypervisor • Need to run a complete guest operating system instance for each VM • High memory and storage overhead, high CPU load from running multiple OS instances on a single machine – but excellent isolation between VMs • Running multiple services on a single OS instance can offer insufficient isolation • All must share the same OS • A bug in a privileged service can easily compromise entire system – all services running on the host, and any unrelated data • A sandbox is desirable – to isolate multiple services running on a single operating system Colin Perkins | https://csperkins.org/ | Copyright © 2017 3 Sandboxing: Jails and Containers • Concept – lightweight virtualisation • A single operating system kernel • Multiple services /bin • Each service runs within a jail a service specific container, running just what’s needed /bin /dev for that application /dev /etc /etc /mail /home /home /jail /www /lib • Jail restricted to see only a subset of the / /lib /local /usr filesystem and processes of the host /usr /lib /sbin • A sandbox within the operating system /sbin /sbin /tmp /tmp /share /var • Partially virtualised /var Colin Perkins | https://csperkins.org/ | Copyright © 2017 4 Example: FreeBSD Jail subsystem Jails: Confining the omnipotent root.