
In the name of God, the Most Gracious, the Most Merciful

Sharif University of Technology
Data and Network Security Lab.

Lightweight Virtualization in Linux
  Sadegh Dorri N.
  PhD Candidate
  Data and Network Security Lab. Seminar, 4 Aban 1393

The Need for Virtualization
  Hypervisors are the living proof of operating systems' incompetence!
  Scheduling a Multi-process “application”
    - Nice, priority, etc. are hard to manage dynamically
  Kernel Memory Management
    - Fork bombs
    - $ while true; do mkdir x; cd x; done
  Abuse should be the application's problem, rather than being everyone's!
  The failure of operating systems and how we can fix it: http://lwn.net/Articles/524952/

Agenda
  Motivation
    - Virtualization architectures
    - OS-level virtualization in Linux
  A demo
  Under the hood
    - LXC components
    - Related kernel features: cgroups and namespaces
  Security considerations
  Conclusion

Various Virtualization Architectures
  Hardware Virtualization
    VMware, Parallels, QEMU, Bochs, Xen, KVM
    Resources cannot be shared between VMs.
  OS-Level Virtualization
    - Linux Containers (LXC), Linux-VServer, OpenVZ, Parallels Virtuozzo Containers
    - FreeBSD jails
    - Solaris Containers/Zones
    - IBM AIX6 WPARs (Workload Partitions)

OS-Level Virtualization in Linux
  Linux Containers
    - Allow a kernel to support more resource-isolation use-cases
    - Without the overhead and complexity of running multiple kernel and driver instances
  Benefits
    - Isolation
    - Small footprint
    - Speed

3) Speed

2) Footprint
  On a typical physical server, with average compute resources, you can easily run:
    - 10-100 virtual machines
    - 100-1000 containers
  On disk, containers can be very light.
    - A few MB, even without fancy storage.

1) Isolation
  Each container has:
  Its own network interface (and IP address)
    - can be bridged, routed... just like VMs
  Its own filesystem
    - Debian host can run Fedora container (& vice-versa)
  Isolation (security)
    - container A & B can't harm (or even see) each other
  Isolation (resource usage)
    - soft & hard quotas for RAM, CPU, I/O...
  Possibility of process checkpoint/freeze and migration
    - Isolation prevents resource name conflicts

Use-Cases: Developers
  Continuous Integration
    - After each commit, run 100 tests in 100 environments
  Continuous Packaging
    - Example: Project Builder
  Escape dependency hell
    - Build (and/or run) in a controlled environment
  Put everything in a container
    - Even the tiny things

Use-Cases: Hosting Providers
  Cheaper Hosting (VPS providers)
  Give away more free stuff
    - "Pay for your production, get your staging for free!"
    - Spin up/down on demand, in seconds
    - Example: dotCloud
  “Google has built their entire datacenter infrastructure around Linux containers, launching more than 2 billion containers per week.”
  (Kubernetes: open source Google cloud platform)

Use-Cases: Everyone
  Look inside your VMs
    - You can see (and kill) individual processes
    - You can browse (and change) the filesystem
  Do (almost) whatever you did with VMs
    - ... But faster
  Migration
    - Checkpoint then unfreeze: experimental (CRIU)

Solutions in Linux

OpenVZ
  Modified Linux kernel
    - Also works with unpatched Linux 3.x (reduced feature set)
  Each container is a separate entity with its own:
    - Files: System libraries, applications, virtualized /proc and /sys, virtualized locks, etc.
    - Users and groups: its own root user, as well as other users and groups.
    - Process tree: only sees its own processes (incl. init)
    - Network: virtual network device with own IP addresses, iptables, and routing rules.
    - Devices: can be granted access to real devices.
    - IPC objects: shared memory, semaphores, messages.
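
On a mainline kernel, most of the per-container entities listed above (files, users, process tree, network, IPC objects) roughly correspond to the kernel namespaces named later in these slides. The sketch below is only an illustration of how that can be inspected from user space on Linux: it reads the symlinks under /proc/<pid>/ns, whose targets have the form "kind:[inode]". Two processes that show the same inode for a kind share that namespace. The function names `parse_ns_link` and `list_namespaces` are hypothetical helpers, not part of LXC or OpenVZ.

```python
import os
import re

# Symlink targets under /proc/<pid>/ns look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(ns_dir="/proc/self/ns"):
    """Map each entry in ns_dir to the inode of the namespace it points to."""
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


if __name__ == "__main__":
    # Only meaningful on Linux; skip quietly elsewhere or when access is denied.
    if os.path.isdir("/proc/self/ns"):
        try:
            for name, inode in list_namespaces().items():
                print(f"{name:20s} {inode}")
        except OSError as error:
            print(f"could not read namespaces: {error}")
```

Running this once on the host and once inside a container should show different inode numbers for the namespaces the container does not share with the host.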

LXC (LinuX Containers)
  Container:
    - Provides an env. like a standard Linux installation but without the need for a separate kernel.
    - Single kernel and drivers, multiple different user spaces
  A group of processes in Linux in an isolated environment.
    - From inside: looks like a VM
    - From outside: looks like normal processes
    - Something (conceptually) in the middle between a chroot on steroids and a full-fledged VM
  LXC vs. OpenVZ
    - OpenVZ: production ready and stable; pushing to the upstream
    - LXC: a work-in-progress; uses standard kernel features

LXC Lifecycle
  lxc-create
    - Set up a container (root filesystem and config)
  lxc-start
    - Boot the container (by default, you get a console)
  lxc-console
    - Attach a console (if you started in background)
  lxc-stop
    - Shut down the container
  lxc-destroy
    - Destroy the filesystem created with lxc-create
  See also: LXC Web Panel
    - http://lxc-webpanel.github.io/

Demo...

Under the Hood

LXC Components
  Components:
    - The liblxc library
    - Several language bindings for the API:
      ● Python, Lua, Go, Ruby, Haskell
    - A set of standard tools to control the containers
    - Container templates
  Open source!
  https://linuxcontainers.org/

Features Making up LXC
  Kernel features used in LXC:
    - Isolation:
      ● Kernel namespaces (ipc, uts, mount, pid, network and user)
      ● Chroots (using pivot_root)
    - Resource management
      ● Control groups (cgroups)
    - Security:
      ● AppArmor and SELinux profiles
      ● Seccomp policies
      ● Kernel capabilities

Pivot_root and Chroot
  Change the root directory to a new path
    - Pivot_root: switches the complete system and removes dependencies on the old root dir.
    - Chroot: applied on a single process

Seccomp
  seccomp (SECure COMPuting mode)
    - A simple sandboxing mechanism (Linux 2.6.12+ (2005))
    - Allows a process to make a one-way transition into a "secure" state
      ● Syscalls limited to exit(), sigreturn(), read() and write() to already-open file descriptors.
    - Any attempt to make another system call results in SIGKILL.
  seccomp-bpf
    - An extension to seccomp that allows filtering of system calls using a configurable policy
    - Used by OpenSSH and vsftpd as well as Google Chrome/Chromium on Chrome OS and Linux to sandbox Flash player and renderers.

Capabilities
  In traditional UNIX, processes are:
    - Privileged (EUID is 0): Bypass all kernel permission checks.
    - Unprivileged: full permission checking (EUID, EGID, and supplementary group list).
  Since Linux kernel 2.2:
    - The superuser privileges are divided into distinct units (a.k.a. capabilities)
    - Capabilities can be independently enabled and disabled (per-thread)
  Examples:
    - CAP_CHOWN: Make arbitrary changes to file UIDs and GIDs.
    - CAP_KILL: Bypass permission checks for sending signals.
    - CAP_NET_ADMIN: Perform various network-related operations.
    - CAP_SYS_ADMIN
    - CAP_SYS_BOOT: Use reboot and kexec_load

Linux Security Modules (LSM)
  A Linux kernel framework to support different security models
    - Avoids favoritism toward any single implementation.
    - Examples: AppArmor, SELinux, Smack and TOMOYO Linux
  Used to implement different MACs (Mandatory Access Control)

Control Groups

Introduction to CGroups
  Cgroups (control groups):
    - Allocate resources (CPU, memory, network, or their combinations) among user-defined groups of tasks (processes)
    - Think ulimit, but for groups of processes ... and with fine-grained accounting.
    - Initiated at Google (2006)
    - Available in Fedora 18 kernel and Ubuntu 12.10 kernel (also some previous releases).
  Commands:
    - cgcreate: creates a new cgroup
    - cgset: sets parameters for given cgroup(s)
    - cgexec: runs a task in specified control groups.

CGroups: Implementation
  Implemented as a special cgroup file system
    - libcgroup is a library that abstracts the control group file system in Linux.
    - CGroup services: Allow persistence across reboot and ease of use.
  A few simple hooks inserted into the kernel (not performance-critical):
    - In boot phase, process creation and destroy methods, task_struct
  procfs entries:
      ● For each process: /proc/pid/cgroup.
      ● System-wide: /proc/cgroups

CGroup Subsystems
  cpu
    - control CPU scheduler
  cpuacct
    - generates automatic reports on CPU resources
  cpuset
    - assigns individual CPUs (cores) and memory
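
The per-process procfs entry mentioned on the implementation slide can be read directly. As a minimal sketch, assuming the record format "hierarchy-ID:controller-list:cgroup-path" described in the cgroups(7) manual page, the following parses /proc/<pid>/cgroup into tuples. The helper names `parse_cgroup_line` and `read_cgroups` are hypothetical, and the path "/lxc/web" in the usage note is made up for illustration.

```python
import os


def parse_cgroup_line(line):
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' record."""
    # maxsplit=2 keeps any ':' inside the cgroup path intact.
    hierarchy, controllers, path = line.rstrip("\n").split(":", 2)
    # An empty controller list is what the unified (v2) hierarchy reports.
    names = controllers.split(",") if controllers else []
    return int(hierarchy), names, path


def read_cgroups(pid="self"):
    """Return the parsed cgroup membership of a process."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return [parse_cgroup_line(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Only meaningful on Linux; skip quietly elsewhere.
    if os.path.exists("/proc/self/cgroup"):
        for hierarchy, names, path in read_cgroups():
            print(hierarchy, ",".join(names) or "(unified)", path)
```

For example, a record such as "4:cpu,cpuacct:/lxc/web" would parse to hierarchy 4, controllers cpu and cpuacct, and the group path /lxc/web.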