CDE: Run Any Linux Application On-Demand Without Installation
Total Page:16
File Type:pdf, Size:1020Kb
CDE: Run Any Linux Application On-Demand Without Installation Philip J. Guo Stanford University [email protected] Abstract with compiling, installing, and configuring software and their myriad of dependencies. For example, the official There is a huge ecosystem of free software for Linux, but Google Chrome help forum for “install/uninstall issues” since each Linux distribution (distro) contains a differ- has over 5800 threads. ent set of pre-installed shared libraries, filesystem layout In addition, a study of US labor statistics predicts that conventions, and other environmental state, it is difficult by 2012, 13 million American workers will do program- to create and distribute software that works without has- ming in their jobs, but amongst those, only 3 million will sle across all distros. Online forums and mailing lists be professional software developers [24]. Thus, there are are filled with discussions of users’ troubles with com- potentially millions of people who still need to get their piling, installing, and configuring Linux software and software to run on other machines but who are unlikely their myriad of dependencies. To address this ubiqui- to invest the effort to create one-click installers or wres- tous problem, we have created an open-source tool called tle with package managers, since their primary job is not CDE that automatically packages up the Code, Data, and to release production-quality software. For example: Environment required to run a set of x86-Linux pro- grams on other x86-Linux machines. Creating a CDE • System administrators often hack together ad- package is as simple as running the target application un- hoc utilities comprised of shell scripts and custom- der CDE’s monitoring, and executing a CDE package re- compiled versions of open-source software, in or- quires no installation, configuration, or root permissions. der to perform system monitoring and maintenance CDE enables Linux users to instantly run any application tasks. Sysadmins want to share their custom-built on-demand without encountering “dependency hell”. tools with colleagues, quickly deploy them to other machines within their organization, and “future- 1 Introduction proof” their scripts so that they can continue func- tioning even as the OS inevitably gets upgraded. The simple-sounding task of taking software that runs on • Research scientists often want to deploy their com- one person’s machine and getting it to run on another putational experiments to a cluster for greater per- machine can be painfully difficult in practice. Since no formance and parallelism, but they might not have two machines are identically configured, it is hard for permission from the sysadmin to install the required developers to predict the exact versions of software and libraries on the cluster machines. They also want to libraries already installed on potential users’ machines allow colleagues to run their research code in order and whether those conflict with the requirements of their to reproduce and extend their experiments. own software. Thus, software companies devote con- • Software prototype designers often want clients to siderable resources to creating and testing one-click in- be able to execute their prototypes without the has- stallers for products like Microsoft Office, Adobe Pho- sle of installing dependencies, in order to receive toshop, and Google Chrome. Similarly, open-source de- continual feedback throughout the design process. velopers must carefully specify the proper dependencies in order to integrate their software into package manage- In this paper, we present an open-source tool called ment systems [4] (e.g., RPM on Linux, MacPorts on Mac CDE [1] that makes it easy for people of all levels of OS X). Despite these efforts, online forums and mail- IT expertise to get their software running on other ma- ing lists are still filled with discussions of users’ troubles chines without the hassle of manually creating a robust Ubuntu "The cloud" Fedora SUSE Fedora Ubuntu Debian Your Linux SUSE CentOS machine CentOS Debian ... Your Linux machine Figure 1: CDE enables users to package up any Linux Figure 2: CDE’s streaming mode enables users to run any application and deploy it to all modern Linux distros. Linux application on-demand by fetching the required files from a farm of pre-installed distros in the cloud. installer or dealing with user complaints about depen- dencies. CDE automatically packages up the Code, Data, of CDE in a short paper [20], this paper presents a more and Environment required to run a set of x86-Linux pro- complete CDE system with three new features: grams on other x86-Linux machines without any instal- • To overcome CDE’s primary limitation of only be- lation (see Figure 1). To use CDE, the user simply: ing able to package dependencies collected on exe- cuted paths, we introduce new tools and heuristics 1. Prepends any set of Linux commands with the cde for making CDE packages complete (Section 3). executable. cde executes the commands and uses ptrace system call interposition to collect all the • To make CDE-packaged programs behave just like code, data files, and environment variables used native applications on the target machine rather than during execution into a self-contained package. executing in an isolated sandbox, we introduce a new seamless execution mode (Section 4). 2. Copies the resulting CDE package to an x86-Linux • Finally, to enable users to run any Linux application machine running any distro from the past ∼5 years. on-demand, we introduce a new application stream- 3. Prepends the original packaged commands with the ing mode (Section 5). Figure 2 shows its high-level cde-exec executable to run them on the target architecture: The system administrator first installs machine. cde-exec uses ptrace to redirect file- multiple versions of many popular Linux distros in related system calls so that executables can load a “distro farm” in the cloud (or an internal com- the required dependencies from within the package. pute cluster). The user connects to that distro farm Execution can range from ∼0% to ∼30% slower. via an ssh-based protocol from any x86-Linux ma- chine. The user can now run any application avail- The main benefits of CDE are that creating a package able within the package managers of any of the dis- is as easy as executing the target program under its super- tros in the farm. CDE’s streaming mode fetches the vision, and that running a program within a package re- required files on-demand, caches them locally on quires no installation, configuration, or root permissions. the user’s machine, and creates a portable distro- The design philosophy underlying CDE is that people independent execution environment. Thus, Linux should be able to package up their Linux software and users can instantly run the hundreds of thousands of deploy it to other Linux machines with as little effort as applications already available in the package man- possible. However, CDE is not meant to replace tradi- agers of all distros without being forced to use one tional installers or package managers; its intended role is specific release of one specific distro1. to serve as a convenient ad-hoc solution for people like sysadmins, research scientists, and prototype makers. This paper continues with descriptions of real-world Since its release in Nov. 2010, CDE has been down- use cases (Section 6), evaluations of portability and per- loaded over 3,000 times [1]. We have exchanged hun- formance (Section 7), comparisons to related work (Sec- dreds of emails with users throughout both academia and tion 8), and concludes with discussions of design philos- industry. In the past year, we have made several signifi- ophy, limitations, and lessons learned (Section 9). cant enhancements to the base CDE system in response to 1The package managers included in different releases of the same user feedback. Although we introduced an early version Linux distro often contain incompatible versions of many applications! Alice's computer program 1. cde <command> open() filesystem kernel open() open file cde-package/ cde-root/ cde /usr/lib/logutils.so usr/ lib/ copy file into package copy logutils.so Figure 4: Timeline of control flow between target pro- gram, kernel, and cde process during an open syscall. Bob's computer 2. 3. cde-exec <command> 2.1 Creating a new CDE package To create a self-contained package with all of the depen- filesystem redirect open() dencies required to run her anomaly detection script on cde-package/ another Linux machine, Alice simply prepends her com- cde-root/ mand with the cde executable: /usr/lib/logutils.so usr/ lib/ cde python detect_anomalies.py net.log logutils.so cde runs her command normally and uses the Linux ptrace system call to monitor all of the files it ac- Figure 3: Example use of CDE: 1.) Alice runs her com- cesses throughout execution. cde creates a new sub- mand with cde to create a package, 2.) Alice sends her directory called cde-package/cde-root/ and copies package to Bob’s computer, 3.) Bob runs command with all of those accessed files into there, mirroring the orig- cde-exec, which redirects file accesses into package. inal directory structure. Figure 4 shows an overview of the control flow between the target program, Linux ker- 2 CDE system overview nel, and cde during a file-related system call. For example, if Alice’s script dynamically We described the details of CDE’s design and implemen- loads an extension module as a shared library tation in a prior paper and its accompanying technical named /usr/lib/logutils.so (i.e., log pars- report [20]. We will now summarize the core features of ing utility code), then cde will copy it to CDE using an example.