A Survey of Linux Metadata Package Management Systems and Their Solvers
SRIKALA BHARADWAJ¹, MEGHA P. ARAKERI²
¹,²Information Science Department, MSRIT, Bangalore

Proceedings of 6th IRF International Conference, Bangalore, India, 01st June, 2014. ISBN: 978-93-84209-24-7

Abstract: Software components that encapsulate units of functionality for an operating system are created and widely distributed to end users in packaged forms, such as the RPM Package Manager (RPM) format or the Debian format, in many Linux distributions. Many Linux flavours use a primary package manager together with a secondary metadata package manager to handle the installation, upgrade and removal of packages. This survey paper examines metadata package management and the various techniques used by the package managers available today.

Index Terms: Package Management, Dependency Hell, Dependency Resolution, Automatic Install, Upgrade.

I. INTRODUCTION

Software can be shipped to an end user either in source form or in pre-built package form. The source form is configurable to one's needs and so is preferred by developers, who configure, build and install it as per their requirements. But managing software this way becomes tedious over time. A typical end user prefers a pre-built, ready-to-install packaged application. The package maintainer takes on the burden of configuring, building and packaging it. A package contains the installation paths, information about the packages that it depends upon, patches and scripts. A typical open operating system consists of thousands of software components that help run various applications. These components are made available as packages. An operating system may have about 25,000 to 45,000 packages.

Fig. 1 Overview of package management

To consistently manage the installation, upgrade and removal of these software packages, a set of tools called a package manager is used. Fig. 1 shows the components involved in package distribution and management. A package is typically a piece of software that contains files to be placed in specific directories on the target system, configuration logic to be executed during deployment, and metadata that helps track the files and describe the component’s capabilities. For example, in the RPM format all the metadata is present in a header. A software package can be visualized as shown in Fig. 2.

Fig. 2 Contents of a typical package

Package Repository: This is usually a web repository (hosted on a File Transfer Protocol (FTP) server) or a media device such as a DVD, where all packages are made available by the packager for installation.

Package Manager: This is a tool that handles dependency resolution, fetching packages from the repository, and the installation, upgrade and removal of packages from the repository. A basic package manager (such as RPM) is aided by a secondary, metadata-using package manager, such as Yellowdog Updater, Modified (YUM), for automatic dependency resolution and other advanced features.

Dependencies for a package: A package ‘A’ might require package ‘B’ to be installed first, so package ‘B’ is a dependency of package ‘A’. In real-world scenarios the dependencies tend to form a complex tree-like structure, as shown in Fig. 3.

Fig. 3 Dependency tree for package yum

In this paper we look at the workings of various package managers and the techniques they use to solve the package install, upgrade and removal problem. Section II presents the challenges in package management. Section III gives an overview of the types of package managers, classified according to the method used to carry out the tasks: the functional method and the imperative method. Section IV looks specifically at techniques used for dependency resolution. Section V lists package managers whose computational burden is shifted to the server side. Section VII gives the future trends likely to be seen in this area of research.

II. CHALLENGES IN PACKAGE MANAGEMENT

Language to express user request: A simple install command has to be expressed in a structured way [1], [2] so that the steps and dependencies required to carry it out are conveyed accurately to the package manager. Usually conjunctive normal form or set theory expressions are used.

Dependency resolution: The package manager should check whether a solution that satisfies all constraints exists among the available packages. As the number of satisfiability criteria increases, the logical complexity of the problem also increases.

Optimization: The best solution should be picked from among the available solutions. User preference should be taken into account for optimization.

Unpacking and installing: The installation should be done with the minimum possible changes to the current information profile, so that no applications are broken.

Rollback or undo the changes: When recent changes produce undesirable effects, it should be possible to roll back to a previous information profile.

Frequent need for updating: Newer versions of software that meet the user’s requirements better become available every day. Routine upgrades are done to provide security patches, fix bugs and add new features.

Scalability: A single repository contains thousands of packages. The real challenge arises when the package manager has to deal with multiple repositories.

III. PACKAGE MANAGEMENT METHODOLOGIES

A. Functional Package Managers

Nix [3] is a package manager based on a purely functional model. Packages are built from Nix expressions, which describe graphs of build actions called derivations. A derivation consists of a build script, environment variables and other derivations (its set of dependencies). A package is built recursively from its dependencies, and each time the corresponding build scripts are run and the environment settings are applied. To perform an upgrade with a new configuration, the system is entirely rebuilt from the new specification. This allows an easy rollback to the old configuration if necessary. This model also benefits from its statelessness. Statelessness makes the configuration actions predictable and ensures there are no mysterious failures.

The similarities between modules (or units) used in programming and packages used in software distribution were highlighted by Tucker [4]. A package and a module behave similarly, as both provide their capabilities as a service and in turn may require certain imports. In the programming context the imports correspond to other functions and values, whereas in the package management context the imports are other modules or dependencies. The distribution format would be a compound unit, which is a bundle of all necessary atomic units linked correctly. Just as a class in programming can be instantiated as several objects, a unit can be invoked several times, and the separate copies resulting from each invocation can be uniquely identified and maintained. This makes it possible to use the correct version of each dependency without conflicting with other installed versions. Having multiple versions installed simultaneously can also aid a better rollback mechanism.

B. Imperative Package Managers

Most tools use the imperative method [3] of operation, in which the deployment actions are stateful. Files belonging to each package are stored in the Unix file system hierarchy. Packages are upgraded by overwriting the old versions. Statefulness makes it hard to support multiple versions at the same time. These tools are capable of automatically resolving dependencies and installing the required package.
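The contrast between the functional model of Section III.A and the imperative model of Section III.B can be illustrated with a minimal Python sketch. It does not reproduce how Nix or any imperative tool is implemented: the names `store_path`, `ImperativeSystem` and `FunctionalStore`, the `/store/` and `/usr/lib/` prefixes, and the choice of SHA-256 over a JSON description are all assumptions made for illustration. The sketch only shows why keying each install location by a hash of the build inputs lets several versions coexist and reduces rollback to switching back to an earlier profile, whereas overwriting a fixed path loses the old version.

```python
import hashlib
import json


def store_path(name, version, build_script, env, dep_paths):
    """Derive an install path from a hash of every build input (hypothetical scheme)."""
    description = json.dumps(
        {"name": name, "version": version, "script": build_script,
         "env": env, "deps": sorted(dep_paths)},
        sort_keys=True,
    )
    digest = hashlib.sha256(description.encode("utf-8")).hexdigest()[:12]
    return f"/store/{digest}-{name}-{version}"


class ImperativeSystem:
    """Stateful model: one fixed location per package name; upgrades overwrite it."""

    def __init__(self):
        self.files = {}  # path -> installed version

    def install(self, name, version):
        # Writing to the same path discards whatever version was there before.
        self.files[f"/usr/lib/{name}"] = version


class FunctionalStore:
    """Functional model: each build gets its own path; profiles are kept as generations."""

    def __init__(self):
        self.store = {}        # store path -> (name, version)
        self.generations = []  # each generation maps package name -> store path

    def install(self, name, version, build_script="make install",
                env=None, dep_paths=()):
        path = store_path(name, version, build_script, env or {}, list(dep_paths))
        self.store[path] = (name, version)
        # A new generation copies the previous profile and repoints one package.
        profile = dict(self.generations[-1]) if self.generations else {}
        profile[name] = path
        self.generations.append(profile)
        return path

    def rollback(self):
        """Drop the newest generation; nothing in the store itself is modified."""
        if len(self.generations) > 1:
            self.generations.pop()
        return self.generations[-1] if self.generations else {}

    def current_version(self, name):
        return self.store[self.generations[-1][name]][1]
```

Installing `libfoo` 1.0 and then 2.0 into a `FunctionalStore` leaves both builds in the store under different paths, and `rollback()` makes 1.0 current again; the same two installs into an `ImperativeSystem` leave only 2.0 behind.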
Imperative tools are less complex and hence fast. Their reliability is lower, however, as they frequently fail to find a solution. It is difficult to roll back an installation or upgrade because configuration files are manipulated without traceability.

Cupt [5] is a partial reimplementation of APT that aims to be a bug-free package manager. Its optimization criterion is hard-coded. In the MISC-2010 [12] competition it found a solution in 21% of cases for the large-set criterion (4 Debian releases) and in 84% of cases for the single-release setup.

Smart Package Manager [6] is a platform-agnostic, portable tool. It aids the user with efficient solution-finding algorithms and rollback options. In the MISC-2010 competition Smart performed very well with a single repository, finding a solution in 93% of cases; however, its performance degraded as the repository size scaled, and it then found solutions in only 25% of cases.

Mancoosi Package Manager (MPM) [7] is a distribution-agnostic tool that uses the Common Upgradeability Description Format (CUDF) to describe package upgrade problems and optimizations. MPM allows users to specify high-level optimization criteria. The MPM architecture allows external dependency solvers to be used by

The clause ¬p ∨ ¬p1 represents a conflict between the packages p and p1.

When the propositional formula consists only of dependency requirements in conjunctive form, the problem can be solved in linear time. But if two or more packages can satisfy a dependency, so that disjunctions appear in the formula, it becomes an NP-complete problem [8]. Boolean satisfiability is a known NP-complete problem, and a package upgrade request is likewise an NP-complete problem [7].

YaST [9], the package manager used by openSUSE Linux, uses a Boolean satisfiability (SAT) solver to resolve dependencies. It uses an external solver, libzypp. User preferences are hard-coded in the problem encoding.

MAX-SAT is a SAT-based method in which the effort is made to satisfy the maximum number of clauses. INESC-ID [8] is a solver built using the p2cudf parser and the MAX-SAT solver from MSUnCore, which is based on unsatisfiability cores. In the MISC-2010 competition, the INESC-ID solver had approximately twice the penalty rate of the winner. It took less time when the aim was to minimize the number of packages removed or changed, and it worked best with direct CUDF input. It found a solution with a success rate of 96%, of which 12% were optimal.

D.