Enhancing Separation of Concerns, Parallelism, and Formalism in Distributed Software Deployment with Madeus

Maverick Chardetᵃ, Hélène Coullonᵃ, Christian Perezᵇ, Dimitri Pertinᵃ, Charlène Servantieᵃ, Simon Robillardᵃ

ᵃ IMT Atlantique, Inria, LS2N, UBL, F-44307 Nantes, France
ᵇ Univ Lyon, Inria, EnsL, UCBL, CNRS, LIP, Lyon, France

To cite this version: Maverick Chardet, Hélène Coullon, Christian Pérez, Dimitri Pertin, Charlène Servantie, et al. Enhancing Separation of Concerns, Parallelism, and Formalism in Distributed Software Deployment with Madeus. 2020. hal-02737859

HAL Id: hal-02737859
https://hal.inria.fr/hal-02737859
Preprint submitted on 2 Jun 2020

Abstract

Complex distributed software systems are built by connecting software components across a large number of physical or virtual machines. Deploying such systems reliably and efficiently is a difficult task that involves multiple actors. Coordination mechanisms and programming support are needed. In this paper, we introduce Madeus, a component-based model that allows efficient and highly parallel deployment procedures for distributed software through a declarative approach.
We describe the formal model of Madeus, its performance prediction model, and its concrete language and prototype. We evaluated Madeus on a complex real-world use case, the deployment of OpenStack, and measured a deployment time reduced by up to 71% compared to existing solutions.

Keywords: Distributed software; deployment; component models; parallelism; separation of concerns; formal methods

Preprint submitted to Journal of Systems and Software, April 30, 2020

1. Introduction

Distributed software is designed in a modular architectural style where each component (a term that, in this paper, refers indifferently to a module, a service, or a microservice) is responsible for a specific part of the overall objective, and collaborates at runtime with other components that are potentially hosted on distant machines. With advances in IT hardware, distributed software has become commonplace, whether it be executed in the cloud, on personal computers, mobile phones or objects.

However, deploying a large distributed software system on distributed infrastructures poses multiple challenges. Firstly, deployment involves a variety of actors with different roles and areas of expertise, from developers who are responsible for designing and coding the components, to sysadmins and sysops who are responsible for maintaining, configuring, and testing multi-user computer systems, with devops engineers working between them to oversee code releases and deployments. Secondly, deployment is a complex task: components must be chosen, configured and mapped to infrastructure nodes, dependencies have to be solved, virtualization layers handled, etc. The process is error-prone, and difficult to test because faults can be caused by very specific hardware or software conditions in a heterogeneous environment. Lastly, deployment is often a time-consuming process, owing to the scale of the systems. This last point is usually neglected in the literature. As a result, complex systems such as OpenStack can require in excess of one hour to deploy with existing solutions that do not take full advantage of the distributed nature of the underlying hardware. This becomes especially problematic in the context of continuous integration, with systems being deployed up to several hundreds of times every day.

For these reasons, programming models and tools need to be developed to facilitate documented, verifiable and efficient deployment procedures. To this end, we propose Madeus, a model that relies on a declarative description of dependencies to automate the execution of deployments with a high level of parallelism, both between components and within them. Prior to execution, Madeus deployments can also be analyzed for the purpose of verification and performance estimation, making it a useful tool during the conception of the deployment.

Among the difficulties related to the deployment of distributed software systems, we leave aside the mapping between pieces of software and the nodes of the infrastructure, an optimization problem that is out of the scope of this paper [1,2,3,4,5]. Instead, we restrict the scope of this paper to the software commissioning part of the deployment, i.e., the procedure responsible for leading the set of components to a valid running state while guaranteeing correct configurations and interactions. In particular, three metrics are considered to measure the quality of automation of distributed software commissioning: (1) separation of concerns between the different actors of the commissioning procedure to enhance productivity, maintainability and reusability of deployment code; (2) efficiency of the commissioning procedure in terms of parallelism expressiveness; and (3) formalization of the solution, such that formal properties can be proven on a given commissioning procedure. The contribution of this paper improves upon the related work by succeeding in the combination of these three metrics.

We have previously presented Madeus in [6]. This journal paper brings the following additional contributions compared to our previous publication:

• an extended study of the related work (Section 3);
• a concrete language and a prototype (presented in Section 4, along with the overview of Madeus);
• a revised and streamlined formal model that offers stronger guarantees (Section 5);
• a theoretical performance prediction model (Section 6);
• an extended reproducible evaluation on both synthetic use cases (Section 7) and one large use case on real infrastructures (Section 8).

2. Motivations

Distributed software deployment occurs between the development of components (functional part) and the execution of the distributed software on the infrastructure (management part). As a result, commissioning relies on information about both the behavior of the components and the infrastructure on which they will be executed, a frontier that is often called the DevOps domain. Note that our definition of software commissioning does not include placement and scheduling optimization problems, nor reconfiguration aspects of the distributed software management.

Commissioning distributed software requires coordinated interactions with various interfaces. Firstly, components have control interfaces provided [...]

[...] deployment is automated and written once. However, because of the specificities of each infrastructure and application, these tools need to be parametric. Finding suitable scripts and images is often a challenge. For this reason, deployment coordination procedures remain difficult even for a single component. When considering a complete distributed software composed of a set of components, the picture becomes even more complex, because the commissioning procedures of various components, designed by different developers and interacting with each other, must be coordinated. For instance, when installing a very basic Apache/MariaDB system, additional documentation is required² to combine the components. The commissioning of Spark on top of Yarn³ is another example. This complex coordination is usually the work of a devops engineer. With the increasing complexity of distributed software deployment, the number of languages and tools for the devops community has grown considerably in recent years.

Efficiency is an often neglected aspect of commissioning, but its importance should not be underestimated. First, the commissioning of complex distributed software is very time-consuming. For example, OpenStack (Section 8) can require in excess of one hour to deploy, because existing solutions do not favor parallelism, and therefore exploit only a fraction of the capabilities of the infrastructures that they target. Second, although common sense would dictate that commissioning occurs only once, this is not the case. System administrators perform the commissioning process every time a new machine or new cluster is installed in their infrastructure, when errors occur, or when updates are needed. Furthermore, with Continuous Integration (CI), Continuous Delivery (CD), and Continuous Deployment (CDep) of companies, research or open source projects, software commissioning is executed repeatedly in order to test new features continuously. For instance, the traces of the OpenStack Continuous Integration platform⁴ show that over a
