Understanding SlapOS Architecture
SlapOS is a general purpose operating system usable on distributed POSIX infrastructures (Linux, xBSD) to provide and manage software services, focusing on cloud orchestration and edge computing. It automates DevOps tasks including deployment, monitoring and billing of software usage on arbitrary cloud infrastructures. SlapOS can act as a cloud provider selling software as a service (SaaS), offer platform as a service (PaaS) solutions, or provide basic infrastructure as a service (IaaS). SlapOS can run on private hardware and data centers or on any public cloud infrastructure. This design document explains the concepts underlying the SlapOS architecture.

Table of Content

Master and Slave
Computer Partitions
Networking

Master and Nodes

SlapOS uses a Master (referred to as COMP-ROOT) and Slave (COMP-0, 1, 2, 3...) design. This section explains the role of the Master and other nodes in a network, as well as the software components on which they depend to operate in a distributed cloud.

Overview

The Master contains a catalog of available Software Releases. Slave nodes request from the Master which software to install and to supply to users. Users then request services (called hosting subscriptions), which are instances of the supplied software. These are instantiated inside computer partitions available on a node. The node reports the usage of all running instances over time back to the Master, which keeps track of node capacity and available software. The Master also acts as a web portal providing web services so that users or other software can request new software instances, which are instantiated and run on nodes. It does so using services provided by COMP-0, which therefore has a somewhat special status: it is not a standard node like COMP-1, 2, 3 providing services to users.

Master nodes are stateful. Slave nodes are stateless. More precisely, all information required to rebuild a Slave node is stored in the Master. This may for example include the URL of a backup service which maintains an online copy of data, so that in case of a node failure a replacement node can be rebuilt with the same data. It is thus very important to ensure that state data present on the Master is well secured. This can be achieved by hosting the Master node on a trusted third-party IaaS infrastructure with redundant resources, or better, by hosting multiple Master nodes on many Slave nodes located in different regions of the world, based on an appropriate data redundancy heuristic. This describes the first reflexive nature of SlapOS: a SlapOS Master is usually a running instance of the SlapOS Master software (!) instantiated on a collection of Slave nodes, which together form a trusted hosting infrastructure. In other words, SlapOS is used to host SlapOS: SlapOS is self-hosted.
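The cycle just described (nodes ask the Master what to supply, then report usage back) can be sketched as a toy model. The following Python sketch is purely illustrative: all class and method names are hypothetical and do not reflect the real SlapOS protocol or API.

```
from dataclasses import dataclass, field


@dataclass
class Master:
    """Keeps the catalog of Software Releases and tracks node usage."""
    catalog: set = field(default_factory=set)
    usage: dict = field(default_factory=dict)

    def software_to_install(self, node_id: str) -> set:
        # Simplified: every node is asked to supply the whole catalog.
        return self.catalog

    def report_usage(self, node_id: str, report: dict) -> None:
        # Nodes report the usage of their running instances; the
        # Master keeps track of capacity and billing data over time.
        self.usage[node_id] = report


@dataclass
class Node:
    """A stateless Slave node: everything it needs lives on the Master."""
    node_id: str
    installed: set = field(default_factory=set)

    def synchronise(self, master: Master) -> None:
        # 1. Ask the Master which software to install and supply.
        self.installed |= master.software_to_install(self.node_id)
        # 2. Report the usage of running instances back to the Master.
        master.report_usage(self.node_id,
                            {"supplied_software": len(self.installed)})


master = Master(catalog={"wordpress", "kvm"})
node = Node("COMP-1")
node.synchronise(master)
print(master.usage)  # {'COMP-1': {'supplied_software': 2}}
```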
Master Node - Functional Role

The Master node keeps track of the identity of all parties involved in (a) requesting cloud resources, (b) their monitoring and (c) usage billing. This includes end users and organisations, which can be suppliers and/or consumers of cloud resources. The Master manages computer partitions on all nodes. These "containers" each run a single software instance, which can be requested fully autonomously or through user interaction.

SlapOS generates X509 certificates for each type of identity: for users who log in, for each server that contributes resources to SlapOS, and for each running software instance that has to request from or notify the Master. A SlapOS Master with a single Slave node, one user and ten computer partitions will thus generate up to 12 X509 certificates: 1 for the Slave node, 1 for the user and 10 for the computer partitions provided on the node.

Any user, software or Slave node with a valid X509 certificate may request resources from the SlapOS Master, which acts like a back office or marketplace. Requests are recorded as if they were a resource trading contract in which a resource consumer requests a given resource under certain conditions. A resource could be a NoSQL storage, a KVM virtual machine, an ERP system or similar. The conditions include price, region (e.g. China) or a specific hardware setup (e.g. 64-bit CPU). Conditions are sometimes called Service Level Agreements (SLA) in other architectures, but in SlapOS they are considered trading specifications rather than hard guarantees. By default, the SlapOS Master acts as an automatic marketplace: requests are processed by trying to find a node which meets all specified conditions. The SlapOS Master thus needs to know which resources are available at a given time, at what price and with what specifications, including which software can be installed on a node and under what conditions. However, it is also possible to manually designate a specific node rather than relying on the automatic marketplace logic of the SlapOS Master to determine the computer to use when allocating resources.

Slave Node - Functional Role

SlapOS Slave nodes are much simpler than the Master node. They only need to run the software requested by the Master. It is thus on the Slave nodes that software is installed. To save disk space, Slave nodes install only the software they actually need. Slave nodes are divided into a certain number of so-called computer partitions (slappart[0-x]). A computer partition can be seen as a lightweight, secure container, based on Unix users and directories rather than on virtualization. A typical barebone PC can easily provide 100 computer partitions and can thus run 100 Wordpress blogs or 100 e-commerce sites, each of them with its own independent database. A larger server can contain 200-500 computer partitions. The concept of computer partitions in SlapOS was designed to reduce costs drastically compared to approaches relying on disk images and virtualization, while still allowing virtualization software to run inside a computer partition. This makes SlapOS at the same time cost efficient and compatible with legacy software.
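Because a computer partition is essentially a dedicated Unix user plus a dedicated directory tree, it can be modelled in a few lines. The sketch below is illustrative only; the slappartN naming and the /srv/slapgrid prefix follow common SlapOS conventions, but the class itself is hypothetical.

```
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ComputerPartition:
    """A partition: no disk image, no hypervisor, just a user and a dir."""
    index: int

    @property
    def name(self) -> str:
        return f"slappart{self.index}"

    @property
    def unix_user(self) -> str:
        # Each partition runs its processes under its own Unix user,
        # which is what isolates partitions from one another.
        return self.name

    @property
    def home(self) -> PurePosixPath:
        # Each partition owns a private directory tree holding its
        # software instance, configuration and data.
        return PurePosixPath("/srv/slapgrid") / self.name


# A barebone PC exposing 100 partitions, as described above.
partitions = [ComputerPartition(i) for i in range(100)]
print(partitions[0].unix_user, partitions[0].home)
# slappart0 /srv/slapgrid/slappart0
```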
Master Node - Software

At its core, the SlapOS Master is a regular node with the SlapOS "Kernel" installed. During the initial installation of the SlapOS Master, a SlapProxy ("mini" Master) is used to set up the future SlapOS Master. In relation to the SlapProxy, the SlapOS Master is a normal node on which two pieces of software are installed and instantiated: the ERP5 "Cloud Engine" and a frontend (Apache) for accessing ERP5. This describes the second reflexive nature of SlapOS: SlapOS is used to deploy SlapOS. This implementation of the SlapOS Master originates from an ERP5 implementation for a central bank operating across eight African countries. The underlying idea was that currency clearing and cloud resource clearing are similar and should therefore be implemented with the same software.

Besides that, ERP5 had also demonstrated its scalability for large CRM applications as well as its trustworthiness for accounting. Thanks to NEO, the distributed NoSQL database ERP5 uses, the transactional behaviour and scalability required for a stateful marketplace could be implemented in ERP5.

Developing the SlapOS Master on top of ERP5 was also a direct application of ERP5's Universal Business Model (UBM), a model which unifies all sciences of management and which has been acknowledged by numerous IEEE publications as a major shift in enterprise application design. In the UBM, each computer is represented by an Item, while allocation requests, resource deliveries and resource accounting are represented by a Movement. The movement resource can be software hosting, CPU usage, disk usage, network usage, RAM usage, login usage, etc. Software hosting movements start whenever the software starts running in the computer partition and stop whenever it stops. Resource usage movements start and stop for accounting during each period of time, independently of the running software's state. The software release which runs on a computer partition is an Item in the UBM, just like the subscription contract identifier. The parties (client, supplier) are represented as Nodes. More surprisingly, each Network is also considered a Node, in the same way a storage cell is represented as a Node in logistics.

Slave Node - Software

SlapOS Slave software consists of a POSIX operating system, SlapGRID, Supervisord and Buildout, and is designed to run on any OS that supports GNU's glibc and Supervisord (e.g. GNU/Linux, FreeBSD, MacOS X, Solaris, AIX et al.). At the time of writing, Microsoft Windows was partially supported (node only) through a glibc implementation on Windows and a port of Supervisord to Windows.

SlapGRID

SlapGRID acts as the "glue" between the SlapOS Master and both Buildout and Supervisord. SlapGRID asks the SlapOS Master which software should be installed and executed. It uses Buildout to install software and Supervisord to start and stop software processes. SlapGRID also collects the usage data produced by each running software instance and transmits it to the SlapOS Master.

Supervisord

Supervisord is a process control daemon. It can be used to programmatically start and stop processes with different users, handle their output, log files, errors, etc. It is a sort of improved init.d which can be remotely controlled. Supervisord is lightweight and old enough to be really mature (i.e. no memory leaks).

Buildout

Buildout is a Python-based system for creating, assembling and deploying applications from multiple parts, some of which may be non Python-based. It allows creating Buildout configurations from which the same software can later be reproduced. Buildout originated in the Zope/Plone community to automate the deployment of customized instances of their own software, and it has become stable and mature over the years. Please refer to Understanding Buildout for a more detailed analysis of Buildout and why it is critical to the architecture of SlapOS.
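To give a flavour of what such a configuration looks like, here is a minimal illustrative Buildout profile (not an official SlapOS profile). It assumes the standard zc.recipe.egg recipe and asks Buildout to install the supervisor egg together with its console scripts:

```
[buildout]
# The parts to build; each part names a recipe and its options.
parts = supervisor

[supervisor]
# zc.recipe.egg installs Python eggs and generates their
# console scripts under bin/.
recipe = zc.recipe.egg
eggs = supervisor
```

Running bin/buildout against such a profile should fetch the named egg and regenerate the same bin/supervisord and bin/supervisorctl scripts on every run. This reproducibility is the property SlapGRID relies on when installing Software Releases on Slave nodes.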