Provenance Issues in Platform-as-a-Service Model of Cloud Computing
Devdatta Kulkarni [email protected] Rackspace, The University of Texas at Austin
Abstract

In this paper we present provenance issues that arise in building the Platform-as-a-Service (PaaS) model of cloud computing. The issues relate to designing, building, and deploying the platform itself, and to building and deploying applications on the platform. They include tracking the commands behind successful software installations, tracking inter-service dependencies, tracking configuration parameters of different services, and tracking application-related artifacts. We identify the provenance information needed to address these issues and propose mechanisms to track it.

1 Introduction

The Platform-as-a-Service (PaaS) model of cloud computing simplifies deployment of applications to a cloud infrastructure. Application developers use a declarative model for specifying their application's source code and requirements to the platform. The platform builds and deploys the application by provisioning the required infrastructure resources. Several such PaaS systems exist today, such as Heroku [1], Google App Engine [2], OpenShift [3], and Solum [4].

Provenance has emerged as one of the key requirements in building modern systems that support properties such as auditability and data lineage tracking [5]. In our view, provenance is all the information about an entity that helps us understand how that entity reached a particular state. For instance, when a software package is successfully installed on an Ubuntu host, all the packages that had to be installed along the way are part of that package's provenance. Another example is a service that depends on other services for its correct functioning. The dependency information, such as the versions of the dependent services, is part of that service's provenance.

In a PaaS system, the need to track provenance information is especially important given that the platform manages the complete life-cycle of an application. In order for application developers to gain confidence in the platform and insight into the application construction process, it is important that the platform collects different kinds of provenance information and makes it available to them. Additionally, provenance information about the platform itself is useful to platform developers and platform operators for the correct design and operation of the platform.

In this paper we identify different kinds of provenance information that need to be collected within a PaaS to provide better insights into the workings of the platform and the applications that it deploys (Section 2). We also propose mechanisms to collect this information (Section 3). We realized the importance of some of these through our experience of building Solum, an application life-cycle management system for OpenStack [6]. Application developers use a declarative specification model to specify an application's build and runtime requirements to Solum. Solum builds and deploys the application by creating Docker containers [7] with the application code, henceforth referred to as docker containers, and deploying them on OpenStack infrastructure.

2 Provenance Issues

2.1 Platform development

While developing Solum, we frequently followed a pattern of trial and exploration when using new software and tools, such as the docker registry [8], which we use for storing an application's docker containers. The typical pattern was as follows. We referred to several on-line documents, installed different software packages, changed file permissions if required, changed directories, compiled/built certain packages, and so on until we were successful.

At the end, when the required software was installed successfully, we wanted to consolidate the exact steps that we could follow if we had to install that software on a different host in the future.
However, at this point we generally faced a problem. From the entire shell command history, it was difficult to identify only those commands that were essential to the successful installation of the software. We felt that we needed an automated way which, given a list of commands, finds the exact commands that are necessary and sufficient for installing that software. Essentially, we needed the provenance for software installs (subsection 3.1).

We observe that this trial-and-exploration pattern is not unique to the design and development of Solum. It is quite general, and can be seen whenever developers try to install new software.

2.2 Platform building

Solum depends on other OpenStack services such as Keystone [9], Glance [10], Heat [11], Barbican [12], and Tempest [13]. Often these services change their requirements or undergo a refactoring change, which causes Solum to break. For example, recently the tempest functional testing library refactored its code to move its REST module from its package to a separate library. This caused all the functional tests in Solum to stop working, as the import path of the REST module within Solum's functional tests was no longer correct. Such a breakage results in wasted developer productivity. It also leads to wasted resources in terms of testing failures and backups on OpenStack's continuous integration servers (CI servers). The problem is exacerbated if a service has had several commits. For developers of a service like Solum, it is currently very tedious to find out which commit in the dependent service broke their build. We felt that the severity of this issue could be reduced if we had information about the last commit of the dependent service that worked for Solum. Then the problem would be constrained to finding the culprit commit between the last known working commit and the latest commit. Essentially, we needed to maintain provenance information related to inter-service dependencies (subsection 3.2).

2.3 Platform deployment

Each OpenStack service has a large number of configuration parameters. It is hard for developers of a service like Solum, which depends on other services, to know which parameters of the dependent services are critical to the operation of their own service. We have faced this issue several times in the devstack [14] setup of OpenStack, which supports running all the OpenStack services within a single VM, typically for the purpose of design and development. As an example of how configuration parameters of other services matter: for Solum to function correctly in the devstack environment, the following variables need to be set in Nova's [15] configuration file: scheduler default filter, memory, compute driver. In subsection 3.3 we propose how the provenance of configuration parameters can enable choosing appropriate configuration parameters of dependent services for the purpose of design and development of Solum.

2.4 Application provenance

Within Solum, application construction and deployment information includes the commit hash of the application's source code deployed by the platform, the test and run commands used in building a particular version of the application, the version of docker used to build the application runtime environment, and the versions of dependent services used. All this information is part of the application construction and deployment provenance.

3 Provenance Mechanisms

3.1 Command list provenance

In order to find the provenance for successfully installing a software package, we need to consider the list of commands that eventually led to the successful installation. An entire history of a user's shell interactions contains different kinds of commands, such as package installation commands (apt-get install), navigational commands (cd, pushd, popd), listing/viewing commands (ls, less, more), editor commands (opening files in vi, emacs), service start-up commands (service docker start), and so on. Below we propose a method which, given a list of commands and the name of the software to install, prunes the command list to the essential commands required for installing the software.

The main issues that need to be addressed in designing this method are: (a) detecting where to start and stop in the user's command history, and (b) determining whether the candidate list of commands is complete. To address the issue of where to start and stop in a long execution history, we define the notion of an anchor point (ap) command. An example of an anchor point command on Ubuntu is apt-get update. This command can be treated as an anchor point because, in our experience, updating packages on the host is a typical usage pattern when trying to install new software on Ubuntu hosts. Anchor point commands may also be specified by users. Moreover, instead of considering the entire command history to build the candidate list, we can use the last command that led to successfully starting up the service as another anchor point. Let us call this the last successful command (lsc).
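A minimal sketch of this pruning idea in Python, under our own assumptions: the history is a plain list of command strings, the set of non-essential command prefixes is illustrative only, and the function names are ours rather than those of the implementation in [17].

```python
# Sketch of command-list pruning between an anchor point (ap)
# and the last successful command (lsc). The NON_ESSENTIAL set
# is an illustrative assumption; the real criteria may differ.
NON_ESSENTIAL = {"cd", "pushd", "popd", "ls", "less", "more",
                 "vi", "emacs", "man", "history"}

def prune_history(history, anchor="apt-get update", lsc=None):
    # Start at the last occurrence of the anchor point command.
    start = max((i for i, c in enumerate(history)
                 if c.strip() == anchor), default=0)
    # Stop just after the last occurrence of the lsc, if given.
    end = max((i + 1 for i, c in enumerate(history)
               if c.strip() == lsc), default=len(history))
    # Drop navigational/viewing/editor commands from the window.
    return [c for c in history[start:end]
            if c.split()[0] not in NON_ESSENTIAL]

history = [
    "cd /tmp",
    "apt-get update",
    "ls",
    "apt-get install -y docker.io",
    "vi /etc/default/docker",
    "service docker start",
]
print(prune_history(history, lsc="service docker start"))
# ['apt-get update', 'apt-get install -y docker.io', 'service docker start']
```

The candidate list produced this way still has to be verified, for example by replaying it in a contained environment.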
To address the issue of how to find out whether the candidate list of commands is complete, we propose to try out that command list in an automated manner in a contained environment. Our idea is to build a docker container from the list of candidate commands and check it with a verification script, provided by the application developer, which checks whether the software was correctly installed on the container or not. These ideas are captured in the command list pruning algorithm whose implementation we evaluate in Section 4 and make available at [17].

3.2 Inter-service dependency provenance

Below we propose two approaches that can help with the process of finding the culprit commit in a dependent service.

3.2.1 Dependent services' commits tracking

One way to find the commit(s) to pin to is to maintain provenance information about Solum that includes the most recent git commits of all the services that Solum depends on. For instance, we could maintain a record of the last known working commit of each dependent service.
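The bookkeeping described above can be sketched as follows; the service names, commit hashes, and the is_good() build-check callback are hypothetical stand-ins, and the bisection assumes a single breaking change after the last known working commit.

```python
# Hypothetical provenance record: dependent service -> last known
# working commit (the names and hashes are illustrative only).
last_known_good = {
    "keystone": "a1b2c3d",
    "heat": "d4e5f6a",
}

def find_culprit(commits, is_good):
    """Binary-search the commits made after the last known working
    commit (ordered oldest to newest) for the first one that breaks
    the build. is_good(commit) runs the build/tests at that commit."""
    lo, hi = 0, len(commits) - 1
    culprit = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid + 1          # breakage lies in newer commits
        else:
            culprit = commits[mid]
            hi = mid - 1          # an older commit may also fail
    return culprit

# Suppose the breaking refactoring landed in c3.
commits = ["c1", "c2", "c3", "c4"]
print(find_culprit(commits, lambda c: c in {"c1", "c2"}))  # c3
```

This is essentially what git bisect automates; the provenance record supplies the lower bound of the search range.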
This might be incorrect, as the commit that is really causing the problem could be a relatively newer commit n of s2. If we pick s2 instead of s1 as the first service to try, we could find this. It remains to be evaluated what would be a good strategy for selecting the order in which to consider the different services.

3.2.2 Dependent services' merge tracking

Another approach for determining the commit(s) to pin to is as follows. In OpenStack, there is a system named Zuul [18], which orchestrates the execution of continuous integration runs for all the OpenStack projects. It maintains a yaml-based project definition for each project. We propose that this project definition be enhanced by including the list of projects that depend on that project, and a list of trigger events. For example, below we show how the project definition for Barbican might look with this enhancement. The Triggers attribute identifies the list of triggering events and the Used_by attribute identifies the list of projects that depend on Barbican.

    Barbican:
      Triggers: OnMergeToMaster
      Used_by:
        Solum, Murano, Mistral

Whenever a patch is merged into the master branch of Barbican, the OnMergeToMaster event will be generated. This event will cause a "recheck no bug" comment to be added to the outstanding patches of the Used_by projects. This comment has a special meaning within Zuul: it causes Zuul to trigger a CI run on those patches. If a patch passes, then we know that the change to Barbican is non-breaking. However, if the patch fails, then we have proactively caught the breaking change.

In this approach, managing updates to the dependent service's project configuration can become an issue. Specifically, we cannot expect the dependent service's developers to know about all the other projects that use it. We propose that team members of the project that depends on another project propose a patch to modify the dependent service's project configuration in Zuul. The dependent service's team members need to approve this patch before it can be merged, thus changing the dependent service's project configuration.

Some of the outstanding questions in this approach are: what modifications are required to Zuul to enable event generation? If there are no outstanding patches, could we make Zuul generate a new patch, submit it to the Used_by project's review queue, and trigger a CI run on it? What would be the nature of such a patch?

3.3 Infrastructure configuration tracking

Towards maintaining provenance information for the configuration parameters of dependent services, we propose to version control the configuration parameters of the different OpenStack services that led to a successful deployment of Solum. For instance, we would version control entries in nova.conf, heat.conf, and tempest.conf corresponding to successful deployments of Solum. Then, when a bug such as [19] is encountered, which stems from not having enough memory on the host and whose resolution is to set the value of the scheduler default filter configuration parameter in nova.conf to AllHostsFilter, having this parameter and its value tracked in version control would help create provenance for successful future deployments of Solum.

3.4 Application provenance API

In subsection 2.4 we identified the provenance requirements from the point of view of an application. For application developers to be able to extract provenance information about their applications, we propose to define an object model and an API in Solum. Through this API it will be possible for application developers to extract information such as the git commit hash used to deploy their application's code to different environments (testing, staging, production), the version of docker used to build the application's docker containers, and the versions of the other OpenStack services being used.

4 Evaluation

We implemented the command list pruning algorithm and used it to find the provenance of installing two different software packages (docker and tomcat). The implementation, the command histories that we used, and the verification scripts that we defined are available at [17]. Aided by some changes in the way we operated on the host, the algorithm was able to successfully generate provenance for both. We performed these experiments on an Ubuntu 14.04 host with docker version 1.6.

Here are our observations from this experimentation. When building a docker container with a subset of commands from the command history, docker build may fail for several reasons. Specifically, we observed that this may happen if (a) the command is not installed on the container, or (b) the command is a non-existent command (a typo, or a valid command that is not on the path, or a service name instead of a command). We experienced the first case when we included curl as part of the command list to be built into a docker container (curl was part of our execution history). The docker build failed in this case because curl was not installed on the current
container layer or on any of its parent layers. This led to the realization that a command that works on a host may not work on the container unless it is explicitly installed. This led to the conclusion that, when including a command c to be tried within a docker build, we may have to first include another command i_c which installs c on the container. We experienced the second case when we included tomcat as part of the command list (executing the command tomcat was part of our execution history). The docker build failed in this case because tomcat is not a valid Ubuntu command. This led to the conclusion that before including a command into the Dockerfile, we should run it on the host and check whether the output contains a well-known string, such as command not found:. If so, we should not include it in the Dockerfile.

Certain commands may need one or more flags to be tried on the command line. We observed this when installing docker. One of the commands to install docker is to wget the docker package. This command worked fine on our host but failed in docker build. Upon investigation we realized that we had to use the --no-check-certificate flag for the command to work successfully from within the container. The general case of finding out which flags are missing from a command's execution might be difficult to figure out. But if the command history contains commands with different flags that are clustered together, which may happen if the user is trying out different flags, then we can hope to find the correct flags, for example by maintaining a command map which maps each command to the different flag combinations that were tried with it.

As mentioned by one of the reviewers, the output of the algorithm can be considered a hint towards the steps for successfully installing a software package. Even this could be helpful in determining the steps for installing the software on different hosts.

5 Related Work

Provenance issues in cloud computing have been considered in [5, 20]. In [5], provenance properties are defined and protocols are presented that use cloud-based storage for storing the provenance information. In [20], the importance of linking provenance information across different layers of a cloud infrastructure is stressed. The provenance information related to inter-service dependencies identified here is similar to the provenance property of multi-object causal ordering defined in [5]. Similar to [20], we argue for collecting provenance information that will ease designing, building, and deploying PaaS systems. In [21], a methodology based on tracking system calls is presented to enable repeatable execution of commands on different target platforms. The command list provenance approach presented here complements it by addressing the issue of how to repeat the installation of a command on different hosts. Checking the validity of configuration parameters of dependent services is an important issue [22]. Collecting provenance information for such parameters, as presented here, is a step in that direction.
References

[5] Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer. Provenance for the cloud. In Proceedings of the 8th USENIX Conference on File and Storage Technologies, FAST'10, pages 15–14, Berkeley, CA, USA, 2010. USENIX Association.
[6] http://www.openstack.org/.
[7] https://www.docker.com/.
[8] https://github.com/docker/docker-registry.
[9] http://docs.openstack.org/developer/keystone/.
[10] http://docs.openstack.org/developer/glance/.
[11] http://docs.openstack.org/developer/heat/.
[12] https://wiki.openstack.org/wiki/Barbican.
[13] http://docs.openstack.org/developer/tempest/.
[14] http://docs.openstack.org/developer/devstack/.
[15] http://docs.openstack.org/developer/nova/.
[16] https://docs.docker.com/reference/builder/.
[17] https://github.com/devdattakulkarni/cmdlist-provenance.
[18] http://ci.openstack.org/zuul/.
[19] https://bugs.launchpad.net/solum/+bug/1390246.
[20] Imad M. Abbadi and John Lyle. Challenges for provenance in cloud computing. In 3rd USENIX Workshop on the Theory and Practice of Provenance, 2011.
[21] Philip J. Guo. CDE: Run any Linux application on-demand without installation. In Proceedings of the 25th International Conference on Large Installation System Administration, LISA'11, Berkeley, CA, USA, 2011. USENIX Association.
[22] Peng Huang, William J. Bolosky, Abhishek Singh, and Yuanyuan Zhou. ConfValley: A systematic configuration validation framework for cloud services. In Proceedings of the European Conference on Computer Systems (EuroSys'15), 2015.