MASARYK UNIVERSITY
FACULTY OF INFORMATICS

Foreman plugin for Jenkins CI

MASTER’S THESIS

Ondřej Pražák

Brno, 2015

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during the elaboration of this work are properly cited and listed in complete reference to the due source.

Advisor: Mgr. Marek Grác, Ph.D.

Acknowledgement

I would like to give thanks to Mgr. Marek Grác, Ph.D. for supervising my thesis and for the counsel he provided. I would like to thank Petr Chalupa and Ivan Nečas for their help, advice and suggestions during my work on the practical part of this thesis. My thanks also belong to my parents, who supported me in my studies.

Abstract

The main goal of this thesis is to analyze and design a plugin for Foreman that allows cooperation with Jenkins CI. The textual part describes the motivation and the current state of the art. The practical part deals with my proposed solution to the outlined task.

Keywords

The Foreman, Jenkins, CI, automation, configuration management, DevOps

Contents

1 Introduction
2 Automation in Software Development
  2.1 Build Automation
  2.2 Continuous Integration
  2.3 Continuous Deployment and Continuous Delivery
  2.4 Configuration Management
3 State of the Art
  3.1 The Foreman
  3.2 Katello
  3.3 Puppet
  3.4 Pulp
  3.5 Candlepin
  3.6 Bastion
  3.7 Elastic Search
  3.8 Jenkins
  3.9 Jenkins API Client
  3.10 Dynflow
  3.11 Foreman Tasks
4 Project Design & Analysis
  4.1 Project Structure
  4.2 Models
    4.2.1 ForemanPipeline::Job
    4.2.2 ForemanPipeline::JenkinsInstance
    4.2.3 ForemanPipeline::JenkinsProject
    4.2.4 ForemanPipeline::JenkinsProjectParam
    4.2.5 ForemanPipeline::JenkinsUser
  4.3 Views
  4.4 Controllers
  4.5 Dynflow Actions
    4.5.1 General Workflow
  4.6 Challenges and Obstacles
Conclusion
Bibliography
Index
A User Guide
  A.1 Jobs
    A.1.1 Create a Job
    A.1.2 Configure a Job
    A.1.3 Run a Job
    A.1.4 Promote a Content View
  A.2 Jenkins Projects
    A.2.1 Create Jenkins Project
    A.2.2 Configure Project Parameters
  A.3 Jenkins Instance
    A.3.1 Create Jenkins Instance
    A.3.2 Configure Jenkins Instance
  A.4 Jenkins Users
    A.4.1 Create Jenkins User
    A.4.2 Configure Jenkins User
    A.4.3 Change user's details

Chapter 1

Introduction

When a certain level of technological progress is reached, automation is an idea that offers itself naturally thanks to its apparent advantages. Speed, cost effectiveness and a decreased error rate are the most obvious ones. Automation is hardly a new idea, since we have proof of its practical application as early as the 1930s [31]. But even after 80 years, we still concern ourselves with it, because it is an ongoing process rather than a problem with one solution that can be reapplied each time circumstances require it. Furthermore, automation keeps finding new fields of application for itself, thus creating new challenges, and invades even our homes [10]. That is possibly a reason why there are no signs of the ultimate solution on the horizon.

This thesis also concerns itself with automation, but is limited to software development. There is a plethora of aids that try to automate various aspects of interaction with software systems, and running automated tests during development is only the tip of the proverbial iceberg. There are already tools that can provision a large number of machines, whether virtual or bare metal, with a few mouse clicks. Content and configuration management have also made a great leap forward during the last decade. There are also high-powered build engines that can be used for continuous integration and subsequent deployments to various environments. But there are so far no prominent attempts to combine these powerful standalone tools and make them collaborate closely.

My thesis humbly tries to exploit this niche. It aims to bring The Foreman with its plugin Katello and Jenkins together and make them cooperate in a useful way. Foreman is Red Hat's solution to host provisioning and configuration; one of Katello's main features is content management of machines provisioned by Foreman. Jenkins is one of the leading open-source continuous integration servers. The general idea is to make Foreman provision a new system based on the user's predefined configuration and supply the Jenkins CI server with content from Katello and enough information about the newly provisioned machine to allow an optional deployment of builds.

The whole process should run without supervision and be triggered by certain types of events. In chapter 2, I try to elucidate the tools and principles that have a practical impact in the areas mentioned in the previous paragraphs. Chapter 3 describes various projects that are either reifications of the topics discussed in chapter 2 or are in some way important for the practical part of this thesis. They are subsequently brought together in chapter 4, which presents my solution to the outlined task.

Chapter 2

Automation in Software Development

2.1 Build Automation

When it comes to interaction with software, the human user is almost exclusively a liability. He is much slower and more prone to make mistakes than a running program. Sometimes problems may arise solely from the fact that different people do the same thing differently [13]. The growing complexity of a project increases the number of manual steps that are necessary to include all the components required for successful compilation when a change is made. Keeping track of all dependencies may be more than challenging, if not impossible, and the more manual steps are required, the higher the chance of a mistake. This was recognized a long time ago, and the world has had the Make tool for several decades now [17]. To use Make, a user needs to write a Makefile listing the rules. Each rule consists of a target, the target's prerequisites and, optionally, a set of commands. When Make is invoked with a target, it tries to resolve the target's dependencies and produce the desired artifact.

GNU Automake takes things even further with its automatic creation of portable Makefiles. It depends on GNU Autoconf to supply a shell script that adapts the package to the user's system. Autoconf is essentially a bundle of M4 macros that create the configure script, which in turn runs a series of tests to determine the target system's configuration. Because different systems handle shared libraries differently and C compiler versions may vary, GNU Libtool was created to hide the library complexities behind a portable interface [45]. Libtool, Autoconf and Automake are often used together and referred to as the GNU build system, although Libtool may be used independently [23].

Make's popularity was an inspiration for derived tools in other programming languages. One of the most prominent examples is Rake (Ruby Make), which the Ruby world relies on to do its automated tasks. Unsurprisingly, Rake tasks are defined in a Rakefile and, as a bonus, they use pure Ruby syntax [59].
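The target–prerequisite model shared by Make and Rake can be illustrated with a minimal Rakefile sketch; the task names and shell commands below are placeholders chosen for the example, not taken from any real project.

require 'rake'

task default: :test

desc "Compile the sources (placeholder command)"
task :build do
  sh "gcc -o app main.c"
end

desc "Run the test suite once the build prerequisite is satisfied"
task test: :build do
  sh "./run_tests.sh"
end

Invoking rake test first runs the build task, mirroring the way Make resolves a target's prerequisites before executing its commands.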


Ruby developers rely on two additional tools in their daily tasks. To keep track of dependencies, Bundler is the weapon of choice. Project dependencies, which are commonly called gems and are simultaneously the units of code distribution in Ruby, are defined in a Gemfile in the project's root folder. Bundler takes care of tracking, resolving and installing the needed gem versions with a simple bundle install command [32] (see the short Gemfile sketch at the end of this section). RVM (Ruby Version Manager) manages the whole Ruby development environment and can be used to set up identical self-contained environments for development, testing and production. It has a flexible gem management system known as Named Gem Sets that prevents duplicate gem versions from being installed in the system and confines all gems exclusively to user space, thus providing a higher level of system security. RVM also allows having various Ruby versions with their own gem sets and provides a comfortable way of switching between them [47].

Although Make is widely used, it also has a wide array of undesirable attributes. The difficulty of maintaining readability in large projects, not raising errors for undeclared variables, and the obligatory tabs at the beginning of rule lines are the most notorious examples [51].

Because of Make's limitations, XML-based build tools that allow developing software across multiple platforms established a firm foothold in the Java world. Ant (Another Neat Tool) [18] was released in 2000 as one of the first modern build tools by today's standards. It spread quickly because of its low learning curve, but its large build scripts were difficult to maintain and were often a target of critiques [58]. Maven tried to learn from Ant's shortcomings and was released in 2004 [19]. Although Maven proved superior to Ant in some aspects, it has its own problems. Customized build scripts are more difficult to write because Maven is focused mainly on dependency management. Maven follows the "Convention over Configuration" principle and can download dependencies automatically, which is definitely a plus. On the downside, different versions of the same libraries often cause conflicts. Gradle [26] is the youngest tool used in Java development. Contrary to its predecessors, Gradle uses a Groovy-based DSL, which makes its build scripts much shorter. This property sparked a debate on whether to leave XML configurations in the past and continue with the more easily readable DSLs from now on [58, 16, 11].

IDEs (Integrated Development Environments) are not build tools per se and may not be a direct step in furthering build automation, but their role in software development is not insignificant. Rather than directly participating in builds, they facilitate access to a huge amount of functionality relevant to software development, neatly packed into one application for the developer's comfort. Apart from initiating compilation, they may contain database management, access to application servers and versioning support [9].
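To make the Bundler workflow described above concrete, a minimal Gemfile sketch might look as follows; the gem names are arbitrary examples.

source 'https://rubygems.org'

gem 'rails', '~> 4.1'   # version constraint: any 4.x release from 4.1 up
gem 'rest-client'       # unconstrained; Bundler picks a compatible version

group :test do
  gem 'minitest'        # installed only for the test group
end

Running bundle install against this file resolves the dependency graph and records the exact chosen versions in Gemfile.lock, so every environment installs the same set of gems.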


Whether a developer uses an IDE or not, version control tools are essential for collaboration on a project. Their role in build automation is revealed in the next section, which outlines the notion of continuous integration.

2.2 Continuous Integration

Continuous integration, as described by Martin Fowler [22], is a subprocess in software development where a developer checks out the project's main branch from a source control repository before starting his work on a new feature. Then he makes the desired changes in his working copy, adds tests for the newly written code, and performs a build on his local machine that runs the whole test suite to make sure his modifications did not break existing code. The developer then tries to commit his changes into the main branch if his local build is stable. It is possible that contributions were made to the main line of the project during his work on the new feature; therefore, it is his responsibility to replay his modifications on top of the already submitted changes. Even when a developer succeeds in adding his new feature to the main line of the code, his work is not yet done. At this point an automated build is triggered, based on the code from the central repository. If the build fails, it should be the highest priority of the whole team to make it pass by fixing the main code line. Only when the build passes is the iteration of work finished and everyone in the team can move on.

There are authors who consider the described workflow an example of continuous builds and argue that Fowler's use of the word "integration" refers to the piecing together of code written by various developers, whereas working software involves not only code and testing but necessarily also application servers, databases, etc. [41]. Regardless of semantics, the process outlined above contains three notable aspects crucial for continuous integration [30, pp. 56-58]:

• Version control. All relevant material needed to create, install, run and test the application should be checked into a version control repository, regardless of whether Git [6], Mercurial [8], Subversion [20] or any other tool is used. This may seem an obvious requirement, since rapid and easy access to the project master is crucial for teams distributed across geographical locations or even different time zones.

• Automated build. The project should be configured to run automated builds easily. Even when an IDE is used for development, it is highly recommended to be able to run builds from the command line. Build scripts provide additional information about how the project works as a whole and may come in handy when things go wrong.

• Agreement of the team. Continuous integration requires that all the members of the team accept its rules and adopt the given workflow. Without the support of all the team members, the desired effect will not occur.

Upholding the listed principles encourages frequent commits to the main line, because potential conflicts are detected immediately. Frequent commits result in smaller changes in each commit, which makes it easier to track down the reason for a build failure [22]. To aid continuous integration, an array of tools known as continuous integration systems has been developed. The workflow of these systems may be as simple as checking out the current version of the project, building it, running tests and reporting the results. More sophisticated setups may include conditional builds and build pipelines. The internals of a CI system are depicted in figure 2.1.

Figure 2.1: The internals of a CI system [2].

Two opposite architectural principles may be found in continuous integration systems. In the first model, a centralized server coordinates and schedules the builds on the connected clients. The opposite is a collection of clients where each client initiates the build and uses the master server only to deposit reports [2]. One of the most widespread CI systems, Jenkins [7], supports both modes (a more detailed description of Jenkins can be found in section 3.8). Hudson [42] and Travis CI [24] also belong among the prominent CI tools.

There are authors who claim that continuous integration does not work and that Fowler's concept is literally dead [3]. The major flaw is seen in the reality that the main code line gets broken, and nobody likes to interrupt his work to repair a build that was broken by someone else. From the practical standpoint, it is not productive for everyone to stop everything and work on the broken build. This leads to a pattern where failed builds are ignored and maybe fixed at the end of the day by someone with a surplus of time to spare. The proposed solution is simple: do not allow the main code line builds to break, by making the master branch read-only. No developer should be allowed to submit his own code to the master branch. Instead, the proposed change should be handed over to the CI system, which tries to merge it in a specified environment and runs the tests. When the build is successful, the proposed change may be submitted for additional peer review or merged directly into master, now with a guarantee that the main code line will not get broken.

Efficiently implemented continuous integration is a huge step forward in software development automation, but the journey does not end here. In this section, we have been able to successfully assemble the application and run the automated tests with a passing status and, as a result of the successful build, there may be an artifact (a *.war file, for example). How to use the outputs of our builds to advance the automation will be unveiled in the next section, which deals with continuous delivery and deployment.

2.3 Continuous Deployment and Continuous Delivery

Moving from development towards deployment and into production is a step where unpleasant surprises may occur; therefore, it is useful to detect potential problems early. Deployment into a test environment verifies that all the application pieces fit together and can be revisited as the requirements on the system evolve. Furthermore, the integration and deployment are now familiar because they have been repeated many times, and that removes the fear and uncertainty. Needless to say, the repeated deployment is done automatically.

An elegant side effect of automated deployment into a test environment is the fact that we are actually capable of deploying to any environment, even production. And thus, continuous deployment is born [41]. It is defined as the act of releasing the software into production when the automated tests pass [60]. A more accurate name could be "continuous release" [29]. Continuous deployment has been claimed to be good for business: because customer feedback can be obtained quickly, a quicker response to market change can be attained [12]. As in previous stages of development, a smaller change is easier to apply, and the stress level of the team members is therefore decreased [43].

Sometimes it may not be viable to deliver every successful build to the customer. From the standpoint of marketing and support, there may be good reasons why only carefully selected versions should be released to the customers. When the decision to ship the new product version is determined by business reasons, we talk about continuous delivery. The actual delivery is done by pushing the button manually [29], as figure 2.2 shows.

Figure 2.2: Continuous Delivery vs Continuous Deployment [50].

2.4 Configuration Management

Configuration management is a widely used term, often appearing in many different contexts. Poorly applied configuration management can be only a nuisance in smaller projects, but may completely hinder automation on a large scale. Many of its tools and principles were already mentioned in previous sections of this chapter as an integral part of the automation process, which only stresses the attention this aspect requires. This section merely tries to summarize the remarks on configuration management scattered across previous sections and complete them with additional information that portrays the different areas where configuration management finds significance. When a good configuration management strategy is applied, the answers to the following questions will be positive [30, p. 31]:

• Can any of the currently used environments be reproduced, including the operating system version, patch level, software stack and configuration of the system and deployed applications?

• Can a change be made to any of the above listed items, and can this change be propagated to other selected environments?

• Is there a way to track change history together with the author and time when modifications were applied?

• Can all the mandatory requirements be met?

• Can everyone in the team access the desired information easily and make changes in configuration efficiently?

Source code versioning is undoubtedly a type of configuration management. The first popular version control system was SCCS, written by Marc J. Rochkind in 1972 [30, p. 382]. The modern tools were already introduced in section 2.2; therefore, no more space will be spared for them here. Managing external dependencies was covered in section 2.1, together with examples of some tools that make this task easier.

When the application reaches certain parameters (size or feature count, for example), it may be more manageable to separate the code into several components. While the actual division and architecture of components is a concern of object oriented analysis and design, configuration management plays the main part in selecting the component versions that appear in the working application. When a change occurs in one component, rebuilding all components may seem a good strategy until the point where rebuilding the whole application takes too much time. Also, the components may have completely different lifecycles: while some are changed very often, others only rarely. The solution is to make a build pipeline for each component in a similar fashion as for the whole application. When a component builds with success, it is promoted to the integration build, which may run on a completely different machine than the component builds. Modern CI systems provide enough support for managing the components for integration builds [30, pp. 356-361].

No application works in a vacuum. It depends on an operating system, external services, hardware and software to run, which are commonly referred to as an environment. Manual environment configuration is the most common approach and is also considered an antipattern for the following reasons [30, pp. 49-54]:

• The amount of configuration information is very large.

• One change can cause the application to break.

• Once broken, finding the problem requires an indeterminate amount of time, even with supreme knowledge.

• It is very difficult to reproduce the configuration for testing purposes.

• The environments are difficult to maintain.

The key to the issues above is to make the environment creation completely automated, which is a task for Puppet [35] or Chef [49].

Chapter 3

State of the Art

This chapter contains descriptions of selected projects that are related to the practical part of this thesis. It provides the reader with a high-level overview of the most important features, with focus on the functionality that is essential to the given project and/or is important from the point of view of this thesis.

3.1 The Foreman

This open source project started in 2009 and was initiated by Ohad Levy and Paul Kelly. The Foreman is a complex tool for the complete lifecycle management of provisioned bare-metal and virtual machines. Foreman's main features in a nutshell are: provisioning, configuration and monitoring. It allows for automation of repetitive tasks during the initial infrastructure setup and offers elaborate configuration management, followed by monitoring and reporting of trends from the obtained data [54].

During provisioning, Foreman relies heavily on Smart Proxy, which is by default installed on the same network node as Foreman. Smart Proxy plays the role of a mediator in the communication between Foreman and several external services. At present, Smart Proxy puts TFTP, DNS, DHCP and Puppet & Puppet CA features fully at Foreman's disposal. Figure 3.1 shows the communication between Foreman, Smart Proxy and the other services.

Figure 3.1: The Foreman architecture [54].

Provisioning a machine with a selected operating system and the desired configuration is a multi-step process. In the beginning, it is necessary to choose an operating system with an appropriate installation medium. The Foreman has several UNIX-based operating systems already preconfigured and ready to use right after Foreman's installation. The next step is to select a partition table and provisioning templates for unattended installation (for example Kickstart or Preseed). When the new host is booted with the PXE protocol, it sends a broadcast message for a DHCP server that can handle PXE requests. Smart Proxy, working as a DHCP server, answers and assigns an IP address to the new host. The PXE server is contacted and redirects the host to the TFTP server containing a boot image. The new host retrieves the image and starts the installation with parameters from the provisioning templates [37]. In the end, a Puppet run is executed. Foreman relies on Puppet for configuration management and collecting facts; Puppet is described in greater detail in section 3.3. The whole process is pictured in figure 3.2.

Foreman's hostgroups can provision a group of hosts with identical configuration regardless of their physical location. The user simply creates a hostgroup with the desired configuration of a system and assigns it to a newly provisioned host. Foreman's compute resources support provisioning of virtual machines. Currently supported hypervisors are: EC2, Google Compute Engine, Libvirt, OpenStack, oVirt/RHEV, Rackspace and VMWare.

The Foreman is able to provision a broad spectrum of UNIX-based systems; successful installations were reported for RHEL, Fedora, CentOS, Ubuntu, Debian, Solaris 8 and 10, and OpenSUSE. The list of systems that Foreman itself runs on is a little shorter: Foreman may be installed on RHEL 6 and 7, CentOS 6 and 7, Fedora 19, Debian 7, and Ubuntu 12.04 and 14.04 [54].

The Foreman offers a graphical UI for the user's convenience; a RESTful API and a command line tool called Hammer are also available. The Foreman is easily extensible through the use of Rails engines [27], as the Foreman-Docker plugin, which adds support for Docker images and containers [53], proves. The largest Foreman plugin yet created is Katello, which is described in the following section.


Figure 3.2: The Foreman – booting sequence of a new host [37].
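The RESTful API mentioned above can be scripted against directly. The following Ruby sketch lists managed hosts via the /api/v2/hosts endpoint; the server address and credentials are placeholders, and the response is assumed to use the API's usual JSON envelope with a results array.

require 'net/http'
require 'json'
require 'uri'

# Query Foreman's REST API for the list of managed hosts.
uri = URI('https://foreman.example.com/api/v2/hosts')
request = Net::HTTP::Get.new(uri)
request.basic_auth('admin', 'changeme')
request['Accept'] = 'application/json'

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request(request)
end

# Collection responses wrap the records in a 'results' array.
JSON.parse(response.body)['results'].each do |host|
  puts host['name']
end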

3.2 Katello

Katello is the most extensive Foreman plugin created so far. It brings content management to the already existing Foreman functionality. It allows the user to create and manage local repositories, regardless of whether their content is yum packages (*.rpm) or Puppet modules (*.pp). Content can be uploaded directly or synchronized from already existing remote sources. One or several repositories can be included in a product, which can be consumed by a registered system.

A registered system (also known as a content host) is a concept that represents a real system. It provides information about the assigned products, that is, which repositories the system will pull content from. It also takes care of package installs, upgrades and uninstalls. In order to receive content, a system must register with Katello. Registering new systems is entrusted to the subscription manager. Registration requires two parameters: the user's credentials and a target content view. Alternatively, an activation key can be used instead of the user's name and password.

A content view is a snapshot of repositories and Puppet modules. Filters may be attached to a content view to provide additional control over the content in the view and explicitly specify which items should be excluded. Content views allow the user to have different versions of the same repository and stage these versions through a lifecycle. Staging is done through lifecycle environments, which are, simply put, containers holding content view versions. The whole lifecycle of a content view is represented by a lifecycle environment path. Each path starts with the default Library lifecycle environment, followed by user-defined environments. When promoted, a content view snapshot is moved to the next lifecycle environment along the lifecycle path. When changes to the content view are made, we can choose to publish an entirely new version of the content view. A newly published content view version will be automatically moved to the Library environment, and if there is any previous version in Library, it will be removed from there. A content view and a lifecycle environment therefore uniquely identify a content view version [56]. Figure 3.3 shows the relationship between Katello and Foreman. As can be seen, both Foreman and Katello rely on other open source projects, which will be described in the following sections.

Figure 3.3: Foreman and Katello architecture [33].
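The publish and promote semantics described above can be summarized in a few lines of code. The following toy Ruby model is purely illustrative (it is not Katello code) and shows why a content view together with a lifecycle environment uniquely identifies a version.

# A toy in-memory model of Katello's publish/promote semantics.
class ContentView
  attr_reader :versions_by_env

  def initialize
    @latest = 0
    @versions_by_env = {}   # environment name => version number
  end

  # Publishing creates a new version and places it in Library,
  # replacing whatever version Library held before.
  def publish
    @latest += 1
    @versions_by_env['Library'] = @latest
  end

  # Promotion copies a version from one environment to the next
  # one along the lifecycle path.
  def promote(from_env, to_env)
    @versions_by_env[to_env] = @versions_by_env.fetch(from_env)
  end
end

view = ContentView.new
view.publish                       # version 1 enters Library
view.promote('Library', 'Test')    # Test now holds version 1
view.publish                       # version 2 replaces version 1 in Library
p view.versions_by_env             # => {"Library"=>2, "Test"=>1}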

3.3 Puppet

Puppet is a configuration management system that achieves the desired state of an infrastructure by enforcing a described state [35]. Every node in the user's infrastructure has a Puppet agent that communicates with a node designated as the Puppet master. Enforcing a desired state is a multiple-step process:

• Facts collection. The Puppet agent sends a report to the server about the current state of the system.

• Catalog compilation. The Puppet master examines the report received from the Puppet agent, compiles a catalog that specifies what the corresponding node state should look like, and sends the catalog back to the Puppet agent.

• Enforcement. The Puppet agent receives the catalog from the master and enforces the state as described in the catalog. Puppet also provides a no-op option that simulates the changes.

• Report. Puppet agent sends a report to the master describing changes that were made to the node.

• Report sharing. Puppet offers to provide reports to third-party tools.

Puppet uses its own Ruby-based declarative DSL to describe the desired state of the system. Listing 3.1 shows an example of the Puppet DSL. There is already a great number of ready-made modules for completing common tasks, such as managing and configuring firewalls or a MySQL database. Modules are available for free from Puppet Forge – a repository created by the community with Puppet Labs support [34].

Listing 3.1: Puppet DSL example

1 file {'my-log-file.log':
2   path   => '/var/log/my-log-file.log',
3   ensure => file
4 }

Although brief, the example in listing 3.1 is a fully functioning Puppet program. Puppet programs are called manifests and use the .pp file extension. To achieve a desired configuration state using Puppet manifests, we rely on resource declarations, "file" being the resource type in our example. The resource title, my-log-file.log, follows the opening curly brace on line 1 and ends with a colon. A set of attribute–value pairs bound together with the rocket operator ("=>") specifies the properties of the declared resource. In this particular case, we declare that a file with the desired name exists in the /var/log/ directory [35]. When changes are to be applied to the system, Puppet manifests get compiled into a catalog, and each Puppet agent can retrieve only its own catalog. Catalog compilation is pictured in detail in figure 3.4.

Figure 3.4: Compilation of Puppet manifests [35].

Puppet uses classes to structure its code logically. An example can be seen in listing 3.2. Lines 1-3 are called a class definition. They define what changes are to be made to the system. Line 5 is a class declaration. It ensures that the class gets picked up and applied when Puppet is run.

To improve the structure of a Puppet codebase and keep it maintainable, manifests are usually organized into modules. A Puppet module is a directory with a defined structure. The module's root is a directory with the name of the module. All manifests are stored in its subdirectory called manifests. Among the module's manifests should be one called init.pp, containing a class named after the module. Puppet looks for modules in a module autoload path that can be easily changed through modification of the puppet.conf file. Arranging manifests into modules offers flexibility as well as maintainability.

Listing 3.2: Puppet class example

1 class example {
2   # class content omitted
3 }
4
5 include example

Since the only manifest the Puppet server looks into when compiling a catalog is the master manifest (sometimes also called the site manifest because it defaults to site.pp; its default value may be changed in puppet.conf), we can organize our code into modules and use the master manifest only to declare the modules with the desired configuration, as in listing 3.3 [35].

Listing 3.3: Master manifest used only to load desired modules [35]

1 include apache
2 include mongod
3 include mysql
4 include ntp

The Puppet DSL also supports user-defined types and file content rendering based on the ERB templating language [1], as listing 3.4 shows.

Listing 3.4: File configuration using a template [35]

1 file {'/etc/foo.conf':
2   ensure  => file,
3   require => Package['foo'],
4   content => template('foo/foo.conf.erb')
5 }

There is much more to Puppet and the Puppet DSL than what was mentioned in the previous paragraphs. The comprehensive documentation is definitely required reading for anyone interested in the deep workings of Puppet. Covering all aspects of Puppet functionality is completely out of the scope of this section and was never intended. Instead, the high-level overview and the examples illustrating core principles were meant to elucidate how Puppet fulfills its role as a configuration management tool and to outline the fundamental principles that Foreman relies on.

3.4 Pulp

Pulp is a Red Hat open source project. As figure 3.5 shows, it is responsible for managing repositories and distributing their content to selected users.

Figure 3.5: Pulp’s role in content distribution [57].

With Pulp, an administrator can upload content from different repositories and store it on a Pulp server. He can create his own customized repositories and publish them. Consumers can register with Pulp and have content installed from selected repositories. The Pulp server can also manage and monitor the content installed on each consumer. Pulp currently supports rpm packages and Puppet modules, but with its type-agnostic design and extension framework it offers an opportunity for new plugins to add additional content types to the list of already supported ones [57]. Pulp is comprised of three main components:

• Server. The main part of the application, installed on the server machine, that takes care of managing and hosting the repositories.

• Agent. This component runs on the consumer and reports to the server.

• Client. A command line tool that is broken into two parts. The server part can run on any machine with access to the Pulp server API and serves for remote management of the server. It is also possible to issue some commands to the consumers remotely from the server command line tool. The client part manages the consumer's relationship to the server and must be run on a consumer. It allows the administrator to register the consumer to the server, bind the consumer to server-managed repositories and consume content.


For a standalone Pulp installation, only Red Hat family operating systems are supported (RHEL 6 & 7, Fedora 19 & 20, CentOS 6 & 7). For RHEL and CentOS systems, EPEL is required. Pulp is backed by MongoDB, which may require significantly more disk space than the total amount of data stored in the database. This known behavior of MongoDB should not be underestimated when preparing for the installation. Pulp uses Qpid for exchanging messages between components; RabbitMQ can be used alternatively [57].

Concurrent operations running on the server may result in a conflict when, for example, promoting and synchronizing a repository at the same time. Pulp implements locks in the form of task objects to prevent task conflicts. A task object is placed into a queue and waits for a worker to process it, always at most one task per resource at a time. The Pulp server has several components that can be restarted independently of the others if the need arises. These components are [57]:

• Apache. The Apache server is responsible for the REST API.

• Workers. They execute the asynchronous tasks on the server.

• Celery Beat. Singleton component that monitors availability of the workers and queues scheduled tasks.

• Resource Manager. A singleton worker that assigns tasks to other workers based on resource availability.

Only registered systems may consume content from a Pulp server. When registering, a system needs to supply HTTP Auth credentials (username and password). After registration, a certificate is written to the consumer's PKI, which will be used for future authentications, and additional information about the consumer is stored on the server. At this point, a consumer can bind to a repository. Binding to the repository will allow the consumer to install packages from it. The Pulp server keeps a history of changes on its registered consumers. It records when a consumer registers/unregisters, binds/unbinds a repository, installs/uninstalls a content unit and joins/leaves a consumer group. Only actions triggered through Pulp are recorded in the history.

Packages in managed repositories can also be moved between two Pulp servers. The server receiving content is denoted as a child node and registers to the server providing it with content. All a parent server must do is activate the server-consumer that will be recognized as a child node. The content on registered child nodes is managed in a similar manner as in the server-consumer relationship [57].

3.5 Candlepin

Candlepin is a subscription management tool which allows users to access provided content through subscriptions. Candlepin monitors which products the user already has and which products he is allowed to consume. Candlepin provides an API for a user to query for available products and then actually assign them. In the most basic deployment setup, Candlepin's Product Data extension point is supplied with information about product orders. A remote client (an entitlement manager) then contacts the Subscription Data extension point to consume entitlements that were created from the orders. Candlepin's core functionality, provided by a Java engine, maps the owner's subscriptions onto entitlements that can be consumed, as figure 3.6 shows.

Figure 3.6: Simple deployment of Candlepin [28].

Apart from the already mentioned Subscription and Product extension points, Candlepin furthermore offers the following extension points [28]:

• Entitlement Certificate Generation. A point to generate a file representation of an entitlement.

• Identity Certificate Generation. Generates consumer identity.

• Event Publishing. Announces events occurring within Candlepin's engine.

• User Data. Informs how users are authorized and authenticated.

• Business Rules. Access for additional rules modifying consumption of entitlements.

• Batch Jobs. Support for clusterable batch jobs.


The subscription lifecycle is determined by its date attributes. A subscription in the entered state is valid but has not been activated. When today's date is greater than or equal to the begin date, the subscription becomes activated. Updates may change the end date or the quantity of products the consumer is entitled to. When a renewal occurs, the end date is changed. Termination is usually a result of an event; the end date is set to the termination date. When today's date is greater than the end date, the subscription becomes expired, but no data are changed. When a subscription is canceled, the end date is set to equal the start date and the subscription never becomes active. Figure 3.7 shows the possible states.

Figure 3.7: Candlepin subscription states [28].

Candlepin does not actively look for expired subscriptions but relies on SSL to verify the certificates. It removes expired subscriptions during the refresh operation for a user [28].

3.6 Bastion

Bastion is Katello's child project. It is a view engine written in AngularJS [25]. Bastion is essentially a standalone frontend module that communicates with the backend via a REST API. The UI provided by Bastion is a single page application. Dynamically rendered pages display cached data, and only the HTML necessary to render the next page is loaded, which results in quicker page loads [55].

3.7 Elastic Search

Elastic Search is a distributed search engine providing near real-time search capabilities. It takes care of indexing with different index types and of the consistency of operations [14]. Elastic Search excels at querying text for results and returning statistical analyses of a given corpus of text. Most of its search algorithms come from the time-proven Lucene project. Lucene first appeared in 1999 and later joined the Apache Software Foundation [21]. Elastic Search is essentially built around Lucene's core Java libraries. Lucene by itself provides its native Java API, which is cumbersome to use. One of the most beneficial values of Elastic Search is therefore exposing its own native Java API as well as a RESTful API, which both provide more intuitive access to Lucene's functionality. The addition of the RESTful API naturally allows interoperation with non-Java based applications. In comparison to Elastic Search, Lucene provides only minimal support for distribution across multiple nodes, which makes scaling extremely difficult.

Elastic Search is best optimized for searching a large number of items for the best match, finding occurrences of a sample in a large text, auto-completion of partial input with respect to misspellings of the word, and data manipulation across multiple nodes in a cluster. The high efficiency of these operations comes at the cost of slow execution of some other types of tasks. Elastic Search is not particularly well suited for problems which can be handled by relational databases, such as concurrent execution of operations with rollback support and creation of records with unique values in multiple fields.

The smallest data unit that Elastic Search operates with is a field, and each field has a type. A document is a collection of fields and represents the smallest unit of storage. When a field is updated, the whole document is rewritten, with no change to the unmodified fields. All documents are internally represented as JSON and later mapped to Lucene's API. Each document must have a user-defined mapping. The mapping specifies the type of the fields and the indexing method. Elastic Search supports common types for its fields (string, integer, long, float, double, boolean, date, geo point) as well as arrays and JavaScript objects. Nested objects are always stored in the same physical document on the disc as their parent. There is an additional nested type that stores the object in a different document, which has an impact on performance.

Indexes are both logical and physical partitions of the documents. Index-wise, documents are perceived as unique. Most of the operational activities run solely on indexes, and a single index has no knowledge of data within other indexes. Although cross-index searches are supported, they are rarely used.


Indexes in Elastic Search are not mapped 1:1 to Lucene's indexes. They are made into shards and mapped onto a configurable number of indexes in Lucene. The default value is 5, and each shard also creates one replica. The number of replicas may be changed as well; there is a tendency to have one replica on each node of the cluster. If the node count in the cluster is too low to support the chosen number of replicas, Elastic Search reports the cluster as 'degraded'. It is advisable to run Elastic Search on a sufficient number of nodes in a production environment and avoid the degraded state [4].
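The RESTful API makes the document-and-search workflow described above easy to show from Ruby. The sketch below assumes a local Elastic Search node listening on port 9200; the index, type and field names are made up for the example.

require 'net/http'
require 'json'
require 'uri'

es = URI('http://localhost:9200')

Net::HTTP.start(es.host, es.port) do |http|
  # Index a document (the 'articles' index is created on first use).
  doc = { title: 'Continuous integration', body: 'Builds run on every commit.' }
  http.send_request('PUT', '/articles/article/1', doc.to_json,
                    'Content-Type' => 'application/json')

  # Full-text search for the best-matching documents.
  query = { query: { match: { body: 'commit builds' } } }.to_json
  response = http.send_request('POST', '/articles/_search', query,
                               'Content-Type' => 'application/json')
  JSON.parse(response.body)['hits']['hits'].each do |hit|
    puts "#{hit['_score']}  #{hit['_source']['title']}"
  end
end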

3.8 Jenkins

Jenkins is an open source continuous integration tool written in Java. It started as a pet project of Kohsuke Kawaguchi under the name Hudson, when he was working at Sun in 2004. As Hudson slowly evolved, various teams working at Sun started to adapt Hudson to their own needs. The potential of the project was recognized, and Kawaguchi was offered the opportunity to continue developing Hudson as a full-time job. In 2010, Hudson became the leading solution for continuous integration. After Oracle took over Sun, the tension between the Hudson developer community and Oracle started to grow. Oracle claimed ownership of the name 'Hudson', and there were disagreements on the development process as well when Oracle proposed to adopt a strictly controlled development with less frequent release dates. In the end, the group of developers around Kawaguchi renamed the project to Jenkins and moved it to GitHub [48, p. 34].

Wide community support and ease of extensibility are the commonly named reasons for Jenkins' success. There are more than 1000 plugins to enhance its functionality [5]. Together with an intuitive interface ensuring a low learning curve, Jenkins is the number one CI solution for teams of various sizes, regardless of the programming languages and technologies involved in their projects. Jenkins can handle .NET, Ruby, Groovy, Grails, PHP, naturally Java, and many more [48, p. 35]. Jenkins supports the most common UNIX-based systems (Debian/Ubuntu, the Red Hat family, OpenSUSE, Mac OS X, FreeBSD, Solaris) and also Windows. After the installation, one of the most important directories for Jenkins is the Jenkins home directory, which on Red Hat family systems defaults to /var/lib/jenkins. Here Jenkins stores its main configuration file (config.xml), installed plugins (plugins/) and user defined jobs with their workspace, where builds take place (jobs//workspace) [48, p. 80].


Jenkins is highly configurable, although it comes with sane default settings and may be used right out of the box. Just about everything may be configured in the 'Manage Jenkins' screen, from plugins to mailers, logging and reporting, environment variables, Jenkins slave nodes for parallel builds, and scheduled shutdowns [48, pp. 67-70]. Executing project builds is the main purpose of Jenkins' existence. When creating a new project, there are 4 default items to choose from:

• Build a Maven project. Jenkins was primarily designed to build Java projects. For those, this option is probably the best choice, since it takes advantage of POM files, thus reducing the manual job configuration that needs to be done through Jenkins.

• Build a multi-configuration project. This type is intended for projects in need of advanced configuration, for example testing against multiple different databases.

• Monitor an external job. With this project type Jenkins offers to monitor external non-interactive tasks.

• Build a free-style software project. Jenkins' jobs are prevalently free-style projects. If none of the previous categories applies, the project belongs to this one.

For a build of a free-style project, its name and the location of the source code are necessary. Jenkins by itself supports Subversion; there are plugins for Git, Mercurial and many more version control systems. Fully in the spirit of Jenkins' flexibility, even the plugins are highly customizable. For example, the Git plugin allows the user to specify branches to check out, schedule builds, poll the Git repository for changes and build if a change is detected, merge branches locally, and much more. Jenkins not only needs to be told where to find the source code, but also what to do with it. The actions that should be performed are declared in build steps. They may be as simple as executing a shell script. For Java projects, Maven or Ant steps may be included. But once again, plugins come into play, and with their aid it is possible to invoke Rake tasks or Groovy scripts, for example. When the build has finished, there may be additional work ready to be done. It is known as post-build tasks, and it may include reporting of test results, cleanup in the form of wiping the workspace clean, deploying the artifacts or starting builds of dependent projects (called downstream projects in the context of Jenkins builds) [48, ch. 5].

Configuring and running a single job is only a small piece of what Jenkins has to offer regarding project builds. Multi-configuration projects were briefly mentioned, but there are also parametrized builds, which can take, for example, a tag from a Git repository or any of the Jenkins-provided environment variables as a build parameter. Jobs can be organized into pipelines and promoted on successful builds. When jobs are built in parallel on several machines turned into Jenkins slaves, things can get a bit more complicated, but that is where dependency graphs, locks, joins and latches come into play to resolve dependencies before building another job [48, ch. 10].

3.9 Jenkins API Client

Jenkins API Client is a Ruby client for interaction with a Jenkins CI server. Although Jenkins provides 3 types of remote access (an XML API, a JSON API and a Python API), it relies heavily on XML configuration of its projects and plugins. This client aims to make it easy to reach Jenkins from Ruby code [36].
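A short sketch of the client's typical use follows; the server URL and credentials are placeholders, and 'my-project' stands for any job already configured on the server.

require 'jenkins_api_client'

# Connect to a Jenkins server with authenticated access.
client = JenkinsApi::Client.new(
  server_url: 'https://jenkins.example.com',
  username:   'admin',
  password:   'secret'
)

puts client.job.list_all          # names of all jobs configured on the server
client.job.build('my-project')    # trigger a build of an existing job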

3.10 Dynflow

Dynflow is a dynamic workflow engine created for task orchestration in Foreman and Katello. It makes it possible to keep track of running tasks, monitor their progress, inspect their state and recover from failures, or skip steps when needed. Dynflow can run steps concurrently or in sequence, suspend long-running steps and wake them up again when a desired event occurs. The whole workflow is defined by Dynflow actions and their input parameters, which are resolved at run time. Each Dynflow action may be a subcomponent of another action and has three phases:

• Plan phase. The execution plan of the workflow is constructed. Actions may be planned explicitly by the plan_action method or as subscribers to an existing action that is already planned. The output of this phase is a set of actions and their inputs.

• Run phase. Actions get executed by calling the run method, which should be supplied with all the required information as input from the plan phase. This ensures the stateless nature of the run phase, which results in easier recovery and persistence.

• Finalize phase. This phase is suitable, for example, for recording data into the action output.

It is recommended to compose actions in such a manner that every action is as atomic as possible, to achieve better control over the whole orchestration process [39].
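A minimal sketch of an action with the three phases might look as follows; the class and input names are illustrative and not taken from any real plugin.

require 'dynflow'

class SyncRepository < Dynflow::Action
  def plan(repo)
    # Plan phase: capture everything the run phase will need as plain
    # input data, so that run can stay stateless.
    plan_self(repo_id: repo.id, url: repo.url)
  end

  def run
    # Run phase: do the work using only the planned input,
    # recording results into the action's output.
    output[:started_at] = Time.now.to_s
  end

  def finalize
    # Finalize phase: e.g. record data once the whole run has succeeded.
  end
end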

3.11 Foreman Tasks

Foreman Tasks is a task management tool for Foreman. It monitors the finished and currently running tasks in Foreman. Although Foreman uses Dynflow as a workflow engine, Foreman Tasks does not depend on Dynflow and may be used with anything that supports execution hooks [38].

Chapter 4

Project Design & Analysis

This chapter describes the practical part of my thesis and comments on its design. The presented plugin (named Foreman Pipeline) is an ideological successor to a project with the codename "abcde", which conceived the idea of deploying nightly builds onto a machine newly provisioned by Foreman. The intended workflow can be divided into two phases: a configuration phase and a run phase. During the configuration phase, all the necessary information is supplied by the user so that the run phase may complete without further interaction and supervision. The run phase may be summarized as follows: provision a new host, build Jenkins projects, wait for the build results, do post-build actions. Figure 4.1 shows the run phase in greater detail.

4.1 Project Structure

The project is a Rails::Engine [27], as described in the instructions for Foreman plugins in the official documentation [54]. The most adequate choice was an engine with a semi-isolated namespace [46], since I wanted to avoid namespace pollution resulting in name conflicts, but I needed access to the classes in Foreman and Katello. The plugin has a standard Rails project structure with minor modifications mirroring the conventions in the Foreman and Katello projects.

4.2 Models

This section provides details about the individual models of the plugin. Extensions of existing Foreman and Katello classes are placed into /app/models/concerns. The overall data model can be seen in figure 4.2.


Figure 4.1: Sequence diagram of intended workflow

4.2.1 ForemanPipeline::Job

The Job is the core of the plugin. It aggregates all the necessary data to ensure their availability when the job is triggered. A job can be started in several different ways:

• manually.

• when a content view is promoted or published.

• when a repository is successfully synchronized.

Before the actual execution of a job, it must be properly configured with a content view, a hostgroup, a compute resource, an environment, a Jenkins instance and a Jenkins user. The last two entities and their roles are described in detail in subsections 4.2.2 and 4.2.5. A non-composite content view with yum repositories represents the content that should ultimately be deployed onto the newly provisioned host.


Figure 4.2: Foreman Pipeline data model. The already existing classes of Foreman and Katello are blue, the classes of Foreman Pipeline are orange.

The actual deployment is not incorporated in the workflow; it is presumed to be a part of the project build on the Jenkins server.

The hostgroup defines the parameters of the newly provisioned host. It should have a compute resource configured, along with all the necessary items for unattended provisioning as described in the Foreman documentation [54]. The compute resource defines where the host will be deployed to. A hostgroup must already be set before assigning a compute resource to a job; only compute resources configured for the assigned hostgroup will be displayed in the UI and may be set.

The environment is one of Katello's lifecycle environments in a lifecycle path. It plays a significant role in job execution, together with the content view. When a job-triggering event occurs, it is not guaranteed that any job will actually execute, even though there are jobs set to be triggered by this event. In the beginning, a job is checked for proper configuration and for whether it is allowed to be triggered by the event that occurred. Based on the trigger type, additional constraints on the job are enforced. Furthermore, a job only executes when its content view has already been promoted to its environment but not to the environment's successor. This ensures that each job is triggered at most once for each content view version and prevents unnecessary repetition of job runs. Each event (with the exception of a repository sync) also triggers at most one job.

Library is a special environment that each lifecycle path begins with (for details, see Katello's documentation [56]). This provides an opportunity to promote a content view from the Library into multiple paths. These paths are determined by their second environment (the one that follows the Library). The only thing necessary to select the desired path is to set additional environments (called 'to environments') for the job. The promotion of the content view is optional; if no to environments are set, the view will not be promoted. This holds for non-Library environments as well, but their to environments may contain only one environment – their immediate successor.

4.2.2 ForemanPipeline::JenkinsInstance

This model represents an instance of a Jenkins server which will be used to build the projects. A few manual configuration steps are required before creating a new Jenkins instance. An RSA keypair needs to be generated on our Foreman server. The private key should be stored in a folder with appropriate access rights for the user running our Foreman instance, and the public key should be distributed to the Jenkins server. A Jenkins instance cannot be created without a proper key setup.

When a Jenkins instance is being created, Foreman connects to the Jenkins server via SSH and instructs it to generate yet another RSA keypair. This one, however, is meant for passwordless communication between Jenkins and the newly provisioned host. Therefore the Jenkins server hands the public key over to Foreman, and the key is delivered to the host during provisioning through the Kickstart file. The actual communication with the Jenkins server is handled by the Jenkins API Client gem [36], which contacts Jenkins' REST API.

4.2.3 ForemanPipeline::JenkinsProject

A Jenkins project is Foreman's handle to an existing project on a Jenkins CI server, which is represented by a Jenkins instance. The actually available projects may be retrieved on demand by a name search and assigned to a job.


4.2.4 ForemanPipeline::JenkinsProjectParam

There is support for parametrized Jenkins project builds. The parameters may be injected with information from the newly provisioned host with the use of ERB templating. Jenkins offers more than 10 types of parameters. At present, only the string, text and boolean types are supported, because they appear in the vast majority of builds, whereas the remaining types are used only exceptionally.
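To illustrate the injection, the value of a string parameter could be an ERB template rendered just before the build request is sent. In the standalone sketch below, @host and its name attribute are hypothetical stand-ins for the data the plugin exposes to templates.

require 'erb'

# @host stands in for the newly provisioned host; a real setup would
# supply the actual host object instead of this stub.
@host = Struct.new(:name).new('host01.example.com')

template = ERB.new('DEPLOY_TARGET=<%= @host.name %>')
puts template.result(binding)   # => DEPLOY_TARGET=host01.example.com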

4.2.5 ForemanPipeline::JenkinsUser

The Jenkins user model holds credentials for authenticated access to the Jenkins server, since unauthenticated users are usually limited to only a small number of actions. A token and a user name are required for successful authentication. The token can be obtained in the user details pane of the Jenkins CI server we wish to connect to.

4.3 Views

The plugin registers with Bastion and uses it as a view engine. Therefore, the views are located in /app/assets/ instead of the standard /app/views. Bastion views form an AngularJS submodule where files are structured by feature. RABL [15] view templates are used to generate the JSON that supplies Bastion with data from the backend. ERB templating is used for the provisioning snippet that adds the public key generated for passwordless access from the Jenkins server. It can be found in the /app/views/foreman/unattended/snippets folder.
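For illustration, a RABL template rendering a job for Bastion might look like the following minimal sketch; the attribute names are made up, not copied from the plugin.

# Hypothetical RABL template (e.g. show.json.rabl) for a job resource.
object @job

attributes :id, :name

child :jenkins_instance do
  attributes :id, :url
end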

4.4 Controllers

Each of the models mentioned in the previous section has its own controller that handles the basic CRUD operations as well as advanced functionality. Controllers inherit from Katello::Api::V2::ApiController and form a REST API that responds in JSON. The mixed-in Api::Rendering module takes care of RABL template selection for JSON rendering. JenkinsRequestsController handles the requests that are tied to the Jenkins CI server rather than to a specific model. All routes are documented using the Apipie gem [40].
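A sketch of the Apipie documentation DSL on a hypothetical action (the controller body and parameter shown are illustrative, not the plugin's literal code):

class JobsController < Katello::Api::V2::ApiController
  api :POST, '/jobs', 'Create a job'
  param :name, String, desc: 'Name of the new job', required: true
  def create
    # create the record, then render its RABL template via Api::Rendering
  end
end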

4.5 Dynflow Actions

Dynflow action classes can be found in the /app/lib folder, which is added to the autoload paths. Dynflow is the main tool that holds the orchestration of the workflow together, from the start of the job execution to the very end. Without Dynflow's support, it would be very difficult to achieve the desired outcome, since the whole workflow is a sequence of actions distributed among three nodes: the Foreman server, the Jenkins server and a host provisioned by Foreman.

Foreman Tasks are used for the management of Dynflow actions. They form a wrapper layer that allows Dynflow actions to be called easily from controllers. Some additional enhancements provided by Foreman Tasks are used, such as an action with sub-plans. This turned out to be very convenient when contacting the Jenkins server with a request to build multiple projects. If anything breaks, the cause may be easily detected in the Dynflow console, which is an invaluable help. The console output of a manually triggered job's plan phase is shown in figure 4.4; figure 4.3 pictures the results of the job's run phase.
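The following is a simplified sketch of the Dynflow action pattern [39] that the plugin builds on: the plan phase schedules work and the run phase performs it. The class and input names are illustrative:

class BuildProject < Dynflow::Action
  def plan(job)
    plan_self(job_id: job.id)   # schedule this action's own run phase
  end

  def run
    # contact Jenkins here; anything stored in 'output' becomes available
    # to actions further down the line
    output[:build_started] = true
  end
end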

Figure 4.3: Dynflow console with results of a manually triggered job.


Figure 4.4: Dynflow console showing the plan phase of a manually triggered job.

4.5.1 General Workflow

The necessary items that a job requires before execution starts are listed in subsection 4.2.1. Foreman Pipeline is fully integrated into Foreman's web GUI, which can be used to configure a job once the plugin is added to a Foreman instance; the REST API may be used alternatively. The main details pane of the job gives a summary of the current job configuration, as can be seen in figure 4.5.

The workflow starts with a hook action regardless of how the job is triggered. There are, of course, different hooks for each triggering event. Only the Job::RunJobManually action is invoked directly; all the others are subscribed to the corresponding actions that serve as their trigger. Hook actions serve as workflow entry points: they select the jobs that should be executed and validate their configuration.

As can be seen in figure 4.4, the Job::Redeploy action schedules several actions that take care of provisioning a new host and perform an array of associated chores, such as the creation of a new subscription key and the subsequent registration of the newly provisioned


Figure 4.5: Job details page providing a user with an overview of the current job configuration.

system to the content view with it. Its run phase merely gathers the outputs of the scheduled actions and makes them available for the actions that are further down the line.

The duration of individual actions has to be taken into account as well. There is the Job::SuspendUntilProvisioned action (run step number 11 in figure 4.3), whose sole purpose is to check whether the provisioning of the new host has already finished and, if so, to allow the workflow to continue. When provisioning is finished, the host reboots, which introduces additional latency into the workflow that has to be reckoned with. The Jenkins::WaitHostReady action (run step number 14 in figure 4.3) waits for the host to become available after the reboot. Job::FindPackagesToInstall (run step number 16 in figure 4.3) retrieves the names of all the packages in the content view associated with the job and makes them available for injection into the build parameters of the Jenkins projects.

When requesting a build of multiple projects on the Jenkins server, an action with sub-plans is used. Jenkins::BulkBuild (run step number 18 in figure 4.3) spawns a sub-action that is planned, run and finalized for every Jenkins project that should be built. Then it waits for the results of all the child actions before finishing itself. Every child


action tells the Jenkins server to build one project and what parameter values to use if the build is parametrized, and subsequently waits for the results.

If all project builds passed, the content view is optionally promoted to the next environment as a post-build action. If any of the builds failed, the Job::Promote action (run step number 20 in figure 4.3) is skipped. The same holds if promotions are not enabled.

The important thing is that jobs can be chained to move the content view along the whole lifecycle environment path. The chaining is specified implicitly by the job configuration. If we configure two jobs for the same content view and successive environments, where the job for the second environment has a promote/publish trigger, then these jobs may potentially run in a sequence. If all the Jenkins builds configured for the first job pass, its content view gets promoted as a result. The content view promotion subsequently serves as a trigger for the second job. Once the content view arrives in the last environment of a lifecycle path, it naturally cannot be promoted any further.
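Returning to the waiting actions mentioned above, the following is a hedged sketch of the pattern behind an action such as Job::SuspendUntilProvisioned, using Dynflow's polling support: the run phase stays suspended and an external state is re-checked until it is done. The class name and the provisioning check are hypothetical stand-ins:

class WaitUntilProvisioned < Dynflow::Action
  include Dynflow::Action::Polling

  # Provisioning was started elsewhere, so there is nothing to kick off here.
  def invoke_external_task
    { host_id: input[:host_id] }
  end

  # Re-evaluated on every poll; the action remains suspended in between.
  def poll_external_task
    { finished: host_provisioned?(input[:host_id]) }
  end

  def done?
    external_task && external_task[:finished]
  end

  private

  # Placeholder: the real action would ask Foreman about the host's build state.
  def host_provisioned?(_host_id)
    true
  end
end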

4.6 Challenges and Obstacles

Given its nature as a plugin, Foreman Pipeline depends on several existing projects. Registering with Foreman was straightforward enough, but I ran into permission issues with Bastion, which was still a part of Katello's codebase at the time. After changes in the routing configuration I was able to make it do my bidding. Bastion was later extracted and is now an independent gem, which naturally required me to adapt to the changes.

Foreman Pipeline also depends on the Jenkins API Client, which has lately been under heavy development; its newest versions depend on the Nokogiri gem, version 1.6 and higher. Unfortunately, Foreman has its Nokogiri dependency locked on a version lower than 1.6 for backward compatibility with Ruby 1.8. This forced me to use Jenkins API Client 0.14.1, which is the last version to rely on Nokogiri lower than 1.6 and is more than one year old. Fortunately, this version is already advanced enough to provide the needed functionality.

Probably one of the most challenging tasks was to make sure the whole workflow does not break. As already mentioned, different things happen at the various nodes and it is imperative to ensure the correct ordering of these tasks. Sometimes, things have to be done at a relatively low level, for example generating and copying certificates over SSH, or making sure that a machine is ready to accept SSH connections by actually trying to SSH into it. These tasks are, of course, highly prone to errors with limited options for debugging, and I faced my share of problems. Perhaps

the hardest-learned lesson was to check the SELinux context on certificates when passwordless authentication via SSH just refuses to work and there is simply no way the certificates are set up incorrectly.

There were also general design issues that were completely resolved only after trying the non-viable options first. At a certain point, I had shell scripts generated on the run that were handed over to Jenkins and were meant to be executed on the newly provisioned host as project post-build steps. This proved to be quite cumbersome, and the approach was consequently abandoned in favour of configuring the projects directly in Jenkins while Foreman only supplies the parameters for the build.

Conclusion

Foreman Pipeline aims to provide an integration of Foreman, Katello and Jenkins CI in a meaningful way. With this plugin, it should be possible to build an environment using Foreman, Katello and Jenkins concepts together. The main value of the plugin lies in bridging the gap between Foreman and Jenkins with relative ease of use for users who have previous experience with these systems. Its functionality may be accessed through the Foreman web UI or the REST API.

At the time of writing, Foreman Pipeline has been submitted to the community for peer review and much appreciated feedback. Future development will likely be driven by the suggestions of its users. Support for additional project build parameter types seems like a rational next step; I personally see a history of job runs as a beneficial addition.

The project source code is available on GitHub [44]. The associated Wiki pages offer all the necessary information to add the plugin into an existing Foreman installation, as well as a user guide that contains a detailed usage description. There is also a demo screencast available [52] that shows the core functionality of the plugin and provides potential users with basic information on how to use Foreman Pipeline.

Bibliography

[1] James Britt. ERB. [online]. 2015. [cit. 2. 1. 2015]. Available at: .

[2] T. C. Brown and R. Canino-Koning. Continuous Integration. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

[3] Yegor Bugayenko. Continuous Integration is Dead. [online]. 2014. [cit. 19. 2. 2015]. Available at: .

[4] Andrew Cholakian. Exploring Elasticsearch. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[5] CloudBees. About Jenkins CI. [online]. 2015. [cit. 8. 2. 2015]. Available at: .

[6] Git Community. Git. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

[7] Jenkins CI community. Jenkins – An extensible open source continuous integration server. [online]. 2015. [cit. 8. 2. 2015]. Available at: .

[8] Mercurial Community. Mercurial. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

[9] NetBeans Community. NetBeans IDE Features. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[10] Control4 Corporation. Home Automation. [online]. 2015. [cit. 14. 2. 2015]. Avail- able at: .


[11] dexterous. Why use Gradle instead of Ant or Maven? [online]. 2013. [cit. 15. 2. 2015]. Available at: .

[12] Eliza Earnshaw. Top Benefits of Continuous Delivery: An Overview. [online]. 2014. [cit. 21. 2. 2015]. Available at: .

[13] Eliza Earnshaw. Why Automation? Predictability, Consistency & the Confidence to Innovate. [online]. 2014. [cit. 14. 2. 2015]. Available at: .

[14] Elasticsearch. Elasticsearch – A Distributed RESTful Search Engine. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[15] Nathan Esquenazi. RABL. [online]. 2011. [cit. 21. 2. 2015]. Available at: .

[16] Viktor Farcic. Java Build Tools: Ant vs Maven vs Gradle. [online]. 2014. [cit. 15. 2. 2015]. Available at: .

[17] Stuart Feldman. Make - A Program for Maintaining Computer Programs. [on- line]. 1978. [cit. 14. 2. 2015]. Available at: .

[18] Apache Software Foundation. Apache Ant. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[19] Apache Software Foundation. Apache Maven Project. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[20] Apache Software Foundation. Subversion. [online]. 2014. [cit. 19. 2. 2015]. Avail- able at: .

[21] Apache Software Foundation. Apache Lucene. [online]. 2015. [cit. 7. 2. 2015]. Avail- able at: .


[22] Martin Fowler. Continuous Integration. [online]. 2006. [cit. 19. 2. 2015]. Available at: .

[23] Eleftherios Gkioulekas. Learning the GNU development tools. [online]. 1998. [cit. 15. 2. 2015]. Available at: .

[24] Travis CI GmbH. Travis CI. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

[25] Google. AngularJS. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[26] Gradleware. Gradle. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[27] David Heinemeier Hansson. Getting Started with Engines. [online]. 2015. [cit. 24. 1. 2015]. Available at: .

[28] Red Hat. Candlepin – About. [online]. 2015. [cit. 1. 2. 2015]. Available at: .

[29] Jez Humble. Continuous Delivery vs Continuous Deployment. [online]. 2010. [cit. 19. 2. 2015]. Available at: .

[30] Jez Humble and David Farley. Continuous Delivery. Addison-Wesley, 2011.

[31] Harry Jerome. Mechanization in Industry. National Bureau of Economic Research, 1934. URL .

[32] Yehuda Katz and Carl Lerche. Bundler. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[33] Brian Kearney. 6 is Here... We Hope You Enjoy It. [online]. 2014. [cit. 25. 1. 2015]. Available at: .

[34] Puppet Labs. The Puppet Forge. [online]. 2015. [cit. 25. 1. 2015]. Available at: .


[35] Puppet Labs. What is Puppet. [online]. 2015. [cit. 25. 1. 2015]. Available at: .

[36] Kannan Manickam. Jenkins API Client. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[37] Felix Massem. The Foreman - A complete lifecycle management tool. [online]. 2014. [cit. 18. 1. 2015]. Available at: .

[38] Ivan Nečas. Foreman Tasks. [online]. 2014. [cit. 14. 2. 2015]. Available at: .

[39] Ivan Nečas and Petr Chalupa. Dynflow. [online]. 2015. [cit. 14. 2. 2015]. Available at: .

[40] Ivan Nečas and Pavel Pokorný. Apipie. [online]. 2015. [cit. 10. 4. 2015]. Available at: .

[41] Dan North. Continuous Build is not Continuous Integration. [online]. 2006. [cit. 19. 2. 2015]. Available at: .

[42] Oracle. Hudson. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

[43] Andy Parker. Five Ways Continuous Delivery Reduces Stress. [online]. 2014. [cit. 21. 2. 2015]. Available at: .

[44] Ondřej Pražák. Foreman Pipeline. [online]. 2015. [cit. 20. 3. 2015]. Available at: .

[45] GNU Project. GNU Software. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[46] Johnathan Rochkind. The Semi-Isolated Rails Engine. [online]. 2012. [cit. 21. 2. 2015]. Available at: .


[47] Wayne E. Seguin and Michal Papis. Ruby Version Manager. [online]. 2015. [cit. 15. 2. 2015]. Available at: .

[48] John Ferguson Smart. Jenkins: The Definitive Guide. O’Reilly Media, 2011.

[49] Chef Software. Chef. [online]. 2015. [cit. 21. 2. 2015]. Available at: .

[50] Yassai Sundman. Continuous Delivery vs Continuous Deployment. [online]. 2013. [cit. 21. 2. 2015]. Available at: .

[51] Conifer Systems. What’s Wrong With GNU Make? [online]. 2010. [cit. 18. 4. 2015]. Available at: .

[52] Foreman Team. Foreman deep dive: foreman-pipeline plugin. [online]. 2015. [cit. 11. 4. 2015]. Available at: .

[53] Foreman Team. Foreman Docker Plugin. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[54] Foreman Team. The Foreman. [online]. 2015. [cit. 18. 1. 2015]. Available at: .

[55] Katello Team. Bastion: AngularJS based Foreman UI Engine. [online]. 2015. [cit. 7. 2. 2015]. Available at: .

[56] Katello Team. Katello. [online]. 2015. [cit. 25. 1. 2015]. Available at: .

[57] Pulp Team. Pulp - Juicy software repository management. [online]. 2015. [cit. 31. 1. 2015]. Available at: .

[58] Juri Timoshin. Java Build Tools: Maven, Gradle and Ant plus the DSL vs. XML debate. [online]. 2013. [cit. 15. 2. 2015]. Available at: .


[59] Jim Weirich. RAKE – Ruby Make. [online]. 2014. [cit. 15. 2. 2015]. Available at: .

[60] ThoughtWorks. Continuous Integration. [online]. 2015. [cit. 19. 2. 2015]. Available at: .

Index

A: Ant, 6; Apache, 24
B: Bastion, 23, 33; build, 5, 7–10; Bundler, 6
C: candlepin, 22; Chef, 12; configuration management, 11; content management, 15; content view, 16, 30; continuous integration, 7, 9
D: delivery, 10; deployment, 9; DSL, 6, 17; Dynflow, 27, 34
E: Elastic Search, 24; ERB, 33
F: Foreman, 13, 29, 38; Foreman Pipeline, 29, 38; Foreman Tasks, 28, 34
G: gem, 6; Gemfile, 6; GitHub, 25, 38; GNU Automake, 5; GNU Libtool, 5; Gradle, 6
H: hostgroup, 14, 31; Hudson, 25
I: IDE, 6
J: Java, 6, 26; Jenkins, 9, 25, 32, 38; JSON, 33
K: Katello, 15, 29, 38
L: lifecycle environment, 16, 31, 37; Lucene, 24
M: make, 5, 6; Makefile, 5; Maven, 6, 26
P: plugin, 14, 29; provisioning, 13; Pulp, 20; Puppet, 12, 14, 17; PXE, 13
R: RABL, 33; Rails, 29; Rake, 5; Ruby, 6; RVM, 6
S: Smart Proxy, 13; subscription management, 22
T: Travis, 9
V: versioning, 6, 7, 11
W: workflow, 35
X: XML, 6

Appendix A

User Guide

The appendix contains a user guide that provides step-by-step help with navigating the web GUI and shows how to configure and run a sample job.

A.1 Jobs

A.1.1 Create a Job

Go to the main menu, Pipeline > Jobs. Click the ’New Job’ button on the right as pictured in figure A.1. Fill in the form and submit it.

Figure A.1: Create a job

A.1.2 Configure a Job

Go to the main menu, Pipeline > Jobs. Click on the job name in the table as pictured in figure A.2. This will bring you to the job’s details pane, which shows an overview of the job’s configuration (figure A.3). The left column shows the job’s name and how it is triggered. A job can be triggered manually, on a successful repository sync and/or on a content view promotion. A job requires a hostgroup, a compute resource, a content view, a Jenkins instance, a Jenkins user and a lifecycle environment to run.


Figure A.2: Configure a job

Figure A.3: Job’s configuration overview

To configure a content view for a job, click the ’Content Views’ tab as shown in figure A.4. Here you can select a non-composite content view containing repositories with content that may be placed on the host once provisioning finishes.

To configure a hostgroup for a job, click the ’Hostgroups’ tab as shown in figure A.5. The hostgroup specifies the parameters of the host that will be provisioned. All hostgroup properties that can be set should be properly configured (except for the lifecycle environment). The operating systems used during development were Fedora 19 with PXE boot and Katello Kickstart default, and CentOS 6 with PXE boot and Katello Kickstart for RHEL; you may need to modify the templates and enable repositories containing puppet, subscription-manager and katello-agent for CentOS. A new foreman pipeline jenkins pubkey snippet is available in the provisioning templates; you have to modify your provisioning template manually and make it use this snippet. For details on hostgroups and provisioning, see the Foreman documentation [54].


Figure A.4: Configure a job with a content view

Figure A.5: Configure a job with a hostgroup

To configure a compute resource for a job, click the ’Compute Resources’ tab as shown in figure A.6. Only resources configured within the compute profile of the already selected hostgroup are displayed in the ’Compute Resources’ tab. If the job has no hostgroup set, no compute resources will be shown in the tab. When the hostgroup is changed, the compute resource for the job is removed and must be set again.


Figure A.6: Configure a job with a compute resource

Figure A.7: Configure a job with a Jenkins instance

Jenkins instances are configured under the ’Jenkins Instances’ tab, as shown in figure A.7. Choose the desired Jenkins instance from the list and click the ’Set Jenkins Instance’ button. The lifecycle environment and the ’to environments’ (the environments the job’s content view will be promoted to) are selected under the ’Environment’ tab, as shown in figure A.8. Select the desired environment and click the ’Set environment’ button.


Figure A.8: Configure a job with an environment

Figure A.9: Configure a job with to environments


Click the ’To Environments’ button on the right to proceed to the to environments selection. Only successors of the environment which is already set for the job are admissible as to environments, as figure A.9 shows. If no to environments are selected, the content view will not be promoted to the next environment after the job has successfully finished. The ’Jenkins Projects’ tab contains the existing projects in Jenkins that are assigned to this job. All assigned projects are built when the job runs. They are described in detail in section A.2.

A.1.3 Run a Job

There are three ways to run a job:

1. The job is triggered when a repository included in the job through its content view is successfully synced.

2. The job is triggered when the content view included in the job gets successfully promoted/published.

3. The job is triggered manually.

A.1.4 Promote a Content View

When the job finishes and all Jenkins projects assigned to it are built with success on the Jenkins server, the content view assigned to the job may be promoted to the next environment. Whether or not the content view gets promoted depends on the number of to environments the job is configured with.

A.2 Jenkins Projects

A.2.1 Create Jenkins Project

Jenkins projects are created when they are retrieved from the Jenkins server and assigned to the job. First, make sure the job has a Jenkins instance set. Go to the job details and click on the ’Jenkins Projects’ tab. Click the ’Find more...’ button (figure A.10). Search for the Jenkins projects by name. The results will be displayed in the table. Select the desired projects and click ’Add Projects’ to add them (figure A.11). Continue searching or return back to the list of added projects.


Figure A.10: Retrieve existing projects from Jenkins server

Figure A.11: Add found projects to a job

A.2.2 Configure Project Parameters

Parametrized builds are supported. When displaying the table of Jenkins projects assigned to the job, click on the project name. A list of parameters will be displayed if the project has any. The default values may be changed. The next time the job is run, the Jenkins project will be built with the new parameter values. Only the String, Boolean and Text parameter types are supported at present. ERB templating may be used to inject information from the newly provisioned host into the parameters (figure A.13). There are currently three Ruby hashes available; their structure is shown in figure A.12.


Figure A.12: Values available for injection into project parameters

Figure A.13: Configure project build parameters

A.3 Jenkins Instance

A Jenkins instance is a representation of a real Jenkins server at our disposal.

A.3.1 Create Jenkins Instance

Foreman needs SSH access to the machine running Jenkins. Foreman uses RSA certificates for authentication to the Jenkins server; these have to be set up manually. You will need to know the Jenkins server’s hostname and IP address.

1. To resolve the Jenkins server IP address, use dig (Listing A.1).

Listing A.1: Step 1

dig +short <jenkins-hostname>

If dig is not available for some reason, you may use nslookup as an alternative (Listing A.2).

Listing A.2: Step 1-alternative

nslookup <jenkins-hostname>

2. Then add Jenkins to the list of known hosts (Listing A.3),

Listing A.3: Step 2

ssh-keyscan <jenkins-hostname>,<jenkins-ip> >> ~/.ssh/known_hosts

3. and generate a keypair (Listing A.4).

Listing A.4: Step 3

ssh-keygen -f ~/.ssh/<key-name> -t rsa -N '' -q

4. Distribute the public key to the Jenkins server (Listing A.5).

5. When the prompt for root’s password on the Jenkins server is displayed, enter the password (Listing A.6).


Listing A.5: Step 4

cat ~/.ssh/<key-name>.pub | ssh root@<jenkins-hostname> \
  "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

Listing A.6: Step 5

root@<jenkins-hostname>'s password:

6. Verify that the certificate has been copied and authentication works (Listing A.7). You should see something similar to what is in listing A.8.

Listing A.7: Step 6

ssh -i ~/.ssh/<key-name> root@<jenkins-hostname>

Now you can create a Jenkins instance. Go to the main menu, Pipeline > Jenkins Instances. Click on the ’New Jenkins Instance’ button on the right as shown in figure A.14. Fill in all the required fields. The Jenkins home folder defaults to /var/lib/jenkins for Red Hat family operating systems. For the Certificate Path, add the path where the certificate generated in the steps above is reachable (i.e. ~/.ssh/<key-name>). Please note that certificates are usually stored in the ~/.ssh/ folder (as in the steps above). Make sure the user running Foreman has appropriate access rights to this folder. The Jenkins instance will not be created if the certificates are not set up properly.
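If passwordless SSH authentication refuses to work even though the keys are correct, the SELinux context of the key files is worth checking (see also section 4.6). A possible check and fix, assuming the keys live in ~/.ssh/:

ls -Z ~/.ssh/            # inspect the current SELinux context of the key files
restorecon -R -v ~/.ssh  # restore the default context for the folder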

A.3.2 Configure Jenkins Instance

Go to the main menu, Pipeline > Jenkins Instances. Click on the Jenkins instance name in the table. To test Jenkins reachability, click on the ’Jenkins reachable?’ button as shown in figure A.15. If Jenkins can be reached, a success message with the Jenkins server version will be displayed.

A.4 Jenkins Users

Jenkins users are necessary for authorized access to the Jenkins CI server.


Listing A.8: Step 6 output

root@<jenkins-hostname>

Last login: <date> from <foreman-hostname>

Figure A.14: Create Jenkins instance

A.4.1 Create Jenkins User

1. Go to the main menu, Pipeline > Jenkins Instances.

2. Select the Jenkins instance you wish to configure with a user.

3. Select the ’Jenkins Users’ tab.

4. Now click the ’New Jenkins User’ button (figure A.16).

5. Fill in the form with the required information and submit it. You can get the user’s API token from the user account details in Jenkins.

Figure A.15: Test Jenkins reachability


Figure A.16: Create new Jenkins user

A.4.2 Configure Jenkins User

Select the desired user in the table and click the ’Set Jenkins User’ button (figure A.17).

Figure A.17: Configure Jenkins user

A.4.3 Change user’s details

Click on the user’s name in the table and edit the details.
