MASARYKOVA UNIVERZITA
FAKULTA INFORMATIKY

Platform for deploying web applications

MASTER THESIS

Bc. Marek Jelen

Brno, fall 2011

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Bc. Marek Jelen

Advisor: doc. RNDr. Toma´sˇ Pitner, Ph.D.

Acknowledgement

I would like to thank my supervisor and all those who helped me to make this happen.

Abstract

This thesis focuses on the deployment of web applications. Web applications are becoming more important with advances in computing and with the approach of the HTML5 standard, and with the rise of web applications, the importance of deployment technologies rises as well. This thesis describes the deployment process of web applications and the phenomena surrounding it. As part of the thesis, a project for deploying web applications is developed, providing an open-source platform for web application deployment.

Keywords

cloud computing, PaaS, IaaS, web applications, deployment, service, Ruby, Java, JavaScript

Contents

1 Introduction ...... 4
2 Cloud computing ...... 6
  2.1 History ...... 6
  2.2 Finding the roots ...... 6
  2.3 Infrastructure as a Service ...... 7
    2.3.1 Amazon Elastic Compute Cloud ...... 8
    2.3.2 Rackspace cloud ...... 9
    2.3.3 Joyent ...... 9
  2.4 Platform as a Service ...... 9
    2.4.1 Heroku ...... 10
    2.4.2 Google App Engine ...... 11
    2.4.3 Microsoft Azure ...... 11
  2.5 Software as a Service ...... 12
    2.5.1 Office Web Apps ...... 12
    2.5.2 Google Apps ...... 12
3 The cloudy business ...... 14
  3.1 Web application development process and deployment ...... 14
    3.1.1 Bare metal ...... 16
    3.1.2 Shared hosting ...... 17
    3.1.3 Virtual servers & dedicated hosting ...... 17
    3.1.4 Platform ...... 18
  3.2 Platform adoption ...... 18
    3.2.1 Management ...... 19
    3.2.2 Developers ...... 19
4 Making the clouds wild ...... 21
  4.1 Wildcloud ...... 21
  4.2 Required features ...... 21
    4.2.1 Security ...... 21
    4.2.2 Resource allocation ...... 22
    4.2.3 Building the application and central repository ...... 22
    4.2.4 Thin provisioning ...... 22
    4.2.5 Extensibility ...... 23
    4.2.6 Deployment ...... 23
    4.2.7 File system ...... 23
  4.3 Comparison with existing solutions ...... 23
    4.3.1 CloudFoundry ...... 24
      4.3.1.1 Tight coupling ...... 24
      4.3.1.2 Deployment ...... 24
      4.3.1.3 Message queuing ...... 24
      4.3.1.4 Isolation ...... 25
    4.3.2 ...... 25
      4.3.2.1 Node.js ...... 25
      4.3.2.2 Isolation ...... 25
    4.3.3 OpenShift ...... 25
  4.4 Applied technologies ...... 26
    4.4.1 Operating system ...... 26
    4.4.2 Application security and isolation ...... 26
    4.4.3 Loose coupling ...... 27
    4.4.4 Thin provisioning ...... 28
    4.4.5 Data storage ...... 28
    4.4.6 Cloud communication ...... 29
    4.4.7 External services ...... 29
    4.4.8 HTTP routing ...... 29
    4.4.9 Session store ...... 30
  4.5 Architecture ...... 30
    4.5.1 Basic communication ...... 30
    4.5.2 Core components ...... 31
    4.5.3 Components seen by applications ...... 35
    4.5.4 Routing of HTTP requests ...... 37
  4.6 Testing ...... 38
    4.6.1 Functional testing ...... 40
    4.6.2 Load testing ...... 40
5 Product evaluation ...... 42
  5.1 Case-study ...... 42
    5.1.1 Company ...... 42
    5.1.2 Problems ...... 42
    5.1.3 Requirements ...... 43
    5.1.4 Solution ...... 43
    5.1.5 Benefits ...... 44
  5.2 From customer's point of view ...... 44
    5.2.1 Welcome screens ...... 45
    5.2.2 Navigation ...... 46
    5.2.3 Dashboard ...... 47
    5.2.4 SSH keys ...... 47
    5.2.5 Repositories ...... 48
    5.2.6 Applications ...... 49
    5.2.7 Router ...... 50
6 Conclusion ...... 52

1 Introduction

When at the end of the year 1990 Tim Berners-Lee was proposing his new project "WorldWideWeb"1, I believe he had no idea what the World Wide Web would become. In the two decades since, mankind has created something without precedent in known history: an incredible library of knowledge available to everyone connected to the network. With the advances in text recognition and search algorithms, we can scan through billions of documents, books and articles within seconds.

And we are going even further. We are building social networks and services to allow people to communicate what they want to say to those who want to listen, and even to those who do not. We are tearing down the walls among people, countries and continents, and the freedom of speech has advanced so much that we are even afraid of it. Services like Twitter2 or Facebook3 scale to hundreds of millions of active users and unimaginable amounts of data. People are losing their fear of living in the virtual environment and are embracing the impossible - being everywhere, anytime.

All these advances in the domain of social or community networking are possible thanks to a new phenomenon called "cloud computing". Cloud computing is the new buzzword that we can hear from all directions all over the world. At the beginning, cloud computing was the domain of agile start-up companies that needed to scale their products to a vast number of clients but did not have enough money to invest into their own expensive hardware. As time goes by, cloud computing is being adopted even in large enterprises, and the term "cloud" is becoming the new formula everyone wants to use to magically succeed with their product.

Cloud computing brought new ideas into the area of server infrastructures. The most important one is the shift of paradigms from "we have to build a computer that will never fail" to the more reasonable and practical view "we have to build an infrastructure that will survive the crash of a computer". Cloud computing is about building an infrastructure that maintains high availability and performance even on commodity hardware.

The main subject of this thesis is to analyse the requirements of deploying modern web applications and to implement a platform that simplifies this process. To fulfil this goal, different areas of and views on cloud computing will be explored, and the most important cloud computing services regarding the topic of this thesis described. In the second chapter, the term "cloud computing" will be discussed. It is always important to lay the ground by defining the terms; it is even more important in the domain of cloud computing because of its rapid pace of expansion and development. In the third chapter, with the terms defined, the processes from the points of view of clients and providers will be discussed to analyse the needs of customers. In the fourth chapter, existing solutions with their advantages and disadvantages are discussed, along with the reasoning for starting a new project. The following chapters describe the new platform for deploying web applications in the "cloudy way". The advantages and disadvantages of this solution, as well as technical aspects and requirements to set up the platform, are presented. The last chapter provides performance figures as well as the results of functional and load testing.

1. http://en.wikipedia.org/wiki/World_Wide_Web
2. https://twitter.com/
3. http://www.facebook.com/

2 Cloud computing

2.1 History

Cloud computing came after the "dot-com bubble"1. Amazon, as one of the survivors of the bubble, measured that only 10% of their computing capacity was used. The rest of their computing power was there to cover spikes in visitor counts, to accommodate the computing power necessary to serve the HTTP requests. With the advances of virtualization, Amazon created a cloud infrastructure that allowed them to significantly reduce the costs of their infrastructure and the labour required to operate it. Having learned this lesson, Amazon decided to create a public service that allows clients to rent computational power for their own purposes. This product was presented for the first time in 2006.

More companies followed and new projects were started, both proprietary and open source. Among the pioneers in this area were the projects Eucalyptus2 and OpenNebula3. Later, a project called OpenStack4 was founded by Rackspace5.

2.2 Finding the roots

There is no simple and universal answer to the question of what cloud computing really is. It is important to note that different people and experts define cloud computing differently; in different domains the term may mean something completely different. For example, asking 21 experts to define the term cloud computing yields 21 different definitions [19]6. From the perspective of this thesis, the definition provided by Wikipedia7 [73] seems to be one of the most precise, but let me provide my own definition based on those mentioned.

“providing computation as a service over a network”

1. http://en.wikipedia.org/wiki/Dot-com_bubble
2. http://www.eucalyptus.com/
3. http://www.opennebula.org/
4. http://openstack.org/
5. http://www.rackspace.com/
6. http://cloudcomputing.sys-con.com/node/612375
7. http://en.wikipedia.org/wiki/Cloud_computing


Computation, in this definition, is any computer-related service that can be provided remotely. It can basically be divided into three layers that build on each other from the bottom up.

Layer                          Abbreviation
Software as a Service          SaaS
Platform as a Service          PaaS
Infrastructure as a Service    IaaS

To illustrate what each of these layers represents, let's apply them to a real service. Dropbox8 is a solution that allows users to back up data to a remote location and to replicate these data among many computers. There is client software that the user installs on a computer, and it communicates with software running in the cloud.

Software is the service itself, providing utility to the end users of the service. In this case it is the web application that provides client-less access to the data, together with an API for the client software to communicate with.

The data themselves are stored in some sort of object store or database that is responsible for storing the data and the meta-data associated with them. This is represented by the platform layer; it provides utility to the service builder.

The data have to be written to a block device or some other kind of memory. The infrastructure layer represents this service, providing virtual devices on top of real hardware. This layer provides utility to the system administrator that provides a platform. Depending on the situation, the utility can be provided to a single entity, or many entities can provide the utility to each other. At the bottom are the servers themselves, the real hardware a service provider maintains.

Having a basic knowledge of what these layers represent, let's take a look at each of them in turn, from the most basic one to the most complex, and discuss more details regarding their functionality.

2.3 Infrastructure as a Service

The main task of an Infrastructure as a Service is to create a scalable environment on top of real hardware that can survive the crash of a single physical node. In most cases the infrastructure is built using a virtualization solution. However, in recent months providers9 have appeared that also offer bare-metal (unvirtualized) infrastructures.

One important aspect of Infrastructure as a Service is that the client pays only for the time the system is running. Consider, for example, an application whose visitor count quadruples for only two hours a day. With IaaS, the customer can rent extra nodes to accommodate the spike and pay for those two hours only, instead of buying the hardware and paying for the housing even when those servers are not used.

8. https://www.dropbox.com/
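The pay-per-use economics of such a spike can be sketched with a short calculation. This is an illustration only: the hourly on-demand rate and the monthly cost of a comparable dedicated server are hypothetical figures, not prices of any real provider.

```ruby
# Hypothetical figures for illustration only: an on-demand node at
# $0.10 per hour versus a comparable dedicated server amortized at
# $100 per month.
ON_DEMAND_PER_HOUR  = 0.10
DEDICATED_PER_MONTH = 100.0

# Monthly cost of covering a short daily spike with extra on-demand
# nodes that run only while they are needed.
def spike_cost_per_month(nodes:, hours_per_day:, days: 30)
  nodes * hours_per_day * days * ON_DEMAND_PER_HOUR
end

# Three extra nodes for two hours a day versus keeping the same three
# machines as dedicated servers around the clock.
puts spike_cost_per_month(nodes: 3, hours_per_day: 2)  # => 18.0
puts 3 * DEDICATED_PER_MONTH                           # => 300.0
```

Under these assumed prices, covering the two-hour daily spike on demand costs a fraction of keeping the same extra capacity running around the clock.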

2.3.1 Amazon Elastic Compute Cloud

Amazon Elastic Compute Cloud (Amazon EC2) was the first service that pioneered the cloud ecosystem. Amazon realized that they owned a complex infrastructure to handle occasional spikes in traffic, but most of the time the computing power was unused. The company decided to use the advances in virtualization to consolidate their servers, and from this endeavour the service, launched in 2006, was started.

The platform utilizes the Xen10 hypervisor and allows the client to use a wide spectrum of operating systems. The user has administrator access to the system and is free to modify most aspects of it. Every instance is assigned a special domain name, which may resolve to different IP addresses; having a static IP address is a paid feature.

The platform offers many services. Most notable is the Elastic Block Store. EBS is replicated block storage that can be attached to virtual machines to provide secure and persistent storage with POSIX characteristics. The virtual device can be formatted with a file system of choice and mounted as a regular block device.

Many services use Amazon EC2 as their infrastructure of choice because of acceptable pricing and high fault tolerance. The software as a service mentioned in the previous text - Dropbox - is built on the Amazon EC2 infrastructure. Platforms as a service also utilize Amazon EC2 in their architectures, for example Heroku and EngineYard, mentioned in the next sections.

9. http://www.newservers.com/language/en/
10. http://xen.org


2.3.2 Rackspace cloud

The Rackspace company is one of the largest11 [45] server hosting providers in the world by the number of running servers. In 2006 the company created a new brand, Mosso, to start providing virtual servers utilizing their own data centres. Later the brand Mosso was renamed to Rackspace Cloud. Rackspace Cloud uses Xen as its virtualization platform and provides services similar to Amazon EC2, but with different product names. Another product, Cloud Sites, which offers PaaS, is built on top of Rackspace Cloud. Some of the internal systems of Rackspace Cloud were open-sourced in a project called OpenStack.

2.3.3 Joyent

Joyent started as a SaaS provider and evolved into an IaaS provider. Their infrastructure was also built using Xen. However, when Sun Microsystems12 open-sourced the Solaris system, the company started building on top of that. Now the company uses Solaris13 from the illumos project14 as the host operating system and KVM15 as the hypervisor, in their own package called SmartOS16.

Joyent is a big advocate of open-source software and sponsors the development of many projects. SmartOS, the operating system they use, is provided as an open product. They also employ the creator of the Node.js project.

2.4 Platform as a Service

Having a fail-proof infrastructure is not enough to create a scalable service. When a service outgrows a single machine (bare-metal or virtual), a platform is needed to orchestrate the execution of the service across multiple nodes, so that the service scales as easily as adding new nodes to the cluster. There are two points of view on such solutions, depending on who provides the platform.

11. http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/
12. http://en.wikipedia.org/wiki/Sun_Microsystems
13. http://www.oracle.com/us/products/servers-storage/solaris/solaris11/overview/index.html
14. https://www.illumos.org/
15. http://www.linux-kvm.org/page/Main_Page
16. http://smartos.org/


For the most demanding services it is vital to be built directly on the infrastructure level, with the platform and cluster orchestration integrated directly into the service, but that makes the development costs of the service much higher. For many services it is simpler and more cost-effective to be built on top of a platform that handles the orchestration itself, so the service can focus only on its primary business.

2.4.1 Heroku

The pioneer of such platforms is Heroku17 (founded in 2007), which started as a platform for deploying Ruby applications in a scalable environment. It was later bought18 by Salesforce19 for $212 million. Following the transaction, Heroku announced support for many different stacks - Node.js, Python, Java, Scala and others.

Heroku uses Git20 based deployment. Every service running on the platform has its own git repository associated with it, and every time there is a git-push21 to the repository, a special hook is executed - it takes the latest revision from the repository and exports it to a virtual environment. In the environment a build is run, and the resulting application is tested to verify that it starts properly. If the test is successful, the result of this build is saved for later use. The resulting package is called a 'slug' and represents everything the application needs to run.

The slug may contain anything, but it is restricted to 100 MB in size. Depending on the plan the customer is paying for, the slug is distributed to the actual nodes of the Heroku cluster and started. This way the application is parallelized across multiple nodes. All the nodes have access to the same data store (Heroku provides a PostgreSQL22 database), and there needs to be a shared session store, because requests from the same client may arrive at different nodes.

To ensure the stability and security of applications in the cluster, a virtual environment separates them from each other. In the first version of the platform, chroot23 was used. Chroot is very simple to implement and does not need any external software except a Unix-like operating system. Chroot, however, does not support resource limitation: a badly written application can allocate too much memory or too many CPU cycles, and other applications will starve.

To add resource limitation to the platform, Heroku adopted a technology called cgroups24. Cgroups allow separating processes into isolated groups (there can be one or more processes in a group) and limiting what resources each group is allowed to use. The platform can then limit different aspects of resource allocation - memory, CPU time used by the process, CPU scheduling, and I/O operations.

17. http://www.heroku.com/
18. http://techcrunch.com/2010/12/08/breaking-salesforce-buys-heroku-for-212-million-in-cash/
19. http://www.salesforce.com
20. http://git-scm.com/
21. http://gitref.org/remotes/#push
22. http://www.postgresql.org/
23. http://linux.die.net/man/1/chroot
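The cgroup mechanism described above boils down to writing limit values into per-group control files. The following Ruby sketch builds the file/value pairs for a hypothetical group; the group name, limit values and cgroup mount point are assumptions, and actually applying the limits would require root privileges on a system with a cgroup file system mounted.

```ruby
# Typical cgroup-v1 mount point; may differ between distributions.
CGROUP_ROOT = "/sys/fs/cgroup"

# Build the (control file => value) pairs that would cap a group's
# memory usage and CPU share, without touching the system.
def cgroup_limits(group, memory_bytes:, cpu_shares:)
  {
    "#{CGROUP_ROOT}/memory/#{group}/memory.limit_in_bytes" => memory_bytes.to_s,
    "#{CGROUP_ROOT}/cpu/#{group}/cpu.shares"               => cpu_shares.to_s,
  }
end

# Hypothetical application group: 512 MB of memory, a reduced CPU share.
limits = cgroup_limits("app-42", memory_bytes: 512 * 1024 * 1024,
                                 cpu_shares: 256)
limits.each do |path, value|
  # File.write(path, value) would apply the limit when run as root.
  puts "#{path} = #{value}"
end
```

Any process whose PID is then written to the group's `tasks` file is confined by these limits, which is the isolation property the platform relies on.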

2.4.2 Google App Engine

Google App Engine25 is a PaaS service by Google26. Google App Engine (GAE) is based on Google's proprietary technologies and provides a scalable environment for deploying web applications in the Python and Java programming languages. The platform provides many services, including the possibility of running tasks in the background, which frees web applications from managing long-running tasks. An integral part of the platform is also an XMPP protocol infrastructure.

2.4.3 Microsoft Azure

Microsoft also entered the market of cloud computing with a product called Microsoft Azure27. Depending on the point of view, Microsoft's platform has an advantage, or a disadvantage, in its reliance on their own operating system. Most technologies regarding the development of web applications have their roots in Unix-like operating systems, and in the process of porting the software to the Windows operating system, developers are often forced to disable features already present in the software. The platform provides the whole basic set of technologies - an SQL database, a key-value store, a blob store, a load-balancing infrastructure, and a running environment.

24. http://www.kernel.org/doc/Documentation/cgroups/
25. http://code.google.com/intl/cs/appengine/
26. http://www.google.com/
27. http://www.microsoft.com/windowsazure/

2.5 Software as a Service

Both levels of cloud computing described so far are related to the development process of the service and do not directly touch the end user. Software as a Service builds on the previous levels and provides value to the customer. The provider already has an infrastructure to deploy the service to and a platform to make the applications robust, fail-proof and scalable. In the last step, the provider delivers some value to the customer.

The main paradigm of SaaS is that the customer gets a thin client that is basically only responsible for displaying results, data and information, and performs only very basic operations on the data. The main processing system is part of the service: the client makes requests for results, and these results are delivered by the service. There is a vast number of services that might be considered SaaS.

2.5.1 Office Web Apps

Office Web Apps is Microsoft's response to the Google Apps ecosystem and its increasing popularity. This service is neither a big player nor a pioneer in this area; however, it is still mentioned first in this thesis. The reasoning is that this service builds on Outlook Web Access, a service originally part of the Microsoft Exchange solution. Even though it is not a well-known fact, XMLHttpRequest28, the main technology that enabled the "Web 2.0" era, was created by the developers of Outlook Web Access.

Office Web Apps is a web application that allows creating and editing documents using a supported web browser. It is integrated with the standalone Office suite, which allows uploading and downloading documents to and from this service. Office Web Apps has equivalents of Word, Excel, PowerPoint and OneNote. Outlook is provided in the form of Outlook Web Access as part of Microsoft Exchange, or as the HotMail service to the general audience.

2.5.2 Google Apps

Google Apps29 is the pioneer of Software as a Service. Starting with Gmail in the year 2006, Google has broadened the set of tools provided as part of this service. Nowadays the services include e-mail30, documents31, text, audio and video chat32, a social network33, image sharing34 and many others.

Google Apps uses the same infrastructure as other Google services and even the customers of Google App Engine. Google Apps is a clear example of Software as a Service, because the user does not have to install anything into the operating system and needs only a web browser, which is part of most desktop operating systems, to start using the service.

28. http://msdn.microsoft.com/en-us/library/ms759148(VS.85).aspx
29. http://www.google.com/apps/intl/en/business/index.html

30. http://www.gmail.com
31. http://docs.google.com/
32. http://www.google.com/talk/
33. http://plus.google.com
34. http://picasaweb.google.com

3 The cloudy business

Knowing what cloud computing means is not enough; it is important to know how to implement cloud products in the business itself. Regarding the topic of this thesis, this chapter discusses what benefits and pitfalls a business gains from using cloud-related technologies. The text focuses on web application deployment. Before getting to the technologies themselves, it is necessary to describe the process of developing web applications and what role deployment plays in that process.

3.1 Web application development process and deployment

The process of developing web applications is a complex area of expertise. First of all, it is necessary to know what application is being developed. This is simpler with an in-house product, but more difficult with the development of public services or custom-built systems for customers. With public service development, a market and demand analysis has to be done; with development for customers, the actual description and specification of the product has to be communicated. Then the user interface design and user experience design are created. Such a process involves many different professions and experts.

From the management point of view, it is important to create schedules and budgets, to hire experts in all involved domains, and to choose a project management methodology - whether to use the time-proven waterfall model or a more modern agile approach. Depending on all of these factors, the actual development process will differ. However, across all those processes there is one that covers the development of all web applications, as can be seen in Figure 3.1. It consists of four phases in a cycle.

The cycle is an important aspect of web application development. Web applications are not developed as standalone products that are created, delivered, maintained and forgotten. Web applications are delivered as services, and as services they tend to improve over time in iterations, rather than in separate development processes.

Figure 3.1: Common web development process across methodologies (a cycle of four phases: Development, Testing, Deployment, Maintenance)

The first step is Development of the product. This step includes the planning and design phases found in some of the methodologies and the implementation phase that is part of all methodologies. During this step the team creates some deliverable product. In the case of the waterfall model, this would be the whole product; in contrast, in agile methodologies the product would be a small part of the system, prepared to be presented to the customer for feedback.

The Development phase is done in-house at the development company and does not directly concern the deployment platform. In many cases it is not known beforehand in what environment the application will run. However, knowing the environment beforehand allows the developers to simplify the software and tailor the product for the environment.

The Testing phase comes after the Development phase. It can be found in all methodologies, under different names and with different tools and processes. Its main and only purpose is to ensure that the product contains as few problems and issues as possible. The Testing phase is tightly connected with the Development and Deployment phases.

From some points of view, the Development and Testing phases can be seen as a cycle in themselves. In the classical view, the product is developed and then pushed to testing. Testing itself may involve some interventions to fix found issues, but these interventions should be minimal and targeted only at fixing the issue in an already developed product.

To test the product thoroughly, it is required to know in what environment the application will run, or to test across many different environments. Staging environments are used to simulate the environment the product will be running in; these environments should be as close to the real environments as possible. With an engaged customer and an accessible staging environment, it is possible to draw the customer into the development process and discover issues arising from misunderstandings in the specification of the product.

In the Deployment phase, the product is delivered to the customer. Depending on the product, this may be one step or many steps; however, the target of the phase is the same - the product is available to the customer.

The last phase of the process is Maintenance. This phase may be carried out by the developer itself, by the client, by a third-party vendor, or by a combination of those mentioned. Using the product in production means that it is crucial to ensure its availability and that found issues are resolved.

On the other hand, as the business grows, the customer may have requirements that were not known at the time of creating the product. Such requirements should also be resolved for the customer. That way the circle is closed: adding new features or making more complex changes in the product requires the full circle from Development, over Testing to Deployment, and back to Maintenance. In agile methodologies, the circle will be repeated many times as the iterations go; in the waterfall methodology, the circle will be repeated once for each version of the product. As said previously, the actual implementation of the steps depends on the methodology chosen for managing the project development.
Concerning this thesis, the Deployment phase is the most important one. To understand what benefits the product of this thesis provides, it is important to discuss what choices the deployer has when deploying a product.

3.1.1 Bare metal

The bare-metal method is the oldest and most complex one. The deployer buys real hardware that will be connected to the network and managed by a server housing company. This method has the advantage of knowing exactly what hardware, what operating system and what software the environment has, and of the ability to modify and tune all aspects of the environment. It is possible to fully ensure Quality of Service, because the deployer has full control over all aspects of the whole environment the product is running in.

3.1.2 Shared hosting

Shared hosting is a very popular method among PHP developers. Shared hosting companies provide FTP or SCP access to the system the application is running on; the deployer only uploads the data to the system and the application is running. This way of deploying is possible with technologies that do not utilize virtual machines to run the product. With technologies running inside virtual machines it is more difficult, because of the requirement to restart the virtual machine to reload the code of the product.

In a shared environment it is difficult to separate running applications, and moreover the deployer has no way to affect this. When someone else deploys a badly written application, the other applications may starve for resources. This makes it very difficult to ensure any kind of Quality of Service.

3.1.3 Virtual servers & dedicated hosting

When setting up a bare-metal deployment requires too high an investment and the customer cannot afford it, yet some Quality of Service is still required, virtual servers or dedicated hosting may be the answer. In the case of dedicated hosting, the whole environment for deployment is prepared by the provider, and the deployer just has to upload the product into the system. With virtual servers, usually only a basic system is provided by the service, and the deployer has to set up the environment itself.

Virtual servers have the advantage that they are billed by the amount of used resources. If the product is not required to run 24 hours a day, 7 days a week, the system may be shut down for the time it is not used and started only when needed. The next benefit of virtual servers is that deployment from templates is provided. When set up correctly with load balancing and other technologies, the template of the product can be deployed to more machines to accommodate occasional spikes in traffic.

These two methods are very similar in that the product does not share the system with other products. Ensuring Quality of Service is possible, but the actual level depends heavily on the Quality of Service of the virtual server or dedicated hosting provider.

3.1.4 Platform

Platforms lie between shared hosting and virtual servers. The platform provider sets an entrance method to let the deployer upload the product into the platform. Once the product is uploaded, the platform allows the deployer to deploy instances of the product and scale the application accordingly. Platforms also provide resource management functionality and are mostly billed by consumed resources.

In comparison to virtual servers and bare metal, platforms do not provide much flexibility in environment tuning. The environment is set up by the platform provider, and the application has to be bent to it. However, that may be an advantage when the developers know beforehand that the application will be running on such a platform, as that allows them to leverage platform-specific features.

Forcing the deployers to bend the product to the platform allows the platform providers to provide functionality that would otherwise be impossible. As an example, in platform deployment the deployer usually cannot affect where and how the application instance will be deployed. However, this fact allows the platform provider to relocate the applications according to the resource demands of other applications, to ensure some Quality of Service. The deployer, on the other hand, gets load balancing among the product instances for free from these compromises.

3.2 Platform adoption

Adopting a platform as the main target for newly developed projects brings benefits to the company. On the other hand, such an adoption is a larger process that may touch many departments in the company. The changes that may be part of the process are described in this chapter from the points of view of the departments in the company.


3.2.1 Management

The first and most important decision is whether to adopt the platform at all. That decision depends on many factors that are specific to each company and product. Platform adoption may bring benefits as well as costs; however, such a decision should always be made from a long-term perspective. When the company decides to adopt a platform, it is important to consider who will be the provider. There are providers that offer platforms as a service to their customers. Using a platform as a service means a fixed price per running instance of the product. The price scales linearly with how many instances of the product have to run to accommodate all incoming requests from clients. The company may also decide to run the platform itself. In such a case, the price of running an instance of a product is not fixed. Most of the costs come from buying the actual hardware to run the platform. From the perspective of a single product instance, the price varies with how many instances of how many products are running on the platform at a given time; with more products, the price per instance lowers. To this variable price, fixed costs need to be added. There will be fixed costs for the people who administer the platform and assure its operability, although these tasks may be carried out by an already existing department in the company. Moreover, there is the cost of connecting the servers to the Internet, be it via an in-house managed server room or a server-housing provider.
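The trade-off between a rented platform and a self-operated one can be sketched as a simple break-even computation. All prices below are hypothetical and serve purely as an illustration of the reasoning, not as real market figures:

```ruby
# Illustrative only: the break-even point between renting a platform
# (fixed price per instance) and running one in-house (fixed costs plus
# a small per-instance cost). All numbers are hypothetical.
def paas_cost(instances, price_per_instance: 50.0)
  instances * price_per_instance
end

def self_hosted_cost(instances, fixed_costs: 2000.0, per_instance: 10.0)
  fixed_costs + instances * per_instance
end

# Find the smallest number of instances at which self-hosting is cheaper.
break_even = (1..1000).find { |n| self_hosted_cost(n) < paas_cost(n) }
```

With these assumed numbers, self-hosting only pays off above 50 instances; for small deployments the fixed administration and connectivity costs dominate, which matches the long-term perspective argued above.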

3.2.2 Developers

From the perspective of developers, platform adoption means learning new ways of developing the product. Developing for a platform means the developers have to have some basic knowledge of the platform's inner workings to be able to use the platform's features to the fullest. Most platforms use version control systems as an entrance point to the system. In many smaller companies, version control systems are not yet used to manage source code. In such companies there will be the cost of teaching the developers to use the version control system as an integral part of the development cycle. One of the biggest changes for the developers is no longer being responsible for deployment. Their tasks are reduced to delivering the product into the version control system (or into the platform generally). Deployment for quality assurance may be carried out by the testers, and deployment to production by management through a user interface; the real deployment is done by the platform, which only needs to be told that it should deploy.

4 Making the clouds wild

4.1 Wildcloud

The experience gained over years of server management, web application development and hosting is applied in a project called Wildcloud. The next chapters describe what Wildcloud should look like to help developers deliver exceptional web applications. Features are described first to form an idea of what such a platform should look like. Next, differences from existing platforms are discussed to back up the reasoning for starting a new project. Finally, the technologies used to implement the platform are discussed together with the overall architecture.

4.2 Required features

The idea of the platform emerges from pieces that were created for hosting personal and commercial projects. Drawing on this experience with providing hosting services, the platform is formed to ease the burden of deploying applications into production environments and of ensuring the consistency of these environments. An important aspect of developing web applications and deploying them into production environments is quality assurance. One of the key features of the platform should therefore be the ability to deploy application versions into staging environments. These staging environments have to be as close as possible to the production deployment environment.

4.2.1 Security

Every application running in the cluster is isolated so that no two applications can interfere with each other. Every application has its own set of data and its own root file system. The application runs as a regular user, and even by gaining higher privileges and capabilities to modify the system configuration, it cannot affect the other applications. Different applications need different sets of system packages. The platform should allow specifying the set of required system packages for each deployment. This set of system packages must interfere neither with the other applications deployed to the platform nor with the platform itself.

4.2.2 Resource allocation

Each application in the system can be allocated only a subset of the system resources. The operator can configure an exact memory consumption for an application: the configuration sets both the memory and the swap space the application is allowed to use. Each application can have specific CPUs assigned, and only those will be used by the scheduler. The platform also allows the operator to specify how many CPU cycles each application receives. This way an application cannot drive the platform, and the other applications, into resource starvation.

4.2.3 Building the application and central repository

When a new application is set to be deployed to the platform, an image of the system should be created. The image holds everything the application needs to be deployed: the whole system image, required system packages, dependencies, compiled scripts, interpreters and the application itself are all part of the image. These images are stored in a central repository. When the platform is about to deploy the application, the image is downloaded to the target node and started. This way it is possible to keep one copy of the prebuilt image in the repository and transfer the image only when needed.

4.2.4 Thin provisioning

Because of the requirement for standalone application images, such a platform would have high storage capacity requirements: it has to accommodate the whole system image of each application, and each node the application is deployed to has to have the image unpacked and started. To lower the demands on storage capacity and bandwidth, thin provisioning should be put in place. With thin provisioning, only the differences between the basic system image and the image built for the application are stored. Downloading a whole operating system image for each deployment over the network is not resource effective; with thin provisioning the amount of data transmitted over the network is much smaller.


4.2.5 Extensibility

The system has to be created in a way that makes adding new features and technologies very simple. In some deployments the Ruby language might be important, in others Node.js or Java. The platform has to allow broadening the set of technologies it provides to the applications. It is also important to ensure that the technologies the platform uses for its inner workings are replaceable. The Java ecosystem has a lower adoption rate of new technologies compared to Python, Ruby or JavaScript; the platform has to be capable of keeping pace with the world of new and modern technologies.

4.2.6 Deployment

To deploy an application, a version control system is used. As part of the platform, version control repositories are provided to receive changes from customers. When a new application is supposed to be deployed, a revision id is specified and the new application is built according to the state corresponding to that revision id. More applications may be deployed from one repository, which allows creating staging environments for testing and quality assurance before moving the application into a production deployment.

4.2.7 File system

The application has access to the file system where it is deployed, but has no guarantee of persistence of files written to it. Every time an application is deployed, it starts with the state of the file system that was created during the building phase. The application has to use some other service to store its files. The service might be provided by the platform operator, but a third-party service might also be utilized.

4.3 Comparison with existing solutions

Starting a new project is always time and resource consuming. With a large project such as a platform, it is even more demanding. This chapter discusses some existing projects and compares them with the previously described set of features that are required from the platform.


4.3.1 CloudFoundry

CloudFoundry1 is a platform created by VMware and open sourced under the terms of the Apache 2 license. Because of the permissive license and the big company backing the development, the platform is gaining significant traction.

4.3.1.1 Tight coupling

The project is very complex and does not seem to be easily separable into autonomous components. The components are built on the Ruby on Rails2 framework, which is very complex itself. In the Wildcloud platform, the components will be completely separated and decoupled so that they can be used as standalone software providing specific functionality.

4.3.1.2 Deployment

CloudFoundry provides a client console application that is run in the root directory of an application. Once the developer wants to deploy the application, it is compressed and pushed to the platform. Almost every developer nowadays uses some version control system. Uploading only the application changes to the version control system allows the platform to save the traffic needed to transfer the application as a snapshot. Once the application revisions are saved inside the platform, it is much simpler to deploy different versions of the application for testing, quality assurance or production.

4.3.1.3 Message queuing

CloudFoundry uses a message queue to decouple some of its components. As the message queue, a Ruby-based product called NATS3 is used. In the beginnings of Twitter4, the team created and used their own message queue written in Ruby with a very simple memcache-like protocol. This project was later abandoned and reimplemented in Scala5 because of performance issues with Ruby6.

1. http://cloudfoundry.org/
2. http://rubyonrails.org/
3. https://github.com/derekcollison/nats
4. http://www.twitter.com
5. http://www.scala-lang.org/
6. http://robey.livejournal.com/53832.html

4.3.1.4 Isolation

CloudFoundry provides only basic application and resource isolation features. Applications are deployed from the same file system, and the platform does not provide the ability to install a per-application set of packages. To isolate resources, CloudFoundry uses ulimit7 to force the applications to behave correctly, but this mechanism does not allow isolation of running applications.

4.3.2 Nodejitsu

Nodejitsu8 is a platform as a service developed as an open-source project9 and also run as a commercial service. The components are loosely coupled and lightweight.

4.3.2.1 Node.js

Nodejitsu focuses on the deployment of Node.js based applications only. Wildcloud supports a wider range of technologies, including Node.js.

4.3.2.2 Isolation

Nodejitsu supports only a simple chroot for application security. There is no support for resource allocation or application isolation on the system level.

4.3.3 OpenShift

OpenShift10 by RedHat11 was announced after this project was started. According to all the information available, the platform seems to be very similar to Wildcloud; however, it is not open-source. RedHat has promised to make the platform open in the future12.

7. http://www.linuxhowtos.org/Tips%20and%20Tricks/ulimit.htm
8. http://nodejitsu.com/
9. https://github.com/nodejitsu
10. https://openshift.redhat.com/app/
11. http://www.redhat.com/
12. http://redmonk.com/sogrady/2011/05/04/deconstructing-red-hats--the-qa/


More information regarding the platform was requested from RedHat; information was promised, but by the deadline of this thesis none was provided.

4.4 Applied technologies

Wildcloud is composed of many technologies. The platform is built with extensibility in mind, and almost any component in the platform can be replaced with a completely different kind of product. For the initial implementation, however, it is desirable to choose a basic set of technologies to bootstrap the project into a working state that can serve as a blueprint for implementing extensions.

4.4.1 Operating system

As the operating system, Ubuntu Linux Oneiric Ocelot (11.10) Server edition13 was chosen. This distribution provides modern packages compared to more conservative distributions, which is important from the point of view of Ruby development. The community around this language is pushing new technologies very fast, and distributions like Debian or RedHat Enterprise Linux cannot keep the pace. Ubuntu also natively supports many technologies that were abandoned or never implemented in other distributions. For example, Aufs14 is at the moment the only viable open-source solution for thin provisioning, and it is contained in Debian-based distributions but not in Fedora-based ones. That was one of the reasons not to start building the project on Fedora.

4.4.2 Application security and isolation

Wildcloud builds it’s security features on Linux Containers15. Linux Containers allow very secure chroot environment. Once container is chrooted into a virtual environment, Linux Containers create isolations layer around the container to isolate it’s processes and resources. Linux Containers allow to specify what sys-

13. http://www.ubuntu.com/business/server/overview 14. http://aufs.sourceforge.net/ 15. http://lxc.sourceforge.net/

26 4. MAKINGTHECLOUDSWILD tem resources each container is allowed to use, and can be used to achieve fine grained resource allocation. Linux Containers are operating system level virtualization16. Compared to full system virtualization17, there is only one kernel running, so this kind of virtual- ization imposes less overhead on the system. The performance is almost the same as a native system18. More to that, operating system level virtualization can be used inside full system virtualization like Xen19 or KVM20. Linux Containers are implemented as part of new versions of Linux kernel and does not require external patches like OpenVZ21. OpenVZ is more mature product, however it is focused on RedHat Enterprise Linux based distributions and requires old kernel version. It is also difficult to combine patches of OpenVZ and Aufs into a single kernel.
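The resource limits described in section 4.2.2 map onto cgroup keys in the LXC container configuration, which the platform can generate per application. A minimal sketch follows; the helper, the container name and the chosen values are illustrative, while the cgroup keys themselves are standard lxc.conf options:

```ruby
# Render an LXC container configuration limiting memory, memory+swap
# and CPU placement for one application. The keys are standard
# lxc.conf cgroup options; the rest of the configuration is omitted.
def lxc_config(name, memory_mb:, swap_mb:, cpus:, cpu_shares: 1024)
  total = memory_mb + swap_mb
  <<~CONF
    lxc.utsname = #{name}
    lxc.cgroup.memory.limit_in_bytes = #{memory_mb}M
    lxc.cgroup.memory.memsw.limit_in_bytes = #{total}M
    lxc.cgroup.cpuset.cpus = #{cpus.join(',')}
    lxc.cgroup.cpu.shares = #{cpu_shares}
  CONF
end

config = lxc_config("app-42", memory_mb: 256, swap_mb: 128, cpus: [0, 1])
```

Note that memsw is the combined memory plus swap limit, which is why the swap allowance is added to the memory limit before rendering.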

4.4.3 Loose coupling

AMQP22 is used to decouple the platform into smaller components. As the implementation of the AMQP broker, RabbitMQ23 was chosen because of its performance, features and wide deployment. RabbitMQ supports clustering and makes it possible to build systems with no single point of failure. Each component connects to RabbitMQ and reports its status. After that, the component waits for instructions about what should be done. The instructions can be sent by any component in the platform; however, in most cases they are sent by a platform orchestration component that coordinates the whole platform or a part of it. This orchestration component is not part of the platform itself, because it is deployment specific.

16. http://en.wikipedia.org/wiki/Operating_system-level_virtualization
17. http://en.wikipedia.org/wiki/Full_virtualization
18. http://en.opensuse.org/SDB:LXC
19. http://xen.org/
20. http://www.linux-kvm.org/page/Main_Page
21. http://wiki.openvz.org/Main_Page
22. http://www.amqp.org/
23. http://www.rabbitmq.com/


4.4.4 Thin provisioning

The Advanced multi-layered unification file system version 3.x (Aufs3)24 is used to provide thin provisioning of virtual machines. Aufs stacks separate directories on top of each other in read-write or read-only mode; Wildcloud uses it to create virtual environments for applications. When an application stack is built, all data resulting from the process is written into a separate directory that forms a read-write layer on top of a read-only directory containing the base system. The resulting directory is packed and ready to be deployed into the cloud. When the deployment process is started, the application stack is transferred to the target node and unpacked. To create the running environment, the base system (the same one as in the build phase) is layered read-only, the application stack is layered read-only on top of it, and a read-write temporary directory is layered on top of that.
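The three-layer stack for a running environment can be expressed as a single aufs mount. The sketch below only builds the mount invocation; the branch option follows the standard aufs br= syntax, while the directory paths are chosen purely for illustration:

```ruby
# Build the mount invocation that stacks a writable scratch directory
# over the read-only application stack and base system, as described
# above. The paths are illustrative.
def aufs_mount_command(base:, app:, scratch:, target:)
  # Branches are listed topmost first: scratch (rw), app stack (ro), base (ro).
  branches = "#{scratch}=rw:#{app}=ro:#{base}=ro"
  ["mount", "-t", "aufs", "-o", "br=#{branches}", "none", target]
end

cmd = aufs_mount_command(base: "/cloud/base",
                         app: "/cloud/apps/app-42",
                         scratch: "/cloud/scratch/app-42",
                         target: "/cloud/instances/app-42")
```

All writes made by the running application land in the scratch branch, so discarding that one directory restores the instance to its freshly built state.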

4.4.5 Data storage

The file system the application runs on is transient. Whenever an application is deployed, a new temporary directory is created, and all files written by the application to the file system are contained in that directory. When the application is undeployed, the data is destroyed. This allows the application to use temporary data and also makes it possible to move applications within the cluster. Databases are at the core of modern web applications. A database server may be started as an application in the cluster and would then be available to the applications the developer deploys to the platform. But the file system the applications use is not persistent, so the data would not survive. To solve the problem, the applications have to use some other service. The platform provider can provide and scale database servers for the clients, but that is not in the scope of the platform itself. The same holds for uploaded files: the applications have to use an external service to save data that needs to be persistent. These features are deployment specific and therefore are not part of the platform.

24. http://aufs.sourceforge.net/


4.4.6 Cloud communication

Whenever an application instance is started, a new IP address is assigned to it. The IP address is used to route HTTP requests to the application instance. Having an address for each application also allows the applications to communicate among themselves. Within each application instance more than one process may run; therefore, routing based on ports is not the right solution.

4.4.7 External services

To the application itself, the environment seems like a standard Linux system. When an application wants to initiate a connection, it is allowed to do so, and the connection is handled based on the configuration of the external system. In the actual implementation, the communication is handled via NAT. Applications are therefore able to connect to the external network; however, they are not reachable from the external network directly. A simple port-forwarding system would allow opening external ports and routing the communication to the application, but because the platform's main task is to streamline web application deployment, such functionality is not required.

4.4.8 HTTP routing

Wildcloud implements an HTTP proxy that routes incoming HTTP connections across the cluster. Because the platform can move deployments of applications from node to node according to actual performance characteristics and application deployments, it is not possible to give an application a static address. The router solves this problem: all connections to applications inside the cluster are proxied through an HTTP proxy that is aware of the actual location of the requested applications and forwards each request to the right virtual environment. The HTTP proxy also serves as a load balancer: when more instances of an application are deployed, the requests are distributed evenly among those instances. This provides an important aspect of scalability: by adding application deployments into the cluster, the application can scale horizontally, and it is only up to the platform provider to ensure enough computational resources.
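The lookup-plus-balancing step the proxy performs can be sketched in a few lines; the data shapes (host names, backend addresses) are illustrative and the real router is more elaborate:

```ruby
# Minimal sketch of the HTTP router: map a Host header to the list of
# backend instances and rotate through them round-robin.
class Router
  def initialize
    @routes  = Hash.new { |h, k| h[k] = [] }
    @cursors = Hash.new(0)
  end

  # The orchestration component updates the table whenever instances
  # start, stop or are relocated.
  def update(host, backends)
    @routes[host] = backends
    @cursors[host] = 0
  end

  # Pick the next backend for an incoming request, or nil if unknown.
  def route(host)
    backends = @routes[host]
    return nil if backends.empty?
    backend = backends[@cursors[host] % backends.size]
    @cursors[host] += 1
    backend
  end
end

router = Router.new
router.update("app.example.com", ["10.0.0.5:8080", "10.0.0.9:8080"])
picks = 4.times.map { router.route("app.example.com") }
```

Because the table is replaced atomically on update, relocating an instance is just a matter of publishing a new backend list for the host.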


4.4.9 Session store

The HTTP proxy is responsible for delivering the request, and it decides where the request is routed based on the performance aspects of the platform. However, almost all modern web applications use a session to store data between requests. The two most widely used models are memory-based and file-based storage. Neither of these models can be used in a platform environment: the data would not survive application relocation, and because requests arrive at instances randomly, the user would be logged in on one instance and not on the others. The problem can be solved by storing the data in external storage such as a database. This approach, however, adds more overhead to the database server and increases its resource needs. Wildcloud solves this problem by implementing a simple HTTP-based service that is responsible for managing sessions for applications. Requests to the service may be made asynchronously and will not block evented architectures. It is then up to the service to save the sessions and scale the storage itself.
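The session service reduces to three operations on a shared store. A minimal in-memory sketch of its dispatch logic follows; in a real deployment the hash would be replaced by a persistent back-end, and the protocol details are simplified here:

```ruby
# In-memory sketch of the session store service: GET loads, PUT saves,
# DELETE removes a session. Each handler returns [status, body].
class SessionStore
  def initialize
    @sessions = {}
  end

  def handle(method, session_id, body = nil)
    case method
    when "GET"    then [@sessions.key?(session_id) ? 200 : 404, @sessions[session_id]]
    when "PUT"    then @sessions[session_id] = body; [200, nil]
    when "DELETE" then @sessions.delete(session_id); [200, nil]
    else [405, nil]  # method not allowed
    end
  end
end

store = SessionStore.new
store.handle("PUT", "abc123", { "user" => "marek" })
status, data = store.handle("GET", "abc123")
```

Because every instance of an application talks to the same store, a user logged in through one instance stays logged in when the next request lands on another.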

4.5 Architecture

This chapter discusses the architecture of the platform. In the previous chapters, the technologies used to build the platform were discussed as isolated pieces. To make the platform work, the components have to be made to communicate.

4.5.1 Basic communication

Chart 4.1 shows how all communication in the platform works.

Figure 4.1: Basic communication in the platform

MQ stands for Message Queue. The Message Queue is responsible for delivering messages among the components in the cluster. In the diagrams the Message Queue is represented as a single instance; however, in production deployments it should be run as a cluster of nodes on many machines. The components themselves do not know what other components are in the cluster or where they are located. A component just connects to the Message Queue and creates the required routings for published messages.

This mechanism can be easily illustrated by the communication between the Brain and Git components. When the Brain starts, it connects to the Message Queue and lets it know that all messages tagged as "master" should be routed to it and that it will handle those messages. Afterwards, when a Git component starts and connects to the Message Queue, it lets the Message Queue know that all messages related to Git and tagged with the name of its node should be routed to it. When the Git component is fully started, it publishes a message, tagged as "master", stating that a node with a specific name wants a full synchronization of the data related to Git. The Brain component receives the message based on the routing, generates a response and publishes it tagged with the specific name of the node contained in the original request. The response is delivered to the node based on the routing by the Message Queue. This way, neither the Brain nor the Git component needs to know how the other is implemented or where it is located. This architecture allows Wildcloud to operate on a single machine as well as across separate data centers around the world.

In this and the following chapters, Brain represents the orchestration component. This component is responsible for organizing the cloud and instructing the other components what they are supposed to do. Only the Brain is aware of the state of the world; regular components know only the information directly connected to their purpose. The Brain is responsible for storing all information regarding the state of the components, for delivering all required information to the components when requested, and for mediating all information published by the regular components.
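The Brain/Git exchange described above can be sketched as an in-process simulation of the tag-based routing that RabbitMQ performs in the real platform; the message fields and tag names here are chosen for illustration and are not the actual Wildcloud protocol:

```ruby
# Minimal in-process sketch of tag-based message routing.
class MessageQueue
  def initialize
    @bindings = Hash.new { |h, k| h[k] = [] }
  end

  # A component declares which tag (routing key) it handles.
  def bind(tag, &handler)
    @bindings[tag] << handler
  end

  # Publishing delivers the message to every component bound to the tag.
  def publish(tag, message)
    @bindings[tag].each { |handler| handler.call(message) }
  end
end

mq = MessageQueue.new
log = []

# The Brain handles everything tagged "master" ...
mq.bind("master") do |msg|
  log << [:brain, msg]
  # ... and answers sync requests by routing back to the node's own tag.
  mq.publish(msg[:reply_to], type: "sync_data") if msg[:type] == "sync_request"
end

# A Git component on node "git-1" handles messages tagged with its name.
mq.bind("git.git-1") { |msg| log << [:git, msg] }

# On startup, the Git node asks the Brain for a full synchronization.
mq.publish("master", type: "sync_request", reply_to: "git.git-1")
```

Neither handler refers to the other component directly; all coupling goes through the tags, which is exactly what allows components to be relocated freely.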

4.5.2 Core components

Chart 4.2 displays all core components of the platform. As mentioned in the previous chapter, no components are directly connected; all communication among the components is directed by the Message Queue.


Figure 4.2: Core components of Wildcloud

The platform internally uses Git to transfer application data from clients to the platform. The Git service is responsible for managing all aspects of the platform regarding the Git version control system25. The platform uses Git over the SSH transport to provide high performance and compatibility. To authenticate users, SSH keys26 are used. The component is responsible for the SSH key management of the machine the SSH server runs on. The component also creates and destroys repositories on the file system as requested by the Brain component. When the client is authenticated and it is known which repository the client wants to access, the component checks the authorization and either allows or denies the operation. When all data is received, Git notifies the Brain that new revisions are available. After that, the project manager may be notified of new changes in the application, as seen in chart 4.3.

25. http://git-scm.com/
26. https://wiki.archlinux.org/index.php/SSH_Keys

Figure 4.3: Pushing new revision to Git

When a Git component receives new data, the client can request building a new image of the application. The Builder component is responsible for this task, as can be seen in chart 4.4. When the Builder receives a request to build a new image, it sets up a new empty environment based on the "base image". Into this environment, the sources of the application are cloned from the Git repository. Depending on the platform the application uses, the Builder sets up the environment. This process always consists of two steps. As a superuser, the Builder configures the operating system aspects of the environment; in this phase, new operating system packages are installed according to a manifest provided by the application. In the second phase, the Builder runs application-specific tasks as an unprivileged user. When both phases are finished, the Builder compresses the environment and pushes the resulting image into the central repository.

Figure 4.4: Building new version of application

Keeper is responsible for starting and stopping application instances across the cluster. When the client decides to deploy a new version of the application, or wants to scale the application to more instances, the Brain is notified and requests an appropriate task from the Keeper on a specific node. The Brain can also move an instance of an application to a different node in the cluster; to do so, it simply requests stopping an instance on one node and starting an instance on another. The reason for moving instances across the cluster is to allow better allocation of resources and a higher density of deployed applications. As seen in chart 4.5, when the Keeper is requested to deploy a new application, it downloads the image from the central repository and starts the application.

Figure 4.5: Deploying new application

When a request comes to the platform, the Router is responsible for delivering it to the actual instance of the application. The decision where to route the request lies solely with the router and can be replaced. When a new instance of an application is started or stopped, or when an instance has been relocated, the router is notified by the Brain about the actual state of the routing table.
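The two-phase build performed by the Builder can be sketched as an ordered plan derived from the application's manifest. The manifest fields and command strings below are illustrative assumptions, not the actual Wildcloud manifest format:

```ruby
# Derive the ordered build steps from an application manifest:
# first the privileged OS phase, then the unprivileged application phase.
def build_plan(manifest)
  steps = []
  # Phase 1: run as superuser, install OS packages from the manifest.
  manifest.fetch("packages", []).each do |pkg|
    steps << { phase: :system, user: "root", cmd: "apt-get install -y #{pkg}" }
  end
  # Phase 2: run application-specific tasks as an unprivileged user.
  manifest.fetch("build", []).each do |cmd|
    steps << { phase: :application, user: "app", cmd: cmd }
  end
  steps
end

plan = build_plan(
  "packages" => ["libxml2", "libpq-dev"],
  "build"    => ["bundle install --deployment"]
)
```

Keeping the privileged and unprivileged steps strictly ordered means the application code never runs with root rights inside the build environment.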

4.5.3 Components seen by applications

Figure 4.8 shows what the architecture looks like from the perspective of the deployed applications. All the components mentioned in the previous chapter are transparent to the application. The application is somehow started, but it is not aware of the mechanisms that led to its start. The application is directly concerned with only two components. Storage is responsible for storing application data. This component can be provided by the platform operator or by some third-party service; a basic component is provided as part of Wildcloud. The storage is a simple HTTP REST27 service that accepts only a small range of requests. Every request has to contain a special "X-Appid" header to authenticate the application, and the data files are accessed based on that application. All possible operations are described in Figure 4.6. The storage implements multiple back-ends to store the actual data. To allow as simple a deployment as possible, it implements a file system storage that operates only within the file system of the machine the storage runs on. This back-end does not provide any measures to scale or to ensure high availability. To provide such functionality, the storage implements MongoDB28 as a back-end. MongoDB is a document-oriented database29 that natively supports clustering and sharding to provide a highly available and scalable data storage. Documents are stored in the binary JSON format30. Building on its document-oriented architecture, MongoDB provides a virtual file system called GridFS31.

27. http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
28. http://www.mongodb.org/

Figure 4.6: Operations provided by storage service

Method   Path          Description
GET      /some/path    Download data from the location /some/path
PUT      /some/path    Save data to the location /some/path
DELETE   /some/path    Delete data from the location /some/path
GET      / list files  List files available to the application
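A client request against the storage service can be built with Ruby's standard Net::HTTP classes. The host, application id and path below are illustrative; the request is only constructed here, not sent:

```ruby
require "net/http"

# Build (but do not send) an authenticated upload request for the
# storage service. Every request carries the X-Appid header described above.
def storage_request(app_id, path, data)
  req = Net::HTTP::Put.new(path)
  req["X-Appid"] = app_id   # authenticates the application
  req.body = data           # payload saved at the given location
  req
end

req = storage_request("app-42", "/some/path", "file contents")
```

Sending it would be a matter of `Net::HTTP.start(host, port) { |http| http.request(req) }` against whatever address the platform operator exposes the service on.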

Session store provides a central repository for storing sessions. Sessions cannot use any storage that is not available to all instances of the application in the cluster: because requests are routed to random nodes, the data in sessions has to be available to all those nodes. Wildcloud provides a simple HTTP REST service to store and load a session. The operations provided by the service are described in Figure 4.7. The session store uses the Redis32 database as a back-end. Redis is a very fast in-memory key-value database with the possibility of persisting data to the file system. Redis can be run in master-slave replication mode to achieve higher performance.

Figure 4.7: Operations provided by session service

Method   Path         Description
GET      /session_id  Load data for session_id
PUT      /session_id  Save data for session_id
DELETE   /session_id  Delete data for session_id

The application may also save the session data into a cookie and distribute it among the working instances as part of the HTTP requests. There are, however, some limitations. Cookies have a limit on the size of data they may contain, and therefore this method is not suitable for applications that use the session store heavily. Also, to ensure some level of security, the cookies are encrypted; however, when the key used to encode the data becomes known, an attacker may read the data from the cookies or put different data into them. The application is indirectly concerned with one more component, the Router. All incoming requests come through the HTTP routing proxies; moreover, all responses from the applications are sent through those proxies as well. The proxies can count the bandwidth from and to applications and, based on that, apply regulations to ensure some kind of Quality of Service. The proxies may also modify requests and responses to dynamically add more information.

29. http://en.wikipedia.org/wiki/Document-oriented_database
30. http://www.mongodb.org/display/DOCS/BSON
31. http://www.mongodb.org/display/DOCS/GridFS
32. http://redis.io/

4.5.4 Routing of HTTP requests

Chart 4.9 shows the path of a request through the platform. As said before, the platform has to ensure that requests are delivered to the application. This is required because the platform dynamically relocates application instances across the cluster, and their addresses change according to their actual location. When a client initiates a connection to the platform, it is received by a front-facing proxy server. These servers are high performing and have a static set of destinations. In this chart only one front-facing proxy is used, but there might be many of them to increase the performance; a hardware load balancer can distribute the traffic to multiple proxies, or round-robin DNS records33 can simply be utilized. Routers receive information regarding the state of applications from the Brain component. Each Router may have a different set of rules on where to direct a request. This simple fact allows creating sub-systems or sub-networks that may be handled differently. The proxy server may use different algorithms to choose the Router for each request, so it is important to ensure that the chosen router knows how to handle the request. It is also possible to chain routers to create a fine-grained network system.

33. http://tools.ietf.org/html/rfc1794
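The Router behaviour described above - a per-host set of destinations kept up to date by the Brain, with requests balanced among instances - can be sketched as a small lookup table. Class and method names are illustrative, not the platform's actual API.

```ruby
# Hypothetical sketch of a Router's lookup table: each incoming Host
# header maps to the current set of application instances, and requests
# are balanced among them round-robin.
class RoutingTable
  def initialize
    @routes = Hash.new { |h, k| h[k] = [] }
    @cursor = Hash.new(0)
  end

  # Called when the Brain announces a (re)located instance.
  def add_route(host, backend)
    @routes[host] << backend
  end

  # Called when an instance is undeployed or relocated away.
  def remove_route(host, backend)
    @routes[host].delete(backend)
  end

  # Pick the next backend for an incoming request, round-robin.
  def backend_for(host)
    backends = @routes[host]
    return nil if backends.empty?
    backend = backends[@cursor[host] % backends.size]
    @cursor[host] += 1
    backend
  end
end

table = RoutingTable.new
table.add_route('app.example.com', '10.0.0.5:3001')
table.add_route('app.example.com', '10.0.0.6:3002')
table.backend_for('app.example.com') # => "10.0.0.5:3001"
table.backend_for('app.example.com') # => "10.0.0.6:3002"
```

Chaining Routers, as mentioned above, amounts to a backend in one table being the address of another Router rather than an application instance.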


Figure 4.8: Components from the perspective of applications

4.6 Testing

The platform is already being used in production for a small-sized deployment. The behaviour of the platform in this particular deployment is described in this chapter.

Figure 4.9: Routing of HTTP requests through the platform

4.6.1 Functional testing

During the development of the platform, no unit or behaviour tests were used. Testing would add too much overhead to the development process. The platform is built from functional pieces of software that have already been used in production and are therefore tested. Moreover, there was a vision of what the platform should be capable of doing and a well-defined set of functionality to implement. However, there were no requirements on the implementation itself. To adopt test-driven development, a specification regarding the implementation would be expected, and that would make the development process less lean. The development model was inspired by the Lean startup movement34. The main point of this philosophy is to lower all unnecessary burden connected with product development to a minimum, while still providing a functional product. The best testing environment is a production environment, so to test the platform during development, a methodology called Continuous deployment35 was used. Once all the components were capable of their basic functionality, they were deployed into the production environment. There were agreements with some web applications that occasional disruptions in the service would be tolerated, and those applications were deployed into the platform and started. The functioning of the platform was monitored, and whenever a problem appeared, the bug was patched and a new version of the particular component was deployed back into the production system. Unit or behaviour testing discovers the most obvious problems, whereas Continuous deployment in a production environment exposes the problems that actually occur in real use. Such a testing methodology covers functional, behaviour and integration testing.

4.6.2 Load testing

The platform uses well-tested and optimized components where possible and provides the orchestration and management on top of them.

34. http://theleanstartup.com/
35. http://www.startuplessonslearned.com/2009/06/why-continuous-deployment.html

The most crucial component from the point of view of performance is the Router. All incoming requests and outgoing responses have to go through it. According to internal benchmarking inside the component, the overhead per connection is 10 ms. This time includes the internal processing of the request, connecting to the backend, and relaying the request and response.
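A per-connection figure of this kind can be obtained with a simple timing harness. The following sketch is not the platform's internal benchmark; the relay stand-in is a placeholder for the Router's actual request handling.

```ruby
require 'benchmark'

# Illustrative timing harness, not the platform's internal benchmark:
# measure the average per-connection overhead of a relaying step by
# timing many iterations of it in isolation.
def relay(request)
  # stand-in for: parse request, pick backend, forward, relay response
  request.upcase
end

iterations = 10_000
total = Benchmark.realtime do
  iterations.times { relay('GET / HTTP/1.1') }
end
average_ms = (total / iterations) * 1000.0
puts format('average overhead: %.4f ms per request', average_ms)
```

In a real measurement the relay step would include connecting to the backend over the network, which dominates the figure quoted above.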

5 Product evaluation

This chapter discusses topics related to end users and the evaluation of the platform. It provides real data that may be used as a basis for decisions.

5.1 Case-study

One of the possible deployments of the platform is as a staging environment for the development of web applications. This case will be described for a medium-sized internet agency, referred to in the text as the Company.

5.1.1 Company

The Company is a medium-sized internet agency developing web applications and presentations. For the development of its products it utilizes two programming languages - Ruby and PHP. PHP is used in combination with an internal content management system. Ruby is used to develop more complex and demanding applications. The Company has 10 employees. There is one coder for user-interface development. The Company has 2 PHP developers working on long-term projects and occasional one-time contracts. Next, there are 5 Ruby developers and two project managers. The project managers are responsible for quality assurance. The Company provides hosting for web applications using its own servers. The Company has 5 servers located in a server-housing facility.

5.1.2 Problems

Because of the economic recession, a problem with gaining new clients appeared. Companies are uncertain about future economic stability and are afraid of investments. Moreover, customers tend not to spend as much money even with stronger marketing campaigns. With the decrease in customers and the resistance to marketing actions, companies are investing fewer resources into promotion and marketing.


The Company has to lower its operational costs and streamline the development process. The biggest problem is the overhead of deploying and testing its products. To test a product, a developer is required to deploy a specific version for quality assurance.

5.1.3 Requirements

The Company has to automate the deployment of staging applications and lower the dependence of quality assurance on the developers. When a new version of an application is pushed into the version control repository, the project manager is automatically notified. Using a simple UI, the project manager starts the particular version of the application and tests it. When the tests are successful, the customer is invited to do its own testing. When the testing is done, found bugs are entered into the bug-tracking system, or the application is stopped in the staging environment and deployed into a production environment. The whole process has to be automated and operable by users without technical education. It is also crucial to support both the PHP and Ruby languages using the same system and to allow project managers to operate both languages using the same set of tools. In the future there should be a one-click solution for deployment into the production environment from the same user interface that is used for managing the staging environment.
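The notification step of this workflow is typically driven by a git post-receive hook. The sketch below shows how such a hook could parse the lines git feeds it on stdin and build a notification for the platform; the payload shape and the repository name are assumptions, not the platform's real protocol.

```ruby
require 'json'

# Hypothetical sketch of a git post-receive hook body: parse the
# "old-rev new-rev ref" lines git feeds the hook on stdin and build
# the notification the platform would send to the Brain. The payload
# shape is an assumption, not the platform's real protocol.
def push_notification(hook_line, repository)
  old_rev, new_rev, ref = hook_line.split
  { 'repository' => repository,
    'ref'        => ref,
    'revision'   => new_rev,
    'previous'   => old_rev }
end

line = 'a94a8fe5 1b6453892 refs/heads/master'
notification = push_notification(line, 'company/shop')
puts JSON.generate(notification)
# In the real hook this JSON would be POSTed to the platform, which
# then notifies the project manager through the web UI.
```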

5.1.4 Solution

The platform as implemented provides most of the features required by the Company. It will streamline the Company's development process and provide all the required features. It also provides space to grow and to adopt new technologies in the future. Managing version control repositories is one of the core tasks the platform provides. As requested from the UI, repositories are created and destroyed. The platform allows managing access permissions to particular repositories for specific users. The platform also handles notifications of new versions pushed into the version control repositories. The platform is capable of starting applications from the version control system. The deployer may specify a particular revision that is supposed to be deployed, and that revision of the application will be started. By default, the newest revision is used.


Revisions may be specific commits, branches or user-defined tags and names. Inside the platform, the routing of requests to applications is based on domain names, and thus by setting the right DNS records, the application may be presented to the customer. Special domain names may be used to provide basic security by using randomly generated strings. The platform provides a simple web-based user interface to access these features and is therefore suitable for use by non-technical staff. The Wildcloud platform supports any technology that may run on the Linux operating system and inside Linux Containers. At the moment the Company is using PHP and Ruby; however, in a few months it might happen that another technology, like Node.js or Java, will be required. The platform is capable of accommodating such requirements merely by upgrading the base image, or simply by installing per-application system packages inside the application's image.
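The randomly generated staging domains mentioned above could be produced along the following lines; the base domain and the token length are illustrative assumptions, not values used by the platform.

```ruby
require 'securerandom'

# Illustrative sketch: generate a hard-to-guess staging host name so a
# customer can preview an application without it being discoverable.
# The base domain is a placeholder, not one used by the platform.
def staging_host(app_name, base_domain = 'staging.example.com')
  token = SecureRandom.hex(8) # 16 hex chars, 64 bits of randomness
  "#{app_name}-#{token}.#{base_domain}"
end

staging_host('shop') # e.g. "shop-9f86d081884c7d65.staging.example.com"
```

Because routing is host-based, such a name needs no special handling: it is just another route pointing at the staged instance.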

5.1.5 Benefits

The Company expects the need for a one-click solution for the deployment of web applications into the production environment. By using the Wildcloud platform, the Company may merge staging and production servers into one platform and thus minimize the differences between production and staging environments. The Wildcloud platform offers application isolation. This way, a bug in an application cannot affect the other applications running on the server. In combination with per-application resource allocation, there is no need to have separate production and staging servers. Wildcloud allows applications to be deployed using a new operating system. By upgrading the base system image and rebuilding the application, the application will be running inside the new environment, without the need to modify the outer operating system.

5.2 From customer’s point of view

The platform was thoroughly discussed from the point of view of architecture and the technologies used in the previous chapters. From the point of view of an end user, that information is not relevant. The user will work with the front-end application, which is discussed in this chapter.


5.2.1 Welcome screens

The application is created with simplicity in mind. The screens are as simple as possible but provide all the information necessary to operate the system. The design is based on modern user-interface frameworks to provide the necessary user experience.

Figure 5.1: Welcome screen when user enters the application

Registered users may log in directly from the first screen (Figure 5.1). The application uses the user's e-mail as a username to simplify user identification. Users that do not have an account may register (Figure 5.2). By default, users are inactive after registration and are not allowed to log in. Manual action from an operator is required to activate the account.

Figure 5.2: Registration form


5.2.2 Navigation

The navigation is composed of two components. The main navigation panel is located at the top of the page and allows the user to move among the sections of the application (Figure 5.3).

Figure 5.3: Navigation bar

To allow fine-grained navigation inside the sections, a special area on the left side of the page is used. It may contain navigation elements. Standard navigation in the application is done using hyperlinks (Figure 5.4).

Figure 5.4: Navigation elements in sidebar

When the user is about to perform an important action that might affect the functioning of the platform, it is displayed as a coloured button (Figure 5.5).

Figure 5.5: Action buttons in sidebar


Lastly, the sidebar area may contain text that should help the user understand what the screen is responsible for, or give advice on how to operate the platform (Figure 5.6).

Figure 5.6: Special content in sidebar

5.2.3 Dashboard

When the user logs into the application, the dashboard screen is displayed. The Dashboard is a special section that should provide the user with quick access to important information regarding the functioning of the platform and the deployed applications (Figure 5.7). As implemented in the first version of the platform, as part of this thesis, the Dashboard provides an overview of the applications in the platform, the deployed instances, and the domain names the applications are accessible from. From the sidebar, the user can quickly create a new application.

5.2.4 SSH keys

The platform uses SSH keys to authenticate users during git-related communication. The keys are managed from this central place and distributed to the git servers. When a key is saved, the repositories are accessible to the user using that key. Users may have multiple keys to allow access from multiple computers without the need to copy one key to all of them.
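A common way (used, for example, by gitolite-style setups) to map distributed SSH keys to users on a shared git account is a forced command in the authorized_keys file. The sketch below is an assumption about how such an entry could be generated, not the platform's actual code; the shim path is hypothetical.

```ruby
# Hypothetical sketch of how a platform can map SSH keys to users on its
# git servers: each public key is written into authorized_keys with a
# forced command carrying the user's identity, so git access is both
# authenticated and confined. The shim path is an assumption.
def authorized_keys_entry(user_id, public_key)
  options = [
    %(command="/usr/local/bin/git-shim #{user_id}"),
    'no-port-forwarding',
    'no-X11-forwarding',
    'no-agent-forwarding',
    'no-pty'
  ].join(',')
  "#{options} #{public_key}"
end

key = 'ssh-rsa AAAAB3Nza... developer@laptop'
puts authorized_keys_entry(42, key)
```

Whichever key the developer connects with, sshd runs only the forced command, which receives the user's identity and can enforce per-repository permissions.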


Figure 5.7: Dashboard screen

Figure 5.8: SSH keys management

5.2.5 Repositories

The platform uses Git repositories to transfer changes from developers to servers. Developers may create multiple repositories for development. The repositories are not directly related to deployed applications. The repositories may be used independently and then specified when used for the deployment of an application. Once a repository is created, a unique identifier is generated for it. The identifier is based on user-entered data but cannot be changed afterwards. The detail view of the repository also provides information on how to access the repository (Figure 5.9).
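Deriving the permanent identifier from user-entered data might look like the following sketch; the exact normalization rules are an assumption, not the platform's documented behaviour.

```ruby
# Illustrative sketch of deriving a repository's permanent identifier
# from user-entered data: normalize the name into a URL- and
# path-safe slug once, at creation time.
def repository_identifier(name)
  name.downcase
      .gsub(/[^a-z0-9]+/, '-') # collapse anything unsafe into dashes
      .gsub(/\A-+|-+\z/, '')   # trim leading/trailing dashes
end

repository_identifier('My Shop (PHP)') # => "my-shop-php"
```

Freezing the identifier at creation time is what allows clone URLs and deployment references to stay valid even if the display name changes.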


Figure 5.9: Git repositories management

5.2.6 Applications

The main purpose of the platform is to deploy web applications, and the Applications section manages this aspect. As said before, the deployment process has two phases. First, the application is built from a Git repository and a deployable image is created. This image contains all the data necessary to start the application. This is done only once for each version of a particular application. When the image is built, the process moves into the second phase. The user may deploy the application and have it accessible using some domain name and the HTTP protocol. Whenever an instance is deployed, the platform automatically modifies the routing infrastructure to make the instance accessible. When the instance is undeployed, the routes are destroyed. When the user deploys the application into multiple instances, the platform automatically load-balances the requests among those instances. The actual routing infrastructure may be inspected in the Router section. From the main screen (Figure 5.10) the user may build a new application or destroy an existing one. To update the application from the version control repository, the user rebuilds the application from this screen. When the application is built, it may be inspected. During the build of the application, information is collected. The log is then accessible in the Build log subsection (Figure 5.11). The log contains information regarding the git repository cloning, system package installation, and the actual application build.
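The two-phase build/deploy lifecycle described above can be summarized in a small sketch; the class and the image naming are illustrative, not the platform's actual implementation.

```ruby
# Hypothetical sketch of the two-phase lifecycle: an application is
# first built into an image from a git revision, and only a built
# image can then be deployed into instances.
class Application
  attr_reader :state, :instances

  def initialize(name)
    @name = name
    @state = :new
    @instances = []
  end

  # Phase one: build a deployable image for one revision (done once
  # per version of the application).
  def build(revision)
    @image = "#{@name}-#{revision}.img"
    @state = :built
  end

  # Phase two: start an instance from the built image; the routing
  # infrastructure would be updated to include the new instance.
  def deploy(node)
    raise 'build the application first' unless @state == :built
    @instances << { node: node, image: @image }
  end

  # Undeploying removes the instance and its routes.
  def undeploy(node)
    @instances.reject! { |i| i[:node] == node }
  end
end

app = Application.new('shop')
app.build('a94a8fe')
app.deploy('node-1')
app.deploy('node-2')
app.instances.size # => 2
```

Rebuilding from the UI corresponds to calling build again with a newer revision; existing instances keep running on the old image until redeployed.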


Figure 5.10: Applications overview

Figure 5.11: Application build log

In the Deployment subsection, the user deploys and undeploys instances of the application (Figure 5.12).

Figure 5.12: Applications deployments overview

5.2.7 Router

The Router section manages routes inside the platform (Figure 5.13). Whenever an application is deployed or undeployed, the corresponding routes are created or destroyed respectively. The user does not have to create these routes explicitly. The Router section allows these routes to be inspected.


Moreover, the Router may be instructed manually to route requests. In the application it is required to specify the incoming host name and port, to identify the requests, and the target host name and port, to create a proxy connection to the server.

Figure 5.13: Routing overview

6 Conclusion

The main subject of the thesis was to analyse the requirements for deploying web applications and to implement a platform that eases the process. The analytical part of the subject was fulfilled in the first two chapters of the thesis, where the changes in the understanding of web application deployment with regard to the cloud computing phenomenon were described. The implementational part of the subject was fulfilled in the third chapter of the thesis. First, the requirements on such a platform for deploying web applications were laid out. The requirements are based on real problems that the deployers of web applications are confronted with. Knowing the requirements, existing products were discussed. None of the analyzed projects fulfilled the requirements of the platform as stated before. Therefore, a new platform was implemented and its implementation details were discussed. In the last chapter, the resulting product was described from the business owner's point of view. The simplicity of the user interface that allows operating the platform was considered, as well as the functional and load testing of the application. The platform was developed inside a production environment and was therefore continuously tested through the use of real users. One case study was presented to show that the platform may solve different kinds of problems and may help businesses to grow. A project like this one is never finished, and the development of the platform will continue. As implemented as part of the thesis, the platform provides functionality for deploying web applications. In the future, the platform might be extended to provide project management features and become a single solution for web application developers, managers and deployers. The platform also need not be limited to deploying web applications. 
By employing a more sophisticated networking architecture, the platform might be extended to provide a base for deploying any networking application and to serve as a basis for a high-performance-computing cluster. The platform should also be extended to support a wider range of technologies. It utilizes Linux Containers, which provide in-kernel virtualization. By providing hardware virtualization, the platform could be extended to offer a more complex set of features, including support for a different set of operating systems.


The source code is published on GitHub1 and may be accessed freely.

https://github.com/wildcloud

The project is licensed under the terms of the Affero General Public License and is free for anyone to use. Any modifications to the components have to be released as open source under the same license. The project is loosely coupled, and therefore using unmodified components is allowed even in commercial deployments without the need to open-source all components of the resulting system. The license is, however, open to discussion and may change in the future. By working on the project I have gained a lot of knowledge and created a functional platform that may be used in production environments. The implementation parts of the thesis were interesting, because a lot of low-level aspects of networking and the Linux operating system had been explored.

1. http://www.github.com

Bibliography

[1] Kernel based virtual machine. http://www.linux-kvm.org/page/Main_Page, 2011. [Online; accessed 12-September-2011].

[2] Inc. 10gen. Bson. http://www.mongodb.org/display/DOCS/BSON, 2011. [Online; accessed 12-November-2011].

[3] Inc. 10gen. Gridfs. http://www.mongodb.org/display/DOCS/GridFS, 2011. [Online; accessed 5-November-2011].

[4] Inc. 10gen. Mongodb. http://www.mongodb.org/, 2011. [Online; accessed 12-November-2011].

[5] Muhammad Ali Babar and Muhammad Aufeef Chauhan. A tale of migration to cloud computing for sharing experiences and observations. In Proceedings of the 2nd International Workshop on Software Engineering for Cloud Computing, SECLOUD '11, pages 50–56, New York, NY, USA, 2011. ACM.

[6] Christian Baun and Marcel Kunze. The koala cloud management service: a modern approach for cloud infrastructure management. In Proceedings of the First International Workshop on Cloud Computing Platforms, CloudCP '11, pages 1:1–1:6, New York, NY, USA, 2011. ACM.

[7] Network Working Group (T. Brisco). Dns support for load balancing. http://tools.ietf.org/html/rfc1794, 1995. [Online; accessed 5-September-2011].

[8] Sergey Bykov, Alan Geller, Gabriel Kliot, James R. Larus, Ravi Pandya, and Jorgen Thelin. Orleans: cloud computing for everyone. In Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pages 16:1–16:14, New York, NY, USA, 2011. ACM.

[9] Scott Chacon. Pro Git. Apress, Berkeley, CA, USA, 1st edition, 2009.

[10] Scott Chacon. Git - fast version control system. http://git-scm.com/, 2011. [Online; accessed 27-November-2011].

[11] Inc. Citrix Systems. Welcome to xen.org, home of the xen hypervisor, the powerful open source industry standard for virtualization. http://www.xen.org, 2011. [Online; accessed 12-October-2011].


[12] Derek Collison. derekcollison/nats. https://github.com/derekcollison/nats, 2011. [Online; accessed 28-October-2011].

[13] ArchWiki coordinators. Ssh keys. https://wiki.archlinux.org/index.php/SSH_Keys, 2011. [Online; accessed 5-November-2011].

[14] Oracle Corporation. Oracle Solaris 11. http://www.oracle.com/us/products/servers-storage/solaris/solaris11/overview/index.html, 2011. [Online; accessed 5-August-2011].

[15] Dropbox. Dropbox - files - simplify your life. https://www.dropbox.com/, 2011. [Online; accessed 6-August-2011].

[16] Eucalyptus Systems, Inc. Cloud computing software from eucalyptus — leader in cloud software. http://www.eucalyptus.com/, 2011. [Online; accessed 27-September-2011].

[17] Facebook. Facebook. http://www.facebook.com/, 2011. [Online; accessed 12-December-2011].

[18] Roy Thomas Fielding. Architectural styles and the design of network-based software architectures. http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm, 2000. [Online; accessed 27-November-2011].

[19] Jeremy Geelan. Twenty-one experts define cloud computing. http://cloudcomputing.sys-con.com/node/612375, 2009. [Online; accessed 5-October-2011].

[20] Google. Gmail: Email from google. http://www.gmail.com, 2011. [Online; accessed 20-September-2011].

[21] Google. Google. http://www.google.com/, 2011. [Online; accessed 27-December-2011].

[22] Google. Google app engine. http://code.google.com/intl/cs/appengine/, 2011. [Online; accessed 5-November-2011].

[23] Google. Google apps for business — official website. http://www.google.com/apps/intl/en/business/index.html, 2011. [Online; accessed 5-November-2011].

[24] Google. Google chat - chat with family and friends. http://www.google.com/talk/, 2011. [Online; accessed 12-September-2011].

[25] Google. Google docs - online documents, spreadsheets, presentations, surveys, file storage and more. http://docs.google.com/, 2011. [Online; accessed 27-September-2011].

[26] Google. Google+: real life sharing, rethought for the web. http://plus.google.com, 2011. [Online; accessed 13-September-2011].

[27] Google. Picasa web albums: free photo sharing from google. http://picasaweb.google.com, 2011. [Online; accessed 12-September-2011].

[28] AMQP Working Group. Amqp. http://www.amqp.org/, 2011. [Online; accessed 15-December-2011].

[29] PostgreSQL Global Development Group. Postgresql. http://www.postgresql.org/, 2011. [Online; accessed 27-September-2011].

[30] David Heinemeier Hansson. Ruby on rails. http://rubyonrails.org/, 2011. [Online; accessed 15-October-2011].

[31] Inc. Heroku. Heroku — dev center. http://devcenter.heroku.com/, 2011. [Online; accessed 27-November-2011].

[32] GitHub Inc. Github - social coding. http://www.github.com, 2011. [Online; accessed 5-November-2011].

[33] Roger Jennings. Cloud Computing with the Windows Azure Platform. Wrox Press Ltd., Birmingham, UK, UK, 2009.

[34] Inc. Joyent. Smartos: The complete modern operating system. http://smartos.org/, 2011. [Online; accessed 27-October-2011].

[35] Steffen Kächele, Jörg Domaschka, and Franz J. Hauck. Cosca: an easy-to-use component-based paas cloud system for common applications. In Proceedings of the First International Workshop on Cloud Computing Platforms, CloudCP '11, pages 4:1–4:6, New York, NY, USA, 2011. ACM.

[36] Inc. Linux Kernel Organization. Documentation/cgroups. http://www.kernel.org/doc/Documentation/cgroups/, 2011. [Online; accessed 12-December-2011].

[37] Amazon Web Services LLC. About aws. http://aws.amazon.com/what-is-aws/, 2011. [Online; accessed 12-November-2011].


[38] Amazon Web Services LLC. Amazon elastic block store (ebs). http://aws.amazon.com/ebs/, 2011. [Online; accessed 12-December-2011].

[39] Amazon Web Services LLC. Amazon elastic compute cloud (amazon ec2). http://aws.amazon.com/ec2/, 2011. [Online; accessed 5-November-2011].

[40] Canonical Ltd. Server — ubuntu. http://www.ubuntu.com/business/server/overview, 2011. [Online; accessed 12-December-2011].

[41] Parallels Holdings Ltd. Main page - openvz linux containers wiki. http://wiki.openvz.org/Main_Page, 2011. [Online; accessed 5-October-2011].

[42] lxc Linux Containers. lxc - linux containers. http://lxc.sourceforge.net/, 2011. [Online; accessed 5-December-2011].

[43] Microsoft. Ixmlhttprequest. http://msdn.microsoft.com/en-us/library/ms759148(VS.85).aspx, 2011. [Online; accessed 12-October-2011].

[44] Microsoft. Windows azure. http://www.microsoft.com/windowsazure/, 2011. [Online; accessed 12-September-2011].

[45] Rich Miller. Who has the most web servers? http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/, 2009. [Online; accessed 27-November-2011].

[46] NewServers, Incorporated. Newservers: Bare metal cloud. http://www.newservers.com/language/en/, 2011. [Online; accessed 5-September-2011].

[47] Nodejitsu. nodejitsu. https://github.com/nodejitsu, 2011. [Online; accessed 5-November-2011].

[48] Nodejitsu.com. Nodejitsu. http://nodejitsu.com/, 2011. [Online; accessed 15-November-2011].

[49] Inc. Novell and others. Sdb:lxc. http://en.opensuse.org/SDB:LXC, 2011. [Online; accessed 12-October-2011].


[50] Stephen O'Grady. Deconstructing red hat's openshift: The q&a. http://redmonk.com/sogrady/2011/05/04/deconstructing-red-hats-openshift-the-qa/, 2011. [Online; accessed 5-December-2011].

[51] Junjiro R. Okajima. aufs.sourceforge.net. http://aufs.sourceforge.net/, 2011. [Online; accessed 27-December-2011].

[52] OpenNebula Project Leads (OpenNebula.org). Opennebula: The open source toolkit for cloud virtualization. http://www.opennebula.org/, 2011. [Online; accessed 27-September-2011].

[53] Siani Pearson. Taking account of privacy when designing cloud computing services. In Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing, CLOUD '09, pages 44–52, Washington, DC, USA, 2009. IEEE Computer Society.

[54] Inc. Heroku. Heroku — cloud application platform. http://www.heroku.com/, 2011. [Online; accessed 12-November-2011].

[55] Robey Pointer. scarling -> kestrel. http://robey.livejournal.com/53832.html, 2011. [Online; accessed 5-November-2011].

[56] The OpenStack project. Openstack open source cloud computing software. http://openstack.org/, 2011. [Online; accessed 5-October-2011].

[57] Rackspace US, Inc. Cloud computing, managed hosting, dedicated server hosting by rackspace. http://www.rackspace.com/, 2011. [Online; accessed 12-October-2011].

[58] Inc. Red Hat. Openshift by red hat. https://openshift.redhat.com/app/, 2011. [Online; accessed 5-December-2011].

[59] Inc. Red Hat. redhat.com — the world's open source leader. http://www.redhat.com/, 2011. [Online; accessed 5-December-2011].

[60] Eric Ries. Why continuous deployment? http://www.startuplessonslearned.com/2009/06/why-continuous-deployment.html, 2009. [Online; accessed 27-September-2011].

[61] Eric Ries. The lean startup. http://theleanstartup.com/, 2011. [Online; accessed 12-September-2011].


[62] salesforce.com, inc. Crm - the enterprise cloud computing company - salesforce.com europe. http://www.salesforce.com, 2011. [Online; accessed 12-October-2011].

[63] Salvatore Sanfilippo and Pieter Noordhuis. Redis. http://redis.io/, 2011. [Online; accessed 12-September-2011].

[64] Maxim Schnjakin, Rehab Alnemr, and Christoph Meinel. Contract-based cloud architecture. In Proceedings of the second international workshop on Cloud data management, CloudDB ’10, pages 33–40, New York, NY, USA, 2010. ACM.

[65] S&P Softwaredesign. ulimit and sysctl. http://www.linuxhowtos.org/Tips%20and%20Tricks/ulimit.htm, 2011. [Online; accessed 12-November-2011].

[66] GitHub team. Git reference. http://gitref.org/remotes/#push, 2011. [Online; accessed 12-December-2011].

[67] Twitter. Twitter. https://twitter.com/, 2011. [Online; accessed 5-December-2011].

[68] Inc. VMware. Cloud Foundry - make it yours! http://cloudfoundry.org/, 2011. [Online; accessed 13-October-2011].

[69] Inc. VMware. Rabbitmq - messaging that just works. http://www.rabbitmq.com/, 2011. [Online; accessed 12-October-2011].

[70] Jens-Sönke Vöckler, Gideon Juve, Ewa Deelman, Mats Rynge, and Bruce Berriman. Experiences using cloud computing for a scientific workflow application. In Proceedings of the 2nd international workshop on Scientific cloud computing, ScienceCloud '11, pages 15–24, New York, NY, USA, 2011. ACM.

[71] Robin Wauters. Salesforce.com buys heroku for $212 million in cash. http://techcrunch.com/2010/12/08/breaking-salesforce-buys-heroku-for-212-million-in-cash/, 2010. [Online; accessed 5-December-2011].

[72] [email protected]. chroot(1) - linux man page. http://linux.die.net/man/1/chroot, 2011. [Online; accessed 5-November-2011].


[73] Wikipedia. Cloud computing — wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=Cloud_computing&oldid=467862539, 2011. [Online; accessed 28-November-2011].

[74] Wikipedia. Document-oriented database — wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Document-oriented_database, 2011. [Online; accessed 5-November-2011].

[75] Wikipedia. Dot-com bubble — wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=Dot-com_bubble&oldid=468041713, 2011. [Online; accessed 28-November-2011].

[76] Wikipedia. Full virtualization — wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Full_virtualization, 2011. [Online; accessed 27-October-2011].

[77] Wikipedia. Operating system-level virtualization — wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Operating_system-level_virtualization, 2011. [Online; accessed 12-October-2011].

[78] Wikipedia. Sun microsystems — wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Sun_Microsystems, 2011. [Online; accessed 12-December-2011].

[79] Wikipedia. World wide web — wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=World_Wide_Web&oldid=467071500, 2011. [Online; accessed 28-December-2011].

[80] École Polytechnique Fédérale de Lausanne (EPFL). The Scala programming language. http://www.scala-lang.org/, 2011. [Online; accessed 13-November-2011].
