Serverless Computing: Containerizing FaaS with Docker

MSc Research Project

Taofeek-Femi Abdullahi Student ID: 19104219

School of Computing National College of Ireland

Supervisor: Manuel Tova-Izquierdo


Abstract

Serverless computing enables cloud providers to abstract the process of spinning up servers to run programs. Developers can simply write their backend logic as pieces of code that get triggered by events such as a change in a database, a change in object storage, or a request to an API endpoint. In a real world scenario, most applications require external libraries and packages in order to produce consistent results across multiple deployed environments. This, however, is one of the limitations of serverless computing, as some providers only provide a limited set of packages at runtime. Developers need to ensure that these libraries are installed in their serverless environment before deploying their code. This paper introduces ADEPS - Application Dependency Packager for Serverless. With the help of Docker, we were able to package serverless actions, with all the required dependencies, as a Docker image which is then deployed to IBM OpenWhisk. We tested our solution by deploying Spleeter, an open-source Python library that decouples audio tracks. We also deployed a graph plotting function that leverages matplotlib. Our results showed that ADEPS can facilitate a more efficient way of deploying customized serverless functions to IBM OpenWhisk.

Keywords - Serverless Computing, Cloud Computing, Docker, IBM OpenWhisk

1 Introduction

The rise of serverless computing does not seem to be slowing down anytime soon. This should not come as a shock, seeing as serverless computing promises to eliminate the need to ever spin up, configure or manage a server. Server management has come a long way, but the concept of servers is still the same. In the beginning, writing an application meant that you had to deploy that application to a live server that had been configured with a certain amount of memory and disk space. You also needed to set up a network topology to ensure that the outside world could communicate with your application. Soon enough, virtualization emerged. This meant that you no longer had to waste resources, as you could provision surplus resources to a whole new operating system. Productivity seemed to improve until server management started to take over core operations, and so resources moved to the cloud. The cloud is essentially a cluster of servers in a data center. The current cloud computing landscape adopts a "data center" approach where dedicated servers are leveraged to provide services (Mengistu et al.; 2017). Cloud providers rent servers and resources to end users for a price. The ease and reliability of cloud computing is the reason why application development has become so popular. Most Cloud Service Providers (CSPs) promise a 99.9% uptime of servers, which is commendable seeing as that might not be achievable on-premise. CSPs also provide resources, such as Load Balancers and Elastic Container Services, to help with the deployment and scaling of applications.

If applications need servers to run, where then does the term "serverless" come from? "Serverless computing" is a misnomer. The term "serverless" does not actually mean there are no servers. It was coined by the developers of this technology to describe the abstraction of servers away from developers. In a typical serverless scenario, the servers that run the application code are deployed and run in containers following an event that triggers the invocation of a function. Think of serverless computing as zero management of servers. Containerization is a form of virtualization that happens at the operating system level: the guest operating system is packaged and run as an application in the host operating system. Containerization is the key element of serverless, since applications running in containers can easily be spun up or shut down. With the ability to spin up containers that are loaded with application dependencies at runtime, serverless computing promises automatic scalability of application logic. There are several serverless offerings. AWS leads the pack, with AWS Lambda being the most popular serverless offering. Others include Apache OpenWhisk, Azure Functions, Google Cloud Functions, etc.

1.1 Motivation

Application logic that handles tasks which scale quickly in a short time, such as food ordering at peak hours or viewers logging in to a virtual conference, is a typical use case for serverless computing. CPU intensive tasks such as image manipulation are also examples of where serverless computing is applicable. This process of outsourcing application logic prevents physical resources from being clogged up, which can lead to slow response times or, in worse scenarios, a crash of the application. Developers can simply deploy their application logic to a serverless environment, where it gets triggered when an event occurs. Serverless platforms provide API endpoints that return a JSON response on invocation. However, a limitation often found on serverless platforms is that most providers only provide a limited set of dependencies at runtime. In some cases, the dependencies required for the execution of the application logic are not provided at runtime. We therefore lead this research paper with our research question: "How can application development with serverless computing leverage third party dependencies by using Docker containers on IBM OpenWhisk?"
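To make the action model above concrete, here is a minimal sketch of a serverless function in the Python/OpenWhisk style: the platform invokes main() with the event parameters and serializes the returned dictionary as the JSON response. The parameter name used here is illustrative, not taken from the original work.

    def main(params):
        # "name" is a hypothetical parameter supplied by the triggering event.
        name = params.get("name", "world")
        # The returned dict becomes the JSON body of the API response.
        return {"greeting": "Hello " + name}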

1.2 Contribution

This research paper contributes the following:

• A comprehensive literature review on cloud computing, serverless computing and Docker containers.

• A critical review of related work.

• ADEPS - Application Dependency Packager for Serverless. A tool that deploys a Docker image as a serverless function on IBM OpenWhisk.

The rest of this paper is structured as follows: Section 2 reviews background and related work in this area. In Section 3, we look at the methodology employed in this research. Section 4 details the design decisions taken in this work. We discuss our implementation in Section 5, evaluate our work in Section 6, and conclude in Section 7.

2 Background and Related Work

2.1 Virtualization vs Containerization

2.1.1 Virtualization

Virtualization enables software applications to behave as if they were their own operating system (Barrett and Kipper; 2010). This virtual operating system, commonly known as the "Guest OS", acts as if it were an independent system with its own binaries and libraries. The guest OS is usually provisioned virtual memory, a virtual hard disk and other virtual hardware from the host operating system. Virtualization, being the precursor to cloud computing, helps to solve the problem of surplus resources on physical machines. A Windows system can easily provision resources for two or more Linux hosts, depending on how many resources it has. Virtualization also makes it possible to back up the entire state of a system, which can be part of a larger disaster management workflow. You can simply keep copies of your machine in its entire state and, in case of a crash, a backup can easily be restored. Virtualization was made possible by advancements in x86 servers, and it works by placing the virtualization layer underneath the OS, which then acts as a VMM (Virtual Machine Monitor) to manage and control the OS (Barrett and Kipper; 2010).

2.1.2 Containerization (Docker)

Virtualization is the technology that paved the way for containerization. The need to containerize applications emerges as an application scales and there becomes a need to set up different environments such as development, testing, debugging and production. It becomes an issue when an application is not consistent across multiple environments, simply because an application that depends on a particular version of a dependency is deployed to an environment with the wrong version of that dependency, causing the application to crash or produce inconsistent results. The tediousness that comes with replicating an application runtime environment across numerous setups is why Docker has become so popular. Docker provides developers with the capability to run applications in isolated runtime environments called containers 1. These containers, or Docker containers, are lightweight operating systems configured with the necessary libraries and binaries required to run an application. The lightweight nature of Docker containers (which is due to the absence of a hypervisor, since they run directly within the host machine's kernel) is what makes it possible to deploy numerous containers to a host machine, provided there are enough resources on the host machine to cater to all the containers deployed. By packaging up an application in a Docker container, consistency of every deployed copy of that containerized application is assured.

1Docker Overview: https://docs.docker.com/get-started/overview/

Figure 1: Full Virtualization (Barrett and Kipper; 2010)

Docker extends LXC 2 with a kernel-level API which runs processes in isolation, together with an application-level API. To thoroughly isolate the contained application's view from the host OS, Docker employs Linux namespaces. The application's view of user IDs, networking, file systems and process trees is also isolated (Bernstein; 2014).

Figure 2: A representation of the Docker architecture

2LXC Introduction: https://linuxcontainers.org/lxc/introduction/

Gomes et al. (2018) introduced uDocker, a tool that enables the execution of Linux containers in user mode. Docker simplifies the management of Linux containers and also provides an efficient container repository in Docker Hub. The layered file system enables a new image to be created from an existing one by reusing or sharing its layers. This process of sharing layers can help minimize downloads and save storage space (Gomes et al.; 2018). In SCAR, Perez et al. (2018) addressed the issue of the restrictions imposed on serverless functions by their providers. They introduced a framework to create highly parallel event-driven functions with customized runtimes, built on top of AWS Lambda. They tested their solution on an image processing use case and reported that SCAR provided a much more efficient deployment of high throughput computing workloads to AWS. They went further by implementing a cache within the Docker container loaded into the serverless container for execution. This technique allowed for reduced latency in subsequent requests during concurrent function execution. To conclude their work, Perez et al. (2018) suggested the adaptation of their tool to other serverless offerings. They also highlighted that, by reducing the boot up time of containers through periodic invocations of serverless functions, the SCAR framework can reduce the latency issue faced in serverless.

2.2 Serverless Optimized Containers

In a bid to understand the inner workings of containers, Oakes et al. (2018) proposed bind mounting for the population of a directory when allocating the file system for a container. This comes as an alternative to the use of union file systems. Bind mounting is the process of making the same directory visible at various locations, which is a more efficient solution than populating directories by physically copying files, a process that can be slow. They were able to conclude that, at scale, bind mounting is twice as fast as the system adopted in Docker. Bind mounting into a container involves binding the user's unprivileged ID, as opposed to the ID in the user's namespace, to the host (Priedhorsky and Randles; 2017), which makes it a safe process. Their proposed solution, SOCK, was implemented by replacing Docker as the primary container for OpenLambda. There are two major operations associated with SOCK: (1) effective sandbox initialization that enables all workers to achieve high throughput, and (2) reduced latency in library importation by Python handlers. Both operations have to be considered in order to make serverless more ideal for applications. The distinct Zygote provisioning strategy adopted by SOCK has the following characteristics: it determines the set of pre-imported packages based on their usage at runtime; it maintains multiple Zygotes, which can scale to a large set of pre-imported packages; it handles the complete integration of provisioning with containers; and finally, it secures processes from malicious packages they did not import.
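As a minimal illustration of the bind mounting idea on Linux (not SOCK's implementation; both paths are hypothetical and the operation requires root privileges):

    import subprocess

    # Make one directory visible at a second location without copying files.
    # The same files then appear at both paths, backed by the same storage.
    subprocess.run(
        ["mount", "--bind", "/opt/shared-libs", "/srv/sandbox/libs"],
        check=True,
    )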

2.3 Limitations in Serverless

Hellerstein et al. (2018) proposed that serverless technology is not ideal for workloads such as applications with a compute period longer than 15 minutes, data-rich applications, or distributed computing. They further added that serverless computing is best suited to computations that are embarrassingly parallel. Such computations have been described in ExCamera (Fouladi et al.; 2017), a program that attains low-latency video encoding by separating the video encoding process into two parts. Concurrent invocation in serverless can be leveraged by applications that can be divided into embarrassingly parallel smaller tasks (Lee et al.; 2018). Hellerstein et al. (2018) further reported that the reason why serverless computing is considered "less" is the limited lifetime of functions. Running functions in warm containers can be a solution to this, but it often leads to a waste of resources, as functions are invoked periodically even when they are not used. In addition, state management in serverless computing is another glaring limitation of this paradigm, since subsequent invocations of a function might not run in the same container. A feasible solution is to use external storage to hold state and read that state back in during function execution. This, however, can introduce latency into the system, thereby delaying function execution more than necessary. To achieve high throughput, fine-grained interleaving of various functions will have to be leveraged by serverless applications. This can result in underperformance of some components within the application (Shahrad et al.; 2019).

2.3.1 Understanding Cold Start in Serverless

Discussing the cold start issue in serverless, Baldini et al. (2017) reported that it is critical to develop techniques to minimize this issue while still scaling down to zero. They identified it as a system-level challenge and pointed out the inability to use declarative approaches to control deployment as a DevOps problem. To address this issue, Pérez et al. (2019) used a high-level declarative language to outline the deployment of their OSCAR framework, which made deployment reproducible across cloud runtimes. Three factors influencing cold start latency were highlighted by Jonas et al. (2019): (1) the time it takes to invoke a serverless function, (2) the time it takes to load the required modules and libraries, and (3) the time it takes to run application-specific initialization in the user's code. van Eyk et al. (2019) pointed out that an HTTP request, in the form of an event, reaches a function router, which leads to the deployment of a function instance. A cold start occurs if there are no function instances available to handle that invocation, because a new function instance will have to be created. They suggested that having a function instance already available will help mitigate the cold start issue. However, Xu et al. (2019) presented a contrasting conclusion by pointing out that this strategy leads to a waste of resources. As an application scales, cold start latency can worsen. As discussed in Mahajan et al. (2019), each new function execution incurs this cold start latency. This is especially noticeable in concurrent requests, where the initialization of multiple function containers is required to handle multiple function invocations. The adoption of serverless technology for a wider range of applications will be delayed by the cold start issue (Xu et al.; 2019). They proposed two strategies to mitigate cold start latencies: an Adaptive Warm-Up (AWU) strategy and an Adaptive Container Pool Scaling (ACPS) strategy. Using time series prediction, AWU can predict, to a certain degree of accuracy, the time at which functions will be invoked. AWU then warms up containers in advance to handle the incoming invocations. ACPS manages resources to ensure efficient usage and minimal wastage, and also provides support to AWU. Future work in container provisioning using time series prediction could discuss variable timeframes for functions with unorthodox invocation histories.
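The simplest form of the warm-up idea discussed above is periodic invocation to keep a container warm. A minimal sketch follows; the endpoint is hypothetical, and an AWU-style system would derive the schedule from time series prediction rather than a fixed interval:

    import time
    import urllib.request

    # Hypothetical URL of a deployed function.
    ENDPOINT = "https://example.com/api/v1/my-action"

    def keep_warm(interval_seconds=300):
        while True:
            urllib.request.urlopen(ENDPOINT)  # lightweight ping invocation
            time.sleep(interval_seconds)      # wait before the next warm-up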

2.3.2 Improving Latency in Serverless

Container startup latency issues were tackled using a novel approach in the work of Akkus et al. (2018). Their proposed system, SAND, attempts to lessen function interaction latencies. Showing similarities is the work of Du et al. (2020), where they introduced Catalyzer, which restores a function instance from a checkpoint, eradicating the initialization process of a function instance. Akkus et al. (2018) pointed out that each serverless function executes in a different container instance. They further highlighted the lack of interaction between functions: there is no way for a function to be aware of the origin of the trigger event, which results in functions going through the full end-to-end path of function execution, thereby increasing latency. Their sandbox mechanism was able to achieve two levels of isolation: (1) isolation between applications, and (2) isolation between functions used in the same application. In their discussion, they reported that a large percentage of container startup latency is connected to the installation of libraries during container initialization. Lazy loading of libraries to reduce latency, as described in Harter et al. (2016), was suggested. Evaluating their solution, they measured startup time until function execution with a sandboxing setup that included Docker and processes developed in C, Go, Python, Java and NodeJS. Their results showed that starting an application or process in a warm container is much faster than invoking a new container. This result has also been highlighted as future work in Perez et al. (2018). The works described above tell us that the serverless computing paradigm is yet to be fully explored, as solutions to some of the problems addressed above have not come to light yet. They also justify the direction we have taken in this research work. The contrast in usability between cloud platforms is one of the reasons why developers working on IBM OpenWhisk might not be eager to switch to a different provider like AWS Lambda. There is also the issue of vendor lock-in, which deters developers from easily migrating their workloads between cloud providers. For this reason, it is critical to also develop tools for developers on other platforms so they can take full advantage of the potential of serverless computing.

3 Methodology

This research paper details an Application Dependency Packager for Serverless (ADEPS) tool. The goal of this tool is to deploy a serverless function written in Python to IBM OpenWhisk. Running custom containers on serverless runtimes as described in this work has been seen in SCAR (Perez et al.; 2018), where Docker containers were packaged as custom runtimes for serverless functions on AWS Lambda. The work detailed in this paper follows a similar methodology for IBM OpenWhisk. ADEPS packages a serverless function, together with all the dependencies required to run it, as a Docker image described by a Dockerfile, and deploys it to IBM OpenWhisk Cloud Functions through the IBM Cloud CLI. To enable this functionality, ADEPS is built as a CLI tool which can be run as a command in a Linux operating system. This tool is built with developers in mind, which is what motivated the decision to build ADEPS as a CLI tool as opposed to a Graphical User Interface. The motivation behind this research is to develop a tool that makes deploying a serverless function as easy as writing a command such as: adeps command [ ]

3.1 Tools and Technologies

The following tools and technologies were utilized in carrying out this research.

• IBM OpenWhisk (IBM Cloud)
• IBM Cloud CLI
• Docker
• Python
• Linux

IBM OpenWhisk

IBM OpenWhisk is IBM's serverless cloud offering. It is built on top of Apache OpenWhisk, an open source, self-hosted serverless environment. IBM OpenWhisk enables us to execute functions/actions in response to triggers. The diagram below represents the OpenWhisk execution sequence from client request to function execution.

Figure 3: OpenWhisk Function Execution Sequence

OpenWhisk supports languages such as NodeJS, Python, Go, Java, Scala and Ruby.

IBM Cloud CLI

The IBM Cloud CLI provides a command line interface to the REST APIs of the IBM Cloud, allowing us to send HTTP requests and interact with our services on the platform.

Docker

Docker creates execution environments, in the form of containers, in which our functions execute. Docker is essentially OS-level virtualization that allows us to load our runtime environment, and all the packages needed for the function execution, into one container. Serverless computing is built on the foundation of containers and OS virtualization; this type of virtualization is what makes it possible to provision and deprovision runtime environments at will.

Python

ADEPS is written in Python. This programming language was chosen for the ease with which it provides the functionality required in this project. The main Python library used in this work is argparse. Argparse 3 is a Python module that parses arguments obtained from command-line input. The goal of argparse is to help create user friendly command line tools. Other Python libraries used include docker, os, sys, subprocess and getpass.

4 Design Specification

In this section, we detail the design decisions that were taken during the development of this tool.

4.1 ADEPS Client

ADEPS relies on DockerHub as the repository from which it pulls the Docker image of choice before deployment to OpenWhisk. For this reason, the program provides a user input prompt to allow the user to enter the correct authorization credentials. To deploy a serverless function with ADEPS, a user has to provide some key arguments that are required for the tool to successfully carry out the deployment. The program leverages "argparse" to parse arguments that are supplied at the command line prompt. An "adeps deploy" command should be suffixed with the following arguments (a complete example invocation follows the list).

• -d / --dockerhub : The user should indicate true if they would like to build their Dockerfile and push it to a remote repository. If the user already has a Docker image in the DockerHub repository, they should indicate false.

• -r / --reponame : The user should provide the name of their DockerHub repository. This argument is still required even if that repository is currently empty. However, a repository must be created before attempting to deploy the serverless function.

• -t / --tagname : A tag name is required for every Docker image pushed to a DockerHub repository. This argument should also be provided when using the deploy command.

• -f / --functionfile : OpenWhisk requires a function file. This is the file that contains the code that will be executed upon invocation of the action. The function inside this file must be named "main", as any other name will lead to the function not being invoked. ADEPS requires the function file to exist within the same folder the deploy command is called from. In addition to the function file, the Dockerfile should also be located in the same folder if the user chooses to build and push a new Docker image to the DockerHub repository.

• -a / --actionname : Every serverless function deployed to OpenWhisk must have a name. This argument must be provided when using the deploy command.

3Argparse https://docs.python.org/3/library/argparse.html
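Putting these arguments together, a deployment might be invoked with a command like the following; the repository, tag, file and action names here are illustrative, not taken from the original work:

    adeps deploy -d true -r myuser/adeps-demo -t v1 -f main.py -a my-action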

Figure 4: ADEPS deploy flowchart

4.2 Deploying Serverless Function

ADEPS works hand in hand with the IBM Cloud CLI. A user is required to have the CLI installed, configured and authorized. IBM OpenWhisk requires a user to specify a "SPACE", such as "DEV", as part of the resources under the user's control. This namespacing allows a user to have different resources allocated to different namespaces. Upon a successful image build, ADEPS makes a call to OpenWhisk and deploys the serverless function. The user will then see the new serverless function in their list of functions, ready for invocation.
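As a sketch of this one-time setup, and assuming the standard IBM Cloud CLI commands (the org and space names are illustrative):

    ibmcloud login
    ibmcloud target -o my-org -s dev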

Algorithm 1 ADEPS pseudo code to deploy functions

    Input: args
    Result: Deploy serverless function to IBM OpenWhisk
    while user_logged_in == false do
        login_to_dockerhub()
    end
    if dockerhub == true then
        build_image()
        push_image_to_repo()
    end
    deploy_to_openwhisk()
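Expressed as a minimal Python sketch, assuming the Docker SDK and an authorized IBM Cloud CLI; the helper names are illustrative and this is not the verbatim ADEPS source:

    import getpass
    import subprocess
    import docker

    def deploy(repo, tag, action_name, build=True):
        client = docker.from_env()
        # Prompt for DockerHub credentials, as ADEPS does before any push.
        client.login(username=input("DockerHub user: "),
                     password=getpass.getpass("DockerHub password: "))
        image = "{}:{}".format(repo, tag)
        if build:
            # Build from the Dockerfile in the current directory, then push.
            client.images.build(path=".", tag=image)
            client.images.push(repo, tag=tag)
        # Register the image as an OpenWhisk action via the IBM Cloud CLI.
        subprocess.run(["ibmcloud", "fn", "action", "create",
                        action_name, "--docker", image], check=True)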

5 Implementation

In this section, we discuss how ADEPS was implemented as part of the workflow for deploying a serverless function. ADEPS was written in the Python programming language and makes use of Python modules such as os, docker, argparse, sys and subprocess. Interacting with the operating system shell is a key piece of functionality in this application, and that functionality is made possible by the os module.

5.1 Implementing with Docker SDK

The Docker SDK handles all the operations involving Docker. These operations include user authentication, image build and image push to the repository. A user is prompted to enter their DockerHub credentials for authentication. Invalid credentials will cause the prompt to show again until valid credentials are provided. Once the user has provided valid credentials, an image is built from the Dockerfile in the same directory where the deploy command is run. In the process of implementing this program, we discovered that the Dockerfile for a function deployed with ADEPS has to use the IBM Cloud image as the Linux runtime. This might be due to the configurations that are necessary to integrate the function with the IBM infrastructure. For a Python deployment, the Dockerfile should use an image similar to "ibmfunctions/action-python-v3.7". Finally, a successful image build in turn leads to that image being pushed to the DockerHub repository provided by the user.
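A Dockerfile following this pattern might look like the sketch below; the base image is the one named above, while the pip packages are illustrative:

    FROM ibmfunctions/action-python-v3.7
    RUN pip install matplotlib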

5.2 Implementing with Argparse

Argparse, as mentioned in earlier sections, is a Python module that provides command line interface capabilities to a Python application. This module parses arguments supplied in the terminal into values that can be used within the application. Argparse has an ArgumentParser() object that accepts "description" and "usage" as parameters. The description parameter holds the description of the CLI tool; this is where the developer provides a brief description of the tool. The usage parameter, on the other hand, denotes how the tool is to be used and what commands are available.

Figure 5: Argparse Argument Parser

5.2.1 The Deploy Command

The deploy command is a subcommand of "adeps", made possible with the argparse module. To implement this functionality, the constructor of the ADEPS class initiates the first level of this multi-level command structure. Within the constructor, the description and usage of the ADEPS tool are added. In our work, within the usage value, we provided the list of available commands, as seen in Figure 5. We consider this the first level of the command structure. At the second level, the user has to provide the arguments required by the "deploy" command. These arguments are detailed in Section 4. To implement this, we first check if the command provided is valid; if not, we throw an error to the user. On input of a valid command, we execute that command's function within our class; in our case, the function will be deploy(self). Thus, all arguments entered after the deploy command are parsed by the argparse instance within the deploy function. For each argument that is parsed, argparse allows us to add a "required=True" option to make it a required argument.
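A minimal sketch of this two-level dispatch pattern follows; the description, usage text and the single -a argument are illustrative, not the verbatim ADEPS source:

    import argparse
    import sys

    class ADEPS:
        def __init__(self):
            parser = argparse.ArgumentParser(
                description="Application Dependency Packager for Serverless",
                usage="adeps <command> [<args>]")
            parser.add_argument("command", help="subcommand to run")
            # Level one: parse only the first argument (the command itself).
            args = parser.parse_args(sys.argv[1:2])
            if not hasattr(self, args.command):
                parser.print_help()   # invalid command: show usage and exit
                sys.exit(1)
            getattr(self, args.command)()  # dispatch, e.g. to deploy()

        def deploy(self):
            # Level two: parse the arguments that follow the deploy command.
            parser = argparse.ArgumentParser(
                description="deploy a serverless function")
            parser.add_argument("-a", "--actionname", required=True)
            args = parser.parse_args(sys.argv[2:])
            print("deploying", args.actionname)

    if __name__ == "__main__":
        ADEPS()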

6 Evaluation

In this section, we discuss the experiments that were used to evaluate ADEPS. ADEPS provides a command line client which allows the user to deploy a serverless action, packaged with all necessary dependencies, to OpenWhisk. Our experiments used different applications that required packages not available in the default OpenWhisk runtime.

6.1 Case Study 1

6.1.1 TrakSplit

TrakSplit is an application built with NodeJS that allows a user to split an .mp3 song file into multiple parts. The application leverages Spleeter (Hennequin et al.; 2020), an open source separation library built in Python and TensorFlow that comes with pretrained models to generate a variety of separations. Spleeter involves high performance computing in order to accurately analyze and split the source audio file. The workflow of TrakSplit is described below:

• User uploads a file through the NodeJS app.

• File is stored on an IBM Cloud storage bucket.

• Upload automatically triggers invocation of serverless function.

• File name stored in parameters is extracted and transformed.

• File is downloaded from bucket to local machine.

• Spleeter splits the file on the local machine into two new audio stems.

• New stems are uploaded to a Dropbox storage.

• File download links are extracted from the metadata of the files.

• New files are sent to user by email.

Figure 6: TrakSplit Interface

TrakSplit's function logic is written in Python and depends heavily on being able to use Spleeter as part of this workflow. For this reason, ADEPS was used to package Spleeter as part of the libraries specified in a Dockerfile, which was then deployed to IBM OpenWhisk. TrakSplit depends on Spleeter, which in turn depends on Conda. Spleeter is installed with Conda alongside numerous libraries such as TensorFlow, ffmpeg, conda-forge etc. These libraries need to be available at runtime, which is why they have to be packaged in a Docker container, so that whenever that container is replicated the same runtime is guaranteed. Spleeter commands are executable commands that need to interact with the shell of our operating system. For this reason, we used the Python module "os" to make calls to our shell in such a way that the processes execute asynchronously.
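A minimal sketch of such a shell call, assuming the Spleeter CLI of that period (the file paths are illustrative):

    import os

    # Run Spleeter's two-stem separation in the background ('&') so the
    # call does not block the rest of the action.
    os.system("spleeter separate -i /tmp/track.mp3 -p spleeter:2stems -o /tmp/output &")

We tested our local action with test data using a virtual machine of the following specification: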

• Operating System: Ubuntu (64 Bit)

• Memory: 4830MB

• Video Memory: 16MB

Figure 7: Dockerfile for TrakSplit function

Figure 8: Algorithm to split tracks, upload stems to Dropbox and extract metadata

• Storage: 50GB

Our results showed some inconsistencies: we got accurate results on some tests, while Spleeter failed to split the file on others. This failure can be attributed to the system running out of memory and crashing. The compute-intensive nature of Spleeter, since it has to analyze the source audio file against pretrained models, requires large memory specifications. Sure enough, upon deployment of this action to IBM Cloud Functions, we were unable to reproduce the results we got locally, because our application ran out of memory. At the time of this writing, the maximum memory that can be allotted to a cloud function on IBM OpenWhisk is 2048MB. The inability to produce results in IBM Cloud can also be due to the container timing out before Spleeter has finished computing the results. IBM Cloud Functions, at the time of this writing, only provides a maximum of 600s of runtime.

In evaluating ADEPS using TrakSplit as a case study, we can confirm that ADEPS is indeed suitable for function deployment to IBM OpenWhisk. However, it doesn't provide any resource management functionality and is completely unaware of resource deficiencies in the infrastructure it deploys a serverless function to.

6.2 Case Study 2

6.2.1 Graph Plot with Matplotlib

To further evaluate ADEPS, we built a simple application that plots a chart based on values provided by a user. The aim of building this application was to make use of the "matplotlib" Python module. The matplotlib library is not available on the IBM OpenWhisk Python runtime, which made it a good use case for evaluating our tool. Our graph plot application accepts values for the X axis and values for the Y axis from a user, and the following operations are performed in our serverless action (a sketch of these steps follows the list):

• Extract the X and Y axis parameters.

• Plot the graph using matplotlib.

• Extract the image and save to a local directory.

• Compress the saved image using Tinify and save the optimized version.

• Upload compressed image to a Cloudinary image repository.
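A minimal sketch of these steps, assuming the tinify and cloudinary Python clients (keys, paths and parameter names are illustrative, not the verbatim action source):

    import matplotlib
    matplotlib.use("Agg")  # headless rendering inside the container
    import matplotlib.pyplot as plt
    import tinify
    import cloudinary.uploader

    def main(params):
        # 1. Extract the X and Y axis parameters (defaults for illustration).
        x = params.get("x", [1, 2, 3])
        y = params.get("y", [2, 4, 6])
        # 2. Plot the graph and save the image to a local directory.
        plt.plot(x, y)
        plt.savefig("/tmp/plot.png")
        # 3. Compress the saved image via the Tinify API.
        tinify.key = params["tinify_key"]
        tinify.from_file("/tmp/plot.png").to_file("/tmp/plot_min.png")
        # 4. Upload the compressed image to Cloudinary (credentials assumed
        #    configured, e.g. via the CLOUDINARY_URL environment variable).
        result = cloudinary.uploader.upload("/tmp/plot_min.png")
        return {"url": result.get("secure_url")}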

Figure 9: Graph plotting application interface

This case study also gave us the opportunity to use Tinify and Cloudinary, two additional Python modules that are not included in the OpenWhisk runtime. Tinify is a service that compresses images over an API and returns an optimized version of the image. Cloudinary, on the other hand, is an image repository where images can be saved externally and also modified remotely.

Figure 10: Graph plotting application Dockerfile

ADEPS was able to successfully build the image from the Dockerfile shown in Figure 10 and deploy the serverless function to IBM OpenWhisk. The function produced the same results at runtime and didn't show any inconsistencies.

Figure 11: Graph plotting application results

Our final application was able to collect input from a user, send it as a POST request to the endpoint of our serverless function, and pull the correct image from the Cloudinary repository to display to the user. ADEPS was able to reduce the total deploy time for this function by up to 40%. Using the graph plotting application, we plotted a graph of the invocation times of 10 invocations. The results were obtained from the console of IBM Cloud Functions.

Figure 12: Latency profile of graph plotting application

6.3 Discussion

From Figure 12, we can see the latency profile of 10 subsequent invocations of our graph plotting serverless function. The first invocation showed a startup time of 3.5s, and the maximum invocation time after that was 1.9s. This obvious difference in startup times can be attributed to the cold start latency issue described in earlier sections. What this further tells us is that IBM OpenWhisk keeps a warm container running for future invocations after the first invocation. This method of mitigating the cold start issue has also been discussed in earlier sections. In the process of evaluating this tool, we came across some limitations. Unfortunately, some of these limitations are associated with the entire serverless paradigm and serverless cloud providers. Functions that require a high startup time simply couldn't work, even with their dependencies properly packaged. The reason for this is the limited runtime of each serverless function on OpenWhisk. As mentioned earlier, the maximum runtime for an OpenWhisk function is 600s (10 minutes). In addition, the maximum memory that can be allocated to a serverless function on OpenWhisk is 2048MB, so applications that require a large amount of memory - for example, model training in machine learning - might not produce consistent results with serverless computing. ADEPS didn't show any inconsistencies when deploying resource efficient functions that execute within the 600s timeframe. Testing functions locally is always a good idea, because serverless functions can sometimes be hard to debug once they are in their runtime environments. If a function works as it is supposed to on localhost, the same behavior should be expected in a serverless runtime environment. Serverless gives us the opportunity to truly decouple large architectures into smaller components that are independent of each other. The future of serverless computing is bright, and that future should involve better resource allocation as more resource intensive workflows migrate to serverless deployments.

7 Conclusion and Future Work

In this research paper, we introduced ADEPS - a tool to deploy serverless functions with their dependencies to a serverless runtime environment. With this tool, we were able to take advantage of the full potential that serverless computing offers. Writing backend logic and deploying it to an environment that is easily scalable is transforming the way we write and think about our applications. Function invocation as a response to an event trigger is ideal for short bursty workloads, automation workflows and even IoT and machine learning workflows. The purpose of this research was to propose a solution to the restriction of packages on the IBM OpenWhisk runtime. As highlighted in our research question, it was imperative to develop a solution that leverages Docker containers as serverless runtime environments. This work succeeds in packaging applications with a compute time of less than 10 minutes and RAM specifications under 2GB. For resource intensive workloads, ADEPS still manages to package and deploy the function as required, but the function might exhibit some inconsistencies at runtime. ADEPS fails to accommodate functions developed in other programming languages: as of this writing, ADEPS only packages functions written in Python, and functions written in other languages might show some inconsistencies. Integrating a deploy pipeline for programs written in other programming languages can be highlighted as future work in this field. Future researchers could also implement resource management functionality that proposes efficient ways to deploy resource intensive workflows.

References

Akkus, I. E., Chen, R., Rimac, I., Stein, M., Satzke, K., Beck, A., Aditya, P. and Hilt, V. (2018). SAND: Towards high-performance serverless computing, 2018 USENIX Annual Technical Conference (USENIX ATC 18), Boston, MA, pp. 923–935.

Baldini, I., Castro, P., Chang, K., Cheng, P., Fink, S., Ishakian, V., Mitchell, N., Muthusamy, V., Rabbah, R., Slominski, A. and Suter, P. (2017). Serverless Computing: Current Trends and Open Problems, pp. 1–20.

Barrett, D. and Kipper, G. (2010). 1 - How virtualization happens, in D. Barrett and G. Kipper (eds), Virtualization and Forensics, Syngress, Boston, pp. 3–24.

Bernstein, D. (2014). Containers and cloud: From LXC to Docker to Kubernetes, IEEE Cloud Computing 1(3): 81–84.

Du, D., Yu, T., Xia, Y., Zang, B., Yan, G., Qin, C., Wu, Q. and Chen, H. (2020). Catalyzer: Sub-millisecond startup for serverless computing with initialization-less booting, Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '20, Association for Computing Machinery, New York, NY, USA, p. 467–481.

Fouladi, S., Wahby, R. S., Shacklett, B., Balasubramaniam, K. V., Zeng, W., Bhalerao, R., Sivaraman, A., Porter, G. and Winstein, K. (2017). Encoding, fast and slow: Low-latency video processing using thousands of tiny threads, 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), USENIX Association, Boston, MA, pp. 363–376.

Gomes, J., Bagnaschi, E., Campos, I., David, M., Alves, L., Martins, J., Pina, J., López-García, A. and Orviz, P. (2018). Enabling rootless Linux containers in multi-user environments: The udocker tool, Computer Physics Communications 232: 84–97.

Harter, T., Salmon, B., Liu, R., Arpaci-Dusseau, A. C. and Arpaci-Dusseau, R. H. (2016). Slacker: Fast distribution with lazy Docker containers, Proceedings of the 14th USENIX Conference on File and Storage Technologies, FAST'16, USENIX Association, USA, p. 181–195.

Hellerstein, J. M., Faleiro, J. M., Gonzalez, J. E., Schleier-Smith, J., Sreekanti, V., Tumanov, A. and Wu, C. (2018). Serverless computing: One step forward, two steps back, CoRR abs/1812.03651.

Hennequin, R., Khlif, A., Voituret, F. and Moussallam, M. (2020). Spleeter: a fast and efficient music source separation tool with pre-trained models, Journal of Open Source Software 5(50): 2154. Deezer Research.

Jonas, E., Schleier-Smith, J., Sreekanti, V., Tsai, C.-C., Khandelwal, A., Pu, Q., Shankar, V., Menezes Carreira, J., Krauth, K., Yadwadkar, N., Gonzalez, J., Popa, R. A., Stoica, I. and Patterson, D. A. (2019). Cloud programming simplified: A Berkeley view on serverless computing, Technical Report UCB/EECS-2019-3, EECS Department, University of California, Berkeley.

Lee, H., Satyam, K. and Fox, G. (2018). Evaluation of production serverless computing environments, 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pp. 442–450.

Mahajan, K., Mahajan, S., Misra, V. and Rubenstein, D. (2019). Exploiting content similarity to address cold start in container deployments, Proceedings of the 15th International Conference on Emerging Networking EXperiments and Technologies, CoNEXT '19, Association for Computing Machinery, New York, NY, USA, p. 37–39.

Mengistu, T., Alahmadi, A., Albuali, A., Alsenani, Y. and Che, D. (2017). A ”no data center” solution to cloud computing, 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), pp. 714–717.

Oakes, E., Yang, L., Zhou, D., Houck, K., Harter, T., Arpaci-Dusseau, A. C. and Arpaci-Dusseau, R. H. (2018). SOCK: Rapid task provisioning with serverless-optimized containers, Proceedings of the 2018 USENIX Conference on USENIX Annual Technical Conference, USENIX ATC '18, USENIX Association, USA, p. 57–69.

Perez, A., Moltó, G., Caballer, M. and Calatrava, A. (2018). Serverless computing for container-based architectures, Future Generation Computer Systems 83: 50–59.

Priedhorsky, R. and Randles, T. (2017). Charliecloud: Unprivileged containers for user-defined software stacks in HPC, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '17, Association for Computing Machinery, New York, NY, USA.

Pérez, A., Risco, S., Naranjo, D. M., Caballer, M. and Moltó, G. (2019). On-premises serverless computing for event-driven data processing applications, 2019 IEEE 12th International Conference on Cloud Computing (CLOUD), pp. 414–421.

Shahrad, M., Balkind, J. and Wentzlaff, D. (2019). Architectural implications of function-as-a-service computing, Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO '52, Association for Computing Machinery, New York, NY, USA, p. 1063–1075.

van Eyk, E., Grohmann, J., Eismann, S., Bauer, A., Versluis, L., Toader, L., Schmitt, N., Herbst, N., Abad, C. L. and Iosup, A. (2019). The SPEC-RG reference architecture for FaaS: From microservices and containers to serverless platforms, IEEE Internet Computing 23(6): 7–18.

Xu, Z., Zhang, H., Geng, X., Wu, Q. and Ma, H. (2019). Adaptive function launching acceleration in serverless computing platforms, 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS), pp. 9–16.
