Evolution of Cloud Operating System: from Technology to Ecosystem

Chen ZN, Chen K, Jiang JL et al. Evolution of cloud operating system: From technology to ecosystem. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 32(2): 224–241 Mar. 2017. DOI 10.1007/s11390-017-1717-z Evolution of Cloud Operating System: From Technology to Ecosystem Zuo-Ning Chen 1, Fellow, CCF, Kang Chen 1, Jin-Lei Jiang 1, Member, CCF, ACM, IEEE, Lu-Fei Zhang 2 Song Wu 3, Member, CCF, IEEE, Zheng-Wei Qi 4, Member, CCF, ACM, IEEE Chun-Ming Hu 5, Member, CCF, IEEE, Yong-Wei Wu 1, Senior Member, CCF, IEEE Yu-Zhong Sun 6, Member, CCF, IEEE, Hong Tang 7, Ao-Bing Sun 8, and Zi-Lu Kang 9 1Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China 2Jiangnan Institute of Computing Technology, Wuxi 214083, China 3School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China 4School of Software, Shanghai Jiao Tong University, Shanghai 200240, China 5School of Computer Science and Engineering, Beihang University, Beijing 100191, China 6Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China 7Alibaba Cloud Computing Inc., Hangzhou 310024, China 8G-Cloud Technology Inc., Dongguan 523808, China 9Institute of Technology of Internet of Things, Information Science Academy of China Electronics Technology Group Corporation, Beijing 100081, China E-mail: [email protected]; {chenkang, jjlei}@tsinghua.edu.cn; [email protected] E-mail: [email protected]; [email protected]; [email protected]; [email protected] E-mail: [email protected]; [email protected]; [email protected]; [email protected] Received November 14, 2016; revised December 16, 2016. Abstract The cloud operating system (cloud OS) is used for managing the cloud resources such that they can be used effectively and efficiently. And also it is the duty of cloud OS to provide convenient interface for users and applications. However, these two goals are often conflicting because convenient abstraction usually needs more computing resources. Thus, the cloud OS has its own characteristics of resource management and task scheduling for supporting various kinds of cloud applications. The evolution of cloud OS is in fact driven by these two often conflicting goals and finding the right tradeoff between them makes each phase of the evolution happen. In this paper, we have investigated the ways of cloud OS evolution from three different aspects: enabling technology evolution, OS architecture evolution and cloud ecosystem evolution. We show that finding the appropriate APIs (application programming interfaces) is critical for the next phase of cloud OS evolution. Convenient interfaces need to be provided without scarifying efficiency when APIs are chosen. We present an API-driven cloud OS practice, showing the great capability of APIs for developing a better cloud OS and helping build and run the cloud ecosystem healthily. Keywords cloud computing, operating system, architecture evolution, virtualization, cloud ecosystem 1 Introduction and domestically○4 . Many traditional applications have been migrated to the cloud systems. For developers 1.1 Cloud Operating System and service providers, services and applications are In recent years, cloud computing systems have been hosted without the concern about infrastructure con- becoming more and more prevalent both abroad○1 ∼○3 struction, application deployment, hardware updating Survey Special Section on MOST Cloud and Big Data The work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000500. ○1 https://aws.amazon.com/cn/documentation/, Nov. 2016. ○2 https://www.azure.cn/documentation/, Nov. 2016. ○3 https://cloud.google.com/, Nov. 2016. ○4 https://develop.aliyun.com/, Nov. 2016. ©2017 Springer Science + Business Media, LLC & Science Press, China Zuo-Ning Chen et al.: Evolution of Cloud Operating System: From Technology to Ecosystem 225 or data center maintenance. Now, the platform compo- Rich More nents running inside the cloud providers are considered as cloud operating systems (OS). The cloud OS[1], simi- [2-6] lar to traditional OS managing bare metal hardware SaaS and the software resources, goes through a serial of evolution and now enters the next stage of evolution. We start our discussion with the functions that can be pro- PaaS vided by cloud OS[1]. And towards the end of this paper, we hope to provide a guideline for the next stage Cloud OS Functionalities IaaS of evolution. Cloud computing systems[1,7-9] are often built from the existing single machine OS, usually Linux. Now, Ecosystem Services the other operating systems like Windows have been used in the data center infrastructure. Most of the OS Kernel cloud computing system components are constructed [2 3] Core API IaaS API as user applications - from the traditional OS point PaaS API API SaaS of view. However, the platform for running cloud appli- API Levels High cations can still be called an OS as the platform serves the same two critical goals as a traditional OS. On the Fig.1. Cloud OS API levels and the ecosystem services. one hand, the cloud OS is used for managing the large- scale distributed computing resources, similar to the 1.2 Requirements of Cloud OS traditional OS managing hardware in a single machine. In a single computer, it is quite clear that there On the other hand, the cloud OS provides abstraction needs an OS abstracting the bare metal hardware for for running user applications. APIs (application pro- facilitating the applications as well as users to use gramming interfaces) are provided for programmers to the computing resources conveniently. However, most develop cloud applications. This is similar to the case of the components in the cloud OS are implemented that a local OS provides system calls for servicing appli- as user-level programs. The fundamental necessity of cations. For example, file systems are used for providing building a cloud OS needs further discussion as cloud storage APIs instead of exposing the block operation di- applications can always be built directly on the current rectly from disks in local OS. Cloud OS provides similar operating systems without concerning the management APIs to a distributed file system. Besides programming of hardware in the kernel mode. The true necessity APIs, some other facilities will be provided for running relies on the programming APIs and runtime support the system smoothly in the cloud. Job scheduler is such in the cloud environment[13,18-21]. Local OSes mainly [10-12] a service for scheduling jobs . Local OS schedules focus on a single machine. They usually do not con- processes while cloud OS schedules distributed jobs. sider any cloud application logically as a whole uni- Data management is also a very important component fied application spanning over a large number of ma- [13-18] in the cloud OS . Users and applications need to chines. Thus traditional OSes just provide the basic track the data flow inside their applications running in facilities for communications. In addition, cloud ap- the cloud environment. Data processing applications es- plications have different forms such as micro services pecially need the thorough understanding of their data for building web applications, batch processing for big flow among different components in the cloud OS. Fig.1 data analysis[19-21], query-based data analytics[18,22-23], shows the cloud OS API levels as well as various services and real time processing of streaming data[24-26]. All supported. Core APIs are relatively stable and used for the applications have their own internal logical orga- managing the underlying infrastructure and hiding re- nizations of multiple components running on multiple, sources heterogeneity and distribution. On top of core or even thousands of machines. Exposing very sim- APIs are other APIs that provide more functionalities, ple communication interfaces is just not enough. Deve- easing the burden of building more higher levels of cloud lopers need higher levels of abstractions for building services and applications. Through this way, the cloud their applications without struggling with details re- ecosystem can be built and cloud services and applica- lated to the complex communication patterns and ma- tions can proliferate. chine architecture. Exposing APIs that can run on 226 J. Comput. Sci. & Technol., Mar. 2017, Vol.32, No.2 top of multiple machines will be very helpful. This is own cloud OS under different considerations. Amazon a very common case for the current cloud OS practi- started with virtual machine (VM) instances for deve- tioners such as Google GAE○5 , Amazon AWS○6 , Mi- lopers and administrators to start building their own crosoft Azure○7 and Alibaba Cloud○8 . Cloud appli- systems, releasing them from the maintenance of the cation developers can get the programming interfaces underlying hardware. At the same time, Google started without considering the physical resources. For exam- its cloud OS considering large amount of data process- ple, developers can store objects in the cloud without ing. These are the two main pioneers building the knowing the final disks storing the data[14]. Program- cloud OS. The services provided by these two companies ming is always about abstractions and we need the are quite different. The followers, such as Microsoft, cloud OS to provide the cloud programming abstrac- started with the similar services with different inter- tion for building cloud applications. The other reason faces exposed. For business reasons, the interfaces are why a cloud OS is necessary is that running cloud appli- proprietary, leading to open source versions to provide cation is different from running applications in a single similar but slightly different interfaces. Thus, the cloud machine. For each single machine OS, a task sched- application developers have to face the problem of ven- uler will be used for managing the processes created in dor lock-in which means that applications built for one the system.

Load more