Academic Cloud Computing Research: Five Pitfalls and Five Opportunities
Adam Barker, Blesson Varghese, Jonathan Stuart Ward and Ian Sommerville
School of Computer Science, University of St Andrews, UK

Abstract

This discussion paper argues that there are five fundamental pitfalls, which can restrict academics from conducting cloud computing research at the infrastructure level, which is currently where the vast majority of academic research lies. Instead academics should be conducting higher risk research, in order to gain understanding and open up entirely new areas.

We call for a renewed mindset and argue that academic research should focus less upon physical infrastructure and embrace the abstractions provided by clouds through five opportunities: user driven research, new programming models, PaaS environments, and improved tools to support elasticity and large-scale debugging. The objective of this paper is to foster discussion, and to define a roadmap forward, which will allow academia to make longer-term impacts to the cloud computing community.

1 Introduction

Imagine a world where there was no academic cloud computing research. Would the field of cloud computing really be any different from what it is today? Would it have evolved in the same way? This paper argues that academics are addressing the wrong class of problems to have any lasting impact, and that cloud computing has evolved without substantial influence from academia.

The core of this problem comes down to scale: academics generally do not have low-level infrastructure access (e.g., compute, storage, network) to data centre sized resources, in order to design, deploy and test algorithms. To circumvent this access problem academics simulate the behaviour of data centres, and run experiments on small (toy) private cloud deployments. If this research is then deployed at scale, there is the ever present risk it is non-representative of real world data centres, and is therefore of limited value to the community. In addition, a large proportion of academic research focuses on incremental improvements to existing frameworks created by industry, e.g., Hadoop optimisation for a small set of edge cases. Typically these contributions are short-lived and have no lasting impact on the cloud computing community; not because these contributions are irrelevant, but because these kinds of problems are already being solved by industry, who do have access to low-level infrastructure at scale [26].

Instead academics should be conducting higher risk research, which sheds light on more complex problems, in order to gain understanding and open up entirely new areas. We expect to see genuine cloud research focus less upon physical infrastructure and continue to embrace abstraction through user driven research, programming models, and Platform as a Service (PaaS) clouds. In addition, research needs to focus on providing software engineers with better tools to program truly elastic applications, and debug large-scale applications, which utilise existing cloud infrastructure. The objective of this paper is to foster discussion, and to define a roadmap forward, which will allow academia to make longer-term impacts. Many of these observations have been derived from an eight-year UK research programme focused on Large Scale Complex IT Systems (LSCITS) [25].

2 Five Pitfalls of Academic Research

Pitfall 1: Infrastructure at Scale

The presence of unused capacity within large-scale data centres led to the development of what became known as cloud computing. Therefore, since its inception, cloud computing and scale are inseparably linked. Today, rapid scalability and on-demand resource provisioning are afforded to users through the ever increasing scale of cloud providers. Current estimates of the scale of Amazon EC2, the dominant public cloud provider, vary between 156,225 and 454,400 servers [27]. Amazon Web Services (AWS) is now the largest web host on the planet, hosting numerous high profile customers including Netflix, Instagram, DuckDuckGo and Reddit.

The scale of AWS and equivalent cloud providers (in terms of servers, users, data, etc.) cannot be easily replicated. To produce a similar environment requires significant capital expenditure, which is extremely prohibitive for academic researchers. At best, academic researchers can hope to reproduce a tiny subsection of the resources available through AWS. Typically this is achieved through the use of a small testbed built from commodity hardware running, for example, OpenStack [30]. Such deployments typically number between the tens and hundreds of servers, supported by gigabit Ethernet and commodity network attached storage. As such they cannot come close to replicating the scale of the compute, network or storage capacity that is expected from even the smallest public clouds.

In order to perform an evaluation which indicates performance on a public cloud, there is no substitute for a real world public cloud deployment. Researchers who wish to investigate applications or phenomena running on top of Virtual Machines (VMs) and storage from a public cloud provider can feasibly do so. Any research which must be evaluated at scale can incur significant costs but remains a feasible means of evaluation.

If however research is focused upon low-level compute, storage, network, energy or other systems there is the ever present risk of producing an evaluation which is non-representative of real world cloud deployments. Without access, academics cannot modify and evaluate low-level cloud mechanics, making useful research in this area all but infeasible. Work which is targeted at the cloud infrastructure level and evaluated on a non-representative private testbed can at best deliver interim results, which suggest how it may behave when deployed in a real world context.

This has caused a dichotomy within academic cloud research: between those researchers with access and partnership with cloud providers and those without. Two recent examples illustrate the point: the latest best paper award at IEEE Cloud 2013 [31] was awarded for collaborative research between Princeton and NEC, and the best paper at SIGCHI (non-cloud research but requires access to data) was awarded for research between Facebook and Carnegie Mellon University [12]. This access problem prevents academics from having any lasting impact, and has significant implications for systems research investigating the lower levels of the cloud stack. For academics that are used to low-level access this barrier to research is unprecedented.

There are efforts underway with the European Union BONFIRE [9] project, and Open Cirrus [29] to provide academics with access to large-scale resources. Individual academics are predominantly concerned with research and not maintaining testbeds and research systems. A high turnover of researchers and an emphasis on research (and not maintenance) often leads to the neglect and eventually the premature abandonment of testbeds. To ensure the longevity of a research system, with appropriate documentation and maintenance, dedicated staff are necessary. This cannot be easily afforded by individual institutions. Academics need to collaborate, self organise and build their own data centres at the national level (through government funding etc.) in order to provide the facilities and maintenance necessary for academics to conduct meaningful cloud computing research.

Pitfall 2: Abstraction

Abstraction is one of the key features of the cloud computing model. The main principle of public cloud computing models is that a user can make use of VMs as if they were offered as a service, and the complexities of managing and maintaining hardware are abstracted from an end user. As a result, such an abstraction yields an ‘observation-only’ or ‘black box’ model. Unlike the grid, the low-level machine details are concealed on the cloud. This has a number of compelling advantages over grid computing architectures including elastic on-demand architecture and high service availability [6].

However, in academic cloud computing research, such an abstraction in which the cloud is merely used as a service or tool for computing has been of minimal interest, e.g., deployment of a physics application on the cloud to compare performance with a cluster environment. In addition, a lot of academic cloud computing research has a reverse engineering focus, where low-level infrastructure components are often (poorly) reimplemented for the purposes of research prototypes. This not only contradicts the underlying principle of public cloud computing models (to be used as a service), but such research meets with little support from the cloud providers themselves. However, research which has built on top of existing cloud platforms in order to provide new functionality has been enormously successful. RightScale [32] is an exemplar of how academic cloud computing research has added innovative new features and been commercialised.

Pitfall 3: Non-reproducible Results

As academics do not have low-level access to data centre resources, why not simulate a data centre instead? This is a reasonable response from academia to the problem of access to such large-scale resources, and has worked well in the networking community, e.g., NS3 [28]. There are multiple simulators, which attempt to model and simulate a cloud computing environment, including CloudSim [13] and GreenCloud [24]. Simulations can be effectively used as a prototyping mechanism to provide a rough idea of how a particular algorithm may perform. The problem with each of these simulation tools is that it is very difficult to verify if the simulation environment is an accurate representation of a real world data centre environment.

The novelty of the cloud over preceding infrastructures such as clusters and grids includes features such as elasticity, scalability, on-demand availability and minimal user maintenance of the resources [18]. The majority of academic cloud computing research does not require any of these core properties, which differentiate clouds