System Software Techniques to Enhance Reliability of Modern Platforms


UNIVERSITY OF THESSALY, PhD Thesis

System software techniques to enhance reliability of modern platforms.

Author: Konstantinos PARASYRIS. Supervisor: Nikolaos BELLAS. Advising committee: Nikolaos BELLAS, Spyros LALIS, Christos D. ANTONOPOULOS.

A thesis submitted in fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Electrical and Computer Engineering. October 17, 2018.

"Optimism is the faith that leads to achievement. Nothing can be done without hope and confidence." (Helen Keller)

Abstract

Konstantinos PARASYRIS, System software techniques to enhance reliability of modern platforms.

Chip manufacturers introduce redundancy at various levels of CPU design to guarantee correct operation even for worst-case combinations of non-idealities in process variation and system operating conditions. This redundancy is implemented partly in the form of wide voltage margins. This PhD dissertation is based on the concept that these conservative design rules are mostly excessive, as they account for execution scenarios that rarely appear during the lifetime of the systems. If the resulting faults are ignored, they lead to application crashes or even system-wide failures. Our software-based approach handles these faults to enable execution under such conditions. It rests on the observation that in many application domains it is not the exact output that matters but a good-enough estimate of it. Therefore, we propose a programming model in which the developer can declare which parts of the application are more significant than others. The programming model extends OpenMP, a widely used parallel programming model. The developer provides information about the significance of computations, and the programming model exposes a parameter, called ratio, which controls the extent of quality degradation and energy efficiency.

The idea of significance-aware computing is applied to two different computing paradigms: fault-tolerant computing and approximate computing. For fault-tolerant computing, we implement a significance-centric programming model and runtime support which set the supply voltage of a multicore CPU to sub-nominal values to reduce the energy footprint, and which provide mechanisms to control output quality. Developers specify the significance of application tasks according to their contribution to the output quality, and provide check and repair functions for handling faults. On a multicore system, we evaluate our approach using an energy model which quantifies the energy reduction. When executing the least significant tasks unreliably, our approach leads to 20% CPU energy reduction with respect to a reliable execution, with minimal quality degradation.

For approximate computing, we implement a similar programming model that combines the significance and ratio features. Using analytical models of the application's energy consumption, the approach can efficiently decide the degree of approximation required to meet user-defined energy requirements.
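The abstract describes the programming model only at a high level. As a rough, hypothetical illustration of the significance, ratio, and check/repair concepts it names, the following self-contained C sketch emulates them in software; the type and function names, the emulated fault, and the ratio policy are all assumptions of this example, not the thesis's actual OpenMP extension or runtime API.

```c
/* Minimal emulation of significance-aware tasking: each task carries a
 * significance flag plus check/repair handlers, and a 'ratio' knob decides
 * what fraction of non-significant tasks still execute reliably. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int significant;                           /* high contribution to output quality */
    void (*body)(double *out, double in);      /* the task computation */
    int  (*check)(double out);                 /* detects a corrupted result */
    void (*repair)(double *out, double in);    /* cheap fallback on failure */
} task_t;

static void square(double *out, double in)      { *out = in * in; }
static int  finite_check(double out)            { return isfinite(out); }
static void zero_repair(double *out, double in) { (void)in; *out = 0.0; }

/* ratio in [0,1]: fraction of non-significant tasks forced to run reliably.
 * "Unreliable" execution is emulated by randomly corrupting a result,
 * standing in for timing faults at sub-nominal supply voltage. */
static double run_tasks(task_t *t, const double *x, double *y, int n, double ratio) {
    for (int i = 0; i < n; i++) {
        int reliable = t[i].significant || (rand() / (double)RAND_MAX) < ratio;
        t[i].body(&y[i], x[i]);
        if (!reliable && rand() % 100 == 0)      /* emulated voltage fault */
            y[i] = NAN;
        if (!t[i].check(y[i]))                   /* check & repair path */
            t[i].repair(&y[i], x[i]);
    }
    double sum = 0.0;                            /* final reduction runs reliably */
    for (int i = 0; i < n; i++) sum += y[i];
    return sum;
}

int main(void) {
    enum { N = 16 };
    task_t tasks[N];
    double x[N], y[N];
    for (int i = 0; i < N; i++) {
        x[i] = i;
        tasks[i] = (task_t){ .significant = i < N / 4,  /* first quarter matters most */
                             .body = square, .check = finite_check,
                             .repair = zero_repair };
    }
    printf("sum = %f\n", run_tasks(tasks, x, y, N, 0.8));
    return 0;
}
```

In this emulation a failed check substitutes the neutral element of the reduction rather than crashing the run, which mirrors the controlled quality degradation described above.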
There are many application domains which require their computations to be performed without any errors. Our study on the x86-64 Haswell and Skylake multicore microarchitectures reveals wide voltage margins which can be removed without observing any errors. These margins can reach up to 22% and 13% of the nominal supply voltage for the Skylake and Haswell architectures, respectively. The margins vary across different microarchitectures, across different chip parts of the same microarchitecture, and across different workloads.

We introduce a model which can be used dynamically to adjust the supply voltage of modern multicore x86-64 CPUs to just above the minimum required for safe operation. We identify a set of performance metrics, directly measurable via performance monitoring counters, with high predictive value for the minimum tolerable supply voltage (Vmin). We use benchmarks that vary in terms of application domain, resource utilization and pressure, and software/hardware interaction characteristics to train a Vmin prediction model. Finally, at execution time those metrics are monitored and serve as input to the model, in order to predict and apply the appropriate Vmin for the workload. Compared to conventional approaches, our methodology achieves up to 42% energy savings for the Skylake family and 34% for the Haswell family for complex, real-world applications.

Last but not least, during the course of the thesis we implemented the infrastructure to accurately observe the resiliency of applications to faults, as well as to identify the voltage and frequency margins of modern processors. GemFI is a fault injection tool based on the cycle-accurate full-system simulator Gem5. GemFI provides fault injection methods and is easily extensible to support future fault models. It also supports multiple processor models and ISAs, and allows fault injection in both functional and cycle-accurate simulations. GemFI offers fast-forwarding of simulation campaigns via checkpointing. Moreover, it facilitates the parallel execution of campaign experiments on a network of workstations. XM2 enables the evaluation of software on systems operating outside their nominal margins. It supports both bare-metal and OS-controlled execution, offers an API to control the fault injection procedure, and provides automatic management of experimental campaigns.
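To make the counter-driven Vmin model concrete, here is a minimal sketch assuming a linear model over a handful of counter-derived metrics with offline-trained coefficients. The metric set, the coefficients, and the guardband below are placeholder assumptions for illustration, not values from the dissertation.

```c
/* Illustrative Vmin predictor: metrics derived from performance monitoring
 * counters feed a linear model trained offline; the predicted minimum safe
 * voltage plus a small guardband is then applied. */
#include <stdio.h>

#define NMETRICS 4  /* e.g. IPC, L2 miss rate, branch mispredict rate, memory bandwidth */

static const double coef[NMETRICS] = { -0.012, 0.034, 0.008, 0.021 };
static const double intercept = 0.68;    /* volts (placeholder) */
static const double guardband = 0.010;   /* volts of safety margin (placeholder) */

double predict_vmin(const double metrics[NMETRICS]) {
    double v = intercept;
    for (int i = 0; i < NMETRICS; i++)
        v += coef[i] * metrics[i];       /* linear model over counter metrics */
    return v + guardband;                /* set supply just above predicted Vmin */
}

int main(void) {
    /* One monitoring interval's metrics (made-up sample values). */
    double sample[NMETRICS] = { 1.8, 0.05, 0.02, 0.6 };
    printf("apply Vdd = %.3f V\n", predict_vmin(sample));
    return 0;
}
```

Actually writing the predicted voltage to the hardware would go through a platform-specific voltage control interface, which is left abstract here.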
Περίληψη (Abstract in Greek)

Konstantinos Parasyris, "Increasing the resilience of applications to faults on modern platforms."

For the past decades the semiconductor industry has relied on Moore's law, which predicts that the number of transistors per unit area in integrated circuits based on CMOS technology doubles every 18 months. In contrast to the past (before 2004), when processor designers used the growing transistor counts to pursue a corresponding increase in performance, the new reality makes the reduction of power (and energy) consumption the greatest challenge in processor design. At the same time, the high packing density of transistors leads to unreliable operation of modern processors. This unreliability is due in part to dynamic fluctuations of current and supply voltage, which are more likely to cause timing errors at small CMOS geometries. Fabrication faults caused by imperfections of the photolithography process are also more likely, and transient faults caused by external factors, such as alpha particles, have a larger impact at smaller CMOS geometries. To achieve reliable operation under these conditions, the designers of modern processing systems use conservative design techniques, such as wide margins on the supply voltage (Vdd) and the clock frequency, so that the processor is protected against any possibility of timing errors. These conservative techniques may safeguard the reliable operation of the processor, but they result in a large waste of power and energy, which in several cases reaches up to 35%.

The key idea of this doctoral dissertation is that these conservative design techniques are almost always unnecessary, since they correspond to operating conditions that will almost never occur simultaneously during the operation of the processor. Using techniques mainly at the level of system software and applications, the dissertation proposes operating the processor very close to its extreme operating points, eliminating most of the design margin. For example, dynamically reducing the supply voltage of a processor during operation can bring large improvements in its power consumption but, if left unchecked, may produce erroneous results or even abruptly interrupt its operation. On the other hand, increasing the clock frequency of a processor may improve performance and reduce execution time, but it can compromise the reliability of its operation.

The dissertation builds on the idea that for many applications (or individual phases of applications) the exact result either does not matter, or is too expensive in machine cycles and energy consumption to be worth computing. We propose a new programming model in which the programmer can characterize the significance of the different parts of an application and their contribution to the quality of the final result. The programming model extends the well-known OpenMP model, which is widely used in parallel programming. Through the significance information provided by the programmer
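Returning to the injection infrastructure described in the English abstract: GemFI performs its injections inside the Gem5 simulator, but the transient-fault model itself is simple to illustrate. The toy snippet below performs the core step of a single-event-upset experiment, flipping one randomly chosen bit of a value standing in for architectural state; the surrounding campaign logic (choosing the target instruction and cycle, classifying the outcome) is omitted and the snippet is not GemFI code.

```c
/* Toy single-bit-flip injector, illustrating the transient-fault model
 * that simulator-level tools such as GemFI implement. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

uint64_t flip_random_bit(uint64_t value) {
    int bit = rand() % 64;              /* uniformly chosen bit position */
    return value ^ (1ULL << bit);       /* emulated single-event upset */
}

int main(void) {
    srand((unsigned)time(NULL));
    uint64_t reg = 42;                  /* pretend architectural state */
    printf("before injection: %llu\n", (unsigned long long)reg);
    reg = flip_random_bit(reg);         /* the injected transient fault */
    printf("after injection:  %llu\n", (unsigned long long)reg);
    /* In a real campaign the run then continues to completion and is
     * classified as masked, silent data corruption, or crash. */
    return 0;
}
```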
Recommended publications
  • Software Testing: Essential Phase of SDLC and a Comparative Study of Software Testing Techniques
    International Journal of System and Software Engineering, Volume 5, Issue 2, December 2017, ISSN: 2321-6107. Software Testing: Essential Phase of SDLC and a Comparative Study of Software Testing Techniques. Sushma Malik, Assistant Professor, Institute of Innovation in Technology and Management, Janak Puri, New Delhi, India. Email: [email protected] Abstract: The Software Development Life-Cycle (SDLC) comprises the different activities that are used in the development of a software product. SDLC is also called the software process, and it is the lifeline of any software development model. Software processes decide the survival of a particular software development model in the market as well as in a software organization, and software testing is the process of finding software bugs while executing a program so that we get zero-defect software. The main objective of software testing is to evaluate the competence and usability of a software product. Software testing is an important part of the SDLC, because it is through testing that the quality of the software is established. Many advancements have been made through various verification techniques, but software still needs to be fully tested before it is handed to the customer. In the software development process, the problem (software) can be divided into the following activities [3]:
    • Understanding the problem
    • Deciding on a plan for the solution
    • Coding the designed solution
    • Testing the definite program
    These activities may be very complex for large systems, so each activity has to be broken into smaller sub-activities or steps. These steps are then handled effectively to produce a software project or system. The basic steps involved in software project development are: 1) Requirement Analysis and Specification: The goal of
  • Stress Testing Embedded Software Applications
    Embedded Systems Conference, Boston, MA, September 20, 2007, ESC-302. Stress Testing Embedded Software Applications. T. Adrian Hill, Johns Hopkins University Applied Physics Laboratory, [email protected]. ABSTRACT: This paper describes techniques to design stress tests, classifies the types of problems found during these types of tests, and analyzes why these problems are not discovered with traditional unit testing or acceptance testing. The findings are supported by citing examples from three recent embedded software development programs performed by the Johns Hopkins University Applied Physics Laboratory where formal stress testing was employed. It also defines what is meant by the robustness and elasticity of a software system. These findings will encourage software professionals to incorporate stress testing into their formal software development process. OUTLINE: 1. INTRODUCTION, 1.1 What can be learned by "breaking" the software; 2. DESIGNING A STRESS TEST, 2.1 What is a reasonable target CPU load? 2.2 Other ways to stress the system, 2.3 Characteristics of a stress test; 3. REAL WORLD RESULTS, 3.1 Case #1: Software missed receiving some commands when CPU was heavily loaded, 3.2 Case #2: Processor reset when available memory buffers were exhausted, 3.3 Case #3: Unexpected command rejection when CPU was heavily loaded, 3.4 Case #4: Processor reset when RAM disk was nearly full, 3.5 Synopsis of all problems found during stress testing; 4. SUMMARY. 1. INTRODUCTION: Traditional software acceptance testing is a standard phase in nearly every software development methodology. Test engineers develop and execute tests that are defined to validate software requirements.
  • Testing Techniques Selection Based on SWOT Analysis
    International Journal of Knowledge, Innovation and Entrepreneurship, Volume 2, No. 1, 2014, pp. 56–68. Testing Techniques Selection Based on SWOT Analysis. MUNAZZA JANNISAR, RUQIA BIBI & MUHAMMAD FAHAD KHAN, University of Engineering and Technology, Pakistan. Received 02 March 2014; received in revised form 15 April 2014; approved 22 April 2014. ABSTRACT: Quality is easy to claim but hard to achieve. Testing is an essential phase in the software development life cycle, not only to ensure a project's success but also customer satisfaction. This paper presents a SWOT analysis of three testing techniques: White-Box, Black-Box and Grey-Box. The testing techniques are evaluated on the basis of their strengths, weaknesses, opportunities and threats. The analysis in this paper shows that selection of the techniques should be based on a set of defined attributes, context and objective. The findings in this paper might be helpful in highlighting and addressing issues related to testing technique selection based on SWOT analysis. Keywords: Black Box Testing, White Box Testing, Grey Box, SWOT. Introduction: Software plays a substantial role in every sphere of life. It is a key reason why software engineering is continually introducing new methodologies and improvements to software development. The capacity of organisations to achieve customer satisfaction, or to correctly identify the needs of customers, can be misplaced, leading to ambiguities and unwanted results [1]. Researchers have suggested many testing techniques to deal with fault identification and removal, before the product [software] is shipped to the customer. Despite many verification and validation approaches to assuring product quality, the goal of defect-free software is not very often achieved.
  • Putting the Right Stress Into WMS Volume Testing: Eliminate Unwelcome Surprises
    Viewpoint: Putting the Right Stress into WMS Volume Testing. Eliminate unwelcome surprises. When implementing or upgrading a Warehouse Management System (WMS) in a high-volume environment, it should be a given that a significant amount of effort needs to be dedicated to various phases of testing. However, a common mistake is to ignore or improperly plan for volume testing. So what types of processes should be tested? There is unit testing that validates segments of functionality, interface testing that checks the flow of data between systems, user acceptance testing that involves the end user group working through scripted functional flows, and field testing that executes an augmented version of the user acceptance testing on the distribution center floor. Volume testing, also known as stress testing, is an assessment that may be ignored. This is a mistake, because volume testing puts a strain on the processes within the WMS, as well as its interfaces, in order to identify conditions that would result in a slow-down or catastrophic crash of the system. RF Floor Transactions: This should involve the more commonly used RF transactions such as putaway and picking. It is not a good idea to include every type of RF activity; that means you probably will not be paying too much attention to cycle counts as a part of your stress test. Wave Processing: It is absolutely necessary to know how long it will take to process each day's orders. In addition, it is important to validate whether there are any important processes that cannot be executed during wave processing. For example, the testing may reveal that RF unit picking is an activity that cannot
  • Scalability Performance of Software Testing by Using Review Technique Mr
    International Journal of Scientific & Engineering Research, Volume 3, Issue 5, May 2012, ISSN 2229-5518. Scalability Performance of Software Testing by Using Review Technique. Mr. Ashish Kumar Tripathi, Mr. Saurabh Upadhyay, Mr. Sachin Kumar Dhar Dwivedi. Abstract: Software testing is an investigation aimed at evaluating the capability of a program or system and determining that it meets its required results. Although crucial to software quality and widely deployed by programmers and testers, software testing still remains an art: due to our limited understanding of the principles of software, we cannot completely test a program of even moderate complexity. Testing is more than just debugging. The purpose of testing can be quality assurance, verification and validation, or reliability estimation. Testing can be used as a generic metric as well. Correctness testing and reliability testing are two major areas of testing. Software testing is a trade-off between budget, time and quality. Index Terms: Software testing modules, measurement process, performance and review technique. 1. INTRODUCTION: Testing is not just finding out the defects. Testing is not just seeing that the requirements which are necessary in software development are satisfied. Testing is a process of verifying and validating that all wanted requirements are present in the product, and also verifying and validating that no unwanted requirements are present in the product. It is also seeing whether any latent effects are present in the product because of these requirements. [...] throughput of a web application based on requests per second, concurrent users, or bytes of data transferred, as well as measuring the performance. 3. PROCESS FOR MEASUREMENT: In the measurement of configured software there are several techniques that can be applicable
  • Performance Testing, Load Testing & Stress Testing Explained
    Performance Testing, Load Testing & Stress Testing Explained. Performance testing is key for understanding how your system works. Without good performance testing, you don't know how your system will deal with expected or unexpected demands. Load testing and stress testing are two kinds of performance testing. Knowing when to use load testing and when to use stress testing depends on what you need: Do you need to understand how your system performs when everything is working normally? Do you want to find out what will happen under unexpected pressure like traffic spikes? To ensure that your systems remain accessible under peak demand, run your system through performance testing. Let's take a look. Performance vs load vs stress testing: Performance testing is an umbrella term for load testing and stress testing. When developing an application, software, or website, you likely set a benchmark (standard) for performance. This covers what happens under regular parameters (if everything goes as planned, how does it work?) and under irregular parameters (can my website application survive a DDoS attack?). Load testing is testing how an application, software, or website performs when in use under an expected load. We intentionally increase the load, searching for a threshold for good performance. This tests how a system functions when it faces normal traffic. Stress testing is testing how an application, software, or website performs under extreme pressure, an unexpected load. We increase the load to its upper limit to find out how it recovers from possible failure. This tests how a system functions when it faces abnormal traffic. Now let's look at each in more detail.
  • A Load Test Guide
    WHITE PAPER: Preparing for peak traffic: A load test guide. Table of Contents: What is a load test? Why load testing? Best practices. Running a load test. Test configurations. Final thoughts. About WP Engine. Introduction: With the holidays approaching, most marketing teams are prepping for impactful, revenue-driving campaigns. Some might even be preparing for the launch of a new product on Black Friday. If [...] like scaling capabilities, lifecycle hooks, security risks, automatic code deployment, health checks, and target tracking. In this informative white paper, we'll break down the basics of load testing, why you should load test, best practices, and how to get started load testing your site before the holidays.
  • Flying Black Swans Into Stress Testing
    Flying black swans into stress testing: Geopolitical risks stress tested. Dr. Andrea Burgtorf, Head of Group Risk Operating Office. Stress Test Europe Conference, London, Oct 6th, 2016. Disclaimer: This document contains certain statements regarding methodology and approaches potentially applicable under future accounting rules. The information herein cannot be used to infer any impact of such future accounting rules on Erste Group Bank AG. The views expressed herein are the views of the author and may not reflect the official opinion of Erste Group Bank AG. This document does not constitute an offer or invitation to purchase or subscribe for any shares or other securities, and neither it nor any part of it shall form the basis of or be relied upon in connection with any contract or commitment whatsoever. The information contained herein and this presentation shall not be further disseminated without written consent of the author. Agenda: 1. Introduction; 2. Black swans impact and stress testing environment; 3. Modeling of black swans; 4. Integration in risk management framework. Black swans: an eruption of financial risks? Fundamental shifts in the risk landscape are ongoing. "Known Risks": risks companies identify and plan for in an effort to avoid or mitigate them. "Emerging Risks": risks that have come onto the radar, but whose full extent and implications are not yet clear. "Black Swans": events which are highly unlikely to happen but would have severe consequences if they did. A thorough understanding and managing of risks puts companies in a better place to pursue their strategy with confidence that they have the business resilience to manage known risks and respond to the unexpected. Black swans: management exposed to uncertainty? Increased speed and impact of risk events; outdated risk management; insufficient protection.
  • A Multi-Hazard Risk Assessment Methodology, Stress Test Framework and Decision Support Tool for Transport Infrastructure Networks
    Transportation Research Procedia 14 (2016) 1355–1363. 6th Transport Research Arena, April 18-21, 2016. A multi-hazard risk assessment methodology, stress test framework and decision support tool for transport infrastructure networks. Julie Clarke and Eugene Obrien, Roughan and O'Donovan Innovative Solutions Limited, Dublin, Ireland. Abstract: Natural hazards can cause serious disruption to societies and their transport infrastructure networks. The impact of extreme hazard events is largely dependent on the resilience of societies and their networks. The INFRARISK project is developing a reliable stress test framework for critical European transport infrastructure to analyse the response of networks to extreme hazard events. The project considers the spatio-temporal processes associated with multi-hazard and cascading extreme events (e.g. earthquakes, floods, landslides) and their impacts on road and rail transport infrastructure networks. As part of the project, an operational framework is being developed using an online INFRARISK Decision Support Tool (IDST) to advance decision-making approaches, leading to better protection of existing transport infrastructure. The framework will enable the next generation of European infrastructure managers to analyse the risk to critical road and rail infrastructure networks due to extreme natural hazard events. To demonstrate the overarching risk assessment methodology developed in the project, the methodology is demonstrated for two case studies, which comprise portions of the European TEN-T network: a road network in the region of Bologna, Italy, and a rail network extending from Rijeka to Zagreb in Croatia. This paper provides an overview of the INFRARISK multi-hazard risk assessment methodology and a brief introduction to the case studies, as the project is currently ongoing.
  • Performance Testing: Methodologies and Tools
    Journal of Information Engineering and Applications, ISSN 2224-5758 (print), ISSN 2224-896X (online), Vol 1, No. 5, 2011. Performance Testing: Methodologies and Tools. H. Sarojadevi, Department of Computer Science and Engineering, Nitte Meenakshi Institute of Technology, PO Box 6429, Yelahanka, Bengaluru-64. E-mail of the corresponding author: [email protected]. Abstract: Performance testing is important for all types of applications and systems, especially for life-critical applications in healthcare, medical, biotech and drug discovery systems, and also mission-critical applications such as automotive, flight control, defense, etc. This paper presents performance testing concepts, methodologies and commonly used tools for a variety of existing and emerging applications. Scalable virtual distributed applications in the cloud pose more challenges for performance testing, for which solutions are rare but available; one of the major providers is HP LoadRunner. Keywords: Performance testing, Application performance, Cloud computing. 1. Introduction: Building a successful product hinges on two fundamental ingredients: functionality and performance. 'Functionality' refers to what the application lets its users accomplish, including the transactions it enables and the information it renders accessible. 'Performance' refers to the system's ability to complete transactions and to furnish information rapidly and accurately despite high multi-user interaction or constrained hardware resources. Application failure due to performance-related problems is preventable with pre-deployment performance testing. However, most teams struggle because of a lack of professional performance testing methods, and face problems with regard to availability, reliability and scalability when deploying their application in the "real world".
  • Performance Testing in the Cloud. How Bad Is It Really?
    Performance Testing in the Cloud. How Bad is it Really? Christoph Laaber (Department of Informatics, University of Zurich, Zurich, Switzerland; [email protected]), Joel Scheuner (Software Engineering Division, Chalmers | University of Gothenburg, Gothenburg, Sweden; [email protected]), Philipp Leitner (Software Engineering Division, Chalmers | University of Gothenburg, Gothenburg, Sweden; [email protected]). Abstract: Rigorous performance engineering traditionally assumes measuring on bare-metal environments to control for as many confounding factors as possible. Unfortunately, some researchers and practitioners might not have access, knowledge, or funds to operate dedicated performance testing hardware, making public clouds an attractive alternative. However, cloud environments are inherently unpredictable and variable with respect to their performance. In this study, we explore the effects of cloud environments on the variability of performance testing outcomes, and to what extent regressions can still be reliably detected. We focus on software microbenchmarks as an example of performance tests, and execute extensive experiments on three different cloud services (AWS, GCE, and Azure) and [...] to evaluate the performance of applications under "realistic conditions", which nowadays often means running it in the cloud. Finally, they may wish to make use of the myriad of industrial-strength infrastructure automation tools, such as Chef or AWS CloudFormation, which ease the setup and identical repetition of complex performance experiments. In this paper, we ask the question whether using a standard public cloud for software performance experiments is always a bad idea. To manage the scope of the study, we focus on a specific class of cloud service, namely Infrastructure as a Service (IaaS), and on a specific type of performance experiment (evaluating the performance of open source software products in Java or Go using microbenchmarking frameworks, such as JMH).
  • Software Testing
    Software Testing. Carnegie Mellon University, 18-849b Dependable Embedded Systems, Spring 1999. Author: Jiantao Pan, [email protected]. Abstract: Software testing is any activity aimed at evaluating an attribute or capability of a program or system and determining that it meets its required results. [Hetzel88] Although crucial to software quality and widely deployed by programmers and testers, software testing still remains an art, due to limited understanding of the principles of software. The difficulty in software testing stems from the complexity of software: we cannot completely test a program of even moderate complexity. Testing is more than just debugging. The purpose of testing can be quality assurance, verification and validation, or reliability estimation. Testing can be used as a generic metric as well. Correctness testing and reliability testing are two major areas of testing. Software testing is a trade-off between budget, time and quality. Contents: Introduction; Key Concepts; Taxonomy; Testing automation; When to stop testing?; Alternatives to testing; Available tools, techniques, and metrics; Relationship to other topics; Conclusions; Annotated Reference List & Further Reading. Introduction: Software testing is the process of executing a program or system with the intent of finding errors. [Myers79] Or, it involves any activity aimed at evaluating an attribute or capability of a program or system and determining that it meets its required results. [Hetzel88] Software is not unlike other physical processes where inputs are received and outputs are produced. Where software differs is in the manner in which it fails. Most physical systems fail in a fixed (and reasonably small) set of ways. By contrast, software can fail in many bizarre ways.