Faculty of Engineering, Degree Programme in Computer Engineering

Master's thesis
Logbus-ng: a logging bus for Field Failure Data Analysis in distributed systems

Academic Year 2009/2010

Supervisors: Prof. Domenico Cotroneo, Prof. Marcello Cinque

Co-supervisor: Ing. Antonio Pecchia

Candidate: Antonio Anzivino, student ID 885/451

Logbus-ng: a software logging bus for Field Failure Data Analysis in distributed systems

Summary

Introduction
Logging
Computer security and accounting
Field Failure Data Analysis
The state of art of logging frameworks
Logging frameworks
An example API
Logging formats
Logging protocols
Open issues
Design of the Logbus-ng project
Source-side interfaces
Monitor-side interfaces
Logging APIs
Core design
Plugin system
Implementation of Logbus-ng
Overview of (.NET/Mono) platform
XML configuration
Determining a host's IP address in "Connect-to-me" protocols
Running a web application from inside a console application
Concurrency issues
Plugin APIs
Field Failure Data Logging support
The Entity Manager plugin
Log4net interoperability
Experimental validation
Unit testing for Syslog parser
Delivery time of messages


Loss of UDP datagrams under stress
Conclusions and future work
Platform bindings
Dealing with protocol drawbacks
Load balancing, fault tolerance
Other work
Appendixes
Appendix Alpha
Appendix Bravo
Appendix Charlie
Bibliography
Acknowledgements



Introduction

Today, critical computer systems are becoming more and more important in key human activities, replacing people in controlling processes and thus achieving lower costs and greater reliability. Such systems are increasingly directly responsible for people's safety, and a failure might, in some cases, bring disastrous consequences. If we want to replace a human controller with an automated computer controller in a critical scenario, like a nuclear power plant or a passenger flight, we must know how "dependable" each hardware and software component of the controller is, where dependability is, by definition, a quantitative indication of its capability to provide a proper service (or of its resistance to faults). Quantitative indices are suitable for engineering approaches, and among these we find the most common and relevant: availability, which is the probability that the system is providing a service at a given time1, and reliability, which is the probability that the system stays up for a continuous time interval2.
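The two indices just introduced can be restated compactly (this is merely a formal rewriting of the definitions above, not an addition):

```latex
A(t) = \Pr[\text{the system delivers correct service at time } t]
\qquad
R(t) = \Pr[\text{no failure occurs in } (0, t]]
```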

Dependability Engineering is a vast field of modern engineering. In relation to computer science, we can distinguish hardware from software dependability, which are not as parallel as we might expect. We will touch on the problems that arise in, and the techniques applied to, both fields, though not comprehensively. Hardware components are physical, tangible components, made with highly consolidated technologies and subject to the laws of Physics, particularly those of Electronics. An electronic component, whether an ALU or an entire CPU, computes electrical input signals into output signals through a deterministic function. One of the teachings of Electronics is that input signals are not fully deterministic: they are affected by noise that, if not properly filtered, may cause a fault of the component. The signals able to affect a hardware circuit's behaviour are not only the electrical ones applied to its input connectors, but also the electromagnetic radiation to which the component is subject in its working environment: for example, you can build a memory and test it indefinitely in the lab without finding any design defect, but once mounted in a space probe traveling towards the Sun, that memory may be affected by the radiation the star produces and possibly show undesired behaviour. Studying the possible failures of hardware components and the techniques that help avoid them is a consolidated subject.

Software components, however, are virtual: they have no mass and, while still contained in physical memories, do not exist in our realm, in the sense that we cannot physically interact with software. Software is thus not subject to the physical laws that regulate hardware, and is therefore slightly more difficult to study. A great advantage of software over hardware is that it is practically immutable: while it is reasonable to believe that, under certain conditions (thermal stress, salt accumulation), a hardware component may be modified in its low-level physical structure and no longer show the same performance as when it left the factory, executable code is not subject to wear or alteration, although software is affected by a specific form of aging, which is beyond the scope of this work.

1 Conversely, it is one minus the probability that a failure occurs at a given time
2 Conversely, it is one minus the probability that a failure occurs during the time interval

The difference between hardware and software faults is that hardware components may actually fail due to unpredictable random phenomena (like radiation), whereas the root cause of a software fault is nothing more than a permanent design or implementation defect, commonly called a bug. Once the activation condition of a fault is found, its activation is deterministic. This apparently simplifies things; however, finding a bug in software can be extremely complicated.



Our goal is neither to model software failures with theoretical approaches, whether black or white box, nor to deal with the problem of tracking down the root cause of a fault starting from a known failure in the system. Given a complex distributed system (business or safety critical), we want to facilitate its performance analysis in operational scenarios and provide administrators with all the tools needed to monitor the system's health and detect possible malfunctioning promptly.

The most basic tools to get execution information in a runtime environment, where you cannot perform debugging to monitor the system's execution flow, are logging and the consequent log analysis, which we deal with in a dedicated paragraph. Classic log analysis is often performed offline on heterogeneous and distributed platforms. Log messages come from different hosts and programs, and if they are formatted according to different formats, they must be processed separately (multiple databases, multiple analyses), even though they are actually correlated, since a failure in one node can propagate as a fault in another node.

Our goal is to provide developers with a tool to perform online log analysis on complex distributed systems, collecting logs in different formats and allowing them to be analysed as a single log trace. We created Logbus-ng, an open source platform capable of collecting and distributing the log messages generated by an entire network of physical or virtual machines, thus allowing effective and reliable online log analysis by monitoring clients.

Logbus-ng is free software, available on SourceForge.net, the best-known open source development website in the world, at https://www.sourceforge.net/projects/logbus-ng, and is released under the OSI-approved Reciprocal License.



Logging

Logging the execution of a software program is a common and long-standing practice, with roots in requirements other than dependability analysis. Depending on how, and on the context in which, logging is done, messages can be used for access control, accounting, profiling, auditing and data mining. When logs are used for security or accounting purposes, they are often subject to specific security requirements, like authentication and confidentiality, both to comply with regulations and to provide means of proof in a contract.

Logging is always done by inserting appropriate calls to a logging library at specific points of a program's source code. The most rudimentary logging mechanism, often used by students to perform debugging, is to output messages to a debug console like the Common Language Runtime's System.Debug (1) or Java's System.err (2), which point to a special output stream that displays such messages either in-line with the program's console output or in a separate window. In POSIX systems, the error stream can always be captured and redirected to another output stream, such as a file, using console commands.

No matter the usage required for log messages, several platforms were created to provide scalable and versatile support for developers' various logging requirements, simplifying code intervention and granting a high level of transparency for low-level operations (file opening, DBMS connection management, network protocols…). We will not hide our particular interest in the open source logging platforms developed by the Apache Software Foundation (3), the same foundation that made the famous HTTP server: these platforms are log4j (4), log4net (5), log4php and log4cxx, respectively made for Java, the CLR, PHP and


C++. Without going into detail, let us focus on the impact they had on the software market: thanks to simple calls that accept a text message, developers can log to different targets defined in the runtime configuration, differentiating by source, importance or type. The platforms only provide the tools needed to store messages on stable memory or forward them to a log server; it is still up to the developer to correctly place logging calls inside the source code. In fact, excluding random damage or loss of log messages, logging quality only

relies on the expertise of the programmer, at the end of comprehensive software engineering work by the designer. He is in charge of logging the right events in a clear, non-ambiguous and possibly structured way. One of the weak points of classic logging is text messages: there are no standards about them, not even de facto ones3, so log messages are made human-readable, forcing log analysis to be manual, or at least mostly manual. Historically speaking, there can be conventions on the structure of the information decorating a log message, mainly the timestamp, but not on the textual descriptive message. The following log

trace was extracted on August 15th from the marcus.zighinetto.org server to provide an example. Messages have been altered to obfuscate potentially sensitive information.

3 Not to be confused with the de facto standards that define log message formatting when dealing with structured data, i.e. web server logs and the Syslog protocol


Aug 15 17:28:24 marcus sshd[17988]: Invalid user condor from 211.47.xxx.xxx
Aug 15 17:28:36 marcus sshd[18200]: Invalid user global from 211.47.xxx.xxx
Aug 15 17:28:40 marcus sshd[18338]: Invalid user upload from 211.47.xxx.xxx
Aug 15 17:28:41 marcus sshd[18348]: Invalid user marine from 211.47.xxx.xxx
Aug 15 17:33:26 marcus syslog-ng[7956]: Log statistics; dropped='pipe(/dev/xconsole)=0', processed='center(queued)=70448', processed='center(received)=48033', processed='destination(messages)=448', processed='destination(mailinfo)=11985', processed='destination(mailwarn)=5215', processed='destination(localmessages)=0', processed='destination(newserr)=0', processed='destination(mailerr)=0', processed='destination(netmgm)=0', processed='destination(warn)=5215', processed='destination(console)=0', processed='destination(null)=0', processed='destination(mail)=17200', processed='destination(xconsole)=0', processed='destination(firewall)=30385', processed='destination(acpid)=0', processed='destination(newscrit)=0', processed='destination(newsnotice)=0', processed='source(src)=48033'
Aug 15 17:40:01 marcus /usr/sbin/cron[28606]: (root) CMD (/etc/webmin/system-status/systeminfo.pl)
Aug 15 17:40:57 marcus sshd[31881]: Accepted publickey for djechelon from 94.167.xxx.xxx port 1598 ssh2
Aug 15 17:41:09 marcus su: FAILED SU (to root) djechelon on /dev/pts/0
Aug 15 17:41:18 marcus su: FAILED SU (to root) djechelon on /dev/pts/0
Aug 15 17:41:24 marcus su: (to root) djechelon on /dev/pts/0

After a manual analysis, we can draw some conclusions about the informational content of the messages. The first four messages are from sshd's authentication facility, and apparently report that several users from the same IP address failed authentication over SSH. By experience, those log entries are the signature of an attack against the server4, trying to

exploit weak username/password pairs by trying common user names (e.g. admin, joe, etc.) together with a password attack such as dictionary or brute force. These messages are followed by a diagnostic message from the syslog-ng (6) daemon, which is responsible for the logging infrastructure in UNIX systems. The other messages are again from the authentication system: user djechelon first authenticated with a public key, then had to elevate his privileges to super user (failing twice while typing the password) in order to read the /var/log/messages file

4 Believe us or not, that attack was real and not simulated to get log data. Such attacks on sshd are very common for anyone running a server


containing such logs. Apart from the date, time, machine name, etc., which are structured according to the BSD Syslog protocol (7), the text messages from different applications or kernel modules (hence from different developers) are plain text and at best follow a single developer's specific format. Fortunately, since each unique message is structured in an immutable way, it is easy to automatically parse messages and extract information of interest, such as which IP address fails more than 5 authentication attempts in a short amount of time: as an example, the

fail2ban (8) daemon is an open source monitoring tool that scans logs searching for too many consecutive authentication failures, blocking the potential attacker via the system firewall. Let us now turn to classic logging issues: as shown by (9) and (10), classic logs are often incomplete or redundant, up to the point that for a single event to which developers paid lots of attention5 there are lots of log messages when just one would suffice, while obviously other interesting events leave no trace in the logs because developers did not care

about logging them. These issues, together with log damage, contribute to incoherence: while logs highlight failures in distributed systems (11), they provide no means to track back the root cause. On the other hand, analysing heterogeneous logs, structured according to the judgment of different developers who failed to adopt common rules (12) (9), leads to log chaos.
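The kind of automated parsing described earlier, counting failed SSH logins per source address, can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea, not fail2ban's actual implementation; the sample lines and the threshold are illustrative:

```python
import re
from collections import Counter

# Minimal sketch of fail2ban-style detection: count 'Invalid user'
# sshd entries per source IP and flag addresses above a threshold.
# (Hypothetical example, not fail2ban's actual implementation.)
LOG_LINES = [
    "Aug 15 17:28:24 marcus sshd[17988]: Invalid user condor from 211.47.1.2",
    "Aug 15 17:28:36 marcus sshd[18200]: Invalid user global from 211.47.1.2",
    "Aug 15 17:28:40 marcus sshd[18338]: Invalid user upload from 211.47.1.2",
    "Aug 15 17:40:57 marcus sshd[31881]: Accepted publickey for djechelon",
]

PATTERN = re.compile(r"sshd\[\d+\]: Invalid user \S+ from (\S+)")

def suspicious_ips(lines, threshold=3):
    """Return the IPs with at least `threshold` failed login attempts."""
    failures = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            failures[match.group(1)] += 1
    return {ip for ip, count in failures.items() if count >= threshold}

print(suspicious_ips(LOG_LINES))
```

Note that such a parser only works because the sshd message structure is immutable; the free-text part of arbitrary log messages offers no such guarantee.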

Let us examine some common fields of application for logging.

Computer security and accounting

Logs are often used for computer security or accounting purposes. A telephone bill is basically a log of the outbound calls made by a user. A bank issues periodic reports to account holders based on a log of transactions. Alongside accounting purposes, there are security usages of logs. We have already shown in the previous example that an attack's signature can be detected by log analysis in common

5 For example, an error triggered by a bug that was very hard to discover


cases, but there are also other usages. If an access control system logs each successful and failed authentication with adequate contextual information, in case a security violation is detected, analysts can (or can try to) track down the original offender.

Computer security is not the only field in which logging is applied. Most web servers, for example, generate comprehensive access logs, separate from error logs, that can be mined by system administrators with tools such as AWStats (13) to produce website statistics. This software provides detailed projections of visitors' origin, preferred time of access and preferred pages, aggregating the logs coming from each monitored web server.

In general, logs related to such critical environments are enforced with certain security requirements, which we will not examine deeply in this work: they mostly include cryptography and authentication.

There are also other kinds of security analyses that can be performed using logs: Siewiorek (14) uses a new methodology, based on the representation of a program as a finite state machine with logs representing transactions, to detect a vulnerability in a web server that was unknown to the developers, who were then notified.

Field Failure Data Analysis

Field Failure Data Analysis is a subset of log analysis that aims to characterize dependability features of a system, such as availability and reliability, over a specified time horizon.

FFDA is performed by extracting the log messages that report failures in the system, and often requires huge manual work. Classic offline FFDA based on logs is performed with the following steps:

• Log collection
• Log filtering
• Log coalescence and tupling (14)
• Evaluation of dependability indices (MTBF, MTTR…)



Filtering is required to obtain a log collection that only reports failures in the system, cleaned of informational and debug messages. Coalescence is a process that groups the messages generated inside a time window into a single failure, because a failure propagating along a chain of services often causes other failures to occur; tupling is a form of spatial coalescence, based on the fact that a failure in one node can propagate to other nodes (a power outage can cause a core router to go down, disconnecting several hosts in a network). Coalescence is performed by graphical means.
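The temporal coalescence step just described can be sketched as follows. This is a simplified illustration under stated assumptions (timestamps in seconds, a fixed coalescence window, no spatial tupling):

```python
# Sketch of temporal coalescence: failure events whose timestamps fall
# within `window` seconds of the previous event in a group are merged
# into a single failure. Timestamps and window are illustrative values.

def coalesce(timestamps, window=5.0):
    """Group failure timestamps into coalesced failure groups."""
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] <= window:
            groups[-1].append(t)   # same propagating failure
        else:
            groups.append([t])     # a new, distinct failure
    return groups

# Three raw events, but only two distinct failures:
print(coalesce([0.0, 2.0, 30.0]))  # [[0.0, 2.0], [30.0]]
```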

The final step is obtaining dependability indices for the system from the clean FFDA log. For example, the MTBF is obtained by computing the statistical mean of the intervals between consecutive failures, and so on.
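The MTBF computation mentioned above reduces to a mean over inter-failure gaps; a sketch with illustrative failure times:

```python
# Sketch: MTBF as the mean interval between consecutive failures.
# Failure instants (in hours) are illustrative values.

def mtbf(failure_times):
    """Mean Time Between Failures from a list of failure instants."""
    times = sorted(failure_times)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)

print(mtbf([0.0, 100.0, 260.0]))  # (100 + 160) / 2 = 130.0
```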

Modern, online FFDA (14) aims to perform fault forecasting, and is based on the interpretation of log data according to known patterns or automated classification modules. An online FFDA tool requires a strict and non-ambiguous logging criterion to apply to the program's source code. Such a criterion can reasonably be implemented by an automated instrumenting compiler designed for the developer's language and logging APIs. The effectiveness of such logging rules has been proven in research works (9) (10). Assuming the program's code is well instrumented with logging statements for error events, and assuming the logging subsystem is reliable, we can get very useful information from logs about the failures occurring in a software system6.

However, FFDA performed through the analysis of classic logs (from now on, "legacy" logs), whether online or offline, is affected by several problems highlighted in research works: Oliner (12) states that legacy logs usually do not provide adequate information for automatic failure detection, that logs are subject to corruption, and that log messages are often inconsistent; Natella (10) states that legacy logs usually do not provide information about timing failures (i.e. hangs, deadlocks), because the code never reaches the logging statements; and finally, Pecchia (9) demonstrates that legacy logs are still useful when used in combination with a set

6 We will not mention hardware logging and hardware FFDA


of logging rules.

In order to perform automated FFD log analysis, we must filter, out of the complete and verbose set of log messages, those related to error conditions that are useful for automated processing. Filtering does not mean deleting: Oliner (12) demonstrates that legacy logs often provide information about the root cause of a failure. Concerning the root cause, let us distinguish two cases.

Finding the root cause in a software system often translates into tracking down and fixing a bug in the code, but this operation requires time and expertise from the maintenance staff, and thus cannot be performed automatically. In the case of critical systems that expose transient or intermittent failures, it might be more useful for a diagnostic process to detect and isolate the faulty component, where the component can be defined at any level of grain: in a coarse-grained approach, we might detect the hardware node that caused the failure, while in a fine-grained approach we might want to isolate only the process that failed, in order to reduce the probability of a subsequent failure.

Common problems of FFD logging are the heterogeneity of log formats and of collection systems. There are a number of platforms available on the market, and all use a proprietary format, preventing integration (while joint analysis is a critical success factor in FFDA). Also, performing FFD logging always results in collecting a large number of useless waste messages that must eventually be dropped from the collection, leaving only those that are considered helpful.

Part of our work is the development of Logbus-ng's source-client APIs which, together with the structured FFDL7 approach, provide the means to achieve our goals, overcoming the difficulties of legacy logging. It is important to underline that Logbus-ng, following the considerations in (12) and (11), is designed to remedy the heterogeneity of logging systems, collecting all log messages in a system for deeper and more comprehensive analyses.

7 Field Failure Data Logging, to distinguish the generation of log messages from their analysis


The state of the art of logging frameworks

A logging framework has two basic requirements: it allows the developer to log messages with simple calls, and it stores log messages in permanent memory according to a well-known format. The choice of format was a critical issue for us, because of the heterogeneity of log messages and the absence of a reference standard. Both because logging has historically been a process managed internally by departments, not subject to customer relationships, and because log messages themselves are made up at the discretion of the single programmer or (in the best case) of the development team, no real de facto standard was ever widely adopted for log message syntax. The need for a logging format is not only due to the need for a common binary representation of data (whether ASCII, Unicode, etc.), but also concerns the type, number and syntax of the information decorating the log message itself, mainly the timestamp. If we ever had to create a brand new logging format, we would have to define three things: the data that is part of the log message, the message syntax, and the encoding. That is exactly what we did not want to do, certain that yet another brand new logging format would have added chaos to chaos. We chose instead to deeply study the existing formats and to refer to possible Internet standards.

Logging frameworks

We found that there are lots of different logging frameworks, both commercial and open source, available on the market.

The first we want to discuss are the Apache log4xxx frameworks, available as separate packages for C++, PHP, Java (4) and .NET (5). They are fully configurable in terms of log message syntax (to ease automatic parsing) and destination, using pluggable components for file system, network, DBMS, email, etc. The most disastrous result we found is that the Apache Software Foundation's logging frameworks and the Apache HTTP server use different logging syntaxes, and the server even uses its own logging framework.

The syntax of both is customizable by the administrator. While this feature has the advantage of being flexible for any automated parsing requirement (think of a web server log that only needs to record referrer URLs for mining purposes, dropping the source IP address for privacy reasons), no native format with the full set of information is defined8. What we want from a "Logbus" is to overcome the logging chaos by allowing transcoding between multiple supported formats, whenever this conversion is technically feasible.

A very interesting logging infrastructure we studied is the Tivoli (15) framework by IBM. It provides both real-time and offline log analysis features, and is designed for distributed systems. Unfortunately, it is a closed and expensive platform, and while its APIs are public, it uses a proprietary logging format.

Another interesting framework that can be used for real-time logging, analysed by Pecchia (9), is the Data Distribution Service (16). However, Pecchia himself demonstrated in his work that DDS's non-functional properties are unacceptable for real-time logging. An explanation for this can be found in the fact that content-based channels are slightly less performant than topic-based channels, and our publish/subscribe model, which we will illustrate later, is based on content-based channels.

We are then forced to build a log distribution service on our own. What pattern should we use, then? Again, Pecchia (9) suggests that the publish/subscribe pattern is a valid choice for the logging bus. This is due to the fact that publish/subscribe middleware components are lighter, since they tend to be stateless, unlike, for example, Tivoli's CEI, which keeps track of historical events.

8 However, there are still some default formats, obtained by not altering the configuration from the installation's defaults
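The content-based publish/subscribe model mentioned above, and the statelessness that makes it lightweight, can be illustrated with a minimal sketch. This is hypothetical code, not Logbus-ng's actual API: subscribers register a predicate over message content, and the bus forwards each published message to every matching subscriber without keeping any history:

```python
# Minimal content-based publish/subscribe sketch (hypothetical, not the
# Logbus-ng API). Subscribers provide a predicate over message content;
# the bus is stateless: it forwards and forgets.

class LogBusSketch:
    def __init__(self):
        self._subscribers = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self._subscribers.append((predicate, callback))

    def publish(self, message):
        # Content-based routing: evaluate each predicate on the message.
        for predicate, callback in self._subscribers:
            if predicate(message):
                callback(message)

bus = LogBusSketch()
errors = []
# Content-based filter: only messages with severity <= 3 (error or worse).
bus.subscribe(lambda m: m["severity"] <= 3, errors.append)
bus.publish({"severity": 7, "text": "debug detail"})
bus.publish({"severity": 3, "text": "mail subsystem error"})
print(errors)  # [{'severity': 3, 'text': 'mail subsystem error'}]
```

In a topic-based channel, routing would instead use only a fixed topic key; the predicate here may inspect any field of the message, which is more expressive but costlier to evaluate.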

Finally, Microsoft created its own logging frameworks for the Windows operating system and the Internet Information Services application server. The Windows Event Log (17), which we are going to examine, is a sophisticated platform for distributed logging, widely used by system administrators on large clusters and server farms. Windows system administrators can decide to forward logs to various systems in the network, including a single log server that collects the logs from all the machines and is equipped with proprietary log analysis tools. Logging APIs for the Windows Event Log are publicly available on MSDN.

An example API

Let us now analyse the logging APIs other platforms provide, in order to find useful suggestions for the design of ours.

We found that they all look almost the same, and can be synthesized by log4net's (5) APIs, which we took as a reference. log4net was created as a port of log4j (4). The following C# fragment illustrates the main entity in log4net, the ILog interface, conceptually equivalent to those of other APIs.

public interface ILog
{
    void Debug(object message);
    void Debug(object message, Exception exception);
    void DebugFormat(string format, params object[] args);
    void Error(object message);
    void Error(object message, Exception exception);
    void ErrorFormat(string format, params object[] args);
    void Fatal(object message);
    void Fatal(object message, Exception exception);
    void FatalFormat(string format, params object[] args);
    void Info(object message);
    void Info(object message, Exception exception);
    void InfoFormat(string format, params object[] args);
    void Warn(object message);
    void Warn(object message, Exception exception);
    void WarnFormat(string format, params object[] args);
}

Other APIs may differ in that, instead of calling different functions/methods for different logging levels, the level is passed as an argument. According to the documentation, log4net provides only five logging levels, while internally there are many more, including the eight Syslog levels. The documented levels are:

• Debug
• Info
• Warning
• Error
• Fatal

As a convention in all logging platforms, the severity of a message is a valid priority index during the analyses: debug logs are often used to support debugging (by providing contextual (18) (12) (11) information), informational logs to provide statistics and perform data mining, and error logs to report anomalies and perform FFDA. As we said from the beginning, it is all up to the programmers' discretion.

Logging formats

We tried to study the logging formats used by the most common frameworks to highlight their heterogeneity.

First of all, the Apache logging frameworks use a customizable log format, with the syntax specified by the user or by a pluggable component (actually, user-defined syntax is based on a component that lets the user define the syntax with a formatting string). This is suitable for automated parsing (you can format the log to adapt it to the program that will parse it) and is also suitable for the adoption of a common logging format.

BSD Syslog (7) messages are simple in structure: they start with a priority value9 between angle brackets, immediately followed by a timestamp in a semi-English format that does not include the year. The message also contains the host name of the originator, the name of the application that generated the message and, finally, the text payload. One of the major problems with BSD Syslog is that it does not define a rigorous syntax for the message. The RFC 3164 document itself shows examples of unusual Syslog messages that are difficult to parse by a general-purpose parser. This makes the adoption of BSD Syslog on heterogeneous systems almost impossible. We discovered that the Apache logging frameworks natively support Syslog.

9 The meaning of this value will be expanded when dealing with Syslog 2009

Microsoft relies on a code-oriented format for the Windows Event Log (17). This means that logs are not written to a text file, like in other frameworks, but are stored inside a local database in their native structured format, without any form of serialization. The format of Windows Event Log entries is defined by the following C declaration:

typedef struct _EVENTLOGRECORD {
    DWORD Length;
    DWORD Reserved;
    DWORD RecordNumber;
    DWORD TimeGenerated;
    DWORD TimeWritten;
    DWORD EventID;
    WORD  EventType;
    WORD  NumStrings;
    WORD  EventCategory;
    WORD  ReservedFlags;
    DWORD ClosingRecordNumber;
    DWORD StringOffset;
    DWORD UserSidLength;
    DWORD UserSidOffset;
    DWORD DataLength;
    DWORD DataOffset;
} EVENTLOGRECORD, *PEVENTLOGRECORD;

As we can see, this format stores two different timestamps: one for the time the message was generated, and the other for the time it was written to the log. This helps overcome another common problem in logging: distributed clocks. As we know, the clocks in a distributed system may not be accurately synchronized, hence the need for a reference clock.
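The usefulness of a reference clock can be sketched as follows. This is an illustration with hypothetical values, not the Event Log's actual mechanism: if the collector's clock is taken as the reference, a source-local generation timestamp can be corrected by the measured offset between the two clocks.

```python
# Sketch: normalizing source timestamps to a reference (collector) clock.
# Offsets and timestamps are hypothetical illustration values (seconds).

def normalize(time_generated, source_clock_offset):
    """Convert a source-local timestamp to reference-clock time.

    source_clock_offset = source_clock - reference_clock, e.g. as
    estimated when the message reaches the collector.
    """
    return time_generated - source_clock_offset

# A message generated at t=1000 on a source whose clock runs 5 s ahead
# of the collector actually occurred at t=995 on the reference clock.
print(normalize(1000.0, 5.0))  # 995.0
```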

Let us analyse the features of the Syslog (19) protocol and highlight its differences with (7). First of all, (7), more than defining a standard prior to its mass adoption (while RFCs are still de facto standards), is written under the form of a study of common UNIX and Berkeley BSD logging formats, trying to stop the format chaos by defining a uniformed



one actually derived from the others. This is revealed by the presence of statements such as “it has been observed that…”, referring to the separator conventions. The (19) format is instead defined in an imperative manner through the ABNF (20) notation, with particular attention paid to binary encoding; ABNF allows the definition of a non-ambiguous syntax, easing the implementation of a parser. A Syslog message is characterized by a severity and a facility, which, combined, make up a priority value. Severity is basically the importance of the message for analysis

purposes. Its value is between 0 and 7 inclusive, and these values are ordered according to the POSIX priority convention10. It ranges from emergency (0) to debug (7), which is the minimum. Facility is widely used to group messages coming from a certain kind of source, i.e. the kernel, the security system, the mail daemon, etc. The lowest facilities are reserved for user

code generating messages. The priority value is computed as priority = facility × 8 + severity. We do not agree with this convention, or at least with considering this value a priority: surely, a kernel emergency (0) message is very important and needs urgent care, but a kernel debug (7)

message cannot, in our opinion, be more important than an error in the mail system (19). Other information the message may or may not (but usually does) contain includes a timestamp in UTC + offset format11, the host name of the generator, the application name and PID, a message ID for easy grouping of similar messages, and a text part. Moreover, Syslog 2009 defines a special field for structured data, in which each element is identified by a key and contains key/value pairs. This extends the informational content of a message beyond the limits of the other fields, without requiring the programmer to format any additional data into the text part, which remains human-readable. There currently are some standard fields in the structured data whose semantics are defined in RFC 5424, but it is possible for developers to create their own standards. In order to avoid ambiguities when messages are transmitted over the Internet, a mechanism very close to namespaces is used to format KVPs. The key of a structured element (the SD-ID), in fact, must end with a valid SMI Enterprise ID (21) assigned by IANA (22) to enterprises. This is similar to enterprises using

10 The lower the number, the higher the importance
11 Unlike the old BSD Syslog, which contains the local time without a year indication

their website as part of XML namespaces. Currently, the University of Naples owns the SD-ID 8289, so all of our customized structured keys end with “@8289” and we define their semantics. An example of a valid Syslog 2009 message, with its parts explained, follows the grammar below.

The syntax of a Syslog message can be summarized by the following ABNF (20):



SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]
HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME SP APP-NAME SP PROCID SP MSGID
PRI             = "<" PRIVAL ">"
PRIVAL          = 1*3DIGIT ; range 0 .. 191
VERSION         = NONZERO-DIGIT 0*2DIGIT
HOSTNAME        = NILVALUE / 1*255PRINTUSASCII
APP-NAME        = NILVALUE / 1*48PRINTUSASCII
PROCID          = NILVALUE / 1*128PRINTUSASCII
MSGID           = NILVALUE / 1*32PRINTUSASCII
TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR   = 4DIGIT
DATE-MONTH      = 2DIGIT ; 01-12
DATE-MDAY       = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on month/year
FULL-TIME       = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND [TIME-SECFRAC]
TIME-HOUR       = 2DIGIT ; 00-23
TIME-MINUTE     = 2DIGIT ; 00-59
TIME-SECOND     = 2DIGIT ; 00-59
TIME-SECFRAC    = "." 1*6DIGIT
TIME-OFFSET     = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID           = SD-NAME
PARAM-NAME      = SD-NAME
PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and ']' MUST be escaped.
SD-NAME         = 1*32PRINTUSASCII ; except '=', SP, ']', %d34 (")
MSG             = MSG-ANY / MSG-UTF8
MSG-ANY         = *OCTET ; not starting with BOM
MSG-UTF8        = BOM UTF-8-STRING
BOM             = %xEF.BB.BF
UTF-8-STRING    = *OCTET ; UTF-8 string as specified in RFC 3629
OCTET           = %d00-255
SP              = %d32
PRINTUSASCII    = %d33-126
NONZERO-DIGIT   = %d49-57
DIGIT           = %d48 / NONZERO-DIGIT
NILVALUE        = "-"
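To make the grammar concrete, the following is a minimal, illustrative parser sketch (in Python, for brevity; the function and regular expression are ours, deliberately cover only the header fields, and omit the length limits and the structured-data and MSG productions):

```python
import re

# Illustrative RFC 5424 header parser: matches PRI, VERSION and the six
# space-separated header fields. NILVALUE ("-") is matched by \S+ as well.
HEADER_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>[1-9]\d{0,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) "
    r"(?P<appname>\S+) (?P<procid>\S+) (?P<msgid>\S+)"
)

def parse_header(msg: str) -> dict:
    m = HEADER_RE.match(msg)
    if m is None:
        raise ValueError("not a valid RFC 5424 header")
    fields = m.groupdict()
    # Split the priority value back into facility and severity.
    fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
    return fields

h = parse_header('<165>1 1986-06-02T00:00:00.003Z rosy717.zighinetto.org '
                 'myProcess 5569 ID47 [x@1 a="b"] hello')
# h["facility"] is 20, h["severity"] is 5
```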

The following is an example of a valid and complete Syslog message:

<165>1 1986-06-02T00:00:00.003Z rosy717.zighinetto.org myProcess 5569 ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] BOMHello World! Today is the happiest day in history!!

The message begins with priority value 165, meaning that the facility is 20 (local use 4) and the severity is 5 (notice). The message was generated 3 milliseconds past midnight on June 2nd, 1986 by host rosy717.zighinetto.org12. The process myProcess, with PID 5569, generated the message. The message is labelled ID47, and there are some example fields in the documentation namespace, which do not mean anything to us. The BOM part is the UTF-8 byte-order mark, which is required by the standard. The message ends with the text part, intended for human use.

Logging protocols

In this paragraph, we are going to take a closer look at logging protocols.

The Microsoft Event Log protocol is based on a very simple mechanism: each Windows host runs an Event Log provider that is accessible via code using special system calls. When a new entry (according to the definition given earlier) is logged, the provider automatically writes it to the local log file or sends it to the specified remote host, according to the configuration. Applications can not only write to the Event Log, but also read it programmatically. There are C APIs that allow reading the messages currently stored in the event log. In .NET, it is possible to asynchronously listen for new log messages thanks to the framework's EventLog class, which fires an event once a message is logged to the specified log.

Syslog, more than a format, is a protocol. Log messages are not only meant to be locally stored in a file or database, but also to be directly forwarded to monitoring applications via the network. The Syslog protocol defines three types of entities: originator, relay and collector. The originator is the source of the messages; the relay is used to forward messages to monitoring/analysing clients, which are collectors. The relay is not supposed to alter the

12 Often you will not find fully qualified domain names as hostnames, especially if you work in LANs

contents of the Syslog messages, except in a few cases, such as timestamp or hostname adjustment when justified. For example, a good architecture for a Syslog-based system uses a single relay process on each hardware node, collecting messages from all local processes and forwarding them remotely. In such a case, applications do not need to set the timestamp or the hostname, which are known to the relay. The relay, if it knows details about clock synchronization and attached networks, can add extra information stating whether the reported timestamp is reliable with respect to synchronization to a remote clock, and can add the IP address the machine is currently using. The Syslog protocol itself does not enforce any security: instead, it is clearly stated that the Syslog protocol is not designed to resist common network attacks performed by malicious users, i.e. eavesdropping, forging, deletion, man-in-the-middle, DoS. Syslog also does not clearly define, at least in RFC 5424, the network protocols to use when remotely sending messages. The originator-relay-collector pattern can be implemented using any network transport protocol. The two RFCs following 5424 cover the delivery of Syslog messages via TLS (23) and UDP (24). Let us examine these two means of sending Syslog messages in depth, dealing first with UDP.
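The relay behaviour just described can be sketched in a few lines (an illustrative Python fragment; the field names and the helper are our own invention, not a Syslog or Logbus-ng API): a relay on the same node fills in the timestamp and hostname fields that local originators may leave as the NILVALUE “-”.

```python
import socket
from datetime import datetime, timezone

# Sketch of the relay adjustment described above: fill in NILVALUE ("-")
# timestamp and hostname fields, which local originators may omit.
def relay_adjust(msg_fields: dict) -> dict:
    adjusted = dict(msg_fields)
    if adjusted.get("timestamp", "-") == "-":
        adjusted["timestamp"] = datetime.now(timezone.utc).isoformat()
    if adjusted.get("hostname", "-") == "-":
        adjusted["hostname"] = socket.getfqdn()
    return adjusted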

UDP, as we all know, is a lightweight and unreliable protocol based on datagrams. Once a datagram exits the outbound network interface of a host, everything about its lifecycle is



unknown to the sender. UDP is widely used in all those applications that need performance, like VoIP, video conferencing, streaming, gaming, etc. The drawback of using UDP is that packets are subject to loss, and there are no means of retransmission other than by a higher-level protocol. Sending a Syslog message via UDP is extremely simple: just encode it into UTF-8 and send it directly to the destination; that is all. The rule is “one message per datagram”, and messages cannot span multiple datagrams, thus limiting the size of a Syslog-over-UDP message to 64KB, an extremely large size, actually more than the common applications we care about ever need. Obviously, as stated in RFC 5426, the main drawbacks are reliability and security.
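Both mechanisms are simple enough to sketch in a few lines (illustrative Python; the function names are ours). The first follows the one-message-per-datagram rule of RFC 5426; the second shows the octet-counting framing that the TLS transport, discussed next, prepends to each message.

```python
import socket

def send_syslog_udp(message: str, host: str, port: int = 514) -> None:
    # RFC 5426: one UTF-8 encoded message per datagram, no framing.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(message.encode("utf-8"), (host, port))

def frame_message(message: str) -> bytes:
    # RFC 5425 octet counting: the byte length in ASCII decimal digits,
    # then a space, then the UTF-8 encoded message itself.
    data = message.encode("utf-8")
    return str(len(data)).encode("ascii") + b" " + data
```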

TLS, instead, is a secure protocol. Running over TCP, it achieves reliability, but it also enforces end-to-end security by encrypting and signing data. Sending Syslog data over TLS is still simple, but first the source must establish a secure channel with the destination. This is achieved by TCP's 3-way handshake followed by the TLS handshake, with server-only or mutual authentication. After the TLS handshake, the source can start sending messages to the destination: it first sends the length of the message in bytes, as ASCII decimal digits, then a space character, and finally the message itself encoded in UTF-8. Right after each message comes the next one, with the same syntax. During a TLS conversation, a number of messages are expected to be sent, one by one, at the maximum transfer rate the network allows. The conversation ends with the source terminating the connection after the last message.

Open issues

We have examined the logging platforms and formats that are currently available on the market, highlighting features and drawbacks. The common element that arises from this analysis is that each vendor provides its own format, making it difficult to merge logs from several programs into a single system log without converting them all to a common format, with the possibility of losing data. Moreover, the deep differences in the logs' syntax and semantics force the design of an analysis


tool for each specific system. No tool yet exists that can be reused across different scenarios without deep re-design. Another open issue is the impossibility, for current logging frameworks, of performing a priori filtering and coalescence. These two operations have to be done after the log has been generated, on large collections of messages and with considerable computational effort. Performing prior filtering, resulting in a clean log, would be desirable. Moreover, filtering and coalescence are done at the discretion of the analysts, making it difficult to compare results from different experiments working on the same scenario. It would also be desirable to reuse, in whole or in part, the same analysis tool across several scenarios.

Fortunately, the Apache logging frameworks are based on pluggable components, so once a common format is found, it will be easy to adopt it for all applications that use one of these frameworks by simply re-configuring the runtime.



Design of the Logbus-ng project

The possibility of using a software bus for the collection and distribution of log messages to monitoring clients has been highlighted in (10), and it is now our objective to make it concrete through a software infrastructure that is separate from applications. We have designed and then developed a software framework to collect log messages generated according to some rules (9) and to distribute them to clients that subscribe to some form of channel. The starting formal specification of the software can be synthesized as follows:

Make a software platform for real-time collection and distribution of structured log messages using the publish-subscribe model. Also develop dynamically linked APIs to be used both by clients that produce and by clients that consume log messages
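The publish-subscribe model named in the specification can be illustrated with a deliberately minimal sketch (Python for brevity; the class and method names are ours, not the Logbus-ng API):

```python
from typing import Callable

# Minimal publish-subscribe sketch: subscribers attach a callback to a
# named channel; publishing delivers the message to every subscriber.
class LogBus:
    def __init__(self) -> None:
        self._channels: dict[str, list[Callable[[str], None]]] = {}

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._channels.setdefault(channel, []).append(callback)

    def publish(self, channel: str, message: str) -> None:
        for callback in self._channels.get(channel, []):
            callback(message)
```

A real implementation must, of course, add channel filters, network transports and subscription management, which the rest of this chapter develops.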

We chose that our project's logging format had to ultimately be Syslog 2009, both in order to avoid log format chaos and because we found Syslog to be a linking point between multiple commercial frameworks. However, this choice alone did not satisfy us: even though we were developing brand-new software, we would not be able to collect legacy logs. The need to collect those logs to achieve better analyses is highlighted in (12), so we required the Logbus to support legacy and different-format logs by up-conversion to Syslog 2009. Believing there were no particular reasons to choose differently, we also required the possibility of extending the supported logging formats both inbound and outbound. This way, if an analysis program requires a different logging format, it may receive properly converted messages (think about reusing legacy code).



We chose to name it Logbus-ng, after the conventional name adopted in (9), with the “next-generation” suffix because we designed it from scratch. The final functional requirements of Logbus-ng are the following:
• Logbus-ng collects logs from local and remote Syslog sources (both the old and the new protocol)
• Logbus-ng must be configurable to collect logs in different formats, by using a pluggable component that translates the messages into a common format

• Logbus-ng must be configurable to automatically filter all incoming messages with a preconfigured Boolean criterion
• Logbus-ng must be able to forward all messages to other logging entities, including standard output (console), the file system and other Logbus nodes. Forwarders have to be pluggable components based on a common API
• Logbus-ng must be configurable with one or more dynamically loaded plugins that follow the service's lifecycle and are able to interact with it programmatically. The plugin APIs must be invoked the same way as the core APIs
• Logbus-ng has to offer remote clients the possibility of creating, deleting and subscribing to message channels. A message channel is bound to a Boolean filter, so that all (and only) the messages matching the filter are forwarded to all the clients subscribed to it. A channel also features a coalescence window that can be set for FFDA purposes, in order to forward only one message per time window

• Logbus-ng has to offer subscribing clients the possibility of choosing their preferred network protocol from a list of supported protocols, mainly including (23) and (24). Logbus-ng must use pluggable components to support new protocols
• Logbus-ng's APIs can be interfaced (inbound and outbound) with other existing logging frameworks by using pluggable components, in order to support legacy code already using different logging frameworks
Moreover, the following are the non-functional requirements:
• Portability: the bus node must run on the major hardware/software platforms, if not



any
• Effectiveness: the bus must work under heavy13 workloads, and any loss of log messages must be as small as possible
• Reliability: the bus must support reliable delivery of messages to the clients that require it, without impacting other clients' performance
• Fault tolerance/load balancing: the bus must be capable of being replicated or distributed on multiple nodes

In our architectural vision, the Logbus is divided into three main segments: the source segment, the core segment and the monitor segment, not to be confused with Syslog's originator, relay and collector (though they do support them). The source and monitor segments will be defined by proper interfaces based on design-by-contract and TCP/IP network protocols, in order to ease the implementation of APIs for the most common high-level programming languages. Since we have to support pluggable components, we must also make sure that developers have appropriate tools to develop components of their own, supporting different formats and protocols. This may be the case for legacy applications using proprietary protocols, for which the Logbus can perform a server-side conversion.

[Figure: heterogeneous sources (UNIX Syslog, a log4cxx-powered application, a Cisco router, the Windows Event Log, a log4net-powered application) feed the Logbus, which delivers messages to consumers such as an SQL DBMS, a viewer program and file-system storage.]

13 We will later quantify how heavy a workload can be


In order to support the multi-platform requirement, we had to adopt a commercially available virtual-machine platform: we chose C# as the language, for the Common Language Runtime platform, implementations of which are Microsoft .NET and (25) for POSIX operating systems.

Source-side interfaces

By requirement, Logbus-ng must be able to collect messages both from Syslog-compliant sources and from any other source, provided that there is an implementation of an inbound channel specific to the protocol in use. Before defining the source-side interface, we wanted to examine the current possibilities of remote logging with the Syslog protocol and the Apache logging frameworks. Beyond the “core” Syslog 2009, there are two specific RFC standards for transferring Syslog messages, via UDP (24) on port 514 or TLS (23) on port 6514. We also considered the fact that the syslog-ng (6) daemon can be configured to forward messages (all or part of them) to a remote Syslog host by ASCII encoding14, and to receive messages on port 514 as the classic protocol states. Moreover, the log4net (5) and log4j (4) frameworks are equipped with a RemoteSyslogAppender (26) (27) component that is capable of sending BSD Syslog-formatted messages to remote hosts, encapsulated in UDP datagrams. No handshake is required, and the remote host is always pre-configured in the logging application. When a logged event occurs, a datagram is sent to the log server. The choice between UDP and TLS must be dictated by the reliability requirement: while TLS also addresses security, we observed that almost no logging framework considers security a requirement, mostly because they run on LANs without the usual security threats of the Internet. Since supporting different logging protocols is a requirement, we will ensure that a proper code basis for pluggable components is available in the core Logbus.

Monitor-side interfaces

We believe there are no obstacles preventing us from using the Syslog-over-UDP (24) protocol in this case too, but we must set this aside for now. Requirements for monitor-side

14 Syslog-ng currently supports only the BSD Syslog format


APIs do not force the use of any specific protocol, but let the client decide which protocol to use, as long as the server supports it too. We must now analyse and solve the problem of defining a uniform subscription interface for clients. Our primary choice has been to use Web Services (28), to achieve the best interoperability. However, we do not want to restrict the interface to this technology, allowing future developers to implement the same interfaces using a different technology/middleware.

Log analysis' first phase is filtering, in order to exclude all those messages that do not provide useful information to analyses. While it is true that in human-driven analysis it is useful to keep a record of debug and similar messages, to aid in obtaining better context information, it is also true that it is infeasible for an automated analyser to gather useful information from unknown messages. In order to reduce network traffic and computational effort, we believe it useful for the core segment to perform prior filtering. We then define an outbound channel as a logical channel bound to a Boolean filter. A number of clients can subscribe to the same logical channel; all the clients subscribed to the same channel will receive (assuming no loss/drop of messages on reliable networks) all the messages that match the filter, and only those. Now a new functional requirement arises: a client must be able to create (and, why not, delete) outbound channels by providing a filter of its own. In order to define interoperable filters, we must define a set of standard filters, whose semantics and parameters are well known, in the XML (29) syntax, the same that Web Services are based upon. Let us examine a few of these filters, modelled on the Syslog 2009 format:
• True: accepts any message
• False: accepts no message
• And: composite filter that accepts a message if and only if all of its sub-filters match the message
• Or: composite filter that accepts a message if and only if at least one of its sub-filters matches the message



• Not: composite filter that returns the opposite of its sub-filter
• RegexMatch: the message is accepted only if its text part matches a given Regular Expression
• Severity: compares the message's Severity with the provided value and comparison operator. For example, a Severity filter defined as [>=, Warning] will accept only messages at least as important as Warning, i.e. of numeric severity less than or equal15 to 4
Alongside these default filters there is one more, a custom filter, that can be used through an identification tag and a free set of parameters, which are defined by design contract and not by the language. We have also imposed, for our C# implementation, that these custom filters be made as pluggable components: this has no effect on the interoperability requirement, because a developer who wants to implement the Logbus-ng core according to our specifications, but in his favourite language, can either hard-code all the custom filters he needs or do as we did. The full specifications for the filters have been translated into the XML Schema namespace

http://www.dis.unina.it/logbus-ng/filters, reported in Appendix Alpha. Concerning the delivery of log messages, we chose to support both UDP (24) and TLS (23) in our core release. The UDP approach is the classical approach to remote logging, as we found in other platforms. However, there are serious drawbacks to it:
• No congestion control: if a log source sends messages at an excessive rate (for more, consult our experiments chapter), a tangible network performance decrease can occur, together with packet loss and damage

• Unreliable delivery: UDP datagrams are not subject to retransmission or acknowledgement, so it is possible for the network to drop packets without either host noticing16
• No security: no authentication/encryption techniques are put in place, so delivery

15 In the UNIX priority convention, the same adopted by Syslog, a lower number means higher priority. We chose to design the operator according to the logical meaning of the comparison, but arithmetically it is the opposite
16 Actually, Syslog 2009 defines, within its extensions, a standard mechanism for message enumeration. As long as we expect all messages to be delivered (no dropping because of prior filtering), a missing sequence number is a symptom of packet loss


is subject to eavesdropping and man-in-the-middle attacks, involving the insertion of spurious messages into the channel, malicious editing and malicious dropping
• No ordering guaranteed: UDP datagrams are not ordered during delivery. Unless sequencing is used at a higher protocol level, or strict hypotheses on clock synchronization hold, the logical ordering of messages cannot be guaranteed
We can make the following observations by analysing real-life scenarios.

As far as reliability and congestion control are concerned, let us distinguish automated FFDA from a case we believe to be its mirror image, such as accounting logging. In the accounting case, it is obvious how the loss or damage of even a single packet can result in a smaller or bigger economic loss for the enterprise, so it is reasonable for the source of these logs to put every useful fault-avoidance mechanism in place; if we had to use the Logbus for such purposes, our choice would definitely be reliable transmission on the source side, and that is why we required that to be possible.

On the other hand, when analysing large, complex systems subject to high traffic rates, strict reliability requirements cause a significant performance decrease for the whole system, perhaps right when the logging infrastructure is under full workload pressure because of an ongoing fault; in this case, the law of large numbers suggests that this will not significantly affect the final result of the analysis. To better understand this, let us make a worst-case example: suppose that the network drops 1 packet out of 1000 in normal conditions (99.9% reliability) and 1 out of 5 (80% reliability) during a burst; then suppose that during a fault a large number of log messages is generated, flooding the network, and that, in our worst case, only one message from a single host reports the fault. Then the probability of losing trace of the fault is 1 out of 5. Also, suppose that the MTTR is 1 s and the mean number of faults is 50000/yr: with that loss rate we can detect up to 40000 faults/yr. The effective availability of the system is 0.99844, but we measure it as 0.99873. While the example is invented, it shows that unreliable delivery, even in disastrous cases (rather than worst cases), does not change FFDA results significantly, because the system's availability is still assessed at about 99.8%. Beyond theory, we can use practice to support our statements: most supercomputers are equipped with fast, dedicated network interfaces that are used only for logging, such as Myrinet (30), revealing how efficient it is to use UDP even with the risk of losing some packets. And UDP sockets are even faster than writing logs to disk!
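The arithmetic of the invented example can be checked directly (an illustrative calculation using the constants of the example: 1 s MTTR, 50,000 faults/yr occurring, 40,000 detected):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s

def availability(detected_faults_per_year: float, mttr_s: float) -> float:
    # Availability = 1 - (downtime per year / seconds per year),
    # where downtime is the number of detected faults times the MTTR.
    return 1.0 - detected_faults_per_year * mttr_s / SECONDS_PER_YEAR

real = availability(50_000, 1.0)      # about 0.9984 (all faults counted)
observed = availability(40_000, 1.0)  # about 0.9987 (1 report in 5 lost)
```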

Now the requirement of choosing the delivery protocol at runtime arises. This consists in allowing Logbus-ng to deliver messages to each monitoring client by adopting a protocol that is chosen by the client itself (and is obviously supported by the server). Such protocols can be unicast or multicast. It must be possible to add support for new protocols with pluggable components. This led us to design the C# core as a four-stage pipeline: inbound channels, hub, outbound channels, outbound transports. This is shown in the figure below.

[Figure: the four-stage pipeline. Inbound channels (e.g. UDP, TLS, Web Service) feed the hub; the hub forwards messages to the outbound channels; each outbound channel delivers matching messages through one or more outbound transports. A plugin component is also attached to the core alongside the inbound channels.]
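The pipeline can be summarized by a small sketch (illustrative Python; the real core is written in C#, and the class names here are ours):

```python
from typing import Callable

# Sketch of the four-stage pipeline: inbound channels call Hub.submit();
# the hub fans each message out to every outbound channel; each channel
# applies its Boolean filter and hands matches to its transports.
class OutboundChannel:
    def __init__(self, matches: Callable[[dict], bool]) -> None:
        self.matches = matches                            # the channel's filter
        self.transports: list[Callable[[dict], None]] = []

    def submit(self, msg: dict) -> None:
        if self.matches(msg):
            for deliver in self.transports:
                deliver(msg)

class Hub:
    def __init__(self) -> None:
        self.channels: list[OutboundChannel] = []

    def submit(self, msg: dict) -> None:                  # called by inbound channels
        for channel in self.channels:
            channel.submit(msg)
```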

In order to understand it better, let us make an example: we want to collect messages with Logbus-ng via UDP (24), via TLS (23) and from the Windows Event Log. Each inbound channel features its own implementation details, but all provide outbound Syslog messages (obviously, Windows logs must be converted), so each of these is correctly represented by an Inbound Channel circle in the previous graph. The hub receives messages17 and forwards them to each Outbound Channel, which is responsible only for determining whether a message should be sent to clients, according to its filter. Finally, the Outbound Transport takes care of performing the delivery, implementing the specific protocol chosen

by the client, thus hiding it from the stages that stand before it. Notice that instances of the transport manager are seen as multicast entities by the channel, i.e. each transport is bound to multiple clients and is seen as a single entity by the channel. For example, the TLS transport manager is instantiated only by those channels to which at least one TLS client is bound, but it will handle the TLS delivery of messages to any other client subscribing with TLS. What the transport manager actually does is perform a unicast transmission to each client. This is slightly different from performing a true multicast, and a real multicast transport manager will hide its protocol details the same way the unicast one does. If the channel were to manage the list of connected clients, it would also need to be aware of some protocol details about each client, and of which set of clients were part of a multicast group. We instead chose to hide everything and use delegation. Before translating these specifications into a WSDL interface definition, we must obviously define a robust protocol for client subscription, where robustness means the possibility of supporting protocol extensibility.

Before going ahead, let us quickly summarize the methods we surely need to implement:
• ListChannels, to enumerate the available channels to which one can subscribe
• GetChannelInformation, to get information about an existing channel
• CreateChannel and DeleteChannel, the meaning of which is obvious
• SubscribeChannel and UnsubscribeChannel, self-explanatory names
Surely, the SubscribeChannel method requires a channel ID as a parameter, plus the indication of the transport chosen by the client. What else? What parameters are required to subscribe a client to a Syslog or non-Syslog channel that we designers do not know about yet? Logic says to allow the client to send a collection of untyped parameters that are part of a specific design contract with the pluggable components involved in the transaction. So far, the only entity whose behaviour is not fully documented is the transport manager, which relies on the specific transport protocol it implements. Other entities, such as the hub and the channel, are strict in their specifications.

17 It will also receive channel subscription requests, but this is not shown here
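As an illustration of such a design contract, a hypothetical UDP transport plugin might validate the untyped key/value parameters like this (a Python sketch; the function name, parameter keys and return format are our own invention, not the Logbus-ng API):

```python
# Sketch of the untyped "input instructions" idea: the core passes the
# key/value pairs through untouched; the transport plugin enforces its
# own design contract and rejects subscriptions that violate it.
def subscribe_udp(params: dict[str, str]) -> str:
    required = {"ip", "port"}
    missing = required - params.keys()
    if missing:
        # Corresponds to the "design contract violation" fault below.
        raise ValueError(f"design contract violation: missing {sorted(missing)}")
    return f"udp://{params['ip']}:{params['port']}"
```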

Suppose we want to use UDP (24) as the transport; then these parameters are reasonably the IP address and the port number of the destination. While the IP address can easily be detected in a SOAP transaction, we stated that SOAP must not be the only way to subscribe, and it still gives no clue about the destination port. This case, in which the client sends parameters to the server, does not suffice to cover cases such as multicast UDP, in which the multicast group is created before the client joins and is known only to the server component; unicast UDP also reveals the inefficiency of such a choice. Let us examine these two cases separately. In the case of unicast UDP, because of the features of the protocol itself (31), it is impossible to detect a client fault18, and after the client becomes alive again, it may be unable to recover or cancel its old subscription. Rather than adding a spurious mechanism of keep-alive datagrams (sent to whom?), we chose to add an explicit RefreshSubscription method to the WSDL interface and a consequent time-to-live for the subscription itself, so that if the client does not refresh before the TTL expires, it eventually gets purged and

that if the client does not refresh before the TTL to expire, it gets eventually purged and

will not receive more datagrams. Refresh is not required by TCP-based protocols, because both peers are able to detect a loss of connection and implement a recover mechanism. The opposite occurs when running multicast: it is reasonable to think that it is up to the server to create the multicast group and to tell its address to the client. Here comes a new opportunity: to implement a set of protocols to help clients behind a NAT (32) overcome the limits imposed by TCP and UDP protocols. Since we determined

18 Without an explicit fault-detection mechanism that goes beyond the simple transport protocol, as opposite with TCP that embeds reliability 35

that there are parameters that need to be passed from the server to the client, returning the transport's endpoint among them is a good choice in reverse-connect protocols.
Let us now examine the final specifications of the methods needed to manage the subscription to an already-created channel, together with their parameters:
 SubscribeChannel
  o Input
    . Channel ID
    . Transport mechanism
    . List (0..N) of key/value pairs in string format representing what we called input instructions
  o Output
    . Client ID assigned by server
    . List (0..N) of key/value pairs in string format representing what we called output instructions
  o Exceptions
    . Channel does not exist
    . Unsupported transport
    . Design contract violation (expected different input parameters)
 UnsubscribeChannel
  o Input
    . Client ID
  o Exceptions
    . Client not subscribed
 RefreshSubscription
  o Input
    . Client ID
  o Exceptions
    . Client not subscribed (it may have been unsubscribed by timeout,


then it can be time to try subscribing again)
    . Transport does not support refresh
Full WSDL code is in Appendix Bravo.

Logging APIs

We already dealt with the analysis of existing logging frameworks. Since our objective is to make a software release, it makes no sense to offer the public software you just cannot use readily. This is particularly true in the case of logging APIs, because they provide the basic means for developers to start playing with Logbus-ng and experimenting with its capabilities. Logging APIs can be efficiently used for brand-new software or for legacy software not using an existing logging framework; if the software already exists and uses a logging format other than Syslog, components must be developed to interface it with Logbus-ng. Like log4net (5), the Logbus-ng logging APIs are based on an ILog interface:

public interface ILog
{
    // Methods
    void Alert(string message);
    void Alert(string format, params object[] args);
    void Critical(string message);
    void Critical(string format, params object[] args);
    void Debug(string message);
    void Debug(string format, params object[] args);
    void Emergency(string message);
    void Emergency(string format, params object[] args);
    void Error(string message);
    void Error(string format, params object[] args);
    void Info(string message);
    void Info(string format, params object[] args);
    void Notice(string message);
    void Notice(string format, params object[] args);
    void Warning(string message);
    void Warning(string format, params object[] args);

    // Properties
    ILogCollector Collector { get; set; }
    string LogName { get; set; }
}

One of the basic concepts behind the Logbus-ng APIs is that the interface between a client (whether source or monitor) and the bus is a component attached to the bus. Visually speaking, the puzzle-piece icon is often used in documentation to describe components that are linked to each other. Here we use it to highlight the abstractions they provide.

Source APIs are based on the concept of a collector, which is a component that, from the point of view of user code, collects logs. From that point of view, both the bus and clients are seen as collectors. ILog wraps the collector, providing high-level logging APIs. We defined the ILogCollector interface as follows:

public interface ILogCollector
{
    void SubmitMessage(SyslogMessage message);
}

You can only submit messages to log collectors. Since we want to provide general-use APIs, we did not only define collectors that actually send messages to Logbus-ng, but also components that redirect logs to the console or the file system. A log collector hides implementation details about the delivery of messages.

Client APIs, by contrast, are based on the concept of sources. Again, we look from the point of view of user code: Logbus-ng is a source of logs for it. The source's descriptive interface simply reports that a log message has been received, through an asynchronous call-back mechanism. ILogSource is defined as follows:

public interface ILogSource
{
    event EventHandler MessageReceived;
}

The way these interfaces are defined, they can easily be combined in logging applications that want to perform remote logging without a Logbus-ng server.
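As an illustration of how the two interfaces compose, here is a sketch of a forwarder that listens to any ILogSource and re-submits every received message to an ILogCollector. The event-args type and the generic event signature are assumptions for the example; the original interface uses a plain EventHandler.

```csharp
using System;

public class SyslogMessage { public string Text; }

// Assumed event-args type carrying the received message.
public class SyslogMessageEventArgs : EventArgs
{
    public SyslogMessage Message;
}

public interface ILogCollector { void SubmitMessage(SyslogMessage message); }

public interface ILogSource { event EventHandler<SyslogMessageEventArgs> MessageReceived; }

// Hypothetical glue class: every message received from the source is pushed to the collector.
public class Forwarder
{
    public Forwarder(ILogSource source, ILogCollector sink)
    {
        source.MessageReceived += (s, e) => sink.SubmitMessage(e.Message);
    }
}
```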

Our logging API automatically fills the message with the most common information of the Syslog protocol. One of the differences between our API and the log4net API is exception support: with log4net, logging methods have an explicit parameter dedicated to exceptions, while ours do not. Not only are we referring to the Syslog 2009 format, which has no native support for the concept of exception until one is standardized into structured data, but we also believe that, since exceptions are a language-dependent construct, they should not be part of the high-level API. On the other hand, logging exceptions is very easy, because their message and context information can be part of the log message. We suggest that developers, when catching an unwanted exception, log that event with a higher severity, briefly describing what happened (e.g. "cannot connect to remote host", "unable to access data"), and then log a debug message with the details of the exception that has already been thrown. This will be helpful when performing offline log analysis for debugging, but will also allow monitors to discard the verbose output of the error details and focus on the real fault.

Core design

[Class diagram omitted: it shows the TransportFactory utility (CreateTransport() : IOutboundTransport), the ILogSource and ILogCollector interfaces, IOutboundTransport, ILogBus (Start, Stop, AddInboundChannel, RemoveInboundChannel, AddOutboundChannel, RemoveOutboundChannel), IInboundChannel (Start, Stop, ReceiveMessage() : SyslogMessage), IOutboundChannel (SubscribeClient() : Subscription, UnsubscribeClient, RefreshClient, GetChannelInfo, GetDescription), the SyslogUDPReceiver and SyslogTlsReceiver implementations, and the IFilter interface (IsMatch(Message : SyslogMessage) : boolean).]

The above class diagram provides an overview of how the Logbus-ng core is structured. We voluntarily omitted from the diagram most of the classes that were actually created in the final project, leaving the most important components in clear view. We also did not enumerate all the public methods of the interfaces, otherwise the diagram would have become too complicated to read. The two main interfaces, specular to each other, are ILogCollector and ILogSource. They provide the two-way communication abstraction with the software bus, mainly represented by ILogBus. In this diagram, we only have a view of the core segment of the Logbus architecture as we defined it earlier. Also, the IFilter interface is implemented by several filters in the main package, defined according to the specifications we already provided. ILogBus is bound both to a number of inbound channels, represented by IInboundChannel, and to a number of outbound channels, represented by IOutboundChannel, with the former being instantiated via configuration and the latter being created by monitor clients. Each segment in the chain of responsibility is seen as a collector by the previous stage and as a source by the next one in charge, as in the diagram below:

Source application → Local proxy → Relay receiver → Relay hub → Relay outbound channel → Local proxy → Destination application

Let us now examine how the Logbus-ng core works when performing the two most important operations: channel creation and client subscription. Usually, clients that are interested in certain events first create a channel with their own filter, and then subscribe to it. This design is adopted in the monitoring APIs.


[Collaboration diagram omitted: the Logbus pipe (Inbound Channel, Logbus hub, Outbound Channel) is drawn vertically, with SubmitMessage traffic flowing through it; transversally, 1: the Client requests CreateChannel on the Channel Manager, 2: the Channel Manager forwards CreateChannel to the Logbus hub, 3: the hub calls CreateChannel on the Channel Factory, 4: the hub performs AddChannel on the new outbound channel.]

The above collaboration diagram shows the channel creation phase. The Logbus pipe is displayed vertically, transversal to the set of entities actively involved in the channel creation process. The client requests the creation of the channel, with the APIs described earlier, and this request is directed to the Logbus hub through the Channel Manager object, which acts as the SOAP skeleton of the Logbus hub itself (they implement the same interfaces). The Logbus hub requests that the channel factory create the new channel, then appends the returned object to the list of active channels, while the pipeline is still running. The transaction is fully synchronous, but message delivery is concurrent and independent.
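The hub's side of this transaction can be sketched in a few lines. The entity names follow the diagram; the member signatures and the IChannelFactory abstraction are assumptions for the example, not the actual Logbus-ng API.

```csharp
using System.Collections.Generic;

public interface IOutboundChannel { string Id { get; } }

// Assumed factory abstraction, standing in for the Channel Factory of the diagram.
public interface IChannelFactory
{
    IOutboundChannel CreateChannel(string id, object filter);
}

// Sketch of the hub side of channel creation: the call is synchronous,
// but the channel list is guarded by a lock so that message delivery
// can keep running concurrently while the new channel is appended.
public class LogbusHub
{
    private readonly IChannelFactory _factory;
    private readonly List<IOutboundChannel> _channels = new List<IOutboundChannel>();

    public LogbusHub(IChannelFactory factory) { _factory = factory; }

    public void CreateChannel(string id, object filter)
    {
        IOutboundChannel channel = _factory.CreateChannel(id, filter);
        lock (_channels) // the pipeline keeps delivering while we append
        {
            _channels.Add(channel);
        }
    }
}
```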

After performing channel creation, subscription occurs:


[Collaboration diagram omitted: the Logbus pipe (Inbound Channel, Logbus hub, Outbound Channel) is again drawn vertically, with SubmitMessage traffic flowing through it; transversally, 1: the Client requests Subscribe on the Channel Subscriber, 2: the Channel Subscriber forwards Subscribe to the Logbus hub, 3: SubmitMessage keeps flowing, 4: the Outbound Channel calls GetFactoryForProtocol on the Transport Factory Helper, 5: CreateTransport on the returned TransportFactory, 6: addTransport on the Outbound Channel, 7: Subscribe is delegated to the Transport, 8: UDP datagrams start being sent to the subscribed client.]

This phase involves a different SOAP proxy, the Channel Subscriber (properly, the Channel Subscription Manager), which still delegates its calls to the hub. This time, the channel already exists, so the hub proceeds by delegation to the channel. If this is the first subscription for the channel (true if the previous diagram applies to the sequence), or in general when no client has ever subscribed to the current channel with the transport protocol requested by the current one, then the Outbound Channel asks the general helper of transport factories to return the transport factory for the protocol chosen by the client, if supported; it then asks the factory to create a transport manager, to which the subscription request will be delegated. After that happens, the asynchronous delivery of messages continues, now extended to the just-subscribed client.
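The delegation just described can be sketched as follows. The helper and factory names follow the text; the method signatures and the per-protocol caching dictionary are assumptions, not the actual Logbus-ng implementation.

```csharp
using System.Collections.Generic;

public interface IOutboundTransport
{
    string Subscribe(IDictionary<string, string> inputInstructions);
}

public interface ITransportFactory { IOutboundTransport CreateTransport(); }

// Assumed registry of transport factories, standing in for the Transport Factory Helper.
public class TransportFactoryHelper
{
    private readonly Dictionary<string, ITransportFactory> _factories =
        new Dictionary<string, ITransportFactory>();

    public void Register(string protocol, ITransportFactory factory)
    {
        _factories[protocol] = factory;
    }

    public ITransportFactory GetFactoryForProtocol(string protocol)
    {
        return _factories[protocol]; // throws if the transport is unsupported
    }
}

// Sketch of the outbound channel: one transport manager per protocol,
// created lazily on the first subscription that requests it.
public class OutboundChannel
{
    private readonly Dictionary<string, IOutboundTransport> _transports =
        new Dictionary<string, IOutboundTransport>();
    private readonly TransportFactoryHelper _helper;

    public OutboundChannel(TransportFactoryHelper helper) { _helper = helper; }

    public string SubscribeClient(string protocol, IDictionary<string, string> input)
    {
        IOutboundTransport transport;
        if (!_transports.TryGetValue(protocol, out transport))
        {
            // First subscription with this protocol: obtain the factory and create the transport.
            ITransportFactory factory = _helper.GetFactoryForProtocol(protocol);
            transport = factory.CreateTransport();
            _transports.Add(protocol, transport);
        }
        // The subscription request is delegated to the transport manager.
        return transport.Subscribe(input);
    }
}
```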

Another interesting aspect of Logbus-ng we would like to focus on is the Syslog-over-TLS (23) protocol, complete with channel subscription, described below:

Logbus-ng client / Logbus-ng server exchange:

1. The client opens a TLS server.
2. Client → server: SubscribeChannel(channelId, clientHost, clientPort).
3. The server creates a TLS client and performs TCP connect(clientHost, clientPort).
4. TLS handshake: TLS hello, TLS hello + TLS server certificate, and the rest of the handshake messages (omitted).
5. SubscribeChannel returns clientId.
6. The server sends log messages over the connection (Send Log, repeated).
7. Client → server: UnsubscribeChannel(clientId).
8. The server closes the TLS connection (TLS bye / TCP FIN in both directions).
9. UnsubscribeChannel returns.

It is a quite complex protocol. The original Syslog-over-TLS protocol requires that the originator/relay be a TLS client and the collector a TLS server. But, in our perspective, the roles of client and server are reversed! The monitor client has to first open a TLS server (and it must be provided with an SSL certificate, as required by the transport protocol), then request the subscription to the Logbus server, which acts as TLS client, establishes a connection to the Logbus client, performs the TCP and TLS handshakes, and finally starts sending logs according to the well-known protocol. When the client wants to terminate the conversation, it can ask the Logbus server to unsubscribe it, thus causing the server to terminate the TLS connection.

Plugin system

One of the requirements of Logbus-ng is to support dynamically-loaded plugins. A plugin is an entity that is transversal to the pipeline and is directly hooked to the hub component of the core segment.

[Diagram: Inbound channel → Hub → Outbound channel, with plugins directly attached to the Hub.]

The purpose of a plugin is to support Logbus-ng in its life-cycle and to provide advanced services to clients via the same Web Services APIs used by the core Logbus-ng. The implementation of a plugin is platform-dependent, so we will analyse how C# plugins are implemented for the current version of Logbus-core in a dedicated paragraph, together with examples. The most important point is that each plugin can define a WSDL binding that is automatically made available, together with Logbus-ng's native WSDL API, when the web service application is activated.


Implementation of Logbus-ng

In this chapter, we are going to describe the implementation phase of Logbus-ng, started after planning and early design.

Logbus-ng is currently available as an open source project on Sourceforge.net, at the address https://www.sourceforge.net/projects/logbus-ng. People interested in the source code can either download the released packages, which contain source code and binaries, or check out one of the various projects via Subversion, available at the URL https://logbus-ng.svn.sourceforge.net/svnroot/logbus-ng

Overview of the Common Language Runtime (.NET/Mono) platform

The Common Language Runtime platform was launched in 2002 by Microsoft with the goal of creating a common execution platform on which software works regardless of the hardware architecture and operating system. We know, indeed, that an executable program is compiled into machine instructions that depend both on the architecture type and on the operating system on which it must operate: by architecture we mean not only the CPU instruction sets but also the conventions on addressing, interrupts and I/O; moreover, each executable program can only work on the kernel for which it was compiled, except for compatibility between successive versions of the same kernel. Furthermore, even if you can simply recompile a program to make it run on the same kernel but on a different CPU architecture, this does not mean that porting, i.e. adaptation for use on another OS, is possible without the involvement of developers.

To overcome this obstacle you can use a virtual machine, which is a software layer put between user code and the kernel that allows running the same code on any architecture for which there is an implementation of the VM. The VM must translate the source language, or an appropriate bytecode, into assembly instructions, including kernel system calls when needed (allocating memory, I/O, inter-process communication, mutual exclusion). Like Java, the .NET Framework initiative consists not only of a virtual machine based on a stack-oriented CPU, but of an entire library that implements everything from the most elementary features to many advanced programming facilities, from disk and network I/O to threading, an entire framework for building graphical interfaces (Windows Forms) and even one for web applications (ASP.NET). In the CLR, as in Java, all types derive from System.Object, multiple inheritance is not supported (but you can define and implement all the interfaces you want), types are either value types or reference types, and the linking between libraries is dynamic. Unlike Java, you can (if the language allows it) use pointers. However, sections of code that use pointers are specially tagged as unsafe, and the dynamic class loader could refuse, especially if the source code is unauthenticated and remote, to run a program with such sections. The CLR platform, finally, natively supports events as a language construct, allowing the programmer to implement the observer design pattern. Support for events is provided by delegate types, which are an abstraction of C++'s function pointers, so that an event handler is registered not as an interface (typical of Java), but as a delegate.
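The delegate/event construct just mentioned can be shown in a few lines of plain C#, unrelated to Logbus-ng itself; the class names here are invented for the example.

```csharp
using System;

public class Thermometer
{
    // The event is backed by a delegate type; handlers are methods, not interface implementations.
    public event EventHandler<EventArgs> Overheated;

    public void Measure(int celsius)
    {
        if (celsius > 90 && Overheated != null)
            Overheated(this, EventArgs.Empty); // notify all registered observers
    }
}

public class Demo
{
    public static void Main()
    {
        var t = new Thermometer();
        // A lambda is registered as the handler: the observer pattern with no interface boilerplate.
        t.Overheated += (sender, e) => Console.WriteLine("overheated!");
        t.Measure(95); // prints "overheated!"
    }
}
```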

The C# language is one of the languages initially introduced with the platform, along with Visual Basic .NET and J#, to give programmers the first instruments to produce applications. C# is an ECMA (33) and ISO (34) standard, and has been used to implement the Logbus-ng core and APIs.
Currently, software developed for the .NET platform from Microsoft can work not only on Windows systems, but also on mobile devices running Windows Mobile (now Windows Phone), thanks to the Compact Framework, and to some extent on the Xbox 360 console, thanks to the XNA Framework, which specifically allows DirectX games to run on both Windows and Xbox 360 as Arcade games.

As regards the Mac OS X operating system and all the other UNIX-family OSes, thanks to the opening of the specifications, ports of the original Microsoft .NET were eventually made, including DotGNU (35) of the Free Software Foundation, and Mono (25), initially created by Miguel de Icaza and then fully managed and sponsored by Novell.
One of the great advantages of the CLR platform, and in particular of Mono as an implementation, compared to Java, comes from compiling bytecode into "native code" before execution (ahead of time) rather than at runtime (just in time): in Java, the JIT compiler is the only available compiler, while .NET for Windows performs an AOT build that expires upon machine reboot, and Mono supports a special compilation flag that creates, for any executable or library you pass as an argument, a shared object containing the result of a standing AOT compilation.
One last mention goes to ASP.NET for developing web applications: this technology is the evolution of Active Server Pages, the widespread and much-discussed Microsoft platform for Web applications based on the proprietary interpreted language VBScript (the reason why it was always criticised). ASP.NET was born with the idea of obsoleting ASP, while preserving a certain level of backwards compatibility, using bytecode for performance and security. In this regard, we now explain the benefits of the safety of managed code: the most important is the rigorous control of pointers, which prevents your code from containing exploitable vulnerabilities for accessing memory areas not belonging to the object in use, and therefore prevents attackers from attempting buffer overflows with specially-forged input data. The


fact that the bytecode is compiled under JIT/AOT rules, instead of being interpreted every time, ensures much better performance.
The ASP.NET library is contained in the System.Web assembly and contains all the classes necessary to build Web applications on the model of HTTP Handlers (servlets) or Web Forms. An HTTP Handler is a class that responds to HTTP requests to a specific "page" (with default extension .ashx), much like Java Servlets and PHP scripts. A Web Form is the web equivalent of a Windows Form: the framework provides classes that represent the main components of an interactive HTML page, such as text boxes, tables and buttons, in the form of objects controlled programmatically; each class can also trigger events that can be handled by user code to perform interactions (e.g. when the login button is pressed, compare user name and password against the database and return the result in a textual label). A Web Form usually has the .aspx extension and is a far more sophisticated instrument than classical JSP, when the latter is not expanded with a graphical development framework (tag library). The tools to develop Web Services are provided by the System.Web.Services assembly, kept separate from System.Web to lighten the dependencies for clients that only use proxies. The skeleton of a Web Service consists of a class that inherits from System.Web.Services.WebService and is mapped to a .asmx file in the execution directory. The WSDL is generated dynamically by the ASP.NET runtime, at each request to the skeleton, using introspection (reflection) on the methods and parameters of the class (perhaps this is a point of weakness, as it was noticed that ASP.NET often binds XML namespaces to random prefixes, making processing by less sophisticated client libraries more challenging).
In order to develop Logbus-ng, we had to build a WSDL interface for monitors to ensure portability of the code. We now report on an innovative technique (36) that allows the developer to run an entire Web server without having to deploy his software within an HTTP/application server stack such as Microsoft Internet Information Services. The technique is based on the fact that the ASP.NET Framework is entirely developed in managed


code and contained entirely within the System.Web assembly, and the application server only uses the classes in System.Web to run the ASP.NET pipeline. To make an HTTP application server you must first start a web server; indeed, you can simply use some specific classes of the System.Net namespace that actually wrap the HTTP protocol over TCP sockets, and this server will then be able to forward the HTTP requests it receives to the ASP.NET pipeline, like IIS does. One difficulty is inherent in the fact that such a server must start within a security context other than the user's application, called an application domain (AppDomain), and each object residing in a domain different from the caller's is actually a proxy that alters the standard method-call mechanism so that even reference types are passed by value: for this reason, there must be at least one object "resistant to the transition between domains", inheriting from the MarshalByRefObject marker class, which guarantees reference exchange when the object is passed as a parameter to a method.
Currently, our prototype web server/application, based on the template provided in the article (36), has three limitations: it does not run under Mono, it requires super-user privileges and

it does not optimize performance through multithreading. To work around the first problem, in addition to reporting the problem to the Mono development team, we created a classic web application that works with Apache/mod_mono, so that it instantiates the Logbus-ng server directly in its context (and requires root privileges only if it is configured to listen on port 514). We are currently working on extending this to Mono by using the Mono.WebServer2 assembly, as suggested by some developers in a chat session.

XML configuration

One of the aspects of the Logbus-ng suite we took particular care of in our design is configuration. Since the beginning, we required Logbus-ng to support pluggable components for most of its parts: we support them by making wide use of reflection, a kind of type introspection implemented in .NET that allows instantiating an object of a type that is unknown to the running method until runtime, or even not linked at all with the running assembly. Reflection is based on dynamic class loading, but we will not go deeper into its details: instead, let us focus on the technique we used to ease configuration.


The Common Language Runtime's default means of providing configuration to a program is a special App.config file (Web.config for web applications), which is basically an XML file that is programmatically accessible. When compiling the .NET application, this file gets renamed to AppName.exe.config and is ultimately bound to the executable file19. The contents of the XML configuration can be accessed through the System.Configuration.ConfigurationManager class, which provides some abstractions over the XML markup: in particular, ConfigurationManager makes it easy to retrieve an entire set of key/value pairs of application configuration parameters in a special section, or to retrieve a whole XML node under the root by its name, called a configuration section. A configuration section (here comes the potential of .NET) can be bound to a class through a mapping in the same XML file. To be clearer, a special configSections node is defined in the App.config file like the following (we simplified the syntax for easy reading):
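The fragment being referenced can be sketched as follows. The section name and the handler class name come from the text; the assembly-qualified type string and the surrounding structure are assumptions, in the simplified spirit announced above.

```xml
<configuration>
  <configSections>
    <!-- maps the "logbus-core" section to its handler class
         (assembly name here is a placeholder) -->
    <section name="logbus-core"
             type="LogbusCoreConfigurationSectionHandler, Logbus.Core" />
  </configSections>

  <logbus-core>
    <!-- core configuration markup goes here -->
  </logbus-core>
</configuration>
```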

The meaning of this node is very simple: the "logbus-core" node is a configuration section, mapped to the LogbusCoreConfigurationSectionHandler class, so when retrieving the "logbus-core" section via ConfigurationManager's GetSection() method (which returns Object), the XML markup will be filtered by that class, which must be either an IConfigurationSectionHandler or a ConfigurationSection. The behaviour of these two is slightly different: if you inherit from ConfigurationSection, GetSection() will return an instance of your class, while if you implement IConfigurationSectionHandler, your class is supposed to parse the XML and return an object populated with data (and it is up to the caller to cast it properly).

19 .NET allows, using some tricks, to define a configuration file for DLLs too, which are class containers and not executable programs, but this practice is strongly deprecated by both Microsoft and the community


We chose the interface approach, and we will return to this choice later.

Our problem is now to map XML markup into a class in the easiest way. We immediately believed XSD to be the most straightforward way, as an XSD-to-C# compiler exists: xsd.exe. We then defined the syntax and semantics of the configuration sections in XSD format, used xsd.exe to compile the schema into C# code, and finally got lots of classes as a result. These auto-generated classes, whose code we do not want to reproduce, are full of C# attributes that define the mapping between properties and elements in the XML syntax. What are they for? First of all, C# attributes are classes that can be used to "decorate" a class or one of its members. Attributes are of a type that inherits from System.Attribute and, like other classes, expose public properties that can be set from the constructor or with an explicit-initialization construct that is part of the C# syntax and that we will not discuss. Attributes are only accessible via reflection, once you have a reference to an object or its type. For XML serialization purposes, scanning the attributes of a type is needed to determine the mapping between properties and XML elements: when you define a property in a C# class, you cannot otherwise specify whether it is mapped to an XML attribute or an element, and under what name. XML serialization attributes also help handle fields when the property is null, and define the XML namespace that elements belong to.
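The attribute-driven mapping can be illustrated with a tiny hand-written class (the xsd.exe-generated ones look similar, only far more verbose); the names here are invented for the example, not taken from the Logbus-ng schema.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("listener")]
public class ListenerConfig
{
    // Serialized as an XML attribute named "type".
    [XmlAttribute("type")]
    public string Type;

    // Serialized as a child element named "port".
    [XmlElement("port")]
    public int Port;
}

public class Demo
{
    public static void Main()
    {
        // One instruction each way: object -> XML, then XML -> object.
        var serializer = new XmlSerializer(typeof(ListenerConfig));
        var writer = new StringWriter();
        serializer.Serialize(writer, new ListenerConfig { Type = "udp", Port = 514 });
        string xml = writer.ToString();

        var roundTrip = (ListenerConfig)serializer.Deserialize(new StringReader(xml));
        Console.WriteLine("{0} {1}", roundTrip.Type, roundTrip.Port); // prints "udp 514"
    }
}
```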

These compiler-generated classes are used by .NET's XML serialization facility, which is able to translate a class into XML and vice versa using only one instruction! The reason why we chose the interface approach for the section handler is to avoid touching the compiler-generated code, instead delegating a separate class to call the XML serializer. The following code fragment shows an example node of the Logbus-ng server configuration.
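The configuration fragment being discussed has roughly this shape. The element and attribute names below are a sketch inferred from the surrounding description, not the exact syntax of the schema (which is available in Appendix Charlie); only the listener types and port numbers come from the text.

```xml
<logbus-core>
  <in-channels>
    <!-- Syslog-over-UDP listener on the default port -->
    <channel type="SyslogUdpReceiver">
      <param name="port" value="514" />
    </channel>
    <!-- second Syslog-over-UDP listener on a custom port -->
    <channel type="SyslogUdpReceiver">
      <param name="port" value="7514" />
    </channel>
    <!-- Syslog-over-TLS listener on its default port -->
    <channel type="SyslogTlsReceiver">
      <param name="port" value="6514" />
    </channel>
  </in-channels>
</logbus-core>
```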


In this example fragment, we wanted to configure the Logbus-ng core with two Syslog-over-UDP listeners, on the default port (514) and on port 7514 respectively, and another Syslog-over-TLS listener on its default port (6514). The semantics of each param element is actually defined by design contract, allowing us to configure any kind of option for each pluggable component. Even components shipped with Logbus-ng are treated as pluggable components, like third-party ones, except that we allow a special short format for their class names, without namespace and assembly, which is automatically completed by the code with default values.

The following code fragment shows how easy it is to parse a configuration section from C# with such a well-designed schema.


object IConfigurationSectionHandler.Create(object parent, object configContext, XmlNode section)
{
    try
    {
        return new XmlSerializer(typeof(LogbusCoreConfiguration)).Deserialize(new XmlNodeReader(section));
    }
    catch (InvalidOperationException)
    {
        return null;
    }
}

The above fragment deserializes the content of the section object, which corresponds to XML markup similar to the fragment above.

If we ever wanted to do something similar in Java, we would have had to parse the XML syntax manually, which is an error-prone approach. In C#, thanks to xsd.exe, we get clean code and fulfil the DRY20 principle. In fact, the XML syntax is defined only once, in the schema, and the rules to map it to C# classes are created by an automated compiler that takes the schema as an input. We just have to make sure that every time the schema gets changed, it also gets recompiled into C#. The schema is available in Appendix Charlie.

Determining a host's IP address in "Connect-to-me" protocols

We already mentioned that monitor clients can choose among different protocols for the delivery of log messages, the main ones being Syslog over UDP (24) and over TLS (23). These two protocols assume that the Logbus client is a Syslog server. While we perform subscription using SOAP, which in our case is based on HTTP and has perfect knowledge of the initiator's IP address, we deliberately chose not to consider this information when

delegating the transport manager to process the request. The figure below shows the chain of responsibilities when performing the SubscribeClient request:

Client → WSDL proxy → ASP.NET skeleton → Logbus core → Channel → Transport manager

Orange entities are aware of IP addresses as contextual information. Red entities will not

20 Do not Repeat Yourself! This pattern discourages repeating code or markup in multiple places in a project to avoid potential errors 53

Logbus-ng: a software logging bus for Field Failure Data Analysis in distributed systems

know the IP address of the originator unless explicitly passed as parameter. The reason why we keep these other entities unaware of IP address is portability: if you want to per- form subscription using a different protocol, or programmatically, or on networks different from IP21, you cannot rely on an explicit IP address. Another reason why we chose to hide the initiator’s IP address is the possibility to send logs to another network host, maybe us- ing a faster connection. So we had to implement the transport is design contract as a con- nect-to-me protocol: it means that the originator explicitly declares the IP endpoint to use

as the destination in the level-5 PDU22. This translates into using the transport input parameters to configure the endpoint. In this paragraph, we deal with the problems related to sending UDP datagrams and opening TCP connections that we found during development, and for which we implemented a solution.

We already know that it is impossible for clients behind a NAT to receive inbound UDP

datagrams unless the NAT is configured for port forwarding or is UPnP-enabled; this problem only concerns IPv4, since IPv6 does not use address translation and every IPv6 address is a public address. We will now assume, in our examples, that communication between Logbus and clients is possible according to the network topology. All programming languages, including C#, provide means to retrieve the local machine's IP addresses with a simple code fragment, conceptually equivalent to the UNIX ifconfig command:

using System.Net;
IPAddress[] a = Dns.GetHostAddresses(Dns.GetHostName());

It returns more than one address, since a host can be assigned multiple addresses when it is equipped with multiple network cards or uses IP mobility protocols. The problem we found lies in the correct choice of one of the many possible IP addresses.
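As a sketch of the address-selection problem, and of the routing-table query adopted later in this section, the following fragment first enumerates all local addresses and then lets the kernel pick the route-specific one by "connecting" a dummy UDP socket towards the remote endpoint. The helper class and method names are ours; Dns, Socket and related types come from the .NET base class library.

```csharp
using System.Net;
using System.Net.Sockets;

static class LocalAddressHelper
{
    // Naive choice: first non-loopback address returned by DNS.
    // This is exactly the criterion that proves unreliable when
    // the host has multiple network interfaces.
    public static IPAddress FirstNonLoopback()
    {
        foreach (IPAddress addr in Dns.GetHostAddresses(Dns.GetHostName()))
            if (!IPAddress.IsLoopback(addr))
                return addr;
        return IPAddress.Loopback;
    }

    // Smarter choice: "connect" a dummy UDP socket to the remote
    // endpoint (no datagram is actually sent) and read back the local
    // address the kernel bound it to, i.e. the one matching the
    // kernel's routing table for that destination.
    public static IPAddress ForDestination(IPAddress remote)
    {
        using (Socket s = new Socket(remote.AddressFamily,
                                     SocketType.Dgram, ProtocolType.Udp))
        {
            s.Connect(new IPEndPoint(remote, 7)); // any port will do
            return ((IPEndPoint)s.LocalEndPoint).Address;
        }
    }
}
```

Connecting a UDP socket performs no network I/O; it only asks the kernel for a route, which makes it a cheap way to query the routing table from managed code.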

21 You would need to use a different Logbus transport protocol in this case
22 Protocol Data Unit


A greedy approach involves choosing the first IP address in the list, but this choice has proven to be definitely wrong because, in most cases, the first available address is the loopback address 127.0.0.1 (::1 in IPv6). Even choosing the first non-loopback IP, we might encounter problems.

Let us consider the following network topology as an example of things working fine:

[Figure: client and server attached to the same network, 45.0.0.0/8 — server with addresses 127.0.0.1 and 45.2.6.86, client with addresses 127.0.0.1 and 45.13.2.86]

Client and server are both in the same subnet, and the client belongs only to it. Choosing the only non-local IP, 45.13.2.86, is the right choice in this case, as the server can reach that address. If both client and server are connected to the Internet and have WAN addresses, this technique works.

Let us consider a pair of specular cases, a bit more complicated, where the client has multiple network interfaces of different kinds:

[Figure: server with addresses 127.0.0.1 and 45.2.6.86 on network 45.0.0.0/8; client with addresses 127.0.0.1, 192.168.6.56 and 39.66.0.213, attached both to the LAN 192.168.6.0/24 and to the WAN 39.66.0.0/16]



In the first case, represented by the diagram, the client has both LAN (192.168.6.0/24) and WAN addresses, and the server is connected to the Internet. In this case, choosing the WAN address as the favourite address makes the client able to receive datagrams, but this works only as long as we assume that the server can access the WAN too. The specular case, not shown in the diagram, is that in which the server is connected to the LAN and only to it, a common situation for log servers (usually to enforce security by fault avoidance). In this case, choosing the WAN address is a bad choice, as the

server will not be able to reach it. Since there is no unique criterion to choose the IP address to use, we understood that the only viable approach is a smart choice of the address, made either by pre-configuration or by dynamically querying the routing tables. Two problems arise here: the first is avoiding changes to the external interface of ILogClient, the basic interface for Logbus monitoring; the second is activating the lowest-level routing mechanism from C#, possibly remaining within managed code. By doing so we can obtain a local IP address that is a function of the

remote IP address; but, at the same time, if our subscription manager is bound to a protocol different from SOAP/HTTP, we can only try to choose the best WAN address and hope it works. We must not forget that the Logbus-ng APIs are made of interfaces that developers are free to implement in their own way, in order to achieve extensibility over new protocols. Our criterion is an opportunistic scan using reflection: if the client subscriber is our

SOAP/HTTP proxy, we can read the endpoint's URL and get the remote endpoint's address. In order to force the kernel to perform a routing attempt, we create a fake socket, try to connect it to that address, and then read which local address the socket is bound to. Our approach is opportunistic because, if anything in this procedure fails, we fall back to the old criterion of the best WAN address available.

Running a web application from inside a console application

Another important aspect of Logbus-ng is that it is designed to run the SOAP over HTTP protocol (the basis for Web Services) as a server. This involves running an entire web application which is conceptually independent from the Logbus core, and it is easier said than done. The good news is that the ASP.NET pipeline is 100% managed code, so nothing prevents ASP.NET from being run outside the scope of a web server like IIS or XSP (which is basically a console application that spawns a web/application server, nothing more). One of the main features of ASP.NET is that it runs in a special protected memory environment to prevent the parent application from being affected by errors in the web application, making it safer for the host to unload the web application without consequences. These memory environments are called Application Domains, and their usage (one domain per web application) is justified in large hosting environments like IIS. Moreover, the IIS server itself periodically refreshes the AppDomain by reloading the web application in order to prevent software aging problems, even if managed code is unlikely to show such problems. We must deal with this when designing our web application. Another aspect we must take into account is that Logbus-ng supports plugins, which are able to expose a WSDL interface that must be deployed together with the main web application, characterized by the two endpoints LogbusManagement and LogbusSubscription (with the .asmx extension in ASP.NET).

We found a couple of tutorials about running web applications from within console applications; however, the main difficulty is to allow the web application to communicate with Logbus-ng's core object: objects that reside in different AppDomains, for protection reasons, are passed by value when crossing the AppDomain "barrier", even if they are reference types. This is called marshalling by value, and involves a shallow copy of the members of an object, actually resulting in the existence of two live objects, which is what we do not want. We then had to explicitly require .NET to marshal every object we needed to cross the barrier by reference, by inheriting from a special class, MarshalByRefObject, that is able to generate a proxy for the parent object. Let us keep this for later.

In the ASP.NET world, each page, handler, script or web service is bound to a representing file in the web application directory. ASP.NET is based on virtual and physical directories, so we cannot create an in-memory web application and are forced to write into the file system (but we will clean up the mess upon termination). The "representing file" contains a special header and/or code. Assuming we write code only in the scope of classes (which is the best practice), only the special markup header must be written. It declares the nature of the file and binds it to a class. In order to better understand this, let us analyse in depth the three compilation stages of ASP.NET in the best-practice case.

Once you create a web application with pages and code-behind, you can compile it with the usual .NET compiler and get an assembly. The resulting web application contains only the .aspx, .ashx, .asmx etc. files and the binaries in the bin/ directory. Web Forms files (.aspx) also contain page markup that declares the static HTML code and the controls to use. All files contain the header that references the class the element is bound to. When running the web application, a second compilation is performed: each ASP.NET file is compiled into an assembly with only one class, inheriting from the class declared in the header. For Web

Forms, the class that inherits from the user-code's page programmatically defines its elements (i.e. writes static HTML to the output stream, declares controls as objects). For web services, we just need the skeleton class to run them. In our solution, we created a temporary directory (like the real ASP.NET does, to avoid dirtying the original web application's directory) into which we wrote specially forged .asmx files, LogbusManagement.asmx and LogbusSubscription.asmx, pointing to the skeleton classes declared in the It.Unina.Dis.Logbus core assembly. We also did the same for Global.asax, a special file that is used to declare the HttpApplication object responsible for managing the web application's life cycle, so that we have full programmatic control over the web application. We also had to deploy specially forged .asmx files for plugins according to their needs: as we will describe in the following paragraph, plugins can declare zero or more SOAP endpoints and provide skeletons for them, so that the web application can be completely instantiated. The next phase is to get the web application to communicate with the core Logbus, crossing the



AppDomain barrier: we created a design contract in which the Logbus core "drops" a cross-domain proxy to itself into a named collection of the AppDomain class, where the element is simply named "Logbus". Upon startup of the web application, our HttpApplication object tries to retrieve that proxy and sets it as an application-wide environment object. Something very similar is done for plugins.
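A minimal sketch of this hand-off follows. AppDomain, MarshalByRefObject, SetData and GetData are real .NET APIs; the proxy class and its member are hypothetical stand-ins for the real Logbus core interface.

```csharp
using System;

// Inheriting MarshalByRefObject makes .NET pass a transparent proxy,
// rather than a serialized copy, when the object crosses an AppDomain.
public class LogbusProxy : MarshalByRefObject
{
    public void SubmitMessage(string message)
    {
        // forward to the real Logbus core living in the parent domain
    }
}

class WebActivatorSketch
{
    static void Main()
    {
        AppDomain webDomain = AppDomain.CreateDomain("LogbusWebApp");

        // "Drop" the proxy into the web application's named collection
        webDomain.SetData("Logbus", new LogbusProxy());

        // Inside the web domain, the HttpApplication would later do:
        // var logbus = (LogbusProxy)AppDomain.CurrentDomain.GetData("Logbus");
    }
}
```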

One final note is about Mono: our initial code worked only under Windows, for unknown

reasons. After chatting with the Mono developers, we decided to follow their advice and use the same code XSP does, by importing the Mono.WebServer2 assembly into our workspace. We did not discard our old code, but differentiated the releases for the .NET and Mono platforms by playing with compilation flags.

Concurrency issues

In order to guarantee high performance in the Logbus-ng core, we chose to make extensive use of parallelism to perform most of the tasks.

The first aspect we took care of was the main delivery pipeline, already shown in previous figures. We felt the need to parallelize each stage, like in a hardware pipeline or an industrial assembly line. Each entity in the pipeline, then, is allocated at least one worker thread. In order to obtain a performance increase, however, we must not forget some aspects:
1. Concurrency leads to parallelism only on multi-core architectures

2. The benefits of multithreading are greatest with I/O-bound operations

3. The C# event construct may look asynchronous to the event handler, but it is fully synchronous for the event invoker (it behaves like any regular method), so all event handlers are executed in sequence and block the caller, despite their seemingly "concurrent by design" nature
4. Multithreaded operations introduce non-determinism. Logbus-ng is not required to deliver messages in order, so this is not a problem
5. There is a limit, dependent on the hardware and software configuration, to the number of concurrent threads in an application, above which performance decreases rather



than increasing.
In order to provide native concurrency to the pipeline stages, we decided to widely adopt FIFO queues, at least at the entrance of each stage. LogbusService, the class that implements the ILogBus interface and runs as the hub, automatically inherits ILogCollector23 and implements its SubmitMessage() method as an "enqueue" operation on a synchronized FIFO queue, which is read by another thread. By doing so, it frees the caller thread as soon as possible and leaves the responsibility of continuing the processing to the dedicated worker

thread. The same class is an event handler for the MessageReceived event of ILogSource, the interface that every IInboundChannel must implement by design contract: the event handler, which keeps the channel's thread busy, simply queues the message into the buffer. We did not mention, however, that LogbusService uses 4 threads and 4 queues, which are a good trade-off. Each thread is dedicated to one queue, and the queues are filled using a round-robin rule24. We tried to avoid locks at all costs in our code, at least for high-performance operations such as message processing and delivery (we believe a client can

wait a little for a subscription to complete, but we cannot slow down message processing under high rates of log messages), because locks are known to be very slow.
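The enqueue-and-forget pattern described above can be sketched as follows. For brevity, this version uses Monitor-based synchronization, whereas the text above states the goal of keeping locking in the hot path as cheap as possible; class and method names other than Queue, Thread and Monitor are ours.

```csharp
using System.Collections.Generic;
using System.Threading;

class QueueingCollectorSketch
{
    private readonly Queue<string> _queue = new Queue<string>();

    public QueueingCollectorSketch()
    {
        // A dedicated worker thread drains the queue in the background
        Thread worker = new Thread(ProcessLoop);
        worker.IsBackground = true;
        worker.Start();
    }

    // Called by the inbound channel's thread: returns almost immediately
    public void SubmitMessage(string message)
    {
        lock (_queue)
        {
            _queue.Enqueue(message);
            Monitor.Pulse(_queue); // wake the worker
        }
    }

    private void ProcessLoop()
    {
        while (true)
        {
            string message;
            lock (_queue)
            {
                while (_queue.Count == 0)
                    Monitor.Wait(_queue);
                message = _queue.Dequeue();
            }
            Deliver(message); // continue the pipeline off the caller's thread
        }
    }

    private void Deliver(string message) { /* forward to outbound channels */ }
}
```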

However, we could not avoid using locks in some parts of our code. Let us examine the following C# fragment:

foreach (IOutboundChannel chan in OutboundChannels)
    chan.SubmitMessage(newMessage);

This fragment is executed concurrently by the four delivery threads to deliver messages to all the outbound channels. When we perform a channel creation, the following code is executed by another thread:

OutboundChannels.Add(channel);

Let us now suppose that two channel creation operations are always executed in sequence

23 By design contract: "you can submit messages to ILogBus"
24 We previously tried a "random-robin" rule, choosing the queue randomly, but found that the queues easily ended up being filled unevenly


(no conflict between threads modifying the list): the two fragments shown still conflict when run concurrently. This is because the execution of the for-each loop can be interrupted at any time, and the insertion operation can be executed meanwhile. This invalidates the enumerator that the reader thread is using to scan the collection, thus throwing an exception because the list was altered, and consequently losing the message. In order to avoid this, we must lock the list prior to writing. But it would be unacceptable to allow only one thread at a time to scan/write the list. We decided to use the ReaderWriterLock class to allow multiple readers, but only one writer, to operate concurrently on the collection. The reading fragment then becomes

_outLock.AcquireReaderLock(DEFAULT_JOIN_TIMEOUT);
try
{
    foreach (IOutboundChannel chan in OutboundChannels)
        chan.SubmitMessage(newMessage);
}
finally { _outLock.ReleaseReaderLock(); }

and the writing fragment becomes

_outLock.AcquireWriterLock(DEFAULT_JOIN_TIMEOUT);
try
{
    OutboundChannels.Add(channel);
}
finally { _outLock.ReleaseWriterLock(); }

These kinds of statements, widely used in the code, guarantee that there will be no conflicts between readers and writers, while readers can always operate concurrently. We

actually implemented the ReaderWriterLock pattern in a cleaner manner in some cases: by first acquiring a reader lock (to scan the collection), then upgrading it to a writer lock (only when writing is actually needed), then downgrading the writer lock and finally releasing the reader lock. This is the common behaviour of transactional systems.

Plugin APIs

Let us now deal with the implementation of the plugin system's API. As we mentioned


earlier, a plugin is a generic dynamically-loaded component with very few requirements: it must be able to support the Logbus-ng server life cycle. The other strict requirement is that, if the plugin has to expose an interface to clients, Logbus must expose this interface together with its standard WSDL endpoints: this second requirement has been a bit tricky to implement, as we are going to see.

We developed an IPlugin interface, which must be implemented by all custom plugins, and defined a special XML configuration node for plugins. Each child node allocates a plugin for dynamic loading by reflection. No configuration is expected inside the plugin's declaration node: we found it simpler to let plugins define their own customized configuration sections in the App.config file, also because the whole configuration can then be parsed in one shot. In our opinion, allowing free XML markup under the plugin's declaration node would be harder to handle when parsing the configuration in an automated way. This choice also lightens the computational effort required by the configuration parser, since the parsing rules are up to the plugin (which can always perform manual DOM inspection rather than using our object-oriented approach). In order to support the Logbus server life cycle, we need the plugin instance to obtain a valid reference to the server object and possibly to all those in the pipeline. We achieve this by implementing a Register method which takes the current ILogBus instance as an argument and allows the plugin both to store the reference for future use, and to hook up to

Logbus events such as channel creation, client subscription, etc.
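A configuration fragment of the kind just described might look as follows; element and attribute names, as well as the plugin type, are purely illustrative, not the exact Logbus-ng schema:

```xml
<logbus>
  <plugins>
    <!-- each child node names a type to be loaded by reflection;
         no further configuration is allowed inside this node -->
    <plugin type="Example.Plugins.CounterPlugin, Example.Plugins" />
  </plugins>
</logbus>
<!-- plugin-specific settings live in their own App.config sections -->
```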

We must admit, however, that part of the current specification of ILogBus was not the result of a waterfall design technique: when we discovered that plugins need some events to hook up to, we decided to add them to the ILogBus specification. These public members were the result of a requirements change, and you can see them by comparing older SVN revisions. Let us now examine the full definition of the IPlugin interface before dealing with web service


activation:

public interface IPlugin : ILogSupport, IDisposable
{
    void Register(ILogBus logbus);
    void Unregister();
    string Name { get; }
    WsdlSkeletonDefinition[] GetWsdlSkeletons();
    MarshalByRefObject GetPluginRoot();
}

The WsdlSkeletonDefinition struct is basically a pair of a string and a type, which we will discuss soon. IPlugin inherits ILogSupport to be assigned (or better, "injected") a logger by Logbus-ng, which actually logs to Logbus itself or to the logger designated via configuration to handle Logbus-internal messages, and inherits IDisposable to allow easy resource deallocation upon Logbus destruction. We have already seen the Register method. Unregister simply signals the plugin that it is time to stop working and detach from Logbus, because the plugin's life cycle is almost over (the next step is reasonably destruction), but we did not design any scenario in which a plugin gets deallocated dynamically by the Logbus core. The Name property is used to identify a plugin in design contracts. This actually makes all plugins singleton entities, because the only constraint on the name is that no other plugin has already been registered with the same name. This, together with the final members, is part of the WSDL design contract of plugins.
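For illustration, a bare-bones plugin honouring this contract might look like the following; the plugin class itself is hypothetical, and member names not quoted in the text (the injected Log property, the MessageReceived event) are assumptions about the surrounding APIs.

```csharp
using System;

// Hypothetical plugin that only counts bus traffic and
// exposes no SOAP endpoint.
public class CounterPlugin : IPlugin
{
    private ILogBus _logbus;
    private long _count;

    public ILog Log { get; set; }          // assumed ILogSupport member, injected by Logbus-ng

    public string Name { get { return "CounterPlugin"; } }

    public void Register(ILogBus logbus)
    {
        _logbus = logbus;                  // keep the reference for later use
        _logbus.MessageReceived += delegate { _count++; }; // assumed event
    }

    public void Unregister() { _logbus = null; }

    // No WSDL endpoints: both members return null by contract
    public WsdlSkeletonDefinition[] GetWsdlSkeletons() { return null; }
    public MarshalByRefObject GetPluginRoot() { return null; }

    public void Dispose() { }
}
```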

Since SOAP endpoints need a class (inheriting WebService) to be mapped to a script name, and cross-domain calls require a MarshalByRefObject, we use both in our design contract, which is the following:
1. If the plugin does not expose WSDL interfaces, both GetWsdlSkeletons and GetPluginRoot return null
2. For each WSDL endpoint exposed by the plugin (there can be zero or more), GetWsdlSkeletons returns a pair in which the string represents the final part of the SOAP endpoint's URL (the script name, without extension), and the type represents



the skeleton type that will process requests to that endpoint
3. If required by the plugin's design, a cross-domain object can be returned by GetPluginRoot, and it will be stored in the web application's AppDomain context with the plugin's name as unique key
Logbus-ng's web activator, which already creates a dynamic web application from scratch, as we showed earlier, can easily take care of plugins by creating an .asmx template for each endpoint defined by plugins, mapping the strong name of the type used to process SOAP

requests, plus it stores the plugin's cross-domain proxy in the web application's space.

Field Failure Data Logging support

We already dealt with FFDL/FFDA in the early chapters. When designing and developing Logbus-ng, we focused on making it a general-purpose logging bus that also supports FFD logging natively. However, we felt the need to separate the general logging features from the specific subset of FFD logging, to lighten the final package and allow system administrators to use Logbus-ng as a real-time log collector/distributor without caring for

FFDL/FFDA. In order to do this, we created a separate Extensions package containing all the tools needed to perform both FFD logging and analysis.

In this package, a whole namespace is dedicated to FFD logging. APIs are provided both for loggers (FFD-instrumented applications) and monitors (applications that monitor the system's health and raise alerts if needed).

Let us now analyse how to use the FFD logger to correctly generate a sequence of log messages that is suitable for FFDA. Pecchia (9) showed that a set of logging rules should be applied to the code in order to detect timing failures in case a transaction hangs or deadlocks. Suppose we have a couple of methods in two entities that interact with each other, like the following example:



class ClassA {
    public void ServiceX() {
        […]
        _b.ServiceM();
        […]
    }
}

class ClassB {
    public void ServiceM() {
        […]
        Resource.Read();
        […]
    }
}

Logging rules suggest instrumenting the code according to the following template:

class ClassA {
    public void ServiceX() {
        try {
            _logger.LogSST();
            […]
            _logger.LogEIS();
            _b.ServiceM();
            _logger.LogEIE();
            […]
        }
        catch {
            _logger.LogCMP();
            throw;
        }
        finally {
            _logger.LogSEN();
        }
    }
}

class ClassB {
    public void ServiceM() {
        try {
            _logger.LogSST();
            […]
            _logger.LogRIS();
            Resource.Read();
            _logger.LogRIE();
        }
        catch {
            _logger.LogCMP();
            throw;
        }
        finally {
            _logger.LogSEN();
        }
    }
}

Thanks to this instrumentation, the very first event logged by the methods is SST. If the method is expected to throw exceptions (the throws statement is omitted for the Java case), any exception coming out of the method's scope is first caught by the outer catch block, the CMP event is logged, and the exception is re-thrown to the caller; if the method returns, the SEN event is logged. Also, when a method interacts with another entity or resource (the difference is highlighted in (9) and (10), but it can be summarized as follows: entities are part of our analysis and are themselves instrumented with the rules, while resources are still subject to failures but are not part of our model), a specific interaction start event is logged.

When analysing the log trace, missing end events highlight that a problem occurred in the system. Monitoring applications are supposed to trigger alerts. In our model, an alert message (computation, entity interaction or resource interaction) is anonymous, i.e. it does not specify which entity failed and at what point. However, nothing stops the monitor from adding a text log message with detailed information. In the FFD package, we also provide simple tools to help parse FFD messages coming from an ILogSource.

The Entity Manager plugin

We just dealt with FFDL and FFDA. In order to facilitate FFDA, we decided to ship Logbus-ng with a tool that helps monitoring applications identify the entities that are active in the system. This is the Entity Manager plugin, which is developed as a dynamically loaded plugin with SOAP interfaces.

The Entity Manager plugin is built around the concept of logging entity (9),



which we defined as the triple (host name, PID25, logger name). The reason for this choice is that, even though we could perform fine-grained analyses down to the exact method that logged an event, we found it a good trade-off to group all classes/methods with the same logger name (which is mostly hard-coded) into the same entity. The EM plugin also supports legacy logs by handling the case in which not all fields are available, though we assume that the host name is always present.

How does the EM plugin help analyses? It simply gets all log messages from Logbus-ng and scans them for the entity's definition. For each entity found, a row in a table is created, if the entity does not exist yet, or updated, if it already does. The information includes the FFDA capability of the entity: an entity is FFDA-supported if it has ever sent at least one FFD message; otherwise it is assumed (until proven otherwise) that the entity does not support FFDA (at least with our message convention). Clients can query the EM plugin for active entities, either enumerating them all or selecting only those that match a given criterion, particularly on inactivity time. The last activity time is split into two timestamps: the last time the entity sent a log message, and the last time it sent a special heartbeat message. For each entity identified in the system, up to two free26 channels are created: one is always created and broadcasts all log messages from the given entity; the other broadcasts only FFD messages, if the entity is FFDL-capable.

Together with the plugin, we ship a WSDL file that defines its interface (mandatory to develop clients), plus a pre-defined proxy to quickly use the EM interface from C#. The WSDL file can still be compiled into Java code, or into any other language, for cross-platform development. The Entity Manager plugin is mapped to the EntityManagement SOAP endpoint, hence EntityManagement.asmx in the C# Logbus-ng server. Let us conclude this paragraph with a flow diagram that illustrates the algorithm used by the EM plugin when receiving new messages from the Logbus server.

25 Application name is used if the PID is absent
26 Clients save the computational overhead of creating the channels on their own


[Flow diagram: message processing in the Entity Manager plugin. For an incoming message, the Process ID, if specified, is used as the "Process" attribute; otherwise the Application Name is used. If the entity is new, a record and its entity channel are created, and, if the message is an FFDA message, the entity is marked FFDA-enabled and its FFDA entity channel is also created. If the entity already exists, is not yet marked FFDA-enabled, and the message is an FFDA message, the entity is marked FFDA-enabled and the FFDA entity channel is created. Finally, the last-heartbeat timestamp is updated if the message is a heartbeat, the last-action timestamp otherwise (due to the nature of FFDA messages, an FFDA message is never a heartbeat).]
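In pseudo-C# form, the algorithm illustrated by the diagram reduces to the following sketch; all type and member names here are illustrative, not actual Logbus-ng APIs.

```csharp
// Sketch of the Entity Manager update step (names are hypothetical)
void OnMessageReceived(SyslogMessage msg)
{
    // Use the Application Name when the Process ID is absent (footnote 25)
    string process = msg.ProcessId ?? msg.ApplicationName;
    EntityKey key = new EntityKey(msg.Host, process, msg.LoggerName);

    EntityRecord entity;
    if (!_table.TryGetValue(key, out entity))
    {
        entity = new EntityRecord(key);
        _table[key] = entity;
        CreateEntityChannel(entity);          // always created for new entities
        if (IsFfdaMessage(msg))
        {
            entity.FfdaEnabled = true;
            CreateFfdaChannel(entity);        // only for FFDA-capable entities
        }
    }
    else if (IsFfdaMessage(msg) && !entity.FfdaEnabled)
    {
        entity.FfdaEnabled = true;
        CreateFfdaChannel(entity);
    }

    // FFDA messages are never heartbeats, so the two updates are disjoint
    if (IsHeartbeat(msg))
        entity.LastHeartbeat = DateTime.UtcNow;
    else
        entity.LastAction = DateTime.UtcNow;
}
```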

Log4net interoperability

One of the most interesting requirements for Logbus-ng is interoperability. We already mentioned the existence of several logging frameworks on the market, and the log chaos caused by the heterogeneity of formats. We also said that, until a de facto standard is adopted by all frameworks, messages created with each specific framework need to be converted, and also that Syslog 2009 was a good choice because it is the most widely adopted. In order to help developers of legacy applications that already use a logging framework support Logbus-ng, the very last thing we should do is force them to change the source code of their software and start using our logging APIs instead. We already mentioned that the BSD Syslog protocol was adopted by log4net (5), a popular logging framework for the .NET platform, through the usage of a special appender class.

[Figure: log messages flowing from log4net to Logbus-ng]

Now, in order to achieve log4net to Logbus-ng compatibility, we might simply tell developers to use a RemoteSyslogAppender (27) properly configured with the Logbus-ng server IP and port. However, we found that this appender is buggy; it is out of date, since BSD Syslog (7) does not store as much information as Syslog 2009 does; it has to be used together with a specific pattern layout (thus violating the separation of responsibilities principle, or otherwise making it possible to violate the Syslog protocol by choosing a wrong format); and, finally, log4net collects much useful information that we may want available in Logbus-ng, like the class and method that invoked the logging method, which gets dropped in BSD Syslog. We then decided to implement our own message formatter for the Syslog 2009 format. Formatters (or better, layouts) are independent from appenders, because their goal is to format the log message from its logical representation into a serialized string. However, since RemoteSyslogAppender prepends the priority value to the message, it cannot be used in conjunction with our formatter, which is SyslogLayout from the It.Unina.Dis.Logbus.log4net



namespace. Instead, to correctly use the Syslog-over-UDP protocol (24), developers must use log4net's UdpAppender class, which is the base class of RemoteSyslogAppender anyway. Configuring log4net to use our code is as easy as configuring Logbus-ng, because log4net too uses the App.config file to configure loggers. Logs produced by log4net can always be used for FFDA based on legacy logs.
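Such a configuration might look like the following App.config fragment. UdpAppender and its remoteAddress/remotePort parameters are standard log4net; SyslogLayout is the class from the It.Unina.Dis.Logbus.log4net namespace described above; the server address and port shown are examples only.

```xml
<log4net>
  <appender name="LogbusAppender" type="log4net.Appender.UdpAppender">
    <remoteAddress value="10.0.0.1" />  <!-- example Logbus-ng server -->
    <remotePort value="3588" />         <!-- example UDP port -->
    <layout type="It.Unina.Dis.Logbus.log4net.SyslogLayout" />
  </appender>
  <root>
    <level value="ALL" />
    <appender-ref ref="LogbusAppender" />
  </root>
</log4net>
```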

We also chose to implement reverse-compatibility.

[Figure: log messages flowing from Logbus-ng to log4net]

While log4net is older software compared to Logbus-ng, it still provides lots of useful features that we decided (for this very reason) not to reimplement in our code: mainly, lots of useful appenders to send logs via email or store them into a database27. In order to use Logbus-ng's logging APIs and let log4net store or deliver messages according to its pre-existing configuration, we created an ILogCollector (equivalent to an appender) that is called Log4netCollector and can be instantiated like other collectors via configuration, either for logging or to forward messages transiting through the Logbus-ng server.

Our collector has a simple configuration and usage: first, define a proper log4net configuration, possibly specifying one or more logger names to be used; then, in the Logbus configuration, use the Log4netCollector as the log collector for logging APIs or forwarding, and add the "logger" parameter to its configuration, specifying the log4net logger name to use.

27 This is what most logging platforms do. Logbus-ng would not be commercially acceptable if it had no way to store messages in a relational DBMS for easy querying, and that is another reason to rely on log4net


Experimental validation

No software can be implemented and then released to the public without proper validation: Software Engineering requires a comprehensive test phase before the software is ready to be published. In this chapter, we illustrate the main experiments we performed to validate Logbus-ng's functional and non-functional requirements. Functional requirements have been tested using the typical tools of Software Engineering, such as unit testing and system testing. Fortunately, unit testing is widely supported by commercial IDEs, and there are several solutions, both open source and commercial, to perform test

generation and, most importantly, automation. We created a test suite with the Visual Studio 2008 Team System software to perform most of the automated unit tests. In the next paragraph we will take a look at the tests for only one unit.
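As an illustration, one such automated test might look like the following MSTest sketch; SyslogMessage.Parse and the asserted property names are assumptions about the parser's API, while the sample message is the one given in RFC 5424.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class SyslogParserTest
{
    // Example message taken from RFC 5424, section 6.5
    private const string Rfc5424Example =
        "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 "
        + "- 'su root' failed for lonvick on /dev/pts/8";

    [TestMethod]
    public void ParsesRfc5424ExampleMessage()
    {
        // SyslogMessage.Parse is an assumed entry point of the parser
        SyslogMessage msg = SyslogMessage.Parse(Rfc5424Example);
        Assert.AreEqual("mymachine.example.com", msg.Host);
        Assert.AreEqual("su", msg.ApplicationName);
    }
}
```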

Before going into the details of unit testing, let us take a look at the equipment in the Mobilab laboratory in Naples. The lab is equipped with a 1 Gbps Ethernet cable network, plus a wireless access point for

802.11g protocol, widely used by laptops controlling the experiments, all inside a 10.0.0.0/8 private LAN. We had available 6 computers with Pentium 4 CPU, 2GB of RAM memory and openSUSE 11.3 operating system with desktop-optimized kernel. Native-OS hosts were artax (10.58.6.30), mizard (10.25.4.32), megres (10.144.166.113), marty86ce (10.13.2.86). Virtual hosts were rosy717 (10.2.6.86) and pegasus (10.40.30.20). We used them all in conjunction mainly to perform stress tests. Logbus-ng server was mainly deployed to mizard node to test Mono compatibility.


Unit testing for Syslog parser
In order to validate the correctness of our Syslog (19) parser and UDP (24) listener, we tried to create an environment as realistic as possible, while remaining within a laboratory unit-testing setting. RFC 3164 (7) and RFC 5424 (19) provide some example messages to help developers test their code against the standard format. Unfortunately, these are too few to prove the robustness of a parser. We chose to proceed by automation, collecting real-world log messages from a Linux machine: we created a very simple program that listened on a UDP port for incoming Syslog messages from syslog-ng and wrote their base64 representation into a file (in order to preserve the byte encoding when transferring the file between different kernels). We left the daemon running for several hours and retrieved the results, which are included in our standard test bench. Finally, we created a test bench with Visual Studio in which each message was base64-decoded and transmitted over UDP to the local host. We also activated the SyslogUdpCollector under test, hooking the test bench to its MessageReceived and ParseError events in order to count the occurrences of successful and failed parsing. We knew the final number of messages from the beginning of the test, so in this way we were also able to count any lost UDP datagram. The test bench was designed to be run with multiple threads synchronized by a semaphore. We had to make the sender thread sleep a little before sending the next message, because otherwise the UDP socket on the local host would have been flooded, causing loss of UDP datagrams. This is a topic we will cover soon.
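The capture-and-replay idea described above can be sketched in a few lines. This is a Python rendition of the approach for illustration only: the original tools were written in C#, and the function names here are ours.

```python
import base64
import socket

def capture_to_file(port, path, count):
    """Listen for syslog datagrams and store each one base64-encoded,
    one per line, so the exact bytes survive transfer between machines."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    with open(path, "w") as f:
        for _ in range(count):
            datagram, _addr = sock.recvfrom(65535)
            f.write(base64.b64encode(datagram).decode("ascii") + "\n")
    sock.close()

def replay_from_file(path, host, port):
    """Decode each stored message and resend it over UDP to the parser
    under test; returns the number of messages sent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    with open(path) as f:
        for line in f:
            sock.sendto(base64.b64decode(line.strip()), (host, port))
            sent += 1
    sock.close()
    return sent
```

Base64 encoding keeps the test corpus independent of line endings and character encodings, which is exactly why we used it when moving the corpus between kernels.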

Anyway, on June 30th 2010 we happily achieved 100% of successfully parsed messages on a workload of 973 messages, many of which were duplicates.

Delivery time of messages
After validating the functional requirements of Logbus-ng, we wanted to validate the non-functional requirements, mainly performance. We performed some stress tests to determine the capability of Logbus-ng to provide proper service even under heavily stressful conditions. There are two kinds of stress Logbus-ng can be subject to: the first is the rate of input messages, which does not tightly depend on the number of clients submitting log messages; the other is the number of outbound channels and/or clients, which affects both computational and network performance. Another interesting issue, which we will analyse and treat separately, arises from the usage of the UDP protocol: message loss, which can affect the validity of log analyses.

In order to perform a stress test, we ran the Logbus-ng server on a dedicated machine, on which no server other than Logbus-ng was running. We then controlled all the other computers so as to produce a set of uncorrelated and "nonsense" log messages, which we called noise, at a rate that was increased during the experiment by 1000 messages per second each hour, making the experiment last a total of about 8 hours. In order to measure the time it takes for Logbus-ng to process log messages, while avoiding problems related to remote clocks, we created a very simple round-trip-time utility that periodically sends a log message marked with a source timestamp and waits for it to come back after having traversed the full Logbus-ng pipeline, through an Outbound channel. To deal with message loss, a timeout was set after which a message is considered lost. During the experiment, we observed the following RTT:

[Figure: measured RTT over the whole duration of the experiment]
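The round-trip-time utility described above can be sketched as follows. This is a simplified Python rendition under stated assumptions: the probe format and the handling of the subscription socket are ours, while the real utility was written in C# and received the probe back through an Outbound channel subscription.

```python
import socket
import time

def measure_rtt(logbus_host, inbound_port, return_sock, timeout=5.0):
    """Send one probe log message into the bus and wait for it to come
    back on `return_sock` (an already-subscribed outbound channel socket).
    Returns the round-trip time in milliseconds, or None on timeout."""
    probe_id = str(time.monotonic_ns())  # marker to recognise our own probe
    message = f"<14>1 - rtt-probe - - {probe_id} - probe".encode("ascii")
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.monotonic()
    sender.sendto(message, (logbus_host, inbound_port))
    return_sock.settimeout(timeout)
    try:
        while True:
            data, _ = return_sock.recvfrom(65535)
            if probe_id.encode("ascii") in data:  # our probe came back
                return (time.monotonic() - start) * 1000.0
    except socket.timeout:
        return None  # message considered lost
    finally:
        sender.close()
```

Because the same node both sends the probe and receives it back, only one clock is involved, which is precisely how the utility avoids remote-clock skew.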

For better readability, we split the graph by each hour (and rate) of the experiment:


[Figures: RTT over time, one plot per hour of the experiment: (0-1)h at 0 messages/s, (1-2)h at 1000 messages/s, (2-3)h at 2000 messages/s, (3-4)h at 3000 messages/s, (4-5)h at 4000 messages/s, (5-6)h at 5000 messages/s, (6-7)h at 6000 messages/s, (7-8)h at 7000 messages/s]

In order to better analyse the meaning of the data, we computed the statistical properties of the raw results, grouped by workload frequency, and built the following table, to which we added the mean CPU utilization manually observed during the experiment.


Message rate   Mean RTT    Max RTT     Min RTT   RTT variance    CPU usage
0              2.691       11.416      2.400     1.375           2%
1000           3.539       19.221      2.376     6.486           23%
2000           3.888       18.793      2.366     8.340           31%
3000           4.738       24.044      2.335     12.683          45%
4000           9.712       87.223      2.384     125.875         70%
5000           3553.738    17638.012   3.170     37522050.099    85%
6000           4490.144    19500.391   9.912     64426166.628    85%
7000           4150.373    19634.178   40.947    61681965.553    88%

Overall        1383.267    19634.178   2.335     22338794.147

The data shown above reveal that, beyond a certain threshold of the message rate, performance degrades sharply and Logbus-ng's behaviour becomes more and more non-deterministic (see the variance). The non-determinism is mainly due to the fact that Logbus internally uses multiple threads and queues to temporarily store messages, which causes some messages to wait longer than others to be delivered: in fact, the minimum RTT is almost unaffected by the workload.

Loss of UDP datagrams under stress
Another interesting experiment we performed was aimed at understanding the actual maximum workload that Logbus-ng can process using the UDP protocol. TLS, being based on TCP, automatically regulates the transfer rate to prevent packet loss but, on the other hand, decreases the sender's performance. Logging frameworks must have little or no performance impact on the main application, so they often use delivery protocols based on UDP. Our goal is to determine the limit of UDP in Logbus-ng, in order to better understand how log analysis might be affected during a network burst.

In order to obtain quantitative data on packet loss, we designed and created an experimental tool to simulate heavy network activity and detect packet loss with different hardware configurations. The first experiment had the Logbus-ng server deployed on host artax, with host megres acting as client; the second experiment used a dual-core laptop as the server.


Our command-line tool accepts as arguments both the number of messages to use for the experiment and the rate at which to send them. Unfortunately, the model implemented by the tool (messages sent at a constant rate in large quantities) is not realistic at all. In real systems, log messages are transmitted according to stochastic models that vary with the system under examination. Systems often transmit smaller amounts of log messages at a rate even higher than ours, but we can state that our model is suitable for our purpose of validating Logbus-ng's non-functional requirements.

In fact, what we ultimately want to measure is the reliability of the collected log messages for analysis purposes. Since we ensured that Logbus-ng does not internally drop messages, monitor clients can choose reliable protocols, and the file system28 is assumed to be reliable, the only possible source of message loss or corruption is the UDP (24) listener. Loss of datagrams in IP networks is due to congestion, either in routers or in endpoints, causing the receive buffer to overflow and consequently discard new messages. Before performing the experiment, we tweaked Logbus-ng's UDP listener to increase the buffer size to 8 MB, a quantity considered large enough for most uses. Just to give an idea, let us compare the default buffer allocation in common kernels.

Kernel            Default UDP buffer size
Linux             128 KB
Solaris           256 KB
FreeBSD, Darwin   256 KB
AIX               1 MB

Setting 8 MB on Linux is far above the default limit and therefore more than acceptable.
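Enlarging a socket's receive buffer, as we did for the UDP listener, amounts to a single setsockopt call. The sketch below is a Python rendition for illustration (Logbus-ng does the equivalent through the .NET Socket API). Note that the kernel may silently cap the request — on Linux, at net.core.rmem_max — so it is worth reading the effective value back.

```python
import socket

def make_udp_listener(port, rcvbuf_bytes=8 * 1024 * 1024):
    """Create a UDP listening socket with an enlarged receive buffer to
    absorb bursts of datagrams. Returns the socket and the buffer size
    actually granted by the kernel (which may differ from the request)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    # Read back the effective size: the kernel may cap (or, on Linux,
    # double) the requested value.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    sock.bind(("", port))
    return sock, effective
```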

Our test utility performs the following steps: it first subscribes to a reliable Logbus-ng channel (to get the datagrams back while avoiding additional loss on the return segment), then sends all the messages via UDP, waiting for them to come back through the previously instantiated client. If not all of them arrive, the utility waits up to 30 seconds after the last received message, to tolerate network or processing delays, after which the missing datagrams are considered definitively lost. We performed the experiment by fixing the number of messages to 18000 and varying the transmission rate. We obtained the following results:

28 Configuring a forwarding rule to the FileCollector class allows log messages to be stored in the file system. To achieve the same with a DBMS, we recommend the log4net bindings

Rate29   Loss
1000     0
2000     0
3000     0
4000     0
5000     0
6000     0
7000     0
8000     550
9000     1938
10000    3442
11000    5146
12000    4958
13000    4382
14000    5006
15000    4924
16000    5898
17000    6935
18000    8238
19000    8599
20000    9720

[Figure: packet loss vs. transmission rate (messages per second)]

When we performed the same experiment using the dual-core CPU, we were surprised to lose no messages at all at any of the rates in our experiment.

29 Messages per second


Conclusions and future work

In the previous chapters, we have shown all the analysis, design, development and validation steps that we followed during the development of Logbus-ng. These culminated in the public release of Logbus-ng on Sourceforge.net as a package containing source code, compiled code, documentation and examples.

However, our work on Logbus-ng as a logging tool is not complete. Our C# core is a small part of a wider architecture for effective and efficient logging and analysis. While we showed that Logbus-ng is an effective tool for log-based FFDA, and proved by experiment that its performance is affected mostly by hardware and network infrastructure rather than by our implementation, there are still open issues and opportunities for future academic and non-academic work.

Platform bindings
We implemented the whole Logbus-ng architecture in C#, but from the beginning we designed the architecture to be platform- and language-independent. All the interfaces are expressed in platform-agnostic formalisms that allow future developers to implement components in other programming languages. There is no need to reimplement the server component in another language, since it is separated from the rest of the infrastructure and does not affect computation in any way. However, ports of the source-side APIs to different platforms are highly desirable, to let developers use the Logbus-ng APIs with any software application without having to deal with the inbound channel protocols directly. A port of the Logbus-ng APIs to Java, possibly with log4j (4) integration, is an affordable short-term objective.


Dealing with protocol drawbacks
We have already shown that Logbus-ng is currently based on the Syslog over UDP (24) and TLS (23) protocols, each having its peculiar advantages and drawbacks. The TLS protocol, while not explicitly shown in our work, has a significant performance impact due to cryptographic operations, while the UDP protocol, which was the subject of our deep analysis, is prone to packet loss and corruption, which can affect further analyses.

We were able to mitigate the TLS performance impact on the running application by adopting a buffering strategy in our code that is orthogonal to the protocol itself. This means that developers who choose TLS as the logging protocol in order to prevent message loss must pay particular attention to how they implement the protocol, trying to send the maximum possible payload whenever there are messages to send (thus using the full network bandwidth if needed) and preventing logging methods from blocking on TLS delivery, as happened in our earlier implementations.

If developers are unlikely to use TLS, the Syslog over UDP protocol provides another means to mitigate packet loss: sequence IDs. As shown in the Syslog 2009 (19) protocol paragraph, if the sequenceId extension is adopted, analysers can detect the loss of packets, except in the case in which the last n messages are lost, for which there is currently no detection method. After lost packets are detected, retransmission strategies, beyond the mere Syslog protocol, can be put in place to perform fault recovery.

Load balancing, fault tolerance
We have already shown how Logbus-ng efficiently processes large amounts of messages in our test scenarios. However, there can still be scenarios in which a single Logbus-ng server does not suffice to handle all the messages generated by the various nodes of a cluster. Or, simply, system operators may not want a single Logbus-ng server to be a single point of failure in a dependable infrastructure: it is clearly uncommon for reliability engineers to design a software product to be a single point of failure. In both cases, we want to replicate Logbus-ng on several nodes, possibly masking the presence of multiple instances of the server (replication transparency).
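Going back to the sequenceId mechanism mentioned earlier: on the analyser side, loss detection reduces to finding holes in the sequence of received IDs. A minimal Python sketch (assuming consecutive integer sequenceId values; the trailing-loss blind spot noted above still applies):

```python
def find_gaps(sequence_ids):
    """Given the sequenceId values of the messages received on a channel
    (in arrival order), return the IDs that were never seen, i.e. the
    messages presumed lost. Trailing losses are undetectable: if the
    last n messages are lost, nothing signals that they ever existed."""
    seen = set(sequence_ids)
    first, last = min(seen), max(seen)
    return [i for i in range(first, last + 1) if i not in seen]
```

For example, receiving IDs 1, 2, 4, 7, 5 (out of order, as UDP allows) reveals that messages 3 and 6 were lost.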

Let us first analyse a case of replication that achieves fault tolerance against UDP packet loss without affecting the performance of logging applications:


[Diagrams: two deployment scales. (1) Processes A, B and C log to a local Logbus-ng instance, which forwards the messages over TLS to a central Logbus-ng server. (2) A larger-scale deployment with multiple TLS links.]

The previous diagrams show a possible deployment of Logbus-ng at two different scales: in the first diagram, we highlight the fact that local applications can forward logs via UDP to a locally running instance of Logbus-ng, which forwards all these messages (possibly filtered) to a centralized Logbus-ng server using the reliable TLS protocol. Local UDP delivery is expected not to be subject to packet loss under normal workloads. This way, a centralized log server can work under a controlled workload up to its maximum capability without affecting the monitored applications. However, in this case, if a hardware node fails, the local Logbus-ng server fails too, losing all the messages queued for delivery30.

The above scenario assumes the Logbus-ng server is reliable enough to be left running on its own. In real-life scenarios, we may want to make the log server fault tolerant too. This can be easily achieved by replicating the Logbus-ng server on several nodes. In order to avoid consuming bandwidth, a multicast protocol should be used, such as Syslog over UDP (24) via IP multicast. In order for clients to tolerate faults of one or more log servers, the client APIs must be rewritten to subscribe to more than one Logbus, handle the resulting duplicated messages and, obviously, hide the replication details from the code above them. In order to handle duplicate messages, clients must be aware of the messages already received, by keeping either the messages themselves or meta-information about them.
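A minimal sketch of such duplicate suppression follows (Python for brevity; the key used here, e.g. a (host, sequenceId) pair, is our assumption — the real client APIs would have to define their own notion of message identity). Keeping only a bounded window of keys is a compromise: it avoids storing whole messages, at the cost of possibly re-accepting a very late replica.

```python
from collections import OrderedDict

class DuplicateFilter:
    """Suppress duplicate messages received from replicated Logbus-ng
    servers. Remembers a bounded window of recently seen message keys
    (e.g. source host + sequenceId) rather than whole messages."""

    def __init__(self, window=10000):
        self._seen = OrderedDict()  # insertion-ordered set of keys
        self._window = window

    def accept(self, key):
        """Return True the first time a key is seen, False for replicas."""
        if key in self._seen:
            return False
        self._seen[key] = None
        if len(self._seen) > self._window:
            self._seen.popitem(last=False)  # evict the oldest key
        return True
```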

It could be simpler for the Logbus-ng client APIs to use a round-robin subscription mechanism, with immediate subscription to another server when one is detected to have crashed (with TLS, detection is immediate).

Other work
Finally, we would like to conclude this work with a few ideas that came to mind during the development of Logbus-ng.

30 We still expect the delivery queue to be empty most of the time, depending on hardware performance and network bandwidth


Channel garbage collector plugin: we found that our client subscription protocol lacks robustness when a client crashes. Even if the client is detected to have failed and is unsubscribed, the channel itself may be kept alive forever. We think it could be a good idea to have a plugin periodically scan the outbound channels and purge those to which no client has subscribed for a long time. However, we must note that the Entity Manager plugin creates lots of channels to which usually no client subscribes, and those must not be deleted anyway. Also, the system administrator may want to create public channels that are not constantly subscribed during the execution of Logbus. This must be kept in mind.
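The proposed garbage collector could look roughly like this. It is only a Python sketch: the bookkeeping shown (subscriber counts and last-subscription timestamps per channel) is an assumption about what the plugin would have to track, and the protected set covers both Entity Manager channels and administrator-defined public channels.

```python
import time

def purge_idle_channels(channels, max_idle_seconds, protected_ids):
    """Sketch of the proposed channel garbage collector: delete outbound
    channels with no subscribers for too long, sparing channels that
    must survive (Entity Manager channels, public channels).
    `channels` maps channel id -> (subscriber_count, last_subscribed_at)."""
    now = time.time()
    purged = []
    for cid, (subscribers, last_used) in list(channels.items()):
        if cid in protected_ids:
            continue  # never purge protected channels
        if subscribers == 0 and now - last_used > max_idle_seconds:
            del channels[cid]
            purged.append(cid)
    return purged
```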

Channel persister plugin: if Logbus-ng crashes, all channel definitions are lost. To spare clients from having to re-create their channels manually upon Logbus-ng startup, it might be interesting to have a plugin save and restore channel definitions across a restart (subscriptions would obviously be empty).

User interface for the server: we never developed any kind of GUI for the server component. It would be interesting to be able to configure, control and monitor the active instance of Logbus-ng.

Serious security: our work clearly states that, even though we use the TLS protocol, no security is enforced in Logbus-ng. In order to make Logbus-ng an effective tool for general logging, including accounting logs, proper security requirements must be enforced. The study and development of security policies (including securing the client APIs via HTTPS or WS-Security) can be the subject of future work.


Appendixes


Appendix Alpha
XML Schema logbus-filter.xsd, namespace http://www.dis.unina.it/logbus-ng/filters


Appendix Bravo
WSDL interface logbus-control.wsdl, namespace http://www.dis.unina.it/logbus-ng/wsdl

[Listing: logbus-control.wsdl — embedded documentation of the operations:]
- Channel management service: manages outbound channels for the Logbus service; lists the IDs of available channels; creates a new channel (the ID of the new channel must be unique; faults: generic error while creating the channel, a channel with the given ID already exists); retrieves channel information (returns the full description of the channel and its status; fault: not enough privileges); deletes a channel by ID (faults: not enough privileges, generic error while deleting the channel, channel is not empty and clients must unsubscribe first).
- Subscription service: client APIs used to subscribe to existing channels and receive log data; lists the IDs of available channels; subscribes a client (the subscription "ticket" contains both the client's ID and the transport-dependent instructions it will use to configure itself; faults: channel not found, error configuring the transport); unsubscribes a client from a channel (fault: client was not found, or timed out); refreshes a client's subscription, if required by the transport.


Appendix Charlie
Schema of configuration sections, namespace http://www.dis.unina.it/logbus-ng/configuration/2.0

[Listing: configuration-sections schema — embedded documentation:]
- Common section: basic and language-independent definition of a KVP; holds a KVP; must be a fully-qualified type in the form "Namespace.Type, FullyQualifiedAssemblyName"; marks whether the entity is to be considered default; defines a connector object to Logbus.
- Core section: main type for the Logbus configuration, bound to IConfigurationSectionHandler; main element for the Logbus configuration; configures Inbound channels (each defined by type and properties); configures Custom Filters (each defined by type); defines an assembly to scan for required types (they must be marked with appropriate attributes); configures Outbound Transports and the .NET type to use for creating an Outbound Channel (must implement IOutboundChannelFactory); configures Logbus-ng core plugins; configures log forwarding.
- Source section: configuration for a Logbus source (an element that generates and sends log messages); strong name of the default class implementing ILog; default collector to use when not specified; default heartbeat interval in seconds (if not specified, heartbeating is disabled); unique ID of a logger and ID of the collector it uses; name of a collector (must be unique).
- Client section: configuration for a Logbus client (an element that receives log messages from subscribed channels); defines the location of a Logbus endpoint for channel management and subscription.


Bibliography
1. Microsoft Corporation. System.Diagnostics.Debug. MSDN. [Online] http://msdn.microsoft.com/it-it/library/system.diagnostics.debug.aspx.
2. Oracle. System (Java Platform SE 6). [Online] http://download-llnw.oracle.com/javase/6/docs/api/java/lang/System.html#err.
3. Apache Software Foundation. Apache Software Foundation. [Online] http://www.apache.org.
4. —. Apache log4j. [Online] http://logging.apache.org/log4j/.
5. —. Apache log4net. [Online] http://logging.apache.org/log4net/index.html.
6. BalaBit IT Security. Syslog-ng. [Online] http://www.balabit.com/network-security/syslog-ng/.
7. The BSD Syslog Protocol. Lonvick, Chris. s.l.: Internet Engineering Task Force. RFC 3164.
8. Fail2ban. [Online] http://www.fail2ban.org.
9. Improving FFDA of Web Servers through a Rule-Based Logging Approach. Cinque, Marcello, et al. Naples: s.n., 2008.
10. A Logging Approach for Effective Dependability Evaluation of Complex Systems. Cinque, Marcello, Cotroneo, Domenico and Pecchia, Antonio. Naples: s.n., 2009.
11. Failure Data Analysis of a LAN of Windows NT Based Computers. Kalyanakrishnam, Kalbarczyk and Iyer. Urbana, IL: s.n.
12. What Supercomputers Say: A Study of Five System Logs. Oliner, Adam and Stearley, Jon.
13. Destailleur, Laurent. AWStats. [Online] http://awstats.sourceforge.net/.
14. Reflections on Industry Trends and Experimental Research in Dependability. Siewiorek, Daniel and Kalbarczyk, Zbigniew. s.l.: IEEE, 2004.
15. IBM. Common Event Infrastructure. IBM Tivoli. [Online] http://www-01.ibm.com/software/tivoli/features/cei/.
16. Object Management Group. Data Distribution Service. OMG Data Distribution Portal. [Online] http://portals.omg.org/dds/.
17. Microsoft Corporation. Windows Event Log. MSDN. [Online] http://msdn.microsoft.com/en-us/library/aa385780%28v=VS.85%29.aspx.
18. VAX/VMS Event Monitoring and Analysis. Buckley, Michael and Siewiorek, Daniel.
19. The Syslog Protocol. Gerhards, Rainer. s.l.: Internet Engineering Task Force. RFC 5424.
20. Augmented BNF for Syntax Specifications: ABNF. Crocker, Dave and Overell, Paul. s.l.: Internet Engineering Task Force, 2008. RFC 5234.
21. Internet Assigned Numbers Authority. Enterprise Numbers. [Online] http://www.iana.org/assignments/enterprise-numbers.
22. —. Internet Assigned Numbers Authority. [Online] http://www.iana.org.
23. Transport Layer Security (TLS) Transport Mapping for Syslog. Miao, Fuyou, Ma, Yuzhi and Salowey, Joseph. s.l.: Internet Engineering Task Force, 2009. RFC 5425.
24. Transmission of Syslog Messages over UDP. Okmianski, Anton. 2009. RFC 5426.
25. Novell, Inc. Mono Project. [Online] http://www.mono-project.com.


26. Apache Software Foundation. SyslogAppender. Apache log4j. [Online] http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/net/SyslogAppender.html.
27. —. RemoteSyslogAppender Class. Apache log4net. [Online] http://logging.apache.org/log4net/release/sdk/log4net.Appender.RemoteSyslogAppender.html.
28. World Wide Web Consortium. Web Services Activity. [Online] 2002. http://www.w3.org/2002/ws/.
29. —. eXtensible Markup Language. [Online] 1998. http://www.w3.org/XML/.
30. Myricom. Myrinet-2000 Index Page. [Online] 15 January 2007. http://www.myri.com/myrinet/.
31. User Datagram Protocol. Postel, Jon. 1980. RFC 768.
32. The IP Network Address Translator (NAT). Egevang, Kjeld Borch and Francis, Paul. s.l.: Internet Engineering Task Force, 1994. RFC 1631.
33. Standard ECMA-334 C# Language Specification. [Online] 2003. http://www.ecma-international.org/publications/standards/Ecma-334.htm.
34. ISO/IEC 23270:2003 - Information technology -- C# Language Specification. [Online] 2003. http://www.iso.org/iso/catalogue_detail.htm?csnumber=36768. ISO/IEC 23270:2003.
35. Free Software Foundation. DotGNU Project. [Online] http://www.gnu.org/software/dotgnu/.
36. Skonnard, Aaron. Run ASMX Without IIS. Service Station. [Online] December 2004. http://msdn.microsoft.com/en-us/magazine/cc163879.aspx.


Acknowledgements

So… you've finally read this far or, perhaps, you started reading from here, like a Japanese comic! I hope the former, since otherwise you might have missed the chance to find all the easter eggs hidden in the text! It's surely time to say "thank you" to the many people that helped me, the author, write this Master's thesis. I would first like to thank prof. Domenico Cotroneo and Marcello Cinque of the Department of Computer Science for being my tutors for such a long time (I think they'll be glad I'm leaving). Then the rest of the Mobilab team, including Alessandro the wise, Lelio, who should never again cut his hair too short, and Antonio Bovenzi, whose ass's photo has been used for lots of darts tournaments. A special acknowledgement to Antonio Pecchia, my POLTERGEIST TUTOR, whose voice sounded like a creepy presence in my laptop's speakers.

I wish to thank Microsoft Corporation, the Mono development team and JetBrains Inc. for donating us free licenses of expensive development software, including Visual Studio, Mono Tools for Visual Studio and ReSharper!!

I won't forget to thank Vittorio Alfieri, my desk partner, the man who betrayed his girlfriend for Data Binding, for all the times he made me hit my head against the wall, for all the Sundays and summer days spent on the phone, for all the… whatever… See you soon for the next head-hitting project, bro! And good luck with your Master's thesis too!!


Other people who deserve a mention are Madia Mele (ladies first), the blue-eyed engineer: I hope you get your "fun" soon; Peppe Brunetti, the man who most resembles Barack Obama for being young, handsome… and tanned; Claudio Chiaro, who struggles to become the next Wolverine stunt double: marry soon, and I hope PrancescAlberto will grow up strong like his father.

I also wish to thank Peppe Buiano for struggling in order to find me a job, and my fitness trainer Monica Iacobelli for struggling in order to find me a girlfriend.

LAST… Thank you Christian Barone, the most LAXATIVE human around the world (sorry, I’ll be right back… OK, done!), for being a MILF… Whaaaaaaaat? Hey, pervert, what did you understand? I mean a Man I’d Light Fire!!!

Finally, Rosy Festa: it’s been years of endless tears, but I can love no one but you, and I really mean no one!!
