Improved Diagnosis of Software Bugs: a Perspective

Swetha Pai and Manoj V. Thomas
Computer Science and Engineering, Vimal Jyothi Engineering College, Chemperi, Kerala, India - 670632

Abstract: Software engineering deals with the development of unique software. It is well known that the development life cycle of a software product inevitably encounters bugs and errors. The primary approach of this paper is to troubleshoot bugs correctly by means of Artificial Intelligence (AI) techniques. The approach integrates AI with the Learn, Diagnose and Plan (LDP) paradigm, in which we make thorough use of Machine Learning (ML) and its mechanisms. The overall solution for software troubleshooting is implemented on software projects and sample code scraped from GitHub.

Index Terms: Artificial Intelligence, Machine Learning, Software fault prediction, Software bugs.

I. INTRODUCTION

In everyday life we evidently deal with errors and mistakes. They are not made knowingly, but they happen at times. The same holds in the software field: during the development of software we encounter errors and bugs. A bug can be any fault or flaw that is not introduced deliberately but eventually arises from failures in the software code. Human action while preparing software code introduces errors, which generate faults that in turn lead to failures. Failures occur when the software code does not behave according to the user's requirements and expectations. Errors can be rectified manually or automatically, depending on the severity of the bugs and the importance of the software code.

This is where troubleshooting becomes important in software development. Troubleshooting starts when a bug has been captured in the software code. Software development involves two key roles, the tester and the developer. A developer writes the code according to the application and its needs and packages it as a program. A tester works with the developed program and tests the code until an error is met. When the tester meets a bug, the tester files a bug report using any issue tracking system. The bug report is given to the developer, whose task is to improve the code and make it error-free. This usually means isolating the bug at its origin by capturing its root cause. The faulty software module or component is tracked down, and the resulting changes are committed to the version control system so that they are available to the others on the same team developing the software. The interaction between the tester and the developer therefore continues until the software code is free from thorny bugs and errors.

The procedure described above is known to be very costly. One reason is that it is challenging for a developer to recreate the same bug that the tester observed, since the two work on different machines in different contexts. Another problem is that complex and lengthy software programs are not easy to handle.

To reduce the cost of troubleshooting during software development, we present a novel method that incorporates an artificial intelligent agent. The agent draws on a range of techniques from the Artificial Intelligence area to improve the detection and the isolation of the faulty component.

The intelligent agent first deals with a software program. It learns about the source files and code structure by applying various machine learning algorithms [2]. The algorithms support a detailed study of the software code in relation to its revision history and past failures. The software code contains classes and functions, which we treat as software components depending on the level of granularity. Once the code has been learned, the agent performs diagnosis and runs tests on it. Tests are conducted to isolate the faulty software components. Depending on the severity of the bugs, a minimal number of tests are performed and the bugs are tracked [30]. This iterative process continues until an accurate diagnosis is obtained. For the time being, we deal only with the Learn and Diagnose phases of software development.

We work with the LDP paradigm, which contains a data-augmented diagnosis algorithm [1]. Supervised learning techniques are used in the machine learning portion. Model-based Diagnosis (MBD) and Spectrum-based Fault Localization (SFL) are considered for the diagnostic phase. They deal with the traces of tests, which will be covered in the planning phase. The Barinel algorithm is a combination of the two. The MBD algorithm supports automated diagnosis by inferring possible diagnoses. Software diagnosis isolates bugs by correctly tracking the affected software component. The planning phase deals with how to plan automatically when an additional test is required.

II. LEARN, DIAGNOSE AND PLAN (LDP)

The Learn, Diagnose and Plan paradigm is an extensive troubleshooting paradigm that involves both human and AI elements. The human elements are the people who work with the software, and the AI element is the intelligent agent incorporated in the paradigm. The human elements are:

i) Developer: The person in charge of developing the software. The developer interacts with the software by writing code and programs, can relate the project to benchmark projects and use them as a reference in the future, and is responsible for fixing the faulty software components.

ii) Tester: The tester observes that a bug has occurred and reports it to the developer. The tester creates test cases and uses them to test the system. A tester may be a human or an automated tester.

Fig. 1. An illustration of the workflow with LDP

LDP consists of three AI components: a learning algorithm, a diagnosis algorithm, and a planning algorithm. We use an issue tracking subsystem, Bugzilla, and a version control subsystem, GitHub, to track and store the issues and the software code. Bugzilla handles the bug reports from the tester, and GitHub supports committing changes to the stored software code. LDP helps the developer to debug an observed bug by diagnosing it automatically. It also guides the tester intelligently to perform additional tests that collect further diagnostic information [3]. The early stage of LDP deals with learning the fault predictor from information about past failures and revision history. This first stage is carried out before a flaw is found in the software program. We implement it with standard machine learning techniques. After the first stage, when a bug is observed, it is reported by the tester. [...] required, it is given to the test planner, which is the next phase; that phase is not considered here and is left for future work. A typical workflow with LDP is presented in Fig. 1. The AI elements of LDP are:

A. Software fault prediction

Software fault prediction is a discipline that predicts the fault proneness of future modules. It uses historical fault data and essential prediction metrics. The fault predictor includes an algorithm that estimates the probability that each software component is faulty. A software component may be a part of the program, a class, or even a function within it. We select the level at which components are considered according to convenience. The fault predictor does not depend on the observed behaviour of the system; instead, it works with this probability and passes it as a prior to the diagnoser. The fault predictor can take into account not only the current observations but also past failures and history.

Fault prediction for software is basically a classification problem. Given a software component, the task is to classify it as healthy or faulty. It is the faulty software components that give rise to bugs or errors. In this context we apply a supervised machine learning algorithm to the classification problem. In supervised learning, labeled data are given as inputs and outputs are produced. The learner learns about the components and feeds its output to the next phase. The inputs are labeled instances; in this case the instances are software components and the labels state whether each component is healthy or faulty. The output is a classification model that maps an instance to a class. The set of labeled instances is called the training set, and the process of finding a classification model from the training set is known as learning. The learning algorithm extracts features from an instance and tries to learn from the training set. Existing classification methods include decision trees, neural networks, support vector machines, and logistic regression. Adaptive Neuro Fuzzy Inference System (ANFIS), Support Vector Machine (SVM) and Artificial Neural Network (ANN) can be used to build software fault prediction models [6]. ANFIS is a powerful prediction method that combines learning ability with expert knowledge and hence achieves successful results in prediction problems in different areas [29]. It also helps to optimize data. Modeling with ANFIS requires more awareness of the conditions and results embedded. It is considered an easier way of experimentation for an expert than other machine learning methods to optimize the
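The classification view of fault prediction in Section II-A can be illustrated with a small sketch. It is not the authors' implementation: it uses a hand-written logistic regression (one of the classification methods listed above) in place of ANFIS, and the two features (recent revisions, past bug fixes), the component names, and the training data are all hypothetical values chosen for illustration. The output is a per-component fault probability of the kind that could serve as a prior for the diagnoser.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_fault_predictor(features, labels, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    features: one tuple of numeric metrics per software component
    labels:   1 for a faulty component, 0 for a healthy one
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def fault_probability(model, x):
    """Probability that the component with metrics x is faulty."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical training set (assumption, not data from the paper):
# each row is (recent revisions, past bug fixes), scaled to [0, 1].
TRAIN_X = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9),
           (0.1, 0.0), (0.2, 0.1), (0.0, 0.2)]
TRAIN_Y = [1, 1, 1, 0, 0, 0]  # 1 = faulty, 0 = healthy

model = train_fault_predictor(TRAIN_X, TRAIN_Y)

# Hypothetical unseen components and their metrics.
priors = {
    "Parser.parse": fault_probability(model, (0.85, 0.7)),
    "Logger.write": fault_probability(model, (0.05, 0.1)),
}
for name, p in priors.items():
    print(f"{name}: P(faulty) = {p:.2f}")
```

The granularity of a component (class or function) only changes what each row of the training set describes; the learning step stays the same.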
