Improved Diagnosis of Bugs: A Perspective

Swetha Pai
Computer Science and Engineering
Vimal Jyothi Engineering College
Chemperi, Kerala, India - 670632
[email protected]

Manoj V. Thomas
Computer Science and Engineering
Vimal Jyothi Engineering College
Chemperi, Kerala, India - 670632
[email protected]

Abstract—Software engineering deals with the development of unique software. It is well known that the life cycle of a software product inevitably encounters bugs and errors. The primary approach of this paper is to correctly troubleshoot these bugs using an Artificial Intelligence (AI) technique. The technique integrates AI with the Learn, Diagnose and Plan (LDP) paradigm, in which we thoroughly incorporate Machine Learning (ML) and its mechanisms. The overall solution for software troubleshooting is implemented on software projects and sample code scraped from Github.

Index Terms—Artificial Intelligence, Machine Learning, Software fault prediction, Software bugs.

I. INTRODUCTION

In our everyday life, it is evident that we deal with errors and mistakes. They are not made knowingly, but at times they happen. Relating this to the software field, we encounter errors and bugs during the development of software. A bug can be any fault or flaw that is not introduced deliberately but arises eventually from failures in the software code. Human action while preparing software code introduces errors, which generate faults, which in turn lead to failures. A failure occurs when the software code does not behave according to the user's postulation and expectation. Errors can be rectified manually or automatically, depending upon the severity of the bugs and the prominence of the software code.

Here lies the importance of troubleshooting in software development. Troubleshooting is performed once a bug has been captured from the software code. During the development of software there are two key roles, namely the tester and the developer.
A developer is a person who develops the code according to the application and its needs and encapsulates it as a program. A tester, on the other side, works with the developed program and tests the code until an error is met. When the tester meets a bug, he files a bug report in an issue tracking system. The bug report is given to the developer, whose task is to improve the code and make it error-free. This task usually means isolating the bug from where it evolved by capturing its root cause. The faulty software module or component is tracked down, and the resulting changes are committed to the version control system, so that they are available to others in the same team for the development of the software. Hence the interaction between the tester and the developer continues until the software code is free from thorny bugs and errors.

This procedure is known to be very costly. The reason is that it is challenging for a developer to recreate the same bug that was observed by the tester, since the two of them deal with different machines in different contexts. Another problem is that it is not easy to handle software programs that are complex and lengthy.

To reduce the cost of troubleshooting during software development, we present a novel method that incorporates an artificially intelligent agent. The intelligent agent applies a range of techniques from the Artificial Intelligence area to improve the detection and the isolation of the faulty component. The intelligent agent first deals with a software program. It learns about the source files and code structure by applying various machine learning algorithms [2]. The algorithms support a detailed study of the software code in relation to its revision history and past failures. The software code contains classes and functions, which we treat as software components depending upon the chosen level of granularity. Once the code has been learned, the agent performs diagnosis and runs tests. The learning phase also takes into account the past failures and revision history encountered. Tests are conducted to isolate the faulty software components. Depending upon the severity of the bugs, a minimal number of tests are performed, and bugs are tracked [30]. This iterative process continues until an accurate diagnosis is obtained. For the time being, we deal only with the Learn and Diagnose phases during software development.

We adopt the LDP paradigm, which has a data-augmented diagnosis algorithm at its core [1]. Supervised learning techniques are employed in the machine learning portion. Model-Based Diagnosis (MBD) and Spectrum-based Fault Localization (SFL) are considered for the diagnostic phase; both operate on the traces of tests. The Barinel algorithm is a combination of the two. The MBD algorithm supports automated diagnosis by inferring possible diagnoses; software diagnosis isolates the bugs by correctly tracking the affected software components. The planning phase addresses how to automatically plan an additional test when one is required.

II. LEARN, DIAGNOSE AND PLAN (LDP)

The Learn, Diagnose and Plan paradigm is an extensive troubleshooting paradigm that involves both human and AI elements. The human factors involve the people who deal with the software, and the AI element is the intelligent agent incorporated inside this paradigm. The human elements are:

i) Developer: The person in charge of developing the software. He interacts with the software by writing software code and programs. The developer can relate the project to benchmark projects and use them as a reference for future purposes. He is the person responsible for fixing the faulty software components.

ii) Tester: The tester observes that a bug has occurred and reports it to the developer. The tester creates his own test cases and uses them for testing the system. A tester can be a human or an automated tester.

Fig. 1. An illustration of the workflow with LDP

LDP consists of three AI components: a learning algorithm, a diagnosis algorithm, and a planning algorithm. We use an issue tracking subsystem, Bugzilla, and a version control subsystem, Github, for tracking and storing the issues and the software code. Bugzilla handles the bug reports from the tester, and Github supports committing the changes to the stored software code. LDP helps the developer to debug an observed bug by diagnosing it automatically. It also guides a tester intelligently to perform additional tests to collect further diagnostic information [3]. The early stage of LDP learns the fault predictor from the information obtained about past failures and revision history. This first stage is done before a flaw is found in the software program; we implement it with standard machine learning techniques. After the first stage, when a bug is observed, it is reported by the tester. The tester files the bug report, and it is input to the diagnoser. The diagnoser, by processing these inputs, outputs a set of possible diagnoses. The diagnoser applies an automated diagnosis algorithm which considers both the probabilities generated by the fault predictor and the observed system behaviour. When a single diagnosis is found whose probability of correctness is high, this diagnosis is passed to the developer for fixing. If an additional test is required, it is given to the test planner, the next phase, which is not considered here and is left for future work. A typical workflow with LDP is presented in Fig. 1. The AI elements of LDP are:
A. Software fault prediction

Software fault prediction is a discipline that predicts the fault proneness of future modules using historical fault data and essential prediction metrics. The fault predictor includes an algorithm that estimates the probability of each software component being faulty. A software component can be a part of the program, a class, or even a single function. Depending upon our convenience, we select the level at which components are considered. The fault predictor does not stick to the behaviour of the system; instead, it deals with probability, which serves as a prior for the diagnoser. The fault predictor has the capability to include not only the current observations but also past failures and history.

Fault prediction in software is basically a classification problem: given a software component, the problem is to classify it as healthy or faulty. It is the faulty software components that give rise to bugs and errors. In this context we apply a supervised machine learning algorithm to the classification problem. In supervised learning we deal with labeled data as inputs and produce outputs. The learner learns about the components and feeds its output to the next phase. The labeled instances are given as inputs; in this case the instances are components, and the labels indicate whether each software component is healthy or not. The output is a classification model which maps an instance to a class. The set of labeled instances is called the training set, and the process of finding a classification model from the training set is known as learning. The learning algorithm extracts the features from an instance and tries to learn from the training set. Existing classification methods include decision trees, neural networks, support vector machines, logistic regression, etc. Adaptive Neuro Fuzzy Inference System (ANFIS), Support Vector Machine (SVM) and Artificial Neural Network (ANN) can be used to build software fault prediction models [6]. ANFIS is a powerful prediction method that combines learning ability with expert knowledge and hence achieves successful results in prediction problems in different areas. It also helps us to optimize data. Modeling with ANFIS requires more awareness of the embedded conditions and results. It is a powerful predictive method [29] and is considered the easiest way for an expert to experiment, compared with other machine learning methods, to optimize the knowledge.

A key issue for machine learning is the choice of features. A survey of the features used by existing software prediction algorithms divides them into three families [4]:

1) Traditional: These features include traditional software complexity metrics, such as lines of code or more sophisticated complexity measures [5].

2) Object Oriented: These features also include software complexity metrics, but they are specially designed for object oriented programs [27]. They include metrics like coupling levels, cohesion and the depth of inheritance.

3) Process: These features are captured from the software change history. They try to capture the dynamics of the software development process, considering metrics such as the lines added and deleted in the previous version and the age of the software component.

The list of features includes object-oriented measures such as the number of methods overriding a superclass, the number of public methods, whether the class is abstract, etc. To learn a fault prediction model we require a training set [4]. A training set is a set of software components, each classified as healthy or faulty [8]. We employ a version control system like Git, which tracks and commits the modifications done to the source files, and an issue tracking system like Bugzilla, which records all reported bugs and tracks changes in their status, including when a bug gets fixed. A key feature of these systems is that they enable tracking which modifications to the source were done in order to fix a specific bug.
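As an illustration of this learning step (the paper does not list its training code), the following minimal sketch trains such a classifier with WEKA, the platform used for the experiments in Section III. The file name components.arff and its attribute layout are assumptions: one row per software component, the metric families above as attributes, and a healthy/faulty label as the last attribute.

```java
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch: learn a fault prediction model from a labeled training set.
public class FaultPredictorTraining {
    public static void main(String[] args) throws Exception {
        // Hypothetical dataset: one row per software component, with the
        // traditional, object-oriented and process metrics as attributes.
        Instances data = new DataSource("components.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1); // healthy/faulty label

        // Random forest is the learner used in the experiments (Sec. III).
        RandomForest model = new RandomForest();
        model.buildClassifier(data);
        System.out.println(model); // dump a summary of the learned model
    }
}
```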
Integrating the Fault Prediction Model

The software fault prediction model can be viewed as a classifier which accepts a software component as input and outputs a binary prediction of whether the component is faulty or not. Barinel requires an estimate of the prior probability of each component being faulty. Let confi(CO) denote the classifier's confidence for a component CO. We use confi(CO) as Barinel's prior when CO is classified as faulty and 1 - confi(CO) otherwise; the resulting algorithm is called LD.
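A minimal sketch of this prior computation, assuming a trained WEKA classifier; the class and method names below are illustrative, not from the paper's code:

```java
import weka.classifiers.Classifier;
import weka.core.Instance;

// Sketch of Barinel's prior: confi(CO) if the model classifies the
// component CO as faulty, and 1 - confi(CO) otherwise.
public class ComponentPrior {
    // faultyIndex is the index of the "faulty" class value (assumed).
    public static double prior(Classifier model, Instance component,
                               int faultyIndex) throws Exception {
        double[] dist = model.distributionForInstance(component);
        int predicted = (int) model.classifyInstance(component);
        double confidence = dist[predicted]; // confi(CO)
        return predicted == faultyIndex ? confidence : 1.0 - confidence;
    }
}
```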
B. Diagnoser

The diagnoser performs the diagnosis part of the LDP paradigm: it locates the problem in the software components. The input to the diagnoser is the system's observed behavior, in the form of a set of tests that were executed and their outcomes, pass or fail [3]. The output of the diagnoser is one or more explanations of which software components are faulty, consistent with the failed and passed tests. A test that supports the development of the code is a passed test, and a test that inhibits the code is a failed test. The diagnoser implemented here is an extension of the Barinel software diagnosis algorithm. Barinel runs the software code with regard to the probability of each software component being faulty and the classification of the dataset. Barinel is a combination of Model-Based Diagnosis (MBD) and Spectrum-based Fault Localization (SFL).

1) Model-Based Diagnosis for Software: The input to classical MBD algorithms is a tuple $\langle S, COMP, OB \rangle$, where $S$ is a detailed description of the diagnosed system's behavior, $COMP$ is the set of components present in the system that may or may not be faulty, and $OB$ is a set of observations. A diagnosis problem arises when $S$ and $OB$ are inconsistent with the assumption that all the components in $COMP$ are healthy. The output of an MBD algorithm is a set of diagnoses.

Definition 1 (Diagnosis). A set of components $\Delta \subseteq COMP$ is a diagnosis if

$$\Big(\bigwedge_{CO \in \Delta} \neg h(CO)\Big) \wedge \Big(\bigwedge_{CO' \notin \Delta} h(CO')\Big) \wedge S \wedge OB$$

is consistent. That is, assuming that exactly the components in $\Delta$ are faulty, $S$ is consistent with $OB$.

The components in software diagnosis can be the set of classes, or even one component per line of code. Considering each line of code as a component results in very focused diagnoses, but this focus increases the computational effort [3]. The observations in software diagnosis are the observed executions of tests. Every observed test $t$ is labeled as passed or failed, denoted by $pass(t)$ and $fail(t)$, respectively. This labeling is done manually by the tester, or automatically in the case of automated tests. Of the two approaches, the first one requires $S$ to be a logical model of the function of every software component; it allows the use of logical reasoning techniques to infer diagnoses.

2) SFL for Software Diagnosis: In the SFL-based approach, there is no need for a logical model of the correct function of every software component in the system [7]. Instead, the traces of the observed tests are considered.

Definition 2 (Trace). The trace of a test $t$, denoted $trace(t)$, is the sequence of components involved in running test $t$.

Traces of tests can be collected in practice with common software profilers [10]. In the SFL-based approach, $S$ is implicitly defined by the assumption that a test will pass if all the components in its trace are not faulty. Let $h(CO)$ denote the health predicate for a component $CO$, i.e., $h(CO)$ is true if $CO$ is not faulty. We use Horn clauses over these literals, treating the software components as literals. We can formally define $S$ in the SFL-based approach with the following set of Horn clauses:

$$\forall t:\ \Big(\bigwedge_{CO \in trace(t)} h(CO)\Big) \rightarrow pass(t)$$

Thus, if a test fails, we can infer that at least one of the components in its trace is faulty. Accordingly, the trace of a failed test is called a conflict.

Definition 3 (Conflict). A set of components $\Gamma \subseteq COMP$ is a conflict if

$$\Big(\bigwedge_{CO \in \Gamma} h(CO)\Big) \wedge S \wedge OB \models \bot$$

Many recent MBD algorithms use conflicts to direct the search towards diagnoses. They exploit the fact that a diagnosis must be a hitting set of all the conflicts: since at least one component in every conflict must be faulty, only a hitting set of all conflicts can explain the unexpected observations [9]. When a test fails, at least one of the components in its trace is faulty; the trace of a failed test is therefore a conflict, and Barinel treats it as such when computing diagnoses. It uses a fast hitting set algorithm called STACCATO [11] to find the hitting sets of these conflicts, which are then output as diagnoses. We implemented only the L and D phases of the LDP paradigm; the test planner phase is left for future work.

Prioritizing Diagnoses

The main drawback of the Barinel algorithm is that it outputs a large set of diagnoses, which by itself rarely leads to a solution. For example, in a classification problem it is not easy to classify when the dataset is large and complex in nature. To address this, Barinel computes a score for every possible diagnosis it returns. The score evaluation in Barinel follows a Bayesian approach: it computes the posterior probability of each returned diagnosis from the outcomes of the failed and passed tests.
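For illustration only, the following naive sketch computes diagnoses as minimal hitting sets of the conflicts (the traces of failed tests) and ranks them by the product of the component priors, roughly in the spirit of Barinel's scoring. It is not the STACCATO algorithm, which is far more efficient, and all names and data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Naive illustration (NOT STACCATO): a candidate diagnosis must intersect
// ("hit") every conflict; candidates are then ranked by a prior-based score.
public class NaiveDiagnoser {

    public static boolean hitsAll(Set<String> cand, List<Set<String>> conflicts) {
        for (Set<String> conflict : conflicts)
            if (Collections.disjoint(cand, conflict)) return false;
        return true;
    }

    // Exponential enumeration over subsets of components; subsets are visited
    // before their supersets, so keeping only candidates that contain no
    // previously found hitting set yields exactly the minimal hitting sets.
    public static List<Set<String>> minimalHittingSets(List<Set<String>> conflicts) {
        Set<String> all = new TreeSet<>();
        for (Set<String> c : conflicts) all.addAll(c);
        List<String> comps = new ArrayList<>(all);
        List<Set<String>> minimal = new ArrayList<>();
        for (int mask = 1; mask < (1 << comps.size()); mask++) {
            Set<String> cand = new HashSet<>();
            for (int i = 0; i < comps.size(); i++)
                if ((mask & (1 << i)) != 0) cand.add(comps.get(i));
            if (!hitsAll(cand, conflicts)) continue;
            boolean isMinimal = true;
            for (Set<String> m : minimal)
                if (cand.containsAll(m)) { isMinimal = false; break; }
            if (isMinimal) minimal.add(cand);
        }
        return minimal;
    }

    // Prior-based score: the product of the fault priors of the components
    // blamed by the diagnosis (0.5 when no prior is known).
    public static double score(Set<String> diagnosis, Map<String, Double> prior) {
        double p = 1.0;
        for (String co : diagnosis) p *= prior.getOrDefault(co, 0.5);
        return p;
    }

    public static void main(String[] args) {
        // Two failed tests -> two conflicts (their traces).
        List<Set<String>> conflicts = Arrays.asList(
                new HashSet<>(Arrays.asList("A", "B")),
                new HashSet<>(Arrays.asList("B", "C")));
        Map<String, Double> prior = Map.of("A", 0.1, "B", 0.6, "C", 0.2);

        List<Set<String>> diagnoses = minimalHittingSets(conflicts);
        diagnoses.sort((d1, d2) ->
                Double.compare(score(d2, prior), score(d1, prior)));
        for (Set<String> d : diagnoses)
            System.out.println(d + "  score=" + score(d, prior));
    }
}
```

On the example above, the minimal hitting sets of the conflicts {A, B} and {B, C} are {B} and {A, C}, and the priors rank the single-fault diagnosis {B} first.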
C. Summarizing the L and D

The key strength of the LDP paradigm is that all the above-mentioned components are integrated in a single approach. We dealt with the Learning (L) phase and the Diagnosis (D) phase.

Fig. 2. A scheme of our integration of the L and D phases of the LDP paradigm
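As a rough sketch of the integration in Fig. 2, reusing the hypothetical ComponentPrior and NaiveDiagnoser classes from the earlier sketches (none of these names come from the paper): the priors learned in L flow directly into the ranking performed in D.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

import weka.classifiers.Classifier;
import weka.core.Instances;

// Illustrative wiring of L and D (hypothetical API).
public class LdPipeline {
    // componentNames.get(i) names the component described by row i.
    public static List<Set<String>> diagnose(Classifier learnedModel,
                                             Instances components,
                                             List<String> componentNames,
                                             List<Set<String>> failedTraces)
            throws Exception {
        // L: per-component fault priors from the learned prediction model.
        Map<String, Double> prior = new HashMap<>();
        for (int i = 0; i < components.numInstances(); i++) {
            prior.put(componentNames.get(i),
                    ComponentPrior.prior(learnedModel, components.instance(i),
                            /* faultyIndex, assumed */ 1));
        }
        // D: hitting sets of the failed tests' traces (the conflicts),
        // ranked by the priors the learning phase produced.
        List<Set<String>> diagnoses =
                new ArrayList<>(NaiveDiagnoser.minimalHittingSets(failedTraces));
        diagnoses.sort((a, b) -> Double.compare(
                NaiveDiagnoser.score(b, prior), NaiveDiagnoser.score(a, prior)));
        return diagnoses;
    }
}
```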

In the Learning (L) part we learned a fault prediction model, using standard machine learning algorithms to study the different components. In the Diagnosis (D) part we found the pure diagnosis for each software component and classified it as failed or passed. The L and D without the planning phase is shown in Fig. 2. It is the Planning phase that plans the further tests required for future purposes [12]; the planning phase of LDP is not administered here [28]. As a result, the L and D phases of this paradigm provide improved diagnosis in this project when they are integrated in an efficient, reciprocative manner to give a combined effect.

III. EXPERIMENT RESULTS

We have implemented the L and D components of the LDP paradigm and evaluated them experimentally. It is the Learning phase and the Diagnosis phase that are implemented in this scenario. The main aim is to demonstrate the LDP paradigm in a software engineering application and to evaluate the different AI components of LDP. The L and D phases are integrated to obtain a combined effect for the improvement of the diagnosis. All the experiments were executed on the WEKA [13] machine learning platform. As benchmarks we used the source files and the bugs reported in several open source projects: Eclipse CDT [31], an IDE for C/C++ that is part of the Eclipse platform [32], Apache Ant, a build tool, and Apache POI, which provides a Java API for Microsoft documents. All the projects are written in Java. We used Github as the version control system and Bugzilla as the issue tracking system. In the fault prediction experiment the random forest learning algorithm [14] was used. For performance evaluation, Precision (P), Recall (R) and F-measure (F) were applied; they are the standard measures for evaluating classifiers. Precision is the fraction of modules correctly predicted as faulty out of all the modules that have been predicted as faulty:

Precision = TruePositive / (FalsePositive + TruePositive)

Recall is the percentage of the defective files that the model predicts as defective [15]:

Recall = TruePositive / (FalseNegative + TruePositive)

F-measure is a combination of recall and precision. The ROC-AUC (Receiver Operating Characteristic - Area Under Curve) is used to evaluate the performance of the prediction models [16]; it is widely used on imbalanced datasets to evaluate the performance of machine learning algorithms. Fig. 3 shows the values generated by random forest when evaluating the fault prediction model.

Fig. 3. Evaluating the fault prediction models

The AUC metric captures the known tradeoff between precision and recall, where high recall comes with low precision [17]. For our dataset we calculate the imbalanced B/H ratio between the valid and bugged files. A valid file is free from errors, whereas a bugged file is prone to errors and may contain faults. The precision, recall and AUC are reported for the software modules and components classified as faulty or not, with their fault probability derived from that classification. The dataset is shown in Fig. 4, with details about the bugged files and the tests conducted for diagnosis.
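The paper does not show its evaluation script; a minimal WEKA sketch of this kind of evaluation, assuming the same hypothetical components.arff as before and that class index 1 is the "faulty" value, reports the P, R, F and AUC measures via 10-fold cross-validation:

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch: evaluate the random forest fault predictor with the metrics above.
public class FaultPredictionEvaluation {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("components.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new RandomForest(), data, 10, new Random(1));

        int faulty = 1; // index of the "faulty" class value (assumed)
        System.out.printf("Precision: %.3f%n", eval.precision(faulty));
        System.out.printf("Recall:    %.3f%n", eval.recall(faulty));
        System.out.printf("F-measure: %.3f%n", eval.fMeasure(faulty));
        System.out.printf("ROC-AUC:   %.3f%n", eval.areaUnderROC(faulty));
    }
}
```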
In the diagnosis experiments, the components are classified as faulty or not. We randomly choose one of the previously fixed bugs and handle the new bugs considering the software metrics and the related benchmark projects [18]. The faulty components are found depending upon whether they are healthy or not, considering their probability of being faulty. The tests are run on our software code through a user interface in which a developer and a tester can simultaneously inspect its function and working. It consists of a graphical user interface (GUI), depicted in Fig. 5, where the developer and the tester can initialize the test method and the bug simulation method.

Fig. 4. Classification of the dataset

According to the plan technique, one can select the diagnosis and the test plan. The Highest Probability algorithm, random forest, entropy, etc. can be applied in this section [19].

On the implementation side, we have the AI engine, to which all other components are directly connected. The AI engine is the artificially intelligent agent we implement, and it is considered the central brain of our system. All the actions occurring inside the system are performed under the intervention of the AI engine. We have the developer, who manages the intelligent agent through the source code and develops the code or program for the software development [24] [25] [26]. We used Git as the version control system and Bugzilla as the issue tracking system; Git is used to store the source code of the different projects.

Fig. 5. Graphical User Interface

According to our needs, we consider programs in Git and also the sample pieces of code available. Any piece of code written in Java can be effectively considered for bug troubleshooting. The issue tracking system is used to track any errors or mistakes during execution. During the development phase and in testing we can run randomized tests or select benchmark tests to execute. The benchmark tests [15] include already available test cases that can be considered as example test cases for future reference. The programs for our implementation are written in Java in the Eclipse IDE (2012 version). The dataset was taken from CSV files and the source code from Github. In the implementation, the Learning and the Diagnosis phases are integrated rather than performed separately.

A. PSO and GD

Particle Swarm Optimization (PSO) and Gradient Descent (GD) are the techniques we used during the learning phase. PSO is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. The technique works by considering the particles' positions and velocities; in the basic case this algorithm performs the optimization. Our use of PSO and GD yields the probabilities regarding the components we select. GD is likewise an iterative optimization algorithm, for finding the minimum of a function: one takes steps proportional to the negative of the gradient of the function at the current point. As a result it provides the probability of a component being faulty.

IV. RELATED WORKS

So far we have highlighted the independent, individual tasks and performance of the artificial intelligence components of LDP. There exist many implementations, not identical to but related to our view of improved diagnosis. There have been approaches that combine spectrum-based fault localization with slicing and hitting-set computation [21]. In their implementation, slices are computed for all the failing test cases considered faulty, and the diagnoses are computed using a hitting set algorithm, considering the faulty components and calculating their prior probability. Fault localization in spreadsheets [22] is also a relevant example of bug diagnosis, using the SFL approach and other localization methods. Several surveys have been published on this topic [23]. Classical Model-Based Diagnosis (MBD) and the Barinel algorithm appear in earlier work, and the STACCATO and Highest Probability (HP) algorithms have also been used in recent works. There are also several algorithms for planning tests to find more accurate diagnoses, which pave the way for automated diagnosis when the test cases are provided. Some test planning algorithms are based upon information theory, choosing first the test cases that maximize the information gain. Other planning algorithms are based upon decision theory, using a decision tree to develop a testing policy that minimizes the cost of further testing. Automated test planners [20] have also been used during the testing phase. Relative to all these, our main focus was on the diagnosis algorithm and how it can be made better and more efficient using a learned fault prediction model.
V. CONCLUSION

This paper provides a research direction in the area of artificial intelligence: we introduced the LDP paradigm for troubleshooting thorny errors and software bugs. LDP provides three phases: Learning, Diagnosing and Planning. The LDP paradigm is in high demand in the software field. It shows that integrating the AI components is more efficient than implementing the components separately. We have shown that when the L and D phases are integrated, their combined effect drastically changes the performance of each component. The resulting paradigm was evaluated on open source software projects, which evidently shows the synergistic benefit of the LDP components. The results suggest future work on improving the fault prediction model. Finally, to complete the paradigm, the integration of the planning phase and an automated repair component is necessary, which is left for future work.

REFERENCES

[1] Cardoso, N., Abreu, R., 2014. Enhancing reasoning approaches to diagnose functional and non-functional errors. In: The International Workshop on Principles of Diagnosis.
[2] Zamir, T., Stern, R., Kalech, M., 2014. Using model-based diagnosis to improve software testing. In: AAAI Conference on Artificial Intelligence.
[3] Elmishali, A., Stern, R., Kalech, M., 2016. Data-augmented software diagnosis. In: AAAI, pp. 4003-4009.
[4] Radjenovic, D., Hericko, M., Torkar, R., Zivkovic, A., 2013. Software fault prediction metrics: A systematic literature review. Inf. Softw. Technol. 55 (8), 1397-1418.
[5] Halstead, M.H., 1977. Elements of Software Science (Operating and Programming Systems Series). Elsevier Science Inc., New York, NY, USA.
[6] Erturk, E., Sezer, E.A., 2015. A comparison of some soft computing methods for software fault prediction.
[7] Abreu, R., Zoeteweij, P., van Gemund, A.J.C., 2009. Spectrum-based multiple fault localization. In: Automated Software Engineering (ASE). IEEE, pp. 88-99.
[8] Sliwerski, J., Zimmermann, T., Zeller, A., 2005. When do changes induce fixes? ACM SIGSOFT Softw. Eng. Notes 30 (4), 1-5.
[9] de Kleer, J., Williams, B.C., 1987. Diagnosing multiple faults. Artificial Intelligence 32 (1), 97-130.
[10] de Kleer, J., Williams, B.C., 1987. Diagnosing multiple faults. Artificial Intelligence 32 (1), 97-130.
[11] Abreu, R., van Gemund, A.J., 2009. A low-cost approximate minimal hitting set algorithm and its application to model-based diagnosis. In: SARA, Vol. 9, pp. 2-9.
[12] Rushby, J., 2005. Automated test generation and verified software. In: Working Conference on Verified Software: Theories, Tools, and Experiments. Springer, pp. 161-172.
[13] Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H., 2009. The WEKA data mining software: An update. SIGKDD Explor. Newsl. 11 (1), 10-18.
[14] Liaw, A., Wiener, M., 2002. Classification and regression by randomForest. R News 2 (3), 18-22.
[15] Choudhary, G.R., Kumar, S., Kumar, K., Mishra, A., Catal, C. Empirical analysis of change metrics for software fault prediction.
[16] Mitchell, T., 1997. Machine Learning. McGraw Hill.
[17] Japkowicz, N., 2000. The class imbalance problem: Significance and strategies. In: Proc. of the Int'l Conf. on Artificial Intelligence.
[18] Kubat, M., Matwin, S., et al., 1997. Addressing the curse of imbalanced training sets: one-sided selection. In: ICML, Vol. 97, pp. 179-186.
[19] Beck, K., Gamma, E. JUnit cookbook. Available online at: http://junit.sourceforge.net/doc/cookbook/cookbook.htm.
[20] Anand, S., Burke, E.K., Chen, T.Y., Clark, J., Cohen, M.B., Grieskamp, W., Harman, M., Harrold, M.J., McMinn, P., et al., 2013. An orchestrated survey of methodologies for automated software test case generation. J. Syst. Softw. 86 (8), 1978-2001.
[21] Wotawa, F., 2010. Fault localization based on dynamic slicing and hitting-set computation. In: Quality Software (QSIC), 10th International Conference on. IEEE, pp. 161-170.
[22] Abreu, R., Hofer, B., Perez, A., Wotawa, F., 2015. Using constraints to diagnose faulty spreadsheets. Softw. Qual. J. 23 (2), 297-322.
[23] Malhotra, R., 2015. A systematic review of machine learning techniques for software fault prediction. Appl. Soft Comput. 27, 504-518.
[24] Caglayan, B., Turhan, B., Bener, A., Habayeb, M., Miransky, A., Cialini, E., 2015. Merits of organizational metrics in defect prediction: an industrial replication. In: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol. 2. IEEE, pp. 89-98.
[25] Apache POI - the Java API for Microsoft Documents, http://poi.apache.org.
[26] Apache POI - the Java API for Microsoft Documents, http://poi.apache.org.
[27] Subramanyam, R., Krishnan, M.S., 2003. Empirical analysis of CK metrics for object-oriented design complexity: Implications for software defects. IEEE Trans. Softw. Eng. 29 (4), 297-310.
[28] Feldman, A., Provan, G.M., van Gemund, A.J.C., 2010. A model-based active testing approach to sequential diagnosis. J. Artif. Intell. Res. 39, 301-334.
[29] Fraser, G., Arcuri, A., 2011. EvoSuite: automatic test suite generation for object-oriented software. In: SIGSOFT FSE, pp. 416-419.
[30] Friedrich, G., Nejdl, W., 1992. Choosing observations and actions in model-based diagnosis/repair systems. In: The International Conference on Principles of Knowledge Representation and Reasoning (KR), pp. 489-498.
[31] Apache Ant, http://ant.apache.org.
[32] Eclipse CDT, http://eclipse.org/cdt.