
Ecological Modelling 154 (2002) 135–150 www.elsevier.com/locate/ecolmodel Illuminating the ‘‘black box’’: a randomization approach for understanding variable contributions in artificial neural networks Julian D. Olden *, Donald A. Jackson Department of Zoology, Uni6ersity of Toronto, 25 Harbord Street, Toronto, Ontario, Canada M5S 3G5 Received 2 May 2000; received in revised form 15 February 2002; accepted 4 March 2002 Abstract With the growth of statistical modeling in the ecological sciences, researchers are using more complex methods, such as artificial neural networks (ANNs), to address problems associated with pattern recognition and prediction. Although in many studies ANNs have been shown to exhibit superior predictive power compared to traditional approaches, they have also been labeled a ‘‘black box’’ because they provide little explanatory insight into the relative influence of the independent variables in the prediction process. This lack of explanatory power is a major concern to ecologists since the interpretation of statistical models is desirable for gaining knowledge of the causal relationships driving ecological phenomena. In this study, we describe a number of methods for understanding the mechanics of ANNs (e.g. Neural Interpretation Diagram, Garson’s algorithm, sensitivity analysis). Next, we propose and demonstrate a randomization approach for statistically assessing the importance of axon connection weights and the contribution of input variables in the neural network. This approach provides researchers with the ability to eliminate null-connections between neurons whose weights do not significantly influence the network output (i.e. predicted response variable), thus facilitating the interpretation of individual and interacting contributions of the input variables in the network. Furthermore, the randomization approach can identify variables that significantly contribute to network predictions, thereby providing a variable selection method for ANNs. We show that by extending randomization approaches to ANNs, the ‘‘black box’’ mechanics of ANNs can be greatly illuminated. Thus, by coupling this new explanatory power of neural networks with its strong predictive abilities, ANNs promise to be a valuable quantitative tool to evaluate, understand, and predict ecological phenomena. © 2002 Elsevier Science B.V. All rights reserved. Keywords: Connection weights; Sensitivity analysis; Neural Interpretation Diagram; Garson’s algorithm; Statistical models * Corresponding author. Present address: Department of Biology, Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO 80523-1878, USA. Tel.: +1-970-491-5432; fax: +1-970-491-0649. E-mail addresses: [email protected] (J.D. Olden), [email protected] (D.A. Jackson). 0304-3800/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0304-3800(02)00064-9 136 J.D. Olden, D.A. Jackson / Ecological Modelling 154 (2002) 135–150 1. Introduction confidence regarding their contribution. Cur- rently, there is a lack of theoretical or practical Artificial neural networks (ANNs) are receiving ways to partition the contributions of the indepen- increased attention in the ecological sciences as a dent variables in ANNs (Smith, 1994); thus pre- powerful, flexible, statistical modeling technique senting a substantial drawback in the ecological for uncovering patterns in data (Colasanti, 1991; sciences where the interpretation of statistical Edwards and Morse, 1995; Lek et al., 1996a; Lek models is desirable for gaining insight into causal and Gue´gan, 2000); a fact recently demonstrated relationships driving ecological phenomena. during the first international workshop on the Recently, a number of methods have been pro- applications of neural networks to ecological posed for selecting the best network architecture modeling (conference papers are published in Eco- (i.e. number of neurons and topology of connec- logical Modelling: Volume 120, Issue 2–3). The tions) among a set of candidate networks, e.g. utility of ANNs for solving complex pattern asymptotic comparison techniques, approximate recognition problems has been demonstrated in Bayesian analysis, and cross validation (Dimo- many terrestrial (e.g. Paruelo and Tomasel, 1997; poulos et al., 1995; see Bishop, 1995 for review). O8zesmi and O8zesmi, 1999; Manel et al., 1999; In contrast, methods for quantifying the indepen- Spitz and Lek, 1999) and aquatic studies (e.g. Lek dent variable contributions within networks are et al., 1996a,b; Bastarache et al., 1997; Mastrorillo more complicated, and as a result are rarely used et al., 1998; Chen and Ware, 1999; Gozlan et al., in ecological studies. For example, intensive com- 1999; Olden and Jackson, 2001; Scardi, 2001), and putational approaches such as growing and prun- has led many researchers to advocate ANNs as an ing algorithms (Bishop, 1995), partial derivatives attractive, non-linear alternative to traditional (e.g. Dimopoulos et al., 1995, 1999) and asymp- statistical methods. totic t-tests are often not used in favour of simpler The primary application of ANNs involves the techniques that use network connection weights development of predictive models to forecast fu- (e.g. Garson’s algorithm: Garson, 1991; Lek’s al- ture values of a particular response variable from gorithm: Lek et al., 1996a). Although these sim- a given set of independent variables. Although the pler approaches provide a means of determining predictive value of ANNs appeals greatly to many the overall influence of each predictor variable, ecologists, researchers have often criticized the interactions among the variables are more difficult explanatory value of ANNs, calling it a ‘‘black to interpret since the strength and direction of box’’ approach to modeling ecological phenomena individual axon connection weights within the (e.g. Paruelo and Tomasel, 1997; Lek and Gue´- network must be examined directly. Bishop (1995) gan, 1999; O8zesmi and O8zesmi, 1999). This view discusses the use of pruning algorithms to remove stems from the fact that the contribution of the connection weights that do not contribute to the input variables in predicting the value of the predictive performance of the neural network. In output is difficult to disentangle within the net- brief, a pruning approach begins with a highly work. Consequently, input variables are often en- connected network (i.e. large number of connec- tered into the network and an output value is tions among neurons), and then successively re- generated without gaining any understanding of moves weak connections (i.e. small absolute the inter-relationships between the variables, and weights) or connections that cause a minimal therefore, providing no explanatory insight into change in the network error function when re- the underlying mechanisms being modeled by the moved. A logical question then arises: at what network (Anderson, 1995; Bishop, 1995; Ripley, threshold value (i.e. absolute connection weight or 1996). The ‘‘black box’’ nature of ANNs is a change in network error) should weights be re- major weakness compared to traditional statistical moved or retained in the network? In the present approaches that can readily quantify the influence study, we propose a randomization test for ANNs of the independent variables in the modeling pro- to address this question. This randomization ap- cess, as well as provide a measure of the degree of proach provides a statistical pruning technique for J.D. Olden, D.A. Jackson / Ecological Modelling 154 (2002) 135–150 137 eliminating null connection weights that mini- 3. Interpreting neural-network connection weights: mally influence the predicted output, as well as an important caveat provides a selection method for identifying inde- pendent variables that significantly contribute to We refrain from detailing the specifics of neural network predictions. By using randomization pro- network optimization and design, and instead tocols to partition the importance of connection refer the reader to the extensive coverage provided weights (in terms of their magnitude and direc- in the texts by Smith (1994), Bishop (1995), Rip- tion), researchers will be able to quantitatively ley (1996), as well as articles by Ripley (1994), assess both the individual and interactive effects Cheng and Titterington (1994). It is sufficient to of the input variables in the network prediction say that the methods described in this paper refer process, as well as evaluate the overall contribu- to the classic family of one hidden-layer, feed-for- tions of the variables. Using an empirical example ward neural network trained by the backpropaga- describing the relationship between fish species tion algorithm (Rumelhart et al., 1986). These richness and habitat characteristics of north-tem- neural networks are commonly used in ecological perate lakes, we illustrate the utility of the ANN studies because they are suggested to be universal randomization test, and compare its results to two approximators of any continuous function commonly used approaches: Garson’s algorithm (Hornik et al., 1989). N-fold cross validation was and sensitivity analysis. used to determine optimal network design as it provides a nearly unbiased estimate of prediction success (Olden and Jackson, 2000). We found that a neural network with four hidden neurons exhib- 2. Empirical example: fish species ited good predictive power (r=0.72 between ob- richness–habitat relationships in lakes served and predicted species richness).
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages16 Page
-
File Size-