Predictive models of hollow incidence for state forests in central and eastern Victoria

A collaborative project between the Department of Natural Resources and Environment and the University of Melbourne

Fox, J.C.1*, Burgman, M.A.1, and Ades, P.K.2

1. School of Botany, The University of Melbourne, Parkville Victoria 3052. 2. School of , The University of Melbourne, Parkville Victoria 3052 *Corresponding author: Phone: 8344 4405, Email: [email protected]

Final Report to the Department of Natural Resources and Environment
February 2001

TABLE OF CONTENTS

1. ABSTRACT

2. INTRODUCTION

3. MODELLING METHODOLOGY
3.1. Tree-level hollow incidence models
3.2. Stand-level hollow incidence models
3.3. Stratification for model building
3.4. Validation
3.5. Data handling and editing

4. RESULTS
4.1. Tree-level first-stage logistic models
4.2. Tree-level second-stage models
4.3. Stand-level first-stage logistic models
4.4. Stand-level second-stage models
4.5. Validation

5. DISCUSSION

6. CONCLUSION

7. REFERENCES

8.1. APPENDIX A. BACKGROUND TO THE GENERALIZED LINEAR MIXED MODEL (GLMM)

8.2. APPENDIX B. GLMM ESTIMATION OF FIRST-STAGE HOLLOW INCIDENCE MODELS

8.3. APPENDIX C. FURTHER HOLLOWS WORK

8.4. APPENDIX D. PUBLICATIONS ARISING FROM THE HOLLOWS WORK

8.5. APPENDIX E. ACKNOWLEDGEMENTS

1. Abstract

This report details predictive models of hollow incidence for state forests in central and eastern Victoria. Models are based on the hollows component of the state-wide forest resource inventory (SFRI). A two-stage methodology was used at both the tree and stand level. The first stage generated a statement of the probability of hollow presence; the second stage estimated the size of hollows at the tree level and the density of hollows at the stand level. Separate models were developed for forests dominated by E. delegatensis, E. regnans, and E. obliqua. Remaining forests were grouped into a mixed forests category. The hierarchical nature of SFRI data prompted the search for statistical methodology capable of explicitly modelling a complicated error structure. Subsequently, generalized linear mixed models (GLMMs) were used to estimate first-stage, tree- and stand-level models in the presence of spatial and nested dependence. The developed first- and second-stage, tree-level models were statistically and biologically plausible and performed well when validated using independent data. By overcoming the limitations of previous research associated with insufficient data and inappropriate statistical methodology, viable predictive models were developed. Tree-level models will be useful for predicting hollow incidence when individual-tree information is available and for improving habitat tree retention guidelines. Stand-level models were statistically marginal, and tree-level models should be applied wherever tree-level information is available. However, with the appropriate use of confidence intervals on predictions to inform decision makers of uncertainty, stand-level models will be useful for stand-level forecasting of hollow incidence and for generating a statement of hollow incidence for the state forests of central and eastern Victoria.

2. Introduction

Managing public forests for net social benefit requires an appropriate balance of timber production and conservation priorities (Ferguson 1996). This project examined the incidence of tree hollows across central and eastern Victoria using data from the recently implemented state-wide forest resource inventory (SFRI). Because the number of hollow-bearing trees is a factor that limits populations of several endangered arboreal marsupials (Smith and Lindenmayer 1988, Loyn 1993), this work has important implications for integrating conservation priorities in forest management. To address limitations in previous research, this project used an extensive database of hollows incidence collected as part of the SFRI, and employed advanced statistical methodology that explicitly modelled a hierarchy of spatial and nested dependence. The resulting predictive models can be used by forest managers to inform decision making at several spatial scales, from the selection of individual habitat trees on harvested coupes to landscape level management planning.

Of the fauna that inhabit Australia’s native eucalypt forests, species that use tree hollows are considered the most vulnerable to the effects of timber harvesting (Gibbons and Lindenmayer 1996). Many studies have demonstrated a positive correlation between the diversity and abundance of hollow-dependent taxa and the number of available hollows (Saunders et al. 1982, Kavanagh et al. 1985, Smith and Lindenmayer 1988, Loyn 1993). Findings such as these prompted the loss of hollow-bearing trees to be listed as a threatening process under the Victorian Flora and Fauna Guarantee Act (Victorian Government 1988). Hollow-dependent species make up a significant proportion of Australia’s vertebrate fauna. Ambrose (1982) predicted that 42% of mammals, 17% of birds, and 28% of reptiles and amphibians were dependent on tree hollows for denning, roosting or nesting. However, Australia lacks primary excavators of hollows (such as woodpeckers in North America); instead, hollow formation depends on stochastic, episodic events such as mechanical damage from lightning or other trees, decay caused by fungal or insect attack, or fire. Given the indirect nature of hollow development in Australian forests, long time horizons are required before forests exhibit a suitable density of different sized hollows to support their diverse range of vertebrates.

One of the most intensively studied hollow-dependent species has been the Leadbeater’s possum (Gymnobelideus leadbeateri), a rare and endangered arboreal marsupial inhabiting the mountain ash forests of the central highlands of Victoria. The conservation of this species is one of the most contentious forestry issues in Australia (Lindenmayer and Possingham 1996). Approximately 50% of the forests where Leadbeater’s possum may occur are timber producing forests (Attiwill 1995). Current silvicultural practices in production ash forests result in trees being clearfell harvested on rotations of 40-80 years, which is too short a period for hollows to develop (120-200 years) (Smith 1982, Smith and Lindenmayer 1992). Furthermore, attempts to retain possible hollow-bearing trees have been judged unsuccessful, with research indicating that hollows may be lost as retained trees become senescent and collapse (e.g. Lindenmayer et al. 1990). It is hypothesised that a shortage of hollows in mountain ash forests could seriously threaten the long-term viability of Leadbeater’s possum (e.g. Lindenmayer et al. 1990). This combination of factors prompted Smith and Lindenmayer (1992) to suggest that Leadbeater’s possum could be totally eliminated from production areas. These findings have been supported by population viability analysis and simulation models which indicated that current management practices may threaten long-term survival (e.g. Lindenmayer and Possingham 1995, 1996, Ball et al. 1999). In light of the social cost of this possible extinction, these hypotheses need to be more rigorously tested, and this study can contribute with a more definitive statement of hollow availability in state forests.

Other rare or threatened hollow-dependent species in Victorian forests include the mountain brushtail possum (Trichosurus caninus), several owls (e.g. sooty owl Tyto tenebricosa, masked owl Tyto novaehollandiae, powerful owl Ninox strenua, and southern boobook Ninox novaeseelandiae) and gliders (e.g. yellow-bellied glider Petaurus australis, greater glider Petauroides volans and sugar glider Petaurus breviceps). Previous survey results of the incidence of each species indicate that, like Leadbeater’s possum, they are strongly dependent on the presence of suitable hollow-bearing trees (Milledge et al. 1991). Maintaining a sufficient density, variety and distribution of hollow-bearing trees in production forests will assist in the conservation of these hollow-dependent species.

Various simulation studies and metapopulation viability analyses have demonstrated hypothetical impacts of current management practices on arboreal mammals requiring tree hollows. Lindenmayer and Lacy (1995) used simulation analysis to examine the effects of on the viability of the mountain brushtail possum. Possingham et al. (1994) used metapopulation viability analysis to demonstrate that persistence of the greater glider is threatened when constrained to isolated patches of old-growth mountain ash forest. These habitat models coupled with problems of complexity and uncertainty suggest a precautionary approach to minimise potentially detrimental impacts on wildlife (McCarthy et al. 1994). This perspective has led some researchers to recommend the cessation of timber production in mountain ash forests (e.g. Lindenmayer 1995), but considerable contention surrounds this (e.g. Macfarlane and Loyn 1994, Attiwill 1995). This precautionary stance needs to be weighed against the socio-economic cost to regional Victoria of excluding timber production from further areas. Gibbons and Lindenmayer (1996) concede that as timber harvesting in eucalypt forests is ongoing, many benefits can be gained by integrating improved strategies for retaining hollow-bearing trees with operational practice. More explicit predictive models that account for more of the variability inherent in the occurrence of hollow-bearing trees, and that are based on a large and comprehensive database, may go some way towards resolving these contentions and will facilitate a more assured approach to integrating conservation priorities in forest management. The identification of a suitable methodology for the development of predictive hollows models can also benefit forest management in other areas of Australia where similar contentious issues exist, e.g., hollow-dependent fauna in Western Australia (Mawson and Long 1994, Stoneman et al. 1997, Calver 1997).

In a review of issues associated with the preservation of hollow-bearing trees within wood producing eucalypt forest, Gibbons and Lindenmayer (1996) identified two priorities for future research. The first priority was to address key deficiencies in data concerning the incidence of hollows in production forests. The second priority was to develop effective predictive models that could be used by forest managers to make more informed decisions. The aims of this project, coupled with the SFRI hollow database, satisfy the priorities identified by Gibbons and Lindenmayer (1996).

There has previously been limited information from which to develop and parameterise predictive models for hollow incidence to inform the process of landscape level management planning in public forests (Gibbons and Lindenmayer 1996). Victoria’s statewide forest resource inventory (SFRI) was conceived in 1993 with the central objective of providing forest managers with a resource statement for making consistent and informed decisions regarding timber yield forecasts, land-use planning and resource allocation on public land (DNRE 1999a). As part of this long-term project, information on the incidence of tree hollows was collected in public forests across Victoria. This information on the distribution and abundance of tree hollows is critical given the aims of modern forest management to balance the continuing extraction of timber with wildlife conservation (DNRE 1999b). Despite previous studies on the incidence of hollows in eucalypt forests, studies with a sufficient sample of hollow-bearing trees and a broad regional coverage are lacking. Information on the incidence and characteristics of tree hollows collected during Victoria’s SFRI represents the most extensive database of hollow incidence in Australian forests. The SFRI database provides unparalleled information on hollow development and incidence (DNRE 1999b), and this project aimed to capitalise on this.

Designing prescriptions regarding the number, distribution and diversity of habitat trees is the most important issue facing silvicultural planning for wildlife conservation (Florence 1996). Current habitat tree prescriptions are subjective, and rely on anecdotal evidence concerning the habitat requirements of endangered species, and a desire to minimise effects on regrowth productivity (Gibbons 1994). These prescriptions need to be formulated more objectively and, in particular, the attributes of trees likely to contain hollows need to be identified (Florence 1996). Furthermore, the application of blanket prescriptions for habitat tree retention often fails to provide a sufficient diversity of hollow sizes for the full range of hollow-dependent fauna (Gibbons and Lindenmayer 1997a, 1997b). The development of predictive models of hollow incidence and size will facilitate the objective identification of trees most likely to contain hollows, whose retention will yield the greatest benefits for endangered fauna.

Several studies have attempted to build predictive models for the incidence of tree hollows. These models have attempted to use biotic and abiotic factors to explain variability in the incidence of hollows. Studies have used generalized linear models and have either assumed that counts of hollow incidence follow a Poisson distribution (e.g., Lindenmayer et al. 1993, Bennett et al. 1994, DNRE 1999b, Gibbons 1999) or that the presence or absence of hollows follows a binomial distribution (Lindenmayer et al. 1991, Gibbons 1999). Results from these studies indicate that biotic factors including tree diameter, crown form, age, species and understorey composition (Lindenmayer et al. 1993, Bennett et al. 1994, DNRE 1999b, Gibbons 1999), as well as abiotic factors including site characteristics such as slope, latitude and rainfall (Lindenmayer et al. 1993, Bennett et al. 1994) influence the incidence of hollows. However, the developed models had limited predictive ability, and researchers have resigned themselves to identifying simple rules of thumb that can be coarsely applied (Lindenmayer et al. 2000). Researchers have speculated that the poor precision of previous predictive models indicates a need to collect further data at several spatial scales and to ensure that variables are measured at the appropriate level of resolution (Lindenmayer et al. 1993).

Attempts to model the presence of hollows across forested landscapes are complicated by a hierarchy of nested dependence. Dependence is generated because hollow incidence for individual trees within plots and within stands will be more similar than average. This violates the independence assumption of generalized linear models and has several statistical implications including biases in hypothesis tests on the estimated parameters, incorrect deviance statistics, poor estimation efficiency, and an overdispersed binary response (Kramer 1980, West et al. 1986, Goldstein 1995, Schabenberger and Gregoire 1995, Piepho 1999). A hierarchy of nested dependence eventuates because of the hierarchical nature of the SFRI sampling scheme: sampled individual trees are nested within plots and plots are nested within stands. Previous studies have overcome this problem by using empirical transformations to the normal distribution and applying mixed linear models (e.g. Lindenmayer et al. 1993, Bennett et al. 1994, Lindenmayer et al. 2000). An improved methodology for modelling hollows in the presence of a hierarchy of dependence needs to be identified which preserves the actual response distribution and explicitly incorporates a hierarchy of nested dependence. The identification of a suitable methodology will represent an advance that can be utilised for similar benefits in other Australian forests.

Project objectives were to: examine tree, stand, and landscape level influences on the incidence of tree hollows in central and eastern Victoria using the hollows component of the SFRI; identify a methodology for modelling the incidence of hollows; and develop predictive models for tree-hollow incidence that can be used by forest managers for improved management planning. Ultimately the project outcomes will facilitate a more assured approach to integrating conservation priorities for hollow-dependent fauna in forest management.

3. Modelling methodology

It is proposed that hollow incidence be modelled at both the tree- and stand-level. Modelling hollow incidence at both the tree- and stand-level ensures a large number of potential applications for the models. For example, stand-level models can be used to generate a statement of hollow incidence using currently available aerial photograph interpretation (API) classifications and information from geographic information systems (GIS), and can also be used for stand-level forecasting of hollow incidence into the future. An assessment of hollow incidence would contribute to the resolution of contentions over the availability of tree hollows in state forests. Tree-level models can be used to estimate the probability that any individual tree will have a hollow, and can be used for more precise statements of hollow incidence when individual-tree information is available. Generating probability statements for individual trees will also be useful when creating guidelines for tree retention strategies or for assessing the habitat potential of particular retained trees.

3.1. Tree-level hollow incidence models

The proposed design for tree-level hollow incidence models consists of a two-stage predictive framework. First-stage models consist of binary logistic models (Hosmer and Lemeshow 1989, Agresti 1990) that predict the probability of an individual tree having a hollow from biotic and abiotic tree and site characteristics using the first-stage (hollow presence/absence) sample of the SFRI. Binary logistic first-stage models are chosen because hollows are absent in many trees. Second-stage models consist of a multiple regression of hollow size against individual-tree attributes using the second-stage (hollow size-class) sample of the SFRI. Given this two-stage design, the probability of hollow incidence and the expected size of hollows could be predicted for any tree in central and eastern Victoria.

Tree-level models can be used to generate accurate statements of hollow incidence for entire forests when individual-tree information is available. Using the two tree-level models described above, the probability of hollow incidence and the expected size of hollows can be predicted for every tree in the stand. Accurate estimates of the total hollow availability in a forest can be determined by aggregating information on the presence and size of hollows to the stand level. The final result is an estimate of the number of trees with hollows, and the composition of these hollows with regard to size.
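The aggregation step described above can be sketched as follows. This is a minimal illustration only, assuming that each tree already has a first-stage predicted probability and a second-stage predicted hollow size class. The function name, the record layout, the size-class labels and the example values are hypothetical and are not taken from the SFRI.

```python
from collections import defaultdict


def aggregate_stand(trees):
    """Aggregate tree-level predictions to stand-level expectations.

    `trees` is a list of (p_hollow, size_class) pairs, where p_hollow is a
    first-stage predicted probability and size_class is a second-stage
    predicted hollow size class (both are hypothetical inputs).
    """
    expected_trees = 0.0
    expected_by_size = defaultdict(float)
    for p_hollow, size_class in trees:
        # The expected number of hollow-bearing trees is the sum of the
        # individual tree probabilities.
        expected_trees += p_hollow
        expected_by_size[size_class] += p_hollow
    return expected_trees, dict(expected_by_size)


# Hypothetical stand of four trees.
stand = [(0.05, "small"), (0.30, "small"), (0.60, "medium"), (0.85, "large")]
total, by_size = aggregate_stand(stand)
```

Summing probabilities gives an expected count rather than a certain one, so any stand-level statement built this way would still need an accompanying measure of uncertainty.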

The two-stage design is appropriate in light of the two-stage sampling design of the SFRI and uses all the available data. Alternative statistical methodologies such as ordinal and Poisson regression were examined but performed poorly. A tree-level modelling methodology is appropriate because of the large amount of variability in hollow incidence explained by tree attributes such as diameter, crown form and species (Lindenmayer et al. 1993, Bennett et al. 1994, DNRE 1999b). However, in most instances individual-tree information will not be available for forests. To facilitate a statement of hollow incidence given currently available API classifications, a stand-level methodology is also necessary.

Model building strategy

First-stage logistic models were built to predict the presence or absence of hollows on individual trees from tree and site attributes. A range of model selection algorithms as well as an examination of biological plausibility were used to select a parsimonious model that best fitted the data (Hosmer and Lemeshow 1989). Categorical variables consisting of several levels were divided into binary indicator variables, making them amenable to model building algorithms in SAS Proc Logistic (SAS Institute Inc. 1996). After a suitable model had been identified, partial residuals were examined to ensure that relationships were linear (Agresti 1996, Harrell 2000). Curvature was observed in the relationship for individual-tree diameter at breast height over bark (DBHOB), which was rectified using a natural log transformation. Various interaction terms were also investigated but none made a significant contribution to the models.
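The two data preparation steps mentioned above, indicator coding of categorical variables and the natural log transformation of DBHOB, can be sketched in a few lines. The analysis itself was carried out in SAS and S+; the Python below is only an illustration, and the function name, the crown-form codes and the treatment of an unlisted code as the reference level are assumptions.

```python
import math


def encode_tree(dbhob_cm, crown_form, crown_levels):
    """Return a predictor dict with ln(DBHOB) and 0/1 crown-form indicators."""
    if dbhob_cm <= 0:
        raise ValueError("DBHOB must be positive to take a natural log")
    row = {"LOGDBH": math.log(dbhob_cm)}
    for level in crown_levels:
        # One binary indicator per listed level of the categorical variable.
        row["CR" + level] = 1 if crown_form == level else 0
    return row


# Hypothetical crown-form codes. A tree whose code is not in this list
# (the assumed reference level) gets zeros for every indicator.
levels = ["HR", "IR", "RG"]
row = encode_tree(80.0, "IR", levels)
```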

Model building for second-stage regressions was implemented using well-developed methodologies for applied linear regression analysis (e.g. Rawlings 1988). Various diagnostic analyses were also used for both first and second-stage models to test for multicollinearity, potentially influential measurements, and assumptions regarding the error structure (Rawlings 1988, Hosmer and Lemeshow 1989). Model building for both logistic and linear regression was tempered by an examination of biological reality to ensure relationships are biologically based and are not statistical abstractions. Statistical analyses were implemented using the SAS and S+ statistical packages.

Explicitly modelling a hierarchy of dependence using generalized linear mixed models (GLMMs)

Modelling individual trees across plots and stands means that individual trees are not independent (West et al. 1984). Dependence is generated because hollow incidence for individual trees within plots and stands will be more similar than average, thus violating the independence assumption of generalized linear models. A hierarchy of nested dependence arises because of the hierarchical nature of the SFRI sampling scheme: sampled individual trees are nested within plots and plots are nested within stands. Therefore, methodology was sought capable of explicitly incorporating a hierarchy of nested dependence in model estimation. To do this we needed to search beyond traditional generalized linear model estimation for more complicated estimation methods.

The generalized linear mixed model (GLMM) was identified as a methodology capable of explicitly incorporating a hierarchy of nested dependence in model estimation while preserving a binary response. Further details of the GLMM are provided in Appendix A and further details of its application to the tree-level models are included in Appendix B.
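The consequence of nested dependence for a binary response can be illustrated with a small simulation. This is a sketch only and is not the estimation procedure of Appendix B: it assumes a logistic model with a single normally distributed random plot effect, and the function names, parameter values and random seed are all hypothetical.

```python
import math
import random


def simulate_plot_counts(n_plots, trees_per_plot, intercept, plot_sd, rng):
    """Simulate the number of hollow-bearing trees on each plot.

    Each plot receives a random intercept u ~ N(0, plot_sd^2) on the logit
    scale, shared by every tree on that plot.
    """
    counts = []
    for _ in range(n_plots):
        u = rng.gauss(0.0, plot_sd)
        p = 1.0 / (1.0 + math.exp(-(intercept + u)))
        counts.append(sum(rng.random() < p for _ in range(trees_per_plot)))
    return counts


def dispersion(counts, trees_per_plot):
    """Observed variance of plot counts divided by the binomial variance
    expected if all trees were independent with a common probability."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    p_hat = mean / trees_per_plot
    return var / (trees_per_plot * p_hat * (1.0 - p_hat))


rng = random.Random(1)
# No plot effect: trees are independent, so the ratio should be near 1.
independent = dispersion(simulate_plot_counts(500, 20, -1.0, 0.0, rng), 20)
# Strong plot effect: trees on the same plot are alike, so the ratio exceeds 1.
clustered = dispersion(simulate_plot_counts(500, 20, -1.0, 1.5, rng), 20)
```

A ratio well above 1 for the clustered case is the overdispersion that an ordinary logistic model would ignore and that a random plot effect is intended to absorb.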

3.2. Stand-level hollow incidence models

Model design

Because of limitations in the availability of individual-tree information, stand-level models were also developed. The stand-level models will facilitate stand-level estimates of hollow size and incidence for API classified forest in central and eastern Victoria.

Similar to tree-level hollow incidence models, stand-level models also use a first-stage binary logistic model. Binary logistic first-stage models were chosen because hollows were absent in many plots. The binary logistic model predicts the presence or absence of hollows on plots from API classifications and additional site characteristics from a geographic information system (GIS).

Second-stage models were constructed using the second-stage (felling plot) sample of the SFRI. Initially the number of hollows in felling plots was converted to a per hectare estimate for each hollow size class. This was done by calculating the hollow density contribution of individual felled trees as detailed in DNRE (1999b). This per hectare estimate of the incidence of each hollow size class was regressed against API classification variables and additional site characteristics from GIS. A separate regression was derived for each hollow size class, and can be used to provide a per hectare estimate of the incidence of each hollow size class for any API classified forest.

Several alternative statistical methodologies were tested for stand-level modelling. Using hollow counts or hollows per hectare for each plot in a Poisson regression resulted in very poor accuracy because hollows were absent in many plots. Similar model building strategies as detailed for the tree-level models were used for the stand-level models. The modelling methodology is detailed in Figure 1.

Modelling spatial dependence using generalized linear mixed models (GLMMs)

Similar to the tree-level models a methodology was required capable of explicitly incorporating dependence. The generalized linear mixed model satisfied this need (Appendix A). Its application to the stand-level model is detailed in Appendix B.

3.3. Stratification for model building

It is proposed that the large area contained in the Dandenong, Central, North-east, Tambo and Central Gippsland forest management areas (FMAs) be stratified into the major forest types for model building purposes. Separate first- and second-stage, tree- and stand-level models can then be developed for each of the major forest types. Examination revealed that this should reduce variability in the models, and improve predictive capabilities. Stratification was based on the principal species identified in the API classifications. This will ensure that stand-level models can be applied directly to the current API classified forest estate. Three major forest types were identified as those dominated by alpine ash (Eucalyptus delegatensis), mountain ash (Eucalyptus regnans) or messmate (Eucalyptus obliqua). The remaining, less common forest types were grouped into a mixed forest category. The modelling methodology detailed above was followed separately for these four forest types.

3.4. Validation

Given the size of the SFRI hollows database, there is an opportunity for keeping a portion of the database separate from model building for model validation purposes. Model validation using an independent database is an important step in the development of robust predictive models (Vanclay and Skovsgaard 1997). Data for the North-east and Tambo FMAs were used as the basis of a validation dataset. Initially models were built and parameterised using data from the Central, Central Gippsland, and Dandenong FMAs. Models were then applied to the validation dataset based on North-east and Tambo FMAs for an independent assessment of predictive capabilities. Following validation, models were re-examined, and re-parameterised using all the data combined.
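The split between model building and validation data described above amounts to partitioning records by FMA. The following sketch shows one way this could be expressed; the FMA names come from the text, while the record layout, the "fma" field name and the example records are hypothetical.

```python
# FMA groupings as described in the text.
BUILD_FMAS = {"Central", "Central Gippsland", "Dandenong"}
VALIDATION_FMAS = {"North-east", "Tambo"}


def split_by_fma(records):
    """Partition records into model-building and validation sets by FMA."""
    build, validation = [], []
    for rec in records:
        if rec["fma"] in BUILD_FMAS:
            build.append(rec)
        elif rec["fma"] in VALIDATION_FMAS:
            validation.append(rec)
        else:
            # Fail loudly rather than silently dropping an unknown FMA.
            raise ValueError("unrecognised FMA: " + repr(rec["fma"]))
    return build, validation


# Hypothetical tree records (hollow = 1 if a hollow was observed).
records = [
    {"fma": "Central", "hollow": 1},
    {"fma": "Tambo", "hollow": 0},
    {"fma": "Dandenong", "hollow": 0},
    {"fma": "North-east", "hollow": 1},
    {"fma": "Central Gippsland", "hollow": 0},
]
build, validation = split_by_fma(records)
```

Splitting by region rather than at random means the validation data are geographically separate from the model-building data, which gives a sterner test of how well the models transfer to other areas.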

3.5. Data handling and editing

The hollows component of the SFRI was subject to thorough checks for erroneous and unusual measurements. Minimal editing of the data was required. There were some inconsistencies in the data collection scheme for the North-east FMA. Ground-based crown classifications were not made, and the measurement of hollows in felled trees was inconsistent. These problems led to the exclusion of plots 1 to 1000 for the North-east FMA. Problems detailed for the North-east FMA were remedied when the SFRI was applied to other FMAs.

4. Results

Models presented below are the final models parameterised using SFRI data from Central, Central Gippsland, Dandenong, Tambo, and North-east FMAs. Results from the independent validation are presented in a separate validation section.

4.1. Tree-level first-stage logistic models

Tree-level first-stage logistic models were built to predict the probability of hollows being present on individual trees from tree and site attributes for each of the four forest types. Initially relationships between available predictors and the incidence of hollows were examined. The relationship between tree diameter at breast height over bark (DBHOB) and the incidence of hollows was consistently the strongest across the four forest types and appeared the most useful for first-stage models. The relationship for the four forest types is depicted in Figure 2. A strong relationship also existed between the estimated height of merchantable timber (top height) and the incidence of hollows as depicted in Figure 3. Note that Figures 2 and 3 depict simple logistic regressions between each predictor and the incidence of hollows.

Both full and simple tree-level first-stage models were identified. The full models included all significant variables while the simple models included only tree-level variables, thus making them useful for application in selecting retained habitat trees when only tree-level information is available. The more parsimonious, simple models may also exhibit reduced prediction variance, and may be less affected by collinearity (Harrell 2000). Models were estimated using GLMMs as detailed in Appendix B. The four models are detailed in Tables 1 – 8 below.

[Figure 2 plot: four panels (E. delegatensis, E. obliqua, E. regnans, Mixed). x-axis: diameter at breast height overbark (cm), 0 to 300. y-axis: proportion of trees with hollows.]

Figure 2. Relationships between diameter at breast height overbark and the proportion of trees with hollows fitted using a simple logistic regression.

[Figure 3 plot: four panels (E. delegatensis, E. obliqua, E. regnans, Mixed). x-axis: estimated top height of merchantable timber (m), 0 to 60. y-axis: proportion of trees with hollows.]

Figure 3. Relationships between the estimated top height of merchantable timber and the proportion of trees with hollows fitted using a simple logistic regression.

Table 1. First-stage tree-level logistic model for E. delegatensis

                         Full model                      Simple model
Predictor    Parameter   Estimate    t        p-value    Estimate    t        p-value
INTERCEPT    α           -19.9499    -32.22   0.0000     -20.4810    -34.68   0.0000
LOGDBH       β1          5.0918      35.18    0.0000     5.1003      35.46    0.0000
EST TOP      β2          -0.0823     -13.75   0.0000     -0.0823     -13.76   0.0000
CRHR         β3          -1.5417     -14.56   0.0000     -1.5658     -14.84   0.0000
CRRG         β4          -2.6218     -12.82   0.0000     -2.6208     -12.89   0.0000
SPDE         β5          -1.4710     -7.56    0.0000     -1.4668     -7.57    0.0000
SPG1         β6          0.3682      1.78     0.0748     0.3960      1.92     0.0545
IRL05        β7          -0.5636     -2.66    0.0079
REGRL5       β8          -0.8101     -3.06    0.0022

Variance components      Full model                      Simple model
Component                Estimate    LRT      p-value    Estimate    LRT       p-value
Random plot effect       2.5964      2903.13  0          2.6527      2941.269  0
Residual variance        0.4321                          0.4326

Deviance statistics      Full model                      Simple model
Total                    Model       %Explained          Model       %Explained
5677.87                  3916.76     68.98               3915.07     68.95

where LOGDBH is the natural log of the diameter at breast height overbark (DBHOB) and EST TOP is the estimated top height, i.e., the height above which there are no merchantable wood products. CRHR identifies trees with highly regular crown forms and CRRG identifies trees with regrowth crown forms. SPDE identifies E. delegatensis, while SPG1 identifies species that often contain hollows (refer to Table 2 below). IRL05 identifies stands with an irregular crown component of less than 0.05%. REGRL5 identifies stands with a regrowth crown component less than 5%.
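As a worked illustration of how the estimates in Table 1 translate into a probability, the sketch below evaluates the simple model for a single hypothetical tree. Several assumptions are made that the report does not state at this point: DBHOB is taken to be in centimetres and top height in metres (the units shown on the axes of Figures 2 and 3), indicator variables are coded 0/1, predictors not supplied are taken as zero, and the random plot effect is set to zero so that the result applies to a typical plot. The function name and the example tree are hypothetical.

```python
import math

# Simple-model estimates for E. delegatensis, copied from Table 1.
SIMPLE_MODEL = {
    "INTERCEPT": -20.4810,
    "LOGDBH": 5.1003,
    "EST TOP": -0.0823,
    "CRHR": -1.5658,
    "CRRG": -2.6208,
    "SPDE": -1.4668,
    "SPG1": 0.3960,
}


def hollow_probability(coefs, predictors):
    """Inverse-logit of the linear predictor, with the plot effect at zero."""
    eta = coefs["INTERCEPT"]
    for name, value in predictors.items():
        eta += coefs[name] * value
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical tree: E. delegatensis, DBHOB 100 cm, top height 30 m,
# crown form in the reference class (neither CRHR nor CRRG).
tree = {"LOGDBH": math.log(100.0), "EST TOP": 30.0, "SPDE": 1}
p = hollow_probability(SIMPLE_MODEL, tree)

# The same tree at 150 cm DBHOB, to show the effect of diameter.
larger = dict(tree, LOGDBH=math.log(150.0))
p_larger = hollow_probability(SIMPLE_MODEL, larger)
```

Under these assumptions the first tree has a predicted probability of about 0.28, and the probability rises with diameter, consistent with the positive LOGDBH estimate.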

Species group classifications such as SPG1 divide the various species encountered in E. delegatensis forests into four classes. E. delegatensis and dead trees are the first two classes. The remaining two classes divide the secondary species found in E. delegatensis stands into those that often form hollows (SPG1) and those that rarely form hollows (SPG2). These classes were determined using cross tabulations of species against hollow incidence. Only SPG1 and SPDE were selected in the final model as detailed in Table 1. The composition of the two classes, SPG1 and SPG2, is detailed in Table 2. Species that were poorly represented (fewer than 20 samples) were removed from the dataset to avoid unreliable predictions for these species.

Table 2. Species group classifications based on the incidence of hollows for forests dominated by E. delegatensis.

SPECIES GROUP 1 (SPG1)               SPECIES GROUP 2 (SPG2)
Species often containing hollows     Species rarely containing hollows
Eucalyptus dalrympleana              Acacia dealbata
Eucalyptus cypellocarpa              Acacia melanoxylon
Eucalyptus viminalis                 Eucalyptus nitens
Eucalyptus radiata                   Eucalyptus regnans
Eucalyptus dives                     Eucalyptus pauciflora

Individual tree species were classified with respect to hollow incidence using cross tabulations of hollow frequencies for the various species.

Table 3. First-stage tree-level logistic model for E. regnans.

                        Full and simple models
Predictor   Parameter   Estimate   t        p-value
INTERCEPT   α           -18.4408   -22.30   0.0000
LOGDBH      β1          4.0801     22.50    0.0000
EST TOP     β2          -0.0537    -7.57    0.0000
CRIR        β3          1.0202     3.51     0.0005
CRHR        β4          -0.6713    -2.83    0.0047
CRRG        β5          -1.3480    -4.66    0.0000
SPG1        β6          1.6258     6.86     0.0000
FMATAM      β7          -0.6165    -2.81    0.0049

Variance components
Component             Estimate   LRT        p-value
Random plot effect    0.9303     2649.816   0
Residual variance     1.0018

Deviance statistics
Total      Model     %Explained
4667.02    3393.74   72.72

where CRIR identifies trees with irregular crown forms, CRRG identifies trees with regrowth crown forms, and SPG1 identifies species that often contain hollows (refer to Table 4 below). FMATAM identifies trees located in the Tambo FMA.

Species group classifications such as SPG1 were identified using methods similar to those detailed for E. delegatensis. The composition of the two classes SPG1 and SPG2 for E. regnans is detailed in Table 4. Species that were poorly represented (fewer than 20 samples) were removed from the dataset.

Table 4. Species group classifications based on the incidence of hollows for forests dominated by E. regnans. SPECIES GROUP 1 (SPG1) SPECIES GROUP 2 (SPG2) -Species often containing hollows -Species rarely containing hollows Eucalyptus cypellocarpa Acacia dealbata Eucalyptus obliqua Acacia melanoxylon Eucalyptus delegatensis

Table 5. First-stage tree-level logistic model for E. obliqua.

                        Full model                     Simple model
Predictor   Parameter   Estimate   t        p-value    Estimate   t        p-value
INTERCEPT   α           -18.2975   -22.49   0.0000     -19.4202   -27.08   0.0000
LOGDBH      β1          4.8525     28.45    0.0000     4.8256     28.32    0.0000
EST TOP     β2          -0.1007    -15.38   0.0000     -0.0999    -15.13   0.0000
CRHR        β3          -1.2553    -9.62    0.0000     -1.2570    -9.52    0.0000
CRIR        β4          1.1901     5.29     0.0000     1.1853     5.28     0.0000
CRRG        β5          -1.4453    -5.10    0.0000     -1.5342    -5.46    0.0000
SPG1        β6          0.5349     4.44     0.0000     0.5493     4.51     0.0000
SPG2        β7          -0.6315    -2.18    0.0291     -0.6409    -2.18    0.0294
FMADAN      β8          -0.4656    -2.16    0.0307     -0.5013    -2.27    0.0234
CRDEN4      β9          0.3250     2.22     0.0268
IRL05       β10         -0.6932    -4.05    0.0001
REGRL05     β11         -0.9900    -2.49    0.0129
REGRL20     β12         -0.9992    -2.79    0.0053
REGR        β13         -0.0214    -2.90    0.0038
ASPN        β14         0.3122     1.69     0.0905

Variance components
                      Full model                     Simple model
Component             Estimate  LRT       p-value    Estimate  LRT       p-value
Random plot effect    0.6816    278.0093  0.6054     0.8220    329.5694  0.03553
Residual variance     0.9575                         0.9857

Deviance statistics
           Full model              Simple model
Total      Model     %Explained    Model     %Explained
6660.59    4330.21   65.01         4329.28   64.99

where SPG2 identifies species that rarely contain hollows (refer to Table 6 below), FMADAN identifies trees located in the Dandenong FMA, and CRDEN4 identifies stands with a medium crown density (50-69% crown cover). REGRL05 and REGRL20 identify stands with a regrowth crown component of less than 0.5% and 20% respectively. REGR is a continuous variable describing the percentage of the stand with a regrowth crown form component, and ASPN identifies trees on a northerly aspect.

Species group classifications such as SPG1 and SPG2 were identified using methods similar to those detailed for E. delegatensis and E. regnans. The composition of the two classes SPG1 and SPG2 for E. obliqua is detailed in Table 6.

Table 6. Species group classifications based on the incidence of hollows for forests dominated by E. obliqua. SPECIES GROUP 1 (SPG1) SPECIES GROUP 2 (SPG2) -Species often containing hollows -Species rarely containing hollows Eucalyptus dalrympleana Acacia dealbata Eucalyptus cypellocarpa Eucalyptus baxteri Eucalyptus radiata Eucalyptus sieberi Eucalyptus dives Eucalyptus regnans Eucalyptus viminalis

Table 7. First-stage tree-level logistic model for mixed forest types.

                        Full model                     Simple model
Predictor   Parameter   Estimate   t        p-value    Estimate   t        p-value
INTERCEPT   α           -15.2357   -28.67   0.0000     -15.5119   -29.35   0.0000
LOGDBH      β1          4.2068     33.50    0.0000     4.1924     33.52    0.0000
EST TOP     β2          -0.0956    -18.88   0.0000     -0.0952    -18.84   0.0000
CRHR        β3          -1.3191    -13.13   0.0000     -1.3399    -13.33   0.0000
CRS         β4          -0.5531    -3.82    0.0001     -0.5728    -3.96    0.0001
CRRG        β5          -2.5019    -10.69   0.0000     -2.5066    -10.68   0.0000
SPG1        β6          -1.3301    -5.81    0.0000     -1.3947    -6.08    0.0000
SPG2        β7          -0.3611    -3.15    0.0016     -0.3888    -3.38    0.0007
FMATAM      β8          -0.5607    -3.03    0.0024     -0.3930    -2.13    0.0328
FMACENG     β9          -0.5031    -2.99    0.0028     -0.4080    -2.41    0.0161
FMADAN      β10         -1.0957    -3.78    0.0002     -0.9615    -3.29    0.0010
CRDEN3      β11         -0.4473    -2.07    0.0387
IRL05       β12         -0.3967    -2.96    0.0031
REGRL05     β13         -0.7807    -2.50    0.0124

Variance components
                      Full model                     Simple model
Component             Estimate  LRT       p-value    Estimate  LRT       p-value
Random plot effect    1.1904    269.3149  1          1.2592    579.8324  0
Residual variance     0.7732                         0.7731

Deviance statistics
           Full model              Simple model
Total      Model     %Explained    Model     %Explained
9268.98    5722.3    61.74         5722.2    61.74

where CRS identifies trees with suppressed crown forms, FMACENG identifies trees from the Central FMA, and CRDEN3 identifies stands with a low crown density (30-49% crown cover).

Species group classifications such as SPG1 and SPG2 were identified using methods similar to those detailed for E. delegatensis and E. regnans. The composition of the two classes SPG1 and SPG2, and of the additional class SPG3 for mixed forests, is detailed in Table 8.

Table 8. Species group classifications based on the incidence of hollows for forests of mixed composition. SPECIES GROUP 1 (SPG1) -Species rarely containing hollows SPECIES GROUP 2 (SPG2) -Species sometimes containing hollows SPECIES GROUP 3 (SPG3) -Species often containing hollows Acacia dealbata Eucalyptus sieberi Eucalyptus baxteri Eucalyptus regnans Eucalyptus viminalis Eucalyptus radiata Acacia melanoxylon Eucalyptus bridgesiana Eucalyptus dives Eucalyptus delegatensis Eucalyptus bicostata Eucalyptus dalrympleana Eucalyptus nitens Eucalyptus globoidea Eucalyptus cypellocarpa Eucalyptus macrorhyncha Eucalyptus consideniana Eucalyptus muellerana Eucalyptus obliqua Eucalyptus pauciflora Eucalyptus polyanthemos Eucalyptus rubida

Tables 1, 3, 5, and 7 show that most predictors have parameter estimates that are significantly different from zero. The models also explain a large proportion of the total variability in hollow incidence, the smallest being 61.74% for mixed forests and the largest being 72.72% for E. regnans. These proportions are excellent given the highly variable nature of hollow incidence. Random plot effects are highly significant for E. delegatensis and E. regnans and not significant for E. obliqua and mixed forests. Tables 1, 3, 5, and 7 provide statistical justification for the models, but it is also important to examine the behaviour of selected predictors and parameter estimates in a biological context.

Biological justification

Finding a strong nested dependence for E. delegatensis and E. regnans is biologically reasonable, as we would expect two trees on the same plot to be more similar with regard to hollow incidence than two trees on separate plots. This significant nested dependence was explicitly incorporated in model estimation using the GLMMs.

Tables 1, 3, 5, and 7 indicate that diameter is a powerful predictor of hollow incidence and that the probability of hollow incidence increases as diameter increases. This relationship is intuitive because trees with larger diameters tend to be older and have therefore been subject to the various mechanisms leading to hollow development for a longer period of time. Mechanisms leading to hollow development include branch shedding, attack from decay-causing microbial organisms and termites, and other influences such as external wounding, fire, and strong winds. The likelihood that these mechanisms will occur increases with tree age. Furthermore, a tree’s ability to respond to these mechanisms with the formation of new tissue declines with tree age.

Top height has a strong negative relationship with the probability of hollow incidence. Top height is a useful predictor because a low top height is indicative of bole defects such as sweep and swelling, or a low major fork or large crown. Trees with these attributes are more likely to contain hollows relative to healthy, vigorous trees with large sections of clear bole, and thus large top heights.

Crown form is an important determinant of hollow incidence. Regrowth crowns are the least likely to contain hollows, principally because they have not been exposed to mechanisms that induce hollow formation for long enough. Highly regular crowns are unlikely to contain hollows because they typically form part of a healthy, vigorous tree that is relatively resistant to hollow-causing mechanisms. Suppressed crowns are likely to be under physiological stress, making them less able to respond to hollow-causing agents. Thus hollow formation is more common in suppressed crowns. The moderately regular and irregular crown forms were the most likely to contain hollows. This is expected because poor crown form is associated with senescence, external damage or major branch breakage. All these attributes contribute to hollow formation, and hollows are most prevalent in these trees. Parameter estimates for the various crown form variables detailed in Tables 1, 3, 5, and 7 reflect these intuitive understandings.

Species group divides the various species encountered in the four forest types into several classes. The most important classes differentiated the species into those that often form hollows and those that rarely form hollows. The differences between the species in their tendency to form hollows may reflect inter-specific differences in timber properties, morphology, fire resistance and the competitive position in the stand. Timber properties dictate susceptibility to attack from decay-causing organisms and termites. Morphology may influence the prevalence of branch shedding, and may also influence the ability of a species to respond to hollow-causing mechanisms. The fire resistance of the different species influences hollow incidence because trees able to survive fire disturbance are very likely to develop hollows. Fire disturbance causes external scarring, and if the tree survives it is very likely that the most severe scars will develop into hollows. The competitive position assumed by the various species is also important because the vigorous and dominant species with large sections of clear bole will rarely develop hollows relative to less vigorous, suppressed species exhibiting poor form. The differentiation of species in the models accounts for these inter-specific differences in hollow incidence.

Plot-level predictors were rarely selected in the tree-level models; however, binary variables identifying proportions of crown form are present in some of the models. The predictor identifying plots with an irregular crown component of less than 0.05% (IRL05) was often present in final models and always had a negative coefficient. This is intuitive, as trees on plots with a very small irregular crown component would have a reduced probability of hollows presence. The negative parameter estimates for the predictors identifying plots with small regrowth crown components (REGRL5, REGRL05) are counterintuitive: it would be expected that plots with small regrowth crown components would have an increased probability of hollows incidence. Other selected plot-level variables included the predictor identifying plots with a northerly aspect (ASPN), and predictors identifying intermediate levels of crown cover (CRDEN3, CRDEN4). Plot-level variables generated using GIS failed to explain any additional variability.

In Tables 1, 3, 5, and 7 the difference in explained deviance between full and simple models is often negligible. Site characteristics explained only a minor additional amount of the deviance in hollows presence/absence. The absence of site variables in the simple models simplifies their application, allowing point estimates of hollow probability given easily measured individual-tree attributes.
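As an illustration of such a point estimate, the following minimal sketch evaluates the E. regnans model of Table 3 for a single tree. It assumes that LOGDBH is the natural logarithm of DBHOB in centimetres, that EST TOP is in metres, and that the random plot effect is set to zero; none of these details is stated in this section. The function name and the example tree are hypothetical.

```python
import math

# Coefficients copied from Table 3 (E. regnans, full and simple models).
COEF = {
    "INTERCEPT": -18.4408,
    "LOGDBH": 4.0801,
    "EST_TOP": -0.0537,
    "CRIR": 1.0202,
    "CRHR": -0.6713,
    "CRRG": -1.3480,
    "SPG1": 1.6258,
    "FMATAM": -0.6165,
}


def hollow_probability(dbh_cm, est_top_m, crir=0, crhr=0, crrg=0, spg1=0, fmatam=0):
    """Predicted probability that a tree contains a hollow.

    Assumptions (not confirmed by the report): natural log of DBHOB in cm,
    top height in metres, random plot effect fixed at zero.
    """
    eta = (
        COEF["INTERCEPT"]
        + COEF["LOGDBH"] * math.log(dbh_cm)
        + COEF["EST_TOP"] * est_top_m
        + COEF["CRIR"] * crir
        + COEF["CRHR"] * crhr
        + COEF["CRRG"] * crrg
        + COEF["SPG1"] * spg1
        + COEF["FMATAM"] * fmatam
    )
    # The inverse logit link converts the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical tree: 100 cm DBHOB, 20 m top height, irregular crown form.
p = hollow_probability(100.0, 20.0, crir=1)
print(f"Predicted probability of a hollow: {p:.3f}")
```

Under these assumptions the predicted probability rises with diameter and falls with top height, in line with the signs of the coefficients in Table 3.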

4.2. Tree-level second-stage models

Tree-level second-stage models were developed for predicting hollow size from various individual-tree attributes in a simple or multiple regression. Second-stage models were constructed from felled tree information collected in the SFRI. Available individual-tree attributes were tested and, of these, variables that figured in the first-stage models also tended to be significant in the second-stage models. A graphical exploration of relationships between DBHOB and the observed size of hollows is provided in Figure 4. The fitted line shows the second order polynomial fit.

Similar to first-stage models, full and simple models were identified. Full models included all significant variables, while simple models included only tree-level variables, thus making them useful for application at the individual-tree level. The basic structure of the models is detailed in (1), and models were estimated using ordinary least squares. The four models are detailed in Tables 9 to 12 below.

$$\mathrm{HOLSIZE}_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i \tag{1}$$

where $\mathrm{HOLSIZE}_i$ is the diameter of the hollow (cm), and $X_1$ and $X_2$ are independent predictors.

[Figure 4: four panels (E. delegatensis, E. obliqua, E. regnans, Mixed); x-axis: diameter at breast height overbark (cm), 0 to 150; y-axis: estimated hollow diameter (cm).]

Figure 4. Relationships between diameter at breast height overbark and the observed size of hollows. A fitted second order polynomial is also shown.

Table 9. Second-stage tree-level regression model for E. delegatensis.

                        Full and simple models
Predictor   Parameter   Estimate    t        p-value
INTERCEPT   α           -1.352957   -5.165   0.0001
LARGHOL     β1          4.384765    6.591    0.0001
DBHSQR      β2          0.000520    14.012   0.0001
CRMR        β3          3.429706    5.34     0.0001

Model diagnostics
R-square     0.4892
Model D.F.   3
Total D.F.   571

where LARGHOL differentiates species which tended to have large hollows (E. cypellocarpa, E. dalrympleana, E. pauciflora), and DBHSQR is the squared diameter at breast height overbark.

Table 10. Second-stage tree-level regression model for E. regnans.

                        Full model                     Simple model
Predictor   Parameter   Estimate   t        p-value    Estimate   t        p-value
INTERCEPT   α           -2.1269    -2.187   0.0304     -1.2584    -1.266   0.2077
LARGHOL     β1          3.4101     3.15     0.002      1.6109     1.520    0.1307
DBHSQR      β2          0.0002     2.821    0.0055     0.0003     2.975    0.0034
CRRG        β3          -2.0906    -1.987   0.0489     0.2189     0.275    0.7835
CRS         β4          4.5126     3.748    0.0003     6.5518     5.539    0.0001
REGR        β5          0.0671     4.383    0.0001
REGL5       β6          -2.0914    -2.7     0.0078

Model diagnostics   Full model   Simple model
R-square            0.4515       0.3722
Model D.F.          6            4
Total D.F.          145          145

where LARGHOL differentiates species which tend to have large hollows (E. obliqua).

Table 11. Second-stage tree-level regression model for E. obliqua.

                        Full model                     Simple model
Predictor   Parameter   Estimate   t        p-value    Estimate   t        p-value
INTERCEP    α           3.5105     0.714    0.476      1.132599   1.244    0.2146
DBHSQR      β1          0.0005     5.326    0.0001     0.000688   6.867    0.0001
CRDEN2      β2          7.1851     2.238    0.0262
CRDEN4      β3          2.2925     1.99     0.0477
MIN_SLOP    β4          -1.3729    -3.76    0.0002
MEAN_SL     β5          0.6628     3.901    0.0001
RAINFALL    β6          -0.0063    -2.023   0.0442

Model diagnostics   Full model   Simple model
R-square            0.3521       0.1609
Model D.F.          6            1
Total D.F.          247          247

where MIN_SLOPE, MEAN_SLO, and RAINFALL are variables derived from GIS describing the minimum and mean slope of SFRI plots and the estimated rainfall respectively.

Table 12. Second-stage tree-level regression model for mixed forest.

                        Full model                     Simple model
Predictor   Parameter   Estimate   t        p-value    Estimate   t        p-value
INTERCEPT   α           -28.4231   -4.149   0.0001     -5.0862    -5.372   0.0001
DBHOB       β1          0.1109     8.287    0.0001     0.1211     10.099   0.0001
CRS         β2          4.6116     4.268    0.0001     3.9024     3.527    0.0005
FMATAM      β3          -11.2063   -3.55    0.0004
FMACENG     β4          -3.6579    -2.454   0.0146
CRDEN3      β5          3.1612     2.786    0.0056
SLOPE       β6          -0.3569    -3.459   0.0006
SLOPESQR    β7          0.0054     3.214    0.0014
AMG_EAS     β8          7.11E-05   4.216    0.0001

Model diagnostics   Full model   Simple model
R-square            0.2945       0.2222
Model D.F.          8            2
Total D.F.          359          359

where SLOPE is the estimated slope (in degrees from horizontal) of the SFRI plot.

The model fit detailed in Tables 9 to 12 is satisfactory, with R-square for the full models varying from 0.295 to 0.489. However, there remains a large amount of unexplained variability in hollow size, so deterministic predictions from these models may be unreliable. Consequently, a stochastic model structure is adopted to ensure that variability is represented in hollow size predictions (Ripley 1987). Stochastic hollow size predictions are generated using:

$$\mathrm{HOLSIZE}_s = \mathrm{HOLSIZE}_d + \sigma z \tag{2}$$

where HOLSIZEs is a stochastic estimate of hollow size, HOLSIZEd is a deterministic estimate of hollow size generated using model (1) and the various predictors detailed in Tables 9-12, σ is the standard error as detailed in Table 13, and z is a random variate drawn from the standard normal distribution. Using this simple stochastic model is one way of representing variability inherent in the model.

Table 13. Standard errors of second-stage tree-level models.

                  Full model             Simple model
Forest type       Model standard error   Model standard error
E. delegatensis   4.5208                 4.5208
E. regnans        3.1297                 3.3245
E. obliqua        7.5651                 8.5215
mixed forest      6.2728                 6.5307

The use of a stochastic model component will facilitate the generation of hollow size predictions for single trees, which when summed to the stand-level, should provide an accurate picture of the availability of various sizes of hollow. Alternatively, the deterministic component alone can be used to generate an expected hollow size for a particular tree.
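The two uses described above can be sketched for E. delegatensis using the coefficients of Table 9 and the standard error of Table 13. This is a minimal illustration only: the function names and the example tree are hypothetical, DBHOB is assumed to be in centimetres, and the seeded generator is used purely to make the example repeatable.

```python
import random

# E. delegatensis coefficients from Table 9 and standard error from Table 13.
ALPHA = -1.352957
B_LARGHOL = 4.384765
B_DBHSQR = 0.000520
B_CRMR = 3.429706
SIGMA = 4.5208


def hollow_size_deterministic(dbh_cm, larghol=0, crmr=0):
    """Expected hollow diameter (cm) from model (1)."""
    return ALPHA + B_LARGHOL * larghol + B_DBHSQR * dbh_cm ** 2 + B_CRMR * crmr


def hollow_size_stochastic(dbh_cm, larghol=0, crmr=0, rng=random):
    """Stochastic hollow diameter (cm) from model (2)."""
    z = rng.gauss(0.0, 1.0)  # draw from the standard normal distribution
    return hollow_size_deterministic(dbh_cm, larghol, crmr) + SIGMA * z


# Hypothetical tree: 100 cm DBHOB, not a LARGHOL species, not CRMR.
rng = random.Random(42)
expected = hollow_size_deterministic(100.0)
draws = [hollow_size_stochastic(100.0, rng=rng) for _ in range(5)]
print(f"Expected hollow diameter: {expected:.2f} cm")
print("Stochastic draws:", [round(d, 2) for d in draws])
```

Because the standard error is large relative to the expected size, individual draws can be negative. The report does not say how such draws are handled, so any truncation rule would be a further assumption.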

The models described in Tables 9 to 12 are generally biologically realistic. Individual-tree diameter is consistently the most important predictor, with positive coefficients indicating that trees with larger diameters tend to contain larger hollows. Other predictors such as crown forms are also behaving realistically with suppressed crown forms tending to have larger hollows (positive coefficients) and regrowth crown forms tending to have smaller hollows (negative coefficients). The behaviour of stand-level predictors and GIS variables is more difficult to interpret. Although often making significant contributions to the models, it is likely that stand-level predictors will be excluded when models are used for generating predictions at the individual-tree-level.

The fit of the models is satisfactory, and incorporating a stochastic component will ensure that predictions are realistic. The second-stage tree-level models described by (1) and (2) are suitable for estimating the expected size of hollows and, if re-sampled, for generating a set of hollow sizes that correctly reflects variability in the size of hollows.

4.3. Stand-level first-stage logistic models

Stand-level first-stage models will predict the probability of hollows being present at the stand-level given aerial photograph interpretation (API) polygon classifications and variables from GIS. Stand-level first-stage models are described in Tables 14 to 17.

Table 14. First-stage stand-level logistic model for E. delegatensis.

Predictor   Parameter   Estimate   t       p-value
INTERCEPT   α           1.3311     4.10    0.0001
IRL05       β1          -1.6757    -5.40   0.0000
REGL05      β2          -1.1980    -3.42   0.0007
PCFI        β3          3.9652     4.06    0.0001

Variance components
Component           Estimate
Residual variance   1.0184

Deviance statistics
Total     Model     %Explained
581.54    125.72    21.62

Table 15. First-stage stand-level logistic model for E. regnans.

Predictor   Parameter   Estimate   t       p-value
INTERCEPT   α           -2.4170    -2.05   0.0409
FMACENG     β1          -1.3126    -3.71   0.0002
FMADAN      β2          1.0041     2.27    0.0240
CRDEN2      β3          -1.7663    -2.62   0.0094
IRL05       β4          -1.5984    -4.44   0.0000
REGR        β5          0.0543     3.63    0.0003
REG         β6          0.0668     4.63    0.0000

Variance components
Component           Estimate
Residual variance   1.0697

Deviance statistics
Total     Model     %Explained
395.76    107.38    27.13

Table 16. First-stage stand-level logistic model for E. obliqua.

Predictor   Parameter   Estimate   t       p-value
INTERCEPT   α           4.8366     6.33    0.0000
IRL05       β1          -2.5150    -3.27   0.0012
REGR        β2          -0.0212    -2.02   0.0443

Variance components
Component           Estimate   LRT   p-value
Residual variance   1.3947

Deviance statistics
Total     Model    %Explained
138.57    28.62    20.65

Table 17. First-stage stand-level logistic model for mixed forests.

Predictor   Parameter   Estimate   t       p-value
INTERCEPT   α           3.9481     12.49   0.0000
CRDEN2      β1          -2.5321    -3.19   0.0015
IRL05       β2          -1.6284    -5.36   0.0000
REGL20      β3          -1.2518    -3.99   0.0001

Variance components
Component           Estimate   LRT        p-value
Mapname effect      2.1338     242.3125   1.92E-10
Residual variance   0.4830

Deviance statistics
Total     Model     %Explained
369.03    156.63    42.44

Further explanation of the random Mapname effect is detailed in Appendix B. Note that the random Mapname effect is highly significant. Stand-level first-stage models explain between 20 and 42% of the total variation in hollow incidence. These results suggest that models explain a relatively small amount of the total variability in stand-level hollow incidence. Despite this, the models may still be useful for stand-level predictions of hollow incidence, and validation using an independent dataset will be a more rigorous test of predictive ability.

Biological justification

Crown form index is a continuous covariate constructed to reflect stand maturity. The strong positive relationship between this covariate and the incidence of hollows for E. delegatensis is expected because older stands are more likely to contain hollows than younger stands. When the irregular crown form component constitutes less than 0.05% (IRL05), stands are unlikely to contain hollows because it is often trees with irregular crown form that contain hollows. This is reflected by the variable IRL05 being selected in all models and having a significant negative coefficient. Other predictors also appear to be biologically reasonable. This brief analysis suggests that first-stage stand-level models adhere to biological intuition.

First-stage stand-level models explain only a modest amount of the variation in hollow incidence at the stand-level. However, the models are biologically plausible and are simply implemented using currently available API classifications. No variables from GIS were selected in the final models.

4.4. Stand-level second-stage models

Stand-level second-stage models will provide a hollows per hectare estimate for API classified state forests in central and eastern Victoria.

An initial challenge was to generate hollows per hectare estimates for SFRI felling plots. Approximately three trees were felled on each felling plot, and for every felled tree a hollows per hectare estimate for each hollow size class was generated by assuming that the number of hollows is proportional to the basal area of the tree. This is an appropriate assumption given that the number of hollows can be expected to increase with tree size. The formulation detailed in (3) is used to estimate hollows per hectare for each felled tree (F. Hamilton pers. comm.).

$$\mathrm{HOL/HA}_{ij} = N(\mathrm{HOL})_i \times N(\mathrm{TREES})_j \times \frac{\mathrm{BAF}}{\mathrm{BA}_i} \tag{3}$$

where $\mathrm{HOL/HA}_{ij}$ is a hollows per hectare estimate for tree i on felling plot j, $N(\mathrm{HOL})_i$ is the number of hollows (of a particular size class) on tree i, $N(\mathrm{TREES})_j$ is the number of trees on felling plot j, $\mathrm{BA}_i$ is the basal area of tree i, and BAF is the basal area factor (3 for SFRI plots). Note that the ratio of BAF to $\mathrm{BA}_i$ scales estimates to the hectare. Model (3) is similar to that used for estimating volume per hectare for felled trees (e.g. DNRE 1999a, F. Cumming, pers. comm.).

To generate an overall estimate of hollow density for each felling plot, the average of the hollows per hectare estimates for the 3 felled trees was taken.
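The calculation in (3), followed by the plot-level averaging, can be sketched as below. This is an illustration under stated assumptions: tree basal area is taken to be the cross-sectional area at breast height in square metres computed from DBHOB, BAF is taken to be in square metres per hectare, and the function names and example plot are hypothetical rather than taken from the SFRI.

```python
import math

BAF = 3.0  # basal area factor for SFRI plots (assumed units: m^2 per hectare)


def tree_basal_area_m2(dbh_cm):
    """Cross-sectional area of the stem at breast height (m^2), assumed circular."""
    return math.pi * (dbh_cm / 200.0) ** 2


def hollows_per_ha(n_hollows, n_trees_on_plot, dbh_cm, baf=BAF):
    """Hollows per hectare estimate for one felled tree, following (3)."""
    return n_hollows * n_trees_on_plot * baf / tree_basal_area_m2(dbh_cm)


def plot_hollow_density(felled_trees, n_trees_on_plot):
    """Average the per-tree estimates over the felled trees on a plot.

    felled_trees is a list of (n_hollows, dbh_cm) pairs for one size class.
    """
    estimates = [hollows_per_ha(n, n_trees_on_plot, d) for n, d in felled_trees]
    return sum(estimates) / len(estimates)


# Hypothetical felling plot: 8 trees, three of them felled.
felled = [(1, 100.0), (0, 60.0), (2, 120.0)]
print(f"{plot_hollow_density(felled, n_trees_on_plot=8):.1f} hollows per hectare")
```

A felled tree with no hollows in a size class contributes zero to the average, so the plot estimate is pulled down by hollow-free trees as well as raised by hollow-bearing ones.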

Possible predictors for the second-stage models included API classification variables and the probability of hollow incidence from the first-stage model. The probability of hollow incidence was found to be the best predictor of hollows per hectare in the following model:

$$\mathrm{HOL/HA}_{sj} = \alpha_s \hat{p}_j^{4} \tag{4}$$

where $\mathrm{HOL/HA}_{sj}$ is the hollows per hectare for hollow size class s for felling plot j, and $\hat{p}_j$ is the predicted probability of hollow presence for plot j from the first-stage model. Note that the power transformation of $\hat{p}_j$ was used to straighten an initially exponential relationship. The intercept is excluded in model (4), as it was found not to be significantly different from zero when included in the model. A graphical exploration of relationships between $\hat{p}_j^{4}$ and the observed density of hollows is provided in Figure 5. Parameter estimates and model fit diagnostics for model (4) are detailed in Tables 18-20.

[Figure 5: panels for E. delegatensis, E. obliqua, and Mixed forest by hollow size class (2.5-5 cm, 5-10 cm, 10-20 cm, 20+ cm); x-axis: predicted probability from first stage model to the power of four; y-axis: observed density of hollows (hollows/hectare), 0 to 80.]

Figure 5. Relationships between the predicted probability from first stage models (to the power of four) and the observed density of hollows. A least squares fit through the origin is also shown.

Table 18. Second-stage stand-level models for E. delegatensis.

                    Model fit             Parameter estimates
Hollow size class   R²      Total D.F.    Parameter estimate for $\hat{p}_j^{4}$ ($\alpha_s$)   t stat+   p > |t|
2.5-5 cm            0.233   59            13.8388                                               4.201     0.0001
5-10 cm             0.295   59            7.4963                                                4.931     0.0001
10-20 cm            0.227   59            5.1107                                                4.129     0.0001
20+ cm              0.182   59            2.7861                                                3.596     0.0007
+ t stat is for the test (Ho: parameter estimate = 0).

Table 19. Second-stage stand-level models for E. obliqua.

                    Model fit             Parameter estimates
Hollow size class   R²      Total D.F.    Parameter estimate for $\hat{p}_j^{4}$ ($\alpha_s$)   t stat+   p > |t|
2.5-5 cm            0.354   33            23.3535                                               4.107     0.0003
5-10 cm             0.221   33            15.1289                                               3.012     0.0050
10-20 cm            0.108   33            9.1911                                                1.964     0.0583
20+ cm              0.171   33            8.0111                                                2.572     0.0150
+ t stat is for the test (Ho: parameter estimate = 0).

Table 20. Second-stage stand-level models for mixed forest.

                    Model fit             Parameter estimates
Hollow size class   R²      Total D.F.    Parameter estimate for $\hat{p}_j^{4}$ ($\alpha_s$)   t stat+   p > |t|
2.5-5 cm            0.253   55            14.3247                                               4.277     0.0001
5-10 cm             0.191   55            12.4265                                               3.571     0.0008
10-20 cm            0.175   55            6.9854                                                3.385     0.0013
20+ cm              0.077   55            7.1546                                                2.126     0.0381
+ t stat is for the test (Ho: parameter estimate = 0).

There were insufficient samples for E. regnans to develop second-stage stand-level models.

The models described in (4) and detailed in Tables 18-20 exhibit satisfactory $R^2$ for each hollow size class. R-square values tend to be higher for the smaller hollow size classes and lowest for the largest size classes. This reflects the small number of observed large hollows in SFRI felling plots.

Similar to second-stage tree-level models, a stochastic component is required to ensure that variability is represented in predictions of hollows per hectare. However, inference on the models will be biased because the independent variable ($\hat{p}_j^{4}$) does not satisfy the strict statistical requirements of an independent variable in traditional regression theory, i.e., that observations on the independent variable are randomly sampled from the complete population and are measured without error. In this instance the independent variable is a composite of predictors from the first-stage models, and therefore model standard error will be a poor reflection of the actual variability affecting the model. Given this, standard errors were empirically estimated using the graphical technique of Tanaka (1986, 1988). This will facilitate an improved representation of the variability affecting models.

The graphical technique involved empirically estimating a function between the standard error of hollow density estimates and $\hat{p}_j^{4}$. This was done by dividing observations into 0.2 increments on the $\hat{p}_j^{4}$ axis. The standard errors of the 5 groups were then quantified and the least squares method was used to parameterise a function (constrained to pass through zero) between the standard error of hollows density and $\hat{p}_j^{4}$. The parameter estimates for this relationship are detailed in Tables 21 to 24. The estimated standard error for any value of $\hat{p}_j^{4}$ can be multiplied by a random drawing from the standardised normal distribution and added to the mean prediction (as in (2)) to generate a stochastic hollows density estimate. The estimated standard error can also be used to generate statements of prediction uncertainty about any hollow density prediction.
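One possible reading of this binning procedure is sketched below. It is not the authors' implementation: the report does not say how the standard error of each group was computed or which x-value represented each group, so the sketch assumes the sample standard deviation of observed densities within each bin and the bin midpoint. The function name and the synthetic data are hypothetical.

```python
from statistics import stdev


def fit_se_function(p4_values, densities, bin_width=0.2):
    """Slope of a through-origin least squares fit of within-bin spread on p^4.

    Assumptions: spread is the sample standard deviation of observed densities
    in each bin, and each bin is represented by its midpoint.
    """
    n_bins = round(1.0 / bin_width)
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(p4_values, densities):
        idx = min(int(x / bin_width), n_bins - 1)  # p^4 = 1 falls in the last bin
        bins[idx].append(y)

    mids, spreads = [], []
    for i, ys in enumerate(bins):
        if len(ys) >= 2:  # a standard deviation needs at least two observations
            mids.append((i + 0.5) * bin_width)
            spreads.append(stdev(ys))
    if not mids:
        raise ValueError("no bin holds two or more observations")

    # Least squares through the origin: slope = sum(x * s) / sum(x^2).
    return sum(x * s for x, s in zip(mids, spreads)) / sum(x * x for x in mids)


# Synthetic data in which the spread grows in proportion to p^4.
p4 = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5]
dens = [0.0, 2.0 ** 0.5, 0.0, 3.0 * 2.0 ** 0.5, 0.0, 5.0 * 2.0 ** 0.5]
print(f"Estimated slope of the standard error function: {fit_se_function(p4, dens):.3f}")
```

The returned slope plays the role of the single parameter reported per size class in the standard error tables: multiplying it by $\hat{p}_j^{4}$ gives the estimated standard error at that value.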

Table 21. Estimated parameter for the standard error function for E. delegatensis.

Hollow size class   Parameter estimate for the S.E.
2.5-5 cm            25.2241
5-10 cm             10.9689
10-20 cm            8.6652
20+ cm              5.8121

Table 22. Estimated parameter for the standard error function for E. obliqua.

Hollow size class   Parameter estimate for the S.E.
2.5-5 cm            30.5495
5-10 cm             24.1923
10-20 cm            19.8655
20+ cm              13.5778

Table 23. Estimated parameter for the standard error function for mixed forest.

Hollow size class   Parameter estimate for the S.E.
2.5-5 cm            27.3202
5-10 cm             20.4926
10-20 cm            12.5709
20+ cm              15.3622

The empirical function of standard error can be used to generate stochastic predictions of hollows density or for constructing confidence intervals about predictions. An example of this procedure for E. delegatensis is given below:

- Predicted probability of hollows presence from the first-stage model = 0.9, so $\hat{p}_j^{4} = 0.6561$.

- Deterministic predictions of hollows density using model (4) and Table 18 are 9.08, 4.92, 3.35, and 1.83 hollows per hectare for the 2.5-5, 5-10, 10-20, and 20+ cm hollow size classes.

- Using the parameter estimates from Table 21, standard errors can be estimated as 16.55, 7.197, 5.684, and 3.813 for the 2.5-5, 5-10, 10-20, and 20+ cm hollow size classes. Multiplying a standard error by a draw from the standardised normal distribution and adding the result to the deterministic prediction yields a stochastic prediction of hollow density.

- Empirical 95% confidence intervals can also be constructed: 9.08±32.44, 4.92±14.11, 3.35±11.14, and 1.83±7.47 hollows per hectare for the 2.5-5, 5-10, 10-20, and 20+ cm hollow size classes.
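The worked example above can be reproduced with a short script. The sketch below uses the parameter estimates of Tables 18 and 21 for E. delegatensis and takes the 95% half-width as 1.96 standard errors, which matches the intervals quoted. The function and variable names are hypothetical.

```python
Z95 = 1.96  # standard normal quantile for a two-sided 95% interval

# Table 18: alpha_s for model (4), E. delegatensis.
ALPHA_S = {"2.5-5": 13.8388, "5-10": 7.4963, "10-20": 5.1107, "20+": 2.7861}
# Table 21: slope of the empirical standard error function, E. delegatensis.
SE_S = {"2.5-5": 25.2241, "5-10": 10.9689, "10-20": 8.6652, "20+": 5.8121}


def predict_density(p_hat):
    """Return {size class: (mean, standard error, 95% half-width)} in hollows/ha."""
    x = p_hat ** 4  # power transformation used in model (4)
    result = {}
    for size_class, alpha in ALPHA_S.items():
        mean = alpha * x
        se = SE_S[size_class] * x
        result[size_class] = (mean, se, Z95 * se)
    return result


for size_class, (mean, se, half) in predict_density(0.9).items():
    print(f"{size_class:>6} cm: {mean:5.2f} +/- {half:5.2f} (s.e. {se:.3f})")
```

The lower limits of these intervals are negative, which reflects how uncertain the stand-level predictions are. Whether to truncate such limits at zero is a choice the report does not address.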

The predictor in model (4) is the probability of the presence of hollows from the first-stage model. Model (4) is therefore biologically realistic, as we would expect first-stage predictors related to the crown composition of the stand to also influence the number of hollows per hectare. This is confirmed by the positive relationships detailed in Tables 18 to 20. Parameter estimates detailed in Tables 18 to 20 demonstrate the changing prevalence of the various hollow size classes. The parameter estimates decline as the hollow size increases. Therefore few large hollows were observed relative to small hollows, and this will be preserved in predictions. Also, hollows of all sizes were more prevalent in E. obliqua and mixed forest relative to E. delegatensis forest.

4.5. Validation

SFRI plots from the Tambo and North-east FMAs were excluded from model building for the purposes of model validation. Little merit can be attached to a binary ecological model if it is not assessed for accuracy using independent data (Fielding and Bell 1997). In particular, validation is required to test the performance of the two-stage predictive methodology, and to assess the complicated GLMM model estimation. Models developed and parameterised using SFRI data from the Central, Central Gippsland, and Dandenong FMAs were used to generate predictions for the Tambo and North-east FMAs: the probability of hollows presence at the tree and stand level, the expected size of hollows at the tree level, and the density of hollows (hollows/hectare) at the stand level. The division of SFRI data among the model building and validation datasets is detailed in Tables 25 and 26.

Table 25. Division of data for first-stage models

                 E. delegatensis   E. regnans      E. obliqua      mixed forest
                 Tree    Stand     Tree    Stand   Tree    Stand   Tree    Stand
Model building   5789    300       4768    288     3976    237     4693    333
Validation       1878    129       619     47      1143    84      2495    211
Total            7667    429       5387    335     5119    321     7188    544

Table 26. Division of data for second-stage models

                 E. delegatensis   E. regnans      E. obliqua      mixed forest
                 Tree    Stand     Tree    Stand   Tree    Stand   Tree    Stand
Model building   358     39        112     31      176     29      167     29
Validation       214     20        34      3       72      4       193     26
Total            572     59        146     34      248     33      360     55

Validation of first-stage models.

A useful measure of a logistic model’s predictive discrimination is the Wilcoxon-Mann-Whitney two-sample rank test (Harrell 2000). The test measures the concordance between predicted probabilities and the actual presence/absence data. For the first-stage hollows models the index is the proportion of all pairs of observations, one with a hollow present and one with hollows absent, in which the observation with a hollow present has the higher predicted probability. The Wilcoxon-Mann-Whitney statistic is also equivalent to the area under the “receiver operating characteristic” (ROC) curve (Hanley and McNeil 1982). A test statistic of 0.5 indicates random predictions, and a value of 1 indicates perfect prediction. A test statistic greater than 0.75 demonstrates utility in predicting presence/absence. When the Wilcoxon-Mann-Whitney test is used to examine predictions made to data not included in model building and parameterisation it provides a rigorous evaluation of a model’s predictive utility. Compared to other methods of model validation such as cross-validation and bootstrapping, using an independent validation dataset provides the most rigorous test of a model’s predictive ability. Results for the Wilcoxon-Mann-Whitney test of concordance between predicted probabilities and the actual presence/absence data for the North-east and Tambo FMAs are shown in Table 27.

Table 27. Wilcoxon-Mann-Whitney tests for the first-stage models when applied to the validation dataset.

E. delegatensis     E. regnans          E. obliqua          mixed forest
Tree      Stand     Tree      Stand     Tree      Stand     Tree      Stand
0.9041    0.6939    0.8757    0.6000    0.8265    0.7106    0.8909    0.5937

Predictions for the tree-level models are made using simple models (stand-level predictors were excluded).
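The concordance index reported in Table 27 can be computed directly from predicted probabilities and observed presence/absence. The sketch below is a plain pairwise implementation in Python; the function name and the small example data are hypothetical, and it is not the software used for this report. Tied probabilities are counted as one half, which is the usual convention for the area under the ROC curve.

```python
def concordance_index(probabilities, outcomes):
    """Wilcoxon-Mann-Whitney concordance (area under the ROC curve).

    probabilities: predicted probabilities of hollow presence.
    outcomes: observed values, 1 = hollow present, 0 = hollow absent.
    Returns the proportion of (present, absent) pairs in which the
    present case has the higher predicted probability; ties count 0.5.
    """
    present = [p for p, y in zip(probabilities, outcomes) if y == 1]
    absent = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not present or not absent:
        raise ValueError("need at least one presence and one absence")

    score = 0.0
    for p_present in present:
        for p_absent in absent:
            if p_present > p_absent:
                score += 1.0
            elif p_present == p_absent:
                score += 0.5
    return score / (len(present) * len(absent))


if __name__ == "__main__":
    # Hypothetical predictions for six trees, three with hollows.
    probs = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
    observed = [1, 1, 1, 0, 0, 0]
    print(round(concordance_index(probs, observed), 4))
```

The pairwise loop is quadratic in the number of observations; for datasets the size of the SFRI validation sample a rank-based formula gives the same value more efficiently.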

Test statistics detailed in Table 27 demonstrate that tree-level models are providing a high level of discrimination in predicting the presence or absence of hollows in the North-east and Tambo FMAs. Test statistics between 0.83 and 0.90 for the four forest types provide rigorous confirmation of the utility of tree-level models for predicting the presence and absence of hollows. Results for the stand-level models are less convincing, and demonstrate a poor to moderate discriminatory ability. This is not an unusual result when models are applied to new data and may indicate that predictors and estimated parameters derived from the Central, Central Gippsland, and Dandenong FMAs are not applicable to the North-east and Tambo FMAs. Despite the disappointing performance of stand-level models in general, models for E. delegatensis and E. obliqua exhibit moderate discriminatory power with test values in the vicinity of 0.7. Performance of the stand-level models indicates that model structures need to be re-examined and re-parameterised when applied to all the data combined. Further consideration of their utility as predictive models is also warranted.

Validation of second-stage models.

Second-stage models developed for the Central, Central Gippsland, and Dandenong FMAs were used to predict hollows size and hollows density in the Tambo and North-east FMAs. The correlation between predicted hollows size and actual hollows size was used to evaluate the second-stage tree-level models. Results are shown in Table 28 and Figure 6.

Table 28. Simple correlations between predicted and observed values for the second-stage tree-level models when applied to the validation dataset

E. delegatensis   E. regnans   E. obliqua   mixed forest
0.6210            0.5307       0.4497       0.4741

Predictions for the tree-level models are made using simple models (stand-level predictors were excluded).

[Figure 6: panels for E. delegatensis, E. obliqua, E. regnans, and Mixed forest; x-axis: Predicted Hollow Size (cm); y-axis: Actual Hollow Size (cm).]

Figure 6. Predicted against observed values for the second-stage tree-level model.

Table 28 and Figure 6 demonstrate that models of expected hollow size are performing satisfactorily when applied to the validation dataset. Simple correlations are of moderate strength and graphical correspondence between predicted and observed values is good. There is, however, a tendency to underestimate the larger hollow sizes. This underestimation may be due to the presence of larger hollows in the North-east and Tambo FMAs.
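The two checks described here, a simple correlation between predicted and observed hollow size and a look for systematic underestimation, can be sketched in a few lines of Python. The function names and the example values are hypothetical and are not taken from the SFRI data; the mean bias is one possible way to quantify underestimation and is not a statistic reported in this study.

```python
import math


def pearson_r(predicted, observed):
    """Simple (Pearson) correlation between predicted and observed values."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def mean_bias(predicted, observed):
    """Mean of (predicted - observed); a negative value means underestimation."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(p - o for p, o in zip(predicted, observed)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical hollow sizes in cm, illustrative only.
    predicted_cm = [5.0, 8.0, 10.0, 12.0, 15.0]
    observed_cm = [4.0, 9.0, 11.0, 18.0, 26.0]
    print(pearson_r(predicted_cm, observed_cm))
    print(mean_bias(predicted_cm, observed_cm))
```

Computing the bias separately for the largest observed hollows would show whether underestimation is concentrated in the upper size range, as the figure suggests.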

Predictions from the second-stage stand-level model were compared to the actual hollow densities of the validation dataset. Figures 7-10 show the actual against predicted hollows density for each hollows size class for E. delegatensis and mixed forest. Insufficient samples for validation were available for E. regnans and E. obliqua.

Figures 7-10 demonstrate that second-stage stand-level models are also providing satisfactory predictions of hollows density for the validation data. Similar to the tree-level model, there is some underestimation of very large hollow densities. A re-examination of model form and re-parameterisation for the complete data may improve this apparent underestimation.

[Figures 7 and 8: panels for E. delegatensis and Mixed forest in each figure; x-axis: Predicted Hollows/Ha; y-axis: Actual Hollows/Ha.]

Figures 7 and 8. Predicted against observed hollows per hectare for the 2.5 – 5 cm (top) and 5 – 10 cm (bottom) hollows size classes for the second-stage stand-level models.

[Figures 9 and 10: panels for E. delegatensis and Mixed forest in each figure; x-axis: Predicted Hollows/Ha; y-axis: Actual Hollows/Ha.]

Figures 9 and 10. Predicted against observed hollows per hectare for the 10 – 20 cm (top) and 20 + cm (bottom) hollows size classes for the second-stage stand-level models.

5. Discussion

The statewide forest resource inventory (SFRI) represents a major step forward in the quality and quantity of information available on Victorian state forests. Through the work detailed in this report, information on the incidence of tree hollows collected as part of the SFRI has facilitated the development and parameterisation of models that can guide and inform forest management decision making, particularly with respect to the conservation of hollow-dependent fauna. Previous studies of hollow incidence have been limited by the costs involved in gathering information at the appropriate level of resolution and with a sufficient regional sample. The SFRI has overcome this limitation by simultaneously collecting information on both ecological and timber resources, yielding a state-wide dataset with appropriate resolution at the tree and stand-level. The primary intention of the SFRI is to provide accurate total and product volume estimates for standing timber at a strategic level of forestry planning. This was achieved using an efficient model-based sampling scheme that emphasised the representation of individual trees and stands at the FMA level (DNRE 1999a). The representation of individual trees and stands in the SFRI is appropriate for the development of predictive models applicable to central and eastern Victoria. The SFRI remains the most comprehensive and representative inventory of tree hollows in Australian forests.

SFRI data collection occurred in two stages. In the first-stage, individual-tree information on the size of trees (DBHOB) and the presence or absence of hollows was collected for all SFRI sampling plots. In the second-stage more detailed information was collected for a sub-sample of SFRI sampling plots with several trees being felled and information on the number and size of hollows collected. In light of this two-stage design a two-stage modelling methodology was used to maximise use of the information. First-stage models used the first-stage SFRI sample to develop models of the presence or absence of hollows at the tree- and stand-level. Second-stage models used second-stage SFRI information from felled trees to develop models for the size of hollows at the tree-level, and the density of hollows at the stand-level. In the current project SFRI data for central and eastern Victoria are used. In future, the modelling methodology identified in this report can be applied to SFRI data for western Victoria.

Previously developed models for the incidence of hollows in Victorian forests have used restricted samples. The most extensive of these are the database of Lindenmayer et al. (1993) consisting of 2315 trees located in the central highlands of Victoria, and the database of Bennett et al. (1994) consisting of 1120 trees located on the northern plains of Victoria. The information recorded in these studies is equivalent to the first-stage sample of the SFRI, i.e., the size of individual trees and the presence or absence of hollows. Although both studies identified useful relationships between the incidence of hollows and individual-tree attributes, the developed models had very limited predictive ability. Very few studies have involved the felling of trees for a more detailed examination of hollows size and abundance. This is primarily because of the expense of tree felling and dissection and restrictions on this practice in national parks and state forests (Lindenmayer et al. 2000). The only exceptions to this are the studies of Gibbons (1999) and Mackowski (1987) in southern and northern New South Wales respectively. Because of the expense involved in tree felling, both these studies had very limited sample sizes, thus hindering the usefulness of the developed models (Gibbons 1999). Information collected in these studies is equivalent to the second-stage sample of the SFRI, i.e., trees felled and dissected, and the number and sizes of hollows noted. Clearly the 25,361 trees for which first-stage SFRI information is available, and the 1,326 trees for which second-stage SFRI information is available, are unparalleled as a source of information on the incidence of tree hollows.

Observing the presence or absence of hollows from the ground in the first-stage sample of the SFRI is subject to operator biases. In extreme cases, the operator will be guided more by preconceptions of what a hollow-bearing tree should look like, rather than whether a viable hollow has been sighted. Other operators will be more exact, and will record hollow-bearing trees only when an actual hollow has been sighted. Observing predominantly crown hollows from the ground is also subject to errors because a proportion of sighted hollows will not be viable, i.e., they may be shallow or collapsed (Lindenmayer et al. 2000). Furthermore, hollows are often present high in the crown where they cannot be seen by an observer at ground level (Mackowski 1984). Ground-based observations of hollow incidence also favour particular types of hollows, in particular large crown hollows, and disadvantage smaller branch cavities and cavities forming at the ends of broken limbs (bayonets). Observers were instructed to record only the presence of crown hollows; therefore, fissures and cavities on the lower bole are also not represented. Despite these biases, observing hollows from the ground has been the predominant method of data collection used in previous studies (e.g. Lindenmayer et al. 1993, Bennett et al. 1994, Lindenmayer et al. 2000). It is hoped that the second-stage sample of the SFRI, which offers a more exact indication of hollows size and abundance, can be used to offset the biases affecting the first-stage sample.

The tree-level models developed in this study tended to rely on important relationships between the incidence and size of hollows and tree attributes such as diameter at breast height (DBHOB) and crown characteristics. These relationships are well established in the literature. Tree diameter is a useful proxy for tree age, and given the various stochastic events capable of initiating hollow development (fire damage, limb breakage and external damage from falling timber due to strong winds and snow, lightning strike), the older the tree, the more likely it is that such stochastic events will have occurred (Mackowski 1984, Lindenmayer et al. 1993, Bennett et al. 1994). Furthermore, the older the tree, the longer it is exposed to agents capable of excavating external or internal damage into viable hollows (fungal infection, insect attack, excavation by termites), and the less likely it is to actively respond to wounding with the development of callus tissue (Lindenmayer et al. 1993). This strong relationship between the incidence and size of hollows and tree diameter is well established in other empirical studies (e.g. Lindenmayer et al. 1993, Bennett et al. 1994, Gibbons 1999, Lindenmayer et al. 2000).

In this study individual-tree diameter at breast height overbark (DBHOB) was the most important predictor of the presence or absence of hollows in first-stage tree-level models. The relationships varied for the different forest types. For example, 75% of trees had hollows when DBHOB exceeded 113.9, 118.2, 136.2, and 159.9 cm for E. obliqua, mixed, E. delegatensis, and E. regnans forest respectively. Therefore hollows are present in trees with smaller diameter for E. obliqua and mixed forests relative to E. delegatensis and E. regnans forest. If tree age is a primary determinant of hollows presence or absence, then these differences may be driven by differences in growth rates for the different species. For example, because E. regnans forests are typically faster growing than E. delegatensis forests, a DBHOB of 159.9 cm for E. regnans may represent a similar physiological age to a DBHOB of 136.2 cm for E. delegatensis forest.
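A diameter threshold of this kind can be read off a fitted logistic model by inverting the logit. The Python sketch below assumes a logistic model with DBHOB as the only predictor, which is simpler than the first-stage models in this report, and the intercept and slope are hypothetical values chosen purely for illustration. They are not the parameter estimates from this study.

```python
import math


def logistic(eta):
    """Inverse logit: converts a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def dbhob_at_probability(target, intercept, slope):
    """DBHOB (cm) at which a single-predictor logistic model reaches `target`.

    Solves logit(target) = intercept + slope * DBHOB for DBHOB.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target probability must lie strictly between 0 and 1")
    if slope == 0.0:
        raise ValueError("slope must be non-zero")
    return (math.log(target / (1.0 - target)) - intercept) / slope


# HYPOTHETICAL coefficients, for illustration only (not from this report).
HYPOTHETICAL_INTERCEPT = -5.0
HYPOTHETICAL_SLOPE = 0.05  # change in log-odds per cm of DBHOB

if __name__ == "__main__":
    threshold = dbhob_at_probability(0.75, HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_SLOPE)
    check = logistic(HYPOTHETICAL_INTERCEPT + HYPOTHETICAL_SLOPE * threshold)
    print(f"DBHOB at p = 0.75: {threshold:.1f} cm (check p = {check:.2f})")
```

With additional predictors such as top height or crown form in the model, the threshold would shift according to the values assumed for those predictors.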

In addition to individual-tree diameter, the estimated height to which merchantable timber products can be extracted (top height) was also a highly significant predictor of hollows incidence. A small top height is indicative of a tree with poor form or a tree with a dead or broken top. A small top height may also indicate defects such as hollow pipe, burls or a bent bole, the presence of termite galleries, rotting or fungal infection, and fire scarring on the main bole. It is plausible that the factors causing a tree to have a small top height also influence the incidence of hollows, and the literature supports this assertion. Tree hollows are often formed when decaying heartwood, hollow pipe, termite galleries, or fungal infections are exposed following limb breakage or external damage (Mackowski 1984, Gibbons 1999). Therefore the presence of these bole defects is likely to increase the incidence of hollow formation in the crown. Furthermore, Gibbons (1999) observed a significant relationship between the incidence and depth of hollows and the extent of hollow pipe. Several studies have also observed that the form of the tree is an important determinant of the incidence of hollows (e.g. Lindenmayer et al. 1993, Gibbons 1999). Trees with poor form (and generally smaller top heights) were far more likely to contain hollows than otherwise. Given these observations, it is plausible that trees with small top heights are more prone to hollow formation relative to healthy trees with large sections of clear bole and thus large top heights.

The third most important predictor of hollow incidence at the individual tree-level was the character of the crown. First-stage SFRI sampling classified individual tree crowns into five categories: regrowth, suppressed, highly regular, moderately regular and irregular crown forms (DNRE 1999a). The five categories were used to create indicator variables, and variables indicating regrowth, suppressed, highly regular, and irregular crown forms were selected in the final predictive models. Regrowth crowns were the least likely to contain hollows, followed by highly regular and suppressed crowns. Irregular crowns were most likely to contain hollows. Regrowth and highly regular crowns are characterised by healthy, well-formed crowns supporting rapid growth in the main bole. Such trees will be able to respond to limb breakage and external damage with the development of callus tissue, thus preventing hollow development. Given this rapid response it is unlikely that fungal infection, termite or insect attack will occur. Such trees are also unlikely to have been exposed to fire and other stochastic events such as lightning strikes that may cause hollow formation. Given this it is very unlikely that trees with regrowth and highly regular crown forms will develop hollows. Suppressed crown forms are indicative of a tree under physiological stress. This may be primarily caused by dominant competitors that are restricting access to light, water and nutrients. Physiological stress will reduce the tree's ability to respond to external damage and branch breakage, therefore agents of hollow formation such as fungal infection and insect attack may find an entry point. Suppressed trees are also more likely to be externally damaged due to limb breakage in the dominant over-storey. Irregular crown forms are characteristic of senescent trees or trees subject to severe crown damage, suppression or deformity. All these traits will increase the likelihood that hollows are present. Irregular crown forms are also characteristic of retained habitat trees that may also have been subject to external damage due to operations, regeneration burns or natural fires. Crown form has been cited in previous studies as an important determinant of hollows incidence (Gibbons 1999, Lindenmayer 2000).

Dead trees were found to be far more prone to hollow formation than living trees. This is intuitive as the brittle timber of dead trees is highly susceptible to limb breakage and they lack the mechanisms to respond to hollow-causing agents such as fungal infection, insect attack and excavation by termites. Different species present in the four forest types also showed a differing predisposition to hollow formation. E. dalrympleana, E. cypellocarpa, E. radiata, E. dives, and E. obliqua exhibited a strong predisposition to hollow development. This was particularly the case when these species were present in largely mono-specific stands of E. delegatensis and E. regnans. E. pauciflora and E. viminalis exhibited a moderate predisposition while E. regnans, E. delegatensis, and E. nitens rarely developed hollows. The differing growth forms and growth rates that dictate the form of the tree and its competitive position in the stand can explain these differences. Physiology and morphology will also be important as they determine the extent of branch shedding and the ability to survive fire events, and they influence timber properties that dictate susceptibility to a variety of hollow-causing agents (Lindenmayer et al. 1993, Bennett et al. 1994, Lindenmayer et al. 2000). Particular species may also have a higher incidence of defect in the timber, which will result in more hollows. Sensitivity to fire may be important as species able to survive fire events are very likely to develop hollows (Taylor and Haseler 1993). Site productivity and the landscape position assumed by the various species are also important as species subject to drier conditions (such as the mixed species and E. obliqua forest types) will exhibit poorer growth and form relative to species subject to moist, fertile conditions (such as E. regnans, E. delegatensis and E. nitens).

Despite being significant in some tree-level models, stand-level predictors made negligible contributions to models. Significant stand-level predictors included the crown composition of the stand and the percentage crown cover. Their poor contributions led to the development of more parsimonious simple tree-level models that could be easily applied using tree-level predictors. Often the difference between the full and simple tree-level models was negligible, making the simple models attractive for application. This finding is consistent with previous studies that have found individual-tree attributes to explain most of the variability in hollow incidence (e.g. Lindenmayer et al. 1993, Bennett et al. 1994, DNRE 1999b). Furthermore, the behaviour of stand-level variables was difficult to interpret biologically, with some counter-intuitive parameter estimates.

Simple tree-level models explained from 61% of the variability in hollows incidence for mixed forests up to 73% for E. regnans. These are excellent results and far surpass previous modelling efforts (e.g. Lindenmayer et al. 1993, Bennett et al. 1994). This is particularly impressive in light of the very large sample size and the large geographic region represented in the data. The simple models also performed impressively when applied to completely new data, with the Wilcoxon-Mann-Whitney test of concordance generating values between 0.83 and 0.90. Despite the excellent performance of tree-level models, hollow formation remains a largely stochastic event. Two trees of the same size, top height, crown form and species are not always identical with regard to the presence or absence of hollows. Uncaptured variability could be attributed to the large number of stochastic externalities affecting individual trees such as wounding from other trees, fire effects, and strong winds that may cause branch breakage and crown damage. Despite these considerations, the deterministic first-stage tree-level models are ideal for application at the tree-level.

First-stage models were estimated using generalized linear mixed models (GLMMs). Following the identification of nested and spatial dependence in the SFRI hollows database, a methodology was sought capable of explicitly incorporating this dependence in model estimation. First-stage models for hollow incidence had a binary dependent variable (i.e., a hollow was either present or absent), and a review of the literature indicated that although such models were often affected by nested and/or spatial dependence, the resulting error structures were rarely accounted for. In an attempt to account for a similar error structure, Lindenmayer et al. (1993, 2000) transformed the dependent variable to a continuous variable and applied a linear mixed model. However, this approach failed to preserve the actual distribution of the response. Given this shortcoming, generalisations of the linear mixed model to instances where the dependent variable was binary were investigated, and the generalized linear mixed model (GLMM) was identified. Lindenmayer et al. (1993, 2000) also failed to account for spatial dependence. Adoption of the GLMM, explicitly modelling both spatial and nested dependence, alleviated statistical problems such as biases in hypothesis tests on the estimated parameters, poor estimation efficiency, and an overdispersed dependent variable. The explicit modelling of a hierarchy of spatial and nested dependence represents a methodological advance, and subsequent models were found to perform well during a rigorous validation exercise.

Second-stage tree-level models were calibrated to provide predictions of hollow size. Predictors that explained the incidence of hollows in the first-stage models also tended to explain the observed size of hollows in the second-stage models. Important predictors of hollow size were diameter at breast height and crown variables. It is intuitive that larger trees contain larger hollows, as agents of hollow formation such as fungal infection and insect attack have longer to act, resulting in larger hollows. Trees with moderately regular and suppressed crowns tended to contain larger hollows. A suppressed tree with the same diameter as a dominant tree will be predisposed to hollow development relative to the latter due to physiological stress, poor form, and the increased likelihood of external damage from the dominant overstorey. Moderately regular crowns would contain larger hollows relative to highly regular and regrowth crowns. Regrowth crown forms were found to contain very small hollows. Similar variables were found to influence hollow size in the study of Lindenmayer et al. (2000).

Similar to the first-stage model it was found that some species had a predisposition toward the development of large hollows. These species included E. dalrympleana, E. cypellocarpa, E. pauciflora, and E. obliqua. Because these species were also found to have a predisposition toward hollows incidence in the first-stage model, it is intuitive that they also contain more large hollows. Explanation for this result will be similar to that for their apparent disposition toward hollow formation observed in the first-stage model. In particular, inter-specific differences in their susceptibility to agents of hollow formation such as fungal infection, insect attack and excavation by termites will be important in influencing the resulting size of hollows.

Stand-level predictors related to the crown composition of the stand and crown density were significant in several second-stage tree-level models, but made relatively minor contributions. However, stand-level predictors from GIS for the mean and minimum slope of the stand as well as rainfall made a large contribution to the model for E. obliqua. This may indicate a particular sensitivity in the size of hollows in E. obliqua to position on the slope and possibly the extent of incident rainfall. Rainfall had a negative coefficient, indicating that higher rainfall generally produced smaller hollows. These results may be related to the influence of site productivity on hollow size in E. obliqua forests, i.e., less productive sites with lower rainfall tend to have larger hollows. For E. obliqua forests stand-level variables account for an additional 20% of the variation in hollows size. It is tempting to discard stand-level variables in favour of tree-level variables, as their presence greatly complicates application. However, for E. obliqua these variables should be seriously considered for application as they have biological meaning and are making large contributions to the models.

Second-stage tree-level models explain from 30% of the variation in hollow size for mixed forests up to 49% for E. delegatensis. Simple models including only tree-level predictors performed well when applied to new data in the validation exercise. These results indicate that second-stage tree-level models are viable for the prediction (either stochastic or deterministic) of hollows size from easily measured tree-level variables. The inclusion of stand-level predictors related to slope and rainfall should also be considered for E. obliqua forests.

First- and second-stage tree-level models capture a satisfactory amount of the variability in the incidence and size of tree hollows. This is a surprising result given that previous research has indicated that hollow occurrence is a highly stochastic episodic event, and is intrinsically difficult to model for predictive purposes (e.g. Lindenmayer et al. 1993). Reasons for the predictability of hollows occurrence in this instance may be related to the identification of suitable tree-level predictors, the large sample size, and the broad geographic area represented in the data. The poor performance of stand-level predictors indicates that a majority of the variability in hollows incidence can be explained by tree-level predictors.

Stand-level models were developed in addition to tree-level models to facilitate stand-level forecasting of hollows incidence and density. These models will be particularly useful for generating a statement of hollows incidence and density for central and eastern Victoria.

First-stage models attempted to model the presence or absence of hollows from the stand-level variables. Stand-level predictors were derived from aerial photography interpretation (API) and geographic information systems (GIS). API involved the coding of forest types from aerial photographs according to the principal species, extent of crown cover, extent of irregular, regular, and regrowth crown forms, height class of the stand, and evidence of harvesting (DNRE 1999a). The four forest types were identified using the principal species classifications from API. From this information, indicator variables and continuous covariates describing the extent of crown cover and the extent of the various crown forms were created, as well as a crown form index, which is a continuous covariate constructed to reflect stand maturity. GIS was used to generate stand-level variables describing the aspect, the mean, minimum, and maximum slope, mean annual rainfall, minimum and maximum temperatures, and a categorical variable describing lithology. In addition to these, variables measured on each SFRI plot such as slope, aspect and elevation (and various transforms) were also tested as predictors.

Final first-stage stand-level models explained a modest amount of the variability in hollows incidence, from 21% for E. obliqua up to 42% for mixed forests. The most important predictors in these models were variables describing the crown composition of the stand. The modest amount of explained variability might be indicative of the importance of individual-tree diameter as a predictor of hollows incidence. The modest amount of explained variability in the stand-level model is consistent with previous studies that have found such models to have very limited predictive ability (e.g. Lindenmayer et al. 1993, Bennett et al. 1994, DNRE 1999b). When applied to new data in the validation exercise, stand-level models fared poorly, generating Wilcoxon-Mann-Whitney test statistics in the range of 0.6 to 0.7. These values indicate that the models may lack utility for the prediction of stand-level hollows incidence. However, the validation exercise was rigorous, and the models may still be useful when re-parameterised using all the data combined.

Second-stage stand-level models attempted to model the density of four hollow size classes on stand-level variables. Similar variables to those used in the first-stage model, as well as the predicted probability of hollows incidence from the first-stage model, were tested in the models. Only the predicted probability from the first-stage model explained variability in hollows density, and it was subsequently used alone in models. Models explained between 10% of the variability for the largest hollow size (20+ cm) and 30% for the smallest hollow size (2.5-5 cm). Models of hollow density performed satisfactorily in the validation exercise, tending to predict higher densities of hollows when they were present in the validation dataset. These results indicated the viability of second-stage stand-level models for generating a statement of hollows density for central and eastern Victoria. Second-stage models actually performed better in the validation than first-stage stand-level models, suggesting that the density of hollows is more predictable at the stand-level relative to the incidence of hollows. Indeed, the only practicable application of the first-stage models may be to generate a predicted probability for input into the second-stage model. Confidence intervals for predictions from the second-stage stand-level models were derived graphically, as statistical estimates would be seriously biased. Because of the large amount of variability about predictions of hollows density, 95% confidence intervals were very wide, and their lower bounds consistently fell below zero. Thus, in a statement of hollows density for central and eastern Victoria, the difference between the worst and best case scenarios will be large. Despite this, indication of mean tendencies will act as a useful guide for landscape level management planning.

Given the large difference in performance between the tree- and stand-level models, it would be preferable to apply the tree-level models. However, in the absence of tree-level information for forests, there is no choice but to apply the stand-level models. Possibilities for applying the tree-level model at the stand-level should be explored. These include using a sample of tree-level characteristics that are representative of a forest, and applying an area multiplier to aggregate tree-level information on hollows incidence up to the stand-level. Individual-tree information may also become available with the further development of remote sensing technologies (Brandtberg and Walter 1998, Lefsky et al. 1999). This information could then be used for improved estimates of hollow incidence. In the absence of this information and technology, the stand-level models can be applied to provide a snapshot of hollows incidence in central and eastern Victoria.
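One way the area-multiplier idea might work is sketched below. It assumes that tree-level predicted probabilities are available for every tree on a sample plot of known area, and that the expected number of hollow-bearing trees on the plot is the sum of those probabilities. The function name, the plot area of 0.04 ha and the probabilities are hypothetical and do not describe the SFRI plot design.

```python
def expected_hollow_trees_per_ha(tree_probs, plot_area_ha):
    """Scale tree-level hollow probabilities on one plot to a per-hectare figure.

    tree_probs   -- predicted probability of hollows presence for each tree
    plot_area_ha -- area of the sample plot in hectares (hypothetical here)
    """
    area_multiplier = 1.0 / plot_area_ha      # trees per hectare represented by each plot tree
    expected_on_plot = sum(tree_probs)        # expected count of hollow-bearing trees on the plot
    return expected_on_plot * area_multiplier


# Hypothetical tree-level predictions for a four-tree plot
probs = [0.05, 0.10, 0.60, 0.85]
density = expected_hollow_trees_per_ha(probs, plot_area_ha=0.04)
print(round(density, 1))  # 1.6 expected trees on 0.04 ha -> 40.0 per hectare
```

Summing probabilities rather than counting trees above a cut-off retains the uncertainty in each tree-level prediction, although it still ignores dependence among trees on the same plot.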

The models detailed in this report offer a more assured approach to the integration of conservation priorities for hollow-dependent fauna in the management of timber production forests. Their application at the stand-level will yield a statement of hollow incidence and density in some of Victoria’s most contentious timber production forests. This statement will be useful for testing hypotheses as to the limited availability of hollows in state forests (e.g. Lindenmayer 1995), and the contention surrounding this (e.g. Macfarlane and Loyn 1994, Attiwill 1995). Many benefits can also be gained by integrating improved strategies for retaining hollow-bearing trees with operational practice, and the strong tree-level models offer considerable potential in this regard.

6. Conclusion

This report details tree- and stand-level hollow incidence models for state forests in central and eastern Victoria. Initially, a suitable methodology for the development of hollows models from the hollows component of the statewide forest resource inventory (SFRI) was identified. The intention was to maximise the usefulness of models for informing forest management decision making, so the identified methodology involved the development of models at both the tree- and stand-level. A two-stage methodology was employed at both the tree- and stand-level that in its first stage generated a statement of the probability of hollows presence, and in its second stage estimated the size of hollows at the tree-level and the density of hollows at the stand-level. Following the identification of a suitable methodology, the forest estate was divided into major forest types according to API classifications (forests dominated by E. delegatensis, E. regnans, or E. obliqua, and mixed forests), and further subdivision was made to facilitate model validation. The hierarchical and spatially-explicit nature of SFRI data prompted the search for statistical methodology capable of explicitly modelling a complicated error structure. Subsequently, generalized linear mixed models (GLMMs) were used to estimate first-stage, tree- and stand-level models in the presence of spatial and nested dependence.

The developed first- and second-stage, tree- and stand-level models were statistically and biologically plausible. Validation indicated that the models performed realistically when applied to new data. Tree-level models exhibited superior fitting statistics, and performed better than the stand-level models when applied to the validation dataset. Tree-level models will be useful for improving habitat tree retention guidelines, and for predicting hollows incidence when individual-tree information is available. The predictive capabilities of stand-level models are marginal. Fitting statistics indicated that only a small amount of the total variability in hollows incidence was successfully explained by the models. Validation indicated that first-stage models provided poor to moderate discriminatory ability while second-stage models performed satisfactorily. Despite their relatively poor performance, stand-level models will be useful for stand-level forecasting of hollows incidence. Statements of the density of different sized hollows will be particularly useful, and the identification of empirical confidence intervals will ensure that management decision makers are explicitly aware of the uncertainty associated with predictions.

7. References

Agresti, A., (1990). Categorical data analysis. John Wiley & Sons, New York.

Agresti, A., (1996). An introduction to Categorical data analysis. John Wiley & Sons, New York.

Ambrose, G.J., (1982). An ecological and behavioural study of vertebrates using hollows in eucalypt branches. Unpublished Ph.D. thesis, La Trobe University, Melbourne.

Attiwill, P.M., (1995). Managing Leadbeater’s possum in the mountain ash forests of Victoria, Australia – Reply. For. Ecol. Manag., 74: 233-237.

Ball, I.R., Lindenmayer, D.B., and Possingham, H.P., (1999). A tree hollow dynamics simulation model. For. Ecol. Manag., 123: 179-194.

Bennett, A.F., Lumsden, L.F., and Nicholls, A.O., (1994). Tree hollows as a resource for wildlife in remnant woodlands: spatial and temporal patterns across the northern plains of Victoria, Australia. Pac. Cons. Biol., 1: 222-235.

Brandtberg, T., and Walter, F., (1998). Automated delineation of individual tree crowns in high spatial resolution aerial images by multiple-scale analysis. Machine Vision & Applications, 11: 64-73.

Calver, M.C., (1997). Hollow arguments? Emu, 97: 183-184.

Department of Natural Resources and Environment, (1999a). Victoria’s Statewide Forest Resource Inventory: Benalla/Mansfield, Wangaratta and Wodonga Forest Management Areas. Forests Service Technical Reports 99-2. Department of Natural Resources and Environment, Victoria.

Department of Natural Resources and Environment, (1999b). Tree hollows in the box-ironbark forest: Analysis of ecological data from the box-ironbark timber assessment in the Bendigo Forest Management Area and Pyrenees Ranges. Forests Service Technical Reports 99-3. Department of Natural Resources and Environment, Victoria.

Ferguson, I.S., (1996). Sustainable Forest Management. Oxford University Press, Melbourne, 162pp.

Fielding, A.H., and Bell, J.F., (1997). A review of methods for the assessment of prediction errors on conservation presence/absence models. Environmental Conservation, 24: 38-49.

Florence, R.G., (1996). Ecology and silviculture of eucalypt forests. CSIRO Publishing, Victoria, Australia.

Gibbons, P., (1994). Sustaining key old growth forest characteristics in native forests used for wood production: the case of habitat tree retention. In: T.W. Norton and S.R. Dovers (Eds.). Ecology and Sustainability of Southern Temperate Ecosystems, CSIRO, Melbourne, Victoria.

Gibbons, P., (1999). Habitat-tree retention in wood production forests. Unpublished Ph.D. Thesis, Australian National University. 242pp.

Gibbons, P., and Lindenmayer, D.B., (1996). Issues associated with the retention of hollow-bearing trees within eucalypt forests managed for wood production. For. Ecol. Manag., 83: 245-279.

Gibbons, P., and Lindenmayer, D.B., (1997a). Developing tree retention strategies for hollow-dependent arboreal marsupials in the wood production eucalypt forests of eastern Australia. Aust. For., 60: 29-45.

Gibbons, P., and Lindenmayer, D.B., (1997b). The performance of prescriptions employed for the conservation of hollow-dependent fauna. Implications for the Comprehensive Regional Assessment process. Centre for Resource and Environmental Studies Working Paper. 1997/2. CRES, The Australian National University, Canberra.

Goldstein, H., (1995). Multilevel Statistical Models. London, Arnold.

Haining, R.P., (1990). Spatial data analysis in the social and environmental sciences. Cambridge University Press, Cambridge.

Hanley, J.A., and McNeil, B.J., (1982). The meaning and use of the area under the receiver operating characteristic (ROC) curve. Radiology, 143: 29-36.

Harrell, F. E., (2000). Regression modelling strategies with applications to survival analysis and logistic regression. University of Virginia.

Hosmer, D.W., and Lemeshow, S., (1989). Applied logistic regression. John Wiley & Sons, New York.

Kavanagh, R.P., Shields, J.M., Recher, H.F., and Rohan-Jones, W.G., (1985). Bird populations of a logged and unlogged forest mosaic at Eden, New South Wales. Pp. 237-281 in Birds of eucalypt forests and woodlands: Ecology, conservation and management. Ed. by A. Keast, H.F. Recher, H. Ford, and D. Saunders. Surrey Beatty and Sons, Chipping Norton, Sydney.

Kramer, W., (1980). Finite sample efficiency of ordinary least squares in the linear regression model with autocorrelated errors. J. Am. Stat. Ass., 75: 1005-1009.

Lefsky, M.A., Harding, D., Cohen, W.B., Parker, G., and Shugart, H.H., (1999). Surface Lidar remote sensing of basal area and biomass in deciduous forests of eastern Maryland, USA. Remote Sensing of Environment, 67: 83-98.

Lindenmayer, D.B., (1995). Forest disturbance, forest wildlife conservation and the conservative basis for forest management in the mountain ash forests of Victoria – comment. For. Ecol. Manag., 74: 223-231.

Lindenmayer, D.B., Cunningham, R.B., Pope, M.L., Gibbons, P., and Donnelly, C.F., (2000). Cavity sizes and types in Australian eucalypts from wet and dry forest types – a simple rule of thumb for estimating size and number of cavities. For. Ecol. Manag., 137: 139-150.

Lindenmayer, D.B., and Possingham, H.P., (1996). Ranking conservation and timber management options for Leadbeater’s possum in southeastern Australia using population viability analysis. Cons. Biol., 10: 235-251.

Lindenmayer, D.B., Cunningham, R.B., Donnelly, C.F., Tanton, M.T., and Nix, H.A., (1993). The abundance and development of cavities in eucalyptus trees: a case study in the montane forests of Victoria, southeastern Australia. For. Ecol. Manag., 60: 77-104.

Lindenmayer, D.B., Cunningham, R.B., Nix, H.A., Tanton, M.T., and Smith, A.P., (1991). Predicting the abundance of hollow-bearing trees in montane forest of southeastern Australia. Aust. J. Ecol., 16: 91-98.

Lindenmayer, D.B., Cunningham, R.B., Tanton, M.T., and Smith, A.P., (1990). The conservation of arboreal marsupials in the montane ash forests of the central highlands of Victoria, South-Eastern Australia: II. The loss of trees with hollows and its implications for the conservation of Leadbeater’s possum Gymnobelideus leadbeateri McCoy (Marsupialia: Petauridae). Biol. Cons., 54: 131-145.

Lindenmayer, D.B., and Lacy, R.C., (1995). A simulation study of the impacts of population subdivision on the mountain brushtail possum Trichosurus caninus Ogilby (Phalangeridae: Marsupialia) in South-Eastern Australia. I. Demographic stability and population persistence. Biol. Cons., 73: 119-129.

Lindenmayer, D.B., and Possingham, H.P., (1995). The conservation of arboreal marsupials in the montane ash forests of the central highlands of Victoria, South-Eastern Australia: III. Modelling the persistence of Leadbeater’s possum in response to modified timber harvesting practices. Biol. Cons., 73: 239-257.

Loyn, R.H., (1993). Effects of previous logging on bird populations in East Gippsland, VSP retrospective study. VSP Technical Report No. 18. Department of Conservation and Natural Resources, Melbourne.

Macfarlane, M.A., and Loyn, R.H., (1994). Management for the conservation of Leadbeater’s possum (Gymnobelideus leadbeateri) – a reply. Pac. Conserv. Biol., 1: 84-86.

Mackowski, C.M., (1984). The ontogeny of hollows in Blackbutt, Eucalyptus pilularis and its relevance to the management of forests for possums, gliders and timber. In: A.P. Smith and I.D. Hume (Eds.), Possums and Gliders. Surrey Beatty, Sydney, pp. 517-525.

Mackowski, C.M., (1987). Wildlife hollows and timber management in blackbutt forest. Unpublished M.Sci. Thesis. University of New England, Armidale, Australia.

Mawson, P.R., and Long, J.L., (1994). Size and Age Parameters of Nest Trees Used by Four Species of Parrot and One Species of Cockatoo in South-west Australia. Emu, 94: 149-155.

McCarthy, M.A., Pearce, J.L, and Burgman, M.A., (1994). Use and abuse of wildlife models for determining habitat requirements of forest fauna. Aust. For., 57: 82-85.

Milledge, D.R., Palmer, C.L., and Nelson, J.L., (1991). “Barometers of change”: the distribution of large owls and gliders in mountain ash forests of the Victorian central highlands and their potential as management indicators. Pp. 53-65 in Conservation of Australia’s Forest Fauna. (Ed. Lunney, D.). Royal Zoological Society of NSW, Mosman.

Piepho, H.P., (1999). Analysing disease incidence data from designed experiments by generalized linear mixed models. Plant Pathology, 48: 668-674.

Pitt, D.G., Wagner, R.G., Hall, R.J., King, D.J., Leckie, D.G., and Runesson, U., (1997). Use of remote sensing for forest vegetation management – a problem analysis. Forestry Chronicle, 73: 459-477.

Possingham, H.P., Lindenmayer, D.B., Norton, T.W., and Davies, I., (1994). Metapopulation viability analysis of the greater glider Petauroides volans in a wood production area. Biol. Cons., 70: 227-236.

Rawlings, J.O., (1988). Applied regression analysis. Wadsworth & Brooks/Cole. Belmont, California.

Ripley, B.D., (1987). Stochastic Simulation. John Wiley & Sons, New York.

SAS Institute, Inc., (1996). SAS/STAT® Software: Changes and Enhancements through Release 6.11. SAS Institute Inc., Cary, NC.

Saunders, D.A., Smith, G.T., and Rowley, I., (1982). The availability and dimensions of tree hollows that provide nest sites for cockatoos (Psittaciformes) in Western Australia. Aust. Wildl. Res., 9: 541-556.

Schabenberger, O., and Gregoire, T.G., (1995). A conspectus on estimating function theory and its application to recurrent modelling issues in forest biometry. Silva Fennica, 29: 49-70.

Smith, A.P., (1982). Leadbeater’s possum and its management. In: R.H. Groves and W.D. Ride (Eds.), Species at Risk: Research in Australia. Australian Academy of Science, Canberra, pp. 129-145.

Smith, A.P., and Lindenmayer, D.B., (1988). Tree hollow requirements of Leadbeater’s Possum and other possums and gliders in timber production ash forests of the Victorian Central Highlands. Aust. Wildl. Res. 15: 347-362.

Smith, A.P., and Lindenmayer, D.B., (1992). Forest succession, timber production and conservation of Leadbeater’s possum Gymnobelideus leadbeateri McCoy (Marsupialia: Petauridae). For. Ecol. Manag., 49: 311-332.

Stoneman, L., Rayner, M.E., and Bradshaw, F.J., (1997). Size and Age Parameters of Nest Trees Used by Four Species of Parrot and One Species of Cockatoo in South-west Australia: Critique. Emu, 97: 94-96.

Tanaka, K., (1986). A stochastic model of diameter growth in an even-aged pure forest stand. J. Jpn. For. Soc., 68: 226-236.

Tanaka, K., (1988). A stochastic model of height growth in an even-aged pure forest stand - Why is the coefficient of variation of the height distribution smaller than that of the diameter distribution. J. Jpn. For. Soc., 70: 20-29.

Taylor, R.J., and Haseler, M., (1993). Occurrence of potential nest trees and their use by birds in sclerophyll forest in north-east Tasmania. Aust. For., 56: 165-171.

Vanclay, J.K., and Skovsgaard, J.P., (1997). Evaluating forest growth models. Ecol. Mod., 98: 1-12.

Victorian Government, (1988). Flora and Fauna Guarantee Act 1988. Acts 1988 No. 47, Victorian Government Printing Office, Melbourne.

West, P.W., Davis, A.W., and Ratkowsky, D.A., (1986). Approaches to regression analysis with multiple measurements from individual sampling units. J. Statist. Comput. Simul., 26: 149-175.

West, P.W., Ratkowsky, D.A., and Davis, A.W., (1984). Problems of hypothesis testing of regressions with multiple measurements from individual sampling units. For. Ecol. Manag., 7: 207-224.

8.1. Appendix A. Background to the generalized linear mixed model (GLMM)

Introduction

Ecological modelling has tended to focus on the deterministic prediction of mean trends in ecological systems. Characterisation of the stochastic components of ecological systems is attempted less often, and stochastic components in the spatial and temporal domains have been particularly neglected. This is especially the case for models with binary responses, such as those used to model the incidence of tree hollows. These models pose the greatest challenge for incorporating stochastic structures, and their estimation remains an active area of research in mathematics and statistics (e.g. McGilchrist 1994, Booth and Hobert 1998).

It is likely that several stochastic structures are affecting hollows models, including spatial and nested stochastic structures. Spatial stochastic structure (also called spatial autocorrelation or spatial dependence) eventuates when spatial influences on the incidence of hollows are not fully captured in models. Nested stochastic structure eventuates when measurements are combined across sampling units and differences between the sampling units are not captured in the model. This brief review examines spatial and nested stochastic structure, and their effect on generalized linear model (GLM) estimation, the predominant methodology for modelling a binary response. A methodology capable of incorporating stochastic structure in the estimation of hollow incidence models is introduced.

Categorising the stochastic structures affecting hollow incidence models

Nested stochastic structure

Nested stochastic structure eventuates because multiple measurements are taken on each SFRI plot, and measurements are combined across plots. If differences among the plots cannot be fully incorporated in predictive models, then model residuals on the same sampling unit are more similar than average (West et al. 1984). This is a violation of the generalized linear model (GLM) assumption of independent residuals, and can threaten the estimation properties of GLMs.

Spatial stochastic structure

Spatial stochastic structure results from spatial variation in predictors that are not included in the model. Such predictors may include those describing environmental conditions that influence the incidence of hollows. When such predictors are omitted, model residuals in spatial proximity are more similar than average, and positive spatial autocorrelation is generated. When spatial autocorrelation is exhibited by residuals, the GLM assumption of independent residuals is violated.

The generalized linear model (GLM)

Many predictive ecological models have a binary dependent variable (two possible outcomes; success or failure). Given this it is reasonable to assume that the observed number of successes follows a binomial distribution (Agresti 1996). The binomial probability of success can be modelled using a logit link which is a transformation of the conditional expectation of the model. A model with a logit link and a response following the binomial distribution is termed the logistic regression model and falls under the class of statistical models known as the generalized linear models (GLM):

$$
\operatorname{logit}[\pi(x)] = \log\left(\frac{\pi(x)}{1-\pi(x)}\right) = X\beta + \varepsilon
$$

where π(x) is the probability of success given the predictors x, Xβ is the linear predictor of the logistic model, formed from the design matrix (X) and the parameter estimates (β), and ε is the error.
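To make the logit link concrete, the short sketch below converts a linear predictor into a probability and back. The intercept, the slope and the use of tree diameter as the single predictor are hypothetical values chosen for illustration; they are not coefficients from the hollows models.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Probability implied by a linear predictor eta on the logit scale."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients: intercept and slope on tree diameter (cm)
beta0, beta1 = -4.0, 0.05
diameter_cm = 80.0

eta = beta0 + beta1 * diameter_cm   # linear predictor, here 0 on the logit scale
prob = inv_logit(eta)               # probability of hollows presence, here 0.5
print(round(prob, 3))
```

Because the model is linear on the logit scale, a fixed increase in a predictor changes the log-odds by a constant amount, while the change in probability depends on where on the curve the observation sits.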

The generalized linear mixed model (GLMM).

The logistic model detailed above is subject to several assumptions, namely that linearity is preserved on the logit scale, and that errors are independent (assumptions of normally distributed and homogeneous errors are relaxed for the GLM). In many predictive ecological models it is likely that the assumption of independent errors is violated due to hierarchical data collection schemes and/or the likely existence of spatial or temporal dependencies. In light of these violations, the generalized linear mixed model (GLMM) is adopted as a methodology capable of handling dependent errors.

The GLMM for a model with binomial response and logit link can be described as:

$$
\operatorname{logit}[\pi(x)] = \log\left(\frac{\pi(x)}{1-\pi(x)}\right) = X\beta + Zb + \varepsilon
$$

where Zb is the random effects component of the GLMM, formed from the design matrix (Z) and the random parameter estimates (b).

Benefits from explicitly modelling dependence in GLMM model estimation

Biased inference

The GLM assumption of independent residuals is violated whenever stochastic structure exists, resulting in biased estimates of standard errors of parameter estimates (Johnston 1972). Inference on parameter estimates is then also biased. If a positive autocorrelation is present among residuals located on the same sampling unit or in spatial proximity, then hypothesis tests on the significance of parameters will be biased upwards and the type I error rate will be inflated, i.e. it will be concluded too often that a parameter is different from zero. Inferences on the parameters and the regression are particularly important when model building. For example, when positive autocorrelation is present there will be a tendency to include variables that may, in fact, not be significant in the model (Griffith and Amrhein 1997). Adoption of the GLMM will alleviate biased inference and the associated problems encountered during model building.

Inefficient estimation

A second problem associated with the presence of stochastic structure is inefficient estimation (Kramer 1980). When stochastic structure is present each measurement is dependent to some extent on other measurements. The information provided by each individual measurement is therefore reduced, as one measurement can be partly predicted from other measurements. This results in inefficient use of the data when GLM is applied. If positive autocorrelation is present, then it will take more measurements to achieve the same confidence interval on predictions using GLM relative to when stochastic structure is incorporated in the model using GLMM (Grondona and Cressie 1991, Cressie and Hartfield 1996). When stochastic structure is incorporated, estimation efficiency is improved, as each measurement is bringing information to the model, independent of other measurements. Efficiency considerations are important in light of the cost of data collection.
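The loss of information described above can be illustrated with the standard design-effect approximation for clustered samples. This approximation is not taken from the report: it assumes n measurements arranged in clusters of equal size m with a common intra-cluster correlation ρ, and it applies strictly to the estimation of a mean, so it is offered only as a rough guide.

$$
n_{\mathrm{eff}} = \frac{n}{1 + (m - 1)\rho}
$$

For example, 200 trees measured as 10 trees on each of 20 plots with ρ = 0.2 would give a design effect of 1 + 9 × 0.2 = 2.8, so the data would carry about as much information as 200 / 2.8 ≈ 71 independent measurements.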

Overdispersed binary response

An additional advantage of estimating models using a generalized linear mixed model (GLMM) is that it alleviates the problem of overdispersion. Overdispersion occurs when the variance of the response exceeds the nominal variance and is caused by the presence of autocorrelation or failure to identify the correct form for the response (McCullagh and Nelder 1989, Agresti 1996). Overdispersion is a serious problem affecting GLMs, and adoption of the GLMM is a way of relaxing the independence assumption for overdispersed data (Piepho 1999). If we were to ignore the nested and spatial autocorrelation affecting hollows models and estimate the models using GLM, we could expect a greater than binomial variation (Goldstein 1995). Adoption of the GLMM with a random component explicitly modelling nested and spatial autocorrelation allows binomial variance to be preserved. Therefore, adoption of the GLMM improves the theoretical basis of model estimation, ensuring that binomial variance is preserved and thus overdispersion avoided.
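A common diagnostic for greater than binomial variation is the Pearson chi-square statistic divided by its residual degrees of freedom, with values well above 1 suggesting overdispersion. The sketch below assumes plot-level counts of hollow-bearing trees and an intercept-only fit; the function name and the data are hypothetical and are not SFRI measurements.

```python
def pearson_dispersion(successes, trials, fitted_probs, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    successes    -- number of hollow-bearing trees on each plot
    trials       -- number of trees assessed on each plot
    fitted_probs -- fitted probability of hollows presence for each plot
    n_params     -- number of estimated parameters in the fitted model
    """
    chi_square = 0.0
    for y, n, p in zip(successes, trials, fitted_probs):
        binomial_variance = n * p * (1.0 - p)
        chi_square += (y - n * p) ** 2 / binomial_variance
    return chi_square / (len(successes) - n_params)


# Hypothetical plots: counts cluster near 0 or 10, as happens when trees on a plot are alike
hollow_trees = [0, 9, 1, 10, 2]
trees = [10, 10, 10, 10, 10]
fitted = [sum(hollow_trees) / sum(trees)] * len(trees)   # intercept-only fit, 22/50 = 0.44

phi = pearson_dispersion(hollow_trees, trees, fitted, n_params=1)
print(round(phi, 2))  # far above 1, indicating overdispersion
```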

Fitting GLMMs using the GLIMMIX Macro and SAS Proc Mixed

As described above, the generalized linear mixed model (GLMM) is an extension to the generalized linear model (GLM) with the addition of random effects to linear predictors. The addition of random effects makes GLMM useful for incorporating autocorrelation in model estimation. However, this addition greatly complicates model estimation. Estimating GLMMs using maximum likelihood methods is untenable, requiring multidimensional numerical integration (McCulloch 1997) [this may change in the future with advances in computing power (Piepho 1999)]. Thus alternative approximations have been sought (Witte et al. 1998). Computationally convenient approximations of the exact likelihood include marginal quasi-likelihood (MQL), and penalized quasi-likelihood (PQL) (Breslow and Clayton 1993, Wolfinger and O’Connell 1993). Another alternative is to use the Bayesian approach (Gilks et al. 1994, Tenhave and Localio 1999). Various software exists for estimating GLMMs using the MQL and PQL approximations and these have been reviewed by Zhou et al. (1999).

The GLIMMIX macro extension to SAS Proc Mixed (referred to simply as GLIMMIX) offers a convenient method of estimating GLMMs (Wolfinger and O’Connell 1993). GLIMMIX incorporates several appealing features of Proc Mixed (Littell et al. 1996); in particular, it allows definition of the random model component using spatial and temporal covariance functions (it is the only software known to the authors that allows this in a GLMM context). GLIMMIX implements the PQL approximation, and was shown to perform well in the simulation study of Zhou et al. (1999). Researchers have demonstrated that the PQL approximation suffers when the number of observations per random effect is small (Breslow and Lin 1995), and that it can yield biased estimates of variance components under certain conditions (Breslow and Lin 1995, Engel 1998). However, these problems are considered less severe than the biased estimate of the linear predictor associated with MQL under certain conditions (Rodriguez and Goldman 1995, Goldstein 1995, Goldstein and Rasbash 1996). The GLIMMIX macro implements a PQL approximation by iteratively calling Proc Mixed until convergence. Proc Mixed is designed for linear mixed models, and the GLIMMIX macro generalises it to cases of non-continuous or non-normal responses. Documentation and a copy of the macro can be obtained at http://www.sas.com/techsup/download/stat.
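The iterative scheme can be sketched with the standard linearisation that underlies pseudo-likelihood and PQL fitting for a binary response with a logit link. This is a generic outline under that assumption, not a transcription of the macro's internals, and the iteration index k is introduced here for illustration. At iteration k, the current estimates give a linear predictor and fitted probability, from which a working response z and weights w are formed:

$$
\eta^{(k)} = X\hat{\beta}^{(k)} + Z\hat{b}^{(k)}, \qquad
\mu^{(k)} = \frac{1}{1 + \exp(-\eta^{(k)})}
$$

$$
z^{(k)} = \eta^{(k)} + \frac{y - \mu^{(k)}}{\mu^{(k)}\,(1 - \mu^{(k)})}, \qquad
w^{(k)} = \mu^{(k)}\,(1 - \mu^{(k)})
$$

A weighted linear mixed model is then fitted to the working response, yielding updated estimates of β and b, and the two steps are repeated until the estimates stabilise.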

Predictions from a GLMM at the expected value of the random effects

The GLMM will yield the best possible estimation of the linear predictor in the presence of autocorrelation. The actual autocorrelation modelled using a GLMM is of secondary importance; its inclusion in the GLMM addresses biases in confidence intervals and in hypothesis tests on the estimated parameters, improves efficiency, and prevents overdispersion in the response. Predictions from a GLMM estimated model can be made at the expected value of the random effects over the entire population. Since the expected values of random effects are zero, predictions can be made on the basis of the linear predictor alone, and the random effects can be effectively ignored (Afshartous 2000, Goldstein 1995). Predictions at the expected values of the random effects allow the GLMM estimated model to be applied outside of the model building dataset. Furthermore, because parameter estimates from GLM are unbiased in the presence of spatial autocorrelation, it should not be expected that parameter estimates from GLMM will provide improved predictions. Indeed, parameter estimates from GLM and GLMM should be extremely similar, and therefore the accuracy of predictions will be almost identical. However, statements of prediction precision from GLM are biased in the presence of autocorrelation, and these considerations are important when evaluating the uncertainty associated with model predictions.
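Written out, and using x_0 and z_0 as illustrative symbols for the fixed- and random-effects design rows of a new observation, setting the random effects to their expected value of zero leaves a prediction that depends on the fixed effects alone:

$$
\operatorname{logit}[\hat{\pi}(x_0)] = x_0^{\top}\hat{\beta} + z_0^{\top}\,\mathrm{E}[b] = x_0^{\top}\hat{\beta},
\qquad
\hat{\pi}(x_0) = \frac{1}{1 + \exp(-x_0^{\top}\hat{\beta})}
$$

This is why no plot-specific or location-specific random effect needs to be known for a forest outside the model building dataset.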

Conclusion

Developers of predictive ecological models need to consider and if possible include stochastic structures in models. In particular, many ecological models will be affected by spatial and nested autocorrelation. Given this, GLIMMIX provides the capability for explicitly incorporating stochastic structure in a GLMM model estimation. For the hollows models it is an appropriate methodology for incorporating complex nested and spatial structures in estimation thus alleviating the problems associated with biases in hypothesis tests on the estimated parameters, poor estimation efficiency, and an overdispersed dependent variable.

References

Afshartous, D., (2000). Prediction in multilevel models. Unpublished Ph.D. Dissertation. The University of California.

Agresti, A., (1996). An introduction to Categorical data analysis. John Wiley & Sons, New York.

Booth, J.G., and Hobert, J.P., (1998). Standard errors of prediction in generalized linear mixed models. J. Am. Stat. Ass., 93: 262-272.

Breslow, N.E., and Clayton, D.G., (1993). Approximate inference in generalized linear mixed models. J. Am. Stat. Ass., 88: 9-25.

Breslow, N.E., and Lin, X., (1995). Bias correction in generalized linear mixed models with a single component of dispersion. Biometrika, 82: 81-91.

Cordy, C., and Griffith, D., (1993). Efficiency of least squares estimators in the presence of spatial autocorrelation. Communications in Statistics, 22 B: 1161-1179.

Cressie, N., and Hartfield, M.N., (1996). Conditionally specified Gaussian models for spatial statistical analysis of field trials. J. Agric. Biol. Env. Statist., 1: 60-77.

Engel, B., (1998). A simple illustration of the failure of PQL, IRREML, and APHL as approximate ML methods for mixed models for binary data. Biometrical Journal, 40: 141-154.

Gilks, W.R., Thomas, A., and Spiegelhalter, D.J., (1994). A language and program for complex Bayesian modelling. Statistician, 43: 169-178.

Goldstein, H., (1995). Multilevel Statistical Models. London, Arnold.

Goldstein, H., and Rasbash, J., (1996). Improved approximations for multilevel models with binary responses. J. R. Statist. Soc. A, 159: 505-513.

Gregoire, T.G., Schabenberger, O., and Barrett, J.P., (1995). Linear modelling of irregularly spaced, unbalanced, longitudinal data from permanent-plot measurements. Can. J. For. Res., 25: 137-156.

Griffith, D.A., and Amrhein, C.G., (1997). Multivariate statistical analysis for geographers. Prentice-Hall, Upper Saddle River, New Jersey.

Grondona, M.O., and Cressie, N., (1991). Using spatial considerations in the analysis of experiments. Technometrics, 33: 381-392.

Gumpertz, M.L., Wu, C., and Pye, J.M., (2000). Logistic regression for Southern Pine Beetle Outbreaks with Spatial and Temporal Autocorrelation. For. Sci., 46: 95-107.

Johnston, J., (1972). Econometric Methods. Second Edition. McGraw-Hill, London.

Kramer, W., (1980). Finite sample efficiency of ordinary least squares in the linear regression model with autocorrelated errors. J. Am. Stat. Ass., 75: 1005-1009.

Kramer, W., and Donninger, C., (1987). Spatial autocorrelation among errors and the relative efficiency of OLS in the linear regression model. J. Am. Stat. Ass., 82: 577-579.

Kung, F.H., and Yang, Y.C., (1983). Autoregression analysis of diameter growth in black walnut trees. In: Bell, J.F., Atterbury, T. (Eds.), Renewable resource inventories for monitoring changes and trends; Proceedings of the international conference. Corvallis, Oregon, USA. pp.235-239.

Littell, R.C., Milliken, G.A., Stroup, W.W., and Wolfinger, R.D., (1996). SAS system for mixed models. North Carolina, SAS Inst. Inc.

Martin, R.L., (1974). On autocorrelation, bias and the use of first spatial differences in regression analysis. Area, 6: 185-194.

McCullagh, P., and Nelder, J.A., (1989). Generalized linear models, 2nd edition. London, Chapman & Hall.

McGilchrist, C.A., (1994). Estimation in generalized mixed models. J. R. Stat. Soc. B., 56: 61-69.

Piepho, H.P., (1999). Analysing disease incidence data from designed experiments by generalized linear mixed models. Plant Pathology, 48: 668-674.

Rodriguez, G., and Goldman, N., (1995). An assessment of estimation procedures for multilevel models with binary responses. J. R. Statist. Soc. A, 158: 73-89.

Schabenberger, O., and Gregoire, T.G., (1995). A conspectus on estimating function theory and its application to recurrent modelling issues in forest biometry. Silva Fennica, 29: 49-70.

Tenhave, T.R., and Localio, A.R., (1999). Empirical Bayes estimation of random effects parameters in mixed effects logistic regression models. Biometrics, 55: 1022-1029.

West, P.W., Ratkowsky, D.A., and Davis, A.W., (1984). Problems of hypothesis testing of regressions with multiple measurements from individual sampling units. For. Ecol. Manag., 7: 207-224.

West, P.W., Davis, A.W., and Ratkowsky, D.A., (1986). Approaches to regression analysis with multiple measurements from individual sampling units. J. Statist. Comput. Simul., 26: 149-175.

Witte, J.S., Greenland, S., and Kim, L.L., (1998). Software for hierarchical modelling of epidemiologic data. Epidemiology, 9: 563-566.

Wolfinger, R., and O’Connell, M., (1993). Generalized linear mixed models: a pseudo-likelihood approach. J. Statist. Comput. Simul., 48: 233-243.

Zhou, X. H., Perkins, A.J., and Hui, S.L., (1999). Comparisons of Software Packages for Generalized Linear Multilevel Models. The American Statistician, 53: 282-290.

8.2. Appendix B. GLMM estimation of first-stage hollow incidence models

Part I: Modelling spatial dependence in the first-stage stand-level model

Spatial dependence can be defined as the tendency for two plots located in spatial proximity to behave more similarly with respect to the incidence of hollows than plots located further apart. This effect is intuitive because two plots located in proximity will tend to be located in the same or similar forest. A GLMM was used to incorporate spatial dependence in model estimation, and several alternative approaches, including the population-averaged and subject-specific methods, were evaluated.

The population-averaged and subject-specific approaches to modelling spatial dependence

Spatial dependence can be incorporated in model estimation using either the population-averaged approach or the subject-specific approach (Zeger et al. 1988). The population-averaged approach models the average behaviour of the population using a continuous covariance function and one or two unknown random parameters. In contrast, the subject-specific approach allows one or more parameters to vary randomly among subjects, thus facilitating a more subject-specific characterisation of spatial dependence (Jones 1990, Lindstrom and Bates 1990). The effectiveness of the two methodologies depends on how well the spatial dependence can be generalised across the entire data space, and both methods need to be explored to identify an optimal characterisation of spatial dependence (Jones 1990). For example, if spatial dependence is consistent across the study area, then the population-averaged approach would be appropriate, and a spatial covariance function can be selected to specify the dispersion matrix in model estimation. The appropriateness of the population-averaged approach also depends on spatial stationarity, which requires that spatial dependence is orientation and location invariant, i.e., the magnitude of spatial dependence is independent of the orientation and location of spatial interaction. If the effect is found to be inconsistent across the study area, or the assumption of spatial stationarity is violated, then a subject-specific approach may be more appropriate. For the subject-specific approach the dispersion matrix is defined using a block-diagonal structure that differentiates the subjects, with an unknown random parameter to be estimated for each subject. Both approaches will be explored in attempts to incorporate spatial dependence in model estimation. Before this can be done we need to characterise the spatial dependence affecting stand-level models for hollows incidence.

Characterising spatial dependence

To characterise spatial dependence the correlogram is used, which is constructed as a correlation estimator (in this instance Moran’s I (Moran 1950, Cliff and Ord 1981)) plotted against distance. Figure A1 shows spatial correlograms for hollows incidence for each of the four forest types. These graphs depict the magnitude of spatial dependence and how it changes as the distance between plots increases. The S+ spatial stats module (Kaluzny et al. 1998) and the trellis function were used to generate graphics.
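The correlogram construction described above can be sketched in a few lines. The following Python example is a minimal illustration and not the S+ code used in this study; the function name, the equal-width distance classes and the 0/1 weighting of plot pairs within each class are assumptions made for the example.

```python
import math


def morans_i_by_distance(coords, values, bin_width, n_bins):
    """Moran's I for each distance class [k*bin_width, (k+1)*bin_width).

    coords : list of (x, y) plot coordinates (assumed to be in metres)
    values : list of observations, e.g. 0/1 hollow incidence per plot
    Returns a list with one value per class (None if a class has no pairs).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)  # assumes the values are not all identical

    cross = [0.0] * n_bins  # sum of dev_i * dev_j over unordered pairs in the class
    pairs = [0] * n_bins    # number of unordered pairs in the class
    for i in range(n):
        for j in range(i + 1, n):
            k = int(math.dist(coords[i], coords[j]) // bin_width)
            if k < n_bins:
                cross[k] += dev[i] * dev[j]
                pairs[k] += 1

    # With symmetric 0/1 weights, I = (n / W) * sum_ij(w_ij dev_i dev_j) / denom,
    # where W = 2 * pairs and the double sum equals 2 * cross.
    return [
        (n / pairs[k]) * cross[k] / denom if pairs[k] else None
        for k in range(n_bins)
    ]
```

Plotting the returned values against the midpoints of the distance classes gives a correlogram of the kind shown in Figure A1.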

Hollows incidence is spatially dependent over short spatial distances for E. delegatensis and mixed species forest. Correlograms for E. regnans and E. obliqua indicate an absence of, or only weak, spatial dependence. The next step is to examine residuals from stand-level hollow incidence models to determine if they are also affected by spatial dependence. Examination of residual spatial dependence will also indicate how effective the deterministic model structure is in explaining spatial variation in stand-level hollow incidence. Residual correlograms for the four forest types are shown in Figure A2.

Residual correlograms for E. delegatensis, E. regnans, and E. obliqua indicate that residuals are spatially independent. Note that the dependence observed for E. delegatensis in Figure A1 has been removed by the deterministic model structure. Thus explanatory variables are successfully explaining spatial variation. However, dependence observed for mixed species forest in Figure A1 is also present in Figure A2. This indicates that the deterministic structure for the mixed species model is failing to characterise spatial dependence. Instead, the dependence is escaping the model and appearing in residuals, where its presence threatens the GLM assumption of independent residuals.

Given the strength of spatial dependence observed for the mixed species model it is appropriate to explore methodologies for explicitly incorporating this dependence in model estimation. Initially, application of the population-averaged approach will be explored.

[Figure A1: four panels (Eucalyptus delegatensis, Eucalyptus regnans, Eucalyptus obliqua, Mixed) plotting Moran's I against Distance (m), 0 to 100,000 m.]

Figure A1. Spatial correlograms for the incidence of hollows at the stand-level.

[Figure A2: four panels (Eucalyptus delegatensis, Eucalyptus regnans, Eucalyptus obliqua, Mixed) plotting Moran's I of model residuals against Distance (m), 0 to 100,000 m.]

Figure A2. Residual spatial correlograms from the first-stage stand-level model of hollow incidence.

The population-averaged approach to modelling spatial dependence

The population-averaged approach uses a spatial covariance function to directly specify the dispersion matrix. Initially we need to determine if the structure observed in Figure A2 for the mixed species model can be generalised across the data space. It is also important that the assumption of spatial stationarity be tested. For a more detailed examination of the spatial dependence affecting model residuals, the semivariogram will be used in addition to correlograms. The semivariogram does not filter for changing spatial means and variances and should be examined to provide a complete exposition of the spatial variation affecting the data (Isaaks and Srivastava 1989, Rossi et al. 1992). Figure A3 shows the residual correlogram and residual semivariogram for the mixed species model.

The correlogram and semivariogram in Figure A3 show very different patterns of spatial dependence. The two graphs should mirror each other, with correlation decreasing when semivariance increases (Isaaks and Srivastava 1989, Haining 1990). This holds true for inter-plot distances of less than 20,000 m, with the correlation declining and the semivariance increasing. These patterns indicate falling magnitudes of spatial dependence with increasing inter-plot distance. For inter-plot distances of greater than 20,000 m different patterns emerge. The correlation fluctuates about zero while the semivariance declines toward its original values. Declining semivariance indicates increasing magnitudes of spatial dependence with increasing distance, a counter-intuitive result contrary to what is usually observed in the literature (e.g. Haining 1990). The different patterns of spatial dependence observed here confirm the need to examine both correlograms and semivariograms in a characterisation of spatial dependence. The complicated pattern of spatial dependence observed in the semivariogram may not be conducive to characterisation using spatial covariance functions, which tend to assume that spatial dependence declines asymptotically toward zero and is absent beyond a certain inter-plot distance. The semivariogram indicates that spatial dependence initially follows this pattern, but begins to increase again for large inter-plot distances.
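For readers unfamiliar with the semivariogram, the sketch below shows one common way of estimating it from model residuals: half the mean squared difference between residuals for all plot pairs in a distance class. It is a minimal Python illustration under stated assumptions (equal-width distance classes, the classical estimator, hypothetical function and argument names) and is not the S+ routine used to produce Figure A3.

```python
import math


def empirical_semivariogram(coords, residuals, bin_width, n_bins):
    """Classical semivariance estimate for each distance class.

    gamma(h) = sum over pairs in class h of (r_i - r_j)^2 / (2 * N(h)),
    where N(h) is the number of plot pairs in the class.
    Returns None for classes that contain no pairs.
    """
    sq_diff = [0.0] * n_bins
    pairs = [0] * n_bins
    n = len(residuals)
    for i in range(n):
        for j in range(i + 1, n):
            k = int(math.dist(coords[i], coords[j]) // bin_width)
            if k < n_bins:
                sq_diff[k] += (residuals[i] - residuals[j]) ** 2
                pairs[k] += 1
    return [
        sq_diff[k] / (2 * pairs[k]) if pairs[k] else None
        for k in range(n_bins)
    ]
```

Under this estimator, low semivariance at a given distance means that residuals that far apart are similar, which is why a semivariance that falls at large distances reads as increasing spatial dependence.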

The validity of the population-averaged approach also hinges on the assumption of spatial stationarity. If spatial stationarity is violated, then spatial dependence cannot be characterised using a spatial covariance function (Cressie 1991). Spatial stationarity assumes that semivariance and correlation are orientation invariant. Invariance to orientation can be tested using directional semivariograms and correlograms. Directional semivariograms are used here as they proved the more informative structure. These are constructed by dividing inter-plot joins into directional classes, and creating a different semivariogram for each directional class. Eight directional classes were used (0-22.5 degrees, 22.5-45 degrees, etc.). Anisotropic semivariograms are shown in Figure A4.
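The division of inter-plot joins into directional classes can be sketched as follows. This Python example is illustrative only. The report does not state the angular convention used, so the bearing measured clockwise from the y axis (north) and the folding of opposite directions into the range 0 to 180 degrees are assumptions, as is the function name.

```python
import math


def direction_class(p, q, n_classes=8):
    """Index (0 .. n_classes - 1) of the direction class for the join p-q.

    With n_classes = 8 each class is 22.5 degrees wide: class 0 covers
    0-22.5 degrees, class 1 covers 22.5-45 degrees, and so on.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    # Bearing clockwise from north; a join has no sense of direction,
    # so bearings that differ by 180 degrees fall in the same class.
    bearing = math.degrees(math.atan2(dx, dy)) % 180.0
    width = 180.0 / n_classes
    return int(bearing // width) % n_classes
```

Plot pairs grouped by this index can then be passed, class by class, to a semivariogram estimator to give one semivariogram per direction.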

Patterns of spatial dependence portrayed in Figure A4 are variable across the eight direction classes. Several direction classes indicate an absence of spatial dependence (112.5-135 and 135-157.5), others indicate the counter-intuitive result of falling semivariance with increasing inter-plot distance (0-22.5, 22.5-45), while others convey complicated patterns with semivariance initially rising, reaching a maximum, and tending to fall for larger inter-plot distances (45-67.5, 67.5-90, 90-112.5). Directional semivariograms confirm that spatial stationarity is violated. This violation indicates that directly modelling spatial dependence using a spatial covariance function may be inappropriate. Despite this violation, defining the dispersion matrix using a spatial covariance function was tested as a method for incorporating spatial dependence in estimation of the mixed species hollows incidence model.

[Figure A3: two panels plotting Moran's I (left) and Semivariance (right) against Distance (m), 0 to 100,000 m.]

Figure A3. Residual correlogram (left) and residual semivariogram (right) for the first-stage stand-level model for mixed forests.

[Figure A4: eight panels, one per direction class (0, 22.5, 45, 67.5, 90, 112.5, 135, 157.5), plotting Semivariance against Distance (meters), 0 to 100,000.]

Figure A4. Anisotropic residual semivariograms for the first-stage stand-level model for mixed forests.

Fitting the population-averaged approach using spatial covariance functions and the GLIMMIX Macro

Proc Mixed within SAS offers a diverse suite of structures for the random model component (Wolfinger et al. 1994). These include definitions of the dispersion matrix for explicitly incorporating spatial, temporal, and nested dependencies in model estimation. Spatial covariance functions use a function of the spatial distance between sampling units to define the off-diagonal elements of the dispersion matrix. A number of spatial covariance functions are offered including the Gaussian, power, spherical, and exponential functions. Spatial covariance functions include several unknown parameters that control the final form of the function. These parameters along with those associated with the linear predictor are estimated in Proc Mixed using maximum likelihood methods (SAS Institute Inc. 1996). Useful expositions on the use of Proc Mixed for fitting population-averaged spatial covariance functions exist (e.g. Marx and Stroup 1993, Brownie and Gumpertz 1997). Fitting a mixed model using a spatial covariance function to define the dispersion matrix is a population-averaged approach because the estimated parameters for the random component represent the average behaviour of the population. Their effectiveness in characterising the average behaviour is dependent on the consistency of spatial structure across the data space. These facets of the data were examined earlier using directional semivariograms. By using the GLIMMIX macro to generalise these models to the instance where the response is binary we can estimate generalized linear mixed models (GLMMs) with the random model component defined using a spatial covariance structure. An example of the GLMM dispersion matrix for the Gaussian population-averaged spatial covariance model is shown below.

$$
\begin{bmatrix}
\sigma^2 & \sigma^2 \exp(-d_{12}^2/\rho^2) & \sigma^2 \exp(-d_{13}^2/\rho^2) \\
\sigma^2 \exp(-d_{21}^2/\rho^2) & \sigma^2 & \sigma^2 \exp(-d_{23}^2/\rho^2) \\
\sigma^2 \exp(-d_{31}^2/\rho^2) & \sigma^2 \exp(-d_{32}^2/\rho^2) & \sigma^2
\end{bmatrix}
$$

where $\sigma^2$ is the error variance, $\rho$ is the unknown population-averaged random parameter, $d_{12}$ is the physical distance between plot 1 and plot 2, $d_{13}$ is the distance between plot 1 and plot 3, etc.
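To make the structure of this matrix concrete, the following Python sketch builds it from plot coordinates for given values of the error variance and the range parameter. It only evaluates the Gaussian covariance function element by element and does not estimate anything. The function and argument names are hypothetical.

```python
import math


def gaussian_dispersion_matrix(coords, sigma2, rho):
    """Dispersion matrix with elements sigma2 * exp(-d_ij^2 / rho^2).

    coords : list of (x, y) plot coordinates
    sigma2 : error variance (diagonal value, since d_ii = 0)
    rho    : range parameter of the Gaussian covariance function
    """
    n = len(coords)
    return [
        [
            sigma2 * math.exp(-(math.dist(coords[i], coords[j]) ** 2) / rho ** 2)
            for j in range(n)
        ]
        for i in range(n)
    ]
```

For two plots 5 units apart with rho = 5, the off-diagonal element is sigma2 multiplied by exp(-1), so the covariance has fallen to roughly 37% of the variance at that distance.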

Using the mixed species model to define the linear predictor, and a spatial covariance structure to define the random model component, an attempt was made to fit a GLMM using the population-averaged approach. Available covariance functions were tested, and various parameter starting values (estimated from the sill, range and nugget of observed semivariograms) were used, but GLIMMIX failed to converge. This may indicate the inappropriateness of the population-averaged approach to modelling spatial dependence in this instance. The earlier examination of directional semivariograms indicated that spatial dependence may be direction variant. Furthermore, the unusual falling semivariance observed over larger inter-plot distances may be complicating the fitting of spatial covariance functions. Spatial covariance functions are designed to model declining magnitudes of spatial dependence with increasing inter-plot distance. The observed increasing spatial dependence with increasing inter-plot distance may explain the failure to converge on population-averaged parameter estimates. These results indicate that the subject-specific approach to modelling spatial dependence may be more appropriate. Although the fit was unsuccessful, the SAS code for fitting a logistic model with a random component defined using the spherical spatial covariance function for E. regnans is shown below.

/* Population-averaged GLMM: spherical spatial covariance on coordinates (x y) */
%glimmix(data=re.plot,
         procopt=scoring=5,
         stmts=%str(
            class fma reg_n;
            model holbin = fmaceng fmadan crden2 irl05 regr reg / S P;
            repeated / type=sp(sph)(x y) subject=intercept local;
            parms (0.1) (50) (1);
            make 'Predicted' out=predha;
         ),
         error=binomial,
         maxit=50,
         out=work.pred,
         options=fitting)
run;

The subject-specific approach to modelling spatial dependence

The subject-specific approach differs from the population-averaged approach because each subject has a random parameter associated with it. For the mixed species model, it appears that spatial dependence is not amenable to characterisation using the population-averaged approach. This suggests that spatial dependence varies across the population and is subject specific. Spatial dependence dictates that plots located in the same or nearby forest will behave more similarly than plots not located in the same or nearby forest. An alternative to explicitly modelling this tendency using a spatial covariance function is to divide the population into subjects within which the individual units tend to behave similarly. These subjects can then be used to define a block-diagonal dispersion matrix with all units within the same subject constituting a block. Each block or subject will then have a random parameter associated with it in a mixed model. This approach to modelling spatial dependence is known as the subject-specific approach. The challenge is to identify subjects within which individual units tend to behave similarly. Conveniently, a variable dividing the forest estate into 1:25,000 mapsheets was found to identify plots located in the same or nearby forest. Plots located on the same 1:25,000 mapsheet tended to behave similarly with respect to the incidence of hollows. Using mapsheet as a random effect in a mixed model will facilitate the subject-specific characterisation of spatial dependence in hollows incidence models. Proc Mixed includes the facility to estimate random effects models. This is done by constructing a dispersion matrix consisting of a block diagonal for plots located on a particular mapsheet. A separate random parameter for each block diagonal facilitates the subject-specific modelling of spatial dependence. A hypothetical section of the dispersion matrix for the subject-specific GLMM with random mapname effects is shown below.

$$
\begin{bmatrix}
\sigma_{m1}^2 + \sigma^2 & \sigma_{m1}^2 & 0 & 0 \\
\sigma_{m1}^2 & \sigma_{m1}^2 + \sigma^2 & 0 & 0 \\
0 & 0 & \sigma_{m2}^2 + \sigma^2 & \sigma_{m2}^2 \\
0 & 0 & \sigma_{m2}^2 & \sigma_{m2}^2 + \sigma^2
\end{bmatrix}
$$

where $\sigma^2$ is the error variance, $\sigma_{m1}^2$ is the covariance associated with plots located on Mapsheet 1, and $\sigma_{m2}^2$ is the covariance associated with plots located on Mapsheet 2. Note that covariance between plots located on different Mapsheets is zero because they are assumed independent.
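The block-diagonal structure can be reproduced for any assignment of plots to mapsheets. The Python sketch below is a minimal illustration that follows the notation of the hypothetical matrix above, with one covariance value per mapsheet. The function and argument names are assumptions, and the values are supplied rather than estimated.

```python
def random_effect_dispersion(groups, group_var, error_var):
    """Block-diagonal dispersion matrix for one level of random effects.

    groups    : list of group labels, one per unit (e.g. mapsheet per plot)
    group_var : dict mapping each label to its covariance parameter
    error_var : error variance added on the diagonal
    """
    n = len(groups)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Units share covariance only when they belong to the same group.
            if groups[i] == groups[j]:
                v[i][j] = group_var[groups[i]]
        v[i][i] += error_var
    return v
```

For example, `random_effect_dispersion(["m1", "m1", "m2", "m2"], {"m1": 0.3, "m2": 0.5}, 1.0)` gives the four-by-four pattern shown above, with zeros linking plots on different mapsheets.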

By using the GLIMMIX macro we can generalise the above described mixed model to the case of binary responses (as in the hollows data), and fit the resulting subject-specific GLMM. The SAS code for fitting the subject-specific stand-level model for E. regnans is shown below.

/* Subject-specific GLMM: random effect for each 1:25,000 mapsheet (mapname) */
%glimmix(data=re.plotd,
         procopt=mmeqsol,
         stmts=%str(
            class mapname;
            model holbin = fmaceng fmadan crden2 irl05 regr reg / solution;
            random mapname / solution;
         ),
         options=fitting,
         error=binomial,
         maxit=50,
         out=tpred)
run;

Identifying a suitable deterministic structure before examination of random effects is an appropriate practice (Diggle et al. 1994, Verbeke and Molenberghs 1997). However, after identifying the subject-specific GLMM as the best method for explicitly incorporating spatial dependence in model estimation, the model structure was re-examined. This was done by checking if any of the parameter estimates had become non-significant as a result of including the random effects component. Non-significant parameter estimates were dropped from the model sequentially (least significant first), with the GLMM being re-estimated at each step.
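The sequential re-examination described above follows a simple loop, sketched below in Python. This is an illustration of the procedure only: `fit` is a hypothetical callback standing in for re-estimation of the GLMM, and the 0.05 significance threshold is an assumption, since the threshold used is not stated here.

```python
def backward_eliminate(terms, fit, alpha=0.05):
    """Drop the least significant term, refit, and repeat.

    terms : list of explanatory variable names
    fit   : hypothetical callback; takes a list of terms and returns a dict
            mapping each term to its p-value in the re-estimated model
    alpha : assumed significance threshold
    Returns the terms retained once every remaining p-value is <= alpha.
    """
    kept = list(terms)
    while kept:
        p_values = fit(kept)  # model is re-estimated at each step
        worst = max(kept, key=lambda t: p_values[t])
        if p_values[worst] <= alpha:
            break
        kept.remove(worst)
    return kept
```

Only one term is removed per pass, because the p-values of the remaining terms can change once a variable is dropped and the model is re-estimated.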

To determine whether the random effects component detailed above was significant the likelihood ratio test was used. The likelihood ratio test is the preferred means of testing whether the magnitude of random effects necessitates their inclusion in model estimation (Verbeke and Molenberghs 1997). The likelihood ratio test was implemented as:

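$$
\mathrm{LRT} = 2\,\frac{n - q}{n}\,\left(\mathrm{LLi}_{\mathrm{GLMM}} - \mathrm{LLi}_{\mathrm{GLM}}\right)
$$

where $n$ is the total degrees of freedom, $q$ is the number of random parameters in the mixed model, and $\mathrm{LLi}_{\mathrm{GLM}}$ and $\mathrm{LLi}_{\mathrm{GLMM}}$ are the log likelihood values for the GLM and the GLMM respectively. The difference is taken so that a better-fitting GLMM yields a positive statistic. The LRT statistic has an asymptotic $\chi^2$ distribution with $q$ degrees of freedom. The p-value from this distribution was used to determine the significance of random effects.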
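As a worked illustration of the statistic, the Python sketch below evaluates it for given log likelihoods. It is written so that an improvement in log likelihood under the GLMM gives a positive value, which matches the positive LRT values reported in Tables A1 and A2. The values of n and q in the usage note are hypothetical, because the degrees of freedom are not reported here.

```python
def likelihood_ratio_statistic(ll_glm, ll_glmm, n, q):
    """Scaled likelihood ratio statistic 2 * ((n - q) / n) * (ll_glmm - ll_glm).

    ll_glm, ll_glmm : log likelihoods of the GLM and the GLMM
    n               : total degrees of freedom
    q               : number of random parameters in the mixed model
    """
    return 2.0 * ((n - q) / n) * (ll_glmm - ll_glm)
```

Using the E. delegatensis log likelihoods from Table A1 (-1089.43 and -1087.39) with hypothetical values n = 100 and q = 10, the unscaled statistic is 2 x 2.04 = 4.08 and the scaled statistic is 0.9 x 4.08 = 3.672. The p-value would then be read from a chi-square distribution with q degrees of freedom, for example using a statistics library.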

Results when the subject specific approach is applied to first-stage stand-level models are shown in Table A1.

Table A1. Likelihood ratio tests for the significance of random mapname effects.

| Forest type | LLi - GLM | LLi - GLMM | LRT | Chi-square (p-value) |
|---|---|---|---|---|
| E. delegatensis | -1089.43 | -1087.39 | 3.553548 | 1 |
| E. regnans | -770.082 | -762.145 | 13.59146 | 1 |
| E. obliqua | -1188.08 | -1175.87 | 18.16505 | 1 |
| mixed forest | -1535.81 | -1380.73 | 242.3125 | 0 |

where LLi is the log likelihood, LRT is the statistic from the likelihood ratio test, and Chi-square is the p-value from the chi-square distribution. Note that deterministic structures used in these models are described in Tables 14-17.

Likelihood ratio tests indicate that the random effects component is only significant for the mixed forest model. This confirms the significance of spatial dependence in the mixed forest model as observed in residual correlograms and also confirms an absence of spatial dependence in the other models. These results also confirm that the subject specific approach to modelling spatial dependence is appropriate for the first-stage stand-level hollows models.

The presence of strong spatial dependence in the mixed forest model is intuitive, as many different types of forest are encompassed in this group, and forests in spatial proximity tend to be of similar composition and to behave similarly with regard to hollow incidence. This effect is explicitly modelled in this instance using the subject-specific approach and GLMM model estimation.

Part II: Modelling spatial and nested dependence in the first-stage tree-level model

The first-stage tree-level model is affected by a hierarchy of dependence-inducing structures. The first is an artefact of the SFRI sampling scheme: trees are sampled within plots, therefore a nested dependence is generated because two trees located on the same plot will tend to be more similar with respect to hollows incidence than average. This structure needs to be incorporated to ensure tree-level model estimation remains valid. The second structure is generated by spatial dependence: two trees on plots located in spatial proximity tend to be more similar with respect to hollows incidence than average. This is a similar structure to the spatial dependence affecting the stand-level model. This hierarchy of nested and spatial dependence presents a challenge for tree-level hollow incidence modelling. The challenge is to identify and estimate predictive logistic models explicitly incorporating a hierarchy of dependence.

Explicitly modelling nested dependence

The nested dependence affecting the tree-level is amenable to subject-specific random effects modelling. Therefore plot becomes a random effect, and each plot has a separate random parameter associated with it. This will facilitate subject-specific modelling of the dependence present among the trees of each plot. Similar to the stand-level model, a tree-level model with subject-specific random effects can be estimated as a GLMM using GLIMMIX.

The SAS code used for fitting a tree-level GLMM for E. regnans with random plot effects is shown below.

/* Tree-level GLMM: random effect for each plot (plot_num) */
%glimmix(data=re.tree,
         procopt=mmeqsol,
         stmts=%str(
            class plot_num;
            model hollows = logdbh est_top_ crir crhr crrg spg1 fmatam / solution;
            random plot_num / solution;
         ),
         options=fitting,
         error=binomial,
         maxit=150,
         out=tpred)
run;

Each plot will constitute a block on the diagonal of the dispersion matrix, and the random parameter associated with each block will reflect the extent of nested dependence prevalent among the trees. A hypothetical section of the dispersion matrix for the GLMM with random plot effects is shown below.

$$
\begin{bmatrix}
\sigma_{p1}^2 + \sigma^2 & \sigma_{p1}^2 & 0 & 0 \\
\sigma_{p1}^2 & \sigma_{p1}^2 + \sigma^2 & 0 & 0 \\
0 & 0 & \sigma_{p2}^2 + \sigma^2 & \sigma_{p2}^2 \\
0 & 0 & \sigma_{p2}^2 & \sigma_{p2}^2 + \sigma^2
\end{bmatrix}
$$

where $\sigma^2$ is the error variance, $\sigma_{p1}^2$ is the covariance associated with trees in plot 1, and $\sigma_{p2}^2$ is the covariance associated with trees in plot 2. Note that covariance between trees in different plots is zero because trees in different plots are assumed independent.

The validity of random plot effects

The intention of the tree-level model is to predict the probability of hollows presence without having to condition on the particular plots used in the SFRI, and hence plot is treated as a random effect. Using plots as random effects is appropriate because they represent a model based sample (DNRE 1999a) from the larger population consisting of the entire forested estate (Bennington and Thayne 1994). It is intended that the tree-level model be representative of the entire forested estate, thus we are more interested in drawing inference on the entire sampled population rather than the particular plots. Furthermore, we are interested in applying the model developed from plot information to the larger population, thus including plot in the deterministic model structure is not feasible.

Explicitly modelling a hierarchy of spatial and nested dependencies

Trees on plots in spatial proximity exhibit spatial dependence. For the stand-level model this spatial dependence was incorporated using Mapname to define subject-specific random effects. The subject-specific modelling of spatial dependence was found to be more appropriate than population-averaged modelling due to the complexity and variability of spatial patterns across the dataspace. Because the spatial dependence affecting tree- and stand-level models should be very similar, it was decided to test only the subject-specific modelling of spatial dependence in the tree-level model. Thus, the use of mapname as a random effect was tested as a means of incorporating spatial dependence in estimation of the tree-level model.

For tree-level models we have a hierarchy of spatial and nested dependencies which need to be incorporated in model estimation. This can be achieved using a subject-specific modelling approach with two levels of random effects: random plot effects and random mapname effects. Proc Mixed is capable of handling a hierarchy of random effects (Singer 1998), and when generalised to models with binary response using GLIMMIX, provides a methodology capable of estimating tree-level models subject to a hierarchy of dependencies.

The SAS code used for fitting a tree-level GLMM with a hierarchy of random effects for E. regnans is shown below.

/* Hierarchical tree-level GLMM: random mapname effects and plots nested within mapname */
%glimmix(data=re.tree,
         procopt=mmeqsol,
         stmts=%str(
            class plot_num mapname;
            model hollows = logdbh est_top_ crir crhr crrg spg1 fmatam / solution;
            random mapname / solution;
            random plot_num(mapname) / solution;
         ),
         options=fitting,
         error=binomial,
         maxit=150,
         out=tpred)
run;

Given the hierarchical GLMM described above we can revisit the dispersion matrix and demonstrate how covariance between plots located in the same 1:25,000 map is now non-zero. A hypothetical section of the dispersion matrix for the GLMM with a hierarchy of random plot and mapname effects is shown below.

$$
\begin{bmatrix}
\sigma_{m1}^2 + \sigma_{p1}^2 + \sigma^2 & \sigma_{m1}^2 + \sigma_{p1}^2 & \sigma_{m1}^2 & \sigma_{m1}^2 \\
\sigma_{m1}^2 + \sigma_{p1}^2 & \sigma_{m1}^2 + \sigma_{p1}^2 + \sigma^2 & \sigma_{m1}^2 & \sigma_{m1}^2 \\
\sigma_{m1}^2 & \sigma_{m1}^2 & \sigma_{m1}^2 + \sigma_{p2}^2 + \sigma^2 & \sigma_{m1}^2 + \sigma_{p2}^2 \\
\sigma_{m1}^2 & \sigma_{m1}^2 & \sigma_{m1}^2 + \sigma_{p2}^2 & \sigma_{m1}^2 + \sigma_{p2}^2 + \sigma^2
\end{bmatrix}
$$

where $\sigma^2$ is the error variance, $\sigma_{p1}^2$ is the covariance associated with trees in plot 1, $\sigma_{p2}^2$ is the covariance associated with trees in plot 2, and $\sigma_{m1}^2$ is the covariance associated with trees in map 1. Note that compared to the earlier dispersion matrix, covariance between plots p1 and p2 is now non-zero because they are located on the same 1:25,000 map sheet. For plots located in different mapsheets the covariance would be zero.
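The pattern in this hypothetical matrix can be summarised in a single rule. As a sketch, writing $m(i)$ and $p(i)$ for the mapsheet and plot of tree $i$ (notation introduced here for illustration only), each element of the dispersion matrix is

```latex
\operatorname{Cov}(y_i, y_j) =
  \sigma_{m(i)}^2 \, \mathbf{1}[m(i) = m(j)]
  + \sigma_{p(i)}^2 \, \mathbf{1}[p(i) = p(j)]
  + \sigma^2 \, \mathbf{1}[i = j]
```

where $\mathbf{1}[\cdot]$ equals 1 when the condition holds and 0 otherwise. Two trees in the same plot therefore share $\sigma_{m1}^2 + \sigma_{p1}^2$, two trees in different plots on the same mapsheet share only $\sigma_{m1}^2$, and two trees on different mapsheets share nothing.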

To determine which of the random effects components detailed above were significant the likelihood ratio test was used. Initially, a GLMM was fitted with only random plot effects to identify the significance of nested dependence. Results are shown in Table A2.

Table A2. Likelihood ratio tests for the significance of random plot effects.

| Forest type | LLi - GLM | LLi - GLMM | LRT | Chi-square (p-value) |
|---|---|---|---|---|
| E. delegatensis | -29636.0 | -28098.4 | 2903.13 | 0 |
| E. regnans | -19088.1 | -17683.4 | 2649.82 | 0 |
| E. obliqua | -14475.9 | -14328.7 | 278.01 | 0.6 |
| mixed forest | -17163.5 | -17154.3 | 17.23 | 1 |

where LLi is the log likelihood, LRT is the statistic from the likelihood ratio test, and Chi-square is the p-value from the chi-square distribution. Note that deterministic structures used in these models are from the full models described in Tables 1, 3, 5, and 7. Results from the simple models were very similar.

Table A2 indicates that dependence nested at the plot level is highly significant for E. delegatensis and E. regnans, but not significant for E. obliqua and mixed forests. These results indicate that tree-level models inadequately capture differences between plots for E. delegatensis and E. regnans. These differences may be related to stand-level variables such as logging and fire history that were not available and were not included in models. Results for E. obliqua and mixed forest indicate that the differences between plots are being characterised in the models. An important implication is that a substantial component of stand-level variation is escaping current models for E. delegatensis and E. regnans.

The next step was to add a component explicitly modelling spatial dependence as a random Mapname effect. This was done in the presence of a random plot effect resulting in a hierarchical model. For all four models, the random Mapname effect was insignificant. Its contribution was negligible compared to the more significant random plot effect. Given this, first-stage tree-level models were estimated using a random plot effect only, which explicitly modelled nested dependence.

References

Bennington, C.C., and Thayne, W.V., (1994). Use and misuse of mixed model analysis of variance in ecological studies. Ecology, 75: 717-722.

Brownie, C., and Gumpertz, M.L., (1997). Validity of spatial analysis for large field trials. J. Agric. Biol. Env. Statist. 2: 1-23.

Cliff, A.D., and Ord, J.K., (1981). Spatial processes; models and applications. Pion Limited, Great Britain, London.

Cressie, N., (1991). Statistics for spatial data. John Wiley & Sons, New York.

Department of Natural Resources and Environment, (1999a). Victoria’s Statewide Forest Resource Inventory: Benalla/Mansfield, Wangaratta and Wodonga Forest Management Areas. Forests Service Technical Reports 99-2. Department of Natural Resources and Environment, Victoria.

Diggle, P.J., Liang, K.Y., and Zeger, S.L., (1994). Analysis of longitudinal data. Oxford Science Publications, Clarendon Press, Oxford.

Haining, R.P., (1990). Spatial data analysis in the social and environmental sciences. Cambridge University Press, Cambridge.

Isaaks, E.H., and Srivastava, R.M., (1989). An introduction to applied geostatistics. Oxford University Press, New York.

Jones, R.H., (1990). Serial Correlation or Random Subject Effects? Communications in Statistics – Simulation, 19: 1105-1123.

Kaluzny, S.P., Vega, S.C., Cardoso, T.P., and Shelly, A., (1997). S+ Spatial Stats: user’s manual for windows and Unix. Springer, New York. 327pp.

Lindstrom, M.J., and Bates, D.M., (1990). Nonlinear Mixed Effects Models for Repeated Measures Data. Biometrics, 46: 673-687.

Marx, D.B., and Stroup, W.W., (1993). Analysis of spatial variability using PROC MIXED. In Kansas State University Conference on Applied Statistics in Agriculture. Manhattan, KS, Kansas State University. pp.40-59.

Moran, P.A.P., (1950). Notes on continuous stochastic phenomena. Biometrika, 37: 17-23.

Rossi, R.E., Mulla, D.J., Journel, A.G., and Franz, E.H., (1992). Geostatistical tools for modelling and interpreting ecological spatial dependence. Ecological Monographs, 62: 277-314.

SAS Institute, Inc., (1996). SAS/STAT® Software: Changes and Enhancements through Release 6.11. SAS Institute Inc., Cary, NC.

Singer, J.D., (1998). Using SAS PROC MIXED to fit Multilevel Models, Hierarchical Models, and Individual Growth Models. Journal of Educational and Behavioral Statistics. 24: 323-355.

Tenhave, T.R., and Localio A.R., (1999). Empirical bayes estimation of random effects parameters in mixed effects logistic regression models. Biometrics, 55: 1022-1029.

Verbeke, G., and Molenberghs, G., (1997). Linear mixed models in practice: a SAS-oriented approach. Springer-Verlag, New York.

Wolfinger, R.D., Tobias, R.D., and Sall, J., (1994). Computing Gaussian likelihoods and their derivatives for general linear mixed models. SIAM Journal on Scientific Computing, 15: 1294-1310.

Zeger, S.L., Liang, K.Y., and Albert, P.S., (1988). Models for Longitudinal Data: A Generalized Estimating Equation Approach. Biometrics, 44: 1049-1060.

8.3. Appendix C. Further hollows work

There are several opportunities for further work, in particular extensions or applications of the models detailed in this report. These are listed below:

• A natural extension is the application of stand-level models to generate statements of hollow availability. This work will begin once DNRE confirms the usefulness of the stand-level models. Hollow availability can then be examined in light of the hollow needs of various endangered fauna.

• Tree-level models will facilitate the identification of individual tree attributes that maximise the probability of hollow incidence in the four forest types. Habitat tree retention guidelines for different forest types can then be adjusted to ensure that trees with the specified attributes are retained. In areas where priorities for the conservation of a particular species exist, habitat tree retention could be tailored to maximise the probability that a particular size of hollow is available.

• The methodology applied in this project to central and eastern Victoria can be applied for similar outcomes to western Victoria, where SFRI data is currently being collected.

• Second-stage stand-level models could not be developed for E. regnans because of a very limited sample. The collection of further felling plot information for E. regnans could change this. There may be some opportunity for collecting further felling plot information as part of a current defect study in the North-east FMA (Occhipinti pers. comm.).

• There are several opportunities for integrating hollow incidence models in growth and yield modelling systems. Tree-level models could be integrated into silvicultural optimisation systems. Thus, the number and type of hollows resulting from various silvicultural regimes could be used as an optimisation criterion, in addition to timber yield. Stand-level models could be integrated in growth and yield modelling systems for the forecasting of hollow incidence. This would be a valuable tool for forest planners, allowing simultaneous planning for timber and hollows, and would facilitate management planning that ensures a perpetual supply of hollow-bearing trees in production forests.

• To encourage operational implementation of the developed tree- and stand-level models, these models could become the basis of a simple computer application. The user could input tree attributes and the application would output the probability of hollow incidence and the expected size of the tree hollow. Additionally, the user could input stand attributes and the application would output the probability that hollows are present in the stand as well as a hollows per hectare estimate for each hollows size class. When used independently, or in concert with existing planning tools, it would provide a powerful tool for , and would encourage the integration of priorities for hollow-dependent fauna in operational forestry decision making.

• The literature has identified that external tree hollows and internal timber defects often occur concurrently. This has also been observed to some extent in the SFRI data (Wang pers. comm.). Further work could explore this relationship. Currently, the presence of internal timber defects has a critical effect on the accuracy of timber volume estimates.
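As an illustration of the simple computer application proposed above, the following minimal Python sketch wraps a first-stage, tree-level logistic model in a single function that takes tree attributes and returns a probability of hollow presence. It is a sketch under stated assumptions only: the predictor names (dbh_cm, is_dead) and every coefficient value are hypothetical placeholders, not the fitted models of this report; the random effects of the GLMMs are ignored; and the second-stage (hollow size) and stand-level outputs are omitted.

```python
import math

# Hypothetical placeholder coefficients for illustration only.
# These are NOT the fitted values from this report; an operational tool
# would load the published coefficients for the relevant forest type.
ILLUSTRATIVE_COEFFS = {
    "intercept": -4.0,
    "dbh_cm": 0.03,   # assumed predictor: diameter at breast height (cm)
    "is_dead": 1.5,   # assumed predictor: 1 if the tree is dead, else 0
}


def hollow_probability(attributes, coeffs=ILLUSTRATIVE_COEFFS):
    """Return P(hollow present) for one tree from a logistic model.

    Only fixed effects are used; random effects are not represented here.
    """
    # Linear predictor: intercept plus coefficient * value for each attribute.
    eta = coeffs["intercept"] + sum(
        coeffs[name] * value for name, value in attributes.items()
    )
    # Inverse logit link maps the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    tree = {"dbh_cm": 120.0, "is_dead": 0}
    print(f"P(hollow present) = {hollow_probability(tree):.3f}")
```

A fuller tool would report confidence intervals alongside each prediction, consistent with the emphasis placed on prediction uncertainty elsewhere in this report.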

8.4. Appendix D. Publications arising from the hollows work

Fox, J.C., Burgman, M.A., and Ades, P.K. 2000. Review: stochastic structure and predictive ecological models. p.45 in: Proceedings of the Ecological Society of Australia Annual Conference: Ecology in a Rapidly Changing World, 29 November – 1 December 2000, Melbourne, Australia.

Fox, J.C., Ades, P.K., Radic, J.S., Wang, Y., and Burgman, M.A. 2001. Predictive models for hollow incidence in central and eastern Victoria. Forest Ecology and Management. In preparation.

Fox, J.C., Ades, P.K., Radic, J.S., Wang, Y., and Burgman, M.A. 2001. A statement of hollow incidence in central and eastern Victorian public forests. Australian Forestry. In preparation.

Fox, J.C., and Burgman, M.A. 2001. Stochastic structure and predictive ecological models. Short Communication – Ecological Modelling. In preparation.

Fox, J.C., Whintle, B., Elith, J., and Burgman, M.A. 2001. Spatial dependence and binary ecological models. Ecological Modelling. In preparation.

8.5. Appendix E. Acknowledgements

The assistance of many staff at the Forest Resource Inventory section of DNRE is acknowledged. In particular, Todd Gretton provided GIS variables, Fiona Hamilton and Fred Cumming assisted with technical aspects of the SFRI, and Yue Wang assisted with the development of an appropriate statistical methodology. This project is the result of initiative taken by Ross Penny and Jan Radic, and their continuing financial support and enthusiasm for the project is acknowledged. The assistance of staff and students in the Environmental Science Lab, particularly that provided by Jane Elith, is also acknowledged.
