An evaluation of latent Dirichlet allocation in the context of plant-pollinator networks

by

Liam Callaghan

A Thesis

Presented to

The University of Guelph

In partial fulfilment of requirements

for the degree of

Master of Science

in

Mathematics and Statistics

Guelph, Ontario, Canada

© Liam Callaghan, December, 2012

ABSTRACT

An evaluation of latent Dirichlet allocation in the context of plant-pollinator networks

Liam Callaghan
University of Guelph, 2012

Advisors: Dr. A. Ali, Dr. G. Umphrey

There may be several mechanisms that drive observed interactions between plants and pollinators in an ecosystem, many of which may involve trait matching or trait complementarity. Hence species activity on plant species should be modelled as a mixture of these linkage rules. Unfortunately, ecologists do not always know how many, or even which, traits are the main contributors to the observed interactions. This thesis proposes the Latent Dirichlet Allocation (LDA) model from artificial intelligence for modelling the observed interactions in an ecosystem as a finite mixture of (latent) interaction groups in which plant and pollinator pairs that share common linkage rules are placed in the same interaction group. Several model selection criteria are explored for estimating how many interaction groups best describe the observed interactions. This thesis also introduces a new model selection score called “penalized perplexity”. The performance of the model selection criteria, and of LDA in general, is evaluated through a comprehensive simulation study that considers networks of various sizes along with varying levels of nesting and numbers of interaction groups. Results of the simulation study suggest that LDA works well on networks with mild-to-no nesting, but loses accuracy with increased nestedness. Further, the penalized perplexity tended to outperform the other model selection criteria in identifying the correct number of interaction groups used to simulate the data. Finally, LDA was demonstrated on a real network, the results of which provided insights into the functional roles of pollinator species in the study region.

Keywords: pollination network, latent Dirichlet allocation, linkage rules, perplexity, model selection, BIC, AIC, DIC.

Acknowledgments

I would like to thank my advisor Dr. Ayesha Ali for patiently helping me with my research at the University of Guelph. I am grateful for the learning opportunities through the conferences and workshops I have attended, and of course for the financial aid provided by my advisor through the NSERC-CANPOLIN Canadian Pollination Initiative and Dr. Hermann Eberl. In addition to my advisor, I would like to thank Dr. Gary Umphrey, not only for being on my advisory committee but also for providing his advice and insight and for being a major part of my learning experience at the University of Guelph.

I am thankful to Luisa Carvalheiro for providing the Avon Gorge dataset as well as feedback on my analysis. I also thank Peter Kevan and Tom Woodcock for their support, expertise on pollination, and constructive comments.

Furthermore, I would like to express my gratitude towards my family and friends, whose support made it possible for me to complete my graduate studies.

-Liam

Table of Contents

List of Figures
List of Tables

1 Introduction

2 Pollination Networks
2.1 Definition of a Pollination network
2.2 Network terms and structure
2.3 Methods used to identify compartments
2.3.1 Trophic similarity
2.3.2 Simulated annealing algorithm (SA)

3 Methodology
3.1 Latent Dirichlet allocation
3.2 Kullback-Leibler (KL) divergence and label switching
3.3 Model Selection
3.3.1 Perplexity
3.3.2 Akaike Information Criterion (AIC)
3.3.3 Bayesian Information Criterion (BIC)
3.3.4 Deviance Information Criterion (DIC)
3.3.5 Information Criterion (IC)
3.3.6 Penalized Perplexity

4 Simulation Study
4.1 Study design
4.2 Data Generation and Model Fitting
4.3 Statistics
4.4 Results
4.4.1 Parameter estimation Statistics
4.5 Discussion

5 Data Analysis
5.1 Description of the Avon Gorge Data
5.2 Results
5.3 Discussion

6 Conclusions
6.1 Future Work

A Appendix
A.1 Simulation study results
A.1.1 Scenario 1
A.1.2 Scenarios 2 to 4
A.1.3 Scenarios 5 to 8
A.1.4 Scenario 9
A.1.5 Scenarios 13 to 16
A.1.6 Scenario 17
A.1.7 Scenarios 18 to 20
A.1.8 Scenarios 21 to 24
A.2 Avon Gorge dataset results
A.2.1 Avon Gorge data results for analysis 1 using penalized perplexity
A.2.2 Avon Gorge data results for analysis 1 using IC model selection criterion
A.3 The lda package in R
A.4 The bipartite package in R

List of Figures

2.1 A weighted bipartite graph representing observed interactions within an ecosystem. Circles represent pollinator species; squares represent plant species.
3.1 A graphical representation of the LDA model applied to the a-th pollinator species with n_a observed counts on M plant species. Z and θ^a are K-vectors, Y and β_z are M-vectors, and α and η_Z are scalars, for Z = 1, ..., K and a = 1, ..., N.
4.1 Visualization of a mildly nested visitation web with 20 visitor species (rows) and 9 plant species (columns). Darker cells represent higher frequencies of interactions between the corresponding plant-visitor pairs.
4.2 Stacked bar plots for the identified interaction groups in scenario 10. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
4.3 Stacked bar plots for the identified interaction groups in scenario 11. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
4.4 Stacked bar plots for the identified interaction groups in scenario 12. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
5.1 Presence/absence visualization of Avon Gorge data with rare visits excluded and single visits excluded (N = 85, M = 53).
5.2 Presence/absence visualization of Avon Gorge data with rare visits included, but plants/visitors with single counts removed for analysis 3 (N = 85, M = 53).
5.3 Estimated visitation distribution by interaction group, averaged over 83 runs for K̂ = 2. Refer to Table 5.3 for plant species names.
A.1 Stacked bar plots for the identified interaction groups in scenario 1. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.2 Stacked bar plots for the identified interaction groups in scenario 2. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.3 Stacked bar plots for the identified interaction groups in scenario 3. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.4 Stacked bar plots for the identified interaction groups in scenario 4. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.5 Stacked bar plots for the identified interaction groups in scenario 5. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.6 Stacked bar plots for the identified interaction groups in scenario 6. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.7 Stacked bar plots for the identified interaction groups in scenario 7. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.8 Stacked bar plots for the identified interaction groups in scenario 8. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.9 Stacked bar plots for the identified interaction groups in scenario 9. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.10 Stacked bar plots for the identified interaction groups in scenario 13. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.11 Stacked bar plots for the identified interaction groups in scenario 14. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.12 Stacked bar plots for the identified interaction groups in scenario 15. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.13 Stacked bar plots for the identified interaction groups in scenario 16. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.14 Stacked bar plots for the identified interaction groups in scenario 17. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.15 Stacked bar plots for the identified interaction groups in scenario 18. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.16 Stacked bar plots for the identified interaction groups in scenario 19. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.17 Stacked bar plots for the identified interaction groups in scenario 20. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.18 Stacked bar plots for the identified interaction groups in scenario 21. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.19 Stacked bar plots for the identified interaction groups in scenario 22. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.20 Stacked bar plots for the identified interaction groups in scenario 23. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.21 Stacked bar plots for the identified interaction groups in scenario 24. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.
A.22 Presence/absence visualization of Avon Gorge data with rare visits included, but plants/visitors with single counts removed for analysis 1 (N = 89, M = 54).
A.23 Estimated visitation distribution by interaction group, averaged over 84 runs for K̂ = 2 in analysis 1 using penalized perplexity for model selection. Refer to Table 5.3 for plant species names.
A.24 Presence/absence visualization of Avon Gorge data with rare visits included, but plants/visitors with single counts removed for analysis 1 (N = 89, M = 54).
A.25 Estimated visitation distribution by interaction group, averaged over 100 runs. Refer to Table 5.3 for plant species names.

List of Tables

3.1 Notation for the LDA.
4.1 Dimensions and the number of interaction groups used to generate the data for the 24 different scenarios of the simulation study. No nesting corresponds to a compartmental model.
4.2 The test to accept η for a specified level of nesting.
4.3 Number of samples that chose the correct number of groups K out of 500 samples.
4.4 The discordance ratio calculated as (PP incorrect, PY correct)/(PP correct, PY incorrect) for the penalized perplexity and perplexity model selection criteria and (PP incorrect, AIC correct)/(PP correct, AIC incorrect) for the penalized perplexity and AIC model selection criteria for each scenario of 500 runs. The proportion of the 500 runs choosing an incorrect k for each scenario are also listed for each of the two criteria.
4.5 The number of groups identified for the scenarios with N = 42, M = 14 and K = 3 with penalized perplexity (PP) used as the model selection criterion.
4.6 Top row: The bias and relative bias for β̂ for the scenarios with N = 42, M = 14 and K = 3 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
4.7 The average relative bias for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.
4.8 The average bias for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.
4.9 The coefficient of variation (CV) for β̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.
4.10 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.
4.11 The average standard deviation for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.
5.1 Summary of counts in Avon Gorge data.
5.2 The number of interaction groups associated with the model chosen most often for each score. The number of times this model is selected out of the 100 runs is shown in brackets.
5.3 Estimated plant visitation distributions for each interaction group β_k, averaged over runs for K̂ = 2 (83) using analysis 3 of LDA with a Gibbs sampler and two interaction groups.
5.4 Estimated group membership distributions for each visitor species θ^a, averaged over 83 independent runs of LDA where K̂ = 2 with a Gibbs sampler.
5.5 Estimated group membership distributions for each visitor species θ^a, averaged over 83 independent runs of LDA where K̂ = 2 with a Gibbs sampler.
A.1 The number of groups identified for scenario 1 with N = 20, M = 9 and K = 2 with penalized perplexity (PP) used as the model selection criterion.
A.2 Top row: The bias and relative bias for β̂ for scenario 1 with N = 20, M = 9 and K = 2 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.3 The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.
A.4 The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.
A.5 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.
A.6 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.
A.7 The average standard deviation for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.
A.8 The number of groups identified for the scenarios with N = 20, M = 9 and K = 3 with penalized perplexity (PP) used as the model selection criterion.
A.9 Top row: The bias and relative bias for β̂ for the scenarios with N = 20, M = 9 and K = 3 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.10 The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.
A.11 The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.
A.12 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.
A.13 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.
A.14 The average standard deviation (SD) for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.
A.15 The number of groups identified for the scenarios with N = 20, M = 9 and K = 4 with penalized perplexity (PP) used as the model selection criterion.
A.16 Top row: The bias and relative bias for β̂ for the scenarios with N = 20, M = 9 and K = 4 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.17 The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.18 The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.19 The coefficient of variation (CV) for β̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.20 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.21 The average standard deviation (SD) for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.22 The number of groups identified for scenario 9 with N = 42, M = 14 and K = 2 with penalized perplexity (PP) used as the model selection criterion.
A.23 Top row: The bias and relative bias for β̂ for scenario 9 with N = 42, M = 14 and K = 2 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.24 The average relative bias for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.
A.25 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.
A.26 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.
A.27 The average standard deviation (SD) for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.
A.28 The number of groups identified for the scenarios with N = 42, M = 14 and K = 4 with penalized perplexity (PP) used as the model selection criterion.
A.29 Top row: The bias and relative bias for β̂ for the scenarios with N = 42, M = 14 and K = 4 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.30 The average relative bias for θ̂ for the scenarios with N = 42, M = 14 and K = 4 using the penalized perplexity for model selection.
A.31 The average bias for θ̂ for the scenarios with N = 42, M = 14 and K = 4 using the penalized perplexity for model selection.
A.32 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 42, M = 14 and K = 4 using the penalized perplexity for model selection.
A.33 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.34 The average standard deviation (SD) for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.
A.35 The number of groups identified for scenario 17 with N = 80, M = 42 and K = 2 with penalized perplexity (PP) used as the model selection criterion.
A.36 Top row: The bias and relative bias for β̂ for the scenarios with N = 80, M = 42 and K = 2 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.37 The average relative bias for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.
A.38 The average bias for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.
A.39 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.
A.40 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.
A.41 The average standard deviation (SD) for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.
A.42 The number of groups identified for the scenarios with N = 80, M = 42 and K = 3 with penalized perplexity (PP) used as the model selection criterion.
A.43 Top row: The bias and relative bias for β̂ for the scenarios with N = 80, M = 42 and K = 3 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.44 The average relative bias for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.
A.45 The average bias for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.
A.46 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.
A.47 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.
A.48 The average standard deviation (SD) for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.
A.49 The number of groups identified for the scenarios with N = 80, M = 42 and K = 4 with penalized perplexity (PP) used as the model selection criterion.
A.50 Top row: The bias and relative bias for β̂ for the scenarios with N = 80, M = 42 and K = 4 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.
A.51 The average relative bias for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.
A.52 The average bias for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.
A.53 The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.
A.54 The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.
A.55 The average standard deviation (SD) for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.
A.56 Estimated plant visitation distributions for each interaction group β_k, averaged over runs for K̂ = 2 (84) using analysis 1 of LDA with a Gibbs sampler and two interaction groups by using PP for model selection.
A.57 Estimated group membership distributions for each visitor species θ^a, averaged over 84 independent runs of LDA where K̂ = 2 with a Gibbs sampler using PP.
A.58 Continuation of the estimated group membership distributions for each visitor species θ^a, averaged over 84 independent runs for analysis 1 of LDA where K̂ = 2 with a Gibbs sampler using PP.
A.59 Estimated plant visitation distributions for each interaction group, averaged over 100 independent runs of LDA with a Gibbs sampler and six interaction groups.
A.60 Estimated group membership distributions θ^a for each visitor species, averaged over 100 independent runs of LDA with a Gibbs sampler and six interaction groups.

Chapter 1

Introduction

This thesis contributes to the area of Bayesian statistics and provides an evaluation of model selection criteria for Bayesian mixture models in a pollination ecology context. More specifically, this thesis proposes latent Dirichlet allocation (LDA) for modelling pollination networks such that plant-pollinator interactions arise from a finite number of sets of linkage rules. LDA with the collapsed Gibbs sampler is proposed to estimate a finite number of conditional group probabilities to represent sets of linkage rules for visitor and plant species interactions. The ability of LDA to identify the correct number of interaction groups is investigated through a comprehensive simulation study. LDA is also used to analyze a real dataset to demonstrate how results from LDA can be used to gain insights into ecosystem function.

Currently, few models exist for grouping interactions between plant and pollinator species to identify structural patterns of ecological importance. Simulated annealing has been used to cluster pollinator and plant species in real pollination networks by maximizing the connectance between species of the same module (Bascompte et al., 2007; Amarel and Guimerà, 2005). In particular, the algorithm partitions the pollination network into modules that maximize the modularity of the network, using a qualitative web (i.e. a binary presence/absence matrix). A module is a group of pollinator and plant species such that pollinators either visit only plant species within the same module (i.e. hub) or visit plant species in other modules as well (i.e. connector). A module is also known as a compartment, and it is a convenient way of grouping plants and pollinators within a network. The traits of the plant and pollinator species within a compartment may be complementary and responsible for the observed interaction webs (Corbet et al., 2002).

It is important to identify the underlying mechanisms that drive the observed interactions in pollination webs in order to understand the behaviour of pollinator species in an ecosystem. The compartments may identify species that are vulnerable to extinction due to habitat loss or other effects that are a direct result of human activity (Inouye and Kearns, 1997). For example, if a plant species relies on a single pollinator species for reproduction and an alien flower species that is attractive to this pollinator is introduced to the ecosystem, then the native flower species may be less likely to be pollinated and more vulnerable to extinction. It is important to manage ecosystems to ensure that increased human activity does not have a negative effect on the mutualistic plant-pollinator relationships that are necessary for survival. Recently, a focus in pollination ecology has been identifying whether pollination networks are modular or nested (Amarel and Guimerà, 2005; Hall and Raffaelli, 1992).

It is believed that networks with more than 150 plant species are highly modular (Bascompte et al., 2007), and the modules are a direct result of “linkage” rules caused by trait matching between plant and pollinator species. Linkage rules are defined by trait matching, or complementarity traits, between plant and pollinator pairs that facilitate pollination. For example, a common complementarity trait corresponds to an insect having a proboscis that is long enough to reach the floral reproductive parts of a plant. An example of trait matching would be an insect species with a foraging time that coincides with a flower species’ bloom time. The goal of modelling such processes is to represent the species observed in a community as belonging to an identified interaction group. Currently, clustering algorithms such as simulated annealing (SA) categorize each pollinator species as belonging to only one interaction group. In other words, each group or module is defined by sets of non-overlapping pollinator species. However, in this thesis, such groups are defined by the individual interactions between plants and pollinators in accordance with some linkage rule.

LDA is an advanced text modelling algorithm (Blei et al., 2002) that is adapted in this thesis to model pollination networks. The algorithm uses a quantitative web consisting of a matrix of interaction counts between plant-pollinator species pairs. Typically, this is a “visitation web”, in which counts correspond to observed visits of pollinator species to the floral parts of individuals of a plant species. LDA estimates the group to which each observed interaction in a visitation web belongs and does not require any prior trait information on the species in the network. In fact, LDA treats the linkage rules as latent variables and estimates individual interaction group memberships between a plant and observed visitors to that plant. Because the interaction groups are defined by linkage rules, visits made by two individuals of the same species may be assigned to different groups. If one acknowledges that there is trait variation among individuals of the same species, then one must also acknowledge that visitation behaviours may vary within a species. LDA permits and exploits these variations in traits and visitation behaviour. Hence, individuals within a visitor species can belong to more than one group identified by the model, since the model estimates the groups at the individual level, not at the species level. The estimated mixing probabilities output from LDA could be used to guide investigation of plant and pollinator species characteristics that facilitate, encourage, or prevent interactions.
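The interaction-level view described above can be illustrated with a minimal sketch of the standard LDA generative process applied to a visitation web. This is not the simulation design used later in the thesis (which also controls nesting); it only shows how each visit receives its own latent group. The function names, the symmetric Dirichlet parameters `alpha` and `eta`, their values, and the fixed number of visits per visitor species are all assumptions made for illustration.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw from a symmetric Dirichlet by normalizing independent gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_web(n_visitors, n_plants, n_groups, visits_per_visitor,
                 alpha, eta, seed=0):
    """Simulate a visitation web from the standard LDA generative process.

    All argument names and default choices are hypothetical.
    """
    rng = random.Random(seed)
    # beta[k]: distribution over plant species for interaction group k.
    beta = [sample_dirichlet(rng, eta, n_plants) for _ in range(n_groups)]
    # theta[a]: distribution over interaction groups for visitor species a.
    theta = [sample_dirichlet(rng, alpha, n_groups) for _ in range(n_visitors)]
    counts = [[0] * n_plants for _ in range(n_visitors)]
    for a in range(n_visitors):
        for _ in range(visits_per_visitor):
            # Each individual visit draws its own latent group z ...
            z = rng.choices(range(n_groups), weights=theta[a])[0]
            # ... and then a plant species y from that group's distribution.
            y = rng.choices(range(n_plants), weights=beta[z])[0]
            counts[a][y] += 1
    return counts, theta, beta


# Dimensions chosen to resemble a small web (20 visitors, 9 plants, 2 groups).
counts, theta, beta = simulate_web(
    n_visitors=20, n_plants=9, n_groups=2,
    visits_per_visitor=50, alpha=0.5, eta=0.5,
)
```

Because the group is drawn per visit rather than per species, two visits by the same visitor species can fall into different interaction groups, which is the property that distinguishes this approach from species-level clustering.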

A drawback to using LDA is that the number of latent interaction groups is assumed known, whereas the true number of groups is typically unknown. In the text modelling context, LDA assigns the individual words within a document to specific topics such that each document in a corpus is represented as a mixture of topics (Blei et al., 2002). Because humans are good at looking at words and deciding if they are related to each other, topics can be easily identified by manually reviewing the words LDA assigns to different topics. In a pollination context, however, the definition of an interaction group is not clear from the groups assigned to the individual observed interactions, and careful review of the plant-pollinator pairs is necessary to identify the linkage rule that defines a group, or the functional roles of a specific group. In text modelling, it is common to see many topics or groups (e.g. 100 topics) being used to fit corpus data, and a likelihood-based score known as perplexity is used to automatically choose the optimal number of groups. Results of the simulation study in this thesis suggest that perplexity tends to overfit data. Further, this thesis proposes a new score, called penalized perplexity, which incorporates a penalty to account for model complexity.
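To make the notion of perplexity concrete, the sketch below computes the usual text-modelling form of the score, the exponential of the negative average log-likelihood per observed visit, from estimated group membership and plant visitation distributions. The exact definition adopted in this thesis is given in Chapter 3 and may differ in detail; the function name and the toy inputs here are assumptions for illustration only, and the penalized version is not reproduced.

```python
import math


def perplexity(counts, theta, beta):
    """exp(-average log-likelihood per visit) under a fitted mixture.

    counts[a][y]: observed visits of visitor species a to plant species y.
    theta[a][k]:  estimated probability that a visit by a belongs to group k.
    beta[k][y]:   estimated probability of plant y within group k.
    Assumes every observed (a, y) pair has positive fitted probability.
    """
    log_lik = 0.0
    total = 0
    for a, row in enumerate(counts):
        for y, n in enumerate(row):
            if n == 0:
                continue
            # Mixture probability of plant y for visitor species a.
            p = sum(theta[a][k] * beta[k][y] for k in range(len(beta)))
            log_lik += n * math.log(p)
            total += n
    return math.exp(-log_lik / total)


# Toy web: two visitor species, each visiting a different plant.
toy_counts = [[4, 0], [0, 4]]
one_group = perplexity(toy_counts, theta=[[1.0], [1.0]], beta=[[0.5, 0.5]])
two_groups = perplexity(
    toy_counts,
    theta=[[1.0, 0.0], [0.0, 1.0]],
    beta=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this toy web the one-group fit gives a perplexity of about 2 and the two-group fit gives 1, so lower values indicate a better fit. The score as written carries no explicit penalty for the number of groups.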

Chapter 2 provides background on pollination networks, network structure and current methods for analyzing these webs. Chapter 3 provides background on LDA using the collapsed Gibbs sampler, and on various scores used for model selection. Chapter 4 presents the design and results of the simulation study. Chapter 5 presents an analysis of real data collected in the Avon Gorge region in the United Kingdom. Conclusions and future work are discussed in Chapter 6.

Chapter 2

Pollination Networks

This chapter provides a brief review of plant-pollinator networks. Section 2.1 defines a pollination web and shows how to represent one. Section 2.2 explains the structure and commonly used terms to describe pollination networks. Finally, Section 2.3 briefly summarizes two current methods used to identify modules in pollination webs: trophic similarity and the simulated annealing (SA) algorithm.

2.1 Definition of a Pollination network

Pollination webs are commonly represented by a matrix of either presence/absence or frequency counts of interactions between pollinator-plant species pairs. Associated with this matrix is a bipartite graph in which each node represents either a plant species or a pollinator species, and lines connecting plant species to pollinator species represent observed interactions between that species pair. Figure 2.1 provides an example of a pollination web in which thicker lines are used to denote higher observed frequencies. In practice, data are usually collected by observing visitors to the plant species, but it is not known whether pollination took place or not. As such, the matrix of counts most likely includes “non-pollination” visits as well, and ecologists tend to use the terms “visitor species” and “visitation web” instead of “pollinator species” and “pollination web”.

Figure 2.1: A weighted bipartite graph representing observed interactions within an ecosystem. Circles represent pollinator species; squares represent plant species.

Within an ecosystem, several mechanisms could be responsible for the observed interactions, such as forbidden links (e.g. due to temporal mismatch) and complementarity traits (e.g. tubular flowers might be visited more often by pollinators with longer tongues), together known as linkage rules (Luis and Rodriguez-Girones, 2007). If these linkage rules were known, each pollinator species’ interactions could be represented as a mixture of interaction groups, where each interaction group represents a group of plant species and pollinator species that interact in accordance with one or more linkage rules. However, ecologists do not always know how many, or which, linkage rules contribute to the observed interactions. Further, current sampling efforts generally do not provide data at an adequate resolution to facilitate identification of possible linkage rules (Bascompte and Jordano, 2007; Dormann et al., 2009). Most surveys record which plant species were visited and by which visitor species, but not the phenotypic traits of the specific visitor or plant involved in each observed interaction (e.g. visitor proboscis length or wing span, and floral tube length, colour or symmetry), so it is hard to build linkage rules from the data. If the data is known to be an aggregation of several studies, one could instead examine the smaller webs from the individual surveys. Hence most analyses of visitation webs focus on calculating network statistics that quantify the network structure.

2.2 Network terms and structure

Several terms are commonly used to define a community’s network structure. The terms used in this thesis include:

• Link: An interspecific interaction, usually for presence/absence networks (Jordano, 1987).

• Connectance: The proportion of possible links that are actually realized (Jordano, 1987; Vázquez et al., 2009).

• Generalist: A species that is highly connected within a network (Vázquez et al., 2009).

• Specialist: A species with low connectance within a network (Vázquez et al., 2009).

• Degree: The number of species linked to a species from another trophic level (e.g. pollinator species linked to plant species) (Vázquez et al., 2009).

• Nestedness: The tendency of a specialist species to interact with a subset of the partners interacting with a more generalist species (Vázquez et al., 2009).

In this thesis, nestedness is measured using BINMATNEST, which represents nestedness for a network as a temperature between 0 and 100, with 0 denoting a fully nested network and 100 denoting a compartmental network (Rodriguez-Girones and Santamaria, 2006). All of these network properties can easily be computed using the bipartite package (Dormann et al., 2011) in the R software (R Development Core Team, 2011).
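As a small illustration of the terms above, the following Python sketch computes connectance and degree from a visitation matrix. It is not the bipartite package: the function names and the toy counts are assumptions made here for illustration, and nestedness (BINMATNEST) is deliberately left out.

```python
# Toy visitation matrix (hypothetical counts): rows are pollinator species,
# columns are plant species, entries are observed visit frequencies.
web = [
    [5, 2, 0, 1],
    [3, 0, 0, 0],
    [0, 0, 4, 0],
]


def connectance(matrix):
    """Proportion of possible pollinator-plant links that are realized."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    links = sum(1 for row in matrix for count in row if count > 0)
    return links / (n_rows * n_cols)


def pollinator_degrees(matrix):
    """Number of plant species each pollinator species is linked to."""
    return [sum(1 for count in row if count > 0) for row in matrix]


def plant_degrees(matrix):
    """Number of pollinator species each plant species is linked to."""
    n_cols = len(matrix[0])
    return [sum(1 for row in matrix if row[j] > 0) for j in range(n_cols)]
```

In this toy web the first pollinator has the highest degree and would be described as the most generalist species, while the other two are specialists.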

Apart from the interaction structure of a visitation web, it is also valuable to know the distribution of visitor species in the community. This distribution is often modelled using a species abundance distribution (SAD), which is defined to be a vector of the relative abundances of all species in the ecosystem. Species abundances in pollination networks tend to form a “hollow curve”, meaning that in nature only a few species are highly abundant while the vast majority of species are rare (McGill et al., 2007). This trend in species abundance has led to several models that attempt to predict species abundance for an ecosystem. One such family of models is the Log-Normal distribution (McGill et al., 2007). In the simulations conducted in Chapter 4, the Log-Normal distribution is used to generate mean pollinator species abundances.

2.3 Methods used to identify compartments

As mentioned earlier, much recent work has analyzed pollination webs by grouping plant and pollinator species into functional groups, i.e. modules.

2.3.1 Trophic similarity

Trophic similarity was first developed to identify compartments in food webs (Lawton and Pimm, 1980). The trophic similarity measures how closely a pair of species share interaction partners in a qualitative bipartite web. Using the Jaccard coefficient, the trophic similarity for a pair of species i and j in a pollination web is estimated by,

\[
C_{ij} = \frac{a}{a + b + c}, \qquad i, j = 1, \dots, N, \tag{2.1}
\]

where a is the number of plant species that i and j both visit, b is the number of plant species only i visits and c is the number of plant species only j visits. The trophic similarity indices are contained in [0, 1] (Hall and Raffaelli, 1992). By plotting a histogram of $C_{ij}$ for all i, j, patterns emerge that expose compartments in the pollination network. For instance, the histogram should be multi-modal if the data is compartmental; otherwise, a unimodal histogram would suggest a random web (Hall and Raffaelli, 1992).
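The Jaccard form of Equation 2.1 can be sketched in a few lines of Python. The species names and visit sets below are hypothetical, and returning 0 for a pair with no recorded visits at all is an assumption made here, since the text does not cover that case.

```python
from itertools import combinations


def trophic_similarity(plants_i, plants_j):
    """Jaccard coefficient a / (a + b + c) for two sets of visited plant species."""
    a = len(plants_i & plants_j)  # plants visited by both i and j
    b = len(plants_i - plants_j)  # plants visited only by i
    c = len(plants_j - plants_i)  # plants visited only by j
    total = a + b + c
    return a / total if total > 0 else 0.0


# Hypothetical qualitative web: pollinator species -> set of visited plant species.
visits = {
    "bee": {"clover", "thistle", "daisy"},
    "fly": {"daisy", "thistle"},
    "moth": {"primrose"},
}

# Similarity for every unordered pair of pollinator species.
similarities = {
    (i, j): trophic_similarity(visits[i], visits[j])
    for i, j in combinations(sorted(visits), 2)
}
```

A histogram of the values in `similarities` is the diagnostic described above.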

2.3.2 Simulated annealing algorithm (SA)

Simulated annealing is a stochastic optimization technique which iteratively partitions a bipartite web into a finite number of modules. The goal of the algorithm is to maximize the modularity, defined to be,

\[
M = \sum_{s=1}^{N_m} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^{2} \right], \tag{2.2}
\]

where $N_m$ is the number of modules, $l_s$ is the number of links between plants and pollinators in module s, L is the number of links in the pollination network, and $d_s$ is the sum of the degrees for the species in module s (Amarel and Guimerà, 2005). The algorithm relies on a computational temperature $T_c$ to configure the modules and, at each step, update the plant and pollinator species within the modules. Initially, SA assumes the network is comprised of $N_m$ modules. At each step a change to the network is proposed (e.g. some species are moved from one module to another), and the change is accepted based on some acceptance probability p. The acceptance probability depends on the cost associated with a proposed change and the computational temperature $T_c$. In particular, the changes for the ith iteration are accepted with probability:

\[
p = \begin{cases} 1 & \text{if } C_i \le C_{i-1}, \\[4pt] \exp\left( -\dfrac{C_i - C_{i-1}}{T_c} \right) & \text{if } C_i > C_{i-1}, \end{cases}
\]

where $C_i$ is the cost associated with the ith iteration, equal to the negative modularity of the network after the proposed change. Initially, $T_c$ is high to allow the algorithm to explore larger changes in the modules and accept with high p. After each iteration $T_c$ is cooled by a factor of c, with $T_c \leftarrow c\,T_c$ for the next iteration. As the computational temperature “cools”, only small changes in the network are accepted after several iterations. The algorithm stops when $T_c = 0$ (Dupont and Jens, 2009; Amarel and Guimerà, 2005).
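The two ingredients of SA described above, the modularity score and the acceptance rule, can be sketched as follows. This is a minimal illustration only: the function names are assumptions, modules are summarized as hypothetical $(l_s, d_s)$ pairs, and the proposal step that moves species between modules is not shown.

```python
import math
import random


def modularity(modules, total_links):
    """Equation 2.2; `modules` is a list of (l_s, d_s) pairs, one per module."""
    return sum(
        l_s / total_links - (d_s / (2 * total_links)) ** 2
        for l_s, d_s in modules
    )


def acceptance_probability(cost_new, cost_old, temperature):
    """Accept improvements always; accept worse proposals with exp(-increase / T_c)."""
    if cost_new <= cost_old:
        return 1.0
    return math.exp(-(cost_new - cost_old) / temperature)


def accept(cost_new, cost_old, temperature, rng=random):
    """Draw the accept/reject decision for one proposed change."""
    return rng.random() < acceptance_probability(cost_new, cost_old, temperature)


def cool(temperature, factor):
    """Geometric cooling step T_c <- c * T_c, with 0 < factor < 1."""
    return factor * temperature
```

Here the cost passed to `acceptance_probability` would be the negative of `modularity(...)` for the current and proposed partitions.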

Unlike trophic similarity or SA, latent Dirichlet allocation is a mixture model for the pollinator species such that, at the species level, species may belong to more than one module. Chapter 3 summarizes LDA, and describes a model selection procedure to choose the number of groups most compatible with some given data.

Chapter 3

Methodology

This chapter provides background on LDA using the collapsed Gibbs sampler in Section 3.1. The algorithm that ensures consistent results using LDA is explained in Section 3.2. The different model selection criteria used in this thesis are explained in Section 3.3.

3.1 Latent Dirichlet allocation

This thesis proposes that latent Dirichlet allocation (LDA) from artificial intelligence (Blei et al., 2002) be used to model plant-pollinator species interactions as a mixture of interaction groups. Instead of strict module identification for plant and pollinator species, LDA allows individual interactions to belong to separate modules for plant-pollinator species pairs.

In other words, LDA can be used to model plant-pollinator networks as a mixture of sets of linkage rules. The model assigns individual interactions to a specific set so that all pollinator species in the network are represented as a mixture of a finite number of sets of linkage rules based on the assignment to the interactions observed.

In the remainder of this thesis, we will use the term “linkage rule” to mean one or more linkage rules that define one of these sets, i.e. a compound linkage rule. LDA is a hierarchical latent variable model originally developed to model words in a document as a mixture of topics. The topics in the model have an associated distribution over the vocabulary of words, meaning certain words have a higher probability of belonging to a specific topic. Just as a document may arise from a mixture of topics, thereby consisting of a collection of words corresponding to those topics, the observed interactions of a pollinator species may arise from a mixture of linkage rules, thereby including those plant species that are in accordance with those linkage rules (with respect to the pollinator species).

The linkage rule associated with each count in the interaction web can be estimated using either the Variational Expectation-Maximization (VEM) algorithm (Blei et al., 2002) or the collapsed Gibbs sampler (Chang, 2009). The Gibbs sampler is a much faster algorithm that provides a tighter lower bound on the log-likelihood. Sampling the random variables in the model helps to avoid parameter estimates that are stuck at a local maximum. This thesis only uses the collapsed Gibbs sampler, which is readily available in the lda package (Chang, 2009) in R (R Development Core Team, 2011). The collapsed Gibbs sampler is used to identify the linkage rule assignment to the observed interactions in both the simulation study and the analysis of the real dataset.

For a given ecosystem, let N represent the number of pollinator species, M represent the number of plant species and K be the number of interaction groups driving the plant-pollinator interactions. We take the view that the interactions observed for a particular pollinator species are driven by a mixture of interaction groups which may be reflected by variation of phenotypic traits within individuals of the pollinator species (Amorim et al., 2012). However, for a given individual pollinator of that species there is only one linkage rule that drives its observed interactions. We also assume that all observed interactions are independent.

Let Y represent a matrix of observed visit or interaction frequencies and Z be a random variable that specifies which group drives a given interaction. For a given visitor species a, let $n_a$ represent the number of observed interactions for visitor species a, and $n_A = \sum_a n_a$, i.e. the total number of observed interactions in the web.

The probability that an individual pollinator of species a belongs to interaction group k (i.e. it chooses which plant species to visit based on linkage rules defined by group k) is given by,

\[
P(Z_a = k) = \theta^a_k, \quad k = 1, 2, \dots, K, \quad \text{and} \quad \sum_{k=1}^{K} \theta^a_k = 1, \tag{3.1}
\]

where the $\theta^a_k$ represent the mixing probabilities and are pollinator species specific, i.e. the contribution of each interaction group to the set of observed interactions depends on the pollinator species. For example, proboscis length-tubal length complementarity may be the main rule driving interactions for a given pollinator species, but plant colour intensity may be the main rule driving interactions for a different pollinator species. Alternatively, interactions for a pollinator species may correspond to a mix of the two groups such that proboscis length-tubal length complementarity drives the interactions for a proportion of pollinators of that species, but colour intensity drives the interactions for the remaining pollinators of that species.

Finally, let $\beta_Z$ be an M × 1 vector of probabilities associated with an interaction group, such that $\beta_k$ represents the probabilities with which a pollinator visits the plant species conditional on Z = k. In summary, for a given pollinator species a, $Z_a$ is a multinomial random variable with parameters $\theta^a$; conditional on $Z_a = k$, $Y^a$ is a multinomial random variable with parameters $\beta_k$. These parameters, data and random variables are summarized in Table 3.1 below.

Table 3.1: Notation for the LDA.

Symbol                  Definition
N                       Number of visitor species
M                       Number of plant species
K                       Number of interaction groups in LDA
Z                       Group assignment for an interaction
Y                       N by M matrix of interactions
$n_a$                   Total observed interactions for visitor species a
$n_A$                   Total observed interactions in Y
$Y^a_{j*}$              ath pollinator and jth plant species interactions, summed over all k groups
$Y^a_{jk}$              ath pollinator and jth plant species interactions for the kth group
$Y^a_{i,jk}$            The ith individual of visitor species a visiting the jth plant species for the kth group
$\beta_Z$               Conditional group probability of length M
$\theta^a_k$            Mixture probability of length K for visitor species a
$\eta_Z$                Dirichlet hyperparameter on $\beta_Z$
α                       Dirichlet hyperparameter on $\theta^a$
$\hat{\beta}_k$         An estimate of a conditional group probability of length M
$\hat{\theta}^a_k$      An estimate of a mixture probability of length K for visitor species a
$\bar{\beta}_k$         An average of $\hat{\beta}_k$ estimates of a conditional group probability of length M
$\bar{\theta}^a_k$      An average of $\hat{\theta}^a_k$ mixture probability estimates of length K for visitor species a

Further, LDA is a generative model that assumes a prior distribution over the parameters $\theta^a$ and $\beta_Z$. In particular, symmetric Dirichlet distributions with hyperparameters α and $\eta_Z$ for $\theta^a$ and $\beta_Z$ respectively, are included in the model. Figure 3.1 provides a graphical representation of LDA and demonstrates how the random variables and parameters are related to each other as per above.

Figure 3.1: A graphical representation of the LDA model applied to the ath pollinator species with $n_a$ observed counts on M plant species. $Z^a$ and $\theta^a$ are K-vectors, $Y^a$ and $\beta_z$ are M-vectors and α and $\eta_Z$ are scalars for Z = 1, ..., K and a = 1, ..., N.

Although LDA is an easily interpretable model, in practice fitting such models is not straightforward. In particular, maximum likelihood estimation cannot be used because the random variable Z is latent. One goal of LDA is to evaluate the posterior distribution of Z,

\[
P(Z \mid Y) = \frac{P(Z, Y)}{\sum_{z} P(Z, Y)}. \tag{3.2}
\]

The probabilities P(Z) and P(Y | Z) can be obtained using the graphical model shown in Figure 3.1 and marginalizing out θ and β respectively. In particular, Griffiths and Steyvers (2004) showed that the probabilities P(Z) and $P(Y^a \mid Z = k)$ for pollinator species a and the kth group are, on the log scale for the latter:

\[
\log P(Y^a \mid Z = k) = K \log\left( \frac{\Gamma(M\eta)}{\Gamma(\eta)^{M}} \right) + \sum_{k} \left[ \sum_{j} \log \Gamma\left( Y^a_{jk} + \eta \right) - \log \Gamma\left( Y^a_{*k} + M\eta \right) \right], \tag{3.3}
\]

and

\[
P(Z) = \left( \frac{\Gamma(K\alpha)}{\Gamma(\alpha)^{K}} \right)^{N} \prod_{a=1}^{N} \frac{\prod_{k=1}^{K} \Gamma\left( Y^a_{*k} + \alpha \right)}{\Gamma\left( Y^a_{**} + K\alpha \right)}, \tag{3.4}
\]

where “*” means “summed over”, so that $Y^a_{*k} = \sum_j Y^a_{jk}$ and $Y^a_{**} = \sum_k \sum_j Y^a_{jk}$. Using Bayes’ rule, the posterior distribution P(Z | Y) requires the joint distribution P(Z, Y) to be marginalized over all the latent linkage rule groups in the model, which is intractable. An alternative means of estimating the posterior is a Markov Chain Monte Carlo (MCMC) method, the collapsed Gibbs sampler. The collapsed Gibbs sampler for LDA iteratively estimates the linkage rule group membership for each interaction individually based on the full conditional probability for the ith observed interaction, which is proportional to:

\[
P(Z_i = k \mid Z_{-i}, Y^a_{jk}) \propto \frac{Y^{*}_{-i,jk} + \eta}{Y^{*}_{-i,*k} + M\eta} \cdot \frac{Y^a_{-i,*k} + \alpha}{Y^a_{-i,**} + K\alpha}, \tag{3.5}
\]

where −i means the ith individual of species a is excluded for the estimation of the current group assignment for interaction i and $Z_i$ is the current group assignment for individual i (Griffiths and Steyvers, 2004). Since the ith individual group assignment is excluded from the full conditional probability, we have $\sum_j \sum_k Y^a_{-i,jk} = n_a - 1$.

The collapsed Gibbs sampler only requires initial estimates of the counts for the groups and the symmetric prior parameters, which are assumed known. The chain is initialized through random assignment of the counts to the K different linkage rule groups. Estimating Z for each interaction in matrix Y is repeated for T iterations of the Markov chain. The group assignments from the Tth iteration are used as the final estimates of $\theta^a$ and $\beta_Z$ in the model (Hornik and Grün, 2011), and are given by (suppressing the iteration number index),

\[
\hat{\beta}_{jk} = \frac{Y^{*}_{jk} + \eta}{Y^{*}_{*k} + M\eta}, \tag{3.6}
\]

and

\[
\hat{\theta}^a_k = \frac{Y^a_{*k} + \alpha}{Y^a_{**} + K\alpha}. \tag{3.7}
\]

The LDA collapsed Gibbs sampler is presented in Algorithm 1. LDA is implemented in the R statistical software package (R Development Core Team, 2011) using the lda package (Chang, 2009), which requires as input: initial values for α and η, the size of the burn-in period B, the number of samples T to be generated, and the number of interaction groups K driving the interactions.
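To make the sampling step concrete, here is a minimal Python sketch of a collapsed Gibbs sweep based on Equation 3.5, followed by the point estimates of Equations 3.6 and 3.7. It is not the lda package and does not reproduce its interface: the function name, argument names and data layout are assumptions, and burn-in handling, averaging over samples and the log-likelihood output are omitted for brevity.

```python
import random


def collapsed_gibbs_lda(Y, K, alpha, eta, sweeps, seed=0):
    """Y[a][j] is the count of visits by pollinator a to plant j (hypothetical layout)."""
    rng = random.Random(seed)
    N, M = len(Y), len(Y[0])
    # One token per observed interaction: (pollinator a, plant j).
    tokens = [(a, j) for a in range(N) for j in range(M) for _ in range(Y[a][j])]
    z = [rng.randrange(K) for _ in tokens]  # random initial group assignments

    plant_group = [[0] * K for _ in range(M)]    # Y^*_{jk}: summed over pollinators
    species_group = [[0] * K for _ in range(N)]  # Y^a_{*k}: summed over plants
    group_total = [0] * K                        # Y^*_{*k}
    for (a, j), k in zip(tokens, z):
        plant_group[j][k] += 1
        species_group[a][k] += 1
        group_total[k] += 1

    for _ in range(sweeps):
        for i, (a, j) in enumerate(tokens):
            k_old = z[i]
            # Remove interaction i from the counts (the "-i" in Equation 3.5).
            plant_group[j][k_old] -= 1
            species_group[a][k_old] -= 1
            group_total[k_old] -= 1
            # Unnormalized full conditional; the denominator of the second
            # factor does not depend on k, so it is dropped.
            weights = [
                (plant_group[j][k] + eta) / (group_total[k] + M * eta)
                * (species_group[a][k] + alpha)
                for k in range(K)
            ]
            k_new = rng.choices(range(K), weights=weights)[0]
            z[i] = k_new
            plant_group[j][k_new] += 1
            species_group[a][k_new] += 1
            group_total[k_new] += 1

    n_a = [sum(row) for row in Y]
    # Equation 3.7: theta[a][k]; Equation 3.6: beta[k][j].
    theta = [
        [(species_group[a][k] + alpha) / (n_a[a] + K * alpha) for k in range(K)]
        for a in range(N)
    ]
    beta = [
        [(plant_group[j][k] + eta) / (group_total[k] + M * eta) for j in range(M)]
        for k in range(K)
    ]
    return theta, beta, z
```

A call such as `collapsed_gibbs_lda(Y, K=3, alpha=0.1, eta=0.1, sweeps=200)` returns one set of estimates from the final sweep; the group labels are arbitrary, which is the issue addressed in Section 3.2.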

3.2 Kullback-Leibler (KL) divergence and label switching

Since the posterior mean of $\hat{\theta}$ and $\hat{\beta}$ is used as the final estimate, the ordering of the conditional interaction groups between runs may not be the same. To fix this issue, a label-switching algorithm that ensures consistency of the final θ and β estimates is employed. The relabeling algorithm uses the Kullback-Leibler (KL) divergence, measuring the distance of the estimates between runs to a specific reference distribution Q (Stephens, 2000). The relabeling algorithm is shown in Algorithm 2 for labelling the $\hat{\beta}$ estimate with K interaction groups.

The Q distribution is the true parameter in a simulation study or it is the estimate from the first run if a real dataset is being analyzed. The pseudocode shows

Algorithm 1 The LDA collapsed Gibbs sampler.
Inputs: T, B, α, η, Y, N, $n_a$ and M
Outputs: $\hat{\beta}$, $\hat{\theta}$, log-likelihood (LL) and group assignments for the $n_A$ interactions
Initialization: randomly assign the interactions in Y to the K groups
for t = 2 to T do
    for a = 1 to N do
        for k = 1 to K do
            for j = 1 to M do
                for i = 1 to $n_a$ do
                    Sample and assign $Y^a_{i,jk}$ to group k with probability
                    $P(Z_i = k \mid Z_{-i}, Y^a_{jk}) = \frac{Y^{*(t)}_{-i,jk} + \eta}{Y^{*(t)}_{-i,*k} + M\eta} \cdot \frac{Y^{a(t)}_{-i,*k} + \alpha}{Y^{a(t)}_{-i,**} + K\alpha}$
                end for
            end for
        end for
    end for
end for
After T iterations, estimate β and θ:
for k = 1 to K do
    for a = 1 to N do
        $\hat{\theta}^a_k = \hat{\theta}^{a(T)}_k = \frac{Y^{a(T)}_{*k} + \alpha}{Y^{a(T)}_{**} + K\alpha}$
    end for
    for j = 1 to M do
        $\hat{\beta}_{jk} = \hat{\beta}^{(T)}_{jk} = \frac{Y^{*(T)}_{jk} + \eta}{Y^{*(T)}_{*k} + M\eta}$
    end for
end for
Estimate LL using $\hat{\beta}$ and $\hat{\theta}$

Algorithm 2 Relabeling Algorithm: Labelling a $\hat{\beta}$ estimate for a single run.
Inputs: T = $\vec{0}$ of size K × K, Q = β of size M × K and $\hat{\beta}$ of size M × K
Outputs: Relabelled $\hat{\beta}$ according to Q
for k = 1 to K do                (number of groups K in the reference Q)
    for i = 1 to K do            (number of groups K in the estimate $\hat{\beta}$)
        $T_{ik} = \sum_{j=1}^{M} \hat{\beta}_{ji} \log \frac{\hat{\beta}_{ji}}{Q_{jk}}$
    end for
end for
for k = 1 to K do
    for i = 1 to K do
        if $T_{ik} = \min(\vec{T}_i)$ then
            move $\hat{\beta}_i$ to index k
        end if
    end for
end for

the reordering of the $\hat{\beta}$ estimate so that each $\hat{\beta}_k$ is ordered to the same index as the $Q_k$ based on the smallest KL distance. If the minimum KL divergence for the kth group has more than one $\hat{\beta}$ estimate assigned to its index, then one of the estimates is randomly assigned to the kth group. For example, if two $\hat{\beta}$ estimates are assigned to $Q_k$, then one estimate is assigned to the kth group index with a probability of 1/2. Once the ordering for the estimated $\hat{\beta}$ is known relative to Q, the $\hat{\theta}$ parameter is ordered accordingly. The optimal number of interaction groups representing the linkage rules used to represent the data is assessed by multiple model selection criteria that maximize the log-likelihood and penalize for complexity based on the number of parameters in the model.
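The KL-based relabelling idea can be illustrated with the Python sketch below. Note that it is a variant, not Algorithm 2 itself: instead of matching each estimated group to its nearest reference group and breaking ties at random, it searches all K! orderings for the one with the smallest total KL divergence, which is feasible for the small K considered here and avoids ties. The function names and the group-major layout (`beta_hat[k][j]`) are assumptions.

```python
import math
from itertools import permutations


def kl_divergence(p, q):
    """KL divergence of p from q; q is assumed strictly positive where p > 0."""
    return sum(p_j * math.log(p_j / q_j) for p_j, q_j in zip(p, q) if p_j > 0)


def relabel(beta_hat, reference):
    """Reorder the groups in beta_hat to best match the reference Q.

    Returns the reordered estimate and the permutation used, where
    permutation[k] is the original index of the group now placed at index k.
    """
    K = len(reference)
    best = min(
        permutations(range(K)),
        key=lambda perm: sum(
            kl_divergence(beta_hat[perm[k]], reference[k]) for k in range(K)
        ),
    )
    return [beta_hat[i] for i in best], list(best)


def reorder_theta(theta, permutation):
    """Apply the same group ordering to each species' mixing probabilities."""
    return [[row[i] for i in permutation] for row in theta]
```

Dirichlet-smoothed estimates such as those of Equation 3.6 are strictly positive, so the logarithm in `kl_divergence` is well defined for them.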

3.3 Model Selection

In practice, the true number of interaction groups, K, is unknown. However, one could fit LDA with different values of K and choose the “best” model based on a selection criterion. This thesis only considers likelihood-based scores including perplexity (PY), the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Deviance Information Criterion (DIC) and the Information Criterion (IC). A new ad-hoc score, the penalized perplexity, is also introduced.

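All of the model selection criteria in this thesis require the log-likelihood for the fitted model. The log-likelihood of a given dataset Y with K interaction groups is:

\[
LL(\hat{\theta}, \hat{\beta}) = \log P(Y \mid \hat{\theta}, \hat{\beta}) = \sum_{a=1}^{N} \sum_{j=1}^{M} \left[ Y^a_{j*} \log \sum_{k=1}^{K} \hat{\theta}^a_k \hat{\beta}_{jk} \right], \tag{3.8}
\]

where $Y^a_{j*}$ is the observed count for pollinator species a on plant species j, summed over groups (Table 3.1).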

Each score uses a different quantity to penalize the likelihood for model complexity.

3.3.1 Perplexity

Perplexity (PY) is a score used in language modelling and is calculated as the exponential of the cross-entropy. The PY for a model with K interaction groups is:

\[
PY = \exp\left[ \frac{-LL(\hat{\theta}, \hat{\beta})}{n_A} \right]. \tag{3.9}
\]

Perplexity is the standard language model selection criterion used for measuring performance of the fitted model and the uncertainty for prediction of observed words (Griffiths and Steyvers, 2004). The log perplexity can be interpreted as the cross-entropy since $LL(\hat{\theta}, \hat{\beta})$ evaluates the estimated model and is summed over N and M, so $Y^a_j / n_A$ is an empirical distribution for each plant-pollinator species pair (Chen, 2009). The cross-entropy measures the uncertainty of the fitted model to the dataset. The exponential cross-entropy then provides a measure of model fit, with lower perplexity scores providing a better fit to the data.

Although perplexity is the standard model selection criterion for LDA, other criteria are considered as potential candidates in the simulation study for this thesis.
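As a concrete reading of Equations 3.8 and 3.9, the sketch below evaluates the log-likelihood and perplexity for a count matrix and a pair of estimates. The function names and the nested-list layout (`theta[a][k]`, `beta[k][j]`) are assumptions made for illustration; cells with zero observed count contribute nothing to the sum and are skipped.

```python
import math


def log_likelihood(Y, theta, beta):
    """Equation 3.8 with Y[a][j] counts, theta[a][k] and beta[k][j]."""
    K = len(beta)
    ll = 0.0
    for a, row in enumerate(Y):
        for j, count in enumerate(row):
            if count > 0:
                # Marginal probability that species a visits plant j.
                p = sum(theta[a][k] * beta[k][j] for k in range(K))
                ll += count * math.log(p)
    return ll


def perplexity(Y, theta, beta):
    """Equation 3.9: exponential of the negative mean log-likelihood per interaction."""
    n_total = sum(sum(row) for row in Y)
    return math.exp(-log_likelihood(Y, theta, beta) / n_total)
```

A model that assigns probability 1/2 to every observed interaction has perplexity 2, while a model that predicts every observed interaction with certainty has perplexity 1.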

3.3.2 Akaike Information Criterion (AIC)

A different selection criterion approach uses the asymptotic result of the average log-likelihood, which equals the entropy for the fitted model as $n_A$ approaches infinity. The entropy is the KL-divergence of the true distribution to the model distribution of the data (Akaike, 1974). A smaller KL distance estimated by the average log-likelihood suggests a better fit to the data. AIC uses the estimated result:

\[
\sqrt{n_A}\,\left( L(\hat{\theta}, \hat{\beta}) - L(\theta, \beta) \right) \sim N(0, J^{-1}), \tag{3.10}
\]

where J is the Fisher Information matrix, and $L(\theta, \beta)$ and $L(\hat{\theta}, \hat{\beta})$ are the true likelihood and the likelihood evaluated at the MLEs $\hat{\theta}$ and $\hat{\beta}$ respectively. Under regularity conditions, the natural logarithm of this result is $\chi^2_p$ with p being the number of parameters in the model. The true $LL(\theta, \beta)$ is discarded and $LL(\hat{\theta}, \hat{\beta})$ is corrected by adding p for the bias because $LL(\hat{\theta}, \hat{\beta})$ is an estimate of the true distribution (Akaike, 1974). The result is AIC defined by:

\[
AIC = -2\,LL(\hat{\theta}, \hat{\beta}) + 2p, \tag{3.11}
\]

where p = K(M − 1) + N(K − 1) is the number of free parameters in the fitted model.

Since the interaction data is sampled from a finite population, we consider a version of AIC that includes a finite-sample correction, given by

\[
\frac{2p(p + 1)}{n_A - p - 1}. \tag{3.12}
\]

The corrected criterion (AICc) is as follows:

\[
AICc = -2\,LL(\hat{\theta}, \hat{\beta}) + 2p + \frac{2p(p + 1)}{n_A - p - 1}. \tag{3.13}
\]

For the remainder of this thesis, the AICc will be referred to as AIC. Other criteria exist which rely on the asymptotic results of the log-likelihood for large sample sizes (Schwarz, 1978).

3.3.3 Bayesian Information Criterion (BIC)

The BIC approximates the log-likelihood through a first order Taylor Series expansion around the maximum likelihood estimates of the parameters for large $n_A$ (Raftery, 1995), given by:

\[
\log P(Y) \approx \log P(Y \mid \theta, \beta) - \frac{p}{2} \log n_A = LL - \frac{p}{2} \log n_A. \tag{3.14}
\]

The BIC is obtained by multiplying Equation 3.14 by −2, and can be interpreted as an approximation to the deviance between the fitted model and the null model for the data (Raftery, 1995). The BIC is expressed as:

\[
BIC = -2\,LL + p \log n_A. \tag{3.15}
\]

When comparing BIC scores between different models, the “best” model will have the lowest BIC score. The penalty for BIC is much stronger than that for AIC since the correction term is of the order $\log(n_A)$. Other model selection criteria, such as DIC and IC, exist for high dimensional models where the number of parameters may exceed the number of observations using MCMC methods (Spiegelhalter et al., 2002; Ando, 2009). The implementation of LDA considered in this thesis uses the collapsed Gibbs sampler, an MCMC method to maximize a lower bound on the log-likelihood. It is known that hierarchical models using MCMC methods may contain more parameters than the number of observations in the datasets. Both AIC and BIC require the number of parameters to calculate the score for a model and they may not be applicable if the number of parameters exceeds the number of observations.
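The parameter count and the three scores of Equations 3.11, 3.13 and 3.15 are simple to compute once the fitted log-likelihood is available. The following Python sketch uses hypothetical function names; raising an error when $n_A - p - 1 \le 0$ is a choice made here to flag the situation, described above, in which the number of parameters approaches the number of observations.

```python
import math


def n_free_parameters(N, M, K):
    """p = K(M - 1) + N(K - 1) for N pollinator species, M plant species, K groups."""
    return K * (M - 1) + N * (K - 1)


def aic(ll, p):
    """Equation 3.11."""
    return -2.0 * ll + 2.0 * p


def aicc(ll, p, n_total):
    """Equation 3.13; n_total is the total number of observed interactions n_A."""
    if n_total - p - 1 <= 0:
        raise ValueError("AICc is undefined when n_A - p - 1 <= 0")
    return aic(ll, p) + 2.0 * p * (p + 1) / (n_total - p - 1)


def bic(ll, p, n_total):
    """Equation 3.15."""
    return -2.0 * ll + p * math.log(n_total)
```

For a 20 × 9 web fitted with K = 3, `n_free_parameters(20, 9, 3)` gives p = 64, so a web with fewer than 66 observed interactions would already fall outside the range where the corrected AIC is defined.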

3.3.4 Deviance Information Criterion (DIC)

The motivation for DIC is to assess the fit and complexity of a model with arbitrary structure (Spiegelhalter et al., 2002). In particular, the DIC penalty is the effective number of model parameters and is estimated by:

\[
p_D \approx \bar{D}(\hat{\theta}, \hat{\beta}) - D(\bar{\theta}, \bar{\beta}), \tag{3.16}
\]

where D(·) is the deviance score (i.e. $-2\,LL(\hat{\theta}, \hat{\beta})$), $\hat{\theta}$ and $\hat{\beta}$ are the MLEs of θ and β, and $\bar{\theta}$ and $\bar{\beta}$ are the empirical average estimates from separate Monte Carlo samples for the dataset. It has been shown that $p_D$ approximates the effective number of parameters in the model (Spiegelhalter et al., 2002). Further, $\bar{D}(\hat{\theta}, \hat{\beta})$ is computed as the empirical average deviance from separate Monte Carlo samples. DIC estimates the MLE for the deviance using the average deviance of all parameter estimates and penalizes for model complexity using $p_D$ (Spiegelhalter et al., 2002), as follows:

\[
DIC = \bar{D}(\hat{\theta}, \hat{\beta}) + p_D. \tag{3.17}
\]

The deviance is a measure of uncertainty from the fitted model, with a larger deviance translating to greater uncertainty. The score calculated for separate models then suggests the “best” model will have the lowest DIC. The DIC tends to overfit the data, selecting a model with several parameters, suggesting the penalty is too weak.

3.3.5 Information Criterion (IC)

IC was developed to adjust for the tendency of DIC to overfit based on the deviance from Monte Carlo samples. The bias correction is the same as the penalty used for DIC (Ando, 2009). The bias corrected posterior mean of the likelihoods is then used as a maximum likelihood estimate for the deviance. The IC is then:

\[
IC = \bar{D}(\hat{\theta}, \hat{\beta}) + 2p_D = -2\,\frac{1}{n} \sum_{i=1}^{n} LL(\hat{\theta}_i, \hat{\beta}_i) + 2p_D, \tag{3.18}
\]

where n is the number of Monte Carlo samples with different parameter estimates. Using the same penalty term as the DIC, the IC is twice as strict for model complexity. The model with the smallest IC score will provide the “best” fit to the data.
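Equations 3.16 to 3.18 can be combined in one short Python sketch. It assumes two inputs that the sampler would have to supply: the log-likelihood evaluated at each Monte Carlo sample's estimates, and the log-likelihood evaluated at the averaged estimates $\bar{\theta}$ and $\bar{\beta}$. The function name and the returned dictionary are illustrative choices.

```python
def dic_and_ic(sample_log_likelihoods, log_likelihood_at_mean):
    """Return p_D, DIC and IC from per-sample log-likelihoods.

    sample_log_likelihoods: LL(theta_i, beta_i) for each Monte Carlo sample i.
    log_likelihood_at_mean: LL evaluated at the averaged estimates.
    """
    n = len(sample_log_likelihoods)
    mean_deviance = sum(-2.0 * ll for ll in sample_log_likelihoods) / n
    deviance_at_mean = -2.0 * log_likelihood_at_mean
    p_d = mean_deviance - deviance_at_mean  # Equation 3.16
    return {
        "p_D": p_d,
        "DIC": mean_deviance + p_d,        # Equation 3.17
        "IC": mean_deviance + 2.0 * p_d,   # Equation 3.18
    }
```

Because both scores share the same average deviance, IC always exceeds DIC by exactly $p_D$ whenever $p_D$ is positive, which is the sense in which its penalty is twice as strict.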

3.3.6 Penalized Perplexity

In language modelling, the “best” model is chosen based on the lowest perplexity score. Perplexity can overfit the data by choosing a model with many parameters that simply maximizes the mean log-likelihood per interaction in the data. In a pollination context with fewer interactions, the “best” model’s interaction groups are meant to expose relevant traits that may facilitate or explain the observed interactions between visitor species and plant species. In particular, models with higher K values tend to give higher likelihood values and so PY tends to select more complicated models, thereby failing to reduce the dimensionality to a few relevant groups (i.e. the number of parameters may be greater than the number of interactions). To adjust for model complexity, we propose introducing a penalty term that increases the cross entropy when too many parameters are used in the model. The penalized perplexity (PP) is calculated as:

\[
PP = \exp\left[ \frac{-LL(\hat{\theta}, \hat{\beta}) + p}{n_A} \right], \tag{3.19}
\]

where p = K(M − 1) + N(K − 1) is the number of free parameters in a model with K interaction groups.
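The effect of the penalty in Equation 3.19 is easiest to see by comparing it with plain perplexity over a range of K. The Python sketch below uses hypothetical function names, and the log-likelihood values in the usage note are invented purely to illustrate the mechanics; they are not results from this thesis.

```python
import math


def penalized_perplexity(ll, n_total, N, M, K):
    """Equation 3.19 with p = K(M - 1) + N(K - 1) free parameters."""
    p = K * (M - 1) + N * (K - 1)
    return math.exp((-ll + p) / n_total)


def plain_perplexity(ll, n_total):
    """Equation 3.9, for comparison."""
    return math.exp(-ll / n_total)


def select_K(ll_by_K, n_total, N, M, penalized=True):
    """Return the K with the lowest score; ll_by_K maps K to a fitted log-likelihood."""
    if penalized:
        return min(ll_by_K, key=lambda K: penalized_perplexity(ll_by_K[K], n_total, N, M, K))
    return min(ll_by_K, key=lambda K: plain_perplexity(ll_by_K[K], n_total))
```

With made-up log-likelihoods `{2: -900.0, 3: -850.0, 4: -845.0}` for a 20 × 9 web with 500 interactions, plain perplexity picks K = 4 because the likelihood keeps creeping upward, whereas the penalized score picks K = 3 because the gain from the fourth group does not cover its 28 extra parameters.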

Chapter 4

Simulation Study

A simulation study was conducted to evaluate the ability of LDA to accurately model interaction web structure and to investigate which model selection criterion is most appropriate in this context. A total of 24 scenarios, each averaged over 500 generated data sets, were considered, with LDA fitted using $\hat{K} = 2, \dots, 6$ interaction groups. Section 4.1 outlines the design of the study, Section 4.2 explains how the data was generated, and Section 4.3 lists the statistics calculated from the results of the simulation study. Finally, Section 4.4 and Section 4.5 present the main results and discussion of the simulation study.

4.1 Study design

Network data was generated for different combinations of N × M, K and the level of nesting (none, mild, moderate, extreme and full), as shown in Table 4.1. The dimensions of N × M, namely (20 × 9), (42 × 14) and (80 × 42), were determined using the Interaction Web Database (IWDB) (Guimarães et al., 2011), which contains 35 publicly available pollination networks. The values for M, the number of plant species, were the first, median and third quartiles of M, computed from all 35 networks provided in the IWDB. A regression of $\sqrt{N}$ on $\sqrt{M}$ for these 35 networks gives the fitted line $\sqrt{N} = 0.5491 + 4.4821 \log(\sqrt{M})$. Hence, the values of N in this simulation study were obtained by plugging in the second and third quartiles of M from IWDB, as was done by Crea (2011). Instead of the first quartile for M (equal to 10), M = 9 was used with N = 20 pollinator species.
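The fitted line can be checked numerically. The short Python sketch below assumes that log denotes the natural logarithm, an assumption that is consistent with the network sizes quoted above, and rounds the result to the nearest whole species; the function name is illustrative.

```python
import math


def pollinators_from_plants(M, intercept=0.5491, slope=4.4821):
    """Predicted number of pollinator species N for M plant species."""
    sqrt_n = intercept + slope * math.log(math.sqrt(M))  # natural log assumed
    return round(sqrt_n ** 2)
```

Plugging in M = 14 and M = 42 reproduces the 42 and 80 pollinator species used for the two larger network sizes. The smallest design point (N = 20 with M = 9) is stated directly in the text and is not a value read off this line.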

Nesting was classified according to the level of overlap in conditional group probabilities for β used to generate the data. Figure 4.1 shows an example of a mildly nested interaction web scenario with K = 3 groups (scenario 3) using the bipartite package in R. Note the overlap in heavily visited plant species between interaction groups (columns 3 and 7).

Data was generated according to the graphical model shown in Figure 3.1. In particular, for a specified number of groups K, the mixing probabilities for pollinator species, $\theta^a$, were sampled from a Dirichlet(α) distribution and the interaction probabilities for group k, $\beta_k$, were sampled from a Dirichlet($\eta_k$) distribution. The mean abundance $\xi_a$ was sampled from a Log-Normal(1, 2.5) distribution for all simulations and α was taken to be 0.01. The total number of observed interactions, $n_a$, was sampled from a Poisson($\xi_a$) distribution. The values for $\eta_k$ were more complicated because they depended on M, K and the level of nesting, so the generative process for $\eta_k$ is deferred to Section 4.2. This design suggests that each sampled interaction for a visitor species will tend to have high probability of belonging to a single interaction group.
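The generative process just described can be sketched in Python using only the standard library. Several details are assumptions made here rather than statements from the text: Log-Normal(1, 2.5) is read as mean 1 and standard deviation 2.5 on the log scale, the Poisson draw uses Knuth's method for small means and a rounded normal approximation for large ones, and a one-hot fallback is used if all Gamma draws underflow to zero for very small Dirichlet parameters. The function names and the `eta[k][j]` layout are also illustrative.

```python
import math
import random


def dirichlet(params, rng):
    """Dirichlet draw via normalized Gamma variates."""
    draws = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(draws)
    if total == 0.0:  # tiny parameters can underflow; fall back to a one-hot vector
        draws = [0.0] * len(params)
        draws[rng.randrange(len(params))] = 1.0
        total = 1.0
    return [d / total for d in draws]


def poisson(lam, rng):
    """Poisson draw: Knuth's method for small lam, normal approximation otherwise."""
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_web(N, M, K, eta, alpha=0.01, mu=1.0, sigma=2.5, seed=0):
    """Simulate an N x M visitation matrix; eta[k] is the length-M parameter for group k."""
    rng = random.Random(seed)
    beta = [dirichlet(eta[k], rng) for k in range(K)]
    theta, Y = [], [[0] * M for _ in range(N)]
    for a in range(N):
        theta_a = dirichlet([alpha] * K, rng)
        theta.append(theta_a)
        xi_a = rng.lognormvariate(mu, sigma)  # mean abundance for species a
        n_a = poisson(xi_a, rng)              # number of observed interactions
        for _ in range(n_a):
            k = rng.choices(range(K), weights=theta_a)[0]  # group for this interaction
            j = rng.choices(range(M), weights=beta[k])[0]  # plant visited given group
            Y[a][j] += 1
    return Y, theta, beta
```

With α = 0.01 nearly all of each species' mixing probability falls on a single group, which matches the remark above that each visitor species tends to belong to one interaction group.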

For each scenario (combination of N × M, K and level of nesting), R = 500 datasets were generated. LDA with a collapsed Gibbs sampler was fit to each dataset for K = 2, 3, ..., 6 groups. For all scenarios, the estimation was based on T = 15,000 Monte Carlo samples from the Gibbs sampler. However, the burn-in period B depended on N × M since larger networks tended to require more iterations before the Gibbs sampler converged. The specific values of B used were 5,000 iterations for 20 × 9 networks, 10,000 iterations for 42 × 14 networks and 25,000 iterations for 80 × 42 networks. Diagnostic plots were monitored to ensure that the Gibbs sampler had reached convergence before samples were drawn from it.

Figure 4.1: Visualization of a mildly nested visitation web with 20 visitor species (rows) and 9 plant species (columns). Darker cells represent higher frequencies of interactions between the corresponding plant-visitor pairs.

Table 4.1: Dimensions and the number of interaction groups used to generate the data for the 24 different scenarios of the simulation study. No nesting corresponds to a compartmental model.

Scenario   N    M    K   Nesting
1          20   9    2   None
2          20   9    3   None
3          20   9    3   Mild
4          20   9    3   Moderate
5          20   9    4   Mild
6          20   9    4   Moderate
7          20   9    4   Extreme
8          20   9    4   Full
9          42   14   2   None
10         42   14   3   None
11         42   14   3   Mild
12         42   14   3   Moderate
13         42   14   4   Mild
14         42   14   4   Moderate
15         42   14   4   Extreme
16         42   14   4   Full
17         80   42   2   None
18         80   42   3   None
19         80   42   3   Mild
20         80   42   4   Moderate
21         80   42   4   Mild
22         80   42   4   Moderate
23         80   42   4   Extreme
24         80   42   4   Full

4.2 Data Generation and Model Fitting

The process of data generation starts at the outer nodes of the graphical model in Figure 3.1. The hyperparameters α and η of the Dirichlet priors on θ and β are specified before the generative process. The hyperparameter η is generated only once for a specified scenario and requires consideration of the level of nesting, because β inherits its structure from η. The amount of nesting for the mild and moderate scenarios is quantified by λ, the fraction of overlap for the M plant species between any two groups. For all compartmental scenarios λ = 0, and for fully nested scenarios λ = 1. For mild and moderate nesting, λ had to be chosen and varied with the dimensions N × M: the mildly and moderately nested scenarios had λ = 1/3 for scenarios 3-6 and 11-14, and λ = 1/4 for scenarios 19-22. The sampling scheme for η is outlined in Algorithm 3.

Algorithm 3 η generation for a single scenario in the simulation study.
Inputs M = num. of plant species, Nesting = 1 (none), 2 (mild), 3 (moderate), 4 (extreme) and 5 (full), λ = fraction of species belonging to more than one interaction group.
Outputs η of size M × K.
 1: Initialize η = NULL
 2: while η = NULL do
 3:     for k = 1 to K do
 4:         ηjk = 0.01 for all j = 1, ..., M
 5:         sample f1 ∼ Uniform[1, M]
 6:         sample f2 ∼ Uniform[1, M]        ▷ [f1, f2] defines potential plant species in group k
 7:         for i ∈ [f1, f2] do
 8:             sample ηik ∼ N(7, 1)
 9:         end for
10:     end for
11:     Compute RowSum(η) of length M        ▷ RowSum(η) is the row sum of the groups for η
12:     ▷ Check if η abides by specified nesting level
13:     for x ∈ Nesting do
14:         test = test[x]                   ▷ Select test per level of nesting
15:         if test passes then
16:             accept η
17:         else
18:             η = NULL
19:         end if
20:     end for
21: end while
22: return η
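The candidate-sampling part of Algorithm 3 (before any nesting test is applied) can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_eta_candidate` is hypothetical, the two endpoints are drawn as integers and sorted so that the interval is well defined, and the seed and dimensions are arbitrary.

```python
import random

def sample_eta_candidate(M, K, rng, base=0.01, mean=7.0, sd=1.0):
    """Return one candidate eta (M rows, K columns) and the sampled intervals."""
    eta = [[base] * K for _ in range(M)]
    intervals = []
    for k in range(K):
        # Two endpoints drawn uniformly from 1..M; sorting them is an assumption.
        f1, f2 = rng.randint(1, M), rng.randint(1, M)
        lo, hi = min(f1, f2), max(f1, f2)
        intervals.append((lo, hi))
        for j in range(lo, hi + 1):
            # Plant species inside the interval define group k.
            eta[j - 1][k] = rng.gauss(mean, sd)
    return eta, intervals

rng = random.Random(1)  # hypothetical seed
eta, intervals = sample_eta_candidate(M=9, K=3, rng=rng)
row_sums = [sum(row) for row in eta]  # input to the nesting tests
```

In the full algorithm this candidate would be regenerated until the test for the requested nesting level passes.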

Algorithm 3 requires λ, M and the nesting level as input, where λ specifies the fraction of the M plant species that will belong to two or more interaction groups. A plant species “belongs” to an interaction group if it is within the sampled [f1, f2] interval for the kth group, in which case a value from a Normal(7, 1) is assigned to the plant species. A Normal distribution with a mean of 7 was chosen to distinguish the plant species defining a group in the sampled [f1, f2] interval for the kth group but still maintain a reasonable weight for plant species outside [f1, f2], so the groups are less apparent when β is sampled for the scenario. Plant species that are not used to define an interaction group are not within the sampled [f1, f2] interval and they are assigned a value of 0.01. The nesting is specified at the beginning to determine which test to use in order to select an appropriate η parameter. The tests for Algorithm 3 are shown in Table 4.2.

Table 4.2: The test to accept η for a specified level of nesting.

x   Test
1   all RowSum(η) ≤ 10
2   RowSum(η) = RowSum(η)[10 < RowSum(η) ≤ 20];  0 < length[RowSum(η)] ≤ λM
3   RowSum(η) = RowSum(η)[10 < RowSum(η) ≤ 20];  λM < length[RowSum(η)] ≤ 2(λM)
4   [f1, f2]k ⊂ [f1, f2]i, i ≠ k, for < K − 1 groups
5   [f1, f2]k ⊂ [f1, f2]i, i ≠ k, for K − 1 groups

For a compartmental scenario, η is chosen if no overlap is present between the sampled plant species in [f1, f2] for each of the K groups. If the sum of ηj (for each plant species) across the K groups is at most 10 (the mean + 3 SDs) for all plant species, then η was satisfactory. This corresponds to all plant species belonging to only one interaction group with high probability (approximately 0.999). For a mildly nested scenario, if the sum of ηj (for each plant species) across the K groups is greater than 10 (the mean + 3 SDs) but less than or equal to 20 (2 × (mean + 3 SDs)) for λM of the plant species or fewer (but more than 0), then η was satisfactory. For a moderately nested scenario, if the sum of ηj (for each plant species) across the K groups is greater than 10 (the mean + 3 SDs) but less than or equal to 20 (2 × (mean + 3 SDs)) for more than λM of the plant species but less than or equal to 2λM, then η was satisfactory. For extreme nested scenarios, if at least one of the sampled [f1, f2]k is a subset of another [f1, f2]i, i ≠ k, for fewer than K − 1 groups, then η was satisfactory. For fully nested scenarios, if the sampled [f1, f2]k are a subset of another [f1, f2]i, i ≠ k, for K − 1 groups, then η was satisfactory.
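The three row-sum tests (none, mild and moderate nesting) can be sketched as below. The function name `accept_eta` and the example row sums are hypothetical, and the sketch follows the inequalities of Table 4.2 literally; the interval-containment tests for the extreme and full levels are not sketched.

```python
def accept_eta(row_sums, nesting, lam, low=10.0, high=20.0):
    """Row-sum acceptance tests: nesting 1 = none, 2 = mild, 3 = moderate."""
    M = len(row_sums)
    # Plant species whose row sum suggests membership in two groups.
    overlap = sum(1 for s in row_sums if low < s <= high)
    if nesting == 1:
        return all(s <= low for s in row_sums)
    if nesting == 2:
        return 0 < overlap <= lam * M
    if nesting == 3:
        return lam * M < overlap <= 2 * lam * M
    raise ValueError("only the row-sum tests (levels 1-3) are sketched here")

# Hypothetical row sums for M = 9 plant species and lambda = 1/3.
compartmental = [7.0] * 9             # every plant in exactly one group
mild = [7.0] * 7 + [14.0] * 2         # 2 of 9 plants shared by two groups
moderate = [7.0] * 4 + [14.0] * 5     # 5 of 9 plants shared by two groups

print(accept_eta(compartmental, 1, 1 / 3))
print(accept_eta(mild, 2, 1 / 3))
print(accept_eta(moderate, 3, 1 / 3))
```

The thresholds 10 and 20 are the mean + 3 SDs of a Normal(7, 1) and twice that value, as described above.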

Theoretically, for compartmental scenarios, a plant species could belong to more than one group and have a sum of ηj less than 10 (the mean + 3 SDs). However, inspection of the returned η parameter can verify whether a sampled η matches the nesting criteria specified. The η is generated once for a scenario to sample the β parameter. Each βk is sampled only once from a Dirichlet(ηk) at the beginning of the scenario using the selected η parameter according to the level of nesting and λ.

Once β has been sampled, the interaction webs for the scenario can be generated using the symmetric α hyperparameter to generate θa from a Dirichlet distribution for each visitor species. Once the interaction group mixture probabilities θa have been generated for each visitor species, the expected abundances ξa are sampled from a Log-Normal(1, 2.5) and the total observed counts na from a Poisson(ξa). A Log-Normal with log-scale mean 1 and shape parameter 2.5 was chosen in an attempt to replicate the abundances, na, on the 35 interaction webs in IWDB (Guimarães et al., 2011). The abundances for the webs were plotted against several randomly sampled abundances from a Log-Normal with varying log-scale and shape parameters. The parameters 1 and 2.5 were an appropriate approximation to the 35 webs. Using the Log-Normal(1, 2.5), the generation process for a single interaction web Y is shown in Algorithm 4.

Algorithm 4 Generative Process: LDA process for a single interaction web
Inputs β, α, N, M and K
Outputs Y of size N × M
 1: ξ of size N × 1
 2: for a = 1 to N do
 3:     Sample ξa ∼ Log-Normal(τ, υ)
 4:     Sample na ∼ Poisson(ξa)
 5:     θa ∼ Dirichlet(α)                ▷ K × 1
 6:     for i = 1 to na do
 7:         Sample Z ∼ Mult(θa)          ▷ Sample the ith interaction to an interaction group
 8:         Sample j ∼ Mult(βZ)          ▷ Sample the ith interaction to the jth plant species
 9:         $Y^{a}_{j*} \leftarrow Y^{a}_{j*} + 1$
10:     end for
11: end for
    return Y
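The generative process for a single web can be sketched in plain Python as follows. The helper names (`sample_poisson`, `sample_dirichlet`, `generate_web`), the toy β, and the seed are hypothetical, and treating 2.5 as the log-scale standard deviation of the Log-Normal is an assumption of this sketch. The Poisson draw counts unit-rate exponential arrivals, which is exact but slow for very large ξa.

```python
import random

def sample_poisson(lam, rng):
    """Exact Poisson(lam) draw: count unit-rate exponential arrivals in [0, lam]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def sample_dirichlet(alpha, k, rng):
    """Symmetric Dirichlet(alpha) draw via normalized gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # all draws underflowed: put all mass on one group
        one_hot = [0.0] * k
        one_hot[rng.randrange(k)] = 1.0
        return one_hot
    return [d / total for d in draws]

def generate_web(beta, alpha, N, rng, tau=1.0, upsilon=2.5):
    """beta[k][j] is the probability of plant j in group k; returns Y (N rows, M columns)."""
    K, M = len(beta), len(beta[0])
    Y = [[0] * M for _ in range(N)]
    for a in range(N):
        xi = rng.lognormvariate(tau, upsilon)    # expected abundance of visitor a
        n_a = sample_poisson(xi, rng)            # observed number of interactions
        theta = sample_dirichlet(alpha, K, rng)  # group mixture for visitor a
        for _ in range(n_a):
            z = rng.choices(range(K), weights=theta)[0]    # interaction group
            j = rng.choices(range(M), weights=beta[z])[0]  # plant species
            Y[a][j] += 1
    return Y

# Hypothetical compartmental beta with K = 2 groups and M = 4 plant species.
beta = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
Y = generate_web(beta, alpha=0.01, N=10, rng=random.Random(7))
```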

The Y matrix returned from the data generation process does not provide the group assignment for each individual interaction. The generated data with known parameters is then evaluated using the collapsed Gibbs sampler to examine the effectiveness of LDA in accurately identifying the latent interaction groups and conditional probabilities used to generate the data. If any of the plant species for a generated Y did not have any visitors, then two visitor species were sampled from a Multinomial distribution with probabilities given by the visitor abundances, and an additional interaction for each species was assigned to the originally non-visited plant species.
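The adjustment for plant species without visitors might look like the sketch below. The function name `patch_unvisited_plants` and the toy matrix are hypothetical, and drawing the two visitor species with replacement is an assumption, since the text does not state whether the same species may be drawn twice.

```python
import random

def patch_unvisited_plants(Y, rng, n_extra=2):
    """Give every plant column without visits n_extra interactions (in place)."""
    abundances = [sum(row) for row in Y]  # total interactions per visitor species
    for j in range(len(Y[0])):
        if all(row[j] == 0 for row in Y):
            # Visitors are drawn with probability proportional to abundance;
            # sampling with replacement is an assumption of this sketch.
            for a in rng.choices(range(len(Y)), weights=abundances, k=n_extra):
                Y[a][j] += 1
    return Y

# Hypothetical 3 x 3 web in which the second plant species has no visitors.
Y_toy = [[3, 0, 1], [0, 0, 2], [5, 0, 0]]
patch_unvisited_plants(Y_toy, random.Random(11))
column_totals = [sum(row[j] for row in Y_toy) for j in range(3)]
```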

The data generation and model fitting process for a maximum number of fitted interaction groups D and R = 500 runs for a scenario is outlined in Algorithm 5. Also, αinit and ηinit need to be specified in order to initialize the Markov chain in the collapsed Gibbs sampler for each fitted model in the scenario. The values αinit = 0.001 and ηinit = 0.01 were used for all scenarios in the simulation study.

Algorithm 5 Data generation and model fitting for a scenario
Inputs N, M, K, R, D, λ, Nesting, α, αinit and ηinit
Outputs η, β̂, θ̂, β̄, θ̄, β, Y, AIC, BIC, DIC, IC, PP, PY, and LL
 1: Generate η per Algorithm 3
 2: Sample βk ∼ Dirichlet(ηk)
 3: for h = 1 to R do
 4:     Generate Y per Algorithm 4
 5:     for t = 2 to D do
 6:         Fit LDA per Algorithm 1
 7:     end for
 8: end for
 9: for h = 1 to R do
10:     Choose model selection criterion
11:     Select K̂ from lowest selection score
12:     Save β̂ and θ̂ from K̂
13: end for
14: Order θ̂ and β̂ per Algorithm 2
15: Take the mean of β̂ and θ̂ to obtain final estimates

Once the estimates have been chosen based on some model selection criterion, the average parameter estimates β̂ and θ̂ are compared to the true β and θ used to generate the data. The summary statistics used to verify the estimates from the simulation study are outlined in the next section.

4.3 Statistics

For each interaction web generated within a scenario and a given choice of model score, the estimates θ̂ and β̂ are obtained by computing the MLEs from the model with the lowest associated score. Similarly, K̂ is the number of interaction groups associated with the “best” (lowest scoring) model. The choice of model selection score can be evaluated by inspecting how often each criterion selected the “best” model with the correct K out of 500 runs for each scenario. The performance of LDA can also be evaluated by computing the mean square error, standard deviation, coefficient of variation and relative bias of β̂ and θ̂.
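The selection of K̂ and the tally of correct selections can be sketched as below. The function names and the score values are hypothetical and serve only to show the bookkeeping: for each run, the fitted K with the lowest score is selected, and the proportion of runs with K̂ = K is reported.

```python
def select_k(scores):
    """scores maps each fitted K to its model selection score; lower is better."""
    return min(scores, key=scores.get)

def correct_selection_rate(runs, true_k):
    """Proportion of runs whose lowest-scoring model has the true K."""
    selected = [select_k(scores) for scores in runs]
    return sum(k == true_k for k in selected) / len(selected)

# Hypothetical scores for three runs, each fitted with K = 2, ..., 6.
runs = [
    {2: 412.0, 3: 395.5, 4: 398.1, 5: 401.7, 6: 405.2},
    {2: 420.3, 3: 401.9, 4: 402.4, 5: 407.0, 6: 411.8},
    {2: 415.6, 3: 399.2, 4: 397.8, 5: 400.1, 6: 404.9},
]
rate = correct_selection_rate(runs, true_k=3)
```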

The prediction accuracy of LDA can be examined through the β parameter by using the mean square error (MSE) for β̂. The MSE is calculated as

\[
\mathrm{MSE}(\hat{\beta}_k) = \frac{1}{M} \sum_{j=1}^{M} \left( \hat{\beta}_{jk} - \beta_{jk} \right)^2, \qquad k = 1, \dots, K. \tag{4.1}
\]

The MSE for each interaction group k provides a measure of how close the estimate is to the true parameter used to generate the interaction webs. The MSE was calculated for all estimates in a scenario for which K̂ = K. An interaction group estimate β̂k has identified a true interaction group if the MSE is within a threshold value ε. Several ε values are tested to ensure the threshold is not too loose for identifying βk. All estimated interaction groups of β̂ are considered for identification of a given βk.

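The variance of $\hat{\beta}_{jk}$ for the kth group on the jth plant species is defined to be

\[
\mathrm{Var}(\hat{\beta}_{jk}) = \frac{\sum_{r=1}^{R} \left( \hat{\beta}_{jk}^{(r)} - \bar{\beta}_{jk} \right)^2 \mathbf{1}(\hat{K}^{(r)} = K)}{\sum_{r=1}^{R} \mathbf{1}(\hat{K}^{(r)} = K) - 1}, \tag{4.2}
\]

where $\hat{\beta}_{jk}^{(r)}$ is the MLE of $\beta_{jk}$ for the rth generated data set, $\hat{K}^{(r)}$ is the corresponding estimate of K, and $\bar{\beta}_{jk}$ is the mean of the R estimates after using Algorithm 2.

Similarly, the variance of $\hat{\theta}_k^a$, the mixing probability for the kth group of the ath visitor species, using the runs that selected the correct K, is defined to be

\[
\mathrm{Var}(\hat{\theta}_k^a) = \frac{\sum_{r=1}^{R} \left( \hat{\theta}_k^{a(r)} - \bar{\theta}_k^a \right)^2 \mathbf{1}(\hat{K}^{(r)} = K)}{\sum_{r=1}^{R} \mathbf{1}(\hat{K}^{(r)} = K) - 1}. \tag{4.3}
\]

The coefficients of variation (CV) for $\hat{\beta}_{jk}$ and $\hat{\theta}_k^a$ are respectively given by

\[
\mathrm{CV}(\hat{\beta}_{jk}) = \frac{\mathrm{SD}(\hat{\beta}_{jk})}{\beta_{jk}}, \qquad \mathrm{CV}(\hat{\theta}_k^a) = \frac{\mathrm{SD}(\hat{\theta}_k^a)}{\theta_k^a}. \tag{4.4}
\]

Finally, the relative bias can be calculated as

\[
\mathrm{RB}(\hat{\beta}_{jk}) = \frac{\bar{\beta}_{jk} - \beta_{jk}}{\beta_{jk}} \quad \text{and} \quad \mathrm{RB}(\hat{\theta}_k^a) = \frac{\bar{\theta}_k^a - \theta_k^a}{\theta_k^a}. \tag{4.5}
\]

All of these statistics were computed for the estimates from all scenarios. The bias and standard deviation are reported instead of CV and RB when $\theta_k^a = 0$ or $\beta_{jk} = 0$ is present in the simulation scenario.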
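For a single scalar parameter, the summaries in (4.2) to (4.5) can be sketched as follows. The function name `summarize` and the example values are hypothetical. Dividing by the true parameter in the CV is an assumption of this sketch, chosen to be consistent with the fallback to bias and standard deviation when the true value is zero.

```python
import math

def summarize(estimates, k_hats, true_value, true_k):
    """Summaries of one parameter over the runs that selected the correct K."""
    kept = [e for e, kh in zip(estimates, k_hats) if kh == true_k]
    mean = sum(kept) / len(kept)
    # Sample variance with the (number of correct runs - 1) denominator.
    var = sum((e - mean) ** 2 for e in kept) / (len(kept) - 1)
    sd = math.sqrt(var)
    bias = mean - true_value
    if true_value == 0:
        # Relative measures are undefined, so report bias and SD instead.
        return {"bias": bias, "sd": sd}
    return {"rb": bias / true_value, "cv": sd / true_value}

# Hypothetical estimates from four runs; the third run selected the wrong K.
stats = summarize([0.2, 0.4, 0.9, 0.3], [3, 3, 4, 3], true_value=0.25, true_k=3)
zero_stats = summarize([0.0, 0.02, 0.01], [3, 3, 3], true_value=0.0, true_k=3)
```

The function needs at least two runs with K̂ = K, otherwise the variance denominator is zero.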

4.4 Results

The 24 different scenarios in the simulation study exposed the ability of LDA to identify interaction groups at various levels of nesting as explained in the previous section. The best model chosen is determined by the effectiveness of various model selection criteria to choose the correct number of interaction groups used to generate the data. The six different model selection criteria were BIC, AIC, perplexity (PY), penalized perplexity (PP), DIC and IC. Each scenario generated 500 interaction webs with the fixed parameters N, M, K, β and web-specific θ. The frequency with which each criterion selected the correct K for the 500 runs is shown in Table 4.3, as well as the frequency with which K̂ < K and K̂ > K. The nesting temperature for each scenario was estimated by sampling 100 random runs and averaging over the temperatures from these 100 samples.

[Table 4.3: frequency of K̂ < K, K̂ = K and K̂ > K out of 500 samples.]

Table 4.4 shows the discordance ratio and the proportion of runs which selected K̂ < K for penalized perplexity, perplexity and AIC. The sampled average nesting temperature T for the scenarios is reasonably low, even for the compartmental scenarios. The nesting is low for the compartmental scenarios because the individuals within a visitor species may belong to more than one compartment, and these interactions will contribute to the overall nesting of the web generated using Algorithm 4. Both AIC and PP are competitive selection criteria in choosing the correct K used to generate the data. The discordance exposes which criterion tended to select the correct K while the other criterion had the incorrect K for the runs in a scenario.

The ratio of discordance for two criteria can then show which criterion tended to select the correct K more often when the other criterion selected the incorrect K. A ratio of less than 1 means the criterion in the numerator tended to select the correct K less often than the criterion in the denominator.
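A discordance ratio of this kind could be computed from run-level outcomes as in the sketch below. The function name `discordance` and the example outcomes are hypothetical, and the orientation of the ratio (first criterion incorrect and second correct, over first correct and second incorrect) follows the definition in the caption of Table 4.4. When the denominator is zero the ratio is left undefined and only the two counts are returned.

```python
def discordance(correct_a, correct_b):
    """Return (A wrong & B right, A right & B wrong, ratio or None)."""
    pairs = list(zip(correct_a, correct_b))
    num = sum((not a) and b for a, b in pairs)
    den = sum(a and (not b) for a, b in pairs)
    ratio = num / den if den else None
    return num, den, ratio

# Hypothetical outcomes for five runs: True means the criterion selected the correct K.
pp_correct = [True, False, True, True, False]
py_correct = [True, True, False, False, False]
result = discordance(pp_correct, py_correct)

# Two criteria that always agree give an undefined (0/0) ratio.
undefined = discordance([True, True], [True, True])
```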

4.4.1 Parameter Estimation Statistics

The summary statistics provided in this section are for only 3 of the 24 scenarios in the simulation study: scenarios 10, 11 and 12 from Table 4.1. The results from these 3 scenarios are representative of the findings for all 24 scenarios in the simulation study as nesting increased for fixed dimensions of N × M. Because AIC and PP performed equally in these 3 scenarios, only one (PP) is used to show the parameter estimates for θ and β.

Table 4.5 is a tally of the MSE of β̂ within the specified threshold for ε = [0.003, 0.005, 0.01, 0.03] for all runs with K̂ = K and K̂ ≠ K. The largest threshold, ε = 0.03, suggests most interaction groups are identified for runs with K̂ = K and K̂ ≠ K. As the threshold tightens to ε = 0.003, the identification of all K = 3 interaction groups decreases, with more runs identifying at most 2 interaction groups. A breakdown of the interaction groups identified is illustrated in Figures 4.2, 4.3 and 4.4 for scenarios 10, 11 and 12. The different shades of the stacked blocks represent the number of runs where the interaction group is identified based on the MSE(β̂) for the specified ε value.
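The identification rule behind this tally can be sketched as follows, assuming that "within the threshold" means MSE ≤ ε and that a true group counts as identified when any estimated group meets the threshold. The function names and the toy β and β̂ are hypothetical.

```python
def mse(beta_hat_k, beta_k):
    """Mean square error between one estimated and one true group."""
    M = len(beta_k)
    return sum((bh - b) ** 2 for bh, b in zip(beta_hat_k, beta_k)) / M

def groups_identified(beta_hat, beta, eps):
    """Number of true groups matched by at least one estimated group within eps."""
    return sum(
        any(mse(est, true_k) <= eps for est in beta_hat)
        for true_k in beta
    )

# Hypothetical example with K = 2 true groups and M = 4 plant species.
beta_true = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
beta_est = [[0.45, 0.55, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]

for eps in (0.003, 0.005, 0.01, 0.03):
    print(eps, groups_identified(beta_est, beta_true, eps))
```

In this toy case the first true group is recovered at every threshold, while the second is not recovered at any of the four thresholds because the closest estimate has an MSE of 0.0625.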

[Figure 4.2: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: number of groups identified (1, 2, 3); y-axis: frequency; shading: Group 1, Group 2, Group 3.]

Figure 4.2: Stacked bar plots for the identified interaction groups in scenario 10. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

The relative biases for β̂ and θ̂ are presented in Tables 4.6 and 4.7 respectively. If any true parameters for β or θ were equal to zero then only the bias is shown. For example, note that the true β values are listed in the bottom half of Table 4.6. So for any cell in this section for which βjk = 0, the corresponding entries in the top half of Table 4.6 present biases, not relative biases. The relative bias is calculated as per the equations in (4.5).

Similarly, the coefficients of variation (CV) for β̂ and θ̂ are presented in Tables 4.9 and 4.10 respectively. Again, if any true parameters for β or θ were equal to zero then only the standard deviation is shown. The CV is calculated as per the equations in (4.4).

[Figure 4.3: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: number of groups identified (1, 2, 3); y-axis: frequency; shading: Group 1, Group 2, Group 3.]

Figure 4.3: Stacked bar plots for the identified interaction groups in scenario 11. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure 4.4: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: number of groups identified (1, 2, 3); y-axis: frequency; shading: Group 1, Group 2, Group 3.]

Figure 4.4: Stacked bar plots for the identified interaction groups in scenario 12. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table 4.4: The discordance ratio calculated as (PP incorrect, PY correct)/(PP correct, PY incorrect) for the penalized perplexity and perplexity model selection criteria and (PP incorrect, AIC correct)/(PP correct, AIC incorrect) for the penalized perplexity and AIC model selection criteria for each scenario of 500 runs. The proportion of the 500 runs choosing an incorrect K for each scenario is also listed for each of the criteria.

                                    Discordance             Proportion
Scenario    N    M   Nesting      PP/PY  PP/AIC
       9   42   14   None         0.000     0/0     0.000   0.000   0.000
      10   42   14   None         0.000     0/0     0.000   0.000   0.000
      11   42   14   Mild         0.000   1.000     0.004   0.000   0.006
      12   42   14   Moderate     0.003   1.000     0.020   0.000   0.034
      13   42   14   Mild         0.000   0.143     0.040   0.000   0.078
      14   42   14   Moderate     0.023   0.069     0.190   0.006   0.282
      15   42   14   Extreme      0.128   0.267     0.358   0.034   0.420
      16   42   14   Full         3.263   0.000     0.786   0.132   0.802

      17   80   42   None         0.000     0/0     0.000   0.000   0.000
      18   80   42   None         0.000     0/0     0.000   0.000   0.000
      19   80   42   Mild         0.000   8.000     0.004   0.000   0.008
      20   80   42   Moderate     0.000     1/0     0.004   0.000   0.004
      21   80   42   Mild         0.018   1.800     0.060   0.000   0.100
      22   80   42   Moderate     0.105   3.750     0.272   0.014   0.330
      23   80   42   Extreme      0.018   0.381     0.106   0.004   0.172
      24   80   42   Full         0.223   0.500     0.508   0.036   0.566

Table 4.5: The number of groups identified for the scenarios with N = 42, M = 14 and K = 3 with penalized perplexity (PP) used as the model selection criterion.

                     K̂ = 3: number of groups identified    K̂ ≠ 3: number of groups identified
ε       Nesting        0     1     2     3   Total            0     1     2     3   Total
0.003   None           0     0     1   447     448            0     0     5    47      52
        Mild           0     0     4   396     400            0     5    14    81     100
        Moderate       0     3    15   308     326            0    16    30   128     174

0.005   None           0     0     0   448     448            0     0     4    48      52
        Mild           0     0     2   398     400            0     2    11    87     100
        Moderate       0     0    13   313     326            0    14    19   141     174

0.01    None           0     0     0   448     448            0     0     0    52      52
        Mild           0     0     0   400     400            0     1     2    97     100
        Moderate       0     0     2   324     326            0     0    21   153     174

0.03    None           0     0     0   448     448            0     0     0    52      52
        Mild           0     0     0   400     400            0     0     2    98     100
        Moderate       0     0     1   325     326            0     0    12   162     174

Table 4.6: Top half: The bias and relative bias for β̂ for the scenarios with N = 42, M = 14 and K = 3 using penalized perplexity for model selection. Bottom half: The true β parameter used to generate the data.

Level of nesting:      None                    Mild                  Moderate
  Plant      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3
      1   0.000  -0.015   0.000   0.000   0.006  -0.021   0.002   0.000   0.000
      2   0.000  -0.001   0.000   0.000   0.001   0.000   0.002   0.000   0.000
      3   0.000   0.005  -0.538  -0.005  -0.006   0.000  -0.003  -0.013   0.000
      4   0.000   0.005   0.000   0.006   0.000   0.000   0.000  -0.007   0.000
      5  -0.000   0.000   0.000   0.005   0.000   0.000   0.000  -0.010   0.000
      6  -0.006   0.000   0.000  -0.003   0.000   0.000   0.000   0.009  -0.053
      7   0.002   0.000  -0.000  -0.001   0.000   0.000   0.000   0.014  -0.060
      8   0.007   0.000   0.000   0.002   0.000   0.000   0.000   0.000  -0.036
      9   0.000   0.000  -0.005  -0.005  -0.000   0.009   0.000   0.000   0.034
     10   0.000   0.000   0.001  -0.003   0.000   0.001   0.000   0.000   0.013
     11   0.000   0.000   0.004   0.000   0.000  -0.006   0.000   0.000   0.051
     12   0.000   0.000  -0.006   0.000   0.000   0.001   0.000   0.000   0.055
     13   0.000   0.000   0.015   0.000   0.000  -0.009   0.000   0.001   0.020
     14   0.000   0.000   0.001   0.000   0.000   0.006   0.000   0.000   0.028

  Plant      β1      β2      β3      β1      β2      β3      β1      β2      β3
      1   0.000   0.171   0.000   0.000   0.321   0.027   0.226   0.000   0.000
      2   0.000   0.341   0.000   0.000   0.295   0.000   0.259   0.000   0.000
      3   0.000   0.259   0.003   0.123   0.383   0.000   0.516   0.108   0.000
      4   0.000   0.230   0.000   0.184   0.000   0.000   0.000   0.205   0.000
      5   0.414   0.000   0.000   0.136   0.000   0.000   0.000   0.148   0.000
      6   0.235   0.000   0.000   0.109   0.000   0.000   0.000   0.094   0.076
      7   0.211   0.000   0.000   0.092   0.000   0.000   0.000   0.165   0.151
      8   0.140   0.000   0.000   0.107   0.000   0.000   0.000   0.279   0.182
      9   0.000   0.000   0.121   0.189   0.000   0.093   0.000   0.000   0.119
     10   0.000   0.000   0.151   0.061   0.000   0.249   0.000   0.000   0.062
     11   0.000   0.000   0.229   0.000   0.000   0.159   0.000   0.000   0.051
     12   0.000   0.000   0.156   0.000   0.000   0.151   0.000   0.000   0.107
     13   0.000   0.000   0.134   0.000   0.000   0.135   0.000   0.000   0.117
     14   0.000   0.000   0.205   0.000   0.000   0.186   0.000   0.000   0.135

Table 4.7: The average relative bias for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.

Level of nesting:      None                    Mild                  Moderate
Visitor      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
      1   0.060   0.149  -0.012  -0.169   0.843  -0.133   0.882   0.578  -0.046
      2  -0.058   0.038  -0.240  -0.417   2.794  -0.133   0.111   0.248  -0.754
      3   0.121   0.050   0.079  -0.320   1.099   1.351  -0.049  -0.598   1.178
      4   0.082   0.609   0.072  -0.148  -0.133   0.356  -0.454   0.020   0.045
      5  -0.175   1.727  -0.385  -0.067  -0.164   0.209  -0.348  -0.483  -0.182
      6   0.186  -0.254   0.669  -0.571  -0.233  -0.363  -0.360  -0.663  -0.158
      7  -0.069  -0.215  -0.128  -0.674  -0.562  -0.303  -0.664  -0.280  -0.793
      8   1.415  -0.617  -0.256  -0.248   8.036  -0.292   0.541  -0.281  -0.200
      9  -0.426   0.044   2.090   0.200  -0.114  -0.359  -0.061  -0.225   2.297
     10  -0.066   0.706  -0.103   0.186  -0.224  -0.513  -0.846  -0.202  -0.643
     11   0.588  -0.226  -0.169  -0.338  -0.595   0.304  -0.719  -0.530  -0.518
     12   0.474  -0.443   0.032  -1.100   1.038  -0.419  -0.032   1.704  -0.509
     13  -1.102  -0.286  -0.158  -0.778  -0.216  -0.810  -1.557  29.570  -0.031
     14  -0.415  -0.457   0.944  -0.329  -0.190  -0.729  -0.289  -0.938   2.171
     15  -0.101  -0.662  -0.159   0.202  -0.650  -0.100  -2.217  16.823   0.118
     16  -0.708  -0.048   0.832  -0.452  -1.042  -0.935  -1.004  -0.168   0.734
     17  -0.800  -0.086  -0.365  -0.070  -0.834   0.091  20.157   1.905  25.491
     18  -0.426  -0.412   0.031  -0.625  -1.328  -0.429  -1.248  -0.679  -0.735
     19  -0.015  -0.430  -0.013  -0.671   0.360   0.748  -0.733  -0.640  -1.041
     20  -0.383  -0.435  -0.773   0.189  -0.730  -1.017  -0.053  -0.710  -0.257
     21  -0.618  -0.695  -0.884   0.750  -0.974  -0.232  -0.545   0.081  -1.866
     22  -0.724   0.114   0.420  -0.139  -0.665  -0.658  -0.666  -0.828  -1.548
     23   0.832  -0.993  -0.802   2.061  -1.065  -0.968  -1.008  -0.758  18.284
     24   1.090  -0.836  -0.074  -0.794  -0.907  -0.668  -1.268  52.869  20.417
     25   0.243  -0.645  -0.604   9.769   0.722  -0.590  -0.669   0.916  -0.582
     26  -0.739  -0.705  -0.050  -0.846  -1.792  -0.538  -1.341  -0.621  -1.420
     27  -0.042  -0.413  -1.605  -0.837  -0.705  -0.266  -0.748  -0.918  -1.071
     28   0.307  -0.442   0.565  -0.129  -1.078   0.050  -0.680  -0.834  -0.711
     29   3.484  -0.165   9.834   2.136  -0.256  -0.246  -1.410  -0.970  -1.099
     30  -0.515   5.787   0.221   2.345  -0.561   1.904  -0.650  -0.123  -0.833
     31   0.244  -0.879  -0.783  -1.232   2.385  -1.484  -1.180  51.748   0.182
     32  -0.435   3.113   1.234 162.725   0.491  13.050  -1.473  -1.472  -1.569
     33  -1.167  -0.210  -0.151  -1.304  -0.782  -0.975  -1.788  -2.679  51.690
     34  -0.518  -0.164  -0.849  -0.761  -1.152  -0.435  -0.781  -2.328  -1.272
     35  -0.555  -0.127  -0.728 101.139  -1.110  -0.369  -2.008  -1.177   0.477
     36  -0.074  -1.770   1.743   0.147  31.023  -0.172  45.798  -1.669  -0.127
     37   0.445   1.233   4.202  -0.960   2.190  -2.375   0.459 122.864  -2.626
     38   0.160  -1.435  -1.125  -0.603  -1.510   9.900  -0.964   0.184  -2.125
     39  -1.417  -0.118   2.209  47.772   0.523  -0.407  60.919   0.373  -0.813
     40  -0.711  -0.664  -1.864  -0.570  -1.345  -0.662  -1.754  15.613  -1.821
     41  -0.777  -0.715  -0.957  -2.415  -2.260   3.330  -1.280  32.327  -5.010
     42   0.914  -0.886  59.414  91.301  -2.031  -1.707   0.804  35.003   0.250

Table 4.8: The average bias for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.

Level of nesting:      None                    Mild                  Moderate
Visitor      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
      1   0.000   0.001   0.000   0.000   0.000   0.000  -0.000   0.010   0.003
      2   0.000   0.001  -0.000   0.000   0.000   0.000  -0.000   0.005   0.000
      3   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.004   0.000
      4   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.003   0.000
      5   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.009   0.000
      6   0.000   0.000   0.000   0.000   0.001   0.000   0.000   0.011   0.003
      7   0.000   0.001   0.000   0.001   0.001   0.000   0.000   0.008   0.000
      8   0.000   0.001   0.000   0.000   0.000   0.000   0.000   0.004   0.000
      9   0.000   0.001   0.000   0.001   0.000   0.000   0.000   0.004   0.000
     10   0.000   0.001   0.000   0.000   0.002   0.000   0.000   0.009   0.000
     11   0.000   0.001   0.000   0.000   0.000   0.000   0.000   0.010   0.000
     12   0.000   0.001   0.000   0.000   0.000   0.000   0.000   0.007   0.001
     13   0.000   0.000   0.000   0.000   0.001   0.002   0.000   0.007   0.000
     14   0.000   0.001   0.000   0.000   0.001   0.000   0.001   0.010   0.000
     15   0.000   0.001   0.000   0.001   0.000   0.001   0.001   0.013   0.000
     16   0.001   0.001   0.000   0.001   0.002   0.001   0.000   0.008   0.000
     17   0.000   0.000   0.000   0.000   0.001   0.000   0.000   0.009   0.000
     18   0.000   0.001   0.000   0.005   0.000   0.001   0.001   0.012   0.002
     19   0.001   0.000   0.000   0.000   0.001   0.002   0.000   0.022   0.005
     20   0.000   0.001   0.000   0.005   0.005   0.000   0.001   0.010   0.005
     21   0.000   0.000   0.000   0.000   0.001   0.006   0.002   0.013   0.001
     22   0.000   0.001   0.000   0.007   0.001   0.000   0.002   0.022   0.022
     23   0.000   0.002   0.000   0.009   0.004   0.003   0.002   0.034   0.006
     24   0.000   0.000   0.000   0.002   0.002   0.003   0.004   0.024   0.016
     25   0.001   0.002   0.000   0.001   0.002   0.005   0.000   0.021   0.021
     26   0.000   0.000   0.000   0.009   0.002   0.004   0.000   0.040   0.017
     27   0.000   0.000   0.000   0.011   0.002   0.002   0.002   0.037   0.014
     28   0.000   0.000   0.000   0.009   0.004   0.000   0.004   0.031   0.042
     29   0.000   0.001   0.000   0.007   0.014   0.005   0.000   0.046   0.010
     30   0.000   0.000   0.000   0.009   0.003   0.005   0.002   0.039   0.019
     31   0.000   0.003   0.000   0.011   0.007   0.007   0.004   0.059   0.025
     32   0.000   0.003   0.000   0.017   0.003   0.013   0.006   0.058   0.040
     33   0.000   0.002   0.000   0.025   0.005   0.022   0.004   0.068   0.025
     34   0.000   0.002   0.000   0.010   0.008   0.022   0.013   0.074   0.056
     35   0.001   0.002   0.001   0.019   0.005   0.027   0.008   0.111   0.030
     36   0.001   0.001   0.001   0.037   0.007   0.016   0.006   0.086   0.067
     37   0.001   0.001   0.001   0.046   0.003   0.030   0.005   0.123   0.057
     38   0.001   0.001   0.001   0.046   0.015   0.033   0.019   0.086   0.038
     39   0.001   0.001   0.001   0.033   0.019   0.012   0.005   0.074   0.079
     40   0.002   0.002   0.001   0.049   0.013   0.021   0.003   0.101   0.052
     41   0.001   0.002   0.001   0.030   0.013   0.045   0.015   0.107   0.041
     42   0.001   0.001   0.001   0.046   0.004   0.018   0.023   0.088   0.052

Table 4.9: The coefficient of variation (CV) for β̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.

Level of nesting:      None                    Mild                  Moderate
  Plant      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3
      1   0.000   0.970   0.000   0.000   0.987   1.075   0.984   0.000   0.001
      2   0.000   0.980   0.000   0.000   0.983   0.000   0.985   0.000   0.000
      3   0.000   0.987   0.944   0.986   0.974   0.000   0.977   0.993   0.000
      4   0.000   0.987   0.000   0.992   0.000   0.000   0.000   0.984   0.000
      5   0.980   0.000   0.000   0.992   0.001   0.000   0.000   0.981   0.001
      6   0.976   0.000   0.000   0.989   0.000   0.000   0.000   1.007   0.974
      7   0.986   0.000   0.001   0.993   0.000   0.000   0.000   1.007   0.952
      8   0.993   0.000   0.000   0.991   0.000   0.001   0.001   0.986   0.974
      9   0.000   0.000   0.984   0.984   0.001   1.010   0.000   0.000   1.041
     10   0.000   0.000   0.986   1.010   0.001   0.984   0.000   0.004   1.028
     11   0.000   0.000   0.986   0.000   0.000   0.980   0.000   0.000   1.083
     12   0.000   0.000   0.979   0.000   0.000   0.986   0.000   0.000   1.061
     13   0.000   0.000   1.001   0.000   0.000   0.978   0.000   0.008   1.021
     14   0.000   0.001   0.984   0.001   0.000   0.990   0.000   0.007   1.027

Table 4.10: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.

Level of nesting:      None                    Mild                  Moderate
Visitor      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
      1  15.889  17.064  10.561  21.383  11.370  28.140   9.079   9.948  21.885
      2  12.804  18.462  21.115  21.501  28.102  24.266  19.757   9.828  24.395
      3   9.325  26.357  11.197  17.293  30.734  17.401  11.802  19.719  20.111
      4  10.744  28.938  20.360  16.577   2.782  12.607  16.405  12.777   3.878
      5  21.339  27.777  17.755   2.373  16.435  19.221  23.075  21.645   5.169
      6  17.217  16.108   9.454  24.542   7.058  10.329  23.319  24.125  11.148
      7  14.450  15.117  16.401  18.789  17.540  16.946  26.671  19.791  12.871
      8  20.163  24.373   9.700   6.569  19.503  11.177  26.405  12.726   6.987
      9  15.457  19.200  31.915   4.436   9.998   9.931  22.122   5.599  15.722
     10  11.283  17.516   9.976  27.017   5.879  10.330  25.630  10.267  14.764
     11  12.241  10.647  19.335  13.993  22.246  16.328  10.199  19.493   8.128
     12  23.408  28.727  16.165  22.928  21.252  20.233  13.040  21.173  11.849
     13  35.420  20.722  15.197  21.672  26.132  14.885  23.618  22.507  27.326
     14  11.605  15.391  18.105  13.962  14.719  18.302  14.042  22.625   8.196
     15  16.193  18.033  18.781  19.613  11.251  19.183  23.727  13.522   7.096
     16  23.052  18.492   7.988  14.995  13.599  25.765   9.072   9.787  26.279
     17  10.344   5.171  23.336  11.199  15.816   7.202  20.137  13.358  11.209
     18  12.221  21.047  19.462  17.169  17.614   6.902  21.203   7.520  28.985
     19  15.224   9.717  11.602   7.537  17.422  26.764  30.303  23.120  16.030
     20  14.184  18.196  20.538  26.030   8.779  10.889  20.323   7.100  18.226
     21  17.876  12.663  29.946  19.166  16.074  16.986  15.563  12.756  22.229
     22  22.772  12.826   9.268  21.977  26.913  15.664  25.564  13.866  13.939
     23  11.358  19.419  21.815  22.598  22.985  18.709  11.004  12.651  14.967
     24  11.577  18.693  16.765  18.626   5.057  21.411  15.442  36.163  33.829
     25  27.087  12.192  15.396  22.820  13.738  10.135  18.945  11.351  12.791
     26  18.532  11.671  14.910  25.855  26.185  24.506  13.374  18.161  26.741
     27  10.121  18.831  16.093  13.525   7.438  23.613  12.064   7.746  17.173
     28  11.901  15.203  18.547  10.779   5.689  27.273  10.804  19.696   9.030
     29  13.087  20.761  20.224   7.105  14.158  11.535  31.756  20.275  16.421
     30  12.435  18.487   5.475  31.654  12.811  14.859  19.315  12.390  10.129
     31   6.672  16.309  13.809  11.175   7.811  16.751  14.936  18.662  13.497
     32  17.389  14.342  24.877  38.708   3.223  23.188   6.772   6.710  16.936
     33  14.205  11.916  15.326  11.710  22.348  26.288  22.250  12.006  19.799
     34  24.694  26.806  23.436   7.591  10.944  15.585   9.158  15.692  12.949
     35  20.679   5.868  26.207  18.602  23.334  12.215  19.989  18.442   7.736
     36  12.062  16.892  16.795  14.565  24.224  18.255  29.340  22.865   1.673
     37  30.288  19.539   8.752  18.395  15.380  16.204   9.651  36.640  11.066
     38  14.594  13.372  18.451  13.433  21.618  18.427  15.119  14.331  21.045
     39  29.892   9.353  14.725  16.951  14.902  14.027  19.858   5.743   5.330
     40  12.722  19.538  17.060  26.331  17.897  15.265  18.438  14.166   9.963
     41  11.620  23.907  13.231  25.107  22.398  10.144  22.848  21.845  10.604
     42  22.136  10.405  25.538  19.259  13.379  29.869  32.847  15.733  14.149

Table 4.11: The average standard deviation for θ̂ for the scenarios with N = 42, M = 14 and K = 3 using the penalized perplexity for model selection.

Level of nesting:      None                    Mild                  Moderate
Visitor      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
      1   0.519   0.582   0.605   0.591   0.521   0.601   0.586   0.569   0.544
      2   0.573   0.578   0.561   0.580   0.547   0.585   0.581   0.567   0.556
      3   0.574   0.571   0.567   0.585   0.577   0.556   0.559   0.596   0.555
      4   0.563   0.590   0.563   0.583   0.549   0.581   0.535   0.597   0.580
      5   0.563   0.558   0.591   0.560   0.566   0.593   0.573   0.590   0.543
      6   0.577   0.567   0.573   0.591   0.568   0.554   0.543   0.578   0.582
      7   0.547   0.622   0.542   0.555   0.584   0.569   0.542   0.585   0.585
      8   0.569   0.568   0.574   0.595   0.563   0.563   0.571   0.556   0.585
      9   0.570   0.573   0.572   0.576   0.538   0.596   0.571   0.608   0.537
     10   0.523   0.597   0.592   0.569   0.590   0.547   0.583   0.542   0.588
     11   0.590   0.584   0.540   0.609   0.547   0.553   0.592   0.570   0.549
     12   0.574   0.532   0.608   0.584   0.565   0.567   0.537   0.556   0.613
     13   0.556   0.576   0.581   0.577   0.566   0.573   0.587   0.571   0.554
     14   0.537   0.573   0.600   0.583   0.562   0.566   0.586   0.565   0.558
     15   0.571   0.581   0.559   0.575   0.563   0.576   0.564   0.575   0.570
     16   0.562   0.604   0.538   0.614   0.522   0.576   0.580   0.565   0.569
     17   0.538   0.589   0.589   0.586   0.574   0.548   0.551   0.608   0.552
     18   0.553   0.607   0.548   0.597   0.589   0.533   0.547   0.592   0.576
     19   0.578   0.580   0.556   0.563   0.587   0.566   0.609   0.569   0.534
     20   0.591   0.514   0.610   0.582   0.576   0.549   0.590   0.556   0.558
     21   0.574   0.581   0.565   0.576   0.543   0.592   0.554   0.592   0.567
     22   0.519   0.572   0.612   0.582   0.587   0.545   0.553   0.586   0.565
     23   0.563   0.590   0.564   0.543   0.596   0.575   0.603   0.581   0.519
     24   0.564   0.574   0.574   0.583   0.569   0.562   0.603   0.564   0.543
     25   0.564   0.580   0.568   0.570   0.566   0.583   0.556   0.559   0.597
     26   0.573   0.561   0.584   0.581   0.590   0.547   0.564   0.549   0.600
     27   0.542   0.601   0.568   0.533   0.609   0.570   0.558   0.591   0.561
     28   0.571   0.577   0.569   0.560   0.611   0.541   0.569   0.536   0.604
     29   0.572   0.579   0.562   0.570   0.563   0.580   0.574   0.585   0.564
     30   0.560   0.554   0.596   0.542   0.589   0.579   0.574   0.566   0.577
     31   0.595   0.577   0.535   0.579   0.576   0.560   0.542   0.600   0.571
     32   0.593   0.578   0.543   0.581   0.590   0.545   0.564   0.569   0.581
     33   0.564   0.559   0.591   0.592   0.551   0.574   0.541   0.593   0.582
     34   0.579   0.541   0.591   0.558   0.555   0.602   0.562   0.568   0.574
     35   0.607   0.546   0.558   0.580   0.567   0.572   0.551   0.617   0.549
     36   0.613   0.549   0.555   0.599   0.581   0.538   0.556   0.553   0.612
     37   0.552   0.583   0.575   0.592   0.568   0.555   0.555   0.589   0.572
     38   0.627   0.535   0.549   0.559   0.542   0.612   0.581   0.594   0.542
     39   0.587   0.561   0.572   0.577   0.591   0.546   0.555   0.574   0.583
     40   0.602   0.594   0.522   0.599   0.567   0.548   0.547   0.574   0.593
     41   0.574   0.549   0.592   0.570   0.554   0.594   0.563   0.571   0.580
     42   0.545   0.576   0.595   0.602   0.552   0.563   0.558   0.586   0.567

4.5 Discussion

The simulation study tested the performance of LDA in a visitor web context using various dimensions of the interaction webs and different model selection criteria.

From Table 4.3, it is clear that AIC and penalized perplexity outperformed the other criteria in selecting the correct K. Examining the performance of AIC and penalized perplexity closely, penalized perplexity selected the correct K for the smaller systems of N = 20 and M = 9. As the network size increased, AIC and penalized perplexity were comparable in model selection, and as nestedness increased within a fixed network size, penalized perplexity outperformed AIC. When the model chosen in a given run did not have the correct number of groups, both AIC and penalized perplexity tended to select models with $\hat{K} < K$. The performance of AIC suggests that its penalty may be too strong for selecting the optimal number of groups when maximizing the lower bound of the log-likelihood. The effect is even stronger for BIC, which selected $\hat{K} < K$ frequently as the nestedness increased. IC and DIC performed similarly in terms of the number of runs within each scenario that selected the correct number of interaction groups. Both criteria tended to select models with $\hat{K} > K$ for most of the runs within a scenario. Perplexity was similar to DIC and IC in how often it selected the correct K. However, its performance tended to improve as the nesting increased.

The results in Table 4.4 reinforce the apparent trend from Table 4.3 that penalized perplexity (PP) performed very well compared to AIC or perplexity (PY). Both discordance ratios tended to be 0, or at most 1, for most of the scenarios. These ratio values suggest that penalized perplexity tended to have more runs with $\hat{K} = K$ when either AIC or perplexity estimated $\hat{K}$ incorrectly. Conversely, AIC and PY had fewer runs with $\hat{K} = K$ when PP estimated $\hat{K}$ incorrectly. The proportion of runs with $\hat{K} < K$ increased for all model selection criteria as the nesting increased. AIC and penalized perplexity had a much higher proportion of runs with $\hat{K} < K$, with AIC having the highest in all scenarios. Perplexity had the lowest proportion of runs with $\hat{K} < K$ because it tended to overfit the data with $\hat{K} > K$.

Comparing Figures 4.2, 4.3 and 4.4, the identification of all three interaction groups decreased as the nesting increased. Also, more runs selected $\hat{K} \neq K$ as the nesting increased, as indicated by the lower row of stacked barplots for group identification when $\hat{K} \neq K$. The relative bias for $\hat{\beta}$ in Table 4.6 is around 0 for all three scenarios, which suggests that the average $\hat{\beta}$ is very close to the true $\beta$. The results are similar for $\hat{\theta}$ in all three scenarios in Table 4.7. The CV for $\hat{\beta}$ in Table 4.9 is around 1 for all three scenarios, indicating reasonable variation in $\hat{\beta}$. The CV for $\hat{\theta}$ in Table 4.10 is much larger, which is to be expected because of the large variation in the $\hat{\theta}$ generated for each run in a scenario.
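The two summaries discussed above can be illustrated with a minimal sketch. It assumes the usual textbook definitions (relative bias as the mean estimate minus the truth, divided by the truth; CV as the sample standard deviation divided by the mean), which may differ in detail from the definitions used earlier in the thesis. The function names and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def relative_bias(estimates, true_value):
    """(mean estimate - truth) / truth; assumed definition, undefined at truth = 0."""
    if true_value == 0:
        raise ValueError("relative bias is undefined when the true value is 0")
    return (mean(estimates) - true_value) / true_value


def coefficient_of_variation(estimates):
    """Sample standard deviation divided by the mean of the estimates."""
    centre = mean(estimates)
    if centre == 0:
        raise ValueError("CV is undefined when the mean estimate is 0")
    return stdev(estimates) / centre


# Hypothetical estimates of a single beta entry across five runs
# (illustrative numbers only, not results from the simulation study).
run_estimates = [0.09, 0.11, 0.10, 0.12, 0.08]
rb = relative_bias(run_estimates, true_value=0.10)
cv = coefficient_of_variation(run_estimates)
print(f"relative bias = {rb:.3f}, CV = {cv:.3f}")
```

A relative bias near 0 means the estimates are centred on the truth, while the CV expresses the run-to-run spread relative to the size of the estimate.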

Chapter 5

Data Analysis

Floral visits observed across eleven surveys conducted between May 2004 and September 2004 in Gully, Avon Gorge, Bristol, UK (Carvalheiro et al., 2008) were analyzed using LDA. Section 5.1 describes the Avon Gorge data and how they were analysed. Section 5.2 presents the results from the analysis, and Section 5.3 discusses them in the context of the findings of Carvalheiro et al. (2008).

5.1 Description of the Avon Gorge Data

The data comprise 1092 individual visits (counts) made between 174 insect species and 64 plant species. Of the 1092 observed visits, 78 were observed outside of the designated observation periods and are thus labelled as “rare” visits.

It is questionable whether these rare visits should be included in the analysis, as they may introduce sampling bias. In addition, it is customary to exclude plant species and/or visitor species for which only one visit was observed in total (henceforth called “single visits”), because results concerning these species lack statistical power. Note that the single visits mainly corresponded to visitor species, not to plant species. When single visits were excluded, a visitor with a single observed interaction was removed first, and any plant species left with only one visit as a result of that exclusion was then removed as well. Consequently, we considered four separate analyses, which differed only with respect to how the data were pre-processed:

1. Rare visits not excluded (per se) but single visits excluded (rare visits that were single visits were excluded).

2. Neither rare visits nor single visits excluded.

3. Rare visits excluded and single visits excluded.

4. Rare visits excluded and any remaining single visits not excluded.
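The single-visit exclusion described above can be sketched as a two-step filter. This is only an illustration under stated assumptions: the data are held as a dictionary of (visitor, plant) counts, the function name `drop_single_visits` is hypothetical, and the filter is applied in one pass (visitors first, then plants), as the text describes. The example counts are made up.

```python
def drop_single_visits(counts):
    """Remove visitors, then plants, that have only one visit in total.

    counts maps (visitor, plant) -> number of observed visits.
    """
    # Step 1: total visits per visitor; keep visitors seen more than once.
    visitor_totals = {}
    for (visitor, _plant), n in counts.items():
        visitor_totals[visitor] = visitor_totals.get(visitor, 0) + n
    kept = {pair: n for pair, n in counts.items() if visitor_totals[pair[0]] > 1}

    # Step 2: recompute plant totals after step 1; keep plants with more than one visit.
    plant_totals = {}
    for (_visitor, plant), n in kept.items():
        plant_totals[plant] = plant_totals.get(plant, 0) + n
    # Note: this single pass is not repeated, so a visitor may end up with
    # one remaining visit after a plant is removed.
    return {pair: n for pair, n in kept.items() if plant_totals[pair[1]] > 1}


# Made-up toy web (not the Avon Gorge data).
toy_counts = {
    ("bee", "rose"): 3,
    ("fly", "rose"): 1,
    ("fly", "ivy"): 1,
    ("ant", "ivy"): 1,
    ("wasp", "thyme"): 2,
}
filtered = drop_single_visits(toy_counts)
print(filtered)
```

In the toy web the ant is dropped as a single-visit visitor, which leaves ivy with a single visit, so ivy is dropped as well.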

Table 5.1: Summary of counts in Avon Gorge data.

Visits Excluded
Rare    Single      N      M     n_A    Temp.   n_A/p
No      Yes        89     54    1010    8.94    1.32
No      No        163     63    1092    3.39    0.92
Yes     Yes        85     53     937    9.39    1.27
Yes     No        154     62    1014    3.63    0.89

Figure 5.1 provides a binary snapshot (presence/absence) of the visitation web when both rare visits and single visits are excluded. Table 5.1 presents the final counts used for each analysis, where n_A is the total number of observed visits, Temp. is the temperature of the web, and p is the number of free parameters in the associated LDA model assuming six interaction groups ($p = K(M-1) + N(K-1)$, from Chapter 3). According to Table 5.1, the pollination network for Avon Gorge is highly nested. Further, when the single visits are included, the number of parameters in a model with six interaction groups is greater than the total number of observed visits.
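As a quick check of the last column of Table 5.1, the following sketch recomputes p = K(M - 1) + N(K - 1) and the ratio n_A/p for K = 6 from the N, M and n_A values in the table. The function and variable names are illustrative.

```python
def lda_free_parameters(n_visitors: int, n_plants: int, k: int) -> int:
    """p = K(M - 1) + N(K - 1), the parameter count quoted in the text."""
    return k * (n_plants - 1) + n_visitors * (k - 1)


# (rare excluded, single excluded, N, M, n_A), copied from Table 5.1.
TABLE_5_1 = [
    ("No", "Yes", 89, 54, 1010),
    ("No", "No", 163, 63, 1092),
    ("Yes", "Yes", 85, 53, 937),
    ("Yes", "No", 154, 62, 1014),
]


def visits_per_parameter(k: int = 6):
    """Return (rare, single, p, n_A / p rounded to 2 decimals) for each analysis."""
    rows = []
    for rare, single, n, m, n_a in TABLE_5_1:
        p = lda_free_parameters(n, m, k)
        rows.append((rare, single, p, round(n_a / p, 2)))
    return rows


for row in visits_per_parameter():
    print(row)
```

The ratio falls below 1 exactly in the two analyses that keep the single visits, which is the situation described in the paragraph above.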

Figure 5.1: Presence/absence visualization of Avon Gorge data with rare visits excluded and single visits excluded (N = 85, M = 53). [Axis labels: visitor species and plant species.]

Since the number of interaction groups is typically unknown but must be specified to implement LDA, LDA was run for $k = 2, \ldots, 6$ interaction groups. For each $k$, six different model scores were computed to select the model best supported by the data: AIC, BIC, DIC, IC, PY (perplexity), and PP (penalized perplexity).

Using the lda package in R, 100 independent Gibbs samplers were run with $\eta = 0.01$ and $\alpha = 0.001$ for 15000 iterations per run, with a burn-in period of 5000. For each score, a majority vote based on the value of $k$ selected in each of the 100 runs was used to select the best model.
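The majority vote can be sketched as follows. This is a minimal illustration, not the code used in the thesis: it assumes that lower scores are better (as for AIC-type criteria), the function names are hypothetical, and the example scores are made up. The tie rule is also an assumption, since the text does not state one.

```python
from collections import Counter


def select_k(scores_by_k, lower_is_better=True):
    """Return the k with the best score within a single run."""
    pick = min if lower_is_better else max
    return pick(scores_by_k, key=scores_by_k.get)


def majority_vote(selected_ks):
    """Return (k, votes) for the value of k chosen in the most runs."""
    # Counter.most_common resolves ties by first occurrence; this tie rule
    # is an assumption of the sketch.
    k, votes = Counter(selected_ks).most_common(1)[0]
    return k, votes


# Made-up scores for three runs and k = 2, 3, 4 (not values from the thesis).
runs = [
    {2: 41.0, 3: 42.5, 4: 44.1},
    {2: 40.2, 3: 39.9, 4: 43.0},
    {2: 38.7, 3: 40.3, 4: 41.8},
]
chosen = [select_k(run) for run in runs]
winner = majority_vote(chosen)
print(chosen, winner)
```

Reporting the vote count alongside the winning k, as Table 5.2 does, shows how stable the choice is across runs.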

5.2 Results

Table 5.2 shows $k$, the number of interaction groups in the model selected most frequently by each score, and how many of the 100 runs selected a model with that value of $k$. First, note that the value of $k$ selected by each score was more or less the same regardless of how the data were pre-processed. Further, the scores suggest either a very simple model ($k = 2$ for AIC, BIC and PP) or a very complex model ($k = 5$ or 6 for DIC, IC and PY), as may be expected for highly nested data. The estimates associated with the models with $k = 2$ were averaged across the 100 runs of LDA. In particular, averages of the visitation probabilities ($\hat{\beta}_k$) associated with each interaction group, and of the pollinator-specific group membership probabilities ($\hat{\theta}^a$), were calculated and compared.

The results from excluding both the single visits and the rare visits are presented in this chapter, while the other results are presented in Appendix A.2. However, the overall interpretation was very similar for the other analyses. Further, we show the results for $k = 2$ using PP for 83 runs in the main body here, but reserve the results for $k = 6$ using IC and $k = 2$ using AIC for the Appendix. Figure 5.2 provides a snapshot of the binary visitation web estimated by LDA after the plant and pollinator species have been ordered according to $\hat{\beta}$ and $\hat{\theta}$ from the 83 runs which selected $\hat{K} = 2$ under PP model selection. The web estimated by LDA seems to show slightly more structure compared to Figure 5.1.

Table 5.2: The number of interaction groups associated with the model chosen most often for each score. The number of times this model is selected out of the 100 runs is shown in brackets.

Visits Excluded
Rare    Single    AIC        BIC        DIC        IC         PY        PP
No      Yes       2 (100)    2 (100)    6 (100)    6 (100)    6 (77)    2 (84)
No      No        2 (100)    2 (100)    5 (100)    5 (100)    5 (90)    2 (77)
Yes     Yes       2 (100)    2 (100)    6 (100)    6 (100)    6 (78)    2 (83)
Yes     No        2 (100)    2 (100)    5 (100)    5 (100)    5 (91)    2 (71)

Table 5.3 presents the estimated group-specific visitation distributions averaged over the 83 runs (single visits and rare visits excluded) for a model with $k = 2$ interaction groups. Each column represents the probability of visiting a plant species, given membership to an interaction group. Bolded numbers within a column highlight plant species that tend to belong to the same group most often. In a very loose sense, they may help us understand what types of linkage rules may characterize each group.

Figure 5.3 plots the estimated visitation distributions $\hat{\beta}_k$ by interaction group as a visual counterpart to Table 5.3. It is interesting to note that the plant species with the highest estimated probabilities in groups 1 and 2 are relatively non-overlapping. The probabilities plotted in Figure 5.3 are the conditional probabilities of the plant species after an interaction group has been assigned to an individual observed visitor.

Figure 5.2: Presence/absence visualization of Avon Gorge data with rare visits excluded and plants/visitors with single counts removed, for analysis 3 (N = 85, M = 53).

Table 5.3: Estimated plant visitation distributions for each interaction group $\hat{\beta}_k$, averaged over the 83 runs with $\hat{K} = 2$, using analysis 3 of LDA with a Gibbs sampler and two interaction groups.

                              Interaction Group
Plant Species                   1        2
Scabiosa columbaria           0.090    0.000
Rubus fruticosus              0.076    0.000
Geranium robertianum          0.059    0.000
Solidago virgaurea            0.051    0.002
Rosa canina                   0.049    0.006
Origanum vulgare              0.046    0.000
Crepis sp.                    0.042    0.016
Clematis vitalba              0.040    0.000
Senecio sp1                   0.039    0.000
Helianthemum nunmularium      0.038    0.000
Cotoneaster franchetti        0.032    0.010
Ligustrum vulgare             0.029    0.000
Centhrantus ruber             0.024    0.018
Achillea millefolium          0.023    0.000
Teucrium scorodonia           0.024    0.000
Pilosella officinarum         0.022    0.000
Hieracium sp.                 0.023    0.000
Torilis japonica              0.020    0.000
Ulex europaeus                0.018    0.005
Smyrnium olusatrum            0.018    0.000
Hypericum perfuratum          0.016    0.000
Clinopodium ascendens         0.016    0.000
Cratageus monogyna            0.010    0.001
Rosa arvensis                 0.010    0.001
Mycelis muralis               0.011    0.003
Allium sphaerocephalon        0.013    0.001
Lactuca serriola              0.010    0.000
Eupatorium cannabinum         0.010    0.000
Sonchus arvensis              0.009    0.000
Solanum dulcamara             0.009    0.000
Viburnum lantana              0.008    0.002
Lotus corniculatus            0.007    0.000
Thymus polytrichus            0.004    0.002
Melampyrum pratense           0.006    0.000
Cornus sanguinea              0.006    0.002
Sisymbrium officinale         0.003    0.000
Senecio jacobaea              0.005    0.000
Hippocrepis camosa            0.005    0.000
Clinopodium acinos            0.005    0.000
Carlina vulgaris              0.005    0.000
Medicago lupina               0.002    0.000
Ranunculus sp1                0.033    0.088
Trinia glauca                 0.003    0.290
Potentilla sterilis           0.002    0.164
Hedera helix                  0.008    0.047
Sorbus sp.                    0.001    0.126
Gallium verum                 0.006    0.025
Cotoneaster horizontalis      0.001    0.095
Apiaceae sp.                  0.004    0.023
Sanguisorba minor             0.001    0.051
Verbascum thapsus             0.003    0.000
Asteracea sp13                0.003    0.007
Bellis perenis                0.002    0.006

Figure 5.3: Estimated visitation distribution by interaction group, averaged over 83 runs for $\hat{K} = 2$. Refer to Table 5.3 for plant species names. [Panels: Group 1 and Group 2; x-axis: plant species index; y-axis: estimated probabilities.]

Interaction group 1 appears to be somewhat more generalized than group 2, with Scabiosa columbaria having the highest visit probability. For group 2, Trinia glauca, Potentilla sterilis and Sorbus sp. held the highest conditional probabilities. Tables 5.4 and 5.5 present the estimated interaction group mixtures $\hat{\theta}$ for each visitor species.

By looking at the average $\hat{\theta}$ for each visitor species, the results can be explained by observing the group that represents the highest mixture for a given visitor species. This generalization of a visitor species still suggests how a majority of the individual visitors will behave on the plant species in the visitation web. A majority of the visitor species interactions were assigned to interaction group 1. The visitor species with $\hat{\theta}_2 > \hat{\theta}_1$ were Meligethes obscurus, Phyllobius roboretanus, Myrmecoris gracilis, Platycheirus clypeatus and Leptothorax sp. Phyllobius roboretanus and Leptothorax sp. held the highest mixture probability for group 2.
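Reading off the dominant group for each visitor can be sketched as below. The mixtures are a small subset of the averaged estimates reported in Tables 5.4 and 5.5; the function name `dominant_group` and the dictionary layout are illustrative assumptions.

```python
# Averaged (theta_1, theta_2) for a few visitor species, copied from Tables 5.4 and 5.5.
theta_hat = {
    "Formica fusca": (0.756, 0.244),
    "Lasius alienus cf": (0.526, 0.474),
    "Meligethes obscurus": (0.480, 0.520),
    "Phyllobius roboretanus": (0.121, 0.879),
    "Leptothorax sp": (0.121, 0.879),
}


def dominant_group(mixture):
    """Return the 1-based index of the group with the largest mixture weight."""
    best_index = max(range(len(mixture)), key=lambda i: mixture[i])
    return best_index + 1


# Visitors whose group 2 weight exceeds their group 1 weight.
group2_visitors = sorted(
    name for name, mixture in theta_hat.items() if dominant_group(mixture) == 2
)
print(group2_visitors)
```

Taking the largest weight discards how evenly mixed a species is, so a species such as Lasius alienus cf (0.526 versus 0.474) is labelled group 1 even though it is close to an even split.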

Table 5.4: Estimated group membership distributions for each visitor species $\hat{\theta}^a$, averaged over 83 independent runs of LDA where $\hat{K} = 2$ with a Gibbs sampler.

                                Interaction Group
Visitor species                   1        2
Anaspis pulicaria               0.993    0.007
Formica fusca                   0.756    0.244
Meligethes solidus              0.942    0.058
Episyrphus balteatus            0.996    0.004
Lasioglossum morio              0.990    0.010
Lasius alienus cf               0.526    0.474
Halictus tumulorum              0.998    0.002
Meligethes aeneus               0.997    0.003
Platycheirus albimanus          1.000    0.000
Syrphus ribesii                 1.000    0.000
Meligethes obscurus             0.480    0.520
Paragus haemorrhous             0.984    0.016
Sphaerophoria scripta cf        1.000    0.000
Melanostoma scalare             1.000    0.000
Apion exapion                   0.976    0.024
Oedemera lurida                 1.000    0.000
Heterostylodes nominabilis      0.990    0.010
Cteniopus sulphureus            1.000    0.000
Lasioglossum smacthmanellum     0.989    0.011
Hoplitis spinulosus             1.000    0.000
Heteroptera sp7                 1.000    0.000
Eristalis tenax cf              1.000    0.000
Bombus lapidarius               0.994    0.006
Phyllobius roboretanus          0.121    0.879
Lasioglossum villosulum         1.000    0.000
Isomira murina                  0.984    0.016
Eriothrix rufomaculata          1.000    0.000
Bombus pratorum                 0.995    0.005
Apis melliferae                 1.000    0.000
Myrmecoris gracilis             0.414    0.586
Melanostoma millinum            1.000    0.000
Helophilus pendulus             0.938    0.062
Epistrophe grossularia          1.000    0.000
Bombus lucorum.terrestris       1.000    0.000
Thysanoptera sp3                0.891    0.109
Myrmica sp                      0.578    0.422
Lasioglossum calceatum          1.000    0.000
Heteroptera sp5                 1.000    0.000
Delia sp                        1.000    0.000
Thysanoptera sp1                0.626    0.374
Thysanoptera sp                 1.000    0.000
Syrphus vitripennis             1.000    0.000
Sarcophaga nigriventris         1.000    0.000
Pegoplata aestiva               1.000    0.000

Table 5.5: Estimated group membership distributions for each visitor species $\hat{\theta}^a$, averaged over 83 independent runs of LDA where $\hat{K} = 2$ with a Gibbs sampler.

                                Interaction Group
Visitor species                   1        2
Meliscaeva auricollis           0.994    0.006
Heteroptera sp9                 1.000    0.000
Fernandea cuprea                0.997    0.003
Eristalis arbustorum            1.000    0.000
Vespula vulgaris                0.638    0.362
Syritta pipiens                 1.000    0.000
Sphecodes niger                 1.000    0.000
Sarcophaga spp                  1.000    0.000
Rhinophora lepida               1.000    0.000
Platycheirus scutatus           1.000    0.000
Oedemera nobilis cf             1.000    0.000
Melangyna umbellatarum          0.996    0.004
Lasioglossum leucopus           1.000    0.000
Lasiambia palposa               0.984    0.016
Eupeodes luniger                1.000    0.000
Eristalis interruptus           0.988    0.012
Discomyza incurva               0.626    0.374
Byturus tomentosus              0.996    0.004
Bombylius major                 1.000    0.000
Syrphus torvus                  1.000    0.000
Strangalia maculata             1.000    0.000
Solieria pacifica               0.813    0.187
Sarcophaga subvicina            1.000    0.000
Pyronia tithonus                1.000    0.000
Polyommatus icarus              1.000    0.000
Platycheirus clypeatus          0.476    0.524
Neomyia cornicina               0.987    0.013
Myathropa florea                0.705    0.295
Meligethes flavimanus           0.993    0.007
Lucilia illustris               1.000    0.000
Leptothorax sp                  0.121    0.879
Lasioglossum fulvicorne         0.987    0.013
Hylaeus communis                1.000    0.000
Hoplitis claviventris           1.000    0.000
Eusphalerum luteum              1.000    0.000
Cetonia aurata                  0.987    0.013
Ceratina cyanea                 1.000    0.000
Bombus hortorum                 1.000    0.000
Aphid sp1                       0.771    0.229
Anthonomus rubi cf              1.000    0.000

5.3 Discussion

LDA groups individuals from the observed visitor species based on which of the M plant species they are most likely to visit. The groups assigned to each interaction of a visitor species then allow the visitor species to be represented as a mixture of the interaction groups identified under the chosen model selection criterion. Since AIC, BIC and PP behaved similarly in the simulation study and tended to favour simpler models when the wrong model was selected, the results of this analysis suggest that the true number of groups may in fact be between K = 2 and K = 6.

LDA found a model with $\hat{K} = 2$ interaction groups using PP. The first group tended to be more of a generalist group, and the second group appeared to be more specialized on nine of the plant species in the visitation web. Among the nine dominant plant species in group 2, Trinia glauca and Cotoneaster horizontalis had high visit probabilities, which may be in line with observations made by Carvalheiro et al. (2008) that these two flowers largely contributed to the floral diet of the species during the Trinia glauca flowering period. For group 1, Scabiosa columbaria, Cotoneaster franchetti and Centranthus ruber were among the 25 most visited plant species, and were large contributors to the floral diet outside of the Trinia glauca flowering period.

The second interaction group identified is mostly specialized on the lower nine plant species bolded in Table 5.3, based on the average $\hat{\beta}$ estimates. The Avon Gorge data were collected in a small area, and most of these nine plant species in group 2 were in the top part of a ravine. The geographical location of the plant species may have had an effect on the visitation patterns of the observed visitors in the Avon Gorge (Carvalheiro et al., 2008). The corresponding average $\hat{\theta}$ estimates in Tables 5.4 and 5.5 suggest that Leptothorax sp. and Phyllobius roboretanus have a high probability of visiting the nine plant species in group 2. The ant species highlighted in Table 5.4 are all known to visit Trinia glauca in the Avon Gorge. The ant species Lasius alienus cf and Formica fusca are known visitors of Trinia glauca but are also generalists on 10-15 other plant species in the data (Carvalheiro et al., 2008). LDA accounted for the behaviour of these two ant species by estimating them to be more evenly mixed between the two groups. Lasius alienus cf has a good mixture of both groups, and Formica fusca is clearly a generalist tending to belong to group 1, which spreads its probability more generally over the plant species. The second interaction group could be accounting for the relationship of the ant species with Trinia glauca and the other alien plant species in the Avon Gorge ecosystem, while still accounting for the other interactions, since the two most common ant species were generalists visiting many other plant species.

Other visitor species that had a high mixture for group 2 but are not directly linked to Trinia glauca were Leptothorax sp., Aphid sp1, Myathropa florea, Platycheirus clypeatus, Discomyza incurva and Meligethes obscurus. These species could be important visitors for the other plant species that have a high estimated probability in group 2. Apart from Trinia glauca and Cotoneaster horizontalis, seven other plant species hold a substantial probability, and they are highlighted in Table 5.3. Other important linkage rules, such as geographical location, could also have been identified by LDA for these other plant species, such as Potentilla sterilis or Sorbus sp., and for the visitor species that are not known to visit Trinia glauca but still held a higher mixture for group 2.

Note that analysis 3 outlined in Table 5.1 is the closest to analysis 1 with respect to network dimensions, total counts, temperature (nesting) and the “best chosen” model with $\hat{K} = 2$ using PP for model selection. Analysis 1 involved removing species with single counts and keeping rare visits, whereas analysis 3 excluded both rare visits and single visits. The $\hat{\theta}$ and $\hat{\beta}$ from analysis 1 are presented in Table A.57 and Table A.56 in Appendix A.2. Examining $\hat{\beta}$ and $\hat{\theta}$ for analyses 1 and 3, it is worth noting that the interaction group probabilities $\hat{\beta}_k$ are similar between the two analyses, with slight differences in the $\hat{\theta}^a$ and $\hat{\beta}_k$ conditional probabilities for the two interaction groups. This result can be expected since rare visits are included in analysis 1 and LDA needs to assign probabilities to counts that are not commonly observed in nature. However, including the rare visits may be important in order to represent the visits that are uncommon in the ecosystem but still play a functional role in it. The overlap between the $\hat{\beta}_k$ of the two interaction groups is much smaller in analysis 1 than in analysis 3. The second interaction group (the specialist group) is less visible in the arranged visual web for analysis 1 (Figure A.22), in which rare visits are included.

The results from analyzing the Avon Gorge data using IC suggest a model with the maximum number of interaction groups: for both analyses 1 and 3, the IC model selection criterion suggested a model with $\hat{K} = 6$ interaction groups. The parameter estimates for the six interaction groups appeared to describe smaller groups compared to the two groups already identified by the model with $\hat{K} = 2$. The results of the model for analysis 1 with $\hat{K} = 6$ using IC for model selection are provided in Section A.2 of the Appendix.

Chapter 6

Conclusions

This thesis introduced latent Dirichlet allocation to the modelling of pollination networks using visitation web data, and developed the penalized perplexity model selection criterion. LDA with a collapsed Gibbs sampler iteratively assigns the observed interactions to latent interaction groups until the lower bound of the log-likelihood converges. The performance of LDA in a pollination context was examined using varying levels of nesting for the interaction groups. Different model selection criteria were also used to select the optimal number of interaction groups for the simulated data.

The penalized perplexity model selection criterion is a heuristic based on the perplexity criterion commonly used for modelling words in documents of a corpus (Blei et al., 2002). The heuristic involves adding a penalty term for the number of parameters in the exponent of the negative mean log-likelihood per word. Overall, the simulation study supported the penalized perplexity as a competitive selection criterion for the number of interaction groups, though it tends to favour simpler models when it does not select the correct number of groups using LDA in a pollination context.
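The structure of the heuristic can be sketched as follows. The `perplexity` function is the standard definition, the exponential of the negative mean log-likelihood per observation. The penalized variant only illustrates the idea of adding a parameter-count penalty inside the exponent: the specific form `weight * n_params / n_obs` and the `weight` argument are hypothetical placeholders and are not the penalty defined in the thesis.

```python
import math


def perplexity(log_lik: float, n_obs: int) -> float:
    """Standard perplexity: exp of the negative mean log-likelihood per observation."""
    return math.exp(-log_lik / n_obs)


def penalized_perplexity(log_lik: float, n_obs: int, n_params: int,
                         weight: float = 1.0) -> float:
    """Perplexity with a parameter-count penalty added inside the exponent.

    HYPOTHETICAL penalty form (weight * n_params / n_obs), for illustration only.
    """
    return math.exp((-log_lik + weight * n_params) / n_obs)


# Toy check: 100 observations, each with probability 0.25 under the model.
toy_log_lik = 100 * math.log(0.25)
plain = perplexity(toy_log_lik, 100)
penalized = penalized_perplexity(toy_log_lik, 100, n_params=10)
print(plain, penalized)
```

Because the penalty grows with the number of parameters, a model with more interaction groups must improve the mean log-likelihood enough to offset it, which is consistent with the tendency towards simpler models noted above.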

LDA identifies groups or modules within the interaction web and estimates the proportion of individuals observed within a visitor species that belong to a group ($\hat{\theta}^a_k$) and are therefore likely to visit a subset of the plant species corresponding to that group ($\hat{\beta}_k$). LDA accounts for trait variation within a visitor species through the interaction group mixture: individuals of the same visitor species can have different interaction group assignments, which may indicate that individual visitors serve different functional roles on the plant species.
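The roles of the mixture and the group-specific visitation distributions can be illustrated with the standard LDA generative process: each visit first draws an interaction group from the visitor's mixture, then draws a plant from that group's distribution. The sketch below is not the collapsed Gibbs sampler used for estimation; the function name and the toy parameter values are hypothetical.

```python
import random


def simulate_visits(theta_a, beta, n_visits, rng):
    """Simulate visits for one visitor species under the LDA generative process.

    theta_a: mixture weights over interaction groups for visitor species a.
    beta:    beta[k][j] is the probability of visiting plant j given group k.
    Returns a list of (group, plant) index pairs.
    """
    groups = list(range(len(theta_a)))
    plants = list(range(len(beta[0])))
    visits = []
    for _ in range(n_visits):
        z = rng.choices(groups, weights=theta_a)[0]      # latent interaction group
        plant = rng.choices(plants, weights=beta[z])[0]  # plant visited, given the group
        visits.append((z, plant))
    return visits


# Toy parameters: group 0 only visits plant 0; group 1 splits between plants 1 and 2.
toy_beta = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
toy_theta = [0.5, 0.5]
toy_visits = simulate_visits(toy_theta, toy_beta, n_visits=200, rng=random.Random(1))
print(toy_visits[:5])
```

Individuals of the same visitor species can therefore land in different groups from visit to visit, which is how the mixture represents within-species variation.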

The simulation study and the Avon Gorge data provided promising results. LDA using penalized perplexity provided a valuable technique for studying pollination networks. Applying LDA to the Avon Gorge data identified structural groups in the data. The models selected using both PP and AIC contained two interaction groups: a generalist group and a slightly more specialized group. The specialist group had a high probability on the rare plant species Trinia glauca, which is known to account for a large portion of the diet of certain ant species in this ecosystem during a specific flowering time. The results for the relationship between the ant species and Trinia glauca in the Avon Gorge dataset are supported by other research on this ecosystem (Carvalheiro et al., 2008). The specialist group also includes other plant species with substantially higher probabilities that were observed in a small area of the study site. LDA identified the visits on these plant species in this micro-habitat as belonging to the specialist group. The specialist group accounts not only for the ant visitors within the late Spring flowering time but also for the geographical separation of certain plant species within the Avon Gorge study site.

6.1 Future Work

The results presented in this thesis suggest LDA is a viable method for pollination web modelling. Additional developments could be very helpful in applying LDA in a pollination context. Considerations for future work include the following:

1. Currently, the LDA collapsed Gibbs sampler does not incorporate sampling effort into the modelling of visitation webs. An interaction web may be recorded with a focus on a specific plant species or visitor species, which can introduce bias when the web is used to represent the ecosystem as a whole. The symmetric Dirichlet priors used for the conditional interaction group probabilities β and θ in the collapsed Gibbs sampler could be changed to accommodate the sampling effort. In particular, the weight η assigned to a plant species for some βk would be a function of how the dataset was collected.

2. Applying LDA to visitation webs collected over the different flowering times of an ecosystem. The goal would be to determine how important temporal mismatch is for identifying the interaction groups.

3. Comparing the interpretation of results, with respect to ecosystem function or community structure, to the results obtained by SA or other ordination methods. However, a direct comparison is difficult because these other methods do not explicitly account for trait variation within a plant or pollinator species, and their groups are not defined by sets of linkage rules.
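Item 1 leaves open how sampling effort would enter the priors. The sketch below shows one hypothetical option; it is not a method developed in this thesis. It assumes a per-plant effort vector (called `effort` here, e.g. observation hours), rescales a base weight `eta0` into plant-specific weights that keep the total mass of the symmetric prior, and uses them in the plant term of a collapsed Gibbs conditional written in the usual asymmetric-prior form, (n_km + η_m) / (n_k + Σ η). The function names, the proportional weighting and the example numbers are all assumptions made for illustration.

```python
def effort_weighted_eta(effort, eta0=0.1):
    """Hypothetical: plant-specific Dirichlet weights proportional to effort.

    The weights are rescaled so that sum(eta) == eta0 * M, the total mass of
    a symmetric prior with weight eta0 on each of the M plant species.
    """
    total = sum(effort)
    if total <= 0:
        raise ValueError("total sampling effort must be positive")
    m = len(effort)
    return [eta0 * m * e / total for e in effort]


def plant_term(n_km, n_k, eta, m):
    """Plant part of the collapsed Gibbs conditional under an asymmetric prior.

    n_km: visits to plant m currently assigned to interaction group k
    n_k:  all visits currently assigned to interaction group k
    """
    return (n_km + eta[m]) / (n_k + sum(eta))


# Illustrative numbers only: three plants, the first observed twice as long.
eta = effort_weighted_eta([2.0, 1.0, 1.0], eta0=0.1)
symmetric = effort_weighted_eta([1.0, 1.0, 1.0], eta0=0.1)
```

With equal effort the weights reduce to the symmetric prior, so the symmetric sampler is recovered as a special case. Whether prior weight should increase or decrease with effort is itself a modelling decision that this sketch does not settle.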

Bibliography

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control 19 (6), 716–723.

Amarel, L. and R. Guimerà (2005). Cartography of complex networks: modules and universal roles. Journal of Statistical Mechanics: Theory and Experiment.

Amorim, F., S. Benitez-Vieyra, A. Cocucci, A. Medina, M. Moré, and M. Sazima (2012). Armament imbalances: Match and mismatch in plant-pollinator traits of highly specialized long-spurred orchids. PLoS ONE 7 (7).

Ando, T. (2009). Predictive Bayesian model selection. Master's thesis, Keio University.

Bascompte, J., Y. Dupont, P. Jordano, and J. Olesen (2007). The modularity of pollination networks. PNAS 104 (50), 19891–19896.

Bascompte, J. and P. Jordano (2007). Plant-animal mutualistic networks: The architecture of biodiversity. Annual Review of Ecology, Evolution, and Systematics 38, 567–593.

Blei, D., A. Ng, and M. Jordan (2002). Latent Dirichlet allocation. 14th Conference on Advances in Neural Information Processing Systems (NIPS 14) 3, 993–1022.

Carvalheiro, L., E. Barbosa, and J. Memmott (2008). Pollinator networks, alien species and the conservation of rare plants: Trinia glauca as a case study. Journal of Applied Ecology 45, 1419–1427.

Chang, J. (2009). Collapsed Gibbs sampling methods for topic models. R package version 1.1.

Chen, F. (2009). Performance prediction for exponential language models. NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 450–458.

Corbet, A., L. Dicks, and F. Pywell (2002). Compartmentalization in plant-insect flower visitor webs. Journal of Animal Ecology 71, 32–43.

Dormann, C., B. N. Frund, J., and B. Gruber (2009). Indices, graphs and null models: Analyzing bipartite ecological networks. The Open Ecology Journal 2, 7–24.

Dormann, C., B. Gruber, M. Devoto, J. Fruend, I. J., R. Strauss, D. Vasquez, N. Bluethgen, A. Clauset, and M. Rodriguez-Girones (2011). Visualising bipartite networks and calculating some (ecological) indices. R package version 1.18.

Dupont, Y. and O. Jens (2009). Ecological modules and roles of species in heathland plant-insect flower visitor networks. Journal of Animal Ecology 78, 346–353.

Griffiths, T. and M. Steyvers (2004). Finding scientific topics. PNAS 101, 5228–5235.

Guimarães, P., R. Galdini Raimundo, and L. Cagnolo (2011). Interaction web database. National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, USA.

Hall, S. and D. Raffaelli (1992). Compartments and predation in an estuarine food web. Journal of Animal Ecology 61 (3), 551–560.

Hornik, K. and B. Grün (2011). topicmodels: An R package for fitting topic models. Journal of Statistical Software 40 (13), 1–30.

Inouye, D. and C. Kearns (1997). Pollinators, flowering plants, and conservation biology. BioScience 47 (5), 297–307.

Jordano, P. (1987). Patterns of mutualistic interactions in pollination and seed dispersal. American Naturalist 129, 657–677.

Lawton, J. and S. Pimm (1980). Are food webs divided into compartments? Journal of Animal Ecology 49, 879–898.

Luis and M. A. Rodriguez-Girones (2007). Linkage rules for plant-pollinator networks: Trait complementarity or exploitation barriers? PLoS Biology 5 (2), 354–362.

McGill, B., R. Etienne, J. Gray, D. Alonso, M. Anderson, B. H., M. Dornelas, J. Enquist, B. Green, A. He, F. anf Hurlbert, A. Magurran, P. Marquet, B. Maurer, A. Ostling, K. Soykan, C. Ugland, and E. White (2007). Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework. Ecology Letters 10, 995–1015.

R Development Core Team (2011). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.

Raftery, A. (1995). Bayesian model selection in model research. Sociological Methodology 25, 111–163.

Rodriguez-Girones, M. and L. Santamaria (2006). A new algorithm to calculate the nestedness temperature of presence-absence matrices. Journal of Biogeography 33, 924–935.

Schwarz, G. (1978). Estimating the dimensions of a model. Annals of Statistics 6, 461–464.

Spiegelhalter, D., N. Best, B. Carlin, and A. van der Linde (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society 64 (4), 583–639.

Stephens, M. (2000). Dealing with label switching in mixture models. Royal Statistical Society 4, 795–809.

Vázquez, D., N. Blüthgen, L. Cagnolo, and N. Chacoff (2009, March). Uniting pattern and process in plant-animal mutualistic networks: a review. Annals of Botany 103, 1445–1457.

Appendix A Appendix

A.1 Simulation study results

In this section, all the output and statistics provided in the body of the thesis for scenarios 10, 11 and 12 are repeated for the rest of the 21 scenarios. These include the relative bias and bias for β̂, the average relative bias and bias for θ̂, the coefficient of variation and standard deviation for β̂, the average coefficient of variation and standard deviation for θ̂, and the stacked bar plots for group identification based on the MSE for β̂ for each of the threshold values ε = 0.003, 0.005, 0.01 and 0.03. Scenarios with the same dimensions N, M and K are grouped together in a common subsection (e.g. scenarios 5, 6, 7 and 8 have N = 20, M = 9 and K = 4).
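As a reading aid for the tables that follow, the sketch below computes the four summary statistics for a single parameter across simulation runs and counts identified groups with an MSE threshold. The definitions used here are assumptions standing in for those given in the body of the thesis: bias is the mean estimate minus the true value, relative bias is bias divided by the true value, CV is SD divided by the mean, and a true group counts as identified when at least one estimated β̂k lies within MSE ε of it. Function names and numbers are illustrative.

```python
from statistics import mean, stdev


def summarize(estimates, truth):
    """Bias, relative bias, SD and CV of one parameter across runs."""
    m = mean(estimates)
    bias = m - truth
    # Relative bias and CV are undefined when the denominator is zero.
    rel_bias = bias / truth if truth != 0 else float("nan")
    sd = stdev(estimates)
    cv = sd / m if m != 0 else float("nan")
    return {"bias": bias, "relative_bias": rel_bias, "sd": sd, "cv": cv}


def mse(estimate, truth):
    """Mean squared error between two probability vectors over plants."""
    return mean((e - t) ** 2 for e, t in zip(estimate, truth))


def count_identified(beta_hat, beta_true, eps):
    """Assumed rule: a true group is identified if some estimate has MSE < eps."""
    return sum(
        any(mse(b_hat, b_true) < eps for b_hat in beta_hat)
        for b_true in beta_true
    )


# Illustrative values: one parameter over three runs, and a toy two-group fit.
stats = summarize([0.47, 0.49, 0.48], truth=0.48)
beta_true = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
beta_hat = [[0.52, 0.48, 0.0], [0.3, 0.3, 0.4]]
n_found = count_identified(beta_hat, beta_true, eps=0.003)
```

In the toy example only the first true group is matched at ε = 0.003, which corresponds to a run counted under "1" in the group identification tables.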

A.1.1 Scenario 1

Table A.1: The number of groups identified for scenario 1 with N = 20, M = 9 and K = 2 with penalized perplexity (PP) used as the model selection criterion.

                   K̂ = 2                         K̂ ≠ 2
                  Number of groups identified     Number of groups identified
ε      Nesting          0      1      2  Total          0      1      2  Total
0.003  None             0     10    477    487          0      4      9     13
0.005  None             0      3    484    487          0      4      9     13
0.01   None             0      0    487    487          0      4      9     13
0.03   None             0      0    487    487          0      0     13     13

[Figure A.1: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2.]

Figure A.1: Stacked bar plots for the identified interaction groups in scenario 1. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.2: Top row: The bias and relative bias for β̂ for scenario 1 with N = 20, M = 9 and K = 2 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.

Level of Nesting: None

Plant          β̂1      β̂2      β1      β2
1           -0.01    0.00    0.48    0.00
2            0.00    0.00    0.52    0.00
3            0.00   -0.01    0.00    0.12
4            0.00   -0.00    0.00    0.15
5            0.00    0.01    0.00    0.11
6            0.00    0.01    0.00    0.15
7            0.00    0.00    0.00    0.21
8            0.00   -0.01    0.00    0.16
9            0.00    0.00    0.00    0.10

Table A.3: The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor        θ̂1      θ̂2
1           0.506  -0.270
2           0.281  -0.082
3          -0.230  -0.077
4          -0.404  -0.182
5          -0.253  -0.224
6           0.137  -0.485
7           0.019  -0.430
8          -0.337   1.210
9          -0.489  -0.062
10         -0.439  -0.505
11          0.062  -0.097
12         -0.222  -0.358
13          5.378  -0.756
14         -0.029  -0.661
15         -0.482  -0.532
16         -0.740  -0.204
17         -0.059  -1.111
18         -0.177  -0.639
19          0.350   2.515
20         -0.093  -0.886

Table A.4: The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor        θ̂1      θ̂2
1           0.000   0.000
2           0.000   0.000
3           0.000   0.000
4           0.000   0.000
5           0.000   0.000
6           0.000   0.000
7           0.000   0.000
8           0.000   0.000
9           0.001   0.000
10          0.001   0.001
11          0.001   0.001
12          0.001   0.001
13          0.001   0.001
14          0.001   0.001
15          0.001   0.001
16          0.001   0.001
17          0.001   0.001
18          0.003   0.001
19          0.002   0.002
20          0.002   0.002

Table A.5: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Plant          β̂1      β̂2
1            0.94    0.00
2            0.95    0.00
3            0.00    0.95
4            0.00    0.95
5            0.00    0.97
6            0.00    0.97
7            0.00    0.96
8            0.00    0.95
9            0.00    0.97

Table A.6: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor        θ̂1      θ̂2
1           12.74   13.34
2           13.09    9.05
3            5.39   10.74
4           12.79   15.05
5           11.21   10.24
6           12.43   10.72
7           12.72    8.16
8            4.15   10.51
9           14.79    4.26
10           9.50    8.45
11          14.75   15.96
12          10.72   15.61
13          13.41    8.18
14          14.10    7.03
15          10.96   17.84
16           6.99    8.59
17          11.29    8.72
18           9.44   13.27
19          22.17   12.95
20          19.51   10.25

Table A.7: The average standard deviation for θ̂ for the scenarios with N = 20, M = 9 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor        θ̂1      θ̂2
1           0.688   0.706
2           0.700   0.689
3           0.677   0.713
4           0.690   0.702
5           0.677   0.712
6           0.710   0.683
7           0.684   0.708
8           0.726   0.665
9           0.702   0.691
10          0.701   0.687
11          0.679   0.711
12          0.713   0.674
13          0.716   0.671
14          0.688   0.700
15          0.697   0.694
16          0.692   0.698
17          0.707   0.684
18          0.669   0.719
19          0.705   0.685
20          0.698   0.693

A.1.2 Scenarios 2 to 4

Table A.8: The number of groups identified for the scenarios with N = 20, M = 9 and K = 3 with penalized perplexity (PP) used as the model selection criterion.

                   K̂ = 3                                K̂ ≠ 3
                  Number of groups identified            Number of groups identified
ε      Nesting          0      1      2      3  Total          0      1      2      3  Total
0.003  None             0      2     66    389    457          0      5     10     28     43
0.003  Mild             0      7     78    325    410          1     15     26     48     90
0.003  Moderate         1      3     60    301    365          1     24     68     42    135

0.005  None             0      0     24    433    457          0      4     10     29     43
0.005  Mild             0      2     40    368    410          1     11     28     50     90
0.005  Moderate         1      1     28    335    365          0     16     74     45    135

0.01   None             0      0      8    449    457          0      2     12     29     43
0.01   Mild             0      0     20    390    410          0      8     29     53     90
0.01   Moderate         1      0     12    352    365          0      7     77     51    135

0.03   None             0      0      0    457    457          0      0     14     29     43
0.03   Mild             0      0      1    409    410          0      0     29     61     90
0.03   Moderate         0      0      7    358    365          0      0     70     65    135

[Figure A.2: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.2: Stacked bar plots for the identified interaction groups in scenario 2. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.3: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.3: Stacked bar plots for the identified interaction groups in scenario 3. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.4: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.4: Stacked bar plots for the identified interaction groups in scenario 4. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.9: Top row: The bias and relative bias for β̂ for the scenarios with N = 20, M = 9 and K = 3 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.

[Table A.9 body: landscape table with rows Plant 1 to 9 and, for each level of nesting (None, Mild, Moderate), columns β̂1, β̂2, β̂3 in the top block and β1, β2, β3 in the bottom block. The cell order is scrambled in this copy, so the values are not reproduced here.]

Table A.10: The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.

         Level of Nesting
                   None                    Mild                  Moderate
Visitor        θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1          -0.091  -0.134   0.150   3.548  -0.083  -0.309  -0.262   1.147  -0.015
2          -0.108  -0.336  -0.119  -0.569  -0.143  -0.079  -0.282  -0.618  -0.337
3          -0.367  -0.051  -0.231  -0.137  -0.857  -0.129  -0.580   0.509  -0.358
4          -0.110   0.263  -0.706  -0.269  -0.735  -0.028  -0.383  -0.718  -0.385
5          -0.039  -0.421  -0.214  -0.535  -0.161   1.350  -0.766  -0.609   0.121
6          -0.540   0.344  -0.105  -0.115  -0.283  -0.176   0.020   0.408  -0.971
7          -0.584   0.369   0.279   3.875  -0.703  -0.783  -0.730  -1.440  -0.091
8           0.701   0.774  -0.078  37.319   0.083  -1.229  -0.347  -0.637   5.142
9          -0.092  -0.618  -0.567  -0.301  -0.905  -0.321  -1.161  -1.315   0.230
10         -0.294  -0.836  -0.060   1.679  -0.053  24.218  -1.536   0.499   0.064
11         -0.555  -0.515  -0.420  -0.425  -0.704   1.053  -1.119  -1.950  -0.840
12         -0.831  -0.753  -0.346  -0.790  -1.282   4.926   1.490  -1.655  -0.549
13         -0.657   0.081   0.184   0.083  -1.200   0.032  16.501   6.354  -0.283
14         -0.846  -0.065  -1.119  10.045  -1.161   0.054  18.016  11.049  -0.490
15         -0.443  -0.486  -0.786  -0.910   0.766  -1.026  -2.382  16.543   3.323
16         -0.081   2.381   0.832  -0.306  -0.588  47.074  -2.247  24.002  -0.939
17         -1.072   8.704  -0.162  -0.634  -1.087  -1.021   6.158  22.261  -0.083
18          1.596  -1.199  -0.171  -1.181   0.016 159.050  -2.780  12.437   0.112
19          1.066  -0.665  -0.868  93.077  -1.524  -1.124  55.905  -3.104  -0.408
20         -0.780  -1.025  -0.616  -0.806   3.311  -1.826   2.516   8.550  -0.186

Table A.11: The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.

         Level of Nesting
                   None                    Mild                  Moderate
Visitor        θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1           0.000   0.000   0.000   0.001   0.000   0.000   0.002   0.001   0.000
2           0.000   0.000   0.000   0.001   0.000   0.000   0.004   0.005   0.000
3           0.000   0.000   0.000   0.001   0.002   0.002   0.002   0.005   0.001
4           0.000   0.000   0.000   0.001   0.000   0.000   0.004   0.003   0.001
5           0.000   0.000   0.000   0.001   0.000   0.000   0.000   0.001   0.002
6           0.000   0.000   0.000   0.002   0.000   0.000   0.006   0.003   0.000
7           0.000   0.000   0.000   0.005   0.002   0.002   0.006   0.009   0.000
8           0.000   0.001   0.000   0.001   0.001   0.000   0.003   0.018   0.003
9           0.000   0.000   0.000   0.001   0.000   0.005   0.007   0.023   0.005
10          0.000   0.000   0.000   0.003   0.003   0.003   0.007   0.019   0.001
11          0.000   0.000   0.000   0.005   0.000   0.007   0.009   0.043   0.000
12          0.000   0.000   0.000   0.000   0.000   0.006   0.015   0.036   0.010
13          0.000   0.000   0.000   0.004   0.001   0.010   0.036   0.041   0.015
14          0.000   0.000   0.000   0.006   0.002   0.026   0.026   0.070   0.005
15          0.000   0.000   0.000   0.004   0.016   0.018   0.038   0.090   0.018
16          0.000   0.000   0.000   0.004   0.014   0.022   0.064   0.075   0.015
17          0.000   0.001   0.000   0.005   0.027   0.040   0.071   0.077   0.016
18          0.001   0.001   0.001   0.018   0.023   0.048   0.080   0.097   0.022
19          0.001   0.001   0.001   0.009   0.008   0.027   0.043   0.074   0.014
20          0.001   0.001   0.001   0.006   0.013   0.044   0.054   0.096   0.028

Table A.12: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.

         Level of Nesting
                   None                    Mild                  Moderate
Plant          β̂1      β̂2      β̂3      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3
1           0.001   0.982   0.000   0.002   1.000   0.000   0.001   0.000   0.993
2           0.000   0.976   0.000   0.000   0.984   0.002   0.016   2.553   1.050
3           0.000   0.969   0.000   0.000   0.950   0.988   0.000   0.000   1.010
4           1.013   0.000   0.000   0.000   0.001   1.010   0.014   0.008   0.982
5           0.977   0.000   0.000   0.001   0.000   0.985   0.023   0.000   0.988
6           0.974   0.000   0.000   0.001   0.000   0.991   0.003   0.000   0.997
7           0.976   0.001   0.000   0.983   0.000   0.984   0.000   0.975   1.052
8           0.000   0.001   0.964   0.984   0.000   0.000   0.957   0.972   0.001
9           0.001   0.001   0.976   0.968   0.000   0.000   0.996   0.007   0.003

Table A.13: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.

         Level of Nesting
                   None                    Mild                  Moderate
Visitor        θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1          10.682  12.446  17.317  11.687   4.456  24.341   7.754  30.826  20.823
2          19.637  18.863  18.077  24.805  21.308   5.430  18.221  17.588  13.489
3          17.548  11.955  20.487  15.510  20.205   9.112  11.905  20.883   8.127
4          20.486  15.397  18.366   8.416  19.381  10.331  13.403  16.706  15.775
5          24.007  23.104  14.980  20.842  15.587  15.985  10.804  19.410  14.989
6           5.550  15.581  19.238  21.443  12.851  17.459  21.926   9.163  29.959
7          18.000  16.578  11.227  17.137   9.974  19.649  19.335  20.224  16.889
8          14.845   7.591  16.143  12.300  17.322  26.781  13.422  10.483  39.264
9           5.598  13.304  20.429  20.244  20.418  14.238  18.846  19.084  29.053
10         13.174  24.064  12.822  20.914   8.798  31.404   3.228  21.550  14.815
11         21.289  26.503  18.450   9.292  20.238  23.356  11.577  31.063   8.007
12          8.547  24.608   8.068  14.104  27.870  10.412   9.349  19.546  25.752
13         23.046  13.125  24.786  26.653  17.955  26.042  14.268  26.940  16.067
14          6.018  18.716   8.085  15.221  20.092  19.013  26.325   8.795  11.168
15          9.649  16.515  17.005  11.815   8.448  14.289  10.073  10.761  15.976
16          7.941  12.109   7.316  10.976  22.570  12.649   5.191  21.073  22.403
17         24.204   8.922  23.335  13.810  15.165  10.175  22.884   8.863   9.369
18         14.880  17.247  20.684  18.015  16.791  24.663  15.543  10.873  22.863
19         12.102  10.181  15.752  21.387  10.940  15.484  22.801  19.835  29.429
20         14.751  14.525  11.435   7.357  10.098  12.642  14.320  14.327  21.476

Table A.14: The average standard deviation (SD) for θ̂ for the scenarios with N = 20, M = 9 and K = 3 using the penalized perplexity for model selection.

         Level of Nesting
                   None                    Mild                  Moderate
Visitor        θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1           0.583   0.561   0.567   0.551   0.582   0.578   0.571   0.576   0.566
2           0.551   0.541   0.612   0.583   0.572   0.548   0.595   0.571   0.543
3           0.544   0.599   0.564   0.573   0.533   0.599   0.548   0.564   0.592
4           0.552   0.580   0.572   0.575   0.546   0.589   0.561   0.579   0.572
5           0.553   0.582   0.574   0.549   0.583   0.576   0.562   0.581   0.570
6           0.592   0.576   0.542   0.571   0.571   0.566   0.550   0.609   0.541
7           0.607   0.551   0.547   0.560   0.596   0.552   0.557   0.580   0.565
8           0.574   0.540   0.588   0.556   0.589   0.558   0.572   0.562   0.567
9           0.577   0.541   0.587   0.607   0.529   0.569   0.547   0.599   0.562
10          0.574   0.550   0.586   0.565   0.562   0.574   0.530   0.592   0.586
11          0.555   0.558   0.592   0.570   0.581   0.556   0.588   0.571   0.554
12          0.562   0.557   0.594   0.581   0.550   0.580   0.548   0.560   0.594
13          0.581   0.576   0.545   0.604   0.540   0.558   0.519   0.584   0.598
14          0.554   0.572   0.586   0.549   0.566   0.596   0.528   0.581   0.594
15          0.578   0.536   0.596   0.573   0.588   0.548   0.523   0.608   0.580
16          0.571   0.548   0.589   0.561   0.573   0.576   0.562   0.585   0.567
17          0.547   0.573   0.590   0.553   0.574   0.584   0.581   0.566   0.560
18          0.578   0.570   0.563   0.567   0.557   0.581   0.566   0.584   0.559
19          0.552   0.621   0.532   0.580   0.565   0.568   0.576   0.578   0.562
20          0.595   0.539   0.577   0.582   0.576   0.549   0.537   0.570   0.604

A.1.3 Scenarios 5 to 8

Table A.15: The number of groups identified for the scenarios with N = 20, M = 9 and K = 4 with penalized perplexity (PP) used as the model selection criterion.

                   K̂ = 4                                       K̂ ≠ 4
                  Number of groups identified                   Number of groups identified
ε      Nesting          0      1      2      3      4  Total          0      1      2      3      4  Total
0.003  Mild             0      0     22     83    195    300          3     18     82     75     22    200
0.003  Moderate         0      1     17     56     39    113          2     39    128    209      9    387
0.003  Extreme          3     26     36     45      3    113         51    133    143     59      1    387
0.003  Full             3     11     11      5      0     30         53    254    148     15      0    470

0.005  Mild             0      0     15     56    229    300          2     10     72     92     24    200
0.005  Moderate         0      0      9     53     51    113          0     12     82    283     10    387
0.005  Extreme          1     13     42     52      5    113         35    100    157     85     10    387
0.005  Full             2     11     12      5      0     30         18    255    180     17      0    470

0.01   Mild             0      0      4     36    260    300          0      3     43    128     26    200
0.01   Moderate         0      0      5     47     61    113          0      6     48    321     12    387
0.01   Extreme          1      6     27     68     11    113         12     68    147    149     11    387
0.01   Full             1      3     19      7      0     30          3    220    219     28      0    470

0.03   Mild             0      0      0      9    291    300          0      0     11    153     36    200
0.03   Moderate         0      0      1     40     72    113          0      0     36    330     21    387
0.03   Extreme          0      0      5     53     55    113          0      1    137    209     40    387
0.03   Full             0      0     14     15      1     30          0    179    246     43      2    470
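The Total columns of Tables A.1, A.8 and A.15 can be turned into the share of runs in which penalized perplexity selected K̂ = K. The short calculation below does this under the assumption that each pair of totals (K̂ = K and K̂ ≠ K) counts the simulation runs of one scenario; the dictionary layout and variable names are illustrative.

```python
# (runs with K-hat = K, runs with K-hat != K), copied from the Total columns
# of Tables A.1, A.8 and A.15. The totals are the same for every threshold.
totals = {
    ("K = 2", "None"): (487, 13),
    ("K = 3", "None"): (457, 43),
    ("K = 3", "Mild"): (410, 90),
    ("K = 3", "Moderate"): (365, 135),
    ("K = 4", "Mild"): (300, 200),
    ("K = 4", "Moderate"): (113, 387),
    ("K = 4", "Extreme"): (113, 387),
    ("K = 4", "Full"): (30, 470),
}

# Share of runs per scenario in which the selected K-hat equals the true K.
share_correct = {
    key: hit / (hit + miss) for key, (hit, miss) in totals.items()
}

for (k, nesting), share in share_correct.items():
    print(f"{k}, {nesting:<8} nesting: {share:.1%} of runs with K-hat = K")
```

Under that reading, the share falls from 97.4% (K = 2, no nesting) to 6.0% (K = 4, full nesting), in line with the loss of accuracy under increased nestedness reported for the simulation study.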

[Figure A.5: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.5: Stacked bar plots for the identified interaction groups in scenario 5. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.6: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.6: Stacked bar plots for the identified interaction groups in scenario 6. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.7: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.7: Stacked bar plots for the identified interaction groups in scenario 7. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.8: two rows of four stacked bar plots, one panel per threshold (ε = 0.003, 0.005, 0.01, 0.03); x-axis: Num. groups identified; y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.8: Stacked bar plots for the identified interaction groups in scenario 8. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.16: Top row: The bias and relative bias for β̂ for the scenarios with N = 20, M = 9 and K = 4 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.

[Table A.16 body: landscape table with rows Plant 1 to 9 and, for each level of nesting (Mild, Moderate, Extreme, Full), columns β̂1 to β̂4 in the top block and β1 to β4 in the bottom block. The cell order is scrambled in this copy, so the values are not reproduced here.]

Table A.17: The average relative bias for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.

[Table A.17 body: landscape table with rows Visitor 1 to 20 and, for each level of nesting (Mild, Moderate, Extreme, Full), columns θ̂1 to θ̂4. The cell order is scrambled in this copy, so the values are not reproduced here.]

Table A.18: The average bias for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.

[Table A.18 body: landscape table with rows Visitor 1 to 20 and, for each level of nesting (Mild, Moderate, Extreme, Full), columns θ̂1 to θ̂4. The cell order is scrambled in this copy, so the values are not reproduced here.]

Table A.19: The coefficient of variation (CV) for β̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.

[Table A.19 body: landscape table with rows Plant 1 to 9 and, for each level of nesting (Mild, Moderate, Extreme, Full), columns β̂1 to β̂4. The cell order is scrambled in this copy, so the values are not reproduced here.]
Table A.20: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 20, M = 9 and K = 4 using the penalized perplexity for model selection.

[Table A.20 body: landscape table with rows Visitor 1 to 20 and, for each level of nesting (Mild, Moderate, Extreme, Full), columns θ̂1 to θ̂4. The cell order is scrambled in this copy, so the values are not reproduced here.]
θ 3 ˆ θ 2 Mild Moderate Extreme Full ˆ θ 1 ˆ θ 0.478 0.5220.440 0.454 0.5480.451 0.513 0.506 0.5050.474 0.481 0.486 0.5220.442 0.525 0.447 0.4760.498 0.522 0.512 0.4770.492 0.526 0.462 0.4820.499 0.527 0.466 0.4770.502 0.522 0.515 0.4900.462 0.481 0.472 0.4970.524 0.502 0.538 0.5100.483 0.466 0.466 0.5220.462 0.470 0.473 0.4550.481 0.490 0.513 0.5380.513 0.541 0.500 0.4960.476 0.456 0.447 0.4400.438 0.514 0.527 0.5540.515 0.523 0.487 0.4970.505 0.495 0.462 0.4950.497 0.507 0.483 0.509 0.485 0.473 0.481 2 3 4 5 6 7 8 9 1 10 11 12 13 14 15 16 17 18 19 20 Visitor Table A.21: Thepenalized average perplexity standard for deviation model (SD) selection. for 103

A.1.4 Scenario 9

Table A.22: The number of groups identified for scenario 9 with N = 42, M = 14 and K = 2 with penalized perplexity (PP) used as the model selection criterion.

                  K̂ = 2                          K̂ ≠ 2
                  Number of groups identified    Number of groups identified
ε      Nesting      0     1     2   Total          0     1     2   Total
0.003  None         0     0   477     477          0    13    10      23
0.005  None         0     0   477     477          0     8    15      23
0.01   None         0     0   477     477          0     0    23      23
0.03   None         0     0   477     477          0     0    23      23
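As a rough illustration of how a table of this shape could be assembled, the sketch below cross-tabulates simulation runs by whether the selected number of groups matches the true K and by how many groups were identified. The function name `tally_groups` and the input format (one pair of selected K and identified-group count per run) are hypothetical, and the rule that decides when a group counts as identified (which involves ε) is not restated in this appendix, so it is taken as given here.

```python
from collections import Counter


def tally_groups(runs, true_k):
    """Cross-tabulate runs by model selection outcome.

    runs   : iterable of (k_hat, n_identified) pairs, one per simulation run
             (hypothetical input format; n_identified is assumed to lie
             between 0 and true_k).
    true_k : the number of interaction groups used to simulate the data.

    Returns two rows, one for runs with k_hat == true_k and one for the
    rest; each row lists the counts for 0..true_k identified groups
    followed by the row total.
    """
    matched, mismatched = Counter(), Counter()
    for k_hat, n_identified in runs:
        target = matched if k_hat == true_k else mismatched
        target[n_identified] += 1

    def as_row(counter):
        counts = [counter.get(g, 0) for g in range(true_k + 1)]
        return counts + [sum(counts)]

    return as_row(matched), as_row(mismatched)


# Toy input: five runs with true K = 2.
example_runs = [(2, 2), (2, 2), (3, 1), (1, 1), (3, 2)]
row_equal, row_not_equal = tally_groups(example_runs, true_k=2)
```

Each returned row has the same layout as one half of a table row above: counts for 0, 1, ..., K identified groups, then the total.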

Figure A.9: Stacked bar plots for the identified interaction groups in scenario 9. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K. [Panels: ε = 0.003, 0.005, 0.01 and 0.03 in each row; x-axis: number of groups identified; y-axis: frequency; legend: Group 1, Group 2.]

Table A.23: Left columns: The bias and relative bias for β̂ for scenario 9 with N = 42, M = 14 and K = 2 using penalized perplexity for model selection. Right columns: The true β parameter used to generate the data.

Level of Nesting: None

Plant       β̂1       β̂2        β1        β2
1       -0.533     0.003     0.002     0.304
2        0.000    -0.008     0.000     0.237
3        0.000     0.000     0.000     0.187
4        0.000     0.003     0.000     0.271
5       -0.000    -0.000     0.108     0.000
6        0.004     0.000     0.037     0.000
7        0.003     0.000     0.136     0.000
8        0.005     0.000     0.101     0.000
9       -0.001     0.000     0.123     0.000
10      -0.004     0.000     0.049     0.000
11       0.003     0.000     0.089     0.000
12      -0.008     0.000     0.125     0.000
13       0.002     0.000     0.095     0.000
14       0.004     0.000     0.134     0.000
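The bias summaries in these tables can be illustrated with a short sketch. It assumes the standard definitions (bias as the mean estimate over replicates minus the true value, relative bias as the bias divided by the true value) and, as one possible reading of the caption "bias and relative bias", it reports the plain bias wherever the true parameter is zero. The function name `bias_summary` and the array layout are hypothetical and are not taken from the thesis.

```python
import numpy as np


def bias_summary(estimates, truth):
    """Bias and a mixed bias/relative-bias summary over replicates.

    estimates : array of shape (R, P, K), R replicate estimates of a
                P-by-K parameter matrix (assumed layout).
    truth     : array of shape (P, K) with the true parameter values.

    Returns (bias, reported), where `reported` holds the relative bias
    where the true value is nonzero and the plain bias where it is zero
    (an assumption about how such a table could be filled in).
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)

    bias = estimates.mean(axis=0) - truth
    nonzero = truth != 0
    # Replace zero denominators by 1 so the division is always defined;
    # those entries are overwritten with the plain bias below.
    safe_truth = np.where(nonzero, truth, 1.0)
    reported = np.where(nonzero, bias / safe_truth, bias)
    return bias, reported


# Toy example: two replicates of a 2-by-2 parameter matrix.
toy_truth = np.array([[0.0, 0.25], [0.5, 0.0]])
toy_estimates = np.array([
    [[0.01, 0.20], [0.40, 0.0]],
    [[0.03, 0.30], [0.50, 0.0]],
])
toy_bias, toy_reported = bias_summary(toy_estimates, toy_truth)
```

Falling back on the plain bias for zero-valued parameters avoids a division by zero; whether the thesis uses exactly this convention is not stated in this appendix.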

Table A.24: The average relative bias for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor      θ̂1        θ̂2
1        -0.142     0.147
2        -0.072    -0.103
3         0.069     0.046
4        -0.240     0.098
5         0.088     0.774
6         0.387    -0.025
7         0.195    -0.284
8         0.195    -0.231
9        -0.500    -0.091
10       -0.078    -0.151
11        0.338    -0.560
12       -0.040    -0.201
13       -0.205     3.913
14       -0.495    -0.282
15       -1.234     0.033
16        0.521     0.338
17       -0.723    11.029
18       -0.499     1.794
19       -0.810     0.074
20       -0.524    -0.349
21        0.052    -1.223
22       -0.219    -0.650
23       -0.522    11.496
24       -0.454    -0.990
25       -0.205    -0.550
26       -0.676     0.422
27        0.709    -0.284
28       -1.074    -0.012
29        0.454    -0.518
30       -0.613    -0.498
31       -0.344    -0.529
32        0.180    -1.157
33       -0.951     0.484
34       -0.411    -0.371
35       -0.750     0.116
36       -0.892     0.020
37       -0.897    -0.464
38       -0.603     0.654
39        1.006    -0.608
40       -0.707    -0.488
41        1.770    -0.871
42        0.452    -1.155

Table A.25: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Plant      β̂1       β̂2
1        1.070     0.970
2        0.000     0.960
3        0.001     0.970
4        0.000     0.970
5        0.972     0.000
6        0.990     0.000
7        0.974     0.000
8        0.977     0.000
9        0.970     0.000
10       0.978     0.000
11       0.977     0.000
12       0.963     0.000
13       0.975     0.000
14       0.975     0.000
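For readers who want to reproduce summaries of this kind, the sketch below computes the standard deviation and coefficient of variation of an estimate across simulation replicates. It assumes the usual definitions (sample SD with divisor R - 1, and CV as the SD divided by the absolute value of the mean) and returns NaN for the CV where the mean estimate is exactly zero. The function name `sd_and_cv` and the array layout are hypothetical; the thesis may use a different convention, for example for entries whose mean is zero.

```python
import numpy as np


def sd_and_cv(estimates, ddof=1):
    """Standard deviation and coefficient of variation over replicates.

    estimates : array whose first axis indexes the R replicates
                (assumed layout); remaining axes index parameters.
    ddof      : delta degrees of freedom for the SD (1 gives the
                sample SD; this choice is an assumption).
    """
    estimates = np.asarray(estimates, dtype=float)
    mean = estimates.mean(axis=0)
    sd = estimates.std(axis=0, ddof=ddof)

    # CV is undefined when the mean is zero; mark those entries as NaN.
    cv = np.full_like(sd, np.nan)
    nonzero = mean != 0
    cv[nonzero] = sd[nonzero] / np.abs(mean[nonzero])
    return sd, cv


# Toy example: two replicates of a two-parameter estimate.
toy_sd, toy_cv = sd_and_cv(np.array([[1.0, 0.0], [3.0, 0.0]]))
```

In the toy example the first parameter has mean 2 and sample SD equal to the square root of 2, while the second is identically zero, so its CV is left undefined.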

Table A.26: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor      θ̂1        θ̂2
1        11.642     9.181
2         5.836    12.949
3         7.116    10.817
4        12.371    15.661
5         7.690    11.701
6        15.197     8.259
7        16.918    12.529
8        13.580    11.395
9        23.821     9.162
10       12.205    14.031
11        8.135     8.516
12        8.378     8.510
13        8.600    14.020
14        7.729    11.316
15       23.372     1.981
16        5.867    15.123
17       15.475    14.433
18       12.491    10.866
19        9.214     7.387
20        8.265    24.541
21        9.850    15.827
22       10.274    24.435
23        3.059    10.197
24       11.480    11.169
25        9.027    18.821
26       12.321    12.965
27       14.534    10.501
28        2.523     2.932
29        7.719    11.715
30       14.988    22.774
31        7.249    10.121
32        0.970     4.251
33       10.189     2.518
34       12.501    15.387
35        8.080    19.684
36        8.088     5.223
37        9.737    24.307
38        7.037    10.496
39       17.521    11.191
40       11.427     8.547
41       14.421    15.681
42        9.883     7.024

Table A.27: The average standard deviation (SD) for θ̂ for the scenarios with N = 42, M = 14 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor     θ̂1       θ̂2
1         0.665     0.734
2         0.696     0.705
3         0.693     0.708
4         0.708     0.692
5         0.706     0.691
6         0.688     0.711
7         0.698     0.699
8         0.705     0.696
9         0.678     0.719
10        0.701     0.699
11        0.685     0.715
12        0.694     0.705
13        0.707     0.693
14        0.693     0.706
15        0.674     0.724
16        0.734     0.665
17        0.701     0.701
18        0.726     0.668
19        0.686     0.715
20        0.708     0.692
21        0.710     0.690
22        0.688     0.711
23        0.703     0.695
24        0.695     0.704
25        0.727     0.671
26        0.687     0.713
27        0.703     0.697
28        0.697     0.702
29        0.696     0.701
30        0.712     0.689
31        0.704     0.693
32        0.680     0.718
33        0.716     0.684
34        0.702     0.697
35        0.727     0.672
36        0.675     0.722
37        0.723     0.675
38        0.701     0.697
39        0.704     0.698
40        0.674     0.725
41        0.705     0.695
42        0.696     0.703

A.1.5 Scenarios 13 to 16

Table A.28: The number of groups identified for the scenarios with N = 42, M = 14 and K = 4 with penalized perplexity (PP) used as the model selection criterion.

                   K̂ = 4                                  K̂ ≠ 4
                   Number of groups identified            Number of groups identified
ε      Nesting       0     1     2     3     4   Total      0     1     2     3     4   Total
0.003  Mild          0     0     1    17   394     412      0     0     7    23    58      88
0.003  Moderate      0     2     9    34   237     282      1     5    37    75   100     218
0.003  Extreme       4     5    32   103    49     193      8    38    85   137    39     307
0.003  Full          3    11    11     5     0      30     53   254   148    15     0     470

0.005  Mild          0     0     1    10   401     412      0     0     4    22    62      88
0.005  Moderate      0     1     0    34   247     282      0     4    11   100   103     218
0.005  Extreme       1     2    19    97    74     193      0    17    67   159    64     307
0.005  Full          2    11    12     5     0      30     18   255   180    17     0     470

0.01   Mild          0     0     0     5   407     412      0     0     2    20    66      88
0.01   Moderate      0     0     1    23   258     282      0     1     8    97   112     218
0.01   Extreme       0     0     3    63   127     193      0     2    37   178    90     307
0.01   Full          1     3    19     7     0      30      3   220   219    28     0     470

0.03   Mild          0     0     0     1   411     412      0     0     1    19    68      88
0.03   Moderate      0     0     1    12   269     282      0     0     7    90   121     218
0.03   Extreme       0     0     0    35   158     193      0     0    21   168   118     307
0.03   Full          0     0    14    15     1      30      0   179   246    43     2     470

Figure A.10: Stacked bar plots for the identified interaction groups in scenario 13. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K. [Panels: ε = 0.003, 0.005, 0.01 and 0.03 in each row; x-axis: number of groups identified; y-axis: frequency; legend: Group 1 to Group 4.]

Figure A.11: Stacked bar plots for the identified interaction groups in scenario 14. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K. [Panels: ε = 0.003, 0.005, 0.01 and 0.03 in each row; x-axis: number of groups identified; y-axis: frequency; legend: Group 1 to Group 4.]

Figure A.12: Stacked bar plots for the identified interaction groups in scenario 15. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K. [Panels: ε = 0.003, 0.005, 0.01 and 0.03 in each row; x-axis: number of groups identified; y-axis: frequency; legend: Group 1 to Group 4.]

[Figure: two rows of four stacked bar plots, one panel per ε = 0.003, 0.005, 0.01 and 0.03; x-axis: number of groups identified; y-axis: frequency; legend: Group 1 to Group 4.]

Figure A.13: Stacked bar plots for the identified interaction groups in scenario 16. The top plots are for the runs with Kˆ = K and the bottom row is for Kˆ 6= K. 114 4 4 ˆ β β 3 3 ˆ β β 2 2 ˆ β β = 4 using penalized K 1 1 ˆ β β 0.046 0.038 0.077 0.104 0.0100.002 0.604 -0.3680.031 -0.369 -0.539 -0.155 -0.325 0.027 0.007 0.014 0.021 0.023 -0.136 0.024 0.027 0.0900.039 0.022 0.068 0.016 0.027 0.011 0.057 0.113 0.016 0.025 -0.082 2.288 0.011 -0.006 0.076 0.062 0.099 0.000 0.012 0.347 0.0000.269 0.147 0.0000.163 0.159 0.222 0.097 0.221 0.078 0.192 0.071 0.000 0.108 0.152 0.012 0.000 0.081 0.105 0.122 0.000 0.249 0.110 0.028 0.000 0.177 0.192 0.098 0.000 0.000 0.000 0.084 0.000 0.000 0.000 0.066 0.000 0.000 0.000 0.067 0.000 0.000 0.000 0.080 0.000 0.000 0.027 0.062 0.000 0.000 0.086 0.000 0.076 0.053 -0.243 -0.368-0.250 -0.607 -0.204-0.156 -0.693 0.021 -0.268 0.264 -0.202 3.014 -0.279 -0.552 -0.270 0.454 4 4 = 14 and ˆ β β M 3 3 ˆ β β = 42, N 2 2 ˆ β β 1 1 ˆ β β 0.354 0.013 0.0020.001 -0.046 0.001 0.0180.001 0.022 0.1930.000 0.015 0.316 0.001 -0.0300.000 0.351 0.006 -0.375 -0.1740.000 0.001 -0.260 -0.171 0.007 0.000 -0.225 -0.068 0.009 0.000 -0.394 0.003 0.012 0.008 0.2330.255 0.006 0.054 0.0000.086 0.000 0.0000.083 0.000 0.000 0.000 0.302 0.000 0.000 0.000 0.219 0.000 0.000 0.000 0.000 0.000 0.125 0.582 0.000 0.000 0.087 0.418 0.000 0.000 0.087 0.000 0.000 0.000 0.104 0.000 0.000 0.143 0.135 0.000 0.000 0.417 0.126 0.000 0.000 0.285 0.086 0.000 0.000 0.155 0.064 0.000 0.000 0.114 0.000 0.072 0.000 0.000 -0.045-0.109 0.007-0.057 0.002 0.000 0.002 0.000 0.001 -0.012 0.000 0.002 -0.031 0.021 0.001 0.025 0.075 -0.050 0.040 0.003 4 4 ˆ β β parameter used to generate the data. 
Level of Nesting β 3 3 ˆ for the scenarios with β β ˆ β 2 2 ˆ β β 1 1 ˆ β β 0.0000.000 0.0000.000 0.000 0.026 -0.7390.007 0.014 0.000 0.000 0.026 0.000 0.000 0.000 -0.009 0.000 0.001 0.004 -0.059 0.000 0.400 -0.005 0.006 -0.059 0.000 0.000 0.005 0.001 -0.003 0.000 -0.011 0.0010.001 0.001 -0.0470.002 0.000 -0.006 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.0000.000 0.000 0.2200.000 0.001 0.265 0.000 0.000 0.000 0.182 0.000 0.000 0.000 0.106 0.000 0.000 0.000 0.104 0.002 0.000 0.000 0.122 0.361 0.178 0.000 0.000 0.219 0.594 0.000 0.000 0.262 0.228 0.244 0.000 0.148 0.000 0.284 0.000 0.000 0.000 0.157 0.002 0.007 0.000 0.143 0.000 0.000 0.172 0.000 0.000 0.000 0.000 0.000 -0.025-0.018 0.004-0.029 0.023 0.000 -0.006 0.000 0.004 -0.632 -0.441 0.000 4 4 ˆ β β 3 3 ˆ β β 2 2 Mild Moderate Extreme Full ˆ β β 1 1 ˆ β β 0.0000.000 0.013 -0.002 0.000 0.000 0.000 0.000 0.000 -0.4560.000 -0.0020.000 0.000 0.001 0.000 -0.000 0.000 -0.2990.001 0.070 0.0000.001 0.000 0.000 -0.018 -0.5610.001 0.132 0.0000.000 0.000 0.000 -0.019 -0.010 0.001 0.000 0.000 0.000 -0.015 0.000 -0.020 0.0000.000 -0.023 0.000 0.3440.437 0.237 0.0000.318 0.403 0.000 0.000 0.245 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.249 0.000 0.000 0.000 0.751 0.000 0.000 0.012 0.000 0.060 0.000 0.000 0.000 0.237 0.000 0.003 0.000 0.075 0.000 0.000 0.000 0.151 0.000 0.000 0.000 0.093 0.000 0.000 0.104 0.000 0.170 0.108 -0.006-0.000 0.001-0.005 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 1 1 10 11 12 13 14 10 11 12 13 14 Plant Table A.29: Top row: The bias and relative bias for perplexity for model selection. 
Bottom row: The true 115 4 ˆ θ 3 ˆ θ 2 ˆ θ = 4 using the penalized 2.870 533.253 -6.147 -3.924 22.796 -10.769 1.040 77.680 184.030 8.462 643.410 K 1 ˆ θ 6.934 1.651 -15.064 3.796 10.144 7.841 0.1291.650 76.495 36.393 -21.828 29.782 -1.279-3.247 97.378-1.999 192.548 -1.792 151.558-0.354 78.821-0.993 11.429 -0.494 -2.088 -6.707-4.460-3.985 39.895 373.890 12.849-4.001 -7.601 -9.531 72.844 -7.090 -9.010-5.970 1.562 181.408 218.014 1.580-3.990 16.347 -4.130 -9.182 -1.250 -1.202-1.733-3.519 8.784 35.730 -0.370 -9.161 -9.789 -1.813 -1.544-9.975 -8.872 -13.644 468.324-8.823 -5.292 -5.179 0.727-2.108 -21.068 -2.036 -5.289 30.019-9.381 77.828 955.475 45.471-9.879 -4.636 -3.131 -5.398-7.008 37.122 -8.597-8.831-0.416 -8.163 32.581 -14.975 -4.239 20.385 -9.741 126.312 73.648 -2.346 -2.735 -10.322 75.540 230.557 -20.782 -8.883 24.432 416.535 412.130 -4.348 -10.892 3.092 204.264-12.007 -1.660 -3.196 -3.993 -1.343 -10.974-14.428 -9.504 528.151 -15.979 367.854 -0.478 163.191 470.678 -14.754 -6.597 221.014153.041 -11.140 463.218 104.212 -11.706 4 ˆ θ = 14 and 3 M ˆ θ 2 = 42, ˆ θ N 1 ˆ θ 0.519 0.1744.9417.194 13.125 5.8202.911 25.388 -0.711 11.705 -1.330 47.254 -3.022 16.107 -0.089 0.118 13.415 0.898 5.502 -1.905 -1.721 2.333 -0.041 -0.399 5.126 3.459 4.2228.944 -0.628 89.757 -1.269 -1.769 1.590 -5.430 46.596 1.264 1.165 -1.544 436.553 9.736 -0.314-0.556 3.547 0.797 89.636 136.393 -1.508 -2.150 -0.877 122.090-0.702 102.179-0.955 -1.775-1.032 91.005 -2.257 -0.250 -4.219 3.679 -0.442-0.553 -0.944 -0.580 -7.974 -0.551 -0.676 3.788 -3.011 -1.599 -0.410 0.073 14.396 -2.327 127.986 0.444 -0.312-1.639 -1.893 260.078 -2.883 35.018 -1.113 -1.968 -1.870 114.546-3.709 -7.765-1.038 -0.936 -2.697 -8.025 61.919-3.619 21.136 13.455-1.171 -3.462 -2.082 -0.497 10.857 -3.766-7.644 34.224 -6.111 -1.679 -1.056 4.254 -1.845 -2.041 46.114 -3.466 -2.086 56.575 32.825-5.121 -1.256 -2.303 -1.613 -9.156 51.294 89.567 -10.788 77.125 2.604 -0.190 -3.427 44.192 -0.222 5.828 -0.977 28.773 0.988 -1.506 -2.764 
98.176 1.48274.248 -2.487 -1.100 -1.228 10.48210.843 41.906 40.620 -3.68818.956 -4.951 112.400 11.354 4.884 46.063 1.634 -4.74157.036 22.715 52.171 -3.401 -2.824 124.806 37.123 -2.455 -1.201 122.642 22.374 50.260 -1.414 4 ˆ θ Level of Nesting 3 ˆ θ 2 ˆ θ for the scenarios with ˆ θ 1 ˆ θ 0.973 0.370 -0.379 57.792 2.2993.827 -0.594 -0.201 -0.110 -0.264 3.481 -0.643 1.572 6.415 -1.294 28.148 0.412 -0.0523.535 -1.071 13.818 0.360 -1.510 -1.254 -0.483-0.598 46.958-0.011 -0.174 -0.532-1.167 -0.484 -0.120 0.117 -1.547 -0.286 1.156 0.386 10.610 0.094 -0.624 -0.192 0.663-0.639 0.932 -1.543 0.858-0.401 0.002-0.028 0.060 79.612 -0.629-0.902 -0.120 -0.922 -0.594 -0.301 -0.389 -1.105 -1.146 1.938-1.595 -0.297 0.238-2.033 0.651 -0.499 -0.297-2.286 -0.987 0.201 1.154 7.776 0.080 -0.104 -0.201 -0.825 -0.767 -3.547 -1.151 -0.327 -1.065 -1.893 -1.778 -0.656-0.662 -0.499 -0.585 -0.410 -1.028 -1.290 -0.263 -0.524 -0.070 -1.941 -1.374 6.419-1.115 -0.088 0.365-2.328 0.472 -2.637 -1.154 -0.744 -0.290 -1.250 -1.529 -1.381 -1.168 -0.379 -0.966 -2.936 -0.464 -0.887 5.472 -0.799 -0.079-4.206 -1.048 0.201 -0.626-0.410 -1.112 -3.680 0.009 54.155 -3.186 -0.438 18.321 -1.582 15.079 0.998 -0.542 -0.943 11.822 1.521 0.755 0.069 53.540 139.455 -0.550 47.217 47.36785.908 0.180 -2.083 -2.07536.121 -0.915 -1.320 -1.481 6.574 -0.510 -1.120 25.192 136.952 -0.12228.791 -1.294 -0.55621.287 -1.268 -1.556 -0.448 1.877 6.032 4 ˆ θ 3 ˆ θ 2 Mild Moderate Extreme Full ˆ θ 1 ˆ θ 0.059 7.506 -0.228 -0.003 0.013 1.752 0.1060.048 15.228 3.081 -0.394 2.425 0.144 -0.010 -0.9136.210 -1.501 -1.086 -0.883 -0.517 0.025 1.234 -1.2751.472 -0.989 -0.5627.661 4.366 -1.213 -0.056 -0.932 -0.182 0.813 40.032 -1.413 9.510 5.501 -1.938 -2.8902.305 -0.275 -0.772 0.002 2.304 -0.001-0.433 -0.232 -0.135-0.314 -0.243 -0.141-0.238 -0.346 0.624 -0.513-0.450 -0.368 0.052 -0.089 -0.189 -0.307 -0.266 -0.439 -0.159 -0.741 -0.723-1.083 0.350 -0.479 0.299 0.553-0.231 -0.506 2.641-0.723 -0.253 -1.076 -0.054 -0.645 1.010 0.687 -0.198-0.481 
-0.361 -0.205 -0.571 -0.880 -1.539 -0.065 -1.273-0.446 0.657 -0.480 -0.759 -0.100 -1.270-0.661 -0.887-0.788 -0.987 1.533 2.660 -0.519 -0.279 0.240-0.485 -1.537 0.890 -1.663 -0.027 -0.387-0.614 -0.423 -0.837 -0.780 -1.011 -1.065 -0.963-0.449 1.000 54.403 -0.645 2.337 -0.356 -1.391 -1.338-1.643 0.286 -0.407 6.894 -1.016-1.349 0.621 4.894 -1.333 0.375 -1.128-0.432 0.833-1.341 -0.217 9.326 25.853 -0.692 -0.005 -2.189-0.002 0.140 -0.250 -1.086 -0.480 -1.821 73.372 1.54124.325 -0.968 -1.404 -0.600 -1.065 2.571 79.18923.843 -1.071 86.269 -3.281 -0.560 0.082 -0.762 2 3 4 5 6 7 8 9 1 11 13 14 16 17 18 19 20 22 23 25 26 28 29 30 31 32 34 35 36 37 38 41 42 12 24 27 33 10 15 21 39 40 Visitor Table A.30: The average relative bias for perplexity for model selection. 116 4 ˆ θ 3 ˆ θ 2 ˆ θ 1 ˆ θ = 4 using the penalized perplexity 0.057 0.1480.013 0.164 0.2210.024 0.103 0.269 0.1320.002 0.214 0.102 0.2030.000 0.305 0.106 0.1850.039 0.158 0.088 0.1820.027 0.239 0.122 0.2480.021 0.296 0.164 0.1810.050 0.302 0.173 0.2270.039 0.214 0.115 0.1560.045 0.278 0.127 0.3280.000 0.239 0.174 0.2760.014 0.252 0.113 0.2520.080 0.268 0.170 0.3110.020 0.200 0.176 0.2530.026 0.205 0.128 0.1600.011 0.286 0.163 0.2410.046 0.271 0.150 0.2650.049 0.176 0.118 0.2730.062 0.233 0.107 0.2210.051 0.260 0.185 0.2980.110 0.179 0.156 0.2420.071 0.234 0.162 0.2730.074 0.179 0.164 0.1850.111 0.280 0.185 0.3680.045 0.200 0.169 0.2240.064 0.214 0.156 0.2660.146 0.290 0.086 0.2250.065 0.245 0.156 0.2800.043 0.175 0.150 0.3000.146 0.312 0.205 0.2540.172 0.308 0.182 0.2040.182 0.146 0.207 0.1560.212 0.261 0.270 0.2420.145 0.219 0.237 0.2820.122 0.245 0.206 0.2770.225 0.268 0.163 0.3660.219 0.229 0.193 0.2680.234 0.150 0.167 0.2440.250 0.245 0.189 0.2880.154 0.178 0.152 0.2610.213 0.160 0.214 0.261 0.217 0.122 0.133 K 4 ˆ θ 3 ˆ θ 2 = 14 and ˆ θ M 1 ˆ θ 0.022 0.0600.035 0.008 0.0820.022 0.014 0.016 0.0660.013 0.014 0.007 0.0770.015 0.010 0.098 0.0800.058 0.001 0.032 0.0730.049 0.015 0.032 0.0640.046 0.000 0.048 
0.0770.040 0.008 0.037 0.0690.071 0.000 0.078 0.0580.058 0.007 0.066 0.0790.082 0.019 0.080 0.0570.025 0.033 0.054 0.0590.074 0.018 0.055 0.0480.028 0.021 0.056 0.0460.050 0.013 0.049 0.0580.048 0.013 0.055 0.0350.069 0.003 0.074 0.0700.049 0.023 0.059 0.0780.063 0.017 0.078 0.0490.054 0.011 0.042 0.0710.034 0.022 0.047 0.0540.063 0.015 0.070 0.0860.070 0.013 0.079 0.0480.085 0.025 0.067 0.1210.054 0.015 0.078 0.0820.045 0.026 0.071 0.0930.015 0.011 0.097 0.0760.104 0.039 0.095 0.0620.051 0.030 0.050 0.1060.106 0.043 0.094 0.0500.081 0.050 0.074 0.0780.103 0.030 0.116 0.0790.059 0.034 0.092 0.1300.128 0.056 0.143 0.0620.085 0.038 0.078 0.0620.067 0.039 0.115 0.0650.081 0.014 0.060 0.1130.090 0.051 0.117 0.0880.080 0.041 0.127 0.0660.092 0.065 0.097 0.0650.067 0.076 0.123 0.112 0.062 0.147 0.031 = 42, 4 ˆ θ N Level of Nesting 3 ˆ θ 2 ˆ θ 1 ˆ θ 0.001 0.0110.004 0.003 0.0000.004 0.005 0.003 0.0160.001 0.002 0.004 0.0150.004 0.009 0.000 0.0000.001 0.003 0.000 0.0250.007 0.005 0.000 0.0170.002 0.003 0.001 0.0060.009 0.003 0.001 0.0160.001 0.003 0.000 0.0160.011 0.006 0.000 0.0230.007 0.004 0.005 0.0170.006 0.003 0.001 0.0110.004 0.000 0.003 0.0210.015 0.003 0.000 0.0290.005 0.004 0.003 0.0230.011 0.007 0.001 0.0340.000 0.007 0.011 0.0190.006 0.002 0.000 0.0400.010 0.004 0.002 0.0300.016 0.007 0.005 0.0650.002 0.004 0.005 0.0400.017 0.012 0.002 0.0460.016 0.007 0.000 0.0390.026 0.009 0.006 0.0320.027 0.012 0.011 0.0580.040 0.001 0.018 0.0850.032 0.004 0.008 0.0380.054 0.015 0.014 0.0550.039 0.012 0.018 0.0590.028 0.005 0.017 0.0820.062 0.015 0.019 0.0630.038 0.011 0.023 0.0760.086 0.011 0.035 0.1250.052 0.003 0.024 0.1170.050 0.013 0.012 0.0790.070 0.011 0.015 0.1110.050 0.031 0.022 0.0550.046 0.030 0.013 0.0750.064 0.015 0.022 0.0950.047 0.037 0.020 0.0860.040 0.029 0.012 0.058 0.036 0.041 0.014 4 ˆ θ 3 ˆ for the scenarios with θ ˆ θ 2 Mild Moderate Extreme Full ˆ θ 1 ˆ θ 0.0020.000 0.000 -0.0000.002 0.001 0.000 0.002 0.000 0.000 0.001 0.000 0.001 0.000 0.001 0.002 
0.000 0.004 0.001 0.000 0.001 0.000 0.001 0.002 0.001 0.001 0.001 0.000 0.000 0.000 0.001 0.000 0.001 0.000 0.001 0.000 0.001 0.001 0.001 0.002 0.001 0.000 0.001 0.002 0.001 0.001 0.000 0.000 0.001 0.000 0.002 0.000 0.002 0.001 0.002 0.000 0.001 0.000 0.004 0.000 0.000 0.000 0.005 0.003 0.002 0.004 0.005 0.000 0.000 0.007 0.000 0.001 0.001 0.008 0.002 0.000 0.000 0.000 0.007 0.005 0.001 0.000 0.005 0.006 0.003 0.007 0.005 0.006 0.002 0.006 0.006 0.008 0.001 0.004 0.009 0.007 0.001 0.007 0.009 0.014 0.003 0.018 0.004 0.004 0.003 0.024 0.004 0.008 0.002 0.006 0.012 0.027 0.006 0.017 0.021 0.019 0.000 0.019 0.027 0.028 0.002 0.010 0.007 0.019 0.006 0.019 0.025 0.033 0.006 0.025 0.024 0.028 0.006 0.024 0.022 0.017 0.006 0.011 0.011 0.016 0.006 0.032 0.007 0.049 0.001 0.042 0.014 0.037 0.009 0.034 0.027 0.039 0.004 0.042 0.025 0.031 0.015 0.028 0.016 0.022 0.002 0.025 0.024 0.009 0.045 0.022 0.010 0.013 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 Visitor for model selection. 
Table A.31: The average bias for 117 4 ˆ β = 4 using 3 ˆ β K 2 ˆ β = 14 and 1 ˆ β M 0.179 0.1150.907 0.146 0.9040.874 1.485 0.795 1.4351.157 1.463 0.784 2.5320.847 6.136 1.583 1.7330.046 0.966 1.304 3.4230.009 2.111 1.536 0.9930.166 0.903 1.032 1.6710.123 1.399 0.114 0.0740.047 1.123 0.208 0.0910.130 1.416 0.153 0.0680.176 1.502 0.107 0.0310.122 1.283 8.911 0.1190.058 1.354 0.139 0.035 1.502 0.142 1.331 4 ˆ β = 42, N 3 ˆ β 2 ˆ β 1 ˆ β 0.988 0.0700.959 0.000 0.0091.006 0.007 0.000 0.0091.716 0.010 0.001 0.0971.011 0.007 0.024 0.1110.988 0.971 1.284 0.1230.010 0.972 1.235 0.0850.006 0.019 1.335 0.0980.009 0.008 1.565 0.0890.000 0.064 1.474 1.1410.000 0.008 0.977 0.9150.000 0.058 1.224 0.9240.000 0.075 1.435 1.0870.004 0.021 0.858 0.074 0.071 1.376 0.064 4 for the scenarios with ˆ β ˆ β Level of Nesting 3 ˆ β 2 ˆ β 1 ˆ β 0.004 0.0000.004 1.029 0.0000.004 0.006 1.014 0.9630.081 0.006 1.042 0.0000.004 0.011 1.022 0.0000.004 8.212 1.001 0.0000.057 0.992 1.003 0.0000.071 1.012 0.007 0.0000.987 1.016 0.002 0.0190.984 1.007 0.000 1.0381.002 0.057 0.001 1.0060.005 1.014 1.362 1.0390.005 0.000 0.000 0.9780.020 0.000 0.000 1.016 0.000 0.001 0.000 4 ˆ β 3 ˆ β 2 Mild Moderate Extreme Full ˆ β 1 ˆ β 0.000 1.0000.000 0.000 0.9890.982 0.001 0.000 0.9880.990 0.000 0.000 0.0050.987 0.000 0.000 0.0040.000 0.000 0.000 2.4100.007 0.000 0.989 0.0040.000 0.007 0.984 0.0000.000 1.230 0.000 0.9450.017 0.978 0.000 0.0000.025 1.265 0.001 1.1730.012 0.983 0.000 0.0000.000 1.002 0.000 0.0000.013 0.995 0.000 0.000 0.980 0.000 0.985 2 3 4 5 6 7 8 9 1 10 11 12 13 14 Plant Table A.32: The coefficientthe of penalized variation perplexity (CV) for and model SD selection. 
for 118 4 ˆ θ = 4 using the K 3 ˆ θ 2 ˆ θ = 9 and M 1 ˆ θ 0.7364.354 4.807 10.947 3.675 13.543 0.555 3.673 0.599 15.1740.457 14.068 39.5361.483 19.761 0.698 0.630 0.559 1.841 5.740 1.298 4.2123.490 3.4184.808 3.907 11.586 29.662 4.6031.410 10.905 0.865 67.3751.523 9.340 21.6300.516 7.8888.000 1.259 0.590 0.5400.500 2.823 3.566 54.734 2.358 1.255 63.857 4.591 24.515 1.191 31.345 1.166 11.7842.351 31.491 13.529 28.1451.325 42.075 14.7560.931 0.414 0.590 0.720 0.842 7.223 5.953 3.163 32.7829.593 24.249 16.387 2.590 18.039 13.907 70.19727.321 56.534 41.47613.943 36.719 22.003 31.14341.200 14.028 31.356 0.594 1.289 10.45945.344 3.468 10.77431.608 21.946 27.301 28.554 0.569 15.994 24.50430.757 1.12746.287 1.041 9.457 32.625 23.317 24.530 2.751 25.455 32.210 14.385 7.165 4.25964.166 20.726 2.128 1.772 22.668 27.988 33.90422.601 24.283 30.452 14.600 27.10423.471 42.088 29.115 0.55447.235 13.972 39.73037.227 12.771 16.45011.332 37.326 13.487 12.141 35.206 31.864 0.47748.533 0.595 9.347 36.237 0.912 23.57445.482 28.954 4.723 29.094 2.978 31.08133.656 4.132 31.429 56.58119.371 36.425 16.048 10.854 39.175 19.371 0.513 1.029 = 20, 4 ˆ θ N 3 ˆ θ 2 ˆ θ 1 ˆ θ 7.648 12.778 12.620 27.885 9.388 16.8262.132 29.484 5.449 25.908 13.349 19.677 2.260 31.126 30.603 16.135 1.9019.965 8.613 26.2164.461 29.498 31.227 25.007 24.535 19.465 10.708 11.067 5.144 5.152 23.668 16.129 2.171 20.933 24.486 12.011 27.588 12.97312.180 26.785 15.874 22.009 17.38322.366 19.061 36.69618.001 11.343 18.35232.357 26.068 23.42313.513 5.881 22.373 21.889 4.827 27.062 25.77417.000 16.348 11.63717.759 32.524 13.51924.874 30.81422.643 4.308 9.746 22.61826.396 1.351 7.996 13.607 29.88323.670 32.282 28.475 19.48026.184 5.158 11.684 23.489 8.127 14.84118.501 15.928 18.768 18.00212.404 13.135 21.98725.480 17.548 48.322 27.95541.755 26.099 16.386 26.02349.035 24.864 15.963 11.15828.540 12.209 10.160 15.517 14.271 8.938 12.393 30.578 21.54011.915 27.023 35.14112.588 19.934 21.10612.130 9.314 21.347 19.889 8.702 33.148 
25.746 9.917 27.08016.727 48.770 32.518 14.97528.852 15.492 12.582 5.157 18.499 11.584 8.211 14.32213.483 26.186 21.448 36.83127.986 29.13211.460 7.646 5.448 22.070 12.41524.734 20.966 22.09712.134 10.736 6.215 26.711 39.48852.086 17.433 32.398 29.706 7.795 50.959 13.247 18.621 4 ˆ θ for the scenarios with ˆ θ Level of Nesting 3 ˆ θ 2 ˆ θ 1 ˆ θ 4.409 45.754 7.395 23.072 7.760 20.038 31.059 6.431 8.102 47.440 17.005 20.757 6.678 31.525 40.423 17.816 8.976 20.583 15.421 17.652 4.324 14.166 28.476 18.198 18.431 38.33116.144 12.433 28.34816.782 14.985 17.95613.718 9.678 13.203 30.94427.439 10.031 16.188 13.12816.089 12.317 23.205 22.879 19.64915.613 19.129 16.550 12.76529.932 19.075 22.339 27.62222.982 14.937 20.985 39.56920.952 21.987 18.862 17.70727.831 30.067 23.660 11.22418.534 11.176 18.460 35.35015.631 20.477 24.41012.377 9.756 19.321 37.565 6.495 16.980 20.044 3.440 11.95423.576 20.883 21.063 8.455 9.998 15.98928.584 13.830 2.477 19.44624.301 7.139 25.523 17.626 17.239 16.881 17.70624.810 6.690 23.821 14.66611.296 14.101 37.56522.824 27.855 15.98627.392 8.492 18.752 29.311 7.889 12.211 21.394 39.097 26.91429.719 33.011 30.87013.930 9.625 15.48923.839 3.584 13.631 13.652 16.705 8.17129.346 9.931 10.951 20.28220.093 25.659 36.130 17.12618.530 16.260 19.891 24.04717.322 27.059 10.857 19.69612.726 15.928 30.701 26.342 18.857 11.25217.445 27.874 24.63911.572 23.589 9.444 27.15310.133 14.168 13.199 26.71712.431 15.469 8.402 24.931 6.395 14.216 4 ˆ θ 3 ˆ θ 2 Mild Moderate Extreme Full ˆ θ 1 ˆ θ 8.647 39.188 22.772 25.242 5.427 19.192 12.078 17.198 15.681 10.99822.813 20.64426.259 8.029 16.452 20.98011.208 32.300 29.93227.105 15.906 8.924 21.769 14.57429.855 22.477 30.980 24.45622.980 34.727 30.974 22.772 19.87324.308 16.526 22.415 15.57823.203 12.305 18.210 10.34418.009 12.290 28.126 19.20331.034 32.094 16.054 29.86310.252 25.676 10.558 18.56816.110 30.597 20.051 12.75020.344 27.772 20.510 11.73713.071 14.085 22.123 19.08229.029 16.210 17.837 15.28527.129 22.697 27.544 
24.390 36.603 14.67019.358 23.257 26.22632.566 20.000 19.95412.006 18.367 31.634 16.91235.140 38.418 20.380 31.71229.357 20.979 10.121 21.70225.571 23.444 15.354 32.35517.774 25.531 11.228 39.04712.655 15.327 21.449 25.86715.001 10.720 17.76935.160 8.309 20.472 12.60217.807 14.184 23.095 36.27517.583 17.091 18.653 21.156 13.92723.186 22.299 22.791 18.51015.144 12.874 14.672 30.62311.638 32.906 18.756 18.72226.991 14.298 14.323 7.073 21.27128.547 26.065 25.00218.016 8.998 20.646 16.65119.651 14.473 24.404 18.94514.674 30.793 14.116 11.85412.876 10.691 24.32117.895 7.013 19.276 17.205 31.59924.149 23.204 29.176 29.712 15.905 29.337 23.256 6 7 1 2 3 4 5 8 9 10 11 12 16 18 19 21 22 23 24 25 26 29 31 32 36 39 13 14 15 17 20 27 28 30 33 34 35 37 38 40 41 42 Visitor Table A.33: The averagepenalized coefficient perplexity of for variation (CV) model for selection. 119 = 4 using the K 4 ˆ θ 3 ˆ θ = 9 and 2 ˆ θ M 1 ˆ θ = 20, 0.445 0.5210.477 0.468 0.4610.513 0.399 0.426 0.5000.569 0.502 0.284 0.5130.474 0.530 0.361 0.5020.473 0.423 0.356 0.5130.347 0.537 0.299 0.5290.446 0.553 0.388 0.5010.443 0.555 0.393 0.5840.456 0.518 0.295 0.5630.400 0.530 0.284 0.5380.415 0.533 0.420 0.5080.509 0.512 0.325 0.5500.465 0.586 0.340 0.5600.481 0.481 0.363 0.5070.421 0.465 0.343 0.5470.509 0.536 0.396 0.5130.459 0.531 0.336 0.5670.381 0.519 0.324 0.5710.451 0.494 0.344 0.4980.444 0.533 0.410 0.5410.471 0.499 0.418 0.5510.464 0.518 0.383 0.4810.449 0.474 0.395 0.5060.412 0.525 0.406 0.6200.383 0.553 0.291 0.5350.492 0.528 0.336 0.6130.532 0.588 0.281 0.4790.392 0.512 0.387 0.5840.388 0.484 0.378 0.5350.510 0.550 0.412 0.5450.476 0.544 0.364 0.5020.527 0.466 0.434 0.4480.496 0.516 0.469 0.5050.485 0.499 0.460 0.4740.473 0.501 0.454 0.5340.498 0.522 0.390 0.6410.529 0.539 0.400 0.5210.555 0.374 0.402 0.5100.569 0.490 0.390 0.5400.506 0.456 0.347 0.5330.544 0.490 0.429 0.551 0.469 0.374 0.478 4 ˆ N θ 3 ˆ θ 2 ˆ θ 1 ˆ θ 0.446 0.5800.507 0.425 0.4900.560 0.452 0.397 0.5220.470 0.507 0.380 
0.5120.495 0.437 0.472 0.4960.490 0.457 0.448 0.5400.519 0.464 0.438 0.4970.503 0.438 0.458 0.5180.507 0.437 0.395 0.5090.515 0.499 0.470 0.4660.514 0.428 0.427 0.4580.594 0.501 0.462 0.4830.487 0.473 0.443 0.4840.540 0.393 0.465 0.4570.566 0.479 0.428 0.5180.503 0.514 0.418 0.4750.497 0.416 0.467 0.4130.527 0.477 0.507 0.4690.530 0.506 0.467 0.4400.525 0.450 0.460 0.4560.546 0.480 0.446 0.4720.520 0.515 0.438 0.5170.546 0.458 0.465 0.4410.569 0.430 0.458 0.5020.504 0.469 0.438 0.4960.502 0.428 0.469 0.5090.512 0.447 0.461 0.4650.479 0.453 0.469 0.4930.561 0.476 0.500 0.4430.474 0.467 0.471 0.4880.587 0.466 0.514 0.3980.494 0.455 0.452 0.4780.504 0.501 0.503 0.4900.437 0.477 0.448 0.5290.598 0.479 0.499 0.4200.485 0.485 0.451 0.4960.578 0.470 0.504 0.4780.533 0.467 0.428 0.5040.458 0.471 0.477 0.5190.487 0.442 0.491 0.5030.474 0.477 0.465 0.4950.501 0.491 0.502 0.542 0.486 0.456 0.467 4 ˆ θ for the scenarios with Level of Nesting 3 ˆ θ ˆ θ 2 ˆ θ 1 ˆ θ 0.528 0.4840.498 0.490 0.5100.500 0.472 0.460 0.4570.525 0.492 0.497 0.4600.476 0.512 0.495 0.4960.484 0.488 0.454 0.4930.475 0.539 0.521 0.4790.532 0.475 0.455 0.4670.556 0.549 0.465 0.4760.498 0.505 0.464 0.4900.479 0.468 0.477 0.5570.478 0.501 0.499 0.5030.505 0.418 0.460 0.5200.483 0.533 0.497 0.5490.506 0.452 0.483 0.4850.476 0.442 0.510 0.4770.445 0.450 0.491 0.5530.496 0.519 0.521 0.5160.464 0.439 0.496 0.5440.498 0.464 0.474 0.4840.513 0.473 0.473 0.5050.445 0.506 0.486 0.5420.481 0.471 0.485 0.4980.510 0.499 0.496 0.4910.513 0.507 0.495 0.5330.497 0.458 0.459 0.5450.461 0.467 0.469 0.5500.454 0.461 0.473 0.5370.499 0.489 0.475 0.4640.493 0.492 0.516 0.4900.481 0.494 0.498 0.5440.486 0.486 0.470 0.5030.458 0.466 0.466 0.5210.479 0.525 0.499 0.5150.427 0.505 0.510 0.5660.455 0.478 0.477 0.5310.477 0.498 0.481 0.5430.529 0.510 0.483 0.4390.477 0.471 0.469 0.4730.439 0.538 0.507 0.5040.494 0.514 0.522 0.5200.446 0.503 0.446 0.531 0.509 0.491 0.508 4 ˆ θ 3 ˆ θ 2 Mild Moderate Extreme Full ˆ θ 1 ˆ θ 0.511 
0.5050.519 0.518 0.4880.514 0.434 0.507 0.4930.522 0.461 0.469 0.4880.458 0.497 0.456 0.5020.520 0.499 0.534 0.4540.480 0.471 0.502 0.5220.501 0.501 0.490 0.4870.509 0.467 0.502 0.4980.505 0.484 0.466 0.5150.510 0.500 0.482 0.4560.494 0.463 0.515 0.5190.503 0.490 0.480 0.5120.537 0.477 0.476 0.4880.475 0.484 0.467 0.4740.470 0.478 0.517 0.4980.484 0.510 0.465 0.4600.484 0.540 0.533 0.4850.487 0.495 0.459 0.5460.472 0.533 0.471 0.5020.479 0.471 0.519 0.5020.489 0.487 0.505 0.5000.520 0.490 0.462 0.4870.503 0.516 0.457 0.4800.552 0.506 0.491 0.4730.468 0.505 0.459 0.5030.499 0.484 0.498 0.4770.483 0.498 0.513 0.5000.534 0.489 0.460 0.4690.514 0.525 0.444 0.5220.489 0.521 0.473 0.4570.485 0.462 0.506 0.5010.496 0.517 0.486 0.4990.512 0.499 0.456 0.4960.505 0.518 0.469 0.4940.485 0.493 0.490 0.5280.492 0.492 0.479 0.4830.528 0.490 0.486 0.5080.479 0.507 0.423 0.4930.522 0.518 0.517 0.4790.527 0.474 0.482 0.4480.464 0.500 0.480 0.476 0.516 0.509 0.528 2 3 4 5 6 7 8 9 1 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 Visitor Table A.34: Thepenalized average perplexity standard for deviation model (SD) selection. for 120

A.1.6 Scenario 17

Table A.35: The number of groups identified for scenario 17 with N = 80, M = 42 and K = 2 with penalized perplexity (PP) used as the model selection criterion.

                    K̂ = 2                         K̂ ≠ 2
                    Number of Groups identified    Number of Groups identified
ε       Nesting     0     1     2     Total        0     1     2     Total
0.003   None        0     0     475   475          0     0     25    25
0.005   None        0     0     475   475          0     0     25    25
0.01    None        0     0     475   475          0     0     25    25
0.03    None        0     0     475   475          0     0     25    25

[Figure A.14: two rows of four stacked bar plots, one panel each for ε = 0.003, 0.005, 0.01 and 0.03. x-axis: Num. groups identified (1, 2); y-axis: frequency; legend: Group 1, Group 2.]

Figure A.14: Stacked bar plots for the identified interaction groups in scenario 17. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.36: Top row: The bias and relative bias for β̂ for the scenarios with N = 80, M = 42 and K = 2 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.

Level of Nesting: None

Plant     β̂1      β̂2      β1      β2
1      -0.004   0.000   0.023   0.000
2      -0.001   0.000   0.024   0.000
3       0.005   0.000   0.021   0.000
4      -0.017   0.000   0.022   0.000
5       0.004   0.000   0.045   0.000
6      -0.005   0.000   0.048   0.000
7      -0.013   0.000   0.010   0.000
8      -0.013   0.000   0.035   0.000
9       0.003   0.000   0.029   0.000
10      0.005   0.000   0.024   0.000
11      0.006   0.000   0.051   0.000
12     -0.009   0.000   0.032   0.000
13      0.005   0.000   0.046   0.000
14      0.020   0.000   0.015   0.000
15      0.000   0.000   0.033   0.000
16     -0.006   0.000   0.035   0.000
17      0.004   0.000   0.043   0.000
18     -0.004   0.000   0.030   0.000
19     -0.000   0.000   0.036   0.000
20      0.004   0.000   0.033   0.000
21      0.005   0.000   0.011   0.000
22      0.014   0.000   0.032   0.000
23      0.014   0.000   0.024   0.000
24      0.002   0.000   0.031   0.000
25      0.009   0.000   0.024   0.000
26      0.004   0.000   0.024   0.000
27     -0.007   0.000   0.048   0.000
28     -0.007   0.000   0.034   0.000
29      0.001   0.000   0.038   0.000
30      0.002   0.000   0.028   0.000
31     -0.006   0.000   0.036   0.000
32     -0.006   0.000   0.039   0.000
33      0.000   0.001   0.000   0.107
34      0.000  -0.003   0.000   0.185
35      0.000   0.002   0.000   0.102
36      0.000  -0.001   0.000   0.085
37      0.000  -0.000   0.000   0.107
38      0.000  -0.000   0.000   0.102
39      0.000   0.000   0.000   0.057
40      0.000   0.001   0.000   0.045
41      0.000  -0.001   0.000   0.100
42      0.000   0.002   0.000   0.110

Table A.37: The average relative bias for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor   θ̂1      θ̂2
1       0.020   0.007
2      -0.071  -0.107
3      -0.195   0.074
4       0.003   0.304
5       0.158  -0.478
6       0.020  -0.023
7      -0.137   0.725
8       0.178  -0.107
9      -0.095  -0.078
10     -0.169   0.392
11      0.698   0.016
12      0.404  -0.305
13      0.099  -0.581
14      1.173  -0.124
15     -0.116  -0.017
16      0.395  -0.866
17     -0.645  -0.204
18      0.586  -0.358
19     -0.239  -0.474
20     -0.198   0.366
21     -0.154  -0.305
22     -0.320  -0.649
23      0.550  -0.270
24     -0.169  -1.093
25     -0.478  -0.087
26     -0.653  -0.142
27      0.069  -1.085
28     -0.408  -0.540
29      1.622   0.641
30     -0.029  -0.258
31     -0.043  -0.613
32      0.195   9.933
33     -0.556  -0.073
34     -0.100   0.902
35     -1.510  -0.200
36     -0.367  -0.276
37     -0.390  -0.450
38     -0.945   0.027
39      0.407   2.186
40     -0.573  -0.542
41     -0.718   0.734
42     -1.003   0.228
43     -0.508   1.255
44     -0.215  -1.315
45     -0.610  -0.206
46     -0.274  -0.689
47     -0.170  -0.442
48     -0.318  -0.794
49      0.144  -0.799
50     -0.404  -1.104
51     -0.654   3.157
52      6.777  -0.559
53     -0.567  -0.482
54     -0.445  -0.569
55     -1.267   0.239
56     -0.294  -1.550
57     -0.450   1.035
58     -0.159  -1.407
59     -0.566  -0.830
60      0.283  -1.051
61     -1.341  -0.086
62      0.764  -0.713
63      2.524  -0.632
64     -0.487  -1.328
65     -0.132  -0.472
66     -0.855  -1.191
67     -0.396  -0.831
68     -0.716  -0.531
69     -0.433   1.000
70     -0.701   0.584
71     -0.709  -0.441
72     -0.638  -1.325
73     -0.067  -1.899
74      0.340  -0.635
75      1.232  -0.986
76     -0.486  -0.756
77     -0.626   4.975
78     -0.506  -0.783
79     -0.629  -0.666
80     -0.703  -1.028

Table A.38: The average bias for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor   θ̂1      θ̂2
1      -0.000   0.000
2       0.000  -0.000
3       0.000  -0.000
4       0.000   0.000
5       0.000   0.000
6       0.000   0.000
7       0.000   0.000
8       0.000   0.000
9       0.000   0.000
10      0.000   0.000
11      0.000   0.000
12      0.000   0.000
13      0.000   0.000
14      0.000   0.000
15      0.000   0.000
16      0.000   0.000
17      0.000   0.000
18      0.000   0.000
19      0.000   0.000
20      0.000   0.000
21      0.000   0.000
22      0.000   0.000
23      0.000   0.000
24      0.000   0.000
25      0.000   0.000
26      0.000   0.000
27      0.000   0.000
28      0.000   0.000
29      0.000   0.000
30      0.000   0.000
31      0.000   0.000
32      0.000   0.000
33      0.000   0.000
34      0.000   0.000
35      0.000   0.000
36      0.000   0.000
37      0.000   0.000
38      0.000   0.000
39      0.000   0.000
40      0.000   0.000
41      0.000   0.000
42      0.000   0.000
43      0.000   0.000
44      0.000   0.000
45      0.000   0.000
46      0.000   0.000
47      0.000   0.000
48      0.000   0.000
49      0.000   0.000
50      0.000   0.000
51      0.000   0.000
52      0.000   0.000
53      0.000   0.000
54      0.000   0.000
55      0.000   0.000
56      0.000   0.000
57      0.000   0.000
58      0.000   0.000
59      0.000   0.000
60      0.000   0.000
61      0.000   0.000
62      0.000   0.000
63      0.000   0.000
64      0.000   0.000
65      0.001   0.001
66      0.001   0.001
67      0.001   0.001
68      0.001   0.001
69      0.001   0.001
70      0.001   0.001
71      0.001   0.001
72      0.001   0.001
73      0.001   0.001
74      0.001   0.001
75      0.001   0.001
76      0.001   0.001
77      0.001   0.001
78      0.001   0.001
79      0.001   0.001
80      0.001   0.001

Table A.39: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Plant     β̂1      β̂2
1       0.999   0.000
2       1.002   0.000
3       1.010   0.000
4       0.986   0.000
5       0.999   0.000
6       0.991   0.000
7       1.004   0.000
8       0.985   0.000
9       1.003   0.000
10      1.008   0.000
11      1.000   0.000
12      0.989   0.000
13      1.000   0.000
14      1.030   0.000
15      0.997   0.000
16      0.991   0.000
17      1.000   0.000
18      0.995   0.000
19      0.997   0.000
20      1.001   0.000
21      1.027   0.000
22      1.012   0.000
23      1.014   0.000
24      1.001   0.000
25      1.009   0.000
26      1.005   0.000
27      0.987   0.000
28      0.991   0.000
29      0.998   0.000
30      1.004   0.000
31      0.991   0.000
32      0.991   0.000
33      0.000   0.993
34      0.000   0.988
35      0.000   0.995
36      0.000   0.991
37      0.000   0.991
38      0.000   0.992
39      0.000   0.994
40      0.000   0.996
41      0.000   0.991
42      0.000   0.994

Table A.40: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor   θ̂1      θ̂2
1      11.318   6.327
2      13.473  13.782
3      10.610   6.613
4       8.397  17.258
5       9.970  17.227
6       4.639   9.827
7       7.652  14.144
8      11.639   7.108
9      10.877  13.395
10     10.610  10.779
11     14.529   6.446
12      3.696   8.104
13     14.974  11.976
14     10.833  15.286
15     15.130  11.574
16     15.909   8.128
17     17.622   8.189
18     15.352   4.869
19      5.386   7.666
20      8.306   5.032
21      8.597  14.379
22      9.166  19.965
23     11.557   8.184
24      5.120  21.728
25     15.272   5.645
26     12.044   9.910
27     10.611  17.199
28     14.079  19.434
29     13.299  18.230
30     15.172   2.116
31     16.657  12.226
32      6.221  21.577
33      6.010  11.339
34     11.178   7.347
35     20.012   8.744
36      6.524  10.894
37      6.907   8.500
38     11.327   8.923
39     12.294  19.360
40     10.378  15.563
41      8.200   5.484
42     12.386   4.790
43     17.406  26.124
44     15.971  11.137
45     12.550   8.681
46      5.966   8.932
47      4.155   7.479
48     13.948  12.632
49     14.415  20.000
50      8.378  15.316
51      7.843  17.381
52     18.408  16.299
53      4.895  11.433
54     10.244  13.762
55     11.507  12.589
56      7.000  14.778
57     15.370   9.732
58     19.327   7.170
59     11.862  18.902
60      8.910   5.191
61      8.954   9.173
62     10.853  10.167
63      6.489  17.961
64     10.899  16.210
65     16.984   8.172
66      8.768  14.285
67      9.386   6.899
68     12.607  14.699
69      7.418  17.880
70     19.742  16.992
71     18.895   6.369
72     13.042  20.388
73      6.386  12.674
74      6.419   5.570
75     12.247  13.944
76     10.216  11.900
77      4.274   9.472
78     11.918  12.721
79      6.066   8.163
80     18.703  15.706

Table A.41: The average standard deviation (SD) for θ̂ for the scenarios with N = 80, M = 42 and K = 2 using the penalized perplexity for model selection.

Level of Nesting: None

Visitor   θ̂1      θ̂2
1       0.705   0.700
2       0.691   0.707
3       0.691   0.714
4       0.696   0.711
5       0.713   0.695
6       0.728   0.676
7       0.687   0.717
8       0.677   0.725
9       0.718   0.687
10      0.713   0.694
11      0.699   0.706
12      0.684   0.719
13      0.725   0.680
14      0.704   0.699
15      0.708   0.692
16      0.712   0.696
17      0.691   0.715
18      0.717   0.687
19      0.702   0.703
20      0.680   0.719
21      0.664   0.736
22      0.718   0.686
23      0.702   0.703
24      0.716   0.687
25      0.685   0.714
26      0.701   0.702
27      0.733   0.674
28      0.704   0.701
29      0.701   0.703
30      0.695   0.711
31      0.696   0.708
32      0.722   0.681
33      0.669   0.734
34      0.678   0.725
35      0.687   0.717
36      0.728   0.674
37      0.699   0.705
38      0.714   0.691
39      0.686   0.718
40      0.703   0.704
41      0.693   0.709
42      0.720   0.682
43      0.720   0.682
44      0.715   0.690
45      0.721   0.682
46      0.692   0.711
47      0.700   0.703
48      0.699   0.707
49      0.700   0.703
50      0.722   0.684
51      0.695   0.708
52      0.700   0.706
53      0.685   0.716
54      0.691   0.715
55      0.705   0.700
56      0.725   0.680
57      0.709   0.692
58      0.723   0.680
59      0.731   0.671
60      0.710   0.695
61      0.673   0.732
62      0.696   0.705
63      0.677   0.724
64      0.721   0.684
65      0.703   0.700
66      0.703   0.704
67      0.691   0.713
68      0.695   0.709
69      0.717   0.686
70      0.686   0.718
71      0.686   0.718
72      0.715   0.690
73      0.730   0.673
74      0.704   0.696
75      0.722   0.683
76      0.715   0.690
77      0.676   0.727
78      0.702   0.703
79      0.718   0.686
80      0.728   0.676
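
The summary statistics reported in Tables A.36 to A.41 (bias, relative bias, SD and CV) can be illustrated with a minimal sketch. It assumes the conventional definitions, which this appendix does not restate: bias is the mean estimate minus the true value, relative bias is the bias divided by the true value, and CV is the SD divided by the mean estimate. The function name `summarize_estimates` and the replicate values are hypothetical and are not taken from the simulation study.

```python
from statistics import mean, stdev


def summarize_estimates(estimates, true_value):
    """Summarize replicated estimates of a single parameter.

    Assumed (conventional) definitions, not taken from the thesis:
      bias          = mean(estimates) - true_value
      relative bias = bias / true_value
      SD            = sample standard deviation of the estimates
      CV            = SD / mean(estimates)
    """
    avg = mean(estimates)
    sd = stdev(estimates)  # sample SD, n - 1 in the denominator
    bias = avg - true_value
    # Ratios are undefined when the denominator is zero.
    relative_bias = bias / true_value if true_value != 0 else float("nan")
    cv = sd / avg if avg != 0 else float("nan")
    return {"bias": bias, "relative_bias": relative_bias, "sd": sd, "cv": cv}


# Hypothetical replicate estimates of a parameter whose true value is 0.10.
summary = summarize_estimates([0.08, 0.10, 0.12], true_value=0.10)
print(summary)
```

The guard against a zero denominator is a design choice for this sketch: many true β entries in Table A.36 are 0.000, and a ratio-based statistic is undefined for them under these definitions.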

A.1.7 Scenarios 18 to 20

Table A.42: The number of groups identified for the scenarios with N = 80, M = 42 and K = 3 with penalized perplexity (PP) used as the model selection criterion.

                     K̂ = 3                               K̂ ≠ 3
                     Number of Groups identified          Number of Groups identified
ε       Nesting      0     1     2     3     Total        0     1     2     3     Total
0.003   None         0     0     0     413   413          0     0     1     86    87
0.003   Mild         0     1     6     277   284          0     0     9     207   216
0.003   Moderate     0     0     0     343   343          0     0     11    146   157

0.005   None         0     0     0     413   413          0     0     1     86    87
0.005   Mild         0     0     6     278   284          0     0     9     207   216
0.005   Moderate     0     0     0     343   343          0     0     4     153   157

0.01    None         0     0     0     413   413          0     0     1     86    87
0.01    Mild         0     0     0     284   284          0     0     2     214   216
0.01    Moderate     0     0     0     343   343          0     0     2     155   157

0.03    None         0     0     0     413   413          0     0     0     87    87
0.03    Mild         0     0     0     284   284          0     0     2     214   216
0.03    Moderate     0     0     0     343   343          0     0     2     155   157
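
As a worked reading of Table A.42, the sketch below computes the proportion of runs in which the selected K̂ equals the true K = 3 at each level of nesting. It assumes that the two Total columns count simulation runs with K̂ = 3 and K̂ ≠ 3 respectively; the helper name `selection_rate` is hypothetical.

```python
# Totals copied from Table A.42: (runs with K-hat = 3, runs with K-hat != 3).
# These totals are the same for every value of epsilon in the table.
totals = {
    "None": (413, 87),
    "Mild": (284, 216),
    "Moderate": (343, 157),
}


def selection_rate(correct, incorrect):
    """Proportion of runs in which the selected K-hat equals the true K."""
    return correct / (correct + incorrect)


rates = {nesting: selection_rate(*counts) for nesting, counts in totals.items()}
for nesting, rate in rates.items():
    print(f"{nesting}: {rate:.3f}")
# Prints:
# None: 0.826
# Mild: 0.568
# Moderate: 0.686
```

Under this reading, each nesting level has 500 runs in total, so the rates are simply the K̂ = 3 totals divided by 500.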

[Figure A.15: two rows of four stacked bar plots, one panel each for ε = 0.003, 0.005, 0.01 and 0.03. x-axis: Num. groups identified (1, 2, 3); y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.15: Stacked bar plots for the identified interaction groups in scenario 18. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.16: two rows of four stacked bar plots, one panel each for ε = 0.003, 0.005, 0.01 and 0.03. x-axis: Num. groups identified (1, 2, 3); y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.16: Stacked bar plots for the identified interaction groups in scenario 19. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure A.17: two rows of four stacked bar plots, one panel each for ε = 0.003, 0.005, 0.01 and 0.03. x-axis: Num. groups identified (1, 2, 3); y-axis: frequency; legend: Group 1, Group 2, Group 3.]

Figure A.17: Stacked bar plots for the identified interaction groups in scenario 20. The top plots are for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.43: Top row: The bias and relative bias for β̂ for the scenarios with N = 80, M = 42 and K = 3 using penalized perplexity for model selection. Bottom row: The true β parameter used to generate the data.

Level of Nesting
               None                    Mild                  Moderate
Plant     β̂1      β̂2      β̂3      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3
1      -0.008   0.000   0.000   0.000   0.001  -0.077   0.000   0.004   0.000
2      -0.001   0.000   0.000   0.000   0.001  -0.069   0.000   0.001   0.000
3       0.004  -0.394   0.000   0.000   0.001  -0.074   0.000  -0.003   0.000
4       0.003   0.000   0.000   0.000  -0.164   0.009   0.000   0.007   0.000
5       0.001   0.000   0.000   0.000  -0.134  -0.021  -0.000  -0.012   0.000
6      -0.003   0.000   0.000   0.000  -0.185   0.084   0.016  -0.012  -0.000
7      -0.005   0.000   0.000  -0.287  -0.141   0.023   0.000   0.000   0.000
8       0.010   0.000   0.000  -0.541  -0.139   0.038  -0.014   0.006   0.000
9      -0.006   0.000   0.000  -0.000   0.075   0.000   0.000  -0.002   0.000
10     -0.002   0.000   0.000   0.000   0.022   0.001  -0.002   0.000   0.000
11      0.001   0.000   0.000   0.000   0.066   0.001  -0.005   0.000   0.000
12     -0.001   0.000   0.000   0.000   0.045   0.001   0.005   0.000  -0.030
13      0.001   0.000   0.000   0.000   0.055   0.001   0.013   0.000  -0.028
14      0.000   0.000  -0.003   0.000  -0.020   2.897  -0.001  -0.000   0.006
15      0.000   0.000   0.003   0.000   0.047   0.000   0.000   0.000   0.013
16      0.000   0.000  -0.001   0.001   0.054   0.000   0.000   0.000  -0.008
17      0.000   0.020   0.000   0.000   0.067   0.001   0.000   0.000  -0.017
18      0.000  -0.014   0.000   0.000   0.042   0.001   0.000   0.000   0.008
19     -0.016   0.006   0.000   0.000   0.056   0.000   0.000   0.000  -0.014
20      0.000   0.012   0.000   0.001   0.052   0.000   0.000   0.000  -0.030
21      0.000  -0.016   0.000   0.000   0.072   0.000   0.000   0.000   0.001
22      0.000   0.003   0.000   0.015   0.056   0.001   0.000   0.000   0.001
23     -0.000  -0.001   0.000   0.003   0.064   0.001   0.000   0.000   0.003
24      0.000   0.004   0.000  -0.014   0.000   0.001   0.000  -0.099   0.015
25      0.000   0.008   0.000  -0.006   0.000   0.001   0.000   0.000   0.011
26      0.000  -0.000   0.000  -0.026   0.000   0.000   0.000   0.000   0.022
27      0.000  -0.004   0.000  -0.003   0.000   0.000   0.000   0.000  -0.013
28      0.000   0.001   0.000   0.015   0.000   0.000   0.000   0.000  -0.008
29      0.000   0.001   0.000   0.008   0.000   0.001   0.000   0.000  -0.005
30      0.000  -0.010   0.000   0.001   0.000   0.000   0.000   0.000   0.015
31      0.000  -0.011   0.000   0.018   0.000   0.001   0.000   0.000  -0.006
32      0.000  -0.002   0.000  -0.013   0.000   0.001   0.000   0.000   0.004
33      0.000   0.007   0.000  -0.006   0.000   0.000   0.000   0.000  -0.015
34      0.000   0.002   0.000   0.007   0.000   0.001   0.000   0.000  -0.002
35      0.000   0.004   0.000  -0.008  -0.000   0.000   0.000   0.000  -0.004
36      0.000  -0.006   0.000   0.005   0.000   0.001   0.000   0.000   0.040
37      0.000   0.001  -0.000  -0.009   0.000   0.001   0.000   0.000  -0.015
38      0.000   0.016   0.000   0.004   0.000   0.001   0.000  -0.503   0.027
39      0.000   0.002   0.000   0.019   0.000   0.000   0.000   0.000   0.010
40      0.000  -0.009   0.000   0.024   0.000   0.000   0.000   0.000   0.010
41      0.000   0.020   0.000  -0.014  -0.000   0.001   0.000   0.000  -0.005
42      0.000  -0.005   0.000  -0.013   0.000   0.000   0.000   0.000  -0.009

               None                    Mild                  Moderate
Plant      β1      β2      β3      β1      β2      β3      β1      β2      β3
1       0.085   0.000   0.000   0.000   0.000   0.108   0.000   0.097   0.000
2       0.043   0.000   0.000   0.000   0.000   0.183   0.000   0.067   0.000
3       0.117   0.001   0.000   0.000   0.000   0.125   0.000   0.142   0.000
4       0.082   0.000   0.000   0.000   0.047   0.112   0.000   0.122   0.000
5       0.097   0.000   0.000   0.000   0.040   0.161   0.000   0.103   0.000
6       0.088   0.000   0.000   0.000   0.076   0.091   0.020   0.085   0.000
7       0.045   0.000   0.000   0.003   0.063   0.148   0.000   0.144   0.000
8       0.065   0.000   0.000   0.001   0.041   0.069   0.102   0.085   0.000
9       0.063   0.000   0.000   0.000   0.071   0.000   0.157   0.149   0.000
10      0.039   0.000   0.000   0.000   0.047   0.000   0.072   0.000   0.000
11      0.082   0.000   0.000   0.000   0.053   0.000   0.183   0.000   0.000
12      0.094   0.000   0.000   0.000   0.053   0.000   0.168   0.000   0.023
13      0.097   0.000   0.000   0.000   0.041   0.000   0.078   0.000   0.033
14      0.000   0.000   0.346   0.000   0.027   0.001   0.220   0.000   0.037
15      0.000   0.000   0.269   0.000   0.048   0.000   0.000   0.000   0.026
16      0.000   0.000   0.385   0.000   0.030   0.000   0.000   0.000   0.043
17      0.000   0.024   0.000   0.000   0.040   0.000   0.000   0.000   0.023
18      0.000   0.069   0.000   0.000   0.030   0.000   0.000   0.000   0.025
19      0.002   0.025   0.000   0.000   0.066   0.000   0.000   0.000   0.031
20      0.000   0.025   0.000   0.000   0.045   0.000   0.000   0.000   0.020
21      0.000   0.032   0.000   0.000   0.059   0.000   0.000   0.000   0.017
22      0.000   0.024   0.000   0.026   0.069   0.000   0.000   0.000   0.060
23      0.000   0.044   0.000   0.049   0.053   0.000   0.000   0.000   0.068
24      0.000   0.030   0.000   0.070   0.000   0.000   0.000   0.004   0.030
25      0.000   0.067   0.000   0.043   0.000   0.000   0.000   0.000   0.048
26      0.000   0.036   0.000   0.033   0.000   0.000   0.000   0.000   0.024
27      0.000   0.054   0.000   0.032   0.000   0.000   0.000   0.000   0.036
28      0.000   0.014   0.000   0.053   0.000   0.000   0.000   0.000   0.059
29      0.000   0.037   0.000   0.058   0.000   0.000   0.000   0.000   0.040
30      0.000   0.061   0.000   0.049   0.000   0.000   0.000   0.000   0.046
31      0.000   0.023   0.000   0.051   0.000   0.000   0.000   0.000   0.019
32      0.000   0.050   0.000   0.054   0.000   0.000   0.000   0.000   0.042
33      0.000   0.068   0.000   0.047   0.000   0.000   0.000   0.000   0.020
34      0.000   0.048   0.000   0.043   0.000   0.000   0.000   0.000   0.043
35      0.000   0.053   0.000   0.030   0.000   0.000   0.000   0.000   0.019
36      0.000   0.008   0.000   0.075   0.000   0.000   0.000   0.000   0.022
37      0.000   0.049   0.000   0.064   0.000   0.000   0.000   0.000   0.016
38      0.000   0.034   0.000   0.068   0.000   0.000   0.000   0.001   0.028
39      0.000   0.021   0.000   0.059   0.000   0.000   0.000   0.000   0.035
40      0.000   0.032   0.000   0.021   0.000   0.000   0.000   0.000   0.022
41      0.000   0.028   0.000   0.029   0.000   0.000   0.000   0.000   0.016
42      0.000   0.045   0.000   0.043   0.000   0.000   0.000   0.000   0.027

Table A.44: The average relative bias for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.

Level of Nesting
               None                    Mild                  Moderate
Visitor   θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1       0.122  -0.119  -0.124  -0.075  -0.036   0.891  -0.001  -0.349   0.617
2      -0.098   0.120   0.066   0.149  -0.854  18.875  -0.139  -0.230   0.290
3      -0.032   0.424  -0.180  -0.068   1.432   0.268  -0.069  -0.093  -0.279
4      -0.207   0.186  -0.387  -0.015   1.559   0.760  -0.244   0.394   0.106
5      -0.141  -0.039  -0.133  -0.156  -0.530   0.693  -0.283  -0.308  -0.051
6      -0.103  -0.162   0.211   0.680  -0.052  -0.291   0.058   0.008  -0.056
7      -0.277  -0.090   0.077  -0.252  -0.396  -0.482  -0.217  -0.406   0.129
8       0.437   0.181  -0.060  -0.366  -0.220   9.268  -0.066  -0.723  -0.194
9      -0.198  -0.060  -0.191  -0.456  -0.107   6.406  -0.351  -0.070  -0.104
10     -0.191   0.064  -0.217  -0.354  -0.667   0.215   0.088   0.299   0.080
11      0.125  -0.299  -0.179   1.595  -0.675  -0.189   0.133  -0.055   2.120
12     -0.186  -0.535   0.821   0.063   0.162  10.823  -0.175  -0.420  -0.230
13     -1.121   0.574  -0.051  -0.353  -0.303  -0.342  -0.681  -0.237  -0.146
14     -0.391   1.115   4.260  -0.584  -0.208   4.158   0.520   0.002   0.207
15     -0.222  -0.103  -0.413  -0.372  -0.733  -0.555  -0.142   1.855   0.004
16      0.163  -0.481  -0.252   1.682  -0.599  -0.059  -0.394  -0.534  -0.080
17      0.833  -0.160  -0.425  -0.585   3.087   0.844  -0.303   0.105   1.936
18      0.663  -0.195   0.107  -0.419  -0.403  -0.220  -0.273  -0.651  -0.385
19     -0.101  -0.175   0.509  -0.533  -0.092  -0.653  -0.392  -0.696  -0.328
20     -0.415  -0.008  -0.165  -0.810  -1.231   0.019  -1.324  -0.253   0.218
21     -0.311  -0.472  -0.153  -0.369  -1.124  -0.001  -0.647  -0.216  -0.273
22     -0.916   0.146   0.052   0.020 131.049  -0.069  -0.302  -0.533  -0.596
23     -0.542  -0.379  -0.674  -1.104  -0.793  -0.332   0.098  -0.112  -0.275
24     -0.435  -0.142  -0.300  -0.167  -1.173  -0.080  -0.603  -0.363  -0.632
25     -0.367  -0.523  -0.341  -0.176  -1.769  -0.629  -0.524  -0.488  -0.706
26     -0.558  -0.195  -0.174  -0.240  -0.259   9.282  -0.497  -0.141  -0.764
27     -0.042   0.643  -0.730  -0.350  -0.822   1.655   0.946   4.933  -1.034
28     -0.523  -0.499   0.188  -0.544  79.891  -0.111  -0.341  -0.120  -0.763
29      0.687  -0.138  -1.115  -0.352  -1.318  32.989  -0.441  -0.201  -0.243
30     -0.582  -0.302  -0.572   0.820  -0.244   1.139   5.141   0.047  -0.246
31     -0.447  -0.452  -0.591  -0.308   0.351  -1.105  -0.022  -0.615  -0.559
32     -0.939   0.415  -0.157   7.556  10.340  17.755  -0.630  -0.710   1.733
33     -0.285  -0.232  -1.415  -0.453  -0.041   0.497  -0.613  -0.196  -0.554
34     -0.875  -0.334   0.029  -0.440  -1.263  -0.534  -1.124  -0.053   0.273
35      3.575   0.437  -0.259   0.294   0.588   0.458   3.270   0.030  -0.771
36     -0.686   0.052  -0.268  -0.100  -1.148  -1.103  -0.508  -0.723  -0.104
37     -0.884  -1.601  -0.142  -0.678  -1.415  -0.429  -0.139  -0.058  -0.487
38     -0.248   0.108  -0.494   2.695  -0.744  -2.056  -0.330  -0.132  -2.314
39     -0.735   2.839   1.712  -0.229  -0.871   1.220  -0.062  -1.333  -0.765
40     -0.367  -0.965   0.050  27.596  13.257   4.235  -0.999  -1.478  -0.008
41     -0.824  -0.278  -0.713  -0.317  -1.586   1.153  -1.537  -0.631   0.338
42     -1.175  -0.004  -0.591  -0.712  12.068  77.927  -0.260  -1.339   1.936
43     -0.325   1.685  -0.576   0.302   0.423   0.190  -0.752   0.487  -0.717
44      0.810  -0.912  -0.516  -0.780  -0.551   5.242  -0.842   0.428  -1.042
45     -0.529   0.276   0.237  -0.195  -0.708  -1.795  -0.952   6.152   3.131
46     -0.723  -0.183  -0.024   9.515  -1.035  -1.359  -0.005   2.094  -1.236
47     -0.991  -0.607  -0.382  -0.453 107.425   3.870  -0.770  -0.727  -0.756
48     -0.286   7.222  -0.588  -0.950   0.455  -0.550   2.006  -1.286   8.575
49     -0.636  -0.781  11.194  -1.016   1.890  32.908  -1.155  -0.621  -0.382
50     -0.333  34.969  -0.766  -0.756  20.093   2.584  -0.966  -0.288  -0.250
51     -0.686   3.622  12.667   6.733  -0.922  -1.476  -0.898  -1.502  -0.304
52     -0.922  -1.543  -0.103  -0.332  -1.946  -0.759  -0.568  -0.838  -0.935
53     -1.120   0.292   1.069  -0.478   3.556  -1.901  -1.875   0.526  -0.296
54     -1.220  -0.434  -0.664  -0.583  -3.043   6.299  -2.100   0.436   0.411
55     -1.018  -0.330   0.022  -0.020  -1.349  -0.149  -0.881  -1.091  -0.512
56     -0.883  -0.842   1.048  -0.699  -0.235  -2.299  -0.523   2.144  -1.111
57     -0.590  -0.047  -0.087  -0.939  -3.514   6.769   6.802  -0.243   0.297
58     -0.418  -0.953  -0.920  -0.503  -0.376  32.079  -1.944   5.423   0.392
59     -0.409  -0.707  -0.891  -0.770  -1.744  -0.660  -2.381  -0.594   0.638
60     -0.741   0.182  -0.902  -1.195  -2.571  27.226  -2.558   0.900  -0.616
61     -0.579   0.002  -1.072  -0.661   0.782 125.327 136.896  -0.161  -0.382
62     -1.936   0.235  -0.842  -1.490  -1.405   0.486   5.196   4.384   1.185
63     14.908  -0.835  -0.125   0.104  -2.564  -0.487  -0.732  -0.526  -1.038
64     -0.902   0.030  -0.656  -0.324   0.048  -2.557  99.959   0.305  -0.265
65     -0.971  -1.033  -0.003  -2.064   6.177   4.615   0.106  -0.598  -0.898
66     -0.353  -0.956  -0.775  -0.444   1.392  -3.004  -2.265   6.102   0.376
67      5.082   4.845   0.184  -2.380   6.868  -0.419  -0.489 152.560  -0.747
68     -0.183  -0.565   2.393  -0.280  -1.089  86.867  -1.204  -1.203  -0.837
69      2.010  -0.655  -0.085  -3.780  -1.545   2.386  -0.844  -2.353  -0.972
70     -0.737  -0.806  -0.687  -1.784  -0.986  41.040  -2.020  -0.385   0.997
71     -1.049  -1.064  -0.495  -1.000  22.431   4.933  -0.988  -2.736  -0.348
72     -0.399  -2.137  -0.626  -1.514  -1.769  -0.603  -1.579  -0.772  -0.720
73      0.443  -0.318  -1.355  -0.481  -3.017  74.691  -1.312  -0.488  19.172
74     -0.674  -1.180  -0.454  -1.514  -0.617  -0.668  -0.149 137.557   2.854
75      6.726  -0.367  -1.043  -0.361  -1.901  94.507  -1.648  -0.516  -1.612
76     -1.489  -0.008   5.936  -0.763  -2.654  -1.416  -1.353  -0.599  -0.688
77     -0.377  -0.521  -1.089  -0.927  -2.893   3.169  -1.109  25.020  -0.861
78      0.156  -0.171  -0.685  -0.584  -1.958   0.762   3.649  -1.136  -1.120
79     -0.212  -1.252  -0.792  -0.778  19.047  -3.823  -3.898  -0.890   7.325
80     -0.405  -1.023  -0.830  -0.401  -0.673  -2.781  -1.330  -0.293  -1.186

Table A.45: The average bias for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.

Level of Nesting
               None                    Mild                  Moderate
Visitor   θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1       0.000   0.000   0.000   0.001   0.003   0.070   0.003   0.001   0.000
2       0.000   0.000  -0.000   0.000   0.006   0.034   0.001   0.000   0.000
3       0.000   0.000   0.000   0.000   0.006   0.024   0.001   0.000   0.000
4       0.000   0.000   0.000   0.000   0.000   0.027   0.002   0.000   0.000
5       0.000   0.000  -0.000   0.000   0.000   0.020   0.001   0.003   0.000
6       0.000   0.000   0.000   0.000   0.004   0.024   0.002   0.004   0.000
7       0.000   0.000   0.000   0.000   0.000   0.017   0.001   0.000   0.001
8       0.000   0.000   0.000   0.000   0.013   0.025   0.001   0.004   0.000
9       0.000   0.000   0.000   0.001   0.009   0.021   0.002   0.000   0.000
10      0.000   0.000   0.000   0.000   0.012   0.018   0.001   0.002   0.000
11      0.000   0.000   0.000   0.000   0.006   0.022   0.000   0.000   0.000
12      0.000   0.000   0.000   0.001   0.006   0.016   0.001   0.004   0.001
13      0.000   0.000   0.000   0.000   0.012   0.028   0.002   0.005   0.000
14      0.000   0.000   0.000   0.002   0.006   0.025   0.001   0.000   0.001
15      0.000   0.000   0.000   0.002   0.014   0.034   0.000   0.001   0.000
16      0.000   0.000   0.000   0.000   0.000   0.028   0.000   0.005   0.001
17      0.001   0.000   0.000   0.001   0.006   0.030   0.001   0.000   0.000
18      0.001   0.000   0.000   0.000   0.013   0.030   0.002   0.000   0.001
19      0.000   0.000   0.000   0.001   0.016   0.027   0.000   0.005   0.001
20      0.000   0.000   0.000   0.000   0.002   0.030   0.002   0.001   0.001
21      0.001   0.000   0.000   0.002   0.006   0.026   0.002   0.001   0.000
22      0.000   0.000   0.000   0.001   0.010   0.015   0.001   0.001   0.001
23      0.001   0.000   0.000   0.000   0.011   0.026   0.001   0.000   0.000
24      0.001   0.000   0.000   0.000   0.000   0.024   0.001   0.000   0.002
25      0.000   0.000   0.000   0.000   0.007   0.026   0.001   0.000   0.000
26      0.000   0.000   0.000   0.002   0.000   0.028   0.004   0.006   0.000
27      0.000   0.000   0.000   0.000   0.006   0.029   0.002   0.000   0.001
28      0.001   0.000   0.000   0.002   0.006   0.021   0.001   0.004   0.000
29      0.000   0.001   0.000   0.000   0.012   0.035   0.001   0.000   0.000
30      0.000   0.001   0.000   0.000   0.006   0.024   0.002   0.005   0.001
31      0.001   0.000   0.000   0.001   0.006   0.024   0.001   0.001   0.000
32      0.001   0.000   0.000   0.000   0.015   0.024   0.002   0.000   0.000
33      0.000   0.001   0.000   0.000   0.001   0.033   0.001   0.001   0.000
34      0.001   0.001   0.000   0.001   0.017   0.033   0.003   0.004   0.001
35      0.002   0.001   0.000   0.000   0.000   0.024   0.003   0.002   0.001
36      0.001   0.000   0.000   0.002   0.013   0.028   0.002   0.002   0.002
37      0.000   0.000   0.000   0.000   0.007   0.034   0.000   0.000   0.001
38      0.000   0.000   0.000   0.000   0.006   0.037   0.001   0.002   0.001
39      0.001   0.001   0.000   0.001   0.002   0.024   0.003   0.001   0.000
40      0.002   0.001   0.000   0.000   0.000   0.035   0.004   0.004   0.001
41      0.001   0.001   0.000   0.000   0.004   0.028   0.004   0.005   0.001
42      0.000   0.001   0.000   0.000   0.018   0.022   0.005   0.002   0.000
43      0.001   0.000   0.000   0.000   0.009   0.028   0.001   0.000   0.002
44      0.002   0.000   0.000   0.000   0.008   0.035   0.002   0.005   0.000
45      0.000   0.001   0.000   0.003   0.006   0.020   0.009   0.004   0.005
46      0.001   0.000   0.000   0.002   0.018   0.024   0.004   0.004   0.004
47      0.001   0.000   0.000   0.002   0.009   0.031   0.011   0.000   0.003
48      0.001   0.002   0.000   0.000   0.021   0.025   0.004   0.005   0.001
49      0.000   0.002   0.000   0.002   0.031   0.024   0.005   0.009   0.002
50      0.001   0.000   0.000   0.004   0.017   0.038   0.008   0.006   0.003
51      0.000   0.000   0.000   0.000   0.017   0.021   0.006   0.009   0.008
52      0.001   0.000   0.000   0.002   0.034   0.055   0.018   0.004   0.000
53      0.000   0.000   0.000   0.000   0.013   0.025   0.008   0.006   0.005
54      0.002   0.002   0.000   0.000   0.006   0.023   0.016   0.013   0.000
55      0.000   0.000   0.000   0.000   0.035   0.018   0.016   0.006   0.005
56      0.000   0.002   0.000   0.002   0.015   0.018   0.013   0.005   0.000
57      0.000   0.000   0.000   0.000   0.019   0.042   0.018   0.003   0.000
58      0.002   0.002   0.000   0.002   0.022   0.028   0.012   0.007   0.003
59      0.000   0.002   0.000   0.006   0.016   0.060   0.029   0.011   0.005
60      0.002   0.000   0.000   0.008   0.026   0.021   0.010   0.022   0.005
61      0.000   0.000   0.000   0.002   0.037   0.046   0.013   0.000   0.007
62      0.000   0.000   0.000   0.000   0.030   0.039   0.017   0.026   0.000
63      0.000   0.002   0.000   0.003   0.040   0.028   0.019   0.004   0.006
64      0.000   0.000   0.000   0.006   0.047   0.040   0.015   0.005   0.011
65      0.001   0.002   0.001   0.001   0.020   0.048   0.014   0.014   0.014
66      0.002   0.001   0.001   0.012   0.033   0.076   0.009   0.018   0.011
67      0.001   0.001   0.001   0.006   0.034   0.042   0.003   0.022   0.003
68      0.001   0.001   0.001   0.006   0.021   0.023   0.030   0.017   0.014
69      0.001   0.003   0.001   0.012   0.054   0.046   0.022   0.010   0.005
70      0.001   0.001   0.001   0.001   0.027   0.057   0.024   0.033   0.010
71      0.001   0.001   0.001   0.003   0.023   0.047   0.039   0.010   0.013
72      0.001   0.001   0.001   0.006   0.009   0.083   0.012   0.040   0.003
73      0.001   0.001   0.001   0.001   0.012   0.065   0.033   0.027   0.005
74      0.001   0.001   0.001   0.012   0.032   0.059   0.045   0.025   0.015
75      0.001   0.001   0.001   0.001   0.023   0.057   0.033   0.013   0.005
76      0.001   0.001   0.001   0.012   0.040   0.031   0.033   0.019   0.001
77      0.001   0.002   0.001   0.003   0.036   0.071   0.031   0.030   0.015
78      0.003   0.001   0.001   0.007   0.032   0.043   0.032   0.014   0.013
79      0.003   0.001   0.001   0.001   0.045   0.051   0.027   0.018   0.015
80      0.001   0.001   0.001   0.003   0.047   0.044   0.024   0.017   0.001

Table A.46: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.

Level of Nesting
               None                    Mild                  Moderate
Plant     β̂1      β̂2      β̂3      β̂1      β̂2      β̂3      β̂1      β̂2      β̂3
1       0.991   0.000   0.000   0.000   0.006   0.946   0.000   1.004   0.000
2       1.003   0.000   0.000   0.000   0.011   0.950   0.000   1.003   0.000
3       1.001   1.075   0.000   0.000   0.008   0.945   0.000   0.996   0.000
4       1.002   0.000   0.000   0.000   0.945   1.016   0.000   1.006   0.000
5       0.999   0.000   0.000   0.000   1.021   0.984   0.000   0.987   0.000
6       0.995   0.000   0.000   0.000   0.902   1.119   1.091   0.989   0.000
7       0.998   0.000   0.000   1.112   0.965   1.030   0.000   0.998   0.000
8       1.010   0.000   0.000   1.123   0.966   1.057   0.991   1.010   0.000
9       0.994   0.000   0.000   0.000   1.101   0.000   1.001   0.998   0.000
10      1.003   0.000   0.000   0.000   1.052   0.007   0.999   0.000   0.000
11      1.000   0.000   0.000   0.000   1.097   0.007   0.992   0.000   0.000
12      0.997   0.000   0.000   0.000   1.081   0.010   1.001   0.000   1.026
13      0.998   0.000   0.000   0.000   1.091   0.005   1.017   0.001   1.002
14      0.000   0.000   0.992   0.000   1.039  14.370   0.996   0.000   1.050
15      0.000   0.000   0.998   0.000   1.076   0.005   0.000   0.000   1.032
16      0.000   0.000   0.993   0.008   1.098   0.003   0.000   0.002   1.002
17      0.000   1.036   0.000   0.000   1.100   0.006   0.000   0.001   1.007
18      0.000   0.987   0.000   0.000   1.081   0.006   0.000   0.001   1.029
19      1.873   1.022   0.000   0.000   1.079   0.005   0.000   0.001   1.002
20      0.000   1.024   0.000   0.010   1.074   0.004   0.000   0.001   0.996
21      0.000   0.997   0.000   0.000   1.099   0.000   0.000   0.000   1.033
22      0.000   1.014   0.000   1.198   1.080   0.008   0.000   0.003   1.006
23      0.000   1.004   0.000   1.031   1.093   0.011   0.000   0.003   1.007
24      0.000   1.016   0.000   0.992   0.000   0.012   0.000   1.173   1.034
25      0.000   1.007   0.000   1.006   0.000   0.012   0.000   0.000   1.024
26      0.000   1.008   0.000   0.992   0.000   0.006   0.000   0.000   1.048
27      0.000   1.000   0.000   1.014   0.000   0.006   0.000   0.002   0.999
28      0.000   1.030   0.000   1.021   0.000   0.000   0.000   0.002   0.997
29      0.000   1.008   0.000   1.016   0.000   0.010   0.000   0.000   1.010
30      0.000   0.992   0.000   1.008   0.000   0.000   0.000   0.000   1.029
31      0.000   1.008   0.000   1.028   0.000   0.008   0.000   0.001   1.020
32      0.000   1.002   0.000   0.995   0.000   0.008   0.000   0.002   1.013
33      0.000   1.008   0.000   1.003   0.000   0.007   0.000   0.001   1.013
34      0.000   1.006   0.000   1.017   0.000   0.010   0.000   0.002   1.006
35      0.000   1.007   0.000   1.009   0.000   0.004   0.000   0.001   1.027
36      0.000   1.045   0.000   1.010   0.000   0.012   0.000   0.001   1.067
37      0.000   1.004   0.000   0.999   0.000   0.014   0.002   0.001   1.027
38      0.000   1.025   0.000   1.009   0.000   0.012   0.000   1.344   1.050
39      0.000   1.019   0.000   1.024   0.000   0.000   0.000   0.002   1.023
40      0.000   1.001   0.000   1.047   0.000   0.004   0.000   0.000   1.033
41      0.000   1.034   0.000   1.006   0.000   0.007   0.000   0.001   1.030
42      0.000   1.000   0.000   0.996   0.000   0.007   0.000   0.001   1.008

Table A.47: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 3 using the penalized perplexity for model selection.

Level of Nesting
               None                    Mild                  Moderate
Visitor   θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3      θ̂1      θ̂2      θ̂3
1       8.266  12.432  19.414  20.081  25.275   5.667  14.034  18.180  20.961
2      13.377  19.712  28.663   3.665  17.694  27.047  23.194  28.756  21.342
3      14.658   9.807  13.098   6.827  17.757  31.795  19.867   9.038  18.572
4      18.218  10.957  23.278  14.233  12.244  12.781   8.536  12.262   7.997
5      12.166   9.669  13.142   8.596  11.599  23.189  18.325  10.107  11.264
6      24.715  19.951  16.358  32.570  16.062   8.406   7.852  31.329   5.335
7       8.638  11.669   9.604   5.189  16.637  17.853   9.023  21.321  10.127
8      21.232  16.398  17.412  31.678  10.881  21.690  12.603  25.085  23.461
9      16.648  25.237  20.047  17.756  14.200  10.283  23.182  11.281   5.847
10     13.125  31.549  11.704  15.957  19.526   9.885  13.101  17.213  15.691
11     13.065   8.627  11.078  13.630  18.948  20.734  22.705   3.255  17.527
12     17.024  14.365   8.268  20.679  11.529  21.857   8.539  20.601   6.067
13     26.030  22.659   8.401  28.131  17.441  17.671  31.311  12.954  12.044
14     16.024  17.756  13.880  23.017   7.410  19.639  19.385  31.037  18.634
15     11.002  12.654  13.001  29.535   8.554  13.675  11.066  18.700  24.215
16     16.605  18.674  15.118  16.647  12.432  10.292  24.939  17.506  21.828
17     12.930  14.467  20.783   4.619  17.332   4.950  13.340  17.092  23.519
18     10.930   2.628  14.728  20.820   5.995  32.987  20.687  17.452   8.344
19     20.480  15.792  23.855  12.035  16.226  12.608   9.768  22.943  16.142
20      9.853   3.268   8.949  14.621  28.368   5.165  33.800   9.892  17.008
21     21.048  15.200  13.886   8.094  14.741  12.173  15.307  16.343   5.798
22     25.724  19.170  12.544  14.172  16.364   3.665  15.764  17.395  12.409
23     22.284  19.850  21.145  34.824  15.897  13.862   9.058   6.037  18.321
24     12.378  12.855   9.928  10.642  14.987  28.483  23.640  15.910  28.820
25     22.733  10.153  13.920  11.608   7.880  12.625  30.265  27.306  24.927
26     24.270  18.252  14.655  19.342  22.399  23.700  16.620   4.140  11.348
27      9.909  22.482  13.356  26.665  10.886  28.064  18.422  18.910  12.365
28     20.428   7.463  20.752  18.118  17.679   7.016  13.157   7.986  15.098
29     21.994  10.294  14.070  22.483   7.254  14.778  21.827   6.715   9.018
30      8.914  13.611  14.010   1.161  21.044   8.616  17.218  13.414  16.557
31     20.315  22.212  16.532  13.330   6.368  20.205  22.659  22.088  21.341
32     17.964  34.555  10.896   8.327  16.608  26.508   8.466  16.170  18.520
33     17.710  10.313  20.289  11.323   6.588  34.938   9.165  20.689   8.205
34     15.345   9.053   6.549  19.117  14.149  10.047  22.322  15.389   7.091
35     21.749  10.635  22.184  15.974   2.524  14.826  15.902  10.821  20.290
36     16.361  16.676  16.534  11.323  20.952  11.135   9.058  11.962  14.669
37     23.535  17.108  13.103  18.523  20.409  20.214  11.486   2.375  10.514
38     11.297  17.236  18.991   5.041  12.822  34.381  17.590  10.729  36.615
39     19.470  18.687  16.411   6.732  20.545  18.661  12.623  27.846  17.861
40     17.177  16.090  10.115  10.600  12.505  23.823  10.539  25.198   7.589
41     25.517  12.925  18.112  12.742  12.412  18.916  21.238  11.088   5.367
42     22.297   8.725  16.930  29.599  24.971  40.931   9.364  10.540  18.519
43     13.659  18.779   6.757   2.877   6.057  15.634   4.321  10.403  13.573
44     12.966  18.872  19.468  10.348  23.888  13.319  27.186  13.121  18.709
45     11.804  17.722  10.702  17.515  18.379  18.127  13.125  14.156  17.033
46     21.554  18.028  20.272  32.446  19.631  15.884  12.887  21.671  20.977
47     20.142  13.082   5.934  17.838  22.138  22.891  20.535   6.617  15.222
48     13.063  14.577  13.859   9.460  19.317  10.057  13.979  10.774  23.765
49     13.719  11.683  24.047  10.021  16.424  23.404  25.808   8.617  16.305
50     16.048  23.595  25.166  16.411  12.592  12.615  14.689  16.345  36.133
51     19.689   7.367  13.192  13.100  17.577  18.381  29.414  25.417   2.816
52     23.165  25.716  24.333  17.879  11.516  23.041   7.963  39.147  19.676
53     11.493  10.197  28.130  27.237  24.775  15.716  12.442  18.531  28.878
54     20.146  16.246  23.657  33.654  14.891  15.822   2.297  23.005  18.233
55     17.730   8.342  19.164  15.434  23.221  15.658  16.451  14.353  11.898
56     13.575  18.696  21.923  12.204  14.075   9.772  21.935  15.396  13.605
57     19.783  13.318  28.264  29.890  25.099  25.445  24.117   6.909  15.431
58     18.255   8.127  13.025  21.782   4.643  19.483  12.995  14.105  27.144
59     16.785  27.137  19.069  20.127  23.510  17.321  29.715  20.258  24.774
60     16.202  14.315  23.572   3.023  10.573  26.282  20.995   5.424  15.489
61     21.356  12.525  17.291  16.009  13.274  25.038  23.417  14.903  15.141
62     30.372  16.956  15.974  23.109   7.133  17.192  21.032   8.108  11.265
63     21.030  11.839   5.841  19.086  46.609  19.202  15.652  18.394  14.055
64      1.886  19.561  11.417  17.106  19.215  10.833  20.136  18.723  19.636
65      9.773  18.138   6.460  18.193  14.653  13.047  15.052   8.531  18.481
66     16.652  27.929  13.976  34.078  12.960  20.616  18.312   8.528   7.132
67     10.531  14.510  10.443  20.584  15.645   4.123  13.641  28.772  28.208
68     15.877  14.588  17.595  10.434   8.381  20.361  20.590  21.457  16.838
69     31.469  10.397  14.292  13.863  37.112   9.109  12.071  13.131  19.160
70     24.654  13.091  24.965  20.505  21.315  22.978  11.558  25.561  19.535
71     23.129  20.786  14.667  12.688   5.491  19.591  12.556   2.680  25.471
72     12.368  21.961  23.752  10.328  16.440  18.227  24.597  10.982  27.008
73     17.247  14.879  20.009  13.136  22.246  19.959  11.418  15.059  21.208
74      9.580   9.629  24.558  18.557  13.742  22.818   8.573  26.097   8.059
75     18.468  14.134   9.227  15.749   6.362  24.544  14.620  14.450  13.109
76     19.858   9.274  29.433  17.475  25.671  14.678  33.641   9.629  12.049
77     13.119  15.086  25.675  20.698  16.287  26.250  22.156   8.732  22.261
78     13.385  15.003  19.515  12.271  11.877   6.395  39.693  14.683  15.817
79      3.434  21.849  18.199  27.988   8.816  16.795  17.624  18.548  25.984
80     24.742  17.700   9.280   7.627  23.166  16.930  13.422   8.203  16.898

Table A.48: The average standard deviation (SD) of θ̂ for the scenarios with N = 80, M = 42 and K = 3, using penalized perplexity for model selection.
Level of nesting (three columns each, θ̂1 to θ̂3): None, Mild, Moderate

ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ Visitor θ1 θ2 θ3 θ1 θ2 θ3 θ1 θ2 θ3 1 0.607 0.540 0.563 0.559 0.437 0.654 0.643 0.573 0.483 2 0.564 0.585 0.570 0.580 0.505 0.608 0.597 0.563 0.555 3 0.585 0.598 0.531 0.571 0.535 0.594 0.590 0.546 0.577 4 0.589 0.560 0.566 0.545 0.550 0.590 0.556 0.565 0.586 5 0.562 0.589 0.567 0.589 0.559 0.552 0.607 0.552 0.559 6 0.551 0.563 0.599 0.556 0.573 0.566 0.615 0.582 0.505 7 0.584 0.572 0.557 0.594 0.569 0.539 0.546 0.573 0.594 8 0.568 0.540 0.602 0.573 0.570 0.547 0.584 0.546 0.582 9 0.546 0.597 0.571 0.602 0.559 0.536 0.581 0.565 0.571 10 0.602 0.533 0.577 0.532 0.579 0.587 0.593 0.577 0.538 11 0.569 0.596 0.546 0.534 0.570 0.584 0.588 0.575 0.553 12 0.576 0.583 0.560 0.535 0.539 0.620 0.539 0.553 0.616 13 0.512 0.605 0.599 0.547 0.581 0.572 0.596 0.573 0.544 14 0.569 0.567 0.580 0.585 0.578 0.524 0.591 0.560 0.567 15 0.586 0.571 0.564 0.518 0.602 0.567 0.611 0.542 0.555 16 0.548 0.572 0.594 0.563 0.568 0.558 0.549 0.577 0.591 17 0.571 0.580 0.564 0.564 0.559 0.565 0.569 0.611 0.534 18 0.553 0.589 0.575 0.578 0.578 0.532 0.549 0.578 0.590 19 0.594 0.557 0.563 0.593 0.566 0.539 0.598 0.546 0.571 20 0.593 0.556 0.556 0.579 0.552 0.564 0.550 0.544 0.621 21 0.558 0.564 0.594 0.578 0.568 0.548 0.583 0.541 0.592 22 0.554 0.574 0.585 0.566 0.589 0.548 0.572 0.613 0.533 23 0.542 0.595 0.580 0.543 0.546 0.611 0.527 0.580 0.604 24 0.571 0.543 0.603 0.599 0.553 0.545 0.582 0.555 0.582 25 0.562 0.560 0.591 0.608 0.570 0.522 0.580 0.561 0.577 26 0.567 0.573 0.568 0.566 0.568 0.565 0.568 0.566 0.578 27 0.559 0.566 0.588 0.562 0.570 0.563 0.559 0.609 0.546 28 0.568 0.564 0.586 0.590 0.522 0.587 0.569 0.593 0.551 29 0.560 0.588 0.572 0.527 0.550 0.620 0.569 0.563 0.588 30 0.570 0.570 0.573 0.605 0.521 0.574 0.562 0.556 0.590 31 0.553 0.598 0.568 0.567 0.536 0.598 0.587 0.567 0.565 32 0.525 0.597 0.592 0.558 0.596 0.543 0.587 0.580 0.546 33 0.531 0.609 0.572 0.561 0.544 0.594 0.598 0.509 0.606 34 0.547 0.570 0.599 0.567 0.545 0.585 0.536 0.577 0.599 35 0.557 0.584 
0.574 0.559 0.543 0.593 0.567 0.620 0.522 36 0.596 0.550 0.570 0.573 0.568 0.547 0.577 0.571 0.568 37 0.544 0.572 0.604 0.537 0.557 0.607 0.600 0.562 0.548 38 0.556 0.571 0.588 0.578 0.580 0.538 0.606 0.553 0.557 39 0.603 0.563 0.550 0.587 0.589 0.521 0.558 0.547 0.611 40 0.609 0.547 0.550 0.526 0.520 0.648 0.555 0.565 0.596 41 0.608 0.523 0.581 0.551 0.555 0.602 0.554 0.576 0.586 42 0.563 0.592 0.564 0.602 0.557 0.532 0.583 0.575 0.559 43 0.585 0.545 0.588 0.552 0.570 0.577 0.562 0.606 0.550 44 0.589 0.550 0.574 0.534 0.579 0.590 0.560 0.572 0.586 45 0.590 0.530 0.595 0.636 0.510 0.549 0.575 0.586 0.547 46 0.528 0.591 0.588 0.591 0.575 0.542 0.557 0.592 0.563 47 0.559 0.582 0.578 0.573 0.555 0.579 0.552 0.587 0.577 48 0.579 0.577 0.558 0.560 0.587 0.559 0.579 0.581 0.560 49 0.600 0.550 0.572 0.526 0.595 0.583 0.560 0.571 0.577 50 0.582 0.543 0.590 0.549 0.558 0.586 0.576 0.614 0.522 51 0.562 0.585 0.569 0.625 0.527 0.549 0.570 0.585 0.559 52 0.570 0.582 0.568 0.542 0.567 0.589 0.613 0.549 0.549 53 0.576 0.591 0.547 0.584 0.587 0.536 0.553 0.573 0.590 54 0.552 0.602 0.566 0.596 0.543 0.575 0.598 0.593 0.523 55 0.569 0.603 0.550 0.585 0.585 0.541 0.611 0.519 0.579 56 0.583 0.556 0.579 0.590 0.562 0.557 0.548 0.573 0.587 57 0.565 0.574 0.576 0.542 0.557 0.602 0.610 0.543 0.559 58 0.565 0.585 0.568 0.532 0.607 0.563 0.547 0.626 0.537 59 0.583 0.567 0.566 0.575 0.576 0.552 0.564 0.575 0.579 60 0.568 0.566 0.580 0.573 0.558 0.578 0.552 0.582 0.586 61 0.557 0.587 0.576 0.571 0.574 0.565 0.539 0.602 0.578 62 0.547 0.592 0.580 0.538 0.602 0.565 0.568 0.584 0.560 63 0.568 0.557 0.592 0.573 0.549 0.594 0.560 0.597 0.556 64 0.562 0.592 0.558 0.603 0.536 0.574 0.581 0.546 0.590 65 0.573 0.546 0.599 0.566 0.538 0.599 0.556 0.599 0.562 66 0.562 0.604 0.551 0.555 0.574 0.571 0.536 0.568 0.610 67 0.543 0.559 0.613 0.514 0.615 0.574 0.570 0.574 0.573 68 0.592 0.582 0.541 0.572 0.588 0.555 0.591 0.566 0.558 69 0.590 0.576 0.546 0.578 0.555 0.576 0.551 0.589 0.578 70 0.571 0.583 
0.567 0.567 0.525 0.620 0.560 0.564 0.597 71 0.562 0.591 0.565 0.584 0.585 0.539 0.554 0.554 0.609 72 0.558 0.560 0.600 0.524 0.565 0.622 0.560 0.597 0.562 73 0.571 0.583 0.564 0.588 0.548 0.570 0.589 0.551 0.582 74 0.582 0.580 0.558 0.576 0.541 0.597 0.562 0.578 0.579 75 0.542 0.579 0.598 0.557 0.574 0.582 0.537 0.608 0.578 76 0.578 0.558 0.583 0.562 0.552 0.599 0.526 0.566 0.620 77 0.594 0.538 0.584 0.559 0.570 0.585 0.553 0.591 0.575 78 0.601 0.541 0.568 0.619 0.509 0.574 0.581 0.548 0.587 79 0.609 0.549 0.560 0.575 0.569 0.566 0.565 0.552 0.599 80 0.587 0.566 0.567 0.579 0.544 0.586 0.573 0.587 0.560 137

A.1.8 Scenarios 21 to 24

Table A.49: The number of groups identified for the scenarios with N = 80, M = 42 and K = 4, with penalized perplexity (PP) used as the model selection criterion.

                         K̂ = 4                               K̂ ≠ 4
                 Number of groups identified         Number of groups identified
ε      Nesting     0    1    2    3    4  Total        0    1    2    3    4  Total
0.003  Mild        0    0    0   16  246    262        0    0    0   40  198    238
0.003  Moderate    0    0    3  132   17    152        0    0    7  274   67    348
0.003  Extreme     0    1    1   90  189    281        0    0   13  116   90    219
0.003  Full        0    0   18  126    4    148        0    0  113  231    8    352

0.005  Mild        0    0    0   16  246    262        0    0    0   39  199    238
0.005  Moderate    0    0    1   99   52    152        0    0    1  217  130    348
0.005  Extreme     0    0    0   26  255    281        0    0    5   71  143    219
0.005  Full        0    0    7   70   71    148        0    0   50  241   61    352

0.01   Mild        0    0    0    4  258    262        0    0    0   30  208    238
0.01   Moderate    0    0    0   44  108    152        0    0    0  156  192    348
0.01   Extreme     0    0    0   12  269    281        0    0    3   55  161    219
0.01   Full        0    0    0   39  109    148        0    0   42  218   92    352

0.03   Mild        0    0    0    0  262    262        0    0    0   30  208    238
0.03   Moderate    0    0    0    0  152    152        0    0    0  136  212    348
0.03   Extreme     0    0    0    0  281    281        0    0    2   51  166    219
0.03   Full        0    0    0    0  148    148        0    0   41  213   98    352
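The counts in Table A.49 can be summarized as simple proportions. The sketch below is illustrative only and is not taken from the thesis: it assumes that each "Total" entry counts simulation runs, and that column 4 under K̂ = 4 counts runs in which all four groups were identified. The names `totals`, `all_four_moderate` and `selection_rate` are hypothetical.

```python
# Run totals from Table A.49 as (runs with K-hat = 4, runs with K-hat != 4).
# These totals are the same in all four epsilon blocks of the table.
totals = {
    "Mild": (262, 238),
    "Moderate": (152, 348),
    "Extreme": (281, 219),
    "Full": (148, 352),
}

# Column "4" under K-hat = 4 for the Moderate rows, keyed by epsilon.
# Assumption: this column counts runs where all four groups were identified.
all_four_moderate = {0.003: 17, 0.005: 52, 0.01: 108, 0.03: 152}


def selection_rate(correct, incorrect):
    """Share of runs in which the selected K-hat equals the true K."""
    return correct / (correct + incorrect)


# Proportion of runs with K-hat = 4 for each level of nesting.
rates = {nesting: selection_rate(*pair) for nesting, pair in totals.items()}

# Among Moderate runs with K-hat = 4, the share with all four groups identified.
moderate_correct_runs = totals["Moderate"][0]
share_all_four = {
    eps: count / moderate_correct_runs
    for eps, count in all_four_moderate.items()
}

for nesting, rate in rates.items():
    print(f"{nesting:<9} P(K-hat = 4) = {rate:.3f}")
for eps, share in share_all_four.items():
    print(f"Moderate, eps = {eps}: share with 4 groups = {share:.3f}")
```

Under these assumptions, each row's two totals sum to the same number of runs, so the proportions are directly comparable across levels of nesting.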

[Figure: two rows of four stacked bar plots, one panel each for ε = 0.003, ε = 0.005, ε = 0.01 and ε = 0.03. x-axis: Num. groups identified (1 to 4); y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.18: Stacked bar plots for the identified interaction groups in scenario 21. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure: two rows of four stacked bar plots, one panel each for ε = 0.003, ε = 0.005, ε = 0.01 and ε = 0.03. x-axis: Num. groups identified (1 to 4); y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.19: Stacked bar plots for the identified interaction groups in scenario 22. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure: two rows of four stacked bar plots, one panel each for ε = 0.003, ε = 0.005, ε = 0.01 and ε = 0.03. x-axis: Num. groups identified (1 to 4); y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.20: Stacked bar plots for the identified interaction groups in scenario 23. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

[Figure: two rows of four stacked bar plots, one panel each for ε = 0.003, ε = 0.005, ε = 0.01 and ε = 0.03. x-axis: Num. groups identified (1 to 4); y-axis: frequency; legend: Group 1, Group 2, Group 3, Group 4.]

Figure A.21: Stacked bar plots for the identified interaction groups in scenario 24. The top row is for the runs with K̂ = K and the bottom row is for K̂ ≠ K.

Table A.50: Top: The bias and relative bias of β̂ for the scenarios with N = 80, M = 42 and K = 4, using penalized perplexity for model selection. Bottom: The true β parameter used to generate the data.
Level of nesting (four columns each, groups 1 to 4): Mild, Moderate, Extreme, Full

ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ Plant β1 β2 β3 β4 β1 β2 β3 β4 β1 β2 β3 β4 β1 β2 β3 β4 1 0.000 0.000 -0.009 0.001 0.030 0.004 0.001 0.015 0.000 0.001 -0.032 0.000 0.396 0.000 0.008 0.011 2 0.001 0.000 -0.006 0.001 0.015 0.005 0.001 0.006 0.000 0.000 -0.028 0.000 0.397 0.000 0.018 0.016 3 0.001 0.000 -0.012 0.003 0.038 0.007 0.002 0.006 0.000 0.000 -0.007 0.000 0.594 0.000 0.010 0.013 4 0.000 0.000 -0.014 0.003 -0.049 0.005 0.000 0.005 0.000 0.000 -0.016 0.000 0.489 0.000 0.015 0.015 5 0.001 0.000 0.029 -0.371 0.051 0.003 0.000 0.007 -0.000 0.001 -0.025 0.000 0.640 0.000 0.021 0.032 6 0.000 0.000 -0.010 0.003 -0.050 0.006 0.001 0.007 0.000 0.000 -0.027 0.000 0.457 0.000 0.012 0.022 7 0.001 0.000 0.013 0.072 -0.006 0.001 0.000 0.007 0.147 0.004 0.002 0.000 0.374 0.000 0.025 0.025 8 0.000 0.000 0.001 0.002 0.036 0.004 0.000 0.009 0.321 0.017 0.015 0.000 0.371 0.000 0.012 25.862 9 0.000 0.000 0.003 0.000 -0.022 0.011 2.409 0.002 0.268 0.012 0.019 0.000 0.484 0.000 0.020 0.023 10 -0.029 0.001 0.000 0.001 -0.007 0.008 0.000 0.014 0.503 0.016 0.000 0.000 0.266 0.000 -0.170 0.011 11 -0.010 0.001 0.000 0.001 -0.006 0.012 0.001 0.009 0.365 0.011 0.000 0.001 0.196 0.308 -0.167 0.015 12 -0.006 0.000 0.000 0.001 0.012 0.006 0.000 0.004 0.349 0.008 0.000 0.000 1.332 0.000 -0.252 0.015 13 -0.011 0.000 0.000 0.001 -0.102 0.008 -0.045 0.006 0.428 0.010 0.000 0.001 0.282 0.000 -0.392 0.012 14 -0.005 0.000 0.000 0.001 -0.135 0.012 -0.118 0.008 0.448 0.009 0.000 0.000 0.002 0.002 -0.244 0.011 15 -0.023 0.000 0.000 0.001 -0.050 0.011 0.018 0.005 0.404 0.019 0.001 0.000 0.233 0.000 -0.172 0.015 16 -0.037 0.000 0.000 0.000 -0.032 0.017 -0.089 0.015 0.457 0.011 0.001 0.328 -0.206 0.001 0.436 0.009 17 0.006 0.000 0.000 0.002 -0.047 0.006 -0.084 0.014 0.488 0.010 0.000 0.001 0.020 0.000 -0.189 0.014 18 -0.019 0.001 0.000 0.002 -0.133 0.012 -0.014 0.007 0.486 0.009 0.000 0.000 0.605 0.003 -0.316 0.011 19 -0.004 0.000 0.000 0.002 0.001 0.004 -0.050 0.010 0.421 0.013 0.001 0.000 0.530 0.000 
-0.117 0.006 20 -0.025 0.000 0.000 0.001 0.000 0.005 0.038 0.001 0.428 0.013 0.001 0.000 0.323 0.000 -0.446 0.021 21 -0.004 0.000 -0.000 0.001 0.001 0.014 0.017 0.001 0.387 0.019 0.001 0.566 -0.080 0.001 -0.050 0.008 22 0.011 0.000 0.000 0.000 0.002 0.007 -0.024 0.010 0.416 0.010 1.679 0.000 -0.590 0.001 -0.043 -0.365 23 -0.020 0.001 0.000 0.002 0.001 0.012 0.034 0.003 0.460 0.021 0.001 0.000 -0.355 -0.017 -0.298 -0.345 24 0.005 0.148 0.000 0.001 0.001 0.006 0.001 0.001 -0.523 -0.264 0.000 0.000 -0.603 -0.019 0.028 -0.426 25 0.012 0.210 0.000 0.002 0.001 0.004 0.076 0.004 -0.589 -0.188 0.000 0.000 -0.483 0.007 -0.183 -0.451 26 0.000 0.164 0.000 0.003 0.002 0.009 0.000 0.395 -0.592 -0.087 0.000 0.000 -0.543 -0.032 -0.028 -0.458 27 0.000 0.192 0.000 0.001 0.002 0.014 0.000 0.395 -0.591 -0.126 0.000 0.000 -0.302 -0.040 -0.107 -0.476 28 0.000 0.179 0.000 0.003 0.001 0.006 0.000 0.114 -0.417 -0.301 0.000 0.000 -0.538 0.001 -0.306 -0.217 29 0.000 0.228 0.000 0.001 0.003 0.010 0.000 0.162 -0.448 -0.284 0.000 0.000 -0.427 0.001 -0.361 -0.408 30 0.001 0.203 0.000 0.001 0.000 -0.097 0.000 -0.483 -0.551 -0.174 0.000 0.000 -0.493 0.002 0.230 -0.444 31 0.000 0.220 0.000 0.001 0.000 -0.322 0.000 -0.373 -0.434 -0.278 0.000 0.000 -0.480 0.001 -0.330 -0.383 32 0.001 0.189 0.000 0.003 0.002 -0.177 0.000 -0.545 -0.489 -0.232 0.000 0.000 -0.463 0.001 0.169 -0.467 33 0.000 0.145 0.000 0.001 0.001 -0.342 0.000 -0.398 -0.504 -0.202 0.000 0.000 -0.404 0.001 -0.197 -0.435 34 0.001 0.184 0.000 0.002 0.000 -0.036 0.000 -0.471 -0.341 -0.331 0.000 0.000 -0.690 0.000 0.275 -0.356 35 0.000 -0.472 0.000 -0.024 0.001 -0.237 0.001 -0.541 -0.219 -0.355 0.000 0.000 -0.609 0.000 0.208 -0.413 36 0.000 -0.462 0.000 -0.051 0.001 -0.279 0.000 -0.476 -0.549 -0.197 0.012 0.000 -0.459 0.001 -0.305 -0.407 37 0.000 -0.487 0.000 0.017 0.001 -0.376 0.000 -0.479 -0.620 -0.094 0.000 0.000 -0.584 0.000 -0.278 -0.417 38 0.000 -0.474 0.000 0.043 0.000 -0.062 0.001 -0.433 -0.575 -0.094 0.000 0.003 0.620 0.001 -0.243 
0.012 39 0.000 -0.480 0.000 0.005 0.000 -0.308 0.000 -0.401 -0.282 -0.316 0.000 -0.122 0.086 0.000 -0.007 0.008 40 0.000 -0.477 0.000 0.107 0.000 -0.220 0.001 0.008 -0.390 -0.275 0.000 -0.013 0.541 0.002 -0.397 0.009 41 0.000 0.002 0.000 -0.146 0.001 -0.220 0.003 0.016 0.403 0.013 0.001 -0.008 0.369 0.000 0.007 0.015 42 -0.000 0.002 0.000 -0.173 0.000 -0.104 0.001 0.006 0.547 0.010 0.000 -0.024 0.506 0.000 0.030 0.037

β1 β2 β3 β4 β1 β2 β3 β4 β1 β2 β3 β4 β1 β2 β3 β4 1 0.000 0.000 0.067 0.000 0.044 0.000 0.000 0.000 0.000 0.000 0.133 0.000 0.011 0.000 0.000 0.000 2 0.000 0.000 0.099 0.000 0.045 0.000 0.000 0.000 0.000 0.000 0.085 0.000 0.021 0.000 0.000 0.000 3 0.000 0.000 0.092 0.000 0.073 0.000 0.000 0.000 0.000 0.000 0.046 0.000 0.014 0.000 0.000 0.000 4 0.000 0.000 0.082 0.000 0.037 0.000 0.000 0.000 0.000 0.000 0.144 0.000 0.022 0.000 0.000 0.000 5 0.000 0.000 0.087 0.004 0.045 0.000 0.000 0.000 0.000 0.000 0.125 0.000 0.039 0.000 0.000 0.000 6 0.000 0.000 0.095 0.000 0.043 0.000 0.000 0.000 0.000 0.000 0.067 0.000 0.023 0.000 0.000 0.000 7 0.000 0.000 0.052 0.001 0.013 0.000 0.000 0.000 0.013 0.000 0.131 0.000 0.025 0.000 0.000 0.000 8 0.000 0.000 0.220 0.000 0.092 0.000 0.000 0.000 0.034 0.000 0.122 0.000 0.025 0.000 0.000 0.001 9 0.000 0.000 0.071 0.000 0.098 0.000 0.001 0.000 0.025 0.000 0.134 0.000 0.032 0.000 0.000 0.000 10 0.050 0.000 0.135 0.000 0.074 0.000 0.000 0.000 0.037 0.000 0.000 0.000 0.025 0.000 0.038 0.000 11 0.072 0.000 0.000 0.000 0.091 0.000 0.000 0.000 0.026 0.000 0.000 0.000 0.042 0.008 0.053 0.000 12 0.059 0.000 0.000 0.000 0.050 0.000 0.000 0.000 0.018 0.000 0.000 0.000 0.008 0.000 0.035 0.000 13 0.057 0.000 0.000 0.000 0.055 0.000 0.052 0.000 0.021 0.000 0.000 0.000 0.024 0.000 0.037 0.000 14 0.050 0.000 0.000 0.000 0.051 0.000 0.057 0.000 0.024 0.000 0.000 0.000 0.033 0.000 0.032 0.000 15 0.037 0.000 0.000 0.000 0.044 0.000 0.028 0.000 0.040 0.000 0.000 0.000 0.029 0.000 0.040 0.000 16 0.016 0.000 0.000 0.000 0.044 0.000 0.123 0.000 0.037 0.000 0.000 0.008 0.025 0.000 0.012 0.000 17 0.090 0.000 0.000 0.000 0.039 0.000 0.067 0.000 0.022 0.000 0.000 0.000 0.026 0.000 0.025 0.000 18 0.093 0.000 0.000 0.000 0.061 0.000 0.044 0.000 0.022 0.000 0.000 0.000 0.018 0.000 0.046 0.000 19 0.108 0.000 0.000 0.000 0.000 0.000 0.052 0.000 0.032 0.000 0.000 0.000 0.009 0.000 0.022 0.000 20 0.050 0.000 0.000 0.000 0.000 0.000 0.075 0.000 0.020 0.000 0.000 0.000 
0.036 0.000 0.055 0.000 21 0.045 0.000 0.000 0.000 0.000 0.000 0.134 0.000 0.046 0.000 0.000 0.007 0.019 0.000 0.015 0.000 22 0.054 0.000 0.000 0.000 0.000 0.000 0.062 0.000 0.019 0.000 0.001 0.000 0.035 0.000 0.042 0.060 23 0.051 0.000 0.000 0.000 0.000 0.000 0.150 0.000 0.043 0.000 0.000 0.000 0.022 0.226 0.048 0.057 24 0.069 0.017 0.000 0.000 0.000 0.000 0.100 0.000 0.027 0.074 0.000 0.000 0.028 0.184 0.017 0.059 25 0.097 0.077 0.000 0.000 0.000 0.000 0.054 0.000 0.038 0.065 0.000 0.000 0.030 0.176 0.038 0.085 26 0.000 0.043 0.000 0.000 0.000 0.000 0.000 0.078 0.036 0.042 0.000 0.000 0.022 0.224 0.018 0.054 27 0.000 0.031 0.000 0.000 0.000 0.000 0.000 0.133 0.030 0.040 0.000 0.000 0.013 0.182 0.023 0.038 28 0.000 0.061 0.000 0.000 0.000 0.000 0.000 0.045 0.014 0.049 0.000 0.000 0.027 0.000 0.045 0.034 29 0.000 0.059 0.000 0.000 0.000 0.000 0.000 0.086 0.023 0.076 0.000 0.000 0.018 0.000 0.041 0.052 30 0.000 0.122 0.000 0.000 0.000 0.064 0.000 0.085 0.031 0.049 0.000 0.000 0.032 0.000 0.026 0.150 31 0.000 0.058 0.000 0.000 0.000 0.119 0.000 0.064 0.016 0.047 0.000 0.000 0.024 0.000 0.042 0.060 32 0.000 0.084 0.000 0.000 0.000 0.086 0.000 0.094 0.018 0.039 0.000 0.000 0.022 0.000 0.034 0.084 33 0.000 0.043 0.000 0.000 0.000 0.073 0.000 0.059 0.039 0.072 0.000 0.000 0.019 0.000 0.043 0.066 34 0.000 0.105 0.000 0.000 0.000 0.042 0.000 0.071 0.018 0.071 0.000 0.000 0.030 0.000 0.019 0.039 35 0.000 0.058 0.000 0.155 0.000 0.047 0.000 0.065 0.010 0.061 0.000 0.000 0.039 0.000 0.029 0.080 36 0.000 0.060 0.000 0.200 0.000 0.046 0.000 0.033 0.034 0.059 0.012 0.000 0.012 0.000 0.025 0.036 37 0.000 0.046 0.000 0.100 0.000 0.099 0.000 0.069 0.025 0.029 0.000 0.000 0.020 0.000 0.016 0.044 38 0.000 0.043 0.000 0.082 0.000 0.071 0.000 0.065 0.049 0.055 0.000 0.000 0.014 0.000 0.037 0.000 39 0.000 0.061 0.000 0.143 0.000 0.091 0.000 0.054 0.025 0.118 0.000 0.004 0.018 0.000 0.022 0.000 40 0.000 0.032 0.000 0.046 0.000 0.058 0.000 0.000 0.015 0.054 0.000 0.266 0.012 0.000 0.026 
0.000 41 0.000 0.000 0.000 0.139 0.000 0.161 0.000 0.000 0.051 0.000 0.000 0.324 0.012 0.000 0.000 0.000 42 0.000 0.000 0.000 0.131 0.000 0.042 0.000 0.000 0.021 0.000 0.000 0.391 0.045 0.000 0.000 0.000 143
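For readers who want to reproduce quantities like those in Table A.50, the following minimal sketch shows one common way to compute bias and relative bias from repeated simulation runs. It is an assumption, not the thesis's stated procedure: the exact definitions used for the table are not given in this appendix, the function names are hypothetical, and the three run estimates are made up for illustration. Only the true value 0.067 comes from the table (Mild nesting, plant 1, β3).

```python
from statistics import mean


def bias(estimates, true_value):
    """Mean estimate across runs minus the true parameter value."""
    return mean(estimates) - true_value


def relative_bias(estimates, true_value):
    """Bias divided by the true value; None when the true value is zero."""
    if true_value == 0:
        return None  # relative bias is undefined for a zero true value
    return bias(estimates, true_value) / true_value


# Hypothetical estimates of a single beta entry from three simulated runs.
runs = [0.060, 0.070, 0.080]
true_beta = 0.067  # true value from Table A.50 (Mild, plant 1, beta 3)

b = bias(runs, true_beta)
rb = relative_bias(runs, true_beta)
print(f"bias = {b:.4f}, relative bias = {rb:.4f}")
```

Many true β entries in the table are 0.000, so under this definition a relative bias cannot be computed for them; how the thesis handles those cells is not stated here.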

Table A.51: The average relative bias of θ̂ for the scenarios with N = 80, M = 42 and K = 4, using penalized perplexity for model selection.
Level of nesting (four columns each, θ̂1 to θ̂4): Mild, Moderate, Extreme, Full

ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ Visitor θ1 θ2 θ3 θ4 θ1 θ2 θ3 θ4 θ1 θ2 θ3 θ4 θ1 θ2 θ3 θ4 1 3.803 -1.595 0.090 5.588 12.369 0.409 -0.895 91.813 -15.014 22.573 -0.021 0.065 -15.607 2.550 -17.935 54.191 2 -0.285 -1.762 0.113 0.455 -0.971 62.161 0.912 -3.622 3.795 2.701 3.651 -0.164 6.079 -1.148 66.232 24.584 3 0.137 -1.147 0.333 42.895 0.620 198.832 -0.247 -7.711 -7.050 32.896 -0.057 -0.629 73.625 -1.107 298.108 -0.764 4 0.042 -0.330 -0.293 -0.297 -2.037 115.238 -1.379 202.473 -4.481 20.678 -0.088 -0.380 48.114 -0.995 -0.693 24.712 5 -0.480 -1.559 0.127 -0.176 2.371 5.000 0.116 -4.372 83.543 28.392 0.852 -0.578 107.001 -1.231 3.389 6 0.295 -0.931 0.029 2.221 -1.060 12.205 -0.169 -4.192 392.179 3.667 -0.327 0.786 40.812 -1.787 160.866 -2.100 7 -0.022 -1.700 -0.165 -0.319 -0.802 6.913 -0.171 -13.857 159.230 50.240 -0.513 -0.796 112.147 -2.134 10.735 3.826 8 -0.425 -1.065 -0.415 40.894 -0.687 21.367 -0.378 68.631 -3.292 134.438 -0.030 1.473 24.014 -1.088 279.979 55.680 9 -0.043 -2.452 0.018 -0.575 -1.378 30.195 12.719 156.267 5.333 19.318 -0.388 -0.226 118.875 8.650 162.007 131.590 10 -0.289 -2.174 -0.272 5.103 17.611 -0.563 1.161 65.527 37.530 22.895 -0.410 4.949 38.810 -1.655 15.822 -1.112 11 -0.704 -2.245 -0.115 0.990 -0.559 -2.185 -1.424 22.785 -9.517 6.402 -0.395 -0.654 -7.761 -1.324 196.046 -10.093 12 0.656 47.809 0.090 14.871 -0.757 150.739 -0.643 -5.171 251.839 0.846 -0.292 -0.482 -1.074 -0.518 60.166 12.668 13 -0.334 -1.461 0.882 8.340 -2.212 -1.186 0.626 1.672 26.864 -0.877 0.633 -0.707 443.657 -1.381 70.164 27.332 14 -2.141 -1.011 0.179 104.326 -0.508 4.527 1.623 -4.100 4.523 1.430 -0.262 -1.214 -20.907 78.009 -28.605 68.955 15 -0.667 -1.547 0.038 36.555 -1.431 15.851 -0.163 -1.629 -1.624 10.901 0.100 -0.662 44.497 17.531 102.184 16 -0.452 -1.136 -0.554 -0.376 33.250 -2.901 -1.137 10.696 -32.459 6.573 0.188 0.357 625.597 3.194 45.116 13.378 17 -0.084 -2.410 -0.082 0.503 -1.485 6.629 -0.684 -3.782 140.858 -1.744 -0.301 -0.810 -8.910 -0.793 141.603 19.431 18 
0.516 -2.491 -0.661 -0.204 -1.353 32.805 -0.397 -1.159 -11.813 6.677 -0.002 -0.894 19.708 35.670 162.244 56.589 19 -0.453 2.625 -0.380 1.548 -1.118 29.139 1.096 -2.679 84.738 45.749 -1.005 -0.992 377.777 3.504 -18.962 13.687 20 -0.716 -2.026 0.605 24.107 -0.726 9.884 1.387 -6.688 -9.595 24.057 -0.298 -0.430 549.195 47.463 106.859 60.086 21 -1.039 -2.737 0.053 2.472 -1.307 -1.917 -0.566 124.948 -15.300 -0.432 -0.266 -0.400 233.622 46.990 100.369 10.361 22 -0.749 -1.956 0.154 31.453 -3.413 -4.938 51.110 67.331 -18.760 51.021 0.019 3.616 98.623 -1.182 115.162 62.172 23 -0.840 -1.126 -0.504 1.029 -1.155 90.685 -1.087 -7.440 144.648 6.175 2.360 0.371 467.835 39.729 392.024 327.591 24 56.307 -0.733 -0.071 -0.774 -0.574 48.925 -1.130 -1.054 -15.936 20.937 -0.779 -1.520 366.935 -0.013 7.307 -1.195 25 2.569 1.653 -0.305 0.603 -1.861 27.596 46.096 85.559 -39.029 33.344 -0.745 0.129 881.885 -5.456 -35.365 35.959 26 -1.208 -1.607 -0.634 -0.537 0.172 148.589 -3.863 -8.683 38.304 17.761 -0.330 -0.065 516.548 -3.072 7.491 39.097 27 -0.543 -3.779 0.438 -1.038 -1.886 -2.701 2.891 27.516 -10.178 32.764 -0.220 -0.858 215.403 0.273 -5.231 62.974 28 1.612 -2.968 0.246 2.079 -1.032 180.236 -0.599 44.562 147.954 1.058 -0.276 -0.392 154.917 -1.274 17.728 29 -0.256 -2.516 -1.050 0.128 -0.903 103.193 -0.781 526.739 463.531 4.204 -0.808 -0.278 189.316 -0.927 367.530 34.324 30 -0.699 -1.460 -1.155 -0.323 -1.525 -3.193 -0.599 339.743 -19.019 70.166 -0.999 -1.819 85.836 -1.393 91.158 95.837 31 -1.396 -1.077 -0.935 -0.500 -1.137 234.374 -1.238 -14.790 -19.218 14.885 -0.103 0.396 8.127 -1.440 342.201 16.993 32 -1.382 -2.729 -0.452 0.694 -1.208 155.325 -2.148 21.749 555.887 22.858 -0.558 -0.460 79.543 -1.380 -29.462 19.007 33 -1.202 -2.721 -0.308 1.691 -2.963 -7.842 11.537 -1.751 44.344 21.433 -0.332 -0.889 -14.288 13.354 216.741 57.895 34 -0.867 -4.909 -0.221 31.331 -0.433 18.649 -0.702 -0.911 157.107 2.863 0.047 -0.153 187.850 -0.495 35.845 7.873 35 -0.575 -4.740 0.047 -0.981 -0.402 287.521 
0.190 -5.917 233.243 5.167 -1.352 -0.302 46.272 -1.800 99.401 15.260 36 -0.853 17.992 -0.534 47.539 213.562 17.933 -0.787 152.819 81.325 68.762 -0.254 -0.929 117.129 -1.424 36.810 6.980 37 -0.122 -0.585 -1.695 25.050 -2.655 85.161 -0.697 -3.250 33.203 -0.425 -0.560 0.164 41.711 -2.228 1.176 11.811 38 0.083 2.775 -2.068 0.322 -0.606 -2.093 -6.508 -3.284 186.658 5.403 -0.690 -0.703 33.985 -1.145 43.580 4.956 39 0.015 0.182 -0.056 65.303 2.148 -4.195 1.105 -15.961 -18.277 78.182 -0.541 -1.329 118.298 -7.771 17.474 60.828 40 -0.987 7.445 -1.538 65.870 -0.507 -10.431 -0.744 17.267 44.310 5.362 -1.497 -0.940 239.381 0.017 -10.306 2.911 41 -0.855 -2.580 -0.319 0.364 0.233 21.686 12.017 -9.546 11.467 109.711 -0.446 -1.709 -2.898 -1.673 867.530 78.384 42 -0.440 -1.885 -0.535 -0.592 -1.618 206.930 -0.841 4.024 109.301 8.600 -1.317 -0.197 95.673 -1.588 -2.526 43 0.141 -1.365 -0.146 34.922 -2.200 -2.728 7.410 65.088 41.591 47.120 5.226 -0.271 90.137 22.035 2.247 89.104 44 -0.664 -2.745 -2.076 15.400 -0.861 -2.361 7.052 -5.410 -10.453 21.439 -0.190 7.207 14.834 -3.235 234.313 -0.662 45 -0.592 -0.827 -1.502 116.918 -1.653 2.941 -1.548 -6.993 2.026 -1.625 -1.482 -0.810 50.267 61.112 -10.970 195.343 46 -1.798 -0.883 -0.749 0.468 -0.784 9.461 -0.901 8.531 -9.800 75.221 -1.863 -0.521 2.772 70.532 -41.338 32.634 47 31.863 0.743 2.766 13.238 1.819 192.804 -1.377 -4.216 82.513 2.363 -0.288 -2.280 139.697 -1.242 129.446 196.638 48 -0.751 -5.225 0.138 3.682 -5.390 -12.691 -1.137 58.685 76.192 2.288 -0.611 -1.275 101.904 -2.872 24.329 51.638 49 -1.161 -2.649 0.425 3.611 -1.609 1.749 -1.567 -6.939 166.184 41.760 0.185 7.515 22.382 -0.282 91.559 225.241 50 -0.607 -0.541 2.524 0.549 13.516 12.601 -0.238 -4.431 364.953 60.392 -1.288 3.502 -25.468 -1.189 42.361 13.810 51 -0.498 -2.197 5.374 66.691 -7.803 -4.001 -0.781 -6.467 -1.300 104.493 -0.598 -2.520 -2.639 0.456 -29.711 21.569 52 -0.833 3.872 3.236 103.046 -1.650 1.142 9.646 -15.302 -12.385 138.362 -0.772 -0.602 -22.505 -0.507 31.238 
78.808 53 -1.513 -2.360 0.738 1.514 15.951 -0.989 50.950 457.468 0.707 28.477 -0.240 -1.595 2.866 -0.986 13.117 275.370 54 -2.391 37.558 -0.262 88.370 -0.769 -4.252 -2.832 -7.994 -5.340 60.273 -1.742 -0.856 82.043 2.999 -0.557 64.647 55 -0.300 -6.138 -1.128 0.333 -1.505 -4.746 -0.733 17.318 237.834 75.999 -2.483 -0.437 -4.007 -1.813 505.408 -3.070 56 -1.175 -1.428 -0.647 0.586 -2.275 10.779 -0.517 4.559 -5.019 57.076 3.077 0.252 251.393 -0.400 -26.914 84.293 57 -1.375 0.067 0.609 -1.675 -1.588 148.027 2.334 1.711 -21.129 -1.199 -0.928 7.442 147.286 1.608 -5.777 9.407 58 -0.833 -3.334 -0.415 -0.667 -2.897 -3.975 -1.566 -2.589 -0.651 60.929 -0.606 -0.831 22.730 -1.753 1503.125 -9.580 59 -0.234 -2.454 0.015 -0.650 -1.399 5.330 -1.794 5.593 -7.259 44.690 -1.009 -1.505 230.825 -3.748 205.822 40.338 60 -0.774 4.389 -1.343 2.398 -1.064 5.575 1.387 -11.448 22.672 88.074 -0.416 27.127 -5.503 27.655 53.443 42.295 61 -0.471 15.290 -1.553 -1.681 -4.159 505.812 0.172 -11.164 -24.018 55.536 -1.226 0.234 109.164 -0.822 -3.840 62 1.448 -2.152 -0.840 -0.580 -2.926 13.445 -1.121 -13.265 40.992 61.499 -1.826 -0.963 172.844 172.622 -3.471 -21.792 63 -1.353 -0.099 -0.633 -0.432 18.238 -1.474 -9.537 288.948 -7.599 33.906 0.109 15.376 94.149 -1.596 -37.120 209.254 64 -2.100 -2.636 -0.179 36.715 -2.418 17.672 -1.051 20.815 -15.844 26.955 -0.181 -1.148 -8.574 1.501 -11.501 38.999 65 2.111 30.290 -0.529 -0.537 0.689 202.216 10.442 -9.559 -9.461 1.763 -0.922 -0.574 10.131 0.540 422.924 155.647 66 -1.218 -1.419 -1.664 4.512 7.742 14.108 -0.777 37.899 8.023 10.766 -1.246 -1.020 83.751 6.137 71.453 -29.236 67 0.206 -2.608 -0.842 0.269 -2.512 -4.056 -1.016 205.779 13.572 24.590 -0.768 -1.421 34.431 25.388 -10.008 2.896 68 -0.936 -2.682 0.954 -2.177 -2.121 34.102 121.405 43.379 31.706 31.257 -0.078 -1.144 62.215 32.644 341.182 -7.954 69 1.727 -2.677 -1.162 -1.014 -0.207 -9.129 -1.416 -5.181 6.255 -2.277 54.965 5.706 80.635 -3.176 385.661 116.879 70 -0.983 -2.635 -1.615 1.161 -0.963 -3.623 6.245 
34.625 91.855 -0.372 -0.563 -1.831 110.734 255.851 -8.170 71 -1.546 -1.943 -1.600 0.423 -0.681 0.508 -2.674 -21.498 0.197 -4.459 0.055 -1.870 305.543 -4.130 407.363 40.511 72 -1.430 13.871 -1.848 78.756 -0.808 0.704 -1.446 63.843 370.281 127.449 -0.390 -1.139 239.145 15.361 -9.315 -3.353 73 0.241 4.390 -1.805 -0.541 -2.182 -4.003 -5.194 -5.614 -2.397 41.951 3.306 -3.112 -2.736 -3.356 -1.350 176.775 74 -1.333 -1.859 -1.566 9.367 10.892 55.706 -1.671 -4.919 193.345 -0.459 -2.048 -0.976 -4.821 -1.366 447.485 360.788 75 -0.277 -2.488 -0.842 -1.303 -0.954 -4.837 -0.989 125.708 246.322 2.484 -1.395 -3.220 2.599 49.100 -29.301 53.062 76 -1.356 -1.220 -0.731 96.670 -5.994 31.927 15.528 -5.663 7.737 -2.341 0.082 -0.267 368.162 -1.090 -30.823 -9.613 77 -1.336 10.749 -0.424 -0.990 -4.375 -5.215 -1.808 37.475 -8.689 24.416 9.861 1.400 -2.146 20.636 3.703 -6.599 78 -2.404 -0.885 -0.745 28.406 -7.136 31.663 -1.077 -5.521 55.770 9.737 -0.893 -0.506 145.562 -0.838 -26.255 -5.664 79 63.060 -0.790 -0.170 12.567 -6.424 -0.182 -2.108 17.718 -4.146 -3.996 -0.342 -0.747 -1.724 -3.238 -13.090 -12.887 80 0.318 -3.646 -2.621 4.502 -1.488 85.044 0.350 3.095 71.415 97.409 -0.069 49.661 -4.370 15.336 -31.567 52.425 144

Table A.52: The average bias of θ̂ for the scenarios with N = 80, M = 42 and K = 4, using penalized perplexity for model selection.
Level of nesting (four columns each, θ̂1 to θ̂4): Mild, Moderate, Extreme, Full

Visitor  θ̂1 θ̂2 θ̂3 θ̂4   θ̂1 θ̂2 θ̂3 θ̂4   θ̂1 θ̂2 θ̂3 θ̂4   θ̂1 θ̂2 θ̂3 θ̂4
1  0.002 0.002 0.001 0.100   0.007 0.130 0.016 0.076   0.010 0.127 0.007 0.013   0.109 0.033 0.081 0.179
2  0.002 0.000 0.000 0.070   0.007 0.127 0.020 0.052   0.036 0.125 0.003 0.006   0.177 0.024 0.136 0.177
3  0.000 0.012 0.000 0.066   0.012 0.190 0.013 0.066   0.055 0.114 0.001 0.003   0.185 0.032 0.156 0.213
4  0.000 0.005 0.000 0.053   0.014 0.158 0.010 0.127   0.066 0.126 0.001 0.003   0.176 0.035 0.235 0.186
5  0.000 0.006 0.000 0.069   0.016 0.116 0.012 0.096   0.080 0.123 0.002 0.007   0.227 0.029 0.205 0.184
6  0.000 0.011 0.001 0.048   0.018 0.155 0.014 0.124   0.101 0.150 0.004 0.003   0.211 0.022 0.162 0.204
7  0.000 0.017 0.000 0.051   0.001 0.217 0.008 0.145   0.085 0.136 0.003 0.005   0.151 0.039 0.175 0.213
8  0.003 0.002 0.000 0.057   0.004 0.206 0.007 0.100   0.067 0.127 0.001 0.007   0.186 0.029 0.156 0.199
9  0.000 0.015 0.000 0.054   0.022 0.154 0.014 0.164   0.101 0.137 0.008 0.009   0.274 0.013 0.119 0.213
10  0.000 0.006 0.001 0.063   0.027 0.172 0.016 0.094   0.059 0.137 0.003 0.003   0.210 0.040 0.178 0.209
11  0.004 0.013 0.001 0.058   0.032 0.168 0.013 0.124   0.055 0.139 0.005 0.009   0.218 0.033 0.171 0.202
12  0.001 0.005 0.001 0.051   0.002 0.192 0.014 0.103   0.071 0.149 0.002 0.008   0.191 0.022 0.147 0.131
13  0.000 0.009 0.000 0.056   0.005 0.112 0.005 0.093   0.052 0.147 0.007 0.003   0.223 0.024 0.198 0.230
14  0.000 0.006 0.001 0.045   0.039 0.180 0.019 0.081   0.080 0.128 0.005 0.002   0.267 0.038 0.200 0.200
15  0.000 0.006 0.001 0.059   0.017 0.148 0.020 0.113   0.111 0.153 0.002 0.002   0.180 0.025 0.196 0.223
16  0.001 0.022 0.000 0.056   0.024 0.141 0.018 0.148   0.105 0.135 0.002 0.008   0.130 0.037 0.170 0.203
17  0.001 0.001 0.000 0.064   0.034 0.152 0.016 0.105   0.090 0.136 0.007 0.008   0.203 0.031 0.194 0.169
18  0.003 0.006 0.001 0.056   0.023 0.121 0.013 0.162   0.035 0.111 0.005 0.006   0.203 0.026 0.148 0.155
19  0.000 0.006 0.001 0.054   0.012 0.153 0.011 0.134   0.056 0.132 0.004 0.013   0.228 0.024 0.197 0.211
20  0.002 0.005 0.001 0.055   0.014 0.172 0.008 0.147   0.080 0.140 0.006 0.015   0.206 0.034 0.193 0.182
21  0.000 0.020 0.001 0.037   0.026 0.095 0.003 0.147   0.080 0.108 0.001 0.004   0.201 0.019 0.157 0.178
22  0.000 0.006 0.001 0.046   0.008 0.154 0.009 0.083   0.109 0.118 0.001 0.005   0.184 0.033 0.190 0.187
23  0.006 0.018 0.001 0.041   0.013 0.241 0.003 0.137   0.071 0.152 0.005 0.008   0.214 0.038 0.143 0.181
24  0.000 0.034 0.000 0.043   0.021 0.205 0.000 0.181   0.089 0.140 0.007 0.007   0.220 0.025 0.174 0.143
25  0.002 0.016 0.000 0.060   0.005 0.119 0.004 0.120   0.073 0.149 0.007 0.009   0.257 0.027 0.209 0.191
26  0.001 0.013 0.000 0.053   0.018 0.165 0.010 0.118   0.072 0.145 0.004 0.005   0.235 0.038 0.152 0.181
27  0.000 0.023 0.000 0.066   0.008 0.124 0.015 0.145   0.095 0.127 0.001 0.006   0.190 0.037 0.203 0.218
28  0.000 0.009 0.001 0.052   0.005 0.148 0.013 0.141   0.096 0.146 0.002 0.003   0.225 0.026 0.171 0.200
29  0.007 0.018 0.001 0.047   0.013 0.108 0.010 0.107   0.088 0.154 0.004 0.007   0.234 0.029 0.154 0.189
30  0.004 0.015 0.001 0.027   0.013 0.194 0.003 0.165   0.102 0.135 0.006 0.011   0.231 0.022 0.206 0.184
31  0.001 0.013 0.001 0.076   0.004 0.230 0.010 0.100   0.066 0.129 0.006 0.009   0.263 0.021 0.182 0.185
32  0.000 0.005 0.001 0.058   0.016 0.200 0.027 0.108   0.108 0.100 0.002 0.007   0.215 0.015 0.210 0.154
33  0.004 0.021 0.001 0.071   0.026 0.119 0.007 0.132   0.083 0.114 0.004 0.019   0.229 0.028 0.206 0.200
34  0.000 0.005 0.000 0.052   0.018 0.110 0.021 0.109   0.087 0.116 0.000 0.009   0.175 0.028 0.186 0.188
35  0.001 0.010 0.001 0.056   0.016 0.216 0.018 0.159   0.086 0.139 0.002 0.010   0.211 0.045 0.139 0.201
36  0.001 0.014 0.001 0.036   0.002 0.160 0.013 0.089   0.091 0.160 0.003 0.014   0.228 0.039 0.173 0.214
37  0.007 0.000 0.001 0.057   0.015 0.119 0.002 0.125   0.057 0.125 0.008 0.013   0.212 0.015 0.164 0.202
38  0.000 0.018 0.001 0.060   0.029 0.125 0.007 0.092   0.081 0.133 0.003 0.009   0.169 0.046 0.212 0.178
39  0.001 0.009 0.002 0.054   0.016 0.116 0.029 0.128   0.097 0.133 0.005 0.012   0.195 0.030 0.186 0.184
40  0.001 0.021 0.000 0.069   0.015 0.167 0.013 0.129   0.069 0.129 0.007 0.006   0.223 0.027 0.181 0.193
41  0.000 0.012 0.002 0.038   0.011 0.224 0.016 0.110   0.077 0.130 0.004 0.012   0.190 0.048 0.156 0.180
42  0.000 0.006 0.000 0.055   0.027 0.198 0.020 0.121   0.059 0.128 0.005 0.013   0.152 0.046 0.181 0.198
43  0.000 0.014 0.001 0.054   0.025 0.164 0.018 0.093   0.090 0.118 0.008 0.010   0.211 0.027 0.146 0.161
44  0.000 0.035 0.000 0.059   0.020 0.144 0.011 0.097   0.139 0.160 0.012 0.017   0.183 0.032 0.182 0.162
45  0.009 0.015 0.000 0.044   0.022 0.254 0.013 0.170   0.079 0.129 0.004 0.008   0.227 0.033 0.159 0.161
46  0.000 0.032 0.000 0.045   0.013 0.171 0.006 0.132   0.088 0.137 0.004 0.010   0.200 0.053 0.140 0.180
47  0.002 0.011 0.002 0.062   0.028 0.139 0.009 0.144   0.077 0.099 0.006 0.007   0.210 0.032 0.163 0.149
48  0.002 0.018 0.000 0.052   0.000 0.219 0.021 0.198   0.103 0.142 0.006 0.008   0.236 0.034 0.138 0.185
49  0.006 0.010 0.000 0.048   0.020 0.172 0.014 0.108   0.071 0.130 0.008 0.013   0.160 0.062 0.181 0.145
50  0.004 0.006 0.001 0.064   0.016 0.143 0.020 0.110   0.071 0.116 0.010 0.006   0.153 0.053 0.168 0.168
51  0.000 0.042 0.000 0.079   0.016 0.177 0.023 0.137   0.106 0.138 0.011 0.031   0.150 0.058 0.133 0.211
52  0.000 0.030 0.002 0.067   0.005 0.165 0.002 0.112   0.073 0.118 0.008 0.037   0.174 0.073 0.186 0.187
53  0.000 0.019 0.000 0.049   0.007 0.197 0.029 0.098   0.088 0.136 0.022 0.014   0.218 0.054 0.206 0.207
54  0.002 0.020 0.000 0.071   0.009 0.161 0.018 0.171   0.099 0.114 0.014 0.025   0.218 0.062 0.205 0.169
55  0.009 0.005 0.000 0.046   0.008 0.163 0.018 0.142   0.106 0.127 0.009 0.009   0.190 0.076 0.129 0.180
56  0.002 0.017 0.000 0.062   0.015 0.163 0.021 0.165   0.101 0.128 0.002 0.022   0.213 0.054 0.161 0.202
57  0.002 0.029 0.002 0.084   0.031 0.171 0.010 0.096   0.094 0.129 0.015 0.014   0.198 0.056 0.166 0.158
58  0.002 0.026 0.000 0.052   0.032 0.145 0.024 0.117   0.109 0.108 0.005 0.025   0.182 0.084 0.196 0.182
59  0.012 0.014 0.000 0.042   0.012 0.193 0.007 0.161   0.152 0.149 0.010 0.019   0.219 0.059 0.151 0.209
60  0.002 0.050 0.000 0.062   0.009 0.149 0.004 0.149   0.075 0.137 0.010 0.020   0.254 0.024 0.162 0.160
61  0.002 0.044 0.002 0.042   0.011 0.206 0.034 0.164   0.089 0.146 0.016 0.028   0.171 0.099 0.174 0.130
62  0.009 0.036 0.002 0.039   0.016 0.229 0.018 0.130   0.108 0.140 0.026 0.032   0.248 0.087 0.177 0.212
63  0.012 0.027 0.003 0.067   0.022 0.128 0.031 0.061   0.126 0.126 0.021 0.019   0.237 0.060 0.230 0.176
64  0.011 0.039 0.000 0.074   0.022 0.141 0.029 0.134   0.071 0.153 0.015 0.008   0.126 0.102 0.162 0.167
65  0.000 0.037 0.003 0.057   0.031 0.129 0.022 0.182   0.120 0.137 0.022 0.023   0.234 0.058 0.197 0.191
66  0.009 0.023 0.006 0.062   0.022 0.163 0.021 0.102   0.120 0.166 0.018 0.035   0.224 0.090 0.200 0.227
67  0.014 0.015 0.001 0.084   0.005 0.196 0.016 0.139   0.115 0.106 0.014 0.017   0.162 0.060 0.191 0.196
68  0.016 0.030 0.009 0.097   0.030 0.221 0.052 0.102   0.124 0.139 0.016 0.018   0.242 0.075 0.216 0.142
69  0.006 0.029 0.001 0.048   0.028 0.194 0.015 0.116   0.082 0.165 0.008 0.017   0.216 0.044 0.240 0.165
70  0.017 0.025 0.003 0.060   0.044 0.175 0.029 0.205   0.077 0.139 0.016 0.029   0.205 0.066 0.176 0.123
71  0.008 0.034 0.009 0.061   0.036 0.190 0.015 0.174   0.127 0.122 0.003 0.024   0.225 0.034 0.215 0.226
72  0.009 0.055 0.006 0.095   0.031 0.178 0.023 0.131   0.068 0.155 0.013 0.026   0.197 0.068 0.181 0.186
73  0.011 0.027 0.001 0.065   0.042 0.153 0.029 0.186   0.099 0.152 0.008 0.025   0.181 0.047 0.234 0.218
74  0.006 0.019 0.003 0.049   0.032 0.137 0.005 0.170   0.078 0.119 0.006 0.042   0.223 0.047 0.180 0.146
75  0.019 0.053 0.001 0.081   0.005 0.164 0.034 0.084   0.088 0.162 0.006 0.023   0.200 0.084 0.127 0.233
76  0.003 0.041 0.001 0.055   0.037 0.146 0.048 0.107   0.124 0.155 0.013 0.027   0.192 0.055 0.163 0.194
77  0.003 0.031 0.001 0.094   0.028 0.182 0.034 0.141   0.119 0.141 0.014 0.020   0.197 0.068 0.195 0.156
78  0.003 0.054 0.006 0.068   0.030 0.199 0.028 0.200   0.076 0.118 0.003 0.023   0.139 0.058 0.140 0.173
79  0.001 0.032 0.009 0.072   0.010 0.175 0.038 0.102   0.135 0.122 0.016 0.014   0.170 0.122 0.197 0.187
80  0.003 0.023 0.006 0.058   0.033 0.152 0.024 0.129   0.105 0.164 0.013 0.037   0.167 0.061 0.180 0.245

Table A.53: The coefficient of variation (CV) and SD for β̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.

Level of Nesting
Plant  Mild: β̂1 β̂2 β̂3 β̂4   Moderate: β̂1 β̂2 β̂3 β̂4   Extreme: β̂1 β̂2 β̂3 β̂4   Full: β̂1 β̂2 β̂3 β̂4
1  0.000 0.000 1.005 0.016   1.130 0.025 0.008 0.092   0.000 0.020 0.976 0.000   2.153 0.001 0.022 0.025
2  0.018 0.000 1.004 0.015   1.111 0.030 0.007 0.030   0.000 0.002 0.984 0.000   2.134 0.000 0.084 0.039
3  0.017 0.000 1.003 0.030   1.121 0.042 0.016 0.030   0.000 0.002 1.012 0.000   2.429 0.000 0.034 0.033
4  0.000 0.000 1.000 0.028   1.050 0.022 0.001 0.020   0.000 0.002 0.991 0.000   2.164 0.000 0.046 0.039
5  0.017 0.000 1.039 0.950   1.144 0.020 0.001 0.033   0.000 0.020 0.984 0.000   2.329 0.000 0.074 0.080
6  0.000 0.000 1.002 0.031   1.052 0.035 0.005 0.028   0.000 0.002 0.986 0.000   2.189 0.000 0.036 0.057
7  0.011 0.000 1.033 10.783   1.125 0.006 0.002 0.059   1.543 0.015 1.009 0.000   1.992 0.000 0.098 0.063
8  0.000 0.000 1.004 0.030   1.113 0.029 0.001 0.046   1.583 0.055 1.028 0.005   2.186 0.000 0.064 94.112
9  0.000 0.000 1.014 0.000   1.065 0.050 21.292 0.017   1.594 0.057 1.026 0.002   2.123 0.000 0.059 0.059
10  1.003 0.009 1.008 0.019   1.077 0.041 0.001 0.074   1.744 0.054 0.005 0.002   1.728 0.005 2.678 0.036
11  1.006 0.012 0.000 0.011   1.077 0.061 0.009 0.044   1.611 0.032 0.002 0.008   1.635 2.841 2.185 0.048
12  1.012 0.000 0.000 0.009   1.092 0.031 0.001 0.021   1.631 0.020 0.005 0.004   3.576 0.000 1.908 0.068
13  1.010 0.000 0.000 0.011   1.033 0.035 1.076 0.027   1.693 0.034 0.006 0.005   1.726 0.000 1.234 0.035
14  1.015 0.000 0.000 0.013   1.011 0.041 0.995 0.034   1.682 0.026 0.001 0.005   1.375 0.023 1.503 0.033
15  1.007 0.006 0.000 0.010   1.123 0.062 1.201 0.024   1.601 0.055 0.008 0.002   1.682 0.000 2.472 0.043
16  1.027 0.000 0.000 0.002   1.256 0.087 0.979 0.066   1.698 0.034 0.008 2.517   1.093 0.010 6.361 0.025
17  1.018 0.001 0.000 0.019   1.152 0.028 1.003 0.073   1.747 0.028 0.005 0.005   1.409 0.000 1.638 0.064
18  0.999 0.015 0.000 0.021   0.993 0.051 1.149 0.027   1.736 0.027 0.004 0.004   2.255 0.032 1.931 0.035
19  1.009 0.001 0.000 0.024   0.007 0.024 1.031 0.067   1.640 0.036 0.016 0.005   2.245 0.003 2.865 0.020
20  0.998 0.001 0.000 0.012   0.004 0.036 1.113 0.003   1.747 0.058 0.007 0.001   1.776 0.000 1.138 0.060
21  1.021 0.000 0.000 0.011   0.005 0.086 1.080 0.003   1.607 0.059 0.009 3.650   1.301 0.006 2.895 0.031
22  1.029 0.000 0.000 0.000   0.011 0.047 1.049 0.080   1.747 0.033 16.243 0.004   0.804 0.007 2.757 0.895
23  1.004 0.009 0.000 0.017   0.008 0.087 1.096 0.031   1.673 0.058 0.009 0.002   1.390 0.993 2.147 1.372
24  1.021 1.398 0.000 0.011   0.011 0.036 1.071 0.003   0.985 0.856 0.000 0.000   0.818 0.996 2.810 0.967
25  1.027 1.264 0.000 0.018   0.011 0.022 1.508 0.022   0.821 0.942 0.000 0.000   1.116 1.020 1.833 0.935
26  0.000 1.280 0.000 0.014   0.019 0.019 0.001 2.237   0.804 1.071 0.000 0.000   0.958 0.977 2.482 0.917
27  0.004 1.280 0.000 0.010   0.024 0.029 0.001 1.947   0.796 1.040 0.000 0.000   1.644 0.974 3.062 0.880
28  0.000 1.240 0.000 0.017   0.009 0.016 0.001 1.584   1.244 0.811 0.000 0.000   0.886 0.006 2.200 1.107
29  0.000 1.315 0.000 0.006   0.023 0.021 0.001 1.651   1.174 0.849 0.000 0.000   1.181 0.009 1.472 0.848
30  0.016 1.244 0.000 0.011   0.003 1.752 0.001 0.826   0.887 0.958 0.000 0.000   1.058 0.024 3.377 0.802
31  0.000 1.304 0.000 0.013   0.005 1.035 0.001 1.055   1.588 0.837 0.001 0.000   1.047 0.011 1.723 0.880
32  0.011 1.242 0.000 0.024   0.016 1.225 0.001 0.678   1.071 0.891 0.000 0.000   1.080 0.013 3.403 0.769
33  0.000 1.198 0.000 0.008   0.010 0.882 0.001 1.250   1.023 0.976 0.000 0.000   1.239 0.013 2.173 0.805
34  0.013 1.227 0.000 0.018   0.002 1.800 0.001 0.972   1.470 0.776 0.001 0.000   0.609 0.000 4.547 0.906
35  0.000 0.767 0.000 0.996   0.012 0.987 0.010 0.687   1.682 0.754 0.000 0.000   0.782 0.001 3.385 0.822
36  0.000 0.790 0.000 0.968   0.006 1.480 0.001 0.781   0.905 0.932 1.224 0.000   1.112 0.006 1.652 0.834
37  0.000 0.747 0.000 1.041   0.012 0.793 0.001 0.785   0.753 1.069 0.000 0.000   0.801 0.000 1.387 0.824
38  0.000 0.762 0.000 1.075   0.002 1.883 0.015 1.156   0.822 1.054 0.006 0.047   2.358 0.011 2.096 0.046
39  0.000 0.746 0.000 1.026   0.003 1.142 0.001 0.999   1.663 0.806 0.000 1.618   1.503 0.000 3.512 0.025
40  0.000 0.780 0.000 1.158   0.002 1.256 0.013 0.022   1.513 0.878 0.000 0.991   2.193 0.018 1.392 0.026
41  0.000 0.011 0.000 0.930   0.009 1.248 0.033 0.032   1.659 0.044 0.011 0.993   2.127 0.000 0.021 0.063
42  0.000 0.010 0.000 0.864   0.002 2.106 0.010 0.017   2.105 0.040 0.001 0.979   2.154 0.000 0.102 0.109

Table A.54: The average coefficient of variation (CV) for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.

Level of Nesting
Visitor  Mild: θ̂1 θ̂2 θ̂3 θ̂4   Moderate: θ̂1 θ̂2 θ̂3 θ̂4   Extreme: θ̂1 θ̂2 θ̂3 θ̂4   Full: θ̂1 θ̂2 θ̂3 θ̂4
1  17.898 14.597 19.349 18.024   13.651 22.866 21.011 34.923   5.790 21.477 16.669 28.836   20.819 19.444 2.291 16.682
2  20.167 5.101 13.402 20.303   4.989 21.516 12.665 16.124   14.690 13.937 28.110 10.571   11.618 24.508 20.190 29.561
3  20.666 15.227 32.694 26.341   10.901 26.359 29.027 13.546   18.093 18.344 24.049 26.395   20.021 24.611 12.381 11.304
4  14.185 30.200 11.638 21.572   5.328 34.885 18.035 48.090   13.751 9.330 30.341 28.130   30.899 47.984 14.649 4.197
5  32.603 3.768 24.120 8.949   1.161 3.572 59.211 23.004   13.986 30.620 20.295 21.731   9.352 18.349 5.268 18.828
6  24.781 9.805 12.003 9.087   24.434 34.895 3.011 13.600   9.330 14.681 15.858 17.766   6.611 15.120 14.283 6.225
7  8.874 17.642 38.896 13.873   7.013 13.495 18.081 14.543   21.567 22.175 27.914 26.180   12.592 24.999 34.031 21.312
8  31.851 19.060 25.427 25.904   7.420 29.706 7.359 21.736   8.404 33.492 25.470 13.732   8.621 41.649 16.135 26.202
9  20.750 10.808 14.311 14.300   24.225 13.610 9.323 23.421   23.020 17.817 21.874 18.066   30.053 9.254 17.438 44.880
10  12.525 12.938 15.025 13.449   24.171 17.008 17.358 9.588   13.029 18.356 32.103 8.773   14.396 48.668 10.699 27.690
11  15.220 27.725 13.244 10.568   20.602 22.411 9.711 26.104   8.512 14.680 15.283 6.759   12.549 20.575 26.749 3.698
12  21.433 21.197 14.910 16.150   26.561 49.503 22.367 14.128   21.881 15.335 22.323 11.194   13.693 1.971 24.891 12.814
13  12.999 8.203 23.873 21.252   33.433 13.944 13.351 19.193   24.298 14.200 32.018 26.352   28.662 26.139 12.504 15.244
14  31.976 20.648 16.259 24.702   5.021 23.237 33.067 9.186   9.137 12.092 17.651 20.950   6.144 25.782 0.918 28.771
15  19.749 22.134 9.373 16.787   11.675 32.254 1.438 11.297   10.229 22.226 11.981 20.265   15.721 33.033 1.456 22.471
16  17.893 12.059 15.858 8.822   43.899 6.933 12.624 12.880   24.125 10.376 15.826 29.189   11.088 13.696 16.447 3.833
17  15.169 10.220 24.832 20.005   12.395 14.252 38.485 12.798   13.499 3.437 17.757 28.248   22.144 13.211 35.796 8.742
18  17.136 19.354 19.925 27.594   10.016 22.907 21.896 30.779   26.925 16.323 21.906 14.295   15.974 28.283 10.076 35.318
19  31.231 24.122 12.985 9.544   11.249 14.988 2.064 12.228   19.670 30.729 40.013 15.892   17.270 4.522 14.291 6.665
20  23.556 11.100 4.832 25.261   12.785 15.373 4.431 20.311   11.137 24.802 21.301 16.185   19.746 33.447 13.027 9.654
21  27.419 11.389 8.577 16.571   35.500 12.141 11.841 30.846   21.199 25.721 23.410 9.495   21.157 14.433 6.989 19.241
22  40.616 21.125 21.308 40.780   9.505 13.954 11.955 11.509   28.049 21.432 12.409 27.066   13.173 27.163 10.869 9.700
23  17.926 15.083 22.672 23.985   16.731 22.187 25.655 5.058   11.742 17.477 29.621 10.968   21.721 50.956 26.207 20.556
24  24.395 16.530 4.951 15.784   19.949 8.627 11.311 25.713   2.804 15.461 29.135 27.205   33.739 8.273 5.929 1.800
25  13.723 22.290 5.265 11.852   13.841 11.079 43.639 22.445   6.140 31.638 21.523 30.612   20.623 15.182 4.426 10.943
26  19.362 5.876 19.528 34.633   1.956 24.536 18.860 3.320   18.433 23.308 13.159 10.343   17.405 21.036 6.127 7.999
27  19.724 23.652 9.323 12.638   23.895 7.023 6.263 9.605   23.511 16.855 9.848 12.568   20.315 5.219 18.005 20.709
28  18.665 21.040 14.460 8.556   11.801 52.079 41.230 23.626   27.413 19.101 4.725 5.572   32.577 37.602 10.144 5.676
29  24.588 29.088 20.507 2.942   33.942 12.363 32.144 50.025   24.740 12.433 22.796 44.717   18.864 20.775 10.087 11.958
30  15.175 14.464 16.878 15.676   6.268 1.580 3.225 35.584   11.591 18.256 25.362 23.538   18.571 13.774 25.889 19.002
31  30.675 6.354 16.813 6.702   3.479 14.994 16.135 9.968   19.596 13.607 19.038 15.248   11.865 14.638 19.236 4.266
32  39.619 11.595 15.471 58.087   11.361 25.424 15.950 17.844   36.464 16.097 24.620 26.306   15.808 16.741 13.916 21.089
33  33.305 6.821 26.229 10.874   8.952 7.710 30.367 18.915   12.155 11.298 16.998 17.008   9.460 40.290 10.602 21.744
34  16.614 14.663 14.223 40.381   27.781 14.249 17.073 10.407   26.364 29.934 10.406 22.814   29.668 7.820 14.195 3.601
35  33.576 10.163 17.819 37.757   27.026 50.458 17.291 0.788   20.624 19.309 35.008 22.271   29.044 30.934 15.495 31.162
36  32.262 19.587 25.948 37.666   34.466 30.129 31.929 18.897   10.582 21.696 19.679 24.436   32.143 32.520 8.102 12.766
37  3.342 6.808 23.338 18.368   43.833 18.253 16.127 14.792   7.985 13.405 22.044 14.704   24.175 17.444 19.260 24.791
38  20.145 27.911 17.573 14.576   12.822 20.832 9.828 4.233   14.413 20.262 20.835 23.561   29.081 13.872 27.169 16.982
39  25.044 8.838 5.266 21.317   10.336 37.832 12.168 23.532   3.600 11.871 12.921 20.210   4.911 17.387 4.507 9.669
40  37.375 16.634 10.832 28.465   21.198 15.319 15.731 6.042   9.850 14.124 21.370 9.133   11.959 6.515 8.794 13.549
41  18.876 21.986 27.635 24.537   29.579 7.872 14.018 8.191   16.609 17.976 16.624 36.748   1.200 19.980 34.113 17.151
42  19.616 17.301 8.251 12.490   6.555 37.054 3.559 25.585   27.181 30.221 11.420 25.933   31.024 15.352 14.969 7.677
43  9.616 23.102 15.616 21.010   23.475 1.500 13.735 13.250   15.227 14.996 24.715 17.963   44.048 28.931 19.249 38.928
44  12.612 11.717 33.061 17.930   25.502 2.308 13.405 18.925   12.157 17.456 15.861 17.027   11.053 11.552 38.044 15.249
45  14.854 29.496 27.715 15.701   35.761 18.652 34.734 29.280   29.310 9.198 18.619 26.233   17.636 25.323 10.322 18.914
46  12.664 18.649 15.434 11.525   10.079 11.663 23.034 34.571   14.249 39.639 19.587 20.419   32.853 64.618 20.412 24.596
47  21.659 19.080 29.599 11.872   22.468 17.909 16.361 12.920   19.349 16.663 19.428 17.427   4.920 39.166 23.921 16.814
48  14.499 17.410 19.599 21.049   44.264 31.875 7.693 8.690   16.743 23.227 3.536 21.303   23.052 43.686 7.347 28.664
49  22.684 12.477 3.029 42.769   12.014 11.792 28.535 20.848   21.311 23.490 13.581 19.534   32.736 7.773 8.055 13.566
50  21.890 30.003 18.794 37.465   29.325 2.661 12.356 9.984   18.524 14.876 28.532 15.824   6.582 13.172 31.647 16.006
51  30.695 17.332 22.717 20.912   4.774 10.594 0.856 14.662   12.290 26.492 10.283 16.223   11.646 1.673 3.108 52.679
52  33.827 18.748 22.713 28.921   22.922 22.721 15.653 22.338   11.101 46.675 14.121 29.860   18.280 27.425 18.298 19.769
53  35.416 16.810 4.210 15.668   8.507 27.869 15.779 23.035   8.778 21.489 23.811 31.342   20.358 16.014 5.782 16.731
54  27.901 16.958 19.683 30.710   30.850 2.664 22.221 4.481   24.903 19.406 25.536 10.130   39.025 30.361 15.560 37.724
55  14.550 23.793 9.823 31.202   50.162 12.882 12.755 12.992   18.764 25.186 17.331 14.217   12.742 2.084 38.019 5.296
56  22.422 19.246 21.417 14.166   19.047 42.610 34.430 22.313   9.498 33.203 22.777 8.235   49.430 24.667 0.380 16.651
57  28.322 16.958 20.086 13.815   18.220 24.525 15.910 37.504   12.726 4.769 28.857 13.162   36.550 23.849 1.047 9.287
58  9.569 23.540 26.167 20.735   22.005 28.544 13.838 5.274   10.982 29.538 10.154 13.778   19.465 35.727 27.094 14.284
59  20.615 8.917 8.517 11.047   2.307 6.537 3.517 7.891   15.223 33.284 23.449 21.499   16.300 13.704 5.716 7.027
60  16.785 9.964 12.167 21.200   18.373 7.454 7.438 13.748   15.893 20.469 6.344 13.471   11.487 36.186 5.003 14.235
61  18.368 15.958 14.226 16.856   27.359 25.609 3.875 11.765   13.059 21.039 11.988 8.115   39.276 29.143 16.891 8.663
62  38.602 23.504 14.862 25.546   7.455 51.374 10.873 7.407   14.163 26.675 10.871 8.453   34.337 16.947 8.480 0.798
63  21.746 20.775 31.508 28.565   2.881 29.314 12.430 48.225   9.023 26.267 16.444 9.299   23.006 40.791 12.309 11.632
64  21.059 12.820 17.736 10.499   22.364 9.900 6.853 21.148   14.171 30.395 22.324 31.160   14.575 5.608 5.080 14.629
65  19.831 7.827 24.215 23.906   1.028 28.270 23.986 6.601   12.736 15.255 19.930 15.168   8.439 3.988 28.323 22.993
66  36.959 13.276 36.402 9.165   11.972 5.620 16.425 7.460   4.321 13.054 17.048 30.995   13.534 13.869 32.083 26.505
67  17.710 13.047 35.132 18.428   15.074 3.027 14.660 51.700   25.234 19.298 28.296 23.634   4.723 9.358 9.272 22.753
68  15.529 18.240 30.723 28.835   12.057 34.916 19.020 35.022   17.548 13.212 17.031 16.453   4.040 9.157 27.551 10.062
69  17.765 18.189 23.449 33.362   13.846 10.747 14.956 10.731   14.338 13.230 34.746 17.591   22.453 19.035 56.953 20.425
70  23.227 8.882 24.438 26.221   32.827 25.665 26.792 2.674   22.331 33.089 19.711 15.343   23.190 14.786 33.047 26.668
71  20.636 22.116 11.624 28.435   7.737 24.148 22.247 19.710   16.782 8.444 19.886 10.242   42.634 15.575 22.728 8.579
72  19.732 14.299 13.081 23.859   21.329 19.980 1.858 30.081   14.220 38.657 10.835 25.273   30.280 27.094 22.872 22.838
73  9.509 13.157 17.052 27.019   1.342 25.401 33.560 1.125   18.545 22.870 29.095 13.461   14.027 13.605 7.367 25.439
74  14.974 26.211 42.520 23.729   15.578 14.502 11.419 26.794   30.145 29.312 18.179 21.311   15.291 36.350 25.147 25.947
75  16.660 21.390 27.406 6.111   7.758 23.817 42.886 20.061   19.621 19.944 4.077 22.783   4.556 9.145 13.794 17.582
76  11.139 29.686 16.832 32.660   5.456 6.209 23.234 25.747   10.235 4.671 20.213 16.267   16.076 2.688 18.933 12.205
77  16.319 9.811 23.322 44.677   19.176 28.687 20.477 5.252   10.775 12.715 14.047 16.592   37.561 9.483 17.190 41.916
78  34.205 23.391 40.463 29.843   13.286 4.354 18.107 22.929   28.185 17.960 10.555 13.248   14.026 22.938 24.117 12.532
79  26.144 16.980 21.482 24.225   7.645 20.538 34.328 22.177   22.306 9.699 19.331 13.026   13.393 9.807 0.535 3.510
80  19.906 31.619 14.614 21.381   34.509 10.746 14.834 32.312   30.146 35.394 12.093 22.477   14.681 10.975 8.032 14.903

Table A.55: The average standard deviation (SD) for θ̂ for the scenarios with N = 80, M = 42 and K = 4 using the penalized perplexity for model selection.

Level of Nesting
Visitor  Mild: θ̂1 θ̂2 θ̂3 θ̂4   Moderate: θ̂1 θ̂2 θ̂3 θ̂4   Extreme: θ̂1 θ̂2 θ̂3 θ̂4   Full: θ̂1 θ̂2 θ̂3 θ̂4
1  0.424 0.411 0.458 0.571   0.439 0.539 0.409 0.408   0.291 0.549 0.464 0.512   0.389 0.394 0.401 0.497
2  0.489 0.386 0.485 0.532   0.474 0.532 0.465 0.395   0.331 0.510 0.493 0.514   0.430 0.442 0.383 0.483
3  0.478 0.443 0.474 0.513   0.483 0.543 0.413 0.400   0.323 0.506 0.520 0.508   0.418 0.441 0.370 0.496
4  0.524 0.421 0.478 0.501   0.385 0.546 0.450 0.491   0.361 0.482 0.466 0.534   0.415 0.463 0.408 0.443
5  0.492 0.377 0.486 0.549   0.485 0.486 0.486 0.406   0.361 0.498 0.459 0.537   0.443 0.461 0.367 0.437
6  0.496 0.457 0.451 0.513   0.476 0.483 0.478 0.459   0.394 0.510 0.464 0.483   0.449 0.461 0.379 0.415
7  0.489 0.426 0.518 0.488   0.469 0.556 0.375 0.447   0.378 0.493 0.475 0.509   0.433 0.459 0.358 0.475
8  0.527 0.416 0.477 0.493   0.444 0.534 0.440 0.468   0.356 0.483 0.508 0.503   0.421 0.443 0.366 0.457
9  0.509 0.432 0.516 0.464   0.406 0.501 0.522 0.461   0.394 0.440 0.486 0.521   0.474 0.464 0.337 0.433
10  0.464 0.430 0.494 0.528   0.447 0.511 0.460 0.452   0.347 0.471 0.498 0.516   0.434 0.473 0.371 0.414
11  0.508 0.419 0.477 0.512   0.478 0.517 0.430 0.453   0.352 0.458 0.531 0.493   0.427 0.518 0.343 0.413
12  0.519 0.379 0.504 0.516   0.490 0.488 0.472 0.439   0.361 0.507 0.495 0.479   0.443 0.548 0.332 0.423
13  0.508 0.437 0.490 0.481   0.477 0.457 0.508 0.456   0.359 0.490 0.482 0.508   0.461 0.417 0.395 0.429
14  0.465 0.443 0.516 0.494   0.560 0.523 0.391 0.424   0.351 0.502 0.551 0.450   0.454 0.463 0.348 0.419
15  0.471 0.437 0.444 0.565   0.477 0.486 0.477 0.445   0.395 0.486 0.491 0.464   0.444 0.481 0.384 0.384
16  0.525 0.418 0.467 0.516   0.424 0.508 0.482 0.476   0.397 0.488 0.517 0.471   0.371 0.503 0.372 0.458
17  0.494 0.431 0.475 0.511   0.530 0.487 0.458 0.411   0.370 0.473 0.556 0.446   0.411 0.498 0.380 0.449
18  0.457 0.421 0.511 0.536   0.464 0.507 0.470 0.469   0.303 0.526 0.489 0.525   0.445 0.524 0.376 0.422
19  0.477 0.416 0.525 0.503   0.450 0.497 0.503 0.416   0.379 0.521 0.500 0.431   0.465 0.507 0.365 0.419
20  0.503 0.425 0.494 0.495   0.476 0.490 0.441 0.486   0.387 0.424 0.495 0.524   0.448 0.537 0.372 0.366
21  0.518 0.414 0.518 0.483   0.489 0.454 0.488 0.483   0.354 0.475 0.511 0.524   0.423 0.510 0.362 0.419
22  0.508 0.438 0.497 0.486   0.457 0.449 0.523 0.448   0.397 0.502 0.503 0.472   0.423 0.501 0.411 0.389
23  0.503 0.459 0.473 0.492   0.455 0.539 0.439 0.440   0.385 0.505 0.473 0.479   0.455 0.492 0.355 0.411
24  0.512 0.451 0.480 0.483   0.413 0.541 0.460 0.489   0.385 0.469 0.517 0.480   0.475 0.476 0.401 0.414
25  0.492 0.469 0.502 0.464   0.451 0.490 0.467 0.487   0.376 0.472 0.516 0.482   0.461 0.433 0.367 0.417
26  0.460 0.445 0.530 0.495   0.426 0.566 0.438 0.448   0.370 0.480 0.541 0.450   0.476 0.454 0.374 0.436
27  0.497 0.442 0.488 0.502   0.502 0.461 0.484 0.432   0.368 0.485 0.517 0.489   0.476 0.448 0.370 0.412
28  0.437 0.441 0.519 0.527   0.517 0.480 0.439 0.462   0.394 0.496 0.515 0.448   0.457 0.468 0.367 0.461
29  0.512 0.428 0.495 0.484   0.484 0.509 0.493 0.439   0.408 0.468 0.498 0.465   0.472 0.485 0.347 0.430
30  0.538 0.434 0.480 0.492   0.441 0.506 0.412 0.498   0.387 0.506 0.488 0.481   0.444 0.489 0.403 0.409
31  0.468 0.499 0.429 0.514   0.383 0.565 0.490 0.431   0.358 0.482 0.515 0.503   0.493 0.466 0.377 0.425
32  0.475 0.389 0.540 0.512   0.426 0.554 0.478 0.412   0.408 0.473 0.535 0.483   0.456 0.516 0.382 0.407
33  0.443 0.481 0.487 0.508   0.489 0.476 0.473 0.451   0.398 0.467 0.513 0.500   0.439 0.443 0.382 0.434
34  0.458 0.403 0.521 0.538   0.429 0.519 0.474 0.485   0.377 0.459 0.535 0.509   0.437 0.475 0.392 0.455
35  0.491 0.376 0.534 0.530   0.423 0.518 0.487 0.459   0.389 0.525 0.459 0.495   0.452 0.517 0.358 0.440
36  0.494 0.448 0.515 0.485   0.471 0.525 0.454 0.450   0.418 0.500 0.479 0.467   0.467 0.455 0.377 0.444
37  0.491 0.471 0.480 0.487   0.518 0.481 0.466 0.449   0.388 0.495 0.479 0.503   0.472 0.470 0.387 0.442
38  0.491 0.434 0.462 0.534   0.554 0.450 0.485 0.410   0.392 0.502 0.507 0.481   0.385 0.502 0.417 0.473
39  0.437 0.452 0.525 0.505   0.465 0.448 0.510 0.472   0.416 0.496 0.506 0.466   0.464 0.489 0.364 0.450
40  0.490 0.465 0.464 0.506   0.485 0.522 0.486 0.436   0.393 0.489 0.527 0.477   0.460 0.515 0.365 0.413
41  0.491 0.415 0.534 0.490   0.421 0.534 0.504 0.427   0.418 0.491 0.498 0.479   0.435 0.550 0.345 0.424
42  0.494 0.450 0.495 0.498   0.482 0.512 0.458 0.453   0.365 0.512 0.473 0.517   0.453 0.513 0.380 0.426
43  0.486 0.433 0.487 0.513   0.510 0.470 0.474 0.429   0.424 0.512 0.474 0.479   0.466 0.524 0.361 0.435
44  0.479 0.433 0.497 0.531   0.525 0.503 0.504 0.419   0.419 0.516 0.463 0.466   0.410 0.567 0.408 0.433
45  0.508 0.446 0.483 0.496   0.404 0.560 0.407 0.492   0.433 0.488 0.509 0.465   0.496 0.464 0.392 0.428
46  0.481 0.500 0.499 0.469   0.470 0.534 0.464 0.447   0.404 0.468 0.473 0.530   0.461 0.519 0.364 0.430
47  0.443 0.507 0.482 0.494   0.485 0.505 0.473 0.476   0.423 0.509 0.494 0.480   0.465 0.508 0.424 0.412
48  0.479 0.479 0.511 0.463   0.414 0.490 0.470 0.525   0.417 0.496 0.478 0.499   0.484 0.452 0.399 0.468
49  0.504 0.406 0.534 0.488   0.480 0.511 0.479 0.461   0.396 0.500 0.486 0.499   0.476 0.481 0.426 0.437
50  0.517 0.432 0.474 0.510   0.432 0.480 0.546 0.466   0.387 0.490 0.510 0.506   0.411 0.538 0.395 0.446
51  0.486 0.479 0.438 0.529   0.395 0.518 0.560 0.448   0.437 0.492 0.497 0.457   0.397 0.533 0.369 0.503
52  0.462 0.474 0.480 0.513   0.448 0.525 0.518 0.422   0.384 0.498 0.523 0.491   0.421 0.523 0.411 0.423
53  0.505 0.419 0.562 0.446   0.443 0.525 0.510 0.441   0.413 0.477 0.487 0.515   0.455 0.499 0.452 0.415
54  0.476 0.458 0.491 0.524   0.516 0.515 0.458 0.445   0.423 0.507 0.493 0.469   0.473 0.482 0.442 0.467
55  0.510 0.381 0.518 0.528   0.411 0.508 0.506 0.500   0.432 0.519 0.454 0.510   0.468 0.497 0.400 0.454
56  0.445 0.486 0.515 0.503   0.462 0.500 0.482 0.493   0.441 0.468 0.530 0.475   0.454 0.508 0.377 0.490
57  0.483 0.437 0.521 0.487   0.491 0.530 0.457 0.436   0.418 0.483 0.487 0.514   0.444 0.549 0.365 0.442
58  0.528 0.428 0.478 0.504   0.497 0.489 0.516 0.435   0.429 0.487 0.497 0.516   0.465 0.490 0.417 0.465
59  0.515 0.380 0.513 0.524   0.410 0.526 0.474 0.487   0.471 0.477 0.462 0.489   0.481 0.477 0.410 0.505
60  0.510 0.447 0.487 0.501   0.468 0.472 0.512 0.484   0.429 0.527 0.504 0.461   0.522 0.470 0.388 0.473
61  0.503 0.471 0.460 0.525   0.500 0.529 0.430 0.472   0.371 0.509 0.501 0.519   0.469 0.566 0.423 0.417
62  0.462 0.446 0.519 0.530   0.502 0.498 0.461 0.440   0.425 0.496 0.460 0.517   0.492 0.477 0.389 0.459
63  0.516 0.419 0.529 0.466   0.542 0.495 0.467 0.441   0.446 0.514 0.516 0.453   0.471 0.458 0.446 0.478
64  0.474 0.450 0.481 0.547   0.422 0.546 0.515 0.438   0.392 0.506 0.519 0.495   0.412 0.577 0.410 0.462
65  0.488 0.431 0.513 0.526   0.460 0.494 0.450 0.519   0.430 0.499 0.479 0.503   0.521 0.444 0.417 0.492
66  0.514 0.474 0.452 0.517   0.401 0.554 0.548 0.451   0.456 0.537 0.455 0.466   0.484 0.509 0.439 0.442
67  0.474 0.496 0.475 0.498   0.416 0.530 0.536 0.455   0.464 0.514 0.458 0.501   0.469 0.525 0.435 0.467
68  0.475 0.470 0.482 0.521   0.490 0.553 0.476 0.410   0.418 0.509 0.515 0.494   0.506 0.489 0.470 0.426
69  0.519 0.436 0.504 0.513   0.479 0.516 0.483 0.447   0.416 0.519 0.502 0.504   0.515 0.463 0.474 0.433
70  0.504 0.458 0.486 0.515   0.483 0.483 0.460 0.506   0.426 0.517 0.485 0.499   0.470 0.550 0.452 0.429
71  0.458 0.460 0.518 0.517   0.367 0.600 0.459 0.495   0.451 0.504 0.485 0.494   0.477 0.429 0.429 0.544
72  0.498 0.462 0.447 0.539   0.448 0.556 0.457 0.495   0.420 0.535 0.514 0.438   0.483 0.503 0.425 0.441
73  0.539 0.423 0.495 0.511   0.476 0.541 0.475 0.468   0.459 0.521 0.481 0.477   0.429 0.467 0.483 0.465
74  0.496 0.471 0.494 0.501   0.478 0.511 0.455 0.493   0.402 0.505 0.539 0.498   0.517 0.534 0.419 0.421
75  0.473 0.501 0.463 0.522   0.437 0.530 0.503 0.464   0.407 0.534 0.501 0.473   0.486 0.523 0.346 0.508
76  0.468 0.496 0.470 0.528   0.457 0.556 0.500 0.443   0.467 0.517 0.476 0.465   0.497 0.483 0.433 0.485
77  0.470 0.458 0.502 0.521   0.462 0.539 0.495 0.466   0.435 0.497 0.534 0.460   0.481 0.512 0.467 0.441
78  0.462 0.470 0.541 0.488   0.490 0.526 0.435 0.495   0.402 0.493 0.525 0.496   0.447 0.569 0.415 0.479
79  0.496 0.440 0.497 0.520   0.468 0.508 0.503 0.473   0.430 0.476 0.508 0.507   0.447 0.490 0.450 0.454
80  0.494 0.478 0.471 0.517   0.448 0.542 0.492 0.469   0.425 0.551 0.475 0.477   0.458 0.531 0.409 0.500

A.2 Avon Gorge dataset results

This section presents the Avon Gorge results for analysis 1. Subsection A.2.1 gives the results obtained with the penalized perplexity model selection criterion, and Subsection A.2.2 gives the results obtained with the IC model selection criterion. Each subsection reports the same types of results that the body of the thesis presents for analysis 2 under penalized perplexity: visual webs arranged according to the final θ and β estimates from the 100 runs, the θ and β estimates themselves, and a scatterplot of the β estimates for each group exposed by LDA.
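The tables in this section report estimates averaged over only those runs that selected a given number of interaction groups (for example, the runs with K̂ = 2). The following is a minimal sketch of that averaging step, not the code used in the thesis. The function name, the data layout and the toy numbers are all hypothetical, and the sketch assumes that group labels have already been aligned across runs.

```python
from statistics import mean


def average_over_selected_runs(runs, k_target):
    """Average theta estimates over the runs whose selected K equals k_target.

    runs: list of (k_hat, theta) pairs; theta is a list of rows, one row per
          visitor species and one entry per interaction group.
    Returns (number of runs used, averaged theta).
    Assumption: group labels are already aligned across runs.
    """
    # Keep only the runs that selected the requested number of groups.
    kept = [theta for k_hat, theta in runs if k_hat == k_target]
    if not kept:
        raise ValueError("no run selected k_target groups")
    n_rows = len(kept[0])
    # Entry-wise mean across the retained runs.
    averaged = [
        [mean(theta[i][k] for theta in kept) for k in range(k_target)]
        for i in range(n_rows)
    ]
    return len(kept), averaged


# Toy data (hypothetical): three runs, two of which selected K = 2.
runs = [
    (2, [[1.0, 0.0], [0.6, 0.4]]),
    (2, [[0.8, 0.2], [0.4, 0.6]]),
    (3, [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]),
]
n_used, theta_bar = average_over_selected_runs(runs, k_target=2)
```

Because each retained row is a probability distribution over the same groups, each averaged row also sums to one. Without aligned group labels the entry-wise mean would mix different groups, which is why the alignment assumption matters.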

A.2.1 Avon Gorge data results for analysis 1 using penalized perplexity

Table A.56: Estimated plant visitation distributions for each interaction group, β_k, averaged over the 84 runs with K̂ = 2, for analysis 1 of LDA with a Gibbs sampler, using PP for model selection.

                            Interaction Group
Plant Species               1       2
Scabiosa columbaria         0.092   0.000
Rubus fruticosus            0.071   0.000
Geranium robertianum        0.055   0.000
Solidago virgaurea          0.057   0.000
Rosa canina                 0.045   0.009
Origanum vulgare            0.046   0.000
Crepis sp.                  0.042   0.000
Clematis vitalba            0.038   0.000
Senecio sp1                 0.041   0.000
Helianthemum nunmularium    0.039   0.000
Cotoneaster franchetti      0.032   0.000
Ligustrum vulgare           0.027   0.000
Centhrantus ruber           0.025   0.002
Achillea millefolium        0.024   0.000
Teucrium scorodonia         0.024   0.000
Pilosella officinarum       0.023   0.000
Hieracium sp.               0.023   0.000
Torilis japonica            0.019   0.000
Smyrnium olusatrum          0.017   0.006
Clinopodium ascendens       0.017   0.000
Hypericum perfuratum        0.016   0.000
Hedera helix                0.018   0.005
Cratageus monogyna          0.009   0.006
Rosa arvensis               0.011   0.000
Allium sphaerocephalon      0.014   0.000
Mycelis muralis             0.011   0.001
Eupatorium cannabinum       0.010   0.000
Lactuca serriola            0.010   0.000
Sonchus arvensis            0.009   0.000
Solanum dulcamara           0.009   0.000
Gallium verum               0.007   0.005
Viburnum lantana            0.007   0.004
Lotus corniculatus          0.006   0.000
Thymus polytrichus          0.004   0.001
Melampyrum pratense         0.005   0.000
Cornus sanguinea            0.005   0.003
Sisymbrium officinale       0.003   0.000
Senecio jacobaea            0.004   0.000
Hippocrepis camosa          0.004   0.000
Clinopodium acinos          0.004   0.000
Carlina vulgaris            0.004   0.000
Medicago lupina             0.002   0.000
Plantago lanceolata         0.002   0.001
Ranunculus sp1              0.028   0.164
Trinia glauca               0.005   0.285
Ulex europaeus              0.015   0.022
Potentilla sterilis         0.003   0.161
Sorbus sp.                  0.002   0.124
Cotoneaster horizontalis    0.002   0.094
Apiaceae sp.                0.007   0.011
Sanguisorba minor           0.001   0.060
Verbascum thapsus           0.003   0.005
Asteracea sp13              0.003   0.016
Bellis perenis              0.002   0.010

Table A.57: Estimated group membership distributions for each visitor species, θ^a, averaged over 84 independent runs for analysis 1 of LDA where K̂ = 2 with a Gibbs sampler using PP.

                               Interaction Group
Visitor species                1       2
Bombus pascuorum               1.000   0.000
Anaspis pulicaria              0.986   0.014
Formica fusca                  0.841   0.159
Meligethes solidus             0.893   0.107
Episyrphus balteatus           1.000   0.000
Lasioglossum morio             0.999   0.001
Lasius alienus cf              0.568   0.432
Halictus tumulorum             0.998   0.002
Meligethes aeneus              0.996   0.004
Platycheirus albimanus         1.000   0.000
Syrphus ribesii                0.998   0.002
Meligethes obscurus            0.561   0.439
Paragus haemorrhous            0.997   0.003
Sphaerophoria scripta cf       1.000   0.000
Melanostoma scalare            1.000   0.000
Apion exapion                  0.881   0.119
Oedemera lurida                0.941   0.059
Heterostylodes nominabilis     0.982   0.018
Cteniopus sulphureus           1.000   0.000
Hoplitis spinulosus            1.000   0.000
Lasioglossum smacthmanellum    0.980   0.020
heteroptera sp7                1.000   0.000
Bombus lapidarius              1.000   0.000
Eristalis tenax cf             1.000   0.000
Lasioglossum villosulum        1.000   0.000
Phyllobius roboretanus         0.208   0.792
Isomira murina                 0.942   0.058
Eriothrix rufomaculata         1.000   0.000
Bombus pratorum                0.996   0.004
Apis melliferae                1.000   0.000
Melanostoma millinum           0.993   0.007
Helophilus pendulus            0.985   0.015
Epistrophe grossularia         1.000   0.000
Bombus lucorum.terrestris      1.000   0.000
Myrmecoris gracilis            0.517   0.483
Lasioglossum calceatum         1.000   0.000
Heteroptera sp5                1.000   0.000
Thysanoptera sp3               0.798   0.202
Myrmica sp                     0.667   0.333
Delia sp                       0.995   0.005
Thysanoptera sp1               0.840   0.160
Thysanoptera sp                1.000   0.000
Syrphus vitripennis            1.000   0.000
Sarcophaga nigriventris        1.000   0.000

Table A.58: Continuation of the estimated group membership distributions θ^a for each visitor species, averaged over 84 independent runs for analysis 1 of LDA where K̂ = 2 with a Gibbs sampler using PP.

                             Interaction group
Visitor species              1       2
Pegoplata aestiva            1.000   0.000
Meliscaeva auricollis        1.000   0.000
heteroptera sp9              1.000   0.000
Fernandea cuprea             1.000   0.000
Eristalis arbustorum         1.000   0.000
Oedemera nobilis cf          1.000   0.000
Melangyna umbellatarum       1.000   0.000
Discomyza incurva            0.889   0.111
Vespula vulgaris             0.976   0.024
Syritta pipiens              1.000   0.000
Sphecodes niger              1.000   0.000
Sarcophaga spp               0.988   0.012
Rhinophora lepida            1.000   0.000
Platycheirus scutatus        1.000   0.000
Lasioglossum leucopus        1.000   0.000
Lasiambia palposa            0.996   0.004
Eupeodes luniger             1.000   0.000
Eristalis interruptus        1.000   0.000
Byturus tomentosus           0.984   0.016
Bombylius major              1.000   0.000
Polyommatus icarus           0.997   0.003
Bombus hortorum              1.000   0.000
Pyronia tithonus             1.000   0.000
Lucilia illustris            1.000   0.000
Lasioglossum fulvicorne      1.000   0.000
Syrphus torvus               1.000   0.000
Strangalia maculata          1.000   0.000
Solieria pacifica            0.946   0.054
Sarcophaga subvicina         1.000   0.000
Platycheirus clypeatus       0.589   0.411
Neomyia cornicina            1.000   0.000
Myathropa florea             0.982   0.018
Meligethes flavimanus        0.988   0.012
Leptothorax sp               0.203   0.797
Hylaeus communis             1.000   0.000
Hoplitis claviventris        1.000   0.000
Eusphalerum luteum           1.000   0.000
Cetonia aurata               0.940   0.060
Ceratina cyanea              1.000   0.000
aphid sp1                    0.988   0.012
Anthonomus rubi cf           1.000   0.000
heteroptera sp8              1.000   0.000
heteroptera sp2              1.000   0.000
Bombus vestalis              1.000   0.000
Pieris napi                  1.000   0.000

[Figure: presence/absence grid with visitor species on one axis and plant species on the other.]

Figure A.22: Presence/absence visualization of Avon Gorge data with rare visits included, but plants/visitors with single counts removed for analysis 1 (N = 89, M = 54).

[Figure: two panels (Group 1, Group 2), each plotting estimated probabilities against plant species index.]

Figure A.23: Estimated visitation distribution by interaction group, averaged over 84 runs for K̂ = 2 in analysis 1 using penalized perplexity for model selection. Refer to Table 5.3 for plant species names.

A.2.2 Avon Gorge data results for analysis 1 using IC model selection criterion

[Figure: presence/absence grid with visitor species on one axis and plant species on the other.]

Figure A.24: Presence/absence visualization of Avon Gorge data with rare visits included, but plants/visitors with single counts removed for analysis 1 (N = 89, M = 54).

Table A.59: Estimated plant visitation distributions for each interaction group, averaged over 100 independent runs of LDA with a Gibbs sampler and six interaction groups.

                          Interaction group
Plant species             1       2       3       4       5       6
Scabiosa columbaria       0.107   0.016   0.003   0.001   0.000   0.001
Rubus fruticosus          0.080   0.065   0.036   0.008   0.000   0.001
Geranium robertianum      0.065   0.001   0.000   0.000   0.000   0.000
Rosa canina               0.046   0.005   0.029   0.036   0.008   0.029
Origanum vulgare          0.054   0.002   0.000   0.000   0.000   0.000
Crepis sp.                0.047   0.013   0.007   0.000   0.000   0.006
Clematis vitalba          0.044   0.002   0.003   0.000   0.000   0.002
Senecio sp1               0.042   0.040   0.022   0.004   0.000   0.013
Helianthemum nunmularium  0.043   0.020   0.004   0.000   0.000   0.002
Cotoneaster franchetti    0.034   0.001   0.000   0.024   0.011   0.000
Ligustrum vulgare         0.032   0.001   0.000   0.000   0.000   0.000
Teucrium scorodonia       0.028   0.005   0.000   0.000   0.000   0.000
Pilosella officinarum     0.027   0.001   0.000   0.000   0.000   0.000
Hieracium sp.             0.027   0.001   0.000   0.000   0.000   0.000
Torilis japonica          0.023   0.001   0.000   0.000   0.000   0.000
Clinopodium ascendens     0.020   0.001   0.000   0.000   0.000   0.000
Hypericum perfuratum      0.019   0.001   0.003   0.000   0.000   0.000
Rosa arvensis             0.013   0.001   0.000   0.000   0.000   0.000
Allium sphaerocephalon    0.016   0.001   0.002   0.000   0.001   0.000
Sonchus arvensis          0.010   0.001   0.000   0.000   0.000   0.000
Lotus corniculatus        0.008   0.001   0.000   0.000   0.000   0.000
Thymus polytrichus        0.005   0.003   0.002   0.000   0.003   0.000
Hippocrepis camosa        0.005   0.001   0.004   0.000   0.000   0.000
Clinopodium acinos        0.005   0.001   0.000   0.001   0.001   0.000
Medicago lupina           0.002   0.001   0.000   0.000   0.000   0.002
Centhrantus ruber         0.025   0.052   0.031   0.004   0.006   0.037
Achillea millefolium      0.019   0.123   0.047   0.000   0.000   0.035
Eupatorium cannabinum     0.007   0.075   0.043   0.010   0.000   0.001
Lactuca serriola          0.010   0.017   0.003   0.000   0.000   0.001
Solanum dulcamara         0.010   0.011   0.000   0.000   0.000   0.000
Melampyrum pratense       0.005   0.021   0.000   0.004   0.000   0.000
Sanguisorba minor         0.000   0.098   0.006   0.003   0.069   0.000
Sisymbrium officinale     0.002   0.028   0.013   0.001   0.000   0.005
Senecio jacobaea          0.004   0.015   0.007   0.001   0.000   0.000
Carlina vulgaris          0.005   0.012   0.000   0.003   0.000   0.000
Plantago lanceolata       0.002   0.020   0.004   0.001   0.002   0.002
Solidago virgaurea        0.058   0.148   0.186   0.036   0.016   0.052
Ranunculus sp1            0.004   0.112   0.419   0.149   0.000   0.058
Mycelis muralis           0.012   0.008   0.013   0.001   0.002   0.002
Verbascum thapsus         0.002   0.008   0.017   0.003   0.000   0.001
Asteracea sp13            0.000   0.015   0.038   0.016   0.000   0.006
Bellis perenis            0.000   0.012   0.026   0.012   0.000   0.003
Ulex europaeus            0.011   0.006   0.000   0.108   0.012   0.006
Smyrnium olusatrum        0.005   0.008   0.014   0.188   0.006   0.045
Cratageus monogyna        0.002   0.007   0.004   0.138   0.005   0.016
Viburnum lantana          0.001   0.002   0.000   0.097   0.009   0.005
Cornus sanguinea          0.002   0.001   0.001   0.056   0.004   0.004
Trinia glauca             0.001   0.003   0.000   0.021   0.358   0.001
Potentilla sterilis       0.000   0.007   0.000   0.000   0.239   0.000
Sorbus sp.                0.000   0.001   0.000   0.007   0.163   0.000
Cotoneaster horizontalis  0.000   0.001   0.000   0.051   0.071   0.000
Hedera helix              0.004   0.001   0.000   0.003   0.002   0.373
Gallium verum             0.003   0.001   0.002   0.002   0.003   0.119
Apiaceae sp.              0.001   0.003   0.000   0.006   0.004   0.163

Table A.60: Estimated group membership distributions θ^a for each visitor species, averaged over 100 independent runs of LDA with a Gibbs sampler and six interaction groups.

                             Interaction group
Visitor species              1       2       3       4       5       6
Bombus pascuorum             0.986   0.008   0.000   0.005   0.000   0.000
Anaspis pulicaria            0.891   0.008   0.014   0.059   0.013   0.015
Formica fusca                0.458   0.002   0.002   0.184   0.154   0.199
Meligethes solidus           0.493   0.080   0.274   0.112   0.000   0.040
Episyrphus balteatus         0.990   0.002   0.001   0.000   0.000   0.007
Lasioglossum morio           0.975   0.000   0.000   0.001   0.000   0.024
Lasius alienus cf            0.413   0.007   0.001   0.093   0.486   0.001
Halictus tumulorum           0.877   0.064   0.034   0.001   0.000   0.024
Meligethes aeneus            0.926   0.004   0.001   0.058   0.004   0.008
Platycheirus albimanus       0.977   0.016   0.005   0.001   0.000   0.001
Syrphus ribesii              0.922   0.031   0.007   0.001   0.001   0.039
Meligethes obscurus          0.462   0.007   0.000   0.001   0.530   0.000
Paragus haemorrhous          0.892   0.035   0.019   0.000   0.007   0.046
Sphaerophoria scripta cf     0.891   0.071   0.019   0.001   0.000   0.018
Melanostoma scalare          0.994   0.003   0.003   0.000   0.000   0.001
Apion exapion                0.550   0.010   0.000   0.370   0.060   0.010
Oedemera lurida              0.813   0.092   0.028   0.001   0.050   0.016
Heterostylodes nominabilis   0.918   0.013   0.046   0.016   0.000   0.006
Cteniopus sulphureus         0.997   0.000   0.003   0.000   0.000   0.000
Hoplitis spinulosus          1.000   0.000   0.000   0.000   0.000   0.000
Lasioglossum smacthmanellum  0.907   0.017   0.052   0.017   0.000   0.007
Heteroptera sp7              0.974   0.018   0.003   0.000   0.000   0.005
Bombus lapidarius            0.986   0.004   0.001   0.009   0.000   0.000
Eristalis tenax cf           0.929   0.046   0.018   0.005   0.000   0.001
Lasioglossum villosulum      0.999   0.000   0.000   0.000   0.000   0.000
Phyllobius roboretanus       0.027   0.101   0.004   0.012   0.855   0.000
Isomira murina               0.139   0.012   0.002   0.698   0.062   0.089
Eriothrix rufomaculata       0.774   0.129   0.066   0.007   0.000   0.024
Bombus pratorum              0.962   0.002   0.004   0.023   0.004   0.004
Apis melliferae              0.959   0.020   0.020   0.000   0.000   0.000
Melanostoma millinum         0.771   0.149   0.044   0.003   0.006   0.027
Helophilus pendulus          0.861   0.002   0.007   0.006   0.004   0.120
Epistrophe grossularia       0.992   0.004   0.000   0.000   0.000   0.003
Bombus lucorum.terrestris    0.992   0.004   0.003   0.000   0.000   0.000
Myrmecoris gracilis          0.378   0.008   0.012   0.030   0.560   0.012
Lasioglossum calceatum       0.996   0.000   0.003   0.000   0.000   0.000
Heteroptera sp5              0.911   0.078   0.010   0.000   0.000   0.000
Thysanoptera sp3             0.100   0.140   0.510   0.180   0.000   0.070
Myrmica sp                   0.498   0.020   0.002   0.000   0.392   0.088
Delia sp                     0.743   0.084   0.030   0.096   0.002   0.044
Thysanoptera sp1             0.691   0.066   0.024   0.012   0.180   0.026
Thysanoptera sp              0.739   0.150   0.070   0.010   0.000   0.030
Syrphus vitripennis          0.976   0.005   0.010   0.000   0.000   0.008
Sarcophaga nigriventris      0.822   0.113   0.050   0.010   0.003   0.003
Pegoplata aestiva            0.989   0.010   0.000   0.000   0.000   0.000
Meliscaeva auricollis        0.984   0.010   0.005   0.000   0.000   0.000
heteroptera sp9              0.609   0.240   0.120   0.030   0.000   0.000
Fernandea cuprea             0.989   0.000   0.010   0.000   0.000   0.000
Eristalis arbustorum         0.769   0.140   0.058   0.000   0.000   0.033
Oedemera nobilis cf          0.924   0.048   0.010   0.000   0.000   0.018
Melangyna umbellatarum       0.979   0.000   0.005   0.013   0.003   0.000
Discomyza incurva            0.117   0.000   0.000   0.047   0.040   0.795
Vespula vulgaris             0.197   0.000   0.000   0.010   0.014   0.779
Syritta pipiens              0.689   0.210   0.070   0.007   0.000   0.024
Sphecodes niger              0.755   0.127   0.080   0.000   0.000   0.037
Sarcophaga spp               0.320   0.167   0.157   0.240   0.020   0.097
Rhinophora lepida            0.564   0.260   0.105   0.005   0.000   0.065
Platycheirus scutatus        0.962   0.007   0.030   0.000   0.000   0.000
Lasioglossum leucopus        0.998   0.000   0.000   0.000   0.000   0.000
Lasiambia palposa            0.945   0.020   0.020   0.000   0.014   0.000
Eupeodes luniger             0.812   0.150   0.037   0.000   0.000   0.000
Eristalis interruptus        0.410   0.200   0.230   0.070   0.020   0.070
Byturus tomentosus           0.516   0.113   0.074   0.247   0.014   0.037
Bombylius major              0.985   0.007   0.004   0.000   0.000   0.004
Polyommatus icarus           0.991   0.005   0.003   0.000   0.000   0.000
Bombus hortorum              0.996   0.003   0.000   0.000   0.000   0.000
Pyronia tithonus             0.998   0.000   0.000   0.000   0.000   0.000
Lucilia illustris            0.819   0.133   0.044   0.000   0.000   0.004
Lasioglossum fulvicorne      0.586   0.180   0.140   0.020   0.017   0.057
Syrphus torvus               0.998   0.000   0.000   0.000   0.000   0.000
Strangalia maculata          0.688   0.190   0.080   0.030   0.000   0.010
Solieria pacifica            0.265   0.160   0.050   0.015   0.015   0.494
Sarcophaga subvicina         0.369   0.230   0.230   0.080   0.010   0.080
Platycheirus clypeatus       0.439   0.145   0.025   0.010   0.369   0.010
Neomyia cornicina            0.389   0.240   0.200   0.080   0.020   0.070
Myathropa florea             0.315   0.075   0.035   0.015   0.030   0.529
Meligethes flavimanus        0.923   0.005   0.015   0.030   0.010   0.015
Leptothorax sp               0.020   0.010   0.000   0.060   0.908   0.000
Hylaeus communis             0.893   0.050   0.035   0.000   0.000   0.020
Hoplitis claviventris        0.938   0.050   0.010   0.000   0.000   0.000
Eusphalerum luteum           0.614   0.235   0.120   0.030   0.000   0.000
Cetonia aurata               0.110   0.020   0.020   0.718   0.040   0.090
Ceratina cyanea              0.998   0.000   0.000   0.000   0.000   0.000
aphid sp1                    0.459   0.190   0.090   0.020   0.030   0.210
Anthonomus rubi cf           0.968   0.025   0.005   0.000   0.000   0.000
Heteroptera sp8              0.983   0.015   0.000   0.000   0.000   0.000
Heteroptera sp2              0.429   0.310   0.110   0.000   0.000   0.150
Bombus vestalis              0.963   0.015   0.010   0.010   0.000   0.000
Pieris napi                  0.898   0.060   0.030   0.000   0.000   0.010

[Figure: six panels (Group 1 to Group 6), each plotting estimated probabilities against plant species index.]

Figure A.25: Estimated visitation distribution by interaction group, averaged over 100 runs. Refer to Table 5.3 for plant species names.

A.3 The lda package in R

The lda package in R is used to fit LDA via the collapsed Gibbs sampler (Chang, 2009). The package is designed for reading in a corpus of documents, filtering the words in the documents, and fitting various LDA models. The function used for model fitting in this thesis is lda.collapsed.gibbs.sampler. It is capable of fitting three different models: latent Dirichlet allocation (LDA), supervised LDA, and the mixed-membership stochastic blockmodel (MMSB) (Chang, 2009). The function takes several input parameters, but only those relevant to pollination web modelling are described here:

1. documents: A list of length N, where element a is an integer matrix of size 2 × M. The first row holds an index label for each plant species and the second row holds the frequency of observed interactions for the M plant species.
2. K: The number of interaction groups used to fit LDA to the dataset.
3. vocab: A character vector of plant species labels used in the documents list.
4. num.iterations: The number of iterations T used to sample group assignments for all interactions in the visitation web.
5. alpha: A scalar Dirichlet hyperparameter for θ.
6. burnin: The number of iterations to discard from the Markov chain.
7. compute.log.likelihood: A logical scalar stating whether the log-likelihood is computed at each iteration. The log-likelihoods from the output are not used to compute the model selection criteria, but they are used to check convergence of the Markov chain.
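The input format described above can be sketched in R as follows. This is a minimal illustration, not the code used in the thesis: the small web matrix, the helper to_documents, and the values chosen for K, alpha, eta and the iteration counts are all hypothetical. It assumes the lda package's convention that the first row of each document matrix holds 0-based indices into vocab, and it drops plant species with zero visits from each visitor's matrix. The sampler is only called if the lda package is installed.

```r
# Hypothetical 3 x 4 visitation web: rows are visitor species (N = 3),
# columns are plant species (M = 4), entries are observed visit counts.
web <- matrix(c(2L, 0L, 1L, 0L,
                0L, 3L, 0L, 1L,
                1L, 1L, 0L, 4L),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("visitor", 1:3), paste0("plant", 1:4)))

# Hypothetical helper: one 2-row integer matrix per visitor species.
# Row 1: 0-based plant index (assumed lda convention); row 2: visit count.
to_documents <- function(web) {
  lapply(seq_len(nrow(web)), function(a) {
    visited <- which(web[a, ] > 0)
    rbind(as.integer(visited - 1L), as.integer(web[a, visited]))
  })
}

documents <- to_documents(web)
vocab <- colnames(web)

# Fit LDA only when the lda package is available; all settings are
# illustrative values, not those used in the thesis.
if (requireNamespace("lda", quietly = TRUE)) {
  set.seed(1)
  fit <- lda::lda.collapsed.gibbs.sampler(
    documents, K = 2, vocab = vocab,
    num.iterations = 200, alpha = 0.1, eta = 0.1,
    burnin = 50L, compute.log.likelihood = TRUE
  )
  print(dim(fit$topics))  # K x M matrix of group-by-plant counts
}
```

Building the list from the frequency matrix keeps the plant labels in vocab aligned with the column order of the web, so group-by-plant counts can be matched back to species names.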

The lda.collapsed.gibbs.sampler function provides output for the group assignments for the interactions in the web using three formats:

1. topics: A K × M matrix indicating the number of interactions on each plant species assigned to each interaction group.
2. topic_sums: A vector of length K indicating the total number of interactions assigned to each group.
3. document_sums: A K × N matrix indicating the number of interactions for each visitor species assigned to each interaction group.

The first and third outputs of lda.collapsed.gibbs.sampler are used to compute the final estimate matrices β̂ and θ̂ described in Chapter 3.
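As a rough sketch of how count matrices of this shape can be turned into probability estimates, the R code below applies the usual smoothed estimators for collapsed Gibbs output. This is an assumption for illustration: the exact estimators are those given in Chapter 3 and may differ. The matrices named topics and document_sums are made-up counts standing in for the first and third outputs, and the hyperparameter values alpha and eta are hypothetical.

```r
# Made-up counts in the shape of the sampler output:
# K = 2 groups, M = 3 plant species, N = 4 visitor species.
topics <- matrix(c(5, 0, 3,
                   1, 6, 0), nrow = 2, byrow = TRUE)         # K x M
document_sums <- matrix(c(4, 1, 2, 1,
                          0, 3, 1, 3), nrow = 2, byrow = TRUE)  # K x N

alpha <- 0.1  # hypothetical Dirichlet hyperparameter for theta
eta   <- 0.1  # hypothetical Dirichlet hyperparameter for beta

# Assumed estimator: smoothed, row-normalised counts.
# beta_hat[k, j]: probability that group k visits plant species j.
beta_hat <- (topics + eta) / (rowSums(topics) + ncol(topics) * eta)

# theta_hat[a, k]: share of visitor a's interactions in group k.
theta_hat <- t(document_sums + alpha) /
  (colSums(document_sums) + nrow(document_sums) * alpha)

round(beta_hat, 3)
round(theta_hat, 3)
```

Each row of beta_hat and of theta_hat sums to one. When several independent runs are averaged, as in the tables of this appendix, the groups would first need to be matched across runs because group labels are arbitrary.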

A.4 The bipartite package in R

The bipartite package is used to visualize pollination webs and calculate several topological indices in R (Dormann et al., 2011). The package is mostly meant for plant-pollinator mutualistic networks and provides many example datasets, which can also be found in the Interaction Web Database (IWDB) (Guimarães et al., 2011). Several of the functions require only the N × M frequency matrix of observed interactions as input.

In this thesis, the nested function calculates the nesting temperature of the visitation webs using the BINMATNEST method (Rodriguez-Girones and Santamaria, 2006). The nested function requires only the N × M frequency matrix and the method used to calculate the nesting temperature. A nestedness function also exists for qualitative pollination webs.

The visweb function is used to visualize the visitation webs in this thesis (Dormann et al., 2011). The function uses the N × M frequency matrix to represent the web as a grid in which each cell represents a plant-visitor species pair. A higher frequency of observed counts for a species pair results in a darker cell on the grid. Several aesthetic input arguments allow the user to format the visual web (Dormann et al., 2011). The package also has the plotweb function to represent the visitation web as a bipartite graph.
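The quantities behind such a grid can be illustrated in base R. The sketch below is not the visweb implementation: it uses a made-up 3 × 3 web, converts it to presence/absence, orders species by their number of partners (an assumed ordering that is common for nested displays), and maps counts to grey levels so that higher counts give darker cells.

```r
# Made-up frequency matrix: rows are visitor species, columns are plants.
web <- matrix(c(0, 5, 1,
                2, 9, 4,
                0, 1, 0), nrow = 3, byrow = TRUE,
              dimnames = list(c("v1", "v2", "v3"), c("p1", "p2", "p3")))

# Presence/absence version: 1 if the pair was observed at least once.
presence <- (web > 0) * 1

# Order rows and columns by decreasing number of partners.
row_order <- order(rowSums(presence), decreasing = TRUE)
col_order <- order(colSums(presence), decreasing = TRUE)
sorted_presence <- presence[row_order, col_order]

# Grey level per cell: 1 is white (no visits), 0 is black (most visits).
grey_level <- 1 - web / max(web)
cell_colours <- matrix(grey(grey_level), nrow = nrow(web),
                       dimnames = dimnames(web))

sorted_presence
cell_colours
```

With the bipartite package installed, passing the same frequency matrix to visweb draws the grid directly, and plotweb draws the corresponding bipartite graph.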