Bayesian Inference in Ecological and Epidemiological Models

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

Saritha Kalhari Kodikara B.Sc Special Degree (Statistics), University of Sri Jayewardenepura.

School of Science College of Science, Health and Engineering RMIT University

June 2020

Declaration

I certify that except where due acknowledgement has been made, the work is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any other academic award; the content of the thesis is the result of work which has been carried out since the official commencement date of the approved research program; any editorial work, paid or unpaid, carried out by a third party is acknowledged; and, ethics procedures and guidelines have been followed. I acknowledge the support I have received for my research through the provision of an Australian Government Research Training Program Scholarship.

Saritha Kalhari Kodikara 24 June 2020

“If a man will begin with certainties, he shall end in doubts; but if he will be content to begin with doubts he shall end in certainties.”

Sir Francis Bacon

Acknowledgements

Undertaking this PhD has been a truly life-changing experience for me and it would not have been possible without the help I got from many great individuals.

First and foremost, I wish to express my profound gratitude to my supervisors, Prof. Lewi Stone, Dr. Haydar Demirhan and Dr. Yan Wang, for the continuous support given to me throughout my PhD study. I could not imagine having better supervisors. In particular, I am indebted to Prof. Lewi Stone for his insightful comments and encouragement, and also for the hard questions which inspired me to widen my research from many different perspectives. I am also thankful for the constructive feedback given on my writing, which helped me to become a much better writer.

Besides my supervisors, I would like to thank Dr. Simon Firestone for taking me on board to work on a fascinating research problem and for the assistance given throughout its duration.

I am extremely thankful to my dear friend Ayomi Marshall for her helpful comments and for proofreading this thesis.

I would also like to take this opportunity to thank all the teachers, friends and colleagues with whom I have crossed paths in this lifetime, and for the experiences we shared, as they have shaped me into who I am today.

I would also like to extend my gratitude to my family: my parents, and to my brother and sister, for supporting me spiritually throughout my PhD candidature and my life in general.

Finally, I am thankful to Pubudu, who has been by my side and encouraging me throughout. This research outcome would have been impossible without your sincere love and patience.

Contributions by others to the thesis

This is a thesis ‘with publications’ and contains six papers that are either published or under review that illustrate the research undertaken. While Chapters 3, 4, 6 and 7 are published as journal papers, Chapters 5 and 8 are still under review. Below I discuss my contribution to each chapter along with the contributions from others who have co-authored the publications resulting from this thesis.

• Chapter 3 has been published with the following information:
Kodikara, S., Demirhan, H., and Stone, L. (2018). Inferring about the extinction of a species using certain and uncertain sightings. Journal of Theoretical Biology, 442:98–109.
Author contributions:
Saritha Kodikara - Developed underlying theory, analysed the data, and drafted the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.

• Chapter 4 has been published with the following information:
Kodikara, S., Demirhan, H., Wang, Y., Solow, A., and Stone, L. (2020). Inferring extinction year using a Bayesian approach. Methods in Ecology and Evolution, 11(8):964–973.
Author contributions:
Saritha Kodikara - Developed underlying theory, implemented and coded the model in R, analysed the data and drafted the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Andrew Solow - Guided the research and edited the manuscript.
Yan Wang - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.

• Chapter 5 is currently being prepared for publication:
Author contributions:
Saritha Kodikara - Developed underlying theory, implemented and coded the model in R, analysed the data and drafted the manuscript.
Yan Wang - Guided the research and edited the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.

• Chapter 6 has been published with the following information:
Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2019). Bayesian updating to estimate extinction from sequential observation data. Biological Conservation, 229:26–29.
Author contributions:
Colin J. Thompson - Developed underlying theory and drafted the manuscript.
Saritha Kodikara - Co-developed underlying theory, analysed the data, edited the manuscript.
Mark A. Burgman - Guided the research and edited the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.

• Chapter 7 has been published with the following information:
Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2020). Using survival theory models to quantify extinctions. Biological Conservation, 241:108345.
Author contributions:
Colin J. Thompson - Developed underlying theory and drafted the manuscript.
Saritha Kodikara - Co-developed underlying theory, analysed the data, edited the manuscript.
Mark A. Burgman - Guided the research and edited the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.

• Chapter 8 is currently being prepared for publication:
Author contributions:
Saritha Kodikara - Developed underlying theory, coded the model in C++, designed and conducted the simulations, and drafted the manuscript.
Max S. Y. Lau - Guided the research and edited the manuscript.
Mary van Andel - Guided the research and edited the manuscript.
Mark A. Stevenson - Guided the research and edited the manuscript.
Bryan T. Grenfell - Guided the research and edited the manuscript.
Nigel French - Guided the research and edited the manuscript.
Haydar Demirhan - Guided the research and edited the manuscript.
Lewi Stone - Guided the research and edited the manuscript.
Simon M. Firestone - Co-developed underlying theory, provided assistance with C++ coding, analysed the NZ data and edited the manuscript.

To Amma, Thaththa, Aiya, Akka and to my soulmate/best friend/partner in crime Pubudu Perera.

Contents

Declaration
Acknowledgements
Contributions by others to the thesis
Contents
List of Figures
List of Tables
Abstract

1 Introduction
1.1 Overview
1.2 Species extinction models
1.2.1 Significance
1.2.2 Objective
1.2.3 Background
1.3 Infectious disease models
1.3.1 Significance
1.3.2 Objective
1.3.3 Background
1.4 Research questions addressed in the thesis
1.5 Thesis Layout

2 Bayesian Approach
2.1 Markov chain Monte Carlo method
2.1.1 Markov chain
2.1.2 Metropolis-Hastings algorithm
2.1.3 Gibbs sampler
2.2 Popularity of Bayesian approaches in ecology and epidemiology

3 Incorporating uncertainty in the sighting data into species extinction models
3.1 Introduction
3.2 Data
3.3 Methods - Model 1
3.4 Results - Model 1
3.5 Methods - Model 2
3.6 Results - Model 2
3.7 Guideline to select between Models
3.8 Discussion

4 Inferring Extinction Year using a Bayesian Approach
4.1 Introduction
4.2 Model development
4.2.1 Model 1 - Certain sightings only
4.2.2 Model 2 - Certain and uncertain sightings
4.3 Results
4.3.1 Simulation Study
4.3.2 Case study
4.4 Sensitivity analysis
4.4.1 Artificial sighting records
4.4.2 Species sighting records
4.5 Discussion

5 Modeling extinction of a species using non-homogeneous Poisson processes with a change-point
5.1 Introduction
5.2 Model Development
5.3 Results
5.3.1 Black-footed ferret
5.3.2 Ivory-billed woodpecker
5.4 Discussion

6 Bayesian updating to estimate extinction from sequential observation data
6.1 Introduction
6.2 Bayesian updating
6.2.1 Choosing an initial P(X1)
6.2.2 Calculating the yearly Bayes factor bt
6.3 Rules of thumb and probability thresholds
6.4 Case study
6.5 Discussion

7 Using survival theory models to quantify extinctions
7.1 Introduction
7.2 Discrete survival theory for extinction probabilities
7.3 Model consistency
7.4 Case studies
7.4.1 Dodo
7.4.2 Aldabra snails
7.5 Discussion

8 A Systematic Bayesian Integration of Epidemiological, Genetic and Movement Data
8.1 Introduction
8.2 Methods
8.2.1 Model formulation and modification
8.2.2 Model verification and pseudo-validation
8.2.3 Model implementation
8.3 Results
8.4 Discussion

9 Conclusion
9.1 Incorporating uncertainty into species extinction models
9.2 Inferring extinction year using a Bayesian approach
9.3 Modeling extinction of a species using non-homogeneous Poisson processes with a change-point
9.4 Bayesian updating to estimate extinction from sequential observation data
9.5 Using survival theory models to quantify extinctions
9.6 A systematic Bayesian integration of epidemiological, genetic and movement data
9.7 Conclusion

Bibliography

Appendix A. MCMC diagnostic plots for Chapter 4

Appendix B. Technical details relating to equations presented in Chapter 7

List of Figures

1.1 Time series of the cumulative extinctions as a percentage of IUCN (2012) evaluated vertebrate species as extinct or extinct in the wild.
1.2 Graphical representation of three sighting records for different species.
1.3 Caribbean monk seal sighting data during (1915, 1992]. There are only five recorded sightings.
1.4 Graphical representation of a sighting record that includes both certain and uncertain sightings.
1.5 The consensus transmission network inferred for the FMD epidemic in 2010 in Japan.
1.6 The most probable transmission network of the novel coronavirus spread from the Beijing international airport to other airports around the world.
1.7 A SEIR compartment model.
1.8 Illustration of the basic concept to infer transmission tree using genetic data.

2.1 Graphical representation of the posterior distribution.

3.1 Graphical representation of Ivory-billed Woodpecker sighting data.
3.2 Model 1 is based on the assumption that the valid and invalid sightings follow different Poisson processes.
3.3 Change in the Bayes factor.
3.4 Change in the Bayes factor for Model 1 for different variants in uncertain sightings.
3.5 I(ω, x) is plotted as a function of time for different fixed values of ω when sightings are homogeneous.
3.6 I(ω, x) as a function of time for ω = 0.7. The red dashed line indicates the last uncertain sighting tn and the blue dotted line represents tn−1.
3.7 I(ω, x) as a function of time for a fixed value of ω = 0.9 when sightings are non-homogeneous at the middle. The red dashed line indicates the last uncertain sighting tn.
3.8 I(ω, x) as a function of time for a fixed value of ω = 0.9 when sightings are non-homogeneous at the end. The red dashed line indicates the last uncertain sighting tn.
3.9 I(ω, x) as a function of time for a fixed value of ω = 0.9 when there is only one uncertain sighting. The red dashed line indicates the last uncertain sighting tn.
3.10 The basis of Model 2 is developed on the fact that the certain and uncertain sightings follow different Poisson processes.
3.11 Change in Bayes factor for Model 2 for different variants in both certain and uncertain sightings.
3.12 Change in Bayes factor for Model 2 for different variants in certain sightings.
3.13 I0(ω, x) as a function of time for different fixed values of ω when sightings are homogeneous.
3.14 Posterior PDF of parameter Ω, which is a measure of the quality of uncertain sightings.

4.1 Comparison between the true extinction date vs inferred posterior median extinction date.
4.2 Histogram of true extinction date under different certain sighting probabilities.
4.3 Posterior distribution plot of τE for the IBW for Model 1.
4.4 Posterior distribution plots for the model parameters for Model 1 excluding τE.
4.5 Posterior distribution plot of τE for the IBW for Model 2.
4.6 Posterior distribution plots for the model parameters for Model 2 excluding τE.
4.7 Illustration of MCMC Diagnostics.
4.8 Posterior median extinction date and its 95% HDI for three artificially generated sighting records between 0 and 100.

5.1 Homogeneous or non-homogeneous Poisson process.
5.2 Posterior distribution plots for the model parameters τE and αc for black-footed ferret.
5.3 Posterior distribution plot of τE for the IBW.
5.4 Graphical representation of IBW certain sightings and uncertain sightings.
5.5 Posterior distribution plot of τE for the IBW with change-point and homogeneous rate assumptions.
5.6 Posterior distribution plots for the model parameters αc, αu1 and αu2 for IBW.
5.7 Time series plot of the posterior extinction probability under different assumptions for the IBW.

6.1 Cumulative Bayes Factors and probabilities of extinction generated sequentially from the data in Table 6.1.
6.2 Sensitivity of extant probability to p(ri).

7.1 Plots of P(ET) vs. h and T vs. h showing the self-consistency triplet (P∗, h∗, T∗) of model parameter values when K = 2.
7.2 Consistency check of Linear iterative model against Survival theory.

8.1 Representation of periods of exposure to secondary transmission rates, incorporating contact-traced movements.
8.2 Genomes are mostly in the form of DNA and are made out of four nucleotide bases (i.e. Adenine, Guanine, Cytosine, Thymine) which fall into two categories: pyrimidines and purines.
8.3 Posterior distribution of the overall coverage rate, which is basically the proportion of links predicted correctly in the network.
8.4 Posterior distribution of model parameters.

1 MCMC diagnostic plots for IBW.
2 MCMC diagnostic plots for IBW.

List of Tables

4.1 Notation used in model development.
4.2 Posterior estimate for the extinction date based on simulations under Model 2.
4.3 Summary of the posterior distribution of τE using certain sightings only.
4.4 Summary of the posterior distribution of τE.
4.5 Inferred extinction year and its 95% HDI based on Model 1, Model 2 and Model 3.

5.1 Summary of the posterior distributions of τE and αc for the black-footed ferret.
5.2 Summary of the posterior distribution of τE for IBW.

6.1 Passive and active survey input data for the Alaotra Grebe data commencing in 1990, and output from the model.

8.1 Key parameters in the Bayesian MCMC inference of the transmission network.

Abstract

Clear signs are emerging that any further loss of critically endangered species might tip the world towards another mass extinction event at worst, or a global ecological catastrophe at best. In addition, infectious diseases are considered one of the top five drivers of extinction. Conversely, scientists find that biodiversity loss can in fact increase rates of infectious disease, which lands us in a never-ending loop. Of course, infectious diseases and the problem of eradicating them are of interest in their own right. The goal of my thesis is to derive new mathematical and statistical models for studying species extinctions as well as models for studying infectious diseases.

Many studies have used historical sighting data to infer the extinction of species. Often, the data is in the form of a time series specifying in which years the species was sighted over a long time span, usually a number of decades. My work is largely concerned with data sets of this type and with trying to predict whether the species has gone extinct after the last certain sighting, and if so, on what date it is likely to have gone extinct. The majority of extinction models developed in the literature have only dealt with certain sightings, where all the sightings were treated as correctly identified and thus valid. However, in reality, for many of the sightings reported it is not one hundred percent certain that the species was correctly identified. In recent years, several new probabilistic models have been developed to incorporate such uncertainties into modelling approaches. In fact, two different modelling methodologies have been developed. In the first phase (Chapter 3) of my research, I examined these two methods in depth to seek out the true underlying principles, as they give contradictory conclusions on the extinction of the Ivory-billed Woodpecker. This work is significant as it helps to identify which models can be used by ecologists in which situations, and how to untangle the influential factors controlling the signatures of extinction in sighting records.

In the second phase (Chapter 4), a new perspective is provided on modelling extinction using a hierarchical Bayesian approach that can be applied to any sighting record consisting of certain and uncertain sightings. This new approach calculates the posterior probability that the species went extinct by the endpoint of the sighting record. The differences between the newly proposed method and other methods in the literature are discussed. The hierarchical Markov Chain Monte Carlo (MCMC) approach I develop for determining the extinction date was applied to the sighting record of the Ivory-billed Woodpecker (IBW) (Campephilus principalis). It was concluded that the IBW went extinct in the 1940s, which confirms the validity of recent studies, despite arguable and controversial later sightings of the species.

In the third phase of the thesis (Chapters 5-7) I explore several new approaches for inferring species extinction. For instance, in Chapter 5, a model is developed that assumes an independent non-homogeneous Poisson process for the sightings. This work significantly extends the models developed in phase two (above), which assumed that the certain and uncertain sighting rates are constant (i.e., having constant sighting probabilities). In Chapters 6 and 7, techniques such as the sequential Bayes factor and survival theory are used as tools for modelling extinction. These new methods are applied to sighting data for different species, such as the Ivory-billed Woodpecker (Campephilus principalis), Alaotra Grebe (Tachybaptus rufolavatus), Dodo (Raphus cucullatus), Aldabra Snail (Rhachistia aldabrae), and black-footed ferret (Mustela nigripes). In short, several flexible tools are developed in this thesis for studying species extinction. In contrast to other methods in the literature, most of these allow working with distributions of parameters, including extinction time, rather than point estimates.

In the final part of the thesis (Chapter 8) I turn my attention to studying disease transmission in infectious disease problems. These relate to the above ecological extinction problems, in that disease extinction is the ultimate goal, and the processes that lead up to disease extinction are of intrinsic interest. The main objective in this final phase of the thesis is to improve and extend current Bayesian algorithms that infer the transmission network of an outbreak and thus help to make effective disease control decisions to eradicate a disease. A transmission network is a graph that links all infected cases with their possible sources of infection, from whom they caught the disease. The network thus provides the big picture of “who infected whom”. Modern transmission network reconstruction methodologies combine both epidemiological and genetic sequence data. In this chapter, one of the most successful methods (Lau et al., 2015) is extended so that it considers movements between farms. This methodology is used to assess the accuracy of the inferred transmission network based on a simulated spatio-temporal epidemic. Through the simulation study I demonstrate that the accuracy of a transmission network is significantly improved through the addition of contact (movement) data, in comparison to using sequence data and epidemiological data alone. Having 100% of sequence data along with no contact data resulted in a coverage rate of 44%. However, this coverage rate was doubled by just including 50% of the movement data. Thus, including movement data in the inference gave a big boost to the coverage rate of the inferred transmission network. The methodology described in this section is intended to be applied to the early stages of the 2019 Mycoplasma bovis outbreak in New Zealand.

In conclusion, the models developed in this thesis collectively reveal the importance and contribution of mathematical approaches for studying ecological and epidemiological problems relating to species extinction and the dynamics of disease extinction.

Chapter 1

Introduction

1.1 Overview

My thesis is concerned with the mathematical modelling of species extinction in ecological systems and disease transmission in epidemiological systems. Over the past decade, both of these research areas have evolved rapidly, largely due to improvements in modelling techniques and surveillance. At first glance, species extinction and infectious disease modelling might be viewed as two independent problems. However, they are simply two sides of the same coin. While ecologists view extinction or the dangerous decline of species numbers as a problem, and conservation as a goal, epidemiologists view eradication or decline of an undesirable virus or bacterium as an achievement, and persistence as a problem (Earn et al., 1998). Even though ecologists and epidemiologists have inverted goals, the mathematical structures studied by both often have strong similarities. In this chapter, the history of some of the key species extinction models is described, followed by an overview of infectious disease models. In each case I give a statement of my objectives in this thesis and explain how they add to the field of knowledge. Following this, I give a chapter-by-chapter outline of the objectives and contributions of each chapter.

1.2 Species extinction models

1.2.1 Significance

The report from the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) warns that the rate of species extinctions is accelerating and that one million animal and plant species are threatened and may be lost within decades unless urgent action is taken (Díaz et al., 2019). The loss of biodiversity directly threatens the well-being of humans in all regions of the world, given the worrying rate at which the interconnected web of life is diminishing. A growing body of evidence indicates that the earth is entering a sixth “mass extinction” (Ceballos et al., 2015), as the current extinction rates are far above the “background” rate (see Figure 1.1). The background rate refers to the standard rate of extinction in earth’s history before humans had evolved and were on the scene. Figure 1.1 indicates that it would have taken several millennia to reach the modern vertebrate extinction rates if the background rate had prevailed.

Figure 1.1: Time series of the cumulative extinctions as a percentage of IUCN (2012) evaluated vertebrate species as extinct or extinct in the wild. The term extinct in the wild refers to a species that does not have any living individuals in the wild, although some may exist in artificial habitats (e.g. zoos). (A) Highly conservative estimate. (B) Conservative estimate. (Source: Ceballos et al. (2015)).

Knowing whether a species is extinct or not is important to prioritise conservation strategies and to allocate research energy and funds for environmental monitoring. Thus, one of my key aims in the thesis is to find ways to answer the question of whether a species has become extinct. The date of extinction, or the time of the disappearance of the last individual of a species, is rarely observed directly. Therefore, where possible, it is better to infer the extinction of a species using a variety of information. This includes time series of historical sightings (i.e. sighting records), the effort expended in searching for the species, change in abundance over time (i.e. population trajectories), potential remaining habitat and its relationship to abundance, the severity and extent of processes threatening species, and intrinsic taxon information (e.g. life-history traits) (Boakes et al., 2015).

1.2.2 Objective

Ideally, we would like to use all the information stated in the previous section to infer extinction. However, in reality, for rare or poorly studied species the only available data is often restricted to sightings. Thus, quantitative techniques have been developed to assess extinction solely based on sighting data, which will be discussed shortly. The main objective of this thesis in the species extinction context is to develop new Bayesian approaches to determine whether a species is extinct based on sighting data. New and existing methods are then used to obtain insights into the most significant factors in extinction models, and into which factors require additional research to strengthen the knowledge base and thereby reduce the uncertainties in the inferences made.

1.2.3 Background

Before moving forward with the details of the statistical techniques used in estimating species extinction dates, it is important to have a basic understanding of Poisson processes, as the Poisson process is a widely used stochastic process for modelling sighting records.

Poisson process

A Poisson process generates a series of discrete random events that occur at a given rate λ(t), with the numbers of events in disjoint time intervals being independent. If λ(t) is constant (i.e. λ(t) = λ), then the Poisson process is stationary. For any interval of length t of a stationary Poisson process the expected number of sightings in that interval is λt, and, given the number of sightings in the period, the sighting times are uniformly distributed over it. In practice, however, it is often more convenient to define the stationary Poisson process in terms of the sequence of inter-arrival times, which are independent and exponentially distributed with mean 1/λ.
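The inter-arrival definition translates directly into a simulation. The following is a minimal Python sketch, given purely for illustration and not taken from the models coded for this thesis; the function name `simulate_sightings` and the example rate and horizon are hypothetical.

```python
import random


def simulate_sightings(rate, horizon, seed=None):
    """Simulate sighting times of a stationary Poisson process on [0, horizon].

    Successive inter-arrival times are drawn independently from an
    exponential distribution with mean 1/rate and accumulated until
    the horizon is exceeded.
    """
    if rate <= 0 or horizon <= 0:
        raise ValueError("rate and horizon must be positive")
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Hypothetical example: 0.5 sightings per year over a 20-year record,
# so the expected number of sightings is 0.5 * 20 = 10.
record = simulate_sightings(rate=0.5, horizon=20.0, seed=1)
print(len(record), [round(t, 2) for t in record])
```

Averaging the number of simulated sightings over many independent records should approach λt, which gives a quick sanity check of the sketch.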

[Figure 1.2, panels (a), (b) and (c): each panel shows eight numbered sightings on a time axis running from 0 to T.]

Figure 1.2: Graphical representation of three sighting records for different species. Green represents the years where there are sightings. (a) Approximately constant sighting rate. (b) Increasing sighting rate. (c) Decreasing sighting rate.

While the homogeneous Poisson process is simple and has been used in extinction models for sighting records similar to Figure 1.2a, it is also pertinent to consider heterogeneous sighting rates (i.e. λ(t)), specifically for sighting records whose rate varies with time (e.g. Figures 1.2b and 1.2c). Such a Poisson process is referred to as a non-homogeneous or non-stationary Poisson process. If X(t) is a non-homogeneous Poisson process with rate λ(t), then the number of sightings between times $t_1$ and $t_2$ has a Poisson distribution with parameter $\int_{t_1}^{t_2} \lambda(t)\,dt$.
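As an illustration of the non-homogeneous case, the sketch below simulates sightings by thinning, a standard simulation technique that is not described in this thesis: candidate events are generated at a constant rate that bounds λ(t) and each is kept with probability λ(t) divided by that bound. It also evaluates the Poisson parameter numerically with the trapezoid rule. All function names and the declining example rate are hypothetical.

```python
import math
import random


def simulate_nhpp(rate_fn, rate_max, horizon, seed=None):
    """Simulate a non-homogeneous Poisson process on [0, horizon] by thinning.

    rate_max is assumed to be an upper bound on rate_fn over [0, horizon].
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # candidate from the bounding process
        if t > horizon:
            return times
        rate = rate_fn(t)
        if rate > rate_max:
            raise ValueError("rate_fn exceeds rate_max")
        if rng.random() < rate / rate_max:  # keep with probability rate/rate_max
            times.append(t)


def expected_count(rate_fn, t1, t2, steps=10_000):
    """Trapezoid-rule approximation of the integral of rate_fn from t1 to t2."""
    h = (t2 - t1) / steps
    total = 0.5 * (rate_fn(t1) + rate_fn(t2))
    for i in range(1, steps):
        total += rate_fn(t1 + i * h)
    return total * h


def declining_rate(t):
    """Hypothetical declining sighting rate, lambda(t) = exp(-0.05 t)."""
    return math.exp(-0.05 * t)


sightings = simulate_nhpp(declining_rate, rate_max=1.0, horizon=50.0, seed=2)
print(len(sightings), round(expected_count(declining_rate, 0.0, 50.0), 3))
```

For this example rate the integral has the closed form (1 − e^(−2.5))/0.05, so the numerical value can be checked directly.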

Basic model of extinction

Figure 1.3 illustrates the reliable sightings for the Caribbean monk seal during the period (1915, 1992]. Based on these sightings, we might ask: Has the Caribbean monk seal gone extinct by 1992? If it has gone extinct, then what is the most likely time at which extinction occurred? These are reasonable questions that ecologists seek answers to and that statistical models should be able to address. In this respect, Solow (1993a) developed two simple methods, the first of their kind within the field of conservation biology.

Figure 1.3: Caribbean monk seal sighting data during (1915, 1992]. There are only five recorded sightings.

Solow (1993a) proposed a simple frequentist method along with a Bayesian model to infer species extinction, assuming that the sightings follow a stationary Poisson process until the time of extinction (τE). Thus, the rate of the Poisson process is constant (i.e. λ(t) = λ) until τE, and then falls to zero. The constant rate assumption is most appropriate for small population sizes (Solow, 1993a). These two methods tested the null hypothesis that extinction had not occurred by the end of the sighting period T (i.e. τE > T) using the likelihood ratio statistic and Bayes theorem (see Chapter 2 for more details on Bayes theorem). It was found that the p-value corresponding to the null hypothesis that extinction had not yet occurred by time T (i.e. the number of years since the first sighting) is $(t_n/T)^n$, where $t_n$ and $n$ represent the time of the last sighting and the number of sightings, respectively. Thus, a larger gap between the last sighting $t_n$ and the end of the observation period T favours the possibility of extinction over existence by time T. Solow (1993a) found that the p-value for the null hypothesis that the Caribbean monk seal was extant in 1992 is equal to $(t_n/T)^n = (37/77)^4 = 0.053$. Hence, it can be inferred that the Caribbean monk seal is likely to have gone extinct by 1992. Note that the number of sightings is equal to four (i.e. n = 4) as the first sighting was omitted from the analysis to define the beginning of the sighting period.
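The p-value above is simple enough to compute in a few lines. The following Python sketch is for illustration only; the function name `solow_p_value` is hypothetical, and the inputs are the Caribbean monk seal values quoted in the text.

```python
def solow_p_value(t_last, period_end, n_sightings):
    """p-value (t_n / T)^n for the null hypothesis of no extinction by time T.

    t_last      -- time of the last sighting, measured from the first sighting
    period_end  -- end of the observation period T, on the same time scale
    n_sightings -- number of sightings n, excluding the first one
    """
    if not 0 < t_last <= period_end:
        raise ValueError("require 0 < t_last <= period_end")
    if n_sightings < 1:
        raise ValueError("require at least one sighting")
    return (t_last / period_end) ** n_sightings


# Caribbean monk seal: t_n = 37, T = 77, n = 4.
p = solow_p_value(37, 77, 4)
print(round(p, 3))
```

Increasing T while holding the last sighting fixed lowers the p-value, which is the sense in which a long sighting gap favours extinction.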

Key advances since the basic extinction model of Solow

Extensions and modifications to Solow’s original analysis (Solow, 1993a)

In the same year Solow extended his own model to take into account declining populations, making use of a non-stationary Poisson process with rate function λ(t) = exp(a + bt) before extinction (Solow, 1993b). Several other modifications have also been applied to the original analysis of Solow (1993a) by other investigators. For example, Burgman et al. (1995) developed an equation similar to that of Solow (1993a), using the total number of observed individuals instead of the number of time units in which the species was recorded. This study also developed a technique, known as a ‘runs test’, that calculates the probability that the species will be recorded again during a period that is either as long as, or longer than, the longest observed run of absence. This technique is useful in the presence of population decline, since decline is characterised by longer and longer periods during which the species is not sighted. In 1998, McCarthy introduced a ‘partial Solow equation’ that accounted for changes in collection effort, making use of a non-homogeneous Poisson process (McCarthy, 1998). A general equation was given for the p-value of the null hypothesis that extinction has not occurred. Assuming that the sighting rate is λ(t), the defined p-value is equal to $[\Lambda(t_n)/\Lambda(T)]^n$, where $\Lambda(t) = \int_0^t \lambda(u)\,du$. For poorly studied species it is impossible to obtain a full sighting record, and in such situations making assumptions about the sighting rate will introduce bias into the modelling. This brings us to the next wave of extinction models, which are characterised by non-parametric approaches that do not require specifying a parametric form for the sighting rate λ(t).
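The general p-value $[\Lambda(t_n)/\Lambda(T)]^n$ can be sketched numerically as follows. This is a minimal Python illustration, not the implementation of McCarthy (1998): the cumulative rate is approximated with the trapezoid rule, and the function names and the declining rate exp(−0.03t) are hypothetical choices made only for the example.

```python
import math


def cumulative_rate(rate_fn, t, steps=2000):
    """Trapezoid-rule approximation of Lambda(t), the integral of rate_fn on [0, t]."""
    h = t / steps
    total = 0.5 * (rate_fn(0.0) + rate_fn(t))
    for i in range(1, steps):
        total += rate_fn(i * h)
    return total * h


def general_p_value(rate_fn, t_last, period_end, n_sightings):
    """p-value [Lambda(t_n) / Lambda(T)]^n for a given sighting-rate function."""
    ratio = cumulative_rate(rate_fn, t_last) / cumulative_rate(rate_fn, period_end)
    return ratio ** n_sightings


# With a constant rate the expression reduces to (t_n / T)^n.
p_constant = general_p_value(lambda t: 1.0, 37, 77, 4)

# Hypothetical declining rate: late gaps are less surprising, so p is larger.
p_declining = general_p_value(lambda t: math.exp(-0.03 * t), 37, 77, 4)

print(round(p_constant, 3), round(p_declining, 3))
```

The comparison of the two values shows why the assumed form of λ(t) matters: the same sighting record gives weaker evidence of extinction when the sighting rate is assumed to have been falling anyway.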

Non-parametric models

In 2003, Solow and Roberts (2003) proposed a simple alternative approach to testing for extinction. This test bases its assessment on the behavior of the sighting record in the vicinity of the most recent sighting, as opposed to the parametric tests, which use all of the information in the sighting record. Thus, the approach of Solow and Roberts (2003) can be applied when there are only a few records and does not require specifying a parametric form for the sighting rate. The proposed non-parametric estimate of the extinction time was found to have the simple expression $\hat{\tau}_E = t_n + (t_n - t_{n-1})$, which does not require knowledge of either the beginning of the sighting period $t = 0$ or the total number of sightings $n$ in the sighting record. This estimate was based on the work of Robson and Whitlock (1964), who estimated the endpoint of a distribution based on an independent sample. The method of Solow and Roberts (2003) tends to overestimate the extinction date, particularly when the sighting rate is constant or increasing (Rivadeneria et al., 2009; Clements et al., 2014). Thus, the proposed method performs best when applied to the sighting records of species that are believed to exhibit gradual extinction and to those whose probability of sighting is generally low. Based on the non-parametric equation of Solow and Roberts (2003), Jarić and Ebenhard (2010) developed a new index, replacing the time between the two last sightings $(t_n - t_{n-1})$ with the average time between each pair of successive sightings. The Jarić and Ebenhard (2010) method might be preferred (if data are available) over the Solow and Roberts (2003) method as it makes the most use of the sighting data (Boakes et al., 2015).
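The two point estimates described above can be sketched as follows. The sighting years are invented for illustration, the function names are assumptions, and the second function implements only what the text states (the last gap replaced by the mean gap between successive sightings); the published index may involve further details not covered here.

```python
def last_gap_estimate(sightings):
    """Extinction-time estimate t_n + (t_n - t_{n-1}) from the last two sightings."""
    times = sorted(sightings)
    if len(times) < 2:
        raise ValueError("at least two sightings are required")
    return times[-1] + (times[-1] - times[-2])


def mean_gap_estimate(sightings):
    """Variant that replaces the last gap with the mean gap between successive sightings."""
    times = sorted(sightings)
    if len(times) < 2:
        raise ValueError("at least two sightings are required")
    mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    return times[-1] + mean_gap


# Hypothetical sighting record (years); not data from the thesis.
record = [1900, 1910, 1915, 1932, 1940]
print(last_gap_estimate(record))  # 1948
print(mean_gap_estimate(record))  # 1950.0
```

The first estimate depends only on the two most recent sightings, whereas the second uses the whole record through its first and last entries and the number of sightings.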

The methods discussed so far differ from one another either in the underlying statistical model of the sighting record or in whether a Bayesian or non-Bayesian approach has been applied. However, they all take two key quantities into account to draw inferences: the number of sightings and the time elapsed since the last sighting of the species. These methods provide reliable estimates of extinction time as long as the choice of the method is determined by the nature of the underlying sighting data (Rivadeneria et al., 2009). However, none of these methods deals with uncertain (or doubtful) sightings. Thus, we now move on to the modern extinction approaches, which incorporate uncertain sightings into the modelling approach.

Joint analysis of certain and uncertain data

Historically, extinction models used certain sightings as the main source of information to infer species extinction (Rivadeneria et al., 2009). These sightings are based on specimens or uncontroversial photographs, video, or sound recordings. However, working with uncertain sightings requires further terminology. As seen in Figure 1.4, certain sightings are always considered “valid”, but uncertain sightings are either “valid” or “invalid”, given that we are not sure whether the species has been identified correctly. In practice, it is impossible to know which uncertain sightings are valid and real, and which are invalid and thus erroneous.

[Figure 1.4 image: a time axis running from 0 to T with sighting times $t_1, \dots, t_{11}$; certain sightings are always valid, whereas uncertain sightings are either valid or invalid.]

Figure 1.4: Graphical representation of a sighting record that includes both certain and uncertain sightings. Each year is split into two parts to indicate certain and uncertain sightings. The first row, with filled cells, represents the years in which there are certain sightings. The second row, with pink filled cells, represents the years in which there are uncertain sightings.

In order to infer extinction more accurately, we need to include both certain and uncertain sightings in the modeling approaches in a statistically sound manner. In the first key paper addressing this issue, Roberts et al. (2010) demonstrated that the inferences made from models that include uncertain sightings differ significantly from those omitting that information, and that the conclusions about extinction are sensitive to the inclusion or exclusion of uncertain sightings. This led Solow et al. (2012) to develop a statistical method that treated uncertain sightings of the Ivory-billed Woodpecker (Campephilus principalis) in a formal way, neither simply treating them as valid nor excluding them. The limitation of the Solow et al. (2012) method is that it assumes that uncertain sightings occur only after the last certain sighting, which is not a logical necessity. Afterwards, Solow and Beet (2014) modified the method into two different probability models that allow overlap in time between certain and uncertain sightings, thereby increasing generality.

Recently, several research groups have developed methods that take into account uncertainties and the actual strength of the evidence by incorporating additional information: for example, whether actual specimens of the species were recorded, whether less certain video and/or audio recordings were collected, or whether there were just local verbal reports from experts or unreliable non-experts in some years (Thompson et al., 2013b; Lee, 2014; Thompson et al., 2017). The idea was then extended by assigning probabilities of reliability to individual sightings (Jarić and Roberts, 2014; Lee et al., 2014). All of these models needed some expert opinion (e.g. Life International) in the area to provide sighting reliabilities, and the inferences made from these models were sensitive to those reliabilities. Because of the importance of sighting reliabilities, Lee et al. (2015) developed a formal framework to elicit expert opinions in order to determine the validity of sightings.

Section 1.2 discussed the background of some key extinction model developments. Now, we move into the background concerned with the study of infectious disease transmission.

1.3 Infectious disease models

1.3.1 Significance

Infectious diseases have always been an important part of human history. The common goal of epidemiologists is to generate conditions that lead to the extinction of the viruses or bacteria that cause these diseases. Since the beginning of recorded history there have been epidemics that invaded and endangered human populations, the most recent being the coronavirus disease (COVID-19), which emerged from Wuhan, a city in China.

This and other infectious disease epidemics, such as foot-and-mouth disease (FMD) in the United Kingdom in 2001, severe acute respiratory syndrome (SARS) in Hong Kong in 2002 and Ebola in West Africa in 2014, have caused many deaths worldwide, underlining the importance of better understanding disease transmission dynamics. For all events of this type, epidemiologists have a great need to obtain accurate inference of ‘who infected whom’ (i.e. the transmission network) and of epidemiological parameters such as the reproductive number, serial interval, incubation period and clearance rate, to help make evidence-based decisions about disease control strategies in the early phases of an outbreak.

Any transmission network is a combination of nodes (or vertices) and links (or edges) between nodes. The nodes can describe single individuals (e.g. humans), groups of individuals (e.g. households or farms [see Figure 1.5]) or locations to which individuals are connected (e.g. airports [see Figure 1.6]). Links describe the transmission of the disease between nodes, in which case the network is usually directed. Essentially, by matching all infected cases with the possible source from which they caught the disease, the model attempts to infer the true ‘transmission network’. Figures 1.5 and 1.6 show examples of inferred transmission networks for farm animals and humans, respectively. Note that Figure 1.5 is a weighted graph in which each branch is given a numerical weight to reflect the strength of transmission between a target farm and a source farm.

Figure 1.5: The consensus transmission network inferred for the FMD epidemic in 2010 in Japan. Case 6 (i.e. c6) is the source of the epidemic and transmission is indicated vertically from top to bottom (source: Hayama et al., 2019).

Figure 1.6: The most probable transmission network of the novel coronavirus spread from the Beijing international airport to other airports around the world. The size of each bubble represents the relative risk at that airport (source: Cohen, 2020).

1.3.2 Objective

In this component of the thesis, the main objective is to develop new methods to infer and reconstruct transmission networks in an outbreak. This will be achieved using knowledge about the disease dynamics, genetic knowledge obtained from DNA sequences at infected farms, and any knowledge of movement of (possibly) infected animals between farms.

1.3.3 Background

Before discussing transmission network models in the literature, it is desirable to be familiar with epidemic compartmental models such as the Susceptible-Exposed-Infectious-Recovered (i.e. SEIR) model, as our work, as well as many other epidemic models, hinges on it.

SEIR model

The standard SEIR model is used to describe the disease progression in a sequence of compartments (representing subpopulations), which simplifies the modelling of infectious diseases in large populations. This model is commonly used for diseases which are transmitted from human to human (e.g. influenza) or animal to animal (e.g. foot-and-mouth disease). In the traditional SEIR model, each member of the population is considered to belong to one of four classes at a given time: Susceptible individuals (S), Exposed individuals (E), Infectious individuals (I), and Removed individuals (R). Each individual begins in the susceptible class S and moves to the exposed class E after coming into contact with an infectious individual. Exposed individuals then move into the infectious class I, where they infect other susceptible individuals, and eventually recover from the disease and move on to the recovered class R (see Figure 1.7). Being “recovered” and unable to be reinfected (due to acquired immunity), they are essentially removed from the population and play no further role in the dynamics.

Figure 1.7: An SEIR compartment model showing the transmission rate β between infectious and exposed individuals and the transition rates λ and σ into classes I and R, respectively. Here S(t), E(t), I(t) and R(t) represent the number of individuals in the respective class at time t. The total population N is assumed to be fixed.

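The SEIR model described above can be expressed by the following set of ordinary differential equations:

\[
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{SI}{N}, \\
\frac{dE}{dt} &= \beta \frac{SI}{N} - \lambda E, \\
\frac{dI}{dt} &= \lambda E - \sigma I, \\
\frac{dR}{dt} &= \sigma I, \\
N &= S + E + I + R.
\end{aligned}
\tag{1.1}
\]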

Here, β, λ and σ are the infection, incubation and recovery rates, respectively. S, E, I and R represent the number of individuals in the respective classes, and N = S(t) + E(t) + I(t) + R(t) is the constant total population. The model thus does not include birth, death, or virulence. For the relatively short time scale of the dynamics, we assume these are not major factors that influence total population numbers.
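To make the dynamics of Equation 1.1 concrete, the following sketch integrates the four equations numerically with a fixed-step fourth-order Runge-Kutta scheme. It is an illustration only: the parameter values, the initial state and the function names are hypothetical choices and are not taken from the thesis.

```python
def seir_rhs(state, beta, lam, sigma):
    """Right-hand side of the SEIR equations for state = (S, E, I, R)."""
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n  # flow from S to E
    return (-new_exposed, new_exposed - lam * e, lam * e - sigma * i, sigma * i)


def rk4_step(state, dt, beta, lam, sigma):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = seir_rhs(state, beta, lam, sigma)
    k2 = seir_rhs(shifted(k1, dt / 2), beta, lam, sigma)
    k3 = seir_rhs(shifted(k2, dt / 2), beta, lam, sigma)
    k4 = seir_rhs(shifted(k3, dt), beta, lam, sigma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_seir(initial, beta, lam, sigma, days, dt=0.1):
    """Return the list of states visited from time 0 to `days`."""
    states = [initial]
    for _ in range(round(days / dt)):
        states.append(rk4_step(states[-1], dt, beta, lam, sigma))
    return states


# Hypothetical values: 1000 individuals, 10 of them initially infectious.
trajectory = simulate_seir((990.0, 0.0, 10.0, 0.0), beta=0.5, lam=0.2, sigma=0.1, days=100)
s_end, e_end, i_end, r_end = trajectory[-1]
print(f"S={s_end:.1f} E={e_end:.1f} I={i_end:.1f} R={r_end:.1f}")
print("total population:", round(sum(trajectory[-1]), 6))
```

Because the four derivatives sum to zero, the total population stays constant along the trajectory, which provides a quick sanity check on any implementation of the model.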

Historically, epidemiological data collected during an infectious disease outbreak have been used as the main source of information to infer disease dynamics, including transmission networks (Haydon et al., 2003; Cauchemez et al., 2006; Wallinga and Lipsitch, 2006; Cauchemez and Ferguson, 2011; Cauchemez et al., 2011; Heijne et al., 2012; Wallinga and Teunis, 2004; Ferguson et al., 2001). In those models, epidemiological data such as observed times of symptom onset and/or times of culling/removal were included. These data are indeed informative as they are based on clinical observations and diagnostic results, but they do not allow very precise inference on transmission dynamics as they only indirectly reflect the underlying transmission network. In this respect, unobserved aspects of an outbreak, such as the exposure time (and incubation period), play an important role in capturing the transmission dynamics adequately. Gibson and Renshaw (1998) and O’Neill and Roberts (1999) were the first to present the use of Markov Chain Monte Carlo (MCMC) methods (see Chapter 2 for details on MCMC approaches) to impute unobserved components in an SEIR model.

[Figure 1.8 image: timelines for three farms showing their status (S, E, I) through time, together with the pathogen sequence sampled at each farm:
Farm i: …tacctgaatacataggcataccga…
Farm j: …tacctgaatatatagccataccga…
Farm k: …ttcctgaatatatagccataccga…]

Figure 1.8: Illustration of the basic concept used to infer a transmission tree from genetic data. Stars represent the sampling of pathogens from hosts; each horizontal dashed line is a distinct pathogen lineage in each host. Dash-dotted vertical arrows represent transmissions between hosts, and solid black arrows indicate mutations. Each farm is represented through its cattle, with the colour representing the farm's status through time. For example, black cattle indicate that the farm is susceptible, while red indicates that it is infectious. The basic idea is to find similarities between sequences taken from different hosts.

Another source of valuable data is the genetic data sampled from infected individuals, which carry the evolutionary history of the pathogens. With the improvement of sequencing technology, it has become relatively cheap and fast to obtain genetic data on the pathogens. Thus, several approaches were developed to infer the links based on sequence data alone (Köser et al., 2012; Ruan et al., 2003; Liu et al., 2005; Nübel et al., 2010; Mutreja et al., 2011; Walker et al., 2013; Jombart et al., 2011; Aldrin et al., 2011). The basic idea underlying these models is illustrated in Figure 1.8.

For many pathogens, particularly RNA viruses that mutate rapidly, evolutionary and epidemiological processes are inextricably linked (Holmes et al., 1995; Pybus and Rambaut, 2009). The field that analyses genetic sequence data by considering host dynamics and pathogen genetics to infer epidemiological characteristics is referred to as “phylodynamics” (Grenfell et al., 2004). During an outbreak, mutations accumulate in the genome of the pathogen and provide useful insights into the spread of the disease, as many viruses lack fidelity during replication (Holmes et al., 1995). Thus, it is possible to trace the infection through a network by following the easily observed changes in the genetic sequence as it mutates in time and travels over the network. The main idea is based on the rate of genetic mutations being slow and constant. For instance, if there is one mutation between two farms, then chances are that one farm infected the other. But if there is a sequence of changes from farm i to j to k (as in Figure 1.8), then the chances are that farm i infected farm j, which in turn infected farm k. Farm i is less likely to have infected farm k directly, as there are three mutations between them.
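The mutation counts in this example can be reproduced by counting the positions at which two aligned sequences differ (the Hamming distance). The sketch below uses the three sequence fragments printed in Figure 1.8; the function name is an illustrative assumption, and real analyses would work with full, properly aligned genomes rather than short fragments.

```python
from itertools import combinations


def hamming_distance(seq_a, seq_b):
    """Count the positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Sequence fragments as shown for farms i, j and k in Figure 1.8.
sequences = {
    "i": "tacctgaatacataggcataccga",
    "j": "tacctgaatatatagccataccga",
    "k": "ttcctgaatatatagccataccga",
}

distances = {
    (a, b): hamming_distance(sequences[a], sequences[b])
    for a, b in combinations(sequences, 2)
}
print(distances)  # {('i', 'j'): 2, ('i', 'k'): 3, ('j', 'k'): 1}
```

Farms i and k are the most distant pair, with three differences, while each is closer to farm j. This is consistent with a chain from i to j to k rather than a direct link from i to k.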

By extending the simple idea explained above, several approaches have been proposed to infer the transmission network using genetic data, using either phylogenetic (Stadler, 2009; Volz et al., 2009; Rasmussen et al., 2011; Morelli et al., 2012; Ypma et al., 2013; Didelot et al., 2014; Hall et al., 2015) or non-phylogenetic (Aldrin et al., 2011; Jombart et al., 2014; Skums et al., 2018) methods. In phylogenetic methods, a phylogenetic tree, i.e., a diagram representing the best hypothesis about how the pathogen evolved from its common ancestor, is reconstructed from the sequences sampled in an epidemic. However, a phylogenetic tree on its own does not determine the transmission network (Hall and Colijn, 2019). For instance, assuming that the internal nodes of the phylogenetic tree coincide with transmission events may lead to incorrect inference of the transmission tree because within-host pathogen diversity is ignored (for more details see Ypma et al. (2013) and Giardina et al. (2017)). Hence, extra information such as epidemiological data and movement data is required to reconstruct transmission events accurately. In this thesis, one of the most successful methods that integrates epidemiological and genetic sequence data (Lau et al., 2015) is extended so that it considers animal movements between farms. The key advantage of Lau’s mechanistic model is that it can make good predictions even when the sequence data are only partially sampled and thus incomplete. The model is also relatively easy to interpret given its mechanistic nature, and in benchmark comparisons its performance ranks amongst the best models available (Firestone et al., 2019a).

1.4 Research questions addressed in the thesis

This thesis deals with the following research questions:

1. What are the true underlying principles that determine the outcomes of Solow’s (1993a, 2014) well-known models for predicting extinction dates? In some cases Solow’s models give contradictory conclusions, e.g. when studying the extinction of the Ivory-billed Woodpecker. Can these be better resolved?

2. State-of-the-art models (Solow and Beet, 2014) yield predictions of species extinction dates in terms of complicated integral formulations that give little ecological insight. Is it possible to derive simple analytic approximations? Can the mathematical Laplace expansion, which in some circumstances can be used to simplify such integral expressions, help in this respect?

3. How does the species extinction date found in state-of-the-art prediction models relate to the last certain sighting? How does the extinction date relate to the last uncertain sighting? Are there other factors involved and why are these factors of relevance?

4. Many conventional Bayesian species extinction models make use of the problematic Bayes factor, which is often considered ad hoc. Are there alternative Bayesian methods, which bypass working with the conventional Bayes factor, that can be used to infer extinction of species? Can a likelihood for the model be determined and an MCMC computational approach be used for parameter estimation and exploration of posteriors?

5. Most extinction models assume stationary populations (Solow et al., 2012; Solow and Beet, 2014; Lee et al., 2014, 2017a). Is it possible to account for non-stationary sighting rates using the Bayesian techniques, while accounting for both certain and uncertain sightings?

6. Can traditional Survival analysis, as used in theoretical biology, be adapted to estimate dates of species extinction?

7. Can Bayesian approaches be used to study disease transmission and extinction of disease in networks of farms? Is it possible to allow for information about the network structure, information about the DNA or RNA sequences of the evolving virus as well as information about animal movements across the network? Is it possible to make predictions when information is only partial? What is the relative importance of the movement information and the genetic sequence information?

1.5 Thesis Layout

In Chapter 2, an overview of the Bayesian models used in this thesis is given, including a discussion of the reasons for their popularity over classical (frequentist) techniques. In the succeeding chapters, these tools are used to advance the current ecological and epidemiological models that were discussed.

In Chapter 3, the two extinction models of Solow and Beet (2014) were examined to identify the importance of the last certain and last uncertain sightings. The most remarkable property of these two models is that they were able to incorporate statistical uncertainty in sighting data in a reasonably general way. Unusually, however, the two models of Solow and Beet (2014) gave completely different conclusions in their analysis of the extinction of the Ivory-billed Woodpecker. Thus, in the first phase (Chapter 3) of my research, I examined these two methods to provide a mathematical explanation of why the results differed from one another. As discussed in that chapter, it was found through analytical techniques that the last certain and last uncertain sightings are the most influential factors, and the analysis untangles which of them is the most influential variable under each model. In addition, it gives insights into how subtly different ways of treating uncertain sightings could result in the last uncertain sighting becoming more influential than the last certain sighting (which is the most common and obvious influential variable in the majority of extinction models). This work is significant as it helps to identify which model can be used by ecologists in which situation, and how to untangle the influential factors controlling the signatures of extinction in sighting records. The work in Chapter 3 has been published in Kodikara et al. (2018).

In Chapter 4, a new perspective is provided on modelling extinction using a hierarchical Bayesian approach that could be applied to any sighting record consisting of certain and uncertain sightings. This new approach calculates the posterior probability that the species went extinct by the endpoint of the sighting record. Estimating this posterior extinction probability is important in conservation biology as it affects decisions about prioritising actions such as establishing protected areas. The new method proposed has notable differences from others in the literature. First, the model developed in this chapter examined the problem of estimating the extinction time when uncertain sightings are present in the sighting record. Second, the approach inferred the probability of sighting a species while it is extant. Third, the use of a computational MCMC approach made it possible to find posterior distributions, along with credible intervals, of all parameters with ease. Fourth, the approach bypassed working with the Bayes factor directly, in contrast to Solow and Beet (2014). After formulating the likelihood, priors, and hyper-priors, it is a simple matter to turn this into a flexible hierarchical Markov Chain Monte Carlo (MCMC) model, although this has never been formulated until now. It is expected that this is the direction that will be taken in future related research, specifically when there is a need to add more hierarchies to the model (for example, making the sighting rate covariate-dependent). This new model was applied to the sighting record of the Ivory-billed Woodpecker (IBW) (Campephilus principalis), and it was concluded that the IBW went extinct in the 1940s, which confirms the validity of recent studies despite arguable and controversial later sightings of the species. The work described in Chapter 4 has been accepted for publication (Kodikara et al., 2020).

In Chapter 5, a model is developed that assumes independent non-homogeneous Poisson processes for the sightings. This work significantly extends the models developed in phase two (above), which assumed that the certain and uncertain sighting rates are constant (i.e., having constant sighting probabilities), that is, a homogeneous Poisson process. Similarly, all other models reported in the literature that included both certain and uncertain sightings were built on a constant (stationary) sighting rate assumption. However, the inferences made under the constant rate assumption will be inaccurate if the true underlying rate is heterogeneous. Hence, in this chapter, the constant rate assumption of the homogeneous Poisson process is relaxed by allowing the certain and uncertain sightings to follow independent non-homogeneous Poisson processes. This new model could also identify whether the sighting rates were increasing, decreasing or constant. The proposed method was applied to the sighting records of the black-footed ferret (Mustela nigripes) and the Ivory-billed Woodpecker (IBW; Campephilus principalis). The work in Chapter 5 is currently under review in a journal related to ecological statistics and modelling.

In Chapter 6, a Bayesian updating approach is given, which is appropriate for circumstances in which sighting data are sequential in time and thus provides a means of reassessing estimates and priorities as new information arises. In addition, this model is very simple to implement in a standard spreadsheet, facilitating its adoption for routine application in organizations that use observation data to support evidence-based decisions regarding actions to protect species and their habitats. This new method is applied to sighting data of the Alaotra grebe. The work described in Chapter 6 is published in Thompson et al. (2019).

In Chapter 7, a survival model is developed. In contrast to all methods developed so far, this new model does not require a full sighting history, as it is largely based on the last certain sighting and the hazard rate. Specifically, this new model is designed for situations in which a species has not been recorded for some period after an assumed valid sighting. However, if the biology of the species suggests it depends sensitively on habitat extent, and reliable data are available on the loss of habitat over the critical period since the last sighting, then the hazard rate could be scaled accordingly to provide a more nuanced analysis. This new method is applied to sighting data of the Dodo and Aldabra snails. The work described in Chapter 7 is published in Thompson et al. (2020).

In Chapter 8, a Bayesian framework is developed to facilitate the integration of epidemiological, genetic and movement data to accurately infer the transmission network of an infectious disease outbreak. In this chapter, one of the most successful methods (Lau et al., 2015) is extended so that it considers animal movements between farms. This extended methodology is used to assess the accuracy of the inferred transmission network based on a simulated spatio-temporal epidemic. It was found that when all the information from the simulated epidemic data was used, the true transmission network could be recovered with a near-complete coverage rate. (The coverage rate of an inferred transmission graph is defined as the proportion of links predicted correctly in the network.) Also, through the simulation study I demonstrate that the accuracy of a transmission network is significantly improved through contact data in comparison to using sequence data and epidemiological data. The work described in Chapter 8 is currently under preparation for publication.

In Chapter 9, a summary overview is presented of the results obtained from the different extinction and epidemiological models discussed in Chapters 3-8.

Chapter 2

Bayesian Approach

This chapter provides background details on Bayesian inference, which is the main statistical inference method used in the thesis.

The main objective in statistical inference is to make probability statements about the model parameters given some observed data. These model parameters are generally unknown and thus need to be inferred using statistical techniques. For example, let us assume that our main interest is to estimate the probability of sighting a species, θ, where θ is assumed to be constant over space and time. In order to make inferences about θ, it is important to learn about the unknown sighting probability through an observed sample of sightings, $x = (x_1, x_2, \dots, x_n)$. If the species is sighted in the $i$th year, then $x_i = 1$; otherwise $x_i = 0$.

Based on the observed sample, a likelihood function is defined to explore plausible estimates of θ and to understand the uncertainty associated with the estimate. In the sighting example described above, the probability that the outcome $x_i$ is a sighting is equal to θ, i.e., $P(x_i = 1 \mid \theta) = \theta$. Since the outcome is a binary response, the probability of not observing the species in the $i$th year is simply the complement of the probability of having a sighting, i.e., $P(x_i = 0 \mid \theta) = 1 - \theta$. Then the likelihood function, $\ell(\theta \mid x)$, is

\[
\ell(\theta \mid x) = \prod_{i=1}^{n} P(x_i \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i}. \tag{2.1}
\]
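As a small numerical illustration of Equation 2.1, the sketch below evaluates the logarithm of the likelihood on a grid of θ values and picks the value that maximises it. The sighting record is invented for illustration and the function name is an assumption.

```python
import math


def log_likelihood(theta, x):
    """Logarithm of Equation 2.1 for a binary sighting record x."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    sightings = sum(x)
    return sightings * math.log(theta) + (len(x) - sightings) * math.log(1.0 - theta)


# Hypothetical record: sighted in 4 of 10 years (1 = sighted, 0 = not sighted).
x = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

# Evaluate the log-likelihood on a grid and take the maximiser.
grid = [i / 100 for i in range(1, 100)]
theta_hat = max(grid, key=lambda theta: log_likelihood(theta, x))
print(theta_hat)  # 0.4
```

The maximiser coincides with the observed proportion of years with a sighting, which is the usual maximum likelihood estimate for a Bernoulli probability.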

The likelihood function, $\ell(\theta \mid x)$, is a function of θ based on the joint conditional density or probability function of the sample. In addition to the likelihood function, the Bayesian approach takes the distribution of θ, i.e., p(θ), into account. This distribution quantifies the uncertainty about θ prior to seeing the data and is referred to as the prior distribution or simply the prior. Often, such prior knowledge is accessible from past information, such as previous related studies, a subjective assessment by an experienced expert, or past experience. This prior knowledge is combined with the likelihood via Bayes’ theorem.

Bayes’ Theorem

For two random events, A and B, Bayes’ theorem states that

\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \tag{2.2}
\]

where $P(A \mid B)$ is the probability of event A occurring given that B has already occurred, and $P(B \mid A)$ is the probability of event B occurring given that A has already occurred. $P(A)$ and $P(B)$ are the marginal probabilities of observing event A and event B, respectively.

Using Bayes’ theorem as given in Equation 2.2, the posterior distribution p(θ|x) is written as

\[
\underbrace{p(\theta \mid x)}_{\text{posterior}}
= \frac{\overbrace{\ell(\theta \mid x)}^{\text{likelihood}} \; \overbrace{p(\theta)}^{\text{prior}}}{\underbrace{p(x)}_{\text{marginal likelihood}}}
\propto \ell(\theta \mid x)\, p(\theta). \tag{2.3}
\]

Here, ∝ denotes proportionality.

In Figure 2.1, we represent a Beta distribution as the prior distribution for θ (i.e. θ ∼ Beta(10, 10)), along with the likelihood function, ℓ(θ|x). The prior distribution captures our beliefs prior to seeing any data; however, the chosen distribution should represent the parameter space well. In this particular example, θ is a probability, and we thus need a prior distribution with support restricted to the interval between zero and one. We use a Beta distribution for the prior since it satisfies this property. The next step in prior specification is to specify the prior parameter values that represent the researcher's belief about the parameter along with his/her confidence in that belief. In Figure 2.1, the prior is represented by a Beta(10, 10) distribution, which has an expected value of 0.5 and a variance of approximately 0.01. Thus, a Beta(10, 10) prior is more informative than a Beta(2, 2) prior, which has the same expected value but a higher variance (i.e. 0.05). As mentioned previously, the likelihood function, ℓ(θ|x), explains the data we observed. However, the key to Bayesian analysis is to determine the posterior distribution by combining the prior distribution and the likelihood function according to Equation 2.3. The posterior distribution, p(θ|x), summarizes all the information available about the parameter. In other words, one can think of the posterior as a kind of compromise between the prior distribution and the likelihood function. However, in this specific example, the posterior distribution of θ is noticeably more influenced by the prior p(θ). In practice, such prior distributions are referred to as informative priors. Finally, the posterior distribution can be used to address standard statistical problems such as parameter estimation and hypothesis testing.

Figure 2.1: Graphical representation of the posterior distribution of θ, i.e., p(θ|x), as a combination of the likelihood and prior information. In this particular example, the prior distribution is taken to be a Beta distribution parametrized by two shape parameters equal to 10.
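The pull of an informative prior can be illustrated numerically. For the Bernoulli likelihood of Equation 2.1, a Beta(a, b) prior leads to a Beta posterior with parameters a plus the number of sightings and b plus the number of years without a sighting; this is a standard conjugacy result and is stated here as background rather than taken from the thesis. The sighting record and the function names below are hypothetical.

```python
def beta_mean_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


def beta_bernoulli_update(a, b, x):
    """Posterior Beta parameters after observing the binary record x."""
    sightings = sum(x)
    return a + sightings, b + len(x) - sightings


# Prior variances quoted in the text: about 0.01 for Beta(10, 10), 0.05 for Beta(2, 2).
print(beta_mean_variance(10, 10))
print(beta_mean_variance(2, 2))

# Hypothetical record: sighted in 2 of 10 years, so the sample proportion is 0.2.
x = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
informative = beta_bernoulli_update(10, 10, x)  # Beta(12, 18)
weaker = beta_bernoulli_update(2, 2, x)         # Beta(4, 10)
print(beta_mean_variance(*informative)[0])  # posterior mean under Beta(10, 10)
print(beta_mean_variance(*weaker)[0])       # posterior mean under Beta(2, 2)
```

Under the Beta(10, 10) prior the posterior mean stays closer to the prior mean of 0.5, whereas under the Beta(2, 2) prior it moves further towards the sample proportion, which is the sense in which the former prior is more informative.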

The prior distribution is the key part of Bayesian inference and represents the knowledge about the parameter θ before the sample is taken into account. Even though this provides the opportunity to include all available information in the analysis in addition to the sample, the subjective nature of the specification has been the main criticism of Bayesian inference. However, the influence of the prior distribution diminishes when the parameters are well identified and the sample size is large. Conversely, if there is strong correlation among parameters in the posterior density, for instance, this could result in non-identifiability of the parameters. The definitions of ‘well-identified’ and ‘large’ sample size converge to the same idea. In practice, a sensitivity analysis is conducted on the posterior under different reasonable choices of prior distribution to identify the influence of the prior.

In many situations, researchers have no or very limited prior knowledge about the parameters of interest. A potential remedy for such cases is to use non-informative prior distributions so as not to introduce subjective inputs into the analysis. For the Bayesian community, finding such a prior distribution is similar to finding the “philosopher’s stone” (Consonni et al., 2018). However, it has been impossible to give an accurate definition of a non-informative prior reflecting the expression that we “know nothing” (Kass and Wasserman, 1995). Thus, the focus in the scientific community moved in the direction of finding priors that have a minimal impact on the corresponding posterior inference when prior information is unavailable. These priors have been named in many different ways, such as vague priors, objective priors, non-informative priors and reference priors (Consonni et al., 2018). In contrast, subjective or informative priors are a great resource in Bayesian analysis, specifically when prior knowledge about the parameters of interest is available and can be incorporated into a prior distribution meaningfully.

In what follows, we will discuss the most well-known methods used to define objective prior distributions in Bayesian analysis.

• Uniform prior
The most common, and an obvious, choice is to assign a Uniform (flat) prior to the parameter of interest when prior information is not available. In this approach, equal probabilities are assigned to all values within the parameter space. The Uniform prior, however, is not invariant under re-parametrization. For example, let us assume a standard Uniform prior between 0 and 1 (Unif(0, 1)) for the success probability, θ, of a Bernoulli distribution. Suppose the researcher re-parametrizes the problem in terms of the log odds (i.e. f(θ) = log(θ/(1 − θ))). Then the induced prior for f(θ) is no longer flat, and hence becomes an informative prior, even though θ is assumed to follow a standard Uniform distribution. For most Bayesian models, there is no natural way of parametrizing the research problem (Jaynes, 2003). In addition, there is no guarantee that the posterior will be proper (i.e., that the posterior density integrates to one) if a Uniform prior is defined on an unbounded parameter space (Consonni et al., 2018).

• Invariant prior
The main criticism of the Uniform prior is its lack of invariance under re-parametrization. Thus, Bayesian practitioners are keen to find objective priors that are invariant under a specific class of transformations. Comprehensive details on the uses of invariant priors can be found in Berger (1985), Dawid (2014) and Robert (2007).

• Maximum entropy prior
This method selects a prior distribution which maximizes the entropy among a set of priors (Jaynes, 2003). Assume that π(θ) is one of the continuous prior distributions for θ, then the entropy of π(θ) is given by

$$\mathrm{Ent}(\pi) = -\int_{\Theta} \pi(\theta) \log \pi(\theta)\, d\theta. \tag{2.4}$$

• Jeffreys and reference prior
Jeffreys prior (Jeffreys, 1998) is probably the most popular objective prior among Bayesian practitioners, specifically before the advent of Markov chain Monte Carlo (MCMC) methods. This prior is defined as

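$$\pi(\theta) \propto \det\big(I(\theta)\big)^{1/2}, \tag{2.5}$$

where I(θ) is the Fisher information matrix, whose generic element $I_{ij}(\theta)$ is given by

$$I_{ij}(\theta) = -E_{\theta}\left(\frac{\partial^2 \log p(X \mid \theta)}{\partial \theta_i\, \partial \theta_j}\right), \tag{2.6}$$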

where $E_{\theta}$ denotes the expected value over the sampling space for a given parameter θ, and X is an observable random variable.

Jeffreys prior has many desirable properties, such as being parametrization invariant and being a second order matching prior when θ is a scalar (Datta and Mukerjee, 2012). However, there are some potential drawbacks; for instance, there is no guarantee that the posterior obtained using a Jeffreys prior will be proper (Ye and Berger, 1991; Bernardo, 2011). Another drawback of Jeffreys prior is that there is no guarantee of satisfactory performance with regard to a function of the parameter of interest (Robert, 2007).

The problem of selecting an objective prior for a low dimensional function ψ(θ) of the entire parameter vector θ in the presence of other nuisance parameters has been the motivation for the development of the reference prior (Bernardo, 1979; Berger and Bernardo, 1992). Roughly speaking, the reference prior method introduced by Bernardo (1979) seeks a prior that maximises the average divergence between the prior and posterior for a specific quantity of interest ψ = ψ(θ). This method was then followed by further developments through a series of papers (Berger et al., 1988; Berger and Bernardo, 1989; Berger et al., 2012, 2015).
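The non-invariance of the Uniform prior noted in the list above can be illustrated by simulation. The following is a minimal sketch, not part of the original analysis: it draws θ from Unif(0, 1), transforms each draw to the log odds, and compares the share of transformed draws within one unit of zero against the standard logistic distribution, which is the distribution that a Uniform θ induces on the log odds. The function names and the sample size are illustrative choices.

```python
import math
import random


def log_odds(theta):
    """Re-parametrization f(theta) = log(theta / (1 - theta))."""
    return math.log(theta / (1.0 - theta))


def logistic_cdf(phi):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-phi))


rng = random.Random(0)
draws = []
while len(draws) < 100_000:
    theta = rng.random()
    if 0.0 < theta < 1.0:  # guard against log(0)
        draws.append(log_odds(theta))

# Share of the prior mass on the log-odds scale that falls in (-1, 1).
share_near_zero = sum(abs(phi) < 1.0 for phi in draws) / len(draws)
expected_share = logistic_cdf(1.0) - logistic_cdf(-1.0)
print(share_near_zero, expected_share)
```

A flat prior on an unbounded log-odds scale would place negligible mass on any fixed interval such as (−1, 1), whereas here a substantial share of the mass is concentrated there, so the induced prior is clearly not flat.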

However, in most ecological and epidemiological models it is impossible to derive the posterior distribution analytically, as the likelihood functions defined in these models are very complicated and analytically intractable. Fortunately, with the widespread availability of powerful computers, computationally intensive techniques such as Markov chain Monte Carlo (MCMC) methods can be employed in such situations.

2.1 Markov chain Monte Carlo method

2.1.1 Markov chain

A Markov chain describes a stochastic process in discrete time in which, given the present state, future transitions are independent of the past transitions that led up to the present state. A Markov chain differs from other general stochastic processes due to this Markov property, also known simply as the memoryless property.

Markov property

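For any positive integer t and possible states $i_1, i_2, \ldots, i_t, i_{t+1}$ of the random variables $\theta_1, \theta_2, \ldots, \theta_t, \theta_{t+1}$,

$$P(\theta_{t+1} = i_{t+1} \mid \theta_1 = i_1, \theta_2 = i_2, \ldots, \theta_t = i_t) = P(\theta_{t+1} = i_{t+1} \mid \theta_t = i_t). \tag{2.7}$$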

According to Equation 2.7, knowledge of the current state is all that is necessary to determine the transition probability of a future state.

The following properties apply either to a specific state or to the entire Markov chain (Kroese et al., 2013). Let T be the transition matrix of a Markov chain, which contains the conditional probabilities of transitioning between states given the current state.

• If returns to a state i can occur only at multiples of k time steps, where k > 0 is the largest such integer, then that state has period k. A state is known as aperiodic if k = 1, otherwise it is referred to as periodic. The Markov chain is known as aperiodic if all the states in the chain are aperiodic.

• The Markov chain is known as irreducible if the probability of reaching any non-empty subset of states is positive, regardless of the starting state.

• If it is impossible to leave a state then it is called an absorbing state and the transition probability to another state from an absorbing state is zero.

• A state i is a recurrent state if a process beginning in state i is guaranteed (i.e. with probability one) to return to state i. If there is no such guarantee, then it is referred to as a transient state.

• If a state is both positive recurrent (i.e. expected to return within a finite number of steps) and aperiodic, then that state is known as ergodic. A Markov chain is known as ergodic if all the states in the chain are ergodic.
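To make the role of the transition matrix concrete, the sketch below propagates a starting distribution through a small, hypothetical three-state chain. The matrix entries are illustrative only; the chain is irreducible and aperiodic (every state can be reached from every other, and each state has a positive probability of staying put), so repeated application of T should approach a single stationary distribution whatever the starting state.

```python
def step(dist, T):
    """One step of the chain: new_dist[j] = sum_i dist[i] * T[i][j]."""
    k = len(T)
    return [sum(dist[i] * T[i][j] for i in range(k)) for j in range(k)]


# Hypothetical transition matrix; row i holds P(next state = j | current state = i).
T = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

dist = [1.0, 0.0, 0.0]  # start with certainty in the first state
for _ in range(100):
    dist = step(dist, T)

print(dist)  # approaches the stationary distribution (0.25, 0.5, 0.25)
```

The limit can be verified by hand: the vector (0.25, 0.5, 0.25) is left unchanged when multiplied by T, which is the defining property of a stationary distribution.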

If a Markov chain is irreducible, aperiodic and positive recurrent then it is guaranteed to converge to a stationary distribution that remains unchanged in the Markov chain as time progresses. This property is used to generate a Markov chain in such a manner that the posterior distribution of interest is its stationary distribution. Once the stationary state is reached, each subsequent realisation of the chain can be treated as a (correlated) draw from the posterior. Even though the chain is guaranteed to converge regardless of the initial state, there is no definitive way of confirming this convergence in a finite run; in practice it is usually assessed by visual inspection of the chains. To obtain convergence in practice, a significant number of initial realisations of the chain are discarded before the chain is considered to be in a stationary state and the dependence on the initial point becomes irrelevant. This period, in which the MCMC moves from its unrepresentative initial value to the modal region of the posterior, is called the burn-in period (Kruschke, 2014). In addition to the burn-in period, the user can also define a number of steps to be thinned in a Markov chain to reduce the auto-correlations between consecutive observations in the chain. For example, when the chain is thinned every 3rd step, only the observations at steps 1, 4, 7, 10 and so on are retained. The resulting thinned chain is less auto-correlated than the original complete chain (Kruschke, 2014).
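Burn-in and thinning amount to simple slicing of the stored chain. The sketch below is illustrative only: the chain is a hypothetical first-order autoregressive sequence standing in for MCMC output, and the helper names, burn-in length and thinning interval are arbitrary choices made for the example.

```python
import random


def burn_in_and_thin(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return chain[burn_in::thin]


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


# Thinning every 3rd step keeps the observations at steps 1, 4, 7, 10.
print(burn_in_and_thin(list(range(1, 13)), burn_in=0, thin=3))

# Hypothetical, strongly auto-correlated chain standing in for MCMC output.
rng = random.Random(1)
chain = [0.0]
for _ in range(20000):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

full = chain[1000:]
kept = burn_in_and_thin(chain, burn_in=1000, thin=3)
print(lag1_autocorrelation(full), lag1_autocorrelation(kept))
```

The thinned chain shows a lower lag-1 autocorrelation than the full chain, at the cost of retaining only a third of the draws.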

The following is a list of popular MCMC algorithms (Kroese et al., 2013). In addition to the ones listed here, there are many other sophisticated MCMC algorithms, such as Hamiltonian Monte Carlo and the Wang-Landau algorithm.

1. Metropolis-Hastings algorithm

2. Gibbs sampler

3. Hit-and-run sampler

4. Shake-and-bake algorithm

5. Metropolis-Gibbs method

6. Multiple-try Metropolis-Hastings method

7. Slice sampler

8. Swendsen-Wang algorithm

9. Reversible-jump sampler

In what follows, we provide details on the Metropolis-Hastings algorithm and the Gibbs sampler as they are the main approaches used in this thesis. While the methods covered in Chapters 4 and 5 use the Gibbs sampler, those in Chapter 8 use the Metropolis-Hastings algorithm to simulate from the posterior distributions of model parameters.

2.1.2 Metropolis-Hastings algorithm

The Metropolis-Hastings (M-H) algorithm is a simple algorithm for producing a Markov chain whose stationary distribution is a given target distribution (Metropolis et al., 1953; Hastings, 1970; Chib and Greenberg, 1995). Suppose we want to sample from a one-dimensional target distribution p(θ); the algorithm can be extended to multiple parameters by updating the parameters sequentially. The M-H algorithm simulates a Markov chain using a transition kernel, q, and in the long run the simulated values behave as draws from the target distribution. The transition kernel q lets the current state ($\theta_t$) move randomly to a new state (say $\theta_{t+1}$) and thus represents a distribution on $\theta_{t+1}$ given $\theta_t$. In practice, $q(\theta_{t+1} \mid \theta_t)$ is considered to be a continuous function, commonly referred to as the proposal distribution from $\theta_t$ to $\theta_{t+1}$.

M-H algorithm

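• Initialize the chain (i.e. choose $\theta_1$).

• For t = 1, 2, ...

– Sample $\theta'$ from $q(\cdot \mid \theta_t)$; $\theta'$ is the proposed value for $\theta_{t+1}$.

– Compute the acceptance probability $\alpha(\theta_t, \theta')$,

$$\alpha(\theta_t, \theta') = \min\left(1, \frac{p(\theta') \times q(\theta_t \mid \theta')}{p(\theta_t) \times q(\theta' \mid \theta_t)}\right). \tag{2.8}$$

– Accept the proposed value $\theta'$ (i.e. set $\theta_{t+1} = \theta'$) with probability $\alpha(\theta_t, \theta')$, otherwise set $\theta_{t+1} = \theta_t$.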
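The steps above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than the implementation used in later chapters: the proposal is a symmetric Gaussian random walk, so the q terms in Equation 2.8 cancel; the target is the posterior arising from a Beta(10, 10) prior with hypothetical Bernoulli data (7 successes, 3 failures), whose exact form is Beta(17, 13); and the function names, step size and chain length are arbitrary choices.

```python
import math
import random


def log_posterior(theta, a=10.0, b=10.0, successes=7, failures=3):
    """Unnormalised log posterior: Beta(a, b) prior with hypothetical Bernoulli data."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the support
    return (a - 1 + successes) * math.log(theta) + (b - 1 + failures) * math.log(1.0 - theta)


def metropolis_hastings(log_target, theta0, n_steps, proposal_sd=0.2, seed=0):
    """Random-walk M-H; the symmetric proposal makes the q ratio in Eq. 2.8 equal to one."""
    rng = random.Random(seed)
    theta, log_p = theta0, log_target(theta0)
    chain = [theta]
    for _ in range(n_steps):
        proposal = rng.gauss(theta, proposal_sd)          # draw theta' from q(. | theta_t)
        log_p_new = log_target(proposal)
        alpha = math.exp(min(0.0, log_p_new - log_p))     # acceptance probability
        if rng.random() < alpha:
            theta, log_p = proposal, log_p_new            # accept the proposal
        chain.append(theta)                               # otherwise repeat theta_t
    return chain


chain = metropolis_hastings(log_posterior, theta0=0.5, n_steps=20000)
kept = chain[2000:]                                       # discard burn-in draws
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)  # should be close to the exact value 17/30
```

Working with log densities avoids numerical underflow when the target density is very small, and only the unnormalised target is needed because the normalising constant cancels in the acceptance ratio.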

Proposal distributions can take any form, with the goal of being able to explore the regions of the parameter space efficiently. Even though the stationary distribution is invariant to the proposal distribution, the choice of the proposal distribution can affect the convergence and mixing of the chain, specifically in high-dimensional problems. For example, if the proposal distribution is too broad then most of the proposed values will be rejected and the chain will be trapped in a localized region of the target distribution.

2.1.3 Gibbs sampler

Another type of MCMC sampling that is effective for models with multiple parameters is referred to as the Gibbs sampler. This is a special case of the M-H algorithm where $\theta_{t+1}$ is sampled from its full conditional distribution, i.e. its distribution conditional on the other parameters fixed at their current values (Casella and George, 1992). In Gibbs sampling, all proposed values are accepted, unlike in the M-H algorithm. However, in order to apply the Gibbs sampling algorithm, we should be able to derive and generate random samples from the conditional distribution of a parameter when the other model parameters are given. The idea behind Gibbs sampling is to generate a sample from the joint posterior distribution of the parameters by sweeping through each parameter in turn and sampling from its conditional distribution when all the remaining parameters are fixed at their current values.
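The sweep through conditional distributions can be illustrated with a textbook example that is not taken from this thesis: a standard bivariate normal target with correlation ρ, for which each full conditional is itself normal, x | y ∼ N(ρy, 1 − ρ²) and y | x ∼ N(ρx, 1 − ρ²). The function names and the value ρ = 0.8 are illustrative assumptions.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    draws = []
    for _ in range(n_draws):
        x = rng.gauss(rho * y, cond_sd)  # sample x from p(x | y)
        y = rng.gauss(rho * x, cond_sd)  # sample y from p(y | x), using the new x
        draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(rho=0.8, n_draws=20000)[1000:]  # drop burn-in
print(sample_correlation(draws))  # should be close to 0.8
```

No accept/reject step appears because every draw from a full conditional is kept, which is the sense in which all proposed values are accepted.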

2.2 Popularity of Bayesian approaches in ecology and epidemiology

In the Bayesian approach, the parameter (say θ) is treated as a random variable as opposed to an unknown fixed quantity as in the classical approach. Due to this fundamental difference, it makes sense to talk about prior and posterior distributions of the parameter θ in the Bayesian context. Thus, in a Bayesian approach, the question of whether θ is less than k can be answered using the probability statement P(θ < k|x). In contrast, classical inference requires one to calculate a p-value in order to carry out a formal hypothesis test at a chosen significance level, such as 5%, to decide whether or not to reject the null hypothesis of interest. Classical inference provides a probability statement about the observed data under the null hypothesis instead of about θ. Hence, it assigns no probability statement to the model parameter θ. It is, however, important to modellers in ecology and epidemiology to be able to provide probability statements about model parameters such as extinction time and transmission rate; hence, Bayesian methods are preferred over classical approaches in such studies.

Also, classical inference only uses the information from the sample and thus lacks the machinery of the Bayesian approach, where the information from the sample can be naturally combined with the prior belief about model parameters. This additional inclusion of prior knowledge often facilitates the inference, especially when data alone may not be sufficient to estimate parameters. Bayesian inference is also more axiomatically favourable compared to classical inference. Many classical inference procedures rely on the outcomes from the long-run behaviour of hypothetical repeated samples, while the likelihood principle requires that any inference should only depend on the observed sample (O’Hagan and Forster, 2004).

Bayesian statistics has become more popular in the last couple of decades due to the improvements in computational resources along with the advancements in computational approaches such as MCMC. The use of computational approaches makes it possible to obtain posterior distributions, along with credible intervals, of all parameters with ease. This is difficult to do with frequentist methods, and often impossible due to the complex modelling structures used in both ecology and epidemiology. Also, this approach is relatively easy to implement and understand, particularly for those ecological and epidemiological practitioners who are not trained in mathematics and more advanced statistical theory. The flexibility of the approach becomes more obvious when there is a need to add more hierarchies into the model (for example, making the sighting probability covariate dependent). The computational Bayesian approach can be adjusted accordingly, without a need to derive the mathematically complex form for the posterior probability of extinction. This approach has become very popular of late, in particular with the availability of statistical software such as JAGS, OpenBUGS, WinBUGS and Stan.

Even though Bayesian approaches are popular and useful, they also have their weaknesses. The main ones to consider are the messy computational spaces and the possible biases introduced by prior selection. Scientists who are unaware of Bayesian techniques (and hence do not have a full understanding of what is meant by a prior distribution) can provide prior distributions that are either inconsistent or far too precise, and even for experts, specifying prior information for the analysis can be time-consuming (Walters and Ludwig, 1994). Kuhnert et al. (2010) examined the impact of prior distributions specified based on expert knowledge and specifically how to minimize potential bias. However, the problem of introducing biased information into Bayesian models through the prior can be avoided if informative priors are determined by information from synthesis studies rather than relying heavily on expert opinion. Regarding priors, it is important to remember that if the data are sufficiently informative, possibly due to a large sample size, then the choice of the prior for the model parameters will not affect the posterior results substantially. However, if the posterior results are sensitive to the choice of priors, then this indicates that the data provide little information about the parameters, and this should be taken seriously when making decisions based on the analysis (Punt and Hilborn, 1997).

Another area to consider when conducting Bayesian analysis is the computational demands. For the complex modelling structures used in both ecology and epidemiology, it will often take several days of computer time even on reasonably powerful personal computers to conduct a Bayesian analysis. Hence, it is important that scientists try to optimize the efficiency of these methods through techniques such as parallel processing and re-parameterization.

Chapter 3

Incorporating uncertainty in the sighting data into species extinction models

This chapter has been published, and has the following citation:

Kodikara, S., Demirhan, H., and Stone, L. (2018). Inferring about the extinction of a species using certain and uncertain sightings. Journal of Theoretical Biology, 442:98–109

Abstract

The sighting record of a species is often used to infer its extinction. Most of these sightings have uncertain validity. Solow and Beet (2014, On uncertain sightings and inference about extinction. Conservation Biology 28:1119-1123) developed two models using a Bayesian approach which allowed for uncertainty in the sighting record by formally incorporating both certain and uncertain sightings, but in different ways. Interestingly, the two methods give completely different conclusions concerning the extinction of the Ivory-billed Woodpecker. We further examined these two methods to provide a mathematical explanation of why the results differed from one another. As a result, it was found that the first model was more sensitive to the last uncertain sighting, while the second was more sensitive to the last certain sighting. The difficulties in choosing the appropriate model are discussed.

3.1 Introduction

The Ivory-billed Woodpecker, Campephilus principalis, was generally considered to have gone extinct in the 1950s (Jackson, 2004). The rediscovery of the Ivory-billed Woodpecker made headlines all over the world and was featured on the cover of Science magazine (Wilcove, 2005). After the initial excitement, however, doubts about the validity of the rediscovery were raised and the video evidence was soon disputed by independent researchers (Sibley et al., 2006), who argued that the bird was a normal Pileated Woodpecker (Dryocopus pileatus). Nevertheless, a recent publication (Collins, 2017) provides three pieces of video footage as evidence that the Ivory-billed Woodpecker is extant. The bird in the footage was claimed to show flight, behavior and field marks that are consistent with the Ivory-billed Woodpecker and have not been noted for any other species inhabiting the region.

A related incident arose with regard to the Aldabra banded snail Rhachistia aldabrae, which was understood to have become extinct in the 1990s, based on a lack of sighting data. The last sighting occurred in 1996. As a further check, dedicated ecological surveys were carried out to locate the snail in 2005 and 2006, but none succeeded and Gerlach (2007) concluded that the snail was driven to extinction due to climate change. However, in 2014 the Seychelles Islands Foundation reported that the Aldabra snail had reappeared in their surveys of the Aldabra atoll. When reviewing the events of the Aldabra snail the following question was presented: “Where does this leave us? First, it leaves us with the need to have a renewed debate about the methods used to assess extinction probability ...” (Battarbee, 2014).

Solow and Beet (2014) developed two different, but nevertheless related, probabilistic models for analyzing sighting records and applied their models to estimate the extinction time of the Ivory-billed Woodpecker. When applied to the same sighting record for the Ivory-billed Woodpecker, the two models give two distinct results: one predicts that the woodpecker is extant, and the other concludes that it is well and truly extinct (Solow and Beet, 2014). Solow and Beet (2014) emphasize that deciding which of these methods is appropriate in a particular case requires an understanding of what they called the natural history of the sighting record.

In this paper, we will attempt to understand some of the issues that lead to these contradictory predictions in more detail. To begin the process, we examine the two extinction models of Solow and Beet (2014) more carefully and attempt to identify the factors that the models are most sensitive to. Their models (Solow and Beet, 2014) are among only a few models that take into account uncertainty in the historical sighting record (i.e. Solow et al., 2012; Thompson et al., 2017; Lee, 2014; Jarić and Roberts, 2014; Lee et al., 2015). In practice, the historical sighting records of a species may often be difficult to interpret. The validity of any sighting record can either be certain or uncertain. However, Rivadeneria et al. (2009) claim that most statistical methods for assessing species extinction before 2009 were based on all sightings being valid with certainty. Because the conclusions about extinction are sensitive to the inclusion or exclusion of uncertain sightings, the inferences from the models including uncertain sightings differ significantly from those obtained by the models omitting this information (Roberts et al., 2010). In the following study, we demonstrate the reasons behind the contradictory results of the two models of Solow and Beet (2014), in relation to the subtly different ways in which uncertain sightings are modelled.

3.2 Data

As a first step, it is of interest to examine the sighting record data of the Ivory-billed Woodpecker reported in Roberts et al. (2010). Following Solow and Beet (2014), we treated all sightings not based on physical evidence as uncertain. Since there was no natural way to define the beginning of the observation period, the period was taken to be [1897, 2010] and the first sighting in 1897 was omitted, as is standard in the literature (Solow, 1993a; Solow and Beet, 2014; Boakes et al., 2015). The record period from 1897-2010 contains 21 certain sightings in years 1898-1902, 1904-1910, 1913, 1914, 1917, 1924, 1925, 1932, 1935, 1938, and 1939 and 46 uncertain sightings in years 1911, 1916, 1920, 1921, 1923, 1926, 1929-1931, 1933, 1934, 1936, 1937, 1941-1944, 1946, 1948-1952, 1955, 1958, 1959, 1962, 1966-1968, 1969, 1970-1974, 1976, 1981, 1982, 1985-1988, 1999, 2004-2006, where for example 2004-2006 means that there are sightings in every year from 2004 to 2006. The Ivory-billed Woodpecker sighting data described above are visualized graphically in Figure 3.1.

Figure 3.1: Graphical representation of Ivory-billed Woodpecker sighting data (time axis marked at 1897, 1939, 2006 and 2010). Blue represents the years where there are certain sightings followed by pink which represents the uncertain sightings. Note the period of overlap of these two sighting types.

In what follows, we adopt an unconventional format: we present the Methods and then the Results for Model 1, and then do the same for Model 2.

3.3 Methods- Model 1

We begin by outlining the basic structure of the first model of Solow and Beet (2014). Define the observation period of a species as (0, T), where 0 is the time when observations began and the period lasts for T years altogether. The complete sighting record $t = (t_1, t_2, \ldots, t_n)$ consists of the times (i.e., here the years) when the species was sighted during the observation period.

If the sighting is based on clear physical evidence, it is classified as certain; but if it is based on a sound recording, photograph or video, or some other less precise confirmation, it is classified as uncertain. Thus, for observed data, we only know whether each sighting is certain or uncertain. All certain sightings are considered to be valid. Note, however, that uncertain sightings can either be valid or invalid. In practice, whether an uncertain sighting is valid or invalid is often unknown.

In the datasets studied here, all that is known is which sightings are certain and which are uncertain. From this information, the rates of valid and invalid sightings can be inferred, so that an informed decision can be made as to the likelihood of the species being extant or extinct.

Figure 3.2: Model 1 is based on the assumption that the valid and invalid sightings follow different Poisson processes. The species becomes extinct at time $\tau_E$. Before $\tau_E$, all sightings are either valid with rate Λ or invalid with rate Θ. Note that both Λ and Θ are assumed to be constant. In the real data all certain sightings are valid, while uncertain sightings are either valid or invalid. After the species becomes extinct at time $\tau_E$, all sightings must be invalid with rate Θ, and appear in the real data as uncertain.

Under the first model of Solow and Beet (2014), the sighting record is divided into two parts with the division based on the (unknown) extinction time $\tau_E$. Over the first time period in the interval $(0, \tau_E)$, valid sightings follow a stationary Poisson process with rate Λ, and invalid sightings follow a stationary Poisson process with rate Θ. The expected proportion of valid sightings is Ω = Λ/(Λ + Θ). Then, the invalid rate Θ is equal to Λ(1 − Ω)/Ω.

Over the second time period in the interval $(\tau_E, T)$, all the sightings are invalid and follow a stationary Poisson process still with rate Θ. That is, the rate of invalid sightings does not change over the whole interval (0, T). Model 1 is summarized in Figure 3.2.
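The data-generating mechanism assumed by Model 1 can be sketched by simulating the two Poisson processes directly. This is an illustrative sketch only: the rates, extinction time and function names below are hypothetical, time is measured in years from the start of the observation period, and the classification of valid sightings into certain and uncertain is not simulated.

```python
import random


def poisson_process(rate, start, end, rng):
    """Event times of a stationary Poisson process with the given rate on (start, end)."""
    times, t = [], start
    while True:
        t += rng.expovariate(rate)  # exponential waiting time between events
        if t >= end:
            return times
        times.append(t)


def simulate_model1(valid_rate, invalid_rate, tau_E, T, seed=0):
    """Valid sightings occur on (0, tau_E); invalid sightings occur on (0, T)."""
    rng = random.Random(seed)
    valid = poisson_process(valid_rate, 0.0, tau_E, rng)
    invalid = poisson_process(invalid_rate, 0.0, T, rng)
    return valid, invalid


# Hypothetical parameter values, for illustration only.
valid_rate, invalid_rate = 0.5, 0.4
omega = valid_rate / (valid_rate + invalid_rate)  # expected proportion of valid sightings before tau_E
valid, invalid = simulate_model1(valid_rate, invalid_rate, tau_E=42.0, T=113.0)
print(len(valid), len(invalid), omega)
```

In this parametrization the invalid rate can be recovered from the valid rate and the proportion of valid sightings as `valid_rate * (1 - omega) / omega`.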

Let $n_c$ and $n_u$ be the number of certain and uncertain sightings. The first model proceeds by assuming the extinction time $\tau_E$ falls in the interval (0, T), with $n(\tau_E)$ sightings prior to $\tau_E$. We suppose the number of valid uncertain sightings in $(0, \tau_E)$ is the random variable j. This gives the scheme shown in Figure 3.2. There are thus $n_c + j$ valid sightings in $(0, \tau_E)$, and there are $n - (n_c + j)$ invalid sightings in (0, T). The goal is to construct likelihoods for the valid and invalid sightings in these two time intervals, and assess the extinction hypothesis from these likelihoods.

In more detail, let E be the event that the species became extinct during the observation period (0, T) and $\bar{E}$ be the event that the species is extant at time T. By Bayes' theorem, the posterior probability of an extinction event E given the complete sighting record t is

$$p(E \mid t) = \frac{p(t \mid E)\, p(E)}{p(t \mid E)\, p(E) + p(t \mid \bar{E})\, (1 - p(E))}, \tag{3.1}$$

where $p(t \mid E)$ is the likelihood of t given E, $p(t \mid \bar{E})$ is the likelihood of t given $\bar{E}$, and p(E) is the prior probability of extinction.

The Bayes factor, B(t), is a standard Bayesian measure used to decide whether the data support the null hypothesis E, that the species went extinct in the study interval, without depending on p(E). The Bayes factor is defined as:

$$B(t) = \frac{p(t \mid E)}{p(t \mid \bar{E})}. \tag{3.2}$$

A value of B(t) > 3 constitutes substantial evidence for the null hypothesis E, while a value B(t) < 1/3 constitutes substantial evidence for the alternative hypothesis $\bar{E}$ that the species is extant (Kass and Raftery, 1995; Solow and Beet, 2014). In the following section, we will describe methods for estimating the Bayes factor, so that the null hypothesis may be tested with real sighting data.
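Dividing the numerator and denominator of Equation 3.1 by the likelihood under the extant hypothesis shows how the Bayes factor and the prior probability of extinction combine: p(E|t) = B(t) p(E) / (B(t) p(E) + 1 − p(E)). The short sketch below evaluates this relationship; the function name and the example values are illustrative and not taken from the analysis.

```python
def posterior_extinction_probability(bayes_factor, prior):
    """Posterior p(E | t) implied by Equations 3.1 and 3.2 for a prior p(E)."""
    return bayes_factor * prior / (bayes_factor * prior + (1.0 - prior))


# With an even prior, the thresholds B(t) = 3 and B(t) = 1/3 correspond to
# posterior extinction probabilities of 0.75 and 0.25 respectively.
for bf in (3.0, 1.0, 1.0 / 3.0):
    print(bf, posterior_extinction_probability(bf, prior=0.5))
```

A Bayes factor of one leaves the prior probability unchanged, which is why B(t) can be reported without committing to a particular p(E).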

The likelihood $p(t \mid E)$ is given by

$$p(t \mid E) = \int_{t_L}^{T} p(t \mid \tau_E)\, p(\tau_E)\, d\tau_E, \tag{3.3}$$

where $p(t \mid \tau_E)$ is the likelihood of t given that extinction occurs at a given time $\tau_E$ and $p(\tau_E)$ is the prior probability density function of the extinction time. The time point $t_L$ is the time of the last certain sighting, which makes $p(t \mid \tau_E) = 0$ if $\tau_E < t_L$. Also, the likelihood $p(t \mid \bar{E})$ is given by

$$p(t \mid \bar{E}) = p(t \mid \tau_E = T). \tag{3.4}$$

The likelihood of t given a particular $\tau_E$ for Model 1 can be obtained through a set of calculations and assumptions on model parameters; see Solow and Beet (2014) for more details.

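$$p(t \mid \tau_E) = \int_0^1 \omega^{-n_u} (1 - \omega)^{n - n(\tau_E)} (n - 1)! \left(\tau_E + \left(\frac{1 - \omega}{\omega}\right) T\right)^{-n} d\omega. \tag{3.5}$$

Note that in Equation 3.5, ω is the proportion of valid sightings. The likelihoods under events $\bar{E}$ and E can then be found using Equation 3.5 along with Equations 3.4 and 3.3,

$$p(t \mid \bar{E}) = p(t \mid \tau_E = T) = \frac{(n - 1)!}{T^n} \int_0^1 \omega^{n_c}\, d\omega = \frac{(n - 1)!}{T^n (n_c + 1)}, \tag{3.6}$$

$$p(t \mid E) = \int_{t_L}^{T} p(t \mid \tau_E)\, p(\tau_E)\, d\tau_E. \tag{3.7}$$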

We build our study based on Equation 3.6 and Equation 3.7, which were formulated in Solow and Beet (2014). In our study here, we assume a uniform distribution $p(\tau_E) = \frac{1}{T}$ on $0 \le \tau_E \le T$ and change variables to $x = \frac{\tau_E}{T}$ (i.e. the proportion of time the species was extant). Some cancellations then occur, resulting in a simplified Bayes factor (BF):

$$BF = \frac{p(t \mid E)}{p(t \mid \bar{E})} = (n_c + 1) \int_{\epsilon}^{1} \int_0^1 \omega^{n_c - n}\, \frac{(1 - \omega)^{n - n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}\, d\omega\, dx, \tag{3.8}$$

where $\epsilon = \frac{t_L}{T}$, the proportion of time where certain sightings occurred.

If we let $I(\omega, x) = \omega^{n_c - n}\, \frac{(1 - \omega)^{n - n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}$, we get the following as the Bayes factor corresponding to Model 1:

$$BF_1 = (n_c + 1) \int_{\epsilon}^{1} \int_0^1 I(\omega, x)\, d\omega\, dx. \tag{3.9}$$

At first glance, Equation 3.9 suggests that $BF_1$ will depend strongly on $n_c$ and $\epsilon$, but by closely observing I(ω, x) it is clear that it also depends on the order of the actual data through n(xT). Note that n(xT) is the number of sightings prior to time xT (i.e. $\tau_E$).
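A numerical evaluation of Equation 3.9 can be sketched with a simple midpoint rule. The code below is an illustration under stated assumptions and not the routine used to produce the results that follow: sighting times are measured in years from the start of the observation period, the grid sizes and function names are arbitrary, the toy record is hypothetical, and the integrand is rewritten using the identity x + 1/ω − 1 = (1 − ω(1 − x))/ω so that the large negative power of ω does not overflow near ω = 0.

```python
from bisect import bisect_left


def integrand(omega, x, n_c, n, n_before):
    """I(omega, x) of Eq. 3.9 in the algebraically equivalent, numerically stable form
    omega^n_c * (1 - omega)^(n - n(xT)) / (1 - omega * (1 - x))^n."""
    return omega ** n_c * (1.0 - omega) ** (n - n_before) / (1.0 - omega * (1.0 - x)) ** n


def bayes_factor_model1(certain, uncertain, T, n_x=400, n_omega=400):
    """Midpoint-rule approximation of BF_1 in Eq. 3.9."""
    times = sorted(certain + uncertain)
    n_c, n = len(certain), len(times)
    eps = max(certain) / T                        # t_L / T
    dx, dw = (1.0 - eps) / n_x, 1.0 / n_omega
    total = 0.0
    for i in range(n_x):
        x = eps + (i + 0.5) * dx
        n_before = bisect_left(times, x * T)      # n(xT): sightings prior to time xT
        inner = sum(integrand((j + 0.5) * dw, x, n_c, n, n_before) for j in range(n_omega))
        total += inner * dw * dx
    return (n_c + 1) * total


# Hypothetical toy record (years since the start of observation), for illustration only.
toy_certain = [5.0, 12.0, 20.0]
toy_uncertain = [8.0, 25.0, 31.0]
print(bayes_factor_model1(toy_certain, toy_uncertain, T=40.0))
```

Because n(xT) is a step function of x, a finer grid in x (or integrating separately between consecutive sighting times) would be a natural refinement if higher accuracy were required.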

3.4 Results- Model 1

There is no closed form solution for the Bayes factor $BF_1$ of Model 1 in Equation 3.9, and it can only be evaluated through numerical integration. As such, one cannot clearly see which variables $BF_1$ is most sensitive to. In what follows we attempt to identify the key influential variables.

Using Equation 3.8, the Bayes factor was calculated for a number of different versions of the woodpecker dataset (see Data above). Note that the Ivory-billed Woodpecker data is characterized by a period of certain sightings followed by a period of uncertain sightings, and at the cross-over point there is a small period of overlap where there are both certain and uncertain sightings (see Figure 3.1). Hence we found it useful to examine the sensitivity of Model 1 to certain and uncertain sightings separately. In order to achieve this we considered different variations of the original sighting data of the Ivory-billed Woodpecker.

Figure 3.3: LH graph: change in the Bayes factor for Model 1 for different variants in certain sightings. When the x-axis indicates “1932-(18),” this refers to the last certain sighting being in 1932 and the total number of certain sightings being $n_c = 18$. RH graph: the same, but for uncertain sightings. Note that the magnitudes observed in the RH graph are of order $10^{10}$, indicating Model 1 is much more sensitive to uncertain sightings.

The Left Hand (LH) graph of Figure 3.3 examines the Bayes Factor when changing certain sightings in the dataset, while the Right Hand (RH) graph examines the Bayes Factor when changing uncertain sightings in the dataset. When changing the certain sighting information we deleted the last certain sighting sequentially, and calculated the Bayes Factor each time. Thus where the x-axis indicates “1932-(18)” this refers to the last certain sighting being in 1932 and the total number of certain sightings being nc = 18. In a similar manner we changed the uncertain sighting information as well.

Comparing the magnitudes of the two graphs in Figure 3.3, it is clear that the Bayes Factor for Model 1 changes far more in the RH graph and is thus controlled by the uncertain sightings. The RH graph reaches values of BF ∼ $10^{10}$ (i.e., orders of magnitude higher than the LH graph), which gives overwhelming evidence that Model 1 is much more sensitive to uncertain sightings.

Note that both the RH and LH graphs in Figure 3.3 examine two parameters (i.e., the number of certain/uncertain sightings and the last certain/uncertain sighting), and the two parameters are changed simultaneously. In order to investigate which parameter is the most influential we need to keep one parameter fixed while changing the other. We therefore considered two situations in Figure 3.4: one having the last uncertain sighting fixed at 2006 and the other having the same number of uncertain sightings as the original data.

Figure 3.4: Change in the Bayes factor for Model 1 for different variants in uncertain sightings. Recall that the model does not require the user to specify the dates because it assumes they are uniform / equidistant. Comparing the LH and RH graphs gives evidence that Model 1 is much more sensitive to the last uncertain sighting than to the number of uncertain sightings. Note that the magnitude of the RH graph is of order $10^{7}$.

By examining the graphs in Figure 3.4 it becomes evident that the change in the last uncertain sighting (rather than the number of uncertain sightings) is largely responsible for the huge changes in the Bayes Factor for Model 1. Hence our working hypothesis for Model 1 is that the Bayes Factor is most sensitive to the time of the last uncertain sighting. Given the structure of Model 1, this seems a reasonable prediction. This is because Model 1 assumes that all valid sightings, whether certain or uncertain, follow the same Poisson process. In this case, the relatively high rate of certain sightings prior to 1939 ensures that the rate of valid sightings prior to 1939 is large. Since the same rate will be applied to valid uncertain sightings after 1939, this implies the majority of uncertain sightings are valid, and thus extinction occurs immediately after the last uncertain sighting. Hence the value of BF1 will start to favor the hypothesis that the extinction occurred between (0,T ), and more so as the gap between the last uncertain sighting and T increases.

The importance of the last uncertain sighting.

In order to understand the Bayes Factor in Equation 3.2, we examined its behavior under a number of simplified scenarios that gave some analytical insights.

Case 1: Model 1. No extinction baseline.

First, it was assumed that all sightings are spread uniformly until the end of the observation period T. This corresponds to a baseline scenario of what we would expect if the species never went extinct, when all valid sightings have the same rate in (0, T).

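In such a case, the number of sightings in the interval (0, xT ) (i.e., (0, τE)) could be approximated by n(xT ) = nx, for 0 ≤ x ≤ 1. For this case, Equation 3.8 can be simplified to

$$
\begin{aligned}
BF_1 &= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{\omega^{n_c-n}\,(1-\omega)^{n-n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}\, d\omega\, dx \\
&= (n_c + 1) \int_{0}^{1} \omega^{n_c} \left(\frac{1-\omega}{\omega}\right)^{n} \int_{\epsilon}^{1} \left(\frac{(1-\omega)^{-x}}{x + \frac{1}{\omega} - 1}\right)^{n} dx\, d\omega.
\end{aligned}
\tag{3.10}
$$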

Laplace’s method is used to approximate integrals of the form $\int_a^b \exp(M f(x))\,dx$ (Laplace, 1986). The method requires that M be large and that f(x) be a twice-differentiable function. The integral over x in Equation 3.10 can be evaluated using Laplace’s method, as $(1-\omega)^{-x} / \left(x + \frac{1}{\omega} - 1\right)$ is twice differentiable. In this special case, the maximum of the integrand occurs at the end point x = 1, where in general for large n

$$
I_n = \int_{\epsilon}^{1} \exp(n g(x))\, dx \sim \left(n g'(1)\right)^{-1} \exp(n g(1)). \tag{3.11}
$$

Using Equation 3.10 and $-\log(1-\omega) = \omega + \frac{\omega^2}{2} + \frac{\omega^3}{3} + \dots$ we obtain

$$
BF_1 \approx (n_c + 1) \int_{0}^{1} \omega^{n_c} \left(\frac{1-\omega}{\omega}\right)^{n} \left[\left(\frac{\omega}{1-\omega}\right)^{n} \times \frac{2}{n\omega^{2}}\right] d\omega = \frac{2}{n}\,\frac{n_c+1}{n_c-1}. \tag{3.12}
$$
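The intermediate step between Equations 3.10 and 3.12 can be sketched as follows. The explicit form of g below is inferred from the inner integrand of Equation 3.10 (it is not written out in the text), and only the leading term of the series for g'(1) is retained.

```latex
\begin{aligned}
g(x) &= -x\log(1-\omega) - \log\!\left(x + \tfrac{1}{\omega} - 1\right), \\
g(1) &= \log\frac{\omega}{1-\omega}
  \quad\Longrightarrow\quad
  e^{n g(1)} = \left(\frac{\omega}{1-\omega}\right)^{n}, \\
g'(1) &= -\log(1-\omega) - \omega
  = \frac{\omega^{2}}{2} + \frac{\omega^{3}}{3} + \dots
  \approx \frac{\omega^{2}}{2}, \\
\frac{e^{n g(1)}}{n\, g'(1)} &\approx \left(\frac{\omega}{1-\omega}\right)^{n} \frac{2}{n\omega^{2}}, \\
(n_c+1)\int_{0}^{1} \omega^{n_c}\, \frac{2}{n\omega^{2}}\, d\omega
  &= \frac{2}{n}\,\frac{n_c+1}{n_c-1}, \qquad n_c > 1 .
\end{aligned}
```

The factors $\left(\frac{1-\omega}{\omega}\right)^{n}$ and $\left(\frac{\omega}{1-\omega}\right)^{n}$ cancel, which is why only the power $\omega^{n_c-2}$ remains under the final integral.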

For large nc, Equation 3.12 is approximately equal to 2/n. It is also clear that the value of BF1 will always be less than 3, which simply means that the species will always be inferred to be extant if sightings are spread uniformly until the end of the observation period. This is to be expected because, according to Model 1, the above conforms exactly to the alternative hypothesis that the species is extant.

If nc > 10 then the quantity (nc + 1)/(nc − 1) approaches unity. Thus, using Equation 3.12, it is clear that for nc > 10 the certain sightings do not play a significant role when making inferences about extinction for Model 1. This can also be observed in the LH graph of Figure 3.3, where the Bayes factor does not change significantly upon the deletion of certain sightings. These results imply that Model 1 inference is very much influenced by when the last uncertain sighting occurred. The next section illustrates this idea through a simple example.
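The behaviour of the closed-form approximation in Equation 3.12 can be tabulated with a few lines of code. This is a minimal sketch; the function names and the example values of n and nc are illustrative assumptions, not values used in the thesis.

```python
def certain_factor(n_c):
    """The factor (n_c + 1) / (n_c - 1) from Equation 3.12 (requires n_c > 1)."""
    if n_c <= 1:
        raise ValueError("Equation 3.12 requires n_c > 1")
    return (n_c + 1) / (n_c - 1)


def laplace_bf1(n, n_c):
    """Laplace approximation of BF1 for the no-extinction baseline (Equation 3.12)."""
    return (2.0 / n) * certain_factor(n_c)


# Illustrative values only: total number of sightings n = 40.
for n_c in (2, 5, 11, 18, 30):
    print(n_c, certain_factor(n_c), laplace_bf1(40, n_c))
```

The factor decreases monotonically towards one as nc grows, so for a fixed n the approximation is driven almost entirely by 2/n once nc exceeds about ten.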

Case 2: Last uncertain sighting tn < T .

Consider Model 1 for the specific case when uncertain sightings terminate before the end of the observation period. As a typical example, consider a simulated time series of sightings where the last uncertain sighting occurs at t = tn, and the uncertain sightings are distributed uniformly in the interval (0, tn). The results obtained can be generalized to any number of certain and uncertain sightings, but in order to present numerical results it will be assumed that there are certain sightings in every year from 1 to 8 (1-8) and uncertain sightings in every year from 5 to 30 (5-30) over the 35 years of the observation period.

The main contributions to the Bayes Factor in Equation 3.9 can be found by examining the integrand I(ω, x) of the integral in Equation 3.8. Figure 3.5 shows plots of the integrand I(ω, x) at fixed values of ω, as a function of time x.

Figure 3.5: I(ω, x) is plotted as a function of time for different fixed values of ω when sightings are homogeneous. Here ω is the expected proportion of valid sightings and I(ω, x) is the integrand in Equation 3.9 which is used to calculate the Bayes factor for Model 1. I(ω, x) has its major peak at the last uncertain sighting tn, indicated by a red dashed vertical line, and the peak gradually increases as ω is increased.

Figure 3.5 makes clear that for ω > 0.6, the integrand has a major peak orders of magnitude higher than 2, located exactly at the last uncertain sighting (tn), not before or after it. Recall that ω is the expected proportion of valid sightings; Figure 3.5 illustrates that I(ω, x) has comparatively large values when ω is large, that is, when Model 1 treats most uncertain sightings as valid. Hence the distance between the last uncertain sighting and the end of the observation period will determine the value of BF1. This property makes Model 1 conclude that the species is extinct only if the last uncertain sighting is sufficiently far from the end of the observation period. Further numerical calculations for this particular example showed that the species can be inferred to be extinct if the last uncertain sighting precedes the end of the observation period by at least five years.
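The location of the peak can be checked numerically for the simulated record of Case 2. The sketch below rests on two assumptions that are not stated explicitly in the text: n is taken to be the total number of sightings (certain plus uncertain), and n(τ) counts the sightings at or before time τ. The integrand is written in the algebraically equivalent form ω^nc (1 − ω)^(n − n(τ)) / (1 − ω(1 − x))^n, which avoids large negative powers of ω. The function names are hypothetical.

```python
def integrand_model1(omega, tau, certain, uncertain, T):
    """Model 1 integrand I(omega, x) evaluated at x = tau / T."""
    sightings = list(certain) + list(uncertain)
    n = len(sightings)            # assumed: all sightings, certain and uncertain
    n_c = len(certain)
    n_before = sum(1 for t in sightings if t <= tau)  # assumed definition of n(tau)
    x = tau / T
    return (omega ** n_c * (1.0 - omega) ** (n - n_before)
            / (1.0 - omega * (1.0 - x)) ** n)


def peak_time(omega, certain, uncertain, T, step=0.25):
    """Time on a regular grid over [t_L, T] at which the integrand is largest."""
    t_L = max(certain)
    steps = int(round((T - t_L) / step))
    grid = [t_L + k * step for k in range(steps + 1)]
    return max(grid, key=lambda tau: integrand_model1(omega, tau, certain, uncertain, T))


# Simulated record of Case 2: certain sightings in years 1-8,
# uncertain sightings in years 5-30, observation period of 35 years.
CERTAIN = list(range(1, 9))
UNCERTAIN = list(range(5, 31))
T = 35

for omega in (0.7, 0.8, 0.9):
    tau_star = peak_time(omega, CERTAIN, UNCERTAIN, T)
    print(omega, tau_star, integrand_model1(omega, tau_star, CERTAIN, UNCERTAIN, T))
```

Between sightings the integrand decreases in x, and at each sighting it jumps up by a factor 1/(1 − ω), so only the sighting times themselves need to be considered as candidate peak locations.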

The steep peak at the last uncertain sighting after a specific value of ω can be explained by the following property:

$$
\lim_{n \to \infty} a^{n} =
\begin{cases}
0, & \text{if } a < 1 \\
\infty, & \text{if } a > 1
\end{cases}
\tag{3.13}
$$

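When $x = t_n/T$ and n(xT ) = n, we get

$$
\begin{aligned}
BF_1 &= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{\omega^{n_c-n}\,(1-\omega)^{n-n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}\, d\omega\, dx \\
&= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} \left[\frac{\omega^{n_c/n}}{\omega \left(\frac{t_n}{T} + \frac{1}{\omega} - 1\right)}\right]^{n} d\omega\, dx.
\end{aligned}
\tag{3.14}
$$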

By virtue of Equation 3.13, the steep peak sets in when ω = ω1, where ω1 satisfies the following:

$$
\omega_1^{\,n_c/n} = \omega_1 \left(\frac{t_n}{T} + \frac{1}{\omega_1} - 1\right). \tag{3.15}
$$

Solving Equation 3.15 for our data it was found that ω1 = 0.66.
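Equation 3.15 has no closed-form solution in general, but it can be solved by bisection. The sketch below assumes that n is the total number of sightings, so that the simulated record of Case 2 gives nc = 8, n = 8 + 26 = 34, tn = 30 and T = 35; the function name is hypothetical. The right-hand side is rewritten as 1 − ω(1 − tn/T), which is algebraically identical.

```python
def peak_onset_omega(n_c, n, t_n, T, tol=1e-12):
    """Root in (0, 1) of omega**(n_c/n) = omega * (t_n/T + 1/omega - 1)."""
    def f(omega):
        # Left-hand side minus right-hand side of Equation 3.15.
        return omega ** (n_c / n) - (1.0 - omega * (1.0 - t_n / T))

    # f is increasing in omega, negative near 0 and positive at 1
    # whenever t_n < T and n_c < n, so bisection applies.
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


omega_1 = peak_onset_omega(n_c=8, n=34, t_n=30, T=35)
print(round(omega_1, 2))
```

For values of ω above this root the bracketed base in Equation 3.14 exceeds one, so by Equation 3.13 its n-th power grows rapidly.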

Figure 3.5 confirms that the peak at the last uncertain sighting starts to increase rapidly once ω exceeds 0.66.

We have just considered the case where uncertain sightings are distributed uniformly in time but end abruptly some time before T. Model 1 then predicts a major peak in I(ω, t) at the last uncertain sighting; BF1 will accordingly be large, and extinction is predicted. The same idea can be generalized to non-uniformly distributed data.

The peak at $t_{n-1}$ is smaller than the peak at tn, as shown mathematically below.

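When $x = t_{n-1}/T$ and n(xT ) = n − 1, we have the following:

$$
\begin{aligned}
BF_1 &= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{\omega^{n_c-n}\,(1-\omega)^{n-n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}\, d\omega\, dx \\
&= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} (1-\omega) \left[\frac{\omega^{n_c/n}}{\omega \left(\frac{t_{n-1}}{T} + \frac{1}{\omega} - 1\right)}\right]^{n} d\omega\, dx \\
&\approx (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} (1-\omega) \left[\frac{\omega^{n_c/n}}{\omega \left(\frac{t_{n}}{T} + \frac{1}{\omega} - 1\right)}\right]^{n} d\omega\, dx.
\end{aligned}
\tag{3.16}
$$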

Note that $t_{n-1}/T$ can be approximated by $t_n/T$ as sightings are homogeneous.

Comparing Equation 3.16 with Equation 3.14, it is clear that the peak at $t_{n-1}$ differs from the peak at tn by a factor of approximately (1 − ω). For the case ω = 0.7, the plot of I(ω, x) confirms that the peak after $t_{n-1}$ differs from the peak at tn by a factor of approximately (1 − ω).

Figure 3.6: I(ω, x) as a function of time for ω = 0.7. The red dashed line indicates the last uncertain sighting tn and the blue dotted line represents $t_{n-1}$.

By the results of Figure 3.6, it is clear that the peak after $t_{n-1}$ is smaller than the peak at tn when sightings are homogeneous.

Case 3: Last uncertain sighting tn < T with non-homogeneous sightings in the middle of the sighting record.

It will be assumed that there are certain sightings in every year from 1-8 and uncertain sightings from year 5-22, 25-30 for the 35 years of the observation period. Figure 3.7 illustrates the plot of the integrand I(ω, x) at a fixed value of ω = 0.9, as a function of time x. The value ω = 0.9 was chosen because it gives a clear idea of where the peak is located, as can be verified from Figure 3.5.

Figure 3.7: I(ω, x) as a function of time for a fixed value of ω = 0.9 when sightings are non-homogeneous in the middle. The red dashed line indicates the last uncertain sighting tn.

Similar to Figure 3.5, the integrand has a major peak exactly located at the last uncertain sighting (tn) not before or after it.

Case 4: Last uncertain sighting tn < T with non-homogeneous sightings at the end of the sighting record.

It will be assumed that there are certain sightings in every year from 1-8 and uncertain sightings from year 5-25, 30 for the 35 years of the observation period. Figure 3.8 illustrates the plot of the integrand I(ω, x) at a fixed value of ω = 0.9, as a function of time x.

Figure 3.8: I(ω, x) as a function of time for a fixed value of ω = 0.9 when sightings are non-homogeneous at the end. The red dashed line indicates the last uncertain sighting tn.

Figure 3.8 makes clear that for high ω (i.e., ω = 0.9) the integrand has a major peak orders of magnitude higher than 10, and that it is not located at the last uncertain sighting (tn); in fact it is located where the non-homogeneity occurs ($t_{n-1}$). This can be further explained by Equation 3.17.

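When $x = t_{n-1}/T$ and n(xT ) = n − 1, we get

$$
\begin{aligned}
BF_1 &= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{\omega^{n_c-n}\,(1-\omega)^{n-n(xT)}}{\left(x + \frac{1}{\omega} - 1\right)^{n}}\, d\omega\, dx \\
&= (n_c + 1) \int_{\epsilon}^{1}\!\int_{0}^{1} (1-\omega) \left[\frac{\omega^{n_c/n}}{\omega \left(\frac{t_{n-1}}{T} + \frac{1}{\omega} - 1\right)}\right]^{n} d\omega\, dx.
\end{aligned}
\tag{3.17}
$$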

Now $t_{n-1}/T$ cannot be approximated by $t_n/T$, as sightings are non-homogeneous at the end. This results in $t_{n-1}/T \ll t_n/T$, hence the integrand will have a maximum at $t_{n-1}$.
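The comparison between the two peaks can be made exact rather than approximate. Taking the ratio of the integrand at $t_{n-1}$ (where one sighting remains after x) to the integrand at tn gives (1 − ω) multiplied by the n-th power of a ratio of denominators. The sketch below evaluates this ratio; the function name is hypothetical and n is assumed to be the total number of sightings in each simulated record.

```python
def peak_ratio(omega, t_prev, t_last, n, T):
    """I(omega, t_prev/T) / I(omega, t_last/T) when the sighting at t_last
    is the only one that follows t_prev."""
    c = 1.0 / omega - 1.0
    return (1.0 - omega) * ((t_last / T + c) / (t_prev / T + c)) ** n


# Case 2 (homogeneous): uncertain sightings in years 5-30, n = 8 + 26 = 34.
print(peak_ratio(0.7, t_prev=29, t_last=30, n=34, T=35))

# Case 4 (gap at the end): uncertain sightings in years 5-25 and 30, n = 8 + 22 = 30.
print(peak_ratio(0.9, t_prev=25, t_last=30, n=30, T=35))
```

A ratio below one means the peak at tn dominates (the homogeneous case), while a ratio above one means the peak sits at $t_{n-1}$, which is what happens when a long gap precedes the final sighting.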

Case 5: When there is only one uncertain sighting and assuming that the species was extant at the start of observation (i.e. certain sighting at 0).

It will be assumed that for the sighting period between 0 and 35, there is only one uncertain sighting, which was recorded in the 30th year. Figure 3.9 illustrates the plot of the integrand I(ω, x) at a fixed value of ω = 0.9, as a function of time x. Since there is only one sighting, the peak is located soon after that sighting.

Figure 3.9: I(ω, x) as a function of time for a fixed value of ω = 0.9 when there is only one uncertain sighting. The red dashed line indicates the last uncertain sighting tn.

3.5 Methods- Model 2

Under the second model it is assumed that certain sightings follow an independent Poisson process with rate M, while valid uncertain sightings follow a stationary Poisson process with rate Λ and invalid sightings follow a stationary Poisson process with rate Θ. In this situation, certain sightings differ qualitatively from valid sightings that are uncertain. Also, Model 2 assumes certain and uncertain sightings to be independent of each other. Figure 3.10 provides a summary of Model 2.

Figure 3.10: Model 2 is based on the assumption that certain and uncertain sightings follow different Poisson processes. The species becomes extinct at time τE. Before τE, all sightings are either certain with rate M or uncertain (valid with rate Λ; invalid with rate Θ). After the species becomes extinct at time τE, all sightings must be uncertain (invalid) with rate Θ.

Let tc and tu denote the sets of certain and uncertain sightings respectively. The conditional likelihood of the complete set of sightings given τE is the product

$$
p(t \mid \tau_E) = p(t_c \mid \tau_E)\, p(t_u \mid \tau_E), \tag{3.18}
$$

where p(tc|τE) is the likelihood of tc given τE and p(tu|τE) is the likelihood of tu given τE. Adopting a non-informative prior pdf p(µ) = 1/µ on 0 ≤ µ ≤ ∞ for M, a development parallel to Model 1 yields:

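$$
p(t_c \mid \tau_E) = \frac{(n_c - 1)!}{\tau_E^{\,n_c}}, \tag{3.19}
$$

and

$$
p(t_u \mid \tau_E) = \int_{0}^{1} (n_u - 1)!\, \omega^{-n_u} (1-\omega)^{n_u - n_u(\tau_E)} \left(\tau_E + \frac{1-\omega}{\omega}\, T\right)^{-n_u} d\omega, \tag{3.20}
$$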

where nu(τE) is the number of uncertain sightings prior to τE. Equation 3.20 is equivalent to Equation 3.5 if one assumes all sightings are uncertain. This assumption is used because certain and uncertain sightings are modelled independently of each other.

Thus, Equation 3.20 can be obtained by replacing n by nu in Equation 3.5. Note that Equation 3.20 differs from Solow and Beet (2014) by a term (nu − 1)!. The term (nu − 1)! is needed when working with the exact likelihood.

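Hence we obtain the following:

$$
p(t \mid \tau_E) = \frac{(n_c-1)!\,(n_u-1)!}{\tau_E^{\,n_c}} \int_{0}^{1} \frac{\omega^{-n_u} (1-\omega)^{n_u - n_u(\tau_E)}}{\left(\tau_E + \frac{1-\omega}{\omega}\, T\right)^{n_u}}\, d\omega. \tag{3.21}
$$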

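Equation 3.21 can then be combined with a specified prior pdf p(τE) to find $p(t \mid \bar{E})$ and $p(t \mid E)$ as follows:

$$
p(t \mid \bar{E}) = p(t \mid \tau_E = T) = \frac{(n_c-1)!\,(n_u-1)!}{T^{n_c}\, T^{n_u}}, \tag{3.22}
$$

$$
p(t \mid E) = \int_{t_L}^{T} p(t \mid \tau_E)\, p(\tau_E)\, d\tau_E. \tag{3.23}
$$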

If we assume a uniform distribution p(τE) = 1/T on 0 ≤ τE ≤ T and change variables to x = τE/T , then some cancellations occur resulting in a simplified Bayes Factor (BF):

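$$
BF_2 = \frac{p(t \mid E)}{p(t \mid \bar{E})} = \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{(1-\omega)^{n_u - n_u(xT)}}{\omega^{n_u}\, x^{n_c} \left(x + \frac{1}{\omega} - 1\right)^{n_u}}\, d\omega\, dx. \tag{3.24}
$$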

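Let $I'(\omega, x) = (1-\omega)^{n_u - n_u(xT)} \big/ \left[\omega^{n_u}\, x^{n_c} \left(x + \frac{1}{\omega} - 1\right)^{n_u}\right]$. Thus, we obtain the following as the Bayes factor corresponding to Model 2:

$$
BF_2 = \int_{\epsilon}^{1}\!\int_{0}^{1} I'(\omega, x)\, d\omega\, dx. \tag{3.25}
$$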
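A crude numerical evaluation of the Model 2 Bayes factor can be sketched with a midpoint rule on a regular grid. Several assumptions are made here and are not taken from the text: the lower limit of the x integral is taken to be tL/T (from the change of variables applied to Equation 3.23), nu(xT) counts uncertain sightings at or before xT, and the grid size m is arbitrary. The integrand is written in the equivalent form (1 − ω)^(nu − nu(xT)) / [x^nc (1 − ω(1 − x))^nu] to avoid dividing by small powers of ω. The function names are hypothetical, and the thesis does not state which integration routine was actually used.

```python
def integrand_model2(omega, x, n_c, uncertain, T):
    """Model 2 integrand I'(omega, x) in a numerically stable form."""
    n_u = len(uncertain)
    n_u_before = sum(1 for t in uncertain if t <= x * T)  # assumed definition
    return ((1.0 - omega) ** (n_u - n_u_before)
            / (x ** n_c * (1.0 - omega * (1.0 - x)) ** n_u))


def bayes_factor_model2(n_c, uncertain, t_L, T, m=100):
    """Midpoint-rule approximation of the double integral for BF2."""
    eps = t_L / T                  # assumed lower limit of the x integral
    hx = (1.0 - eps) / m
    hw = 1.0 / m
    total = 0.0
    for i in range(m):
        x = eps + (i + 0.5) * hx
        for j in range(m):
            omega = (j + 0.5) * hw
            total += integrand_model2(omega, x, n_c, uncertain, T)
    return total * hx * hw


# Simulated record: certain sightings in years 1-8, uncertain in years 5-30, T = 35.
print(bayes_factor_model2(n_c=8, uncertain=list(range(5, 31)), t_L=8, T=35))
```

Because the integrand is sharply peaked near the lower limit in x, a finer grid or an adaptive quadrature routine would be needed for reliable values; the sketch is intended only to show the structure of the calculation.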

3.6 Results- Model 2

Similar to BF1, BF2 can only be evaluated through numerical integration. In order to identify the variables which most influence the value of BF2, we first checked which sighting type (i.e., certain or uncertain) dominates under Model 2. Hence we considered the same variants as in Figure 3.3.

Figure 3.11: Change in Bayes factor for Model 2 for different variants in both certain and uncertain sightings. Note that the magnitude of the LH graph is of order $10^{9}$, which gives evidence that Model 2 is much more sensitive to certain sightings. The red dashed line indicates the last uncertain sighting tn.

According to Figure 3.11 it is clear that the Bayes factor for Model 2 changes significantly in the LH graph. Once again to investigate which parameter (i.e. the last certain sighting or the number of certain sightings) dominates more, we consider the situations in Figure 3.12. The RH graph of Figure 3.12 has the last certain sighting fixed at 1939 and the left graph has the same fixed number of certain sightings as the original data.

Comparing the graphs in Figure 3.12 it becomes evident that the change in the last certain sighting was responsible for the huge changes in Bayes factor for Model 2. Hence Model 2 is particularly sensitive to the last certain sighting, and this became our working hypothesis for Model 2.

We can again use Laplace’s method to obtain a general idea about the integral in Equation 3.25, but now the factor $x^{-n_c}$ moves the maximum of the integrand to the end point x = ε (rather than x = 1 for BF1). The remaining integral over ω is solvable, but by the Mean Value Theorem we can express it as the value of the integrand at some point ω ∈ (0, 1). To get a feel for the order of magnitude of BF2, take ω = 0.5. Combined with Laplace’s method using nu(xT ) = nux, we obtain the asymptotic estimate

Figure 3.12: Change in Bayes factor for Model 2 for different variants in certain sightings. RH graph shows much more sensitivity in Bayes Factor compared to the LH graph which provides evidence of Model 2 being sensitive to the last certain sighting.

$$
BF_2 \approx \frac{(0.5)^{n_u}}{n_u\, \epsilon^{\,n_c}\, (\epsilon + 1)^{n_u}}. \tag{3.26}
$$

Even though the result in Equation 3.26 does not explain the situation as clearly as in the case of BF1, it is clear that BF2 is most influenced by ε, which depends on the last certain sighting. This result suggests that Model 2 inference is particularly sensitive to when the last certain sighting occurs.

The next section will illustrate this possibility using the same example that we studied with Model 1 case 2, which assumes that there are certain sightings from 1 to 8 and uncertain sightings from 5 to 30 for 35 years.

From Figure 3.13, it is clear that the peak is located exactly at the last certain sighting (tL). Also, by examining the magnitudes of the y-axis, it is clear that BF2 is more influenced when ω < 0.8. This property of Model 2 is responsible for the prediction that the species is extant only if the last certain sighting and the end of the observation period are close to one another.

Figure 3.13: I′(ω, x) as a function of time for different fixed values of ω when sightings are homogeneous. The red dashed line indicates the last uncertain sighting tn and the purple dotted line indicates the last certain sighting tL. As in Figure 3.5, ω is the expected proportion of valid sightings, but I′(ω, x) represents the integrand in Equation 3.24 which is used to calculate the Bayes factor for Model 2. Here I′(ω, x) shows the opposite behavior to I(ω, x) by having its major peak at the last certain sighting. Also note that this peak gradually decreases as ω is increased.

The steep peak at the last certain sighting can be explained in the same way as for the first model.

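When $x = t_L/T$ and $n_u(xT) = n_1$, we obtain

$$
\begin{aligned}
BF_2 &= \int_{\epsilon}^{1}\!\int_{0}^{1} \frac{(1-\omega)^{n_u - n_1}}{\omega^{n_u} \left(\frac{t_L}{T}\right)^{n_c} \left(\frac{t_L}{T} + \frac{1}{\omega} - 1\right)^{n_u}}\, d\omega\, dx \\
&= \int_{\epsilon}^{1}\!\int_{0}^{1} \left[\frac{(1-\omega)^{1 - \frac{n_1}{n_u}}}{\omega \left(\frac{t_L}{T}\right)^{\frac{n_c}{n_u}} \left(\frac{t_L}{T} + \frac{1}{\omega} - 1\right)}\right]^{n_u} d\omega\, dx.
\end{aligned}
\tag{3.27}
$$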

A steep peak will occur when ω = ω2 and

$$
(1-\omega_2)^{1 - \frac{n_1}{n_u}} = \omega_2 \left(\frac{t_L}{T}\right)^{\frac{n_c}{n_u}} \left(\frac{t_L}{T} + \frac{1}{\omega_2} - 1\right). \tag{3.28}
$$

Solving Equation 3.28 with the assumed data it was found that ω2 = 0.79. By observing Figure 3.13, it is clear that the peak at the last certain sighting begins to increase rapidly when ω < 0.8.

By the results of Equation 3.28, it is clear that the second method is sensitive to the last certain sighting. The distance between the last certain sighting and the observation end period will determine the value of BF2.

3.7 Guideline to select between Models

Solow and Beet (2014) emphasized that the choice of model depends on the understanding of what they called the natural history of the sighting record. For the particular case of the Ivory-billed Woodpecker, they pointed out that the second model is appropriate based on natural history - certain sightings were based on physical specimens, uncertain sightings on calls.

Our results show that Model 1 is sensitive to the last uncertain sighting while Model 2 is sensitive to the last certain sighting. The choice of model will therefore depend on the reliability of the uncertain sightings in the observed data. This can be investigated by exploring the posterior probability density function (PDF) of the parameter Ω, which represents the expected proportion of valid sightings and is a measure of the quality of uncertain sightings (Solow et al., 2012). If Ω is near one, then most of the uncertain sightings prior to extinction are valid. If Ω is near zero, this suggests that most of the uncertain sightings prior to extinction are invalid. By examining the posterior PDF of Ω we can gain useful information as to whether Model 1 or Model 2 is the correct model to use.

The posterior PDF of Ω is (Solow et al., 2012):

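$$
p(\omega \mid t) = \frac{p(t \mid \omega)}{\int_{0}^{1} p(t \mid \omega)\, d\omega}, \tag{3.29}
$$

where

$$
p(t \mid \omega) =
\begin{cases}
\displaystyle \int_{t_L}^{T} \omega^{-n_u} (1-\omega)^{n - n(\tau_E)}\, (n-1)! \left(\tau_E + \frac{1-\omega}{\omega}\, T\right)^{-n} p(\tau_E)\, d\tau_E, & \text{Model 1} \\[2ex]
\displaystyle \int_{t_L}^{T} (n_c-1)!\, \tau_E^{-n_c}\, (n_u-1)!\, \omega^{-n_u} (1-\omega)^{n_u - n_u(\tau_E)} \left(\tau_E + \frac{1-\omega}{\omega}\, T\right)^{-n_u} d\tau_E, & \text{Model 2}
\end{cases}
\tag{3.30}
$$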

(Solow and Beet, 2014). By comparing the posterior PDF of Ω with a realistic expectation of the validity of uncertain sightings, one can choose the appropriate model (i.e., Model 1 or Model 2) with which to make inferences. To illustrate this idea we generated the posterior distribution under each model for the Ivory-billed Woodpecker data, as shown in Figure 3.14.

Figure 3.14 indicates that, prior to extinction, most uncertain sightings are valid under Model 1, which makes Model 1 sensitive to the uncertain sightings. Under Model 2 most uncertain sightings are invalid prior to extinction, which makes certain sightings the dominating factor. As a rule of thumb, our findings suggest that if most of the uncertain sightings are known to be valid, Model 1 is appropriate; if, on the other hand, most of the uncertain sightings are known to be invalid, Model 2 is appropriate.
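The normalisation in Equation 3.29 can be carried out on a grid. The sketch below does this for the Model 2 line of Equation 3.30, dropping all factors that do not depend on ω (they cancel in the ratio) and using the change of variables x = τE/T. It assumes that nu(τE) counts uncertain sightings at or before τE; the function names, the grid sizes and the simulated record are illustrative and are not the Ivory-billed Woodpecker data.

```python
def likelihood_model2(omega, n_c, uncertain, t_L, T, m=200):
    """p(t | omega) under Model 2 (Equation 3.30), up to a constant factor,
    integrated over x = tau_E / T with the midpoint rule."""
    n_u = len(uncertain)
    eps = t_L / T
    h = (1.0 - eps) / m
    total = 0.0
    for i in range(m):
        x = eps + (i + 0.5) * h
        n_before = sum(1 for t in uncertain if t <= x * T)  # assumed n_u(tau_E)
        total += ((1.0 - omega) ** (n_u - n_before)
                  / (x ** n_c * (1.0 - omega * (1.0 - x)) ** n_u))
    return total * h


def posterior_omega(likelihood, m=100):
    """Grid approximation of Equation 3.29: returns midpoints and densities."""
    h = 1.0 / m
    grid = [(j + 0.5) * h for j in range(m)]
    values = [likelihood(w) for w in grid]
    z = sum(values) * h           # approximates the integral in the denominator
    return grid, [v / z for v in values]


# Simulated record: certain sightings in years 1-8, uncertain in years 5-30, T = 35.
uncertain = list(range(5, 31))
grid, density = posterior_omega(lambda w: likelihood_model2(w, 8, uncertain, 8, 35))
mode = grid[density.index(max(density))]
print(mode)
```

The same `posterior_omega` routine could be reused for Model 1 by supplying the corresponding likelihood from the first line of Equation 3.30.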

(a) Model 1 (b) Model 2

Figure 3.14: Posterior PDF of parameter Ω, which is a measure of the quality of uncertain sightings. The LH graph indicates that most uncertain sightings prior to extinction tend to be valid under Model 1. The RH graph indicates that under Model 2 most uncertain sightings are invalid.

There might also be situations where the validity of uncertain sightings is not known. In this case selecting one model over the other is risky because it ignores model uncertainty. In such situations there might be a need to consider Bayesian Model Averaging, which would examine the response of a weighted average of the two models. We hope to treat such an approach in future work.
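For orientation only, the generic form of Bayesian Model Averaging can be sketched as below. This is the standard textbook construction (posterior model weights proportional to marginal likelihood times prior model weight), not a method developed in this thesis; the function names and the numbers in the example are purely illustrative.

```python
def posterior_model_weights(marginal_likelihoods, prior_weights=None):
    """Posterior model probabilities: w_k proportional to p(t | M_k) * p(M_k)."""
    k = len(marginal_likelihoods)
    if prior_weights is None:
        prior_weights = [1.0 / k] * k   # equal prior weight on each model
    joint = [m * p for m, p in zip(marginal_likelihoods, prior_weights)]
    total = sum(joint)
    return [j / total for j in joint]


def model_averaged(quantities, weights):
    """Weighted average of a quantity computed separately under each model."""
    return sum(q * w for q, w in zip(quantities, weights))


# Illustrative numbers only: marginal likelihoods of the data under two models,
# and the posterior probability of extinction obtained under each model.
weights = posterior_model_weights([1.0, 3.0])
print(weights, model_averaged([0.2, 0.6], weights))
```

In the present setting the two entries would correspond to Model 1 and Model 2, and the averaged quantity could be, for example, a posterior probability of extinction.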

3.8 Discussion

In this article, we study the sensitivities of the two important models proposed by Solow and Beet (2014) for predicting extinction, in terms of the particular characteristics of the historical sighting records of the species under investigation. The differences between the two models can be understood visually by comparing Figures 3.2 and 3.10. We mathematically explain and give reasons for the different results obtained from the two models, when both are used to analyze the same data set. As seen in the results section and in Solow and Beet (2014), it is clear that the two models behave differently. A simple way of conceptualizing the difference is by noting that, at least until the time the species goes extinct, Model 1 is framed in terms of the underlying valid and invalid sightings occurring as different but stationary Poisson processes (at constant rates Λ and Θ respectively).

After the species goes extinct, all sightings are invalid. (Note that valid sightings consist of certain and valid uncertain sightings.) A relatively high rate of certain sightings before tL ensures that the rate of valid sightings is large. Thus after tL, most of the uncertain sightings should be valid in order to maintain the same valid sighting rate. Hence the last uncertain sighting is more influential on Model 1, as compared to the last certain sighting tL.

In contrast, at least until the species goes extinct, Model 2 is framed in terms of the underlying certain and uncertain (valid/invalid) sightings occurring as different stationary processes (at constant rates M and Λ + Θ respectively). After the species goes extinct all sightings are uncertain and must also be invalid. A relatively high rate of certain sightings before tL, and the total absence of them afterwards, makes the last certain sighting tL directly related to the extinction time (as pointed out by Solow and Beet (2014)). However, when the rate of certain sightings is low, the extinction time can still be linked with the last certain sighting, albeit with a higher margin of error due to the low sighting rate. This helps explain our finding that Model 2 is particularly sensitive to the last certain sighting tL.

As discussed, the two models show two different ways of taking errors in the sighting data into account. One focuses on the validity and invalidity of the sightings (Model 1), whereas the other focuses on the certainty and uncertainty of sightings (Model 2). These are subtly different things, yet they can have enormous consequences. In general, we only know whether data points are certain or uncertain, yet Model 1 shows that this might not be sufficient information. This is because we rarely have information as to the validity or invalidity of uncertain sightings, which would be needed if Model 1 were true.

All these characteristics make model selection more complex because in reality we do not know whether the certain and valid uncertain sightings follow different independent Poisson processes (as required by Model 2) or a single process (Model 1). For instance, consider the no-extinction baseline scenario discussed under Model 1, Case 1, which assumes that there are uniformly spread uncertain sightings throughout the observation period. As an example, assume that there are 5 consecutive certain sightings followed by 25 consecutive uncertain sightings. If one is interested in finding out whether or not the species is extinct by the time of the last uncertain sighting at T = 30, the choice of model will give two different inferences. Specifically, Model 2 will favor extinction while Model 1 will favor the species being extant. If in reality most of the uncertain sightings were truly valid, the inference made under Model 2 would be incorrect. However, if most of the uncertain sightings were truly invalid, then the inference made under Model 1 would be incorrect. Thus which model should be used in which situation remains open for debate as long as the quality of uncertain sightings remains unknown.

Chapter 4

Inferring Extinction Year using a Bayesian Approach

This chapter has been published, and has the following citation:

Kodikara, S., Demirhan, H., Wang, Y., Solow, A., and Stone, L. (2020). Inferring extinction year using a Bayesian approach. Methods in Ecology and Evolution, 11(8):964–973.

Abstract

Species sighting records are combined with statistical models to infer whether an endangered species might have become extinct, or instead has just gone unobserved for a lengthy period of time. The challenging part of developing these models lies in dealing with uncertain sightings. We propose a Bayesian hierarchical approach to infer the extinction time of a species based on historical sighting records which may be either certain or uncertain. The posterior distribution for extinction time is evaluated using the likelihood of sighting data and non-informative priors for model parameters. All the models discussed in this paper are implemented in JAGS, a program for analyzing Bayesian models using Markov Chain Monte Carlo (MCMC) simulation. A general methodology is presented and then applied to the sighting record of the Ivory Billed Woodpecker (IBW) (Campephilus principalis). It was found that the IBW most likely went extinct between 1940 and 1945, a little after the date of the last certain sighting. The methods developed were also applied to other species sighting records as well as some artificial sighting records. Through the results, it was found that the inferred time of extinction is significantly influenced by the last certain sighting if the sighting record consists of only certain sightings. In the presence of uncertain sightings, the inferred extinction time is influenced by either the last certain sighting or by the time where the uncertain sighting rate drops.

4.1 Introduction

Clear signs are emerging that any further loss of critically endangered species might tip the world towards another mass extinction event (Barnosky et al., 2011). These extreme events have likely only occurred five times in the past 540 million years (Barnosky et al., 2011; Pimm et al., 2014). As such, there is concern that the diversity and complexity of life on Earth may well again be on a dangerous downward spiral. It also highlights the need to correctly monitor and model the current extinction status of species on planet Earth and carefully assess the fragility of potentially endangered species. An incorrect classification of a species as extinct can lead to failure in conserving a threatened species (Lee et al., 2014; Thompson et al., 2013a). On the other hand, it is also undesirable to classify a species as extant when it is actually extinct, as it can lead to misallocation of research energy and funds (Thompson et al., 2013a; Lee et al., 2014; Akçakaya et al., 2017; Keith et al., 2017).

In practice, it is extremely difficult to determine whether a species has gone extinct or has just remained unobserved (Akçakaya et al., 2017; Keith et al., 2017; Thompson et al., 2017). But it only requires one new certain sighting to prove that a species is extant. A recent example of an erroneously inferred extinction is the Aldabra banded snail Rhachistia aldabrae. Gerlach (2007) announced that these snails went extinct as a result of short-term climate change, as no recent shell or live specimen was sighted after 1997. This was the case even after systematic and exhaustive surveys specifically aimed at finding the snail in 2005 and 2006. Nevertheless, the snail surprisingly reappeared in 2014, when the rediscovery was publicised by the Seychelles Island Foundation (Battarbee, 2014).

Historical sighting records are often the only available data for rare or poorly studied species, and thus the main information available to work with for quantitative assessment of extinction. Palaeobiologists first introduced the general idea of using sighting records to infer the time of extinction (Strauss and Sadler, 1989; Marshall, 1990), while Solow (1993a) applied it for the first time within the field of conservation biology. Solow (1993a) developed a Bayesian approach to derive an equation for expressing the survival probability of a species based on sightings over a series of time units.

Rivadeneria et al. (2009) pointed out that most of the statistical methods for assessing species extinction before 2009 assumed that all sightings were valid with complete certainty. The paper spurred modellers to examine more closely what happens when uncertainty is attached to the validity of sightings. Roberts et al. (2010) noted that inferences from models including uncertain sightings differ significantly from those obtained by models omitting this information. Several studies incorporated probabilities of reliability or sighting validity for each sighting into the model development (Jarić and Roberts, 2014; Lee et al., 2014, 2015, 2017a), as well as expert opinion (Thompson et al., 2013a; Lee, 2014). There has been recent interest in developing frameworks that incorporate uncertain sightings (Solow et al., 2012; Solow and Beet, 2014; Thompson et al., 2017). When analyzing the different approaches in Solow and Beet (2014), it was found that the final inferences were particularly sensitive to the different ways of modeling uncertain sightings (Solow and Beet, 2014; Kodikara et al., 2018). This indicates the need to gain a deeper understanding of, and familiarity with, models that include uncertain sightings.

Note that Solow (2016) and Alroy (2014) made use of a framework related to the one studied here when they analysed a model of certain sightings only. Our paper nevertheless differs notably from these and from other studies in the literature. First, our model examines the problem of estimating the extinction time when uncertain sightings are present in the sighting record. Second, our approach infers the sighting probability (and its distribution). If interest centres solely on extinction, one may not necessarily want to do this since there are ways to eliminate the parameter (see Solow (2016)). However, finding the sighting probability is also of considerable interest. Third, the use of a computational MCMC approach makes it possible to find posterior distributions, along with credible intervals, of all parameters with ease. This is difficult to do with other methods, and often impossible. Also, this approach is relatively easy to implement and understand, particularly for those ecological practitioners who are not trained in mathematics and more advanced statistical theory. The flexibility of the approach becomes more obvious when there is a need to add more hierarchies into the model (for example, making the sighting probability covariate dependent). The computational Bayesian approach can be adjusted accordingly, without a need to derive the mathematically complex form for the posterior probability of extinction. This approach has become very popular of late, in particular with the availability of statistical packages such as JAGS and WinBUGS.

Also, our approach bypasses working with the Bayes factor directly, in contrast to Solow and Beet (2014). The Bayes factor approach has become popular as it allows hypotheses to be tested without any information about the prior probabilities of the hypotheses. In this method, a threshold value (e.g. 1/3) is used to reject the null hypothesis. However, there is no universal agreement on this threshold value or on a simple “rule-of-thumb”. Since the Bayes factor is the ratio between posterior odds and prior odds, the Bayes factor is equal to the posterior odds when the prior odds are one. Thus, the use of the Bayes factor on its own is only justifiable when the prior odds are close to one. Otherwise, the method has questionable utility (Kruschke and Liddell, 2018). Our approach is based on working with posterior probabilities (Bernardo, 2011; van der Linden and Chryst, 2017). (That is, we in effect combine the Bayes factor with reasonable priors.) Although in the end this is not particularly hard to do, we think this is the direction that should be taken in future related research, specifically when informative priors are used.
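The relation between the Bayes factor and the posterior odds invoked above can be stated explicitly. The hypothesis labels $H_0$ and $H_1$ and the symbol $BF_{10}$ are notation assumed here for illustration only:

$$
\underbrace{\frac{p(H_1 \mid S)}{p(H_0 \mid S)}}_{\text{posterior odds}}
= \underbrace{\frac{p(S \mid H_1)}{p(S \mid H_0)}}_{BF_{10}}
\times \underbrace{\frac{p(H_1)}{p(H_0)}}_{\text{prior odds}}
$$

When the prior odds equal one, the posterior odds coincide with the Bayes factor; for any other prior odds, a threshold applied to the Bayes factor alone no longer corresponds to a fixed posterior probability.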

The remainder of this paper is organized in the following way. Section 4.2 presents the development of the models. Section 4.3 explores the models using the sighting records of the Ivory Billed Woodpecker (IBW). Section 4.4 examines the sensitivity of the results to uncertain sightings. The paper concludes with the discussion in Section 4.5.

4.2 Model development

Consider a historical sighting record S of a species in which n sightings occurred in years S = (s1, ..., sn), as recorded over the full observation period t = 1, ..., T years. If the species went extinct during the observation period, then we designate τE as the date of the first year following extinction. In this paper, a hierarchical Bayesian approach is developed to infer the extinction time τE for a species based on its sighting record S. From this it is possible to infer the probability that a species went extinct during the observation period, namely p(τE ≤ T | S).

Sightings in S can either be certain or uncertain and this has to be fully taken into account. Note that all certain sightings are taken to be valid and it is assumed that the species has been correctly identified on each sighting date. However, uncertain sightings can either be valid or invalid (since now the species is sometimes incorrectly identified).

Bayesian inference is used to find the posterior probability distribution for parameters of interest (e.g., τE), based on prior knowledge of the parameter combined with a statistical model of the observed data (the likelihood function). This requires working with the well-known Bayesian formula:

posterior ∝ likelihood × prior. (4.1)

Here the prior is our initial knowledge about the parameter of interest, while the posterior is an updated version of the prior for which the observed data has been taken into account via the likelihood.

Two distinct modeling approaches are developed. The first is appropriate for sighting records that consist of certain sightings only. The second includes uncertain sightings.

We next discuss the development of the likelihood in each of these models and show how the likelihood and prior specification are used to obtain the posterior distribution according to Equation 4.1.

4.2.1 Model 1 - Certain sightings only

Formulation of the likelihood

First consider a historical sighting record S of a species in which all n sightings recorded are certain and occur in years S = C = (c1, ..., cn). Thus cn is the time of the last certain sighting. The model assumes that there is a probability pc that an extant species can be sighted in any given year. Our goal is to infer the distribution of extinction times τE, based on the sighting record data S. Clearly, τE must be greater than the time of the last certain sighting cn. When there are nc certain sightings, the likelihood for the sighting record S given τE and pc is easily seen to be:

$$
p(S \mid \tau_E, p_c) = p_c^{n_c} (1 - p_c)^{\tau_E - 1 - n_c}. \tag{4.2}
$$

Since the full sighting record occurred in the period (0, T), the upper bound for τE (the year following extinction) should be T + 1. Hence, the likelihood of S given τE > T is found by evaluating p(S|τE = T + 1, pc). Considering all situations, Equation 4.2 for the likelihood is generalized as follows:

 0, τE ≤ cn  nc (τE −1−nc) p(S|τE, pc) = pc (1 − pc) , cn < τE ≤ T (4.3)   nc (T −nc) pc (1 − pc) , τE > T.

The basic set-up of Model 1 is identical to the one proposed by Alroy (2014) and the associated paper of Solow (2016). In the latter study, instead of treating the yearly sighting probability pc as a parameter of interest, an approach was developed to completely eliminate pc by treating it as a nuisance parameter, resulting in an analytical solution for the posterior extinction probability. However, for reasons that will become evident shortly, it is instructive and useful to include pc.
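The piecewise likelihood of Equation 4.3 can be sketched in a few lines of Python. This is an illustrative helper only, not the JAGS model used in this chapter; the function name and argument names are hypothetical, and years are assumed to be indexed 1, ..., T.

```python
def model1_likelihood(tau_e, p_c, n_c, c_n, T):
    """Likelihood p(S | tau_E, p_c) of Equation 4.3 (certain sightings only).

    tau_e: candidate first year following extinction
    p_c:   yearly probability of a certain sighting while extant
    n_c:   number of certain sightings; c_n: year of the last one
    T:     length of the observation period
    """
    if tau_e <= c_n:
        return 0.0  # extinction cannot precede the last certain sighting
    # Years in which the species was extant and observable: tau_E - 1,
    # capped at T because nothing is recorded after the observation period.
    years_extant = min(tau_e - 1, T)
    return p_c ** n_c * (1.0 - p_c) ** (years_extant - n_c)
```

Capping the number of extant years at T makes every τE > T share the same likelihood value, which is the third case of Equation 4.3.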

Prior distributions of model parameters

Assuming that an extant species can become extinct (E) at the beginning of each year with probability θ, the number of years until the species becomes extinct τE is characterized by a geometric distribution with parameter θ,

$$
p(\tau_E \mid \theta) = (1 - \theta)^{\tau_E - 1} \theta, \quad \tau_E = 1, 2, \ldots \tag{4.4}
$$

The parameter pc and hyper-parameter θ are each assumed to have a standard uniform distribution with the following probability density functions,

$$
p(\theta) = 1, \quad 0 < \theta < 1, \tag{4.5}
$$

and

$$
p(p_c) = 1, \quad 0 < p_c < 1. \tag{4.6}
$$

Based on this framework, we now evaluate the posterior distribution of τE.

Posterior distribution

Applying Bayes’ rule defined in Equation 4.1, the posterior distribution for the parameters of interest (τE, pc and θ) is written as the product of the likelihood, priors and hyper-prior as follows:

$$
p(\tau_E, p_c, \theta \mid S) \propto p(S \mid \tau_E, p_c)\, p(\tau_E \mid \theta)\, p(\theta)\, p(p_c) \tag{4.7}
$$

where
p(τE, pc, θ|S) = posterior distribution for τE, pc and θ given the observed data S;
p(S|τE, pc) = likelihood function for S given τE and pc;
p(τE|θ) = prior distribution of τE given the hyper-parameter θ;
p(θ) = hyper-prior distribution of θ; and
p(pc) = prior distribution of pc.
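As a cross-check on MCMC output for Model 1, the posterior in Equation 4.7 can be marginalised in closed form under the uniform priors of Equations 4.5 and 4.6: integrating over pc gives a Beta function, integrating over θ gives 1/(τE(τE + 1)), and the sum over all τE > T telescopes to 1/(T + 1). The sketch below relies on those assumptions and is not the JAGS implementation used here; the function names are hypothetical and years are assumed to be indexed 1, ..., T.

```python
import math


def log_beta(a, b):
    """Natural logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def model1_prob_extinct(n_c, c_n, T):
    """Marginal posterior p(tau_E <= T | S) for Model 1 with uniform priors.

    n_c: number of certain sightings; c_n: year of the last certain sighting;
    T:   length of the observation period.
    """
    # Unnormalised posterior mass of each tau_E in c_n+1 .. T:
    #   B(n_c + 1, tau_E - n_c) / (tau_E * (tau_E + 1))
    within = sum(
        math.exp(log_beta(n_c + 1, tau - n_c)) / (tau * (tau + 1))
        for tau in range(c_n + 1, T + 1)
    )
    # Every tau_E > T has likelihood p_c^n_c (1 - p_c)^(T - n_c); after
    # integrating out theta, the prior mass of tau_E > T is 1 / (T + 1).
    beyond = math.exp(log_beta(n_c + 1, T - n_c + 1)) / (T + 1)
    return within / (within + beyond)
```

For a toy record with one certain sighting in year 1 of a two-year observation period, `model1_prob_extinct(1, 1, 2)` evaluates to (1/12) / (1/12 + 1/18) = 0.6.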

Based on Equation 4.7, the MCMC samples for τE may be found using JAGS. The model developed in JAGS is specified according to the likelihood function defined in Equation 4.3 along with the prior and hyper-prior distributions in Equations 4.4, 4.5 and 4.6. Since τE is a discrete random variable, the posterior distribution of τE describes the probability of occurrence of each value of τE. By summing the posterior probabilities of all values of τE that are less than or equal to T, we can obtain the posterior probability p(τE ≤ T | S), which can be expressed by the following formula:

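$$
p(\tau_E \le T \mid S)
= \frac{p(\tau_E \le T \mid S, \theta, p_c)}{p(\tau_E \mid S, \theta, p_c)}
= \frac{\sum_{\tau_E = c_n + 1}^{T} (1 - p_c)^{\tau_E - 1} (1 - \theta)^{\tau_E - 1} \theta}
{\sum_{\tau_E = c_n + 1}^{T} (1 - p_c)^{\tau_E - 1} (1 - \theta)^{\tau_E - 1} \theta + (1 - p_c)^{T} \sum_{\tau_E = T + 1}^{\infty} (1 - \theta)^{\tau_E - 1} \theta}.
\tag{4.8}
$$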
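For fixed values of θ and pc, the ratio of sums in Equation 4.8 can be evaluated directly, because the infinite geometric tail has the closed form (1 − θ)^T. The helper below is a minimal sketch with a hypothetical name; it conditions on given θ and pc and does not replace the MCMC step that integrates over them.

```python
def prob_extinct_given_params(theta, p_c, c_n, T):
    """Ratio of sums in Equation 4.8 for fixed theta and p_c."""
    # Numerator: tau_E ranges over c_n+1 .. T, each term is
    # (1 - p_c)^(tau_E - 1) * (1 - theta)^(tau_E - 1) * theta.
    within = sum(
        ((1.0 - p_c) * (1.0 - theta)) ** (tau - 1) * theta
        for tau in range(c_n + 1, T + 1)
    )
    # Tail: sum over tau_E > T of (1 - theta)^(tau_E - 1) * theta
    # equals (1 - theta)^T, weighted by (1 - p_c)^T.
    beyond = (1.0 - p_c) ** T * (1.0 - theta) ** T
    return within / (within + beyond)
```

With the other arguments held fixed, a larger pc makes a long run of years without sightings less plausible for an extant species, so the returned probability increases with pc.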

A formulation related to Equation 4.8 was used in Fader et al. (2010), Thompson et al. (2013a), Lee (2014) and Alroy (2014). The method discussed in Thompson et al. (2013a) and Lee (2014) uses a simple estimate for the probability of sighting a species when it is extant, i.e., dividing the number of years in which there are sightings by the time of the last sighting (in our notation, p̂c = nc/cn). Our approach goes beyond this simple method and infers the parameter pc using the Bayesian machinery.

4.2.2 Model 2 - Certain and uncertain sightings

Many historical data sets of rare or extinct species contain sightings that are to some degree uncertain. While physical evidence of a species is usually taken to indicate that the species was certainly present during a survey, other evidence is often less certain.

Suppose that the certain sightings occur in years C = (c1, ..., cn) and uncertain sightings occur in years U = (u1, ..., un), where cn and un represent the times of the last certain and last uncertain sightings, respectively. Then the sighting record S is a combination of both the C and U records. Our work assumes that uncertain sightings can only be recorded in years in which there are no certain sightings. In other words, there is some “censorship” process that constrains the recording of uncertain sightings.

A likelihood for the sighting record S that takes into account the censorship process can be constructed similarly to the “certain sighting only” model. Consider first the case cn < τE ≤ T. Then, in any year before extinction (t < τE), a sighting is considered an outcome of a generalized Bernoulli trial where either a certain sighting, an uncertain sighting or no sighting is recorded. These uncertain sightings can either be valid or invalid. A valid uncertain sighting refers to a correct identification of the actual species (even though the ecologist might not actually know that it is correct), while an invalid uncertain sighting refers to a misidentification of the species of interest. For any year after extinction (t ≥ τE), all uncertain sightings are invalid. Thus a sighting is considered a Bernoulli trial with either an invalid uncertain sighting with probability pui, or no sighting with probability 1 − pui, as outcomes.

Next we discuss how to allow for the censoring process whereby we assume that no single year can have both certain and uncertain sightings. Recall that for an extant species, the probability of recording a certain sighting in any year is pc. We can assume that valid uncertain sightings and invalid uncertain sightings occur independently according to some probabilities, say puv and pui. As such, in theory both valid and invalid sightings could occur in the same year. Thus the probability of having an uncertain sighting before τE is pu = puv(1 − pui) + pui(1 − puv) + puvpui. Recall that uncertain sightings are only recorded if there are no certain sightings. Thus, to exclude the probability of recording an uncertain sighting in a certain sighting year, the probability of recording an uncertain sighting is taken to be (1 − pc)pu, and the probability of recording no sighting is (1 − pc)(1 − pu). This process is referred to as the “censoring” process. However, if one prefers to consider the model without the censoring process then the above defined probabilities should be modified accordingly by discarding the (1 − pc) term.
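The three yearly outcome probabilities for an extant species under censoring (certain sighting, uncertain sighting, no sighting) can be checked numerically. The sketch below uses the category probabilities p1 = pc, p2 = (1 − pc)pu and p3 = (1 − pc)(1 − pu) that accompany Equation 4.9; the function name and dictionary keys are hypothetical.

```python
def yearly_outcome_probs(p_c, p_uv, p_ui):
    """Outcome probabilities in one year for an extant species with censoring."""
    # Probability of at least one uncertain sighting (valid, invalid or both).
    p_u = p_uv * (1 - p_ui) + p_ui * (1 - p_uv) + p_uv * p_ui
    return {
        "certain": p_c,                      # p1
        "uncertain": (1 - p_c) * p_u,        # p2: only when no certain sighting
        "none": (1 - p_c) * (1 - p_u),       # p3
    }
```

Note that pu is algebraically equal to 1 − (1 − puv)(1 − pui), so with puv = 0.8 and pui = 0.2 it evaluates to 0.84, and the three outcome probabilities sum to one for any pc.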

The certain sighting record C consists of nc sightings, all of which occur prior to τE as there cannot be any certain sighting after extinction. Let Nu be the total number of uncertain sightings. The uncertain sighting record U consists of nu(τE) sightings prior to τE of uncertain validity, followed by Nu − nu(τE) sightings after τE, all of which must be invalid. When τE > T, nu(τE) = Nu. Considering all situations described above, the likelihood p(S|τE, pc, pui, puv) can be summarized as:

$$
p(S \mid \tau_E, \ldots) =
\begin{cases}
0, & \tau_E \le c_n \\[4pt]
p_c^{n_c} \big((1 - p_c) p_u\big)^{n_u(\tau_E)} \big((1 - p_c)(1 - p_u)\big)^{\tau_E - 1 - n_c - n_u(\tau_E)} \, p_{ui}^{N_u - n_u(\tau_E)} (1 - p_{ui})^{T - (\tau_E - 1) - (N_u - n_u(\tau_E))}, & c_n < \tau_E \le T \\[4pt]
p_c^{n_c} \big((1 - p_c) p_u\big)^{N_u} \big((1 - p_c)(1 - p_u)\big)^{T - n_c - N_u}, & \tau_E > T.
\end{cases}
\tag{4.9}
$$
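A log-scale version of Equation 4.9 is convenient for checking the exponents case by case. The following is a minimal sketch rather than the JAGS specification used in this chapter: the function and argument names are hypothetical, years are assumed to be indexed 1, ..., T, and all probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def model2_log_likelihood(tau_e, p_c, p_uv, p_ui, certain_years, uncertain_years, T):
    """Log of the Model 2 likelihood in Equation 4.9."""
    c_n = max(certain_years)
    if tau_e <= c_n:
        return -math.inf  # extinction cannot precede a certain sighting

    p_u = p_uv * (1 - p_ui) + p_ui * (1 - p_uv) + p_uv * p_ui
    n_c = len(certain_years)
    N_u = len(uncertain_years)

    years_extant = min(tau_e - 1, T)  # years 1 .. tau_E - 1, capped at T
    n_u = sum(1 for u in uncertain_years if u < tau_e)  # n_u(tau_E)
    n_none_extant = years_extant - n_c - n_u
    years_extinct = T - years_extant  # zero when tau_E > T
    n_u_extinct = N_u - n_u  # uncertain sightings after extinction

    return (
        n_c * math.log(p_c)
        + n_u * math.log((1 - p_c) * p_u)
        + n_none_extant * math.log((1 - p_c) * (1 - p_u))
        + n_u_extinct * math.log(p_ui)
        + (years_extinct - n_u_extinct) * math.log(1 - p_ui)
    )
```

When τE > T the last two terms vanish because no years are spent extinct within the observation period, which reproduces the third case of Equation 4.9.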

The key notation used in Equation 4.9 is summarised in Table 4.1. In Equation 4.9, we have used the result that the likelihood of counts n1 = nc, n2 = Nu and n3 = T − nc − Nu arising from a generalized Bernoulli trial with probabilities p1 = pc, p2 = (1 − pc)pu and p3 = (1 − pc)(1 − pu) (i.e. p1 + p2 + p3 = 1) is $p_1^{n_1} p_2^{n_2} p_3^{n_3}$.

| Notation | Description |
|---|---|
| τE | Time or date of first year following extinction. |
| cn | The date of the last certain sighting. |
| nc | The total number of certain sightings. |
| Nu | The total number of uncertain sightings. |
| nu(τE) | The number of uncertain sightings prior to τE. |
| pc | The probability of having a certain sighting in each year. |
| puv | The probability of having a valid uncertain sighting in each year. |
| pui | The probability of having an invalid uncertain sighting in each year. |
| pu | The probability of having an uncertain sighting in each year (pu = puv(1 − pui) + pui(1 − puv) + puvpui). |

Table 4.1: Notation used in model development

Note that Model 2 reduces to Model 1 by setting the uncertain sighting probabilities to zero (i.e. pui = 0 and puv = 0).

As discussed in Section 4.2.1, τE was modelled as a geometric distribution with parameter θ, where the prior for θ was taken to be a uniform(0,1) distribution. All the other parameters (i.e. pc, pui and puv) were also assigned a standard uniform prior. Using these prior specifications along with the likelihood in Equation 4.9, we obtained posterior distributions for all model parameters including, most importantly, τE. In each MCMC implementation, we generated 4 chains, each with 130,000 iterations, with a burn-in period and an adaptive phase equal to 60,000 iterations. Also, a thinning value of 13 was used to reduce the autocorrelation in the chains, and hence 10,000 thinned steps were retained in each chain.
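The bookkeeping behind these MCMC settings can be made explicit. The snippet below assumes that the 130,000 iterations per chain are the sampling iterations that follow the burn-in and adaptive phase; the function name is hypothetical.

```python
def retained_draws(iterations_per_chain, thin, n_chains):
    """Number of draws kept per chain and in total after thinning."""
    per_chain = iterations_per_chain // thin
    return per_chain, per_chain * n_chains


per_chain, total = retained_draws(130_000, 13, 4)
print(per_chain, total)  # 10000 40000
```

Under that assumption, each chain contributes 10,000 thinned draws and the four chains together provide 40,000 draws for posterior summaries.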

Before applying Model 2 to actual sighting records, we tested it on 40 random simulations under eight different parameter settings where the species was either modelled as extinct or extant. Via the simulation study, we demonstrated that the MCMC methodology infers extinction time accurately.

4.3 Results

4.3.1 Simulation Study

Here, we infer the extinction year and its 95% highest density interval (HDI) on 40 random simulations, with the primary aim of assessing the performance of Model 2. One can apply a similar approach to Model 1, but here our primary interest is in Model 2 as it includes both certain and uncertain sightings.

To test Model 2 we apply it to 40 sighting records that are generated under 8 parameter settings (see Table 4.2). All the sighting records have a sighting period from 0 to 100. When generating a sighting record, uncertain sightings were only generated when there were no certain sightings. This was done to match the “censorship” process discussed under Model 2. Also, if the sighting in the last (i.e. 100th) year was a certain sighting then that sighting record was removed, as it is obvious that the species is extant in such a situation. In order to use our model, or any other related model of extinction, one should define the beginning of the observation period where the species was extant. This is needed in the mathematics as it refers to the zeroth time point. Defining the zeroth time point where the species was extant is important as it allows us to assume that extinction has not occurred before that point and hence is bound to happen afterwards or never. Because there is no natural way to define this, almost all extinction models follow the approach in Solow (1993a), who defines the “beginning” as the first certain sighting. So, in order to use Model 2 (the model with uncertain sightings), there should be at least two certain sightings to mark the beginning of the sighting record. Hence all the sighting records had at least one certain sighting.
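The record-generating process described above (geometric extinction time, yearly certain sightings while extant, and uncertain sightings recorded only in years without a certain sighting) can be sketched as follows. This is an illustrative simulator under the stated model assumptions, not the code used to produce Table 4.2: the function name is hypothetical, θ must be greater than zero, and the steps of conditioning on a first certain sighting and discarding records with a certain sighting in the final year are not implemented.

```python
import random


def simulate_sighting_record(theta, p_c, p_uv, p_ui, T=100, seed=None):
    """Simulate one sighting record with the censoring process of Model 2.

    Returns (tau_e, certain_years, uncertain_years) with years in 1..T.
    """
    rng = random.Random(seed)

    # Geometric extinction time: P(tau_E = k) = (1 - theta)^(k - 1) * theta.
    tau_e = 1
    while rng.random() >= theta:
        tau_e += 1

    certain, uncertain = [], []
    for year in range(1, T + 1):
        extant = year < tau_e
        if extant and rng.random() < p_c:
            certain.append(year)
            continue  # censoring: no uncertain sighting in a certain year
        valid = extant and rng.random() < p_uv   # valid only while extant
        invalid = rng.random() < p_ui            # can occur in any year
        if valid or invalid:
            uncertain.append(year)
    return tau_e, certain, uncertain
```

A call such as `simulate_sighting_record(0.025, 0.7, 0.8, 0.2, seed=1)` returns the simulated extinction year together with the certain and uncertain sighting years, which by construction never overlap.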

Table 4.2 summarises the results obtained from the 40 simulations. According to Table 4.2, most of the 95% HDIs include the true extinction year. From Table 4.2 and Figure 4.1 it is evident that the posterior median is a good estimate when the species is extinct (τE < 100). However, when the species is extant (τE ≥ 100) the posterior median tends to be much higher than the true value. In addition, the upper bound of the 95% HDI is unreasonably high in the majority of these simulations. For example, when the true extinction year is 107, the posterior median for extinction is 200 with a 95% HDI upper bound of 1982, which is unreasonably high. The main reason for this is that we only have sightings up to year 100, and beyond 100 is the unknown region. This can be seen as an extrapolation problem, and hence the uncertainty of the estimate increases.

Figure 4.1: Comparison between the true extinction date and the inferred posterior median extinction date with different yearly extinction probabilities (θ) and certain sighting probabilities (pc). The data were simulated with different pc and θ values while holding other parameters constant (i.e. puv = 0.8 and pui = 0.2). The blue vertical dotted line indicates the end of the sighting period, T. (A) θ = 0.025 and pc = 0.01. (B) θ = 0.01 and pc = 0.7.

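Table 4.2: Posterior estimate for the extinction date based on simulations under Model 2

| puv | pui | pc | θ | cn | True τE | Posterior median of τE | 95% HDI of τE |
|---|---|---|---|---|---|---|---|
| 0.8 | 0.2 | 0.7 | 0.025 | 1 | 2 | 2 | (2,4) |
| 0.8 | 0.2 | 0.7 | 0.025 | 99 | 107 | 200 | (100,1982) |
| 0.8 | 0.2 | 0.7 | 0.025 | 12 | 13 | 13 | (13,14) |
| 0.8 | 0.2 | 0.7 | 0.025 | 33 | 34 | 34 | (34,35) |
| 0.8 | 0.2 | 0.7 | 0.025 | 97 | 206 | 99 | (98,695) |
| 0.8 | 0.2 | 0.7 | 0.005 | 97 | 344 | 109 | (98,1070) |
| 0.8 | 0.2 | 0.7 | 0.005 | 99 | 200 | 195 | (100,1900) |
| 0.8 | 0.2 | 0.7 | 0.005 | 99 | 125 | 196 | (100,1949) |
| 0.8 | 0.2 | 0.7 | 0.005 | 11 | 12 | 12 | (12,13) |
| 0.8 | 0.2 | 0.7 | 0.005 | 65 | 66 | 66 | (66,67) |
| 0.8 | 0.2 | 0.01 | 0.025 | 19 | 20 | 20 | (20,21) |
| 0.8 | 0.2 | 0.01 | 0.025 | 10 | 16 | 16 | (14,17) |
| 0.8 | 0.2 | 0.01 | 0.025 | 56 | 59 | 59 | (57,66) |
| 0.8 | 0.2 | 0.01 | 0.025 | 14 | 29 | 28 | (17,32) |
| 0.8 | 0.2 | 0.01 | 0.025 | 19 | 44 | 44 | (42,47) |
| 0.8 | 0.2 | 0.01 | 0.005 | 67 | 138 | 150 | (68,1401) |
| 0.8 | 0.2 | 0.01 | 0.005 | 75 | 301 | 137 | (76,1297) |
| 0.8 | 0.2 | 0.01 | 0.005 | 55 | 120 | 116 | (56,1113) |
| 0.8 | 0.2 | 0.01 | 0.005 | 47 | 346 | 108 | (48,1049) |
| 0.8 | 0.2 | 0.01 | 0.005 | 49 | 57 | 56 | (54,57) |
| 0.2 | 0.8 | 0.7 | 0.025 | 20 | 21 | 21 | (21,23) |
| 0.2 | 0.8 | 0.7 | 0.025 | 9 | 10 | 10 | (10,12) |
| 0.2 | 0.8 | 0.7 | 0.025 | 17 | 18 | 18 | (18,21) |
| 0.2 | 0.8 | 0.7 | 0.025 | 68 | 69 | 69 | (69,71) |
| 0.2 | 0.8 | 0.7 | 0.025 | 98 | 107 | 189 | (99,1828) |
| 0.2 | 0.8 | 0.7 | 0.005 | 99 | 216 | 197 | (100,1913) |
| 0.2 | 0.8 | 0.7 | 0.005 | 22 | 23 | 23 | (23,24) |
| 0.2 | 0.8 | 0.7 | 0.005 | 98 | 109 | 183 | (99,1787) |
| 0.2 | 0.8 | 0.7 | 0.005 | 98 | 185 | 190 | (99,1948) |
| 0.2 | 0.8 | 0.7 | 0.005 | 98 | 160 | 170 | (99,1632) |
| 0.2 | 0.8 | 0.01 | 0.025 | 14 | 30 | 16 | (15,31) |
| 0.2 | 0.8 | 0.01 | 0.025 | 20 | 39 | 31 | (21,61) |
| 0.2 | 0.8 | 0.01 | 0.025 | 16 | 18 | 22 | (17,45) |
| 0.2 | 0.8 | 0.01 | 0.025 | 8 | 11 | 11 | (9,33) |
| 0.2 | 0.8 | 0.01 | 0.025 | 9 | 14 | 12 | (10,23) |
| 0.2 | 0.8 | 0.01 | 0.005 | 82 | 109 | 191 | (83,1862) |
| 0.2 | 0.8 | 0.01 | 0.005 | 78 | 212 | 174 | (79,1656) |
| 0.2 | 0.8 | 0.01 | 0.005 | 12 | 14 | 14 | (13,25) |
| 0.2 | 0.8 | 0.01 | 0.005 | 90 | 206 | 191 | (91,1786) |
| 0.2 | 0.8 | 0.01 | 0.005 | 81 | 321 | 176 | (82,1735) |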

In addition to the accuracy of Model 2, the results in Table 4.2 indicate the relationship between the true extinction time and our model parameters (such as pc and θ). According to the findings in Table 4.2 and Figure 4.2, there seems to be no relationship between pc (i.e. certain sighting probability) and the true τE (extinction time). This is mainly due to the independence between pc and θ (yearly extinction probability). If interest centres solely on extinction, then the sighting rate becomes a nuisance parameter. Hence Solow (2016) eliminates the sighting probability by conditioning on the total number of sightings. Also, when all the parameters are kept constant except for θ, we can see that the extinction year tends to be far away from the last certain sighting (cn) for small values of θ (i.e. 0.005), compared to its larger value (0.025). Also, for smaller values of θ the species tends to be extant in most situations irrespective of the sighting probabilities (i.e. pc, puv, pui).
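The effect of θ on whether a simulated species is still extant at the end of the observation period follows directly from the geometric distribution in Equation 4.4, since P(τE > T) = (1 − θ)^T. The short sketch below evaluates this for the two θ values used in the simulations, with T = 100; the function name is hypothetical.

```python
def prob_extant_at_end(theta, T=100):
    """P(tau_E > T) under the geometric distribution of Equation 4.4."""
    return (1.0 - theta) ** T


for theta in (0.005, 0.025):
    print(theta, round(prob_extant_at_end(theta), 3))
```

With θ = 0.005 roughly 61% of simulated species survive beyond year 100, compared with about 8% for θ = 0.025, and neither figure depends on the sighting probabilities.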

Figure 4.2: Histogram of the true extinction date with low and high certain sighting probabilities (pc). The histogram is the result of 1000 simulations. The data were simulated with different pc values (0.7, 0.01) while holding other parameters constant (i.e. puv = 0.8, pui = 0.2 and θ = 0.025).

4.3.2 Case study

Ivory Billed Woodpecker (IBW)

The Ivory Billed Woodpecker (IBW) (Campephilus principalis) is one of the largest woodpeckers in the world but may have recently gone extinct. In the past decade several sightings of the IBW were reported but with uncertain validity, as it was impossible to obtain a clear photograph or other conclusive evidence of the bird (Collins, 2017). A highly controversial uncertain sighting was recorded in 2004, and it was then argued that the IBW had been rediscovered (see Fitzpatrick et al. (2005)). But whether the sighting was of the IBW or the Pileated Woodpecker (Dryocopus pileatus) is still open to debate (Sibley et al., 2007).

We proceed to analyse the sighting record data of the IBW provided in Elphick et al. (2010) and Roberts et al. (2010), which gives 68 sightings throughout the period 1897 to 2010. Each of these sightings was classified into one of three different sighting classes (Roberts et al., 2010).

1. Physical Evidence (PE) - e.g., museum specimens, but also uncontroversial photographs, video, and sound recordings. (22 sightings).

2. Independent Expert Opinion (IEO) - evidence that experts deemed sufficiently documented to confirm the record. (17 sightings).

3. Controversial sightings (CS) - sightings judged to lack firm evidence including any sighting for which there is published disagreement between experts. (29 sightings).

Following Solow and Beet (2014), we consider only sightings belonging to the Physical Evidence (PE) class as certain sightings and treat all other evidence as uncertain sightings.

Model 1 - Certain sightings only

We begin by analysing the IBW data with the certain sighting only model (Model 1). That is, we only analyse PE sightings. This requires working with the likelihood in Equation 4.3 and the prior distributions defined above. The posterior distribution of τE is summarized in Figure 4.3 and the 95% highest density interval (HDI) for the posterior extinction year τE is given in Table 4.3. The median extinction year is 1940 with a 95% HDI upper bound in 1944. Also, the posterior probability that extinction occurred during the observation period is equal to one (i.e. p(τE ≤ 2010 | S) = 1), which gives overwhelming support that extinction occurred during the observation period. Based on these findings we can infer that the IBW went extinct within a few years after the last certain (i.e. PE) sighting in 1939.

Figure 4.3: Posterior distribution plot of τE for the IBW for Model 1. Black solid line above x-axis shows the 95% HDI for the posterior distribution.

Table 4.3: Summary of the posterior distribution of τE using certain sightings only.

| | 95% HDI Low | Median | 95% HDI High |
|---|---|---|---|
| τE \| S | 1940 | 1940 | 1944 |

Figure 4.4: Posterior distribution plots for the model parameters for Model 1 excluding τE. Black solid line above x-axis shows the 95% HDI for the posterior distribution. (A) Posterior distribution of θ. (B) Posterior distribution of pc.

The posterior distributions of model parameters θ and pc are shown in Figure 4.4. Recall that a non-informative prior (i.e. uniform distribution) was used for both of these parameters. As per Figure 4.4a, the posterior estimate for the yearly extinction probability θ for the IBW is θ = 0.02. Also, according to Figure 4.4b, the posterior estimate of pc, the probability of recording a certain sighting, is pc = 0.5, with a 95% HDI between 0.3 and 0.6. The posterior median inferred for pc is similar to the estimate S/Tn = 0.52 (i.e. S = nc and Tn = cn) obtained with the approach used in Thompson et al. (2013a) and Lee (2014).

Model 2 - Certain sightings and uncertain sightings

In Model 2, we follow Solow and Beet (2014) and assume that all PE sightings are certain, and all other sighting evidence (i.e. IEO and CS) uncertain. We thus use the likelihood in Equation 4.9. The posterior distribution of τE is plotted in Figure 4.5 and the 95% HDI for the posterior extinction year is given in Table 4.4. According to Table 4.4, the median extinction year is 1940 with a 95% HDI upper bound in 1945. Similar to Model 1, we can infer that the IBW went extinct within a few years of the last certain (i.e. PE) sighting in 1939. Our findings contradict the results from a recent paper which inferred the extinction year for the IBW to be much closer to the sighting end-point in 2010 (Brook et al., 2019). However, it is impossible to know which of these inferences is correct. Interestingly, the inferences made concerning τE under Model 1 and Model 2 seem almost identical. Hence for the IBW sighting record, the inclusion of uncertain sightings does not affect the conclusion of the model, although this property is not always guaranteed (see Section 4.4).

Table 4.4: Summary of the posterior distribution of τE

| | 95% HDI Low | Median | 95% HDI High |
|---|---|---|---|
| τE \| S | 1940 | 1940 | 1945 |

Figure 4.5: Posterior distribution plot of τE for the IBW for Model 2. Black solid line above x-axis shows the 95% HDI for the posterior distribution.

The posterior distributions of the other model parameters, i.e. θ, pc, puv and pui, are shown in Figure 4.6. By comparing the value of the mode in Figure 4.6c with the mode in Figure 4.6d, it is clear that there is a higher chance of observing an invalid uncertain sighting than a valid uncertain sighting. Also, the variability in the invalid uncertain sighting probability is much less than the variability of the valid uncertain sighting probability. Both Model 1 and Model 2 produce similar posterior distributions for the yearly extinction probability θ and for pc.

Figure 4.6: Posterior distribution plots for the model parameters for Model 2 excluding τE. Black solid line above x-axis shows the 95% HDI for the posterior distribution. (A) Posterior distribution of θ. (B) Posterior distribution of pc. (C) Posterior distribution of puv. (D) Posterior distribution of pui.

Diagnostic checks were carried out for all the model results presented in this paper and no indication of any problem for any parameter (e.g. τE, pc, puv etc.) was observed (see Appendix 8).

Treating uncertain sightings as certain

In this subsection, we analyse the IBW data, treating all sightings (PE, IEO and CS) as certain sightings in order to see how the inference changes. Under this assumption, the last certain sighting cn is equal to the last (previously uncertain) sighting in 2007 and the total number of certain sightings is now equal to nc + Nu. Based on these new inputs, it was found that the posterior estimate (median) for the extinction year of the IBW increases to the year 2080 (τE = 2080), which is completely different from our previous results, and would suggest that the IBW is extant, if there was reason to believe that the CS and IEO data were actually certain.

Diagnostic tests for MCMC Samples

When using a computational Bayesian approach it is important to carry out diagnostic checks to examine whether the quality of the MCMC chains is sufficient to provide an accurate approximation of the target distribution. In practice, the MCMC chains are often assessed through visual inspection of the trace plot, auto-correlation plot, shrink factor plot and marginal density plot. In addition to these visual inspections, there are numerical checks such as the effective sample size (ESS) and Monte Carlo standard error (MCSE), which are used to measure the accuracy of the chains. A full discussion of these tools can be found in Kruschke (2014). Figure 4.7 illustrates these diagnostic checks for the parameter θ using the IBW sightings.

Figure 4.7: Illustration of MCMC diagnostics: (a) trace plot for θ; (b) auto-correlation plot for θ; (c) shrink factor plot for θ; (d) marginal density plot for θ. The trace plot, auto-correlation plot, shrink factor plot and marginal density plot are output by JAGS. These plots are used to check if the chains are well mixed and suitably represent the posterior distribution. Analysis is based on data for the IBW (see text).

The trace plot in Figure 4.7a displays the values of the parameter θ (yearly extinction probability) during the run-time of the chain. This plot is used to identify any signs of irregular orphaned chains that might arise in some unusual regions of the parameter space. The plot given in Figure 4.7a indicates overlapping chains, suggesting no orphaned chains. The marginal density plot of θ (see Figure 4.7d) is a smoothed histogram of the values in the trace plot. This plot is used to identify if all the chains suitably represent the posterior distribution. The density plot also indicates overlapping chains, which suggests good representativeness of the posterior distribution. The auto-correlation plot given in Figure 4.7b indicates zero auto-correlation between the chain values, which means that the values in a chain change rapidly from step to step. As such, the chains are less clumpy and provide reasonably independent samples from the parameter distribution, indicating that there are no problems. Convergence can also be checked numerically through the shrink factor, shown here in Figure 4.7c. A shrink factor above 1.1 indicates concerns about the convergence of the chains (Kruschke, 2014), something that is not an issue in this example.
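The shrink factor compares between-chain and within-chain variability. The sketch below implements the basic Gelman-Rubin potential scale reduction factor in plain Python; it is a simplified illustration and omits the degrees-of-freedom correction that diagnostic packages may apply, so its values need not match the plotted output exactly. The function name is hypothetical.

```python
def shrink_factor(chains):
    """Basic potential scale reduction factor for equal-length chains.

    chains: list of m lists, each holding n draws of one parameter.
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(chain) / n for chain in chains]
    grand_mean = sum(means) / m

    # Between-chain variance B and mean within-chain variance W.
    B = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    W = sum(
        sum((x - mu) ** 2 for x in chain) / (n - 1)
        for chain, mu in zip(chains, means)
    ) / m

    # Pooled estimate of the posterior variance, then its ratio to W.
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5
```

Chains that explore the same region give a value close to 1, whereas chains stuck in different regions inflate the between-chain variance and push the value above the 1.1 threshold.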

The density plot in Figure 4.7d displays the estimated 95% highest density interval (HDI) for each chain. The 95% HDI is a Bayesian credible interval, and values inside this interval have a total probability of 0.95. Because of the uncertainty in the parameter, the HDIs for each chain will differ slightly from each other. The MCSE indicates the estimated standard deviation of the sample mean in the chain, and an ESS value of at least 10,000 is desirable to have a reasonably accurate and stable estimate of the limits of the 95% HDI. As the ESS value for θ is around 40,000 (> 10,000) (see Figure 4.7b), the estimates for θ will be stable and accurate.
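A common empirical way to obtain an HDI from MCMC draws is to sort the sample and take the narrowest window that contains the required share of the draws. The sketch below illustrates that idea for a unimodal posterior; it is an assumption-labelled example with a hypothetical function name, not necessarily the routine used to produce the intervals reported in this chapter.

```python
import math


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of draws the interval must contain (small tolerance guards
    # against floating-point error in mass * n).
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive draws and keep the narrowest one.
    widths = [(ordered[i + k - 1] - ordered[i], i) for i in range(n - k + 1)]
    _, start = min(widths)
    return ordered[start], ordered[start + k - 1]
```

Unlike an equal-tailed interval, this window shifts away from a long tail, which is relevant for τE because its posterior can be strongly right-skewed when the species may still be extant.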

With the aid of Figure 4.7, we demonstrated that the MCMC chains generated for θ under Model 1 are sufficient to provide an accurate approximation of the target distribution. Similar diagnostic checks were carried out for all the model results presented in this paper (Model 1 and Model 2) and no indication of any problem for any parameter (e.g. τE, pc, puv etc.) was observed. All of these diagnostic figures are given in Appendix 8.

4.4 Sensitivity analysis

In the previous section, we found that the inclusion of uncertain sightings changed the results of the IBW analysis very little compared to a model which omits them. Hence, it is important to see if this is a special case, or whether the uncertain sightings are generally non-informative. To assess this, we consider three artificially generated sighting records shown in Figure 4.8 along with four sighting records for species documented in the literature.

4.4.1 Artificial sighting records

For the artificially generated time series, all three had the same certain sighting history, where sightings (green) occur in a regular fashion for the first 24 years of the 100 year observation record (see Figure 4.8). While the first scenario has only certain sightings, the second and the third include uncertain sightings with different rates for the first 69 years.

Figure 4.8: Posterior median extinction date and its 95% HDI for three artificially generated sighting records between 0 and 100. The cells shaded in green represent certain sightings, while the red shades represent uncertain sightings. Cells without any shading indicate no sightings. For each sighting record, the posterior median extinction date is indicated by a pink dashed line and the 95% HDI by the blue region.

Figure 4.8 summarizes the prediction results obtained from Model 1 for scenario (i) and Model 2 for the other two scenarios. From Figure 4.8, it is clear that the first two scenarios result in a median extinction date (pink dashed line) close to the last certain sighting in year 24, while for the third scenario the extinction prediction is closer to the last uncertain sighting in year 69. Theoretically, after extinction, the rate of sightings will fall to a lower value, as there can be no certain sightings or valid uncertain sightings after extinction. Hence these models (i.e. Model 1 and Model 2) are trying to identify the time point at which the rate of sightings drops. When there are only certain sightings, the time of the last certain sighting should indicate the point where the rate changes, as there are no certain sightings afterwards. Hence for a sighting record with frequent certain sightings, the species is expected to go extinct at a date that is close to the last certain sighting.

However, when the uncertain sightings continue at a high rate after the last certain sighting and then fall to a low rate closer to the end of the observation period, the uncertain sightings become more informative (see scenario (iii)). Hence the inferred extinction will occur closer to the point when the rate of uncertain sightings drops. In contrast, when the uncertain sightings occur at a low constant rate (scenario (ii)), they carry little information. Hence the result from scenario (ii) does not differ significantly from scenario (i). So there can be situations where the uncertain sightings are informative (i.e. scenario (iii)) and situations where they are not (i.e. scenario (ii)). This is what Model 2 weighs up.

4.4.2 Species sighting records

In this subsection the two methods developed are used to infer the extinction year of four charismatic bird species: Nukupu’u (Hemignathus lucidus), Eskimo Curlew (Numenius borealis), Kaua’i ’akialoa (Hemignathus stejnegeri) and O’ahu ’Alauahio (Paroreomyza maculata). All sighting records contain the same three sighting types as the IBW record (Elphick et al., 2010; Roberts et al., 2010). For these species, the observation end-point is 2010. As for the IBW, the PE sightings are considered as certain sightings, while IEO and CS are treated as uncertain sightings. The sighting records are then analysed under Model 1 and Model 2. In addition, we demonstrate the impact of model assumptions. For the sake of simplicity, we assume that the yearly valid uncertain sighting probability is the same as the yearly certain sighting probability (i.e. puv = pc). This is Model 3.

Table 4.5: Inferred extinction year and its 95% HDI based on Model 1, Model 2 and Model 3

Species            Model 1                  Model 2                  Model 3
                   Median   95% HDI         Median   95% HDI         Median   95% HDI
Nukupu’u           1903     (1900, 1916)    1903     (1900, 1914)    1903     (1900, 1924)
Eskimo Curlew      1966     (1964, 1974)    1967     (1964, 1984)    1969     (1964, 1991)
Kaua’i ’akialoa    1965     (1961, 1988)    1968     (1961, 1994)    1971     (1961, 2046)
O’ahu ’Alauahio    1979     (1969, 2651)    2058     (1969, 3952)    2079     (1969, 4077)

* Key model assumptions: Model 1: puv = pui = 0; Model 2: puv ≥ 0, pui ≥ 0; Model 3: puv = pc

Table 4.5 summarises the findings under Model 1, Model 2 and Model 3 for the four species. According to Table 4.5, inclusion of uncertain sightings has little or no effect on the inferred extinction year for the Nukupu’u. For this species, there are no (uncertain) sightings after 1899 until 1960. In addition, there were frequent certain sightings from 1890, which suddenly stop in 1899. Hence, as the theory predicts, the extinction date is found close to the last certain sighting (i.e. 1899). For this species, the additional information about the uncertain sightings increased the upper bound of the 95% HDI of the extinction year by ten years. For the Eskimo Curlew, all the models infer a similar median extinction time. However, the upper bound of the 95% HDI for Model 2 is a decade later than that for Model 1. Similar to the Eskimo Curlew, the inferred extinction year for the Kaua’i ’akialoa changes by only a few years when uncertain sightings are included. However, it is interesting to note that when the model assumes that puv = pc (Model 3), the upper bound of the 95% HDI goes beyond the end-point (2010). Hence one can infer that the Kaua’i ’akialoa is extant in 2010, which contradicts the inference from Model 1 and Model 2. This example demonstrates the sensitivity of the extinction year to model assumptions.

The inferred median extinction year for the O’ahu ’Alauahio is much later under Model 2 (2058) than under Model 1 (1972). On the basis of the inferred median extinction year, Model 1 infers that the species is extinct while Model 2 does not. However, when the 95% HDIs are used, both models support the existence of the O’ahu ’Alauahio. In addition, for the O’ahu ’Alauahio the posterior probability of extinction before 2010 (i.e. p(τE ≤ 2010|S)) is 0.76 under Model 1, which is much higher than the 0.36 obtained under Model 2. This again confirms the important role of uncertain sightings.
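The posterior probability p(τE ≤ 2010|S) quoted above can be estimated directly from MCMC output as the fraction of posterior draws of τE that do not exceed the end-point. The sketch below illustrates this with a hypothetical function name (`prob_extinct_by`) and made-up toy draws; it does not reproduce the values reported for any species.

```python
import numpy as np

def prob_extinct_by(tau_samples, end_year):
    """Monte Carlo estimate of p(tau_E <= end_year | S) from posterior draws."""
    tau_samples = np.asarray(tau_samples, dtype=float)
    return float(np.mean(tau_samples <= end_year))

# Toy draws (hypothetical, for illustration only): four of six are <= 2010.
toy_draws = np.array([1975, 1990, 2005, 2010, 2040, 2300])
p_extinct = prob_extinct_by(toy_draws, 2010)
```

Because this probability uses the whole posterior rather than a single summary, it can disagree with a conclusion based only on the median when the posterior has a long right tail.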

4.5 Discussion

In this study we present a Bayesian hierarchical approach to obtain the posterior distribution for τE (the date of the first year following extinction) and to calculate the posterior probability that the species is extinct by the end-point of the sighting record data. Our general model is intended for sighting records that contain both certain and uncertain sightings. In order to obtain the posterior distribution for τE, we use Markov Chain Monte Carlo (MCMC) sampling techniques implemented with JAGS in R (Kruschke, 2014). As a case-study, we infer the extinction time distribution of τE for the IBW from historical sighting records.

In 2005, the IBW, which was thought to be extinct, received considerable attention after the announcement of its rediscovery in continental North America in the prestigious journal Science (Fitzpatrick et al., 2005). This announcement was based on a video clip analysis, which captured the species for a total of four seconds in 2004. However, the video had a number of problems, since images were blurred and pixelated owing to rapid motion, slow shutter speed, video interlacing artifacts, and the bird’s distance beyond the video camera’s focal plane (Fitzpatrick et al., 2005). Soon after the claim, Sibley et al. (2007) concluded that the evidence suggests that the bird in the video was a normal pileated woodpecker rather than an IBW, thereby reigniting the controversy as to whether the IBW was extinct or extant. Recent work has shown how modern drone technology might be used to find the IBW (Collins, 2018) and possibly resolve this controversy. However, it is also possible that these searches may not find any significant evidence of IBW existence, or they may simply find more uncertain evidence. Thus, it is important to identify when to stop the search efforts for extinct species and divert the search efforts and funds towards the conservation of nearly extinct species. More on this topic is discussed in Carlson et al. (2018a).

The systematic surveys for the IBW that started after 2005 did not result in any detections or additional evidence (U.S. Fish and Wildlife Service, 2010). Theoretically, for a species with no sightings in survey years, this will indicate earlier extinction compared to a species for which there were no surveys at all. Nevertheless, we believe that the survey efforts have little or no impact on our inference of the IBW extinction year, as the systematic surveys started five decades after the inferred extinction date, when the species was likely already extinct, and when there were no sightings. Hence survey effort has been considered irrelevant for the IBW in our analysis. However, if such information is available, then the models developed here could be modified accordingly by following Lee (2014).

The models developed assume that the certain and uncertain sighting rates are constant by taking constant sighting probabilities. If a constant sighting rate is assumed, such methods either cannot be used for modelling declining populations/sightings or should be used with caution. However, the method is still appropriate for small populations with relatively rapid extinction (Solow, 1993a); we thus follow Solow and Beet (2014), who find this method appropriate for the IBW. Another limitation of the models described in this paper is that complex factors such as spatial heterogeneity are not allowed for. If such information is available, other approaches may also be tried for inferring extinction (Brook et al., 2018). However, currently, for many species, sighting records may be the only information available (Solow, 1993b).

Both models were applied to the sighting data of the IBW, with uniform priors assigned to all model parameters. The null hypothesis that the IBW is extant in 2010 was rejected under both the certain sighting model (Model 1, excluding uncertain sightings) and the combined certain/uncertain sighting model (Model 2, including uncertain sightings). Thus our statistical analysis suggests that the IBW went extinct in the 1940s, even when taking into account the uncertain sighting in 2006. Hence, the inclusion of uncertain sightings did not change the inference about extinction for the IBW. However, we checked whether this is the case for four other real-world examples (i.e. Nukupu’u, Eskimo Curlew, Kaua’i ’akialoa and O’ahu ’Alauahio). For the sighting record of the Nukupu’u, the uncertain sightings did not add any significant information about extinction. In contrast, the inferred median extinction year was much later for the O’ahu ’Alauahio when uncertain sightings were included. Also, when Model 2 was modified to have the same yearly valid uncertain sighting probability as the yearly certain sighting probability (i.e. puv = pc), the inference changed from extinct to extant for the Kaua’i ’akialoa. For the Eskimo Curlew, including uncertain sightings only increased the upper bound of the 95% HDI for the inferred extinction year by 10 years. These results make it clear that the extinction year can be sensitive to the inclusion of uncertain sightings as well as to the model assumptions.

Through a set of artificially generated sighting records, it was shown that extinction is likely to occur either close to the last certain sighting or close to the point where the uncertain sightings fall to a significantly lower rate. These two time points can be seen as change-points. A change-point is a time point at which the probability distribution of a sequence of observations differs before and after. In simple terms, our analysis reflects that extinction is highly likely to occur at a change-point where the rate of sightings diminishes. For the certain-sighting-only scenario, there is only one change-point, and that is the last certain sighting. In this case, the relatively high rate of certain sightings prior to the last certain sighting, and their absence after that, means that the last certain sighting is the change-point. But when there are both certain and uncertain sightings, the significant change-point can arise from either sighting type. For example, uncertain sightings can continue at a high rate after the last certain sighting and then fall to a low rate prior to the end of the observation period. In this situation the change-point is the point in time where the uncertain sighting rate diminishes. Hence the extinction problem can be viewed as a change-point analysis, but the actual change-point will depend on the model assumptions.

Chapter 5

Modeling extinction of a species using non-homogeneous Poisson processes with a change-point

This chapter is currently being prepared for publication.

Abstract

Bayesian methods have been developed for inferring the true year of extinction of a species from sighting records that have both certain and uncertain sightings. These methods typically make the restrictive assumption that all sighting types (i.e. certain, valid uncertain, invalid uncertain) derive from independent homogeneous Poisson processes. In this study, the constant rate assumption in the homogeneous Poisson process is relaxed by allowing certain and uncertain sightings to follow independent non-homogeneous Poisson processes. The model can thus identify whether any of the sighting rates were increasing, decreasing or constant. In addition, a change-point is introduced to model the uncertain sightings, where the sighting rates before and after the change-point vary. We have used Markov Chain Monte Carlo (MCMC) sampling to generate the posterior distributions for model parameters, including the species extinction time. The proposed method was applied to the sighting records of the black-footed ferret (Mustela nigripes) and the Ivory-billed Woodpecker (IBW; Campephilus principalis). Based on a hypothesis test, the results of the model indicate that the two species went extinct in the years 1988 and 1956, respectively. Moreover, a decline in the certain sighting rate was also inferred for both these species, possibly indicating the decrease in the species abundance as it approaches extinction. Thus earlier models that assume a constant sighting rate may well be biased. Uncertain sighting rates for the IBW were found to increase before extinction (indicating ecological attention received near extinction) and stayed constant after extinction.

5.1 Introduction

The ongoing loss of global biodiversity is one of the most pressing contemporary ecological problems, threatening valuable ecosystem services and human well-being (Ceballos et al., 2010; Dirzo and Raven, 2003; Mace et al., 2012; Daily and Matson, 2008; Ehrlich and Ehrlich, 2013; Barnosky et al., 2011). Theoretical ecologists have therefore taken great interest in studying processes that lead to species extinctions, from complex spatio-temporal models (Holdaway, 1999; La Barbera and Spagnolo, 2002; Gu et al., 2002), to methods that infer from empirical data (Berger, 1990) and survival models (Fisher and Blomberg, 2011, 2012; Thompson et al., 2020) that predict whether a species has become extinct or not. It is the latter methods that will be of concern to us here. The date of extinction, or the time of the disappearance of the last individual of a species, is rarely observed and even harder to detect. Therefore, wherever possible, any inference concerning the extinction of a species should be based on a variety of information sources. These include time series of historical sightings (i.e. sighting records), the effort expended in searching for the species, change in abundance over time (i.e. population trajectories), potential remaining habitat and its relationship to abundance, the severity and extent of processes threatening species, and intrinsic taxon information (e.g. life-history traits) (Boakes et al., 2015). Ideally, we would like to use all of this information when attempting to infer whether a species has gone extinct or not. However, for rare or poorly studied species, the only available data are often restricted to time series of sightings (Solow, 1993b).

Sighting history provides fundamental knowledge about a species’ existence and also the possibility of its extinction. However, extinction becomes a certainty only when there are no surviving individuals of the species, which is generally difficult or impossible to ascertain. Thus the assessment of extinction can benefit from the development of probabilistic frameworks (Elphick et al., 2010). A number of studies have developed methods to calculate an extinction probability based on the record of sightings of a species through time. A sighting record typically contains mixed-certainty sightings, some being certain and others uncertain. For example, observing an actual specimen of a species would be classified as a certain sighting, while an ambiguous photograph would be classified as an uncertain sighting. Thus, predicting the probability of a species being extinct from a sighting record ideally requires allowing for both certain and uncertain sightings.

Working with uncertain sightings requires further terminology. While certain sightings may confidently be considered always “valid”, uncertain sightings are either “valid” or “invalid”, given we are not sure whether we have identified the species correctly or not. In practice, it is impossible to know which of the uncertain sightings are valid and actually real, and which are invalid and thus errors. The most straightforward approach when modeling both these sighting types is to assume that the sighting rate of a species over time is constant for each sighting type, i.e., for all certain, for all valid uncertain and for all invalid uncertain sightings (Solow et al., 2012; Solow and Beet, 2014; Lee et al., 2014, 2017a). However, assuming a constant sighting rate is only valid for small populations. This motivated the test of Solow (1993b) for extinction in a declining population. In this approach, sightings were modelled as a non-stationary Poisson process with an exponentially declining rate function. However, the method is suitable only for sighting records that contain only certain sightings and that appear to show a decline in the sighting rate.

In this paper, we extend the work of Solow and Beet (2014) to develop a more general Bayesian framework to infer the extinction year by relaxing the assumption of a constant sighting rate. The new model assumes that the sightings follow a non-homogeneous Poisson process whose intensity function is of Weibull hazard form, often referred to as a Weibull process or, more commonly, a power-law process (Rigdon and Basu, 1989; Ho, 1991; Rao et al., 2006). Our new approach can detect whether the sighting rates for certain and uncertain sightings are constant, decreasing or increasing over the study period. It is important to have this flexibility in an extinction model because, for example, the certain sighting rate can decline before extinction due to declining abundance or habitat loss. In addition, the certain sighting rate and/or the uncertain sighting rate can increase due to significant attention from the media. The certain and uncertain sighting rates will also be affected by survey efforts. As Boakes et al. (2010) noted, there is a risk that increasing survey effort for threatened species could mask an abundance decline. In this paper, it is assumed that the time of extinction can be viewed as a change-point for the uncertain sightings. Hence, the rate of uncertain sightings can be decreasing, increasing or constant after extinction regardless of its behaviour before extinction. However, the model proposed here assumes a smooth incline/decline in the sighting rates and may not be suitable if this assumption is violated.

5.2 Model Development

Let N(t) ≥ 0 be the total number of sightings in the time interval [0, t), t ≥ 0. Then, assuming that N = {N(t): t ≥ 0} evolves according to a non-homogeneous Poisson process with rate given by λ(t), the mean of the process is given by:

$$E[N(t)] = m(t) = \int_0^t \lambda(s)\,ds. \tag{5.1}$$

By the properties of the Poisson distribution, the probability of k sightings between time t and (t + s) is given by:

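$$
\begin{aligned}
P\big(N(t+s) - N(t) = k\big) &= \frac{\left[\int_t^{t+s} \lambda(u)\,du\right]^k}{k!} \exp\left\{-\int_t^{t+s} \lambda(u)\,du\right\} \\
&= \frac{[m(t+s) - m(t)]^k}{k!} \exp\{-[m(t+s) - m(t)]\}.
\end{aligned}
\tag{5.2}
$$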
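Equation 5.2 can be evaluated for any mean function m(t). The following Python sketch is offered only as an illustration; the function name `count_probability` and the constant-rate example (0.5 sightings per unit time) are assumptions made here and are not taken from the analysis.

```python
import math

def count_probability(k, m, t, s):
    """P(N(t+s) - N(t) = k) for a Poisson process with mean function m (Eq. 5.2)."""
    mu = m(t + s) - m(t)  # expected number of sightings in [t, t+s)
    return mu ** k * math.exp(-mu) / math.factorial(k)

# Hypothetical homogeneous example: constant rate 0.5, so m(t) = 0.5 * t.
def m_const(t):
    return 0.5 * t

# Probability of no sightings in a window of length 4: exp(-0.5 * 4) = exp(-2).
p_none = count_probability(0, m_const, 10.0, 4.0)
```

The same function applies unchanged to a non-homogeneous process once `m` is replaced by a time-varying mean function.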

Figure 5.1: Certain sightings can evolve according to a homogeneous or non-homogeneous Poisson process with rate given by λc(t). Uncertain sightings can also evolve according to a Poisson process but with the presence of a change-point τE. The rate of the uncertain sightings can either be λu1(t) or λu2(t) depending on whether t is before or after extinction. The solid line indicates the homogeneous Poisson process, while the dashed (i.e. increasing rate) and dotted (i.e. decreasing rate) lines indicate two different non-homogeneous Poisson processes. Whether these non-homogeneous Poisson rates behave in a linear or non-linear pattern depends on the rate function used.

For a sighting record, let Nc(t) ≥ 0 and Nu(t) ≥ 0 be the numbers of certain sightings and uncertain sightings in the time interval [0, t), t ≥ 0. We assume that the number of certain sightings (Nc = {Nc(t): t ≥ 0}) evolves according to a non-homogeneous Poisson process with rate given by λc(t) in the interval 0 to τE (i.e. the extinction time). While the certain sightings must stop after the extinction time τE, the invalid uncertain sightings should continue after extinction. The extinction time (τE) can be considered as a change-point for uncertain sightings because before τE the uncertain sightings consist of both valid and invalid uncertain sightings, but after τE there are only invalid uncertain sightings. Hence the uncertain sightings are assumed to evolve according to a non-homogeneous Poisson process with a change-point at τE. Before extinction, the rate of uncertain sightings is λu1(t), but this changes to λu2(t) after τE (see Figure 5.1).

In the present case (t ≤ τE), the rate of certain sighting λc(t) is defined as follows:

$$\lambda_c(t) = (\alpha_c/\sigma_c)(t/\sigma_c)^{\alpha_c - 1}, \quad t \le \tau_E. \tag{5.3}$$

The rate of certain sightings in Equation 5.3 is assumed to be of the Weibull hazard function form, i.e., $(\alpha_c/\sigma_c)(t/\sigma_c)^{\alpha_c - 1}$, where σc and αc are the scale and shape parameters respectively.

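The rate for uncertain sightings λu(t) is defined as follows with a change-point at τE:

$$
\lambda_u(t) =
\begin{cases}
\lambda_{u1}(t) = (\alpha_{u1}/\sigma_{u1})(t/\sigma_{u1})^{\alpha_{u1} - 1}, & t \le \tau_E \\
\lambda_{u2}(t) = (\alpha_{u2}/\sigma_{u2})(t/\sigma_{u2})^{\alpha_{u2} - 1}, & t > \tau_E.
\end{cases}
\tag{5.4}
$$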

Similar to Equation 5.3, the rates of uncertain sightings in Equation 5.4 are assumed to be of the Weibull hazard function form, where σu1, αu1, σu2 and αu2 are the scale and shape parameters before and after extinction. The Weibull rate function is a very flexible function that can be adjusted to mimic smoothed real-world sighting rate behaviours using the scale (σ) and shape (α) parameters. It should be noted that a Weibull shape parameter value less than one, α < 1, mimics a decreasing sighting rate over time, while α > 1 mimics an increasing sighting rate (Ho, 1991). The Weibull distribution can be derived theoretically as a form of Extreme Value Distribution, and has thus been used in the literature to model the k most recent sighting times of a species, ordered from the most recent sighting time to the least recent sighting time (i.e. T1 > T2 > ... > Tk) (Smith and Weissman, 1985; Hall et al., 1999; Roberts and Solow, 2003; Solow, 2005).
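The effect of the shape parameter on the Weibull-form rate can be checked numerically. The sketch below is illustrative only: the function names `weibull_rate` and `trend` and the parameter values used in the example are assumptions, not quantities estimated in this chapter.

```python
def weibull_rate(t, alpha, sigma):
    """Weibull hazard-form sighting rate (alpha / sigma) * (t / sigma) ** (alpha - 1)."""
    return (alpha / sigma) * (t / sigma) ** (alpha - 1.0)

def trend(alpha):
    """Direction of the sighting rate implied by the shape parameter."""
    if alpha < 1.0:
        return "decreasing"
    if alpha > 1.0:
        return "increasing"
    return "constant"

# Hypothetical values: with alpha = 0.7 the rate at t = 20 is below the rate at t = 10.
early = weibull_rate(10.0, alpha=0.7, sigma=5.0)
late = weibull_rate(20.0, alpha=0.7, sigma=5.0)
```

With α = 1 the rate reduces to the constant 1/σ, which is the homogeneous Poisson case.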

The mean certain and uncertain sighting rates, which change with time, can be obtained using Equation 5.1 as:

$$m_c(t) = (t/\sigma_c)^{\alpha_c}, \quad t \le \tau_E. \tag{5.5}$$

$$
m_u(t) =
\begin{cases}
m_{u1}(t) = (t/\sigma_{u1})^{\alpha_{u1}}, & t \le \tau_E \\
m_{u1}(\tau_E) + m_{u2}(t) - m_{u2}(\tau_E) = (\tau_E/\sigma_{u1})^{\alpha_{u1}} + (t/\sigma_{u2})^{\alpha_{u2}} - (\tau_E/\sigma_{u2})^{\alpha_{u2}}, & t > \tau_E.
\end{cases}
\tag{5.6}
$$
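The piecewise form of Equation 5.6 can be sketched in Python as follows. The function names and the parameter values are hypothetical and serve only to show that the mean function is continuous at the change-point and that its increment after τE depends only on the post-extinction parameters.

```python
def mean_certain(t, alpha_c, sigma_c):
    """m_c(t) = (t / sigma_c) ** alpha_c, for t <= tau_E (Eq. 5.5)."""
    return (t / sigma_c) ** alpha_c

def mean_uncertain(t, tau_e, alpha_u1, sigma_u1, alpha_u2, sigma_u2):
    """m_u(t) with a change-point at tau_e (Eq. 5.6)."""
    if t <= tau_e:
        return (t / sigma_u1) ** alpha_u1
    before = (tau_e / sigma_u1) ** alpha_u1
    after = (t / sigma_u2) ** alpha_u2 - (tau_e / sigma_u2) ** alpha_u2
    return before + after

# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(tau_e=30.0, alpha_u1=1.2, sigma_u1=8.0, alpha_u2=0.8, sigma_u2=12.0)
m_at_change = mean_uncertain(30.0, **params)
m_at_end = mean_uncertain(50.0, **params)
post_extinction_mean = m_at_end - m_at_change  # expected sightings in (30, 50]
```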

The mean uncertain sighting rate (mu(t)) has a change-point at the extinction time, τE. We denote mu1(t) to be the mean uncertain sighting rate before extinction, and mu2(t) to be the mean uncertain sighting rate after extinction. θ = (αc, αu1, αu2, σc, σu1, σu2, τE) is the vector of parameters of the model, where α∗ and σ∗ refer to the shape and scale parameters of the Weibull distribution. Here, ‘∗’ refers to either the certain sightings (c), the uncertain sightings before extinction (u1) or the uncertain sightings after extinction (u2). In this work, we assume that these parameters are random variables that need to be estimated.

In what follows, we will be interested in determining α∗ in Equations 5.5 and 5.6, as it reflects whether the sighting rate is increasing, decreasing or constant. Based on the value of α∗, the sighting rate λ∗(t) can be classified as follows using the rate functions given in Equations 5.3 and 5.4.

$$
\lambda_*(t) =
\begin{cases}
\text{decreasing}, & \text{if } \alpha_* < 1 \\
\text{constant}, & \text{if } \alpha_* = 1 \\
\text{increasing}, & \text{if } \alpha_* > 1
\end{cases}
\tag{5.7}
$$

Now we discuss the development of the likelihood and show how the likelihood and prior specification are used to obtain the posterior distribution of θ, i.e., the vector containing all parameters including the extinction time τE. Let T > 0, Kc > 0 and Ku > 0 be fixed integers. Assume that there are Kc certain sightings in the time interval [0, τE) and Ku uncertain sightings in the time interval [0, T). Then, let Ku(τE) be the number of uncertain sightings prior to the extinction time τE. $D_c = \{y_{c_1}, y_{c_2}, \ldots, y_{c_{K_c}}\}$ and $D_u = \{y_{u_1}, y_{u_2}, \ldots, y_{u_{K_u}}\}$ denote the two types of observed sightings, certain and uncertain, while $y_i$ indicates the time of the i-th sighting. Then, using the likelihood function and the prior distributions, the posterior distribution of the parameters of interest can be expressed as:

$$P(\theta \mid D_c, D_u) \propto L(D_c, D_u \mid \theta)\, P(\theta), \tag{5.8}$$

where P (θ|Dc,Du) is the posterior distribution of θ given the data Dc,Du; P (θ) represents all the prior distributions for model parameters; and L(Dc,Du|θ) is the likelihood function of the model.

To build the full likelihood, we use the result that the likelihood of sighting times D = {y1, y2, ..., yK} arising from a non-homogeneous Poisson process with rate λ(t) over the period [0, T) is $\big[\prod_{i=1}^{K} \lambda(y_i \mid \theta)\big] \exp[-m(T \mid \theta)]$. Thus, the full likelihood with the presence of a change-point takes the following form (see, for instance, Achcar et al. (2010); Guarnaccia et al. (2015)):

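$$
\begin{aligned}
L(D_c, D_u \mid \theta) &= L(D_c \mid \theta) \times L(D_u \mid \theta) \\
&= \Big[\prod_{i=1}^{K_c} \lambda_c(y_{c_i} \mid \theta)\Big] \exp[-m_c(\tau_E \mid \theta)] \times \Big[\prod_{j=1}^{K_u(\tau_E)} \lambda_{u1}(y_{u_j} \mid \theta)\Big] \exp[-m_{u1}(\tau_E \mid \theta)] \\
&\quad \times \Big[\prod_{j=K_u(\tau_E)+1}^{K_u} \lambda_{u2}(y_{u_j} \mid \theta)\Big] \exp[-(m_{u2}(T \mid \theta) - m_{u2}(\tau_E \mid \theta))] \\
&\propto \Big(\frac{\alpha_c}{\sigma_c^{\alpha_c}}\Big)^{K_c} \Big[\prod_{i=1}^{K_c} y_{c_i}^{\alpha_c - 1}\Big] \exp[-(\tau_E/\sigma_c)^{\alpha_c}] \\
&\quad \times \Big(\frac{\alpha_{u1}}{\sigma_{u1}^{\alpha_{u1}}}\Big)^{K_u(\tau_E)} \Big[\prod_{j=1}^{K_u(\tau_E)} y_{u_j}^{\alpha_{u1} - 1}\Big] \exp[-(\tau_E/\sigma_{u1})^{\alpha_{u1}}] \\
&\quad \times \Big(\frac{\alpha_{u2}}{\sigma_{u2}^{\alpha_{u2}}}\Big)^{K_u - K_u(\tau_E)} \Big[\prod_{j=K_u(\tau_E)+1}^{K_u} y_{u_j}^{\alpha_{u2} - 1}\Big] \exp[-((T/\sigma_{u2})^{\alpha_{u2}} - (\tau_E/\sigma_{u2})^{\alpha_{u2}})]
\end{aligned}
\tag{5.9}
$$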

where Ku(τE) is the number of uncertain sightings prior to extinction time (τE).
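For numerical work it is usually more convenient to evaluate the logarithm of Equation 5.9. The following Python sketch is one possible way to do so and is not the JAGS implementation used in this study. The function and argument names are hypothetical, and the sketch assumes strictly positive sighting times, 0 < τE ≤ T, and that uncertain sightings at or before τE belong to the pre-extinction process.

```python
import numpy as np

def log_rate(y, alpha, sigma):
    """Log of the Weibull-form rate (alpha / sigma) * (y / sigma) ** (alpha - 1)."""
    y = np.asarray(y, dtype=float)
    return np.log(alpha) - alpha * np.log(sigma) + (alpha - 1.0) * np.log(y)

def log_likelihood(y_c, y_u, T, tau_e, a_c, s_c, a_u1, s_u1, a_u2, s_u2):
    """Log of Eq. 5.9 under the assumptions stated in the lead-in (0 < tau_e <= T)."""
    y_c = np.asarray(y_c, dtype=float)
    y_u = np.asarray(y_u, dtype=float)
    if not 0.0 < tau_e <= T:
        raise ValueError("this sketch only covers 0 < tau_e <= T")
    if np.any(y_c > tau_e):
        return -np.inf  # a certain sighting cannot follow extinction
    before = y_u[y_u <= tau_e]  # K_u(tau_E) uncertain sightings
    after = y_u[y_u > tau_e]    # K_u - K_u(tau_E) uncertain sightings
    ll = np.sum(log_rate(y_c, a_c, s_c)) - (tau_e / s_c) ** a_c
    ll += np.sum(log_rate(before, a_u1, s_u1)) - (tau_e / s_u1) ** a_u1
    ll += np.sum(log_rate(after, a_u2, s_u2))
    ll -= (T / s_u2) ** a_u2 - (tau_e / s_u2) ** a_u2
    return float(ll)

# Toy record (hypothetical): three certain and three uncertain sighting times.
ll_toy = log_likelihood([1, 2, 3], [2, 5, 8], T=10.0, tau_e=4.0,
                        a_c=1.0, s_c=2.0, a_u1=1.0, s_u1=4.0, a_u2=1.0, s_u2=5.0)
```

Setting all shape parameters to one, as in the toy call, gives the homogeneous Poisson log-likelihood, which provides a simple check of the implementation.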

If one needs to model a sighting record with a homogeneous Poisson process, then it can be modelled by allowing αc = αu1 = αu2 = 1 in Equation 5.9. In addition, if all sightings are certain, and assuming that only certain sightings are possible, the likelihood in Equation 5.9 can be significantly simplified into Equation 5.10 by setting L(Du|θ) = 1. However, this is different to having no uncertain sightings while assuming uncertain sightings are possible to observe.

$$
\begin{aligned}
L(D_c \mid \theta) &= \Big[\prod_{i=1}^{K_c} \lambda_c(y_{c_i} \mid \theta)\Big] \exp[-m_c(\tau_E \mid \theta)] \\
&\propto \Big(\frac{\alpha_c}{\sigma_c^{\alpha_c}}\Big)^{K_c} \Big[\prod_{i=1}^{K_c} y_{c_i}^{\alpha_c - 1}\Big] \exp[-(\tau_E/\sigma_c)^{\alpha_c}]
\end{aligned}
\tag{5.10}
$$
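As a worked special case, offered here only as an illustration, setting αc = 1 in Equation 5.10 gives the homogeneous Poisson likelihood with constant rate 1/σc. The maximising value of σc for fixed τE shown below is a standard calculus result and is not an estimate reported in this chapter.

```latex
% Certain-only likelihood (Eq. 5.10) with \alpha_c = 1:
L(D_c \mid \sigma_c, \tau_E) \propto \sigma_c^{-K_c} \exp[-\tau_E/\sigma_c],
\qquad \tau_E \ge y_{c_{K_c}} .

% For fixed \tau_E, differentiate the log-likelihood with respect to \sigma_c:
\frac{\partial}{\partial \sigma_c}\Big[-K_c \log \sigma_c - \frac{\tau_E}{\sigma_c}\Big]
  = -\frac{K_c}{\sigma_c} + \frac{\tau_E}{\sigma_c^{2}} = 0
\;\Longrightarrow\; \sigma_c = \frac{\tau_E}{K_c}.
```

For fixed σc the factor exp[−τE/σc] decreases in τE, so in this special case the likelihood favours extinction times just after the last certain sighting.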

Accordingly, the posterior distributions can be obtained using Equation 5.8 along with the same prior distributions defined earlier for αc, σc and τE. A similar modification can be made for a model that does not assume a change-point for uncertain sightings by fitting a single non-homogeneous Poisson process for the uncertain sightings (i.e. λu2 = λu1). In such a situation, the uncertain sightings are independent of the extinction time. Thus, inclusion of uncertain sightings does not provide additional information about extinction. However, for species sighting data, the change-point assumption is important since it is possible to have both valid and invalid uncertain sightings prior to extinction, but only invalid ones afterwards.

For this study the prior distributions for the shape (αc, αu1 and αu2) and scale (σc, σu1 and σu2) parameters are chosen to be non-informative Uniform distributions, i.e. Unif(0, 1000), giving the sighting rates a vague behaviour. However, if there is prior knowledge about the sighting rates, the prior distributions can be modified accordingly. Following previous Bayesian approaches in the literature (Solow et al., 2012; Solow and Beet, 2014), the prior distribution for τE is chosen to be an exponential distribution representing an increasing probability of extinction after the last certain sighting.

p(τE|γ) = γ exp{−γτE} (5.11)

In order to use a weakly informative prior for τE, the rate parameter γ in Equation 5.11 is chosen as 0.005 in the exponential distribution. This prior specification reflects an expected extinction time of 200 years with a variance of 40,000 years squared. Along with these priors and the likelihood function defined in Equation 5.9, we obtain posterior distributions for the model parameters including, most importantly, τE, via the Markov Chain Monte Carlo (MCMC) algorithm using Equation 5.8. In addition to the extinction time τE, the posterior distribution of the shape parameter α is of particular interest, as it reflects whether the sightings were increasing, decreasing or constant.
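The prior mean and variance quoted above follow from the standard moments of the exponential distribution, 1/γ and 1/γ². A short Python sketch, with the hypothetical function name `log_prior_tau`, makes this explicit and gives the log-density of Equation 5.11.

```python
import math

GAMMA = 0.005  # rate parameter of the exponential prior stated in the text

def log_prior_tau(tau_e, gamma=GAMMA):
    """Log of the exponential prior density gamma * exp(-gamma * tau_e) (Eq. 5.11)."""
    if tau_e < 0:
        return -math.inf  # the exponential prior has no mass below zero
    return math.log(gamma) - gamma * tau_e

prior_mean = 1.0 / GAMMA           # 200 time units
prior_variance = 1.0 / GAMMA ** 2  # 40,000 squared time units
```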

In the MCMC implementation, we generated 4 chains, each with 10,000 thinned iterations, for the black-footed ferret and for the Ivory-billed woodpecker. Compared to the Ivory-billed woodpecker, the black-footed ferret sighting record resulted in highly auto-correlated MCMC chains. Thus, thinning values of 360 and 13 were used to reduce the auto-correlation in the chains for the black-footed ferret and the Ivory-billed woodpecker, respectively. These numbers were obtained through a trial-and-error process aimed at obtaining less auto-correlated chains. MCMC diagnostic checks were carried out with respect to convergence, auto-correlation and effective sample size, and no indication of any problem was observed for any parameter (e.g. τE, αc, σc). Detailed descriptions of the diagnostics are not discussed here, as they are outside the scope of this paper. However, such details can be found in the supplementary materials of our latest paper (Kodikara et al., 2020).
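The effect of thinning on auto-correlation can be illustrated with a toy example. The sketch below uses a simulated AR(1) series as a stand-in for a highly auto-correlated MCMC chain; the series, its coefficient of 0.9, the thinning step of 20 and the function names are all assumptions made for illustration and are unrelated to the chains generated in this study.

```python
import numpy as np

def lag1_autocorr(chain):
    """Sample lag-1 auto-correlation of a chain."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def thin(chain, step):
    """Keep every `step`-th draw of the chain."""
    return np.asarray(chain)[::step]

# Toy AR(1) series with coefficient 0.9, standing in for a sticky MCMC chain.
rng = np.random.default_rng(0)
noise = rng.normal(size=20000)
chain = np.empty(20000)
chain[0] = noise[0]
for i in range(1, chain.size):
    chain[i] = 0.9 * chain[i - 1] + noise[i]

raw_acf = lag1_autocorr(chain)
thinned_acf = lag1_autocorr(thin(chain, 20))
```

Thinning lowers the auto-correlation between retained draws at the cost of discarding most of the raw iterations, which is why the thinning value is typically tuned by trial and error.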

5.3 Results

In this section the model outlined above is first applied to a simple example with only certain sightings using the data collected for the black-footed ferret. This is then followed by the example of the Ivory-billed woodpecker (IBW) where both certain and uncertain sightings are included in the modeling approach.

5.3.1 Black-footed ferret

The black-footed ferret (Mustela nigripes), found in the State of Wyoming, USA, was once thought to be extinct, but it was successfully reintroduced into the wild after captive propagation (Dobson and Lyles, 2000; Wisely et al., 2008). Before this reintroduction, Solow (1993b) used the sightings of the black-footed ferret to illustrate a method for testing extinction in declining populations. The sighting record consists of 28 certain sightings over the period January 1972 to December 1990 (Solow, 1993b). Even though the black-footed ferret was reintroduced into the wild, the sightings published in Solow (1993b) are useful for studying extinction in a declining population (Jarić and Ebenhard, 2010).

The posterior distribution of the extinction time τE is plotted in Figure 5.2a and the 95% Highest Density Interval (HDI) for the posterior extinction year is given in Table 5.1. In these calculations, as in Solow (1993b), the time unit was taken to be a month. According to Table 5.1, the median extinction date is March 1985 for the black-footed ferret. The 95% HDI upper bound for τE was found to be March 1987, and hence we could infer that the species is highly likely to be extinct by the end of the sighting period (i.e. December 1990). This inference agrees with the finding of Solow (1993b), whose method provides moderately strong evidence against the existence of the black-footed ferret (i.e., until reintroduction from 1991 to 2009). In addition, the posterior distribution for parameter αc has a 95% HDI between 0.45 and 1, indicating that the certain sightings have a high tendency to decline from the beginning of the observation period, as suspected by Solow (1993b).

Figure 5.2: Posterior distribution plots for the model parameters τE and αc for the black-footed ferret. The black solid line above the x-axis shows the 95% HDI for the posterior distribution. (a) Posterior distribution of τE. (b) Posterior distribution of αc.

Table 5.1: Summary of the posterior distributions of τE and αc for the black-footed ferret

          95% HDI Low    Median    95% HDI High
τE|S      1985           1985      1987
αc|S      0.45           0.71      1.01
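The interval bounds reported above are summaries of posterior draws. As a minimal sketch of how a 95% HDI can be obtained from such draws, the function below returns the narrowest window that contains the requested share of the sorted samples. The function name `hdi` and the gamma-distributed draws are illustrative assumptions and are not the thesis output; the thesis does not state which software routine was used.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples.

    This is a single-interval summary, so for a bimodal posterior it
    spans both modes rather than returning two disjoint intervals.
    """
    x = sorted(samples)
    n = len(x)
    k = math.ceil(mass * n)  # number of draws the interval must contain
    if not 1 <= k <= n:
        raise ValueError("mass must lie in (0, 1]")
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: x[i + k - 1] - x[i])
    return x[start], x[start + k - 1]


# Hypothetical posterior draws, for illustration only.
random.seed(0)
draws = [random.gammavariate(2.0, 1.0) for _ in range(20000)]
low, high = hdi(draws)
```

For a skewed posterior the HDI is generally shorter than the equal-tailed interval, which is one reason to report it for parameters such as τE.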

5.3.2 Ivory-billed woodpecker

The Ivory-billed Woodpecker (IBW) was the third largest woodpecker in the world. It is believed that the IBW went extinct in the middle of the twentieth century. To illustrate the use of our model when there are both certain and uncertain sightings, we analyzed the sighting record of the IBW given in Elphick et al. (2010). The same sighting data were used in Solow et al. (2012) and Solow and Beet (2014) to make inferences about the extinction of the IBW. However, these approaches assumed that the certain sightings, valid uncertain sightings and invalid uncertain sightings follow independent stationary Poisson processes with constant rates. We relax this assumption and suppose that the sightings could evolve according to a non-homogeneous Poisson process. As in Solow and Beet (2014), we assume that all sightings based on physical evidence are certain, and all sightings that are not based on physical evidence are uncertain.

Table 5.2: Summary of the posterior distribution of τE for IBW

          95% HDI Low    Median    95% HDI High
τE|S      1939           1950      1956

Figure 5.3: Posterior distribution plot of τE for the IBW. Black solid line above x-axis shows the 95% HDI for the posterior distribution.

The posterior distribution of τE is plotted in Figure 5.3 and the 95% HDI for the posterior extinction year is given in Table 5.2. According to Table 5.2, the median extinction year is 1950, with the upper bound of the 95% HDI at 1956. Also, the posterior distribution of τE is bimodal (see Figure 5.3), with modes located at approximately 1944 and 1952. In the following paragraphs we explain the reason for this bimodal result.

In the model, extinction time was formulated as a termination point for certain sightings and a change-point for uncertain sightings (see Figure 5.1). Hence, the inference about extinction time is strongly influenced by the last certain sighting and by any rate change in the uncertain sightings. For the IBW, the first mode (1944) is associated with the last certain sighting, while the second mode (1952) arises from the change-point of the uncertain sightings. According to Figure 5.4, the first mode in 1944 is closer to the last certain sighting in 1939. The sudden ending of certain sightings in that year indicates its possible influence on the extinction time. On the other hand, the second mode in 1952 is another possible change-point, as the rate of uncertain sightings increased until 1952 and then changed its behaviour afterwards. Thus, 1952 becomes another possible candidate for the extinction time.

Figure 5.4: Graphical representation of IBW certain sightings and uncertain sightings. Green represents the years where there are certain sightings while red represents the years of uncertain sightings.

Figure 5.5: Posterior distribution plot of τE for the IBW with change-point and homogeneous rate assumptions. Black solid line above x-axis shows the 95% HDI for τE.

In order to further investigate the bimodal behaviour, the IBW sighting record was modelled using a homogeneous Poisson process for both certain and uncertain sightings. This model is obtained by setting αc = αu1 = αu2 = 1 in Equation 5.9. As seen in Figure 5.5, when certain and uncertain sightings are each assumed to have a constant rate, while still allowing for a change-point in the uncertain sightings, the posterior distribution of τE is unimodal, with its mode close to the first mode in Figure 5.3. This result indicates that the bimodal distribution in Figure 5.3 is a consequence of allowing a heterogeneous rate for the uncertain sightings.

Figure 5.6: Posterior distribution plots for the model parameters αc, αu1 and αu2 for IBW. The black solid line above the x-axis shows the 95% HDI for the posterior distribution. According to the 95% HDI, αc < 1, αu1 > 1 and αu2 = 1. (a) Posterior distribution of αc for certain sightings. (b) Posterior distribution of αu1 for uncertain sightings before extinction. (c) Posterior distribution of αu2 for uncertain sightings after extinction.

Other model parameters of interest are the shape parameters αc, αu1 and αu2, which reflect if and how the sighting rates change over time. Figure 5.6a shows that the 95% HDI for αc is (0.37, 0.94). Thus, the null hypothesis that αc ≥ 1 can be rejected, or in other words we can be 95% confident that αc is less than 1. This implies that the certain sighting rate declined over the pre-extinction period for the IBW. Similar inferences can be made about αu1 and αu2. The 95% HDIs for αu1 and αu2 are (1.35, 3.46) and (0.31, 1.92), respectively (see Figure 5.6b and Figure 5.6c). Since the lower bound for αu1 is greater than 1, the uncertain sighting rate before extinction (i.e., before τE) increased over time. In contrast, the 95% HDI for αu2 is centered around 1, which is consistent with a constant (invalid) sighting rate after extinction for the IBW.

Using Figure 5.7, let us now examine the impact of model assumptions on the cumulative posterior extinction probability.

Figure 5.7: Time series plot of the posterior extinction probability under different assumptions for the IBW.

The cumulative posterior extinction probability evaluated at a given year (say ti) is the area under the posterior distribution of τE from zero to ti in Figure 5.3 (for the case with uncertain sightings). Figure 5.7 allows us to discuss the importance of the model assumptions, as well as of the uncertain sightings, for the posterior extinction probability. For instance, suppose that ecologists are interested in inferring whether the IBW was extinct by a given year, say 1948. In such a situation, if one uses only certain sightings with a constant sighting rate assumption, then the posterior extinction probability for the IBW by 1948 is 0.97 (i.e., P(τE ≤ 1948 | Dc, αc = 1) = 0.97). But if the uncertain sightings are included with a constant rate assumption, then the probability is reduced to 0.75 (i.e., P(τE ≤ 1948 | Dc, Du, αc = 1, αu1 = 1, αu2 = 1) = 0.75). In the latter case, the analysis indicates that the inclusion of uncertain sightings still favours extinction, but with a lower probability than before. The same probability is further reduced to 0.38 when the constant rate assumption is relaxed by allowing the process to be non-homogeneous (i.e., P(τE ≤ 1948 | Dc, Du) = 0.38). Under this scenario the extinction of the IBW by 1948 is questionable. This simple example demonstrates how model assumptions produce different posterior extinction probabilities. It is recommended that a similar comparison be carried out when this model is applied to a different data set.
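When the posterior of τE is available as a set of draws, the cumulative posterior extinction probability at a given year is simply the share of draws at or before that year. The sketch below illustrates this under the assumption that the draws are stored in a plain list; the function name `extinction_probability` and the eight draws are hypothetical and are not the IBW results.

```python
from bisect import bisect_right


def extinction_probability(tau_draws, year):
    """Monte Carlo estimate of P(tau_E <= year | data) from posterior draws."""
    ordered = sorted(tau_draws)
    # bisect_right counts how many draws are less than or equal to `year`.
    return bisect_right(ordered, year) / len(ordered)


# Hypothetical posterior draws of the extinction year, for illustration only.
tau_draws = [1940.2, 1943.8, 1944.5, 1947.9, 1951.3, 1952.0, 1952.6, 1955.1]

# Four of the eight draws fall at or before 1948.
p_1948 = extinction_probability(tau_draws, 1948)

# Evaluating the same estimate on a grid of years traces out a curve of
# the kind plotted as a time series of posterior extinction probability.
curve = {year: extinction_probability(tau_draws, year)
         for year in range(1940, 1957, 4)}
```

Repeating the calculation with draws obtained under each set of model assumptions gives one curve per assumption, which is the comparison recommended in the text.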

5.4 Discussion

In this paper, the work of Solow and Beet (2014) was extended for predicting extinction when the sightings, both certain and uncertain, follow non-homogeneous Poisson processes. It was assumed that certain and uncertain sightings evolve according to two independent non-homogeneous Poisson processes over (0, τE), where τE denotes the extinction time. Over the period (τE, T), the uncertain sightings follow a different non-homogeneous Poisson process. The new approach can be used to test whether the sighting rates for certain and uncertain sightings were decreasing, increasing or constant over the studied time period.

The model was applied to two real-world case studies: one with certain sightings only (the black-footed ferret), and one where both certain and uncertain sightings appear in the sighting record (the IBW). The null hypothesis that the black-footed ferret and the IBW were extant at the end of their sighting periods (i.e., 1990 and 2010, respectively) was rejected. Furthermore, our statistical analysis suggests that the black-footed ferret and the IBW went extinct in the 1980s and 1950s, respectively. In addition to these conclusions, the posterior distributions of αc suggested that both species had a declining certain sighting rate before extinction, possibly due to a decline in the population as the species approached extinction. The uncertain sighting rate, however, increased before extinction for the IBW, which probably reflects the media and ecological attention received (U.S. Fish and Wildlife Service, 2010).

The main advantage of the method described in this paper is that it makes no assumptions about the behaviour of the sighting rates other than smoothness. Through Figure 5.7, we demonstrate the impact of such assumptions on the posterior extinction probability using the IBW sighting data. For example, if one assumes a constant sighting rate for both certain and uncertain sightings, then the cumulative posterior extinction probability is overestimated, as shown in Figure 5.7. In contrast, if the true certain sighting rate was increasing before extinction, then assuming a constant rate would underestimate the true extinction probability. Hence the inferences made under the constant rate assumption will be inaccurate if the true underlying rate is heterogeneous.

Chapter 6

Bayesian updating to estimate extinction from sequential observation data

This chapter has been published, and has the following citation:

Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2019). Bayesian updating to estimate extinction from sequential observation data. Biological Conservation, 229:26–29

Abstract

Several new approaches to estimating the probability that a species is extinct have emerged recently. Different foundational assumptions can lead to different interpretations of data and potentially to different conclusions. To explore the implications of alternative formulations, here we develop and illustrate a Bayesian Updating method for inferring extinction based on records of observations and surveys. We illustrate how it combines incidental sightings and surveys with a data set for the Alaotra Grebe, showing how estimates of extinction may be updated as new data arise, providing a means for managers to reassess priorities for survey and management dynamically.

6.1 Introduction

Extinctions in ecology are important for many reasons, not least because they represent the ultimate expression of many human impacts on the natural world. However, it is often difficult to know when the last member of a species has died. Instead, extinction is inferred from absences of incidental sightings and from dedicated surveys which fail to detect the target species. The extinction or otherwise of species can be controversial (see Brook et al. (2018) and Carlson et al. (2018b) for discussion of the status of the thylacine, and Collins (2017) for a discussion of the kinds of data available to support inferences about the status of the Ivory-billed Woodpecker). Incorrect inferences may lead to actions to protect species that are already extinct, or to a failure to act when it may have been effective to do so (Akçakaya et al., 2017).

The problem of inferring possible extinction of a species from sighting records (data) has been much studied and debated (Lee et al., 2017a; Solow and Beet, 2014; Thompson et al., 2013b), beginning with the seminal work of Solow (1993a,b, 2005). Many models have been proposed that use different assumptions and data, and which may give incompatible and even contradictory (i.e. inconsistent) estimates for probabilities of extinction (Thompson et al., 2017). For example, the model developed by Solow and Beet (2014) (see Bond et al. (2019) for an application) does not use information on detection probability and survey effort. Thompson et al. (2017) proposed a non-Bayesian linear model (LM) for the extinction problem to account for such data when the data are sequential in time. In this approach, the probability of extinction in year t, P(Et), is updated to determine P(Et+1) as new sighting data come to hand year by year. In their LM model there are no Bayes Factors, rules of thumb or priors.

In the present article we propose an alternative Bayesian approach for circumstances in which sighting data are sequential in time. Known as Bayesian Updating (O’Hagan, 1995), probabilities of extinction P(Et) are updated year-by-year as new data come to hand, by taking the (Bayesian) “posterior” in year t to be the “prior” in year t+1. This results in a non-linear iterative model (Bayesian Updating model; BU) with a cumulative Bayes Factor which is also updated year-by-year.

In the following section we present a detailed description of the BU model, including an exact expression for the probability P(Xt) that the species is extant in year t in terms of an “initial” P(X1) and cumulative Bayes Factor Bt. The choice of P(X1) is discussed and explicit expressions are given for yearly Bayes Factors in terms of parameters for recordings and unsuccessful surveys (described in detail below). In Section 6.3 we present a methodology for dealing self-consistently with rules of thumb and probability thresholds for the BU model.

As a case study in Section 6.4, we consider the Alaotra Grebe (Tachybaptus rufolavatus), which was the subject of a detailed analysis using the Linear Model (LM) in Thompson et al. (2017). Our results are summarized and discussed in the final section.

6.2 Bayesian updating

Consider a period of T consecutive years t = 1, 2, ..., T of sequential data s1, s2, ..., sT. To be specific we consider a species which is either extant (x) or extinct (e) at the beginning of any year in the record period (1, T) with probabilities P(x) and P(e) = 1 − P(x). The sequential data st in this case represent sighting states (st in year t) which could take many forms including “recordings (r)” (photographs, sounds, specimens, ...), or unsuccessful surveys (u or u′) covering some fraction of the species’ habitat. Thus, ‘r’ represents a successful sighting / survey. If there are no sightings during a dedicated survey, then it is classified as an active unsuccessful survey (u). In the absence of a dedicated survey, it is assumed that there may still be unplanned surveys by interested professionals or amateur ecologists. These kinds of unsuccessful surveys are referred to as passive unsuccessful surveys (u′). Thus, r represents a year where a sighting occurred, u represents a year where a sighting did not occur, but a survey did, and u′ represents no sightings and no survey.

In any given year, Bayes Rule states that with sighting state ‘s’ in that year, the posterior (conditional probability) P(e|s) is given by

\[ P(e \mid s) = \frac{P(s \mid e)\,P(e)}{P(s \mid e)\,P(e) + P(s \mid x)\,P(x)}. \tag{6.1} \]

There is of course a similar equation for P(x|s), in terms of the inverse conditional probabilities P(s|e) and P(s|x) and the priors P(e) and P(x). The ratio of these two equations gives

\[ \frac{P(e \mid s)}{P(x \mid s)} = b(s)\,\frac{P(e)}{P(x)}, \tag{6.2} \]

where

\[ b(s) = \frac{P(s \mid e)}{P(s \mid x)} \tag{6.3} \]

is the Bayes Factor for that year.
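The relationship between Equations 6.1 to 6.3 can be checked numerically. The sketch below computes the posterior P(e|s) directly from Bayes Rule and confirms that the posterior odds equal the Bayes Factor times the prior odds. The function names and the likelihood values are hypothetical and chosen only for illustration.

```python
def posterior_extinct(prior_extinct, p_s_given_e, p_s_given_x):
    """Posterior P(e|s) from Bayes Rule (Equation 6.1)."""
    prior_extant = 1.0 - prior_extinct
    numerator = p_s_given_e * prior_extinct
    return numerator / (numerator + p_s_given_x * prior_extant)


def bayes_factor(p_s_given_e, p_s_given_x):
    """Yearly Bayes Factor b(s) = P(s|e) / P(s|x) (Equation 6.3)."""
    return p_s_given_e / p_s_given_x


# Hypothetical inputs: an even prior and a sighting state that is more
# likely if the species is extinct than if it is extant.
prior_e = 0.5
p_s_e, p_s_x = 1.0, 0.4

post_e = posterior_extinct(prior_e, p_s_e, p_s_x)
posterior_odds = post_e / (1.0 - post_e)
prior_odds = prior_e / (1.0 - prior_e)
b = bayes_factor(p_s_e, p_s_x)

# Equation 6.2: posterior odds = b(s) * prior odds.
odds_gap = abs(posterior_odds - b * prior_odds)
```

Working with odds rather than probabilities is what makes the year-by-year update multiplicative.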

Note that we only have one sighting record in the period (1, T) specified by the given sequence st of sighting states. Uncertainty in this case is embodied in the unknown and uncertain values of the priors and the Bayes Factors of Equation 6.3, the latter depending on the st independently of the prior(s).

Proceeding year by year we propose a simple updating methodology to determine the probability P(Xt+1) that the species is extant in year t + 1 from P(Xt) and the Bayes Factor in year t. Specifically, in Equation 6.2 we take the priors in year t to be

\[ P(x) = P(X_t) \quad \text{and} \quad P(e) = P(E_t) = 1 - P(X_t), \tag{6.4} \]

with Bayes factor in year t given by, from Equation 6.3,

\[ b_t = b(s_t) = \frac{P(s_t \mid E_t)}{P(s_t \mid X_t)}. \tag{6.5} \]

We then update to year t + 1 by taking the posteriors in year t from Equation 6.2 to be the priors for year t + 1, i.e. from Equation 6.2, Equation 6.4 and Equation 6.5

\[ \frac{P(E_{t+1})}{P(X_{t+1})} = b_t\,\frac{P(E_t)}{P(X_t)}, \qquad t = 1, 2, 3, \ldots \tag{6.6} \]

We refer to the iterative rule Equation 6.6 as Bayesian updating. If we now substitute the second equation of Equation 6.4 into Equation 6.6 we obtain a linear iterative equation for [P(Xt)]^{-1} which can be readily solved and inverted to give

\[ P(X_{t+1}) = \left\{ 1 + B_t \left[ \frac{1}{P(X_1)} - 1 \right] \right\}^{-1}, \qquad t = 1, 2, 3, \ldots \tag{6.7} \]

where

\[ B_t = \prod_{j=1}^{t} b_j \tag{6.8} \]

may be interpreted as a cumulative Bayes Factor.
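As a check on the algebra, the year-by-year rule of Equation 6.6 and the closed form of Equation 6.7 should give the same extant probability. The following sketch compares the two. The function names, the sequence of yearly Bayes Factors and the initial P(X1) = 0.6 are hypothetical values chosen for illustration.

```python
import math


def update_extant(p_extant, b):
    """One step of Equation 6.6, returned as P(X_{t+1})."""
    odds_extinct = b * (1.0 - p_extant) / p_extant
    return 1.0 / (1.0 + odds_extinct)


def closed_form_extant(p_x1, yearly_b):
    """Equation 6.7, with the cumulative Bayes Factor of Equation 6.8."""
    cumulative_b = math.prod(yearly_b)
    return 1.0 / (1.0 + cumulative_b * (1.0 / p_x1 - 1.0))


# Hypothetical yearly Bayes Factors: values above 1 favour extinction,
# values below 1 (sighting years) favour the species being extant.
yearly_b = [2.0, 1.1, 0.3, 2.5]
p_x1 = 0.6

p_iterated = p_x1
for b in yearly_b:
    p_iterated = update_extant(p_iterated, b)

p_closed = closed_form_extant(p_x1, yearly_b)
```

The iterated form is convenient when data arrive one year at a time, whereas the closed form makes explicit that only the product of the yearly factors and the initial P(X1) matter.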

To implement the Bayes updating model (BU) of Equation 6.7, we need to assign values to the initial P(X1) and the yearly Bayes Factors bt in Equation 6.5.

6.2.1 Choosing an initial P (X1)

In the conventional Bayes Rule of Equation 6.1 a prior of unity (P(e) = 1, i.e. P(x) = 0) implies a posterior of unity (P(e|s) = 1). Similarly, in the BU analysis, an initial P(X1) = 1 from Equation 6.7 implies P(Xt) = 1 for all t = 1, 2, 3, .... However, P(X1) is uncertain and we specify a range of values for P(X1), assuming a uniform (or some other) distribution over this range. For example, we may replace P(X1) by its mid-point value over a specified range. If one assumes a value of P(X1) between 0.25 and 0.75 one could choose the average value of 0.5, giving equal weight to the species being extant or extinct. We will adopt such mid-point choices of BU model parameters, described below. This approach is discussed in greater detail in Thompson et al. (2013b).

6.2.2 Calculating the yearly Bayes factor bt

The following formulas use parameters that may be estimated from data or by ornithological specialists familiar with the species in question (see Thompson et al. (2017)). The two main sighting states are recordings (r) and unsuccessful surveys (u active or u′ passive).

a) For a recording year (r) there is one parameter: p(ci) = the probability that the recorded species is correctly identified. Then by definition in an r year p(r|e) = 1 − p(ci) is the probability that the species is incorrectly identified. In addition it is clear that p(r|x) = 1. Hence from Equation 6.3

\[ b(r) = 1 - p(\mathrm{ci}). \tag{6.9} \]

b) For an active unsuccessful survey year (u) there are two parameters. The first is p(ri) = the probability that the species could have been reliably identified and recorded (i.e. p(ri) = p(r) ∗ p(i), where p(i) is the probability that the species could have been reliably identified in the survey if it had been recorded and p(r) is the probability that the species would have been recorded in the survey). The second is ε, the proportion of the species habitat within its likely range that was surveyed (i.e. 0 < ε < 1). If the habitat is known to be inhomogeneous then ε can be modified accordingly as the calculations are done sequentially. For instance, if the range surveyed is equal to the full range of the species habitat then ε should be close to one. Then by definition in a u year p(u|x) = 1 − ε p(ri) is the probability that the survey was unsuccessful. In addition, it is also clear that p(u|e) = 1. Hence from Equation 6.3

\[ b(u) = \left(1 - \varepsilon\,p(\mathrm{ri})\right)^{-1}. \tag{6.10} \]

In passive unsuccessful survey years (u′), ε and p(ri) in Equation 6.10 are replaced by their primed counterparts, where ε′ is the default passive (unsuccessful) survey estimate for coverage and p′(ri) is the probability that the species could have been reliably identified and recorded in a passive (unsuccessful) survey year.
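The yearly Bayes Factors for the three sighting states can be sketched as two small functions. The sketch assumes that Equation 6.10 has the form (1 − ε p(ri))^{-1}, with ε the surveyed proportion of habitat; this reading reproduces the values b(u) = 2.837 and b(u′) = 1.0047 quoted in the case study. The function names and the value p(ci) = 0.9 are hypothetical.

```python
def bayes_factor_recording(p_ci):
    """b(r) = 1 - p(ci) for a recording year (Equation 6.9)."""
    return 1.0 - p_ci


def bayes_factor_unsuccessful(coverage, p_ri):
    """b(u) = 1 / (1 - coverage * p(ri)) for an unsuccessful survey year.

    `coverage` plays the role of the surveyed proportion of habitat.
    Passing the primed (passive) parameters gives b(u') instead.
    """
    detect = coverage * p_ri
    if not 0.0 <= detect < 1.0:
        raise ValueError("coverage * p_ri must lie in [0, 1)")
    return 1.0 / (1.0 - detect)


# Hypothetical recording year: a sighting pulls the factor below 1.
b_r = bayes_factor_recording(0.9)

# Midpoint values quoted in the Alaotra Grebe case study.
b_u = bayes_factor_unsuccessful(0.875, 0.74)          # active survey year
b_u_passive = bayes_factor_unsuccessful(0.025, 0.187)  # passive survey year
```

An unsuccessful year always gives a factor of at least 1, and the factor grows quickly as the product of coverage and p(ri) approaches one.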

To summarize, the Bayes Factor bt in year t can be expressed, from Equation 6.5, Equation 6.9 and Equation 6.10, as

\[ b_t = \begin{cases} 1 - p(\mathrm{ci}), & \text{when a sighting occurred during the year,} \\ \left(1 - \varepsilon\,p(\mathrm{ri})\right)^{-1}, & \text{when a sighting did not occur during a survey year,} \\ \left(1 - \varepsilon'\,p'(\mathrm{ri})\right)^{-1}, & \text{when a sighting did not occur during a non-survey year.} \end{cases} \tag{6.11} \]

6.3 Rules of thumb and probability thresholds

In our basic Equation 6.7, P(Xt+1) is sensitive to the choice of the initial P(X1), just as posteriors in conventional Bayesian analysis are sensitive to choices of priors. In the latter case, it is common practice to deal directly with Bayes Factors (B), which are independent of priors, and to specify rule-of-thumb values (R), arguing that a B exceeding R provides “substantial evidence for extinction”.

An equivalent situation applies to the BU model, where the cumulative Bayes Factor Bt given by Equation 6.8 is independent of the initial P(X1). The problem then is to specify values for R. Depending on applications, values ranging from 3 to 100 (or more) have been suggested (Jeffreys, 1998; Solow and Beet, 2014). Ideally of course one would like to estimate probabilities of extinction P(Et), e.g. for BU from Equation 6.7 and Equation 6.8, and to use probability threshold lower bounds on P(Et) (e.g. 0.95) to make decisions regarding extinction of species. In any event, one is left with the problem of choosing P(X1). As an example for BU, if one chooses P(X1) = 0.5 in Equation 6.7 one may need a value of R of at least 20 or so (rising to 60 if one chooses P(X1) = 0.75) to achieve a probability threshold value for P(Et) of 0.95. That is, with P(X1) = 0.5, one needs Bt > 20 in order for P(Xt+1) < 0.05.
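The link between the rule of thumb R and the probability threshold can be made explicit by rearranging Equation 6.7. The short derivation below is a sketch; the symbol p* for the probability threshold is introduced here for illustration and does not appear in the original text.

```latex
% Requiring P(E_{t+1}) >= p^* is the same as P(X_{t+1}) <= 1 - p^*.
% Substituting Equation 6.7 and rearranging gives
\[
  P(E_{t+1}) \ge p^{*}
  \iff
  B_t \left[ \frac{1}{P(X_1)} - 1 \right] \ge \frac{p^{*}}{1 - p^{*}}
  \iff
  B_t \ge \frac{p^{*}}{1 - p^{*}} \cdot \frac{P(X_1)}{1 - P(X_1)} .
\]
% With p^* = 0.95 the first ratio equals 19, so that
\[
  P(X_1) = 0.5 \;\Rightarrow\; B_t \ge 19,
  \qquad
  P(X_1) = 0.75 \;\Rightarrow\; B_t \ge 57 .
\]
```

These bounds agree with the approximate values of 20 and 60 quoted above, and they show that the required R scales with the prior odds that the species is extant.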

A novel feature of the BU model is that the Bayes Factor Bt of Equation 6.8 is cumulative and, from Equation 6.11, decreases in sighting (r) years and increases in no-sighting (u and u′) years. In particular, if we set t = 0 to the last “r” year in a sighting record, Bt is an increasing function of t. One way to proceed is then to calculate Bt for successive t = 1, 2, ... until one reaches t = T where BT first exceeds a specified rule of thumb R (e.g. 20, which provides strong evidence for the alternative hypothesis (Lee and Wagenmakers, 2014; Schönbrodt and Wagenmakers, 2018)), at which point the probability of extinction P(ET) exceeds a given probability threshold (e.g. 0.95).

In the following section we consider Alaotra Grebe as a case study for our BU model, taking us back to our previous analysis of Grebe data using an alternative linear model in Thompson et al.(2017).

6.4 Case study

The water bird, the Alaotra Grebe (Tachybaptus rufolavatus), was once endemic to Lake Alaotra and surrounding lakes in Madagascar. It was last observed in 1988 and, as a result of several large-scale unsuccessful surveys (u) over the period 1989-1997, it was inferred to be extinct by the mid-1990s by several research groups (Keith et al., 2017; Thompson et al., 2017).

In the following analysis we use observation (records) data for the Alaotra Grebe collected by BirdLife International as given in Thompson et al. (2017). We note firstly that over the 18 year period from 1970-1988 there were 7 “r” years and 11 “u” years. Using midpoint p(ci) values in r years and midpoint values for ε′ and p′(ri) in u′ years (from Table 1 in Thompson et al. (2013b), i.e. ε′p′(ri) = 0.0047) and an initial P(Xt) of 0.5 in 1970, we find that P(Xt) is slightly less than 1 in 1988 and then decreases markedly in subsequent u years, in accord with the results shown in Figures 1, 2 and 3 of Thompson et al. (2017). We therefore focus our attention on the years following 1988, where there were only unsuccessful (u, u′) surveys. In the following analysis we thus take t = 0 to be 1988.

In view of the above remarks, we initiate the BU model in 1990 and take P(Xt) = 0.5 in that year. We then need to check for consistency of these assumptions as described in the previous section. Firstly, however, from Table 1 in Thompson et al. (2017) we have surveyed (u) years in 1989, 1990, 1993, 1994, 1997, 1998 and 1999, with midpoint parameter values of habitat coverage ε = 0.875 (i.e. 0.80 ≤ ε ≤ 0.95) and p(ri) = 0.74, giving a Bayes Factor in those years from Equation 6.11 of b(u) = 2.837. In non-surveyed (u′) years after 1988, midpoint parameter values of ε′ = 0.025 and p′(ri) = 0.187 give a Bayes Factor from Equation 6.11 of b(u′) = 1.0047. All the parameter values (i.e. ε, p(ri), ε′ and p′(ri)) are taken from Table 1 in Thompson et al. (2017), which were derived from information held by BirdLife International. Cumulative Bayes Factors Bt from Equation 6.8 then take respective values from 1990 to 1994 of 2.837, 2.850, 2.864, 8.124 and 23.05 (see Figure 6.1a). According to the literature, a rule of thumb value of R between 3 and 10 provides only moderate evidence for extinction (Lee and Wagenmakers, 2014; Schönbrodt and Wagenmakers, 2018). Based on this, we suggest using a conservative rule of thumb value of R = 20 as an indicator that extinction occurred on or before 1995 (see Table 6.1). This should reflect strong evidence for extinction. We also note that the large jump in Bt values in 1994 (from Bt = 8.12 to Bt = 23.05) serves as confirmation of a change in extinction status. These data were used to construct Figure 6.1a and Figure 6.1b. They show the cumulative Bayes Factor for the hypothesis that the species is extinct, together with the probability that the species is extant, which declines in the absence of successful surveys and sightings from the initial value of 0.5 in 1990 to a value close to zero by 1998.

Start year    P(Xt)
1990          0.5

Passive surveys (u′)
        ε′                    p′(i)                 p′(r)                 ε′p′(ri)
        Low    Mid    High    Low    Mid    High    Low    Mid    High    Low      Mid      High
        0.00   0.03   0.05    0.10   0.38   0.65    0.40   0.50   0.60    0.0000   0.0047   0.0195

Active surveys (u)
        ε                     p(i)                  p(r)                  ε p(ri)
Year    Low    Mid    High    Low    Mid    High    Low    Mid    High    Low      Mid      High
1990    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
1993    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
1994    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
1997    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
1998    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
1999    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
2000    0.70   0.80   0.90    0.90   0.93   0.95    0.70   0.80   0.90    0.44     0.59     0.77
2004    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81
2009    0.80   0.88   0.95    0.90   0.93   0.95    0.70   0.80   0.90    0.50     0.65     0.81

Table 6.1: Passive and active survey input data for the Alaotra Grebe data commencing in 1990, and output from the model. The low and high values in the table refer to the lower and upper bounds for the parameter values for the Alaotra Grebe in Thompson et al. (2017). Note that these are just parameter bounds for the Alaotra Grebe; they do not mean that 80% surveying is low. As discussed in the text, we consider the midpoint values for each parameter. Recall that p(ri) = p(r) ∗ p(i); for an explanation of the parameters, see the main text.

Figure 6.1: Cumulative Bayes Factors and probabilities of extinction generated sequentially from the data in Table 6.1.

The values of PB(Xt) from t = 1 (1989) to t = 7 (1995) are derived from Equation 6.7 for BU, assuming an initial PB(X1) of 0.5 in 1990 and the above Bayes Factors Bt. The values for P(Xt) show probabilities for extinction by 1995 of PB(Et) = 0.958, exceeding a notional probability threshold value of 0.95. There are many possible variations on the model outputs above. For example, if we initiate yearly Bayes Factors b(u), b(u′) from 1989 we obtain a cumulative Bayes Factor Bt in 1994 of 65.7 (suggesting a rule of thumb value of, say, 65). However, the extant probabilities given in Figure 6.1b are highly sensitive to the choice of p(ri), as shown in Figure 6.2.

Figure 6.2: Sensitivity of extant probability to p(ri). Here, it was assumed that all survey years have the same ε and p(ri). Recall that p(ri) is the probability that the species could have been reliably identified and recorded, and ε is the proportion of the species habitat within its likely range that was surveyed.
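A sensitivity check of this kind can be sketched by holding the coverage fixed and sweeping p(ri) over a grid. The example below assumes that every year is an active unsuccessful survey year with the same parameters, an initial extant probability of 0.5, and five survey years; the number of years, the grid and the function name are hypothetical and are not the settings used to draw the figure.

```python
def extant_probability(p_ri, coverage, n_survey_years, p_x1=0.5):
    """Extant probability after n identical unsuccessful survey years."""
    yearly_b = 1.0 / (1.0 - coverage * p_ri)
    cumulative_b = yearly_b ** n_survey_years
    return 1.0 / (1.0 + cumulative_b * (1.0 / p_x1 - 1.0))


COVERAGE = 0.875    # midpoint coverage used in the case study
N_SURVEY_YEARS = 5  # hypothetical number of consecutive survey years

grid = [0.1, 0.3, 0.5, 0.7, 0.9]
sensitivity = [(p_ri, extant_probability(p_ri, COVERAGE, N_SURVEY_YEARS))
               for p_ri in grid]
```

The extant probability falls steadily as p(ri) increases, so a modest change in the assumed detectability can move the result across a decision threshold.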

6.5 Discussion

Inferring extinction probabilities is important in ecology because these inferences affect reporting on the state of the environment, and decisions about priorities for surveys, actions to abate threats, and to establish and manage protected areas (see Thompson et al. (2017)). The IUCN (2012) defines a taxon as extinct “when there is no reasonable doubt that the last individual has died. A taxon is presumed extinct when exhaustive surveys in known and/or expected habitat, at appropriate times (diurnal, seasonal, annual), throughout its historic range have failed to record an individual. Surveys should be over a time frame appropriate to the taxon’s life cycle and life form”. We have described an approach to inferring possible extinction of a species from sighting records that allows for opportunistic sightings and dedicated surveys, accounting for the period of observation and the extent of habitat surveyed.

Our approach provides a means for quantifying what is meant by ‘no reasonable doubt’ and a basis for consistent allocation of resources to mitigate extinction risks (Akçakaya et al., 2017). Its particular utility in this regard is that it provides a means of reassessing estimates and priorities as new information arises. The model is very simple to implement in a standard spreadsheet, facilitating its adoption for routine application in organizations that use observation data to support evidence-based decisions regarding actions to protect species and their habitats. It should also provide a measure of comfort for those who have used sighting models in controversial circumstances (Carlson et al., 2018b), providing an opportunity for a more comprehensive, integrated interpretation of data. However, the approach requires users to provide parameter values (i.e. ε and p(ri)), which can be hard to obtain for most species, and the choice of values will govern the conclusions made (see Figure 6.2).

Different foundational assumptions can lead to different interpretations of data and in some instances, different conclusions. It is important that such differences be reconciled. Here, we have developed and illustrated a Bayesian Updating method for evaluating the probability that a species is extinct, based on a record of observations and surveys. We have shown that it is consistent with a non-Bayesian model and illustrated its use with a data set to arrive at conclusions which accord with the non-Bayesian model applied to the same data.

As with the model developed by Thompson et al. (2017), the Bayesian Updating model may be used to explore hypothetical scenarios, or to test ideas about investments in surveys. For instance, this model will be useful if questions arise regarding the trade-offs between more extensive surveys, or new technologies that are more likely to detect a species when it is present. Manipulations of the model’s parameters will reveal whether potential investments will contribute substantially to the estimated probability that a taxon is extinct. Likewise, alternative targeted survey strategies or training scenarios may be assessed (Akçakaya et al., 2017; Thompson et al., 2017). Perhaps most importantly, the methods outlined here indicate how acceptable levels of uncertainty may be quantified so that decisions about the allocation of resources may be made transparently and consistently.

Chapter 7

Using survival theory models to quantify extinctions

This chapter has been published, and has the following citation:

Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2020). Using survival theory models to quantify extinctions. Biological Conservation, 241:108345.

Abstract

Extinctions are difficult to observe and typically are inferred from the timing and reliability of field observations and collections. Recent advances in approaches to estimating extinction probability consider the type, timing and certainty of records; the timing, scope and severity of threats; and the timing, extent and reliability of surveys. Here we describe a new approach to inference of extinction that uses survival theory, an approach that has a long history of effective use in other disciplines that confront similar problems. The model takes into account uncertainties in input parameter estimates and provides bounds on estimates of the extinction probability for the case in which a species has not been detected following some specified time. We illustrate application of the model using information for dodos and Aldabra snails. This approach provides an alternative perspective on the models underlying the techniques for inferring extinction. It should provide reliable estimates of recent extinction rates.

7.1 Introduction

Estimating the probability a taxon has gone extinct is important in conservation biology because it affects decisions about priorities for surveys and actions such as establishing and managing protected areas (Pimm et al., 2014; Thompson et al., 2017). Estimates also contribute to measures of biodiversity conservation effectiveness and reporting on the state of the environment (e.g., Tittensor et al., 2014; Pimm et al., 2014). The problems that usually occupy survival analysis in medicine and engineering safety parallel the problem in ecology of estimating the mean time to extinction of a species. Survival analysis is used widely in medicine, for instance, to estimate life expectancy, in engineering to estimate the time to failure of a component or system (Hosmer et al., 2002; Klein and Goel, 2013), and in many other contexts in which time to an event is crucial (e.g., Singer and Willett, 1993).

In survival analysis, events such as the death of a patient or failure of a component are assumed to be unambiguous and there is only one such event for each subject. A survival curve is a function describing the proportion of a population that survives (or the probability that an individual survives) to some time t. Survival functions depend on the specification of a hazard function, which is the probability of the event at time t, conditional on the person, organism or system persisting to time t. The times to death or failure may depend on a host of factors.

Typically, the available data for biological species are composed of records based on museum or herbarium collections, together with a series of uncertain observations based on expert and amateur sightings, sound recordings or photographs of varying reliability (McCarthy, 1998; Solow and Beet, 2014; Keith et al., 2017; Thompson et al., 2017). Sometimes, these records are supplemented by explicit surveys designed to detect the species. Solow (1993a) introduced the idea of inferring extinctions from data on the timing of observations. Others have adapted these ideas to consider uncertain observations (see especially Solow and Roberts, 2003; Rout et al., 2009; Rivadeneira et al., 2009; Roberts et al., 2010; Jarić and Roberts, 2014). Keith et al. (2017) and Thompson et al. (2013b, 2017) extended these approaches to incorporate information on the adequacy of searches for the taxon, accounting for detectability, accessibility of habitat, timing, duration, sampling intensity, survey methods and extent, and observer skill.

In this study, we introduce a new approach that uses survival theory as a tool for estimating extinction times and the mean time to extinction. Unlike the studies cited above, our approach does not rely on detailed historical sighting records and is non-Bayesian. Specifically, our models are designed for situations in which a species has not been recorded for some period of time t after an assumed valid record at time t = 0. Thus, this approach is not designed to replace the developments outlined above. Rather, as shown in the following sections, it enables us to derive simple expressions for the mean time to extinction and associated confidence levels for the special case of extinction during some specified time interval. The novel features of our approach are explained in more detail in Section 7.2 (Discrete survival theory for extinction probabilities) and Section 7.3 (Model consistency). Some case studies to illustrate these points, using particular scenarios, are given in Section 7.4. Our results are summarized and discussed in the final section.

7.2 Discrete survival theory for extinction probabilities

In discrete time, we assume that extinction of a species occurs in some year beginning at time $t = 1, 2, 3, \ldots$ following the last certain recording at time $t = 0$. We define $X_t$ and $e_t$ to be the events that the species is extant at time $t$ and becomes extinct in year $t$ (i.e. in the time interval $(t, t + 1)$), with probabilities $P(X_t)$ and $p(e_t)$ respectively. Then we have

$$P(X_0) = 1 \quad \text{and} \quad \sum_{t=0}^{\infty} p(e_t) = 1 \tag{7.1}$$

and in addition, that

$$P(X_t) = \sum_{t'=t+1}^{\infty} p(e_{t'}) \tag{7.2}$$

i.e. $P(X_t)$ can also be interpreted as the probability that extinction occurs in some year beginning at time $t' > t \geq 0$. See Lee et al. (2017b) for further details about the issues underlying Equation 7.1, i.e. cases where $\sum_{t=0}^{\infty} p(e_t) \neq 1$ because of rediscovery.

In survival theory, with this interpretation, P (Xt) is usually called the survival function.

From Equation 7.2, on rearrangement of the sums over $t$ and $t'$, we have (as shown in Appendix 8)

$$\sum_{t=0}^{\infty} P(X_t) = \sum_{t=0}^{\infty} t\, p(e_t) = \mu \tag{7.3}$$

where the second sum in Equation 7.3 is, by definition, the Mean Time to Extinction (denoted by $\mu$).
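The identity in Equation 7.3 can be checked numerically for any distribution p(e_t) on t = 1, 2, .... The following is a minimal Python sketch under stated assumptions: the four-point distribution is hypothetical and chosen only for illustration, and the function names are ours rather than part of the model.

```python
def survival_function(p):
    """Return [P(X_0), ..., P(X_n)] given p[t-1] = p(e_t) for t = 1..n.

    Implements Equation 7.2: P(X_t) is the sum of p(e_t') over t' > t.
    """
    return [sum(p[t:]) for t in range(len(p) + 1)]


def mean_time_to_extinction(p):
    """Second sum in Equation 7.3: sum over t of t * p(e_t)."""
    return sum(t * pt for t, pt in enumerate(p, start=1))


# Hypothetical distribution over extinction years t = 1, 2, 3, 4 (sums to 1).
p_example = [0.1, 0.2, 0.3, 0.4]

surv = survival_function(p_example)
mu_from_survival = sum(surv)                    # first sum in Equation 7.3
mu_direct = mean_time_to_extinction(p_example)  # second sum in Equation 7.3
```

Both sums give the same mean time to extinction, which is the content of Equation 7.3.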

In survival theory it is usual to consider the so-called hazard function defined in general by

$$h(t) = \frac{p(e_t)}{P(X_{t-1})}, \qquad t = 1, 2, 3, \ldots \tag{7.4}$$

From Equations 7.2 and 7.4 it then follows that

$$P(X_t) = P(X_{t-1}) - p(e_t) = [1 - h(t)]\,P(X_{t-1}), \qquad t = 1, 2, 3, \ldots \tag{7.5}$$

We note in passing that Equation 7.5 is formally equivalent to the linear iterative model proposed by Thompson et al. (2017) in the special case where there are only unsuccessful surveys following the last recording at time t = 0. We will return to this formal connection in Section 7.3.

Further discussion and elaboration of the above is given in Appendix 8. We note in particular that specification of a time-dependent hazard function $h(t)$ is not a simple task and that, at a more fundamental level, one should consider probability models for, say, $p(e_t)$. As a simple example, one can consider the (normalized) geometric distribution

$$p(e_t) = h(1 - h)^{t-1}, \qquad t = 1, 2, 3, \ldots \tag{7.6}$$

with constant $0 < h < 1$. In Appendix 8 we show that this model implies a constant hazard function or rate $h(t) = h$. Conversely, if we assume an effective constant (mean) value $h$ for the hazard function $h(t)$ over some time interval, i.e. $h(t)$ differs only marginally from a constant hazard rate, iteration of Equation 7.5 shows that

$$P(X_t) = (1 - h)^t, \qquad t = 0, 1, 2, \ldots \tag{7.7}$$

Equation 7.6 then follows directly from Equations 7.4 and 7.7. In essence the assumption of constant h(t) is equivalent to assuming that the background processes driving the threats to extinction of the species are more or less constant through time.
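The link between Equations 7.4 to 7.7 can be illustrated by iterating Equation 7.5 directly. The sketch below is illustrative only: the value h = 0.1 and the 30-year horizon are arbitrary choices, and the function name is an assumption made for this example.

```python
def iterate_survival(hazard, n_years):
    """Iterate Equation 7.5, P(X_t) = [1 - h(t)] * P(X_{t-1}), from P(X_0) = 1.

    `hazard` is a function t -> h(t) defined for t = 1, 2, ...
    Returns the list [P(X_0), ..., P(X_n_years)].
    """
    surv = [1.0]
    for t in range(1, n_years + 1):
        surv.append((1.0 - hazard(t)) * surv[-1])
    return surv


h = 0.1  # illustrative constant hazard rate
n_years = 30
surv = iterate_survival(lambda t: h, n_years)

# Equation 7.4 rearranged: p(e_t) = h(t) * P(X_{t-1}).
p_extinct = [h * surv[t - 1] for t in range(1, n_years + 1)]
```

With a constant hazard the iterated values agree with the closed forms (1 - h)^t and h(1 - h)^(t-1). Passing a non-constant function for `hazard` gives the corresponding survival curve for a time-dependent h(t).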

A simplifying feature of constant $h(t)$ is that from Equations 7.3 and 7.7 we obtain

$$\mu = \sum_{t=0}^{\infty} (1 - h)^t = \frac{1}{h} \tag{7.8}$$

in accord with similar results for survival analysis in continuous time (Miller Jr, 2011). Another simplifying feature of the constant hazard rate assumption (as shown in Appendix 8) is that the standard deviation (STD) for the time to extinction is given by

$$\sigma = \frac{\sqrt{1 - h}}{h}. \tag{7.9}$$

In general, for given $\mu$ and STD $\sigma$ one can examine confidence levels (CL) for extinction ($E_T$) occurring on or before some time $T$, often taken to be a multiple $K$ of $\sigma$ above $\mu$, i.e. at time

$$T = \mu + K\sigma. \tag{7.10}$$

We note that in general, on rearranging the first equality in Equation 7.5, we obtain, by definition, the confidence level

$$\mathrm{CL} = \sum_{t=1}^{T} p(e_t) = \sum_{t=1}^{T} [P(X_{t-1}) - P(X_t)] = 1 - P(X_T) = P(E_T) \tag{7.11}$$

which holds for any $T$ and any distribution $p(e_t)$ (given $P(X_0) = 1$). In particular, for constant hazard rate $h(t) = h$, Equations 7.7 to 7.11 give

$$\mathrm{CL} = 1 - (1 - h)^T = P(E_T) \tag{7.12}$$

and

$$T_K = h^{-1}\left[1 + K\sqrt{1 - h}\right]. \tag{7.13}$$

As shown in Appendix 8, and discussed in the following section, for $T$ given by Equation 7.13 we obtain the lower bound

$$\mathrm{CL} \geq \lim_{h \to 0} P(E_T) = 1 - \exp[-(1 + K)] \tag{7.14}$$

It is interesting to note that the right-hand side of Equation 7.14 is equal to the CL for the continuous time exponential distribution (Miller Jr, 2011), obtained as a continuum limit of the discrete time geometric distribution, and is valid (as shown in Appendix 8) for any constant hazard rate $h$. By contrast, the discrete model CL, Equations 7.12 and 7.13, depends on both $K$ and $h$. As we shall see, the $h$-dependence in the discrete case provides us with an additional “dimension” for studying extinction times and probabilities for extinction.
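Equations 7.12 to 7.14 are straightforward to evaluate numerically. The following Python sketch, with function names chosen here purely for illustration, tabulates T and CL for a few hazard rates at K = 2 and compares them with the lower bound of Equation 7.14.

```python
import math


def extinction_time(h, k):
    """Equation 7.13: T_K = [1 + K * sqrt(1 - h)] / h, i.e. mu + K * sigma."""
    return (1.0 + k * math.sqrt(1.0 - h)) / h


def confidence_level(h, t):
    """Equation 7.12: CL = 1 - (1 - h)^T."""
    return 1.0 - (1.0 - h) ** t


def lower_bound(k):
    """Right-hand side of Equation 7.14: 1 - exp[-(1 + K)]."""
    return 1.0 - math.exp(-(1.0 + k))


K = 2
for h in (0.01, 0.077, 0.1, 0.2, 0.27, 0.35):
    T = extinction_time(h, K)
    print(f"h = {h:5.3f}  T = {T:7.2f}  CL = {confidence_level(h, T):.3f}")
print(f"lower bound for K = {K}: {lower_bound(K):.4f}")
```

The hazard rates in the loop are the values used in this chapter plus one small value (0.01) added to show the approach to the limit; any value in (0, 1) can be substituted.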

7.3 Model consistency

Equation 7.13 can be viewed as a relationship between $T$ and $h$ at a given CL. In addition, Equation 7.12 provides a relationship between $T$, $h$ and $P(E_T)$. Equations 7.12 and 7.13 could be considered as two equations in three “unknowns”, $h$, $T$ and $P(E_T)$, for given $K$. One would then expect that specifying a value for any one of these unknowns determines values for the other two. We call such an outcome self-consistency, meaning that Equations 7.12 and 7.13 are internally consistent for such solutions.

A simple way of visualizing (model) self-consistency is to plot T vs. h and P (ET ) vs. h using Equations 7.12 and 7.13 and line the resulting curves up against one another with a common h-axis as shown in Figure 7.1.

Figure 7.1: Plots of $P(E_T)$ vs. $h$ and $T$ vs. $h$ showing the self-consistency triplet $(P^*, h^*, T^*)$ of model parameter values when $K = 2$.

Specifying any one of $P^*$, $h^*$ and $T^*$ then determines the other two as shown (when specifying $T^*$). Note, however, that while $h^*$ can take any value from 0 to 1 and $T^*$ any value from 1 to infinity, $P^*$ is bounded below by 0.95 when $K = 2$ (as $h \to 0$ and $T \to \infty$ together in Equations 7.12 and 7.13). Details are provided in the Appendix for arbitrary values of $K$. Suffice it to say here that as $K$ increases the lower bound of $P^*$ also increases, while the general shape of the curves in Figure 7.1 stays the same.

In practical application one would specify a value for $K$ and either, for a given value of $P^*$, determine $h^*$ and $T^*$ or, for a given value of $T^*$, determine $h^*$ and $P^*$ (as indicated in Figure 7.1). We stress again that from Equation 7.11 (cf. Figure 7.1), $P^* = P(E_T)$ is in general the confidence level (CL) for any distribution $p(e_t)$. In the special case of the geometric distribution, one can think of $h^*$ as an effective constant (mean value) for $h(t)$. According to Figure 7.1, the extinction time ($T^*$) is highly sensitive to the choice of $h^*$. For instance, if $h^*$ is assumed to be 0.1 then extinction is likely to happen within 29 years (i.e. $T^* = 29$). However, the value of $T^*$ is reduced to 14 if $h^*$ is increased to 0.2. Thus, a value greater than 0.3 can be considered a high hazard rate, as it implies that extinction is likely to happen within the next 10 years. Given the uncertainty in the hazard rate and the sensitivity of the results to it, choosing $h^*$ in order to determine $P^*$ and $T^*$ is, in general, not a good option.

A self-consistent value for $h^*$ determined from given $P^*$ and $T^*$ can, however, be checked for consistency with the aforementioned linear iterative model (Thompson et al., 2017) for unsuccessful surveys, which is formally equivalent to the survival theory iterative model, Equation 7.5, with $h(t)$ replaced by $\epsilon_t p_t(r_i)$. In this interpretation $\epsilon_t$ is the fraction of habitat surveyed and $p_t(r_i)$ is the probability that the species would have been recorded and correctly identified if it were detected in year $t$. Thus, for constant $h(t) = h$ the corresponding linear iterative model factor $[1 - \epsilon p(r_i)]$ in Equation 7.5 is the probability that a survey is unsuccessful. For example, with a determined self-consistent value of $h^* = 0.10$, the corresponding linear iterative model with $\epsilon p(r_i) = 0.10$ and an assumed $p(r_i) = 0.90$ corresponds to an unsuccessful survey coverage of 11% of the (assumed) species habitat. Such comparisons between survival theory and the linear iterative model can provide useful consistency checks for survival theory. However, we stress that, unlike survival theory, the linear iterative model has no intrinsic concepts of mean time to extinction, CLs, etc.
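The consistency check described above amounts to dividing the hazard rate by the assumed recording probability. A minimal Python sketch follows; the function name and argument names are assumptions made for this illustration, and the habitat fraction is treated as the only unknown.

```python
def survey_coverage(h, p_record):
    """Fraction of habitat surveyed implied by h = coverage * p(r_i)."""
    if not 0.0 < p_record <= 1.0:
        raise ValueError("p_record must lie in (0, 1]")
    return h / p_record


# Self-consistent hazard rate 0.10 with an assumed p(r_i) of 0.90.
coverage = survey_coverage(0.10, 0.90)
print(f"implied survey coverage: {coverage:.1%}")
```

The same function can be applied to any other self-consistent hazard rate to obtain the implied survey coverage.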

7.4 Case studies

In this section we present two applications of our approach for the special case K = 2 (as in Figure 7.1). Recall that K = 2 corresponds to confidence bounds of 2 standard deviations away from the mean. Particular assumed values for model parameters represent possible scenarios in our case studies. Readers are encouraged to experiment with other parameter (and K) values to check the robustness and sensitivities of our model results.

7.4.1 Dodo

The observation records for the Dodo have been used by a number of authors to illustrate extinction processes (see references in Roberts and Solow (2003)). The species was endemic to the island of Mauritius and probably became extinct in the late 17th century. Discounting a doubtful record in 1674, the last confirmed record was in 1662, which we specify to be t = 0.

In the absence of any information to enable us to specify a functional form for the hazard function $h(t)$, we assume a constant value $h$ over some number of years $T$. For example, if we seek the probability of extinction by year 1700, i.e. $T = 38$ (taking 1662 to correspond to $t = 0$), we deduce from the self-consistency Equation 7.13 that $h = 0.077$ and from Equation 7.12 that $P(E_T) = 0.952$, i.e. at a 95% CL the species was extinct by 1700, with probability of extinction 0.952.
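Equation 7.13 has no simple closed-form inverse for h, but because T decreases monotonically as h increases, h can be found numerically. The sketch below uses bisection; this is one convenient choice of root finder assumed for this illustration, and the function names are hypothetical.

```python
import math


def extinction_time(h, k=2):
    """Equation 7.13: T_K = [1 + K * sqrt(1 - h)] / h."""
    return (1.0 + k * math.sqrt(1.0 - h)) / h


def hazard_for_time(t_target, k=2, tol=1e-10):
    """Solve Equation 7.13 for h by bisection, given a target T >= 1.

    T_K decreases monotonically from infinity (h -> 0) to 1 (h = 1),
    so there is a unique root on (0, 1].
    """
    lo, hi = 1e-12, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if extinction_time(mid, k) > t_target:
            lo = mid  # T too large, so h must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


h_star = hazard_for_time(38)         # Dodo scenario: T = 1700 - 1662
p_star = 1.0 - (1.0 - h_star) ** 38  # Equation 7.12
print(f"h* = {h_star:.3f}, P* = {p_star:.3f}")
```

The same routine applies to any other choice of T and K, so readers can explore alternative scenarios without reading values off Figure 7.1.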

Alternatively, if one assumes, say, a constant hazard rate $h = 0.10$, one deduces, as noted above, the self-consistent values $T = 29$ and $P(E_T) = 0.953$ from Equations 7.12 and 7.13. From these values one would deduce that at a 95% CL the Dodo was extinct by 1662 + 29 = 1691 with probability 0.953. This is in close agreement with the extinction time estimate of 1690 given by Roberts and Solow (2003). Thus, with a hazard rate of 0.1 the survival theory approach gives results consistent with those of Roberts and Solow (2003).

Finally, as a model consistency check of the linear iterative model against survival theory, we use the linear iterative model interpretation of Equation 7.5 with $h = \epsilon p(r_i)$, and note that for survival theory the probability of extinction is approximately 0.95 for both $h = 0.10$, $T = 29$ (i.e. 1691) and $h = 0.077$, $T = 38$ (i.e. 1700). Given the large number of confirmed sightings before 1662 ($t = 0$) and the unique character of the species itself, it is reasonable to assume a relatively high value for the linear iterative model parameter $p(r_i)$ defined above. For example, if we take $p(r_i) = 0.90$, a value of $\epsilon p(r_i) = 0.10$ ($T = 29$) implies a survey coverage of approximately 11%, while the value $\epsilon p(r_i) = 0.077$ ($T = 38$) implies a smaller survey coverage of 8.6%, as one would expect. In all cases the probabilities of extinction for the linear iterative model and survival theory are approximately 0.95, again showing model consistency.

7.4.2 Aldabra snails

The Aldabra snail inhabits the Aldabra Atoll in the Seychelles group. Gerlach (2007) believed it to have become extinct in the late 1990s or early 2000s, although this claim was disputed and subsequently proved to be wrong when a specimen was sighted in 2014 in dense vegetation in a relatively remote location on Malabar Island, one of the large islands in the Aldabra Atoll (Battarbee, 2014).

To illustrate the use of these methods, we assume that the last confirmed (specimen based) record was indeed in 1997, rather than in 2014, and examine the support for the inference that the species was extinct, based on the evidence available in 2007. After 1998, there were several unsuccessful surveys; a small-scale survey in 2000 and two ‘exhaustive’ surveys in 2005 and 2006.

With insufficient information to assign values to model parameters, we assume that at a 95% CL extinction occurred on or before 2008, i.e. with 1997 as $t = 0$ we take $T = 10$ in the self-consistent survival theory Equations 7.12 and 7.13 (assuming, conservatively, 10 years of no records). Thus, with $T = 10$ in Equation 7.13 we deduce an (assumed) constant hazard rate of $h = 0.27$. It then follows from Equation 7.12 that $P(E_T) = 0.957$, i.e. an approximate probability of 0.96 that the species was extinct on or before 2008.

If we now take account of the recent sighting by resetting $t = 0$ to 2014 and assume a large hazard rate of, say, $h = 0.35$ to take account of an assumed re-introduction of systematic surveys after 2014, we obtain from Equation 7.13 a value of $T = 7.5$. That is, we can now be 95% confident that extinction will occur on or before 2014 + 8 = 2022 and, from Equation 7.12, that the probability of extinction at that time will be at least 0.96.

Again, as a model consistency check with the linear iterative model, assuming $p(r_i) = 0.90$, values of $\epsilon p(r_i) = 0.27$ and 0.35 correspond respectively to survey coverage of 30% and 39%.

Finally, as an illustration of how one could include time dependence of linear iterative model parameters, we assume constant $p(r_i) = 0.90$ and survey coverage $\epsilon_t$ of 0.30 in year 2000, 0.70 in the “exhaustive” survey years 2005 and 2006, and passive values of, say, 0.10 in all other years. Taking $t = 0$ as 1997, we obtain consecutive linear iterative model values of $P(X_T)$ from 1998 to 2008 of 0.91, 0.83, 0.60, 0.55, 0.50, 0.46, 0.41, 0.15, 0.057, 0.052 and 0.047, i.e. a probability of extinction of 0.955 by 2008, in accord with the survival theory analysis above (refer to Figure 7.2). Adapting the linear iterative model to the rediscovery in 2014 can be achieved as outlined above, again in accord with survival theory.
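The sequence of linear iterative model probabilities quoted above can be reproduced by iterating Equation 7.5 with a year-specific hazard equal to the survey coverage multiplied by the recording probability. The Python sketch below is a minimal illustration; the function and argument names are assumptions made for this example, while the coverage values are those stated in the text.

```python
def linear_iterative_model(coverage_by_year, p_record, start_year, end_year,
                           passive_coverage):
    """Iterate P(X_t) = [1 - coverage_t * p(r_i)] * P(X_{t-1}), P(X_0) = 1.

    Years without an entry in `coverage_by_year` use `passive_coverage`.
    Returns {year: P(X_t)} for start_year + 1 through end_year.
    """
    prob_extant = 1.0
    result = {}
    for year in range(start_year + 1, end_year + 1):
        coverage = coverage_by_year.get(year, passive_coverage)
        prob_extant *= 1.0 - coverage * p_record
        result[year] = prob_extant
    return result


surveys = {2000: 0.30, 2005: 0.70, 2006: 0.70}
p_extant = linear_iterative_model(surveys, p_record=0.90, start_year=1997,
                                  end_year=2008, passive_coverage=0.10)
p_extinct_2008 = 1.0 - p_extant[2008]

for year, prob in p_extant.items():
    print(year, f"{prob:.3f}")
```

With these inputs the sketch gives a probability of extinction by 2008 of roughly 0.953, close to the value quoted above. Changing the entries of `surveys` shows how additional or more extensive surveys would alter the result.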

Figure 7.2: Consistency check of Linear iterative model against Survival theory.

7.5 Discussion

In this study, we introduce the use of “Survival Theory” to estimate the mean time to extinction of a species given uncertain observational records in time. In survival theory there are various ways that one could achieve self-consistency, and under reasonable assumptions survival theory and the linear iterative model are consistent with each other. This approach also illustrates how to quantify uncertain extinction by conducting scenario analyses, which could guide investments in surveys for species that may have become extinct.

Akçakaya et al. (2017) discuss the costs of mistakenly concluding a species is extinct when in fact it is extant, versus mistakenly concluding a species is extant when it is extinct. The model described here can be used together with information on the costs of surveys and the costs of wrong decisions to explore trade-offs and optimize investments in surveys.

The hazard function is a key element in the analysis above. It may depend on a host of factors including the amount of remaining habitat or some other proxy of population size. In the above, for simplicity, we assumed the hazard function was constant in time.

The time dependence of h(t) may be difficult to quantify in practice. The important point for model development, however, is that one could use any value for h(t) in the above and calculate mean time to extinction from Equation 7.8. Thus, if the biology of the species suggests it depends sensitively on habitat extent and reliable data are available on the loss of habitat over the critical period since the last sighting, then h(t) could be scaled accordingly to provide a more nuanced analysis.

Other parameters are uncertain, due to both natural variation and observation errors. We have estimated the probabilities of detecting a species given that it is present, and of correctly classifying it, given that it is observed. These parameters are very different for different species, depending on their size and morphology, the characteristics of their habitat, their behaviours and reactions to the presence of humans, and the presence of physically similar but taxonomically distinct species (Keith et al., 2017). Wintle et al. (2012), for example, provide a method for calculating the probability a species occupies a site given one or more unsuccessful surveys, and the number of sequential non-detections necessary to assert, with a pre-specified confidence, that a species is absent from a site. It would be a worthwhile avenue of research to quantify these parameters empirically for a range of taxa and ecological settings, to provide general guidelines that could be extrapolated over many other taxa. This would provide a foundation for more accurate estimates of extinction rates in regions and globally.

Chapter 8

A Systematic Bayesian Integration of Epidemiological, Genetic and Movement Data

This chapter is currently being prepared for publication.

Abstract

Movement or contact data between infected and susceptible hosts is one of the potential factors governing the early dissemination of an infectious disease outbreak. Thus, including movement information in the modelling structure has great potential to improve inference regarding the transmission network and ultimately lead to better disease control strategies. However, recently developed approaches mainly integrate epidemiological and genetic data. One such method was developed by Lau et al. (2015), where a Bayesian MCMC technique was used to reconstruct the transmission network using sequence and epidemiological data of infected animals in a network of farms. In this chapter, we extend their approach to incorporate animal movements between farms. The extended framework is first tested and verified on simulated outbreaks, and in a future project it will be applied to the early stages of the 2017 Mycoplasma bovis outbreak in New Zealand.

8.1 Introduction

The widespread impacts of outbreaks of foot-and-mouth disease in the United Kingdom in 2001, severe acute respiratory syndrome (SARS) in Hong Kong in 2002, Ebola in West Africa in 2014 and the recent coronavirus pandemic (COVID-19) have prompted an explosion of efforts to understand the transmission dynamics of infectious diseases. For all events of this type, epidemiologists have a great need to understand the underlying disease transmission network to gain a picture of “who infected whom.” Modelling techniques are continuously being developed to help understand and infer this network. Essentially, by linking each infected case with its possible source of infection, these models attempt to infer the true ‘transmission network’. Nevertheless, successfully reconstructing a network remains a major theoretical challenge, owing to the difficulties in taking into account complex unobserved processes. There are many practical advantages in reconstructing the underlying network. For example, it is needed for calculating the very important epidemiological index, the basic reproductive number R0, which is defined as the number of secondary infections produced by a single typical primary case in a fully susceptible population (Kermack and McKendrick, 1927). It can also be used to locate super-spreading events (Lloyd-Smith et al., 2005) and to identify risk factors associated with increased propensity to infect other individuals (Hayama et al., 2019), both of which are very useful in designing and evaluating infection control policies (Ferguson et al., 2001). If the transmission network can be inferred or reconstructed, it can be used to estimate the contribution of each infective case and its spatial location to the overall progression of an outbreak (Faye et al., 2015).

In a number of early studies, the most likely transmission network was reconstructed only from the epidemiological data collected during a disease outbreak (Haydon et al., 2003; Cauchemez et al., 2006; Wallinga and Lipsitch, 2006; Cauchemez and Ferguson, 2011; Cauchemez et al., 2011; Heijne et al., 2012; Wallinga and Teunis, 2004; Ferguson et al., 2001). However, with the improvement of sequencing technology it became possible to obtain genetic sequence data on the pathogens at a fine resolution, and even in real time (Köser et al., 2012; Eyre et al., 2012). This new source of data opened up whole new directions for unraveling the possible transmission links between cases (i.e. individuals, animals, farms, etc.). Several methods were developed to infer the links based on sequence data alone (Köser et al., 2012; Ruan et al., 2003; Liu et al., 2005; Nübel et al., 2010; Mutreja et al., 2011; Walker et al., 2013; Jombart et al., 2011; Aldrin et al., 2011). Cottam’s original frequentist approach (Cottam et al., 2008a) and its modifications (Cottam et al., 2008b; Firestone et al., 2019a) are the intermediate step between epidemiological-only or genetic-only models and the modern state-of-the-art Bayesian outbreak reconstruction methodologies that combine both epidemiological and genetic sequence data (Ypma et al., 2013; Jombart et al., 2014; Didelot et al., 2014; Mollentze et al., 2014; Worby et al., 2016; Hall et al., 2015; Lau et al., 2015; De Maio et al., 2016; Klinkenberg et al., 2017; Didelot et al., 2017; Teunis et al., 2013).

The majority of approaches use Bayesian MCMC frameworks for inferring “who infected whom” over an outbreak. These models differ from each other on different levels, beginning with their underlying epidemiological models (e.g. SIR, SEIR) and/or their genetic models (e.g. phylogenetic and non-phylogenetic), as well as their different approaches for incorporating unobserved or incomplete sequence data. Furthermore, some of these methods allow for multiple primary infections along with the possibility of including multiple pathogen lineages within a host (Campbell et al., 2018; De Maio et al., 2016; Hall et al., 2015; Didelot et al., 2017). In addition, statistical models have been developed for performing Bayesian inference on the locations of undetected infections using Reversible-Jump MCMC (Jewell et al., 2009b) or related methods (Jewell et al., 2009a).

Only a handful of these methods (Campbell et al., 2019; Soetens et al., 2018; Hens et al., 2012; Jewell and Roberts, 2012) can use movement data or contact data between cases (i.e., individuals, animals, farms, etc.) in the inference. However, contact data are considered amongst the most important factors governing early outbreak dissemination for infectious diseases emerging in highly susceptible populations (Firestone et al., 2011). Hence, any model that does not include movement information may well be too simple and inaccurate. In practice, it requires great effort to observe and record movement data on large populations. Nevertheless, such data are often available, either through established animal movement monitoring systems or through proxies for human contact structures.

Soetens et al. (2018) used contact data to estimate the exposure type-specific attack rates (i.e., the proportion of infected cases with a specific exposure type among the total number of traced individuals with that same exposure type). The reproductive number (Kermack and McKendrick, 1927) was also estimated using maximum likelihood techniques specified for censored data. However, their approach and the approach of Hens et al. (2012) automatically designate confirmed cases with a single contact as the transmission pair, and this requires complete sampling of contacts and cases. Jewell and Roberts (2012) explicitly modelled the contact process to support epidemiological inference in an SINR framework, assuming that a single index case was known in advance.

In 2019, a new Bayesian transmission network model (Campbell et al., 2019) attempted to incorporate movement data with genomic and other available data. However, the study only considered undated, undirected, binary contact data. This limits the possibilities for inference, as it ignores the direction and timing of the movements, both of which carry significant information about the transmission network.

In this chapter, we extend the transmission network reconstruction method of Lau et al. (2015) so that it takes into account animal movements between farms. The method may also be used in outbreaks involving human populations, such as COVID-19, to untangle contact network structures in general. Lau et al. (2015) used a Bayesian MCMC technique to reconstruct the transmission network using sequence and epidemiological data of infected animals in a network of farms. The key advantage of Lau’s mechanistic model is that it can make good predictions even when the sequence data is only partially sampled and thus incomplete. The model is also relatively easy to interpret given its mechanistic nature. Its performance also ranks amongst the best, if not the best, of the models available (Firestone et al., 2019a). Through testing on simulated outbreaks, we evaluated the performance of our new extension.

8.2 Methods

8.2.1 Model formulation and modification

Epidemic process

The model developed here is an adaptation of Lau’s joint Bayesian Markov Chain Monte Carlo (MCMC) inference (Lau et al., 2015, 2017) with an assumed underlying SEIR epidemic framework. Consider a set of spatially distributed farms indexed by 1, 2, ....

Let $\xi_S(t)$, $\xi_E(t)$, $\xi_I(t)$ and $\xi_R(t)$ denote the sets of indices of farms in class S (susceptible), E (exposed), I (infectious) and R (removed) respectively at time $t$. In Lau’s original model, it was shown that it is possible to jointly impute the transmission graph and the transmitted sequences by integrating both epidemiological and genetic data in a statistically sound Bayesian framework.

Specifically, a farm $j \in \xi_S(t)$ becomes exposed either via a primary or background infection at rate $\alpha$ or from an infectious farm $i \in \xi_I(t)$ at a rate of the form $\beta K(d_{ij}; \kappa)$. The term $K(d_{ij}; \kappa)$ is known as the spatial kernel function, which characterises the dependence of the infectious challenge from infective $i$ to susceptible $j$ as a function of the distance $d_{ij}$ between them. Assuming that the sources of infection act independently of each other, Lau et al. (2015) obtained the total probability of individual $j$ becoming infected during the time period $[t, t + dt]$ as follows:

$$r(j, t, dt) = \Big[\alpha + \beta \sum_{i \in \xi_I(t)} K(d_{ij}; \kappa)\Big] + o(dt). \tag{8.1}$$

Here $o(dt)$ represents the probability of individual $j$ being infected by multiple sources of infection in the small period $dt$. The spatial kernel function was assumed to be a power function of the form:

1 K(dij; κ) = κ (8.2) 1 + dij where dij is the Euclidean distance between the premises, and κ is an inferred parameter. Other options for the spatial kernel include exponential, Cauchy and Gaussian decay (not tested here). In previous work, the parameter β in Equation 8.1 was reformulated as

βij to incorporate additional terms that represent the transmissibility of each infectious farm and the susceptibility of each susceptible farm based on farm-level covariates such as the predominant species held on the premises and the numbers of susceptible animals (see (Firestone et al., 2019b)). For simplicity, this modification is not included here also considering that the M. bovis outbreak in New Zealand mainly affected cattle farms. Chapter 8. Reconstructing the disease spread 123
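The kernel of Equation 8.2 and the bracketed rate of Equation 8.1 can be sketched in a few lines. This is only an illustration, assuming the kernel reads 1/(1 + d^κ); the function names, the farm coordinates and the value κ = 2 are hypothetical and are not taken from the thesis code.

```python
import numpy as np


def power_kernel(d, kappa):
    """Power-law kernel K(d; kappa) = 1 / (1 + d**kappa), as assumed for Equation 8.2."""
    return 1.0 / (1.0 + np.asarray(d, dtype=float) ** kappa)


def background_plus_spatial_rate(j, infectious, coords, alpha, beta, kappa):
    """Bracketed term of Equation 8.1 for a susceptible farm j."""
    if len(infectious) == 0:
        return alpha  # only background infection is possible
    # Euclidean distances from every infectious farm to farm j
    d = np.linalg.norm(coords[infectious] - coords[j], axis=1)
    return alpha + beta * power_kernel(d, kappa).sum()


# Hypothetical layout: farm 0 is susceptible, farms 1 and 2 are infectious.
coords = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
rate = background_plus_spatial_rate(0, [1, 2], coords, alpha=0.0004, beta=0.005, kappa=2.0)
```

With these made-up coordinates the two distances are 5 and 1, so the nearer farm dominates the spatial contribution.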

[Figure 8.1 image: panels A to D, each a timeline marking t_e, t_i and t_r for farms i and j together with the movement time t_m(i → j).]

Figure 8.1: Representation of periods of exposure to secondary transmission rates, incorporating contact-traced movements. Open circles represent the time of exposure, while solid circles represent the times of becoming infectious (black) and recovered (grey). The green and red lines indicate the exposure and infectious periods respectively. Dotted and dashed grey lines represent the duration of exposure from farm i to farm j through spatial spread and movements respectively. Assuming that farms i and j are located far away from each other, movements are crucial for disease transmission between them. In scenarios A and B, it is highly likely that the source of infection for farm j is farm i, as j is exposed while i is still infectious. Assuming the same low movement transmission rate (βm), farm i has a higher chance of infecting farm j in scenario A than in scenario B, owing to the longer exposure time from the movements (βm dt). In scenario C, farm i is no longer infectious when farm j is exposed, hence farm i cannot be the source of infection for farm j. Here, we assume that the newly moved animals on farm j recover, and end their infectious period, as soon as the animals on the source farm i recover. This assumption could be relaxed in the future to account for depopulation at different rates. In scenario D, farm j is already exposed to the infection before farm i becomes infectious, and hence farm i cannot be the source of infection for farm j.

The new model was developed to incorporate known contacts between known infected premises, i.e., information on movements between farms. Thus, we modified the transmission rate in Equation 8.1 as follows:

\[
r(j, t, dt) = \Big[\alpha + \beta_s \sum_{i \in \xi_I(t)} K(d_{ij}; \kappa) + \beta_m \sum_{i \in \xi_I(t)} m_{ij}\Big] + o(dt). \tag{8.3}
\]

Further terms were included in the likelihood to account for the possibility that farm i was the source of infection for farm j, where mij movements of potentially infected animals occurred from farm i to farm j. The probability of farm j becoming exposed due to farm i depends on the inferred secondary transmission rates via local spatial spread (βs) and movements (βm), along with the distance and the number of movements between farm i and farm j. However, for a particular movement from i to j to contribute, it must occur during the inferred infectious period of farm i and prior to the first exposure time of farm j. In the mechanistic statistical representation, the newly arrived animals were considered separately from the susceptible population at the destination farm j. If the newly arrived animals at farm j did not transmit infection within their previously inferred infectious period, then farm j was considered to remain uninfected. However, if they transmitted infection to the rest of the herd at some point within their inferred infectious period, farm j was considered infected (see Figure 8.1).
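The timing condition on movements can be made concrete with a short sketch. It counts only those movements from i to j that fall inside the infectious period of i and before the exposure time of j, and then evaluates the bracketed rate of Equation 8.3. The data layout (tuples of source, destination and time), the function names and all numerical values below are assumptions made for illustration.

```python
def relevant_movements(moves, i, j, t_inf, t_rem, t_exp):
    """Count movements i -> j made while i is infectious and before j is exposed."""
    return sum(
        1
        for src, dst, t in moves
        if src == i and dst == j and t_inf[i] <= t < t_rem[i] and t < t_exp[j]
    )


def extended_rate(alpha, beta_s, beta_m, kernel_values, movement_counts):
    """Bracketed term of Equation 8.3: background + spatial + movement pressure."""
    return alpha + beta_s * sum(kernel_values) + beta_m * sum(movement_counts)


# Hypothetical records: (source farm, destination farm, day of movement)
moves = [(0, 1, 5.0), (0, 1, 12.0), (0, 1, 20.0), (2, 1, 6.0)]
t_inf, t_rem, t_exp = {0: 4.0}, {0: 15.0}, {1: 18.0}

m_01 = relevant_movements(moves, 0, 1, t_inf, t_rem, t_exp)
rate = extended_rate(0.0004, 0.005, 0.5, kernel_values=[0.2], movement_counts=[m_01])
```

Here the movement on day 20 is ignored because farm 0 has already been removed by then, so only two movements contribute.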

Genetic evolution process

The evolutionary process of the pathogen considered here is identical to that of Lau et al. (2015), where it was modelled at the nucleotide level. Lau et al. (2015) assumed that the nucleotide substitution process is conditionally independent of the epidemic process and that there is only a single dominant strain at each exposed farm at any time point. Hence a newly exposed farm is only infected by a single strain at the time of exposure. The dominant strain that was passed on from the infectious farm evolves according to a continuous-time evolutionary process, with the assumption that nucleotide substitutions happen independently at different positions of the sequence.

The genome is a sequence of four nucleotide bases, which can be classified into pyrimidines and purines. Figure 8.2 illustrates this classification for DNA (deoxyribonucleic acid), which consists of two strands.

Figure 8.2: Genomes are mostly in the form of DNA and are made up of four nucleotide bases (i.e. Adenine, Guanine, Cytosine and Thymine), which fall into two categories: pyrimidines and purines. A point mutation between bases in the same category is referred to as a transition, while a point mutation between the two categories is referred to as a transversion. In general, transitions occur more frequently than transversions.

In contrast to DNA, the genome of an RNA virus is typically single-stranded, with Uracil taking the place of Thymine. Following Lau et al. (2015), we adopted the two-parameter Kimura model for the evolutionary process of the genome (Yang, 2006). Under the Kimura model, a nucleotide base x ∈ {A, G, C, T} in a DNA virus is replaced by a nucleotide base y ∈ {A, G, C, T} within the period ∆t with probability:

\[
P_{\mu_1,\mu_2}(y \mid x, \Delta t) = 0.25 + 0.25e^{-4\mu_2\Delta t} + 0.5e^{-2(\mu_1+\mu_2)\Delta t}, \quad \text{for } x = y \tag{8.4a}
\]

\[
P_{\mu_1,\mu_2}(y \mid x, \Delta t) =
\begin{cases}
0.25 + 0.25e^{-4\mu_2\Delta t} - 0.5e^{-2(\mu_1+\mu_2)\Delta t}, & \text{for } x \neq y \text{ specifying a transition} \\
0.25 - 0.25e^{-4\mu_2\Delta t}, & \text{for } x \neq y \text{ specifying a transversion.}
\end{cases} \tag{8.4b}
\]

Here µ1 and µ2 represent the transition rate and the transversion rate respectively.
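A small sketch of the Kimura two-parameter probabilities is given below. It uses the standard form of the model, in which the exponential term in µ2 carries a coefficient of 0.25, so that the probabilities of the one unchanged base, the one transition and the two transversions sum to one. The function name and the chosen ∆t of 30 days are illustrative assumptions.

```python
import math

PURINES = {"A", "G"}  # the remaining bases, C and T, are pyrimidines


def k80_prob(x, y, dt, mu1, mu2):
    """Probability that base x is observed as base y after time dt (Kimura, two parameters)."""
    a = math.exp(-4.0 * mu2 * dt)
    b = math.exp(-2.0 * (mu1 + mu2) * dt)
    if x == y:
        return 0.25 + 0.25 * a + 0.5 * b
    if (x in PURINES) == (y in PURINES):
        return 0.25 + 0.25 * a - 0.5 * b  # transition: purine<->purine or pyrimidine<->pyrimidine
    return 0.25 - 0.25 * a  # transversion: purine<->pyrimidine


# Sanity check: from any starting base, the four outcomes must sum to one.
total = sum(k80_prob("A", y, 30.0, mu1=1.5e-5, mu2=1e-6) for y in "AGCT")
```

At ∆t = 0 the function returns 1 for an unchanged base and 0 for any substitution, which is a quick check on the coefficients.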

Likelihood

In this section I outline the formulation of the likelihood used for the network inference, which follows Lau et al. (2015) with some differences. Consider an epidemic that is observed between t = 0 and t = tmax with a population of size N and a genomic sequence with n bases. Let χS be the set of farms remaining in class S, and χE, χI and χR the sets of farms that have gone through classes E, I and R respectively, by tmax. The exposure times and the times of becoming infectious and recovered are denoted by E, I and R respectively. The sojourn times in classes E and I have cumulative distribution functions FE and FI.

To formulate the likelihood, it is important to have information on the dominant pathogen strain at each exposed farm, preferably at multiple times. Therefore, let the mj sequences at farm j be represented by G.j = (G1,j, ..., Gmj,j). The corresponding sequence times t.j = (t1,j, ..., tmj,j) also include the observed sampling time $t^s_j$. The vector ψ specifies the transmission graph, which includes information on the source of infection for each farm.

In a multiple-cluster scenario, the likelihood is expressed using the complete data z = (E, I, R, G, ψ) and model parameters θ = (α, β, a, b, η, γ, κ, µ1, µ2, p) (see Table 8.1 for descriptions of the parameters).

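\[
\begin{aligned}
L(\theta; z) ={} & \prod_{j \in \chi_E^{-1}} P(j, \psi_j) \times \exp\{-q_j(E_j)\} \times \prod_{j \in \chi_S} \exp\{-q_j(t_{max})\} \\
& \times \prod_{j \in \chi_I} f_E(I_j - E_j; a, b) \times \prod_{j \in \chi_R} f_I(R_j - I_j; \gamma, \eta) \\
& \times \prod_{j \in \chi_{E \setminus I}} \{1 - F_E(t_{max} - E_j; a, b)\} \times \prod_{j \in \chi_{I \setminus R}} \{1 - F_I(t_{max} - I_j; \gamma, \eta)\} \\
& \times \prod_{j \in \chi_E} g(G_{2,j}, \ldots, G_{m_j,j} \mid t_{.j}, \psi_j, G_{1,j}) \times \prod_{j \in \chi_E} h(G_{1,j} \mid \psi_j)
\end{aligned} \tag{8.5}
\]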

In what follows, the terms in Equation 8.5 are explained in detail. $\chi_E^{-1}$ in Equation 8.5 denotes χE excluding the index (primary infected) case, while χE\I denotes the farms that are in the exposed class at tmax and hence not yet infectious (i.e. j ∈ χE and j ∉ χI). Similarly, χI\R represents the farms that are infectious but not recovered by tmax. The likelihood given in Equation 8.5 is the same as the one given in Lau et al. (2015). The only difference is the way we calculate P(j, ψj) and qj(s). P(j, ψj) refers to the contribution arising from the infection of j by a particular source ψj, which is:

\[
P(j, \psi_j) =
\begin{cases}
\alpha, & \text{if individual } j \text{ is a primary case} \\
\beta_s K(d_{\psi_j j}; \kappa) + \beta_m m_{\psi_j j}, & \text{if } \psi_j \in \chi_I \text{ at time } E_j.
\end{cases} \tag{8.6}
\]

Here mij denotes the number of movements from farm i to farm j. The term qj(s) gives the contribution arising from the survival of each farm until time s. For an exposed farm s = Ej, while for a non-exposed farm s = tmax. qj(s) is defined as follows:

\[
q_j(s) = \int_0^s \Big\{\alpha + \beta_s \sum_{i \in \xi_I(t)} K(d_{ij}; \kappa) + \beta_m \sum_{i \in \xi_I(t)} m_{ij}\Big\}\, dt \tag{8.7}
\]
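Because the set of infectious farms only changes at infection and removal times, the integrand of Equation 8.7 is piecewise constant, and the integral reduces to a sum over sources weighted by how long each one was infectious before time s. The sketch below illustrates this under two simplifying assumptions that are not stated in the text: each source i is described by a single tuple (start of infectiousness, removal time, kernel value, movement count), and mij is treated as constant over the infectious period.

```python
import math


def overlap(start, end, s):
    """Length of the intersection of [start, end) with [0, s]."""
    return max(0.0, min(end, s) - max(start, 0.0))


def cumulative_pressure(s, alpha, beta_s, beta_m, sources):
    """q_j(s) of Equation 8.7 for a piecewise-constant integrand.

    `sources` holds one tuple (t_inf, t_rem, K_ij, m_ij) per potential source farm i.
    """
    total = alpha * s  # background pressure acts over the whole interval
    for t_inf, t_rem, k_ij, m_ij in sources:
        total += (beta_s * k_ij + beta_m * m_ij) * overlap(t_inf, t_rem, s)
    return total


# Hypothetical sources for one farm j, evaluated at s = 12 days.
sources = [(2.0, 10.0, 0.5, 1), (8.0, 20.0, 0.1, 0)]
q = cumulative_pressure(12.0, alpha=0.0004, beta_s=0.005, beta_m=0.5, sources=sources)
survival = math.exp(-q)  # the factor exp{-q_j(s)} entering Equation 8.5
```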

The contributions to the likelihood arising from the sojourn times in classes E and I are represented by the second and third lines of Equation 8.5. For instance, a Gamma(a, b) density (i.e. fE(.; a, b)) is used for the latent period of a farm that has become infectious, while the corresponding cumulative distribution function (i.e. FE(.; a, b)) is used when the farm is in χE\I. A similar interpretation applies to the infectious period (fI(.; γ, η) or FI(.; γ, η)), which uses the Weibull(γ, η) distribution.

The first term in the last line of Equation 8.5 is the contribution to the likelihood from the sequence data:

\[
g(G_{2,j}, \ldots, G_{m_j,j} \mid t_{.j}, \psi_j, G_{1,j}) = \prod_{i=1}^{n} \prod_{k=1}^{m_j - 1} P_{\mu_1,\mu_2}\big(G^i_{k+1,j} \mid G^i_{k,j}, \Delta t = t_{k+1,j} - t_{k,j}\big) \tag{8.8}
\]

The term g(G2,j, ..., Gmj,j | t.j, ψj, G1,j) in Equation 8.8 gives the total probability of the mutations that occurred in the exposed individual j, given the infecting strain (G1,j) and the sampling times, using the Kimura model defined in Equation 8.4. $G^i_{k,j}$ refers to the nucleotide base at position i of sequence k on farm j.
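Since Equation 8.8 is a product over thousands of sites, it is usually evaluated on the log scale. Because the Kimura probabilities depend only on whether a site is unchanged, a transition or a transversion, one possible shortcut is to count the three site types between consecutive sequences. The sketch below takes that approach with the standard Kimura two-parameter probabilities; the function names and the toy sequences are hypothetical, and strictly increasing sample times are assumed.

```python
import math

PURINES = frozenset("AG")


def site_counts(seq_a, seq_b):
    """Count identical sites, transitions and transversions between two aligned sequences."""
    same = ts = tv = 0
    for x, y in zip(seq_a, seq_b):
        if x == y:
            same += 1
        elif (x in PURINES) == (y in PURINES):
            ts += 1
        else:
            tv += 1
    return same, ts, tv


def log_g(seqs, times, mu1, mu2):
    """Log of Equation 8.8 for one farm (assumes strictly increasing sample times)."""
    total = 0.0
    for k in range(len(seqs) - 1):
        dt = times[k + 1] - times[k]
        a = math.exp(-4.0 * mu2 * dt)
        b = math.exp(-2.0 * (mu1 + mu2) * dt)
        same, ts, tv = site_counts(seqs[k], seqs[k + 1])
        total += same * math.log(0.25 + 0.25 * a + 0.5 * b)
        if ts:
            total += ts * math.log(0.25 + 0.25 * a - 0.5 * b)
        if tv:
            total += tv * math.log(0.25 - 0.25 * a)
    return total


# Toy example: three samples from one farm, with one transition before the last sample.
value = log_g(["ACGT", "ACGT", "GCGT"], [0.0, 10.0, 25.0], mu1=1.5e-5, mu2=1e-6)
```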

The final term in Equation 8.5 is the contribution to the likelihood from the background pathogens of the primary cases. In a multiple-cluster scenario, it is necessary to obtain a distinct background sequence of the pathogen in each cluster, as this allows background infection and secondary infection to be distinguished in the Bayesian approach. Thus, the probability of a background sequence G1,j for a farm j is modelled as:

\[
h(G_{1,j} \mid \psi_j) =
\begin{cases}
\left(\dfrac{p}{3}\right)^{l_j} (1 - p)^{n - l_j}, & \text{if individual } j \text{ is a primary case} \\
1, & \text{if } \psi_j \in \chi_I
\end{cases} \tag{8.9}
\]

where lj is the number of bases in G1,j that differ from the given master sequence GM, and p is the probability of a single base in G1,j being different from the corresponding base in GM.
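Equation 8.9 can be evaluated directly on the log scale. The following sketch does so for a toy sequence; the function name, the four-base sequences and the value p = 0.2 are illustrative assumptions only.

```python
import math


def log_h(seq, master, p, is_primary):
    """Log of Equation 8.9 for the first sequence on a farm."""
    if not is_primary:
        return 0.0  # h = 1 when the farm was infected by another farm
    l = sum(x != y for x, y in zip(seq, master))  # bases differing from the master sequence
    n = len(master)
    return l * math.log(p / 3.0) + (n - l) * math.log(1.0 - p)


# Toy example: one of four bases differs from the master sequence.
value = log_h("ACGA", "ACGT", p=0.2, is_primary=True)
```

The division by 3 reflects that a differing base can be any one of the three alternatives to the master base.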

Now we obtain the joint posterior distribution p(θ, z|y) ∝ L(θ; z)p(θ) given the observed data y. Here z represents the complete data, which are reconstructed or imputed from y, and p(θ) represents the prior distribution of the model parameters.

Table 8.1: Key parameters in the Bayesian MCMC inference of the transmission network.

| Parameter | Type | Description |
|---|---|---|
| ψj, te[j], ti[j] | Vectors of latent variables | The source and timing of exposure and onset of infectiousness for each exposed site j. |
| tr[i] | Observed | The timing of removal (end of the infectious phase) for each infected premises, i, in the data set, at the commencement of the imposition of quarantine measures or depopulation. |
| t.j | Observed | The timing of sampling and available sequences for infected premises in the data set. |
| G.j = (G1,j, ..., Gmj,j) | Matrix of latent variables | The sequence on each infected premises at each sampling and transmission time, including for premises for which sequence data was unavailable and for those with sequence data available but at other time-points than when sampled. |
| dij | Observed | Euclidean distance between premises i and j. |
| α | Latent variable | The background rate of infection. |
| βs, βm | Latent variables | The secondary transmission rate partitioned into spatial (s) and animal movement (m) components. |
| κ | Latent variable | The power of the spatial transmission kernel. |
| µ1, µ2 | Latent variables | The rates of transitions and transversions based on Kimura’s 2 parameter nucleotide substitution model. |
| µlat, σlat | Latent variables | The mean and standard deviation of the duration of the farm-level latent period. |
| γ, η | Latent variables | Shape parameters for the Weibull distribution representing the farm-level infectious period. |
| p | Latent variable | The probability that a nucleotide base of each of the primary (seeding) sequences differs from the base at the corresponding site in the universal master sequence [for details see Lau et al. (2015)]. |

8.2.2 Model verification and pseudo-validation

Before using our method on actual data-sets, the modified model was first verified on outbreak data-sets simulated with the same underlying model structure. Our procedure follows the approach described in Lau et al. (2015). The accuracy of the method was assessed using 60 simulated outbreaks from each of twelve different scenarios (designated Sim1-Sim12). These twelve scenarios were purposely designed to investigate the effect of the sequence data and the movement data on the accuracy of an inferred transmission network. For this purpose, different proportions of the sequence data and of the movement data were randomly sampled and incorporated into the inference.

8.2.3 Model implementation

The modified joint Bayesian MCMC inference of the transmission tree was implemented on the genomic, epidemiological and animal movement data-set available at the time of analysis, with data augmentation for unobserved sequences and further parameters. The model was compiled from C++ source code and run on a parallel computing cluster with 4 chains (Lafayette et al., 2016). Each chain comprised 1 million MCMC iterations; the first 20% of each chain was discarded as burn-in and the remainder thinned by 1000. These settings were based on an assessment of convergence and autocorrelation, made visually, with Gelman and Rubin’s shrink factor (Gelman et al., 1992), and by calculating autocorrelation and effective sample size using Tracer (Rambaut et al., 2018). All unobserved parameters were given uninformative flat priors and imputed as described previously (Lau et al., 2015). The MCMC was initialised with a transmission tree whose initial sources were selected randomly from amongst those premises estimated to hold infectious animals at the estimated time of exposure of each IP (infected premises). If there were no potential sources at the estimated time of exposure of an IP, the proposed source for this IP was initialised with a value representing seeding from a non-observed IP. The initiating single universal master sequence was assumed to be the consensus sequence of all available partial genomic data analysed.
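The post-processing steps described above (burn-in removal, thinning and the shrink factor) can be illustrated with a short sketch. This is not the C++ implementation or Tracer; it uses independent normal draws as a stand-in for real MCMC output, and the basic between-chain and within-chain variance form of the Gelman and Rubin diagnostic, without the later split-chain or degrees-of-freedom refinements.

```python
import numpy as np


def burn_and_thin(chain, burn_frac=0.2, thin=1000):
    """Discard the first `burn_frac` of a chain and keep every `thin`-th draw."""
    start = int(len(chain) * burn_frac)
    return chain[start::thin]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains (rows = chains)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    between = n * chains.mean(axis=1).var(ddof=1)  # B: variance of the chain means
    within = chains.var(axis=1, ddof=1).mean()     # W: mean within-chain variance
    var_hat = (n - 1) / n * within + between / n
    return float(np.sqrt(var_hat / within))


# Stand-in for 4 chains of 1 million draws of a single parameter.
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 1_000_000))
kept = np.array([burn_and_thin(chain) for chain in raw])
r_hat = gelman_rubin(kept)
```

With these settings each chain retains 800 draws, and values of the shrink factor close to 1 are conventionally read as an indication that the chains have mixed.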

8.3 Results

Simulation studies

In this section we assess the accuracy of the inferred transmission network based on simulated epidemics, with the aim of evaluating the performance of the model under different scenarios. Specifically, we investigate the effect of having partial contact data as well as partial genetic data. To test our modifications, we first apply the model to an epidemic simulation in a population of size N = 105 (i.e. the total number of farms). The longitude and latitude coordinates for these 105 premises were selected to be similar to the actual farm locations in the NZ outbreak data set.

The epidemic begins at time zero (t = 0) and evolves according to Equation 8.3. The maximum time limit (tmax) for the epidemic is taken to be 150 days. We initially set α = 0.0004, βs = 0.005, βm = 0.5 and K(dij; κ) = exp(−1.5dij), and assume that the sojourn times in classes E and I (the latent and infectious periods) follow Gamma(8, 0.5) and Weibull(15, 10) distributions respectively. Beginning with three primary cases, we let the epidemic spread across the landscape according to Equation 8.3 while taking into account various degrees of random movement between farms that we specified in advance. Upon infection, pathogen sequences of length n = 7667 are transmitted with transition rate µ1 = 0.000015 and transversion rate µ2 = 0.000001.
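To connect the Gamma(8, 0.5) latent period with a mean and standard deviation, the short calculation below may help. It assumes a shape and scale parameterisation, which the text does not state; under a shape and rate reading the numbers would differ.

```python
import math


def gamma_mean_sd(shape, scale):
    """Mean and standard deviation of a Gamma distribution (shape-scale form)."""
    return shape * scale, math.sqrt(shape) * scale


# Assumed reading of Gamma(8, 0.5): shape 8, scale 0.5 (in days).
mu_lat, sigma_lat = gamma_mean_sd(8.0, 0.5)
```

Under this reading the simulated latent period has a mean of 4 days and a standard deviation of roughly 1.4 days.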

Based on the simulated epidemic, we gather the data: which farms were infected, their exposure times and the times at which they became infectious and recovered, which pairs of farms had contact through movement, the distances between farms, and samples of the sequences at the infected farms. From subsets of the available data, we try to infer all the links of “who infected whom” and then compare them with the true infecting sources in the simulation. In the inferred transmission graph, the proportion of transmission links predicted correctly is referred to as the coverage rate.
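The coverage rate defined above amounts to comparing two source assignments farm by farm. A minimal sketch follows, assuming the transmission graph is stored as a mapping from each infected farm to its source, with None marking a primary (background) case; this representation and the toy values are hypothetical.

```python
def coverage_rate(inferred, true):
    """Proportion of infected farms whose inferred source matches the true source."""
    correct = sum(inferred.get(farm) == source for farm, source in true.items())
    return correct / len(true)


# Toy example with four infected farms; None marks a primary (background) case.
true_sources = {1: 0, 2: 0, 3: 1, 4: None}
inferred_sources = {1: 0, 2: 1, 3: 1, 4: None}
rate = coverage_rate(inferred_sources, true_sources)
```

In the full analysis this quantity would be computed for each posterior sample of the transmission graph, giving a distribution of coverage rates.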

The coverage rate is an important indicator of how well the inferred network matches the true network. In sub-figures (A), (B) and (C) of Figure 8.3, we infer the transmission graph using different proportions of the contact (movement) data. In sub-figure (A), all the movement data are used with different percentages of genetic data. As shown by the purple histogram, the transmission network is typically recovered with a near-complete coverage rate, that is with 90-100% accuracy, when full information on the movement and genetic data is used. Similarly, sub-figures (B) and (C) show the coverage rates obtained with 50% of the movement data and with no movement data respectively. All these sub-figures make it clear that the more genetic sequences are available, the more accurate the inferred transmission graph is.

Figure 8.3: Posterior distribution of the overall coverage rate, i.e. the proportion of links predicted correctly in the network. (A) Using 100% contact-data. (B) Using 50% contact-data. (C) Using no contact-data. (D) Heat-map for the median accuracy of outbreak reconstruction. The colour of a grid point represents the average accuracy of outbreak reconstruction.

All the information in sub-figures (A), (B) and (C) is summarised in sub-figure (D) using the median coverage values under each scenario. For this particular simulation study, it can be seen that the information provided by the contact data improves the accuracy far more than the sampling proportion of the sequence data does. For example, having sequence data from all infected farms (i.e., 100% of the sequences) along with no contact data resulted in a coverage rate of 45%. However, this coverage rate was approximately doubled by including just 50% of the movement data. Thus, including movement data in the inference substantially increased the coverage rate of the inferred transmission network.

[Figure 8.4 image: posterior density panels for α, βs, βm, µlat, σlat, γ, η, κ, p, µ1 and µ2; vertical axes show density.]

Figure 8.4: Posterior distribution of model parameters. Red dotted lines represent the true value of the parameters

Figure 8.4 shows the posterior distributions of all the parameters of the model for the case where all epidemiological, genetic and movement data from all farms and from every infection are available. As seen in the figure, all parameters are accurately represented by their posterior distributions. However, the background transmission rate (α), the transmission rate via movements (βm), and the parameters representing the latent period of the disease (µlat, σlat) are slightly over-estimated. In summary, the simulation study indicates that our extension of Lau’s original model (by inclusion of movement data) is successful, as all the parameter estimates fall within credible intervals and are thus accurately represented.

8.4 Discussion

In this chapter, the work of Lau et al. (2015) was extended to incorporate animal movements between farms into the inference about the transmission network of an outbreak.

Lau et al. (2015) developed a statistically sound and computationally tractable Bayesian framework that reconstructs the transmission network using sequence and epidemiological data. The performance of Lau’s mechanistic model ranks amongst the best (Firestone et al., 2019a). However, this method did not use movement data or contact data between cases (i.e., individuals, animals, farms, etc.), which are considered among the most important factors governing early outbreak dissemination (Firestone et al., 2011). Thus, including movement data in the model is expected to improve the inferred transmission network. The method described here may also be used to untangle the network structure of an outbreak involving human populations, such as COVID-19.

The modified model was first tested and verified on a simulated epidemic with a moderate sample size (N = 150), which is comparable to the one used in Lau et al. (2015). The accuracy of the method was assessed using different scenarios of the simulated outbreak. These scenarios were purposely designed to investigate the effect of different sampling percentages of sequence data and movement data on the accuracy of an inferred transmission network. Based on the results, we showed that the accuracy of a transmission network was substantially improved by incorporating contact data into the model. For example, having 100% of the sequence data along with no contact data resulted in a coverage rate of 44%. This coverage rate was doubled by including just 50% of the movement data. However, in practice it is highly unlikely that 100% sequencing can be obtained. Even in situations where there is 100% sequencing, a transmission network built solely on sequence data would not be guaranteed to be accurate, specifically for pathogens with slow mutation rates. In addition, it was demonstrated that all model parameters were accurately inferred by their respective posterior distributions. Thus, the model described in this chapter is ready to be used on real outbreak data-sets. The intention is to next use the model to study the early stages of the 2017 Mycoplasma bovis outbreak in New Zealand. Mycoplasma bovis is a bacterium that causes mastitis, otitis and respiratory disease in cattle.

Chapter 9

Conclusion

Pimm et al. (2014) estimated the current rate of extinction to be around 1000 times higher than the natural background rate. It appears that the bonds that hold nature together may be at risk of unravelling, and this is likely to be due to factors such as deforestation, over-fishing, habitat loss, pollution, climate change, exotic species introductions or infectious diseases. Thus, now more than ever, ecologists need to understand how to monitor and ascertain species extinctions, and to develop better species conservation strategies. Epidemiologists, on the other hand, are more interested in understanding the factors that lead to disease mitigation or extinction. Even though these seem like opposite goals, both ecologists and epidemiologists are concerned with the persistence of species or susceptible hosts. The main focus of this thesis is to derive new mathematical and statistical models for studying extinctions of species and the disease transmission networks that are responsible for the spread of disease outbreaks. In addition, the work presented in this thesis shows that Bayesian approaches have a lot to offer in extinction and ecological modelling in general.

To examine multiple ways of improving species extinction models and infectious disease models, this thesis considered six main themes: (i) how subtle but qualitatively different ways of modelling uncertain sightings can introduce major differences in our understanding of species extinctions (Chapter 3); (ii) the methodological development of a hierarchical Bayesian approach for species extinction modelling with certain and uncertain sightings (Chapter 4); (iii) the importance of allowing for non-homogeneous sighting rates (Chapter 5); (iv) the development of a Bayesian Updating method for evaluating the probability that a species is extinct, based on a record of observations and surveys (Chapter 6); (v) the use of survival theory as a tool for estimating extinction times and the mean time to extinction (Chapter 7); (vi) the importance of including movement data in infectious disease modelling (Chapter 8). In this concluding chapter, the main findings of the previous chapters will be discussed together with their potential limitations.

9.1 Incorporating uncertainty into species extinction models

In practice, it is extremely difficult to determine whether a species has gone extinct or has just remained unobserved. However, historical sighting records contain significant information on a species’ existence that may be used for quantitative assessment of extinction. In recent years several new probabilistic models have been developed in order to infer species extinction using ecological sighting data. Solow and Beet (2014) developed two such models. The most remarkable property of these models is that they were able to incorporate statistical uncertainty in sighting data in a much more general way than previously attempted methods. In Chapter 3 we explored why the two methods give completely different conclusions concerning the extinction of the Ivory-billed Woodpecker. As discussed in that chapter, the first model focuses on the validity and invalidity of the sightings (Model 1), whereas the other focuses on the certainty and uncertainty of sightings (Model 2). As a result, it was found that the first model was more sensitive to the last uncertain sighting, while the second was more sensitive to the last certain sighting. For the Ivory-billed Woodpecker, if most of the uncertain sightings were truly valid, then the inference made under Model 2 would be incorrect. However, if most of the uncertain sightings were truly invalid, then the inference made under Model 1 would be incorrect. Thus, which of the two models should be used remains open for debate as long as the quality of the uncertain sightings is unknown. The work in Chapter 3 has been published in Kodikara et al. (2018).

9.2 Inferring extinction year using a Bayesian approach

Chapter 4 presented a first step towards developing a hierarchical computational Bayesian approach to model species extinction. The posterior distribution for extinction time is evaluated using the likelihood of the sighting data, priors for the model parameters and hyper-priors for the prior parameters. In this chapter, two main models were developed to understand the impact of including uncertain sightings. These models were then used on a real test case, the Ivory-billed Woodpecker. Our statistical analysis under both models rejected the null hypothesis and thus favoured the extinction of the Ivory-billed Woodpecker by 2010. Since the same inference was obtained under both the certain sighting model (Model 1, excluding uncertain sightings) and the combined certain/uncertain sighting model (Model 2, including uncertain sightings), it is evident that the inclusion of uncertain sightings did not change the inference about extinction for the Ivory-billed Woodpecker. From the results for other species’ sighting records (i.e. Nukupu’u (Hemignathus lucidus), Eskimo Curlew (Numenius borealis), Kaua’i ’akialoa (Hemignathus stejnegeri) and O’ahu ’Alauahio (Paroreomyza maculata)) and for artificially generated sighting records, it was found that the extinction year can be sensitive to the inclusion of uncertain sightings as well as to the model assumptions on the sighting probability. It was also shown that extinction is likely to occur at a change-point, where the probability distribution of a sequence of sightings differs from before to after the change-point. For example, if a relatively high rate of certain sightings occurs prior to the last certain sighting and none occur after it, this indicates that the last certain sighting is the change-point. However, the approaches developed in this chapter have two main limitations. Firstly, the models assume that the certain and uncertain sighting rates are constant by taking constant sighting probabilities, and are thus not appropriate for declining populations/sightings. Secondly, complex factors such as spatial heterogeneity are not allowed for in the models discussed in this chapter. The work described in Chapter 4 has been accepted for publication (Kodikara et al., 2020).

9.3 Modeling extinction of a species using non-homogeneous Poisson processes with a change-point

Previous methods that included uncertain sightings typically assumed that all sighting types (i.e. certain, valid uncertain, invalid uncertain) are derived from independent homogeneous Poisson processes. Hence, in Chapter 5 we proposed a new Bayesian model based on non-homogeneous Poisson processes for the sightings to infer species extinction. This approach can also be used to test whether the sighting rates for certain and uncertain sightings were decreasing, increasing or constant over the studied time period. The proposed method was applied to the sighting records of the black-footed ferret and the Ivory-billed Woodpecker (IBW). The null hypothesis that the black-footed ferret and the IBW are extant at the end of the sighting period was rejected under the model developed. For the IBW sightings, the posterior distribution indicated a decline in the certain sighting rate along with an increase in the uncertain sighting rate prior to extinction. For example, if one uses only certain sightings with a constant sighting rate assumption, then the posterior extinction probability for the IBW by the year 1948 is 0.97. But if the uncertain sightings are included with a constant rate assumption, then the probability is reduced to 0.75. Thus, the analysis indicates that the inclusion of uncertain sightings still favours extinction, but with a lower probability than before. The same probability is further reduced to 0.38 when the constant rate assumption is relaxed by allowing the process to be non-homogeneous. Under this scenario the extinction of the IBW by 1948 is questionable. These results make clear how each model assumption produces different posterior extinction probabilities. The work in Chapter 5 is currently under review in a journal related to ecological statistics and modelling.

9.4 Bayesian updating to estimate extinction from sequential observation data

Chapter 6 proposed a Bayesian Updating method for inferring extinction based on records of observations and surveys. In this approach, probabilities of extinction P(Et) were updated year-by-year as new data came to hand, by taking the (Bayesian) “posterior” in year t to be the “prior” in year t + 1. This resulted in a non-linear iterative model with a cumulative Bayes Factor which is also updated year-by-year. The model was used to analyse sighting data of a water bird, the Alaotra Grebe (Tachybaptus rufolavatus), collected by BirdLife International as given in Thompson et al. (2017). For the Alaotra Grebe, the cumulative Bayes Factors Bt took the values 2.837, 2.850, 2.864, 8.124 and 23.05 for the years 1990 to 1994 respectively, suggesting, with a rule-of-thumb threshold of R = 20, that extinction occurred on or before 1995. The model developed is very simple to implement in a standard spreadsheet, facilitating its adoption for routine application in organizations that use observation data to support evidence-based decisions regarding actions to protect species and their habitats. As with the model developed by Thompson et al. (2017), the Bayesian Updating model may be used to explore hypothetical scenarios, or to test ideas about investments in surveys. For instance, this model can be useful if questions arise regarding the trade-offs between more extensive surveys and new technologies that are more likely to detect a species when it is present. Most importantly, the methods outlined in this chapter indicate how acceptable levels of uncertainty may be quantified so that decisions about the allocation of resources may be made transparently and consistently. The work described in Chapter 6 is published in Thompson et al. (2019).

9.5 Using survival theory models to quantify extinctions

In Chapter 7, we introduced a new approach that uses survival theory as a tool for estimating extinction times and the mean time to extinction. In survival analysis, events such as the death of a patient or failure of a component are assumed to be unambiguous and there is only one such event for each subject. For example, survival analysis is used in medical sciences to estimate life expectancy and it is also widely used in engineering to estimate the time to failure of a component or system (Hosmer et al., 2002; Klein and Goel, 2013). These problems are similar to the problem in ecology of estimating the mean time to extinction of a species. Unlike the majority of previous related studies in the ecological literature, this approach does not rely on detailed historical sighting records. Specifically, this model is designed for situations in which a species has not been recorded for some period of time t after an assumed valid sighting (record) at a time referred to as t = 0. Thus, this approach is not designed to replace developments that use the full sighting history. However, the approach can still derive simple expressions for the mean time to extinction and associated confidence levels. Application of the model is illustrated by using it on information for dodos (Raphus cucullatus) and Aldabra snails (Rhachistia aldabrae). If one assumes a constant hazard rate, say h = 0.10, then with 95% confidence the Dodo was extinct by 1691. (The Hazard Function is defined as the probability that the species will be sighted within a small time interval, provided that it has survived until the beginning of that interval.) The estimated extinction date for the Dodo is in close agreement with an extinction time estimate of 1690 given by Roberts and Solow (2003). Similarly, for Aldabra snails, if one assumes a large hazard rate, say h = 0.35, to take account of systematic surveys after 2014, then with 95% confidence, extinction will occur on or before 2022. The work described in Chapter 7 is published in Thompson et al. (2020).
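To illustrate how a constant hazard rate translates into a waiting time at a given confidence level, the sketch below uses the textbook continuous-time exponential survival function S(t) = exp(-h t). This form is an assumption made here for illustration; Chapter 7 may use a different (for example discrete-time) formulation, so the output need not reproduce the dates quoted above. The function names are hypothetical.

```python
import math


def years_until_confidence(hazard, confidence=0.95):
    """Time t at which 1 - S(t) reaches `confidence`, with S(t) = exp(-hazard * t).

    Assumes a constant hazard rate (exponential survival model).
    """
    if hazard <= 0 or not 0 < confidence < 1:
        raise ValueError("hazard must be positive and confidence in (0, 1)")
    # Solve exp(-hazard * t) = 1 - confidence for t.
    return math.log(1.0 / (1.0 - confidence)) / hazard


def mean_time_to_extinction(hazard):
    """Mean of the exponential distribution with rate `hazard`."""
    if hazard <= 0:
        raise ValueError("hazard must be positive")
    return 1.0 / hazard


if __name__ == "__main__":
    for h in (0.10, 0.35):
        t = years_until_confidence(h)
        print(f"h = {h:.2f}: 95% level reached after {t:.1f} years, "
              f"mean time {mean_time_to_extinction(h):.1f} years")
```

Adding the returned waiting time to the year of the last accepted record gives the corresponding calendar year under these assumptions.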

9.6 A systematic Bayesian integration of epidemiological, genetic and movement data

Modelling the transmission network of infectious disease outbreaks is an active research area that was greatly influenced and spurred on by the widespread impact of foot-and-mouth disease in the United Kingdom in 2001 and many other outbreaks that followed. The main objective of these modelling approaches is to recover the epidemic transmission network, which identifies the route of infection (“who infected whom and how”). In most situations, genetic sequence data from pathogens sampled from infected hosts present a novel means for this investigation. However, modern state-of-the-art outbreak reconstruction methodologies combine both epidemiological and genetic sequence data. In Chapter 8, a Bayesian framework is developed to facilitate the integration of epidemiological, genetic and movement data to accurately infer the transmission network of an infectious disease outbreak. In this chapter, we extended the transmission network reconstruction method of Lau et al. (2015) so that it takes into account animal movements between farms. Lau et al. (2015) used a Bayesian MCMC technique to reconstruct the transmission network using sequence and epidemiological data of infected animals in a network of farms. However, the method did not use movement data or contact data between cases (i.e., individuals, animals, farms, etc.). For infectious diseases emerging in highly susceptible populations, contact data are considered amongst the most important factors governing early outbreak dissemination (Firestone et al., 2011). Hence, incorporating this valuable source of information can improve the inferences, as confirmed by a simulation study. For instance, having 50% of contact data along with 50% of sequence data provides higher accuracy than having no contact data and 75% of sequence data. This method is intended to be applied to the early stages of the 2017 Mycoplasma bovis outbreak in New Zealand. Mycoplasma bovis is a bacterium that causes mastitis, otitis and respiratory disease in cattle. The work described in Chapter 8 is currently in preparation for publication.

9.7 Conclusion

In conclusion, the focus of this thesis was to derive new mathematical and statistical models for studying species extinctions as well as models for studying infectious diseases. The models developed in this thesis collectively reveal the important contribution of Bayesian approaches. The new findings reinforce the value of incorporating additional information, such as uncertain sightings in the extinction context and movement data in infectious disease models.

Bibliography

Achcar, J. A., Rodrigues, E. R., Paulino, C. D., and Soares, P. (2010). Non-homogeneous poisson models with a change-point: An application to ozone peaks in mexico city. Environmental and Ecological Statistics, 17(4):521–541.

Akçakaya, H., Keith, D. A., Burgman, M., Butchart, S. H., Hoffmann, M., Regan, H. M., Harrison, I., and Boakes, E. (2017). Inferring extinctions iii: A cost-benefit framework for listing extinct species. Biological Conservation, 214:336–342.

Aldrin, M., Lyngstad, T., Kristoffersen, A., Storvik, B., Borgan, Ø., and Jansen, P. (2011). Modelling the spread of infectious salmon anaemia among salmon farms based on seaway distances between farms and genetic relationships between infectious salmon anaemia virus isolates. Journal of The Royal Society Interface, 8(62):1346–1356.

Alroy, J. (2014). A simple bayesian method of inferring extinction. Paleobiology, 40(4):584–607.

Barnosky, A. D., Matzke, N., Tomiya, S., Wogan, G. O., Swartz, B., Quental, T. B., Marshall, C., McGuire, J. L., Lindsey, E. L., Maguire, K. C., et al. (2011). Has the earth’s sixth mass extinction already arrived? Nature, 471(7336):51–57.

Battarbee, R. W. (2014). The rediscovery of the aldabra banded snail, Rhachistia aldabrae. Biology Letters, 10(10):20140771.

Berger, J. (1990). Persistence of different-sized populations: an empirical assessment of rapid extinctions in bighorn sheep. Conservation Biology, 4(1):91–98.

Berger, J. O. (1985). Prior information and subjective probability. In Statistical Decision Theory and Bayesian Analysis, pages 74–117. Springer.

Berger, J. O. and Bernardo, J. M. (1989). Estimating a product of means: Bayesian analysis with reference priors. Journal of the American Statistical Association, 84(405):200–207.

Berger, J. O. and Bernardo, J. M. (1992). Ordered group reference priors with application to the multinomial problem. Biometrika, 79(1):25–37.

Berger, J. O., Bernardo, J. M., and Mendoza, M. (1988). On priors that maximize expected information. Purdue University. Department of Statistics.

Berger, J. O., Bernardo, J. M., and Sun, D. (2012). Objective priors for discrete parameter spaces. Journal of the American Statistical Association, 107(498):636–648.

Berger, J. O., Bernardo, J. M., Sun, D., et al. (2015). Overall objective priors. Bayesian Analysis, 10(1):189–221.

Bernardo, J. M. (1979). Reference posterior distributions for bayesian inference. Journal of the Royal Statistical Society: Series B (Methodological), 41(2):113–128.

Bernardo, J. M. (2011). Integrated objective bayesian estimation and hypothesis testing. Bayesian Statistics, 9:1–68.

Boakes, E. H., McGowan, P. J., Fuller, R. A., Chang-qing, D., Clark, N. E., O’Connor, K., and Mace, G. M. (2010). Distorted views of biodiversity: spatial and temporal bias in species occurrence data. PLoS Biol, 8(6):e1000385.

Boakes, E. H., Rout, T. M., and Collen, B. (2015). Inferring species extinction: The use of sighting records. Methods in Ecology and Evolution, 6(6):678–687.

Bond, A. L., Carlson, C. J., and Burgio, K. R. (2019). Local extinctions of insular avifauna on the most remote inhabited island in the world. Journal of Ornithology, 160(1):49–60.

Brook, B. W., Buettel, J. C., and Jarić, I. (2019). A fast re-sampling method for using reliability ratings of sightings with extinction-date estimators. Ecology, 100(9):e02787.

Brook, B. W., Sleightholme, S. R., Campbell, C. R., and Buettel, J. C. (2018). Deficiencies in estimating the extinction date of the thylacine with mixed certainty data. Conservation Biology, 32(5):1195–1197.

Burgman, M. A., Grimson, R. C., and Ferson, S. (1995). Inferring threat from scientific collections. Conservation Biology, 9(4):923–928.

Campbell, F., Cori, A., Ferguson, N., and Jombart, T. (2019). Bayesian inference of transmission chains using timing of symptoms, pathogen genomes and contact data. PLoS Computational Biology, 15(3):e1006930.

Campbell, F., Didelot, X., Fitzjohn, R., Ferguson, N., Cori, A., and Jombart, T. (2018). Outbreaker2: A modular platform for outbreak reconstruction. BMC Bioinformatics, 19(11):1–8.

Carlson, C. J., Bond, A. L., and Burgio, K. R. (2018a). Estimating the extinction date of the thylacine with mixed certainty data. Conservation Biology, 32(2):477–483.

Carlson, C. J., Bond, A. L., and Burgio, K. R. (2018b). Reevaluating sighting models and moving beyond them to test and contextualize the extinction of the thylacine. Conservation Biology, 32(5):1198–1199.

Casella, G. and George, E. I. (1992). Explaining the gibbs sampler. The American Statistician, 46(3):167–174.

Cauchemez, S., Bhattarai, A., Marchbanks, T. L., Fagan, R. P., Ostroff, S., Ferguson, N. M., Swerdlow, D., Group, P. H. W., et al. (2011). Role of social networks in shaping disease transmission during a community outbreak of 2009 h1n1 pandemic influenza. Proceedings of the National Academy of Sciences, 108(7):2825–2830.

Cauchemez, S., Boëlle, P.-Y., Donnelly, C. A., Ferguson, N. M., Thomas, G., Leung, G. M., Hedley, A. J., Anderson, R. M., and Valleron, A.-J. (2006). Real-time estimates in early detection of sars. Emerging Infectious Diseases, 12(1):110.

Cauchemez, S. and Ferguson, N. M. (2011). Methods to infer transmission risk factors in complex outbreak data. Journal of the Royal Society Interface, 9(68):456–469.

Ceballos, G., Ehrlich, P. R., Barnosky, A. D., García, A., Pringle, R. M., and Palmer, T. M. (2015). Accelerated modern human–induced species losses: Entering the sixth mass extinction. Science Advances, 1(5):e1400253.

Ceballos, G., García, A., and Ehrlich, P. R. (2010). The sixth extinction crisis: Loss of animal populations and species. Journal of Cosmology, 8(1821):31.

Chib, S. and Greenberg, E. (1995). Understanding the metropolis-hastings algorithm. The American Statistician, 49(4):327–335.

Clements, C., Collen, B., Blackburn, T., and Petchey, O. (2014). Recent environmental change may affect accurate inference of extinction. Conservation Biology, 28:971–981.

Cohen, J. (2020). Scientists are racing to model the next moves of a coronavirus that’s still hard to predict. Science, 7.

Collins, M. D. (2017). Video evidence and other information relevant to the conservation of the ivory-billed woodpecker (Campephilus principalis). Heliyon, 3(1):e00230.

Collins, M. D. (2018). Using a drone to search for the ivory-billed woodpecker (Campephilus principalis). Drones, 2(1):11.

Consonni, G., Fouskakis, D., Liseo, B., Ntzoufras, I., et al. (2018). Prior distributions for objective bayesian analysis. Bayesian Analysis, 13(2):627–679.

Cottam, E. M., Thébaud, G., Wadsworth, J., Gloster, J., Mansley, L., Paton, D. J., King, D. P., and Haydon, D. T. (2008a). Integrating genetic and epidemiological data to determine transmission pathways of foot-and-mouth disease virus. Proceedings of the Royal Society B: Biological Sciences, 275(1637):887–895.

Cottam, E. M., Wadsworth, J., Shaw, A. E., Rowlands, R. J., Goatley, L., Maan, S., Maan, N. S., Mertens, P. P., Ebert, K., Li, Y., et al. (2008b). Transmission pathways of foot-and-mouth disease virus in the united kingdom in 2007. PLoS Pathogens, 4(4).

Daily, G. C. and Matson, P. A. (2008). Ecosystem services: From theory to implementation. Proceedings of the National Academy of Sciences, 105(28):9455–9456.

Datta, G. S. and Mukerjee, R. (2012). Probability matching priors: Higher order asymptotics, volume 178. Springer Science & Business Media.

Dawid, A. (2014). Invariant prior distributions. Wiley StatsRef: Statistics Reference Online.

De Maio, N., Wu, C.-H., and Wilson, D. J. (2016). Scotti: Efficient reconstruction of transmission within outbreaks with the structured coalescent. PLoS Computational Biology, 12(9):e1005130.

Díaz, S., Settele, J., Brondizio, E., Ngo, H., Guèze, M., Agard, J., Arneth, A., Balvanera, P., Brauman, K., Butchart, S., et al. (2019). Summary for policymakers of the global assessment report on biodiversity and ecosystem services–unedited advance version. Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES), Bonn, Germany.

Didelot, X., Fraser, C., Gardy, J., and Colijn, C. (2017). Genomic infectious disease epidemiology in partially sampled and ongoing outbreaks. Molecular Biology and Evolution, 34(4):997–1007.

Didelot, X., Gardy, J., and Colijn, C. (2014). Bayesian inference of infectious disease transmission from whole-genome sequence data. Molecular Biology and Evolution, 31(7):1869–1879.

Dirzo, R. and Raven, P. H. (2003). Global state of biodiversity and loss. Annual Review of Environment and Resources, 28.

Dobson, A. and Lyles, A. (2000). Black-footed ferret recovery. Science, 288(5468):985–988.

Earn, D. J., Rohani, P., and Grenfell, B. T. (1998). Persistence, chaos and synchrony in ecology and epidemiology. Proceedings of the Royal Society of London. Series B: Biological Sciences, 265(1390):7–10.

Ehrlich, P. R. and Ehrlich, A. H. (2013). Can a collapse of global civilization be avoided? Proceedings of the Royal Society B: Biological Sciences, 280(1754):20122845.

Elphick, C. S., Roberts, D. L., and Reed, J. M. (2010). Estimated dates of recent extinctions for north american and hawaiian . Biological Conservation, 143(3):617–624.

Eyre, D. W., Golubchik, T., Gordon, N. C., Bowden, R., Piazza, P., Batty, E. M., Ip, C. L., Wilson, D. J., Didelot, X., O’Connor, L., et al. (2012). A pilot study of rapid benchtop sequencing of staphylococcus aureus and clostridium difficile for outbreak detection and surveillance. BMJ Open, 2(3):e001124.

Fader, P. S., Hardie, B. G., and Shang, J. (2010). Customer-base analysis in a discrete-time noncontractual setting. Marketing Science, 29(6):1086–1108.

Faye, O., Boëlle, P.-Y., Heleze, E., Faye, O., Loucoubar, C., Magassouba, N., Soropogui, B., Keita, S., Gakou, T., Koivogui, L., et al. (2015). Chains of transmission and control of ebola virus disease in conakry, guinea, in 2014: An observational study. The Lancet Infectious Diseases, 15(3):320–326.

Ferguson, N. M., Donnelly, C. A., and Anderson, R. M. (2001). Transmission intensity and impact of control policies on the foot and mouth epidemic in great britain. Nature, 413(6855):542.

Firestone, S. M., Hayama, Y., Bradhurst, R., Yamamoto, T., Tsutsui, T., and Stevenson, M. A. (2019a). Reconstructing foot-and-mouth disease outbreaks: A methods comparison of transmission network models. Scientific Reports, 9(1):4809.

Firestone, S. M., Hayama, Y., Lau, M. S., Yamamoto, T., Nishi, T., Bradhurst, R., Demirhan, H., Stevenson, M. A., and Tsutsui, T. (2019b). Transmission network reconstruction for foot-and-mouth disease outbreaks incorporating farm-level covariates. bioRxiv, page 835421.

Firestone, S. M., Ward, M. P., Christley, R. M., and Dhand, N. K. (2011). The importance of location in contact networks: Describing early epidemic spread using spatial social network analysis. Preventive Veterinary Medicine, 102(3):185–195.

Fisher, D. O. and Blomberg, S. P. (2011). Correlates of rediscovery and the detectability of extinction in mammals. Proceedings of the Royal Society of London B: Biological Sciences, 278(1708):1090–1097.

Fisher, D. O. and Blomberg, S. P. (2012). Inferring extinction of mammals from sighting records, threats, and biological traits. Conservation Biology, 26(1):57–67.

Fitzpatrick, J. W., Lammertink, M., Luneau, M. D., Gallagher, T. W., Harrison, B. R., Sparling, G. M., Rosenberg, K. V., Rohrbaugh, R. W., Swarthout, E. C. H., Wrege, P. H., Swarthout, S. B., Dantzker, M. S., Charif, R. A., Barksdale, T. R., Remsen, J. V., Simon, S. D., and Zollner, D. (2005). Ivory-billed woodpecker (Campephilus principalis) persists in continental North America. Science, 308(5727):1460–1462.

Gelman, A., Rubin, D. B., et al. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7(4):457–472.

Gerlach, J. (2007). Short-term climate change and the extinction of the snail Rhachistia aldabrae. Biology Letters, 3(5):581–585.

Giardina, F., Romero-Severson, E. O., Albert, J., Britton, T., and Leitner, T. (2017). Inference of transmission network structure from hiv phylogenetic trees. PLoS Computational Biology, 13(1):e1005316.

Gibson, G. J. and Renshaw, E. (1998). Estimating parameters in stochastic compartmental models using markov chain methods. Mathematical Medicine and Biology: A Journal of the IMA, 15(1):19–40.

Grenfell, B. T., Pybus, O. G., Gog, J. R., Wood, J. L., Daly, J. M., Mumford, J. A., and Holmes, E. C. (2004). Unifying the epidemiological and evolutionary dynamics of pathogens. Science, 303(5656):327–332.

Gu, W., Heikkilä, R., and Hanski, I. (2002). Estimating the consequences of habitat fragmentation on extinction risk in dynamic landscapes. Landscape ecology, 17(8):699–710.

Guarnaccia, C., Quartieri, J., Tepedino, C., and Rodrigues, E. R. (2015). An analysis of airport noise data using a non-homogeneous poisson model with a change-point. Applied Acoustics, 91:33–39.

Hall, M., Woolhouse, M., and Rambaut, A. (2015). Epidemic reconstruction in a phylogenetics framework: Transmission trees as partitions of the node set. PLoS Computational Biology, 11(12):e1004613.

Hall, M. D. and Colijn, C. (2019). Transmission trees on a known pathogen phylogeny: Enumeration and sampling. Molecular Biology and Evolution, 36(6):1333–1343.

Hall, P., Wang, J. Z., et al. (1999). Estimating the end-point of a probability distribution using minimum-distance methods. Bernoulli, 5(1):177–189.

Hastings, W. K. (1970). Monte carlo sampling methods using markov chains and their applications.

Hayama, Y., Firestone, S. M., Stevenson, M. A., Yamamoto, T., Nishi, T., Shimizu, Y., and Tsutsui, T. (2019). Reconstructing a transmission network and identifying risk factors of secondary transmissions in the 2010 foot-and-mouth disease outbreak in japan. Transboundary and Emerging Diseases, 66(5):2074–2086.

Haydon, D. T., Chase-Topping, M., Shaw, D., Matthews, L., Friar, J., Wilesmith, J., and Woolhouse, M. (2003). The construction and analysis of epidemic trees with reference to the 2001 uk foot–and–mouth outbreak. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1511):121–127.

Heijne, J. C., Rondy, M., Verhoef, L., Wallinga, J., Kretzschmar, M., Low, N., Koopmans, M., and Teunis, P. F. (2012). Quantifying transmission of norovirus during an outbreak. Epidemiology, pages 277–284.

Hens, N., Calatayud, L., Kurkela, S., Tamme, T., and Wallinga, J. (2012). Robust reconstruction and analysis of outbreak data: Influenza a (h1n1) v transmission in a school-based population. American Journal of Epidemiology, 176(3):196–203.

Ho, C.-H. (1991). Nonhomogeneous poisson model for volcanic eruptions. Mathematical Geology, 23(2):167–173.

Holdaway, R. N. (1999). A spatio-temporal model for the invasion of the new zealand archipelago by the pacific rat rattus exulans. Journal of the Royal Society of New Zealand, 29(2):91–105.

Holmes, E. C., Nee, S., Rambaut, A., Garnett, G. P., and Harvey, P. H. (1995). Revealing the history of infectious disease epidemics through phylogenetic trees. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 349(1327):33–40.

Hosmer, D. W., Lemeshow, S., May, S., et al. (2002). Applied survival analysis: Regression modeling of time to event data. Wiley New York, NY.

Jackson, J. A. (2004). In search of the Ivory-billed woodpecker. Smithsonian Books.

Jarić, I. and Ebenhard, T. (2010). A method for inferring extinction based on sighting records that change in frequency over time. Wildlife Biology, 16(3):267–275.

Jarić, I. and Roberts, D. L. (2014). Accounting for observation reliability when inferring extinction based on sighting records. Biodiversity and Conservation, 23(11):2801–2815.

Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge university press.

Jeffreys, H. (1998). The Theory of Probability. OUP Oxford.

Jewell, C. P., Keeling, M. J., and Roberts, G. O. (2009a). Predicting undetected infections during the 2007 foot-and-mouth disease outbreak. Journal of the Royal Society Interface, 6(41):1145–1151.

Jewell, C. P., Kypraios, T., Christley, R., and Roberts, G. O. (2009b). A novel approach to real-time risk prediction for emerging infectious diseases: A case study in avian influenza h5n1. Preventive Veterinary Medicine, 91(1):19–28.

Jewell, C. P. and Roberts, G. O. (2012). Enhancing bayesian risk prediction for epidemics using contact tracing. Biostatistics, 13(4):567–579.

Jombart, T., Cori, A., Didelot, X., Cauchemez, S., Fraser, C., and Ferguson, N. (2014). Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data. PLoS Computational Biology, 10(1):e1003457.

Jombart, T., Eggo, R., Dodd, P., and Balloux, F. (2011). Reconstructing disease outbreaks from genetic data: A graph approach. Heredity, 106(2):383.

Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430):773–795.

Kass, R. E. and Wasserman, L. (1995). A reference bayesian test for nested hypotheses and its relationship to the schwarz criterion. Journal of the American Statistical Association, 90(431):928–934.

Keith, D. A., Butchart, S. H., Regan, H. M., Harrison, I., Akçakaya, H. R., Solow, A. R., and Burgman, M. A. (2017). Inferring extinctions i: A structured method using information on threats. Biological Conservation, 214:320–327.

Kermack, W. O. and McKendrick, A. G. (1927). A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London. Series A, Containing papers of a mathematical and physical character, 115(772):700–721.

Klein, J. P. and Goel, P. K. (2013). Survival analysis: State of the art, volume 211. Springer Science & Business Media.

Klinkenberg, D., Backer, J. A., Didelot, X., Colijn, C., and Wallinga, J. (2017). Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks. PLoS Computational Biology, 13(5):e1005495.

Kodikara, S., Demirhan, H., and Stone, L. (2018). Inferring about the extinction of a species using certain and uncertain sightings. Journal of Theoretical Biology, 442:98–109.

Kodikara, S., Demirhan, H., Wang, Y., Solow, A., and Stone, L. (2020). Inferring extinction year using a bayesian approach. Methods in Ecology and Evolution, 11(8):964–973.

Köser, C. U., Holden, M. T., Ellington, M. J., Cartwright, E. J., Brown, N. M., Ogilvy-Stuart, A. L., Hsu, L. Y., Chewapreecha, C., Croucher, N. J., Harris, S. R., et al. (2012). Rapid whole-genome sequencing for investigation of a neonatal mrsa outbreak. New England Journal of Medicine, 366(24):2267–2275.

Kroese, D. P., Taimre, T., and Botev, Z. I. (2013). Handbook of Monte Carlo methods, volume 706. John Wiley & Sons.

Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.

Kruschke, J. K. and Liddell, T. M. (2018). Bayesian data analysis for newcomers. Psychonomic Bulletin & Review, 25(1):155–177.

Kuhnert, P. M., Martin, T. G., and Griffiths, S. P. (2010). A guide to eliciting and using expert knowledge in bayesian ecological models. Ecology letters, 13(7):900–914.

La Barbera, A. and Spagnolo, B. (2002). Spatio-temporal patterns in population dynamics. Physica A: Statistical Mechanics and its Applications, 314(1-4):120–124.

Lafayette, L., Sauter, G., Vu, L., and Meade, B. (2016). Spartan performance and flexibility: An hpc-cloud chimera. OpenStack Summit, Barcelona.

Laplace, P. S. (1986). Memoir on the probability of the causes of events. Statistical Science, 1(3):364–378.

Lau, M. S., Dalziel, B. D., Funk, S., McClelland, A., Tiffany, A., Riley, S., Metcalf, C. J. E., and Grenfell, B. T. (2017). Spatial and temporal dynamics of superspreading events in the 2014–2015 west africa ebola epidemic. Proceedings of the National Academy of Sciences, 114(9):2337–2342.

Lau, M. S., Marion, G., Streftaris, G., and Gibson, G. (2015). A systematic bayesian integration of epidemiological and genetic data. PLoS Computational Biology, 11(11):e1004633.

Lee, M. D. and Wagenmakers, E.-J. (2014). Bayesian cognitive modeling: A practical course. Cambridge university press.

Lee, T. E. (2014). A simple numerical tool to infer whether a species is extinct. Methods in Ecology and Evolution, 5(8):791–796.

Lee, T. E., Black, S. A., Fellous, A., Yamaguchi, N., Angelici, F. M., Al Hikmani, H., Reed, J. M., Elphick, C. S., and Roberts, D. L. (2015). Assessing uncertainty in sighting records: An example of the Barbary lion. PeerJ, 3:e1224.

Lee, T. E., Bowman, C., and Roberts, D. L. (2017a). Are extinction opinions extinct? PeerJ, 5:e3663.

Lee, T. E., Fisher, D. O., Blomberg, S. P., and Wintle, B. A. (2017b). Extinct or still out there? disentangling influences on extinction and rediscovery helps to clarify the fate of species on the edge. Global change biology, 23(2):621–634.

Lee, T. E., McCarthy, M. A., Wintle, B. A., Bode, M., Roberts, D. L., and Burgman, M. A. (2014). Inferring extinctions from sighting records of variable reliability. Journal of Applied Ecology, 51(1):251–258.

Liu, J., Lim, S. L., Ruan, Y., Ling, A. E., Ng, L. F., Drosten, C., Liu, E. T., Stanton, L. W., and Hibberd, M. L. (2005). Sars transmission pattern in singapore reassessed by viral sequence variation analysis. PLoS Medicine, 2(2):e43.

Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., and Getz, W. M. (2005). Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066):355.

Mace, G. M., Norris, K., and Fitter, A. H. (2012). Biodiversity and ecosystem services: A multilayered relationship. Trends in Ecology & Evolution, 27(1):19–26.

Marshall, C. R. (1990). Confidence intervals on stratigraphic ranges. Paleobiology, 16(1):1–10.

McCarthy, M. A. (1998). Identifying declining and threatened species with museum data. Biological Conservation, 83(1):9–17.

Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092.

Miller Jr, R. G. (2011). Survival analysis, volume 66. John Wiley & Sons.

Mollentze, N., Nel, L. H., Townsend, S., Le Roux, K., Hampson, K., Haydon, D. T., and Soubeyrand, S. (2014). A bayesian approach for inferring the dynamics of partially observed endemic infectious diseases from space-time-genetic data. Proceedings of the Royal Society B: Biological Sciences, 281(1782):20133251.

Morelli, M. J., Thébaud, G., Chadœuf, J., King, D. P., Haydon, D. T., and Soubeyrand, S. (2012). A bayesian inference framework to reconstruct transmission trees using epidemiological and genetic data. PLoS Computational Biology, 8(11).

Mutreja, A., Kim, D. W., Thomson, N. R., Connor, T. R., Lee, J. H., Kariuki, S., Croucher, N. J., Choi, S. Y., Harris, S. R., Lebens, M., et al. (2011). Evidence for several waves of global transmission in the seventh cholera pandemic. Nature, 477(7365):462.

Nübel, U., Dordel, J., Kurt, K., Strommenger, B., Westh, H., Shukla, S. K., Žemličková, H., Leblois, R., Wirth, T., Jombart, T., et al. (2010). A timescale for evolution, population expansion, and spatial spread of an emerging clone of methicillin-resistant staphylococcus aureus. PLoS Pathogens, 6(4):e1000855.

O’Hagan, A. (1995). Fractional bayes factors for model comparison. Journal of the Royal Statistical Society: Series B (Methodological), pages 99–138.

O’Hagan, A. and Forster, J. J. (2004). Kendall’s advanced theory of statistics, volume 2B: Bayesian inference, volume 2. Arnold.

O’Neill, P. D. and Roberts, G. O. (1999). Bayesian inference for partially observed stochastic epidemics. Journal of the Royal Statistical Society: Series A (Statistics in Society), 162(1):121–129.

Pimm, S. L., Jenkins, C. N., Abell, R., Brooks, T. M., Gittleman, J. L., Joppa, L. N., Raven, P. H., Roberts, C. M., and Sexton, J. O. (2014). The biodiversity of species and their rates of extinction, distribution, and protection. Science, 344(6187):1246752.

Punt, A. E. and Hilborn, R. (1997). Fisheries stock assessment and decision analysis: the bayesian approach. Reviews in fish biology and fisheries, 7(1):35–63.

Pybus, O. G. and Rambaut, A. (2009). Evolutionary analysis of the dynamics of viral infectious disease. Nature Reviews Genetics, 10(8):540–550.

Rambaut, A., Drummond, A. J., Xie, D., Baele, G., and Suchard, M. A. (2018). Posterior summarization in bayesian phylogenetics using tracer 1.7. Systematic Biology, 67(5):901–904.

Rao, C. R., Rao, C., and Govindaraju, V. (2006). Handbook of statistics. Elsevier.

Rasmussen, D. A., Ratmann, O., and Koelle, K. (2011). Inference for nonlinear epidemiological models using genealogies and time series. PLoS Computational Biology, 7(8).

Rigdon, S. E. and Basu, A. P. (1989). The power law process: A model for the reliability of repairable systems. Journal of Quality Technology, 21(4):251–260.

Rivadeneria, M., Hunt, G., and Roy, K. (2009). The use of sighting records to infer species extinctions: An evaluation of different methods. Ecology, 90(1):1291–1300.

Robert, C. (2007). The Bayesian choice: From decision-theoretic foundations to computational implementation. Springer Science & Business Media.

Roberts, D. L., Elphick, C. S., and Reed, J. M. (2010). Identifying anomalous reports of putatively extinct species and why it matters: Contributed paper. Conservation Biology, 24(1):189–196.

Roberts, D. L. and Solow, A. R. (2003). Flightless birds: When did the dodo become extinct? Nature, 426(6964):245.

Robson, D. and Whitlock, J. (1964). Estimation of a truncation point. Biometrika, 51(1/2):33–39.

Rout, T. M., Salomon, Y., and McCarthy, M. A. (2009). Using sighting records to declare eradication of an invasive species. Journal of Applied Ecology, 46(1):110–117.

Ruan, Y., Wei, C. L., Ling, A. E., Vega, V. B., Thoreau, H., Thoe, S. Y. S., Chia, J.-M., Ng, P., Chiu, K. P., Lim, L., et al. (2003). Comparative full-length genome sequence analysis of 14 sars coronavirus isolates and common mutations associated with putative origins of infection. The Lancet, 361(9371):1779–1785.

Schönbrodt, F. D. and Wagenmakers, E.-J. (2018). Bayes factor design analysis: Planning for compelling evidence. Psychonomic bulletin & review, 25(1):128–142.

Sibley, D. A., Bevier, L. R., Patten, M. A., and Elphick, C. S. (2006). Comment on ”ivory-billed woodpecker (Campephilus principalis) persists in continental north america”. Science, 311(5767):1555–1555.

Sibley, D. A., Bevier, L. R., Patten, M. A., and Elphick, C. S. (2007). Ivory-billed or pileated woodpecker? Science, 315(5818):1495–1496.

Singer, J. D. and Willett, J. B. (1993). It’s about time: Using discrete-time survival analysis to study duration and the timing of events. Journal of Educational Statistics, 18(2):155–195.

Skums, P., Zelikovsky, A., Singh, R., Gussler, W., Dimitrova, Z., Knyazev, S., Mandric, I., Ramachandran, S., Campo, D., Jha, D., et al. (2018). Quentin: Reconstruction of disease transmissions from viral quasispecies genomic data. Bioinformatics, 34(1):163–170.

Smith, R. L. and Weissman, I. (1985). Maximum likelihood estimation of the lower tail of a probability distribution. Journal of the Royal Statistical Society: Series B (Methodological), 47(2):285–298.

Soetens, L., Klinkenberg, D., Swaan, C., Hahné, S., and Wallinga, J. (2018). Real-time estimation of epidemiologic parameters from contact tracing data during an emerging infectious disease outbreak. Epidemiology, 29(2):230–236.

Solow, A., Smith, W., Burgman, M., Rout, T., Wintle, B., and Roberts, D. (2012). Uncertain Sightings and the extinction of the Ivory-billed woodpecker. Conservation Biology, 26(1):180–184.

Solow, A. R. (1993a). Inferring extinction from sighting data. Ecology, 74(3):962–964.

Solow, A. R. (1993b). Inferring extinction in a declining population. Journal of Mathematical Biology, 32(1):79–82.

Solow, A. R. (2005). Inferring extinction from a sighting record. Mathematical Biosciences, 195(1):47–55.

Solow, A. R. (2016). A simple Bayesian method of inferring extinction: Comment. Ecology, 97(3):796–798.

Solow, A. R. and Beet, A. R. (2014). On uncertain sightings and inference about extinction. Conservation Biology, 28(4):1119–1123.

Solow, A. R. and Roberts, D. L. (2003). A nonparametric test for extinction based on a sighting record. Ecology, 84(5):1329–1332.

Stadler, T. (2009). On incomplete sampling under birth–death models and connections to the sampling-based coalescent. Journal of Theoretical Biology, 261(1):58–66.

Strauss, D. and Sadler, P. M. (1989). Classical confidence intervals and Bayesian probability estimates for ends of local taxon ranges. Mathematical Geology, 21(4):411–427.

Teunis, P., Heijne, J. C., Sukhrie, F., van Eijkeren, J., Koopmans, M., and Kretzschmar, M. (2013). Infectious disease transmission as a forensic problem: Who infected whom? Journal of the Royal Society Interface, 10(81):20120955.

Thompson, C., Lee, T., Stone, L., McCarthy, M., and Burgman, M. (2013a). Inferring extinction risks from sighting records. Journal of Theoretical Biology, 338:16–22.

Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2019). Bayesian updating to estimate extinction from sequential observation data. Biological Conservation, 229:26–29.

Thompson, C. J., Kodikara, S., Burgman, M. A., Demirhan, H., and Stone, L. (2020). Using survival theory models to quantify extinctions. Biological Conservation, 241:108345.

Thompson, C. J., Koshkina, V., Burgman, M. A., Butchart, S. H., and Stone, L. (2017). Inferring extinctions II: A practical, iterative model based on records and surveys. Biological Conservation, 214:328–335.

Thompson, C. J., Lee, T. E., Stone, L., McCarthy, M. A., and Burgman, M. A. (2013b). Inferring extinction risks from sighting records. Journal of Theoretical Biology, 338(August):16–22.

Tittensor, D. P., Walpole, M., Hill, S. L., Boyce, D. G., Britten, G. L., Burgess, N. D., Butchart, S. H., Leadley, P. W., Regan, E. C., Alkemade, R., et al. (2014). A mid-term analysis of progress toward international biodiversity targets. Science, 346(6206):241–244.

U.S. Fish and Wildlife Service (2010). Recovery plan for the Ivory-billed woodpecker (Campephilus principalis). US Fish and Wildlife Service, Southeast Region.

van der Linden, S. and Chryst, B. (2017). No need for Bayes factors: A fully Bayesian evidence synthesis. Frontiers in Applied Mathematics and Statistics, 3:12.

Volz, E. M., Pond, S. L. K., Ward, M. J., Brown, A. J. L., and Frost, S. D. (2009). Phylodynamics of infectious disease epidemics. Genetics, 183(4):1421–1430.

Walker, T. M., Ip, C. L., Harrell, R. H., Evans, J. T., Kapatai, G., Dedicoat, M. J., Eyre, D. W., Wilson, D. J., Hawkey, P. M., Crook, D. W., et al. (2013). Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: A retrospective observational study. The Lancet Infectious Diseases, 13(2):137–146.

Wallinga, J. and Lipsitch, M. (2006). How generation intervals shape the relationship between growth rates and reproductive numbers. Proceedings of the Royal Society B: Biological Sciences, 274(1609):599–604.

Wallinga, J. and Teunis, P. (2004). Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160(6):509–516.

Walters, C. and Ludwig, D. (1994). Calculation of Bayes posterior probability distributions for key population parameters. Canadian Journal of Fisheries and Aquatic Sciences, 51(3):713–722.

Wilcove, D. S. (2005). Rediscovery of the ivory-billed woodpecker. Science, 308(5727):1422–1423.

Wintle, B. A., Walshe, T. V., Parris, K. M., and McCarthy, M. A. (2012). Designing occupancy surveys and interpreting non-detection when observations are imperfect. Diversity and Distributions, 18(4):417–424.

Wisely, S. M., Santymire, R. M., Livieri, T. M., Mueting, S. A., and Howard, J. (2008). Genotypic and phenotypic consequences of reintroduction history in the black-footed ferret (Mustela nigripes). Conservation Genetics, 9(2):389–399.

Worby, C. J., O’Neill, P. D., Kypraios, T., Robotham, J. V., De Angelis, D., Cartwright, E. J., Peacock, S. J., and Cooper, B. S. (2016). Reconstructing transmission trees for communicable diseases using densely sampled genetic data. The Annals of Applied Statistics, 10(1):395.

Yang, Z. (2006). Computational molecular evolution. Oxford University Press.

Ye, K. and Berger, J. O. (1991). Noninformative priors for inferences in exponential regression models. Biometrika, 78(3):645–656.

Ypma, R. J., van Ballegooijen, W. M., and Wallinga, J. (2013). Relating phylogenetic trees to transmission trees of infectious disease outbreaks. Genetics, 195(3):1055–1062.

Appendix A. MCMC diagnostic plots for Chapter 4

Convergence diagnostics for IBW Model 1

Figure 1: MCMC diagnostic plots for IBW. (a) Convergence diagnostics for parameter $\tau_E$. (b) Convergence diagnostics for parameter $\theta$. (c) Convergence diagnostics for parameter $p_c$.


Convergence diagnostics for IBW Model 2

Figure 2: MCMC diagnostic plots for IBW. (a) Convergence diagnostics for parameter $\tau_E$. (b) Convergence diagnostics for parameter $\theta$. (c) Convergence diagnostics for parameter $p_c$. (d) Convergence diagnostics for parameter $p_{uv}$. (e) Convergence diagnostics for parameter $p_{ui}$.

Appendix B. Technical details relating to equations presented in Chapter 7

i To derive Equation 7.3 we simply sum Equation 7.2 successively over t = 0, 1, 2, 3,... to obtain

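$$
\begin{aligned}
\sum_{t=0}^{\infty} P(X_t) = {} & p(e_1) + p(e_2) + p(e_3) + \dots \\
& \phantom{p(e_1)} + p(e_2) + p(e_3) + \dots \\
& \phantom{p(e_1) + p(e_2)} + p(e_3) + \dots
\end{aligned}
\tag{1}
$$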

The r.h.s. of Equation 7.3 follows immediately by summing the columns on the r.h.s. of Equation 1: each term $p(e_t)$ appears in exactly $t$ rows.
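The row-versus-column summation can be checked numerically. The sketch below is illustrative only: the probabilities in `p` are made-up values, and it assumes that Equation 7.2 defines $P(X_t)$ as the sum of $p(e_{t'})$ over $t' > t$, as the rows of Equation 1 suggest.

```python
# Hypothetical extinction-time probabilities p(e_1), ..., p(e_5); illustrative values only.
p = [0.10, 0.25, 0.30, 0.20, 0.15]


def survival(p, t):
    """P(X_t): probability that extinction happens after time t (assumed form of Eq. 7.2)."""
    return sum(p[t:])  # p[t] stores p(e_{t+1}), so this adds p(e_{t+1}), p(e_{t+2}), ...


def sum_of_rows(p):
    """Left-hand side of Equation 1: add P(X_t) over t = 0, 1, 2, ..."""
    return sum(survival(p, t) for t in range(len(p)))


def sum_of_columns(p):
    """Column-wise sum of Equation 1: p(e_t) is counted t times."""
    return sum(t * p_t for t, p_t in enumerate(p, start=1))


row_total = sum_of_rows(p)
column_total = sum_of_columns(p)
print(row_total, column_total)
```

Both totals agree up to floating-point rounding, since they add the same terms in a different order.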

ii The first equality in Equation 7.5 can be iterated to obtain

$$
\begin{aligned}
P(X_t) &= P(X_{t-1}) - p(e_t) \\
&= P(X_{t-2}) - p(e_{t-1}) - p(e_t) = \dots \\
&= P(X_0) - \sum_{t'=1}^{t} p(e_{t'}).
\end{aligned}
\tag{2}
$$

From the definition of $h(t)$ in Equation 7.4 and the fact that $P(X_0) = 1$, we deduce in general that

$$
h(t) = p(e_t) \Big[ 1 - \sum_{t'=1}^{t-1} p(e_{t'}) \Big]^{-1}. \tag{3}
$$

iii In the special case of the geometric distribution Equation 7.6 for p(et), we have

$$
\sum_{t'=1}^{t-1} p(e_{t'}) = h \sum_{t'=1}^{t-1} (1-h)^{t'-1} = 1 - (1-h)^{t-1}. \tag{4}
$$


It then follows from Equations 3 and 4 that $h(t) = h(1-h)^{t-1} / (1-h)^{t-1} = h$, so the hazard function is a constant.
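As a quick numerical illustration of this result, the following sketch evaluates Equation 3 directly for the geometric distribution $p(e_t) = h(1-h)^{t-1}$. The function names and the value $h = 0.3$ are arbitrary choices for the example, not part of the thesis.

```python
def geometric_pmf(h, t):
    """p(e_t) = h (1 - h)^(t - 1) for t = 1, 2, 3, ..."""
    return h * (1.0 - h) ** (t - 1)


def hazard(h, t):
    """Equation 3: p(e_t) divided by one minus the probability of extinction before t."""
    cumulative = sum(geometric_pmf(h, s) for s in range(1, t))
    return geometric_pmf(h, t) / (1.0 - cumulative)


h = 0.3  # illustrative per-step extinction probability
hazards = [hazard(h, t) for t in range(1, 21)]
print(hazards[:3])
```

Every entry of `hazards` equals `h` up to rounding error, whatever the value of `t`.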

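iv For constant $h(t) = h$, using Equation 7.6 we have

$$
\begin{aligned}
\mu = \sum_{t=0}^{\infty} t\, p(e_t) &= h \sum_{t=0}^{\infty} t (1-h)^{t-1} \\
&= -h \frac{d}{dh} \Big[ \sum_{t=0}^{\infty} (1-h)^{t} \Big] = -h \frac{d}{dh} \Big[ \frac{1}{h} \Big] = \frac{1}{h}
\end{aligned}
\tag{5}
$$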

in accord with Equation 7.8.

The differentiation trick in Equation 5 is well known for geometric series and generalizes straightforwardly to higher moments. For example, the second-order moment can be expressed as

$$
\sum_{t=0}^{\infty} t^2 p(e_t) = h \frac{d}{dh} \Big[ (1-h) \frac{d}{dh} \sum_{t=0}^{\infty} (1-h)^{t} \Big] = \frac{2}{h^2} - \frac{1}{h} \tag{6}
$$

It follows from Equations 5 and 6 that the variance $\sigma^2$ of the time to extinction is given by

$$
\sigma^2 = \sum_{t=0}^{\infty} t^2 p(e_t) - \Big( \sum_{t=0}^{\infty} t\, p(e_t) \Big)^2 = \frac{1}{h^2}(1-h) \tag{7}
$$

yielding Equation 7.9.
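The closed forms for the mean and variance can be compared against truncated sums. This is a minimal sketch under the assumption that $p(e_t) = h(1-h)^{t-1}$ for $t \geq 1$; the truncation point `t_max` and the value $h = 0.2$ are arbitrary illustrative choices.

```python
def geometric_moments(h, t_max=5000):
    """Approximate the mean and variance of the extinction time by truncated sums."""
    mean = 0.0
    second_moment = 0.0
    for t in range(1, t_max + 1):
        p_t = h * (1.0 - h) ** (t - 1)  # p(e_t) for the geometric distribution
        mean += t * p_t
        second_moment += t * t * p_t
    return mean, second_moment - mean ** 2


h = 0.2  # illustrative constant hazard
mean, variance = geometric_moments(h)
print(mean, 1.0 / h)                   # compare with Equation 5
print(variance, (1.0 - h) / h ** 2)    # compare with Equation 7
```

For this value of `h` the neglected tail beyond `t_max` is far below floating-point precision, so the truncated sums reproduce $1/h$ and $(1-h)/h^2$ closely.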

v With respect to the lower bounds on $P(E_T)$ as $h \to 0$ (see Fig. 7.1), we note from Equation 7 that $\sigma = \sqrt{1-h}/h \approx 1/h$ for small $h$, so that as $h \to 0$ for a given $K$, $T = \mu + K\sigma \sim (1+K)/h$. With this value for $T$ in Equation 7.12 (or equivalently $h = (1+K)/T$ as $T \to \infty$) we deduce from the definition of the exponential function that

$$
\lim_{h \to 0} P(E_T) = 1 - \lim_{T \to \infty} \Big[ 1 - \frac{1+K}{T} \Big]^{T} = 1 - \exp[-(1+K)] \tag{8}
$$

Thus, for example, when $K = 2$ we deduce that the limiting value is 0.9502 (as in Fig. 7.1), which, coincidentally, is (almost) the same as the $K = 2$ CL for the normal distribution.
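The approach to this limit can be illustrated numerically. The sketch below assumes, as Equation 8 implies, that Equation 7.12 has the form $P(E_T) = 1 - (1-h)^T$ with $T = \mu + K\sigma$; it also assumes that the normal-distribution CL referred to is the two-sided coverage $\mathrm{erf}(K/\sqrt{2})$. Function and variable names are hypothetical.

```python
import math


def prob_extinct_by_T(h, K):
    """P(E_T) = 1 - (1 - h)^T with T = mu + K * sigma (assumed form of Eq. 7.12)."""
    mu = 1.0 / h
    sigma = math.sqrt(1.0 - h) / h
    T = mu + K * sigma
    # log1p keeps (1 - h)^T accurate when h is very small
    return 1.0 - math.exp(T * math.log1p(-h))


K = 2
values = [prob_extinct_by_T(h, K) for h in (1e-1, 1e-2, 1e-4, 1e-6)]
limit = 1.0 - math.exp(-(1 + K))               # right-hand side of Equation 8
normal_coverage = math.erf(K / math.sqrt(2))   # two-sided K-sigma coverage (assumption)
print(values)
print(round(limit, 4), round(normal_coverage, 4))
```

As `h` decreases, the entries of `values` settle onto `limit`, which lies slightly below the corresponding normal coverage.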

vi In the continuum limit the geometric distribution of Equation 7.6 becomes the exponential distribution, with the pdf for constant $h$ given by

$$
f(t) = h \exp(-ht) \tag{9}
$$

In this case $\mu = \sigma = 1/h$, so that $T = \mu + K\sigma = (1+K)/h$, and then from Equation 7.11 the CL is

$$
\mathrm{CL} = \int_{0}^{T} h e^{-ht} \, dt = 1 - \exp(-hT) = 1 - \exp[-(1+K)]. \tag{10}
$$
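That the CL in Equation 10 does not depend on $h$ can be confirmed by integrating the pdf of Equation 9 numerically. The following is a minimal sketch using a midpoint rule; the step count and the three values of `h` are arbitrary illustrative choices.

```python
import math


def exponential_cl(h, K, steps=20000):
    """Midpoint-rule integral of h * exp(-h * t) over [0, T], with T = (1 + K) / h."""
    T = (1.0 + K) / h
    dt = T / steps
    return sum(h * math.exp(-h * (i + 0.5) * dt) * dt for i in range(steps))


K = 2
closed_form = 1.0 - math.exp(-(1 + K))  # right-hand side of Equation 10
numeric = {h: exponential_cl(h, K) for h in (0.05, 0.5, 5.0)}
print(closed_form, numeric)
```

All three numerical values match the closed form to within the discretisation error of the midpoint rule, regardless of `h`.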