Report Cover Page

ACERA Project 0605
Title: Statistical Methods for Biosecurity Monitoring and Surveillance
Author(s) / Address(es): David Fox, University of Melbourne

Material Type and Status (Internal draft, Final Technical or Project report, Manuscript, Manual, Software): Final Report

Summary: This report investigates the applicability of traditional methods of analysing surveillance data to biosecurity risks, and explores some more recent innovations designed to detect subtle trends and anomalous behaviour in data over space and time. In particular, it examines control charting and syndromic surveillance methods, and explores how useful they are likely to be in dealing with typical biosecurity disease and pest surveillance. It focuses on disease detection, methods for optimising surveillance networks, and robust methods for minimising levels of inspection. This work provides a proof of concept of these approaches. The case studies, while based on real contexts, are intended only to be illustrative. If the tools are considered to be potentially useful, the next stage would involve development of specific applications to trial their utility.


AUSTRALIAN CENTRE OF EXCELLENCE FOR RISK ANALYSIS Project 06-05

Statistical Methods for Biosecurity Monitoring and Surveillance

THE AUSTRALIAN CENTRE OF EXCELLENCE FOR RISK ANALYSIS
Statistical Methods for Biosecurity Monitoring & Surveillance

DAVID R. FOX © 2009
The University of Melbourne, Parkville 3052
Phone +61 3 8344 7253 • Fax +61 3 8344 6215
Email: [email protected]

This report may not be reproduced in part or full by any means without the express written permission of the copyright owner.

DISCLAIMER

This report has been prepared by consultants for the Australian Centre of Excellence for Risk Analysis (ACERA) and the views expressed do not necessarily reflect those of ACERA. ACERA cannot guarantee the accuracy of the report and does not accept liability for any loss or damage incurred as a result of relying on its accuracy.

Table of Contents

LIST OF FIGURES  vi
ACKNOWLEDGEMENTS  ix
EXECUTIVE SUMMARY  1

CHAPTER 1: Basic Control Charting
1-1 INTRODUCTION  5
1-2 BACKGROUND  6
1-3 BASIC STATISTICAL CONCEPTS  9
1-4 STATISTICAL SIGNIFICANCE  17
1-5 CONTROL CHARTS  21
1-6 TIME BETWEEN EVENTS  27
1-7 TRANSFORMATIONS TO NORMALITY  32
1-8 CONTROL CHART FOR TIME-BETWEEN-EVENTS  36
1-9 DISCUSSION  39

CHAPTER 2: Bayesian Control Charting
2-1 INTRODUCTION  41
2-2 FUNDAMENTALS OF BAYESIAN CONTROL CHARTING  43
2-3 A BAYESIAN CONTROL CHART FOR QUARANTINE INSPECTION  45
2-3-1 MATHEMATICAL DETAIL  46
  Updating the prior  47
  Predictive distributions for X_{t+1} and Y_{t+1}  49
2-3-2 ADAPTIVE CONTROL LIMITS  50
2-4 EXAMPLE – AN ADAPTIVE CONTROL CHART FOR FOOD IMPORTS  51
2-4-1 UPDATING THE PRIOR  51
2-4-2 SETTING ADAPTIVE TRIGGERS  54
2-5 DISCUSSION  55

CHAPTER 3: Space-Time Disease Spread
3-1 INTRODUCTION  57
3-2 A CELLULAR AUTOMATA MODEL FOR DISEASE SPREAD  58
3-2-1 MATHEMATICAL FORMULATION  63
  Models for transmission effectiveness  65
  Initial conditions  65
  Modelling the disease spread  66
3-3 INFERRING THE TIME AND LOCATION OF AN OUTBREAK  69
3-4 EXAMPLE  72
3-5 DISCUSSION  75

CHAPTER 4: Sensor Network Optimisation
4-1 INTRODUCTION  79
4-2 A SURVEILLANCE NETWORK FOR EI  80
4-3 THE MAXIMAL COVERING LOCATION PROBLEM (MCLP)  82
4-4 THE GENERALISED MCLP (G-MCLP)  85
4-4-1 A LOGISTIC MODEL FOR DISEASE PROBABILITY  87
4-5 PROBLEM FORMULATION  89
4-6 EXAMPLE  91
4-6-1 IMPLEMENTATION  95
  Scenario evaluation  95
4-7 DISCUSSION  102

CHAPTER 5: Robust Methods for Biosurveillance
5-1 INTRODUCTION  105
5-2 SURVEILLANCE WITH IMPERFECT DETECTION  107
5-2-1 PROBLEM FORMULATION  108
5-3 AN INFO-GAP MODEL FOR SURVEILLANCE PERFORMANCE  110
5-4 ILLUSTRATIVE EXAMPLE  113
5-4-1 COMPARISON WITH A BAYESIAN APPROACH  114
5-5 DISCUSSION  117

6 REFERENCES  119
APPENDIX A: DATA USED IN CHAPTER 1 EXAMPLE  126
APPENDIX B: DERIVATION OF EQUATION 2-5  127
APPENDIX C: DERIVATION OF EQUATION 2-9  129
APPENDIX D: DATA USED IN CHAPTER 3 EXAMPLE  133
APPENDIX E: MATHCAD CODE FOR CELLULAR AUTOMATA MODEL  135
APPENDIX F: DATA USED IN CHAPTER 4 EXAMPLE  138
APPENDIX G: LINGO® CODE FOR OPTIMAL SENSOR CONFIGURATION  142


List of Figures

Figure 1. Histogram of number of imported food items inspected each day in the period 1/7/2006 to 29/6/2007.  9
Figure 2. Graphical and numerical statistical summary number of weekday inspections. Curve overlaying the histogram is the best-fitting normal distribution.  10
Figure 3. Graphical and numerical statistical summary number of weekend inspections. Curve overlaying the histogram is the best-fitting normal distribution.  11
Figure 4. Daily inspection failure rate histogram for food items imported between 1/7/2006 and 30/6/2007. Blue curve is best-fitting normal distribution.  13
Figure 5. Boxplot for inspection failure rate for weekends and weekdays.  14
Figure 6. Time series plot of number of containers inspected per week (black line) and number of containers found to be of quarantine concern (red line).  16
Figure 7. Time series plot of overall failure rate for food imports. Solid blue line is a loess smooth.  16
Figure 8. Theoretical distribution for the true proportion of weekday inspection failure rate.  18
Figure 9. Theoretical distributions for an individual proportion (black) and the average of 52 proportions (red).  18
Figure 10. Assumed distribution for mean of n=52 sample proportions. One-tail, 5% 'critical region' identified by red shading.  19
Figure 11. Assumed distribution for mean of n=52 sample proportions. Two-tail, 5% 'critical regions' identified by red shading.  20
Figure 12. P-chart for inspection failure rate data.  21
Figure 13. Smoothing using block averaging.  22
Figure 14. Moving average scheme. A block or 'window' is stepped incrementally over the series and the block mean computed and plotted.  23
Figure 15. Moving average chart of weekday inspection failure rate. Sub-group size = 1; MA length = 5.  24
Figure 16. Moving average for weekday inspection failure rate. Subgroup size defined by week of year (usually 5); MA length = 4.  25
Figure 17. Comparison of exponentially declining weights (red bars) compared with equal-weighting scheme (blue rectangles) for k=10.  26
Figure 18. EWMA chart for weekday inspection failure rate. Subgroup size defined by week of year (usually 5); EWMA weight = 0.2.  27
Figure 19. Time sequence of detection of quarantine threats.  28
Figure 20. Pattern of inter-arrival times as measured by the 'white space' between blue lines.  28
Figure 21. Histogram of inter-arrival times with smoothed version (red line) and theoretical normal distribution (black line) overlaid. The normal distribution provides a poor description of this data (evidenced by both the shape and the probability mass associated with negative values of days between detects).  29
Figure 22. Histogram of days between detects. Smoothed histogram indicated by red curve, theoretical exponential distribution depicted by black curve.  30
Figure 23. Empirical cdf for days between detects (red curve) and theoretical exponential cdf (blue curve).  31
Figure 24. Chart of individual values of days between detects (I-Chart).  31
Figure 25. Box-Cox profile plot for the days between detects. Optimal lambda is 0.24.  33
Figure 26. Histogram of transformed days between detection with smoothed version (red curve) and theoretical normal (black curve) overlaid.  33
Figure 27. Fitted normal distribution to transformed days between detects with upper 10% point indicated.  34
Figure 28. Theoretical negative exponential distribution for untransformed days between detects and upper 10% point indicated.  35
Figure 29. I-Chart for transformed days between detects.  36
Figure 30. Performance characteristics (as measured by equation 1.9) for a one-sided, lower control chart.  38
Figure 31. Performance characteristics (as measured by equation 1.10) for a one-sided, upper control chart.  38
Figure 32. Illustrative non-informative priors for true failure rate, theta.  47
Figure 33. Daily consignment failure rate (red curve) and cumulative failure rate (blue curve) for imports between 4/7/2006 and 29/6/2007.  52
Figure 34. Top: original (subjective) prior density (blue curve) and posterior density (red curve) for true failure rate after 1 year. Bottom: empirical cumulative failure rate (red line), overall mean failure rate (blue dotted line) and mean of posterior distribution after 1 year (green dashed line).  53
Figure 35. Average number of days per year that are conducive to persistence of FMD virus in aerosol (from Cannon and Gardner 1999).  60
Figure 36. Zonation of EI infected regions in Queensland during the 2007/08 outbreak. Source: http://www2.dpi.qld.gov.au/extra/ei/maps/QLDInfectedCluster.gif (accessed 25 January 2008).  60
Figure 37. Illustration of grid representation used by authorities in managing EI outbreak. (Source: http://www2.dpi.qld.gov.au/extra/ei/maps/map8.gif)  61
Figure 38. The region of interest in Figure 36 with grid overlay that forms the basis of the CA modelling approach.  62
Figure 39. The grid in Figure 38 showing an 'infected' cell (red shading) and associated spatial pattern for disease transmission.  62
Figure 40. General CA situation with cell of interest (red shading) and neighbouring cell. Z is a binary variable indicating cell's disease status. Region bordered by heavy line depicts spatial extent of influence or impact of cell on its neighbours.  63
Figure 41. Illustration of a spatial probability model for disease spread for generic cell {i,j} at fixed point in time. Grid represents region of interest. Height of surface is proportional to probability of spread of infection from cell {i,j} to neighbouring cells. Each plot represents a different range of influence from highly localised (bottom right) to far-ranging (top left).  66
Figure 42. Illustrative contours of probability representing likelihood of infection around cell having grid coordinates {12,8}.  67
Figure 43. 3-D depiction of progression of disease spread starting with initial outbreak pattern at T=0 and at three subsequent time periods. Vertical scale is probability of infection.  68
Figure 44. Likelihood profile plot for k (number of time increments since outbreak).  69
Figure 45. General situation depicting region of interest with vertical bars depicting empirical rates of infection.  70
Figure 46. Illustrative likelihood surface for outbreak location.  72
Figure 47. 3-D visualisation of empirical data (infection rates) used in the example. Underlying contour lines derived from theoretical response-generating model.  73
Figure 48. Profile plot showing contours of likelihood of outbreak location. Maximum attained at grid position located at row 14, column 9. Notes: (i) this plot is rotated 90° relative to Figure 47; (ii) plot is offset by 1 unit in each direction due to placement of origin at {0,0}, hence most likely grid location for disease outbreak is cell {10,15}.  74
Figure 49. Likelihood profile plot for temporal component of CA model. Most likely outbreak time is 78 time increments prior to time of sampling.  75
Figure 50. Detection of EI in NSW as at September 12, 2007.  81
Figure 51. Illustration of 'envelope of effectiveness' for three sensor types. Note that range and orientation can be different for each sensor.  86
Figure 52. Representation of single sensor network. Sensor is located in cell {i,j}. The probability of detection in cell {k,l} is a product of the risk/likelihood for cell {k,l} (denoted Ekl) multiplied by the sensor's detection probability wijkl, which is a function of the distance (dijkl) between cells {i,j} and {k,l}.  88
Figure 53. Generalisation of previous figure for multiple sensor network.  89
Figure 54. 3D surface/contour plot of NSW population density. (Raw data given in Appendix F.)  92
Figure 55. NSW population density contours. Solid red circles correspond to EI locations as shown in Figure 1.  92
Figure 56. 3D surface/contour plot of EI risk over NSW.  93
Figure 57. Contour plot of EI risk in NSW. Solid red circles are EI locations from Figure 1.  94
Figure 58. Block-averaged risk probabilities.  94
Figure 59. Generic representation of three sensors having characteristics given in Table 4-1.  96
Figure 60. Cost-contours for surveillance monitoring.  97
Figure 61. Block-averaged surveillance monitoring costs.  98
Figure 62. Optimal solution for scenario #1: 5 sensors; min spacing 30 km; no cost constraint.  98
Figure 63. Optimal solution for scenario #2: 5 sensors; min spacing = 35 km; no cost constraint.  99
Figure 64. Optimal solution for scenario #3: 5 sensors; min spacing = 45 km; no cost constraint.  99
Figure 65. Optimal solution for scenario #4: 5 sensors; min spacing = 60 km; no cost constraint.  100
Figure 66. Optimal solution for scenario #5: 4 sensors; no min spacing; cost <= $20,000.  100
Figure 67. Optimal solution for scenario #6: 4 sensors; min spacing 35 km; cost <= $20,000.  101
Figure 68. Optimal monitoring locations corresponding to Figure 62 overlaid on geographic map. (Note: apparent positional discrepancies due to different mapping projections.)  101
Figure 69. Robustness of surveillance performance for various sampling fractions (lambda).  113
Figure 70. Directed acyclic graph for the biosurveillance example.  115
Figure 71. Prior density for theta.  115
Figure 72. Prior density for phi.  115
Figure 73. WinBUGS code for biosurveillance example.  116
Figure 74. Empirical cdfs for performance measure of equation 5.11 based on 10,000 simulated results.  116


Acknowledgements

This report is a product of the Australian Centre of Excellence for Risk Analysis (ACERA). In preparing this report, the author acknowledges the financial and other support provided by the Department of Agriculture, Fisheries and Forestry (DAFF), the University of Melbourne, the Australian Mathematical Sciences Institute (AMSI) and the Australian Research Centre for Urban Ecology (ARCUE).

The author is grateful to the following people for their reviews of earlier versions of individual chapters of this report: Mark Burgman (University of Melbourne); Andrew Robinson (University of Melbourne); Rob Cannon (AQIS); and the anonymous referees who provided valuable feedback on individual chapters. The contributions and assistance of Colin Thompson (University of Melbourne) in the preparation of the material in chapter 5 are greatly appreciated.


Executive Summary

Wagner et al. (2006) define biosurveillance as “a process that detects...outbreaks of disease...monitors the environment...and systematically collects and analyses data”. Clearly, statistics and statistical methods have a critical role to play in all aspects of biosurveillance: detection; monitoring; and data collection and analysis.

Early work on developing statistical tools for biosurveillance for the most part represented a re-working or adaptation of standard, pre-existing methodologies such as control charts and attribute sampling inspection schemes. While these methods are certainly applicable, there is growing recognition that the data and processes underpinning modern biosecurity and biosurveillance deviate from the industrial context in which they were originally developed (Shmueli and Burkom in press). Classical ‘frequentist’ statistical tools struggle with the nuances of biosecurity data which invariably exhibit some or all of the following analytical ‘curses’: data paucity; non-normality; non-stationarity; heterogeneous error structures; and over-dispersion. Other issues such as an inability to deal with data from multiple sources, and a model focus on natural/physical processes rather than ‘choice processes’ are also cited as reasons for the failure of traditional methods of monitoring and analysis. The peculiarities of biosurveillance systems demand ‘new’ statistical approaches to both data acquisition and analysis. Techniques that have been successfully applied to the analysis of syndromic and climatic data are candidates for biosurveillance.

This report summarises the outcomes of ACERA Project 06-05, which had two main aims:

1. Investigate the applicability of 'traditional' monitoring methods to the surveillance and detection of bio-security risks and/or threats; and

2. Undertake statistical research in ‘new’ and emerging areas of bio-surveillance that show promise for their ability to identify and predict trends and anomalous behaviour in both space and time.

Objectives 1 and 2 above have been met, and as the project evolved over the past three years, so too did its direction and emphasis. Chapter 1 of this report is associated with activities under objective 1. Chapter 2 extends the traditional control charting methods and develops this within a Bayesian context for quarantine inspection, while chapters 3 to 5 report on research undertaken as part of the second objective. Specifically, chapter 3 focuses on disease outbreak detection in time and space; chapter 4 examines methods for optimising surveillance network designs; and chapter 5 looks at robust methods for determining minimum levels of inspection.

During the course of this project a number of ancillary activities were undertaken. These included:

• Meetings with key U.S. researchers working in the area of syndromic surveillance. This led to an enhanced awareness and understanding of statistical methods used for analysing space-time clusters of 'incidences' (eg. disease outbreaks);

• Creating and fostering linkages with research groups at Harvard, the N.Y. Department of Health, and Rutgers University (DIMACS);

• Presentations (SRA conference, Melbourne; AQIS, Canberra; Rutgers University, New York; International Statistical Institute conference, Lisbon);

• Participation in the "Workshop on Info-Gap Applications in the Life Sciences", University of Houston, 11–15 Sept. 2006. An outcome from this workshop was preparation of a manuscript titled "Robust Profiling for Quarantine Inspection" (Fox, D.R., Moilanen, A. and Beare, S.);

• Organising an international "Uncertainty and Surveillance" Workshop, August 12–17, 2007, in Hobart. The purpose of the workshop was to bring together individuals from a diversity of backgrounds to discuss surveillance and uncertainty in the context of bio-security. Topics considered by the working group were drawn from the following list:

1. Surveillance for exotic diseases in animal / plant populations

2. How to design monitoring/surveillance programs for events for which we have no data and hope to never have data eg. catastrophic events having unimaginable consequences?

3. Attribute sampling and inspection – how many samples; where; and when in order to declare pest or disease-free status? Distinction between sampling zeros and structural zeros.

4. Characteristics of a surveillance network that will make it most robust in discovering emerging animal diseases early while adhering to cost and performance criteria.

5. Resource Allocation (static). Inspection resources are limited. N sites could be inspected. There is uncertainty in any or all of: cost of inspection, probability of detection, consequence of missed detection.

6. Resource Allocation (dynamic). This is an extension of the previous problem. Here the temporal dimension enters. Inspection resources are limited. N sites could be inspected. There is uncertainty in any or all of: cost of inspection, probability of detection, consequence of missed detection. We assume the inspector learns from the inspection process and adjusts his/her behaviour. This can be studied from a multitude of perspectives.

7. Search and evasion: This is a generalization of the previous problem. The main extension is that now we consider strategic behaviour on the part of the target: the target adjusts his behaviour in response to the behaviour of the inspector. That is, both inspector and target behave strategically.

The outcomes of this project should be viewed as a starting point for further exploration and analysis. We have developed to a 'proof of concept' stage a number of strategies and ideas which we believe have the potential to enhance biosecurity monitoring and surveillance. The need to refine these methodologies and 'road test' them within an actual quarantine inspection program is an essential next step. Early investigations were frustrated by a lack of data and resources. ACERA Project 08/04 provided much-needed data for this project, and efforts to enhance the flow of data and information between ACERA and Client agencies are strongly encouraged.


Chapter 1: Basic Control Charting

1-1 INTRODUCTION

We review some basic statistical concepts and introduce some common control-charting techniques that have been successfully applied in areas as diverse as the manufacturing industries and veterinary epidemiology. The review focuses on the applicability of control charts for monitoring temporal trends and aberrations in bio-security-related applications. Control charts are particularly well suited to the visualisation and assessment of moderate to large volumes of time-based data, and as such would be expected to have greater utility for, say, container inspection regimes than for detecting the occurrence (in space) of an invasive species. Control charts need to be viewed as just one method in a tool-kit of available techniques which can potentially assist field officers and quarantine risk assessors in identifying 'unusual' or 'aberrant' trends. For events having very low probabilities of occurrence (eg. exotic disease outbreak), the 'time between outbreaks' is a potentially more useful quantity to chart, although, as shown in this report, the statistical power (ability to correctly identify real 'shifts' in the mean time between events) of current charting techniques is relatively low.

The various forms of charting in the context of a quarantine inspection service are illustrated using imported food inspection data gathered under ACERA Project 08/04, "AQIS Import Clearance Data Framework".

1-2 Background

This chapter is intended to provide managers, field officers, and researchers who have some responsibility for biosecurity monitoring and surveillance with an introduction to the principles and procedures of statistical process control (SPC) and, in particular, control chart techniques.

Statistical process control can be broadly defined as the (statistical) methodologies and tools by which 'quality' is monitored and managed. With this definition, the original context of SPC is clear – to 'control' the quality of manufactured items in an industrial process or setting, although these days the word control is de-emphasised and is usually either dropped1 or replaced by improvement. The development of SPC techniques can be traced back to the First World War and shortly thereafter, with the introduction of the Shewhart control chart in the 1920s. General acceptance and uptake of SPC tools in the West was relatively slow until it was realised that a major contributing factor to the high productivity and quality of Japanese manufactured goods was that country's enthusiastic embrace of a total quality philosophy, as espoused by the leading (American) statistician and quality advocate W. Edwards Deming.

The 1980s saw a resurgence of interest in SPC under the banner of ‘Total Quality Management’ or TQM. The ‘six-sigma’ philosophy was conceived during this time in response to Motorola’s desire to achieve a tenfold reduction in product-failure levels within five years. The Six Sigma methodology (based on the steps Define - Measure - Analyse - Improve – Control) underpins the objectives of process improvement, reduced costs, and increased profits.

While much good work was done in the ensuing years, with numerous examples of demonstrable success attributed to the SPC/TQM paradigm, the trend attracted some poorly credentialed 'experts' offering radical transformations to new (and oftentimes unrealisable) levels of profitability. Not surprisingly, there were some failures and residual ill-feeling towards TQM. For example, one large company found that two-thirds of the TQM programs it examined had been halted due to lack of results, while a 1994 American Electronics Association survey showed that TQM implementation had dropped from 86 percent to 63 percent and that reductions in defect rates were not being realised (Dooley and Flor, 1998).

1 The American Society for Quality Control (ASQC) changed its name to the American Society for Quality (ASQ) on July 1, 1997.

Despite some negative experiences in the manufacturing sector, TQM made a substantial contribution to quality improvement. The issue these days is: how much quality is enough? For example, would anyone be interested in driving a 50-year-old vehicle even if it was still under warranty? Or is there any value in ensuring that a computer will run reliably for 5 years when generational change in the computer industry is measured in months?

Environmental applications of TQM and SPC techniques have been identified only relatively recently, despite the need for robust and reliable monitoring and surveillance systems. Fox (2001) attributes this to a lack of cross-talk between the 'brown' (industrial) statisticians and the 'green' (environmental) statisticians. Whatever the reasons, the slow uptake of SPC tools for environmental monitoring meant that critical assessments about environmental condition and important decisions about responses were being made on the basis of oftentimes flawed statistical advice. The ecologists' statistical toolkit was generally standard issue – t-tests, ANOVA, ANOSIM, and MDS were invariably represented and much used – while the relatively simple techniques of Xbar/S charts, EWMA charts, and capability analysis were virtually unheard of.

The Australian Guidelines for Fresh and Marine Water Quality (ANZECC/ARMCANZ 2000) advocated a more 'risk-based' approach to water quality monitoring and assessment and, in particular, a reduced emphasis on binary decision-making (t-tests and ANOVA) and an increased emphasis on early-warning systems underpinned by dynamic visualizations (control charting). Shortly after the release of the Guidelines, the events of 9/11 and the subsequent discovery of high-grade anthrax sent via the U.S. Postal Service elevated the importance of early warning systems. In response, U.S. State and Federal governments have invested hundreds of millions of dollars in developing advanced surveillance systems to detect, among other things, another anthrax attack. Recognising that existing monitoring systems which relied on the gathering and processing of hospital records had unacceptable latencies, a new area of research emerged which aimed to provide close to real-time monitoring and warning of 'aberrant' events.

Syndromic surveillance is underpinned by a belief that signals of an emerging 'syndrome' such as a flu outbreak can be identified by an analysis of multiple time-series of ancillary variables, such as absenteeism records and sales of non-prescription cold and flu medications, together with an analysis of spatial clustering of outbreaks. Kulldorff (1997) developed a spatial scan statistic to help with the latter, while control charts were an obvious first candidate for the former. A major barrier at present is the difficulty in 'proving' that any of these new systems has made a difference or even does what it is meant to do. As noted by Mostashari and Hartman (2003), no syndromic surveillance system has provided early warning of bioterrorism, and no large-scale bioterrorist attack has occurred since existing systems were instituted.

While the use of syndromic surveillance for counter-terrorism (see for example, http://www.bt.cdc.gov/surveillance/ears/) is a recent development, similar systems have been used for some time now to detect outbreaks, patterns, and trends in diseases and epidemics (see for example, http://www.satscan.org/). These techniques do not appear to have had any appreciable uptake in Australia or elsewhere around the world in quarantine inspection and bio-security. While adoption and uptake of SPC techniques in the Australian Quarantine and Inspection Service has been low, a search of the Department of Agriculture, Fisheries and Forestry (DAFF) website (www.daff.gov.au) reveals two (publicly available) documents that discuss the use of simple control charts (Korth 1997; Commonwealth of Australia 2002). One of these documents (Commonwealth of Australia, 2002) devotes a chapter to the use of control charting as an effective means of detecting trends for meat hygiene assessments. Control charting has also been recommended for detecting spatial and temporal clusters in veterinary monitoring (Carpenter, 2001) as well as testing waste streams from wastewater treatment plants (Hall and Golding, 1998). Stark et al. (2006) discuss risk-based veterinary surveillance approaches to protecting livestock and consumer health, although control charting was not mentioned.

This chapter provides a review of basic SPC techniques (focusing on control charting methods) with a view to introducing our Client Agencies to 'new' or alternative approaches to monitoring that may offer enhanced anomaly detection capabilities. With a shared understanding of control charting principles and of the strengths and weaknesses of these methods, it is hoped that opportunities for implementation and evaluation in an actual quarantine inspection / bio-surveillance environment will emerge.

1-3 BASIC STATISTICAL CONCEPTS

Statistics is concerned with random variation. This does not mean that life for a statistician is totally unpredictable. The quantities exhibiting the random variation (the random variables) can be 'predicted' or described to the extent that, in repeated 'trials' or observations, the values assumed by the random variable can be described by a frequency distribution. Figure 1 shows the histogram for the number of imported food items inspected per day between 1 July 2006 and 30 June 2007.


Figure 1. Histogram of number of imported food items inspected each day in the period 1/7/2006 to 29/6/2007.

A number of features are immediately apparent from Figure 1: (i) the range of items inspected per day is from zero to around 440; (ii) the most frequent number of inspected items is approximately 250 - this is the modal value; (iii) the histogram exhibits bimodality (i.e. two 'peaks'); (iv) an eyeball estimate of an average or mean value is about 200 items/day.

With respect to the bimodality, two groupings are evident: <100 inspections/day and >100 inspections/day. Further analysis reveals that this is simply a weekend/weekday dichotomy, as highlighted by the two-way breakdown in Table 1-1.²

Table 1-1. Two-way breakdown of number of inspections according to weekday/weekend.

              Fewer than 100      Greater than 100     Totals
              inspections/day     inspections/day
Weekend             57                    0                57
Weekday              6                  252               258
Totals              63                  252               315

As a companion to the graphical summary provided by the histogram, we generally compute relevant statistics for our data. Most software tools (such as MINITAB) combine both the graphical and numerical summaries. Individual summaries for the number of daily inspections are provided for both weekdays (Figure 2) and weekends (Figure 3).

[Figure 2 graphic: MINITAB summary for N_Inspect, day_type = weekday. Anderson-Darling A-squared = 0.93, p-value = 0.018; mean = 251.08, StDev = 69.84, variance = 4877.76, skewness = -0.50, kurtosis = 1.33, N = 258; minimum = 1, Q1 = 213, median = 250.5, Q3 = 298, maximum = 441; 95% CIs: mean (242.52, 259.64), median (242.00, 259.00), StDev (64.29, 76.45).]

Figure 2. Graphical and numerical statistical summary number of weekday inspections. Curve overlaying the histogram is the best-fitting normal distribution.

2 As an aside, the data in Table 1 are conventionally analysed using contingency table methods and a chi-squared test of independence. The result of such an analysis when applied to Table 1 is highly significant (p<0.0001) suggesting that the number of items inspected is not independent of the weekday/weekend classification.
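For readers who wish to reproduce this kind of contingency-table analysis, a minimal sketch in Python is given below. It uses the counts from Table 1-1 and the scipy library; it is illustrative only and is not the software used for the original analysis.

```python
# Chi-squared test of independence for the weekday/weekend breakdown in Table 1-1.
from scipy.stats import chi2_contingency

#            <100/day  >100/day
table = [[57, 0],      # weekend
         [6, 252]]     # weekday

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.3g}")
# p is far below 0.0001, consistent with the footnote above.
```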

[Figure 3 graphic: MINITAB summary for N_Inspect, day_type = weekend. Anderson-Darling A-squared = 1.91, p-value < 0.005; mean = 15.456, StDev = 11.957, variance = 142.967, skewness = 1.10, kurtosis = 0.69, N = 57; minimum = 1, Q1 = 6.5, median = 13, Q3 = 21.5, maximum = 50; 95% CIs: mean (12.284, 18.629), median (9.000, 16.000), StDev (10.095, 14.668).]

Figure 3. Graphical and numerical statistical summary number of weekend inspections. Curve overlaying the histogram is the best-fitting normal distribution.

The numerical summaries presented on the right-hand side of Figures 2 and 3 are explained below:

Anderson-Darling Normality Test

Because many statistical procedures (including control charting) rely on an implicit assumption of underlying normality in the distribution of the quantity of interest, the best-fitting normal distribution for the data at hand has been identified and overlaid on the sample histogram. While this allows for a visual inspection of the plausibility of the normality assumption, a more formal statistical test can be performed. There are many such tests – some good, others not so good (see for example Stephens, 1974). One such test that has been shown to perform well under a wide range of conditions is the Anderson-Darling test. The implicit hypothesis being tested is that the sample data have been drawn (at random) from a much larger population of values which is normally distributed. In this case, the test statistic is a value of A² (0.93 in Figure 2). The numerical value is rather meaningless by itself. In order to gauge the significance of the result we look at the companion p-value. The rule-of-thumb is that small p-values lead to a rejection of the implicit hypothesis (otherwise known as the null hypothesis). So, how small is small? Convention dictates that p-values of less than 0.05 are 'significant'. However, you are cautioned against the unthinking adoption of this 0.05 norm. Since the p-value of 0.018 in Figure 2 is below the nominal 0.05, the hypothesis of normality is formally rejected for the weekday inspection data; as noted in section 1-7, however, many procedures (including control charting) are relatively robust to mild to moderate departures from normality.
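The Anderson-Darling statistic reported by MINITAB can be approximated with standard statistical libraries. The sketch below uses scipy, which returns the A² statistic and critical values rather than a p-value; the data vector is a synthetic stand-in for the daily inspection counts, not the AQIS data.

```python
# Anderson-Darling test of normality (the test reported in the MINITAB summaries).
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(1)
x = rng.normal(loc=250, scale=70, size=258)   # stand-in for the weekday counts

result = anderson(x, dist='norm')
print("A-squared =", round(result.statistic, 3))
for crit, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "do not reject"
    print(f"  at the {sig}% level: {decision} normality")
```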

Mean

This is the usual arithmetic mean or average of the data. Other means are available (geometric and harmonic) although they are not widely used.

StDev and Variance

This is the standard deviation and is the most common measure of spread (or dispersion) of a data set. It is defined as the positive square root of the variance. Since the variance is computed by summing the square of differences formed by subtracting the (sample) mean from each data value, it (the variance) can only ever take on non-negative values. The reason for preferring the standard deviation as a measure of spread is that it has the same units as the original data values.

Skewness

As the name suggests, this is a measure of skewness or asymmetry. Skewness can be negative (long-tail to the left) or positive (long-tail to the right). Symmetrical distributions have zero skewness.

Kurtosis

This is not a quantity that is used often in its own right. It is a numerical measure of 'peakedness' of a distribution. A flat distribution is said to have low kurtosis while a highly peaked distribution has high kurtosis. Different software packages will compute kurtosis slightly differently. The normal distribution has a kurtosis of 3. MINITAB and other software tools subtract 3 from the computed kurtosis so as to make the measure relative to a normal distribution. The kurtosis of 1.33 in Figure 2 is positive, implying that this sample data is somewhat more peaked than a normal distribution.

Minimum and Maximum

These are self-evident.

First, second, and third quartiles

The quartiles are numerical values that divide the distribution into four equal parts.

The first quartile (Q1) is such that 25% of all values are less than or equal to Q1; the second quartile (Q2) is such that 50% of all values are less than or equal to Q2; while the third quartile (Q3) is such that 75% of all values are less than or equal to Q3. Q2 is also known as the median value. The difference Q3 - Q1 is referred to as the inter-quartile range (IQR) and is a measure of spread or variation.

Confidence Intervals: mean, median, and standard deviation

An explanation of a confidence interval would require considerably more than a few lines. However, a practical interpretation is that we may assert that the true parameter value lies within the stated interval with stated degree of confidence.

A quantity of potentially greater interest than the number of inspections performed is the failure rate, p, defined as p = N_f / N_I, where N_f is the number of failed items out of N_I inspected. The histogram of daily failure rates (Figure 4) is slightly positively skewed (i.e. to the right) and is not well described by a normal probability model.


Figure 4. Daily inspection failure rate histogram for food items imported between 1/7/2006 and 30/6/2007. Blue curve is best-fitting normal distribution.

Descriptive statistics for p for weekdays and weekends are given in Table 1-2. It is seen that while far fewer inspections are undertaken on weekends, the average failure rate is essentially the same as for weekdays, although the variability in weekend failure rates is much greater.

Table 1-2. Numerical summaries of inspection failure rates for weekdays and weekends.

day_type   N     N (missing)   Mean      SE Mean   StDev     Minimum   Q1        Median    Q3        Maximum
weekday    258   0             0.02493   0.0011    0.01761   0         0.01219   0.0211    0.0342    0.09272
weekend    57    0             0.02524   0.00669   0.05048   0         0         0         0.03399   0.24

Another way of presenting the data is the box-plot. The box-plot for the proportion data is shown in Figure 5. There are a number of variations on how the box-plot is constructed so you should check your computer software to be clear on the specifics. The information given by MINITAB (the statistical software used to produce Figure 5) is shown in Box 1 and general information on graphical summaries from SYSTAT is shown in Box 2.

A number of features are immediately apparent from Figure 5: (i) the distributions of failure rates are approximately symmetrical around a common average of about 0.02 although the weekend distribution is slightly positively skewed; and (ii) the weekend distribution shows greater variability than the weekday distribution as evidenced by the larger interquartile range (IQR) and a number of relatively large values.


Figure 5. Boxplot for inspection failure rate for weekends and weekdays.

Box 1. MINITAB's Help on Box-plots

Box 2. SYSTAT's help on graphical summaries.

In addition to the graphical summaries already presented, it will be useful to display the time-series plot (ie. values of a variable plotted over time) for the monitored data. Figure 6 shows such a plot of the weekend and weekday data. Given that the number of failures is very much smaller than the number of items inspected, plotting both quantities on the same axes is of limited value. Since both the number of containers inspected and the number of 'detects' vary with time, a more useful quantity to plot is the failure rate, p (Figure 7).

Figure 6. Time series plot of number of containers inspected per week (black line) and number of containers found to be of quarantine concern (red line).


Figure 7. Time series plot of overall failure rate for food imports. Solid blue line is a loess smooth.
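A loess/lowess smooth of the kind overlaid in Figure 7 can be produced with standard libraries. The sketch below uses the lowess routine in statsmodels on an illustrative, synthetic failure-rate series (the AQIS data themselves are not reproduced here).

```python
# Lowess (locally weighted) smoothing of a daily failure-rate series.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
day = np.arange(365)
rate = np.clip(0.025 + 0.01 * np.sin(day / 60) + rng.normal(0, 0.01, day.size), 0, None)

# frac sets the span of each local fit; larger values give a smoother curve.
smooth = sm.nonparametric.lowess(rate, day, frac=0.2)   # columns: [day, smoothed rate]
print(smooth[:3])
```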

Page | 16 Figure 7 shows the failure rate generally fluctuates between zero and 5%. The blue line in Figure 7 is the result of applying a local smoothing procedure, called a ‘loess’ to the failure rate data. The idea behind the loess (and other procedures like it) is that the high frequency oscillations can be removed by sub-setting the data and replacing individual data points by some statistic (such as the median of the data in the sub-set). In this way, the relatively ‘noisy’ data are smoothed. The degree of smoothing can be varied so as to reveal different features in the time-series data. In the case of Figure 7, we see that the failure rate declined slowly between July 2006 and October 2006; was relatively constant between October 2006 and January 2007 and then increased slightly until April 2007. There was a significant increase between April and June 2007. Also evident in Figure 7 is a period of low variability between February and April 2007 followed by a period of substantially higher variability. Whether or not these observations correlate with other known facts or can be attributed to known causes is a matter for the relevant agencies. One way of helping address the ‘significance’ of these variations is through the use of control charts.

1-4 Statistical significance

While a comprehensive treatment of statistical inference is outside the scope of this introductory note, a brief discussion of the main ideas underpinning 'statistical significance' is warranted.

To place the discussion in context, suppose that it is known from lengthy monitoring that the proportion of imported food items that fail inspection is normally distributed with a mean of 0.02493 and a standard deviation of 0.01761. Statisticians write this in an abbreviated way as θ ~ N(0.02493, 0.01761²), where θ denotes the true proportion. Figure 8 shows a plot of the probability density function (or pdf) for this particular normal distribution.

Now, suppose we collect n = 52 weekly proportions and look at the average (arithmetic mean) proportion of containers classified as a quarantine risk. Statistical theory tells us that if θ ~ N(0.02493, 0.01761²), then the average of n sample proportions is distributed as θ̄ ~ N(0.02493, 0.01761²/n). For n = 52, this distribution is shown in Figure 9 (together with the original distribution of Figure 8).

Figure 8. Theoretical distribution for the true proportion of weekday inspection failure rate.


Figure 9. Theoretical distributions for an individual proportion (black) and the average of 52 proportions (red).

It is readily seen from Figure 9 that although both distributions are centred about the same mean value, the distribution of the average is much more tightly concentrated about it. Thus, while an individual proportion of, say, 0.045 or more is expected to occur quite frequently, it is highly unlikely that the average of 52 readings would be this high if the individual values had been drawn from a population described by the red curve in Figure 9. In statistical parlance, we would say that an average (of n=52) proportion of 0.045 or greater is a highly significant result. This immediately raises the question of 'how significant is significant', or where do we draw the line between statistical significance and non-significance? Convention dictates (and this is a potentially dangerous approach without understanding the ramifications) that we choose a cut-off value (call it θ*) such that the probability P(θ̄ > θ*) is 'small' – and small is taken to mean 0.05. For our example, we find that a θ* equal to 0.0379 has an associated 'tail probability' of 0.05 (Figure 10). So, if we are only interested in large proportions, then we will declare a sample average (of n=52) to be statistically significant if it is numerically greater than 0.0379. If we are interested in both increases and decreases in the assumed proportion, we split the 0.05 area into two 'tail' areas, each of 0.025. This generates a two-sided test of significance (Figure 11).


Figure 10. Assumed distribution for mean of n=52 sample proportions. One-tail, 5% 'critical region' identified by red shading.

Figure 11. Assumed distribution for mean of n=52 sample proportions. Two-tail, 5% 'critical regions' identified by red shading.

It is also seen from Figure 11 that the upper limit of 0.0404 is equivalent to 1.96 standard deviations from the mean (it is easily verified that the lower limit is also 1.96 standard deviations from the mean, but obviously in the opposite direction). These 1.96 ‘sigma limits’ or ‘control limits’ form the basis of ‘early warning’ or detection limits on control charts. The multiplier (1.96) determines the width of the limits which in turn determines important statistical properties associated with ‘false triggering’. Thus, there is a trade-off to be struck: very narrow limits will give high sensitivity to shifts in the ‘process’ but at the expense of increased rates of false-triggering. By default, limits on control charts are placed at either 2 or 3 standard deviations from the mean (centre-line). Limits determined through non-statistical considerations (eg. biological or ecological significance) can also be used, although the implications for ‘statistical significance’ would need to be determined if the chart was to be used in an inferential mode.
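The cut-off values quoted above are straightforward to reproduce from the normal distribution. The following sketch uses scipy together with the centre and spread shown in Figures 9–11; it is illustrative only.

```python
# One- and two-tailed cut-offs for the averaged failure rate (cf. Figures 10 and 11).
from scipy.stats import norm

mu, se = 0.02493, 0.007875          # centre and spread used in Figures 9-11

upper_one_tail = norm.ppf(0.95, loc=mu, scale=se)                         # Figure 10
lower_two_tail, upper_two_tail = norm.ppf([0.025, 0.975], loc=mu, scale=se)  # Figure 11

print(round(upper_one_tail, 4))                                 # ~0.0379
print(round(lower_two_tail, 4), round(upper_two_tail, 4))       # ~0.0095 and ~0.0404
```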

1-5 Control Charts

The Basic Shewhart Chart

The simplest control chart is essentially Figure 7 with the addition of upper and/or lower control limits. This could be done manually, although it is easier to have computer software do it. The output from the MINITAB statistical software package is shown in Figure 12.

Note that the upper and lower control limits in Figure 12 are not constant. This is because the denominator in the expression p = N_f / N_I is not constant (as noted by the warning message in the lower right corner of Figure 12). The red plotting symbols in Figure 12 indicate 'violations' or 'excursions' outside the control limits. By itself, this chart raises no particular concerns, other than that there were two occasions when the proportion was 'significantly' high and three occasions when it was 'significantly' low. Various modifications and options are available in the presentation of control charts such as Figure 12. For example, if we think there are two 'epochs' as discussed earlier, we can provide separate limits in each epoch (Figure 12).
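The varying limits of a p-chart follow directly from the binomial standard error evaluated at each day's sample size. A minimal sketch is given below; the daily counts are synthetic stand-ins and a 3-sigma multiplier is assumed (MINITAB's default), so the numbers will not reproduce Figure 12 exactly.

```python
# P-chart limits that vary with the daily sample size (in the spirit of Figure 12).
import numpy as np

rng = np.random.default_rng(3)
n = rng.integers(150, 350, size=60)          # items inspected each day (stand-in)
x = rng.binomial(n, 0.026)                   # items of quarantine concern (stand-in)

p = x / n
p_bar = x.sum() / n.sum()                    # centre line
sigma = np.sqrt(p_bar * (1 - p_bar) / n)     # standard error for each day's n
ucl = p_bar + 3 * sigma
lcl = np.clip(p_bar - 3 * sigma, 0, None)    # proportions cannot be negative

flags = (p > ucl) | (p < lcl)
print(f"centre line = {p_bar:.4f}; {flags.sum()} point(s) outside the limits")
```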

[Figure 12 graphic: P-chart of the daily failure rate with centre line P-bar = 0.0260, UCL = 0.0600, LCL = 0; out-of-control points flagged in red; note: tests performed with unequal sample sizes.]

Figure 12. P-chart for inspection failure rate data.


For data that are collected over time a number of time-based control charts are available. Some of the more common/useful are described below.

Time-based charts

The simplest way of smoothing over time is by ‘block-averaging’. Figure 13 shows a time series divided into a number of non-overlapping ‘blocks’ of constant width. The average of the data in each block is computed and plotted at the centre of the block and these points can be connected by straight line segments to reveal a smoother version of the series.

Figure 13. Smoothing using block averaging.

Block averaging is a relatively unsophisticated way of smoothing and has some potential difficulties – not least of which is that the mean of each block is computed without reference to the rest of the series. In other words, there is no 'history' built into the mean of an individual block, and so the 'smoothed' series can still exhibit some erratic jumps. To overcome this, we can take the basic 'block' or 'window' and step it across the series so that there is overlap. This is achieved by stepping the 'window' along the series, dropping the oldest observation and adding the most recent one at each step. The block averages in this case result in a moving average (MA) of the original series (Figure 14). A MA plot for the weekday inspection failure rate data is shown in Figure 15. The MINITAB help for this procedure is given in Box 3.

Figure 14. Moving average scheme. A block or 'window' is stepped incrementally over the series and the block mean computed and plotted.

Figure 15. Moving average chart of weekday inspection failure rate. Sub-group size =1; MA length=5.
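A moving-average chart of the kind shown in Figure 15 can be assembled from a rolling mean together with limits scaled by the square root of the window length. The sketch below uses pandas on a synthetic stand-in series and is illustrative only.

```python
# Moving average of a daily failure-rate series with MA length 5 (cf. Figure 15).
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
p = pd.Series(rng.normal(0.025, 0.018, 250).clip(min=0))   # stand-in daily rates

ma = p.rolling(window=5).mean()              # 5-point moving average

centre = p.mean()
sigma_ma = p.std() / np.sqrt(5)              # spread of a 5-point average
ucl, lcl = centre + 3 * sigma_ma, max(centre - 3 * sigma_ma, 0)
print(f"centre = {centre:.4f}, UCL = {ucl:.4f}, LCL = {lcl:.4f}")
print(ma.tail(3).round(4).tolist())
```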

Box 3. MINITAB's Help on moving average chart

By adjusting the parameters of the moving average plot, different levels of smoothing can be achieved. For example, Figure 16 shows a MA plot for the proportion data using the averages of 3 weekly sub-groups. With this degree of smoothing, the three periods are clearer.


Figure 16. Moving average for weekday inspection failure rate. Subgroup size defined by week of year (usually 5); MA length=4.

Time-weighted charts

The block or moving average charts just discussed give equal importance or weighting to all data in the current 'window'. While this may be appropriate in some situations, it doesn't accord with the usual notion that the greater the time separation, the less influential the data becomes. Time-weighted control charts such as the Exponentially Weighted Moving Average (EWMA) are more flexible in that the relative weightings given to recent and historical data can be specified.

By way of example, suppose we wish to form a weighted average of the current observation and the (k-1) most recent values. That is, we're interested in forming the weighted mean X̄ = λX₁ + λ²X₂ + ⋯ + λᵏXₖ, where the weighting factor is λ. The requirement that X̄ is an unbiased estimator of the true mean imposes the constraint Σᵢ₌₁ᵏ λⁱ = 1, and for a given k the solution to this is the root of the equation λᵏ⁺¹ − 2λ + 1 = 0. For example, if k = 10, we find λ = 0.5002. A plot of these weights compared to the simple arithmetic mean is shown in Figure 17.

The recursive formula for computing values of the EWMA chart is

EWMA_t = λX_t + (1 − λ)EWMA_{t−1},   0 < λ < 1

In other words, the current EWMA is a weighted average of the current data value and the EWMA in the preceding period. Figure 18 shows the EWMA chart for the weekday failure rate data with λ = 0.2.
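The recursion above is simple to implement directly. The sketch below applies it with λ = 0.2 to a synthetic stand-in series and also checks the unbiasedness equation quoted earlier for k = 10; it is illustrative only.

```python
# EWMA recursion EWMA_t = lam*X_t + (1 - lam)*EWMA_{t-1}, with lam = 0.2 as in Figure 18.
import numpy as np

lam = 0.2
rng = np.random.default_rng(11)
x = rng.normal(0.025, 0.008, 52)   # stand-in for the weekly failure-rate subgroups

ewma = []
prev = x.mean()                    # a common choice of starting value (the overall mean)
for xt in x:
    prev = lam * xt + (1 - lam) * prev
    ewma.append(prev)

# Check of the weighting equation quoted above: for k = 10 the root of
# lam**(k+1) - 2*lam + 1 = 0 is approximately 0.5002.
print(round(0.5002**11 - 2 * 0.5002 + 1, 6))   # close to zero
print([round(v, 4) for v in ewma[-3:]])
```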


Figure 17. Comparison of exponentially declining weights (red bars) compared with equal-weighting scheme (blue rectangles) for k=10.

Figure 18. EWMA chart for weekday inspection failure rate. Subgroup size defined by week of year (usually 5); EWMA weight=0.2.

1-6 Time between events

We have seen how control charts can be used to monitor variables (such as the number of quarantine threats detected) or attributes (eg. the proportion of containers having a quarantine threat). When ‘events’ (eg. the ‘arrival’ of a quarantine risk at a port of entry) occur randomly in time, an alternative approach is to monitor the inter-arrival time. One advantage of this approach is that the inter-arrival time is available at the time of the arrival whereas if we are tracking the number of arrivals then these need to be aggregated over some time period before a meaningful analysis can be performed. However, some modifications to the standard charts are required to accommodate the fact that the distribution of inter-arrival times is usually (highly) non-normal. Details of the theoretical development can be found in Radaelli (1998). More recently, control charts for the number of ‘cases’ between events (so-called ‘g’ and ‘h’ charts) have been developed and applied to monitoring hospital-acquired infections and other relatively rare adverse health-related events (Benneyan 2001a, 2001b).

By way of example, consider Figure 19, which depicts the 'arrival' of a quarantine threat over time. Figure 20 shows a more detailed 'slice' through this pattern of arrivals. By measuring the 'white spaces' (i.e. computing the differences t_{i+1} − t_i) in Figure 20 we obtain data on the inter-arrival times. The complete listing of data for this example is given in Appendix A.

Our analysis of the inter-arrival data commences with an inspection of basic distributional properties (Figure 21).

Figure 19. Time sequence of detection of quarantine threats.


Figure 20. Pattern of inter-arrival times as measured by the 'white space' between blue lines.

It is clear from Figure 21 that these data are highly skewed and that the normal distribution is not an appropriate probability model.

Figure 21. Histogram of inter-arrival times with smoothed version (red line) and theoretical normal distribution (black line) overlaid. The normal distribution provides a poor description of this data (evidenced by both the shape and the probability mass associated with negative values of days between detects).

The smoothed histogram in Figure 21 suggests a ‘J-shaped’ probability model is more appropriate. One such model is the negative exponential probability distribution. This choice is also supported by statistical theory which says that if events arrive randomly in time according to a Poisson probability model with an average rate of arrival of λ per unit time, then the distribution of the inter-arrival time is negative exponential with parameter λ. The probability density function (pdf ) for the negative exponential is given by Equation 1.1 and the corresponding cumulative distribution function (cdf ) is given by Equation 1.2.

$$f_X(x) = \lambda e^{-\lambda x}, \quad \lambda > 0,\; x \ge 0 \qquad (1.1)$$

$$F_X(x) = 1 - e^{-\lambda x}, \quad \lambda > 0,\; x \ge 0 \qquad (1.2)$$

For this distribution, the mean is $1/\lambda$ and the variance is $1/\lambda^2$. Notice that this immediately implies that the variance increases/decreases with an increasing/decreasing mean – in contradiction to many 'conventional' statistical techniques which assume constant variance.

One simple way of estimating the parameter λ is to equate the theoretical and sample means. In this case we have $1/\hat{\lambda} = 7.713$ and hence our estimate is $\hat{\lambda} = 1/7.713 = 0.130$. A plot of the histogram of the data with a negative exponential distribution having $\hat{\lambda} = 0.130$ overlaid is shown in Figure 22. The adequacy of this fit is more readily seen by comparing the empirical and theoretical cumulative distribution functions (Figure 23).
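To make the fitting step concrete, the following Python sketch reproduces the method-of-moments calculation and the cdf comparison on a small illustrative data set; the values and variable names below are assumptions for illustration only (the report's full inter-arrival data are listed in Appendix A).

```python
import numpy as np
from scipy import stats

# Hypothetical inter-arrival times (days between detects); the report's full
# data set is listed in Appendix A.
days_between = np.array([1, 3, 2, 15, 7, 4, 22, 1, 9, 5, 12, 2, 6, 18, 3])

# Method-of-moments estimate: equate the exponential mean 1/lambda to the
# sample mean (as done in the text).
lam_hat = 1.0 / days_between.mean()

# Compare the empirical cdf with the fitted exponential cdf (cf. Figure 23).
x = np.sort(days_between)
ecdf = np.arange(1, len(x) + 1) / len(x)
fitted = stats.expon.cdf(x, scale=1.0 / lam_hat)

print("lambda-hat =", round(lam_hat, 3))
print("max |ECDF - fitted cdf| =", round(float(np.abs(ecdf - fitted).max()), 3))
```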

The 'false-triggering' due to the non-normality of the data is evident in the I-Chart³ of Figure 24. There are two ways of overcoming this. The first is to modify the control chart itself to account for the non-normality. The second approach is to transform the data so that the transformed data are normally distributed (or approximately so) and then apply standard control charting techniques to the transformed data. We consider each of these approaches in turn.


Figure 22. Histogram of days between detects. Smoothed histogram indicated by red curve, theoretical exponential distribution depicted by black curve.

3 An “I-Chart” is simply a control chart for individual observations ie. ungrouped data.


Figure 23. Empirical cdf for days between detects (red curve) and theoretical exponential cdf (blue curve).

Figure 24. Chart of individual values of days between detects (I-Chart). Centre line 7.7; UCL = 27.7; LCL = -12.3.

1-7 Transformations to Normality

In section 1-4 we looked briefly at the issue of statistical significance. The discussion focussed on a normally distributed random variable. Violations of the normality assumption will tend to invalidate the results of any statistical procedure which invokes this assumption⁴. One way to overcome this problem is to identify a mathematical transformation of the data so that the transformed data are normally distributed, or approximately so. There is a tendency among practitioners to spend an inordinate amount of time on the identification of the 'best' transformation (best in the sense that the resulting data are most nearly normal). This is often wasted effort since many statistical procedures (including control charting) are relatively robust to mild to moderate departures from normality. The over-riding objective should be to identify a simple mathematical transformation that at least results in data that are approximately symmetrical. The identification process can be by trial and error or by some 'automated' procedure. An example of the latter is the so-called Box-Cox family of transformations. The idea is simple enough: we wish to find the value of the transformation parameter $\lambda$ (not to be confused with the $\lambda$ in equations 1.1 and 1.2) so that data transformed according to

$$Y = \begin{cases} X^{\lambda} & \lambda \neq 0 \\ \ln(X) & \lambda = 0 \end{cases}$$

exhibit a greater degree of normality than the untransformed data (the Xs). Having found this $\lambda$ we proceed to work with the transformed values, Y. MINITAB and other software packages simplify the task of determining $\lambda$ for a given data set. The 'optimal' $\lambda$ is identified as the abscissa value at the minimum on a Box-Cox 'profile plot' (Figure 25) – in this case we find $\lambda = 0.22$. With this value of $\lambda$ we then transform the data and use control charting methods on the transformed data.

4 The severity of the violation cannot be anticipated in advance since it is a function of the degree to which the assumption is violated, the manner in which it is violated, and the robustness of the statistical procedure to such violations.

Figure 25. Box-Cox profile plot for the days between detects. The estimated optimal lambda is 0.22 (95% confidence interval 0.12 to 0.32).

The effectiveness of the transformation is evident from the histogram of the transformed data (Figure 26).
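A minimal sketch of this workflow is given below, assuming simulated skewed data. Note that scipy's boxcox maximises normality of the $(x^{\lambda}-1)/\lambda$ form, which is a linear rescaling of the simple power transform used in the text, so the estimated $\lambda$ is the same.

```python
import numpy as np
from scipy import stats

# Simulated skewed 'days between detects' data (illustrative only).
days_between = np.random.default_rng(1).exponential(scale=7.713, size=253)

# scipy's boxcox maximises normality of (x**lam - 1)/lam, a linear rescaling
# of the simple power transform x**lam, so the optimal lambda is the same.
_, lam = stats.boxcox(days_between)

# Apply the simple power transform used in the text and check symmetry.
y = days_between ** lam
print("estimated lambda =", round(lam, 2))
print("skewness before:", round(float(stats.skew(days_between)), 2),
      " after:", round(float(stats.skew(y)), 2))
```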

Figure 26. Histogram of transformed days between detects (mean = 1.416, standard deviation = 0.3683, N = 253) with smoothed version (red curve) and theoretical normal (black curve) overlaid.

To see how the transformation process works, consider setting an upper control limit on the days between detects such that this limit would only be exceeded 10% of the time when there has been no change in the underlying response-generating mechanism. In other words, on the transformed scale, we wish to find the value $y^*$ which satisfies

$$P\left(Y > y^*\right) = 0.10.$$

From Figure 26, we see that the transformed data (Y) are well described by a normal distribution having mean 1.416 and standard deviation 0.3683. Either using tables of the normal distribution or computer software (as in Figure 27) we determine that $y^* = 1.89$. We can 'back-transform' this $y^*$ to determine an equivalent $x^*$ on the untransformed scale by noting that

$$P\left(Y > y^*\right) = P\left(X^{\lambda} > y^*\right) = P\left(X > \left(y^*\right)^{1/\lambda}\right).$$

That is, $x^* = \left(y^*\right)^{1/\lambda}$. With $y^* = 1.89$ and $\lambda = 0.22$ we obtain $x^* = 18.1$, which compares favourably to the theoretical result of 17.8 obtained directly from the negative exponential distribution (Figure 28).
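The following short sketch, using the fitted values quoted above, reproduces this back-transformation arithmetic (scipy is assumed to be available).

```python
from scipy import stats

mean_t, sd_t, lam = 1.416, 0.3683, 0.22      # fitted normal on the transformed scale
exp_mean = 7.713                             # fitted exponential mean on the raw scale

y_star = stats.norm.ppf(0.90, loc=mean_t, scale=sd_t)   # upper 10% point, transformed
x_star = y_star ** (1.0 / lam)                           # back-transformed limit
x_exact = stats.expon.ppf(0.90, scale=exp_mean)          # exact exponential limit

print(round(y_star, 2), round(x_star, 1), round(x_exact, 1))   # ~1.89, ~18.1, ~17.8
```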


Figure 27. Fitted normal distribution to transformed days between detects with upper 10% point indicated.

Finally, we plot the I-Chart for the transformed days between detects and note that there is now only one 'out-of-control' situation indicated (Figure 29) compared with the previous seven (Figure 24).


Figure 28. Theoretical negative exponential distribution for untransformed days between detects and upper 10% point indicated.

Figure 29. I-Chart for transformed days between detects. Centre line 1.416; UCL = 2.474; LCL = 0.357.

1-8 Control chart for time-between-events

Rather than transform the data as described in the preceding section, alternative methods have been developed which modify existing control charts for use with untransformed (and non-normal) data. Radaelli (1998) describes procedures for setting control limits for both one- and two-sided control charts for inter-arrival times. Only the one-sided case is considered here since we are generally only interested in tracking significant deviations in one direction (eg. where the inter-arrival time between quarantine risk detects is decreasing).

Let $X_i$ be the $i$th inter-arrival time. An 'out-of-control' situation is declared if $X_i < T_L$ in the case of decreasing inter-arrival times (ie. increasing counts) or $X_i > T_U$ in the case of increasing inter-arrival times (ie. decreasing counts), where $T_L$ and $T_U$ are suitably chosen positive constants. Suppose that an 'in-control' situation corresponds to a mean inter-arrival time of $1/\lambda_0$ (where $\lambda$ is the parameter in equations 1.1 and 1.2); then using Equation 1.2 it can be determined that

$$P\left(X_i < T_L \mid \lambda_0\right) = 1 - e^{-\lambda_0 T_L} \qquad (1.3)$$

$$P\left(X_i > T_U \mid \lambda_0\right) = e^{-\lambda_0 T_U} \qquad (1.4)$$

Equations 1.3 and 1.4 are analogous to the Type I error in a hypothesis test: each is the probability of a false positive. As in statistical hypothesis testing, the Type I error-rate (α) is set to some arbitrarily small value (eg. α = 0.05). Thus, the upper and lower control limits can be determined by setting Equations 1.3 and 1.4 equal to α and solving for $T_L$ or $T_U$. Thus we have:

$$T_L = -\frac{1}{\lambda_0}\ln\left(1-\alpha\right) \qquad (1.5)$$

$$T_U = -\frac{1}{\lambda_0}\ln\left(\alpha\right) \qquad (1.6)$$

In addition to having a low α, we also require our control chart to correctly signal an important deviation from 'in-control' conditions. Suppose we wish to detect a change from $\lambda_0$ to $\lambda_1$ with some high probability $(1-\beta)$, where $\lambda_1 = k\lambda_0$ (k > 1 for a one-sided lower chart; k < 1 for a one-sided upper chart). That is:

$$P\left(X_i < T_L \mid \lambda_1\right) = 1 - e^{-k\lambda_0 T_L} \ge (1-\beta) \quad \text{(lower chart)} \qquad (1.7)$$

$$P\left(X_i > T_U \mid \lambda_1\right) = e^{-k\lambda_0 T_U} \ge (1-\beta) \quad \text{(upper chart)} \qquad (1.8)$$

Substituting $T_L$ and $T_U$ from Equations 1.5 and 1.6 into Equations 1.7 and 1.8 respectively, we obtain:

$$(1-\beta) \le 1 - e^{k\ln(1-\alpha)} \quad \text{(lower chart)} \qquad (1.9)$$

$$(1-\beta) \le e^{k\ln\alpha} \quad \text{(upper chart)} \qquad (1.10)$$

The performance characteristics for both lower and upper one-sided charts are shown in Figures 30 and 31. Both of these figures show that the ability to detect even relatively large shifts (eg. a doubling or halving) in the mean inter-arrival time is low (typically less than 0.2) for values of α less than 0.1. For example, using a 10% level of significance (ie. α = 0.10), the one-sided chart of Figure 30 suggests that there is a less than 20% chance of detecting a doubling (ie. k = 2) of the mean arrival rate (or a halving of the mean inter-arrival time).

Figure 30. Performance characteristics (1-β, as measured by equation 1.9) as a function of k and α for a one-sided, lower control chart.

Figure 31. Performance characteristics (1-β, as measured by equation 1.10) as a function of k and α for a one-sided, upper control chart.

1-9 DISCUSSION

In this chapter we have provided a review of basic statistical concepts as well as introducing some common control-charting techniques that have been advocated elsewhere (Carpenter 2001, Commonwealth of Australia 2002) as being particularly suited to monitoring for temporal trends and aberrations in biosecurity-related applications. Control charts are particularly well suited to the visualisation and assessment of moderate to large volumes of time-based data and as such would be expected to have greater utility for, say, container inspection regimes than for detecting the occurrence (in space) of an invasive species. Control charts need to be viewed as just one method in a tool-kit of available techniques which can potentially assist field officers and quarantine risk assessors in identifying 'unusual' or 'aberrant' trends. For events having very low probabilities of occurrence (eg. an exotic disease outbreak) the 'time between outbreaks' is potentially a more useful quantity to chart, although as shown in this report, the statistical power (the ability to correctly identify real 'shifts' in the mean time between events) of current charting techniques is relatively low.

Since the events of September 2001, there has been a substantial research push in the area of 'syndromic surveillance', with the accompanying development of new approaches and methods for detecting unusual patterns in a space-time continuum. Some of these techniques would appear to have direct applicability to the activities of Biosecurity Australia and AQIS.



CHAPTER 2: Bayesian Control Charting

2-1 INTRODUCTION

In this chapter we investigate some more specific aspects of control charting and, in particular, focus on the use of Bayesian statistical methods. This work advances the conventional control charting methods described in chapter 1 by adaptively updating the alerting mechanisms as well as explicitly incorporating prior belief about the state of the monitored 'system'. The methodology is developed in the context of routine quarantine inspection of imported foods, although the potential applications extend to other areas of bio-surveillance where data is being gathered over time and 'early warning' triggers are required.

Detailed background information on current quarantine inspection practice can be found in the report by Robinson et al. (2008). Robinson et al. (2008) also provide details of a suggested risk-based framework for allocating scarce resources to the monitoring and surveillance effort.

The present study is concerned with the implementation phase of a specific risk-based approach to monitoring. Most quarantine surveillance and monitoring programs are candidates for a statistical approach since they invariably involve small sampling fractions and an overarching requirement to balance the cost of sampling with the probability of failing to detect a threat. Our focus in this chapter is on the temporal component of monitoring – that is, detecting important 'shifts' or 'aberrations' in monitored data in close to real time. Chapters three and four examine statistical methods associated with the spatial dimension of bio-surveillance.

A common requirement of statistical surveillance techniques is to detect important changes in a stochastic process at an unknown time, as quickly and as accurately as possible (Sonesson and Bock 2003). Many of the reported techniques use likelihood-based methods to detect step changes in a parameter of interest (eg. process mean or variance). While a number of papers have appeared recently on statistical surveillance in the context of epidemiology, public health, and syndromic surveillance (Doherr and Audigé 2001, Sonesson and Bock 2003, Marshall et al. 2004, Höhle and Paul 2008), relatively little has been published on quarantine inspection.

The utility of conventional statistical process control (SPC) tools such as the Shewhart chart, EWMA chart, and other control charting variants for quarantine inspection was covered in chapter one of this report. Control charting methods have a number of attributes which make them particularly well-suited to the task of identifying ‘abnormal’ trends in the detection of non-compliant shipments, such as an ‘early-warning’ capability and easily communicated visual displays of historical results. While some of these tools (such as the EWMA chart) have the ability to couple past history and present observations, they are conventionally data-driven approaches that do not readily accommodate expert opinion or existing understanding about the underlying response-generating process. This is potentially an important consideration, particularly when a new commodity or product is shipped into the country for which historical data does not exist or in cases where other ancillary information concerning the commodity becomes available (for example increased susceptibility to contamination at the point of manufacture).

Another difficulty with standard control-charting tools is that they are constructed on models which assume process parameters are known exactly and observations are i.i.d.⁵ (Tsiamyrtzis and Hawkins 2007). This is problematic: first, parameter values are rarely known and, secondly, the assumption of i.i.d. data is frequently violated – particularly for time-series data, which often exhibit moderate to strong autocorrelation. More recently, Bayesian control charting methods have been developed to help overcome some of these limitations. Baron (2001) used the theory of optimal stopping of Markov sequences to develop efficient algorithms for the detection of a distributional change in sequentially collected data, while Hamada (2002) used Bayesian tolerance interval control limits in the context of attribute sampling. Our approach to Bayesian control charting for quarantine inspection has been motivated by Menzefricke (2002), who used Bayesian predictive distributions to derive rejection regions for various monitoring applications.

5 independently and identically distributed

It is not the aim of this report to provide a comprehensive review of Bayesian control charting techniques. However, to facilitate the subsequent mathematical development we digress momentarily to explain some of the underlying concepts. Readers requiring a more comprehensive treatment of the topic may find the collection of papers in Colosimo and del Castillo (2007) a useful entry point.

2-2 Fundamentals of Bayesian Control Charting

The techniques presented in chapter one fall within the realm of 'frequentist' statistics. This mode of statistical thinking is by far the most common and underpins nearly every undergraduate course in statistics. The frequentist view of the world is one in which the only admissible probabilities are those that are expressible as the ratio of the number of outcomes which are favourable to the 'event' under consideration to the total number of outcomes, or alternatively, can be thought of as the limiting value of the relative frequency of some phenomenon – hence the term frequentist statistics. A point of clear demarcation between frequentist and Bayesian statistics is the role of subjective probability. There is no role for subjective probability in a frequentist framework; for Bayesians it is pivotal.

The term Bayesian derives from the Rev. Thomas Bayes (b. 1702, London – d. 1761). Bayes was not known as a mathematician and his only significant work, "Essay Towards Solving a Problem in the Doctrine of Chances" (1763), was published posthumously in the Philosophical Transactions of the Royal Society of London. Although a largely turgid piece of work, Bayes' essay identified a fundamental proposition in probability. This was a profound insight and provided a logical and consistent way of updating prior belief or probability in the light of new evidence. The formula was named Bayes' rule or theorem after him. The updated probability is referred to as the posterior probability. Unlike frequentist statistical inference, which tests hypotheses or estimates (unknown) parameters on the basis of information contained in data alone, the Bayesian paradigm combines prior belief about unknown parameters with evidence from data using Bayes' rule. More formally, the aim of Bayesian inference is to make inferences about a parameter $\theta$ or a future observation $\tilde{y}$ using probability statements conditional on the data $y$. Both parameters and future observations are treated as random variables in a Bayesian framework and we talk of the posterior density of $\theta$ [denoted $p(\theta \mid y)$] and the posterior predictive density of $\tilde{y}$ [denoted $p(\tilde{y} \mid y)$].

The simplest version of Bayes' rule for two 'events' A and B says that the conditional probability that event A occurs given that event B has occurred is given by the formula:

$$P\left(A \mid B\right) = \frac{P\left(A \cap B\right)}{P\left(B\right)}$$

where the probability in the numerator is the joint probability (i.e. the probability that both A and B occur). Bayes' theorem applies equally to probability density functions (pdf). Thus if $y$ denotes data and $\theta$ some parameter or vector of parameters, then

$$P\left(\theta \mid y\right) = \frac{P\left(y, \theta\right)}{P\left(y\right)}$$

and the numerator is the joint probability density function for $y$ and $\theta$. The roles of $y$ and $\theta$ can be interchanged in this formula and we have immediately that:

$$P\left(y \mid \theta\right) = \frac{P\left(y, \theta\right)}{P\left(\theta\right)}$$

and a comparison of $P\left(\theta \mid y\right)$ and $P\left(y \mid \theta\right)$ reveals that

$$P\left(y, \theta\right) = P\left(\theta\right)P\left(y \mid \theta\right) = P\left(y\right)P\left(\theta \mid y\right).$$

Finally, substituting this last expression for $P\left(y, \theta\right)$ into the expression for $P\left(\theta \mid y\right)$ gives Bayes' law for densities:

$$P\left(\theta \mid y\right) = \frac{P\left(\theta\right)P\left(y \mid \theta\right)}{P\left(y\right)}.$$

This formula takes a prior density for $\theta$ [i.e. $P(\theta)$] and converts it into a posterior density $P(\theta \mid y)$ via the term $P(y \mid \theta)/P(y)$, called the Bayes factor. The denominator in the expression for the posterior density does not involve $\theta$ and only serves to normalise the pdf (i.e. make it integrate to unity). Inference for $\theta$ based on the posterior is therefore unaffected by working with $P(\theta)P(y \mid \theta)$ instead of the normalised posterior. Thus we see that the posterior distribution is proportional to the product of the prior and the likelihood of the data. In frequentist inference only the likelihood is used; in Bayesian statistics the likelihood is modified by our prior belief.

In the remainder of this chapter we describe a novel Bayesian control charting approach to quarantine inspection. The motivation in the present context is that conventional (i.e. non-Bayesian or frequentist) approaches to control charting need to be 'primed' with hard data in the absence of known parameter values. While this might not be an issue in a manufacturing context where production data is both plentiful and continuous, it is problematic for the monitoring of processes for which little background data is available. This problem becomes particularly acute for the development of a surveillance program aimed at detecting a new threat for which there is no prior data. Similarly, in the case of monitoring a rare phenomenon, a paucity of data is inevitable. In such cases the monitoring data will comprise a string of zeros – corresponding to 'no detect' outcomes. Frequentist statistical methods will thus estimate the true rate of occurrence as zero with a standard error of zero. The Bayesian paradigm, on the other hand, commences with the specification of a prior density for the parameter of interest (such as the true rate of occurrence) and continually updates this as new data becomes available. The method is illustrated with an application to the food import data used in the previous chapter (Robinson et al. 2008).

2-3 A BAYESIAN CONTROL CHART FOR QUARANTINE INSPECTION

The following development assumes attribute sampling whereby at time t, $n_t$ 'units' are selected from a total volume of trade comprising $N_t$ units and the result of inspection is a binary outcome: "pass" or "fail". The time index t will generally represent daily increments. On each sampling occasion two related control-charting questions are considered: (i) is the observed failure rate for the current sample within acceptable limits?; and (ii) is the cumulative failure rate for all samples inspected to date within acceptable limits? These objectives mirror the detection of 'pulse' and 'press' stresses in natural ecosystem management (Underwood, 1994). In answering questions (i) and (ii) we wish to incorporate both historical monitoring data and prior information on the true failure rate, $\theta$. We do this through the use of a conditional probability distribution for the data given $\theta$ and a prior probability model for $\theta$. These two elements can be combined to obtain a predictive distribution for a new sample. The mathematical detail is developed in the following section. Readers not interested in this can skip forward to section 2-4.

2-3-1 Mathematical detail

The problem as formulated leads us to consider the random variable $X_t$ – the number of failed units in the sample of $n_t$ taken at time t. Assuming independent Bernoulli trials for each inspected unit, the conditional distribution of $X_t \mid \theta$ is binomial (we have dropped the time subscript to improve clarity; it is understood that all results pertain to the current sample unless otherwise indicated):

$$f_X\left(x \mid \theta\right) = \binom{n}{x}\theta^{x}\left(1-\theta\right)^{n-x}; \quad x = 0, 1, \ldots, n,\; 0 \le \theta \le 1 \qquad (2.1)$$

Uncertainty in the true failure rate is reflected in the prior distribution $p(\theta)$. A suitable choice for $p(\theta)$ is the beta density:

$$p\left(\theta; a, b\right) = \frac{1}{\beta\left(a,b\right)}\,\theta^{a-1}\left(1-\theta\right)^{b-1}; \quad 0 \le \theta \le 1,\; a > 0,\; b > 0 \qquad (2.2)$$

Initial values for the a and b parameters in equation 2.2 can be chosen according to various strategies depending on how much or how little we know about the true rate of risk for a particular commodity, country, test etc. Robinson et al. (2008) discuss some of these strategies in the context of food imports and recommended the use of a Jeffreys prior corresponding to $a = 0.5$ and $b = 0.5$. A plot of this Jeffreys prior and two 'vague' or 'non-informative' prior densities is shown in Figure 32.

Updating the prior

The underlying principle in the adaptive monitoring process is that our estimate of the true failure rate is constantly revised as new data is gathered. In the early stages of monitoring, our probability model for the true failure rate will be driven by prior information.
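For intuition, the sketch below shows how a Jeffreys beta prior sharpens under a standard conjugate beta-binomial update as hypothetical daily inspection results accrue. Note that the report's own updating scheme (section 2-3-1) instead re-estimates the beta parameters by maximum likelihood, so this is an illustration of the principle rather than of the method used here.

```python
from scipy import stats

# Jeffreys prior for the consignment failure rate (a = b = 0.5).
a, b = 0.5, 0.5

# Hypothetical daily inspection results as (n_t, x_t) pairs: a standard
# conjugate beta-binomial update, shown only to illustrate how the prior
# sharpens; the report's scheme re-estimates a and b by maximum likelihood.
inspections = [(20, 1), (35, 0), (25, 2)]
for n, x in inspections:
    a, b = a + x, b + n - x

posterior = stats.beta(a, b)
print("posterior mean =", round(posterior.mean(), 3),
      "one-sided 95% upper limit =", round(posterior.ppf(0.95), 3))
```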

Figure 32. Illustrative non-informative priors for the true failure rate, theta: beta(0.5,0.5), beta(1,1), and beta(2,2).

At each time increment the current prior probability for $\theta$ is updated using standard Bayesian methods to generate a posterior marginal pdf. The procedure is described below.

At time t we have available the history of observed failures up to and including the current observation – denote this as $x_1, x_2, \ldots, x_N$, where $N = \sum_{i=1}^{t} n_i$ is the total number of sampled units. We let $Y$ denote the total number of failures at time t, i.e. $Y = \sum_{i=1}^{t} X_i$. For a stable process, the distribution of Y is also binomial with parameters $(N, \theta)$. However N will rapidly become 'large' (i.e. greater than ~30) and, provided $\theta$ is not close to either zero or one, the binomial distribution is well approximated by a Poisson distribution with mean $N\theta$.

Now the marginal posterior distribution for Y as a function of the parameters a and b is:

1 p yab,,  l y p ab d (2.3) 0

where $l(y \mid \theta)$ is the likelihood of the data ($y$) given $\theta$. Thus, equation 2.3 can be written as:

$$p\left(y \mid a, b\right) = \int_0^1 \frac{e^{-N\theta}\left(N\theta\right)^{y}}{y!}\;\frac{1}{\beta\left(a,b\right)}\,\theta^{a-1}\left(1-\theta\right)^{b-1}\, d\theta \qquad (2.4)$$

Equation 2.4 can be evaluated using numerical integration or alternatively computed using equation 2.5 (a derivation of equation 2.5 is provided in Appendix B).

$$p\left(y \mid a, b\right) = \frac{N^{y}\,\beta\!\left(y+a,\, b\right)}{y!\,\beta\left(a,b\right)}\sum_{m=0}^{\infty}\frac{\left(-N\right)^{m}}{m!}\prod_{r=0}^{m-1}\frac{y+a+r}{y+a+b+r}; \quad y = 0, 1, \ldots, N;\; a, b > 0 \qquad (2.5)$$
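Equation 2.4 is also easy to evaluate directly by numerical integration, as in the following sketch; the sample size and prior parameters used are illustrative assumptions only.

```python
import math
from scipy import integrate, special

def marginal_poisson_beta(y, N, a, b):
    """p(y | a, b): Poisson(N*theta) likelihood for the cumulative failure
    count integrated against a beta(a, b) prior (the integral in equation 2.4)."""
    def integrand(theta):
        return (math.exp(-N * theta) * (N * theta) ** y / math.factorial(y)
                * theta ** (a - 1) * (1 - theta) ** (b - 1) / special.beta(a, b))
    value, _ = integrate.quad(integrand, 0.0, 1.0)
    return value

# Illustrative values only: 30 units inspected so far and a beta(4, 20) prior.
probs = [marginal_poisson_beta(y, N=30, a=4, b=20) for y in range(11)]
print([round(p, 4) for p in probs])
```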

At time t we have available the data $\left(N_1, Y_1\right), \left(N_2, Y_2\right), \ldots, \left(N_t, Y_t\right)$. The likelihood is thus

$$l\left(a, b; y\right) = \prod_{i=1}^{t} p\left(X_i = y_i - y_{i-1} \mid a, b\right); \quad y_0 \equiv 0 \qquad (2.6)$$

The maximum likelihood estimates $\hat{a}$ and $\hat{b}$ at time t are found by simultaneously solving equations 2.7a and 2.7b.

$$\hat{a}:\ \left.\frac{\partial\, l\left(a, b; y\right)}{\partial a}\right|_{a=\hat{a}} = 0 \qquad (2.7a)$$

$$\hat{b}:\ \left.\frac{\partial\, l\left(a, b; y\right)}{\partial b}\right|_{b=\hat{b}} = 0 \qquad (2.7b)$$

The updated distribution for $\theta$ at time t is equation 2.2 with parameters $\hat{a}$ and $\hat{b}$. We use this posterior to obtain the predictive distributions for the number of failures $\left(X_{t+1}\right)$ in the next sample of units to be inspected and the cumulative number of failures $\left(Y_{t+1}\right)$.

Predictive distributions for X t1 and Yt1

The predictive distribution for X t1 can be written as

1 p Xy fx p yd tt11  t  (2.8) 0

where $p\left(\theta \mid y\right) = \dfrac{p\left(y \mid \theta\right)\,p\left(\theta\right)}{p\left(y\right)}$; $p\left(y\right) = \displaystyle\int_0^1 p\left(y \mid \theta\right)\,p\left(\theta\right)\, d\theta$; and $p\left(\theta\right)$ is based on the most recent estimate using equation 2.2 with parameters $\hat{a}, \hat{b}$.

It can be shown (Appendix C) that $p\left(X_{t+1} \mid y_t\right)$ is given by equation 2.9.

$$p\left(X_{t+1}=x_{t+1} \mid y_t\right) = \frac{\displaystyle\binom{n_{t+1}}{x_{t+1}}\,\beta\!\left(x_{t+1}+y_t+a,\; n_{t+1}-x_{t+1}+b\right)\sum_{m=0}^{\infty}\frac{\left(-N_t\right)^{m}}{m!}\prod_{r=0}^{m-1}\frac{x_{t+1}+y_t+a+r}{n_{t+1}+y_t+a+b+r}}{\displaystyle\beta\!\left(y_t+a,\; b\right)\sum_{m=0}^{\infty}\frac{\left(-N_t\right)^{m}}{m!}\prod_{r=0}^{m-1}\frac{y_t+a+r}{y_t+a+b+r}} \qquad (2.9)$$

We next consider the predictive distribution for $Y_{t+1}$. First, it can be seen that $p\left(Y_{t+1}=s \mid Y_t=y\right) = p\left(X_{t+1}=s-y\right)$. The unconditional distribution of $X_{t+1}$, $p\left(X_{t+1}\right)$, is obtained as follows:

$$p\left(X_{t+1}\right) = \int_0^1 p\left(X_{t+1} \mid \theta\right)\, p\left(\theta\right)\, d\theta$$

where $X_{t+1} \mid \theta \sim \mathrm{bin}\left(n_{t+1}, \theta\right)$ and $\theta \sim \mathrm{beta}\left(a, b\right)$. Thus $p\left(X_{t+1}\right)$ is a beta-binomial distribution and the predictive distribution for $Y_{t+1}$ is therefore:

nt1  s  yan, t1 syb pYtt1  sY y  (2.10) sy  ab,

We next discuss how the predictive distributions are used to set control limits for routine inspection programs.

2-3-2 Adaptive control limits

The idea of a control limit is to provide an early warning that the underlying response-generating mechanism has departed from an assumed stable state. In the present context we wish to set two limits (designated RL1 and RL2) on the number of failures in the next batch of sampled units. RL1 is set such that, when exceeded, it draws our attention to the fact that there is a higher number of failures in that particular sample of n units than would be expected. Exceedence of RL2 signifies an unusually high number of failures which would significantly increase the cumulative failure rate. Clearly the two limits are related, since triggering of RL2 implies a triggering of RL1, although the converse is not necessarily true. Thus, the second limit will tend to be more liberal than the first.

The $(1-\varepsilon)100\%$ response levels RL1 and RL2 are obtained by solving equations 2.11 and 2.12 respectively.

nntt11 RL11(; nttttt ,) y r : p X 1  r Y  y p X  1  r Y  y xr xr1 (2.11)

nntt11 RL21(; nttttt ,) y r : p Y 1  r Y  y p Y  1  r Y  y (2.12) xr xr1

These ideas are illustrated with an example using the AQIS food import data set (see Chapter 1) for the period 1-Jul-2006 to 30-Jun-2007. Detailed information about the data collection methods can be found in Robinson et al. (2008).

2-4 EXAMPLE – AN ADAPTIVE CONTROL CHART FOR FOOD IMPORTS

Between 4/7/2006 and 29/6/2007 a total of 1,718 items were imported from a particular country. These were predominantly food items such as soy sauce, instant noodles, fish, pasta, and crabmeat. There are generally multiple consignments each day and the AQIS database records a “PASS/FAIL” result for each consignment.

2-4-1 Updating the prior

For the purpose of illustrating the proposed control charting methods, we have aggregated the results on a daily basis and simply noted the number of failures $x_t$ out of $n_t$ consignments on day t. A plot of the observed daily failure rate and cumulative failure rate is shown in Figure 33. Initial estimates of the failure rate are highly variable, although they ultimately converge to about 3% as evidenced by the blue trace in Figure 33.

Figure 33. Daily consignment failure rate (red curve) and cumulative failure rate (blue curve) for imports between 4/7/2006 and 29/6/2007, plotted against days since July 4 2006.

We initially assumed a beta(4,20) distribution for the prior on $\theta$, which has 99% of its probability mass between zero and 0.374, a mean of 0.167 and a modal value of 0.136. This choice reflects little or no prior knowledge about $\theta$ other than that we expect it to be less than 0.4. Using the methods of the previous sections we can update the prior at any point in time using all the information available at that time. This could be as frequently as every day or, say, once a month. Figure 34 shows the situation at the end of a year of monitoring.

The top panel in Figure 34 shows the initial beta(4,20) prior (blue curve) and the posterior density at the end of the 1 year period (red curve). The posterior is a beta(6.248,165.337) distribution which has a mean of 0.036, a median of 0.035, and a modal value of 0.031. 99% of the posterior distribution lies in the interval (0, 0.08).

The lower panel of Figure 34 shows the cumulative failure rate as a function of time since monitoring commenced.


Figure 34. Top: original (subjective) prior density (blue curve) and posterior density (red curve) for true failure rate after 1 year. Bottom: Empirical cumulative failure rate (red line), overall mean failure rate (blue dotted line) and mean of posterior distribution after 1 year (green dashed line).

It is evident from Figure 34 and the related distributional summaries that the relatively vague prior has been considerably 'sharpened' after a year of monitoring. The posterior density is compactly centred about the overall failure rate of 0.03, and a Bayesian 95% highest posterior density credibility interval for the true consignment failure rate is readily determined to be {0.011, 0.065}. The upper limit of a one-sided 95% credibility interval is 0.063, which suggests that a failure rate of more than the equivalent of 1 in 16 is evidence of a significant increase in import failure. A more refined instrument for alerting to changing failure status has been provided in the form of the two triggers, RL1 and RL2. We next illustrate how these operate with reference to the present example.
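These distributional summaries are easily checked; the sketch below evaluates the stated prior and posterior with scipy (the interval reported is the equal-tailed interval, which differs slightly from the highest posterior density interval quoted above).

```python
from scipy import stats

prior = stats.beta(4, 20)                     # initial prior on theta
posterior = stats.beta(6.248, 165.337)        # posterior after one year of data

print("prior mean, 99th percentile:", round(prior.mean(), 3), round(prior.ppf(0.99), 3))
print("posterior mean, median, mode:", round(posterior.mean(), 3),
      round(posterior.median(), 3), round((6.248 - 1) / (6.248 + 165.337 - 2), 3))
print("equal-tailed 95% interval:", [round(q, 3) for q in posterior.interval(0.95)])
print("one-sided 95% upper limit:", round(posterior.ppf(0.95), 3))
```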

2-4-2 Setting adaptive triggers

By way of example, suppose the current date is August 7, 2006 and we wish to place approximate 99% limits on the number of failures for imports on the following day. There were a total of 128 consignments from the country in question since the start of our monitoring period (which we take to be July 4 2006), three of which failed inspection, giving a current failure rate of 2.3%. Solving equations 2.7a and 2.7b we obtain the maximum likelihood estimates $\hat{a} = 3.805$ and $\hat{b} = 167.819$. There are 17 consignments on August 8 2006. With $N = 128$, $y = 3$, $n = 17$ and $\varepsilon = 0.01$ we use equation 2.11 to determine RL1 = 2 and equation 2.12 to determine RL2 = 2. Note that, because the outcome of inspection is a discrete random variable, equations 2.11 and 2.12 will generally not be able to be satisfied exactly. Thus in this case the actual value for $\varepsilon$ is 0.008 for RL1 and 0.01 for RL2, as distinct from the nominal $\varepsilon = 0.01$. As it turned out, there were 5 failures on August 8 2006 and this outcome would have tripped both our triggers for further investigation.

We also note that in the early stages of monitoring, RL1 and RL2 will be quite close (in this case they're identical), reflecting the fact that not much data has been gathered and a significant increase in the failure rate on any one occasion has a relatively large impact on the cumulative failure rate. As monitoring progresses, there will tend to be greater separation between RL1 and RL2, although the difference will still tend to be small given the relatively small sample sizes involved. By way of example, we now advance to June 28 2007. By this time, there have been 1,680 consignments resulting in 58 failures. The approximate $\varepsilon = 0.01$ triggers for the following day's 15 consignments are RL1 = 2 and RL2 = 3. The actual result was zero failures, which is clearly acceptable.

2-5 Discussion

The adoption of a Bayesian framework has allowed us to extend the traditional control charting methods discussed in chapter 1 to accommodate expert opinion and/or prior belief about the monitored process. Furthermore, the Bayesian approach provides some other important enhancements. For example, 'ignorance' about a new or previously undetected threat is readily accommodated and the intrinsic updating of prior information means that these methods are evolutionary, learning and adaptive. We believe these are important prerequisites for a successful biosecurity surveillance and monitoring system.

It is important to distinguish between monitoring activities that aim to predict or forecast future events and those whose primary objective is to alert or flag the existence of an abnormal event. A comprehensive biosurveillance monitoring strategy will incorporate both pro-active and reactive components. Control charting techniques are reactive, although depending on how they are constructed and implemented, they can provide a close to real-time monitoring capability. A difficulty with pro-active systems such as those used in syndromic surveillance is that forecasting (particularly of rare events) is exceedingly difficult, with success depending very much on model choice and parameterisation. Indeed, as noted by Burkhom et al. (2007), a critical issue for syndromic surveillance / forecasting systems is their sensitivity to "expected and unexpected data outliers". Burkhom et al. (2007) go on to further state that "for unexpected outliers, we have implemented automated outlier removal schemes to avoid baseline contamination for the adaptive regression, but such schemes can produce unexpected effects and need further study". We regard this as a flawed strategy for two reasons: (i) an "expected outlier" is an oxymoron; and (ii) the automated removal of observations that are, in some sense, aberrant is to be strenuously avoided. It was precisely because of the automated removal of 'outliers' that the hole in the ozone layer was initially undetected. It was only when the 'offending' data was reinstated and the time series data reanalysed that the seriousness of the problem became apparent. Given that the utility of forward-looking systems is critically dependent on which data is included/excluded in the modelling process, it seems to us that data screening tools such as control charts have an important role to play in the development of prospective, forecasting tools.

Hitherto, Bayesian methods have not been widely used in biosecurity / biosurveillance applications, although a number of papers have appeared recently which suggest that there is a growing awareness of the potential utility of this statistical paradigm. Wong et al. (2005) used Bayesian networks to extend the Population-wide Anomaly Detection and Assessment (PANDA) algorithm for syndromic surveillance, while Hogan et al. (2007) describe a Bayesian aerosol release detector (BARD) that combines medical surveillance and meteorological data to provide an early warning capability for the release of B. anthracis.

In this chapter we have outlined a Bayesian approach to control charting within the context of quarantine monitoring and inspection. We have provided a proof-of-concept evaluation of the method using AQIS food import data (Robinson et al. 2008) which demonstrates that the approach shows promise and warrants further development and evaluation. Future work could usefully focus on the elicitation of prior probability distributions as well as the incorporation of other covariates.


CHAPTER 3: Space-Time Disease Spread

3-1 INTRODUCTION

Australia is free of the world's worst animal diseases such as foot and mouth disease (FMD) and bird flu (avian influenza H5N1), although the list of potential threats is long (http://www.daff.gov.au/animal-plant-health/pests-diseases-weeds/animal). There are good reasons for taking whatever steps are necessary to ensure that this status is maintained. The 2001 FMD outbreak in the United Kingdom had disastrous consequences, with the slaughter of over 4.2 million animals and substantial economic loss (Riley 2007).

Four outbreaks of what is thought to have been FMD occurred in Australia in the nineteenth century; however, there have been no reported outbreaks for over 100 years. Equine influenza (EI) is another highly contagious disease, afflicting horses, donkeys, mules, and zebras. Shortly after Animal Health Australia released its disease strategy for equine influenza (Animal Health Australia 2007), an EI outbreak was detected in New South Wales. The disease spread rapidly through northern NSW into Queensland, where it concentrated in the Brisbane region (DPI 2008). It wasn't until Christmas Day 2008 that Australia was officially declared EI disease-free. In handing down his findings, The Hon. Ian Callinan AC highlighted shortcomings in the Government's monitoring and surveillance protocols for biosecurity threats (Callinan 2008). Similarly, an analysis of the 2001 FMD outbreak in the United Kingdom revealed serious shortcomings in data collection, processing, and analysis activities in the initial stages of the outbreak (AusVet-CSIRO 2005). The AusVet-CSIRO report suggested that Australia accordingly review its data requirements and mathematical modelling in order to understand the quantitative aspects of animal disease outbreaks. In a recent review of infectious disease outbreaks, Riley (2007) noted a number of interesting phenomena in the spatial dynamics of disease propagation in human and animal populations. Examples of these included "spatial waves of infection" and the tendency of


disease incidence to occur in spatial clusters. The phenomenon of epidemic travelling waves is not new, with historical examples provided by the European plague in the Middle Ages, the influenza pandemic in the early 20th century, and the spread of cholera in Asia and Eastern Europe during the 1960s (Fuentes and Kuperman 1999). As with any disease, prevention is better than cure, and continuous surveillance coupled with stringent border and pre-border controls is essential to the maintenance of Australia's disease-free status. However, if an outbreak of a highly contagious and economically devastating disease such as EI or FMD were to occur, the ability to predict the subsequent spread of the disease would greatly enhance the prospects of early control and containment. As in the 2007 EI outbreak, enhancements to the biosecurity network can only be made once the deficiencies are understood. To this end, an ability to pin-point the location of the initial outbreak is a critical first step. This chapter details the outcome of investigations into the development of a cellular automata model to describe the spatial dynamics of infectious disease spread. Additionally, the output of the cellular automata model is combined with limited, spatially and temporally referenced empirical data on actual disease numbers to provide an estimate of the most likely location and time of the initial outbreak.

3-2 A CELLULAR AUTOMATA MODEL FOR DISEASE SPREAD

There is a considerable literature on epidemic models, although most population models are zero-dimensional and describe intrinsic epidemiological features such as the existence of threshold values for the spread of an infection, the asymptotic solution for the density of infected individuals, and density-dependent effects (Fuentes and Kuperman 1999). However, Riley (2007) suggests that spatial models of infectious disease transmission which integrate knowledge of the infection process are "the only plausible experimental system ... to investigate observed patterns and to evaluate alternative intervention options".

Mathematical models describing population dynamics usually use either differential or difference equations, depending on whether 'time' is treated as a continuous or a discrete variable. An alternative to this approach is cellular automata (CA) methods, having


discrete time increments and a matrix representation of a geographical network. A set of rules governs the evolution of the automata such that the state of an element at each time step is expressed in terms of its own state and those of its neighbours at earlier time steps. As noted by Fuentes and Kuperman (1999), CA methods enjoy a number of advantages over conventional differential-difference equation approaches, such as considerably faster computational speeds and the ease with which certain epidemiological features, as well as local and seasonal effects, can be incorporated. CA models have been used to model a range of problems associated with pathogen and disease spread, including rabies in fox populations (Benyoussef et al. 1999) and FMD in feral pigs in Queensland (Doran and Laffan, 2005).

CA models are potentially well-suited to modelling the spread of an infectious disease such as FMD, which can exhibit long-range spatial dynamics as a result of airborne spreading (Cannon and Garner 1999). Indeed, it has been suggested that a 1981 FMD outbreak in the UK was initiated by windborne particles carried across the English Channel from France (Alexandersen et al., 2001; Donaldson, 1983; Sørensen et al. 2000). Unless the wind is erratic, it is reasonable to assume that disease incidence data would exhibit a relatively high degree of spatial continuity aligned with the predominant wind direction. According to Cannon and Garner (1999), most wind-borne spread over land is less than 10 km, although this can be up to 60 km. This suggests that an anisotropic spatial covariance model having a 10 km 'range of influence', say, could potentially be useful in describing the spatial correlation structure in disease spread. The risk of airborne spread of FMD in Australia was assessed by Garner and Cannon (1995). Figure 35 is taken from their report and indicates that conditions favourable for the survival of FMD in aerosols occur at least 50% of the time in most of south-east Australia and all of Tasmania. The use of a discrete grid as the basis for representing and modelling the spatial dynamics of disease spread is not only convenient but parallels the way in which management agencies convey risk to the broader community (Figure 37). Our approach is to take the region of interest (e.g. Figure 36) and overlay a grid whose spacing is commensurate with the phenomenon of interest (Figure 38).


Figure 35. Average number of days per year that are conducive to persistence of FMD virus in aerosol (from Cannon and Garner 1999).

Figure 36. Zonation of EI infected regions in Queensland during the 2007/08 outbreak. Source: http://www2.dpi.qld.gov.au/extra/ei/maps/QLDInfectedCluster.gif (accessed 25 January 2008)


The starting point for the CA model is a probability model that describes the spatial extent and orientation of the likelihood that a ‘diseased’ cell (i.e. one in which the presence of the disease has been confirmed) will ‘infect’ neighbouring cells. The general situation is depicted in Figure 39.

Figure 37. Illustration of grid representation used by authorities in managing the EI outbreak. (Source: http://www2.dpi.qld.gov.au/extra/ei/maps/map8.gif)


Figure 38. The region of interest in Figure 36 with grid overlay that forms the basis of the CA modelling approach.

Figure 39. The grid in Figure 38 showing an ‘infected’ cell (red shading) and associated spatial pattern for disease transmission.


The mathematical detail underpinning the development of the CA model is presented in the next section.

3-2-1 Mathematical formulation

Our problem formulation is constructed around the general representation depicted in Figure 40. Our 'target cell' (ie. the one of interest) is located in row u and column v of a 2-D grid overlaid on the region of interest. Associated with each grid cell is a Bernoulli variable $Z^{t}_{i,j}$ indicating disease status⁶, viz:

$$Z^{t}_{i,j} = \begin{cases} 1 & \text{if cell } \{i,j\} \text{ is infected at time } t \\ 0 & \text{otherwise} \end{cases}$$

with $P\left(Z^{t}_{i,j} = 1\right) = \pi^{t}_{i,j}$.


Figure 40. General CA situation with cell of interest (red shading) and neighbouring cell. Z is a binary variable indicating cell’s disease status. Region bordered by heavy line depicts spatial extent of influence or impact of cell on its neighbours.

We further make the following assumptions: (i) infection is an absorbing state, that is, once a cell is ‘infected’ it remains infected (at least for the period of interest); and (ii) the

6 Clearly, it is not the cell per se which is infected – it is the occurrence of at least one infected individual within the cell.


risk of infection is a function of the disease status of neighbouring cells and the distance between the target cell and infected neighbouring cells.

We define the transmission 'effectiveness', $\phi(r)$, as the probability that an uninfected cell at distance r from an infected cell will become infected during the period {t, t+1}. Without loss of generality, we denote $r_{i,j}$ to be the distance between the centre of the cell at grid location {i,j} and the centre of the target cell.

We next consider the updating step as time is incremented by 1 unit. Hence:

$$\pi^{t+1}_{u,v} = P\left(Z^{t+1}_{u,v}=1\right) = \pi^{t}_{u,v} + P\left(Z^{t}_{u,v}=0\right)\,P\left(\text{cell } \{u,v\} \text{ becomes infected in the interval } \{t, t+1\}\right) \qquad (3.1)$$

For the sake of brevity and simplicity, we let $I_{i,j}$ denote the event that the target cell in row-column position {u, v} becomes infected by cell {i, j} in the interval {t, t+1}, and $\bar{I}_{i,j}$ its complement, ie. $P\left(\bar{I}_{i,j}\right) = 1 - P\left(I_{i,j}\right)$. Thus,

$$P\left(\bar{I}_{i,j}\right) = 1 - \phi\left(r_{i,j}\right)\,\pi^{t}_{i,j} \qquad (3.2)$$

For the target cell not to become infected during the period {t, t+1} requires an unsuccessful transmission from every cell to the target. Assuming transmissions from different cells are independent, we have:

$$P\left(\bigcap_{\substack{\text{all } i,j \\ \{i,j\}\neq\{u,v\}}}\bar{I}_{i,j}\right) = \prod_{\substack{\text{all } i,j \\ \{i,j\}\neq\{u,v\}}}\left[1 - \phi\left(r_{i,j}\right)\pi^{t}_{i,j}\right] \qquad (3.3)$$

Substituting equation 3.3 for the last term in equation 3.1 gives the result:

$$\pi^{t+1}_{u,v} = \pi^{t}_{u,v} + P\left(Z^{t}_{u,v}=0\right)\left\{1 - \prod_{\substack{\text{all } i,j \\ \{i,j\}\neq\{u,v\}}}\left[1 - \phi\left(r_{i,j}\right)\pi^{t}_{i,j}\right]\right\}$$

or

$$\pi^{t+1}_{u,v} = 1 - \left(1 - \pi^{t}_{u,v}\right)\prod_{\substack{\text{all } i,j \\ \{i,j\}\neq\{u,v\}}}\left[1 - \phi\left(r_{i,j}\right)\pi^{t}_{i,j}\right] \qquad (3.4)$$

We next consider candidate models to describe the transmission effectiveness as a function of separation from an infected cell.

Models for transmission effectiveness

It is assumed in this development that the transmission effectiveness is only a function of the distance to an infected cell – in other words, our model is omnidirectional or isotropic. Although the exact form of the function describing this relationship needs to be informed by expert knowledge and disease-specific data, it is not unrealistic to assume a radial basis function for $\phi(r)$ – such as a Gaussian model. One such possibility is given by equation 3.5.

$$\phi(r) = k\,e^{-\gamma r^{2}} \qquad (3.5)$$

Different choices for the constants $k$ and $\gamma$ in equation 3.5 give rise to different spatial patterns of transmission effectiveness (Figure 41).
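A minimal sketch of the CA update (equation 3.4) with a Gaussian transmission model of the form of equation 3.5 is shown below; the constants k and γ, the grid size, and the seeded outbreak cell are assumptions chosen purely for illustration.

```python
import numpy as np

def transmission(r, k=0.3, gamma=0.05):
    """Gaussian transmission-effectiveness model of the form of equation 3.5."""
    return k * np.exp(-gamma * r ** 2)

def ca_step(pi):
    """One update pi[t] -> pi[t+1] of the infection probabilities (equation 3.4)."""
    rows, cols = pi.shape
    ii, jj = np.indices(pi.shape)
    new = np.empty_like(pi)
    for u in range(rows):
        for v in range(cols):
            r = np.hypot(ii - u, jj - v)       # distances to the target cell
            q = 1.0 - transmission(r) * pi     # per-cell 'no transmission' probability
            q[u, v] = 1.0                      # a cell does not infect itself
            new[u, v] = 1.0 - (1.0 - pi[u, v]) * q.prod()
    return new

# Seed an outbreak in one cell of a 26 x 26 grid and step the model forward.
pi = np.zeros((26, 26))
pi[12, 8] = 1.0
for _ in range(5):
    pi = ca_step(pi)
print(pi[6:18, 2:14].round(2))                 # probabilities around the outbreak cell
```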

Initial conditions

Equation 3.4 is recursive and therefore requires the specification of an initial state for each cell, ie. $\pi^{0}_{i,j}\ \forall\, i, j$. The process of supplying an initial guess for $\pi^{0}_{i,j}$ followed by the updating step parallels a Bayesian analysis whereby a prior probability is transformed into a posterior probability via a likelihood function (see section 2-2 for a discussion). As in a Bayesian analysis, $\pi^{0}_{i,j}$ can be chosen to reflect the degree of belief in the initial infection status. In the absence of any prior information, knowledge, or understanding (other than that captured in equation 3.5) we will assume an initial configuration that is the equivalent of a Bayesian non-informative prior. For example, $\pi^{0}_{i,j} = p\ \forall\, i, j$, where p is a constant, corresponds to the case where the initial outbreak is equally likely to occur (or have occurred) anywhere in the region of interest.

Figure 41. Illustration of a spatial probability model for disease spread for generic cell {i,j} at fixed point in time. Grid represents region of interest. Height of surface is proportional to probability of spread of infection from cell {i,j} to neighbouring cells. Each plot represents a different range of influence from highly localised (bottom right) to far-ranging (top left).

Modelling the disease spread

To illustrate how equation 3.4 models the spread of a disease, we consider the case of equally-likely outbreak locations. At T = 0, we assume an outbreak at a particular grid location⁷ $\{r_0, c_0\}$ and make the assignment $\pi^{0}_{r_0,c_0} = 1$. The probability that neighbouring cells become infected during the next time increment is given by $\phi\left(r_{i,j}\right)$, where $r_{i,j}$ is the distance from $\{r_0, c_0\}$ to a neighbouring cell at grid location {i,j}. Figure 42 illustrates the spatial characteristics of the transmission effectiveness probability contours based on a particular parameterisation of equation 3.5 with $\{r_0, c_0\} = \{12, 8\}$. At the next time increment, the probability of infection is updated for every grid cell using equation 3.4 and the process repeated N times.


Figure 42. Illustrative contours of probability representing likelihood of infection around cell having grid coordinates {12,8}.

A slightly more complex scenario is shown in Figure 43, which depicts a non-uniform assignment of initial outbreak probabilities at time T=0 together with 'snapshots' at T=40, T=80, and T=120.

7 Or set of locations



Figure 43. 3-D depiction of progression of disease spread starting with initial outbreak pattern at T=0 and at three subsequent time periods. Vertical scale is probability of infection.

The attractive feature of Figure 43 (or, more correctly, the underlying model) is that it can be used as the basis of inference about the time of an outbreak. For example, we assume the outbreak was at grid cell $\{r_0, c_0\}$ and that we have available sample data on disease incidence at some time $T = T_0 + k\,\Delta t$, where $T_0$ is the outbreak time and k is the number of time increments, each of length $\Delta t$ units. We then use a maximum likelihood estimation (mle) procedure to estimate k as the value (call it $\hat{k}$) that maximises the joint probability function for the observed data, using the spatial probability model of equation 3.4 evaluated at the kth time step. An example of the likelihood function plotted against k is shown in Figure 44. This procedure can be repeated for all grid locations, thereby generating a total of $N = r \times c$ such curves, where r and c denote respectively the number of rows and columns of the grid. The estimated outbreak time and location is associated with the likelihood profile whose maximum is the largest among all N plots. The mle for the outbreak time is $\hat{T}_0 = T - \hat{k}\,\Delta t$. This procedure is detailed more formally in the next section.


Figure 44. Likelihood profile plot for k (number of time increments since outbreak).

3-3 Inferring the time and location of an outbreak

The general situation was described in the previous section. We now develop the mathematical and computational detail associated with the maximum likelihood procedure for estimating the time and location of an outbreak.

We commence by assuming that at some time $T = T_0 + k\,\Delta t$ we observe for grid cell {i,j} a proportion, $p^{(T)}_{ij}$, of infected units (animals, people, agricultural plots etc.) where $p^{(T)}_{ij} = X^{(T)}_{ij} / n^{(T)}_{ij}$ and $X^{(T)}_{ij}$ is the number of infected units in cell {i,j} out of a total of $n^{(T)}_{ij}$ inspected in grid cell {i,j} (Figure 45).

Figure 45. General situation depicting region of interest with vertical bars depicting empirical rates of infection.

The total number of units in grid cell {i,j}, $N^{(T)}_{ij}$, may or may not be known. In any event, we assume $N^{(T)}_{ij} \ge n^{(T)}_{ij}$. If we further assume that the inspected units represent a random sample from the population of susceptible units and that the observations on disease status are independent of each other, then a reasonable probability model for the $X^{(T)}_{ij}$ data is the binomial distribution.


Hence

$$X^{(T)}_{ij} \sim \mathrm{binom}\left(n^{(T)}_{ij},\; \pi^{(T)}_{ij}\right) \qquad (3.6)$$

with $X^{(T)}_{ij}$ and $n^{(T)}_{ij}$ defined above and $\pi^{(T)}_{ij}$ given by equation 3.4.

We next define the set S, of cardinality m, to comprise the row-column indexes of all sampled grid locations. For fixed T and a given outbreak location in grid cell $\{r_0, c_0\}$, the log-likelihood function for the unknown $\pi^{(T)}_{ij}$ is given by equation 3.7.

$$\ell\left(\pi^{(T)}_{ij};\, x^{(T)}, r_0, c_0\right) = \sum_{\{i,j\}\in S}\left\{x^{(T)}_{ij}\log\left(\pi^{(T)}_{ij}\right) + \left(n^{(T)}_{ij} - x^{(T)}_{ij}\right)\log\left(1 - \pi^{(T)}_{ij}\right)\right\} \qquad (3.7)$$

Note that the right-hand side of equation 3.7 is a function of $r_0$ and $c_0$ by virtue of the relationship between $\pi^{(T)}_{ij}$ and $\phi\left(r_{i,j}\right)$, with $r_{i,j}$ the distance between grid locations $\{r_0, c_0\}$ and $\{i, j\}$. Thus, for fixed T, equation 3.7 is evaluated for all $N = r \times c$ possibilities obtained by varying $\{r_0, c_0\}$ over the entire grid. It is then a simple matter to identify the location at which equation 3.7 attains its maximum (Figure 46).
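The grid search over candidate outbreak locations and times can be sketched as follows. The binomial log-likelihood follows equation 3.7, but the probability field `candidate_field` used here is a hypothetical stand-in for the CA model of equation 3.4, and the sampled data are simulated, so the sketch illustrates the search mechanics only.

```python
import numpy as np

def log_likelihood(x, n, p):
    """Binomial log-likelihood of the sampled infection counts (equation 3.7)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)            # guard against log(0)
    return float(np.sum(x * np.log(p) + (n - x) * np.log(1 - p)))

def candidate_field(r0, c0, k, cells):
    """Hypothetical stand-in for the CA probability field at time step k after
    an outbreak at (r0, c0); a real implementation would use equation 3.4."""
    d = np.hypot(cells[:, 0] - r0, cells[:, 1] - c0)
    return np.clip(1.0 - np.exp(-0.02 * k / (1.0 + d)), 1e-6, 1 - 1e-6)

rng = np.random.default_rng(0)
cells = rng.integers(0, 26, size=(40, 2))       # sampled grid locations
n = np.full(40, 20)                             # units inspected in each sampled cell
x = rng.binomial(n, candidate_field(10, 15, 78, cells))   # synthetic infection counts

# Exhaustive search over outbreak locations and a few candidate times.
best = max((log_likelihood(x, n, candidate_field(r0, c0, k, cells)), r0, c0, k)
           for r0 in range(26) for c0 in range(26) for k in (40, 78, 120))
print("best (logL, r0, c0, k):", best)
```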

We now proceed to illustrate the method with an example.


Figure 46. Illustrative likelihood surface for outbreak location.

3-4 Example

To test the efficacy of the estimation procedure, we generated $\{x, n\}$ data pairs on a 26 x 26 grid. The outbreak occurred at time $T = T_0$ in grid cell $\{r_0 = 10;\, c_0 = 15\}$ and sample data were obtained $78\,\Delta t$ time units later. The transmission effectiveness model used is given by equation 3.8.

$$\phi\left(r_{ij}\right) = \exp\left\{-\left[\frac{\left(r_i - r_0\right)^2}{a} + \frac{2\left(r_i - r_0\right)\left(c_j - c_0\right)}{ab} + \frac{\left(c_j - c_0\right)^2}{b}\right]\right\} \qquad (3.8)$$

with a = 0.0025 and b = 0.0055.

A random sample of n = 146 cells was selected from these 676 pairs (see Appendix D for a listing of the data used). Figure 47 shows the sample proportions ($x_{ij}/n_{ij}$) overlaid on a contour plot of the theoretical probability field from which they were generated. The maximum likelihood procedure described in the previous section was programmed within


the Mathcad 14® symbolic computing environment. A listing and description of the routines can be found in Appendix E. The likelihood was evaluated for 200 time increments for each of the 676 possible outbreak locations. The maximum was attained at grid location $\{r_0 = 10;\, c_0 = 15\}$ (Figure 48) after 78 time increments (Figure 49), which corresponds exactly with the parameters used to generate the data. Given that we used the true probability model (equation 3.8) to compute the likelihoods, the perfect agreement between estimated and actual parameters is not totally unexpected. Nevertheless, we regard this example as a test of the integrity of the estimation procedure. Substantial discrepancies between actual and estimated parameters for this example would have cast doubt over the utility of the modelling and estimation procedures.

Figure 47. 3-D visualisation of empirical data (infection rates) used in the example. Underlying contour lines derived from theoretical response-generating model.


In reality, neither the functional form nor the parameter values for $\phi\left(r_{ij}\right)$ will be known. There are a number of strategies which could be employed in such cases. The simplest would be to use expert opinion and prior studies to identify a suitable probability model. A more complex and computationally intensive approach would be to specify the functional form but leave the parameters as unknown. A relatively straightforward modification to the estimation algorithm could be made so that the unknown model parameters were estimated simultaneously with the outbreak location and time by maximising over all $\{r_0, c_0\} \times k$ space-time combinations within some defined parameter space, $\Theta$.

Figure 48. Profile plot showing contours of likelihood of outbreak location. Maximum attained at grid position located at row 14, column 9. Notes: (i) this plot is rotated 90° relative to Figure 47; (ii) plot is offset by 1 unit in each direction due to placement of origin at {0,0}, hence the most likely grid location for the disease outbreak is cell {10,15}.

Figure 49. Likelihood profile plot for the temporal component of the CA model (likelihood versus time increment, 0 to 200). The most likely outbreak time is 78 time increments prior to the time of sampling.

3-5 Discussion

In this chapter we have outlined an approach based on a cellular automata model, coupled with empirical observations on incidence rates, to estimate both the time and location of an initial disease outbreak. Traditional epidemiological modelling approaches typically focus on the temporal component, using differential-difference equations to model the progression of disease spread through time as well as describing other (important) phenomena such as threshold and density-dependent effects. Being discrete, the CA model can be readily implemented on a spatial grid of arbitrary size and resolution and has the advantage of being able to model complex space-time interactions. When coupled with a probability model for the response-generating mechanism, the CA model can be used to infer the initial outbreak time and location using maximum likelihood methods. There is no doubt that once an outbreak of a disease has been detected, much of the subsequent monitoring effort would be focussed on tracking its progression in space and time with a view to containment and eradication. Nevertheless, an ability to pinpoint the outbreak location and time has been acknowledged as an important capability for post-outbreak evaluation and follow-up activities, particularly if the outbreak was the result of a failure in procedures, process, or facilities. In other cases it may be important to understand whether a disease outbreak was of human or natural origin. For example, the official Soviet explanation of the 1979 outbreak of anthrax in Sverdlovsk, U.S.S.R. was that it was caused by contaminated meat. The U.S. intelligence community suspected the outbreak was due to the aerosol release of B. anthracis from a military microbiology facility. Resolution of this issue was important in deciding whether or not the Soviet Union was in violation of international treaties to which it was a signatory (Hogan et al. 2007).

A number of spatial and spatio-temporal modelling tools for biosurveillance have used conventional statistical modelling approaches such as generalised linear mixed models (GLMMs). For example, the BioSense program run by the U.S. Centers for Disease Control and Prevention (http://www.cdc.gov/BioSense/) utilises a variant of GLMMs known as small area regression and testing (SMART) to enhance early detection and situational awareness of possible biologic terrorism attacks (Bradley et al. 2005). However, as noted by Fricker (2008), the method in BioSense only uses spatial information to bin data into separate time series and is thus not strictly a spatial model. Most spatial models for biosurveillance utilise the scan statistic (Kulldorff, 1997) to detect disease clusters. The method presented in this chapter models the space-time trajectory of infectious state and couples this with a maximum likelihood method to infer initial outbreak time and place using monitored data.

While our approach has been developed to a proof-of-concept stage, further enhancements would increase the appeal of this methodology. For example, the transmission effectiveness model (equation 3.5) could be extended to incorporate ancillary information about the target and neighbouring cells, such as the number of susceptible individuals for an animal disease or the size of the area at risk for a plant disease. Thus, in considering the transmission between any two grid cells (denote them cell I and cell J), a p-dimensional vector of ancillary values $X_{IJ} = \left[x_{IJ1}, x_{IJ2}, \ldots, x_{IJp}\right]^{T}$ could be defined (where one of the $x_{IJ}$ is the distance between I and J). A candidate model for transmission effectiveness that utilises all the available information is the multivariate logistic function (equation 3.9).

$\phi_{IJ} = \frac{\exp\left(\beta^{T} X_{IJ}\right)}{1 + \exp\left(\beta^{T} X_{IJ}\right)} + \varepsilon_{IJ} \qquad (3.9)$

where $\beta$ is a $p \times 1$ vector of parameters and $\varepsilon_{IJ}$ is a stochastic error term with some assumed distribution (for example, $\varepsilon_{IJ} \sim N(0, \sigma^2)$). To be applicable, values for the parameter vector $\beta$ would need to be either supplied (based on knowledge of the system) or estimated using data from a separate calibration study.
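The sketch below (Python/NumPy) shows how such a multivariate logistic transmission-effectiveness term might be evaluated; the parameter vector `beta`, the ancillary covariates and the Gaussian error are illustrative assumptions only, not values from the report.

    import numpy as np

    def transmission_effectiveness(x_IJ, beta, sigma=0.0, rng=None):
        """Multivariate logistic transmission effectiveness (equation 3.9).
        x_IJ  : p-vector of ancillary values for the cell pair (I, J),
                one element being the inter-cell distance
        beta  : p-vector of model parameters
        sigma : std. dev. of the additive error term (0 gives the deterministic part)"""
        eta = float(np.dot(beta, x_IJ))
        phi = np.exp(eta) / (1.0 + np.exp(eta))
        if sigma > 0:
            rng = rng or np.random.default_rng()
            phi += rng.normal(0.0, sigma)
        return float(np.clip(phi, 0.0, 1.0))      # keep the result a valid probability

    # Hypothetical covariates: [distance, number susceptible, area at risk]
    beta = np.array([-0.8, 0.02, 0.01])
    print(transmission_effectiveness(np.array([3.0, 40.0, 12.0]), beta))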

CHAPTER 4: Sensor Network Optimisation

4-1 Introduction

Spatial surveillance is a key component of monitoring programs which provide an early detection capability for disease and pest incursions as well as informing assessments of plant and animal health status for trade purposes. International standards for phytosanitary measures and guidelines for surveillance have been established under the International Plant Protection Convention (FAO 1998). These guidelines distinguish between two broad classes of surveillance: specific surveys, in which information is obtained on a particular pest over a relatively narrowly defined spatial-temporal extent; and general surveillance activities, in which information is gathered on one or more pests over a wider area and from many sources, including specific surveys (Pheloung 2004). Some examples of foreign organisms that are of concern include: Siam weed (Chromolaena odorata); papaya fruit fly (Bactrocera papayae); red imported fire ant (Solenopsis invicta); branched broomrape (Orobanche ramosa); and kochia (Bassia scoparia) (Pheloung 2005). Plant pests take a variety of forms including insects, weeds, fungi, bacteria, viruses and other harmful organisms and usually find their way into the country via trade and travel (Pheloung 2004). Of particular concern is Australia's vulnerability to fruit fly, and since 2006 there has been renewed interest in developing a strategy to underpin a national approach to the management of this pest (http://www.planthealthaustralia.com.au/fruitfly/public.asp?pageID=243). This national fruit fly strategy (NFFS) builds upon the successful national fruit fly trapping program which targets exotic fruit fly pests (Bactrocera spp.) entering through international pathways at ports (Pheloung 2005).

8 The illustrative examples and motivating context used in this chapter relate to the detection of an equine influenza outbreak. It is acknowledged that the methodology is potentially better suited to the detection of plant pest and disease outbreaks. Unfortunately our requests to access the Australian Plant Pest Data Base were unsuccessful as were attempts to interface with the CRC Plant Biosecurity Surveillance research program.


With respect to animal diseases, a number of potential and serious risks exist, including: Avian Influenza (or Bird Flu); Bovine Spongiform Encephalopathy (BSE); Foot and Mouth Disease (FMD); Equine Influenza (EI); Rabies; and Varroa mite. Australia has developed a number of emergency response plans as well as a spatial and textual, web-based software application tool called BioSIRT (Biosecurity Surveillance Incident Response and Tracing). While incident response plans and tools are vital components of a combative strategy, it has been noted that by the time an incursion is detected, the prospects for eradication are very poor and the costs prohibitive (Pheloung 2004). Nairn et al. (1996) and others have long advocated strategies based on avoidance rather than eradication. Fox et al. (2009) noted that surveillance programs for monitoring invasive plants were expensive, yet budgets allocated for this purpose were invariably "highly constrained". Under such circumstances there is a clear need to allocate scarce monitoring resources in the most effective way possible. Fox et al. (2009) also observed that previous attempts at 'optimisation' utilised economic tools that did not have any spatial or temporal representation. In the following sections of this chapter we outline a mathematical programming approach to the identification of an 'optimal' configuration of 'sensors' as part of a general biosecurity surveillance program. A sensor in this context is defined broadly as any instrument, method, procedure, or device that acquires information or samples related to the biosecurity threat under investigation. We acknowledge that our (artificial) example of sensor network optimisation for detecting an EI outbreak may not be the most suitable candidate for our methodology. However, due to our inability to access realistic plant data (see the footnote above) and the ready availability of EI and demographic data, we used the latter for illustrative purposes. The suitability or otherwise of the example does not diminish the integrity of the proposed strategy.

4-2 A surveillance network for EI

On August 24, 2007 a disease strategy for equine influenza (EI) was released by Animal Health Australia (2007). The AHA report noted that "there has been no occurrence of EI in Australia … and vaccination is not practised." Regrettably, that situation changed with the first detection of EI in the same month in the Sydney area. The disease spread rapidly through northern NSW into Queensland, where it concentrated in the Brisbane region (DPI 2008). The NSW situation by September 12, 2007 is shown in Figure 50. Equine influenza (EI) is an acute, highly contagious (having an almost 100% infection rate) viral disease which spreads rapidly in horses and other equine species (NSW DPI 2008). Humans are not affected by this virus, although they can be responsible for its spread. Most animals exposed to the virus will show signs within a period of 1-5 days. Typically, an infected animal will develop a fever and a dry hacking cough and have a suppressed appetite. Recovery usually takes 2 to 3 weeks. Being a viral disease, there is no effective treatment, and the risk of secondary infections, such as pneumonia, is high.

Figure 50. Detection of EI in NSW as at September 12, 2007. Source: http://www.dpi.nsw.gov.au/__data/assets/pdf_file/0009/179406/IP-equine-influenza-map-nsw-12-sept-07.pdf

With respect to biosecurity, responsible agencies in Australia divide their operations into pre-border, border, and post-border monitoring and surveillance activities. A recurring and important issue is where to place limited resources and effort so as to maximize the effectiveness of the surveillance program. For example, given the map of Figure 50 together with other ancillary information about the population at risk (size, geographic extent, spatial aggregation, susceptibility etc.), what configuration (numbers, types, and placement) of monitoring activities delivers the 'best' surveillance outcome? Clearly, terms like 'best' need to be defined, as does a metric of surveillance utility. One optimization criterion might, for example, be the maximization of the probability of detection. However, for our problem formulation we make the following assumptions:

 detection probability is a non-decreasing function of monitoring effort;
 detection probability is heterogeneous in space and time;
 the cost of monitoring is directly proportional to the amount of resources devoted to the monitoring program;
 the total cost of monitoring is constrained.

We consider a formulation of the general surveillance design problem (ie. the optimal placement of sensors in a distributed network) as a constrained integer linear programming problem (ILP). The objectives are to:

i. Recast a conceptual understanding of the monitoring network for EI as a constrained optimisation problem;
ii. Provide realistic models of 'risk' over a 2D space;
iii. Code the problem for solution using ILP;
iv. Solve an artificial example problem.

In the following sections we provide details associated with items (i) to (iii) above and, in particular, demonstrate the feasibility of item (iv). In this sense, we regard (i) to (iv) as forming the basis of a prototype 'system' which has been developed to a proof-of-concept stage. The challenge that lies ahead is to implement this system on a suitably calibrated real problem.

4-3 The Maximal Covering Location Problem (MCLP)

The optimal sensor location problem for building an EI surveillance network is a variant of a problem known in the Operations Research (OR) literature as either the maximal covering problem (MCP) or the maximal covering location problem (MCLP) (Church and ReVelle 1974, Underhill 1994, Downs and Camm 1996, Chakrabarty et al. 2002). Numerous implementations of the MCLP exist, including the placement of guards in an art gallery (O'Rourke 1987), the siting of regional health facilities and services (Eaton et al. 1977, Pirkul and Schilling 1988), optimal placement of ambulance services (Chuang and Lin 2007) and conservation reserve site selection (Önal 2003). More recently, the omnipresent threat to homeland security has focussed the attention of some researchers on optimal network designs to detect terrorist activities (Anderson et al. 2007).

The objective of an MCP/MCLP is to determine the configuration of sensors of varying types (in terms of cost, monitoring range, detection capabilities) that achieves mandated levels of surveillance accuracy at minimum total cost. When sensors can be located anywhere on a plane, the problem is referred to as the planar maximal covering location problem (PMCLP) (Church 1984). Church and ReVelle (1974) provided the first mathematical formulation of the MCLP for the problem of siting public facilities. In this context a candidate facility site 'covers' a demand node if it is within some maximum distance from that node. Mathematically, we have

Maximize: $\sum_{i=1}^{m} c_i y_i \qquad (4.1)$

s.t. $\sum_{j=1}^{n} a_{ij} x_j \geq y_i \quad \forall i \in I \qquad (4.2)$

$\sum_{j=1}^{n} x_j = k \qquad (4.3)$

$y_i \in \{0,1\} \quad \forall i \in I \qquad (4.4)$

$x_j \in \{0,1\} \quad \forall j \in J \qquad (4.5)$

In this formulation, $J = \{j: j = 1, \ldots, n\}$ represents the set of candidate facility sites and $I = \{i: i = 1, \ldots, m\}$ denotes the set of demand nodes; $c_i$ is the population to be served at demand node $i$. Our (binary) decision variables are $x_j$, such that $x_j = 1$ if site $j$ is chosen for a facility and $x_j = 0$ otherwise. The constraint represented by equation 4.3 requires that $k$ facilities / sites are to be selected. The indicator variables $y_i$ assume a value of unity if demand node $i$ is covered by some facility $j$ and assume a value of zero otherwise. Similarly, variable $a_{ij}$ is unity if the shortest distance between demand node $i$ and facility $j$ does not exceed some prescribed maximum permissible distance.
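For illustration, the following sketch codes equations 4.1 to 4.5 using the open-source PuLP modeller for Python (LINGO, as used later in this chapter, or any other ILP solver would serve equally well). The demand weights, coverage matrix and value of k are small made-up numbers.

    import pulp

    # Toy data: 4 demand nodes, 3 candidate sites, select k = 1 site.
    c = [10, 20, 15, 5]              # population served at each demand node
    a = [[1, 0, 1],                  # a[i][j] = 1 if site j covers demand node i
         [1, 1, 0],
         [0, 1, 0],
         [0, 0, 1]]
    k = 1
    m, n = len(c), len(a[0])

    prob = pulp.LpProblem("MCLP", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", range(n), cat="Binary")   # site j selected
    y = pulp.LpVariable.dicts("y", range(m), cat="Binary")   # demand node i covered

    prob += pulp.lpSum(c[i] * y[i] for i in range(m))                     # (4.1)
    for i in range(m):
        prob += pulp.lpSum(a[i][j] * x[j] for j in range(n)) >= y[i]      # (4.2)
    prob += pulp.lpSum(x[j] for j in range(n)) == k                       # (4.3)

    prob.solve()
    print([int(x[j].value()) for j in range(n)], pulp.value(prob.objective))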

Mathematicians refer to MCLPs as being NP-hard. While not wishing to go into the complexities of mathematical algorithms, a problem is assigned to the NP (nondeterministic polynomial time) class if it is solvable in polynomial time by a nondeterministic Turing machine. A problem is NP-hard if an algorithm for solving it can be translated into one for solving any NP-problem. NP-hard therefore means 'at least as hard as any NP-problem', although it might in fact be harder (Weisstein, undated). Practical approaches to solving the MCLP include mathematical programming methods, genetic algorithms, graph theory, complete enumeration and heuristics. Genetic algorithms (GAs) have become popular in recent years as a way of solving large optimisation problems such as the MCLP (Arakaki and Lorena 2001, Buczak et al. 2001). A genetic algorithm is an adaptive heuristic search algorithm that embodies the evolutionary concept of 'survival of the fittest'. The success of GAs is due to an intelligent exploitation of a problem's solution space.

Perhaps one of the most common solution techniques (possibly due to the accessibility of 'off-the-shelf' solvers) is integer linear programming (ILP). Integer linear programming is a variant of linear programming (LP) in which the decision variables assume integer values only; in the case of the MCLP, the decision variables are binary (0/1). In the formulation above we typically have many more demand nodes than candidate sites ($m \gg n$), and Downs and Camm (1996) note that direct ILP approaches to solving equations 4.1 to 4.5 frequently suffer from a high degree of both primal and dual degeneracy. Replacing $y_i$ with $h_i = 1 - y_i$ in the MCLP formulation leads to an equivalent minimisation (of uncovered nodes) problem which, it is claimed, identifies an optimal solution more quickly and reduces primal degeneracy (Downs and Camm 1996).

In the next section we describe a more general version of the MCLP in which each sensor type is characterised by a spatially-explicit weighting function (representing, say, a detection probability or 'depth of feel') rather than a deterministic cut-off range. The distinction is that the weighting function can provide an anisotropic representation of monitoring 'effectiveness' (which itself may be sensor-specific) that assigns higher weight when the target is close and less weight when the target is far away. This 'envelope of effectiveness' is then combined with a 2D 'risk' surface to produce a final set of weightings (the $a_{ij}$ of equation 4.2). We are unaware of any similar formulation of the MCLP and, to that extent, we believe that this represents a new and original contribution to the literature on covering problems. We have termed this the generalised MCLP or g-MCLP.

4-4 The generalised MCLP (g-MCLP)

Our problem formulation is based on subdividing a region of interest into a regular grid of appropriate dimensions. What constitutes 'appropriate' will be problem specific but will be based on considerations of: (i) computational resources; (ii) mathematical tractability; and (iii) a context-specific decision unit. By (iii) we mean grid cell dimensions that are commensurate with the scale of the range over which sensors are effective and the geographic extent of the surveillance space. Thus, for example, if a particular monitoring device or activity is effective up to 50-80 km and surveillance is over 250,000 km², an appropriate grid cell might, for example, be 20 km x 20 km. The motivation for discretising the problem is two-fold: (i) a discrete representation of the surveillance space facilitates codifying and solving the problem as a MCLP; and (ii) it is consistent with the way in which management agencies characterise regional risk (Figure 37). The generic situation is depicted in Figure 39. The overall network configuration is defined by the number, type, and spatial location of individual sensors. For each grid cell we define the following binary decision variable:

$x_{i,j,k} = \begin{cases} 1 & \text{if cell } \{i,j\} \text{ has a sensor of type } k \\ 0 & \text{otherwise} \end{cases} \qquad (4.6)$

The concept of ‘envelope of effectiveness’ is illustrated in Figure 51 which shows ellipses centred at three locations (cells). The orientation and spatial extent of an ellipse is assumed to be a function of the monitoring instrument, device or procedure. Furthermore, it is assumed that the information for cells close to the sensor location will be more relevant or accurate than for cells further away. This implies a spatial gradient or weighting of ‘accuracy or ‘relevancy’. Note, this assumption may not apply in all instances in which case an equal weighting scheme can be used.

Figure 51. Illustration of 'envelope of effectiveness' for three sensor types. Note that range and orientation can be different for each sensor.

In order to formulate a mathematical objective function, we first need to define a measure of utility. Obvious candidates include risk (however defined), cost, and detection probability. Clearly, the first two choices result in minimisation problems while use of the third results in a maximisation problem. For illustrative purposes, we will use a combination of both risk (defined as the likelihood of spread of infection) and detection probability. The objective function will be the maximisation of the overall probability of detecting the spread of EI. Note, detection probability is a characteristic of the sensors while the risk of spread is a function of uncontrollable (external) factors. However, the overall probability of detecting the spread of EI is a product of both terms. Thus we have:

$P\left(\text{detection at } x \text{ with sensor located at } y \mid \text{disease present in vicinity of } x\right) = w(h) \qquad (4.7)$

$P\left(\text{disease present at location } x\right) = E(x) \qquad (4.8)$

$P\left(\text{detection at } x \text{ with sensor located at } y\right) = E(x)\, w(h) \qquad (4.9)$

where $h = \lVert x - y \rVert$ is the distance between the site of interest and the sensor location.

Furthermore, we impose the requirements that $0 \leq w(h) \leq 1$ and $w'(h) \leq 0$ (i.e. $w$ is non-increasing in distance).

4-4-1 A logistic model for disease probability

In the absence of hard data, we have used the following logistic model to compute the probabilities in equation 4.8:

$E(x) = \frac{\exp\left[\beta_0 + \beta_1\, EI(x) + \beta_2\, \text{log\_density}(x) + \beta_3\, EI(x)\, \text{log\_density}(x)\right]}{1 + \exp\left[\beta_0 + \beta_1\, EI(x) + \beta_2\, \text{log\_density}(x) + \beta_3\, EI(x)\, \text{log\_density}(x)\right]} \qquad (4.10)$

where the $\beta$ terms are model parameters; $EI$ is an indicator variable having value 1 if EI has been detected at $x$; and $\text{log\_density}$ is the natural logarithm of the population density at $x$. For positive values of $\beta_1$, $\beta_2$ and $\beta_3$, risk is an increasing function of population density and will tend to increase rapidly if there has been at least one reported incidence of EI at $x$. The discrete version of the surveillance problem involving a single sensor is shown in Figure 52.

Figure 52. Representation of a single sensor network: a grid of cell risks $E_{11}, \ldots, E_{RC}$ with the sensor located in cell {i,j}. The probability of detection in cell {k,l} is the product of the risk/likelihood for cell {k,l} (denoted $E_{kl}$) and the sensor's detection probability $w_{ijkl}$, which is a function of the distance $d_{ijkl}$ between cells {i,j} and {k,l}.

In reality the situation is more complex than indicated by Figure 52 since there will be multiple sensors and it is conceivable that some grid cells will fall within the envelope of effectiveness of more than one sensor. The more general situation is shown in Figure 53.

Figure 53. Generalisation of the previous figure for a multiple sensor network.

4-5 Problem formulation

We consider the problem of optimally locating m sensors on an r x c grid. It will be convenient to use matrix notation to define the objective function and in order to simplify this expression it is helpful to ‘unpack’ the row-column data associated with the sampling grid. It doesn’t matter how this is done although we have chosen to do this row-by-row to form the following three matrices:


Decision matrix: X

This is an $(m \cdot rc) \times m$ block-diagonal matrix with $X = \mathrm{diag}\left(X^{(1)}, \ldots, X^{(m)}\right)$, where $X^{(p)}$ is an $(rc \times 1)$ column vector having elements defined by equation 4.6 for each of the $rc$ grid cells for the $p$th sensor.

Weight matrix: W

This is an $(rc \times m \cdot rc)$ matrix having structure $W = \left[W^{(1)}\; W^{(2)}\; \cdots\; W^{(m)}\right]$, where $W^{(p)}$ is an $(rc \times rc)$ matrix of detection probabilities associated with the $p$th sensor. Let $w_j^{(p)}$ be the $j$th $(rc \times 1)$ column of $W^{(p)}$. The elements of $w_j^{(p)}$ are the detection probabilities for all $rc$ grid cells obtained when sensor $p$ is placed within the $j$th grid cell.

Risk vector: E

E is a $(1 \times rc)$ row vector whose entries are the unpacked E's of Figure 52.

Objective function

The objective function is:

Maximize $E\, W\, X\, \mathbf{1} \qquad (4.11)$

where $\mathbf{1}$ is an $(m \times 1)$ vector of ones.
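A minimal numerical illustration of the objective in equation 4.11 is sketched below (Python/NumPy) for a tiny 2 x 2 grid and two sensors; the risk and weight values are random placeholders.

    import numpy as np

    rc, m = 4, 2                                   # a 2 x 2 grid unpacked to rc = 4 cells; m = 2 sensors
    rng = np.random.default_rng(1)

    E = rng.uniform(0.1, 0.9, size=(1, rc))        # 1 x rc risk vector (made-up values)
    W = rng.uniform(0.2, 1.0, size=(rc, m * rc))   # rc x (m*rc) detection-probability matrix

    # Block decision matrix X: sensor 1 placed in cell 0, sensor 2 placed in cell 3.
    X = np.zeros((m * rc, m))
    X[0, 0] = 1          # rows 0..rc-1 hold X^(1)
    X[rc + 3, 1] = 1     # rows rc..2rc-1 hold X^(2)

    ones = np.ones((m, 1))
    objective = (E @ W @ X @ ones).item()          # E W X 1 of equation 4.11
    print(objective)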

Constraints

A complete set of constraints will require elicitation. However, by way of example we could have constraints on cost; number of sensor sites; and sub-region representation (to accommodate subjective requirements for monitoring in certain regions).

Number of sensors of type $j$: $\mathbf{1}^{T} X^{(j)} \leq n_j \qquad (4.12)$

where $\mathbf{1}$ is an $(rc \times 1)$ vector of ones.

Total number of sensors: $J^{T} X \mathbf{1} \leq N \qquad (4.13)$

where $J^{T}$ is a $(1 \times m \cdot rc)$ vector of ones and $\mathbf{1}$ is an $(m \times 1)$ vector of ones.

Total cost: $C^{T} X \mathbf{1} \leq C_{\max} \qquad (4.14)$

where $C^{T}$ is a $(1 \times m \cdot rc)$ vector of costs, $\mathbf{1}$ is an $(m \times 1)$ vector of ones, and $C_{\max}$ is the total budget.

The structure of $C^{T}$ is $C^{T} = \left[C^{(1)}\; C^{(2)}\; \cdots\; C^{(m)}\right]$, where component $C^{(j)}$ is a $(1 \times rc)$ vector of costs associated with locating a sensor of type $j$ within each of the $rc$ grid cells. Note that this formulation is sufficiently general in that it acknowledges that establishment and operating costs can vary spatially, even for the same type of equipment.

Representativeness: $\sum_{j=1}^{m} L_S^{T} X^{(j)} \geq R_S \qquad (4.15)$

where $L_S^{T}$ is a $(1 \times rc)$ vector of ones and zeros such that a 1 indicates that the relevant grid cell is a member of sub-region $S$, and $R_S$ is a scalar lower bound on the number of sensor locations that must be placed within sub-region $S$.

Minimum spacing: $X^{(i)\,T} D\, X^{(j)} \geq S \quad \forall\, i \neq j \qquad (4.16)$

where $S$ (a scalar) is the minimum separation between any two sensors and $D$ is the matrix of inter-cell distances.

Note that equation 4.16 is non-linear in the decision variables. Equation 4.16 can be linearised with the imposition of additional constraints. This is achieved as follows: suppose $u$ and $v$ are two binary variables. Then their product $uv$ can be replaced by a new binary variable $\omega$ with the additional constraints: (i) $u + v \geq 2\omega$; and (ii) $u + v - \omega \leq 1$.
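The same linearisation can be expressed directly in an ILP modeller. The following PuLP sketch (with a toy objective) shows the binary product u·v replaced by ω under constraints (i) and (ii); it is illustrative only and not the LINGO implementation used in the example below.

    import pulp

    prob = pulp.LpProblem("linearised_product", pulp.LpMaximize)
    u = pulp.LpVariable("u", cat="Binary")
    v = pulp.LpVariable("v", cat="Binary")
    w = pulp.LpVariable("w", cat="Binary")   # stands in for the product u*v

    prob += w                      # toy objective: maximise the product term
    prob += u + v >= 2 * w         # (i)  w can be 1 only if u = v = 1
    prob += u + v - w <= 1         # (ii) w must be 1 when u = v = 1

    prob.solve()
    print(u.value(), v.value(), w.value())   # expect u = v = w = 1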

4-6 Example

Important note: This example is for illustrative purposes only and uses artificially constructed data to represent EI risk.

We demonstrate the approach outlined in the previous section by examining a number of scenarios for optimal sensor location over the state of NSW under various constraints. A 14 x 14 grid has been used, corresponding to a grid cell size of approximately 68 km (north-south) x 144 km (east-west). Demographic data for local government areas has been taken from publicly available information (see Appendix F). A surface/contour plot of the population density is shown in Figure 54, while Figure 55 shows population density contours with EI detection locations superimposed.

Figure 54. 3D Surface/contour plot of NSW population density. (Raw data given in Appendix F).

Figure 55. NSW population density contours. Solid red circles correspond to the EI detection locations shown in Figure 50.


The following parameter values have been used in equation 4.10: $\beta_0 = 1.0023$; $\beta_1 = 1.6554$; $\beta_2 = 0.23728$; and $\beta_3 = 1.16869$. A plot of the 3D risk surface is shown in Figure 56 and again as a contour plot with superimposed EI detection locations in Figure 57.

Figure 56. 3D surface/contour plot of EI risk over NSW.

Because of the discreteness of the problem formulation our decision variables are individual cells - not infinitesimally small points on the map. For this reason we need a measure of risk for a cell rather than a point. To this end, the continuum of risk has been averaged over each grid cell to obtain an average cell risk (Figure 58).
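Block-averaging of this kind reduces to a reshaping operation; a short sketch (Python/NumPy, with a hypothetical fine-scale risk surface) is given below.

    import numpy as np

    def block_average(field, cell_rows, cell_cols):
        """Average a fine-resolution risk field over non-overlapping grid cells."""
        R, C = field.shape
        assert R % cell_rows == 0 and C % cell_cols == 0
        return field.reshape(R // cell_rows, cell_rows,
                             C // cell_cols, cell_cols).mean(axis=(1, 3))

    fine = np.random.default_rng(0).random((140, 140))   # hypothetical fine-scale risk surface
    coarse = block_average(fine, 10, 10)                  # 14 x 14 cell-averaged risks
    print(coarse.shape)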


Figure 57. Contour plot of EI risk in NSW. Solid red circles are the EI detection locations from Figure 50.

Figure 58. Block-averaged risk probabilities.


4-6-1 Implementation

Pre-processing of input data (creation of W matrix and risk vector E) and various other manipulations were performed in Mathcad 14 (Parametric Technology Corporation). The objective function is given by equation 4.11. The ILP solution was implemented using extended LINGO 11 (Lindo Systems 2007). The LINGO code appears in Appendix G. Elements of the vector E and cost coefficient data appear between the “data” and “enddata” lines of the LINGO program in Appendix G.

Scenario evaluation

A number of scenarios were devised to illustrate the impact on the optimal sensor network as a result of altering one or more constraints. The system is sufficiently flexible to allow the incorporation of as many constraints as desired, as well as the ability to relax or tighten constraints individually or simultaneously. Combinations of different surveillance modes can be readily accommodated by specifying the number and type of 'sensors' available. Individual sensor effectiveness is reflected in the matrix of weights, W, defined earlier. For the purpose of illustration, we assume we have available three different surveillance methods or sensors. The number and characteristics of each of these is given in Table 4-1.

Table 4-1 Sensor characteristics.

Sensor Type    No. available    Major range    Minor range    Angle
1              1                50             25             340°
2              2                30             25             60°
3              2                35             20             0°

A graphical depiction of the sensor characteristics corresponding to the parameters in Table 4-1 is shown in Figure 59.

Figure 59. Generic representation of the three sensor types having the characteristics given in Table 4-1.

Six scenarios showing the impact on the optimal solution as a result of changing the minimum separation and cost constraints are listed in Table 4-2. An arbitrary function was used to generate a cost surface over the region of interest (Figure 60). By the same reasoning that was applied to the risk metric, costs at individual points need to be averaged over each cell (Figure 61).

Table 4-2. Constraint data.

Scenario    Min. sensor spacing (km)    Max. no. of sensors (Type 1 / 2 / 3)    Cost constraint ($)
1           30                          1 / 2 / 2                               ∞
2           35                          1 / 2 / 2                               ∞
3           45                          1 / 2 / 2                               ∞
4           60                          1 / 2 / 2                               ∞
5           0                           0 / 2 / 2                               20,000
6           35                          0 / 2 / 2                               20,000


Figure 60. Cost-contours for surveillance monitoring.

The optimal sensor configuration for each scenario has been identified (Figures 62 to 67⁹). Figure 68 shows a solution configuration overlaid on satellite imagery. While the results in Figures 62 to 67 are intuitively sensible (solutions targeting areas of high risk while honouring the minimum separation and cost constraints), their identification requires a considerable amount of computation. Nevertheless, the use of highly optimised numerical algorithms such as those found in LINGO® meant that a solution was generally found within a few minutes when run on a standard desktop computing platform.

9 The contour lines in Figures 62 to 67 have no meaning and are an anomaly of the software used to produce these figures.


Figure 61. Block-averaged surveillance monitoring costs.

Figure 62. Optimal solution for scenario #1: 5 sensors; min spacing 30 km; no cost constraint.


Figure 63. Optimal solution for scenario #2: 5 sensors; min spacing=35 km; no cost constraint.

Figure 64. Optimal solution for scenario #3: 5 sensors; min spacing=45 km; no cost constraint.


Figure 65. Optimal solution for scenario #4: 5 sensors; min spacing=60 km; no cost constraint.

Figure 66. Optimal solution for scenario #5: 4 sensors; no min spacing; cost <=$20,000.


Figure 67. Optimal solution for scenario #6: 4 sensors; min spacing = 35 km; cost ≤ $20,000.

Figure 68. Optimal monitoring locations corresponding to Figure 62 overlaid on geographic map. (Note: Apparent positional discrepancies due to different mapping projections).


4-7 Discussion

The use of mathematical programming techniques to optimise network design problems is not new. Applications of mathematical programming techniques to the optimisation of sparse sensor networks have been associated with air quality monitoring (Fox et al. 2009, McElroy et al. 1986), water supply security (Chakrabarty and Iyengar 2002) and computer network integrity (Noel and Jajodia 2007). We have uncovered only limited evidence of mathematical optimisation methods applied to the identification of optimal network designs for biosecurity surveillance. Alternatives to the conventional mathematical programming methods listed above include genetic algorithms and heuristic algorithms (Kanaroglou et al. 2005; Liu et al. 1986). These are briefly discussed below.

Genetic algorithms (GAs) These commence with a group of possible solutions coded as binary strings; the solutions with greater utility are then selected, perturbed and recombined to produce even better solutions. Genetic algorithms are an example of an evolutionary algorithm and are generally run either for a set number of generations or until no further improvement appears possible. GAs provide efficient location of near-optimal solutions without requiring any prior understanding of the search space.

Greedy searches These are a type of heuristic algorithm whereby actions are chosen on the basis of their improvement in the objective function at that step. They are computationally efficient and can provide a reasonable solution to very complex problems; Noel and Jajodia (2007), for example, use a modified greedy search. They are also scalable to constraints to a greater degree than other approximate algorithms. Krause et al. (2006) discuss the limitations of greedy searches in situations where communication costs are considered critical. In terms of solution quality, greedy search algorithms tend to perform significantly worse than other stochastic algorithms.


Gradient ascent Gradient ascent is a non-linear optimisation algorithm which attempts to find extrema by moving in the direction of steepest gradient of the objective function. It was used, for example, by Vickers et al. (2006) to guide the optimal placement of sensors on the basis of information about the state of the environment (specifically water flow) for a defensive military application. Gradient ascent methods will identify local extrema but can get 'trapped' and fail to reach a global optimum. To circumvent this problem, gradient ascent programs are often run as ensembles generated with randomised initial conditions. The best performing member of the ensemble is chosen as the optimum, which generally provides a more robust estimate of the global maximum.

The issue of surveillance network design is as important as the surveillance activities and data analysis methods themselves. A sub-optimal monitoring network design is not only wasteful of precious monitoring resources but compromises statistical power – that is, the ability to identify disease outbreaks, quarantine threats, or (bio)security violations when they have in fact occurred. As noted by Patil et al. (2006), "surveillance geoinformatics of spatial and spatiotemporal hotspot detection and prioritization is a critical need for the 21st century". Although remote sensing will continue to provide an important capability in plant protection and monitoring, the need for ground-based surveillance systems will remain. To be effective, remote sensing needs to be able to resolve zones of infection that are as small as 5 m in diameter – that is, of the order of a single pixel of information generated by present day satellites (Patil et al. 2006). While the siting of biosurveillance 'sensors' has been recognised as an important consideration in monitoring program design, most efforts in this regard have been driven largely by logistical considerations using heuristic algorithms. For example, in response to the September 11 2001 terrorist attacks the United States government, through its Department of Homeland Security, deployed the BioWatch Program to provide early warning of a mass pathogen release. Although exact details of the locations of BioWatch monitoring sites are unknown, it is thought that these may have been co-located with EPA air quality monitoring sites "on the basis of cost and ease of access" (Shea and Lister, 2003).


In this chapter we have presented a design tool/methodology which complements the temporal monitoring of Chapter 2 and the spatial-temporal modelling of Chapter 3. Together, these three chapters provide novel methods for the design, analysis, and prediction of disease/pest movement in space and time. We regard these methods as a starting point for further exploration and development. In particular, they need to be thoroughly 'road-tested' on a suite of comprehensive scenarios based on actual biosecurity monitoring and surveillance case studies.

CHAPTER 5: Robust Methods for Biosurveillance

5-1 INTRODUCTION

Since the events of September 11 2001, there has been increased emphasis on monitoring and surveillance to detect and prevent further terrorist attacks. While significant resources have been devoted to the mechanics of screening, far less attention has been paid to quantifying the efficacy of these surveillance programs. Given the high volumes of passengers, containers, mail items, and various other inter-continental ‘movements’, it is pertinent to examine the effectiveness of the inspection regimes. From a statistical perspective, the answer is partly provided by the theory and methods of statistical process control (SPC) and elementary probability theory (see chapter 1). However, bio-surveillance, syndromic surveillance, and counter-terrorism surveillance are fundamentally different from monitoring activities for quality assurance in an industrial manufacturing process. Unlike the industrial setting where there is generally good information on the performance of the manufacturing process (eg. percent defective; proportion of non-conforming or ‘out-of-spec’ items), monitoring in the context of bio / homeland security is characterised by extreme uncertainty. For example, because of the nefarious activities of terrorist organisations, security and intelligence organisations do not always know what it is they’re looking for. As terrorists become increasingly sophisticated in their modes of attack, our uncertainty in both the likelihood of an attack and which sentinels to monitor increases. By way of example, prior to 9/11 there was little or no surveillance at general aviation (GA) airports or monitoring of student pilot training. This lack of information regarding what to monitor is a case of Shackle-Popper Indeterminism (SPI) (Ben-Haim, 2006). The distinguishing features of SPI are:


Intelligence: What people know influences how they behave.

Discovery: What will be discovered tomorrow cannot be known today.

Indeterminism: Tomorrow’s behaviour cannot be modelled completely today.

In contrast to traditional methods of (constrained) optimisation, Ben-Haim (2006) developed Info-Gap theory to identify robust solutions to decision-making problems under extreme uncertainty. Info-gap theory has recently been applied to assessing the performance of counter-terrorism surveillance programs (Moffitt et al. 2005) and the identification of robust strategies to deal with bioterrorism attacks (Yoffe and Ben-Haim, 2006). Thompson (unpublished) examined the general sampling problem associated with inspecting a random sample of n items (containers, flights, people, etc.) from a finite population of N such items in a biosecurity context using an info-gap approach. The basic situation considered by Thompson is that there is a probability p(n) of a catastrophic outcome (eg. terrorist attack) given that n events / items out of N have been inspected. The info-gap formulation of the problem permitted the identification of a sample size n such that p(n) did not exceed a nominal critical value when severe uncertainty about p(n) existed. Implicit in this formulation was the assumption that the detection probability (ie. the probability of detecting a weapon, adverse event, anomalous behaviour etc.) once having observed or inspected the relevant item / event / behaviour was unity. In the context of counter-terrorism, our uncertainties (or info-gaps) will most certainly extend to a lack of certitude in detection. This is a consequence of the 'discovery' aspect of the SPI phenomenon identified above. Consider the following examples:

1. Quarantine authorities invoke various levels of inspection, ranging from a cursory examination (eg. opening the door of a container and visually inspecting the contents that are accessible) to full inspection (eg. removal and inspection of the entire contents). Clearly, the probability of detecting a prohibited import will be much lower in the first case than the second, even though the container is regarded as having been inspected in both cases;

2. Indeterminism means that security agencies do not always know what they're looking for. For example, liquids are currently banned from hand-luggage on certain international flights. This is because explosive devices can be assembled from innocuous constituents carried separately by more than one passenger. Prior to this 'discovery', the detection probability for an acetone-peroxide based explosive would have been small.

In the following sections we describe the general surveillance problem for which the probability of detection is less than unity. We then provide an info-gap formulation to help identify sampling strategies that are robust to multiple sources of uncertainty – including the detection probability.

5-2 Surveillance with imperfect detection

Following Thompson (unpublished), we assume that there is a finite population of N objects, events, people, or behaviours that are potentially subject to inspection. From this population of N ‘objects’ a random sample of size n is to be inspected. We define the following events:

I – the event that an object is inspected;

W – the event that an object is a security threat (eg. the object is a weapon, the person is a terrorist, the behaviour is indicative of malicious intent);

D – the event that the security breach is identified / detected.

Furthermore, we assume that only inspected objects are classified as either belonging to $D$ or $\bar{D}$. We thus have $I = D \cup \bar{D}$ and hence

$P(I) = P(D) + P(\bar{D}) \qquad (5.1)$

Furthermore, $I = (D \cap W) \cup (D \cap \bar{W}) \cup (\bar{D} \cap W) \cup (\bar{D} \cap \bar{W})$ and thus

$P(I) = P(D \cap W) + P(D \cap \bar{W}) + P(\bar{D} \cap W) + P(\bar{D} \cap \bar{W}) \qquad (5.2)$

In a biosecurity / counter-terrorism context, arguably, the most important probability is not $P(W)$ (the probability of a security threat) but rather the conditional probability $P(W \mid \bar{D})$, ie. the probability of a security threat given that no breach of security was detected.

The lack of detection of a security breach is due to: (i) the absence of a security threat; and/or (ii) imperfections of the detection equipment / method/ process. Our inability to distinguish between (i) and (ii) is an info-gap.

5-2-1 Problem formulation

From elementary probability theory:

$P(W \mid \bar{D}) = \frac{P(W \cap \bar{D})}{P(\bar{D})} \qquad (5.3)$

From equation 5.2 we have:

$P(W \cap \bar{D}) = P(I) - P(D \cap W) - P(D \cap \bar{W}) - P(\bar{D} \cap \bar{W}) \qquad (5.4)$

Each of the joint probabilities in equation 5.4 can be expressed in terms of the relevant conditional probabilities, viz:

$P(W \cap \bar{D}) = P(I) - P(D \mid W)P(W) - P(D \mid \bar{W})P(\bar{W}) - P(\bar{D} \mid \bar{W})P(\bar{W}) \qquad (5.5)$

Note that $P(\bar{D} \mid \bar{W}) = 1$ and $P(D \mid \bar{W}) = 0$.

We next define the detection efficiency $\phi$ as $P(D \mid W)$, i.e. the probability that a security breach will be detected given a threat actually exists. Furthermore, we let $\theta = P(W)$ be the unconditional probability that an object is a security threat and $\lambda = P(I) = n/N$ the inspection fraction or probability. Hence, equation 5.3 can be written as

$P(W \mid \bar{D}) = \frac{\theta(1 - \phi)}{P(\bar{D})} \qquad (5.6)$

where the denominator $P(\bar{D})$ is developed below.

Next, observe that $\theta = P(W) = P(W \cap I) + P(W \cap \bar{I})$ and $P(W \cap I) = P(D \cap W) + P(\bar{D} \cap W)$.

Therefore

$P(\bar{D} \cap W) = P(W \cap I) - P(D \cap W) = P(W \cap I) - \phi\, P(W) \qquad (5.7)$

But $W$ and $I$ are independent events, and therefore $W$ and $\bar{I}$ are also independent, which means $P(W \cap I) = P(W)P(I) = \lambda\theta$, and thus equation 5.7 becomes

$P(\bar{D} \cap W) = \theta(\lambda - \phi) \qquad (5.8)$

Notice that for the probability in equation 5.8 to be non-negative we require $\lambda \geq \phi$, i.e. the sampling fraction must be at least as large as the detection efficiency.

Substituting equation 5.8 into equation 5.6 gives an expression for $P(W \mid \bar{D})$ involving $P(\bar{D} \cap \bar{W})$. But $P(\bar{D} \cap \bar{W}) = P(\bar{D} \mid \bar{W})\, P(\bar{W})$ and, since $P(\bar{D} \mid \bar{W}) = 1$, this becomes $P(\bar{D} \cap \bar{W}) = P(\bar{W}) = 1 - \theta$. Thus,

$P(W \mid \bar{D}) = p(\lambda, \phi, \theta) = \frac{\theta\,(1 - \phi)(1 - \lambda\theta)}{(1 - \lambda\theta\phi)(1 - \theta)} \qquad (5.9)$

Note that when 100% inspections are performed, the conditional probability in equation 5.9 becomes

1  PWD p(1, , ) (5.10) 1 and under these conditions, this probability is only zero when the detection efficiency is 100%. For 0% detection efficiency p(1, 0, ) is  - the unconditional probability that the object is a security threat. Furthermore, whenever the inspection rate is 100% , p,, exceeds p(1, , ) . This increase in ‘risk’ may be regarded as the ‘cost’ associated with less than complete inspection. We thus define our performance criterion  to be the ratio p,, , thus p1, , 

1  1 ,,  (5.11) 11  

We next consider an info-gap formulation to assess the effects of uncertainty in the key parameters (namely $\phi$ and $\theta$) on the performance criterion given by equation 5.11.

5-3 An Info-Gap model for surveillance performance

Information-gap (hereafter referred to as info-gap) theory is a recent development designed to assist decision makers faced with severe uncertainty (Ben-Haim 2006, Regan et al. 2005, Carmel and Ben-Haim 2005). Info-gap theory aims to address the "robustness" of decision making under uncertainty. It asks the question: how wrong can a model and its parameters be without jeopardising the quality of decisions made on the basis of this model? Info-gap theory derives its robustness functions from three elements: a performance measure, a process model and a non-probabilistic model of uncertainty. The performance measure is a statistical, economic or bio-physical metric of value to the decision maker. The decision maker may wish to increase the performance measure (e.g. dollar value of a share portfolio) or reduce it (e.g. probability of not detecting a terrorist attack). In each case there is often a critical performance value which defines a change in decision. In our case, the performance measure is $\psi$, effectively the reduction in surveillance efficacy when less than 100% inspection is employed. The process model is a mathematical summary of the system in question. It describes the relationship between the performance measure and the important characteristics of the system in question. In this example the performance threshold is the maximum tolerable reduction in surveillance efficacy and the process model is given by equation 5.11.

The info-gap model of uncertainty for the uncertain quantities $u$ in the process model is the unbounded family of nested sets $U(\alpha, \tilde{u})$ of possible realisations $u$, where $\alpha$ represents the unknown "horizon of uncertainty" and $\tilde{u}$ our best or initial estimate of $u$. This model satisfies two axioms:

contraction: $U(0, \tilde{u}) = \{\tilde{u}\} \qquad (5.12)$

nesting: $\alpha < \alpha' \implies U(\alpha, \tilde{u}) \subseteq U(\alpha', \tilde{u}) \qquad (5.13)$

The contraction axiom states that in the absence of uncertainty $(\alpha = 0)$ our best estimate $\tilde{u}$ is correct, while the nesting axiom states that the range of uncertain variation increases as the horizon of uncertainty increases. In all cases $\alpha$ is unknown and unbounded, with $\alpha \geq 0$. In this example the uncertain quantities are the detection efficiency $\phi$ and $\theta$, the probability that an object is a security threat. Thus $u = (\phi, \theta)$, and our initial or best estimate of these parameters is denoted $(\tilde{\phi}, \tilde{\theta})$. In this section we consider uncertain parameter values: the detection efficiency $\phi$ and the probability that an object is a security threat, $\theta$. The fractional errors $\left|\phi - \tilde{\phi}\right|/\tilde{\phi}$ and $\left|\theta - \tilde{\theta}\right|/\tilde{\theta}$ are unknown. With this prior information we formulate the following fractional-error info-gap model:

$U(\alpha, \tilde{\phi}, \tilde{\theta}) = \left\{ (\phi, \theta) : \max\left[0, (1-\alpha)\tilde{\phi}\right] \leq \phi \leq \min\left[1, (1+\alpha)\tilde{\phi}\right],\; \max\left[0, (1-\alpha)\tilde{\theta}\right] \leq \theta \leq \min\left[1, (1+\alpha)\tilde{\theta}\right] \right\}, \quad \alpha \geq 0 \qquad (5.14)$

This is a bounded family of nested sets of $(\phi, \theta)$ values, with the sets becoming more inclusive as the horizon of uncertainty, $\alpha$, increases.

The definition of the performance measure, process model and uncertainty model(s) completes the specification of the info-gap analysis. We now turn to the derivation of the robustness function. In info-gap parlance, "robustness" is defined as the greatest horizon of uncertainty, across all uncertain model components, such that the performance measure still meets the pre-defined requirement. In our application, the robustness of a surveillance regime in which $\lambda \times 100\%$ of the target population is inspected is the greatest horizon of uncertainty $\alpha$ for which all combinations of the uncertain parameters $(\phi, \theta)$ still achieve the required inspection performance, that is

$\hat{\alpha}(\lambda, \psi_d) = \max\left\{ \alpha : \left[\max_{(\phi, \theta) \in U(\alpha, \tilde{\phi}, \tilde{\theta})} \psi(\lambda, \phi, \theta)\right] \leq \psi_d \right\} \qquad (5.15)$

where $\psi_d$ is the required (maximum tolerable) value of $\psi$. Equation 5.15 is the robustness function for this application of the info-gap model. The strategy of robust-satisficing (Ben-Haim 2006) is to attempt to guarantee an adequate level of surveillance performance by choosing a value of $\lambda$ which is highly robust to uncertainty. Thus, for any inspection fraction $\lambda$, the robustness function indicates the confidence in attaining the performance requirement with that $\lambda$. Examination of the process model (equation 5.11) reveals that it is a monotonic decreasing function with respect to $\phi$ and a monotonic increasing function with respect to $\theta$. Combining this observation with the uncertainty model (equation 5.14) allows us to write the inner maximum of the robustness function (equation 5.15) as

$h(\alpha, \lambda, \tilde{\phi}, \tilde{\theta}) \leq \psi_d \qquad (5.16)$

where

$h(\alpha, \lambda, \tilde{\phi}, \tilde{\theta}) = \frac{\left[1 - \lambda(1+\alpha)\tilde{\theta}\right]\left[1 - (1-\alpha^2)\tilde{\theta}\tilde{\phi}\right]}{\left[1 - (1-\alpha^2)\lambda\tilde{\theta}\tilde{\phi}\right]\left[1 - (1+\alpha)\tilde{\theta}\right]} \qquad (5.17)$

5-4 Illustrative Example

Suppose new intelligence suggested that a clandestine operation had been planned to smuggle native fauna out of the country and, although the exact mode of export is unknown, it is thought to rely on secret cavities sewn into a passenger's clothing. Airport security and quarantine staff thus have no clear idea what they are looking for, except that they have been instructed to closely monitor the appearance, texture, and integrity of passengers' clothes. Our best guess of the parameters $(\phi, \theta)$ is $\tilde{\phi} = 0.7$ and $\tilde{\theta} = 0.05$, although considerable uncertainty exists around these figures. Figure 69 plots the performance function $\psi(\lambda, \phi, \theta)$ as a function of robustness for a range of $\lambda$ values.

Figure 69. Robustness of surveillance performance (relative risk, ψ) for various sampling fractions (λ = 0.052, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 and 0.85).


In recognition that 100% detailed inspection of all passengers is not feasible, a reduced level of surveillance will be tolerated provided the increased risk (of an undetected threat) is no more than 1.5% (relative to complete inspection). The dashed (red) horizontal line in Figure 69 is thus our maximum tolerable relative risk. To meet this performance requirement, a minimum of around 5% of passengers will have to be screened. At this level of screening the robustness to uncertainty is zero, and hence, if our initial estimates of the probability of a quarantine threat or of the detection probability are wrong, the performance requirement will not be met. Increasing the surveillance rate to 50% results in about 20% robustness, while an inspection rate of 85% will guarantee the performance requirement is met even if our initial guesses for the parameters are in error by 90%.

5-4-1 Comparison with a Bayesian Approach

The previous example has been modelled using the WinBUGS software. The directed acyclic graph is shown in Figure 70. Uniform priors have been placed on $\theta$ and $\phi$; in particular, $\theta \sim \mathrm{unif}(0.05, 0.15)$ and $\phi \sim \mathrm{unif}(0.1, 0.8)$ (note that the prior for $\phi$ is truncated). The respective prior densities for $\theta$ and $\phi$ are shown in Figures 71 and 72. The WinBUGS code is shown in Figure 73.

Figure 70. Directed acyclic graph for the biosurveillance example (nodes: lambda, theta, phi, psi).

Figure 71. Prior density for theta (uniform, lower = 0.05, upper = 0.15).
Figure 72. Prior density for phi (uniform, lower = 0.1, upper = 0.8).


    model {
        theta ~ dunif(0.05, 0.15)
        phi ~ dunif(0.1, 0.8)
        psi <- phi*(1-lambda*theta)*(1-theta*phi)/(1-phi*theta*lambda)/(phi*(1-theta))
    }

    list(lambda = 0.85)

Figure 73. WinBUGS code for the biosurveillance example.

Empirical cdfs of $\psi$ have been plotted for a variety of $\lambda$ values (Figure 74). Using a notional performance criterion of $\psi \leq 1.015$, we see that with a 60% inspection rate there is about a 25% probability of meeting this target. This probability increases to around 90% for inspection rates of at least 85%. These results accord well with the IG analysis.
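A Monte Carlo analogue of the WinBUGS run is sketched below (Python); it samples the same uniform priors and computes ψ directly to estimate the probability of meeting the ψ ≤ 1.015 criterion for each inspection rate.

    import numpy as np

    def psi(lam, phi, theta):
        return (1 - lam*theta) * (1 - theta*phi) / ((1 - lam*theta*phi) * (1 - theta))

    rng = np.random.default_rng(42)
    theta = rng.uniform(0.05, 0.15, size=10_000)   # prior on the threat probability
    phi = rng.uniform(0.1, 0.8, size=10_000)       # prior on the detection efficiency

    for lam in (0.052, 0.2, 0.4, 0.6, 0.85):
        p_meet = np.mean(psi(lam, phi, theta) <= 1.015)
        print(f"lambda = {lam:.3f}: P(psi <= 1.015) = {p_meet:.2f}")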

Figure 74. Empirical cdfs for the performance measure of equation 5.11, based on 10,000 simulated values for λ = 0.052, 0.2, 0.4, 0.6 and 0.85.


5-5 Discussion

Physical and biosecurity is maintained and enhanced by a combination of activities and strategies, not least of which are border inspections of people and containers. A characteristic linking both bio-terrorism and biosecurity is the "unknown unknowns"1 – that is, we often don't know what it is we're looking for. Monitoring strategies for the detection of invasive plant or animal species are particularly problematic due to the general absence of economic considerations and the climate of severe uncertainty about the likelihood of species introductions and successful detection (Moffitt et al. 2008). Moffitt et al. (2008) used Info-Gap theory (Ben-Haim, 2006) to deal with the inherent uncertainty in biosecurity monitoring, while Thompson and Fox (2008) independently used the same approach in the context of bioterrorism surveillance. In this chapter we have followed the approach outlined in Thompson and Fox (2008) and compared it with the results of a Bayesian approach to the problem of determining an appropriate level of monitoring effort. With the priors used in the Bayesian analysis chosen to reflect reasonable assumptions about our knowledge (or ignorance) of detection and threat probabilities, it was found that both the IG and Bayesian methods resulted in similar inspection rates. We suspect this is more coincidence than a convergence of paradigms. The IG approach is, despite its appearance, relatively unsophisticated in its treatment of uncertainty. Whereas Bayesian methods characterise and manipulate uncertainty via probability functions or probability density functions, the info-gap method is deterministic and is essentially a sensitivity analysis on selected model parameters. While there is nothing inherently wrong with this, we suspect that the use of realistic prior distributions coupled with informative probability models for inspection and detection is likely to yield more informative results than the IG method. One of the difficulties with the IG approach is the interpretation of the robustness metric that is central to the IG paradigm (Fox, 2008). Other issues with the IG philosophy have been raised by Sniedovich (2007). Irrespective of the method used, a recurring message from these types of analyses is that the invariably small, flat-rate inspection policies that are in widespread use at present are unlikely to provide adequate levels of immunity to the severe uncertainty that characterises biosecurity and biosurveillance. Given the importance and level of interest in robust decision-making for biosecurity, considerably more work needs to be done in this area.

1 This term has been attributed to former US Defence Secretary Donald Rumsfeld, who used it during a press briefing on Afghanistan on February 12, 2002.


6 References

Anderson, L.B., Atwell, R.J., Barnett, D.S. and Bovey, R.L. (2007) Application of the Maximum Flow Problem to Sensor Placement on Urban Road Networks for Homeland Security. Homeland Security Affairs, 3(3), 1-15.

Animal Health Australia (2007). Disease strategy: Equine influenza (Version 3.0). Australian Veterinary Emergency Plan (AUSVETPLAN), Edition 3, Primary Industries Ministerial Council, Canberra, ACT.

Alexandersen, S., Oleksiewicz, M. and Donaldson, A. (2001) The Early Pathogenesis of foot and mouth disease in pigs infected by contact: a quantitative time-course study using TaqMan RT-PCR. J. Gen. Virol. 82, 747-755.

Arakaki, R.G.I. and Lorena, L.A.N. (2001) A Constructive Genetic Algorithm for the Maximal Covering Location Problem. Proc. MIC’2001- 4th Metaheuristics International Conference, Porto Portugal, July 16-20, 2001.

AusVet and CSIRO (2005) Review of the Potential Impacts of New Technologies on Australia’s Foot and Mouth Disease (FMD) Planning and Policies – Report to Australian Government Department of Agriculture, Fisheries and Forestry. DAFF, Canberra. (Available at: http://www.daff.gov.au/__data/assets/pdf_file/0004/146875/ausvet_fmd_technology_fullreport_jul06.pdf)

Baron, M.I. (2001) Bayes Stopping Rules in a Change-Point Model with a Random Hazard Rate. Sequential Analysis, 20(3), 147-163.

Ben-Haim, Y. (2006). Information-gap decision theory: Decisions under severe uncertainty. 2nd. edition, Academic Press, San Diego.

Benneyan, J.C. (2001a) Number-Between g-type statistical quality control charts for monitoring adverse events. Health Care Management Science, 4, 305-318.

Benneyan, J.C. (2001b) Performance of Number-Between g-type statistical quality control charts for monitoring adverse events. Health Care Management Science, 4, 319-336.

Benyoussef, A., Boccara, N., Chakib, H., and Ez-Zahraouy, H. (1999) Lattice three-species models of the spatial spread of rabies among foxes. Int. J. Modern Phys. C, 22, 1025-1038.

Buczak, A.L., Wang, H., Darabi, H. and Jafari, M.A. (2001) Genetic algorithm convergence study for sensor network optimization. Information Sciences, 133, 267-282.

Burkom, H.S., Murphy, S.P. and Shmueli, G. (2007) Automated Time Series Forecasting for Biosurveillance. Statistics in Medicine, 26, 4202–4218.


Callinan, I. (2008) Equine influenza – The August 2007 outbreak in Australia: Report of the Equine Influenza Inquiry. The Hon. Ian Callinan AC, April 2008, Commonwealth of Australia.

Cannon, R.M. and Garner, M.G. (1999) Assessing the risk of wind-borne spread of foot-and-mouth disease in Australia. Environment International, 25(6/7), 713-723.

Carpenter, T.E. (2001) Methods to investigate spatial and temporal clustering in veterinary epidemiology. Preventive Veterinary Medicine, 48, 303-320.

Carmel, Y. and Ben-Haim, Y. (2005) Info-gap robust-satisficing model of foraging behavior: Do foragers optimize or satisfice? American Naturalist, 166, 633-641.

Chakrabarty, S. and Iyengar, S.S. (2002) Sensor placement for grid coverage under imprecise detections. Proceedings of the Fifth International Conference on Information Fusion 1581-158.

Chakrabarty, K., Iyengar, S.S., Qi, H. and Cho, E. (2002) Grid coverage for surveillance and target location in distributed sensor networks. IEEE Transactions on Computers, 51(12), 1448-1453.

Chuang, C. and Lin, R. (2007) A maximum Expected Covering Model for an Ambulance Location Problem. J. Chinese Inst. Indust. Eng. 24(6), 468-474.

Church, R. and ReVelle, C. (1974) The Maximal Covering Location Problem. Papers of the Regional Science Association, 32, 101-118.

Church, R. (1984) The Planar Maximal Covering Location Problem. J. Regional Science, 24(2), 185-201.

Colosimo, B. M. and del Castillo, E. eds. (2007) Bayesian Process Monitoring, Control and Optimization. Chapman and Hall / CRC, Boca Raton.

Commonwealth of Australia (2002) Meat hygiene assessment: Objective methods for the monitoring of processes and products. 2nd. Edition. Canberra, Australia. http://www.daff.gov.au/corporate_docs/publications/pdf/quarantine/mid/msqa2ndedition.pdf

Doherr, M.G. and Audigé, L. (2001) Monitoring and surveillance for rare health-related events: a review from the veterinary perspective. Philos.Trans.R.Soc.London Ser.B, 356, 1097–1106.

Doran, R.J. and Laffan, S.W. (2005) Simulating the spatial dynamics of foot and mouth disease outbreaks in feral pigs and livestock in Queensland, Australia using a susceptible-infected-recovered cellular automata model. Preventive Veterinary Medicine, 70, 133-152.

Donaldson, A.I. (1983) Quantitative data on airborne FMD virus; its production, carriage and deposition. Phil. Trans. Roy. Soc. Lond. Ser B – Biol. Sci., 302, 529-534.


Dooley, K., and R. Flor (1998) Perceptions of Success and Failure in Total Quality Management Initiatives. J. of Quality Management, 3(2), 157-174.

Department of Primary Industries Victoria (2008) Equine Influenza in Australia 2007: Victoria Situation Report. Date: Friday 25 January 2008 – 1200 hrs. http://www.polocrossevic.org.au/DocumentDownload/SITREP_%20INDUSTRY_EISDCHQ_20080125.pdf

Department of Primary Industries NSW (2008) Equine influenza FAQs - About the disease and symptoms. http://www.dpi.nsw.gov.au/agriculture/livestock/horse/influenza/faqs/disease

Downs, B.T., and Camm, J.D. (1996) An Exact Algorithm for the Maximal Covering Problem. Naval Research Logistics, 43, 435-461.

Eaton, D.J., Church, R.L. and ReVelle, C. (1977) Location Analysis: A New Tool for Health Planners. Methodological Working Document No. 53, Sector Analysis Division, Bureau for Latin America, Agency for International Development.

FAO (1998) Guidelines for Surveillance, International Standards for Phytosanitary Measures Publication No. 6 (http://www.ippc.int/IPP/En/ispm.htm)

FAO (2005) Regional Standards for Phytosanitary Measures requirements for the establishment and maintenance of Pest Free areas for Tephritid Fruit Flies. RAP Publication 2005/26. The Asia and Pacific Plant Protection Commission (APPPC)

Fox, D.R. (2008) To IG or not to IG? – that is the question. Decision-making under uncertainty. Decision Point, 24, 10-11. (Available at: http://www.aeda.edu.au/docs/Newsletters/DPoint_24.pdf).

Fox, D.R., Bisignanesi, V., Joynt, B., and McGuire, R. (2009) Optimising Victoria’s Air Quality Monitoring Network. The Australian Centre for Environmetrics, January 2009, University of Melbourne, Parkville, Australia.

Fox, J.C., Buckley, Y.M., Panetta, F.D., Bourgoin, J., and Pullar, D. (2009) Surveillance protocols for management of invasive plants: modelling Chilean needle grass (Nassella neesiana) in Australia. Diversity and Distributions, (in press).

Fricker, R.D. (2008) Syndromic Surveillance. In Encyclopedia of Quantitative Risk Assessment and Analysis, Melnick, E. and Everitt, B. (eds). John Wiley & Sons Ltd, Chichester, UK.

Fuentes, M.A. and Kuperman, M.N. (1999) Cellular Automata and Epidemiological Models with Spatial Dependence. Physica A, 267, 471-486.

Garner, M.G. and Cannon, R.M. (1995) Potential for Wind-Borne Spread of Foot and Mouth Disease Virus in Australia – A report prepared for the Australian Meat Research Corporation. (Available at: http://www.daff.gov.au/__data/assets/pdf_file/0007/159541/fmdwind.pdf)

Hall, J.A. and Golding, L. (1998) Standard methods for whole effluent toxicity testing: Development and Application. NIWA Client report MfE80205, National Institute of Water and Atmospheric Research, Hamilton, New Zealand.

Hamada, M. (2002) Bayesian Tolerance Interval Control Limits for Attributes. Qual. Reliab. Engng. Int. 18, 45-52.

Hogan, W.R., Cooper, G.F., Wallstrom, G.L., Wagner, M.M. and Depinay, J. (2007) The Bayesian aerosol release detector: An algorithm for detecting and characterizing outbreaks caused by an atmospheric release of Bacillus anthracis. Statistics in Medicine, 26, 5225-5252.

Höhle, M. and Paul, M. (2008) Count data regression charts for the monitoring of surveillance time series. Computational Statistics and Data Analysis, 52, 4357-4368.

Kanaroglou, P.S., Jerrett, M., Morrison, J., Beckerman, B., Arain, M.A., Gilbert, N.L., and Brook, J.R. (2005) Establishing an air pollution monitoring network for intra-urban population exposure assessment: A location-allocation approach. Atmospheric Environment 39, 2399-2409.

Korth, W. (1997) Determination and application of recovery factors in chemical residue testing of food products by Australian laboratories associated with the national residue survey. http://www.daff.gov.au/corporate_docs/publications/pdf/animalplanthealth/nrs/recovery_factors_korth_1997.pdf

Krause, A., Guestrin, C., Gupta, A., and Kleinberg, J. (2006) Near-optimal sensor placements: maximizing information while minimizing communication cost. Proceedings of the fifth international conference on Information processing in sensor networks 2-10.

Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics: Theory and Methods, 26:1481-1496.

Kulldorff, M. (2001) Prospective time-periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society (Series A), 164, 61-72.

Lindo Systems (2007) Lingo User’s Guide. Lindo Systems Inc., Chicago.

Liu, M.K., Avrin, J., Pollack, R.I., Behar, J.V., and McElroy, J.L. (1986) Methodology for designing air quality monitoring networks: I. Theoretical aspects. Environmental Monitoring and Assessment 6, 1-11.

Marshall, C., Best, N., Bottle, A. and Aylin, P. (2004) Statistical Issues in the prospective monitoring of health outcomes across multiple units. J. Royal Statist. Soc. (A), 167(3), 541-559.


Menzefricke, U. (2002) On the Evaluation of Control Chart Limits Based on Predictive Distributions. Commun. Statist. – Theory Meth., 31(8), 1423-1440.

Moffitt, L.J., Stranlund, J.K., and Field, B.C. (2005) Inspections to Avert Terrorism: Robustness Under Severe Uncertainty. Journal of Homeland Security and Emergency Management, 2(3), 1-17.

Moffitt, L.J., Stranlund, J.K. and Osteen, C.D. (2008) Robust detection protocols for uncertain introductions of invasive species. J. Environmental Management, 89, 293-299.

Mostashari, F. and Hartman, J. (2003) Syndromic Surveillance: a Local Perspective. Journal of Urban Health: Bulletin of the New York Academy of Medicine 80(2), Supplement 1.

Nairn, M.E., Allen, P.G., Inglis, A.R. and Tanner, C. (1996) Australian quarantine: a shared responsibility. Dept. of Primary Industries and Energy, Canberra, A.C.T.

Noel, S. and Jajodia, S. (2007) Attack Graphs for Sensor Placement, Alert Prioritization, and Attack Response. Presented at the Cyberspace Research Workshop - Air Force Cyberspace Symposium.

Önal, H. (2003) First-best, second-best, and heuristic solutions in conservation reserve site selection. Biological Conservation, 115, 55-62.

O’Rourke, J. (1987) Art Gallery Theorems and Algorithms. Oxford University Press, New York, NY.

Patil, G.P., Acharya, R., Glasmeier, A., Myers, W., Phoha, S. and Rathbun, S. (2006) Hotspot Detection and Prioritization GeoInformatics for Digital Governance. Penn State University, Center for Statistical Ecology and Environmental Statistics, Technical Report Number 2006-0516.

Pheloung, P. (2004) Plant pest surveillance in Australia, Australian Journal of Emergency Management, 19(3), 13-16.

Pheloung, P. (2005) Contingency planning for plant pest incursions in Australia. In: Identification of risks and management of invasive alien species using the IPPC framework. Proceedings of a workshop in Braunschweig, Germany, 22–26 September 2003. FAO (2005).

Pirkul, H. and Schilling, D. (1988) The Siting of Emergency service Facilities with Workload capacities and Backup services. Management Science, 34(8), 896-908.

Radaelli, G. (1998) Planning time-between-events Shewhart control charts. Total Quality Management, 9(1), 133-140.


Regan, H.M., Ben-Haim, Y., Langford, W., Wilson, W.G., Lundberg, P., Andelman, S.J. and Burgman, M.A. (2005) Robust decision making under severe uncertainty for conservation management, Ecological Applications, 15(4), 1471-1477.

Riley, S. (2007) Large-Scale Spatial-Transmission Models of Infectious Disease. Science, 316, 1 June 2007, 1298-1301.

Robinson, A., Burgman, M., Atkinson, W., Cannon, R., Miller, C., and Immonen, H. (2009) Import Clearance Data Framework. ACERA Project 0804 Final Report. Australian Centre of Excellence for Risk Analysis, University of Melbourne.

Shea, D.A. and Lister, S.A. (2003) The BioWatch Program: Detection of Bioterrorism. Congressional Research Service Report No. RL32152, November 19, 2003. Available at: http://www.fas.org/sgp/crs/terror/RL32152.html

Shmueli, G. and Burkom, H.S. (in press) Statistical Challenges Facing Early Outbreak Detection in Biosurveillance. Technometrics (Special Issue on Anomaly Detection).

Sniedovich, M. (2007) A Critique of Info-Gap: Myths and Facts. Second SRA Conference August 20-21, 2007 Hobart, TAS, Australia. (Available at: http://www.acera.unimelb.edu.au/materials/papers%2007/Moshe_Sniedovich_SRA.pdf ).

Sonesson, C. and Bock, D. (2003) A Review and Discussion of Prospective Statistical Surveillance in Public Health. Journal of the Royal Statistical Society. Series A (Statistics in Society), 166(1), 5-21.

Sørensen, J.H., Mackay, D., Jensen, C.O., and Donaldson, A. (2000) An integrated model to predict the atmospheric spread of foot and mouth disease virus. Epidemiol. Infect. 26, 93-97.

Stark, K.D.C., Regula, G., Hernandez, J., Knopf, L., Fuchs, K., Morris, R.S. and Davies, P. (2006) Concepts for risk-based surveillance in the field of veterinary medicine and veterinary public health: Review of current approaches. BMC Health Services Research, 6(20). Available at http://www.biomedcentral.com/1472-6963/6/20

Stephens, M.A. (1974) EDF Statistics for Goodness of Fit and Some Comparisons. Journal of the American Statistical Association, 69(347), 730-737.

Thompson, C.J. (unpublished) Modelling Risk and Uncertainty in Bio- and Homeland Security. Presented at the Australian and New Zealand Society for Risk Analysis Conference, University of Melbourne, July 17-19, 2006.

Thompson, C.J. and Fox, D.R. (2008) Sampling and Inspection for Monitoring Threats to Homeland Security, in Encyclopedia of Quantitative Risk Assessment and Analysis, Melnick, E. and Everitt, B. (eds). John Wiley & Sons Ltd, Chichester, UK, pp 1600-1603.

Tsiamyrtzis, P., and Hawkins D.M. (2007) Bayesian Process Monitoring, Control and Optimization. Chapman & Hall/CRC Press London/Boca Raton, FL.


Underwood, A.J. (1994) On beyond BACI: Sampling designs that might reliably detect environmental disturbance. Ecological Applications, 4(1), 3-15.

Vickers, L., Stolkin, R. and Nickerson, J.V. (2006) Computational Environmental Models Aid Sensor Placement Optimization. Military Communications Conference, 2006. MILCOM 2006 1-6.

Wagner, M.M., Aryel, R.M., and Moore, A.W. (2006) Handbook of Biosurveillance. Elsevier.

Weisstein, E. W. (undated) NP-Hard Problem. From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/NP-HardProblem.html

Wong, W., Cooper, G., Dash, D., Levander, J., Dowling, J., Hogan, W., and Wagner, M. (2005). Use of Multiple Data Streams to Conduct Bayesian Biologic Surveillance, Morbidity and Mortality Weekly Report, 54 (supplemental), pp. 63-69.

Yoffe, A. and Ben-Haim, Y. (2006) An Info-Gap Approach to Policy Selection for Bio-Terror Response. IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006, 554-559.


APPENDIX A : DATA USED IN CHAPTER 1 EXAMPLE

* 5.966 23.313 20.971 18.324 6.164 8.349 1.108 19.483 9.941 3.563 3.743 1.552 9.793 2.808 22.799 2.789 0.015 0.104 2.83 21.088 0.702 9.899 21.599 1.16 0.965 0.195 21.147 0.423 5.451 16.495 2.707 1.047 5.013 5.712 7.103 4.145 5.057 8.668 19.926 6.384 2.808 15.013 7.363 0.349 1.98 4.654 6.911 2.904 0.424 2.257 3.409 9.08 10.489 14.868 0.123 18.668 1.915 0.968 4.609 6.428 12.826 11.432 13.102 21.721 1.839 8.508 6.479 10.587 10.6 10.602 1.344 5.558 6.039 1.375 7.862 7.242 3.396 19.974 32.228 4.928 8.224 4.206 0.734 4.386 0.462 0.597 21.667 7.259 2.426 6.356 10.23 2.183 0.959 2.917 9.976 5.189 0.72 5.679 10.163 5.903 7.936 0.701 11.881 11.151 14.205 6.05 0.779 8.724 7.169 13.771 0.61 1.1 3.945 1.053 0.779 1.41 3.454 1.356 0.061 19.92 31.168 7.937 12.486 22.916 1.578 2.46 3.691 9.384 0.44 6.278 0.113 4.232 2.348 4.667 4.774 5.353 2.536 7.808 7.331 10.9 16.738 2.292 3.082 13.185 1.235 2.471 17.38 1.407 10.346 17.379 5.669 3.893 1.8 2.041 3.875 15.387 17.414 1.693 6.714 34.181 6.708 2.954 6.668 6.965 0.528 0.756 4.081 4.881 2.697 5.011 2.028 11.68 15.126 2.506 7.047 5.783 1.436 3.053 5.868 9.69 1.605 23.962 5.221 4.929 2.158 0.27 12.947 16.342 0.436 5.259 2.887 1.46 3.164 1.665 36.372 15.661 1.749 0.763 2.477 3.188 2.569 5.877 11.601 0.448 7.424 0.594 13.119 9.131 2.206 0.585 4.692 3.905 3.994 31.585 3.071 13.219 8.659 0.326 13.719 5.003 15.532 1.169 3.618 3.98 1.698 0.319 4.415 2.548 2.379 25.935 15.764 1.957 5.437 34.542 5.101 7.958 13.953 5.546 0.96 0.576 11.084 1.708 6.436 0.195 1.679 12.284 25.663 0.964 7.051 12.53 4.02 22.363 108.031 Table entries are days


APPENDIX B: Derivation of Equation 2-5

Equation 2-5 is given in the text as:

$$
p(y;a,b)=\frac{N^y}{y!}\left[\prod_{j=0}^{y-1}\frac{j+a}{j+a+b}\right]\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y+a+r}{y+a+b+r}\right]\frac{(-N)^m}{m!}\right\};\quad y=0,1,\ldots,N;\;a,b>0 \qquad (A1)
$$

The proof follows.

We have
$$
p(y;a,b)=\int_0^1\frac{e^{-N\theta}(N\theta)^y}{y!}\,\frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\,d\theta
$$
which can be re-written as
$$
p(y;a,b)=\frac{N^y}{y!}\,\frac{B(y+a,b)}{B(a,b)}\int_0^1 e^{-N\theta}\,\frac{\theta^{y+a-1}(1-\theta)^{b-1}}{B(y+a,b)}\,d\theta. \qquad (A2)
$$

The integrand in equation A2 is the expected value of $e^{-N\theta}$ with respect to the $Beta(y+a,\,b)$ density.

The moment-generating function for a $Beta(y+a,\,b)$ density is
$$
M_\theta(t)=E\!\left[e^{t\theta}\right]=1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y+a+r}{y+a+b+r}\right]\frac{t^m}{m!} \qquad (A3)
$$
and thus the integral in equation A2 is simply $M_\theta(-N)$, and hence
$$
p(y;a,b)=\frac{N^y}{y!}\,\frac{B(y+a,b)}{B(a,b)}\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y+a+r}{y+a+b+r}\right]\frac{(-N)^m}{m!}\right\}. \qquad (A4)
$$

It is relatively straightforward to show that the ratio of the two beta functions in equation A4 can be expressed as $\prod_{j=0}^{y-1}\dfrac{j+a}{j+a+b}$, thus completing the proof.
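As an informal numerical check of equations A1 and A4 (not part of the original derivation), the truncated series can be compared with direct quadrature of the Poisson-beta mixture. A minimal Python sketch, with arbitrary illustrative values of N, a and b:

import math
from scipy import integrate
from scipy.stats import beta as beta_dist

def p_y_series(y, a, b, N, n_terms=200):
    # Equation A1/A4 with the infinite sum truncated at n_terms
    beta_ratio = math.prod((j + a) / (j + a + b) for j in range(y))  # B(y+a,b)/B(a,b)
    mgf, term = 1.0, 1.0                                             # M(-N) for a Beta(y+a,b) variable
    for m in range(1, n_terms + 1):
        term *= (y + a + m - 1) / (y + a + b + m - 1) * (-N) / m
        mgf += term
    return N**y / math.factorial(y) * beta_ratio * mgf

def p_y_numeric(y, a, b, N):
    # The same quantity by quadrature of Poisson(y; N*theta) against the Beta(a,b) prior
    f = lambda th: (math.exp(-N * th) * (N * th)**y / math.factorial(y)
                    * beta_dist.pdf(th, a, b))
    return integrate.quad(f, 0.0, 1.0)[0]

a, b, N = 1.5, 8.0, 10
for y in range(4):
    print(y, round(p_y_series(y, a, b, N), 6), round(p_y_numeric(y, a, b, N), 6))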


APPENDIX C: Derivation of Equation 2-9

Equation 2-9 is given in the text as
$$
p\!\left(X_{t+1}=x_{t+1}\mid y_t\right)=
\frac{\displaystyle\binom{n_{t+1}}{x_{t+1}}B\!\left(x_{t+1}+y_t+a,\;n_{t+1}-x_{t+1}+b\right)
\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{x_{t+1}+y_t+a+r}{n_{t+1}+y_t+a+b+r}\right]\frac{(-N)^m}{m!}\right\}}
{\displaystyle B\!\left(y_t+a,\;b\right)
\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y_t+a+r}{y_t+a+b+r}\right]\frac{(-N)^m}{m!}\right\}} \qquad (B.1)
$$

The proof follows.

As given in the text, $N=\sum_{i=1}^{t}n_i$ is the total number of sampled units and $Y_t=\sum_{i=1}^{t}X_i$ is the total number of failures at time $t$. Now,
$$
p\!\left(X_{t+1}\mid y_t\right)=\int_0^1 f\!\left(x_{t+1}\mid\theta\right)p\!\left(\theta\mid y_t\right)d\theta
\quad\text{with}\quad
p\!\left(\theta\mid y_t\right)=\frac{p\!\left(y_t\mid\theta\right)p(\theta)}{p\!\left(y_t\right)}.
$$

Furthermore, $p\!\left(y_t\right)=\int_0^1 p\!\left(y_t\mid\theta\right)p(\theta)\,d\theta$ with $p(\theta)=\dfrac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}$.

Now, $Y\sim\mathrm{bin}(N,\theta)$, but since $N$ is large and $\theta$ small it is reasonable to assume $Y\sim\mathrm{Poisson}(N\theta)$, i.e. $p(Y=y)=\dfrac{e^{-N\theta}(N\theta)^y}{y!}$. Hence we obtain the pdf for $y_t$ as
$$
p\!\left(y_t\right)=\int_0^1 p\!\left(y_t\mid\theta\right)p(\theta)\,d\theta
=\int_0^1\frac{e^{-N\theta}(N\theta)^{y_t}}{y_t!}\,\frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\,d\theta
$$
$$
=\frac{N^{y_t}}{y_t!\,B(a,b)}\int_0^1 e^{-N\theta}\theta^{y_t+a-1}(1-\theta)^{b-1}\,d\theta
=\frac{N^{y_t}\,B(y_t+a,b)}{y_t!\,B(a,b)}\int_0^1 e^{-N\theta}\,\frac{\theta^{y_t+a-1}(1-\theta)^{b-1}}{B(y_t+a,b)}\,d\theta
$$
$$
=\frac{N^{y_t}\,B(y_t+a,b)}{y_t!\,B(a,b)}\,E\!\left[e^{-N\theta}\right] \qquad (B.2)
$$
where $E\!\left[e^{-N\theta}\right]$ is the expectation of $e^{-N\theta}$ with respect to the pdf for $\theta$ [i.e. the $Beta(y_t+a,\,b)$ density].

From mathematical distribution theory, it is known that the moment generating function for the beta density is defined as:
$$
M_\theta(t)=E\!\left[e^{t\theta}\right]=1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y_t+a+r}{y_t+a+b+r}\right]\frac{t^m}{m!}. \qquad (B.3)
$$

Given that $E\!\left[e^{-N\theta}\right]=M_\theta(-N)$, it follows that
$$
E\!\left[e^{-N\theta}\right]=1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y_t+a+r}{y_t+a+b+r}\right]\frac{(-N)^m}{m!}
$$
and hence:
$$
p\!\left(y_t\right)=\frac{N^{y_t}\,B(y_t+a,b)}{y_t!\,B(a,b)}\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\frac{y_t+a+r}{y_t+a+b+r}\right]\frac{(-N)^m}{m!}\right\}. \qquad (B.4)
$$

Substituting equation B.4 into $p\!\left(\theta\mid y_t\right)=\dfrac{p\!\left(y_t\mid\theta\right)p(\theta)}{p\!\left(y_t\right)}$ we obtain:
$$
p\!\left(\theta\mid y_t\right)=\frac{\dfrac{e^{-N\theta}(N\theta)^{y_t}}{y_t!}\,\dfrac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}}
{\dfrac{N^{y_t}B(y_t+a,b)}{y_t!\,B(a,b)}\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}
=\frac{e^{-N\theta}\theta^{y_t+a-1}(1-\theta)^{b-1}}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}. \qquad (B.5)
$$

Returning to $p\!\left(X_{t+1}\mid y_t\right)$ we can write $p\!\left(X_{t+1}\mid y_t\right)=\int_0^1 f\!\left(x_{t+1}\mid\theta\right)p\!\left(\theta\mid y_t\right)d\theta$. Now,
$$
\int_0^1 f\!\left(x_{t+1}\mid\theta\right)p\!\left(\theta\mid y_t\right)d\theta
=\int_0^1\binom{n_{t+1}}{x_{t+1}}\theta^{x_{t+1}}(1-\theta)^{n_{t+1}-x_{t+1}}\,
\frac{e^{-N\theta}\theta^{y_t+a-1}(1-\theta)^{b-1}}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}\,d\theta
$$
$$
=\frac{\dbinom{n_{t+1}}{x_{t+1}}}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}
\int_0^1 e^{-N\theta}\theta^{x_{t+1}+y_t+a-1}(1-\theta)^{n_{t+1}-x_{t+1}+b-1}\,d\theta
$$
$$
=\frac{\dbinom{n_{t+1}}{x_{t+1}}B\!\left(x_{t+1}+y_t+a,\;n_{t+1}-x_{t+1}+b\right)}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}
\int_0^1 e^{-N\theta}\,\frac{\theta^{x_{t+1}+y_t+a-1}(1-\theta)^{n_{t+1}-x_{t+1}+b-1}}{B\!\left(x_{t+1}+y_t+a,\;n_{t+1}-x_{t+1}+b\right)}\,d\theta \qquad (B.6)
$$

Noting that the integrand in equation B.6 is a beta pdf, we have:
$$
p\!\left(X_{t+1}\mid y_t\right)=\frac{\dbinom{n_{t+1}}{x_{t+1}}B\!\left(x_{t+1}+y_t+a,\;n_{t+1}-x_{t+1}+b\right)}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}\,E\!\left[e^{-N\theta}\right]
$$
$$
=\frac{\dbinom{n_{t+1}}{x_{t+1}}B\!\left(x_{t+1}+y_t+a,\;n_{t+1}-x_{t+1}+b\right)
\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{x_{t+1}+y_t+a+r}{n_{t+1}+y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}
{B(y_t+a,b)\left\{1+\sum_{m=1}^{\infty}\left[\prod_{r=0}^{m-1}\dfrac{y_t+a+r}{y_t+a+b+r}\right]\dfrac{(-N)^m}{m!}\right\}}
$$
thus completing the proof.
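A simple sanity check on equation 2-9 (again illustrative only, with arbitrary parameter values) is that the predictive probabilities sum to one over x_{t+1} = 0, 1, ..., n_{t+1}. A minimal Python sketch of that check:

import math

def beta_mgf(alpha, beta_, t, n_terms=200):
    # Series (B.3): moment generating function of a Beta(alpha, beta_) variable, truncated
    val, term = 1.0, 1.0
    for m in range(1, n_terms + 1):
        term *= (alpha + m - 1) / (alpha + beta_ + m - 1) * t / m
        val += term
    return val

def log_beta(p, q):
    # log of the beta function B(p, q)
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def predictive_pmf(x, n_next, y_t, a, b, N):
    # Equation B.1: p(X_{t+1} = x | y_t)
    num = (math.comb(n_next, x)
           * math.exp(log_beta(x + y_t + a, n_next - x + b))
           * beta_mgf(x + y_t + a, n_next - x + b, -N))
    den = math.exp(log_beta(y_t + a, b)) * beta_mgf(y_t + a, b, -N)
    return num / den

a, b, N, y_t, n_next = 1.5, 8.0, 10, 2, 12
pmf = [predictive_pmf(x, n_next, y_t, a, b, N) for x in range(n_next + 1)]
print(sum(pmf))  # should be very close to 1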


APPENDIX D : DATA USED IN CHAPTER 3 EXAMPLE

Row Column x n Row Column x n 20 13 0 10 24 20 0 9 22 26 0 12 12 7 0 13 3 16 0 9 20 15 0 8 8 23 1 13 25 24 0 9 1 24 0 4 24 17 0 8 2 19 0 11 13 17 4 5 14 14 1 7 17 18 0 11 24 24 0 17 18 22 4 12 16 16 0 10 26 5 0 7 26 21 0 14 10 8 1 9 19 20 0 3 20 6 0 17 16 13 0 6 25 7 0 14 12 11 2 10 12 2 0 8 13 2 0 13 20 3 0 14 16 1 0 8 9 6 2 8 2 17 0 15 25 3 0 15 10 1 0 11 14 7 0 9 15 19 6 8 7 21 0 6 19 17 0 12 25 22 0 12 18 5 0 10 3 13 1 15 17 4 0 6 12 9 0 6 7 26 0 10 23 8 0 17 15 8 0 6 21 24 1 13 12 24 3 14 14 2 0 15 6 22 0 8 14 15 2 14 26 22 0 12 21 6 0 9 4 7 7 11 8 16 6 8 4 16 0 13 2 5 3 11 12 18 10 10 22 22 0 11 4 10 9 11 24 7 0 5 23 25 0 11 8 18 1 6 5 14 3 11 3 11 1 8 16 14 0 5 16 23 5 6 9 5 0 9 17 15 0 14 1 7 0 4 3 23 0 7 6 8 12 13 2 24 0 15 4 22 0 6 25 16 0 6 14 4 0 12 6 11 6 10 23 18 0 11 21 9 0 10 22 15 0 11 6 12 7 14 26 9 0 9 1 3 3 11


Row Column x n Row Column x n 26 8 0 13 22 9 0 12 18 9 0 11 11 11 3 9 8 22 0 10 18 15 0 9 4 6 7 9 20 26 0 8 12 1 0 10 17 19 1 9 13 15 2 7 13 6 0 8 12 19 6 7 13 19 12 13 19 15 0 14 5 3 2 7 2 9 2 9 1 6 1 5 1 17 0 10 16 20 3 8 10 21 2 13 4 13 0 8 10 5 0 11 25 17 0 6 2 6 3 8 9 19 3 7 2 3 5 12 18 16 0 11 22 14 0 14 12 4 0 11 16 25 5 13 7 9 5 7 13 18 10 11 19 10 0 14 21 5 0 4 3 5 5 14 5 23 0 12 1 21 0 10 18 17 0 6 22 11 0 9 13 11 0 14 26 18 0 11 26 17 0 9 26 14 0 7 18 8 0 9 7 24 0 12 7 17 2 8 9 17 5 9 1 13 0 13 5 6 6 8 15 1 0 7 14 17 5 9 17 8 0 9 6 2 0 12 6 25 0 15 16 2 0 15 4 24 0 15 25 26 0 10 24 5 0 9 26 16 0 8 25 21 0 5 7 12 10 12 2 4 6 12 3 3 5 10


APPENDIX E : MATHCAD CODE FOR CELLULAR AUTOMATA MODEL

Establish grid dimensions

R  26 C  26 i1  R j1  C

Define spatial probability model(s)

a 0.0025 b .0055

2 2  ()r1 r2  ()c1 c2  pr1c1( r2c2 ) exp 0.5    .12  2 2  ()r1 r2 2(r1 r2 )() c1 c2 ()c1 c2  p1( r1 c1r2c2 ) exp 1      a ab b 

Data Entry - observed cases and locations Data matrix D: col1=row index; col2=col index; col3=r cases; col4=sample size N

D 

g()ab  0 K  rows() D  1

I matrix() R Cg L  0 K D L2 P  DL0 DL1 D P is matrix of observed proportions L3

Initial probabilities

GG() r0 c0  for i1R  for j1C 

g  p1() r0 c0ij ij Mg return M


Update probabilities

MA() B A for ii R 1 1 for jj C 1 1 P1 for is ( ii 1 ) () ii 1 for js ( jj 1 ) () jj 1 PPA  ()1 p() is js iijj  1A if is ii  js jj  is js is js  B  A  1A ()1P ii jj ii jj ii jj B

Likelihood Function

L1() G tt  for it 0 tt AG it sum 0 for il 0 K

 30   if A  0A 10   Dil 0 Dil 1 Dil 0 Dil 1    if() 1 0.999999 sum sum D log()  D  D log() 1    il 2 il 3 il 2  BB  sum it return BB


Iterative step

Lik()  for r0 1 R for c0 1 C

G  GG() r0 c0 0 for T1   G  MG T T1 LL L1() G for t 0 () rows() LL  1 x  t t vs cspline() x LL fit( xx ) interp() vs xLLxx

d dfit() xx  fit() xx dxx  xx  2  on error t0 root() dfit()xx xx  Tmax  t0 r0 c0 Lmax  fit() t0 r0 c0

result stack() Tmax Lmax return result

Execute Spatial-Temporal Likelihood Routine

Q Lik() 200 Call Likelihood routine and evaluate for 200 time increments

Qt submatrix() Q 323323 For each grid cell extract max. likelihood value and time Ql submatrix() Q 2853126


APPENDIX F : DATA USED IN CHAPTER 4 EXAMPLE

Source: NSW Department of Local Government Comparative Information, available at: http://www.dlg.nsw.gov.au/Files/Comparatives/0506data.xls (2005/06 population data). Eastings and Northings were separately digitised and generally correspond to the location of the shire office or the main shire town.


LGA Area Popn Density East North Albury City Council 313 47247 150.9489 55489544.19 6006937.56 Armidale Dumaresq Council 4235 24611 5.811334 56372288.10 6623485.37 The Council of the 8 40018 5002.25 56326691.92 6248511.47 Auburn Council 32 64209 2006.531 56318056.99 6252811.05 Council 484 39953 82.54752 56554531.31 6806290.42 Council 21699 2730 0.125812 54734757.03 6164111.33 Bankstown City Council 77 177000 2298.701 56320515.48 6246228.39 Bathurst Regional Council 3820 37001 9.686126 55737652.31 6300358.92 The Council of the Shire of Baulkham Hills 401 161068 401.6658 56313143.65 6262799.59 Council 6280 32431 5.164172 55754085.51 5937494.66 Council 1602 12758 7.963795 56490221.37 6631080.26 Council 2067 8289 4.01016 55392522.77 6053397.24 240 283458 1181.075 56306319.45 6261359.58 Council 8560 6530 0.76285 55562644.59 6238218.91 Council 1525 6773 4.441311 55709424.31 6287420.05 Blue Mountains City Council 1432 76511 53.42947 56245749.74 6253862.08 Council 14611 3105 0.212511 55543782.77 6465771.25 Bombala Council 3944 2534 0.642495 55699752.38 5912814.16 Boorowa Council 2579 2495 0.967429 55657688.30 6187875.51 The Council of the 22 37074 1685.182 56333335.12 6242502.82 Council 41679 3906 0.093716 55397636.12 6670893.85 Council 19188 2168 0.112987 55486395.53 6685470.28 Broken Hill City Council 170 20203 118.8412 54544093.91 6463992.40 Burwood Council 7 31158 4451.143 56324615.79 6249816.81 Council 567 30827 54.36861 56559834.71 6831369.76 Cabonne Shire Council 6026 12703 2.108032 55674404.14 6336963.95 Camden Council 201 51367 255.5572 56287315.55 6229394.42 Campbelltown City Council 312 150216 481.4615 56298132.52 6228207.01 Council 20 67261 3363.05 56325763.09 6251416.42 Canterbury City Council 34 134126 3944.882 56325862.77 6247335.13 Council 18940 3274 0.172862 55355775.11 6191578.81 Council 53511 2406 0.044963 54769399.77 6473160.35 Cessnock City Council 1966 48502 24.6704 56343973.20 6366566.04 10441 49538 4.744565 56493548.28 6715426.76 Council 45606 5013 0.10992 55389904.90 6514598.89 Coffs Harbour City Council 1175 67442 57.39745 56512367.25 6649928.18 Council 8751 1782 0.203634 55334636.58 6091907.25 Council 2433 4127 1.69626 55518298.98 6147394.41 Cooma‐ Council 5229 9792 1.872633 55691001.82 5987716.22 Council 9926 4714 0.474914 55632618.98 6574634.93 Council 1524 7623 5.001969 55593913.40 6166502.05 Council 2324 11058 4.758176 55445129.82 6016107.36 Council 2810 13185 4.692171 55656488.52 6254887.97 Council 130 8169 62.83846 55314945.72 6066425.01 Dubbo City Council 3428 39263 11.45362 55651130.34 6431257.08 Council 2251 8440 3.749445 56383121.71 6414117.90 Council 3422 36389 10.63384 55768055.96 5996288.26 102 187790 1841.078 56310890.27 6250540.33 Council 4720 9974 2.113136 55593744.79 6305538.82 Council 4836 4660 0.963606 55657591.12 6490315.33 Council 5487 8735 1.591945 56493539.67 6715131.84 Council 2952 4917 1.66565 56405622.61 6433457.25 Gosford City Council 940 163304 173.7277 56345843.68 6300152.02 Goulburn Mulwaree Council 3220 27112 8.419876 55748774.02 6150722.05 3376 34695 10.27696 56453974.41 6439552.50 Greater Council 5746 10510 1.829099 55503364.04 6052899.50


LGA Area Popn Density East North Greater Taree City Council 3730 46986 12.59678 56453923.26 6469651.07 Griffith City Council 1640 25140 15.32927 55411983.84 6205380.69 Council 2458 3764 1.531326 55600985.37 6119128.65 Shire Council 4994 12074 2.417701 56237879.00 6569537.96 Council 4395 4460 1.01479 56372299.30 6656404.23 Council 9453 5530 0.584999 56346675.62 6661081.79 Council 1869 3773 2.018727 55625658.56 6175448.30 Hawkesbury City Council 2776 63824 22.99135 56297510.80 6279081.23 Council 11328 3534 0.31197 55302157.44 6180238.52 Holroyd City Council 40 91941 2298.525 56314305.51 6254900.40 The Council of the Shire of Hornsby 462 157204 340.2684 56232801.04 6268957.55 The Council of the Municipality of Hunters Hill 6 13928 2321.333 56328936.41 6254666.14 Hurstville City Council 23 76036 3305.913 56324566.30 6239997.91 Council 8606 15794 1.835231 56317571.48 6704579.90 Council 3375 1871 0.55437 55384546.63 6086686.12 Council 2031 5922 2.915805 55553436.85 6141106.73 Council 3380 28742 8.50355 56484928.10 6561452.45 The Council of the 258 20357 78.9031 56303626.32 6161275.82 Kogarah Municipal Council 16 55800 3487.5 56327543.22 6240364.98 Ku‐ring‐gai Council 86 108697 1263.919 56327503.08 6257453.72 3589 9630 2.683199 56500401.96 6833849.82 Council 14973 7360 0.491551 56502628.97 6836653.29 Lake Macquarie City Council 644 190320 295.528 56367738.45 6340389.58 Lane Cove Municipal Council 11 32326 2938.727 56330507.85 6256892.92 Council 1167 12026 10.30506 55445543.27 6176454.44 Leichhardt Municipal Council 11 51142 4649.273 56328303.11 6249671.53 Lismore City Council 1290 43628 33.82016 56524662.57 6813735.01 Council 4567 20889 4.5739 56235915.36 6291875.75 Liverpool City Council 305 170192 558.0066 56308135.68 6243967.18 Council 5086 7852 1.543846 56279679.29 6511748.58 Council 2895 3520 1.215889 55474185.61 6102402.15 Maitland City Council 392 61517 156.9311 56364940.41 6377203.27 15 38886 2592.4 56341520.33 6259018.68 17 75114 4418.471 56329631.61 6246191.78 Mid‐Western Regional Council 8737 22141 2.534165 55742729.20 6391132.55 Council 17928 15936 0.888889 55775586.20 6737355.89 Mosman Municipal Council 9 28363 3151.444 56337511.71 6255408.96 Council 4345 6729 1.548677 55310506.69 6034804.42 Council 3505 2620 0.747504 55397800.91 6147943.88 Council 3406 15149 4.447739 56301140.37 6428420.07 Nambucca Shire Council 1491 18755 12.57881 56489126.94 6604929.58 Council 13031 14172 1.08756 55767571.32 6641936.98 Council 4117 6582 1.598737 55459446.51 6155188.87 Council 5264 7033 1.336056 55616795.69 6433183.33 Newcastle City Council 183 146967 803.0984 56386048.62 6356218.24 11 60944 5540.364 56334048.79 6254330.01 3627 5447 1.501792 55764651.74 6267017.81 Orange City Council 285 37791 132.6 55695559.24 6315302.28 5134 11470 2.234125 55722018.65 6095827.43 Council 5958 15034 2.52333 55609517.33 6332764.55 Parramatta City Council 61 151860 2489.508 56315214.15 6256346.91 Penrith City Council 405 177955 439.3951 56286574.28 6263835.68 91 57354 630.2637 56342396.61 6278492.66 Port Macquarie‐Hastings Council 3687 70581 19.14321 56485350.65 6525726.25 858 63579 74.1014 56412603.20 6383659.01 Queanbeyan City Council 172 37169 216.0988 55702816.16 6085362.79


LGA Area Popn Density East North Randwick City Council 36 126034 3500.944 56337477.15 6246080.77 3051 20913 6.854474 56504462.10 6807112.80 Rockdale City Council 28 95341 3405.036 56326583.34 6242267.69 Ryde City Council 41 99550 2428.049 56322072.18 6258394.28 Shellharbour City Council 147 63124 429.415 56301532.58 6173077.36 Shoalhaven City Council 4568 93615 20.49365 56293860.94 6141111.81 Singleton Shire Council 4896 22270 4.548611 56328047.51 6395563.94 Council 6030 7293 1.209453 55664173.05 5973508.95 Strathfield Municipal Council 14 31624 2258.857 56323712.34 6250189.48 Council 335 215053 641.9493 56320710.43 6232658.02 Council of the 27 148367 5495.074 56334163.20 6251129.02 Tamworth Regional Council 9713 54522 5.613302 56302632.83 6558389.66 Council 2802 6337 2.261599 55549030.58 6188166.31 Council 7332 6805 0.928123 56404713.87 6787004.06 Shire Council 4392 3613 0.822632 55591544.06 6040261.05 Council 4566 11347 2.485107 55611372.19 6092928.90 Council 1309 80935 61.82964 56538815.48 6866575.02 Council 8071 13424 1.663239 56250573.49 6441047.34 Council 7102 7328 1.031822 55726919.69 6184282.75 Council 3230 6075 1.880805 56356507.93 6609024.65 Council 3357 1389 0.413762 55433261.52 6090174.20 Wagga Wagga City Council 4824 58055 12.03462 55537764.94 6114710.87 The Council of the Shire of Wakool 7520 4836 0.643085 55263694.54 6071647.71 Walcha Council 6267 3283 0.523855 56365571.22 6570426.76 Council 22336 8031 0.359554 55607681.74 6678300.78 Council 10760 3273 0.304182 55579346.64 6492580.42 150 139626 930.84 56340392.61 6264446.51 Council 12380 10508 0.848788 55686315.27 6541687.52 9 61611 6845.667 56338382.16 6247691.36 Council 3410 3848 1.128446 55607364.63 6248881.02 4113 8599 2.090688 55682594.43 6396278.51 Council 26269 7300 0.277894 54584782.43 6225613.24 Willoughby City Council 23 63959 2780.826 56333248.08 6258396.24 Council 2689 44670 16.61212 56262880.05 6181838.81 Council 2557 41463 16.21549 56279419.07 6214683.49 City Council 684 192402 281.2895 56307010.27 6188621.72 Woollahra Municipal Council 12 52747 4395.583 56337665.28 6250073.97 Council 745 143393 192.4738 56353093.20 6313175.65 3999 12936 3.234809 55674748.32 6142796.17 Council 2694 12035 4.467335 55619421.95 6202330.27


APPENDIX G: LINGO® CODE FOR OPTIMAL SENSOR CONFIGURATION

model:
sets:
row/1..14/:; col/1..14/:;

type/1..3/:N; grid(row,col,type):E,X;

G1(row,col,row,col):D,W;

G2(row,col):Y,cost; endsets

@for(G1(i1,j1,i2,j2):D(i1,j1,i2,j2)=@if(i1 #eq# i2 #and# j1 #eq# j2,999,@sqrt((i1-i2)^2+(j1-j2)^2)));

!max=@sum(grid:E*X); min=@sum(grid:E*(1-X));
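! the active objective above minimises @sum(grid: E*(1-X)), i.e. the E-weighted total of cell / sensor-type combinations left without a sensor (the commented-out line is the equivalent maximisation of covered E);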

@for(type(k):@sum(grid(i,j,k):x(i,j,k))<=N(k));
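! the constraint above limits each sensor type k to at most N(k) placements (N is given in the data section);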

@for(grid(i,j,k)| j #eq# 14:x(i,j,k)=0);

@for(grid(i,j,k)|i #ge# 7 #and# j #eq# 13 :x(i,j,k)=0);

@for(grid(i,j,k)|i #ge# 9 #and# j #eq# 12 :x(i,j,k)=0);

@for(grid(i,j,k)|i #ge# 10 #and# j #eq# 11 :x(i,j,k)=0);

@for(grid(i,j,k)|i #ge# 12 #and# j #eq# 10 :x(i,j,k)=0);

@for(type(k): x(14,9,k)=0);

@for(G2(i,j):@sum(type(k):x(i,j,k))<=1);

@for(G2(i,j):Y(i,j)=@sum(type(k):x(i,j,k)));

Total_cost=@sum(grid(i,j,k):x(i,j,k)*cost(i,j));

Total_cost<=20000;

@for(row(i1): @for(col(j1): @for(row(i2): @for(col(j2):


Y(i1,j1)+Y(i2,j2)>=2*W(i1,j1,i2,j2); Y(i1,j1)+Y(i2,j2)-1<=W(i1,j1,i2,j2)))));

@for(G1:999*(1-W)+D>2);
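! the constraint above, combined with the W linking constraints, forces any two cells that both host a sensor to be more than 2 grid units apart;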

!@for(grid:@sum(grid(i,j,k) | i #LE# 3 #AND# j #LE# 3: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 4 #AND# i #LE# 6 #AND# j #LE# 3: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 7 #AND# i #LE# 9 #AND# j #LE# 3: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #LE# 3 #AND# j #GE# 4 #AND# j #LE# 6: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 4 #AND# i #LE# 6 #AND# j #GE# 4 #AND# j #LE# 6: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 7 #AND# i #LE# 9 #AND# j #GE# 4 #AND# j #LE# 6: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #LE# 3 #AND# j #GE# 7 #AND# j #LE# 9: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 4 #AND# i #LE# 6 #AND# j #GE# 7 #AND# j #LE# 9: x(i,j,k))<=1);

!@for(grid:@sum(grid(i,j,k) |i #GE# 7 #AND# i #LE# 9 #AND# j #GE# 7 #AND# j #LE# 9: x(i,j,k))<=1);

@for(grid:@BIN(X)); @for(G1:@BIN(W));

data:

E= 0.09 0.136 0.166 0.128 0.188 0.224 0.14 0.212 0.231 0.142 0.217 0.228 0.142 0.214 0.22 0.14 0.208 0.209 0.137 0.202 0.202 0.133 0.2 0.207 0.132 0.204 0.224 0.136 0.213 0.242 0.141 0.22 0.251 0.145 0.221 0.25


0.145 0.205 0.242 0.113 0.156 0.195 0.112 0.191 0.232 0.152 0.259 0.304 0.161 0.286 0.307 0.16 0.287 0.296 0.157 0.276 0.277 0.148 0.263 0.256 0.136 0.257 0.253 0.131 0.266 0.282 0.137 0.286 0.328 0.152 0.306 0.359 0.161 0.316 0.37 0.161 0.31 0.361 0.154 0.28 0.337 0.122 0.21 0.266 0.116 0.213 0.275 0.155 0.282 0.348 0.16 0.3 0.338 0.154 0.288 0.312 0.139 0.264 0.277 0.123 0.249 0.251 0.122 0.26 0.266 0.146 0.302 0.339 0.177 0.355 0.429 0.195 0.392 0.474 0.197 0.401 0.486 0.185 0.383 0.47 0.164 0.331 0.421 0.124 0.24 0.321 0.129 0.224 0.297 0.164 0.285 0.358 0.157 0.287 0.326 0.136 0.256 0.283 0.109 0.221 0.238 0.093 0.212 0.217 0.115 0.253 0.262 0.176 0.341 0.386 0.24 0.435 0.524 0.269 0.494 0.583 0.272 0.509 0.595 0.255 0.481 0.578 0.212 0.408 0.512 0.146 0.286 0.376 0.136 0.22 0.297 0.164 0.266 0.336 0.14 0.249 0.283 0.104 0.205 0.23 0.074 0.174 0.189 0.071 0.184 0.189 0.111 0.255 0.268 0.191 0.376 0.432 0.274 0.499 0.6 0.317 0.576 0.664 0.329 0.601 0.677 0.32 0.577 0.671 0.277 0.495 0.61 0.189 0.347 0.441 0.118 0.192 0.276 0.125 0.219 0.289 0.091 0.191 0.223 0.064 0.155 0.177 0.056 0.144 0.156


0.071 0.182 0.187 0.12 0.276 0.295 0.202 0.41 0.474 0.286 0.535 0.641 0.33 0.61 0.698 0.343 0.638 0.709 0.342 0.628 0.722 0.312 0.558 0.694 0.218 0.401 0.51 0.09 0.155 0.241 0.086 0.169 0.234 0.061 0.144 0.171 0.051 0.126 0.143 0.06 0.142 0.149 0.094 0.205 0.209 0.161 0.312 0.332 0.247 0.439 0.5 0.306 0.54 0.637 0.317 0.595 0.678 0.319 0.62 0.69 0.335 0.63 0.725 0.334 0.586 0.746 0.246 0.439 0.572 0.076 0.128 0.204 0.073 0.138 0.189 0.055 0.124 0.141 0.052 0.124 0.136 0.07 0.158 0.164 0.115 0.231 0.24 0.177 0.33 0.36 0.236 0.43 0.493 0.272 0.501 0.589 0.279 0.539 0.622 0.284 0.568 0.642 0.31 0.598 0.698 0.335 0.586 0.768 0.266 0.459 0.62 0.066 0.11 0.17 0.066 0.125 0.16 0.055 0.125 0.135 0.061 0.14 0.153 0.085 0.181 0.193 0.125 0.245 0.265 0.169 0.319 0.365 0.2 0.385 0.456 0.216 0.433 0.517 0.231 0.47 0.557 0.248 0.513 0.598 0.278 0.563 0.673 0.313 0.575 0.777 0.265 0.467 0.654 0.056 0.096 0.14 0.061 0.12 0.142 0.065 0.137 0.142 0.084 0.165 0.178 0.107 0.204 0.221 0.128 0.25 0.278 0.149 0.297 0.352 0.163 0.339 0.41 0.177 0.379 0.454 0.201 0.428 0.512 0.229 0.491 0.582 0.273 0.562 0.679


0.322 0.585 0.793 0.277 0.48 0.677 0.044 0.082 0.114 0.061 0.114 0.129 0.08 0.145 0.146 0.102 0.18 0.194 0.114 0.215 0.237 0.125 0.249 0.279 0.142 0.283 0.333 0.157 0.32 0.376 0.175 0.365 0.418 0.208 0.431 0.496 0.257 0.516 0.597 0.315 0.6 0.711 0.354 0.619 0.816 0.29 0.499 0.688 0.035 0.069 0.092 0.05 0.102 0.114 0.068 0.134 0.137 0.093 0.172 0.184 0.112 0.21 0.229 0.125 0.244 0.267 0.14 0.278 0.31 0.159 0.318 0.347 0.185 0.373 0.397 0.23 0.455 0.49 0.291 0.555 0.612 0.349 0.64 0.731 0.369 0.645 0.812 0.289 0.509 0.671 0.033 0.059 0.073 0.045 0.085 0.094 0.054 0.11 0.112 0.073 0.142 0.149 0.1 0.181 0.193 0.125 0.22 0.232 0.145 0.257 0.267 0.164 0.299 0.301 0.192 0.361 0.355 0.243 0.449 0.451 0.306 0.55 0.572 0.358 0.626 0.679 0.372 0.621 0.735 0.291 0.484 0.598 0.029 0.042 0.051 0.039 0.059 0.068 0.042 0.073 0.077 0.05 0.093 0.1 0.07 0.121 0.135 0.093 0.153 0.169 0.11 0.183 0.195 0.13 0.218 0.222 0.168 0.27 0.268 0.222 0.341 0.348 0.28 0.418 0.443 0.322 0.473 0.521 0.332 0.466 0.556 0.26 0.362 0.45 ; N=1,2,2; cost= 4280.094


4280.094 4280.094 4280.094 4280.094 4345.848 5101.356 5252.13 5252.13 5710.304 6228.32 6079.8 6631.55 6243.155 4280.094 4280.094 4280.094 4280.094 4300.452 5076.084 5267.808 5267.808 5267.808 5332.088 5625.14 5601.17 5316.081 5118.956 4280.094 4280.094 4280.094 4276.428 4945.356 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5092.954 4280.094 4285.632 4578.132 5091.372 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5168.849 5070.597 4742.322 5204.862 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808


5267.808 5267.808 5267.808 5267.808 5248.097 5335.754 5253.066 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5193.122 5679.003 5253.066 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5243.472 5822.574 5408.995 5086.388 5553.474 5253.066 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5735.617 6229.293 6403.78 5950.831 6664.698 5253.066 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5899.752 6753.56 6823.062 6462.877 8108.99 5253.066


5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5219.353 5902.19 6986.068 7652.653 7672.996 8323.233 4827.732 5173.74 5267.808 5267.808 5267.808 5267.808 5267.808 5267.808 5357.372 6275.522 7181.484 7945.927 6766.541 8616.6 4280.094 4267.068 4560.426 5070.702 5267.808 5267.808 5267.808 5237.856 5804.34 6404.603 7280.268 7856.119 9231.327 9019.89 4280.094 4280.094 4280.094 4272.918 4619.082 5153.85 5183.193 5223.22 5684.175 6045.716 6912.198 7961.793 9211.644 9576.162 4280.094 4280.094 4280.094 4280.094 4280.094 4292.184 4697.551 5176.938


5524.386 6256.168 7381.806 8446.131 8983.71 10161.406 ; enddata end
