Anthropogenic Environmental Change and Emerging Infectious Disease: A Proximate-Determinants Approach

James Holland Jones and William H. Durham Department of Anthropological Sciences Stanford University

March, 2006

1. Introduction In this paper, we suggest that the theoretical apparatus of mathematical provides a fundamental set of tools for understanding the relationship between social and environmental change and disease emergence. These relationships are manifold and extraordinarily complex (Institute of Medicine 2003). However, by focusing attention on the proximate mechanisms of emergence, the theory of mathematical epidemiology can be used to organize and simplify the many potential pathways to emergence and let us understand how population phenomena such as growth, compositional change, consumer preferences, population movements, etc. can affect disease emergence.

One of the fundamental quantities of interest in mathematical epidemiology and theoretical ecology is the basic reproduction number, R . The basic reproduction number is defined as 0 the expected number of secondary cases produced by a single (typical) disease in a completely susceptible population in the absence of infection control measures. In addition to its intuitive meaning as a reproduction number (directly analogous to the R 0 familiar to demographers (Heesterbeek 2002)), R acts as a threshold parameter for the 0 system. A critical value exists at R = 1. Above this threshold, an epidemic can 0 occur. Below it, there will only be minor outbreaks.

A simple Susceptible-Infected-Removed (SIR) epidemic model serves to introduce R and 0 motivate its relationship to the dynamic systems represented by epidemic models.

For simplicity, assume:

1. Constant (closed) population size, N , of susceptibles 2. Constant rates (e.g., , removal rates) 3. No demographic changes (i.e., births and deaths) 4. Well-mixed population

A well-mixed population is one where any infected individual has a probability of contacting any susceptible individual that is reasonably well approximated by the average. This is often the most problematic assumption, but is easily relaxed in more complex models.

In the closed population of N individuals, let us say that S are susceptible, I infected, and R are removed. We can then write s = S / N , i = I / N , r = R / N to denote the fraction in each compartment.

The SIR model (Anderson and May 1991; Diekmann and Heesterbeek 2000) is then: Jones & Durham: Proximate Determinants of Emergence

ds = −βsi (1) dt di = βsi − νi (2) dt dr = νi (3) dt where β = τc is the effective contact rate, τ is the transmissibility of the infection, c is the average contact rate between susceptible and infected hosts, and ν is the removal rate. By assumption all rates are constant. This means that the expected duration of infection is −1 simply the inverse of the removal rate: d = ν .

An epidemic occurs if the number of infected individuals increases, i.e., di / dt > 0 . Solving equation (2) for this inequality, we get:

βsi − νi > 0

βsi > i ν

At the outset of an epidemic, nearly everyone is susceptible. So we can say that s ≈ 1. Substituting s = 1, we arrive at the following inequality

β = τ c d = R0 > 1 ν

The assumptions of the basic SIR model can easily be relaxed and thus, for a wide range of circumstances, the logic of the basic reproduction number will apply (Diekmann and Heesterbeek 2000).

2. Adding Demography Extending the simple model above to cases where population size is not constant (i.e., where there are additions and removals to the population through birth, death, and migration) is straightforward. In the case of the simple SIR model with demography, the basic reproduction number remains the ratio of additions to removals multiplied by the fraction susceptible.

3. Some Abstraction

It is easy to imagine that some or all of the components of R could be functions of 0 population size or density: R = τ(N ) c(N ) d(N ) , 0 where N is some measure of population size.

In general, we expect that contact rates are going to be most directly related to population size and/or density. In addition, though, population density could contribute to higher

2 Jones & Durham: Proximate Determinants of Emergence transmissibilities through selection on the pathogen for increased infectiousness or through reducing defenses as through social stress, co-morbidity, etc. In the model employed here, any postulated effect of population growth must do at least one of three things: (1) increase transmissibility of the pathogen (2) increase the rate of contact between susceptible and infected individuals, or (3) increase the duration of infectiousness.

It is in this way that the basic reproduction number serves as a conduit for understanding how myriad environmental, social, and economic changes facilitate (or impede) disease emergence. Thinking of disease emergence through the lens of R0 focuses attention on the causal pathways in a manner directly analogous to the way that focus on the proximate determinants of fertility helped us understand demographic transitions (Davis and Blake 1956; Bongaarts 1978).

4. “ R Thinking” and Environmental Determinants of Emergence 0 Further consideration of the components of the basic reproduction number, R = τ c d , 0 shows why R can be instrumental to understanding the ways in which human-induced 0 environmental change influence disease emergence. Briefly put, this inequality suggests that an environmental influence on the three components of R , singly or in combination, 0 can trigger an epidemic outbreak of infectious disease. The challenge becomes to identify the ways in which a given environmental change (or set of such changes) impinges on the component terms of R . By analogy to Ernst Mayr’s “population thinking” (Mayr 2001) in 0 the understanding of evolutionary processes, we call this approach “ R thinking.” 0 Consider, first, environmental influences on transmissibility (τ ), the probability of infection given contact between a susceptible and an infected individual. Transmissibility is directly related to a host of factors of pathenogenicity including inoculation rates and thresholds, virulence or toxicity, and so on. Of these, several are sensitive to environmental conditions: virulence, for example, varies widely with environmental circumstances, owing to the potential for environmental selection for or against virulence depending on rates and processes of between-host transmission. A pathogen whose virulence kills the host before its own reproduction and transmission is itself selected against (Lenski and May 1994; van Baalen and Sabelis 1995; Frank 1996; Lipsitch et al. 1996; Longini et al. 2002; Galvani 2003).

A possible example relating τ to environmental change may be provided by the seventh of cholera that began in 1961: with the environmental change of slowly improving sanitation systems and potable water supplies, host survival has recently become more important to pathogen transmission, apparently selecting for tempered bacterial virulence. The seventh cholera pandemic has been characterized by a much lower case fatality rate than in previous (Longini et al. 2002).

Similarly, consider possible environmental influences on c , the average rate of contact between susceptible and infected individuals. As noted above, c will obviously increase with the size and density of the host population, as may occur through such diverse processes as economic development and colonization projects, for example, and the spontaneous migration of “environmental refugees” away from degraded environments.

In the particular case of -borne diseases, contact rate varies with the “vectorial capacity” of the vector that shuttles the infection between infecteds and susceptibles. Vectorial capacity, in turn, is related to vector survival rates, the efficiency of vector infection and transmission, and so on. Those survival rates are subject to abiotic environmental

3 Jones & Durham: Proximate Determinants of Emergence influences, such as ambient temperature, humidity, and the like (Singer and De Castro 2001). And they are also subject to influences of the biotic environment, such as predator abundance and diversity (as may influence the abundance of -vectoring mosquitoes). In like manner, contact rate can vary greatly with the number and diversity of alternative hosts, their disease competencies, and their overall “dilution” effect. A classic example of the latter comes from the study of Lyme disease, where the fate of an epidemic rides on the diluting biodiversity of alternative hosts: more diversity means lower R (LoGiudice et al. 0 2003).

Finally, consider average duration of the . As a general rule, one may anticipate that duration of infectiousness will increase with any compromising of the host’s immune response as may be produced through environmental contamination, various forms of heat or water stress, and so on (Cassel 1976; Farmer 1996). Similarly, co-morbidity arising from an environmental change in relation to a secondary infection can have a big impact (again through compromised immune function) on the infectious period of the primary disease (Godfrey-Faussett and Ayles 2003). A candidate example of an environmentally- induced change in duration of infectiousness comes from recent work on malaria among gold miners of the Amazon (Silbergeld et al. 2000; Crompton et al. 2002), whose immune systems showed mercury-related impairment.

In ways like these, environmental change may affect the terms of the R equation singly or in 0 combination. It is thus our conviction that R thinking opens the door to analyzing and 0 disentangling the various environmental influences that may impinge on infectious disease transmission in this anthropogenically changing world.

In the remainder of this paper, we look at related uses of R in understanding proximate 0 determinants of emergence in multi-host communities and in age-structured populations.

5. The Basic Reproduction Number and the Next Generation Matrix Many of today’s most important emerging infectious diseases are multi-host by their very nature. As a result, they require a slightly more complex formalism for investigating epidemic thresholds, etc. The basic tool for examining epidemic thresholds in complex, structured models is the so-called next generation matrix (Diekmann et al. 1990; Diekmann et al. 1991; Diekmann and Heesterbeek 2000).

Consider a population of individuals (or species) subdivided into n compartments, of which m are infected. Let x represent the proportion of the population in the i th compartment and i let the vector of the proportions in all the compartments be x . Let F (x) denote the rate of i appearance of new infections in compartment i and V (x) be the rate of movement into and i out of state i . F includes new infections only, not transfers of individuals from one infected i compartment to another.

We can now define the matrices, ⎡∂F (x ) ⎤ F = ⎢ i 0 ⎥, ∂x ⎣⎢ j ⎦⎥ and

4 Jones & Durham: Proximate Determinants of Emergence

⎡∂V (x ) ⎤ V = ⎢ i 0 ⎥, ∂x ⎣⎢ j ⎦⎥ where x denotes the disease-free equilibrium and the indices i, j = 1,…,m . 0

−1 The entries of the matrix G = FV give the rate at which infected individuals of state j generate new infections of type i . G is called the next generation matrix (Diekmann et al. 1990). R is the dominant eigenvalue of G . 0

6. Properties of Next Generation Matrices

G is a non-negative matrix. All next generation matrices will also be irreducible. A graph is irreducible if and only if all compartments or nodes can communicate with each other. The case of non-communicating states can occur, for example, in zoonotic diseases in which human infection represents an epidemiologic dead-end. While medical and public health consequences of human infection in these cases may be important, epidemic control per se must focus on the competent compartments for transmission. So even in these cases next generation matrices will be irreducible. In many cases G will also be primitive, meaning that it will become positive when raised to a sufficiently high power.

For a stage-classified matrix (containing age groups, for example), a sufficient condition for primitivity is that the life cycle digraph contain at least one self-loop. However, a substantial fraction of next generation matrices will, in fact, be imprimitive. Imprimitivity will be an issue, for example, in vector-born diseases with obligate reproduction in a single host.

The individual elements of the matrix G , g , can themselves be interpreted as reproduction ij numbers. The element g is the expected number of secondary cases of type i caused by ij contact with an infectious individual of class j . For i ≠ j , we call this the “between-class reproduction number” and for the diagonal elements of the next generation matrix, i.e., where i = j , we call g the “within-class reproduction number.” ij

7. Perturbation of the Next Generation Matrix

Let R be the dominant eigenvalue, p be the corresponding right eigenvector, and q the corresponding left eigenvectors of G . p represents the asymptotic distribution of disease states. q represents the relative contribution to the asymptotic population of each of the disease states.

Caswell (Caswell 1978; Caswell 2000) provides a detailed derivation of the perturbation. The basic formula that emerges from this derivation for the sensitivity of the dominant eigenvalue of the next generation matrix to a small change in element g is ij ∂R = q p . (4) ∂g i j ij

In other words, the sensitivity of fitness to a small change in projection matrix element g is ij simply the i th element of the left eigenvector weighted by the proportion of the stable population in the j th class.

5 Jones & Durham: Proximate Determinants of Emergence

It is often instructive to use proportional sensitivities or elasticities:

gij ∂R ∂log R e = = (5) ij R ∂g ∂log g ij ij

Elasticities have two properties that make them very attractive. First, the sum of all elasticities equals 1, e = 1, so there is a sense in which an elasticity represents a given ∑ i, j ij fraction of the total transmission. Second, the sum of the elasticities of all incoming arcs of the transmission graph must equal the sum of the elasticities of outgoing arcs (Van Groenendael et al. 1994).

8. Example: Multi-Host Influenza Consider a multi-host epidemic model for influenza in which we consider two compartments: (1) domestic birds and (2) wild birds. Both of these compartments are clearly heavily collapsed, the former being comprised of domestic chickens, ducks, etc. and the latter including a wide array of species (Jones 2006). Infection can arise through either within- or between compartment transmission, leading to a 2x2 next generation matrix. Let w denote the expected number of secondary infections in wild birds caused by contact with an infected wild bird in a completely susceptible population. Let d be the analogous quantity for the domestic compartment and c and c be the analogous quantities for cross-compartment 1 2 infection. This yields:

⎡w c1 ⎤ G = ⎢ ⎥ ⎣c2 d ⎦

Unfortunately, these quantities remain unknown for most recent influenzas, including avian flu. Nonetheless, we can use the formalism to make reasonable statements about the behavior of the system. First, it is probably reasonable to posit that within-compartment transmission exceeds between-compartment transmission. Second, it is likely that the within-domestic reproduction number exceeds the within-wild component, at least for avian flu. If we assume that these two quantities are proportional to each other d = kw for some k > 1and that the between-compartment reproduction numbers can be assumed to be approximately zero, we get:

⎡d / k 0⎤ G = ⎢ ⎥ ⎣ 0 d⎦

Since G is diagonal, R0 = d . It is nonetheless informative to write out the reproduction number completely:

⎛ 2 2 ⎞ 1 ⎜ d 2d d ⎟ ⎜ 2 ⎟ R0 = ⎜ d + + d − + 2 ⎟ . 2 ⎜ k k k ⎟ ⎝ ⎠

Clearly, for k > 1, R is dominated by d . 0

6 Jones & Durham: Proximate Determinants of Emergence

We can relax the rather stringent assumption of c1,c2 ≈ 0 , and calculate elasticities of R0 for different values of the between-compartment reproduction numbers relative to the within compartment numbers. Figure 1 plots these elasticities with respect to k , showing that for nearly all parameter combinations, control of within-domestic compartment new infections has the greatest impact on reducing R0 .

The implication is clear: control measures in this multi-host system should focus on domestic fowl. The feature of the system that drives domestic fowl to be so important is largely human demand for animal protein in a growing popultion, and perhaps more importantly, an increasingly affluent population.

We expect this result to be a general feature of emerging infections of zoonotic origin. The contact rates of domesticated animals are so much greater than all but the most gregarious non-domesticates that it is almost certain to be so. Similarly, human-animal contact rates will typically be greatest with domesticates. As demand for meat increases and market forces guide meat production toward increased efficiency, contact between humans and domesticated animals, and the contact rate between the domesticates themselves will increase. Today there are as many as 5,000 chickens per km2 throughout Asia (Slingenbergh et al. 2004), a spectacular increase in just a few decades.

9. Example: Age-Structured Transmission of Influenza

Using data from the Tecumseh, Michigan influenza study (Longini et al. 1983), we can construct an age-structured next-generation matrix for influenza transmission dynamics (Hill and Longini 2003).

⎡0.60 0.10 0.10 0.10 0.10⎤ ⎢ ⎥ ⎢0.20 1.70 0.30 0.20 0.20⎥ G = ⎢0.40 0.30 0.50 0.40 0.30⎥ ⎢ ⎥ ⎢0.20 0.10 0.30 0.20 0.10⎥ ⎢0 10 0 10 0 10 0 10 0 10⎥ ⎣ . . . . . ⎦

Note here that the compartments correspond to age-groups of the flu-affected human population: (1) pre-school age, (2) school age, (3) young adult, (4) mid-adult, (5) old adult. The within-stage reproduction number with the highest value is stage 2, which corresponds to school-age children. While it is conceivable that school-age children could shed more virus or remain infectious for longer periods, the most parsimonious interpretation is that school children have elevated levels of contact with respect to the other age-classes, leading to a within-class reproduction number nearly three times as large as the next largest compartment.

Calculating the elasticities of R0 yields:

⎡0.01 0.01 0.00 0.00 0.00⎤ ⎢ ⎥ ⎢0.01 0.77 0.04 0.01 0.01⎥ E = ⎢0.01 0.04 0.02 0.01 0.00⎥ ⎢ ⎥ ⎢0.00 0.01 0.01 0.00 0.00⎥ ⎢0 00 0 01 0 00 0 00 0 00⎥ ⎣ . . . . . ⎦

7 Jones & Durham: Proximate Determinants of Emergence

Clearly, reducing transmission within the school-age class will have the greatest impact on bringing R below threshold. A full 77% of the total elasticity of R lies in the within-school 0 0 age reproduction number. This analysis suggests that targeting campaigns at school-age children may have large public health benefits, a suggestion that has recent empirical support (Ghendon et al. 2006). Of course, such a policy may itself have subsequent population consequences and should be weighed carefully against potential costs (Carrat et al. 2006).

This has very strong implications for age-structured transmission systems in regions with very young age structure due to recent rapid population growth. We further expect a greater impact of school-age children in populations where there is heightened age-grade mixing out of school.

10. Conclusion

R , the basic reproduction number, is a quantity of great practical and theoretical value in 0 the study of infectious diseases in human populations. It tells at a glance, for instance, whether an infectious disease is epidemic ( R >1) in a population or dying out ( R <1). 0 0 Breaking R into its components (typically more practical with simple models), moreover, 0 allows one to consider ways in which anthropogenic environmental change can induce changes in transmissibility (τ ), average contract rate ( c ), or in the duration of infection ( d ).

The causes of disease emergence and re-emergence are many and complex (Institute of Medicine 2003). However, the formalism of mathematical epidemiology and basic reproduction number in particular provide a fulcrum of analysis of the proximate determinants of disease emergence. With regard to this, we suggest that models of intermediate complexity (as in those models discussed here) are likely to provide the most advantage for insight into the process of emergence. While we have focused here primarily on influenza, we believe that these models are likely to be particularly productive in cases of emergence facilitated by anthropogenic environmental change (Jones & Durham, forthcoming).

R can also be used in the context of the next generation matrix G to provide an assay into 0 the elasticities of intervention into an epidemic, guiding control policy. In the case of an epidemic within a multi-host community, such as highly pathogenic avian influenza for example, this technique allows one to detect those interactions within the community where disease-control measures would be most effective. In the case of an age-structured population, the same technique can be used to detect the greatest payoff to control measures among age groups. In this paper, we explore some first implications of “ R - 0 thinking” for the study of environmental change and emerging infectious disease. Our goal has been to explore what those R implications are and, of course, “are not.” 0

Literature Cited

Anderson, R. M., and R. M. May. 1991. Infectious diseases of humans: Dynamics and control. Oxford: Oxford University Press. Bongaarts, J. 1978. A framework for analyzing the proximate determinants of fertility. Population and Development Review 4:105-132.

8 Jones & Durham: Proximate Determinants of Emergence

Carrat, F., A. Lavenu, S. Cauchemez, and S. Deleger. 2006. Repeated influenza vaccination of healthy children and adults: borrow now, pay later? Epidemiology and Infection 134 (1):63-70. Cassel, J. 1976. The Contribution of the Social-Environment to Host-Resistance - the 4th Wade-Hampton-Frost-Lecture. American Journal of Epidemiology 104 (2):107-123. Caswell, H. 1978. A general formula for the sensitivity of population growth rate to changes in life history parameters. Theoretical Population Biology 14:215-230. Caswell, H. 2000. Matrix population models: Formulation and analysis. 2nd ed. Sunderland, MA: Sinauer. Crompton, P., A. M. Ventura, J. M. de Souza, E. Santos, G. T. Strickland, and E. Silbergeld. 2002. Assessment of mercury exposure and malaria in a Brazilian Amazon riverine community. Environmental Research 90 (2):69-75. Davis, K., and J. Blake. 1956. Social structure and fertility: An analytic framework. Econ. Dev. Cult. Change. 4:211-235. Diekmann, O., K. Dietz, and J. A. P. Heesterbeek. 1991. The Basic Reproduction Ratio for Sexually-Transmitted Diseases .1. Theoretical Considerations. Mathematical Biosciences 107 (2):325-339. Diekmann, O., and H. Heesterbeek. 2000. Mathematical Epidemiology of Infectious Diseases : Model Building, Analysis and Interpretation. New York: Wiley. Diekmann, O., J. A. P. Heesterbeek, and J. A. J. Metz. 1990. On the Definition and the Computation of the Basic Reproduction Ratio R0 in Models for Infectious-Diseases in Heterogeneous Populations. Journal of Mathematical Biology 28 (4):365-382. Farmer, P. 1996. Social inequalities and emerging infectious diseases. Emerging Infectious Diseases 2 (4):259-269. Frank, S. A. 1996. Models of parasite virulence. Quarterly Review of Biology 71 (1):37-78. Galvani, A. P. 2003. Epidemiology meets evolutionary ecology. Trends in Ecology & Evolution 18 (3):132-139. Ghendon, Y. Z., A. N. Kaira, and G. A. Elshina. 2006. The effect of mass influenza in children on the morbidity of the unvaccinated elderly. Epidemiology and Infection 134 (1):71-78. Godfrey-Faussett, P., and H. Ayles. 2003. Can we control tuberculosis in high HIV settings? Tuberculosis 83 (1-3):68-76. Heesterbeek, J. A. P. 2002. A brief history of R-0 and a recipe for its calculation. Acta Biotheoretica 50 (3):189-204. Hill, A. N., and I. M. Longini. 2003. The critical vaccination fraction for heterogeneous epidemic models. Mathematical Biosciences 181 (1):85-106. Institute of Medicine. 2003. Microbial threats to health: Emergence, detection, and response. Washington, D.C.: National Academies Press. Jones, J. H. 2006. WIld bird culls are infefficient for emerging influenza control. submitted. Lenski, R. E., and R. M. May. 1994. The Evolution of Virulence in Parasites and Pathogens: Reconciliation between 2 Competing Hypotheses. Journal of Theoretical Biology 169 (3):253-265. Lipsitch, M., S. Siller, and M. A. Nowak. 1996. The evolution of virulence in pathogens with vertical and horizontal transmission. Evolution 50 (5):1729-1741. LoGiudice, K., R. S. Ostfeld, K. A. Schmidt, and F. Keesing. 2003. The ecology of infectious disease: Effects of host diversity and community composition on Lyme disease risk. Proceedings of the National Academy of Sciences of the United States of America 100 (2):567-571. Longini, I. M., Jr., M. Yunus, K. Zaman, A. K. Siddique, R. B. Sack, and A. Nizam. 2002. Epidemic and cholera trends over a 33-year period in Bangladesh. J Infect Dis 186 (2):246-51. Mayr, E. 2001. What is evolution? New York: Basic Books. Silbergeld, E. K., J. B. Sacci, and A. F. Azad. 2000. Mercury exposure and murine response to Plasmodium yoelii infection and immunization. Immunopharmacology and Immunotoxicology 22 (4):685-695.

9 Jones & Durham: Proximate Determinants of Emergence

Singer, B. H., and M. C. De Castro. 2001. Agricultural colonization and malaria on the Amazon Frontier. Annals of the New York Academy of Sciences 954:184-222. Slingenbergh, J., M. Gilbert, K. de Balogh, and W. Wint. 2004. Ecological sources of zoonotic diseases. Revue Scientifique Et Technique De L Office International Des Epizooties 23 (2):467-484. van Baalen, M., and M. W. Sabelis. 1995. The dynamics of multiple infection and the evolution of virulence. American Naturalist 146 (6):881-910. Van Groenendael, J., H. De Kroon, S. Kalisz, and S. Tuljapurkar. 1994. Loop analysis: Evaluating life history pathways in population projection matrices. Ecology 75 (8):2410-2415.

10 Jones & Durham: Proximate Determinants of Emergence

Figure 1: Elasticity of R with respect to d , ∂log R / ∂log d . As k gets large (i.e., as the 0 0 expected number of infections in domestic fowl which are caused by domestic fowl greatly exceeds the analogous number in wild birds), the fraction of the total elasticity in R with 0 respect to d approaches unity. Interventions aimed at reducing infections in domestic fowl will have the greatest impact on reducing R . Relatively more between-compartment 0 transmission slows this approach, but the qualitative behavior remains the same.

11