The effects of host contact network structure on pathogen diversity and strain structure

Caroline O’F. Buckee*†‡, Katia Koelle†§, Matthew J. Mustard†¶‖, and *

*Department of Zoology, , Oxford OX1 3PS, ; ¶Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom; ‖Department of Plant and Soil Science, St. Machar Drive, University of Aberdeen, Aberdeen AB24 3UU, United Kingdom; and §Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48104

Edited by Kenneth W. Wachter, University of California, Berkeley, CA, and approved May 28, 2004 (received for review March 22, 2004)

PNAS | July 20, 2004 | vol. 101 | no. 29 | 10839–10844 | www.pnas.org/cgi/doi/10.1073/pnas.0402000101

For many important pathogens, mechanisms promoting antigenic variation, such as mutation and recombination, facilitate immune evasion and promote strain diversity. However, mathematical models have shown that host immune responses to polymorphic antigens can structure pathogen populations into discrete strains with nonoverlapping antigenic repertoires, despite recombination. Until now, models of strain evolution incorporating host immunity have assumed a randomly mixed host population. Here, we illustrate the effects of different host contact networks on strain diversity and dynamics by using a stochastic, spatially heterogeneous analogue of this model. For randomly mixed populations, our model confirms that cross-immunity to strains sharing alleles at antigenic loci may structure the pathogen population into discrete, nonoverlapping strains. However, this structure breaks down once the assumption of random mixing is relaxed, and an increasingly diverse pathogen population emerges as contacts between hosts become more localized. These results imply that host contact network structure plays a significant role in mediating the emergence of pathogen strain structure and dynamics.

This paper was submitted directly (Track II) to the PNAS office.
Abbreviations: IBM, individual-based model; LHS, Latin hypercube sampling; ODE, ordinary differential equation.
†C.O’F.B., M.J.M., and K.K. contributed equally to this work.
‡To whom correspondence should be addressed. E-mail: [email protected].
© 2004 by The National Academy of Sciences of the USA

Many important pathogens, such as Neisseria meningitidis and Plasmodium falciparum, display structured strain diversity: highly diverse genotypes are organized into distinct, persisting strains, which can be detected as linkage disequilibrium between particular genes (for example, see refs. 1 and 2). Strains can often show cyclical temporal dynamics, with successive types dominating in prevalence within the host population. Understanding the maintenance of diversity within pathogen populations, and the dynamics of multiple strains, has been a focus for many theoretical studies. Previous studies have shown that interference between strains, either through the prevention of superinfection (3) or from cross-immunity gained by exposure to "similar" strains (4, 5), can allow for the stable coexistence of different strains, as well as sustained oscillations, under certain conditions. The latter studies emphasized the importance of cross-immunity as a mechanism for structuring pathogen populations, but assumed that the "similarity" between strains was based on a single genetic locus.

For pathogens that undergo antigenic variation, such as , trypanosomes, and , multiple genetic loci are often important in generating host immune responses. Gupta et al. (6, 7) explicitly accounted for multiple, polymorphic immunogenic loci, by using the overlap between allelic profiles of different strains to determine the extent of host cross-immunity. They showed that the level of cross-immunity determines the structure of the pathogen population. For very low levels, no strain structure is observed. As cross-immunity increases, unstable structure can emerge, displaying cyclic or chaotic patterns of strain dominance. At sufficiently high levels of cross-immunity, selection by the immune system will result in the dominance of a set of strains with nonoverlapping antigenic repertoires (which will not be competing for susceptible hosts). This structure will persist despite recombination events that generate different variants, because immune selection against strains that share alleles at antigenic loci will suppress their prevalence. Gomes et al. (8) defined antigenic distance between strains in continuous strain space, showing analogous dynamical results for varying levels of cross-immunity: stable homogeneous and heterogeneous pathogen populations at low and high levels of cross-immunity, respectively, and traveling wave patterns through strain space at intermediate levels. A different approach has been taken to keep track of multiple strains (9), where the immune status of the hosts at any point in time is taken into account rather than the history of infection for each individual. Although sustained oscillations were not observed, the structuring of the pathogen population was still dependent on mechanisms of host immunity.

Despite differences in the formulation of these deterministic models, they produce similar outcomes in terms of the polarization of strains in strain space for higher levels of cross-immunity. However, they all assume that host populations are well mixed, and do not take stochastic effects and spatial heterogeneities into account. Studies have shown that network structure can significantly affect the processes occurring on social networks, including the dynamics and evolution of infectious diseases (10–13). For example, some have investigated the effect of network structure on the evolution of disease traits such as infectious period and transmission rates (10), as well as invasion thresholds for epidemics (11). Others have explored the role of spatial contact structure in the evolution of virulence (12). To date, however, there have been no studies explicitly investigating the effects of host contact networks on the interaction of multiple strains incorporating host cross-immunity. Many important multistrain pathogens exist in diverse geographical environments and in different types of host populations. It therefore follows that, for directly transmitted diseases, the social network structure of the host population may impact the pathogen population by affecting the extent of strain mixing, and therefore the level of competition and recombination between different strains. In communities where local contacts are the primary means of transmission, the population genetics of the pathogen may be very different from that in large cities, where individuals mix with large numbers of random contacts.

Here we use a stochastic individual-based model (IBM), based on the framework of Gupta et al. (6) described above, to investigate the effects of social network structure on the evolution of pathogen diversity and strain structure. We first restrict our analyses to regular and random host contact networks, as caricatures of two extreme social network scenarios, and compare these networks to each other as well as to stochastic mean-field approximations of the IBM to analyze the effect of structured host contact networks on the dynamics of the strains. We then further analyze several small-world host contact networks and argue that the extent of host clustering is the primary network characteristic affecting pathogen strain structure and diversity. The results highlight the importance of considering social network structure in the analysis of pathogen population structuring and dynamics.

The Model

Hosts. The individual-based model simulates each potential host as a separate entity including its contacts, the strains it is infected with, and its immune response (memory of infection). Each individual has a position in a ring lattice. A host contact network is created at the beginning of a simulation, with every individual in contact with a fixed number of other individuals. This contact network remains constant throughout the simulation for all host contact networks modeled, except the mean-field approximation host network (described below). The structure of the contact network, ranging from regular through small-world to random, is determined by the ρ parameter, as in Watts and Strogatz (14). ρ is the probability that an individual will come into contact with a randomly chosen individual rather than a local neighbor in the ring lattice. Hence, a ρ of 0 means that an individual will only interact with its immediate neighbors, whereas a ρ of 1 means that the host contact network is a random network, wherein every fixed interaction is with a randomly chosen individual. To approximate the mean-field ordinary differential equation (ODE) model, the stochastic IBM uses a host contact network that is random (i.e., ρ = 1) and changes at every time step (henceforth referred to as the "mean-field approximation" host contact network). Contact between hosts occurs once in each time step, and changes in host infection and immunity status are updated synchronously at the end of each time step.

One important difference between the stochastic model and the mean-field ODE is the possibility of the stochastic loss of an allele in the IBM. In the deterministic ODE simulations, mutation was unnecessary because alleles could not be lost. However, there was a need for mutation to reintroduce alleles in the stochastic IBM. This was especially evident in small populations where demographic stochasticity frequently resulted in allelic extinction (data not shown).

Pathogens. Pathogens were represented as bit-strings, with each bit being one immunodominant locus coding for an antigen on the surface of the pathogen. We limited each locus to two alleles, designated as a "1" or a "0". For n loci, there are therefore 2^n different configurations ("genotypes") that a pathogen can have. A strain is defined as a pathogen subpopulation with one of these distinct configurations. To measure the genetic variability within a heterogeneous pathogen population, we introduce two metrics: diversity (D) and discordance (H).

Diversity measures the evenness with which a pathogen population is partitioned into all of its possible different strains. We calculate diversity by dividing the entropy of the pathogen population (also known as the Shannon–Weaver diversity index, ref. 15) by the maximum possible entropy of the population:

\[ D = \frac{\sum_{i=1}^{N_s} p_i \log(1/p_i)}{\log(N_s)}, \tag{1} \]

where p_i is the frequency of strain i in the population, and N_s, the number of strains, is 2^n. Therefore, for a pathogen population, D ranges between 0 and 1, with D = 1 indicating that all of the possible strain types in the population are equally represented.

In addition to diversity, a metric that describes the average allelic difference between any two pathogens picked at random from a heterogeneous pathogen population is necessary to measure antigenic discordance between strains. We use a taxonomic distinctness measure, previously used in calculating the average phylogenetic distance between species within a community (16). Here, instead of using weights to quantify phylogenetic distances between species, we use weights to quantify allelic differences between strains. The weights can therefore simply be the Hamming distances between strains, where the Hamming distance between two strains is the number of bits by which they differ. Because the maximum Hamming distance possible in a pathogen population is known, we adjust the taxonomic distinctness measure by dividing by the maximum Hamming distance (the number of loci) to get a discordance (H) measure between 0 and 1:

\[ H = \frac{1}{n} \left( \frac{\sum\sum_{i<j} w_{ij}\, p_i\, p_j}{\sum\sum_{i<j} p_i\, p_j} \right), \tag{2} \]

where w_ij is the number of loci with different alleles for strains i and j, p_i and p_j are the frequencies of strains i and j in the pathogen population, respectively, and n is the number of loci. Fig. 1 illustrates the differences between diversity and discordance.

Fig. 1. Strain histograms illustrating diversity and discordance metrics. In this example, with a three-locus pathogen, eight possible strains can exist in a population at any point in time (each consisting of a unique combination of alleles at three immunodominant loci). Populations in A and B both have the same discordance value (H = 0.5), but the population in A has a more diverse distribution of strains present (D = 1.0) than the population in B (D = 0.33). Populations in C and D, although having identical diversity levels (D = 0.79), differ in the extent of the allelic similarities of the strains present, with the population in D having a higher discordance level (H = 0.68) than the population in C (H = 0.51).

Dynamics. Pathogens are assumed to only exist within the modeled hosts. A host infected with a pathogen contacting an individual with no immunity to that pathogen will infect that individual with probability β. Although a host may be infected by several strains at once, it may only infect one individual with a single strain in any one time step. Upon infection, individuals remain infected by that pathogen for a period, such that the average duration of infection with a pathogen is 1/μ, where μ is the probability that the host rids itself of the pathogen in a time step (Table 1). After infection, individuals remain immune to that pathogen for a period, such that the average duration of immunity to that pathogen is 1/σ, where σ is the probability of the host losing its immunity to a pathogen in a time step (Table 1).
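As a brief worked step (an illustration added here, under the assumption that clearance is an independent trial with the same probability μ in every time step, starting from the first step after infection), the infection duration T, measured in time steps, is geometrically distributed:

```latex
\[
P(T = k) = (1-\mu)^{k-1}\,\mu, \qquad
P(T > k) = (1-\mu)^{k}, \qquad
E[T] = \sum_{k=1}^{\infty} k\,(1-\mu)^{k-1}\,\mu = \frac{1}{\mu}.
\]
```

The same argument applies to the duration of immunity with σ in place of μ. For instance, an average infection duration of 1/μ = 7 time steps corresponds to a per-step clearance probability of μ = 1/7, or about 0.14.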

The durations of infection and of immunity therefore exhibit exponential decay (geometric in discrete time). When an infection event occurs, there is also the chance that the strain will undergo mutation or recombine with another strain in the same host. Both of these events occur with defined probabilities (τ and r, respectively, Table 1).

The strains that a host is immune to influence the host’s probability of infection, given that contact with an infected neighbor has occurred, depending on the similarity of the strains. We model this cross-immunity by assuming that a host’s vulnerability to infection by a strain depends on the similarity between that strain and the strains in the host’s immune memory, an assumption also made in Gupta et al. (6). Given this (reasonable) assumption, the fraction of identical bits between the host’s immune memory and the infecting strain can therefore be converted into a vulnerability of infection (v), between 0 and 1, by using

\[ v = \left(1 - f^{1/\gamma}\right)^{\gamma}, \tag{3} \]

where f is the fraction of identical bits and γ is a positive number scaling the level of cross-immunity (Table 1).

The measures of genetic variability used to quantify a pathogen population at one point in time, outlined above, can also be used to interpret the dynamics of a pathogen population on a host network and for comparisons between different networks. Pathogen populations that have only one discordant pathogen set present have a low mean diversity value [D ≈ log(2)/log(N_s)] and a high mean degree of discordance (H ≈ 1). Pathogen populations with no strain structure have a high mean diversity value (D ≈ 1) and a low discordance value (H ≈ 0.5). Pathogen populations with stochastic cycling exhibit intermediate mean values of diversity and discordance.

Experimental Approach. Parameter space was explored by using the statistical technique of Latin hypercube sampling (LHS) (17), which selects combinations of parameter values without replacement, given parameter value ranges and probability distribution functions. The key model parameters that were sampled by using LHS can be found in Table 1.

We used 1,000 LHS samples to cover parameter space. For each of these, three simulations differing in host contact network structure were run for 3,000 time steps (sufficiently long to remove transient dynamics): one on a regular host network (ρ = 0), one on a random host network (ρ = 1), and one on a mean-field approximation network. In addition, small-world simulations were run for two sets of parameter values, with values of ρ between 0 and 1.

Table 1. Key model parameters, including symbol, description, and sample ranges used in the LHS sensitivity analysis

Parameter   Description                                                       LHS range (min:max)
C           Mean number of contacts per host                                  4:12
ρ           Degree of randomness in the host contact network                  NA
1/μ         Average duration of infection                                     3:10
1/σ         Average duration of immunity                                      10:30
β           Probability of transmission (to a completely susceptible host)    0.2:0.8
r           Probability of recombination per allele                           0.01:0.1
τ           Probability of allelic mutation per allele                        0.001:0.005
γ           Degree of cross-immunity                                          0.02:4
n           Number of immunodominant loci                                     2:4
P           Host population size                                              100:500

For the LHS analysis, we used a uniform probability distribution function for each range of parameter values. NA, not applicable; min:max, minimum and maximum of the sampled range.

Results

Stochastic Extinctions. Of the 1,000 parameter combinations in the LHS sensitivity analysis, 255 resulted in the extinction of all pathogen strains on the host network in one or several of the three host network scenarios. Analysis of variance of the parameter combinations for these simulations, in contrast to those in which extinction did not occur, revealed that the main factors contributing to extinction were, in declining order of importance, a short infectious period (P < 0.001, F = 79.57), a high degree of cross-immunity (P < 0.001, F = 75.70), a small host population (P < 0.001, F = 69.45), and low numbers of contacts between individuals (P < 0.001, F = 42.65). The results described below are based on the 745 LHS sensitivity analysis parameter combinations in which stochastic extinction did not occur in any of the network scenarios.

Comparison with Mean-Field Models. The stochastic IBM reproduces many of the features present in the original mean-field ODE formulation (6, 7). The effect of varying cross-immunity is particularly clear: pathogen populations have no strain structure at low cross-immunity, display strain cycling or chaotic fluctuations at intermediate cross-immunity, and are dominated by one discordant set at high levels of cross-immunity (Fig. 2). Fig. 2 shows that for both models strong host cross-immunity is sufficient to structure the pathogen population into discrete strains; in our model, this occurs regardless of the rate of recombination or mutation. As in the deterministic model, the changes in dynamics seen in Fig. 2 occur at critical values of γ, corresponding to the reduction in diversity and increase in discordance. In addition to the expected effect of γ on strain diversity and discordance, increasing the number of immunodominant loci (n) also affected these metrics by increasing diversity and decreasing discordance (Table 2).

Effect of Host Network Structure. Fig. 3 shows a comparison of mean diversity and mean discordance of simulations from the mean-field approximation versus the random fixed network and from the random fixed network versus the regular network. The results indicate that host contact network structure clearly affects pathogen strain structure and dynamics, with the discordant strain structure seen in the mean-field approximation breaking down in the more regular networks and strain diversity increasing. As the random mixing of the network decreases and contacts between hosts become more localized, the genetic structuring of the pathogen population decreases; the diversity of strains present increases and the dominance of sets of antigenically discordant strains declines. These results are robust for different parameter values (Table 2 and Fig. 3), and emphasize that the evolutionary dynamics of a pathogen may reflect the nature of the interactions between hosts rather than characteristics of the hosts or pathogen species themselves. Analysis of the relative effect of contact network structure in the LHS sensitivity analysis reveals that network structure explains a significant and comparatively large part of the variation in pathogen diversity and discordance (Table 2). Within a certain network type, however, the degree of cross-immunity (γ) and the number of loci (n) again account for most of the variance in discordance and diversity. The probability of recombination (r) and pathogen transmissibility (β) conspicuously do not significantly affect strain diversity or discordance in any of the three network types, a point to which we will return in the discussion.

We conjecture that the higher degree of host clustering in regular contact networks compared to random contact networks causes these patterns in mean diversity and mean discordance. To evaluate this hypothesis further, we ran small-world simulations at values of ρ between 0 and 1.
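The link between ρ and host clustering can be illustrated with a minimal Python sketch. It assumes a Watts–Strogatz-style construction (ring lattice, then each lattice edge redirected to a random host with probability ρ) and the usual average local clustering coefficient; the function names `small_world_network` and `clustering_coefficient` are hypothetical, and the details of the authors' own implementation are not given in the text.

```python
import random
from itertools import combinations


def small_world_network(n_hosts, c, rho, seed=0):
    """Ring lattice with c contacts per host (c even); each lattice edge is
    redirected to a randomly chosen host with probability rho.
    Illustrative assumption, not the authors' code."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_hosts)}
    for i in range(n_hosts):
        for k in range(1, c // 2 + 1):
            j = (i + k) % n_hosts  # local neighbor on the ring
            if rng.random() < rho:
                # Replace the local contact by a random host that is
                # neither i itself nor already a contact of i.
                candidates = [h for h in range(n_hosts)
                              if h != i and h not in adj[i]]
                j = rng.choice(candidates)
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Average over hosts of the fraction of a host's contact pairs that are
    themselves in contact (hosts with fewer than two contacts count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# Clustering falls as contacts become less localized (rho increases).
for rho in (0.0, 0.01, 0.1, 1.0):
    net = small_world_network(n_hosts=200, c=4, rho=rho, seed=1)
    print(rho, round(clustering_coefficient(net), 3))
```

With ρ = 0 and four contacts per host, every host has three of its six possible contact pairs linked, so the coefficient is exactly 0.5; at ρ = 1 it drops to a value close to zero.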

Fig. 2. An illustration of the changing dynamics for both the original deterministic model and our stochastic mean-field approximation model for pathogen populations with two loci (i.e., four strains). Two strains comprising one discordant set are plotted in black; the other discordant set is plotted in gray. Plotted in all simulations is the proportion of the host population immune to each of the four strains. Parameter values [corresponding to the parameter notation of Gupta et al. (6)] used for the ODE simulations (A–C) were μ = 0.02, σ = 10, R0 = 2, α = 1, with the only difference in parameter values being the degree of cross-immunity γ (0.3 in A, 0.7 in B, 0.9 in C). The mean-field approximation of the stochastic IBM (D–F) used parameter values corresponding to our model’s parameter notation in Table 1: C = 12, r = 0.0953, τ = 0.0042, β = 0.2472, 1/μ = 7, 1/σ = 23, P = 223, n = 2. The degree of cross-immunity γ was 0.01 in D, 0.95 in E, and 2.00 in F. (Note that γ is defined slightly differently in our model compared with its definition in ref. 6.) A and D, with the lowest values of γ, both have no strain structure, with the mean diversity in D being 0.9882 and the mean discordance being 0.6723. B and E have intermediate values of γ, and both exhibit cyclical strain dynamics. Mean diversity in E is 0.8733, and mean discordance is 0.7496. C and F have high values of γ, and both exhibit strong strain structure, with one discordant set being dominant. Mean diversity in F is 0.5480, and mean discordance is 0.9720. Simulations were run for 2,000 time steps for the IBM and 500 time steps for the ODE.
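Diversity and discordance values such as those quoted in the Fig. 2 legend follow from strain frequencies through Eqs. 1 and 2. The following Python sketch shows one way to compute them; the representation of a population as a dictionary from bit-string to frequency, the function names, and the convention of returning 0 discordance when only one strain is present are all assumptions made for illustration.

```python
import math
from itertools import combinations


def diversity(freqs, n_loci):
    """Eq. 1: Shannon entropy of the strain frequencies divided by the
    maximum entropy log(N_s), with N_s = 2**n_loci possible strains."""
    total = sum(freqs.values())
    entropy = 0.0
    for f in freqs.values():
        p = f / total
        if p > 0:
            entropy += p * math.log(1 / p)
    return entropy / math.log(2 ** n_loci)


def discordance(freqs, n_loci):
    """Eq. 2: frequency-weighted mean Hamming distance over strain pairs
    i < j, divided by the number of loci."""
    present = [(s, p) for s, p in freqs.items() if p > 0]
    num = den = 0.0
    for (s1, p1), (s2, p2) in combinations(present, 2):
        w = sum(a != b for a, b in zip(s1, s2))  # Hamming distance w_ij
        num += w * p1 * p2
        den += p1 * p2
    if den == 0:
        return 0.0  # single strain: no pairs (convention assumed here)
    return num / (den * n_loci)


# Two idealized two-locus populations.
no_structure = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
one_discordant_set = {"00": 0.5, "11": 0.5}
for name, pop in (("no structure", no_structure),
                  ("one discordant set", one_discordant_set)):
    print(name, round(diversity(pop, 2), 4), round(discordance(pop, 2), 4))
```

For two loci, a perfectly even population gives D = 1 and H = 2/3, and a single discordant pair gives D = 0.5 and H = 1. These idealized limits sit close to the simulated means reported for panels D (0.9882, 0.6723) and F (0.5480, 0.9720).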

The small-world strain dynamics were simulated with the IBM for two LHS samples, using 14 different values of ρ between 0 and 1. When ρ is between 0 and 1, the networks are considered to be "small-world" networks (14), and several quantities, such as characteristic path lengths and clustering coefficients, can be used as metrics to describe their structure (14). In Fig. 4, mean discordance and mean diversity values (Fig. 4A) and the clustering coefficients characterizing the host networks (Fig. 4B) are plotted against ρ. Fig. 4 clearly illustrates that the systematic changes in mean diversity and mean discordance values as the host network goes from being regular to being random occur at the ρ values where the degree of clustering changes. Further analysis of the degree of strain clustering in the host contact network (a strain cluster is defined as a group of connected hosts who are currently either infected with, or immune to, a given strain) indicates that, as contacts between hosts become more localized (Fig. 4B), the average size of the largest strain cluster diminishes (Fig. 4C). As discordant sets occur together spatially, this trend indicates that discordant sets grow in cluster size as host contacts become more random.

Fig. 3. A comparison of mean diversity and mean discordance for all 745 (of 1,000) simulations for which stochastic extinction did not occur within the first 3,000 time steps of either the mean-field approximation or the random or regular ring simulation. Mean diversity slightly increased from the mean-field approximation simulations to the random simulations (A), whereas mean discordance slightly decreased (B). A large increase in mean diversity is evident when regular ring network dynamics are compared with random host contact network dynamics (C), as is the large decrease in mean discordance levels (D).

Fig. 4. The effects of transitioning from regular to random networks on strain diversity and discordance. (A) The effect of ρ (the degree of host mixing) on mean discordance (dashed line) and mean diversity (solid line) for two simulations. The simulation denoted with the open circle has parameter values C = 10, r = 0.0681, τ = 0.0049, γ = 2.8178, β = 0.4093, 1/μ = 8, 1/σ = 11, P = 481, n = 4. The simulation denoted with the open square has parameter values C = 8, r = 0.0554, τ = 0.0021, γ = 3.5578, β = 0.7262, 1/μ = 9, 1/σ = 15, P = 371, n = 3. (B) The degree of host clustering, measured by the clustering coefficient, as a function of ρ. The clustering coefficient is defined and computed as in Watts and Strogatz (14). (C) The average size of the largest strain cluster as a function of ρ. The decrease in discordance and the increase in diversity with more localized interactions (lower ρ) are strongly correlated with the degree of host clustering and the reduction in the size of the largest strain cluster. Both simulations were run for 5,000 time steps, for each of the 14 ρ values, ranging from 0.0001 to 1. The first 2,000 time steps were discarded to remove the effect of transients. Note the logarithmic scale on the x axis.

Table 2. The effects of model parameters on diversity and discordance of the pathogen population

Column order: Regular H, Regular D, Random H, Random D, Mean-field H, Mean-field D, Combined H, Combined D.

Rows with all eight cells filled:
γ                    7.1    57.0   65.0   27.1   61.0   57.2   24.8   38.6
n                    15.6   2.3    1.5    13.5   10.7   12.1   3.3    10.0
Network structure    NA     NA     NA     NA     NA     NA     29.0   15.2

Rows with blank cells (values listed in left-to-right order; the columns of the blank cells cannot be recovered from this layout):
C      7.9, 1.8, 0.6, 1.0, 0.7
r      all cells blank
τ      1.4, 1.3, 0.6
β      all cells blank
1/μ    2.1, 4.2, 1.0, 1.1, 1.2
1/σ    0.7, 0.7, 0.9
P      4.5, 6.4, 1.1

Results of linear regression analysis on the effects of variation in the model parameters on mean diversity (D) and mean discordance (H) for 3,000 iterations and for 745 simulations. Values are the percentage of the variance in D or H explained by the model parameter (determined by using Pearson’s product moment correlation coefficient), with blank cells representing correlations that were not significant at P < 0.05 (two-tailed distribution). Data for the Combined column result from analyzing the data for Regular, Random, and Mean-field together, comparing the effect of network structure with that of the other model parameters. NA, not applicable.

Discussion

Our stochastic IBM illustrates that the contact network structure of the host population can play a major part in determining the strain structure and evolutionary dynamics of a pathogen population. For pathogens with polymorphic, immunodominant antigens, regular host networks with localized interactions may allow for a more diverse pathogen population to exist, whereas well mixed host populations promote genetic structuring by the host immune system.

Our mean-field approximation supports the findings of deterministic models, and reproduces the three types of dynamics found previously within this type of framework (5, 8, 9): no strain structure at low levels of cross-immunity, discrete, nonoverlapping strain structure at high levels, and cyclical dominance of nonoverlapping sets of strains at intermediate levels. The addition of a stochastic framework to these mean-field models has allowed for the inclusion of mutation events, a varied population size, and an increased number of strains, in addition to the exploration of different host networks. The fact that the effects of host cross-immunity are reproduced accurately even in relatively small populations, with large numbers of strains, and with high rates of mutation and recombination, provides strong support for the hypothesis that immunity of the host may dictate the structure and dynamics of the pathogen population when pathogens are antigenically variable.

The variance in strain discordance and diversity for all networks was primarily affected by the degree of cross-immunity (γ) and, to a lesser extent, the number of immunodominant loci (n). Across networks, the host contact structure also played a key role in determining these metrics (Table 2). Within a given host network type, as well as in the combined analysis, other factors, such as the average number of contacts per host (C), the average duration of infection and immunity (1/μ and 1/σ, respectively), and the host population size (P), only contributed slightly to explaining the variance in diversity and discordance. Interestingly, neither the probability of transmission (β) nor the probability of recombination (r) significantly explained any variance in these metrics. Although we would not expect the probability of transmission to necessarily affect these metrics, because all strains are equally fit, it is at first surprising that the probability of recombination does not contribute to explaining either of the metrics’ variance. High rates of recombination, which should promote diversity and disrupt discordant strain structure, do not have this effect because a recombinant pathogen inherits immunodominant loci from its "parent" strains. Because discordant sets cluster together in the host networks, recombinants are generated in host environments in which the hosts are likely to have already experienced, and therefore be immune to, all of the immunodominant loci of the recombinant pathogen. Recombinant strains therefore cannot establish themselves regardless of how often they are generated, because they are immediately suppressed by herd immunity to their parent strains. As a result, higher recombination rates do not significantly affect strain diversity or discordance.

The fact that localized interactions may promote diversity in phenomena occurring on networks is well established (18–20). A number of loosely connected "islands of contacts" can result in the emergence of different dynamics occurring in different parts of the network, because local densities equilibrate more rapidly than global densities (13). As a result, models that have incorporated space have often produced differing results from their mean-field counterparts. This study is no exception. What makes our finding particularly notable, however, is the primary importance of host contact network structure in controlling the dynamics of pathogen strain evolution and diversity. Unlike mean-field models, in which the selective force of the host immune system impacts the whole system equally, incorporating constraints on the spatial distribution of different strains allows for the buildup of spatial clustering. Qualitative analyses suggest that discordant sets do arise locally, but that herd immunity is not established over the entire network when contacts between hosts are local. Moreover, the upward trend in the average size of the largest strain cluster associated with more random host networks highlights the importance of contact networks in controlling the establishment of widespread herd immunity. These observations argue for further investigation into the role that contact network structure may play in generating these dynamics, in relation to these other key variables, especially considering that our general results appear over a large range of other parameter values.

Although mean-field models can provide valuable insight into the mechanisms driving pathogen evolution, we have shown that relaxing the assumption of random mixing within host populations may have profound effects on the interpretation of clinical data. Caution must be exercised when inferring mechanisms of selection from models that assume random host mixing, because the "environmental" contexts in which pathogen evolution occurs may be important in shaping their dynamics. Spatial patchiness, having been shown to be of great relevance in understanding ecological data, needs to be further addressed in the field of , where the nonrandom connectivity of hosts provides the spatial backdrop for understanding and controlling disease dynamics.

This study was conceived and initiated as part of the Complex Systems Summer School 2003 at the Santa Fe Institute. We thank the Santa Fe Institute, Jonathan Shapiro, Tom Carter, and the participants of the summer school for advice and support during this study. We also thank two anonymous reviewers for suggestions.

1. Jolley, K. A., Kalmusova, J., Feil, E. J., Gupta, S., Musilek, M., Kriz, P. & Maiden, M. C. J. (2000) J. Clin. Microbiol. 38, 4492–4498.
2. Gupta, S., Trenholme, K., Anderson, R. M. & Day, K. P. (1994) Science 263, 961–963.
3. Dietz, K. (1979) J. Math. Biol. 8, 291–300.
4. Castillo-Chavez, C., Hethcote, H. W., Andreasen, V., Levin, S. A. & Liu, W. M. (1989) J. Math. Biol. 27, 233–258.
5. Andreasen, V., Lin, J. & Levin, S. A. (1997) J. Math. Biol. 35, 825–842.
6. Gupta, S., Ferguson, N. & Anderson, R. M. (1998) Science 280, 912–915.
7. Gupta, S., Maiden, M., Feavers, I. M., Nee, S., May, R. M. & Anderson, R. M. (1996) Nat. Med. 2, 437–442.
8. Gomes, M. G., Medley, G. F. & Nokes, D. J. (2002) Proc. R. Soc. London Ser. B 269, 277–233.
9. Gog, J. R. & Swinton, J. (2002) J. Math. Biol. 44, 169–184.
10. Read, J. M. & Keeling, M. J. (2003) Proc. R. Soc. London Ser. B 270, 699–708.
11. Keeling, M. J. (1999) Proc. R. Soc. London Ser. B 266, 859–867.
12. O’Keefe, K. J. & Antonovics, J. (2002) Am. Nat. 159, 579–605.
13. van Baalen, M. (2002) in The Adaptive Dynamics of Infectious Diseases: In Pursuit of Virulence Management, eds. Dieckman, U., Metz, J. A. J., Sabelis, M. W. & Sigmund, K. (Cambridge Univ. Press, Cambridge, U.K.), pp. 85–103.
14. Watts, D. J. & Strogatz, S. H. (1998) Nature 393, 440–442.
15. Shannon, C. E. & Weaver, W. (1949) The Mathematical Theory of Communication (Univ. of Illinois Press, Urbana).
16. Warwick, R. M. & Clarke, K. R. (1995) Mar. Ecol. Prog. Ser. 129, 301–305.
17. Blower, S. M. & Dowlatabadi, H. (1994) Int. Stat. Rev. 2, 229–243.
18. Tilman, D. & Kareiva, P., eds. (1997) Spatial Ecology: The Role of Space in Population Dynamics and Interspecific Interactions (Princeton Univ. Press, Princeton).
19. Hassel, M. (2000) Spatial and Temporal Dynamics of Host–Parasitoid Interactions (Oxford Univ. Press, Oxford).
20. Newman, M. E. J. (2003) SIAM Rev. 45, 167–256.
