Inferring Livestock Movement Networks from Archived Data To
Supplementary: Inferring livestock movement networks from archived data to support infectious disease control in developing countries

A. Muwonge1,4, P.R. Bessell1, T. Porphyre1,5, P. Motta1,6, G. Rydevik1, G. Devailly1,3, N.F. Egbe2, R.F. Kelly1, I.G. Handel1,4, S. Mazeri1,4, B.M.deC. Bronsvoort1,4

1. The Roslin Institute and the Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
2. School of Life Sciences, University of Lincoln, Brayford Pool, Lincoln LN6 7TS, United Kingdom
3. GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
4. Epidemiology Economics and Risk Assessment group at The Roslin Institute and the Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
5. Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie Et Biologie Évolutive, Université de Lyon, Villeurbanne Cedex, France
6. The European Commission for the Control of Foot-and-Mouth Disease (EuFMD), Food and Agriculture Organization of the United Nations, Rome, Italy

Materials and methods
Hypothesis and conceptual framework
This comparative analysis was implemented with some assumptions about livestock production in Cameroon: a) the cattle lifecycle includes three stages, highlighted here as “Rearing on farm”, “movement for Trading” and ending up “at the slaughter house”; b) the census, empirical and molecular datasets used here represent these three stages respectively. We then generate networks from each dataset using the following methodologies:
1. Derive a cattle movement network as a function of human protein demand using gravity modelling.
2. Derive an empirical network topology from an edge list generated from a cross-sectional study our team conducted (DBSX1).
3. Derive a host-to-host pathogen transmission network using phylodynamic modelling, since pathogens are considered “hitchhikers” on hosts.
4. Derive a random network topology.
The empirical and random networks are used as controls, i.e. the former as the reference/gold standard and the latter as the negative control/null.
The vast majority of network structures/topologies are a product of dynamic processes [1,2]; one can therefore think of the resultant network topology as a relic of the contact structure. Based on this contact structure we can elucidate disease spread by simulation. The novelty here is the ability to repurpose generally archived census and molecular data. Similarity and dissimilarity in topology and simulated disease characteristics between the gravity, molecular and empirical networks, as well as their random equivalents, allow us to examine the following:
a) the amount of overlap in information captured,
b) the complementary utility, i.e. the extra information each captures,
c) how specific and non-specific each network can be.
All this information can be exploited to support data-driven livestock disease management, especially resource allocation.

Description of data source and context
Empirical data set (For R code see section-A2 in Network_Generation_Code)
The empirical network (EN) was generated using data collated on cattle movements through the livestock trading system across the Adamawa, West and North-West regions of Cameroon. The lists of cattle markets present within these regions were obtained from the Ministry of Livestock, Fisheries and Animal Industries (MINEPIA). Combining this information with the analysis of commercial connections between markets in each region identified a total of 59 livestock markets [18].

Census data summary
The census data used represent approximately 8.85 million head of cattle and 10.3 million humans.
The human and cattle population difference between subdivisions within regions is shown in Fig S1. It is noteworthy, however, that the ratio of humans to cattle is highest and lowest in the Adamawa and central regions respectively. Furthermore, areas without cattle or humans, or with one of the two populations missing, were excluded from our analysis.

[Figure S1: maps, panels A and B, with population legends of 0 to 2,000,000 and 0 to 5e+05 respectively.]
Figure S1 Spatial distribution of the (A) human and (B) cattle population, generated using ggplot in R from the Cameroonian census data of 2005-2007 (DBSX3). The color scheme of the legend ranges from dark blue to yellow and represents population ranges of (A) 0-2 million and (B) 0-0.5 million.

Principles behind each network topology
Molecular network topology (For R code see section-A1 in Network_Generation_Code)
Here we exploit the principles of “measurably evolving populations” (MEP) of pathogens [3] to reconstruct a transmission network based on M. bovis (Fig 2).

[Figure S2. Panel 2a: isolates listed by ID, MIRU-VNTR and spoligopattern, with example pairwise molecular distances (A-B 0, C-D 4, C-E 2, E-F 4) and pairwise physical distances read from a map (A-B 2, C-D 3, C-E 3.5, E-F 5). Panels 2b and 2c: genotypes A1 and A2 at locations A and B descending from the MRCA (genotype A), with transmission events, mutation events and the period of interest x-y marked.]
Figure S2 illustrates the approach used to generate the molecular network. Panel A shows how we extracted molecular distance from M. bovis genotypes and physical distance from the host (cattle). Panel B shows how molecular distance and physical distance are related to transmission and mutation events (MEP). Here transmission events are analogous to physical distance (µ) and mutation events are analogous to genetic distance (α). The x-y “window” defines the epidemiological “window” of interest in space and time [15].
Panel C puts this in phylogeographic context: the pathogen genotypes are cast in space (µ) and time (α). The two genotypes A1 and A2 evolved from the most recent common ancestor (MRCA); note that they are isolated in two different locations, i.e. locations A and B. From phylogeography, there must be a direct relation between physical distance and molecular distance. It is this linear relationship that defines the data used to construct our molecular network.

To contextualize this, take two genotypes, A1 and A2, recovered from cattle in locations X and Y, and assume the genotypes have a recent common ancestor Ao (Figure S2-Panel B). We can use the molecular and physical distances α and µ to extract data points from DBSX2 that satisfy the linear relationship (Fig S3) and ideally fall within our “window” µ ~ α of interest (Figure S2-Panel C) and quadrant B (Fig S3). Given our datasets, we assume this window represents 2007 to 2014 and accounts for the period between pathogen transmission, latent infection and the infectious period for cattle in Africa [4]. This is why we use the census data for 2005-2007. In our data set we define mutation changes as reported by [5], i.e. a mutation event is the difference in steps in a MIRU-VNTR type between any two isolates with the same spoligotype (Fig S2-Panel A) [16]. Physical distance is computed as the linear Euclidean distance between any two sub-counties. We therefore use the data points from quadrant B (Fig S3) to generate the undirected molecular network, which we direct using the molecular diversity at each subdivision.
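The steps described above (pairwise molecular and physical distances, selection of a “window”, then directing the edges) can be sketched in a few lines. This is a minimal Python illustration only, not the authors' R code from section-A1 of Network_Generation_Code. Everything in it is an assumption made for illustration: the toy isolates and centroids, the reading of “difference in steps” as the summed absolute difference in repeat counts across loci, the two window thresholds, diversity measured as the number of distinct MIRU-VNTR types per subdivision, and the rule that an edge points from the more diverse to the less diverse subdivision.

```python
from itertools import combinations
from math import hypot

# Hypothetical isolates: (id, subdivision, spoligotype, MIRU-VNTR repeat counts).
ISOLATES = [
    ("i1", "A", "SB0944", (2, 3, 5, 4)),
    ("i2", "B", "SB0944", (2, 3, 5, 5)),
    ("i3", "A", "SB0944", (2, 4, 5, 4)),
    ("i4", "C", "SB0944", (6, 1, 2, 4)),
    ("i5", "C", "SB0953", (2, 3, 5, 4)),
]
# Hypothetical subdivision centroids in arbitrary map units.
CENTROIDS = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (1.0, 1.0)}


def molecular_distance(p, q):
    """Assumed metric: summed absolute difference in repeat counts per locus."""
    return sum(abs(a - b) for a, b in zip(p, q))


def physical_distance(s, t):
    """Straight-line (Euclidean) distance between two subdivision centroids."""
    (x1, y1), (x2, y2) = CENTROIDS[s], CENTROIDS[t]
    return hypot(x2 - x1, y2 - y1)


def pairwise_distances(isolates):
    """Compare only isolates that share a spoligotype and sit in different subdivisions."""
    rows = []
    for (_, s1, sp1, m1), (_, s2, sp2, m2) in combinations(isolates, 2):
        if sp1 != sp2 or s1 == s2:
            continue
        rows.append((s1, s2, molecular_distance(m1, m2), physical_distance(s1, s2)))
    return rows


def window_edges(rows, max_molecular, max_physical):
    """Keep pairs inside the window; both thresholds are illustrative assumptions."""
    return {
        frozenset((s1, s2))
        for s1, s2, alpha, mu in rows
        if alpha <= max_molecular and mu <= max_physical
    }


def diversity(isolates):
    """Assumed diversity measure: distinct MIRU-VNTR types per subdivision."""
    types = {}
    for _, sub, _, miru in isolates:
        types.setdefault(sub, set()).add(miru)
    return {sub: len(found) for sub, found in types.items()}


def direct_edges(edges, div):
    """Assumed rule: point each edge from the more diverse to the less diverse node."""
    directed = set()
    for edge in edges:
        a, b = sorted(edge)
        if div[a] >= div[b]:
            directed.add((a, b))
        if div[b] >= div[a]:
            directed.add((b, a))  # ties keep both directions
    return directed


if __name__ == "__main__":
    rows = pairwise_distances(ISOLATES)
    undirected = window_edges(rows, max_molecular=4, max_physical=6.0)
    print(direct_edges(undirected, diversity(ISOLATES)))
```

With these toy values only the pair of subdivisions A and B falls inside the window, and the single resulting edge is directed from A (two MIRU-VNTR types) to B (one type). The thresholds would need to be replaced by whatever bounds actually delimit quadrant B in the real data.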
[Figure S3: four scatter-plot panels (Quadrant A, Quadrant B, Quadrant C, Quadrant D) of genetic distance (y-axis) against physical distance (x-axis, 0 to 6); Quadrants A and C cover genetic distances of about 4 to 9, Quadrants B and D about 0 to 4.]
Figure S3 Relationship between genetic distance (α) and physical distance (µ) using our data. Quadrant B represents the “window” whose data points we used to derive the molecular network topology.

The dataset used for generating the molecular distance [5] contained 25 unique subdivisions, of which 20 were used for deriving the molecular network topology. Four unique spoligotypes and fifty-one unique MIRU-VNTR types were used to generate molecular distances.
In this regard, spoligotypes SB0944, SB0953, SB1025 and SB1460, listed in order of prevalence, were used.

Gravity network topology (For R code see section-A3 in Network_Generation_Code)
Gravity