
bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

Intragenic Recombination Influences Rotavirus Diversity and Evolution

Irene Hoxie^{a,b} and John J. Dennehy^{a,b,#}


^a Biology Department, Queens College of The City University of New York, Queens, NY
^b The Graduate Center of The City University of New York, New York, NY

Running head: Rotavirus Recombination and Evolution

#Address correspondence to John J. Dennehy, [email protected]


Abstract

Because of their replication mode and segmented dsRNA genomes, homologous recombination is assumed to be rare in rotaviruses. We analyzed 23,627 complete rotavirus genome sequences available in the NCBI Virus Variation database, and found 109 instances of homologous recombination, at least 11 of which prevailed across multiple sequenced isolates. In one case, recombination may have generated a novel rotavirus VP1 lineage. We also found strong evidence for intergenotypic recombination in which more than one sequence strongly supported the same event, particularly between different genotypes of segment 9, which encodes the serotype protein, VP7. The recombined regions of many putative recombinants showed substitutions differentiating them from their major and minor parents. This finding suggests that these recombination events were not overly deleterious, since presumably these recombinants proliferated long enough to acquire adaptive mutations in their recombined regions. Protein structural predictions indicated that, despite the sometimes substantial amino acid replacements resulting from recombination, the overall protein structures remained relatively unaffected. Notably, recombination junctions appear to occur non-randomly, with hot spots corresponding to secondary RNA structures, a pattern seen consistently across segments. In total, we found strong evidence for recombination in nine of eleven rotavirus A segments. Only segment 7 (NSP3) and segment 11 (NSP5) did not show strong evidence of recombination. Collectively, the results of our computational analyses suggest that, contrary to the prevailing sentiment, recombination may be a significant driver of rotavirus evolution and may influence circulating strain diversity.

Keywords: dsRNA, genetic diversity, , segmented genome, RDP4


Introduction

The non-enveloped, dsRNA rotaviruses of the family Reoviridae are a common cause of acute gastroenteritis in young individuals of many bird and mammal species (Desselberger 2014). The rotavirus genome consists of 11 segments, each coding for a single protein with the exception of segment 11, which encodes two proteins, NSP5 and NSP6 (Desselberger 2014). Six of the proteins are structural proteins (VP1-4, VP6 and VP7), and the remainder are non-structural proteins (NSP1-6). The infectious virion is a triple-layered particle consisting of two outer-layer proteins, VP4 and VP7, a middle-layer protein, VP6, and an inner capsid protein, VP2. The RNA polymerase (VP1) and the capping enzyme (VP3) are attached to the inner capsid protein. For the virus to be infectious (at least when not infecting as an extracellular vesicle), the VP4 spike protein must be cleaved by a protease, which results in the proteins VP5* and VP8* (Arias, Romero et al. 1996). Because they comprise the outer layer of the virion, VP7 and VP4 are capable of eliciting neutralizing antibodies, and are used to define G (glycoprotein) and P (protease sensitive) serotypes, respectively (Matthijnssens, Ciarlet et al. 2008, Nair, Feng et al. 2017). Consequently, VP7 and VP4 are likely to be under strong selection for diversification to mediate cell entry or escape host immune responses (McDonald, Matthijnssens et al. 2009, Kirkwood 2010, Patton 2012).

Based on sequence identity and antigenic properties of VP6, 10 different rotavirus groups (A–J) have been identified, with group A rotaviruses being the most common cause of human infections (Matthijnssens, Otto et al. 2012, Mihalov-Kovacs, Gellert et al. 2015, Banyai, Kemenesi et al. 2017). A genome classification system based on established nucleotide percent identity cutoff values has been developed for group A rotaviruses (Matthijnssens, Ciarlet et al. 2008, Matthijnssens, Ciarlet et al. 2011).
In the classification system, the segments VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 are represented by the indicators Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx (x = Arabic numbers starting from 1), respectively (Matthijnssens, Ciarlet et al. 2008, Matthijnssens, Ciarlet et al. 2011). To date, between 20 and 51 different genotypes have been identified for each segment, including 51 different VP4 genotypes (P[1]–P[51]) and 36 different VP7 genotypes (G1–G36), both at 80% nucleotide identity cutoff values (Steger, Boudreaux et al. 2019).

The propensity of rotavirus for coinfection and outcrossing with other rotavirus strains makes it a difficult pathogen to control and surveil, even with current vaccines (Rahman, Matthijnssens et al. 2007, Matthijnssens, Ciarlet et al. 2008, Matthijnssens, Bilcke et al. 2009, Kirkwood 2010, Ghosh and Kobayashi 2011, Sadiq, Bostan et al. 2018). Understanding rotaviral diversity expansion, genetic exchange between strains (especially between the clinically significant type I and type II genogroups), and evolutionary dynamics resulting from coinfections has important implications for disease control (Rahman, Matthijnssens et al. 2007, Matthijnssens, Ciarlet et al. 2008, Matthijnssens, Bilcke et al. 2009, Kirkwood 2010, Ghosh and Kobayashi 2011, Sadiq, Bostan et al. 2018). Rotavirus A have high mutation rates (Matthijnssens, Heylen et al. 2010, Donker and Kirkwood 2012, Sadiq, Bostan et al. 2018), undergo frequent reassortment (Ramig and Ward 1991, Ramig 1997, Ghosh and Kobayashi 2011, McDonald, Nelson et al. 2016), and the perception is that these two processes are the primary drivers of rotavirus evolution (Doro, Farkas et al. 2015, Sadiq, Bostan et al. 2018). Genome rearrangements may also contribute to rotavirus diversity, but are not believed to be a major factor in rotavirus evolution (Desselberger 1996). Homologous recombination, however, is thought to be especially rare in rotaviruses due to their dsRNA genomes (Ramig 1997, McDonald, Nelson et al. 2016, Varsani, Lefeuvre et al. 2018). Unlike +ssRNA viruses (Lukashev 2005) and DNA viruses (Pérez-Losada, Arenas et al. 2015), dsRNA viruses cannot easily undergo intragenic recombination because their genomes are not replicated in the cytoplasm by host polymerases, but rather within nucleocapsids by viral RNA-dependent RNA polymerase (Ramig 1997, Patton, Vasquez-Del Carpio et al. 2007). Genome encapsidation should significantly reduce the opportunities for template switching, the presumptive main mechanism of intramolecular recombination (Lai 1992, Pérez-Losada, Arenas et al. 2015).

Despite the expectation that recombination should be rare in rotavirus A, there are nevertheless numerous reports of recombination among rotaviruses in the literature (Suzuki, Gojobori et al. 1998, Parra, Bok et al. 2004, Phan, Okitsu et al. 2007, Phan, Okitsu et al. 2007, Cao, Barro et al. 2008, Martinez-Laso, Roman et al. 2009, Donker, Boniface et al. 2011, Jere, Mlera et al. 2011, Esona, Roy et al. 2017, Jing, Zhang et al. 2018). However, a comprehensive survey of 797 rotavirus A genomes failed to find any instances of the same recombination event in multiple samples (Woods 2015). There are two possible explanations for this outcome. Either the putative recombinants were spurious results stemming from poor analytical technique, or they were poorly fit recombinants that failed to increase in frequency in the population such that they would be resampled (Woods 2015). The main implications of this study are that recombination among rotavirus A is rare, usually disadvantageous, and not a significant factor in rotavirus evolution.
Since the number of publicly available rotavirus A whole segment genomes is now over 23,600, it is worth revisiting these conclusions to see if they are still valid. To this end, we used bioinformatics tools to identify possible instances of recombination among all complete rotavirus A genome sequences available in the NCBI Virus Variation database as of May 2019. We found strong evidence for recombination events among all rotavirus A segments with the exception of NSP3 and NSP5. In several cases, the recombinants were fixed in the population such that several hundred sampled strains showed remnants of this same event. These reports suggest that rotavirus recombination occurs more frequently than is generally appreciated, and can significantly influence rotavirus A evolution.

Methods

Sequence Acquisition and Metadata Curation

We downloaded all complete rotavirus genomes from NCBI's Virus Variation Resource as of May 2019 (n = 23,627) (Hatcher, Zhdanov et al. 2017). Laboratory strains were removed from the dataset. Genomes that appeared to contain substantial insertions were excluded as well. Avian and mammalian strains were analyzed separately, as recombination analyses between more divergent genomes can sometimes confound the results. No well-supported events were identified among the avian strains. For each rotavirus genome, separate fasta files were downloaded for each of the 11 segments. Metadata including host, country of isolation, collection date, and genotype (genotype cutoff values as defined by Matthijnssens, Ciarlet et al. 2008) were recorded. The genotype cutoff values allowed for the differentiation of events that occurred between two different genotypes (intergenotypic) from those that occurred within the same genotype (intragenotypic).

Recombination Detection and Phylogenetic Analysis

All 11 segments of each complete genome were separately aligned using MUSCLE v3.8.31 (Edgar 2004) after removing any low quality sequence (e.g., 'Ns'). Putative recombinants were identified using RDP4 Beta 4.97, a program that employs multiple recombination detection methods to minimize the possibility of false positives (Martin, Murrell et al. 2015). All genomes were analyzed, with a p-value cutoff < 10E-4, using 3Seq, Chimæra, SiScan, MaxChi, Bootscan, Geneconv, and RDP as implemented in RDP4 (Martin, Murrell et al. 2015). We eliminated any strains that did not show a putative recombination event predicted by at least 6 of the above listed programs. We then ran separate phylogenetic analyses on the 'major parent' and 'minor parent' sequences of putative recombinants in BEAST v1.10.4 using the general time reversible (GTR) + Γ + I substitution model (Suchard, Lemey et al. 2018). 'Parent' in this case does not refer to the actual progenitors of the recombinant strain, but rather to those members of the populations whose genome sequences most closely resemble that of the recombinant. Significant phylogenetic incongruities with high posterior probabilities between the 'major parent' and 'minor parent' sequences were interpreted as convincing evidence for recombination.

Phylogenetic Analysis of VP1 and VP3

Segment 1 (VP1) and segment 3 (VP3) each showed evidence for recombination events resulting in a novel lineage, wherein many isolates reflected the same recombination event.
To analyze these events more thoroughly, we split alignments of a subset of environmental isolates to include flagged clades, minor parent clades, and major parent clades along with outgroup clades to improve accuracy of tip dating. We generated minor and major parent phylogenies using BEAST v1.10.4 (Suchard, Lemey et al. 2018). We used tip dating to calibrate molecular clocks and generate time-scaled phylogenies. The analyses were run under an uncorrelated relaxed clock model using a time-aware Gaussian Markov random field Bayesian skyride tree prior (Minin, Bloomquist et al. 2008). The alignments were run using a GTR + Γ + I substitution model and partitioned by codon position. Log files were analyzed in Tracer v1.7.1 (Rambaut, Drummond et al. 2018) to confirm sufficient effective sample size (ESS) values, and trees were annotated using a 20% burn-in. The alignments were run for three chains with a 200,000,000 Markov chain Monte Carlo (MCMC) chain length, analyzed in Tracer v1.7.1 (Rambaut, Drummond et al. 2018), and combined using LogCombiner v1.10.4 (Drummond and Rambaut 2007). Trees were annotated with a 10% burn-in, and run in TreeAnnotator v1.10.4 (Suchard, Lemey et al. 2018). The best tree was visualized using FigTree v1.4.4 (Rambaut 2019) with the nodes labeled with posterior probabilities.

RNA Secondary Structure Analysis

To test the hypothesis that recombination junctions were associated with RNA secondary structure, we generated consensus secondary RNA structures of different segments and genotypes using RNAalifold in the ViennaRNA package, version 2.4.11 (Bernhart, Hofacker et al. 2008, Lorenz, Bernhart et al. 2011). The consensus structures were visualized and mountain plots were generated to identify conserved structures that may correspond to breakpoint locations. We made separate consensus alignments in two ways: by using sequences of the same genotype and by combining genotypes. Due to the high sequence variability of the observed recombinants, particularly in segments 4 (VP4) and 9 (VP7), we note that the consensus structures may vary substantially depending on the sequences used in the alignments. Segments 7 (NSP3), 10 (NSP4), and 11 (NSP5) were not analyzed because they had only one or no recombination events. Consensus structures could not reliably be made for segment 2 (VP2). While the VP2 protein is relatively conserved across genotypes, it contains insertions, particularly in the C1 genotype, yet shows recombination across C1 and C2 genotypes.

Protein Structure and Antigenic Predictions

To evaluate whether recombination events resulted in substantial (deleterious) protein structure changes, we employed LOMETS2 (Local Meta-Threading-Server) I-TASSER (Iterative Threading Assembly Refinement) (Zhang 2008, Roy, Kucukural et al. 2010, Yang and Zhang 2015) to predict secondary and tertiary protein structures. I-TASSER generates a confidence (C) score for estimating the quality of the protein models. To determine if any of the putative recombinants possessed recombined regions containing epitopes, we analyzed the amino acid sequences of all VP4, VP6, and VP7 recombinants and their major and minor parents. We used the Immune Epitope Database (IEDB) (Vita, Mahajan et al. 2019) and SVMTriP (Yao, Zhang et al. 2012) to predict conserved epitopes.
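
The filtering rule described under Recombination Detection (retain only events supported by at least 6 of the 7 methods) and the intergenotypic/intragenotypic distinction can be illustrated with a minimal Python sketch. It is not how the analysis was run: the `Event` record, its field names, and the example values are hypothetical, and reading the stated cutoff of 10E-4 as 1e-4 is an assumption.

```python
from dataclasses import dataclass

# The seven detection methods named in the Methods (as implemented in RDP4).
METHODS = ("3Seq", "Chimaera", "SiScan", "MaxChi", "Bootscan", "Geneconv", "RDP")
P_CUTOFF = 1e-4   # assumption: the stated "10E-4" cutoff is read as 1e-4
MIN_METHODS = 6   # minimum number of supporting methods stated in the Methods


@dataclass
class Event:
    """Hypothetical record for one putative recombination event."""
    recombinant: str
    major_genotype: str
    minor_genotype: str
    p_values: dict  # method name -> p-value; methods with no signal are absent


def supporting_methods(event):
    """Return the methods whose p-value falls below the cutoff."""
    return [m for m in METHODS if event.p_values.get(m, 1.0) < P_CUTOFF]


def is_putative(event):
    """Keep an event only if at least MIN_METHODS methods support it."""
    return len(supporting_methods(event)) >= MIN_METHODS


def event_type(event):
    """Same parental genotype -> intragenotypic, otherwise intergenotypic."""
    if event.major_genotype == event.minor_genotype:
        return "intragenotypic"
    return "intergenotypic"


# Toy examples with made-up identifiers and p-values.
strong_event = Event("isolate_A", "G1", "G2", {m: 1e-12 for m in METHODS})
weak_event = Event("isolate_B", "R2", "R2", {m: 1e-6 for m in METHODS[:4]})

for ev in (strong_event, weak_event):
    print(ev.recombinant, is_putative(ev), event_type(ev))
```

In this toy run the first event passes the filter and the second, supported by only four methods, is discarded.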
Results

Strong Evidence for Homologous Recombination in Rotavirus A

We identified 109 putative recombination events (identified by at least 6/7 RDP4 programs; Table 1, Supplementary Table 1). Of these, 67 recombination events were strongly supported, meaning they were detected by 7/7 RDP4 programs with a p-value cutoff < 10E-9 (Table 1). Most recombination events detected were observed in sequences uploaded to the Virus Variation database since the last large-scale analysis (Woods 2015), so differences between these results and prior studies may simply reflect the recent increase in available genome sequences.

Putative Recombination Events Observed in Multiple Environmental Isolates

Eleven of the recombination events identified in Table 1 were observed in more than one environmental isolate (Fig. 1; Table 2). The observation of multiple sequenced strains with the same recombination event is strong evidence that the observed event was not spurious, and was not a consequence of improper analytical technique or experimental error. Assuming the events are not spurious, there are only two ways that multiple sequenced isolates will show the same recombinant genotype. Either multiple recombination events with the same exact breakpoints occurred at approximately the same time, or the event happened once and descendants of the recombined genotype were subsequently isolated from additional infected hosts. The latter scenario is more parsimonious, and suggests that the new genotype was not reproductively impaired. Indeed, some recombined genotypes may be more fit than their predecessors, but this outcome would need to be experimentally validated with infectivity assays.

Table 1. Recombination events identified among all mammalian rotavirus A genome sequences downloaded from NCBI Virus Variation database in May 2019 (Hatcher, Zhdanov et al. 2017).

                                       VP1     VP2     VP3     VP4     VP6     VP7     NSP1    NSP2    NSP3    NSP4    NSP5
Number of sequences analyzed           1710    1600    1905    1990    2176    3887    1962    2186    1881    2430    1900
Putative recombination events^a        15      13      16      11      11      24      11      4       0       3       1
Strongly supported events^b            7       8       15      7       6       14      7       1       0       1       1
Recombination frequency^c              4.1E-3  5.0E-3  7.9E-3  3.5E-3  2.8E-3  3.6E-3  3.6E-3  4.6E-4  --      4.1E-4  5.3E-4
Intergenotypic recombination ratio^d   2/15    8/13    4/16    6/11    7/11    22/24   3/11    2/4     --      2/3     1/1

^a 6/7 programs implemented in RDP4 identified putative recombination event (see Methods).
^b Events identified by 7/7 programs implemented in RDP4 or events where more than one environmental isolate showed the same event (see Methods).
^c Strongly supported events (Row 3) divided by number of sequences analyzed (Row 1).
^d Number of intergenotypic events out of all putative recombination events (Row 2).
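
The recombination frequency row of Table 1 is defined as strongly supported events divided by sequences analyzed. A short Python sketch, using only the counts transcribed from Table 1, shows the arithmetic; the variable and function names are illustrative.

```python
# Counts transcribed from Table 1 (column order as in the table).
segments = ["VP1", "VP2", "VP3", "VP4", "VP6", "VP7",
            "NSP1", "NSP2", "NSP3", "NSP4", "NSP5"]
sequences = [1710, 1600, 1905, 1990, 2176, 3887, 1962, 2186, 1881, 2430, 1900]
putative = [15, 13, 16, 11, 11, 24, 11, 4, 0, 3, 1]
strong = [7, 8, 15, 7, 6, 14, 7, 1, 0, 1, 1]


def recombination_frequency(strong_events, n_sequences):
    """Strongly supported events (Row 3) divided by sequences analyzed (Row 1)."""
    return strong_events / n_sequences


frequency = {
    seg: recombination_frequency(s, n)
    for seg, s, n in zip(segments, strong, sequences)
}

for seg in segments:
    print(f"{seg:5s} {frequency[seg]:.1e}")

# Column totals, for comparison with the counts quoted in the text.
print("sequences:", sum(sequences))
print("putative events:", sum(putative))
print("strongly supported events:", sum(strong))
```

The column totals can be checked against the 23,627 sequences, 109 putative events, and 67 strongly supported events reported in the text.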


Figure 1. Bootscan (A) and RDP (B) analyses of putative recombinants where multiple environmental samples supported the event. From top left to bottom left: a G6-G6 event in VP7, a G9-G1 event, an R1-R1 event in VP1, and an N1-N1 event in NSP2. From top right to bottom right: an A8-A8 event in NSP1, an R2-R2 event in VP1, a G2-G1 event in VP7, and a G3-G1 event.


Table 2. Recombination events observed in multiple independent environmental isolates (i.e., isolates from different patients and/or sequenced by different laboratories). ^a 99% confidence intervals. ^b See Fig. 2.

Recombined Segment   Genotype(s) Involved   Number of Independent Isolates   Uncorrected and Corrected Multiple Comparisons   Breakpoints^a                            Accession Numbers
Segment 1 (VP1)      R1                     2                                4.787E-18 and 1.349E-10                          35 – 1448 (1290–1498)                    KC579647, KC580176
Segment 1 (VP1)      R2                     Flagged in 285 samples.          1.305E-16 and 3.332E-09                          (541–711) – (1327–1445)                  KU199270^b
Segment 1 (VP1)      R2                     2                                1.756E-20 and 4.951E-13                          656 (541–711) – 2661 (2600–2735)         KU356662, KU356640
Segment 3 (VP3)      M2                     Flagged in 551 samples.          1.78E-12 and 4.982E-05                           1258 (872–1389) – 1798 (1662–1917)       KX655453
Segment 3 (VP3)      M1                     Flagged in 107 samples.          2.491E-20 and 6.946E-13                          2158 (2129–2174) – 2531 (undetermined)   KJ919553, KJ919517, KJ919551, KJ753665, JQ069727
Segment 7 (NSP1)     A8 (porcine)           3                                3.01E-37 and 8.622E-30                           (1434–25) – (573–596)                    KP753174, KJ753184, KP752951
Segment 8 (NSP2)     N1                     3                                3.689E-13 and 4.054E-06                          485 (455–499) – 889 (868–914)            KJ753657, KM026663, KM026664
Segment 9 (VP7)      G6 (bovine)            2                                6.240E-12 and 2.780E-05                          1048 (1032–128) – 481 (448–506)          HM591496, KF170899
Segment 9 (VP7)      G1 & G2                2                                5.507E-38 and 1.664E-30                          55 (994–60) – 291 (262–296)              KC443034, MG181727
Segment 9 (VP7)      G1 & G3                2                                1.4E-23 and 4.230E-16                            857 (833–859) – 1019 (991–51)            KJ751729, KP752817
Segment 9 (VP7)      G1 & G9                2                                5.292E-19 and 1.599E-11                          362 (346–366) – 589 (558–605)            AF281044, GQ433992

Segment 1 Intragenotypic Recombination Resulting in a New Lineage

Segment 1 (VP1; ~3,302 base pairs) showed evidence of a recombination event within the R2 clade that was fixed in the population and resulted in a new lineage (highlighted clade in Fig. 2; Fig. 3; Table 2). The multiple comparison (MC) uncorrected and corrected probabilities were 1.305E-16 and 3.332E-09, respectively. The sequence most closely related to the recombined sequence was KU199270, a human isolate from Bangladesh in 2010 (Aida, Nahar et al. 2016).
Phylogenetic analysis using tip calibration suggests that the recombination event occurred no later than 2000-2005 (node CIs), so if this is a true recombination event, the 2010 sequence is not the original recombinant. The recombinant region is 100% identical to an isolate also from Bangladesh in 2010 (KU248372) (Aida, Nahar et al. 2016), which also has a putative recombinant sequence in another region of its genome. The breakpoint regions (99% CI: (541-711) to (1327-1445)) may represent a potential hotspot for segment 1 recombination, as multiple recombination events show breakpoints in this area (Table 2). Phylogenetic analysis of the alignment of 250 representative sequences containing only the putative recombinant region compared with the rest of the segment 1 sequence showed a consistent subclade shift within the R2 clade. The recombination events resulted in the incorporation of the following amino acid substitutions in the recombinant strains when compared to the major parent strain: 227 (K→E), 293 (D→N), 297 (K→R), 305 (N→K), and 350 (K→E).
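
Substitution lists such as the one above can be derived by walking two aligned protein sequences position by position. The sketch below illustrates that comparison on made-up toy sequences; it does not use the actual VP1 sequences, and the function names and the `start` offset are hypothetical.

```python
def list_substitutions(parent, recombinant, start=1):
    """Compare two aligned protein sequences of equal length.

    Returns (position, parent_residue, recombinant_residue) tuples for every
    column where the residues differ. Columns with a gap ('-') in either
    sequence are skipped. `start` is the position number of the first column.
    """
    if len(parent) != len(recombinant):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (p, r) in enumerate(zip(parent, recombinant)):
        if p != r and p != "-" and r != "-":
            substitutions.append((start + offset, p, r))
    return substitutions


def format_substitution(position, parent_residue, recombinant_residue):
    """Render a substitution in the 'position (X→Y)' style used in the text."""
    return f"{position} ({parent_residue}→{recombinant_residue})"


# Toy aligned fragments (not real VP1 data), numbered from position 226.
toy_parent = "MKDLKN"
toy_recombinant = "MEDLRN"

toy_subs = list_substitutions(toy_parent, toy_recombinant, start=226)
print(", ".join(format_substitution(*s) for s in toy_subs))
```

Skipping gap columns keeps insertions and deletions out of the substitution list; whether that matches the treatment used for the reported substitutions is not stated in the text.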

[Figure: phylogenetic trees of VP1 (R2), with panels labeled "VP1 (R2) pos. 1-697, 1440-3302" and "VP1 (R2) pos. 698-1439". Tips are labeled by accession number, country, and collection date; nodes are labeled with posterior probabilities. Tree content not reproduced.]
KJ626751-Paraguay-2006-08 MG181766_Malawi_2012_06_05 0.7 KJ721724-Brazil-2007 1 MG181843_Malawi_2012_04_03 0.02 KJ721725-Brazil-2006 0.67 0.35 KJ753787_SouthAfrica_2009_06_03 0 KJ626965-Paraguay-2005-07 1 KJ753716_SouthAfrica_2008 1 KX171511-SouthKorea-2010 JF460812_USA_2006 1 KX171518-SouthKorea-2014 0.25LC228337_Japan_2016 0 0.03 KJ626938-Paraguay-2007-07 0.21 LC228359_Japan_2016 1 0.93 KC443029-USA-2006 LC228370_Japan_2016 0 KC442870-USA-2007 1 LC228348_Japan_2016 KJ721726-Brazil-2005 1 KC443211_Australia_2010_08_15 0.55 KJ626570-Paraguay-2006-09 1 KC443378_Australia_2010_11_24 KJ626829-Paraguay-2006-08 0.99 KC443620_Australia_2010_09_26 0.63 JQ069919-Canada-2008 0.08 1 KC443554_Australia_2010_05_19 0 0.97 HQ657171-SouthAfrica-2009 0.98 KC443266_Australia_2010_11_27 JQ069900-Canada-2007 0.99 JX965134_Australia_2010_11_04 0.89 0.99 JQ069876-Canada-2007 JQ069889_Canada_2007 0.87 0 KC178766-Italy-2006 1 1 KF636322_SouthAfrica_2009_05_18 0 0.04 KJ626773-Paraguay-2005-07 1 KJ753005_SouthAfrica_2010_06_15 0 0 KC443521-Australia-2007-05-02 1 KJ751801_SouthAfrica_2005 KP752779_Zambia_2009 KJ626604-Paraguay-2006-08 1 1 KC442985-USA-2005 0.83 KP752720_Swaziland_2010_07_14 KJ627161-Paraguay-2005-07 0.99 KJ753605_SouthAfrica_2007_05_15 0.85 0.99 JN706449-Thailand-2009 KJ752405_SouthAfrica_2007_02_20 0.84 JN706450-Thailand-2009 0.65 KC442903_USA_2007 0.4 JN706465-Thailand-2009 KC178770_Italy_2008 JN706448-Thailand-2009 KC443345_Australia_2001_06_29 0.54 0.01 1 KC443653_Australia_2000_09_14 0 0.99 KJ626715-Paraguay-2005-09 KJ626859-Paraguay-2005-09 0.18 MG181777_Malawi_2012_01_17 KJ626726-Paraguay-2005-07 0.14 MG181865_Malawi_2012_05_12 KC834713-Australia-1999-03-28 0.43 MG181898_Malawi_2012_03_29 KX363245-VietNam-2013-04-01 0.53 MG181832_Malawi_2012_03_14 1 0.81 0.74 1 0.76 KX363117-VietNam-2013-01-31 MG181788_Malawi_2012_02_11 KX362832-VietNam-2012-10-31 1 MG181854_Malawi_2012_04_27 0.99 KJ752372_Susscrofa_SouthAfrica_2007 0.06KX171510-SouthKorea-2010 0.96 0.25 
KX171513-SouthKorea-2011 0.97 KF636311_Zambia_2009_04_13 AB849000-Japan-2012 1 0.76 KJ752458_Zimbabwe_2010_06_26 1 KU925784-Taiwan-2011 KJ753811_SouthAfrica_2008_08_26 0.99 0.97 KF636357_Zambia_2008_10_19 0.2KF716358-USA-2011 0.16 KF716360-USA-2011 0.65 KJ751602_Tanzania_2010_10_15 0.74 1 KR701629-USA-2011 1 1 KP752571_Tanzania_2011_11_17 0.83 KJ751947_SouthAfrica_2009 1 MF184767-USA-2011 KX655495_Uganda_2013_01_04 KC443686-Australia-2010-09-01 1 1 KX655462_Uganda_2013_08_29 0.18 KC443543-Australia-2006-10-30 1 0.7 KF636245_Kenya_2009 0.19 KC443356-Australia-2011-05-25 0.13 0.96 JQ069907-Canada-2008 KJ752662_Uganda_2008_06_18 KC834711-Australia-2009-02-27 1 KJ753401_SouthAfrica_2005_09_26 0.94 KC178765-Italy-2004 0.89 1 KJ751635_Botswana_2007 GQ414540-Germany-2009-01 KF636345_Swaziland_2010_07_22 1 KC443719_Australia_2000_08_13 0.98KC443433-Australia-2000-09-01 0.24 0.11 KC443288-Australia-2000-09-17 0.21 KC443609_Australia_2000_07_12 1 KC443675_Australia_2000_08_04 0.6 KC443178-Australia-2000-09-12 KC443532_Australia_2000_09_11 0.17 KC443730-Australia-2000-09-10 KC443167-Australia-2000-07-27 0.37KJ751702_SouthAfrica_2000_02_14 0.25 KJ752216_SouthAfrica_2000_02_11 0.81 KJ751960-SouthAfrica-2003 0.24 KX655506-Uganda-2013-04-15 1 KJ751812_SouthAfrica_2000_05_04 0.47 KJ752684_SouthAfrica_2000_04_04 0.3 KJ751812-SouthAfrica-2000-05-04 KX655506_Uganda_2013_04_15 0.26 KJ752216-SouthAfrica-2000-02-11 1 1 KJ751960_SouthAfrica_2003 0.97KJ752684-SouthAfrica-2000-04-04 0.15 KJ751702-SouthAfrica-2000-02-14 0.91 1 KP752460_SouthAfrica_1999_12_05 0.96 KP752460-SouthAfrica-1999-12-05 KJ751658_SouthAfrica_1999_12_07 0.37 KJ751658-SouthAfrica-1999-12-07 0.98 KC834712_Australia_2004_02_28 KC834712-Australia-2004-02-28 0.48JN605437_SouthAfrica_1999 0.92 0.93 1 KJ752249_SouthAfrica_1999 0.31KJ752249-SouthAfrica-1999 0.99 KP752845-SouthAfrica-2009 KP752845_SouthAfrica_2009 JN605437-SouthAfrica-1999 KJ627011_Paraguay_1999_09

3.0 3.0

- 2 5 - 2 0 - 1 5 - 1 0 -5 0- 2 5 - 2 0 - 1 5 - 1 0 -5 0 214 215 Figure 2. Bayesian phylogenetic analysis of a possible recombination event in VP1 that became fixed in the 216 population. The strain closest to the recombination event is highlighted in red. Major clades are colored to 217 show the phylogenetic incongruity. Nodes are labeled with posterior probabilities with 95% node height 218 intervals shown. Time axis at the bottom is in years before 2017.


Figure 3. A) An RDP analysis (top) and a BootScan analysis (bottom) showing a putative recombination event between two R2 segment 1 genotypes. The red line compares the minor parent to the recombinant, the blue line compares the major parent to the recombinant, and the green line compares the major parent to the minor parent. The Y-axis for RDP (top) is the pairwise identity, while the Y-axis for BootScan (bottom) is the bootstrap support. The X-axis is the sequence position along segment 1. B) The relevant sites shown above, color-coded by the strain that the recombinant matches. The recombinant is the middle sequence, the minor parent is the bottom sequence, and the major parent is the top sequence. Mutations matching the major parent are shown in blue, while mutations matching the minor parent are shown in purple. Yellow marks mutations that are absent from the recombinant sequence but shared by the major and minor parents, possibly suggesting a second recombination event.

Segment 3 Intragenic Recombination

Two recombination events occurring in segment 3 (VP3) appear to have become fixed in the population, and are seen in many descendant sequences (Table 2). The first strain identified as a putative recombinant is KJ753665, an isolate from South Africa in 2004. The recombinant region extends from a start breakpoint located between positions 2129 and 2174 to position 2531. This region is 98.7% identical to a porcine strain isolated in Uganda in 2016 (KY055418) (Bwogi, Jere et al. 2017), while the rest of segment 3 is 95.8% similar to a 2009 human isolate from Ethiopia (KJ752028). The Monte Carlo uncorrected and corrected probabilities were 2.491E-20 and 6.946E-13, respectively, with 107 isolates flagged as possibly derived from the recombination event. Phylogenetic analysis of the putative recombinant region using 450 randomly selected VP3 sequences (excluding sequences lacking collection dates), together with an analysis of the genome excluding the two major recombination events, identified five sequences showing a significant phylogenetic incongruity (Fig. 4). The incongruity appeared between sub-lineages within the larger M1 lineage of VP3. Amino acid substitutions in the recombinant region included positions 748 (M→T) and 780 (T→M).
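The pairwise-identity traces described in the RDP panels of the figures can be illustrated with a minimal sliding-window sketch. This is not RDP4's implementation; the function names, window size, step, and toy sequences below are all assumptions chosen for illustration. The idea is simply that a recombinant should be highly identical to its major parent on one side of a breakpoint and to its minor parent on the other.

```python
def pairwise_identity(a, b):
    """Fraction of aligned columns (ignoring gap columns) where a and b agree."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def sliding_identity(a, b, window=20, step=20):
    """Return (window_midpoint, identity) pairs along two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(a) - window + 1, step):
        end = start + window
        results.append((start + window // 2, pairwise_identity(a[start:end], b[start:end])))
    return results


# Toy data (hypothetical sequences, not taken from the study):
# the recombinant copies the major parent up to position 40, then the minor parent.
major = "ACGT" * 20
minor = "TGCA" * 20
recombinant = major[:40] + minor[40:]

vs_major = sliding_identity(recombinant, major)
vs_minor = sliding_identity(recombinant, minor)
for (midpoint, id_major), (_, id_minor) in zip(vs_major, vs_minor):
    print(midpoint, id_major, id_minor)
```

In this toy case the two traces cross at the simulated breakpoint, which is the qualitative pattern the red and blue lines show in the RDP panels. Real analyses would use overlapping windows and statistical tests rather than raw identity alone.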

[Figure image: A) two phylogenetic trees with node support values, built from VP3 alignment positions 1-1259 plus 1800-2213 and from VP3 alignment positions 2214-2617; B) BootScan and RDP plots.]

Figure 4. Segment 3 recombination event supported by multiple isolates. A) Phylogenetic trees made using alignments from (left) nucleotide positions 1-1259 and 1800-2213, representing the major parent region excluding the second recombination event (Fig. 5), and (right) nucleotide positions 2214-2617, representing the minor parent sequence. B) BootScan (top) and RDP (bottom) analyses. The red line compares the minor parent to the recombinant, the blue line compares the major parent to the recombinant, and the green line compares the major parent to the minor parent. The Y-axis for the BootScan analysis (top) is the bootstrap support, while the Y-axis for the RDP analysis (bottom) is the pairwise identity. The X-axis for both analyses is the sequence position along segment 3.


[Figure image: A) two phylogenetic trees with node support values and clades labeled M1 and M2, with tips labeled by accession number, host (where non-human), country, and year; panel titles read "VP3 alignment position 1-1260, 1800-2213" and "VP3 alignment position 1260-1800"; B) BootScan and RDP plots.]

Figure 5. A) Phylogenetic analysis of segment 3. The left tree (major parent) was made from an alignment of nucleotide positions 1-1260 and 1900-2213, and the right tree (minor parent) was made from nucleotide positions 1260-1800. M2 strains have been collapsed, and the recombinant is colored in pink. B) BootScan (top) and RDP (bottom) analyses of the VP3 M2 putative recombinant. The red line compares the minor parent to the recombinant, the blue line compares the major parent to the recombinant, and the green line compares the major parent to the minor parent. The Y-axis for BootScan is the bootstrap support, while the Y-axis for the RDP analysis is the pairwise identity. The X-axis in both analyses is the segment 3 sequence.

A second potential recombination event was identified in segment 3 between sublineages within the M2 lineage (Table 2; Fig. 5). However, when we created split alignments and ran a phylogenetic analysis in BEAST v1.10.4, only one sequence showed a phylogenetic incongruity supporting this event (KX655453). Amino acid substitutions resulting from this event included positions 405 (I→V), 412 (V→M), 414 (N→D), 441 (N→D), 458 (I→V), 459 (I→T), 468 (L→F), 473 (N→D), 486 (M→I), 518 (N→S), and 519 (E→G).
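Substitution lists of the form "405 (I→V)" can be derived by comparing two aligned protein sequences column by column. The following is a minimal sketch of that comparison, not the authors' procedure; the function name, the `offset` parameter, and the short toy sequences are hypothetical and do not come from the VP3 data.

```python
def list_substitutions(parent, recombinant, offset=1):
    """List amino acid differences between two aligned protein sequences.

    `offset` is the residue number of the first aligned column (assumed input).
    Columns containing a gap in either sequence are skipped.
    """
    if len(parent) != len(recombinant):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for i, (p, r) in enumerate(zip(parent, recombinant)):
        if p != r and "-" not in (p, r):
            substitutions.append(f"{i + offset} ({p}\u2192{r})")
    return substitutions


# Toy fragment (made-up residues), numbered so the first column is residue 404.
toy_parent = "MIVNT"
toy_recombinant = "MVVDT"
print(", ".join(list_substitutions(toy_parent, toy_recombinant, offset=404)))
```

Comparing the recombinant against the major parent within the recombinant region, as done here in miniature, isolates the changes attributable to the exchanged fragment.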

One possible complication with relying on phylogenetic incongruity as evidence of major recombination events is that a recombinant virus may evolve faster than the parental strains and, despite possessing an initial fitness advantage such as immune avoidance, may eventually converge back towards the major parent’s sequence. Thus, phylogenetic analysis likely underestimates the frequency of transiently stable recombination events in a population. This phenomenon may account for the observation that the number of VP3 isolates flagged as deriving from a recombination event based on sequence analysis is greater than the prevalence of recombination predicted by the phylogenetic analysis.

Intergenotypic Recombination

We found strong evidence for intergenotypic recombination in all segments except segment 7 (NSP3) and segment 11 (NSP5) (Table 1; Supplementary Table 1). Instances of intergenotypic recombination in segment 1 (VP1) (KU714444, JQ988899) occurred only in regions where the amino acid sequence was highly conserved across genotypes, so these events resulted in few, if any, nonsynonymous mutations. Of the putative events observed in more than two environmental isolates with strong support from detection methods, only events in segment 9 (VP7) occurred between different genotypes (Table 2). The segment 9 recombinant region amino acid substitutions that match the minor parent and differ from the major parent are shown in Supplementary Table 2.

Structural Prediction of Recombinant Proteins

The protein models generated by I-TASSER for the intergenotypic recombinant G proteins showed that, although the amino acid changes for the G1-G2 recombinant were substantial (25 changes) (Supplementary Table 2), the secondary and tertiary structures seemed largely intact (Fig. 6). The G3-G1 and G6-G6 recombinants each showed four amino acid changes. The G9-G1 recombinant showed the most secondary structure disruption, including a loss of beta sheets, a loss of antiparallel beta sheets, and slightly shorter beta sheets/helices, which could indicate lower stability. However, based on the structural modeling, the tertiary structure appeared to be maintained, suggesting that the putative recombinant glycoproteins were able to form properly folded G-proteins (Fig. 6).

Figure 6. VP7 protein structures were predicted from amino acid alignments of the four strongly supported G recombinants using I-TASSER. C-scores are confidence scores estimating the quality of the predicted model, and range from -5 to 2, with higher scores indicating greater confidence. A) G2-G1: KC443034, C-score = -1.38; B) G9-G1: AF281044, C-score = -1.29; C) G3-G1: KJ751729, C-score = -1.27; D) G6-G6: KF170899, C-score = -1.42.

Antigenic epitope predictions generated by IEDB (Vita, Mahajan et al. 2019) and SVMTriP (Yao, Zhang et al. 2012), as well as a large study done on mammalian G-types (Ghosh, Chattopadhyay et al. 2012), showed that VP7 recombination occasionally results in amino acid substitutions in conserved epitopes (Supplementary Table 2). For example, the amino acid sequence RVNWKKWWQV is usually flagged as, or as part of, an epitope in most G types, including G2, G3, G4, G6, and G9, but not in G1. In the G2-G1 recombination event flagged in multiple isolates (Table 2; Figure 1), this region is altered (sometimes a KR substitution and sometimes multiple amino acid substitutions) (Supplementary Table 2). Moreover, despite containing highly variable regions, the region where the recombination occurred had low solvent accessibility. The G3-G1 recombinant had amino acid substitutions in two of four conserved epitope regions due to the recombination event. There was also a conserved epitope sequence around amino acids 297-316 in G9 proteins that was altered in the G9-G1 recombinant so that it no longer appeared as an epitope.

Structural predictions generated by I-TASSER suggest that, although the amino acid sequences may diverge, the protein folding and three-dimensional structures remain relatively conserved (Fig. 6). Thus, although the amino acid sequence substitutions do not result in significant changes to the protein structure, they may nevertheless reduce binding by antibodies or T-cell receptors, and may provide a selective advantage in allowing the virus to avoid immune surveillance.

Recombination Junctions Often Correspond to RNA Secondary Structure Elements

Breakpoint distribution plots showed the sequence regions with the most breakpoints.
These breakpoints 319 often corresponded to hairpins predicted by RNAalifold. Secondary RNA structure predictions for segment 9 320 (VP7) genotypes G1, G2, G6 and G9 are shown as mountain plots (Fig. 7). The breakpoints of the segment 9 321 recombination events correspond to areas leading to the peaks in the mountain plots (Fig. 7). The peaks indicate 322 a conserved hairpin loop, with the sequences leading up to the peak being the double-stranded portion of the 323 hairpin. Breakpoint distributions also appeared to correspond to secondary structure predictions made from 324 alignments of segment 1 (VP1) and segment (VP6) (Fig. 8). Segment 4 (VP4) RNA secondary structure 325 predictions (Supplementary Fig. 1) showed greater variation across genotypes, so while breakpoints did coincide 326 with secondary RNA structures as predicted by the models, there were not enough events in each genotype to 327 provide strong support that the breakpoints were correlated with secondary structure. The breakpoint 328 distributions for NSP1 and VP3 were also non-randomly distributed across the sequence (Supplementary Fig. 1), 329 however secondary structure predictions from these alignments were not consistent, so are not shown 330 (Supplementary Fig. 2).


[Figure 7 image. Panel A: VP7 mountain plots, subpanels i to v (genotype labels recovered from the image: G9, G6, G1, G2), each with an X-axis running from 0 to 1000. Panel B: see caption.]

Figure 7. A) Consensus mountain plots made in RNAalifold using the ViennaRNA package of the predicted RNA secondary structures based on alignments made using full sequences for each of the five segment 9 G types involved in the intergenotypic recombination events: (i) G9 (ii) G6 (iii) G3 (iv) G1 (v) G2. Peaks represent hairpin loops, slopes correspond to helices, and plateaus correspond to loops. The X-axis corresponds to the sequence of the segment. Each base-pairing is represented by a horizontal box where the height of the box corresponds with the thermodynamic likelihood of the pairing. The colors correspond to the variation of base pairings at that position. Red indicates the base pairs are highly conserved across all the sequences, and black indicates the least conservation of those base pairings. B) Breakpoint distribution plots made in RDP4 of putative recombinants in segment 9. The X-axis shows the position in the sequence, and the Y-axis shows the number of breakpoints per 50 nucleotide window. The highest peaks are around X = 115, 285, 1050.
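The binning behind a plot like panel B can be sketched as follows. This is a minimal illustration and not RDP4's own implementation: the function name `breakpoint_histogram`, the use of consecutive non-overlapping windows, the assumed segment length, and the example positions are all assumptions made for illustration (RDP4 may use a moving window and additional normalization).

```python
from collections import Counter


def breakpoint_histogram(breakpoints, seq_len, window=50):
    """Count breakpoints in consecutive, non-overlapping windows.

    breakpoints: 1-based nucleotide positions of inferred breakpoints.
    Returns a list of (window_start, window_end, count) tuples.
    """
    # Map each valid position to its window index and tally per window.
    counts = Counter(
        (pos - 1) // window for pos in breakpoints if 1 <= pos <= seq_len
    )
    n_windows = (seq_len + window - 1) // window  # ceiling division
    return [
        (i * window + 1, min((i + 1) * window, seq_len), counts.get(i, 0))
        for i in range(n_windows)
    ]


# Hypothetical breakpoint positions and segment length, for illustration only
# (these are not data from this study).
example_positions = [110, 118, 121, 280, 291, 1040, 1049]
hist = breakpoint_histogram(example_positions, seq_len=1062)
busiest = max(hist, key=lambda w: w[2])  # window with the most breakpoints
```

Windows with high counts would then be compared by eye, or statistically, with the helical regions flanking the peaks of the mountain plots in panel A.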



Figure 8. Consensus mountain plots overlaid with breakpoint distribution plots for recombination events in A) VP6 and B) VP1. Position in the nucleotide sequence is on the X-axis. The entropy curve is represented in green. The black curve represents the pairing probabilities, and the red curve represents the minimum free energy structure, with well-defined regions having low entropy.

Discussion

Detecting Recombination: Recognizing Type I and Type II Error

Apparent instances of recombination may actually be the result of convergent evolution, lineage-specific rate variation, sequencing error, poor , laboratory contamination or improper bioinformatics analysis (Worobey, Rambaut et al. 2002, Boni, de Jong et al. 2010, Bertrand, Töpel et al. 2012, Boni, Smith et al. 2012). Several steps can be taken to minimize incorrect attribution of viral recombination. Ideally, this process should begin at the time of sequencing. First, it should be confirmed that the originating sample did not come from a host infected with multiple genotypes of the same virus type. Prior to RNA extraction, single plaques should be repeatedly picked and plated (i.e., plaque purification) to ensure that multiple genotypes are not inadvertently sequenced. Similarly, care must be taken when sequencing multiple samples of the same virus to minimize the possibility of cross-contamination.

For sequences obtained from online repositories, such precautions are rarely possible. Instead, careful bioinformatics procedures can help minimize possible errors. As a typical first step in identifying recombination events, virus genome sequences are analyzed with software such as RDP4 (Martin, Murrell et al. 2015), but all software programs are prone to error. For example, programs may falsely identify a recombination event when none exists (type I error) or fail to detect a true recombination event when one exists (type II error). Several studies measured errors incurred by RDP4 in the analysis of the genomes of tick-borne encephalitis virus, a positive-sense RNA virus that rarely recombines (Norberg, Roth et al. 2013, Bertrand, Johansson et al. 2016). The results of the analyses indicated that recombination was overestimated in these viruses, and that certain detection methods were more prone to type I error (Norberg, Roth et al. 2013, Bertrand, Johansson et al. 2016). MaxChi, Chimaera, and SiScan showed higher false positive rates than other RDP4 programs, but had greater power to detect true recombination events. By contrast, 3Seq and GENECONV displayed lower false positive rates, but had the lowest power to detect true events.
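The trade-off between false positive rate and detection power described above can be made concrete with a small worked example. The sketch below assumes a benchmark in which the true recombination status of each test alignment is known (for example, from simulation). The function name `error_rates` and all counts are hypothetical and are not taken from the cited studies.

```python
def error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return (false_positive_rate, power) from detection counts.

    false_positive_rate = FP / (FP + TN)   (type I error rate)
    power               = TP / (TP + FN)   (1 - type II error rate)
    """
    negatives = false_pos + true_neg   # alignments with no true event
    positives = true_pos + false_neg   # alignments with a true event
    fpr = false_pos / negatives if negatives else float("nan")
    power = true_pos / positives if positives else float("nan")
    return fpr, power


# Hypothetical counts for two made-up detectors scored on the same benchmark;
# these are illustrative numbers, not results from any cited study.
sensitive = error_rates(true_pos=45, false_pos=10, true_neg=90, false_neg=5)
conservative = error_rates(true_pos=30, false_pos=2, true_neg=98, false_neg=20)
```

In this toy comparison, the first detector has the higher power but also the higher false positive rate, which is the qualitative pattern reported for MaxChi, Chimaera, and SiScan relative to 3Seq and GENECONV.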


False positives using RDP4 are especially common among closely related strains (Bertrand, Johansson et al. 2016). That said, recombination events are likely to occur between closely related strains given their close spatial/temporal proximity and genetic compatibility, so caution should be used in inferring events between highly dissimilar genotypes. When a positive recombination signal has been detected, it is essential to assess its statistical significance. However, in RDP4, the P-value of 0.05 does not correspond to a 5% rate of false positives (Bertrand, Johansson et al. 2016); therefore, we used a cut-off value of 10E-04 and focused only on events where at least six RDP4 programs detected the putative recombinant.

After identifying putative recombination events, additional strategies can be used to eliminate errors. For example, rates of type I and type II error increase with shorter recombinant regions (Boni, Zhou et al. 2008, Norberg, Roth et al. 2013, Bertrand, Johansson et al. 2016). Therefore, we ignored any putative recombination events of < 100 nt, with the exception of one isolate in segment 4 (NSP4) and one isolate in segment 5 (NSP5), as the putative recombinant regions in these isolates were in conserved regions at the ends of the respective segments (Supplementary Table 1). In addition, we visually inspected sequence alignments to exclude misaligned sequences (Boni, de Jong et al. 2010). Splitting alignments by major and minor parent followed by carefully parameterized BEAST runs may help distinguish genuine phylogenetic incongruity signals from spurious false positives. Furthermore, we checked for the presence of unique polymorphisms differing from the parent strains within the suspected recombination region, as they may provide evidence that recombination events are not laboratory artifacts. Presumably such substitutions would reflect subsequent adaptive evolution by the recombinant virus.

In addition, we noted how many times the same recombination event occurred across multiple samples, since false positive recombinants are likely to be present as single isolates in phylogenetic trees (Boni, de Jong et al. 2010). The more isolates showing the same event, the greater the probability that it represents a true recombination event, especially if the isolates were acquired and sequenced by different laboratories. Events that showed strong support, but were only isolated in one sequence, are noted (Supplementary Table 1) but not discussed, as it is difficult to rule out the possibility of type I error due to PCR or mosaic contig assembly (Boni, de Jong et al. 2010, Varsani, Lefeuvre et al. 2018).

Sequence metadata can also be used to identify unlikely recombination events. For an event to be plausible, the major and minor parents should have had the opportunity to coinfect the same host, which is only possible if they are congruent in time and space (Boni, de Jong et al. 2010). For example, one study identified influenza A virus strain A/Taiwan/4845/99 as a recombinant of A/Wellington/24/2000 and A/WSN/33 (He, Han et al. 2008). Given that the two parents were isolated 77 years apart in different parts of the world, it is exceedingly unlikely that this is a natural recombination event. Any putative recombination events should be carefully screened to determine if the parental strains could have plausibly interacted. In Supplementary Table 1, we include information on source species, year and place of isolation, % average nucleotide identity, and genogroup for all putative recombinants and their major and minor parents.

Naturally High Coinfection in Rotavirus A

Some features of rotavirus biology make recombination not only possible, but also relatively plausible. Rotaviruses are often released from cells as aggregates of approximately 5-15 particles contained within extracellular vesicles (Santiana, Ghosh et al. 2018). While it is not yet clear whether these extracellular vesicles can contain different rotavirus genotypes, they do allow for rotavirus coinfection even at low multiplicities of infection. Thus, the physical barriers to recombination in dsRNA viruses (Lai 1992) may be offset by the high rates of coinfection resulting from vesicle transmission of rotaviruses.

Furthermore, infection of hosts by multiple rotavirus strains appears to be relatively common. In a study of 100 children in the Detroit area, G and P typing, which identifies the serotype of the VP7 and VP4 proteins respectively, revealed that ~10% of patients were infected with multiple rotavirus A strains (Abdel-Haq, Thomas et al. 2003). Similarly high frequencies of G and P mixed genotype infections were observed in children sampled in India (three studies showing multiple G types in 11.3%, 12% and 21% of samples) (Husain, Seth et al. 1996, Jain, Das et al. 2001, Khetawat, Dutta et al. 2002), Spain (>11.4% of samples) (Sánchez-Fauquier, Montero et al. 2006), Kenya (5.9%) (Kiulia, Peenze et al. 2006), Africa (12%) (Mwenda, Ntoto et al. 2010), and Mexico (5.6% in 2010, 33.5% in 2012) (Anaya-Molina, De La Cruz Hernández et al. 2018). Even higher frequencies of mixed genotype infections were observed in whole genome studies. For example, among 39 Peruvian fecal samples genotyped using multiplexed PCR, 33 (84.6%) showed evidence of multiple rotavirus genotypes (Rojas, Dias et al. 2019). In another study, whole genome deep sequencing revealed that 15/61 (25%) samples obtained in Kenya contained multiple rotavirus genotypes (Mwanga, Nyaigoti et al. 2018).
Given the high genetic diversity of rotavirus populations (Kirkwood 2010, Ghosh and Kobayashi 2011, Sadiq, Bostan et al. 2018), and their proficiency in infecting a broad range of mammalian hosts including many domesticated species (Martella, Banyai et al. 2010, Doro, Farkas et al. 2015), the high frequencies of hosts infected with multiple genotypes are not entirely surprising. These coinfections present abundant opportunities for rotavirus recombination.

Rotavirus Recombination Generates Genetic Diversity

Homologous recombination has not previously been considered a significant driver of rotavirus genetic diversity and evolution (Ramig 1997, Woods 2015). Recombination is usually expected to be deleterious as the breakage of open reading frames may disrupt RNA secondary structure and alter protein functionality (Lai 1992, Simon-Loriere and Holmes 2011). However, recombination, as with reassortment (Ramig and Ward 1991, Iturriza-Gomara, Isherwood et al. 2001, Schumann, Hotzel et al. 2009, Ghosh and Kobayashi 2011, Jere, Chaguza et al. 2018), may further increase rotavirus genetic diversity due to epistatic interactions resulting in reassortant-specific or recombinant-specific mutations (Zeldovich, Liu et al. 2015). Formerly deleterious mutations may become beneficial when the genetic background changes, resulting in an increase in circulating pathogenically relevant viral strains.

In our study, most recombination events occurred between strains of the same genotype (Fig. 1; Table 2; Supplementary Table 1). This outcome is consistent with the expectation that intragenotypic recombination is more common, since it would be less likely to disrupt protein or secondary RNA structure. Nonetheless, intragenotypic recombination can have long-lasting effects on rotavirus genetic diversity. For example, we identified a recombinant sub-lineage within the R2 clade of segment 1, the polymerase-encoding segment (Fig. 2; Fig. 3; Table 2). As the same event is found in strains isolated years apart from geographically distant locations, we can infer that the resulting genotype was sufficiently fit to be maintained in the population and disperse widely (Supplementary Table 1). This finding suggests that this homologous recombination event has had a long-term effect on rotavirus diversity.

While comparatively less common, we observed instances of intergenotypic recombination in all segments with the exception of segment 7 (NSP3), the only segment where we observed no recombination events (Table 2; Supplementary Table 1). A previous study reported intergenotypic recombination events in segment 6 (VP6), segment 8 (NSP2), and segment 10 (NSP4) (Jere, Mlera et al. 2011), so our study adds to the number of segments able to tolerate intergenotypic recombination. Interestingly, the serotype proteins, VP4 and VP7, have the most different genotypes, with 51 and 36 respectively (Steger, Boudreaux et al. 2019). Given this genetic diversity, the chances of two viruses with different G or P types coinfecting a cell are substantially higher than for other segments. In addition, both VP4 and VP7 seem to be more prone to reassortment, and to tolerate more divergent genetic backgrounds or genome constellations (Martella, Ciarlet et al. 2003, Gentsch, Laird et al. 2005, McDonald, Matthijnssens et al. 2009, Patton 2012). This diversity and tolerance of many different genetic backgrounds implies that VP4 and VP7 may be more tolerant of recombination between divergent strains than the other segments. Our data seem to support this claim.
Specifically, we observed numerous instances of intergenotypic recombination in segment 9, the VP7 coding segment (Table 2; Supplementary Table 1). These events appear to be beneficial because the recombinant genotypes persisted in populations long enough for multiple samples showing the same event to be sampled (Table 2). For example, the same mosaic VP7 G6 gene (Fig. 8) was sequenced in multiple bovine strains, one isolated in 2009 (HM591496) and another in 2012 (KF170899), by two separate research groups. While still differentiable, the parents are closely related. However, in some instances, events between highly divergent genotypes seem to have been able to persist in populations. Both a 2014 Malawi isolate (MG181727) and a 2006 isolate from the United States (KC443034) showed a similar G1-G2 VP7 mosaic gene (Fig. 8). The fact that these two G genotypes are highly divergent from one another and were identified in different years in different locations by different research groups supports the contention that it is a true recombination event. Altogether, 22/24 instances of recombination in segment 9 occurred between different G serotypes (Supplementary Table 1). These examples indicate that mosaic genes formed from two divergent genotypes are relatively common and increase the diversity of circulating VP7 genotypes.

In addition, 7/16 instances of recombination in segment 4 (VP4) were intergenotypic. Recombination was observed between P4-P8, P6-P8, and P8-P14 serotypes (Supplementary Table 1). P4, P6 and P8 are all relatively closely related, being in the P[II] genogroup, while P14 is in the P[III] genogroup. These putative recombination events suggest that relevant serotype diversity in human hosts is expanding. However, as P4, P6, and P8 are also the dominant P types in human infections, there is a sampling bias towards this genogroup, as the genotypes within this group are more likely to coinfect humans, so caution should be taken with this conclusion.

Segment 5 (NSP1) also showed recombination between highly divergent strains. Not only is NSP1 not strictly required for (Hua, Chen et al. 1994), it is also the least conserved of all rotavirus proteins, including even the serotype proteins VP4 and VP7 (Arnold and Patton 2011), suggesting that intergenotypic recombination disrupting the protein’s amino acid sequence may be less likely to be deleterious.

Segments 7 (NSP3), 8 (NSP2), 10 (NSP4), and 11 (NSP5) all had low rates of recombination. These segments are short, so recombination is expected to be less likely; however, segment 9 (VP7), one of the smallest segments, defies this pattern, with a high number of observed events. The smaller segments may also be less able to tolerate recombination events due to the important roles they play during the formation and stabilization of the supramolecular RNA complex (Fajardo, Sung et al. 2015) during rotavirus packaging and assembly (Li, Manktelow et al. 2010, Suzuki 2015, Borodavka, Dykeman et al. 2017, Fajardo, Sung et al. 2017).

Generation of Escape Mutants

Recombination involving regions encoding conserved epitopes, especially in segments that encode proteins involved in host cell attachment and entry, may provide selective advantages to rotaviruses by allowing them to evade inactivation by host-produced antibodies. These escape mutants may be generally less fit than wildtype viruses, but competitively advantaged in hosts because of a lack of host recognition. Subsequent intrahost adaptation may then select for compensatory fitness-increasing mutations, allowing these strains to be competitive with circulating rotavirus strains.
Many of the recombination events observed in our study appeared to generate such escape mutants. For example, we found two instances of the same segment 9 (VP7) G1-G3 recombination event (Table 2; KJ751729 and KP752817) with amino acid changes in two of four conserved epitope regions due to the recombination event, which suggests that this strain prevailed in the population because it was better able to evade antibody neutralization.

Based on the I-TASSER structural predictions, the putative recombinant VP7 proteins detected in our survey appear able to fold properly and form functional proteins despite containing amino acid sequence from different ‘parental’ genotypes (Fig. 6). While the secondary structure appeared slightly altered (e.g., shorter beta sheets), the recombinant VP7 proteins generally maintained their three-dimensional shape. The selective advantage from swapping epitopes may outweigh any potential decrease in protein stability resulting from recombination.

We also identified many recombination events involving segment 4 (VP4). In order to infect cells, VP4 must be proteolytically cleaved to produce VP5* and VP8* (Arias, Romero et al. 1996). Most of the segment 4 recombination events involved the spike head of the VP8* protein or the spike body/stalk region of the VP5* protein (antigen domain). Escape mutant studies (Zhou, Burns et al. 1994, Ludert, Ruiz et al. 2002, Aoki, Settembre et al. 2009, Nair, Feng et al. 2017) for VP4 show the VP8* spike head recognizes histo-blood group antigens, which is one of rotavirus’s main host range expansion barriers (Huang, Xia et al. 2012, Hu, Sankaran et al. 2018, Lee, Dickson et al. 2018). VP5* mediates membrane penetration during cell entry (Yoder and Dormitzer 2006). VP4 recombinants therefore may help the virus expand host range and aid in immune evasion.

Segment 6 (VP6) also showed recombination events resulting in substantial amino acid changes. VP6 is a more conserved protein, but also an antigenic protein that interacts with naïve B cells (Parez, Garbarg-Chenon et al. 2004). This feature suggests that there may be selection for VP6 escape mutants to evade host immune responses. Structure analyses of VP6 indicated that it is relatively conserved across genotypes (Jiang, Tsunemitsu et al. 1992, Tang, Gilbert et al. 1997, Charpilienne, Lepault et al. 2002), which may explain why VP6 seems to have more frequent intergenotypic recombination. Conserved epitopes in VP6 exist around amino acid positions 197 to 214, and 308 to 316 (Aiyegbo, Eli et al. 2014).

Recombination in Other dsRNA Viruses

Recombination has had a significant impact on the diversity of other dsRNA reoviruses (He, Ding et al. 2010). One study of 692 complete bluetongue virus segments found evidence for at least 11 unique recombinant genotypes (1.6%) (He, Ding et al. 2010). The case for recombination among bluetongue viruses is strengthened by the fact that viruses containing the same (or similar) recombinant segments were isolated by different research groups in different countries at different times, indicating that the recombinant viruses persisted and spread following the recombination event (Carpi, Holmes et al. 2010, He, Ding et al. 2010). Another study found multiple possible instances of recombination in genome segment 8 (encoding NS2) of the epizootic hemorrhagic disease virus, a reovirus similar to bluetongue virus (Anthony, Maan et al. 2009). Several studies have reported recombination among the dsRNA rice black-streaked dwarf virus, which is also a member of the Reoviridae family, but infects plants (Li, Xia et al. 2013, Yin, Zheng et al. 2013). Putative recombinants were identified in six of the ten Southern rice black-streaked dwarf virus segments (Li, Xia et al. 2013, Yin, Zheng et al. 2013). Finally, intragenic recombination was observed in multiple isolates of the African horse sickness virus, an Orbivirus of the family Reoviridae (Ngoveni, van Schalkwyk et al. 2019). At least one of these events appeared in multiple subsequent lineages (Ngoveni, van Schalkwyk et al. 2019).

In a study of the family Birnaviridae, 1,881 sequences were analyzed for evidence of recombination (Hon, Lam et al. 2008). While no interspecies recombination was observed, at least eight putative instances of intraspecies recombination were observed among the infectious bursal disease viruses and the aquabirnaviruses (Hon, Lam et al. 2008). Subsequent studies focusing on the infectious bursal disease viruses supported these results, and identified additional potential recombination events (He, Ma et al. 2009, Jackwood 2012, Vukea, Willows-Munro et al. 2014). We note that birnaviruses’ genetic material is in a complex with ribonucleoprotein, while the genetic material of Reoviridae members is free in the virion, which could be a factor in differing rates of recombination across these dsRNA viruses.

Recombination has also been observed in dsRNA , including in the (Botella, Tuomivirta et al. 2015) and the Hypoviridae (Carbone, Liu et al. 2004, Linder-Basso, Dynek et al. 2005, Feau, Dutech et al. 2014) and the (Voth, Mairura et al. 2006). Recombination in Gammapartitivirus, which infects the fungus Gremmeniella abietina, may have permitted the virus to cross species borders (Botella, Tuomivirta et al. 2015). In Cryphonectria hypovirus 1, which infects the chestnut blight fungus, recombination was implicated in the spread of the virus in Europe (Feau, Dutech et al. 2014). Collectively, these studies, and those of other dsRNA families, suggest not only that recombination is possible, but also that it significantly impacts virus evolution. There is little doubt that this conclusion will be strengthened as more dsRNA viruses are discovered and/or sequenced.

Possible Mechanism of Recombination in Rotavirus

The precise mechanism for how rotavirus recombination occurs is unknown, but inferences can be made because many of the details regarding rotavirus replication and packaging have been resolved (McDonald and Patton 2011, Borodavka, Desselberger et al. 2018). The most-accepted hypothesis is that recombination takes place when the rotavirus +ssRNA is replicated after being packaged in the nucleocapsid (Esona, Roy et al. 2017, Jing, Zhang et al. 2018). For packaging and replication to occur, the eleven +ssRNA segments must join a protein complex consisting of the VP1 polymerase and the VP3 capping enzyme. Secondary RNA structures in the non-translated terminal regions (NTRs) aid in the formation of this supramolecular RNA complex (Fajardo, Sung et al. 2015, Borodavka, Dykeman et al. 2017) and determine whether the segments are packaged (Li, Manktelow et al. 2010, Suzuki 2015). Perhaps recombination occurs when multiple homologous RNA strands are joined in the same complex, allowing the homologous NTRs to partially hybridize. In this scenario, the VP1 polymerase replicates part of one strand before switching to the other, thus producing a recombined segment. This conjecture is supported by the fact that recombination tends to occur in segment regions where self-hybridization forms three-dimensional structures. Moreover, rotaviruses do not seem constrained from packaging extra genetic material (Desselberger 1996).
A similar form of template switching is seen in poliovirus, an ssRNA virus that exhibits high rates of recombination. The precise mechanism may differ in that case, however: in poliovirus, the polymerase may stall at a hairpin or other secondary structure and then switch to a different template (Tolskaya, Romanova et al. 1987). Further study is needed to determine the precise mechanism of recombination in rotavirus.
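The template-switching conjecture described above can be illustrated with a toy simulation. The following Python sketch is not an analysis from this study and is not a model of VP1 biochemistry. It assumes two pre-aligned homologous templates of equal length, a hypothetical set of structured positions (`switch_sites`) at which the polymerase may change templates, and a hypothetical per-site switching probability (`switch_prob`). For simplicity it copies the template sequence directly rather than synthesizing the complementary strand.

```python
import random


def copy_choice(template_a, template_b, switch_sites, switch_prob=0.5, rng=None):
    """Toy copy-choice model: copy one template position by position and,
    at each designated site, jump to the other template with probability
    switch_prob. All parameters are illustrative assumptions."""
    if len(template_a) != len(template_b):
        raise ValueError("templates must be aligned to the same length")
    rng = rng or random.Random()
    templates = (template_a, template_b)
    current = 0          # index of the template currently being copied
    product = []         # copied sequence, one character per position
    breakpoints = []     # positions at which a template switch occurred
    for pos in range(len(template_a)):
        if pos in switch_sites and rng.random() < switch_prob:
            current = 1 - current
            breakpoints.append(pos)
        product.append(templates[current][pos])
    return "".join(product), breakpoints


# Two artificial "parents" that differ at every position, so that the
# origin of each stretch of the product is easy to read.
parent_a = "A" * 30
parent_b = "B" * 30
hotspots = {10, 20}  # hypothetical structured regions

product, breakpoints = copy_choice(parent_a, parent_b, hotspots, switch_prob=1.0)
print(product)
print(breakpoints)
```

With `switch_prob=1.0` the polymerase switches at both hypothetical hot spots, so the product is a mosaic with a central stretch derived from the second parent. Lower values of `switch_prob` give a mixture of parental and recombinant products, all sharing the same candidate breakpoints, which is the kind of non-random junction distribution the conjecture predicts.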


References

Abdel-Haq, N. M., R. A. Thomas, B. I. Asmar, V. Zacharova and W. D. Lyman (2003). "Increased prevalence of G1P[4] genotype among children with rotavirus-associated gastroenteritis in metropolitan Detroit." J Clin Microbiol 41(6): 2680-2682.
Aida, S., S. Nahar, S. K. Paul, M. A. Hossain, M. R. Kabir, S. R. Sarkar, S. Ahmed, S. Ghosh, N. Urushibara, M. Kawaguchiya, M. S. Aung, A. Sumi and N. Kobayashi (2016). "Whole genomic analysis of G2P[4] human Rotaviruses in Mymensingh, north-central Bangladesh." Heliyon 2(9): e00168-e00168.
Aiyegbo, M. S., I. M. Eli, B. W. Spiller, D. R. Williams, R. Kim, D. E. Lee, T. Liu, S. Li, P. L. Stewart and J. E. Crowe, Jr. (2014). "Differential accessibility of a rotavirus VP6 epitope in trimers comprising type I, II, or III channels as revealed by binding of a human rotavirus VP6-specific antibody." J Virol 88(1): 469-476.
Anaya-Molina, Y., S. I. De La Cruz Hernández, A. E. Andrés-Dionicio, H. L. Terán-Vega, H. Méndez-Pérez, G. Castro-Escarpulli and H. García-Lozano (2018). "A one-step real-time RT-PCR helps to identify mixed rotavirus infections in Mexico." Diagnostic Microbiology and Infectious Disease 92(4): 288-293.
Anthony, S. J., N. Maan, S. Maan, G. Sutton, H. Attoui and P. P. C. Mertens (2009). "Genetic and phylogenetic analysis of the non-structural proteins NS1, NS2 and NS3 of epizootic haemorrhagic disease virus (EHDV)." Virus Research 145(2): 211-219.
Aoki, S. T., E. C. Settembre, S. D. Trask, H. B. Greenberg, S. C. Harrison and P. R. Dormitzer (2009). "Structure of rotavirus outer-layer protein VP7 bound with a neutralizing Fab." Science 324(5933): 1444-1447.
Arias, C. F., P. Romero, V. Alvarez and S. Lopez (1996). "Trypsin activation pathway of rotavirus infectivity." J Virol 70(9): 5832-5839.
Arnold, M. M. and J. T. Patton (2011). "Diversity of interferon antagonist activities mediated by NSP1 proteins of different rotavirus strains." J Virol 85(5): 1970-1979.
Banyai, K., G. Kemenesi, I. Budinski, F. Foldes, B. Zana, S. Marton, R. Varga-Kugler, M. Oldal, K. Kurucz and F. Jakab (2017). "Candidate new rotavirus species in Schreiber's bats, Serbia." Infect Genet Evol 48: 19-26.
Bernhart, S. H., I. L. Hofacker, S. Will, A. R. Gruber and P. F. Stadler (2008). "RNAalifold: improved consensus structure prediction for RNA alignments." BMC Bioinformatics 9: 474.
Bertrand, Y., M. Töpel, A. Elväng, W. Melik and M. Johansson (2012). "First Dating of a Recombination Event in Mammalian Tick-Borne Flaviviruses." PLOS ONE 7(2): e31981.
Bertrand, Y. J. K., M. Johansson and P. Norberg (2016). "Revisiting Recombination Signal in the Tick-Borne Encephalitis Virus: A Simulation Approach." PLOS ONE 11(10): e0164435.
Boni, M. F., M. D. de Jong, H. R. van Doorn and E. C. Holmes (2010). "Guidelines for identifying homologous recombination events in influenza A virus." PLoS One 5(5): e10434.
Boni, M. F., G. J. D. Smith, E. C. Holmes and D. Vijaykrishna (2012). "No evidence for intra-segment recombination of 2009 H1N1 influenza virus in swine." Gene 494(2): 242-245.
Boni, M. F., Y. Zhou, J. K. Taubenberger and E. C. Holmes (2008). "Homologous Recombination Is Very Rare or Absent in Human Influenza A Virus." Journal of Virology 82(10): 4807.
Borodavka, A., U. Desselberger and J. T. Patton (2018). "Genome packaging in multi-segmented dsRNA viruses: distinct mechanisms with similar outcomes." Current Opinion in Virology 33: 106-112.
Borodavka, A., E. C. Dykeman, W. Schrimpf and D. C. Lamb (2017). "Protein-mediated RNA folding governs sequence-specific interactions between rotavirus genome segments." Elife 6.


Botella, L., T. T. Tuomivirta, J. Hantula, J. J. Diez and L. Jankovsky (2015). "The European race of Gremmeniella abietina hosts a single species of Gammapartitivirus showing a global distribution and possible recombinant events in its history." Fungal Biology 119(2): 125-135.
Bwogi, J., K. C. Jere, C. Karamagi, D. K. Byarugaba, P. Namuwulya, F. N. Baliraine, U. Desselberger and M. Iturriza-Gomara (2017). "Whole genome analysis of selected human and animal rotaviruses identified in Uganda from 2012 to 2014 reveals complex genome reassortment events between human, bovine, caprine and porcine strains." PLoS One 12(6): e0178855.
Cao, D., M. Barro and Y. Hoshino (2008). "Porcine rotavirus bearing an aberrant gene stemming from an intergenic recombination of the NSP2 and NSP5 genes is defective and interfering." Journal of virology 82(12): 6073-6077.
Carbone, I., Y. C. Liu, B. I. Hillman and M. G. Milgroom (2004). "Recombination and migration of Cryphonectria hypovirus 1 as inferred from gene genealogies and the coalescent." Genetics 166(4): 1611-1629.
Carpi, G., E. C. Holmes and A. Kitchen (2010). "The Evolutionary Dynamics of Bluetongue Virus." Journal of Molecular Evolution 70(6): 583-592.
Charpilienne, A., J. Lepault, F. Rey and J. Cohen (2002). "Identification of rotavirus VP6 residues located at the interface with VP2 that are essential for capsid assembly and transcriptase activity." Journal of virology 76(15): 7822-7831.
Desselberger, U. (1996). "Genome rearrangements of rotaviruses." Advances in Virus Research 46: 69-95.
Desselberger, U. (2014). "Rotaviruses." Virus Research 190: 75-96.
Donker, N. C., K. Boniface and C. D. Kirkwood (2011). "Phylogenetic analysis of rotavirus A NSP2 gene sequences and evidence of intragenic recombination." Infect Genet Evol 11(7): 1602-1607.
Donker, N. C. and C. D. Kirkwood (2012). "Selection and evolutionary analysis in the nonstructural protein NSP2 of rotavirus A." Infection Genetics and Evolution 12(7): 1355-1361.
Doro, R., S. L. Farkas, V. Martella and K. Banyai (2015). "Zoonotic transmission of rotavirus: surveillance and control." Expert Review of Anti-Infective Therapy 13(11): 1337-1350.
Drummond, A. J. and A. Rambaut (2007). "BEAST: Bayesian evolutionary analysis by sampling trees." BMC Evolutionary Biology 7(1): 214.
Edgar, R. C. (2004). "MUSCLE: multiple sequence alignment with high accuracy and high throughput." Nucleic Acids Res 32(5): 1792-1797.
Esona, M. D., S. Roy, K. Rungsrisuriyachai, J. Sanchez, L. Vasquez, V. Gomez, L. A. Rios, M. D. Bowen and M. Vazquez (2017). "Characterization of a triple-recombinant, reassortant rotavirus strain from the Dominican Republic." Journal of General Virology 98(2): 134-142.
Fajardo, T., Jr., P. Y. Sung and P. Roy (2015). "Disruption of Specific RNA-RNA Interactions in a Double-Stranded RNA Virus Inhibits Genome Packaging and Virus Infectivity." PLoS Pathog 11(12): e1005321.
Fajardo, T., P. Y. Sung, C. C. Celma and P. Roy (2017). "Rotavirus Genomic RNA Complex Forms via Specific RNA-RNA Interactions: Disruption of RNA Complex Inhibits Virus Infectivity." Viruses 9(7).
Feau, N., C. Dutech, J. Brusini, D. Rigling and C. Robin (2014). "Multiple introductions and recombination in Cryphonectria hypovirus 1: perspective for a sustainable biological control of chestnut blight." Evolutionary applications 7(5): 580-596.
Gentsch, J. R., A. R. Laird, B. Bielfelt, D. D. Griffin, K. Banyai, M. Ramachandran, V. Jain, N. A. Cunliffe, O. Nakagomi, C. D. Kirkwood, T. K. Fischer, U. D. Parashar, J. S. Bresee, B. Jiang and R. I. Glass (2005). "Serotype diversity and reassortment between human and animal rotavirus strains: implications for rotavirus vaccine programs." J Infect Dis 192 Suppl 1: S146-159.


Ghosh, A., S. Chattopadhyay, M. Chawla-Sarkar, P. Nandy and A. Nandy (2012). "In silico study of rotavirus VP7 surface accessible conserved regions for antiviral drug/vaccine design." PLoS One 7(7): e40749.
Ghosh, S. and N. Kobayashi (2011). "Whole-genomic analysis of rotavirus strains: current status and future prospects." Future Microbiology 6(9): 1049-1065.
Hatcher, E. L., S. A. Zhdanov, Y. Bao, O. Blinkova, E. P. Nawrocki, Y. Ostapchuck, A. A. Schaffer and J. R. Brister (2017). "Virus Variation Resource - improved response to emergent viral outbreaks." Nucleic Acids Res 45(D1): D482-D490.
He, C.-Q., L.-Y. Ma, D. Wang, G.-R. Li and N.-Z. Ding (2009). "Homologous recombination is apparent in infectious bursal disease virus." Virology 384(1): 51-58.
He, C. Q., N. Z. Ding, M. He, S. N. Li, X. M. Wang, H. B. He, X. F. Liu and H. S. Guo (2010). "Intragenic recombination as a mechanism of genetic diversity in bluetongue virus." J Virol 84(21): 11487-11495.
He, C. Q., G. Z. Han, D. Wang, W. Liu, G. R. Li, X. P. Liu and N. Z. Ding (2008). "Homologous recombination evidence in human and swine influenza A viruses." Virology 380(1): 12-20.
Hon, C. C., T. T. Lam, C. W. Yip, R. T. Wong, M. Shi, J. Jiang, F. Zeng and F. C. Leung (2008). "Phylogenetic evidence for homologous recombination within the family Birnaviridae." J Gen Virol 89(Pt 12): 3156-3164.
Hu, L., B. Sankaran, D. R. Laucirica, K. Patil, W. Salmen, A. C. M. Ferreon, P. S. Tsoi, Y. Lasanajak, D. F. Smith, S. Ramani, R. L. Atmar, M. K. Estes, J. C. Ferreon and B. V. V. Prasad (2018). "Glycan recognition in globally dominant human rotaviruses." Nat Commun 9(1): 2631.
Hua, J., X. Chen and J. T. Patton (1994). "Deletion mapping of the rotavirus metalloprotein NS53 (NSP1): the conserved cysteine-rich region is essential for virus-specific RNA binding." J Virol 68(6): 3990-4000.
Huang, P., M. Xia, M. Tan, W. Zhong, C. Wei, L. Wang, A. Morrow and X. Jiang (2012). "Spike protein VP8* of human rotavirus recognizes histo-blood group antigens in a type-specific manner." J Virol 86(9): 4833-4843.
Husain, M., P. Seth, L. Dar and S. Broor (1996). "Classification of rotavirus into G and P types with specimens from children with acute diarrhea in New Delhi, India." Journal of Clinical Microbiology 34(6): 1592.
Iturriza-Gomara, M., B. Isherwood, U. Desselberger and J. Gray (2001). "Reassortment in vivo: driving force for diversity of human rotavirus strains isolated in the United Kingdom between 1995 and 1999." J Virol 75(8): 3696-3705.
Jackwood, D. J. (2012). "Molecular epidemiologic evidence of homologous recombination in infectious bursal disease viruses." Avian Dis 56(3): 574-577.
Jain, V., B. K. Das, M. K. Bhan, R. I. Glass and J. R. Gentsch (2001). "Great diversity of group A rotavirus strains and high prevalence of mixed rotavirus infections in India." J Clin Microbiol 39(10): 3524-3529.
Jere, K. C., C. Chaguza, N. Bar-Zeev, J. Lowe, C. Peno, B. Kumwenda, O. Nakagomi, J. E. Tate, U. D. Parashar, R. S. Heyderman, N. French, N. A. Cunliffe, M. Iturriza-Gomara and V. Consortium (2018). "Emergence of Double- and Triple-Gene Reassortant G1P[8] Rotaviruses Possessing a DS-1-Like Backbone after Rotavirus Vaccine Introduction in Malawi." Journal of Virology 92(3).
Jere, K. C., L. Mlera, N. A. Page, A. A. van Dijk and H. G. O'Neill (2011). "Whole genome analysis of multiple rotavirus strains from a single stool specimen using sequence-independent amplification and 454(R) pyrosequencing reveals evidence of intergenotype genome segment recombination." Infect Genet Evol 11(8): 2072-2082.


Jiang, B., H. Tsunemitsu, J. R. Gentsch, R. I. Glass, K. Y. Green, Y. Qian and L. J. Saif (1992). "Nucleotide sequence of gene 5 encoding the inner capsid protein (VP6) of bovine group C rotavirus: comparison with corresponding genes of group C, A, and B rotaviruses." Virology 190(1): 542-547.
Jing, Z., X. Zhang, H. Shi, J. Chen, D. Shi, H. Dong and L. Feng (2018). "A G3P[13] porcine group A rotavirus emerging in China is a reassortant and a natural recombinant in the VP4 gene." Transboundary and Emerging Diseases 65(2): e317-e328.
Khetawat, D., P. Dutta, S. K. Bhattacharya and S. Chakrabarti (2002). "Distribution of rotavirus VP7 genotypes among children suffering from watery diarrhea in Kolkata, India." Virus Res 87(1): 31-40.
Kirkwood, C. D. (2010). "Genetic and antigenic diversity of human rotaviruses: potential impact on vaccination programs." Journal of Infectious Diseases 202: S43-S48.
Kiulia, N. M., I. Peenze, J. Dewar, A. Nyachieo, M. Galo, E. Omolo, A. D. Steele and J. M. Mwenda (2006). "Molecular characterisation of the rotavirus strains prevalent in Maua, Meru North, Kenya." East Afr Med J 83(7): 360-365.
Lai, M. M. (1992). "RNA recombination in animal and plant viruses." Microbiol Rev 56(1): 61-79.
Lee, B., D. M. Dickson, A. C. deCamp, E. Ross Colgate, S. A. Diehl, M. I. Uddin, S. Sharmin, S. Islam, T. R. Bhuiyan, M. Alam, U. Nayak, J. C. Mychaleckyj, M. Taniuchi, W. A. Petri, Jr., R. Haque, F. Qadri and B. D. Kirkpatrick (2018). "Histo-Blood Group Antigen Phenotype Determines Susceptibility to Genotype-Specific Rotavirus Infections and Impacts Measures of Rotavirus Vaccine Efficacy." J Infect Dis 217(9): 1399-1407.
Li, W., E. Manktelow, J. C. von Kirchbach, J. R. Gog, U. Desselberger and A. M. Lever (2010). "Genomic analysis of codon, sequence and structural conservation with selective biochemical-structure mapping reveals highly conserved and dynamic structures in rotavirus with potential cis-acting functions." Nucleic Acids Res 38(21): 7718-7735.
Li, Y., Z. Xia, J. Peng, T. Zhou and Z. Fan (2013). "Evidence of recombination and genetic diversity in southern rice black-streaked dwarf virus." Arch Virol 158(10): 2147-2151.
Linder-Basso, D., J. N. Dynek and B. I. Hillman (2005). "Genome analysis of Cryphonectria hypovirus 4, the most common hypovirus species in North America." Virology 337(1): 192-203.
Lorenz, R., S. H. Bernhart, C. Honer Zu Siederdissen, H. Tafer, C. Flamm, P. F. Stadler and I. L. Hofacker (2011). "ViennaRNA Package 2.0." Algorithms Mol Biol 6: 26.
Ludert, J. E., M. C. Ruiz, C. Hidalgo and F. Liprandi (2002). "Antibodies to rotavirus outer capsid glycoprotein VP7 neutralize infectivity by inhibiting virion decapsidation." J Virol 76(13): 6643-6651.
Lukashev, A. N. (2005). "Role of recombination in evolution of enteroviruses." Rev Med Virol 15(3): 157-167.
Martella, V., K. Banyai, J. Matthijnssens, C. Buonavoglia and M. Ciarlet (2010). "Zoonotic aspects of rotaviruses." Veterinary Microbiology 140(3-4): 246-255.
Martella, V., M. Ciarlet, A. Pratelli, S. Arista, V. Terio, G. Elia, A. Cavalli, M. Gentile, N. Decaro, G. Greco, M. A. Cafiero, M. Tempesta and C. Buonavoglia (2003). "Molecular analysis of the VP7, VP4, VP6, NSP4, and NSP5/6 genes of a buffalo rotavirus strain: identification of the rare P[3] rhesus rotavirus-like VP4 gene allele." Journal of clinical microbiology 41(12): 5665-5675.
Martin, D. P., B. Murrell, M. Golden, A. Khoosal and B. Muhire (2015). "RDP4: Detection and analysis of recombination patterns in virus genomes." Virus Evol 1(1): vev003.
Martinez-Laso, J., A. Roman, M. Rodriguez, I. Cervera, J. Head, I. Rodriguez-Avial and J. J. Picazo (2009). "Diversity of the G3 genes of human rotaviruses in isolates from Spain from 2004 to 2006: cross-species transmission and inter-genotype recombination generates alleles." J Gen Virol 90(Pt 4): 935-943.


Matthijnssens, J., J. Bilcke, M. Ciarlet, V. Martella, K. Banyai, M. Rahman, M. Zeller, P. Beutels, P. Van Damme and M. Van Ranst (2009). "Rotavirus disease and vaccination: impact on genotype diversity." Future Microbiology 4(10): 1303-1316.
Matthijnssens, J., M. Ciarlet, E. Heiman, I. Arijs, T. Delbeke, S. M. McDonald, E. A. Palombo, M. Iturriza-Gomara, P. Maes, J. T. Patton, M. Rahman and M. Van Ranst (2008). "Full genome-based classification of rotaviruses reveals a common origin between human Wa-Like and porcine rotavirus strains and human DS-1-like and bovine rotavirus strains." J Virol 82(7): 3204-3219.
Matthijnssens, J., M. Ciarlet, S. M. McDonald, H. Attoui, K. Bányai, J. R. Brister, J. Buesa, M. D. Esona, M. K. Estes, J. R. Gentsch, M. Iturriza-Gómara, R. Johne, C. D. Kirkwood, V. Martella, P. P. C. Mertens, O. Nakagomi, V. Parreño, M. Rahman, F. M. Ruggeri, L. J. Saif, N. Santos, A. Steyer, K. Taniguchi, J. T. Patton, U. Desselberger and M. Van Ranst (2011). "Uniformity of rotavirus strain nomenclature proposed by the Rotavirus Classification Working Group (RCWG)." Archives of Virology 156(8): 1397-1413.
Matthijnssens, J., M. Ciarlet, M. Rahman, H. Attoui, K. Bányai, M. K. Estes, J. R. Gentsch, M. Iturriza-Gómara, C. D. Kirkwood, V. Martella, P. P. C. Mertens, O. Nakagomi, J. T. Patton, F. M. Ruggeri, L. J. Saif, N. Santos, A. Steyer, K. Taniguchi, U. Desselberger and M. Van Ranst (2008). "Recommendations for the classification of group A rotaviruses using all 11 genomic RNA segments." Archives of virology 153(8): 1621-1629.
Matthijnssens, J., E. Heylen, M. Zeller, M. Rahman, P. Lemey and M. Van Ranst (2010). "Phylodynamic Analyses of Rotavirus Genotypes G9 and G12 Underscore Their Potential for Swift Global Spread." Molecular Biology and Evolution 27(10): 2431-2436.
Matthijnssens, J., P. H. Otto, M. Ciarlet, U. Desselberger, M. Van Ranst and R. Johne (2012). "VP6-sequence-based cutoff values as a criterion for rotavirus species demarcation." Archives of Virology 157(6): 1177-1182.
McDonald, S. M., J. Matthijnssens, J. K. McAllen, E. Hine, L. Overton, S. Wang, P. Lemey, M. Zeller, M. Van Ranst, D. J. Spiro and J. T. Patton (2009). "Evolutionary dynamics of human rotaviruses: balancing reassortment with preferred genome constellations." PLOS Pathogens 5(10): e1000634.
McDonald, S. M., M. I. Nelson, P. E. Turner and J. T. Patton (2016). "Reassortment in segmented RNA viruses: mechanisms and outcomes." Nature Reviews Microbiology 14(7): 448-460.
McDonald, S. M. and J. T. Patton (2011). "Assortment and packaging of the segmented rotavirus genome." Trends Microbiol 19(3): 136-144.
Mihalov-Kovacs, E., A. Gellert, S. Marton, S. L. Farkas, E. Feher, M. Oldal, F. Jakab, V. Martella and K. Banyai (2015). "Candidate new rotavirus species in sheltered dogs, Hungary." Emerg Infect Dis 21(4): 660-663.
Minin, V. N., E. W. Bloomquist and M. A. Suchard (2008). "Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics." Mol Biol Evol 25(7): 1459-1471.
Mwanga, M., C. A. Nyaigoti, R. Njeru, D. J. Nokes, C. Matthew, V. T. M. Phan and E. C. Holmes (2018). "Unbiased whole genome deep sequencing of Rotavirus Group A positive samples from rural Kenya, 2012-14 reveals high frequency of coinfection and genetic reassortment." International Journal of Infectious Diseases 73: 203.
Mwenda, J. M., K. M. Ntoto, A. Abebe, C. Enweronu-Laryea, I. Amina, J. McHomvu, A. Kisakye, E. M. Mpabalwani, I. Pazvakavambwa, G. E. Armah, L. M. Seheri, N. M. Kiulia, N. Page, M.-A. Widdowson and A. Duncan Steele (2010). "Burden and Epidemiology of Rotavirus Diarrhea in Selected African Countries: Preliminary Results from the African Rotavirus Surveillance Network." The Journal of Infectious Diseases 202(Supplement_1): S5-S11.


Nair, N., N. Feng, L. K. Blum, M. Sanyal, S. Ding, B. Jiang, A. Sen, J. M. Morton, X. S. He, W. H. Robinson and H. B. Greenberg (2017). "VP4- and VP7-specific antibodies mediate heterotypic immunity to rotavirus in humans." Sci Transl Med 9(395).
Ngoveni, H. G., A. van Schalkwyk and J. J. O. Koekemoer (2019). "Evidence of Intragenic Recombination in African Horse Sickness Virus." Viruses 11(7).
Norberg, P., A. Roth and T. Bergstrom (2013). "Genetic recombination of tick-borne flaviviruses among wild-type strains." Virology 440(2): 105-116.
Parez, N., A. Garbarg-Chenon, C. Fourgeux, F. Le Deist, A. Servant-Delmas, A. Charpilienne, J. Cohen and I. Schwartz-Cornil (2004). "The VP6 protein of rotavirus interacts with a large fraction of human naive B cells via surface immunoglobulins." J Virol 78(22): 12489-12496.
Parra, G. I., K. Bok, M. Martinez and J. A. Gomez (2004). "Evidence of rotavirus intragenic recombination between two sublineages of the same genotype." J Gen Virol 85(Pt 6): 1713-1716.
Patton, J. T. (2012). "Rotavirus diversity and evolution in the post-vaccine world." Discovery Medicine 13(68): 85-97.
Patton, J. T., R. Vasquez-Del Carpio, M. A. Tortorici and Z. F. Taraporewala (2007). "Coupling of rotavirus genome replication and capsid assembly." Adv Virus Res 69: 167-201.
Pérez-Losada, M., M. Arenas, J. C. Galán, F. Palero and F. González-Candelas (2015). "Recombination in viruses: Mechanisms, methods of study, and evolutionary consequences." Infection, Genetics and Evolution 30: 296-307.
Phan, T. G., S. Okitsu, N. Maneekarn and H. Ushijima (2007). "Evidence of intragenic recombination in G1 rotavirus VP7 genes." J Virol 81(18): 10188-10194.
Phan, T. G., S. Okitsu, N. Maneekarn and H. Ushijima (2007). "Genetic heterogeneity, evolution and recombination in emerging G9 rotaviruses." Infection Genetics and Evolution 7(5): 656-663.
Rahman, M., J. Matthijnssens, X. Yang, T. Delbeke, I. Arijs, K. Taniguchi, M. Iturriza-Gomara, N. Iftekharuddin, T. Azim and M. Van Ranst (2007). "Evolutionary history and global spread of the emerging G12 human rotaviruses." Journal of Virology 81(5): 2382-2390.
Rambaut, A. (2019). "Figtree v1.4.4."
Rambaut, A., A. J. Drummond, D. Xie, G. Baele and M. A. Suchard (2018). "Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7." Systematic Biology 67(5): 901-904.
Ramig, R. F. (1997). "Genetics of the rotaviruses." Annual Review of Microbiology 51: 225-255.
Ramig, R. F. and R. L. Ward (1991). "Genomic segment reassortment in rotaviruses and other reoviridae." Advances in Virus Research 39: 163-207.
Rojas, M., H. G. Dias, J. L. S. Gonçalves, A. Manchego, R. Rosadio, D. Pezo and N. Santos (2019). "Genetic diversity and zoonotic potential of rotavirus A strains in the southern Andean highlands, Peru." Transboundary and Emerging Diseases 0(0).
Roy, A., A. Kucukural and Y. Zhang (2010). "I-TASSER: a unified platform for automated protein structure and function prediction." Nat Protoc 5(4): 725-738.
Sadiq, A., N. Bostan, K. C. Yinda, S. Naseem and S. Sattar (2018). "Rotavirus: Genetics, pathogenesis and vaccine advances." Reviews in Medical Virology 28(6).
Sánchez-Fauquier, A., V. Montero, S. Moreno, M. Solé, J. Colomina, M. Iturriza-Gomara, A. Revilla, I. Wilhelmi, J. Gray and V.-N. G. Gegavi (2006). "Human rotavirus G9 and G3 as major cause of diarrhea in hospitalized children, Spain." Emerging infectious diseases 12(10): 1536-1541.
Santiana, M., S. Ghosh, B. A. Ho, V. Rajasekaran, W. L. Du, Y. Mutsafi, D. A. De Jesus-Diaz, S. V. Sosnovtsev, E. A. Levenson, G. I. Parra, P. M. Takvorian, A. Cali, C. Bleck, A. N. Vlasova, L. J. Saif, J. T. Patton, P. Lopalco, A. Corcelli, K. Y. Green and N. Altan-Bonnet (2018). "Vesicle-Cloaked Virus Clusters Are Optimal Units for Inter-organismal Viral Transmission." Cell Host Microbe 24(2): 208-220.e208.


Schumann, T., H. Hotzel, P. Otto and R. Johne (2009). "Evidence of interspecies transmission and reassortment among avian group A rotaviruses." Virology 386(2): 334-343.
Simon-Loriere, E. and E. C. Holmes (2011). "Why do RNA viruses recombine?" Nature Reviews Microbiology 9(8): 617-626.
Steger, C. L., C. E. Boudreaux, L. E. LaConte, J. B. Pease and S. M. McDonald (2019). "Group A Rotavirus VP1 Polymerase and VP2 Core Shell Proteins: Intergenotypic Sequence Variation and In Vitro Functional Compatibility." Journal of Virology 93(2): e01642-01618.
Suchard, M. A., P. Lemey, G. Baele, D. L. Ayres, A. J. Drummond and A. Rambaut (2018). "Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10." Virus Evol 4(1): vey016.
Suzuki, Y. (2015). "A candidate packaging signal of human rotavirus differentiating Wa-like and DS-1-like genomic constellations." Microbiol Immunol 59(9): 567-571.
Suzuki, Y., T. Gojobori and O. Nakagomi (1998). "Intragenic recombinations in rotaviruses." FEBS Lett 427(2): 183-187.
Tang, B., J. M. Gilbert, S. M. Matsui and H. B. Greenberg (1997). "Comparison of the rotavirus gene 6 from different species by sequence analysis and localization of subgroup-specific epitopes using site-directed mutagenesis." Virology 237(1): 89-96.
Tolskaya, E. A., L. I. Romanova, V. M. Blinov, E. G. Viktorova, A. N. Sinyakov, M. S. Kolesnikova and V. I. Agol (1987). "Studies on the recombination between RNA genomes of poliovirus: the primary structure and nonrandom distribution of crossover regions in the genomes of intertypic poliovirus recombinants." Virology 161(1): 54-61.
Varsani, A., P. Lefeuvre, P. Roumagnac and D. Martin (2018). "Notes on recombination and reassortment in multipartite/segmented viruses." Curr Opin Virol 33: 156-166.
Vita, R., S. Mahajan, J. A. Overton, S. K. Dhanda, S. Martini, J. R. Cantrell, D. K. Wheeler, A. Sette and B. Peters (2019). "The Immune Epitope Database (IEDB): 2018 update." Nucleic Acids Res 47(D1): D339-D343.
Voth, P. D., L. Mairura, B. E. Lockhart and G. May (2006). "Phylogeography of Ustilago maydis virus H1 in the USA and Mexico." Journal of General Virology 87(11): 3433-3441.
Vukea, P. R., S. Willows-Munro, R. F. Horner and T. H. T. Coetzer (2014). "Phylogenetic analysis of the polyprotein coding region of an infectious South African bursal disease virus (IBDV) strain." Infection, Genetics and Evolution 21: 279-286.
Woods, R. J. (2015). "Intrasegmental recombination does not contribute to the long-term evolution of group A rotavirus." Infection Genetics and Evolution 32: 354-360.
Worobey, M., A. Rambaut, O. G. Pybus and D. L. Robertson (2002). "Questioning the evidence for genetic recombination in the 1918 "Spanish flu" virus." Science 296(5566): 211 discussion 211.
Yang, J. and Y. Zhang (2015). "Protein Structure and Function Prediction Using I-TASSER." Curr Protoc Bioinformatics 52: 5.8.1-15.
Yao, B., L. Zhang, S. Liang and C. Zhang (2012). "SVMTriP: a method to predict antigenic epitopes using support vector machine to integrate tri-peptide similarity and propensity." PLoS One 7(9): e45152.
Yin, X., F. Q. Zheng, W. Tang, Q. Q. Zhu, X. D. Li, G. M. Zhang, H. T. Liu and B. S. Liu (2013). "Genetic structure of rice black-streaked dwarf virus populations in China." Arch Virol 158(12): 2505-2515.
Yoder, J. D. and P. R. Dormitzer (2006). "Alternative intermolecular contacts underlie the rotavirus VP5* two- to three-fold rearrangement." The EMBO journal 25(7): 1559-1568.
Zeldovich, K. B., P. Liu, N. Renzette, M. Foll, S. T. Pham, S. V. Venev, G. R. Gallagher, D. N. Bolon, E. A. Kurt-Jones, J. D. Jensen, D. R. Caffrey, C. A. Schiffer, T. F. Kowalik, J. P. Wang and R. W. Finberg (2015). "Positive Selection Drives Preferred Segment Combinations during Influenza Virus Reassortment." Mol Biol Evol 32(6): 1519-1532.
Zhang, Y. (2008). "I-TASSER server for protein 3D structure prediction." BMC Bioinformatics 9: 40.
Zhou, Y. J., J. W. Burns, Y. Morita, T. Tanaka and M. K. Estes (1994). "Localization of rotavirus VP4 neutralization epitopes involved in antibody-induced conformational changes of virus structure." J Virol 68(6): 3955-3964.

SUPPLEMENTARY INFORMATION

Supplementary Table 1. Putative recombination events identified by RDP4 for whole genome Rotavirus A sequences. A total of 117 events were identified by at least 6/7 RDP4 programs. Shaded rows indicate strongly supported events (identified by 7/7 RDP4 programs or found in multiple sequenced isolates). Events that were flagged by RDP4 but were not strongly supported are not listed here.

Segment Accession Number Breakpoint Cis p-Values Representative Representative Minor (Excluding Gaps) Major Parent Parent Sequence, Source Sequence, Source Species, Year of Species, Year of Isolation, % ANI, Isolation, % ANI, Genogroup Genogroup VP1 JQ988899, Homo sapiens, 970 (781-986) - RDP 1.02E-43 KX632342, Homo GQ414540, Homo Croatia, 2006 1734 (1693-1751) GENECONV. 8.72E-48 sapiens, Uganda, sapiens, Germany, 2009, BootScan 4.17E-32 93.7%, R1 98.7%, R2 MaxChi. 1.15E-17 Chimera 1.05E-17 SiScan 4.30E-19 3Seq 2.96E-09 VP1 KU739903, Sus scrofa, 1-(747-805) RDP 1.19E-24 KU739900, Sus KU739904, Sus scrofa, Taiwan, 2015 GENECONV 1.41E-21 scrofa, Taiwan, 2015, Taiwan, 2015, 99.1%, R1 BootScan. 5.14E-21 95.8%, R1 MaxChi. 8.41E-16 Chimera 3.57E-16 SiScan. 
9.64E-16 3Seq 2.83E-09 VP1 KF812721, Homo sapiens, 2504 (2464-2537) - RDP 2.58E-22 JX027681, Homo JF796734, Sus scrofa, South Korea, 2010 3268 (3232-116) GENECONV 2.12E-08 sapiens, Australia, South Korea, 2006, BootScan 4.71E-22 2007, 98.1%, R1 94.6%, R1 MaxChi 9.96E-16 Chimera 1.47E-13 SiScan 5.10E-09 3Seq 2.05E-02 VP1 JQ069926, Homo sapiens, 2498 (2356-2507) - RDP 9.14E-22 JX195074, Homo JX027681, Homo sapiens, Canada, 2008 3268 GENECONV 5.71E-19 sapiens, Italy, 2010, Australia, 2007, 99.6%, BootScan 1.00E-18 98.1%, R1 R1 MaxChi 1.30E-15 Chimera 2.51E-15 SiScan. 7.42E-23 3Seq 2.78E-25 VP1 KM454503, Equus 840 (802-951) - RDP 2.77E-15 Unknown, closest = KM454492, Equus callabus, USA, 1981 3281 GENECONV 1.36E-19 DQ838638, unknown, callabus, United BootScan 4.79 E-21 R2 Kingdom, 1976, 99.1%, MaxChi 1.55E-17 R2 SiScan 1.27E-40 3Seq 8.22E-05 VP1 KU714444, Homo sapiens, 1656 (1636-1673) - RDP 1.35E-18 JN706454, Homo KX655451, Homo Malawi, 2000 1752 (1740-1766) GENECONV 1.25E-17 sapiens, Thailand, sapiens, Uganda, 2013, BootScan 3.76E-11 2009, 97.4%, R1 97.9%, R2 MaxChi 3.81E-02 Chimera 4.33E-02 3Seq 5.67E-09 VP1 JQ069936, Homo sapiens, 1670 (1531-1703) - RDP 1.83E-14 KU248405, Homo HQ657171, Homo Canada, 2009 2419 (2308-2436) GENECONV 7.68E-13 sapiens, Bangladesh, sapiens, South Africa, MaxChi 2.73E-08 2010, 98.7%, R2 2009, 99.7%, R2

31 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Chimera 2.48E-08 SiScan 2.65E-08 3Seq 5.15E-19 VP1 KU739903, Sus scrofa, 1543 (138-1588) - RDP 3.06E-13 KU739900, Sus KU739904, Sus scrofa, Taiwan, 2015 2220 (2160-2260) GENECONV 6.19E-04 scrofa, Taiwan, 2015, Taiwan, 2015, 96.5%, R1 BootScan 9.40E-11 97.8%, R1 MaxChi 2.98E-08 Chimera 5.67E-08 SiScan 1.19 E-06 3Seq 2.83E-09 VP1 JQ988899, Homo sapiens, 222 (3094-362) - RDP 3.76E-12 KJ559215, Homo Unknown, closest = Croatia, 2006 969 (882-NA) GENECONV 4.88E-10 sapiens, Argentina, JX195074, Homo sapiens, MaxChi 7.74E-09 1998, 96.8%, R1 Italy, 2010, R1 Chimera 4.5E-07 SiScan 6.32E-14 3Seq 5.64E-09 VP1 KC580176, Homo sapiens, 1471 (1328-1521) - RDP 9.77E-12 KC580526, Homo KC580601, Homo USA, 1988 3270 GENECONV 3.81E-08 sapiens, USA, 1979, sapiens, USA, 1988, MaxChi 6.16E-12 99%, R1 100%, R1 Chimera 2.36E-04 SiScan 7.42E-23 3Seq 1.91E-20 VP1 JQ069918, Homo sapiens, 604 (541-662) - RDP 2.42E-10 KJ751834, Homo JQ069924, Homo sapiens, Canada, 2008 956 (889-1068) GENECONV 3.25E-08 sapiens, South Africa, Canada, 2008, 99.7%, R1 BootScan 2.38E-09 2009, 99.5%, R1 MaxChi 5.62E-03 Chimera 5.62E-03 SiScan 1.10E-04 3Seq 1.13E-08 VP1 KC579647, Homo sapiens, 35 – 1448 (1290- GENECONV 4.30E-04 KC580601, Homo KC579509, Homo USA, 1988 1498) BootScan 2.63E-05 sapiens, USA, 1988, sapiens, USA, 1989, MaxChi 2.34E-09 100%, R1 99.4%, R1 Chimera 1.41E-08 SiScan 6.55E-11 3Seq 4.34E-19 VP1 KU199270, Homo sapiens, 670 (541-711) - Flagged for 285 KX536654, Homo KU356640, Homo Bangladesh, 2010 1403 (1327-1445) sequences, sapiens, India, 2011, sapiens, Bangladesh, p-value = 3.33E-09 98.6%, R2 2013, 100%, R2 VP1 KU356662, KU356640, 656 (541-711) - RDP 5.87E-08 KU356607, Homo KC178768, Homo Homo sapiens, 2013 2661 (2600-2735) GENECONV 1.43E-04 sapiens, Bangladesh, sapiens, Italy, 2007, MaxChi 9.02E-10 2013, 99.5%, R2 99.6%, R2 SiScan 6.71E-20 3Seq 6.88 E-23 VP1 KU199270, Homo sapiens, 3282 (3116-50) - GENECONV 3.94E-06 JQ069920, Homo KJ751889, Homo sapiens, Bangladesh, 2010 1410 (1363-1478) BootScan 
2.03E-04 sapiens, Canada, Ethiopia, 2009, 99.1%, R2 MaxChi 1.20E-09 2008, 99.5%, R2 Chimera 3.36E-10 SiScan2.57E-09 3Seq 5.81E-18 VP2 KJ753425, Homo sapiens, 599 (561-623) - RDP 6.28E-47 JF490148, Homo KP752972, Homo Uganda, 2011 991 (967-1019) GENECONV 5.13E-45 sapiens, Australia, sapiens, Gambia, 2008, BootScan 2.53E-44 2004, 98.3%, C1 99.7%, C2 MaxChi 5.39E-12 Chimera 6.10E-12 SiScan 8.38E-15 3Seq 1.96E-09 VP2 KU714445, Homo sapiens, 1155(1139-1170) - RDP 2.37E-41 KP752983, Homo KU714456, Homo Malawi, 2000 1887 (1862-1908) GENECONV 1.62E-33 sapiens, South Africa, sapiens, Malawi, 2000, BootScan 5.28E-39 2009, 94.3%, C1 99.9%, C2 MaxChi 1.26E-16 Chimera 1.42E-16 SiScan 8.61E-25 3Seq 1.96E-09 VP2 KX655439, Homo sapiens, 2713 (2672-2717) - RDP 1.34E-25 Unknown, closest = MG181349, Homo Uganda, 2013 158 (143-171) GENECONV 9.67E-24 KX632343, Homo sapiens, Malawi, 2002, MaxChi 1.54E-03 sapiens, Uganda, 99.4%, C1 Chimera 8.56E-04 2013, C2 SiScan 1.61E-05 3S— 2.15E--08 VP2 KJ753617, Homo sapiens, 548 (464-560) - RDP 5.31E-27 KJ752329, Homo KX655439, Homo South Africa, 2004 784 (754-842) GENECONV 4.03E-24 sapiens, South Africa, sapiens, Uganda, 2013, BootScan 7.13E-23 2003, 99.4%, C1 98.4%, C2

32 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

MaxChi 3.85E-05 Chimera 4.51E-05 3Seq 1.96E-09 VP2 HM988959, Bos taurus, 558 (524-586) - RDP 1.14E-21 JQ069842, Homo JX971570, Bos taurus, South Korea, Unknown 2685 GENECONV 2.68E-25 sapiens, Canada, South Korea, 2004, 100%, BootScan 2.00 E-21 2008, 95.2%, C1 C1 MaxChi 1.29E-05 Chimera 1.11E-03 SiScan 3.88E-15 3Seq 5.94E-21 VP2 MG181591, Homo sapiens, 1129 (1106-1140) - RDP 3.47 E-21 KJ753828, Homo KJ753425, Homo sapiens, Malawi, 2013 1299 (1292-1318) GENECONV 2.01E-19 sapiens, Zimbabwe, Uganda, 2011, 99.4%, C1 BootScan 4.15E-15 unknown, 99.1%, C2 MaxChi 4.04E-03 Chimera 3.95E-03 3Seq 1.96E-09 VP2 KX632343, Homo sapiens, 2666 (2614-39) - RDP 9.89E-21 JN258812, Homo KJ753402, Homo sapiens, Uganda, 2013 593 (560-612) GENECONV 2.26 E-17 sapiens, Belgium, South Africa, 2005, BootScan 5.20E-19 2000, 91.7%, C1 99.5%, C2 MaxChi 4.65E-13 Chimera 3.18 E-13 SiSeq 2.53E-24 3Seq 1.37E-05 VP2 KX632343, Homo sapiens, 1357 (1322-1368) - RDP 1.14E-21 MG181371, Homo KJ870879, Homo sapiens, Uganda, 2013 1808 (1790-1830) GENECONV 3.44E-17 sapiens, Malawi, Democratic Republic of MaxChi 1.69E-11 2002, 95.3%, C1 the Congo, 99.3%, C2 Chimera 3.57E-10 SiScan 2.84E-14 3Seq 1.96E-09 VP2 KY055428, Capra 983 (960-1009) - RDP 1.64E-18 JN831232, Bos KF636257, Bos taurus, aegagrus, Uganda, 2014 1238 (1194-1243) GENECONV 1.12E-16 taurus, South Africa, South Africa, 2007, BootScan 2.13E-10 2007, 94.7%, C2 99.6%, C2 MaxChi 1.67E-05 Chimera 4.73E-04 SiScan 4.35E-05 3Seq 1.96E-09 VP2 KJ870901, Homo sapiens, 2330 (2304-2359) - RDP 1.96E-18 KY055417, Sus LC367260, Homo Democratic Republic of the 2648 (2589-1) GENECONV 2.99E-16 scrofa, Uganda, 2016, sapiens, Nepal, 2008, Congo, Unknown BootScan 5.51E-16 95%, C1 99.4%, C1 MaxChi 1.71E-03 Chimera 7.67E-04 SiScan 6.29E-06 3Seq 1.96E-09 VP2 KC257092, Camelus 2047 (2017-2102) - RDP 2.53E-16 Unknown, closest = FJ347123, Lama dromedaries, Sudan, 2002 733 (661-754) GENECONV 4.72E-06 DQ480724, Homo guanicoe, Argentina, BootScan 7.95E-23 sapiens, Iran, 1998, 
99.9%, C2 MaxChi 2.68E-13 unknown, C2 Chimera 8.40E-12 SiScan 8.51E-11 3Seq 9.20E-08 VP2 KX632343, Homo sapiens, 1962 (1932-1978) RDP 1.44E-13 JN706477, Homo KX655439, Homo Uganda, 2013 – 2133 (2099- GENECONV 7.46E-10 sapiens, Thailand, sapiens, Uganda, 2013, 2136) MaxChi 3.26E-06 2010, 93%, C1 97.7%, C2 Chimera 4.13E-05 SiScan 1.49E-07 3Seq1.96E-09 VP2 KJ627064, Homo sapiens, 2675 (2615-95) - RDP 7.01E-06 KJ752074, Homo KJ752859, Homo sapiens, Paraguay, 2002 409 (370-449) GENECONV 2.53E-03 sapiens, South Africa, Zimbabwe, 2011, 100%, BootScan 1.26E-04 2005, 98.5%, C1 C1 MaxChi 4.49E-02 Chimera 3.89E-02 SiScan 1.20E-02 3Seq 9.21E-04 VP3 KJ412679, Homo sapiens, 1 (2497-3) - 1068 RDP 1.04E-72 JX185760, Homo KJ412622, Homo sapiens, Paraguay, 2009 (1047-1080) GENECONV 1.11E-59 sapiens, Italy, 2007, Paraguay, 2009, 99.9%, BootScan 1.41E-73 97.8%, M1 M3 MaxChi 3.58E-31 Chimera 5.31E-31 SiScan 6.77E-38 3Seq 2.83E-09


| VP3 | KP007164, Homo sapiens, Philippines, 2012 | 968 (968-996) - 1753 (1727-1777) | RDP 1.97E-38; GENECONV 2.93E-35; BootScan 2.16E-37; MaxChi 2.45E-16; Chimera 1.76E-15; SiScan 2.85E-19; 3Seq 2.83E-19 | KY000551, Homo sapiens, Germany, 2016, 99.4%, M2 | KP882081, Homo sapiens, Bangladesh, 2007, 99.7%, M2 |
| VP3 | KX171542, Homo sapiens, South Korea, 2014 | 1465 (1432-1538) - 2482 (2461-59) | RDP 1.40E-37; GENECONV 7.17E-32; BootScan 2.84E-35; MaxChi 2.03E-19; Chimera 3.57E-19; SiScan 3.27E-23; 3Seq 2.83E-09 | KJ752827, Homo sapiens, South Africa, 2003, 98.8%, M2 | KX171538, Homo sapiens, South Korea, 2012, 99.7%, M2 |
| VP3 | KX171543, Homo sapiens, South Korea, 2014 | 1650 (1607-1688) - 2472 (2455-59) | RDP 5.19E-37; GENECONV 2.46E-33; BootScan 2.81E-34; MaxChi 1.95E-17; Chimera 1.05E-17; SiScan 1.55E-19; 3Seq 2.83E-09 | KU248374, Homo sapiens, Bangladesh, 2010, 99.1%, M2 | KC443523, Homo sapiens, Australia, 2007, 99.6%, M2 |
| VP3 | AB779630, Sus scrofa, Thailand, 2008 | 1590 (1578-1645) - 90 (2490-96) | RDP 3.19E-33; GENECONV 1.69E-25; BootScan 2.39E-31; MaxChi 2.05E-16; Chimera 3.42E-15; SiScan 9.38E-25; 3Seq 1.42E-24 | MG781054, Sus scrofa, 2010-2011, Thailand, 92.7%, M1 | AB779632, Sus scrofa, Thailand, 2009, 99.9%, M1 |
| VP3 | MG181592, Homo sapiens, Malawi, 2013 | 2334 (2318-2349) - 196 (150-204) | RDP 1.09E-32; GENECONV 6.85E-18; BootScan 5.72E-27; MaxChi 6.16E-12; Chimera 9.20E-12; SiScan 3.92E-10; 3Seq 3.54E-27 | KC442894, Homo sapiens, USA, 2008, 96.4%, M2 | KT919912, Homo sapiens, USA, 2013, 99%, M1 |
| VP3 | JN013991, Homo sapiens, South Africa, 2008 | 82 (2550-99) - 260 (253-268) | RDP 1.69E-29; GENECONV 8.98E-27; BootScan 1.24E-17; MaxChi 9.47E-06; Chimera 1.58E-05; SiScan 1.76E-02; 3Seq 5.67E-09 | KJ752816, Homo sapiens, South Africa, 2011, 97.4%, M1 | KP883027, Homo sapiens, Mali, 2008, 99.4%, M2 |
| VP3 | KU356576, Homo sapiens, Bangladesh, 2010 | 844 (815-855) - 1194 (1142-1219) | RDP 4.55E-24; GENECONV 1.69E-21; BootScan 3.82E-20; MaxChi 1.89E-09; Chimera 1.77E-09; SiScan 1.18E-07; 3Seq 2.83E-09 | KC178787, Homo sapiens, Italy, 2008, 97.4%, M2 | KP882081, Homo sapiens, Bangladesh, 2007, 100%, M2 |
| VP3 | JQ0692727, Homo sapiens, Canada, 2007 | 51 (2482-59) - 845 (817-932) | RDP 4.27E-23; GENECONV 9.70E-19; BootScan 6.96E-18; MaxChi 8.76E-14; Chimera 4.71E-14; SiScan 2.58E-17; 3Seq 3.98E-35 | JQ069748, Homo sapiens, Canada, 2008, 99.1%, M1 | JQ069762, Homo sapiens, Canada, 2008, 99.7%, M1 |
| VP3 | JQ069754, Homo sapiens, Canada, 2008 | 835 (811-879) - 1621 (1591-1631) | RDP 6.24E-20; GENECONV 4.13E-17; BootScan 4.05E-17; MaxChi 1.69E-09; Chimera 1.31E-09; SiScan 1.91E-10; 3Seq 1.71E-27 | KJ627117, Homo sapiens, Paraguay, 1999, 99%, M1 | KP645258, Homo sapiens, Australia, 2010, 99.7%, M1 |
| VP3 | MG181592, Homo sapiens, Malawi, 2013 | 1736 (1712-1755) - 1941 (1889-1953) | RDP 2.97E-19; GENECONV 7.80E-18; BootScan 5.65E-15; MaxChi 5.47E-04; Chimera 5.27E-04; SiScan 9.43E-07; 3Seq 2.83E-09 | KC178787, Homo sapiens, Italy, 2008, 97%, M2 | LC374087, Homo sapiens, Nepal, 2009, 100%, M1 |


| VP3 | KJ753665, Homo sapiens, South Africa, 2004 | 2158 (2127-2201) - 2517 (2447-64) | Flagged in 107, KA p-value = 2.23E-18, Global p-value 5.69E-11 | KJ753481, Homo sapiens, Ethiopia, 2012, 94.9%, M1 | KY055418, Sus scrofa, Uganda, 2016, 98.6%, M1 |
| VP3 | AB374145, Bos taurus, Japan, Unknown | 810 (765-875) - 1544 (1486-1639) | RDP 1.85E-08; GENECONV 1.83E-04; BootScan 3.64E-05; MaxChi 1.68E-06; Chimera 3.72E-06; SiScan 2.14E-07; 3Seq 4.41E-10 | LC340015, Homo sapiens, Japan, 2014, 95.5%, M2 | KJ411434, Homo sapiens, USA, 2012, 97.4%, M2 |
| VP3 | KJ753665, Homo sapiens, South Africa, 2004 | 1212 (1140-1312) - 1732 (1632-1801) | RDP 3.84E-07; GENECONV 3.50E-05; BootScan 1.92E-05; MaxChi 2.67E-05; Chimera 1.59E-05; SiScan 3.24E-09; 3Seq 2.08E-04 | HQ392255, Homo sapiens, Belgium, 2008, 96.2%, M1 | KX778614, Homo sapiens, Dominican Republic of the Congo, 2013, 99.4%, M1 |
| VP3 | KJ753665, Homo sapiens, South Africa, 2004 | 580 (498-612) - 922 (831-964) | GENECONV 7.32E-11; BootScan 1.64E-09; MaxChi 3.90E-05; Chimera 3.85E-06; SiScan 9.81E-09; 3Seq 7.82E-20 | KP752825, Homo sapiens, South Africa, 2002, 99.6%, M1 | JQ069760, Homo sapiens, Canada, 2008, 99.7%, M1 |
| VP3 | KX655453, Homo sapiens, Uganda, 2013 | 1258 (872-1389) - 1798 (1662-1917) | Flagged in 551 sequences, Global p-value = 1.78E-12 | KP882576, Homo sapiens, Ghana, 2009, 94.9%, M2 | KU714446, Homo sapiens, Malawi, 2000, 96.7%, M2 |
| VP4 | KX655509, Homo sapiens, Uganda, 2013 | 321 (308-323) - 683 (616-634); VP8* domain | RDP 2.16E-54; GENECONV 1.17E052; BootScan 1.59E-51; MaxChi 7.47E-19; Chimera 1.44E-18; SiScan 8.23E-23; 3Seq 2.07E-09 | KP902549, Homo sapiens, South Africa, 2003, 99.3%, P8 | KJ52687, Homo sapiens, South Africa, 2000, 99.3%, P8 |
| VP4 | KX655487, Homo sapiens, Uganda, 2013 | 334 (326-347) - 682 (670-698); VP8* domain | RDP 1.06E-55; GENECONV 5.94E-51; BootScan 1.77E-50; MaxChi 4.55E-19; Chimera 5.96E-19; SiScan 1.60E-17; 3Seq 2.07E-09 | KX646644, Homo sapiens, India, 2012, 96.6%, P6 | MG181340, Homo sapiens, Malawi, 2002, 99.7%, P8 |
| VP4 | MG181725, Homo sapiens, Malawi, 2014 | 374 (367-382) - 542 (526-553); VP8* domain | GENECONV 5.45E-46; BootScan 2.45E-41; MaxChi 4.02E-11; Chimera 3.97E-11; SiScan 2.83E-13; 3Seq 2.07E-09 | MG181758, Homo sapiens, Malawi, 2014, 99.2%, P8 | MG181890, Homo sapiens, Malawi, 2012, 100%, P4 |
| VP4 | KU714447, Homo sapiens, Malawi, 2000 | 1380 (1378-1381) - 1562 (1543-1571) | RDP 1.47E-39; GENECONV 4.07E-38; BootScan 104E-29; MaxChi 1.24E-07; Chimera 3.25E-07; SiScan 7.45E-12; 3Seq 4.14E-09 | GQ869840, Homo sapiens, Malawi, 2000, P8 | KU714458, Homo sapiens, Malawi, 2000, 99.5%, P14 |
| VP4 | KX655441, Homo sapiens, Uganda, 2013 | 374 (357-382) - 484 (475-488); VP8* domain | RDP 5.41E-34; GENECONV 1.32E-31; BootScan 2.38E-30; MaxChi 1.11E-05; Chimera 6.90E-06; SiScan 4.86E-10; 3Seq 2.07E-09 | LC260226, Homo sapiens, 2015-2016, Indonesia, 99.2%, P6 | JQ069692, Homo sapiens, Canada, 2009, 99.1%, P8 |
| VP4 | MG181637, Homo sapiens, Malawi, 2013 | 350 (325-368) - 459 (464-466); VP8* domain | GENECONV 5.13E-28; BootScan 5.75E-15; MaxChi 2.13E-04; Chimera 1.00E-04; SiScan 4.71E-05; 3Seq 4.14E-09 | MG181758, Homo sapiens, Malawi, 2014, 99.5%, P8 | KJ870892, Homo sapiens, Democratic Republic of the Congo, 2007-2010, 95.5%, P6 |
| VP4 | JQ069674, Homo sapiens, Canada, 2008 | 1568 (1541-1574) - 2329 (2251-101) | RDP 1.59E-23; GENECONV 2.53E-21; BootScan 1.59E-23; MaxChi 2.96E-12; Chimera 2.13E-12; SiScan 7.86E-14; 3Seq 1.44E-35 | HQ392253, Homo sapiens, Belgium, 2008, 99.5%, P8 | HM773714, Homo sapiens, USA, 2008, 99.5%, P8 |
| VP4 | KC579762, Homo sapiens, USA, 1991 | 1073 (1039-1124) - 2332 (2269-8) | RDP 4.67E-23; GENECONV 4.72E-17; BootScan 8.46E-23; MaxChi 1.99E-14; Chimera 1.76E-11; SiScan 7.07E-25; 3Seq 4.04E-40 | HQ392253, Homo sapiens, Belgium, 2008, 98.7%, P8 | FJ947895, Homo sapiens, USA, 1991, 99.9%, P8 |
| VP4 | MG181659, Homo sapiens, Malawi, 2013 | 1116 (1064-1146) - 1462 (1402-1501) | RDP 6.53E-11; GENECONV 2.20E-09; BootScan 1.08E-09; MaxChi 1.781E-05; Chimera 1.74E-05; SiScan 5.94E-06 | MG181725, Homo sapiens, Malawi, 2013, 98.1%, P8 | KJ752208, Homo sapiens, South Africa, 2012, 98.3%, P4 |
| VP4 | JF490554, Homo sapiens, Australia, 2005 | 462 (397-482) - 683 (637-722) | RDP 2.17E-10; GENECONV 8.75E-09; BootScan 1.89E-10; MaxChi 3.26E-04; Chimera 1.04E-03; SiScan 1.14E-04; 3Seq 4.14E-09 | KJ751716, Homo sapiens, Gambia, 2010, 98.4%, P8 | JQ069674, Homo sapiens, Canada, 2008, 99.5%, P8 |
| VP4 | AB008291, Homo sapiens, 1995 | 2291 (2223-71) - 807 (762-849) | RDP 1.02E-03; BootScan 4.42E-02; MaxChi 5.00E-05; Chimera 1.32E-04; SiScan 1.51E-05; 3Seq 1.66E-08 | AB039939, Homo sapiens, Japan, 1990, 98.6%, P8 | AB008277, Homo sapiens, China, 1992, 98.9%, P8 |
| VP6 | KJ482497, Sus scrofa, Brazil, 2013 | 589 (584-605) - 801 (790-809) | RDP 1.81E-25; GENECONV 7.17E-23; BootScan 7.09E018; MaxChi 2.98E-10; SiScan 2.41E-11; 3Seq 1.41E-09 | DQ119822, Sus scrofa, China, unknown, 98.1%, I2 | KJ482491, Sus scrofa, Brazil, 2013, 100%, I5 |
| VP6 | KX632346, Homo sapiens, Uganda, 2013 | 1269 (1250-1272) - 56 (1333-103) | RDP 7.77E-19; GENECONV 1.33E-17; BootScan 7.27E-09; MaxChi 2.1E-02; Chimera 3.24E-03; SiScan 1.02E-07; 3Seq 3.06E-11 | KU714448, Homo sapiens, Malawi, 2000, 92.7%, I1 | KX655455, Homo sapiens, Uganda, 2013, 96.5%, I2 |
| VP6 | JN014005, Homo sapiens, South Africa, 2008 | 160 (33-220) - 384 (379-451) | RDP 1.98E-18; GENECONV 6.96E-18; BootScan 4.63E-16; MaxChi 1.00E-06; Chimera 2.77E-07; 3Seq 1.41E-09 | JN706549, Homo sapiens, Thailand, 2010, 97.8%, I1 | KJ752622, Homo sapiens, Senegal, 2009, 96.4%, I2 |
| VP6 | JX040423, Homo sapiens, India, 2009 | 1177 (1143-1181) - 160 (140-176) | RDP 1.30E-15; GENECONV 4.97E-09; BootScan 1.95E-11; MaxChi 4.38E-11; Chimera 9.18E-12; SiScan 7.20E-15; 3Seq 2.82E-09 | KF740531, Homo sapiens, India, 2009, I12 | Unknown, closest = KC579918, Homo sapiens, USA, 1976, I1 |
| VP6 | JN014003, Homo sapiens, South Africa, 2008 | 202 (1333-208) - 384 (354-421) | RDP 6.30E-15; GENECONV 2.39E-15; BootScan 1.04E-12; MaxChi 3.76E-05; Chimera 3.02E-05; SiScan 4.51E-05; 3Seq 1.41E-09 | KP752725, Homo sapiens, Swaziland, 2010, 95.5%, I2 | JX027952, Homo sapiens, Australia, 2010, 96.7%, I1 |
| VP6 | KU714448, Homo sapiens, Malawi, 2000 | 653 (635-664) - 831 (818-846) | RDP 1.24E-12; GENECONV 7.55E-10; BootScan 5.50E-07; MaxChi 1.15E-05; Chimera 3.38E-05; SiScan 1.38E-04; 3Seq 1.41E-09 | JN014005, Homo sapiens, South Africa, 2008, 95.9%, I1 | KX655455, Homo sapiens, Uganda, 2013, 99.4%, I2 |
| VP6 | KU714448, Homo sapiens, Malawi, 2000 | 317 (290-333) - 491 (474-507) | RDP 3.94E-15; GENECONV 9.37E-14; BootScan 2.42E-13; MaxChi 2.80E; Chimera 2.58E-05; SiScan 1.73E-07; 3Seq 1.41E-09 | KX638567, Homo sapiens, India, 2010, 96.6%, I1 | GU984758, Bos taurus, India, 2007, 98.9%, I2 |
| VP6 | HQ11009, Homo sapiens, Belgium, 2007, 98.5% | 754 (697-775) - 1357 (1329-32) | GENECONV 2.23E-05; BootScan 1.61E-03; MaxChi 3.32E-11; Chimera 1.08E-10; SiScan 5.16E-14; 3Seq 2.54E-19 | MF469405, Homo sapiens, USA, 2015, 99.3%, I1 | HQ392161, Homo sapiens, Belgium, 2007, 98.5%, I1 |
| VP6 | FJ685614, Homo sapiens, India, 1992 | 698 (672-712) - 1357 (1231-70) | RDP 1.72E-08; GENECONV 1.61E-07; BootScan 2.85E-08; MaxChi 1.40E-09; Chimera 2.14E-09; SiScan 4.04E-11; 3Seq 1.54E-23 | KX638599, Homo sapiens, India, 2010, 99.4%, I1 | KX632291, Homo sapiens, Uganda, 2013, 99.1%, I1 |
| VP6 | KJ753780, Homo sapiens, South Africa, 2004 | 513 (463-560) - 1008 (952-1031) | GENECONV 1.89E-05; BootScan 2.60E-05; MaxChi 9.01E-09; Chimera 1.46E-08; SiScan 1.38E-09; 3Seq 1.30E-19 | JN258839, Homo sapiens, Belgium, 2001, 99.6%, I1 | LC056797, Homo sapiens, Vietnam, 2007, 98.8%, I1 |
| VP6 | HM988972, Bos taurus, South Korea, Unknown | 751 (340-811) - 1113 (1080-1132) | RDP 2.08E-08; GENECONV 1.92E-08; BootScan 2.91E-08; MaxChi 1.45E-05; Chimera 2.41E-05; SiScan 3.56E-05; 3Seq 5.55E-05 | Unknown, closest = MH238302, Sus scrofa, Spain, 2017, I5 | MH423866, Sus scrofa, China, 2016, 99.4%, I5 |
| VP7 | AB158431, Bos taurus, Japan, 2000 | 901 (868-907) - 1051 (1037-80) | RDP 3.71E-16; GENECONV 1.25E-13; MaxChi 6.99E-07; Chimera 3.69E-05; SiScan 8.07E-10; 3Seq 4.95E-10 | AB077053, Bos taurus, Japan, 1995-1996, 95.9%, G8 | U50332, Bos taurus, USA, 1991, 97.2%, G6 |
| VP7 | HM591496, Bos taurus, India, 2009; KF170899, Bos taurus, India, 2012 | 1048 (1032-128) - 481 (448-506) | RDP 2.78E-05; GENECONV 4.77E-04; BootScan 4.86E-04; MaxChi 1.02E-03; Chimera 3.78E-05; SiScan 7.21E-09; 3Seq 3.90E-08 | EF199501, Bos taurus, India, 2001-2005, 96.6%, G6 | JX442784, Bos taurus, India, 2010, 100%, G6 |
| VP7 | D86274, Homo sapiens, 1990 | 353 (338-368) - 745 (730-756) | RDP 1.73E-18; GENECONV 1.43E-16; BootScan 2.50E-14; MaxChi 3.26E-13; Chimera 6.54E-13; SiScan 2.47E-10; 3Seq 3.55E-33 | D86282, Homo sapiens, Japan, 1981, 98.7%, G3 | KF636217, Homo sapiens, South Africa, 2010, 94.9%, G1 |
| VP7 | MG181727, Homo sapiens, Malawi, 2014 | 316 (1010-314) - 470 (437-476) | RDP 3.69E-17; GENECONV 1.84E-16; BootScan 8.96E-12; MaxChi 2.23E-08; Chimera 3.80E-08; SiScan 2.65E-12; 3Seq 2.69E-23 | JQ069508, Homo sapiens, Canada, 2008, 98%, G1 | KJ753362, Homo sapiens, South Africa, 2003, 98.7%, G2 |
| VP7 | KX655511, Homo sapiens, Uganda, 2013 | 840 (823-851) - 1052 (1037-79) | RDP 1.20E-16; GENECONV 3.11E-15; BootScan 2.21E-16; MaxChi 6.12E-05; Chimera 6.98E-06; SiScan 2.52E-04; 3Seq 3.36E-09 | EU839936, Homo sapiens, Bangladesh, 2005, 97.4%, G9 | KX655456, Homo sapiens, Uganda, 2013, 100%, G8 |
| VP7 | KJ753024, Homo sapiens, South Africa, 2009 | 1013 (985-59) - 130 (121-143) | RDP 1.53E-16; GENECONV 2.32E-14; BootScan 1.27E-10; MaxChi 9.65E-04; Chimera 5.96E-04; SiScan 8.42E-13; 3Seq 2.38E-16 | KM008672, Homo sapiens, India, 2010, 97.8%, G12 | D16344, Homo sapiens, 96.5%, Japan, 1977, G1 |
| VP7 | KJ751729, Homo sapiens, Ethiopia, 2010; KP752817, Homo sapiens, Togo, 2010 | 857 (833-859) - 1019 (991-51) | RDP 4.0E-15; GENECONV 1.60E-10; BootScan 1.21E-11; MaxChi 1.12E-03; Chimera 6.78E-04; SiScan 1.73E-03; 3Seq 1.34E-05 | KJ560447, Homo sapiens, USA, 2011, 99%, G3 | KP882703, Homo sapiens, Kenya, 2009, 99.3%, G1 |
| VP7 | MG181661, Homo sapiens, Malawi, 2013 | 590 (582-616) - 175 (170-181) | RDP 7.82E-30; GENECONV 7.33E-23; BootScan 5.06E-31; MaxChi 3.49E-21; Chimera 1.06E-25; 3Seq 2.28E-47 | JQ069502, Homo sapiens, Canada, 2008, 96.6%, G1 | MG181859, Homo sapiens, Malawi, 2012, 99.5%, G2 |
| VP7 | KF170899, Bos taurus, India, 2010 | (71) - (448-506) | RDP 2.78E-05; GENECONV 4.77E-04; BootScan 4.86E-04; MaxChi 1.02E-03; Chimaera 3.78E-05; SiScan 7.21E-09; 3Seq 3.90E-08 | EF199501, Bos taurus, India, 2001-2005, 97.3%, G6 | JX442784, Bos taurus, India, 2010, 99.5%, G6 |

| VP7 | KJ412857, Homo sapiens, Paraguay, 2009 | 772 (754-779) - 1014 (988-51) | GENECONV 1.23E-19; BootScan 6.33E-20; MaxChi 2.56E-07; Chimera 4.38E-07; SiScan 2.01E-09; 3Seq 3.36E-09 | KJ412858, Homo sapiens, Paraguay, 2009, 100%, G3 | KJ412802, Homo sapiens, Paraguay, 2009, 99.2%, G1 |
| VP7 | KC443034, Homo sapiens, USA, 2006; MG181727, Homo sapiens, Malawi, 2014 | 55 (994-60) - 291 (262-296) | RDP 1.66E-30; GENECONV 5.07E-26; BootScan 2.32E-24; MaxChi 1.03E32; Chimaera 1.10E-17; 3Seq 3.36E-09 | GQ229044, Homo sapiens, India, 2005-2008, 97.4%, G2 | JX411970, Homo sapiens, India, 2011, 98.7%, G1 |

| VP7 | MG181771, Homo sapiens, Malawi, 2012 | 715 (707-722) - 950 (921-971) | RDP 2.65E-29; GENECONV 6.45E-26; BootScan 5.84E-17; MaxChi 1.70E-11; Chimaera 2.36E-11; SiScan 4.78E-11; 3Seq 3.55E-09 | KU356667, Homo sapiens, Bangladesh, 2013, 94.9%, G2 | KP222809, Homo sapiens, Mozambique, 2011, 99.6%, G1 |

| VP7 | KP752498, Homo sapiens, Togo, 2009 | 618 (605-641) - 1018 (1000-1020) | RDP 1.47E-28; GENECONV 4.49E-23; BootScan 2.78E-14; MaxChi 2.05E-14; Chimaera 7.10E-19; SiScan 6.78E-39 | D86266, Homo sapiens, Japan, 1995, 95.7%, G3 | MG181661, Homo sapiens, Malawi, 2013, 98.8%, G1 |
| VP7 | AY261339, Homo sapiens, South Africa | 783 (774-791) - 14 (1044-18) | RDP 3.82E-28; GENECONV 2.35E-25; BootScan 2.41E-27; MaxChi 5.44E-10; Chimera 2.33E-10; SiScan 3.15E-14; 3Seq 3.36E-09 | GU567778, Homo sapiens, Russia, 2003-2009, 96.9%, G2 | JX458964, Homo sapiens, Argentina, 2001, 100%, G1 |
| VP7 | KX655489, Homo sapiens, Uganda, 2013 | 151 (132-155) - 316 (311-325) | RDP 2.14E-27; GENECONV 1.27E024; BootScan 3.15E-19; MaxChi 7.75E-09; Chimera 7.42E-09; SiScan 8.44E-20; 3Seq 4.13E-03 | DQ146676, Homo sapiens, Bangladesh, 2003, 98.1%, G12 | JN232074, Homo sapiens, Brazil, 2004, 99.4%, G1 |

| VP7 | D86273, Homo sapiens, 1986 | 347 (339-353) - 726 (697-763) | RDP 1.70E-24; GENECONV 1.57E-22; BootScan 1.26E-17; MaxChi 7.23E-14; Chimera 1.37E-15; SiScan 1.73E-13; 3Seq 6.15E-47 | AF450293, Homo sapiens, China, Unknown, 97.4%, G3 | D50121, Homo sapiens, 96.3%, Japan, 1992-1993, G2 |
| VP7 | KJ4122858, Homo sapiens, Paraguay, 2009 | 881 (859-914) - 1016 (991-54) | RDP 4.80E-14; GENECONV 3.04E-13; BootScan 3.81E-13; MaxChi 5.36E-03; Chimera 1.05E-02; SiScan 1.36E-04; 3Seq 3.36E-09 | AB585923, Homo sapiens, Hong Kong, 2006, 98.5%, G3 | HQ425288, Homo sapiens, South Korea, 2004, 99.2%, G4 |
| VP7 | KU714449, Homo sapiens, Malawi, 2000 | 825 (795-835) - 93 (85-93) | RDP 1.04E-13; GENECONV 4.64E-09; BootScan 1.71E-09; MaxChi 1.20E-10; Chimera 9.96E-11; SiScan 3.89E-10; 3Seq 7.72E-08 | KM008635, Homo sapiens, India, 2009, 92.5%, G1 | KU714460, Homo sapiens, Malawi, 2000, 98.2%, G6 |
| VP7 | KJ412888, Homo sapiens, Paraguay, 2009 | 873 (823-890) - 1023 (1000-81) | RDP 2.17E-12; GENECONV 7.16E-11; BootScan 3.63E-12; MaxChi 5.48E-04; Chimera 4.30E-04; SiScan 8.10E-03; 3Seq 1.12E-05 | D86275, Homo sapiens, China, 1986, 88%, G3 | JF490143, Homo sapiens, Australia, 2004, 99.3%, G1 |
| VP7 | KX655445, Homo sapiens, Uganda, 2013 | 1042 (1006-110) - 274 (269-287) | RDP 6.32E-12; GENECONV 5.98E-10; BootScan 8.41E-08; MaxChi 2.20E-06; Chimera 3.68E-08; SiScan 3.20E-04; 3Seq 8.29E-11 | KJ560447, Homo sapiens, USA, 2011, 95.8%, G3 | KJ753326, Homo sapiens, Kenya, unknown, 94.9%, G8 |
| VP7 | D86277, Homo sapiens, China, 1989 | 335 (323-353) - 754 (710-762) | RDP 8.20E-20; GENECONV 2.21E-15; BootScan 1.81E-10; MaxChi 1.91E-12; Chimera 1.95E-13; SiScan 3.36E-13; 3Seq 7.22E-27 | D86279, Homo sapiens, China, 1992, 89.6%, G3 | KP883055, Homo sapiens, Mali, 2009, 95.7%, G2 |
| VP7 | AF281044, Homo sapiens, Ireland, 1997-1999; GQ433992, Homo sapiens, Ireland, Unknown | 362 (346-366) - 589 (558-605) | RDP 1.32E-09; GENECONV 1.25E-02; BootScan 4.96E-05; MaxChi 2.75E-05; Chimera 1.50E-04; 3Seq 3.36E-09 | AY866503, Homo sapiens, Thailand, 1995-1997, 98.9%, G9 | KM008633, Homo sapiens, India, 2009, 89%, G1 |
| VP7 | KU714449, Homo sapiens, Malawi, 2000 | 440 (421-446) - 604 (641-704) | RDP 3.12E-10; GENECONV 7.36E-11; BootScan 1.56E-07; MaxChi 1.83E-03; Chimera 1.69E-03; 3Seq 1.08E-12 | JQ069503, Homo sapiens, Malawi, 2000, 100%, G1 | KU714460, Homo sapiens, Malawi, 2000, 100%, G6 |
| VP7 | KX655444, Homo sapiens, Uganda, 2013 | 110 (88-114) - 191 (183-197) | RDP 2.84E-08; GENECONV 3.00E-05; MaxChi 3.11E-02; Chimera 3.11E-02; SiScan 1.01E-03; 3Seq 8.25E-05 | KC579965, Homo sapiens, USA, 1978, 96.2%, G3 | AF254137, Homo sapiens, Ireland, 1997-1999, 93.9%, G1 |
| NSP1 | KP753174, Sus scrofa, South Africa, 2008; KJ753184, Sus scrofa, South Africa, 2008; KP752951, Sus scrofa, South Africa, 2008 | 1498 (1435-25) - 585 (573-596) | RDP 7.45E-28; GENECONV 5.26E-27; BootScan 7.87E-25; MaxChi 8.49E-16; Chimera 5.73E-16; SiScan 9.95E-18; 3Seq 1.09E-48 | KP752999, Sus scrofa, South Africa, 2009, 99.9%, A8 | KP753000, Sus scrofa, South Africa, 2009, 99.7%, A8 |
| NSP1 | KX655446, Homo sapiens, Uganda, 2013 | 485 (455-499) - 889 (868-914) | RDP 3.98E-25; GENECONV 2.55E-22; BootScan 2.53E-21; MaxChi 1.96E-14; Chimera 8.97E-15; SiScan 1.22E-19; 3Seq 3.18E-09 | LC406825, Homo sapiens, Kenya, 2012, 95.9%, A2 | MG181277, Homo sapiens, Malawi, 2000, 96.5%, A1 |
| NSP1 | DQ199658, Homo sapiens, Australia, 2001 | 485 (440-517) - 942 (931-956) | RDP 2.25E-22; GENECONV 7.04E-20; BootScan 1.52E-21; MaxChi 2.09E-12; Chimera 1.10E-12; SiScan 6.14E-12; 3Seq 2.61E-35 | KP883160, Homo sapiens, Malawi, 2008, 98.1%, A1 | DQ199657, Homo sapiens, Australia, 1997, 98.7%, A1 |
| NSP1 | KJ482257, Sus scrofa, Brazil, 2013 | 1534 (1490-44) - 888 (853-909) | RDP 3.24E-20; GENECONV 2.11E-19; BootScan 5.21E-10; MaxChi 9.52E-18; SiScan 2.81E-21; 3Seq 1.81E-29 | KJ482252, Sus scrofa, Brazil, 2013, 92.9%, A8 | KJ482255, Sus scrofa, Brazil, 2013, 99.9%, A8 |
| NSP1 | KX778616, Homo sapiens, Dominican Republic, 2013 | 1147 (1127-1155) - 1462 (1433-5) | RDP 1.14E-19; GENECONV 4.51E-17; BootScan 8.47E-16; MaxChi 1.54E-11; Chimera 1.93E-12; SiScan 9.28E-13; 3Seq 3.18E-09 | KJ752290, Homo sapiens, Zambia, 2009, 96.3%, A1 | KX363351, Sus scrofa, Vietnam, 2012, 95.9%, A8 |
| NSP1 | KJ482254, Sus scrofa, Brazil, 2013 | 534 (499-559) - 886 (849-911) | RDP 7.72E-18; GENECONV 1.84E-16; BootScan 3.89E-14; MaxChi 2.53E-10; Chimera 2.27E-10; SiScan 1.55E-08; 3Seq 8.01E-16 | KJ482247, Sus scrofa, Brazil, 2013, 100%, A8 | KJ482256, Sus scrofa, Brazil, 2013, 100%, A8 |
| NSP1 | KX632348, Homo sapiens, Uganda, 2013 | 580 (508-605) - 844 (829-874) | RDP 1.91E-10; GENECONV 3.85E-10; BootScan 1.02E-06; MaxChi 2.32E-04; Chimera 2.29E-04; SiScan 6.75E-08; 3Seq 3.18E-09 | KP752785, Homo sapiens, Ethiopia, 2010, 94.9%, A1 | KP882434, Homo sapiens, Ghana, 2009, 92.8%, A2 |
| NSP1 | KP752999, Sus scrofa, South Africa, 2009 | 1394 (1368-31) - 601 (552-631) | RDP 1.38E-06; BootScan 5.20E-05; MaxChi 2.04E-08; Chimera 2.74E-08; SiScan 3.04E-06; 3Seq 2.78E-12 | KF835942, Homo sapiens, Hungary, 2005, 93.9%, A8 | Unknown, closest = KP753000, Sus scrofa, South Africa, 2009, A8 |
| NSP1 | JX040426, Homo sapiens, India, 2009 | 662 (559-677) - 842 (783-884) | RDP 8.72E-03; GENECONV 1.35E-05; BootScan 4.19E-02; MaxChi 1.57E-04; Chimera 5.50E-05; SiScan 4.50E-03; 3Seq 6.62E-07 | KU292525, Homo sapiens, India, 2014, 98.4%, A11 | KP882478, Homo sapiens, Ghana, 2008, 95%, A11 |
| NSP1 | KP753000, Sus scrofa, South Africa, 2009 | 1389 (1359-41) - 602 (504-631) | RDP 4.27E-03; BootScan 4.30-03; MaxChi 1.12E-04; Chimera 1.52E-05; SiScan 3.90E-04; 3Seq 5.58E-05 | JQ309141, Equus caballus, United Kingdom, 1975, 87.2%, A8 | KF835942, Homo sapiens, Hungary, 2005, 92.8%, A8 |
| NSP1 | KJ482254, Sus scrofa, Brazil, 2013 | 1534 (1461-44) - 533 (499-909) | GENECONV 1.50E-13; BootScan 5.96E-10; MaxChi 3.77E-08; Chimaera 6.43E-08; SiScan 1.88E-11; 3Seq 1.46E-24 | KJ482247, Sus scrofa, Brazil, 2013, 99.1%, A8 | KJ482255, Sus scrofa, Brazil, 2013, 100%, A8 |
NSP2 KJ753657, Homo sapiens, (0) - (219 -278) RDP 2.01E-03 KM026633, Homo KJ918915, Homo sapiens, South Africa, 2004 GENECONV 1.6E-04 sapiens, Brazil, 2003, Hungary, 2012, 99.5%, KM026663, Homo sapiens, BootScan 4.07E-06 98.6%, N1 N1 Brazil, 2009 MaxChi 3.51E-03 KM026664, Homo sapiens, Chimera 3.30E-03 Brazil, 2009 SiScan 1.90E-07 3Seq 1.57E-04

NSP2 | MF18783, Homo sapiens, USA, 2011 | 1015 (1002-57) - 286 (273-297) | GENECONV 1.60E-14; BootScan 1.55E-14; MaxChi 6.46E-04; Chimaera 3.74E-06; SiScan 4.37E-08; 3Seq 5.75E-22 | KJ752146, Homo sapiens, South Africa, 2011, 99.7%, N1 | KX655480, Homo sapiens, Uganda, 2012, 99.3%, N2
NSP2 | AB779641, Sus scrofa, Thailand, 2008 | 870 (833-921) - 216 (150-396) | RDP 6.30E-05; GENECONV 4.67E-03; BootScan 1.24E-03; MaxChi 1.28E-03; Chimaera 6.20E-03; 3Seq 1.48E-05 | KX363374, Sus scrofa, Vietnam, 2012, 96.5%, N1 | JF796719, Sus scrofa, South Korea, 2006, 100%, N1
NSP4 | MF58092, Homo sapiens, China, 2013 | 599 (546-616) - (745-94) | RDP 1.64E-11; GENECONV 2.38E-08; BootScan 1.14E-09; MaxChi 1.01E-09; Chimaera 4.58E-10; SiScan 4.77E-18; 3Seq 6.43E-21 | JF813107, Homo sapiens, China, 2008, 97.9%, E1 | MG781050, Homo sapiens, Thailand, 2011, 73.1%, E1
NSP4 | AB361290, Homo sapiens, India, 2006 | 624 (548-635) - 726 (712-50) | RDP 1.66E-09; GENECONV 4.38E-08; BootScan 6.96E-07; MaxChi 1.96E-03; Chimaera 2.49E-05; 3Seq 1.15E-12 | AB326338, Homo sapiens, India, 1989, 95.8%, E2 | Unknown, closest = FJ492834, Sus scrofa, Ireland, E9
NSP4 | AB326966, Homo sapiens, India, 2001 | 605 (564-614) - 675 (670-46) | RDP 2.77E-09; GENECONV 1.10E-10; BootScan 2.26E-02; MaxChi 2.54E-05; Chimaera 1.63E-03; 3Seq 8.89E-09 | FJ685615, Homo sapiens, India, 1992, 98.2%, E1 | Unknown, closest = MG181929, Homo sapiens, Malawi, 2012, E2
NSP5 | M33608, Homo sapiens, 1993 | 619 (505-624) - 668 (665-182) | RDP 1.44E-05; GENECONV 2.64E-06; BootScan 4.1E-05; MaxChi 4.54E-02; Chimaera 3.70E-02; SiScan 2.49E-05; 3Seq 6.61E-04 | MG996138, Homo sapiens, Singapore, 2016, 95%, H2 | JQ358774, Homo sapiens, India, 2011, 100%, H3

Supplementary Table 2. Amino acid changes observed in segment 9 (VP7) recombination events observed in more than one isolate. Changes occurring in predicted epitope regions are noted.

Gene recombinant | Accession Number | Position | Major | Minor | Epitope region?
G6-G6 | KF170899 | 40 | M | T | -
 | | 46 | I | T | -
 | | 48 | V | T | -
 | | 116 | V | I | -
G3-G1 | KJ751729 | 281 | V | I | Yes
 | | 303 | V | I | Yes
 | | 309 | A | V | Yes
 | | 318 | N | D |
G9-G1 | AF281044 | 135 | S | C | -
 | | 162 | D | E | -
 | | 163 | R | W | -
 | | 169 | V | D |
 | | 170 | N | I |
 | | 176 | L | Q |
 | | 180 | G | E | Yes
G2-G1 | KC443034 | 9 | I | T | -


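 | | 10 | F | I | -
 | | 21 | S | Y | -
 | | 25 | T | S | -
 | | 26 | I | V | -
 | | 28 | N | R | -
 | | 29 | P | I | -
 | | 31 | A | D | -
 | | 32 | P | Y | -
 | | 35 | F | Y | -
 | | 41 | I | F | -
 | | 42 | A | V | -
 | | 43 | L | A | -
 | | 44 | M | L | -
 | | 45 | S | F | -
 | | 47 | F | L | -
 | | 48 | G | T | -
 | | 49 | R | K | -
 | | 50 | T | A | -
 | | 56 | Y | N | -
 | | 65 | A | T | -
 | | 68 | T | S | -
 | | 72 | S | R | -
 | | 73 | G | E | -
 | | 75 | S | V | -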



Supplementary Figure 1. P6, P8, and P4 consensus RNA secondary structure mountain plots. Mountain plots showing consensus secondary structure of VP4 genotypes P6, P8, and P4. Lighter colors indicate lower probabilities of base-pairing. Peaks correspond to hairpin loops, plateaus to loops, and slopes to helices. Plots were generated using RNAalifold in the ViennaRNA package.
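To make the peak, plateau, and slope reading concrete, the following minimal sketch computes mountain-plot heights from a single dot-bracket structure string. It is an illustration only: the plots in this figure come from RNAalifold consensus predictions, whereas the function name `mountain_heights`, the toy structure, and the height convention used here (the number of base pairs that include or enclose each position) are assumptions made for the example.

```python
def mountain_heights(structure: str) -> list[int]:
    """Return one mountain-plot height per position of a dot-bracket string.

    Assumed convention: the height of a position is the number of base
    pairs that include or enclose it.
    """
    heights = []
    depth = 0  # number of currently open base pairs
    for symbol in structure:
        if symbol == "(":
            depth += 1              # opening base: the slope rises
            heights.append(depth)
        elif symbol == ")":
            heights.append(depth)   # closing base: the slope falls afterwards
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: extra ')'")
        elif symbol == ".":
            heights.append(depth)   # unpaired base: plateau at current height
        else:
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return heights


# Hypothetical toy hairpin: a 2-bp helix closing a 2-nt loop.
print(mountain_heights("((..))"))
```

In this toy hairpin the two opening bases form the rising slope, the unpaired loop bases sit on the plateau at the top of the peak, and the closing bases form the falling slope.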


Supplementary Figure 2. Consensus mountain plots of A) NSP1 and B) VP3 with recombination breakpoint distribution plot in gray.
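A breakpoint distribution of this kind can be approximated by counting breakpoint positions in fixed-width windows along the segment. The sketch below does so for the NSP1 breakpoint positions listed in the recombination table rows above. It is not the authors' plotting procedure, which is not described here: the function name `bin_breakpoints` and the 100-nucleotide window are assumptions, and the confidence intervals reported with each breakpoint are ignored.

```python
from collections import Counter


def bin_breakpoints(positions, window=100):
    """Count breakpoint positions per window, keyed by window start."""
    counts = Counter((pos // window) * window for pos in positions)
    return dict(sorted(counts.items()))


# NSP1 breakpoint positions (start and end of each event) as listed in
# the recombination table rows above; confidence intervals are ignored.
nsp1_breakpoints = [
    1147, 1462, 534, 886, 580, 844, 1394,
    601, 662, 842, 1389, 602, 1534, 533,
]

for start, count in bin_breakpoints(nsp1_breakpoints).items():
    print(f"{start}-{start + 99}: {count}")
```

Windows with higher counts are the candidate hot spots to compare against the mountain plot at the same coordinates.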
