<<

bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

1 Intragenic Recombination Influences Diversity and Evolution 2

3 Irene Hoxiea,b and John J. Dennehya,b,#

4

5

6 aBiology Department, Queens College of The City University of New York, Queens, NY 7

8 bThe Graduate Center of The City University of New York, New York, NY 9 10 11 12 Running head: Rotavirus Recombination and Evolution 13 14 #Address correspondence to John J. Dennehy, [email protected] 15 16 Word counts 17 Abstract: 212 18 Text: excl. refs, table and figure legends 19

1 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

20 Abstract 21 Because of their replication mode and segmented dsRNA genome, homologous recombination is assumed to be 22 rare in the . We analyzed 23,627 complete rotavirus genome sequences available in the NCBI 23 Variation database and found numerous instances of homologous recombination, some of which prevailed 24 across multiple sequenced isolates. In one case, recombination may have generated a novel rotavirus VP1 25 lineage. We also found strong evidence for intergenotypic recombination in which more than one sequence 26 strongly supported the same event, particularly between different genotypes of the serotype , VP7. In 27 addition, we found evidence for recombination in nine of eleven rotavirus A segments. Only two segments, 7 28 (NSP3) and 11 (NSP5/NSP6), did not show strongly supported evidence of prior recombination. The abundance 29 of putative recombinants with substantial replacements occurring between different G types suggests 30 a fitness advantage for the resulting recombinant strains. Protein structural predictions indicated that despite the 31 sometimes substantial, amino acid replacements resulting from recombination, the overall protein structures 32 remained relatively unaffected. Notably, recombination junctions appear to occur non-randomly with hot spots 33 corresponding to secondary RNA structures, a pattern that is seen consistently across segments. Collectively, the 34 results of our computational analyses suggest that, contrary to the prevailing sentiment, recombination may be a 35 significant driver rotavirus evolution and influence circulating strain diversity. 36 37 Keywords: dsRNA, genetic diversity, , segmented genome, RDP4

2 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

38 Introduction 39 The non-enveloped, dsRNA rotaviruses of the family Reoviridae are a common cause of acute 40 gastroenteritis in young individuals of many bird and mammal species (1). The rotavirus genome consists of 11 41 segments, each coding for a single protein with the exception of segment 11, which encodes two , NSP5 42 and NSP6 (1). Six of the proteins are structural proteins (VP1-4, VP6 and VP7), and the remainder are non- 43 structural proteins (NSP1-6). The infectious virion is a triple-layered particle consisting of two outer-layer 44 proteins, VP4 and VP7, a middle layer protein, VP6, and an inner capsid protein, VP2. The RNA polymerase 45 (VP1) and the capping-enzyme (VP3) are attached to the inner capsid protein. For the virus to be infectious (at 46 least when not infecting as an extracellular vesicle), the VP4 spike protein’s head (VP8*) must be cleaved by a 47 , which results in the protein VP5* (2). Because they comprise the outer layer of the virion, VP7 and 48 VP4 are capable of eliciting neutralizing antibodies, and are used to define G (glycoprotein) and P (protease 49 sensitive) serotypes respectively (3, 4). Consequently, VP7 and VP4 are likely to be under strong selection for 50 diversification to mediate cell entry or escape host immune responses (5-7). 51 Based on sequence identity and antigenic properties of VP6, 10 different rotavirus groups (A–J) have 52 been identified, with group A rotaviruses being the most common cause of human infections (8-10). A genome 53 classification system based on established nucleotide percent cutoff values has been developed for group A 54 rotaviruses (3, 11). In the classification system, the segments VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2- 55 NSP3-NSP4-NSP5/6 are represented by the indicators Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx, (x = Arabic 56 numbers starting from 1), respectively (3, 11). To date, between 20 to 51 different genotypes have been 57 identified for each segment, including 51 different VP4 genotypes (P[1]–P[51]) and 36 different VP7 genotypes 58 (G1–G36), both at 80% nucleotide identity cutoff values (12). 59 The propensity of rotavirus for coinfection and outcrossing with other rotavirus strains makes it a 60 difficult to control and surveil, even with current vaccines (3, 7, 13-16). Understanding rotaviral 61 diversity expansion, genetic exchange between strains (specifically between the clinically significant type I and 62 type II genogroups), and evolutionary dynamics resulting from coinfections has important implications for 63 disease control (3, 7, 13-16). Rotavirus A genomes have high mutation rates (16-18), undergo frequent 64 reassortment (15, 19-21), and the perception is that these two processes are the primary drivers of rotavirus 65 evolution (16, 22). Genome rearrangements may also contribute to rotavirus diversity, but are not believed to be 66 a major factor in rotavirus evolution (23). Homologous recombination, however, is thought to be especially rare 67 in rotaviruses due to their dsRNA genomes (20, 21, 24). Unlike +ssRNA (25) and DNA viruses (26), 68 dsRNA viruses are not easily able to undergo intragenic recombination because their genomes are not 69 transcribed in the cytoplasm by host polymerases, but rather within nucleocapsids by viral RNA-dependent 70 RNA polymerase (20, 27). Genome encapsidation should significantly reduce the opportunities for template 71 switching, the presumptive main mechanism of intramolecular recombination (26, 28). 72 Despite the expectation that recombination should be rare in rotavirus A, there are nevertheless 73 numerous reports of recombination among rotaviruses in the literature (29-37). However, a comprehensive 3 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

74 survey of 797 rotavirus A genomes failed to find any instances of the same recombination event in multiple 75 samples (38). There are two possible explanations for this outcome. Either the putative recombinants were 76 spurious results stemming from poor analytical technique or were poorly fit recombinants that failed to increase 77 in frequency in the population such that they would be resampled (38). The main implications of this study are 78 that recombination among rotavirus A is rare and usually disadvantageous. 79 Since the number of publicly available rotavirus A whole segment genomes is now over 23,600, it is 80 worth revisiting these conclusions to see if they are still valid. To this end, we used bioinformatics tools to 81 identify possible instances of recombination among all available complete rotavirus A genome sequences 82 available in the NCBI Virus Variation database as of May, 2019. We found strong evidence for recombination 83 events among all rotavirus A segments with the exception of NSP3 and NSP5. In several cases, the 84 recombinants were fixed in the population such that several hundred sampled strains showed remnants of this 85 same event. These reports suggest that rotavirus recombination occurs more frequently than is generally 86 appreciated, and can significantly influence rotavirus A evolution. 87 Methods 88 Sequence Acquisition and Metadata Curation 89 We downloaded all complete rotavirus genomes from the Virus Variation Resource as of May 2019 90 (39). Laboratory strains were removed from the dataset. Genomes that appeared to contain substantial insertions 91 were excluded as well. Avian and mammalian strains were analyzed separately as recombination analyses 92 between more divergent genomes can sometimes confound the results. No well-supported events were identified 93 among the avian strains. For each rotavirus genome, separate fasta files were downloaded for each of the 11 94 segments. Metadata including host, country of isolation, collection date, and genotype (genotype cutoff values 95 as defined by (40)) were recorded. The genotype cutoff values allowed for the differentiation of events that 96 occurred between two different genotypes—intergenotypic—from those which occurred between the same 97 genotype—intragenotypic. 98 Recombination Detection and Phylogenetic Analysis 99 All 11 segments of each complete genome were separately aligned using MUSCLE v3.8.31 (41) after 100 removing any low quality sequence (e.g., ‘Ns’). Putative recombinants were identified using RDP4 Beta 4.97, a 101 program which employs multiple recombination detection methods to minimize the possibility of false positives

102 (42). All genomes were analyzed, with a p-value cut off < 10-9, using 3Seq, Chimæra, SiScan, MaxChi, 103 Bootscan, Geneconv, and RDP as implemented in RDP4 (42). We eliminated any strains that did not show a 104 putative recombination event predicted by at least 6 of the above listed programs. We then ran separate 105 phylogenic analyses on the ‘major parent’ and ‘minor parent’ sequences of putative recombinants in BEAST 106 v1.10.4 using the general time reversible (GTR) +  + I substitution model (43). ‘Parent’ in this case does not 107 refer to the actual progenitors of the recombinant strain, but rather those members of the populations whose 108 genome sequences most closely resemble that of the recombinant. Significant phylogenetic incongruities with

4 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

109 high posterior probabilities between the ‘major parent’ and ‘minor parent’ sequences were interpreted as 110 convincing evidence for recombination. 111 Phylogenetic Analysis of VP1 and VP3 112 Segment 1 (VP1) and segment 3 (VP3) each showed evidence for recombination events resulting in a 113 novel lineage, wherein many isolates reflected the same recombination event. To analyze these events more 114 thoroughly, for each segment we split alignments of a subset of environmental isolates to include flagged clades, 115 minor parent clades, and major parent clades along with outgroup clades to improve accuracy of tip dating. We 116 generated minor and major parent phylogenies using BEAST v1.10.4 (43). We used tip dating to calibrate 117 molecular clocks and generate time-scaled phylogenies. The analyses were run under an uncorrelated relaxed 118 clock model using a time-aware Gaussian Markov random field Bayesian skyride tree prior (44). The alignments 119 were run using a GTR + Γ + I substitution model and partitioned by codon position. Log files in Tracer v1.7.1 120 (45) were analyzed to confirm sufficient effective sample size (ESS) values, and trees were annotated using a 121 20% burn in. The alignments were run for three chains with a 200,000,000 Markov chain Monte Carlo (MCMC) 122 chain length, analyzed on Tracer v1.7.1 (45), and combined using LogCombiner v1.10.4 (46). Trees were 123 annotated with a 10% burn, and run in TreeAnnotator v1.10.4 (43). The best tree was visualized using FigTree 124 v1.4.4 (47) with the nodes labeled with posterior probabilities. 125 RNA Secondary Structure Analysis 126 To test the hypothesis that recombination junctions were associated with RNA secondary structure, we 127 generated consensus secondary RNA structures of different segments and genotypes using RNAalifold in the 128 ViennaRNA package, version 2.4.11 (48, 49). The consensus structures were visualized and mountain plots 129 were generated to identify conserved structures that may correspond to breakpoint locations. We made separate 130 consensus alignments both by using the same genotype and by combining genotypes. Due to the high sequence 131 variability of the observed recombinants, particularly in segments 4 (VP4) and 9 (VP7), we note that the 132 consensus structure may vary substantially depending on the sequences used in the alignments. Segments 7 133 (NSP3), 10 (NSP4), and 11 (NSP5) were not analyzed due to only having one or no recombination events. 134 Consensus structures could not reliably be made for segment 2 (VP2). While the VP2 protein is relatively 135 conserved across genotypes, it contains insertions particularly in the C1 genotype, yet shows recombination 136 across C1 and C2 genotypes. 137 Protein Structure and Antigenic Predictions 138 To verify that events did not result in substantial (deleterious) protein structure changes, we employed 139 LOMETS2 (Local Meta-Threading-Server) I-TASSER (Iterative Threading Assembly Refinement) (50-52) to 140 predict secondary and tertiary protein structures. I-TASSER generates a confidence (C) score for estimating the 141 quality of the protein models. 142 To determine if any of the putative recombinants possessed recombined regions containing , we 143 analyzed the amino acid sequences of all VP4, VP7, and VP6 recombinants and their major and minor parents.

5 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

144 We used the Immune Epitope Database (IEDB) (53) and SVMTriP (54) to predict conserved epitopes 145 (Supplementary Table 2). 146 Results 147 Strong Evidence for Homologous Recombination in Rotavirus A 148 We identified 117 putative recombination events (identified by 6/7 RDP4 programs; Table 1, Supplementary 149 Table 1). Of these, 79 recombination events were strongly supported, meaning they were detected by 7/7 150 recombination programs as implemented in RDP4 (Table 1). Most recombination events detected were observed 151 in sequences uploaded to the Virus Variation database since the last large scale analysis (38) so differences 152 between these results and prior studies may simply reflect the recent increase in available genome sequences. VP1 VP2 VP3 VP4 VP6 VP7 NSP1 NSP2 NSP3 NSP4 NSP5 Number of 1710 1600 1905 1990 2176 3887 1962 2186 1881 2430 1900 Sequences Analyzed Putative 17 13 17 16 12 24 11 3 0 3 1 Recombination

Eventsa Strongly 11 8 12 12 10 17 7 1 0 1 0 Supported

Eventsb Recombination 6.4E-3 5.0E-3 6.3E-3 8.0E-3 4.6E-3 4.4E-3 4.4E-3 4.6E-4 3.2E-3 4.1E-4 0

Frequencyc Intergenotypic 2/17 6/13 4/17 7/16 7/12 22/24 3/9 1/3 -- 2/3 1/1 Recombination ratio 153 Table 1. Recombination events identified among all mammalian rotavirus A genome sequences downloaded 154 from NCBI Virus Variation database in May 2019 (39). a6/7 programs implemented in RDP4 identified event 155 (see Methods). b7/7 programs implemented in RDP4 identified event or a putative recombination event where 156 more than one downloaded sequence showed the same event (see Methods). cStrongly supported events (Row 3) 157 divided by number of sequences analyzed (Row 1). 158 159 Putative Recombination Events Observed in Multiple Environmental Isolates 160 Eleven of the recombination events identified in Table 1 were observed in more than one environmental 161 isolate (Fig. 1; Table 2). The observation of multiple sequenced strains with the same recombination event is 162 strong evidence that the observed event was not spurious, and was not a consequence of improper analytical 163 technique or experimental error. There are only two ways that multiple sequenced isolates will show the same 164 recombinant genotype. Either multiple recombination events with the same exact breakpoints occurred at 165 approximately the same time, or the event happened once and descendants of the recombined genotype were

6 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

166 subsequently isolated from additional infected hosts. The latter scenario is more parsimonious, and suggests that 167 the new genotype was not reproductively impaired. Indeed, some recombined genotypes may be more fit than 168 their predecessors, but this outcome would need to be experimentally validated with infectivity assays.

Recombined Genotype(s) Number of Corrected and Breakpointsa Accession Segment Involved Independent Uncorrected Multiple Numbers Isolates Comparisons Segment 1 (VP1) R1 2 4.787 E-18 and 1.349 E-10 35 – 1448 (1290-1498) KC579647 KC580176 Segment 1 (VP1) R2 flagged in 285 1.305E-16 and 3.332E-09 (541-711) - (1327- KU199270 1445) *see Fig. 2

Segment 1 (VP1) R2 2 1.756 E-20 and 4.951 E-13 656 (541-711)- 2661 KU356662 (2600-2735) KU356640 Segment 3 (VP3) M2 flagged in 551 1.78E-12 and 4.982 E-05 1258 (872-1389)- 1798 KX655453 (1662-1917) Segment 3 (VP3) M1 flagged in 719 2.491E-20 and 6.946E-13 2158 (2129-2174) – KJ919553 2531 (undetermined) KJ919517 KJ919551 KJ753665 JQ069727 Segment 7 (NSP1) A8 (porcine) 3 3.01 E -37 and 8.622 E-30 (1434-25)- (573-596) KP753174 KJ753184 KP752951 Segment 8 (NSP2) N1 3 3.689 E-13 and 4.054 E-06 485 (455-499)- 889 KJ753657 (868-914) KM026663 KM026664 Segment 9 (VP7) G6 (bovine) 2 6.240 E-12 and 2.780 E-05 1048 (1032-128)-481 HM591496 (448-506) KF170899 Segment 9 (VP7) G1 & G2 2 5.507 E-38 and 1.664 E-30 55 (994-60) – 291 (262- KC443034 296) MG181727 Segment 9 (VP7) G1 & G3 2 1.4 E-23 and 4.230 E-16 857 (833-859)- 1019 KJ751729 (991-51) KP752817 Segment 9 (VP7) G1 & G9 2 5.292 E-19 and 1.599 E -11 362 (346-366) – 589 AF281044 (558-605) GQ433992

169 Table 2. Recombination events observed in multiple independent environmental isolates (i.e., isolates from 170 different patients and/or sequenced by different laboratories). a99% Confidence Intervals. 171

7 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

A

B

172 173 Figure 1. Putative recombinants where more than one sequence supported the event. a) Bootscan b) RDP. Top 174 left showing a G6-G6 event in VP7, second from the top left showing a G9-G1 event, third from the top left 175 showing a R1-R1 event in VP1, bottom left showing a N1-N1 event in NSP2. Top right is showing an A8-A8 176 event in NSP1, second from the top right is showing an R2-R2 event in VP1, third from top right is showing a 177 G2-G1 event in VP7 and bottom right is showing a G3-G1 event. 178

8 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

179 Segment 1 Intragenotypic Recombination Resulting in a New Lineage

0.19KY657705-China-2017-01-10 1 KY657707-China-2017-01-10 0.24KU550262_Spain_2015_03_23 1 KY657706-China-2017-01-10 1 KU550267_Spain_2015_03_17 0.9 KU550264_Spain_2015_04_16 1 KY657704-China-2017-01-10 KU870407_Hungary_2015 1 KY657703-China-2017-01-10 0.77 1 KU870385_Hungary_2015 KY657702-China-2017-01-10 0.19 LC386076_Japan_2017_05 1 KY657701-China-2017-01-10 0.97 LC260250_Indonesia_2016 0.61 1 KY657700-China-2016-12-30 0.27 MF997035_USA_2015 1 KY657699-China-2016-12-20 MH569747_Brazil_2015 0.29 1 KY657698-China-2016-12-10 1 KY657697-China-2016-11-30 0.78 0.22 KU550266_Spain_2015_04_13 KY657696-China-2016-11-20 0.24 KU550265_Spain_2015_03_23 KM660258-Cameroon-2010 1 1 KX880417_Germany_2015_03 KY657695-China-2017-01-10 KU550263_Spain_2015_04_09 0.51 1 LC228304_Japan_2015 1 KY657694-China-2016-12-30 1 MH704718_Indonesia_2014_03_17 VP1 (R2) pos. 1 KY657693-China-2016-12-20 VP1 (R2) pos. LC228315_Japan_2016 0.99 KY657692-China-2016-12-10 0.96 1 KY657691-China-2016-11-30 0.16 0.28 LC228326_Japan_2016 1 KY657690-China-2016-11-20 1 LC260252_Indonesia_2016 0.99 KM660246-Cameroon-2011 1 LC260248_Indonesia_2016 1 MH704719_Indonesia_2015_07_10 0.91 KJ752117-Guinea-Bissau-2011-04-21 LC260249_Indonesia_2016 0.13 KJ752350-Guinea-Bissau-2011-03-26 1 0.18 KP752693-Gambia-2010-02-08 0.9 LC260251_Indonesia_2016 1 0.97 1 LC105248_Japan_2014_04_14 KJ752161-Togo-2010-06 0.97 KY616885_Japan_2013 KP752560-SouthAfrica-2011 AB848007_Japan_2012 1 KJ752394-Gambia-2010-01-21 KX362788_VietNam_2014_01_14 0.37 KP752448-Gambia-2010-01-30 0.28 1-697,-1440- KJ752445-Senegal-2010-12-20 698-1439 1 0.97 KX362700_VietNam_2013_11_22 1 KM660248-Cameroon-2011 0.6 LC386065_Japan_2017_03 0.82 KM660249-Cameroon-2010 1 KX362733_VietNam_2013_12_19 MG181887-Malawi-2012-06-11 0.18LC066661_Thailand_2013_03 0.99 1 LC066639_Thailand_2013_04 0.94 MG181876-Malawi-2012-06-06 0.91 LC066650_Thailand_2013_02 0.97 MG181766-Malawi-2012-06-05 LC105003_Japan_2014_05_20 0.2 0.6 MG181843-Malawi-2012-04-03 0.33 0.87 KJ753787-SouthAfrica-2009-06-03 0.14 1 LC105426_Japan_2014_06_09 KJ753716-SouthAfrica-2008 LC105393_Japan_2014_06_04 0.03 1 MG599516_Brazil_2013 1 KM660254-Cameroon-2011 MG599515_Brazil_2013 0.4 KM660263-Cameroon-2011 0.05 1 KM660253-Cameroon-2011 0.5 MG599518_Brazil_2013 3302 0.03 MG599517_Brazil_2013 1 KP753145-Togo-2010-11 KP007184_Philippines_2012_07 0.11 KJ752183-Togo-2010-11 1 1 KM660247-Cameroon-2011 KP007204_Philippines_2012_11 0.01 0.02 0.22LC260245_Indonesia_2015 0.11 JF460834-Belgium-2009-03-14 KX880428_Germany_2016_05 0.03 KP752812-Togo-2010-08 1 0.67 0.16LC260244_Indonesia_2015 1 JF460823-Belgium-2009-02-18 0.31 LC260242_Indonesia_2015 JF460812-USA-2006 KU059788_Australia_2014_09_23 KP752493-Togo-2009 0.22 0.92 KY000549_Germany_2016_09 KX655517-Uganda-2013-07-05 0.33 0.43 KX655473-Uganda-2012-11-26 KU059766_Australia_2013_04_19 0.49 KX632331-Uganda-2012-11-15 0.92 KU059777_Australia_2013_09_03 1 0.35 1 LC086714_Thailand_2013 0.91 1 KX632276-Uganda-2013-01-05 1 LC086725_Thailand_2013 0.99MF184806-USA-2011 KJ639012_Japan_2013_02_27 0.19 MF184787-USA-2011 1 1 KJ639023_Japan_2013_03_14 0.83 KJ751568-SouthAfrica-2009 0.48 LC260247_Indonesia_2016 0.86 1 KP753082-Senegal-2009 KP752971-Gambia-2008-06-12 1 AB930194_Japan_2014 KJ748476-Ghana-2008 LC260246_Indonesia_2016 KC257091-Camelusdromedarius-Sudan-2002 1 1 LC086780_Thailand_2013 LC228337-Japan-2016 0.24 KX171520_SouthKorea_2015 0.17 1 LC086736_Thailand_2013 0.25 LC228359-Japan-2016 LC086791_Thailand_2013 1 LC228348-Japan-2016 0.18 LC086802_Thailand_2013 0.93 LC228370-Japan-2016 1 1 KX171515_SouthKorea_2012 0.98KC443378-Australia-2010-11-24 1 KX171512_SouthKorea_2011 0.23 KC443211-Australia-2010-08-15 1 KC443455_Australia_2011 0.23 KC443554-Australia-2010-05-19 KC443200_Australia_2010_10_23 1 0.99 JX965134-Australia-2010-11-04 0.17 0.99 LC169973_Thailand_2014 KC443620-Australia-2010-09-26 0.59 LC169984_Thailand_2014 KC443266-Australia-2010-11-27 0.39 KX362821_VietNam_2014_01_21 0.18 KP752720-Swaziland-2010-07-14 0.61 0.82KX362744_VietNam_2013_12_20 0.17 KJ753005-SouthAfrica-2010-06-15 0.5 KX362645_VietNam_2013_09_23 0.3 KF636322-SouthAfrica-2009-05-18 0.37 KX362612_VietNam_2013_08_28 0.31 KP752779-Zambia-2009 0.46 0.59 0.95 0.2 KX362667_VietNam_2013_10_03 0.99 KJ752405-SouthAfrica-2007-02-20 KX362722_VietNam_2013_12_12 1 KJ753605-SouthAfrica-2007-05-15 0.99 KX362569_VietNam_2013_07_02 KJ751801-SouthAfrica-2005 1 KX362580_VietNam_2013_07_18 0.41 JQ069889-Canada-2007 KT936626_Thailand_2012_02_02 0.65 KC178770-Italy-2008 1 0.99 LC086769_Thailand_2013 KC442903-USA-2007 0.1 KC178763_Italy_2010 0.38 0.21 MG181865-Malawi-2012-05-12 0.08 KC443631_Australia_2006_07_27 0.18 MG181788-Malawi-2012-02-11 0.98 KC443763_Australia_2006_07_19 0.7 MG181832-Malawi-2012-03-14 KY497551_Pakistan_2010 MG181898-Malawi-2012-03-29 1 1 1 KC443007_USA_2005 0.9 MG181777-Malawi-2012-01-17 KT920606_India_2003 0.35 MG181854-Malawi-2012-04-27 1 1 DQ490545_Bangladesh_2000_02_14 0.1 KJ752372-Susscrofa-SouthAfrica-2007 DQ490551_Bangladesh_2000_04_20 KJ752458-Zimbabwe-2010-06-26 0.97 0.76 KU356585_Bangladesh_2010_01 1 KF636311-Zambia-2009-04-13 0.91 1 KU356574_Bangladesh_2010_01 0.03 KJ753811-SouthAfrica-2008-08-26 HQ641355_Bangladesh_2005_09 0.65 0.33 KF636357-Zambia-2008-10-19 KU199270_Bangladesh_2010_02 0.93 KF636245-Kenya-2009 0.92 KJ721724_Brazil_2007 0.1 KJ751602-Tanzania-2010-10-15 0.88 KJ721725_Brazil_2006 1 KJ751947-SouthAfrica-2009 KJ721718_Brazil_2011 1 KP752571-Tanzania-2011-11-17 0.11 0.98 0.06 KJ721726_Brazil_2005 1 KX655495-Uganda-2013-01-04 0.06 KJ626715_Paraguay_2005_09 1 KX655462-Uganda-2013-08-29 1 KX171511_SouthKorea_2010 KJ752662-Uganda-2008-06-18 0.04 0.13 KX171518_SouthKorea_2014 0.14 KJ751635-Botswana-2007 KJ626938_Paraguay_2007_07 0.98 KJ753401-SouthAfrica-2005-09-26 KJ626965_Paraguay_2005_07 KF636345-Swaziland-2010-07-22 0.06 KJ627161_Paraguay_2005_07 0.99 KC443653-Australia-2000-09-14 0.01 1 KJ626829_Paraguay_2006_08 KC443345-Australia-2001-06-29 1 KJ626570_Paraguay_2006_09 0.28KU870385-Hungary-2015 0.67 KJ626604_Paraguay_2006_08 0.48 LC386076-Japan-2017-05 0.07 KJ626751_Paraguay_2006_08 KU870407-Hungary-2015 0.02 1 0.94 KJ626726_Paraguay_2005_07 0.33KU550262-Spain-2015-03-23 KJ626773_Paraguay_2005_07 1 KU550267-Spain-2015-03-17 0.34 KJ412775_Paraguay_2008_09 0.32 KU550264-Spain-2015-04-16 0.24 KJ626893_Paraguay_2007_08 0.34 MF997035-USA-2015 1 KJ626987_Paraguay_2007_10 0.42 MH569747-Brazil-2015 0.08 KJ626998_Paraguay_2007_08 0.96 LC260250-Indonesia-2016 0.94 KJ626859_Paraguay_2005_09 0.59KU550263-Spain-2015-04-09 1 JN706448_Thailand_2009 0.12 0.98 KU550265-Spain-2015-03-23 0.96 JN706449_Thailand_2009 1 KU550266-Spain-2015-04-13 0.99 JN706465_Thailand_2009 0.21 KX880417-Germany-2015-03 0.92 1 JN706450_Thailand_2009 KP007184-Philippines-2012-07 0.38 1 JQ069953_Canada_2009 0.72 MH704718-Indonesia-2014-03-17 0.97 KC443521_Australia_2007_05_02 LC228304-Japan-2015 1 KC442985_USA_2005 0.99 LC228326-Japan-2016 1 KC443029_USA_2006 1 LC228315-Japan-2016 KC442870_USA_2007 0.33 LC260248-Indonesia-2016 0.94 0.99 KC178766_Italy_2006 LC260252-Indonesia-2016 1 KF716358_USA_2011 1 MH704719-Indonesia-2015-07-10 1 KF716360_USA_2011 1 LC260251-Indonesia-2016 1 MF184767_USA_2011 0.92 LC260249-Indonesia-2016 KR701629_USA_2011 KP007204-Philippines-2012-11 1 0.1 1 AB849000_Japan_2012 0.3 MG599516-Brazil-2013 KU925784_Taiwan_2011 0.16 MG599515-Brazil-2013 0.97 0.7 KX363117_VietNam_2013_01_31 0.98 MG599518-Brazil-2013 1 KX363245_VietNam_2013_04_01 0.84 MG599517-Brazil-2013 0.75 0.82 1 KX362832_VietNam_2012_10_31 0.72 LC386065-Japan-2017-03 0.97 KX171513_SouthKorea_2011 KX362788-VietNam-2014-01-14 1 KX171510_SouthKorea_2010 0.14 KX362700-VietNam-2013-11-22 0.15 0.13 KC443543_Australia_2006_10_30 LC066639-Thailand-2013-04 0.98 KC443686_Australia_2010_09_01 0.94 KX362733-VietNam-2013-12-19 1 KC443356_Australia_2011_05_25 1 LC066661-Thailand-2013-03 0.97 0.99 KC834711_Australia_2009_02_27 0.5 LC066650-Thailand-2013-02 JQ069907_Canada_2008 0.31LC105426-Japan-2014-06-09 0.27 1 KC178765_Italy_2004 0.92 LC105393-Japan-2014-06-04 1 0.94 GQ414540_Germany_2009_01 0.14 LC105003-Japan-2014-05-20 0.08 KC443167_Australia_2000_07_27 0.89 KY616885-Japan-2013 0.13 1 KC443288_Australia_2000_09_17 0.95 0.94 LC105248-Japan-2014-04-14 KC443433_Australia_2000_09_01 AB848007-Japan-2012 1 KC443178_Australia_2000_09_12 0.99 KU059788-Australia-2014-09-23 0.21 KC443730_Australia_2000_09_10 1 KU059777-Australia-2013-09-03 0.71 JQ069876_Canada_2007 0.15 KU059766-Australia-2013-04-19 JQ069900_Canada_2007 LC086725-Thailand-2013 1 0.99 0.31 0.35 JQ069919_Canada_2008 0.04 LC086714-Thailand-2013 HQ657171_SouthAfrica_2009 LC260244-Indonesia-2015 KC834713_Australia_1999_03_28 0.23LC260242-Indonesia-2015 1 1 KC443774_Australia_2001_09_13 0.22LC260245-Indonesia-2015 1 KY657706_China_2017_01_10 KX880428-Germany-2016-05 1 KY657707_China_2017_01_10 0.44 KY000549-Germany-2016-09 1 KY657705_China_2017_01_10 0.27 LC260247-Indonesia-2016 1 KY657704_China_2017_01_10 0.54 LC260246-Indonesia-2016 1 KY657703_China_2017_01_10 0.11 0.4 AB930194-Japan-2014 KY657702_China_2017_01_10 0.97 KJ639023-Japan-2013-03-14 1KY657700_China_2016_12_30 KJ639012-Japan-2013-02-27 0.77 1 KY657701_China_2017_01_10 1 LC169973-Thailand-2014 1 KY657699_China_2016_12_20 0.3 LC169984-Thailand-2014 1 1 KY657698_China_2016_12_10 0.02 KX362821-VietNam-2014-01-21 1 KY657697_China_2016_11_30 KX362612-VietNam-2013-08-28 KY657696_China_2016_11_20 0.03 0.12KX362667-VietNam-2013-10-03 KM660258_Cameroon_2010 KX362569-VietNam-2013-07-02 KY657695_China_2017_01_10 1 0.1 1 KX362722-VietNam-2013-12-12 1 1 KY657694_China_2016_12_30 0.85 0.06 KX362744-VietNam-2013-12-20 1 KY657693_China_2016_12_20 1 0.95 KX362645-VietNam-2013-09-23 KY657692_China_2016_12_10 0.16 0.45 KX362580-VietNam-2013-07-18 1 KY657691_China_2016_11_30 KT936626-Thailand-2012-02-02 1 1 KY657690_China_2016_11_20 0.98 LC086791-Thailand-2013 KM660246_Cameroon_2011 0.15 0.27 LC086802-Thailand-2013 0.98 KJ752350_Guinea-Bissau_2011_03_26 0.17 LC086736-Thailand-2013 KJ752117_Guinea-Bissau_2011_04_21 KX171520-SouthKorea-2015 0.98 KP752693_Gambia_2010_02_08 0.35 1 LC086780-Thailand-2013 0.92 1 0.24 KP752560_SouthAfrica_2011 0.2 0.07 LC086769-Thailand-2013 KJ752161_Togo_2010_06 0.22 KC178763-Italy-2010 1 KM660248_Cameroon_2011 0.73KX171512-SouthKorea-2011 0.28 KM660249_Cameroon_2010 0.99 KX171515-SouthKorea-2012 1 KJ752445_Senegal_2010_12_20 1 1 KC443455-Australia-2011 1 KP752448_Gambia_2010_01_30 0.09 KC443200-Australia-2010-10-23 KJ752394_Gambia_2010_01_21 KY497551-Pakistan-2010 1 0.37 KX655517_Uganda_2013_07_05 0.29 KC443763-Australia-2006-07-19 0.32 KX632276_Uganda_2013_01_05 KC443631-Australia-2006-07-27 0.81 KX632331_Uganda_2012_11_15 1 0.99 KU356585-Bangladesh-2010-01 KX655473_Uganda_2012_11_26 0.41 1 KU356574-Bangladesh-2010-01 1 KX655438_Uganda_2013_01_10 HQ641355-Bangladesh-2005-09 MF184787_USA_2011 0.27 0.32 1 0.8 KT920606-India-2003 MF184806_USA_2011 0.79 KC443007-USA-2005 0.35 KJ748476_Ghana_2008 DQ490551-Bangladesh-2000-04-20 1 KP753082_Senegal_2009 0.98 DQ490545-Bangladesh-2000-02-14 0.99 KP752971_Gambia_2008_06_12 KU199270-Bangladesh-2010-02 KJ751568_SouthAfrica_2009 0.26KC443609-Australia-2000-07-12 1 1 KJ752183_Togo_2010_11 0.3 KC443532-Australia-2000-09-11 0.16 KP753145_Togo_2010_11 1 KC443675-Australia-2000-08-04 1 JF460834_Belgium_2009_03_14 0.96 JF460823_Belgium_2009_02_18 0.17 KC443719-Australia-2000-08-13 0.19 KX655438-Uganda-2013-01-10 0.99 0.23 KP752493_Togo_2009 0.2 KJ627011-Paraguay-1999-09 KP752812_Togo_2010_08 KC443774-Australia-2001-09-13 0.34 KM660247_Cameroon_2011 0.23 KJ626987-Paraguay-2007-10 1 KM660253_Cameroon_2011 0.19 KJ626998-Paraguay-2007-08 1 KM660263_Cameroon_2011 0.98 KJ626893-Paraguay-2007-08 0.43 KM660254_Cameroon_2011 0.01 KJ412775-Paraguay-2008-09 KC257091_Camelusdromedarius_Sudan_2002 0.13 KJ721718-Brazil-2011 0.92 1 MG181876_Malawi_2012_06_06 0.01 JQ069953-Canada-2009 1 MG181887_Malawi_2012_06_11 0.32 1 0 KJ626751-Paraguay-2006-08 MG181766_Malawi_2012_06_05 0.7 KJ721724-Brazil-2007 1 MG181843_Malawi_2012_04_03 0.02 KJ721725-Brazil-2006 0.67 0.35 KJ753787_SouthAfrica_2009_06_03 0 KJ626965-Paraguay-2005-07 1 KJ753716_SouthAfrica_2008 1 KX171511-SouthKorea-2010 JF460812_USA_2006 1 KX171518-SouthKorea-2014 0.25LC228337_Japan_2016 0 0.03 KJ626938-Paraguay-2007-07 0.21 LC228359_Japan_2016 1 0.93 KC443029-USA-2006 LC228370_Japan_2016 0 KC442870-USA-2007 1 LC228348_Japan_2016 KJ721726-Brazil-2005 1 KC443211_Australia_2010_08_15 0.55 KJ626570-Paraguay-2006-09 1 KC443378_Australia_2010_11_24 KJ626829-Paraguay-2006-08 0.99 KC443620_Australia_2010_09_26 0.63 JQ069919-Canada-2008 0.08 1 KC443554_Australia_2010_05_19 0 0.97 HQ657171-SouthAfrica-2009 0.98 KC443266_Australia_2010_11_27 JQ069900-Canada-2007 0.99 JX965134_Australia_2010_11_04 0.89 0.99 JQ069876-Canada-2007 JQ069889_Canada_2007 0.87 0 KC178766-Italy-2006 1 1 KF636322_SouthAfrica_2009_05_18 0 0.04 KJ626773-Paraguay-2005-07 1 KJ753005_SouthAfrica_2010_06_15 0 0 KC443521-Australia-2007-05-02 1 KJ751801_SouthAfrica_2005 KP752779_Zambia_2009 KJ626604-Paraguay-2006-08 1 1 KC442985-USA-2005 0.83 KP752720_Swaziland_2010_07_14 KJ627161-Paraguay-2005-07 0.99 KJ753605_SouthAfrica_2007_05_15 0.85 0.99 JN706449-Thailand-2009 KJ752405_SouthAfrica_2007_02_20 0.84 JN706450-Thailand-2009 0.65 KC442903_USA_2007 0.4 JN706465-Thailand-2009 KC178770_Italy_2008 JN706448-Thailand-2009 KC443345_Australia_2001_06_29 0.54 0.01 1 KC443653_Australia_2000_09_14 0 0.99 KJ626715-Paraguay-2005-09 KJ626859-Paraguay-2005-09 0.18 MG181777_Malawi_2012_01_17 KJ626726-Paraguay-2005-07 0.14 MG181865_Malawi_2012_05_12 KC834713-Australia-1999-03-28 0.43 MG181898_Malawi_2012_03_29 KX363245-VietNam-2013-04-01 0.53 MG181832_Malawi_2012_03_14 1 0.81 0.74 1 0.76 KX363117-VietNam-2013-01-31 MG181788_Malawi_2012_02_11 KX362832-VietNam-2012-10-31 1 MG181854_Malawi_2012_04_27 0.99 KJ752372_Susscrofa_SouthAfrica_2007 0.06KX171510-SouthKorea-2010 0.96 0.25 KX171513-SouthKorea-2011 0.97 KF636311_Zambia_2009_04_13 AB849000-Japan-2012 1 0.76 KJ752458_Zimbabwe_2010_06_26 1 KU925784-Taiwan-2011 KJ753811_SouthAfrica_2008_08_26 0.99 0.97 KF636357_Zambia_2008_10_19 0.2KF716358-USA-2011 0.16 KF716360-USA-2011 0.65 KJ751602_Tanzania_2010_10_15 0.74 1 KR701629-USA-2011 1 1 KP752571_Tanzania_2011_11_17 0.83 KJ751947_SouthAfrica_2009 1 MF184767-USA-2011 KX655495_Uganda_2013_01_04 KC443686-Australia-2010-09-01 1 1 KX655462_Uganda_2013_08_29 0.18 KC443543-Australia-2006-10-30 1 0.7 KF636245_Kenya_2009 0.19 KC443356-Australia-2011-05-25 0.13 0.96 JQ069907-Canada-2008 KJ752662_Uganda_2008_06_18 KC834711-Australia-2009-02-27 1 KJ753401_SouthAfrica_2005_09_26 0.94 KC178765-Italy-2004 0.89 1 KJ751635_Botswana_2007 GQ414540-Germany-2009-01 KF636345_Swaziland_2010_07_22 1 KC443719_Australia_2000_08_13 0.98KC443433-Australia-2000-09-01 0.24 0.11 KC443288-Australia-2000-09-17 0.21 KC443609_Australia_2000_07_12 1 KC443675_Australia_2000_08_04 0.6 KC443178-Australia-2000-09-12 KC443532_Australia_2000_09_11 0.17 KC443730-Australia-2000-09-10 KC443167-Australia-2000-07-27 0.37KJ751702_SouthAfrica_2000_02_14 0.25 KJ752216_SouthAfrica_2000_02_11 0.81 KJ751960-SouthAfrica-2003 0.24 KX655506-Uganda-2013-04-15 1 KJ751812_SouthAfrica_2000_05_04 0.47 KJ752684_SouthAfrica_2000_04_04 0.3 KJ751812-SouthAfrica-2000-05-04 KX655506_Uganda_2013_04_15 0.26 KJ752216-SouthAfrica-2000-02-11 1 1 KJ751960_SouthAfrica_2003 0.97KJ752684-SouthAfrica-2000-04-04 0.15 KJ751702-SouthAfrica-2000-02-14 0.91 1 KP752460_SouthAfrica_1999_12_05 0.96 KP752460-SouthAfrica-1999-12-05 KJ751658_SouthAfrica_1999_12_07 0.37 KJ751658-SouthAfrica-1999-12-07 0.98 KC834712_Australia_2004_02_28 KC834712-Australia-2004-02-28 0.48JN605437_SouthAfrica_1999 0.92 0.93 1 KJ752249_SouthAfrica_1999 0.31KJ752249-SouthAfrica-1999 0.99 KP752845-SouthAfrica-2009 KP752845_SouthAfrica_2009 JN605437-SouthAfrica-1999 KJ627011_Paraguay_1999_09

3.0 3.0

- 2 5 - 2 0 - 1 5 - 1 0 -5 0- 2 5 - 2 0 - 1 5 - 1 0 -5 0 180 181 Figure 2. Bayesian phylogenetic analysis of a possible recombination event in VP1 that became fixed in the 182 population. The strain closest to the recombination event is highlighted in red. Major clades are colored to 183 show the phylogenetic incongruity. Nodes are labeled with posterior probabilities with 95% node height 184 intervals shown. Time axis at the bottom is in years before 2017.

9 A

bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

A

B

B 185 186 Figure 3. A) RDP analysis (top) and BootScan analysis (bottom) shown for a putative recombinant between two 187 R2 segment 1 genotypes. The red line is comparing the minor parent to the recombinant, blue line is comparing 188 the major parent to the recombinant, and the green line is comparing the major parent to the minor parent. The 189 Y axis for RDP (top) pairwise identity, while the Y axis for BootScan (bottom) is Bootstrap support. The X axis 190 is the sequence along segment 1. B) The relevant sites shown below color-coded by which strain the 191 recombinant matches. The Recombinant is the middle sequence, the minor parent is the lowest sequence, and 192 the major parent is the top sequence. Mutations matching the major sequence are shown in blue, while 193 mutations matching the minor parent are shown in purple. Yellow mutations are showing mutations not present 194 in the recombinant sequence but which match between the major and minor parent, possibly suggesting a 195 second recombination event. 196 197 Segment 1 (VP1; ~3,302 base pairs) showed evidence of a recombination event within the R2 clade that 198 was fixed in the population, and resulted in a new lineage (highlighted clade in Fig. 2; Fig. 3; Table 2). The 199 multiple comparison (MC) uncorrected and corrected probabilities were 1.305E-16 and 3.332E-09 respectively. 200 The sequence most closely related to the recombined sequence was KU199270, a human isolate from 201 Bangladesh in 2010 (55). Phylogenetic analysis using tip calibration suggests that the recombination event 202 occurred no later than 2000-2005 (node Cis), so if this is a true recombination event, the 2010 sequence is not 203 the original recombinant. The recombinant region is 100% similar to an isolate also from Bangladesh in 2010 204 (KU248372) (55), which also has a putative recombinant sequence in another region of its genome. The 205 breakpoint regions (99% CI: (541-711) - (1327-1445)) may represent a potential hotspot for segment 1 206 recombination as multiple recombination events show breakpoints in this area (Table 2). Phylogenetic analysis 207 of the alignment of 250 representative sequences containing only the putative recombinant region compared 208 with the rest of the segment 1 sequence showed a consistent subclade shift within the R2 clade. The

10 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

209 recombination events resulted in the incorporation of the following amino acid substitutions in the recombinant 210 strains when compared to the major parent strain: 227 (K→E), 293 (D→N), 297 (K→R), 305 (N→K), and 350 211 (K→E). 212 Segment 3 Intragenic Recombination

0.89

1

1

1

1 VP3 alignment 0.32

0.94

VP3 nucleotide 1

0.96 0.82 1 1 1

0.99 VP3 alignment 1 position 1-1259, 1

1 0.18 VP3 alignment 0.37 1 1 1 0.69 1 1 alignment position 1 0.94 0.99 1 1 1 1800-2213 0.85 1 1 1 1 0.26 1 0.17 0.98 0.99 1 0.3 0.31 0.2 1 0.99 1 0.36 0.63 1 0.22

pos. 1-1259, 0.01 0.23 1 0.78 0.24 0.97 0.99 1 0.22 pos. 2214-2617 2214-2617 0.81 0.74 0.07 0.99 1 1 0.81 1 1

0.01 1 0.7 0.06 0.82 1 1 0.39 0.92 0.12 0.36 0.97 0.95 0.94 0.3 0.89 1 1 0.74 0.58 1 1 1 0.44 0.36 1 1 0.78 0.97 1 1 0.37 0.66 0.01 1

0 1 0.21 1 1800-2213 0.03 1 0.1 1 0.17 0.98 1 0.97 0.16 0.99 0.35 0.39 1 1 0.92 0.04 0.34 1 1 0.11 1 0.06 0.9 0.96 0.03 0.04 0.79 0.26 1 0.9 1 0.92 0.35 1 1 1 0.34 1 1 1 1 1 1 0.98 0.38 1 0.77 0.04 0.54 1 1 0.22 0.74 0.88 1 0.33 1 1 0.81 0.2 1 0.73 0.42 0.91 0.54 0.53 0.87 0.40.99 1 0.88 1 0.35 0.66 A 0.93 0.77 1 0.9 0.91 0.53 0.75 1 1 0.95 1 0.51 1 0.19 0.93 0.19 0.99 0.99 1 0.78 1 1 1 1 0.46 1 0.32 1 0.78 1 1 1 0.5 1 0.36 0.83 0.85 0.82 1 0.99 1 1 0.76 0.06 1 0.81 1 1 1 1 0.93 0.32 0.52 0.25 0.99 0.96 0.98 1 1 0.2 0.27 0.98 1 0.37 1 0.160.32 0.9 1 1 0.91 1 0.99 1 1 1 1 0.94 0.88 0.85 0.9 0.82 1 1 1 0.99 0.29 0.01 0.01 1 1 0.97 1 1 0.34 0.19 0.91 0.97 0.92 0.93 1 0.72 1 0.8 1 1 0.74 0.16 0.04 0.37 1 1 0.43 0.99 1 0.51 1 0.68 0.05 0.22 1 1 0.73 0.04 0.48 0.43 1 0 0.5 1 0.86 1 1 0.65 0.99 0.12 1 0.83 0.42 1 1 1 0.96 1 0.93 0.6 0.38 1 0.41 0.08 0.56 1 0.63 0.9 1 1 1 0.91 0.99 0.23 0.63 1 1 0.24 0.91 0.82 0.24 0.97 0.83 1 1 1 1 0.06 1 1 0.96 0.27 1 0.48 0.04 0.97 0.94 0.86 1 1 0.68 0.57 0.18 1 0.45 1 1 0.73 1 1 0.07 1 0.96 0.41 0.28 1 0.02 1 0.69 1 1 1 0.180.25 1 1 1 0.42 1 1 0.3 1 1 0.74 0.32 0.8 0 0.31 0.39 0.76 1 0.35 1 1 1 0.28 0.84 0.21 0.32 1 0.74 0.24 0.21 0.63 0.52 0.2 0.14 1 0.86 0.76 1 0.95 1 0.13 1 0.91 0.28 0.16 0.44 0.5 0.89 0.98 1 0.99 0.53 0.04 0.9 0.83 1 0.72 1 0.99 0.12 0.97 0.19 0.9 0.38 0.28 0 0.07 1 1 0.01 1 0.23 0.12 0.01 1 1 0.09 0.99 0.14 0.08 0.69 0.96 0 1 0.2 0.15 0.45 0.88 0.88 0.21 0.821 0.35 0 0.01 0.16 0.18 0.9 0.28 1 0 0.04 1 0.18 0.02 0.08 0.02 1 1 0.12 0.94 0.76 0.96 0 0.08 0.94 1 0.47 0.25 1 0.38 0.17 0.96 0.94 0.36 1 0.94 0.87 0.77 0.08 1 0.14 0.63 0.99 0.82 0.14 0.87 0 1 0.98 0.39 1 1 0.16 0.58 0.38 0.55 0.7 0.02 0.990.77 0.96 0 1 1 0.98 0.95 1 1 0.07 1 0.02 0.94 0.98 0.78 0.04 0.03 0.22 0.78 0.16 0.79 1 0.24 0.92 0.44 0.99 0.99 0.22 1 1 0.271 0.46 0.02 1 0 1 1 0.230.95 0.72 0.53 0.34 1 0.03 1 1 1 1 0 0.93 0.87 0.91 0.54 0.14 0.96 1 0.04 0.02 0 0.1 1 0.86 1 0.77 0 0 0.03 1 0.99 0 0.97 0 0 0 0 0 0 0.22 0.410.95 0.95 0.99 1 1 0.05 0.260.97 0.01 1 0.01 1 0.36 0.16 0.88 1 0.2 0.98 0.08 0.88 0.95 0.58 1 0.24 1 1 0.99 1 0.97 1 0.99 1 1 0.54 0.65 1 0.311 0.28 0.24 0 0.020.06 0.26 0.11 0 0.3 0.120.22 0.82 0.01 1 0.96 0.250.88 0.02 1 1 1 0.27 0.22 0.39 1 0.99 0.66 0.61 0.08 0.94 0.88 0.7 1 0.68 0.82 0.21 1 1 0.33

1 0.99 1 0.7 1 0.740.970.50.88 0.03 1 0.14 1 1 0.76 0.01 0.69 0.01 0.87 0.15 0.04 0.55 0.26 0.53 0.09 0.65 1 0.03 0.1 0.72 0.18 0 0 1 0.01 0.78 1 1 0.14 0.04 1 0.57 0 1 0.23 0.36 0.03 0.06 0.1 0.97 0.18 1 0.23 1 1 0.28 0.63 0.96 1 1 1 0.1 0.11 0.07 1 0.33 1 0.49 0.26 1 0.34 1 0.01 0.93 0.170.16 1 0.94 0 0.95 1 0.26

0.94 0 0.7 1 0.97 1 1 0.88 0 0 1 0.99 0.97 0 0.01 0.93 1 0.09 1 0.91 0.76 1 1 0.35 0.55 0.58 1 0.030.13 0.25 0 0.42 0.01 0.1 1 1 0 0.03 1

0.3 0 0.29

0.01

0.12 0.64 0.05 0.05 0.02 0.01 0.29 1 0.98 0 0.07 0.82 0.02 0.030.07 0.6 0.57

0.99 0.05 0.03 0.06 0.52 0.44 0.54 0.25 0.39 0.83 0.02 0.36 0.11 0.58 0.03 1 1 0.53 0.03 0.13 1 0.08 0.54 0.11 0.13 0.47 0.56 0.58 0.95 1 0.94 1

0.43 0.25 0.98 0.46 0.02 1

1 0.82 0.02 0.38 0.93 0.15 0.73

1

0.62 1

0.17

0.15

0.29 0.59

0.34

0.33

0.85

7 10

7.0 10.0 0 10 20 30 40 50 60 70 B 0 10 20 30 40 50 60 70 80 90 100

213 214 Figure 1. Segment 3 Recombination Event With. Multiple Isolates Supporting. a) Phylogenetic trees made using 215 alignments from (left) nucleotide position 1-1259 and 1800-2213 representing the major parent region 216 excluding second recombination event (Fig. 5), and (right) nucleotide position 2214-2617 representing the 217 minor parent sequence. B) BootSccan (top) and RDP (bottom). The red line is comparing the minor parent to 218 the recombinant, blue line is comparing the major parent to the recombinant, and the green line is comparing 219 the major parent to the minor parent. The Y axis for RDP (bottom) pairwise identity, while the Y axis for 220 BootScan (top) is Bootstrap support. The X axis is the sequence along segment 3. 221 222 Two recombination events occurring in segment 3 (VP3) appear to have fixed in the population, and are 223 seen in many descendant sequences (Table 2). The first strain identified as a putative recombinant is KJ753665, 224 an isolate from South Africa in 2004. The recombinant region occurs between positions (2129-2174) - 2531.

11 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

225 This region is 98.7% identical to a porcine strain isolated in Uganda in 2016 (KY055418)(56), while the rest of 226 segment 3 is 95.8% similar to a 2009 human isolate from Ethiopia (KJ752028). MC uncorrected and corrected 227 probabilities were 2.491E-20 and 6.946E-13, respectively, with 107 isolates flagged as possibly derived from 228 the recombination event. Phylogenetic analysis using 450 randomly selected VP3 sequences (excluding 229 sequences lacking collection dates) within the putative recombinant region, along with an analysis of the 230 genome excluding the two major recombination events, resulted in five sequences showing a significant 231 phylogenetic incongruity (Fig. 4). The incongruity appeared between sub-lineages within the larger M1 lineage 232 of VP3. Amino acid substitutions in the recombinant region included positions 748 (M→T) and 780 (T→M).

VP3 alignment position 1-1260, VP3 alignment 1800-2213 position 1260-1800 M1 M1 1 KX880419_Germany_2015_03 A 1 0.91 KY616887_Japan_2013 0.47 MF997037_USA_2015 0.89 LC103021_Japan_2014 0.8 LC103021_Japan_2014 1 1 KY616887_Japan_2013 MF997037_USA_2015 1 1 1 1 KX880419_Germany_2015 0.15 KJ639025_Japan_2013_03 1 1 KC443224_Australia_2008 KJ639025_Japan_2013 KC443523_Australia_2007 0.37 KC443224_Australia_2008 0.92 KJ721712_Brazil_2008 KC443523_Australia_2007 0.06 1 0.41 0.32 KJ721712_Brazil_2008 0.18 KJ919584_Hungary_2012 M2 0.35 0.69 0.08 KC443688_Australia_2010 0.32 KJ919584_Hungary_2012 0.49 0.82KJ752827_SouthAfrica_2003

0.54 KX171537_SouthKorea_2011 0.94 KC443688_Australia_2010 0.04 M2 0.96 0.95 1 1 JQ069785_Canada_2009 0.73 0.54 JQ069721_Canada_2007 KJ752827_SouthAfrica_2003 0.28 0.49 0.18 KX171537_SouthKorea_2011 1 0.99 JQ069785_Canada_2009 0.74 0.2 0.79 JQ069721_Canada_2007 1 KX171540_SouthKorea_2013 1 1 1 0.82 KC443589_Australia_1977 0.51 0.29 KJ919514_Hungary_2012 1 KJ919514_Hungary_2012 0.35KF636247_Kenya_2009 1 1 0.29 0.17 KX171540_SouthKorea_2013 0.58 JF304928_Kenya_1982 1 1 KX655497_Uganda_2013 0.81 1 1 KX655464_Uganda_2013 0.31 0.91 KX655464_Uganda_2013 KX655497_Uganda_2013 MF184865_USA_2011 0.94 1 KC443009_USA_2005KF636247_Kenya_2009 0.26 0.3 0.98 1 1 KM660308_Cameroon_2011 0.92 KP753180_Uganda_2009 0.24 1 KC443009_USA_2005 KC443589_Australia_1977 0.39 0.27 1 1 KJ721709_Brazil_2010 KJ751626_Ghana_1999KC443347_Australia_2001 1 KU356587_Bangladesh_2010 MG181614_Malawi_2013 1 JF304928_Kenya_1982 KJ721709_Brazil_2010 0.82 KP753180_Uganda_2009MG181614_Malawi_2013 1 KU356587_Bangladesh_2010 1 0.99 1

1 KX655453_Uganda_2013 0.05 MF184865_USA_2011 0.4 1 1 JQ069732_Canada_2007HQ657173_SouthAfrica_2009 KJ751626_Ghana_1999KC443347_Australia_2001JQ069732_Canada_2007KM660308_Cameroon_2011 0.64 1 HQ657173_SouthAfrica_2009 1 0.99 0.97 1 0.33 GU384193_Bostaurus_China_2008 0.88 0.46 0.43 0.98 0.52 1 0.98 1 0.08 0.93 0.5 0.8 GU384193_Bostaurus_China_2008 1 0.99 KP882037_Bangladesh_2008KU199283_Bangladesh_2013 0.99 0.13 KU356642_Bangladesh_2013 0.1 0.81 0.25 0.17 1 JN706507_Thailand_2008KU356664_Bangladesh_2013

1 1 KU356664_Bangladesh_2013 0.54 1 0.99 1 1 0.98 1 0.73 0.48 1 KP882037_Bangladesh_2008KU356642_Bangladesh_2013 1 1 1 0.76 JX965141_Australia_2010 JN706507_Thailand_2008 JQ069768_Canada_2009 1 KU199283_Bangladesh_2013 0.98 1 0.22 1 0.62 0.04 KX171545_SouthKorea_2015 E U 7 0 8 9 2 5 _ C a n is lu p u s fa m ilia ris _ U S A _ 1 9 7 9 E U 7 0 8 9 3 6 _ C a n is0.99 lu p u s fa m ilia ris _ U S A _ 1 9 7 9 1 JQ069768_Canada_2009

1 E U 7 0 8 9 1 4 _ C a n is lu p u s fa m ilia ris _ U S A _ 1 9 8 0 0.6 K P 8 8 2 0 8 1 _ B a n g l a dKF716384_USA_2011 e s h _ 2 0 0 7 1 0.97 0.11 KX536659_India_2010 1 1 1 0.99 1 1 KF716384_USA_2011 0.99 LC227960_India_2013 JX965141_Australia_2010 0.27 1 KX171545_SouthKorea_2015 0.33 KC443325_Australia_2006 0.16 1 KX536659_India_2010 KP882081_Bangladesh_2007 0.29 K C 4 4 3 2 0 2 _ A u s t r a l i a _ 2 0 1 0

1 0.68 1 1 1 E U 7 0 8 9 0 3 _ U S A _ 1 9 8 4 0.73 KC443325_Australia_2006 0.55 K X 6 5 5 4 5 3 _ U g a n d a _ 2 0 1 3

E U 7 0 8 8 9 2 _ Is ra e l_ 1 9 8 5 LC227960_India_2013 H M 6 2 7 5 4 4 _ K e n y a _ 1 9 8 7 1 1 0.19 0.94 1 0.97 0.7 1 0.77 K T 9 3 6 6 2 8 _ T h a i l a n d _ 2 0 1 2 1 1 KP883027_Mali_2008 1 1 KX171539_SouthKorea_2012 K F 5 0 0 1 7 6 _ S u s s c r o f a _ S o u t h K o r e a _ 2 0 0 4 1 K P 8 8 2 4 7 7K _ GY 4h 9a 7n 5a 5_3 2 _ 0 P 0 a8 k i s t a n _ 2 0 1 0 KC443202_Australia_2010 J F 3 0 4 9 1 7 _ K e n y a _ 1 9 8 9 K X 1 7 1 5 3 9 _ S o u t h K o r e a _ 2 0 1 2

0.96 0.83 0.98 0.99 0.98 1 K P 8 8 3 0 2 7 _ M a l i _ 2 0 0 8 1 1 1 K Y 4 9 7 5 5 3 _ P a k i s t a n _ 2 0 1 0 0.9 K T 9 3 6 6 2 8 _ T h a i l a n d _ 2 0 1 2 K C 1 7 4 9 8 7 _ I n d ia _ 2 0 0 4 0.9 K P 8 8 2 4 7 7 _ G h a n a _ 2 0 0 8 K C 1 7 5 0 0 8 _ I n d ia _ 2 0 0 4 _ 0 3 E U 7 0 8 9 2 5 _ C a n is lu p u s fa m ilia ris _ U S A _ 1 9 7 9 1 1 0.99 E U 7 0 8 9 3 6 _ C a n is lu p u s fa m ilia ris _ U S A _ 1 9 7 9 1 F J 3 4 7 1 2 4 _ L a m a g u a n ic o e _ A r g e n t in a _ 1 9 9 8 G U 2 9 6 4 2 5 _ Ita ly _ 1 9 9 6 F J 3 4 7 1 2 4 _ L a m a g u a n ic o e _ A rg e n tin a _ 1 9 9 8 E U 7 0 8 9 1 4 _ C a n is lu p u s fa m ilia ris _ U S A _ 1 9 8 0 1 A B 5 7 3 0 8 1 _ B o s t a u r u s _ J a p a n _ 2 0 0 6 1 1 0.35 0.96 G U 2 9 6 4 2 4 _ Ita ly _ 1 9 9 6 K T 2 8 1 1 2 2 _ U S A _ 2 0 1 2 1 K F 6 3 6 2 5 8 _ B o s t a u r u s _ S o u t h A f r i c a _ 2 0 0 7 0.18 0.23 K P 0 0 6 5 0 8 _ G u a t e m a l a _ 2 0 0L 9 C 3 4 0K 0 X 1 2 5 1 _ 2 J 8 a 7 p 1 a _ n B _ o2 s0 t 1 a 4 u r u s _ T u r k e y _ 2 0 1 5

J F 3 0 4 9 1 7 _ K e n y a _ 1 9 8 9 0.07 L C 0 9 5 9 5 9 _ V ie tN a m _ 2 0 0 7 H M 6 2 7 5 4 4 _ K e n y a _ 1 9 8 7 J N 8 3 1 2 1 1 _ B o s t a u r u s _ S o u t h A f r i c a _ 2 0 0 7 0.73 A B 8 5 3 8 9 2 _ B o s ta u r u s _ J a p a n _ 2 0 1 3 _ 1 9 K U 9 5L 6 0C 0 3 8 3 _ 6 H 5 o 9 n 3 d _K u J U r a a 9 p s 6 a _ 2 n 2 1 _ 0 3 21 0 01 _ 1 _ 2 U S A _ 2 0 1 4 1

0.8 K J 4 1 1 4 3 4 _ U S A _ 2 0 1 2 E U 7 0 8 9 0 3 _ U S A _ 1 9 8 4 1 F J 3 4 7 1 1 3 _ B o s ta u ru s _ A rg e n tin a _ 1 9 9 8 1 A B 5 7 3 0 7 2 _ B o s ta u r u s _ J a p a n _ 2 0 0 8

0.93 1

E U 7 0 8 8 9 2 _ Is ra e l_ 1 9 8 5 K P 7 5 2 8 7 9 _ B o s t a u r u s _ S o u t h A f r i c a _ 2 0 0 9 F J 3 4 7 1 0 2 _ L a m a g u a n ic o e _ A rg e n tin a _ 1 9 9 9 A B 5 7 3 0 8 1 _ B o s ta u ru s _ J a p a n _ 2 0 0 6 1 1 K C 4 4 3 6 0 0 _ A u s t r a lia _ 2 0 0 8 K F 5 0 0 1 7 6 _ S u s s c ro fa _ S o u th K o re a _ 2 0 0 4 1 E F 5 5 4 0 8 4 _ B e lg iu m _ 2 0 0 2 0.64 G U 2 9 6 4 2 5 _ Ita ly _ 1 9 9 6 0.85 0.91 K F 6 3 6 2 5 8 _ B o s ta u r u s _ S o u th A fr ic a _ 2 0 0 7

0.67 J N 8 3 1 2 1 1 _ B o s t a u r u s _ S o u t h A f r ic a _ 2 0 0 7 G U 2 9 6 4 2 4 _ Ita ly _0.99 1 9 9 6 G U 8 2 7 4 0 8 _ F e lis c a tu s _ Ita ly _ 2 0 0 5

0.51 K T 2 8 1 1 2 2 _ U S A _ 2 0 1 2 K Y 4 2 6 8 1 1 _ C a p r e o lu s c a p r e o lu s _ S lo v e n ia _ 2 0 1 5 K U 5 0 8 3 8 2 _ H u n g a ry _ 2 0 0 2 K C 1 7 4 9 8 7 _ I n d ia _ 2 0 0 4 K P 7 5 2 8 7 9 _ B o s t a u ru s _ S o u t h A f ric a _ 2 0 0 9 K P 0 0 6 5 0 8 _ G u a te m a la _ 2 0 0 9 K C 4 4 3 6 0 0 _ A u s t r a lia _ 2 0 0 8 K J 4 1 1 4 3 4 _ U S A _ 2 0 1 2 E F 5 5 4 0 8 4K _ B C e 1 lg 7 5iu 0 m 0 _ 8 2 _ 0 In 0 d2 ia _ 2 0 0 4 0.77 A B 8 5 3 8 9 2 _ B o s ta u r u s _ J a p a n _ 2 0 1 3 _ 1 9

1 L C 0 9 5 9 5 9 _ V ie tN a m _ 2 0 0 7 K U 9 6 2 1 3 0 _ _ U S A _ 2 0 1 4 G U 8 2 7 4 0 8 _ F eA lis B 5c 7a 3tu 0 s 7 _ 2 Ita _ B ly o _s t2 a 0 u 0 r u5 s _ J a p a n _ 2 0 0 8 F J 3 4 7 1 1 3 _ B o s ta u ru s _ A rg e n tin a _ 1 9 9 8 A Y 7 4 0 7 3 9 _ B e lg iu m _ 2 0 0 0 J F 7 1 2 5 6 8 _ E q u u s c a b a llu s _ A rg e n tin a _ 1 9 9 3 K U 9 5 6 0 0 8 _ HL Co n 3 d 4 u 0 ra 0 1 s 5 _ _ 2 J 0 a 1 p 1 a n _ 2 0 1 4

F J 3 4 7 1 0 2 _ L a m a g u a n ic o e _ A rg e n tin a _ 1 9 9 9 L C 3 3 6 5 9 3 _ J a p a n _ 2 0 1 2 J X 2 7 1 0 0 3 _ T u n isJ ia F 4 _ 2 2 1 0 9 0 7 8 7 _ J a p a n _ 2 0 1 0 K U 5 0 8 3 8 2 _ H u n g a ry _ 2 0 0 2

1

1

J F 4 2 1 9 7 7 _ J a p a n _ 2 0 1 0 J N 9 0 3 5 2 4 _ E q u u s c a b a llu s _ Ire la n d _ 2 0 0 3 1 1 J X 2 7 1 0 0 3 _ T u n is ia _ 2 0 0 8 K P 8 8 2 6 3 1 _ G h a n a _ 2 0 0 8 K Y 4 2 6 8 1 1 _ C a p r e o lu s c a p r e o lu s _ S lo v e n ia _ 2 0 1 5

K J 7 5 2 1 3 0 _ E th io p ia _ 2 0 0 8 K F 5 0 0 5 2 1 _ U S A _ 2 0 1 2 1 K J 7 5 2 1 3 0 _ E th io p ia _ 2 0 0 8 A Y 7 4 0 7 3 9 _ B e lg iu m _ 2 0 0 0 1

K P 8 8 2 6 3 1 _ G h a n a _ 2 0 0 8 K J 4 1 2 7 2 3 _ P a ra g u a y _ 2 0 0 7

K F 5 0 0 5 2 1 _ U S A _ 2 0 1 2 JF712568_Equuscaballus_Argentina_1993

J F 7 1 2 5 7 9 _ E q u u s c a b a llu s _ A rg e n tin a _ 2 0 0 6 K F 9 0 7 2 8 8 _ B ra z il_ 2 0 0 8

1 1 J X 0 3 6 3 6 7 _ E qK u J u 4 s 1 c 2 a 6 b 2 a 2 llu _ P s _a Ara rg g u e a n y tin _ 2 a 0 _ 0 2 9 0 0 8 K J 9 1 9 5 5 0 _ H u n g a ry _ 2 0 1 2

K J 4 1 2 7 2 3 _ P a ra g u a y _ 2 0 0 7

J N 8 7 2 8 6 7 _ E q u u s c a b a llu s _ A rg e n tin a _ 2 0 0 8 K F 9 0 7 2 8 8 _ B ra z il_ 2 0 0 8

K J 8 7 5 7 9 3 _ C a n is lu p u s fa m ilia ris _ H u n g a ry _ 2 0 1 2 L C 1 5 8 1 2 1 _ C h iro p te ra _ Z a m b ia _ 2 0 1 2

J F 7 1 2 5 7 9 _ E q u u s c a b a llu s _ A rg e n tin a _ 2 0 0 6 J X 0 3 6 3 6 7 _ E q u u s c a b a llu s _ A rg e n tin a _ 2 0 0 8 K X 8 8 0 4 4 1 _ G e rm a n y_ 2 0 1 4 K J 4 1 2 6 2 2 _ P a ra g u a y _ 2 0 0 9 K F 9 0 7 2 8 7 _ B ra z il_ 2 0 1 0 J X 9 4 6 1 7 0 _ C h in a _ 2 0 1 1 K C 9 6 0 6 2 1 _ C h iro p te ra _ C h in a _ 2 0 1 1 K J 9 1 9 5 5 0 _ HK u nX g 8 a 8 ry0 4 _ 4 2 1 0 _ 1 G 2 e rm a n y _ 2 0 1 4

J N 8 7 2 8 6 7 _ E q u u s c a b a llu s _ A rg e n tin a _ 2 0 0 8

L C 1 5 8 1 2 1 _ C h iro p te ra _ Z a m b ia _ 2 0 1 2 J X 9 4 6 1 7 0K _ J C 8 h 7 in 5 7 a 9 _ 3 2 _ 0 C 1 1a n is lu p u s fa m ilia ris _ H u n g a ry _ 2 0 1 2 L C 3 4 0 0 2 6 _ J a p a n _ 2 0 1 4

K F 9 0 7 2 8 7 _ B raK z C il_ 9 2 6 0 0 16 2 0 1 _ C h iro p te ra _ C h in a _ 2 0 1 1 K F 6 9 0 1 2 7 _ A u s tra lia _ 2 0 1 2

K P 2 5 8 4 0 0 _ B e lg iu m _ 2 0 1 2

K X 2 1 2 8 7 1 _ B o s ta u ru s _ T u rk e y _ 2 0 1 5

L C 3 4 0 0 2 6 _ J a p a n _ 2 0 1 4

K F 6 9 0 1 2 7 _ A u s tra lia _ 2 0 1 2

K P 2 5 8 4 0 0 _ B e lg iu m _ 2 0 1 2

K X 8 1 4 9 2 6 _ C h iro p te ra _ C h in a _ 2 0 1 5

K X 8 1 4 9 2 6 _ C h iro p te ra _ C h in a _ 2 0 1 5

JN903523_Equuscaballus_Ireland_2004

KR632625_Italy_2012

JN903524_Equuscaballus_Ireland_2003

KR632625_Italy_2012

JN903523_Equuscaballus_Ireland_2004

KJ879450_Rattusnorvegicus_Germany_2010

KJ879450_Rattusnorvegicus_Germany_2010

7.0 B 10.0

233 234 Figure 5. A) Phylogenetic analysis of segment 3. Left tree (major parent) was made from an anlignment of 235 nucelotide positions 1-1260 and 1900-2213 and right tree (minor parent) was made from nucleotide positions 236 1260-1800. M2 strains have been collapsed, and recombinant is colored in pink. B) BootSccan (top) and RDP

12 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

237 (bottom of VP3 M2 putative recombinant.. The red line is comparing the minor parent to the recombinant, blue 238 line is comparing the major parent to the recombinant, and the green line is comparing the major parent to the 239 minor parent. The Y axis for RDP (bottom) pairwise identity, while the Y axis for BootScan (top) is Bootstrap 240 support. The X axis is the sequence along segment 3. 241 242 A second potential recombination event was identified in segment 3 between sublineages within the M2 243 lineage (Table 2; Fig. 5). However, when we created split alignments and ran a phylogenetic analysis in 244 BEAST, only one sequence showed a phylogenetic incongruity supporting this event (KX655453). Amino acid 245 substitutions as a result of this event included positions 405 (I→V), 412 (V→M), 414 (N→D), 441 (N→D), 458 246 (I→V), 459 (I→T), 468 (L→F), 473 (N→D), 486 (M→I), 518 (N→S), and 519 (E→G). 247 One possible complication with relying on phylogenetic incongruity as evidence of major recombination 248 events is that the recombinant virus may evolve faster than the parental strains, and, even if possessing an initial 249 advantage such as immune avoidance, may eventually converge back toward the major parent’s sequence. Thus, 250 phylogenetic analysis likely underestimates the frequency of transiently stable recombination events in the 251 population. This may account for the fact that the number of VP3 isolates flagged as deriving from a 252 recombination event based on sequence is greater than the prevalence of recombination predicted by the 253 phylogenetic analysis. 254 Intergenotypic Recombination 255 We found evidence for intergenotype recombination in all segments except segment 7 (NSP3), segment 256 10 (NSP4), and segment 11 (NSP5) (Table 1; Supplementary Table 1). Instances of intergenotypic 257 recombination in segment 1 (VP1) (KU714444, JQ988899) only occurred in regions where the amino acid 258 sequence was highly conserved across genotypes, so these events resulted in few, if any, nonsynonymous 259 mutations. Of the putative events observed in more than two environmental isolates with strong support from 260 detection methods, only events in segment 9 (VP7) occurred between different genotypes (Table 2). The 261 segment 9 recombinant region amino acid substitutions that match the minor parent and differ from the major 262 parent are shown in Supplementary Table 2. 263 Structural Prediction of Recombinant Proteins 264 The protein models generated by I-TASSER for the intergenotypic recombinant G proteins showed that, 265 although the amino acid changes for the G1-G2 recombinant were substantial (25 changes) (Supplementary 266 Table 2), the secondary and tertiary structures seemed largely intact (Fig. 6). The G3-G1 and G6-G6 267 recombinants each showed four amino acid changes. The G9-G1 recombinant showed the most secondary 268 structure disruption, including a loss of beta sheets, a loss of antiparallel beta sheets, and slightly shorter beta 269 sheets/helices, which could indicate lower stability. However, based on the structural modeling, the tertiary 270 structure appeared to be maintained, suggesting that the putative recombinant glycoproteins were able to form 271 properly folded G-proteins (Fig. 6).

13 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

272 273 Figure 6. VP7 protein models predicted from I-TASSER. Proteins were predicted from amino acid 274 alignments of the four strongly supported G recombinants a) G2-G1, C=-1.38 KC443034 b) G9- 275 G1:AF281044, C=-1.29 c) G3-G1, C=-1.27 KJ751729 d) G6-G6: KF170899, C=-1.42. 276 277 Antigenic epitope predictions generated by IEDB (53) and SVMTriP (54), as well as a large study done 278 on mammalian G-types (57), showed that VP7 recombination occasionally results in amino acid substitutions in 279 conserved epitopes (Supplementary Table 2). For example, the amino acid sequence RVNWKKWWQV is 280 usually flagged as, or part of, an epitope in most G types including G2, G3, G4, G6, G9, but not in G1. In the 281 G2-G1 recombinant flagged in multiple sequences (Table 2; Figure 1), this region is altered (sometimes a KR 282 substitution and sometimes multiple amino acid substitutions) (Supplementary Table 2). Moreover, the region 283 where the recombination occurred, although containing highly variable regions, had low solvent accessibility. 284 Thus, although the amino acid sequences may diverge, the protein folding and three-dimensional structure 285 remains relatively conserved (Fig. 6). The G3-G1 recombinant had amino acid substitutions in two of four 286 conserved epitope regions due to the recombination event. There was also a conserved epitope sequence around 287 amino acids 297-316 in G9 proteins that was altered in the G9-G1 recombinant so it no longer appeared as an 288 epitope. In other words, changes in the amino acid sequence that do not result in significant changes to the 289 protein configuration may nevertheless reduce binding by antibodies or T-cell receptors, and may provide a 290 selective advantage allowing the virus to avoid immune surveillance. 291 Recombination junctions often correspond to RNA secondary structural elements

14 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP7 G6 0 100 200 300 400 500 600 700 800 900 1000 0 100 200 300 400 500 600 700 800 900 1000 VP7 A A G9 i ii iii

GG CUU UAA AAG AGA GAA UUU CC GUC U GGC UAG C GG UUA GC U CC UUUU AAU GUA UGG UAU UGA AUA UAC C ACAA UUC UAA C CU UUC UGA UAU CAA UAG UUU UA UU GAA C UAU AU AUU A AA AUC AC UA AC UAG U GC GAU GGA C UU UAU A AUUU A UAGA UUU C UU UUA CUU AUU GUU AUU GC A UC AC CU UUU GUU AAA AC AC AA AAU UAU GGA AUU AAU UUA CC GAUC AC UGGC UC C AU GGA UAC AGC AUA UGC AAA UUC AUC AC AGC A AGA AAC AUU UUU GAC UUC AAC GC U AUG C UUAUA UUA UC C UAC AGA AGC AUC AA C UCAAA UUG GAG AUA CGG AAU GGA AGG AUA CUC UGU CC CA AUU AU UC UU GA CU AAA GG GU GG CC AA CUG GA UC AG UCU A UUU UA AA GA AUAC AC U GAU AUC GC UUC A UUC UC AAUU GAU C CAC AA CUU UAU UGU GAU UAU AAU GUU GUA CUG AUG AA GUA UGA UUC AAC GUU AGA GC UAGA UAU GUC UGA AUU AGC UGA UUU AAU UC UAAA UGA AUG GUU AUG UAA CCC AAU GGA UAU AAC A UU AUA U UA UUA UC AG CA AAC A GAU GA AGC G AA UAA AUG G AU AUC G AU GGG ACA G UC UU GU AC CAU AA AA GU AU GU CCA UU GA AU AC GC AGA CU UU AG GA AU AGGU UGU AUU AC C AC A AAU AC A GC GAC A UUU GAA GAG GUG GC UAC A AGU GAA AAA UUA GUA AU AAC C GAUGU UGU UGA UGG UGU GAA CC AUAA ACU UGA UGU GAC UAC AAA UAC C UG UAC AAU UAG GAA UUG UAA GAA GUU AGG AC C AAGAG AA AA UGU AGC G AU UAU A CA AGU CGG U GG CUC A GAU GU GUU A GA UAU UAC A GC GGA U CC AA CUA CUG C AC CAC A AAC UG AAC G UA UGA UGC G AG UAA A UU GGA AG AAA UGG UGG CAA GUU UUC UAU ACG GUA GUA GAU UAU AUUAA UC AGAU UGU GCA A GU UA UG UC CAA AA GA UC ACG G UCA UU AA AU UC AG CAG CU UU UU ACU A UAG GGU UUG AU AUA UC UUAG AUU AGA AUU GUA UGA UGU GAC C

GG CU UUA AAA GC G AGA AUU UC CGU UUG GC UAGC GG UUA GC UC CU UUU AA UG UA U GG UAU U GA AU AUA C C AC AA UUC UA A UC UUC U UG AUA U CG AU UAC A UU G UU AAA UU A UA UAU U AA AAU C AA UA AC G AGA A UA AU GGA C UA UAU A AU UU AC AGA UUU C UA UUU AU AGU AGU AAU AUU GG C C AC CA UGA UAA AU GC GC AA AAU UAU GG AGU UAA UUU GC CAAU UA CAG GAU CAA UGG AU AC U AC AUAU GC AAA U UC UAC GC A AAG UGA AC C AU UUU UAA CAU C AAC U CUU UGU UUG UA UUA UC C AGU UGA GG CAU C AA AC GAAA UA GC U GAU AC C GAA UG GAA AAA UAC CUU GUC ACA ACU A UU UU UG AC AA AA GG AU GG CC AA CA G GA UCU G UG UA CU UU AA AG AA UA UAC U GA UA U AG CAG C UU UUU C AG UG GAU C CA CAG UU GUA C UG UGA U UA CAAU UUA GUU UUA AUG AA AUA UGA UUC AAC AC UAG AAU UAG AUA UG UC UGAA UUG GC C GAU C UUAU AUU GAA UGA AU GGC UGU GC AAUC CA AUG GAC AUA AC GUU GUA UUA UUA CC AAC A AA CUG AUG AAG CAA AU AAA UGG AUA UC AAU GGG UUC GUC UUG CA CAG UUA AAG UGU GUC C AUUA AAU AC AC AA AC AC UUGG UAU U GG AU GUC UAA CAA CUG AU C CA AAC AC A UUU GAA AC AGU UGC GAC AAC GG AA AA G UU AGU G AU UA CUG A UG U UG UAG AU G GU GUC A AC CAU A AA UU AAA U GU U AC AAC AG C AA CUU G CA CUA UA C GC AAC UGU A AA AA AUU A GG AC CA AG AGA A AAC GUA GC A GUU AU AC A GGU AGG CG GC GC GA AC A UUU UA GAC AUC AC A GC UGAU CC AAC AAC UGC AC CAC AGA CAG AGA GAA UG AUG CGA GUA AAU UG GAA A AA AUG GUG GC AAG UGU UUU AUA C AG UA GUG GAU UAC GUC AA UC AGAU AAU UCA A GCA A UG UCCA AA AG AU CU AG AU CG CU A AA UU CA UC AG CA UU CU ACU A CAG AGU G UA GAU AC A UG UUA G AU UAG AG U UG UAU GAU G UG AC C

VP7

0 100 200 300 400 500 600 700 800 900 1000 G1 0 100 200 300 400 500 600 700 800 900 1000 VP7 iv vG2

GG CU UUA AAA GAG AGA AUU UC C GU CUG GC UAAC GG UUA GC UC CU UUU AA UG UA UGG U AU UGA AU AUA C CA CAA UUC UA A UC UUU CUG A UA UC AAU CAU U CU ACU C AACU A UA UA UU AAA AU CA GU GA CCC GA AU AA UG GAC UA CA UU AU AUA UAGA UUU UUG UUA AU UUC UGU AGC AUU AU UUG CCU UAA C UA AA GC UCAG AAC UAU GG AC UUAA UAU AC CAAU AA CAG GAU C AA UGG AU AC UGUA UAC UC C AAC UC UAC UC A AGA AGG AA UAU UUC UAA C AU CC ACA UUA UGU UUG UA UUA UC C AAC UGA AG C AA GUA CUC AAA UC AGU GAU GGU GAA UG GAA AGA CUC AUU AUC ACA A AU GU UUCU U AC AA AA GGU UG GC CA ACA GG AU CA GU CUA UU UU AA AG AG UAC UC AAA UA U UGU UG AUU U UU CCG UU GAC C CA CAA UU AUA U UG UGA UUA U AAC UUA GUA CUA AUG AA GUA UGA UC A AAA UC UUG AAU UAG AUA UG UC A GAA UUA GC U GAU UU GAU AUU GAA UGA AU GGU UAU GUA AUC CA AUG GAU AUA AC AUU AUA UUA UUA UC A ACA AU CGG GAG AAU CAA AU AAG UGG AUA UC A AU GGG AUC AUC AUG UA CUG UGA AAG UGU GUC CA CUG AAU AC A CAA AC GUU AGG AAU A GG UUGUC AAA C AA CGA AU GUA GAC UCA UUU GAA AC AGU UGC UGA GAA UG AA AA AUU A GCU AU AG UGG A UGU CG UUG AU G GGA UA AAU C AU AAA AU AAA U UU GAC AAC UA C GA CAU GUA C UA UU CGA AAU U GU AAG AA GUU A GG UC CA AGA GA GAAU GUA GC UGUA AU AC AAGU UGG UG GC UC UAAUG UAU UA GAC AUA AC AGC A GAU CC AAC GAC UAA UC CAC AAA UUG AGA GAA UG AUG AGA GUG AAU UG GAA A AG AUG GUG GC A AG UAU UUU AUA CUA UA GUA GAU UAU AUU AAUC A GAU UGU ACA GG UA AU GUCCA AA AG AU CAA GA UCA U UA AA UUC UG CU GC GU UUU AU UA UAG A GUA UA GAU AU A UCU UA GAU U AG AAU UG UAU G AU GUG AC C

GG C UUUA AAA GAG AGA AUU UC CGU C UG GC UAGC GG UUA GC UC UU UUU AA UG UA U GG UAU U GA AU AUA C CA CAA U UC UG AC CA UU UUG A UA UC UAU C AU AUU A UU GAA UU A UA UAU UAA A AA CU AUA A CU AAU A CG AU GGA C UA CAU A AU UU UU AGA UUU UUA C UA CU CAU C GC UC UGAU GU CAC C AU UUG UGA GGAC G CAA AAU UAU GG C AU GUA UUU AC CAAU AA CGG GAU CAC UAG AC GC UGUA UAC AC A AA UUC AAC UAG UGG AGA AU CAU UUC UAA C UU CA AC GC UA UGU UUA UA CUA UC C AAC AGA AG CUA AAA AUG AGA UU UC AGAU AAU GAA UG GGA AAA UAC UC UAUC ACA A UU AU UU UU AA CU AA AG GA UG GC CG ACU G GA UCA G UU UA UU UU AA AG AC UA CAA U GA UA UUA CUA C AU UUU C UA UG AAU C CA CAA C UG UA UUG U GA UUA U AAU GUA GUA UUG AUG AG AUA UGA UAA UAC AU C UG AAU UAG AUG CA UC GGAG UUA GC A GAU C UUAU AUU GAA CGA AU GGC UGU GC A AUC C UAUG GAU AUA UC ACU U UA CU AU UA UCA ACA A AA UA GCG A AU CA AA UA AA UG GA UA UCA A UG GG AAC AGA CUG C AC GG UAA AAG UUU GUC CA CUC AAU AC AC AA AC UUU AGG AAU UGG AU GC A AAA C UA CGG AC GUG GAU AC A UUU GAG AU UGU UGC GUC GUC UG AA AA AUU G GU AAU UA CUG A UG UUG U AA AU GGU G UU AAU C AU AAA AU A AA UAU U UC AAU AA G UA CGU G UA CUA UA C GU AAU U GU AAU AA A CUA GG AC CA CG AGA A AAU GUU GC UAUA AU UC AAGU UGG UG GAC CGA AC G CAC UA GAU AUC AC U GC UGAU CC AAC AAC AGU UC CAC AGG UUC AAA GAA UU AUG CGA GUA AAU UG GAA AAA AUG GUG GC A AG UGU UUU AUA C AG UA GUU GAC UAU AUU AA CC AAAU UAU ACA A GU UA UG UCCA AA CG GU CA AG AU CA UU AG AC AC AG CU GC UU UU UA UU AU AG AAU U UA GAU AU AGC U UA GAU U AG AA UUG U AU GAU G UG AC C B

292 293 Figure 7. Consensus mountain plots made in RNAalifold using the ViennaRNA package, of the predicted RNA 294 secondary structures based on alignments made using full sequences for each of the five G types involved in the 295 intergenotypic recombination events: (i) G9 (ii) G6 (iii) G3 (iv) G1 (v) G2. Peaks represent hairpin loops, 296 slopes correspond to helices, and plateaus correspond to loops. X axis corresponds to the sequence of the 297 segment. Each base-pairing is represented by a horizontal box where the height of the box corresponds with the 298 thermodynamic likeliness of the pairing. The colors correspond to the variation of base pairings at that position. 299 Red indicates the base pairs are highly conserved across all the sequences, black indicates the least 300 conservation of those base pairings, B) Breakpoint distribution plot made in RDP4, of putative recombinants in 301 segment 9. X axis is showing the position in the sequence, and Y axis is showing the number of breakpoints per a 302 50-nucleotide window. Highest peaks are around x=115, 285, 1050.

15 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

A B

303 304 Figure 8. Consensus Mountain Plots overlaid with breakpoint distribution plots of recombination events for a) 305 VP6, and b) VP1. Position of nucleotide sequence is on the X axis. Entropy curve represented in green. The 306 black curve represents the pairing probabilities, and the red curve represents the MFE structure with well- 307 defined regions having low entropy. 308 309 Breakpoint distribution plots showed the sequence regions with the most breakpoints. These breakpoints 310 often corresponded to hairpins predicted by RNAalifold. Secondary RNA structure predictions for segment 9 311 (VP7) genotypes G1, G2, G6 and G9 are shown as mountain plots (Fig. 7). The breakpoints of the segment 9 312 recombination events correspond to areas leading to the peaks in the mountain plots (Fig. 7). The peaks indicate 313 a conserved hairpin loop, with the sequences leading up to the peak being the double-stranded portion of the 314 hairpin. Breakpoint distributions also appeared to correspond to secondary structure predictions made from 315 alignments of segment 1 (VP1) and segment (VP6) (Fig. 8). Segment 4 (VP4) RNA secondary structure 316 predictions (Supplementary Fig. 1) showed greater variation across genotypes, so while breakpoints did coincide 317 with secondary RNA structures as predicted by the models, there were not enough events in each genotype to 318 provide strong support that the breakpoints were correlated with secondary structure. The breakpoint 319 distributions for NSP1 and VP3 were also non-randomly distributed across the sequence (Supplementary Fig. 1), 320 however secondary structure predictions from these alignments were not consistent, so they are not shown 321 (Supplementary Fig. 2). 322 Discussion 323 Recognizing Type I and Type II Error 324 Apparent instances of recombination may actually be the result of convergent evolution, lineage- 325 specific rate variation, sequencing error, poor sequence alignments, laboratory contamination or improper 326 bioinformatics analysis (58-61). For example, the rates of type I (falsely identifying a recombination event when 327 none exists) and type II (failure to detect a true recombination event) error incurred during the use of the 328 recombination detection software, RDP4, were measured in a +ssRNA , which has low rates of 329 recombination (62, 63). The results of the analyses indicated that recombination was overestimated in these 330 viruses, and certain detection methods were more prone to type I error. 3Seq and GENECONV display the 331 lowest false positive rates as well as the lowest detection power of true events, while MaxChi, Chimaera, and

16 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

332 SiScan show higher false positive rates, but also had higher detection power for true positives. False positives 333 using RDP4 are especially common among closely related strains. That being said, recombination events 334 between sequences with low substitution rates and divergence percentages would be undetectable, yet most 335 likely occur given their close spatial/temporal proximity. Furthermore, in RDP4, the P-value of .05 does not 336 correspond to a 5% rate of false positives, which is why the cut-off value was set to 10E-04 and focused only on 337 events where at least six RDP4 programs detected the putative recombinant. The length of the recombination 338 region, if too short, also contributes to both type I and type II error (62-64). We ignored any putative 339 recombination events of < 100nt, with the exception of one isolate in segment 4 (NSP4) and one isolate in 340 segment 5 (NSP5), which we noted in Supplementary Table 1, as the putative recombinant regions were both at 341 conserved regions at the ends of the segments. 342 Several steps can be taken to minimize incorrect attribution of viral recombination. Ideally this process 343 should begin at the time of sequencing. If a putative recombination event is observed, then it should be 344 confirmed that the originating sample did not come from a host infected with multiple genotypes by purifying, 345 amplifying, and sequencing multiple plaques (58). Similarly, care must be taken when sequencing multiple 346 samples of the same virus to minimize the possibility of cross-contamination. 347 For sequences obtained from online repositories, such precautions are rarely possible. Instead careful 348 bioinformatics procedures can help minimize possible errors. Sequence alignments should be visually inspected 349 to exclude misaligned sequences (58). Splitting alignments by major and minor event, followed by carefully 350 parameterized BEAST runs, may help distinguish genuine phylogenetic incongruity signals from spurious false 351 positives. Moreover, false positive recombinants are likely to be present as single isolates in phylogenetic trees 352 (58). 353 The more isolates showing the same event, the greater the probability that it represents a true 354 recombination event, especially if isolates were acquired and sequenced by different laboratories. Events that 355 showed strong support, but were only isolated in one sequence are noted (Supplementary Table 1), but not 356 discussed, as it is difficult to rule out the possibility of type I error due to PCR or mosaic contig assembly (24, 357 58). For example, pseudo-recombinants may be generated from template switching during PCR amplification, 358 and can cause significant confusion when performing any kind of phylogenetic analysis or divergence dating. 359 Similarly, recombination facility likely depends on genetic similarity, so caution should be used in 360 inferring events between highly dissimilar genotypes. When a positive recombination signal has been detected, 361 it is essential to assess its statistical significance. Programs such as 3SEQ (65) and Simplot (66) can be used to 362 identify recombination breakpoints and assign p-values assessing the likelihood that the recombination signal 363 was non-random (58). In addition, the presence of unique polymorphisms differing from the parent strains 364 within the suspected recombination region may provide evidence that recombination events are not laboratory 365 artifacts. 366 Sequence metadata can also identify unlikely recombination events. For an event to be plausible, the 367 major and minor parents should have had opportunity to coinfect the same host, which is only possible if they 17 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

368 are congruent in time and space (58). For example, one study identified influenza A virus strain 369 A/Taiwan/4845/99 as a recombinant of A/Wellington/24/2000 and A/WSN/33 (67). Given that the two parents 370 were isolated 77 years apart in different parts of the world, it is exceedingly unlikely that is a natural 371 recombination event. Any putative recombination events should be carefully screed to determine if the parental 372 strains could have plausibly interacted. 373 Naturally High Coinfection in Rotavirus 374 Some features of rotavirus biology make it plausible that recombination is not only possible, but also 375 relatively common. Rotaviruses are often released from cells as aggregates of approximately 5-15 particles 376 contained within extracellular vesicles (68). Collective infection of host cells by rotavirus-containing 377 extracellular vesicles may increase the likelihood of recombination between genetically different viruses in the 378 wild, an effect that may be missed in laboratory coinfection assays. Moreover, although the transcription of 379 dsRNA in the viral core reduces the opportunity for recombination, it does not prevent it entirely or exclude 380 adaptive recombination events. The physical barriers to recombination in dsRNA viruses may be offset by the 381 high rates of coinfection resulting from vesicle transmission of rotaviruses. 382 Furthermore, infection of hosts by multiple rotavirus strains appears to be relatively common. In a study 383 of 100 children in the Detroit area, G and P typing, which identifies the serotype of the VP7 and VP4 proteins 384 respectively, revealed that ~10% of patients were infected with multiple rotavirus A strains (69). Similarly high 385 frequencies of G and P mixed genotype infections were observed in children sampled in India (three studies 386 showing multiple G types in 11.3 %, 12% and 21% of samples) (70-72), Spain (>11.4% of samples) (73), Kenya 387 (5.9%) (74), Africa (12%) (75), and Mexico (5.6% in 2010, 33.5% in 2012) (76). Even higher frequencies of 388 mixed genotype infections were observed in whole genome studies. For example, among 39 Peruvian fecal 389 samples genotyped using multiplexed PCR, 33 (84.6%) showed evidence of multiple rotavirus genotypes (77). 390 In another study, whole genome deep sequencing revealed that 15/61 (25%) samples obtained in Kenya 391 contained multiple rotavirus genotypes (78). Given the high genetic diversity of rotavirus populations (7, 15, 392 16), and their proficiency in infecting a broad range of mammalian hosts, including many domesticated 393 species (22, 79), the high frequencies of hosts infected with multiple genotypes is not entirely surprising. These 394 coinfections present abundant opportunities for rotavirus recombination. 395 Recombination Between Divergent Genotypes 396 Rotavirus genotype classification varies by segment, with NSP5 having the highest % nucleotide identity cutoff 397 (91% nucleotide identity) and NSP1 having the lowest % nucleotide identity cutoff (~79% nucleotide 398 identity)(3). The serotype proteins, VP4 and VP7, have the most different genotypes, with 51 and 36 399 respectively (12). In particular, VP4 has especially variable secondary RNA structure (80). Since VP4 forms the 400 host receptor-binding spike protein and VP7 is a glycoprotein that makes up the virus outer surface, it is likely 401 that antibody neutralization drives the evolution of VP4 and VP7 diversity (6). Given this genetic diversity, the 402 chances of two viruses with different G or P types coinfecting a cell is substantially higher than other segments. 403 In addition, both VP4 and VP7 seem to be more prone to reassortment, and to tolerate more divergent genetic 18 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

404 backgrounds or genome constellations (5, 6, 81, 82). This diversity and tolerance of many different genetic 405 backgrounds suggests that VP4 and VP7 may be more tolerant of recombination between divergent strains than 406 the other segments. Consistent with this expectation, we found substantial evidence of intergenotypic 407 recombination in segment 4 (VP4) and in segment 9 (VP7) (Table 2; SI Table 1). A previous study reported 408 intergenotypic recombination events in segment 6 VP6, segment 8 (NSP2), and segment 10 (NSP4) (35), so our 409 results add to the number of segments able to tolerate intergenotypic recombination. 410 Specifically, we observed several instances of intergenotypic recombination among G6 and G8 411 genotypes of segment 9 (SI Table 1). In addition, we identified intergenotypic recombination between P4-P6, 412 P4-P8, P6-P8, and P8-P14 genotypes of segment 4 (SI Table 1). P4, P6 and P8 are all relatively closely related 413 being in the P[II] genogroup, while P14 is in the P[III] genogroup. These putative recombination events suggest 414 that relevant serotype diversity in human hosts is expanding. However, as P4, P6 and P8 are also the dominant P 415 types in human infections, there is a sampling bias towards this genogroup as the genotypes within this group 416 are more likely to coinfect humans, so caution should be taken with this conclusion. 417 Generation of Escape Mutants 418 Recombination involving regions encoding conserved epitopes, especially in segments that encode 419 proteins involved in host cell attachment and entry, may provide selective advantages to rotaviruses by allowing 420 them to evade inactivation by host-produced antibodies. These escape mutants may be generally less fit than 421 wildtype viruses, but competitively advantaged in hosts because of a lack of host recognition. Subsequent 422 intrahost adaptation may then select for compensatory fitness-increasing mutations allowing these strains to be 423 competitive with circulating rotavirus strains. Many of the recombination events observed in our study appeared 424 to generate such escape mutants. For example, we found two instances of the same segment 9 (VP7) G1-G3 425 recombination event (Table 2; KJ751729 and KP752817) with amino acid changes in two of four conserved 426 epitope regions due to the recombination event, which suggests that this strain prevailed in the population 427 because it was better able to evade antibody neutralization. 428 Most of the VP4 recombination events involved the spike head of the VP8* protein or the spike 429 body/stalk region of VP5* (antigen domain). Escape mutant studies (4, 83-85) for VP4 show the VP8* spike 430 head recognizes histo-blood group antigens, which is one of rotavirus’s main host range expansion barriers (86- 431 88). VP4 recombinants therefore may help the virus expand host range and potentially aid in immune evasion. 432 While VP4 is immunogenic, antibodies must bind to many VP8* protein fragments to inhibit host 433 binding and viral entry. Because of this, VP7 specific monoclonal antibodies are more efficient at neutralizing 434 rotavirus infection (89). Not only is there more VP7 protein on the virus surface, it is neutralized by inhibiting 435 decapsidation (84). Our study found that a G3-G1 recombinant (KJ751729) had amino acid changes in two out 436 of four total conserved epitope regions due to the recombination event, which could indicate this strain was able 437 to prevail in the population by being better able to evade neutralization. 438 VP6 also showed recombination events resulting in substantial amino acid changes. VP6 is a more 439 conserved protein, but also an antigenic protein that interacts with naïve B cells (90). This feature suggests that 19 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

440 there may be selection for VP6 escape mutants to evade host immune responses. A secondary structure analysis 441 of VP6 indicated that it was relatively conserved across genotypes, which may explain why VP6 seems to have 442 more frequent intergenotypic recombination. Conserved epitopes in VP6 exist around amino acid positions 197 443 to 214, and 308 to 316 (91). 444 NSP1 also showed recombination between highly divergent strains. It is expected that NSP1 could 445 tolerate more mutations or even recombination, as NSP1 is not strictly required for (92). NSP1 446 is also the least conserved of the proteins, including the serotype proteins VP4 and VP7 (93), indicating 447 intergenotypic recombination causing more disruption of the amino acid sequence may be less likely to be 448 deleterious. 449 Recombination has the potential to decouple RNA-RNA interactions from protein-protein interactions. 450 For reassortment to occur, segments packaging requires not only compatible RNA-RNA and RNA-protein 451 interactions, but also the newly generated reassortant must be able to outcompete its progenitor strains. This 452 feature implies that a reassortant virus with a segment originating from a more genetically divergent virus may 453 be infectious, but less likely to be generated during a coinfection due to competing RNA interactions. Intragenic 454 recombination could allow for novel reassortant proteins if breakpoints are within open reading frames and do 455 not disrupt packaging signals. This could allow a virus to replicate and infect efficiently in a new genetic 456 background. 457 Recombination around conserved epitope regions, especially in the VP4 and VP7 proteins involved in 458 host cell attachment and entry, may provide a selective advantage to rotaviruses by allowing them to evade 459 inactivation by host-produced antibodies. These escape mutants may be generally less fit than wildtype viruses, 460 but competitively advantaged in the host because of a lack of host recognition. Subsequent intrahost adaptation 461 may then select for compensatory fitness-increasing mutations allowing these strains to be competitive with 462 circulating rotavirus strains. 463 Based on the I-TASSER structural predictions, the putative recombinant VP7 proteins detected in our 464 survey appear able to fold properly and form functional proteins despite containing amino acid sequence from 465 different ‘parental’ genotypes (Fig. 6). While the secondary structure appeared slightly altered (e.g., shorter beta 466 sheets), the recombinant VP7 proteins generally maintained their 3-dimensional shape. The selective advantage 467 from swapping epitopes may outweigh any potential decrease in protein stability resulting from recombination. 468 Segment 7 (NSP2), 8 (NSP3), 10 (NSP4), and 11 (NSP5) all had low rates of recombination. These 469 segments are short thus recombination is expected to be less likely, however segment 9 (VP7), one of the 470 smallest segments, defies this pattern, having a high number of events observed. The smaller segments may also 471 be less able to tolerate recombination events due to the important roles they play during the formation and 472 stabilization of the supramolecular RNA complex (94) during rotavirus packaging and assembly (80, 95-97). 473 Drivers of Recombination 474 Homologous recombination has not previously been considered a significant driver in rotavirus 475 evolution (20, 38). While recombination is common in positive-sense RNA viruses (98), and even required for 20 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

476 retroviral genome replication and maintaining genome integrity (99), it is rarely reported in dsRNA viruses. This 477 discrepancy may stem from divergent selective pressures among different types of RNA viruses. While 478 recombination may be an important mechanism for purging deleterious mutations and lowering clonal 479 interference in some non-segmented viruses (100), segmented virus can accomplish the same outcome through 480 reassortment without breaking up any genes (21). Segmented viruses may be producing recombinant viruses that 481 are less fit than reassortant viruses, and therefore do not dominate populations. While some segmented positive- 482 sense RNA viruses can also reassort, such as the , they have fewer genome segments, increasing 483 clonal interference. Moreover, double-stranded RNA is a more stable molecule than single-stranded RNA so 484 dsRNA viruses are more resistant to damage (101). Recombination as a mechanism for genetic repair, therefore, 485 may be more important in segmented +ssRNA viruses than segmented dsRNA viruses. 486 Recombination is usually expected to be deleterious as the breakage of open reading frames disrupts 487 RNA secondary structure and may alter protein functionality (28, 100). However, recombination, as with 488 reassortment (15, 19, 102-104), may further increase rotavirus A genetic diversity due to epistatic interactions 489 resulting in reassortant-specific or recombinant-specific mutations (105). Formerly deleterious mutations may 490 become beneficial when the genetic background changes, resulting in an increase in circulating pathogenically 491 relevant viral strains. 492 In our study, recombination events usually occurred between strains of the same genotype. Intertypic 493 recombination may be more likely to be selected against as it may disrupt proteins or secondary RNA structure. 494 Divergent genotypes may also have a lower likelihood of coinfect the same cell due to regional dispersal. 495 However, the isolation of several strains resulting from recombination between different genotypes of segment 496 9, the VP7 coding segment, indicates recombination may be a mechanism for increasing the diversity of 497 circulating VP7 genotypes (Table 2). 498 We also note that undetectable recombination would be expected to be more frequent as it is less likely 499 to be deleterious, however this study shows recombination between strains different enough to result in 500 significant phylogenetic incongruities occurs in rotaviruses. These events seem to be beneficial as evidenced by 501 the fact that the recombinant genotypes persist in populations long enough for multiple samples showing the 502 same event to be sampled. For example, the same mosaic VP7 G6 gene (KF170899) (Figure 8) was sequenced 503 in multiple bovine strains, one isolated in 2009, and another in 2012, by two separate research groups. While 504 still detectable, the parents are closely related. However, in some instances, events between highly divergent 505 genotypes seem to have been able to persist in populations. Both a 2014 Malawi isolate and a 2006 isolate from 506 the United States showed similar G1-G2 VP7 mosaic gene (accession number KU356662) (Figure 8). Given that 507 these two G genotypes are highly divergent from one another, were identified in different years, in different 508 locations, and by different research groups, supports the contention that it is a true recombination event. This 509 example also indicates mosaic genes formed from two divergent genotypes can persist in the population, 510 affecting the diversity of prevailing strains.

21 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

511 The results of this study also suggest the possibility of a recombinant sub-lineage within the R2 clade of 512 segment 1, the polymerase-encoding segment. This result would indicate that homologous recombination has 513 had a long-term effect on rotavirus diversity as the same event is represented in strains isolated years apart from 514 geographically distant locations. Strains isolated in different countries showing the same recombination pattern 515 indicates at least some recombination events were sufficiently favorable to be maintained in the population. 516 Recombination has had a significant impact on the diversity of other Reoviridae members such as 517 bluetongue virus (106). For example, He et al., found that homologous recombination in bluetongue virus 518 played a role in shaping the virus’s evolution and epidemiology (106). Multiple bluetongue virus sequences 519 isolated in different countries at different times showed evidence of the same recombination event, indicating 520 that a mosaic gene successfully spread within the population. 521 Recombination in Other dsRNA Viruses 522 While recombination may result lead to fitness benefits, rates of recombination across viruses are more 523 likely determined by RNA polymerase processivity (average number of nucleotides added by the polymerase 524 per association with the template strand) (107). This supposition is evidenced by the fact that -ssRNA viruses 525 are the least likely to undergo homologous recombination due to their genome architecture and replication 526 mode. Since the expectation is that dsRNA viruses seldom recombine, researchers are rarely motivated to 527 investigate recombination in dsRNA viruses, further exacerbating the lack of reported recombination events in 528 these viruses. 529 Template switching, the presumptive main mechanism of RNA virus recombination, requires two 530 different genomes to be in close proximity (26, 100, 108). Because of this requirement, recombination is 531 expected to be rare in the dsRNA viruses because their genomes are transcribed within nucleocapsids, which 532 limits opportunities for the genomes of different virus genotypes to interact (109, 110). Despite these limitations, 533 recombination has been reported in several dsRNA virus species. For example, a study of 692 complete 534 bluetongue virus segments found evidence for at least 11 unique recombinant genotypes (1.6%) (106). The case 535 for recombination among bluetongue viruses is strengthened by the fact that viruses containing the same (or 536 similar) recombinant segments were isolated by different research groups in different countries at different 537 times, indicating that the recombinant viruses persisted and spread following the recombination event (106,

538 111). In another study, 1,881 sequences from members of the family Birnaviridae were analyzed for evidence of 539 recombination (112). While no interspecies recombination was observed, at least eight putative instances of 540 intraspecies recombination were observed among the infectious bursal disease viruses and the aquabirnaviruses 541 (112). We note that Birnaviruses’ genetic material is in a complex with ribonucleoprotein, while the genetic 542 material of Reoviridae members is free in the virion, which could be a factor in differing rates of recombination 543 across these dsRNA viruses. Li et al. reported several instances of recombination among the dsRNA Southern 544 rice black-streaked dwarf virus which is also a member of the Reoviridae family, but infects plants (113). 545 Putative recombinants were identified in six of the ten segments in both structural and nonstructural proteins. No 546 instances of intrasegment/homologous) recombination were observed in a survey of the dsRNA 22 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

547 (114), but the medium segment of phage Φ13 is believed to have recombined with the medium segment of Φ6 548 (110). 549 Possible Mechanism of Recombination in dsRNA Viruses 550 Genetic recombination is a significant driver of evolution in many viruses, including human 551 immunodeficiency virus (115), Western equine encephalitis virus (116), and poliovirus (25), however, 552 recombination rates vary dramatically across different RNA viruses due to differences in genome replication and 553 transcription, and due to differing propensities for coinfection. Negative-sense RNA viruses have lower rates of 554 recombination because, inside host cells, their RNAs are encased by ribonucelocapsids (117) thus less available 555 for recombination to occur. DsRNA viruses also undergo RNA synthesis after being packaged, and most do not 556 have free double-stranded RNA in the cell. Prior to packaging, rotavirus ssRNA can form temporary 557 associations even at low homology, and become tangled, which could result in an extra RNA being occasionally 558 mispackaged. The packaging of more than 11 segments may provide opportunities for recombination among 559 between the surplus segment and its homolog. 560 The precise mechanism for how rotaviruses package RNA is unknown, but there are leading hypotheses, 561 which share similarities to the well-established systems Φ6 and influenza. The most-accepted hypothesis is that 562 the eleven +ssRNA segments are bound to a protein complex consisting of the VP1 polymerase and the VP3 563 capping enzyme. RNA structures in the non-translated terminal regions (NTRs) aid in the formation of the 564 supramolecular RNA complex (94, 97) and determine whether the segments are packaged (80, 96). 565 Subsequently the supramolecular RNA complex is encapsidated by the core shell protein, VP2. Once the 566 particle is assembled, RNA synthesis occurs resulting in the dsRNA genome. The packaging mechanism of 567 bacteriophage Φ6 and other Cystoviruses is well understood, and distinct from the Reoviridae family. In the 568 Cystoviruses, each RNA segment is packaged separately in a precise order (118). Because of this, the packaging 569 of an extra segment would be more difficult, which may explain the lack of homologous recombination reported 570 in Cystoviruses (110). 571 Genomic rearrangement events (non-homologous recombination) in immunocompromised patients with 572 acute infection were shown to be generated non-randomly and exhibited recombination hotspots (119). 573 Secondary structure has been linked as a mechanism for recombination (120). If secondary structures form 574 around breakpoints, this may cause recombination to occur, as is the case with poliovirus (121). This may be 575 because regions, especially near the base of the hairpin would act as a stall for the polymerase. If a sequence 576 forms a hairpin with itself, it could also hybridize with a second RNA having a similar but not identical 577 sequence, which would allow for possible crossing over. 578 It is difficult to quantify the true frequency of recombination, as presumably most events would go 579 unnoticed if they were between near-identical sequences or be too deleterious to result in viable viruses that 580 could be isolated and sequenced. However, there is strong evidence showing that recombination does indeed 581 occur between distinct strains of rotavirus A, and at least some of these events are able to persist well enough to 582 be detected in sequences from multiple subsequent isolates. 23 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

583 References 584 585 1. Desselberger U. 2014. Rotaviruses. Virus Research 190:75-96. 586 2. Arias CF, Romero P, Alvarez V, Lopez S. 1996. Trypsin activation pathway of rotavirus 587 infectivity. J Virol 70:5832-9. 588 3. Matthijnssens J, Ciarlet M, Heiman E, Arijs I, Delbeke T, McDonald SM, Palombo EA, Iturriza- 589 Gomara M, Maes P, Patton JT, Rahman M, Van Ranst M. 2008. Full genome-based 590 classification of rotaviruses reveals a common origin between human Wa-Like and porcine 591 rotavirus strains and human DS-1-like and bovine rotavirus strains. J Virol 82:3204-19. 592 4. Nair N, Feng N, Blum LK, Sanyal M, Ding S, Jiang B, Sen A, Morton JM, He XS, Robinson WH, 593 Greenberg HB. 2017. VP4- and VP7-specific antibodies mediate heterotypic immunity to 594 rotavirus in humans. Sci Transl Med 9. 595 5. McDonald SM, Matthijnssens J, McAllen JK, Hine E, Overton L, Wang S, Lemey P, Zeller M, Van 596 Ranst M, Spiro DJ, Patton JT. 2009. Evolutionary Dynamics of Human Rotaviruses: Balancing 597 Reassortment with Preferred Genome Constellations. PLOS 5:e1000634. 598 6. Patton JT. 2012. Rotavirus Diversity and Evolution in the Post-Vaccine World. Discovery 599 Medicine 13:85-97. 600 7. Kirkwood CD. 2010. Genetic and Antigenic Diversity of Human Rotaviruses: Potential Impact 601 on Vaccination Programs. Journal of Infectious Diseases 202:S43-S48. 602 8. Matthijnssens J, Otto PH, Ciarlet M, Desselberger U, Van Ranst M, Johne R. 2012. VP6- 603 sequence-based cutoff values as a criterion for rotavirus species demarcation. Archives of 604 Virology 157:1177-1182. 605 9. Mihalov-Kovacs E, Gellert A, Marton S, Farkas SL, Feher E, Oldal M, Jakab F, Martella V, Banyai 606 K. 2015. Candidate new rotavirus species in sheltered dogs, Hungary. Emerg Infect Dis 21:660- 607 3. 608 10. Banyai K, Kemenesi G, Budinski I, Foldes F, Zana B, Marton S, Varga-Kugler R, Oldal M, Kurucz 609 K, Jakab F. 2017. Candidate new rotavirus species in Schreiber's bats, Serbia. Infect Genet Evol 610 48:19-26. 611 11. Matthijnssens J, Ciarlet M, McDonald SM, Attoui H, Bányai K, Brister JR, Buesa J, Esona MD, 612 Estes MK, Gentsch JR, Iturriza-Gómara M, Johne R, Kirkwood CD, Martella V, Mertens PPC, 613 Nakagomi O, Parreño V, Rahman M, Ruggeri FM, Saif LJ, Santos N, Steyer A, Taniguchi K,

24 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

614 Patton JT, Desselberger U, Van Ranst M. 2011. Uniformity of rotavirus strain nomenclature 615 proposed by the Rotavirus Classification Working Group (RCWG). Archives of virology 616 156:1397-1413. 617 12. Steger CL, Boudreaux CE, LaConte LE, Pease JB, McDonald SM. 2019. Group A Rotavirus VP1 618 Polymerase and VP2 Core Shell Proteins: Intergenotypic Sequence Variation and In 619 Vitro Functional Compatibility. Journal of Virology 93:e01642-18. 620 13. Rahman M, Matthijnssens J, Yang X, Delbeke T, Arijs I, Taniguchi K, Iturriza-Gomara M, 621 Iftekharuddin N, Azim T, Van Ranst M. 2007. Evolutionary history and global spread of the 622 emerging G12 human rotaviruses. Journal of Virology 81:2382-2390. 623 14. Matthijnssens J, Bilcke J, Ciarlet M, Martella V, Banyai K, Rahman M, Zeller M, Beutels P, Van 624 Damme P, Van Rans M. 2009. Rotavirus disease and vaccination: impact on genotype 625 diversity. Future Microbiology 4:1303-1316. 626 15. Ghosh S, Kobayashi N. 2011. Whole-genomic analysis of rotavirus strains: current status and 627 future prospects. Future Microbiology 6:1049-1065. 628 16. Sadiq A, Bostan N, Yinda KC, Naseem S, Sattar S. 2018. Rotavirus: Genetics, pathogenesis and 629 vaccine advances. Reviews in Medical Virology 28. 630 17. Donker NC, Kirkwood CD. 2012. Selection and evolutionary analysis in the nonstructural 631 protein NSP2 of rotavirus A. Infection Genetics and Evolution 12:1355-1361. 632 18. Matthijnssens J, Heylen E, Zeller M, Rahman M, Lemey P, Van Ranst M. 2010. Phylodynamic 633 Analyses of Rotavirus Genotypes G9 and G12 Underscore Their Potential for Swift Global 634 Spread. Molecular Biology and Evolution 27:2431-2436. 635 19. Ramig RF, Ward RL. 1991. GENOMIC SEGMENT REASSORTMENT IN ROTAVIRUSES AND OTHER 636 REOVIRIDAE. Advances in Virus Research 39:163-207. 637 20. Ramig RF. 1997. Genetics of the rotaviruses. Annual Review of Microbiology 51:225-255. 638 21. McDonald SM, Nelson MI, Turner PE, Patton JT. 2016. Reassortment in segmented RNA 639 viruses: mechanisms and outcomes. Nature Reviews Microbiology 14:448-460. 640 22. Doro R, Farkas SL, Martella V, Banyai K. 2015. Zoonotic transmission of rotavirus: surveillance 641 and control. Expert Review of Anti-Infective Therapy 13:1337-1350. 642 23. Desselberger U. 1996. Genome rearrangements of rotaviruses. Advances in Virus Research, 643 Vol 46 46:69-95.

25 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

644 24. Varsani A, Lefeuvre P, Roumagnac P, Martin D. 2018. Notes on recombination and 645 reassortment in multipartite/segmented viruses. Curr Opin Virol 33:156-166. 646 25. Lukashev AN. 2005. Role of recombination in evolution of . Rev Med Virol 647 15:157-67. 648 26. Pérez-Losada M, Arenas M, Galán JC, Palero F, González-Candelas F. 2015. Recombination in 649 viruses: Mechanisms, methods of study, and evolutionary consequences. Infection, Genetics 650 and Evolution 30:296-307. 651 27. Patton JT, Vasquez-Del Carpio R, Tortorici MA, Taraporewala ZF. 2007. Coupling of rotavirus 652 genome replication and capsid assembly. Adv Virus Res 69:167-201. 653 28. Lai MM. 1992. RNA recombination in animal and plant viruses. Microbiol Rev 56:61-79. 654 29. Suzuki Y, Gojobori T, Nakagomi O. 1998. Intragenic recombinations in rotaviruses. FEBS Lett 655 427:183-7. 656 30. Parra GI, Bok K, Martinez M, Gomez JA. 2004. Evidence of rotavirus intragenic recombination 657 between two sublineages of the same genotype. J Gen Virol 85:1713-6. 658 31. Phan TG, Okitsu S, Maneekarn N, Ushijima H. 2007. Evidence of intragenic recombination in 659 G1 rotavirus VP7 genes. J Virol 81:10188-94. 660 32. Phan TG, Okitsu S, Maneekarn N, Ushijima H. 2007. Genetic heterogeneity, evolution and 661 recombination in emerging G9 rotaviruses. Infection Genetics and Evolution 7:656-663. 662 33. Martinez-Laso J, Roman A, Rodriguez M, Cervera I, Head J, Rodriguez-Avial I, Picazo JJ. 2009. 663 Diversity of the G3 genes of human rotaviruses in isolates from Spain from 2004 to 2006: 664 cross-species transmission and inter-genotype recombination generates alleles. J Gen Virol 665 90:935-43. 666 34. Donker NC, Boniface K, Kirkwood CD. 2011. Phylogenetic analysis of rotavirus A NSP2 gene 667 sequences and evidence of intragenic recombination. Infect Genet Evol 11:1602-7. 668 35. Jere KC, Mlera L, Page NA, van Dijk AA, O'Neill HG. 2011. Whole genome analysis of multiple 669 rotavirus strains from a single stool specimen using sequence-independent amplification and 670 454(R) pyrosequencing reveals evidence of intergenotype genome segment recombination. 671 Infect Genet Evol 11:2072-82. 672 36. Jing Z, Zhang X, Shi H, Chen J, Shi D, Dong H, Feng L. 2018. A G3P 13 porcine group A rotavirus 673 emerging in China is a reassortant and a natural recombinant in the VP4 gene. Transboundary 674 and Emerging Diseases 65:e317-e328.

26 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

675 37. Esona MD, Roy S, Rungsrisuriyachai K, Sanchez J, Vasquez L, Gomez V, Rios LA, Bowen MD, 676 Vazquez M. 2017. Characterization of a triple-recombinant, reassortant rotavirus strain from 677 the Dominican Republic. Journal of General Virology 98:134-142. 678 38. Woods RJ. 2015. Intrasegmental recombination does not contribute to the long-term 679 evolution of group A rotavirus. Infection Genetics and Evolution 32:354-360. 680 39. Hatcher EL, Zhdanov SA, Bao Y, Blinkova O, Nawrocki EP, Ostapchuck Y, Schaffer AA, Brister JR. 681 2017. Virus Variation Resource - improved response to emergent viral outbreaks. Nucleic 682 Acids Res 45:D482-d490. 683 40. Matthijnssens J, Ciarlet M, Rahman M, Attoui H, Bányai K, Estes MK, Gentsch JR, Iturriza- 684 Gómara M, Kirkwood CD, Martella V, Mertens PPC, Nakagomi O, Patton JT, Ruggeri FM, Saif 685 LJ, Santos N, Steyer A, Taniguchi K, Desselberger U, Van Ranst M. 2008. Recommendations for 686 the classification of group A rotaviruses using all 11 genomic RNA segments. Archives of 687 virology 153:1621-1629. 688 41. Edgar RC. 2004. MUSCLE: multiple with high accuracy and high 689 throughput. Nucleic Acids Res 32:1792-7. 690 42. Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. 2015. RDP4: Detection and analysis of 691 recombination patterns in virus genomes. Virus Evol 1:vev003. 692 43. Suchard MA, Lemey P, Baele G, Ayres DL, Drummond AJ, Rambaut A. 2018. Bayesian 693 phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol 4:vey016. 694 44. Minin VN, Bloomquist EW, Suchard MA. 2008. Smooth skyride through a rough skyline: 695 Bayesian coalescent-based inference of population dynamics. Mol Biol Evol 25:1459-71. 696 45. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. 2018. Posterior Summarization in 697 Bayesian Phylogenetics Using Tracer 1.7. Systematic Biology 67:901-904. 698 46. Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. 699 BMC Evolutionary Biology 7:214. 700 47. Rambaut A. 2019. Figtree v1.4.4. Accessed 701 48. Lorenz R, Bernhart SH, Honer Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. 702 2011. ViennaRNA Package 2.0. Algorithms Mol Biol 6:26. 703 49. Bernhart SH, Hofacker IL, Will S, Gruber AR, Stadler PF. 2008. RNAalifold: improved consensus 704 structure prediction for RNA alignments. BMC Bioinformatics 9:474.

27 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

705 50. Roy A, Kucukural A, Zhang Y. 2010. I-TASSER: a unified platform for automated protein 706 structure and function prediction. Nat Protoc 5:725-38. 707 51. Yang J, Zhang Y. 2015. Protein Structure and Function Prediction Using I-TASSER. Curr Protoc 708 Bioinformatics 52:5 8 1-15. 709 52. Zhang Y. 2008. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics 9:40. 710 53. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, Wheeler DK, Sette A, Peters 711 B. 2019. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res 47:D339-d343. 712 54. Yao B, Zhang L, Liang S, Zhang C. 2012. SVMTriP: a method to predict antigenic epitopes using 713 support vector machine to integrate tri-peptide similarity and propensity. PLoS One 7:e45152. 714 55. Aida S, Nahar S, Paul SK, Hossain MA, Kabir MR, Sarkar SR, Ahmed S, Ghosh S, Urushibara N, 715 Kawaguchiya M, Aung MS, Sumi A, Kobayashi N. 2016. Whole genomic analysis of G2P[4] 716 human Rotaviruses in Mymensingh, north-central Bangladesh. Heliyon 2:e00168-e00168. 717 56. Bwogi J, Jere KC, Karamagi C, Byarugaba DK, Namuwulya P, Baliraine FN, Desselberger U, 718 Iturriza-Gomara M. 2017. Whole genome analysis of selected human and animal rotaviruses 719 identified in Uganda from 2012 to 2014 reveals complex genome reassortment events 720 between human, bovine, caprine and porcine strains. PLoS One 12:e0178855. 721 57. Ghosh A, Chattopadhyay S, Chawla-Sarkar M, Nandy P, Nandy A. 2012. In silico study of 722 rotavirus VP7 surface accessible conserved regions for antiviral drug/vaccine design. PLoS One 723 7:e40749. 724 58. Boni MF, de Jong MD, van Doorn HR, Holmes EC. 2010. Guidelines for identifying homologous 725 recombination events in influenza A virus. PLoS One 5:e10434. 726 59. Bertrand Y, Töpel M, Elväng A, Melik W, Johansson M. 2012. First Dating of a Recombination 727 Event in Mammalian Tick-Borne . PLOS ONE 7:e31981. 728 60. Boni MF, Smith GJD, Holmes EC, Vijaykrishna D. 2012. No evidence for intra-segment 729 recombination of 2009 H1N1 influenza virus in swine. Gene 494:242-245. 730 61. Worobey M, Rambaut A, Pybus OG, Robertson DL. 2002. Questioning the evidence for genetic 731 recombination in the 1918 "Spanish flu" virus. Science 296:211 discussion 211. 732 62. Bertrand YJK, Johansson M, Norberg P. 2016. Revisiting Recombination Signal in the Tick- 733 Borne Encephalitis Virus: A Simulation Approach. PLOS ONE 11:e0164435. 734 63. Norberg P, Roth A, Bergstrom T. 2013. Genetic recombination of tick-borne flaviviruses among 735 wild-type strains. Virology 440:105-16.

28 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

736 64. Boni MF, Zhou Y, Taubenberger JK, Holmes EC. 2008. Homologous Recombination Is Very Rare 737 or Absent in Human Influenza A Virus. Journal of Virology 82:4807. 738 65. Boni MF, Posada D, Feldman MW. 2007. An Exact Nonparametric Method for Inferring Mosaic 739 Structure in Sequence Triplets. Genetics 176:1035. 740 66. Lole KS, Bollinger RC, Paranjape RS, Gadkari D, Kulkarni SS, Novak NG, Ingersoll R, Sheppard 741 HW, Ray SC. 1999. Full-length human immunodeficiency virus type 1 genomes from subtype 742 C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol 743 73:152-60. 744 67. He CQ, Han GZ, Wang D, Liu W, Li GR, Liu XP, Ding NZ. 2008. Homologous recombination 745 evidence in human and swine influenza A viruses. Virology 380:12-20. 746 68. Santiana M, Ghosh S, Ho BA, Rajasekaran V, Du WL, Mutsafi Y, De Jesus-Diaz DA, Sosnovtsev 747 SV, Levenson EA, Parra GI, Takvorian PM, Cali A, Bleck C, Vlasova AN, Saif LJ, Patton JT, Lopalco 748 P, Corcelli A, Green KY, Altan-Bonnet N. 2018. Vesicle-Cloaked Virus Clusters Are Optimal 749 Units for Inter-organismal Viral Transmission. Cell Host Microbe 24:208-220.e8. 750 69. Abdel-Haq NM, Thomas RA, Asmar BI, Zacharova V, Lyman WD. 2003. Increased prevalence of 751 G1P[4] genotype among children with rotavirus-associated gastroenteritis in metropolitan 752 Detroit. J Clin Microbiol 41:2680-2. 753 70. Husain M, Seth P, Dar L, Broor S. 1996. Classification of rotavirus into G and P types with 754 specimens from children with acute diarrhea in New Delhi, India. Journal of Clinical 755 Microbiology 34:1592. 756 71. Khetawat D, Dutta P, Bhattacharya SK, Chakrabarti S. 2002. Distribution of rotavirus VP7 757 genotypes among children suffering from watery diarrhea in Kolkata, India. Virus Res 87:31- 758 40. 759 72. Jain V, Das BK, Bhan MK, Glass RI, Gentsch JR. 2001. Great diversity of group A rotavirus 760 strains and high prevalence of mixed rotavirus infections in India. J Clin Microbiol 39:3524-9. 761 73. Sánchez-Fauquier A, Montero V, Moreno S, Solé M, Colomina J, Iturriza-Gomara M, Revilla A, 762 Wilhelmi I, Gray J, Gegavi V-NG. 2006. Human rotavirus G9 and G3 as major cause of diarrhea 763 in hospitalized children, Spain. Emerging infectious diseases 12:1536-1541. 764 74. Kiulia NM, Peenze I, Dewar J, Nyachieo A, Galo M, Omolo E, Steele AD, Mwenda JM. 2006. 765 Molecular characterisation of the rotavirus strains prevalent in Maua, Meru North, Kenya. 766 East Afr Med J 83:360-5.

29 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

767 75. Mwenda JM, Ntoto KM, Abebe A, Enweronu-Laryea C, Amina I, McHomvu J, Kisakye A, 768 Mpabalwani EM, Pazvakavambwa I, Armah GE, Seheri LM, Kiulia NM, Page N, Widdowson M- 769 A, Duncan Steele A. 2010. Burden and Epidemiology of Rotavirus Diarrhea in Selected African 770 Countries: Preliminary Results from the African Rotavirus Surveillance Network. The Journal of 771 Infectious Diseases 202:S5-S11. 772 76. Anaya-Molina Y, De La Cruz Hernández SI, Andrés-Dionicio AE, Terán-Vega HL, Méndez-Pérez 773 H, Castro-Escarpulli G, García-Lozano H. 2018. A one-step real-time RT-PCR helps to identify 774 mixed rotavirus infections in Mexico. Diagnostic Microbiology and Infectious Disease 92:288- 775 293. 776 77. Rojas M, Dias HG, Gonçalves JLS, Manchego A, Rosadio R, Pezo D, Santos N. 2019. Genetic 777 diversity and zoonotic potential of rotavirus A strains in the southern Andean highlands, Peru. 778 Transboundary and Emerging Diseases 0. 779 78. Mwanga M, Nyaigoti CA, Njeru R, Nokes DJ, Matthew C, Phan VTM, Holmes EC. 2018. 780 Unbiased whole genome deep sequencing of Rotavirus Group A positive samples from rural 781 Kenya, 2012-14 reveals high frequency of coinfection and genetic reassortment. International 782 Journal of Infectious Diseases 73:203. 783 79. Martella V, Banyai K, Matthijnssens J, Buonavoglia C, Ciarlet M. 2010. Zoonotic aspects of 784 rotaviruses. Veterinary Microbiology 140:246-255. 785 80. Li W, Manktelow E, von Kirchbach JC, Gog JR, Desselberger U, Lever AM. 2010. Genomic 786 analysis of codon, sequence and structural conservation with selective biochemical-structure 787 mapping reveals highly conserved and dynamic structures in rotavirus RNAs with potential cis- 788 acting functions. Nucleic Acids Res 38:7718-35. 789 81. Gentsch JR, Laird AR, Bielfelt B, Griffin DD, Banyai K, Ramachandran M, Jain V, Cunliffe NA, 790 Nakagomi O, Kirkwood CD, Fischer TK, Parashar UD, Bresee JS, Jiang B, Glass RI. 2005. 791 Serotype diversity and reassortment between human and animal rotavirus strains: 792 implications for rotavirus vaccine programs. J Infect Dis 192 Suppl 1:S146-59. 793 82. Martella V, Ciarlet M, Pratelli A, Arista S, Terio V, Elia G, Cavalli A, Gentile M, Decaro N, Greco 794 G, Cafiero MA, Tempesta M, Buonavoglia C. 2003. Molecular analysis of the VP7, VP4, VP6, 795 NSP4, and NSP5/6 genes of a buffalo rotavirus strain: identification of the rare P[3] rhesus 796 rotavirus-like VP4 gene allele. Journal of clinical microbiology 41:5665-5675.

30 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

797 83. Aoki ST, Settembre EC, Trask SD, Greenberg HB, Harrison SC, Dormitzer PR. 2009. Structure of 798 rotavirus outer-layer protein VP7 bound with a neutralizing Fab. Science 324:1444-7. 799 84. Ludert JE, Ruiz MC, Hidalgo C, Liprandi F. 2002. Antibodies to rotavirus outer capsid 800 glycoprotein VP7 neutralize infectivity by inhibiting virion decapsidation. J Virol 76:6643-51. 801 85. Zhou YJ, Burns JW, Morita Y, Tanaka T, Estes MK. 1994. Localization of rotavirus VP4 802 neutralization epitopes involved in antibody-induced conformational changes of virus 803 structure. J Virol 68:3955-64. 804 86. Hu L, Sankaran B, Laucirica DR, Patil K, Salmen W, Ferreon ACM, Tsoi PS, Lasanajak Y, Smith 805 DF, Ramani S, Atmar RL, Estes MK, Ferreon JC, Prasad BVV. 2018. Glycan recognition in 806 globally dominant human rotaviruses. Nat Commun 9:2631. 807 87. Huang P, Xia M, Tan M, Zhong W, Wei C, Wang L, Morrow A, Jiang X. 2012. Spike protein VP8* 808 of human rotavirus recognizes histo-blood group antigens in a type-specific manner. J Virol 809 86:4833-43. 810 88. Lee B, Dickson DM, deCamp AC, Ross Colgate E, Diehl SA, Uddin MI, Sharmin S, Islam S, 811 Bhuiyan TR, Alam M, Nayak U, Mychaleckyj JC, Taniuchi M, Petri WA, Jr., Haque R, Qadri F, 812 Kirkpatrick BD. 2018. Histo-Blood Group Antigen Phenotype Determines Susceptibility to 813 Genotype-Specific Rotavirus Infections and Impacts Measures of Rotavirus Vaccine Efficacy. J 814 Infect Dis 217:1399-1407. 815 89. Ruggeri FM, Greenberg HB. 1991. Antibodies to the trypsin cleavage peptide VP8 neutralize 816 rotavirus by inhibiting binding of virions to target cells in culture. J Virol 65:2211-9. 817 90. Parez N, Garbarg-Chenon A, Fourgeux C, Le Deist F, Servant-Delmas A, Charpilienne A, Cohen 818 J, Schwartz-Cornil I. 2004. The VP6 protein of rotavirus interacts with a large fraction of 819 human naive B cells via surface immunoglobulins. J Virol 78:12489-96. 820 91. Aiyegbo MS, Eli IM, Spiller BW, Williams DR, Kim R, Lee DE, Liu T, Li S, Stewart PL, Crowe JE, Jr. 821 2014. Differential accessibility of a rotavirus VP6 epitope in trimers comprising type I, II, or III 822 channels as revealed by binding of a human rotavirus VP6-specific antibody. J Virol 88:469-76. 823 92. Hua J, Chen X, Patton JT. 1994. Deletion mapping of the rotavirus metalloprotein NS53 (NSP1): 824 the conserved cysteine-rich region is essential for virus-specific RNA binding. J Virol 68:3990- 825 4000. 826 93. Arnold MM, Patton JT. 2011. Diversity of interferon antagonist activities mediated by NSP1 827 proteins of different rotavirus strains. J Virol 85:1970-9.

31 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

828 94. Fajardo T, Jr., Sung PY, Roy P. 2015. Disruption of Specific RNA-RNA Interactions in a Double- 829 Stranded RNA Virus Inhibits Genome Packaging and Virus Infectivity. PLoS Pathog 830 11:e1005321. 831 95. Fajardo T, Sung PY, Celma CC, Roy P. 2017. Rotavirus Genomic RNA Complex Forms via Specific 832 RNA-RNA Interactions: Disruption of RNA Complex Inhibits Virus Infectivity. Viruses 9. 833 96. Suzuki Y. 2015. A candidate packaging signal of human rotavirus differentiating Wa-like and 834 DS-1-like genomic constellations. Microbiol Immunol 59:567-71. 835 97. Borodavka A, Dykeman EC, Schrimpf W, Lamb DC. 2017. Protein-mediated RNA folding 836 governs sequence-specific interactions between rotavirus genome segments. Elife 6. 837 98. Chare ER, Holmes EC. 2006. A phylogenetic survey of recombination frequency in plant RNA 838 viruses. Arch Virol 151:933-46. 839 99. Rawson JMO, Nikolaitchik OA, Keele BF, Pathak VK, Hu W-S. 2018. Recombination is required 840 for efficient HIV-1 replication and the maintenance of viral genome integrity. Nucleic Acids 841 Research 46:10535-10545. 842 100. Decrey L, Kazama S, Kohn T. 2016. Ammonia as an <em>In Situ</em> Sanitizer: 843 Influence of Virus Genome Type on Inactivation. Applied and Environmental Microbiology 844 82:4909. 845 101. Simon-Loriere E, Holmes EC. 2011. Why do RNA viruses recombine? Nat Rev Microbiol 9:617- 846 26. 847 102. Iturriza-Gomara M, Isherwood B, Desselberger U, Gray J. 2001. Reassortment in vivo: driving 848 force for diversity of human rotavirus strains isolated in the United Kingdom between 1995 849 and 1999. J Virol 75:3696-705. 850 103. Jere KC, Chaguza C, Bar-Zeev N, Lowe J, Peno C, Kumwenda B, Nakagomi O, Tate JE, Parashar 851 UD, Heyderman RS, French N, Cunliffe NA, Iturriza-Gomara M, Consortium V. 2018. 852 Emergence of Double- and Triple-Gene Reassortant G1P 8 Rotaviruses Possessing a DS-1-Like 853 Backbone after Rotavirus Vaccine Introduction in Malawi. Journal of Virology 92. 854 104. Schumann T, Hotzel H, Otto P, Johne R. 2009. Evidence of interspecies transmission and 855 reassortment among avian group A rotaviruses. Virology 386:334-343. 856 105. Zeldovich KB, Liu P, Renzette N, Foll M, Pham ST, Venev SV, Gallagher GR, Bolon DN, Kurt- 857 Jones EA, Jensen JD, Caffrey DR, Schiffer CA, Kowalik TF, Wang JP, Finberg RW. 2015. Positive

32 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

858 Selection Drives Preferred Segment Combinations during Influenza Virus Reassortment. Mol 859 Biol Evol 32:1519-32. 860 106. He CQ, Ding NZ, He M, Li SN, Wang XM, He HB, Liu XF, Guo HS. 2010. Intragenic recombination 861 as a mechanism of genetic diversity in bluetongue virus. J Virol 84:11487-95. 862 107. Simon-Loriere E, Holmes EC. 2011. Why do RNA viruses recombine? Nature Reviews 863 Microbiology 9:617-626. 864 108. Bentley K, Evans DJ. 2018. Mechanisms and consequences of positive-strand RNA virus 865 recombination. J Gen Virol 99:1345-1356. 866 109. McDonald SM, Patton JT. 2011. Assortment and packaging of the segmented rotavirus 867 genome. Trends Microbiol 19:136-44. 868 110. Mindich L. 2004. Packaging, replication and recombination of the segmented genomes of 869 bacteriophage Phi 6 and its relatives. Virus Research 101:83-92. 870 111. Carpi G, Holmes EC, Kitchen A. 2010. The Evolutionary Dynamics of Bluetongue Virus. Journal 871 of Molecular Evolution 70:583-592. 872 112. Hon CC, Lam TT, Yip CW, Wong RT, Shi M, Jiang J, Zeng F, Leung FC. 2008. Phylogenetic 873 evidence for homologous recombination within the family Birnaviridae. J Gen Virol 89:3156- 874 64. 875 113. Li Y, Xia Z, Peng J, Zhou T, Fan Z. 2013. Evidence of recombination and genetic diversity in 876 southern rice black-streaked dwarf virus. Arch Virol 158:2147-51. 877 114. Silander OK, Weinreich DM, Wright KM, O'Keefe KJ, Rang CU, Turner PE, Chao L. 2005. 878 Widespread genetic exchange among terrestrial bacteriophages. Proceedings of the National 879 Academy of Sciences of the United States of America 102:19009-19014. 880 115. Robertson DL, Hahn BH, Sharp PM. 1995. Recombination in AIDS viruses. J Mol Evol 40:249-59. 881 116. Hahn CS, Lustig S, Strauss EG, Strauss JH. 1988. Western equine encephalitis virus is a 882 recombinant virus. Proc Natl Acad Sci U S A 85:5997-6001. 883 117. Chare ER, Gould EA, Holmes EC. 2003. Phylogenetic analysis reveals a low rate of homologous 884 recombination in negative-sense RNA viruses. J Gen Virol 84:2691-703. 885 118. Patton JT, Jones MT, Kalbach AN, He YW, Xiaobo J. 1997. Rotavirus RNA polymerase requires 886 the core shell protein to synthesize the double-stranded RNA genome. J Virol 71:9618-26.

33 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

887 119. Schnepf N, Deback C, Dehee A, Gault E, Parez N, Garbarg-Chenon A. 2008. Rearrangements of 888 rotavirus genomic segment 11 are generated during acute infection of immunocompetent 889 children and do not occur at random. J Virol 82:3689-96. 890 120. Dedepsidis E, Kyriakopoulou Z, Pliaka V, Markoulatos P. 2010. Correlation between 891 recombination junctions and RNA secondary structure elements in poliovirus Sabin strains. 892 Virus Genes 41:181-91. 893 121. Tolskaya EA, Romanova LI, Blinov VM, Viktorova EG, Sinyakov AN, Kolesnikova MS, Agol VI. 894 1987. Studies on the recombination between RNA genomes of poliovirus: the primary 895 structure and nonrandom distribution of crossover regions in the genomes of intertypic 896 poliovirus recombinants. Virology 161:54-61. 897 898

34 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

899 SUPPLEMENTARY INFORMATION 900 901 Supplementary Table 1. Supported putative recombinants identified by RDP4 for whole genome 902 Rotavirus A sequences. 117 events were identified where at least 6/7 methods listed supported the 903 event. Shaded rows indicate strongly supported events (identified by 7/7 RDP4 programs). Events that were

904 flagged by RDP4 but showed less support are not listed here.

Segment Accession Number Breakpoint Cis p-Values Representative Representative Minor (Excluding Gaps) Major Parent Parent Sequence, Source Sequence, Source Species, Year of Species, Year of Isolation, % ANI, Isolation, % ANI, Genogroup Genogroup VP1 JQ988899 (972), (353-509)- RDP 1.00E-43 KJ751768 GQ414540 Homo sapiens, Croatia, (1702-1760) GENECONV. 9.08E-47 Homo sapiens, South Homo sapiens, Germany, 2006 BootScan 4.28E-27 Africa, 94.7% ,R1 2009, 98.7%, R2 MaxChi. 5.44E-18 Chimera 3.97E-18 SiScan 3.46E-27 3Seq 2.83E-09 VP1 KU739903 1-(747-805) RDP 1.19E-24 KU739900 KU739904 Sus scrofa, Taiwan, 2015 GENECONV 1.41E-21 Sus scrofa, Taiwan, Sus scrofa, Taiwan, 2015, BootScan. 5.14E-21 2015, 95.8%, R1 99.1%, R1 MaxChi. 8.41E-16 Chimera 3.57E-16 SiScan. 9.64E-16 3Seq 2.83E-09 VP1 KF812721 (2374-2537)- RDP 7.11E-23 JX027681 Unknown, closest = Homo sapiens, South (3268) GENECONV 1.75E-16 Homo sapiens, JQ069926, R1 Korea, 2010 BootScan 2.44E-21 Australia, 2007, MaxChi 9.35E-16 98.1%, R1 Chimera 5.18E-15 SiScan 2.49E-21 3Seq 5.15E-29 VP1 JQ069926 2498 (2356-2507) RDP 9.14E-22 JX195074 JX027681 Homo sapiens, Canada, – 3268 GENECONV 5.71E-19 Homo sapiens, Italy, Homo sapiens, Australia, 2008 BootScan 1.00E-18 2010, 98.1% 2007, 99.6% MaxChi 1.30E-15 R1 R1 Chimera 2.51E-15 SiScan. 7.42E-23 3Seq 2.78E-25 VP1 KM454503 840 (802-951)- RDP 2.77E-15 Unknown, closest = KM454492 Equus callabus, USA, 3281 GENECONV 1.36E-19 DQ838638, R2 Equus callabus, United 1981 BootScan 4.79 E-21 Kingdom, 1976, 99.1%, MaxChi 1.55E-17 R2 SiScan 1.27E-40

35 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

3Seq 8.22E-05 VP1 KU714444 1656 (1636-1673)- RDP 1.35E-18 JN706454, Homo KX655451, Homo Homo sapiens, Malawi, 1752 (1740-1766) GENECONV 1.25E-17 sapiens, Thailand, sapiens, Uganda, 2013, 2000 BootScan 3.76E-11 2009, 97.4%, R1 97.9%, R2 MaxChi 3.81E-02 Chimera 4.33E-02 3Seq 5.67E-09 VP1 JQ069936, Homo sapiens, 1670 (1531, 1703)- RDP 1.83E-14 KU248405, Homo HQ657171, Homo Canada, 2009 2419 (2308-2436) GENECONV 7.68E-13 sapiens, Bangladesh, sapiens, South Africa, MaxChi 2.73E-08 2010 (98.7%), R2 2009, 99.7%, R2 Chimera 2.48E-08 SiScan 2.65E-08 3Seq 5.15E-19 VP1 KU739903, Sus scrofa, 1543 (1383-1588)- RDP 3.06E-13 KU739900 KU739904 Taiwan, 2015 2220 (2160-2260) GENECONV 6.19E-04 Sus scrofa, Taiwan, Sus scrofa, Taiwan, 2015, BootScan 9.40E-11 2015, 97.8%, R1 96.5%, R1 MaxChi 2.98E-08 Chimera 5.67E-08 SiScan 1.19 E-06 3Seq 2.83E-09 VP1 JQ988899 390 (241-428)-971 RDP 3.76E-12 JX195074 Unknown, JQ069918, Homo sapiens, Croatia, (894-1847) GENECONV 4.88E-10 Homo sapiens, Italy, Homo sapiens, Canada, 2006 MaxChi 7.74E-09 2010, 93.5% 2008 Chimera 4.5E-07 R1 R1 SiScan 6.32E-14 3Seq 5.64E-09 VP1 KC580176, Homo sapiens, 1471 (1328-1521)- RDP 9.77E-12 KC580526, Homo KC580601, Homo USA, 1988 3270 GENECONV 3.81E-08 sapiens, USA, 1979, sapiens, USA, 1988, MaxChi 6.16E-12 99%, R1 100%, R1 Chimera 2.36E-04 SiScan. 7.42E-23 3Seq 1.91E-20 VP1 JQ069918, Homo sapiens, 604 (541-662)-956 RDP 2.42E-10 KJ751834, Homo JQ069924, Homo sapiens, Canada, 2008 (889-1068) GENECONV 3.25E-08 sapiens, South Africa, Canada, 2008, 99.7%, R1 BootScan 2.38E-09 2009, 99.5%, R1 MaxChi 5.62E-03 Chimera 5.62E-03 3.28E-03 SiSeq 1.10E-04 3Seq 1.13E-08 VP1 KC579647, Homo sapiens, 35 – 1448 (1290- GENECONV 4.30E-04 KC580601, Homo KC579509, Homo USA, 1988 1498) BootScan 2.63E-05 sapiens, USA, 1988, sapiens, USA, 1989, MaxChi 2.34E-09 100%, R1 99.4%, R1 Chimera 1.41E-08 SiScan 6.55E-11 3Seq 4.34E-19

36 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP1 KU199270, Homo sapiens, 670 (541-711)- Flagged for 285 seqs, KX536654, Homo KU356640, Homo Bangladesh, 2010 1403 (1327-1445) p-value= 3.33E-09 sapiens, India, 2011, sapiens, Bangladesh, 98.6% 2013, 100% R2 R2 VP1 KU356662, 656 (541-711)- RDP 5.87E-08 KU356607, Homo KC178768, Homo KU356640, Homo sapiens, 2661 (2600-2735) GENECONV 1.43E-04 sapiens, Bangladesh, sapiens, Italy, 2007, 2013 MaxCh 9.02E-10 2013, 99.5%, R2 99.6%, R2 SiScan 6.71E-20 3Seq 6.88 E-23 VP1 KU199270, Homo sapiens, 7-315 RDP 4.80E-03 MG181755, Homo KU248372, Homo Bangladesh, 2010 GENECONV 3.11E-03 sapiens, Malawi, sapiens, Bangladesh, BootScan 1.68E-02 2014, 98.8%, R2 2010, 100%, R2 MaxChi 4.91E-02 Chimera 4.65E-02 3Seq 2.43E-03 VP2 KJ753425, Homo sapiens, 599 (561-623) – RDP 6.28E-47 JF490148, Homo KP752972, Homo Uganda, 2011 991 (967-1019) GENECONV 5.13E-45 sapiens, Australia, sapiens, Gambia, 2008, BootScan 2.53E-44 2004, 98.3% ,C1 99.7%, R2-C2 MaxChi 5.39E-12 Chimera 6.10E-12 SiScan 8.38E-15 3Seq 1.96E-09 VP2 KU714445, Homo sapiens, 1155(1139-1170) – RDP 2.37E-41 KP752983, Homo KU714456, Homo Malawi, 2000 1887 (1862-1908) GENECONV 1.62E-33 sapiens, South Africa, sapiens, Malawi, 2000, BootScan 5.28E-39 2009, 94.3%, C1 99.9%, R2-C2 MaxChi 1.26E-16 Chimera 1.42E-16 SiScan 8.61E-25 3Seq 1.96E-09 VP2 KX655439, Homo sapiens, 2713 (2672-2717)- RDP 1.34E-25 Unknown closest = MG181349, Homo Uganda, 2013 158 (143-171) GENECONV 9.67E-24 KX632343, Homo sapiens, Malawi, 2002, MaxChi 1.54E-03 sapiens, Uganda, 99.4%, C1 Chimera 8.56E-04 2013), C2 SiScan 1.61E-05 3Seq 2.15E--08 VP2 KJ753617, Homo sapiens, 548 (464-560)-784 RDP 5.31E-27 KJ752329, Homo KX655439, Homo South Africa, 2004 (754-842) GENECONV 4.03E-24 sapiens, South Africa, sapiens, Uganda, 2013, BootScan 7.13E-23 2003, 99.4%, R1-C1 98.4%, R2-C2 MaxChi 3.85E-05 Chimera 4.51E-05 3Seq 1.96E-09 VP2 HM988959, Bos taurus, 558 (524-586)- RDP 1.14E-21 JQ069842, Homo JX971570, Bos taurus, South Korea, Unknown 2685 GENECONV 2.68E-25 sapiens, Canada, South Korea, 2004, 100%, BootScan 2.00 E-21 2008, 95.2%, C1 C1 MaxChi 1.29E-05 Chimera 1.11E-03 SiScan 3.88E-15 3Seq 5.94E-21

37 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Vp2 MG181591, Homo sapiens, 1129 (1106-1140)- RDP 3.47 E-21 KJ753828, Homo KJ753425, Homo sapiens, Malawi, 2013 1299 (1292-1318) GENECONV 2.01E-19 sapiens, Zimbabwe, Uganda, 2011, 99.4% ,C1 BootScan 4.15E-15 99.1%, C2 MaxChi 4.04E-03 Chimera 3.95E-03 3Seq 1.96E-09 VP2 KX632343, Homo sapiens, 2666 (2614-39)- RDP 9.89E-21 JN258812, Homo KJ53402, Homo sapiens, Uganda, 2013 593 (560-612) GENECONV 2.26 E-17 sapiens, Belgium, South Africa, 2005, BootScan 5.20E-19 2000, 91.7%, C1 99.5% MaxChi 4.65E-13 Chimera 3.18 E-13 SiSeq 2.53E-24 3Seq 1.37E-05 VP2 KX632343, Homo sapiens, 1357 (1322-1368)- RDP 1.14E-21 MG181371, Homo KJ870879, Homo sapiens, Uganda, 2013 1808 (1790-1830) GENECONV 3.44E-17 sapiens, Malawi, Democratic Republic of MaxChi 1.69E-11 2002, 95.3%, C1 the Congo, 99.3%, C2 Chimera 3.57E-10 SiScan 2.84E-14 3Seq 1.96E-09 VP2 KY055428, Capra 983 (960-1009) – RDP 1.64E-18 JN831232, Bos KF63257, Bos taurus, aegagrus, Uganda, 2014 1238 (1194-1243) GENECONV 1.12E-16 taurus, South Africa, South Africa, 2007, BootScan 2.13E-10 2007, 94.7%, C2 99.6% MaxChi 1.67E-05 Chimera 4.73E-04 SiScan 4.35E-05 3Seq 1.96E-09 VP2 KJ870901, Homo sapiens, 2330 (2304-2359)- RDP 1.96E-18 KY055417, Sus LC367260, Homo Democratic Republic of the 2648 (2589-1) GENECONV 2.99E-16 scrofa, Uganda, 2016, sapiens, Nepal, 2008, Congo, Unknown BootScan 5.51E-16 95%, C1 99.4%, C1 MaxChi 1.71E-03 Chimera 7.67E-04 SiScan 6.29E-06 3Seq 1.96E-09 VP2 KC257092, Camelus 2047 (2017-2102)- RDP 2.53E-16 Unknown, closest = FJ347123, Lama dromedaries, Sudan, 2002 733 (661-754) GENECONV 4.72E-06 DQ480724, Homo guanicoe, Argentina, BootScan 7.95E-23 sapiens, Iran, C2 1998, 99.9%, C2 MaxChi 2.68E-13 Chimera 8.40E-12 SiScan 8.51E-11 3Seq 9.20E-08 VP2 KX632343, Homo sapiens, 1962 (1932-1978) RDP 1.44E-13 JN706477, Homo KX655439, Homo Uganda, 2013 – 2133 (2099- GENECONV 7.46E-10 sapiens, Thailand, sapiens, Uganda, 2013, 2136) MaxChi 3.26E-06 2010, 93% 97.7% Chimera 4.13E-05 C1 C2 SiScan 1.49E-07 3Seq1.96E-09

38 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP2 KJ627064, Homo sapiens, 2675 (2615-95)- RDP 7.01E-06 KJ752074, Homo KJ752859, Homo sapiens, Paraguay, 2002 409 (370-449) GENECONV 2.53E-03 sapiens, South Africa, Zimbabwe, 2011, 100%, BootScan 1.26E-04 2005, 98.5% , C1 C1 MaxChi 4.49E-02 Chimera 3.89E-02 SiScan 1.20E-02 3Seq 9.21E-04 VP3 KJ412679, Homo sapiens, 1 (2497-3) – 1068 RDP 1.04E-72 JX185760, Homo KJ412622, Homo sapiens, Paraguay, 2009 (1047-1080) GENECONV 1.11E-59 sapiens, Italy, 2007, Paraguay, 2009, 99.9%, BootScan 1.41E-73 97.8%, M1 M3 MaxChi 3.58E-31 Chimera 5.31E-31 SiScan 6.77E-38 3Seq 2.83E-09 VP3 KP007164, Homo sapiens, 968 (968-996)- RDP 1.97E-38 KY000551, Homo KP882081, Homo Philippines, 2012 1753 (1727-1777) GENECONV 2.93E-35 sapiens, Germany, sapiens, Bangladesh, BootScan 2.16E-37 2016, 99.4%, M2 2007, 99.7%, M2 MaxChi 2.45E-16 Chimera 1.76E-15 SiScan 2.85E-19 3Seq 2.83E-19 VP3 KX171542, Homo sapiens, 1465 (1432-1538)- RDP 1.40E-37 KJ752827, Homo KX171538, Homo South Korea, 2014 2482 (2461-59) GENECONV 7.17E-32 sapiens, South Africa, sapiens, South Korea, BootScan 2.84E-35 2003, 98.8%, M2 2012, 99.7%, M2 MaxChi 2.03E-19 Chimera 3.57E-19 SiScan 3.27E-23 3Seq 2.83E-09 VP3 KX171543, Homo sapiens, 1650 (1607-1688)- RDP 5.19E-37 KU248374, Homo KC443523, Homo South Korea, 2014 2472 (2455-59) GENECONV 2.46E-33 sapiens, Bangladesh, sapiens, Australia, 2007, BootScan 2.81E-34 2010, 99.1%, M2 99.6%, M2 MaxChi 1.95E-17 Chimera 1.05E-17 SiScan 1.55E-19 3Seq 2.83E-09 VP3 AB779630, Sus scrofa, 1590 (1578-1645)- RDP 3.19E-33 MG781054, Sus AB779632, Sus scrofa, Thailand, 2008 90 (2490-96) GENECONV 1.69E-25 scrofa, Thailand, Thailand, 2009, 99.9%, BootScan 2.39E-31 92.7%, M1 M1 MaxChi 2.05E-16 Chimera 3.42E-15 SiScan 9.38E-25 3Seq 1.42E-24 VP3 MG181592, Homo sapiens, 2334 (2318-2349)- RDP 1.09E-32 KC442894, Homo KT919912, Homo Malawi, 2013 196 (150-204) GENECONV 6.85E-18 sapiens, USA, 2008, sapiens, USA, 2013, 99%, BootScan 5.72E-27 96.4%, M2 M1 MaxChi 6.16E-12 Chimera 9.20E-12 SiScan 3.92E-10

39 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

3Seq 3.54E-27 VP3 JN013991, Homo sapiens, 82 (2550-99)- 260 RDP 1.69E-29 KJ752816, Homo Kp883027, Homo sapiens, South Africa, 2008 (253-268) GENECONV 8.98 E-27 sapiens, South Africa, Mali, 2008, 99.4%, M2 BootScan 1.24E-17 2011, 97.4%, M1 MaxChi 9.47E-06 Chimera 1.58E-05 SiScan 1.76E-02 3Seq 5.67E-09 VP3 KU356576, Homo sapiens, 844 (815-855)- RDP 4.55E-24 KC178787, Homo KP882081, Homo Bangladesh, 2010 1194 (1142-1219) GENECONV 1.69E-21 sapiens, Italy, 2008, sapiens, Bangladesh, BootScan 3.82E-20 97.4%, M2 2007, 100%, M2 MaxChi 1.89E-09 Chimera 1.77E-09 SiSeq 1.18E-07 3Seq 2.83E-09 VP3 JQ0692727, Homo sapiens, 51 (2482-59)- 845 RDP 4.27E-23 JQ069748, Homo JQ069762, Homo sapiens, Canada, 2007 (817-932) GENECONV 9.70E-19 sapiens, Canada, Canada, 2008, 99.7%, M1 BootScan 6.96E-18 2008, 99.1%, M1 MaxChi 8.76E-14 Chimera 4.71E-14 SiScan 2.58E-17 3Seq 3.98E-35 Vp3 JQ069754, Homo sapiens, 835 (811-879)- RDP 6.24E-20 KJ627117, Homo KP645258, Homo Canada, 2008 1621 (1591-1631) GENECONV 4.13E-17 sapiens, Paraguay, sapiens, Australia, 2010, BootScan 4.05E-17 1999, 99%, M1 99.7%, M1 MaxChi 1.69E-09 Chimera 1.31E-09 SiScan 1.91E-10 3Seq 1.71E-27 VP3 MG181592, Homo sapiens, 1736 (1712-1755)- RDP 2.97E-19 KC178787, Homo LC374087, Homo Malawi, 2013 1941 (1889-1953) GENECONV 7.80E-18 sapiens, Italy, 2008, sapiens, Nepal, 2009, BootScan 5.65E-15 97%, M2 100%, M1 MaxChi 5.47E-04 Chimera 5.27E-04 SiSeq 9.43E-07 3Seq 2.83E-09 VP3 KJ753665, Homo sapiens, 2158 (2127-2201)- Flagged in ~307, KA p- KJ753481, Homo KY055418, Sus scrofa, South Africa, 2004 2517 (2447-64) value= 2.23E-18, Global sapiens, Ethiopia, Uganda, 2016, 98.6% p-value 5.69E-11 2012, 94.9% M1 M1 VP3 AB374145, Bos taurus, 810 (765-875)- RDP 1.85E-08 LC340015, Homo KJ411434, Homo sapiens, Japan, Unknown 1544 (1486-1639) GENECONV 1.83E-04 sapiens, Japan, 2014, USA, 2012, 97.4%, M2 BootScan 3.64E-05 95.5%, M2 MaxChi 1.68E-06 Chimera 3.72E-06 SiScan 2.14E-07 3Seq 4.41E-10

40 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP3 KJ753665, Homo sapiens, 1212 (1140-1312)- RDP 3.84E-07 HQ392255, Homo KX778614, Homo South Africa, 2004 1732 (1632-1801) GENECONV 3.50E-05 sapiens, Belgium, sapiens, Dominican BootScan 1.92E-05 2008, 96.2%, M1 Republic of the Congo, MaxChi 2.67E-05 2013, 99.4%, M1 Chimera 1.59E-05 SiScan 3.24E-09 3Seq 2.08E-04 VP3 KJ753665, Homo sapiens, 580 (498-612)- 922 GENECONV 7.32E-11 KP752825, Homo JQ069760, Homo sapiens, South Africa, 2004 (831-964) BootScan 1.64E-09 sapiens, South Africa, Canada, 2008, 99.7%, M1 MaxChi 3.90E-05 2002, 99.6%, M1 Chimera 3.85E-06 SiScan 9.81E-09 3Seq 7.82E-20 VP3 KX655453, Homo sapiens, 1258 (872-1389)- Flagged as many KP882576, Homo KU714446, Homo Uganda, 2013 1798 (1662-1917) sequences, Global p- sapiens, Ghana, 2009, sapiens, Malawi, 2000, value= 1.78E-12 94.9%, M2 96.7%, M2 VP4 KX655509, Homo sapiens, 321 (308-323)- 683 RDP 2.16E-54 KP902549, Homo KJ52687, Homo sapiens, Uganda, 2013 (616-634) GENECONV 1.17E052 sapiens, South Africa, South Africa, 2000, VP8* domain BootScan 1.59E-51 2003, 99.3%, P8 99.3%, P8 MaxChi 7.47E-19 Chimera 1.44E-18 SiScan 8.23E-23 3Seq 2.07E-09 VP4 KX655487, Homo sapiens, 334(326-347)- RDP 1.06E-55 KX646644, Homo MG181340, Homo Uganda, 2013 682(670-698) GENECONV 5.94E-51 sapiens, India, 2012, sapiens, Malawi, 2002, VP8* domain BootScan 1.77E-50 96.6%, P6 99.7%, P8 MaxChi 4.55E-19 Chimera 5.96E-19 SiScan 1.60E-17 3Seq 2.07E-09 VP4 MG181725, Homo sapiens, 374 (367-382)- 542 GENECONV 5.45E-46 MG181758, Homo MG181890, Homo Malawi, 2014 (526-553) BootScan 2.45E-41 sapiens, Malawi, sapiens, Malawi, 2012, VP8* domain MaxChi 4.02E-11 2014, 99.2%, P8 100%, P4 Chimera 3.97E-11 SiScan 2.83E-13 3Seq 2.07E-09 VP4 KU714447, Homo sapiens, 1380 (1378-1381)- RDP 1.47E-39 GQ869840, Homo KU714458, Homo Malawi, 2000 1562 (1543-1571) GENECONV 4.07E-38 sapiens, Malawi, sapiens, Malawi, 2000, BootScan 104E-29 2000, P8 99.5%, P14 MaxChi 1.24E-07 Chimera 3.25E-07 SiScan 7.45E-12 3Seq 4.14E-09 VP4 KX655441, Homo sapiens, 374 (357-382)-484 RDP 5.41E-34 LC260226, Homo JQ069692, Homo sapiens, Uganda, 2013 (475-488) GENECONV 1.32E-31 sapiens, Indonesia, Canada, 2009, 99.1%, P8 VP8* domain BootScan2.38E-30 99.2%, P6 MaxChi 1.11E-05 Chimera 6.90E-06

41 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

SiScan 4.86E-10 3Seq 2.07E-09 VP4 MG181637, Homo sapiens, 350 (325-368)-459 GENECONV 5.13E-28 MG181758, Homo KJ870892, Homo sapiens, Malawi, 2013 (464-466) BootScan 5.75E-15 sapiens, Malawi, Democratic Republic of VP8* domain MaxChi 2.13E-04 2014, 99.5%, P8 the Congo, 95.5%, P6 Chimera 1.00E-04 SiScan 4.71E-05 3Seq 4.14E-09 VP4 JQ069674, Homo sapiens, 1568 (1541-1574)- RDP 1.59E-23 HQ392253, Homo HM773714, Homo Canada, 2008 2329 (2251-101) GENECONV 2.53E-21 sapiens, Belgium, sapiens, USA, 2008, BootScan 1.59E-23 2008, 99.5%, P8 99.5%, P8 MaxChi 2.96E-12 Chimera 2.13E-12 SiScan 7.86E-14 3Seq 1.44E-35 VP4 KC579762, Homo sapiens, 1073 (1039-1124)- RDP 4.67E-23 HQ392253, Homo FJ947895, Homo sapiens, USA, 1991 2332 (2269-8) GENECONV 4.72E-17 sapiens, Belgium, USA, 1991, 99.9%, P8 BootScan 8.46E-23 2008, 98.7%, P8 MaxChi 1.99E-14 Chimera 1.76E-11 SiScan 7.07E-25 3Seq 4.04E-40 VP4 MG181659, Homo sapiens, 1116 (1064-1146)- RDP 6.53E-11 MG181725, Homo KJ752208, Homo sapiens, Malawi, 2013 1462 (1402-1501) GENECONV 2.20E-09 sapiens, Malawi, South Africa, 2012, BootScan 1.08E-09 2013, 98.1%, P8 98.3%, P4 MaxChi 1.781E-05 Chimera 1.74E-05 SiScan 5.94E-06 VP4 JF490554, Homo sapiens, 462 (397-482)- 683 RDP 2.17E-10 KJ751716, Homo JQ069674, Homo sapiens, Australia, 2005 (637-722) GENECONV 8.75E-09 sapiens, Gambia, Canada, 2008, 99.5%, P8 BootScan 1.89E-10 2010, 98.4%, P8 MaxChi 3.26E-04 Chimera 1.04E-03 SiScan 1.14E-04 3Seq 4.14E-09 VP4 AB008291, Homo sapiens, 2291 (2223-71)- RDP 1.02E-03 AB039939, Homo AB008277, Homo 1995 807 (762-849) BootScan 4.42E-02 sapiens, Japan, 1990, sapiens, China, 1992, MaxChi 5.00E-05 98.6%, P8 98.9%, P8 Chimera 1.32E-04 SiScan 1.51E-05 3Seq 1.66E-08 VP6 KJ482497, Sus scrofa, 589 (584-605) – RDP 1.81E-25 DQ119822, Sus KJ482491, Sus scrofa, Brazil, 2013 801 (790 – 809) GENECOV 7.17E-23 scrofa, China, 98.1%, Brazil, 2013, 100%, I5 Bootscan 7.09E018 I2 Maxchi 2.98E-10 SiScan 2.41E-11 3Seq 1.41E-09

42 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP6 KX632346, Homo sapiens, 1269 (1250-1272)- RDP 7.77E-19 KU714448, Homo KX655455, Homo Uganda, 2013 56 (1333-103) GENECONV 1.33E-17 sapiens, Malawi, sapiens, Uganda, 2013, Bootscan 7.27E-09 2000, 92.7%, I1 96.5%, I2 Maxchi 2.1E-02 Chimera 3.24E-03 SiScan 1.02E-07 3Seq 3.06E-11 VP6 JN014005, Homo sapiens, 160 (33-220_- 384 RDP 1.98E-18 JN706549, Homo KJ752622, Homo sapiens, South Africa, 2008 (379-451) GENECONV 6.96E-18 sapiens, Thailand, Senegal, 2009, 96.4%, I2 BootScan 4.63E-16 2010, 97.8%, I1 MaxChi 1.00E-06 Chimera 2.77E-07 3Seq 1.41E-09 VP6 JX040423, Homo sapiens, 1177(1143-1181)- RDP 1.30E-15 KF740531, Homo Unknown, closest India, 2009 160(140-176) GENECONV 4.97E-09 sapiens, India, 2009, =KC579918, Homo BootScan 1.95E-11 I12 sapiens, USA, 1976), I1 MaxChi 4.38E-11 Chimera 9.18E-12 SiScan 7.20E-15 3Seq 2.82E-09 VP6 JN014003, Homo sapiens, 202 (1333-208)- RDP 6.30E-15 KP752725, Homo JX027952, Homo sapiens, South Africa, 2008 384(354-421) GENECONV 2.39E-15 sapiens, Swaziland, Australia, 2010, 96.7%, I1 BootScan 1.04E-12 2010, 95.5%, I2 MaxChi 3.76E-05 Chimera 3.02E-05 SiScan 4.51E-05 3Seq 1.41E-09 VP6 KU714448, Homo sapiens, 653(635- 664)- RDP 1.24E-12 JN014005, Homo KX655455, Homo Malawi, 2000 831(818-846) GENECONV 7.55E-10 sapiens, South Africa, sapiens, Uganda, 2013, BootScan 5.50E-07 95.9%, I1 99.4%, I2 MaxChi 1.15E-05 Chimera 3.38E-05 SiScan 1.38E-04 3Seq 1.41E-09 VP6 KU714448, Homo sapiens, 317 (290-333)- 491 RDP 3.94E-15 KX638567, Homo GU984758, Bos taurus, Malawi, 2000 (474-507) GENECONV 9.37E-14 sapiens, India, 2010, India, 2007, 98.9%, I2 BootScan 2.42E-13 96.6%, I1 MaxChi 2.80E Chimera 2.58E-05 SiScan 1.73E-07 3Seq 1.41E-09 VP6 HQ11009, Homo sapiens, 754 (697-775)- GENECONV 2.23E-05 MF469405, Homo HQ392161, Homo Belgium, 2007, 98.5% 1357 (1329-32) BootScan 1.61E-03 sapiens, USA, 2015, sapiens, Belgium, 2007, MaxChi 3.32E-11 99.3%, I1 98.5%, I1 Chimera 1.08E-10 SiScan 5.16E-14 3Seq 2.54E-19

43 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP6 FJ685614, Homo sapiens, 698(672-712)-1357 RDP 1.72E-08 KX638599, Homo KX632291, Homo India, 1992 (1231-70) GENECONV 1.61E-07 sapiens, India, 2010, sapiens, Uganda, 2013, BootScan 2.85E-08 99.4% ,I1 99.1%, I1 MaxChi 1.40E-09 Chimera 2.14E-09 SiScan 4.04E-11 3Seq 1.54E-23 VP6 KJ753780, Homo sapiens, 513 (463-560)- GENECONV 1.89E-05 JN258839, Homo LC056797, Homo South Africa, 2004 1008 (952-1031) BootScan 2.60E-05 sapiens, Belgium, sapiens, Viet Nam, 2007, MaxChi 9.01E-09 2001, 99.6%, I1 98.8%, I1 Chimera 1.46E-08 SiScan 1.38E-09 3Seq 1.30E-19 VP6 HM988972, Bos taurus, 751 (340-811)- RDP 2.08E-08 Unknown, closest = MH423866, Sus scrofa, South Korea, Unknown 1113 (1080-1132) GENECONV 1.92E-08 MH238302, Sus China, 2016, 99.4% ,I5 BootScan 2.91E-08 scrofa, Spain, 2017, I5 MaxChi 1.45E-05 Chimera 2.41E-05 SiScan 3.56E-05 3Seq 5.55E-05 VP7 AB158431, Bos taurus, 901 (868-907)- RDP 3.71E-16 AB077053, Bos U50332, Bos taurus, Japan, 2000 1051 (1037-80) GENECONV 1.25E-13 taurus, Japan, 1995- USA, 1991, 97.2%, G6 MaxChi 6.99E-07 1996, 95.9%, G8 Chimera 3.69E-05 SiScan 8.07E-10 3Seq 4.95E-10 VP7 HM591496, Bos taurus, 1048 (1032-128)- RDP 2.78E-05 EF199501, Bos JX442784, Bos taurus, India, 2009 481 (448-506) GENECONV 4.77 E-04 taurus, India, 96.6%, India, 2010, 100%, G6 KF170899, Bos taurus, BootScan 4.86E-04 G6 India, 2012 MaxChi 1.02E-03 Chimera 3.78E-05 SiSeq 7.21E-09 3Seq 3.90E-08 VP7 D86274, Homo sapiens, 353 (338-368)- 745 RDP 1.73E-18 D86282, Homo Kf636217, Homo sapiens, 1990 (730-756) GENECONV 1.43E-16 sapiens, 1981, 98.7%, South Africa, 2010, BootScan 2.50E-14 G3 94.9%, G1 MaxChi 3.26E-13 Chimera 6.54E-13 SiScan 2.47E-10 3Seq 3.55E-33 VP7 MG181727, Homo sapiens, 316 (1010-314)- RDP 3.69E-17 JQ069508, Homo KJ753362, Homo sapiens, Malawi, 2014 470 (437-476) GENECONV 1.84E-16 sapiens, Canada, South Africa, 2003, BootScan 8.96E-12 2008, 98%, G1 98.7%, G2 MaxChi 2.23E-08 Chimera 3.80E-08 SiScan 2.65E-12 3Seq 2.69E-23

44 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP7 KX655511, Homo sapiens, 840 (823-851)- RDP 1.20E-16 EU839936, Homo KX655456, Homo Uganda, 2013 1052 (1037-79) GENECONV 3.11E-15 sapiens, Bangladesh, sapiens, Uganda, 2013, BootScan 2.21E-16 2005, 97.4%, G9 100%, G8 MaxChi 6.12E-05 Chimera 6.98E-06 SiScan 2.52E-04 3Seq 3.36E-09 VP7 KJ753024, Homo sapiens, 1013 (985-59)- RDP 1.53E-16 KM008672, Homo D16344, Homo sapiens, South Africa, 2009 130(121-143) GENECONV 2.32E-14 sapiens, India, 2010, 96.5%, G1 BootScan 1.27E-10 97.8%, G12 MaxChi 9.65E-04 Chimera 5.96E-04 SiScan 8.42E-13 3Seq 2.38E-16 VP7 KJ751729, Homo sapiens, 857 (833-859)- RDP 4.0E-15 KJ560447, Homo KP882703, Homo Ethiopia, 2010 1019 (991-51) GENECONV 1.60E-10 sapiens, USA, 2011, sapiens, Kenya, 99.3%, KP752817, Homo sapiens, BootScan 1.21EE-11 99%, G3 G1 Togo, 2010 MaxChi 1.12E-03 Chimera 6.78E-04 SiScan 1.73E-03 3Seq 1.34E-05 VP7 MG181661, Homo sapiens, 590 (582-616)- 175 RDP 7.82E-30 JQ069502, Homo MG181859, Homo Malawi, 2013 (170-181) GENECONV 7.33E-23 sapiens, Canada, sapiens, Malawi, 2012, BootScan 5.06E-31 2008, 96.6%, G1 99.5%, G2 MaxChi 3.49E-21 Chimera 1.06E-25 3Seq 2.28E-47 MG181661, Homo sapiens, 590 (582-616)- 175 RDP 7.82E-30 JQ069502, Homo MG181859, Homo VP7 Malawi, 2013 (170-181) GENECONV 7.33E-23 sapiens, Canada, sapiens, Malawi, 2012, BootScan 5.06E-31 2008, 96.6%, G1 99.5%, G2 MaxChi 3.49E-21 Chimera 1.06E-25 3Seq 2.28E-47

KF170899, Bos taurus, (71) – (448-506) RDP 2.78 E-05 EF199501, Bos JX442784, Bos taurus,

VP7 India, 2010 GENEECONV 4.77 E-04 taurus, India, 97.3%, India, 99.5%, G6 BootScan 4.86 E-04 G6

MaxChi 1.02 E-03 Chimaera 3.78 E-05

SiScan 7.21 E-09

3Seq 3.90 E-08

VP7 KJ412857, Homo sapiens, 772 (754-779)- GENECONV 1.23E-19 KJ412858, Homo KJ412802, Homo sapiens, Paraguay, 2009 1014(988-51) BootScan 6.33E-20 sapiens, Paraguay, Paraguay, 2009, 99.2% MaxChi 2.56E-07 2009, 100% Chimera 4.38E-07 G1 SiScan 2.01E-09 G3 3Seq 3.36E-09

45 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

VP7 KC443034, Homo sapiens, 55 (994-60)-291 RDP 1.66 E-30 GQ229044, Homo JX411970, Homo USA, 2006 (262-296) GENECONV 5.07E-26 sapiens, India, 97.4%, sapiens, India, 2011, MG181727, Homo sapiens, BootScan 2.32E-24 G2 98.7%, G1 Malawi, 2014 MaxChi 1.03E32 Chimaera 1.10E-17 3Seq 3.36E-09

VP7 MG181771, Homo sapiens, 715 (707-722) – RDP 2.65E-29 KU356667, Homo KP222809, Homo Malawi, 2012 950 (921-971) GENECONV 6.45E-26 sapiens, Bangladesh, sapiens, Mozambique, BootScan 5.84E-17 2013, 94.9% 2011, 99.6% MaxChi 1.70E-11 G2 G1 Chimaera 2.36E-11 SiScan 4.78E-11 3Seq 3.55E-09

VP7 KP752498, Homo sapiens, 618 (605-641)- RDP 1.47E-28 D86266, Homo MG181661, Homo Togo, 2009 1018 (1000-1020) GENECONV 4.49E-23 sapiens, 95.7% sapiens, Malawi, 2013, BootScan 2.78E-14 G3 98.8% MaxChi 2.05E-14 G1 Chimaera 7.10E-19 SiScan 6.78E-39 VP7 AY261339, Homo sapiens, 783 (774-791)- RDP 3.82E-28 GU567778, Homo JX458964, Homo sapiens, South Africa 14(1044-18) GENECONV 2.35E-25 sapiens, Russia, Argentina, 2001, 100% BootScan 2.41E-27 96.9% G1 MaxChi 5.44E-10 G2 Chimera 2.33E-10 SiScan 3.15E-14 3Seq 3.36E-09 VP7 KX655489, Homo sapiens, 151 (132-155)- 316 RDP 2.14E-27 DQ146676, Homo JN232074, Homo sapiens, Uganda, 2013 (311-325) GENECONV 1.27E024 sapiens, Bangladesh, Brazil, 2004, 99.4%, G1 BootScan 3.15E-19 2003, 98.1%, G12 MaxChi 7.75E-09 Chimera 7.42E-09 SiScan 8.44E-20 3Seq 4.13E-03

VP7 D86273, Homo sapiens, 347 (339-353)- 726 RDP 1.70E-24 AF450293, Homo D50121, Homo sapiens, 1986 (697-763) GENECONV 1.57E-22 sapiens, China, 96.3%, Japan, 1992-1993, BootScan 1.26E-17 Unknown, 97.4%, G3 G2 MaxChi 7.23E-14 Chimera 1.37E-15 SiScan 1.73E-13 3Seq 6.15E-47 VP7 Kj4122858, Homo sapiens, 881 (859-914)- RDP 4.80E-14 AB585923, Homo HQ425288, Homo Paraguay, 2009 1016 (991-54) GENECONV 3.04E-13 sapiens, Hong Kong, sapiens, South Korea, BootScan 3.81E-13 2006, 98.5%, G3 2004, 99.2%, G4 MaxChi 5.36E-03 Chimera 1.05E-02

46 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

SiScan 1.36E-04 3Seq 3.36E-09 VP7 KU714449, Homo sapiens, 825 (795-835)- 93 RDP 1.04E-13 KM008635, Homo KU714460, Homo Malawi, 2000 (85-93) GENECONV 4.64E-09 sapiens, India, 2009, sapiens, Malawi, 2000, BootScan 1.71E-09 92.5%, G1 98.2%, G6 MaxChi 1.20E-10 Chimera 9.96E-11 SiScan 3.89E-10 3Seq 7.72E-08 VP7 KJ412888, Homo sapiens, 873 (823-890)- RDP 2.17E-12 D86275, Homo JF490143, Homo sapiens, Paraguay, 2009 1023 (1000-81) GENECONV 7.16E-11 sapiens, 88%, G3 Australia, 2004, 99.3%, BootScan 3.63E-12 G1 MaxChi 5.48E-04 Chimera 4.30E-04 SiScan 8.10E-03 3Seq 1.12E-05 VP7 KX655445, Homo sapiens, 1042 (1006-110)- RDP 6.32E-12 KJ560447, Homo KJ753326, Homo sapiens, Uganda, 2013 274 (269-287) GENECONV 5.98E-10 sapiens, USA, 2011, Kenya, 94.9%, G8 BootScan 8.41E-08 95.8%, G3 MaxChi 2.20E-06 Chimera 3.68E-08 SiScan 3.20E-04 3Seq 8.29E-11 VP7 D86277, Homo sapiens, 335 (323-353)- 754 RDP 8.20E-20 D86279, Homo KP883055, Homo China, 1989 (710-762) GENECONV 2.21E-15 sapiens, China, 1992, sapiens, Mali, 2009, G2 BootScan 1.81E-10 89.6%, G3 MaxChi 1.91E-12 Chimera 1.95E-13 SiScan 3.36E-13 3Seq 7.22E-27 VP7 AF281044, Homo sapiens, 362 (346-366)- 589 RDP 1.32E-09 AY866503, Homo KM008633, Homo Ireland, 1997-1999 (558-605) GENECONV 1.25E-02 sapiens, Thailand, sapiens, India, 2009, 89%, GQ433992, Homo sapiens, BootScan 4.96E-05 1995-1997, 98.9%, G1 Ireland, Unknown MaxChi 2.75E-05 G9 Chimera 1.50E-04 3Seq 3.36E-09 VP7 KU714449, Homo sapiens, 440 (421-446)- RDP 3.12E-10 JQ069503, Homo KU714460, Homo Malawi, 2000 604(641-704) GENECONV 7.36E-11 sapiens, Malawi, sapiens, Malawi, 2000, BootScan 1.56E-07 2000, 100%, G1 100%, G6 MaxChi 1.83E-03 Chimera 1.69E-03 3Seq 1.08E-12 VP7 KX655444, Homo sapiens, 110 (88-114)- RDP 2.84E-08 KC579965, Homo AF254137, Homo Uganda, 2013 191(183-197) GENECONV 3.00E-05 sapiens, USA, 96.2%, sapiens, 93.9% ,G1 MaxChi3.11E-02 G3 Chimera 3.11E-02 SiScan 1.01E-03 3Seq 8.25E-05

47 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

NSP1 KP753174, Sus scrofa, 1498 (1435-25)- RDP 7.45E-28 KP752999, Sus KP753000, Sus scrofa, South Africa, 2008, 585(573-596) GENECONV 5.26E-27 scrofa, South Africa, South Africa, 2009, KJ753184, Sus scrofa, BootScan 7.87E-25 2009, 99.9%, A8 99.7%, A8 South Africa, 2008, MaxChi 8.49E-16 KP752951, Sus scrofa, Chimera 5.73E-16 South Africa, 2008 SiScan 9.95E-18 3Seq 1.09E-48 NSP1 KX655446, Homo sapiens, 485 (455-499)- 889 RDP 3.98E-25 LC406825, Homo MG181277, Homo Uganda, 2013 (868-914) GENECONV 2.55E-22 sapiens, Kenya, 2012, sapiens, Malawi, 2000, BootScan 2.53E-21 95.9%, A2 96.5%, A1 MaxChi 1.96E-14 Chimera 8.97E-15 SiScan 1.22E-19 3Seq 3.18E-09 NSP1 DQ199658, Homo sapiens, 485 (440-517)- 942 RDP 2.25E-22 KP883160, Homo DQ199657, Homo Australia, 2001 (931-956) GENECONV 7.04E-20 sapiens, Malawi, sapiens, Australia, 1997, BootScan 1.52E-21 2008, 98.1%, A1 98.7%, A1 MaxChi 2.09E-12 Chimera 1.10E-12 SiScan 6.14E-12 3Seq 2.61E-35 NSP1 KJ482257, Sus scrofa, 1534 (1490-44)- RDP 3.24E-20 KJ482252, Sus scrofa, KJ482255, Sus scrofa, Brazil, 2013 888 (853-909) GENECONV 2.11E-19 Brazil, 2013, 92.9%, Brazil, 2013, 99.9%, A8 BootScan 5.21E-10 A8 MaxChi 9.52E-18 SiScan 2.81E-21 3Seq 1.81E-29 NSP1 KX778616, Homo sapiens, 1147 (1127-1155)- RDP 1.14E-19 KJ752290, Homo KX363351, Sus scrofa, Dominican Republic, 2013 1462 (1433-5) GENECONV 4.51E-17 sapiens, Zambia, Vietnam, 2012, 95.9%, BootScan 8.47E-16 2009, 96.3%, A1 A8 MaxChi 1.54E-11 Chimera 1.93E-12 SiScan 9.28E-13 3Seq 3.18E-09 NSP1 KJ482254, Sus scrofa, 534 (499-559)- 886 RDP 7.72E-18 KJ482247, Sus scrofa, KJ482256, Sus scrofa, Brazil, 2013 (849-911) GENECONV 1.84E-16 Brazil, 2013, 100%, Brazil, 2013, 100%, A8 BootScan 3.89E-14 A8 MaxChi 2.53E-10 Chimera 2.27E-10 SiScan 1.55E-08 3Seq 8.01E-16 NSP1 KX632348, Homo sapiens, 580 (508-605)- 844 RDP 1.91E-10 KP752785, Homo KP882434, Homo Uganda, 2013 (829-874) GENECONV 3.85E-10 sapiens, Ethiopia, sapiens, Ghana, 2009, BootScan 1.02E-06 2010, 94.9%, A1 92.8%, A2 MaxChi 2.32E-04 Chimera 2.29E-04 SiScan 6.75E-08 3Seq 3.18E-09

48 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

NSP1 KP752999, Sus scrofa, 1394 (1368-31)- RDP 1.38E-06 KF835942, Homo Unknown (close to South Africa, 2009 601 (552-631) BootScan 5.20E-05 sapiens, Hungary, KP753000 Sus scrofa, MaxChi 2.04E-08 2005, 93.9%, A8 South Africa, 2009), A8 Chimera 2.74E-08 SiScan 3.04E-06 3Seq 2.78E-12 NSP1 JX040426, Homo sapiens, 662 (559-677)- 842 RDP 8.72E-03 KU292525, Homo KP882478, Homo India, 2009 (783-884) GENECONV 1.35E-05 sapiens, India, 2014, sapiens, Ghana, 2008, BootScan 4.19E-02 98.4%, A11 95%, A11 MaxChi 1.57E-04 Chimera 5.50E-05 SiScan 4.50E-03 3Seq 6.62E-07 NSP1 KP753000, Sus scrofa, 1389 (1359-41)- RDP 4.27E-03 JQ309141, Equus KF835942, Homo South Africa, 2009 602 (504-631) BootScan 4.30-03 callabus, United sapiens, Hungary, 2005, MaxChi 1.12E-04 Kingdom, 1975, 92.8%, A8 Chimera 1.52E-05 87.2%, A8 SiScan 3.90E-04 3Seq 5.58E-05 NSP1 KJ482254, Sus scrofa, 1534 (1461-44) – R --- KJ482247, Sus scrofa, KJ482255, Sus scrofa,

Brazil, 2013 533 (499-909) GENECONV: 1.50E-13 Brazil, 2013, 99.1%, Brazil, 2013, 100%, BootScan: 5.96E-10 A8 A8

MaxChi: 3.77E-08

Chimaera: 6.43E-08

SiScan: 1.88E-11 3Seq: 1.46 E-24 NSP2 KJ753657, Homo sapiens, (0)- (219-278) RDP 2.01E-03 KM026633, Homo KJ918915, Homo sapiens, South Africa, 2004 GENECONV 1.6E-04 sapiens, Brazil, 2003, Hungary, 2012, 99.5%, KM026663, Homo sapiens, BootScan 4.07E-06 98.6%, N1 N1 Brazil, 2009 MaxChi 3.51E-03 KM026664, Homo sapiens, Chimera 3.30E-03 Brazil, 2009 SiScan 1.90E-07

3Seq 1.57E-04

NSP2 MF18783, Homo sapiens, 1015 (1002-57)- GENECONV 1.60E-14 KJ752146, Homo KX655480, Homo USA, 2011 286 (273-297) BootScan 1.55E-14 sapiens, South Africa, sapiens, Uganda, 2012, MaxChi 6.46E-04 2011, 99.7%, N1 99.3%, N2 Chimera 3.74E-06 SiScan 4.37E-08 3Seq 5.75E-22 NSP2 MF18783, Homo sapiens, 1015 (1002-57)- GENECONV 1.60E-14 KJ752146, Homo KX655480, Homo USA, 2011 286 (273-297) BootScan 1.55E-14 sapiens, South Africa, sapiens, Uganda, 2012, MaxChi 6.46E-04 2011, 99.7% 99.3% Chimera 3.74E-06 N1 N2 SiScan 4.37E-08 3Seq 5.75E-22

49 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

NSP2 AB779641, Sus scrofa, 870 (833-921)- 216 RDP 6.30E-05 KX363374, Sus JF796719, Sus scrofa, Thailand, 2008 (150-396) GENECONV 4.67 E-03 scrofa, Vietnam, South Korea, 2006, 100% BootScan 1.24 E-03 2012, 96.5% N1 MaxChi 1.28 E-03 N1 Chimaera 6.20 E-03

3Seq 1.48 E-05

NSP4 MF58092, Homo sapiens, 599 (546-616)- RDP 1.64 E-11 JF813107, Homo MG781050, Homo

China, 2013 (745-94) GENECONV 2.38 E-08 sapiens, China, 2008, sapiens, Thailand, 2011,

BootScan 1.14 E-09 97.9% 73.1%

MaxChi 1.01 E-09 E1 E1

Chimaera 4.58 E-10

SiScan 4.77 E-18 3Seq 6.43 E-21 NSP4 AB361290, Homo sapiens, 624 (548-635)- 726 RDP 1.66 E-09 AB326338, Homo Unknown (FJ492834, Sus India, 2006 (712-50) GENECONV 4.38 E-08 sapiens, India, 1989, scrofa, Ireland) BootScan 6.96 E-07 95.8% E9 MaxChi 1.96 E-03 E2 Chimaera 2.49 E-05 3Seq 1.15 E-12 NSP4 AB326966, Homo sapiens, 605 (564-614)- 675 RDP 2.77 E-09 FJ685615, Homo Unknown (MG181929, India, 2001 (670-46) GENECONV 1.10 E-10 sapiens, India, 1992, Homo sapiens, Malawi, BootScan 2.26 E -02 98.2% 2012) MaxChi 2.54 E -05 E1 E2 Chimaera 1.63 E -03 3Seq 8.89 E-09 NSP5 M33608, Homo sapiens, 619 (505-624)- 668 RDP 1.44 E -05 MG996138, Homo JQ358774, Homo sapiens, 1993 (665-182) GENECONV 2.64 E-06 sapiens, Singapore, India, 2011, 100% BootScan 4.1 E-05 2016, 95% H3 MaxChi 4.54 E-02 H2 Chimaera 3.70 E-02 SiScan 2.49 E-05 3Seq 6.61 E-04 905 906

50 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

907 Supplementary Table 2. All amino acid changes observed in the segment 9 (VP7) recombinants 908 which had more than one isolate supported the event. If changes were shown to occur in a predicted 909 epitope region, it is noted. 910 911 Gene Accession Position Major Minor Epitope recombinant Number region?

G6-G6 KF170899 40 M T - 46 I T - 48 V T - 116 V I - G3-G1 KJ751729 281 V I Yes 303 V I Yes 309 A V Yes 318 N D G9-G1 AF281044 135 S C - 162 D E - 163 R W - 169 V D 170 N I 176 L Q 180 G E Yes G2-G1 KC443034 9 I T - 10 F I - 21 S Y - 25 T S - 26 I V - 28 N R - 29 P I - 31 A D - 32 P Y - 35 F Y - 41 I F - 42 A V - 43 L A - 44 M L - 45 S F - 47 F L - 48 G T - 49 R K - 50 T A - 56 Y N - 65 A T - 68 T S - 72 S R -

51 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

73 G E - 75 S V - 912

913 914 Supplementary Figure 1. P8, P6, and P4 consensus RNA secondary structure mountain plots. Mountain plot 915 showing consensus secondary structure of VP4 genotypes P8, P6, and P4. Lighter colors indicate lower 916 probabilities of base-pairing. Peaks correspond to hairpin-loops, plateus to loops, and slopes to helices. Plot was 917 generated using RNAalifold in ViennaRNA package.

918

52 bioRxiv preprint doi: https://doi.org/10.1101/794826; this version posted October 7, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

919 920

A B

921 922 Supplementary Figure 2. Consensus mountain plot with recombination breakpoint distribution plot in grey. A) 923 NSP1 B) VP3

53