This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use:

• This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. • A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. • This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. • The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. • When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given.

The Epidemiology, Classification and Evolution of Human Rhinoviruses

Chloé Leanne McIntyre

A thesis submitted for the degree of Doctor of Philosophy

The University of Edinburgh

March 2013

Abstract

Human rhinoviruses (HRV) are extremely common human respiratory pathogens, most commonly associated with mild upper respiratory tract infections. The three known species of HRV (HRV-A, -B and –C) are members of the family Picornaviridae and genus Enterovirus. In contrast to the enterovirus (EV-A-D) species that commonly infect the gut, HRV are generally thought to be acid labile with replication restricted to the respiratory tract.

Investigations of the clinical correlations of HRV infections detected on diagnostic screening of respiratory specimens demonstrated no specific association between HRV variant and clinical presentation. For example, similar species distributions were observed in patients admitted to the ITU and those discharged with minor illness. Unexpectedly, screening of stool specimens for HRV showed a prevalence of 10% with viral loads similar to EV infections. These findings suggested that a reappraisal of HRV tropism and disease associations may be warranted.

HRV-A and -B isolates were originally classified into 100 serotypes by serological neutralisation properties. As HRV-C is difficult to isolate, no attempt had been previously made to classify the wealth of available HRV-C sequences. To facilitate definition of novel HRV types and classification of HRV-C, a system was devised to divide HRV sequences into genotypically defined types. Pairwise VP1 nucleotide p-distance analysis revealed distinct thresholds between inter- and intra- type divergence and available sequences were classified into 77 HRV-A, 29 -B and 51 -C types. This provides a standardised basis for type definition and identification, allowing consistency in studies of genetic diversity, epidemiology and evolution. It has been adopted by the ICTV Study Group for classification of HRV.

i

Abstract

Although the occurrence of recombination has been documented within the coding region of EV, analysis of dated HRV sequences revealed an overall lack of intra-species recombination between three coding regions of HRV-B and -C. In contrast, full HRV-A type groups appeared to have been subject to a large number of recombination events, suggesting extensive recombination during the period of its diversification into types. Putative recombination breakpoints localised to the non-structural region. Within HRV-A and HRV-B, recombination within the 5ˈUTR was infrequent. However, over 60% of analysed HRV-C strains grouped within the HRV-A clade and two recombination hotspots were identified. An additional interspecies recombination event was detected between HRV-A/C in the 2A coding region, with putative breakpoints mapping to the boundaries of the C-terminal domain of the proteinase.

The studies within this thesis provide evidence for a broadened understanding of the clinical significance of HRV. In addition, the assignment of HRV sequences into genotypically defined types allowed description of the observed genetic diversity and completion of analysis which reaffirmed the sporadic nature of recombination within the coding region of HRV.

ii

Declaration of Originality

I declare that the work recorded within this thesis was completed entirely by myself at the Centre for Infectious Diseases and Roslin Institute of the University of Edinburgh between September 2009 and March 2013. In cases where others have made a contribution to the results obtained, this is clearly stated within the text. This work has not been submitted for any other degree or professional qualification. In cases where portions of the work contained within this thesis have been published in academic journals, this is indicated within the text and copies of papers are given as appendices. Where required by license agreement, permission to include published papers has been sought from publishers.

Chloé McIntyre

iii

Acknowledgements

A hundred people have contributed in a thousand unique ways to the completion of this PhD and there is not enough paper in the world to thank them all as much as they deserve. For my supervisor, Professor Peter Simmonds. Thank you for giving this opportunity to a young surgical trainee with little experience. Your unending patience, enthusiasm and dedication over the last 4 years has taken me from a clueless newcomer to a fledgling scientist able and willing to think for herself. I cannot thank you enough. For the others who have been involved in my academic and scientific growth, a special thanks. For Professor Paul Sharp who has always been available for advice and support. For Dr Heli Harvala, whose infectious enthusiasm and incredible passion for virology has widened the horizons of my interest and experience far beyond what I initially imagined. For Dr Carol McWilliam-Leitch who took me from having never held a pipette to confidently designing my own PCR protocols far faster than I ever thought possible. For all the PhD students and post-docs in our lab past and present; Elly, Ingrid, Sinead, Nora, Nigel, Jereon, Donald, Jill, Colin, Inga, Nicky, Richard and Carol. Not only have you all been at various times an indispensible source of wisdom, but you have brightened my days and almost never been too busy to meet for a beer and a giggle after hours. To the students whose projects I have been fortunate enough to be involved in, thank you for allowing me to learn how to teach and keeping me smiling with your enthusiasm and humour. For my wonderful parents who have been unfailingly loving and supportive and never questioned my decision to follow my passion. Without you, none of this would have ever been possible. You have always made me feel like a star! I cannot thank you enough. For my best friend, life-partner, handyman, life-coach, chef, therapist, IT-support, removal man, musical hero, brew-master and most of all, amazing husband; Kevin. Thank you for being so endlessly, uncomplainingly, beautifully patient with me; with my long working hours, tiredness, frustration and endless, obsessive fixation on minute details. Your love and support has been the cornerstone of everything I’ve done. And most of all, to our little bun in the oven. I hope that someday you will read this and smile. Mummy loves science and Mummy loves you.

iv

List of Original Publications

Original publications directly associated with this thesis

McIntyre CL, McWilliam Leitch EC, Savolainen-Kopra C, Hovi T, Simmonds P. 2010. Analysis of genetic diversity and sites of recombination in human rhinovirus species C. J Virol 84:10297–310.

McIntyre CL, Savolainen-Kopra C, Hovi T, Simmonds P. 2013. Recombination in the evolution of human rhinovirus . Arch Virol. [Epub ahead of print]

Simmonds P, McIntyre CL, Savolainen-Kopra C, Tapparel C, Mackay IM, Hovi T. 2010. Proposals for the classification of human rhinovirus species C into genotypically assigned types. J Gen Virol 91:2409–19.

Harvala H, McIntyre CL, McLeish NJ, Kondracka J, Palmer J, Molyneaux P, Gunson R, Bennett S, Templeton K, Simmonds P. 2012. High detection frequency and viral loads of human rhinovirus species A to C in fecal samples; diagnostic and clinical implications. J Med Virol 84:536–42.

McIntyre CL, Knowles NJ, Simmonds P. 2013. Proposals for the classification of human rhinovirus species A, B and C (HRV-A, -B and –C) into genotypically assigned types. Submitted.

Original publications to which the author has otherwise contributed

Gaunt ER, Harvala H, McIntyre CL, Templeton KE, Simmonds P. 2011. Disease burden of the most commonly detected respiratory in hospitalized patients calculated using the disability adjusted life year (DALY) model. J Clin Virol. 52(3):215-21

McLeish NJ, Witteveldt J, Clasper L, McIntyre CL, McWilliam Leitch EC, Hardie A, Bennett S, Gunson R, Carman WF, Feeney SA, Coyle PV, Vipond B, Muir P, Benschop K, Wolthers K, Waris M, Osterback R, Johannessen I, Templeton K, Harvala H, Simmonds P. 2012. Development and assay of RNA transcripts of enterovirus species A to D, rhinovirus species A to C, and human parechovirus: assessment of assay sensitivity and specificity of real-time screening and typing methods. J Clin Micro 50:2910–7.

v

List of Original Publications

Harvala H, Gaunt E, McIntyre CL, Roddie H, Labonte S, Curran E, Othieno R, Simmonds P, Bremner J. 2012. Epidemiology and clinical characteristics of parainfluenza 3 outbreak in a Haemato-oncology unit. J Infect. 65(3):246-54

Harvala H, McIntyre CL, Imai N, Clasper L, Djoko CF, LeBreton M, Vermeulen M, Saville A, Mutapi F, Tamoufé U, Kiyang J, Biblia TG, Midzi N, Mduluza T, Pépin J, Njouom R, Smura T, Fair JN, Wolfe ND, Roivainen M, Simmonds P. 2012. High seroprevalence of enterovirus infections in apes and old world monkeys. Emerg Infect Dis 18(2):283–6.

Harvala H, Mcleish N, Kondracka J, Mcintyre CL, Leitch ECM, Templeton K, Simmonds P. 2011. Comparison of Human Parechovirus and Enterovirus Detection Frequencies in Cerebrospinal Fluid Samples Collected Over a 5-Year Period in Edinburgh : HPeV Type 3 Identified as the Most Common Picornavirus Type. J Clin Micro 896:889–896.

vi

Abbreviations

AdV adenovirus A&E Accident and Emergency department AICc small sample Akaike information criterion AOM acute otitis media Arg arginine ATCC American Tissue Culture Collection BEAST Bayesian Evolutionary Analysis of Sampling Trees BEAUti Bayesian Evolutionary Analysis Utility BLAST Basic Local Alignment Search Tool BRAV Bovine rhinitis A virus BRBV Bovine rhinitis B virus CAV Coxsackie A virus CBV Coxsackie B virus cDNA complementary deoxyribonucleic acid CFLI cold and flu-like illness CHI community health index CNS central nervous system cre cis-acting replication element CSF cerebrospinal fluid Cys cysteine CT cycle threshold for real-time PCR DAF decay accelerating factor DDBJ DNA Databank of Japan DNA deoxyribonucleic acid dNTP deoxynucleotide triphosphates dsRNA double stranded ribonucleic acid EDTA ethylenediaminetetraacetic acid EMBL European Molecular Biology Laboratory EMCV encephalomyocarditis virus ERAV equine rhinitis A virus ERBV equine rhinitis B virus EV enterovirus

vii

Abbreviations

FluA influenza A virus FluB influenza B virus FMDV foot and mouth disease virus GARD genetic algorithm for recombination detection GP general practitioner HAV hepatitis A virus HCoV human coronavirus HCV hepatitis C virus HDU High Dependency Unit HIV human immunodeficiency virus HKY Hasegawa-Kishino-Yano model of nucleotide substitution HPA Health Protection Agency HMPV human metapneumovirus HRV human rhinovirus HSV herpes simplex virus ICAM-1 intercellular adhesion molecule 1 ICTV International Committee on the Taxonomy of Viruses IFA immunofluorescence indels insertions and deletions in nucleotide sequences IRES internal ribosomal entry site ITU Intensive Care/Therapy Unit kB kilobase L leader protein LDL low density lipoprotein LRTI lower respiratory tract infection MAU Medical Assessment Unit MCL maximum composite likelihood MCMC Markov Chain Monte Carlo MEGA Molecular Evolutionary Genetics Analysis NCBI National Center for Biotechnology Information NJ neighbour joining NPEV non polio enterovirus nt nucleotide OTU operational taxonomic unit pat provisionally assigned human rhinovirus type

viii

Abbreviations

PBS phosphate buffered saline PCR polymerase chain reaction PIM2 Paediatric Index of Mortality score PIV parainfluenza virus PPT polypyrimidine tract PV poliovirus RDP Recombination detection program RF recombinant form RHSC Royal Hospital for Sick Children (Edinburgh) RIE Royal Infirmary of Edinburgh RNA ribonucleic acid RT reverse transcription RT-PCR reverse transcription polymerase chain reaction SBP Single breakpoint recombination detection SGPB Streptomyces griseus proteinase B SVC Scottish Virology Centre TAE tris-acetate EDTA Tm melting temperature UTR untranslated region URTI upper respiratory tract infection UV ultraviolet VPg viral protein linked VZV varicella zoster virus WHO World Health Organisation

ix

List of Figures

FIGURE 1.1 Neighbour joining phylogenetic tree depicting relationships between species within the Enterovirus genus…………….. ..7 FIGURE 1.2 Neighbour joining phylogenetic tree showing the prototype strains of HRV-A and corresponding biological properties...12 FIGURE 1.3 The structure of human rhinovirus virions…………………..13 FIGURE 1.4 Sequence of cleavages during proteolytic processing of the translated genome of enteroviruses………………………….16 FIGURE 1.5 RNA secondary structure of picornavirus IRES elements…..18

FIGURE 1.6 Summary of the picornavirus life cycle……………………..20 FIGURE 2.1 “One way” laboratory system for the prevention of carryover contamination in PCR reactions……………………………..37 FIGURE 3.1 Total number of respiratory samples referred to the SVC for virological testing per month from Oct 05 to May 11………68 FIGURE 3.2 Epidemiological analysis of HRV infections in patients from September 2008 – May 2009………………………………..72 FIGURE 3.3 HRV species isolated from distinct patient groups………… 74 FIGURE 3.4 Neighbour joining phylogenetic trees of the VP4/VP2 region of HRV-A, HRV-B and HRV-C…………………………....78 FIGURE 3.5 Neighbour joining phylogenetic tree showing the VP4/VP2 region of HRV positive samples from respiratory specimens and stool specimens…………………………………………82 FIGURE 4.1 Distributions of pairwise nucleotide p-distances for the VP1 region of all three species of HRV…………………………100 FIGURE 4.2 Distribution of pairwise nucleotide p-distances for the VP1 region of all HRV sequences………………………………101 FIGURE 4.3 Phylogenetic trees showing the VP1 region of HRV-A, -B and –C…………………………………………………………..102 FIGURE 4.4 Neighbour joining phylogenetic trees showing classical HRV serotype pairs which do not conform to the proposed VP1 divergence thresholds……………………………………. 108

x

List of Figures

FIGURE 4.5 Frequency histogram showing the distribution of HRV-A VP1 pairwise p-distances immediately surrounding the threshold of 13% ………………………………………………………..110 FIGURE 4.6 Neighbour joining phylogenetic trees showing the VP4/VP2 region of all three species of HRV…………………………124 FIGURE 4.7 Distributions of pairwise nucleotide p-distances for the VP4/VP2 region of all three species of HRV…………… 126 FIGURE 4.8 Total number of identified HRV strains divided by geographical location of isolation………………………….132 FIGURE 5.1 Neighbour joining phylogenetic trees of study sequences and corresponding regions from published complete genome sequences from the VP4/VP2, VP1 and 3Dpol regions of HRV-C…………………………………………………… 147 FIGURE 5.2 Comparison of phylogenies obtains from analysis of the 5ˈ and 3ˈ ends of 3Dpol of HRV-C sequences…………………….149 FIGURE 5.3 Neighbour joining phylogenetic trees showing the partial 5ˈUTR and VP4/VP2 regions…………………………… 151 FIGURE 5.4 Neighbour joining phylogenetic tree showing the 5ˈUTR region of all HRV-C full genomes…………………………152 FIGURE 5.5 Recombination breakpoints for three representative HRV-Ca sequences…………………………………………………..154 FIGURE 5.6 Sequence variability scan of the 5ˈUTR showing mean HRV- A, -B and –Cc intertype distances and interspecies distance of HRV-A/Cc…………………………………………………156 FIGURE 5.7 GroupScan analysis of the extreme 5ˈ end of the HRV-C genome……………………………………………………..157 FIGURE 5.8 Mean pairwise (uncorrected) p-distances of nucleotide (A) and amino acid (B) sequences in the VP4/VP2, VP1 and 3Dpol regions of all three HRV species………………………… 158 FIGURE 5.9 Sequence divergence scans showing mean nucleotide and amino acid divergences within complete genome sequences of HRV-C and between HRV-C and HRV-A/-B…………….159

xi

List of Figures

FIGURE 5.10 Neighbour joining phylogenetic trees showing the putative recombinant portion (3267 – 3525) of the 2A coding region as compared with 5ˈ (3006 – 3266) and 3ˈ (3526 – 3786) flanking regions……………………………………………………...161 FIGURE 5.11 Neighbour joining phylogenetic trees showing the 5ˈUTR and recombinant 2A regions of HRV-C sequences…………….163 FIGURE 5.12 Sequence alignment of the 2A region of selected HRV-A, -B and –C sequences…………………………………………. 164 FIGURE 5.13 Pairwise nucleotide and amino acid p-distances for the 2A coding region of HRV-A, HRV-C and HRV-A/-C………. 171 FIGURE 6.1 Neighbour joining phylogenetic trees showing the VP4/VP2 and VP1 region of HRV-A, -B and –C…………………….182 FIGURE 6.2 Likelihood mapping analysis of the VP4/VP2, VP1 and 3Dpol regions of HRV-A, -B and –C……………………………..184 FIGURE 6.3 Mean substitution rates and 95% highest posterior density intervals for two coding regions of HRV-A and HRV-C….185 FIGURE 6.4 Comparison of pairwise nucleotide p-distances between the VP1 and 3Dpol regions of HRV-A, -B and –C……………187 FIGURE 6.5 Neighbour joining phylogenetic trees showing the VP1 and partial 3Dpol regions of all available HRV-A strains……...189 FIGURE 6.6 Neighbour joining phylogenetic trees showing distinct patterns of recombination observed in HRV-A sequences………….194 FIGURE 6.7 Neighbour joining phylogenetic trees showing the P1, P2 and P3 regions of HRV-A sequences………………………… 196 FIGURE 6.8 Comparison of pairwise nucleotide p-distances between the P1/P2 and P2/P3 regions of HRV-A……………………….200 FIGURE 6.9 Neighbour joining phylogenetic trees of HRV-B (A) and HRV-C (B) VP1 and partial 3Dpol sequences…………….202 FIGURE 6.10 Neighbour joining phylogenetic trees showing the P1, P2 and P3 regions of HRV-B sequences…………………………...203 FIGURE 6.11 A: GroupScan of two representative recombinant HRV-A types. B: Calculated coding region recombination breakpoints for all HRV-A recombinant types mapped onto a diagram of the HRV-A full genome……………………….205

xii

List of Figures

FIGURE 6.12 Neighbour joining phylogenetic trees constructed for the 5ˈUTR and VP4 regions of HRV-A and HRV-B sequences…………………………………………………..207

xiii

List of Tables

TABLE 1.1 Members of the family Picornaviridae and their natural hosts…………………………………………………………..4 TABLE 1.2 Cell surface receptors used by selected picornavirus genera...... 21

TABLE 2.1 Reagents included in reverse transcription reactions………..39 TABLE 2.2 Codes for individual nucleotides and ambiguous bases used for design of PCR primers…………………………………..41 TABLE 2.3 Reagents used in PCR reactions…………………………….42 TABLE 2.4 Cycling conditions used in PCR reactions………………….42 TABLE 2.5 Modifications to standard PCR protocols used for individual PCR reactions……………………………………………….43 TABLE 2.6 Reagents used in Superscript III RT-PCR reactions……………………………………………………..45 TABLE 2.7 Cycling conditions used in Superscript III RT-PCR reactions……………………………………………………..45 TABLE 2.8 Reagents used in sequencing reactions…………………… 48 TABLE 2.9 Cycling conditions used in sequencing reactions………… 48 TABLE 3.1 Clinical characteristics of patients with HRV isolated from stool samples………………………………………………...79 TABLE 3.2 Comparison of detection rates between HRV positives isolated from stool and respiratory samples……………………….... 83 TABLE 4.1 Co-ordinates of capsid coding genome regions used in analysis and corresponding HRV reference strain………………….. 97 TABLE 4.2 Limits of intra- and inter- clade VP1 p-distance for HRV-A, -B and –C…………………………………………………. 103 TABLE 4.3 Confirmed HRV-A, -B and -C types with full genome sequences available………………………………………...112 TABLE 4.4 Confirmed HRV-C types defined by VP1 sequence...... 120

xiv

List of Tables

TABLE 4.5 Provisionally assigned types of HRV based on VP4/VP2 sequence divergence……………………………………….127 TABLE 4.6 Limits of inter- and intra-clade divergence in HRV types which do not conform to VP4/VP2 thresholds……………128 TABLE 5.1 Accession numbers of sequences generated during the study of recombination in HRV-C in 2006/2007…………………. 145 TABLE 5.2 Calculated 5ˈUTR recombination breakpoints for HRV-C types analysed……………………………………………...155 TABLE 6.1 Phylogenetic incongruities observed between VP1 and 3Dpol in HRV-A sequences……………………………………....191 TABLE 6.2 Phylogenetic incongruities observed between the P1, P2 and P3 regions in HRV-A sequences which lack bootstrap support in the VP1 or 3Dpol regions……………………………….197 FIGURE 6.3 HRV-A types which displayed evidence of phylogenetic incongruity between both the P1/P2 and P2/P3 regions… 195

xv

Contents

Abstract ...... i Declaration of Originality ...... iii Acknowledgements ...... iv List of Original Publications ...... v Abbreviations ...... vii List of Figures ...... x List of Tables ...... xiv Contents ...... xvi Chapter 1 Introduction ...... 1 1.1 Introduction ...... 1 1.2 Taxonomy and classification of viruses ...... 1 1.2.1 General ...... 1 1.2.2 Classification of the Picornaviridae ...... 3 1.2.3 Classification of the genus Enterovirus ...... 5 1.2.4 Classification of human rhinoviruses (HRV) ...... 7 1.3 General aspects of picornavirus biology...... 13 1.3.1 Structure of picornavirus virions ...... 13 1.3.2 Genome organisation and proteolytic processing ...... 14 1.3.3 RNA secondary structure in picornavirus genomes ...... 17 1.3.4 Life cycles of ...... 19 1.3.5 Pathogenesis of enterovirus infections ...... 23 1.4 Evolution of RNA viruses ...... 25 1.4.1 General points in the evolution of RNA viruses ...... 25 1.4.2 Point mutations ...... 26 1.4.3 Recombination ...... 27 1.5 Aims of this thesis ...... 29 Chapter 2 Materials and Methods ...... 33 2.1 Storage and archiving of clinical samples ...... 33 2.2 Laboratory techniques ...... 35 2.2.1 Prevention of contamination of polymerase chain reaction products ...... 35 2.2.2 Extraction of RNA from clinical samples ...... 37 2.2.3 Reverse transcription ...... 38 2.2.4 PCR and sequencing of HRV positive samples ...... 40 2.3 Computational Methods ...... 48 2.3.1 Statistical methods ...... 48 2.3.2 Sequence alignment ...... 48

xvi

Contents

2.3.3 Database searching ...... 49 2.3.4 Likelihood mapping analysis ...... 51 2.3.5 Construction of phylogenetic trees ...... 52 2.3.6 Bootstrapping ...... 54 2.3.7 Recombination detection methods ...... 55 2.3.8 Bayesian Evolution Analysis of Sampling Trees (BEAST) ...... 60 Chapter 3 Epidemiology and Clinical Correlations of HRV ...... 62 3.1 Introduction ...... 62 3.2 Materials and Methods ...... 67 3.2.1 Sample selection ...... 67 3.2.2 Amplification of 5ˈUTR and VP4/VP2 region of HRV positive samples ...... 70 3.3 Results ...... 70 3.3.1 Epidemiological characteristics of HRV circulating in Edinburgh between September 2009 and May 2010 ...... 70 3.3.2 Comparison of characteristics of HRV positive paediatric patients between acute and high dependency care settings ...... 73 3.3.3 Screening of CSF samples for HRV ...... 75 3.3.4 Isolation of HRV from faecal samples ...... 76 3.3.5 Discrepancy of screening results between real time PCR in diagnostic setting and VP4/VP2 screening PCR ...... 80 3.3.6 Analysis of concurrent HRV circulation in respiratory samples during September and December 2010 ...... 83 3.4 Discussion ...... 84 3.4.1 Epidemiology and clinical correlations of HRV respiratory infections in Edinburgh ...... 84 3.4.2 Screening for HRV in cerebrospinal fluid samples ...... 87 3.4.3 Isolation of HRV from faecal specimens ...... 87 Chapter 4 Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types ...... 91 4.1 Introduction ...... 91 4.1.1 The standard approach to typing in HRV ...... 91 4.1.2 Definition of new EV types based on sequence divergence in the VP1 region 93 4.1.3 The requirement for genotypic classification of HRV ...... 94 4.2 Materials and Methods ...... 95 4.2.1 Selection of samples ...... 95 4.2.2 Amplification of the VP1 region ...... 96 4.2.3 Sequence alignment, calculation of pairwise nucleotide p-distances and phylogenetic analysis ...... 97 4.2.4 Analysis of recombination within the capsid region of HRV-A, -B and –C ... 98 4.3 Results ...... 98

xvii

Contents

4.3.1 Assigning nucleotide divergence thresholds in the VP1 region of HRV-A, -B and –C ...... 98 4.3.2 Phylogenetic analysis of the VP1 region ...... 101 4.3.3 Identification of putative new HRV types on the basis of sequence divergence in VP1 ...... 104 4.3.4 Analysis of HRV type groups which do not conform to the proposed VP1 divergence thresholds ...... 106 4.3.5 Assignment of type designations based on VP1 divergence for all three species of HRV ...... 111 4.3.6 Analysis of recombination within the capsid coding region of HRV and phylogenetic analysis of the VP4/VP2 region ...... 122 4.3.7 Assigning nucleotide divergence thresholds for the VP4/VP2 region ...... 125 4.3.8 Analysis of confirmed HRV types which do not conform to VP4/VP2 thresholds ...... 128 4.3.9 Geographical distribution of HRV types ...... 130 4.4 Discussion ...... 133 4.4.1 Development of a sensitive and specific RT-PCR for the amplification of the VP1 region of HRV ...... 133 4.4.2 The importance of genotyping in HRV ...... 134 4.4.3 Proposed criteria for the division of HRV into genotypically defined types 134 4.4.4 Future directions ...... 136 Chapter 5 Analysis of genetic diversity and sites of recombination in HRV-C ...... 138 5.1 Introduction ...... 138 5.1.1 Recombination in the evolution of human enteroviruses ...... 138 5.1.2 Recombination in the 5ˈUTR of HRV-C ...... 140 5.1.3 Genetic features of HRV-C ...... 141 5.2 Materials and Methods ...... 143 5.2.1 Sample selection ...... 143 5.2.2 Amplification of the VP4/VP2, VP1 and 3Dpol regions of HRV-C ...... 143 5.2.3 Amplification of the 5ˈUTR region of HRV-C ...... 143 5.2.4 Amplification of the 2A region of HRV-C...... 144 5.2.5 Dataset construction ...... 144 5.3 Results ...... 146 5.3.1 Phylogenetic analysis of the VP4/VP2, VP1 and 3Dpol regions of HRV-C 146 5.3.2 Phylogenetic analysis of the 5ˈUTR of HRV-C ...... 150 5.3.3 Determination of recombination breakpoints within the 5ˈUTR of HRV-C . 153 5.3.4 Sequence divergence across the full genome of HRV-C ...... 157 5.3.5 Analysis of recombination within the 2A region of HRV-C ...... 159 5.4 Discussion ...... 165 5.4.1 Absence of recombination within the coding region of HRV-C ...... 165 5.4.2 Recombination within the 5ˈUTR of HRV-C ...... 167

xviii

Contents

5.4.3 Putative recombination events within the 2A coding region of HRV-C ...... 169 Chapter 6 Recombination in the evolution of HRV genomes ...... 175 6.1 Introduction ...... 175 6.2 Materials and Methods ...... 176 6.2.1 Sample selection ...... 176 6.2.2 Amplification of three coding regions of HRV ...... 177 6.2.3 Construction of datasets for recombination analysis ...... 177 6.2.4 Likelihood mapping analysis ...... 178 6.2.5 Analysis of mean substitution rates ...... 179 6.2.6 Recombination analysis ...... 179 6.3 Results ...... 181 6.3.1 Phylogenetic analysis within HRV type groups ...... 181 6.3.2 Likelihood mapping analysis ...... 183 6.3.3 Analysis of mean substitution rates ...... 184 6.3.4 Comparisons of pairwise nucleotide p-distances ...... 186 6.3.5 Analysis of recombination within the coding region of HRV-A ...... 188 6.3.6 Recombination between the P1, P2 and P3 regions of HRV-A ...... 195 6.3.7 Analysis of recombination within the coding region of HRV-B and HRV-C sequences ...... 201 6.3.8 Determining putative recombination breakpoints ...... 204 6.3.9 Recombination within the 5ˈUTR of HRV-A and HRV-B ...... 206 6.4 Discussion ...... 208 6.4.1 Estimation of mean substitution rates and analysis of pairwise nucleotide p- distances ...... 208 6.4.2 Analysis of recombination within the coding region of HRV-A, HRV-B and HRV-C ...... 210 6.4.3 Putative recombination breakpoints within the coding region ...... 214 6.4.4 Recombination within the 5ˈUTR of HRV-A and HRV-B ...... 214 6.4.5 Conclusion ...... 215 Chapter 7 Concluding remarks ...... 217 7.1 A broad view of the global importance of HRV ...... 217 7.2 Novel findings in the clinical epidemiology of HRV ...... 218 7.3 Classification of HRV into genotypically defined types ...... 221 7.4 Recombination in the evolution of HRV ...... 223 7.5 Conclusion ...... 227 Appendix 1 – List of Primers ...... 229 Appendix 2 – Permissions for reproduction of publications and figures ...... 232 References ...... 238

xix

Chapter 1 Introduction

1.1 Introduction

Human rhinoviruses (family Picornaviridae, genus Enterovirus) are best known as one of the major causes of the common cold. Although mainly associated with a mild and self-limiting upper respiratory tract infection (URTI), human rhinoviruses (HRV) are an established cause of other more serious clinical illness (Broberg et al., 2011; Hicks et al., 2006; Gutman et al., 2007; Fuji et al., 2011; Tapparel et al., 2009c). Childhood wheezing secondary to HRV infections has also been extensively implicated in the subsequent development of asthma (Jartti and Korppi, 2011; Jackson et al., 2008; Gern and Busse, 1999). In addition, the common cold places a considerable economic and social burden upon suffering populations, in terms of lost productivity, purchase of over-the-counter remedies and even unnecessary prescription of antibiotics (Fendrick et al., 2003). The last ten years have seen HRV research blossom with the discovery of an entire new species, many new types and ever increasing association with more severe disease. As the scope of clinical impact of HRV becomes clear, the necessity of developing an understanding of the evolutionary processes shaping the enormous diversity of these viruses and developing formal criteria for their classification has become evident. The studies presented within this thesis have aimed to address aspects of these issues.

1.2 Taxonomy and classification of viruses

1.2.1 General virus classification

The taxonomic classification and naming of viruses in a manner which allows worldwide communication between researchers has long been an important issue.

1

Introduction

Since 1971, a broad method of virus classification related to the nature of the viral genome and the method of translation has been used for high-level grouping of viruses. This is known as the System (Baltimore, 1971) and divides viruses into seven groups based on nucleic acid type, sense of genome and method of replication. Although this system remains extremely useful for developing a broad understanding of the properties of certain virus groups, the vast number of diverse subgroups within each class highlights the need for a system of defining and cataloguing smaller groups.

The International Committee for the Taxonomy of Viruses (ICTV) has overseen issues relating to the naming and taxonomic grouping of viruses since the 1970s. The committee exists to provide a universal system of classification and nomenclature and releases in the form of reports and amendments are the state of the art in the field. The currently used system of virus classification is similar to, but distinct from other biological classification systems. Classification is based on sequence data and certain biological properties where known. Groups of virus variants, strains or types are classified as species. The ICTV defines a virus species as “a monophyletic group of viruses whose properties can be distinguished from those of other species by multiple criteria” (www.ictvonline.org). Distinguishing features of different species include phylogenetic relationships, sequence identity, host range, cell tropism, pathogenesis and antigenicity. The assignment and naming of variants, strains or types is overseen by various subcommittees or study groups with a particular expertise in a certain virus family. In ascending order of magnitude, virus species are grouped together according to common aspects and phylogenetic relationships to form a genus, a group of genera forms a family and a group of families forms an order. In instances of complex hierarchical problems, subfamilies of several genera can be created. However, subfamily membership is not a necessary aspect of a particular virus’ taxonomy. There are currently, as of 2012, 7 recognised

2

Introduction orders, 96 families, 22 subfamilies, 420 genera and 2618 species of virus1. Although all proposals for naming and taxonomy of viruses above the species level must be ratified by ICTV members, each virus family is overseen by a subcommittee or study group who are responsible for the development and submission of such proposals.

1.2.2 Classification of the Picornaviridae

Picornaviruses are one of five virus families which fall within the order (also including , , Iflavirus and ) (Sanfaçon et al., 2011). The order consists of a diverse range of viruses with a single stranded positive sense ribonucleic acid (RNA) genome, grouped together on the basis of similar capsid structure and viral life cycles. Picornavirales have a very wide host range including humans, animals, plants, algae, insects and birds. In terms of viral capsid structure, these diverse viruses share a non-enveloped capsid consisting of 60 protomers, each with three distinct “jelly-roll” domains (Le Gall et al., 2008). The formal designation of the order Picornavirales replaces the putative “picorna-like supergroup”. In addition to the five known families within the order, there are two unassigned genera (Bacillarnavirus and Labyrnavirus) which belong to no known family (www.picornavirales.com).

The largest family within the order Picornavirales are the Picornaviridae. Not only is this an extremely large and genetically diverse virus family, but it contains many human and veterinary pathogens of great clinical importance, including poliovirus (PV) and foot and mouth disease virus (FMDV). The family Picornaviridae currently contains 17 recognised genera (Knowles et al., 2012)2 which cause clinical

1 These figures have been updated since the publication of the Ninth Report of the ICTV and the figures reported here reflect these updates (www.ictvonline.org). 2 Five of the seventeen picornavirus genera mentioned (Aquamavirus, Cosavirus, Dicipivirus, Megrivirus and Salivirus) were approved in Feburary 2013 and hence did not appear in the 9th ICTV report released in 2012. These newly approved genera appear in the revised ICTV Master Species List (www.ictvonline.org). In addition, over 20 currently unassigned putative picornaviruses have been found in such diverse host species as bats, eels, cats, chickens, mice and tortoises (www.picornaviridae.com). 3

Introduction disease in a wide range of human and animal hosts (Table 1.1). The family is named for the fact that all members are extremely small (pico) RNA viruses.

TABLE 1.1: Members of the family Picornaviridae and their natural hosts Genera Species Number Host of typesa Apthovirus Foot and mouth 7 Cloven-footed disease virus animals (eg/ (FMDV) cattle and pigs) Bovine rhinitis A 2 Cows virus (BRAV) Bovine rhinitis B 1 Cows virus (BRBV) Equine rhinitis A 1 Horses virus (ERAV) Cardiovirus Encephalomyocarditis 1 Pigs virus (EMCV) Theilovirus 12 Rodents Humans Aquamavirus Aquamavirus A 2 Seals Avihepatovirus Duck hepatitis A 3 Ducks virus Cosavirus Cosavirus A 24 Humans Dicipivirus Cadicivirus A 1 Dogs Enterovirus Enterovirus A (EV-A) 23 Humans Primates Enterovirus B (EV-B) 60 Humans Primates Pigs Enterovirus C (EV-C) 23 Humans Enterovirus D (EV-D) 5 Humans Primates Enterovirus E (EV-E) 4 Cows Enterovirus F (EV-F) 6 Cows Possums Enterovirus G (EV-G) 6 Pigs Sheep Enterovirus H (EV-H) 1 Primates Enterovirus J (EV-J) 6 Primates Human rhinovirus A 77b Humans (HRV-A) Human rhinovirus B 25b Humans (HRV-B) Human rhinovirus C 51b Humans (HRV-C) Erbovirus Equine rhinitis B 3 Horses virus (ERBV) Hepatovirus Hepatitis A virus 1 Humans

4

Introduction

Kobuvirus Aichivirus A 3 Humans Dogs Mice Aichivirus B 2 Cows Aichivirus C 1 Pigs Megrivirus Melegrivirus A 1 Turkeys Parechovirus Human parechovirus 16 Humans Ljungan virus 4 Rodents Salivirus Salivirus A 2 Humans Primates Sapelovirus Porcine sapelovirus 1 Pigs Simian sapelovirus 1 Primates Avian sapelovirus 1 Birds Senecavirus Seneca valley virus 1 Pigs Teschovirus Porcine teschovirus 13 Pigs Tremovirus Avian 1 Birds encephalomyelitis virus a Information included on number of types and known host species is adapted from information available on the Picornavirus study group website (www.picornaviridae.com) and additional references (Pallansch and Roos, 2007; Knowles et al., 2012). b The number of HRV types given relates to the number of types officially recognised in early 2013. Studies presented within this thesis have identified and proposed a number of additional HRV types from all three species.

The criteria for the definition of members of the same genus within the Picornaviridae are homology between certain important polypeptides (specifically, the leader protein, 2A, 2B and 3A coding regions), RNA structural homology of the internal ribosomal entry site (IRES) and amino acid identity of more than 40% in the P1 and P2 coding regions and more than 50% in the P3 coding region (Knowles et al., 2012).

1.2.3 Classification of the genus Enterovirus

The enteroviruses (EV) originally consisted of only viruses isolated from humans and were divided into four main groups based on their pathogenesis in man and suckling mice. The Coxsackie A viruses (CAV) were first isolated in 1948 from stool samples of children with an illness clinically similar to poliomyelitis (Dalldorf and Sickles, 1948). In contrast to PV which infected primates, CAV caused a flaccid paralysis in suckling mice. Coxsackie B viruses (CBV) were isolated in 1949 from

5

Introduction cases of aseptic meningitis (Melnick et al., 1949) and were found to cause a spastic paralysis and severe systemic infection in suckling mice. The final classical group of human EV were the echoviruses. These were first isolated in 1951 (Robbins et al., 1951) from tissue culture samples. They are named enteric cytopathogenic human orphan viruses due to their initial lack of association with clinical disease in either humans or laboratory animals.

While the classification based on pathogenicity was useful, as sequence data accumulated it became apparent that these subgroups did not correlate with observed phylogenetic relationships (Hyypiä et al., 1997). In addition, each observed type appeared to cause a wide spectrum of disease and so classification in clinical terms was impossible. Therefore, human EV were classified into four species (EV-A – D) on the basis of sequence identity and phylogenetic relationships. Within each species, individual types were defined either by investigation of serological properties or sequence identity (discussed in Chapter 4).

The classification of picornaviruses on the basis of phylogenetic relationships and sequence identity has led to the recognition that a great many animal viruses additionally fall within the Enterovirus genus (Table 1.1). Indeed, many EV that infect primates are members of the same species as those infecting humans. For instance, EV-A, -B and –D all predominantly infect humans and contain several simian EV (Harvala et al., 2011; Oberste et al., 2002; Harvala et al., 2012a). The recognition of this host species diversity within a single EV species has resulted in the recent proposal to remove the host species from the enterovirus name (resulting in nine EV species named EV-A – J3). HRV (for which no other host species is currently known) were also reclassified as members of the Enterovirus genus on the basis of observed similarities in life cycle, genome organisation and phylogenetic relationships (Laine et al., 2005)

3 EV-I was excluded from the naming system in order to avoid confusion with EV-1. 6

Introduction

FIGURE 1.1: Neighbour joining phylogenetic tree depicting relationships between species within the Enterovirus genus. Tree was constructed as described in Chapter 2 and using simian sapelovirus sequence AY064708 as an outgroup. Branch to tree root has been collapsed. Enterovirus species have been collapsed for ease of reference. Host species for each EV species are given in Table 1.1. Branches are scaled by genetic distance.

1.2.4 Classification of human rhinoviruses (HRV)

1.2.4.1 Classification of human rhinoviruses based on biological and serological properties Since the isolation of HRV-A1 in 1956 by two independent laboratories (Pelon et al., 1957; Price, 1956), the classification of HRV has undergone many revisions. Early studies of the biology of HRV sought to classify them as “H” or “M” strains according to their ability to grow in various different cell lines (Andrewes, 1961). 29

7

Introduction strains of HRV were classified in this way, with the vast majority being found to replicate only in human embryonic kidney cells (designated H strains). Only 2 HRV strains (which are known under the current classification system as HRV-A1 and HRV-A2) were found to be able to replicate in both human and monkey kidney cells (designated M strains) (Taylor-Robinson and Tyrrell, 1962; Arden and Mackay, 2010). This particular classification of HRV based on properties of in vitro cell culture was abandoned and is now of historical interest only.

Between 1967 and 1987, all of the known HRV isolates were collected and tested by serological neutralisation assays to determine their antigenic relationships (Kapikian et al., 1967; Hamparian et al., 1987; Kapikian et al., 1971) This collaborative program was completed in three stages and resulted in the definition of 100 sequentially numbered HRV serotypes. As the existence of distinct species of HRV was not recognised at this time, sequential numbering remains interspersed throughout the two HRV species now known as HRV-A and HRV-B4. Serotypes were initially defined on the basis of antigenic distinctness and a lack of significant cross-reactivity with known HRV serotypes (Kapikian et al., 1967). However, subsequent studies identified a number of HRV serotype pairs which exhibited reciprocal cross-reactivity (HRV-1A/1B, 2/49, 3/14, 9/32, 12/78, 13/41, 15/74, 29/44, 36/58, 8/95) and an even greater number of serotype pairs which exhibited non-reciprocal cross-neutralisation (Cooney et al., 1982; Ledford et al., 2004; Halfpap and Cooney, 1983). However, the antibody titres to heterotypic HRV were generally low.

An additional classification criteria applied to HRV-A and –B was cell surface receptor usage. Before the major receptors for HRV infection were known, competitive binding assays were used to demonstrate the existence of two distinct receptor groups. If a cell culture assay was saturated with one HRV type, most other

4 For clarity, HRV types are now conventionally listed with their species designation included (for example, HRV-A12 and HRV-B3). 8

Introduction known types could not bind and establish infection. In fact, only 4 of 24 HRV types showed no inhibition. It was therefore inferred that the majority of HRV used the same receptor (Abraham and Colonno, 1984). This “major group” HRV receptor was eventually identified as the cell surface glycoprotein intercellular adhesion molecule 1 (ICAM-1) (Greve et al., 1989); a protein which is expressed in abundance on endothelial cells. All known HRV-A and –B were then classified into two groups on the basis of receptor usage (Uncapher et al., 1991); with 91 serotypes falling within the major group and 9 within the minor. The minor group receptor was shown to be a member of the low density lipoprotein family (LDL) (Hofer et al., 1994). One type, HRV-87, was determined to use a third receptor (Uncapher et al., 1991). However, this was later reclassified as a member of the EV-D species (EV- 68) (Blomqvist et al., 2002b; Ishiko et al., 2002) with tropism for the respiratory tract.

Finally, HRV have also been classified in terms of sensitivity to certain classes of antiviral drugs. In 1990, all 100 HRV serotypes were tested against a specific panel of 15 antiviral compounds and this revealed the existence of two antiviral susceptibility groups (Andries et al., 1990). Group A consisted of 33 serotypes which were susceptible to elongated compounds, such as WIN517111 and Group B consisted of 67 serotypes susceptible to short compounds, such as chalcone, dichloroflavan and R61837. Further analysis of the amino acid sequence of this region revealed that specific mutations were predictive of the susceptibility of a particular HRV serotype to a further antiviral drug, known as pleconaril (Ledford et al., 2005).

Although methods of HRV classification based on observable biological properties have proved extremely useful in HRV research, all require extensive and time- consuming laboratory procedures. The use of phylogenetic methods based on readily

9

Introduction available sequence data for the classification of HRV has allowed much more timely and easy identification.

1.2.4.2 Classification of human rhinoviruses based on phylogenetic relationships The recognition that the 100 recognised serotypes of HRV could be divided into two species (HRV-A and –B) was achieved by phylogenetic analysis. These two distinct species groupings were shown to be present throughout the capsid coding region (Horsnell et al., 1995; Savolainen et al., 2004; Laine et al., 2005), the 5ˈ untranslated region (UTR) (Kiang et al., 2008) and the 3Dpol region at the 3ˈ end of the coding genome (Savolainen et al., 2004). The completion of full genome sequences for all 100 HRV-A and –B strains in 2009 additionally confirmed these two species groupings (Palmenberg et al., 2009).

The common usage of molecular methods in studies of HRV epidemiology has led to the rapid accumulation of sequence data and a revised understanding of many aspects of HRV classification and taxonomy. As previously mentioned, HRV were initially classified within the genus Rhinovirus (distinct from Enterovirus) on the basis of cell tropism and disease manifestations. HRV generally replicate in the respiratory tract and cause respiratory disease, whereas EV predominantly replicate in the gastrointestinal tract. However, studies of phylogenetic relatedness and sequence identity between these two groups has led to the re-classification of HRV within the genus Enterovirus (Laine et al., 2005; Savolainen et al., 2004). It has been shown that the division of HRV into subgroups based on biological properties, as described above, does not correlate exactly with the phylogenetic relationships observed within the two species (Lewis-Rogers et al., 2009). All HRV-B and most HRV-A are members of the major receptor group. All HRV-B, except HRV-B27 are members of antiviral group A and all HRV-A, except HRV-A8 and –A95 are members of antiviral group B. Within HRV-A, the members of the minor receptor group form

10

Introduction several distinct clusters. However, these do not form a distinct genetic subgroup (Figure 1.2).

In 2006, a divergent phylogenetic clade of HRV variants were discovered from sequencing of HRV from respiratory samples (Lamson et al., 2006; Renwick et al., 2007; Arden et al., 2006; McErlean et al., 2007; Lau et al., 2007; Kistler et al., 2007a; Lee et al., 2007). These viruses were originally referred to as HRV-A2 or HRV-C and they received approved status by the ICTV as a distinct HRV species in 2009 (called HRV-C). HRV-C are genetically divergent from HRV-A and –B, except in the 5ˈUTR where there was evidence for inter-species recombination with HRV-A (Huang et al., 2009; Wisdom et al., 2009a). HRV-C have proven to be refractory to in vitro cell culture by classical methods used for isolation of HRV (McErlean et al., 2008). Recently, a cell culture system based on propagation of HRV-C in sinus mucosal organ culture has been developed and although this appears to be sensitive for HRV-C detection, the organ culture method is not practical for widespread use (Bochkov et al., 2011). Due to the lack of a suitable system for extensive analysis of the biological and replicative properties of HRV-C, these viruses have not been added to the classification systems described for HRV-A and HRV-B (including receptor usage, antigenicity and drug sensitivity profiles). Instead, classification of HRV-C must necessarily rely on nucleotide and amino acid sequence data derived from molecular studies.

FIGURE 1.2: Neighbour joining phylogenetic tree showing the prototype strains of HRV-A and corresponding biological properties. Tree was constructed as described in Chapter 2. The majority of HRV-A strains are members of the major receptor group. Members of the minor receptor group are marked with a black dot. Members of antiviral group A are marked by a black box. Branches are scaled by genetic distance.

11

Introduction

12

Introduction

1.3 General aspects of picornavirus biology

1.3.1 Structure of picornavirus virions

Picornaviruses are very small RNA viruses of around 30nm in diameter. They are non-enveloped and therefore their infectivity is not affected by inorganic solvents. Despite the genetic heterogeneity observed with the capsid coding sequences of picornaviruses, the basic structure of the capsid is conserved throughout the family (Racaniello, 2007). The three dimensional structure of the virus capsid has been determined for such diverse picornaviruses as HRV (Figure 1.3), Seneca Valley Virus and FMDV (Oliveira et al., 1993; Garriga et al., 2012; Hadfield et al., 1997; Medappa et al., 1971; Venkataraman et al., 2008).

FIGURE 1.3: The structure of human rhinovirus virions. Adapted with permission from (Dreschers et al., 2007). A: Electron micrograph of purified rhinoviruses. B: Structural model of rhinovirus capsid obtained from analysis of cryoelectron microscopic data. C: Relative positions of capsid proteins VP1, VP2 and VP3 on the rhinovirus capsid.

The capsid is composed of 60 protomers, each containing one copy of the four capsid proteins VP1 to VP4. As one of the few exceptions, parechoviruses contain only three capsid proteins, as VP2 and VP4 are retained in their uncleaved precursor form VP0 (Stanway et al., 1994). The external capsid proteins, VP1 to VP3 form an 8 stranded antiparallel beta barrel (known as a “jelly roll” domain) which is a common

13

Introduction feature to other members of the order Picornavirales (Le Gall et al., 2008). The VP4 protein is located towards the internal aspect of the capsid. Groups of five protomers are arranged into 12 pentamers and these pentamers form a rigid pseudo-spherical structure with icosahedral symmetry. This configuration balances the stability and durability required for transmission with the flexibility required for cell surface binding and virion disassembly. Several studies have suggested that the picornavirus capsid may be a dynamic structure, which transiently exposes internal proteins (such as VP4 and the N-terminus of VP1) in a “breathing” process (Li et al., 1994; Lewis et al., 1998). This is important for release of the RNA genome. Capsid breathing also has implications for the development of antiviral therapies, as not only can antibodies raised against these internal regions neutralize infectivity in PV-1 but these can also confer cross-serotypic neutralization in HRV due to the highly conserved nature of this protein (Katpally et al., 2009; Li et al., 1994).

The virion has three axes of symmetry; a five fold, three fold and two fold axis. In most picornaviruses, a depression is formed near the five fold axis of symmetry by the beta barrels of the VP1 protein and this is known as the canyon. The canyon is the receptor binding site for major (but not minor) group HRV (Hewat et al., 2000; Colonno et al., 1988). Apthoviruses have been found to lack the canyon site and instead the receptor binds to a flexible loop which projects from the surface of the virion (Acharya et al., 1989). In EV, immediately below the canyon floor lies a hydrophobic pocket region which contains a “pocket factor” (Oliveira et al., 1993; Verdaguer et al., 2000). This region is a drug target for various antiviral drugs and successful binding has been shown to inhibit the uncoating of the RNA genome and subsequent delivery into the host cell (Smith et al., 1986).

1.3.2 Genome organisation and proteolytic processing

The genome of all picornaviruses consists of single-stranded, positive sense, non- segmented RNA of around 7 to 9 kilobases (kB). The genome comprises of a single

14

Introduction coding region which is flanked by 5ˈ and 3ˈ UTRs of varying lengths. In addition, the 3ˈ polyA tail also varies significantly in length between genera and has been shown to be required for infectivity (Spector, 1974; Racaniello, 2007). The VPg (viral protein genome-linked) is a small protein that is covalently linked to the 5ˈ end of the genome and acts as a primer for genome replication. The VPg is encoded by the short and well conserved 3B genome region (Figure 1.4). In contrast to other picornaviruses, FMDV and aquamavirus A encode more than one VPg molecule (Forss and Schaller, 1982; Knowles and Wadsworth, 2010).

The picornavirus genome consists of a single open reading frame translated as a single polyprotein. However, the complete polyprotein is cleaved co-translationally and so is never evident in the cell (Racaniello, 2007). Unlike mammalian RNA, picornavirus RNA lacks a methylated 5ˈ cap structure and initiation of translation at the ribosome is carried out by the IRES within the 5ˈUTR. The translated polyprotein is divided into capsid and non-structural regions, known as P1, P2 and P3 (Rueckert and Wimmer, 1984). These regions encode four structural and seven non-structural proteins, which are cleaved by virus-encoded proteinases in a cascade of proteolytic processing events (Figure 1.4). The cascade shown is specific to the sequence of events in EV in keeping with the focus of this thesis. However, several notable differences between proteolytic processing events in other picornavirus genera exist.

15

Introduction

FIGURE 1.4: Sequence of cleavages during proteolytic processing of the translated genome of enteroviruses. Organisation of picornavirus genome is shown on the top line. Relative sizes of protein products are taken from annotation from Genbank entry for HRV-B14 : X01087. The Leader protein (L) is not present in EV and position in the genome is shown for reference only. Intermediate and final protein products are shown.

The junction between the P1 and P2 coding regions is cleaved in EV by the 2A proteinase. However, in certain other genera, including Cardiovirus, Parechovirus, Hepatovirus and Apthovirus, this junction is cleaved by the 3C proteinase (Racaniello, 2007), as the 2A protein does not have proteolytic activity. Several picornavirus genera (Apthovirus, Cardiovirus, Erbovirus, Kobuvirus and Sapelovirus) encode an additional leader protein (L), at a 5ˈ position to the VP4 capsid protein (position shown by dotted box in Figure 1.4). This functions as a proteinase in apthoviruses and has a role in the inhibition of host-cell protein synthesis (Devaney et al., 1988). However, in cardioviruses, the leader protein lacks proteolytic activity and is instead cleaved from the polyprotein by 3C. Other cleavage events, including the cleavage of P2 and P3, are undertaken either by the 3C proteinase or 3CDpro, an intermediate protein product that functions as a

16

Introduction proteinase. The encoded proteins 2B, 2BC and 3AB are all membrane associated proteins with roles in viral replication (VR Racaniello, 2007). VP1 to VP4 assemble into the final capsid structure and the last event in the assembly of the picornavirus virion is the cleavage of VP0 to form VP4 and VP2.

1.3.3 RNA secondary structure in picornavirus genomes

In addition to the encoded proteins, picornavirus genomes contain several well defined areas of functional RNA secondary structure. RNA secondary structure is formed by base pairing of single stranded RNA with itself and leads to the formation of structures such as stem-loops and pseudoknots. RNA secondary structure can form randomly or can have a high degree of functional importance and in practice it is often challenging to separate these two.

Certain structural elements are present in picornavirus genomes, including elements within the UTRs and the cis-acting replication element (cre) within the coding region. The cre is an RNA secondary structure which has been observed within the genome of several picornavirus genera. It has been implicated in negative strand RNA synthesis (Goodfellow et al., 2000) and in VPg uridylation, allowing VPg to function as a primer for genome replication (Paul et al., 2000). The location of the cre has been pinpointed in several different picornavirus genera and there is a high degree of structural conservation. Despite this, cre elements from different species exhibit high nucleotide sequence divergence and are located in different parts of the coding or non-coding regions of the genome (Cordey et al., 2008; Goodfellow et al., 2000; Witwer et al., 2001; McKnight and Lemon, 1998; Gerber et al., 2001). In fact, it has been suggested that this variability in position may be used as an additional classification tool for identifying species of HRV and EV (Cordey et al., 2008).

The IRES is a long and highly structured stretch of nucleotides within the 5ˈUTR. Picornavirus genomes lack the 5ˈ cap structure observed in mammalian RNA, which is integral to ribosome binding. The IRES was first described as the site of internal

17

Introduction initiation of translation for picornavirus genomes in 1988 (Pelletier and Sonenberg, 1988; Jang et al., 1988). There are five classes of viral IRES and two main types are present in picornaviruses. The type I and type II IRES of EV and apthoviruses/cardioviruses respectively have little homology in either structure or sequence and yet provide the same basic function (Racaniello, 2007) (Figure 1.5). Hepatitis A virus (HAV) contains a further distinct IRES, designated type III (Belsham, 2009). In addition, several other picornavirus genera (including Avihepatovirus, Teschovirus, Senecavirus, Sapelovirus and Tremovirus) have been shown to contain IRES elements which are similar to those observed in hepatitis C virus and pestiviruses in the virus family (Willcocks et al., 2011; Hellen and de Breyne, 2007). Although it is possible that this IRES type (type IV; Figure 1.X) may have evolved independently, it has also been suggested that this may have arisen by recombination and modular exchange of replication elements between members of different virus families (Hellen and de Breyne, 2007).

FIGURE 1.5: RNA secondary structure of picornavirus IRES elements. Presented with permission from (Fitzgerald and Semler, 2009). A: Type I IRES, as observed in Enterovirus. B: Type II IRES, as observed in Apthovirus.

In addition to these two regions of RNA secondary structure present in all picornaviruses, the 3ˈUTR contains elements of secondary structure thought to be

18

Introduction required in viral replication (Brown et al., 2005) and a 5ˈ cloverleaf structure has been observed in the 5ˈUTR of enteroviruses (Palmenberg et al., 2009). This 5ˈ structure has been implicated in the initiation of RNA synthesis (Barton et al., 2001). The entire 5ˈUTR of EVs can be divided into six functional domains, consisting of the cloverleaf structure (domain I) and the IRES (domains II – VI) (Rohll et al., 1994). Within the cloverleaf, four individual stem-loops exist (A-D). Stem loop B specifically binds with poly(rC)-binding proteins and stem loop D interacts with the viral 3C and 3CD proteases (Du et al., 2004). These have been implicated in the switch between translation and replication (Rohll et al., 1994).

1.3.4 Life cycles of picornaviruses

The replication of picornaviruses takes place in the cytoplasm of the host cell. Attachment of the virus to a specific cell surface receptor leads to structural changes in the capsid which allow the uncoating of the RNA genome and its subsequent entry into the cell. The genome is first translated to provide copies of the viral proteins required for genome replication. Replication is preceded by negative strand synthesis and this strand is then used as a template for positive strand RNA synthesis. Capsid proteins and newly synthesized viral genomes assemble into mature virions and are then released from the cell. The entire cycle is complete in around 5 to 10 hours (Racaniello, 2007) (Figure 1.6).

Attachment of picornaviruses to host cells occurs via a wide variety of cell surface receptors. Cell surface receptor specificity is an essential determining factor in cell/tissue tropism and therefore both disease manifestations and host species. However, many cellular receptors exploited by picornaviruses are expressed in a wide variety of tissues and therefore tropism may not be primarily determined by receptor expression alone. For example, both I-CAM1 and CD155, members of the immunoglobulin superfamily, are expressed in many tissue types throughout the

19

Introduction body and yet, the primary sites of replication of HRV and PV are the respiratory and gastrointestinal tracts respectively.

FIGURE 1.6: Summary of the picornavirus life cycle. Presented with permission from (Whitton et al., 2005). The figure summarises key events in the replication of picornaviruses and the main events are described in the text below. © represents the cap present on most cellular mRNA.

The receptor usage profile of picornaviruses is complex and receptor specificities of some major picornavirus species are summarised in Table 1.2. Some EV and HRV utilise receptors which are members of the immunoglobulin superfamily and binding occurs in the canyon region (Colonno et al., 1988). This leads directly to the conformational changes in the capsid required for uncoating of the genome.

20

Introduction

TABLE 1.2: Cell surface receptors used by selected picornaviruses Receptor family Receptor Type Virus Natural host Primary site of replication Immunoglobulin I-CAM1 HRV Human Respiratory superfamily tract CD155 PV Human Gastrointestinal tract Coxsackie and Coxsackie Human Gastroinestinal adenovirus viruses tract receptor (CAR) Murine vascular EMCV Pig Gastrointestinal cell adhesion tract molecular 1 (VCAM-1) Decay DAF/CD55 Some Human Gastrointestinal Accelerating enteroviruses tract Factor (DAF) Low density LDL HRV Human Respiratory lipoprotein family tract Integrins Integrins FMDV Cloven footed Respiratory animals tract Some Humans Gastrointestinal enteroviruses tract Sialic acid Sialic acid Cardioviruses Rodents ERAV Horses Respiratory tract EV71 receptors Human P- EV71 Humans Gastrointestinal selectin tract glycoprotein ligand 1 Hepatitis A virus HAV-cr1 HAV Humans Gastrointestinal receptor tract - Information adapted from (Tuthill et al., 2010; Racaniello, 2007; Pallansch and Roos, 2007)

Binding of the cell surface receptor leads to uncoating of the viral genome. Unlike enveloped viruses which can directly fuse their lipid coat with the host cell plasma membrane, the non-enveloped picornaviruses must exploit a variety of different mechanisms in order to effectively deliver their genome into the cytoplasm. Picornaviruses do not enter the cell directly through the plasma membrane, instead relying on endocytosis and subsequent release from vesicles (Brandenburg et al., 2007; Tuthill et al., 2010). The mechanism by which endocytosis takes place and the cue utilised for timely uncoating of the RNA varies between viruses. In all cases, the

21

Introduction plasticity of the conformation of the capsid and the phenomenon of “capsid breathing” is of crucial importance to this process (Roy and Post, 2012).

Endocytosis can be clathrin-mediated, as is observed for apthoviruses, some HRV and HAV (Tuthill et al., 2010). Some EV undergo endocytotic uptake via a caveolin mediated pathway, whereas PV depends on a non-caveolin non-clathrin mediated process (Brandenburg et al., 2007; DeTulleo and Kirchhausen, 1998). The actual uncoating of the RNA genome depends on either a direct mechanical cue secondary to receptor binding (as in major group HRV) or a chemical cue such as low pH (Suomalainen and Greber, 2013). This occurs at around pH 5.4 in the minor group HRV (Brabec et al., 2003). The N-myristoylated VP4 capsid protein and the N- terminus of the VP1 protein, usually present on the internal surface of the virus, are externalised. The exposed segment of the VP1 protein tethers the virus to the membrane, while the VP4 protein can form a pore which is suggested to allow the release of the genome into the cytoplasm (Davis et al., 2008; Fricks and Hogle, 1990; Danthi et al., 2003). Post release of the genome, the 80S empty particle of EV is still present within the endosome. However, the capsids of apthoviruses and cardioviruses dissociate into respective pentameric subunits (Tuthill et al., 2010). Uncoating of picornaviruses with a canyon region within the capsid can be blocked by certain antiviral compounds, such as pleconaril or WIN compounds. These bind to the canyon and stabilize the capsid conformation, thereby blocking the release of VP4 (Roy and Post, 2012).

Once in the cytoplasm, the viral RNA must be first translated in order to provide copies of the viral proteins necessary for genome replication, which are not naturally present in the host cell. Once sufficient viral proteins have been synthesized, the switch from translation to replication occurs, as both cannot occur on the same RNA template simultaneously (Racaniello, 2007). The viral RNA dependent RNA polymerase, encoded by the 3D genome region, is responsible for RNA synthesis.

22

Introduction

Both the cellular protein poly(rC) binding protein and the viral 3CD protease bind to the 5ˈ cloverleaf structure to form the ribonucleoprotein complex (Rohll et al., 1994; Andino et al., 1990). A protein-protein bridge is then formed between this complex and the cellular polyA binding protein that is bound to the 3ˈ end of the viral genome (Herold and Andino, 2001). This has the effect of circularizing the RNA, which in turn initiates negative strand RNA synthesis from the 3ˈpolyA region. Negative strand RNA synthesis results in the formation of a positive/negative RNA duplex known as the replication form. Positive strand synthesis then occurs using the negative strand as a template and secondary to the anchoring of the negative strand by virally encoded protein 2C. Uridylated VPg is required as a primer for RNA replication. RNA replication takes place within replication complexes bound to smooth membrane vesicles (Bienz et al., 1994). The compartmentalization of RNA replication may increase efficiency of replication by localising all required components and additionally may facilitate the avoidance of eliciting a dsRNA activated immune response during the replication cycle (Racaniello, 2007).

Some of the newly produced positive stranded RNA viral genomes are translated to produce viral capsid proteins. Each positive stranded RNA genome is linked to a VPg protein and encapsidated. The final event before cell lysis is the maturation cleavage of capsid protein VP0 to form VP4 and VP2 (Lee et al., 1993).

1.3.5 Pathogenesis of enterovirus infections Members of the genus Enterovirus comprise a large number of pathogens of both humans and animals (Table 1.1). Although the human EV have been documented to cause such varied clinical presentations as respiratory infections, hepatitis, acute haemorrhagic conjunctivitis and myocarditis (Gaaloul et al., 2012; Pallansch and Roos, 2007), the main public health concern for this group of viruses is their propensity to cause infections of the central nervous system (CNS). These include the well-known poliomyelitis (caused by PV), a large number of different paralysis

23

Introduction syndromes, aseptic meningitis and encephalitis (Siafakas et al., 2001; Kupila et al., 2006). Members of all four species of human EV have been documented to cause infections of the human CNS.

The CNS is protected from pathogens and toxins in circulating blood by the blood- brain-barrier. This consists largely of endothelial cells which regulate the transport of molecules in and out of the cerebrospinal fluid (CSF). There are two postulated mechanisms of EV entry into the CNS; either through haematogenous spread and subsequent violation of the blood brain barrier or invasion of peripheral nerves leading to retrograde axonal flow to the brain (Ren and Racaniello, 1992; Ohka et al., 1998). The phenomenon of retrograde axonal flow, whereby virus gains access to the CNS via invasion of peripheral nerves has been specifically observed for PV infections. In general, invasion of the CNS by EV is thought to occur due to haematogenous spread. Upon entry via the faecal-oral route, EV initially replicates within the associated lympatic tissues of the small bowel and subsequent invasion of the blood vessels leads to viraemia. Viraemia can occur even in cases of asymptomatic EV infection (Welch et al., 2001). Excretion of the virus occurs in faeces for weeks after the initial infection and occurs in both symptomatic and asymptomatic infections.

Human EV notably possess acid stable virions and are therefore able to pass through the highly acidic environment of the stomach prior to establishing infection in the small bowel. In contrast, HRV are thought to be acid labile (Giranda et al., 1992) and therefore have a respiratory tropism. The same association is observed in apthoviruses, which are acid labile viruses that replicate predominantly in the respiratory tract. HRV also appear to propagate in vitro preferentially at 33°C, which has been interpreted as an adaptation to the cooler environment of the upper respiratory tract. However, recent studies have indicated that HRV are capable of replication both at the higher temperatures associated with the lower respiratory tract

24

Introduction and within the lower respiratory tract itself (Gern et al., 1997; Mosser et al., 2002; Papadopoulos et al., 2000).

1.4 Evolution of RNA viruses

1.4.1 General points in the evolution of RNA viruses

The genetic diversification and evolution of viruses occurs by three main mechanisms; point mutations, recombination and reassortment. Point mutations cause small changes in the genome sequence during replication, secondary to the error prone nature and lack of proof-reading activity of the viral polymerase (Steinhauer et al., 1992). Recombination and reassortment cause the creation of chimaeric genomes, by transferring large genome regions between different viruses. Reassortment involves the transfer of entire genome segments and therefore occurs only in viruses with segmented genomes (such as influenza). Reassortment is not discussed in the brief review below.

RNA viruses generally have short replication cycles which allow the production of many virions, including a great number of mutants in each cycle. The fixation of mutations in these viruses leads to divergent RNA sequences and therefore frequently altered protein sequences. Although a large number of mutants are generated with each replication cycle, all are required to conserve essential functions to maintain viability. All mutant genomes are subjected to the same selective pressures and the observed genetic diversity of a viral population depends not only on the mechanism of generation of mutants, but on their fitness and ability to produce infectious progeny. Some RNA viruses are thought to exist as a swarm of closely related mutants clustering around a general consensus sequence, known as a quasispecies (Domingo et al., 1996, 2008). The rapid generation of diversity means that viral populations can quickly adapt to changing environments, which has implications for the development of both antiviral therapies and vaccines.

25

Introduction

Constraints on observed genetic diversity are imposed in several ways. Not only does every mutation or recombination event have implications for viral fitness, but the total genome size of picornaviruses is constrained by the fixed internal space within the viral capsid. However, PV with up to an additional 1500 nucleotides have been shown to be successfully packaged (Alexander et al., 1994). In addition, viral evolution can have important consequences for virulence and a balance must be struck between successful viral propagation and causing the premature death of the host cell.

1.4.2 Point mutations

Nucleotide misincorporation is estimated to occur at a frequency as high as one mutation per genome per round of replication in RNA viruses (Drake and Holland, 1999). The existence of picornaviruses as a related swarm of micro-variants or quasispecies means that there is a large pool of slightly mutated genomes present at any one time, which may become fitter variants as the selection pressures exerted by the host change. Picornaviruses are a highly genetically diverse group of viruses, both within and between species. Despite the fact that most random mutations are likely to be deleterious, it has been found that increasing the fidelity of the RNA polymerase actually causes attenuation of PV (Pfeiffer and Kirkegaard, 2005). However, decreasing the fidelity of the polymerase will necessarily lead to the creation of a large number of non-viable progeny and eventual complete inability to replicate (error catastrophe). For example, the small molecule ribavirin has been demonstrated to cause error catastrophe in PV by reducing the fidelity of the polymerase (Crotty et al., 2001). Therefore, random mutations in RNA viruses are a delicate balance between generating sufficient viral diversity to adapt to a rapidly changing host environment and avoidance of crossing the error threshold.

Point mutations occur by the random introduction of either single or multiple nucleotide changes during RNA replication. Even single base mutations can have

26

Introduction profound effects on viral phenotype and therefore these small, incremental changes are subject to the evolutionary forces of positive and negative selection. A progeny genome with a lowered fitness will be outcompeted by other variants in the population and therefore is likely to disappear quickly. Random non-synonymous mutations (which cause a change in amino acid sequence) can decrease viral fitness by altering the essential secondary structure or interactions of the encoded proteins. Insertions and deletions in the nucleotide sequence (indels) are significantly less commonly observed in sequenced viral genomes than single base point mutations. This is due to the fact that the addition or removal of a single base will cause a frameshift mutation resulting in a protein that is non-functional or truncated (if the indel introduces a stop codon). In order to maintain viability of the virus, the indel must occur in a region of genome that is not functionally conserved and must result in the insertion or deletion in multiples of three to preserve downstream coding.

Different segments of the genome have distinct roles in the viral life cycle and therefore are subjected to differing evolutionary pressures. For example, a staggeringly high genetic diversity is observed within the majority of the capsid coding region of both HRV and EV (McIntyre et al., 2010; Lindberg et al., 2003; Laine et al., 2006). This is likely secondary to the fact that these proteins are exposed to the host immune system and the generation of diversity, especially in the binding sites of neutralizing antigens likely allows immune escape.

1.4.3 Recombination

Genetic diversity is also routinely created in RNA viruses by recombination. Recombination is the joining of two genome regions from related or unrelated viruses to create a chimaeric genome. This results in both new viral phenotypes which may obtain an evolutionary advantage and also is thought to decrease the load of deleterious mutations (Worobey and Holmes, 1999).

27

Introduction

Recombination can be inferred indirectly from analysis of sequence data, as different recombinant genome regions will display different evolutionary histories. Although computational and phylogenetic methods in use today for the detection of recombination are both powerful and accurate, recombination between very closely related parents cannot be detected. Non-viable recombinants that are likely generated randomly within a single replication cycle also cannot usually be detected. Recombination within RNA viruses can be classified as homologous, aberrant homologous or nonhomologous (Lai, 1992). Homologous recombination occurs within a homologous region of two related viruses. Due to the likelihood of these generated recombinants being viable and able to propagate, it is these that are most often observed. Aberrant homologous recombination occurs between homologous viruses that are not in strict alignment and the relatively infrequent nonhomologous recombination occurs between two completely unrelated viruses.

The most widely accepted model of recombination in RNA viruses is known as the copy-choice model, whereby the RNA dependent RNA polymerase switches template strand during synthesis of the negative strand (Kirkegaard and Baltimore, 1986). This mechanism is necessarily replication dependent and relies both on co- infection of the same host cell by two different viruses and replication of the two viruses within the same cellular compartment. The dissociation and resulting switching may be related to “pausing” of the polymerase secondary to regions of RNA secondary structure or nucleotide misincorporations (Agol, 1997). A second mechanism is proposed whereby recombination occurs by the breaking and re- joining of RNA strands (Gmyl et al., 2003) and this is not replication dependent.

Recombination has been extensively documented in a variety of picornavirus genera, including Apthovirus (Heath et al., 2006), Parechovirus (Benschop et al., 2008; Calvert et al., 2010), Teschovirus (Simmonds, 2006) and Enterovirus (Oprisan et al., 2002; Santti et al., 1999; Simmonds and Welch, 2006). Within these picornavirus

28

Introduction genera, documented recombination events have tended to occur in the 5ˈUTR and the non-structural coding regions. Natural recombination of vaccine PV strains with circulating EV is also widely responsible for the generation of vaccine escape mutants and subsequent outbreaks of vaccine associated poliomyelitis, which has grave implications for the PV eradication campaign (Jegouic et al., 2009; Cuervo et al., 2001; Guillot et al., 2000). The oral live-attenuated poliovirus vaccine (Sabin) contains three serotypes of poliovirus (PV-1, -2 and -3). After administration of the oral vaccine, replication occurs in the gut tissues and thereby elicits a strong, local immune response preventing subsequent infection with wild poliovirus via the faecal-oral route. Unfortunately, the nature of the live-attenuated vaccine presents significant opportunities for inter-typic recombination of PV strains during replication. Recombinant PV strains have been widely linked to cases of vaccine associated paralytic poliomyelitis and documented recombination partners have included other Sabin vaccine strains (Cuervo et al., 2001), circulating wild poliovirus (Dahourou et al., 2002) and other related EV-C types (Jegouic et al., 2009). Understanding the frequency and nature of recombination events occurring in the natural circulation of a virus is vital for potential development and implementation of therapeutic strategies and potential vaccines.

1.5 Aims of this thesis

With the discovery of the novel HRV-C species in 2006 and the gradual realisation that HRV are responsible for a much wider variety of more severe clinical illness than was previously thought has come a renewed interest in research of the clinical and biological features of HRV. This thesis comprises of several distinct studies of the epidemiology, classification and evolution of HRV.

One of the primary goals of this research was to conduct preliminary explorations of other potential clinical manifestations of HRV infections. Although HRV are

29

Introduction traditionally thought to be acid labile and are therefore to only infect the respiratory tract, acid-stable mutants have been previously isolated (Skern et al., 1991). The closely related human EV are an established and important cause of aseptic (viral) meningitis and are frequently isolated from stool specimens. In light of this fact, CSF and stool samples were screened for HRV, in order to determine whether the virus was present and if so, whether any epidemiological patterns could be elicited. In addition, concurrent studies were undertaken of circulating HRV epidemiology in respiratory samples isolated from Edinburgh. These studies are outlined in Chapter 3 and aspects pertaining to HRV isolation from stool and CSF were published as part of a larger study in 2012 (Harvala et al., 2012b).

The numerous studies of HRV epidemiology undertaken across the globe has led to the rapid accumulation of a vast amount of HRV sequence data. Indeed, in the analysis of sequence data acquired from our own epidemiological studies and published sequences, a large number of sequence groups were identified that appeared to be divergent from known HRV types. At the time that this work was initially undertaken, the main method of classification of HRV and assignment of putative new types was by studies of neutralisation properties. These methods were not suitable for the study of HRV-C, as the virus cannot be isolated in standard cell lines and at that time, no alternative means of in vitro propagation existed. However, a system had been in use for several years, whereby EV could be classified and novel types assigned by nucleotide sequence divergence in the VP1 region (Oberste et al., 1999a, 1999b). Chapter 4 focusses on the determination of nucleotide sequence divergence thresholds in two capsid coding regions of HRV and their subsequent application in both defining new types and redefining previous erroneous classifications. Our proposals for the classification of HRV-C into genotypically defined types was published in 2010 and has now been adopted by the Picornavirus Study Group for the classification of this virus species (Simmonds et al., 2010). In

30

Introduction addition, we have formulated proposals for the extension of this classification system to include HRV-A and HRV-B (McIntyre et al., 2013a).

The classification of all available HRV sequence data into types defined by phylogenetic relationships and sequence identity allowed the completion of studies of the evolution of HRV species and types. Although the phenomenon of recombination is frequently observed in EV, several studies had suggested that it was relatively rare in the evolution of HRV coding genomes (Kistler et al., 2007b; Lewis- Rogers et al., 2009). Studies of the occurrence of recombination within the coding region of HRV species which included many isolates of the same type had not, to our knowledge, been undertaken. However, prior evidence did exist for widespread recombination between the 5ˈUTR regions of HRV-A and HRV-C (Huang et al., 2009; Wisdom et al., 2009a). Chapter 5 describes an analysis of recombination within the coding region and genetic diversity of HRV-C strains and includes an in- depth analysis of recombination in the 5ˈUTR. This analysis was published in 2010 (McIntyre et al., 2010).

However, the vast majority of HRV-C sequences generated for this study originated from the same 12 month period within the same geographical location. It has been previously noted that even the frequently recombinant EV-B sequences are not always evident as such during a short time-scale (McWilliam Leitch et al., 2009a). Therefore, the analysis of recombination in HRV was extended to include samples from all three HRV species and covering a time-span of at least 14 years. The study of the occurrence of recombination in the evolution of all three species of HRV is reported in Chapter 6 and was published in 2013 (McIntyre et al., 2013b).

31

Introduction

32

Chapter 2 Materials and Methods

Materials and methods that were used in the completion of the studies within this thesis comprise two main strands. Firstly, laboratory techniques that were applied to both screening large number of samples for the presence of HRV and producing sequence fragments from known positive isolates for evolutionary studies are described in Section 2.2. The second strand comprised computational analyses directed towards the construction of comprehensive datasets, phylogenetic and recombination analyses and determination of mean substitution rates in specific genome regions. These are described in Section 2.3. Also central to both areas of study were a series of archives of clinical specimens at the Scottish Virology Centre (SVC), described in Section 2.1.

2.1 Storage and archiving of clinical samples

Since 2005, large archives of clinical samples referred for virological testing have been maintained in collaboration between the SVC based at the Royal Infirmary of Edinburgh (RIE) and the University of Edinburgh. Referral of patient samples for diagnostic virology is generally undertaken for system-specific suspected diagnoses (for example, pneumonia for respiratory samples and aseptic meningitis for CSF samples). Although certain situations exist where the referral of samples is strongly indicated, such as pyrexia of unknown origin or severe unexplained headache, the process is largely dependent on the clinical judgement of the attending medical staff. Although effective interventions do not yet exist for a large number of viruses that are routinely screened for, viral diagnostics are also essential for prediction of prognosis (including possible complications) in individual patients, creating infection control protocols and for disease surveillance (Templeton, 2007).

33

Materials and Methods

The collection and storage of these referred samples additionally facilitates the completion of many studies of viral epidemiology, clinical characteristics and evolution that would not otherwise be easily possible. Samples stored in each archive are allocated an anonymised number and any information which would render the individual patient identifiable are removed. This includes name, date of birth and Community Health Index (CHI)/patient number. Approval for the retention of specific, anonymised patient data for the purposes of epidemiological research was initially requested by Professor P. Simmonds in 2005 and granted by the Lothian Regional Ethics Committee (Development of molecular methods to enhance diagnostic testing for viruses and other infectious agents; REC reference number 08/S1102/2). Non-identifiable patient information is retained in the form of a secure (password protected) database. This includes age band of the patient, year and month of collection, patient location (whether based in an Intensive Therapy Unit (ITU), general ward or community health centre), sample type, basic clinical information and the results of routine diagnostic screening. As the basic clinical information recorded largely consists of details given on referral forms, data is frequently missing or incomplete. Prospective data collection for specific studies has allowed certain records to retain more detailed non-identifiable clinical information for specific cases. In addition, the results of other microbiology screening are recorded (Templeton et al., 2004; Scheltinga et al., 2005; Gaunt et al., 2010).

In parallel with the maintenance of the archive of original clinical samples, nucleic acid extracts produced during routine viral diagnostic procedures are stored in a related and cross-referenced archive. Original clinical samples are stored at -80°C, while extracted RNA/DNA is stored at either -45°C or -20°C. A supplementary archive of any complementary DNA (cDNA) produced is kept at -20°C. However, these cDNA samples are kept for further use only when they are made by amplification with random primers and therefore are useful for screening for other RNA viruses. With the exception of samples obtained from Finland (Chapter 6), all

34

Materials and Methods clinical samples and HRV positive isolates used within the course of these studies were obtained from these clinical archives.

2.2 Laboratory techniques

2.2.1 Prevention of contamination of polymerase chain reaction products

The studies contained within this thesis relied on the generation of large amounts sequence data by polymerase chain reaction (PCR) amplification of genome fragments from HRV positive samples and subsequent Sanger sequencing. The exquisitely high sensitivity of PCR based techniques is potentially subject to contamination by extraneous DNA sequences and avoidance of this has been especially important for the work described herein. The accuracy of studies which include both screening large numbers of samples for a specific virus and amplifying non-consecutive genome fragments for evolutionary analysis can be severely compromised by PCR contamination.

Considerable precautions were taken in the completion of laboratory based work. In addition to employing standard good laboratory practice and practicing aseptic technique at all times, reagents and primers for the preparation of PCR mixes were kept in small aliquots and reserved for sole use of specific individuals where possible. In an attempt to prevent false positives resulting from contamination of adjacent PCR tubes in the preparation of reaction mixtures, DNA was added to the reaction mixture after all other reagents (Kwok and Higuchi, 1989). In order to detect potential reagent contamination, a negative control prepared from the same reagent mix as all actual samples was included with all PCR based experiments.

As even one round of PCR generates extremely high copy numbers of DNA fragments, one potential method of significantly reducing contamination is to limit carryover. Carryover occurs when PCR reaction mixtures are contaminated with

35

Materials and Methods products of previous reactions. In order to limit this, we used separate areas and dedicated equipment for the various stages in the amplification of DNA from clinical samples (Figure 2.1). This comprised of a “one-way” laboratory system, where only unidirectional transfer of reagents, equipment or samples between certain areas was permitted (marked by filled arrows on Figure 2.1).

Sample preparation, including defrosting, processing, cataloguing and nucleic acid extraction was carried out in a dedicated area reserved specifically for this purpose. PCR buffers and reagents were prepared in a clean room, with separate laboratory and personal protective equipment. In the latter part of this work, positive controls comprised of known copy number RNA transcripts (McLeish et al., 2012). Due to the potential for contamination of PCR reactions with high copy number transcripts, serial dilutions of RNA transcripts were additionally carried out in a dedicated laboratory area, separated from the area reserved for preparation of PCR reagents. cDNA synthesis and first round PCR reactions were carried out in the same area (marked with white arrows on Figure 2.1). In the majority of this work, second round PCR was carried out in a dedicated laboratory area, separate from gel electrophoresis and preparation of sequencing reactions. However, in later work, due to laboratory layout and space restrictions, backwards travel was necessary between the dedicated areas for these three procedures (marked by grey box on Figure 2.1).

The adherence of all group members to these protocols has ensured that the occurrence of PCR contamination observed within the completion of the studies contained within this thesis has been minimal.

36

Materials and Methods

FIGURE 2.1: “One way” laboratory system for the prevention of carryover contamination in PCR reactions. Filled arrows denote steps between areas which originally were designed to flow in one direction. Backwards travel of reagents, samples or equipment was not permitted between these areas. White arrows denote procedures which can be carried out in the same work-space. In later laboratory work, due to space restrictions, second round PCR preparation and preparation of sequencing reactions were required to be carried out within the same laboratory area (marked with grey box). An effort was still made to separate physical bench space, reagents and pipettes for these procedures.

2.2.2 Extraction of RNA from clinical samples

In cases where nucleic acids extracted during routine diagnostics at the SVC were not available for use in research studies, RNA was directly extracted from clinical specimens using the Qiagen Viral RNA Mini Kit (Qiagen, UK), according to the manufacturers’ instructions. Samples were stored at -40°C and aliquots were defrosted in a microbiological safety cabinet. Stool samples were thoroughly mixed with phosphate buffered saline (PBS) and clarified by low-speed centrifugation. The

37

Materials and Methods resulting supernatant was used in extraction procedures. Respiratory and CSF samples were used directly in the extraction protocol.

An identical protocol for RNA extraction was used in the processing of all sample types. Briefly, 140µL of sample was incubated at room temperature with 560µL viral lysis buffer (Buffer AVL) prepared with 1µg/µL carrier RNA. The addition of carrier RNA to the reaction mixture reduces degradation of viral nucleic acid by residual RNases and also assists the binding of RNA to the QIAamp Mini column membrane. 560µL of 96-100% ethanol was added and mixed by pulse vortexing. Each sample was passed through a QIAamp Mini column at 6,000 xg for 1 minute. Each sample was then subjected to two wash steps. The first was with 500µL of wash buffer 1 (Buffer AW1) followed by centrifugation at 6,000 x g for 1 minute and the second with wash buffer 2 (Buffer AW2) followed by centrifugation at full speed for 3 minutes. Residual buffer was removed by repeated centrifugation at full speed for 1 minute. RNA was then eluted in nuclease free water in two 40µL volumes. The use of a double elution step is suggested to increase RNA yield to up to 90%. Extracted RNA was either used in downstream applications immediately or stored at -20°C or -45°C.

2.2.3 Reverse transcription

In the majority of studies, a two-step RT-PCR protocol was used. Therefore, single stranded cDNA was generated by reverse transcription (RT) to allow amplification of extracted viral RNA by PCR. RT was carried out using the A3500 Reverse Transcription System (Promega, UK) according to the manufacturers’ instructions but with certain modifications.

Briefly, 5µL of eluate obtained from RNA extraction was incubated at 70°C for 10 minutes. 15µL of master-mix containing RT reagents (Table 2.1) was added to each sample. The reaction mixture was then incubated for 10 minutes at room temperature followed by an elongation step which was increased to 50 minutes at

38

Materials and Methods

42°C. Samples were then incubated for 5 minutes at 95°C and kept on ice for a further five minutes, in order to inactivate the reverse transcriptase enzyme and prevent it binding to the single stranded cDNA. cDNA was then either used immediately for PCR reactions or stored at -20°C.

TABLE 2.1: Reagents included in reverse transcription reactions 4µL 25mM MgCl 2µL 10X reverse transcription buffer5 2µL 10mM deoxynucleotide triphosphates (dNTPs) 0.5µL Recombinant RNasin ribonuclease inhibitor (25U/µL) 1µL 100µM random hexamers 0.6µL Avian myeloblastoma virus reverse transcriptase (10U/µL) 4.9µL Nuclease free H2O

All positive and negative controls for use in subsequent PCR reactions were included in the RT step to ensure consistency of results. The use of random primers in the RT reaction enabled cDNA generated to be used both for the analysis of multiple genome regions and for studies of other RNA viruses. As an alternative to random primers, an RT reaction utilising the outer PCR primers designed for the desired genome fragment can give increased sensitivity. This technique was used in the analysis of certain samples which proved challenging to amplify with standard procedures. However, as this technique precludes the use of generated cDNA for other purposes and is therefore not cost-effective, it was only used sparingly.

In the screening of CSF samples for HRV, samples were combined into pools of 10 prior to RT. 10 sequential samples from the CSF archive were selected and 2µL of RNA from each was combined to create a 20µL pool. This was then thoroughly vortex mixed and 5µL used in the RT reaction. The study of HRV incidence in CSF samples and the repeated screening of certain reported HRV negative respiratory

5 10X reverse transcription buffer was supplied with the Reverse Trasncription System kit (Promega, UK) and was composed of 100mM Tris-HCl (pH 9.0 at 25°C), 500mM KCl and 1% Triton® X-100. 39

Materials and Methods samples (described in Chapter 3) were the only studies undertaken for which samples were pooled. Although the value of pooling samples for screening large numbers of patients has been previously demonstrated for human metapneumovirus (HMPV) (Gaunt et al., 2009) and coronavirus (HCoV) (Gaunt et al., 2010), the overall incidence of HRV in respiratory samples is too high to give any value to the procedure in investigations of respiratory epidemiology.

2.2.4 PCR and sequencing of HRV positive samples

Within the studies presented in this thesis, PCR had a dual role. Firstly, samples were screened by PCR based methods (Chapter 3) for the presence of HRV and secondly, specific sequence fragments were amplified for studies of genetic diversity, recombination and evolution within the HRV genome.

2.2.4.1 PCR Primer Design The design of PCR primers for use in the amplification of various genome regions of all three species of HRV required a delicate balance between primer degeneracy (to counteract high genetic diversity within regions) and primer specificity (to avoid amplification of non-target sequences). All PCR primers were designed in-house, either as part of a previous study (Wisdom et al., 2009b, 2009a) or specifically for the studies described. Upon initial design, each primer set was tested for sensitivity using serial 10-fold dilutions of known positive clinical samples6.

The first step in the design of PCR primers was to obtain all relevant sequences from Genbank. These aligned sequences were then inspected for areas of high sequence conservation around the target regions. Regions of high sequence conservation were

6 In 2011, known copy number RNA transcripts were developed of several of these positive clinical samples which had been previously used for both positive controls and sensitivity testing (McLeish et al., 2012). After the advent of these RNA transcripts, sensitivity tests were carried out using known copy number samples at 10-fold dilutions. 40

Materials and Methods then analysed and a fragment for the PCR primer was selected according to the following criteria: o High sequence conservation – as few degenerate bases as possible o Length of close to 23-25 nucleotides (not less than 16 and not more than 26) o Melting temperature (Tm) predicted to be close to 60°C o G+C content of 40-60% with 3ˈ GC clamp where possible o No self-annealing or hairpin formation

Calculations of Tm and analysis for self-annealing/hairpin formation were carried out in with the aid of the OligoCalc progam (available at http://www.basic.northwestern.edu/biotools/oligocalc.html). Primer pairs for use in the same PCR reaction were selected to have a Tm of within 5 degrees of each other, where possible. The specificity of each primer set was confirmed by analysis with primer BLAST (www.ncbi.nlm.nih.gov/tools/primer-blast) to ensure that primers did not match human or other non-target sequences in the sample.

TABLE 2.2: Codes for individual nucleotides and ambiguous bases used for design of PCR primers Base Code Nucleotides A Adenine C Cytosine G Guanine T Thymine K G or T M A or C S G or C W A or T Y T or C R G or A B G, C or T V G, C or A D G, A or T H A, C or T N A, C, G or T

41

Materials and Methods

2.2.4.2 PCR protocols PCR was used to amplify DNA fragments either for detection (in screening) or sequencing. The PCR protocol consists of a number of temperature dependent steps carried out in a cyclical fashion. The reaction mixture (Table 2.3) is subjected to incubation at 94°C which allows denaturation of the target DNA (either cDNA or first round PCR product). The mixture is then incubated at 50°C to allow primer annealing and then at 72°C for nucleotide extension. This is carried out for 30 cycles as standard, followed by a final extension step at 72°C. Specific primers for each PCR reaction are detailed in Appendix 1.

TABLE 2.3: Reagents used in PCR reactions TABLE 2.4: Cycling conditions used in PCR reactions 25mM MgCl2 4µL Number Temperature Time of cycles 3mM dNTPsa 2µL 30 94°C 18s Sense primer (100µM) 1µL 50°C 30s Antisense primer (100µM) 1µL 72°C 90s Taq polymeraseb (5U/µL) 0.08µL 1 72°C 300s Either cDNA or first round 1 – 2µL PCR productc Nuclease free H2O Up to 20µL a dNTPs were acquired as separate dATP, dGTP, dCTP and DTTP 25mM aliquots (Promega, UK). These were then diluted to make 3mM stocks for use in PCR. b Taq polymerase was GoTaq® DNA Polymerase (Promega, UK). c For the first round of PCR, 2µL cDNA was incorporated into the reaction mixture. For the second round of PCR, 1µL of first round product was used.

The reagent concentrations and cycling conditions detailed above (Table 2.3 and Table 2.4) represent the standard conditions used. All protocols with new primers were attempted initially under these conditions. However, individual PCR protocols were optimised prior to use with study samples. This involved using 10-fold dilution series of positive control samples and sequentially varying certain parameters to determine the optimal set of conditions for each PCR reaction. Detection of HRV RNA transcript dilutions with 1000 copies present was targeted. The most

42

Materials and Methods commonly modified parameters included the annealing temperature and dNTP concentration. In several cases, the sensitivity of a PCR protocol was greatly improved by using one of the same primers in both rounds (a hemi-nested reaction). All modifications to the standard protocol used for individual PCRs are given below (Table 2.5), along with a reference to the thesis chapter to which these reactions relate.

TABLE 2.5: Modifications to standard PCR protocols used for individual PCR reactions PCR Reaction Name of primer Modifications to Modifications to Chapter set standard standard cycling Reference reagents conditions 5ˈUTR Screening UTR Screen None Annealing Chapter 3 temperature of 53°C in the second round VP4/VP2 Typing VP4/VP2 None None Chapter 3 HRV-A VP1 A VP1 Specific inner None Chapter 4 Amplifcation antisense primer Chapter 6 used for HRV- A28 positive samples

Same first round product used for both second round fragments HRV-B VP1 B VP1 Hemi-nested Annealing Chapter 4 Amplification reaction – same temperature of Chapter 6 sense primer in 48°C in the both rounds second round HRV-C VP1 C VP1 PCR 1 None None Chapter 4 Amplifcation Chapter 5 Chapter 6 HRV-C Modified C VP1 PCR 2 All reaction Annealing Chapter 4 VP1 components temperature of Chapter 5 Amplification scaled up to give a 45°C in the Chapter 6 final volume of second round 50µL HRV-C 5ˈUTR UTR 2 Hemi-nested Annealing Chapter 5 Segment 1 reaction – same temperature of sense primers in 53°C in the both rounds second round

43

Materials and Methods

HRV-C 5ˈUTR UTR 3 Hemi-nested Annealing Chapter 5 Segment 2 reaction – same temperature of sense primers in 53°C in the both rounds second round HRV-C 5ˈUTR 5ˈ UTR 1 Combined Annealing Chapter 5 end amplification of temperature of HRV-A/-C used 53°C in the multiple primers second round in each round (listed in Appendix 1)a Hemi-nested reaction – same sense primers in both rounds HRV-A 3Dpol A 3Dpol None Annealing Chapter 6 amplification temperature of 45°C in the second round HRV-B 3Dpol B 3Dpol None Annealing Chapter 6 amplification temperature of 45°C in the second round HRV-C 3Dpol C 3Dpol None Annealing Chapter 5 amplification temperature of Chapter 6 45°C in the second round HRV-A 2A C 2A Same first round Annealing Chapter 5 amplification product used for temperature of both second round 46°C in the fragments second round

4µL 3mM dNTPs used in first roundb

Fragment 2 (labelled C2A PCR 2) obtained with a heminested reaction using OAS primer a For reactions using more than two primers, the volume of nuclease free H2O was adjusted accordingly. b Where volume of dNTPs used is increased, the volume of nuclease free H2O was adjusted accordingly.

44

Materials and Methods

2.2.4.3 SuperScript One-Step reverse transcription polymerase chain reaction (RT-PCR) For samples which were difficult to amplify using the optimised GoTaq protocols listed above, a more sensitive one-step combined RT and PCR protocol was used. This utilised the Superscript III One-Step RT-PCR system with Platinum Taq DNA polymerase (Invitrogen, UK) with parameters and cycling conditions as previously optimised for the amplification of EV (McWilliam Leitch et al., 2009a).

TABLE 2.6: Reagents used in Superscript III TABLE 2.7: Cycling conditions used RT-PCR reactions in Superscript III RT-PCR reactions 2X Reaction Mix7 10µL Number Temperature Time of cycles Outer Sense PCR primer 1µL 1 43°C 1 hour (100µM) Outer Antisense PCR primer 1µL 20 53°C 60s (100µM) Superscript II reverse 0.4µL 55°C 60s transcriptase 8/Platinum Taq Extracted RNA 6µL 1 70°C 15 min Nuclease free H2O 1.6µL 1 94°C 2 min 40 94°C 30s 50°C 30s 68°C 90s 1 68°C 5 min

All reactions were prepared on ice and contained reagents as detailed (Table 2.6). RNA directly extracted from clinical specimens was used. As the outer PCR primers for the particular reaction were used for the RT step, the resulting product could only be used for amplification of that specific segment. The Platinum Taq used in the amplification is complexed with an activity blocking antibody that renders it unreactive at room temperature. This prevents non-specific annealing and therefore

7 2X Reaction Mix is supplied with the Superscript III One-step RT-PCR kit and contains 4mM of each dNTP and 3.2mM of MgSO4. 8 In contrast to standard reverse transcription reactions, RT undertaken with the Superscript III system used Moloney murine leukaemia virus reverse transcriptase (M-MLV-RT), as opposed to AMV-RT. 45

Materials and Methods increases sensitivity. The complexed antibody is denatured at the high temperatures used in the thermal cycling (Table 2.7) and polymerase activity resumes. The first round product obtained from the Superscript III reaction was used directly in second round of PCR, as previously described.

2.2.4.4 Gel Electrophoresis The presence of PCR positive samples and the approximate length of DNA fragment amplified was visualised by agarose gel electrophoresis. 2% agarose gels were prepared with 1X tris-acetate-EDTA (TAE) buffer (Severn biotechnologies, UK) and 1/10000 ethidium bromide (10mg/µL)9. 10µL of second round PCR product was thoroughly mixed with 1.5µL of 6X loading dye and loaded directly into a well of prepared gel in an electrophoresis tank containing 1X TAE buffer. As the loading dye is significantly denser than the TAE buffer, it allows the sample to sink into the well. The dye is also negatively charged and so co-migrates with the DNA within the sample, allowing monitoring of the progress of the sample along the gel. An exACTGene Low Range Plus DNA ladder (Fisher Scientific, UK) was loaded alongside the PCR products.

All second round PCR products were electrophoresed at 150 volts for 30 to 45 minutes, depending on the size of the expected product. Resulting bands were visualised using a UV transilluminator.

2.2.4.5 Gel extraction and purification of amplified products In cases where multiple products were visualized on an agarose gel and the PCR reaction could not be further optimised, bands were excised and purified using the Illustra GFX PCR DNA and Gel Band Purification Kit (GE Life Sciences, UK), according to the manufacturers’ instructions. Briefly, DNA bands were manually

9 Ethidium bromide was used for the majority of initial screening studies undertaken during this thesis. However, as it is an intercalating agent and is a known carcinogen with prolonged exposure, certain University areas discourage its use. For later studies, SybrSAFE was used as an alternative. 46

Materials and Methods excised under UV transillumination with minimal exposure time to prevent DNA damage. Excised bands were incubated at 60°C with 10µL of Capture Buffer 3 per 10mg agarose, until the agarose was completely dissolved. The DNA within the sample was bound to a silica membrane with a Illustra GFX Spin Column by spinning at 16,000 xg for 30 seconds. The sample was then washed with an ethanol containing buffer to remove salts and impurities. Finally, each sample was incubated at room temperature for one minute with 50µL nuclease free water and eluted by spinning at 16,000 xg for 30 seconds. If not required for immediate use, both excised bands and eluted DNA could be stored for up to one week at -20°C.

2.2.4.6 PCR clean-up procedures All PCR products which were not extracted directly from agarose gels underwent an additional PCR clean-up step to remove unincorporated primers and dNTPs and to improve the fidelity of sequencing reactions. This step was carried out using EXOsap-IT (Fisher Scientific, UK), which is a combination of Exonuclease I to degrade single stranded DNA and Shrimp Alkaline Phosphatase to dephosphorylate excess dNTPS. 2µL of EXOsap-IT was incubated with 5µL second round PCR product at 37°C for 15 minutes. The active enzymes were then denatured at 80°C for 15 minutes. The resulting mixture in theory contained only PCR product, nucleosides and inorganic phosphate and was then used directly for sequencing reactions.

2.2.4.7 Sanger sequencing All sequencing undertaken during the course of this work employed the Sanger method, whereby the reaction mixture contains all four chain terminating dideoxynucleotides labelled with distinct fluorescent dyes allowing the DNA sequence to be determined sequentially. This was performed using the ABI BigDye Terminator kit (Applied Biosystems, Warrington, UK), with reagents and cycling conditions described below (Table 2.8 and Table 2.9). PCR products were

47

Materials and Methods sequenced in both sense and antisense orientation using second round PCR primers. The amount of BigDye used in each reaction was adjusted according to fragment size, with up to 2µL being used for the largest sequence fragments.

TABLE 2.8: Reagents used in TABLE 2.9: Cycling conditions used sequencing reactions in sequencing reactions PCR product 2µL Number Temperature Time of cycles BigDye v3.1 1µL - 2µL 25 95°C 30s Primer (100µM) 1µL 50°C 20s Nuclease free H2O to 20µL 60°C 4 min 1 68°C 5 min

Reactions products were sent to the in-house sequencing facility at The Genepool (Ashworth Laboratories, Kings Buildings, Edinburgh) and results were returned in the form of FASTA and corresponding ABI files.

2.3 Computational Methods

2.3.1 Statistical methods

Unless otherwise stated, all statistical tests carried out on epidemiological data were two-tailed Fishers exact tests, appropriate for small datasets with limited categories for each variable.

2.3.2 Sequence alignment

All phylogenetic and evolutionary analyses were entirely dependent on the construction of a sequence alignment in which homologous sequences were correctly identified and aligned. The alignment process is central to all downstream analysis and prevents false conclusions regarding phylogeny, genetic diversity and recombination.

48

Materials and Methods

Sequences were imported into SSE v1.0 (Simmonds, 2012) and initially aligned with a combination of the CLUSTAL-W or MUSCLE version 3.8 algorithms (Thompson et al., 1994; Edgar, 2004) implemented within SSE v1.0. Alignments were refined where absolutely necessary by manual editing at the amino acid level (for coding sequences) and nucleotide level (for non-coding sequences). PCR products were sequenced in both sense and antisense directions to allow for dual coverage. Any gaps and mismatches between the sense and antisense sequences were resolved by inspection of accompanying chromatograms in the Chromas software (www.technelysium.com.au). Amino acid sequences were obtained by translation of nucleotide sequences carried out in the SSE v1.0 software package (Simmonds, 2012) and using a standard genetic code.

2.3.3 Database searching

A critically important step in many of the analyses within this thesis was the construction of comprehensive datasets containing all published sequences of a particular gene region or species. The inclusion of all available published sequences with sequences generated from our studies allowed the most comprehensive investigations possible to be carried out. All database searches were carried out of nucleotide sequences available in the three major online databases (Genbank NCBI, European Molecular Biology Laboratory (EMBL) and DNA Databank of Japan (DDBJ)). The existence of a reciprocal agreement between the three major databases to cross-submit all sequences allowed all information required to be accessed via the NCBI Basic Local Alignment Search Tool (BLAST).

The BLAST method of searching sequence databases was first described in 1990 (Altschul et al., 1990) and is based on locating regions of local similarity between a query sequence and all database entries. This procedure was then extended in 1997 to an algorithm more based in biological reality due to its ability to deal with gaps in sequence alignments (Altschul et al., 1997). BLAST is an heuristic, which while fast

49

Materials and Methods and reasonably accurate does sacrifice some of its accuracy for speed and cannot guarantee that all homologous sequences will be returned. The optimization of certain parameters is therefore vital to the accuracy of the endeavour.

All studies presented within this thesis utilised a nucleotide BLAST procedure to obtain the maximum number of matching and potentially homologous sequences for analysis. In brief, an input query sequence in FASTA format is broken down into “words” of a certain length. The entire database is then searched for occurrences of every possible word generated from the query sequence. These have a default length of 11 nucleotides. However, as an exact match of the entire word length is required for progression of the algorithm, the word length can be specified in order to alter the sensitivity and specificity of the protocol. When a match is located in a database sequence, the hit is extended by adding bases from the query sequence at both the 5ˈ and 3ˈ ends and checking for matches in the database sequence. The growing alignment is sequentially scored based on a specified match/mismatch penalty. The default value for match/mismatch score is 1/-2. However, this was decreased to 1/-1 for searches for sequences which were more distantly related. The alignment extension is continually scored until the score drops below a specified threshold (20 for nucleotide sequences). The extension step allows the differentiation of meaningful matches from random ones. Sequences returned were assigned an E- value, which is an estimate of the probability of a hit being a false positive.

In searches which were conducted in order to obtain the total number of VP4/VP2 sequences, specificity was sacrificed for sensitivity in terms of both match/mismatch scoring and word length. Alignment results were then edited manually to contain only sequences that covered >90% of the query region (therefore excluding small, random matches) and those which were annotated as other Picornavirus species. All published sequences obtained by database searching are catalogued in our alignments

50

Materials and Methods and presented in phylogenetic trees and other analyses by accession number; which is consistent across all three major databases.

A large number of HRV polyprotein sequences (which are not yet associated with publications) contain large gaps within the coding sequence, presumably due to incomplete coverage during Illumina sequencing. As they are all nearly complete genomes, these sequences were considered as complete genomes and were only excluded if the gaps affected the regions under investigation and comprised of more than 10% of the region.

2.3.4 Likelihood mapping analysis

In the analysis of the occurrence of recombination between three coding regions of HRV-A, -B and –C, likelihood mapping analysis was used to visualise the phylogenetic signal content of the alignment a priori (Strimmer and von Haeseler, 1997). In addition, this method confirms that sequences have undergone the correct mode of evolution to result in a resolved bifurcating phylogenetic tree and therefore that conclusions based upon such an inferred tree would be valid.

Likelihood mapping analysis is based on quartet puzzling and was implemented in the TREE-PUZZLE v5.2 software (Schmidt et al., 2002). Briefly, maximum likelihoods were calculated for each of the 3 possible unrooted trees derived from 4 randomly chosen sequences from the alignment (a quartet) (Strimmer, 1994). Using the Hasegawa-Kishino-Yano (HKY) model of substitution, a total of 10,000 quartets were analysed and each was assigned into a particular category based on the maximum likelihood. The results are plotted onto a triangular surface, whereby the proportion of quartets in each of the three vertices represent fully resolved bifurcating phylogenetic trees and those located in the sides of the triangle represent equal observed likelihoods between two of the calculated trees. The proportion of quartets in the centre of the triangle represent those with an unresolved star-like phylogeny, whereby the likelihoods of all three possible trees are equal. Any

51

Materials and Methods analysis with more than 20-30% of analysed quartets showing unresolved phylogeny are not considered reliable for phylogenetic reconstruction. Possible explanations include “noisy” data, lack of sequence homology, alignment errors and recombination within the region subjected to the analysis.

Likelihood mapping analysis has been previously used to confirm the suitability of a subset of HRV VP4/VP2 sequences (Savolainen et al., 2002) and EV sequences (Andersson et al., 2002; Lindberg et al., 2003) for phylogenetic analysis. In addition, the method has also been used to demonstrate the star-like bursting radiation resulting from the rapid transmission of human immunodeficiency virus 1C (HIV- 1C) after introduction to South America (Véras et al., 2011) and to assess suitability of different regions of HAV for genotyping (Joshi et al., 2008).

2.3.5 Construction of phylogenetic trees

Phylogenetic analysis uses sequence information to provide an estimate of the evolutionary relationships between sequences, provided that adequate homology exists. Studies within this thesis were largely reliant on the use of the Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0 (Tamura et al., 2007) and latterly, version 5.0 (Tamura et al., 2011). Unless otherwise stated, all phylogenetic trees used for HRV typing and analyses of recombination were constructed by neighbour-joining method (Saitou and Nei, 1987) from 100 bootstrap re-sampled sequence alignments of maximum composite likelihood (MCL) (Guindon and Gascuel, 2003) distances with pairwise deletion for missing data 10. Phylogenetic trees were improved by the inclusion of an outlier, which rooted the tree and therefore provided an estimate of which nodes under study were the most ancient. For full HRV species analyses that were conducted, sequence NC_001430: EV70 was used to root the tree. However, for analyses of less genetically diverse

10 Phylogenetic trees included in Chapter 6 were latterly constructed with 1000 bootstrap repetitions. These were compared directly with trees constructed with 100 bootstraps and no detectable difference was found. 52

Materials and Methods sequences (within HRV species), a sequence of different species or type was used (specified throughout).

The neighbour-joining (NJ) method was first described by Saitou and Nei in 1987 (Saitou and Nei, 1987). It is a distance based method that produces an additive tree and is one of the most widely used methods for phylogenetic tree reconstruction (Gascuel and Steel, 2006). While the authors have stated that the method is not guaranteed to always produce the minimum evolution tree, it has good reliability in producing the correct tree topology (Saitou and Nei, 1987). The fact that it is significantly more computationally efficient than more traditional methods based on maximum parsimony allows the relatively fast calculation of trees reconstructed from a large number of operational taxonomic units (OTUs). As the name suggests, NJ is based on the principle of sequentially joining the most closely related OTUs via a single interior node in an unrooted bifurcating tree. Initially, all OTUs/sequences are represented in a star-like tree with no internal clustering. Evolutionary distances are calculated in a pairwise fashion between all sequences and the OTU pair with the shortest inferred branch lengths are joined by an internal tree node. The algorithm then considers this pair as a single OTU and repeats the process of analysing evolutionary distances between all sequences, culminating in the joining of the pair to their nearest neighbour. This process is repeated sequentially until all sequences are included in the final tree.

One of the limitations of the NJ method is that it assumes that no backwards or parallel substitutions have taken place. Therefore, NJ trees are subject to long branch attraction (Felsenstein, 1978), whereby it can predict that two OTUs are more closely related than they actually are, as the algorithm cannot correct for multiple substitutions at specific sites. Other phylogenetic reconstruction methods such as those relying on maximum likelihood optimality criterion are generally more biologically accurate due to their greater ability to correct for multiple hits (Holder

53

Materials and Methods and Lewis, 2003). However, as maximum likelihood methods of tree-building seek to explore a significant proportion of the tree space and score each tree, they are very computationally intensive methods and not practical for use on large datasets.

The dual problems of inaccurate phylogeny estimation and long branch attraction exhibited in pure NJ algorithms can be prevented by the usage of an MCL method to calculate evolutionary distances within a dataset (Tamura et al., 2004). The calculation of MCL distances is ideal for rapidly evolving sequences, as it automatically corrects for multiple mutational events at the same site and additionally incorporates a model of nucleotide substitution which is optimized automatically to fit the data (Tamura et al., 2011). This includes such aspects as transition/transversion ratios, whereby it is recognised that purine-purine and pyrimidine-pyrimidine substitutions occur more readily than purine-pyrimidine in natural mutations.

For datasets that were analysed with Bayesian Evolutionary Analysis of Sampling Trees (BEAST) software (Drummond and Rambaut, 2007), phylogenetic trees were additionally constructed by Bayesian methods implemented in the program. These were found, in each case, to present identical tree topology to trees produced by the neighbour-joining maximum composite likelihood method.

2.3.6 Bootstrapping

In order to determine the robustness of observed phylogenetic groupings within the tree, all trees were constructed with either 100 or 1000 bootstrap resamplings. The bootstrap is a statistical technique that was first applied to assessing confidence levels in phylogenetic groupings by Professor J. Felsenstein of the University of Washington in 1985 (Felsenstein, 1985). The generation of bootstrap values for internal nodes of a phylogenetic tree gives an estimate of the reproducibility of a monophyletic clade. Sequence data used to construct the original tree is randomly resampled and trees constructed from each of the pseudo-replicates produced. The

54

Materials and Methods proportion of observed trees in which each monophyletic clade is observed is then computed and reported as the bootstrap value (Holder and Lewis, 2003). A high bootstrap value (close to 100) indicates that analysis of randomly selected subsets of the alignment data consistently predict the same clade (Berry and Gascuel, 1996). One of the major limitations of bootstrap analysis is that it fails to provide an estimate of the probability of an observed tree and merely indicates a degree of robustness for the tree phylogeny. For instance, due to the inherent bias in methods of calculation of evolutionary distances, long branches subject to long branch attraction can have good bootstrap support, even if they do not represent a “true” phylogenetic relationship (Swofford et al., 2001).

For all phylogenetic studies included in this thesis, a bootstrap value of 70 was used as the threshold for determining adequate support. MEGA v4.0 and v5.0 software packages present phylogenetic trees as both a majority rule bootstrap consensus tree and as the originally determined tree with bootstrap values superimposed. The bootstrap consensus tree was used for phylogenetic analyses.

2.3.7 Recombination detection methods

The general strategy for the detection of recombination in sequence alignments was multi-faceted. Central to all analysis undertaken was the topology of bootstrap supported phylogenetic trees. Phylogenetic trees were constructed containing sequence fragments of varying sizes (depending on the specific analysis undertaken). Trees from sequential genome regions were manually inspected to identify phylogeny violations between clades with bootstrap support above the threshold of 70. All recombination events detected from the inspection of phylogenetic trees were further characterised by analysis with GroupScan and where possible, putative breakpoints were determined.

In addition to the initial screening of sequence datasets by phylogenetic reconstruction, several computational analyses were undertaken to effectively screen

55

Materials and Methods each dataset for the occurrence of recombination. This was done to supplement and support results obtained from phylogenetic analysis. Although a great number of methods for the detection of recombination in sequence data currently exist, methods often give differing or even, contradictory results (Posada and Crandall, 2001; Posada, 2002) and therefore a variety of different approaches were combined within this work. Screening analyses were undertaken with programs implemented in the Recombination Detection Program (RDP) v4.0 (Martin et al., 2010) and SSE v1.0 (Simmonds, 2012) software packages and those available through the HyPhy datamonkey webserver (www.datamonkey.org). All algorithms used for the detection and description of putative recombination events are discussed below.

2.3.7.1 Recombination Detection Program v4.0 The initial screening of each sequence dataset for the occurrence of recombination with bioinformatics based methods was undertaken using algorithms implemented in RDP v4.0 (Martin et al., 2010). RDP v4.0 incorporates numerous different methods for the analysis of recombination in sequence data. Of these, we subjected our datasets to analysis with RDP (Martin and Rybicki, 2000), GeneConv (Padidam et al., 1999), MaxChi (Smith, 1992), Chimaera (Posada and Crandall, 2001), SiScan (Gibbs et al., 2000) and BootScan (Martin et al., 2005). A putative recombination event was accepted for further analysis if it was detected by more than two of these methods and returned a p-value of less than 0.05. Although these algorithms employ varying methods of detecting and characterising potential recombination events, only BootScan within the RDP package takes account of phylogenetic incongruence observed between different sequence fragments. As GroupScan offers a preferable method to BootScan for the identification of putative recombination breakpoints (Section 2.3.7.4), the algorithms available within the RDP v4.0 software package were used only for screening purposes.

56

Materials and Methods

2.3.7.2 Genetic Algorithm for Recombination Detection (GARD) and Single Breakpoint Recombination (SBP) All coding region datasets analysed for the occurrence of recombination were analysed with the Genetic Algorithm for Recombination Detection (GARD) and Single Breakpoint Recombination (SBP) programs (Kosakovsky Pond et al., 2006b, 2006a), available on the HyPhy Datamonkey webserver (www.datamonkey.org).

The two programs operate in a highly similar manner. However, SBP seeks to confirm the presence of recombination within a sequence alignment and GARD seeks to locate all potential recombination breakpoints. The method is based on a search for sequence specific phylogenies between different regions of an alignment, concentrating on a single inferred breakpoint for SBP and with a more heuristic approach in GARD. Briefly, a NJ tree is constructed for the full sequence alignment and an AICc (small sample Akaike Information Criterion) score is calculated. The score takes into account the nucleotide model specified and rate parameters and branch lengths are estimated by maximum likelihood. The sequence alignment is then split into sequential blocks with a breakpoint (i) in the middle. An NJ tree and corresponding AICc score are calculated for each block. In SBP, this process is repeated for every possible location of the breakpoint. If the AICc score is higher for any individual segment than for the alignment as a whole, recombination is deemed to have acted on the alignment. In GARD analysis, a more heuristic approach to searching the space for putative breakpoints is employed, as scanning every possible combination of breakpoint locations is not practical. In order to infer a putative recombination breakpoint, the algorithm does not always require a topological change to be observed. Instances of changes in branch length between segments are additionally reported (Kosakovsky Pond et al., 2006b, 2006a).

Datasets were analysed both as full HRV species and as individual HRV types. In addition, sets of sequences involved in recombination events observed by other

57

Materials and Methods methods (ie/ RDP or phylogenetic incongruity) were specifically subjected to analysis with GARD and SBP, in order to verify events observed. Due to the restriction placed upon individual server usage, a full analysis of all three HRV species concurrently was not able to be carried out in this manner. The model selection tool offered within the datamonkey webserver was employed and the appropriate model of nucleotide substitution was used for each GARD/SBP analysis.

2.3.7.3 TreeOrderScan The program TreeOrderScan in SSE v1.0 is a sequence analysis tool used to directly visualize the frequency of recombination events in a particular sequence alignment and provide a preliminary analysis of potential breakpoints. TreeOrderScan analyses a sequence alignment over sequential fragments of defined length and increment. Optimal trees are produced for each sequence fragment and branches collapsed to a pre-specified bootstrap support threshold. The program then directly compares the branching order of trees from each fragment and catalogues any phylogeny violations. This is then presented as a half-diagonal matrix, giving an overview of the degree of phylogenetic incongruence observed in each region and additionally as a specific list of putative violations. Sequence datasets can be sub-divided into groups, which then allows the computation of frequencies of phylogeny violations both within and between groups (for instance, different HRV species).

TreeOrderScan was used as an initial tool to screen alignments for recombination, using a sequence fragment of 300 nucleotides and an increment of 30. This was implemented in analysis of recombination within the 5ˈUTR of HRV-C (Chapter 5) and analyses of recombination within the coding regions of all three HRV species (Chapter 6). All analyses were undertaken using EV70 : NC_001430 as an outgroup. This technique has also been used for the analysis of recombination in apthoviruses (Heath et al., 2006), HRV (Linsuwanon et al., 2011) and EV (Simmonds and Welch, 2006).

58

Materials and Methods

2.3.7.4 GroupScan In instances where recombination was detected by inspection of phylogenetic trees and the usage of the computational methods described above, GroupScan was used in order to verify putative events and determine recombination breakpoints where possible. GroupScan is also implemented within the SSE v1.0 software package.

GroupScan was first described in 2005 (Simmonds and Midgley, 2005) and is used to score the extent of grouping of a query sequence with two or more control groups. The grouping score is a measure of how deeply embedded within a particular group a query sequence is. For instance, a score of 0 indicates that there is no grouping and a score of 1 indicates that the query sequence groups deep within a clade of control sequences. Sequences are analysed in a sequential fashion, with both fragment size and increment being specified by the user. The output shows a graph with the midpoint of the analysed fragment represented on the x axis and the grouping score on the y. This can be used to determine approximate recombination breakpoints, when a query sequence is scanned against its nearest neighbours in both regions. The putative breakpoint is determined by point of intersection of the grouping score trace of both parental sequences.

The GroupScan method was used for all determinations of recombination breakpoints. In preference to the method commonly employed in BootScan and Simplot analyses, GroupScan can consider entire groups of sequences as control groups and therefore doesn’t lose valuable phylogenetic information by condensing these sequences to a single consensus. This also allows the calculation of recombination breakpoints in circumstances where parental sequences are not present within the dataset analysed. The RDP algorithm also returns estimations of recombination breakpoints. However, the RDP method relies purely on the analysis of evolutionary distance data, whereas GroupScan bases the score on observations conducted of bootstrap supported phylogenetic trees.

59

Materials and Methods

Within this thesis, GroupScan was used to pinpoint recombination breakpoints both in coding (Chapter 6) and non-coding (Chapter 5) genome regions. The program was also used as an adjunct to other analyses in the further characterisation of putative recombination events highlighted by computational analyses, but not observed on bootstrapped phylogenetic trees. All analyses carried out used a fragment size of 300 nucleotides and an increment of 30, unless otherwise stated. Coding region recombination breakpoints (Chapter 6) were additionally analysed with a 600 base fragment and 30 base increment. These additional program runs generated highly similar results.

2.3.8 Bayesian Evolution Analysis of Sampling Trees (BEAST)

In order to model the evolutionary history of selected HRV sequence fragments and determine mean substitution rates, the BEAST software package was used (Drummond and Rambaut, 2007) to out Bayesian Markov Chain Monte Carlo (MCMC) analysis of input sequence datasets. Sequence data is prepared for analysis using the adjunct software Bayesian Evolutionary Analysis Utility (BEAUti). This allows input of tip dates for sequences (where known) and specification of priors, including a nucleotide substitution model and molecular clock. The SRD06 model was used for all studies within this thesis, as has been recommended for the analysis of protein coding sequence data (Shapiro et al., 2006). This model specifies an HKY model of nucleotide substitution, assumes equal base frequencies and specifies gamma distributed rate heterogeneity. Additionally, the SRD06 model links the first two bases of a codon but allows the third base to have a different substitution rate and transition/transversion ratio. A constant population size prior and a relaxed molecular clock were used for all analyses. A relaxed molecular clock does not assume a constant mutation rate between lineages and so can assign different mutation rates to different tree branches. All other priors are optimized during the execution of the program.

60

Materials and Methods

The XML file generated in BEAUti was input into BEAST and the MCMC algorithm was run for 20 million steps. Initially, the program produces a phylogenetic reconstruction from the input sequence data and calculates the posterior probability density of this tree. The algorithm then randomly varies a few of the input parameters to create a small, sequential step in the parameter space. The posterior probability density of this step is calculated and a decision is made regarding the position of the chain in the parameter space. If the posterior probability of the new step is higher, the chain moves on. If a large drop is observed, then the chain maintains its position and an identical step is created. However, if the posterior probability density decreases only slightly, the move is accepted in a proportion of cases. This process is then repeated millions of times. As the steps taken are small and incremental, chains tend to stay in regions of high posterior probability and therefore converge upon a specific reconstruction with a high posterior probability.

BEAST output was analysed in the Tracer program, as part of the BEAST package. On completion, inspection of the output files can help to give an indication of whether the number of steps in the chain was sufficient. Each output parameter is reported with an estimated sample size value (ESS). Values of less than 200 indicate a requirement to optimize input priors or increase the number of MCMC steps. In order to reconstruct phylogenetic trees produced by Bayesian methods, tree must be annotated in TreeAnnotator program (also part of the BEAST package) and visualized in the FigTree software.

In order to verify that the specified priors were not inadvertently influencing the results obtained, each sequence dataset was subjected to an “empty” run of the BEAST algorithm with sequence data excluded. In addition, each sequence dataset was analysed with the dates of isolation of individual sequences scrambled.

61

Chapter 3 Epidemiology and Clinical Correlations of HRV

3.1 Introduction

3.1.1 Characteristics of circulating HRV in populations of patients with respiratory illnesses

HRV are very common respiratory infections and the classic triad of rhinorrhoea, cough and headache has been experienced on numerous occasions by most of the population. Seropositivity for HRV exceeds 90% by two years of age (Blomqvist et al., 2002a). However, even this staggeringly high seroprevalence is likely to be an underestimate, as the methods employed in this study would have overlooked not only 12 now recognised HRV-A/-B types11 (www.picornaviridae.com) but the entire species C. The gradual shift to the use of molecular based methods for HRV screening has enabled much more accurate estimates of HRV incidence and worldwide circulation to be produced.

HRVs are routinely detected in respiratory samples in every month of the year and all three species have been found to have a global distribution (Briese et al., 2008). The documented detection frequency of HRV in respiratory samples ranges from 7.9% - 64.3% (Alper et al., 2008; du Prel et al., 2009; Brittain-Long et al., 2010; Chan et al., 2012; Chidlow et al., 2012; Renwick et al., 2007; Lamson et al., 2006; Arden et al., 2006; Kistler et al., 2007a)12. Two seasonal peaks in HRV detection are generally observed; one in the early autumn and one in the spring (Jin et al., 2009;

11 Throughout this chapter, the term “HRV type” is used to refer to a phylogenetic group of HRV sequences with pairwise nucleotide p-distance in VP4/VP2 of below the proposed threshold for each species (Simmonds et al., 2010; McIntyre et al., 2013a). The process of defining these thresholds and assigning HRV sequences into genotypically assigned types in described in Chapter 4. 12 Only detection rates from published studies which used RT-PCR based screening methods are given. 62

Epidemiology and Clinical Correlations of HRV

El-Sahly et al., 2000). In contrast to other common respiratory viruses and other EV species, HRV types do not appear to circulate in an epidemic fashion (with a small number of circulating types being periodically replaced). Instead, a large number of genetically distinct HRV types circulate concurrently (Savolainen et al., 2002; Wisdom et al., 2009b), with no discernible predominance of a particular strain or species (Fry et al., 2011). The existence of over 150 distinct types and the lack of cross-immunity make this constant circulation possible. Despite the abundance of HRV types circulating during one peak season, there appears to be relatively limited persistence of types from one season to the next. One study which considered HRV circulating between September 2006 and Feburary 2007 found that only 5 of the 43 types observed were present in both study months and one HRV-C positive patient was found to be re-infected with HRV-A only 5 days later (Wisdom et al., 2009b). The characteristic, almost random, fashion in which HRV appears to circulate in respiratory samples makes discerning patterns and possible networks of transmission nearly impossible.

HRV-A strains are generally the most frequently detected, while HRV-B is comparatively rare (Chidlow et al., 2012; Wisdom et al., 2009b). A few sporadic studies, mostly originating from the Asian-Pacific region have reported a higher prevalence for HRV-C (Chan et al., 2012; Jin et al., 2009). No consistent bias towards infection of male or female patients has been evident in published studies. Although one study found an a slight excess of HRV-C infections in female patients (Wisdom et al., 2009b), another found a slight excess overall of HRV in male patients (Chan et al., 2012). Overall, precise epidemiological characteristics of HRV infections are difficult to discern, due to the very high incidence in the general population and the large number of distinct strains circulating concurrently.

63

Epidemiology and Clinical Correlations of HRV

3.1.2 Involvement of HRV in infections out with the upper respiratory tract

The classical perception of HRV as merely an agent of the “common cold” caused this group of viruses to be largely overlooked for many years. In addition to their known association with mild URTIs, the realisation that HRV can effectively replicate in the lower airways has led to a now extensive association with lower respiratory tract infection (LRTI), especially in immunocompromised patients (Gutman et al., 2007; Parody et al., 2007; Papadopoulos et al., 2000; Kusel et al., 2006; Papadopoulos et al., 1999; Han et al., 2009; Linsuwanon et al., 2009; Garbino et al., 2009). The reported detection frequency of HRV in lower respiratory tract specimens ranges from 12.6% to 37.8% (Fry et al., 2011; de Almeida et al., 2010; Costa et al., 2011; Arden et al., 2006). HRV are also clinically important causes of exacerbation of chronic lung diseases, such as asthma (Khetsuriani et al., 2008; Bizzintino et al., 2011; Miller et al., 2009a; Friedlander and Busse, 2005) and cystic fibrosis (de Almeida et al., 2010; Smyth et al., 1995).

In addition to severe illness within the respiratory tract, HRV has been implicated in the development of acute otitis media (AOM) (Nokso-Koivisto et al., 2006; Pitkäranta et al., 1998; Savolainen-Kopra et al., 2009a). Indeed, one study isolated HRV from middle ear fluid in 41% of episodes of AOM (Blomqvist et al., 2002a). However, this rate was obtained from screening samples by cell culture based methods and is therefore likely to represent a significant underestimate of the true incidence. HRV have also been linked to the development of acute and chronic sinusitis (Pitkäranta et al., 2001, 1997), which are significant causes of morbidity in the general adult population.

HRV have also been isolated, albeit infrequently, from blood samples of infected patients. This is suggestive of a potential role of HRV viraemia in development of severe disease (Xatzipsalti et al., 2005). Several isolated case reports have discussed

64

Epidemiology and Clinical Correlations of HRV the occurrence of HRV in critically and systemically unwell patients, with HRV RNA being obtained from such distant sites as plasma, pericardial fluid, urine and stool (Broberg et al., 2011; Tapparel et al., 2009c). However, these reports of isolation of HRV from sites other than the upper and lower respiratory tract do remain rare. The current perception of HRV as predominantly causative agents of mild and unimportant illness has probably stifled the desire of the research community to investigate other potential clinical associations.

3.1.3 The role of enteroviruses in aseptic meningitis

The four known species of EV (EVA-D) are often isolated from faecal samples and have been associated with a wide variety of acute and chronic clinical illness (Pallansch and Roos, 2007). However, despite a broad spectrum of clinical associations, all four species have been linked with CNS infection, especially aseptic (viral) meningitis.

Although viral meningitis is generally self-limiting, it can potentially lead to devastating sequelae for affected patients, including seizure disorders and rapid deterioration of mental state in cases of progression to meningoencephalitis (Valcour et al., 2008). A viral aetiology can only be defined in around 66% of cases of aseptic meningitis and of these EV are the most common, with an overall detection frequency of around 26% (Kupila et al., 2006). Other clinically important causes of viral meningitis include herpes simplex viruses (HSV) types 1 and 2 and varicella zoster virus (VZV).

Despite the close genetic relationship of HRV with EV and known phenotypic overlap in the causation of respiratory infections, to our knowledge, a potential relationship of HRV with previously undiagnosed cases of aseptic meningitis had not been examined. Interestingly, HRVs have been isolated during the molecular surveillance of sewage samples in Finland, Latvia and Slovakia (Blomqvist et al.,

65

Epidemiology and Clinical Correlations of HRV

2009) and in investigation of acute flaccid paralysis in South Asian children (Victoria et al., 2009). The gastrointestinal tract is the major site of EV replication and many cases of meningitis are diagnosed on the basis of EV in stool samples. HRV have been traditionally thought to be unable to infect the gastrointestinal tract secondary to acid lability. The isolation of HRV strains from environmental sewage samples was initially attributed to swallowed respiratory secretions or discarded tissues (Blomqvist et al., 2009). One previous case report detailed the isolation of an HRV-C strain from the stool of a systemically unwell patient with pneumonia and pericarditis (Tapparel et al., 2009c). In this case, HRV was also isolated from blood plasma and pericardial fluid. However, the significance of the HRV positive stool sample was uncertain as it had a low observed viral load. The detection frequency of HRV in stool samples had, at the time of the completion of our study, not been discussed.

The studies presented within this chapter further investigate several intriguing aspects of the epidemiology of HRV. Two studies of the epidemiology of HRV respiratory infections in Edinburgh aimed to further expand upon previous work carried out in this geographical region (Wisdom et al., 2009b) by analysing clinical features of HRV positive patients and directly comparing distribution of HRV species between patients with severe respiratory disease necessitating care in the ITU and those cared for on a general ward-based setting. Preliminary investigations of the occurrence and detection frequency of HRV infections in both CSF and stool specimens were conducted, involving 1500 and 288 referred samples respectively. The results of these studies have highlighted a potential overlap in the disease manifestations of HRV/EV and highlight the requirement for further study of these areas.

66

Epidemiology and Clinical Correlations of HRV

3.2 Materials and Methods

3.2.1 Sample selection 3.2.1.1 Sample selection for screening of respiratory samples In order to investigate the clinical correlations of HRV species and types circulating in respiratory samples in Edinburgh, 556 samples were selected from the SVC archive for screening. These were selected randomly from all samples referred for testing between September 2008 and May 2009.

The SVC archive has been maintained since October 2005. The total number of samples referred to the SVC has increased slightly every year and regular peaks in the number of referrals are additionally observed during the so-called “cold and flu months”. These peaks coincide with the Health Protection Agency (HPA) influenza monitoring programme, which runs from October to May (www.hpa.org.uk).

A sharp increase in the number of samples referred to the SVC was noted after the announcement of the H1N1 global swine flu pandemic in June 2009 (Figure 3.1). In particular, the proportion of samples referred from community based patients increased dramatically, from between 1-3% in 2006 – 2008 to 9% in 2009 (marked in black on Figure 3.1). The peak of referrals from community health care centres was 21% of all referred samples in June 2009, coinciding with the official announcement by the World Health Organisation (WHO) of the H1N1 swine flu pandemic. This increase could be attributed to greater public awareness and anxiety regarding cold and flu like illness (CFLI), a lower referral threshold and the practice of routine sampling of asymptomatic contacts in suspected cases.

As a consequence of the artificially increased numbers of referred samples, it was deemed both inappropriate and impractical to use archived samples from this period for this study. Therefore, samples were randomly selected from the 9 month period immediately prior to June 2009. The universally high observed detection frequency

67

Epidemiology and Clinical Correlations of HRV of HRV in respiratory samples diminishes the value of pooling samples for screening and therefore they were selected randomly to provide a cross-section of the time period, while still maintaining manageable numbers for a small study.

FIGURE 3.1: Total number of respiratory samples referred to the SVC for virological testing per month from October 2005 to May 2011. Sections coloured in black represent the proportion of samples that were referred from community settings, whereas sections coloured in grey represent those referred from hospitalised patients. The date of declaration of the swine flu pandemic by the WHO is marked by a dotted line.

3.2.1.2 Sample selection for comparison of patients with acute and critical illness In order to investigate clinical associations between HRV species and severity of illness, a preliminary study was conducted comparing characteristics of HRV positive patients in both the Intensive Treatment Unit (ITU) and the Accident & Emergency department (A&E) or general wards. A small subset of HRV positive samples from July 2007 until June 2008 with comprehensive patient information available were selected for inclusion. These had previously undergone routine

68

Epidemiology and Clinical Correlations of HRV diagnostic screening for respiratory viruses by staff at the SVC using standard protocols (Bennett et al., 2011; Dierssen et al., 2008; Scheltinga et al., 2005). Samples which were from the same patient were excluded. All other data, including age and clinical details were obscured during the selection process.

3.2.1.3 Sample selection for screening of CSF samples In parallel with the respiratory archive, an archive is maintained of all referred CSF samples and corresponding extracted RNA. In order to investigate HRV as a potential causative agent of aseptic meningitis, one full year (2009) of referred CSF samples were selected and extracted RNA was combined into pools of 10 for screening.

3.2.1.4 Sample selection for screening of stool samples The study presented within this thesis involved screening a total of 288 referred stool samples for HRV. This work formed a part of a larger published study that included screening undertaken at the SVC by Dr H. Harvala of a further 333 diagnostic stool samples by real-time PCR (Harvala et al., 2012b). The 288 samples included in this study comprised of stool samples isolated in May (n=90), September (n=108) and December (n=90) of 2010. These represented referred specimens for routine diagnostic analysis and included a number of patients referred in the investigation of suspected norovirus outbreaks.

3.2.1.5 Sample selection for screening of respiratory samples from September and December 2010 In order to determine whether detection of HRV in stool specimens coincided with an increase in the same strains isolated from respiratory specimens, screening of a subset of respiratory samples was undertaken. These were taken from the two calendar months with the highest detection frequency of HRV in stool samples. Due to the concentration of HRV infections in stool in children under the age of five and

69

Epidemiology and Clinical Correlations of HRV older adults, the sample selection was restricted to these age bands. This led to the inclusion of a total of 731 respiratory samples (September: n= 221, December: n=510).

3.2.2 Amplification of 5ˈUTR and VP4/VP2 region of HRV positive samples

All samples were initially screened for HRV using a nested RT-PCR strategy for the 5ˈUTR of HRV and positive samples were then amplified in the VP4/VP2 region (described in Chapter 2). VP4/VP2 sequences of samples grouped into phylogenetic clusters (types), in accordance with grouping with an HRV prototype strain.

3.3 Results

3.3.1 Epidemiological characteristics of HRV circulating in Edinburgh between September 2009 and May 2010

556 samples were screened in the 5ˈUTR for HRV and VP4/VP2 typing was attempted for the 141 detected positive samples. Of these, 8 samples (6%) were found to be EV strains, 6 samples could not be typed and 127 (23 %) were confirmed to be positive for HRV. HRV-A was the most frequently observed (54%), followed by HRV-C (36%) and HRV-B was present in only 10% of samples The observed prevalence of HRV infection was slightly higher in males (26%) than in females (20%) and a broadly similar relative frequency of HRV species was observed in both groups (Figure 3.2 A). HRV was significantly more frequently detected in hospitalised patients than in patients in community based or outpatient settings (p=0.0001). There was additionally no evidence of predominant circulation of any particular HRV species or type within any time period.

Samples included were divided into those obtained from the upper respiratory tract (including nasopharyngeal aspirates) and those isolated from the lower respiratory

70

Epidemiology and Clinical Correlations of HRV tract (including bronchoalveolar lavage). The majority were taken from the upper respiratory tract (n=476; 86%). There was no difference in detection frequency of HRV recorded in upper and lower respiratory samples (23% and 24% respectively). However, there was a statistically significant association between HRV-A detection and isolation of sample from the lower respiratory tract (p=0.034) (Figure 3.2 B). HRV infection was more frequently detected in patients under the age of 5 (31.2%) than in those aged 6 and over (12.1%) (p=0.001). The relative frequency of different HRV species was largely similar across different age bands (Figure 3.2 C), with the only statistically significant difference being a higher rate of detection of HRV-C in the 7-12 month age group (p=<0.0001).

The samples included in this study had previously undergone diagnostic screening at the SVC for adenovirus (AdV), parainfluenza virus (PIV) types 1-4, influenza A (FluA) and influenza B (FluB). In addition, these samples had been previously screened for three types of human coronavirus (HCoV); HKU1, OC43 and 229E (Gaunt et al., 2010). The most common co-infection observed was by AdV representing 10.2% of HRV infections (n=13). There were four triple infections and one quadruple infection, all involving both HRV and AdV. All triple infections originated from paediatric patients for in the general ward. However, the quadruple infection was observed in a 7-12 month old ITU patient. Detailed clinical data for this particular sample is missing, although the sample type is recorded as a bronchoalveloar lavage specimen. Bronchoalveolar lavage is a traumatic and invasive sampling technique which is only carried out when indicated by high clinical suspicion of lower respiratory tract pathology. This supports the assumption that there is likely to have been a relatively severe lower respiratory tract involvement in this child’s illness.

71

Epidemiology and Clinical Correlations of HRV

FIGURE 3.2: Epidemiological analysis of HRV infections in patients from September 2008 – May 2009. Comparison of the detection frequency of HRV-A, -B and –C infections in male and female subjects (A) and upper and lower respiratory tract samples (B). Subject/sample totals are given above each graph. Six samples had no sample type information available. (C) Detection frequency of HRV-A, -B and –C in each age band studied (m = months, y = years).

72

Epidemiology and Clinical Correlations of HRV

One paediatric patient, aged 7-12 months, appeared to harbour a persistent infection with both the same HRV-C (C8) and HCoV (OC43) types over a 2 month period. This patient was suffering Pneumocystis carinnii pneumonia, secondary to severe combined immunodeficiency and requiring mechanical ventilation. Besides HRV infection, no other co-infections were noted. These observations suggest that persistence of HRV may occur in immunocompromised patients who are critically unwell.

3.3.2 Comparison of characteristics of HRV positive paediatric patients between acute and high dependency care settings

In order to investigate clinical characteristics of HRV positive paediatric patients in acute and high dependency care settings, a subset of 64 samples were typed in the VP4/VP2 region. These were all from patients being treated in the Royal Hospital for Sick Children, Edinburgh (RHSC) and spanned a single calendar year from July 2007 to June 2008. 35 samples (55%) originated from patients being cared for in a high dependency ward setting, including both the ITU and high dependency unit (HDU). 29 samples (45%) were taken from patients during acute and short-term admissions. Acute admissions were defined as those patients discharged from A&E or those under short-term observation within the Medical Assessment Unit (MAU). This group had an average length of stay in hospital of 2.42 days (range 0 to 13 days), whereas high dependency patients had an average length of stay in hospital of 35 days (range 3 to 170 days).

Within the total population studied, we observed a detection frequency of 69% HRV- A (n=44), 6% HRV-B (n=4) and 25% HRV-C (n=16). There was no difference in HRV species detection between the two patient groups (p = 0.69) (Figure 3.3). The large number of different types identified (43 HRV types in 64 patients) rendered any attempt to determine associations between specific HRV strains and disease presentations impossible.

73

Epidemiology and Clinical Correlations of HRV

FIGURE 3.3: HRV species isolated from distinct patient groups. Values are shown separately for ITU/HDU, A&E/MAU and total cohort separately.

The vast majority of HRV positive samples were from those under the age of 5 (86%). The species prevalence within each age band followed the expected and previously observed pattern with HRV-A most prevalent and HRV-B the least. When grouped by age bands, patients did not segregate by admission to ITU or A&E.

30% of children sampled had evidence of a co-infection (n=19). Similar frequencies of co-infections were observed in patients admitted to ITU (74%) and patients seen in the acute care setting (67%). The most common co-infection observed in both settings was by AdV. This was not found to be correlated with an increased length of stay in hospital or more severe respiratory symptoms in this sample set. Notably, only two bacterial co-infections were identified; one with Moraxella catarrhalis and one with Haemophilus influenzae. Both of these occurred in children aged one to

74

Epidemiology and Clinical Correlations of HRV two years admitted to ITU with severe respiratory distress requiring ventilatory support and both had an additional viral co-infection with AdV.

The Paediatric Index of Mortality (PIM2) scoring system is a tool for rating the severity of illness and mortality risk in children at the time of their admission to ITU (Slater et al., 2003). Consideration of a variety of clinical factors allows the calculation of a score correlating with a percentage likelihood of mortality. The likelihood of mortality as predicted by PIM2 score showed no significant difference between HRV species. Additionally, no relationship was evident between viral loads in respiratory samples and PIM2 score. However, these results are difficult to interpret due to a lack of detailed clinical information regarding the primary cause of illness in each child. For example, one child with an extremely high PIM2 score of 0.48317 was admitted to ITU with cardiac failure post cardiac arrest and a long- standing history of immunosuppression and mitochondrial cytopathy. Mitochondrial cytopathy predisposes to the development of LRTIs secondary to poor muscle tone and swallowing difficulties. However, the post-arrest status of this child and resulting cardiac failure are likely to have been the primary reasons for ITU admission and it is likely that the documented respiratory infection was a secondary concern which was already resolving (CT value of 38.64). Further studies would benefit from prospective collection of significantly more detailed clinical data.

3.3.3 Screening of CSF samples for HRV As EV are an extremely common cause of aseptic meningitis, screening of CSF samples for HRV was undertaken in order to determine whether these closely related pathogens may also play a role in the development of this disease. A total of 1500 CSF samples from 2009 were screened for HRV in pools of 10. Of these, only one was positive, which originated from a female neonate (under the age of 3 months). The limited clinical data available identified a sepsis-like illness. It is not known whether the patient was immunocompromised. HRV was additionally present in a

75

Epidemiology and Clinical Correlations of HRV throat swab from the same patient. However, stool samples and blood cultures were not obtained. These results indicate that HRV is an extremely rare and currently unrecognised potential cause of aseptic meningitis.

3.3.4 Isolation of HRV from faecal samples

Stool samples are referred to the SVC principally for diagnosis of suspected viral gastroenteritis or viral meningitis and also commonly in investigation of suspected infective outbreaks of diarrhoea and vomiting within the hospital setting.

A total of 26 from 288 stool samples screened were PCR-positive for HRV (9%). Positives were amplified and sequenced in the VP4/VP2 region for HRV species and type identification. All positive samples were isolated from hospitalised patients, of which over a third (35%) were obtained from patients under investigation for potential norovirus infections. The highest frequency of detection of HRV in an individual month was observed in September (10%). However, both May and December also displayed relatively high detection rates (7% and 8% respectively).

In the 26 positive samples, a total of 15 HRV VP4/VP2 types were identified (5 HRV-A, 2 -B and 8 -C). This suggests that rather akin to the pattern of circulation observed in respiratory specimens, a large number of types circulate concurrently each epidemic season (Savolainen et al., 2002). HRV species identified in the positive samples comprised 50% species C (n=13), 42% species A (n=11) and 8% (n=2) species B. Additionally, phylogenetic analysis reveals that HRV sequences obtained from stool samples were genetically diverse in the VP4/VP2 region (Figure 3.4).

Five HRV positive patients infected with highly similar strains of HRV-A1 were identified (marked with black dots on Figure 3.4). All of these patients presented in September 2010 with diarrhoea (Table 3.1) and had stool samples referred as part of the investigation of a norovirus outbreak. However, as the patients were treated in

76

Epidemiology and Clinical Correlations of HRV three different locations, it is unlikely that these five samples represent an isolated outbreak of HRV infection.

In the absence of a control group with no gastrointestinal symptoms, it is not possible to conclusively link the presence of HRV in stool with gastrointestinal disease. However, 77% (n=20) had no documented evidence of concurrent respiratory tract infection, such as bronchiolitis or URTI. 27% (n=7) of HRV positive patients required treatment in an ITU setting. The majority of putative HRV gastrointestinal infections were observed in children under the age of 5 years (n=18, 69%), with a further peak observed in the over 65 age group (n=6, 23%). Nine of the HRV positive patients were patients who were initially referred as part of the investigation of suspected norovirus outbreaks and tested negative for norovirus.

Only three co-infections were noted in stool samples; two with AdV and one with Salmonella enteritidis. One child who was co-infected by AdV had symptoms of both an URTI and diarrhoea. Further screening performed by real-time PCR used for EV detection by Dr H. Harvala at the SVC detected 11 HRV positive stool samples from 333 tested (Harvala et al., 2012b). The false positivity for HRV in EV screening in the diagnostic setting arising from the high level of sequence conservation in the 5ˈUTR may have led to misrepresentation of the frequency of EV infections in clinical samples.

77

Epidemiology and Clinical Correlations of HRV

FIGURE 3.4: Neighbour joining phylogenetic trees of the VP4/VP2 region of HRV-A, HRV-B and HRV-C. Sequences from isolates recovered from stool samples are marked with black dots. Trees were constructed using NC_001430 : EV70 as an outgroup. Branches are scaled by genetic distance.

78

Epidemiology and Clinical Correlations of HRV

TABLE 3.1: Clinical characteristics of patients with HRV isolated from stool samples

Sample Isolation Age Clinical Clinical details Number month location 10_511 May 2010 6-10 years RHSC Diarrhoea 10_512 May 2010 7-12 months RHSC URTI, diarrhoea and vomiting 10_518 May 2010 3-5 years RHSC Diarrhoea and fever 10_520 May 2010 7-12 months ESCITU Diarrhoea and seizure 10_521 May 2010 <3 months ESCITU Diarrhoea, vomiting and fever 10_522 May 2010 <3 months RHSC Bronchiolitis and vomiting 10_547 a May 2010 7-12 months RHSC Diarrhoea and vomiting 10_568 a May 2010 3-5 years RHSC Diarrhoea and vomiting 10_91 a Sept 2010 >65 years RIE Diarrhoea 10_92 a Sept 2010 37-65 years ITU Diarrhoea and fever 10_93 a Sept 2010 >65 years ITU Diarrhoea and fever 10_94 Sept 2010 1-2 years RHSC Diarrhoea and fever 10_95 Sept 2010 <3 months RHSC Diarrhoea and vomiting 10_100 Sept 2010 7-12 months RHSC Loose stools and irritability 10_128 Sept 2010 <3 months RHSC Bronchiolitis and loose stools 10_129b Sept 2010 1-2 years RHSC Fever (post-transplant) 10_141c Sept 2010 3-6 months RHSC Diarrhoea 10_149b Sept 2010 1-2 years RHSC URTI and loose stool 10_153 a Sept 2010 >65 years ITU Diarrhoea and fever 10_121 Dec 2010 3-6 months ESCITU Bronchiolitis and vomiting 10_123 Dec 2010 7-12 months ESCITU Bronchiolitis and vomiting 10_124 Dec 2010 7-12 months RHSC Gastroenteritis 10_125 Dec 2010 7-12 months RHSC Diarrhoea and seizure 10_128 a Dec 2010 >65 years RIE Diarrhoea and vomiting 10_1213 a Dec 2010 >65 years RIE Diarrhoea and vomiting 10_1218 a Dec 2010 >65 years RIE Diarrhoea and vomiting RIE – Royal Infirmary of Edinburgh, RHSC – Royal Hospital for Sick Children (Edinburgh), ITU – Intensive Therapy Unit (RIE), ESCITU – Edinburgh Sick Children ITU (RHSC) a Nine patients were investigated as part of suspected norovirus outbreaks b Two co-infections with Adenovirus were identified c One co-infection with Salmonella enteritidis indentified

In order to ascertain whether HRV obtained from stool samples was present at concentrations consistent with the possibility of gastrointestinal infection, viral loads were recorded and compared for both HRV and EV positive samples. There was no significant difference in viral load (as measured by real-time PCR CT value) between

79

Epidemiology and Clinical Correlations of HRV

HRV (28.5; standard deviation ±3.5) and EV (28.9; standard deviation ±1.9) positive samples (p=0.77). However, one-step real-time PCR specific for the HRV 5ˈUTR performed as per SVC protocols detected only 42% (n=11) of the 26 HRV positive samples identified by nested RT-PCR. Despite the high frequency of HRV detection observed in all three months, more extensive sampling would be required to determine whether HRV RNA is found in stool samples throughout the year.

3.3.5 Discrepancy of screening results between real time PCR in diagnostic setting and VP4/VP2 screening PCR

In order to investigate the characteristics of concurrent HRV circulation in respiratory illnesses during September and December 2010, 731 respiratory samples were screened for HRV. A total of 572 of these had previously undergone screening procedures at the SVC using the standard HRV 5ˈUTR real-time RT-PCR. Of these, 115 (20%) were recorded as HRV positive and therefore typed by sequencing of the amplified VP4/VP2 region. Eight of these positive samples could not be amplified in VP4/VP2 and repeat screening with the 5ˈUTR nested RT-PCR previously described was negative (Wisdom et al., 2009b).

The 457 samples recorded as negative on routine diagnostic screening were combined into pools of 6 and screened by nested 5ˈUTR RT-PCR. 15 of these samples (3%) were actually HRV positive and typed in the VP4/VP2 region (shown in red on Figure 3.5). These 15 samples comprised 4 of species A, 10 species C and one species B.

The samples which were negative on initial 5ˈUTR PCR screening were genetically diverse, from all three HRV species and did not segregate in VP4/VP2. Within HRV-A, three out of the four previously negative samples grouped with samples that were identified by 5ˈUTR real-time PCR screening. Within HRV-C, five initially false-negative samples grouped with three types that had been identified by the

80

Epidemiology and Clinical Correlations of HRV

5ˈUTR real-time PCR screening. This suggests that the detection failure may be secondary to low viral load in these cases.

However, 6 false-negative HRV-C samples belonging to four types did not group with any previously identified HRV-C positive samples. All four of these belonged to groups that have been previously documented to be recombinant HRV-Ca types (Wisdom et al., 2009a; McIntyre et al., 2010, 2013b). Analysis of a small fragment of the 5ˈUTR that is additionally amplified during the VP4/VP2 typing PCR confirmed these 5ˈUTR groupings (data not shown). This may indicate that the addition of a single extra primer specific for HRV-Cc strains may not be sufficient and a redesign of primers used in this particular screening protocol may be warranted for use in routine HRV screening.

FIGURE 3.5: Neighbour joining phylogenetic tree showing the VP4/VP2 region of HRV positive samples from respiratory specimens and stool specimens. NC_001430: EV70 was used as an outgroup for analysis for all trees and branch to tree root has been collapsed for ease of reference. Positive samples from the SVC respiratory archive are shown in bold black, positives from stool screening are shown in bold blue. HRV positive samples which were negative on initial real-time HRV 5ˈUTR RT-PCR screening are shown in red. Two HRV-B and two HRV-C VP4/VP2 sequences belonging to putative new types are outlined by a blue box (later designated HRV-Bpat4 and HRV- Cpat17 – see Chapter 4). Branches are scaled by genetic distance.

81

Epidemiology and Clinical Correlations of HRV

82

Epidemiology and Clinical Correlations of HRV

3.3.6 Analysis of concurrent HRV circulation in respiratory samples during September and December 2010

A total of 142 HRV positive samples were identified in retrospective screening of 731 respiratory samples from September and December 2010. The overall detection frequency seen was relatively higher than that observed in stool samples (Table 3.2). However, the distribution of the three species of HRV was highly similar. HRV-C was the most frequently detected in both, closely followed by HRV-A, while HRV-B was relatively rare (Table 3.2). Additionally, in both sample types, HRV was relatively more common in September than December.

TABLE 3.2: Comparison of detection rates between HRV positives isolated from stool and respiratory samples Stool Respiratory specimens specimens HRV positive samples 9.0% 19.4% (n=26) (n=142) % HRV-A 42.3% 41.6% (n=11) (n=59) % HRV-B 7.7% 12.0% (n=2) (n=17) % HRV-C 50.0% 46.5% (n=13) (n=66) HRV positive samples – 12.2% 36.3% September (n=11) (n=81) HRV positive samples – 7.8% 12.0% December (n=7) (n=61)

HRV positive samples observed represented 51 HRV types (21 HRV-A, 9 HRV-B and 21 HRV-C). Of the 13 HRV types detected in stool specimens from September and December 2010, only 6 were seen co-circulating in respiratory samples from the same month (marked on Figure 3.5). In addition, only two potential matches between a stool and respiratory sample could be identified. This was surprising, as although there was a low incidence of respiratory symptoms within the group of patients with HRV positive stool specimens, 7 of these were treated in ITU. The majority of ITU admissions with fever or other infective symptoms are routinely

83

Epidemiology and Clinical Correlations of HRV subjected to a full septic screen comprising of respiratory, stool and urine samples with blood cultures. One HRV-A1 positive patient had the virus identified in both stool and upper respiratory secretions (sample numbers 10_94 and Resp_28852). In addition, HRV-C39 was isolated from both stool and upper respiratory secretions of a child aged 3-6 months admitted to the RHSC (sample numbers 10_141 and Resp_28726). Although matched respiratory samples could not be identified for HRV-C24 positive stool samples, three HRV-C24 positive respiratory samples were isolated from patients also admitted to the RHSC in the same month. HRV positive samples of the same type from the same month were also observed for HRV-A89, - B52 and –C39.

3.4 Discussion

3.4.1 Epidemiology and clinical correlations of HRV respiratory infections in Edinburgh

Throughout the course of this work, we have completed three studies of the epidemiology and clinical correlations of HRV respiratory infections in Edinburgh, including a study of the distribution of HRV species between ITU and A&E patients and investigations of concurrently circulating HRV in the same time period as HRV positive stool samples. A number of samples included in the study of HRV respiratory samples matched with stool samples had been previously screened for the presence of HRV by real-time RT-PCR specific for the 5ˈUTR of HRV. Of particular concern is the observation that this commonly used diagnostic PCR failed to detect a number of HRV positives. Indeed, around 3% of the samples that had been reported as negative by real-time PCR were actually HRV positive. Published literature on this subject indicates that this is not an isolated occurrence and not restricted to this particular PCR. Of 10 commonly used HRV 5ˈUTR primer pairs, no single pair could reliably detect all HRV present in a testing panel (Faux et al.,

84

Epidemiology and Clinical Correlations of HRV

2011)13. In accordance with our observations (Figure 3.5), the erroneous negative results did not correspond to a particular HRV species or type. Although the primer sets published previously (Wisdom et al., 2009b, 2009a) and used with modifications described within have not been independently tested for sensitivity and specificity, other work undertaken within our group found that the 5ˈUTR and VP4/VP2 nested PCRs were consistently able to detect RNA transcripts of all species of HRV in samples with close to 103copies/µL (McLeish et al., 2012).

Within the set of randomly selected respiratory samples from Edinburgh, HRV was found in 23% of cases, which falls within the range of 7.9% - 64.3% previously described (Alper et al., 2008; du Prel et al., 2009; Brittain-Long et al., 2010; Chan et al., 2012; Chidlow et al., 2012; Renwick et al., 2007; Lamson et al., 2006; Arden et al., 2006; Kistler et al., 2007a). We found that HRV was significantly more common in hospitalised patient populations than in the community based patients. However, the number of community based patients with referred respiratory samples was small and only four positive samples were identified in this group. Although the incidence of HRV is likely to be significantly higher in cases of mild URTI, patients with mild illness do not usually present and are not routinely sampled. Previous reports indicate that the incidence of HRV infection in asymptomatic individuals may vary from 2% to over 50% (Fry et al., 2011; Alper et al., 2008; Peltola et al., 2008; Brittain-Long et al., 2010; Chidlow et al., 2012). A valuable addition to future work would be the prospective collection of both samples and data from a community based control group, including both symptomatic and asymptomatic patients.

In accordance with previous studies, HRV-A was generally the most prevalent HRV species in respiratory samples. However, in both stool samples and respiratory samples from September 2010, HRV-C was the most commonly isolated species

13 Primers in use in both the research and diagnostic setting in Edinburgh were not included in this published analysis. 85

Epidemiology and Clinical Correlations of HRV

(Figure 3.5). There was no difference in the distribution of infecting HRV species by age. Despite several published studies suggesting a correlation between HRV-C infection and more severe clinical illness, we found no significant difference between HRV species distribution or co-infection rate in patients admitted to ITU or A&E. Although ITU admission was presumed to correlate with illness severity, it may represent a somewhat crude estimate in some respects. For instance, some patients are likely to have required ITU care for another, unrelated condition, with the documented respiratory infection being a secondary concern. The limited amount of clinical data that was able to be collected for each patient retrospectively limited the analysis. Future studies would benefit from prospective data collection.

The results of these studies have several notable differences from a similar study of 456 Edinburgh respiratory samples carried out in 2006/2007 (Wisdom et al., 2009b). Although this study was carried out less than 5 years prior to the studies within this thesis, they found that HRV-A were considerably more frequent among patients admitted to ITU and HRV-C strains were entirely absent. In addition, they found a significantly elevated incidence of HRV infection in the over 65 age group, whereas we observed HRV in only 4% of this group. However, certain other features, such as a lack of male or female bias and the generally more frequent isolation of HRV-A are consistent not only between these two studies, but are widely repeated observations. In accordance with our study, Wisdom and co-workers also identified a large number of genetically distinct co-circulating HRV strains (Wisdom et al., 2009b). Determining the finer details of HRV circulation patterns, epidemiological features and clinical correlates are unlikely to be achieved with small, isolated clinical studies. An unprecedented volume of data would be required in order to effectively investigate the correlation of specific HRV strains with specific disease presentations. A large, multicentre, prospective epidemiological study carried out over a period of several years and over many geographical locations, using standardised PCR reagents and primers would potentially allow characterisation of

86

Epidemiology and Clinical Correlations of HRV such features. However, such an endeavour, whilst undoubtedly worthwhile would likely prove to be both logistically challenging and labour intensive.

3.4.2 Screening for HRV in cerebrospinal fluid samples

Despite the detection of HRV RNA in stool samples from patients with acute flaccid paralysis and in vitro propagation of live virus from sewage under conditions optimised for the growth of EV (Victoria et al., 2009; Blomqvist et al., 2009), there was only very little evidence of HRV in CNS infection. Indeed, the one positive sample that was recovered led to an overall detection frequency of HRV in aseptic meningitis of less than 0.5%, which although low is still remarkable. The identification of only one positive sample led to the consideration of the possibility of contamination or laboratory error. However, the individual sample underwent the process of RNA extraction, nested PCR and sequencing on two independent occasions and produced an identical result. In addition, no other laboratory work with HRV positive samples was being carried out during this period. Overall, these results suggest that although it is significantly more rarely observed than EV, HRV is a potential cause of neonatal sepsis-like illness and potentially other CNS disease.

3.4.3 Isolation of HRV from faecal specimens Since the publication of our work on this subject (Harvala et al., 2012b), two further studies detailing the presence of HRV in diagnostic stool specimens have been published (Lau et al., 2012; Honkanen et al., 2013). In addition, one analysis of HRV positive children in Malaysia contained an incidental finding that 34.8% had a primary symptomatology of diarrhoea and vomiting, rather than respiratory symptoms (Chan et al., 2012). Although the high incidence of nearly 10% we observed was surprising, the possibility that HRV may be routinely present in stool samples has been evident for a long time. Not only has HRV been isolated from sewage samples (Blomqvist et al., 2009), from cases of acute flaccid paralysis (Victoria et al., 2009) and from stool samples from critically ill children (Tapparel et

87

Epidemiology and Clinical Correlations of HRV al., 2009c; Broberg et al., 2011), but information available from the ATCC archive indicates that the original HRV-33 prototype strain was isolated from a pool of human stools collected in Maryland in 1939 (strain VR-330). This suggests that HRV has been potentially circulating in faecal samples largely undetected for many years.

HRV-C was the most frequently isolated HRV species from stool samples and accounted for 50% of HRV positive samples. Lau et al also found that HRV-C was detected most frequently in stool samples (Lau et al., 2012). However, in a large study of 4184 stool samples isolated from Finnish children, HRV-A was the most frequently detected HRV species (Honkanen et al., 2013). We found no specific species or strain to be associated with gastrointestinal infection. Although Honkanen and co-workers reported that HRV-A61 was the most commonly amplified HRV type (Honkanen et al., 2013), we did not observe any occurrences of this type within our sample set. The genetic diversity of HRV strains in stool suggests that there may not be specifically enteropathic HRV strains and indeed, all HRV may be capable of infecting the gastrointestinal tract under favourable circumstances. Although there was very little overlap between circulating infections with identical HRV strains in stool and respiratory samples between September and December, the same broad trends were observed (Table 3.2). Not only was HRV-C over-represented in both sample sets but the detection frequency of HRV was uniformly higher in September. This correlates with a previously observed high frequency of HRV infection in Edinburgh in September (Wisdom et al., 2009b), which is possibly linked to the declining temperatures and the start of the school year leading to increased person to person contact in a warm, humid environment. The relatively high incidence of HRV-C in both sample sets may be an incidental finding related to small sample sizes or it may be indicative of a genuinely higher circulation of HRV-C near the close of 2010.

88

Epidemiology and Clinical Correlations of HRV

HRV are traditionally considered to be acid labile and indeed, retention of infectivity at pH of less than 5 was often used to differentiate EV from HRV in clinical diagnostics. Previous findings that may have raised the possibility of HRV as a gastrointestinal pathogen have been attributed merely to swallowed respiratory secretions or discarded tissues (Blomqvist et al., 2009). However, our observation of infrequent reporting of respiratory symptoms in cases of HRV positive stool suggests that swallowed respiratory secretions may not play a significant role. This assertion is further strengthened by the observation of high viral loads, which are incompatible with swallowed HRV being diluted by liquid gut contents. In addition, if simply suffering an upper respiratory HRV infection could lead to the routine isolation of HRV from stool or sewage samples, the detection frequency of HRV in stool samples would surely be substantially higher. Neither our study, nor any published since can claim to prove causality by observation of this association with HRV positive stool samples and gastrointestinal pathology. However, there was no statistically significant difference between viral loads in HRV and EV positive samples (Harvala et al., 2012b). As EV is known to replicate in the gastrointestinal tract, this observation lends weight to the assertion that HRV may be a causative factor in these infections and should potentially be further investigated.

The isolation of HRV from stool samples of symptomatic patients at a similar rate to the isolation of EV indicates the potential value of addition of HRV to the diagnostic screening procedures currently employed. Some diagnostic laboratories use the presence of EV in stool samples as an adjunct to screening CSF samples in the diagnosis of aseptic meningitis. As HRV appears to potentially have a significant role in gastrointestinal infection, this approach may lead to overdiagnosis of EV meningitis in cases where the sample is not further typed. Indeed, 15% of reported EVs identified by diagnostic screening PCR carried out at the SVC were subsequently found to be HRV (Harvala et al., 2012b).

89

Epidemiology and Clinical Correlations of HRV

In conclusion, in common with previously published data, studies of HRV epidemiology in respiratory infections did not reveal any association of a particular strain or species with illness severity or clinical presentation. Significantly larger studies which span a wide geographical area over the course of many years may be the only possible method of determining whether these patterns actually exist and if so, what their nature may be. In contrast to the high incidence of EV observed in cases of CNS infection, HRV was observed at an extremely low frequency in CSF, suggesting that it may be a rare cause of CNS disease. Intriguingly, we found evidence of HRV in around 9% of stool samples from patients with symptomatic gastrointestinal infections and viral loads were similar to that observed for EV infections. Although this was the first published report of the incidence of HRV RNA in the GI tract, two other studies published subsequently have reported similar findings (Lau et al., 2012; Honkanen et al., 2013). Further studies of HRV infection in stool may focus on larger sample sets to determine seasonality and potential circulation patterns, as well as clinical correlates. Investigations in the biological differences in acid stability between these strains and those commonly isolated from respiratory infections may allow determination of the genetic factors that make such infections possible.

90

Chapter 4 Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.1 Introduction

4.1.1 The standard approach to typing in HRV

By the 1960’s, a large number of antigenically distinct HRV isolates had been identified and development of a system for their classification became essential. On the basis of neutralisation properties, HRV were divided into 100 serotypes by 1987 (Kapikian et al., 1967; Hamparian et al., 1987) and prototype strains of each were lodged at the American Tissue Culture Collection (ATCC).

For many years, serological methods were the standard technique used for typing HRV and EV isolates. Serological typing comprises isolation of the virus in cell culture, followed by neutralisation with intersecting pools of type-specific antisera and confirmation with a single type-specific antiserum. This approach was extremely time-consuming, laborious and fraught with difficulty. HRV are challenging to propagate in cell-culture, requiring specific HRV susceptible cell-lines and certain growth requirements (for example, an incubator at 33°C and roller-bottle culture vessels). With over 100 strains recognised, accurately typing even one isolate was a formidable task and standardised sera are in limited supply. The result was that typing was not generally undertaken in HRV and only rarely in EV.

When undertaken, serotyping in EV took 1-2 weeks. In most cases, an umbrella diagnosis of EV infection was deemed sufficient as the results of serotyping had a negligible impact on patient management. The most clinically useful application was

91

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types to distinguish between PV and non-polio enterovirus (NPEV) infections in enterovirus surveillance.

Serotyping also routinely encountered numerous problems leading to reports of “untypeable” isolates. Failure due to virus aggregation within the sample was commonly encountered (Wallis and Melnick, 1967). This could be overcome by filtration or treatment with sodium deoxycholate (JG Kapsenberg, A Ras, 1980). Similarly, failure arising from the presence of a mixture of viruses within the sample could be circumvented by plaque purification or limiting dilutions (Blomqvist et al., 2008). However, both these solutions created additional time-consuming steps in an already lengthy and complex procedure.

As a further difficulty, antisera that were originally developed against strains of HRV and EV isolated in the 1950s and 1960s, increasingly showed reduced or non- reciprocal cross-neutralisation with contemporary isolates through antigenic drift. Several of these antigenic variants (later identified as belonging to a specific type by VP1 sequence analysis (Oberste et al., 2000)) were known as prime strains. These contemporary isolates cannot be neutralised by antisera to the prototype strain while antisera to the contemporary isolate can neutralise the prototype. This has been reported on several occasions for EV (Abraham and Holmes, 1974; Schmidt et al., 1966; Norder et al., 2002), and for HRV (Halfpap and Cooney, 1983). These “inter- type” strains showing intermediate serological properties likely represent variants in the process of diverging into antigenically different types.

Cell culture based screening routinely undertaken for EV and HRV relied on visual inspection of cultures for cytopathic effect and was therefore subjective, especially when used on a large scale. Serological based techniques also suffer from the same subjectivity, even in experienced hands. In one published survey of twelve labs, none reported the correct or expected serotyping results for all ten EV samples

92

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types included and an overall rate of accurate reporting of around 60% was recorded (van Loon et al., 1999). In addition, the requirement for multiple cell lines to screen for HRV and HEV simultaneously (as would be preferable in screening of stool samples), renders cell culture screening nearly impossible. PCR based screening methods offer an easier and more objective alternative.

Other alternatives proposed to serotyping include viral protein fingerprinting (Holland et al., 1998) and immunofluorescence (IFA) based methods (Kiang et al., 2009), both of which can be highly accurate. However, viral protein fingerprinting requires specialised instruments, generates radioactive waste and relies on comparison with available protein sequences, which are significantly less abundant than nucleotide sequence data. IFA based methods suffer from some of the same problems as serological typing, in that reagents are limited and not all known serotypes are able to be tested.

As PCR is a ubiquitous technique in most labs and online databases contain a wealth of nucleotide sequence data, a classification system based on sequence data was ideal. Such a system was developed in 1999 for EV (Oberste et al., 1999b).

4.1.2 Definition of new EV types based on sequence divergence in the VP1 region

In order to classify EV by sequence data, it was essential to consider the likely molecular determinants of serotype. Sequence of the VP1 region correlates with serotype designation in PV isolates (Kilpatrick et al., 1998) and other NPEV (Oberste et al., 1999b, 1999a; Kiang et al., 2009). Studies of sequence divergence in VP1 led to the proposal of a threshold of 25% nucleotide and 12% amino acid divergence for the identification of EV types (Oberste et al., 1999b). These thresholds are now routinely used and a great number of new EV types have been classified on this basis (Oberste et al., 2002, 2007; Brown et al., 2009; Oberste et al.,

93

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

2004c; Smura et al., 2007b). As some EV isolates have documented evidence of recombination within the capsid region (Bouslama et al., 2007; Blomqvist, 2003; Zhang et al., 2010), only VP1 is recommended to be used for typing in EV. Previous attempts to classify EV by VP2 sequences (Arola et al., 1996; Oberste et al., 1998) have been unsuccessful.

The classification of EV isolates by molecular methods has now replaced neutralization assays in standard practice. These methods have been found to consistently outperform serotyping, in terms of accuracy, speed and ability to classify new types (Oberste et al., 2000; Kiang et al., 2009). The method also has benefits in the arena of polio eradication, allowing investigators to distinguish between vaccine derived and wild PV and to track emerging recombinant forms. .

4.1.3 The requirement for genotypic classification of HRV

With such a broad scope of genetic diversity and wide range of distinct clinical manifestations, it is no longer practical or desirable to consider all HRV types as one biological and clinical entity. There are presently no known specific associations of any HRV types with a particular disease. However, as the range of serious clinical illness that can be attributed to HRV infection is becoming more fully appreciated, there will likely be a requirement to routinely screen for HRV in the diagnostic setting and to commence large-scale epidemiological studies to uncover circulation patterns and strain associations. An easy and practical system of classifying HRV into types, analogous to the system now in use for EV (Oberste et al., 1999a) will allow investigations of potential outbreaks and nosocomial transmission and investigation of type-specific biological properties such as identification of specific types with potentially increased virulence.

Methods which rely on cell culture isolation are not practical for detection and characterisation of HRV, especially HRV-C. Indeed, HRV-C cannot be isolated by

94

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types most standard cell culture techniques and propagation currently relies on sinus mucosal organ culture (Bochkov et al., 2011). In vitro propagation of HRV-C has previously been attempted in WI-38, WisL, BEAS-2B, A549, HeLa, MRC-5, primary human lung fibroblasts and bronchial, sinus and adenoidal epithelial cells with no success (Bochkov et al., 2011; Kistler et al., 2007a; McErlean et al., 2007). In other viruses, such as hepatitis C virus (HCV), which are challenging or impossible to grow in standard cell culture, genotyping has provided a useful alternative means of classification.

The studies presented within this chapter aimed to develop a system of classification informed by nucleotide divergence for all three species of HRV. In 2010, we published proposals for the classification of HRV-C into genotypically defined types (Simmonds et al., 2010). These have now been adopted by the Picornavirus Study Group as the primary mechanism for defining HRV-C types. In addition, we have extended this classification system to include HRV-A and HRV-B, leading to a full re-evaluation of type designations within the two species (McIntyre et al., 2013a). Since the original division of HRV into 100 serotypes, an additional 62 types have now been designated. These include 52 HRV types determined from the results of studies contained within this thesis. The term “serotype” implies identification and classification by direct investigation of antigenic properties. Therefore, we have simply used the term “type” throughout to represent HRV types which have been identified and classified by sequence data alone, as is common practice for newly identified EV types.

4.2 Materials and Methods

4.2.1 Selection of samples

A selection of both published and unpublished studies of HRV epidemiology and evolution in Edinburgh yielded a wealth of HRV VP4/VP2 sequences (Gaunt et al.,

95

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

2011; Wisdom et al., 2009b, 2009a; Harvala et al., 2012b; McIntyre et al., 2010). VP4/VP2 sequences from our studies (431 HRV-A and 113 HRV-B) were divided into groups based on phylogenetic clustering. Any isolate which did not show a close grouping with a prototype full genome was selected for amplification of the VP1 region.

In addition, the 508 available HRV-C VP4/VP2 sequences obtained from isolates within the SVC archive were divided into groups on the basis of phylogenetic clustering. Representatives from each cluster, presumed to be analogous to HRV-C types, were selected for VP1 amplification.

4.2.2 Amplification of the VP1 region

The full VP1 region of HRV-A and -B isolates was amplified and sequenced in two overlapping sequence fragments (as described in Chapter 2). For HRV-A28 samples, amplification of the 5ˈ fragment was not possible with the standard protocol and a specific inner anti-sense primer was designed. In all cases, where amplification with GoTaq polymerase (Promega, UK) was unsuccessful, it was attempted using a SuperScript III (Invitrogen, UK) protocol.

Two differing protocols were used for the amplification of the VP1 region of HRV-C during the course of this work. The first HRV-C VP1 PCR used the standard protocols described above and primers labelled as C VP1 PCR 1 (Appendix 1: Primer sequences). This resulted in two sequence fragments with a 105 base overlap (McIntyre et al., 2010). As this assay was only successful in amplifying 72 of 89 samples, a second outer anti-sense primer was developed (labelled as C VP1 PCR 2 in Appendix 1: Primer sequences). This improved assay allowed the amplification of the full HRV-C VP1 region and only failed to amplify one HRV-C isolate tested. The second round of this nested PCR reaction was carried out with an annealing temperature of 45°C and in a 50 μl reaction volume. PCR products were then

96

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types separated at 150V for 45 minutes on a 2% agarose gel and DNA bands of 1000 bases were excised manually under UV transillumination. Excised DNA bands were purified using the Illustra GFX PCR DNA and Gel Band Purification Kit (GE Healthcare, UK) and sequenced as previously described.

4.2.3 Sequence alignment, calculation of pairwise nucleotide p-distances and phylogenetic analysis All available HRV VP4/VP2 and VP1 sequences were downloaded from Genbank in November 2012. The genome regions analysed in this study included the full VP1 region and a fragment of VP4/VP2 that is commonly used in studies of HRV epidemiology (Savolainen-Kopra et al., 2009a; Kaida et al., 2011; Piralla et al., 2011; Xiang et al., 2010a; Han et al., 2009; Mizuta et al., 2010; Henquell et al., 2012; Miller et al., 2009a; Blomqvist et al., 2009). Any sequences which were less than 90% complete across these regions (Table 4.1) or classified as non-functional were excluded.

TABLE 4.1 : Co-ordinates of capsid coding genome regions used in analysis and corresponding HRV reference strain HRV Species Numbered by: VP4/VP2 Region VP1 Region HRV-A FJ445111 (HRV-A1) 627 – 1015 2337 – 3197 HRV-B X01087 (HRV-B14) 625 – 1013 2296 – 3192 HRV-C EF582385 (HRV-C4) 616 – 1004 2305 – 3125

All sequences were aligned in SSE v1.0 and pairwise nucleotide p-distances were calculated using the program SequenceDist within SSE v1.0 software package. Phylogenetic trees were constructed as previously described and all phylogenetic analysis was undertaken using NC_001430 (EV-D70) as an outgroup.

97

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.2.4 Analysis of recombination within the capsid region of HRV-A, -B and –C

Phylogenetic trees were visually inspected for evidence of incongruence between VP4/VP2 and VP1. Datasets containing these two non-consecutive regions were concatenated to give one continuous sequence, which was then additionally analysed with RDP and GARD software packages.

4.3 Results

4.3.1 Assigning nucleotide divergence thresholds in the VP1 region of HRV-A, -B and –C

In order to determine whether a distinct threshold that divided pairwise p-distance comparisons into intra- and inter-type values existed within each HRV species, distributions of pairwise nucleotide p-distances were constructed from a total of 435 HRV-A, 133 HRV-B and 206 HRV-C sequences (Figure 4.1). These showed a maximum within-species nucleotide p-distance of 39.4%, 33.9% and 42.9% respectively and minimum values between 10% and 14% divergence. Detailed inspection allowed identification of minimum values in the distribution and assignment of a nucleotide divergence threshold (marked in grey on Figure 4.1 A2, B2, C2). A threshold of 13% was identified for both HRV-A and HRV-C sequences, while the less divergent HRV-B isolates showed a minimum value that supported a 12% threshold. Although analysis of nucleotide divergence thresholds for HRV-C sequences was carried out in 2010 (Simmonds et al., 2010), the results presented herein represent a repeated analysis incorporating an additional 120 HRV-C VP1 sequences. However, despite larger scale of the renewed investigation, the observed nucleotide divergence threshold of 13% remains unchanged from our original proposals (Simmonds et al., 2010).

98

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

As the majority of the sequences included originated from a large variety of clinical and evolutionary studies and types show different prevalences, there was an unequal representation of different putative HRV types. The number of strains available for a specific type ranged from 1 to a maximum of 13 for HRV-A (A49), 10 for HRV-B (B69) and 22 for HRV-C (C3). Therefore, an additional analysis was carried out including only the prototype strains of all three species (coloured grey on Figure 4.2 A1, B1 , C1). Prototype strains of HRV-A and -B were the earliest isolated full genome sequence of a known type and in HRV-C, prototype strains were those defined in our original classification scheme (Simmonds et al., 2010). The distributions of pairwise nucleotide p-distances of prototype strains closely reproduced the larger analyses (Figure 4.1).

Distributions of pairwise nucleotide p-distances within the VP1 region of HRV-A and HRV-C sequences showed a single large peak representing inter-type comparisons. In contrast, HRV-B sequences had two large peaks within the inter- type distance range. Further analysis revealed that the smaller peak, between p- distance values of 0.190 and 0.260 corresponds to comparisons between HRV-B types within each of the four basally branching phylogenetic clusters (marked by grey boxes on Figure 4.3). The larger peak corresponds to comparisons between variants in different clusters.

When sequences from all three species of HRV were compared, the pairwise nucleotide p-distances fell into three distinct distributions (Figure 4.2). As expected, the lowest values represented comparisons from within the same type and the highest (between 43% and 56% divergence) were between isolates of different species. The large number of comparisons between 15% and 43% divergence represent isolates of the same species but differing type. The existence of a discrete inter-species p- distance threshold reinforces the notion that HRV species may be defined on the basis of nucleotide divergence in the capsid region.

99

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.1: Distributions of pairwise nucleotide p-distances for the VP1 region of all three species of HRV. HRV-A is shown in section A1, HRV-B in B1 and HRV-C in C1. Analysis including all available VP1 sequences is shown in black with y-axis scale on the left. Analysis including prototype strains only is shown in grey with y-axis scale on the right. A detailed analysis of the distribution values adjacent to the proposed thresholds for HRV-A (A2), HRV-B (B2) and HRV-C (C2) is also shown and the proposed nucleotide divergence threshold is shown by a red dotted line.

100

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.2: Distribution of pairwise nucleotide p-distances for the VP1 region of all HRV sequences.

4.3.2 Phylogenetic analysis of the VP1 region

Phylogenetic trees were constructed of the VP1 region of HRV-A, -B and –C (Figure 4.3). All HRV sequences were divided into groups which were defined by bootstrap supported phylogenetic clades that closely matched types assigned by sequence distances. These groups were numbered either by clustering with a known prototype strain (HRV-A/B 1-100) or by order of submission of full genome (HRV-A101-106, HRV-B100-104 and HRV-C1-11) or VP1 sequence (HRV-C12-51). The majority of sequences which grouped together within the same putative type group showed intra- clade VP1 divergence of less than the proposed thresholds of 13% in HRV-A and HRV-C and 12% in HRV-B (Table 4.2). Most of these also showed a minimum inter-clade VP1 divergence (with the nearest neighbour type group) of greater than the proposed threshold. However, a number of HRV type groups (13, 2 and 1 in HRV-A, -B and –C respectively) violated the proposed VP1 thresholds (marked in red on Table 4.2 and further discussed in section 4.4.4).

101

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.3: Phylogenetic trees showing the VP1 region of HRV-A, -B and –C. Putative type groups and branch to tree root has been collapsed for ease of reference. Within HRV-B, phylogenetic clusters with pairwise p-distances ranging from 0.190 to 0.260 are bounded by grey boxes. Instances where two types have been combined to form a single type are marked in yellow (see Section 4.3.3). New HRV types defined on the basis of sequence divergence in VP1 are marked in red and HRV types with intermediate VP1 divergence are marked in purple. Branches are scaled by genetic distance.

102

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

TABLE 4.2: Limits of intra- and inter-clade VP1 p-distance for HRV-A, -B and -C

HRV Max. Min. a Type intra- inter- clade b clade distance distancec A1 0.091 0.119 A67 0.095 0.178 B102 0 0.163 A1B 0.109 0.119 A68 0.104 0.150 B103 0.035 0.113 A2 0.106 0.141 A71 0.005 0.172 B104 0.085 0.121 A7 0.005 0.182 A73 0.003 0.182 C1 0.052 0.229 A8 0.004 0.015 A74 0.006 0.164 C2 0.057 0.165 A9 0.109 0.148 A75 0.079 0.161 C3 0.039 0.146 A10 0.099 0.170 A76 0.092 0.161 C4 0.016 0.139 A11 0 0.172 A77 0.001 0.185 C5 0.042 0.170 A12 0.099 0.263 A78 0.116 0.263 C6 0.066 0.146 A13 0.082 0.171 A80 0.087 0.203 C7 0.075 0.152 A15 0.103 0.163 A81 0.129 0.172 C8 0.132 0.219 A16 0.089 0.172 A82 0.105 0.195 C9 0.110 0.282 A18 0.086 0.175 A85 0.006 0.159 C10 0.037 0.205 A19 0.098 0.227 A88 0.001 0.182 C11 0 0.170 A20 0.099 0.150 A89 0.088 0.120 C12 0.101 0.197 A21 0.095 0.205 A90 0 0.174 C13 0.005 0.264 A22 0.113 0.190 A94 0 0.158 C14 0 0.215 A23 0.107 0.143 A95 0 0.015 C15 0.092 0.271 A24 0.076 0.131 A96 0 0.180 C16 0.044 0.147 A25 0.006 0.093 A98 0 0.107 C17 0.018 0.147 A28 0.112 0.228 A100 0 0.147 C18 0.053 0.193 A29d 0.007 0.072 A101 0.061 0.261 C19 0.072 0.302 A30 0.111 0.143 A102 0 0.151 C20 0.007 0.151 A31 0.117 0.140 A103 0.069 0.216 C21 0.009 0.152 A32 0.001 0.148 A104 0 0.131 C22 0.037 0.250 A33 0.103 0.161 A105 0.067 0.129 C23 0.081 0.269 A34 0.096 0.158 A106 0 0.172 C24 0 0.227 A36 0.115 0.120 B3 0.005 0.218 C25 0.109 0.227 A38 0.094 0.192 B4 0.094 0.196 C26 0.039 0.261 A39 0.009 0.234 B5 0.014 0.166 C27 0.031 0.224 A40 0.121 0.159 B6 0.088 0.113 C28 0.034 0.193 A41 0.006 0.171 B14 0.018 0.186 C29 0 0.239 A43 0.110 0.161 B17 0.008 0.124 C30 0 0.184

103

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Max. Min. a Type intra- inter- clade b clade distance distancec A44d 0.007 0.072 B26 0.002 0.186 C31 0 0.219 A45 0.081 0.252 B27 0.003 0.176 C32 0.028 0.136 A46 0.006 0.203 B35 0.013 0.191 C33 0.021 0.189 A47 0.122 0.140 B37 0.078 0.147 C34 0.027 0.151 A49 0.100 0.141 B42 0.098 0.167 C35 0 0.326 A50 0.002 0.158 B48 0.110 0.188 C36 0.111 0.256 A51 0.099 0.190 B52 0.006 0.121 C37 0 0.207 A53 0.103 0.222 B69 0.102 0.181 C38 0.034 0.174 A54 0.087 0.107 B70 0.116 0.124 C39 0.026 0.136 A55 0.056 0.207 B72 0.099 0.186 C40 0.070 0.165 A56 0 0.201 B79 0.005 0.192 C41 0.075 0.184 A57 0 0.129 B83 0.088 0.169 C42 0.086 0.241 A58 0.086 0.157 B84 0.101 0.223 C43 0.116 0.247 A59 0.089 0.136 B86 0.105 0.218 C44 0 0.197 A60 0.098 0.197 B91 0.085 0.143 C45 0.043 0.239 A61 0.102 0.180 B92 0.005 0.169 C46 0 0.256 A62 0.007 0.093 B93 0.002 0.176 C47 0 0.222 A63 0 0.174 B97 0.004 0.194 C48 0 0.189 A64 0.006 0.158 B99 0.001 0.182 C49 0.031 0.139 A65 0.065 0.150 B100 0 0.166 C50 0 0.286 A66 0.085 0.130 B101 0.016 0.163 C51 0.005 0.207 a For ease of reference, the prefix HRV- has been omitted from list of HRV type designations. Comparisons of prototype strains which show pairwise p-distance values that violate the proposed thresholds are shown in red. b Maximum VP1 p-distance observed within a bootstrap supported phylogenetic clade or putative type group. c Minimum VP1 p-distance observed between one particular bootstrap supported phylogenetic clade and it’s nearest neighbour d In cases where contemporary strains group separately from prototype strains of two types, which should potentially be combined (A29/A44), the p-distance listed involves the prototype strains only.

4.3.3 Identification of putative new HRV types on the basis of sequence divergence in VP1

Three new types HRV-A and one of HRV-B have been previously defined on the basis of sequence divergence in VP1; HRV-A101 (Rathe et al., 2010), HRV-A102

104

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

(de Vries et al., 2008), HRV-A103 and HRV-B100 (Linsuwanon et al., 2011). In addition, the current analysis has yielded three putative new HRV-A types (tentatively labelled HRV-A104 – A106) and four novel HRV-B types (HRV-B101 – B104). These are marked in red on phylogenetic trees presented (Figure 4.3).

HRV-C sequences are now divided into 51 distinct types on the basis of sequence divergence in the VP1 region (Table 4.3). HRV-C1-33 were first defined in 2010 (Simmonds et al., 2010), while HRV-C36-46 were subsequently recognised in 2012 (McIntyre et al., 2013b). In addition, HRV-C34, C35 and C47-51 have been submitted from other sources and confirmed as distinct types by the Picornavirus Study Group (www.picornaviridae.com).

Several of the putative new HRV-A and –B types included showed a lower limit of inter-clade VP1 nucleotide divergence that was close to or below the 13% and 12% thresholds respectively (Table 4.2). For example, the 6 available sequences from the novel group putatively designated as HRV-B103 showed VP1 distances from their nearest neighbour type (HRV-B6) ranging from 0.113 to 0.148, with maximum intra- type pairwise p-distances for HRV-B6 and HRV-B103 of 0.088 and 0.035 respectively. For these two types, only two pairwise comparisons fell below the 12% threshold and both involved sequence F110_9318. When this single sequence is excluded from the analysis, the minimum value was 0.126. This discrepancy might have originated from sequencing or assembly errors, an evolutionarily intermediate sequence between HRV-B6 and HRV-B103 arising during their subsequent differentiation into distinct types or a chimaera generated by inter-type recombination of closely related types.

In contrast to the borderline VP1 divergence observed between HRV-B6 and HRV- B103, new HRV-B types -B100-102 showed pairwise p-distances in VP1 of 0.214, 0.190 and 0.227 from their nearest neighbour (HRV-B35). The novel type

105

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types designation HRV-B104 represents a division of a previously single type group (HRV-B52). This division is based on the existence of two discrete phylogenetic clades with VP1 p-distance of 0.121 to 0.142 between them. In this case, the clade containing the original ATCC isolates, including strains FJ445188 and EF173424 has retained the HRV-B52 type designation, whereas the clade containing field isolate FJ445137 and JF781506 has been designated HRV-B104.

The three putative new types detected within available HRV-A sequences consisted of two groups with borderline VP1 divergence (HRV-A104/A105) and one which had a p-distance of 0.172 from its nearest neighbour, HRV-A71 (HRV-A106). While HRV-A104 can technically be considered to be a borderline new HRV-A type, the lowest inter-clade VP1 p-distance observed between it and HRV-A24 was just above the proposed threshold (0.131). The six available HRV-A105 VP1 sequences group in a bootstrap supported clade which is closely related to, but distinct from the HRV-A57 prototype sequence, FJ445141. Pairwise VP1 p-distances range from 0.129 to 0.140 between the two groups. Only two sequences within HRV-A105 had a pairwise VP1 p-distance of below 13% with HRV-A57; AY355238 and AY450495. HRV-A105 has therefore been classified as a putative new HRV-A type with intermediate VP1 divergence.

4.3.4 Analysis of HRV type groups which do not conform to the proposed VP1 divergence thresholds

With the definition of VP1 nucleotide divergence thresholds for EV, several pairs of types showing divergence below this value were combined into single types. In many cases, this reassignment into single types was consistent with demonstrated serological cross-reactivity. As examples from human EV classification, CAV13/CAV18 and CAV11/CAV15 have been reclassified as CAV13 and CAV11 respectively (Brown et al., 2003). Analysis of pairwise VP1 divergence between

106

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types previously recognised HRV types revealed several similar examples. For example, HRV-A8 and HRV-A95 have been combined and renamed as HRV-A8, due to an inter-type VP1 p-distance range of 0.015 to 0.019 and previously documented serological cross-reactivity (Ledford et al., 2004) (Figure 4.4 A1).

Among other candidates, HRV-A29 and HRV-A44 have previously demonstrated cross-reactivity (Cooney et al., 1982) and the two clades containing the prototype strains had an inter-type VP1 p-distance of 0.0724 (Figure 4.4 A2). However, contemporary strains isolated between 2003 and 2009 grouped separately from both serotype groups and cannot be classified as either by VP1 p-distance. When both contemporary and prototype strains are considered, the A29/A44 group had a maximum divergence of 0.159 (involving the most divergent strain FJ042596). All instances of intra-clade VP1 divergence of more than 13% involved one of the two most divergent strains, FJ042596 or FJ042606 compared with another contemporary strain. However, all contemporary strains had a VP1 p-distance with the A29 and A44 prototype strains (FJ445125 and DQ473499) that is below the threshold of 13%. The most divergent strains mentioned are represented in each case by one sequence only and therefore it is not possible to determine at this stage whether these represent variants that happen to be at the extreme end of the distribution of intra-type divergence values or genuine intermediates in the process of diverging into a new HRV type. As data accumulates from further epidemiological and evolutionary studies, this may become clear. However, due to the well supported phylogenetic grouping with other strains within this group and the p-distance comparisons with prototype strains that universally fell below the threshold of 13%, this entire group of sequences has been combined and renamed as HRV-A29.

HRV-A54 and HRV-A98 are a third type pair that are more closely related than expected, with a pairwise nucleotide VP1 p-distance of 0.107 to 0.124. Unfortunately, the cross-neutralisation properties of these two serotypes have not

107

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types been analysed in previous studies (Cooney et al., 1982; Ledford et al., 2004). However, on the basis of close phylogenetic relatedness (Figure 4.4 A3) and analysis of VP1 divergence, we would suggest that these may be combined.

These three combined pairs of classical types may have been erroneous classifications in original studies (marked in yellow on Figure 4.3). In the new proposed HRV classification system, they have been assigned to a type number of the earliest submitted full genome. This therefore removes designations HRV-A44, A54 and A95. In order to avoid confusion arising from published literature referring to the now abolished type designations, we would propose retaining these gaps in the numbering scheme.

108

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.4: Neighbour joining phylogenetic trees showing classical HRV type pairs which do not conform to the proposed VP1 divergence thresholds. For ease of reference, the branch to tree root has been collapsed and the previously recognised prototype strain of each HRV type is highlighted in blue. Three HRV-A type pairs which have been combined are shown in A1, A2 and A3. Four HRV type pairs which display intermediate divergence values are shown in B1, B2, B3 and B4. Branches are scaled by genetic distance.

In addition to those type pairs which have been combined into single type assignments, a number of other HRV types display intermediate divergence values that violate the proposed thresholds. HRV-A1/A1B are serologically cross-reactive (Cooney et al., 1982) but fall into two discrete phylogenetic groups and have pairwise p-distances that fall above the VP1 threshold (Figure 4.4 B1). While the overall inter-clade distance ranges from 0.119 to 0.149, analysis of the prototype strains alone (A1: FJ445111 and A1B: D00239) showed a distance of 0.122. While we suggest that these two clades be considered as HRV-A1, this type designation should be reviewed as more sequence data becomes available.

As a contrasting problem, HRV-A25/A62 showed inter-clade VP1 pairwise distances ranging from 0.093 to 0.103 (Figure 4.4 B3), despite no evidence of cross- neutralisation between these two serotypes (Cooney et al., 1982). Therefore, this pair have retained their current type designation and been highlighted as an intermediate type pair. Future studies should include re-assessment of the serological relationship between these two types and their designation revised accordingly.

HRV-A36 and HRV-A89 strains fell into two distinct bootstrap supported phylogenetic clades and the prototype strains (DQ473508 and M162488) showed a VP1 p-distance of 0.129, on the borderline of being classified as the same type. However, although serum specific to HRV-A36 strains can neutralize HRV-A89, - A50 and –A58, there was no reciprocal neutralisation (Cooney et al., 1982). Analysis of all nine known strains of HRV-A36 and five known strains of HRV-A89 yielded an inter-clade VP1 divergence ranging from 0.117 to 0.172. We would

109

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types suggest considering these two type designations as distinct entities with an intermediate nucleotide divergence in VP1.

Only one HRV-C type showed p-distance values that violated proposed thresholds. HRV-C8 strains were divided into two clades, one previously classified as HRV- Cpat28 and both containing contemporary sequences (Figure 4.4 B4). The VP1 p- distance between the two clades ranged from 0.129 to 0.132. In consultation with the chairman of the Picornavirus Study Group (personal communication, N. Knowles, Picornavirus Study Group), it was decided that these two clades should remain classified as HRV-C8 on the basis of borderline VP1 divergence.

FIGURE 4.5: Distribution of HRV-A VP1 pairwise p-distances immediately surrounding the threshold of 13%. VP1 p-distances between or within types which violate the proposed thresholds are marked in different colours. Types which should be combined are shown in green, while those which remain discrete HRV types have inter-type distances falling below the threshold are pink. Intra-type comparisons falling above the divergence threshold are shown in purple. 110

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

Despite the large number of sequences available for all three species, HRV-A alone showed considerable evidence for VP1 pairwise p-distance comparisons that violate the proposed divergence thresholds. Of these, the majority represent comparisons between types which should be combined (labelled in green on Figure 4.5). Only 6 discrete HRV-A types showed inter-clade distances which are below the threshold (labelled in pink on Figure 4.5) and intra-clade distances which are above the threshold (labelled in purple on Figure 4.5). We would suggest the combined use of phylogenetic analysis and pairwise VP1 p-distance analysis in the classification of all HRV sequences, especially those suspected to be members of intermediate type groups.

4.3.5 Assignment of type designations based on VP1 divergence for all three species of HRV The conclusion of the analysis of all available HRV VP1 data divided sequences into 77 HRV-A types, 29 HRV-B and 13 HRV-C on the basis of full genome sequences (Table 4.3). In addition, a further 38 HRV-C types were defined with only VP1 sequence data available (Table 4.4). The prototype sequence designated was the earliest submitted full genome or VP1 sequence (labelled in red in Table 4.3 and Table 4.4). In cases where an HRV-C type originally defined on the basis of VP1 sequence now has a full genome sequence available, the type has retained the first submitted VP1 sequence as the prototype strain (Simmonds et al., 2010). In addition, certain inconsistencies in naming exist where novel strains have been submitted by individual researchers at separate times. These have been classified based on date of submission to the Picornavirus Study Group, rather than the earliest isolated strain. These inconsistencies are unfortunately unavoidable and should remain in place for clarity.

111

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

TABLE 4.3: Confirmed HRV-A, -B and –C types with full genome sequences available Full Genome Sequences HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifierb submissionb genome variantsc strain Numbera variants reference HRV-A1 D00239 HRV 1B 1/2/00 4 14 (Hughes et al., FJ445111 1988) JN815255 JN837694 HRV-A2 X02316 HRV 2 14/11/06 1 4 (Skern and Sommergruber , 1985) HRV-A7 DQ473503 TC-65007 27/6/07 2 4 (Kistler et al., FJ445176 2007b) HRV-A8 FJ445113d ATCC 10/4/09 2 6 (Palmenberg FJ445170 VR-1118 et al., 2009) HRV-A9 FJ445177e ATCC 10/4/09 3 10 (Palmenberg FJ445114 VR-489 et al., 2009) FJ445115 HRV-A10 DQ473498 TC-71602 27/6/07 6 7 (Kistler et al., FJ445178 2007b) JN798575 JN798582 JN541269 JN815247 HRV-A11 EF173414 HRV 11 15/7/07 1 3 (Tapparel et al., 2007) HRV-A12 EF173415 HRV 12 15/7/07 3 7 (Tapparel et HQ123441 al., 2007) JF781511 HRV-A13 FJ445116 ATCC 10/4/09 2 4 (Palmenberg FJ445117 VR-1123 et al., 2009) HRV-A15 DQ473493 TC-66919 27/6/07 2 4 (Kistler et al., JN541268 2007b) HRV-A16 L24917 HRV 16 14/7/95 8 10 (Lee et al., JN562722 1995) JN614992 JN798564 JN798574 JN815253 JN990704 JX074057 HRV-A18 FJ445118 ATCC 10/4/09 3 6 (Palmenberg JF781496 VR-1128 et al., 2009) JF781508 HRV-A19 FJ445119 ATCC 10/4/09 3 6 (Palmenberg JQ747746 VR-1129 et al., 2009) JQ747750

112

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-A20 FJ445120 ATCC 10/4/09 5 8 (Palmenberg JN541270 VR-1130 et al., 2009) JN614993 JN798571 JQ994494 HRV-A21 FJ445121 ATCC 10/4/09 3 7 (Palmenberg JN837693 VR-1131 et al., 2009) JQ747747 HRV-A22 FJ445122 ATCC 10/4/09 1 4 (Palmenberg VR-1132 et al., 2009) HRV-A23 DQ473497 TC-65847 27/6/07 4 6 (Kistler et al., JN621244 2007b) JN815254 JN837694 HRV-A24 EF173416 HRV 24 15/6/07 3 8 (Tapparel et FJ445190 al., 2007) JN798563 HRV-A25 FJ445123 ATCC 10/4/09 1 3 (Palmenberg VR-1135 et al., 2009) HRV-A28 DQ473508 TC-65852 27/6/07 4 11 (Kistler et al., JN798577 2007b) JN798580 JQ747751 HRV-A29 DQ473499d TC-72548 27/6/07 2 12 (Kistler et al., FJ445125 2007b) HRV-A30 DQ473512 TC-67821 27/6/07 2 10 (Kistler et al., FJ445179 2007b) JN798557 HRV-A31 FJ445126 ATCC 10/4/09 1 7 (Palmenberg VR-506 et al., 2009) HRV-A32 FJ445127 ATCC 10/4/09 1 3 (Palmenberg VR-1142 et al., 2009) HRV-A33 FJ445128 ATCC 10/4/09 3 5 (Palmenberg JN815250 VR-330 et al., 2009) JN990707 HRV-A34 DQ473501 TC-65856 27/6/07 4 6 (Kistler et al., FJ445189 2007b) JF781510 JF781512 JN562720

113

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-A36 DQ473505 TC-74313 27/6/07 11 15 (Kistler et al., JF781497 2007b) JN614994 JN621243 JN798583 JN798584 JN815241 JN815242 JN815246 JN837697 JX074050 HRV-A38 DQ473495 TC-72617 27/6/07 3 5 (Kistler et al., FJ445180 2007b) JQ994496 HRV-A39 AY751783 HRV 39 12/4/05 1 3 (Harris and Racaniello, 2005) HRV-A40 FJ445129 ATCC 10/4/09 4 6 (Palmenberg JN798579 VR-341 et al., 2009) JQ245967 JX074051 HRV-A41 DQ473491 TC-66066 27/6/07 1 3 (Kistler et al., 2007b) HRV-A43 FJ445131 ATCC 10/4/09 2 5 (Palmenberg JN815237 VR-1153 et al., 2009) HRV-A45 FJ445132 ATCC 10/4/09 1 5 (Palmenberg VR-1155 et al., 2009) HRV-A46 DQ473506 TC-75908 27/6/07 1 3 (Kistler et al., 2007b) HRV-A47 FJ445133 ATCC 10/4/09 3 10 (Palmenberg GQ223229 VR-1157 et al., 2009) JN837692 HRV-A49 DQ473496 TC-66958 27/6/07 5 13 (Kistler et al., FJ445134 2007b) JN621241 JN798561 JN798589 HRV-A50 FJ445135 ATCC 10/4/09 1 3 (Palmenberg VR-517 et al., 2009) HRV-A51 FJ445136 ATCC 10/4/09 2 4 (Palmenberg JN562725 VR-1161 et al., 2009) HRV-A53 DQ473507 TC-67618 27/6/07 2 4 (Kistler et al., JN798587 2007b) HRV-A54 FJ445138 ATCC 10/4/09 3 7 (Palmenberg FJ445139 VR-1164 et al., 2009) FJ445173

114

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-A55 DQ473511 TC-64522 27/6/07 2 4 (Kistler et al., JQ837718 2007b) HRV-A56 FJ445140 ATCC 10/4/09 2 3 (Palmenberg EU840727 VR-1166 et al., 2009) HRV-A57 FJ445141 Fs ship#1 10/4/09 1 1 (Palmenberg et al., 2009) HRV-A58 FJ445142 ATCC 10/4/09 2 5 (Palmenberg JX025558 VR-1168 et al., 2009) HRV-A59 DQ473500 TC-70403 27/6/07 2 4 (Kistler et al., JN541266 2007b) HRV-A60 FJ445143 ATCC 10/4/09 2 4 (Palmenberg JN798590 VR-1473 et al., 2009) HRV-A61 FJ445144 ATCC 10/4/09 2 6 (Palmenberg JN798560 VR-1171 et al., 2009) HRV-A62 FJ445145 ATCC 10/4/09 1 4 (Palmenberg VR-1172 et al., 2009) HRV-A63 FJ445146 ATCC 10/4/09 1 3 (Palmenberg VR-1173 et al., 2009) HRV-A64 EF173417 HRV 64 15/7/07 2 4 (Tapparel et FJ445181 al., 2007) HRV-A65 FJ445147 ATCC 10/4/09 3 5 (Palmenberg JF781504 VR-1175 et al., 2009) JQ245966 HRV-A66 FJ445148 ATCC 10/4/09 4 6 (Palmenberg JN112340 VR-1176 et al., 2009) JN621246 JQ837715 HRV-A67 FJ445149 ATCC 10/4/09 1 6 (Palmenberg JN621245 VR-1177 et al., 2009) HRV-A68 FJ445150 ATCC 10/4/09 2 4 (Palmenberg JN798578 VR-1178 et al., 2009) HRV-A71 FJ445152 ATCC 10/4/09 1 3 (Palmenberg VR-1181 et al., 2009) HRV-A73 DQ473492 TC-73172 27/6/07 1 3 (Kistler et al., 2007b) HRV-A74 DQ473494 TC-70882 27/6/07 1 3 (Kistler et al., 2007b) HRV-A75 DQ473510 TC-70308 27/6/07 3 5 (Kistler et al., JF781503 2007b) JN837690 HRV-A76 DQ473502 TC-70326 27/6/07 5 8 (Kistler et al., FJ445182 2007b) JN815238 JX074049 JX074055

115

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-A77 FJ445154 ATCC 10/4/09 1 3 (Palmenberg VR-1187 et al., 2009) HRV-A78 EF173418 HRV 78 15/7/07 2 5 (Tapparel et FJ445183 al., 2007) HRV-A80 FJ445156 ATCC 10/4/09 4 6 (Palmenberg JN798586 VR-1190 et al., 2009) JN798576 JN990705 HRV-A81 FJ445157 ATCC 10/4/09 4 6 (Palmenberg FJ445158 VR-1191 et al., 2009) FJ445159 HQ123442 HRV-A82 DQ473509 HRV A 27/6/07 5 7 (Kistler et al., FJ445160 complete 2007b) JN798556 genome JN798585 JQ837722 HRV-A88 DQ473504 TC-70782 27/6/07 1 3 (Kistler et al., 2007b) HRV-A89 M16248 HRV 89 7/2/03 6 7 (Duechler et FJ445165 al., 1987) FJ445166 FJ445184 JQ837716 JQ837719 HRV-A90 FJ445167 ATCC 10/4/09 1 3 (Palmenberg VR-1291 et al., 2009) HRV-A94 EF173419 HRV 94 15/7/07 2 4 (Tapparel et FJ445185 al., 2007) HRV-A96 FJ445171 ATCC 10/4/09 1 3 (Palmenberg VR-1296 et al., 2009) HRV- FJ445175 ATCC 10/4/09 1 3 (Palmenberg A100 VR-1300 et al., 2009) HRV- GQ415051 HRV- 13/1/10 3 3 (Rathe et al., A101 GQ415052 A101 2010) JQ245965 HRV- EF155421 AMS323 31/7/10 1 1 (de Vries et A102 al., 2008) HRV- JF965515 WA327E 2/6/11 3 3 Unpub. A103 JQ747749 /09 JQ994499 HRV- JN562727 P1025 1/8/12 3 3 Unpub. A104 JX074047 sR2625 JX193797 2009

116

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV- JN614995 P1064 1/8/12 2 2 Unpub. A105 JN990699 sR985 2009 HRV- JQ245971 P1042 16/5/12 2 2 Unpub. A106 JX025555f sR141 2008 HRV-B3 DQ473485 TC-64701 27/6/07 2 5 (Kistler et al., EF173422 2007b) HRV-B4 DQ473490 HRV4 27/6/07 2 4 (Kistler et al., JN798573 2007b) HRV-B5 FJ445112 ATCC 10/4/09 1 3 (Palmenberg VR-485 et al., 2009) HRV-B6 DQ473486 TC-65006 27/6/07 5 8 (Kistler et al., JN562723 2007b) JQ747745 JQ747748 JNX193795 HRV-B14 L05355 HRV 14 11/6/93 3 4 (Lee et al., K02121 1993) X01087 HRV-B17 EF173420 HRV 17 15/7/07 1 3 (Tapparel et al., 2007) HRV-B26 FJ445124 ATCC 10/4/09 1 3 (Palmenberg VR-1136 et al., 2009) HRV-B27 EF173421 HRV 27 15/7/07 2 3 (Tapparel et FJ445186 al., 2007) HRV-B35 DQ473487 TC-73280 27/6/07 2 4 (Kistler et al., FJ445187 2007b) HRV-B37 EF173423 HRV 37 15/7/07 1 4 (Tapparel et al., 2007) HRV-B42 FJ445130 ATCC 10/4/09 4 6 (Palmenberg JN562724 VR-338 et al., 2009) JN781498 JN781507 HRV-B48 DQ473488 TC-70326 27/6/07 2 4 (Kistler et al., JN990698 2007b) HRV-B52 EF173424 HRV 52 15/7/07 2 4 (Tapparel et FJ445188 al., 2007) HRV-B69 FJ445151 ATCC 10/4/09 4 10 (Palmenberg HQ123445 VR-1179 et al., 2009) JN562721 JQ245970 HRV-B70 DQ473489 TC-72589 27/6/07 2 6 (Kistler et al., JQ245974 2007b)

117

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-B72 FJ445153 ATCC 10/4/09 6 8 (Palmenberg GU968948 VR-1182 et al., 2009) JN562726 JN614997 JN798562 JQ245969 HRV-B79 FJ445155 ATCC 10/4/09 1 3 (Palmenberg VR-1189 et al., 2009) HRV-B83 FJ445161 ATCC 10/4/09 2 4 (Palmenberg JN990701 VR-1193 et al., 2009) HRV-B84 FJ445162 ATCC 10/4/09 7 9 (Palmenberg JF781499 VR-1194 et al., 2009) JF781502 JN541271 JN614991 JQ837723 JX074048 HRV-B86 FJ445164 ATCC 10/4/09 1 3 (Palmenberg VR-1196 et al., 2009) HRV-B91 FJ445168 ATCC 10/4/09 1 4 (Palmenberg VR-1292 et al., 2009) HRV-B92 FJ445169 ATCC 10/4/09 1 3 (Palmenberg VR-1293 et al., 2009) HRV-B93 EF173425 HRV 93 15/7/07 1 3 (Tapparel et al., 2007) HRV-B97 FJ445172 ATCC 10/4/09 1 3 (Palmenberg VR-1297 et al., 2009) HRV-B99 FJ445174 ATCC 10/4/09 1 3 (Palmenberg VR-1299 et al., 2009) HRV- HQ123444 CU211 23/1/11 1 1 (Linsuwanon B100 et al., 2011) HRV- JF781500 P1156 1/8/12 3 4 Unpub. B101 JF781501 sR810 JX074052 2007 HRV- JX074052 P1044 8/6/12 1 1 Unpub. B102 sR122 2007 HRV- JN614996 P1160 1/8/12 5 6 Unpub. B103 JN798572 sR1153 JQ245972 2009 JQ837717 JQ994497 HRV- FJ445137 F10 10/4/09 2 4 (Palmenberg B104 JF781506 et al., 2009) HRV-C1 EF077279 NAT001 20/10/06 2 5 (Kistler et al., HQ123443 2007a)

118

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Strain Date of Full VP1 Prototype Type Accession Identifier submission genome variants strain Number variants reference HRV-C2 EF077280 NAT043 20/10/06 6 6 (Kistler et al., JN815248 2007a) JN837695 JN990703 JQ245968 JX025557 HRV-C3 EF186077 QPM 14/12/06 3 22 (McErlean et JN798567 al., 2007) JN990700 HRV-C4 EF582385 024 27/4/07 2 2 (Lau et al., JF781509 2007) HRV-C5 EF582386 025 27/4/07 1 3 (Lau et al., 2007) HRV-C6 EF582387 026 27/4/07 3 14 (Lau et al., JF317016 2007) JN990702 HRV-C7 DQ875932 NY-074 14/7/08 5 7 (Lamson et al., JN798559 2006) JN798570 JN837689 JX025556 HRV-C8 GQ223227 N4 29/5/09 3 4 (Huang et al., JQ245964 2009) JQ245973 HRV-C9 GQ223228 N10 29/5/09 1 8 (Huang et al., 2009) HRV-C10 GQ323774 QCE 29/6/09 1 6 (Arden et al., 2010) HRV-C11 EU840952 CL- 21/5/10 1 1 (Tapparel et 170085 al., 2009a) HRV-C35 JF436925 Subtype 35 13/6/12 1 1 (Lysholm et al., 2012) HRV-C51 JF317015 LZ508 2 2 Unpub. JX291115 a Designated prototype strain is shown in red. b Strain identifier and date of submission both refer to prototype strain. c Number of VP1sequences available including full genomes. d Where some classical HRV type groups have been combined, the new prototype strain given is the earliest isolated full genome of either classical type. e In cases where more than one full genome has been sequenced and submitted concurrently, the full genome derived from amplification of an original ATCC strain has been designated as the prototype strain. f JX025555 has been designated as the prototype strain of HRV-A104, as the other available polyprotein sequence JQ245971 has a large gap within the VP1 region.

119

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

TABLE 4.4: Confirmed HRV–C types defined by VP1 sequence

Full VP1 sequences Genome Sequences HRV Genbank Genbank Strain Date of VP1 Prototype Type Accession Accession Identifier submission variants strain Number Number reference HRV-C12 JF317017 HM236958 Resp_3922/07 14/5/10 7 (McIntyre et al., 2010) HRV-C13 - HM236908 Resp_2951/06 14/5/10 2 (McIntyre et al., 2010) HRV-C14 - HM236911 Resp_3090/06 14/5/10 1 (McIntyre et al., 2010) HRV-C15 GU219984 HM236963 Resp_4644/07 14/5/10 4 (McIntyre et JF317014 al., 2010) JN837688 HRV-C16 - HM236944 Resp_5910/07 14/5/10 3 (McIntyre et al., 2010) HRV-C17 JN815240 HM236936 Resp_5145/07 14/5/10 3 (McIntyre et JN815244 al., 2010) HRV-C18 - HM236918 Resp_3631/07 14/5/10 6 (McIntyre et al., 2010) HRV-C19 - EU840728 CL-Fnp5 20/6/08 3 (Tapparel et al., 2009a) HRV-C20 - HM236923 Resp_3995/07 14/5/10 2 (McIntyre et al., 2010) HRV-C21 - HM236903 Resp_5071/07 14/5/10 2 (McIntyre et al., 2010) HRV-C22 JN621242 HM236905 Resp_2748/06 14/5/10 3 (McIntyre et al., 2010) HRV-C23 - HM236901 Resp_3053/06 14/5/10 2 (McIntyre et al., 2010) HRV-C24 - HM236939 Resp_7147/07 14/5/10 1 (McIntyre et al., 2010) HRV-C25 HQ123440 HM236952 Resp_2832/06 14/5/10 4 (McIntyre et JF317013 al., 2010) JN837685 HRV-C26 JX193796 HM236904 Resp_2514/06 14/5/10 3 (McIntyre et al., 2010) HRV-C27 - HM236906 Resp_2784/06 14/5/10 4 (McIntyre et al., 2010) HRV-C28 JN798569 HM236954 Resp_3105/06 14/5/10 2 (McIntyre et al., 2010) HRV-C29 - HM236949 Resp_5345/07 14/5/10 1 (McIntyre et al., 2010) HRV-C30 - HM236968 Resp_3898/07 14/5/10 1 (McIntyre et al., 2010)

120

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Genbank Genbank Strain Date of VP1 Prototype Type Accession Accession Identifier submission variants strain Number Number reference HRV-C31 - HM236964 Resp_4923/07 14/5/10 1 (McIntyre et al., 2010) HRV-C32 JN798581 HM236897 Resp_6131/07 14/5/10 12 (McIntyre et JQ994498 al., 2010) HRV-C33 - HM236934 Resp_4917/07 14/5/10 2 (McIntyre et al., 2010) HRV-C34 JF436926 JF519798 PNG7111- 22/6/12 2 (Chidlow et 3625 al., 2012) HRV-C36 JN541267 JF416311 Resp_2480/07 22/8/11 5 (McIntyre et al., 2013b) HRV-C37 - JF416321 Resp_6135/08 22/8/11 1 (McIntyre et al., 2013b) HRV-C38 JN837691 JF416322 Resp_6142/08 22/8/11 2 (McIntyre et al., 2013b) HRV-C39 - JF416306 Resp_10221/08 22/8/11 2 (McIntyre et al., 2013b) HRV-C40 JF781505 JF416312 Resp_2800/06 22/8/11 6 (McIntyre et JN815251 al., 2013b) HRV-C41 JN798565 JF416323 Resp_9449/08 22/8/11 3 (McIntyre et al., 2013b) HRV-C42 JQ994500 JF416320 Resp_5477/08 22/8/11 2 (McIntyre et al., 2013b) HRV-C43 JN837687 JF416307 Resp_13229/08 22/8/11 4 (McIntyre et JX074056 al., 2013b) HRV-C44 - JF416310 Resp_15588/09 22/8/11 1 (McIntyre et al., 2013b) HRV-C45 JN837686 JF416308 Resp_13958/08 22/8/11 3 (McIntyre et al., 2013b) HRV-C46 - JF416318 Resp_5153/07 22/8/11 2 (McIntyre et al., 2013b) HRV-C47 - JF519760 PNG7254- 22/6/12 1 (Chidlow et 3947 al., 2012) HRV-C48 - JF519762 PNG7293- 22/6/12 1 (Chidlow et 3193 al., 2012) HRV-C49 JF907574 JF946738 WA11040M 22/10/11 3 (Wiertsema et al., 2011) HRV-C50 - JQ739202 SG1 24/4/12 1 Unpub.

121

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.3.6 Analysis of recombination within the capsid coding region of HRV and phylogenetic analysis of the VP4/VP2 region

Within EV isolates, recombination has been documented within the capsid coding region (Bouslama et al., 2007) and therefore the VP1 region alone is deemed suitable for typing. However, recombination within the capsid coding region of HRV is thought to be relatively infrequent (Palmenberg et al., 2009; Kistler et al., 2007b; Lewis-Rogers et al., 2009).

The well conserved and readily amplified VP4/VP2 region of HRV is commonly used in studies of its epidemiology and clinical associations. Recent and rapid accumulation of sequence data has created over 3900 HRV sequences in the region available on GenBank compared to 384 sequences of the VP1 region, a region that is problematic to amplify without species or type-specific primers. Therefore, in view of this imbalance, it is important to verify that known HRV types can be correctly identified through analysis of the VP4/VP2 region if sequence data from this region is to be employed in epidemiological and clinical studies.

The majority of HRV strains grouped congruently between VP1 and VP4/VP2 (data not shown), indicating that phylogenetic grouping in VP4/VP2 was predictive of the type group assigned using VP1 sequences. Only one sequence showed evidence of phylogenetic incongruence within the capsid region. JX291115 is a member of HRV-C51 type group in the VP1 region, with a pairwise nucleotide p-distance of 0.005. However, in VP4/VP2, this sequence had a VP1 p-distance of 0.1459 from HRV-C51 and groups as a poorly supported outgroup to HRV-C15. Analysis with the RDP software package confirms this sequence as a putative recombinant and similar to the pattern observed from phylogenetic analysis, there was no known minor recombination parent. Analysis of HRV-A and -B carried out with RDP and

122

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

GARD highlighted no evidence of recombination between regions. This suggests that the VP4/VP2 region may additionally be used for classification of HRV types.

We previously proposed a divergence threshold of 10% for the purpose of classifying HRV-C sequences by VP4/VP2 sequence (Simmonds et al., 2010). This was calculated using all HRV sequences that were over 90% complete across a fragment of VP4/VP2 from position 615 – 1043 (numbered by EF582385). In the revised analysis, the use of a shorter fragment was considered (615 – 1004 numbered by EF582385), as this would allow inclusion of a large number of additional VP4/VP2 sequences. Inspection of phylogenetic trees constructed for both fragments showed no change in the uniformly high bootstrap support for individual type groups and no change in the clade membership of individual sequences. Therefore, the smaller fragment was used in all subsequent phylogenetic and p-distance analysis (Figure 4.6).

123

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

124

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.6: Neighbour joining phylogenetic trees showing the VP4/VP2 region of all three species of HRV. For ease of reference, individual type groups and the branch to tree root have been collapsed. Divergent groups which have been temporarily designated as provisionally assigned types are shown in red. HRV type groups which do not conform to the proposed VP4/VP2 thresholds are shown in purple. The single recombinant sequence detected, JX291115 is marked by a purple box. Branches are scaled by genetic distance.

4.3.7 Assigning nucleotide divergence thresholds for the VP4/VP2 region

Pairwise p-distance comparisons are divided into two distributions in the VP4/VP2 region (Figure 4.7). Detailed inspection of distributions of pairwise distances allowed a threshold of 10.5%, 9.5% and 10.5% to be assigned for HRV-A, -B and –C respectively. Due to the fact that VP4/VP2 is a considerably shorter and less divergent sequence fragment than VP1, the thresholds are less clearly defined, with a greater number of pairwise comparisons close to the assignment threshold.

The original HRV-C classification included 28 provisionally assigned types, which were assigned on the basis of VP4/VP2 sequence divergence alone. When a corresponding VP1 sequence was obtained and confirmed to display at least 13% divergence, a new confirmed type was created and the provisionally assigned type designation discarded. Of these original 28 provisionally assigned types, only 11 remain (listed in Table 4.5). In 2012, VP1 sequencing of HRV-C isolates from 10 provisionally assigned type groups led to confirmation of further species C types HRV-C36 to HRV-C46 (McIntyre et al., 2013b). To date, only one provisionally assigned type has been combined into an existing type (HRV-C8/Cpat28). This reaffirms the usefulness of the provisionally assigned type system for identifying candidate new types and guiding the use of VP1 region sequencing for confirming these.

125

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.7: Distributions of pairwise nucleotide p-distances for the VP4/VP2 region of all three species of HRV. HRV-A is shown in section A1, HRV-B in B1 and HRV-C in C1. A detailed analysis of the distribution values adjacent to the proposed thresholds for HRV-A (A2), HRV-B (B2) and HRV- C (C2) is also shown and the proposed nucleotide divergence threshold is shown by a red dotted line.

126

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

Based on this principle, 4 previously unrecognised divergent VP4/VP2 groups were observed in HRV-A and 4 in HRV-B (labelled in red on Figure 4.6). A further three provisionally assigned HRV-C types were also detected (designated HRV-Cpat29- pat31). All showed divergence above the VP4/VP2 threshold proposed for their species (Table 4.5). Only one of these provisionally assigned types, HRV-Apat2 had a corresponding sample which existed within our archive. However, the sample was missing and so VP1 sequencing could not be attempted.

TABLE 4.5: Provisionally assigned types of HRV based on VP4/VP2 sequence divergence

HRV HRV VP4/VP2 Submission VP4/ Closest p-distb Type strain ref Species p.a.t type strain date VP2 typea strains A Apat1 GU568105 13/10/10 2 A73 0.106 (Xiang et al., 2010b) A Apat2 GQ476621 4/12/09 4 A47 0.119 (Wisdom et al., 2009b) A Apat3 AB549407 10/3/10 6 A43 0.144 (Kaida et al., 2011) A Apat4 EU590059 24/6/09 12 A46 0.112 (Savolainen-Kopra et al., 2009a) B Bpat1 EF077241 23/8/07 1 B83 0.119 (Kistler et al., 2007a) B Bpat2 HM366914 3/1/11 2 B48 0.122 (Piralla et al., 2011) B Bpat3 HM366910 3/1/11 11 B97 0.097 (Piralla et al., 2011) B Bpat4 AB548901 10/3/10 15 B91 0.107 Unpub. C Cpat10 EU590054 25/3/08 17 C2 0.200 (Savolainen-Kopra et al., 2009a) C Cpat14 EU697852 5/5/08 6 Cpat16 0.106 (Briese et al., 2008) C Cpat16 EU752358 26/5/08 5 Cpat14 0.106 (Miller et al., 2009a) C Cpat17 EU752398 26/5/08 14 C29 0.133 (Miller et al., 2009a) C Cpat18 EU752412 26/5/08 28 Cpat19 0.146 (Miller et al., 2009a) C Cpat19 FJ598096 31/12/09 17 Cpat18 0.146 Unpub. C Cpat20 FJ615722 9/1/09 1 C41 0.105 (Miller et al., 2009b) C Cpat21 FJ615737 9/1/09 7 C36 0.144 (Miller et al., 2009b) C Cpat22 FJ615745 9/1/09 7 C33 0.121 (Miller et al., 2009b) C Cpat24 FJ869923 27/3/09 2 C28 0.106 (Han et al., 2009) C Cpat27 GU214340 18/11/09 10 C18 0.105 (Piralla et al., 2009) C Cpat29 FR820909 3/4/12 3 C50 0.113 (Henquell et al., 2012) C Cpat30 AB550405 23/12/11 1 C5 0.110 (Fujitsuka et al., 2011) C Cpat31 AB628117 24/2/12 1 C27 0.115 Unpub. a This represents the closest HRV type by p-distance and phylogenetic analysis. b Lowest inter-clade VP4/VP2 pairwise nucleotide p-distance between provisionally assigned type and nearest neighbour

127

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.3.8 Analysis of confirmed HRV types which do not conform to VP4/VP2 thresholds

Analysis of inter- and intra-type pairwise nucleotide p-distances in VP4/VP2 revealed many groups which violated the proposed divergence thresholds (22, 10 and 6 types of HRV-A, -B and –C respectively). HRV-A1 and HRV-A25/62 displayed the same patterns as in the VP1 region, in which they were designated as intermediate types.

TABLE 4.6: Limits of inter- and intra- clade divergence in HRV types which do not conform to VP4/VP2 thresholds HRV Type Prototype VP4/VP2 VP4/VP2 intra- VP4/VP2 inter- strain variants clade clade accession comparisons comparisons number upper limit lower limit HRV-A1a D00239 44 0.124 0.124 HRV-A2 X02316 15 0.123 0.133 HRV-A15 DQ473493 17 0.107 0.147 HRV-A20 FJ445120 23 0.119 0.119 HRV-A22 FJ445122 13 0.136 0.148 HRV-A25 FJ445123 7 0.075 0.064 HRV-A28 DQ473508 30 0.114 0.151 HRV-A29b DQ473499 42 0.159 0.151 HRV-A31 FJ445126 10 0.097 0.083 HRV-A46 DQ473506 14 0.137 0.119 HRV-A47 FJ445133 39 0.098 0.083 HRV-A53 DQ473507 28 0.109 0.128 HRV-A54 FJ445138 12 0.129 0.125 HRV-A61 FJ445144 17 0.112 0.135 HRV-A62 FJ445145 7 0.064 0.064 HRV-A63 FJ445146 24 0.118 0.136 HRV-A66 FJ445148 16 0.112 0.130 HRV-A78 EF173418 83 0.153 0.163 HRV-A80 FJ445156 15 0.129 0.129 HRV-A81 FJ445157 17 0.107 0.152 HRV-A82 DQ473509 20 0.135 0.147 HRV-A100 FJ445175 5 0.143 0.147 HRV-B3 U60874 13 0.128 0.142 HRV-B35 DQ473487 13 0.102 0.135 HRV-B37 EF173423 6 0.102 0.118 HRV-B48 DQ473488 22 0.129 0.122 HRV-B52 EF173424 32 0.005 0.093 HRV-B69 FJ445151 61 0.131 0.099

128

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

HRV Type Prototype VP4/VP2 VP4/VP2 intra- VP4/VP2 inter- strain variants clade clade accession comparisons comparisons number upper limit lower limit HRV-B70 DQ473489 16 0.100 0.097 HRV-B72 FJ445153 17 0.102 0.151 HRV-B86 FJ445164 7 0.107 0.138 HRV-B104 FJ445137 5 0.092 0.093 HRV-C4 EF582385 6 0.031 0.100 HRV-C8 GQ223227 25 0.147 0.169 HRV-C12 HM236958 67 0.113 0.134 HRV-C32 HM236897 19 0.071 0.090 HRV-C39 JF416306 46 0.074 0.090 HRV-C49 JF946738 8 0.049 0.100 a Confirmed types which are distinct in the VP1 region but violate the divergence thresholds in VP4/VP2 are marked in bold italics. b Please note that new HRV type designations as presented in Table 4.3 are used in this analysis.

Four type pairs had VP4/VP2 divergence that was below the proposed threshold. HRV-A31/A47, HRV-B52/B104, HRV-C4/C49 and HRV-C32/C39 had inter-type p- distances in VP4/VP2 of 0.083, 0.093, 0.100 and 0.090 respectively. In each case, VP1 divergence between the two members of the pair confirmed them to be distinct types and each type group forms a distinct phylogenetic cluster which is readily distinguishable from its nearest neighbour. These represent confirmed type groups which are distinct in the VP1 region, but less divergent than would be expected in VP4/VP2 (marked in bold italics on Table 4.6).

Most type groups contained two sub-clusters; one containing older strains closely related to the prototype and the other containing contemporary isolates. The majority of type groups with intra-type VP4/VP2 distances which violated the thresholds had a representative VP1 sequence from the divergent contemporary cluster. Invariably, these fell within the proposed thresholds for divergence in the VP1 region. These types represent confirmed type groups which are more divergent in VP4/VP2 than would be expected (marked in bold italics on Table 4.6). Each of these types formed a distinct phylogenetic cluster in VP4/VP2.

129

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

The remaining 12 HRV types showed divergence above the threshold in VP4/VP2 and had no available VP1 sequence to confirm the type designation. These could be categorised into two groups. For the majority of these types (including HRV-A15, - A53, -A63, -A100, -B35, -B69, -B72 and –C12), all p-distance comparisons above the threshold occurred between contemporary isolates. When considered individually, every contemporary strain had a p-distance from the prototype strain of less than 10.5% or 9.5%. Therefore, these types contained examples of divergent contemporary strains that were still type group members. The remaining types could not be definitively classified. HRV-A46, -A61, -A86 and –B3 all contained at least one contemporary strain that had a pairwise nucleotide p-distance of greater than 10.5% or 9.5% with their corresponding prototype strain. However, in each case, phylogenetic analysis revealed that the divergent strain was embedded within a clade containing type group members.

The existence of such a large number of HRV types which violate the proposed thresholds highlights the fact that it is not sufficient to rely on pairwise nucleotide p- distance analysis for identification of HRV types. However, when considered in parallel with phylogenetic analysis, all strains could be readily classified into types with a degree of certainty.

4.3.9 Geographical distribution of HRV types An analysis of the broad geographical distribution of HRV types was undertaken by obtaining information on geographical location of isolation of all VP4/VP2 sequences, where available. VP4/VP2 sequences were selected in preference over VP1 sequences, as this region is the most commonly used in prevalence studies of HRV. Although most sequences had associated geographical data available, 492 HRV isolates could not be associated with a particular location.

130

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

Although there is clearly a finite limit to the number of currently circulating HRV types, the continuing discovery of novel HRV types suggests that the limit has not yet been reached. HRV screening has been undertaken over a wide geographical base including Europe, USA, Australia and Asia. For the purposes of this analysis, sequences were divided into those isolated in 4 large regions; USA, Asia, Europe and other (including Australia, Jordan and Africa). As a substantial number of the sequences included were derived from our own screening, sequences from the UK and those from other European studies were separated into two distinct categories. The distribution of all identified strains between the geographical regions was highly similar across all three species, which indicates a lack of bias in the transmission and circulation of any HRV species within a particular geographical region (Figure 4.8).

Several HRV types were excluded from the analysis, on the basis of limited availability of geographical information (HRV-A50, -A57, -A64, -B5, -B17- B91 and –C50). All of these types consisted of one or two VP4/VP2 sequences. In addition, 5 HRV-C and 2 HRV-B types were represented by one available strain only. Without further large-scale epidemiological studies of HRV circulation and type distribution over a wide geographical area, it is difficult to speculate on the significance of these seemingly rare types. It is possible that these represent types which have a tendency to cause more mild and self-limiting infections, which would not be seen on screening carried out among hospitalised patients. No single HRV type with a large number of VP4/VP2 sequences available was found to display geographical restriction, which suggests that most HRV types circulate freely worldwide. A generalised increase in the practice of routine HRV typing will allow further investigation of the geographical distribution of HRV types.

131

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

FIGURE 4.8: Total number of identified HRV strains divided by geographical location of isolation. HRV-A, -B and –C are shown separately. The total representation of sequences from each geographical area is shown in inset pie-chart.

132

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.4 Discussion

4.4.1 Development of a sensitive and specific RT-PCR for the

amplification of the VP1 region of HRV The ability to perform genotyping directly from clinical specimens removes the need for cell culture passage and therefore greatly reduces time required for an accurate diagnosis and exposure of staff to live cultures. This has been achieved in studies of EV (Mirand et al., 2009; Nix et al., 2006; McWilliam Leitch et al., 2009b). The HRV specific assays developed in the course of this work allow the amplification of the full VP1 region from respiratory specimens, which often have low viral titres. Unfortunately, due to the high degree of genetic diversity, it was not possible to develop an assay capable of amplifying all three species of HRV concurrently. Indeed, even within a single HRV species, maintaining the required balance between allowing sufficient primer degeneracy to amplify all types and avoiding non-specific amplification was challenging.

Several studies have advocated the use of a one-step PCR for EV typing, comprising only of a first round of amplification (She et al., 2010; Kiang et al., 2009), with the suggestion that a “closed system” is less likely to suffer from contamination. However, one of the above studies reported that almost one third of tested isolates could not be effectively amplified in the VP1 region and those that were successful showed significant non-specific product (She et al., 2010). We have found that a nested PCR method dramatically reduces non-specific amplification and is much more effective for the amplification of very low titre samples. Also, with effective and universal adherence to the “one-way” system in PCR laboratories, with spatial separation of dedicated areas for nucleic acid extraction, first round and second round PCR, the occurrence of contamination is greatly reduced (Chapter 2).

133

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.4.2 The importance of genotyping in HRV

The identification and definition of HRV types by analysis of available sequence data within the capsid region has the potential to revolutionise HRV typing. The ease with which this can be carried out, requiring very little additional equipment or training, should make the endeavour of classifying all detected HRV sequences much more feasible and attractive. A global system of type identification should allow large-scale investigations of epidemiology, transmission and evolution. Classification and centralised reporting of new HRV types improves the ability of researchers to commence study of the distribution of new types quickly. Whereas previously a considerable latent period would have been necessary while waiting for type-specific antiserum to be developed and dispatched, now sequence data for comparison is available worldwide almost instantly.

4.4.3 Proposed criteria for the division of HRV into genotypically defined types

The usage of nucleotide divergence thresholds for the definition of new EV types for several years with great success strengthens the assertion that this is a valid method of classification for the closely related HRV. The criteria that we have developed and have been proposed formally (Simmonds et al., 2010; McIntyre et al., 2013a) are summarised below. The overall conclusions and recommendations are informed by and consistent with those already used for EV. However, by incorporating lower thresholds and acknowledging the lack of capsid recombination by proposing a VP4/VP2 divergence threshold, we have taken into account the differences between HRV and EV in terms of biological properties and genetic diversity.

134

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

Proposed criteria for the assignment of genotypically defined types of HRV

1. An HRV type should be phylogenetically distinct from other HRV types. 2. Analysis of divergence in the VP1 region should include at least 90% of the full VP1 coding region for each species. 3. In the VP1 region, an HRV type should have at least 13% (HRV-A), 12% (HRV-B) or 13% (HRV-C) nucleotide divergence from all other HRV types. 4. In cases where nucleotide divergence in VP1 is not sufficient to definitively assign types, phylogenetic relationships may be taken into account. 5. VP4/VP2 sequences can be used for type identification in epidemiological studies. 6. If a putative new HRV type is initially identified from VP4/VP2 sequences and no VP1 sequence data is available, it should be designated as a provisionally assigned type. 7. For putative new types that are confirmed by VP1 nucleotide divergence, a permanent HRV type designation should be assigned. 8. New HRV type names should be numbered sequentially and include the species designation (eg/ HRV-A104). 9. The “prototype strain” of each new HRV type should be the first reported full genome sequence or, if this is unavailable, the first reported full VP1 sequence. 10. The Picornavirus Study Group should continue to oversee assignment of new HRV types, as the group includes members with specific expertise in enterovirus type assignment and a number of scientists currently active in HRV research.

The use of only capsid coding regions in type assignment criteria should not detract from the importance of continued investigation of other genomic regions, in particular where these may contribute to the phenotype and disease associations of HRV. In addition, the use of the VP1 region for definition of new HRV types should not discourage the widespread continued use of 5ˈUTR and VP4/VP2 screening protocols. Screening with these relatively conserved regions allows a much greater opportunity for discovery of previously unknown HRV types.

Although this represents, to our knowledge, the most comprehensive survey of HRV sequence data currently possible, the guidelines should be subject to further review as additional data becomes available.

135

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types

4.4.4 Future directions

In all three species, the phylogeny of the full VP1 region reliably separated isolates of known differing type into distinct bootstrap supported clades. As VP1 also contains a large number of the neutralizing immunogenic sites observed in HRV-A and -B (Kistler et al., 2007b), we can assume that this region is an accurate molecular determinant of the serotype. An interesting potential future addition to this work would be confirmation of serological properties of HRV-C type groups. This would require an easily implementable system for the cultivation of HRV-C in vitro. The sinus mucosal organ culture system currently available represents an excellent opportunity to study biological properties of HRV-C (Bochkov et al., 2011). However, it is not suitable for the extensive use that would be required to confirm type designations as not only is it more labour-intensive than other cell culture methods, but requires regular access to a local Department of Head and Neck Surgery to obtain viable sinus tissue. Limited serotyping studies would also be useful in re-confirming some potentially aberrant HRV-A and HRV-B type assignments where cross-reactivity is uncertain (such as HRV-A54/A98).

There is a possibility that a smaller fragment of VP1 could be suitable for genotyping in HRV. Oberste et al highlighted that a 450 base segment at the 3ˈ end of VP1 could be used for EV typing and the results had a 100% correlation with neutralization results (Oberste et al., 1999a). A second study showed full correlation of a 303 nucleotide stretch of VP1 with neutralization data, but this was only applied to 59 Genbank strains (Kiang et al., 2009). In addition, a simple, fast and effective method of typing EV by pyrosequencing has recently been developed (Silva et al., 2008). This may become extremely useful as pyrosequencing becomes more widely available worldwide.

Finally, there is potential for the development of an interactive web tool which would allow users to input a FASTA sequence and receive an HRV type designation. The 136

Proposals for the Classification of All Three Species of Human Rhinovirus into Genotypically Assigned Types service could include an additional step comprising of request for input of a VP1 sequence and subsequent analysis in cases where type designation is dubious. A similar system is already in use for EV and Noroviruses (Kroneman et al., 2011). However, this would have to be modified significantly to allow assignment of HRV types, as the described algorithm is based solely on phylogeny. In order to accurately assign HRV types, it would be desirable to consider both inferred phylogeny and calculated pairwise nucleotide p-distances from all prototype strains. Such a tool, widely and free available on the internet, would make HRV genotyping even more freely accessible, easy and attractive to research groups the world over.

137

Chapter 5 Analysis of genetic diversity and sites of recombination in HRV-C

5.1 Introduction

5.1.1 Recombination in the evolution of human enteroviruses

Human EV are divided into four species by phylogenetic analysis (EV-A-D) and the three serotypes of PV are now classified as EV-C. EV-B is the largest and most clinically common human EV species. Developing a detailed understanding of the frequency and occurrence of recombination in circulating EV has been vital in surveillance of outbreaks of vaccine derived poliomyelitis, whereby vaccine PV strains may recombine either with each other or with circulating EV-C strains to produce neurovirulent PV (Cuervo et al., 2001; Blomqvist, 2003; Jegouic et al., 2009).

Circulating EV strains undergo frequent recombination, which manifests as observed incongruities between inferred phylogenies in distant genome fragments. Intra-species recombination within the coding region has been described in EV-A (Simmonds and Welch, 2006; Huang et al., 2008; McWilliam Leitch et al., 2012; Santti et al., 1999), EV-B (Simmonds and Welch, 2006; Lukashev et al., 2005; Lindberg et al., 2003; McWilliam Leitch et al., 2009a, 2010; Chevaliez et al., 2004; Andersson et al., 2002) and EV-C (Jegouic et al., 2009; Dahourou et al., 2002). EV has a similar genome structure to HRV with four structural and seven non-structural proteins translated as a single polyprotein that is cleaved post-translationally. Recombination breakpoints are often observed to occur within the non-structural protein coding region and frequently near the boundary with the structural protein coding region. Specifically, recombination breakpoints have been isolated in the 2A

138

Analysis of genetic diversity and sites of recombination in HRV-C protease coding region (van der Sanden et al., 2011), 2B (Lindberg et al., 2003; Chevaliez et al., 2004), 2C (Lukashev et al., 2005; Andersson et al., 2002), at the 5ˈ end of 3D (van der Sanden et al., 2011) and (in PV recombinants generated in vitro) within the 3C region (Runckel et al., 2013).

The frequent recombination between the capsid and non-structural protein coding regions has led to a theory that these genome regions of EV strains evolve in a modular and independent fashion (Santti et al., 1999; Lukashev et al., 2005). This implies that EV diversity is derived from constant recombination between capsid sequences and pools of non-structural genome sequences. The combination of a capsid sequence and non-structural region generates a specific recombinant form (RF). These RFs have been shown to rise to dominance and then disappear entirely over the span of only a few years (McWilliam Leitch et al., 2009a).

In contrast, only rarely is recombination observed within the capsid coding region of EV (Santti et al., 1999; Lindberg et al., 2003; McWilliam Leitch et al., 2009a; Simmonds and Welch, 2006) and this has been previously attributed to the significantly greater sequence divergence seen within the capsid coding region (Lindberg et al., 2003). It has been suggested that the increased divergence may present a biological compatibility barrier, which renders variants with chimaeric capsid regions unviable or simply less fit than parental strains (Simmonds and Welch, 2006). The theory of the biological compatibility barrier is additionally supported by the observation that inter-species recombination in any part of the coding region is only very rarely observed in EV. One recent study of PV recombination in vitro used a deep sequencing technique to isolate not only viable PV recombinants, but all genomes generated during viral replication (Runckel et al., 2013). Analysis of recombination breakpoints in over 50,000 generated genomes found that there was actually no difference between the frequency of occurrence of recombination within the capsid and the non-structural protein coding region. This

139

Analysis of genetic diversity and sites of recombination in HRV-C supports the assertion that capsid recombinants are likely routinely randomly generated, but are less able to compete with wild type viruses. Viable circulating EV strains with chimaeric capsids have been infrequently isolated (Blomqvist, 2003; Bouslama et al., 2007; Zhang et al., 2010). In addition, one study found a recombination breakpoint near the 5ˈ end of the VP4 region, presumably related to recombination within the 5ˈUTR (Lukashev et al., 2005).

Within all four species of human EV, 5ˈUTR sequences cluster phylogenetically into two large groups (Pöyry et al., 1994). Sequences from EV-A and –B are interspersed within one group (Group I) and sequences from EV-C and –D within the other (Group II) (Pöyry et al., 1996). Extensive phylogenetic incongruence has been observed between the 5ˈUTR and capsid coding regions (Simmonds and Welch, 2006) and no evidence has been found of a correlation between phylogenetic grouping in the 5ˈUTR region and other putative recombinant regions downstream (Lukashev et al., 2005). In addition, four recently described new EV types (EV-A90, EV-A91, EV-C104 and EV-C109) have 5ˈUTR sequences which group outwith the 5ˈUTR cluster expected by their EV species designation (Yozwiak et al., 2010; Tapparel et al., 2009b; Smura et al., 2007a). Two of these are EV-C types (EV-C104 and EV-C109) which group closely with EV-A/-B sequences in the 5ˈUTR and two (EV-A90 and EV-A91) are EV-A types which cluster with EV-C/-D in the 5ˈUTR. This indicates a degree of reciprocal exchange between the two 5ˈUTR clusters in the contemporary evolution of EV strains.

5.1.2 Recombination in the 5ˈUTR of HRV-C HRV-C was only discovered relatively recently (McErlean et al., 2008; Arden et al., 2006; Kistler et al., 2007b; Lamson et al., 2006; Renwick et al., 2007; Lau et al., 2007) and at the time that this work was undertaken, only limited information regarding the frequency and occurrence of recombination within the HRV-C genome had been published (Huang et al., 2009; Wisdom et al., 2009a). However,

140

Analysis of genetic diversity and sites of recombination in HRV-C recombination was thought to be rare within the coding region of HRV-A and -B (Kistler et al., 2007b; Lewis-Rogers et al., 2009). In contrast, preliminary studies undertaken of circulating genetic diversity of HRV-C strains indicated that they fell into two distinct phylogenetic groups (Han et al., 2009; Huang et al., 2009; Wisdom et al., 2009a). In fact, over 60% of sequenced strains which are members of HRV-C species by sequence analysis in the capsid region group within the HRV-A clade in the 5ˈUTR (Wisdom et al., 2009a). It appears that similar to EV strains, inter-species recombination in the 5ˈUTR has taken place extensively in the evolutionary history of this species. Studies of the clinical correlates and molecular epidemiology of HRV-C strains have found no difference in the clinical outcomes or circulation patterns of recombinant (HRV-Ca) or non-recombinant (HRV-Cc) strains (Huang et al., 2009; Wisdom et al., 2009a).

The grouping of the majority of HRV-C strains within the HRV-A clade has probably contributed to the relatively recent discovery of HRV-C, despite the fact that it appears to have been circulating in human populations for many years (Briese et al., 2008). Prior to the recognition of recombinant HRV strains, it was common practice to use 5ˈUTR sequence comparisons for HRV species and type identification (Kiang et al., 2008); a practice which would have led to the misidentification of over two thirds of HRV-C strains as HRV-A. However, it is now generally accepted that this typing method is fundamentally flawed (Savolainen-Kopra et al., 2009b) and the majority of published studies utilise the VP4/VP2 region instead.

5.1.3 Genetic features of HRV-C HRV-C strains share the same basic genome organisation with HRV-A and -B and other EVs, including an approximately 7kB genome encoding a single open reading frame. HRV-C have been found to have a generally shorter coding region than other Picornaviridae, secondary to deletions within the VP1 capsid protein coding region (Lau et al., 2007; Arden et al., 2010). Also in common with other HRV are the

141

Analysis of genetic diversity and sites of recombination in HRV-C observed amino acid sequence motifs at the start of the coding region (MGAQVS) and within the 3D polymerase coding region (YGDD, YGL, FLKR, SIRWT) (Lau et al., 2007; Arden et al., 2010; McErlean et al., 2008). These are all fully conserved in other HRV species (Savolainen et al., 2004) and the 3Dpol motifs are additionally conserved in EV strains (Brown et al., 2003).

Distinguishing features of HRV-C genomes include the unique position of the cis- acting replication element (cre) within VP2 (Cordey et al., 2008; Arden et al., 2010). In contrast, this has been located within 2A in HRV-A (Gerber et al., 2001) and VP1 in HRV-B (McKnight and Lemon, 1998). In addition, HRV-C are found to have a generally elevated G+C content of between 42-43% compared to HRV-A and –B (39-40%), although this is generally lower than the G+C content seen in EV (46%) (Arden et al., 2010; Lau et al., 2007).

Similar to all members of the genus Enterovirus, HRV-C strains possess a 5ˈUTR of approximately 600 bases. The 5ˈUTR is a highly structured and well conserved region which contains certain elements essential to effective replication of the virus. As all EV, HRV possess a type I IRES, which is essential for the internal initiation of translation of the uncapped mRNA (Palmenberg et al., 2009). In addition, a 5ˈ cloverleaf structure, implicated in the initiation of RNA synthesis was present and structurally conserved in all three species of HRV (Palmenberg et al., 2009). The high degree of conservation of both RNA sequence and secondary structure observed within this region may contribute to the observed propensity for recombination within the 5ˈUTR of HRV-A/-C.

The majority of the analysis presented within this chapter focussed on the occurrence of recombination within samples collected from Edinburgh patients between 2006 and 2007. This has been previously published (McIntyre et al., 2010). However, section 5.3.2 also refers to an additional analysis of the 5ˈUTR region of all

142

Analysis of genetic diversity and sites of recombination in HRV-C published HRV-C full genomes, undertaken in 2012 and published as part of a further paper (McIntyre et al., 2013b).

5.2 Materials and Methods

5.2.1 Sample selection

A previously conducted study had produced sequences from VP4/VP2 and a small fragment of the 5ˈUTR of 144 HRV-C sequences derived from Edinburgh patients. These fell into 46 phylogenetic clusters, now known to be analogous to type groups. Samples for inclusion in the study of recombination within the coding region of HRV-C were selected in order to include at least one representative from as many of these phylogenetic groups as possible. In addition, a further 15 samples from epidemiological studies (Chapter 3) were selected. This resulted in the inclusion of 35 HRV-C types in the 5ˈUTR, VP4/VP2 and 3Dpol regions and 28 in the VP1 region.

5.2.2 Amplification of the VP4/VP2, VP1 and 3Dpol regions of HRV-C The VP4/VP2 and VP1 regions were amplified as previously described (Chapter 3, Chapter 4). A 3Dpol fragment (from positions 6384 – 6854 numbered by EF582385) was amplified using a nested PCR protocol (described in Chapter 2 and primers listed in Appendix 1). Samples negative on initial nested PCR were amplified using a SuperScript III protocol (Chapter 2).

5.2.3 Amplification of the 5ˈUTR region of HRV-C

Two sequence fragments with a 61 base overlap were produced, allowing a composite 5ˈUTR sequence of 680 bases to be constructed. The amplification of the 5ˈUTR region utilised the two primer sets (UTR2 and UTR3). In addition, a further 12 HRV-C variants were further amplified using hemi-nested combined HRV- A/HRV-C primers to give sequence complete from nucleotide position 27 to 355

143

Analysis of genetic diversity and sites of recombination in HRV-C

(primer set UTR1 in Appendix 1). Cycling conditions for all reactions are described in Chapter 2.

5.2.4 Amplification of the 2A region of HRV-C

The 2A coding region of HRV-C strains was amplified in two fragments with a 303 base overlap, to produce a single sequence of 884 bases. Both second round PCRs used the same first round PCR product. Due to the fact that the first round fragment to be amplified was almost 1000 bases in length, we found that the addition of 2µL 3mM dNTPs in the first round gave a higher sensitivity (concentration of nuclease- free H2O was adjusted accordingly to give a final reaction volume of 20µL). Resulting amplified DNA bands of the appropriate size were then cut out from agarose gels visualized under UV transillumination. The target DNA was then extracted from the agarose gel and sequenced as previously described (Chapter 2).

5.2.5 Dataset construction A total of 89 HRV-C positive samples corresponding to 40 phylogenetic groups were amplified in the VP4/VP2 and 3Dpol regions14. Two samples (Resp_3856 and Resp_6131) could not be amplified in the 5ˈUTR, possibly due to low viral loads in the original sample. However, both of these samples came from phylogenetic groups in which other members were represented within the 5ˈUTR dataset.

Of the 89 HRV-C positive samples with sequences available for the 5ˈUTR, VP4/VP2 and 3Dpol regions, only 71 could be successfully amplified in the VP1 region. This corresponded to 33 of the 40 phylogenetic groups. This was due to limitations in the sensitivity of the originally described HRV-C VP1 nested PCR and

14 74 of the included VP4/VP2 sequences had been previously amplified as part of a previously published study (Wisdom et al., 2009a). 144

Analysis of genetic diversity and sites of recombination in HRV-C subsequent redevelopment of this protocol led to a much improved sensitivity15 (McIntyre et al., 2013b). Sequences which could not be amplified in the VP1 region did not cluster together on phylogenetic trees in other regions. We can therefore conclude that these did not represent an identifiable genetic subset whose omission would have significantly affected the analysis.

Sequences generated in the course of this study have been submitted to Genbank and assigned accession numbers as described (Table 5.1). In addition, VP4/VP2 sequences generated as part of the previous study by Wisdom et al are published with accession numbers in the series GU294336 to GU294480 (Wisdom et al., 2009a).

TABLE 5.1: Accession numbers of sequences generated during the study of recombination in HRV-C in 2006/2007 Genome Number of Accession Numbers Region sequences VP4/VP2 15 HM352737 – HM352752 VP1 71 HM236897 – HM236968 3Dpol 89 HM485468 – HM485556 5ˈUTR 86 HM581802 – HM581888

All available HRV-C full genomes (n=10) were initially obtained from Genbank in April 2010. For the inter-species recombination and full genome sequence divergence analysis, these were combined into a single dataset with available HRV- A (n=82) and HRV-B (n=25) full genome sequences. An additional analysis of the 5ˈUTR region of all available full genome sequences of HRV-A and HRV-C was undertaken in July 2012. This included all full genome sequences which had a 5ˈUTR fragment that was more than 90% complete across the region 167-626 (numbered by FJ445111 : HRV-A1). 167 HRV-A and 59 HRV-C sequences were included in the analysis.

15 Recombination studies which utilised this improved HRV-C VP1 nested PCR protocol are detailed in Chapter 6. In addition, HRV-C36-C46 VP1 sequences were generated using the improved HRV-C VP1 PCR. 145

Analysis of genetic diversity and sites of recombination in HRV-C

Phylogenetic trees were constructed as previously described (Chapter 2) and recombination analysis was undertaken primarily by inspection of phylogenetic trees for bootstrap supported changes in tree topology or branch length. This analysis was supplemented by screening datasets with the TreeOrderScan software within the SSE v1.0 package (Simmonds, 2012). Breakpoints were determined by analysis with the program GroupScan within the SSE v1.0 software package.

5.3 Results

5.3.1 Phylogenetic analysis of the VP4/VP2, VP1 and 3Dpol regions of HRV-C

For analysis of recombination within the coding region of HRV-C, three genome regions (VP4/VP2, VP1 and 3Dpol) were included. VP1 sequences are the basis for type designations of HRV-C (Simmonds et al., 2010) whereas VP4/VP2 and 3Dpol represent the extreme 5ˈ and 3ˈ ends of the coding genome. In all three regions, sequences fell into clearly defined, bootstrap supported groups (Figure 5.1). A total of 15 HRV types were represented by a single sequence only and the remainder were represented by several unique variants. In both VP4/VP2 and 3Dpol, the majority of sequences clustered according to their type assignment, as defined by VP1. The phylogeny of all three coding regions was remarkably congruent, in terms of both clade membership and branch length.

Only one HRV-C sequence showed evidence of incongruent phylogenetic position (shaded box in Figure 5.1). GQ223227, a previously published full genome sequence (Huang et al., 2009), is the HRV-C8 prototype strain as defined by VP1 (Simmonds et al., 2010). Our sample selection contained one other HRV-C8 variant (Resp_10300) confirmed by phylogenetic grouping and pairwise nucleotide p- distances. However, while Resp_10300 maintained its phylogenetic position relative

146

Analysis of genetic diversity and sites of recombination in HRV-C to other sequences in 3Dpol region, GQ223227 changes tree position to group with the HRV-C9 prototype strain, GQ223228 in 3Dpol.

147

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.1: Neighbour joining phylogenetic trees of study sequences and corresponding regions from published complete genome sequences from the VP4/VP2, VP1 and 3Dpol regions of HRV-C. Branches showing at least 70% bootstrap support are indicated. Branch to tree root has been collapsed for ease of reference. Complete genome sequences were labelled in black. The putative recombinant sequence group containing GQ223227 and Resp_10300 (HRV-C8) are marked with blue boxes. Branches are scaled by genetic distance.

On inspection of these two full genome sequences, the change in tree topology appeared to occur within 3Dpol. At the 5ˈ end of 3Dpol (position 6384 – 6645), GQ223227 had a pairwise nucleotide p-distance of 0.08 from Resp_10300. However, at the 3ˈ end of 3Dpol (position 6646 – 6854), this sequence was nearly identical (p-distance 0.01) to GQ223228 and was much more distantly related (p- distance 0.35) to Resp_10300. Inspection of phylogenetic trees for the 5ˈ and 3ˈ ends of 3Dpol clearly showed the change in grouping of GQ223227 within the region (Figure 5.2).

This observation is complicated by the fact that both GQ223227 and GQ223228 originated from the same published study within the same laboratory (Huang et al., 2009). Although we cannot definitively exclude that this sequence represents a genuine natural recombination event, in light of the extremely rare nature of recombination within the rest of the sample set and the origination of both segments of this chimaeric genome from the same laboratory, it seems prudent to question its authenticity. As the other HRV-C8 sequence available within our dataset shows no evidence of this putative recombination event, it was felt that the hybrid genome may instead represent a sequencing or assembly error. As a result, GQ223227 was not considered as a genuine recombinant genome in the remainder of the analysis carried out.

148

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.2: Comparison of phylogenies obtained from analysis of the 5ˈ and 3ˈ ends of 3Dpol of HRV-C sequences. Published full genomes are marked with black circles. Sequences involved in putative recombination event are labelled with coloured arrows.

149

Analysis of genetic diversity and sites of recombination in HRV-C

5.3.2 Phylogenetic analysis of the 5ˈUTR of HRV-C Phylogenetic trees constructed for the 5ˈUTR region of HRV-C sequences from 2006-2007 (and including 10 HRV-C full genomes) showed that sequences continued to group within their types (as defined by VP1) in the 5ˈUTR. Even the putative recombinant HRV-C8 type (consisting of GQ223227 and Resp_10300) was congruent in the 5ˈUTR (marked by shaded grey boxes on Figure 5.3). However, there was extensive breakdown in the deeper relationships between HRV-C type groups observed between the 5ˈUTR and VP4/VP2 regions.

In accordance with previous reports (Wisdom et al., 2009a; Huang et al., 2009), we found that HRV-C sequences grouped into two distinct clades in the 5ˈUTR. The largest of these clades (marked as HRV-Ca on Figure 5.3) contains HRV-C sequences which group within the HRV-A clade16. In total, over 70% of analysed HRV-C sequences (n=68) representing 30 of the 40 included HRV-C types grouped with HRV-A sequences. The remaining 10 HRV-C types grouped in a distinct clade, labelled as non-recombinant HRV-Cc strains (marked in red on Figure 5.3). Putative recombinant (HRV-Ca) strains were scattered within the VP4/VP2 region, indicating the possibility that recombination between the 5ˈUTR of HRV-A/-C has occurred on multiple occasions throughout HRV-C diversification (shown by blue lines and grey arrows on Figure 5.3).

Analysis of 5ˈUTR and VP4/VP2 sequences with TreeOrderScan showed a minimum of 12 phylogeny violations between the two regions (data not shown). On inspection of phylogenetic trees, inter-type relationships showed a great deal of inconsistency between the two regions, with some large clades being entirely maintained and others showing breakdown in phylogenetic relationships. Two large HRV-Ca clades containing 6 and 8 HRV-C types respectively were bootstrap

16 For ease of reference, relationships between HRV-C sequences only are presented in Figure 5.3. 150

Analysis of genetic diversity and sites of recombination in HRV-C supported in both regions with completely congruent tree topology (marked by grey boxes on Figure 5.3).

FIGURE 5.3: Neighbour joining phylogenetic trees showing the partial 5ˈUTR and VP4/VP2 regions. Putative recombinant (HRV-Ca) sequences are coloured in blue. Non-recombinant (HRV- Cc) sequences are coloured in red. Two samples which could not be amplified in the 5ˈUTR region are shown in yellow. The phylogenetic position of recombinant HRV-Ca sequences are shown by blue lines. Two contiguous and congruent clades of HRV-Ca sequences are shown by grey boxes and grey arrows. Branches are scaled by genetic distance.

151

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.4: Neighbour joining phylogenetic tree showing the 5ˈUTR region of all HRV-A and HRV-C full genomes. HRV-A sequences are marked in blue, with HRV-C marked in red. Branch to tree root has been collapsed for ease of reference. The largest portion of the HRV-A clade, which had no HRV-C sequences embedded within it has been additionally collapsed. Branches are scaled by genetic distance.

152

Analysis of genetic diversity and sites of recombination in HRV-C

An additional analysis of all available HRV-A and HRV-C 5ˈUTR regions from published polyprotein sequences revealed a similar pattern, whereby the majority of HRV-C sequences and type groups were embedded within the HRV-A clade (Figure 5.4). This analysis included 59 HRV-C 5ˈUTR sequences representing 30 HRV-C type groups. All sequences analysed grouped in the 5ˈUTR region according to their assigned type. Recombinant sequences were not distributed throughout the entire HRV-A 5ˈUTR clade, but fell into three distinct groups. One group of HRV-C sequences appeared to share a common ancestor with HRV-A12, -A45, A78 and – A101, while another was closest to HRV-A51, -A65, -A71 and –A103. The third small clade consisted only of HRV-C4 and –C49. Although this group fell within the HRV-A clade, they did not group closely with any known HRV-A sequence.

5.3.3 Determination of recombination breakpoints within the 5ˈUTR of HRV-C In order to determine the location of putative recombination breakpoints in the HRV- C 5ˈUTR, amplified 5ˈUTR and VP4/VP2 sequences were joined to give a single, long sequence which was scanned between positions 185 – 903 (numbered by EF582385) using the program GroupScan. The sequences generated in the current study were combined with the 10 available published HRV-C complete genomes, 34 5ˈUTR sequences previously published (Huang et al., 2009) and all available HRV- A/-B complete genome sequences to construct the dataset for analysis. Each putative recombinant (HRV-Ca) sequence was scanned against HRV-A, HRV-B and control HRV-Cc (n=29) sequences and the recombination breakpoint was calculated from the intersection of the two lines (Figure 5.5).

Of the 51 confirmed HRV-C types, 42 had UTR sequences that were more than 90% complete across the region analysed. Of these, 31 HRV-C types were members of the HRV-Ca group and breakpoints were calculated. Calculated recombination breakpoints were similar within HRV-C type groups and all calculated breakpoints

153

Analysis of genetic diversity and sites of recombination in HRV-C occurred within the UTR (Table 5.2). The majority of HRV-Ca sequences showed putative recombination breakpoints that fell within two “hotspot” regions. These mapped within the IRES to the polypyrimidine tract (PPT) between stem loops 5 and 6 (positions 561-576; mean, 565) and near the terminal loop of stem-loop 5 (position 508-544; mean,523). Only two HRV-C types had recombination breakpoints out with these two regions and both were at a more 5ˈ site (position 479-489; mean, 481).

FIGURE 5.5: Recombination breakpoints for three representative HRV-Ca sequences. Analysis was carried out in the program GroupScan using HRV-A, -B and –Cc sequences as control groups. Above the graphs is a diagrammatic summary of 5ˈUTR RNA structure elements (stem-loops 1-6 (Anderson et al., 2007) and polypyrimidine tract [PPT]). Breakpoint is calculated from intersection of two lines. Association values/grouping scores of less than 0.5 refer to distantly related outgroups (marked in blue).

154

Analysis of genetic diversity and sites of recombination in HRV-C

TABLE 5.2: Calculated 5ˈUTR recombination breakpoints for HRV-C types analysed HRV 5ˈUTR Total Breakpoint Breakpoint Type group sequences (nt) mean (nt) range analysed C1 Ca 5 520 519-523 C2 Ca 1 564 - C3 Ca 2 526 526-527 C4 Ca 1 561 - C5 Ca 3 563 563-564 C6 Ca 10 524 524-526 C7 Ca 2 574 574 C8 Ca 4 565 561-570 C9 Cc - - - C10 Ca 1 525 - C11 Ca 2 554 552-557 C12 Ca 4 508 508 C13 Ca 1 523 - C14 Cc - - - C15 Ca 3 483 479-489 C16 Ca 3 520 518-526 C17 Ca 1 567 - C18 Ca 8 528 518-544 C19 Cc - - - C20 Cc - - - C21 Ca 6 572 570-575 C22 Ca 2 576 576 C23 Ca 3 562 552-570 C24 Ca 1 560 - C25 Ca 1 558 - C26 Ca/Cca - - - C27 Ca/Cca 3 535 522-559 C28 Ca 2 543 536-550 C29 Cc - - - C30 Ca 1 561 - C31 Ca 1 528 - C32 Ca 9 524 523-527 C33 Cc - - - C35 Cc - - - C36 Cc - - - C37 Cc - - - C38 Ca 2 555 555 C40 Ca 1 560 - C41 Ca 1 548 - C42 Ca 1 479 - C46 Cc - - - C49 Ca 1 570 - C51 Cc - - - a HRV-C26/-C27 are labelled as HRV-Ca/Cc groups, as previously published sequences of these types do not all fall within the same 5ˈUTR group (Huang et al., 2009; Wisdom et al., 2009a).

155

Analysis of genetic diversity and sites of recombination in HRV-C

An analysis of mean pairwise p-distance within and between HRV species groups (with HRV-Ca and HRV-Cc considered separately) confirmed high sequence conservation within the 5ˈUTR. In particular, sequences of the stem-loop regions either side of the recombination hotspot centring on position 523 were highly conserved (Figure 5.6). A further analysis of this data performed by Professor P. Simmonds showed that HRV-Ca variants with a 5ˈ sequence that was HRV-A like and a 3ˈ sequence that was closely related to HRV-C showed similar predictive pairing to both non-recombinant HRV-A and HRV-Cc sequences. It was therefore hypothesized that this region could comfortably accommodate a recombination event without destabilising the stem-loop and therefore disrupting the function of the highly structured IRES.

FIGURE 5.6: Sequence variability scan of the 5ˈUTR showing mean HRV-A, -B and –Cc inter- type distances and interspecies distance of HRV-A/Cc. The pairwise p-distances are averaged over a 30 base window. The position of known 5ˈUTR structures and putative location of recombination breakpoints are shown.

Within the fragment of 5ˈUTR analysed (from position 185), all sequences showed evidence of only one recombination breakpoint. At the 5ˈ end of the sequence fragment, all sequences grouped strongly within the HRV-A clade. In order to

156

Analysis of genetic diversity and sites of recombination in HRV-C investigate whether a 5ˈ breakpoint existed in HRV-C 5ˈUTR sequences, a subset of 11 HRV-Ca variants were sequenced to position 27. All 11 sequences analysed grouped with HRV-A at the extreme 5ˈ end of the genome. In each case, association values with HRV-A at the 5ˈend were significantly higher than the 0.5 score which signifies an outgroup position (an example is shown in Figure 5.7). Analysis of phylogenetic trees constructed from 50 and 100 base fragments of this region showed no evidence for grouping of any HRV-Ca variants outside the HRV-A clade (data not shown).

FIGURE 5.7: GroupScan analysis of the extreme 5ˈ end of the HRV-C genome. This analysis used a subset of control sequences which were complete from position 27 to scan against the 11 HRV-Ca query sequences.

5.3.4 Sequence divergence across the full genome of HRV-C

Mean pairwise nucleotide and amino acid p-distances were calculated for the VP4/VP2, VP1 and 3Dpol regions of HRV-A, -B and –C (Figure 5.8)17. Across all three species, VP1 was uniformly found to be the most divergent region (pairwise nucleotide p-distance of 0.290, 0.265 and 0.347 respectively), while VP4/VP2 was consistently the most conserved region. A significantly greater within-species divergence in all three regions was seen in HRV-C, compared with HRV-A and –B. In fact, the amino acid divergence of HRV-C in both VP4/VP2 and 3Dpol is almost twice that observed in other species.

17 For HRV-C, only the 10 published full genomes available at the time of this analysis and the sequences generated in this study were included in this calculation. 157

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.8: Mean pairwise (uncorrected) p-distances of nucleotide (A) and amino acid (B) sequences in the VP4/VP2, VP1 and 3Dpol regions of all three HRV species. Actual values are shown above each bar.

In order to investigate sequence divergence across the full HRV-C genome and highlight any other potential areas of recombination, a divergence scan using a fragment length of 300 nucleotides, an increment of 30 nucleotides and including all HRV-A/-B and 10 HRV-C full genomes was performed (Figure 5.9). Sequence divergence of HRV-C compared with HRV-A or HRV-B was uniformly higher throughout the genome than within species divergence of HRV-C. This was true for both nucleotide and amino acid p-distances. Interestingly, a small (approximately 260 base) region showed inter-species HRV-A/HRV-C divergence that approached intra-species HRV-C divergence. This area of reduced inter-species divergence fell approximately between positions 3268 and 3525 within the 2A proteinase coding region. HRV-B sequences remained divergent from both HRV-A and HRV-C within this region. This highlighted the possibility of inter-species recombination involving the 2A coding region of HRV-C sequences.

158

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.9: Sequence divergence scans showing mean nucleotide and amino acid divergences within complete genome sequences of HRV-C and between HRV-C and HRV-A/-B. Mean inter- species pairwise distances between HRV-A and HRV-C are shown in red, between HRV-B and HRV-C are shown in green and mean intra-species distance for HRV-C is shown in blue. Fragment length of 300 nucleotides and an increment of 30 nucleotides was used. A genome diagram showing different HRV-C genes is shown underneath (with annotation taken from sequence EF582385). The P1 region is marked in red, P2 in pale blue and P3 in blue.

5.3.5 Analysis of recombination within the 2A region of HRV-C

Initially, in order to further investigate the approximately 260 base region of decreased divergence between HRV-A and HRV-C, phylogenetic trees were constructed of the putative recombinant region and 5ˈ/3ˈ flanking regions of equal

159

Analysis of genetic diversity and sites of recombination in HRV-C length (Figure 5.10). The phylogeny of HRV-A and HRV-C sequences in the putative recombinant region (3268-3525) was remarkably incongruent with that observed in flanking regions of the same length. All 10 HRV-C sequences grouped within the HRV-A clade in a similar manner similar to the 5ˈUTR region. However, in the putative recombinant 2A region, all HRV-C sequences (including HRV-Cc strain GQ223228) were actually embedded within the HRV-A clade. HRV-B sequences retained a phylogenetic position distinct from both HRV-A and HRV-C throughout 2A.

Subsequently, 38 additional 2A sequences from HRV-C strains in Edinburgh were combined with 58 HRV-C complete polyprotein sequences (>90% complete across the region 3127 – 3552 numbered by EF582385) downloaded in 2012 to give a complete dataset consisting of 96 HRV-C 2A sequences, representative of 39 HRV- C types. Analysis additionally included 194 HRV-A complete polyprotein sequences for comparison. This showed similar features to the original analysis with only 10 HRV-C complete genomes. All HRV-C sequences from all HRV-C types grouped in several distinct clades within the HRV-A clade within the putative recombinant region (3127 – 3552), implying the occurrence of several separate recombination events, similar to that observed in the 5ˈUTR. However, bootstrap values for inter- type relationships were generally poor for trees constructed from both datasets and so recombination parents could not be determined. Despite poor bootstrap support for inter-type relationships within the region, the grouping of HRV-C within HRV-A is well supported in both datasets (bootstrap values of 99 and 80 respectively).

160

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.10: Neighbour joining phylogenetic trees showing the putative recombinant portion (3267 – 3525) of the 2A coding region as compared with 5ˈ (3006 – 3266) and 3ˈ (3526 – 3786) flanking regions. Numbering is from sequence EF582385 (HRV-C4). HRV-C sequences are labelled in black and HRV-A are labelled in white. The accession numbers of HRV-A/-B variants have been prefixed with their corresponding serotype designation. Branches are scaled by genetic distance.

161

Analysis of genetic diversity and sites of recombination in HRV-C

Of the 96 HRV-C strains with available sequences for the 2A region, 73 also had a 5ˈUTR sequence that was complete to position 179 (including all 38 strains isolated sequenced in this study). We had previously reported a monophyletic group of HRV-C sequences that was maintained in both the 5ˈUTR and 2A region (including types HRV-C1, -C3, -C6, -C7 and –C11) (McIntyre et al., 2010) and speculated that recombination in the 5ˈUTR and 2A regions may be linked. In order to investigate this potential linkage, we examined phylogenetic trees constructed from the same sequence set in both regions (Figure 5.11). This showed that, contrary to the pattern observed in 5ˈUTR region, HRV-Cc sequences did not group together in 2A and were interspersed throughout the HRV-C tree (marked in red on Figure 5.11). In the larger dataset, two clades containing 6 and 8 HRV-C types respectively were notably contiguous between the 5ˈUTR and 2A regions (marked with dotted lines and grey arrows in Figure 5.11). However, inspection of phylogenetic trees constructed from the VP1 region of HRV-C revealed that these groupings were additionally maintained in the coding region. In addition, one further bootstrap supported clade in 2A contained both HRV-Ca and HRV-Cc sequences and was present in the VP1 region. Indeed, when HRV-C sequences are considered alone, no phylogeny violations were recorded between VP1 and the putative recombinant 2A region (data not shown). This suggests that the recombination events observed in the 5ˈUTR and 2A regions are not, in fact, linked.

As there were no non-recombinant HRV-C 2A sequences, it was not possible to determine putative recombination breakpoints via GroupScan. However, inspection of the amino acid alignment of HRV-A, -B and –C sequences in this region revealed an area of obvious similarity corresponding closely with the region identified by divergence scanning (Figure 5.12). Annotation of the sequence alignment with known features of the 2A protein revealed the putative breakpoint to fall almost exactly at the boundary of the C-terminal domain (Petersen et al., 1999).

162

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.11: Neighbour joining phylogenetic trees showing the 5ˈUTR and recombinant 2A regions of HRV-C sequences. Numbering is as shown and sequences are numbered by EF582385. Sequences which are non-recombinant in the 5ˈUTR region (HRV-Cc) are labelled in red type. Bootstrap supported clades containing more than two HRV-C types which are maintained in 5ˈUTR and recombinant 2A region are marked with white arrows and dotted lines. Bootstrap supported clades containing more than two HRV-C types which are also present in the VP1 region are marked with blue arrows and blue boxes. Branches are scaled by genetic distance.

163

Analysis of genetic diversity and sites of recombination in HRV-C

FIGURE 5.12: Sequence alignment of the 2A region of selected HRV-A, -B and –C sequences. HRV-A (n=9) and HRV-B (n=8) sequences were selected on the basis of highest observed genetic divergence among prototype strains. HRV-C (n=10) strains include all of the originally published full genomes. Secondary structure elements are superimposed and annotated (Petersen et al., 1999). Inferred positions of recombination are shown by dotted lines (positions 3268 – 3525). Sequences are numbered by HRV-A2 : X02316. Β-strands and catalytic/zinc binding residues are as previously proposed (Petersen et al., 1999).

164

Analysis of genetic diversity and sites of recombination in HRV-C

The putative recombinant region contained all of the zinc binding sites known within this protein (Petersen et al., 1999). Additionally, the proposed recombination breakpoint at position 3268 corresponded closely with the 3ˈ end of the proposed cis- acting replication element (cre) for HRV-A, which has been previously mapped to between positions 3226 – 3270 (Gerber et al., 2001).

5.4 Discussion

5.4.1 Absence of recombination within the coding region of HRV-C

In the study of the phylogenetic relationships between different coding regions of the HRV-C genome, the most striking finding was the lack of evidence for recombination within the coding region. This was manifest as almost complete phylogenetic congruence between two capsid regions (VP4/VP2 and VP1) and a sequence fragment from the 3ˈ end of the genome (3Dpol) (Figure 5.1). This was in stark contrast to the frequent recombination observed in all three species of the closely related human EV (Lukashev, 2005; Oprisan et al., 2002; Simmonds and Welch, 2006) and indeed in many other Picornavirus genera (Simmonds, 2006; Heath et al., 2006; Agol, 1997; Calvert et al., 2010). However, recombination has not been frequently observed within the coding region of other HRV species (Kistler et al., 2007b; Lewis-Rogers et al., 2009; Palmenberg et al., 2009).

One putative recombination event was noted within the coding region of HRV-C isolates. Sequence GQ223227 (N4) changed tree position between VP1 and 3Dpol, while the other member of the same type (Resp_10300) maintained a congruent position relative to other sequences (Figure 5.2). While we cannot definitively exclude the possibility that recombination occurred within a natural site located in this region and gave rise to a viable HRV-C strain, it seems relatively unlikely that this putative recombination event is genuine. Subsequent analyses on the occurrence of recombination in all three species of HRV indicated that the sporadic occurrence

165

Analysis of genetic diversity and sites of recombination in HRV-C of recombination within HRV appeared to predominantly involve full HRV types, rather than individual sequences (McIntyre et al., 2013b). It is possible that this apparently chimaeric genome has arisen due to contamination in laboratory analysis or sequence assembly. We would suggest that re-sequencing this isolate may be warranted to verify its ancestry. One subsequent study of recombination in HRV species showed a grossly elevated recombination frequency in HRV-C due to the inclusion of this aberrant sequence (Linsuwanon et al., 2011). In light of this, we have excluded it from this and further analyses of recombination in HRV-C sequences. With this sequence excluded, we found no other evidence of recombination occurring between HRV-C types within the coding region.

In studies of circulating EV strains, recombination frequency is correlated with both geographical and temporal separation of analysed strains (McWilliam Leitch et al., 2009a). In fact, even within these highly recombinant species, analysed sequences of both EV-A and EV-B were only observed to be 100% recombinant when 10 years had elapsed (Simmonds and Welch, 2006). The sequences produced in this study were all isolated from a single geographical location and covered a 2-3 year time period (from 2006 – 2008). With the inclusion of the ten available published full genome sequences, this time period was increased only to eight years. This raises the possibility that the detection of recombination may have been limited by the included dataset, especially if recombination occurs comparatively more rarely than in EV. In one study of EV strains isolated from Edinburgh blood donors in 2000- 2001, no recombination among the study strains of EV-A was documented (Simmonds and Welch, 2006). However, when EV-A prototype strains and published strains covering a much larger time span were included, a high frequency of phylogenetic incongruity and putative recombination was observed. Further studies undertaken within this thesis investigated the occurrence of recombination within the coding region of HRV strains of all three species spanning 14 years and two geographically separate locations (Chapter 6).

166

Analysis of genetic diversity and sites of recombination in HRV-C

5.4.2 Recombination within the 5ˈUTR of HRV-C

In accordance with previous studies (Huang et al., 2009; Wisdom et al., 2009a), we noted considerable phylogenetic incongruity between the 5ˈUTR of HRV-C and the rest of the genome downstream (VP4/VP2, VP1 and 3Dpol fragments). The majority of HRV-C sequences grouped within the HRV-A clade in the 5ˈUTR region (Figures 5.3 and 5.4). The 5ˈUTR/P1 junction is a known recombination hotspot in other members of the genus Enterovirus (Santti et al., 1999).

All HRV-C sequences included in this study maintained their phylogenetic grouping within their assigned type in the 5ˈUTR. Previously published analyses showed two HRV-C types which contained strains with both recombinant (HRV-Ca) and non- recombinant (HRV-Cc) 5ˈUTR sequences (Wisdom et al., 2009a) (marked as Ca/Cc on Table 5.2). One of these groups contained only sequences isolated from Edinburgh; three of which were HRV-Cc (Resp_2514, Resp_3938 and Resp_3102) and only one (Resp_3856) was HRV-Ca. Unfortunately, the aberrant strain Resp_3856 was missing and could not be further investigated. In the second Ca/Cc group, all published sequences from a Chinese study were reportedly HRV-Ca (Huang et al., 2009) and all those from Edinburgh samples were HRV-Cc. We did not find evidence of any other HRV-C types with both HRV-A and HRV-C like 5ˈUTR sequences. However, the previously reported existence of these Ca/Cc groups may indicate that contemporary recombination can occur within the 5ˈUTR region of HRV-C.

We observed the majority of recombination breakpoints within the 5ˈUTR to fall into two hotspot regions; near position 565 in the PPT and 523 in stem-loop 5 of the IRES. This is a considerably more restricted distribution of breakpoints than was previously reported (Huang et al., 2009). However, we were able to utilise a different scanning method (GroupScan) and a much larger dataset. Similarly to the coding region of HRV-C, the existence of a biological compatibility barrier and

167

Analysis of genetic diversity and sites of recombination in HRV-C requirement to preserve essential RNA secondary structure may generally limit the generation of viable recombinants in the 5ˈUTR to sequence surrounding these hotspot regions. The 5ˈUTR of HRV contains the highly structured and well conserved IRES and 5ˈ cloverleaf, which have essential functions in viral replication (Palmenberg et al., 2009, 2010) and are also conserved in EV (Pöyry et al., 1996). Recombination breakpoints occurring at the hotspot regions described were found to have no influence on predicted RNA secondary structure (McIntyre et al., 2010). The observation that recombination is more likely to occur in areas of concentrated RNA secondary structure may be potentially supported by our observations (Runckel et al., 2013). However, this assertion remains controversial in EV (Simmonds and Welch, 2006) and the location of putative recombination breakpoints in HRV-C may be simply secondary to the highly conserved sequence in these regions. Putative recombination events observed in the 5ˈUTR of EV-C109 resulted in preservation of both the 5ˈ cloverleaf and IRES (Yozwiak et al., 2010). The conservation of RNA secondary structure within recombinant 5ˈUTR sequences suggests a degree of functional constraint on the biological viability of chimaeric genomes within this region.

We also found no evidence of the putative second recombination event that was proposed at the 5ˈ end of the 5ˈUTR (Huang et al., 2009), instead noting the swap of the entire 5ˈUTR in a modular fashion. Our methods for detection of the 5ˈ breakpoint were potentially limited by the use of a 200 base fragment size in GroupScan analysis. However, the additional inspection of phylogenetic trees constructed from both 50 and 100 base regions at the 5ˈ end of the UTR showed no evidence for grouping of any HRV-Ca sequence out with the HRV-A clade.

In EV-A71, recombination in the 5ˈUTR has been linked to epidemic spread. Specifically, a single epidemic genotype in the Netherlands has been found to have same recombinant 5ˈUTR sequence as an outbreak in Asian-Pacific region caused by

168

Analysis of genetic diversity and sites of recombination in HRV-C a different genotype (van der Sanden et al., 2011). HRV-A/-C 5ˈUTR recombinants are likely to have originally had increased viral fitness, as there are multiple VP4/VP2 lineages in the recombinant 5ˈUTR clade which indicates the occurrence of multiple recombination events (Figure 5.3). In addition, there is not a single recorded instance of an HRV-A variant possessing a HRV-C like 5ˈUTR sequence. However, no difference is observed between HRV-Ca and –Cc strains in terms of epidemiology or clinical presentation (Wisdom et al., 2009a). Schibler and co- workers found that artificially generated chimaeric genomes with a coding region derived from HRV-A16 and 5ˈUTR regions from other Enterovirus species (specifically HRV-Ca, HRV-Cc, HRV-B and EV-A) were all able to replicate successfully in vitro (Schibler et al., 2012). However, each recombinant genome was consistently out-competed by the wild type HRV-A16 in competition experiments. This indicates that, in this case, while the 5ˈUTR and coding regions of these viruses are functionally compatible, the wild type maintains a replication advantage.

5.4.3 Putative recombination events within the 2A coding region of HRV-C

The 2A coding region in HRV/EV encodes a chymotrypsin-like proteinase (2Apro), of 142 amino acids and the crystal structure has been solved (Petersen et al., 1999). 2Apro has two primary functions within the HRV life cycle. Firstly, it is responsible for the initial cleavage of the viral genome post-translation. 2Apro cleaves the structural and non-structural proteins, separating the two domains at its own amino terminus (N-terminus) (Toyoda et al., 1986). In addition, 2Apro is responsible for the cleavage of eukaryotic initiation factor eIF4G which is required for the recruitment of capped mRNA to the ribosome and therefore essential for host-cell mRNA translation (Glaser et al., 2003; Lloyd et al., 1988). This disruption of host cell metabolism (“host cell shut-off”) allows uninhibited initiation of translation of picornavirus mRNA through an internally initiated process (via the IRES).

169

Analysis of genetic diversity and sites of recombination in HRV-C

Structurally, 2Apro is comprised of N-terminal (amino) and C-terminal (carboxyl) domains, connected by an approximately 12 amino acid interdomain loop (annotated onto sequence data in Figure 5.12) (Petersen et al., 1999). Unexpectedly, we observed a possible second site of HRV inter-species recombination manifest as a 260 base fragment of high sequence similarity between HRV-A and HRV-C. A similar observation was previously made in analysis of sequence GQ223227 (N4), which appeared to be more closely related to HRV-A sequences in the 3ˈ end of 2A (Huang et al., 2009). Analysis of our expanded dataset (containing 96 HRV-C 2A sequences) revealed that the putative recombinant region corresponded precisely to the boundaries of the C-terminal domain. Phylogenetic trees constructed of the putative recombinant region (nucleotide positions 3268 – 3525) showed all known HRV-C sequences grouping within the HRV-A clade. The C-terminal domain of the 2Apro is highly conserved, not only among HRV, but among all picornaviruses (Petersen et al., 1999). The observation of a putative recombination breakpoint in this region raises the possibility of a degree of modularity between the two domains. The N-terminal region is much more variable between HRV-A and HRV-C and the two species show segregation on phylogenetic trees, as is observed in other coding regions (Figure 5.10). Interestingly, a previous study observed this same modularity between the two domains when comparing proteinase sequences of HRV and Streptomyces griseus proteinase B (SGPB) by structure based alignment. Although the N-terminus was variable and significantly truncated in HRV, the C-terminal domain of both distantly related proteins was topologically identical (Petersen et al., 1999). This suggests a strong evolutionary constraint related to the structure of this type of proteinase, which may account for the observed sequence similarity among closely related species.

Inspection of the amino acid alignment for the 2A region of HRV-A, -B and –C highlighted an area of obvious similarity corresponding closely to the C-terminal domain of the 2A protein (Figure 5.12). HRV-B sequences remained divergent from

170

Analysis of genetic diversity and sites of recombination in HRV-C both HRV-A and HRV-C throughout the 2A region, consistent with their separate phylogenetic grouping (Figure 5.10). Analysis of pairwise nucleotide p-distances within the 2A coding region showed that there was considerably less genetic diversity observed within the putative recombinant region. This was true of intra- species distances for HRV-A and HRV-C, as well as inter-species distances between HRV-A and HRV-C. This observation was even more marked when pairwise amino acid p-distances were considered (Figure 5.13). In fact, the observed amino acid p- distance between HRV-A and HRV-C sequences in the non-recombinant region is more than twice that observed in the putative recombinant region.

FIGURE 5.13: Pairwise nucleotide and amino acid p-distances for the 2A coding region of HRV- A, HRV-C and HRV-A/-C. The 2A region has been divided into the non-recombinant (3127-3266) and putative recombinant (3267-3525) regions. Pairwise nucleotide p-distance is shown in A and pairwise amino acid p-distance is shown in B.

The C-terminal domain of 2Apro is structurally composed of 6 β-sheets which co- ordinate the binding of a zinc ion adjacent to the catalytic site of the protease. The zinc ion is extremely tightly bound within the protein, requiring denaturation of the protease in order to chelate it (Sommergruber et al., 1994). Despite a lack of direct involvement in enzyme catalysis, the presence of zinc within the molecule has been

171

Analysis of genetic diversity and sites of recombination in HRV-C found to be essential for function of the enzyme and indeed for viral replication (Glaser et al., 2003), secondary to a role in maintaining structural integrity (Voss et al., 1995). All four of the known zinc binding sites are located within the putative recombinant region of 2Apro and consist of three cysteine (Cys) residues and one histidine (marked on Figure 5.12). The zinc binding sites are conserved in a broad range picornaviruses (Petersen et al., 1999) and in all sequenced HRV strains, with only one exception (GQ223228). This single sequence contains an Arginine (Arg) residue at position 3456, in place of Cys, corresponding to a single nucleotide mutation within the coding sequence. Given the differing physicochemical properties of these two amino acids, it seems unlikely that this substitution would allow efficient zinc binding. Additional strains of HRV-C8 that were sequenced in this region displayed a perfectly conserved Cys residue at the position in question. This single base change observed in only one published strain may be due to ambiguities often observed with certain sequencing techniques, where misincorporation of a nucleotide can lead to infrequent random base mutations.

Recombination is detected by the observation of the loss of genetic diversity and incongruence of phylogenetic relationships within a putative recombinant region. We did observe a higher than expected degree of sequence conservation within the C-terminal domain of HRV-A and HRV-C sequences. Our original analysis of putative recombination within this region suggests that HRV-C sequences were scattered throughout the HRV-A clade and that the conservation of certain monophyletic groups of sequences may indicate linkage between the recombination observed in the 5ˈUTR and 2A regions (McIntyre et al., 2010). However, the inclusion of almost 10 times the number of sequences included in the original analysis revealed that all HRV-C sequences grouped together within the HRV-A clade and that bootstrap supported clades maintained between the 5ˈUTR and 2A regions were also maintained throughout the coding region. Additionally, we observed largely congruent phylogenetic relationships between sequences in the VP1

172

Analysis of genetic diversity and sites of recombination in HRV-C and 2A coding regions. This suggests that a strong purifying selection influence in this region, rather than inter-species recombination, may be driving the pattern of genetic diversity observed. In order to maintain phylogenetic grouping within the coding region, a recombination event in this region would have presumably been required to have taken place prior to the diversification of HRV-C into genetically distinct types. The markedly reduced amino acid divergence, as compared with nucleotide divergence in the C-terminal domain of this protein is consistent with a high number of synonymous mutations, again suggesting a region under the influence of strong negative selection.

Consistent with the proposed functional and structural modularity between these two domains is the fact that the 5ˈ end of the putative recombinant region corresponds almost exactly to the 3ˈ end of the HRV-A cis-acting replication element (cre) (Gerber et al., 2001). A breakpoint in this region would not disrupt the secondary structure or function of the cre, which is essential for RNA replication. Additionally, the fact that there is no structural homology between the proposed HRV-A cre and the corresponding sequence fragment in HRV-C is consistent with the current understanding that the HRV-C cre is located within the VP2 region (Cordey et al., 2008).

In conclusion, recombination within the coding region appears to be extremely rare within HRV-C. This observation is limited by the fact that our analysis of HRV-C sequences was restricted to strains that were mostly isolated from a small geographical region and over a restricted time period. Subsequent analysis focussed on the occurrence of recombination within all species of HRV over a significantly extended time period and isolated from several distinct geographical regions. The occurrence of recombination within the 5ˈUTR of HRV-C sequences has been previously documented (Huang et al., 2009; Wisdom et al., 2009b; Palmenberg et al., 2009). However, to our knowledge, this was the first study to attempt to accurately

173

Analysis of genetic diversity and sites of recombination in HRV-C pinpoint recombination breakpoints. The grouping of recombinant HRV-Ca sequences with HRV-A at the extreme 5ˈ end of the 5ˈUTR suggests a modular exchange of almost the entire 5ˈUTR region. Sequence analysis of full HRV-C genomes revealed a highly conserved sequence fragment corresponding to the C- terminal domain of the 2A proteinase. However, we did not find evidence for a linkage of the recombination events between the 5ˈUTR and 2A regions.

174

Chapter 6 Recombination in the evolution of HRV genomes

6.1 Introduction

The occurrence of recombination within the coding region of all three species of HRV is generally considered to be both rare and sporadic, notwithstanding significant disagreement about its actual frequency within the currently published literature. For example, a study of 45 HRV-A and HRV-B serotypes found some evidence of phylogenetic incongruity within the coding region of the genome, but could not confirm any putative recombination events by RDP analysis (Lewis- Rogers et al., 2009). In contrast, a study of 35 full genomes of HRV-A/-B found that the full genome phylogeny was identical to that constructed from various subgenomic regions and the sequence identity throughout the genome was largely consistent (Kistler et al., 2007b). However, several small scattered putative recombination events were identified by RDP analysis. Phylogenetic incongruity was detected between VP4/VP2 and 3Dpol sequences of HRV-A/-B (Savolainen et al., 2004). However, due to the poor bootstrap support for inter-type relationships in the VP4/VP2 region, none of these potential incongruities could be further analysed. Sporadic putative recombination events have also been identified within the coding region of HRV genomes (Tapparel et al., 2009b). Finally, a large study which included at least one sequenced isolate of all 100 known HRV-A/-B serotypes found that the full genome phylogeny recovered did show a number of inconsistencies with subgenomic regions (Palmenberg et al., 2009) and ultimately suggested that several contemporary HRV types had arisen due to recombination. The limitation in previous studies may be related to the sample size or time span of data used for the analysis. None of the studies mentioned included both a representative set of full genome sequences and several sequenced examples from most HRV types.

175

Recombination in the evolution of HRV genomes

Our analysis (presented in Chapter 5) of a set of contemporary HRV-C sequences was limited by the absence of sequences collected prior to 2001, giving only a 7 year time span for analysis. Analysis of several extensively recombinant EV-B types (E9, E11 and E30) indicated that even these required sequence analysis spanning several years for recombination to become widely evident (McWilliam Leitch et al., 2009a, 2010). In particular, E11 showed a half-life for individual recombinant forms of nearly 10 years. An analysis of HRV-C sequences over an extended time period had never been conducted and in addition, HRV-A/-B have not been extensively analysed using a large dataset of both contemporary and prototype sequences. Therefore, an extended HRV dataset was constructed in order to more definitively measure recombination frequencies occurring during the diversification and contemporary evolution of HRV types. The analysis presented within this chapter focussed on examination of the occurrence of recombination within all published full genome sequences of all three species of HRV and additionally compared sequenced isolates spanning 14 years. This study was the first to comprehensively characterise all detectable recombination events in HRV and was published in 2013 (McIntyre et al., 2013b) .

6.2 Materials and Methods

6.2.1 Sample selection 28 HRV positive respiratory samples isolated from 1995 – 1997 in Finland were provided by Carita Savolainen-Kopra (National Institute for Health and Welfare (THL), Mannerheimintie 166, Finland). The VP4/VP2 sequences for these isolates had been previously published and were within the series of accession numbers EU590043 – EU590113 (Savolainen-Kopra et al., 2009a) and AY015114 – AY015174 (Savolainen et al., 2002). By analysis of pairwise nucleotide p-distances and phylogenetic clustering, these isolates were divided into 22 HRV types (10 HRV-A, 6 –B and 6 –C).

176

Recombination in the evolution of HRV genomes

41 HRV positive respiratory samples isolated in Edinburgh of these same 22 HRV types were identified and selected. In addition, a further 23 HRV-C sequences relating to 11 previously provisionally assigned HRV-C types were identified during the course of this work and were also included in the recombination analysis.

6.2.2 Amplification of three coding regions of HRV

The VP4/VP2 and VP1 regions of HRV-A, -B and –C were amplified as previously described (Chapter 2). A fragment of 3Dpol stretching from positions 6414 – 6896 in HRV-A (numbered by FJ445111- HRV-A1), 6475 – 6960 in HRV-B (numbered by X01087 : HRV-B14) and 6361 – 6835 in HRV-C (numbered by EF582385 : HRV-C4) was amplified as described in Chapter 2 using primers listed (Appendix 1)18. The region described above corresponded to that amplified in HRV-C in previous work (McIntyre et al., 2010). Where necessary, for samples which proved challenging to amplify, a SuperScript III (Invitrogen, UK) protocol was used as previously described.

Two HRV-B isolates could only be amplified by using a multiplexed mixture of primers for all three HRV species. Subsequent inclusion of inner primers for all species in the sequencing reaction gave a clear, readable sequence. Neither of these two HRV-B sequences were found to be recombinant in the 3Dpol region.

All sequences included in this study have been submitted to Genbank and assigned accession numbers in the series KC342054 – KC342173.

6.2.3 Construction of datasets for recombination analysis

Two separate datasets were constructed for analysis of recombination within HRV genomes. The first consisted of all VP4/VP2, VP1 and 3Dpol sequences generated in this study (McIntyre et al., 2013b), in our previous work on recombination within

18 Within this chapter, all analysis of the 3Dpol region was undertaken using the fragment between the co-ordinates described above. 177

Recombination in the evolution of HRV genomes

HRV-C (McIntyre et al., 2010) and extracted from all HRV polyprotein sequences and population sets available on Genbank (downloaded on 6/6/12). In order to meet the criteria for inclusion in the study, sequences were required to be more than 90% complete across the VP4/VP2 region previously described (615 – 1004 numbered by EF582385), the full VP1 region and the 3Dpol fragment described above. In addition, all sequences which had been annotated as non-functional or contained stop codons were excluded. This resulted in inclusion of 136 HRV-A, 53 HRV-B and 121 HRV-C sequences, in which the three regions were concatenated to form a single long sequence suitable for recombination analysis. The second dataset constructed included all available HRV polyprotein sequences that were more than 90% complete across the P1, P2 and P3 regions (numbered by Genbank annotation for FJ445111 : HRV-A1, X01087 : HRV-B14 and EF582385 : HRV-C4).

6.2.4 Likelihood mapping analysis

Many inferred phylogenetic clades observed in preliminary studies suffered from poor bootstrap support and this was especially marked when considering deep phylogenetic relationships within the VP4/VP2 and 3Dpol regions. In order to assess whether conclusions based on inspection of bootstrap supported phylogenetic clades were valid, it was necessary to confirm the presence of adequate phylogenetic signal and that a tree-like topology could adequately describe the phylogenetic relationships within the dataset.

Likelihood mapping analysis was undertaken using TREE-PUZZLE v5.2 (Schmidt et al., 2002), using 10000 random quartets, uniform rates and a specified HKY substitution model. All other parameters were estimated from the data by maximum likelihood during execution of the program. Fragment datasets were analysed for each studied region and each species separately. The analysis was also carried out in duplicate to ensure convergence of results.

178

Recombination in the evolution of HRV genomes

6.2.5 Analysis of mean substitution rates

A total of 18 HRV-A and 23 HRV-C sequences from 6 distinct HRV types were selected for analysis of mean substitution rates. These included sequences representing HRV-A9, -A28, -A78, -C9, -C12 and –C18. These were selected on the basis of a lack of observed inter- and intra-type recombination between VP1 and 3Dpol. Datasets were constructed of each individual type group and subjected to analysis with SBP (Kosakovsky Pond et al., 2006a), GARD (Kosakovsky Pond et al., 2006b) and RDP (Martin et al., 2010) prior to inclusion.

Analysis of mean substitution rates was carried out in BEAST (Drummond and Rambaut, 2007). All analyses were carried out using the SRD06 model of nucleotide substitution, as recommended for protein coding sequences (Shapiro et al., 2006). A relaxed uncorrelated lognormal molecular clock and a constant population size were utilised and all other priors were optimized during execution of the program. The MCMC chain was run for 20 million states and the output was recorded every 1000 states. Trees were directly compared to those produced by bootstrap re-sampled MCL neighbour joining methods to ensure consistency of the phylogenetic relationships inferred.

6.2.6 Recombination analysis

In addition to phylogenetic analysis carried out by visual inspection of bootstrap supported clade members, branch length and overall tree topology, several computational methods were employed to both screen entire datasets for evidence of recombination and also to further characterise any putative recombination events.

Datasets containing both full HRV species and single HRV types were subjected individually to analysis with GARD (Kosakovsky Pond et al., 2006b), SBP (Kosakovsky Pond et al., 2006a) and the RDP v4.0 software package (Martin et al., 2010). For any putative recombination event highlighted on large-scale screening or

179

Recombination in the evolution of HRV genomes inspection of phylogenetic trees, additional confirmatory analysis was undertaken with RDP v4.0 programs. In each instance, a putative recombinant sequence or group of sequences were screened along with the nearest neighbours in each region. As a control measure, several groups of sequences which were observed to have no evidence of phylogenetic incongruence on inspection of bootstrap supported phylogenetic trees were analysed by RDP, GARD and SBP. All analyses confirmed no evidence of recombination within these datasets.

Further characterisation of putative recombination events was undertaken using the GroupScan program within the SSE v1.0 software package (Simmonds, 2012) and supplemented by construction and inspection of bootstrapped phylogenetic trees of the regions flanking every putative recombinant region. Unlike the other computational methods employed, GroupScan specifically scores the extent of phylogenetic clustering of a query sequence with specified control groups and therefore takes changes in bootstrap supported phylogenetic tree topology into account. This is especially useful in cases where parental sequences are not present in the dataset and phylogenetic inference is more likely to determine an accurate grouping than sequence distance alone. Additionally, the usage of full clades as control groups prevents the unnecessary aggregation of valuable sequence data that occurs when creating a consensus sequence. As GroupScan outputs a series of grouping score values calculated over a specified sliding window and plotted as a linear graph, the putative recombination breakpoint can be easily calculated by the intersection of the lines. In these studies, concatenated sequence fragments from VP4/VP2, VP1 and 3Dpol were scanned with a fragment size of 300 bases and an increment of 30. In each case, the query sequence was the potential recombinant and it was scanned against the full nearest neighbour group in both VP1 and 3Dpol. Each analysis was additionally repeated with a fragment size of 600 bases and a window of 30 to ensure consistency. This analysis gave highly similar results.

180

Recombination in the evolution of HRV genomes

6.3 Results

6.3.1 Phylogenetic analysis within HRV type groups

Sequenced HRV isolates were combined with sequences obtained from Genbank to give a total of 164 HRV-A, 70 –B and 145 –C sequences that were more than 90% complete across the VP4/VP2, VP1 and 3Dpol regions described. In addition, a second dataset was assembled containing all published HRV polyprotein sequences that were more than 90% complete across the P1, P2 and P3 regions (201 HRV-A, 71 –B and 59 –C).

Inspection and comparison of phylogenetic trees constructed for the VP4/VP2 and VP1 regions revealed the majority of HRV sequences grouping in both regions into bootstrap supported phylogenetic clusters, analogous to type groups19. The putative type groupings defined by VP1 clustering and pairwise nucleotide p-distance analysis were largely maintained in both the VP4/VP2 and full P1 regions.

Bootstrap support observed for clusters of HRV types within all three species was generally poor within the short and well conserved VP4/VP2 region (Figure 6.1). Only two HRV types showed evidence for any phylogenetic incongruity. HRV-A31 and HRV-A47 have previously been described as distinct HRV-A types, due to their genetic distance and phylogenetic relationship in VP1 (Chapter 4). However, on inspection of VP4/VP2 trees, contemporary sequences group separately and are not easily assigned to either type. This anomaly has been previously described as a pair of HRV types which are less divergent in VP4/VP2 than would be expected. Likewise, HRV-A29 and HRV-A44 are equally related to one contemporary sequence JN815252 in the VP4/VP2 region. This serotype pair was previously highlighted as potentially able to be combined.

19 Although we have proposed that certain HRV type groups undergo revisions to their classical definitions (for example, HRV-A29 and HRV-A44), all analysis presented within this chapter follows the conventional labelling of HRV types. 181

Recombination in the evolution of HRV genomes

182

Recombination in the evolution of HRV genomes

FIGURE 6.1: Neighbour joining phylogenetic trees showing the VP4/VP2 and VP1 region of HRV-A, -B and –C. Analysis included full VP1 region as previously described and VP4/VP2 region between 615 – 1004(numbered by EF582385). Branches are scaled by genetic distance.

No evidence of recombination was evident within the capsid region on inspection of phylogenetic trees20 (Figure 6.1). In addition, analysis with RDP, GARD and SBP failed to highlight any significant instances of putative recombination.

6.3.2 Likelihood mapping analysis

Likelihood mapping analysis revealed strong support for a tree-like pattern of evolution in all three gene regions of all three species (Figure 6.2). In TREE- PUZZLE v5.2 output, the proportion of analysed quartets with a reliably resolved tree-like phylogeny is mapped into the three vertices of the triangle, while the proportion of quartets which produce a star-like phylogeny or phylogenetic noise are mapped in the centre. For each species and each analysed gene, the proportion of unresolved quartets did not exceed 1%. In contrast, general support for a resolved bifurcating phylogenetic tree consistently exceeded 90%. It can therefore be inferred that conclusions based on analysis of bootstrap supported phylogenetic trees are likely to be valid.

20 The sequence JX291115 which was reported as a potential recombinant between VP4/VP2 and VP1 in the analysis of HRV-C taxonomy was not included in this analysis as the sequence was published after the datasets for recombination analysis were assembled. 183

Recombination in the evolution of HRV genomes

FIGURE 6.2: Likelihood mapping analysis of the VP4/VP2, VP1 and 3Dpol regions of HRV-A, -B and –C. Proportion of quartets in the vertices represent those with fully resolved bifurcating phylogeny, whereas the proportion in the centre represent those with star-like or network-like evolution.

6.3.3 Analysis of mean substitution rates

Mean substitution rates were calculated for the VP1 and 3Dpol regions of three HRV-A and HRV-C types. Phylogenetic trees analysed for each type showed no bootstrap supported change in topology within each type or species. In addition, datasets including sequences from either a single type or a single HRV species showed no evidence of recombination when analysed with RDP, GARD and SBP. The total number of HRV-A and HRV-C sequences analysed was 18 and 23 respectively and the datasets spanned 40 years (HRV-A) and 14 years (HRV-C).

184

Recombination in the evolution of HRV genomes

Insufficient sequences of individual HRV-B types were available for inclusion in the analysis.

Calculated mean substitution rates and 95% highest posterior density (HPD) intervals were of the order of 10-3 to 10-2 substitutions per site per year. The rates observed were largely similar between both regions and both HRV species analysed (Figure 6.3). Although both the VP1 and 3Dpol regions of HRV-C showed a 2 to 3-fold elevation in substitution rates (6.9138 x 10-3 and 9.8491 x 10-3 respectively) when compared to those estimates obtained for HRV-A strains (3.9954 x 10-3 and 3.4586 x 10-3), the 95% HPD intervals (which are analogous to confidence intervals) were clearly overlapping between both species.

FIGURE 6.3: Mean substitution rates and 95% highest posterior density intervals for two coding regions of HRV-A and HRV-C.

Phylogenetic trees produced by Bayesian methods were identical in branching order and type membership to the bootstrap resampled MCL neighbour joining trees utilised in subsequent analysis (data not shown). Both independent executions of the BEAST program gave highly similar results. In addition, analyses run with

185

Recombination in the evolution of HRV genomes sequence data excluded or dates of isolation scrambled showed no overlap with the genuine estimates obtained.

6.3.4 Comparisons of pairwise nucleotide p-distances

The concurrence of similar mean substitution rates between the VP1 and 3Dpol regions of HRV-A and –C should create a linear relationship between sequence divergences between these two regions. This was indeed observed in both intra- and inter-type distance ranges for HRV-B and HRV-C (Figure 6.4). Within distributions for both HRV-B and HRV-C, there were very few, if any outlying data points representing potentially recombinant sequences and the line of best fit displayed a gradient of close to 1 (0.9042 and 0.9418 respectively). This correlation provides evidence for maintained equal substitution rates and a general lack of recombination throughout the period of diversification into these distinct types.

In contrast to the congruent relationship observed in HRV-B and –C sequences, HRV-A sequences were almost uniformly less divergent than expected in the 3Dpol region when compared to VP1 divergence. A line of best fit for the entire HRV-A dataset gave a gradient of 0.8714. In addition, a large number of outlying data points were observed which may be indicative of individual recombinant sequences during the diversification of HRV-A sequences.

186

Recombination in the evolution of HRV genomes

FIGURE 6.4: Comparison of pairwise nucleotide p-distances between the VP1 and 3Dpol regions of HRV-A, -B and –C. In each distribution, the line of best fit is indicated by the red dotted line. Line is shown for putative recombinant and non-recombinant HRV-A sequences separately. The graph depicting HRV-A comparisons shows putative recombination events involving full HRV-A type groups in red and those involving single sequences in yellow. For HRV-B sequences, all putative recombination events are shown in red. The previously described, likely erroneous recombinant HRV-C sequence GQ223227, has been excluded from the analysis.

Analysis of individual HRV-A pairwise p-distances that were discordant between genome regions allowed the identification of potential recombinant sequences. In combination with inspection of bootstrap supported phylogenetic trees, pairwise p-

187

Recombination in the evolution of HRV genomes distances of HRV-A sequences could be readily divided into three distinct sets. The majority of sequences were non-recombinant and showed no evidence of bootstrap supported changes in tree topology (shown in blue on Figure 6.4). Non-recombinant sequences had a line of best fit gradient of 0.9011 similar to that of species B and C. The remaining putative recombinant HRV-A sequences could be categorised into two distinct distributions; one arising from large groups containing several types (shown in red on Figure 6.4) and one containing instances of sporadic phylogenetic incongruity involving single sequences alone (shown in yellow on Figure 6.4). This analysis highlighted the possibility of significant recombination acting in both the recent diversification of HRV-A (sporadic recombinants) and more ancient events that occurred before the diversification of HRV-A into types.

6.3.5 Analysis of recombination within the coding region of HRV-A

With the exception of HRV-A12 and HRV-A78, all HRV-A sequences grouped into four bootstrap supported clades (numbered on Figure 6.5), which branched basally on both VP1 and 3Dpol trees. These basal clades were maintained in the VP1, 3Dpol, P1, P2 and P3 regions. Due to poor bootstrap support for relationships between distantly related HRV types, the basal clades were not maintained in VP4/VP2. However, no bootstrap supported incongruity was observed. The clades were additionally observed to be present, bootstrap supported and monophyletic in every other individual gene region of HRV-A (data not shown), with the exception of the short and well-conserved VP4 and 3B regions. Interestingly, although multiple instances of bootstrap supported recombination events were recorded within clades 1, 2 and 3, there were no instances of recombination within the coding region occurring between these clades. In addition, no changes in tree topology were observed with clade 4 due to the inclusion of only 3 HRV-A types within it.

188

Recombination in the evolution of HRV genomes

189

Recombination in the evolution of HRV genomes

FIGURE 6.5: Neighbour joining phylogenetic trees showing the VP1 and partial 3Dpol regions of all available HRV-A strains. Where HRV-A types are maintained between both regions, these have been collapsed for ease of reference. Contiguous basal clades 1 to 4 are marked by coloured boxes. Clades 2, 3 and 4 are marked by purple, green and black boxes respectively, while Clade 1 is divided into sub-clusters which group together in 3Dpol. Within Clade 1, the orange arrows indicate the formation of one 3Dpol clade from three VP1 clades. Putative recombinant HRV-A types which undergo a bootstrap supported change in nearest neighbour between VP1 and 3Dpol are marked with a red line. Any changes in tree topology which are bootstrap supported only on P1, P2 and P3 trees are indicated with a yellow line. Branches are scaled by genetic distance.

The majority of observed putative recombination events within HRV-A involved full groups of sequences, analogous to one HRV type undergoing a change in tree position between VP1 and 3Dpol. Within clades 2 and 3, four HRV-A types underwent a change in bootstrap supported nearest neighbour between VP1 and 3Dpol and an additional HRV-A type showed evidence of a putative recombination event that was evident on comparison of P1, P2 and P3 trees.

Clade 1 formed several distinct sub-clusters of HRV-A types and displayed a clear breakdown in the phylogenetic relationships between these types between VP1 and 3Dpol. One monophyletic, bootstrap supported clade in 3Dpol (marked in orange on Figure 6.5) was formed from members of three, distinct and distantly related VP1 clades. Relationships between types were largely incongruent within the newly formed 3Dpol clade and the comparisons of pairwise p-distances between sequences within this clade corresponded to the large cluster of putative recombinant sequences between 0.25 and 0.32 VP1 p-distances (coloured red in Figure 6.5). However, there was uniformly observed poor bootstrap support for nearest neighbour relationships between some HRV-A types. These poorly supported relationships were largely mirrored on P3 trees. Remarkably, the majority of observed phylogenetic incongruity within all HRV-A sequences corresponded to full HRV-A types and more contemporary relationships between strains of the same HRV-A type are perfectly congruent. This may suggest the occurrence of widespread recombination during the period of the diversification of HRV-A into the presently observed types. The number of bootstrap supported nodes that each putative recombinant group

190

Recombination in the evolution of HRV genomes violated in the transition between VP1 and 3Dpol tree position ranged from two to eight (summarised in Table 6.1).

TABLE 6.1: Phylogenetic incongruities observed between VP1 and 3Dpol in HRV-A sequences HRV Type Accession Nearest Nearest Bootstrap Calculated Numbers neighbour Neighbour supported breakpoint VP1 3Dpol nodes mean (range) violateda A7 FJ445176 A88 A36 3 6576 DQ473503 A58 A89b A13 FJ445116 A41 A73 2 3393 FJ445117 (3365 – 3420) A18 FJ445118 A34 A24 7e 5276 F292/8643c A50 A31 (5024 -5577) JF781496 A38 JF781510d A40 A47 A54 A60 A66 A77 A85 A90 A98 A31 FJ445126 A25 A47 6 5224 A29 A54 A44 A98 A47 A62 JN815252 f A38 FJ445180 A15 A18 7g 5130 DQ473495 A60 A24 (5118 – 5143) JQ994496 A74 A31 JN541272 A40 A47 A54 A60 A66 A77 A85 A90 A98

191

Recombination in the evolution of HRV genomes

HRV Type Accession Nearest Nearest Bootstrap Calculated Numbers neighbour Neighbour supported breakpoint VP1 3Dpol nodes mean (range) violateda A39 AY751783 A40 A2 4 3431 A54 A23 A56 A30 A59 A49 A63 A85 A98 A47 FJ445133 A25 A31 6 5424 JN837692 A29 A54 (5416 – 5435) GQ223229 A31 A98 A44 A62 JN815252 A46 DQ473506 A80 A53 4 3280 A53 DQ473507 A28 A46 4 3213 JN798587 (3198 – 3228) A54 FJ445138 A40 A31 6 3228 A56 A47 A85 A98 A98 A60 FJ445143 A15 A18 8h 5117 JN798590 A38 A24 (5111 – 5123) A74 A31 A38 A40 A47 A54 A66 A77 A85 A90 A98 A66 FJ445148 A77 A24 6 5652 JN112340 (5645 – 5674) JN621246 JQ837715 A68 FJ445150 A20 A28 4 3998 JN798578 (3995 – 4001) A76 EU840726 A76 A56 7 5130 (except JN815328 (5129 – 5131) FJ445182, JX074055 DQ473502) JX074049 A98 FJ445139 A40 A31 6 3360 FJ445173 A54 A47 (3343 – 3377) A56 A54 A85

192

Recombination in the evolution of HRV genomes a This relates to the number of bootstrap supported nodes a group of sequences violates to form the new grouping. b Where more than one type is listed as a single nearest neighbour, this represents the most closely related phylogenetic clade which is bootstrap supported. For example, in cases where the putative recombinant type is an outgroup to a clade or its relationship with the most closely related sequences is not supported, all bootstrap supported types were considered as nearest neighbour. c Sequences indicated in bold italics are those which consist only of non-consecutive sequence fragments and are therefore not represented on P1, P2 and P3 trees. These were excluded from breakpoint analysis. d Sequences indicated in bold font are those which are members of a type which is a putative recombinant but are not themselves present on VP1 and 3Dpol trees due to sequence gaps. These are however represented on P1, P2 and P3 trees. e Due to a lack of bootstrap support for nearest neighbour grouping in 3Dpol for A18, nodes changes are taken from grouping with A34 to A85 (nearest bootstrap supported neighbour in the P3 region). f JN815252 is part of a group of previously described contemporary sequences which cannot be definitively assigned to either the HRV-A29 or HRV-A44 classical groupings. It is therefore listed as a single sequence. g Node changes for HRV-A38 between VP1 and 3Dpol are taken from grouping with HRV-A60 to HRV-A18. h Node changes for HRV-A60 between VP1 and 3Dpol are taken from grouping with HRV-A38 to HRV-A77.

Several HRV-A types showed a pattern of divergence whereby contemporary strains grouped separately from older prototype strains in the 3Dpol region. This represents tentative evidence of recombination within the time-frame of this analysis. As previously reported (Tapparel et al., 2009b), contemporary HRV-A76 strains collected between 1999 – 2010 group separately from the HRV-A76 prototype strain and form a close association with the HRV-A56 group. HRV-A56 maintains a congruent phylogenetic position between VP1 and 3Dpol and so this suggests that HRV-A76 have potentially recombined with HRV-A56 between 1999 and 2010 (Figure 6.6 A1). The intra-type p-distance range of contemporary HRV-A76 strains compared with prototype strains is 0.084 to 0.093 in VP1 and 0.209 to 0.229 in 3Dpol. Unfortunately, an insufficient number of contemporary HRV-A76 strains were available to perform Bayesian analysis to determine the date of most recent common ancestor of this group.

HRV-A68 and HRV-A33 also displayed a similar pattern of divergence, whereby the older prototype strains grouped preferentially with HRV-A28 and HRV-A76 type

193

Recombination in the evolution of HRV genomes strains respectively in 3Dpol (Figure 6.6 A2 and A3). However, both of these events relate to putative recombination between types which are closely related and so their significance is uncertain. All of the above listed phylogenetic incongruities were confirmed as putative recombination events by analysis with SBP, GARD and RDP (data not shown).

FIGURE 6.6: Neighbour joining phylogenetic trees showing distinct patterns of recombination observed in HRV-A sequences. The branch to tree root has been collapsed for ease of reference and the outgroup used for analysis is given below. A: Phylogenetic trees showing examples of incongruent topology between VP1 and 3Dpol for (1): HRV-A56 and HRV-A76 (outgroup: HRV-A7), (2): HRV-A20, HRV-A28 and HRV-A68 (outgroup: HRV-A95) and (3): HRV-A33 and HRV-A76 (outgroup: HRV-A88). B: Phylogenetic tree showing branch length discrepancies between VP1 and 3Dpol within HRV-A80 (outgroup: HRV-A46). Branches are scaled by genetic distance.

194

Recombination in the evolution of HRV genomes

HRV-A80 sequences showed a substantial difference in intra-type branch lengths between the two regions. However, there was no accompanying change in tree topology (Figure 6.6 B). The HRV-A80 prototype strain (FJ445156) displayed a pairwise nucleotide p-distance from contemporary HRV-A80 strains of 0.078 in VP1 and 0.207 in 3Dpol. This discrepancy was confirmed with RDP analysis (data not shown).

6.3.6 Recombination between the P1, P2 and P3 regions of HRV-A

Sequence groupings observed in phylogenetic trees constructed for the whole P1, P2 and P3 coding regions were largely consistent with those observed in VP1 and 3Dpol. As expected, P1 was analogous to the VP1 tree and P2/P3 were similar to 3Dpol. The putative recombinant sequences groups observed in the 3Dpol tree were also evident upon inspection of P2 and P3 trees. However, seven further HRV-A types displayed evidence of bootstrap supported phylogenetic incongruity which was not bootstrap supported in 3Dpol (Figure 6.7). Of these seven further HRV-A types which were observed to undergo bootstrap supported changes in tree position, four showed evidence of incongruity between P1 and P2 and three between P2 and P3 (marked in yellow in Figure 6.7). All but one involved full HRV-A types (listed in Table 6.2).

The HRV-A40 type group includes three recently described polyprotein sequences. These three sequences were not included in the VP1 and 3Dpol analysis due to the presence of gaps which resulted in sequence completeness below the threshold of 90%. However, on analysis of the whole P3 region, the three contemporary HRV- A40 strains (JX074051, JQ245067 and JN798579) formed a bootstrap supported clade separate to the HRV-A40 prototype strain (marked in yellow in Figure 6.7). The prototype strain maintained its grouping with HRV-A85 (marked with black line in Figure 6.7).

195

Recombination in the evolution of HRV genomes

FIGURE 6.7: Neighbour joining phylogenetic trees showing the P1, P2 and P3 regions of HRV-A sequences. P1, P2 and P3 regions are numbered by FJ445111 : HRV-A1. Seven HRV-A types which change bootstrap supported tree position between these regions and do not have a corresponding bootstrap supported phylogenetic incongruity on VP1 and 3Dpol trees are marked with yellow lines. The non-recombinant HRV-A40 prototype strain is marked with a black line. Those 3 HRV-A types which change bootstrap supported tree position both between P1/P2 and P2/P3 are marked with purple lines. Putative recombinant sequences which underwent one bootstrap supported change in tree topology between VP1 and 3Dpol are not shown. Branches are scaled by genetic distance.

196

Recombination in the evolution of HRV genomes

TABLE 6.2: Phylogenetic incongruities observed between the P1, P2 and P3 regions in HRV-A sequences which lack bootstrap support in the VP1 or 3Dpol regions

Nearest Neighbours Number of changes HRV Accession P1 P2 P3 P1  P2  Calculated Type Numbers P2 P3 breakpoint mean (range) A36 DQ473505 A89 A89 A58 0 2 5068 F41598a (5049-5086) F44035 JF781497 JN614994 JN621243b JN798583 JN798584 JN815241 JN815242 JN815246 JN837697 JX074050 A40 JN798579 A40 A40 A18 1 0 N/Ac (except JQ245067 A31 FJ445129) JX074051 A38 A40 A47 A54 A60 A77 A85 A98 A43 DQ473510 A11 A9 A9 3 0 3248 F41576 A18 A15 A15 (3244-3257) JF781503 A21 A19 A19 JN837690 A24 A22 A22 A33 A32 A32 A34 A64 A64 A50 A67 A67 A55 A74 A74 A57 A75 A75 A75 A82 A82 A76 A94 A94 A90

197

Recombination in the evolution of HRV genomes

HRV Accession P1 P2 P3 P1  P2  Calculated Type Numbers P2 P3 breakpoint mean (range) A56 FJ445140 A40 A10 A10 8 0 3617 EU840727 A54 A76cd A85 A98 A63 FJ445146 A59 A59 A10 0 6e 5124 A56 A76c A75 FJ445131 A11 A9 A9 3 0 3204 JN815237 A18 A15 A15 (3199 – 3209) A21 A19 A19 A24 A22 A22 A33 A32 A32 A34 A43 A43 A43 A64 A64 A50 A67 A67 A55 A74 A74 A57 A82 A82 A76 A94 A94 A90 A100 FJ445175 A10 A25 A10 5 0 3292 A29 A25 A44 A29 A62 A44 JN815252 A56 A62 A63 A76c JN815252 a Sequences indicated in bold italics are those which consist only of non-consecutive sequence fragments and are therefore not represented on P1, P2 and P3 trees. b Sequences indicated in bold are those which are not present on VP1 and 3Dpol trees due to sequence gaps. c As the new tree position of the contemporary HRV-A40 sequences is as an outgroup to a clade which contains the original HRV-A40 prototype sequence and has poor bootstrap support for deep phylogenetic relationships, it was not possible to determine an accurate breakpoint for these sequences. However, by phylogenetic anlaysis, it can be assumed that this breakpoint lies between the P2 and P3 regions. d A76c represents contemporary HRV-A76 sequences (EU840726, JN815238, JX074055 and JX074049). e Node changes for A63 between P2 and P3 are taken from A59 to A10 and A56.

In addition, three HRV-A types were observed to undergo two separate changes in bootstrap supported nearest neighbour, with a change occurring between P1 and P2 and again between P2 and P3 (listed in Table 6.3). The closely related sequence pair

198

Recombination in the evolution of HRV genomes

HRV-A38/A60 underwent all changes together, suggesting that these events may have occurred before their diversification into phylogenetically distinct types. The breakpoint analysis was achieved by considering both HRV types as a single query sequence group. All recombination events described were confirmed by analysis with RDP, SBP and GARD software programs (data not shown).

TABLE 6.3: HRV-A types which displayed evidence of phylogenetic incongruity between both the P1/P2 and P2/P3 regions

Nearest neighbour Number of changes HRV Accession P1 P2 P3 P1  P2  Calculated Type Number P2 P3 breakpoint mean (range)a A18 FJ445118 A34 A66 A85 9 5 3397 F292/8643 A50 A77 (3328 – 3526) JF781496 JF781510 A38 DQ473495 A9 A19 A18 4 b 14c 3235 FJ445180 A15 A60 A31 (3207 – 3285) JN541272 A32 A40 JQ994496 A60 A47 A67 A54 A74 A60 A77 A85 A98 A60 FJ445143 A9 A19 A18 3 10 3257 JN798590 A15 A38 A31 (3218 – 3296) A32 A38 A38 A40 A67 A47 A74 A54 A77 A85 A98 a The calculated mean breakpoint given corresponds to that one which was not observed on VP1 and 3Dpol trees and not described in Table 6.1. b Node changes for HRV-A38/A60 between P1 and P2 are taken from HRV-A60/A38 to HRV-A19 c Node changes for HRV-A38/A60 between P2 and P3 are taken from HRV-A60/A38 to HRV-A40

Comparisons of pairwise nucleotide p-distances between the P1/P2 and P2/P3 regions of HRV-A mirrored the patterns observed upon inspection of phylogenetic

199

Recombination in the evolution of HRV genomes trees. In particular, the majority of the recombination events involving full HRV-A type groups were detected upon comparison of P1/P2 (coloured red in Figure 6.8), including the large scale re-organisation of inter-type relationships seen within Clade 1 (Figure 6.5). A line of best fit imposed on the P1/P2 distribution showed a gradient of 0.7341 for putative recombinant comparisons and 0.9799 for non-recombinant, similar to the pattern observed for the analysis conducted between VP1 and 3Dpol sequences. In contrast, the comparison of P2/P3 revealed relatively fewer, sporadic recombination events. These included instances where contemporary sequences of a particular type group separately to the type strain and are more distantly related than would be expected. For example, the relationship between contemporary HRV-A76 strains and the HRV-A56 type group and the putative recombination event observed within HRV-A80. A line of best fit for the P2/P3 distribution showed a gradient of 1.2082 for putative recombinant comparisons and 0.9813 for non-recombinant.

FIGURE 6.8: Comparison of pairwise nucleotide p-distances between the P1/P2 and P2/P3 regions of HRV-A. In each distribution, the line of best fit is indicated by the red dotted line and given for putative recombinant and non-recombinant sequences. P-distance comparisons between putative recombinant sequences are shown in red.

200

Recombination in the evolution of HRV genomes

Within both analyses, the presence of two distinct distributions was apparent; both occurring at pairwise distances of over 0.25 in the P2 region. Detailed inspection of pairwise nucleotide p-distances indicated that these all represented p-distance comparisons between the observed clades 1 to 4 (Figure 6.5). Meanwhile, all p- distance comparisons below this threshold were between members of the same basal clades. The clear division of the two distributions further strengthens the assertion that relationships between the deep clades have not been subject to recombination within the recent history of diversification of the viruses.

6.3.7 Analysis of recombination within the coding region of HRV-B and HRV-C sequences

In contrast to the widespread phylogenetic incongruity observed within HRV-A sequences, HRV-B and HRV-C showed very limited evidence for recombination within the coding region. Phylogenetic relationships between the two regions within both HRV species were largely congruent (Figure 6.9). Both HRV-B and –C showed evidence for grouping into contiguous basal phylogenetic clades which were maintained in both regions. Due to poor basal bootstrap support, one HRV-C and seven HRV-B types were excluded from membership of any deep clade. Similar to the clustering observed in HRV-A sequences, there was only extremely limited evidence for recombination between the deep clades. The only exception was the potentially erroneous HRV-C sequence GQ223227, which changes deep clade membership between VP1 and 3Dpol.

In accordance with the observed conformity of HRV-B VP1 and 3Dpol pairwise nucleotide p-distances (Figure 6.4), the majority of HRV-B sequences displayed congruent phylogenetic relationships and maintenance of phylogenetic clustering within their types as defined by VP1 (Figure 6.9 A). In fact, only one bootstrap supported change in topology was observed between these two regions of HRV-B sequences.

201

Recombination in the evolution of HRV genomes

FIGURE 6.9: Neighbour joining phylogenetic trees of HRV-B (A) and HRV-C (B) VP1 and partial 3Dpol sequences. HRV types have been collapsed for ease of reference. Putative recombinant sequences and types are marked with black lines. Contiguous clades are marked with black boxes. HRV types have retained their classical definition where we have proposed changes to these. Branches are scaled by genetic distance.

HQ123444 is a recently published full genome sequence (Linsuwanon et al., 2011) and is currently the only representative of the HRV-B100 type. This sequence changed position to cluster extremely closely with HRV-B35 sequences in the 3Dpol region. Analysis of pairwise nucleotide p-distances revealed that this sequence had a minimum inter-type p-distance of 0.19 from HRV-B35 in VP1. However, on

202

Recombination in the evolution of HRV genomes inspection of HRV full genomes, this sequence is identical to HRV-B35 from position 6353, within the 3Dpol region. This observed phylogenetic incongruity was confirmed by analysis with RDP, GARD and SBP.

FIGURE 6.10: Neighbour joining phylogenetic trees showing the P1, P2 and P3 regions of HRV-B sequences. P1, P2 and P3 regions are numbered by X01087: HRV-B14. One putative recombinant HRV-B type (HRV-B70) is indicated by yellow line. For ease of reference, branch to tree root has been collapsed. Branches are scaled by genetic distance.

On inspection of phylogenetic trees constructed for the full P1, P2 and P3 regions of HRV-B full genome sequences, an incongruity in branching order within HRV-B70 was noted (shown with yellow line on Figure 6.10). Contemporary sequences which

203

Recombination in the evolution of HRV genomes belong to the HRV-B70 type by VP1/P1 grouping group separately from the HRV- B70 prototype strain in both P2 and P3. The prototype strain maintains its original tree position, closely related to HRV-B17. This putative recombination event was also evident on inspection of the 3Dpol trees, but only had adequate bootstrap support for the recombinant grouping when P2 and P3 trees were considered. The majority of HRV-B sequences were additionally observed to group within the three basal clades previously observed in the VP1 and 3Dpol regions (marked with black boxes on Figure 6.9).

Within HRV-C sequences, there was only one observed bootstrap supported change in tree position of a single sequence between VP1 and 3Dpol. This involved the previously reported potentially artefactually recombinant sequence GQ223227. Although this change in position was additionally observed within the P2 and P3 regions, there were no other bootstrap supported topology changes within HRV-C (data not shown). Any inconsistency in branching order showed bootstrap support below the threshold of 70% and could not be confirmed by analysis with RDP, GARD or SBP. There was also no recorded bootstrap supported change in basal clade membership between the P1, P2 and P3 regions.

6.3.8 Determining putative recombination breakpoints

In order to estimate recombination breakpoints for putative recombinant HRV sequences, all available full genomes from each noted recombinant type underwent GroupScan analysis and were compared directly to their nearest neighbours in the P1, P2 and P3 regions. Breakpoints were localised to the area where the grouping score with one control group fell and another increased. Breakpoint estimates were pinpointed by calculating the intersection of the two lines. Two representative examples of GroupScan output are given, with putative recombinant breakpoints highlighted (Figure 6.11 A).

204

Recombination in the evolution of HRV genomes

FIGURE 6.11 A: GroupScan of two representative recombinant HRV-A types. Proposed recombination breakpoint are indicated by the dotted line. B: Calculated coding region recombination breakpoints for all HRV-A recombinant types, mapped onto a diagram of the HRV- A full genome. Full genome numbering is taken from FJ445111 – HRV-A1. Several sequences from large recombinant 3Dpol clade have been excluded, as complexity of nearest neighbour relationships precluded accurate estimation of recombination breakpoints.

HRV-A18 has a breakpoint near the P1/P2 boundary which falls near position 3397 (marked by dotted line on Figure 6.11 A). Although an area of uncertain phylogenetic grouping can be observed immediately 5ˈ and 3ˈ to the calculated breakpoint, the only genuine point of intersection is at 3397. HRV-A63 shows a putative recombination breakpoint within the P3 region, with the intersection calculated at position 5124. The genome fragment between position 2000 and 2500

205

Recombination in the evolution of HRV genomes notably shows a similarly elevated grouping score with HRV-A59. However, this association only borders on significance and did not correspond to any bootstrap supported change in tree topology. It was therefore discounted.

Breakpoints were successfully determined for the majority of putative recombinant HRV-A sequences and types. However, within the large recombinant 3Dpol clade observed (Figure 6.5), several HRV-A types had very complex or closely related nearest neighbour relationships in both regions, which precluded the estimation of breakpoints. In addition, no breakpoint could be determined for the putative recombination event involving contemporary HRV-A80 or HRV-B70 sequences, as no minor recombination parent group existed within the dataset.

The majority of putative breakpoints occurred near the P1/P2 boundary, specifically within the 2A coding region (Figure 6.11 B). A second putative hotspot region was observed around the P2/P3 boundary. Most breakpoints within this region occurred within the 3A coding region or at the 3A/3B junction.

6.3.9 Recombination within the 5ˈUTR of HRV-A and HRV-B

Further to the detailed analysis of inter-species recombination within the 5ˈUTR region of HRV-A and HRV-C (presented in Chapter 5), intra-species recombination within the 5ˈUTR region of HRV-A and HRV-B was analysed. All sequences that were more than 90% complete across the 5ˈUTR fragment numbered 167 – 626 (numbered by FJ445111) were included in the analysis (167 HRV-A and 38 HRV-B) 21. The occurrence of recombination within the 5ˈUTR was assessed by visual inspection of phylogenetic trees for HRV-A (Figure 6.12 A) and HRV-B (Figure 6.12 B) separately. An additional analysis including all three species demonstrated distinct clustering of HRV-B sequences, distant to HRV-A and HRV-C within the 5ˈUTR region, indicating no inter-species recombination involving HRV-B.

21 A fragment significantly smaller than the full 5ˈUTR region was used for analysis, as this allowed inclusion of a much larger number of sequences. 206

Recombination in the evolution of HRV genomes

FIGURE 6.12: Neighbour joining phylogenetic trees constructed for the 5ˈUTR and VP4 regions of HRV-A and HRV-B sequences. Putative recombinant sequences and types are marked by a red box on the 5ˈUTR tree and their corresponding position on the VP4/VP2 tree is highlighted by a red line. Branches are scaled by genetic distance.

207

Recombination in the evolution of HRV genomes

Within HRV-A, 17 putative recombinant sequences from nine HRV types were subject to bootstrap supported changes in nearest neighbour between the two regions (marked with black boxes on Figure 6.12). Two sequences (JQ837724 : HRV-A1 and JN621245 : HRV-A67) formed bootstrap supported groupings in the 5ˈUTR distant from other members of the same type. This may indicate the occurrence of recombination events within contemporary sequences.

In contrast, HRV-B 5ˈUTR sequences were considerably less numerous and only three putative recombinant types were noted (marked with black boxes on Figure 6.12). Of these, two consisted of a single sequence which was the sole representative of the type in question.

6.4 Discussion

The results of the analysis presented within this chapter reaffirms the assertion that recombination is a rare and sporadic occurrence in the contemporary evolution of HRV genomes. This suggests that contemporary HRV types may exist as generally segregating groups with little potential for recombination between them and that the mainstay of generation of contemporary HRV diversity is genetic drift. In contrast, a large number of recombination events were noted to have potentially occurred earlier during the period of diversification of HRV-A sequences into currently recognised types.

6.4.1 Estimation of mean substitution rates and analysis of pairwise nucleotide p-distances

This study represents the first comparison of mean substitution rates between distant genome regions of HRV-A and HRV-C. Mean substitution rates calculated for HRV-A and HRV-C correlated with published estimates for other single stranded RNA viruses (Jenkins et al., 2002; Duffy et al., 2008; Hicks and Duffy, 2011). However, most previously published studies of substitution rates with picornaviruses

208

Recombination in the evolution of HRV genomes focus on the capsid coding region, particularly VP1 (Calvert et al., 2010; Hicks and Duffy, 2011). One previously published estimate of the mean substitution rate within VP4/VP2 sequences of HRV-C, 6.6 x 10-4 substitutions/site/year, was considerably lower than our estimates for other regions of HRV-C (Briese et al., 2008). This possibly relates to the much more restricted sampling time of 32 months used in this study or to the fact that VP4/VP2 is generally a well conserved coding region with significantly less observed genetic diversity than the VP1 or 3Dpol regions.

In a published comparison of mean substitution rates within the Picornavirus family, members of the genus Enterovirus (excluded HRV species) were found to have a significantly higher mean substitution rate than other genera (including Apthovirus, Teschovirus, Hepatovirus and Cardiovirus) (Hicks and Duffy, 2011). In fact, substitution rates estimated for both the VP1 and 3Dpol regions of EV were around one order of magnitude faster than those obtained for other members of the Picornavirus family. Our estimates for HRV-A and HRV-C showed overlapping 95% HPD intervals with both EV and non-EV examples.

Studies of the evolution of recombinant forms of EV have included estimates of substitution rates within the 3Dpol region which are of the same order of magnitude as our estimates for HRV-A and –C (Hicks and Duffy, 2011; McWilliam Leitch et al., 2009a, 2012, 2010). In addition, when recombinant forms of EV71, E9, E11 and E30 are considered separately, the same general congruency of rates between the VP1 and 3Dpol region is observed (McWilliam Leitch et al., 2009a, 2010, 2012).

Observed similarity in mean rates of nucleotide substitution between two distant genome regions should result in a linear relationship between sequence divergence in these two regions (revealing a distribution with a gradient of 1 and a y-intercept of 0). In the resulting distribution, any discontinuities or outlying p-distance comparisons may relate to potential recombination events (coloured red and yellow

209

Recombination in the evolution of HRV genomes in Figure 6.4). Indeed, this relationship has been documented in EV71 isolates (McWilliam Leitch et al., 2012), where outlying data points corresponded to putative recombinant sequence groups. Within our analysis, the observed discrepancies in sequence divergence between VP1 and 3Dpol of HRV-A were typically the result of comparisons between variants that displayed bootstrap supported changes in phylogenetic clustering between the two regions. The formation of clusters of outlying data points supports the observed phylogenetic incongruity involving full HRV-A types and in particular, a large cluster of putative recombinant sequences between 0.25 and 0.34 pairwise nucleotide p-distance in VP1 (coloured red in Figure 6.4) corresponds to the widespread reorganisation of inter-type relationships observed within Clade 1 of HRV-A. Several instances of sporadic inter-type recombination observed within HRV-A are manifest as scattered discrepant comparisons (coloured yellow in Figure 6.4). In contrast, comparisons of sequence divergence in the VP1 and 3Dpol regions of HRV-B and HRV-C showed relatively few outlying data points, consistent with the observed congruence of phylogenetic relationships and generalised lack of recombination detected within these two species.

6.4.2 Analysis of recombination within the coding region of HRV-A, HRV-B and HRV-C Despite the generally held view that HRV are not usually subject to recombination, a small number of recombination events have been hypothesized to have contributed to the formation of some recognised HRV-A and HRV-B types (Palmenberg et al., 2009). In addition, it has been speculated that the entire HRV-B species was formed by recombination between HEV and HRV-A (Tapparel et al., 2009b).

The absence of recombination documented between the two subgenomic regions at the 5ˈ and 3ˈ extremes of the capsid coding region (VP4/VP2 and VP1) affirms the suitability of either of these regions for the purposes of taxonomic classification and

210

Recombination in the evolution of HRV genomes epidemiological identification of circulating HRV types (see Chapter 4). Despite poor bootstrap support for basally branching phylogenetic relationships in both VP4/VP2 and 3Dpol, likelihood mapping analysis revealed a strong support for the representation of HRV sequences as a bifurcating phylogenetic tree in all regions studied (Figure 6.2). This led us to believe that conclusions drawn regarding potential recombination events using phylogenetic incongruity as the cornerstone of the analysis were valid. Nonetheless, supplemental examination of all putative recombinant sequences was undertaken with the panel of computational analyses described previously.

In common with previous analyses of recombination with HRV full genome sequences (Palmenberg et al., 2009) and phylogenetic incongruity between the VP4/VP2 and 3Dpol regions of HRV sequences (Savolainen et al., 2004), we found the highest proportion of recombinant sequences within HRV-A. HRV-B and HRV- C sequences showed only very limited evidence for recombination within the coding region. In contrast, Linsuwanon and co-workers found that recombination frequency within HRV full genomes was actually highest within the 3Dpol region of HRV-C (Linsuwanon et al., 2011). However, the relevance of this finding is questionable, as the study in question considered only 9 HRV-C full genome sequences, including the previously reported potentially artificially mosaic sequence GQ223227. The low sample size combined with the single aberrant sequence is likely to have given an erroneously high estimate of recombination frequency. Our similar analysis included 59 HRV-C full genome sequences and additionally excluded GQ223227, which should give a more robust estimate of recombination frequency in this species.

HRV-A sequences reliably cluster into four bootstrap supported basal clades, which were consistently observed in the regions analysed in this study (VP4/VP2, VP1 and 3Dpol). Recombination appears to have occurred relatively frequently within clades 1-3 (Figure 6.5), especially during the period of diversification of HRV-A into

211

Recombination in the evolution of HRV genomes currently recognised type groups. However, there was no evidence of recombination occurring between the basal clades in any coding region indicating that these clades are now potentially on distinct and diverging evolutionary paths. This concept of separate divergent evolution is supported by the inspection of comparisons of sequence divergence in the P1, P2 and P3 regions, whereby inter-clade comparisons form a discrete cluster with no overlap with intra-clade comparisons. One group of three HRV-A types (HRV-A8, -A45 and –A95) has been previously noted to be phylogenetically distinct (Palmenberg et al., 2009; Savolainen et al., 2004) and additionally possess certain RNA elements which are atypical of other HRV-A strains (Palmenberg et al., 2009). These three types have been tentatively proposed as a new species; HRV-D although they do not meet the ICTV guidelines for definition as a distinct species in terms of sequence divergence.

The most comprehensive study of recombination within all prototype full genomes of HRV previously published detected a total of 23 recombinant genomes, resulting from 12 distinct recombination events (Palmenberg et al., 2009). Notably, most HRV types within this analysis were represented by a single isolate. Our analysis uncovered putative recombination events analogous to most of the previously described events within the coding region. However, as we elected to define nearest neighbour relationships by inspection of bootstrap supported phylogenetic trees, our estimates of major and minor recombination parents occasionally differed from those reported. The detailed deep phylogeny of nearest neighbour relationships is not always well-resolved, particularly in the non-structural protein coding regions studies (P2, P3 and 3Dpol). The use of GroupScan as the mainstay of breakpoint determination allowed us to consider full clades as single nearest neighbour groups and additionally calculate breakpoints in sequences with no closely related parental sequences within the dataset. In addition to the extensive recombination involving full HRV-A types that was documented, we also recorded several instances of putative recombination among contemporary HRV-A sequences. This included

212

Recombination in the evolution of HRV genomes confirmation of the previously reported finding that contemporary HRV-A76 sequences have undergone a recombination event with HRV-A56 (Tapparel et al., 2009b). The original study reported a single recombinant sequence which displayed this branching pattern. However, we have noted that in fact, all contemporary HRV- A76 isolates currently known group together and therefore, that the recombinant group of HRV-A76 is the present dominant lineage. A similar pattern was also observed in several other HRV types, indicating that while contemporary recombination occurring with the 50 year time span of this dataset is not frequent, it can occur.

In contrast, only the previously reported putative recombinant HQ123444 (Linsuwanon et al., 2011) was noted in analysis of HRV-B sequences. Due to differing analysis methods used, the previous study did not detect the 100% identity of this sequence with prototype HRV-B35 strains from position 6353 onwards and instead noted an association with HRV-B35 throughout the length of the genome. As there are no contemporary HRV-B35 sequences available and HQ123444 is the only currently known example of HRV-B100, it is not possible to speculate on the validity of this observation. It is notable, however, that the breakpoint noted here is near the same region as the previously described putative sequencing error observed for GQ223227 (HRV-C8) (McIntyre et al., 2010). It is possible that this putative recombinant sequence may represent a sequencing or assembly error.

In common with earlier work on recombination within the coding region of HRV-C, only one recombinant sequence was observed within this species. However, although the renewed analysis included HRV-C strains spanning a 14 year time period, it is regrettably still limited in its scope by the lack of older strains. Notably, all instances of recombination involving only contemporary sequences of HRV-A and HRV-B documented involved a sequenced example from the collection of prototype strains lodged at the ATCC that were originally collected over 40 years

213

Recombination in the evolution of HRV genomes ago. It is not beyond the realm of possibility that HRV-C strains undergo recombination among contemporary sequences at similar frequencies to HRV-A, but the time span of the current data set renders detection impossible.

6.4.3 Putative recombination breakpoints within the coding region

Recombination breakpoints were determined for all putative recombinant HRV-A sequences with adequately supported nearest neighbour groups (Table 6.1). For putative recombination events involving very closely related HRV types (for instance, HRV-B17 and –B70), recombination breakpoints could not be accurately determined.

Similarly to documented observations within EV (Lindberg et al., 2003; Lukashev et al., 2005) and human parechovirus (Benschop et al., 2010b), the majority of recombination breakpoints occur near the P1/P2 junction, at the boundary of structural and non-structural protein coding regions. High within-species divergence in observed within the P1 region of HRV sequences, which suggests that there may be a biological compatibility barrier which decreases the reproductive fitness of progeny which are recombinant within the capsid region. Therefore, although inter- typic recombination events likely occur in a random fashion throughout the full length of the genome in the course of a natural co-infection, recombination within the capsid region may not be represented in routine clinical sampling which only detects the predominant variant.

6.4.4 Recombination within the 5ˈUTR of HRV-A and HRV-B

Despite the lack of recombination observed within the coding region of HRV sequences, there were several documented instances of phylogenetic incongruity within the 5ˈUTR region of HRV-A and HRV-B. Although inter-species recombination has been documented among EV isolates (Santti et al., 1999), the only instances observed within HRV sequences involved the previously documented

214

Recombination in the evolution of HRV genomes relationship between HRV-A and HRV-C (McIntyre et al., 2010; Huang et al., 2009; Wisdom et al., 2009a). Analysis restricted to HRV-A and HRV-B alone showed total segregation of the two species in the 5ˈUTR, as in all other gene regions.

In HRV-A, basal clades 1 to 4 were not maintained in the 5ˈUTR region. We noted considerably more putative recombination events within the 5ˈUTR of HRV-A than previous studies (Palmenberg et al., 2009) and these included two instances of contemporary recombination involving single sequences grouping separately from other sequences of the same type. The discrepancy in these analyses is potentially explained by the differing alignment methods employed and by the use of phylogenetic incongruence as the mainstay of our recombination analysis. In fact, several putative recombinant sequences that were previously noted by Palmenberg and co-workers displayed inadequate bootstrap support on phylogenetic trees to be considered as recombinant by our methods (Figure 6.12). This lack of support for previously reported recombinant sequences was especially evident upon analysis of HRV-B. However, the 5ˈUTR region of HRV-B has not been extensively sequenced to date and currently a mere nine HRV-B types are represented by more than one sequence. Future studies may benefit from a focus on obtaining sequence data in order to perform a more comprehensive analysis of this region.

6.4.5 Conclusion

This study demonstrates that, despite a close phylogenetic relationship with EV, recombination within the history of diversification of all three species of HRV was substantially less frequent than that observed within EV species. Following upon previous reports which have provided some glimpses of the occurrence of recombination in HRV-A (Palmenberg et al., 2009; Tapparel et al., 2009b; Savolainen et al., 2004), our study represents a comprehensive analysis of recombination in all published HRV full genome sequences to date, an additional analysis of HRV sequence fragments obtained from isolates spanning a 14 year time

215

Recombination in the evolution of HRV genomes period and, to our knowledge, the first attempt to catalogue all currently detectable recombination events and potential breakpoints within HRV-A.

216

Chapter 7 Concluding remarks

7.1 A broad view of the global importance of HRV

Acute respiratory infections are one of the leading causes of paediatric mortality worldwide (Bryce et al., 2005). Severe acute respiratory diseases account for around 20% of deaths in children under 5 (in the 42 countries where 90% of global child deaths take place) (Black et al., 2003; Bryce et al., 2005). This figure rises to 29% when pneumonia and sepsis in neonates is additionally considered (Bryce et al., 2005). Although more widely feared serious respiratory infections (including influenza, tuberculosis and Streptococcus Pneumoniae) are responsible for by far the largest proportion of these illnesses and deaths, the role of HRV as a potential cause of lower respiratory tract disease, severe bronchiolitis and life-threatening exacerbations of chronic lung disease is now clear (Papadopoulos, 2002; Renwick et al., 2007; Broberg et al., 2011; Mallia et al., 2011; Bizzintino et al., 2011; McManus et al., 2008). The predominance of the life-threatening infections mentioned above and opportunistic infections secondary to human immunodeficiency virus (HIV) infection in the developing world has generally prevented most investigations of seemingly trivial pathogens, such as HRV and as such, there is only very little data available on the regional circulation and clinical impact of HRV (Smuts et al., 2011; Chidlow et al., 2012). However, chronic diseases such as chronic obstructive pulmonary disease (COPD) and asthma, traditionally the scourges of affluent western societies, are now known to be making significant headway into these regions (Chan- Yeung et al., 2004). In addition, understanding the circulation, epidemiology and evolution of HRV can theoretically inform the development of targeted therapeutic strategies, thereby reducing not only the clinical burden of these infections but the

217

Concluding remarks inappropriate prescription of antibiotics for viral respiratory tract illness. The common cold is one of the most common reasons for inappropriate antibiotic prescription (Nyquist et al., 1998) and with the threat of widespread antibiotic resistant pathogens looming, the urgency of dealing with this situation on a global scale has become clear. The realisation of the potential importance of these pathogens has meant that the scope of research interest in HRV has been increasing steadily over the last 20 years and over 200 research papers were indexed on PubMed in 2012 (www.ncbi.nlm.nih.gov/pubmed).

7.2 Novel findings in the clinical epidemiology of HRV

One of the main reasons for this substantial upswing in HRV research globally has been the realisation that these viruses are potentially responsible for more severe clinical illnesses than was previously thought. Most of the published analyses of the impact of HRV in severe illness have focussed on both severe respiratory tract infections (Broberg et al., 2011; Renwick et al., 2007; Fuji et al., 2011; Hicks et al., 2006) and occasional reports of systemic disease in young children (Tapparel et al., 2009c; Broberg et al., 2011). The results of epidemiological studies undertaken during the course of this PhD (Chapter 3) suggested a surprising link between HRV isolation from stool and gastrointestinal infections and also a potential role for HRV in aseptic meningitis (Harvala et al., 2012b). Since the publication of our findings, two other studies have also been published that report detection of HRV RNA in stool samples (Lau et al., 2012; Honkanen et al., 2013).

The closely related human EV are routinely isolated from both human stool samples obtained in cases of clinical illness and in environmental surveillance of sewage (Benschop et al., 2010a). In fact, screening sewage samples is one of the mainstays of the surveillance aspect of the poliovirus eradication campaign. However, HRV are generally thought to be acid labile and therefore unable to withstand the highly

218

Concluding remarks acidic environment of the gastrointestinal tract. Exposure to acidic vapour was reported to cause conformational changes in the capsid structure of HRV-B14 and these changes may prevent uncoating of the RNA genome (Giranda et al., 1992). Although the acid sensitivity of several HRV strains has been demonstrated experimentally, the generally held wisdom that HRV are almost universally acid labile has been called into question several times. Several acid stable mutants of HRV were reported over 20 years ago (Skern et al., 1991) and viable HRV have been isolated from sewage samples by cell culture based screening techniques (Blomqvist et al., 2009). The fact that the study that found HRV in sewage recorded only 24 strains over an 8 year period is likely to be secondary to the usage of cell culture based methods that are specific for the isolation of human EV. More recent studies, including ours, undertaken using molecular methods and specifically screening patients with gastrointestinal symptoms have found rates of HRV detection in stool ranging from 8-10% (Honkanen et al., 2013; Harvala et al., 2012b; Lau et al., 2012). Not only were the HRV detected in sewage able to propagate in vitro, but actually one clinical HRV isolate from a stool sample was subsequently grown in RD cells (Honkanen et al., 2013; Blomqvist et al., 2009). This indicates the surprising presence of viable and infectious HRV in both sewage and stool samples.

Interestingly, no study has found evidence for a distinct genetic subgroup of HRV being more commonly isolated from the gastrointestinal tract. Instead, in accordance with our findings, studies have generally reported a high level of similarity with respiratory strains. These findings indicate that not only are HRV potentially able to withstand more acidic environments than was previously thought, but the boundary between the clinical manifestations of HRV and EV infections may be more blurry than it is traditionally considered. In addition to the potential link between HRV and gastrointestinal infections, certain human EV strains, such as EV-D68 and EV-C109 are also associated with URTI. (Oberste et al., 2004b; Yozwiak et al., 2010) Future studies in this area could comprise of the direct experimental evaluation of infectivity

219

Concluding remarks of acid exposed HRV isolated from both the gut and respiratory tract. The use of a control group of known acid-stable EV would allow direct comparison between the two groups of viruses.

The distribution of ages of patients with HRV detected in their stool samples included mainly young children and the elderly. Both of the other published studies screened samples from children alone (Honkanen et al., 2013; Lau et al., 2012) and so there is a striking lack of data regarding possible HRV detection in the stool of adult patients. There was also no detailed information available as to the underlying medical history of patients with HRV detected in their stool samples and therefore pre-existing gastrointestinal pathology and drug history was unknown. In future, prospective studies conducted should probably seek to include this information, especially where concerning the presence of gastro-oesophageal reflux disease (GORD) and subsequent usage of over-the-counter remedies for this condition. Commonly prescribed treatments for gastro-oesophageal reflux are generally aimed at reducing gastric acidity. Drugs such as Omeprazole (proton pump inhibitors; PPIs) are extremely commonly prescribed both as direct treatment for GORD and as a supplementary gastro-protective agent in cases of therapy with certain painkillers within the class of non-steroidal anti-inflammatory drugs (NSAIDs), very commonly used for the treatment of arthritis. PPIs are also, rather controversially, used occasionally to treat GORD in infants and young children. Speculatively, there may be potential for a link between the usage of these medications, subsequent alteration of the pH of gastric contents and infection with HRV. The collection of detailed medical and prescription histories in a prospective study would allow examination of these factors.

A second surprising finding was the amplification of HRV RNA from the CSF of a systemically unwell infant. HRV has not, to our knowledge, been documented as a potential cause of aseptic meningitis. The finding does suggest that under certain

220

Concluding remarks circumstances, HRV is able to traverse the blood brain barrier and therefore potentially cause an infection of the CNS. The fact that this appears to have been a comparatively rare occurrence, as compared to CNS infection with the highly genetically and structurally similar human EV may in theory be partially explained by the apparently limited tendency of HRV to cause a systemic viraemia. This is thought to be a vital step in the spread of EV infections to the CNS, except in the case of PV, which are hypothesized to be additionally able to directly invade peripheral nerves (Pallansch and Roos, 2007). In two studies of HRV viraemia, only 11-12% of symptomatic HRV infections were linked with HRV detection in blood samples and this detection frequency was reduced to zero when those without respiratory symptoms were considered (Xatzipsalti et al., 2005; Fuji et al., 2011). However, studies on the rates of HRV viraemia in patients with symptomatic gastrointestinal infections and HRV detected in stool have not, to our knowledge, been undertaken. Studies such as this would provide further clues as to why, when HRV appears to be able to both replicate in the gastrointestinal tract and cross the blood brain barrier, it does not cause a similar proportion of aseptic meningitis as human EV.

7.3 Classification of HRV into genotypically defined types

A core component of the work presented within this thesis has been the development of a system of defining and classifying HRV on the basis of readily available sequence data. The determination of associations of specific HRV types with certain clinical illnesses, if they exist, is an issue of great importance; especially with the hypothesis that certain HRV strains may be more likely to both be implicated in the development and cause severe exacerbations of asthma (Palmenberg et al., 2010). Consistency in classification and nomenclature of HRV types is essential to these endeavours.

221

Concluding remarks

The system of classification for all three species of HRV that we have proposed is similar in nature to that currently in use for the definition of new EV types (Oberste et al., 1999a, 1999b). The realisation that the majority of HRV-C sequences are recombinant in the 5ˈUTR region (Huang et al., 2009; Wisdom et al., 2009a; McIntyre et al., 2010) should preclude the use of this region for this purpose as the majority of HRV-C strains will be misidentified. The use of capsid regions only for the purposes of HRV typing should not cause underestimation of the importance of non-structural protein coding regions for investigation of viral phenotypes. For example, one study of EV-A71 pathogenesis in mice found specific mutations in the 3Dpol region of strains with reduced virulence (Chang et al., 2010). The extensive recombination observed within EV species means that typing EV by VP1 alone is tantamount to knowing that around one third of the genome is closely related to a known prototype strain. If virulence and tropism influencing residues are situated outside the capsid, then a more comprehensive approach to EV typing could be warranted. Whole genome sequencing has been suggested as the most appropriate method for HRV typing (Palmenberg et al., 2009). However, as recombination within the coding region appears to occur only very infrequently within the contemporary evolution of HRV, this approach may not be necessary.

We also noted that the three species of HRV appear to harbour considerable antigenic diversity both within and between species. Over 140 HRV types over three species are recognised compared to 130 types (in nine species) for EV and 11 (in four species) for Apthovirus (Table 1.1). This discrepancy may be due to the ubiquitous nature and constant circulation of HRV, meaning that many more types are detected and catalogued in a single study than other picornavirus groups. The action of diversifying selection pressure on residues within the capsid coding region that specifically coincide with neutralizing antigenic sites has previously been described (Kistler et al., 2007b) and one could speculate that such pressure could lead to the routine emergence and identification of many new HRV types. However,

222

Concluding remarks a large number of the HRV types that have been newly discovered in recent years have shown evidence of circulation long before their initial discovery (Briese et al., 2008). These cannot be properly called emerging viruses but rather new discoveries made possible by the increasing scope of molecular detection methods. In addition, evidence of the contemporary diversification of HRV into genotypically distinct types was only rarely evident within our dataset, although several divergent strains of a few types (including HRV-A29) were documented (Chapter 4). These may represent the gradual creation of antigenically distinct types. Further studies focussing on exploring serological cross-reactivity between members of genotypically defined types in all three species would be beneficial to gaining a deeper understanding of these relationships.

7.4 Recombination in the evolution of HRV

The process of recombination can generate strains with differing virulence patterns and can cause reversion in the attenuated PV vaccine (Macadam et al., 1989; van der Sanden et al., 2011; Burns et al., 2013). The circulation of recombinant strains has been frequently reported in many other picornaviruses, including human EV (Oprisan et al., 2002; Simmonds and Welch, 2006; Bouslama et al., 2007; McWilliam Leitch et al., 2010; Santti et al., 1999; McWilliam Leitch et al., 2012; Oberste et al., 2004d, 2004a; Lindberg et al., 2003; Andersson et al., 2002), Apthovirus (Heath et al., 2006; Simmonds, 2006), Cardiovirus (Drexler et al., 2010; Blinkova et al., 2009) and Parechovirus (Benschop et al., 2008; Calvert et al., 2010). In fact, the occurrence of phylogenetic incongruence between the structural and non- structural genome regions of many picornaviruses has led to a theory of separate and modular evolution of these two genome regions (Lukashev et al., 2005; Santti et al., 1999). In contrast, the studies contained within this thesis (Chapters 5 and 6) have generally supported a low level of detectable recombination within the coding region of HRV genomes (McIntyre et al., 2010, 2013b). In particular, there was little

223

Concluding remarks evidence within HRV for the widespread recombination between contemporary strains observed in EV. Recombinant strains of EV-B have been documented to circulate in outbreaks, where one recombinant form (a combination of a particular capsid and non-structural region) rises to dominate circulation for a few years and then disappears entirely (McWilliam Leitch et al., 2010, 2009a). In contrast, the infecting HRV population appears to consist of a great number of types which circulate almost continuously. These types appear to be somewhat stable and show congruent within-type phylogeny between their structural and non-structural genome regions. As EV and HRV in particular exhibit a highly similar genome structure and a degree of sequence homology, the mechanism by which recombination is restricted in HRV remains puzzling.

Although the occurrence of recombination during genome replication is essentially a random process and probably occurs rather frequently, recombinant genomes must necessarily compete with parental strains in order to come to form a detectable fraction of the population. The fact that recombinant HRV genomes are so rarely observed within contemporary datasets suggests that recombinant genomes may be generally less fit than parental strains. The fitness of the generated virus depends not just on the functioning of individual encoded proteins but on the interactions between different parts of the genome. For instance, the interaction between the 3CD protein and the 5ˈUTR is an essential component of negative strand synthesis and therefore RNA replication (Rohll et al., 1994; Andino et al., 1990) . These interactions have likely been somewhat optimised over time by co-evolution of different genome segments and therefore, even biologically viable recombinants are likely to exhibit decreased fitness when compared with parental strains. A similar point has been demonstrated by in vitro studies of recombination in the 5ˈUTR region of HRV. Artificial chimaeras were created between the coding genome of HRV-A16 and the 5ˈUTR region of various different members of the Enterovirus genus (comprising of an HRV-Ca, HRV-Cc, HRV-B and human EV strain). Interestingly, the study found

224

Concluding remarks that each chimaera was viable and could replicate efficiently in cell culture but in competition experiments was consistently less fit than the wild type virus (Schibler et al., 2012). This suggests that while certain parts of the genome may be functionally interchangeable, the restricting factor in the detection of recombinant genomes may be a decline in viral fitness. Indeed, this functional compatibility within the 5ˈUTR region can extend to highly divergent viruses. A study which replaced the IRES of a PV1 virus with that of EMCV found that this still resulted in the creation of a viable chimaera (Alexander et al., 1994). This finding is even more remarkable when considered in light of the fact that this indicates a degree of functional compatibility between a Type I and Type II IRES, which have little or no homology in either sequence or structure. Notably there is also evidence for the exchange of IRES elements between diverse picornavirus genera. For example, as previously discussed the Type IV (pestivirus related) IRES has been found in both Sapelovirus and Avihepatovirus. The extremely rare isolation of recombinants from such divergent viruses in nature may be secondary to decreased evolutionary fitness or to their limited opportunity to co-infect the same host cell.

Although the well-conserved 5ˈUTR region appears to be somewhat interchangeable between diverse viruses, the possibility of a biological compatibility barrier between the more divergent structural and non-structural regions of divergent viruses has been previously discussed (Kistler et al., 2007b; Simmonds and Welch, 2006; Lukashev et al., 2005). The theory states that viruses that bypass a particular divergence threshold in the non-structural region may lack the ability to form viable recombinants, as vital interactions between different encoded proteins may be severely disturbed. HRV are generally highly genetically diverse within and between species in both structural and non-structural genome regions (Kistler et al., 2007b) and we have demonstrated a generally higher within-species divergence for HRV-C than other HRV (Chapter 5). This co-incides with the almost completely absence of recombinant HRV-C genomes. In contrast, the more frequently

225

Concluding remarks recombinant EV-B shows considerably decreased divergence in the non-structural protein region (Lindberg et al., 2003). This may allow different genome segments to have a higher degree of functional compatibility and therefore permit successful competition of recombinant genomes with parental strains. It has been suggested that G+C content may play a role in the frequency of generation of recombinant genomes; whereby high G+C content increases recombination frequency by 1.3 times and the existence of GC tracts increase it even further (Runckel et al., 2013). This study used a deep sequencing technique to obtain sequences from all PV genomes generated during the replication cycle and so recovered sequences were not restricted by the viability or fitness of the variant. If this observation can be assumed to be generally applicable to HRV as well as PV, it may suggest that the frequency of generation of recombinants in HRV during replication is less than in EV secondary to a generally lower G+C content. This would also suggest that recombination occurs slightly more frequently in the more G+C rich HRV-C genomes than other species of HRV. However, the lack of observation of such recombinants suggests that certain restrictions on biological viability and evolutionary fitness render these recombinants less able to compete with wild type viruses.

The emergence of epidemic strains secondary to recombination of the 5ˈUTR region in EV has been documented (van der Sanden et al., 2011). In addition, strains of EV- A71 with altered virulence secondary to mutations in the 3Dpol region have been noted (Chang et al., 2010). EV-B circulates in a cyclical fashion, whereby recombinant strains predominate for a certain period of time and then are periodically replaced. This constant shuffling of distinct genome modules in EV may potentially contribute to the epidemic circulation of certain strains. Studies of HRV epidemiology, such as those outlined in Chapter 3, generally note a lack of predominance of a certain HRV species or strain circulating in a particular time scale. The lack of recombination and subsequent modular evolution of structural and

226

Concluding remarks non-structural genome regions in HRV may be a partial explanation for the patterns of circulation observed in these viruses.

7.5 Conclusion

In summary, the work undertaken throughout the course of this PhD has focussed on the epidemiology, classification and evolution of HRV. In addition to the discovery of a novel potential association between HRV infection and both gastrointestinal and CNS disease (Chapter 3), we have developed proposals for the classification of HRV into genotypically defined types and applied these guidelines to the classification of all available HRV sequence data into over 140 types (Chapter 4). The classification of all HRV sequences in this manner allowed investigation of the occurrence of recombination in the history of diversification of all three species of HRV (Chapter 5 and 6). We found only limited evidence of recombination occurring within the contemporary diversification of HRV types, as evidenced by the congruent phylogenetic groupings of sequences. However, we did find evidence for phylogenetic incongruity affecting entire HRV-A types, suggesting that these viruses may have been subject to extensive recombination during the period of their diversification into distinct HRV types.

One potential long-term application of research into the genetic diversity of HRV and evolutionary processes shaping this diversity is the potential identification of genome regions which may serve as therapeutic targets for the development of both antiviral drugs and vaccines. The considerable genetic diversity of HRV renders the formulation of a common vaccine for even a single HRV species extremely challenging. However, the taxonomic identification and classification of infecting strains may allow future determination of potential associations with certain strains and severe diseases. Studies of evolution and recombination in HRV may reveal certain exploitable similarities between related HRV types. For instance, the finding

227

Concluding remarks that antibodies to the well-conserved VP4 region may exhibit cross-serotypic neutralisation properties (Katpally et al., 2009) coupled with our observations that the VP4 region of HRV is both well conserved and largely non-recombinant throughout all three species may have implications for the development of such targeted therapies. In addition, the finding that the 2A protein is very well conserved within HRV-A and HRV-C may highlight this region as a particularly attractive therapeutic target. With global collaboration and studies concentrated on aspects of epidemiology and evolution of HRV strains, it may actually be possible in the future to make an impact on the considerable clinical and economic burden of these extremely common pathogens.

228

Appendix 1 – List of Primers

Name HRV- Orienta Primer Numbered Sequence of Species Number by: primer b set Primers for screening (5ˈUTR) for and typing (VP4/VP2) all HRV

UTR All B14 Screen HRV OS 178 :NC_001490 HCAAGYACTTCTGTYWCCCCSG UTR All B14 Screen HRV OAS 573 :NC_001490 GAAACACGGACACCCAAAGTAGT UTR HRV- B14 Screen A/-B IS 367 :NC_001490 CYAGCCTGCGTGGCKGCCWRC UTR B14 Screen HRV-C IS 367 :NC_001490 GTAGCCYGCGTGGTGCCCWGC UTR All B14 Screen HRV IAS 477 :NC_001490 TTAGCCRCATTCAGGGGCCGG VP4/V All P2 HRV OS 445 C4:EF582385 CCGGCCCCTGAATGYGGCTAA VP4/V All P2 HRV IS 533 C4:EF582385 ACCRACTACTTTGGGTGTCCGTG VP4/V All P2 HRV IAS 1066 C4:EF582385 TCWGGHARYTTCCAMCACCANCC VP4/V All P2 HRV OAS 1104 C4:EF582385 ACATRTTYTSNCCAAANAYDCCCAT Primers for the amplification of the 5ˈUTR of HRV-C TTMAAACAGCSGBNNGGTTGYACCCA UTR 1 HRV-A OS/IS 1 C4:EF582385 UTR 1 HRV-C OS/IS 1 C4:EF582385 TTMAAACTGRRWBYRGGTTGYTCCCA HRV- GCAGGSMRCCACGCRGGCT UTR 1 A/C OAS 375 C4:EF582385 GCAGGCAGCCACGCAGGCT UTR 1 HRV-A OAS 375 C4:EF582385 UTR 1 HRV-C OAS 375 C4:EF582385 GCAGGGCACCACGCGGGCT UTR 1 HRV-A IAS 355 C4:EF582385 GAHCACTGTYRCCAGTGGGG UTR 1 HRV-C IAS 354 C4:EF582385 GACACCGTCGCCGGTGGGG HRV- UTR 2 A/C S 164 C4:EF582385 HCAAGYACTTCTGTYWCCCCSG HRV- RGAAACACGGACACCCAAAGTAGT UTR 2 A/C OAS 560 C4:EF582385 HRV- UTR 2 A/C IAS 550 C4:EF582385 ACACCCAAAGTAGTYGGTYCCR HRV- UTR 3 A/C A 445 C4:EF582385 CCGGCCCCTGAATGYGGCTAA

229

List of Primers

Name HRV- Orient Primer Numbered Sequence of Species Number by: primer set HRV- UTR 3 A/C OAS 1066 C4:EF582385 TCWGGHARYTTCCAMCACCANCC HRV- ATAGTRATTTGYTTDAGCCTATCDGAV A UTR 3 A/C IAS 868 C4:EF582385 Primers for the amplification of the VP1 region of HRV-A, -B and –C A VP1 HRV-A OS 1943 A1:D00239 MGHTTYAGYTTYATGTTYTGTGG A VP1 HRV-A OAS 3485 A1:D00239 CCACARTCWCCWGGYTCACADGG A VP1 HRV-A IS 2418 A1:D00239 TRGAYGCWGCWGARACWGG A VP1 HRV-A IAS 3290 A1:D00239 GTRTTTGTKCGGTADATGAYTARRTC A VP1 HRV-A IAS 2504 A1:D00239 CTYTCHAYRCTCATYTCATC A28 HRV- VP1 A28 IAS 2702 A1:D00239 ACTTGCAACTGACACCACTAT B VP1 HRV-B OS/IS 2250 B14:X01087 TAGTGCRTGTCCAGATTTYA B VP1 HRV-B OAS 3455A B14:X01087 TAACTACCACCTTCAATCCA B VP1 HRV-B OAS 3455B B14:X01087 TAGTTTCCTCCTTCAATCCA B VP1 HRV-B IAS 3256 B14:X01087 ARRTGRTCYTCHTGYGTCAT C VP1 HRV-C OS 1986 C4:EF582385 ATRGCMTACACMCCHCCWGG PCR 1 C VP1 HRV-C OAS 3449 C4:EF582385 YTTVCCWCCACARTCHCCWGG PCR 1 C VP1 HRV-C IS 2256 C4:EF582385 ATGHTRCGWGAYACWCCHATGAT PCR 1 C VP1 HRV-C IASa 2852 C4:EF582385 ATCATAHCCATCATARAAVATGTARTA PCR 1 C VP1 HRV-C IASb 2852 C4:EF582385AT ATCATAHCCATCATARAAVACATARTA PCR 1 C VP1 HRV-C IS 2745 C4:EF582385 GCYTCHAACCCHAGTGTNTTYT PCR 1 C VP1 HRV-C IAS 3151 C4:EF582385GT GTRTGNACAWAHAKRTCACTRGGTCC PCR 1 C VP1 HRV-C OAS 3882 C4:EF582385 ARBCCHYKDGCDGCRTTRCA PCR 2 Pr Primers for the amplification of the 2A region of HRV-C C2A HRV-C OS 2832 C4:EF582385 GCHTAYTAYATKTTYTATGATG C2A HRV-C OAS 3882 C4:EF582385 ARBCCHYKDGCDGCRTTRCA C2A HRV-C IS 2898 C4:EF582385 AATGAYAYGGGWWSASMYTRTG PCR1 C2A HRV-C IAS 3429 C4:EF582385 YTTVCCWCCRCARTCHCCHGG PCR 1 C2A HRV-C IS 3126 C4:EF582385 GGNCCHAGTGAYMWVTWTG PCR 2 Pri Primers for the amplification of the 3Dpol region of HRV-A, -B and –C A 3Dpol HRV-A OS 5657 A1:D00239 AAYGGNARSAHGGNTTYKCHGC A OAS/I 3Dpol HRV-A AS 7130 A1:D00239 ARAAYTTBTCRTACCAYTCRTG

230

List of Primers

Name HRV- Orient Primer Numbered Sequence of Species Number by: primer set A 3Dpol HRV-A IS 6464 A1:D00239 TTYGAYTAYWCBAAYTATGAYGG A 3Dpol HRV-A OS 6323 A1:D00239 TVAATGAYTCWGTNAAYATGAG B 3Dpol HRV-B OAS 7030 B14:X01087 CCWGARTGCCAGGCYADCATRCA B 3Dpol HRV-B IS 6475 B14:X01087 TTYGAYTAYWCHAAYTWYGATGC B 3Dpol HRV-B IAS 6974A B14:X01087 TCVTRDATDTCHHGCATYGGCAT B 3Dpol HRV-B IAS 6974B B14:X01087 TCVTRDATDTCYTGCATYGGCAT C 3Dpol HRV-C OS 6300 C4:EF582385 TGTGATCCWGAYRTYTTYTGGTCAC C 3Dpol HRV-C OAS 6901 C4:EF582385 ACRTGRTCTTSWGTRTTCTTTGG C 3Dpol HRV-C IS 6360A C4:EF582385 TTTGAYTAYACMAAYTATGATGG C 3Dpol HRV-C IS 6360B C4:EF582385 TTTGAYTAYACMAAYTATGATGGNAG C 3Dpol HRV-C IAS 6877 C4:EF582385 TCYYTDGTCCATCTRATTGAYTCA a Orientation of primer sequences. OS : outer sense. OAS : outer antisense. IS : inner sense. IAS : inner antisense b Primer numbering is taken from the 5ˈ base position

231

Appendix 2 – Permissions for reproduction of publications and figures

Where required by the individual publishers, explicit permission for the reproduction of published articles has been sought and is included below. SPRINGER LICENSE TERMS AND CONDITIONS Mar 19, 2013

This is a License Agreement between Chloe L McIntyre ("You") and Springer ("Springer") provided by Copyright Clearance Center ("CCC"). The license consists of your order details, the terms and conditions provided by Springer, and the payment terms and conditions. License Number 3102590463561 License date Mar 05, 2013 Licensed content Springer publisher Licensed content Archives of Virology publication Licensed content title Recombination in the evolution of human rhinovirus genomes Licensed content Chloe L. McIntyre author Licensed content date Jan 1, 2013 Type of Use Thesis/Dissertation Portion Full text Number of copies 1 Author of this Yes and you are the sole author of the new work Springer article Title of your thesis / The epidemiology, classification and evolution of human dissertation rhinoviruses Expected completion Mar 2013 date Estimated size(pages) 200

Total 0.00 GBP

232

Permissions for reproduction of publications and figures

JOHN WILEY AND SONS LICENSE TERMS AND CONDITIONS Mar 19, 2013

This is a License Agreement between Chloe L McIntyre ("You") and John Wiley and Sons ("John Wiley and Sons") provided by Copyright Clearance Center ("CCC"). The license consists of your order details, the terms and conditions provided by John Wiley and Sons, and the payment terms and conditions. License Number 3102600306975 License date Mar 05, 2013 Licensed content John Wiley and Sons publisher Licensed content Journal of Medical Virology publication High detection frequency and viral loads of human rhinovirus Licensed content title species A to C in fecal samples; diagnostic and clinical implications Licensed copyright Copyright © 2012 Wiley Periodicals, Inc. line Licensed content H. Harvala,C.L. McIntyre,N.J. McLeish,J. Kondracka,J. Palmer,P. author Molyneaux,R. Gunson,S. Bennett,K. Templeton,P. Simmonds Licensed content date Jan 13, 2012 Start page 536 End page 542

Type of use Dissertation/Thesis Requestor type Author of this Wiley article Format Print and electronic Portion Full article

Total 0.00 USD

233

Permissions for reproduction of publications and figures

SPRINGER LICENSE TERMS AND CONDITIONS Apr 30, 2013

This is a License Agreement between Chloe L McIntyre ("You") and Springer ("Springer") provided by Copyright Clearance Center ("CCC"). The license consists of your order details, the terms and conditions provided by Springer, and the payment terms and conditions. All payments must be made in full to CCC. For payment instructions, please see information listed at the bottom of this form. License Number 3138730673639 License date Apr 30, 2013 Licensed content publisher Springer Licensed content publication Cellular and Molecular Life Sciences The cold case: Are rhinoviruses perfectly adapted Licensed content title pathogens? Licensed content author S. Dreschers Licensed content date Jan 1, 2007 Volume number 64 Issue number 2

Type of Use Thesis/Dissertation Portion Figures Author of this Springer article No Order reference number

The epidemiology, classification and evolution of human Title of your thesis / dissertation rhinoviruses Expected completion date Apr 2013 Estimated size(pages) 200

Total 0.00 GBP

234

Permissions for reproduction of publications and figures

ELSEVIER LICENSE TERMS AND CONDITIONS Apr 30, 2013

This is a License Agreement between Chloe L McIntyre ("You") and Elsevier ("Elsevier") provided by Copyright Clearance Center ("CCC"). The license consists of your order details, the terms and conditions provided by Elsevier, and the payment terms and conditions. Elsevier Limited Supplier The Boulevard,Langford Lane Kidlington,Oxford,OX5 1GB,UK Registered Company Number 1982084 Customer name Chloe L McIntyre Customer address Infection and Immunity Division Edinburgh, EH25 9RG License number 3138741121335 License date Apr 30, 2013 Licensed content publisher Elsevier Biochimica et Biophysica Acta (BBA) - Gene Licensed content publication Regulatory Mechanisms Bridging IRES elements in mRNAs to the Licensed content title eukaryotic translation apparatus Licensed content author Kerry D. Fitzgerald,Bert L. Semler Licensed content date September–October 2009 Licensed content volume number 1789 Licensed content issue number 9–10 Number of pages 11 Start Page 518 End Page 528

Type of Use reuse in a thesis/dissertation Intended publisher of new work other Portion figures/tables/illustrations Number of figures/tables/illustrations 1 Format both print and electronic Are you the author of this Elsevier No 235

Permissions for reproduction of publications and figures article?

Will you be translating? No Order reference number

The epidemiology, classification and evolution of Title of your thesis/dissertation human rhinoviruses

Expected completion date Apr 2013 Estimated size (number of pages) 200 Elsevier VAT number GB 494 6272 12

Permissions price 0.00 GBP VAT/Local Sales Tax 0.0 USD / 0.0 GBP

Total 0.00 GBP

NATURE PUBLISHING GROUP LICENSE TERMS AND CONDITIONS May 01, 2013

This is a License Agreement between Chloe L McIntyre ("You") and Nature Publishing Group ("Nature Publishing Group") provided by Copyright Clearance Center ("CCC"). The license consists of your order details, the terms and conditions provided by Nature Publishing Group, and the payment terms and conditions. All payments must be made in full to CCC. For payment instructions, please see information listed at the bottom of this form. License Number 3140160361941 License date May 01, 2013 Licensed content publisher Nature Publishing Group Licensed content publication Nature Reviews Microbiology Host and virus determinants of picornavirus Licensed content title pathogenesis and tropism J. Lindsay Whitton,Christopher T. CornellandRalph Licensed content author Feuer Licensed content date Oct 1, 2005 Volume number 3 Issue number 10

Type of Use reuse in a thesis/dissertation

236

Permissions for reproduction of publications and figures

Requestor type academic/educational Format print and electronic Portion figures/tables/illustrations Number of figures/tables/illustrations 1 High-res required no Figures Figure 1 - Summary of the picornavirus life cycle Author of this NPG article no Your reference number

The epidemiology, classification and evolution of Title of your thesis / dissertation human rhinoviruses Expected completion date May 2013 Estimated size (number of pages) 200

Total 0.00 GBP

237

References

Abraham, A. A., and Holmes, W. L. (1974). Isolation and characterization of an echovirus, possible “prime” strain of echovirus type 12. Proc Soc Exp Biol Med 147, 444–8.

Abraham, G., and Colonno, R. J. (1984). Many rhinovirus serotypes share the same cellular receptor. J Virol 51, 340–5.

Acharya, R., Fry, E., Stuart, D., Fox, G., Rowlands, D., and Brown, F. (1989). The three- dimensional structure of foot-and-mouth disease virus at 2.9 A resolution. Nature 337, 709–16.

Agol, V. I. (1997). Recombination and Other Genomic Rearrangements in Picornaviruses. Seminars in Virology 8, 77–84.

Alexander, L., Lu, H. H., and Wimmer, E. (1994). Polioviruses containing picornavirus type 1 and/or type 2 internal ribosomal entry site elements: genetic hybrids and the expression of a foreign gene. Proc Natl Acad Sci USA 91, 1406–10. de Almeida, M. B., Zerbinati, R. M., Tateno, A. F., Oliveira, C. M., Romão, R. M., Rodrigues, J. C., Pannuti, C. S., and da Silva Filho, L. V. F. (2010). Rhinovirus C and respiratory exacerbations in children with cystic fibrosis. Emerg Infect Dis 16, 996–9.

Alper, C. M., Doyle, W. J., Winther, B., and Hendley, J. O. (2008). Upper respiratory virus detection without parent-reported illness in children is virus-specific. J Clin Virol 43, 120–2.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–10.

Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–402.

Anderson, E. C., Hunt, S. L., and Jackson, R. J. (2007). Internal initiation of translation from the human rhinovirus-2 internal ribosome entry site requires the binding of Unr to two distinct sites on the 5’ untranslated region. J Gen Virol 88, 3043–52.

Andersson, P., Edman, K., and Lindberg, A. (2002). Molecular analysis of the echovirus 18 prototype: evidence of interserotypic recombination with echovirus 9. Virus Res 85, 71–83.

238

References

Andino, R., Rieckhof, G. E., Trono, D., and Baltimore, D. (1990). Substitutions in the protease (3Cpro) gene of poliovirus can suppress a mutation in the 5’ noncoding region. J Virol 64, 607–12.

Andrewes, C. H. (1961). The taxonomic position of common cold viruses and some others. Yale J Biol Med 34, 200–6.

Andries, K., Dewindt, B., Snoeks, J., Wouters, L., Moereels, H., Lewi, P. J., and Janssen, P. a (1990). Two groups of rhinoviruses revealed by a panel of antiviral compounds present sequence divergence and differential pathogenicity. J Virol 64, 1117–23.

Arden, K. E., Faux, C. E., O’Neill, N. T., McErlean, P., Nitsche, A., Lambert, S. B., Nissen, M. D., Sloots, T. P., and Mackay, I. M. (2010). Molecular characterization and distinguishing features of a novel human rhinovirus (HRV) C, HRVC-QCE, detected in children with fever, cough and wheeze during 2003. J Clin Virol 47, 219–23.

Arden, K., and Mackay, I. (2010). Newly identified human rhinoviruses: molecular methods heat up the cold viruses. Rev Med Virol 20, 156–176.

Arden, K., McErlean, P., Nissen, M., Sloots, T., and Mackay, I. (2006). Frequent detection of human rhinoviruses, paramyxoviruses, coronaviruses, and bocavirus during acute respiratory tract infections. J Med Virol 78, 1232–40.

Arola, A., Santti, J., Ruuskanen, O., Halonen, P., and Hyypiä, T. (1996). Identification of enteroviruses in clinical specimens by competitive PCR followed by genetic typing using sequence analysis. J Clin Micro 34, 313–8.

Baltimore, D. (1971). Expression of animal virus genomes. Bacteriol Rev 35, 235–41.

Barton, D. J., O’Donnell, B. J., and Flanegan, J. B. (2001). 5’ cloverleaf in poliovirus RNA is a cis-acting replication element required for negative-strand synthesis. EMBO J 20, 1439–48.

Belsham, G. J. (2009). Divergent picornavirus IRES elements. Virus Res 139, 183–92.

Bennett, S., Harvala, H., Witteveldt, J., McWilliam Leitch, E., McLeish, N., Templeton, K., Gunson, R., Carman, W., and Simmonds, P. (2011). Rapid simultaneous detection of enterovirus and parechovirus RNAs in clinical samples by one-step real-time reverse transcription-PCR assay. J Clin Micro 49, 2620–4.

Benschop, K., Minnaar, R., Koen, G., van Eijk, H., Dijkman, K., Westerhuis, B., Molenkamp, R., and Wolthers, K. (2010a). Detection of human enterovirus and human parechovirus (HPeV) genotypes from clinical stool samples: polymerase chain reaction and direct molecular typing, culture characteristics, and serotyping. Diagn Microbiol Infect Dis 68, 166–73.

239

References

Benschop, K. S., de Vries, M., Minnaar, R. P., Stanway, G., van der Hoek, L., Wolthers, K. C., and Simmonds, P. (2010b). Comprehensive full-length sequence analyses of human parechoviruses: diversity and recombination. J Gen Virol 91, 145–54.

Benschop, K. S., Williams, C. H., Wolthers, K. C., Stanway, G., and Simmonds, P. (2008). Widespread recombination within human parechoviruses: analysis of temporal dynamics and constraints. J Gen Virol 89, 1030–5.

Berry, V., and Gascuel, O. (1996). On the interpretation of bootstrap trees: appropriate threshold of clade selection and induced gain. Mol Biol Evol 13, 999–1011.

Bienz, K., Egger, D., and Pfister, T. (1994). Characteristics of the poliovirus replication complex. Arch Virol Suppl 9, 147–57.

Bizzintino, J., Lee, W.-M., Laing, I. a, Vang, F., Pappas, T., Zhang, G., Martin, a C., Khoo, S.-K., Cox, D. W., Geelhoed, G. C., et al. (2011). Association between human rhinovirus C and severity of acute asthma in children. Eur Resp J 37, 1037–42.

Black, R. E., Morris, S. S., and Bryce, J. (2003). Where and why are 10 million children dying every year? Lancet 361, 2226–34.

Blinkova, O., Kapoor, A., Victoria, J., Jones, M., Wolfe, N., Naeem, A., Shaukat, S., Sharif, S., Alam, M. M., Angez, M., et al. (2009). Cardioviruses Are Genetically Diverse and Cause Common Enteric Infections in South Asian Children. J Virol 83, 4631–4641.

Blomqvist, S. (2003). Characterization of a recombinant type 3/type 2 poliovirus isolated from a healthy vaccinee and containing a chimeric capsid protein VP1. J Gen Virol 84, 573–580.

Blomqvist, S., Paananen, A., Savolainen-Kopra, C., Hovi, T., and Roivainen, M. (2008). Eight years of experience with molecular identification of human enteroviruses. J Clin Micro 46, 2410–3.

Blomqvist, S., Roivainen, M., Puhakka, T., Kleemola, M., and Hovi, T. (2002a). Virological and serological analysis of rhinovirus infections during the first two years of life in a cohort of children. J Med Virol 66, 263–8.

Blomqvist, S., Savolainen, C., Råman, L., Roivainen, M., and Hovi, T. (2002b). Human rhinovirus 87 and enterovirus 68 represent a unique serotype with rhinovirus and enterovirus features. J Clin Micro 40, 4218–23.

Blomqvist, S., Savolainen-Kopra, C., Paananen, A., Hovi, T., and Roivainen, M. (2009). Molecular characterization of human rhinovirus field strains isolated during surveillance of enteroviruses. J Gen Virol 90, 1371–81.

240

References

Bochkov, Y. A., Palmenberg, A. C., Lee, W.-M., Rathe, J. a, Amineva, S. P., Sun, X., Pasic, T. R., Jarjour, N. N., Liggett, S. B., and Gern, J. E. (2011). Molecular modeling, organ culture and reverse genetics for a newly identified human rhinovirus C. Nat Med 17, 627–32.

Bouslama, L., Nasri, D., Chollet, L., Belguith, K., Bourlet, T., Aouni, M., Pozzetto, B., and Pillet, S. (2007). Natural recombination event within the capsid genomic region leading to a chimeric strain of human enterovirus B. J Virol 81, 8944–52.

Brabec, M., Baravalle, G., Blaas, D., and Fuchs, R. (2003). Conformational changes, plasma membrane penetration, and infection by human rhinovirus type 2: role of receptors and low pH. J Virol 77, 5370–7.

Brandenburg, B., Lee, L. Y., Lakadamyali, M., Rust, M. J., Zhuang, X., and Hogle, J. M. (2007). Imaging poliovirus entry in live cells. PLoS Biol 5, e183.

Briese, T., Renwick, N., Venter, M., Jarman, R. G., Ghosh, D., Köndgen, S., Shrestha, S. K., Hoegh, A. M., Casas, I., Adjogoua, E. V., et al. (2008). Global distribution of novel rhinovirus genotype. Emerg Infect Dis 14, 944–7.

Brittain-Long, R., Westin, J., Olofsson, S., Lindh, M., and Andersson, L.-M. (2010). Prospective evaluation of a novel multiplex real-time PCR assay for detection of fifteen respiratory pathogens-duration of symptoms significantly affects detection rate. J Clin Virol 47, 263–7.

Broberg, E., Niemelä, J., Lahti, E., Hyypiä, T., Ruuskanen, O., and Waris, M. (2011). Human rhinovirus C-Associated severe pneumonia in a neonate. J Clin Virol 51, 79– 82.

Brown, B., Maher, K., Flemister, M. R., Naraghi-Arani, P., Uddin, M., Oberste, M. S., and Pallansch, M. a (2009). Resolving ambiguities in genetic typing of human enterovirus species C clinical isolates and identification of enterovirus 96, 99 and 102. J Gen Virol 90, 1713–23.

Brown, B., Oberste, M. S., Maher, K., and Pallansch, M. A. (2003). Complete genomic sequencing shows that polioviruses and members of human enterovirus species C are closely related in the noncapsid coding region. J Virol 77, 8973–84.

Brown, D. M., Cornell, C. T., Tran, G. P., Nguyen, J. H. C., and Semler, B. L. (2005). An Authentic 3Ј Noncoding Region Is Necessary for Efficient Poliovirus Replication. 79, 11962–11973.

Bryce, J., Boschi-Pinto, C., Shibuya, K., and Black, R. E. (2005). WHO estimates of the causes of death in children. Lancet 365, 1147–52.

241

References

Burns, C. C., Shaw, J., Jorba, J., Bukbuk, D., Adu, F., Gumede, N., Pate, M. A., Abanida, E. A., Gasasira, A., Iber, J., et al. (2013). Multiple Independent Emergences of Type 2 Vaccine-Derived Polioviruses during a Large Outbreak in northern Nigeria. J Virol, February [Epub ahead of print].

Calvert, J., Chieochansin, T., Benschop, K. S., McWilliam Leitch, E. C., Drexler, J. F., Grywna, K., da Costa Ribeiro, H., Drosten, C., Harvala, H., Poovorawan, Y., et al. (2010). Recombination dynamics of human parechoviruses: investigation of type- specific differences in frequency and epidemiological correlates. J Gen Virol 91, 1229– 38.

Chan, Y.-F., Jafar, F. L., Nathan, A.-M., de Bruyne, J. A., Hassan, A., Nor’e, S. S., Abubakar, S., and Sam, I.-C. (2012). Diverse human rhinoviruses A and C from children with respiratory infections in Kuala Lumpur, Malaysia. J Infect 64, 633–6.

Chan-Yeung, M., Aït-Khaled, N., White, N., Ip, M. S., and Tan, W. C. (2004). The burden and impact of COPD in Asia and Africa. Int J Tuberc Lung Dis 8, 2–14.

Chang, G.-H., Lin, L., Luo, Y.-J., Cai, L.-J., Wu, X.-Y., Xu, H.-M., and Zhu, Q.-Y. (2010). Sequence analysis of six enterovirus 71 strains with different virulences in humans. Virus Res 151, 66–73.

Chevaliez, S., Szendröi, A., Caro, V., Balanant, J., Guillot, S., Berencsi, G., and Delpeyroux, F. (2004). Molecular comparison of echovirus 11 strains circulating in Europe during an epidemic of multisystem hemorrhagic disease of infants indicates that evolution generally occurs by recombination. Virology 325, 56–70.

Chidlow, G. R., Laing, I. A., Harnett, G. B., Greenhill, A. R., Phuanukoonnon, S., Siba, P. M., Pomat, W. S., Shellam, G. R., Smith, D. W., and Lehmann, D. (2012). Respiratory viral pathogens associated with lower respiratory tract disease among young children in the highlands of Papua New Guinea. J Clin Virol 54, 235–239.

Colonno, R. J., Condra, J. H., Mizutani, S., Callahan, P. L., Davies, M. E., and Murcko, M. a (1988). Evidence for the direct involvement of the rhinovirus canyon in receptor binding. Proc Natl Acad Sci USA 85, 5449–53.

Cooney, M. K., Fox, J. P., and Kenny, G. E. (1982). Antigenic groupings of 90 rhinovirus serotypes. Infect Immun 37, 642–7.

Cordey, S., Gerlach, D., Junier, T., Zdobnov, E. M., Kaiser, L., and Tapparel, C. (2008). The cis-acting replication elements define human enterovirus and rhinovirus species. RNA 14, 1568–78.

242

References

Costa, C., Bergallo, M., Astegiano, S., Sidoti, F., Terlizzi, M. E., Gambarino, S., Curtoni, A., Simeone, S., Solidoro, P., and Cavallo, R. (2011). Detection of human rhinoviruses in the lower respiratory tract of lung transplant recipients. Arch Virol 156, 1439–43.

Crotty, S., Cameron, C. E., and Andino, R. (2001). RNA virus error catastrophe: direct molecular test by using ribavirin. Proc Natl Acad Sci USA 98, 6895–900.

Cuervo, N. S., Guillot, S., Romanenkova, N., Combiescu, M., Aubert-Combiescu, A., Seghier, M., Caro, V., Crainic, R., and Delpeyroux, F. (2001). Genomic features of intertypic recombinant sabin poliovirus strains excreted by primary vaccinees. J Virol 75, 5740–51.

Dahourou, G., Guillot, S., Le Gall, O., and Crainic, R. (2002). Genetic recombination in wild-type poliovirus. J Gen Virol 83, 3103–10.

Dalldorf, G., and Sickles, G. M. (1948). An Unidentified, Filtrable Agent Isolated From the Feces of Children With Paralysis. Science 108, 61–2.

Danthi, P., Tosteson, M., Li, Q., and Chow, M. (2003). Genome Delivery and Ion Channel Properties Are Altered in VP4 Mutants of Poliovirus. J Virol 77, 5266–5274.

Davis, M. P., Bottley, G., Beales, L. P., Killington, R. a, Rowlands, D. J., and Tuthill, T. J. (2008). Recombinant VP4 of human rhinovirus induces permeability in model membranes. J Virol 82, 4169–74.

DeTulleo, L., and Kirchhausen, T. (1998). The clathrin endocytic pathway in viral infection. EMBO J 17, 4585–93.

Devaney, M. a, Vakharia, V. N., Lloyd, R. E., Ehrenfeld, E., and Grubman, M. J. (1988). Leader protein of foot-and-mouth disease virus is required for cleavage of the p220 component of the cap-binding protein complex. J Virol 62, 4407–9.

Dierssen, U., Rehren, F., Henke-Gendo, C., Harste, G., and Heim, A. (2008). Rapid routine detection of enterovirus RNA in cerebrospinal fluid by a one-step real-time RT-PCR assay. J Clin Virol 42, 58–64.

Domingo, E., Escarmís, C., Sevilla, N., Moya, A., Elena, S. F., Quer, J., Novella, I. S., and Holland, J. J. (1996). Basic concepts in RNA virus evolution. FASEB J 10, 859–64.

Domingo, E., Martin, V., Perales, C., and Escarmis, C. (2008). Coxsackieviruses and quasispecies theory: evolution of enteroviruses. Curr Top Microbiol Immunol 323, 3– 32.

Drake, J. W., and Holland, J. J. (1999). Mutation rates among RNA viruses. Proc Natl Acad Sci USA 96, 13910–3.

243

References

Dreschers, S., Dumitru, C. a, Adams, C., and Gulbins, E. (2007). The cold case: are rhinoviruses perfectly adapted pathogens? Cellular and molecular life sciences : CMLS 64, 181–91.

Drexler, J. F., Baumgarte, S., de Souza Luna, L. K., Stocker, A., Almeida, P. S., Ribeiro, T. C. M., Petersen, N., Herzog, P., Pedroso, C., Brites, C., et al. (2010). Genomic features and evolutionary constraints in Saffold-like cardioviruses. J Gen Virol 91, 1418–1427.

Drummond, A. J., and Rambaut, A. (2007). BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7, 214.

Du, Z., Yu, J., Ulyanov, N. B., Andino, R., and James, T. L. (2004). Solution structure of a consensus stem-loop D RNA domain that plays important roles in regulating translation and replication in enteroviruses and rhinoviruses. Biochemistry 43, 11959–72.

Duechler, M., Skern, T. I. M., Sommergrubert, W., Neubauer, C., Gruendler, P., Fogyt, I., Blaas, D., and Kuechlert, E. (1987). Evolutionary relationships within the human rhinovirus genus : Biochemistry 84, 2605–2609.

Duffy, S., Shackelton, L. A., and Holmes, E. C. (2008). Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet 9, 267–76.

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32, 1792–7.

El-Sahly, H. M., Atmar, R. L., Glezen, W. P., and Greenberg, S. B. (2000). Spectrum of clinical illness in hospitalized patients with “common cold” virus infections. Clin Infect Dis 31, 96–100.

Faux, C., Arden, K. E., Lambert, S. B., Nissen, M. D., Nolan, T. M., Chang, A. B., Sloots, T. P., and Mackay, I. M. (2011). Usefulness of published PCR primers in detecting human rhinovirus infection. Emerg Infect Dis 17, 296–8.

Felsenstein, J. (1978). Cases in which parsimony or compatibility methods will be positively misleading. Systematic Zoology 27, 401–410.

Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Fendrick, a M., Monto, A. S., Nightengale, B., and Sarnes, M. (2003). The economic burden of non-influenza-related viral respiratory tract infection in the United States. Arch Intern Med 163, 487–94.

Fitzgerald, K. D., and Semler, B. L. (2009). Bridging IRES elements in mRNAs to the eukaryotic translation apparatus. Biochimica et biophysica acta 1789, 518–28.

244

References

Forss, S., and Schaller, H. (1982). A tandem repeat gene in a picornavirus. Nucleic Acids Res 10, 6441–50.

Fricks, C. E., and Hogle, J. M. (1990). Cell-induced conformational change in poliovirus: externalization of the amino terminus of VP1 is responsible for liposome binding. J Virol 64, 1934–45.

Friedlander, S. L., and Busse, W. W. (2005). The role of rhinovirus in asthma exacerbations. J Allergy Clin Immunol 116, 267–73.

Fry, A. M., Lu, X., Olsen, S. J., Chittaganpitch, M., Sawatwong, P., Chantra, S., Baggett, H. C., and Erdman, D. (2011). Human Rhinovirus Infections in Rural Thailand: Epidemiological Evidence for Rhinovirus as Both Pathogen and Bystander. PloS one 6, e17780.

Fuji, N., Suzuki, A., Lupisan, S., Sombrero, L., Galang, H., Kamigaki, T., Tamaki, R., Saito, M., Aniceto, R., Olveda, R., et al. (2011). Detection of human rhinovirus C viral genome in blood among children with severe respiratory infections in the Philippines. PloS one 6, e27247.

Fujitsuka, A., Tsukagoshi, H., Arakawa, M., Goto-Sugai, K., Ryo, A., Okayama, Y., Mizuta, K., Nishina, A., Yoshizumi, M., Kaburagi, Y., et al. (2011). A molecular epidemiological study of respiratory viruses detected in Japanese children with acute wheezing illness. BMC Infect Dis 11, 168.

Gaaloul, I., Riabi, S., Harrath, R., Evans, M., Salem, N. H., Mlayeh, S., Huber, S., and Aouni, M. (2012). Sudden unexpected death related to enterovirus myocarditis: histopathology, immunohistochemistry and molecular pathology diagnosis at post- mortem. BMC Infect Dis 12, 212.

Le Gall, O., Christian, P., Fauquet, C. M., King, A. M. Q., Knowles, N. J., Nakashima, N., Stanway, G., and Gorbalenya, A. E. (2008). Picornavirales, a proposed order of positive-sense single-stranded RNA viruses with a pseudo-T = 3 virion architecture. Arch Virol 153, 715–27.

Garbino, J., Soccal, P. M., Aubert, J., Rochat, T., Meylan, P., Thomas, Y., Tapparel, C., Bridevaux, P., and Kaiser, L. (2009). Respiratory viruses in bronchoalveolar lavage : a hospital-based cohort study in adults. Thorax, 399–404.

Garriga, D., Pickl-Herk, A., Luque, D., Wruss, J., Castón, J. R., Blaas, D., and Verdaguer, N. (2012). Insights into minor group rhinovirus uncoating: the X-ray structure of the HRV2 empty capsid. PLoS pathogens 8, e1002473.

Gascuel, O., and Steel, M. (2006). Neighbor-joining revealed. Mol Biol Evol 23, 1997–2000.

245

References

Gaunt, E., McWilliam-Leitch, E. C., Templeton, K., and Simmonds, P. (2009). Incidence, molecular epidemiology and clinical presentations of human metapneumovirus; assessment of its importance as a diagnostic screening target. J Clin Virol 46, 318–24.

Gaunt, E. R., Hardie, A., Claas, E. C. J., Simmonds, P., and Templeton, K. E. (2010). Epidemiology and clinical presentations of the four human coronaviruses 229E, HKU1, NL63, and OC43 detected over 3 years using a novel multiplex real-time PCR method. J Clin Micro 48, 2940–7.

Gaunt, E. R., Harvala, H., McIntyre, C., Templeton, K. E., and Simmonds, P. (2011). Disease burden of the most commonly detected respiratory viruses in hospitalized patients calculated using the disability adjusted life year (DALY) model. J Clin Virol 52, 215–21.

Gerber, K., Wimmer, E., and Paul, A. V. (2001). Biochemical and genetic studies of the initiation of human rhinovirus 2 RNA replication: identification of a cis-replicating element in the coding sequence of 2A(pro). J Virol 75, 10979–90.

Gern, J. E., and Busse, W. W. (1999). Association of rhinovirus infections with asthma. Clin Microbiol Rev 12, 9–18.

Gern, J. E., Galagan, D. M., Jarjour, N. N., Dick, E. C., and Busse, W. W. (1997). Detection of rhinovirus RNA in lower airway cells during experimentally induced infection. Am J Respir Crit Care Med 155, 1159–61.

Gibbs, M. J., Armstrong, J. S., and Gibbs, A. J. (2000). Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573–82.

Giranda, V. L., Heinz, B. a, Oliveira, M. a, Minor, I., Kim, K. H., Kolatkar, P. R., Rossmann, M. G., and Rueckert, R. R. (1992). Acid-induced structural changes in human rhinovirus 14: possible role in uncoating. Proc Natl Acad Sci USA 89, 10213–7.

Glaser, W., Triendl, A., and Skern, T. (2003). The processing of eIF4GI by human rhinovirus type 2 2A(pro): relationship to self-cleavage and role of zinc. J Virol 77, 5021–5.

Gmyl, A. P., Korshenko, S. A., Belousov, E. V., Khitrina, E. V., and Agol, V. I. (2003). Nonreplicative homologous RNA recombination: promiscuous joining of RNA pieces? RNA 9, 1221–31.

Goodfellow, I., Chaudhry, Y., Richardson, A., Meredith, J., Almond, J. W., Barclay, W., and Evans, D. J. (2000). Identification of a cis-acting replication element within the poliovirus coding region. J Virol 74, 4590–600.

246

References

Greve, J. M., Davis, G., Meyer, a M., Forte, C. P., Yost, S. C., Marlor, C. W., Kamarck, M. E., and McClelland, a (1989). The major human rhinovirus receptor is ICAM-1. Cell 56, 839–47.

Guillot, S., Caro, V., Cuervo, N., Korotkova, E., Combiescu, M., Persu, A., Aubert- Combiescu, A., Delpeyroux, F., and Crainic, R. (2000). Natural genetic exchanges between vaccine and wild poliovirus strains in humans. J Virol 74, 8434–43.

Guindon, S., and Gascuel, O. (2003). A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Syst Biol 52, 696–704.

Gutman, J. A., Peck, A. J., Kuypers, J., and Boeckh, M. (2007). Rhinovirus as a cause of fatal lower respiratory tract infection in adult stem cell transplantation patients: a report of two cases. Bone Marrow Transplant 40, 809–11.

Hadfield, A., Lee, W. M., Zhao, R., Oliveira, M. a, Minor, I., Rueckert, R. R., and Rossmann, M. G. (1997). The refined structure of human rhinovirus 16 at 2.15 A resolution: implications for the viral life cycle. Structure 5, 427–41.

Halfpap, L. M., and Cooney, M. K. (1983). Isolation of rhinovirus intertypes related to either rhinoviruses 12 and 78 or 36 and 58. Infect Immun 40, 213–8.

Hamparian, V. V., Colonno, R. J., Cooney, M. K., Dick, E. C., Gwaltney, J. M., Hughes, J. H., Jordan, W. S., Kapikian, A. Z., Mogabgab, W. J., and Monto, A. (1987). A collaborative report: rhinoviruses--extension of the numbering system from 89 to 100. Virology 159, 191–2.

Han, T. H., Chung, J. Y., Hwang, E. S., and Koo, J. W. (2009). Detection of human rhinovirus C in children with acute lower respiratory tract infections in South Korea. Arch Virol 154, 987–91.

Harris, J. R., and Racaniello, V. R. (2005). Amino acid changes in proteins 2B and 3A mediate rhinovirus type 39 growth in mouse cells. J Virol 79, 5363–73.

Harvala, H., McIntyre, C. L., Imai, N., Clasper, L., Djoko, C. F., LeBreton, M., Vermeulen, M., Saville, A., Mutapi, F., Tamoufé, U., et al. (2012a). High seroprevalence of enterovirus infections in apes and old world monkeys. Emerg Infect Dis 18, 283–6.

Harvala, H., McIntyre, C. L., McLeish, N. J., Kondracka, J., Palmer, J., Molyneaux, P., Gunson, R., Bennett, S., Templeton, K., and Simmonds, P. (2012b). High detection frequency and viral loads of human rhinovirus species A to C in fecal samples; diagnostic and clinical implications. J Med Virol 84, 536–42.

Harvala, H., Sharp, C. P., Ngole, E. M., Delaporte, E., Peeters, M., and Simmonds, P. (2011). Detection and genetic characterization of enteroviruses circulating among wild

247

References

populations of chimpanzees in Cameroon: relationship with human and simian enteroviruses. J Virol 85, 4480–6.

Heath, L., van der Walt, E., Varsani, A., and Martin, D. P. (2006). Recombination patterns in aphthoviruses mirror those found in other picornaviruses. J Virol 80, 11827–32.

Hellen, C. U. T., and de Breyne, S. (2007). A distinct group of hepacivirus/pestivirus-like internal ribosomal entry sites in members of diverse picornavirus genera: evidence for modular exchange of functional noncoding RNA elements by recombination. J Virol 81, 5850–63.

Henquell, C., Mirand, A., Deusebis, A.-L., Regagnon, C., Archimbaud, C., Chambon, M., Bailly, J.-L., Gourdon, F., Hermet, E., Dauphin, J.-B., et al. (2012). Prospective genotyping of human rhinoviruses in children and adults during the winter of 2009- 2010. J Clin Virol 53, 280–4.

Herold, J., and Andino, R. (2001). Poliovirus RNA replication requires genome circularization through a protein-protein bridge. Mol Cell 7, 581–91.

Hewat, E. A., Neumann, E., Conway, J. F., Moser, R., Ronacher, B., Marlovits, T. C., and Blaas, D. (2000). The cellular receptor to human rhinovirus 2 binds around the 5-fold axis and not in the canyon : a structural view. EMBO J 19, 6317–6325.

Hicks, A. L., and Duffy, S. (2011). Genus-Specific Substitution Rate Variability among Picornaviruses. J Virol 85, 7942–7.

Hicks, L. A., Shepard, C. W., Britz, P. H., Erdman, D. D., Fischer, M., Flannery, B. L., Peck, A. J., Lu, X., Thacker, W. L., Benson, R. F., et al. (2006). Two outbreaks of severe respiratory disease in nursing homes associated with rhinovirus. J Am Ger Soc 54, 284– 9.

Hofer, F., Gruenberger, M., Kowalski, H., Machat, H., Huettinger, M., Kuechler, E., and Blaas, D. (1994). Members of the low density lipoprotein receptor family mediate cell entry of a minor-group common cold virus. Proc Natl Acad Sci USA 91, 1839–42.

Holder, M., and Lewis, P. O. (2003). Phylogeny estimation: traditional and Bayesian approaches. Nat Rev Genet 4, 275–84.

Holland, D. T., Senne, J., Peter, C. R., Urmenta, C., and Connor, J. D. (1998). Differentiation and characterization of enteroviruses by computer-assisted viral protein fingerprinting. J Clin Micro 36, 1588–94.

Honkanen, H., Oikarinen, S., Peltonen, P., Simell, O., Ilonen, J., Veijola, R., Knip, M., and Hyoty, H. (2013). Human rhinoviruses including group C are common in stool samples of young Finnish children. J Clin Virol 56, 334–338.

248

References

Horsnell, C., Gama, R. E., Hughes, P. J., and Stanway, G. (1995). Molecular relationships between 21 human rhinovirus serotypes. J Gen Virol 76, 2549–55.

Huang, S.-C., Hsu, Y.-W., Wang, H.-C., Huang, S.-W., Kiang, D., Tsai, H.-P., Wang, S.-M., Liu, C.-C., Lin, K.-H., Su, I.-J., et al. (2008). Appearance of intratypic recombination of enterovirus 71 in Taiwan from 2002 to 2005. Virus Res 131, 250–9.

Huang, T., Wang, W., Bessaud, M., Ren, P., Sheng, J., Yan, H., Zhang, J., Lin, X., Wang, Y., Delpeyroux, F., et al. (2009). Evidence of recombination and genetic diversity in human rhinoviruses in children with acute respiratory infection. PLoS One 4, e6355.

Hughes, P. J., North, C., Jellis, C. H., Minor, P. D., and Stanway, G. (1988). The nucleotide sequence of human rhinovirus 1B: molecular relationships within the rhinovirus genus. J Gen Virol 69 ( Pt 1), 49–58.

Hyypiä, T., Hovi, T., Knowles, N. J., and Stanway, G. (1997). Classification of enteroviruses based on molecular and biological properties. J Gen Virol 78 ( Pt 1), 1–11.

Ishiko, H., Miura, R., Shimada, Y., Hayashi, A., Nakajima, H., Yamazaki, S., and Takeda, N. (2002). Human rhinovirus 87 identified as human enterovirus 68 by VP4-based molecular diagnosis. Intervirology 45, 136–41.

JG Kapsenberg, A Ras, J. K. (1980). Improvement of enterovirus neutralization by treatment with sodium deoxycholate or chloroform. Intervirology 12, 329–34.

Jackson, D. J., Gangnon, R. E., Evans, M. D., Roberg, K. a, Anderson, E. L., Pappas, T. E., Printz, M. C., Lee, W.-M., Shult, P. a, Reisdorf, E., et al. (2008). Wheezing rhinovirus illnesses in early life predict asthma development in high-risk children. Am J Respir Crit Care Med 178, 667–72.

Jang, S. K., Kräusslich, H. G., Nicklin, M. J., Duke, G. M., Palmenberg, a C., and Wimmer, E. (1988). A segment of the 5’ nontranslated region of encephalomyocarditis virus RNA directs internal entry of ribosomes during in vitro translation. J Virol 62, 2636– 43.

Jartti, T., and Korppi, M. (2011). Rhinovirus-induced bronchiolitis and asthma development. Paediatr Allergy Immunol 22, 350–5.

Jegouic, S., Joffret, M.-L., Blanchard, C., Riquet, F. B., Perret, C., Pelletier, I., Colbere- Garapin, F., Rakoto-Andrianarivelo, M., and Delpeyroux, F. (2009). Recombination between polioviruses and co-circulating Coxsackie A viruses: role in the emergence of pathogenic vaccine-derived polioviruses. PLoS Pathog 5, e1000412.

Jenkins, G. M., Rambaut, A., Pybus, O. G., and Holmes, E. C. (2002). Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J Mol Evol 54, 156–65.

249

References

Jin, Y., Yuan, X.-H., Xie, Z.-P., Gao, H.-C., Song, J.-R., Zhang, R.-F., Xu, Z.-Q., Zheng, L.- S., Hou, Y.-D., and Duan, Z.-J. (2009). Prevalence and clinical characterization of a newly identified human rhinovirus C species in children with acute respiratory tract infections. J Clin Micro 47, 2895–900.

Joshi, M. S., Walimbe, A. M., and Chitambar, S. D. (2008). Evaluation of genomic regions of hepatitis A virus for phylogenetic analysis: Suitability of the 2C region for genotyping. J Virol Methods 153, 36–42.

Kaida, A., Kubo, H., Takakura, K., Togawa, M., Shiomi, M., Kohdera, U., and Iritani, N. (2011). Molecular epidemiology of human rhinovirus C in patients with acute respiratory tract infections in Osaka City, Japan. Jpn J Infect Dis 64, 488–92.

Kapikian, A., Conant, R., and Hamparian, V. (1967). Rhinoviruses: a numbering system. Nature 213, 761–2.

Kapikian, A., Conant, R., Hamparian, V., Chanock, R., Dick, E., Gwaltney, J., Hamre, D., Jordan, W., Kenny, G., Lennette, E., et al. (1971). A collaborative report: rhinoviruses- -extension of the numbering system. Virology 43, 524–526.

Katpally, U., Fu, T.-M., Freed, D. C., Casimiro, D. R., and Smith, T. J. (2009). Antibodies to the buried N terminus of rhinovirus VP4 exhibit cross-serotypic neutralization. J Virol 83, 7040–8.

Khetsuriani, N., Lu, X., Teague, W. G., Kazerouni, N., Anderson, L. J., and Erdman, D. D. (2008). Novel human rhinoviruses and exacerbation of asthma in children. Emerg Infect Dis 14, 1793–6.

Kiang, D., Kalra, I., Yagi, S., Louie, J. K., Boushey, H., Boothby, J., and Schnurr, D. P. (2008). Assay for 5’ noncoding region analysis of all human rhinovirus prototype strains. J Clin Microbiol 46, 3736–45.

Kiang, D., Newbower, E. C., Yeh, E., Wold, L., Chen, L., and Schnurr, D. P. (2009). An algorithm for the typing of enteroviruses and correlation to serotyping by viral neutralization. J Clin Virol 45, 334–40.

Kilpatrick, D. R., Nottay, B., Yang, C. F., Yang, S. J., Da Silva, E., Peñaranda, S., Pallansch, M., and Kew, O. (1998). Serotype-specific identification of polioviruses by PCR using primers containing mixed-base or deoxyinosine residues at positions of codon degeneracy. J Clin Micro 36, 352–7.

Kirkegaard, K., and Baltimore, D. (1986). The mechanism of RNA recombination in poliovirus. Cell 47, 433–443.

250

References

Kistler, A., Avila, P. C., Rouskin, S., Wang, D., Ward, T., Yagi, S., Schnurr, D., Ganem, D., DeRisi, J. L., and Boushey, H. A. (2007a). Pan-viral screening of respiratory tract infections in adults with and without asthma reveals unexpected human coronavirus and human rhinovirus diversity. J Infect Dis 196, 817–25.

Kistler, A. L., Webster, D. R., Rouskin, S., Magrini, V., Credle, J. J., Schnurr, D. P., Boushey, H. A., Mardis, E. R., Li, H., and DeRisi, J. L. (2007b). Genome-wide diversity and selective pressure in the human rhinovirus. Virol J 4, 40.

Knowles, N. J., Hovi, T., Hyypiä, T., King, A. M. Q., Lindberg, A. M., Pallansch, M. A., Palmenberg, A. C., Simmonds, P., Skern, T., Stanway, G., et al. (2012). Picornaviridae. In Virus Taxonomy: Classification and Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses. , eds. A. M. Q. King, M. J. Adams, E. B. Carstens, and E. J. Lefkowitz. Elsevier.

Knowles, N., and Wadsworth, J. (2010). The complete genome sequence of a picornavirus isolated from a harbour (common) seal (Phoca vitulina). in EUROPIC : XVI Meeting of the European Study Group on the Molecular Biology of Picornaviruses, St. Andrews, Scotland.

Kosakovsky Pond, S. L., Posada, D., Gravenor, M. B., Woelk, C. H., and Frost, S. D. W. (2006a). Automated phylogenetic detection of recombination using a genetic algorithm. Mol Biol Evol 23, 1891–901.

Kosakovsky Pond, S. L., Posada, D., Gravenor, M. B., Woelk, C. H., and Frost, S. D. W. (2006b). GARD: a genetic algorithm for recombination detection. Bioinformatics 22, 3096–8.

Kroneman, A., Vennema, H., Deforche, K., v d Avoort, H., Peñaranda, S., Oberste, M. S., Vinjé, J., and Koopmans, M. (2011). An automated genotyping tool for enteroviruses and noroviruses. J Clin Virol 51, 121–5.

Kupila, L., Vuorinen, T., Vainionpää, R., Hukkanen, V., Marttila, R. J., and Kotilainen, P. (2006). Etiology of aseptic meningitis and encephalitis in an adult population. Neurology 66, 75–80.

Kusel, M. M., de Klerk, N. H., Holt, P. G., Kebadze, T., Johnston, S. L., and Sly, P. D. (2006). Role of respiratory viruses in acute upper and lower respiratory tract illness in the first year of life: a birth cohort study. Pediatr Infect Dis J 25, 680–6.

Kwok, S., and Higuchi, R. (1989). Avoiding false positives with PCR. Nature 339, 237–8.

Lai, M. M. (1992). RNA recombination in animal and plant viruses. Microbiol Rev 56, 61– 79.

251

References

Laine, P., Blomqvist, S., Savolainen, C., Andries, K., and Hovi, T. (2006). Alignment of capsid protein VP1 sequences of all human rhinovirus prototype strains: conserved motifs and functional domains. J Gen Virol 87, 129–38.

Laine, P., Savolainen, C., Blomqvist, S., and Hovi, T. (2005). Phylogenetic analysis of human rhinovirus capsid protein VP1 and 2A protease coding sequences confirms shared genus-like relationships with human enteroviruses. J Gen Virol 86, 697–706.

Lamson, D., Renwick, N., Kapoor, V., Liu, Z., Palacios, G., Ju, J., Dean, A., St George, K., Briese, T., and Lipkin, W. I. (2006). MassTag polymerase-chain-reaction detection of respiratory pathogens, including a new rhinovirus genotype, that caused influenza-like illness in New York State during 2004-2005. J Infect Dis 194, 1398–402.

Lau, S. K. P., Yip, C. C. Y., Lung, D. C., Lee, P., Que, T.-L., Lau, Y.-L., Chan, K.-H., Woo, P. C. Y., and Yuen, K.-Y. (2012). Detection of human rhinovirus C in fecal samples of children with gastroenteritis. J Clin Virol 53, 290–6.

Lau, S. K. P., Yip, C. C. Y., Tsoi, H.-W., Lee, R. a, So, L.-Y., Lau, Y.-L., Chan, K.-H., Woo, P. C. Y., and Yuen, K.-Y. (2007). Clinical features and complete genome characterization of a distinct human rhinovirus (HRV) genetic cluster, probably representing a previously undetected HRV species, HRV-C, associated with acute respiratory illness in children. J Clin Micro 45, 3655–64.

Ledford, R. M., Collett, M. S., and Pevear, D. C. (2005). Insights into the genetic basis for natural phenotypic resistance of human rhinoviruses to pleconaril. Antiviral Res 68, 135–8.

Ledford, R. M., Patel, N. R., Demenczuk, T. M., Watanyar, A., Herbertz, T., Collett, M. S., Pevear, D. C., and Irol, J. V. (2004). VP1 Sequencing of All Human Rhinovirus Serotypes : Insights into Genus Phylogeny and Susceptibility to Antiviral Capsid- Binding Compounds. J Virol 78, 3663–3674.

Lee, W. M., Kiesner, C., Pappas, T., Lee, I., Grindle, K., Jartti, T., Jakiela, B., Lemanske, R. F., Shult, P. A., and Gern, J. E. (2007). A diverse group of previously unrecognized human rhinoviruses are common causes of respiratory illnesses in infants. PLoS One 3, e966.

Lee, W. M., Monroe, S. S., and Rueckert, R. R. (1993). Role of maturation cleavage in infectivity of picornaviruses: activation of an infectosome. J Virol 67, 2110–22.

Lee, W. M., Wang, W., and Rueckert, R. R. (1995). Complete sequence of the RNA genome of human rhinovirus 16, a clinically useful common cold virus belonging to the ICAM- 1 receptor group. Virus genes 9, 177–81.

252

References

Lewis, J. K., Bothner, B., Smith, T. J., and Siuzdak, G. (1998). Antiviral agent blocks breathing of the common cold virus. Proc Natl Acad Sci USA 95, 6774–8.

Lewis-Rogers, N., Bendall, M. L., and Crandall, K. A. (2009). Phylogenetic relationships and molecular adaptation dynamics of human rhinoviruses. Mol Biol Evol 26, 969–81.

Li, Q., Yafal, a G., Lee, Y. M., Hogle, J., and Chow, M. (1994). Poliovirus neutralization by antibodies to internal epitopes of VP4 and VP1 results from reversible exposure of these sequences at physiological temperature. J Virol 68, 3965–70.

Lindberg, A. M., Andersson, P., Savolainen, C., Mulders, M. N., and Hovi, T. (2003). Evolution of the genome of Human enterovirus B: incongruence between phylogenies of the VP1 and 3CD regions indicates frequent recombination within the species. J Gen Virol 84, 1223–1235.

Linsuwanon, P., Payungporn, S., Samransamruajkit, R., Posuwan, N., Makkoch, J., Theanboonlers, A., and Poovorawan, Y. (2009). High prevalence of human rhinovirus C infection in Thai children with acute lower respiratory tract disease. J Infect 59, 115– 21.

Linsuwanon, P., Payungporn, S., Suwannakarn, K., Chieochansin, T., Theamboonlers, A., and Poovorawan, Y. (2011). Complete coding sequence characterization and comparative analysis of the putative novel human rhinovirus (HRV) species C and B. Virol J 8, 5.

Lloyd, R. E., Grubman, M. J., and Ehrenfeld, E. (1988). Relationship of p220 cleavage during picornavirus infection to 2A proteinase sequencing. J Virol 62, 4216–23. van Loon, A., Cleator, G., and Ras, A. (1999). External quality assessment of enterovirus detection and typing. European Union Concerted Action of Virus Meningitis and Encephalitis. Bull World Health Organ 77, 217–223.

Lukashev, A. N. (2005). Role of recombination in evolution of enteroviruses. Rev Med Virol 15, 157–67.

Lukashev, A. N., Lashkevich, V. A., Ivanova, O. E., Koroleva, G. A., Hinkkanen, A. E., and Ilonen, J. (2005). Recombination in circulating Human enterovirus B: independent evolution of structural and non-structural genome regions. J Gen Virol 86, 3281–90.

Lysholm, F., Wetterbom, A., Lindau, C., Darban, H., Bjerkner, A., Fahlander, K., Lindberg, a M., Persson, B., Allander, T., and Andersson, B. (2012). Characterization of the viral microbiome in patients with severe lower respiratory tract infections, using metagenomic sequencing. PloS one 7, e30875.

253

References

Macadam, A., Arnold, C., Howlett, J., John, A., Marsden, S., Taffs, F., Reeve, P., Hamada, N., Wareham, K., and Almond, J. (1989). Reversion of the attenuated and temperature- sensitive phenotypes of the Sabin type 3 strain of poliovirus in vaccinees. Virology 172, 408–14.

Mallia, P., Message, S. D., Gielen, V., Contoli, M., Gray, K., Kebadze, T., Aniscenko, J., Laza-Stanca, V., Edwards, M. R., Slater, L., et al. (2011). Experimental rhinovirus infection as a human model of chronic obstructive pulmonary disease exacerbation. Am J Respir Crit Care Med 183, 734–42.

Martin, D. P., Lemey, P., Lott, M., Moulton, V., Posada, D., and Lefeuvre, P. (2010). RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics 26, 2462–3.

Martin, D. P., Posada, D., Crandall, K. A., and Williamson, C. (2005). A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum 21, 98–102.

Martin, D., and Rybicki, E. (2000). RDP: detection of recombination amongst aligned sequences. Bioinformatics 16, 562–3.

McErlean, P., Shackelton, L. a, Andrews, E., Webster, D. R., Lambert, S. B., Nissen, M. D., Sloots, T. P., and Mackay, I. M. (2008). Distinguishing molecular features and clinical characteristics of a putative new rhinovirus species, human rhinovirus C (HRV C). PloS one 3, e1847.

McErlean, P., Shackelton, L. a, Lambert, S. B., Nissen, M. D., Sloots, T. P., and Mackay, I. M. (2007). Characterisation of a newly identified human rhinovirus, HRV-QPM, discovered in infants with bronchiolitis. J Clin Virol 39, 67–75.

McIntyre, C., Knowles, N., and Simmonds, P. (2013a). Proposals for the classification of human rhinovirus species A, B and C (HRV-A, -B and –C) into genotypically assigned types. [Submitted].

McIntyre, C., McWilliam Leitch, E. C., Savolainen-Kopra, C., Hovi, T., and Simmonds, P. (2010). Analysis of genetic diversity and sites of recombination in human rhinovirus species C. J Virol 84, 10297–310.

McIntyre, C., Savolainen-Kopra, C., Hovi, T., and Simmonds, P. (2013b). Recombination in the evolution of human rhinovirus genomes. Arch Virol, February [Epub ahead of print].

McKnight, K. L., and Lemon, S. M. (1998). The rhinovirus type 14 genome contains an internally located RNA structure that is required for viral replication. RNA 4, 1569–84.

254

References

McLeish, N., Witteveldt, J., Clasper, L., McIntyre, C., McWilliam Leitch, E. C., Hardie, A., Bennett, S., Gunson, R., Carman, W. F., Feeney, S. a, et al. (2012). Development and assay of RNA transcripts of enterovirus species A to D, rhinovirus species a to C, and human parechovirus: assessment of assay sensitivity and specificity of real-time screening and typing methods. J Clin Micro 50, 2910–7.

McManus, T. E., Marley, A.-M., Baxter, N., Christie, S. N., O’Neill, H. J., Elborn, J. S., Coyle, P. V., and Kidney, J. C. (2008). Respiratory viral infection in exacerbations of COPD. Respir Med 102, 1575–80.

McWilliam Leitch, E. C., Bendig, J., Cabrerizo, M., Cardosa, J., Hyypiä, T., Ivanova, O. E., Kelly, A., Kroes, A. C., Lukashev, A., MacAdam, A., et al. (2009a). Transmission networks and population turnover of echovirus 30. J Virol 83, 2109–18.

McWilliam Leitch, E. C., Cabrerizo, M., Cardosa, J., Harvala, H., Ivanova, O. E., Koike, S., Kroes, A. C., Lukashev, A. N., Perera, D., Roivainen, M., et al. (2012). The association of recombination events in the founding and emergence of subgenogroup evolutionary lineages of human enterovirus 71. J Virol 86, 2676–85.

McWilliam Leitch, E. C., Cabrerizo, M., Cardosa, J., Harvala, H., Ivanova, O. E., Kroes, A. C., Lukashev, A., Muir, P., Odoom, J., Roivainen, M., et al. (2010). Evolutionary dynamics and temporal/geographical correlates of recombination in the human enterovirus echovirus types 9, 11, and 30. J Virol 84, 9292–300.

McWilliam Leitch, E. C., Harvala, H., Robertson, I., Ubillos, I., Templeton, K., and Simmonds, P. (2009b). Direct identification of human enterovirus serotypes in cerebrospinal fluid by amplification and sequencing of the VP1 region. J Clin Microbiol 44, 119–124.

Medappa, K. C., McLean, C., and Rueckert, R. R. (1971). On the structure of rhinovirus 1A. Virology 44, 259–70.

Melnick, J. L., Shaw, E. W., and Curnen, E. C. (1949). A virus isolated from patients diagnosed as non-paralytic poliomyelitis or aseptic meningitis. Proc Soc Exp Biol Med 71, 344–9.

Miller, E. K., Edwards, K. M., Weinberg, G. A., Iwane, M. K., Griffin, M. R., Hall, C. B., Zhu, Y., Szilagyi, P. G., Morin, L. L., Heil, L. H., et al. (2009a). A novel group of rhinoviruses is associated with asthma hospitalizations. J Allergy Clin Immunol 123, 98–104.e1.

Miller, E. K., Khuri-Bulos, N., Williams, J. V., Shehabi, A. a, Faouri, S., Al Jundi, I., Chen, Q., Heil, L., Mohamed, Y., Morin, L.-L., et al. (2009b). Human rhinovirus C associated with wheezing in hospitalised children in the Middle East. J Clin Virol 46, 85–9.

255

References

Mirand, A., Rouveyrol, F., Chambon, M., Archimbaud, C., Regagnon, C., Bailly, J.-L., Peigue-Lafeuille, H., and Henquell, C. (2009). Enterovirus genotyping directly from original clinical specimens: prospective application to a severe neonatal infection. J Clin Virol 44, 177–8.

Mizuta, K., Hirata, A., Suto, A., Aoki, Y., Ahiko, T., Itagaki, T., Tsukagoshi, H., Morita, Y., Obuchi, M., Akiyama, M., et al. (2010). Phylogenetic and cluster analysis of human rhinovirus species A (HRV-A) isolated from children with acute respiratory infections in Yamagata, Japan. Virus Res 147, 265–74.

Mosser, A. G., Brockman-Schneider, R., Amineva, S., Burchell, L., Sedgwick, J. B., Busse, W. W., and Gern, J. E. (2002). Similar frequency of rhinovirus-infectible cells in upper and lower airway epithelium. The Journal of infectious diseases 185, 734–43.

Nix, W. A., Oberste, M. S., and Pallansch, M. a (2006). Sensitive, seminested PCR amplification of VP1 sequences for direct identification of all enterovirus serotypes from original clinical specimens. J Clin Micro 44, 2698–704.

Nokso-Koivisto, J., Hovi, T., and Pitkäranta, A. (2006). Viral upper respiratory tract infections in young children with emphasis on acute otitis media. Int J Paediatr Otorhinolaryngol 70, 1333–42.

Norder, H., Bjerregaard, L., and Magnius, L. O. (2002). Open reading frame sequence of an Asian enterovirus 73 strain reveals that the prototype from California is recombinant. J Gen Virol 83, 1721–8.

Nyquist, A., Gonzales, R., Steiner, J. F., and Sande, M. a (1998). Antibiotic prescribing for children with colds, upper respiratory tract infections, and bronchitis. JAMA 279, 875– 7.

Oberste, M. S., Maher, K., Flemister, M. R., Marchetti, G., Kilpatrick, D. R., and Pallansch, M. a (2000). Comparison of classic and molecular approaches for the identification of untypeable enteroviruses. J Clin Micro 38, 1170–4.

Oberste, M. S., Maher, K., Kilpatrick, D. R., Flemister, M. R., Brown, B. A., and Pallansch, M. A. (1999a). Typing of human enteroviruses by partial sequencing of VP1. J Clin Microbiol 37, 1288–93.

Oberste, M. S., Maher, K., Kilpatrick, D. R., and Pallansch, M. A. (1999b). Molecular evolution of the human enteroviruses: correlation of serotype with VP1 sequence and application to picornavirus classification. J Virol 73, 1941–8.

Oberste, M. S., Maher, K., Nix, W. a, Michele, S. M., Uddin, M., Schnurr, D., Al-Busaidy, S., Akoua-Koffi, C., and Pallansch, M. a (2007). Molecular identification of 13 new

256

References

enterovirus types, EV79-88, EV97, and EV100-101, members of the species Human Enterovirus B. Virus Res 128, 34–42.

Oberste, M. S., Maher, K., and Pallansch, M. A. (2004a). Evidence for frequent recombination within species human enterovirus B based on complete genomic sequences of all thirty-seven serotypes. J Virol 78, 855–67.

Oberste, M. S., Maher, K., and Pallansch, M. A. (2002). Molecular Phylogeny and Proposed Classification of the Simian Picornaviruses. J Virol 76, 1244–1251.

Oberste, M. S., Maher, K., and Pallansch, M. a (1998). Molecular phylogeny of all human enterovirus serotypes based on comparison of sequences at the 5’ end of the region encoding VP2. Virus Res 58, 35–43.

Oberste, M. S., Maher, K., Schnurr, D., Flemister, M. R., Lovchik, J. C., Peters, H., Sessions, W., Kirk, C., Chatterjee, N., Fuller, S., et al. (2004b). Enterovirus 68 is associated with respiratory illness and shares biological features with both the enteroviruses and the rhinoviruses. J Gen Virol 85, 2577–84.

Oberste, M. S., Michele, S. M., Maher, K., Schnurr, D., Cisterna, D., Junttila, N., Uddin, M., Chomel, J.-J., Lau, C.-S., Ridha, W., et al. (2004c). Molecular identification and characterization of two proposed new enterovirus serotypes, EV74 and EV75. J Gen Virol 85, 3205–12.

Oberste, M. S., Peñaranda, S., and Pallansch, M. A. (2004d). RNA recombination plays a major role in genomic change during circulation of coxsackie B viruses. J Virol 78, 2948–55.

Ohka, S., Yang, W., Terada, E., Iwasaki, K., and Nomoto, A. (1998). Retrograde transport of intact poliovirus through the axon via the fast transport system. Virology 1, 67–75.

Oliveira, M., Zhao, R., Lee, W., Kremer, M., Minor, I., Rueckert, R., Diana, G., Pevear, D., Dutko, F., and Mckinlay, M. (1993). The structure of human rhinovirus 16. Structure 1, 51–68.

Oprisan, G., Combiescu, M., Guillot, S., Caro, V., Combiescu, A., Delpeyroux, F., and Crainic, R. (2002). Natural genetic recombination between co-circulating heterotypic enteroviruses. J Gen Virol 83, 2193–200.

Padidam, M., Sawyer, S., and Fauquet, C. M. (1999). Possible emergence of new geminiviruses by frequent recombination. Virology 265, 218–25.

Pallansch, M., and Roos, R. (2007). “Enteroviruses: polioviruses, coxsackieviruses, echoviruses and newer enteroviruses.,” in Field Virology, 5th ed., eds. D. Knipe and P. Howley (Philadelphia: Lippincott Williams & Wilkins), 839–93.

257

References

Palmenberg, A., Rathe, J. A., and Liggett, S. B. (2010). Analysis of the complete genome sequences of human rhinovirus. J Allergy Clin Immunol 125, 1190–9.

Palmenberg, A., Spiro, D., Kuzmickas, R., Wang, S., Djikeng, A., Rathe, J. A., Fraser- Liggett, C. M., and Liggett, S. B. (2009). Sequencing and analyses of all known human rhinovirus genomes reveal structure and evolution. Science 324, 55–9.

Papadopoulos, N. G. (2002). Association of Rhinovirus Infection with Increased Disease Severity in Acute Bronchiolitis. Am J Respir Crit Care Med 165, 1285–1289.

Papadopoulos, N. G., Bates, P. J., Bardin, P. G., Papi, A., Leir, S. H., Fraenkel, D. J., Meyer, J., Lackie, P. M., Sanderson, G., Holgate, S. T., et al. (2000). Rhinoviruses infect the lower airways. J Infect Dis 181, 1875–84.

Papadopoulos, N. G., Sanderson, G., Hunter, J., and Johnston, S. L. (1999). Rhinoviruses replicate effectively at lower airway temperatures. J Med Virol 58, 100–4.

Parody, R., Rabella, N., Martino, R., Otegui, M., del Cuerpo, M., Coll, P., and Sierra, J. (2007). Upper and lower respiratory tract infections by human enterovirus and rhinovirus in adult patients with hematological malignancies. Am J Hematol 82, 807– 11.

Paul, A., Rieder, E., Kim, D. W., van Boom, J. H., and Wimmer, E. (2000). Identification of an RNA hairpin in poliovirus RNA that serves as the primary template in the in vitro uridylylation of VPg. J Virol 74, 10359–70.

Pelletier, J., and Sonenberg, N. (1988). Internal initiation of translation of eukaryotic mRNA directed by a sequence derived from poliovirus RNA. Nature 334, 320–5.

Pelon, W., Mogabgab, W. J., Phillip, I. A., and Pierce, W. E. (1957). A cytopathogenic agent isolated from naval recruits with mild respiratory illnesses. Proc Soc Exp Biol Med 94, 262–7.

Peltola, V., Waris, M., Osterback, R., Susi, P., Ruuskanen, O., and Hyypiä, T. (2008). Rhinovirus transmission within families with children: incidence of symptomatic and asymptomatic infections. J Infect Dis 197, 382–9.

Petersen, J. F., Cherney, M. M., Liebig, H. D., Skern, T., Kuechler, E., and James, M. N. (1999). The structure of the 2A proteinase from a common cold virus: a proteinase responsible for the shut-off of host-cell protein synthesis. EMBO J 18, 5463–75.

Pfeiffer, J. K., and Kirkegaard, K. (2005). Increased fidelity reduces poliovirus fitness and virulence under selective pressure in mice. PLoS Pathog 1, e11.

258

References

Piralla, A., Baldanti, F., and Gerna, G. (2011). Phylogenetic patterns of human respiratory picornavirus species, including the newly identified group C rhinoviruses, during a 1- year surveillance of a hospitalized patient population in Italy. J Clin Micro 49, 373–6.

Piralla, A., Rovida, F., Campanini, G., Rognoni, V., Marchi, A., Locatelli, F., and Gerna, G. (2009). Clinical severity and molecular typing of human rhinovirus C strains during a fall outbreak affecting hospitalized patients. J Clin Virol 45, 311–7.

Pitkäranta, a, Starck, M., Savolainen, S., Pöyry, T., Suomalainen, I., Hyypiä, T., Carpen, O., and Vaheri, A. (2001). Rhinovirus RNA in the maxillary sinus epithelium of adult patients with acute sinusitis. Clin Infect Dis 33, 909–11.

Pitkäranta, a, Virolainen, a, Jero, J., Arruda, E., and Hayden, F. G. (1998). Detection of rhinovirus, respiratory syncytial virus, and coronavirus infections in acute otitis media by reverse transcriptase polymerase chain reaction. Pediatrics 102, 291–5.

Pitkäranta, A., Arruda, E., Malmberg, H., Hayden, F. G., Arruda, E., Malmberg, H., and Pitka, A. (1997). Detection of rhinovirus in sinus brushings of patients with acute community-acquired sinusitis by reverse transcription-PCR . These include : Detection of Rhinovirus in Sinus Brushings of Patients with Acute Community-Acquired Sinusitis by Reverse Transcr. Microbiology 35, 7–10.

Posada, D. (2002). Evaluation of methods for detecting recombination from DNA sequences: empirical data. Mol Biol Evol 19, 708–17.

Posada, D., and Crandall, K. A. (2001). Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci USA 98, 13757–62. du Prel, J.-B., Puppe, W., Gröndahl, B., Knuf, M., Weigl, J. a I., Schaaff, F., and Schmitt, H.-J. (2009). Are meteorological parameters associated with acute respiratory tract infections? Clin Infect Dis 49, 861–8.

Price, W. H. (1956). The isolation of a new virus associated with respiratory clinical disease in humans. Proc Natl Acad Sci USA 42, 892–6.

Pöyry, T., Hyypiä, T., Horsnell, C., Kinnunen, L., Hovi, T., and Stanway, G. (1994). Molecular analysis of coxsackievirus A16 reveals a new genetic group of enteroviruses. Virology 202, 982–7.

Pöyry, T., Kinnunen, L., Hyypiä, T., Brown, B., Horsnell, C., Hovi, T., and Stanway, G. (1996). Genetic and phylogenetic clustering of enteroviruses. J Gen Virol 77 ( Pt 8), 1699–717.

259

References

VR Racaniello (2007). “Picornaviridae : the viruses and their replication,” in Fields Virology, eds. D. Knipe and P. Howley (Philadelphia: Lippincott Williams & Wilkins), 795 – 834.

Rathe, J. A., Liu, X., Tallon, L. J., Gern, J. E., and Liggett, S. B. (2010). Full-genome sequence and analysis of a novel human rhinovirus strain within a divergent HRV-A clade. Arch Virol 155, 83–7.

Ren, R., and Racaniello, V. (1992). Poliovirus spreads from muscle to the central nervous system by neural pathways. J Infect Dis 4, 747–52.

Renwick, N., Schweiger, B., Kapoor, V., Liu, Z., Villari, J., Bullmann, R., Miething, R., Briese, T., and Lipkin, W. I. (2007). A recently identified rhinovirus genotype is associated with severe respiratory-tract infection in children in Germany. J Infect Dis 196, 1754–60.

Robbins, F. C., Enders, J. F., Weller, T. H., and Florentino, G. L. (1951). Studies on the cultivation of poliomyelitis viruses in tissue culture. V. The direct isolation and serologic identification of virus strains in tissue culture from patients with nonparalytic and paralytic poliomyelitis. Am J Hyg 54, 286–93.

Rohll, J. B., Percy, N., Ley, R., Evans, D. J., Almond, J. W., and Barclay, W. S. (1994). The 5’-untranslated regions of picornavirus RNAs contain independent functional domains essential for RNA replication and translation. J Virol 68, 4384–91.

Roy, A., and Post, C. B. (2012). Long-distance correlations of rhinovirus capsid dynamics contribute to uncoating and antiviral activity. Proc Natl Acad Sci USA 109, 5271–6.

Rueckert, R. R., and Wimmer, E. (1984). Systematic nomenclature of picornavirus proteins. J Virol 50, 957–9.

Runckel, C., Westesson, O., Andino, R., and Derisi, J. L. (2013). Identification and manipulation of the molecular determinants influencing poliovirus recombination. PLoS pathogens 9, e1003164.

Saitou, N., and Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–25. van der Sanden, S., van Eek, J., Martin, D. P., van der Avoort, H., Vennema, H., and Koopmans, M. (2011). Detection of recombination breakpoints in the genomes of human enterovirus 71 strains isolated in the Netherlands in epidemic and non-epidemic years, 1963-2010. Infect Genet Evol 11, 886–94.

Sanfaçon, H., Gorbalenya, A. E., Knowles, N. J., and Chen, Y. (2011). Picornavirales. In Virus Taxonomy: Classification and Nomenclature of Viruses: Ninth Report of the

260

References

International Committee on Taxonomy of Viruses. , eds. A. M. Q. King, M. J. Adams, E. B. Carstens, and E. . Lefkowitz San Diego: Elsevier.

Santti, J., Hyypiä, T., Kinnunen, L., and Salminen, M. (1999). Evidence of recombination among enteroviruses. J Virol 73, 8741–9.

Savolainen, C., Laine, P., Mulders, M. N., and Hovi, T. (2004). Sequence analysis of human rhinoviruses in the RNA-dependent RNA polymerase coding region reveals large within-species variation. J Gen Virol 85, 2271–7.

Savolainen, C., Mulders, M. N., and Hovi, T. (2002). Phylogenetic analysis of rhinovirus isolates collected during successive epidemic seasons. Virus Res 85, 41–6.

Savolainen-Kopra, C., Blomqvist, S., Kilpi, T., Roivainen, M., and Hovi, T. (2009a). Novel species of human rhinoviruses in acute otitis media. Pediatr Infect Dis J 28, 59–61.

Savolainen-Kopra, C., Blomqvist, S., Smura, T., Roivainen, M., Hovi, T., Kiang, D., Kalra, I., Yagi, S., Louie, J. K., Boushey, H., et al. (2009b). 5’ Noncoding Region Alone Does Not Unequivocally Determine Genetic Type of Human Rhinovirus Strains. J Clin Micro 47, 1278–80.

Scheltinga, S. A., Templeton, K. E., Beersma, M. F. C., and Claas, E. C. J. (2005). Diagnosis of human metapneumovirus and rhinovirus in patients with respiratory tract infections by an internally controlled multiplex real-time RNA PCR. J Clin Virol 33, 306–11.

Schibler, M., Gerlach, D., Martinez, Y., Belle, S. V., Turin, L., Kaiser, L., and Tapparel, C. (2012). Experimental human rhinovirus and enterovirus interspecies recombination. J Gen Virol 93, 93–101.

Schmidt, H. a, Strimmer, K., Vingron, M., and von Haeseler, A. (2002). TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–4.

Schmidt, N. J., Lennette, E. H., and Ho, H. H. (1966). Observations on antigenic variants of echovirus type 11. Proc Soc Exp Biol Med 123, 696–700.

Shapiro, B., Rambaut, A., and Drummond, A. J. (2006). Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences. Mol Biol Evol 23, 7– 9.

She, R. C., Hymas, W. C., Taggart, E. W., Petti, C. a, and Hillyard, D. R. (2010). Performance of enterovirus genotyping targeting the VP1 and VP2 regions on non- typeable isolates and patient specimens. J Virol Methods 165, 46–50.

261

References

Siafakas, N., Georgopoulou, A., Markoulatos, P., Spyrou, N., and Stanway, G. (2001). Molecular detection and identification of an enterovirus during an outbreak of aseptic meningitis. J Clin Lab Anal 15, 87–95.

Silva, P. A., Diedrich, S., de Paula Cardoso, D. D. D., and Schreier, E. (2008). Identification of enterovirus serotypes by pyrosequencing using multiple sequencing primers. J Virol Methods 148, 260–4.

Simmonds, P. (2006). Recombination and selection in the evolution of picornaviruses and other Mammalian positive-stranded RNA viruses. J Virol 80, 11124–40.

Simmonds, P. (2012). SSE: a nucleotide and amino acid sequence analysis platform. BMC Res Notes 5, 50.

Simmonds, P., McIntyre, C., Savolainen-Kopra, C., Tapparel, C., Mackay, I. M., and Hovi, T. (2010). Proposals for the classification of human rhinovirus species C into genotypically assigned types. J Gen Virol 91, 2409–19.

Simmonds, P., and Midgley, S. (2005). Recombination in the genesis and evolution of hepatitis B virus genotypes. J Virol 79, 15467–76.

Simmonds, P., and Welch, J. (2006). Frequency and dynamics of recombination within different species of human enteroviruses. J Virol 80, 483–93.

Skern, T., and Sommergruber, W. (1985). Human rhinovirus 2: complete nucleotide sequence and proteolytic processing signals in the capsid protein region. Nucleic Acids Res 13, 2111–2126.

Skern, T., Torgersen, H., Auer, H., Kuechler, E., and Blaas, D. (1991). Human rhinovirus mutants resistant to low pH. Virology 183, 757–63.

Slater, A., Shann, F., and Pearson, G. (2003). PIM2: a revised version of the Paediatric Index of Mortality. Intensive care medicine 29, 278–85.

Smith, J. M. (1992). Analyzing the mosaic structure of genes. J Mol Evol 34, 126–9.

Smith, T. J., Kremer, M. J., Luo, M., Vriend, G., Arnold, E., Kamer, G., Rossmann, M. G., McKinlay, M. A., Diana, G. D., and Otto, M. J. (1986). The site of attachment in human rhinovirus 14 for antiviral agents that inhibit uncoating. Science 233, 1286–93.

Smura, T., Blomqvist, S., Paananen, A., Vuorinen, T., Sobotová, Z., Bubovica, V., Ivanova, O., Hovi, T., and Roivainen, M. (2007a). Enterovirus surveillance reveals proposed new serotypes and provides new insight into enterovirus 5’-untranslated region evolution. J Gen Virol 88, 2520–6.

262

References

Smura, T. P., Junttila, N., Blomqvist, S., Norder, H., Kaijalainen, S., Paananen, A., Magnius, L. O., Hovi, T., and Roivainen, M. (2007b). Enterovirus 94, a proposed new serotype in human enterovirus species D. J Gen Virol 88, 849–58.

Smuts, H. E., Workman, L. J., and Zar, H. J. (2011). Human rhinovirus infection in young African children with acute wheezing. BMC Infect Dis 11, 65.

Smyth, A., Smyth, R. L., Tong, C. Y., Hart, C. a, and Heaf, D. P. (1995). Effect of respiratory virus infections including rhinovirus on clinical status in cystic fibrosis. Arch Dis Child 73, 117–20.

Sommergruber, W., Casari, G., Fessl, F., Seipelt, J., and Skern, T. (1994). The 2A proteinase of human rhinovirus is a zinc containing enzyme. Virology 204, 815–8.

Spector, D. H. (1974). Requirement of 3’-Terminal Poly(adenylic Acid) for the Infectivity of Poliovirus RNA. Proc Natl Acad Sci USA 71, 2983–2987.

Stanway, G., Kalkkinen, N., Roivainen, M., Ghazi, F., Khan, M., Smyth, M., Meurman, O., and Hyypiä, T. (1994). Molecular and biological characteristics of echovirus 22, a representative of a new picornavirus group. J Virol 68, 8232–8.

Steinhauer, D. A., Domingo, E., and Holland, J. J. (1992). Lack of evidence for proofreading mechanisms associated with an RNA virus polymerase. Gene 122, 281–8.

Strimmer, K. (1994). Quartet Puzzling: A Quartet Maximum-Likelihood Method for Reconstructing Tree Topologies. Mol Biol Evol 13, 964–969.

Strimmer, K., and von Haeseler, A. (1997). Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc Natl Acad Sci USA 94, 6815–9.

Suomalainen, M., and Greber, U. F. (2013). Uncoating of non-enveloped viruses. Curr Opin Virol 3, 27–33.

Swofford, D. L., Waddell, P. J., Huelsenbeck, J. P., Foster, P. G., Lewis, P. O., and Rogers, J. S. (2001). Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Systematic biology 50, 525–39.

Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 24, 1596–9.

Tamura, K., Nei, M., and Kumar, S. (2004). Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc Natl Acad Sci USA 101, 11030–5.

263

References

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011). MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28, 2731–9.

Tapparel, C., Cordey, S., Van Belle, S., Turin, L., Lee, W.-M., Regamey, N., Meylan, P., Mühlemann, K., Gobbini, F., and Kaiser, L. (2009a). New molecular detection tools adapted to emerging rhinoviruses and enteroviruses. J Clin Micro 47, 1742–9.

Tapparel, C., Junier, T., Gerlach, D., Cordey, S., Van Belle, S., Perrin, L., Zdobnov, E. M., and Kaiser, L. (2007). New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features. BMC Genomics 8, 224.

Tapparel, C., Junier, T., Gerlach, D., Van-Belle, S., Turin, L., Cordey, S., Mühlemann, K., Regamey, N., Aubert, J.-D., Soccal, P. M., et al. (2009b). New respiratory enterovirus and recombinant rhinoviruses among circulating picornaviruses. Emerg Infect Dis 15, 719–26.

Tapparel, C., L’Huillier, A. G., Rougemont, A. L., Beghetti, M., Barazzone-Argiroffo, C., and Kaiser, L. (2009c). Pneumonia and pericarditis in a child with HRV-C infection: a case report. J Clin Virol 45, 157–60.

Taylor-Robinson, D., and Tyrrell, D. A. (1962). Serotypes of viruses (rhinoviruses) isolated from common colds. Lancet 1, 452–4.

Templeton, K. E. (2007). Why diagnose respiratory viral infection? J Clin Virol 40 Suppl 1, S2–4.

Templeton, K. E., Scheltinga, S. A., Beersma, M. F. C., Kroes, A. C. M., and Claas, E. C. J. (2004). Rapid and sensitive method using multiplex real-time PCR for diagnosis of infections by influenza a and influenza B viruses, respiratory syncytial virus, and parainfluenza viruses 1, 2, 3, and 4. J Clin Microbiol 42, 1564–9.

Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673– 80.

Toyoda, H., Nicklin, M. J., Murray, M. G., Anderson, C. W., Dunn, J. J., Studier, F. W., and Wimmer, E. (1986). A second virus-encoded proteinase involved in proteolytic processing of poliovirus polyprotein. Cell 45, 761–70.

Tuthill, T. J., Groppelli, E., Hogle, J. M., and Rowlands, D. J. (2010). Picornaviruses. Curr Top Microbiol Immunol 343, 43–89.

264

References

Uncapher, C. R., DeWitt, C. M., and Colonno, R. J. (1991). The major and minor group receptor families contain all but one human rhinovirus serotype. Virology 180, 814–7.

Valcour, V., Haman, A., Cornes, S., Lawall, C., Parsa, A. T., Glaser, C., Yagi, S., Tihan, T., Bhatnagar, J., and Geschwind, M. (2008). A case of enteroviral meningoencephalitis presenting as rapidly progressive dementia. Nat Clin Prac Neurol 4, 399–403.

Venkataraman, S., Reddy, S. P., Loo, J., Idamakanti, N., Hallenbeck, P. L., and Reddy, V. S. (2008). Structure of Seneca Valley Virus-001: an oncolytic picornavirus representing a new genus. Structure 16, 1555–61.

Verdaguer, N., Blaas, D., and Fita, I. (2000). Structure of human rhinovirus serotype 2 (HRV2). J Mol Biol 300, 1179–94.

Victoria, J. G., Kapoor, A., Li, L., Blinkova, O., Slikas, B., Wang, C., Naeem, A., Zaidi, S., and Delwart, E. (2009). Metagenomic analyses of viruses in stool samples from children with acute flaccid paralysis. J Virol 83, 4642–51.

Voss, T., Meyer, R., and Sommergruber, W. (1995). Spectroscopic characterization of rhinoviral protease 2A: Zn is essential for the structural integrity. Protein Sci 4, 2526– 31. de Vries, M., Pyrc, K., Berkhout, R., Vermeulen-Oost, W., Dijkman, R., Jebbink, M. F., Bruisten, S., Berkhout, B., and van der Hoek, L. (2008). Human parechovirus type 1, 3, 4, 5, and 6 detection in picornavirus cultures. J Clin Micro 46, 759–62.

Véras, N. M. C., Gray, R. R., Brígido, L. F. D. M., Rodrigues, R., and Salemi, M. (2011). High-resolution phylogenetics and phylogeography of human immunodeficiency virus type 1 subtype C epidemic in South America. J Gen Virol 92, 1698–709.

Wallis, C., and Melnick, J. L. (1967). Virus aggregation as the cause of the non-neutralizable persistent fraction. J Virol 1, 478–88.

Welch, J. B., McGowan, K., Searle, B., Gillon, J., Jarvis, L. M., and Simmonds, P. (2001). Detection of enterovirus viraemia in blood donors. Vox sanguinis 80, 211–5.

Whitton, J., Cornell, C., and Feuer, R. (2005). Host and virus determinants of picornavirus pathogenesis and tropism. Nat Rev Micro 3, 765–776.

Wiertsema, S. P., Chidlow, G. R., Kirkham, L. S., Corscadden, K. J., Mowe, E. N., Vijayasekaran, S., Coates, H. L., Harnett, G. B., and Richmond, P. C. (2011). High Detection Rates of Nucleic Acids of a Wide Range of Respiratory Viruses in the Nasopharynx and the Middle Ear of Children With a History of Recurrent Acute Otitis Media. J Med Virol 2017, 2008–2017.

265

References

Willcocks, M. M., Locker, N., Gomwalk, Z., Royall, E., Bakhshesh, M., Belsham, G. J., Idamakanti, N., Burroughs, K. D., Reddy, P. S., Hallenbeck, P. L., et al. (2011). Structural features of the Seneca Valley virus internal ribosome entry site (IRES) element: a picornavirus with a pestivirus-like IRES. J Virol 85, 4452–61.

Wisdom, A., Kutkowska, A. E., McWilliam Leitch, E. C., Gaunt, E., Templeton, K., Harvala, H., and Simmonds, P. (2009a). Genetics, recombination and clinical features of human rhinovirus species C (HRV-C) infections; interactions of HRV-C with other respiratory viruses. PLoS One 4, e8518.

Wisdom, A., McWilliam Leitch, E. C., Gaunt, E., Harvala, H., and Simmonds, P. (2009b). Screening respiratory samples for detection of human rhinoviruses (HRVs) and enteroviruses: comprehensive VP4-VP2 typing reveals high incidence and genetic diversity of HRV species C. J Clin Micro 47, 3958–67.

Witwer, C., Rauscher, S., Hofacker, I. L., and Stadler, P. F. (2001). Conserved RNA secondary structures in Picornaviridae genomes. Nucleic Acids Res 29, 5079–89.

Worobey, M., and Holmes, E. C. (1999). Evolutionary aspects of recombination in RNA viruses. J Gen Virol, 2535–2543.

Xatzipsalti, M., Kyrana, S., Tsolia, M., Psarras, S., Bossios, A., Laza-Stanca, V., Johnston, S. L., and Papadopoulos, N. G. (2005). Rhinovirus viremia in children with respiratory infections. Am J Respir Crit Care Med 172, 1037–40.

Xiang, Z., Gonzalez, R., Wang, Z., Xiao, Y., Chen, L., Li, T., Vernet, G., Paranhos-Baccalà, G., Jin, Q., and Wang, J. (2010a). Human rhinoviruses in Chinese adults with acute respiratory tract infection. J Infect 61, 289–98.

Xiang, Z., Gonzalez, R., Xie, Z., Xiao, Y., Liu, J., Chen, L., Liu, C., Zhang, J., Ren, L., Vernet, G., et al. (2010b). Human rhinovirus C infections mirror those of human rhinovirus A in children with community-acquired pneumonia. J Clin Virol 49, 94–9.

Yozwiak, N. L., Skewes-Cox, P., Gordon, A., Saborio, S., Kuan, G., Balmaseda, A., Ganem, D., Harris, E., and DeRisi, J. L. (2010). Human enterovirus 109: a novel interspecies recombinant enterovirus isolated from a case of acute pediatric respiratory illness in Nicaragua. J Virol 84, 9047–58.

Zhang, Y., Zhu, S., Yan, D., Liu, G., Bai, R., Wang, D., Chen, L., Zhu, H., An, H., Kew, O., et al. (2010). Natural type 3/type 2 intertypic vaccine-related poliovirus recombinants with the first crossover sites within the VP1 capsid coding region. PloS one 5, e15300.

266

JOURNAL OF VIROLOGY, Oct. 2010, p. 10297–10310 Vol. 84, No. 19 0022-538X/10/$12.00 doi:10.1128/JVI.00962-10 Copyright © 2010, American Society for Microbiology. All Rights Reserved.

Analysis of Genetic Diversity and Sites of Recombination in Human Rhinovirus Species Cᰔ Chloe L. McIntyre,1 E. Carol McWilliam Leitch,1 Carita Savolainen-Kopra,2 Tapani Hovi,2 and Peter Simmonds1* Centre for Infectious Diseases, University of Edinburgh, Summerhall, Edinburgh EH9 1QH, United Kingdom,1 and National Institute for Health and Welfare, Mannerheimintie 166, FIN-00300 Helsinki, Finland2

Received 3 May 2010/Accepted 16 July 2010

Human rhinoviruses (HRVs) are a highly prevalent and diverse group of respiratory viruses. Although HRV-A and HRV-B are traditionally detected by virus isolation, a series of unculturable HRV variants have recently been described and assigned as a new species (HRV-C) within the picornavirus Enterovirus genus. To investigate their genetic diversity and occurrence of recombination, we have performed comprehensive phylo- genetic analysis of sequences from the 5؅ untranslated region (5؅ UTR), VP4/VP2, VP1, and 3Dpol regions amplified from 89 HRV-C-positive respiratory samples and available published sequences. Branching orders of VP4/VP2, VP1, and 3Dpol trees were identical, consistent with the absence of intraspecies recombination in the coding regions. However, numerous tree topology changes were apparent in the 5؅ UTR, where >60% of analyzed HRV-C variants showed recombination with species A sequences. Two recombination hot spots in .stem-loop 5 and the polypyrimidine tract in the 5؅ UTR were mapped using the program GroupingScan Available HRV-C sequences showed evidence for additional interspecies recombination with HRV-A in the 2A gene, with breakpoints mapping precisely to the boundaries of the C-terminal domain of the encoded protein- ase. Pairwise distances between HRV-C variants in VP1 and VP4/VP2 regions fell into two separate distribu- tions, resembling inter- and intraserotype distances of species A and B. These observations suggest that, without serological cross-neutralization data, HRV-C genetic groups may be equivalently classified into types using divergence thresholds derived from distance distributions. The extensive sequence data from multiple genome regions of HRV-C and analyses of recombination in the current study will assist future formulation of consensus criteria for HRV-C type assignment and identification.

A series of studies published in 2006 to 2007 described a ation of translation from an internal methionine codon several novel group of human rhinoviruses (HRVs), referred to as A2 hundreds of bases from the 5Ј end of the genome. or group C (HRV-C), genetically distinct from existing rhino- Although no more closely related to each other than they are viruses (3, 21, 23, 25, 27, 33, 42). These and subsequent inves- to other Enterovirus species phylogenetically, the three species tigations have shown the proposed new species HRV-C to be of human rhinoviruses (A, B, and C) share a number of bio- remarkably prevalent, widely distributed geographically, and logical and genetic attributes. Most prominent is their trans- frequently associated with severe respiratory disease, both fol- mission route and primary tropism for the respiratory tract, lowing primary infections, particularly in young children (24, associated with the known acid lability of species A and B that 25, 29, 32, 59), and as an exacerbating factor in asthma and is traditionally believed to prevent their colonization of the other chronic obstructive airway diseases (20, 25, 29, 36). On gastrointestinal tract. The HRV genome possesses a lower the other hand, HRV-C has also been abundant in specimens GϩC content than genomes of other enteroviruses, and it is collected from both children and adults with milder cases of often speculated that this represents an adaptation for repli- respiratory disease (47, 48). cation at the lower temperatures found in upper airways. The HRV-C species is a member of the Enterovirus genus HRV-C differs from other species, however, in other aspects, within the family Picornaviridae (53) and shares many features including the siting of the cis-replicating element in VP2 (11) of its genome organization and structure with other picorna- instead of in 2A (species A) (13) or VP1 (species B) (34). viruses and, more specifically, with other members of the En- Species C has so far proven refractory to all attempts at in vitro terovirus genus. This includes an approximately 7,100-base ge- culture, despite the use of a wide variety of cell lines and nome containing a single reading frame; this encodes four primary cell cultures (25, 33). This restriction has hindered capsid proteins and a series of functionally conserved nonstruc- investigations of its replication, receptor use, and antigenic tural proteins involved in virus replication. HRV-C, in com- diversity. mon with other members of the Enterovirus genus, possesses a Like most picornavirus groups, human rhinoviruses show type I internal ribosomal entry site (IRES) that enables initi- substantial genetic heterogeneity that underlies the existence of a large number of antigenically distinct variants. In the case of species A and B, a total of 74 and 25 different serotypes, * Corresponding author. Mailing address: Centre for Infectious Dis- respectively, have been defined using cross-neutralization as- eases, University of Edinburgh, Summerhall, Edinburgh EH9 1QH, United Kingdom. Phone: 44 131 650 7927. Fax: 44 131 650 6511. says in cell culture (14, 18, 19). Available nucleotide sequences E-mail: [email protected]. from the capsid region of HRV-C reveal similar or even ᰔ Published ahead of print on 28 July 2010. greater diversity than is found between species A and B sero-

10297 10298 MCINTYRE ET AL. J. VIROL.

types, but its lack of culturability currently precludes an equiv- the region from position 615 to 1043 were discarded (sequence positions are alent classification of HRV-C into serotypes. numbered throughout the manuscript using the sequence 024 [accession number Although coding regions of HRV-C are genetically distinct EF582385]; the current GenBank entry for the QPM prototype sequence [EF186077] is incomplete at the 5Ј end). These comprised 538 sequences along from those of species A and B rhinoviruses and other entero- with a further 16 previously unpublished sequences from the SVC archive. The viruses, both the initial detection and subsequent genetic anal- sequence data set was supplemented with the 10 complete genome sequences of yses of HRV-C have been complicated by the similarity of most HRV-C currently available on public databases as well as all available complete ϭ ϭ (Ͼ60%) sequences of the 5Ј untranslated region (UTR) to genome sequences of species A (n 82) and B (n 25) of defined serotypes. Phylogenetic trees were constructed using the MEGA package, version 4.0 those of species A (49), with the remainder being phylogeneti- (54), by the neighbor-joining method (44) from 100 bootstrap resampled se- cally distinct. It has been proposed that one or more interspe- quence alignments of maximum composite likelihood distances (MCL) (55) with cies recombination events between species C and species A pairwise deletion for missing data. Distributions of pairwise sequence distances rhinoviruses have occurred to generate these chimeric se- for identification of type thresholds were calculated using the program Sequence- quences (15, 17, 49, 59). Despite these differences in genetic Dist in the Simmonics package. Recombination breakpoints in HRV-Ca 5Ј UTR sequences were determined composition, no differences in clinical presentations or epide- using the program GroupingScan (51), using complete genome sequences of miology have been detected between HRV-C variants with species A (n ϭ 82), B (n ϭ 25), and nonrecombinant species C (Cc) sequences species A-like and species C-like 5Ј UTR (HRV-Ca and HRV- (n ϭ 29) as control groups. Trees were constructed using a fragment size of 200 Cc, respectively) sequences (17, 59). with incremental steps of 10 bases, a bootstrap value of 70% to define phyloge- netically supported groups, and 10 sequence relabelings (default) to calculate In the current study we have generated comparative se- control values. A second analysis of sequences extending to the extreme 5Ј end quence data from several genome regions (5Ј UTR, VP4/VP2, of the genome was performed to investigate further recombination events VP1, and 3Dpol) from a large number of HRV-C-positive around the replication structures (eight HRV-Cc control sequences). Recombi- clinical specimens. Combined with published sequence infor- nation breakpoints were interpolated as the nucleotide positions where the mation from these regions, we have investigated the occur- grouping values switched from the A group to the C group. 5؅ UTR RNA structure prediction. Structure prediction for HRV-C was based rence, frequency, and location of recombination events within on the established model for HRV-A (6) and previous comparative analyses of this rhinovirus species. The information gained on comparative HRV-A and -B sequences (43, 56). Mfold analysis of a selection of HRV-Ca and phylogenies and sequence divergence in different regions pro- HRV-Cc sequences produced minimum-energy secondary structures for HRV-C vides data that will assist in the eventual classification of the variants that closely matched the HRV-A2 structure (data not shown). Nucleotide sequence accession numbers. All newly generated sequences were current plethora of HRV-C sequences into a number of ge- submitted to GenBank and were assigned accession numbers HM236897 to netically determined types. HM236968 (VP1), HM352737 to HM352752 (VP4), HM485468 to HM485556 (3Dpol), and HM581802 to HM581888 (5Ј UTR).

MATERIALS AND METHODS Genetic characterization of HRV-C variants. A total of 89 HRV-C-positive RESULTS samples from the Edinburgh Specialist Virology Centre (SVC) respiratory sam- ؅ ple archive (60) were selected for genetic characterization; these represent a Phylogenies of HRV-C 5 UTR, VP4/VP2, VP1, and 3Dpol subset of the 144 samples previously characterized in the VP4/partial VP2 region regions. To investigate sequence relationships of HRV-C in (59). RNA was extracted from clinical samples as previously described (60). different regions of the genome, 89 study samples were se- Sequences were amplified from four different genomic regions using the methods quenced in the 5Ј UTR (nucleotide position 185 to 614; n ϭ described below. (i) VP4/VP2. The VP4 and 5Ј end of VP2 were amplified using nested primers 87), VP4 and partial VP2 (615 to 1043), whole VP1 (2348 to (see Table S1 posted at http://www.virus-evolution.org/Downloads/JVI00962 3125; n ϭ 72), and partial 3Dpol regions (6384 to 6854) and -10/) as previously described (59). aligned with the 10 available complete genome sequences of (ii) VP1. Nested reverse transcription-PCRs (RT-PCRs) were performed as HRV-C (Fig. 1). On phylogenetic analysis, sequences in each previously described (35) but using primers shown in Table S1 (posted at the region fell into a number of clearly defined, bootstrap-sup- URL mentioned above). A total of 72 from the 89 samples amplifiable in other genome regions were successfully amplified in two VP1 regions. The two se- ported phylogenetic groups, with the majority containing sev- quence fragments overlapped by 105 bases, allowing a composite 777-base se- eral variants and others represented by single sequences. The quence to be generated for phylogenetic analysis. small subset of variants for which we were unable to obtain ؅ (iii) 3Dpol and 5 UTR. RNA extracted from respiratory samples was reverse VP1 sequences were scattered throughout the 5Ј UTR, VP4/ transcribed into cDNA using the Promega reverse transcription system (Pro- mega, United Kingdom) as per the manufacturer’s instructions except that 5 ␮l VP2, and 3Dpol trees. They therefore did not constitute an of extracted RNA template was used. Amplification of the 3Dpol region and the identifiable genetic subset of sequences whose omission would 5Ј UTR region from cDNA used nested primers (see Table S1 posted at the have biased the analysis of VP1 sequence diversity. URL mentioned above) and previously described PCR conditions (59) but with The phylogenies of the three coding regions (VP4/VP2, an annealing temperature of 48°C in the second round. Samples negative on VP1, and 3Dpol) were remarkably congruent, with identical initial amplification were amplified using the nested PCR strategy described for VP1 above. A 3Dpol amplicon of 470 nucleotides in length was produced. For grouping patterns for all but one of the sequences analyzed. the 5Ј UTR, two sequence fragments with a 61-base overlap allowed a 680-base Using a bootstrap value of 70% or greater to define clades, composite 5Ј UTR sequence overlapping the VP4/VP2 amplicon to be gener- there was only one phylogeny violation between trees con- ϭ ated. A subset of HRV-C variants (n 12) was further amplified using hemin- structed from coding region sequences (Fig. 1B, C, and D). ested, combined species A and C primers (5Ј UTR set 1 in Table S1, posted at http://www.virus-evolution.org/Downloads/JVI00962-10/) to generate an ampli- This involved the previously characterized complete genome con from nucleotide position 1 to 355 (27 to 335, excluding primer sequences). sequence, N4 (GQ223227) (Fig. 1, shaded box), which changed All amplicons were sequenced using a BigDye Terminator kit (Applied Bio- tree position between VP1 and 3Dpol regions. Our sample systems, Warrington, United Kingdom). Prior to sequencing reactions, all sam- collection contained a single variant (Resp_10300/08) that ples underwent a PCR product cleanup procedure using EXOSAP-IT (GE grouped with N4 in the 5Ј UTR, VP4/VP2, and VP1 regions. In Healthcare, United Kingdom). Sequence analysis. All available VP4/VP2 region sequences were downloaded contrast to N4, its 3Dpol sequence maintained a similar phy- from GenBank on 26 April 2010. Sequences that were Ͻ90% complete across logenetic position relative to other sequences, as was observed VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10299

FIG. 1. Phylogenetic analysis of study sequences and corresponding regions from complete genome sequences in the 5Ј UTR (position 185 to 614, numbered according to the reference sequence EF582385) (A), VP4/VP2 (616 to 1043) (B), VP1 (2348 to 3125) (C), and 3Dpol (6384 to 6854) (D). Trees were constructed by neighbor joining of pairwise maximum composite likelihood distance implemented in the program MEGA (54); branches showing at least 70% bootstrap support are indicated. Complete genome sequences were labeled using filled symbols. The putative recombinant sequence N4 (GQ223227) and the other group 7 sequence Resp_10300/08 are indicated by shaded boxes. in the 5Ј UTR, VP4/VP2, and VP1 regions (Fig. 1). In inves- tical to another sequence, N10 (GQ223228), originating from tigating this anomaly, we found on closer inspection of the N4 the same laboratory (sequence divergence, 0.01, compared to sequence in the 3Dpol region that its 5Ј end (position 6384 to 0.35 divergence from Resp_10300/08). The chimeric nature of 6645) was similar to that of Resp_10300/08 (pairwise distance, this sequence is evident from the change in tree position of 5Ј- 0.08), while its 3Ј end (position 6646 to 6854) was nearly iden- and 3Ј-end fragments of the 3Dpol region (see Fig. S1 and 10300 MCINTYRE ET AL. J. VIROL.

values (Fig. 2B) contained a first distribution (maximum values of 6.9% and 8%, respectively) that corresponded to pairwise distances between variants within the same phylogenetic clades (Fig. 1B and C). These were comparable to the majority of (intraserotype) distances between the small number of avail- able HRV-A sequences of the same serotype (Fig. 2A). A second, nonoverlapping distribution, with minimum values of 10.9% in VP4 and 14.7% in VP1, corresponded to the set of pairwise distances between HRV-C clades. There was a com- parable division of pairwise distances between VP4/VP2 and VP1 sequences of species A (Fig. 2A) and species B (data not shown) into two largely separate distributions. The previously proposed VP1 divergence thresholds of 12% and 13% for serotype assignment (22) correspond closely to the low point in the distribution of HRV-A and HRV-B VP1 pairwise dis- tances. With the currently available sequence data, a threshold around this divergence value in VP1 and 10% in VP4/VP2 would appear appropriate for HRV-C. Observations of the separate distributions of pairwise distances in VP1 and phylo- genetic evidence for marked phylogenetic clustering of HRV-C in all three coding regions analyzed support the idea that HRV-C variants might be usefully classified into a number of (genetically determined) groups that correspond to sero- types of other rhinoviruses. Such assignments might be made even though we lack serological evidence for antigenic distinc- FIG. 2. Distribution of pairwise p distances between HRV-A tiveness of HRV-C variants. sequences (A) and available sequences of study samples and complete As a preparatory step toward a future genotypic classifica- genome sequences of HRV-C (B) in the VP4 and VP1 regions. The tion of HRV-C, we have tabulated all available HRV-C vari- shaded area represents the zone between inter- and intraserotype Ͼ divergence values in species A. ants into groups containing members that show 0.13 se- quence divergence from each other in VP1 or, if not available, Ͼ0.10 in VP4/VP2 (Table 1). Table 1 has been split into two sections, one for which complete genome sequences are avail- supplementary data posted at http://www.virus-evolution.org able (genetic groups numbered 1 to 11) and a second contain- /Downloads/JVI00962-10/); N4 grouped with Resp_10300/08 ing genetic groups for which VP4/VP2 sequences are available, at the 5Ј half of the fragment (comparable to other genome with or without VP1, 3Dpol, and 5Ј UTR sequences. Combin- regions) (Fig. 1B and C) but clustered very closely with N10 ing the assignments, if sequences grouped in this way are con- (GQ223228) at the 3Ј end. Possible underlying explanations for sidered the equivalent of (sero)types in other rhinoviruses, the anomalous phylogeny of the N4 sequence in the 3Dpol then the current data set of sequences can be grouped into a region are discussed below. total of 60 genetically determined types. In contrast to the identity of coding region phylogenies, a Sequence divergence of HRV-C in different genome regions. minimum of 12 phylogeny violations were observed in the Using the genetic groups in the previous section, we compared species C sequence data set on comparison of phylogenetic the degree of sequence divergence in the analyzed genome trees from the 5Ј UTR and VP4/VP2. Members of the previ- regions with divergence in regions of different serotypes of ously described HRV-Cc 5Ј UTR group, distinct from the HRV-A and HRV-B species (Fig. 3). Both at the nucleotide species A-like 5Ј UTR sequences of the majority of HRV-C and amino acid level, VP1 was the most variable region be- sequences (labeled in Fig. 1), were drawn from several coding tween types and serotypes, followed by 3Dpol and VP4/VP2. region lineages. These observations are consistent with the However, HRV-C genetic groups were consistently more di- occurrence of multiple recombination events between the 5Ј verse than serotypes of other rhinovirus species, showing, for UTR and the rest of the genome. example, 35% (nucleotide) and 33% (amino acid) divergence Pairwise distance distributions. Distributions of pairwise compared to 25% and 21 to 26% in HRV-A and -B serotypes. distances between HRV-C sequences in VP4/VP2 and VP1 Even more marked differences in 3Dpol and VP4/VP2 regions regions were computed to investigate whether they contained were observed, with amino acid divergence between species C clear sequence divergence thresholds that might be used to groups almost twice that of species A and B. define intertype and intratype divergence values (Fig. 2B). Mapping positions of recombination between the 5؅ UTR These distributions were compared with those of species A in and VP4/VP2. Combining sequences from overlapping 5Ј UTR the same genome regions (Fig. 2A). They were also similar to and VP4/VP2 amplicons provided a continuous sequence from those of species B, but in the latter case there were insufficient position 185 to 1043 that enabled the recombination break- numbers of classified sequences in either genome region for a point(s) to be determined in each of the Ca recombinant vari- conclusive analysis. ants identified to date. The combined set of 10 available com- The set of HRV-C VP4/VP2 and VP1 pairwise distance plete genomes, 87 samples from the current study, and 34 5Ј VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10301

TABLE 1. Division of HRV-C into genetic groups based on divergence in VP1 or VP4/VP2

5Ј UTR Submission date HRV-C genetic groupa Accession no.b Identifier Referencee groupc (mo/day/yr)d

Groups containing complete genomes 1 EF077279 NAT001 Ca 10/20/06 21 2 EF077280 NAT045 Ca 10/20/06 21 3 EF186077 QPM Ca 12/14/06 33 4 EF582385 024 Ca 4/27/07 25 5 EF582386 025 Ca 4/27/07 25 6 EF582387 026 Ca 4/27/07 25 7 DQ875932 NY-074 Ca 7/14/08 23 8 GQ223227 N4 Ca 5/29/09 17 9 GQ223228 N10 Cc 5/29/09 17 10 GQ323774 QCE Ca 6/29/09 2 11 FJ392317f CL170085 Ca 10/17/08 58

Groups containing partial genomes 12 EF077264 NAT083 Ca 10/20/06 21 13 EU081795 tu403 Ca 8/03/07 42 14 EU081796 06-445 Cc 8/03/07 42 15 EU081800 06-20 Ca 8/03/07 42 16 EU081808 g2-4 Ca 8/03/07 42 17 EU081809 06-582 Ca 8/03/07 42 18 EU590074 PNC41788 Ca 3/25/08 48 19 EU697850 7316563 Cc 5/05/08 7 20 EU697851 DK-1 Cc 5/05/08 7 21 EU752377 RV471 Ca 5/26/08 36 22 EU752381 RV541 Ca 5/26/08 36 23 EU752424 RV64 Ca 5/26/08 36 24 EU752426 RV408 Ca 5/26/08 36 25 EU752427 RV1123 Ca 5/26/08 36 26 EU752441 RV177 Ca/Ccg 5/26/08 36 27 GQ223122 N22 Ca/Cc 1/09/09 17 28 GQ223134 N46 Ca 1/09/09 17 29 FJ615699 201882 Cc 1/09/09 37 30 GQ476669 Resp_3898 Ca 8/13/09 60 31 GU294380 Resp_3776 Ca 12/04/09 59 32 GU294466 Resp_5613 Ca 12/04/09 59 33 GU294480 Resp_6157 Cc 12/04/09 59 34 EF077256 NAT069 Cc 10/20/06 21 35 EF077260 NAT069 Cc 10/20/06 21 36 EU081790 06-739 Cc 8/03/07 42 37 EU081791 tu304 Ca 8/03/07 42 38 EU081799 g2-11 8/03/07 42 39 EU081802 g2-25 Ca 8/03/07 42 40 EU081803 g2-23 8/03/07 42 41 EU081805 g2-28 Ca 8/03/07 42 42 EU081807 06-230 8/03/07 42 43 EU590054 PNC86718 Cc 3/25/08 48 44 EU590061 PNC40168 3/25/08 48 45 EU590064 PNC40449 3/25/08 48 46 EU697839 IN-36 5/05/08 7 47 EU697852 SA365412 5/05/08 7 48 EU743925 CO-1368 Cc 5/22/08 11a 49 EU752358 RV1250 5/26/08 36 50 EU752398 RV1039 5/26/08 36 51 EU752412 RV546 Cc 5/26/08 36 52 FJ598096 GDFY100 Cc 12/29/08 Unpublished 53 FJ615722 202511 1/09/09 37 54 FJ615737 202092 1/09/09 37 55 FJ615745 202642 1/09/09 37 56 FJ841957 S05986 3/17/09 9 57 FJ869923 KR1868 3/27/09 15 58 FJ869950 KR2315 Cc 3/27/09 15 59 GQ466482 K1091 301104 8/07/09 47 60 GU214340 PV68 Ca 11/18/09 41

a Genetic groups showing Ͼ0.10 sequence divergence in VP4/VP2. Groups for which VP4/VP2 sequences are available, with or without VP1, 3DPol, and 5Ј UTR sequences, were considered partial. b Accession number of the first member of the assigned genetic group. c Identity of the 5Ј UTR group (Ca or Cc) associated with viruses of this group (59). d Submission date of the first sequence of the assigned genetic group in GenBank/EMBL/DDBJ. e Citation for the first submitted member of the genetic group. f Accession number FJ392317 refers to a previously published VP1 sequence (58). The full genome of this variant has now been completed (C. Tapparel, personal communication). In order to preserve the numbering of genetic groups, sequences grouping with this complete genome in VP4/VP2 have been assigned as genetic group 11. g Variants with Ca and Cc 5Ј UTR sequences are found in these genetic groups (59). 10302 MCINTYRE ET AL. J. VIROL.

FIG. 3. Mean pairwise (uncorrected) p distances of nucleotide (A) and amino acid (B) sequences in the VP4/VP2, VP1, and 3Dpol regions between serotypes of HRV-A and HRV-B and between iden- tified genetic groups of HRV-C (values shown above each bar).

UTR sequences described by Huang et al. (17) represented 45 of the 60 sequence groups listed in Table 1. Twenty-nine HRV-C sequence groups had 5Ј UTR sequences that grouped within the species A 5Ј UTR clade (labeled Ca in Fig. 1), two contained sequences from both the Ca and Cc groups, and the remainder were phylogenetically distinct, falling in the Cc clade that contained the N10 complete genome sequence (GQ223228). The program Grouping Scan (51) was used to identify re- combination breakpoints in the Ca 5Ј UTR sequences. This program was chosen in preference to bootscanning methods because it scores the extent of grouping within predefined control groups rather than simply the bootstrap support for the grouping of a query sequence with a group consensus se- quence. As discussed previously (51), the latter method can lead to false assignment in cases where a query sequence is not closely affiliated to any of the control groups. Furthermore, by simply condensing the often large amount of comparative se- quence data within control groups to a single consensus se- quence, bootscanning additionally discards informative data on sequence diversity within groups that are of value in assessing phylogeny relationships. The available set of HRV-A and HRV-B complete genome sequences were used as control FIG. 4. (A) Mapping of recombination breakpoints of three repre- groups while the HRV-Cc control group (nonrecombinant Cc sentative published or study HRV-Ca sequences using the program sequences) was assembled from the N10 sequence and Cc Grouping Scan with HRV-A, -B, and HRV-Cc sequences as control ϭ groups, as indicated on the figure. Above the graphs is a diagrammatic variants identified in the current study (n 29). summary of 5Ј UTR RNA structure elements (stem-loops 1 to 6 [1] and Each of the complete genome and study sequences from the the PPT). (B) Variability scan of the 5Ј UTR, showing mean pairwise Ca 5Ј UTR clade was scanned between position 185 and 903 interserotype/genetic group distances of HRV-A, -B, and -Cc sequences (three representative results depicting different breakpoints averaged over a 30-base window and, separately, the interspecies diver- gence between HRV-A and HRV-Cc (A-Cc). The position of 5Ј UTR are shown in Fig. 4A). Superimposed is a diagrammatic sum- structures and range of identified recombination breakpoints (Table 2) mary of the RNA secondary structure elements comprising the are indicated. (C) Extension of Grouping Scan analysis to the 5Ј end of replication structures (stem-loop 1 cloverleaf) and IRES the HRV genome using the subset of control sequences complete from (stem-loops 2 to 6, the latter containing the AUG start codon position 27 onwards and an HRV-Ca query sequence. of the HRV-C open reading frame at position 615). Recombination breakpoints identified in HRV-Ca variants (PPT) between stem-loops 5 and 6 (position 561 to 576; mean, were invariably positioned similarly among variants in the 565) and a second around the terminal loop of stem-loop 5 same genetic group (Table 2). Two recombination hot spots (position 508 to 544; mean, 523), while two HRV-C genetic were identified, one occurring in the polypyrimidine tract groups showed a more 5Ј site (position 479 to 489; mean, 481). VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10303

FIG. 4—Continued.

Sequences of the stem-loop stem regions either side of the interspecies recombination, we performed a scan of sequence recombination hot spot centering on position 523 were highly divergence between complete genome sequences and those of conserved (Fig. 4B). The HRV-Ca recombinants with a 5Ј stem species A and B rhinoviruses (Fig. 5). Between-species diver- sequence derived from species A and a 3Ј sequence from gence was substantially greater than the mean pairwise dis- species C showed predicted pairings similar to those of tance within species C sequences except for an approximately HRV-A and nonrecombinant HRV-Cc sequences (data not 260-base region in 2A. This region (position 3268 to 3525) shown) and would therefore accommodate a recombination showed a markedly different phylogeny from flanking regions event without destabilizing stem-loop 5. of the same length (Fig. 6). More specifically, species C se- Finally, the occurrence of further recombination events at quences grouped within the HRV-A clade in a manner similar the extreme 5Ј end of the 5Ј UTR was investigated using a to that observed in the 5Ј UTR. As found in the latter region, subset of 11 HRV-Ca sequences with sequences complete from single HRV-C sequences or groups of HRV-C sequences be- position 27. All query sequences grouped within the HRV-A came scattered among the HRV-A sequences, implying the clade, with association values substantially above the 0.5 out- occurrence of separate recombination events. Remarkably, the group position score (an example is shown in Fig. 4C). Al- bootstrap-supported grouping of five HRV-C sequences (ac- though the resolution of this method is limited by the 200-base cession numbers EF077279, DQ875932, EF582387, EF186077, fragment size used for analysis, visual inspection of trees con- and GQ323774) in 2A was similarly monophyletic in the 5Ј structed from shorter sequence fragments (50 and 100 bases) UTR (Fig. 1A). The main difference from the 5Ј UTR phylog- at the extreme 5Ј end of the genome provides no evidence for eny was the position of N10 (GQ223228). This formed part of a change in the phylogenetic position of any Ca variants to a the Cc 5Ј UTR clade (Fig. 1A) but contained a 2A sequence position outside the HRV-A clade (data not shown). embedded within the HRV-A clade (Fig. 6). A change in the Interspecies recombination in the 2A region. To investigate phylogenetic position of N4 (GQ223227) was additionally ap- whether other genome regions of HRV-C showed evidence for parent in trees from the two regions. 10304 MCINTYRE ET AL. J. VIROL.

TABLE 2. Position of recombination breakpoints in different the coding region. For example, mean intergenetic group pair- HRV-Ca groups wise distances in the 3Dpol region of HRV-C (33% nucleotide; Total no. of 29% amino acid) were similar to the divergence of VP1 (35% Breakpoint (mean nucleotide Genetic group sequences position ͓range͔) and 33%, respectively). This contrasts markedly with human analyzed enteroviruses and several other picornavirus groups where 1 5 520 (519–523) nonstructural gene regions are much less divergent between 2 1 564 serotypes (Ͻ10% at the amino acid level). As previously pro- 3 2 526 (526–527) posed (50), this restricted variability increases the likelihood of 4 1 561 5 3 563 (563–564) recombinants with breakpoints in the nonstructural regions 6 10 524 (524–526) being viable biologically. In contrast, the highly divergent se- 7 2 574 (574) quences in equivalent regions of HRV may be functionally 8 4 565 (561–570) incompatible with each other and effectively isolate each rhi- 10 1 525 novirus serotype into a separate evolutionary path. 11 2 554 (552–557) 12 4 508 (508) One HRV-C sequence showed different phylogenetic rela- 13 1 523 tionships in different regions of the coding region (N4; 15 3 483 (479–489) GQ223227). Although we cannot exclude the possibility of a 16 3 520 (518–526) natural recombination site within the 3D region as an expla- 17 1 567 18 8 528 (518–544) nation for the hybrid nature of its 3Dpol sequence (see Fig. S1 21 6 572 (570–575) and the supplementary data posted at http://www.virus 22 2 576 (576) -evolution.org/Downloads/JVI00962-10/), its close resem- 23 3 562 (552–570) blance of the 3Ј end to the N10 sequence originating from the 24 1 560 same laboratory suggests the possibility of laboratory contam- 25 1 558 27 3 535 (522–559) ination during assembly of the complete genome sequence. 28 2 543 (536–550) Another group 7 variant (Resp_10300/08) showed consistent 30 1 561 phylogenetic relationships to other sequences in all three cod- 31 1 528 ing regions analyzed, including 3Dpol (Fig. 1B, C, and D). It 32 9 524 (523–527) 37 2 555 (555) would be of value if the laboratory from which the N4 se- 39 1 560 quence originated (17) were able to perform additional ampli- 41 1 479 fication and sequencing to rule out contamination/assembly 60 1 516 errors and correct the GenBank entry if necessary. Evidence for a different, likely more evolutionarily ancient, pattern of recombination in HRV-A has been previously ob- tained, manifested by differences in sequence relatedness be- DISCUSSION tween sequences in different genome regions (39, 57). For Recombination in HRV-C. This study has investigated the example, HRV-53 was substantially more similar to HRV-46 in occurrence and sites of recombination within the genomes of the nonstructural region than anticipated by their sequence the recently described species C human rhinoviruses. The most relationship in the capsid-encoding region; conversely HRV-78 striking finding was the almost identical phylogenetic trees of and HRV-12 were more divergent. In these and other specific the three coding regions analyzed, including gene regions at examples, changes in the phylogeny relationship usually oc- the extreme 5Ј and 3Ј ends of the open reading frame (VP4/ curred at the P1/P2 boundary, implying a limited degree of VP2 and 3Dpol). The absence of recombination implied by compatibility between structural and nonstructural gene mod- these observations is consistent with previous comparisons of ules derived from different serotypes. As described above, this VP4/VP2 and 3Dpol region phylogenies of HRV species A and pattern of recombination is quite distinct from the multiple B (46), where only minimal changes in branching order were recombination sites within nonstructural gene regions of en- observed among the full set of 101 classified serotypes. These teroviruses and other picornaviruses. No evidence for its oc- findings together with the data obtained here for species C currence was documented among the data set of HRV-C se- contrast dramatically with the rampant and ongoing recombi- quences assembled in the current study, where both small-scale nation process between structural and nonstructural gene re- (genetic group) and larger sequence groupings were preserved gions in other species within the Enterovirus genus, particularly across the genome. Potentially, the much greater amino acid in species A, B, and C human enteroviruses (10, 28, 30, 38, 52). sequence divergence between HRV-C sequence groups (Fig. Recombination is also extensively documented in aphthovi- 3) increases the likelihood of biological incompatibility and ruses (16, 50), cardioviruses (5, 12), and parechoviruses (4, 8). restricts further the occurrence of recombination. For these, the concept of separate, modular evolution of struc- Despite the limited evidence for recombination in the cod- tural and nonstructural regions of picornavirus genomes has ing regions of rhinoviruses, this study confirmed its frequent been developed (28, 31, 50). occurrence between VP4 (and the rest of the genome down- Potentially underlying this difference in recombination fre- stream) and the 5Ј UTR (17, 49). The generation of a large quency in rhinoviruses is their different pattern of sequence number of 5Ј UTR Ca and Cc sequences permitted a detailed divergence in structural and nonstructural regions from most investigation of the positions where recombination occurred. other picornaviruses. Species C variants along with HRV-A The distribution of recombination hot spots centered around and HRV-B show substantial sequence divergence throughout position 565 (in the PPT), 523 (within stem-loop 5 of the VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10305

FIG. 5. Divergence scan, showing mean nucleotide and amino acid sequence divergence between complete genome sequences of HRV-C with those of HRV-A and -B and within-species diversity of HRV-C. The scan used fragment lengths of 300 nucleotides, incrementing by 30 bases across the HRV coding sequence. The genome diagram below the graph shows positions of different HRV-C genes, using annotation supplied with the sequence EF582385.

IRES), and two groups showing evidence of a breakpoint HRV-C-derived IRES and adjacent coding sequences into spe- around position 481. This is a much more restricted distribu- cies A replicons or infectious clones will be of value in func- tion than previously described in an analysis using a smaller tionally mapping these compatibility restrictions. data set and a different scanning method (17). Nor could we A new and unexpected finding in the current study was the confirm the second recombination event at the extreme 5Ј end second site of HRV-A/HRV-C interspecies recombination in of the genome where Ca variants remained grouped with the 2A region (Fig. 6 and 7). Although a previous bootscanning HRV-A sequences. analysis showed the N4 sequence to group with the HRV-A As suggested from the analysis of recombination in the cod- consensus sequence in this region (17), this observation was ing regions, the occurrence and positions of recombination misinterpreted. As demonstrated by phylogenetic analysis of events in the 5Ј UTR may be governed by biological compat- this and flanking regions, all HRV-C variants genetically char- ibility restrictions. For example, the conservation of sequences acterized in this region to date show evidence for recombina- within stem-loop 5 and potentially independent modular func- tion (Fig. 6) in a pattern remarkably similar to that observed in tions of stem-loops 5 and 6 may favor the creation of viable the 5Ј UTR (with the noted exceptions of N10 and N4). The recombinants. Similarly, the high degree of sequence conser- 2A protein of rhinoviruses and enteroviruses is a chymotryp- vation of the 5Ј end of the genome between rhinovirus species sin-like proteinase of 142 amino acids. The region between may facilitate the interaction between the species C replication positions 3268 to 3525 (residues 47 to 133) identified as re- complex and the HRV-A-derived cloverleaf replication struc- combinant corresponds precisely to the boundaries of the pre- ture (stem-loop 1) (56). In the future, in vitro insertion of the viously C-terminal structural domain of 2A, being comprised FIG. 6. Comparison of the phylogeny of HRV-A, -B, and -C sequences in the putative recombinant region between position 3268 and 3525 with those of 5Ј and 3Ј flanking regions (positions 3009 to 3267 and 3526 to 3784). See the legend of Fig. 1 for tree construction method, labels, and symbols. For clarity, the accession numbers of species A and B variants have been prefixed by the species letter and serotype designation (e.g., A8 FJ445113 and B14 NC001490).

10306 VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10307

FIG. 7. Annotation of the secondary structure elements in the 2A gene of HRV-2 (40) superimposed on an alignment of the most genetically divergent HRV-A (n ϭ 9) and HRV-B (n ϭ 8) sequences and all available HRV-C 2A sequences and the inferred positions of the recombination breakpoints (dotted lines; nucleotide position 3268 to 3525). ␤-Strands and catalytic and zinc binding residues are as previously proposed (40). The positions of the N-terminal and C-terminal structural domains are indicated in shaded boxes. Numbering of amino acid residues follows that of the HRV-2 sequence X02316. For clarity, the accession numbers of species A and B variants have been prefixed by the species letter and serotype designation (as described in the legend of Fig. 6). 10308 MCINTYRE ET AL. J. VIROL. of six ␤-sheets that coordinate a zinc ion adjacent to the cat- groups using the 0.13 and 0.10 upper divergence thresholds for alytic site of the protease (Fig. 7) (40). Our observations of the VP1 and VP4/VP2 regions and into the same group if recombination in the HRV-C 2A gene suggest at least some below. Also potentially facilitating a possible future genetically degree of modularity of its two structural domains. Investiga- based classification of HRV-C is the lack of observed recom- tion of the enzymatic properties of the HRV-C chimeric pro- bination between the VP4/VP2 and VP1 regions (Fig. 1B and tein and the influence of the more variable N terminus of 2A C). This suggests that, once classified, HRV-C types may be encoded by HRV-C sequences on its function and specificity readily identified using either VP4/VP2 sequences or those may provide insights into potential biological differences be- from VP1. This would be particularly helpful, given the pre- tween HRV species. Finally, the proposed recombination junc- ponderance of VP4/VP2 sequence data and the ease of ampli- tion at position 3268 coincides almost exactly with the 3Ј end of fication of this region for typing purposes. Remarkably, within the species A CRE (cis-acting RNA element) region (mapped 5 years of discovery of HRV-C, currently available sequences to position 3226 to 3270 [13]). The absence of a homologous would correspond to a total of 60 types if classified in this way structure in the NЈ terminal domain of species C variants is (Table 1). consistent with current predictions for its localization in VP2 Future classification proposals and the development of ro- (11). bust, well-defined type assignment criteria for HRV-C require The evolutionary events underlying the inferred 5Ј UTR and discussion and consensus from an expert group, likely affiliated 2A recombination events remain unclear. The existence of to the Picornavirus Study Group of the International Commit- multiple recombination breakpoints in the 5Ј UTR and the tee for the Taxonomy of Viruses. The data generated in the scattering of HRV-Ca genetic groups in the VP4/VP2 tree (17, current study will be of value in future formulation of diver- 59) imply that recombination in the 5Ј UTR has occurred gence thresholds in different genome regions for type assign- several times independently. Similarly, the existence of multi- ment if this is to be adopted for classification purposes, and a ple groupings of HRV-C sequences in the 2A clade of HRV-A consensus paper with formal proposals for type assignment suggests that more than one recombination event of that spe- based on the genetic groups defined in Table 1 has now been cific fragment occurred. What is remarkable is the evidence published (52a). Information on recombination frequency will that recombination in these two regions is at least partly linked, similarly assist the interpretation of sequence data from other with the same grouping of five of the HRV-C variants observed genomic regions and substantiate the proposed classification of in both genomic regions. We are currently sequencing the 89 Ca and Cc 5Ј UTR variants (17) that we have extended in the study samples in the 2A region to obtain additional informa- current study. Other issues specific to HRV-C and which re- tion on the occurrence and linkage of recombination at these quire consideration in formulating classification proposals in- two sites in order to obtain a better understanding of the clude the lack of virus isolates or type strains equivalent to process and constraints under which these interspecies recom- those of other rhinoviruses that currently precludes their se- bination events occur. rological and genetic/biological characterization. Genetic diversity of HRV-C. HRV-C variants showed sub- stantially greater diversity than HRV-A or HRV-B serotypes ACKNOWLEDGMENTS (Fig. 3) but showed similar clustering of genetic groups into a We are grateful to Elly Gaunt, Heli Harvala, Kate Templeton, Peter number of well-defined (bootstrap-supported) clades (Fig. 1C) McCullough, Julie White, Mary Notman, Eleanor Leslie, and Carol to other rhinovirus species. There were additionally two well- Thomson for providing samples, data, and other virus testing results separated distributions of pairwise distances in VP4/VP2 and from the respiratory sample archive. We thank David J. Evans for VP1 regions (Fig. 2B), suggesting that genetically defined types valuable discussion of the manuscript. of HRV-C might be readily defined and demarcated for clas- Chloe McIntyre’s Ph.D. studentship was funded by the Medical Research Council. sification purposes. This genetic approach may be necessitated by the absence of HRV-C isolates with which to investigate REFERENCES serological interrelationships, as has been performed previ- 1. Anderson, E. C., S. L. Hunt, and R. J. Jackson. 2007. Internal initiation of ously for HRV-A and -B species. translation from the human rhinovirus-2 internal ribosome entry site re- For other HRV species, there exists a small number of pairs quires the binding of Unr to two distinct sites on the 5Ј untranslated region. J. Gen. Virol. 88:3043–3052. of variants with genetic distances at an intermediate position in 2. Arden, K. E., C. E. Faux, N. T. O’Neill, P. McErlean, A. Nitsche, S. B. the distributions (the gray zone marked in Fig. 2 for HRV-A). Lambert, M. D. Nissen, T. P. Sloots, and I. M. Mackay. 2010. Molecular For both HRV-A and -B, while there is a good correlation characterization and distinguishing features of a novel human rhinovirus (HRV) C, HRVC-QCE, detected in children with fever, cough and wheeze between VP1 sequence divergence and serological relation- during 2003. J. Clin. Virol. 47:219–223. ships (22, 26, 45), the latter are of limited value in defining a 3. Arden, K. E., P. McErlean, M. D. Nissen, T. P. Sloots, and I. M. Mackay. 2006. Frequent detection of human rhinoviruses, paramyxoviruses, corona- precise nucleotide sequence divergence threshold to separate viruses, and bocavirus during acute respiratory tract infections. J. Med. Virol. inter- from intraserotype divergence values (22, 26). These 78:1232–1240. discrepancies likely reflect the somewhat variable relationship 4. Benschop, K. S., C. H. Williams, K. C. Wolthers, G. Stanway, and P. Sim- monds. 2008. Widespread recombination within human parechoviruses: between sequence divergence, epitope exposure, and antige- analysis of temporal dynamics and constraints. J. Gen. Virol. 89:1030–1035. nicity and the possibility that much of the sequence divergence 5. Blinkova, O., A. Kapoor, J. Victoria, M. Jones, N. Wolfe, A. Naeem, S. between serotypes is immunologically driven. Shaukat, S. Sharif, M. M. Alam, M. Angez, S. Zaidi, and E. L. Delwart. 2009. Cardioviruses are genetically diverse and cause common enteric infections in For species C, however, the current data set of VP1 (and South Asian children. J. Virol. 83:4631–4641. VP4/VP2) sequences shows a marked absence of variants 6. Borman, A., and R. J. Jackson. 1992. Initiation of translation of human rhinovirus RNA: mapping the internal ribosome entry site. Virology 188: showing intermediate distances in this sequence divergence 685–696. range, and all can be defensibly assigned into different genetic 7. Briese, T., N. Renwick, M. Venter, R. G. Jarman, D. Ghosh, S. Kondgen, VOL. 84, 2010 HRV-C DIVERSITY AND RECOMBINATION 10309

S. K. Shrestha, A. M. Hoegh, I. Casas, E. V. Adjogoua, C. Akoua-Koffi, K. S. human rhinovirus C in children and adults in Hong Kong reveals a possible Myint, D. T. Williams, G. Chidlow, B. R. van den, C. Calvo, O. Koch, G. distinct human rhinovirus C subgroup. J. Infect. Dis. 200:1096–1103. Palacios, V. Kapoor, J. Villari, S. R. Dominguez, K. V. Holmes, G. Harnett, 25. Lau, S. K., C. C. Yip, H. W. Tsoi, R. A. Lee, L. Y. So, Y. L. Lau, K. H. Chan, D. Smith, J. S. Mackenzie, H. Ellerbrok, B. Schweiger, K. Schonning, M. S. P. C. Woo, and K. Y. Yuen. 2007. Clinical features and complete genome Chadha, F. H. Leendertz, A. C. Mishra, R. V. Gibbons, E. C. Holmes, and characterization of a distinct human rhinovirus (HRV) genetic cluster, prob- W. I. Lipkin. 2008. Global distribution of novel rhinovirus genotype. Emerg. ably representing a previously undetected HRV species, HRV-C, associated Infect. Dis. 14:944–947. with acute respiratory illness in children. J. Clin. Microbiol. 45:3655–3664. 8. Calvert, J., T. Chieochansin, K. Benschop, E. C. McWilliam-Leitch, J. F. 26. Ledford, R. M., N. R. Patel, T. M. Demenczuk, A. Watanyar, T. Herbertz, Drexler, K. Grywna, H. da Costa Ribeiro, C. Drosten, H. Harvala, Y. Poo- M. S. Collett, and D. C. Pevear. 2004. VP1 sequencing of all human rhino- vorawan, K. Wolthers, and P. Simmonds. 2010. The recombination dynamics virus serotypes: insights into genus phylogeny and susceptibility to antiviral of human parechoviruses; investigation of type-specific differences in fre- capsid-binding compounds. J. Virol. 78:3663–3674. quency and epidemiological correlates. J. Gen. Virol. 91:1229–1238. 27. Lee, W. M., C. Kiesner, T. Pappas, I. Lee, K. Grindle, T. Jartti, B. Jakiela, 9. Calvo, C., M. Luz Garcia, F. Pozo, N. Reyes, P. Perez-Brena, and I. Casas. R. F. Lemanske, Jr., P. A. Shult, and J. E. Gern. 2007. A diverse group of 2009. Role of rhinovirus C in apparently life-threatening events in infants, previously unrecognized human rhinoviruses are common causes of respira- Spain. Emerg. Infect. Dis. 15:1506–1508. tory illnesses in infants. PLoS One 2:e966. 10. Chevaliez, S., A. Szendroi, V. Caro, J. Balanant, S. Guillot, G. Berencsi, and 28. Lindberg, A. M., P. Andersson, C. Savolainen, M. N. Mulders, and T. Hovi. F. Delpeyroux. 2004. Molecular comparison of echovirus 11 strains circulat- 2003. Evolution of the genome of human enterovirus B: incongruence be- ing in Europe during an epidemic of multisystem hemorrhagic disease of tween phylogenies of the VP1 and 3CD regions indicates frequent recom- infants indicates that evolution generally occurs by recombination. Virology bination within the species. J. Gen. Virol. 84:1223–1235. 325:56–70. 29. Linsuwanon, P., S. Payungporn, R. Samransamruajkit, N. Posuwan, J. 11. Cordey, S., D. Gerlach, T. Junier, E. M. Zdobnov, L. Kaiser, and C. Tap- Makkoch, A. Theanboonlers, and Y. Poovorawan. 2009. High prevalence of parel. 2008. The cis-acting replication elements define human enterovirus human rhinovirus C infection in Thai children with acute lower respiratory and rhinovirus species. RNA 14:1568–1578. tract disease. J. Infect. 59:115–121. 11a.Dominguez, S. R., T. Briese, G. Palacious, J. Hui, J. Villari, V. Kapoor, R. 30. Lukashev, A. N., V. A. Lashkevich, O. E. Ivanova, G. A. Koroleva, A. E. Tokarz, M. P. Glode´, M. S. Anderson, C. C. Robinson, K. V. Homes, and Hinkkanen, and J. Ilonen. 2003. Recombination in circulating enteroviruses. W. I. Lipkin. 2008. Multiplex MassTag PCR for respiratory pathogens in J. Virol. 77:10423–10431. pediatric nasopharyngeal washes negative by conventional diagnostic testing 31. Lukashev, A. N., V. A. Lashkevich, O. E. Ivanova, G. A. Koroleva, A. E. shows a high prevalence of viruses belonging to a newly recognized picor- Hinkkanen, and J. Ilonen. 2005. Recombination in circulating Human en- navirus clade. J. Clin. Virol. 43:219–222. terovirus B: independent evolution of structural and non-structural genome 12. Drexler, J. F., S. Baumgarte, L. K. Souza Luna, A. Stocker, P. S. Almeida, regions. J. Gen. Virol. 86:3281–3290. T. C. Ribeiro, N. Petersen, P. Herzog, C. Pedroso, C. Brites, R. H. da Costa, 32. McErlean, P., L. A. Shackelton, E. Andrews, D. R. Webster, S. B. Lambert, Jr., A. Gmyl, C. Drosten, and A. Lukashev. 2010. Genomic features and M. D. Nissen, T. P. Sloots, and I. M. Mackay. 2008. Distinguishing molecular evolutionary constraints in Saffold-like Cardioviruses. J. Gen. Virol. 91: features and clinical characteristics of a putative new rhinovirus species, 1418–1427. human rhinovirus C (HRV C). PLoS One 3:e1847. 13. Gerber, K., E. Wimmer, and A. V. Paul. 2001. Biochemical and genetic 33. McErlean, P., L. A. Shackelton, S. B. Lambert, M. D. Nissen, T. P. Sloots, studies of the initiation of human rhinovirus 2 RNA replication: identifica- and I. M. Mackay. 2007. Characterisation of a newly identified human tion of a cis-replicating element in the coding sequence of 2Apro. J. Virol. rhinovirus, HRV-QPM, discovered in infants with bronchiolitis. J. Clin. 75:10979–10990. Virol. 39:67–75. 14. Hamparian, V. V., R. J. Colonno, M. K. Cooney, E. C. Dick, J. M. Gwaltney, 34. McKnight, K. L., and S. M. Lemon. 1998. The rhinovirus type 14 genome Jr., J. H. Hughes, W. S. Jordan, Jr., A. Z. Kapikian, W. J. Mogabgab, A. contains an internally located RNA structure that is required for viral rep- Monto, C. A. Philips, R. R. Rueckert, J. H. Schieble, E. J. Stott, and D. A. J. lication. RNA 4:1569–1584. Tyrrell. 1987. A collaborative report: rhinoviruses—extension of the num- 35. McWilliam Leitch, E. C., J. Bendig, M. Cabrerizo, J. Cardosa, T. Hyypia, bering system from 89 to 100. Virology 159:191–192. O. E. Ivanova, A. Kelly, A. C. Kroes, A. Lukashev, A. Macadam, P. McMinn, 15. Han, T. H., J. Y. Chung, E. S. Hwang, and J. W. Koo. 2009. Detection of M. Roivainen, G. Trallero, D. J. Evans, and P. Simmonds. 2009. Transmis- human rhinovirus C in children with acute lower respiratory tract infections sion networks and population turnover of echovirus 30. J. Virol. 83:2109– in South Korea. Arch. Virol. 154:987–991. 2118. 16. Heath, L., W. E. van der, A. Varsani, and D. P. Martin. 2006. Recombination 36. Miller, E. K., K. M. Edwards, G. A. Weinberg, M. K. Iwane, M. R. Griffin, patterns in aphthoviruses mirror those found in other picornaviruses. J. Vi- C. B. Hall, Y. Zhu, P. G. Szilagyi, L. L. Morin, L. H. Heil, X. Lu, and J. V. rol. 80:11827–11832. Williams. 2009. A novel group of rhinoviruses is associated with asthma 17. Huang, T., W. Wang, M. Bessaud, P. Ren, J. Sheng, H. Yan, J. Zhang, X. Lin, hospitalizations. J. Allergy Clin. Immunol. 123:98–104. Y. Wang, F. Delpeyroux, and V. Deubel. 2009. Evidence of recombination 37. Miller, E. K., N. Khuri-Bulos, J. V. Williams, A. A. Shehabi, S. Faouri, J. Al, and genetic diversity in human rhinoviruses in children with acute respiratory I., Q. Chen, L. Heil, Y. Mohamed, L. L. Morin, A. Ali, and N. B. Halasa. infection. PLoS One 4:e6355. 2009. Human rhinovirus C associated with wheezing in hospitalised children 18. Kapikian, A. Z., R. M. Conant, V. V. Hamparian, R. M. Chanock, P. J. in the Middle East. J. Clin. Virol. 46:85–89. Chapple, E. C. Dick, J. D. Fenters, J. M. Gwaltney, Jr., D. Hamre, J. C. 38. Oprisan, G., M. Combiescu, S. Guillot, V. Caro, A. Combiescu, F. Delpey- Holper, W. S. Jordan, E. H. Lennette, J. L. Melnick, W. J. Mogabgab, M. A. roux, and R. Crainic. 2002. Natural genetic recombination between co- Mufson, C. A. Phillips, J. H. Schieble, and D. A. J. Tyrell. 1967. Rhinovi- circulating heterotypic enteroviruses. J. Gen. Virol. 83:2193–2200. ruses: a numbering system. Nature 213:761–762. 39. Palmenberg, A. C., D. Spiro, R. Kuzmickas, S. Wang, A. Djikeng, J. A. Rathe, 19. Kapikian, A. Z., R. M. Conant, V. V. Hamparian, R. M. Chanock, E. C. Dick, C. M. Fraser-Liggett, and S. B. Liggett. 2009. Sequencing and analyses of all J. M. Gwaltney, Jr., D. Hamre, W. S. Jordan, G. E. Kenny, E. H. Lennette, known human rhinovirus genomes reveal structure and evolution. Science J. L. Melnick, W. J. Mogabgab, C. A. Phillips, J. H. Schieble, E. J. Stott, and 324:55–59. D. A. J. Tyrell. 1971. A collaborative report: rhinoviruses—extension of the 40. Petersen, J. F., M. M. Cherney, H. D. Liebig, T. Skern, E. Kuechler, and numbering system. Virology 43:524–526. M. N. James. 1999. The structure of the 2A proteinase from a common cold 20. Khetsuriani, N., X. Lu, W. G. Teague, N. Kazerouni, L. J. Anderson, and virus: a proteinase responsible for the shut-off of host-cell protein synthesis. D. D. Erdman. 2008. Novel human rhinoviruses and exacerbation of asthma EMBO J. 18:5463–5475. in children. Emerg. Infect. Dis. 14:1793–1796. 41. Piralla, A., F. Rovida, G. Campanini, V. Rognoni, A. Marchi, F. Locatelli, 21. Kistler, A., P. C. Avila, S. Rouskin, D. Wang, T. Ward, S. Yagi, D. Schnurr, and G. Gerna. 2009. Clinical severity and molecular typing of human rhino- D. Ganem, J. L. DeRisi, and H. A. Boushey. 2007. Pan-viral screening of virus C strains during a fall outbreak affecting hospitalized patients. J. Clin. respiratory tract infections in adults with and without asthma reveals unex- Virol. 45:311–317. pected human coronavirus and human rhinovirus diversity. J. Infect. Dis. 42. Renwick, N., B. Schweiger, V. Kapoor, Z. Liu, J. Villari, R. Bullmann, R. 196:817–825. Miething, T. Briese, and W. I. Lipkin. 2007. A recently identified rhinovirus 22. Laine, P., C. Savolainen, S. Blomqvist, and T. Hovi. 2005. Phylogenetic genotype is associated with severe respiratory-tract infection in children in analysis of human rhinovirus capsid protein VP1 and 2A protease coding Germany. J. Infect. Dis. 196:1754–1760. sequences confirms shared genus-like relationships with human enterovi- 43. Rivera, V. M., J. D. Welsh, and J. V. Maizel. 1988. Comparative sequence ruses. J. Gen. Virol. 86:697–706. analysis of the 5Ј noncoding region of the enteroviruses and rhinoviruses. 23. Lamson, D., N. Renwick, V. Kapoor, Z. Liu, G. Palacios, J. Ju, A. Dean, K. Virology 165:42–50. St George, T. Briese, and W. I. Lipkin. 2006. MassTag polymerase-chain- 44. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method reaction detection of respiratory pathogens, including a new rhinovirus ge- for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425. notype, that caused influenza-like illness in New York State during 2004– 45. Savolainen, C., S. Blomqvist, M. N. Mulders, and T. Hovi. 2002. Genetic 2005. J. Infect. Dis. 194:1398–1402. clustering of all 102 human rhinovirus prototype strains: serotype 87 is close 24. Lau, S. K., C. C. Yip, A. W. Lin, R. A. Lee, L. Y. So, Y. L. Lau, K. H. Chan, to human enterovirus 70. J. Gen. Virol. 83:333–340. P. C. Woo, and K. Y. Yuen. 2009. Clinical and molecular epidemiology of 46. Savolainen, C., P. Laine, M. N. Mulders, and T. Hovi. 2004. Sequence 10310 MCINTYRE ET AL. J. VIROL.

analysis of human rhinoviruses in the RNA-dependent RNA polymerase classification and nomenclature of viruses. Eighth Report of the Interna- coding region reveals large within-species variation. J. Gen. Virol. 85:2271– tional Committee on Taxonomy of Viruses. Elsevier/Academic Press, Lon- 2277. don, United Kingdom. 47. Savolainen-Kopra, C., S. Blomqvist, S. Kaijalainen, U. Juonio, R. Juvonen, 54. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: molecular A. Peitso, A. Saukkoriipi, O. Vainio, T. Hovi, and M. Roivainen. 2010. All evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. known human rhinovirus species are present in sputum specimens of military 24:1596–1599. recruits during respiratory infection. Viruses 1:1178–1189. 55. Tamura, K., M. Nei, and S. Kumar. 2004. Prospects for inferring very large 48. Savolainen-Kopra, C., S. Blomqvist, T. Kilpi, M. Roivainen, and T. Hovi. phylogenies by using the neighbor-joining method. Proc. Natl. Acad. Sci. 2009. Novel species of human rhinoviruses in acute otitis media. Pediatr. U. S. A. 101:11030–11035. Infect. Dis. J. 28:59–61. 56. Tapparel, C., T. Junier, D. Gerlach, S. Cordey, S. Van Belle, L. Perrin, E. M. 49. Savolainen-Kopra, C., S. Blomqvist, T. Smura, M. Roivainen, T. Hovi, D. Zdobnov, and L. Kaiser. 2007. New complete genome sequences of human Kiang, I. Kalra, S. Yagi, J. K. Louie, H. Boushey, J. Boothby, and D. P. rhinoviruses shed light on their phylogeny and genomic features. BMC Schnurr. 2009. 5Ј Noncoding region alone does not unequivocally determine Genomics 8:224. genetic type of human rhinovirus strains. J. Clin. Microbiol. 47:1278–1280. 57. Tapparel, C., T. Junier, D. Gerlach, S. Van Belle, L. Turin, S. Cordey, K. 50. Simmonds, P. 2006. Recombination and selection in the evolution of picor- Muhlemann, N. Regamey, J. D. Aubert, P. M. Soccal, P. Eigenmann, E. naviruses and other mammalian positive-stranded RNA viruses. J. Virol. Zdobnov, and L. Kaiser. 2009. New respiratory enterovirus and recombinant 80:11124–11140. rhinoviruses among circulating picornaviruses. Emerg. Infect. Dis. 15:719– 51. Simmonds, P., and S. Midgley. 2005. Recombination in the genesis and 726. evolution of hepatitis B virus genotypes. J. Virol. 79:15467–15476. 58. Tapparel, C., A. G. L’Huillier, A. L. Rougemont, M. Beghetti, C. Barazzone- 52. Simmonds, P., and J. Welch. 2006. Frequency and dynamics of recombina- Argiroffo, and L. Kaiser. 2009. Pneumonia and pericarditis in a child with tion within different species of human enteroviruses. J. Virol. 80:483–493. HRV-C infection: a case report. J. Clin. Virol. 45:157–160. 52a.Simmonds, P., C. L. McIntyre, C. Savolainen-Kopra, C. Tapparel, I. M. 59. Wisdom, A., A. Kutkowska, E. C. McWilliam-Leitch, E. Gaunt, K. Temple- Mackay, and T. Hovi. 21 July 2006. Proposals for the classification of human ton, H. Harvala, and P. Simmonds. 2009. Genetics, recombination and rhinovirus species C into genotypically-assigned types. J. Gen. Virol. doi: clinical features of human rhinovirus species C (HRV-C) infections; inter- 10.1099/vir.0.023994-0. actions of HRV-C with other respiratory viruses. PLoS One 4:e8518. 53. Stanway, G., F. Brown, P. Christian, T. Hovi, T. Hyypia, A. M. Q. King, N. J. 60. Wisdom, A., C. McWilliam Leitch, E. Gaunt, H. Harvala, and P. Simmonds. Knowles, S. M. Lemon, P. D. Minor, M. A. Pallansch, A. C. Palmenberg, and 2009. Screening respiratory samples for human rhinoviruses (HRV) and T. Skern. 2005. Family Picornaviridae, p. 757–778. In C. M. Fauquet, M. A. enteroviruses: comprehensive VP4/2-typing reveals high incidence and ge- Mayo, J. Maniloff, U. Desselberger, and L. A. Ball (ed.), Virus taxonomy: netic diversity of HRV species C. J. Clin. Microbiol. 47:3958–3967. Arch Virol DOI 10.1007/s00705-013-1634-6

ORIGINAL ARTICLE

Recombination in the evolution of human rhinovirus genomes

Chloe L. McIntyre • Carita Savolainen-Kopra • Tapani Hovi • Peter Simmonds

Received: 4 November 2012 / Accepted: 3 January 2013 Ó Springer-Verlag Wien 2013

Abstract Human rhinoviruses (HRV) are highly pre- Introduction valent human respiratory pathogens that belong to the genus Enterovirus. Although recombination within the Human rhinoviruses (HRVs) are a highly prevalent and coding region is frequent in other picornavirus groups, genetically diverse group of human pathogens that cause a most evidence of recombination in HRV has been restric- wide range of disease. Although they have been tradi- ted to the 5’ untranslated region. We analysed the occur- tionally linked to mild, self-limiting upper respiratory tract rence of recombination within published complete genome infections, there is now evidence for a role of HRV in sequences of members of all three HRV species and lower respiratory tract infection [13, 24, 31], exacerbations additionally compared sequences from HRV strains span- of chronic lung disease [20, 43, 70], severe systemic ning 14 years. HRV-B and HRV-C showed very little infections [66, 69] and gastrointestinal disease [14]. evidence of recombination within the coding region. In HRVs are single-stranded, positive-sense RNA viruses contrast, HRV-A sequences appeared to have undergone a that belong to the genus Enterovirus, family Picornaviri- large number of recombination events, typically involving dae. HRVs are classified into three species on the basis of whole type groups. This suggests that HRV-A may have phylogenetic relationships and sequence divergence: been subject to extensive recombination during the period Human rhinovirus A, Human rhinovirus B, and Human of diversification into types. This study demonstrates the rhinovirus C [25, 49, 56]. While species A and B viruses rare and sporadic nature of contemporary recombination of can be readily isolated and cultured in vitro (and were HRV strains and contrasts with evidence of extensive among the earliest identified human picornaviruses), the recombination within HRV-A and between members of equally prevalent HRV-C was only discovered very different species during earlier stages in its evolutionary recently following the introduction of PCR-based diag- diversification. nostic and surveillance screening methods [1, 21, 26, 28, 53]. For reasons that are currently poorly understood, HRV-C replicates poorly, if at all, in conventional cell culture, although recently, a variant of HRV-C was suc- cessfully cultured in sinus mucosal organ culture [4]. Electronic supplementary material The online version of this Serotypes of HRV-A and HRV-B were originally article (doi:10.1007/s00705-013-1634-6) contains supplementary defined by serological cross-neutralisation assays [12, 19] material, which is available to authorized users. and further characterised by receptor specificity and anti- viral sensitivity profile. More recently, HRV-A and HRV-B C. L. McIntyre (&) Á P. Simmonds Infection and Immunity Division, Roslin Institute, have been genetically divided into two groups into which University of Edinburgh, Edinburgh EH25 9RG, UK the 74 and 25 HRV types previously defined by serological e-mail: [email protected] methods have been assigned [56]. More recent analyses of phylogenetic relationships and genetic divergence among C. Savolainen-Kopra Á T. Hovi National Institute for Health and Welfare (THL), HRV strains has led to the reclassification of EV-68 as Mannerheimintie 166, FIN-00300, Helsinki, Finland HRV-A87 [3] and of HRV-A21 and the Hanks strain as the 123 C. L. McIntyre et al. same genotype [27]. Three new HRV-A types have also types had only one full genome sequence available. In been defined on the basis of sequence divergence in the VP1 addition, sequence data for HRV-C strains spanned a very region: HRV-A101–HRV-A103 (52, www.picornaviridae. limited time period due to their relatively recent discovery com). For the more recently described species C rhinovi- and the tendency of studies to focus on capsid-based typ- ruses, difficulties associated with in vitro culture have ing. In the course of this analysis, we obtained sequences precluded the development of serology-based virus typing, from isolates of all three HRV species spanning 14 years in and variants are currently divided into 51 genotypically order to determine whether recombination had shaped the defined types. Classification into types is based on the contemporary diversity of these types. In addition, we also current system used for enteroviruses [46], in this case performed a comprehensive analysis of recombination based on an assignment threshold of 13 % nucleotide between the VP1 and 3Dpol regions of all available HRV divergence in the VP1 region [59]. full genomes. This study represents an attempt to com- The epidemiological profile of HRVs and occurrence of prehensively characterise all detectable recombination infection with different types varies substantially every events in the history of diversification of HRV. epidemic season [57], with high frequencies of asymp- tomatic infections [50] and no identified link between any strain, type or species of HRV with particular disease Materials and methods presentations. Despite an apparent difference in tissue tropism, the HRVs share many genetic similarities with Sample selection other members of the genus Enterovirus. For instance, they have a single-stranded, positive-sense RNA genome of HRV-positive isolates were selected from distinct geo- around 7100 bases with a 5’ untranslated region (5’UTR) graphical locations (UK and Finland) over a period span- of around 600–650 bases. The coding region contains a ning 14 years. These included 28 HRV-positive clinical single open reading frame and is cleaved post-translation- respiratory specimens from Finland collected between ally into four structural (VP1–VP4) and seven non-struc- 1995 and 1997. VP4/VP2 sequences for these isolates were tural (2A–3D) proteins. The 5’UTR contains a type I previously published with accession numbers in the series internal ribosomal entry site (IRES) and a conserved 5’ EU590043–EU590113 and AY015114–AY015174. Using cloverleaf structure [48, 49]. previously defined pairwise nucleotide p-distance thresh- EV serotypes, in particular EV-B, have been docu- olds in the VP4/VP2 region [59, 67], these 28 samples were mented to undergo frequent recombination [30, 34, 40, 44, identified as belonging to 22 HRV types (10 HRV-A, 6 61], and it has been proposed that structural and non- HRV-B and 6 HRV-C). structural genome regions evolve independently [33, 45, In addition, a total of 552 respiratory samples from 55]. EVs can thus be classified into a series of distinct patients previously referred to the Specialist Virology recombinant forms (RFs), each possessing a phylogeneti- Centre at the Royal Infirmary of Edinburgh were screened cally distinct grouping in 3Dpol and other non-structural by our previously described nested PCR method, specific genes. These RFs have been shown to arise at intervals, for the 5’UTR region of HRV/HEV [67]. Positive samples dominate circulation for a few years and then disappear (n=136) were amplified and sequenced in the VP4/VP2 entirely [39, 41]. Recombination breakpoints in EV gen- region as described previously [67]. Samples that were of erally localise to two hotspot regions: between the 5’UTR the same 22 types described above (n= 41) were identified and P1 and between P1 and P2 [61]. and selected for further analysis. A further 23 HRV-C- In contrast, recombination is generally thought to be rare positive samples were also identified, which represented 11 and sporadic within the coding region of HRV [22, 29, 65], of the previously 28 provisionally assigned HRV-C types and a contemporary set of HRV-C sequences were found to [59]. In total, 92 HRV isolates were analysed during the cluster completely congruently in three coding genome course of this work. regions [38]. To date, there is limited evidence of inter- species recombination within the coding region of HRV. It Amplification of the VP4/VP2, VP1 and 3Dpol regions is well established, however, that HRV-A and HRV-C have of HRV-A, -B and –C undergone recombination within the 5’UTR, whereby over half of all HRV-C types group within the HRV-A clade RNA was extracted from clinical samples using a QIAGEN [17, 68]. In addition, recent reports have highlighted the Virus Spin Kit (QIAGEN, UK), according to manufac- possibility that several contemporary HRV-A strains have turer’s instructions. Reverse transcription reactions were been formed by recombination [49]. carried out using random hexamers and a Promega Access Until recently, data from the non-structural region of the Reverse Transcription Kit with modifications as described HRV genome had only limited availability, as most HRV previously [15]. Amplification of the VP1 and 3Dpol 123 Recombination in the evolution of human rhinovirus genomes

VP1 3Dpol 100 HRV-A18 100 HRV-A38 96 HRV-A34 88 HRV-A50 HRV-A40 HRV-A18 HRV-A85 100 HRV-A77 82 HRV-A57 HRV-A60 100 HRV-A21 100 99 HRV-A54 HRV-A55 78 HRV-A98 99 73 99 HRV-A24 HRV-A31 HRV-A90 88 HRV-A47 81 HRV-A11 HRV-A90 100 99 100 HRV-A33 HRV-A66 77 86 HRV-A24 HRV-A76 100 100 100 HRV-A75 HRV-A50 81 HRV-A34 100 HRV-A43 99 100 99 HRV-A21 HRV-A9 78 99 HRV-A57 100 HRV-A32 HRV-A55 HRV-A67 HRV-A44 91 100 96 HRV-A15 HRV-A29 HRV-A74 70 JN815252 85 100 HRV-A60 HRV-A62 71 100 HRV-A25 84 100 HRV-A38 HRV-A100 HRV-A19 99 HRV-A10 100 HRV-A82 98 HRV-A63 HRV-A22 100 99 JN815238.1 94 100 95 HRV-A94 EU840726.1 100 99 HRV-A64 HRV-A56 100 100 HRV-A61 HRV-A11 100 JN815250 Clade 1 HRV-A96 100 99 HRV-A33 100 HRV-A73 HRV-A76 93 HRV-A41 99 HRV-A13 100 100 HRV-A59

100 HRV-A30 HRV-A39 HRV-A23 100 HRV-A49 100 86 100 HRV-A2 80 HRV-A2 95 99 99 HRV-A49 100 100 HRV-A30 100 HRV-A81 86 98 HRV-A23 100 HRV-A16 100 100 HRV-A75 100 HRV-A1 99 HRV-A94

100 HRV-A40 HRV-A22 HRV-A85 92 HRV-A64 HRV-A56 HRV-A19 94 100 HRV-A82 HRV-A98 100 100 HRV-A43 HRV-A54 100 97 97 HRV-A15 89 HRV-A63 99 HRV-A74 100 HRV-A59 100 72 HRV-A13 HRV-A39 100 HRV-A73 100 HRV-A10 83 100 HRV-A41 HRV-A100 HRV-A96 100 HRV-A66 84 HRV-A61 99 100 100 HRV-A77 HRV-A67 99 HRV-A47 99 HRV-A32 100 HRV-A9 HRV-A31 99 100 HRV-A1 HRV-A62 100 100 100 HRV-A81 HRV-A25 96 99 HRV-A16 72 JN815252 100 HRV-A88 HRV-A44 100 100 HRV-A7 99 HRV-A29 95 HRV-A89 100 70 HRV-A7 98 Clade 2 87 99 HRV-A58 HRV-A88 HRV-A36 100 99 100 HRV-A58 HRV-A78 100 100 100 98 HRV-A89 99 HRV-A53 99 HRV-A36 HRV-A46 100 97 100 HRV-A78 HRV-A80 HRV-A101 HRV-A12 87 100 100 100 100 HRVA65 100 HRV-A80 98 HRV-A51 HRV-A46 100 79 HRV-A71 100 HRV-A101 HRV-A103 84 100 HRV-A65 100 100 HRV-A20 Clade 3 HRV-A51 HRV-A28 100 84 100 99 HRV-A71 HRV-A68 HRV-A103 97 105/6735 Resp4129 100 HRV-A20 100 100 Resp4103 HRV-A68 99 Resp3949 100 93 HRV-A53 Resp5119 72 HRV-A28 HRV-A12 100 100 100 HRV-A95 100 HRV-A95 HRV-A8 HRV-A8 Clade 4 100 99 HRV-A45 HRV-A45 100 100 NC001430 NC001430

0.05 0.05

123 C. L. McIntyre et al. b Fig. 1 Neighbour-joining phylogenetic trees showing the VP1 and HRV-A, -B and –C 3Dpol partial 3Dpol regions of HRV-A strains. Where possible, HRV-A type groups have been collapsed for ease of reference. Contiguous clades are marked by coloured boxes. Clades 2, 3 and 4 are marked by Amplification of the partial 3Dpol region was carried out purple, green and black boxes respectively, while clade 1 is divided under previously described conditions [38] but using an into sub-clusters, which group together in 3Dpol. HRV-A types that annealing temperature of 45 °C in the second round. The undergo a bootstrap-supported change in nearest neighbour between amplified segment covered positions 6414–6896 in HRV-A VP1 and 3Dpol are marked with a red line. Changes in tree topology that are bootstrap supported on P1, P2 and P3 trees only are marked (numbered according to FJ445111–HRV-A1), 6475-6960 with a yellow line. Orange arrows indicate the formation of one 3Dpol in HRV-B (numbered according to X01087–HRV-B14) clade from three VP1 clades and 6361-6835 in HRV-C (numbered according to EF582385–HRV-C4). regions was carried out using the primers listed (Supple- All amplicons were sequenced using a BigDye Termi- mentary Table 1; Supplementary Data) and under the nator Kit (Applied Biosystems, Warrington, UK). conditions described below. Sequences generated in this study have been submitted to GenBank and assigned accession numbers in the series HRV-A VP1 KC342054–KC342173.

The full HRV-A VP1 region was assembled by amplifi- Sequence alignment and pairwise nucleotide p-distance cation and sequencing of two overlapping sequence frag- calculations ments. The same first-round product was utilised for second-round reactions for both VP1 fragments. For frag- All available HRV full genomes were downloaded from ment 2, VP1 amplification was carried out as described GenBank on 6/6/12. As a number of currently published previously [67]. For those samples that proved problematic HRV polyprotein sequences contain large gaps within the to amplify, a combined reverse transcription and first-round coding region, only those that were more than 90 % PCR utilising the SuperScript III system (Invitrogen, UK) complete in the VP4/VP2, VP1 and partial 3Dpol regions was carried out as described previously [42]. For fragment were included. The regions analysed were 616–1002 for 1, the second round was a hemi-nested PCR reaction with VP4/VP2, 2305–3126 for VP1 and 6361–6835 for partial primers 1943s and 2504as and previously described tem- 3Dpol (with all numbering according to reference strain perature cycling conditions. For HRV-A28-positive speci- HRV-C4, strain 024 (EF582385)). These comprised 136 mens, amplification of fragment 1 was not possible with HRV-A, 53 HRV-B and 121 HRV-C sequences, which standard primers, and a specific inner antisense primer was were then compiled into separate datasets for each species, designed. together with sequences generated in the current study: 28 HRV-A, 17 HRV-B and 24 HRV-C. A second dataset HRV-B VP1 containing all available full genomes that were more than 90 % complete in the P1, P2 and P3 coding regions was Amplification of HRV-B VP1 was carried out using hemi- constructed. Sequence alignments were performed in SSE nested primers. PCR conditions were as for HRV-A, except v1.0 [58], using a combination of CLUSTAL/MUSCLE for the use of an annealing temperature of 48 °C in the algorithms implemented in the program and manual second round. For any samples that could not be amplified inspection of nucleotide and amino acid alignments. by this method, the SuperScript III system was used. Pairwise nucleotide p-distances for individual regions were calculated using the program SequenceDist, within HRV-C VP1 the SSE v1.0 package. For HRV-C, pairwise nucleotide p-distances in the VP1 region were used to confirm the SuperScript III was used to perform a combined reverse minimum genetic divergence of the new HRV-C types transcription and first-round PCR reaction directly on [59]. extracted RNA. The second round of the nested PCR reaction was carried out with an annealing temperature of Analysis of mean substitution rates 45 °C and in a 50-ll reaction volume. PCR products were loaded into wells on a 2 % agarose gel and separated at 150 Sequences representing six HRV-A and HRV-C types were V for 45 minutes. DNA bands of around 1000 bases were selected for analysis of mean substitution rates in the VP1 excised manually under UV transillumination and purified and partial 3Dpol regions. For VP1 datasets, every avail- from the agarose gel using an Illustra GFX PCR DNA and able GenBank entry that was [90 % complete across the Gel Band Purification Kit (GE Healthcare, UK) according region and had a date of isolation specified was included. to the manufacturer’s instructions. 3Dpol datasets contained only sequences generated in this 123 Recombination in the evolution of human rhinovirus genomes study and published full genomes. HRV-A and HRV-C specified a constant population size, with auto-optimization datasets were screened with GARD, SBP and RDP for of all other priors during execution of the program. Each evidence for intra- and inter-type recombination prior to MCMC chain was run for 20 million states, and the output substitution rate analysis. was recorded every 1000 states. All datasets were analysed Analysis of mean substitution rates was carried out in in duplicate to ensure convergence of results. In order to the Bayesian Evolutionary Analysis of Sampling Trees confirm that all observed results were a consequence of program (BEAST) [8]. Each analysis used the SRD06 input sequence data, each dataset was subjected to an model of nucleotide substitution, as recommended for ‘‘empty’’ execution of BEAST with sequence data exclu- protein coding data [8]. The analysis was carried out using ded. Additionally, all datasets were analysed with the dates a relaxed uncorrelated lognormal molecular clock and of isolation of individual sequences scrambled.

ABVP1 3Dpol

100 3Dpol 100 HRV-B72 HRV-B37 VP1 99 99 100 100 HRV-C6 97 HRV-B6 HRV-C6 88 HRV-B14 100 100 99 HRV-C3 HRV-B103 HRV-C3 100 HRV-B3 100 100 100 99 HRV-B3 98 HRV-C1 HRV-C10 100 100 HRV-B37 100 81 HRV-C10 100 HRV-B72 HRV-B6 HRV-C1 100 100 99 100 HRV-C7 95 HRV-B14 HRV-C43 100 HRV-B103 100 100 100 86 HRV-C21 100 HRV-B83 HRV-C22 100 HRV-B83 100 95 100 100 HRV-C22 86 HRV-B92 HRV-C21 76 HRV-B92. 88 100 HRV-C43 HRV-B79 HRV-C7 99 HRV-B79 100 100 HRV-C32 100 100 100 HRV-C13 96 HRV-B35 HRV-B101 100 99 HRV-C39 97 HRV-C32 98 HRV-B100 100 HRV-C13 78 HRV-B100: HQ123444 HRV-C39 99 100 100 HRV-C5 HRV-B101 HRV-B35 FJ445187. HRV-C5 79 100 100 92 HRV-C38 HRV-B35 DQ473487. 99 HRV-C11 99 HRV-B86 100 99 HRV-C11 100 HRV-B42 HRV-C38 HRV-B42 HRV-C27 88 73 HRV-B5 HRV-C27 100 100 HRV-B5 100 HRV-C20 99 HRV-C45 100 HRV-B26 70 99 100 97 HRV-B99 HRV-C45 94 HRV-B99 HRV-C29 98 HRV-B26 88 99 HRV-C29 HRV-B4 HRV-C20 100 100 100 100 HRV-C41 100 HRV-B4 HRV-C41 100 97 HRV-B84 99 100 HRV-B84 HRV-C30 HRV-C30 100 HRV-B97 86 HRV-C23 HRV-B97. HRV-C23 77 100 93 100 HRV-B93 100 99 HRV-C15 98 HRV-B93 82 78 HRV-C15 HRV-B27 100 HRV-C24 88 HRV-B27 HRV-C24 79 100 Resp 13213/08 100 94 98 HRV-C25 HRV-C25 100 Resp 13213/08 Resp 13171/08 100 100 100 100 HRV-C12 HRV-C12 98 94 Resp 13171/08 HRV-B70 DQ473489. 97 99 91 97 HRV-C44 99 100 HRV-C44 HRV-B70 DQ473489. HRV-B17 HRV-C42 HRV-C42 97 HRV-B17 87 HRV-B91 100 100 100 HRV-C17 HRV-C17 100 HRV-B91 99 HRV-B52 100 100 98 HRV-C16 HRV-C16 100 HRV-B69 100 95 HRV-B69 100 89 100 100 HRV-C8:GQ223227/06 HRV-C31 100 HRV-B52 HRV-B48 100 80 100 76 Resp 10300/08 Resp 10300/08 HRV-B48 HRV-B86 100 HRV-C18 99 100 HRV-C31

NC_001430 100 100 99 NC_001430 HRV-C28 HRV-C28 99 100 HRV-C18 HRV-C4 100 99 0.05 0.05 100 100 HRV-C33 HRV-C33 96 HRV-C14 99 HRV-C14

100 100 HRV-C40 HRV-C40 100 100 HRV-C2 HRV-C2 74 100 100 HRV-C19 HRV-C19 100 100 100 HRV-C26 100 HRV-C26 88 84 JF317015/07 JF317015/07 HRV-C46 HRV-C46 81 HRV-C36 HRV-C36 100 99 HRV-C9 HRV-C8:GQ223227/06 100 HRV-C4 77 HRV-C9 100 100 NC 001430 NC 001430

0.05 0.05

Fig. 2 A: Neighbour-joining phylogenetic trees of HRV-B VP1 and with black boxes. B: Neighbour-joining phylogenetic trees of HRV-C partial 3Dpol sequences. Putative recombinant sequences and type VP1 and partial 3Dpol sequences. HRV-C type groups have been groups are marked with black lines. HRV-B type groups are collapsed collapsed where possible for ease of reference. Contiguous clades are where possible for ease of reference. Contiguous clades are marked marked with black boxes 123 C. L. McIntyre et al.

Table 1 Pairwise nucleotide p-distances of newly proposed HRV types to the nearest neighbours GenBank accession Nearest neighbour VP1 pairwise nucleotide Provisional HRV-C Confirmed HRV number (HRV type) p-distance type type

JN815252 HRV-A44 HRV-A44: 0.109 -- HRV-A29 HRV-A29: 0.1044 JF781500 HRV-B35 0.1982 - HRV-B101 JF781501 HRV-B35 0.1901 - HRV-B101 JX074052 HRV-B35 0.1915 - HRV-B101 JX074053 HRV-B35 0.2267 - HRV-B102 JF416311a HRV-C2 0.2900 HRV-Cpat1 HRV-C36 JF416321 HRV-C26 0.2719 HRV-Cpat2 HRV-C37 JF416322 HRV-C5 0.2147 HRV-Cpat4 HRV-C38 JF416306 HRV-C32 0.1472 HRV-Cpat5 HRV-C39 JF416312 HRV-C2 0.1755 HRV-Cpat6 HRV-C40 JF416323 HRV-C30 0.1904 HRV-Cpat7 HRV-C41 JF416320 HRV-C12 0.2506 HRV-Cpat8 HRV-C42 JF416307 HRV-C22 0.2460 HRV-Cpat9 HRV-C43 JF416310 HRV-C12 0.2077 HRV-Cpat11 HRV-C44 JF416308 HRV-C29 0.2540 HRV-Cpat12 HRV-C45 JF416318 HRV-C26 0.2836 HRV-Cpat25 HRV-C46 a For ease of reference, where more than one example of a previously provisionally assigned HRV-C type was sequenced in this study, only the earliest isolated strain is shown

Output was analysed in the TRACER program, within and Bootscan [35]. A potential recombination event was the BEAST package. Phylogenetic trees generated by accepted for further analysis if it was detected by more than Bayesian methods were annotated in TreeAnnotator v1.6 two of the above algorithms, with a p-value of less than 0.05. and visualised in FigTree. These were then directly com- All datasets were further analysed using the GARD and pared to trees produced by bootstrap re-sampled maximum SBP [23] programs available on the HyPhy datamonkey composite likelihood neighbour-joining trees to ensure webserver (www.datamonkey.org). Prior to analysis of consistency of phylogenetic relationships inferred using the each dataset, the model selection tool included was two methods. employed, and the appropriate model of nucleotide sub- stitution was used for further analysis. Phylogenetic analysis In addition, as a control measure, several groups of sequences that were observed to display congruent phylo- Phylogenetic trees were constructed using the MEGA 5.0 genetic relationships between VP1 and 3Dpol were software package [63] by the neighbour-joining method analysed by these methods. All analyses of putative non- [54] from 100 bootstrap-resampled sequence alignments recombinant sequence groups confirmed that there was no using the maximum composite likelihood (MCL) [11] evidence of recombination. distances with pairwise deletion for missing data. Phylo- The combination of different recombination detection genetic trees were examined for changes in branch length algorithms detailed above were used as a screening tool and tree topology between non-consecutive genome frag- in order to effectively screen each alignment for evi- ments (VP4/VP2, VP1 and 3Dpol) and between coding dence of recombination. If evidence of recombination regions of HRV full genomes (P1, P2 and P3). was detected by these methods, the recorded event was further analysed by phylogenetic analysis of putative Recombination analysis recombinant regions and further characterised by analysis with the GroupScan program within the SSE v1.0 soft- RDP v4.0 [36] analysis was undertaken to further investigate ware package. observed phylogenetic incongruities. Putative recombinant sequences were analysed with their observed nearest Determining putative recombination breakpoints neighbours in each region. Each sequence set was examined using a combination of algorithms, including RDP [37], In order to verify putative recombination events and GeneConv [47], MaxChi [62], Chimaera [51], SiScan [10] determine recombination breakpoints where possible, sets

123 Recombination in the evolution of human rhinovirus genomes

analysed regions were included in the recombination analysis. A separate recombination analysis was performed with complete genome sequences (201 HRV-A, 71 HRV-B and 59 HRV-C). Examination of phylogenetic trees constructed for the VP1 region of members of all three HRV species revealed consistent grouping of individual (sero)types into clades with uniformly high bootstrap support (Figs. 1, 2). These type groupings were also apparent in the VP4/VP2 and full P1 capsid coding region. However, the relationships between HRV type groups were generally less well resolved in VP4/VP2 trees as a result of greater sequence conservation and shorter fragment length (data not shown). Previously proposed pairwise nucleotide p-distance divergence thresholds were used to support phylogenetic genotype groupings and define putative new HRV types [59, 67] (Table 1). One newly described HRV-A sequence, JN815252, did not group with any currently assigned HRV-A type and Fig. 3 Mean substitution rates and 95 % highest posterior density shared nearest neighbours, HRV-A29 (FJ445125) and intervals for two coding regions of selected HRV-A and HRV-C sequences HRV-A44 (DQ473499), in every region studied. Using the previously determined pairwise nucleotide p-distance of sequences underwent GroupScan analysis in the threshold of 12 % divergence in VP1 for identification of SSEv1.0 software package. GroupScan scores the extent of new HRV-A types [67], JN815252 fell within the intra- phylogenetic grouping of a query sequence with predefined serotype distance range on comparison with both HRV- sequence groups and can pinpoint potential breakpoints if A29 and HRV-A44 (0.1051 and 0.1083, respectively). sequences that are closely related to both parental groups Divergence between HRV-A29 and HRV-A44 has previ- are included. The GroupScan method relies on detection of ously been shown to fall within the intra-serotype distance bootstrap-supported changes in tree topology along an range [67], and these two serotypes have also been dem- alignment and was used for the determination of recom- onstrated to be serologically cross-reactive [7]. These three bination breakpoints. For each analysis, the three sequences were treated as the same serotype for recombi- sequenced regions of each isolate were concatenated into a nation analysis. single sequence, and a fragment size of 300 bases with an In species B, JF781500, JF781501 and JX074052 increment of 30 bases was used. Additionally, each anal- formed a bootstrap-supported clade in VP1 with a pairwise ysis was repeated with a fragment size of 600 bases and an nucleotide p-distance of 0.015 to 0.018 from each other and increment of 30 to ensure consistency of results. In each of consistently greater than 0.19 from the nearest assigned instance, the query sequence was the potential recombinant HRV-B type (HRV-B35). JX074053 also displayed a sequence/group and was scanned against the nearest p-distance of 0.2267 from HRV-B35, its nearest relative. neighbour/group in both regions. The intersection of the By current criteria [67], these two groups represent new two lines was calculated to give an estimation of the HRV-B types and have been formally designated HRV- recombination breakpoint. Breakpoints were compared and B101 and HRV-B102 (Table 1)(http://www.picornastudy found to be highly similar in scans carried out with a group.com/types/enterovirus/hrv-b.htm). 300-nucleotide fragment and those done with a HRV-C VP1 sequence data generated during this study 600-nucleotide fragment. confirmed type assignment of eleven HRV-C types that had been provisionally identified as new types based on VP4/ VP2 sequences [59]. These had pairwise distances that fell Results above the VP1 divergence threshold (Table 1) and have been formally designated as HRV-C36 – HRV-C46 (http://www. Genotype assignment for HRV-A, HRV-B and HRV-C picornastudygroup.com/types/enterovirus/hrv-c.htm). The sequences previously proposed nucleotide divergence threshold of 13 % was readily identifiable in the distribution of pairwise A total of 164 HRV-A, 70 HRV-B and 145 HRV-C vari- VP1 p-distances generated from the expanded VP1 dataset ants that were more than 90 % complete across the (Fig. S1, Supplementary Data). 123 C. L. McIntyre et al.

Fig. 4 Comparison of pairwise nucleotide p-distances between VP1 putative recombination events involving full HRV-A type groups and partial 3Dpol regions of HRV-A, HRV-B and HRV-C. The line of [such as observed extensively within clade 1 (Fig. 4)] in red. Sporadic best fit is indicated by a dotted red line. For HRV-A, the line of best recombination events involving single HRV-A sequences are marked fit is given separately for putative recombinant and non-recombinant in yellow. For graphs depicting HRV-B and HRV-C, all examples of distributions. A graph depicting HRV-A pairwise comparisons shows putative recombination events are marked in red

Analysis of mean substitution rates in selected HRV-A types were available for inclusion of members of this species and HRV-C types in the analysis. Analysis was carried out separately for HRV-A sequences (n=18) and HRV-C sequences (n=23). Three non-recombinant types each of HRV-A (HRV-A9, For both analysed regions of HRV-A and HRV-C, mean -A28 and –A78) and HRV-C (HRV-C9, -C12 and –C18) with substitution rates and 95 % highest posterior density (HPD) dates of isolation spanning at least 13 years were selected for intervals were on the order of 10-3–10-2 substitutions per analysis of mean substitution rates. The HRV-A dataset site per year. Calculated mean substitution rates were lar- assembled contained sequences spanning 40 years, while gely similar between the two coding regions and two HRV HRV-C sequences spanned 14 years. Phylogenetic trees species considered (Fig. 3). Although both regions of analysed for each type included showed no bootstrap-sup- HRV-C showed 2- to 3-fold elevated substitution rates ported change in topology within each type or species (data (6.9138 9 10-3 and 9.8491 9 10-3 for VP1 and 3Dpol, not shown). Insufficient sequences of individual HRV-B respectively) when compared to the estimates obtained for

123 Recombination in the evolution of human rhinovirus genomes

Fig. 5 Neighbour-joining phylogenetic trees representing distinct patterns of recombination observed within HRV-A sequences. The branch to the tree root has been collapsed, and the outgroup used for analysis is given below. A: Phylogenetic trees showing examples of incongruent topology between VP1 and 3Dpol involving [1] HRV-A56 and HRV-A76 (outgroup: A7), [2] HRV-A28 and HRV-A68 (outgroup: A95) and [3] HRV- A33 and older HRV-A76 strains (outgroup: A88) B: Phylogenetic tree showing branch-length discrepancies between VP1 and 3Dpol of HRV-A80 (outgroup: A46)

HRV-A strains (3.9954 9 10-3 and 3.4586 9 10-3 for VP1 observed linear relationships between the pairwise p-dis- and 3Dpol, respectively), the 95 % HPD intervals over- tances of sequences in these two regions (Fig. 4). This lapped for the two species. linear relationship was observed in HRV-B and HRV-C, in In order to ensure that the specified priors were not both intra- and inter-(sero) type distance ranges (Fig. 4). inadvertently influencing the posterior distribution of the Both distributions contained few if any outlying data calculated mean substitution rate, the BEAST algorithm points, and lines of best fit had gradients of approximately was executed with sequence data excluded for each dataset. 1 (0.9042 and 0.9418, respectively). This correlation pro- As additional confirmation, dates of isolation for individual vides evidence for maintained equal substitutions rates in sequences were randomised using the SSE v1.0 software the two regions throughout the period of diversification package, and analysis with BEAST was repeated. Scram- into different types. bled datasets gave extremely low substitution rate esti- In contrast, HRV-A sequences showed lower divergence mates well below the lower HPD interval of the test in the 3Dpol region compared to VP1 and a gradient of sequences (data not shown). 0.8714 for the line of best fit (Fig. 4). The distribution additionally contained a large number of outlying data Comparison of pairwise nucleotide p-distances in VP1 points, consistent with the occurrence of several recombi- and 3Dpol nation events during HRV-A diversification. HRV-A pairwise distances could be divided into three distinct The similarity of mean substitution rates between VP1 and distributions, consisting of non-recombinant types, poten- 3Dpol regions in HRV-A and HRV-C is consistent with the tial recombination events involving full HRV-A type

123 C. L. McIntyre et al. groups and sporadic instances of phylogenetic incongruity protein coding region. As reported previously [65], HRV- involving single HRV-A sequences (marked with blue, red A76 strains collected between 1999 and 2010 grouped most and yellow, respectively, in Fig. 4; see phylogenetic anal- closely with HRV-A56 in 3Dpol (Fig. 5 A1). The intra- ysis below). For the putative non-recombinant types that type pairwise nucleotide p-distances of HRV-A76 strains grouped consistently in VP1 and 3Dpol regions, a gradient ranged from 0.0839 to 0.0933 in VP1 and from 0.2085 to of approximately 1 (0.9011) was observed. Meanwhile, 0.2292 in 3Dpol. HRV-A68 and HRV-A33 also displayed a consideration of pairwise p-distance comparisons between similar pattern, whereby the older strains grouped prefer- potentially recombinant HRV-A sequences alone gave a entially with HRV-A28 (Fig. 5 A2) and HRV-A76 (Fig. 5 calculated gradient of 0.7216. A3) type strains, respectively, in 3Dpol. The HRV-A40 type group includes three recently described full genome Analysis of phylogenetic incongruity in HRV-B sequences that were not included in VP1 and 3Dpol anal- and HRV-C sequences ysis due to gaps present in published sequences, which resulted in sequence completeness below the threshold of In accordance with the observed conformity of HRV-B VP1 90 %. However, on analysis of the whole P3 region, the and 3Dpol pairwise distances (Fig. 4), the majority of HRV-B three contemporary HRV-A40 strains (JX074051, sequences displayed congruent clustering in phylogenetic trees JQ245067 and JN798579) formed a bootstrap-supported and clustered within their genotype group, as defined by VP1 clade separate from the HRV-A40 type strain, which (Fig. 2A). The only bootstrap-supported putative recombinant maintained its grouping with HRV-A85. sequence observed was HQ123444. This full genome HRV-A80 sequences showed a substantial difference in sequence showed a pairwise p-distance of at least 0.19 from branch lengths between two regions, with no accompany- HRV-B35 strains in VP1. However, on inspection of all pub- ing change in tree topology (Fig. 5B). The HRV-A80 type lished HRV full genomes, HQ123444 was identical to HRV- strain had a pairwise nucleotide p-distance of 0.078 from B35 from position 6353 in the 3Dpol region. This incongruity contemporary HRV-A80 strains in VP1. However, in was detected using RDP, GARD and SBP. 3Dpol, the contemporary HRV-A80 strains were much In addition, a closer phylogenetic grouping of the HRV- more divergent from the type strain, with a p-distance of B17 and HRV-B70 type strains, as compared to contem- 0.207. All of the above putative recombination events were porary HRV-B70 sequences, was observed in the 3Dpol verified by analysis with GARD, SBP and RDP. region. Although this change of topology did not have With the exception of HRV-A12 and HRV-A78, all adequate support in 3Dpol (Fig. 2A), it was supported in HRV-A sequences grouped into four bootstrap-supported both the P2 and P3 regions (data not shown). clades (numbered 1-4 in Fig. 1) branching basally in both The majority of HRV-C sequences fell within six VP1 and 3Dpol trees. Clades 1–4 were maintained in all bootstrap-supported clades, which were congruent between regions studied, and although bootstrap-supported changes all regions analysed (Fig. 2B). With the previously repor- in tree topology were relatively common within each of ted artefactually recombinant sequence GQ223227 exclu- these clades, there were no instances of recombination ded from analysis [17, 38], no evidence of inter-type occurring between them. recombination was observed in HRV-C. Any observed Bootstrap-supported changes in branching order and tree inconsistency in branching order or phylogenetic relation- topology were observed within clades 1, 2 and 3 (marked ships between HRV-C types could not be confirmed by with red lines in Fig. 1). The number of bootstrap-supported analysis with RDP, GARD or SBP. nodes that each putative recombinant sequence violated in the transition between VP1 and 3Dpol grouping ranged from Analysis of phylogenetic incongruity in HRV-A two to eight (Table 2). Clade 1 formed several sub-clusters sequences of HRV-A types, and there was a clear breakdown in phy- logenetic relationships of HRV-A types between VP1 and In contrast to HRV-B and HRV-C, the comparison of VP1 3Dpol. For example, one monophyletic bootstrap-supported and partial 3Dpol pairwise nucleotide p-distances in HRV-A clade in 3Dpol (orange box; Fig. 1) was formed from three suggested a number of separate recombination events distinct bootstrap-supported VP1 clades. Phylogenetic rela- between VP1 and 3Dpol (Fig. 1) and between the P1, P2 and tionships between involved HRV-A types within this 3Dpol P3 regions (Fig. S2; Supplementary data). Putative recom- clade were largely incongruent when compared with the VP1 bination events identified by phylogenetic analysis were tree topology. However, there was poor bootstrap support for confirmed by analysis with RDP, GARD and SBP. nearest-neighbour relationships between HRV-A types in Several HRV-A types showed a pattern similar to that of this region, regardless of whether 3Dpol or full P3 region HRV-B70, with contemporary strains grouping separately trees were inspected. Notably, the large cluster of putative from their respective older type strains in the non-structural recombinant HRV-A pairwise nucleotide p-distance 123 Recombination in the evolution of human rhinovirus genomes

Table 2 Phylogenetic incongruities observed between VP1 and 3Dpol in HRV-A sequences HRV type Accession numbers Nearest neighbour Number of changesa B/p mean VP1 3Dpol P1 P2 P3 VP1 ? 3Dpol P1 ? P2 P2 ? P3 (range)

A13 FJ445116 A41 A73 A41 A73 A73 2 2 0 3393 FJ445117 (3365–3420) A53 DQ473507 A28 A46 A28 A46 A46 4 4 0 3213 JN798587 (3198–3228) A46 DQ473506 A80 A53 A80 A53 A53 4 4 0 3280 A68 FJ445150 A20 A28 A20 A20 A28 4 0 3 3998 JN798578b (3995–4001) A7 FJ445176 A88 A36 A88 A88 A88 3 0 0 6576 DQ473503 A58 A36 A36 A89 A58 A58 A89 A89 A76 EU840726 A76 A56 A76 A76 A56 7 0 7 5130 (except FJ445182 JN815238 (5129–5131) DQ473502) JX074055 JX074049 A54 FJ445138 A56 A31 A56 A40 A31 6 2 5 3316 A98 FJ445139 FJ445173 A40 A47 A40 A85 A47 (3238–3377) A85 A85 A31 FJ445126 A62 A98 A62 A100 A98 6 0 5 5224 A47 FJ445133 A25 A54 A25 A62 A54 5424 JN837692 A44 A44 A25 (5416–5435) GQ223229 A29 A29 A44 JN815252 JN815252 A29 JN815252 A18 FJ445118 A34 A38 A34 A66 A85 7d 9 5 3397 F292/8643c A50 A40 A50 A77 (3328–3526) JF781496 A85 5276 JF781510 A77 (5024–5577) A60 A54 A98 A31 A47 A90 A66 A24 A39 AY751783 A63 A2 A63 A2 A2 4 5 0 3431 A59 A23 A59 A23 A23 A98 A30 A98 A30 A30 A54 A49 A54 A49 A49 A56 A56 A40 A40 A85 A85 A66 FJ445148 A77 A24 A77 A77 A24 6 0 6 5652 JN112340 A90 (5645–5674) JN621246 JQ837715

123 C. L. McIntyre et al.

Table 2 continued HRV type Accession numbers Nearest neighbour Number of changesa B/p mean VP1 3Dpol P1 P2 P3 VP1 ? 3Dpol P1 ? P2 P2 ? P3 (range)

A38 FJ445180 A15 A18 A15 A19 A18 7e 414f 5130 DQ473495 A74 A40 A74 A60 A40 (5118–5143) JQ994496 A60 A85 A9 A85 JN541272 A77 A32 A77 A54 A67 A54 A98 A60 A98 A31 A31 A47 A47 A90 A60 A66 A24 A60 A60 FJ445143 A15 A18 A15 A19 A18 8g 3h 10 5117 JN798590 A74 A40 A74 A60 A40 (5111–5123) A60 A85 A9 A85 A77 A32 A77 A54 A67 A54 A98 A60 A98 A31 A31 A47 A47 A90 A60 A66 A24 A60 a ‘‘Number of changes’’ refers to the number of bootstrap-supported nodes a group of sequences violates to form a new grouping b Sequences indicated in bold font are those that are not present on VP1 and 3Dpol trees due to sequence gaps c Sequences indicated in bold italics are those that consist only of non-consecutive sequence fragments (VP4/VP2, VP1 and 3Dpol) and are therefore not represented on P1, P2 and P3 trees d Due to lack of bootstrap support for nearest-neighbour grouping in 3Dpol for A18, node changes are taken from grouping with HRV-A34 to HRV-A85 (nearest bootstrap supported neighbour in P3 region) e Similarly, node changes for HRV-A38 between VP1 and 3Dpol are taken from grouping with HRV-A60 to HRV-A18 f Node changes for HRV-A38 between P2 and P3 are taken from HRV-A60 to HRV-A40 g Node changes for HRV-A60 between VP1 and 3Dpol are taken from HRV-A38 to HRV-A77 h Node changes for HRV-A60 between P2 and P3 are taken from HRV-A38 to HRV-A40 comparisons between 0.25 and 0.32 VP1 p-distance (col- All phylogenetic incongruities listed were additionally oured red in Fig. 4) corresponded to the formation of this confirmed by RDP, GARD and SBP analysis (data not new 3Dpol clade. Putative recombination events involving shown). HRV-A types from all three clades were further analysed (Table 2). Analysis of putative recombination breakpoints Sequence groupings in phylogenetic trees constructed for whole P1, P2 and P3 coding regions were largely In order to determine putative breakpoints in HRV-A consistent with those observed in VP1 and 3Dpol; P1 was sequences showing evidence for recombination between similar to the VP1 tree and P2/P2 was similar to 3Dpol VP1 and 3Dpol, full genomes from each HRV-A type (Fig. S2; Supplementary Data). Sequence analysis of these underwent GroupScan analysis and were compared directly longer regions did, however, identify bootstrap-supported to their nearest neighbours in P1, P2 and P3. Two repre- phylogenetic incongruity of a further six HRV-A types sentative examples of Grouping Scan output are given, between regions (yellow lines in Fig. 1; listed in Table S1; with putative recombination breakpoints highlighted Supplementary Data). (Fig. 6A). HRV-A18 has a breakpoint near the P1/P2

123 Recombination in the evolution of human rhinovirus genomes

Fig. 6 A: GroupScan of two HRV-A18 : FJ445118 representative recombinant A 0.9 HRV-A types. The proposed 0.8 recombination breakpoint is 0.7 indicated by a dotted line. HRV-A34 0.6 B: Diagram showing calculated HRV-A50 coding region recombination 0.5 HRV-A66 HRV-A77 breakpoints for all HRV-A 0.4 recombinant types, mapped onto 0.3

Grouping Score a diagram of the full genome of 0.2

HRV-A sequences (numbering 0.1 taken from FJ445111–HRV- 0 A1). Those from the large 927 1521 2085 2661 3188.16 3741 4320 4908 5457 6057 6642 recombinant 3Dpol clade, which Genome position lacked bootstrap support for HRV-A63 : FJ445146 nearest-neighbour groupings in 0.9 the analysed regions, are 0.8 excluded 0.7 HRV-A59 0.6 HRV-A10 0.5 HRV-A56 HRV-A76c 0.4

Grouping Score Grouping 0.3

0.2

0.1

0 927 1521 2085 2661 3188.16 3741 4320 4908 5457 6057 6642 Genome position B P1 P2 P3

VP4 VP2 VP3 VP1 2A 2B 2C 3A 3C 3D

627 1000 2000 3000 4000 5000 6000 7097 boundary, and HRV-A63 has a breakpoint within P3. The not shown). In contrast, HRV-A and over two-thirds genome fragment between positions 2000 and 2500 in (69 %) of HRV-C sequences were interspersed in the HRV-A63, which shows a similarly elevated grouping 5’UTR, consistent with inter-species recombination, as score with HRV-A59, did not correspond to any bootstrap- proposed previously [17, 38, 68]. supported change in tree topology in this region (data not Inspection of HRV-A sequences exclusively revealed 17 shown). putative recombinant sequences, from nine HRV-A types The majority of breakpoints occur near the P1/P2 (black boxes; Fig. 7A), demonstrated by bootstrap-sup- boundary, specifically within the 2A coding region ported changes in the nearest neighbour in phylogenetic (Fig. 6B). There was also a second putative hotspot region trees. Two putative recombinant sequences (JQ837724 : around the P2/P3 boundary, and most of these occurred HRV-A1 and JN621245 : HRV-A67) represent contempo- within the 3A coding region or at the 3A/3B junction. rary isolates, which form a bootstrap-supported grouping distant from other members of the same type. Within HRV-B Recombination in the 5’ untranslated region of HRV-A, 5’UTR sequences (Fig. 7B), only three putative recombinant -B and –C types were noted, and two consisted of a single sequence, which was the sole representative of the type in question. All sequences that were more than 90 % complete across the 5’UTR fragment numbered 167–626 (numbered by FJ445111) were included in the analysis (167 HRV-A, 38 Discussion HRV-B and 59 HRV-C). The occurrence of recombination in the 5’UTR was assessed by visual inspection of phylo- Identification of novel HRV strains genetic trees for HRV-A (Fig. 7A), HRV-B (Fig. 7B) and HRV-A/HRV-C (Fig. S3: Supplementary Data). Novel strains of HRV representing new genotypically All 5’UTR sequences from HRV-B clustered together assigned types have been described frequently during and were distinct from those of HRV-A and HRV-C (data genetic analyses of HRV variants identified during routine

123 C. L. McIntyre et al.

AB5’UTR VP1 5’UTR VP1 100 HRV-A82 100 100 HRV-B3 99 HRV-A82 HRV-B3 HRV-A1 : JQ837724 100 95 100 HRV-B35 82 HRV-A22 HRV-B37 HRV-A22 100 97 100 83 HRV-A64 HRV-B79 99 HRV-B6 85 HRV-A64 100 HRV-A94 75 86 HRV-B92 100 HRV-B14 96 HRV-A94 100 99 HRV-A19 96 HRV-B83 100 HRV-B72 HRV-A19 100 HRV-A38 99 82 HRV-B37 92 100 HRV-A13 100 HRV-B4 99 HRV-A60 90 HRV-B6 86 HRV-A41 HRV-B97 93 HRV-A74 100 78 HRV-B14 HRV-A73 99 HRV-A15 HRV-B93 100 98 HRV-B72 HRV-A96 90 HRV-A67 : JN621245 73 HRV-B27 100 100 97 HRV-A61 100 HRV-B86 86 97 HRV-A67 : FJ445149 HRV-B35 HRV-A1 : D00239 100 HRV-A32 73 HRV-B4 77 HRV-B79 99 HRV-A1 : FJ445111. 100 HRV-A9 100 HRV-B26 100 HRV-B92 HRV-A1 : JN815255 100 100 HRV-A13 HRV-B99 99 HRV-A43 HRV-B83 78 HRV-A41 83 HRV-B5 99 HRV-B86 HRV-A75 HRV-A73 99 90 HRV-B42 100 HRV-B84 93 HRV-A74 HRV-A96 100 HRV-B97 HRV-A15 100 HRV-A61 HRV-B26 80 100 100 HRV-A67 100 HRV-A43 HRV-B99 HRV-B93 99 100 99 HRV-A32 87 HRV-A75 87 HRV-B5 96 HRV-B27 100 100 76 HRV-A9 100 98 98 HRV-A34 HRV-B42 HRV-B84 HRV-A60 89 99 HRV-A50 HRV-B52 99 HRV-B52 99 HRV-A23 HRV-A18 100 HRV-B48 HRV-B48 HRV-A30 100 97 99 99 84 HRV-A57 94 100 HRV-B69 94 HRV-B69 HRV-A2 100 HRV-A55 80 HRV-A49 99 HRV-B91 HRV-B91 97 HRV-A21 99 100 HRV-B17 HRV-A10 100 HRV-A24 100 HRV-B17 94 87 HRV-A100 HRV-A90 HRV-B70 89 HRV-B70 HRV-A77 99 76 HRV-A11 NC 001430 NC 001430 HRV-A66 92 99 100 HRV-A33 HRV-A31 81 HRV-A76 0.05 0.1 100 98 HRV-A47 99 80 HRV-A23 100 HRV-A56 HRV-A30 HRV-A46 100 100 HRV-A2 100 HRV-A34 96 99 HRV-A49 100 HRV-A18 100 99 HRV-A81 92 HRV-A21 100 HRV-A16 HRV-A80 100 99 100 HRV-A1 : D00239 97 99 HRV-A59 100 HRV-A : JN815255 HRV-A63 HRV-A1 : JQ837724 HRV-A50 98 HRV-A1 : FJ445111. 99 HRV-A76 100 100 HRV-A40 HRV-A33 HRV-A85 99 90 HRV-A20 88 93 HRV-A56 100 HRV-A68 79 HRV-A98 HRV-A90 86 100 HRV-A54 99 89 HRV-A24 85 HRV-A63 HRV-A11 100 HRV-A59 100 98 HRV-A57 HRV-A39 99 HRV-A38 100 HRV-A10 72 100 HRV-A54 HRV-A100 98 HRV-A54 HRV-A77 96 HRV-A40 100 HRV-A66 98 100 HRV-A85 100 100 100 HRV-A47 HRV-A62 HRV-A31 85 99 HRV-A25 HRV-A62 100 100 71 JN815252 HRV-A25

96 HRV-A29 92 JN815252 72 HRV-A44 100 HRV-A29 99 HRV-A55 100 HRV-A44 98 HRV-A81 100 HRV-A7 92 91 99 HRV-A16 HRV-A88 99 100 HRV-A39 100 HRV-A58 100 99 HRV-A8 85 HRV-A89 HRV-A95 99 HRV-A36 100 99 HRV-A28 86 HRV-A12 98 HRV-A53 HRV-A78 99 100 HRV-A45 100 100 HRV-A80

83 HRV-A12 HRV-A46 99 100 75 HRV-A78 HRV-A28 79 HRV-A101 HRV-A53 99 92 97 100 99 100 93 HRV-A7 HRV-A20 HRV-A88 100 HRV-A68 100 99 HRV-A67 : JN621245 100 HRV-A101 99 99 78 HRV-A58 100 HRV-A65 86 HRV-A89 HRV-A51 78 HRV-A36 100 100 87 HRV-A103 99 HRV-A65 99 HRV-A71 100 77 HRV-A51 HRV-A45 HRV-A103 97 HRV-A8 94 99 HRV-A71 100 96 HRV-A95 NC 001430 NC 001430

0.02 0.05 Fig. 7 A: Neighbour-joining phylogenetic trees constructed using B: Neighbour-joining phylogenetic trees constructed using HRV-B HRV-A 5’UTR sequences that were [90 % complete from 167-626 5’UTR sequences that were[90 % complete from 167-626 (numbered (numbered according to FJ445111), compared to the corresponding according to FJ445111), compared to the corresponding VP1 region. VP1 region. Putative recombinant sequences are marked by a black line. Putative recombinant sequences are marked by a black line

123 Recombination in the evolution of human rhinovirus genomes clinical screening [6, 32, 52, 67]. This suggests that there is Analysis of recombination in HRV-A, HRV-B still potential for as yet undiscovered genetic variation and HRV-C within all three species of HRV. Consistent with these previous studies, our analysis identified 13 putative new A small number of recombination events have been types of HRV-B and HRV-C. These putative new types hypothesized to have played a role in the formation of a have been submitted for consideration to the Picornavirus number of HRV-A and HRV-B types [49]. More specula- Study Group and have been assigned new type numbers. tively, it has also been suggested that the entire HRV-B species was formed by recombination of HEV with HRV-A Estimation of mean substitution rates and analysis [64]. of pairwise p-distances In common with one previous analysis [49], we found HRV-A to have the highest number of recombinant This study represents the first comparison of mean substi- sequences among circulating strains, with only very tution rates between distant genome regions of HRV-A and infrequently detected recombination events within the HRV-C. The estimates of mean substitution rates obtained coding regions of HRV-B and HRV-C. in both regions correlate with other published estimates for HRV-A sequences are divided into four bootstrap-sup- single-strand RNA viruses [9, 18]. Most previously docu- ported clades, which were consistently observed in the mented substitution rate analysis for picornaviruses has regions analysed in this study (VP4/VP2, VP1 and 3Dpol). focused on the capsid coding region, particularly VP1 Further analysis indicated that these distinct clades were, in [5, 16]. In one comparison of several picornaviruses, it was fact, present in every gene region, except for the relatively found that members of the genus Enterovirus (EV) had short and highly conserved VP4 and 3B regions. Recom- significantly higher mean substitution rates than other bination appears to have occurred relatively frequently members of the family Picornaviridae (including members within clades 1-3 (Fig. 1). However, there was no evidence of the genera Aphthovirus, Teschovirus, Hepatovirus and of recombination occurring between the clades within any Cardiovirus), with very limited overlap of the 95 % HPD coding region, and therefore, we would suggest that these intervals between the two groups [16]. Our estimates for HRV-A clades are potentially now diverging on distinct both HRV-A and HRV-C showed overlapping 95 % HPD and evolutionarily separate paths. intervals with EV and non-EV examples cited in this paper. Within these four clades, most evidence of recombina- Two studies of the evolution of serotypes of EV have tion involves full HRV type groups, as opposed to indi- included estimates of substitution rates in the 3Dpol region vidual sequences. This would suggest that these events are [16, 41], and these are of the same order of magnitude as likely to have taken place during the process of diversifi- our estimates for HRV-A and HRV-C. cation into genetically distinct types. Similar to our observations for HRV-A and HRV-C, a One previous study, considering only one example from general congruency of substitution rates between VP1 and most HRV types, detected 23 recombinant genomes 3Dpol has previously been observed in the evolution of resulting from 12 different recombination events [49]. Our recombinant groups of EV71 [40] and recombinant groups analysis identified events that are analogous to most of of three HEV-B serotypes: E9, E11 and E30 [41]. these. However, as we used bootstrap-supported phyloge- As has previously been observed in analysis of EV71 netic trees to define nearest neighbours in different coding isolates [40], similar mean rates of nucleotide substitution regions, our estimates of recombination parents occasion- between two distant genome regions should lead to a dis- ally differed (Table 2). As the detailed phylogeny of tribution of pairwise p-distances in these two regions with a nearest-neighbour relationships are not well resolved, gradient of 1 and a y-intercept of 0. This suggests that any particularly in the non-structural protein-coding regions, discontinuities or outlying comparisons in distributions of we felt it was useful to consider the nearest bootstrap pairwise p-distances in these two regions relate to potential supported clade members as representing nearest-neigh- recombination events (coloured red and yellow in Fig. 4). bour groups. Additionally, in some cases, a recombinant Outlying data points were typically the result of compari- sequence behaves as an outgroup to a new clade, and son with variants that displayed bootstrap-supported therefore, we have treated the entire new clade as the single changes in tree topology in the two regions. The most nearest neighbour. This approach was facilitated by the use striking example of this is a large cluster of recombinant of GroupScan to confirm recombinant sequences and to sequences between 0.23 and 0.34 pairwise nucleotide determine putative recombination breakpoints. In prefer- p-distance in VP1 for HRV-A (coloured red in Fig. 4), ence to bootscanning, this method determines the extent of which correspond to the HRV-A type groups in clade 1 grouping of a query sequence within entire pre-defined (Fig. 1) that undergo a bootstrap-supported change in tree groups, rather than losing potentially phylogenetically topology between VP1 and 3Dpol. informative sequence data by reducing control groups to 123 C. L. McIntyre et al. single consensus sequences. This eliminates the issue of putative recombinant sequences that were identified pre- bias of results when a query sequence is genetically viously displayed inadequate bootstrap support on phylo- divergent from both control groups [60]. genetic trees (Fig. 7A). Our results also affirm the previously reported finding Recombination involving the 5’UTR of HRV-B was that contemporary HRV-A76 sequences have undergone a relatively rare, and several previously reported putative recombination event with HRV-A56 [65]. In addition to the recombinants showed no bootstrap support [49]. However, originally reported single recombinant sequence, we have the 5’UTR region of HRV-B has not been extensively noted that all contemporary isolates group together and sequenced to date, and only nine HRV-B types were rep- therefore that the recombinant group of contemporary resented by more than one sequence. HRV-A76 sequences is likely to be the present dominant Similarly to EV [55], HRV showed evidence of inter- lineage. This pattern was observed in several other HRV species recombination within the 5’UTR region. The types, indicating that while contemporary recombination majority of analysed HRV-C 5’UTR sequences had a occurring within the time period of our dataset (around 50 bootstrap-supported grouping within the HRV-A clade years) is not frequent, it can occur. (Fig. S3: Supplementary Data). HRV-C sequences were not In contrast to the extensive phylogenetic incongruity interspersed evenly throughout the HRV-A clade, but all observed in HRV-A, HRV-B and HRV-C show relatively fell within three distinct clades. These findings are in infrequent or entirely absent recombination within the accordance with other reports of inter-species recombina- coding region. One putative recombinant sequence, tion in this region [17, 38]. HQ123444, belonging to the HRV-B species has been reported previously [32]. However, the previous analysis Analysis of recombination breakpoints used different methods for determining recombinant within the coding region sequences and did not detect the 100 % identity of this sequence with HRV-B35 sequences in the 3’ end of 3Dpol, Recombination breakpoints were determined for all puta- instead noting an association with HRV-B35 throughout. In tive recombinant HRV-A sequences with well-supported fact, this sequence is identical to published strains from nearest-neighbour groups (Table 2). Similarly to EV [30, position 6353 onwards. One might speculate on possible 34] and HPeV [2], the majority of breakpoints are near the explanations for this observation, as no contemporary P1/P2 junction. Coupled with the observation of high HRV-B35 full genomes are available for comparison. It is within-species divergence within the capsid coding region, possible that this putative recombinant may also represent a this suggests that there may be a biological compatibility sequencing or assembly error. barrier that decreases the reproductive fitness of progeny In accordance with our previous study, there is a striking that are recombinant within the capsid region. Therefore, lack of recombination within HRV-C coding regions [38]. although inter-typic recombination may occur throughout However, in both cases, HRV-C analysis is limited by the the full length of the genome in the course of a natural lack of strains older than 15 years. In HRV-A and HRV-B, co-infection, recombination within the capsid region may all instances of contemporary sequences grouping sepa- not be represented in clinical sampling. rately from older strains involve a sequenced example that This study demonstrates that, despite a close phyloge- was originally collected over 40 years ago. It is therefore netic relationship to EV, recombination within all three possible that HRV-C sequences undergo recombination at species of HRV was substantially less frequent than that similar frequencies, but the current available time span of observed within EV species. In addition, there was only sequences renders their detection problematic. very limited evidence of recombination occurring within the 40-50 years that the dataset spanned. Following upon Recombination within the 5’UTR of HRVs previous reports that have provided some glimpses of the occurrence of recombination in HRV-A [49, 65], our study Within HRV-A sequences, the 5’UTR was the only region represents a comprehensive analysis of recombination in analysed in which clades 1 to 4 were not maintained. We all published HRV full genome sequences to date, an noted considerably more recombination events within the additional analysis of HRV sequence fragments obtained 5’UTR of HRV-A than were seen in previous studies [49]. from isolates spanning a 14-year time period and, to our This included two instances of contemporary sequences knowledge, the first attempt to catalogue all currently grouping distantly from other members of their type group. detectable recombination events and potential breakpoints The discrepancy in results could be due to different within HRV-A. alignment methods and the fact that we relied upon com- parisons of robust groupings within phylogenetic trees to Acknowledgments The authors are grateful to Eleanor Gaunt for determine recombinant sequences. In addition, several valuable assistance received in the preliminary stages of this work. 123 Recombination in the evolution of human rhinovirus genomes

The authors are also grateful to Heli Harvala, Kate Templeton, Peter 15. Harvala H, Robertson I, McWilliam Leitch EC, Benschop K, McCullough, Julie White, Mary Notman, Eleanor Leslie and Carol Wolthers KC, Templeton K, Simmonds P (2008) Epidemiology Thomson for providing samples, data and other virus testing results and clinical associations of human parechovirus respiratory from the respiratory sample archive. Chloe McIntyre’s PhD student- infections. J Clin Microbiol 46:3446–3453 ship was funded by the Medical Research Council. 16. Hicks AL, Duffy S (2011) Genus-Specific Substitution Rate Variability among Picornaviruses. J Virol 85:7942–7947 Conflict of interest The authors declare that they have no conflict 17. Huang T, Wang W, Bessaud M, Ren P, Sheng J, Yan H, of interest. Zhang J, Lin X, Wang Y, Delpeyroux F, Deubel V (2009) Evidence of recombination and genetic diversity in human rhinoviruses in children with acute respiratory infection. PLoS One 4:e6355 18. Jenkins GM, Rambaut A, Pybus OG, Holmes EC (2002) Rates of References molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J Mol Evol 54:156–165 1. Arden KE, McErlean P, Nissen MD, Sloots TP, Mackay IM 19. Kapikian A, Conant R, Hamparian V (1967) Rhinoviruses: a (2006) Frequent detection of human rhinoviruses, paramyxovi- numbering system. Nature 213:761–762 ruses, coronaviruses, and bocavirus during acute respiratory tract 20. Khetsuriani N, Lu X, Teague WG, Kazerouni N, Anderson LJ, infections. J Med Virol 78:1232–1240 Erdman DD (2008) Novel human rhinoviruses and exacerbation 2. Benschop KS, de Vries M, Minnaar RP, Stanway G, van der of asthma in children. Emerg Infect Dis 14:1793–1796 Hoek L, Wolthers KC, Simmonds P (2010) Comprehensive full- 21. Kistler A, Avila PC, Rouskin S, Wang D, Ward T, Yagi S, length sequence analyses of human parechoviruses: diversity and Schnurr D, Ganem D, DeRisi JL, Boushey HA (2007) Pan-viral recombination. J Gen Virol 91:145–154 screening of respiratory tract infections in adults with and without 3. Blomqvist S, Savolainen C, Ra˚man L, Ra L, Roivainen M, Hovi asthma reveals unexpected human coronavirus and human rhi- T (2002) Human Rhinovirus 87 and Enterovirus 68 represent a novirus diversity. J Infect Dis 196:817–825 unique serotype with rhinovirus and enterovirus features. J Clin 22. Kistler AL, Webster DR, Rouskin S, Magrini V, Credle JJ, Micro 40:4218–4223 Schnurr DP, Boushey HA, Mardis ER, Li H, DeRisi JL (2007) 4. Bochkov YA, Palmenberg AC, Lee W-M, Rathe JA, Amineva Genome-wide diversity and selective pressure in the human rhi- SP, Sun X, Pasic TR, Jarjour NN, Liggett SB, Gern JE (2011) novirus. Virol J 4:40 Molecular modeling, organ culture and reverse genetics for a 23. Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost newly identified human rhinovirus C. Nat Med 17:627–632 SDW (2006) Automated phylogenetic detection of recombination 5. Calvert J, Chieochansin T, Benschop KS, McWilliam Leitch EC, using a genetic algorithm. Mol Biol Evol 23:1891–1901 Drexler JF, Grywna K, da Costa Ribeiro H, Drosten C, Harvala 24. Kusel MM, de Klerk NH, Holt PG, Kebadze T, Johnston SL, Sly H, Poovorawan Y, Wolthers KC, Simmonds P (2010) Recombi- PD (2006) Role of respiratory viruses in acute upper and lower nation dynamics of human parechoviruses: investigation of type- respiratory tract illness in the first year of life: a birth cohort specific differences in frequency and epidemiological correlates. study. Pediatr Infect Dis J 25:680–686 J Gen Virol 91:1229–1238 25. Laine P, Savolainen C, Blomqvist S, Hovi T (2005) Phylogenetic 6. Chidlow GR, Laing IA, Harnett GB, Greenhill AR, Phuanuk- analysis of human rhinovirus capsid protein VP1 and 2A protease oonnon S, Siba PM, Pomat WS, Shellam GR, Smith DW, Leh- coding sequences confirms shared genus-like relationships with mann D (2012) Respiratory viral pathogens associated with lower human enteroviruses. J Gen Virol 86:697–706 respiratory tract disease among young children in the highlands 26. Lamson D, Renwick N, Kapoor V, Liu Z, Palacios G, Ju J, Dean of Papua New Guinea. J Clin Virol 54:235–239 A, St George K, Briese T, Lipkin WI (2006) MassTag poly- 7. Cooney MK, Fox JP, Kenny GE (1982) Antigenic groupings of merase-chain-reaction detection of respiratory pathogens, 90 rhinovirus serotypes. Infect Immun 37:642–647 including a new rhinovirus genotype, that caused influenza-like 8. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolu- illness in New York State during 2004–2005. J Infect Dis tionary analysis by sampling trees. BMC Evol Biol 7:214 194:1398–1402 9. Duffy S, Shackelton LA, Holmes EC (2008) Rates of evolu- 27. Ledford RM, Patel NR, Demenczuk TM, Watanyar A, Herbertz tionary change in viruses: patterns and determinants. Nat Rev T, Collett MS, Pevear DC, Irol JV (2004) VP1 Sequencing of All Genet 9:267–276 Human Rhinovirus Serotypes : Insights into Genus Phylogeny 10. Gibbs MJ, Armstrong JS, Gibbs AJ (2000) Sister-scanning: a and Susceptibility to Antiviral Capsid-Binding Compounds. Monte Carlo procedure for assessing signals in recombinant J Virol 78:3663–3674 sequences. Bioinformatics 16:573–582 28. Lee WM, Kiesner C, Pappas T, Lee I, Grindle K, Jartti T, Jakiela 11. Guindon S, Gascuel O (2003) A Simple, Fast, and Accurate B, Lemanske RF, Shult PA, Gern JE (2007) A diverse group of Algorithm to Estimate Large Phylogenies by Maximum Likeli- previously unrecognized human rhinoviruses are common causes hood. Syst Biol 52:696–704 of respiratory illnesses in infants. PLoS One 3:e966 12. Hamparian VV, Colonno RJ, Cooney MK, Dick EC, Gwaltney 29. Lewis-Rogers N, Bendall ML, Crandall KA (2009) Phylogenetic JM, Hughes JH, Jordan WS, Kapikian AZ, Mogabgab WJ, Monto relationships and molecular adaptation dynamics of human rhi- A (1987) A collaborative report: rhinoviruses–extension of the noviruses. Mol Biol Evol 26:969–981 numbering system from 89 to 100. Virology 159:191–192 30. Lindberg AM, Andersson P, Savolainen C, Mulders MN, Hovi T 13. Han TH, Chung JY, Hwang ES, Koo JW (2009) Detection of (2003) Evolution of the genome of Human enterovirus B: human rhinovirus C in children with acute lower respiratory tract incongruence between phylogenies of the VP1 and 3CD regions infections in South Korea. Arch Virol 154:987–991 indicates frequent recombination within the species. J Gen Virol 14. Harvala H, McIntyre CL, McLeish NJ, Kondracka J, Palmer J, 84:1223–1235 Molyneaux P, Gunson R, Bennett S, Templeton K, Simmonds P 31. Linsuwanon P, Payungporn S, Samransamruajkit R, Posuwan N, (2012) High detection frequency and viral loads of human rhi- Makkoch J, Theanboonlers A, Poovorawan Y (2009) High novirus species A to C in fecal samples; diagnostic and clinical prevalence of human rhinovirus C infection in Thai children with implications. J Med Virol 84:536–542 acute lower respiratory tract disease. J Infect 59:115–121

123 C. L. McIntyre et al.

32. Linsuwanon P, Payungporn S, Suwannakarn K, Chieochansin T, 49. Palmenberg AC, Spiro D, Kuzmickas R, Wang S, Djikeng A, Theamboonlers A, Poovorawan Y (2011) Complete coding Rathe JA, Fraser-Liggett CM, Liggett SB (2009) Sequencing and sequence characterization and comparative analysis of the puta- analyses of all known human rhinovirus genomes reveal structure tive novel human rhinovirus (HRV) species C and B. Virol J 8:5 and evolution. Science 324:55–59 33. Lukashev AN, Lashkevich VA, Ivanova OE, Koroleva GA, 50. Peltola V, Waris M, Osterback R, Susi P, Ruuskanen O, Hyypia¨ T Hinkkanen AE, Ilonen J (2003) Recombination in Circulating (2008) Rhinovirus transmission within families with children: Enteroviruses. J Virol 77:10423–10431 incidence of symptomatic and asymptomatic infections. J Infect 34. Lukashev AN, Lashkevich VA, Ivanova OE, Koroleva GA, Dis 197:382–389 Hinkkanen AE, Ilonen J (2005) Recombination in circulating 51. Posada D, Crandall KA (2001) Evaluation of methods for Human enterovirus B: independent evolution of structural and detecting recombination from DNA sequences: computer simu- non-structural genome regions. J Gen Virol 86:3281–3290 lations. Proc Natl Acad Sci USA 98:13757–13762 35. Martin DP, Posada D, Crandall KA, Williamson C (2005) A 52. Rathe JA, Liu X, Tallon LJ, Gern JE, Liggett SB (2010) Full- modified bootscan algorithm for automated identification of genome sequence and analysis of a novel human rhinovirus strain recombinant sequences and recombination breakpoints. AIDS within a divergent HRV-A clade. Arch Virol 155:83–87 Res Hum Retroviruses 21:98–102 53. Renwick N, Schweiger B, Kapoor V, Liu Z, Villari J, Bullmann 36. Martin DP, Lemey P, Lott M, Moulton V, Posada D, Lefeuvre P R, Miething R, Briese T, Lipkin WI (2007) A recently identified (2010) RDP3: a flexible and fast computer program for analyzing rhinovirus genotype is associated with severe respiratory-tract recombination. Bioinformatics 26:2462–2463 infection in children in Germany. J Infect Dis 196:1754–1760 37. Martin D, Rybicki E (2000) RDP: detection of recombination 54. Saitou N, Nei M (1987) The neighbor-joining method: a new amongst aligned sequences. Bioinformatics 16:562–563 method for reconstructing phylogenetic trees. Mol Biol Evol 38. McIntyre CL, McWilliam Leitch EC, Savolainen-Kopra C, Hovi 4:406–425 T, Simmonds P (2010) Analysis of genetic diversity and sites of 55. Santti J, Hyypia¨ T, Kinnunen L, Salminen M (1999) Evidence of recombination in human rhinovirus species C. J Virol recombination among enteroviruses. J Virol 73:8741–8749 84:10297–10310 56. Savolainen C, Blomqvist S, Mulders MN, Hovi T (2002) Genetic 39. McWilliam Leitch EC, Bendig J, Cabrerizo M, Cardosa J, Hyypia¨ clustering of all 102 human rhinovirus prototype strains: serotype T, Ivanova OE, Kelly A, Kroes AC, Lukashev A, MacAdam A, 87 is close to human enterovirus 70. J Gen Virol 83:333–340 McMinn P, Roivainen M, Trallero G, Evans DJ, Simmonds P 57. Savolainen C, Mulders MN, Hovi T (2002) Phylogenetic analysis (2009) Transmission networks and population turnover of echo- of rhinovirus isolates collected during successive epidemic sea- virus 30. J Virol 83:2109–2118 sons. Virus Res 85:41–46 40. McWilliam Leitch EC, Cabrerizo M, Cardosa J, Harvala H, 58. Simmonds P (2012) SSE: a nucleotide and amino acid sequence Ivanova OE, Koike S, Kroes AC, Lukashev AN, Perera D, analysis platform. BMC Res Notes 5:50 Roivainen M, Susi P, Trallero G, Evans DJ, Simmonds P (2012) 59. Simmonds P, McIntyre C, Savolainen-Kopra C, Tapparel C, The association of recombination events in the founding and Mackay IM, Hovi T (2010) Proposals for the classification of emergence of subgenogroup evolutionary lineages of human human rhinovirus species C into genotypically assigned types. enterovirus 71. J Virol 86:2676–2685 J Gen Virol 91:2409–2419 41. McWilliam Leitch EC, Cabrerizo M, Cardosa J, Harvala H, 60. Simmonds P, Midgley S (2005) Recombination in the genesis and Ivanova OE, Kroes AC, Lukashev A, Muir P, Odoom J, Roi- evolution of hepatitis B virus genotypes. J Virol 79:15467–15476 vainen M, Susi P, Trallero G, Evans DJ, Simmonds P (2010) 61. Simmonds P, Welch J (2006) Frequency and dynamics of Evolutionary dynamics and temporal/geographical correlates of recombination within different species of human enteroviruses. recombination in the human enterovirus echovirus types 9, 11, J Virol 80:483–493 and 30. J Virol 84:9292–9300 62. Smith JM (1992) Analyzing the mosaic structure of genes. J Mol 42. McWilliam Leitch EC, Harvala H, Robertson I, Ubillos I, Tem- Evol 34:126–129 pleton K, Simmonds P (2009) Direct identification of human 63. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S enterovirus serotypes in cerebrospinal fluid by amplification and (2011) MEGA5: molecular evolutionary genetics analysis using sequencing of the VP1 region. J Clin Microbiol 44:119–124 maximum likelihood, evolutionary distance, and maximum par- 43. Miller EK, Edwards KM, Weinberg GA, Iwane MK, Griffin MR, simony methods. Mol Biol Evol 28:2731–2739 Hall CB, Zhu Y, Szilagyi PG, Morin LL, Heil LH, Lu X, 64. Tapparel C, Junier T, Gerlach D, Cordey S, Van Belle S, Perrin Williams JV (2009) A novel group of rhinoviruses is associated L, Zdobnov EM, Kaiser L (2007) New complete genome with asthma hospitalizations. J Allergy Clin Immunol sequences of human rhinoviruses shed light on their phylogeny 123(98–104):e1 and genomic features. BMC Genomics 8:224 44. Oberste MS, Maher K, Pallansch MA (2004) Evidence for fre- 65. Tapparel C, Junier T, Gerlach D, Van-Belle S, Turin L, Cordey S, quent recombination within species human enterovirus B based Mu¨hlemann K, Regamey N, Aubert J-D, Soccal PM, Eigenmann on complete genomic sequences of all thirty-seven serotypes. P, Zdobnov E, Kaiser L (2009) New respiratory enterovirus and J Virol 78:855–867 recombinant rhinoviruses among circulating picornaviruses. 45. Oberste MS, Pen˜aranda S, Pallansch MA (2004) RNA recombi- Emerg Infect Dis 15:719–726 nation plays a major role in genomic change during circulation of 66. Tapparel C, L’Huillier AG, Rougemont AL, Beghetti M, Barazz- coxsackie B viruses. J Virol 78:2948–2955 one-Argiroffo C, Kaiser L (2009) Pneumonia and pericarditis in a 46. Oberste MS, Maher K, Kilpatrick DR, Flemister MR, Brown BA, child with HRV-C infection: a case report. J Clin Virol 45:157–160 Pallansch MA (1999) Typing of human enteroviruses by partial 67. Wisdom A, McWilliam Leitch EC, Gaunt E, Harvala H, Sim- sequencing of VP1. J Clin Microbiol 37:1288–1293 monds P (2009) Screening respiratory samples for detection of 47. Padidam M, Sawyer S, Fauquet CM (1999) Possible emergence human rhinoviruses (HRVs) and enteroviruses: comprehensive of new geminiviruses by frequent recombination. Virology VP4-VP2 typing reveals high incidence and genetic diversity of 265:218–225 HRV species C. J Clin Microbiol 47:3958–3967 48. Palmenberg AC, Rathe JA, Liggett SB (2010) Analysis of the 68. Wisdom A, Kutkowska AE, McWilliam Leitch EC, Gaunt E, complete genome sequences of human rhinovirus. J Allergy Clin Templeton K, Harvala H, Simmonds P (2009) Genetics, recom- Immunol 125:1190–1199 bination and clinical features of human rhinovirus species C

123 Recombination in the evolution of human rhinovirus genomes

(HRV-C) infections; interactions of HRV-C with other respira- 70. de Almeida MB, Zerbinati RM, Tateno AF, Oliveira CM, Roma˜o tory viruses. PLoS One 4:e8518 RM, Rodrigues JC, Pannuti CS, da Silva Filho LV (2010) Rhi- 69. Xatzipsalti M, Kyrana S, Tsolia M, Psarras S, Bossios A, Laza- novirus C and respiratory exacerbations in children with cystic Stanca V, Johnston SL, Papadopoulos NG (2005) Rhinovirus fibrosis. Emerg Infect Dis 16:996–999 viremia in children with respiratory infections. Am J Respir Crit Care Med 172:1037–1040

123 Journal of General Virology (2010), 91, 2409–2419 DOI 10.1099/vir.0.023994-0

Review Proposals for the classification of human rhinovirus species C into genotypically assigned types

Peter Simmonds,1 Chloe McIntyre,1 Carita Savolainen-Kopra,2 Caroline Tapparel,3 Ian M. Mackay4 and Tapani Hovi2

Correspondence 1Centre for Infectious Diseases, University of Edinburgh, Summerhall, Edinburgh EH9 1QH, UK Peter Simmonds 2National Institute for Health and Welfare, Mannerheimintie 166, FIN-00300 Helsinki, Finland [email protected] 3Central Laboratory of Virology, University of Geneva Hospitals, 1211 Geneva 14, Switzerland 4Emerging Viruses Group, Queensland Paediatric Infectious Diseases Laboratory, Department of Infectious Diseases, Royal Children’s Hospital, Queensland Children’s Medical Research Institute, The University of Queensland, Brisbane, Queensland 4029, Australia

Human rhinoviruses (HRVs) are common respiratory pathogens associated with mild upper respiratory tract infections, but also increasingly recognized in the aetiology of severe lower respiratory tract disease. Wider use of molecular diagnostics has led to a recent reappraisal of HRV genetic diversity, including the discovery of HRV species C (HRV-C), which is refractory to traditional virus isolation procedures. Although it is heterogeneous genetically, there has to date been no attempt to classify HRV-C into types analogous to the multiple serotypes identified for HRV-A and -B and among human enteroviruses. Direct investigation of cross-neutralization properties of HRV-C is precluded by the lack of methods for in vitro culture, but sequences from the capsid genes (VP1 and partial VP4/VP2) show evidence for marked phylogenetic clustering, suggesting the possibility of a genetically based system comparable to that used for the assignment of new enterovirus types. We propose a threshold of 13 % divergence for VP1 nucleotide sequences for type assignment, a level that classifies the current dataset of 86 HRV-C VP1 sequences into a total of 33 types. We recognize, however, that most HRV-C sequence data have been collected in the VP4/VP2 region (currently 701 sequences between positions 615 and 1043). We propose a subsidiary classification of variants showing .10 % divergence in VP4/ VP2, but lacking VP1 sequences, to 28 provisionally assigned types (subject to confirmation once VP1 sequences are determined). These proposals will assist in future epidemiological and clinical studies of HRV-C conducted by different groups worldwide, and provide the foundation for future exploration of type-associated differences in clinical presentations and biological properties.

Introduction (Arden et al., 2006; Kaiser et al., 2006; Lamson et al., 2006; Kistler et al., 2007; Lau et al., 2007; Lee et al., 2007; Human rhinoviruses (HRVs) are highly prevalent respi- McErlean et al., 2007; Renwick et al., 2007; Olenec et al., ratory pathogens, most commonly associated with mild 2010). upper respiratory tract disease and exacerbations of pre- existing respiratory disease such as asthma. They are also These newly characterized rhinoviruses have been pro- increasingly recognized as underlying more severe disease posed to belong to a novel species of rhinovirus manifestations, such as bronchiolitis in young children and (designated species C; HRV-C), recognizing their substan- in the immunosuppressed. The increasing use of molecular tial sequence divergence from other classified species methods for respiratory virus screening has contributed to within the genus Enterovirus of picornaviruses (Carstens, this reappraisal of rhinoviruses, as has the recent discovery 2010; Knowles, 2010) (Fig. 1a). Clinically and biologically, of an entirely novel rhinovirus group, refractory to they share many attributes with the other designated HRV previously used virus isolation methods but now known species, HRV-A and -B. Most studies of HRV-C disease to be highly prevalent and widely circulating worldwide associations, typically focused on children from asthmatic and/or hospital-based populations (Arden & Mackay, A full list of all HRV-C variants characterized to date, categorized into a 2010), have demonstrated a similarly broad range of total of 61 confirmed or provisionally assigned types, is available with the clinical outcomes to those observed in HRV-A and -B online version of this paper. infections and, indeed, with other respiratory viruses. Some

023994 G 2010 SGM Printed in Great Britain 2409 P. Simmonds and others

Fig. 1. (a) Sequence relationships between species currently assigned to the genus Enterovirus, depicted by phylogenetic analysis of aligned sequences from the P1 (capsid protein-encoding) region [positions 616– 3125, numbered according to the reference sequence (GenBank accession no. EF582385); species identified by different symbols and shading]. The tree was con- structed by neighbour-joining analysis of pair- wise amino acid p distances, with branches showing ¢70 % bootstrap support labelled. The sequence of porcine enterovirus 8 (genus Sapelovirus) was used as an outgroup. For presentation purposes, only the 10 most divergent sequences were included for spe- cies containing more than 10 sequences (HEV-A, -B, -C, HRV-A, -B and bovine enter- oviruses). Bar, amino acid p distance of 0.05. (b) Diagram of the rhinovirus genome, identifying the 59-and39-UTRs and structural (P1) and non-structural (P2, P3) gene regions, along with the designations of their encoded proteins. The genome is drawn to scale, using the complete genome sequence 024 (GenBank accession no. EF582385) for numbering. studies show no difference in clinical outcome between non-structural proteins collinear with those of other HRV species (Lau et al., 2007; Piotrowska et al., 2009), enteroviruses, including 3D (which encodes the RNA- whereas others provide evidence for a more frequent role of dependent RNA polymerase). HRV-C contains a type I HRV-C in lower respiratory tract disease, febrile wheeze in internal ribosomal entry site that is structurally similar to infants and toddlers, and asthma exacerbations in older and has short regions of striking sequence conservation with children (Lau et al., 2007; Khetsuriani et al., 2008; Miller those of other enteroviruses. Members of the genus et al., 2009a; Wisdom et al., 2009b). Contrastingly, one study Enterovirus do, however, differ in other aspects of their described a shorter duration of asthma symptoms and less genome organization. Most evident is the variability in the cough than seen in HRV-A infection (Arden et al., 2010a). position of the cis-acting replication element. This is located at a homologous position within the 2C coding sequence in HRV-C shares a number of features of its genomic all four human enterovirus (HEV) species A–D, but is organization with other members of the genus Enterovirus variable in position in each rhinovirus species [within 2A in (Fig. 1b). This includes an approximately 7100 base genome HRV-A (Gerber et al., 2001), VP1 in HRV-B and proposed containing a single reading frame, the absence of a leader to be located at the 59 end of VP2 in species C (Cordey et al., protein, a P1 region encoding four capsid proteins, a 2A 2008)]. In marked contrast to the ease with which HRV-A gene encoding a cis-acting proteinase, followed by a series of and -B can be isolated, HRV-C has, to date based on the

2410 Journal of General Virology 91 Classification proposals for HRV-C published literature, been unculturable in vitro (Lau et al., was found to belong to the species HEV-D as a variant of 2007; McErlean et al., 2007). Historically, this hindered its HEV-68 (Blomqvist et al., 2002). A strain referred to as discovery and has additionally precluded investigation of its Hanks was considered to represent a candidate new type antigenic variability, a feature characteristic of other HRVs and HEVs.

Proposal aims Genetic characterization of HRV-C amplified from clinical specimens has provided evidence for extensive heterogeneity in the VP4/VP2 region, the existence of two phylogenetically separate groups of sequences in the 59-UTR (one resembling sequences found in species A rhinoviruses; Han et al., 2009; Huang et al., 2009; Savolainen-Kopra et al., 2009b; Wisdom et al., 2009a) and substantial sequence divergence throughout the genome of the 11 full-length HRV-C sequences obtained to date (Lamson et al., 2006; Lau et al., 2007; Kistler et al., 2007; McErlean et al., 2007; Huang et al., 2009; Tapparel et al., 2009b; Arden et al., 2010b). Although we currently lack the means to classify HRV-C serologically (as has been achieved for other rhinovirus and enterovirus species), we recognize and are responding to the need to develop a classification system for this species. This will assist in organizing the rapidly accumulating sequence data currently being generated from virological and clinical studies, and allow assignment of uniform type descriptions that will enable comparison of genetic variants characterized in separate studies over time and across different geographical regions. This process is potentially made easier by the evident similarities in the pattern of sequence divergence of HRV- C to other rhinoviruses and enteroviruses for which classification methods have been developed and standar- dized. For example, the large number of distinct genetic lineages identifiable by sequence comparisons in the VP4/ VP2 region matches the diversity in this region shown by different serotypes of species A and B rhinoviruses (Lau et al., 2009). These differences are mirrored in VP1 (Huang et al., 2009; Wisdom et al., 2009b; McIntyre et al., 2010) and other exposed regions of the capsid that underlie the latter’s putative antigenic diversity. In developing clas- sification proposals for HRV-C, we have strived to develop criteria for type assignment that are consistent with the principles used for other rhinoviruses and enteroviruses, whilst acknowledging differences in its diversity, genetic history and biology.

Fig. 2. Distributions of pairwise nucleotide p (uncorrected) Classification of enteroviruses and rhinoviruses distances (y-axis) between HRV-A and -B variants around the into serotypes previously proposed thresholds (Savolainen et al., 2002) dividing inter- and intra-serotype distances in VP1 (lower panel) and VP4/ HRV-A and -B comprise a number of antigenically distinct VP2 (upper panel). Pairwise comparisons in the equivalent viruses designated on the basis of their cross-neutralization distance range for HRV-C are shown for comparison. Unfilled properties in vitro, currently totalling 74 serotypes of HRV- boxes indicate pairwise distances between HRV variants pre- A and 25 of HRV-B (Kapikian et al., 1967). These totals viously classified as the same serotype; black-filled boxes, pairwise incorporate minor adjustments to the original classification comparisons between different serotypes; grey boxes, pairwise when HRV was subsequently characterized genetically comparisons of variants with unknown serological cross-reactivity (Kapikian et al., 1971; Hamparian et al., 1987). As (HRV-C). For clarity, multiple examples of pairwise distances examples, one of the classical HRV prototypes, HRV-87, between the same (sero)type pairs have not been shown. http://vir.sgmjournals.org 2411 P. Simmonds and others but, on sequence analysis, it was found to be genetically and within serotypes exist (Savolainen et al., 2002). In similar to HRV-21 (Ledford et al., 2004; Laine et al., 2005). species A, pairwise divergence values in VP1 between serotypes 95 and 8 (1.6 %), serotypes 44 and 29 (7.3 %), In common with HEVs (Oberste et al., 1999), there is a serotypes 62 and 25 (9.4 %) and serotypes 98 and 54 close correlation between sequence divergence of HRV-A (11.4 %) are interspersed with those observed within and -B in the VP1 region (and other structural genes) and serotypes, as is the pairwise distance between the species their designated serotypes (Savolainen et al., 2002; Ledford B serotypes 70 and 17 (12.3 %) (Fig. 2, lower panel). et al., 2004; Laine et al., 2005). For HEVs, a nucleotide Overlaps in inter- and intra-serotype distances, often sequence divergence value of .25 % in VP1 (.15 % amino involving the same serotype pairs, are also evident from acid sequence difference) may be used as an alternative an equivalent analysis of pairwise distances of VP4/VP2 means to classify more recently discovered types without sequences (Fig. 2, upper panel). In two cases (95/8, 44/29), recourse to extensive serological characterization (Oberste reanalysis demonstrated that these pairs did indeed show et al., 1999). Application of this principle has led to the cross-neutralization (Cooney et al., 1982; Ledford et al., assignment of 40 genotypically defined enterovirus ‘types’ 2004), whereas one of the more divergent pairs, 62/25, did in addition to the 64 traditionally classified serotypes. not (Cooney et al., 1982). There is clearly scope to Thresholds of 12 % similarly differentiate inter- from intra- reinvestigate cross-reactivity of the other discrepant pairs. serotype divergences in the VP1 gene of species A and B Overall, however, the actual distributions of pairwise rhinoviruses, respectively (Savolainen et al., 2002; McIntyre distances between different serotypes of HEVs and et al., 2010), providing the means in principle to detect rhinoviruses overlap minimally with intra-serotype nuc- novel species A types (e.g. Wisdom et al., 2009b) without leotide distances (as exemplified by HEV-B and HRV-A; assaying for cross-neutralization (Savolainen et al., 2002). Fig. 3a, b). For enteroviruses, the lowest value between For rhinoviruses it is, however, recognized that some these two distributions closely matches the type-assign- inconsistencies and overlap of divergence values between ment threshold now used for genotypic assignment of new

Fig. 3. Distribution of pairwise nucleotide p distances between: (a) available complete (.90 %) VP1 sequences of HEV-B (n5330); (b) available complete (.90 %) VP1 sequences of HRV-A (n5279); (c) available VP1 sequences of HRV-C (n586; .90 % complete between positions 2304 and 3125); (d) all available VP4/ VP2 sequences of HRV-C (n5702; .90 % complete between positions 615 and 1043). Previously designated (HEV) or proposed type thresholds are indicated by thick lines.

2412 Journal of General Virology 91 Classification proposals for HRV-C types. We propose to adopt this method for a genotypic HRV-C heterogeneity and proposals for type classification of HRV-C. assignments Eleven (near-)complete genome sequences, 541 sequences Rhinovirus recombination from the VP4/VP2 region [.90 % complete between The process of recombination creates chimaeric virus genomes in which different genome regions have separate evolutionary origins; recombinants may change in their phylogenetic relationships to other sequences between genome regions. For enteroviruses and rhinoviruses whose type assignments are dependent on the capsid genes (and the encoded differences in antigenic properties), regions that undergo extremely frequent recombination (such as the 59-UTR, P2 and P3 non-structural gene regions in HEV) cannot therefore contribute to their (sero)type classification (Savolainen-Kopra et al., 2009b). Rhinovirus species A and B show much more consistent phylogeny relationships across the genome, as exemplified by the largely concordant phylogenies of the 3Dpol and VP4/VP2 regions (Savolainen et al., 2004). There are, however, some inconsistencies evident on analysis of complete genome sequences of species A and B (Palmenberg et al., 2009; Tapparel et al., 2009a). For example, HRV-53 shows greater similarity to HRV-46 in the non-structural region than anticipated by their sequence relationship in the capsid-encoding region, whilst a similar comparison of HRV-78 and HRV-12 showed non-structural gene sequences to be more divergent. In these and other instances, most changes in phylogenetic relationship occurred at the P1/P2 boundary, implying separate evolutionary origins for the structural and non- structural gene blocks in some serotypes. In contrast to species A and B, our recent extensive comparison of phylogenies of the VP4/VP2, VP1 and 3Dpol regions of species C demonstrated consistent branching orders and relative branch lengths in all three coding regions (McIntyre et al., 2010). However, several phylogeny violations occurred between the 59-UTR and VP4/VP2 trees, originating from a series of likely interspecies recombination events with breakpoints towards the 39 end of the 59-UTR (Han et al., 2009; Huang et al., 2009; Savolainen-Kopra et al., 2009b; Wisdom et al., 2009a). Remarkably, most 59-UTR sequences of species C cluster within the species A 59-UTR clade, with the remainder being phylogenetically distinct (Han et al., 2009; Huang et al., 2009; Savolainen-Kopra et al., 2009b; Wisdom et al., 2009a). Those with species A-like 59-UTR sequences have been named HRV-Ca, with the remainder assigned as HRV-Cc (Huang et al., 2009). We have recently found that the region of 2A encoding the C-terminal domain of the Fig. 4. Phylogenetic analysis of all available sequences of HRV-C in proteinase also has a recombinant origin, with all available the VP1 region (.90 % complete between positions 2304 and HRV-C sequences from this region clustering within the 3125). The tree was constructed by neighbour-joining analysis of HRV-A clade (McIntyre et al., 2010). The evolutionary pairwise maximum composite likelihood distances implemented in events and the selection pressures underlying these the program MEGA (Tamura et al., 2007); branches showing ¢70 % instances of HRV-A/-C interspecies recombination are bootstrap support are indicated. Complete genome sequences are currently unknown. labelled $. Bar, maximum composite likelihood distance of 0.05. http://vir.sgmjournals.org 2413 P. Simmonds and others positions 615 and 1043, numbered here and below using region resembles that of VP1, with an equivalent minimum the complete genome HRV-C sequence 024 (GenBank value corresponding to the type threshold of VP1 at 10 %. accession no. EF582385)] from public databases and a However, as a likely result of its shorter length and lesser further 160 unpublished sequences contributed by the degree of sequence diversity than VP1, the type threshold authors of the current study, along with 86 complete VP1 for VP4/VP2 was less clearly defined (Fig. 2). This pattern (positions 2304–3125) and 89 partial 3Dpol (positions was also found in a similar comparison of VP4/VP2 region 6384–6854) sequences, collectively attest to the substantial distance distributions in HRV-A and -B (McIntyre et al., genetic heterogeneity of HRV-C. The formation of a 2010) and in human enteroviruses (Oberste et al., 1999; number of discrete clades of HRV-C in each genome Mulders et al., 2000). region (Savolainen et al., 2002; McIntyre et al., 2010) (Fig. 4) supports the idea that genetic variants of HRV-C Type-assignment proposals might be usefully classified into a number of types, comparable to types/serotypes of other HRV species. In formulating the following criteria for type assignment, we are aware of the need for simplicity and transparency in The current lack of data on antigenic properties of HRV-C is the assignment process and the use of criteria comparable unlikely to be addressed in the near future, due to difficulties to those used for genotypic classification in other with in vitro culture and the daunting task of creating and enterovirus species. At the same time, these proposals applying the necessary serotyping reagents should a viral should respect and adapt to differences in the pattern of culture system be developed. These factors necessitate a diversity in species C and the occurrence of recombination. genotypic classification method. To investigate whether In addition, we acknowledge that current surveillance and clear inter- and intra-type thresholds can be defined for genetic characterization of HRV-C are incomplete and we HRV-C, we constructed a frequency histogram of the set of state the need for review of and, if necessary, revision of pairwise distances between all available sequences from the type-assignment criteria as further genetic data become VP1 region (Fig. 3c) using previously described methods for available in the future. Finally, the use of genetic constructing sequence alignments and determining comparisons in restricted regions of the genome (VP1 sequence distances (McIntyre et al., 2010). For comparison, and VP4/VP2) should not diminish perceptions of the we have additionally analysed an even larger dataset of importance of other genomic regions in shaping the available VP4/VP2 sequences (Fig. 3d). phenotype of HRV-C. However, these, together with As described previously (McIntyre et al., 2010), the putative biological/epidemiological differences to be found distribution of HRV-C VP1 sequence distances is indeed in the future, lie specifically in the realm of research bimodal, with a clearly defined minimum (zero) value enquiry and we advise against their use as subsidiary or below 14.9 % and above 8 % (Fig. 2), which may be used as alternative classification criteria unless or until there is a a threshold for putative assignment of HRV-C types. This future major reappraisal of our understanding of HRV diversity and genetics. corresponds closely to the 12 % threshold that divides within- and between-serotype distances in species A and B (a) A proposed HRV-C type should be phylogeneti- rhinoviruses (Savolainen et al., 2002; McIntyre et al., 2010). cally distinct and show .13 % nucleotide The distribution of pairwise distances in the VP4/VP2 sequence divergence in VP1 from all other

Table 1. Proposed assigment of complete genome sequences of HRV-C into types

Type GenBank Strain identifier Submission Variants Reference* assignment accession no. VP4/VP2 VP1

HRV-C1 EF077279 NAT001 20 Oct 2006 17 4 Kistler et al. (2007) HRV-C2 EF077280 NAT045 20 Oct 2006 36 1 Kistler et al. (2007) HRV-C3 EF186077 QPM 14 Dec 2006 12 2 McErlean et al. (2007) HRV-C4 EF582385 024 27 Apr 2007 3 1 Lau et al. (2007) HRV-C5 EF582386 025 27 Apr 2007 22 1 Lau et al. (2007) HRV-C6 EF582387 026 27 Apr 2007 32 11 Lau et al. (2007) HRV-C7 DQ875932 NY-074 14 Jul 2008 4 2 Lamson et al. (2006) HRV-C8 GQ223227 N4 29 May 2009 6 2 Huang et al. (2009) HRV-C9 GQ223228 N10 29 May 2009 27 5 Huang et al. (2009) HRV-C10 GQ323774 QCE 29 Jun 2009 7 1 Arden et al. (2010b) HRV-C11 EU840952 CL-170085 21 May 2010 11 2 Tapparel et al. (2009b)

*Reference for first submitted sequence for each type.

2414 Journal of General Virology 91 http://vir.sgmjournals.org

Table 2. Proposed type assigment of HRV-C variants represented by VP1 (and VP4/VP2) sequences

Type assignment VP4/VP2 region VP1 region References*

GenBank Submission date Variants GenBank Submission date Variants accession no. accession no.

HRV-C12 EF077264 20 Oct 2006 25 HM236958 14 May 2010 3 Kistler et al. (2007); McIntyre et al. (2010) HRV-C13 EU081795 3 Aug 2007 6 HM236908 14 May 2010 1 Renwick et al. (2007); McIntyre et al. (2010) HRV-C14 EU081796 3 Aug 2007 6 HM236911 14 May 2010 1 Renwick et al. (2007); McIntyre et al. (2010) HRV-C15 EU081800 3 Aug 2007 22 HM236963 14 May 2010 1 Renwick et al. (2007); McIntyre et al. (2010) HRV-C16 EU081808 3 Aug 2007 31 HM236944 14 May 2010 3 Renwick et al. (2007); McIntyre et al. (2010) HRV-C17 EU081809 3 Aug 2007 3 HM236936 14 May 2010 1 Renwick et al. (2007); McIntyre et al. (2010) HRV-C18 EU590074 25 Mar 2008 30 HM236918 14 May 2010 5 Savolainen-Kopra et al. (2009a); McIntyre et al. (2010) HRV-C19 EU697850 5 May 2008 9 EU840728 20 Jun 2008 3 Briese et al. (2008); Tapparel et al. (2009a) HRV-C20 EU697851 5 May 2008 14 HM236923 14 May 2010 6 Briese et al. (2008); McIntyre et al. (2010) HRV-C21 EU752377 26 May 2008 16 HM236903 14 May 2010 2 Miller et al. (2009a); McIntyre et al. (2010) HRV-C22 EU752381 26 May 2008 10 HM236905 14 May 2010 2 Miller et al. (2009a); McIntyre et al. (2010) HRV-C23 EU752424 26 May 2008 21 HM236901 14 May 2010 2 Miller et al. (2009a); McIntyre et al. (2010) HRV-C24 EU752426 26 May 2008 14 HM236939 14 May 2010 1 Miller et al. (2009a); McIntyre et al. (2010) HRV-C25 EU752427 26 May 2008 15 HM236952 14 May 2010 1 Miller et al. (2009a); McIntyre et al. (2010) HRV-C26 EU752441 26 May 2008 19 HM236904 14 May 2010 2 Miller et al. (2009a); McIntyre et al. (2010) HRV-C27 GQ223122 9 Jan 2009 14 HM236906 14 May 2010 4 Huang et al. (2009); McIntyre et al. (2010) HRV-C28 GQ223134 9 Jan 2009 11 HM236954 14 May 2010 1 Huang et al. (2009); McIntyre et al. (2010) HRV-C29 FJ615699 9 Jan 2009 4 HM236949 14 May 2010 1 Miller et al. (2009b); McIntyre et al. (2010) HRV-C30 GQ476669 13 Aug 2009 2 HM236968 14 May 2010 1 Wisdom et al. (2009b); McIntyre et al. (2010) HRV-C31 GU294380 4 Dec 2009 4 HM236964 14 May 2010 1 Wisdom et al. (2009b); McIntyre et al. (2010) HRV-C32 GU294466 4 Dec 2009 13 HM236897 14 May 2010 10 Wisdom et al. (2009b); McIntyre et al. (2010) HRV-C for proposals Classification HRV-C33 GU294480 4 Dec 2009 3 HM236934 14 May 2010 2 Wisdom et al. (2009b); McIntyre et al. (2010)

*References for first submitted sequence for each type. 2415 P. Simmonds and others

previously classified species C types. The VP1 (d) Subsequent assignments have been made (HRV- sequence obtained for this sequence comparison C12 onwards) to genetic variants of HRV-C for must be .90 % complete between positions 2304 which VP1 and VP4/VP2 sequences are available, and 3125 for determining valid nucleotide again ordered by submission date of the first sequence distances. The proposed threshold sequence in either VP4/VP2 or VP1 (Table 2). corresponds to approximately 8 % amino acid (e) The remaining genetic variants of HRV-C for sequence divergence in VP1. However, for clarity which only VP4/VP2 region sequences are and avoidance of conflicting assignments, we do available and which show .10 % divergence from not propose amino acid distances as an additional other species C sequences in this region should be or alternative criterion for type assignments. assigned as provisionally assigned types (desig- (b) A sequence from the VP4/VP2 region (between nated ‘pat’), e.g. HRV-C_pat1, HRV-C_pat2 etc. positions 615 and 1043) can be used for (Table 3). If and when VP1 sequence data are identification of HRV-C types among the much determined for at least one member of this larger dataset of VP4/VP2 sequences that have provisionally assigned type, it can be added to been obtained from surveillance studies. the list of confirmed types and removed from the (c) Types should be numbered sequentially from 1 provisional list. using a ‘C’ prefix to distinguish them from (f) A designated Expert Group takes responsibility serotype designations of other HRV species. In for the future coordinated assignment of HRV-C the tables of assignments drawn up, numbering types, including a reappraisal of the type assign- commences with the 11 (near)-complete genome ment as more sequence data accumulate. This sequences HRV-C1 to -C11 (Table 1), based on might perhaps be nominated by the ICTV submission date to GenBank. Picornavirus Study Group and include Study

Table 3. Provisional type assignment of HRV-C variants represented by VP4/VP2 sequences only

Provisional type GenBank accession no. Submission date Variants Reference*

HRV-C_pat1 EF077256 20 Oct 2006 21 Kistler et al. (2007) HRV-C_pat2 EF077260 20 Oct 2006 8 Kistler et al. (2007) HRV-C_pat3 EU081790 3 Aug 2007 15 Renwick et al. (2007) HRV-C_pat4 EU081791 3 Aug 2007 13 Renwick et al. (2007) HRV-C_pat5 EU081799 3 Aug 2007 20 Renwick et al. (2007) HRV-C_pat6 EU081802 3 Aug 2007 29 Renwick et al. (2007) HRV-C_pat7 EU081803 3 Aug 2007 7 Renwick et al. (2007) HRV-C_pat8 EU081805 3 Aug 2007 11 Renwick et al. (2007) HRV-C_pat9 EU081807 3 Aug 2007 13 Renwick et al. (2007) HRV-C_pat10 EU590054 25 Mar 2008 14 Savolainen-Kopra et al. (2009a) HRV-C_pat11 EU590061 25 Mar 2008 5 Savolainen-Kopra et al. (2009a) HRV-C_pat12 EU590064 25 Mar 2008 5 Savolainen-Kopra et al. (2009a) HRV-C_pat13 EU697839 5 May 2008 6 Briese et al. (2008) HRV-C_pat14 EU697852 5 May 2008 1 Briese et al. (2008) HRV-C_pat15 EU743925 22 May 2008 10 Dominguez et al. (2008) HRV-C_pat16 EU752358 26 May 2008 2 Miller et al. (2009a) HRV-C_pat17 EU752398 26 May 2008 6 Miller et al. (2009a) HRV-C_pat18 EU752412 26 May 2008 21 Miller et al. (2009a) HRV-C_pat19 FJ598096 29 Dec 2008 5 Currently unpublished HRV-C_pat20 FJ615722 9 Jan 2009 1 Miller et al. (2009b) HRV-C_pat21 FJ615737 9 Jan 2009 4 Miller et al. (2009b) HRV-C_pat22 FJ615745 9 Jan 2009 1 Miller et al. (2009b) HRV-C_pat23 FJ841957 17 Mar 2009 2 Calvo et al. (2009) HRV-C_pat24 FJ869923 27 Mar 2009 1 Calvo et al. (2009) HRV-C_pat25 FJ869950 27 Mar 2009 4 Calvo et al. (2009) HRV-C_pat26 GQ466482 7 Aug 2009 1 Savolainen-Kopra et al. (2009c) HRV-C_pat27 GU214340 18 Nov 2009 5 Piralla et al. (2009) HRV-C_pat28 HM347248 24 May 2010 1 Currently unpublished

*Reference for first submitted sequence for each type.

2416 Journal of General Virology 91 Classification proposals for HRV-C

Fig. 5. Total numbers of each assigned HRV-C type identified by sequence comparisons in the VP4/VP2 region, divided by geographical region. The total representation of sequences from each geographical range is indicated by the inset pie chart.

Group members with expertise and experience in HRV-C genetic variant as belonging to an already classified new enterovirus type assignments, as well as type can be achieved by sequence comparisons in VP4/VP2. ‘outside’ scientists active in HRV-C or more Currently, most sequence data obtained for genetic char- general HRV research. acterization of HRV-C have been obtained in this region, (g) Alignments of the VP1 and VP4/VP2 regions, including all of the confirmed types. These sequence data are along with information on the regions used for derived from a wide geographical base, combining sequence sequence comparisons, will be made available on data from Europe, USA, Australia, Japan and South-East Asia. a publicly available database accessible through the Picornavirus Study Group. These alignments By phylogenetic analysis and examination of pairwise will be regularly maintained and updated with distances within the now-extensive dataset of VP4/VP2 new sequence data and type assignments as these sequences, the aforementioned threshold permits all HRV-C become available. variants characterized to date to be categorized into a total of (h) This Group should cooperate closely with those 61 confirmed or provisionally assigned types, the majority of developing future type-assignment criteria for which now contain multiple examples from geographically species A and B rhinoviruses to help ensure separate locations (Fig. 5; a full list of individual assignments consistency in approach. is available as Supplementary Table S1 in JGV Online). The decreasing pace of identification of variants worldwide that Applying these criteria to the currently available dataset of can be assigned (even provisionally) as new types suggests a HRV-C sequences creates a total of 33 confirmed types finite limit to the number that will eventually be classified. (Tables 1 and 2) and a further 28 provisionally assigned The actual total will, however, only become clear with more types based on VP4/VP2 sequences (Table 3). temporally and geographically widespread sampling.

In summary, this proposal draws together existing know- Type identification ledge of the genetic diversity of HRV-C and applies In drawing up specific classification proposals, we should principles established for type assignment of other entero- emphasize that the process of type assignment is an activity viruses to create a genotypically based classification scheme distinct from type identification or detection. From the data for HRV-C types. We hope that these proposals will be of obtained from genetic characterization of HRV-C in different value in future rhinovirus research, and provide the impetus genome regions and the lack of recombination observed to develop related type-assignment criteria for novel HRV-A (McIntyre et al., 2010), we consider that identification of an and -B genetic variants that have been described. http://vir.sgmjournals.org 2417 P. Simmonds and others

Acknowledgements Huang, T., Wang, W., Bessaud, M., Ren, P., Sheng, J., Yan, H., Zhang, J., Lin, X., Wang, Y. & other authors (2009). Evidence of The authors would like to thank Nick Knowles of the Picornavirus recombination and genetic diversity in human rhinoviruses in Study Group for useful comments and advice received during the children with acute respiratory infection. PLoS ONE 4, e6355. preparation of this proposal. Kaiser, L., Aubert, J. D., Pache, J. C., Deffernez, C., Rochat, T., Garbino, J., Wunderli, W., Meylan, P., Yerly, S. & other authors (2006). Chronic rhinoviral infection in lung transplant recipients. Am References J Respir Crit Care Med 174, 1392–1399. Arden, K. E. & Mackay, I. M. (2010). Newly identified human Kapikian, A. Z., Conant, R. M., Hamparian, V. V., Chanock, R. M., rhinoviruses: molecular methods heat up the cold viruses. Rev Med Chapple, P. J., Dick, E. C., Fenters, J. D., Gwaltney, J. M., Jr, Hamre, D. Virol 20, 156–176. & other authors (1967). Rhinoviruses: a numbering system. Nature 213, 761–762. Arden, K. E., McErlean, P., Nissen, M. D., Sloots, T. P. & Mackay, I. M. (2006). Frequent detection of human rhinoviruses, paramyxoviruses, Kapikian, A. Z., Conant, R. M., Hamparian, V. V., Chanock, R. M., Dick, coronaviruses, and bocavirus during acute respiratory tract infections. E. C., Gwaltney, J. M., Jr, Hamre, D., Jordan, W. S., Kenny, G. E. & J Med Virol 78, 1232–1240. other authors (1971). A collaborative report: rhinoviruses – extension of the numbering system. Virology 43, 524–526. Arden, K. E., Chang, A. B., Lambert, S. B., Nissen, M. D., Sloots, T. P. Khetsuriani, N., Lu, X., Teague, W. G., Kazerouni, N., Anderson, L. J. & Mackay, I. M. (2010a). Newly identified respiratory viruses in & Erdman, D. D. (2008). children with non-hospitalised asthma exacerbation. J Med Virol 82, Novel human rhinoviruses and exacerbation 1458–1461. of asthma in children. Emerg Infect Dis 14, 1793–1796. Kistler, A., Avila, P. C., Rouskin, S., Wang, D., Ward, T., Yagi, S., Arden, K. E., Faux, C. E., O’Neill, N. T., McErlean, P., Nitsche, A., Schnurr, D., Ganem, D., DeRisi, J. L. & Boushey, H. A. (2007). Lambert, S. B., Nissen, M. D., Sloots, T. P. & Mackay, I. M. (2010b). Pan- Molecular characterization and distinguishing features of a novel viral screening of respiratory tract infections in adults with and without asthma reveals unexpected human coronavirus and human human rhinovirus (HRV) C, HRVC-QCE, detected in children with rhinovirus diversity. J Infect Dis 196, 817–825. fever, cough and wheeze during 2003. J Clin Virol 47, 219–223. Knowles, N. J. (2010). Blomqvist, S., Savolainen, C., Raman, L., Roivainen, M. & Hovi, T. Virus Taxonomy: 2009 release. http://www. ictvonline.org/virusTaxonomy.asp?version=2009 (2002). Human rhinovirus 87 and enterovirus 68 represent a unique serotype with rhinovirus and enterovirus features. J Clin Microbiol 40, Laine, P., Savolainen, C., Blomqvist, S. & Hovi, T. (2005). 4218–4223. Phylogenetic analysis of human rhinovirus capsid protein VP1 and Briese, T., Renwick, N., Venter, M., Jarman, R. G., Ghosh, D., 2A protease coding sequences confirms shared genus-like relation- ships with human enteroviruses. J Gen Virol 86, 697–706. Kondgen, S., Shrestha, S. K., Hoegh, A. M., Casas, I. & other authors (2008). Global distribution of novel rhinovirus genotype. Emerg Infect Lamson, D., Renwick, N., Kapoor, V., Liu, Z., Palacios, G., Ju, J., Dean, Dis 14, 944–947. A., St George, K., Briese, T. & Lipkin, W. I. (2006). MassTag polymerase-chain-reaction detection of respiratory pathogens, Calvo, C., Luz Garcia, M., Pozo, F., Reyes, N., Perez-Brena, P. & including a new rhinovirus genotype, that caused influenza-like Casas, I. (2009). Role of rhinovirus C in apparently life-threatening illness in New York State during 2004–2005. J Infect Dis 194, 1398– events in infants, Spain. Emerg Infect Dis 15, 1506–1508. 1402. Carstens, E. B. (2010). Ratification vote on taxonomic proposals to Lau, S. K., Yip, C. C., Tsoi, H. W., Lee, R. A., So, L. Y., Lau, Y. L., Chan, the International Committee on Taxonomy of Viruses (2009). Arch K. H., Woo, P. C. & Yuen, K. Y. (2007). Clinical features and complete Virol 155, 133–146. genome characterization of a distinct human rhinovirus (HRV) Cooney, M. K., Fox, J. P. & Kenny, G. E. (1982). Antigenic groupings genetic cluster, probably representing a previously undetected HRV of 90 rhinovirus serotypes. Infect Immun 37, 642–647. species, HRV-C, associated with acute respiratory illness in children. Cordey, S., Gerlach, D., Junier, T., Zdobnov, E. M., Kaiser, L. & J Clin Microbiol 45, 3655–3664. Tapparel, C. (2008). The cis-acting replication elements define human Lau, S. K., Yip, C. C., Lin, A. W., Lee, R. A., So, L. Y., Lau, Y. L., Chan, enterovirus and rhinovirus species. RNA 14, 1568–1578. K. H., Woo, P. C. & Yuen, K. Y. (2009). Clinical and molecular Dominguez, S. R., Briese, T., Palacios, G., Hui, J., Villari, J., Kapoor, epidemiology of human rhinovirus C in children and adults in Hong V., Tokarz, R., Glode, M. P., Anderson, M. S. & other authors (2008). Kong reveals a possible distinct human rhinovirus C subgroup. J Infect Multiplex MassTag-PCR for respiratory pathogens in pediatric Dis 200, 1096–1103. nasopharyngeal washes negative by conventional diagnostic testing Ledford, R. M., Patel, N. R., Demenczuk, T. M., Watanyar, A., shows a high prevalence of viruses belonging to a newly recognized Herbertz, T., Collett, M. S. & Pevear, D. C. (2004). VP1 sequencing of rhinovirus clade. J Clin Virol 43, 219–222. all human rhinovirus serotypes: insights into genus phylogeny and Gerber, K., Wimmer, E. & Paul, A. V. (2001). Biochemical and genetic susceptibility to antiviral capsid-binding compounds. J Virol 78, studies of the initiation of human rhinovirus 2 RNA replication: 3663–3674. identification of a cis-replicating element in the coding sequence of Lee, W. M., Kiesner, C., Pappas, T., Lee, I., Grindle, K., Jartti, T., 2A(pro). J Virol 75, 10979–10990. Jakiela, B., Lemanske, R. F., Jr, Shult, P. A. & Gern, J. E. (2007). A Hamparian, V. V., Colonno, R. J., Cooney, M. K., Dick, E. C., Gwaltney, diverse group of previously unrecognized human rhinoviruses are J. M., Jr, Hughes, J. H., Jordan, W. S., Jr, Kapikian, A. Z., Mogabgab, common causes of respiratory illnesses in infants. PLoS ONE 2, e966. W. J. & Monto, A. (1987). A collaborative report: rhinoviruses – McErlean, P., Shackelton, L. A., Lambert, S. B., Nissen, M. D., Sloots, extension of the numbering system from 89 to 100. Virology 159, 191– T. P. & Mackay, I. M. (2007). Characterisation of a newly identified 192. human rhinovirus, HRV-QPM, discovered in infants with bronch- Han, T. H., Chung, J. Y., Hwang, E. S. & Koo, J. W. (2009). Detection of iolitis. J Clin Virol 39, 67–75. human rhinovirus C in children with acute lower respiratory tract McIntyre, C., McWilliam-Leitch, E. C., Savolainen-Kopra, C., Hovi, T. infections in South Korea. Arch Virol 154, 987–991. & Simmonds, P. (2010). Analysis of genetic diversity and sites of

2418 Journal of General Virology 91 Classification proposals for HRV-C recombination in human rhinovirus species C (HRV-C). J Virol (in Savolainen, C., Blomqvist, S., Mulders, M. N. & Hovi, T. (2002). press). doi:10.1128/JVI.00962-10 Genetic clustering of all 102 human rhinovirus prototype strains: Miller, E. K., Edwards, K. M., Weinberg, G. A., Iwane, M. K., Griffin, serotype 87 is close to human enterovirus 70. J Gen Virol 83, 333–340. M. R., Hall, C. B., Zhu, Y., Szilagyi, P. G., Morin, L. L. & other authors Savolainen, C., Laine, P., Mulders, M. N. & Hovi, T. (2004). Sequence (2009a). A novel group of rhinoviruses is associated with asthma analysis of human rhinoviruses in the RNA-dependent RNA hospitalizations. J Allergy Clin Immunol 123, 98–104. polymerase coding region reveals large within-species variation. Miller, E. K., Khuri-Bulos, N., Williams, J. V., Shehabi, A. A., Faouri, S., J Gen Virol 85, 2271–2277. Al Jundi, I., Chen, Q., Heil, L., Mohamed, Y. & other authors (2009b). Savolainen-Kopra, C., Blomqvist, S., Kilpi, T., Roivainen, M. & Hovi, Human rhinovirus C associated with wheezing in hospitalised T. (2009a). Novel species of human rhinoviruses in acute otitis media. children in the Middle East. J Clin Virol 46, 85–89. Pediatr Infect Dis J 28, 59–61. Mulders, M. N., Salminen, M., Kalkkinen, N. & Hovi, T. (2000). Savolainen-Kopra, C., Blomqvist, S., Smura, T., Roivainen, M., Molecular epidemiology of coxsackievirus B4 and disclosure of the Hovi, T., Kiang, D., Kalra, I., Yagi, S., Louie, J. K. & other authors correct VP1/2A(pro) cleavage site: evidence for high genomic (2009b). 59 noncoding region alone does not unequivocally diversity and long-term endemicity of distinct genotypes. J Gen determine genetic type of human rhinovirus strains. J Clin Virol 81, 803–812. Microbiol 47, 1278–1280. Oberste, M. S., Maher, K., Kilpatrick, D. R. & Pallansch, M. A. (1999). Savolainen-Kopra, C., Blomqvist, S., Kaijalainen, S., Juonio, U., Juvonen, R., Peitso, A., Saukkoriipi, A., Vainio, O., Hovi, T. & Molecular evolution of the human enteroviruses: correlation of Roivainen, M. (2009c). serotype with VP1 sequence and application to picornavirus All known human rhinovirus species are present in sputum specimens of military recruits during respiratory classification. J Virol 73, 1941–1948. infection. Viruses 1, 1178–1189. Olenec, J. P., Kim, W. K., Lee, W. M., Vang, F., Pappas, T. E., Salazar, Tamura, K., Dudley, J., Nei, M. & Kumar, S. (2007). MEGA4: molecular L. E., Evans, M. D., Bork, J., Roberg, K. & other authors (2010). evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Weekly monitoring of children with asthma for infections and illness Evol 24, 1596–1599. during common cold seasons. J Allergy Clin Immunol 125, 1001– 1006. Tapparel, C., Junier, T., Gerlach, D., Van Belle, S., Turin, L., Cordey, S., Muhlemann, K., Regamey, N., Aubert, J. D. & other authors Palmenberg, A. C., Spiro, D., Kuzmickas, R., Wang, S., Djikeng, A., (2009a). New respiratory enterovirus and recombinant rhinoviruses Rathe, J. A., Fraser-Liggett, C. M. & Liggett, S. B. (2009). Sequencing among circulating picornaviruses. Emerg Infect Dis 15, 719–726. and analyses of all known human rhinovirus genomes reveal structure and evolution. Science 324, 55–59. Tapparel, C., L’Huillier, A. G., Rougemont, A. L., Beghetti, M., Barazzone-Argiroffo, C. & Kaiser, L. (2009b). Pneumonia and Piotrowska, Z., Vazquez, M., Shapiro, E. D., Weibel, C., Ferguson, D., pericarditis in a child with HRV-C infection: a case report. J Clin Landry, M. L. & Kahn, J. S. (2009). Rhinoviruses are a major cause of Virol 45, 157–160. wheezing and hospitalization in children less than 2 years of age. Wisdom, A., Kutkowska, A., McWilliam-Leitch, E. C., Gaunt, E., Pediatr Infect Dis J 28, 25–29. Templeton, K., Harvala, H. & Simmonds, P. (2009a). Genetics, Piralla, A., Rovida, F., Campanini, G., Rognoni, V., Marchi, A., recombination and clinical features of human rhinovirus species C Locatelli, F. & Gerna, G. (2009). Clinical severity and molecular (HRV-C) infections; interactions of HRV-C with other respiratory typing of human rhinovirus C strains during a fall outbreak affecting viruses. PLoS ONE 4, e8518. hospitalized patients. J Clin Virol 45, 311–317. Wisdom, A., McWilliam Leitch, C., Gaunt, E., Harvala, H. & Renwick, N., Schweiger, B., Kapoor, V., Liu, Z., Villari, J., Bullmann, Simmonds, P. (2009b). Screening respiratory samples for human R., Miething, R., Briese, T. & Lipkin, W. I. (2007). A recently identified rhinoviruses (HRV) and enteroviruses: comprehensive VP4/2-typing rhinovirus genotype is associated with severe respiratory-tract reveals high incidence and genetic diversity of HRV species C. J Clin infection in children in Germany. J Infect Dis 196, 1754–1760. Microbiol 47, 3958–3967.

http://vir.sgmjournals.org 2419 Journal of Medical Virology 84:536–542 (2012)

High Detection Frequency and Viral Loads of Human Rhinovirus Species A to C in Fecal Samples; Diagnostic and Clinical Implications

H. Harvala,1* C.L. McIntyre,2 N.J. McLeish,2 J. Kondracka,2 J. Palmer,2 P. Molyneaux,3 R. Gunson,4 S. Bennett,4 K. Templeton,1 and P. Simmonds2 1Specialist Virology Centre, Royal Infirmary of Edinburgh, Edinburgh, United Kingdom 2Centre for Immunology, Infection and Evolution, University of Edinburgh, Edinburgh 3Department of Medical Microbiology, Aberdeen Royal Infirmary, Aberdeen, United Kingdom 4West of Scotland Specialist Virology Centre, Gartnavel General Hospital, Glasgow, United Kingdom

Human rhinoviruses (HRVs) can be divided KEY WORDS: human rhinovirus; HRV-A to into three species; HRV-A to HRV-C. Up to 148 C; gastroenteritis; diagnostic; different HRV (sero)types have been identified enterovirus to date. Because of sequence similarity between 50-NCR of HRVs and enteroviruses (EVs), it is problematic to design EV-specific RT-PCR assays. The aims of this study were to INTRODUCTION assess the rate of false-detection of different Human rhinoviruses (HRVs) together with entero- rhinoviruses by EV RT-PCR, and to evaluate viruses (EVs) belong to the genus Enterovirus within the diagnostic and clinical significance of the family Picornaviridae. HRVs can be divided into such cross-reactivity. In vitro RNA transcripts three species; HRV species A (HRV-A) comprises of of HRV A-C created from cDNA templates 74 different serotypes, HRV-B of 25 serotypes and were quantified spectrophotometrically. Six more recently characterized HRV-C of at least 49 types hundred twenty-one stool samples screened as [Stanway et al., 2005; McIntyre et al., 2010]. In most part of routine diagnostic for EV, 17 EV-positive cases, HRV causes mild and self-limiting upper respi- stool samples referred for typing, 288 stool ratory tract infections or common colds, but HRVs samples submitted for gastroenteritis investiga- have also been associated with bronchiolitis, lower tions, and 1,500 CSF samples were included in respiratory infection as well as with exacerbation of the study. EV-specific RT-PCR detected RNA asthma and chronic obstructive airway diseases transcripts of HRV-A1b, HRV-B14, and HRV- [reviewed by Mackay, 2008]. Crpat18 but with 10–1,000 reduced sensitivity HRV infections are most often diagnosed from compared to EV transcripts. Screening fecal respiratory tract sample by reverse transcription samples by EV RT-PCR identified 13 positive polymerase chain reaction (RT-PCR) targeting the samples identified subsequently as rhinovi- 50-non-coding region (50-NCR) of the viral genome ruses; a further 26 HRV-positive samples were [Scheltinga et al., 2005; Mackay, 2008]. Although identified by nested HRV RT-PCR. All individua- relatively highly conserved compared to HRV coding ls were hospitalized and presented mostly regions, sequence variation in the 50-NCR between with diarrhea. A total of 26 HRV types were different HRV (sero)types creates difficulties in identified (HRV-A: 46%; HRV-B: 13%; HRV-C: designing primer pairs that can satisfactorily detect 41%). Results confirm that EV-specific RT-PCR all HRV types and species [Faux et al., 2011]. can detect HRVs, and at a practical level, identi- Additionally, sequence similarity in this region fy potential problems of interpretation if fecal samples are used for surrogate screening in cases of suspected viral meningitis. High detec- *Correspondence to: H. Harvala, PhD, MSc, MD, Specialist tion frequencies (10%) and viral loads in Virology Centre, Royal Infirmary of Edinburgh, 51 Little France stool samples provide evidence for enteric Crescent, Edinburgh EH16 4SA, UK. E-mail: [email protected] replication of HRV, and its association with Accepted 15 November 2011 enteric disease requires further etiological J. Med. Virol. 84:536–542, 2012. DOI 10.1002/jmv.23203 studies. Published online in Wiley Online Library ß 2012 Wiley Periodicals, Inc. (wileyonlinelibrary.com).

ß 2012 WILEY PERIODICALS, INC. Detection of Human Rhinoviruses in Fecal Samples 537 between HRVs and enteroviruses (EVs) complicates of samples) with suspected viral meningitis or pyrexia the development of EV-specific RT-PCR screening of unknown origin during a 14-month period methods that would not also detect some HRVs. (March 2010 to April 2011) were investigated in this The current study has two aims. Firstly, the study. This study additionally included 17 EV-positive frequency of cross-reactivity and relative sensitivity of stool samples referred from Aberdeen for enterovirus EV RT-PCR for HRV, both as pre-quantified RNA typing and a subset of stool samples (n ¼ 288) submit- transcripts and HRV-positive clinical specimens were ted mostly for gastroenteritis investigations (mostly investigated. The second aim was to investigate suspected norovirus infection) in May (n ¼ 90), the frequency of enteric excretion of rhinoviruses September (n ¼ 108), and December (n ¼ 90). Finally, by HRV-specific PCR and virological and clinical char- 1,500 CSF samples from the Specialist Virology acterization of positive subjects. Center were also included. All samples tested retrospectively were anonymized before testing, but epidemiologic and demographic MATERIALS AND METHODS information was retained while patient confidentiality was protected (in accordance with a protocol approved In Vitro Transcribed RNA for Species A–C by the Lothian Regional Ethics Committee [Protocol Rhinoviruses 2002/4/36]). Genomic cDNA clones of two different HRVs, HRV- A1b (species A) and HRV-B14 (species B) were kindly provided by Professor G. Stanway [HRV-A1b; Hughes RESULTS et al., 1988 and HRV-B14; Alsaadi et al., 1989]. Full- Detection of HRV Sequences by 50-UTR PCR length transcripts were generated from these using the Ambion T7 Transcription Kit (Promega (UK), The number of mismatches between EV RT-PCR Southampton, UK). In vitro transcribed RNA for spe- primers and probes used for diagnostic screening with cies C rhinoviruses were made from cDNA clones of representative sequences of all assigned HRV and EV the combined 50-NCR/VP4 region of two HRV-C (sero)types was determined (Table I). This revealed a variants directly amplified from clinical specimens substantially greater number of mismatches between (types HRV-CrPat19 and HRV-C40). Transcripts were primers and probes with all three HRV species purified with the RNeasy Kit (Qiagen (UK), Crawley, than with EV. Mismatches were similarly identified West Sussex, UK) and concentrations were deter- between the probe and primers sequences with the mined by NanoDrop2000 (Thermo Scientific/Fisher, HRV RNA transcript sequences used in the current Horsham, West Sussex, UK). Integrity of RNA tran- study as assay controls. scripts was demonstrated by denaturing agarose gel The actual degree of assay cross-reactivity was electrophoresis (data not shown). Ten-fold dilution determined experimentally for HRV sequences by serious of each RNA transcript were made and comparing Ct-values of HRV RNA transcripts derived stabilized using citrate buffer (pH 6.0, Ambion, from the cDNA clones of HRV-A1b, HRV-B14, HRV- Life Technologies, Glasgow, UK) supplemented with CrPat19, and HRV-C40 amplified by both methods 0.05 mg/ml carrier tRNA and 0.1 U/ml RNAsin (Fig. 1). RNA transcripts were quantified biophysi- inhibitor. cally before dilution for assay calibration into RNA copies. The four HRV transcripts showed different amplification kinetics in the HRV RT-PCR; the end- RT-PCR Assays and Sequencing point titers for HRV-B14, HRV-C40, and HRV- HRV and EV detection in a diagnostic laboratory CrPat19 transcripts were one, two, and three logs used one-step RT-PCR assays as modified from pub- lower than for HRV-A1b. However, the EV-specific lished protocols [Scheltinga et al., 2005; Dierssen RT-PCR was also able to detect RNA transcripts of et al., 2008; Bennett et al., 2011], whereas the HRV-A1b, HRV-B14, and HRV-C19 but with substan- additional retrospective screening was done using the tially less sensitivity; 9,000 copies of HRV-A1b and nested RT-PCR targeting the 50-UTR as prescribed HRV-B14, and 90,000 copies of HRV-C19 could be previously [Wisdom et al., 2009]. EV-positive samples detected. Interestingly, the detection efficiency of were primarily typed by VP1 sequencing [Leitch HRV-A1b and HRV-B14 by EV-specific RT-PCR assay et al., 2009]; those refractory to EV typing were was comparable to that of HRV-C19 by HRV-specific amplified and sequenced in the VP4 region for type RT-PCR. identification [Wisdom et al., 2009]. Similarly, HRV-positives sample were further identified by VP4 EV RT-PCR Cross-Reactivity in Clinical Samples sequencing [Wisdom et al., 2009]. A total of 67 EV RT-PCR reactive samples identified on diagnostic screening of 621 stool samples Clinical Samples (frequency of 11%) were further characterized in A total of 621 diagnostic stool samples predominantly this study. From those, seven samples untypeable by referred from young children (age under 2 years; 63% the EV-specific VP1 typing assays were subsequently

J. Med. Virol. DOI 10.1002/jmv 538 Harvala et al.

TABLE I. Number of Mismatches Between Primer and Probe Sequences of the HRV and EV RT-PCRs With Prototype HRV and EV 50-UTR Sequences

Mean number of mismatches (range)

Forward1 Forward2 Reverse Probe (A) Rhinovirus PCR Rhinovirus species HRV-A (75 types) 1.3 (0–2) 4.2 (3–5) 1.2 (1–3) 1 (1–2) HRV-B (28 types) 1.8 (1–4) 2.2 (2–3) 1.6 (1–2) 1 (1–2) HRV-C (10 types) 1.4 (1–3) 4 (4) 2 (1–3) 1 (1)

Mean number of mismatches (range)

Forward Reverse Probe (B) Enterovirus PCR Rhinovirus species HRV-A (75 types) 5.9 (3–7) 3.5 (1–4) 1.7 (0–3) HRV-B (28 types) 5.5 (3–7) 2.4 (2–3) 1 (1–2) HRV-C (10 types) 4.4 (3–7) 2.7 (1–4) 0.9 (0–2) Enterovirus species EV-A (17 types) 1 (0–2) 0 0 (0–1) EV-B (58 types) 1 (0–3) 0 (0–1) 0 (0–1) EV-C (24 types) 2.3 (2–4) 0 (0–1) 0 (0–2) EV-D (3 types) 2 (1–3) 0 0 (0–1)

identified as HRV-positives by VP4 amplification and for HRV by nested RT-PCR. A total of 26 further sequencing (10.4%; Table II). VP4 typing similarly HRV-positive samples were further identified and identified 6 HRV-positive samples in a second typed by VP4 amplification and sequencing (frequency group of 17 EV-positive stool samples referred from 10%, including three identified by EV RT-PCR, Aberdeen (35%; Table II). Table III). Detection frequencies varied from 7.4% in To investigate the frequency of HRV excretion in May and 7.7% in December to 12.2% in September. feces, 288 stool samples were retrospectively screened One-step real-time RT-PCR would have detected only 11 of these 26 samples (42%; data not shown). Viral loads of HRV and EV in fecal samples were First negave diluons A1B-EV comparable (Fig. 2). Mean C -values of the HRV- B14-EV t A1B-HRV B14-HRV Cc40-HRV CrPat19-HRV CrPat19-EV Cc40-EV positive samples in the HRV RT-PCR and of EV- 0.9 9 90 900 9000 90000 positive samples in the EV RT-PCR were 28.5 40 CrPat19-EV (standard deviation 3.5) and 28.9 (1.9), respectively 38 (P ¼ 0.77). No specific type of HRV was associated with detec- 36 A1B-EV tion in fecal samples. A total of 26 different HRV types accounted for the 39 identified HRV-positive 34 B14-EV samples; 16 samples (41%) were identified as species CrPat19-HRV A (HRV-A1, A7, A19, A34, A40, A46, A71, A78, A89, 32 Cc40-HRV and A103), 5 (13%) as species B (HRV-B6, B52, and B97), and 18 (46%) as species C (HRV-C5, C6, C12, 30 C14, C24, C25, C36, C37, C39, C42, C43, and C45).

Ct value Positive samples were mostly obtained from young 28 children under the age of 2 years (26/39) or from the B14-HRV patients over the age of 65 years (7/39). All of these 26 individuals were hospitalized, and seven of them A1B-HRV required intensive therapy unit (ITU) admission. 24 Patients presented with variety of clinical symptoms, 110 100 1000 10000 100000 Input RNA copies most often with fever, diarrhea, and/or vomiting; 72% of patients did not have any respiratory symptoms at Fig. 1. Amplification of RNA transcripts of defined concentrations admission (28/39). Nine patients were investigated as by EV RT-PCR and HRV RT-PCR. Ct-values (y-axis) were plotted for part of unconfirmed norovirus outbreak in hospital. serial 10-fold dilutions of each transcript (x-axis). The PCR assay A total of 11 patients (28%) presented with acute used for amplification (EV, HRV) is indicated as a suffix to each label (e.g., HRV-1B–EV: amplification of the HRV-1B transcript in respiratory symptoms, accompanied with loose stool the EV RT PCR). or diarrhea. Enteric co-pathogens were identified in

J. Med. Virol. DOI 10.1002/jmv Detection of Human Rhinoviruses in Fecal Samples 539

TABLE II. HRV-Positive Patients Identified by Real-Time EV RT-PCR

Date Age Clinical details EV PCR HRV PCR Type Nov_10 7–12 months Bronchiolitis, apnoeic episodes 38.09 32.89 HRV-B52 Nov_10 7–12 months Fever, diarrhea and vomiting 33.89 30.78 HRV-C25 Feb_11 3–5 years LRTI and fever 41.1 ND HRV-C6 May_10 1–2 years URTI, fever and diarrhea 38.48 ND HRV-A40a July_10 3–6 months Seizure and loose stool 36.31 ND HRV-A46 Sep_10 7–12 months Bronchiolitis and diarrhea 36.46 ND HRV-C12 Sep_10 <3 months Loose stool and poor weight gain 42.55 29.01 HRV-C42 Aug_10 3–6 months Loose stool and fever NA 28.55 HRV-B52 Aug_10 6–12 months Diarrhea NA 26.39 HRV-A78 Aug_10 <3 months Fever and poor feeding NA 29.62 HRV-B6b Aug_10 >65 years Seizure and loose stool NA 30.39 HRV-A7 Sep_10 10–15 years Fever, diarrhea and vomiting 28.02 23.43 HRV-A71 Nov_10 1–2 years LRTI and fever 33.01 28.75 HRV-C5

NA, data not available; ND, not done, sample finished. aOne co-infection with adenovirus identified. bOne co-infection with rotavirus identified.

five patients (adenovirus n ¼ 3, rotavirus n ¼ 1, and The sample was obtained from a young girl under the Salmonella gastroenteritidis n ¼ 1). age of 3 months who presented with suspected sepsis- To investigate whether non-specificity of EV RT- like illness, and has also HRV detected in her throat PCR extended to diagnostic testing of CSF samples, swab. The virus was identified as HRV-B6 in both CSF and indeed whether HRV can cause CNS-associated sample and throat swab (stool sample not obtained). infection, 1,500 CSF samples were screened by nested HRV RT-PCR. From these, HRV RNA was detected DISCUSSION in a total of one individual sample; this sample was negative by routine EV RT-PCR assay most likely This study documents substantial cross-reactivity due to a low viral load. A finding was confirmed after between EV and HRV real-time PCR methods for re-extraction of RNA from original CSF sample. HRV RNA sequences and identifies a relatively high

TABLE III. Characteristics of Patients Identified as HRV-Positive by Screening Stool Samples by HRV-Specific RT-PCR Assay

Sample Date Age Location Clinical details Type 10_91 Sep_10 >65 years RIE Diarrheaa HRV-A1 10_92 Sep_10 37–65 years ITU Diarrhea and fevera HRV-A1 10_93 Sep_10 >65 years ITU Diarrhea and fevera HRV-A1 10_94 Sep_10 1–2 years RHSC Diarrhea and fever HRV-A1 10_95 Sep_10 <3 months RHSC Diarrhea and vomiting HRV-A1 10_100 Sep_10 7–12 months RHSC Loose stool and irritable HRV-C24 10_128 Sep_10 <3 months RHSC Bronchiolitis, loose stool HRV-C43 10_129 Sep_10 1–2 years RHSC Fever (post-transplant) HRV-C43b 10_141 Sep_10 3–6 months RHSC Diarrhea HRV-C39c 10_149 Sep_10 2 years RHSC URTI and loose stool HRV-A19b 10_153 Sep_10 >65 years ITU Diarrhea and fevera HRV-C39 10_511 May_10 6–10 years RHSC Diarrhea HRV-C45 10_512 May_10 7–12 months RHSC URTI, diarrhea and vomiting HRV-C12 10_518 May_10 3–5 years RHSC Diarrhea and fever HRV-A34 10_520 May_10 7–12 months ESCITU Diarrhea and seizure HRV-A89 10_521 May_10 <3 months ESCITU Diarrhea, vomiting and fever HRV-C37 10_522 May_10 <3 months RHSC Bronchiolitis and vomiting HRV-C36 10_547 May_10 7–12 months RHSC Diarrhea and vomitinga HRV-C36 10_568 May_10 3–5 years RHSC Diarrhea and vomitinga HRV-A103 10_121 Dec_10 3–6 months ESCITU Bronchiolitis and vomiting HRV-A34 10_123 Dec_10 7–12 months ESCITU Bronchiolitis and vomiting HRV-A89 10_124 Dec_10 7–12 months RHSC Gastroenteritis HRV-C37 10_125 Dec_10 7–12 months RHSC Diarrhea and seizure HRV-C36 10_128 Dec_10 >65 years RIE Diarrhea and vomitinga HRV-B97 10_1213 Dec_10 >65 years RIE Diarrhea and vomitinga HRV-C14 10_1218 Dec_10 >65 years RIE Diarrhea and vomitinga HRV-B52

RIE, Royal Infirmary Edinburgh; RHSC, Royal Hospital for Sick Children; ITU, Intensive Therapy Unit; ESCITU, Edinburgh Sick Children ITU. aInvestigated as part of norovirus outbreak. bTwo co-infections with adenovirus identified. cOne co-infection with Salmonella enteritidis identified.

J. Med. Virol. DOI 10.1002/jmv 540 Harvala et al.

p=0.77 The detection frequency and potential mis-identifi- cation of HRV-positive samples as EV positive using 40 standard EV RT-PCR creates a number of interpreta- tion difficulties for diagnostic screening of fecal sam- ples and to a much more limited extent in the current study of CSF samples. The relatively high viral loads of HRV in fecal samples potentially creates problems 35 for laboratories that use this sample type as an addi- tional or alternative surrogate sample to CSF samples to guide diagnosis of EV meningitis and encephalitis as frequently directed by local or national guidelines. As shown here, approximately 10% of stool samples 30 referred for virology diagnostic testing are genuine HRV-positive generally with high viral loads. If even a small proportion of these cross-reacted in the EV RT-PCR, as can be predicted from the study findings

Ct Value 25 where approximately 15% (13/84) of EVs identified by EV RT-PCR in stool samples were indeed HRVs, then this could create a number of falsely reported EV detections in meningitis cases caused by other organ- isms. In this situation, virus typing as an adjunct to 20 screening that included methods to detect HRV variants would identify instances of false-detection of EVs; our analysis of CSF samples does not suggest HRV being a frequent cause of central nervous system 15 associated infections. EV HRV Tissue Tropism of HRV

Fig. 2. Ct-values of EV-positive samples in the EV RT-PCR (left An unanticipated finding from the current study graph) and HRV-positive samples by the HRV RT-PCR (right was the high rate of HRV detection in fecal samples graph). Mean values and 67 percentile distributions shown as bars. Ct-values were compared by the non-parametric Kruskall–Wal- and the comparability of their viral loads to entero- lace test, P-value shown above graph. viruses that are long known to replicate primarily in the lower gastrointestinal tract [Sabin and Ward, 1941; Iwasaki et al., 2002]. HRV are labile to the frequency of HRV excretion in feces (comparable to acidic environment of the stomach that leads to irre- EV detection rates). The similarity in viral loads versible conformational changes in capsid. Experi- between HRV and EV in feces provides evidence for a mentally, HRVs differ from enteroviruses in being much greater propensity of HRV to establish enteric inactivated at pH of <5 as well as pH values above 9– infections than previously described. 10. For this reason, HRV detection in fecal samples or sewage has generally been attributed in the past to swallowed respiratory secretions [Blomqvist et al., Detection of HRV by EV RT-PCR 2009]. In our opinion, this is unlikely for two reasons; Although infrequently described and not formally firstly, the acid lability of HRVs that destroys their quantified [Faux et al., 2011], cross-reactivity of HRV infectivity when passing through the stomach would in EV detection assays may be reasonably anticipated lead to HRV RNA degradation if the virion capsid was given the sequence similarity of the region of the 50- breached. Secondly, the dilution of HRV secretions UTR typically targeted by EV screening assays. Both with the much larger volumes of gut contents could HRVs and EVs have a type I internal ribosomal entry not conceivably create the high viral loads detected in site and there are substantial similarities in key func- the current study by HRV PCR. Indeed, the simplest tional areas including RNA secondary structures explanation for the near identity of viral loads of HRV between all members of the Enterovirus genus that and EV in fecal samples is that viruses replicate com- extend to the targeted region (nucleotides from 412 parably in the gastrointestinal tract. However, studies to 553). Indeed, the greater number of mismatches are required to demonstrate this further. The restric- between the EV RT-PCR primers and probes with the tion of HRV-positive samples to children of <2 years HRV-A1b and HRV-B14 than with EV sequences of age and to the elderly may have arisen from (max 6, 3, and 1 for sense primer, antisense primer, survival of infectious HRV through the less acidic and probe) did not preclude their amplification by stomach environment in these groups [Lopez-Alonso HRV RT-PCR, although with a sensitivity approxi- and Ribas, 1991]. Survival of HRV may additionally mately 100-fold lower that for EV sequences. be enhanced by the presence of cationic substances

J. Med. Virol. DOI 10.1002/jmv Detection of Human Rhinoviruses in Fecal Samples 541 such as MgCl2 that are favorable for the replication of investigated as a part of suspected gastroenteritis out- HRV by stabilizing virions [Blough et al., 1969]. As an break in hospital, and HRV (in fecal samples) has additional possibility, in vitro culture of HRV-B14 in been the only etiological agent identified so far. More acidic medium has been shown to select rapidly for a focused screening of fecal samples for HRV in patients point mutation in the capsid (VP1) gene that confers presenting with gastroenteritis, sepsis, and other resistance to low pH [Skern et al., 1991]. This may systemic diseases, along with parallel estimation of represent a more general and necessary adaptive HRV detection frequency in control groups will be change for successful replication in the GI tract. required to demonstrate these additional potential Whether each of the wide range of HRV (sero)types disease association of HRV infections. Furthermore, and species detected in fecal samples in the current the possible presence of infectious HRV in fecal study are capable of an equivalent phenotypic change samples potentially creates infection control problems. and identification of the associated genetic changes Whereas HRV are usually regarded to spread via are intriguing areas of further research. The other po- direct contact with respiratory droplets, findings tential requirement for successful gut replication by in the current study point towards the very real possi- rhinoviruses is adaptation to replication at 378C bility of fecal–oral transmission of HRV. This would instead of the 33–348C in the respiratory tract, have a number of implications for the management of traditionally assumed to be the optimum temperature HRV-infected patients in a hospital or institutional of HRVs. However, when actually experimentally setting. investigated [Papadopoulos et al., 1999], a wide range of reference species A and B serotypes and primary REFERENCES clinical isolates showed minimal differences in replication ability at 33 and 378C, and indeed some Alsaadi S, Hassard S, Stanway G. 1989. Sequences in the 50 non- coding region of human rhinovirus 14 RNA that affect in vitro isolates actually grew better at the higher incubation translation. J Gen Virol 70:2799–2804. temperature. Bennett S, Harvala H, Witteveldt J, McWilliam-Leitch EC, McLeish The frequent detection and high viral loads of HRV N, Templeton K, Gunson R, Carman WF, Simmonds P. 2011. Rapid simultaneous detection of enterovirus and parechovirus in fecal samples in the current study is consistent RNA in clinical samples by one-step real-time RT-PCR assay. with the frequent detection of HRV by virus isolation J Clin Microbiol 49:2620–2624. during environmental screening of human sewage Blomqvist S, Savolainen-Kopra C, Paananen A, Hovi T, Roivainen [Blomqvist et al., 2009]. It was speculated that the M. 2009. Molecular characterization of human rhinovirus field strains isolated during surveillance of enteroviruses. J Gen Virol HRV variants derived from respiratory secretions 30:1371–1381. carried into sewage by discarded used tissues or in Blough HA, Tiffany JM, Gordon G, Fiala M. 1969. The effect of mag- wastewater. Our findings of potential gut replication nesium on the intracellular crystallization of rhinovirus. Virology 38:694–698. of HRV implied by the high viral loads in fecal sam- Broberg E, Niemela J, Lahti E, Hyypia T, Ruuskanen O, Waris M. ples (Fig. 2) provide a more direct explanation for the 2011. Human rhinovirus C–associated severe pneumonia in a presence of infectious HRV in sewage. The plethora of neonate. J Clin Virol 51:79–82. different HRV types (n ¼ 26) detected in fecal samples Dierssen U, Rehren F, Henke-Gendo C, Harste G, Heim A. 2008. Rapid routine detection of enterovirus RNA in cerebrospinal fluid argues against the existence of specifically entero- by a one-step real-time RT-PCR assay. J Clin Virol 42:58–64. pathic strains of HRV. Furthermore, using PCR-based Faux CE, Arden KE, Lambert SB, Nissen MD, Nolan TM, Chang screening methods equally sensitive for all three AB, Sloots TP, Mackay IM. 2011. Usefulness of published PCR primers in detecting human rhinovirus infection. Emerg Infect species, the relative proportions of HRV-A (46%), Dis 17:296–298. HRV-B (13%), and HRV-C (41%) were broadly compa- Hughes PJ, North C, Jellis CH, Minor PD, Stanway G. 1988. The rable to previously reported detection frequencies in nucleotide sequence of human rhinovirus 1B: Molecular relation- respiratory samples [63%, 7%, and 30%, respectively; ships within the rhinovirus genus. J Gen Virol 69:49–58. Iwasaki A, Welker R, Mueller S, Linehan M, Nomoto A, Wimmer E. Wisdom et al., 2009]. 2002. Immunofluorescence analysis of poliovirus receptor expres- HRV have traditionally been associated with respi- sion in Peyer’s patches of humans, primates, and CD155 trans- ratory tract infections. Although HRV are mostly genic mice: Implications for poliovirus infection. J Infect Dis 186:585–592. regarded as harmless cause of common colds, they Leitch EC, Harvala H, Robertson I, Ubillos I, Templeton K, have been linked more recently with severe neonatal Simmonds P. 2009. Direct identification of human enterovirus infections [van Piggelen et al., 2010; Broberg et al., serotypes in cerebrospinal fluid by amplification and sequencing 2011] and pericarditis [Tapparel et al., 2009]. In these of the VP1 region. J Clin Virol 44:119–124. Lopez-Alonso M, Ribas J. 1991. Technical improvement for anorectal case reports HRV was also detected in stool samples, manometry in newborns. J Pediatr Surg 26:1215–1218. consistent with the more severe systemic nature of Mackay IM. 2008. Human rhinoviruses: The cold wars resume. HRV infections in young children. Although not J Clin Virol 42:297–320. formally tested by prospective study and without McIntyre CJ, McWilliam-Leitch EC, Savolainen-Kopra C, Hovi T, Simmonds P. 2010. Analysis of genetic diversity and sites of re- control groups, our data support the idea that HRV combination in human rhinovirus species C (HRV-C). J Virol may on some occasions be responsible cause of gastro- 83:10297–10310. enteritis, given the high frequency of diarrhea, Papadopoulos NG, Sanderson G, Hunter J, Johnston SL. 1999. Rhinoviruses replicate effectively at lower airway temperatures. absence of any respiratory symptoms in 70% of J Med Virol 58:100–104. the positive subjects as well as absence of detected Sabin AB, Ward RR. 1941. The natural history of human poliomyeli- co-pathogens. Indeed, eight of our patients were tis. J Exp Med 73:771–793.

J. Med. Virol. DOI 10.1002/jmv 542 Harvala et al.

Scheltinga SA, Templeton KE, Beersma MF, Claas EC. 2005. Diag- Tapparel C, L’Huillier AG, Rougemont AL, Beghetti M, Barazzone- nosis of human metapneumovirus and rhinovirus in patients Argiroffo C, Kaiser L. 2009. Pneumonia and pericarditis in a with respiratory tract infections by an internally controlled child with HRV-C infection: A case report. J Clin Virol 45:157– multiplex real-time RNA PCR. J Clin Virol 33:306–311. 160. Skern T, Torgersen H, Auer H, Kuechler E, Blaas D. 1991. Human van Piggelen RO, Van Loon AM, Krediet TG, Verboon-Maciolek MA. rhinovirus mutants resistant to low pH. Virology 183:757–763. 2010. Human rhinovirus causes severe infection in preterm Stanway G, Brown F, Christian P, Hovi T, Hyypia T, King AMQ, infants. Pediatr Infect Dis J 29:364–365. Knowles NJ, Lemon SM, Minor PD, Pallansch MA, Palmenberg Wisdom A, McWilliam Leitch C, Gaunt E, Harvala H, Simmonds P. AC, Skern T. 2005. Family picornaviridae. In: Fauquet CM, 2009. Screening respiratory samples for human rhinoviruses Mayo MA, Maniloff J, Desselberger V, Ball La, editors. Virus tax- (HRV) and enteroviruses: Comprehensive VP4/2-typing reveals onomy. Eight report on the International Committee on Taxono- high incidence and genetic diversity of HRV species C. J Clin my of Viruses. London: Elsevier/Academic Press. pp 757–778. Microbiol 47:3958–3967.

J. Med. Virol. DOI 10.1002/jmv