of taro and other edible aroids in East Africa

by

Dawit Beyene KIDANEMARIAM
Bachelor of Education (Biology)
Master of Science (Botany)

Centre for Tropical Crops and Biocommodities
School of Earth, Environmental and Biological Sciences
Science and Engineering Faculty

A thesis submitted in fulfilment of the requirement for the degree of Doctor of Philosophy
Queensland University of Technology
Brisbane, Australia

2018

“The end of a journey is the beginning of another….”

Abstract

Edible aroids such as taro and tannia are important root crops in most parts of East Africa and are cultivated mainly by small-holder farmers. Taro is the most preferred aroid in the region, where it plays significant nutritional, economic and social roles. Viruses are among the most important constraints for the production of edible aroids worldwide. To date, no comprehensive study has been carried out to determine the status of viruses infecting taro and other edible aroids in East Africa. This PhD project, therefore, aimed to investigate the incidence, distribution and possible origin of viruses infecting taro and other edible aroids in the region. During 2014/15, a survey was carried out in the major growing areas in Ethiopia, Kenya, Tanzania and Uganda. A total of 25 districts were visited in the four countries and a total of 392 leaf samples were collected. Based on the availability of reliable diagnostic molecular tests, the samples were tested for the presence of badnaviruses, potyviruses and cucumber mosaic virus. Additional screening was also carried out for the presence of rhabdoviruses known to infect taro.

When the 392 samples were tested by PCR using degenerate badnavirus primers, between 58 and 74 % of the samples from the four countries were positive. BLAST analysis of the core RT/RNase H-coding sequences revealed the presence of both taro bacilliform virus (TaBV) and taro bacilliform CH virus (TaBCHV) with TaBCHV identified in all four countries and TaBV identified in all countries except Ethiopia. Full-length genome sequences of representative TaBV and TaBCHV isolates infecting both taro and tannia from East Africa were generated by rolling circle amplification (RCA) and outward-facing PCR, respectively. The genome of TaBV isolates from East Africa ranged from 7,796 to 7,805 nucleotides and contained four open reading frames (ORFs), consistent with that of a previously reported isolate from Papua New Guinea. The genome of TaBCHV isolates from East Africa ranged from 7,389 to 7,654 nucleotides. Unlike previous reports of TaBCHV isolates from China and Hawaii which possessed six and five ORFs, respectively, the TaBCHV isolates from East Africa contained only four ORFs. No obvious symptoms were associated with TaBV and TaBCHV infection in East Africa, with a number of asymptomatic plants also testing positive. Phylogenetic analysis showed that all East African TaBV isolates form a single subgroup together with a known TaBV isolate from New Caledonia. However, TaBCHV isolates formed several distinct subgroups in the phylogenetic tree.

Due to quarantine restrictions, an Australian TaBV isolate was used as a model to generate a TaBV infectious clone. A terminally redundant cloned copy of the TaBV genome was generated and was shown to be infectious when inoculated into taro plants by Agrobacterium-mediated inoculation. TaBV genomic DNA was amplified from inoculated plants using rolling circle amplification at 12 weeks post-inoculation, confirming the presence of episomal TaBV DNA. At 20 weeks post-inoculation, some plants developed symptoms including downward-curling of the leaf margins, similar to that observed in some TaBV-infected taro plants in the field. This was the first report describing the development of an infectious clone of TaBV and may serve as an important tool to facilitate further investigation into the virus host range, symptoms and yield loss.

The incidence and distribution in East Africa of four RNA viruses known to infect taro, namely cucumber mosaic virus (CMV), dasheen mosaic virus (DsMV), taro vein chlorosis virus (TaVCV) and colocasia bobone disease-associated virus (CBDaV), were also investigated by RT-PCR using degenerate and/or virus-specific primers. No samples tested positive for TaVCV or CBDaV. Further, CMV was only detected in three tannia plants with mosaic, mottling and vein chlorosis symptoms from Buikwe district in Uganda. Next generation sequencing of total RNA extracted from these samples confirmed the presence of CMV in all three plants, the nucleotide sequences of which showed 99.5-99.8 % identity. One isolate, designated CMV-Xa, was characterised further. Pairwise sequence comparison, BLAST search and phylogenetic analysis based on full-length RNA 1, 2 and 3 sequences showed that CMV-Xa belonged to subgroup IB of CMV isolates. The genome organisation of RNA 1 and 3 of CMV-Xa was similar to previously reported CMV isolates. However, RNA 2 contained an additional, non-AUG initiated putative ORF, referred to as ORF 2c, in addition to ORF 2a and 2b. This was the first report of a complete genome sequence of a subgroup IB CMV isolate from sub-Saharan Africa and was also the first report of CMV infecting Xanthosoma sp.

DsMV was detected in 40 samples, including 36 out of 171 from Ethiopia, 1 out of 94 from Uganda and 3 out of 41 from Tanzania, while no samples from Kenya tested positive. The complete genomes of nine DsMV isolates from East Africa were cloned and sequenced. Phylogenetic analyses based on the amino acid sequence of the CP-coding region revealed two distinct clades, which is consistent with previous reports. Interestingly, samples from Ethiopia were distributed across several subgroups in both clades, while samples from Uganda and Tanzania belonged to different clades.
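As an illustrative aside, the DsMV incidence per country implied by these counts can be worked out as in the following Python sketch. The counts are those given in the paragraph above; the variable and function names are hypothetical, and Kenya is omitted because the number of Kenyan samples tested is not stated here.

```python
# DsMV RT-PCR results as stated in the text: (positive samples, samples tested).
# Names are illustrative only; Kenya is left out because its denominator is not given.
dsmv_counts = {
    "Ethiopia": (36, 171),
    "Uganda": (1, 94),
    "Tanzania": (3, 41),
}


def incidence_percent(positive: int, tested: int) -> float:
    """Return incidence as a percentage of the samples tested."""
    return 100.0 * positive / tested


# Incidence per country, rounded to one decimal place.
dsmv_incidence = {
    country: round(incidence_percent(positive, tested), 1)
    for country, (positive, tested) in dsmv_counts.items()
}

# Total number of positives across the three countries with detections.
total_positive = sum(positive for positive, _ in dsmv_counts.values())

for country, percent in dsmv_incidence.items():
    print(f"{country}: {percent} % DsMV-positive")
print(f"Total DsMV-positive samples: {total_positive}")
```

The per-country denominators matter here: most positives came from Ethiopia, which also contributed the largest number of samples.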

During preliminary RT-PCR assay development for potyviruses at QUT, an aroid (Alocasia sp.) showing a mosaic and feathery-mottle symptom typical of DsMV infection was identified growing near Brisbane. The plant tested positive for potyvirus infection by RT-PCR using degenerate primers and subsequent cloning and sequence analysis revealed the presence of the potyvirus, Zantedeschia mild mosaic virus (ZaMMV). The complete genome of ZaMMV from Australia (ZaMMV-AU) was obtained and was found to be closely related to a previously reported ZaMMV isolate from Taiwan (ZaMMV-TW). This was the first report of ZaMMV from Australia and from an Alocasia sp.

To our knowledge, this is the first study describing the occurrence, distribution and genome organisation of viruses infecting aroids in East Africa and it will contribute to ongoing surveillance and to disease management activities throughout the region. Aroids are considered an ‘orphan-crop’ in East Africa and, as a result, are receiving less attention from national and regional research agencies. The findings from this study will hopefully raise awareness of the status of viral diseases of aroids in the region and may be the catalyst for attracting much needed funding for research and development activities in the future.


Keywords

Colocasia esculenta, CMV, DsMV, East Africa, Ethiopia, Infectious clone, Kenya, RCA, taro, tannia, TaBCHV, TaBV, Tanzania, Uganda, Xanthosoma sp., ZaMMV


Publications

Peer reviewed publications related to this PhD thesis

1. Kidanemariam, D.B., Abraham, A.D., Sukal, A.C., Holton, T.A., Dale, J.L., James, A.P. and Harding, R.M. (2016). Complete genome sequence of a novel zantedeschia mild mosaic virus isolate: the first report from Australia and from Alocasia sp. Archives of Virology 161:1079–1082.

2. Kidanemariam, D.B., Sukal, A.C., Abraham, A.D., Stomeo, F., Dale, J.L., James, A.P. and Harding, R.M. Identification and molecular characterisation of taro bacilliform virus and taro bacilliform CH virus from East Africa. Submitted to Plant Pathology https://doi.org/10.1111/ppa.12921.

3. Kidanemariam, D.B., Sukal, A.C., Crew, K., Jackson, G.V.H., Abraham, A.D., Stomeo, F., Dale, J.L., James, A.P. and Harding, R.M. (2018). Characterization of an Australian isolate of Taro bacilliform virus and development of an infectious clone. Archives of Virology 163:1677–1681.

4. Kidanemariam, D.B., Sukal, A.C., Abraham, A.D., Njuguna, J.N., Mware, B.O., Stomeo, F., Dale, J.L., James, A.P. and Harding, R.M. Characterisation of a subgroup IB isolate of Cucumber mosaic virus from Xanthosoma sp. in sub-Saharan Africa. Submitted to Virus Genes.

5. Kidanemariam, D.B., Sukal, A.C., Abraham, A.D., Njuguna, J.N., Stomeo, F., Dale, J.L., James, A.P. and Harding, R.M. Incidence and distribution of four RNA viruses infecting taro and tannia in East Africa and molecular characterisation of Dasheen mosaic virus isolates. Formatted for submission to Annals of Applied Biology.


Table of Contents

Abstract
Publications
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Statement of Original Authorship
Acknowledgments

Chapter 1: Introduction
    Description of the scientific problem investigated
    General objectives of the study
    Specific aims of the study
    Account of scientific progress linking the scientific papers

Chapter 2: Literature Review
    2.1 Taro
    2.2 Taro in East Africa
    2.3 Factors affecting the production of taro
    2.4 Production constraints of taro in East Africa
    2.5 Viral diseases of taro
    2.6 Research problem and aim
    2.7 Objectives
    2.8 References

Chapter 3: Complete genome sequence of a novel Zantedeschia mild mosaic virus isolate: the first report from Australia and from Alocasia sp.
    Abstract
    Acknowledgments
    References

Chapter 4: Identification and molecular characterisation of taro bacilliform virus and taro bacilliform CH virus from East Africa
    Abstract
    Introduction
    Materials and methods
    Results
    Discussion
    Acknowledgments
    References

Chapter 5: Characterisation of an Australian isolate of taro bacilliform virus and development of an infectious clone
    Acknowledgments
    References

Chapter 6: Characterization of a subgroup IB isolate of Cucumber mosaic virus from Xanthosoma sp. in sub-Saharan Africa
    Abstract
    Acknowledgements
    References

Chapter 7: Incidence and distribution of four RNA viruses infecting taro and tannia in East Africa and molecular characterisation of Dasheen mosaic virus isolates
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Acknowledgments
    References

Chapter 8: General Discussion
    References

List of Figures

Chapter 2
Figure 1. Taro production and use in Ethiopia and Kenya.
Figure 2. The typical feathery-mottle and mosaic symptoms associated with DsMV infection.
Figure 3. Virions and genome organisation of DsMV.
Figure 4. Electron micrograph showing bacilliform-shaped badnavirus particles partially purified from taro leaves.
Figure 5. Linearised schematic representation of the genome organisation of TaBV and TaBCHV.
Figure 6. Virion structure and typical genome organisation of rhabdoviruses.
Figure 7. Typical vein chlorosis symptom associated with TaVCV infection in taro.

Chapter 3
Figure 1. Phylogenetic analysis of ZaMMV-AU.
Figure 2. Genome organisation of ZaMMV-AU.
Figure 3. Alignment of partial amino acid sequences of the NIb-CP junction of ZaMMV and selected potyviruses from the BCMV subgroup.

Chapter 4
Figure 1. Linearised schematic representation of the genome organisation of full-length TaBV and TaBCHV isolates sequenced from East Africa.
Figure 2. Phylogenetic analyses of the TaBV and TaBCHV sequences from East Africa together with other representative sequences from the family Caulimoviridae.
Figure 3. Phylogenetic analyses of the TaBV-like sequences characterised in this study.
Figure 4. Phylogenetic analyses of the TaBCHV-like sequences characterised in this study.

Chapter 5
Figure 1. Schematic representation of the linearised genome of TaBV-Aus7.
Figure 2. Phenotypic and molecular analysis of pOPT-NXT-Aus7 inoculated taro plants.

Chapter 6
Figure 1. Symptoms associated with CMV-Xa.
Figure 2. Schematic representation of the genome organisation of CMV-Xa.
Figure 3. Phylogenetic analysis of CMV-Xa based on complete nucleotide sequences.

Chapter 7
Figure 1. Locations of survey sites in Ethiopia, Kenya, Tanzania and Uganda.
Figure 2. Photos of typical virus-like symptoms on taro and tannia plants from East Africa.
Figure 3. Phylogenetic analysis based on amino acid sequences of the core CP-coding region of selected DsMV isolates.

List of Tables

Chapter 3
Table 1. Comparison of the nucleotide and amino acid sequences of the putative coding and non-coding regions of ZaMMV-AU and ZaMMV-TW.

Chapter 4
Table 1. Summary of badnavirus PCR screening and samples used for initial sequence analysis.
Table 2. Summary of the genomic features of TaBV and TaBCHV isolates from East Africa.
Table 3. Pairwise sequence comparisons of TaBCHV isolates using core 529 nt RT/RNase H-coding sequences.

Chapter 5
Table 1. Sampling locations and results of PCR testing for TaBV in taro leaf samples.

Chapter 6
Table 1. Next generation sequencing data from Xanthosoma sp. samples collected from Uganda.
Table 2. Name, subgroup, country of origin and accession numbers of CMV sequences from NCBI database used in the analysis.

Chapter 7
Table 1. Primers used for virus detection with RT-PCR.
Table 2. Summary of PCR and RT-PCR screening results for viruses infecting taro and tannia samples in this study.

List of Abbreviations

aa              amino acid
AAS             Australia Awards Scholarship
bp              base pair/s
BecA–ILRI Hub   Biosciences eastern and central Africa–International Livestock Research Institute Hub
BLAST           basic local alignment search tool
cDNA            complementary DNA
CTAB            cetyl trimethyl ammonium bromide
CTCB            Centre for Tropical Crops and Biocommodities
DB-PCR          direct-binding polymerase chain reaction
DNA             deoxyribonucleic acid
ds              double-stranded
EIAR            Ethiopian Institute of Agricultural Research
ELISA           enzyme-linked immunosorbent assay
g               gravity
gfp             green fluorescent protein
ha              hectare
Hz              hertz
IC-PCR          immuno-capture polymerase chain reaction
ICTV            International Committee on Taxonomy of Viruses
IR              intergenic region
kbp             kilobase pair/s
kDa             kilodalton/s
min             minute/s
ml              millilitre
NARS            National Agricultural Research Systems
NCBI            National Centre for Biotechnology Information
ng              nanogram
NGS             Next Generation Sequencing
nm              nanometre/s
nt              nucleotide/s
nptII           neomycin phosphotransferase II
ORF             open reading frame
PBS-T           phosphate buffered saline with Tween-20
PCR             polymerase chain reaction
pH              -log (hydrogen ion concentration)
pmol            picomole/s
RACE            rapid amplification of cDNA ends
RCA             rolling circle amplification
RNA             ribonucleic acid
RNase H         ribonuclease H
RT              reverse transcriptase
RT-PCR          reverse transcription polymerase chain reaction
s               second/s
SEF             Science and Engineering Faculty
sp.             species
t               ton/s
QUT             Queensland University of Technology
UTR             untranslated region
V               volt/s
µl              microlitre/s
µg              microgram
°C              degrees Celsius

Statement of Original Authorship

I certify that this thesis is my own work and contains no material which has been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person, except where due reference is made.

QUT Verified Signature

Signature

Date

Acknowledgments

I am deeply thankful to my wife Abigail for her support, understanding and patience throughout my study. I am so sorry for keeping you up late while I stayed longer in the lab. Those extra 5 minutes in the lab are what made this possible.

I would also like to express my deepest gratitude to my supervisors Rob Harding, Anthony James, James Dale and Adane Abraham for their unconditional support, guidance and encouragement. Rob and AJ, your dedication, hard work and meticulousness made me go the extra mile, read more, think more and, of course, pipette more and write more but, in the end, you crafted me very well. I say this without forgetting all the celebrations we had for every small success; thank you very much.

Ben, ‘Science Faculty’, I don’t know how to express my deepest gratitude to you. Your advice, support and humour made me pass all the challenges and cloudy days I faced - you are a real friend to depend on and a real genius.

I am very thankful to my friend Amit for suggestions, sharing frustrations and celebrating every small success along the way (Uni pub should also take some credit for that). I am glad to have a friend whom I can call a brother.

I am also very thankful to Timothy Holton and his family, for the love they showed me and for his support and guidance at the beginning of this project and during the pilot study which has paved my path.

To everyone who helped me during sample collection, Mengistu, Demelw, Stephen, Paul, Abigail, Ndungu, Margaret, Julius, Castro, and Kwame, thank you very much for the care you showed me during my visit, for sharing the hard work of sample collection and for making my life easier, especially with translations. I would have come out empty-handed from all my surveys without your kind assistance. I am also very thankful to all the farmers in all the countries for allowing me to inspect their farms and collect samples.

I am greatly indebted for all the support, encouragement and love I received from all the wonderful students and staff at CTCB, with special thanks to Dani, JY, Saga and CTCB admin. I am also thankful to all my colleagues from Holetta National Agricultural Biotechnology Laboratory, Ethiopia, for all the support you gave me, and especially Melaku for taking care of all my official communications.

From CSSF, Jennifer and Anne, thank you very much for the excellent job you are doing.

To my friends Zola and Abdulwahab, thank you very much for all the advice and the strength you built inside me; it all adds up to this.

ABCF program and fellows, capacity building team particularly Appolinaire, Ekaya, Francesca, Val, Joyce, Dedan, Marvin, all research assistants, and all staff at BecA–ILRI Hub, I really appreciate your kind support and encouragement.

I am very grateful to the Australia Awards Scholarship, Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Ethiopian Institute of Agricultural Research and Biosciences eastern and central Africa for sponsoring this study; without you this would not have been possible. I also wish to thank international student services at QUT for their kind support and encouragement along the way.

To my family, words cannot express how grateful I am for all your sacrifices, support and encouragement and, above all, for allowing me to follow my heart and making me a confident person. Special thanks to my brothers and sister for all your unreserved support, especially during those challenging times of our life. This thesis is dedicated to my parents and grandparents.


Chapter 1

Introduction

This thesis is presented in ‘Thesis by Publication’ style containing a comprehensive literature review section (Chapter 2) followed by five results chapters (Chapters 3 to 7) and a general discussion chapter (Chapter 8). Of the five results chapters, Chapters 3 and 5 have been published in the journal Archives of Virology. Chapters 4 and 6 have been submitted for publication, while Chapter 7 has been formatted for submission to the journal Annals of Applied Biology. Therefore, the presentation of the results chapters follows the formatting style of the target journals.

Description of the scientific problem investigated

Taro (Colocasia esculenta (L.)) and other edible aroids, such as tannia (Xanthosoma sp.), are among the most important crops cultivated by small-holder farmers in East Africa. The production of taro in Ethiopia, as well as Kenya, Uganda and Tanzania, has declined significantly in recent times due to a lack of improved planting materials and the occurrence of weeds, pests and diseases. In addition, aroids are receiving less attention from both national and regional agricultural research institutes in terms of research and development activities. A pilot study in 2013 to identify taro viruses in Ethiopia and Kenya confirmed the presence of the potyvirus Dasheen mosaic virus (DsMV) and the badnavirus Taro bacilliform virus (TaBV). Apart from this study, the incidence, distribution and genome organisation of viruses infecting taro and other edible aroids in the region were unknown at the commencement of this PhD project. As the threat of viral diseases to this economically important crop warrants urgent attention, the identification of viruses affecting taro production throughout the region was considered a research priority. Therefore, to address the lack of knowledge on the incidence and distribution of viral diseases of taro and other edible aroids in East Africa, and to establish a capacity for virus-indexing of aroids in the region, the current PhD project was initiated. This project has established a baseline for knowledge on the occurrence and distribution of viruses infecting taro, and the related crop tannia, in East Africa and will contribute towards taro disease management both within the region and worldwide.

General objectives of the study

The general objective of this study was to identify, characterise and determine the distribution of economically important viruses infecting taro and other edible aroids in East Africa.

Specific aims of the study

The specific aims of this project were to (i) conduct extensive surveys in four East African countries and determine the incidence and distribution of known DNA and RNA viruses infecting taro and other edible aroids in East Africa, and (ii) characterise, at the molecular level, the viruses detected.

Account of scientific progress linking the scientific papers

During the initial work at QUT to develop/optimise assays for the detection of potyviruses, a leaf sample was collected from an Alocasia plant (member of the Araceae family) growing near Brisbane which showed symptoms typical of DsMV. The sample tested positive by PCR and further characterisation showed that it was an isolate of Zantedeschia mild mosaic virus (ZaMMV), another species in the genus Potyvirus. As this was the first report of ZaMMV from Australia, as well as from an Alocasia sp., the complete genome sequence of this novel isolate was determined and analysed. These results are presented in Chapter 3.


Chapter 4 describes the occurrence, distribution and molecular characterisation of two distinct members of the genus Badnavirus, TaBV and Taro bacilliform CH virus (TaBCHV), in East Africa. This was the first comprehensive study covering the four countries in the region (Ethiopia, Kenya, Tanzania and Uganda) with 392 samples collected from 25 districts. The results showed that badnaviruses are widespread in East Africa, but no symptoms were consistently associated with infections.

There are no reports on the host range of TaBV or yield losses due to infection. Infectious clones of plant viruses are a convenient way to undertake such studies and, therefore, Chapter 5 describes the development of the first infectious clone of TaBV. Due to strict biosecurity regulations in Australia, it was not possible to develop an infectious clone for an African TaBV isolate. Therefore, an Australian TaBV isolate was identified for use as a model system and its complete genome sequence was determined. Taro plants were inoculated with the TaBV infectious clone and some leaves displayed mild downward-curling, a symptom sometimes observed on taro plants in the field. The infectious clone will be useful in screening aroid germplasm for resistance and also for investigations into host range and yield.

The remaining two chapters mainly involved work to characterise RNA viruses infecting aroids in East Africa. During surveys in Uganda, three tannia samples showing symptoms usually associated with DsMV infection were collected. The samples tested negative for potyviruses but were subsequently found to be infected with cucumber mosaic virus (CMV) following RNAseq Next Generation Sequencing. Sequence analysis revealed the first subgroup-IB isolate of CMV from sub-Saharan Africa and also that the RNA2 encoded a putative novel ORF. The results are presented in Chapter 6.

The final results chapter (Chapter 7) summarises the findings of the field surveys, with a particular emphasis on RNA viruses. The incidence and distribution of DsMV, CMV and rhabdoviruses are presented in addition to any correlations observed between virus infection and symptoms.


This study is the first to comprehensively assess the occurrence, incidence and sequence diversity of taro and tannia viruses in East Africa. Sequence information has been deposited in the National Centre for Biotechnology Information (NCBI) GenBank database and a collection of the samples is stored at the BecA–ILRI Hub laboratory in Nairobi, Kenya, for future analysis if needed.


Chapter 2

Literature Review

2.1 Taro

Taro (Colocasia esculenta (L.) Schott) belongs to the Araceae family (Vaneker & Slaats, 2012) which comprises a diverse range of plants commonly called aroids. Taro originated in south-east or south-central Asia and is believed to have been first domesticated in northern India (Kantaka, 2004; Wilson & Siemonsma, 1996). Aroids are the world’s oldest food crops, being utilised even before the domestication of wheat and rice. They are among the six most important root and tuber crops, and rank fourteenth among staple vegetable crops (Vaneker & Slaats, 2012; Kantaka, 2004). Archaeological evidence from the Solomon Islands suggests that taro was being propagated around 28,700 years ago and it was introduced to Egypt and East Africa at least 2,000 years ago (Vaneker & Slaats, 2012; Kantaka, 2004). The five most cultivated aroids used as food are taro (Colocasia esculenta (L.) Schott), tannia (Xanthosoma sagittifolium L.), elephant ear (Alocasia spp.), elephant foot yam (Amorphophallus paeoniifolius Dennst (Nicolson)) and swamp taro (Cyrtosperma merkusii Hassk (Schott)).

Taro is an erect, herbaceous perennial plant but most often it is grown as an annual crop (Kantaka, 2004; Wilson & Siemonsma, 1996). It performs best in the tropics and tolerates a wide range of environments and agricultural practices (Kantaka, 2004). It is tolerant to drought and low temperatures and can also be cultivated on dry land or under flooded conditions. In addition, it is tolerant to shade, making it suitable for intercropping in agroforestry systems (Wilson & Siemonsma, 1996). In the wet tropics, aroids can be cultivated throughout the year. Rainfall of between 200 and 300 mm/month is ideal for optimum growth and production. However, irrigation is necessary for taro and swamp taro in low rainfall areas, while tannia, elephant ear and elephant foot yam are more drought tolerant. The time needed to reach maturity varies according to species/variety, temperature, sunlight and water availability (Lebot, 2009). Under ideal agronomic practices, taro can give yields of up to 60–110 t/ha (Lebot, 2009).

In 2012, worldwide production of taro was 9.98 million metric tons from a total of 1.32 million hectares of land, with Africa accounting for 7.36 million metric tons (FAOSTAT, 2014). Nigeria is the world’s largest producer of taro with a total production of 3.45 million metric tons in 2012, followed by China, Cameroon and Ghana (FAOSTAT, 2014).
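The production figures above imply a mean global yield and regional shares that can be checked with a short calculation. The following Python sketch is only an illustration: the numbers are those quoted from FAOSTAT (2014) in the text, and all variable names are assumptions introduced for the example.

```python
# Figures quoted in the text from FAOSTAT (2014), for the year 2012.
# Variable names are illustrative and not taken from any FAOSTAT schema.
world_production_mt = 9.98    # million metric tons of taro, worldwide
world_area_mha = 1.32         # million hectares under taro, worldwide
africa_production_mt = 7.36   # million metric tons, Africa
nigeria_production_mt = 3.45  # million metric tons, Nigeria

# Mean global yield: million tons / million hectares = tons per hectare.
mean_yield_t_per_ha = world_production_mt / world_area_mha

# Shares of world production, as percentages.
africa_share_percent = 100.0 * africa_production_mt / world_production_mt
nigeria_share_percent = 100.0 * nigeria_production_mt / world_production_mt

print(f"Mean global yield: {mean_yield_t_per_ha:.2f} t/ha")
print(f"Africa share of world production: {africa_share_percent:.1f} %")
print(f"Nigeria share of world production: {nigeria_share_percent:.1f} %")
```

On these figures the mean global yield is roughly 7.6 t/ha, far below the 60–110 t/ha reported as attainable under ideal agronomic practices (Lebot, 2009).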

The corms and leaves of taro are very rich sources of easily digestible starch and dietary fibre. They also contain substantial amounts of protein, vitamin C, thiamine, riboflavin, niacin, β-carotene, iron and folic acid (Ndabikunze et al., 2011; Tumuhimbise et al., 2009). The corm can be sliced and fried into chips and is used in the preparation of soups, beverages and puddings. The starch is used in baby foods and as a cereal substitute. In Hawaii, the corms are processed into flour and used for biscuits and bread. The leaves are eaten as leafy vegetables and pot-herbs for soups and sauces (Wilson & Siemonsma, 1996). Although the medicinal value of taro corm or leaf has not been studied in detail, in different parts of the world people use taro corm and/or leaf to treat snakebites, rheumatism, arterial hypertension, liver infection and ulcers (Wilson & Siemonsma, 1996).

2.2 Taro in East Africa

Taro plays a significant social, cultural and economic role for most small-scale farmers in East Africa (Akwee et al., 2015; Onwueme and Charles, 1994; Talwana et al., 2009). Reports indicate that taro and other edible aroids were introduced into the African continent at different times from different sources. The first introduction of taro to East Africa is believed to have occurred at least 2,000 years ago, to Egypt via Arabia (Plucknett et al., 1970; Bown, 2000; Kantaka, 2004). In addition, tannia (Xanthosoma sp.) was introduced to Central and West Africa between the 16th and 17th centuries by the Portuguese (Bown, 2000).

In the south and south-western part of Ethiopia around 20 million people depend on root crops such as potato, sweet potato, taro and enset for their dietary intake, during both surplus and poor harvest years (Mariame and Gelmesa, 2006; Beyene, 2013; Harrison et al., 2014). Taro (locally called ‘godere’) (Figure 1A, B) and enset are propagated mainly because they are known to perform well in drought-prone areas where the annual rainfall is too low to support the production of other crops (Harrison et al., 2014). In Sheka (a town in the southwest of Ethiopia), taro remains important, since it is available throughout the year (Mariame and Gelmesa, 2006).

In Kenya, taro, also known locally as ‘arrowroot’, and tannia are a basic source of starch in the diet for many communities in the Mount Kenya and Aberdares districts of central Kenya, as well as in the Lake Victoria basin districts of Kakamega, Kisumu and Siaya, where it is mainly cultivated adjacent to streams and rivers (Akwee et al., 2015; Figure 1C, D, E). In Tanzania and Uganda, taro and tannia are mainly grown along the Lake Victoria basin, including Bukoba, Musoma, Tarime, Biharamulo and Mwanza districts in Tanzania and the Mitiyana, Masaka, Jinja, Iganga and Luuka districts in Uganda (Talwana et al., 2009; Ndabikunze et al., 2011).

In Ethiopia in the fiscal years 2009/10, 2010/11 and 2011/12, the average taro yield was 7.77, 8.03 and 7.94 t/ha, respectively (CSA, 2010; CSA, 2011; CSA, 2012). In the years 2007, 2008 and 2009, the average taro yield in Kenya was 7.70, 7.49 and 9.62 t/ha, respectively (CPPMU, 2010). In Uganda and Tanzania, the average annual yield is less than 1 t/ha (Tumuhimbise et al., 2009; Talwana et al., 2009).

Figure 1. Taro production and use in Ethiopia and Kenya. (A) Taro plantation at Areka Agricultural Research Centre, Ethiopia, (B) Local taro market in Welayita, Ethiopia, (C) Taro and tomatoes in a supermarket in Nairobi, Kenya, (D) Boiled taro, a typical breakfast in Kenya, (E) Taro leaf vegetable.

2.3 Factors affecting the production of taro

Several pests and diseases are known to cause significant yield reduction in taro, with different insects, snails and nematodes among the pests (Lebot, 2009). There are also some reports of abiotic stresses caused by nutrient deficiency, temperature and water shortage affecting the production of taro (Carmichael et al., 2008; Zettler, 1989; Ooka, 1990). Numerous viral, bacterial and fungal pathogens are also known to infect taro and result in significant production loss in terms of quantity and quality (Zettler, 1989; Revill et al., 2005a). Taro leaf blight caused by the Oomycete, Phytophthora colocasiae, is a disease of major importance in many regions of the world where taro is grown (Sharma et al., 2009; Singh et al., 2012). Bacterial soft rot and bacterial leaf spot are among the most economically important bacterial diseases of taro (Carmichael et al., 2008; Ooka, 1990). Viruses are one of the most important pathogens affecting taro and, since the focus of this PhD study is on viruses of taro, they are discussed in more detail in section 2.5.

2.4 Production constraints of taro in East Africa

Although taro has significant social, cultural and economic importance for most small-scale farmers in East Africa, average yields are below the potential of the crop due to various constraints including diminishing soil fertility, unavailability of improved varieties, competition from weeds and the presence of pests and diseases (Akwee et al., 2015; Talwana et al., 2009; Tumuhimbise et al., 2009). In Africa, particularly Eastern Africa, the problem of low taro yields is compounded by a lack of research and extension efforts to support the production, utilisation and consumption of the crop (Akwee et al., 2015; Ndabikunze et al., 2011; Talwana et al., 2009). Consequently, production of taro in East Africa is lagging behind that of other root and tuber crops (Tumuhimbise et al., 2009). A pilot study on taro viruses in Ethiopia and Kenya conducted by Kidanemariam et al. (2018) confirmed the presence of dasheen mosaic virus (DsMV), taro bacilliform virus (TaBV) and possibly two other viruses. Aside from this study, there is no other information regarding taro viruses in the region.

2.5 Viral diseases of taro

Viruses are among the most economically important pathogens of taro and infection can result in significant yield losses (Revill et al., 2005a). Moreover, the presence of taro viruses restricts the international movement of germplasm, which has a serious impact on its accessibility and production (Revill et al., 2005a). Until relatively recently, studies on taro viruses have been limited to a number of Pacific Island countries and all diagnostic tests have been developed using viruses identified from this region (Yang et al., 2003a, b; Pearson et al., 1999; Revill et al., 2005a, b).

2.5.1 Dasheen mosaic virus (DsMV)

DsMV is one of the most important viruses known to infect both edible and ornamental aroids worldwide (Elliott et al., 1997). The virus was first reported in 1970 from Florida, USA and subsequently assigned to the family Potyviridae, genus Potyvirus (Zettler et al., 1970). DsMV is transmitted in a non-persistent manner by several aphid species including Myzus persicae and Aphis gossypii, and it can also be transmitted by vegetative propagation or mechanically with infected plant sap (Babu et al., 2011; Elliott et al., 1997; Nelson, 2008). The virus has a natural host range of at least 16 genera from both edible and ornamental members of the Araceae family including Cyrtosperma and Alocasia (Elliott et al., 1997). DsMV infection typically results in characteristic feathery-mottle and mosaic symptoms, but symptoms may vary considerably with cultivar and season (Figure 2; Elliott et al., 1997). DsMV infection is reported to affect both the quality and quantity of the corm, with production losses ranging from 20 – 60 % (Rana et al., 1983; Elliott et al., 1997).

Figure 2. The typical feathery-mottle and mosaic symptoms associated with DsMV infection. (A) Taro (Colocasia esculenta), (B) tannia (Xanthosoma sagittifolium) (Nelson, 2008).

DsMV has filamentous virions ∼750 nm long and 11-15 nm in diameter (Figure 3A). The genome is a monopartite molecule of single-stranded (ss), positive-sense RNA of ∼10 kb, which consists of 5' and 3' untranslated regions (UTRs) flanking a single major ORF, with the 3' UTR terminating in a poly-A tail (Hull, 2014; King et al., 2012; Cuevas et al., 2012; Adams et al., 2005; Ha et al., 2008a). The ORF is translated into a large polyprotein which is subsequently processed into ten functional proteins by the action of several viral-encoded proteinases (Hull, 2014; King et al., 2012). The ten functional proteins, in order from 5' to 3', are P1 (first protein), HC-Pro (helper component protease), P3 (third protein), 6K1, CI (cylindrical inclusion protein), 6K2, VPg (viral protein genome-linked), NIa-Pro (major protease of the small nuclear inclusion protein, NIa), NIb (large nuclear inclusion protein) and CP (coat protein) (Figure 3B; Hull, 2014; Cuevas et al., 2012; Adams et al., 2005). The currently accepted criteria for distinguishing virus species within the family Potyviridae are based on genome sequence relatedness. Different species have an amino acid (aa) sequence identity of less than 80 % in the CP-coding region and/or a nucleotide (nt) sequence identity of less than 76 % over the entire genome. In addition, differences in host range and host reaction, antigenic properties and the morphology of inclusion bodies can be considered as criteria for demarcation (King et al., 2012).
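The sequence-based demarcation criteria above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned to equal length by an external tool, and the function names (`percent_identity`, `distinct_potyvirus_species`) are hypothetical helpers introduced here for illustration only. The biological criteria (host range, antigenic properties, inclusion body morphology) are not modelled.

```python
# Thresholds as stated in the text (King et al., 2012).
CP_AA_THRESHOLD = 80.0       # % amino acid identity in the CP-coding region
GENOME_NT_THRESHOLD = 76.0   # % nucleotide identity over the entire genome


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def distinct_potyvirus_species(cp_aa_identity, genome_nt_identity=None):
    """Apply the 'and/or' rule: either criterion alone indicates distinct species."""
    if cp_aa_identity < CP_AA_THRESHOLD:
        return True
    # The whole-genome criterion is only checked when a value is supplied.
    return genome_nt_identity is not None and genome_nt_identity < GENOME_NT_THRESHOLD
```

For example, two isolates sharing 85 % CP aa identity but only 70 % genome-wide nt identity would be flagged as distinct species by `distinct_potyvirus_species(85.0, 70.0)`.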

Symptomatology, serology and molecular approaches have been used for the detection of DsMV (Abo El-Nil et al., 1977; Nelson, 2008; Babu and Hegde, 2014). However, due to their high sensitivity, molecular techniques are the preferred method. Several published degenerate and virus-specific primers targeting the most conserved regions of potyviruses or DsMV, including CP, CI and NIb, are available (Revill et al., 2005a; Ha et al., 2008b; Zheng et al., 2010). Furthermore, several cultural, agronomic and biotechnological approaches have been used to control DsMV infection in taro (Zettler and Hartman, 1986; Shaw et al., 1979). Successful elimination of DsMV from taro plants has been achieved by tissue culture using 0.5 mm meristem tips (Zettler and Hartman, 1987; Zettler et al., 1989).

Figure 3. Virions and genome organisation of DsMV. (A) Negatively stained flexuous rod-shaped particles of DsMV (Zettler et al., 1970); (B) Schematic representation of Potyvirus genome (Cuevas et al., 2012). The ten functional proteins represented. P1: first protein, HC-Pro: helper component protease, P3: third protein, 6K1, CI: cylindrical inclusion protein, 6K2, VPg: viral protein genome-linked, NIa: major protease of small nuclear inclusion protein, NIb: large nuclear inclusion protein, and CP: coat protein.

2.5.2 Badnaviruses

Badnaviruses are plant pararetroviruses in the family Caulimoviridae, genus Badnavirus (Geering and Hull, 2012; Geering, 2014; Bhat et al., 2016; Bömer et al., 2017). The genus Badnavirus is the most diverse and heterogeneous member of the family Caulimoviridae at both the genomic and antigenic level. Currently, it comprises more than forty distinct recognised species (https://talk.ictvonline.org/taxonomy/), the majority of which infect a broad range of economically important tropical and subtropical crops worldwide including banana, yam, taro, sugarcane, black pepper, citrus and cacao, with some reports also from temperate regions in hosts such as raspberry, gooseberry and ornamental spiraea (Bhat et al., 2016; Yang et al., 2003a; Iskra-Caruana et al., 2014). Economic losses of an estimated 10-90 % have been recorded in various crops as a result of infection by different species of badnaviruses (Bhat et al., 2016). Currently, two distinct species of badnavirus have been reported to infect taro, namely TaBV (Yang et al., 2003a, b) and taro bacilliform CH virus (TaBCHV) (Ming et al., 2013; Kazmi et al., 2015; Geering and Teycheney, 2016).

TaBV is a bacilliform-shaped virus, which has virions of 130 x 30 nm (Figure 4) and a circular, double-stranded (ds) DNA genome of ∼7.5 kbp (James et al., 1973; Bhat et al., 2016; King et al., 2012). The genome of TaBV possesses four ORFs, all encoded on the plus-strand of the viral DNA, with the size and organisation of ORFs 1-3 consistent with most badnaviruses (Figure 5A; Yang et al., 2003a). ORFs 1 and 2 of TaBV encode proteins of 16.67 and 15.78 kDa, respectively. The function of the protein encoded by ORF 1 is unknown, whereas the protein encoded by ORF 2 has nonspecific DNA and RNA binding activity and may be involved in virion assembly (Jacquot et al., 1996). ORF 3 encodes a large polyprotein (214.34 kDa) which contains motifs that are conserved amongst badnaviruses including movement protein (MP), coat protein (CP), aspartic protease (AP), reverse transcriptase (RT) and ribonuclease H (RNase H) (Yang et al., 2003a, b; Hull, 2014). ORF 4, which overlaps with ORF 3, encodes a small protein (∼13.1 kDa) of unknown function (Figure 5A; Yang et al., 2003b).

Figure 4. Electron micrograph showing bacilliform-shaped badnavirus particles partially purified from taro leaves. (James et al., 1973).

[Figure 5 schematic, panels A and B: linear genome maps with a 1000-7000 nt scale. (A) ORFs 1-4; (B) ORFs 1-6. Labelled features: met tRNA, TATA, PolyA, and the ORF 3 domains MP, CP, Zn, AP, RT and RNase H.]

Figure 5. Linearised schematic representation of the genome organisation of TaBV and TaBCHV. (A) TaBV, (B) TaBCHV. Functional proteins encoded by ORF 3 are represented. MP: movement protein, CP: coat protein, Zn: zinc finger-like domains, AP: aspartic protease, RT: reverse transcriptase, RNase H: ribonuclease H.

In contrast to TaBV, TaBCHV encodes six putative ORFs, with ORFs 1-4 analogous to TaBV and an additional two small ORFs at the 3' end of ORF 3 (Figure 5A, B; Kazmi et al., 2015). ORF 5 partially overlaps ORF 3, while ORF 6 is downstream of, and partially overlaps, the 3' end of ORF 5 (Figure 5B; Kazmi et al., 2015).

According to the International Committee on Taxonomy of Viruses (ICTV), the criterion for demarcation of species in the genus Badnavirus is a threshold of 20 % nucleotide divergence in the RT/RNase H-coding region of ORF 3 (King et al., 2012). The current genetic diversity of badnaviruses appears to be structured into three major clades. Interestingly, however, Bougainvillea spectabilis chlorotic vein-banding virus (BCVBV) and TaBV isolates group as an additional clade which appears as an outgroup (Iskra-Caruana et al., 2014).
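As an illustration of how the 20 % divergence threshold might be applied to a set of RT/RNase H sequences, the following Python sketch computes pairwise nucleotide divergence and reports the pairs that exceed the threshold. It assumes the sequences are already aligned to equal length, treats divergence as the simple proportion of differing positions (no evolutionary model), and interprets the threshold as "more than 20 %". The function names and the toy sequences are hypothetical and are not real isolate data.

```python
from itertools import combinations

# Species demarcation threshold for the RT/RNase H region (King et al., 2012).
SPECIES_DIVERGENCE_THRESHOLD = 20.0  # percent


def nt_divergence(a: str, b: str) -> float:
    """Percent of differing positions between two pre-aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


def flag_distinct_pairs(aligned: dict) -> dict:
    """Map each pair of names whose divergence exceeds the threshold to that divergence."""
    return {
        (n1, n2): d
        for (n1, s1), (n2, s2) in combinations(aligned.items(), 2)
        if (d := nt_divergence(s1, s2)) > SPECIES_DIVERGENCE_THRESHOLD
    }


# Toy 10-nt "alignment" used purely to exercise the functions.
toy = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAA",
    "isolate_3": "ACGTTTTTAC",
}
print(flag_distinct_pairs(toy))
```

In practice, sequences of several hundred nucleotides from a proper multiple alignment would be used. The toy input only shows that isolate_1 and isolate_2 (10 % divergence) fall below the threshold, whereas both differ from isolate_3 by more than 20 %.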

TaBV appears to infect plants without causing symptoms or to cause only mild symptoms such as vein clearing, stunting and down-curling of the leaf blades (Bhat et al., 2016; Revill et al., 2005a; Yang et al., 2003a). Co-infection with colocasia bobone disease-associated virus (CBDaV), a putative rhabdovirus, is thought to act synergistically and result in the lethal disease ‘alomae’, which is the most economically important virus disease affecting taro (Higgins et al., 2016; Revill et al., 2005a; Macanawai et al., 2005). TaBV has a natural host range restricted to aroids. The virus can be transmitted by mealybugs (Pseudococcus longispinus), seed or pollen but it is not mechanically transmissible (Macanawai et al., 2005).

All members of the family Caulimoviridae are pararetroviruses. Therefore, at least one part of the viral replication occurs in the nucleus where the viral DNA genome is transcribed from minichromosomes formed by an association with histones (Iskra-Caruana et al., 2014; Hull, 2014). This likely facilitates the random integration of viral DNA into the host genome by illegitimate recombination or during repair of DNA breaks which contributes to the diversity and evolution of badnaviruses (Iskra-Caruana et al., 2014; Holmes, 2011). Integrated viral sequences of badnavirus are also known as endogenous badnaviruses (Holmes, 2011).

Various molecular and serological diagnostic tools have been developed for the detection of badnaviruses (Yang et al., 2003b; Harper et al., 1999; Sukal et al., 2017; James et al., 2011a; Bömer et al., 2016; Bömer et al., 2017). The usefulness of immuno-capture PCR (IC-PCR), direct-binding PCR (DB-PCR), immuno-sorbent electron microscopy (ISEM) and ELISA has been limited by the high serological variability of badnaviruses (Harper et al., 1999; Mulholland, 2005; Le Provost et al., 2006; Geering and Hull, 2012). In addition, because viral DNA can be integrated into the host genome, PCR tests can give false positive amplifications, as has been observed in banana during the detection of banana streak virus (Geering et al., 2005; James et al., 2011b). Recently, rolling circle amplification (RCA) techniques have been optimised for the selective detection and amplification of different episomal badnavirus DNAs using bacteriophage Phi29 DNA polymerase (James et al., 2011a, b; Bömer et al., 2016; Sukal et al., 2017). RCA is a non-sequence-specific method for the amplification of circular DNA molecules and has been used successfully to amplify plant viruses in all three families with circular DNA genomes (Caulimoviridae, Geminiviridae and Nanoviridae). To amplify episomal virus DNA, isothermal amplification is carried out at 30 °C for 18 hours, followed by digestion of the products with a restriction endonuclease and visualisation of the digested fragments by agarose gel electrophoresis. Digested reaction products can subsequently be cloned and sequenced (Sukal et al., 2017; Johne et al., 2009; James et al., 2011a).
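The fragment pattern expected on a gel after restriction digestion can be anticipated in silico. Because RCA products are tandem repeats of the circular template, a complete digest yields the same fragment sizes as cutting the circular genome itself; an enzyme that cuts once per genome therefore releases a single genome-length band. The Python sketch below is a simplified illustration under stated assumptions: the recognition site is palindromic (so only one strand needs to be searched), digestion is complete, and the function name and test sequences are hypothetical rather than taken from any real badnavirus genome.

```python
def circular_digest_fragments(genome: str, site: str) -> list[int]:
    """Fragment lengths (largest first) from cutting a circular sequence at every `site`.

    Returns an empty list when the site is absent, i.e. the circle stays uncut.
    """
    genome, site = genome.upper(), site.upper()
    n = len(genome)
    if not site or len(site) > n:
        raise ValueError("site must be non-empty and no longer than the genome")
    # Append the first len(site) - 1 bases so sites spanning the origin are found.
    extended = genome + genome[: len(site) - 1]
    cuts = [i for i in range(n) if extended.startswith(site, i)]
    if not cuts:
        return []
    # Distance from each cut to the next one, wrapping around the circle.
    # A single cut gives a distance of 0 modulo n, which means one genome-length fragment.
    fragments = [
        (cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n for k in range(len(cuts))
    ]
    return sorted(fragments, reverse=True)


# Hypothetical 20-nt circle with two GAATTC sites.
print(circular_digest_fragments("AAGAATTCAAAAAAGAATTC", "GAATTC"))
```

The exact cut position within the recognition site shifts every fragment boundary by the same offset, so it does not change the fragment lengths on a circular molecule.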

2.5.3 Taro vein chlorosis virus (TaVCV)

TaVCV is an enveloped, bullet-shaped virus in the family Rhabdoviridae, genus Nucleorhabdovirus, with virions of ∼210 x 70 nm (Revill et al., 2005b). The genome of TaVCV comprises a molecule of single-stranded, negative-sense RNA of ∼12 kb and has six open reading frames (Hull, 2014; Revill et al., 2005b). Three of the six encoded proteins, namely the nucleocapsid protein (N), phosphoprotein (P) and RNA-dependent RNA polymerase (L), are associated with the RNA in the virion (Hull, 2014). The glycoprotein (G) associates with the matrix protein (M) to form the major structural component of the virion outer shell, while the remaining ORF encodes the movement protein (3) (Figure 6A, B; Hull, 2014; Revill et al., 2005b).

A distinct leaf-vein chlorosis near the leaf margin is a typical symptom caused by TaVCV (Pearson et al., 1999; Revill et al., 2005b; Figure 7). The virus has been reported from several South Pacific island countries as well as Hawaii (Revill et al., 2005b; Long et al., 2014). PCR based diagnostic tools have been successfully used for the detection of TaVCV (Revill et al., 2005b).

2.5.4 Colocasia bobone disease-associated virus (CBDaV)

CBDaV is an uncharacterised virus which has been classified as a putative member of the family Rhabdoviridae based on sequence analysis and the presence of characteristic, enveloped, bullet-shaped particles of ∼300 x 50 nm observed in sap extracts (Higgins et al., 2016; Pearson et al., 1999). It was previously known as taro large bacilliform virus. CBDaV infection is recognised by leaf distortion, the formation of galls on petioles and plant stunting. The virus is much more devastating when there is co-infection with TaBV. The virus has only been reported from Papua New Guinea and Solomon Islands (Higgins et al., 2016; Pearson et al., 1999; Revill et al., 2005b).

2.5.5 Taro reovirus (TaRV)

TaRV is among the more recently identified taro viruses (Revill et al., 2005a, b). It is a putative member of the family Reoviridae and genus Oryzavirus based on sequence analysis of four partial genomic segments (Revill et al., 2005a). Reoviruses have an icosahedral, double-capsid viral particle with a diameter of 75 - 80 nm (Hull, 2014; King et al., 2012). Viruses in the genus Oryzavirus have a genome comprising 10 segments of linear, double-stranded RNA (dsRNA), with segment sizes varying between 1.1 - 3.8 kbp (Hull, 2014). No symptoms have been associated with TaRV infection and the virus has only been detected in symptomless taro plants and plants infected with other viruses (Revill et al., 2005a).

[Figure 6 schematic, panels A and B. (B) Gene order: 3' leader, N, P, 3, M, G, L, trailer 5'.]

Figure 6. Virion structure and typical genome organisation of rhabdoviruses. (A) Bullet-shaped virion structure; (B) genome organisation of Taro vein chlorosis virus (King et al., 2012).

Figure 7. Typical vein chlorosis symptom associated with TaVCV infection in taro. Photo: Prof. Rob Harding.

2.5.6 Other viruses infecting taro and other aroids

Several other viruses have been reported to infect taro and other aroids. Konjac mosaic virus (KoMV) from the family Potyviridae and genus Potyvirus was reported from India infecting taro, elephant foot yam (Amorphophallus paeoniifolius), Caladium sp. and Dieffenbachia sp. (Manikonda et al., 2011; Padmavathi et al., 2013). Furthermore, the potyvirus Zantedeschia mild mosaic virus (ZaMMV) was reported infecting calla lily and Alocasia sp. from Taiwan and Australia (Huang and Chang, 2005; Huang et al., 2007; Kidanemariam et al., 2016). Wang et al. (2014) reported the first incidence of Cucumber mosaic virus (CMV), family Bromoviridae, genus Cucumovirus, infecting taro from China. In 2011, Groundnut bud necrosis virus (GBNV) from the family Bunyaviridae, genus Tospovirus, was reported infecting taro in India (Sivaprasad et al., 2011). In addition, Calla lily chlorotic spot virus (CCSV), a putative tospovirus, was reported infecting calla lily in Taiwan (Chen et al., 2012). Except for ZaMMV, which was reported from Australia infecting Alocasia sp., the reports of these viruses infecting taro and other aroids are all from Asia. In addition, apart from their occurrence and genome characterisation, the production losses or other agronomic effects associated with these viruses on aroids are as yet unknown. PCR-based techniques have been used for the detection of these viruses.

2.6 Research problem and aim

Despite the substantial contribution of taro to food and income security for many small-scale farmers in East Africa, the crop has received very low research priority within the region (Akwee et al., 2015; Talwana et al., 2009; Tumuhimbise et al., 2009). The production of taro in the region has declined significantly over time due to poor agronomic practices and various biotic and abiotic stresses (Talwana et al., 2009). Viruses are known to be one of the most important constraints to production, with some infections resulting in a severe reduction in the quantity and quality of production (Talwana et al., 2009; Revill et al., 2005b; Lebot et al., 2004). The status of taro viruses in East Africa has not been extensively studied. However, in a small pilot study conducted by Kidanemariam et al. (2018) in Ethiopia and Kenya, DsMV, TaBV and possibly two other viruses were detected. A more extensive study is now warranted in order to identify the viruses affecting taro and possibly other aroids from the region that may serve as virus reservoirs. Therefore, the aim of this project was to determine the identity and incidence of economically important viruses associated with taro, and other important aroids where possible, in East Africa.

2.7 Objectives

The specific aims of this project were to (i) conduct extensive surveys in four East African countries to determine the incidence and distribution of known DNA and RNA viruses infecting taro and other edible aroids, and (ii) characterise, at the molecular level, the viruses detected.

2.8 References

Abo El-Nil, M.M., Zettler, F.W., Hiebert, E. (1977). Purification, serology and some physical properties of dasheen mosaic virus. Phytopathol. 67:1445–1450.

Akwee, P.E., Netondo, G., Kataka, J.A. and Palapala, V.A. (2015). A critical review of the role of taro Colocasia esculenta L. (Schott) to food security: A comparative analysis of Kenya and Pacific Island taro germplasm. Scientia Agri. 9:101–108.

Adams, M., Antoniw, J. and Fauquet, C. (2005). Molecular criteria for genus and species discrimination within the family Potyviridae. Arch. Virol. 150:459–479.

Babu, B., Hegde, V., Makeshkumar, T. and Jeeva, M. (2011). Characterisation of the coat protein gene of dasheen mosaic virus infecting elephant foot yam. J. Plant Pathol. 93:199–203.

Babu, B. and Hegde, V. (2014). Molecular characterization of dasheen mosaic virus isolates infecting edible aroids in India. Acta Virologica 58:34–42.

Beyene, T.M. (2013). Morpho-agronomical characterization of taro (Colocasia esculenta) accessions in Ethiopia. SciencePG 1:1–9.

Bhat, A.I., Hohn, T. and Selvarajan, R., (2016). Badnaviruses: the current global scenario. Viruses 8:177.

Bömer, M., Turaki, A.A., Silva, G., Kumar, P. and Seal, S.E. (2016). A sequence-independent strategy for amplification and characterisation of episomal badnavirus sequences reveals three previously uncharacterised yam badnaviruses. Viruses 8:188.

Bömer, M., Rathnayake, A.I., Visendi, P., Silva, G. and Seal, S.E. (2017). Complete genome sequence of a new member of the genus Badnavirus, Dioscorea bacilliform RT virus 3, reveals the first evidence of recombination in yam badnaviruses. Arch. Virol. 163:533–538.

Bown, D. (2000). Aroids: plants of the Arum family (No. Ed. 2). Timber press.

Carmichael, A., Harding, R., Jackson, G., Kumar, S., Lal, S., Masamdu, R., Wright, J. and Clarke, A. (2008). TaroPest: an illustrated guide to pests and diseases of taro in the South Pacific. ACIAR 132:76.

Chen, T.C., Li, J.T., Lin, Y.P., Yeh, Y.C., Kang, Y.C., Huang, L.H. and Yeh, S.D. (2012). Genomic characterization of Calla lily chlorotic spot virus and design of broad-spectrum primers for detection of tospoviruses. Plant Pathol. 61:183–194.

CPPMU (Central Planning and Project Monitoring Unit), (2010). Republic of Kenya. Ministry of Agriculture. Economic review of agriculture, 2010. Accessed 29/03/2014 http://www.kilimo.go.ke/kilimo_docs/pdf/ERA_2010.pdf.

CSA (Central Statistical Agency), (2010). Federal democratic republic of Ethiopia. Central statistical agency. Agricultural sample survey 2009/2010, Report on area and production of major crops. Statistical bulletin, 4. Addis Ababa, Ethiopia. Accessed 28/03/2014 http://harvestchoice.org/sites/default/files/downloads/publications/Ethiopia_2009-0_Vol_4.pdf.

CSA (Central Statistical Agency), (2011). Federal democratic republic of Ethiopia. Central statistical agency. Agricultural sample survey 2010/2011, Report on area and production of major crops. Statistical bulletin, 1. Addis Ababa, Ethiopia. Accessed 28/03/2014 http://harvestchoice.org/sites/default/files/downloads/publications/Ethiopia_2010-1_Vol_1.pdf.

CSA (Central Statistical Agency), (2012). Federal democratic republic of Ethiopia. Central statistical agency. Agricultural sample survey 2011/2012, Report on area and production of major crops. Statistical bulletin, 1. Addis Ababa, Ethiopia. Accessed 28/03/2014 http://www.csa.gov.et/newcsaweb/images/documents/surveys/survey0/data/Doc/Report/Area%20and%20production%20report%202004.pdf.

Cuevas, J.M., Delaunay, A., Visser, J.C., Bellstedt, D.U., Jacquot, E. and Elena, S.F. (2012). Phylogeography and molecular evolution of potato virus Y. PLoS One 7:e37853.

Elliott, M.S., Zettler, F.W. and Brown, L.G. (1997). Dasheen mosaic potyvirus of edible and ornamental aroids. Plant Pathol. Circular, 384.

FAOSTAT (2014). Food and Agricultural Organization of the United Nations. Crop production. Accessed 01/04/2014 http://faostat3.fao.org/faostat-gateway/go/to/download/Q/QC/E.

Geering, A., Olszewski, N.E., Harper, G., Lockhart, B., Hull, R. and Thomas, J. (2005). Banana contains a diverse array of endogenous badnaviruses. J. Gen. Virol. 86:511–520.

Geering, A. and Hull, R. (2012). Caulimoviridae. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy, Ninth report of the International Committee on Taxonomy of Viruses pp. 429–443. Amsterdam, Elsevier.

Geering, A. (2014). Caulimoviridae (Plant Pararetroviruses). In: eLS. John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0000746.pub3

Geering, A. and Teycheney, P. (2016). Two new species in the genus Badnavirus. https://talk.ictvonline.org/files/

Ha, C., Revill, P., Harding, R.M., Vu, M. and Dale, J.L. (2008a). Identification and sequence analysis of potyviruses infecting crops in Vietnam. Arch. Virol. 153:45–60.

Ha, C., Coombs, S., Revill, P., Harding, R.M., Vu, M. and Dale, J.L. (2008b). Design and application of two novel degenerate primer pairs for the detection and complete genomic characterization of potyviruses. Arch. Virol. 153:25–36.

Harrison, J., Moore, K.A., Paszkiewicz, K., Jones, T., Grant, M.R., Ambacheew, D., Muzemil, S. and Studholme, D.J. (2014). A Draft genome sequence for Ensete ventricosum, the drought-tolerant “Tree against hunger”. Agronomy 4:13–33.

Harper, G., Dahal, G., Thottappilly, G. and Hull, R. (1999). Detection of episomal banana streak badnavirus by IC-PCR. J. Virol. Methods 79:1–8.

Higgins, C., Bejerman, N., Li, M., James, A., Dietzgen, R., Pearson, M., Revill, P. and Harding, R. (2016). Complete genome sequence of Colocasia bobone disease-associated virus, a putative cytorhabdovirus infecting taro. Arch. Virol. 161:745–748.

Holmes, E.C. (2011). The evolution of endogenous viral elements. Cell host & microbe. 10: 368–377.

Huang, C.H. and Chang, Y.C. (2005). Identification and molecular characterization of Zantedeschia mild mosaic virus, a new calla lily-infecting potyvirus. Arch. Virol. 150:1221–1230.

Huang, C.H., Hu, W.C., Yang, T.C. and Chang, Y.C. (2007). Zantedeschia mild mosaic virus, a new widespread virus in calla lily, detected by ELISA, dot‐blot hybridization and IC‐RT‐PCR. Plant Pathol. 56:183–189.

Hull, R. (2014). Plant Virology (5th ed.). UK, Elsevier.

Iskra-Caruana, M.L., Duroy, P.O., Chabannes, M. and Muller, E. (2014). The common evolutionary history of badnaviruses and banana. Infection, Genetics and Evolution 21:83–89.

Jacquot, M., Hagen, L.S., Jacquemond, M. and Yot, P. (1996). The open reading frame 2 product of cacao swollen shoot badnavirus is a nucleic acid-binding protein. Virology 225:191–195.

James, A., Geijskes, R.J., Dale, J.L. and Harding, R.M. (2011a). Development of a novel rolling-circle amplification technique to detect Banana streak virus that also discriminates between integrated and episomal virus sequences. Plant Dis. 95:57–62.

James, A., Geijskes, R.J., Dale, J.L. and Harding, R.M. (2011b). Molecular characterisation of six badnavirus species associated with leaf streak disease of banana in East Africa. Ann. Appl. Biol. 158:346–353.

James, M., Kenten, H.R. and Woods, D.R. (1973). Virus-like particles associated with two diseases of Colocasia esculenta (L.) Schott in the Solomon Islands. J. Gen. Virol. 21:145–153.

Johne, R., Müller, H., Rector, A., Van Ranst, M. and Stevens, H. (2009). Rolling-circle amplification of viral DNA genomes using phi29 polymerase. Trends in Microbiol. 17:205–211.

Kantaka, S. (2004). Colocasia esculenta (L.) Schott. Grubben, G.J.H. and Denton, O.A. (Eds). PROTA (Plant resources of Tropical Africa / Ressources vegetales de l’Afrique tropicale). Netherlands, Wageningen.

Kazmi, S.A., Yang, Z. and Hong, N. (2015). Characterization by small RNA sequencing of Taro Bacilliform CH Virus (TaBCHV), a novel Badnavirus. PLoS One 10:e0134147.

Kidanemariam, D.B., Abraham, A.D., Sukal, A.C., Holton, T.A., Dale, J.L., James, A.P. and Harding, R.M. (2016). Complete genome sequence of a novel zantedeschia mild mosaic virus isolate: the first report from Australia and from Alocasia sp. Arch. Virol. 161:1079–1082.

Kidanemariam, D.B., Macharia, M.W., Harvey, J., Holton, T., Sukal, A., James, A.P., Harding, R.M. and Abraham, A.D. (2018). First report of Dasheen mosaic virus infecting taro (Colocasia esculenta) from Ethiopia. Plant Dis. PDIS-12.

King, A.M., Adams, M.J., Lefkowitz, E.J. and Carstens, E.B. (2012). Virus taxonomy: classification and nomenclature of viruses: Ninth report of the International Committee on Taxonomy of Viruses. Amsterdam, Elsevier.

Le Provost, G., Iskra-Caruana, M.L., Acina, I. and Teycheney, P.Y. (2006). Improved detection of episomal Banana streak viruses by multiplex immunocapture PCR. J. Virol. Methods 137:7–13.

Lebot, V., Prana, M.S., Kreike, N., van Heck, H., Pardales, J., Okpul, T., Gendua, T., Thongjiem, M., Hue, H., Viet, N. and Yap, T.C. (2004). Characterisation of taro (Colocasia esculenta (L.) Schott) genetic resources in Southeast Asia and Oceania. Genetic Resources and Crop Evol. 51:381–392.

Lebot, V. (2009). Tropical root and tuber crops: cassava, sweet potato, yams and aroids. UK, MPG Biddles Ltd.

Long, M.H., Ayin, C., Li, R., Hu, J.S. and Melzer, M.J. (2014). First report of Taro vein chlorosis virus infecting taro (Colocasia esculenta) in the United States. Plant Dis. 98:1160–1160.

Macanawai, A.R., Ebenebe, A.A., Hunter, D., Devitt, L., Hafner, G. and Harding, R. (2005). Investigations into the seed and mealybug transmission of Taro bacilliform virus. Aust. Plant Pathol. 34:73–76.

Manikonda, P., Srinivas, K.P., Reddy, S., Venkata, C., Ramesh, B., Navodayam, K., Krishnaprasadji, J., Ratan, P.B. and Sreenivasulu, P. (2011). Konjac mosaic virus naturally infecting three aroid plant species in Andhra Pradesh, India. J. Phytopathol. 159:133–135.

Mariame, F. and Gelmesa, D. (2006). Review of the status of vegetable crops production and marketing in Ethiopia. Uganda J. Agri. Sci. 12:26–30.

Ming, S.F.Y., Ping, G.W., Ping, L.W., Xing, W.X. and Ni, H. (2013). Molecular identification and specific detection of badnavirus from taro grown in China. Acta Phytopathol. Sinica 6:590–595.

Mulholland, V. (2005). Immunocapture-polymerase chain reaction. Methods in Molecular Biology, Vol. 295: Immunochemical Protocols, 3rd edition pp. 281–290. New York, Humana Press.

Nelson, S.C. (2008). Dasheen mosaic of edible and ornamental aroids. Plant Dis. 44:1–9.

Ndabikunze, B.K., Talwana, H.A.L., Mongi, R.J., Issa-Zacharia, A., Serem, A.K., Palapala, V. and Nandi, J.O.M. (2011). Proximate and mineral composition of cocoyam (Colocasia esculenta L. and Xanthosoma sagittifolium L.) grown along the Lake Victoria Basin in Tanzania and Uganda. Afri. J. Food Sci. 5:248–254.

Onwueme, I.C. and Charles, W.B. (1994). Cultivation of cocoyam. In: Tropical root and tuber crops. Production, perspectives and future prospects. FAO Plant Production and Protection Paper 126, Rome. pp. 139–161.

Ooka, J.J. (1990). Taro Diseases. Accessed 17/04/2014 http://www.ctahr.hawaii.edu/oc/freepubs/pdf/RES-114-11.pdf.

Padmavathi, M., Srinivas, K., Hema, M. and Sreenivasulu, P. (2013). First report of Konjac mosaic virus in elephant foot yam (Amorphophallus paeoniifolius) from India. Aust. Plant Dis. Notes, 8:27–29.

Pearson, M., Jackson, G., Saelea, J. and Morar, S. (1999). Evidence for two rhabdoviruses in taro (Colocasia esculenta) in the Pacific region. Aust. Plant Pathol. 28:248–253.

Plucknett, D.L., Pena, R.D.L. and Obrero, F. (1970). Taro (Colocasia esculenta). In Field Crop Abstracts 23:413–426.

Rana, G.L., Vovlas, C. and Zettler, F.W. (1983). Manual transmission of dasheen mosaic virus from Richardia to nonaraceous hosts. Plant Dis. 67:1121–1122.

Revill, P., Jackson, G., Hafner, G., Yang, I., Maino, M., Dowling, M., Devitt, L., Dale, J. and Harding, R. (2005a). Incidence and distribution of viruses of taro (Colocasia esculenta) in Pacific Island countries. Aust. Plant Pathol. 35:327–331.

Revill, P., Trinh, X., Dale, J. and Harding, R. (2005b). Taro vein chlorosis virus: characterization and variability of a new nucleorhabdovirus. J. General Virol. 86:491–499.

Sharma, K., Mishra, A.K. and Misra, R.S. (2009). Identification and characterization of differentially expressed genes in the resistance reaction in taro infected with Phytophthora colocasiae. Mol. Biol. Reports 36:1291–1297.

Shaw, E.D., Plumb, R.T. and Jackson, G.V.H. (1979). Virus diseases of taro (Colocasia esculenta) and Xanthosoma spp. in Papua New Guinea. Papua New Guinea Agri. J. 30:71–97.

Singh, D., Jackson, G., Hunter, D., Fullerton, R., Lebot, V., Taylor, M., Iosefa, T., Okpul, T. and Tyson, J. (2012). Taro leaf blight: a threat to food security. Agriculture 2:182–203.

Sivaprasad, Y., Reddy, B.B., Kumar, C.N., Reddy, K.R. and Gopal, D.S. (2011). First report of groundnut bud necrosis virus infecting taro (Colocasia esculenta). Aust. Plant Dis. Notes 6:30–32.

Sukal, A., Kidanemariam, D., Dale, J., James, A. and Harding, R. (2017). Characterization of badnaviruses infecting Dioscorea spp. in the Pacific reveals two putative novel species and the first report of dioscorea bacilliform RT virus 2. Virus Research 238:29–34.

Talwana, H.A.L., Serem, A.K., Ndabikunze, B.K., Nandi, J.O.M., Tumuhimbise, R., Kaweesi, T., Chumo, E.C. and Palapala, V. (2009). Production status and prospects of cocoyam (Colocasia esculenta (L.) Schott.) in East Africa. J. Root Crops 35:98–107.

Tumuhimbise, R., Talwana, H.L., Osiru, D.S.O., Serem, A.K., Ndabikunze, B.K., Nandi, J.O.M. and Palapala, V. (2009). Growth and development of wetland-grown taro under different plant populations and seedbed types in Uganda. Afri. Crop Sci. J. 17:49–60.

Vaneker, K. and Slaats, E. (2012). AROIDS: The world’s oldest food crop. Accessed 09/03/2012 http://www.b4fn.org/fileadmin/B4FN_Docs/documents/Case_study_documents/Aroids_factsheet.pdf.

Wang, Y.F., Wang, G.P., Wang, L.P. and Hong, N. (2014). First report of Cucumber mosaic virus in taro plants in China. American Phytopathol. Society J. 98:574.

Wilson, J.E. and Siemonsma, J.S. (1996). Colocasia esculenta (L.) Schott, Record from proseabase. Flach, M. & Rumawas, F. (Editors). PROSEA (Plant resources of South-East Asia) Foundation, Bogor, Indonesia. Accessed 01/03/2014 http://www.prota4u.org/search.asp.

Yang, I.C., Hafner, G.J., Revill, P., Dale, J. and Harding, R. (2003a). Sequence diversity of South Pacific isolates of Taro bacilliform virus and the development of a PCR-based diagnostics test. Arch. Virol. 148:1957–1968.

Yang, I., Hafner, G., Dale, J. and Harding, R. (2003b). Genomic characterisation of Taro bacilliform virus. Arch. Virol. 148:937–949.

Zettler, F.W., Foxe, M.J., Hartman, R.D., Edwardson, J.R. and Christie, R.G. (1970). Filamentous viruses infecting taro and other araceous plants. Phytopathol. 60: 983–987.

Zettler, F.W. and Hartman, R.D. (1986). Dasheen mosaic virus and its control in cultivated aroids. Extension Bulletin 233, ASPAC Food fertilizer technology center, Taiwan (233).

Zettler, F.W. and Hartman, R.D. (1987). Dasheen mosaic virus as a pathogen of cultivated aroids and control of the virus by tissue culture. Plant Dis. 71:958–963.

Zettler, F.W., Jackson, G.V.H. and Frison, E.A. (eds.) (1989). FAO/IBPGR Technical guidelines for the safe movement of edible aroid germplasm. Food and Agriculture Organization of the United Nations, Rome / International Board for Plant Genetic Resources, Rome. Accessed 15/03/2014 http://www.bioversityinternational.org/uploads/tx_news/Edible_aroid_400.pdf.

Zheng, L., Rodoni, B., Gibbs, M. and Gibbs, A. (2010). A novel pair of universal primers for the detection of potyviruses. Plant Pathol. 59:211–220.

Chapter 3

Complete genome sequence of a novel Zantedeschia mild mosaic virus isolate: the first report from Australia and from Alocasia sp.

Dawit B. Kidanemariam¹,², Adane D. Abraham²*, Amit C. Sukal¹, Timothy A. Holton³, James L. Dale¹, Anthony P. James¹, Robert M. Harding¹

¹ Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, 4001, Australia
² National Agricultural Biotechnology Research Center, Ethiopian Institute of Agricultural Research, P.O. Box 2003, Addis Ababa, Ethiopia
³ Biosciences eastern and central Africa–International Livestock Research Institute (BecA–ILRI) Hub, P.O. Box 30709, Nairobi, Kenya

*Current address: Department of Biotechnology, Addis Ababa Science and Technology University. P.O. Box 16417, Addis Ababa, Ethiopia

Archives of Virology 161:1079-1082

Statement of Contribution of Co-Authors of Thesis by Publication Paper

The authors listed below have certified that:

1. They meet the criteria for authorship in that they have participated in the conception, execution, or interpretation, of at least that part of the publication in their field of expertise;
2. They take public responsibility for their part of the publication, except for the responsible author who accepts overall responsibility for the publication;
3. There are no other authors of the publication according to these criteria;
4. Potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor or publisher of journals or other publications, and (c) the head of the responsible academic unit, and
5. They agree to the use of the publication in the student’s thesis and its publication on the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter: Complete genome sequence of a novel Zantedeschia mild mosaic virus isolate: the first report from Australia and from Alocasia sp.

QUT Verified Signatures

QUT Verified Signature

RSC, Level 4, 88 Musk Ave, Kelvin Grove Qld 4059 Page 1 of 1 Current @ 20/09/2016 CRICOS No. 00213J

Abstract

The complete genome of an Australian isolate of zantedeschia mild mosaic virus (ZaMMV) causing mosaic symptoms on Alocasia sp. (designated ZaMMV-AU) was cloned and sequenced. The genome comprises 9942 nucleotides (excluding the poly-A tail) and encodes a polyprotein of 3167 amino acids. The sequence is most closely related to a previously reported ZaMMV isolate from Taiwan (ZaMMV-TW), with 82 and 86 % identity at the nucleotide and amino acid level, respectively. Unlike the amino acid sequence of ZaMMV-TW, however, ZaMMV-AU does not contain a polyglutamine stretch at the N-terminus of the coat-protein-coding region upstream of the DAG motif. This is the first report of ZaMMV from Australia and from Alocasia sp.

Zantedeschia mild mosaic virus (ZaMMV) is a positive sense, single-stranded RNA virus belonging to the genus Potyvirus, family Potyviridae [1]. The virus was first reported infecting calla lily (Zantedeschia sp.) in Taiwan in 2005 [1, 2] and has subsequently only been reported from Italy [3] and New Zealand (GenBank accession no. DQ407934). Currently, there is only a single published full-length genome sequence of ZaMMV available from Taiwan, designated ZaMMV-TW (GenBank accession no. AY626825).

In 2014, an aroid (Alocasia sp.) showing feathery mosaic symptoms typical of those caused by the potyvirus dasheen mosaic virus (DsMV) was observed at Bellthorpe, Queensland, Australia. To determine if the plant was infected with DsMV, symptomatic leaves were collected and initially tested for the presence of potyviruses by RT-PCR. Total RNA was extracted using a lithium-chloride based protocol [4], and cDNA was synthesised using M-MLV reverse transcriptase (Promega) and oligo(dT)18 primers. PCR was carried out using GoTaq-Green Master Mix (Promega) and degenerate primers designed to amplify a fragment of the CI-coding region of potyviruses [5, 6]. As a positive control, total RNA extracted from DsMV-infected taro leaves was used. An amplicon of the expected size (∼700 bp) was generated from extracts derived from both the DsMV-infected taro and Alocasia sp. samples. The amplicon from the Alocasia sp. sample was subsequently cloned and sequenced, and a BLAST search analysis of the 621-nt sequence revealed 84 % and 93 % identity to ZaMMV-TW at the nucleotide and amino acid level, respectively. As ZaMMV has not previously been reported in Australia, or in Alocasia sp., the complete genome sequence of this novel isolate (herein referred to as ZaMMV-AU) was determined.

To obtain the remainder of the virus genome, RT-PCR was carried out using degenerate primers targeting the potyviral HC-Pro-, NIb- and CP-coding regions [6, 7]. The amplicons were cloned and sequenced, and specific primers were subsequently designed in order to amplify the intervening sequences. The 5'-terminal sequence of the genome was obtained by rapid amplification of cDNA ends (RACE) using a 5'/3' RACE Kit, 2nd Generation (Roche). In all cases, amplicons were separated by electrophoresis through 1.5 % agarose gels, purified using the Freeze ‘N’ Squeeze™ DNA Gel Extraction Spin Columns (Bio-Rad) and cloned into pGEM®-T Easy Vector (Promega) following the manufacturer’s protocols. For each amplicon, at least three clones were sequenced in both directions using a Big Dye® Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific) following the manufacturer’s protocol. Sequencing data were processed and analysed using CLC Main Workbench v6.9.2 (QIAGEN) and Vector NTI Advance® Suite v11 (Invitrogen). Virus sequences were further aligned and analyzed using the ClustalW multiple alignment algorithm in BioEdit version 7.1.9 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html), and phylogenetic trees were constructed from ClustalW-aligned sequences using MEGA version 6.0.6 [8], using the neighbour-joining method and the Kimura 2-parameter model with 1000 bootstrap replications.

The complete genome sequence of ZaMMV-AU was assembled from the consensus sequences of amplicons generated using degenerate and specific primers and 5' RACE. The genome comprised 9942 nucleotides (GenBank accession no. KT729506) including the 5' UTR (198 nt) and 3' UTR (240 nt), but excluding the 3' poly-A tail. Sequence analysis identified a single putative open reading frame of 9501 nt, encoding a 3167-amino-acid polyprotein with a predicted MW of 359.14 kDa.
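The reported lengths can be cross-checked with simple arithmetic. The sketch below is illustrative only: the figures are taken from the text, while the interpretation that the 3 nt not covered by the UTRs and the 9501-nt coding sequence correspond to the stop codon is an assumption made here, not a statement from the source.

```python
# Lengths reported in the text for ZaMMV-AU (in nucleotides).
UTR5 = 198
ORF_CODING = 9501   # polyprotein-coding sequence as reported
UTR3 = 240
GENOME = 9942       # excluding the poly-A tail
STOP_CODON = 3      # assumption: the stop codon is not counted in the 9501 nt


def polyprotein_length(coding_nt: int) -> int:
    """Amino acids encoded by a coding sequence that excludes the stop codon."""
    if coding_nt % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return coding_nt // 3


def unaccounted_nt(genome: int, *parts: int) -> int:
    """Nucleotides of the genome not covered by the listed parts."""
    return genome - sum(parts)


amino_acids = polyprotein_length(ORF_CODING)
leftover = unaccounted_nt(GENOME, UTR5, ORF_CODING, UTR3)
print(amino_acids, leftover)
```

Under that assumption, the UTRs, the coding sequence and one stop codon add up to the reported genome length.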

Sequence comparison of the complete genome of ZaMMV-AU to ZaMMV-TW revealed 82 % identity, while comparison of the polyprotein-coding region revealed 79.5 % and 86.3 % identity at the nucleotide and amino acid level, respectively. The nucleotide and amino acid sequences of the putative protein-coding and non-coding regions of ZaMMV-AU and ZaMMV-TW were also compared (Table 1). These analyses revealed nucleotide sequence identities ranging from 61.3 % (5' UTR) to 88 % (3' UTR) and amino acid sequence identities ranging from 58.7 % (P1) to 100 % (6K1). Further, when the nucleotide sequence of ZaMMV-AU was compared to the partial sequences of the Italian and New Zealand ZaMMV isolates, there was 86.6 % and 80.3 % identity, respectively. Phylogenetic analysis of the complete genome sequence of ZaMMV-AU and other selected Potyviridae members showed that it groups with ZaMMV-TW within the bean common mosaic virus (BCMV) subgroup of the genus Potyvirus (Figure 1).

Analysis of the amino acid sequence revealed the presence of putative potyviral proteinase cleavage sites, which would result in cleavage of the polyprotein into ten putative mature proteins [9–11] (Figure 2). A PIPO-encoding ORF (81 amino acids), embedded within the P3 cistron, was also identified, while the presence of a DAG motif in the CP-coding region indicates that ZaMMV-AU may be aphid-transmissible. The amino acid sequence of ZaMMV-TW contains an unusual stretch of 39 glutamine residues at the N-terminus of the CP-coding region, upstream of the DAG motif, for which the function is unknown [1]. No such polyglutamine stretch is present in the amino acid sequence of ZaMMV-AU, even though this region was analyzed in sequences of 10 individual clones from two different cloning experiments. In ZaMMV-AU, this region comprises a smaller number of amino acids and is lysine rich (9/36) (Figure 3). The differences between ZaMMV-TW and ZaMMV-AU across this region raise questions about their biological significance.

Table 1. Comparison of the nucleotide and amino acid sequences of the putative coding and non-coding regions of ZaMMV-AU and ZaMMV-TW.

Region      % Nucleotide sequence identity   % Amino acid sequence identity
5' UTR      61.3                             -
P1          63.6                             58.7
HC-Pro      80.6                             90
P3          81.4                             83.4
PIPO*       85.8                             78.8
6K1         85.9                             100
CI          83.2                             93.4
6K2         81.1                             92.5
VPg         84.1                             93.7
NIa         82.7                             92.2
NIb         83.6                             93.4
CP          75.5                             78.1
3' UTR      88                               -

* Predicted from ZaMMV-TW sequence annotation

Figure 1. Phylogenetic analysis of ZaMMV-AU. Phylogenetic tree generated by the neighbour-joining method in MEGA 6 [8] using nucleotide sequences of the complete polyprotein ORF of selected potyviruses comprising the bean common mosaic virus (BCMV) subgroup and representative members of other genus Potyvirus subgroups. The tree was rooted using ryegrass mosaic virus (RGMV, NC_001814.1), the type member of the genus Rymovirus. Bootstrap values greater than 50 % are shown, and the scale bar indicates 0.1 substitutions per site. Subgroup A includes potyviruses from the BCMV subgroup, and subgroup B includes potyviruses from other subgroups. Abbreviations are BCMV (bean common mosaic virus, KC832501), BCMNV (bean common mosaic necrosis virus, AY864314), BYMV (bean yellow mosaic virus, AB439732), CABMV (cowpea aphid-borne mosaic virus, AF348210), DsMV (dasheen mosaic virus, KJ786965), KoMV (konjac mosaic virus, AB219545), PVY (potato virus Y, EF026076), SCMV (sugarcane mosaic virus, AY569692), SMV (soybean mosaic virus, KF135488), SPVG (sweet potato virus G, KF790759), SrMV (sorghum mosaic virus, KJ541740), WMV (watermelon mosaic virus, FJ823122), YMV (yam mosaic virus, NC_004752), ZaMMV-AU (zantedeschia mild mosaic virus-Australia, KT729506), ZaMMV-TW (zantedeschia mild mosaic virus-Taiwan, AY626825), ZYMV (zucchini yellow mosaic virus, AY188994-1).

Figure 2. Genome organisation of ZaMMV-AU. Predicted mature proteins and their relative position on the genome, and predicted proteinase cleavage sites of ZaMMV-AU (PIPO-encoding ORF not shown).

Figure 3. Alignment of partial amino acid sequences of the NIb-CP junction of ZaMMV and selected potyviruses from the BCMV subgroup. The polyglutamine amino acid tract present in the ZaMMV-TW isolate is underlined, the characteristic DAG motif is boxed, and the predicted cleavage site between NIb and CP-coding regions is indicated by an arrow.

According to the current species demarcation criteria for viruses within the family Potyviridae [9], members of different species are distinguished by having less than 80 % CP amino acid sequence identity and less than 76 % nucleotide sequence identity, either in the CP-coding region or over the whole genome. Based on comparisons over the whole genome, the virus sequence isolated from Alocasia sp. in this study should be considered a strain of ZaMMV. However, based on comparisons using only the CP-coding region, the reported sequence could be considered a new potyvirus. We have chosen the whole-genome comparison as the criterion for classification due to the presence of the unusual stretch of amino acids in the CP-coding region upstream of the DAG motif. When this region was excluded from comparisons, the amino acid sequences of ZaMMV-TW and ZaMMV-AU shared 89.8 % identity.
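The two readings of the demarcation criteria can be made explicit in a short sketch. The thresholds and identity values are those quoted in the text and in Table 1; the function name and the way the criteria are combined into a single test are illustrative assumptions, not an official ICTV procedure.

```python
# Thresholds quoted in the text for the family Potyviridae.
CP_AA_THRESHOLD = 80.0  # % CP amino acid identity
NT_THRESHOLD = 76.0     # % nucleotide identity (CP-coding region or whole genome)


def looks_like_distinct_species(cp_aa_identity: float, nt_identity: float) -> bool:
    """True when both identities fall below the species thresholds (hypothetical helper)."""
    return cp_aa_identity < CP_AA_THRESHOLD and nt_identity < NT_THRESHOLD


# ZaMMV-AU versus ZaMMV-TW, values taken from the text and Table 1.
cp_aa = 78.1
cp_nt = 75.5
genome_nt = 82.0

by_cp_region = looks_like_distinct_species(cp_aa, cp_nt)
by_whole_genome = looks_like_distinct_species(cp_aa, genome_nt)
print(by_cp_region, by_whole_genome)
```

The CP-only comparison falls below both thresholds, whereas the whole-genome nucleotide identity does not, which is the discrepancy the paragraph above resolves in favour of the whole-genome comparison.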

To our knowledge, this is the first report of ZaMMV from Australia, and it is also the first report of ZaMMV infecting an Alocasia sp. This report provides a useful reference for further work investigating the occurrence of viruses in Alocasia sp. and its relatives, particularly the economically important members of the family Araceae, such as the cultivated taros (Colocasia esculenta).

Acknowledgments

DK is the recipient of an Australia Awards Scholarship.

References

1. Huang CH, Chang YC (2005) Identification and molecular characterization of Zantedeschia mild mosaic virus, a new calla lily-infecting potyvirus. Arch Virol 150:1221–1230

2. Huang CH, Hu WC, Yang TC, Chang YC (2007) Zantedeschia mild mosaic virus, a new widespread virus in calla lily, detected by ELISA, dot-blot hybridization and IC-RT-PCR. Plant Pathol 56:183–189

3. Rizzo D, Panattoni A, Stefani L, Paoli M, Nesi B, Lazzereschi S, Vanarelli S, Farina P, Della Bartola M, Materazzi A, Luvisi A (2015) First report of Zantedeschia mild mosaic virus on Zantedeschia aethiopica (L) Spreng in Italy. J Plant Pathol 97:1–2

4. Valderrama-Cháirez ML, Cruz-Hernández A, Paredes-López O (2002) Isolation of functional RNA from cactus fruit. Plant Mol Biol Rep 20:279–286

5. Ha C, Revill P, Harding RM, Vu M, Dale JL (2008) Identification and sequence analysis of potyviruses infecting crops in Vietnam. Arch Virol 153:45–60

6. Ha C, Coombs S, Revill P, Harding RM, Vu M, Dale JL (2008) Design and application of two novel degenerate primer pairs for the detection and complete genomic characterization of potyviruses. Arch Virol 153:25–36

7. Yamamoto H, Fuji S (2008) Rapid determination of the nucleotide sequences of potyviral coat protein genes using semi-nested RT-PCR with universal primers. J Gen Plant Pathol 74:97–100

8. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30:2725–2729

9. Adams MJ, Zerbini FM, French R, Rabenstein F, Stenger DC, Valkonen JPT (2012) Family Potyviridae. In: King AMQ, Lefkowitz E, Adams MJ, Carstens EB (eds) Virus taxonomy: ninth report of the International Committee on Taxonomy of Viruses, London, pp 1069–1089

10. Adams MJ, Antoniw JF, Fauquet CM (2005) Molecular criteria for genus and species discrimination within the family Potyviridae. Arch Virol 150:459–479

11. Adams MJ, Antoniw JF, Beaudoin F (2005) Overview and analysis of the polyprotein cleavage sites in the family Potyviridae. Mol Plant Pathol 6:471–487

Chapter 4

Identification and molecular characterisation of taro bacilliform virus and taro bacilliform CH virus from East Africa

D. B. Kidanemariamᵃ,ᵇ, A. C. Sukalᵃ, A. D. Abrahamᶜ, F. Stomeoᵈ, J. L. Daleᵃ, A. P. Jamesᵃ, R. M. Hardingᵃ*

ᵃ Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, 4001, Australia
ᵇ National Agricultural Biotechnology Research Center, Ethiopian Institute of Agricultural Research, P.O. Box 2003, Addis Ababa, Ethiopia
ᶜ Department of Biotechnology, Addis Ababa Science and Technology University, P.O. Box 16417, Addis Ababa, Ethiopia
ᵈ Biosciences eastern and central Africa–International Livestock Research Institute (BecA–ILRI) Hub, P.O. Box 30709, Nairobi, Kenya

Plant Pathology https://doi.org/10.1111/ppa.12921

Statement of Contribution of Co-Authors of Thesis by Publication Paper

The authors listed below have certified that:

1. They meet the criteria for authorship in that they have participated in the conception, execution, or interpretation, of at least that part of the publication in their field of expertise;
2. They take public responsibility for their part of the publication, except for the responsible author who accepts overall responsibility for the publication;
3. There are no other authors of the publication according to these criteria;
4. Potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor or publisher of journals or other publications, and (c) the head of the responsible academic unit, and
5. They agree to the use of the publication in the student’s thesis and its publication on the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter: Identification and molecular characterisation of taro bacilliform virus and taro bacilliform CH virus from East Africa

Abstract

Taro (Colocasia esculenta) and tannia (Xanthosoma sp.) are important root crops cultivated mainly by small-scale farmers in sub-Saharan Africa and the South Pacific. Viruses are known to be one of the most important constraints to production, with infections resulting in severe yield reduction. In 2014 and 2015, surveys were conducted in Ethiopia, Kenya, Tanzania and Uganda to determine the identity of viruses infecting taro in East Africa. Screening of 392 samples collected from the region using degenerate badnavirus primers revealed an incidence of 58-74% among the four countries surveyed, with sequence analysis identifying both taro bacilliform virus (TaBV) and taro bacilliform CH virus (TaBCHV). TaBCHV was identified from all four countries while TaBV was identified in all except Ethiopia. Full-length sequences from representative TaBV and TaBCHV isolates showed that the genome organisation of TaBV isolates from East Africa was consistent with previous reports while TaBCHV isolates from East Africa were found to encode only four ORFs, distinct from a previous report from China. Phylogenetic analysis showed that all East African TaBV isolates form a single subgroup within known TaBV isolates, while TaBCHV isolates form at least two distinct subgroups. To our knowledge, this is the first report describing the occurrence and genome organisation of TaBV and TaBCHV isolates from East Africa and the first full-length sequence of the two viruses from tannia.

Keywords

Colocasia esculenta; Xanthosoma; Caulimoviridae; badnavirus; rolling circle amplification; episomal DNA

Introduction

The aroids, taro (Colocasia esculenta (L.) Schott) and tannia (Xanthosoma sp.), are among the most important root crops in many sub-Saharan African countries including Burundi, Côte d’Ivoire, Ethiopia, Gabon, Ghana, Kenya, Nigeria, Tanzania and Uganda (Ndabikunze et al., 2011; Akwee et al., 2015). The corm and leaves of taro plants are very rich sources of easily digestible starch and dietary fibre and also contain substantial amounts of protein, vitamins and minerals (Ndabikunze et al., 2011). Worldwide, more than half a billion people, many of them in the tropics, include taro in their diets (Lebot, 2009). In East African countries, taro is mainly cultivated by small-holder farmers, for whom it plays important cultural, economic and nutritional roles (Onwueme and Charles, 1994; Talwana et al., 2009; Tumuhimbise et al., 2009; Beyene, 2013).

In southern Ethiopia, taro (locally called ‘godere’), tannia and enset are the preferred food security crops, as they perform well with minimal agricultural inputs (Mariame and Gelmesa, 2006; Beyene, 2013; Harrison et al., 2014). In Kenya, taro (known locally as ‘arrowroot’) and tannia are a basic source of starch in the diet for many communities in the Mount Kenya and Aberdares districts of central Kenya, as well as in the Lake Victoria basin districts of Kakamega, Kisumu and Siaya, where they are mainly cultivated adjacent to streams and rivers (Akwee et al., 2015). In Tanzania and Uganda, taro and tannia are mainly grown along the Lake Victoria basin, including the Bukoba, Musoma, Tarime, Biharamulo and Mwanza districts in Tanzania and the Mityana, Masaka, Jinja, Iganga and Luuka districts in Uganda (Talwana et al., 2009; Ndabikunze et al., 2011). Due to a range of biotic and abiotic factors, taro yields in East Africa are much lower than the world average. These factors include pests, weeds, soil infertility and a lack of genetically improved cultivars, as well as a range of diseases caused by fungi, bacteria and viruses (Tumuhimbise et al., 2009; Talwana et al., 2009; Akwee et al., 2015).

Badnaviruses infect a wide range of tropical and subtropical crops including banana, yam, taro, sugarcane, black pepper, citrus and cacao, with some reports also from temperate regions in hosts such as raspberry, gooseberry and ornamental spiraea (Bhat et al., 2016). Badnaviruses have bacilliform-shaped particles of approximately 30 nm by 120–150 nm with a circular, double-stranded (ds) DNA genome of 6.9–9.2 kb. The genome typically contains three ORFs but there may be one or more additional ORFs (Geering and Hull, 2012; Bhat et al., 2016). ORFs 1 and 2 encode small proteins of about 23 and 15 kDa, respectively (Geering and Hull, 2012). The function of the protein encoded by ORF 1 is unknown, while the ORF 2 protein has non-specific DNA- and RNA-binding activity and may be involved in virion assembly (Jacquot et al., 1996). ORF 3 encodes a large polyprotein of about 200 kDa which is post-translationally processed into several mature proteins, including movement protein (MP), coat protein (CP), aspartic protease (AP), reverse transcriptase (RT) and ribonuclease H (RNase H) (Geering and Hull, 2012; Bhat et al., 2016). Additional ORFs have been reported from a number of species; however, these usually have no ascribed function (Kazmi et al., 2015). The RT/RNase H-coding region of ORF 3 is the most conserved region of the genome, and a nucleotide (nt) difference of greater than 20% in this part of the genome is used for the demarcation of species in the genus (Geering and Hull, 2012).

Badnavirus is the most diverse genus in the family Caulimoviridae at both the genomic and antigenic level (Geering and Hull, 2012). Currently, it comprises forty distinct recognised species (https://talk.ictvonline.org/taxonomy/). All members of the family Caulimoviridae are pararetroviruses, whereby at least one part of the viral replication occurs in the nucleus where the viral DNA genome is transcribed from mini-chromosomes formed by an association with histones (Hull and Covey, 1983; Geering and Hull, 2012). This replication strategy can result in the random integration of the viral DNA into the host genome by either illegitimate recombination, or during repair of DNA breaks (Iskra-Caruana et al., 2014). The genetic and serological diversity of badnaviruses and the occurrence of viral DNA within the genome of host plants complicate diagnosis (Kenyon et al., 2008; Muller et al., 2011; Seal et al., 2014). Additionally, as many host plant species are vegetatively propagated, badnaviruses can accumulate across cultivation cycles. These attributes make badnaviruses important pathogens for many crops and present a serious threat to germplasm exchange in a number of important crop species (Borah et al., 2013).

In taro, two distinct badnavirus species have been reported, namely Taro bacilliform virus (TaBV) (Yang et al., 2003a, b) and Taro bacilliform CH virus (TaBCHV) (Ming et al., 2013; Kazmi et al., 2015). The genome of TaBV possesses four ORFs, all encoded on the plus-strand of the viral DNA, with the size and organisation of ORFs 1-3 consistent with most badnaviruses (Yang et al., 2003a). ORF 4 of TaBV overlaps ORF 3 between the MP and CP domains and putatively encodes a protein of ~13 kDa, with little homology to any published protein-coding sequences (Yang et al., 2003a). In contrast to TaBV, TaBCHV encodes six putative ORFs, with ORFs 1-4 analogous to TaBV and an additional two small ORFs at the 3' end of ORF 3. ORF 5 partially overlaps ORF 3, while ORF 6 is downstream of, and partially overlaps, the 3' end of ORF 5 (Kazmi et al., 2015). Characterisation of Pacific isolates of TaBV showed that there is up to 23% nucleotide sequence variability within the RT/RNase H-coding region (Yang et al., 2003b). The same study also revealed the presence of TaBV-like sequences in taro samples from Papua New Guinea (PNG), Fiji, Vanuatu, Samoa, Solomon Islands and New Caledonia with 50 to 60% nucleotide identity to TaBV, indicating the possible presence of other badnaviruses infecting taro in the South Pacific region. Recently, TaBCHV has been reported from Hawaii (USA), with 91-98% nucleotide sequence identity to the published TaBCHV isolate from China (Wang et al., 2018).

To date, TaBV and TaBCHV appear to be restricted to host plants in the family Araceae. TaBV is transmitted mainly by vegetative propagation, by mealybugs in a semi-persistent manner and in some cases through seed or pollen, but it is not mechanically transmissible (Gollifer et al., 1977; Macanawai et al., 2005). Although no consistent symptoms have been associated with TaBV infection, there have been some reports of mild symptoms such as vein clearing, stunting and downward-curling of the leaf blades in some cultivars (Yang et al., 2003a; Revill et al., 2005; Kidanemariam et al., 2018).

Despite the importance of aroids in sub-Saharan Africa, there is no information on the incidence, distribution and diversity of TaBV or TaBCHV in the region. In 2014 and 2015, surveys were conducted to identify viruses infecting taro and other edible aroids in Ethiopia, Kenya, Tanzania and Uganda. In this paper, we report the identification and genomic characterisation of both TaBV and TaBCHV from East African countries and discuss their incidence and sequence diversity. Further, the current nomenclature of TaBV isolates is discussed and a modification to TaBV nomenclature is proposed.

Materials and methods

Sample collection and DNA extraction

Between November 2014 and August 2015, leaf samples were collected from 333 taro plants and 59 tannia plants from 25 major growing areas in Ethiopia, Kenya, Tanzania and Uganda. Of these, 171 (160 taro and 11 tannia) were collected from Ethiopia, 86 (83 taro and three tannia) from Kenya, 41 (29 taro and 12 tannia) from Tanzania and 94 (61 taro and 33 tannia) from Uganda. Samples were taken from plants showing virus-like symptoms as well as from asymptomatic plants. The leaf samples were desiccated over silica gel and transported to the BecA–ILRI Hub laboratory in Nairobi, Kenya for laboratory analysis. Total nucleic acid (TNA) was extracted using a 2% CTAB buffer (0.1 M Tris-HCl pH 8, 1.4 M NaCl, 20 mM EDTA, 2% CTAB, 2% PVP and 1 M DTT) as described by Kleinow et al. (2009). Selected samples were later transported to Queensland University of Technology (QUT), Brisbane, Australia for cloning and sequence analysis.
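The per-country and per-host counts given above can be tallied to confirm that they agree with the stated totals. The following is a minimal sketch using only the numbers in the text; the variable names are illustrative.

```python
# Sample counts per country and host, as reported in the text.
samples = {
    "Ethiopia": {"taro": 160, "tannia": 11},
    "Kenya": {"taro": 83, "tannia": 3},
    "Tanzania": {"taro": 29, "tannia": 12},
    "Uganda": {"taro": 61, "tannia": 33},
}

# Total per country (taro + tannia).
country_totals = {country: sum(hosts.values()) for country, hosts in samples.items()}

# Total per host across the four countries.
host_totals = {
    host: sum(hosts[host] for hosts in samples.values())
    for host in ("taro", "tannia")
}

grand_total = sum(country_totals.values())
print(country_totals, host_totals, grand_total)
```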

PCR, cloning and sequencing

PCR was carried out using OneTaq® 2x Master Mix (NEB, UK) and degenerate badnavirus primers BadnaFP/RP (Yang et al., 2003a) which amplify an approximately 580 nt region of the RT/RNase H-coding region of ORF 3. As a positive control, total DNA extracted from yam leaf tissue infected with dioscorea bacilliform alata virus was used. Briefly, 1 μl of TNA (30 ng/μl) was mixed with 10 μl of OneTaq® 2x Master Mix and 5 pmol of each primer in a total volume of 20 μl. PCR cycling conditions were as follows: initial denaturation at 94 °C for 3 min followed by 40 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 1 min, with a final extension at 72 °C for 5 min. Amplicons were separated by electrophoresis through 1.5 % agarose gels.
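As an illustration, the cycling conditions for the degenerate badnavirus PCR can be written down as a small data structure from which the programmed hold time follows. This is a sketch only: the class and field names are hypothetical, and the estimate ignores the temperature ramping time of the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Hypothetical container for a three-step PCR program (times in seconds)."""
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    extension_s: int
    final_extension_s: int

    def hold_time_min(self) -> float:
        """Sum of programmed hold times in minutes, excluding ramping."""
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        total_s = (
            self.initial_denaturation_s
            + self.cycles * per_cycle
            + self.final_extension_s
        )
        return total_s / 60


# Values from the text: 94 °C 3 min; 40 x (94 °C 30 s, 50 °C 30 s, 72 °C 1 min); 72 °C 5 min.
badnavirus_screen = CyclingProgram(
    initial_denaturation_s=180,
    cycles=40,
    denaturation_s=30,
    annealing_s=30,
    extension_s=60,
    final_extension_s=300,
)
print(badnavirus_screen.hold_time_min())
```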

Ten PCR-positive samples from each country, representing different districts where possible, were randomly selected and amplicons of the expected size (∼580 bp) were gel-excised and purified using the Freeze ‘N’ Squeeze™ DNA Gel Extraction Spin Columns (Bio-Rad, Australia) and subsequently cloned into pGEM®-T Easy (Promega, Australia). Putative recombinant plasmid DNA containing the PCR amplicons was sequenced using the Big Dye® Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, Australia) at the Central Analytical Research Facility (CARF), QUT, Brisbane, Australia. For each sample, three independent clones were sequenced in one direction using the M13F primer.

Rolling circle amplification (RCA), restriction digestion, cloning and sequencing

RCA was carried out using the Illustra™ TempliPhi 100 Amplification Kit (GE Healthcare, UK) as described by James et al. (2011). The RCA products were digested with StuI, SalI and XbaI restriction enzymes (NEB, UK), which were predicted, from in silico restriction site analysis based on published full-length sequences of TaBV (Yang et al., 2003a; GenBank accession no. AF357836) and TaBCHV (Kazmi et al., 2015; GenBank accession no. NC_026819), to cut up to three times. Digested RCA products were separated using 0.8 % agarose gels and fragments of approximately 7-8 kb were excised and purified using the Freeze ‘N’ Squeeze™ DNA Gel Extraction Spin Columns (Bio-Rad, Australia) and subsequently ligated into appropriately digested and alkaline phosphatase-treated pUC19 plasmid DNA. Recombinant DNAs were transformed into competent E. coli cells and plasmid DNAs were purified by alkaline lysis and digested using EcoRI (NEB, UK) to identify putative recombinant plasmid DNAs containing the RCA amplicons. Full-length genome sequences were subsequently generated from RCA products, with sequencing carried out as described previously. For each sample, at least three independent clones were sequenced in both directions. To confirm the sequences spanning the putative restriction sites, PCR was carried out using sequence-specific primers flanking the region. Briefly, the PCR master mix consisted of 10 μl of 2x GoTaq Green Master Mix (Promega, Australia), 5 pmol of each sequence-specific primer and 1 μl of TNA (30 ng/μl) in a final volume of 20 μl. PCR cycling conditions were as follows: initial denaturation at 94 °C for 3 min followed by 35 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 2 min, with a final extension at 72 °C for 10 min. The amplified products were cloned into pGEM®-T Easy vector and sequenced as described previously.
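The in silico restriction site analysis mentioned above can be illustrated with a short sketch that searches a circular genome for the recognition sequences of StuI, SalI and XbaI. The recognition sequences are the standard ones for these enzymes; the function names and the toy sequence are hypothetical, and this is not the software used in the study. Reported positions are the start of each recognition site, not the exact cleavage position.

```python
# Recognition sequences of the enzymes named in the text. All three are
# palindromic, so searching one strand is sufficient.
RECOGNITION_SITES = {"StuI": "AGGCCT", "SalI": "GTCGAC", "XbaI": "TCTAGA"}


def site_positions(circular_seq, site):
    """0-based start positions of `site` on a circular sequence."""
    seq = circular_seq.upper()
    # Append the first len(site) - 1 bases so sites spanning the origin are found.
    extended = seq + seq[: len(site) - 1]
    return [i for i in range(len(seq)) if extended.startswith(site, i)]


def fragment_lengths(genome_len, positions):
    """Fragment sizes after cutting a circle at the given positions."""
    if not positions:
        return []
    pos = sorted(positions)
    # A single cut linearises the circle, giving one full-length fragment.
    return [
        (pos[(i + 1) % len(pos)] - p) % genome_len or genome_len
        for i, p in enumerate(pos)
    ]


def candidate_enzymes(circular_seq, max_cuts=3):
    """Enzymes that cut at least once and at most `max_cuts` times."""
    hits = {
        name: site_positions(circular_seq, site)
        for name, site in RECOGNITION_SITES.items()
    }
    return {name: pos for name, pos in hits.items() if 1 <= len(pos) <= max_cuts}


# Toy 29-nt circular sequence (hypothetical); the XbaI site spans the origin.
TOY_GENOME = "CTAGAAACCAGGCCTTTGGGTCGACAATT"
print(candidate_enzymes(TOY_GENOME))
print(fragment_lengths(len(TOY_GENOME), site_positions(TOY_GENOME, "AGGCCT")))
```

An enzyme that cuts a circular genome exactly once yields a single full-length linear fragment, which is consistent with excising fragments of approximately 7-8 kb from the gel.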

Outward-facing PCR

To amplify the complete genome sequence of TaBCHV from East African isolates, outward-facing, sequence-specific primers (TaBCHV-OutF: AGGCCCATTATACTCAAAAG and TaBCHV-OutR: GAAATCAATGGTTGGTACTG) were designed based on consensus RT/RNase H-coding sequences obtained in this study. Long-range PCRs were carried out using 1 μl of TNA (30 ng/μl) mixed with 10 μl of 2x GoTaq Long-range PCR Master Mix (Promega, Australia) and 5 pmol of each sequence-specific primer in a final volume of 20 μl. PCR cycling was as follows: initial denaturation at 94 °C for 3 min followed by 30 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 7 min, with a final extension at 72 °C for 10 min. Amplicons were separated by electrophoresis through 0.8 % agarose gels, purified, cloned into pGEM®-T Easy vector and sequenced by primer-walking as described previously.
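The principle of outward-facing PCR on a circular template can be sketched as follows. The two primer sequences are those given above; everything else (the toy template, the 7 nt internal gap and the function names) is a hypothetical illustration. On a circular molecule the product runs from the forward primer around the circle to the reverse primer site, so only the short stretch between the two primers is missing from the amplicon.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


OUT_F = "AGGCCCATTATACTCAAAAG"  # TaBCHV-OutF (from the text)
OUT_R = "GAAATCAATGGTTGGTACTG"  # TaBCHV-OutR (from the text)


def outward_amplicon(circular: str, fwd: str, rev: str) -> str:
    """Top-strand product of outward-facing primers on a circular template."""
    doubled = circular.upper() * 2       # unrolls the circle once
    start = doubled.find(fwd)
    if start == -1:
        raise ValueError("forward primer site not found")
    rev_site = revcomp(rev)              # how the reverse primer reads on the top strand
    end = doubled.find(rev_site, start)
    if end == -1:
        raise ValueError("reverse primer site not found")
    return doubled[start : end + len(rev_site)]


# Hypothetical toy template: reverse-primer site, 7 nt gap, forward-primer site,
# followed by 80 nt standing in for the rest of the genome.
gap = "GATTACA"
toy_genome = revcomp(OUT_R) + gap + OUT_F + "ACGTTGCA" * 10

product = outward_amplicon(toy_genome, OUT_F, OUT_R)
print(len(toy_genome), len(product))     # the product lacks only the 7 nt gap
```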

Sequence and phylogenetic analysis

Sequencing data were processed and analysed using CLC Main Workbench v6.9.2 (QIAGEN) and Geneious v11.0.2 (Biomatters) computer software. Sequences were compared to all known badnaviruses on the NCBI database using BLAST algorithms available on the NCBI website (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The presence of putative open reading frames (ORFs) was predicted using Geneious v11.0.2 (Biomatters) and SnapGene® software (GLS Biotech). Virus sequences were further aligned and analysed with the ClustalW multiple alignment application using BioEdit sequence alignment editor program version 7.1.9 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Phylogenetic trees were constructed from ClustalW-aligned sequences on MEGA version 7.0 (http://www.megasoftware.net/mega.php), using the Maximum-Likelihood method and a Kimura 2-Parameter model with 1000 bootstrap replications. Pairwise sequence comparison (PASC) was carried out on aligned sequences using Geneious v11.0.2 (Biomatters) computer software. For taxonomic purposes, the 1.2 kb polymerase gene covering the RT/RNase H domains was used to compare the different genera in the family Caulimoviridae while the core 529 bp sequence of the RT/RNase H-coding region (excluding the BadnaFP/RP primer binding sites) was used to compare the different TaBV and TaBCHV isolates.
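For readers unfamiliar with the quantities named above, the following sketch shows how a Kimura 2-parameter distance and a simple percent identity can be computed for two pre-aligned sequences. It is not the MEGA or Geneious implementation; the function names, the handling of gaps (columns with a non-ACGT character are skipped) and the two 20 nt test sequences are assumptions made for illustration. The distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def _compared_columns(seq1: str, seq2: str):
    """Yield aligned base pairs, skipping gaps and ambiguity codes."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in "ACGT" and b in "ACGT":
            yield a, b


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    transitions = transversions = compared = 0
    for a, b in _compared_columns(seq1, seq2):
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    p, q = transitions / compared, transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def percent_identity(seq1: str, seq2: str) -> float:
    """Percentage of compared columns with identical bases."""
    pairs = list(_compared_columns(seq1, seq2))
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


ref = "ACGTACGTACGTACGTACGT"
alt = "GAGTACGTACGTACGTACGT"  # one transition (A>G) and one transversion (C>A)
print(round(k2p_distance(ref, alt), 4), percent_identity(ref, alt))
```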

Results

PCR screening and sequence analysis

Of the 392 leaf samples collected from the four countries included in this study, 333 were from taro and 59 were from tannia. Of these, 68 taro samples and 23 tannia samples showed virus-like symptoms including mosaic, feathery mottle, vein clearing, downward-curling of leaf blades and stunting. As an initial test for the presence of badnaviruses, TNA was extracted from all samples and PCR carried out using the degenerate BadnaFP/RP primers. An amplicon of the expected size was observed in 70 of 94 samples from Uganda, 54 of 86 samples from Kenya, 25 of 41 samples from Tanzania and 100 of 171 samples from Ethiopia. Overall, 223 of 333 taro samples and 26 of 59 tannia samples tested positive, with positive samples identified in all of the 25 districts surveyed in the four countries (Table 1). No consistent symptoms were associated with the plants testing positive, and numerous asymptomatic plants also tested positive.
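The per-country incidence can be recomputed directly from the counts given in the paragraph above. The sketch below uses only those counts; the variable and function names are illustrative.

```python
# (PCR-positive samples, samples tested) per country, as stated in the text.
screening = {
    "Uganda": (70, 94),
    "Kenya": (54, 86),
    "Tanzania": (25, 41),
    "Ethiopia": (100, 171),
}


def incidence(positive: int, tested: int) -> float:
    """Percentage of tested samples that were PCR-positive."""
    return 100.0 * positive / tested


by_country = {c: incidence(p, t) for c, (p, t) in screening.items()}
total_positive = sum(p for p, _ in screening.values())
total_tested = sum(t for _, t in screening.values())

for country, pct in by_country.items():
    print(f"{country}: {pct:.2f} %")
print(f"Overall: {total_positive}/{total_tested} = "
      f"{incidence(total_positive, total_tested):.2f} %")
```

The country totals sum to 249 positives among 392 samples, which agrees with the 223 taro and 26 tannia positives reported.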

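Table 1. Summary of badnavirus PCR screening and samples used for initial sequence analysis.

| Country | District | Total number of samples | Total number of taro samples | Number of taro samples testing positive | Total number of tannia samples | Number of tannia samples testing positive | Total number positive | Percentage positive (%) | Samples selected for sequencing1 |
|---|---|---|---|---|---|---|---|---|---|
| Ethiopia | Welayita | 87 | 84 | 75 | 3 | 1 | 76 | 87.4 | Et4, Et8, Et17, Et22, Et141, Et158 |
| Ethiopia | Oromia | 22 | 22 | 1 | 0 | 0 | 1 | 4.5 | Et72 |
| Ethiopia | Sheka | 25 | 22 | 7 | 3 | 3 | 10 | 40 | Et43 |
| Ethiopia | Masha | 14 | 12 | 3 | 2 | 1 | 4 | 28.6 | Et49 |
| Ethiopia | Kefa | 23 | 20 | 6 | 3 | 3 | 9 | 39.1 | Et50 |
| Ethiopia | TOTAL | 171 | 160 | 92 | 11 | 8 | | | 10 |
| Kenya | Nyeri | 30 | 29 | 17 | 1 | 0 | 17 | 56.7 | Ke65, Ke72 |
| Kenya | Laikipia | 3 | 2 | 1 | 1 | 0 | 1 | 33.3 | |
| Kenya | Tharaka Nithi | 14 | 14 | 10 | 0 | 0 | 10 | 71.4 | Ke14, Ke16, Ke18 |
| Kenya | Kirinyaga | 9 | 8 | 5 | 1 | 0 | 5 | 55.6 | |
| Kenya | Embu | 19 | 19 | 13 | 0 | 0 | 13 | 68.4 | Ke43, Ke49, Ke51, Ke52 |
| Kenya | Kakamega | 4 | 4 | 4 | 0 | 0 | 4 | 100 | |
| Kenya | Kisumu | 5 | 5 | 3 | 0 | 0 | 3 | 60 | Ke83 |
| Kenya | Siaya | 2 | 2 | 1 | 0 | 0 | 1 | 50 | |
| Kenya | TOTAL | 86 | 83 | 54 | 3 | 0 | | | 10 |
| Tanzania | Musoma | 9 | 9 | 3 | 0 | 0 | 3 | 33.3 | Tz7 |
| Tanzania | Tarime | 5 | 2 | 1 | 3 | 3 | 4 | 80 | |
| Tanzania | Mago | 2 | 2 | 2 | 0 | 0 | 2 | 100 | Tz16, Tz17 |
| Tanzania | Biharamulo | 9 | 1 | 0 | 8 | 8 | 8 | 88.9 | Tz24, Tz27 |
| Tanzania | Mwanza | 16 | 15 | 7 | 1 | 1 | 8 | 50 | Tz36, Tz42, Tz43, Tz44, Tz47 |
| Tanzania | TOTAL | 41 | 29 | 21 | 12 | 4 | | | 10 |
| Uganda | Busuju | 25 | 16 | 15 | 9 | 4 | 19 | 76 | Ug6, Ug10, Ug15 |
| Uganda | Lukaaya | 26 | 17 | 15 | 9 | 5 | 20 | 76.9 | Ug35, Ug45, Ug52 |
| Uganda | Busiro | 20 | 11 | 10 | 9 | 1 | 11 | 55 | Ug67 |
| Uganda | Budondo | 4 | 4 | 4 | 0 | 0 | 4 | 100 | Ug75 |
| Uganda | Buunya | 6 | 5 | 5 | 1 | 0 | 5 | 83.3 | Ug79 |
| Uganda | Kignlu | 3 | 2 | 2 | 1 | 1 | 3 | 100 | |
| Uganda | Luuka | 10 | 6 | 5 | 4 | 3 | 8 | 80 | Ug96 |
| Uganda | TOTAL | 94 | 61 | 56 | 33 | 14 | | | 10 |

1 The two tannia samples sequenced are Tz24 and Tz27.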

A total of 10 amplicons from each country, which included samples from most districts (Table 1), were randomly selected for further analysis and were subsequently cloned and sequenced. All the samples from Ethiopia, Kenya and Uganda were from taro while from Tanzania, eight samples were from taro and two samples (Tz24 and Tz27) were from tannia.

Analysis of the sequences from the three clones derived from each isolate revealed 98-99% nucleotide identity. When the consensus sequence of each of the 40 isolates was subjected to a BLAST analysis, 14 isolates showed highest nucleotide identity (96-97%) to a New Caledonian TaBV isolate (AY186614), while the remaining 26 isolates showed highest nucleotide identity (79.1-92.6%) to TaBCHV from China. The Ethiopian isolates showed greatest nucleotide identity to TaBCHV only, while isolates from Tanzania, Uganda and Kenya showed greatest nucleotide identity to either TaBCHV or TaBV. Of the two tannia samples sequenced, Tz24 showed 97% nucleotide identity to TaBV from New Caledonia, whereas Tz27 showed 92% nucleotide identity to TaBCHV from China. Nucleotide sequence identity amongst the 40 East African isolates ranged from 57 to 99%. Within isolates showing greatest nucleotide identity to TaBCHV, nucleotide sequence variability was highest in the 10 Ethiopian isolates, with variability of up to 22.6%. In the other three countries, the nucleotide sequence identity of TaBCHV ranged from 85.2 to 99.9%. For the 14 isolates which were most similar to TaBV, nucleotide sequence identity ranged from 96.5 to 98% across all isolates. The least amount of variability in TaBV was observed between isolates within each country, with the four isolates from Kenya showing 99.2-99.8% nucleotide sequence identity, the five samples from Tanzania showing 97.4 to 99.9% and the remaining five samples from Uganda showing 98.6 to 99.8% nucleotide sequence identity.
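A consensus of three clone sequences, as used above, can be illustrated with a simple majority-rule sketch. The text does not state how the consensus was called in the software used, so majority rule, the `N` placeholder for unresolved columns and the toy clone sequences are all assumptions.

```python
from collections import Counter


def majority_consensus(clones: list[str]) -> str:
    """Column-wise majority consensus of equal-length, aligned clone sequences.
    Columns without a strict majority are reported as 'N'."""
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clone sequences must be aligned to the same length")
    consensus = []
    for column in zip(*(c.upper() for c in clones)):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > len(clones) / 2 else "N")
    return "".join(consensus)


# Hypothetical aligned reads from three independent clones of one isolate.
clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC"]
print(majority_consensus(clones))
```

With three clones, a single-clone difference (for example a polymerase error) is outvoted, which is one reason for sequencing several independent clones per sample.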

RCA

Following the initial sequence analyses, six isolates showing greatest sequence similarity to TaBV and eight isolates showing greatest sequence similarity to TaBCHV were randomly selected and subjected to RCA in an attempt to amplify the complete genomes. When RCA was carried out on eight isolates with high sequence similarity to TaBCHV, no restriction profiles were observed in any samples following digestion with a range of restriction enzymes which were predicted to cut the full-length published TaBCHV and/or TaBV sequences either once or twice. In contrast, StuI digestion of the RCA product obtained from all six isolates showing highest similarity to TaBV resulted in a single fragment of approximately 8 kb. Further, XbaI digestion resulted in three fragments while no restriction profiles were observed following SalI digestion. Putative full-length StuI digest fragments from the six isolates were cloned and the RT/RNase H-coding region sequenced using primer BadnaFP. Three cloned DNAs for individual isolates generated from RCA were sequenced and showed 99-100% identity. The consensus sequence derived from each RCA-amplified isolate was compared with the consensus PCR-generated sequences described earlier and in all cases the RCA-amplified sequences showed 99-100% nucleotide identity to the PCR-amplified sequences.

Complete genome sequences were then obtained for three representative isolates from taro originating from Kenya (Ke52), Tanzania (Tz17) and Uganda (Ug75), and one isolate infecting tannia from Tanzania (Tz24). The complete genome sequence of isolate Ke52 comprised 7,805 nt and contained four ORFs (Fig. 1a; Table 2). ORFs 1-3 were 453, 417 and 5979 nt in length, respectively, and encoded respective putative proteins of 150, 138 and 1,992 amino acids (aa). ORF 4 was 333 nt long, encoded a putative protein of 110 aa, and was positioned entirely within ORF 3 (Fig. 1a; Table 2). The complete genome sequence of isolate Tz17 was 7,803 nt with four ORFs similar to Ke52 (Fig. 1a; Table 2). ORFs 1-4 of Tz17 were 453, 417, 5982 and 333 nt in length, respectively, and encoded respective putative proteins of 150, 138, 1993 and 110 aa. Similarly, the complete genome of isolate Ug75 was 7,796 nt in length and contained four ORFs with a similar arrangement to isolates Ke52 and Tz17 (Fig. 1a; Table 2). Similar to the TaBV sequences amplified from taro, the complete genome of tannia isolate Tz24 was found to comprise 7,799 nt and contain four ORFs (Fig. 1a; Table 2). ORFs 1-4 of Tz24 were 453, 414, 5877 and 330 nt, respectively, which encoded predicted proteins of 150, 137, 1958 and 109 aa, respectively (Fig. 1a; Table 2).
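The relationship between the ORF lengths and the predicted protein lengths quoted above is a simple one: an ORF of n nucleotides that includes its stop codon encodes n/3 - 1 amino acids. The sketch below checks the values given in the text for Ke52, Tz17 and Tz24; the function and variable names are illustrative.

```python
def orf_protein_length(orf_nt: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_nt % 3:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3 - 1


# (ORF length in nt, reported protein length in aa) for ORFs 1-4, from the text.
reported = {
    "Ke52": [(453, 150), (417, 138), (5979, 1992), (333, 110)],
    "Tz17": [(453, 150), (417, 138), (5982, 1993), (333, 110)],
    "Tz24": [(453, 150), (414, 137), (5877, 1958), (330, 109)],
}

all_match = all(
    orf_protein_length(nt) == aa
    for pairs in reported.values()
    for nt, aa in pairs
)
print(all_match)
```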

[Figure 1 schematic, panels (a) and (b): linear genome maps on a 1000-7000 nt scale showing the tRNAmet binding site, ORFs 1-4, the MP, CP, Zn, AP, RT and RNase H domains within ORF 3, the TATA box and the polyA signal.]

Figure 1. Linearised schematic representation of the genome organisation of full-length TaBV and TaBCHV isolates sequenced from East Africa. (a) Genome organisation of full-length TaBV isolates from East Africa representing isolates from Kenya (Ke52), Tanzania (Tz17, Tz24) and Uganda (Ug75). (b) Genome organisation of full-length TaBCHV isolates from East Africa representing isolates from Ethiopia (Et17), Kenya (Ke43), Tanzania (Tz27, Tz36) and Uganda (Ug10). The predicted putative conserved domains: movement protein (MP), coat protein (CP), zinc finger (Zn), aspartic protease (AP), reverse transcriptase (RT) and ribonuclease H (RNase H) are shown on ORF 3.

Table 2. Summary of the genomic features of TaBV and TaBCHV isolates from East Africa.

| Virus species | Isolate | Genome length (nt) | ORF 1 start-stop (codon use) | ORF 1 nt length | ORF 1 aa length | ORF 1 protein MW (kDa) | ORF 2 start-stop (codon use) | ORF 2 nt length | ORF 2 aa length | ORF 2 protein MW (kDa) | ORF 3 start-stop (codon use) | ORF 3 nt length | ORF 3 aa length | ORF 3 protein MW (kDa) | ORF 4 start-stop (codon use) | ORF 4 nt length | ORF 4 aa length | ORF 4 protein MW (kDa) | TATA box | Gap (nt) | PolyA signal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TaBV | Ke52 | 7805 | 386-838 (ATG-TGA) | 453 | 150 | 17.1 | 838-1254 (ATG-TAA) | 417 | 138 | 15.1 | 1257-7235 (ATG-TAA) | 5979 | 1992 | 227 | 2137-2469 (ATG-TAA) | 333 | 110 | 12.5 | 7609-7615 ttcTATAAAAggc | 99 | 7714-7720 TTTTTT |
| TaBV | Tz17 | 7803 | 386-838 (ATG-TGA) | 453 | 150 | 17.1 | 838-1254 (ATG-TAA) | 417 | 138 | 15.1 | 1257-7238 (ATG-TAA) | 5982 | 1993 | 227.1 | 2137-2469 (ATG-TAA) | 333 | 110 | 12.8 | 7612-7618 tccTATAAAAggc | 94 | 7713-7718 TTTATT |
| TaBV | Ug75 | 7796 | 386-838 (ATG-TGA) | 453 | 150 | 17.1 | 838-1251 (ATG-TAA) | 414 | 137 | 15 | 1254-7229 (ATG-TAA) | 5976 | 1991 | 226.8 | 2134-2460 (ATG-TAA) | 327 | 108 | 12.5 | 7603-7609 ttcTATAAAAggc | 98 | 7708-7713 TTTTTT |
| TaBV | Tz24 | 7799 | 386-838 (ATG-TGA) | 453 | 150 | 17.1 | 838-1251 (ATG-TAA) | 414 | 137 | 15 | 1251-7127 (ATG-TAA) | 5877 | 1958 | 222.7 | 2131-2460 (ATG-TAA) | 330 | 109 | 12.6 | 7605-7611 ttcTATAAAAggc | 97 | 7709-7714 TTTTTT |
| TaBCHV | Et17 | 7610 | 359-796 (ATG-TGA) | 438 | 145 | 17 | 793-1173 (ATG-TGA) | 381 | 126 | 14 | 1170-6581 (ATG-TGA) | 5412 | 1803 | 205.9 | 6502-6810 (ATG-TGA) | 309 | 102 | 12.2 | 7561-7467 aggTATATAAtaa | 94 | 7562-7567 AAAAAT |
| TaBCHV | Ke43 | 7647 | 344-781 (ATG-TGA) | 438 | 145 | 17 | 778-1158 (ATG-TGA) | 381 | 126 | 13.9 | 1163-6550 (ATG-TGA) | 5388 | 1795 | 200.4 | 6471-6779 (ATG-TGA) | 309 | 102 | 12.4 | 7376-7382 aggTATATAAtat | 95 | 7478-7483 AAAAAT |
| TaBCHV | Tz36 | 7654 | 521-958 (ATG-TGA) | 438 | 145 | 16.7 | 955-1335 (ATG-TGA) | 381 | 126 | 14 | 1341-6725 (ATG-TGA) | 5385 | 1794 | 203.8 | 6646-6954 (ATG-TGA) | 309 | 102 | 12.4 | 7425-7431 atcTATATAAgga | 112 | 7544-7549 TAAAAA |
| TaBCHV | Ug10 | 7643 | 344-781 (ATG-TGA) | 438 | 145 | 17 | 778-1158 (ATG-TGA) | 381 | 126 | 14 | 1163-6547 (ATG-TGA) | 5385 | 1794 | 206.3 | 6468-6776 (ATG-TGA) | 309 | 102 | 12.4 | 7244-7250 atcTATATAAgga | 112 | 7363-7368 TAAAAA |
| TaBCHV | Tz27 | 7389 | 344-781 (ATG-TGA) | 438 | 145 | 17 | 778-1185 (ATG-TGA) | 381 | 126 | 14 | 1164-6292 (ATG-TGA) | 5130 | 1709 | 193.8 | 6214-6522 (ATG-TGA) | 309 | 102 | 12.4 | 6990-6996 atcTATATAAgga | 112 | 7109-7114 TAAAAA |

nt: nucleotide; aa: amino acid; MW: molecular weight; kDa: kilodalton

Sequence analysis of all four genome sequences revealed the presence of a putative tRNAmet binding site (TGGTATCAGAGCTTTGTT) with 88% nt identity to the plant tRNAmet consensus sequence (3'-ACCAUAGUCUCGGUCCAA-5'). Further, transcriptional promoter elements including a putative TATA box and polyadenylation signal were identified (Table 2).

Analysis of the aa sequence of ORF 3 from all four isolates identified conserved motifs related to the movement protein, coat protein, aspartic protease, reverse transcriptase, RNase H and RNA-binding zinc finger-like domains typical of Caulimoviridae (Fig. 1a). Based on these analyses, isolates Ke52, Tz17, Tz24 and Ug75 were identified as TaBV.

Outward-facing PCR

Outward-facing PCR was used in an attempt to amplify the complete TaBCHV-like genomic sequence from representative taro samples obtained from Ethiopia (Et17), Kenya (Ke43), Tanzania (Tz36) and Uganda (Ug10) and one tannia sample collected from Tanzania (Tz27). Using sequence-specific primers designed from the consensus RT/RNase H-coding sequences generated previously by PCR, a single amplicon of approximately 7.5 kb was obtained from each isolate. These primers were designed to overlap the BadnaFP/RP amplicons by 202 nt and 163 nt, including the primer sequences, at the 5' and 3' ends, respectively. The amplicons were cloned and complete genome sequences for the five isolates were assembled using the near full-length outward-facing PCR products and the original BadnaFP/RP PCR product sequences. When the overlapping sequences between the two amplicons from each isolate were compared, there was 99-100% identity. The complete genomes of the five isolates varied in length from 7,389 to 7,654 nt and all contained four putative ORFs (Fig. 1b; Table 2). Whereas the size and arrangement of ORFs 1-3 were similar to that of the TaBCHV isolate from China, putative ORF 4 in all five isolates was located at the 3' end of ORF 3 where it overlapped the 3' end of ORF 3 by 77 nt, a position analogous with ORF 5 of the Chinese TaBCHV isolate. In all five isolates, ORFs 1, 2 and 4 comprised 438, 381 and 309 nt, respectively, and encoded putative proteins of 145, 126 and 102 aa, respectively. In contrast, ORF 3 of Et17, Ke43, Tz36, Ug10 and Tz27 comprised 5412, 5388, 5385, 5385 and 5130 nt and encoded respective putative proteins of 1803, 1795, 1794, 1794 and 1709 aa (Fig. 1b; Table 2). All five sequences contained the putative tRNAmet binding site which was either TGGTATCAGAGCTTTGTT (Et17, Ke43, Tz27 and Ug10) or TGGTATCAGAGCTTAGTT (Tz36) and showed 84-88% nucleotide identity to the plant tRNAmet consensus sequence. In addition, putative TATA boxes, polyadenylation signals and conserved functional domains typical of Caulimoviridae were also identified (Fig. 1b; Table 2).
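The assembly described above implies a simple length relationship, sketched here as a rough consistency check. It assumes that the BadnaFP/RP amplicon is about 580 bp (the approximate size given in the methods) and that the 202 nt and 163 nt overlaps lie at its two ends; under those assumptions the outward-facing product should be close to the genome length minus the amplicon length plus the two overlaps. The function name is illustrative and the results are approximate.

```python
AMPLICON_NT = 580               # approximate BadnaFP/RP amplicon size (from the methods)
OVERLAP_5P, OVERLAP_3P = 202, 163  # overlaps reported in the text


def outward_product_length(genome_nt: int,
                           amplicon_nt: int = AMPLICON_NT,
                           overlap_5p: int = OVERLAP_5P,
                           overlap_3p: int = OVERLAP_3P) -> int:
    """Expected outward-facing PCR product length on a circular genome."""
    return genome_nt - amplicon_nt + overlap_5p + overlap_3p


# Genome lengths reported for the five TaBCHV isolates (Table 2).
genomes = {"Et17": 7610, "Ke43": 7647, "Tz36": 7654, "Ug10": 7643, "Tz27": 7389}
expected = {iso: outward_product_length(n) for iso, n in genomes.items()}
print(expected)
```

Under these assumptions the expected products are roughly 7.2 to 7.4 kb, in line with the single amplicon of approximately 7.5 kb observed, and about 215 nt in the middle of the BadnaFP/RP region is covered only by the original amplicon.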

Phylogenetic analysis and pairwise sequence comparison

Phylogenetic analysis was initially carried out using the conserved 1.2 kb RT/RNase H domain sequences of the nine full-length outward-facing PCR- and RCA-generated episomal sequences from this study, together with previously reported TaBV and TaBCHV isolates, additional members of the genus Badnavirus and representative members of the other genera in the family Caulimoviridae. This analysis confirmed that TaBV and TaBCHV isolates are members of two distinct clades within the genus Badnavirus (Fig. 2). TaBCHV isolates were found to be most closely related to citrus yellow mosaic virus (AF347695), fig badnavirus 1 (JF411989) and several yam-infecting badnaviruses, while TaBV isolates formed a separate clade together with Bougainvillea spectabilis chlorotic vein-banding virus (EU034539), cacao swollen shoot virus (L14546) and pagoda yellow mosaic-associated virus (KJ013302) (Fig. 2).

Analysis of full-length and partial TaBV sequences from the 14 isolates from East Africa based on the core 529 bp RT/RNase H sequence showed that they were members of a single clade but did not form distinct groups based on their country of origin, with isolates from the three countries interspersed across a single terminal branch of the tree (Fig. 3). The nearest common ancestor to the East African samples was TaBV isolate NC1 from New Caledonia (AY186614).

[Figure 2 tree; scale bar 0.2. Tip labels within the genus Badnavirus: SCBGDV-FJ439817, BSMYV-AY805074, KTSV-AY180137, BSGFV-AY493509, BSIMV-HQ659760, BSVNV-AY750155, BSOLV-AJ002234, BSCAV-HQ593111, BSUAV-HQ593107, CiYMV-AF347695, FBV1-JF411989, the TaBCHV clade (TaBCHV-KP710178, Et17, Tz27, Tz36, Ug10, Ke43), CSSV-L14546, PYMAV-KJ013302, BCVBV-EU034539, the TaBV clade (TaBV-AF357836, Tz17, Ug75, Ke52, Tz24), DBSNV-DQ822073, DBRTV2-KX008577, DBRTV1-KX008574, ComYMV-X52938, SCBMOV-M89923, SCBIMV-AJ277091, BSUIV-HQ593108, BSULV-HQ593109 and BSUMV-HQ593110. Other genera: Tungrovirus (RTBV-NC001914), Rosadnavirus (RYVV-JX028536), Soymovirus (SbCMV-NC001739), Cavemovirus (CsVMV-NC001648), Solendovirus (TVCV-AF190123), Caulimovirus (CaMV-NC001497) and Petuvirus (PVCV-NC001839).]

Figure 2. Phylogenetic analyses of the TaBV and TaBCHV sequences from East Africa together with other representative sequences from the family Caulimoviridae. The tree is based on 1.2 kbp pol-gene sequences of the RT/RNase H-coding region of ORF 3 (as described by Geering et al., 2010). BSUAV: banana streak UA virus; BSCAV: banana streak CA virus; BSOLV: banana streak OL virus; BSVNV: banana streak VN virus; BSIMV: banana streak IM virus; KTSV: Kalanchoe top spotting virus; BSGFV: banana streak GF virus; BSMYV: banana streak MY virus; SCBGDV: sugarcane bacilliform Guadeloupe D virus; ComYMV: Commelina yellow vein mosaic virus; DBSNV: Dioscorea bacilliform VN virus; DBRTV1: Dioscorea bacilliform RT virus 1; DBRTV2: Dioscorea bacilliform RT virus 2; FBV1: fig badnavirus 1; CiYMV: citrus yellow mosaic virus; TaBCHV: taro bacilliform CH virus; CSSV: cacao swollen shoot virus; PYMAV: pagoda yellow mosaic-associated virus; BCVBV: Bougainvillea spectabilis chlorotic vein-banding virus; TaBV: taro bacilliform virus; SCBMOV: sugarcane bacilliform MO virus; SCBIMV: sugarcane bacilliform IM virus; BSUIV: banana streak UI virus; BSULV: banana streak UL virus; BSUMV: banana streak UM virus; RTBV: rice tungro bacilliform virus; CsVMV: cassava vein mosaic virus; TVCV: tobacco vein clearing virus; SbCMV: soybean chlorotic mottle virus; CaMV: cauliflower mosaic virus; PVCV: Petunia vein clearing virus; RYVV: rose yellow vein virus.

[Figure 3 tree; scale bar 0.05. Tip labels: East African isolates Tz24, Tz47, Tz43, Ke18, Tz44, Ug79, Tz17, Ug6, Ug75, Ug67, Ke83, Ke49, Ke52 and Ug45; TaBV NC1-AY186614, TaBV SI2-AY186617, TaBV V1-AY186616, TaBV FP1-AY186613, TaBV S2-AY186615, TaBV SI4-AY186618, TaBV SI7-AY186619, TaBV PNG-AF357836 and TaBV F1-AY186612; outgroup BCVBV-EU034539.]

Figure 3. Phylogenetic analyses of the TaBV-like sequences characterised in this study. The analysis is based on the core 529 nt RT/RNase H-coding sequences delimited by the BadnaFP/RP primers. Ke, Tz and Ug indicate isolates from Kenya, Tanzania and Uganda, respectively, while TaBV isolates NC1, SI2, V1, FP1, S2, SI4, SI7, PNG and F1 are those previously described by Yang et al. (2003b). Bougainvillea spectabilis chlorotic vein-banding virus (BCVBV) was used as an outgroup (see Fig. 2).


When the analysis was repeated using the two published TaBCHV sequences from China together with full-length and partial sequences of the 26 isolates from East Africa, based on the core 529 bp RT/RNase H sequence, the TaBCHV isolates were divided into two distinct subgroups (Fig. 4). The first subgroup, herein referred to as ‘subgroup a’, includes five isolates from Ethiopia and one isolate from Uganda, whereas the second subgroup, herein referred to as ‘subgroup b’, is more diverse and comprises the two published TaBCHV sequences from China together with additional isolates from all four countries in East Africa.

The distinctive clustering of the six TaBCHV isolates from East Africa (Ug96, Et4, Et8, Et43, Et72 and Et141) within ‘subgroup a’, with high bootstrap support values, is indicative that this subgroup may represent a distinct badnavirus species. ‘Subgroup b’ can be further divided into four closely related sequence groups supported by moderate to high bootstrap values, with three of the Ethiopian TaBCHV isolates in a basal position to these and sharing a common ancestor with ‘subgroup a’.

As the initial sequence comparisons of PCR-amplified RT/RNase H-coding sequences indicated that nucleotide sequence variability in the TaBCHV isolates was up to 22.6 %, PASC analysis was carried out using all available TaBCHV sequences (Table 3). This analysis revealed that the six isolates in TaBCHV ‘subgroup a’ showed 79.1 to 80.5 % nucleotide sequence identity with the published TaBCHV sequences from China, which is on the threshold for species demarcation in the genus Badnavirus. These six sequences also shared 78.9 to 81.4 % nucleotide sequence identity to other East African TaBCHV isolates, with the exception of two isolates (Et17 and Et49) from ‘subgroup b’ which are distinct from, and basal to, the Chinese TaBCHV sequences with 84.1 to 85.8 % identity, as well as isolate Et22 from another distinct TaBCHV subgroup (Fig. 4).
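The statement that ‘subgroup a’ sits on the species demarcation threshold can be made concrete with a short sketch. It assumes the criterion of 20 % nucleotide divergence in the core RT/RNase H region, i.e. less than 80 % identity, and uses the Table 3 identities of three ‘subgroup a’ isolates to the two Chinese TaBCHV sequences; the variable and function names are illustrative.

```python
SPECIES_THRESHOLD = 80.0  # % identity; below this, a distinct species is suggested

# % identity to (TaBCHV-1, TaBCHV-2), read from Table 3.
identity_to_china = {
    "Et8": (79.7, 79.1),
    "Et141": (79.9, 79.3),
    "Ug96": (80.5, 79.9),
}


def below_threshold(identity: float, threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when identity falls under the species demarcation threshold."""
    return identity < threshold


summary = {
    isolate: [below_threshold(value) for value in values]
    for isolate, values in identity_to_china.items()
}
print(summary)
```

Et8 and Et141 fall just below 80 % against both Chinese sequences, while Ug96 falls below it against only one of them, which illustrates why the comparison is inconclusive.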

[Figure 4 tree. Subgroup b tip labels: Ug10, Tz16, Ke51, Tz36, Ug15, Ke16, Ke43, Tz42, Et22, Ke65, Ke72, Ug35, Ke14, Et158, Tz7, Tz27, Ug52, TaBCHV-1-NC026819, TaBCHV-2-KP710177, Et50, Et49 and Et17. Subgroup a tip labels: Ug96, Et141, Et4, Et8, Et72 and Et43. Outgroup: FBV1-JF411989 and CiYMV-AF347695.]

Figure 4. Phylogenetic analyses of the TaBCHV-like sequences characterised in this study. The analysis is based on the core 529 nt RT/RNase H-coding sequences delimited by the BadnaFP/RP primers. Et, Ke, Tz and Ug indicate isolates from Ethiopia, Kenya, Tanzania and Uganda, respectively, while TaBCHV-1 and -2 are described in Kazmi et al. (2015). Fig badnavirus 1 (FBV1) and citrus yellow mosaic virus (CiYMV) were used as outgroups (see Fig. 2).

Table 3. Pairwise sequence comparisons of TaBCHV isolates using core 529 nt RT/RNase H-coding sequences.

| | Tz16 | Ke51 | Ug10 | Tz36 | Ug15 | Ke16 | Ke43 | Tz42 | Et22 | Ug35 | Ke72 | Ke65 | Ke14 | TaBCHV-1 | TaBCHV-2 | Ug52 | Tz27 | Tz7 | Et158 | Et49 | Et17 | Et50 | Et8 | Et4 | Et43 | Et72 | Et141 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tz16 |
| Ke51 | 99.9 |
| Ug10 | 99.9 | 99.9 |
| Tz36 | 99.6 | 99.6 | 99.6 |
| Ug15 | 99.6 | 99.6 | 99.6 | 99.2 |
| Ke16 | 99.8 | 99.8 | 99.8 | 99.4 | 99.4 |
| Ke43 | 99.6 | 99.6 | 99.6 | 99.2 | 99.2 | 99.8 |
| Tz42 | 96.0 | 96.0 | 96.0 | 95.6 | 95.6 | 96.2 | 96.4 |
| Et22 | 96.4 | 96.4 | 96.4 | 96.0 | 96.0 | 96.2 | 96.0 | 96.0 |
| Ug35 | 91.3 | 91.3 | 91.3 | 90.9 | 91.3 | 91.1 | 90.9 | 93.4 | 94.1 |
| Ke72 | 91.8 | 91.8 | 91.8 | 91.5 | 91.8 | 92.0 | 91.8 | 93.9 | 94.7 | 98.7 |
| Ke65 | 92.6 | 92.6 | 92.6 | 92.2 | 92.6 | 92.8 | 92.6 | 94.7 | 95.4 | 97.9 | 98.9 |
| Ke14 | 89.2 | 89.2 | 89.2 | 88.8 | 89.2 | 89.4 | 89.2 | 89.9 | 91.3 | 92.6 | 93.4 | 93.7 |
| TaBCHV-1 | 87.3 | 87.3 | 87.3 | 86.9 | 87.7 | 87.1 | 86.9 | 86.1 | 88.0 | 89.0 | 88.8 | 89.2 | 90.3 |
| TaBCHV-2 | 86.9 | 86.9 | 86.9 | 86.5 | 87.3 | 86.7 | 86.5 | 85.8 | 87.7 | 88.6 | 88.4 | 88.8 | 89.9 | 99.2 |
| Ug52 | 87.9 | 87.9 | 87.9 | 87.5 | 88.2 | 87.7 | 87.5 | 86.7 | 88.4 | 88.6 | 89.0 | 89.0 | 90.5 | 92.6 | 92.2 |
| Tz27 | 91.8 | 91.8 | 91.8 | 91.5 | 91.8 | 91.7 | 91.5 | 90.3 | 92.0 | 91.7 | 92.0 | 92.4 | 93.2 | 91.5 | 91.1 | 93.0 |
| Tz7 | 91.8 | 91.8 | 91.8 | 91.5 | 91.8 | 91.7 | 91.5 | 90.3 | 92.0 | 91.7 | 92.0 | 92.4 | 93.2 | 91.5 | 91.1 | 93.0 | 99.9 |
| Et158 | 92.4 | 92.4 | 92.4 | 92.0 | 92.0 | 92.2 | 92.0 | 90.5 | 92.6 | 92.0 | 92.4 | 92.8 | 91.8 | 92.0 | 91.7 | 92.6 | 96.0 | 96.0 |
| Et49 | 89.4 | 89.4 | 89.4 | 89.0 | 89.4 | 89.6 | 89.4 | 89.4 | 91.5 | 89.8 | 90.5 | 90.7 | 89.8 | 88.8 | 88.4 | 88.8 | 90.9 | 90.9 | 92.2 |
| Et17 | 90.5 | 90.5 | 90.5 | 90.1 | 90.5 | 90.3 | 90.1 | 90.9 | 93.7 | 91.8 | 92.2 | 92.4 | 90.3 | 88.6 | 88.2 | 88.4 | 91.7 | 91.7 | 93.4 | 96.0 |
| Et50 | 86.3 | 86.3 | 86.3 | 86.0 | 86.3 | 86.1 | 86.0 | 85.2 | 86.9 | 87.3 | 87.1 | 87.1 | 86.3 | 89.0 | 88.6 | 86.7 | 89.2 | 89.2 | 91.5 | 91.1 | 91.1 |
| Et8 | 81.0 | 81.0 | 81.0 | 80.6 | 80.8 | 80.8 | 80.6 | 81.2 | 84.1 | 79.5 | 79.9 | 80.6 | 80.8 | 79.7 | 79.1 | 78.9 | 80.1 | 80.1 | 80.5 | 84.1 | 85.8 | 77.6 |
| Et4 | 81.0 | 81.0 | 81.0 | 80.6 | 80.8 | 80.8 | 80.6 | 81.2 | 84.1 | 79.5 | 79.9 | 80.6 | 80.8 | 79.7 | 79.1 | 78.9 | 80.1 | 80.1 | 80.5 | 84.1 | 85.8 | 77.6 | 99.9 |
| Et43 | 81.0 | 81.0 | 81.0 | 80.6 | 80.8 | 80.8 | 80.6 | 81.2 | 84.1 | 79.5 | 79.9 | 80.6 | 80.8 | 79.7 | 79.1 | 78.9 | 80.1 | 80.1 | 80.5 | 84.1 | 85.8 | 77.6 | 99.9 | 99.9 |
| Et72 | 81.0 | 81.0 | 81.0 | 80.6 | 80.8 | 80.8 | 80.6 | 81.2 | 84.1 | 79.5 | 79.9 | 80.6 | 80.8 | 79.7 | 79.1 | 78.9 | 80.1 | 80.1 | 80.5 | 84.1 | 85.8 | 77.6 | 99.9 | 99.9 | 99.9 |
| Et141 | 81.2 | 81.2 | 81.2 | 80.8 | 81.0 | 81.0 | 80.8 | 81.4 | 84.3 | 79.7 | 80.1 | 80.8 | 81.0 | 79.9 | 79.3 | 79.1 | 80.3 | 80.3 | 80.6 | 84.3 | 85.6 | 77.4 | 99.8 | 99.8 | 99.8 | 99.8 |
| Ug96 | 81.2 | 81.2 | 81.2 | 80.8 | 81.0 | 81.0 | 80.8 | 81.0 | 84.3 | 79.5 | 79.9 | 80.6 | 81.6 | 80.5 | 79.9 | 79.7 | 80.5 | 80.5 | 80.8 | 84.3 | 85.2 | 77.6 | 96.6 | 96.6 | 96.6 | 96.6 | 96.8 |

TaBCHV-1 is GenBank Accession No. NC026819; TaBCHV-2 is GenBank Accession No. KP710177


Five clear sequence groups having very high (>96 %) nucleotide sequence identity were identified, including the six isolates from ‘subgroup a’ (96.6 to 99.9 % identity), the two published TaBCHV sequences from China (99.2 % identity), isolates Tz7, Tz27 and Et158 (96 to 100 % identity), isolates Ug35, Ke72 and Ke65 (97.9 to 98.9 % identity) and the nine isolates forming the terminal TaBCHV subgroup (96 to 99.9 % identity). Between the various groups of TaBCHV isolates determined in the phylogenetic analysis, nucleotide sequence identity generally ranged from 85 to 94 %, which may explain the low bootstrap support for some branches in the phylogenetic analysis (Fig. 4; Table 3).
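One way to recover such sequence groups from a pairwise identity table is single-linkage grouping at a fixed threshold, sketched below on a small subset of the Table 3 values. This is an illustration rather than the procedure used in the study: the inclusive threshold of 96 %, the choice of single linkage and the function name are assumptions.

```python
# Selected pairwise identities (%) taken from Table 3.
PAIRS = {
    ("TaBCHV-1", "TaBCHV-2"): 99.2,
    ("Tz27", "Tz7"): 99.9,
    ("Tz27", "Et158"): 96.0,
    ("Tz7", "Et158"): 96.0,
    ("Et8", "Et4"): 99.9,
    ("Et8", "Ug96"): 96.6,
    ("Et4", "Ug96"): 96.6,
    ("TaBCHV-1", "Tz27"): 91.5,
    ("TaBCHV-1", "Et8"): 79.7,
    ("Tz27", "Et8"): 80.1,
}


def single_linkage_groups(pairs: dict, threshold: float) -> list[list[str]]:
    """Group isolates connected by at least one pair with identity >= threshold."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), identity in pairs.items():
        root_a, root_b = find(a), find(b)
        if identity >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups: dict[str, set[str]] = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in groups.values())


groups = single_linkage_groups(PAIRS, threshold=96.0)
print(groups)
```

On this subset the sketch returns three groups, corresponding to three of the five groups described above.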

Discussion

Several surveys were carried out in 2014 and 2015 to identify viruses infecting taro and other edible aroids in East Africa. Using a PCR-based strategy with the degenerate badnavirus primers, BadnaFP/RP, a high incidence of badnavirus-like sequences was found in taro growing in Ethiopia, Kenya, Tanzania and Uganda. This ranged from 58.4% to 74.4% of samples from each country, with at least one PCR-positive sample detected in every district surveyed. Similar to previous studies (Yang et al., 2003b; Revill et al., 2005), no correlation was observed between the presence of the badnavirus-like sequences and symptoms in either taro or tannia plants. However, since mixed infections are common in taro (Revill et al., 2005), testing the samples for other viruses is necessary to shed further light on any symptoms associated with badnavirus infection. Sequence analysis of the RT/RNase H-coding region of 40 isolates amplified using PCR revealed greatest nucleotide sequence identities to either TaBV or TaBCHV, with 14 samples showing highest (96-97%) nucleotide sequence identity to TaBV from New Caledonia, while the remaining 26 samples showed highest (79-92%) nucleotide sequence identity to TaBCHV from China. In Ethiopia, sequences similar to only TaBCHV were detected, while both TaBV- and TaBCHV-like sequences were detected from Uganda, Kenya and Tanzania. Of the two tannia samples selected for sequencing, TaBV was detected from one sample (Tz24), while TaBCHV was detected from a second sample (Tz27).


Since the BadnaFP/RP-generated amplicons could have been derived from either integrated sequences or episomal virus, RCA was used in an attempt to specifically amplify episomal viral genomic DNA. Whereas RCA amplified the complete genome of TaBV isolates, no amplification products were obtained using samples containing the TaBCHV-like sequences. Therefore, the latter samples were analysed using an outward-facing PCR strategy which resulted in the amplification of full-length East African TaBCHV genomes. Interestingly, analysis of the cloned TaBCHV sequences revealed the presence of the restriction sites StuI and XbaI, which were predicted from the published TaBCHV sequence from China and which were used to digest the RCA-amplified DNA from these samples. Despite the presence of high molecular weight amplification products in RCA reactions using samples shown to contain TaBCHV, the RCA-amplified products did not digest with StuI and XbaI as expected. The reason for this is unknown but could be due to very low levels of target episomal DNA in taro plants, as has been reported with badnaviruses from sweet potato (Kreuze et al., 2017).

The genome organisation of the TaBV isolates infecting taro from East Africa is consistent with the previously published South Pacific TaBV isolates with four ORFs (Yang et al., 2003a). The genome organisation of the TaBV isolate infecting tannia is also consistent with the taro-infecting TaBV isolates identified from East Africa and the South Pacific. In contrast, although the five TaBCHV isolates from East Africa had similar genome organisations and each contained four ORFs, this differs from the previously published Chinese TaBCHV isolate, which was reported to encode six ORFs (Kazmi et al., 2015). Recently, Wang et al. (2018) reported a full-length sequence of TaBCHV infecting taro from Hawaii, USA. The genome of this Hawaiian TaBCHV isolate contained five ORFs. The sizes and locations of ORFs 1, 2, 3 and 5 are consistent with ORFs 1-4 of TaBCHV isolates from East Africa. However, unlike TaBCHV isolates from East Africa, TaBCHV-Hawaii possesses an overlapping ORF within ORF 3 (Wang et al., 2018). Of the five East African TaBCHV isolates sequenced in the current study, three (Ke43, Ug10 and Tz36) are representative of a small subset in the terminal branch of ‘subgroup b’ in the phylogenetic analysis, while Et17 is a basal member of this subgroup (Fig. 4). The sole TaBCHV isolate from tannia (Tz27) formed another small subset within ‘subgroup b’ together with previously published TaBCHV isolates from China and other isolates from Ethiopia and Uganda (Fig. 4). Based on the genome organisation and phylogenetic analysis, it could be inferred that all members of ‘subgroup b’ would have four ORFs, but interestingly the Chinese TaBCHV sequence, which falls into a distinct group of isolates within ‘subgroup b’, has two additional ORFs. One of these ORFs is analogous to the TaBV ORF 4, while the other, ORF 6, is located at a position downstream of the ORF 4 described herein from TaBCHV isolates from East Africa. Additional sequencing of isolates from the various TaBCHV groups within ‘subgroup b’ of the phylogenetic tree is needed to clarify these differences in genome organisation.

Phylogenetic analysis showed that all East African TaBV isolates form a single subgroup within known TaBV isolates and are most similar to a published isolate from New Caledonia (Fig. 3). This may indicate that a single isolate of TaBV was initially introduced to East Africa and has since been disseminated throughout three of the countries in the region. Phylogenetic analysis of TaBCHV isolates from East Africa showed that they form two distinct subgroups (Fig. 4). PASC of the isolates within these two subgroups suggests that ‘subgroup a’ may be distinct enough from some members of ‘subgroup b’ to be considered a distinct species. However, when all sequences in this group are considered, there is no clear delineation of species based on the current criterion for species demarcation in the genus Badnavirus of 20% nucleotide sequence variability in the core RT/RNase H-coding region of ORF 3 (Table 3). Determining whether the members of ‘subgroup a’ represent a novel badnavirus species requires further sequencing of TaBCHV isolates from East Africa and other regions.

Initial characterisation of badnaviruses infecting taro from the South Pacific in 2003 by Yang et al. (2003a) reported a single virus species represented by a single full-length genome sequence of a PNG isolate (GenBank accession no. NC004450) and partial genome sequences of isolates from Fiji, Solomon Islands, Vanuatu, New Caledonia, French Polynesia and Samoa (Yang et al., 2003a, b). The name taro bacilliform virus (TaBV) was subsequently accepted for this viral species (Fauquet et al., 2005). More recently, Ming et al. (2013) reported a new species of badnavirus infecting taro from China (GenBank accession no. NC026819) and Kazmi et al. (2015) determined the complete genome sequence of two isolates using sequence-specific PCR amplification and small RNA (sRNA) sequencing. The name taro bacilliform CH virus (TaBCHV) was accepted for this new viral species within the genus Badnavirus (Geering and Teycheney, 2016). This current study is the first to identify and characterise TaBV and TaBCHV isolates infecting taro and tannia in East Africa and the possible presence of a new badnavirus species in Ethiopia and Uganda. To have a consistent naming of badnaviruses infecting taro and other aroids, we propose that Taro bacilliform virus (TaBV) be renamed taro bacilliform PNG virus (TaBPNGV) to include the name of the country from which the virus was first reported (Papua New Guinea).

Virus infection in taro has been reported to affect both the quality and quantity of the harvested corms, with production losses ranging from 20 to 60% and, in some cases, plant death. These losses often result from the synergistic interactions of multiple virus infections (Revill et al., 2005; Rana et al., 1983; Elliott et al., 1997); however, the role of badnaviruses in these interactions remains poorly understood. This study confirmed the widespread occurrence of two known badnavirus species, TaBV and TaBCHV, in East Africa. Further, in the case of TaBCHV, at least two genetically distinct subgroups were identified. To our knowledge, this is the first report of TaBV and TaBCHV in these countries and the first sequence record from tannia.


Data Availability Statement

Sequences described in this paper are available in GenBank as accession numbers MG017321–MG017360 and MG833013–MG833014.

Acknowledgments

This project was funded by the Biosciences eastern and central Africa (BecA–ILRI) Hub through the African Biosciences Challenge Fund (ABCF). The ABCF program is supported by the Australian Department of Foreign Affairs and Trade (DFAT) through the BecA-CSIRO partnership; the Syngenta Foundation for Sustainable Agriculture (SFSA); the Bill and Melinda Gates Foundation (BMGF); the UK Department for International Development (DFID) and the Swedish International Development Agency (SIDA). DK is the recipient of an Australia Awards Scholarship.

Conflict of interest

The authors declare no conflict of interest.


References

Akwee PE, Netondo G, Kataka JA, 2015. A critical review of the role of taro Colocasia esculenta L. (Schott) to food security: A comparative analysis of Kenya and Pacific Island taro germplasm. Scientia Agriculturae 9, 101–08.

Beyene TM, 2013. Morpho-agronomical characterization of taro (Colocasia esculenta) accessions in Ethiopia. SciencePG 1, 1–9.

Bhat AI, Hohn T, Selvarajan R, 2016. Badnaviruses: the current global scenario. Viruses 8, 177.

Borah BK, Sharma S, Kant R, 2013. Bacilliform DNA-containing plant viruses in the tropics: commonalities within a genetically diverse group. Molecular Plant Pathology 14, 759–71.

Elliott MS, Zettler FW, Brown LG, 1997. Dasheen mosaic potyvirus of edible and ornamental aroids. University of Florida, Plant Pathology Circular, 384.

Fauquet C, Mayo MA, Maniloff J, 2005. Virus Taxonomy: Eighth report of the International Committee on Taxonomy of Viruses. New York, NY, USA: Elsevier Academic Press.

Geering A, Hull R, 2012. Caulimoviridae. In: King AMQ, Adams MJ, Carstens EB, eds. Virus Taxonomy, Ninth report of the International Committee on Taxonomy of Viruses. New York, NY, USA: Elsevier, 429–43.

Geering A, Teycheney P, 2016. Two new species in the genus Badnavirus. https://talk.ictvonline.org/files/ Accessed 27 October 2017.

Gollifer DE, Jackson GVH, Dabek AJ, 1977. The occurrence and transmission of viruses of edible aroids in the Solomon Islands and the Southwest Pacific. International Journal of Pest Management 23, 171–77.

Harrison J, Moore KA, Paszkiewicz K et al., 2014. A draft genome sequence for Ensete ventricosum, the Drought-Tolerant “Tree against hunger”. Agronomy 4, 13–33.

Hull R, Covey SN, 1983. Does cauliflower mosaic virus replicate by reverse transcription? Trends in Biochemical Sciences 8, 119–21.

Iskra-Caruana ML, Duroy PO, Chabannes M, 2014. The common evolutionary history of badnaviruses and banana. Infection, Genetics and Evolution 21, 83–89.

Jacquot M, Hagen LS, Jacquemond M, 1996. The open reading frame 2 product of Cacao Swollen Shoot Badnavirus is a nucleic acid-binding protein. Virology 225, 191–95.

James AP, Geijskes RJ, Dale JL, 2011. Development of a novel rolling-circle amplification technique to detect Banana streak virus that also discriminates between integrated and episomal virus sequences. Plant Disease 95, 57–62.


Kazmi SA, Yang Z, Hong N, 2015. Characterization by small RNA sequencing of Taro Bacilliform CH Virus (TaBCHV), a novel Badnavirus. PLoS One 10, e0134147.

Kenyon L, Lebas BSM, Seal SE, 2008. Yams (Dioscorea spp.) from the South Pacific Islands contain many novel badnaviruses: implications for international movement of yam germplasm. Archives of Virology 153, 877–89.

Kidanemariam D, Sukal A, Crew K et al., 2018. Characterization of an Australian isolate of taro bacilliform virus and development of an infectious clone. Archives of Virology doi: 10.1007/s00705-018-3783-0

Kleinow T, Nischang M, Beck A et al., 2009. Three C-terminal phosphorylation sites in the Abutilon mosaic virus movement protein affect symptom development and viral DNA accumulation. Virology 390, 89–101.

Kreuze J, Perez A, Galvez M, 2017. Badnaviruses of Sweetpotato: symptomless co-inhabitants on a global scale. bioRxiv, 140517.

Lebot V, 2009. Tropical root and tuber crops: cassava, sweet potato, yams and aroids. Crop Production Science in Horticulture, 17. CABI, Wallingford, UK.

Macanawai AR, Ebenebe AA, Hunter D, 2005. Investigations into the seed and mealybug transmission of Taro bacilliform virus. Australasian Plant Pathology 34, 73–76.

Mariame F, Gelmesa D, 2006. Review of the status of vegetable crops production and marketing in Ethiopia. Uganda Journal of Agricultural Science 12, 26–30.

Ming SFY, Ping GW, Ping LW, 2013. Molecular identification and specific detection of Badnavirus from taro grown in China. Acta Phytopathologica Sinica 6, 590–95.

Muller E, Dupuy V, Blondin L et al., 2011. High molecular variability of sugarcane bacilliform viruses in Guadeloupe implying the existence of at least three new species. Virus Research 160, 414–19.

Ndabikunze BK, Talwana HAL, Mongi RJ et al., 2011. Proximate and mineral composition of cocoyam (Colocasia esculenta L. and Xanthosoma sagittifolium L.) grown along the Lake Victoria Basin in Tanzania and Uganda. African Journal of Food Science 5, 248–54.

Onwueme IC, Charles WB, 1994. Cultivation of cocoyam. In: Tropical root and tuber crops. Production, perspectives and future prospects. FAO Plant Production and Protection Paper 126, Rome 139-61.

Rana GL, Vovlas C, Zettler FW, 1983. Manual transmission of dasheen mosaic virus from Richardia to nonaraceous hosts. Plant Disease 67, 1121–22.

Revill P, Jackson G, Hafner G et al., 2005. Incidence and distribution of viruses of taro (Colocasia esculenta) in Pacific Island countries. Australasian Plant Pathology 35, 327–31.


Seal S, Turaki A, Muller E et al., 2014. The prevalence of badnaviruses in West African yams (Dioscorea cayenensis-rotundata) and evidence of endogenous pararetrovirus sequences in their genomes. Virus Research 186, 144–54.

Talwana HAL, Serem AK, Ndabikunze BK et al., 2009. Production status and prospects of Cocoyam (Colocasia esculenta (L.) Schott.) in East Africa. Journal of Root Crops 35, 98–07.

Tumuhimbise R, Talwana HL, Osiru DSO et al., 2009. Growth and development of wetland-grown taro under different plant populations and seedbed types in Uganda. African Crop Science Journal 17, 49–60.

Wang Y, Borth WB, Green JC et al., 2018. Genome characterization and distribution of Taro bacilliform CH virus on taro in Hawaii, USA. European Journal of Plant Pathology 150, 1107–11.

Yang IC, Hafner GJ, Dale JL et al., 2003a. Genomic characterization of taro bacilliform virus. Archives of Virology 148, 937–49.

Yang IC, Hafner GJ, Revill PA et al., 2003b. Sequence diversity of South Pacific isolates of Taro bacilliform virus and the development of a PCR-based diagnostic test. Archives of Virology 148, 1957–68.


Chapter 5

Characterisation of an Australian isolate of taro bacilliform virus and development of an infectious clone

Dawit B. Kidanemariam1, 2, Amit C. Sukal1, Kathy Crew3, Grahame V. H. Jackson4, Adane D. Abraham5, James L. Dale1, Robert M. Harding1, Anthony P. James1*

1 Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, 4001, Australia
2 National Agricultural Biotechnology Research Center, Ethiopian Institute of Agricultural Research, P.O. Box 2003, Addis Ababa, Ethiopia
3 Department of Agriculture and Fisheries, Eco-sciences Precinct, Dutton Park, Brisbane, 4102, Australia
4 24 Alt St, Queens Park, NSW 2022, Australia
5 Department of Biotechnology, Addis Ababa Science and Technology University, P.O. Box 16417, Addis Ababa, Ethiopia

Archives of Virology 163:1677–1681


Statement of Contribution of Co-Authors of Thesis by Publication Paper

The authors listed below have certified that:
1. They meet the criteria for authorship in that they have participated in the conception, execution, or interpretation, of at least that part of the publication in their field of expertise;
2. They take public responsibility for their part of the publication, except for the responsible author who accepts overall responsibility for the publication;
3. There are no other authors of the publication according to these criteria;
4. Potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor or publisher of journals or other publications, and (c) the head of the responsible academic unit, and
5. They agree to the use of the publication in the student’s thesis and its publication on the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter: Characterisation of an Australian isolate of taro bacilliform virus and development of an infectious clone


Abstract

The badnavirus, taro bacilliform virus (TaBV), has been reported to infect taro (Colocasia esculenta L.) and other edible aroids in several South Pacific island countries but there are no published reports from Australia. Using PCR and RCA, we identified and characterized an Australian TaBV isolate. A terminally redundant cloned copy of the TaBV genome was generated and shown to be infectious in taro following agro-inoculation. This is the first report of TaBV from Australia and also the first report of an infectious clone for this virus.

Keywords

Colocasia esculenta, Badnavirus, Caulimoviridae, rolling circle amplification, episomal DNA

Taro bacilliform virus (TaBV) is a member of the genus Badnavirus, family Caulimoviridae [4]. TaBV has a natural host range restricted to aroids and is transmitted by vegetative propagation, mealybugs in a semi-persistent manner and in some cases through seed or pollen, but is not mechanically transmissible [5, 14]. To date, TaBV isolates have only been characterized from several South Pacific Island countries, including Fiji, Solomon Islands, Vanuatu, New Caledonia, French Polynesia and Samoa [17, 23]. A second badnavirus, Taro bacilliform CH virus (TaBCHV), has been reported from China and the USA [13, 16, 20].

Badnaviruses are characterized by non-enveloped, bacilliform-shaped particles of 30 nm by 120-150 nm and circular, double-stranded DNA genomes of 7.2 to 9.2 kb, typically encoding three open reading frames (ORFs) [4]. The function of the protein encoded by ORF 1 is unknown, while the ORF 2 protein has non-specific DNA- and RNA-binding activity and may be involved in virion assembly [9]. ORF 3 encodes a large polyprotein (∼200 kDa) which is processed into several mature functional proteins including a movement protein (MP), coat protein (CP), aspartic protease (AP), reverse transcriptase (RT) and ribonuclease H (RNase H) [4]. The RT/RNase H-coding sequence of ORF 3 is the most conserved region of the genome and a nucleotide difference of more than 20 % in this region is used for demarcation of species in the genus [4]. The genus Badnavirus contains the most diverse and heterogeneous viruses within the family Caulimoviridae, both at the genomic and antigenic level, and is currently grouped into forty distinct species. The majority of known badnaviruses infect tropical crops including banana, yam, taro, sugar cane, pepper, citrus and cacao (https://talk.ictvonline.org/taxonomy/). The genome of TaBV possesses four ORFs, with the size and organization of ORFs 1-3 consistent with most badnaviruses [22]. ORF 4 of TaBV overlaps ORF 3 between the MP and CP domains and putatively encodes a protein of ∼13 kDa, with little homology to any published protein-coding sequences.

TaBV-infected taro plants are typically symptomless although, in some cases, vein-clearing, stunting and downward-curling of the leaf blades have been reported [17, 22-23]. However, dual infection of taro with TaBV and colocasia bobone disease-associated virus (CBDaV), a putative rhabdovirus [7], is believed to cause the lethal disease called Alomae in Papua New Guinea (PNG) and Solomon Islands [5, 12, 17]. The inability to mechanically transmit TaBV has not only hindered investigations into symptoms and yield losses associated with virus infection, but also the contribution of TaBV to the Alomae disease complex.

Plant virus infectious clones are being increasingly used as a simple and efficient means to study plant-virus interactions. Infectious clones of several badnaviruses have been reported, including commelina yellow mottle virus, citrus yellow mosaic virus, cacao swollen shoot virus and sugarcane bacilliform virus [3, 8, 10, 15]. The recent development of rolling circle amplification (RCA) has not only facilitated the amplification and detection of the complete genome sequence of circular DNA viruses, including badnaviruses [1, 11, 19], but has also greatly simplified the development of infectious clones of viruses with circular DNA genomes such as geminiviruses [6, 21]. In this study, we used RCA to amplify the full-length genome of an Australian TaBV isolate and describe the development of a greater-than-genome-length cloned copy of the virus DNA which is infectious in taro.


In 2013, 24 taro leaf samples were collected from several field sites in north and south-east Queensland and northern New South Wales, Australia (Table 1). Samples were desiccated over silica gel and total nucleic acids (TNA) were extracted using a CTAB-based protocol [11]. Samples were initially tested for TaBV by PCR using the primers 12F and CP-R to amplify a 560 bp fragment of the CP-coding region [23]. Briefly, 1 μL of TNA was mixed with 10 μL of 2x GoTaq Green Master Mix (Promega, Australia) and 5 pmol of each primer in a 20 μL reaction volume. PCR cycling conditions included an initial denaturation step at 94 °C for 3 min followed by 35 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 1 min and a final extension step at 72 °C for 5 min. PCR amplicons were separated by electrophoresis through 1.5 % agarose gels. Nine samples tested positive for TaBV (Table 1), three of which (7, 12 and 24) were randomly selected for further analysis by PCR using the degenerate primers BadnaFP/RP [22]. Amplicons of the expected size (589 bp) were obtained from all three samples, and these were gel-excised, cloned into pGEM-T® Easy (Promega, Australia) and sequenced using the Big Dye® Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, Australia). For each sample, three independent clones were sequenced. Sequences were processed using the CLC Main Workbench v6.9.2 (QIAGEN) and Geneious v10.2.2 (Biomatters, New Zealand) computer software programs. In each case, the three independent sequence reads showed 98-99 % nucleotide identity and the consensus sequence for each could be translated to give a predicted functional protein sequence. The three consensus sequences showed 82.1 to 86.5 % identity to each other at the nucleotide level. Subsequent BLAST analysis, using the 529 nt sequences excluding the BadnaFP/RP priming sites, showed that sample 7 had 92.8 % identity to a PNG TaBV isolate, while sample 12 was identical to a New Caledonian TaBV isolate and sample 24 had 97.9 % identity to a Fijian TaBV isolate.
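The reaction set-up and cycling conditions described above imply a few derived quantities that are easy to verify. The following Python sketch is illustrative only: the function names are hypothetical, and the run time counts programmed hold times while ignoring temperature ramping, so an actual thermocycler run would take longer.

```python
# Illustrative arithmetic for the 12F/CP-R PCR described in the text.
# Function names are hypothetical; ramp times between steps are ignored.


def final_fold(stock_fold, added_ul, total_ul):
    """Final concentration (in 'x') of a master mix after dilution."""
    return stock_fold * added_ul / total_ul


def primer_micromolar(pmol, total_ul):
    """Final primer concentration; pmol per microlitre equals micromolar."""
    return pmol / total_ul


def program_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Total programmed hold time of a three-stage PCR program."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


master_mix = final_fold(2.0, 10.0, 20.0)        # 10 uL of 2x mix in 20 uL
primer_um = primer_micromolar(5.0, 20.0)        # 5 pmol in 20 uL
hold_s = program_seconds(3 * 60, 35, (30, 30, 60), 5 * 60)

print(f"master mix: {master_mix:g}x, primers: {primer_um:g} uM each, "
      f"hold time: {hold_s / 60:g} min")
```

Under these assumptions the master mix is at 1x, each primer is at 0.25 μM, and the programmed hold time is 78 min.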


Table 1. Sampling locations and results of PCR testing for TaBV in taro leaf samples.

Sample   Location¹           PCR result²
1        Cairns              -
2        .                   -
3        .                   -
4        .                   -
5        El Arish            -
6        .                   -
7        .                   +
8        .                   +
9        Innisfail           +
10       .                   +
11       .                   -
12       Tully               +
13       .                   +
14       Ingham              -
15       Mackay              -
16       .                   -
17       Brisbane (north)    -
18       .                   -
19       .                   -
20       Cudgen              -
21       .                   -
22       Brisbane (south)    +
23       .                   +
24       .                   +

¹ All locations are in Queensland except Cudgen, which is in New South Wales.
² Result of PCR screening using primers 12F/CP-R as described by Yang et al. [23].


To generate the complete genome sequence of a representative Australian TaBV isolate, TNA from sample 7 (hereafter referred to as TaBV-Aus7) was subjected to RCA using the Illustra™ TempliPhi 100 Amplification Kit (GE Healthcare, UK) as previously described [11]. Based on in silico restriction site analysis of a published full-length TaBV sequence (GenBank accession no. NC004450), the restriction enzymes SalI and StuI were selected for digesting the RCA product as they were predicted to have one and two recognition sites, respectively. When the digested RCA products were separated by electrophoresis, no RFLP profile was observed using SalI, whereas a single, putative full-length fragment of ∼7.5 kb was obtained using StuI. The StuI fragment was excised and cloned into SmaI-digested and dephosphorylated pUC19 vector, and the complete genome was sequenced as described previously using a primer-walking approach. The nucleotide sequence flanking the StuI site was confirmed by PCR using sequence-specific primers and subsequent cloning and sequencing of the amplicons. Analysis of the complete sequence confirmed the presence of a single StuI site, with no SalI sites present in the full-length sequence.
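The in silico restriction analysis described above amounts to counting recognition sites on a circular sequence and deriving the expected fragment sizes. The Python sketch below is a minimal illustration, not the software used in this study; the function names and the toy sequence in the usage note are hypothetical. The recognition sequences are those of SalI (GTCGAC) and StuI (AGGCCT).

```python
# Minimal sketch of in silico restriction site analysis on a circular genome.
# Positions are 1-based and refer to the first base of the recognition site.

RECOGNITION_SITES = {"SalI": "GTCGAC", "StuI": "AGGCCT"}


def find_sites_circular(genome, site):
    """Return positions of `site` in a circular sequence, including any
    occurrence that spans the junction between the last and first base."""
    genome = genome.upper()
    extended = genome + genome[: len(site) - 1]
    return [i + 1 for i in range(len(genome)) if extended.startswith(site, i)]


def fragment_sizes(genome_length, positions):
    """Expected fragment sizes after complete digestion of a circle.
    No site leaves the circle uncut (empty list); one site linearises it."""
    cuts = sorted(positions)
    sizes = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        sizes.append((nxt - cut) % genome_length or genome_length)
    return sizes
```

For a hypothetical 18-nt circle `CCTAAAGTCGACTTTAGG`, `find_sites_circular` reports one SalI site at position 7 and one StuI site at position 16, the latter spanning the junction. An enzyme with a single site yields one genome-length fragment from an RCA product, as observed here with StuI, whereas an enzyme with no site yields no discrete band, as observed with SalI.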

The complete genome sequence of TaBV-Aus7 was determined to be 7,494 nt. Sequence analysis identified three putative ORFs consistent with the typical genome organization of badnaviruses. ORF 1 comprised 441 nt and encoded a putative protein of 146 aa (Mr = 16.6 kDa), while ORF 2 was 435 nt encoding a putative protein of 144 aa (Mr = 15.7 kDa). ORF 3 was 5,664 nt in length encoding a putative protein of 1,887 aa (Mr = 215.2 kDa) with conserved motifs identified for the MP, CP, AP, RT, RNase H and RNA-binding zinc finger-like domains of badnaviruses (Figure 1). There was a single-nucleotide overlap between the ORF 1 and 2 stop/start codons (TGATG), which is consistent with the previously published TaBV sequence from PNG. Whereas the published TaBV-PNG sequence has a two-nucleotide gap between ORF 2 and 3, a three-nucleotide gap was present between ORF 2 and 3 in TaBV-Aus7.
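The ORF and genome lengths reported above are internally consistent, which can be checked with a few lines of Python. This is a sketch under one stated assumption: each reported ORF length includes its stop codon. The variable and function names are hypothetical.

```python
# Consistency check of the reported TaBV-Aus7 genome organisation.
# Assumption: each ORF length includes the terminal stop codon.

GENOME_NT = 7494
ORF_NT = {"ORF 1": 441, "ORF 2": 435, "ORF 3": 5664}
ORF1_ORF2_OVERLAP_NT = 1  # shared A in TGATG (TGA stop / ATG start)
ORF2_ORF3_GAP_NT = 3


def encoded_aa(orf_nt):
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3 - 1


protein_aa = {name: encoded_aa(nt) for name, nt in ORF_NT.items()}

# Span from the first base of ORF 1 to the last base of ORF 3.
coding_span_nt = (sum(ORF_NT.values())
                  - ORF1_ORF2_OVERLAP_NT + ORF2_ORF3_GAP_NT)
intergenic_nt = GENOME_NT - coding_span_nt

print(protein_aa, coding_span_nt, intergenic_nt)
```

The calculation gives 146, 144 and 1,887 aa for ORFs 1 to 3, a coding span of 6,542 nt and a remainder of 952 nt, which matches the intergenic region length reported for this isolate.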

Figure 1. Schematic representation of the linearised genome of TaBV-Aus7, showing the three ORFs (ORF 1, ORF 2 and ORF 3) and the conserved motifs in the ORF 3 polyprotein (MP, CP, Zn, AP, RT and RNase H). Single-cutting restriction sites used for the preparation of the infectious clone are also shown: MluI (832), XhoI (3469) and XbaI (6916).

The intergenic region (IR) was 952 nt in length and included a putative tRNAmet binding site with 78% nucleotide identity to the plant tRNAmet consensus sequence. This site was designated as the origin of the circular genome, consistent with the convention currently used for badnaviruses. Interestingly, unlike the published TaBV sequence from PNG, TaBV-Aus7 does not possess an ORF 4.

To generate an infectious clone of TaBV-Aus7, RCA was used on TNA to amplify the episomal viral DNA and, based on analysis of the full-length sequence, two separate double digestions were carried out. Initially, XbaI and XhoI were used to cut the TaBV-Aus7 genome within ORF 3 at the 3' and 5' ends, respectively, to generate two fragments of 4,047 nt and 3,447 nt. The ∼4 kb fragment, which includes the last four nucleotides at the 3' end of ORF 3, the complete IR, ORF 1, ORF 2 and the first 2,213 nt of ORF 3 (Figure 1), was gel-excised. XhoI and MluI were subsequently used to generate fragments of 4,857 nt and 2,637 nt. The ∼4.8 kb fragment, which includes 3,451 nt of ORF 3 from the XhoI site used previously, as well as the IR, ORF 1 and the first 14 nt of ORF 2 (Figure 1), was also gel-excised. The binary vector pOPT-NXT, containing a multiple cloning site and nptII plant selection cassette, was then double-digested using XbaI and AscI and dephosphorylated.

The ∼4 kb fragment from the XbaI/XhoI digest, together with the ∼4.8 kb XhoI/MluI fragment and the XbaI/AscI-digested pOPT-NXT, were ligated, resulting in a terminally redundant Aus7 molecule of 8,904 nt (∼1.2x the genome of Aus7) in the binary vector. The pOPT-NXT-Aus7 DNA was transferred into Agrobacterium strain Agl1 by electroporation and inoculum was prepared as previously described [18]. Inoculum was injected [2] at the base of the pseudostem of five individual six-week-old tissue-cultured taro plants, while three plants were inoculated with the pOPT-NXT vector alone and two additional plants were maintained as non-inoculated controls. These plants (variety Bun Long) were obtained from a commercial tissue-culture laboratory (Plant Biotech, Palmwoods, Australia) and all tested negative for TaBV using RCA prior to experimentation.
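The fragment sizes and the extent of terminal redundancy in the construct follow directly from the genome length and the three restriction site coordinates given in Figure 1. The Python sketch below reproduces this arithmetic; it assumes the figure coordinates can be used directly as cut positions, and the function and variable names are hypothetical.

```python
# Arithmetic behind the greater-than-genome-length TaBV-Aus7 construct.
# Assumption: the Figure 1 site coordinates are used directly as cut positions.

GENOME_NT = 7494
CUTS = {"MluI": 832, "XhoI": 3469, "XbaI": 6916}


def arc(start, end, genome_nt=GENOME_NT):
    """Length of the arc running from `start` to `end` in the direction of
    increasing coordinates on the circular genome (wrapping at the origin)."""
    return (end - start) % genome_nt


# Fragment 1: XbaI -> origin -> XhoI (carries the IR, ORF 1 and ORF 2).
xba_xho_nt = arc(CUTS["XbaI"], CUTS["XhoI"])
# Fragment 2: XhoI -> origin -> MluI (carries the rest of ORF 3 and a
# second copy of the IR, ORF 1 and the start of ORF 2).
xho_mlu_nt = arc(CUTS["XhoI"], CUTS["MluI"])

insert_nt = xba_xho_nt + xho_mlu_nt
redundant_nt = insert_nt - GENOME_NT          # region present twice
genome_copies = insert_nt / GENOME_NT

print(xba_xho_nt, xho_mlu_nt, insert_nt, redundant_nt, round(genome_copies, 2))
```

This gives fragments of 4,047 nt and 4,857 nt, an insert of 8,904 nt and a duplicated region of 1,410 nt (the XbaI-to-MluI arc), corresponding to approximately 1.19 genome copies and in agreement with the ∼1.2x figure stated above.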


All plants were kept in a growth room at 25 °C with a 12 h photoperiod. At 12 weeks post-inoculation, no distinct symptoms indicative of viral infection were observed on any of the inoculated or control plants. However, growth characteristics such as leaf size, number of leaves and plant height appeared reduced in all five taro plants inoculated with pOPT-NXT-Aus7 (Figure 2A). Furthermore, by 20 weeks post-inoculation, downward-curling of the leaf blades was observed on taro plants inoculated with pOPT-NXT-Aus7, but not on any control plants (Figure 2B).

Leaf samples were collected from all plants at 12 weeks post-inoculation, TNA was extracted as described previously and PCR was carried out to check for any residual Agrobacterium using primers Agl1-F (ATCATTTGTAGCGACT) and Agl1-R (AGCTCAAACCTGCTTC) targeting the virC operon. Of the five pOPT-NXT-Aus7-inoculated taro plants, one tested positive for residual Agrobacterium contamination. RCA was then carried out on plant TNA to screen for the presence of episomal TaBV DNA. RCA products from the five pOPT-NXT-Aus7-inoculated, three pOPT-NXT-inoculated and two non-inoculated control plants were digested with StuI, which cuts the Aus7 sequence at a single site. Fragments of the expected size were obtained from all five taro plants inoculated with pOPT-NXT-Aus7; however, no fragments were obtained from taro plants inoculated with pOPT-NXT or from the non-inoculated control plants (Figure 2C).

The ∼7.5 kb StuI-digested fragment from one of the taro plants inoculated with the infectious clone was excised and ligated into linearized (SmaI-digested) and dephosphorylated pUC19, and the RT/RNase H-coding region was sequenced as described earlier. Pairwise sequence comparison of the RT/RNase H-coding region showed that the sequence amplified from the inoculated taro plant was identical to the original TaBV-Aus7 sequence. This result confirmed the infectivity of pOPT-NXT-Aus7 in taro plants, which are the natural host of TaBV.

Figure 2. Phenotypic and molecular analysis of pOPT-NXT-Aus7 inoculated taro plants. (A) Taro plants at 12 weeks post-inoculation; (i) non-inoculated control plant with no symptoms and (ii) inoculated plant showing reduced growth. (B) Taro plants at 20 weeks post-inoculation; (i) non-inoculated control plant with normal leaf morphology and (ii) inoculated plant showing downward-curling of the leaf margin. (C) Agarose gel of StuI-digested RCA-amplified DNAs from inoculated or non-inoculated taro plants. M is HyperLadder 1 (Bioline, Australia); lane 1 is the positive control (original Aus7 sample); lanes 2-6 are the five taro plants inoculated with pOPT-NXT-Aus7; lanes 7 and 8 are the two non-inoculated taro plants; lanes 9-11 are the three taro plants inoculated with pOPT-NXT (empty vector control); and lane 12 is a no template control. Arrow indicates 8 kbp marker fragment.

This report describes the first complete genome sequence of an Australian TaBV isolate (TaBV-Aus7) which was obtained using RCA. The size and genome organization of ORFs 1-3 of TaBV-Aus7 were similar to a published TaBV sequence from PNG [22] except that, whereas the PNG TaBV isolate has four ORFs, TaBV-Aus7 has only three ORFs. Analysis of partial sequences from three TaBV isolates revealed high nucleotide identity to TaBV isolates from PNG, New Caledonia and Fiji. Using RCA-amplified viral DNA, a greater-than-genome-length cloned copy of TaBV-Aus7 was constructed and shown to be infectious in taro. Downward-curling of leaf blades, a symptom sometimes associated with TaBV infection, was observed on inoculated taro plants after 20 weeks and plants were shown to be infected with TaBV-Aus7 using RCA. This is the first report describing the development of an infectious clone of TaBV which may serve as an important tool to facilitate further investigation into the virus host range, symptoms and yield loss. The infectious clone may also have utility in determining the possible role of TaBV in the etiology of the lethal Alomae disease.

Acknowledgments

The authors are grateful to Dr. Ben Dugdale, Queensland University of Technology, for providing the pOPT-NXT vector for cloning purposes. DK is the recipient of an Australia Awards Scholarship.

Data Availability

Sequences described in this paper are available under GenBank accession numbers MG017318-MG017320.

Conflict of interest

The authors declare they have no conflict of interest.

Ethical approval

This article does not contain any work conducted on animal or human participants.


References

1. Bomer M, Turaki A, Silva G, Kumar P, Seal S (2016) A sequence-independent strategy for amplification and characterisation of episomal badnavirus sequences reveals three previously uncharacterised yam badnaviruses. Viruses 8:188

2. Boulton M, Buchholz W, Marks M, Markham P, Davies J (1989) Specificity of Agrobacterium-mediated delivery of maize streak virus DNA to members of the Gramineae. Plant Mol Biol 12:31–40

3. Bouhida M, Lockhart B, Olszewski N (1993) An analysis of the complete sequence of a sugarcane bacilliform virus genome infectious to banana and rice. J Gen Virol 74:15–22

4. Geering A, Hull R (2012) Caulimoviridae. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy, Ninth report of the International Committee on Taxonomy of Viruses. Elsevier, Amsterdam, pp. 429–443

5. Gollifer D, Jackson G, Dabek A, Plumb R, May Y (1977) The occurrence and transmission of viruses of edible aroids in the Solomon Islands and the Southwest Pacific. Int. J. Pest Manag. 23:171–177

6. Haible D, Kober S, Jeske H (2006) Rolling circle amplification revolutionises diagnosis and genomics of geminiviruses. J Virol Meth 135:9–16

7. Higgins C, Bejerman N, Li M, James A, Dietzgen R, Pearson M, Revill P, Harding R (2016) Complete genome sequence of Colocasia bobone disease-associated virus, a putative cytorhabdovirus infecting taro. Arch Virol 161:745–748

8. Huang Q, Hartung J (2001) Cloning and sequence analysis of an infectious clone of Citrus yellow mosaic virus that can infect sweet orange via Agrobacterium-mediated inoculation. J Gen Virol 82:2549–2558

9. Jacquot E, Hagen L, Jacquemond M, Yot P (1996) The open reading frame 2 product of Cacao swollen shoot badnavirus is a nucleic acid-binding protein. Virology 225:191–195

10. Jacquot E, Hagen L, Michler P, Rohfritsch O, Stussi-Garaud C, Keller M, Jacquemond M, Yot P (1999) In situ localisation of Cacao swollen shoot virus in agroinfected Theobroma cacao. Arch Virol 144:259–271

11. James A, Geijskes R, Dale J, Harding R (2011) Development of a novel rolling-circle amplification technique to detect Banana streak virus that also discriminates between integrated and episomal virus sequences. Plant Dis 95:57–62

12. James M, Kenten R, Woods R (1973) Virus-like particles associated with two diseases of Colocasia esculenta (L.) Schott in the Solomon Islands. J Gen Virol 21:145–153


13. Kazmi SA, Yang Z, Hong N, Wang G, Wang Y (2015) Characterization by small RNA sequencing of taro bacilliform CH virus (TaBCHV), a novel badnavirus. PLoS One 10:e0134147

14. Macanawai A, Ebenebe A, Hunter D, Devitt L, Hafner G, Harding R (2005) Investigations into the seed and mealybug transmission of Taro bacilliform virus. Aust Plant Pathol 34:73–76

15. Medberry S, Lockhart B, Olszewski N (1990) Properties of Commelina yellow mottle virus’s complete DNA sequence, genomic discontinuities and transcript suggest that it is a pararetrovirus. Nucleic Acids Res 18: 5505–5513

16. Ming SFY, Ping GW, Ping LW, Xing WX, Ni H (2013) Molecular identification and specific detection of badnavirus from taro grown in China. Acta Phytopathol Sinica 6:590–595

17. Revill P, Jackson G, Hafner G, Yang I, Maino M, Dowling M, Devitt L, Dale J, Harding R (2005) Incidence and distribution of viruses of taro (Colocasia esculenta) in Pacific Island countries. Aust Plant Pathol 35:327–331

18. Sainsbury F, Thuenemann C, Lomonossoff P (2009) pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnol J 7:682–693

19. Sukal A, Kidanemariam D, Dale J, James A, Harding R (2017) Characterisation of badnaviruses infecting Dioscorea spp. in the Pacific reveals two putative novel species and the first report of dioscorea bacilliform RT virus 2. Virus Res 238:29–34

20. Wang Y, Hu J, Borth WB, Hamim I, Green JO, Melzer M (2017) First report of taro bacilliform CH virus (TaBCHV) on taro (Colocasia esculenta) in Hawaii, USA. Plant Dis 101:1334

21. Wu C, Lai Y, Lin N, Hsu Y, Tsai H, Liao J, Hu C (2008) A simplified method of constructing infectious clones of begomovirus employing limited restriction enzyme digestion of products of rolling circle amplification. J Virol Meth 147:355–359

22. Yang I, Hafner G, Dale J, Harding R (2003a) Genomic characterisation of taro bacilliform virus. Arch Virol 148:937–949

23. Yang I, Hafner G, Revill P, Dale J, Harding R (2003b) Sequence diversity of South Pacific isolates of Taro bacilliform virus and the development of a PCR-based diagnostic test. Arch Virol 148:1957–1968


Chapter 6

Characterization of a subgroup IB isolate of Cucumber mosaic virus from Xanthosoma sp. in sub-Saharan Africa

Dawit B. Kidanemariam1,2, Amit C. Sukal1, Adane D. Abraham3, Joyce N. Njuguna4, Benard O. Mware5, Francesca Stomeo4, James L. Dale1, Anthony P. James1, Robert M. Harding1*

1 Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, 4001, Australia 2 National Agricultural Biotechnology Research Center, Ethiopian Institute of Agricultural Research, P.O. Box 2003, Addis Ababa, Ethiopia 3 Department of Biotechnology, Addis Ababa Science and Technology University, P.O. Box 16417, Addis Ababa, Ethiopia 4 Biosciences eastern and central Africa–International Livestock Research Institute (BecA–ILRI) Hub, P.O. Box 30709, Nairobi, Kenya 5 International Institute of Tropical Agriculture (IITA), Nairobi, Kenya

[Formatted for submission to Australasian Plant Disease Notes]


Statement of Contribution of Co-Authors of Thesis by Publication Paper

The authors listed below have certified that:
1. They meet the criteria for authorship in that they have participated in the conception, execution, or interpretation, of at least that part of the publication in their field of expertise;
2. They take public responsibility for their part of the publication, except for the responsible author who accepts overall responsibility for the publication;
3. There are no other authors of the publication according to these criteria;
4. Potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor or publisher of journals or other publications, and (c) the head of the responsible academic unit, and
5. They agree to the use of the publication in the student’s thesis and its publication on the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter: Characterization of a subgroup IB isolate of Cucumber mosaic virus from Xanthosoma sp. in sub-Saharan Africa

QUT Verified Signatures

RSC, Level 4, 88 Musk Ave, Kelvin Grove Qld 4059 Page 1 of 2 Current @ 20/09/2016 CRICOS No. 00213J


Abstract

A cucumber mosaic virus (CMV) isolate infecting Xanthosoma sp., designated CMV-Xa, was identified in Uganda. The complete genome sequence of CMV-Xa was determined, and the genome organization of RNAs 1 and 3 was consistent with that of previously characterized CMV isolates. However, in addition to ORFs 2a and 2b, RNA 2 contained a putative third, non-AUG-initiated ORF, referred to as ORF 2c. Sequence analyses based on the three genomic RNAs showed that CMV-Xa belongs to subgroup IB. This is the first report of CMV infecting Xanthosoma sp. and the first CMV isolate from subgroup IB detected in sub-Saharan Africa.

Keywords: Cucumovirus, tannia, Araceae

Cucumber mosaic virus (CMV) is the type species of the genus Cucumovirus (family Bromoviridae) and has a wide host range, infecting more than 1000 crop and non-crop plant species (Jacquemond 2012). The genome of CMV comprises three molecules of positive-sense single-stranded RNA (Bujarski et al. 2012). RNA 1 has one open reading frame (ORF) encoding a single protein (1a) which is crucial for replication (Jacquemond 2012; Nouri et al. 2014). RNA 2 possesses two ORFs (2a and 2b), each encoding a single protein (Ding et al. 1994). The 2a protein is involved in replication by interacting with the 1a protein (Jacquemond 2012), while 2b has a role in post-transcriptional gene silencing and symptom expression (Du et al. 2007). RNA 3 contains two ORFs (3a and 3b) which encode the movement protein (MP) and coat protein (CP), respectively (Roossinck et al. 1999).

Based on serology, nucleic acid hybridization, RFLP analyses and nucleotide sequence comparisons, CMV isolates have been classified into two subgroups, designated I and II (Nouri et al. 2014), with 69–77% nucleotide (nt) identity between the two subgroups (Chen et al. 2007; Nouri et al. 2014). Subgroup I has been further divided into subgroups IA and IB based on differences in pathogenicity and sequence variation within the CP-coding region/3' UTR of RNA 3. Isolates within the same subgroup have a sequence identity of greater than 90% at the nt level (Chen et al. 2007; Nouri et al. 2014; Roossinck 2002). In 2009, a third subgroup (III) was proposed (Liu et al. 2009) after the discovery of a new isolate, CMV-BX, which was phylogenetically distinct from subgroup I and II isolates and showed 71–89% nt identity to previously published CMV isolates. More recently, a CMV isolate (CMV-Rom) was reported (Tepfer et al. 2016) which showed 66–77% nt identity to previously classified CMV isolates and was phylogenetically distinct from subgroup I, II and III isolates. Although CMV subgroup I and II isolates generally have worldwide distributions (Gallitelli 2000; Roossinck 2002; Eiras et al. 2004; Lin et al. 2004; Sclavounos et al. 2006), no isolates from subgroup IB have been reported from sub-Saharan Africa.

Taro (Colocasia esculenta L.) and tannia (Xanthosoma sp.) are both members of the family Araceae and are among the most important root crops for many small-scale farmers in sub-Saharan Africa. However, production is constrained by a range of biotic and abiotic factors (Akwee et al. 2015). In 2015, we surveyed taro and other edible aroids in East Africa (Ethiopia, Kenya, Tanzania and Uganda) in order to identify and characterize any viruses present. During these surveys, three tannia plants showing mosaic, mottling and vein chlorosis symptoms (Fig. 1a–c) were observed in Buikwe district, Uganda. Leaf samples were taken from the three plants (samples Ug90, Ug91 and Ug92), desiccated over silica and transported to the BecA–ILRI Hub laboratory in Nairobi, Kenya. The samples tested negative for potyviruses, which typically cause mosaic and mottling symptoms in aroids, by PCR using degenerate primers targeting the coat protein (CP)-coding region (∼700 bp) (Yamamoto and Fuji 2008) and the cylindrical inclusion (CI)-coding region (∼700 bp) (Ha et al. 2008).

To identify other possible viruses infecting the samples, total RNA was extracted (Valderrama-Cháirez et al. 2002) from the three tannia samples and subjected to Illumina MiSeq next generation sequencing (NGS). cDNA libraries were prepared using the Illumina® TruSeq Stranded Total RNA LT Sample Prep Kit with Ribo-Zero™ Plant, according to the manufacturer’s instructions (Illumina). A final concentration of 12 pmol of pooled cDNA library was sequenced using a 600-cycle MiSeq v3 Reagent cartridge (Illumina), and paired-end reads were generated on the Illumina® MiSeq platform.


Figure 1. Symptoms associated with CMV-Xa. The three tannia plants collected from Uganda, showing mosaic, mottling and vein chlorosis symptoms: (a) Ug90, (b) Ug91, and (c) Ug92.


The total number of raw reads generated for each sample ranged between 2,893,680 and 3,629,228 (Table 1). Adapter sequences were removed (http://hannonlab.cshl.edu/fastx_toolkit/) and reads were further trimmed to attain optimum quality using the DynamicTrim function of SolexaQA++ v.3.1.3 (Cox et al. 2010). De novo assembly of reads from each sample was performed using Trinity v.2.0.3 (Grabherr et al. 2011). Contigs from the de novo assemblies were used to BLAST an NCBI-derived virus database (ftp://ftp.ncbi.nih.gov/genomes/Viruses/), with CMV identified in all three samples. No other virus sequences were identified.

To validate the presence of CMV in these samples, RT-PCR was carried out using primers CMV-CPF/CPR (Wang et al. 2014), which amplify a ∼780 bp fragment spanning the CP-coding region and 3' UTR of CMV RNA 3. Amplicons of the expected size were obtained from all three samples and were subsequently cloned into pGEM®-T Easy (Promega) and sequenced using the Big Dye® Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific). The amplicons from all three samples comprised 735 nt (not including the primer binding sites) and the sequences were identical to the corresponding NGS-generated sequence for each sample.

The complete genome sequences of the three CMV isolates were assembled using the NGS data based on comparisons to a CMV reference sequence from the NCBI database (accession numbers NC_002034, NC_002035 and NC_001440 for RNA 1, 2 and 3, respectively), and ORFs were predicted and annotated using CLC Genomics Workbench v.7.5.1 (https://www.qiagenbioinformatics.com/) with default parameters. The total number of reads which mapped to the reference sequences ranged from 301,926 to 1,228,667 (Table 1). Pairwise comparison of the respective NGS-generated RNA 1 to 3 genome sequences from the three tannia samples showed nucleotide sequence identities ranging from 99.5 to 99.8% (sequences were deposited in GenBank under accession numbers MG021454–MG021462). Since there were no significant sequence differences between the three samples, further analyses were done using only one representative sample (Ug92).
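The pairwise comparison described above can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length and that alignment columns containing a gap are excluded from the comparison; the thesis does not state how gaps were treated, and the function names and the toy alignment below are hypothetical, not the CMV-Xa data.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


def pairwise_identities(alignment: dict) -> dict:
    """Percent identity for every unordered pair of sequences in an alignment."""
    return {
        (p, q): percent_identity(alignment[p], alignment[q])
        for p, q in combinations(alignment, 2)
    }


# Hypothetical toy alignment (not real CMV sequence data).
toy_alignment = {
    "isolate_A": "AUGGCUAAAC-GUUCA",
    "isolate_B": "AUGGCUAAACCGUUCA",
    "isolate_C": "AUGGCUGAAC-GUUCA",
}

if __name__ == "__main__":
    for pair, identity in pairwise_identities(toy_alignment).items():
        print(pair, f"{identity:.1f}%")
```

Applied to each genomic RNA separately, a calculation of this kind yields one identity value per pair of samples, which is the form in which the 99.5–99.8% range is reported.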


Table 1. Next generation sequencing data from Xanthosoma sp. samples collected from Uganda.

Sample ID | Number of raw reads obtained | Number of reads after trimming | CMV RNA | Reference sequence used for mapping | Number of reads mapped to reference sequence | Percentage of per-base coverage to the reference sequence | Length of consensus sequence (nt) | Final sequence length (nt) | NCBI accession number
Ug90 | 3,452,634 | 3,294,042 | RNA 1 | NC_002034 | 696,758 | 18.1 | 3357 | 3349 | MG021454
Ug90 | 3,452,634 | 3,294,042 | RNA 2 | NC_002035 | 607,477 | 14.1 | 3050 | 3052 | MG021455
Ug90 | 3,452,634 | 3,294,042 | RNA 3 | NC_001440 | 1,228,667 | 33.9 | 2216 | 2212 | MG021456
Ug91 | 3,629,228 | 3,108,688 | RNA 1 | NC_002034 | 587,925 | 17.07 | 3357 | 3349 | MG021457
Ug91 | 3,629,228 | 3,108,688 | RNA 2 | NC_002035 | 413,815 | 11.97 | 3050 | 3049 | MG021458
Ug91 | 3,629,228 | 3,108,688 | RNA 3 | NC_001440 | 604,810 | 17.85 | 2216 | 2212 | MG021459
Ug92 | 2,893,680 | 2,586,760 | RNA 1 | NC_002034 | 375,107 | 12.65 | 3357 | 3349 | MG021460
Ug92 | 2,893,680 | 2,586,760 | RNA 2 | NC_002035 | 301,926 | 10.19 | 3050 | 3050 | MG021461
Ug92 | 2,893,680 | 2,586,760 | RNA 3 | NC_001440 | 638,999 | 23.63 | 2216 | 2212 | MG021462


The genome organization of RNAs 1 and 3 of isolate Ug92 (designated CMV-Xa) was typical of other CMV isolates. RNA 1 comprised 3,349 nt and contained a single ORF (1a) predicted to encode a protein of 992 amino acids, with 5' and 3' UTRs of 95 and 281 nt, respectively (Fig. 2). RNA 3 comprised 2,212 nt and contained two ORFs (3a and 3b) separated by an intergenic region of 271 nt. ORFs 3a and 3b comprised 840 nt and 687 nt, respectively, and were predicted to encode proteins of 279 and 228 amino acids, respectively. The 5' and 3' UTRs of RNA 3 were 112 and 302 nt, respectively (Fig. 2).

Similar to other CMV isolates, RNA 2 of CMV-Xa comprised 3,052 nt and contained two overlapping ORFs (2a and 2b). ORF 2a was 2,577 nt and encoded a putative protein of 858 amino acids, while ORF 2b was 339 nt and encoded a putative protein of 112 amino acids. The 5' and 3' UTRs of RNA 2 were 80 and 298 nt, respectively (Fig. 2). Interestingly, analysis of RNA 2 also revealed the presence of a putative third, UUG-initiated ORF (designated 2c) which was positioned within ORF 2a at the 5' end (Fig. 2). This ORF was located 58 nt downstream of, and out of frame with, the start codon of ORF 2a, and comprised 336 nt which encoded a putative protein of 112 amino acids. The presence of this putative ORF in the genome of CMV-Xa was confirmed by RT-PCR and sequencing. Further, ORF 2c was present in the NGS-derived RNA 2 sequences of Ug90 and Ug91.

An ORF equivalent to ORF 2c has not been previously reported in CMV. However, analysis of 44 full-length CMV RNA 2 sequences from the NCBI database revealed that 20 CMV isolates contained a similarly positioned ORF comprising between 306 and 381 nt. Of these, the ORF was initiated with AUG, CUG and UUG in nine, eight and three isolates, respectively (Table 2). Sequence analysis of RNA 2 of the cucumovirus Tomato aspermy virus (TAV; NCBI accession nos. NC003838, KT757537, D10663, KF432414, AJ320274) also revealed the presence of a third ORF on RNA 2, similarly positioned to ORF 2c of CMV-Xa. The third ORF in all five TAV isolates was AUG-initiated and varied between 318 and 321 nt in length.
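To illustrate how an ORF such as 2c can be missed when only AUG is accepted as a start codon, the following is a minimal sketch of a forward-strand ORF scan with a configurable set of start codons. It is not the CLC Genomics Workbench procedure used in this study; the function name, the `min_codons` threshold and the 21-nt toy sequence are hypothetical and chosen only so that a UUG-initiated ORF sits inside, and out of frame with, an AUG-initiated ORF.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, start_codons=("AUG",), min_codons=2):
    """Find ORFs on the forward strand in all three reading frames.

    Returns sorted (start, end, frame) tuples with 0-based start, exclusive
    end (stop codon included) and frame in {0, 1, 2}. For each stop codon,
    only the most upstream in-frame start codon is reported.
    """
    rna = rna.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon in start_codons:
                start = i  # open a new ORF at the first start codon seen
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)


# Hypothetical toy sequence: frame 0 reads AUG CUU GGC ACG AUA ACC UAA,
# while frame 1 contains UUG GCA CGA UAA nested inside it.
toy_rna2 = "AUGCUUGGCACGAUAACCUAA"

if __name__ == "__main__":
    print(find_orfs(toy_rna2))                                     # AUG only
    print(find_orfs(toy_rna2, start_codons=("AUG", "CUG", "UUG")))  # plus non-AUG
```

With AUG alone, only the outer ORF is found; allowing CUG and UUG also reports the nested frame-1 ORF. The same logic explains why an offset of 58 nt from the ORF 2a start codon places ORF 2c out of frame, since 58 is not a multiple of three.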


[Figure 2 image: scale in nucleotides; RNA 1 with ORF 1a; RNA 2 with ORFs 2a, 2b and 2c; RNA 3 with ORFs 3a and 3b.]

Figure 2. Schematic representation of the genome organisation of CMV-Xa. ORFs predicted on RNA 1, 2 and 3 are represented by boxes.


Table 2. Name, subgroup, country of origin and accession numbers of CMV sequences from NCBI database used in the analysis.

Name | Country¹ | RNA 1 | RNA 2 | RNA 3 | ORF 2c start codon
BX | China | DQ399548 | DQ399549 | DQ399550 | -
Ca | China | AY429434 | AY429433* | AY429432 | UUG
Cah1 | China | FJ268744 | FJ268745* | FJ268746 | AUG
Cb7 | China | EF216866 | DQ785470* | EF216867 | AUG
CM95 | Japan | AB188234 | AB188235 | AB188236 | -
CS | China | AY429435 | AY429436* | AY429437 | UUG
CTL | China | EF213023 | EF213024* | EF213025 | AUG
D8 | Japan | AB179764 | AB179765 | AB004781 | -
Fny | USA | NC002034 | NC002035 | NC001440 | -
GTN | South Korea | KP033524 | KP033525* | KP033526 | AUG
HM3 | Egypt | KT921314 | KT921315* | KX014666 | UUG
IA | Indonesia | AB042292 | AB042293 | AB042294 | -
Ixora | (USA) | U20220 | U20218* | U20219 | CUG
KO | India | KM272277 | KM272278* | KM272275 | CUG
Li | South Korea | AB506795 | AB506796* | AB506797 | CUG
Ls | USA | AF416899 | AF416900 | AF127976 | -
Ly | Australia | AF198101 | AF198102 | AF198103 | -
MB | Sri Lanka | - | AF150731* | - | CUG
Mf | South Korea | AJ276479 | AJ276480 | AJ276481 | -
Mi | Japan | AB188228 | AB188229 | AB188230 | -
New Delhi | India | GU111227 | GU111228* | GU111229 | CUG
NS | Hungary | AJ580953 | AJ511989 | AJ511990 | -
Nt9 | Taiwan | D28778 | D28779* | D28780 | CUG
Pepo | (Japan) | AB124834 | AB124835 | AF103991 | -
PF | Japan | AB368499 | AB368500* | AB368501 | AUG
PI1 | Spain | AM183114 | AM183115* | AM183116 | CUG
Phy | China | DQ402477 | DQ412731 | DQ412732 | -
PHz | China | EU723568 | EU723570 | EU723569 | -
PSV | (USA) | NC002038 | NC002039 | NC002040 | -
Q | Australia | X02733 | X00985 | M21464 | -
R | France | HE793685 | HE793686 | Y18138 | -
Rb | South Korea | GU327363 | GU327364 | GU327365 | -
Rom | France | KU558987 | KU558988 | KU558989 | -
RP19 | South Korea | KC527793 | KC527703* | KC527748 | AUG
SD | (China) | AF071551 | D86330 | AB008777 | -
SFQT1-2 | China | HQ283392 | HQ283391* | HQ283393 | AUG
SW11 | Australia | KM434204 | KM434205 | KM434206 | -
TAV | (USA) | NC003837 | NC003838 | NC003836 | -
Tfn | Italy | Y16924 | Y16925* | Y16926 | CUG
TN | Japan | AB176849 | AB176848 | AB176847 | -
Vir | Italy | HE962478 | HE962479* | HE962480 | AUG
Y | Japan | D12537 | D12538 | D12499 | -
Z1 | South Korea | GU327366 | GU327367 | GU327368 | -
209 | China | KJ400002 | KJ400003* | KJ400004 | AUG

¹ Where known, the country of origin for the isolates is indicated. Where the country of origin is not known, the country where the sequence data was uploaded to the NCBI database is indicated in brackets.
* RNA 2 sequences with the putative ORF 2c.


Attempts to identify a possible function for the putative ORF 2c gene product of CMV-Xa by database comparisons did not reveal any significant similarity to known viral proteins. As such, further studies will be required to determine whether this ORF is functional in CMV and TAV.

BLASTn analysis of CMV RNA 1 sequences revealed that CMV-Xa had highest sequence identity (94%) to a tomato-infecting CMV isolate (HM3) from Egypt. Similarly, RNA 2 showed highest identity (93%) to capsicum-infecting CMV isolates from Italy and India (Vir and KO) and CMV-HM3 from Egypt, while RNA 3 showed 97% identity to CMV-HM3. The complete nucleotide sequences of CMV-Xa RNAs 1, 2 and 3, together with published CMV sequences, were separately aligned using the ClustalW multiple-alignment algorithm in BioEdit version 7 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Phylogenetic trees were subsequently constructed in MEGA version 7 (http://www.megasoftware.net/mega.php) using the maximum-likelihood method and the Kimura 2-parameter model with 1000 bootstrap replications. For all three genomic RNAs, clades corresponding to the previously described subgroups I (including IA and IB), II and III were observed (Fig. 3a–c). Further, phylogenetic analyses revealed that CMV-Xa clusters with subgroup IB CMV isolates and is most closely related to CMV-Vir, -KO and -HM3. Interestingly, of the 20 RNA 2 sequences which possess the putative ORF 2c, 17 clustered within subgroup IB together with CMV-Xa (Fig. 3b). The other three isolates grouped under subgroup IA (isolates PF and Li) or branched independently of other isolates (isolate ‘209’; Fig. 3b).

To determine whether CMV-Xa is mechanically transmissible, Nicotiana benthamiana plants were inoculated using sap extracts prepared from sample Ug92. Approximately 200 mg of CMV-Xa-infected leaf tissue was ground in 1 ml of 0.1 M sodium phosphate buffer (pH 7) with 10 mg of carborundum powder and the sap was gently rubbed onto fully expanded leaves of eight-week-old N. benthamiana plants. Five weeks post-inoculation, newly emerging leaves developed mosaic-like symptoms and tested positive for CMV by RT-PCR using primers CMV-CPF/CPR as described previously.
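As background to the Kimura 2-parameter model named in the phylogenetic analysis above, the sketch below computes the closed-form K2P distance between two aligned sequences, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and a transversion, respectively. This only illustrates the substitution model; it is not the maximum-likelihood tree search performed in MEGA, and the function name, the handling of gaps and the toy sequences are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Assumption: columns with gaps or ambiguous bases are ignored.
    """
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)


# Hypothetical toy pair: 20 sites, two transitions and one transversion.
seq_1 = "AAAAAAAAAACCCCCCCCCC"
seq_2 = "GGAAAAAAAACCCCCCCCCA"

if __name__ == "__main__":
    print(f"K2P distance: {k2p_distance(seq_1, seq_2):.4f}")
```

The corrected distance is always at least as large as the raw proportion of differing sites, because the model accounts for multiple substitutions at the same site and weights transitions and transversions separately.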

[Figure 3 image: maximum-likelihood trees for (a) RNA 1, (b) RNA 2 and (c) RNA 3. Tips are labelled with the isolate names listed in Table 2, and clades are labelled as subgroups IA, IB, II and III and the outgroup. Scale bars: (a) 0.05, (b) 0.1, (c) 0.05.]

Figure 3. Phylogenetic analysis of CMV-Xa based on complete nucleotide sequences. (a) RNA 1, (b) RNA 2, and (c) RNA 3. Asterisks indicate isolates having the putative ORF 2c on RNA 2. All trees were rooted using tomato aspermy virus (TAV) and peanut stunt virus (PSV) as outgroups. Bootstrap values greater than 50% are shown, and the scale bar indicates substitutions per site. Details of the isolates included in the phylogenetic analysis are given in Table 2.


To our knowledge, this is the first report of a complete genome sequence of a subgroup IB CMV isolate from sub-Saharan Africa and is also the first report of CMV infecting Xanthosoma sp. The only previously published sequence record of CMV from a member of the Araceae is a partial CP-coding sequence from a Chinese isolate infecting taro (Wang et al. 2014). Although CMV has also been detected in Anthurium andraeanum in Brazil using ELISA and PCR, no sequence information was reported (Miura et al. 2013).

Acknowledgements

This project was funded by the Biosciences eastern and central Africa–International Livestock Research Institute (BecA–ILRI) Hub through the African Biosciences Challenge Fund (ABCF). DK is the recipient of an Australia Awards Scholarship.

Data Availability

Sequences described in this paper are available under GenBank accession numbers MG021454 - MG021462.

Compliance with ethical standards

The authors declare no conflict of interest. This article does not contain any work conducted on animal or human participants.


References

Akwee, P. E., Netondo, G., Kataka, J. A., & Palapala, V. A. (2015). A critical review of the role of taro Colocasia esculenta L. (Schott) to food security: A comparative analysis of Kenya and Pacific Island taro germplasm. Scientia Agriculturae, 9, 101–108.
Bujarski, J., Figlerowicz, M., Gallitelli, D., Roossinck, M. J., & Scott, S. W. (2012). Bromoviridae. In: King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ (eds) Virus taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. London: Elsevier.
Chen, Y., Chen, J., Zhang, H., Tang, X., & Du, Z. (2007). Molecular evidence and sequence analysis of a natural reassortant between Cucumber mosaic virus subgroup IA and II strains. Virus Genes, 35, 405–413.
Cox, M. P., Peterson, D. A., & Biggs, P. J. (2010). SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics, 11, 485.
Ding, S. W., Anderson, B. J., Haase, H. R., & Symons, R. H. (1994). New overlapping gene encoded by the cucumber mosaic virus genome. Virology, 198, 593–601.
Du, Z. Y., Chen, F. F., Liao, Q. S., Zhang, H. R., Chen, Y. F., & Chen, J. S. (2007). 2b ORFs encoded by subgroup IB strains of cucumber mosaic virus induce differential virulence on Nicotiana species. Journal of General Virology, 88, 2596–2604.
Eiras, M., Boari, A. J., Colariccio, A., Chaves, A. L. R., Briones, M. R. S., Figueira, A. R., & Harakava, R. (2004). Characterization of isolates of the Cucumovirus Cucumber mosaic virus present in Brazil. Journal of Plant Pathology, 86, 61–69.
Gallitelli, D. (2000). The ecology of Cucumber mosaic virus and sustainable agriculture. Virus Research, 71, 9–21.
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., & Regev, A. (2011). Full-length transcriptome assembly from RNA-seq data without a reference genome. Nature Biotechnology, 29, 644–652.
Ha, C., Coombs, S., Revill, P., Harding, R. M., Vu, M., & Dale, J. L. (2008). Design and application of two novel degenerate primer pairs for the detection and complete genomic characterization of potyviruses. Archives of Virology, 153, 25–36.
Jacquemond, M. (2012). Cucumber mosaic virus. In: Maramorosch M, Shatkin AJ, Murphy FA (eds). Advances in Virus Research, 84, 439–504.
Lin, H. X., Rubio, L., Smythe, A. B., & Falk, B. W. (2004). Molecular population genetics of Cucumber mosaic virus in California: evidence for founder effects and reassortment. Journal of Virology, 78, 6666–6675.
Liu, Y. Y., Yu, S. L., Lan, Y. F., Zhang, C. L., Hou, S. S., Li, X. D., Li, X. D., Zhang, G. M., & Shi, C. K. (2009). Molecular variability of five cucumber mosaic virus isolates from China. Acta Virologica, 53, 89–97.
Miura, N. S., Beriam, L. O., & Rivas, E. B. (2013). Detection of cucumber mosaic virus in commercial Anthurium crops and genotypes evaluation. Horticultura Brasileira, 31, 322–327.
Nouri, S., Arevalo, R., Falk, B. W., & Groves, R. L. (2014). Genetic structure and molecular variability of cucumber mosaic virus isolates in the United States. PLoS One, 9, e96582.
Roossinck, M. J., Zhang, L., & Hellwald, K. H. (1999). Rearrangements in the 5' nontranslated region and phylogenetic analyses of cucumber mosaic virus RNA 3 indicate radial evolution of three subgroups. Journal of Virology, 73, 6752–6758.
Roossinck, M. J. (2002). Evolutionary history of cucumber mosaic virus deduced by phylogenetic analyses. Journal of Virology, 76, 3382–3387.
Sclavounos, A. P., Voloudakis, A. E., Arabatzis, C., & Kyriakopoulou, P. E. (2006). A severe hellenic CMV tomato isolate: symptom variability in tobacco, characterization and discrimination of variants. European Journal of Plant Pathology, 115, 163–172.
Tepfer, M., Girardot, G., Fénéant, L., Tamarzizt, H. B., Verdin, E., Moury, B., & Jacquemond, M. (2016). A genetically novel, narrow-host-range isolate of cucumber mosaic virus (CMV) from rosemary. Archives of Virology, 161, 2013–2017.
Valderrama-Cháirez, M. L., Cruz-Hernández, A., & Paredes-López, O. (2002). Isolation of functional RNA from cactus fruit. Plant Molecular Biology Reporter, 20, 279–286.
Wang, Y. F., Wang, G. P., Wang, L. P., & Hong, N. (2014). First report of cucumber mosaic virus in taro plants in China. Plant Disease, 98, 574–574.
Yamamoto, H., & Fuji, S. (2008). Rapid determination of the nucleotide sequences of potyviral coat protein genes using semi-nested RT-PCR with universal primers. Journal of General Plant Pathology, 74, 97–100.


Chapter 7

Incidence and distribution of four RNA viruses infecting taro and tannia in East Africa and molecular characterisation of Dasheen mosaic virus isolates

D. B. Kidanemariam1,2, A. C. Sukal1, A. D. Abraham3, J. N. Njuguna4, F. Stomeo4, J. L. Dale1, A. P. James1, R. M. Harding1*

1 Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, 4001, Australia
2 National Agricultural Biotechnology Research Centre, Ethiopian Institute of Agricultural Research, P.O. Box 2003, Addis Ababa, Ethiopia
3 Department of Biotechnology, Addis Ababa Science and Technology University, P.O. Box 16417, Addis Ababa, Ethiopia
4 Biosciences eastern and central Africa–International Livestock Research Institute (BecA–ILRI) Hub, P.O. Box 30709, Nairobi, Kenya

[Formatted for submission to Annals of Applied Biology]


Statement of Contribution of Co-Authors of Thesis by Publication Paper

The authors listed below have certified that:
1. They meet the criteria for authorship in that they have participated in the conception, execution, or interpretation, of at least that part of the publication in their field of expertise;
2. They take public responsibility for their part of the publication, except for the responsible author who accepts overall responsibility for the publication;
3. There are no other authors of the publication according to these criteria;
4. Potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor or publisher of journals or other publications, and (c) the head of the responsible academic unit; and
5. They agree to the use of the publication in the student’s thesis and its publication on the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter: Incidence and distribution of four RNA viruses infecting taro and tannia in East Africa and molecular characterisation of Dasheen mosaic virus isolates


Abstract

Taro and tannia are important food crops in many districts of East Africa. To investigate the incidence and distribution of four RNA viruses known to infect these plants, 392 leaf samples were collected from taro or tannia plants growing in 25 districts in Ethiopia, Kenya, Tanzania and Uganda. The samples were tested for Cucumber mosaic virus (CMV), Dasheen mosaic virus (DsMV), Taro vein chlorosis virus (TaVCV) and Colocasia bobone disease-associated virus (CBDaV) by RT-PCR. No samples tested positive for TaVCV or CBDaV, while CMV was only detected in three tannia samples with mosaic symptoms from Uganda. DsMV was detected in 40 samples, including 36 out of 171 from Ethiopia, 1 out of 94 from Uganda and 3 out of 41 from Tanzania, while no samples from Kenya tested positive. The complete genomes of nine DsMV isolates from East Africa were cloned and sequenced. Phylogenetic analyses based on the amino acid sequence of the CP-coding region revealed two distinct clades, which is consistent with previous reports. Interestingly, samples from Ethiopia were distributed across several subgroups in both clades, while samples from Uganda and Tanzania belong to different clades.

Keywords: Ethiopia, Kenya, Tanzania, Uganda, cucumber mosaic virus, rhabdoviruses, aroids


Introduction

The aroids, taro (Colocasia esculenta) and tannia (Xanthosoma sp.), are the most important and widely cultivated edible members of the Araceae family in sub-Saharan Africa (Ndabikunze et al., 2011). In Ethiopia, Kenya, Tanzania and Uganda, taro and tannia are mainly cultivated by small-holder farmers and play important cultural, economic and nutritional roles (Onwueme and Charles, 1994; Talwana et al., 2009; Tumuhimbise et al., 2009; Beyene, 2013). However, due to various biotic and abiotic factors the yields from taro and tannia in East Africa are much lower than the world’s average production (Tumuhimbise et al., 2009; Talwana et al., 2009; Akwee et al., 2015). Viruses are among the most economically important pathogens of these crops, resulting in significant yield losses, with a number of viruses reported from different parts of the world (Elliott et al., 1997; Revill et al., 2005a).

The potyvirus, Dasheen mosaic virus (DsMV, family Potyviridae, genus Potyvirus) infects taro and other edible aroids wherever they grow (Zettler et al., 1970; Elliott et al., 1997). DsMV is transmitted in a non-persistent manner by several aphid species and can also be transmitted by vegetative propagation or sap inoculation (Elliott et al., 1997; Nelson, 2008). The virus has a worldwide distribution and infects both edible and ornamental members of the Araceae family (Elliott et al., 1997). Infection typically results in a characteristic feathery-mottle and mosaic symptom on the leaves, but symptoms may vary considerably between cultivars and season of the year (Alconero and Zettler, 1971; Elliott et al., 1997). DsMV infection is reported to affect both the quality and quantity of the edible corms, with production losses ranging from 20 to 60 % (Rana et al., 1983; Elliott et al., 1997).

Taro vein chlorosis virus (TaVCV) is a member of the family Rhabdoviridae, genus Nucleorhabdovirus (Revill et al., 2005b). Typical symptoms associated with TaVCV infection include a distinct vein chlorosis near the leaf margins of infected plants (Pearson et al., 1999; Revill et al., 2005b). TaVCV has been reported from several South Pacific island countries, as well as Hawaii and American Samoa (Long et al., 2014; Atibalentja et al., 2017). To date, TaVCV is only known to infect taro, but there is no published information on production losses resulting from infection (Revill et al., 2005b). Colocasia bobone disease-associated virus (CBDaV) is a putative member of the family Rhabdoviridae based on sequence analysis and the presence of characteristic, enveloped, bullet-shaped particles of ∼300 x 50 nm in infected plants (Higgins et al., 2016; Pearson et al., 1999). CBDaV has only been reported from Papua New Guinea and the Solomon Islands, where it has been associated with the severe diseases bobone and alomae (Gollifer et al., 1977; Revill et al., 2005a). Bobone disease is thought to be caused by CBDaV alone and is characterised by stunting and gall formation on the pseudostem (Gollifer et al., 1977; Pearson et al., 1999; Revill et al., 2005a; Higgins et al., 2016), whereas alomae is a lethal disease caused by the dual infection of taro with CBDaV and taro bacilliform virus (TaBV).

A number of other viruses have also been reported from aroids worldwide. Taro reovirus (TaRV), a putative member of the genus Oryzavirus in the family Reoviridae, has been partially characterised based on sequence analysis of four incomplete genomic segments of an isolate from PNG (Revill et al., 2005a, b). However, no symptoms have been associated with TaRV infection and the virus has only been detected in symptomless taro plants and plants infected with other viruses (Revill et al., 2005a). Konjac mosaic virus (KoMV, family Potyviridae, genus Potyvirus), Cucumber mosaic virus (CMV, family Bromoviridae, genus Cucumovirus), Groundnut bud necrosis virus (GBNV, family Bunyaviridae, genus Tospovirus) and Tomato zonate spot virus (TZSV, tentatively assigned in the genus Tospovirus) have also been identified from different aroids (Manikonda et al., 2011; Wang et al., 2014; Sivaprasad et al., 2011; Dong et al., 2008). Of the known viruses reported to infect edible and ornamental aroids, DsMV and TaBV are the most widespread (Elliott et al., 1997; Revill et al., 2005a).

We have recently reported the incidence, distribution and molecular characterisation of badnaviruses infecting taro and tannia in East Africa (Kidanemariam et al., 2018a), but there is no information on the incidence, distribution and diversity of RNA viruses. In this paper, we report the results of surveys carried out in 2014 and 2015 to determine the occurrence of four RNA viruses infecting taro and tannia in Ethiopia, Kenya, Tanzania and Uganda. The complete genome sequences and phylogenetic analyses of nine DsMV isolates from East Africa are also reported.

Materials and Methods

Sample collection and nucleic acid extraction

Between November 2014 and June 2015, symptomatic and asymptomatic leaf samples were collected from major growing areas in Ethiopia (171 samples: 160 taro and 11 tannia), Kenya (86 samples: 83 taro and three tannia), Tanzania (41 samples: 29 taro and 12 tannia) and Uganda (94 samples: 61 taro and 33 tannia). Leaf samples were desiccated over silica gel and transported to the BecA–ILRI Hub laboratory in Nairobi, Kenya, where RNA was extracted (Valderrama-Cháirez et al., 2002). Following initial screening for viruses at the BecA–ILRI Hub, selected extracts were transported to Queensland University of Technology (QUT), Brisbane, Australia for further analysis.
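The sampling figures above, together with the DsMV detection counts given in the abstract of this chapter, can be cross-checked with a short tally. The sketch below is illustrative only: the dictionary layout and function names are hypothetical, while the counts themselves are taken from the text.

```python
# Leaf samples collected per country and host (from the text above).
SAMPLES = {
    "Ethiopia": {"taro": 160, "tannia": 11},
    "Kenya": {"taro": 83, "tannia": 3},
    "Tanzania": {"taro": 29, "tannia": 12},
    "Uganda": {"taro": 61, "tannia": 33},
}

# DsMV-positive samples per country (from the chapter abstract).
DSMV_POSITIVE = {"Ethiopia": 36, "Kenya": 0, "Tanzania": 3, "Uganda": 1}


def totals_by_country(samples):
    """Total number of samples (all hosts) collected in each country."""
    return {country: sum(hosts.values()) for country, hosts in samples.items()}


def incidence(positives, tested):
    """Incidence as a percentage of tested samples."""
    if tested <= 0:
        raise ValueError("number of tested samples must be positive")
    return 100.0 * positives / tested


if __name__ == "__main__":
    totals = totals_by_country(SAMPLES)
    for country, n in totals.items():
        pct = incidence(DSMV_POSITIVE[country], n)
        print(f"{country}: {DSMV_POSITIVE[country]}/{n} DsMV-positive ({pct:.1f}%)")
    overall = incidence(sum(DSMV_POSITIVE.values()), sum(totals.values()))
    print(f"Overall: {overall:.1f}% of {sum(totals.values())} samples")
```

The per-country totals sum to the 392 samples reported for the survey, and the 40 DsMV-positive samples correspond to roughly one in ten of all samples tested.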

RT-PCR, cloning and sequencing Complementary DNA (cDNA) was synthesised using M-MuLV reverse transcriptase

(Thermo Fisher Scientific, UK) with oligo(dT)18 and random hexamers as per the manufacturer’s instructions. For the detection of potyviruses and rhabdoviruses, PCR was carried out using published degenerate primers, while virus-specific primers were used for the specific detection of DsMV, TaVCV, CBDaV and CMV (Table 1). All PCRs were carried out using 2 μl of cDNA mixed with 10 μl of OneTaq® 2x Master Mix and 5 ρmol of each primer in a total volume of 20 μl. PCR cycling conditions for CBDaV was, an initial denaturation of 94 °C for 2 min, followed by 35 cycles 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s, with a final extension step of 72 °C for 5 min. All other PCR assays used published cycling conditions (Table 1). A positive control samples were included for each experiment.

PCR products were electrophoresed through 1.5 % agarose gels and stained using GelRed™ (Biotium, USA). Amplicons from representative samples chosen for sequencing were gel-excised, purified using Freeze ‘N’ Squeeze™ DNA Gel Extraction Spin Columns (Bio-Rad, Australia), cloned into pGEM-T Easy (Promega, Australia) and sequenced using the Big Dye® Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, Australia) at the Central Analytical Research Facility (CARF), QUT, Brisbane, Australia. For each sample, three independent clones were sequenced with M13F and/or M13R primers.
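The text does not state how the three clone sequences obtained per sample were reconciled. A minimal sketch, assuming a column-wise majority rule over sequences that have already been aligned to equal length, is given below; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_clones, ambiguous="N"):
    """Column-wise majority consensus of equal-length aligned sequences.

    ASSUMPTION: a strict majority rule is used; positions without a
    strict majority are flagged with the `ambiguous` character.
    """
    if len({len(seq) for seq in aligned_clones}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_clones):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > len(column) / 2 else ambiguous)
    return "".join(consensus)


# Toy data: the second clone carries a single A-to-G difference.
clones = ["ATGACAAACC", "ATGACGAACC", "ATGACAAACC"]
consensus = majority_consensus(clones)
```

With three clones, a difference present in only one clone (for example a polymerase error) is outvoted by the other two, which is one reason for sequencing clones in triplicate.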


Table 1. Primers used for virus detection with RT-PCR.

| Virus | Primer name | Primer sequence (5'–3') | Expected size (bp) | Target region | Reference |
|---|---|---|---|---|---|
| Potyvirus | CI-F | GGIVVIGTIGGIWSIGGIAARTCIAC | ∼700 | Cylindrical inclusion body | Ha et al., 2008 |
| | CI-R | ACICCRTTYTCDATDATRTTIGTIGC | | | |
| DsMV | DsMV-3F | ATGACAAACCTGARCAGCGTGAYA | ∼680 | Coat protein | Maino et al., 2003 |
| | DsMV-3R | TTYGCAGTGTGCCTYTCAGGT | | | |
| CMV | CMV-F | ATGGACAAATCTGAATCAACC | ∼780 | Coat protein | Wang et al., 2014 |
| | CMV-R | TAAGCTGGATGGACAACCCGT | | | |
| Rhabdovirus | RhabF | GGATMTGGGGBCATCC | ∼900 | L-gene | Dietzgen et al., 2013 |
| | RhabR | GTCCABCCYTTTTGYC | | | |
| TaVCV | TaVCV-1 | AATATGCTCTCCAGTGTTCACCC | ∼1000 | L-gene | Revill et al., 2005b |
| | TaVCV-2 | AGGTGCTCAAATGACTCAGCTTGTCC | | | |
| CBDaV | CBDV-3 | CTCAAGACAATCAATGGGTGATG | ∼300 | L-gene | Ralf Dietzgen, pers. comm. |
| | CBDV-4 | CCACGACCGAGTAATTGAC | | | |
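Several primers in Table 1 are degenerate, meaning that each oligonucleotide is a pool of related sequences written with IUPAC ambiguity codes. The sketch below shows how the size of that pool can be counted and how a degenerate primer can be turned into a regular expression for in-silico matching. Treating inosine (I) as pairing with any base is a simplifying assumption, and the target string in the example is invented.

```python
import re

# IUPAC nucleotide ambiguity codes.
# ASSUMPTION: inosine ("I") is treated as matching any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides in the synthesised pool.

    Inosine is a single modified base, so it does not add to the count.
    """
    fold = 1
    for code in primer.upper():
        if code != "I":
            fold *= len(IUPAC[code])
    return fold


def primer_regex(primer):
    """Compile a regular expression matching any sequence the primer covers."""
    parts = []
    for code in primer.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


rhab_f = "GGATMTGGGGBCATCC"            # RhabF from Table 1
rhab_fold = degeneracy(rhab_f)
hit = primer_regex(rhab_f).search("TTGGATCTGGGGTCATCCAA")  # invented target
```

Here RhabF contains one two-fold position (M) and one three-fold position (B), giving a six-fold degenerate pool.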


Generating complete genome sequences of DsMV

Illumina next-generation sequencing (NGS) was carried out to generate the complete genome sequences of DsMV. cDNA libraries were prepared using the Illumina® TruSeq Stranded Total RNA LT Sample Prep Kit with Ribo-Zero™ Plant, according to the manufacturer’s instructions (Illumina, USA). A final concentration of 12 pmol of pooled cDNA library was sequenced using a 600-cycle MiSeq v3 reagent cartridge (Illumina, USA) and paired-end reads were generated on the Illumina® MiSeq platform at the BecA–ILRI Hub laboratory, Nairobi, Kenya. Subsequently, the NGS data for representative samples were validated by RT-PCR and Sanger sequencing of cloned DNA fragments, and the 5'-terminal sequences were obtained by rapid amplification of cDNA ends (RACE) using a 5'/3' RACE Kit, 2nd generation (Roche, Australia).

Sequence and phylogenetic analysis

Sanger-derived sequences were trimmed to remove primer-binding sites and analysed using CLC Main Workbench v6.9.2 (QIAGEN, USA) and Geneious v11.0.2 (Biomatters, New Zealand). For RNAseq data, adapter sequences were removed using fastx_clipper, and reads were further trimmed to attain optimum quality using the DynamicTrim function of SolexaQA++ v.3.1.3 (Cox et al., 2010) and the fastx_trimmer module of the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). De novo assembly of reads from each sample was performed using Trinity v.2.0.3 (Grabherr et al., 2011), and virus contigs were identified by command-line BLASTn analysis (Altschul et al., 1990) against a local virus database derived from NCBI (ftp://ftp.ncbi.nih.gov/genomes/Viruses/). Reads were subsequently mapped onto reference sequences using CLC Genomics Workbench v.7.5.1 (https://www.qiagenbioinformatics.com/) with default parameters. ORFs were predicted and annotated using CLC Genomics Workbench v.7.5.1, and sequences were designated ‘complete’ based on comparison with the reference sequence used for mapping.
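The step of identifying virus contigs from BLASTn results can be sketched as follows. The sketch assumes that BLASTn was run with the standard 12-column tabular output (`-outfmt 6`), which the text does not specify; the E-value and alignment-length thresholds, the contig names and the subject names are all hypothetical.

```python
import csv
import io

# Column order of the standard BLAST tabular format (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5, min_length=200):
    """Return the highest-bitscore hit per contig that passes both filters.

    ASSUMPTION: the thresholds are illustrative, not those of the study.
    """
    best = {}
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        hit = dict(row)
        hit["pident"] = float(row["pident"])
        hit["length"] = int(row["length"])
        hit["evalue"] = float(row["evalue"])
        hit["bitscore"] = float(row["bitscore"])
        if hit["evalue"] > max_evalue or hit["length"] < min_length:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Invented example: contig_2 has only a short, weak hit and is discarded.
example = io.StringIO(
    "contig_1\tvirus_A\t92.5\t950\t71\t0\t1\t950\t100\t1049\t0.0\t1400\n"
    "contig_1\tvirus_B\t80.1\t400\t80\t2\t10\t409\t5\t404\t1e-90\t350\n"
    "contig_2\tvirus_C\t75.0\t60\t15\t0\t1\t60\t1\t60\t2e-4\t55.4\n"
)
hits = best_hits(example)
```

Keeping only the top-scoring subject per contig gives a short list of candidate reference sequences for the subsequent read-mapping step.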

Processed Sanger and NGS data were compared to sequences on the NCBI database using BLAST algorithms available on the NCBI website (http://blast.ncbi.nlm.nih.gov/Blast.cgi). For DsMV sequences, the conserved core CP-coding region, excluding the heterogeneous N-terminal sequences, was further aligned and analysed using the ClustalW multiple alignment application in the BioEdit sequence alignment editor version 7.1.9 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Phylogenetic trees were constructed from ClustalW-aligned sequences with MEGA version 7.0 (http://www.megasoftware.net/mega.php), using the Maximum-Likelihood method and a Kimura 2-Parameter model with 1000 bootstrap replications. Pairwise sequence comparison (PASC) was carried out on aligned sequences using Geneious v11.0.2 (Biomatters, New Zealand).
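For readers unfamiliar with the Kimura 2-Parameter model named above, its pairwise distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The sketch below implements only this distance for two aligned nucleotide sequences; it is not a reimplementation of the maximum-likelihood tree construction in MEGA, and the example sequences are invented.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Invented example: one transition (C to T) in ten sites.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
```

Because transitions and transversions are weighted separately, the corrected distance is slightly larger than the raw proportion of differing sites (0.1 in this example).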

Results

Sample collection and symptoms

Four surveys were conducted covering a total of 25 taro- and tannia-growing regions of Ethiopia, Kenya, Tanzania and Uganda (Fig. 1, Table 2). Of the 392 samples collected, 333 were from taro and the remaining 59 were from tannia, of which 68 taro and 23 tannia plants showed typical virus-like symptoms (Fig. 2A-K; Table 2). In Ethiopia, taro and tannia plants showing feathery-mottle, mosaic, stunting, leaf distortion, leaf yellowing and vein-clearing symptoms (Fig. 2C-I) were observed in all regions except Oromia. The highest number of symptomatic samples was collected from Welayita region with 33 out of 87 samples showing virus-like symptoms. In Kenya, virus-like symptoms were observed on taro and tannia growing in all regions except Siaya, whereas in Tanzania, taro and tannia plants exhibiting symptoms (Fig. 2A) were observed in all five locations surveyed. In Uganda, virus-like symptoms were seen on taro and tannia plants (Fig. 2B, J-K) growing in five of the seven regions visited. No plants showing typical alomae or bobone disease symptoms were observed during the surveys.

Figure 1. Locations of survey sites in Ethiopia, Kenya, Tanzania and Uganda (one map panel per country). Red stars represent sampling sites. A total of 171, 86, 41 and 94 samples were collected from Ethiopia, Kenya, Tanzania and Uganda, respectively.

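Table 2. Summary of PCR and RT-PCR screening results for viruses infecting taro and tannia samples in this study. Columns headed Poty, DsMV, CMV, TaVCV and CBDaV give the number of RT-PCR positive samples.

| Country | Region | Collected (total) | Collected (taro) | Collected (tannia) | Symptomatic (total) | Symptomatic (taro) | Symptomatic (tannia) | Poty (total) | Poty (taro) | Poty (tannia) | DsMV (total) | DsMV (taro) | DsMV (tannia) | CMV¹ | TaVCV | CBDaV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ethiopia | Welayita | 87 | 84 | 3 | 16 | 13 | 3 | 17 | 13 | 4 | 17 | 13 | 4 | 0 | 0 | 0 |
| Ethiopia | Oromia | 22 | 22 | 0 | 0 | 0 | 0 | 3 | 3 | 0 | 3 | 3 | 0 | 0 | 0 | 0 |
| Ethiopia | Sheka | 25 | 22 | 3 | 6 | 4 | 2 | 9 | 5 | 4 | 9 | 5 | 4 | 0 | 0 | 0 |
| Ethiopia | Masha | 14 | 12 | 2 | 3 | 1 | 2 | 4 | 2 | 2 | 4 | 2 | 2 | 0 | 0 | 0 |
| Ethiopia | Keffa | 23 | 20 | 3 | 4 | 1 | 3 | 3 | 1 | 2 | 3 | 1 | 2 | 0 | 0 | 0 |
| Ethiopia | Total | 171 | 160 | 11 | 29 | 19 | 10 | 36 | 24 | 12 | 36 | 24 | 12 | 0 | 0 | 0 |
| Kenya | Nyeri | 30 | 29 | 1 | 9 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Laikipia | 3 | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Tharaka Nithi | 14 | 14 | 0 | 8 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Kirinyaga | 9 | 8 | 1 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Embu | 19 | 19 | 0 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Kakamega | 4 | 4 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Kisumu | 5 | 5 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Siaya | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kenya | Total | 86 | 83 | 3 | 27 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Tanzania | Musoma | 9 | 9 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Tanzania | Tarime | 5 | 2 | 3 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Tanzania | Mago | 2 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Tanzania | Biharamulo | 9 | 1 | 8 | 1 | 0 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 |
| Tanzania | Mwanza | 16 | 15 | 1 | 9 | 7 | 2 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Tanzania | Total | 41 | 29 | 12 | 14 | 10 | 4 | 3 | 1 | 2 | 3 | 1 | 2 | 0 | 0 | 0 |
| Uganda | Busuju | 25 | 16 | 9 | 9 | 5 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Uganda | Lukaaya | 26 | 17 | 9 | 4 | 4 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
| Uganda | Busiro | 20 | 11 | 9 | 3 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Uganda | Budondo | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Uganda | Buunya | 6 | 5 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Uganda | Kignlu | 3 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Uganda | Luuka | 10 | 6 | 4 | 4 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| Uganda | Total | 94 | 61 | 33 | 21 | 12 | 9 | 1 | 0 | 1 | 1 | 0 | 1 | 3 | 0 | 0 |

¹ All three samples from Uganda that tested positive for CMV were from tannia.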
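The incidence figures implied by the country totals in Table 2 can be reproduced with a few lines of Python. The sketch below uses only the "Total" rows of the table (samples collected and DsMV-positive samples); the variable and function names are illustrative.

```python
# Country totals transcribed from Table 2: (samples collected, DsMV-positive).
TABLE2_TOTALS = {
    "Ethiopia": (171, 36),
    "Kenya": (86, 0),
    "Tanzania": (41, 3),
    "Uganda": (94, 1),
}


def incidence_percent(positive, total):
    """Percentage of tested samples that were positive."""
    return 100.0 * positive / total


by_country = {
    country: round(incidence_percent(positive, total), 1)
    for country, (total, positive) in TABLE2_TOTALS.items()
}

overall_total = sum(total for total, _ in TABLE2_TOTALS.values())
overall_positive = sum(positive for _, positive in TABLE2_TOTALS.values())
overall = round(incidence_percent(overall_positive, overall_total), 1)
```

On these totals, DsMV incidence is about 21 % in Ethiopia, about 7 % in Tanzania, about 1 % in Uganda and zero in Kenya, with roughly 10 % of all 392 samples testing positive.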

Figure 2. Photos of typical virus-like symptoms on taro and tannia plants from East Africa. A) Tz47 showing feathery-mottle symptom; B) Ug31 showing leaf yellowing and vein clearing symptoms; C) Et105 showing feathery-mottle and stunting symptoms; D) Et26 showing mosaic and feathery-mottle symptoms; E) Et36 showing yellowing and mosaic symptoms; F) Et82 showing mosaic and stunting symptoms; G) and H) Et51 showing mosaic, leaf distortion, stunting and feathery-mottle symptoms; I) Et41 showing yellowing and mosaic symptoms; J) Ug93 showing mosaic symptom; and K) Ug91 showing mosaic and yellowing symptoms.

RT-PCR screening

When RNA extracts were tested for the presence of potyviruses by RT-PCR using the degenerate primers CI-F/R, the expected ∼700 bp amplicon was only observed in 36 (24 taro and 12 tannia) samples from Ethiopia, as well as one sample from Uganda (Ug31, Lukaaya region) and three samples from Tanzania (Tz24 and Tz34 from Biharamulo and Tz47 from Mwanza). Samples Ug31, Tz24 and Tz34 were from tannia, while Tz47 was from taro. When these 40 samples were subsequently tested for DsMV by RT-PCR using specific primers DsMV-3F/3R, the expected amplicon of ∼560 bp was obtained from all 40 samples.

Testing of the extracts for the presence of CMV using the specific primers CMV-F/R resulted in an amplicon of the expected size from only three tannia samples (Ug90, 91, 92) from Buikwe district in Uganda. The amplicons from the three samples were cloned and sequenced, with BLAST analysis of the trimmed 735 bp region of the cloned sequences revealing highest identity (96 %) to a subgroup IB CMV isolate from Egypt.

When extracts were tested for the presence of rhabdoviruses using the degenerate primers Rhab-F/R, the expected ∼900 bp product was generated from 13 samples. However, these samples all tested negative for TaVCV and CBDaV using virus-specific primers, despite amplicons of the expected size (∼220 and ∼700 bp, respectively) being generated from the positive controls. Subsequent sequence analysis of cloned amplicons generated using the degenerate rhabdovirus primers revealed that the sequences were of non-viral origin.

Sequencing of DsMV isolates

Following RT-PCR using the degenerate potyvirus primers, amplicons from five samples selected from different locations (Et9, Et41, Et56, Tz34 and Ug31) were cloned and sequenced. BLAST analysis of the trimmed 630 bp sequences revealed 79-89 % and 90-99 % identity at the nucleotide and amino acid levels, respectively, to DsMV isolates infecting either taro from India (Et41, Tz34 and Ug31) or Zantedeschia aethiopica (Arum lily) from China (Et9 and 56). Amplicons generated using the DsMV-specific primers from 16 representative samples were subsequently cloned and sequenced. These 16 samples included 13 from Ethiopia (Et5, 9, 26, 29, 36, 40, 41, 51, 56, 74, 82, 105, 106), as well as samples Tz24 and 34 from Tanzania and sample Ug31 from Uganda. BLAST analysis of the trimmed 520 bp sequences from the 16 samples revealed a maximum of 92-96 % and 98-99 % identity at the nucleotide and amino acid levels, respectively, to DsMV isolates infecting a range of aroids from China, Japan, India and Nicaragua.

Following the analysis of these partial sequences, the complete genome sequences of isolates Ug31, Tz34 and seven isolates from Ethiopia (Et5, 9, 26, 29, 36, 41 and 56) were generated using Illumina MiSeq NGS. Comparison of the consensus nucleotide sequences of the nine isolates derived from NGS with the respective consensus RT-PCR-generated sequences revealed 99-100 % identity. The complete genome sequences of the nine DsMV isolates ranged from 9,710 to 9,978 nucleotides in length, excluding the 3' poly(A) tail. The 5' and 3' UTRs of the isolates varied between 138-339 nucleotides and 206-249 nucleotides, respectively. Each genome sequence contained a single large ORF of 9,339-9,576 nucleotides, encoding a predicted polyprotein of 3,113-3,192 amino acids with a predicted molecular mass of 354.7-362.5 kDa. Further, the overlapping ORF known as P3N-PIPO was identified in all nine sequences.
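The relationship between the ORF and polyprotein lengths quoted above can be checked with simple arithmetic. The sketch below assumes that the reported ORF lengths exclude the stop codon, which is inferred from the fact that dividing them by three gives exactly the reported amino acid counts; pairing the lowest reported mass with the shortest polyprotein is likewise an assumption made only to estimate an average residue mass.

```python
def polyprotein_length(orf_nt):
    """Amino acids encoded by an ORF of `orf_nt` nucleotides.

    ASSUMPTION: the ORF length excludes the stop codon.
    """
    if orf_nt % 3:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3


shortest = polyprotein_length(9339)
longest = polyprotein_length(9576)

# Average mass per residue in daltons, pairing the lowest reported mass
# (354.7 kDa) with the shortest polyprotein (an assumption).
mean_residue_mass = 354.7e3 / shortest
```

The resulting average of roughly 114 Da per residue is in the range expected for a typical protein, so the reported lengths and masses are mutually consistent.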

Phylogenetic analysis and PASC

Phylogenetic analysis was carried out using the amino acid sequences of the core CP-coding region from RT-PCR amplicons from the 16 DsMV isolates sequenced from East Africa, together with 39 published DsMV isolates and other representative members of the family Potyviridae. DsMV isolates included in the analysis formed a large heterogeneous group separate from other members of the Potyviridae (Fig. 3). Within the DsMV sequences included, 11 subgroups were identified, although many of these had low bootstrap support values. Isolates from East Africa clustered into five of these subgroups. However, the clustering was not representative of either host plant species or geographic origin, with Ethiopian DsMV sequences from taro and tannia present in four out of the five subgroups and clustering with isolates infecting taro, tannia, calla lily, elephant foot yam, vanilla and konjac from Vietnam, Nicaragua, Taiwan, India, New Zealand, USA, French Polynesia and Japan (Fig. 3). PASC analysis revealed that DsMV isolates from East Africa have an amino acid similarity ranging from 90.5 % to 100 % with previously reported DsMV isolates.

Figure 3. Phylogenetic analysis based on amino acid sequences of the core CP-coding region of selected DsMV isolates. The phylogenetic tree was generated using the Maximum-Likelihood method and a Kimura 2-Parameter model with 1000 bootstrap replications in MEGA 7. The tree was rooted using Ryegrass mosaic virus (RGMV, NC001814) as the outgroup. Bootstrap values greater than 50 % are shown. Taro (Colocasia esculenta), tannia (Xanthosoma sp.), elephant foot yam (Amorphophallus paeoniifolius), konjac (Amorphophallus konjac), arum lily (Zantedeschia aethiopica), calla lily (Zantedeschia sp.). Et: isolates sequenced from Ethiopia, Tz: isolates sequenced from Tanzania and Ug: isolates sequenced from Uganda. East African isolates shown in the tree (host, GenBank accession): Et5 (taro, MG602227), Et9 (taro, MG602228), Et26 (taro, MG602229), Et29 (taro, MG602230), Et36 (tannia, MG602231), Et41 (tannia, MG602232), Et56 (tannia, MG602233), Tz34 (tannia, MG602234), Ug31 (tannia, MG602235), Et40 (taro, MG602236), Et51 (taro, MG602237), Et74 (tannia, MG602238), Et82 (taro, MG602239), Et105 (taro, MG602240), Et106 (taro, MG602241) and Tz24 (tannia, MG602242).

Discussion

Of the 392 samples collected from 25 regions in the four countries, a total of 91 (68 taro and 23 tannia) samples showed virus-like symptoms. These symptoms included mosaic, yellowing, stunting, feathery-mottle, leaf distortion, vein-clearing and/or downward-curling of the leaf blades (Fig. 2), which have previously been associated with virus infection in a range of aroids (Zettler et al., 1970; Elliott et al., 1997; Revill et al., 2005a). Symptomatic samples were collected from plants growing in 21 of the 25 regions surveyed; no virus-like symptoms were observed in Oromia in Ethiopia, Siaya in Kenya, or Budondo and Kignlu in Uganda (Table 2). In the 25 regions from the four countries surveyed in this study, no samples showing symptoms usually attributed to bobone or alomae diseases were observed.

Of the 91 symptomatic samples, 45 samples from Ethiopia, Tanzania and Uganda showed symptoms such as feathery-mottle, mosaic, leaf distortion, yellowing and/or stunting, which are often associated with DsMV infection (Fig. 2A-K) (Nelson, 2008). Of the 45 samples with DsMV-like symptoms, three tannia samples collected from a single site in Luuka region of Uganda showing mosaic, mottling and vein-chlorosis symptoms (Fig. 2J and K) were found to be infected with CMV, with all three samples testing negative for DsMV. Although a range of other symptoms were observed in the samples collected in this study, no other samples tested positive for CMV, suggesting that asymptomatic infections of taro and tannia with CMV were not present in any of the samples collected and that symptoms observed on other samples were not associated with CMV infection. Of the remaining 42 samples with typical DsMV-like symptoms (Fig. 2A-I), 36 were confirmed to be infected with DsMV, including 33 samples from Ethiopia, one from Uganda and two from Tanzania. In addition, three asymptomatic plants from Ethiopia together with an asymptomatic sample from Tanzania (Tz24) also tested positive for DsMV. This phenomenon is consistent with previous studies and may occur as a consequence of seasonal effects or differences in symptom expression in different host plant species (Elliott et al., 1997; Nelson, 2008). Hence, sampling and testing of aroids for DsMV at different seasons of the year should be considered in future studies. The survey findings suggest that, while DsMV is widespread in Ethiopia, being detected in ∼21 % of the 171 samples collected from the five regions surveyed, this is not the case in Uganda, Tanzania and Kenya. Six samples (four from Ethiopia and two from Uganda) with typical DsMV-like symptoms tested negative for all of the viruses assayed. The yellowing and mosaic symptoms observed on these six samples might be caused by other factors such as aging, pest attack, pesticide use or viruses other than DsMV, CMV or the two rhabdoviruses assayed.
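The sample counts given in this part of the Discussion can be tallied as follows. The sketch uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts as stated in the Discussion.
symptomatic_total = 91
dsmv_like_symptoms = 45
cmv_positive = 3

dsmv_symptomatic = {"Ethiopia": 33, "Uganda": 1, "Tanzania": 2}
dsmv_asymptomatic = {"Ethiopia": 3, "Tanzania": 1}

# Samples with DsMV-like symptoms that were not explained by CMV.
dsmv_like_remaining = dsmv_like_symptoms - cmv_positive

# Symptomatic samples confirmed as DsMV-positive, and those left unexplained.
confirmed = sum(dsmv_symptomatic.values())
unexplained_dsmv_like = dsmv_like_remaining - confirmed

# All DsMV-positive samples, with and without symptoms.
dsmv_total = confirmed + sum(dsmv_asymptomatic.values())

# Symptomatic samples whose symptoms were not DsMV-like.
other_symptomatic = symptomatic_total - dsmv_like_symptoms
```

The tally gives 42 remaining DsMV-like samples, of which 36 were confirmed and six were unexplained, a total of 40 DsMV-positive samples once the four asymptomatic infections are included, and 46 symptomatic samples with other symptom types.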

Forty-six samples showing symptoms such as leaf discolouration or yellowing, vein swelling or deformation, downward-curling of the leaf blades, or stunting, tested negative for all the assayed viruses. Of these 46 samples, 36 were from taro and 10 were from tannia collected from the four countries surveyed. The symptoms observed on these samples may be caused by any one of a number of factors, such as nutritional deficiencies, as yet unidentified viruses, or other aroid-infecting viruses for which testing was not done, such as viruses from the families Reoviridae and Tospoviridae. It is also possible that the plants were infected with sequence variants of DsMV, CMV, CBDaV or TaVCV whose diversity precluded their detection using the currently available primers.

In work associated with the current study (Kidanemariam et al., 2018a), samples were tested for badnaviruses using PCR and rolling circle amplification (RCA) and full-length sequences were characterised. A high incidence and wide distribution of both TaBV and TaBCHV was determined, with at least one sample from every district testing positive; however, there was no clear association of either of these two viruses with symptoms. Interestingly, of the 40 samples which tested positive for DsMV, mixed infections of DsMV and TaBV were observed in 25 of the DsMV-positive samples from Ethiopia as well as all three DsMV-positive samples from Tanzania. This result indicates that mixed infections of TaBV and DsMV are not uncommon, and further work on the synergistic effects of mixed infections, compared to infection with either TaBV or DsMV alone, on the yield of taro plants is warranted. Interestingly, there were no mixed infections of DsMV with the other badnavirus species identified, TaBCHV.

Although partial sequences of DsMV isolates from Ethiopia are available (Kidanemariam et al., 2018b), the complete genomic sequences of East African DsMV isolates have not been reported. Therefore, the complete genome sequences of nine East African isolates were determined and analyses were carried out to determine the evolutionary relationship of these and previously reported DsMV isolates. The genome organisation of the nine DsMV isolates was consistent with other DsMV isolates. Phylogenetic analysis carried out using the core CP-coding amino acid sequences was also consistent with previous reports, with DsMV isolates grouping into two distinct clades (Wang et al., 2017; Babu and Hegde, 2014). The separation of Ethiopian DsMV isolates into five groups across the two clades, each containing isolates from different geographic locations including Vietnam, Nicaragua, Taiwan, India, the USA and Japan, suggests that the virus has most likely been introduced from different sources on multiple occasions. The origins of the isolates from Uganda (clade I) and Tanzania (clade II) are clearly different from each other, but similar to two of the five groups of isolates present in Ethiopia. The phylogenetic analysis also revealed that there is no relationship between either clades or groups with respect to geographic origin or host plant among the DsMV isolates included in this study, which is also consistent with previous work (Wang et al., 2017).

This is the first comprehensive survey carried out in East Africa to identify and characterise viruses infecting taro and other edible aroids in the region.
The findings from this study will assist farmers and national agricultural research services in the region to make informed decisions regarding the acquisition and dissemination of edible aroids and, in particular, highlight the high prevalence of DsMV in Ethiopia. Further work on the effects of DsMV infection on the yield of taro and tannia will be crucial in determining yield losses and identifying whether resistant cultivars are available for distribution. The establishment of virus-indexed tissue culture nurseries within East Africa will play a key role in the production and distribution of virus-free farmer-preferred taro cultivars in the region. The collection of field samples from this work will be preserved at the BecA–ILRI Hub and will be available for further analysis, if and when additional diagnostic assays become available. This may shed light on the cause of the symptoms displayed on some plants which tested negative in the current work.

Acknowledgments

This project was funded by the Biosciences eastern and central Africa (BecA–ILRI) Hub through the African Biosciences Challenge Fund (ABCF). The ABCF program is supported by the Australian Department of Foreign Affairs and Trade (DFAT) through the BecA-CSIRO partnership; the Syngenta Foundation for Sustainable Agriculture (SFSA); the Bill and Melinda Gates Foundation (BMGF); the UK Department for International Development (DFID) and the Swedish International Development Agency (SIDA). We are also thankful to all the farmers for allowing us to inspect their fields and collect samples. DK is the recipient of an Australia Awards Scholarship.


References

Akwee P.E., Netondo G., Kataka J.A., Palapala V.A. (2015) A critical review of the role of taro Colocasia esculenta L. (Schott) to food security: A comparative analysis of Kenya and Pacific Island taro germplasm. Scientia Agriculturae, 9, 101–108.

Alconero R., Zettler F.W. (1971) Virus infections of Colocasia and Xanthosoma in Puerto Rico. Plant Disease Reporter, 55, 506–508.

Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. (1990) Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.

Atibalentja N., Fiafia T.S., Gosai R., Melzer M. (2017) First Report of Taro vein chlorosis virus on Taro (Colocasia esculenta) in the U.S. Territory of American Samoa. Plant Disease, doi.org/10.1094/PDIS-09-17-1478-PDN

Babu B., Hegde V. (2014) Molecular characterization of dasheen mosaic virus isolates infecting edible aroids in India. Acta Virologica, 58, 34–42.

Beyene T.M. (2013) Morpho-agronomical characterization of taro (Colocasia esculenta) accessions in Ethiopia. SciencePG, 1, 1–9.

Cox M.P., Peterson D.A., Biggs P.J. (2010) SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics, 11, 485.

Dietzgen R.G., Tan E.R., Yong A.H.S., Feng C.W. (2013) Partial polymerase gene sequence, phylogeny and RT-PCR diagnostic assay for Datura yellow vein nucleorhabdovirus. Australasian Plant Disease Notes, 8, 21–25.

Dong J.H., Cheng X.F., Yin Y.Y., Fang Q., Ding M., Li T.T., Zhang L.Z., Su X.X., McBeath J.H., Zhang Z.K. (2008) Characterization of tomato zonate spot virus, a new tospovirus in China. Archives of Virology, 153, 855–864.

Elliott M.S., Zettler F.W., Brown L.G. (1997) Dasheen mosaic potyvirus of edible and ornamental aroids. Plant Pathology Circular, 384.

Grabherr M.G., Haas B.J., Yassour M., Levin J.Z., Thompson D.A., Amit I., Adiconis X., Fan L., Raychowdhury R., Zeng Q., Chen Z., Mauceli E., Hacohen N., Gnirke A., Rhind N., di Palma F., Birren B.W., Nusbaum C., Lindblad-Toh K., Friedman N., Regev A. (2011) Full-length transcriptome assembly from RNA-seq data without a reference genome. Nature Biotechnology, 29, 644–652.

Gollifer D., Jackson G., Dabek A., Plumb R., May Y. (1977) The occurrence and transmission of viruses of edible aroids in the Solomon Islands and the Southwest Pacific. International Journal of Pest Management, 23, 171–177.

Ha C., Coombs S., Revill P., Harding R.M., Vu M., Dale J.L. (2008) Design and application of two novel degenerate primer pairs for the detection and complete genomic characterization of potyviruses. Archives of Virology, 153, 25–36.

Higgins C.M., Bejerman N., Li M., James A.P., Dietzgen R.G., Pearson M.N., Revill A.P., Harding R.M. (2016) Complete genome sequence of Colocasia bobone disease-associated virus, a putative cytorhabdovirus infecting taro. Archives of Virology, 161, 745–748.

James M., Kenten R., Woods R. (1973) Virus-like particles associated with two diseases of Colocasia esculenta (L.) Schott in the Solomon Islands. Journal of General Virology, 21, 145–153.

Kazmi S.A., Yang Z., Hong N., Wang G., Wang Y. (2015) Characterization by small RNA sequencing of Taro Bacilliform CH Virus (TaBCHV), a novel Badnavirus. PloS One, 10, e0134147.

Kidanemariam D.B., Sukal A.C., Abraham A.D., Stomeo F., Dale J.L., James A.P., Harding R. (2018a) Identification and molecular characterisation of taro bacilliform virus and taro bacilliform CH virus from East Africa. Plant Pathology, https://doi.org/10.1111/ppa.12921.

Kidanemariam D.B., Macharia M.W., Harvey J., Holton T., Sukal A.C., James A.P., Harding R., Abraham A.D. (2018b) First report of Dasheen mosaic virus infecting taro (Colocasia esculenta L.) from Ethiopia. Plant Disease, doi.org/10.1094/PDIS-12-17-1991-PDN

Long M.H., Ayin C., Li R., Hu J.S., Melzer M.J. (2014) First Report of Taro vein chlorosis virus infecting taro (Colocasia esculenta) in the United States. Plant Disease, 98, 1160–1160.

Macanawai A.R., Ebenebe A.A., Hunter D., Devitt L., Hafner G., Harding R. (2005) Investigations into the seed and mealybug transmission of Taro bacilliform virus. Australasian Plant Pathology, 34, 73–76.

Manikonda P., Srinivas K.P., Reddy S., Venkata C., Ramesh B., Navodayam K., Krishnaprasadji J., Ratan P. B., Sreenivasulu P. (2011) Konjac mosaic virus naturally infecting three aroid plant species in Andhra Pradesh. Indian Journal of Phytopathology, 159, 133–135.

Maino M.K. (2003) The development of a serological-based diagnostic test for Dasheen mosaic potyvirus (DsMV). MSc thesis, School of Life Sciences, Queensland University of Technology.

Ndabikunze B.K., Talwana H.A.L., Mongi R.J., Issa-Zacharia A., Serem A.K., Palapala V., Nandi J.O.M. (2011) Proximate and mineral composition of cocoyam (Colocasia esculenta L. and Xanthosoma sagittifolium L.) grown along the Lake Victoria basin in Tanzania and Uganda. African Journal of Food Science, 5, 248–254.

Nelson S.C. (2008) Dasheen mosaic of edible and ornamental aroids. Plant Disease, 44, 1–9.

Onwueme I.C., Charles W.B. (1994) Cultivation of cocoyam. In: Tropical root and tuber crops. Production, perspectives and future prospects, pp. 139–161. FAO Plant Production and Protection, 126, Rome.

Pearson M., Jackson G., Saelea J., Morar S. (1999) Evidence for two rhabdoviruses in taro (Colocasia esculenta) in the Pacific region. Australasian Plant Pathology, 28, 248–253.

Rana G.L., Vovlas C., Zettler F.W. (1983) Manual transmission of dasheen mosaic virus from Richardia to nonaraceous hosts. Plant Disease, 67, 1121–1122.

Revill P., Jackson G., Hafner G., Yang I., Maino M., Dowling M., Devitt L., Dale J., Harding R. (2005a) Incidence and distribution of viruses of taro (Colocasia esculenta) in Pacific Island countries. Australasian Plant Pathology, 35, 327–331.

Revill P., Trinh X., Dale J., Harding R. (2005b) Taro vein chlorosis virus: characterization and variability of a new nucleorhabdovirus. Journal of General Virology, 86, 491–499.

Sivaprasad Y., Reddy B.B., Kumar C.N., Reddy K.R., Gopal D.S. (2011) First report of groundnut bud necrosis virus infecting taro (Colocasia esculenta). Australasian Plant Disease Notes, 6, 30–32.

Talwana H.A.L., Serem A.K., Ndabikunze B.K., Nandi J.O.M., Tumuhimbise R., Kaweesi T., Chumo E.C., Palapala V. (2009) Production status and prospects of cocoyam (Colocasia esculenta (L.) Schott.) in East Africa. Journal of Root Crops, 35, 98–107.

Tumuhimbise R., Talwana H.L., Osiru D.S.O., Serem A.K., Ndabikunze B.K., Nandi J.O.M., Palapala V. (2009) Growth and development of wetland-grown taro under different plant populations and seedbed types in Uganda. African Crop Science Journal, 17, 49–60.

Valderrama-Cháirez M.L., Cruz-Hernández A., Paredes-López O. (2002) Isolation of functional RNA from cactus fruit. Plant Molecular Biology Reporter, 20, 279–286.

Wang Y.F., Wang G.P., Wang L.P., Hong N. (2014) First report of Cucumber mosaic virus in taro plants in China. Plant Disease, 98, 574–574.

Wang Y., Wu B., Borth W.B., Hamim I., Green J.C., Melzer M.J., Hu J.S. (2017) Molecular characterization and distribution of two strains of dasheen mosaic virus on taro in Hawaii. Plant Disease, 101, 1980–1989.

Yang I.C., Hafner G.J., Revill P.A., Dale J.L., Harding R.M. (2003) Sequence diversity of South Pacific isolates of Taro bacilliform virus and the development of a PCR-based diagnostic test. Archives of Virology, 148, 1957–1968.

Zettler F.W., Foxe M.J., Hartman R.D., Edwardson J.R., Christie R.G. (1970) Filamentous viruses infecting dasheen and other araceous plants. Phytopathology, 60, 983–987.


Chapter 8

General Discussion

Despite the remarkable economic growth recorded in East Africa over the past 15 years, food and nutrition insecurity are still significant problems for the region (AASR, 2016; AEO, 2017). Agricultural productivity in the region is affected by many different factors, including climate variability such as El Niño events, armed conflict, losses due to pests and diseases, and inefficient farming systems (AASR, 2016; FAO, 2016). Taro (Colocasia esculenta L.) and tannia (Xanthosoma sp.) are among the most important root crops grown for both food and economic security by many small-holder farmers in sub-Saharan Africa. In the densely populated south and south-western part of Ethiopia, around 20 million people depend on root crops such as potato, sweet potato, taro and enset for their dietary intake (Harrison et al., 2014). Taro is propagated mainly because it performs well with minimal agricultural input (Harrison et al., 2014; Wada et al., 2017) and provides a basic source of starch in the diet for many communities. In Kenya, it is grown beside many streams and rivers around Mount Kenya and the Aberdares, as well as in the Lake Victoria basin districts of Kakamega, Kisumu and Siaya. Taro is also a very important food crop in Uganda and Tanzania. It is mainly grown along the Lake Victoria Basin in Tanzania (Bukoba and Misenyi districts) and Uganda (Wakiso and Mukono districts) (Talwana et al., 2009; Ndabikunze et al., 2011; Macharia et al., 2014).

In Ethiopia, Areka Agricultural Research Centre (AARC) is one of the research institutes mandated to carry out experiments with root and tuber crops. In early 2000, AARC released a taro variety called ‘Boloso-one’ with desirable production and agronomic characteristics and it was accepted by most farmers (Dagne et al., 2014). However, the production of ‘Boloso-one’ and other taro varieties in southern Ethiopia has declined significantly in recent years. In other areas of the world, such as Asia and the South Pacific, yield decline of taro and other edible aroids has been attributed to virus infection, and these pathogens are among the most important constraints on production (Yang et al., 2003; Revill et al., 2005; Babu et al., 2014). Prior to the current research project, however, no comprehensive study had been carried out to determine the incidence, distribution and possible origin of viruses infecting taro and other edible aroids in East Africa.

In this study, a high incidence of TaBV and TaBCHV infection was detected in taro and tannia from East Africa. TaBCHV was detected in all four countries surveyed (Ethiopia, Kenya, Tanzania and Uganda) while TaBV was only identified in Kenya, Uganda and Tanzania. It is possible, however, that TaBV is present in Ethiopia but was not represented in the samples randomly selected for sequencing from this country. Therefore, further sampling and analysis of plants from Ethiopia is needed to verify the absence of TaBV in this country. Both TaBV and TaBCHV are known to infect aroids without causing obvious symptoms (Revill et al., 2005). This is also consistent with our observations whereby no correlation was seen between symptoms and the presence of TaBV and TaBCHV. There is currently no information available on production losses in taro and other edible aroids due to infection by TaBV or TaBCHV alone or in combination.

DsMV infection can reportedly cause up to 60 % production losses in aroids (Hartman and Zettler, 1974; Elliott et al., 1997). Unless strict disease control and eradication measures are taken, the high occurrence of DsMV in Ethiopia is a threat to the production of taro in this country and the region. AARC is currently multiplying and distributing corms of elite taro cultivars like ‘Boloso-one’ to farmers in southern Ethiopia. In addition, AARC has the largest taro germplasm collection in the country. Therefore, it is crucial for the centre to implement viral disease diagnostic procedures in its taro production and distribution systems. Furthermore, production of disease-free taro planting materials through tissue culture should be considered in the future. Field observations in southern Ethiopia have revealed a high incidence of aphid infestations, which may be facilitating the rapid spread of DsMV. Therefore, integrated disease and pest management systems need to be established in order to achieve effective control of DsMV in Ethiopia. Surprisingly, no samples tested positive for DsMV from Kenya and there was a very low incidence of DsMV in Uganda and Tanzania. Therefore, appropriate quarantine measures need to be in place in these three countries in order to prevent the introduction and dissemination of the virus. In addition, equipping researchers and farmers in Kenya, Tanzania and Uganda with proper training on early identification and removal of DsMV-infected plants should be considered to control the virus.

No samples collected in this study tested positive for colocasia bobone disease-associated virus (CBDaV) or taro vein chlorosis virus (TaVCV), two members of the family Rhabdoviridae infecting taro in the South Pacific. However, there is currently limited sequence information available for these two viruses and rhabdoviruses in general. As a result, it is not known whether the virus-specific or degenerate PCR primers used in this study will amplify the breadth of variability that may be present within any East African rhabdovirus isolates. Therefore, the inability to detect rhabdoviruses in this study must be treated with caution and further research is required. Nevertheless, it is advisable that strict quarantine measures be implemented in order to prevent the introduction of taro planting material infected with known rhabdoviruses into the region. This is critical to avert the occurrence of alomae, a lethal disease reported from Papua New Guinea (PNG) and the Solomon Islands caused by the synergistic interaction between CBDaV and TaBV (Revill et al., 2005; Higgins et al., 2016). The establishment of disease diagnostic capacities, especially in national agricultural research systems (NARS) and quarantine regulation bodies, is also vital.

Due to time constraints and a lack of suitable assays or known infected samples for use as controls, testing of plant samples for members of the Tospovirus and Oryzavirus genera was not done in this study. Groundnut bud necrosis virus (GBNV), genus Tospovirus, has only been reported from India (Sivaprasad et al., 2011), while taro reovirus (TaRV), genus Oryzavirus, has only previously been identified in Papua New Guinea (Revill et al., 2005). Currently there is no information about the effect of these viruses on the production of taro. The testing of aroids from East Africa for these viruses and NGS analysis of a large number of RNA samples will be important for future research activities.

Infectious virus clones are a useful means of transmitting plant viruses without the need for insect vectors. They facilitate studies on virus resistance and viral gene functions, and enable modification of viruses for gene expression or gene silencing (Grimsley et al., 1986; Grimsley, 1990). Additional uses could be the assessment of production losses in vegetatively propagated crops, such as taro, caused by viral infection over subsequent generations and investigating the synergistic interactions of individual viruses in mixed infections (Grimsley et al., 1986; Grimsley et al., 1987; Dasgupta et al., 1991). In this study, a greater-than-unit-length clone of an Australian TaBV isolate was generated and was shown to be infectious in taro. This infectious clone can be used in future studies to assess the production losses in taro due to TaBV infection over several generations. Furthermore, this infectious clone can be used to screen taro cultivars for TaBV resistance, and to develop mutant TaBV infectious clones in order to study the role of different virus gene products in the virus life-cycle. Importantly, the TaBV infectious clone could be used to understand the possible role of TaBV in the lethal viral disease of taro known as ‘alomae’ caused by mixed infection of taro with TaBV and CBDaV. This would, however, first require the generation of an infectious clone of CBDaV.

To our knowledge, this study is the first to determine the incidence and distribution of viruses infecting taro and other edible aroids from East Africa. These results will assist farmers, NARS and private tissue culture laboratories in the countries surveyed to make informed decisions on the acquisition, dissemination and production of virus-free planting materials of taro and other edible aroids in the region. In addition, they will lay the groundwork for future studies on aroids in the region.

References

AASR (Africa Agriculture Status Report). (2016). Progress towards agricultural transformation in Africa. Accessed 01/11/2017 https://agra.org/aasr2016/public/assr.pdf

AEO (African Economic Outlook). (2017). Entrepreneurship and industrialisation. Accessed 01/11/2017 https://www.afdb.org/fileadmin/uploads/afdb/Documents/Publications/AEO_2017_Report_Full_English.pdf

Babu, B. and Hegde, V. (2014). Molecular characterization of dasheen mosaic virus isolates infecting edible aroids in India. Acta virologica 58:34–42.

Dagne, Y., Mulualem, T. and Kifle, A. (2014). Development of high yielding taro (Colocacia esculenta L.) variety for mid altitude growing areas of Southern Ethiopia. J. Plant Sci. 2: 50–54.

Dasgupta, I., Hull, R., Eastop, S., Poggi-Pollini, C., Blakebrough, M., Boulton, M. I. and Davies, J. W. (1991). Rice tungro bacilliform virus DNA independently infects rice after Agrobacterium-mediated transfer. J. Gen. Virol. 72:1215–1221.

Elliott, M.S., Zettler, F.W. and Brown, L.G. (1997). Dasheen mosaic potyvirus of edible and ornamental aroids. Plant Pathol. Circular, 384.

FAO (2016). Africa, Regional overview of food security and nutrition. The challenges of building resilience to shocks and stresses. Accessed 22/09/2017 http://www.fao.org/3/a-i6813e.pdf

Grimsley, N., Hohn, B., Hohn, T. and Walden, R. (1986). “Agroinfection,” an alternative route for viral infection of plants by using the Ti plasmid. Proceedings of the National Academy of Sciences 83:3282–3286.

Grimsley, N., Hohn, T., Davies, J. W. and Hohn, B. (1987). Agrobacterium-mediated delivery of infectious maize streak virus into maize plants. Nature 325:177–179.

Grimsley, N. (1990). Agroinfection. Physiologia Plantarum 79: 147–153.

Harrison, J., Moore, K.A., Paszkiewicz, K., Jones, T., Grant, M.R., Ambacheew, D., Muzemil, S. and Studholme, D.J. (2014). A draft genome sequence for Ensete ventricosum, the drought-tolerant “Tree against hunger”. Agronomy 4:13–33.

Hartman, R.D. and Zettler, F.W. (1974). Effects of dasheen mosaic virus on yields of Caladium, Dieffenbachia, and Philodendron. Phytopathol. 64: 768.

Higgins, C., Bejerman, N., Li, M., James, A., Dietzgen, R., Pearson, M., Revill, P. and Harding, R. (2016). Complete genome sequence of Colocasia bobone disease-associated virus, a putative cytorhabdovirus infecting taro. Arch. Virol. 161:745–748.

Macharia, M. W., Runo, S. M., Muchugi, A. N. and Palapala, V. (2014). Genetic structure and diversity of East African taro (Colocasia esculenta (L.) Schott). Afri. J. Biotech. 13:2950–2955.

Ndabikunze, B.K., Talwana, H.A.L., Mongi, R.J., Issa-Zacharia, A., Serem, A.K., Palapala, V. and Nandi, J.O.M. (2011). Proximate and mineral composition of cocoyam (Colocasia esculenta L. and Xanthosoma sagittifolium L.) grown along the Lake Victoria basin in Tanzania and Uganda. Afri. J. Food Sci. 5:248–254.

Revill, P., Jackson, G., Hafner, G., Yang, I., Maino, M., Dowling, M., Devitt, L., Dale, J. and Harding, R. (2005). Incidence and distribution of viruses of taro (Colocasia esculenta) in Pacific Island countries. Aust. Plant Pathol. 35:327–331.

Sivaprasad, Y., Reddy, B.B., Kumar, C.N., Reddy, K.R. and Gopal, D.S. (2011). First report of groundnut bud necrosis virus infecting taro (Colocasia esculenta). Aust. Plant Dis. Notes 6:30–32.

Talwana, H.A.L., Serem, A.K., Ndabikunze, B.K., Nandi, J.O.M., Tumuhimbise, R., Kaweesi, T., Chumo, E.C. and Palapala, V. (2009). Production status and prospects of cocoyam (Colocasia esculenta (L.) Schott.) in East Africa. J. Root Crops 35:98–107.

Wada, E., Asfaw, Z., Feyissa, T. and Tesfaye, K. (2017). Farmers perception of agromorphological traits and uses of cocoyam (Xanthosoma sagittifolium (L.) Schott) grown in Ethiopia. African J. Agri. Res. 12: 2681–2691.

Yang, I.C., Hafner, G.J., Revill, P., Dale, J. and Harding, R. (2003). Sequence diversity of South Pacific isolates of Taro bacilliform virus and the development of a PCR-based diagnostic test. Arch. Virol. 148:1957–1968.
