American International Journal of Available online at http://www.iasir.net Research in Formal, Applied & Natural Sciences ISSN (Print): 2328-3777, ISSN (Online): 2328-3785, ISSN (CD-ROM): 2328-3793 AIJRFANS is a refereed, indexed, peer-reviewed, multidisciplinary and open access journal published by International Association of Scientific Innovation and Research (IASIR), USA (An Association Unifying the Sciences, Engineering, and Applied Research)

Subspecies identification in aphids (Homoptera: Aphididae) by application of partial sequence of cytochrome c oxidase subunit I (COI) gene: a view on the potential of method

Nina V. Voronova
Zoology Department of Belarusian State University, Minsk, Belarus

Abstract: The family Aphididae is one of the most important groups of phytophagous insects for crop production. In such groups, in which certain forms can vary noticeably in their host specificity and harmfulness, correct species and subspecies identification is of great significance. Aphids (Homoptera: Aphididae) are known for their high ecological plasticity, and at the same time the family includes many closely related species, morphologically similar subspecies and other forms for which morphology-based identification cannot be unambiguous. Mitochondrial COI sequences are provided for the identification of not only species but also subspecies of aphids from Europe. Fragments of about 540 bp were analyzed. Most of the studied subspecies (76.92 percent) had distinct COI sequences. The number of nucleotide substitutions at the 5' end of the COI gene varied from two to seven per fragment (0.37–1.30 %). Differences between subspecies reached the level typical for closely related species and showed no individual variability or geographic affinity. Based on these results, we conclude that COI sequences can provide an effective tool for identifying subspecies in such applications as pest management, monitoring and plant quarantine.

Keywords: aphids • subspecies identification • DNA barcoding • COI • molecular

I. INTRODUCTION

Today the species problem remains one of the most fundamental biological problems, as well as one that is extremely difficult to solve[1]. Debates surrounding the species concept have been going on for decades[2–6]. Leaving aside the philosophical aspect of the problem, the most important question of the discussion is this: if the living world is really discrete, at which level of the taxonomic system and by which criterion can we draw a line between taxonomic units so that their distinction and individuality are not in doubt[7-9]?

The species problem can be observed most clearly in evolutionarily young groups where the adaptive radiation of low-level taxa was particularly wide. The origin of a large number of species, subspecies, races and other forms in an evolutionarily short time has led to an abundance of taxa that are difficult to distinguish. Aphids (Homoptera: Aphididae) are just such a group. Among the aphids there is a large number of closely related species and subspecies and, perhaps, ecological races which differ only in their biology or host adaptation, but the evolutionary significance of such features is open to question. The considerable ecological and morphological plasticity of aphids, which has been repeatedly demonstrated in experiments[10-11], adds further complexity to the construction of a phylogenetic system for this large group of animals. Nevertheless, it is impossible to disregard the taxonomic status of the closely related forms, because such forms of aphids may vary considerably in their harmfulness. Determining the significance of each aphid species and subspecies as a crop pest is the main factor that forces scientists to search for new methods of detecting morphologically similar groups.

Another reason for studying the problem of subspecies detection in aphids is the importance of obtaining knowledge about every evolutionary event in the group, including (or most importantly) those which belong to the microevolutionary level. Studying microevolutionary events allows us both to discover their mechanisms and to identify the main evolutionary trends in a group of phytophagous insects of such importance for food production and plant ecology[12-13].

The cytochrome c oxidase subunit I gene (COI) is a mitochondrial gene of eukaryotes which was selected as the most effective molecular marker for species identification[14]. A 500–700-bp region at the 5' end of the COI gene forms the primary barcode sequence for members of the animal kingdom[15-16]. In our work we aimed to estimate whether the use of partial COI sequences as a single molecular marker allows morphologically indistinguishable subspecies of aphids to be detected.

II. MATERIALS AND METHODS

Specimens and DNA extraction

Specimens of aphids were collected in 2008–2010 in Russia and Belarus. We analyzed COI sequences from aphids of 17 species and subspecies belonging to 4 genera and 2 tribes within Aphididae (Table 1).

AIJRFANS 14-205; © 2014, AIJRFANS All Rights Reserved Page 1 Nina V. Voronova, American International Journal of Research in Formal, Applied & Natural Sciences, 6(1), March-May 2014, pp. 01-06

All aphid samples were stored in 75% ethanol for slide voucher specimens and in 96% ethanol at –20 °C for DNA extraction. Each aphid was identified from the external morphology of slide-mounted specimens[17] and verified against slides in the collection of the Zoological Institute of the Russian Academy of Sciences. Voucher specimens are preserved in the collection of the Zoology Department of the Belarusian State University (Minsk, Belarus). Total genomic DNA was extracted using the Genomic DNA Purification Kit (Thermo Fisher Scientific, Fermentas). Samples for extraction consisted of a single individual or several individuals from the same colony.

Amplification and Sequencing of mitochondrial genome fragments

The target 708-bp fragment of COI was amplified by polymerase chain reaction (PCR) using the universal primer pair LCO1490 (5'-GGTCAACAAATCATAAAGATATTGG-3') and HCO2198 (5'-TAAACTTCAGGGTGACCAAAAAATCA-3')[15,18]. The reaction mixture contained 200 μmol of dNTP mix, 15 pmol of each primer, 2 mmol of MgCl2, PCR buffer, 1 U of Pfu polymerase (Thermo Fisher Scientific, Fermentas) and 0.5 μg of DNA. We used the following thermal cycle parameters for 25 µl amplification reactions: initial denaturation for 5 min at 94 °C, followed by 35 cycles of 30 s at 94 °C, 30 s at 50 °C and 90 s at 72 °C, and a final extension at 72 °C for 5 min.

Table 1. Aphid species and subspecies used in this study. Nomenclature according to Remaudiere 1997.

Species | Locality | Host plant | Date
Amphorophora idaei Börn. | Grodno region, Belarus | Rubus idaeus L. | 27/08/2010
Aphis fabae cirsiiacanthoidis (Scop.) | Minsk region, Belarus | Cirsium arvense (L.) Scop. | 15/07/2008
Aphis fabae fabae Scop. | Minsk region, Belarus | Chenopodium album L. | 05/08/2010
Aphis fabae philadelphi (Börn.) | Minsk, Belarus | Philadelphus coronarius L. | 13/06/2008
Aphis fabae ssp. | Minsk region, Belarus | Chenopodium album L. | 03/06/2008
Aphis ruborum Börn. | Stolbtsy district, Minsk region, Belarus | Rubus caesius L. | 14/07/2010
Dysaphis aff. newskyi (Börn.) | Turochaksky district, Altai, Russia | Heracleum sp. | 04/07/2010
Dysaphis newskyi ossiannilssoni Stroyan | Turochaksky district, Altai, Russia | Angelica sp. | 21/07/2010
cholodkovskyi Mordv. | Lepel district, Vitebsk region, Belarus | Filipendula ulmaria (L.) Maxim. | 04/06/2010
Macrosiphum gei Koch | Minsk region, Belarus | Geum urbanum L. | 05/06/2009
Macrosiphum gei Koch | Minsk region, Belarus | Anthriscus sylvestris (L.) Hoffm. | 05/06/2009
Macrosiphum gei Koch | Minsk region, Belarus | Aegopodium podagraria L. | 12/06/2009
Macrosiphum gei Koch | Minsk region, Belarus | Chaerophyllum aromaticum L. | 10/06/2009
Macrosiphum knautiae Holm. | Minsk region, Belarus | arvensis (L.) Coult. | 11/08/2009
Macrosiphum knautiae Holm. | Minsk region, Belarus | L. | 08/07/2009
Macrosiphum rosae (L.) | Minsk, Belarus | Rosa glauca Pourr. | 26/05/2010
Myzus cerasi cerasi (F.) | Stolbtsy, Minsk region, Belarus | Cerasus vulgaris Mill. | 14/08/2010
Myzus cerasi pruniavium (Börn.) | Nesvizh, Minsk region, Belarus | Cerasus avium (L.) Moench | 10/07/2010

PCR products were tested by electrophoresis on an agarose gel and, if a single band was observed, were purified using a DNA Gel Extraction Kit (Thermo Fisher Scientific, Fermentas) and sequenced in the forward direction on a 3130 Genetic Analyzer automated sequencer (Applied Biosystems, USA) with the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA) at the Institute of Bioorganic Chemistry NASB, Belarus. The sequences were individually checked by eye in the software package BioEdit v 7.0.5.3 and verified for frame-shifts in the protein-coding frame to avoid pseudogenes. Sequences are deposited in GenBank (accession numbers JF340096, JF340098, JF340100–JF340103, JF340105, JF340107, JF340108, JF340113, JF776568 and JN004138).

Genetic Divergence and Phylogeny

The combined sequences were aligned using ClustalW in the software package MEGA4[19]. Genetic distances between pairs of species were calculated using the Maximum Composite Likelihood method in MEGA4[20].
One hundred twenty-three COI sequences obtained from GenBank (NCBI) and the Barcode of Life Data System (CBOL) were used for the construction of trees and phylogenetic inference (the accession numbers for the downloaded sequences are given in Table S of the Supplementary Information to this article). Phylogenetic trees were constructed by the Minimum Evolution method[21]. Bootstrapping was conducted using 1000 replicates[22]. After construction of the trees, if the subspecies branches formed clusters with high bootstrap values, all sequences were checked by eye to find specific nucleotide substitutions.
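The pseudogene screen mentioned under Amplification and Sequencing can be illustrated with a minimal Python sketch. It is not the BioEdit procedure used in this study; it only assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are the only stop codons, and scans the three forward reading frames of an unaligned sequence. The function names `stop_codon_positions` and `open_frames` are hypothetical helpers introduced here for illustration.

```python
# Stop codons under the invertebrate mitochondrial genetic code
# (NCBI translation table 5): TGA encodes Trp and AGA/AGG encode Ser,
# so only TAA and TAG terminate translation.
STOP_CODONS = {"TAA", "TAG"}


def stop_codon_positions(seq, frame):
    """Return 0-based start positions of in-frame stop codons.

    `frame` is 0, 1 or 2. Gap characters are removed first, so the
    function is meant for raw reads, not for gapped alignments.
    """
    seq = seq.upper().replace("-", "")
    positions = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            positions.append(i)
    return positions


def open_frames(seq):
    """Return the forward frames (0, 1, 2) that contain no stop codon."""
    return [f for f in range(3) if not stop_codon_positions(seq, f)]


if __name__ == "__main__":
    toy = "ATGTAAGGATTC"  # invented toy sequence, not real COI data
    print(stop_codon_positions(toy, 0))
    print(open_frames(toy))
```

A read from a functional COI fragment would be expected to keep at least one frame free of stop codons over its whole length, whereas a read with no open frame would be a candidate for closer inspection as a possible pseudogene or sequencing error.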

III. RESULTS AND DISCUSSION

Sequence analysis

The COI sequences were compared within six groups of aphids. The comparison groups were: the M. cerasi subspecies; the D. newskyi subspecies; the A. fabae species complex; four forms of M. gei, which differ in their host-plant specificity and may presumably belong to different subspecies; and the complexes of species close to M. rosae and A. idaei Goot. We studied the region from nucleotide 81 to nucleotide 620 of COI, with positions determined by aligning the sequences to the complete mitochondrial genome sequence of Acyrthosiphon pisum (Harr.) [NC011594]. We also used COI sequences obtained from GenBank (accession numbers and some voucher specimens are given in Fig. 2 and Fig. 3) in order to check whether the same nucleotide substitutions occur in the COI genes of the same subspecies in samples obtained by other researchers in different regions and, in some cases, on different continents.

In all cases some single nucleotide substitutions were found in the COI sequences obtained from different subspecies of aphids (Fig. 1). Most frequently, these were synonymous transitions. The ratio of A↔G and C↔T transitions was almost equal, while the number of sites bearing such replacements varied between species. In particular, the COI sequences of the M. cerasi subspecies differ in seven loci; there are five nucleotide substitutions between M. cerasi cerasi and M. cerasi pruniavium, each of which is a synonymous transition. Interestingly, all studied sequences of M. cerasi, including those samples which had not been identified to subspecies, divided clearly into three groups. The first one, as expected, incorporated M. cerasi cerasi, and the second one included M. cerasi pruniavium. We found only two nucleotide substitutions between D. aff. newskyi and D. newskyi ossiannilssoni; however, one of them was the transversion A↔T. The same level of difference was detected between the sequences of the closely related species M. rosae and M. knautiae (two nucleotide transitions) as well as A. idaei and A. ruborum (one nucleotide transition).

We included certain forms of M. gei in the study because the question of whether M. gei should be divided into subspecies according to host adaptation is presently under consideration[23]. Comparing their COI sequences, we observed that M. gei from Ch. aromaticum differs from M. gei from G. urbanum in four COI sites, where three of those four substitutions were transversions, and that it differs from the other forms of the M. gei complex in three sites.
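The kind of comparison described above can be sketched in a few lines of Python: two aligned sequences of equal length are scanned site by site, each difference is labelled a transition (A↔G or C↔T) or a transversion, and the count is expressed as a percentage of the 540-bp fragment. This is only an illustrative sketch, not the procedure used in the study (where sequences were checked by eye); the function names and the toy sequences are assumptions made for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Label a substitution between two different bases."""
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def compare(seq1, seq2):
    """List substitutions between two aligned sequences.

    Returns tuples (1-based position, base in seq1, base in seq2, type).
    Sites with gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    subs = []
    for pos, (a, b) in enumerate(zip(seq1.upper(), seq2.upper()), start=1):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        subs.append((pos, a, b, classify(a, b)))
    return subs


def percent_divergence(n_subs, length=540):
    """Substitutions per fragment as a percentage (540 bp = sites 81-620)."""
    return 100.0 * n_subs / length


if __name__ == "__main__":
    # Invented toy sequences, not real COI data.
    for sub in compare("ACGTAC", "ACATTC"):
        print(sub)
    # Two and seven substitutions per 540 bp, the range reported in the abstract.
    print(round(percent_divergence(2), 2), round(percent_divergence(7), 2))
```

With two and seven substitutions over 540 sites the percentages come out at 0.37 and 1.30, which matches the range quoted in the abstract.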

Fig. 1 COI sequences of Myzus cerasi, Dysaphis newskyi subspecies, Aphis fabae, Aphis idaei, Macrosiphum gei, and Macrosiphum rosae groups. Comparisons are made using DNA barcodes by Foottit et al. (2008) and Valenzuela et al. (2007).


The A. fabae species complex turned out to be the most homogeneous in its COI gene sequences. A. fabae fabae differed from the other forms of the complex in only four loci, two of which were different transversions. The COI sequences of A. fabae cirsiiacanthoidis, A. fabae philadelphi and A. fabae solanella were identical (Fig. 1).

To compare our data with the results of other investigators, we used additional COI sequences of the Illinoia azaleae subspecies from CBOL. Unfortunately, this was the only group of aphid subspecies that we found in the nucleotide sequence databases after extensive searching. Nevertheless, there are six nucleotide substitutions between the corresponding COI gene fragments of the I. azaleae subspecies. This is consistent with our findings above on the number of nucleotide substitutions between subspecies of aphids.

Phylogenetic reconstruction

We could not always observe the separation of the subspecies into stable clusters when we constructed trees using various phylogenetic methods such as minimum evolution, maximum parsimony and neighbor-joining. Of all the studied groups, only the M. cerasi subspecies divided reliably into three clusters (Fig. 3), which may be associated with the large number of nucleotide substitutions between the subspecies' sequences.

Genetic distance

Genetic distances between units within the studied groups (dij) ranged from 0.000 to 0.023, and the mean value for all subspecies was 0.011 (Table 2). For comparison, we calculated the genetic distances between certain species of the same genus using the orthologous COI gene fragment. We found that the genetic distances between closely related species fall within the range observed between subspecies: for example, the genetic distance between A. idaei and A. ruborum is 0.002, while it is 0.003 between M. rosae and its closely related species (Fig. 2).

Table 2. Pairwise genetic distances of the COI gene between subspecies, calculated with three different models. Each cell gives Ave / Min / Max (average, minimum and maximum).

Model | Among all analyzed subspecies | Between four Myzus cerasi ssp. | Between four Aphis fabae ssp. | Between four forms of Macrosiphum gei | Between three Illinoia azaleae ssp.
MCL | 0.011 / 0.000 / 0.023 | 0.017 / 0.011 / 0.020 | 0.005 / 0.000 / 0.010 | 0.014 / 0.006 / 0.023 | 0.007 / 0.002 / 0.010
p-dist. | 0.011 / 0.000 / 0.022 | 0.015 / 0.011 / 0.019 | 0.005 / 0.000 / 0.010 | 0.014 / 0.006 / 0.022 | 0.007 / 0.002 / 0.010
K2P | 0.011 / 0.000 / 0.022 | 0.015 / 0.011 / 0.019 | 0.005 / 0.000 / 0.010 | 0.014 / 0.006 / 0.022 | 0.007 / 0.002 / 0.010
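For readers who wish to see how two of the distance measures in Table 2 are defined, the following Python sketch computes the p-distance and the Kimura 2-parameter (K2P) distance for a pair of aligned sequences. It is an illustration under stated assumptions rather than the MEGA4 implementation used in this study: sites with gaps or ambiguous bases are simply skipped, the Maximum Composite Likelihood model is not covered, and the function names and toy sequences are hypothetical. With P the proportion of sites showing a transition and Q the proportion showing a transversion, the p-distance is P + Q and the K2P distance is -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

TRANSITION_PAIRS = ({"A", "G"}, {"C", "T"})


def proportions(seq1, seq2):
    """Return (P, Q): proportions of transitional and transversional sites."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} in TRANSITION_PAIRS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def p_distance(seq1, seq2):
    """Proportion of compared sites that differ."""
    p, q = proportions(seq1, seq2)
    return p + q


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance."""
    p, q = proportions(seq1, seq2)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Invented toy pair: 540 sites with five transitions (A to G).
    s1 = "A" * 540
    s2 = "G" * 5 + "A" * 535
    print(p_distance(s1, s2), k2p_distance(s1, s2))
```

At the low divergences seen between subspecies the K2P correction changes the p-distance only slightly, which is in line with the nearly identical p-dist. and K2P rows of Table 2.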

In this paper we leave aside the evolutionary history of the analyzed subspecies and the validity of their existence, relying on the opinions of colleagues in taxonomy[17,24]. Our interest is only the feasibility of identifying closely related aphid subspecies by COI sequence. It should be noted that the DNA barcoding technique is now used increasingly widely in purely taxonomic studies of aphids[25-31]. Nevertheless, how far this approach can be relied on is still being debated. Particularly heated polemics arise when molecular taxonomy data conflict with the views currently accepted among taxonomists on the status of some forms or on the relationships between groups[32]. The reason is that the question of the taxonomic level at which a relevant molecular signal first appears is the central issue of molecular taxonomy. When dealing with a partial DNA sequence that is very short in comparison with the full genome (typically 500–1500 nucleotides in length, which is about 6 per cent of the mitochondrial genome and about 2.15e–4 per cent of the total size of aphid DNA[33]), it is sometimes difficult to be sure that the analysis of this very small portion of the genome will allow one to identify species or subspecies accurately.

Recent studies by Canadian and Korean researchers[34-35], as well as our own, have shown that COI allows detection of certain species of aphids with a high level of statistical significance. The present study shows that COI haplotypes are formed in aphids at a lower taxonomic level, namely at the level of subspecies. These haplotypes cannot be regarded as random (individual) variability, because we found that, for example, the COI haplotype of M. cerasi cerasi from Belarus is completely identical to those from Canada [EU701789 and EU701784], and the haplotype of Belarusian M. cerasi pruniavium is the same as that of M. cerasi from Australia [DQ499055] and Canada [EU701786, EU701790]. Likewise, sympatric M. gei subspecies (the need to describe which, as mentioned above, is currently under discussion), living in the same habitats but on different host plants, also show differences in the COI sequence.

Only the A. fabae group introduces some uncertainty as to the reliability of our conclusions. The lack of nucleotide substitutions between the COI sequences of A. fabae cirsiiacanthoidis, A. fabae philadelphi and A. fabae solanella has two possible explanations: (1) these subspecies diverged more recently than the others we compared, and specific nucleotide substitutions have not yet formed, or (2) the subspecies rank was improperly assigned to these forms.


Fig. 2 Minimum evolution identification trees of Myzus cerasi, Dysaphis newskyi subspecies, Aphis fabae, Aphis idaei, Macrosiphum gei, and Macrosiphum rosae groups based on Maximum Composite Likelihood genetic distances. Numbers on branches are bootstrap values.

Although the vast majority of the nucleotide substitutions we found are synonymous, their very existence is important evolutionary evidence, because it is known that in genes under strong functional or selective constraint[36-37], such as COI, even synonymous substitutions are often under the pressure of purifying selection. In any case, the patterns we observed suggest that even a single marker, COI, provides an important tool for the identification of not only species but also subspecies in aphids, at least when it comes to “good subspecies” (by analogy with “good species”[38]).

IV. CONCLUSION

We compared the sequences from nucleotide 81 to nucleotide 620 of the COI gene within six groups of aphids belonging to low-level taxa (subspecies and closely related species). COI allows subspecies to be identified with the same efficiency as separate species. Subspecies of aphids have their own COI haplotypes, which show no individual variability or geographic affinity.

Acknowledgements

We are grateful to Dr A. V. Stekolshchikov of the Laboratory of Insect Taxonomy of the Zoological Institute of the Russian Academy of Sciences for kindly providing us with some Dysaphis samples. We are also thankful to Dr A. N. Evtushenkov, Head of the Department of Molecular Biology of Belarusian State University, for his technical support of our work. We thank Mr. J. Boleininger of the Materials Department of Imperial College London, UK, for his helpful suggestions for improving the manuscript.

REFERENCES

[1] Hey, J. Trends in Ecology and Evolution. 2001. 16, 7. 326–329.
[2] Mayr, E. Animal Species and Evolution. Cambridge: The Belknap Press. 1963. 797 pp.
[3] Shaposhnikov, G.Ch. Evolutionary Theory. 1984. 7. 1–39.
[4] Mallet, J. Journal of Evolutionary Biology. 2001. 14, 6. 887–888.
[5] Wu, C.I. J. Evol. Biol. 2001. 14. 851–865.
[6] Van Alphen, J.J.M., Seehausen, O. J. Evol. Biol. 2001. 14. 874–875.
[7] Hull, D.L. Philosophy of Science. 1978. 45, 3. 335–360.
[8] Rakauskas, R. Ekologija. 2003. 1. 3–7.
[9] Fitzpatrick, B.M., Fordyce, J.A. & Gavrilets, S. J. Evol. Biol. 2008. 21. 1452–1459.
[10] Shaposhnikov, G.Ch. Entomological Review. 1965. 44(1). 3–25. (in Russian).
[11] Dixon, A.F.G. Aphid ecology. Glasgow: Blackie & Son Ltd. 1985. 157 pp.
[12] Hales, D.F., Tomiuk, J., Wöhrmann, K., Sunnucks, P. Eur. J. Entomol. 1997. 94. 1–55.
[13] Van Emden, H.F. & Harrington, R. Aphids as crop pests. UK: Oxford press. 2006. 699 pp.
[14] Ratnasingham, S. & Hebert, P.D.N. Molecular Ecology Notes. 2007. 7. 355–364.
[15] Hebert, P.D.N., Cywinska, A., Ball, S.L. & deWaard, J.R. Proc. R. Soc. Lond. B. 2003. 270. 313–321.
[16] Savolainen, V., Cowan, R.S., Vogler, A.P., Roderick, G.K., Lane, R. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 2005. 360. 1805–1811.
[17] Heie, O.E. Fauna Entomologica Scandinavica. 1992. 25. 1–188.
[18] Folmer, O., Black, M., Hoeh, W., Lutz, R., Vrijenhoek, R. Molecular Marine Biology and Biotechnology. 1994. 3. 294–299.
[19] Tamura, K., Dudley, J., Nei, M., Kumar, S. Mol. Biol. Evol. 2007. 24. 1596–1599.
[20] Tamura, K., Nei, M. & Kumar, S. PNAS. 2004. 101. 11030–11035.
[21] Rzhetsky, A. & Nei, M. Molecular Biology and Evolution. 1992. 9. 945–967.
[22] Felsenstein, J. Evolution. 1985. 39. 783–791.
[23] Voronova, N.V., Buga, S.V. & Kurchenko, V.P. Proceedings of BSU. 2011. 5(1). 171–178. (in Russian).
[24] Remaudiere, G. & Remaudiere, M. Catalogue of the World’s Aphididae (Homoptera: Aphidoidea). Paris: Institut National de la Recherche Agronomique. 1997. 473 pp.
[25] Lozier, J.D., Roderick, G.K. & Mills, N.J. Evolution. 2007. 61. 1353–1367.
[26] Lozier, J.D., Foottit, R.G., Miller, G.L., Mills, N.J., Roderick, G.K. Zootaxa. 2008. 1688. 1–19.
[27] Coeur d’acier, A., Jousselin, E., Martin, J.F., Rasplus, J.Y. Molecular Phylogenetics and Evolution. 2007. 42. 598–611.
[28] Kim, H., Lee, S. Mol. Cells. 2008. 25. 510–522.
[29] Cocuzza, G.E., Cavalieri, V. & Barbagallo, S. Bulletin of Insectology. 2008. 61. 125–126.
[30] Foottit, R.G. & Maw, H.E.L. Redia. 2009. 92. 87–91.
[31] Rakauskas, R., Turcinaviciene, J. & Basilova, J. Eur. J. Entomol. 2011. 108. 469–479.
[32] Rakauskas, R. Redia. 2009. 92. 97–100.
[33] The International Aphid Genomics Consortium. PLoS Biol. 2010. 8. doi:10.1371/journal.pbio.1000313.
[34] Foottit, R.G., Maw, H.E.L., von Dohlen, C.D. & Hebert, P.D.N. Molecular Ecology Resources. 2008. 8. 1189–1201.
[35] Lee, W., Kim, H., Lim, J., Choi, H.R., Kim, Y., Kim, Y.S., Ji, J.Y., Foottit, R.G., Lee, E. Molecular Ecology Resources. 2011. 11. 32–37.
[36] Blouin, C., Boucher, Y., Roger, A. Nucleic Acids Research. 2003. 31. 790–797.
[37] Fay, J.C., Wu, C.I. Annu. Rev. Genomics Hum. Genet. 2003. 4. 213–235.
[38] Mallet, J. Trends Ecol. Evol. 1995. 10. 294–299.
