Genetics and Molecular Biology, 40, 1(suppl), 191-199 (2017) Copyright © 2017, Sociedade Brasileira de Genética. Printed in Brazil DOI: http://dx.doi.org/10.1590/1678-4685-GMB-2016-0042

Research Article

Secondary structure of nrDNA Internal Transcribed Spacers as a useful tool to align highly divergent species in phylogenetic studies

Giovanna C. Giudicelli1, Geraldo Mäder1, Gustavo A. Silva-Arias1, Priscilla M. Zamberlan1, Sandro L. Bonatto2 and Loreta B. Freitas1 1Laboratory of Molecular Evolution, Department of Genetics, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil 2 Laboratory of Genomic and Molecular Biology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, RS, Brazil

Abstract Recently, it has been suggested that internal transcribed spacer (ITS) sequences are under selective constraints to preserve their secondary structure. Here, we investigate the patterns of the ITS nucleotide and secondary structure conservation across the L. genus to evaluate the potential use of secondary structure data as a helpful tool for the alignment in taxonomically complex genera. Considering the frequent use of ITS, this study also presents a perspective on future analyses in other groups. The ITS1 and ITS2 sequences presented significant differ- ences for mean values of the lowest energy state (LES) and for number of hairpins in different Passiflora subgenera. Statistical analyses for the subgenera separately support significant differences between the LES values and the to- tal number of secondary structures for ITS. In order to evaluate whether the LES values of ITS secondary structures were related to selective constraints, we compared these results among 120 ITS sequences from Passiflora species and 120 randomly generated sequences. These analyses indicated that Passiflora ITS sequences present charac- teristics of a region under selective constraint to maintain the secondary structure showing to be a promising tool to improve the alignments and identify sites with non-neutral substitutions or those correlated evolutionary steps. Keywords: Passiflora; lowest energy state; hairpins; ITS1; ITS2. Received: February 24, 2016; Accepted: August 23, 2016.

Introduction portant role in rRNA processing. Deletion of the central Ribosomal DNA (rDNA) consists of one of the larg- portion of ITS1, including a processing region that is used est multigenic families in eukaryotic genomes and is pres- in the early stages of ribosomal maturation, blocks the for- ent at one or several locations in arrays of tandem elements mation of 40S subunits (Musters et al., 1990), whereas (Hillis and Dixon, 1991). Each unit is composed of three small deletions in the 5’-terminal region of ITS2 prevent rRNA gene regions (18S, 5.8S, and 26S) that are preceded the maturation of 26S rRNA, and deletions in the by an external transcribed spacer, and the two internal tran- 3’-terminal portion severely reduce the efficiency of this scribed spacers ITS1 and ITS2 separate the genes respec- process (van der Sande et al., 1992). The maturation and tively (Buckler-IV et al., 1997). The rDNA exons are high- splicing processes depend on the secondary structure of ly conserved across eukaryotic organisms, whereas the ITS ITS regions and thus imply some degree of conservation at regions present length variability due to point mutations the sequence or structural level (Mai and Coleman, 1997). and insertions/deletions (indels) (Baldwin et al., 1995; ITS sequences have been widely used in the inference Álvarez and Wendel, 2003). Gene regions are further pro- of phylogenetic hypotheses or in molecular evolution stud- cessed to produce the mature RNAs to form the cytoplas- ies in (Álvarez and Wendel, 2003), including several mic ribosomes, whereas ITS regions suffer a specific cleav- taxonomic levels (Kress et al., 2002; Zimmer et al., 2002; age during the maturation of the ribosomal subunits and are Koch et al., 2003; Rauscher et al., 2004; Bellarosa et al., thus not incorporated into mature ribosomes (Hillis and 2005; Kan et al., 2007; Queiroz et al., 2011; Ghada et al., Dixon, 1991; Baldwin et al., 1995). ITS regions play an im- 2013; Fan et al., 2014). In the Passiflora genus, ITS regions are the most commonly used marker for estimating the Send corresponding to Loreta B. Freitas. Laboratory of Molecular Evolution, Department of Genetics, Universidade Federal do Rio phylogenetic relationships among subgenera and species Grande do Sul, P.O. Box 15053, 91501-970, Porto Alegre, RS, (Muschner et al., 2003; Krosnick and Freudenstein, 2005; Brazil. E-mail: [email protected] Lorenz-Lemke et al., 2005; Koehler-Santos et al., 2006; 192 Giudicelli et al.

Mäder et al., 2010; Cazé et al., 2013; Krosnick et al., 2013; The presence of a high number of polymorphisms in Ramaiya et al., 2014), although this is usually combined sequences makes it more difficult to obtain good align- with other sequences in these studies. ments, and this could affect phylogenetic inference The Passiflora genus is the largest member of the (Álvarez and Wendel, 2003). For that reason, the aim of this family, with more than 520 species widely study was to investigate the patterns of ITS nucleotide and distributed in the Neotropical region and a few species oc- secondary structure conservation across the Passiflora ge- curring in the Old World (Deginani, 2001; Ulmer and nus to evaluate the potential use of secondary structure data MacDougal, 2004). This genus was initially divided into 22 as a helpful tool to improve the alignments in a genus that or 23 subgenera based on floral morphology (Killip, 1938; presents high taxonomic complexity. We also evaluated for Escobar, 1989), but the current infrageneric classification the first time whether the ITS sequences are under selective (Feuillet and MacDougal, 2003) regrouped the species into constraints, considering their secondary structure in four subgenera that have been partially or fully corrobo- Passiflora species. Considering the frequent use of ITS se- rated in phylogenetic studies (Muschner et al., 2003, 2012; quences, this study also presents a perspective for future Yockteng and Nadot 2004a; Hansen et al., 2006, 2007): analyses of secondary structure and estimates of phylo- Astrophea (DC.) Mast., Decaloba (DC.) Rchb., genies in other plant groups. Moreover, this original contri- Deidamioides (Harms) Killip, and Passiflora. The species bution improves the database of ITS secondary structure of the Passiflora genus present a large diversity of floral that is essential for a trustworthy reference to phylogenetic and vegetative features, which contributes to the complex and DNA barcoding analyses (Ankenbrand et al., 2015). taxonomy of this group (Krosnick and Freudenstein, 2005). The first study to obtain a molecular phylogeny for Material and Methods Passiflora (Muschner et al., 2003) described a high diver- Taxon sampling gence among ITS sequences of the different subgenera; consequently, alignment of these sequences presents sev- GenBank sampling included 870 accessions of ITS1 eral ambiguous regions and indels. For that reason, authors and ITS2 from 163 species representatives of all four sub- suggested a separate alignment per subgenus. Several plas- genera: Astrophea (9 spp.), Decaloba (112 spp.), tid regions (Muschner et al., 2003; Krosnick and Deidamioides (6 spp.), and Passiflora (36 spp.). We also Freudenstein, 2005; Hansen et al., 2006) and other nuclear included 74 new sequences (GenBank accessions numbers: markers (Yockteng and Nadot, 2004a,b; Muschner et al., KP769869- KP769905; KP769917- KP769953) from 26 2012) have also been used in Passiflora studies; however, species (Astrophea 5 spp., Decaloba 1 spp., and Passiflora they do not present sufficient variability to distinguish spe- 20 spp.), totaling 944 sequences from 189 Passiflora spe- cies as efficiently as observed in ITS. Despite this, the low cies. All these sequences were used to assess diversity in variability in plastid and other nuclear regions is advanta- Passiflora. Subgenera were represented by different num- geous in relation to ITS because allows the alignment of all ber of sequences according the number of species included subgenera simultaneously (Muschner et al., 2003; Mäder et on each one. Sequences that presented a large number of al., 2010). missing data in initial or final portions of ITS1 or ITS2 were Mutations in the tandem repeats of rDNA are com- not included in our analyses. monly homogenized through concerted evolution Plant material, DNA extraction, PCR amplification (Arnheim et al., 1980; Baldwin et al., 1995). This phenom- and sequencing enon makes each copy of an rRNA array very similar to other copies within individuals and species, thus providing To obtain the new sequences, the total DNA was ex- a high degree of similarity in gene family and affecting the tracted from young leaves dried in silica gel using the region variability (Hillis and Dixon, 1991; Buckler-IV et method of Roy et al. (1992). Voucher specimens were de- al., 1997). However, the high level of ITS polymorphism posited at ICN Herbarium (Department of Botany, Federal present in Passiflora species indicates that the process of University of Rio Grande do Sul, Porto Alegre, Brazil). concerted evolution is not sufficiently fast to homogenize ITS1 and ITS2 regions were amplified using primers 92 the region (Lorenz-Lemke et al., 2005; Koehler-Santos et and 75 (White et al., 1990) and amplification conditions as al., 2006), as observed in other plant groups (Wissemann previously described (Desfeux and Lejeune, 1996). To ex- and Ritz, 2005; Harpke and Peterson, 2006). Melo and clude the presence of low stability templates, we used 10% Guerra (2003) suggested that the high variability of ITS dimethyl sulfoxide (DMSO) (Buckler-IV et al., 1997; present in Passiflora could be related to the chromosomal Fuertes-Aguilar et al., 1999). We checked the quality and location of rDNA in this genus, as has been suggested for quantity of PCR products by horizontal electrophoresis in Gossypium L. (Wendel et al., 1995), because the 45S sites 1% agarose gel stained with GelRed (Biotium, UK), and where the ITS region is located were mapped in subtermi- purified them using the polyethyleneglycol (PEG) precipi- nal position, which interferes with copy homogenization tation method (Dunn and Blattner, 1987). We performed (Li and Zhang, 2002). the sequencing reactions automatically in a MegaBACE nrDNA ITS in phylogenetic studies 193

1000 DNA Analysis System (GE Healthcare Biosciences, values of ITS1 and ITS2 sequences were related to selec- Pittsburgh, PA, USA). tive constraints. These sequences were obtained using a Random DNA Sequence Generator (www.faculty.ucr.edu/ Data analyses ~mmaduro/random.htm) and presented the same length We removed the 5.8S gene region from ITS se- range (176 bp to 280 bp) of the original sequences. There- quences because of its well-conserved nature, and ITS1 and fore, for each Passiflora ITS1 or ITS2 sequence, we ob- ITS2 were analyzed individually. We discarded identical tained a random sequence that presented the same size (bp) sequences obtained from a same species, such that each se- and a similar GC content. All secondary structures for those quence type was considered only once, using DnaSP 5 random sequences were also modeled, and the obtained in- (Librado and Rojas, 2009). Because Passiflora subgenera formation is available in Supplementary Material Table S2. present high genetic variability among them, we conducted Secondary structure prediction the analysis considering each subgenus separately. Conse- quently, for each subgenus, sequences of ITS1 or ITS2 We modeled ITS1 and ITS2 putative secondary struc- were automatically aligned using default parameters in tures for the 120 sequences of 60 Passiflora species and ClustalX (Thompson et al., 1997), visually reviewed, and 120 random sequences using RNAstructure 5.3 (Reuster manually adjusted using MEGA6 (Tamura et al., 2013). and Mathew, 2010). Output parameters of this software in- We deleted ambiguous sites from ITS sequences (Mäder et cluded the LES and total number of structures for each ana- al., 2010), and gaps were coded as binary characters lyzed sequence. We manually analyzed the number of (Simmons and Ochoterena 2000) using GapCoder (Young hairpins and paired nucleotides of ITS sequences as per- and Healy, 2003). formed by Edger et al. (2014). Our analysis included for We selected 120 sequences from 60 species each Passiflora subgenus and random sequences: list of (Astrophea 6 ssp., Decaloba 39 ssp., Deidamioides 6 ssp., species, GenBank accession numbers, lengths, LES, total and Passiflora 9 ssp.) to model secondary structures of both number of structures, number of hairpins, and number and ITS1 and ITS2 regions. For this selection, we first con- percentage of paired nucleotides (Supplementary Material structed phylogenetic trees in BEAST 1.8 (Drummond et Tables S1 and S2). al., 2012) using the HKY substitution model with four We use the Wilcoxon-Mann-Whitney test in R soft- gamma categories, a Yule tree prior, and 107 chain lengths. ware package (R Development Core Team, 2011) to assess The first 1000 trees were discarded as Òburn inÓ. We used whether there are significant differences between second- the JModelTest software (Guindon and Gascuel, 2003; ary structure parameters (i.e., LES, total number of struc- Darriba et al., 2012) to select the best evolution model for tures, and number of hairpins) obtained for ITS1, ITS2, and our BEAST analysis, but because the resulting models were random sequences. We performed an analysis comparing not available in the BEAST software, we selected the HKY all ITS1 and ITS2 sequences to the 120 random sequences substitution model as the closest to our results, based on generated and another two analyses considering only ITS1 Passiflora phylogenies previously obtained (Muschner et or ITS2 sequences and their 60 random sequences sepa- al., 2003; Lorenz-Lemke et al., 2005; Koehler-Santos et rately. We assessed differences for the same secondary al., 2006; Mäder et al., 2010; Cazé et al., 2013). Phylogen- structure parameters among the four Passiflora subgenera etic trees obtained using BEAST are available in Supple- using a Kruskal-Wallis test (Kruskal and Wallis, 1952). mentary Material (Figures S1-S8). We selected the species to obtain the ITS1 and ITS2 secondary structures consider- Results and Discussion ing only species from well-supported clades (posterior probability 3 0.7). We included a greater number of ITS1 and ITS2 sequences characteristics and Decaloba species to test differences between secondary secondary structure structures of species from different supersections that are Alignments for all subgenera presented indels with phylogenetically well supported (Krosnick et al., 2013). different extensions among species, as previously observed For this analysis, we randomly chose three species from in other Passiflora studies (Muschner et al., 2003; Mäder et each supersection, except for Multiflora: all sequences a- al., 2010). Characteristics of ITS1 and ITS2 regions evalu- vailable for this supersection were analyzed because it is ated per subgenera are summarized in Table 1. ITS1 se- the only paraphyletic supersection in Decaloba subgenus quences always presented higher length than the ITS2 studies (Krosnick et al., 2013). sequences within each subgenus. Decaloba showed the We calculate mean guanine-cytosine (GC) content longest length for ITS1 (336 base pairs - bp), as did for each Passiflora subgenus and ITS segment (ITS1 or Passiflora for ITS2 (239 bp). Decaloba exhibited the high- ITS2) separately. Considering these values and the size est percentages of variable and informative sites for both variation (bp) of the original ITS sequences, we generated ITS1 and ITS2 sequences, whereas Deidamioides showed 120 random sequences to evaluate whether the total num- the lowest percentages of variable and informative sites for ber of secondary structures and lowest energy state (LES) ITS1 and Astrophea for ITS2 sequences. 194 Giudicelli et al.

Table 1 - Characteristics of ITS1 and ITS2 dataset presented per Passiflora subgenera.

Subgenus ITS region N individuals N species Alignment length (bp) Variable characters (%) PI characters (%) Astrophea ITS1 26 14 288 31.25 20.14 ITS2 31 14 235 28.51 16.17 Decaloba ITS1 202 113 336 71.43 58.93 ITS2 180 113 231 66.23 53.25 Deidamioides ITS1 70 6 288 27.43 18.06 ITS2 59 6 226 36.28 23.01 Passiflora ITS1 126 56 260 46.54 31.54 ITS2 106 56 239 46.86 25.94

ITS, internal transcribed spacer; BP, base pairs; PI, parsimony informative

Considering ITS1 and ITS2 sequences of all subgen- showed higher numbers of conserved sites, as previously era, the ITS1 region size range was between 220 and 280 observed for a different Passiflora species set (Giudicelli et bp, whereas ITS2 ranged between 176 and 222 bp. We al., 2015). High conservative patterns in ITS2 sequences evaluated each ITS1 and ITS2 sequence based on different have been previously reported and related to structural con- parameters: sequence length, LES, total number of struc- straints that are present at very deep phylogenetic scales in tures, number of hairpins, and number and percentage of eukaryotes (Mai and Coleman 1997; Shultz et al., 2005). paired nucleotides (Table 2; Supplementary Material Table Although highly conserved motifs have also been reported S1). ITS1 sequences presented higher average length for ITS1 sequences in plants, such motifs seem to be shorter (mean 263 bp, Standard Deviation SD = 19 bp) than ITS2 than those observed in ITS2, and in Passiflora ITS1 region, sequences (mean 206 bp, SD 7 bp), considering the four remaining parts of the sequence are more variable these subgenera. Additionally, ITS1 sequences displayed signifi- motifs (Liu and Schardl 1994). cant differences and lower mean values of LES than ITS2 No statistically significant differences were observed sequences (-105.6 and -80.5 degrees, respectively; between ITS1 and ITS2 in the total number of possible sec- P < 0.001; Table 2). This result means that although they ondary structures. ITS1 and ITS2 sequences exhibited sta- are of shorter sequence length, ITS2 sequences require tistically significant differences in the numbers of hairpins more energy than ITS1 to form the secondary structures. (P < 0.001; Table 2; Figure 1). The most frequent number ITS1 sequences exhibited higher values of variable of hairpins for ITS1 was seven (min = 2, max = 8), whereas and informative characters, whereas the ITS2 region the most frequent number of hairpins in ITS2 was three

Table 2 - Mean parameters values analyzed for ITS1 and ITS2 sequences per Passiflora subgenera. Standard deviations for each value are shown in pa- rentheses.

ITS 1 Astrophea [6]* Decaloba [39]* Deidamioides [6]* Passiflora [9]* All sequences [60]* Length (bp) 270.2 (2.3) 271.8 (8.8) 251.2 (19.7) 225.8 (2.5) 262.7 (19.1) Lowest Energy State (LES) -121.8 (5.5) -104.2 (9.4) -104.4 (4.9) -101.8 (6.1) -105.6 (9.8) Total number of Structures 15.8 (3.9) 12.6 (4.1) 16 (4.5) 8.8 (3.8) 12.7 (4.5) No. Hairpins 5.8 (1.6) 5.6 (1.4) 5.2 (2.1) 4.0 (0.9) 5.3 (1.5) No. Paired Nucleotides 168.3 (2.7) 167.7 (10.8) 149.3 (12.6) 135.3 (7.3) 161.1 (15.7) % Paired Nucleotides 62.3 (0.8) 61.7 (2.8) 59.4 (0.5) 59.9 (3.1) 61.3 (2.7) ITS 2 Length (bp) 208.8 (7.3) 208.1 (3.2) 204.5 (5) 195.3 (10.9) 205.9 (7.1) Lowest Energy State (LES) -90.3 (1.5) -74.9 (6.2) -87.9 (7.9) -92.9 (6.1) -80.4 (9.7) Total number of Structures 14.2 (2.2) 11.2 (4) 13.8 (3.6) 17.3 (3.8) 12.7 (4.3) No. Hairpins 2.8 (1.3) 3.3 (0.6) 2.3 (0.8) 2.8 (1) 3.0 (0.8) No. Paired Nucleotides 128.7 (4.5) 123.4 (6.1) 128.7 (4.1) 130.7 (10.5) 125.5 (7.1) % Paired Nucleotides 61.6 (1) 59.3 (3) 62.9 (2.6) 66.8 (2.6) 61.0 (3.8)

*Numbers of analyzed sequences. ITS, internal transcribed spacer; BP, base pairs nrDNA ITS in phylogenetic studies 195

Figure 1 - Frequency and examples of number of hairpins for ITS1 and ITS2 sequences. (a) Frequency of the total number of hairpins for ITS1 (gray) and ITS2 (black) sequences. Secondary structures examples with (b) seven hairpins in ITS1 and (c) three hairpins in ITS2.

(min = 2, max = 5). No differences were observed in the value of LES for ITS1 was observed in Passiflora subgenus percentage of paired nucleotides for ITS1 and ITS2 se- (-101.8 degrees), and for ITS2 in Decaloba (-74.9 degrees; quences (Table 2). Sequences had over 50% of their bases Table 2). There were also statistically significant differ- paired with other nucleotides to form secondary structures ences in the total number of possible secondary structures (Table 2), as observed for Brassicaceae (Edger et al., 2014). among subgenera for ITS1 (P < 0.01) and ITS2 (P < 0.01). Because ITS1 sequences displayed longer lengths than ITS2 and because the sequences exhibited significant dif- For ITS1, Deidamioides showed a higher mean value of ferences according to the number of paired nucleotides, possible alternate secondary structures (16 possible struc- ITS1 presented a higher number of paired nucleotides tures, Table 2), while for ITS2 this was the case for (P < 0.001; Table 2). Passiflora (17 possible structures; Table 2). No statistically significant differences were observed within Decaloba Statistical analyses conducted for the four subgenera when the species were separated according to the super- separately support significant differences between LES values for ITS1 (P < 0.01) and ITS2 (P < 0.001) sequences. sections previously proposed (Feuillet and MacDougal, These differences are not surprising when taking into ac- 2003; Ulmer and MacDougal, 2004). We also observed that count the large observed differences of the sequences a- different sequences types of a species did not exhibit differ- mong the four Passiflora subgenera. The highest mean ences on their secondary structure (data not shown). 196 Giudicelli et al.

Comparison of secondary structures of ITS1, ITS2, ly conserved structural patterns and sequence motifs for the and random sequences to assess selective entire genus, which support the suggestion of selective con- constraints of ITS regions straints for maintaining the secondary structures.

Randomly generated sequences presented -80.2 de- Utility of ITS secondary structures to improve grees for LES (min = -126.7, max = -53.9) on average and a alignments mean of 17.2 possible secondary structures (min = 3, max = 20). In all three analyses comparing ITS and ran- Analyses conducted for the four Passiflora subgenera domly-generated sequences (Figure 2), we found different separately revealed no statistically significant differences values: to combined ITS sequences, LES P < 0.001 and in the total number of hairpins among subgenera. However, number of possible secondary structures P < 0.001; ITS1 although no pattern was observed to characterize each sub- only, LES P < 0.001 and number of possible secondary genus, secondary structures were shown to be a potential structures P < 0.001; and ITS2 only, LES P < 0.001 and tool for improving the quality of alignments and identifying number of possible secondary structures P < 0.001. Those possible sites with non-neutral substitution patterns or sites analyses indicated that Passiflora sequences secondary with correlated evolution. This allows to obtain better re- structures exhibit significantly lower energy states and less sults in phylogenetic inferences for Passiflora, as previ- structures compared with randomly generated sequences ously observed for other plant groups (Gottschling et al., that are similar in length and GC content. These results sug- 2001; Goertzen et al., 2003). gest that Passiflora ITS sequences show characteristics of a The highly conserved motif in the central region of region that is under selective constraint to maintain its sec- the ITS1 sequence previously described for flowering ondary structure. plants (Liu and Schardl, 1994) was also present in the Different secondary structure predictor softwares Passiflora sequences and was helpful in aligning the may propose very distinct structures from the same se- Passiflora sequences for phylogenetic analysis. Other ITS1 quence, even when using the same initial parameters conserved motifs described in other groups were also found (Gottschling and Pltner, 2004). Additionally, thermody- in Passiflora, such as 5’AAGGAA 3’ in the central region namically optimal structures certainly do not reflect the of ITS1 (Liu and Schardl, 1994; Gottschling et al., 2001) structures in cells, because there are many unaccounted and a region rich in adenine and cytosine (Coleman et al., biochemical factors that give a dynamic (temporarily un- 1998). Conserved patterns that had already been described stable) characteristic to the secondary structure of the in angiosperms (Hershkovitz and Zimmer 1996; Mai and rRNA transcripts. However, in addition to the strong differ- Coleman, 1997) and green algae (Mai and Coleman, 1997) ences seen among the Passiflora sequences, there are high- were also observed in Passiflora ITS2 sequences. These

Figure 2 - ITS sequences compared with random sequences. Scatter plot of t lowest energy state values (X-axis) and total number of secondary structures (Y-axis). Squares represent ITS1 sequences; circles represent ITS2 sequences; and triangles represent randomly generated sequences. nrDNA ITS in phylogenetic studies 197 conserved motifs are commonly associated with the forma- References tion of hairpins in secondary structure. Ankenbrand MJ, Keller A, Wolf M, Schultz J and Frster F (2015) Comparisons among species from different subgen- ITS2 database V: Twice as much. Mol Biol Evol era that presented the same number of hairpins showed con- 32:3030-3032. served patterns on secondary structure that could be useful Álvarez I and Wendel JF (2003) Ribosomal ITS sequences and for improving the Passiflora sequence alignments. For ex- plant phylogenetic inference. Mol Phylogenet Evol ample, when comparing the ITS1 region among species 29:417-434. Arnheim N, Krystal M, Schmickel R, Wilson G, Ryder O and from the Astrophea, Decaloba, and Deidamioides subgen- Zimmer E (1980) Molecular evidence for genetic exchanges era with seven hairpins, we observed a common hairpin among ribosomal genes on non-homologous chromosomes with 5-6 paired bases formed by initial (5’-TCGAA) and fi- in man and apes. Proc Natl Acad Sci USA 77:7323-7327. nal (TTCGA-3’) portions. This same motif was also found Baldwin BG, Sanderson MJ, Porter JM, Wojciechowski MF, in ITS1 sequences that presented different numbers of hair- Campbell CS and Donoghue MJ (1995) The ITS region of pins, thus demonstrating the conserved nature of ITS1 se- nuclear ribosomal DNA: A valuable source of evidence on quences and serving as a reason why a universal set of angiosperm phylogeny. Ann Mo Bot Gard 82:247-277. primers could be used to amplify sequences of plants and Bellarosa R, Simeonea MC, Papinib A and Schironea B (2005) fungi (White et al., 1990; Álvarez and Wendel, 2003). The Utility of ITS sequence data for phylogenetic reconstruction analysis of secondary structure also facilitates the identifi- of Italian Quercus spp. Mol Phylogenet Evol 34:355-370. cation of indels when sequences with different number of Buckler-IV ES, Ippolito A and Holtsford TP (1997) The evolution of ribosomal DNA: Divergent paralogues and phylogenetic hairpins are compared, thus contributing to the improve- implications. Genetics 145:821-832. ment of the alignment of sequences with different lengths Cazé ALR, Mäder G, Bonatto SL and Freitas LB (2013) A molec- and high diversity index. ular systematic analysis of Passiflora ovalis and Passiflora The conservation of specific nucleotides in secondary contracta (Passifloraceae). Phytotaxa 132:39-46. structures contributes to positional homology and is a help- Coleman AW, Maria Preparata R, Mehrota B and Mai JC (1998) ful tool to improve the alignments (Keller et al., 2010). Derivation of the secondary structure of the ITS-1 transcript in Volvocales and its taxonomic correlations. Protist Therefore, we believe that analyses of ITS secondary struc- 149:135-146. tures obtained from the Passiflora species could be a valu- Darriba D, Taboada GL, Doallo R and Posada D (2012) able source of information to improve alignment, jModelTest 2: More models, new heuristics and parallel considering the difficulties of aligning species from the dif- computing. Nat Methods 9:772. ferent subgenera (Muschner et al., 2003; Mäder et al., Deginani NB (2001) Las especies argentinas del género Passiflora 2010), and the same strategy could be used in other plant (Passifloraceae). Darwiniana 39:43-129. genera that present a complex taxonomy. Desfeux C and Lejeune B (1996) Systematics of Euromediter- ranean Silene (Caryophyllaceae): Evidence from a phylo- Many Passiflora studies have used ITS sequences to genetic analysis using ITS sequence. C R Acad Sci III estimate phylogenetic relationships, and our results show 319:351-358. that ITS regions are under selective constraints to maintain Drummond AJ, Suchard MA, Xie D and Rambaut A (2012) their secondary structures, likely due to their specific func- Bayesian phylogenetics with 346 BEAUti and the BEAST tions in the splicing process, as was observed for 1.7. Mol Biol Evol 29:1969-1973. Brassicaceae (Edger et al., 2014). The use of ITS as a mo- Dunn IS and Blattner FR (1987) Charons 36 to 40: Multi-enzyme, lecular marker carries some risk and must be done cau- high capacity, recombination deficient replacement vectors tiously, but it presents the advantage of having a great deal with polylinkers and polystuffers. Nucleic Acids Res of sequence data for this locus (Krosnick et al., 2013). 15:2677-2698. Plastid markers are not sufficiently variable in the Edger PP, Tang M, Bird KA, Mayfield DR, Conant G, Mummenhoff K, Koch MA and Pires JC (2014) Secondary Passiflora genus (Muschner et al., 2003; Mäder et al., structure analyses of the nuclear rRNA internal transcribed 2010; Cazé et al., 2013), and ITS secondary structures spacers and assessment of its phylogenetic utility across the could thus be a helpful tool to improve phylogenetic infer- Brassicaceae (Mustards). PLoS ONE 9:e101341. ences and evolutionary studies in this genus and others. Escobar LK (1989) A new subgenus and five new species in Passiflora (Passifloraceae) from South America. Ann Mo Bot Gard 76:877-885. Acknowledgments Fan X, Liu J, Sha LN, Sun GL, Hu ZQ, Zeng J, Kang HY, Zhang HQ, Wang Y, Wang XL, et al. (2014) Evolutionary pattern This work was supported by the Conselho Nacional of rDNA following polyploidy in Leymus (Triticeae: de Desenvolvimento Científico e Tecnológico (CNPq), the Poaceae). Mol Phylogenet Evol 77:296-306. Coordenação de Aperfeiçoamento de Pessoal de Nível Su- Feuillet C and MacDougal JM (2003) A new infrageneric classifi- perior (CAPES), and the Programa de Pós-Graduação em cation of Passiflora L. (Passifloraceae). Passiflora 13:34-38. Genética e Biologia Molecular da Universidade Federal do Fuertes-Aguilar J, Rosselló JA and Nieto-Feliner G (1999) Nu- Rio Grande do Sul (PPGBM-UFRGS). clear ribosomal DNA (nrDNA) concerted evolution in natu- 198 Giudicelli et al.

ral and artificial hybrids of Armeria (Plumbaginaceae). Mol Kress WJ, Prince LM and Williams KJ (2002) The phylogeny and Ecol 8:1341-1346. a new classification of the gingers (Zingiberaceae): Evi- Ghada B, Ahmed BA, Messaoud M and Amel SH (2013) Genetic dence from molecular data. Am J Bot 89:1682-1696. diversity and molecular evolution of the internal transcribed Krosnick SE and Freudenstein JV (2005) Monophyly and floral spacer (ITSs) of nuclear ribosomal DNA in the Tunisian fig character homology of Old World Passiflora (Subgenus cultivars (Ficus carica L.; Moracea). Biochem Syst Ecol Decaloba: Supersection Disemma). Syst Bot 30:139-152. 48:20-33. Krosnick SE, Porter-Utley KE, MacDougal JM, Jørgensen PM Giudicelli GC, Mäder G and Freitas LB (2015) Efficiency of ITS and McDade LA (2013) New insights into the evolution of sequences for DNA barcoding in Passiflora Passiflora subgenus Decaloba (Passifloraceae): Phylogen- (Passifloraceae). Int J Mol Sci 16:7289-7303. etic relationships and morphological synapomorphies. Syst Bot 38:692-713. Goertzen LR, Cannone JJ, Gutell RR and Jansen RK (2003) ITS secondary structure derived from comparative analysis: Im- Kruskal WH and Wallis WA (1952) Use of ranks in one-criterion plications for sequence alignment and phylogeny of the variance analysis. J Am Stat Assoc 47:583-621. Asteraceae. Mol Phylogenet Evol 29:216-234. Li D and Zhang X (2002) Physical localization of the 18S- 5.8S-26S rDNA and sequence analysis of ITS regions in Gottschling M and Pltner J (2004) Secondary structure models of Thinopyrum ponticum (Poaceae, Triticaceae): Implications the nuclear internal transcribed spacer regions and 5.8S for concerted evolution. Ann Bot 90:445-452. rRNA in Calciodinelloideae (Peridiniaceae) and other dino- Librado P and Rojas J (2009) DnaSP v5: A software for compre- flagellates. Nucleic Acids Res 32:307-315. hensive analysis of DNA polymorphism data. Bio- Gottschling M, Hilger HH, Wolf M and Diane N (2001) Second- informatics 25:1451-1452. ary structure of the ITS1 and its application in a reconstruc- Liu JS and Schardl CL (1994) A conserved sequence in internal tion of the phylogeny of Boraginales. Plant Biol 3:629-636. transcribed spacer 1 of plant nuclear rRNA genes. Plant Mol Guindon S and Gascuel O (2003) A simple, fast and accurate Biol 26:775-778. method to estimate large phylogenies by maximum- Lorenz-Lemke AP, Muschner VC, Bonatto SL, Cervi AC, likelihood. Syst Biol 52:696-704. Salzano FM and Freitas LB (2005) Phylogeographic infer- Hansen AK, Gilbert LE, Simpson BB, Downie SR, Cervi AC and ences concerning evolution of Brazilian Passiflora actinia Jansen RK (2006) Phylogenetic relationships and chromo- and P. elegans (Passifloraceae) based on ITS (nr DNA) vari- some number evolution in Passiflora. Syst Bot 31:138-150. ation. Ann Bot 95:799-806. Hansen AK, Escobar LK, Gilbert LE and Jansen RK (2007) Pater- Mäder G, Zamberlan PM, Fagundes NJR, Magnus T, Salzano nal, maternal, and biparental inheritance of the chloroplast FM, Bonatto SL and Freitas LB (2010) The use and limits of genome in Passiflora (Passifloraceae): Implications for ITS data in the analysis of intraspecific variation in phylogenetic studies. Am J Bot 94:42-46. Passiflora L. (Passifloraceae). Genet Mol Biol 33:99-108. Harpke D and Peterson A (2006) Non-concerted ITS evolution in Mai JC and Coleman AW (1997) The internal transcribed spacer 2 Mammillaria (Cactaceae). Mol Phylogenet Evol exhibits a common secondary structure in green algae and 41:579-593. flowering plants. J Mol Evol 44:258-271. Hershkovitz MA and Zimmer EA (1996) Conservation patterns in Melo NF and Guerra M (2003) Variability of the 5S and 45S angiosperm rDNA ITS2 sequences. Nucleic Acids Res rDNA sites in Passiflora L. species with distinct base chro- 24:2857-2867. mosome numbers. Ann Bot 92:309-316. Hillis DM and Dixon MT (1991) Ribosomal DNA: Molecular Muschner VC, Lorenz AP, Cervi AC, Bonatto SL, Souza-Chies evolution and phylogenetic inference. Q Rev Biol TT, Salzano FM and Freitas LB (2003) A first molecular 66:411-453. phylogenetic analysis of Passiflora (Passifloraceae). Am J Bot 90:1229-1238. Kan XZ, Wang SS, Ding X and Wang XQ (2007) Structural evo- Muschner VC, Zamberlan PM, Bonatto SL and Freitas LB (2012) lution of nrDNA ITS in Pinaceae and its phylogenetic impli- Phylogeny, biogeography and divergent times in Passiflora cations. Mol Phylogenet Evol 44:765-777. (Passifloraceae). Genet Mol Biol 35:1036-1043. Keller A, Frster F, Müller T, Dandekar T, Schultz J and Wolf M Musters W, Boon K, van der Sande CA, van Heerikhuizen H and (2010) Including RNA secondary structures improves accu- Planta RJ (1990) Functional analysis of transcribed spacers racy and robustness in reconstruction of phylogenetic trees. of yeast ribosomal DNA. EMBO J 9:3989-3996. Biol Direct 5:4. Queiroz CS, Batista FRC and Oliveira LO (2011) Evolution of the Killip EP (1938) The American Species of Passifloraceae. Botani- 5.8S nrDNA gene and internal transcribed spacers in cal Series, Field Museum of Natural History, Chicago, 613 Carapichea ipecacuanha (Rubiaceae) within a phylogeo- p. graphic context. Mol Phylogenet Evol 59:293-302. Koch MA, Dopes C and Mitchell-Olds T (2003) Multiple hybrid R Development Core Team (2011) R: A language and environ- formation in natural populations: Concerted evolution of the ment for statistical computing. R Foundation for Statistical Internal Transcribed Spacer of nuclear ribosomal DNA Computing. Vienna, Austria. Available online at (ITS) in North American Arabis divaricarpa (Brassicaceae). http://www.R-project.org/. Mol Biol Evol 20:338-350. Ramaiya SD, Bujang JS and Zakaria MH (2014) Genetic diversity Koehler-Santos P, Lorenz-Lemke A, Muschner VC, Bonatto SL, in Passiflora species assessed by morphological and ITS se- Salzano FM and Freitas LB (2006) Molecular genetic varia- quences analysis. Sci World J 2014:598313. tion in (Passifloraceae), an invasive species Rauscher JT, Doyle JJ and Brown AHD (2004) Multiple origins in southern Brazil. Biol J Linn Soc Lond 88:611-630. and nrDNA Internal Transcribed Spacer homeologue evolu- nrDNA ITS in phylogenetic studies 199

tion in the Glycine tomentella (Leguminosae) allopolyploid maturase K in Passißora. Mol Phylogenet Evol. complex. Genetics 166:987-998. 31:397-402. Reuster JS and Mathews DH (2010) RNAstructure: Software for Yockteng R and Nadot S (2004b) Phylogenetic relationships RNA secondary structure prediction and analysis. BMC among Passiflora species based on the glutamine synthase Bioinformatics 11:129. nuclear gene expressed in chloroplast (ncpGS). Mol Roy A, Frascaria N, Mackay J and Bousquet J (1992) Segregating Phylogenet Evol 31:379-396. random amplified polymorphic DNAs (RAPDs) in Betula Young ND and Healy J (2003) GapCoder automates the use of alleghaniensis. Theor Appl Genet 85:173-180. indel characters in phylogenetic analysis. BMC Shultz J, Maisel S, Gerlach D, Müller T and Wolf M (2005) A Bioinformatics 4:6. common core of secondary structure of the internal tran- Zimmer EA, Roalson EH, Skog LE, Boggan JK and Idnurm A scribed spacer 2 (ITS2) throughout the Eukaryota. RNA (2002) Phylogenetic relationships in the Gesnerioideae 11:361-364. (Gesneriaceae) based on nrDNA ITS and cpDNA trnL-F Simmons MP and Ochoterena H (2000) Gaps as characters in se- and trnE-T spacer region sequences. Am J Bot 89:296-311. quence based phylogenetic analyses. Syst Biol 49:369-381. Tamura K, Stecher G, Peterson D, Filipski A and Kumar S (2013) Supplementary material MEGA6: Molecular Evolutionary Genetics Analysis ver- The following online material is available for this ar- sion 6.0. Mol Biol Evol 30:2725-2729. ticle: Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F and Higgins Figure S1 - Phylogenetic trees based on ITS se- DG (1997) The CLUSTALX windows interface: Flexible strategies for multiple sequences alignment aided by quality quences for Astrophea, ITS1. analysis tools. Nucleic Acids Res 25:4876-4882. Figure S2 - Phylogenetic trees based on ITS se- Ulmer T and MacDougal JM (2004) Passiflora - Passionflowers quences for Astrophea, ITS2. of the World. Timber Press, Portland, 430 p. Figure S3 - Phylogenetic trees based on ITS se- van der Sande CA, Kwa M, Nues RW van, Heerikhuizen H van, quences for Decaloba, ITS1. Raué HA and Planta RJ (1992) Functional analysis of inter- Figure S4 - Phylogenetic trees based on ITS se- nal transcribed spacer 2 of Saccharomyces cerevisiae ribo- quences for Decaloba, ITS2. somal DNA. J Mol Biol 223:899-910. Figure S5 - Phylogenetic trees based on ITS se- Wendel JF, Schnabel A and Seelanan T (1995) Bidirectional quences for Deidamioides, ITS1. interlocus concerted evolution following allopolyploid Figure S6. Phylogenetic trees based on ITS sequences speciation in cotton (Gossypium). Proc Natl Acad SciUSA for Deidamioides, ITS2. 92:280-284. Figure S7. Phylogenetic trees based on ITS sequences White TJ, Bruns T, Lee S and Taylor J (1990) Amplication and di- rect sequencing of fungal ribosomal RNA genes for for Passiflora, ITS1. phylogenetics. In: Innis M, Gelfand D, Sninsky J and White Figure S8 - Phylogenetic trees based on ITS se- T (eds) PCR Protocols: A Guide to Methods and Applica- quences for Passiflora, ITS2. tions. Academic Press, San Diego, pp 315-322. Table S1 - ITS1 and ITS2 sequences parameters ana- Wissemann V and Ritz CM (2005) The genus Rosa (Rosoideae, lyzed Rosaceae) revisited: Molecular analysis of nrITS-1 and Table S2 - Random sequences parameters analyzed. atpB-rbcL intergenic spacers (IGS) versus conventional tax- onomy. Bot J Linn Soc Lond 147:275-290. Associate Editor: Marcio de Castro Silva Filho Yockteng R and Nadot S (2004a) Infrageneric phylogenies: A License information: This is an open-access article distributed under the terms of the comparison of chloroplast-expressed glutamine synthetase, Creative Commons Attribution License (type CC-BY), which permits unrestricted use, cytosol-expressed glutamine synthetase and cpDNA distribution and reproduction in any medium, provided the original article is properly cited.