
Org. Divers. Evol. 3, 1–12 (2003) © Urban & Fischer Verlag http://www.urbanfischer.de/journals/ode

Application of spectral analysis to examine phylogenetic signal among euglenid SSU rDNA data sets (Euglenozoa)

Ingo Busse, Angelika Preisfeld*

Fakultät für Biologie, Universität Bielefeld, Germany

Received 28 May 2002 · Accepted 17 September 2002

Abstract

Euglenid flagellates, a common and widespread group of protists, display a broad morphological variety. Against the background of pronounced genetic diversity and varying sequence characteristics of SSU rDNA sequences among different euglenid subgroups, we analyzed the content and distribution of phylogenetic signal and noise within different euglenozoan data sets. Two statistical approaches, PTP-test and RASA, were employed to achieve a measure of overall signal content. Spectral analyses were used to evaluate support and conflict for given bipartitions of the data sets. These investigations revealed a large amount of phylogenetic information present in the molecular data. Convincing support could be found for primary osmotrophic euglenids and corresponding subgroups, a taxon mainly based on molecular data. On the other hand, in agreement with weak corroboration from morphological data, euglenid monophyly and interrelationships of phagotrophs, phototrophs and osmotrophs were not supported. Focusing on the primary osmotrophic subclade Rhabdomonadina, spectral analysis revealed only few well supported splits. Generally, the application of sequence evolution models in maximum likelihood and spectral analyses of euglenid SSU rDNA data sets did not lead to significant amplification of split supporting signal. Phylogenetic hypotheses are discussed in regard to the evolution of morphological and ultrastructural characters.

Key words: SSU rDNA, molecular phylogeny, euglenids, primary osmotrophs, phylogenetic signal, spectral analysis

See also Electronic Supplement at http://www.senckenberg.de/odes/03-01.htm

Introduction

Euglenid flagellates represent one of few evolutionary lineages comprising three generally different modes of nutrition: (1) Phagotrophs (Sphenomonadina and Heteronematina) employ ingestion devices for uptake of bacterial or eukaryotic prey, depending on cytostome architecture and complexity. Since phagotrophy can be found in closely related bodonids and diplonemids as well, it is assumed to be the plesiomorphic state. (2) Phototrophic taxa (Euglenina and Eutreptiina) acquired their plastids by means of secondary endocytobiosis. Presumably, a phagotrophic ancestor with a complex feeding apparatus ingested a green alga and subsequently established the chloroplast (Gibbs 1978). Species like Astasia longa or Khawkinea quartana have lost the ability of photosynthesis (Linton et al. 1999, 2000) and secondarily developed osmotrophic nutrition. (3) Primary osmotrophs possess neither ingestion apparatuses nor plastids, feeding on pinocytotic uptake of nutrients via the plasma membrane of the reservoir.

Based on morphological and ultrastructural investigations it turned out to be difficult to specify unambiguous autapomorphies supporting euglenid monophyly. Presently, debated candidates are the euglenid pellicle composed of the plasma membrane, underlying proteinaceous epiplasmatic strips accompanied by microtubules, and possession of paramylon grains, the beta-1,3-glucan storage product (Leander et al. 2001).

Similarities in ultrastructural features of the mitotic, feeding, and flagellar apparatuses led to the conclusion of close phylogenetic relations of euglenids to diplonemids and kinetoplastids, the latter comprising bodonids and generally parasitic trypanosomatids (Cavalier-Smith

*Corresponding author: Angelika Preisfeld, Fakultät für Biologie, Universität Bielefeld, Postfach 100 131, D-33501 Bielefeld, Germany; e-mail: [email protected]

1439-6092/03/03/01-001 $ 15.00/0

1981, 1993, Kivic & Walne 1984, Triemer & Farmer 1991, Simpson 1997). The introduction of SSU rDNA data for phylogenetic inference confirmed the monophyly of euglenids, kinetoplastids, and diplonemids forming the Euglenozoa (Sogin et al. 1989, Maslov et al. 1999, Busse & Preisfeld 2002a). The last common ancestor of Euglenozoa probably was a phagotrophic biflagellate with two emergent flagella and a simple feeding apparatus (Kivic & Walne 1984). Further analyses verified a single origin of phototrophic euglenids and suggested a sister-group relation of phototrophs to Peranema trichophorum, a phagotrophic species with a complex ingestion apparatus, which is supposed to be capable of ingesting eukaryotic prey. Petalomonas cantuscygni, possessing a comparatively simple ingestion device, builds the base of the euglenid subtree (e.g., Montegut-Felkner & Triemer 1997, Linton et al. 1999, 2000). Another euglenid clade, the primary osmotrophs comprising the genus Distigma, some Astasia species and the suborder Rhabdomonadina, was discovered based mainly on molecular data (Müllner et al. 2001, Busse & Preisfeld 2002b).

Investigations of euglenid SSU rDNA sequences revealed pronounced genetic diversity compared to other taxa (Montegut-Felkner & Triemer 1997, Preisfeld et al. 2001, Busse & Preisfeld 2002a). Additionally, critical investigations of sequence characteristics showed significantly heterogeneous nucleotide distribution, large amounts of rate variation among lineages, and positional rate variation at different branches of the tree (Busse & Preisfeld 2002a, b).

These features motivated the study of phylogenetic signal content of extremely heterogeneous euglenozoan SSU rDNA data sets and its support for different nodes of the tree. The central question was whether phylogenetic hypotheses based on molecular data were founded on phylogenetic signal representing genealogical history, or were due to random noise resulting in conflicting patterns of character distribution. We investigated the branching order among major euglenozoan and euglenid subgroups, and subsequently concentrated on the Rhabdomonadina as an example of a morphologically diverse and molecularly well sampled group of primary osmotrophic euglenids. Spectral analyses were used to investigate and illustrate the distribution of phylogenetic signal and noise without being limited to a single tree topology.

Material and methods

Organisms and molecular cloning

Rhabdomonas and Menoidium species were obtained from the Culture Collection of Algae and Protozoa (CCAP) at the CEH Windermere, UK, and from the Culture Collection of Algae at the University of Göttingen, Germany (SAG): Menoidium obtusum CCAP 1247/4 (GenBank accession number AF403155), Menoidium sp. CCAP 1247/6 (AY083243), Rhabdomonas costata SAG 1271-1 (AF403150), Rhabdomonas incurva SAG 1271-8 (AF403151). Menoidium pellucidum (AF403156; Frantz et al. 2000) was kindly provided by Geneviève Bricheux. Cultures were grown in the dark at 20 °C on soil water medium with pea.

Cells were harvested by centrifugation, ground in liquid nitrogen, and subsequently total cell DNA was extracted using the DNeasy DNA Purification Kit (Qiagen). Nearly complete SSU rDNA was amplified according to the PCR protocols described before (Preisfeld et al. 2000, Busse & Preisfeld 2002c), using both universal eukaryotic and specific primers: AP7 (forward) 5′ gtc ata tgc tty ktt caa ggr cta agc c 3′, AP8 (reverse) 5′ tca cct aca gcw acc ttg tta cga c 3′, AP13 (forward) 5′ atg cag tga gca tcc act tgt 3′, AP14 (reverse) 5′ cct cct cca gca atg aga t 3′. PCR products were cloned into pCR 2.1 vector using TOPO TA Cloning Kit (Invitrogen). Sequencing in both directions was conducted with standard M13 primers as well as with additional internal primers.

Alignments and phylogenetic analyses

New sequences of Rhabdomonas and Menoidium were manually aligned to a euglenozoan data set. As a guide, available secondary structure models of various euglenid, kinetoplastid and outgroup sequences from the rRNA database (Wuyts et al. 2002) were used to facilitate alignment. Only unambiguously homologous positions of SSU rRNA core regions were retained for phylogenetic inference. The first data set comprised a broad range of primary osmotrophic and phototrophic euglenids together with two available phagotrophic species. The other euglenozoan subgroups, kinetoplastids and diplonemids, were represented by five species each. A diverse set of outgroup sequences was used because closest relatives of Euglenozoa are still unknown, except for outgroups that are distantly related to the ingroup taxa. Two additional, reduced data sets were created for spectral analysis (see below). All data sets are available from an Organisms Diversity and Evolution Electronic Supplement (http://www.senckenberg.de/odes/03-01.htm).

Data sets were analyzed using PAUP* version 4.0b8 (Swofford 2001). Neighbor joining (p-distances) and maximum parsimony approaches were used for initial analyses to avoid the influence of sequence evolution models that correct for multiple substitutions. The following options were employed for heuristic tree search: the starting tree was obtained via stepwise addition of taxa in random order, with 500 replications for maximum parsimony and 15 replications for maximum likelihood analyses; the tree-bisection-reconnection (TBR) algorithm was used for branch swapping; the steepest descent option was not in effect, while "MulTrees" was enforced. In order to evaluate the best fit model of sequence evolution for maximum likelihood analyses, a hierarchical likelihood ratio test (hLRT) and the Akaike information criterion (AIC), as implemented in Modeltest 3.06 (Posada & Crandall 1998), were employed. Parameters for ML analyses for all data sets analysed are given in Table 1.

Nonparametric bootstrapping was performed to assess statistical significance for support of internal nodes (Felsenstein 1985).
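The resampling step behind nonparametric bootstrapping can be illustrated with a minimal Python sketch. It only shows how one pseudosample is drawn (alignment columns sampled with replacement); tree inference on each pseudosample, as done with PAUP* in this study, is not reproduced. The function name, the toy alignment and the taxon labels are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Draw one bootstrap pseudosample from an alignment.

    alignment: dict mapping taxon name -> aligned sequence (all equal length).
    Columns are sampled with replacement, so the pseudosample has the
    same number of columns as the original alignment.
    """
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {taxon: "".join(seq[c] for c in cols)
            for taxon, seq in alignment.items()}

# Hypothetical toy alignment (not data from the study).
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "TCGAAC"}
rng = random.Random(1)  # fixed seed so the sketch is repeatable
pseudo = bootstrap_alignment(toy, rng)
print(pseudo)
```

In a full analysis, a tree would be inferred from each of many such pseudosamples, and the bootstrap value of a node would be the percentage of pseudosample trees containing it.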


Table 1. Maximum likelihood parameters according to hLRT and AIC as determined with Modeltest for all data sets analyzed. G: gamma shape parameter alpha; GTR: general time reversible model (Rodriguez et al. 1990); I: proportion of invariable sites; TrNef: model after Tamura & Nei (1993), assuming equal base frequencies (ef).

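Data set 1: Euglenozoa, large. Data set 2: Euglenozoa, reduced. Data set 3: Rhabdomonadina.

| Parameter | Data set 1, hLRT | Data set 1, AIC | Data set 2, hLRT | Data set 2, AIC | Data set 3, hLRT | Data set 3, AIC |
|---|---|---|---|---|---|---|
| Model selected | GTR+I+G | GTR+I+G | TrNef+I+G | GTR+I+G | TrNef+I+G | GTR+I+G |
| ML score (-ln) | 22528.65097 | 22528.65097 | 10399.33378 | 10380.52084 | 7367.60979 | 7341.93883 |
| Base frequency A | 0.2526 | 0.2526 | equal | 0.2567 | equal | 0.2345 |
| Base frequency C | 0.2105 | 0.2105 | equal | 0.2129 | equal | 0.2333 |
| Base frequency G | 0.2901 | 0.2901 | equal | 0.2879 | equal | 0.2988 |
| Base frequency T | 0.2468 | 0.2468 | equal | 0.2425 | equal | 0.2334 |
| Rate matrix R[A-C] | 1.5075 | 1.5075 | 1.0000 | 1.3465 | 1.0000 | 1.1988 |
| Rate matrix R[A-G] | 2.6853 | 2.6853 | 2.6430 | 2.4490 | 2.9131 | 2.9148 |
| Rate matrix R[A-T] | 1.3463 | 1.3463 | 1.0000 | 1.1962 | 1.0000 | 1.8233 |
| Rate matrix R[C-G] | 0.6958 | 0.6958 | 1.0000 | 0.7375 | 1.0000 | 0.5539 |
| Rate matrix R[C-T] | 4.7930 | 4.7930 | 4.0403 | 4.9243 | 4.8289 | 5.6460 |
| Rate matrix R[G-T] | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Pinvar (I) | 0.1744 | 0.1744 | 0.1861 | 0.1809 | 0.2403 | 0.2284 |
| Gamma shape (G) | 0.7929 | 0.7929 | 0.7896 | 0.7719 | 0.5262 | 0.5106 |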
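As a worked illustration of the AIC comparison summarized in Table 1, the following Python sketch applies the standard definition AIC = 2k - 2 ln L to the two -ln L scores listed for data set 2. The parameter counts k (4 for TrNef+I+G, 10 for GTR+I+G) are an assumption based on the usual counting of free substitution-model parameters. Branch lengths are left out because they would add the same constant to both models. The sketch also assumes the two scores are directly comparable, and it is not a reproduction of Modeltest's internal procedure.

```python
def aic(neg_log_likelihood, n_free_params):
    """Akaike information criterion: AIC = 2k - 2 ln L = 2k + 2(-ln L)."""
    return 2.0 * n_free_params + 2.0 * neg_log_likelihood

# -ln L values taken from Table 1, data set 2 (Euglenozoa, reduced).
neg_lnl = {"TrNef+I+G": 10399.33378, "GTR+I+G": 10380.52084}

# Assumed numbers of free model parameters (not stated in the paper):
# TrNef: 2 rates; GTR: 5 rates + 3 base frequencies; +I and +G add 1 each.
n_params = {"TrNef+I+G": 4, "GTR+I+G": 10}

aic_values = {m: aic(neg_lnl[m], n_params[m]) for m in neg_lnl}
best = min(aic_values, key=aic_values.get)

for model, value in sorted(aic_values.items(), key=lambda kv: kv[1]):
    print(f"{model}: AIC = {value:.2f}")
print("lowest AIC:", best)
```

Under these assumptions the lower AIC belongs to GTR+I+G, in line with the AIC column of Table 1 for data set 2.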

The two-cluster test implemented in Phyltest 2.0 (Kumar 1996) was used to evaluate hypotheses of rate constancy among lineages. For this purpose uncorrected Hamming distances were used.

Estimation of phylogenetic signal content

Two smaller data sets were generated comprising 18 and 17 taxa, respectively, since Spectrum 2.3.0 (Charleston 1998) used for spectral analysis is limited in number of input sequences. One data set included representatives of all euglenozoan and especially euglenid subgroups, the other only rhabdomonadid sequences. Regarding the latter data set, 416 additional positions could be unambiguously homologized due to the close relationship of taxa. Two a priori tests for examination of phylogenetic signal content were conducted: Permutation Tail Probability Test (PTP; Archie 1989, Faith 1989, Faith & Cranston 1991) and Relative Apparent Synapomorphy Analysis (online version of RASA; Lyons-Weiler et al. 1996, Lyons-Weiler & Hoelzer 1997). The purpose of RASA is to provide a critical test of the assumption that the combined influences of the processes of evolution and sampling of both taxa and characters have resulted in a distribution of character states among taxa that is reflective of genealogical relationships. In addition, the taxon variance plot offered by RASA compares the relative contribution of cladistic and phenetic variance by individual taxa, and hence allows for the identification of putative long branch taxa (Lyons-Weiler & Hoelzer 1997). In contrast to other analyses we employed RASA to analyze characteristics of the data set rather than to manipulate the taxon sampling by removing putative long branch taxa.

The number of unique substitutions as a measure of the amount of autapomorphies in terminal taxa was determined with PAUP* by deleting single terminal taxa and recording the increase in the number of constant sites.

Spectral analysis

Spectral analysis of phylogenetic data allows the investigation of phylogenetic signal within a given data set without restricting to a single "optimal" tree. For a set of n aligned sequences there are 2^(n-1) possible splits (i.e. every possible bipartition of the taxa in two subsets). A phylogenetic tree represents at most (2n-3) splits, of which n are terminal (also called empty or pendant) and (n-3) are internal splits. Spectral analyses were performed with Spectrum 2.3.0 (Charleston 1998). The spectra were plotted as Lento diagrams (Lento et al. 1995) based on uncorrected distances using Hadamard conjugation (Hendy & Penny 1993). Splits are ordered from left to right by their support. Above the x-axis the support for a given bipartition is indicated in units of expected number of substitutions per site separating both subsets. When phylogenetic information is low or absent, conflicting statements of relationships exist in the matrix. Conflict is given below the x-axis as the sum of support for all bipartitions not compatible with the split under study. In contrast to other studies (e.g. Lento et al. 1995), conflict was not normalized, i.e. reduced by certain factors. Net support for a given split is the difference between support and conflict. Empty splits represent the amount of evolutionary change along terminal branches that do not support any groupings among taxa. In further analyses influences of different models of sequence evolution and varying proportions of invariable sites were examined.
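The definitions of conflict and net support given above can be made concrete with a short Python sketch. It takes a set of splits with already computed support values and derives, for each split, the non-normalized conflict (sum of support of all incompatible splits) and the net support. It does not perform Hadamard conjugation and is not the Spectrum program. The taxon labels, support values and function names are hypothetical, and the compatibility rule used is the standard one: two splits are compatible if at least one of the four pairwise intersections of their sides is empty.

```python
def normalize(side, taxa):
    """Represent a bipartition by the side that lacks the first taxon."""
    side = frozenset(side)
    return frozenset(taxa) - side if taxa[0] in side else side

def compatible(a, b, taxa):
    """Splits are compatible if one of the four side intersections is empty."""
    everything = frozenset(taxa)
    a_other, b_other = everything - a, everything - b
    return any(not (x & y) for x in (a, a_other) for y in (b, b_other))

def net_support(spectrum, taxa):
    """Map each split to (support, conflict, net support).

    Conflict is the plain (non-normalized) sum of support over all
    splits in the spectrum that are incompatible with the split.
    """
    table = {}
    for split, support in spectrum.items():
        conflict = sum(value for other, value in spectrum.items()
                       if not compatible(split, other, taxa))
        table[split] = (support, conflict, support - conflict)
    return table

# Hypothetical five-taxon spectrum (values are illustrative only).
taxa = ["A", "B", "C", "D", "E"]
spectrum = {
    normalize({"A", "B"}, taxa): 0.05,       # AB | CDE
    normalize({"A", "B", "C"}, taxa): 0.03,  # ABC | DE
    normalize({"A", "C"}, taxa): 0.01,       # AC | BDE, conflicts with AB | CDE
}
table = net_support(spectrum, taxa)
for split, (support, conflict, net) in table.items():
    print(sorted(split), support, conflict, round(net, 4))
```

In this toy case the split AC | BDE receives negative net support because the incompatible split AB | CDE carries more support, which is the situation described for poorly supported bipartitions in a Lento plot.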


To visualize the results of spectral analysis in a tree fashion the corresponding Manhattan Tree was generated with Spectrum. The Manhattan Tree is that tree whose expected spectrum is the closest by Manhattan distance to the spectrum obtained from the data. The Manhattan Tree tends to be very similar to the Closest Tree (Hendy 1991).

Results

Phylogenetic inference and sequence characteristics

Five SSU rDNA sequences from the genera Menoidium and Rhabdomonas were obtained and analyzed together with an overall euglenozoan data set. The neighbor joining tree inferred from uncorrected distances (Fig. 1) shows generally well known clades: diplonemids and kinetoplastids are monophyletic, whereas euglenid monophyly is doubtful, with interrelationships of these groups being poorly resolved. Within the euglenid subtree phototrophs and primary osmotrophic euglenids are well supported as monophyletic groups. However, exact placement of the phagotrophs Peranema trichophorum and Petalomonas cantuscygni is less clear. The phototrophic clade is divided into two lineages: Eutreptiina and Euglenina. Within primary osmotrophs monophyletic Rhabdomonadina are in close relation to some Astasia species. The Distigma curvatum group builds the sistergroup to this clade and Distigma proteus can be found at the base of primary osmotrophs. Newly analyzed strains of Rhabdomonas incurva and R. costata cluster together with previously published sequences of the same species. Strains of R. costata exhibit nearly identical SSU rDNA sequences, whereas R. incurva strains differ notably (4.1% observed differences). R. incurva and R. costata together with Gyropaigne lefevrei form a weakly supported clade termed R. incurva clade in the following. Ribosomal small subunit sequences from Menoidium pellucidum and M. cultellus are nearly identical, likewise M. intermedium, M. obtusum, and M. sp. are only slightly different. Together with M. bibacillatum these sequences form a well supported group of very similar sequences, informally referred to as ‘Eumenoidium’ (but not sensu Huber-Pestalozzi 1955). Monophyly of the genus Menoidium including M. gibbum is poorly supported.

Maximum likelihood analysis considering estimated base frequencies and among-site rate variation (parameters according to hLRT and AIC, see Table 1) and maximum parsimony resulted in essentially the same tree topology (not shown). Differences concerned nodes which were previously poorly supported (Fig. 1) by decay indices, e.g. the placement of diplonemids and kinetoplastids and the internal phylogeny of phototrophic euglenids. Within the Rhabdomonadina Menoidium appeared paraphyletic due to the shift of M. gibbum to a clade built by Rhabdomonas intermedia and Parmidium (ML) or to the base of the Rhabdomonadina (MP), respectively. Generally, although hLRT and AIC in some cases preferred different models of sequence evolution (Table 1), subsequent tree searches always led to exactly the same topology. Application of the LogDet distance transformation to account for nonstationary base frequencies did not change the tree shape at all (not shown).

As in earlier studies (Preisfeld et al. 2001, Busse & Preisfeld 2002a, b) we investigated SSU rDNA sequence specifics of the data set (data not shown): euglenids are characterized by a pronounced sequence divergence. Especially sequences of primary osmotrophic euglenids as well as Petalomonas cantuscygni contribute to the overall molecular diversity. Nucleotides are significantly heterogeneously distributed among taxa (chi-square = 345.43, df = 219, P = 0.00). A relative rate test revealed a large amount of substitution rate variation among lineages, whereupon primary osmotrophs are consistently identified as fast evolving.

Phylogeny of Euglenozoa: examination of phylogenetic signal content

The initial data set was reduced to 18 taxa representing all major clades within euglenids and euglenozoans. 549 characters out of 1,117 were parsimony informative, base frequencies were still varying among lineages (P = 0.34). Initially, two well known tests were performed to investigate phylogenetic signal content of the whole data set. Consistently, PTP (P < 0.01) and RASA (tRASA = 16.96) indicated a significant amount of phylogenetic signal in the data set. Taxon variance ratios and evaluation of the number of unique substitutions (Fig. 2) were examined to screen for problematic taxa, e.g. possible long branches. The results confirmed the heterogeneity of the data set and consistently indicated that especially Petalomonas cantuscygni and Distigma proteus were likely to be susceptible to long branch attraction phenomena. Nevertheless, deletion of P. cantuscygni and D. proteus in order to avoid presumed long branches seems not appropriate, because they are key taxa regarding

Fig. 1. Neighbor joining phylogram based on an uncorrected distance matrix calculated with PAUP*. NJ and MP bootstrap values derived from 1000 pseudosamples are given above and below branches, respectively. Bootstrap values for internal nodes of phototrophic euglenids, Distigma curvatum group, diplonemids, kinetoplastids and outgroup not shown. New sequences are in boldface.


monophyly of euglenids and of primary osmotrophs, respectively.

Spectral analysis of a euglenozoan data set

The resulting spectrum based on uncorrected Hamming distances together with the corresponding Manhattan tree is presented in Figure 3. The topology is nearly identical to the one in Figure 1, with the exceptions of the placement of P. cantuscygni at the base of the Euglenozoa and the sistergroup relation of diplonemids and kinetoplastids; corresponding nodes were insignificantly supported in initial analyses. Maximum likelihood analyses using parameters according to hLRT and AIC, respectively (Table 1), revealed exactly the same topology (not shown). Use of different outgroups (e.g. Chlamydomonas moewusii and Proceraea cornuta) did not affect tree shape either.

The spectrum revealed all fourteen high-scoring bipartitions (a-o) representing non-trivial splits compatible with each other and with the tree in Figure 3a. The first eleven splits displayed unequivocal positive net support, indicating that corresponding branches were unambiguously supported without significant amount of contradicting signal. Only the node connecting diplonemids and kinetoplastids (p) was placed among incompatible splits. Highest support values could be found for kinetoplastids (a), Euglenozoa (b), and diplonemids (c). Likewise, primary osmotrophs and subgroups (d, f, g, h, i, l) as well as phototrophs (e, k) exhibit more support than conflict. On the other hand, the splits m (monophyly of primary osmotrophs, phototrophs and P. trichophorum), n (separating P. cantuscygni from all other Euglenozoa), and o (monophyly of phototrophs and P. trichophorum) should be treated with caution because net support is zero or even negative. This demonstrates that there is no unambiguous phylogenetic signal for the monophyly of euglenids, nor for the interrelationships of phototrophic, osmotrophic and phagotrophic taxa. However, these results are based on the reduced data set comprising only 18 euglenozoan taxa altogether. An attempt to model the ‘true’ underlying evolutionary process should lead to reduction of noise and amplification of true signal. But use of LogDet or maximum likelihood distances (according to hLRT and AIC) for spectral analysis did not change the shape of the spectrum significantly. In particular, poorly supported nodes were not improved.

Additionally, we examined a data set used by Maslov et al. (1999) to investigate phylogenetic placement of Diplonema (available at http://www.rna.ucla.edu/trypanosome/alignments.html, alignment #9). Their analyses revealed 94% bootstrap support for monophyly of euglenids, comprising Euglena gracilis, Lepocinclis ovata, Khawkinea quartana and Petalomonas cantuscygni. Spectral analysis (data not shown) revealed only poor support for this node, with an equivalent amount of conflicting patterns indicating artificially high bootstrap support due to unbalanced taxon sampling and inclusion of noisy positions for phylogenetic inference (1,349 positions included).

Fig. 2. Taxon variance plot (a) and examination of unique substitutions (b) for the reduced data set of euglenozoan sequences.
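The unique-substitution count shown in panel (b) is described in the methods as deleting single terminal taxa and recording the increase in the number of constant sites. A minimal Python sketch of that counting rule follows, assuming a gap-free toy alignment with hypothetical taxon names. It is not PAUP*'s implementation and ignores gaps and ambiguity codes.

```python
def constant_sites(sequences):
    """Count alignment columns in which all sequences share one state."""
    return sum(1 for column in zip(*sequences) if len(set(column)) == 1)

def unique_substitutions(alignment):
    """For each taxon, count sites that become constant when it is deleted."""
    baseline = constant_sites(list(alignment.values()))
    counts = {}
    for taxon in alignment:
        remaining = [seq for name, seq in alignment.items() if name != taxon]
        counts[taxon] = constant_sites(remaining) - baseline
    return counts

# Hypothetical toy alignment (not data from the study).
toy = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTAC",
    "taxon_C": "ACGTTC",
    "taxon_D": "TCGTAG",
}
print(unique_substitutions(toy))
```

A taxon with many such sites carries a large number of changes found nowhere else in the data set, which is why the count is used here to screen for possible long branches.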


Fig. 3. Spectral analysis of the reduced data set of euglenozoan sequences. Manhattan tree (a) and spectrum based on Hamming distances as a Lento plot (b).


Phylogeny of Rhabdomonadina: examination of phylogenetic signal content

A third data set was created in order to examine distribution of phylogenetic signal among taxa of Rhabdomonadina. Nucleotide composition of the data is significantly homogeneous (P = 0.99).

Statistical tests of phylogenetic information content, PTP (P < 0.01) and RASA (tRASA = 3.50), likewise retrieved significant results. Analysis of taxon variance ratios and a plot of unique substitutions (Fig. 4) uncovered outgroups as clear outliers in contrast to a homogeneous ingroup. Indeed, A. torta and A. curvata are the closest related outgroup available, thus allowing us to homologize 1,533 characters. Omission of outgroup sequences did not change resolution of ingroup taxa at all.

Spectral analysis of rhabdomonadid taxa

Spectrum and corresponding Manhattan Tree are given in Figure 5. Clear support accompanied by low conflict could be found for monophyly of Rhabdomonas costata strains (a), ‘Eumenoidium’ (b), Parmidium (c) and Rhabdomonadina (d). Monophyly and substructure of the R. incurva clade (e.g., h) is less supported with growing conflict, as is the monophyly of the genus Menoidium (f) including M. gibbum. Subgroups of ‘Eumenoidium’, M. intermedium clade (i), M. cultellus clade (l), and M. bibacillatum (k, separating M. bibacillatum from all other ‘Eumenoidium’), are slightly supported but also weakly contradicted by alternative splits. This is most likely the result of high sequence similarity within this group. Bipartitions separating Parmidium (n) and Parmidium + R. intermedia (m), respectively, from all other Rhabdomonadina are poorly supported with high conflict.

Maximum likelihood analyses (see Table 1 for parameters) revealed a similar tree topology which differed in the placement of Parmidium and R. intermedia (not shown). Again, using ML distances for spectral analysis did not change the shape of the spectrum significantly, except for less conflict for the R. incurva clade.

Results summary

All nodes supported by high bootstrap values according to standard distance and parsimony approaches could be found with positive net support. Interrelationships of major euglenozoan and euglenid subgroups were not supported. Within the suborder Rhabdomonadina only few unambiguously supported clades could be established. Application of sequence evolution models did not improve the phylogenetic signal to noise ratio.

Discussion

The present work focuses on the examination of euglenozoan SSU rDNA data sets in order to review hypotheses about euglenid phylogeny in the light of pronounced molecular diversity and unevenly distributed

Fig. 4. Taxon variance plot (a) and examination of unique substitutions (b) for the reduced data set comprising sequences of Rhabdomonadina.


Fig. 5. Spectral analysis of the reduced data set comprising sequences of Rhabdomonadina. Manhattan tree (a) and spectrum based on Hamming distances as a Lento plot (b).

sequence characteristics along different branches of the tree. Spectral analysis has been used as a tool to study and visualize distribution of phylogenetic signal and noise.

Examination of phylogenetic signal content

Despite an existing variety of methods to discriminate between character covariance or natural cladistic hierarchy (e.g. Kitching et al. 1998) and noise, there is no generally accepted consensus. PTP and RASA are commonly used statistical approaches based on permutation and regression analyses, respectively. However, there is justified criticism of their performance (Goloboff et al. 1998, Simmons et al. 2002). Although PTP and RASA gave significant results concerning natural cladistic hierarchy within euglenozoan data sets, these findings could not be considered as evidence for accurate phylogenetic inference. Generally, a priori statistical tests based on randomization procedures may be useful to identify completely randomized data sets but do not provide any estimation of confidence in the resulting phylogenetic trees.

Spectral analysis on the other hand offers the opportunity to study signal and conflict for any bipartition without being limited to one estimate of signal. Examining a spectrum, presence of hierarchical structure is reflected by compatible splits showing positive net support. On the other hand it has to be emphasized that the taxon sampling has to be restricted for the use of Spectrum, thus bearing a possible source of error.

Impact of sequence evolution models

Analyses were started using uncorrected distance approaches, and subsequently the impact of modeling the putative evolutionary process was studied. Although hierarchical likelihood ratio test and Akaike information criterion as implemented in Modeltest provide objective criteria for selecting a model of sequence evolution, one has to consider that the resulting parameters represent only the best fitting model among models at hand. Notably, even when hLRT and AIC preferred different model parameters the resulting trees were completely the same. Unfortunately, few methods have been offered to study how real data sets deviate from estimated model parameters. As an example, analyses of euglenozoan SSU rDNA sequences using Treepuzzle 5.0 (Strimmer & von Haeseler 1996) revealed that all taxa failed to pass a chi-square test when equal base frequencies were assumed (as the hLRT had suggested for the second and third data set). Additionally, Menoidium bibacillatum and Saccharomyces cerevisiae deviate significantly from estimated base frequencies (unpublished results). Our analyses clearly show that the underlying evolutionary process of euglenid phylogeny is complex and can hardly be met by model parameters. Consequently, application of available evolutionary models did not improve phylogenetic inference by amplifying signal.

Euglenid phylogeny

Phylogenetic hypotheses presented in this paper are in general agreement with previous studies addressing euglenozoan evolution (Montegut-Felkner & Triemer 1997, Maslov et al. 1999, Linton et al. 1999, 2000, Preisfeld et al. 2000, 2001, Busse & Preisfeld 2002a). Strikingly, spectral analysis challenges euglenid monophyly due to lack of unambiguous phylogenetic signal, which has been shown by a modified slow-fast approach (Busse & Preisfeld 2002a). These results do not contradict euglenid monophyly but indicate that SSU rDNA data do not support euglenid monophyly as unequivocally as often stated (e.g. Preisfeld et al. 2000, Müllner et al. 2001). Generally, it could be found that well supported bipartitions are accompanied by high bootstrap values (Wägele et al. 1999). Nevertheless, important exceptions exist: a reexamination of the data set from Maslov et al. (1999) revealed no unambiguous phylogenetic signal for euglenid monophyly despite high bootstrap values. Artificially high support for euglenid monophyly is most often due to unbalanced taxon sampling (e.g. neglecting primary osmotrophs) and use of few or putatively unsuitable outgroup sequences (e.g. closely related kinetoplastids). Thus, conclusions about euglenid monophyly and interrelationships of euglenozoan subgroups should not be drawn (Montegut-Felkner & Triemer 1997, Linton et al. 1999, 2000, Leander et al. 2001, Müllner et al. 2001). One reason for the failure of molecular data to recover euglenid monophyly may be the pronounced SSU rDNA sequence diversity resulting in saturation and masking of phylogenetic signal. Additionally, Petalomonas cantuscygni has been shown likely to be a long branch taxon, accompanied by a significant underrepresentation of phagotrophic euglenids in currently available data sets. On the other hand it is possible that euglenids are indeed para- or polyphyletic, casting doubt on the pellicle and paramylon as possible synapomorphies.

Interrelations of euglenid subgroups

Besides the indication of monophyly of the phototrophs, spectral analysis detected convincing support for monophyly of primary osmotrophic euglenids, adding further support to the hypothesis of a primary osmotrophic clade comprising molecularly and morphologically diverse organisms. This lineage presumably originated from phagotrophic ancestors as suggested by the occurrence of cytosolic endobacteria in some species, which

Org. Divers. Evol. (2003) 3, 1–12 Spectral analysis applied to Euglenozoa phylogeny 11 are interpreted as remnants of phagocytotic prey uptake species within ‘Eumenoidium’ is less clear. SSU rDNA (Dragos et al. 1990, Yamaguchi & Anderson 1994). analysis identifies three lineages comprising identical Monophyly of primary osmotrophs, phototrophs and sequences each, indicating that corresponding cultures Peranema trichophorum can be founded on a flexible contain the same species. This is accompanied by un- pellicle with numerous helical strips resulting from a precise species descriptions which rely on variable char- strip duplicating event (Leander et al. 2001), but evi- acters like the anterior rostrum and the form and distri- dence from molecular data is weak and contradictory. bution of paramylon grains. Consequently, our analyses Phylogenetic relationships of primary osmotrophs to indicate that ‘Eumenoidium’ consists of three species phototrophs and Peranema trichophorum can not be whose identity should be clarified by ultrastructural in- clarified using SSU rDNA data. However, a sister group vestigations. Scattered placement of Rhabdomonas relation of Peranema and phototrophs seems to be plau- species and low support for Menoidium reflect enduring sible, as phagotrophs with complex ingestion apparatus- discussion concerning the composition and identity of es are likely candidates for giving rise to endocytobiosis these genera (Pringsheim 1942, Skuja 1948, Huber- leading to phototrophs (Preisfeld et al. 2000, 2001, Le- Pestalozzi 1955, Preisfeld et al. 2001). ander et al. 2001). Nevertheless, this grouping is not un- ambiguously supported compared to conflicting alterna- tives. Acknowledgements Although D. proteus is probably susceptible to long branch attraction, and sequence divergence is pro- We sincerely thank G. Bricheux for kindly providing a culture nounced, monophyly of primary osmotrophs is much of Menoidium pellucidum, and are very grateful to S. 
Talke for more reasonable compared to euglenid monophyly in- critically reading the manuscript. This work was supported by cluding P. cantuscygni at its base. One explanation may the DFG (PR 624/2-3). be the comparatively well-balanced taxon sampling of primary osmotrophs. Additionally, the origin of this clade can be expected to be much younger than the di- vergence of the branch leading to P. cantuscygni, thus References phylogenetic signal is less masked by multiple hits. To date, comparative morphological and ultrastructural Archie, J. W. (1989): A randomization test for phylogenetic in- formation in systematic data. Syst. Zool. 38: 219–252. analyses of a broad range of primary osmotrophic eu- Busse, I. & Preisfeld, A. (2002a): Phylogenetic position of glenids are still missing. Unifying morphological char- sp. and Diplonema ambulator as indicated by acters of primary osmotrophs comprise negative fea- analyses of euglenozoan small subunit ribosomal DNA. tures like absence of ingestion devices and plastids. Gene 284: 83–91. Nevertheless, the possession of split-ringed structures Busse, I. & Preisfeld, A. (2002b, in press): Systematics of pri- around the canal, as described for Distigma proteus, mary osmotrophic euglenids: a molecular approach to the might be a putative synapomorphy and is potentially ho- phylogeny of Distigma and Astasia (Euglenozoa). Int. J. mologous to the scroll of the Rhabdomonadina (Leedale Syst. Evol. Microbiol. 1967, Leedale & Hibberd 1974, Leander et al. 2001). A Busse, I. & Preisfeld, A. (2002c, in press): Unusually expand- well founded molecular synapomorphy is the length of ed SSU ribosomal DNA of primary osmotrophic euglenids: the gene coding the SSU rRNAwhich is significantly molecular evolution and phylogenetic inference. J. Mol. Evol. extended in all primary osmotrophs, reaching more than Cavalier-Smith, T. (1981): kingdoms: seven or 4.500 nt in Distigma sennii (Busse & Preisfeld 2002c). nine? BioSystems 14: 461–481. 
Cavalier-Smith, T. (1993): Protozoa and its 18 Interrelationships of the Rhabdomonadina phyla. Biol. Rev. 57: 953–994. Charleston, M. A. (1998): Spectrum: spectral analysis of phy- Results concerning phylogenetic relationships of Rhab- logenetic data. Bioinformatics 14: 98–99. domonadina are generally well in accordance with a pre- Dragos, N., Péterfi, L. S., Nedelcu, A. & Craciun, C. (1990): vious study (Preisfeld et al. 2001). Besides monophyly Intracellular in euglenoid flagellates. Rev. Roum. of Rhabdomonadina, spectral analysis reveals two sig- Biol. Sér. Biol. Végét. 35: 51–54. nificantly supported groups: Parmidium and ‘Eu- Faith, D. P. (1989): Homoplasy as pattern: multivariate analy- sis of morphological convergence in Anseriformes. Cladis- menoidium’. The genus Parmidium can be identified tics 5: 235–258. morphologically by a strongly flattened and oval cell Faith, D. P. & Cranston, P. S. (1991): Could a cladogram this shape. ‘Eumenoidium’ is well corroborated by molecu- short have arisen by chance alone? On permutation tests for lar and morphological data, as all species are laterally cladistic structure. Cladistics 7: 1–28. flattened and sickle shaped in contrast to M. gibbum Felsenstein, J. (1985): Confidence limits on phylogenies: an which is twisted at the posterior end. The identity of approach using the bootstrap. Evolution 39: 783–791.


Frantz, C., Ebel, C., Paulus, F. & Imbault, P. (2000): Characterization of trans-splicing in euglenoids. Curr. Genet. 37: 349–355.

Gibbs, S. A. (1978): The of Euglena may have evolved from symbiotic . Can. J. Bot. 56: 2883–2889.

Goloboff, P. A., Carpenter, J. M. & Farris, J. S. (1998): PTP is meaningless, T-PTP is contradictory: a reply to Trueman. Cladistics 14: 105–116.

Hendy, M. D. (1991): A combinatorial description of the closest tree algorithm for finding evolutionary trees. Discrete Math. 96: 51–58.

Hendy, M. D. & Penny, D. (1993): Spectral analysis of phylogenetic data. J. Classif. 10: 5–24.

Huber-Pestalozzi, G. (1955): Das Phytoplankton des Süßwassers. 4. Teil. E. Schweizerbart’sche Verlagsbuchh., Stuttgart.

Kitching, I. J., Forey, P. L., Humphries, C. J. & Williams, D. M. (1998): Cladistics. The Theory and Practice of Parsimony Analysis. Second edition. Syst. Assoc. Publ. 11. Oxford University Press, Oxford, etc.

Kivic, P. A. & Walne, P. (1984): An evaluation of a possible phylogenetic relationship between the Euglenophyta and . Origins of 13: 269–288.

Kumar, S. (1996): PHYLTEST: Phylogenetic Hypothesis Testing Software. Version 2.0. Pennsylvania State University, University Park, Pennsylvania.

Leander, B. S., Triemer, R. E. & Farmer, M. A. (2001): Character evolution in heterotrophic euglenids. Europ. J. Protistol. 37: 337–356.

Leedale, G. F. (1967): Euglenoid Flagellates. Prentice-Hall, Inc., Englewood Cliffs, New Jersey.

Leedale, G. F. & Hibberd, D. J. (1974): Observations on the cytology and fine structure of the euglenoid genera Menoidium Perty and Rhabdomonas Fresenius. Arch. Protistenkd. 116: 319–345.

Lento, G. M., Hickson, R. E., Chambers, G. K. & Penny, D. (1995): Use of spectral analysis to test hypotheses on the origin of pinnipeds. Mol. Biol. Evol. 12: 28–52.

Linton, E. W., Hittner, D., Lewandowski, C., Auld, T. & Triemer, R. E. (1999): A molecular study of euglenid phylogeny using small subunit rDNA. J. Eukaryot. Microbiol. 46: 217–223.

Linton, E. W., Nudelmann, M. A., Conforti, V. & Triemer, R. E. (2000): A molecular analysis of the Euglenophytes using SSU rDNA. J. Phycol. 36: 740–746.

Lyons-Weiler, J. & Hoelzer, G. A. (1997): Escaping from the Felsenstein zone by detecting long branches in phylogenetic data. Mol. Phylogenet. Evol. 8: 375–384.

Lyons-Weiler, J., Hoelzer, G. A. & Tausch, R. J. (1996): Relative apparent synapomorphy analysis (RASA) I: the statistical measurement of phylogenetic signal. Mol. Biol. Evol. 13: 749–757.

Maslov, D. A., Yasuhira, S. & Simpson, L. (1999): Phylogenetic affinities of Diplonema within the Euglenozoa as inferred from the ssu rRNA gene and partial COI protein sequences. Protist 150: 33–42.

Montegut-Felkner, A. E. & Triemer, R. E. (1997): Phylogenetic relationships of selected euglenoid genera based on morphological and molecular data. J. Phycol. 33: 512–519.

Müllner, A. N., Angeler, D. G., Samuel, R., Linton, E. W. & Triemer, R. E. (2001): Phylogenetic analysis of phagotrophic, phototrophic and osmotrophic euglenoids by using the nuclear 18S rDNA sequence. Int. J. Syst. Evol. Microbiol. 51: 783–791.

Posada, D. & Crandall, K. A. (1998): MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817–818.

Preisfeld, A., Berger, S., Busse, I., Liller, S. & Ruppel, H. G. (2000): Phylogenetic analyses of various euglenoid taxa (Euglenozoa) based on 18S rDNA sequence data. J. Phycol. 36: 220–226.

Preisfeld, A., Busse, I., Klingberg, M., Talke, S. & Ruppel, H. G. (2001): Phylogenetic position and inter-relationship of osmotrophic euglenids with emphasis on the Rhabdomonadales (Euglenozoa) based on SSU rDNA data. Int. J. Syst. Evol. Microbiol. 51: 751–758.

Pringsheim, E. G. (1942): Contributions to our knowledge of saprotrophic algae and flagellata. III. Astasia, Distigma, Menoidium and Rhabdomonas. New Phytol. 41: 171–205.

Rodriguez, R., Oliver, J. L., Marin, A. & Medina, J. R. (1990): The general stochastic model of nucleotide substitution. J. Theor. Biol. 142: 485–501.

Simmons, M. P., Randle, C. P., Freudenstein, J. V. & Wenzel, J. W. (2002): Limitations of relative apparent synapomorphy analysis (RASA) for measuring phylogenetic signal. Mol. Biol. Evol. 19: 14–23.

Simpson, A. G. B. (1997): The identity and composition of the Euglenozoa. Arch. Protistenkd. 148: 318–328.

Skuja, H. (1948): Taxonomie des Phytoplanktons einiger Seen in Uppland, Schweden. Symbolae Botanicae Upsalienses IX, 3.

Sogin, M. L., Gunderson, J. H., Elwood, H. J., Alonso, R. A. & Peattie, D. A. (1989): Phylogenetic meaning of the kingdom concept: an unusual ribosomal RNA from lamblia. Science 243: 75–77.

Strimmer, K. & von Haeseler, A. (1996): Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13: 964–969.

Swofford, D. L. (2001): PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.

Tamura, K. & Nei, M. (1993): Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10: 512–526.

Triemer, R. E. & Farmer, M. A. (1991): An ultrastructural comparison of the mitotic apparatus, feeding apparatus, flagellar apparatus and in euglenoids and kinetoplastids. Protoplasma 164: 91–104.

Wägele, J. W., Erikson, T., Lockhart, P. & Misof, B. (1999): The Ecdysozoa: artifact or monophylum? J. Zool. Syst. Evol. Res. 37: 211–223.

Wuyts, J., Van de Peer, Y., Winkelmans, T. & De Wachter, R. (2002): The European database on small subunit ribosomal RNA. Nucleic Acids Res. 30: 183–185.

Yamaguchi, T. & Anderson, O. R. (1994): Fine structure of laboratory cultured Distigma proteus and cytochemical localization of acid phosphatase. J. Morphol. 219: 89–99.
