Unusually Long Hyptiotes (Araneae, Uloboridae) Sequence for Small Subunit (18S) Ribosomal Rna Supports Secondary Structure Model Utility in Spiders
Total Page:16
File Type:pdf, Size:1020Kb
2006 (2007). The Journal of Arachnology 34:557–565 UNUSUALLY LONG HYPTIOTES (ARANEAE, ULOBORIDAE) SEQUENCE FOR SMALL SUBUNIT (18S) RIBOSOMAL RNA SUPPORTS SECONDARY STRUCTURE MODEL UTILITY IN SPIDERS J. C. Spagna1 and R. G. Gillespie: Division of Insect Biology, Department of Environmental Science, Policy and Management, 137 Mulford Hall, UC Berkeley, Berkeley, California 94720, USA. E-mail: [email protected] ABSTRACT. We report on the structure of the small-subunit ribosomal RNA (18S rRNA) sequence from Hyptiotes gertschi (Araneae, Uloboridae), which is the largest 18S gene sequenced in any arachnid to date. We compare this remarkable sequence to those from a range of other spiders and arachnids, and develop base-pairing models of its insert regions to determine its overall secondary structure. The H. gertschi sequence of 1902 bases is 86 nucleotides longer than any comparable spider sequence and contains 5 inserts between 5 and 28 bases in length, all at regions characterized as among the most variable in eukaryotic 18S genes. Inserts were also found in one of these variable regions in published sequences of 3 species of hard ticks (Acari, Ixodidae). Other arachnid taxa were remarkably uniform in 18S primary sequence length, ranging from 1802 to 1816 nucleotides. Thermodynamic modeling of the H. gertschi inserts suggests they are largely self-complementary, extending the stem portions of the variable regions. Keywords: Phylogenetics, arachnids, Acari, gene inserts The small subunit ribosomal RNA, or 18S at which different parts of the primary se- rRNA, is one of a small set of commonly used quence vary. sequences for molecular phylogenetic recon- The genes coding for ribosomal RNAs con- struction of arthropod relationships (reviewed tain regions that can accumulate and lose ba- in Caterino et al. 2000). The gene coding for ses through insertion and deletion events (in- the 18S rRNA contains sections that are high- dels) more easily than protein-coding genes, ly conserved, and these provide informative which are constrained by the requirement that characters for the assessment of relationships they maintain an open reading frame for prop- between distantly related taxa, such as meta- er translation into a functional primary amino zoan relationships (Giribet & Wheeler 2001; acid sequence. Indels can change the length of Mallatt et al. 2004). The 18S rRNA has also the gene and make homology assessments of been used in studies of divergence between individual base-pairs, and often long stretches arachnid orders, as well as studies of diver- of base-pairs, difficult. The proper methodo- gence between spider genera and species logical approach to this sequence-alignment (Wheeler & Hayashi 1998; Arnedo et al. problem is a source of controversy in phylo- 2004). In the context of arachnid molecular genetics, and opinions range from using static phylogenetics, to understand both the poten- alignments, with regions that are difficult to tial utility and drawbacks in the use of any align being either included or discarded (e.g., genetic marker, it is important to have knowl- Nardi et al. 2003); to using direct optimization edge of the amount of variation in both the (Wheeler 1996), which avoids the arbitrary re- primary sequence and secondary structures moval of data and possible evolutionary sig- across taxa at different levels. This is because nal and avoids the problem of multiple-se- the secondary structure can influence the rate quence alignment altogether. An additional 1 Current address: Dept. of Entomology, University feature of direct optimization, as implemented of Illinois Urbana-Champaign, 320 Morrill Hall, in the program POY (Wheeler 1999) is the 505 S. Godwin Ave., Urbana, Illinois 61801. ability to treat inserts as multistate characters E-mail: [email protected] with as many states as there are unique insert 557 558 THE JOURNAL OF ARACHNOLOGY sequences, with a matrix of character-state hard ticks (Acari, Ixodidae) that demonstrate transformation costs. See Giribet & Wheeler the evolutionary conservation of the overall (2001) for an example of implementation of structure of the arachnid 18s gene. the latter method. A very different approach, and one that has METHODS been promoted by many authors, is the use of Taxon sampling.—We sampled exemplar secondary structure models to aid in homol- sequences from all arachnid orders and major ogy assessment and relative weighting (Dixon spider lineages for which at least one full se- & Hillis 1993; Kjer 2004). Recently, it has quence of the entire 18S gene was available been proposed that elements of rRNA second- in GenBank (http://www.ncbi.nlm.nih.gov/ ary structure can provide a basis for choice Genbank/). To sample unusually long arachnid between multiple models to be used in a par- 18S sequences thoroughly, we extended our titioned Bayesian analysis (Telford et al. search to include all ‘‘partial sequences’’ that 2005). For this technique, bases from a variety came within 90 bases (ϳ5% of the gene of taxa would be compared to a secondary length) of the primer regions, since this region structure model for the gene for assignment to is highly conserved across arthropods and the ‘‘stem’’ and ‘‘loop’’ (and possibly transitional missing length could be estimated despite lack or ambiguous) partitions. These partitions of primary sequence. In taxa where 18S gene would be individually tested for the most ap- length was uniform and in the ϳ1800 base propriate likelihood models, then subsequent- pair (ϭ bp) length typical of arachnids, one ly analyzed using the partitioned, multiple- individual was sampled, while all complete or model approach. nearly-complete sequences with large (Ͼ 3 The RNAs produced by the 18S genes form bp) inserts relative to the Aphonopelma sp. se- secondary structure by base-pairing with their quence were included in the sample, since a own complementary stretches of sequence, complete secondary structure model is avail- forming helical structures commonly referred able for this taxon (Hendriks et al. 1988). to as stems, while the non-pairing regions Among spiders, lineages included represen- form loop structures that connect multiple tatives from Mygalomorphae (Aphonopelma stems, or form the terminal turn in the self- sp., Theraphosidae), Mesothelae (Liphistius complementary stems. Because the overall bicoloripes, Liphistiidae), Orbicularia (H. secondary structure is functionally important gertschi, Uloboridae), ‘‘derived orb-weavers’’ for the ribosome, and because two comple- (sensu Coddington 1990), (Nesticus cellulan- mentary changes in primary sequence are re- us, Nesticidae) and the RTA (retrolateral tibial quired to maintain a stem structure, it has been apophysis) clade (sensu Coddington et al. proposed that the stems should be treated dif- 2004) (Coelotes terrestris, Amaurobiidae) ferently in phylogenetic analysis than the less- (see Table 1). All spider sequences were from constrained loops (Dixon & Hillis 1993). To Genbank except for the H. gertschi data, for do so requires the ability to tell whether nu- which new sequence data were generated in cleotides are in a loop or a stem structure, the current study. Specimens of H. gertschi which can be modeled using computer algo- were collected by S. Lew at the Angelo Re- rithms that develop secondary structures serve, Mendocino County (39Њ43ЈN, based on comparative or thermodynamic in- 123Њ39ЈW), and from Del Norte County near formation (reviewed in Gardner & Giegerich the Oregon border (41Њ59ЈN, 123Њ43ЈW), Cal- 2004). ifornia, USA, in May 2003. Voucher speci- Here we present a comparison of 18S struc- mens are stored in the collection of the Essig tures sampled from arachnid orders and spider Museum of Entomology, UC Berkeley, under lineages to assess the utility of secondary the voucher codes EMEC50654 and structure information currently available for EMEC50993, respectively. making homology judgments across arachnid Molecular methods.—DNA was extracted taxa, and for determining data partitions used from two legs from each of the H. gertschi in model-based analyses. Included are new specimens using a Qiagen DNeasy Tissue data from the species Hyptiotes gertschi Kit’s standard protocol. The 18S gene was di- Chamberlin & Ivie 1935 (Araneae, Ulobori- rectly PCR-amplified in three parts using dae), and remarkably similar sequences from primer pairs 1F-5R, 5F-9R, and 3F-7R (after SPAGNA & GILLESPIE—HYPTIOTES 18S SECONDARY STRUCTURE 559 Table 1.—Taxa examined, with total gene length and p-distance between primary sequence of individ- uals and Aphonopelma sp. reference sequence. Length includes terminal primer regions (approximately 50 base pairs (ϭ bp); 23 per primer ϩ 4 bp downstream in the 3Ј direction from primer 9R). * indicates partial sequences (see text), with lengths estimated by assuming uniform sequence length relative to reference sequence at missing terminal regions. Genbank Total Uncorrected Accession length p-distance to Order (lineage represented) Taxon sampled Number (bp) reference Araneae (Orbicularia) Hyptiotes gertschi Chamber- DQ015708 1902 10.8% lin & Ivie 1935 Araneae (Mygalomorphae) Aphonopelma sp. X13457 1814 — Araneae (Mesothelae) Liphistius bicoloripes Ono AF007104 1808 2.4% 1988 Araneae (derived orb-weavers) Nesticus cellulanus (Clerck AF005447 1816 7.8% 1757) Araneae (RTA clade) Coelotes terrestris (Wider AJ007986 1814 5.7% 1834) Opiliones Odiellus troguloides (Lucas X81441 1810 6.2% 1847) Scorpiones Androctonus australis (Lin- X77908 1812 6.7% naeus 1758) Pseudoscorpiones Roncus