ZOOLOGICAL RESEARCH

Volume 33, Issues E5-6 December 2012

CONTENTS

Special Issue on Genetic Diversity, Development and Evolution

Foreword ...... Bingyu MAO (E63)
Introduction to the State Key Laboratory of Genetic Resources and Evolution ...... (E64)

Reviews
Summary of Laurasiatheria (Mammalia) Phylogeny ...... Jingyang HU, Yaping ZHANG, Li YU (E65)
Evolution of gamma-aminobutyric acid, glutamate and their receptors ...... Zhiheng GOU, Xiao WANG, Wen WANG (E75)

Articles
XZFP36L1, an RNA binding protein, is involved in Xenopus neural development ...... Yingjie XIA, Shuhua ZHAO, Bingyu MAO (E82)
Establishment and characterization of a fibroblast-like cell line from Anabarilius grahami ...... Xiaoai WANG, Junxing YANG, Xiaoyong CHEN, Xiaofu PAN (E89)
Stage-specific appearance of cytoplasmic microtubules around the surviving nuclei during the third prezygotic division of Paramecium ...... Yiwen WANG, Jinqiang YUAN, Xin GAO, Xianyu YANG (E98)
Molecular phylogeny and divergence time of Trachypithecus: with implication for of T. phayrei ...... Kai HE, Naiqing HU, Joseph D. ORKIN, Daw Thida NYEIN, Chi MA, Wen XIAO, Pengfei FAN, Xuelong JIANG (E104)
Complete mitogenome of the Painted Jezebel, Delias hyparete Linnaeus (Lepidoptera: Pieridae) and its comparison with other butterfly species ...... Qinghui SHI, Jing XIA, Xiaoyan SUN, Jiasheng HAO, Qun YANG (E111)
Peak Identification for ChIP-seq Data with no Controls ...... Yanfeng ZHANG, Bing SU (E121)

Zoological Research call for papers: The 2nd and 3rd issue, 2013, “Special Issue on Amphibians and Reptiles” and “Special Issue on Fish” ...... (Inside back cover)

Guest Editors: Bingyu MAO, Jianfan WEN

Zoological Research Websites: http://www.zoores.ac.cn/EN/volumn/current.shtml

Foreword

The State Key Laboratory of Genetic Resources and Evolution, located at the Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), is a leading research center in genetic resource conservation and evolutionary biology. In this special issue of Zoological Research on “Animal Genetic diversity, development and evolution”, we have compiled 12 review and research articles from the Key Lab as well as 4 related articles from external authors, covering, among other topics, genetic biodiversity, molecular phylogeny, the evolution of gene families, and the use of mitochondrial DNA in the study of adaptive evolution and tumor evolution. Based on previous cytogenetic, morphological and molecular data, Hu JY et al summarized the higher-level phylogeny of Laurasiatheria, one of the richest and most diverse superorders of placental mammals. The genus Trachypithecus is the most diverse langur taxon, distributed in southwestern China and South and Southeast Asia. He K et al reconstructed the phylogeny of the 16 recognized Trachypithecus species using mitochondrial and nuclear genes. Wang CY et al reported the identification of the processed medicinal Catharsius molossus by means of DNA barcoding, which may stimulate similar studies. Wang F et al cloned the heat shock protein 60 cDNA of Neobenedenia melleni (a capsalid monogenean parasite on fish skin) and studied how its expression changes under different temperatures and salinities. Wang JH et al and Wang XA et al discussed the establishment of fibroblast-like cell lines from the endangered Crested ibis (Nipponia nippon) and the local fish Anabarilius grahami (Cypriniformes: Cyprinidae), respectively. Owing to its high mutation rate, mitochondrial DNA has been widely used in molecular evolutionary studies. In this issue, Chen X et al reviewed recent progress on its applications in molecular phylogeny, the evolution of energy metabolism, and adaptation.
Unlike normal cells, tumor cells generate energy via glycolysis even under aerobic conditions. In tumor cells, somatic mutations in genes related to energy metabolism (including mitochondrial ones) occur at a high frequency. Whether these mutations are involved in the switch between energy metabolism pathways is examined by Liu J and Kong QP. Meanwhile, Shi QH et al reported the complete mitochondrial genome of the Painted Jezebel, Delias hyparete. During conjugation of the protozoan Paramecium caudatum, four haploid nuclei are formed during the first two (meiotic) prezygotic divisions, of which only one survives. In the study by Wang YW et al, cytoplasmic microtubules (cMTs) were observed surrounding the surviving nuclei during the third prezygotic division, providing a new clue to the mechanism of this survival. The regulation of mRNA stability through RNA binding proteins plays an important role in embryonic development. In this issue, Xia YJ et al showed that XZFP36L1, an RNA binding protein, is involved in Xenopus neural development. XZFP36L1 is expressed in specific brain regions in Xenopus embryos, and its overexpression inhibits neural differentiation and can lead to severe neural tube defects. Ye DD et al reviewed the bioinformatics pipelines for metagenomic analysis, and Zhang & Su validated an approach to peak identification for ChIP-seq data with no controls. This issue also includes reviews on the evolution of the neurotransmitters gamma-aminobutyric acid and glutamate and their receptors; advances in the study of the nucleolus; and current progress in the study of ovarian germ stem cells in mammals.

Associate Editor-in-Chief

(State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, the Chinese Academy of Sciences, Kunming 650223, China)


Introduction to the State Key Laboratory of Genetic Resources and Evolution

The State Key Laboratory of Genetic Resources and Evolution, located at the Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), was established in 2007 on the basis of the former “Key Laboratory of Cellular and Molecular Evolution” of the Chinese Academy of Sciences, which was founded in 1990. The director of the key lab is Academician Yaping ZHANG, and the director of the academic committee is Academician Zuoyan ZHU. In 2011, the lab was rated “excellent” by MOST and was also recognized as one of the Outstanding Teams in the Innovation Culture Construction of the Chinese Academy of Sciences. The three major research areas of the key lab are “Evolution and conservation of the diverse genetic resources”, “Evolution of genes and genomes” and “Genetics, development and evolution”. The lab has gained international recognition in the research fields of genetic resource collection and conservation, human origin and migration, the origin of primate cognition, the origin of new genes, and the genetic diversity of domestic animals. The aim of the key laboratory is to become a center of research on genetic resources and evolution in China and, in certain fields, in the world. The laboratory has 92 staff members, including one academician, 15 principal investigators and 10 associate professors. The lab currently has 15 research groups and 3 supporting units and hosts about 100 graduate students. The lab currently holds or participates in more than 200 research projects, including key projects from the National Basic Research Program (973), the National Natural Science Foundation of China and the CAS Innovation Program. Since 2006, the lab has published more than 400 papers indexed by SCI (Science Citation Index), some of them in top journals including Nature and Science.
Awards to lab members include the “Second Prize of National Natural Science Progress” (2), “First Prize of Yunnan Natural Science Progress” (3), “Biodiversity Leadership Award” (1), and “Cheung Kong Scholar Achievements Awards” (1), among others. The lab is equipped with state-of-the-art instruments, including an automated DNA sequencer, a microarray scanner, a GenomeLab SNPstream system, and a high-performance computing cluster system. The lab's equipment is valued at more than RMB 20 million in total, and the facilities are open to researchers from outside the lab. The lab promotes scientific exchange and cooperation with researchers both at home and abroad. The lab homepage is at: http://www.kiz.cas.cn/gre/

Zoological Research 33 (E5−6): E65−E74 doi: 10.3724/SP.J.1141.2012.E05-06E65

Summary of Laurasiatheria (Mammalia) Phylogeny

Jingyang HU 1,2, Yaping ZHANG 1,3, Li YU 1,2,*

1. Laboratory for Conservation and Utilization of Bio-resource, Yunnan University, Kunming 650091, China
2. Key Laboratory for Animal Genetic Diversity and Evolution of High Education in Yunnan Province, Yunnan University, Kunming 650091, China
3. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China

Abstract: Laurasiatheria is one of the richest and most diverse superorders of placental mammals. Because this group underwent a rapid evolutionary radiation, the phylogenetic relationships among the six orders of Laurasiatheria remain a subject of heated debate, and several issues related to its phylogeny remain open. Reconstructing the true phylogenetic relationships of Laurasiatheria is a significant case study in evolutionary biology because of the diversity of this superorder, and such research will have significant implications for biodiversity conservation. We review the higher-level (inter-ordinal) phylogenies of Laurasiatheria based on previous cytogenetic, morphological and molecular data, and discuss the controversies surrounding its phylogenetic relationships. This review aims to outline future research on Laurasiatheria phylogeny and adaptive evolution.

Keywords: Laurasiatheria; Phylogeny; Mitochondrial DNA; Nuclear genes; Phylogenomics

Received: 06 September 2012; Accepted: 15 November 2012
Foundation Item: This work was supported by Program for New Century Excellent Talents in University (NCET)
* Corresponding author, E-mail: [email protected]
Science Press Volume 33 Issues E5−6

Placental mammals, which diverged from marsupials about 160 million years ago (Mya) (Luo et al, 2011; Dos Reis et al, 2012), consist of four super-orders, i.e., Afrotheria, Xenarthra, Euarchontoglires, and Laurasiatheria (Asher & Helgen, 2010; Carter, 2001; Eizirik et al, 2001; Hallström et al, 2011; Kriegs et al, 2006; Kulemzina et al, 2010; Lin et al, 2002; Murphy et al, 2001a, b; Murphy et al, 2007; Nikolaev et al, 2007; Nishihara et al, 2006; Prasad et al, 2008; Springer et al, 2004; Wildman et al, 2006; Zhou & Yang, 2010; Zhou et al, 2011a). Among them, Laurasiatheria is one of the richest and most diverse, with an origin and a primary distribution on the supercontinent of Laurasia (including North America, Europe and Asia) during its early history (Springer et al, 2011; Waddell et al, 1999b).

Although the root of placental mammals is unclear, Laurasiatheria has been consistently recognized as a sub-clade of Boreoeutheria (=Laurasiatheria+Euarchontoglires) (Dos Reis et al, 2012; Gibson et al, 2005; Hallström & Janke, 2008; Song et al, 2012; Madsen et al, 2001; McCormack et al, 2012; Murphy et al, 2001a, b; Nishihara et al, 2006; Springer & Murphy, 2007; Springer et al, 2011; Wildman et al, 2006). Many Laurasiatheria species have attracted great interest in relation to the conservation of wild animals and are also crucial animal models for studies on adaptive evolution. As a mammalian group of important evolutionary significance and conservation value, Laurasiatheria has long been a focus of research.

Current classifications of Laurasiatheria recognize six orders, namely Eulipotyphla (hedgehogs, shrews, and moles), Perissodactyla (rhinoceroses, horses, and tapirs), Carnivora (carnivores), Cetartiodactyla (artiodactyls and cetaceans), Chiroptera (bats), and Pholidota (pangolins) (Hallström et al, 2011; Lin et al, 2002; Waddell et al, 1999a; Wildman et al, 2006; Wilson & Reeder, 2005; Zhou et al, 2011a). However, the phylogeny and evolution of Laurasiatheria remain subjects of heated debate and are not yet well established. The origins of Laurasiatheria have been dated to the Cretaceous, ranging from 100 to 78.5−93 Mya according to several molecular estimates (Cao et al, 2000; Hallström & Janke 2008, 2010; Hallström et al, 2011; Hasegawa et al, 2003; Ji et al, 2002; Kitazoe et al, 2007; Meredith et al, 2011; Murphy et al, 2004; Springer & Murphy, 2007; Zhou et al, 2011b). However, most morphological studies place the origins of this super-order in the Paleocene (Asher et al, 2003; Wible et al, 2007, 2009; Luo et al, 2011). The diversification of mammalian species within Laurasiatherian orders appears to have occurred within 1−4 million years in the Eocene (Hallström & Janke 2008, 2010; Zhou et al, 2011b). Consequently, attempts to clarify relationships among the six Laurasiatheria orders in a variety of studies have encountered serious challenges due to the rapid evolutionary radiations and recent speciation events. Although there has been a general consensus regarding the earliest divergence of the order Eulipotyphla (Dos Reis et al, 2012; Hallström et al, 2011; Lin et al, 2002; Murphy et al, 2001a, b; Nery et al, 2012; Nishihara et al, 2006; Romiguier et al, 2010; Song et al, 2012; Wildman et al, 2006; Zhou et al, 2011a, b), conflicting phylogenetic hypotheses exist for the other orders that evolved subsequently (Figure 1). In this article, we review the higher-level (inter-ordinal) phylogenies of Laurasiatheria based on previous cytogenetic, morphological and molecular data, and discuss the controversies surrounding its phylogenetic relationships. This review aims to outline future research on Laurasiatheria phylogeny and adaptive evolution.

Figure 1 Sixteen different tree topologies proposed in previous studies. A: Waddell & Shelley, 2003; Murphy et al, 2007; Kulemzina et al, 2010. B: Waddell et al, 1999a; Murphy et al, 2001b; Arnason et al, 2002; Arnason & Janke, 2002; Amrine-Madsen et al, 2003; Springer et al, 2004; Wildman et al, 2006; Hallström et al, 2011; Song et al, 2012. C: Mouchaty et al, 2000; Nikaido et al, 2001. D: Jow et al, 2002; Gibson et al, 2005. E: Hudelot et al, 2003. F: Madsen et al, 2001, Figure 1A. G: Madsen et al, 2001, Figure 1B; Meredith et al, 2011. H: Murphy et al, 2001a; Waddell et al, 2001; Beck et al, 2006; Springer et al, 2007; Bininda-Emonds et al, 2007; Zhou et al, 2011b. I: Madsen et al, 2002. J: Asher et al, 2003; Kullberg et al, 2006. K: Matthee et al, 2007. L: Kriegs et al, 2006; Nikolaev et al, 2007. M: Nishihara et al, 2006; Romiguier et al, 2010; McCormack et al, 2011; Zhou et al, 2011a; Dos Reis et al, 2012. N: Prasad et al, 2008. O: Prasad et al, 2008. P: Hou et al, 2009; Nery et al, 2012.
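The parenthetical notation used for the topologies in Figure 1 and throughout this review can be read mechanically. The following minimal Python sketch, which is not part of any of the cited studies, parses such a string into nested tuples and lists the groupings of orders (clades) that each pair of brackets implies. The function names `parse_topology` and `clades` are hypothetical, and the example string is the topology cited in this review for Figure 1B.

```python
def parse_topology(text):
    """Parse a parenthetical topology such as '(A, (B, C))' into nested tuples."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_node():
        nonlocal pos
        skip_spaces()
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while True:
                skip_spaces()
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1
                    children.append(parse_node())
                elif text[pos] == ")":
                    pos += 1
                    return tuple(children)
                else:
                    raise ValueError(f"unexpected character at position {pos}")
        # Otherwise read a leaf (order) name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        name = text[start:pos].strip()
        if not name:
            raise ValueError(f"missing taxon name at position {start}")
        return name

    tree = parse_node()
    skip_spaces()
    if pos != len(text):
        raise ValueError("unexpected text after the closing parenthesis")
    return tree


def clades(tree):
    """Return (leaves, groupings): all leaf names under `tree` and every bracketed grouping."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found |= child_found
    found.add(leaves)  # the grouping formed by this pair of brackets
    return leaves, found


# Topology cited in the text for Figure 1B.
FIG_1B = ("(Eulipotyphla, (Chiroptera, (Cetartiodactyla, "
          "(Perissodactyla, (Carnivora, Pholidota)))))")

tree_1b = parse_topology(FIG_1B)
_, groupings_1b = clades(tree_1b)
for grouping in sorted(groupings_1b, key=len):
    print(sorted(grouping))
```

Representing each grouping as a set of order names makes two topologies comparable regardless of the left-to-right order in which sister groups happen to be written.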

Laurasiatheria phylogeny inferred from cytogenetic and morphological evidence

Early karyotype studies of Laurasiatheria have suggested that there are significant differences in chromosome number, morphology and banding pattern among the six orders (Ao et al, 2007; Frönicke et al, 1997; Kulemzina et al, 2009; Nie et al, 2011; Sotero-Caio et al, 2011; Trifonov et al, 2008; Yang & Graphodatsky, 2004; Yang et al, 2006; Ye et al, 2006). Kulemzina et al (2010) investigated the phylogenetic relationships among Laurasiatheria orders by comparing their karyotypes with the human karyotype, and proposed the Laurasiatheria phylogeny as (Eulipotyphla, (Cetartiodactyla, ((Perissodactyla, Chiroptera), (Carnivora, Pholidota)))) (Figure 1A, Table 1). Their study therefore supports the basal placement of Eulipotyphla, and the close relationship between Carnivora and Pholidota, as well as that between Perissodactyla and Chiroptera. However, the trichotomy among Cetartiodactyla, Carnivora + Pholidota, and Perissodactyla + Chiroptera is unresolved.

In contrast to the limited number of karyotype studies, numerous morphological studies have been conducted on Laurasiatheria phylogeny, reaching a variety of conclusions. In early morphological studies, Laurasiatheria monophyly was not even supported (Asher et al, 2003; Novacek, 1992, 2001; Simpson, 1945; Wible et al, 2007, 2009). For example, Perissodactyla was originally classified into the clade Ungulata (Shoshani & McKenna, 1998), whereas Chiroptera was often associated with


Table 1 Major studies on phylogenetics of laurasiatherian groups

| Study evidence | Main studies | Data | Data size | Number of orders | Number of species |
|---|---|---|---|---|---|
| Cytogenetic | Kulemzina et al, 2010 | | | 6 | 37 |
| Morphological | Shoshani and McKenna, 1998 | 260 morphological characteristics | | | |
| Morphological | Asher et al, 2003 | 433 morphological characteristics | | | |
| Morphological | Wildman et al, 2006 | 13 placenta features | | 6 | 20 |
| Morphological | Wible et al, 2007 | 408 morphological characters | | | |
| Morphological | Wible et al, 2009 | 408 morphological characters | | | |
| Mitochondrial genes | Pumo et al, 1998 | mt genome | ~16 kb | 5 (lack Pholidota) | 12 |
| Mitochondrial genes | Cao et al, 1998 | ATP6, ATP8, CoⅠ, CoⅡ, CoⅢ, Cytb, ND1, ND2, ND3, ND4, ND4L, ND5 | ~16 kb | 4 (lack Chiroptera and Eulipotyphla) | 9 |
| Mitochondrial genes | Waddell et al, 1999b | mt genome, tRNA, ND6 | ~1 Mb | 4 (lack Pholidota and Chiroptera) | 9 |
| Mitochondrial genes | Cao et al, 2000 | mt genome | ~16 kb | 5 (lack Pholidota) | 17 |
| Mitochondrial genes | Arnason et al, 2002 | mt genome, Cytb, 12S rRNA | ~16 kb | 6 | 25 |
| Mitochondrial genes | Lin et al, 2002 | mt genome, rRNA, tRNA | ~16 kb | 5 (lack Pholidota) | 29 |
| Mitochondrial genes | Jow et al, 2002 | mt genome, rRNA, tRNA | ~16 kb | 5 (lack Pholidota) | 24 |
| Mitochondrial genes | Hudelot et al, 2003 | rRNA, tRNA | ~3.5 kb | 6 | 31 |
| Mitochondrial genes | Kern & Kondrashov, 2004 | tRNA | | 6 | 30 |
| Mitochondrial genes | Gibson et al, 2005 | mt genome | ~16 kb | 6 | 31 |
| Mitochondrial genes | Kjer et al, 2007 | mt genome, rRNA, tRNA, ND6 | ~1.4 Mb | 6 | 42 |
| Nuclear genes | Madsen et al, 2001 | 4 nuDNA (A2AB, IRBP, Vwf, BRCA1); mt rRNA | ~5 kb | 6 | 21 |
| Nuclear genes | Murphy et al, 2001a | 15 nuDNA (ADORA3, ADRB2, APP-3'UTR, ATP7A, BDNF, BMI1-3'UTR, CNR1, CREM-3'UTR, EDG1, PLCB4-3'UTR, PNOC, RAG1, RAG2, TYR, ZFX); mtDNA | ~1 kb | 6 | 24 |
| Nuclear genes | Murphy et al, 2001b | 19 nuDNA (Murphy et al, 2001a; Madsen et al, 2001); mtDNA | ~16 kb | 6 | 20 |
| Nuclear genes | Waddell et al, 2001 | 19 nuDNA (Madsen et al, 2001; Murphy et al, 2001b); mtDNA | ~10 kb | 6 | 20 |
| Nuclear genes | Eizirik et al, 2001 | 15 nuDNA (Murphy et al, 2001a); mtDNA | ~1 Mb | 6 | 24 |
| Nuclear genes | Madsen et al, 2002 | 1 nuDNA (A2AB) | ~1 kb | 6 | 19 |
| Nuclear genes | Asher et al, 2003 | 18 nuDNA (Murphy et al, 2001a, 2001b; Madsen et al, 2001); mtDNA; 196 morphological characters | ~17 kb | 6 | 20 |
| Nuclear genes | Amrine-Madsen et al, 2003 | 20 nuDNA (Murphy et al, 2001b, apolipoprotein B) | ~17 kb | 6 | 20 |
| Nuclear genes | Waddell & Shelley, 2003 | 27 nuDNA (Amrine-Madsen et al, 2003, RAG1, c-fibrinogen, ND6, RNA, c-MYC, e-globin, and GHR) | ~20 kb | 6 | 38 |
| Nuclear genes | Kullberg et al, 2006 | 8 housekeeping genes (NAD [mt], SDSA, SDHB, RPL18, G3PD, isocitrate dehydrogenase 1, ATP synthase, citrate synthase) | ~5.8 kb | 4 (lack Eulipotyphla and Pholidota) | 10 |
| Nuclear genes | Matthee et al, 2007 | introns (THY, PRKC, MGF) | ~6 kb | 6 | 114 |
| Nuclear genes | Asher et al, 2007 | 19 nuDNA and mtDNA (Murphy et al, 2001b); 196 morphological characters (Asher et al, 2003) | ~17 kb | 6 | 19 |
| Nuclear genes | Springer et al, 2007 | 20 nuDNA (Amrine-Madsen et al, 2003); 175 morphological characters | ~1.4 kb | 6 | 22 |
| Nuclear genes | Meredith et al, 2011 | 26 nuDNA (A2AB, ApoB, BRCA, DMP1, ENAM, GHR, IRBP, Rag1, vWF, TTN, CNR1, BCHE, EDG1, RAG1, RAG2, ATP7A, TYR1, BDNF, ADRB2, APP, BMI1, CREM, FBN1, PLCB4, ADORA3, PNOC) | ~2 Mb | 6 | 69 |
| Phylogenomics | Bashir et al, 2005 | 1000 orthologous genes | ~1.5 Mb | 4 (lack Pholidota and Chiroptera) | 4 |
| Phylogenomics | Kriegs et al, 2006 | retroposed elements | | 5 (lack Perissodactyla) | 15 |
| Phylogenomics | Nishihara et al, 2006 | retroposon insertion (long interspersed elements; L1) loci | | 5 (lack Pholidota) | 8 |
| Phylogenomics | Nikolaev et al, 2007 | 218 orthologous genes | ~200 kb | 4 (lack Perissodactyla and Pholidota) | 4 |
| Phylogenomics | Prasad et al, 2008 | protein-coding sequence and non-coding sequence | ~1.9 Mb | 5 (lack Pholidota) | 13 |
| Phylogenomics | Hallström & Janke, 2008 | 3012 orthologous genes | ~280 kb | 4 (lack Perissodactyla and Pholidota) | 6 |
| Phylogenomics | Hou et al, 2009 | 2705 orthologous genes | ~40 Mb | 4 (lack Pholidota and Eulipotyphla) | 4 |
| Phylogenomics | Romiguier et al, 2010 | 1138 orthologous genes | ~300 kb | 5 (lack Pholidota) | 10 |
| Phylogenomics | Hallström et al, 2011 | 4775 orthologous genes | ~6 Mb | 5 (lack Pholidota) | 12 |
| Phylogenomics | McCormack et al, 2011 | 683 loci of ultraconserved elements | | 5 (lack Pholidota) | 6 |
| Phylogenomics | Zhou et al, 2011a | 8 orthologous genes | ~5 kb | 6 | 11 |
| Phylogenomics | Zhou et al, 2011b | 97 orthologous genes | ~46 kb | 6 | 13 |
| Phylogenomics | Nery et al, 2012 | 3733 orthologous genes | ~50 Mb | 5 (lack Pholidota) | 6 |
| Phylogenomics | Dos Reis et al, 2012 | 857 orthologous genes and 153 mt genome | ~2 Gb | 5 (lack Pholidota) | 153 |
| Phylogenomics | Song et al, 2012 | 447 orthologous genes | ~7 Mb | 5 (lack Pholidota) | 11 |

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6 E68 HU, ZHANG, YU

Dermoptera (flying lemur) (Asher et al, 2003; Novacek, 1992; Shoshani & McKenna, 1998; Simmons & Geisler, 1998). Gunnell & Simmons (2005) proposed that the grouping of Chiroptera with Dermoptera was spurious due to homoplasious morphological characters. Based on analyses of 13 placenta features, Wildman et al (2006) were the first to support Laurasiatheria as a monophyletic clade, which was confirmed by later molecular studies, and favored the Laurasiatheria phylogeny as (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, (Carnivora, Pholidota))))) (Figure 1B, Table 1).

Similar to the karyotype studies, the earliest divergence of Eulipotyphla and the close association between Carnivora and Pholidota were supported. Regarding the phylogenetic placements of the other three orders, however, morphological studies showed more varied resolutions and more debate than the karyotype studies.

Laurasiatheria phylogeny inferred from molecular evidence

Phylogenetic estimate from mitochondrial genes

Owing to its relatively small effective population size, lack of recombination, and ease of acquisition, mitochondrial DNA (mtDNA) has often been chosen as a preferred molecular marker in early molecular phylogenetics (Bargelloni et al, 2000; Brito & Edwards, 2009; Phillips & Penny, 2003; Sánchez-Gracia & Castresana, 2012; Springer et al, 2001; Sturmbauer & Meyer, 1993; Yu et al, 2007).

Pumo et al (1998) analyzed 12 complete mt genome sequences of five Laurasiatheria orders, placing Chiroptera as the sister group of Cetartiodactyla, Perissodactyla and Carnivora, while positioning Eulipotyphla as the most basal order of placental mammals, thus rejecting Laurasiatheria monophyly. Cao et al (1998) investigated the mt genome sequences of three orders, placing Cetartiodactyla as the sister group of Perissodactyla and Carnivora. By summarizing the results proposed in the “International Symposium on the Origin of Mammalian Orders”, Waddell et al (1999b) suggested the Laurasiatheria phylogeny as (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, (Carnivora, Pholidota))))) (Figure 1B, Table 1). These relationships are consistent with those from the morphological study of Wildman et al (2006). Moreover, these relationships are also supported by a later analysis of 13 important rare genomic changes (RGCs) (Springer et al, 2004). By adding the mtDNA genome sequences of other orders and by using different tree-building schemes, later studies obtained the same phylogenetic relationships as those from Waddell et al (1999b) and Wildman et al (2006) (Arnason et al, 2002; Arnason & Janke, 2002; Cao et al, 2000; Gibson et al, 2005; Kern & Kondrashov, 2004; Kjer & Honeycutt, 2007; Lin et al, 2002; Reyes et al, 2004) (Figure 1B, Table 1), with the exception of Eulipotyphla being positioned as the most basal order of placental mammals, and Cetartiodactyla as the sister group of Perissodactyla and Carnivora (Waddell et al, 1999a).

Conversely, Nikaido et al (2001) and Mouchaty et al (2000) proposed that Eulipotyphla and Chiroptera clustered together and were closely related to Perissodactyla, Carnivora and Cetartiodactyla (Figure 1C, Table 1). Jow et al (2002) used all mt tRNAs and rRNAs to reconstruct the Laurasiatheria phylogeny among orders except for Pholidota. The resulting tree topology was (Eulipotyphla, (Chiroptera, (Carnivora, (Cetartiodactyla, Perissodactyla)))) (Figure 1D, Table 1). Subsequently, by adding the mt tRNAs and rRNAs of Pholidota, Hudelot et al (2003) claimed that the Laurasiatheria phylogeny was (Eulipotyphla, (Chiroptera, (Pholidota, (Carnivora, (Cetartiodactyla, Perissodactyla))))) (Figure 1E, Table 1). Notably, their study rejected the general view of the clustering of Carnivora and Pholidota (Arnason et al, 2002; Arnason & Janke, 2002). Therefore, except for the early divergences of Eulipotyphla and Chiroptera, the phylogenetic relationships among the other four orders are inconsistent with all the previous studies.

Phylogenetic estimate from nuclear genes

Although analyses of mtDNA sequences have provided insights into the phylogeny and evolution of Laurasiatheria, the fact that all genes comprising the mt genome are inherited as a single, haploid linkage unit is a well-known limitation on phylogenetic reconstruction, because the resulting mt gene trees are unlikely to reflect more than one independent estimate of the species tree (Giannasi et al, 2001; Johnson & Clayton, 2000; Moore, 1995; Page, 2000; Yu et al, 2004; Yu & Zhang, 2006a). Therefore, when much longer nuclear DNA sequences became widely available, nuclear genes received more and more attention in molecular phylogenetic studies, and demonstrated superiority in resolving deep phylogenies (Springer et al, 1999, 2001). Since 2001, a series of Laurasiatheria studies based on analyses of nuclear genes have made important contributions to the resolution of relationships among Laurasiatheria orders, marking a period of unprecedented progress.

Madsen et al (2001) were the first to use three nuclear genes, combined with mt tRNA and rRNAs (~5 kb in total), to investigate Laurasiatheria relationships and proposed a tree topology of ((Carnivora, Perissodactyla), (Cetartiodactyla, (Pholidota, (Chiroptera, Eulipotyphla)))) (Figure 1F, Table 1). When another nuclear gene (BRCA1) was added and the number of representative taxa was increased to test the stability of the resulting tree, the tree topology changed to (Eulipotyphla, ((Carnivora, Pholidota),


(Chiroptera, (Cetartiodactyla, Perissodactyla)))) (Figure 1G, Table 1), indicating that Laurasiatheria phylogeny varied with the genes and species used in the analyses.

In the same year, by screening 15 nuclear genes (~10 kb) and three mt genes, Murphy et al (2001a) supported a Laurasiatheria phylogeny of (Eulipotyphla, (Chiroptera, ((Cetartiodactyla, Perissodactyla), (Carnivora, Pholidota)))) (Figure 1H, Table 1). Interestingly, their study clustered Cetartiodactyla with Perissodactyla, though this relationship received low statistical support. The subsequent study of Waddell et al (2001) using 12 nuclear genes (~10 kb) and mt genome sequences retrieved the same Laurasiatheria phylogeny and similar nodal support (Figure 1H, Table 1). These results were also supported by the analysis of 20 nuclear genes and 175 morphological characters by Springer et al (2007) (Figure 1H, Table 1).

Remarkably, when the dataset was increased to 19 nuclear genes and three mt genes (~16 kb), Murphy et al (2001b) showed that the Laurasiatheria phylogeny was (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, (Carnivora, Pholidota))))) (Figure 1B and Table 1). Differences from other nuclear trees thus mainly lie in the phylogenetic positions of Cetartiodactyla and Perissodactyla. By adding a new nuclear marker (the apolipoprotein B gene) to the dataset of Murphy et al (2001b), Amrine-Madsen et al (2003) obtained the same phylogenetic results (Figure 1B, Table 1). These results were congruent with those from the morphological and mt genome analyses (Arnason et al, 2002; Arnason & Janke, 2002; Springer et al, 2004; Wildman et al, 2006).

Madsen et al (2002) attempted to evaluate the utility of the Alpha 2B adrenergic receptor gene in the Laurasiatheria phylogeny, which showed the tree as (Perissodactyla, ((Eulipotyphla, Cetartiodactyla), (Chiroptera, Carnivora, Pholidota))) (Figure 1I, Table 1). Surprisingly, Perissodactyla was placed as the most basal order and Eulipotyphla was the sister group of Cetartiodactyla. Waddell & Shelley (2003) developed seven nuclear gene fragments and determined the Laurasiatheria phylogeny to be (Eulipotyphla, (Cetartiodactyla, ((Perissodactyla, Chiroptera), (Carnivora, Pholidota)))) (Figure 1A, Table 1).

Asher et al (2003) examined 18 nuclear genes, in combination with three mt genes and 196 morphological characters, and presented the Laurasiatheria phylogeny as (Eulipotyphla, ((Cetartiodactyla, Perissodactyla), (Chiroptera, (Carnivora, Pholidota)))) (Figure 1J, Table 1). Notably, Chiroptera was not placed as the second diverging lineage in Laurasiatheria, as suggested in most previous studies (Madsen et al, 2001; Murphy et al, 2001b), but displayed a close relationship with Carnivora + Pholidota. Using 8 housekeeping genes from four Laurasiatheria orders, Kullberg et al (2006) retrieved a phylogeny of (Chiroptera, (Cetartiodactyla, (Perissodactyla, Carnivora))) (Figure 1J, Table 1). Despite weak support, this result was consistent with most mt phylogenies (Cao et al, 2000; Lin et al, 2002).

Unlike the protein-coding nuclear genes commonly used in previous studies, Matthee et al (2007) were the first to use three nuclear introns (~6 kb) from 114 Laurasiatheria species to rebuild the relationships among the six orders. Their study supported the phylogeny of (Eulipotyphla, (Perissodactyla, (Cetartiodactyla, (Chiroptera, (Carnivora, Pholidota))))) (Figure 1K, Table 1). However, these relationships were poorly supported. Asher (2007) reanalyzed 19 nuclear and three mt sequence data from Murphy et al (2001b) as well as 196 morphological characters from Asher et al (2003) (Table 1). By incorporating indels in the sequence data, the study suggested that the phylogenetic relationships among the Laurasiatheria orders changed with the combinations of characters and the analytic methods.

Based on the above studies, Laurasiatheria phylogeny at the ordinal level remains an outstanding problem in mammalian systematics due to the contradictory conclusions reached with different datasets, especially in the case of the phylogenetic positions of Perissodactyla, Cetartiodactyla and Chiroptera.

Phylogenomics of Laurasiatheria

With the increasing availability of genomic data for more species, phylogenomic analysis, which uses genomic data to infer evolutionary relationships, is entering a new era (Delsuc et al, 2005; Rokas & Chatzimanolis, 2008; Yu & Zhang, 2006a, b). By performing phylogenomic studies, a reliable phylogenetic tree can be reconstructed using many more characters than those in previous studies, including gene families, large insertion and deletion gene fragments, and gene rearrangements (Meslin et al, 2011; Wu et al, 2012; Yu et al, 2011). These characters might be useful for distinguishing nodes resulting from rapid radiation episodes such as the Laurasiatheria speciation events.

Bashir et al (2005) attempted to reconstruct the Laurasiatheria phylogeny of four orders (Eulipotyphla, Cetartiodactyla, Perissodactyla, and Carnivora) based on more than 1 000 orthologous repetitive elements obtained from publicly available genome sequence data. In their study, Perissodactyla and Carnivora were grouped together, whereas the placements of the other two orders remained unclear.

Using retroposed elements as new markers to reconstruct the Laurasiatheria phylogeny with the exception of Perissodactyla, Kriegs et al (2006) determined the tree as (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Pholidota, Carnivora)))) (Figure 1L, Table 1). On the other hand, Nishihara et al (2006) performed a comprehensive comparison of orthologous

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6 E70 HU, ZHANG, YU

retroposon insertion (long interspersed elements; L1) loci among five orders of Laurasiatheria (lacking Pholidota). These loci supported the phylogeny as (Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, Carnivora)))) (Figure 1M, Table 1). Their most interesting finding was that Carnivora, Perissodactyla, and Chiroptera were grouped together, excluding Cetartiodactyla and Eulipotyphla, a grouping that had not been recovered by previous studies.

Based on 218 protein-coding genes (~200 kb) obtained from the analysis of 18 placental mammalian genomes, Nikolaev et al (2007) suggested that Eulipotyphla diverged first, followed by Chiroptera, and that Cetartiodactyla and Carnivora were sister groups (Figure 1L, Table 1). Subsequently, by searching for informative coding indels within whole-genome sequence data and amplifying them in Laurasiatheria species, Murphy et al (2007) obtained a new tree topology, (Eulipotyphla, (Cetartiodactyla, ((Chiroptera, Perissodactyla), (Carnivora, Pholidota)))) (Figure 1A, Table 1). The close relatedness of Chiroptera and Perissodactyla has otherwise been recovered only in karyotype research (Kulemzina et al, 2010).

Prasad et al (2008) reconstructed the phylogeny of Laurasiatheria except for Pholidota based on 1.9 Mb gene regions. The protein-coding sequence analyses supported the tree as (Eulipotyphla, (Carnivora, (Chiroptera, (Perissodactyla, Cetartiodactyla)))) (Figure 1N, Table 1), whereas the combination of coding and non-coding sequence analyses favored the tree as (Eulipotyphla, (Chiroptera, (Perissodactyla, (Carnivora, Cetartiodactyla)))) (Figure 1O, Table 1), showing a lack of consistency in Laurasiatheria phylogeny when different kinds of characters are used. The lack of consistency even with such large amounts of data is also evident in Hou et al (2009), in which 2 705 orthologous genes (~40 Mb) from Cetartiodactyla, Perissodactyla, and Carnivora were used, and different tree topologies were produced under different tree-building methods. When Chiroptera was added into the analysis, they found close relatedness of Perissodactyla with Carnivora, and of Cetartiodactyla with Chiroptera (Figure 1P, Table 1).

Hallström & Janke (2008) screened 3 012 genes (~280 kb) from four orders with available genome sequences (Eulipotyphla, Chiroptera, Carnivora, and Cetartiodactyla). Their results only recovered the basal placement of Eulipotyphla, and failed to resolve the relationships of the other three orders. The rapid radiation of Laurasiatheria within a narrow time scale was proposed to explain the lack of resolution despite such large amounts of data. By utilizing the third codon position GC content (GC3) of 1 138 protein-coding orthologous genes (~300 kb), Romiguier et al (2010) revealed the Laurasiatheria phylogeny to be (Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, Carnivora)))) (Figure 1M, Table 1).

Hallström et al (2011) analysed 4 775 protein-coding genes (~6 Mb) screened from the available genome sequences of five Laurasiatheria orders and determined the tree as (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, Carnivora)))) (Figure 1B, Table 1). Their results were more consistent with morphological and mt genome analyses as well as the nuclear analyses of Murphy et al (2001b). However, Pholidota was not included in the analyses. When the retroposon insertions from these genome sequences were analysed, their study supported the earliest divergence of Eulipotyphla, but failed to resolve the relationships among the other four orders. Additionally, McCormack et al (2012) analysed 683 loci of ultraconserved elements, supporting the tree topology (Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, Carnivora)))), which was consistent with Nishihara et al (2006) (Figure 1M, Table 1).

By screening protein-coding genes within genome sequence data and amplifying them in Laurasiatheria species, Zhou et al (2011a) analyzed 8 markers (~5 kb) and reconstructed the Laurasiatheria phylogeny as (Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, Carnivora)))) (Figure 1M, Table 1), consistent with the results from Nishihara et al (2006) and McCormack et al (2012). Subsequently, Zhou et al (2011b) investigated Laurasiatheria phylogeny based on 97 orthologous genes (~46 kb) from six orders. Regardless of the datasets and analytic methods used, they obtained an identical tree topology of (Eulipotyphla, (Chiroptera, ((Carnivora, Pholidota), (Cetartiodactyla, Perissodactyla)))) (Figure 1H, Table 1). This result was consistent with those from Murphy et al (2001a) and Waddell et al (2001).

Recently, Nery et al (2012) chose 3 733 orthologous genes (~50 Mb) obtained from available genome sequences of five orders, namely Eulipotyphla, Chiroptera, Carnivora, Cetartiodactyla, and Perissodactyla. They obtained a Laurasiatheria phylogeny of (Eulipotyphla, ((Chiroptera, Cetartiodactyla), (Perissodactyla, Carnivora))) with high support (Figure 1P, Table 1).

As the most comprehensive study so far, Dos Reis et al (2012) studied 11 available whole genomes of five Laurasiatheria orders (except Pholidota). Based on 857 nuclear genes and 153 mt genomes (~2 Gb), they reconstructed the tree topology as (Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, (Carnivora, Pholidota))))) (Figure 1M, Table 1), which was consistent with Nishihara et al (2006) and McCormack et al (2012).

Most recently, Song et al (2012) analyzed 447 orthologous genes (~7 Mb) and used multispecies coalescent model analysis to investigate the phylogenetic relationships among the five orders of Laurasiatheria

Zoological Research www.zoores.ac.cn Summary of Laurasiatheria (Mammalia) Phylogeny E71

(except Pholidota). Their results strongly favored (Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, Carnivora)))) (Figure 1B), supporting Waddell et al (1999a), Murphy et al (2001b), Arnason et al (2002), Arnason & Janke (2002), Amrine-Madsen et al (2003), Springer et al (2004), Wildman et al (2006), and Hallström et al (2011).

Perspectives

The phylogeny of Laurasiatheria, which is characterized by rapid species radiations and short internal tree branches, has long been one of the most controversial and challenging problems in mammalian systematics. So far, only the earliest divergence of Eulipotyphla and the close relationship between Carnivora and Pholidota have been generally accepted. The main controversies concern the phylogenetic positions of the other three orders, i.e., Chiroptera, Cetartiodactyla and Perissodactyla.

Most current investigations of Laurasiatheria phylogeny at the ordinal level, especially phylogenomic studies, have been conducted mainly on the five orders with available genome sequences. The lack of Pholidota genome sequences in those studies may be one of the reasons for the inconsistent phylogenetic relationships (Bashir et al, 2005; Dos Reis et al, 2012; Hallström et al, 2011; Hou et al, 2009; McCormack et al, 2012; Nery et al, 2012; Nishihara et al, 2006; Prasad et al, 2008; Romiguier et al, 2010; Song et al, 2012), given that the topology of phylogenetic trees can be biased by sampling errors, leading to unreliable and unstable estimates. In addition, we note that previous understanding of Laurasiatheria phylogeny largely depended on the analyses of coding DNA sequences, with only a few studies concerning other characters. Therefore, in future work, more complete sampling of the Laurasiatheria species and of the various characters in use would be expected to clarify the phylogenetic puzzles in Laurasiatheria.
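The parenthetical topologies quoted in this review can be compared mechanically. The following is a minimal sketch, not a method used by any of the cited studies: it parses four of the topologies quoted in the text into nested tuples, collects the clades each one implies, and asks which clades all four share. The function names (`parse_topology`, `clades`) and the choice of four studies are illustrative assumptions; no branch lengths or support values are handled.

```python
def parse_topology(text):
    """Parse a parenthetical topology such as '(A, (B, C))' into nested tuples."""
    s = "".join(text.split())  # taxon names used here contain no spaces
    pos = 0

    def node():
        nonlocal pos
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses in: " + text)
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in "(),":
            pos += 1
        if start == pos:
            raise ValueError("missing taxon name in: " + text)
        return s[start:pos]

    tree = node()
    if pos != len(s):
        raise ValueError("trailing characters in: " + text)
    return tree


def clades(tree):
    """Return every multi-taxon clade as a frozenset, excluding the root clade."""
    found = set()

    def leaves(n):
        if isinstance(n, str):
            return frozenset([n])
        members = frozenset().union(*(leaves(child) for child in n))
        found.add(members)
        return members

    found.discard(leaves(tree))  # the root contains all taxa and is uninformative
    return found


# Topologies copied from the text above (five-order trees only).
STUDIES = {
    "Nishihara et al (2006)": "(Eulipotyphla, (Cetartiodactyla, (Chiroptera, (Perissodactyla, Carnivora))))",
    "Hallström et al (2011)": "(Eulipotyphla, (Chiroptera, (Cetartiodactyla, (Perissodactyla, Carnivora))))",
    "Nery et al (2012)": "(Eulipotyphla, ((Chiroptera, Cetartiodactyla), (Perissodactyla, Carnivora)))",
    "Prasad et al (2008), coding": "(Eulipotyphla, (Carnivora, (Chiroptera, (Perissodactyla, Cetartiodactyla))))",
}

clade_sets = {name: clades(parse_topology(t)) for name, t in STUDIES.items()}

# Clades recovered by every study in the sample.
common = set.intersection(*clade_sets.values())

# Studies that recover Perissodactyla + Carnivora as sister groups.
target = frozenset({"Perissodactyla", "Carnivora"})
support = sorted(name for name, found in clade_sets.items() if target in found)

print("shared by all:", [sorted(c) for c in common])
print("Perissodactyla + Carnivora:", support)
```

On this small sample, the only clade shared by all four trees is the grouping of the four non-Eulipotyphla orders, which is another way of stating that only the basal position of Eulipotyphla is stable across studies.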

References

Amrine-Madsen H, Koepfli KP, Wayne RK, Springer MS. 2003. A new phylogenetic marker, apolipoprotein B, provides compelling evidence for eutherian relationships. Mol Phylogenet Evol, 28(2): 225-240.
Ao L, Mao X, Nie W, Gu X, Feng Q, Wang JH, Su WT, Wang YX, Volleth M, Yang FT. 2007. Karyotypic evolution and phylogenetic relationships in the order Chiroptera as revealed by G-banding comparison and chromosome painting. Chromosome Res, 15(3): 257-267.
Arnason U, Janke A. 2002. Mitogenomic analyses of eutherian relationships. Cytogenet Genome Res, 96(1-4): 20-32.
Arnason U, Adegoke JA, Bodin K, Born EW, Esa YB, Gullberg A, Nilsson M, Short RV, Xu XF, Janke A. 2002. Mammalian mitogenomic relationships and the root of the eutherian tree. Proc Natl Acad Sci USA, 99(12): 8151-8156.
Asher RJ. 2007. A web-database of mammalian morphology and a reanalysis of placental phylogeny. BMC Evol Biol, 7(1): 108-118.
Asher RJ, Helgen KM. 2010. Nomenclature and placental mammal phylogeny. BMC Evol Biol, 10: 102-111.
Asher RJ, Novacek MJ, Geisler JH. 2003. Relationships of endemic African mammals and their fossil relatives based on morphological and molecular evidence. J Mamm Evol, 10(1-2): 131-194.
Bargelloni L, Marcato S, Zane L, Patarnello T. 2000. Mitochondrial phylogeny of notothenioids: a molecular approach to Antarctic fish evolution and biogeography. Syst Biol, 49(1): 114-129.
Bashir A, Ye C, Price AL, Bafna V. 2005. Orthologous repeats and mammalian phylogenetic inference. Genome Res, 15(7): 998-1006.
Brito PH, Edwards SV. 2009. Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica, 135(3): 439-455.
Cao Y, Fujiwara M, Nikaido M, Okada N, Hasegawa M. 2000. Interordinal relationships and timescale of eutherian evolution as inferred from mitochondrial genome data. Gene, 259(1-2): 149-158.
Cao Y, Janke A, Waddell PJ, Westerman M, Takenaka O, Murata S, Okada N, Pääbo S, Hasegawa M. 1998. Conflict among individual mitochondrial proteins in resolving the phylogeny of eutherian orders. J Mol Evol, 47(3): 307-322.
Carter AM. 2001. Evolution of the placenta and fetal membranes seen in the light of molecular phylogenetics. Placenta, 22(10): 800-807.
Delsuc F, Brinkmann H, Philippe H. 2005. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet, 6(5): 361-375.
Dos Reis M, Inoue J, Hasegawa M, Asher RJ, Donoghue PCJ, Yang ZH. 2012. Phylogenomic datasets provide both precision and accuracy in estimating the timescale of placental mammal phylogeny. Proc R Soc B, 279(1742): 3491-3500.
Eizirik E, Murphy WJ, O'Brien SJ. 2001. Molecular dating and biogeography of the early placental mammal radiation. J Hered, 92(2): 212-219.
Frönicke L, Müller-Navia J, Romanakis K, Scherthan H. 1997. Chromosomal homeologies between human, harbor seal (Phoca vitulina) and the putative ancestral carnivore karyotype revealed by Zoo-FISH. Chromosoma, 106(2): 108-113.
Giannasi N, Thorpe RS, Malhotra A. 2001. The use of amplified fragment length polymorphism in determining species trees at fine taxonomic levels: analysis of a medically important snake, Trimeresurus albolabris. Mol Ecol, 10(2): 419-426.
Gibson A, Gowri-Shankar V, Higgs PG, Rattray M. 2005. A comprehensive analysis of mammalian mitochondrial genome base composition and improved phylogenetic methods. Mol Biol Evol, 22(2): 251-264.


Gunnell GF, Simmons NB. 2005. Fossil evidence and the origin of bats. J Mamm Evol, 12(1-2): 209-246.
Hallström BM, Janke A. 2008. Resolution among major placental mammal interordinal relationships with genome data imply that speciation influenced their earliest radiations. BMC Evol Biol, 8: 162-175.
Hallström BM, Janke A. 2010. Mammalian evolution may not be strictly bifurcating. Mol Biol Evol, 27(12): 2804-2816.
Hallström BM, Schneider A, Zoller S, Janke A. 2011. A genomic approach to examine the complex evolution of laurasiatherian mammals. PLoS One, 6(12): e28199.
Hasegawa M, Thorne JL, Kishino H. 2003. Time scale of eutherian evolution estimated without assuming a constant rate of molecular evolution. Genes Genet Syst, 78(4): 267-283.
Hou ZC, Romero R, Wildman DE. 2009. Phylogeny of the Ferungulata (Mammalia: Laurasiatheria) as determined from phylogenomic data. Mol Phylogenet Evol, 52(3): 660-664.
Hudelot C, Gowri-Shankar V, Jow H, Rattray M, Higgs PG. 2003. RNA-based phylogenetic methods: application to mammalian mitochondrial RNA sequences. Mol Phylogenet Evol, 28(2): 241-252.
Ji Q, Luo ZX, Yuan CX, Wible JR, Zhang JP, Georgi JA. 2002. The earliest known eutherian mammal. Nature, 416(6883): 816-822.
Johnson KP, Clayton DH. 2000. A molecular phylogeny of the dove genus Zenaida: mitochondrial and nuclear DNA sequences. Condor, 102(4): 864-870.
Jow H, Hudelot C, Rattray M, Higgs PG. 2002. Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution. Mol Biol Evol, 19(9): 1591-1601.
Kern AD, Kondrashov FA. 2004. Mechanisms and convergence of compensatory evolution in mammalian mitochondrial tRNAs. Nat Genet, 36(11): 1207-1212.
Kitazoe Y, Kishino H, Waddell PJ, Nakajima N, Okabayashi T, Watabe T, Okuhara Y. 2007. Robust time estimation reconciles views of the antiquity of placental mammals. PLoS One, 2(4): e384.
Kjer KM, Honeycutt RL. 2007. Site specific rates of mitochondrial genomes and the phylogeny of eutheria. BMC Evol Biol, 7: 8-17.
Kriegs JO, Churakov G, Kiefmann M, Jordan U, Brosius J, Schmitz J. 2006. Retroposed elements as archives for the evolutionary history of placental mammals. PLoS Biol, 4(4): e91.
Kulemzina AI, Trifonov VA, Perelman PL, Rubtsova NV, Volobuev V, Ferguson-Smith MA, Stanyon R, Yang FT, Graphodatsky AS. 2009. Cross-species chromosome painting in Cetartiodactyla: reconstructing the karyotype evolution in key phylogenetic lineages. Chromosome Res, 17(3): 419-436.
Kulemzina I, Biltueva L, Trifonov VA, Perelman PL, Staroselec YY, Beklemisheva VR, Vorobieva NV, Serdukova NA, Graphodatsky AS. 2010. Comparative cytogenetics of main Laurasiatheria taxa. Russ J Genet, 46(9): 1132-1137.
Kullberg M, Nilsson MA, Arnason U, Harley EH, Janke A. 2006. Housekeeping genes for phylogenetic analysis of eutherian relationships. Mol Biol Evol, 23(8): 1493-1503.
Lin YH, McLenachan PA, Gore AR, Phillips MJ, Ota R, Hendy MD, Penny D. 2002. Four new mitochondrial genomes and the increased stability of evolutionary trees of mammals from improved taxon sampling. Mol Biol Evol, 19(12): 2060-2070.
Luo ZX, Yuan CX, Meng QJ, Ji Q. 2011. A Jurassic eutherian mammal and divergence of marsupials and placentals. Nature, 476(7361): 442-445.
Madsen O, Willemsen D, Ursing BM, Arnason U, de Jong WW. 2002. Molecular evolution of the mammalian alpha 2B adrenergic receptor. Mol Biol Evol, 19(12): 2150-2160.
Madsen O, Scally M, Douady CJ, Kao DJ, DeBry RW, Adkins R, Amrine HM, Stanhope MJ, de Jong WW, Springer MS. 2001. Parallel adaptive radiations in two major clades of placental mammals. Nature, 409(6820): 610-614.
Matthee CA, Eick G, Willows-Munro S, Montgelard C, Pardini AT, Robinson TJ. 2007. Indel evolution of mammalian introns and the utility of non-coding nuclear markers in eutherian phylogenetics. Mol Phylogenet Evol, 42(3): 827-837.
McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC. 2012. Ultraconserved elements are novel phylogenomic markers that resolve placental mammal phylogeny when combined with species-tree analysis. Genome Res, 22(10): 746-754.
Meredith RW, Janecka JE, Gatesy J, Ryder OA, Fisher CA, Teeling EC, Goodbla A, Eizirik E, Simão TLL, Stadler T, Rabosky DL, Honeycutt RL, Flynn JJ, Ingram CM, Steiner C, Williams TL, Robinson TJ, Burk-Herrick A, Westerman M, Ayoub NA, Springer MS, Murphy WJ. 2011. Impacts of the cretaceous terrestrial revolution and KPg extinction on mammal diversification. Science, 334(6055): 521-524.
Meslin C, Brimau F, Meillour PNL, Callebaut I, Pascal G, Monget P. 2011. The evolutionary history of the SAL1 gene family in eutherian mammals. BMC Evol Biol, 11: 148-162.
Moore WS. 1995. Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees. Evolution, 49(4): 718-726.
Mouchaty SK, Gullberg A, Janke A, Arnason U. 2000. The phylogenetic position of the talpidae within eutheria based on analysis of complete mitochondrial sequences. Mol Biol Evol, 17(1): 60-67.
Murphy WJ, Pevzner PA, O'Brien SJ. 2004. Mammalian phylogenomics comes of age. Trends Genet, 20(12): 631-639.
Murphy WJ, Pringle TH, Crider TA, Springer MS. 2007. Using genomic data to unravel the root of the placental mammal phylogeny. Genome Res, 17(4): 413-421.
Murphy WJ, Eizirik E, Johnson WE, Zhang YP, Ryder OA, O'Brien SJ. 2001a. Molecular phylogenetics and the origins of placental mammals. Nature, 409(6820): 614-618.
Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS. 2001b. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science, 294(5550): 2348-2351.
Nery MF, González DJ, Hoffmann FG, Opazo JC. 2012. Resolution of the laurasiatherian phylogeny: evidence from genomic data. Mol Phylogenet Evol, 64(3): 685-689.
Nie W, Wang J, Su W, Wang D, Tanomtong A, Perelman PL, Graphodatsky AS, Yang F. 2011. Chromosomal rearrangements and karyotype evolution in carnivores revealed by chromosome painting. Heredity, 108(1): 17-27.
Nikaido M, Kawai K, Harada M, Tomita S, Okada N, Hasegawa M. 2001. Maximum likelihood analysis of the complete mitochondrial genomes of eutherians and a reevaluation of the phylogeny of bats and insectivores. J Mol Evol, 53(4-5): 508-516.
Nikolaev S, Montoya-Burgos JI, Margulies EH, NISC Comparative Sequencing Program, Rougemont J, Nyffeler B, Antonarakis SE. 2007. Early history of mammals is elucidated with the ENCODE multiple species sequencing data. PLoS Genet, 3(1): e2.
Nishihara H, Hasegawa M, Okada N. 2006. Pegasoferae, an unexpected mammalian clade revealed by tracking ancient retroposon insertions. Proc Natl Acad Sci USA, 103(26): 9929-9934.
Novacek MJ. 1992. Mammalian phylogeny: shaking the tree. Nature, 356(6365): 121-125.
Novacek MJ. 2001. Mammalian phylogeny: genes and supertrees. Curr Biol, 11(14): 573-575.
Page RDM. 2000. Extracting species trees from complex gene trees: reconciled trees and vertebrate phylogeny. Mol Phylogenet Evol, 14(1): 89-106.
Phillips MJ, Penny D. 2003. The root of the mammalian tree inferred from whole mitochondrial genomes. Mol Phylogenet Evol, 28(2): 171-185.
Prasad AB, Allard MW, Green ED. 2008. Confirming the phylogeny of mammals by use of large comparative sequence data sets. Mol Biol Evol, 25(9): 1795-1808.
Pumo DE, Finamore PS, Franek WR, Phillips CJ, Tarzami S, Balzarano D. 1998. Complete mitochondrial genome of a neotropical fruit bat, Artibeus jamaicensis, and a new hypothesis of the relationships of bats to other eutherian mammals. J Mol Evol, 47(6): 709-717.
Reyes A, Gissi C, Catzeflis F, Nevo E, Pesole G, Saccone C. 2004. Congruent mammalian trees from mitochondrial and nuclear genes using Bayesian methods. Mol Biol Evol, 21(2): 397-403.
Rokas A, Chatzimanolis S. 2008. From gene-scale to genome-scale phylogenetics: the data flood in, but the challenges remain. Methods Mol Biol, 422: 1-12.
Romiguier J, Ranwez V, Douzery EJP, Galtier N. 2010. Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes. Genome Res, 20(8): 1001-1009.
Sánchez-Gracia A, Castresana J. 2012. Impact of deep coalescence on the reliability of species tree inference from different types of DNA markers in mammals. PLoS One, 7(1): e30239.
Shoshani J, McKenna MC. 1998. Higher taxonomic relationships among extant mammals based on morphology, with selected comparisons of results from molecular data. Mol Phylogenet Evol, 9(3): 572-584.
Simmons NB, Geisler JH. 1998. Phylogenetic relationships of Icaronycteris, Archaeonycteris, Hassianycteris, and Palaeochiropteryx to extant bat lineages, with comments on the evolution of echolocation and foraging strategies in Microchiroptera. Bull Amer Mus Nat Hist, 235: 1-182.
Simpson GG. 1945. The principles of classification and a classification of mammals. Bull Amer Mus Nat Hist, 85: 1-350.
Song S, Liu L, Edwards SV, Wu SY. 2012. Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc Natl Acad Sci USA, 109(37): 14942-14947.
Sotero-Caio CG, Pieczarka JC, Nagamachi CY, Gomes AJB, Lira TC, O'Brien PCM, Ferguson-Smith MA, Souza MJ, Santos N. 2011. Chromosomal homologies among vampire bats revealed by chromosome painting (Phyllostomidae, Chiroptera). Cytogenet Genome Res, 132(3): 156-164.
Springer MS, Murphy WJ. 2007. Mammalian evolution and biomedicine: new views from phylogeny. Biol Rev Camb Philos Soc, 82(3): 375-392.
Springer MS, Amrine HM, Burk A, Stanhope MJ. 1999. Additional support for Afrotheria and Paenungulata, the performance of mitochondrial versus nuclear genes, and the impact of data partitions with heterogeneous base composition. Syst Biol, 48(1): 65-75.
Springer MS, Stanhope MJ, Madsen O, de Jong WW. 2004. Molecules consolidate the placental mammal tree. Trends Ecol Evol, 19(8): 430-438.
Springer MS, Meredith RW, Janecka JE, Murphy WJ. 2011. The historical biogeography of Mammalia. Philos Trans R Soc Lond B Biol Sci, 366(1577): 2478-2502.
Springer MS, Burk-Herrick A, Meredith R, Eizirik E, Teeling E, O'Brien SJ, Murphy WJ. 2007. The adequacy of morphology for reconstructing the early history of placental mammals. Syst Biol, 56(4): 673-684.
Springer MS, DeBry RW, Douady C, Amrine HM, Madsen O, de Jong WW, Stanhope MJ. 2001. Mitochondrial versus nuclear gene sequences in deep-level mammalian phylogeny reconstruction. Mol Biol Evol, 18(2): 132-143.
Sturmbauer C, Meyer A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in eastern Africa. Mol Biol Evol, 10(4): 751-768.
Trifonov VA, Stanyon R, Nesterenko AI, Fu BY, Perelman PL, O'Brien PCM, Stone G, Rubtsova NV, Houck ML, Robinson TJ, Ferguson-Smith MA, Dobigny G, Graphodatsky AS, Yang FT. 2008. Multidirectional cross-species painting illuminates the history of karyotypic evolution in Perissodactyla. Chromosome Res, 16(1): 89-107.
Waddell PJ, Shelley S. 2003. Evaluating placental inter-ordinal phylogenies with novel sequences including RAG1, γ-fibrinogen, ND6, and mt-tRNA, plus MCMC-driven nucleotide, amino acid, and codon models. Mol Phylogenet Evol, 28(2): 197-224.
Waddell PJ, Okada N, Hasegawa M. 1999a. Towards resolving the interordinal relationships of placental mammals. Syst Biol, 48(1): 1-5.
Waddell PJ, Kishino H, Ota R. 2001. A phylogenetic foundation for comparative mammalian genomics. Genome Inform, 12: 141-154.
Waddell PJ, Cao Y, Hauf J, Hasegawa M. 1999b. Using novel phylogenetic methods to evaluate mammalian mtDNA, including amino acid-invariant sites-LogDet plus site stripping, to detect internal conflicts in the data, with special reference to the positions of hedgehog, armadillo, and elephant. Syst Biol, 48(1): 31-53.
Wible JR, Rougier GW, Novacek MJ, Asher RJ. 2007. Cretaceous eutherians and Laurasian origin for placental mammals near the K/T boundary. Nature, 447(7147): 1003-1006.
Wible JR, Rougier GW, Novacek MJ, Asher RJ. 2009. The eutherian mammal Maelestes gobiensis from the Late Cretaceous of Mongolia and the phylogeny of Cretaceous Eutheria. Bio One, 327: 1-123.
Wildman DE, Chen CY, Erez O, Grossman LI, Goodman M, Romero R. 2006. Evolution of the mammalian placenta revealed by phylogenetic analysis. Proc Natl Acad Sci USA, 103(9): 3203-3208.
Wilson DE, Reeder DM. 2005. Mammal Species of the World: A Taxonomic and Geographic Reference. 3rd ed. Baltimore: Johns Hopkins University Press.
Wu YC, Rasmussen MD, Kellis M. 2012. Evolution at the subgene level: domain rearrangements in the Drosophila phylogeny. Mol Biol Evol, 29(2): 689-705.
Yang FT, Graphodatsky AS. 2004. Integrated comparative genome maps and their implications for karyotype evolution of carnivores. Chromosomes Today, 14: 215-244.
Yang FT, Graphodatsky AS, Li TL, Fu BY, Dobigny G, Wang JH, Perelman PL, Serdukova NA, Su WT, O'Brien PCM, Wang YX, Ferguson-Smith MA, Volobouev V, Nie WH. 2006. Comparative genome maps of the pangolin, hedgehog, sloth, anteater and human revealed by cross-species chromosome painting: further insight into the ancestral karyotype and genome evolution of eutherian mammals. Chromosome Res, 14(3): 283-296.
Ye JP, Biltueva L, Huang L, Nie WH, Wang JH, Jing MD, Su WT, Volobouev V, Jiang XL, Graphodatsky AS, Yang FT. 2006. Cross-species chromosome painting unveils cytogenetic signatures for the Eulipotyphla and evidence for the polyphyly of Insectivora. Chromosome Res, 14(2): 151-159.
Yu L, Zhang YP. 2006a. Phylogenomics-An attractive avenue to reconstruct "Tree of Life". Genitics, 28(11): 1445-1450. (in Chinese)
Yu L, Zhang YP. 2006b. Summary of phylogeny in mammalian order carnivora. Zool Res, 27(6): 657-665. (in Chinese)
Yu L, Li YW, Ryder OA, Zhang YP. 2004. Phylogeny of the bears (Ursidae) based on nuclear and mitochondrial genes. Mol Phylogenet Evol, 32(2): 480-494.
Yu L, Li YW, Ryder OA, Zhang YP. 2007. Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation. BMC Evol Biol, 7: 198-209.
Yu L, Luan PT, Jin W, Ryder OA, Chemnick LG, Davis HA, Zhang YP. 2011. Phylogenetic utility of nuclear introns in interfamilial relationships of Caniformia (order Carnivora). Syst Biol, 60(2): 175-187.
Zhou XM, Yang G. 2010. A review on the progress in mammalian phylogenomics. Acta Theriol Sin, 30(3): 339-345. (in Chinese)
Zhou XM, Xu SX, Zhang P, Yang G. 2011a. Developing a series of conservative anchor markers and their application to phylogenomics of Laurasiatherian mammals. Mol Ecol Resour, 11(1): 134-140.
Zhou XM, Xu SX, Xu JX, Chen BY, Zhou KY, Yang G. 2011b. Phylogenomic analysis resolves the interordinal relationships and rapid diversification of the laurasiatherian mammals. Syst Biol, 61(1): 150-164.

Zoological Research 33 (E5−6): E75−E81 doi: 10.3724/SP.J.1141.2012.E05-06E75

Evolution of neurotransmitter gamma-aminobutyric acid, glutamate and their receptors

Zhiheng GOU1,2,#, Xiao WANG1,2,#, Wen WANG1,*

1. CAS-Max Planck Junior Research Group, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, the Chinese Academy of Sciences (CAS), Kunming 650223, China
2. Graduate School of the Chinese Academy of Sciences, Beijing 100049, China

Abstract: Gamma-aminobutyric acid (GABA) and glutamate are two important amino acid neurotransmitters widely present in the nervous systems of mammals, insects, round worms, and platyhelminths, while their receptors are quite diversified across different animal phyla. However, the evolutionary mechanisms linking the two conserved neurotransmitters and their diversified receptors remain elusive, and antagonistic interactions between the GABA and glutamate signal transduction systems, in particular, have begun to attract significant attention. In this review, we summarize the extant results on the origin and evolution of GABA and glutamate, as well as their receptors, and analyze possible evolutionary processes and phylogenetic relationships of various GABA and glutamate receptors. We further discuss the evolutionary history of Excitatory/Neutral Amino Acid Transporter (EAAT), a transport protein that plays an important role in the GABA-glutamate “yin and yang” balanced regulation. Finally, based on current advances, we propose several potential directions of future research.

Keywords: Gamma-aminobutyric acid; Glutamate; Neurotransmitter; Receptor; Evolution; Yin and yang regulation

Received: 20 September 2012; Accepted: 20 November 2012
# These authors contributed equally to this work
* Corresponding author, E-mail: [email protected]

Amino acid neurotransmitters play significant roles in animal nervous systems. Different amino acid neurotransmitters have different functions in nervous systems. Interestingly, in mammals GABA and glutamate coexist as a pair of antagonistic neurotransmitters, with glutamate functioning as an activator and GABA as a repressor. So far, no comprehensive analysis has been conducted in relation to the origin and evolution of the “yin and yang” balanced relationship between GABA and glutamate. Here we present a scenario for the evolution of these two antagonistic amino acid neurotransmitters and their receptors through comprehensive analysis of the literature and related data.

GABA and glutamate as neurotransmitters exist in early animal lineages

Gamma-aminobutyric acid is well known as the main inhibitory neurotransmitter in the brains of mammals (DeFeudis, 1975). Such a function appears ancestrally derived, as evidence shows that GABA exerts similar inhibitory functions in the nervous system of even the most primitive animals. For example, in the hydrozoan Hydra vulgaris, which has the simplest and most archaic nervous system, GABA decreases the outputs of impulse generating systems, including ectodermal body contraction bursts (CBs), the number of pulses in a burst (P/CB), and endodermal rhythmic potentials (RPs) (Kass-Simon et al, 2003). Consistently, GABA has inhibitory effects in some Platyhelminthes such as the flatworm Gastrothylax crumenifer, which evolved the first central nervous system (Verma et al, 2010). In the phylogenetically advanced nematodes, GABA becomes the main inhibitory neurotransmitter (McIntire et al, 1993). There are some exceptions in particular species such as Cancer borealis, in which the stomach cells can produce an excitatory reaction after GABA binding (Swensen et al, 2000). In contrast to animals, GABA in plants plays a role in various physiological processes such as the tricarboxylic acid cycle, nitrogen metabolism, insect defense, oxidative

Science Press Volume 33 Issues E5−6

stress defense, osmoregulation, and signal transmission, which are extensively reviewed elsewhere (Bouché & Fromm, 2004).

Glutamate is the most abundant excitatory neurotransmitter in the brain of vertebrates (Nakanishi, 1992), and almost one third of central nervous system synapses contain glutamate (Fiorillo & Williams, 1998). Most glutamate in the brain comes from the glutamine and Krebs cycles. Like GABA, glutamate has been proposed to play roles in the nervous system of the most primitive animals. Antagonistic to GABA, glutamate increases the outputs of ectodermal and endodermal impulse generating systems in the phylogenetically basal hydrozoan Hydra vulgaris (Kass-Simon et al, 2003), and acts as an excitatory neurotransmitter in cestode and aplysiidae nervous systems (Davis & Stretton, 1995; Verma et al, 2010; Gerschenfeld, 1973; Kehoe & Marder, 1976; Segerberg & Stretton, 1993). Glutamate is the main excitatory neurotransmitter in the nervous systems of nematodes and fruit fly (Devaud et al, 2008). However, glutamate can also have an inhibitory function through some special receptors in nematode and fruit fly (Schuske et al, 2004).

Both GABA and glutamate have existed since the earliest stage (coelenterate) of animal evolution, and they usually function antagonistically. As will be discussed below, the structural and functional differentiation between GABA and glutamate receptors during evolution has given rise to diverse receptors with more specialized functions.

Evolution of GABA and glutamate receptors

The GABA receptors, which recognize and bind GABA, are located in the postsynaptic membrane. During binding, ion permeability of the membrane changes (Lü & Tian, 1991). There are three types of known GABA receptors: GABAA, GABAB and GABAC. GABAA is the most common inhibitory neurotransmitter receptor in the brain and is a kind of ligand-gated chloride channel. GABAB is a G protein-coupled 7-transmembrane receptor, which can activate second messenger systems and K+ and Ca2+ channels. It is a heterodimer composed of two homologous subunits, GABAB1 and GABAB2 (White et al, 1998). It is believed that GABAB and glutamate receptors share a common ancestor, as discussed below. The GABAC receptor, composed of GABAρ subunits, is a newly discovered GABAA-like receptor, and is also a kind of ligand-gated chloride channel (Chebib & Johnston, 1999). Structurally, the ρ subunit is part of the family of GABAA receptor subunits (Shimada et al, 1992), and thus GABAC is usually regarded as a special GABAA (Barnard et al, 1998). The phylogenetic relationship between the ρ subunit and the GABAA receptor subunits is included in Figure 2.

Evolution of GABAA receptor genes

The GABAA receptor belongs to the inhibitory ligand-gated ion channel family. This family also includes the glycine receptor, another amino acid neurotransmitter receptor, suggested to be derived from the GABAA receptor family (Vassilatis et al, 1997). In invertebrates such as nematodes, ascarids, and fruit flies, GABAA receptor-like genes exist in pairs (Hosie et al, 1997; Martínez-Delgado et al, 2010), encoding class I and II subtypes, which are homologues of the α and β subunits of the vertebrate GABAA receptors, respectively. In contrast, GABAA receptor genes exist as clusters in vertebrates (Tsang et al, 2007), suggesting a gene family expansion in vertebrates. The cluster mode of GABAA receptors in vertebrates very likely originated after the divergence between vertebrates and invertebrates, and the common ancestor of the coupled GABAA genes may be the α-β like genes of bilaterians (Tsang et al, 2007). There are at least 16 GABAA receptor subunits (α1–α6, β1–β3, γ1–γ3, δ, ε, π and θ) in the central nervous system of vertebrates (Wilke et al, 1997). These subunits are classified into three types according to amino acid sequence similarity: α1–α6; β1–β3 and θ (β-like); and γ1–γ3 and ε (γ-like). Generally, the GABAA receptor is composed of two α, two β, and one γ subunits (Chang et al, 1996; Tretter et al, 1997). Human GABAA receptor genes are distributed in four gene clusters on four chromosomes (4, 5, 15, X) (Darlison et al, 2005). Based on the similarities among gene clusters, Darlison et al (2005) proposed that the different GABAA receptor subunits in vertebrates may have originated from gene duplications in the vertebrate ancestor (Figure 1). Duplication probably occurred during the two whole genome duplications (WGD) in the evolution of vertebrates. The positionally corresponding genes in these gene clusters have similar functions, which also supports the duplication scenario (Darlison et al, 2005). Within each gene cluster, some genes experienced further individual gene duplications (Figure 1).

Figure 1 Evolution model of human GABAA receptor. The figure is modified from Darlison et al (2005). Every gene cluster contained three kinds of subunits (γ, α, and β). ε is γ-like and θ is β-like. The two WGD events occurred about 500 and 430 million years ago, respectively. The direction of the arrows below the subunits shows the transcriptional direction.

Zoological Research www.zoores.ac.cn Evolution of neurotransmitter gamma-aminobutyric acid, glutamate and their receptors E77

In Drosophila, three kinds of ligand-gated ion channel subunit genes were cloned: resistance to dieldrin (RDL), GABA and glycine-like receptor of Drosophila (GRD) and ligand-gated chloride channel homologue 3 (LCCH3), which show sequence similarity to the GABAA receptor subunits in vertebrates (Hosie et al, 1997; Martínez-Delgado et al, 2010). In total, Drosophila has 12 functionally differentiated GABAA receptor-like genes, which are scattered in different genomic regions and not present as gene clusters. The ascarids have 39 GABAA receptor-like genes resulting from single gene duplication and subsequent subfunctionalization (Tsang et al, 2007). The phylogenetic tree in Figure 2 shows the relationship between insect (Drosophila) GABAA receptor-like genes and vertebrate ligand-gated receptor subunits: GRD is close to the GABAA receptor α subunit, LCCH3 is close to the GABAA β subunit, and RDL is close to vertebrate GlyR.

Figure 2 Phylogenetic tree constructed by Neighbor-Joining via MEGA version 5.0 with Bootstrap Test (1 000 replications) demonstrating the similarities among the known insect (Drosophila) GABA receptor subunits and vertebrate ligand-gated anion-channels The phylogenetic tree is based on similarities in the amino acid sequences. Vertebrate GABA receptor subunits are marked as α, β, etc., GLY refers to GlyR subunits while Glu-Cl refers to glutamate-gated chloride-channels from Haemonchus contortus (Hosie, et al, 1997, Martínez-Delgado, et al, 2010). GABA alpha-1: P62813.1, GABA alpha-2: P26048.1, GABA alpha-3: P26049.1, GABA alpha-4: AAB46866.1, GABA alpha-5: P19969.1, GABA alpha-6: P16305.3, GABA beta-1: P18505.2, GABA beta-2: P47870.2, GABA beta-3: P63079.1, GABA gamma-1: NP_034382.2, GABA gamma-2: P18507.2, GABA gamma-3: P28473.1, GABA gamma-4: P34904.1, GABA rho-1: NP_002033.2, GABA rho-2: NP_002034.2, GABA rho-3: NP_001099050.1, glycine alpha-1 Homo sapiens: NP_001139512.1, glycine alpha-2 Rattus norvegicus: NP_036700.1, glycine, alpha-3 Homo sapiens: NP_006520.2, glycine beta Homo sapiens: NP_001159532.1, Glu-Cl Haemonchus contortus: P91730.1, GRD Drosophila melanogaster: Q24352.1, LCCH3 Drosophila melanogaster: AAS65370.1, RDL Drosophila melanogaster: AAF50311.1
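
The caption above states that the tree is based on similarities in amino acid sequences. As a minimal sketch of the kind of pairwise measure a distance-based method such as Neighbor-Joining starts from, the Python below computes a simple p-distance (proportion of differing residues) between aligned sequences. It is not the MEGA workflow used for Figure 2; the function name, the gap handling and the toy sequences are all illustrative assumptions, not real receptor sequences.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues between two aligned sequences.

    Alignment columns where either sequence has a gap ('-') are skipped
    (an assumption; other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "subunit_A": "MKTLLVAGCW",
    "subunit_B": "MKTLIVAGCW",
    "subunit_C": "MRSL-VSGCF",
}

# Pairwise distance table keyed by (name_1, name_2).
distances = {
    (x, y): p_distance(toy_alignment[x], toy_alignment[y])
    for x, y in combinations(toy_alignment, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

A table like `distances` is the input a Neighbor-Joining routine would cluster; bootstrap support would then come from repeating the calculation on resampled alignment columns.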

Other ion channels evolved from GABAA ligand-gated ion channel

As mentioned above, the vertebrate ligand-gated ion channel, GlyCl- receptor, evolved from the GABAA receptor family (Vassilatis et al, 1997). It is interesting to note that the invertebrate-specific glutamate receptor-like protein, GluCl-, is also derived from the GABAA receptor family (Ortells & Lunt, 1995). In invertebrates, no GlyR has been identified (Tsang et al, 2007), and invertebrate GluCl- may be homologous to vertebrate GlyCl- because they both have a special cystine ring structure (Vassilatis et al, 1997). Invertebrate-specific GluCl- will be discussed in the context of glutamate receptors below.

Interestingly, although GABA is a main inhibitory neurotransmitter in nematodes, both an inhibitory GABA receptor transporting Cl- and an excitatory GABA receptor transporting Na+ exist in C. elegans, and thus GABA plays an excitatory role under some circumstances (Schuske et al, 2004; Yates et al, 2003). Beg & Jorgensen (2003) cloned the excitatory GABA receptor gene exp-1 from C. elegans. By sequence analysis they found that protein EXP-1 had the closest relationship with the inhibitory GABA receptor. Therefore, it is unlikely that the GABA ligand-gated cation channels evolved from cation channels by mutations, but rather from mutations of the GABA ligand-gated anion channels (Beg & Jorgensen, 2003). Whether other species also have excitatory GABA receptors needs further investigation.

Evolution of glutamate receptors

There are two types of glutamate receptors: the ionotropic receptor (iGluR) and metabotropic receptor (mGluR) (Xie et al, 2003; Yano et al, 1998). When stimulated by glutamate, iGluR usually directly regulates cation influx, while mGluR promotes the second messenger G-protein, which leads to increased permeability of Na+ and K+ on membranes and membrane potential depolarization. Ribeiro et al (2005) showed that S. mansoni has the homologs of mammal mGlu receptor-like genes and iGluRs. The mGluRs have been reported in more primitive metazoa; for example, mGluRs can be part of the sensory system in spongia (Müller, 1998).

Evolution of iGluRs

The iGluRs are tetramer type or pentamer type proteins, including N-methyl-D-aspartate (NMDA) type and non-NMDA type. The NMDA receptors mainly regulate the flow of Na+ and K+ by ligand-gated form, while non-NMDA receptors mainly use voltage-sensitive channels to control Na+ influx depolarizing current. Interestingly, there is a kind of glutamate receptor in plants, whose sequence is highly similar to iGluR in animals (Chiu et al, 1999). The glutamate receptors in plants and animals share the same domains, including two ligand binding domains and four transmembrane domains (M1-M4) (Chiu et al, 1999). The highest identity in these domains is 60%, discovered in the M3 domain. These facts suggest that the core function of iGluRs could have existed before the division of animals and plants (Chiu et al, 1999). Chiu et al (1999) also found that the primitive iGluR receptors in plants do not function as ion channels, but depend on interaction with other proteins. Therefore, the real iGluR ion channel should appear after the division of animals and plants, especially after the division of NMDA and non-NMDA receptors (Ryan & Grant, 2009).

Animal iGluRs are usually excitatory glutamate ligand-gated cation channels, which mainly exist on the nerve joints (Cully et al, 1996). As mentioned above, however, some special inhibitory iGluRs (GluCl-) are found in invertebrates, such as the nematodes and fruit fly (Schuske et al, 2004). It is believed that the GluCl- receptors evolved from GABA ligand-gated chloride channels (Chiu et al, 1999). These receptors mainly exist in muscle and neuronal cell bodies, where they can inhibit muscle excitation caused by glutamate ligand-gated cation receptors (Brockie et al, 2001). How the GluCl- inhibitory receptor and its regulatory system evolved and why they are absent in vertebrates remain unclear.

Evolution of mGluRs, which are homologous to GABAB receptors

There are eight kinds of mGluRs, namely mGluR1-mGluR8 (Xie et al, 2003). Cao et al (2009) reported that GABAB receptors and mGlu receptors have the same ancestor based on sequence and structure similarity. They may have evolved from the periplasmic amino acid binding proteins (PBPs) of bacteria (Cao et al, 2009). The phylogenetic tree in Figure 3 shows the relationship among mGlu receptors (mGluR1-mGluR8), GABAB receptors (GABAB1 and GABAB2), and their ancestor homolog leucine-binding protein (LBP), a kind of PBP. The GABAB receptor can be found in Drosophila (Mezler et al, 2001), nematodes (Bargmann, 1998), and hydra (Kass-Simon et al, 2003; Kass-Simon & Scappaticci, 2004), suggesting that GABAB receptors may have existed before these species diverged. In the more primitive animal phylum porifera there is a protein highly homologous to the GABABR1 receptor and mGluR5 receptor in rats. After analyzing the function of this protein, Perovic et al (1999) found that the mGlu/GABAB-like proteins may even exist in porifera, in which the GABAB and mGlu receptors have not been fully differentiated.

GABA-glutamate balancing in vivo and evolution of glutamate transport proteins

Several “yin and yang” regulation mechanisms between GABA and glutamate exist in the central nervous systems of mammals. The NMDA receptor, a subtype of glutamate receptor that exists in parvalbumin-positive neurons, can monitor strong excitatory signals and then stimulate inhibitory neurons (Gordon, 2010). When the excitatory signal becomes weak, NMDA receptors cease monitoring the excitatory signal. Therefore, the NMDA receptor functions by balancing stimulatory and inhibitory signals, although the intrinsic mechanism is unknown (Gordon, 2010). Moreover, a regulatory network exists in which GABAB receptors and glutamate receptors interact with each other. For example, in the spines of hippocampal pyramidal neurons, NMDA receptor activation leads to phosphorylation of GABAB, which causes the degradation


of GABAB, finally reducing the inhibitory signal of GABAB. On the other hand, at the parallel fibre-Purkinje cell synapse, GABAB receptor activation enhances mGluR1a signal transduction (Gassmann & Bettler, 2012).

Figure 3 Phylogenetic tree constructed with Neighbor-Joining via MEGA version 5.0 with Bootstrap Test (1 000 replications) demonstrating that mGluR and GABAB receptors may have a common ancestor. The phylogenetic tree is based on the similarities of amino acid sequences. The numbers after each gene are GenBank accession numbers.

Glutamate and GABA transport proteins also participate in the balanced regulation between the two neurotransmitters. There are two transport protein families in rodents, namely Excitatory/Neutral Amino Acid Transporter (EAAT) 1/2 and GABA transporter (GAT) 2/3, which transport glutamate and GABA, respectively. The EAAT 1/2 transfers glutamate from extracellular to intracellular together with three Na+ and one H+ while transporting one K+ in the opposite direction (Héja et al, 2009). Therefore, the co-existence of GAT with EAAT in glial cells can reverse GAT by high concentration Na+, and release GABA from glial cells to inhibit neurons. In summary, with the help of EAAT and GAT, excitatory glutamate can cause the release of GABA, which displays inhibitory effects (Héja et al, 2009). However, the mechanism underlying the regulation of GABA on glutamate is still unclear.

Gesemann et al (2010) analyzed sequence variation of the EAAT family in vertebrates. There are seven EAAT genes in humans and mice, and nine to thirteen genes in prototheria, sauropsida, and amphibia. These differences between species are mainly due to different genome duplication and gene loss history. Whole genome duplications occurred 500 million years ago (mya) in vertebrates, while an additional duplication occurred in teleosts about 350 to 320 mya. Thus, zebrafish have thirteen EAAT genes. Species-specific gene loss also determines the kinds and number of EAATs in a certain species. For example, all theria lost EAAT6 and EAAT7, lepidosauromorpha retained EAAT7 but lost EAAT6, and archosauromorpha lost EAAT7 but retained EAAT6. It is not clear whether glutamate-GABA balance exists in invertebrates or not, although EAAT genes have been cloned from invertebrates (Gesemann et al, 2010).

Perspective

GABA and glutamate act as inhibitory and excitatory neurotransmitters, respectively, although they display opposite functions in some special species or in some special tissues, such as GABA in nematodes and glutamate in fruit fly and nematodes. We found that the exceptional cases mainly appeared in invertebrates. Their diverse functions rely on the refined evolution of their receptors, which turn out to be multiform and more elaborate. Unlike vertebrates, in which there is a GABA-glutamate balance system mediated by GABA and glutamate transport proteins, in invertebrates the balance may be regulated by those specialized receptors. However, the regulatory mechanism and its evolutionary destination remain an open question. One particularly interesting example is the invertebrate-specific GluCl- receptor, which evolved from the GABA receptor and functions as an exceptionally inhibitory glutamate receptor. How this receptor evolved and why its function seems lost during the emergence of vertebrates is unclear. Future exploration addressing this question will shed light on the evolution of the GABA-glutamate balance system. Similarly, another interesting issue comes from the phylogenetic similarity between mGluRs and GABAB receptors. Some evidence shows that the GABAB receptor regulates mGluRs in the GABA-glutamate balance system, and that mGlu/GABAB-like proteins may have even existed in early organisms when the GABAB and mGlu receptors were not fully differentiated. How and when the two ancient paralogous genes functionally diverged remain unanswered, as does the evolutionary significance of this functional divergence.


How glutamate and GABA quickly turn over the excitatory and inhibitory effects in animals has long been an enigma. Recently, research on the balance between glutamate and GABA has focused on the EAAT and GAT transporter system in mammals. But there are still many uncertainties in this balance system. Firstly, while evidence indicates how glutamate causes the release of GABA, few studies address the opposite process. Secondly, our review shows the evolution of EAAT, but little is known about the GAT. Furthermore, following the point mentioned above, whether invertebrates have an EAAT-GAT transporter system still awaits further investigation. Finally, the interaction between GABA and glutamate is not only a kind of transmitter exchange via transport proteins, but also a complex regulating network regulated by their receptors. Taking advantage of the current “omics” data, the pathway and even regulatory network of the GABA-glutamate “yin and yang” balanced regulation system should be clarified in the near future.

References

Bargmann CI. 1998. Neurobiology of the Caenorhabditis elegans genome. Science, 282(5396): 2028-2033.

Barnard E, Skolnick P, Olsen R, Mohler H, Sieghart W, Biggio G, Braestrup C, Bateson A, Langer S. 1998. International Union of Pharmacology. XV. Subtypes of γ-aminobutyric acidA receptors: classification on the basis of subunit structure and receptor function. Pharmacol Rev, 50(2): 291-314.

Beg AA, Jorgensen EM. 2003. EXP-1 is an excitatory GABA-gated cation channel. Nat Neurosci, 6(11): 1145-1152.

Bouché N, Fromm H. 2004. GABA in plants: just a metabolite? Trends Plant Sci, 9(3): 110-115.

Brockie PJ, Madsen DM, Zheng Y, Mellem J, Maricq AV. 2001. Differential expression of glutamate receptor subunits in the nervous system of Caenorhabditis elegans and their regulation by the homeodomain protein UNC-42. J Neurosci, 21(5): 1510-1522.

Cao JH, Huang SL, Qian J, Huang JL, Jin LX, Su Z, Yang J, Liu JF. 2009. Evolution of the class C GPCR Venus flytrap modules involved positive selected functional divergence. BMC Evol Biol, 9(1): 67.

Chang Y, Wang R, Barot S, Weiss DS. 1996. Stoichiometry of a recombinant GABAA receptor. J Neurosci, 16(17): 5415-5424.

Chebib M, Johnston GAR. 1999. The ‘ABC’ of GABA receptors: a brief review. Clin Exp Pharmacol Physiol, 26(11): 937-940.

Chiu J, DeSalle R, Lam HM, Meisel L, Coruzzi G. 1999. Molecular evolution of glutamate receptors: a primitive signaling mechanism that existed before plants and animals diverged. Mol Biol Evol, 16(6): 826-838.

Cully DF, Paress PS, Liu KK, Schaeffer JM, Arena JP. 1996. Identification of a Drosophila melanogaster glutamate-gated chloride channel sensitive to the antiparasitic agent avermectin. J Biol Chem, 271(33): 20187-20191.

Darlison MG, Pahal I, Thode C. 2005. Consequences of the evolution of the GABAA receptor gene family. Cell Mol Neurobiol, 25(3): 607-624.

Davis RE, Stretton AOW. 1995. Biochemistry and Molecular Biology of Parasites. San Diego: Academic Press Inc.

DeFeudis FV. 1975. Amino acids as central neurotransmitters. Annu Rev Pharmacol, 15(1): 105-130.

Devaud JM, Clouet-Redt C, Bockaert J, Grau Y, Parmentier ML. 2008. Widespread brain distribution of the Drosophila metabotropic glutamate receptor. Neuroreport, 19(3): 367-371.

Fiorillo CD, Williams JT. 1998. Glutamate mediates an inhibitory postsynaptic potential in dopamine neurons. Nature, 394(6688): 78-82.

Gassmann M, Bettler B. 2012. Regulation of neuronal GABAB receptor functions by subunit composition. Nat Rev Neurosci, 13(6): 380-394.

Gerschenfeld HM. 1973. Chemical transmission in invertebrate central nervous systems and neuromuscular junctions. Physiol Rev, 53(1): 1-119.

Gesemann M, Lesslauer A, Maurer CM, Schönthaler HB, Neuhauss SCF. 2010. Phylogenetic analysis of the vertebrate Excitatory/Neutral Amino Acid Transporter (SLC1/EAAT) family reveals lineage specific subfamilies. BMC Evol Biol, 10(1): 117.

Gordon JA. 2010. Testing the glutamate hypothesis of schizophrenia. Nat Neurosci, 13(1): 2-4.

Héja L, Barabás P, Nyitrai G, Kékesi KA, Lasztóczi B, Tőke O, Tárkányi G, Madsen K, Schousboe A, Dobolyi Á, Palkovits M, Kardos J. 2009. Glutamate uptake triggers transporter-mediated GABA release from astrocytes. PLoS ONE, 4(9): e7153.

Hosie A, Sattelle D, Aronstein K. 1997. Molecular biology of insect neuronal GABA receptors. Trends Neurosci, 20(12): 578-583.

Kass-Simon G, Scappaticci A Jr. 2004. Glutamatergic and GABAnergic control in the tentacle effector systems of Hydra vulgaris. Hydrobiologia, 2004, 178(530-531): 67-71.

Kass-Simon G, Pannaccione A, Pierobon P. 2003. GABA and glutamate receptors are involved in modulating pacemaker activity in hydra. Comp Biochem Physiol A Mol Integr Physiol, 136(2): 329-342.

Kehoe J, Marder E. 1976. Identification and effects of neural transmitters in invertebrates. Annu Rev Pharmacol Toxicol, 16(1): 245-268.

Lü BZ, Tian Y. 1991. Receptor Introduction. Beijing: Beijing Science Press: 923-930. (in Chinese)

Martínez-Delgado G, Estrada-Mondragón A, Miledi R, Martínez-Torres A. 2010. An update on GABAρ receptors. Curr Neuropharmacol, 8(4): 422-433.

McIntire SL, Jorgensen E, Kaplan J, Horvitz HR. 1993. The GABAergic nervous system of Caenorhabditis elegans. Nature, 364(6435): 337-341.

Mezler M, Müller T, Raming K. 2001. Cloning and functional expression of GABAB receptors from Drosophila. Eur J Neurosci, 13(3): 477-486.

Müller WE. 1998. Origin of Metazoa: sponges as living fossils. Naturwissenschaften, 85(1): 11-25.

Nakanishi S. 1992. Molecular diversity of glutamate receptors and implications for brain function. Science, 258(5082): 597-603.

Ortells MO, Lunt GG. 1995. Evolutionary history of the ligand-gated ion-channel superfamily of receptors. Trends Neurosci, 18(3): 121-127.

Perovic S, Krasko A, Prokic I, Müller IM, Müller WEG. 1999. Origin of neuronal-like receptors in Metazoa: cloning of a metabotropic glutamate/GABA-like receptor from the marine sponge Geodia cydonium. Cell Tissue Res, 296(2): 395-404.

Ribeiro P, El-Shehabi F, Patocka N. 2005. Classical transmitters and their receptors in flatworms. Parasitology, 131(S1): S19-S40.

Ryan TJ, Grant SGN. 2009. The origin and evolution of synapses. Nat Rev Neurosci, 10(10): 701-712.

Schuske K, Beg AA, Jorgensen EM. 2004. The GABA nervous system in C. elegans. Trends Neurosci, 27(7): 407-414.

Segerberg MA, Stretton A. 1993. Actions of cholinergic drugs in the nematode Ascaris suum. Complex pharmacology of muscle and motorneurons. J Gen Physiol, 101(2): 271-296.

Shimada S, Cutting G, Uhl G. 1992. gamma-Aminobutyric acid A or C receptor? gamma-Aminobutyric acid rho 1 receptor RNA induces bicuculline-, barbiturate-, and benzodiazepine-insensitive gamma-aminobutyric acid responses in Xenopus oocytes. Mol Pharmacol, 41(4): 683-687.

Swensen AM, Golowasch J, Christie AE, Coleman MJ, Nusbaum MP, Marder E. 2000. GABA and responses to GABA in the stomatogastric ganglion of the crab Cancer borealis. J Exp Biol, 203(14): 2075-2092.

Tretter V, Ehya N, Fuchs K, Sieghart W. 1997. Stoichiometry and assembly of a recombinant GABAA receptor subtype. J Neurosci, 17(8): 2728-2737.

Tsang SY, Ng SK, Xu Z, Xue H. 2007. The evolution of GABAA receptor-like genes. Mol Biol Evol, 24(2): 599-610.

Vassilatis DK, Elliston KO, Paress PS, Hamelin M, Arena JP, Schaeffer JM, Van der Ploeg LHT, Cully DF. 1997. Evolutionary relationship of the ligand-gated ion channels and the avermectin-sensitive, glutamate-gated chloride channels. J Mol Evol, 44(5): 501-508.

Verma P, Kumar D, Tandan S. 2010. Effects of amino acid neurotransmitters on spontaneous muscular activity of the rumen amphistome, Gastrothylax crumenifer. J Helminthol, 83(4): 385-389.

White JH, Wise A, Main MJ, Green A, Fraser NJ, Disney GH, Barnes AA, Emson P, Foord SM, Marshall FH. 1998. Heterodimerization is required for the formation of a functional GABAB receptor. Nature, 396(6712): 679-682.

Wilke K, Gaul R, Klauck SM, Poustka A. 1997. A gene in human chromosome band Xq28 (GABRE) defines a putative new subunit class of the GABAA neurotransmitter receptor. Genomics, 45(1): 1-10.

Xie YF, Tang JS, Jia H. 2003. The roles of different types of glutamate receptors involved in the mediation of nucleus submedius (Sm) glutamate-evoked antinociception in the rat. Brain Res, 988(1-2): 146-153.

Yano S, Tokumitsu H, Soderling TR. 1998. Calcium promotes cell survival through CaM-K kinase activation of the protein-kinase-B pathway. Nature, 396(6711): 584-587.

Yates DM, Portillo V, Wolstenholme AJ. 2003. The avermectin receptors of Haemonchus contortus and Caenorhabditis elegans. Int J Parasitol, 33(11): 1183-1193.

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6 Zoological Research 33 (E5−6): E82−E88 doi: 10.3724/SP.J.1141.2012.E05-06E82

Involvement of XZFP36L1, an RNA-binding protein, in Xenopus neural development

Yingjie XIA1, 2,#, Shuhua ZHAO3,#, Bingyu MAO1,*

1. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, the Chinese Academy of Sciences, Kunming 650223, China
2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
3. Yunnan Key Laboratory of Fertility Regulation and Minority Eugenics, Yunnan Population and Family Planning Research Institute, Kunming 650021, China

Abstract: Xenopus ZFP36L1 (zinc finger protein 36, C3H type-like 1) belongs to the ZFP36 family of RNA-binding proteins, which contains two characteristic tandem CCCH-type zinc-finger domains. The ZFP36 proteins can bind AU-rich elements in 3' untranslated regions of target mRNAs and promote their turnover. However, the expression and role of ZFP36 genes during neural development in Xenopus embryos remain largely unknown. The present study showed that Xenopus ZFP36L1 was expressed at the dorsal part of the forebrain, forebrain-midbrain boundary, and midbrain-hindbrain boundary from late neurula stages to tadpole stages of embryonic development. Overexpression of XZFP36L1 in Xenopus embryos inhibited neural induction and differentiation, leading to severe neural tube defects. The function of XZFP36L1 requires both its zinc finger and C-terminal domains, which also affect its subcellular localization. These results suggest that XZFP36L1 is likely involved in neural development in Xenopus and might play an important role in post-transcriptional regulation.

Keywords: ZFP36L1; RNA-binding protein; Neural development; Xenopus; Post-transcriptional regulation

Received: 17 August 2012; Accepted: 15 October 2012
Foundation items: This work was supported by National Natural Science Foundation of China (90919039; C120106)
# These authors contributed equally to this work
* Corresponding author, E-mail: [email protected]

mRNA stability is an important mechanism for gene expression regulation, which is critical for embryogenesis and normal physiological processes. The zinc finger protein 36 (ZFP36) family proteins are key players in post-transcriptional regulation and contain two or more highly conserved tandem CCCH zinc fingers (TZF) of tandem Cx8Cx5Cx3H repeats (x represents a variable amino acid). These factors are able to bind to AU-rich elements (AREs) in the 3' untranslated regions (UTR) of targeted mRNAs through their TZFs (Carballo et al, 2000; Lai & Blackshear, 2001; Lai et al, 1999; Lai et al, 2000), which leads to the removal of the poly(A) tail from the targeted mRNAs and promotes their degradation (Carrick et al, 2004; Lai & Blackshear, 2001). The ZFP36 family proteins are evolutionarily conserved, including four rodent members (ZFP36, ZFP36L1, ZFP36L2, and ZFP36L3) and three members in other mammals (with ZFP36L3 missing, Blackshear et al, 2005), each of which has two conserved ZF domains. Four members have been reported in Xenopus and zebrafish, among which ZFP36, ZFP36L1, and ZFP36L2 are conserved while the fourth member (Xenopus ZFP36L2.1, ZFP36L2.2 and zebrafish ZFCTH1) contains four ZF domains instead of two. The four ZFP36 genes in Xenopus have been cloned and are widely expressed in adult tissues, except for XZFP36L2.1 (XC3H-4), which is detected only in the ovary (De et al, 1999). It has been suggested that XZFP36L2b (XC3H-3b) plays an important role in Xenopus kidney development (Kaneko et al, 2003), but little is known about the roles of other ZFP36 members in embryonic development.

Also known as cMG1, TIS11b, ERF1, BRF1, C3H-2, and Berg36 (Barnard et al, 1993; De et al, 1999; Lai et al, 1990; Varnum et al, 1991), ZFP36L1 has been reported to regulate vascular endothelial growth factor (VEGF) mRNA stability in adrenocorticotropic hormone-stimulated primary adrenocortical cells, acting as a negative regulator of VEGF during development

Science Press Volume 33 Issues E5−6


(Ciais et al, 2004). Research has also shown ZFP36L1 to be regulated by Von Hippel-Lindau (VHL) tumor suppressor protein in renal cancer cells (Sinha et al, 2009). In addition, ZFP36L1 is phosphorylated by p38 MAPK/MAPK-activated protein kinase 2 (MK2) and PKB/AKT pathways. Phosphorylation of ZFP36L1 inhibits its ARE mRNA decay activity, but does not affect its ability to bind AREs (Benjamin et al, 2006; Maitra et al, 2008; Schmidlin et al, 2004). Due to a failure of chorioallantoic fusion, ZFP36L1 knockout mice died in utero before embryonic day 11 (E11) (Stumpo et al, 2004). The XZFP36L1 gene encodes a protein of 345 amino acids that shows 76% identity to human BRF-1 (De et al, 1999). However, the developmental expression and function of Xenopus ZFP36L1 remain largely unknown. In this study, we cloned XZFP36L1 and investigated its expression pattern and potential function during Xenopus laevis embryo development.

MATERIALS AND METHODS

Plasmid construction

The XZFP36L1 full open reading frame (ORF) and all truncated fragments were amplified by polymerase chain reaction (PCR) using the XL073b24 clone as a template, which was provided to us by the National Institute for Basic Biology, Japan (NIBB). The PCR primers were:
XZFP36L1 ORF: forward: 5'-ATGTCTACAGCTTTGATTTCTCC-3' and reverse: 5'-ATCATCAGAGATAGAAAGTCTGC-3';
N-term truncate: forward: 5'-ATGTCTACAGCTTTGATTTCTCC-3' and reverse: 5'-CTGTCCCCCAGGCTTTTGGAGAA-3';
NZF truncate: forward: 5'-ATGTCTACAGCTTTGATTTCTCC-3' and reverse: 5'-TCTCTCTTCTGCATTGTGGATG-3';
ZF domains: forward: 5'-CAGGTGAACTCCAGCAGGTACAAG-3' and reverse: 5'-TCTCTCTTCTGCATTGTGGATG-3';
ZFC truncate: forward: 5'-CAGGTGAACTCCAGCAGGTACAAG-3' and reverse: 5'-ATCATCAGAGATAGAAAGTCTGC-3';
C-term truncate: forward: 5'-GAGAGACGCTTGGTATCAGGAC-3' and reverse: 5'-ATCATCAGAGATAGAAAGTCTGC-3'.
For expression in Xenopus embryos and cultured cells, the fragments were cloned into a eukaryotic expression vector pCS2+ with a Flag tag coding sequence at the C-terminal.

Phylogenetic analysis of the ZFP36 protein family

Twenty full coding amino acid sequences of the ZFP36 family were used for Bayesian phylogenetic analysis, including ZFP36_H.sapi (AAH09693), ZFP36_M.musc (AAH21391), ZFP36_X.laev (NP_001081884), ZFP36_D.rerio (XP_002665468), ZFP36_C.elegans (NP_505927), ZFP36L1_H.sapi (NP_004917), ZFP36L1_M.musc (NP_031590), ZFP36L1A_X.laev (NP_001089645), ZFP36L1B_X.laev (NP_001084214), ZFP36L1A_D.rerio (NP_001070621), ZFP36L1B_D.rerio (NP_955943), ZFP36L2_H.sapi (NP_008818), ZFP36L2_M.musc (NP_001001806), ZFP36L2A_X.laev (NP_001080610), ZFP36L2B_X.laev (NP_001081886), ZFP36L2_D.rerio (NP_996938), ZFP36L2.1_X.laev (NP_001081888), ZFP36L2.2_X.laev (NP_001108269), ZFP36L3_M.musc (NP_001009549), and ZFCTH1_D.rerio (NP_571014). Caenorhabditis elegans ZFP36 (C3H-1) was included as an out-group. Sequence alignment was performed using Clustal W (Thompson et al, 1994). The Bayesian approach was implemented in MrBayes3.12 (Ronquist & Huelsenbeck, 2003) with estimation of the substitution rate model and gamma distribution of among-site rate variation. Two runs were used, each with a single chain of 2×10^7 generations, sampled every 10 000 generations. Convergence was assessed by comparing the standard deviation of split frequencies between runs. We excluded 1 000 trees from a total sample of 2 001 trees in each run. Sample independence was confirmed, as no significant increase in log-likelihoods was observed after the burn-in phase.

In situ hybridization, section and RT-PCR analysis

Whole mount in situ hybridization (WMISH) on X. laevis was carried out as previously described (Gawantka et al, 1995). To examine potential overlapping expression, digoxigenin- and fluorescein-labeled RNA probes were used for double WMISH. Plasmid XL073b24 was cut using BamHI, and transcribed with T7 RNA polymerase to generate an antisense probe of XZFP36L1. The probes of X. laevis N-Tubulin (Papalopulu & Kintner, 1996), Sox2 (Ma et al, 2007), Slug (Aybar et al, 2003), Xath1 (Kim et al, 1997), En2 (Hemmati-Brivanlou et al, 1991), Pax3 (Sato et al, 2005), Dbx2 (Ma et al, 2011), and Nkx6.2 (Zhao et al, 2007) were used as previously described. Stained embryos were embedded in paraffin, sectioned at 20 μm, and photographed using a light microscope (Leica).

Semi-quantitative reverse transcription PCR was carried out as previously described (Wu et al, 2010). The PCR primers and conditions were:
XZFP36L1A: forward: 5'-TGCATGAGCAGCAGACGTG-3' and reverse: 5'-GGTCTTATGCGTCTCTCCATG-3', 30 cycles;
XZFP36L1B: forward: 5'-TGCATGAGCAGCAGAGGAA-3' and reverse: 5'-GCCCTTCTGTGTCTCTAAACA-3', 30 cycles;
H4 was used as a loading control: forward: 5'-CGGGATAACATTCAGGGTA-3' and reverse: 5'-TCCATGGCGGTAACTGTC-3', 26 cycles.
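
The tree counts given for the MrBayes analysis in the phylogenetic methods above can be checked with simple arithmetic. The sketch below is only an illustration of that bookkeeping, not part of the authors' pipeline; the function name is hypothetical, and it assumes that the starting state (generation 0) is counted as a sample, which is consistent with the 2 001 trees per run reported in the text. Pooling the retained trees across the two runs is likewise an assumption made here for illustration.

```python
def mcmc_sample_counts(generations: int, sample_every: int, burn_in: int) -> tuple:
    """Return (trees sampled per run, trees kept after burn-in).

    Assumes the initial state (generation 0) is sampled as well, so the
    total is generations / sample_every + 1.
    """
    if sample_every <= 0 or generations % sample_every != 0:
        raise ValueError("generations must be a positive multiple of sample_every")
    total = generations // sample_every + 1
    if burn_in > total:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return total, total - burn_in


# Values taken from the methods text: 2x10^7 generations, sampled every
# 10 000 generations, 1 000 trees discarded as burn-in, two runs.
total_trees, kept_trees = mcmc_sample_counts(20_000_000, 10_000, 1_000)
runs = 2
pooled_trees = runs * kept_trees  # assumption: retained trees pooled over runs

print(total_trees, kept_trees, pooled_trees)
```

Under these assumptions each run contributes 1 001 post-burn-in trees, so roughly half of every chain is discarded before summarizing.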


Embryo culture and microinjection

In vitro fertilization, embryo culture, and WMISH of Xenopus embryos were carried out as per Gawantka et al (1995). Embryos were staged according to Nieuwkoop and Faber (1967). Capped mRNA of X. laevis ZFP36L1 and its truncated forms were synthesized in vitro using SP6 mMessage Machine kit (Ambion). Microinjection was carried out using the PLI-1 Pico-injector (Harvard Apparatus) equipped with an MK-1 micromanipulator (Singer Instruments). Embryos were injected in one dorsal blastomere at the 4-cell stage with 0.5−1 ng XZFP36L1 mRNA or 0.3−0.6 ng of truncated mRNA. Injected mRNA was traced by co-injected 100 pg nuclear β-galactosidase, which was stained with Red-Gal substrate (Research Organics). The detection of β-galactosidase activity for tracing cell lineage was carried out as previously described (Kurata & Ueno, 2003).

Cell transfection and immunohistochemistry

The XZFP36L1 ORF and its truncated constructs were cloned in frame into a eukaryotic expression vector pCS2+ with a C-terminal Flag tag coding sequence. HeLa cell culture, transfection, and immunohistochemistry were carried out as described in previous research (Wu et al, 2010).

RESULTS

Cloning and phylogenetic analysis of ZFP36 family proteins

Through BLAST search against EST databases, two alleles of X. laevis ZFP36L1 were found, XZFP36L1A and XZFP36L1B, which shared 97% identity at the amino acid level. The full ORF of XZFP36L1B was cloned by PCR using an EST clone as a template (XL073b24, from the National Institute for Basic Biology, Japan). In the following studies on XZFP36L1, this allele was used unless otherwise stated. Consistent with its mammal orthologues, XZFP36L1 had two tandem ZF domains. To confirm the evolutionary relationships between the homologues of ZFP36L1, a phylogenetic tree was constructed using MrBayes. The nomenclature of the XZFP36 members was according to Xenbase (Bowes et al, 2008). The ZFP36 family proteins were grouped into four well-supported vertebrate clades inferred from the phylogenetic results (Figure 1). The results indicate that ZFP36, ZFP36L1, ZFP36L2, and ZFP36L3 were indeed paralogues that arose by gene duplication from a common ancestor to the vertebrate lineage. However, ZFP36L3 in rodents and the fourth ZFP36 genes in fish (ZFCTH1) and frog (ZFP36L2.1 and 2.2) were lineage specific. Among the ZFP36 related proteins, members of the ZFP36L1 and ZFP36L2 subfamilies were more closely related.

Temporal and spatial expression of XZFP36L1 during Xenopus laevis early development

We examined XZFP36L1 expression during X. laevis embryogenesis by RT-PCR and WMISH. Both XZFP36L1A and B showed similar temporal expression patterns, as detected by RT-PCR. They were both expressed at low levels maternally and in embryos from blastula to gastrula stages. Their expression went up at early neurula stages and remained stable at later stages (Figure 2A).

Figure 1 Bayesian tree depicting the phylogenetic relationships of the ZFP36 proteins. Branch confidence values are shown as Bayesian posterior probabilities. The scale bar shows the number of substitutions per site.

Zoological Research www.zoores.ac.cn Involvement of XZFP36L1, an RNA binding protein, in Xenopus neural development E85

Figure 2 Temporal and spatial expression patterns of XZFP36L1 A: RT-PCR analysis of the developmental expression of XZFP36L1A and B; B–I: Expression patterns of XZFP36L1 revealed by WMISH and paraffin sections; B: Stage 10, the embryo was bisected, the arrowheads indicate the expression of XZFP36L1 in the ectoderm and involuting mesoderm; C: Stage 16, anterior view; D: Stage 20, lateral view; E: Stage 20, anterior view; F–I: Stage 32 embryos; F: Lateral view. The broken lines indicate the positions of the corresponding sections shown in F’ and F”. Br, branchial arches; F’, F”: Transverse sections of a stage 32 embryo, the arrowheads indicate the expression of XZFP36L1 in the dorsal brain. ot, otic vesicle; G: Dorsal view of the head region; H: Horizontal section showing its expression in the fore-midbrain boundary (red arrowheads) and mid-hindbrain boundary (black arrowheads); I: Double WMISH of En2 (red) and XZFP36L1 (blue). Arrowheads in (C−I) indicate the expression of XZFP36L1 in the forebrain (blue), fore-midbrain boundary (red) and mid-hindbrain boundary (black). White arrowheads in (C) and (E) indicate its expression at the presumptive neural placode region. Bars in (B−I)=0.5 mm.

Since the two alleles of XZFP36L1 shared more than 90% identity, it was difficult to design probes to distinguish their expression patterns by in situ hybridization. Here we used XZFP36L1B as a template for the in situ probe. At the gastrula stage, XZFP36L1 transcripts were detected in both ectoderm and mesoderm (Figure 2B). During neurula stages, its expression became enriched in the neural system, especially in the anterior neural folds (Figure 2C), where it became gradually restricted into three discontinuous patches, corresponding to the presumptive forebrain, fore-midbrain boundary, and mid-hindbrain boundary, respectively (Figure 2D, E). Its expression was also detected in the placode regions at the anterior neural plate borders (Figure 2C, E). At tadpole stages, XZFP36L1 remained expressed in the three discontinuous regions of the brain and also in the branchial arches (Figure 2F, G). These domains were confirmed to be the forebrain, fore-midbrain boundary, and mid-hindbrain boundary, respectively, by horizontal sections (Figure 2H, arrowheads) and double in situ hybridization with En2, a mid-hindbrain boundary marker (Figure 2I). Transverse sections of stage 32 embryos showed that XZFP36L1 was mainly expressed at the dorsal part of the neural tube (Figure 2F’, F”). These data showed that XZFP36L1 was expressed in specific brain regions during neurulation and might play a role in neural development.

Overexpression of XZFP36L1 leads to neural tube defects
To investigate the function of XZFP36L1 during Xenopus embryogenesis, we overexpressed XZFP36L1 in Xenopus embryos by injecting synthetic XZFP36L1 mRNA into one dorsal cell at the 4-cell stage. Interestingly, the injected embryos showed severe neural tube defects (NTDs). Compared to the control group, the neural tubes of the injected embryos failed to close properly and remained open at tailbud stages (57%, n=81, Figure 3A, B). To further investigate the role of XZFP36L1 in neural development, we checked the expression of several neural markers in ZFP36L1 mRNA injected embryos. The results showed that many of the neural markers were inhibited at neurula and tailbud stages, including N-tubulin (67%, n=24, Figure 3C), Sox2 (86%, n=35, Figure 3D), Slug (85%, n=20, Figure 3E), En2 (84%, n=25, Figure 3F), Nkx6.2 (73%, n=15, Figure 3G), Pax3 (70%, n=17, Figure 3H), Dbx1 (69%, n=16, Figure 3I), and Xath1 (62%, n=13, Figure 3J).

Figure 3 Overexpression of XZFP36L1 leads to NTDs in Xenopus embryos A: Control embryos, stage 25; B: Embryo injected with XZFP36L1 mRNA shows severe defects during neural tube closure (arrowheads), stage 25; C−J: Overexpression of XZFP36L1 inhibits neural differentiation. Expression of neural marker genes N-tubulin (C), Sox2 (D), Slug (E), En2 (F), Nkx6.2 (G), Pax3 (H), Dbx1 (I) and Xath1 (J) is largely inhibited in the injected sides (arrowheads). (C−F), stage 15; (G−J), stage 25. Bars=0.5 mm.

These results suggested that overexpression of XZFP36L1 inhibited neural induction and differentiation in Xenopus embryos, leading to severe neural tube defects.

The ZF and C-terminal domains of XZFP36L1 are required for its activity and subcellular localization
The XZFP36L1 protein contains an N-terminal domain, two ZF domains, and a C-terminal domain (Figure 4A). To determine which domains are essential for its function in neural development, several truncated forms of XZFP36L1 were constructed (Figure 4A), transcribed into mRNAs, and injected into Xenopus embryos. Embryos were examined at stage 20, when the neural tube was completely closed. The results showed that truncates containing both the ZF and the C-terminal domains led to defects in neural tube closure for Xenopus embryos like the full length ORF (Figure 4F), whereas other truncated forms lacking either the ZF motif or the C-terminal domain did not exhibit such an effect (Figure 4D, E, G). These results suggest that both the ZF and the C-terminal domains were necessary for its function.

The XZFP36L1 gene belongs to the ZFP36 RNA-binding protein family. The ZF domains of ZFP36 family proteins are suggested to bind to the AREs of targeted 3' UTRs. Thus, these proteins are supposed to locate in the cytoplasm to execute their functions. To determine whether the activity of XZFP36L1 in neural development was related to its subcellular localization, we fused the full-length XZFP36L1 ORF and its truncates with a C-terminal Flag tag coding sequence and performed immunohistochemistry using anti-Flag antibodies in transfected HeLa cells. As expected, the full length ORF was detected dominantly in the cytoplasm, which was consistent with human ZFP36L1 (CMG1) results (Figure 4H; Phillips et al, 2002). The construct containing only the N-terminal domain showed granular distribution in the cytoplasm (Figure 4I). All the other truncated forms were more enriched in the nucleus, except for the one containing both the ZF and C-terminal domains, which showed higher signal in the cytoplasm compared to other truncates (Figure 4J-M). According to the above data, the activity of these truncates in NTDs seemed consistent with their localization in the cytoplasm, indicating that the cytoplasmic localization of ZFP36L1 was important for its function.

DISCUSSION

In this study, we showed that XZFP36L1 was expressed in specific brain regions in Xenopus embryos, and overexpression of this protein inhibited neural induction and differentiation and led to severe NTDs, indicating that XZFP36L1 was involved in Xenopus neural development. We also tried to knock down the expression of endogenous ZFP36L1 using specific morpholinos. Unfortunately, no clear phenotypes were observed in the injected embryos, although the morpholinos actively inhibited the expression of the reporter GFP genes carrying the targeted sequences (data not shown). The reason for this might be that either the morpholino was not efficient enough to block XZFP36L1 expression, or its function was compensated by other members of this gene family, since it has been

Figure 4 Activity and cellular localization of different XZFP36L1 truncates A: The schematic structures of XZFP36L1 protein and its truncates; B-G: Both the ZF motif and the C-terminal domain are required for its activity in neural development; B: control embryos. Overexpression of full length XZFP36L1 (C) or the ZFC truncate (F) leads to defects in neural tube closure (arrowheads), while overexpression of the NZF truncate (D) or ZF domains (E) or C-terminal truncate (G) has no obvious effect on neural tube closure; H-M: Localization of XZFP36L1 and its truncates in transfected HeLa cells.

reported that the ZFP36 protein family are all RNA-binding proteins and may share some common target mRNAs, e.g., TNF-α (Carballo et al, 1998; Lai et al, 2000), GM-CSF (Carballo et al, 2000; Lai & Blackshear, 2001), and VEGF (Bell et al, 2006; Brennan et al, 2009; Ciais et al, 2004).

The function of XZFP36L1 required both its ZF and C-terminal domains: the ZF domains for binding to AREs and targeting mRNA degradation, and the C-terminal domain to contain nuclear export sequences responsible for cytoplasmic localization (Phillips et al, 2002). Our results indicated that the cytoplasmic localization of ZFP36L1 was important for its function, and suggested that ZFP36L1 was involved in neural development, likely by acting as an RNA-binding protein. However, the in vivo function of XZFP36L1 during brain development requires further detailed analysis.

Acknowledgements We thank the National Institute for Basic Biology, Japan, for the XL073b24 clone.

References

Aybar MJ, Nieto MA, Mayor R. 2003. Snail precedes slug in the genetic cascade required for the specification and migration of the Xenopus neural crest. Development, 130(3): 483-494.

Barnard RC, Pascall JC, Brown KD, McKay IA, Williams NS, Bustin SA. 1993. Coding sequence of ERF-1, the human homologue of Tis11b/cMG1, members of the Tis11 family of early response genes. Nucleic Acids Res, 21(15): 3580-3580.

Bell SE, Sanchez MJ, Spasic-Boskovic O, Santalucia T, Gambardella L, Burton GJ, Murphy JJ, Norton JD, Clark AR, Turner M. 2006. The RNA binding protein Zfp36l1 is required for normal vascularisation and post-transcriptionally regulates VEGF expression. Dev Dyn, 235(11): 3144-3155.

Benjamin D, Schmidlin M, Min L, Gross B, Moroni C. 2006. BRF1 protein turnover and mRNA decay activity are regulated by protein kinase B at the same phosphorylation sites. Mol Cell Biol, 26(24): 9497-9507.

Blackshear PJ, Phillips RS, Ghosh S, Ramos SVB, Richfield EK, Lai WS. 2005. Zfp36l3, a rodent X chromosome gene encoding a placenta-specific member of the Tristetraprolin family of CCCH tandem zinc finger proteins. Biol Reprod, 73(2): 297-307.

Brennan SE, Kuwano Y, Alkharouf N, Blackshear PJ, Gorospe M, Wilson GM. 2009. The mRNA-destabilizing protein tristetraprolin is suppressed in many cancers, altering tumorigenic phenotypes and patient prognosis. Cancer Res, 69(12): 5168-5176.

Bowes JB, Snyder KA, Segerdell E, Gibb R, Jarabek C, Noumen E, Pollet N, Vize PD. 2008. Xenbase: a Xenopus biology and genomics resource. Nucleic Acids Res, 36(Database issue): D761-D767.

Carballo E, Lai WS, Blackshear PJ. 1998. Feedback inhibition of macrophage tumor necrosis factor-α production by tristetraprolin. Science, 281(5379): 1001-1005.

Carballo E, Lai WS, Blackshear PJ. 2000. Evidence that tristetraprolin is a physiological regulator of granulocyte-macrophage colony-stimulating factor messenger RNA deadenylation and stability. Blood, 95(6): 1891-1899.

Carrick DM, Lai WS, Blackshear PJ. 2004. The tandem CCCH zinc finger protein tristetraprolin and its relevance to cytokine mRNA turnover and arthritis. Arthritis Res Ther, 6(6): 248-264.

Ciais D, Cherradi N, Bailly S, Grenier E, Berra E, Pouyssegur J, LaMarre J, Feige JJ. 2004. Destabilization of vascular endothelial growth factor mRNA by the zinc-finger protein Tis11b. Oncogene, 23(53): 8673-8680.

De J, Lai WS, Thorn JM, Goldsworthy SM, Liu X, Blackwell TK, Blackshear PJ. 1999. Identification of four CCCH zinc finger proteins in Xenopus, including a novel vertebrate protein with four zinc fingers and severely restricted expression. Gene, 228(1-2): 133-145.

Gawantka V, Delius H, Hirschfeld K, Blumenstock C, Niehrs C. 1995. Antagonizing the Spemann organizer: Role of the homeobox gene Xvent-1. EMBO J, 14(24): 6268-6279.

Hemmati-Brivanlou A, De La Torre J, Holt C, Harland R. 1991. Cephalic expression and molecular characterization of Xenopus En-2. Development, 111(3): 715-724.

Kaneko T, Chan TC, Satow R, Fujita T, Asashima M. 2003. The isolation and characterization of XC3H-3b: A CCCH zinc-finger protein required for pronephros development. Biochem Biophys Res Commun, 308(3): 566-572.

Kim P, Helms AW, Johnson JE, Zimmerman K. 1997. XATH-1, a vertebrate homolog of Drosophila atonal, induces neuronal differentiation within ectodermal progenitors. Dev Biol, 187(1): 1-12.

Kurata T, Ueno N. 2003. Xenopus Nbx, a novel NK-1 related gene essential for neural crest formation. Dev Biol, 257(1): 30-40.

Lai WS, Blackshear PJ. 2001. Interactions of CCCH zinc-finger proteins with mRNA tristetraprolin-mediated AU-rich element-dependent mRNA degradation can occur in the absence of a poly(A) tail. J Biol Chem, 276(25): 23144-23154.

Lai WS, Stumpo D, Blackshear P. 1990. Rapid insulin-stimulated accumulation of an mRNA encoding a proline-rich protein. J Biol Chem, 265(27): 16556-16563.

Lai WS, Carballo E, Thorn JM, Kennington EA, Blackshear PJ. 2000. Interactions of CCCH zinc finger proteins with mRNA. Binding of tristetraprolin-related zinc finger proteins to AU-rich elements and destabilization of mRNA. J Biol Chem, 275(23): 17827-17837.

Lai WS, Carballo E, Strum JR, Kennington EA, Phillips RS, Blackshear PJ. 1999. Evidence that tristetraprolin binds to AU-rich elements and promotes the deadenylation and destabilization of tumor necrosis factor alpha mRNA. Mol Cell Biol, 19(6): 4311-4323.

Ma L, Zhao SH, Kong QH, Mao BY. 2007. Temporal and spatial expression patterns of Sox1 gene in Xenopus laevis embryo. Zool Res, 28(4): 403-408.

Ma PC, Zhao SH, Zeng WL, Yang QT, Li CC, Lv XY, Zhou Q, Mao BY. 2011. Xenopus Dbx2 is involved in primary neurogenesis and early neural plate patterning. Biochem Biophys Res Commun, 412(1): 170-174.

Maitra S, Chou CF, Luber CA, Lee KY, Mann M, Chen CY. 2008. The AU-rich element mRNA decay-promoting activity of BRF1 is regulated by mitogen-activated protein kinase-activated protein kinase 2. RNA, 14(5): 950-959.

Nieuwkoop PD, Faber J. 1967. Normal Table of Xenopus laevis (Daudin). New York and London: Garland Publishing.

Papalopulu N, Kintner C. 1996. A posteriorising factor, retinoic acid, reveals that anteroposterior patterning controls the timing of neuronal differentiation in Xenopus neuroectoderm. Development, 122(11): 3409-3418.

Phillips RS, Ramos SBV, Blackshear PJ. 2002. Members of the tristetraprolin family of tandem CCCH zinc finger proteins exhibit CRM1-dependent nucleocytoplasmic shuttling. J Biol Chem, 277(13): 11606-11613.

Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19(12): 1572-1574.

Sato T, Sasai N, Sasai Y. 2005. Neural crest determination by co-activation of Pax3 and Zic1 genes in Xenopus ectoderm. Development, 132(10): 2355-2363.

Schmidlin M, Lu M, Leuenberger SA, Stoecklin G, Mallaun M, Gross B, Gherzi R, Hess D, Hemmings BA, Moroni C. 2004. The ARE-dependent mRNA-destabilizing activity of BRF1 is regulated by protein kinase B. EMBO J, 23(24): 4760-4769.

Sinha S, Dutta S, Datta K, Ghosh AK, Mukhopadhyay D. 2009. Von Hippel-Lindau gene product modulates TIS11B expression in renal cell carcinoma. J Biol Chem, 284(47): 32610-32618.

Stumpo DJ, Byrd NA, Phillips RS, Ghosh S, Maronpot RR, Castranio T, Meyers EN, Mishina Y, Blackshear PJ. 2004. Chorioallantoic fusion defects and embryonic lethality resulting from disruption of Zfp36L1, a gene encoding a CCCH tandem zinc finger protein of the Tristetraprolin family. Mol Cell Biol, 24(14): 6445-6455.

Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22): 4673-4680.

Varnum BC, Ma QF, Chi TH, Fletcher B, Herschman HR. 1991. The TIS11 primary response gene is a member of a gene family that encodes proteins with a highly conserved sequence containing an unusual Cys-His repeat. Mol Cell Biol, 11(3): 1754-1758.

Wu JY, Li CC, Zhao SH, Mao BY. 2010. Differential expression of the Brunol/CELF family genes during Xenopus laevis early development. Int J Dev Biol, 54(1): 209-214.

Zhao SH, Jiang HF, Wang W, Mao BY. 2007. Cloning and developmental expression of the Xenopus Nkx6 genes. Dev Genes Evol, 217(6): 477-483.

Zoological Research 33 (E5−6): E89−E97 doi: 10.3724/SP.J.1141.2012.E05-06E89

Establishment and characterization of a fibroblast-like cell line from Anabarilius grahami (Cypriniformes: Cyprinidae)

Xiaoai WANG1,2, Junxing YANG1,*, Xiaoyong CHEN1,*, Xiaofu PAN1

1. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
2. University of the Chinese Academy of Sciences, Beijing 100049, China

Abstract: Though Yunnan province contains some 562 known species of fish, no cell lines from any of these have been made available to date. To protect germplasm resources and provide an effective tool for solving problems at the cellular level in Anabarilius grahami, a fish endemic to Fuxian Lake, Yunnan, China, we established a continuous cell line (AGF II) from the caudal fin tissue of A. grahami and characterized its major features. The AGF II cell line consists of fibroblast-like cells and has been subcultured more than 60 times over the course of a year. The cell line was maintained in DMEM/F12 supplemented with 10% FBS, with a cellular doubling time of 51.1 h. Further experiments to optimize the culture and storage conditions showed that cells could grow at temperatures between 24 °C and 28 °C, with an optimal temperature of 28 °C; that the growth rate of A. grahami fin cells increased as the FBS proportion increased from 5% to 20%, with optimal growth at 20% FBS; that cells were able to grow in L-15 and DMEM/F12, with optimal growth in L-15; and that DMSO was a better cryoprotectant than glycerol, EG and MeOH for AGF II cells, with an optimal concentration of 5% DMSO. Chromosome analysis showed that the chromosome number varied from 38 to 52, with a modal peak at 48 chromosomes, accounting for 39.8% of all cells. Using the same primer pairs specific to mtDNA, the AGF II cell sequences obtained by PCR were identical to those from muscle tissues of A. grahami. Both chromosome analysis and PCR amplification confirmed that the AGF II cells were from A. grahami, also indicating that the current long-term artificial propagation of A. grahami has been successful. Finally, when cells were transfected with pEYFP-N1 and pECFP-N1 plasmids, bright fluorescent signals were observed, suggesting that this cell line may be suitable for use in transfection and future gene expression studies.

Keywords: Characterization; Fibroblast-like cell line; Anabarilius grahami; Cryopreservation

In vitro cultures of animal cells have provided a powerful tool in studying a variety of subjects, including immunology (Bols et al, 2001), environmental toxicology (Bahich & Borenfreund, 1991), endocrinology (Bols & Lee, 1991), virology (Wolf, 1988), biotechnology and aquaculture (Bols & Lee, 1991), and resources protection (Zhou et al, 2008; Chen & Qin, 2011). Since the first reported fish cell line in 1962 (Wolf & Quimby), over 280 lines have been established, most of them derived from freshwater and anadromous fish species (Wolf & Mann, 1980; Fryer & Lannan, 1994; Lakra et al, 2011). To date, more than 50 cell lines have been established in China including over 20 species (Chen & Qin, 2011). Among these fish, most are economically important species, such as Anguilla japonica (Ku et al, 2010), Chrysophrys major (Chen et al, 2003), Lateolabrax japonicus (Ye et al, 2006) and Epinephelus akaara (Huang et al, 2009); nevertheless, only a few are rare fish, such as Acipenser sinensis (Zhou et al, 2008) and Gobiocypris rarus (Tan et al, 2009). In Yunnan province alone, home to China’s largest biodiversity hotspot, there are 562 known species of fish (Yang & Li, 2010), and yet no cell lines are available. Clearly, more attention should be paid to the rare and endemic fish cell cultures in future.

Received: 03 September 2012; Accepted: 30 October 2012
Foundation items: This work was supported by the Global Environment Foundation/The World Bank Project (GEF-MSP grant No. TF051795); the Yunnan Development and Reform Commission
* Corresponding authors, E-mails: [email protected]; chenxy@mail.kiz.ac.cn


One such native species endemic to Fuxian Lake, Yunnan, Anabarilius grahami (Regan) belongs to Cypriniformes, Cyprinidae and has been assessed as a vulnerable species on the China species red lists (Wang & Xie, 2004) and evaluated as a threatened fish due to its declining numbers (Liu et al, 2009). Fuxian Lake, the second deepest lake in China (Ley et al., 1963; Yang, 1984), has a great wealth of endemic fish, some 4.7% of all Yunnan fish species (Yang & Chen, 1995). However, because of exotic species invasion and over-exploitation, the production of A. grahami has sharply declined in recent years (Yang & Chen, 1995; Li et al, 2003a; Xiong et al, 2006). To date, a few reports on its karyotype (Zan & Song, 1980), ecology (Yang, 1994; Li et al, 2003a), diet composition (Qin et al, 2007), artificial propagation (Li et al, 2003b,c), disease control (Li et al, 2003d), embryonic development (Ma et al, 2008) and genetic diversity (Yang et al, 2008; Liu et al, 2011) have been made available. Unfortunately, little research has been carried out on cell culture of A. grahami. To further protect this valuable species and aid in its recovery, Li et al (2003b) succeeded in launching artificial cultivation, and some fingerlings have been released into Fuxian Lake annually since 2007. Liu et al (2009) also noted that an investigation of the restocked population is necessary. Aquaculture by artificial cultivation has inherent limitations, e.g. germplasm degradation, reduction of genetic diversity, and a weakened ability to defend against disease and environmental stress (Chen, 2002; Su et al, 2012). A cell culture of A. grahami could provide an effective tool in studying these problems at the cellular level.

MATERIALS AND METHODS

Primary cell culture
A healthy juvenile of A. grahami (10 cm total length) was obtained from the tenth generation of artificial cultivation from the Endangered Fish Conservation Center (EFCC), Kunming Institute of Zoology, Chinese Academy of Sciences. The specimen was disinfected with potassium permanganate. Caudal fin tissue was clipped and transferred to flasks using aseptic techniques and then washed three times with Hanks’ balanced salt solution (HBSS) supplemented with antibiotics (penicillin, 400 IU/mL; streptomycin, 400 μg/mL; fungizone, 10 μg/mL). The washed tissue was minced into pieces of about 1 mm² in 10 cm diameter Petri dishes, and transferred into T-25 flasks (Corning) containing 3 mL of growth medium (DMEM/F12, Gibco) supplemented with 20% fetal bovine serum (FBS, Gibco), 100 IU/mL of penicillin, and 100 μg/mL of streptomycin. All flasks were incubated at 28 ℃. After 3 days, 3 mL of growth medium was added into the uncontaminated flasks. Monolayers of primary cells were obtained after 30-40 days (Clem et al, 1961; Chen & Qin, 2011; Freshney, 2010).

Sub-culture and maintenance
Each flask was examined to ensure that it was free of microbial contamination and that the cells had normal morphology. The spent medium was removed to a new flask and then combined with the same volume of fresh medium, making a conditioned medium. The cells were washed with HBSS and covered with 1 mL of 0.25% trypsin-EDTA solution (Invitrogen). Each culture was examined under an inverted microscope (Olympus, CKX41) to ensure that cells had separated from the flask surface. The trypsin solution was decanted and the cells were dislodged by gently tapping the flask. A volume of 10 mL fresh or conditioned medium was added to the flask and shaken to obtain a cell suspension. Finally, 5 mL of the suspension was transferred to a new flask and incubated at 28 ℃. The cells were confluent within 3-4 days (Zhou et al, 2008; Chen & Qin, 2011).

Growth of cells
Cells were placed at an initial density of 4.8×10⁴ cells/mL into a T-25 flask with DMEM/F12+10% FBS at 28 ℃. Cells from the flasks were collected and counted daily for 9 days using a haemocytometer; this process was repeated three times (Freshney, 2010).

Effect of medium on cell growth
The effect of medium on cell growth was evaluated in 4-well plates. Cells were cultured in DMEM/F12, DMEM, M199 and L-15, each with 10% FBS, at 28 ℃. Cells of duplicate wells in each medium were trypsinized and counted with a haemocytometer daily for 5 days (Huang et al, 2009; Ou et al, 2010).

Effect of temperature and FBS concentrations on cell growth
Cells were seeded into 4-well plates containing L-15 medium with 10% FBS and incubated at 20 ℃, 24 ℃, 28 ℃ and 32 ℃. Duplicate wells were used for each temperature. In the same manner, L-15 medium containing 5%, 10%, 15%, 20% and 25% FBS was assessed for its effect on cell growth at 28 ℃. In the three tests, the average number of cells was calculated daily for 4 days using a haemocytometer (Ku et al, 2010; Lakra et al, 2011).

Effect of cryoprotectant on cell cryopreservation
The procedure of cell cryopreservation was as follows: cells were propagated for 3-4 days in T-75 flasks until confluent monolayers had formed, and the cell suspension from each flask was collected into a 50 mL centrifuge tube and then centrifuged at 200 r/min for 8 min. The pellets were suspended in FBS containing optimal

cryoprotectant to obtain a concentration of 5-6×10⁶ cells/mL. Cell suspensions were then transferred to 2-mL cryovials and kept sequentially at −20 ℃ for 4 h and −80 ℃ for 16 h, and finally stored in liquid nitrogen at −196 ℃ (Freshney, 2010; Chen & Qin, 2011). The frozen cells were recovered in a water bath at 37 ℃, transferred to L-15 medium (10% FBS) in a T-25 flask and incubated at 28 ℃. The medium was decanted from the flask 24 h after incubation and replaced with 5 mL of fresh L-15 medium. The flask was incubated at 28 ℃ until the cells reached confluence.

We first examined the effects of the cryoprotectants DMSO, glycerol, ethylene glycol (EG) and methanol (MeOH) on cell cryopreservation, and then examined the better cryoprotectant at concentrations of 5%, 7.5%, 10%, 12.5%, and 15%. The quality of the cells recovered after thawing was assessed according to two criteria: (1) viability of cells, assessed by Trypan blue exclusion assay; and (2) adhesion ability of the thawed cells, estimated by counting the cells and then seeding them in 4 wells with L-15+10% FBS, and dissociating them with 0.25% trypsin-EDTA after 24 h at 28 ℃. The cell adhesion percentage was the ratio of the total number of viable cells recovered after adhesion to the number of viable cells after thawing (Mauger, 2006).

Cell origin identification
Total DNA was isolated from fresh A. grahami muscle tissue and the AGF II cell line using the DNeasy Tissue Kit (BioTeke). Primers PDL1: 5'-ACCCCTGGCTCCCAAAGC-3' and PDH1: 5'-ATCTTAGCATCTTCAGTG-3' were used to amplify the 925 bp of D-loop gene using PCR (Liu et al, 2002). The PCR conditions were modified as follows: 50 μL mixture of PCR containing 5 μL of 10× buffer, 1 μL of 10 μM PDL1 and PDH1 primers, 3 μL of 10 mM dNTP, 0.25 μL Supernew Taq, 1 μL DNA and 36.75 μL ddH2O. PCR was run with denaturation at 94 ℃ for 3 min followed by 35 cycles of 94 ℃ for 45 s, 52 ℃ for 45 s and 72 ℃ for 90 s, and a final extension of 10 min at 72 ℃. The purified fragments were then sequenced by Sangon Biotech (Shanghai) (Yang et al, 2008; Liu et al, 2011).

Chromosome preparation
Chromosome spreads were prepared from AGF II cells at passage 32. About 10⁶ cells were seeded into a T-75 flask and grown in L-15 medium supplemented with 10% FBS. After 1 day of incubation at 28 ℃, 0.2 μg/mL of colcemid (Sigma) was added to the medium for 2 h to arrest the cells in metaphase. The majority of cells (70%) were dislodged from the flask by gentle tapping and then harvested by centrifugation (200 g, 8 min). Cell pellets were suspended in a hypotonic solution consisting of 0.4% KCl for 15 min, and thereafter fixed in a 3:1 mixture of methanol:acetic acid. Chromosome spreads on a glass slide were then prepared according to the conventional drop-splash technique, followed by staining with 5% Giemsa for 8 min (Zan & Song, 1980; Freshney, 2010), and then the number of chromosomes was counted under a Leica DMR microscope (Leica Mikroskopie und Systeme GmbH).

Cell transfection
Cells were seeded in 24-well plates and transfected with the plasmid DNA of two fluorescent proteins (pEYFP-N1 and pECFP-N1) by Lipofectamine 2000 (Invitrogen) at a ratio of 1:2. The medium was renewed 6 h after transfection, and cells were observed under a Leica inverted fluorescence microscope 24 h after transfection. The transfection efficiency was determined by counting the number of fluorescent protein-positive cells as well as the total cell number in 20 independent optical fields at a magnification of ×400, 60 h post-transfection (Ouyang et al, 2010).

RESULTS

Primary cell culture and maintenance
After 5-6 days, cells migrated out from the edge of the caudal fin tissues and grew well, forming a monolayer within 40 days at 28 ℃. The initial cells consisted of both epithelioid-like and fibroblast-like cells (Figure 1A). After 6 subcultures, the fibroblast-like cells, named AGF II (AG, Anabarilius grahami; F, Fin), became predominant (Figure 1B). To the date of this study, the cell line has been subcultured for more than 60 passages.

Growth of cells
After subculture, AGF II cells progress through a characteristic growth pattern: a lag phase (days 1-3), exponential phase (days 3-7) and stationary phase (days 7-9) (Figure 2). The AGF II cells’ population doubling time during exponential growth was determined to be 51.1 h, at a seeding density of 4.8×10⁴ cells/mL with DMEM/F12+10% FBS at 28 ℃.

Effect of medium on cell growth
The effects of different media on cell growth are presented in Figure 3. More cells adhered on the first day in L-15, DMEM/F12 and DMEM than in M1640. Afterward, cells were able to grow in L-15 and DMEM/F12, with optimal growth in L-15. No obvious growth was observed in DMEM and M1640.

Effect of temperature and FBS concentration on cell growth
Our observations showed that AGF II cells grew at 20 ℃, 24 ℃ and 32 ℃ with optimal growth at 28 ℃ (Figure 4A). The cells grew well between 24 ℃ and

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6 E92 WANG, et al.

Figure 3 Effect of four different media on AGF II cell growth

Figure 1 Photomicrograph of the Anabarilius grahami cell line. A: Epithelioid-like (e) and fibroblast-like (f) cells from fin explants were present after 40 days of culture at 28 ℃; B: Monolayer of AGF II cells at passage 13. Bar=100 μm.

Figure 2 Growth curve of AGF II cells with DMEM/F12+10% FBS at 28 ℃
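The text reports a population doubling time for the exponential phase but does not state the formula behind it. A minimal sketch, assuming the commonly used relation PDT = t × ln 2 / ln(Nt/N0) and simple exponential growth between two counts, is shown below. The function name and the cell counts are hypothetical and are not data from the study.

```python
import math


def population_doubling_time(hours: float, n_start: float, n_end: float) -> float:
    """Doubling time in hours, assuming pure exponential growth.

    Uses PDT = t * ln(2) / ln(N_end / N_start). This formula is an
    assumption; the paper does not say how its value was computed.
    """
    if hours <= 0 or n_start <= 0 or n_end <= n_start:
        raise ValueError("need hours > 0 and n_end > n_start > 0")
    return hours * math.log(2) / math.log(n_end / n_start)


# Hypothetical counts (cells/mL) at the start and end of the exponential
# phase, 96 h apart (day 3 to day 7). A fourfold increase is two doublings.
pdt = population_doubling_time(hours=96, n_start=1.0e5, n_end=4.0e5)
print(f"Population doubling time: {pdt:.1f} h")
```

With real data, the two counts would be read from the exponential segment of the growth curve only, since including lag or stationary phase points inflates the estimate.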

28 ℃, but when the culture temperature declined to 20 ℃, no obvious growth was observed and the cells stayed in the lag phase. When the temperature was raised to 32 ℃, not only was no growth observed, but the number of viable cells also declined.

Growth of AGF II cells at varying FBS concentrations is shown in Figure 4B. Our results show that serum was required for the cells to survive and grow. When the concentration of FBS varied between 5% and 20%, higher FBS concentrations gave faster cell growth, but no faster growth rate was observed when the concentration of FBS was increased to 25%.

Figure 4 Effect of (A) temperature and (B) FBS concentration on AGF II cell growth

Effect of cryoprotectant on cell cryopreservation
The type and concentration of cryoprotectant can affect the quality of thawed cells. Our data showed that the viability and the ability of cells to recover after cryopreservation were significantly lower

than those of the fresh control (Figure 5A). DMSO treatment gave a better re-viability percentage (80.36±5.3)% and re-adhesion percentage (44.24±7.5)% than the other treatments (P<0.01). The difference in re-viability percentage between the Glycerol (61.29±4.5)% and EG (60.29±5.1)% treatments was not significant (P>0.05), nor was the difference in re-adhesion percentage. MeOH treatment gave the worst re-viability and re-adhesion percentages, (28.3±4.5)% and (2.42±1.0)%, respectively. For both the viability and adhesion percentages of cryopreserved cells, we noted a discernible order of DMSO>Glycerol and EG>MeOH. These results indicate that DMSO is the best cryoprotectant for AGF II cells.

Figure 5 Effect of different cryoprotectants (A) and concentrations of DMSO (B) on the viability and adhesion percentages of recovered AGF II cells at passage 21 after cryopreservation

In addition, all graded concentrations of DMSO led to more cell losses than those observed with the fresh control (P<0.05). Among the 5 treatments, 5% proved to be the optimal concentration, with re-viability and re-adhesion percentages of (94.03±5.7)% and (51.2±8.0)%, respectively. As the concentration of DMSO increased, the percentages of viability and adhesion decreased. More than 70% of the thawed cells were viable after cryopreservation for a month, and more than 40% could be recovered after seeding with the 4 other treatments (Figure 5B).

Cell origin identification
PCR products of the D-loop region showed a band near 1 000 bp by agarose gel electrophoresis (Figure 6). Sequencing yielded a 925 bp fragment, and comparative analysis showed that the sequences from fresh A. grahami muscle tissue and from AGF II at passages 3, 10 and 20 were identical. These data confirmed that the established AGF II cell line originated from A. grahami.

Figure 6 Photograph of agarose gel electrophoresis of PCR products. Lane M: total DNA from fresh muscle of A. grahami; lanes P3, P10 and P20: the AGF II cell line at passages 3, 10 and 20, respectively.

Chromosome analysis
Chromosome morphology of AGF II is presented in Figure 7. Counts of 108 metaphase plates revealed that the chromosome number of AGF II cells was distributed widely between 38 and 52, with a modal peak at 48 chromosomes; 39.8% of all cells contained 48 chromosomes. Aneuploidy was observed in the AGF II cell line, though in small proportions: 14.8%, 12.1% and 10.2% of all cells contained 50, 49 and 47 chromosomes, respectively. The diploid karyotype formula of the A. grahami fin cell line was 14 m (metacentric chromosomes) + 20 sm (submetacentric chromosomes) + 14 st (subtelocentric chromosomes).

Cell transfection
After electroporation, we observed yellow and cyan fluorescent proteins in the cell line 24 h after transfection (Figure 8). The transfection efficiency was 35% for pEYFP-N1 and 10% for pECFP-N1, indicating the suitability of this cell line for transfection and gene expression studies.
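The chromosome figures reported under Chromosome analysis can be cross-checked with simple arithmetic: the karyotype formula should sum to the modal number, and each reported percentage should correspond to a whole number of the 108 metaphases counted. The sketch below is illustrative only; the variable names are hypothetical, and the cell counts are back-calculated by rounding rather than taken from the paper.

```python
METAPHASES_COUNTED = 108

# Karyotype formula from the text: 14 m + 20 sm + 14 st
karyotype = {"m": 14, "sm": 20, "st": 14}

# Reported share of metaphases (percent) keyed by chromosome number
reported_percent = {48: 39.8, 50: 14.8, 49: 12.1, 47: 10.2}

# The formula should add up to the diploid number
diploid_number = sum(karyotype.values())

# The modal number is the chromosome count with the largest share
modal_number = max(reported_percent, key=reported_percent.get)

# Back-calculate approximate cell counts (rounded, so only approximate)
approx_cells = {
    n: round(pct / 100 * METAPHASES_COUNTED)
    for n, pct in reported_percent.items()
}

print(f"2n from karyotype formula: {diploid_number}")
print(f"Modal chromosome number:   {modal_number}")
for n, cells in sorted(approx_cells.items()):
    print(f"{n} chromosomes: about {cells} of {METAPHASES_COUNTED} metaphases")
```

The four listed classes account for about 83 of the 108 metaphases; the remainder is spread over the other counts between 38 and 52.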


Figure 8 Fluorescent micrographs of yellow fluorescent protein (YFP, A) and cyan fluorescent protein (CFP, B) expression in transfected AGF II cells using pEYFP-N1 and pECFP-N1, respectively (×400)
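Transfection efficiency was determined by counting fluorescent protein-positive cells and total cells in 20 optical fields. A minimal sketch of that calculation follows, assuming the counts are pooled across fields before dividing (the paper does not say whether it pooled counts or averaged per-field ratios). The function name and all counts are hypothetical.

```python
def transfection_efficiency(positive_per_field, total_per_field):
    """Percentage of fluorescent-positive cells, pooled over all fields."""
    if len(positive_per_field) != len(total_per_field):
        raise ValueError("both lists must cover the same fields")
    total_cells = sum(total_per_field)
    if total_cells == 0:
        raise ValueError("no cells counted")
    return 100.0 * sum(positive_per_field) / total_cells


# Hypothetical counts for 20 optical fields (not data from the study)
positive = [6, 8, 7, 5, 9, 7, 6, 8, 7, 7] * 2
total = [20] * 20

efficiency = transfection_efficiency(positive, total)
print(f"Fields counted: {len(total)}")
print(f"Transfection efficiency: {efficiency:.1f}%")
```

Pooling weights each cell equally, whereas averaging per-field ratios weights each field equally; the two agree only when every field contains the same number of cells.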

Figure 7 Chromosome distribution (A), metaphase (B) and diploid karyotype (C) of A. grahami fin cells at passage 32. 108 metaphases were counted

DISCUSSION

AGF II is the first cell line of Anabarilius grahami and is also the first cell line developed from a fish of Fuxian Lake (Yang, 1994). The morphology of the initial cells is the same as that of other fin cell lines (Zhou et al, 2008; Ku et al, 2010). Different cell lines require different culture and cryopreservation conditions (Chen & Qin, 2011; Lakra et al, 2011), suggesting that it is necessary to optimize the conditions for each cell line, and important to identify the cell origin to ensure that AGF II comes from A. grahami, as well as to assess its applications.

Culture conditions
Though different media are used to culture different cell lines, DMEM/F12 comes close to being an all-purpose culture medium for the cells of mammals, birds, reptiles, amphibians and fish (Wolf & Quimby, 1969), and L-15 does not require CO2 buffering, so both have been used successfully for fish cell lines (Leibovitz, 1963). The optimal incubation temperature of AGF II cells (24 ℃-28 ℃) is higher than the optimal growth temperature of adult fish (23 ℃-26 ℃) and fertilized eggs (20 ℃) of A. grahami (Li et al, 2003a; Ma et al, 2008). A long-term aquarium environment may contribute to these differences to a certain extent, because A. grahami could adapt to temperatures of 28 ℃-29 ℃ under artificial breeding.
FBS seems to be the most popular supplement in tissue culture media, as it is easy to obtain in large volumes and contains both known and unknown growth factors (Lakra et al, 2011). In fact, in FBS-free medium the AGF II cells did not adhere to the flask substratum, suggesting that FBS may play a key role in promoting cell attachment and proliferation, which is consistent with the case of the Acipenser sinensis fin cell lines (Zhou et al, 2008). However, FBS concentrations are not usually

much higher than 20%, as there is evidence that high serum concentrations may inhibit cell growth (Freshney, 2010; Chen & Qin, 2011).

Cell cryopreservation
Lower vertebrate cells are frozen using the same procedure as homeotherm cells, with optimal storage in liquid nitrogen (Wolf & Mann, 1980). However, less attention has been paid to the effects of different cryoprotectants and concentrations on cell cryopreservation. DMSO is the most common cryoprotectant for cultured cells (Mauger et al, 2006; Freshney, 2010), whereas Glycerol, EG and MeOH are mostly used in the cryopreservation of germ cells rather than somatic cells (Cabrita et al, 2010). All cryoprotectants have proved to have some negative effects on fish cells, since the percentages of viability and adhesion of thawed cells are lower than those of fresh cells, which may be related to the mild toxicity of cryoprotectants toward fresh cells, apart from the effect of low temperature (Borenfreund et al, 1989; Muchlisin, 2005; Mauger et al, 2006; Cabrita et al, 2010). DMSO has been demonstrated to be better than the other cryoprotectants for A. grahami cell cryopreservation. Similarly, the concentration of DMSO did not alter this result, because all DMSO concentrations tested gave relatively higher viability and adhesion than Glycerol, EG and MeOH.

Cell identification
Identifying the cell origin after a cell line has been established is essential. Chromosome analysis is the most commonly used method (Freshney, 2010), and the mitochondrial DNA control region (D-loop) is highly variable and can be used for phylogenetic analysis (Yang et al, 2008); accordingly, both were used to verify whether AGF II was actually derived from A. grahami. Both the PCR amplification and the chromosome analysis confirmed that the AGF II cells were from A. grahami.
The D-loop is an important non-coding region of mitochondrial DNA that possesses a high rate of evolutionary change (Yang et al, 2008). Comparison of the D-loop sequences showed no obvious difference between the A. grahami used here and the wider A. grahami species. The relatively high genetic diversity of A. grahami from the EFCC may contribute to this (Yang et al, 2008).
The modal chromosome number (2n=48) of the AGF II cell line obtained by karyotype analysis was consistent with the results of Zan & Song (1980). The percentage of cells containing the modal chromosome number varies from 20% to 70%, depending on the species and cell line (Chen & Qin, 2011). The percentage in AGF II is 39.8%, which is not ideal. However, because the origin of the material (i.e., whether a wild or domestic population was used, and of which generation) was not explicitly described in most studies, comparing our findings with their results is not clearly useful. In contrast, AGF II is derived from the tenth generation of artificially propagated A. grahami, suggesting that, to some degree, long-term artificial propagation of A. grahami has been successful and may benefit further from the frequent addition of wild A. grahami individuals from different locations (Yang et al, 2008). Both aneuploidy and heteroploidy are widespread in fish cell lines (Wolf & Quimby, 1969; Kang et al, 2003; Ye et al, 2006; Cheng et al, 2010), which may be connected with the growth status of the cell lines (Wolf & Quimby, 1969).

Cell application
Like other cell lines, AGF II is capable of accepting exogenous genes (Zhou et al, 2008; Ouyang et al, 2010; Cheng et al, 2010), so we also assessed the application of the A. grahami cell line for exogenous gene manipulation by examining the ability of AGF II cells to express the YFP and CFP genes. Transfection efficiency differs among cell lines (Zhou et al, 2008; Ku et al, 2010), and the ability of the same cell line to express different genes also differs (Hameed et al, 2006), which may explain why the transfection efficiency of pECFP-N1 is lower than that of pEYFP-N1.
In summary, we established a freshwater fish cell line, AGF II, from the caudal fin of A. grahami, and subcultured it for more than 60 passages. This cell line could provide an important tool for the further study of exogenous gene manipulation among freshwater fish. We also expect that the results of this study will attract more researchers to pay attention to the cell culture of native fishes, and help protect the fish germplasm resources of Yunnan province.

Acknowledgements: We thank Wenhui NIE, Jinhuan WANG, Weiting SU and Yi HU (Cell Bank of the Kunming Institute of Zoology, CAS) for their help and support in the experiments, and also Wansheng JIANG (State Key Laboratory of Genetic Resources and Evolution of the Kunming Institute of Zoology, CAS) for assistance in improving the manuscript.

References

Babich H, Borenfreund E. 1991. Cytotoxicity and genotoxicity assays with cultured fish cells: a review. Toxicol in Vitro, 5(2-3): 91-100.
Bols NC, Lee LEJ. 1991. Technology and uses of cell culture from tissues and organs of bony fish. Cytotechnology, 6(3): 163-187.
Bols NC, Brubacher JL, Ganassin RC, Lee LEJ. 2001. Ecotoxicology and innate immunity in fish. Dev Comp Immunol, 25(8-9): 853-873.
Borenfreund E, Babich H, Martin-Alguacil N. 1989. Effect of methylazoxymethanol acetate on bluegill sunfish cell cultures in vitro. Ecotox Environ Saf, 17(3): 297-307.
Cabrita E, Sarasquete C, Martínez-Páramo S, Robles V, Beirão J, Pérez-Cerezales S, Herráez MP. 2010. Cryopreservation of fish sperm: applications and perspectives. J Appl Ichthyol, 26(5): 623-635.
Chen SL. 2002. Progress and prospect of cryopreservation of fish gametes and embryos. J Fish Chn, 26(2): 161-168.
Chen SL, Ye HQ, Sha ZX, Hong Y. 2003. Derivation of a pluripotent embryonic cell line from red sea bream blastulas. J Fish Biol, 63(3): 795-805.
Chen SL, Qin QW. 2011. Theory and Technology of Fish Cell Culture. Beijing: Science Press.
Cheng TC, Lai YS, Lin IY, Wu CP, Chang SL, Chen TI, Su MS. 2010. Establishment, characterization, virus susceptibility and transfection of cell lines from cobia, Rachycentron canadum (L.), brain and fin. J Fish Dis, 33(2): 161-169.
Clem LW, Moewus L, Sigel MM. 1961. Studies with cells from marine fish in tissue culture. Proc Soc Exp Biol Med, 108(3): 762-766.
Freshney RI. 2010. Culture of Animal Cells: A Manual of Basic Technique. 6th ed. New York: Wiley-Blackwell.
Fryer J, Lannan C. 1994. Three decades of fish cell culture: a current listing of cell lines derived from fishes. Method Cell Sci, 16(2): 87-94.
Hameed ASS, Parameswaran V, Shukla R, Singh ISB, Thirunavukkarasu AR, Bhonde RR. 2006. Establishment and characterization of India's first marine fish cell line (SISK) from the kidney of sea bass (Lates calcarifer). Aquaculture, 257(1-4): 92-103.
Huang XH, Huang YH, Sun JJ, Han X, Qin QW. 2009. Characterization of two grouper Epinephelus akaara cell lines: Application to studies of Singapore grouper iridovirus (SGIV) propagation and virus-host interaction. Aquaculture, 292(3): 172-179.
Kang MS, Oh MJ, Kim YJ, Kawai K, Jung SJ. 2003. Establishment and characterization of two new cell lines derived from flounder, Paralichthys olivaceus (Temminck & Schlegel). J Fish Dis, 26(11-12): 657-665.
Ku CC, Lu CH, Wang CS. 2010. Establishment and characterization of a fibroblast cell line derived from the dorsal fin of red sea bream, Pagrus major (Temminck & Schlegel). J Fish Dis, 33(3): 187-195.
Lakra WS, Swaminathan TR, Joy KP. 2011. Development, characterization, conservation and storage of fish cell lines: a review. Fish Physiol Biochem, 37(1): 1-20.
Leibovitz A. 1963. The growth and maintenance of tissue-cell cultures in free gas exchange with the atmosphere. Am J Hyg, 78(2): 173-180.
Ley SH, Yu MK, Li KC, Tseng CM, Chen CY, Kao PY, Huang FC. 1963. Limnological survey of the lakes of Yunnan Plateau. Oceanologia Limnologia Sin, 5(2): 87-114.
Li ZY, Chen YR, Yang JX. 2003a. Biological characters and reason analysis population attenuation of Anabarilius grahami. Freshwater Fish, 33(1): 26-27. (in Chinese)
Li ZY, Chen YR, Yang JX, Zhang PQ, Huang MH. 2003b. Artificial Propagation and Larvae Cultivation of Sinocyclocheilus tingi. Zool Res, 30(4): 463-467. (in Chinese)
Li ZY, Chen YR, Yang JX, Zhang PQ, Huang MH. 2003c. Egg-collection, hatching and fry rearing of Anabarilius grahami (Regan). Freshwater Fish, 33(3): 29-31. (in Chinese)
Li ZY, Chen YR, Yang JX, Zhang PQ, Huang MH, Gu HS. 2003d. Examples for captivity and fish disease control in Anabarilius grahami. Freshwater Fish, 33(4): 32-34. (in Chinese)
Liu HY, Xiong F, Yang D, Zhang FR, Yu LN. 2011. Mitochondrial cytochrome b gene sequence diversity in wild and cultured population of Anabarilius grahami. J Huazhong Agri Univ, 30(1): 94-98.
Liu HZ, Teng CS, Teng HY. 2002. Sequence variations in the mitochondrial DNA control region and their implications for the phylogeny of the Cypriniformes. Can J Zool, 80(3): 569-581.
Liu SW, Chen XY, Yang JX. 2009. Threatened fishes of the world: Anabarilius grahami Regan, 1908 (Cyprinidae). Environ Biol Fish, 86(3): 399-400.
Ma L, Pan XF, Wei YH, Li ZY, Li CC, Yang JX, Mao BY. 2008. Embryonic stages and eye-specific gene expression of the local cyprinoid fish Anabarilius grahami in Fuxian Lake, China. J Fish Biol, 73(8): 1946-1959.
Mauger PE, Lebail PY, Labbé C. 2006. Cryobanking of fish somatic cells: Optimizations of fin explant culture and fin cell cryopreservation. Comp Biochem Phys B: Biochem Mol Biol, 144(1): 29-37.
Muchlisin Z. 2005. Review: Current status of extenders and cryoprotectants on fish spermatozoa cryopreservation. Biodiversitas, 6(1): 66-69.
Ouyang ZL, Huang XH, Huang EY, Huang YH, Gong J, Sun JJ, Qin QW. 2010. Establishment and characterization of a new marine fish cell line derived from red-spotted grouper Epinephelus akaara. J Fish Biol, 77(5): 1083-1095.
Qin JH, Xu J, Xie P. 2007. Diet Overlap between the Endemic Fish Anabarilius grahami (Cyprinidae) and the Exotic Noodlefish Neosalanx taihuensis (Salangidae) in Lake Fuxian, China. J Freshwater Ecol, 22(3): 365-370.
Su HH, Joung SJ, Liu KM. 2012. Fisheries, management and conservation of the whale shark Rhincodon typus in Taiwan. J Fish Biol, 80(5): 1595-1607.
Tan FX, Yang FX, Wang WM, Wang M, Lu YA. 2009. A new fish cell line of fin established from rare minnow as versatile tool in ecotoxicology assessment of cytotoxicity of heavy metals. Acta Hydrobiol Sin, 33(4): 767-771. (in Chinese)
Wang S, Xie Y. 2004. China Species Red List, vol. 1. Red List. Beijing: Higher Education Press. (in Chinese)
Wolf K, Quimby MC. 1962. Established eurythermic line of fish cells in vitro. Science, 135(3508): 1065-1066.
Wolf K, Quimby MC. 1969. Fish Cell and Tissue Culture // Hoar WS, Randall DJ eds. Fish Physiology, vol III. New York: Academic Press.
Wolf K, Mann JA. 1980. Poikilotherm vertebrate cell-lines and viruses-a current listing for fishes. In Vitro, 16(2): 168-179.
Wolf K. 1988. Fish Viruses and Fish Viral Diseases. New York: Cornell University Press.
Xiong F, Li WC, Pan JH, Li AQ, Xia TX. 2006. Status and changes of fish resources in Lake Fuxian, Yunnan Province. J Lake Sci, 18(3): 305-311. (in Chinese)
Yang B, Chen XY, Yang JX. 2008. Structure of the mitochondrial DNA control region and population genetic diversity analysis of Anabarilius grahami (Regan). Zool Res, 29(4): 379-385.
Yang LF. 1984. The preliminary study on the original classification and distribution law of lakes on the Yunnan Plateau. Trans Oceanol Limno, 6(1): 34-39.
Yang JX. 1994. The biological characters of fishes of Fuxian Lake, Yunnan, with comments on their adaptation to the lacustrine. Zool Res, 15(2): 1-9. (in Chinese)
Yang JX, Chen YR. 1995. The Biology and Resource Utilization of the Fishes of Fuxian Lake Yunnan. Kunming: Yunnan Science and Technology Press. (in Chinese)
Yang L, Li H. 2010. Yunnan Wetland. Beijing: Chinese Forestry Press.
Ye HQ, Chen SL, Sha ZX, Xu MY. 2006. Development and characterization of cell lines from heart, liver, spleen and head kidney of sea perch Lateolabrax japonicus. J Fish Biol, 69(sa): 115-126.
Zan RG, Song Z. 1980. Studies of the karyotypes of eight species of fishes in Cyprinus and Anabarilius. Zool Res, 1(2): 141-150. (in Chinese)
Zhou GZ, Gui L, Li ZQ, Yuan XP, Zhang QY. 2008. Establishment of a Chinese sturgeon Acipenser sinensis tail-fin cell line and its susceptibility to frog iridovirus. J Fish Biol, 73(8): 2058-2067.

Zoological Research 33 (E5−6): E98−E103 doi: 10.3724/SP.J.1141.2012.E05-06E98

Stage-specific appearance of cytoplasmic microtubules around the surviving nuclei during the third prezygotic division of Paramecium

Yiwen WANG, Jinqiang YUAN, Xin GAO, Xianyu YANG*

The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin’an 311300, China

Abstract: There are six micronuclear divisions during conjugation of Paramecium caudatum: three prezygotic and three postzygotic divisions. Four haploid nuclei are formed during the first two meiotic prezygotic divisions. Usually only one meiotic product is located in the paroral cone (PC) region at the completion of meiosis, which survives and divides mitotically to complete the third prezygotic division to yield a stationary and a migratory pronucleus. The remaining three located outside of the PC degenerate. The migratory pronuclei are then exchanged between two conjugants and fuse with the stationary pronuclei to form synkarya, which undergo three successive divisions (postzygotic divisions). However, little is known about the surviving mechanism of the PC nuclei. In the current study, stage-specific appearance of cytoplasmic microtubules (cMTs) was indicated during the third prezygotic division by immunofluorescence labeling with anti-alpha tubulin antibodies surrounding the surviving nuclei, including the PC nuclei and the two types of prospective pronuclei. This suggested that cMTs were involved in the formation of a physical barrier, whose function may relate to sequestering and protecting the surviving nuclei from the major cytoplasm, where degeneration of extra-meiotic products occurs, another important nuclear event during the third prezygotic division.

Keywords: Paramecium; Conjugation; Meiosis; Paroral cone region; Cytoplasmic microtubules

Received: 23 July 2012; Accepted: 10 October 2012
Foundation items: National Natural Science Foundation of China (31071881); Research Development Foundation of Zhejiang A & F University (2009FK67).
* Corresponding author, E-mail: [email protected]
Science Press Volume 33 Issues E5−6

Ciliates are a group of unicellular eukaryotes with nuclear dualism, possessing both polygenomic somatic macronuclei and diploid germinal micronuclei derived from synkaryon (fertilized nucleus) division products through conjugation, one kind of sexual reproduction (Orias et al, 2011). Paramecium caudatum is a globally distributed ciliate with one micronucleus and one macronucleus, and six micronuclear divisions during conjugation: three prezygotic and three postzygotic. The first two meiotic prezygotic divisions form four haploid nuclei, among which one is located and survives in the paroral cone region (PC, the area around the degenerated oral apparatus) of each conjugant (each cell of a conjugating pair), with the remaining three degenerating outside the PC. The surviving PC nucleus undergoes a third prezygotic division yielding two gametic nuclei: a stationary pronucleus (StP) and a migratory pronucleus (MiP), corresponding to a female and a male gamete of multicellular organisms, respectively. The MiPs are then exchanged reciprocally between the two conjugants and fuse with the StPs to form synkarya, which divide three times successively (postzygotic divisions) to form eight synkaryon products (Calkins & Cull, 1907; Wichterman, 1986).

In our previous studies, immunofluorescence labeling and protargol staining techniques indicated stage-specific spindle microtubular behavior at the telophase of the third prezygotic division (Gao et al, 2011a, 2011b). Cytoplasmic microtubules (cMTs) have been suggested to play a role in pronuclear exchange (Nakajima et al, 2001). In the current study, immunofluorescence labeling with monoclonal anti-alpha tubulin antibodies indicated stage-specific behavior of cMTs during the third prezygotic division in P. caudatum, which might be related to the surviving mechanism of meiotic products and the two gametic pronuclei.

MATERIAL AND METHODS

Chemicals
Immunofluorescence staining-related chemicals

including monoclonal antibodies of mouse anti-α tubulin, FITC-conjugated goat anti-mouse IgG, propidium iodide (PI), Triton-X 100, 0.5 mol/L EGTA (pH 8.0), 4% paraformaldehyde, and RNase A were purchased from the Beyotime Institute of Biotechnology (Haimen, Jiangsu, China). Other chemicals were provided by the Hangzhou Dafang Chemical Reagent Inc (China).

Cell culture and induction of conjugation
Two complementary mating types of P. caudatum were collected from the East Lake Campus of Zhejiang A & F University (China). Cell culture and conjugation induction followed previous studies (Hiwatashi, 1968). Conjugating pairs were isolated by iron-dextran particles (Sun et al, 2010; Yang & Takahashi, 1999).

Immunofluorescence staining
Immunofluorescence staining with monoclonal anti-α tubulin antibodies followed previous studies (Gao et al, 2011b; Yang & Takahashi, 2002).
1) Cells were fixed in 2% paraformaldehyde diluted in 2 mmol/L phosphate buffer (pH 7.0) containing 25 mmol/L KCl (PBS).
2) Fixed cells were rinsed three times with washing buffer (PBS containing 5 mmol/L MgSO4, 2 mmol/L EGTA, and 0.05% Triton-X 100).
3) Cells were blocked with 1% BSA dissolved in 5 mmol/L NH4Cl.
4) Cells were incubated overnight in 1 000× diluted monoclonal anti-α tubulin antibodies containing 10 μg/mL RNase A.
5) After three rinses, cells were incubated for 2 h in 500× diluted FITC-conjugated goat anti-mouse IgG containing 2.5 μg/mL PI.
6) After a brief rinse, the stained cells were observed under a fluorescence Olympus DX60 microscope.
All experiments were performed at room temperature (25 °C).

RESULTS

It is well known that microtubules are involved in pronuclear transfer in both Paramecium (Jurand, 1976; Nakajima et al, 2001) and Tetrahymena (Orias et al, 1983). In 2001, cytoplasmic microtubules (cMTs) and intranuclear microtubules (nMTs) were reported to play important roles in reciprocal nuclear exchange in P. caudatum (Nakajima et al, 2001). To determine whether cytoplasmic microtubules play any role in the surviving mechanism of post-meiotic nuclei, immunofluorescence labeling with anti-alpha tubulin antibodies was performed on prezygotic conjugating pairs.

No definite orientation of spindle extension during the first prezygotic division
At the telophase of the first prezygotic division, anti-α tubulin antibodies recognized long and slender spindles and two meiotic products, but no macronuclei were detected (Figure 1A). The PI staining recognized two meiotic products and the macronuclei (Figure 1A′). Neither a definite orientation of spindle extension nor a definite localization of the first meiotic products was observed. As a result, the two meiotic products were randomly distributed in the cytoplasm (Figure 1A"), as reported previously (Gao et al, 2011b).

One telophase nucleus located in PC during the second prezygotic division
During the second prezygotic division, the anti-α tubulin antibodies also recognized long and slender spindles and four telophase division products. There was no definite orientation of spindle extension, but one or more telophase nuclei were already located in the PC (Figure 1B, B′, B"), as observed previously (Gao et al, 2011b, 2011c). However, in ten conjugants observed at earlier telophase, the spindles showed a tendency to extend directly towards the PC area (Figure 1C, D, E).

Microtubular behavior soon after meiosis
At the completion of meiosis, one meiotic nucleus was already located in the PC, and nMTs were observed in all four meiotic products, which showed the same immunostaining pattern regardless of their location inside or outside the PC (compare yellow and white arrows in Figure 2A, A′, A"). With the onset of the third prezygotic division, the anti-α tubulin antibodies recognized more tiny dots, the cytoplasmic microtubules (cMTs), which appeared and accumulated around the PC areas of the two conjugants (Figure 2B, B′, B"), and at metaphase these cMTs formed upside-down heart shapes (Figure 2C, C′, C"). However, these cMTs never appeared around the meiotic nuclei outside the PC, and even the nMTs faded away from them (compare yellow and white arrows in Figure 2C, C"). More than 50 conjugants were observed in each case and 100% of cells showed the same characteristics.

Microtubular behavior around the telophase of the third prezygotic division
At the telophase of the third prezygotic division, many tiny cMT dots still existed and were mainly observed around both the prospective StP and MiP (blue and red arrows in Figure 3A, A" and B, B", respectively). At the stage of pronuclear exchange, however, almost no cMTs were observed around the two pronuclei, but nMTs still existed (Figure 3C, C"). More than 50 conjugants were observed in each case and 100% of cells showed the same characteristics.

DISCUSSION

Previous studies on immunofluorescence labeling with anti-α tubulin antibodies have indicated a


Figure 1 Nuclear behavior during the first and the second prezygotic divisions of P. caudatum indicated by FITC and PI double staining. A: Telophase of the first prezygotic division; B: Telophase of the second prezygotic division; C, D, E: Earlier telophase of the second prezygotic division, one meiotic product entered the PC region. A and B are FITC images; A′ and B′ are PI images; A″, B″, C, D and E are merged images of FITC and PI; A, A′, A″ and B, B′, B″ are the same cell in each row; Triangles: macronuclei; Bars=20 µm.


Figure 2 Appearance of cytoplasmic microtubules (cMTs) soon after meiosis of P. caudatum indicated by FITC and PI double staining. A, B: Soon after meiosis; C: At the beginning of the third prezygotic division. A, B and C are FITC images; A′, B′ and C′ are PI images; A″, B″ and C″ are merged images of FITC and PI; A, A′, A″ and B, B′, B″ and C, C′, C″ are the same cell in each row; Yellow arrows: meiotic products in the PC; White arrows: meiotic products outside the PC; Triangle: old macronuclei; Oval area: the area of cMT appearance; Bars=20 µm.


Figure 3 Distribution of cMTs before and after the completion of the third prezygotic division of P. caudatum indicated by FITC and PI double staining. A, B: Late telophase of the third prezygotic division; C: Reciprocal migratory pronuclear exchange. A, B and C are FITC images; A′, B′ and C′ are PI images; A″, B″ and C″ are merged images of FITC and PI; A, A′, A″ and B, B′, B″ and C, C′, C″ are the same cell in each row; Yellow arrows: (prospective) migratory pronuclei; Blue arrows: (prospective) stationary pronuclei; White arrows: meiotic products outside PC; Triangle: old macronuclei; Bars=20 µm.

microtubular function in at least two aspects. The first involves a role during reciprocal pronuclear exchange in P. caudatum (Nakajima et al, 2001), which is supported by different experimental techniques in Tetrahymena (Orias et al, 1983) and other Paramecium species (Jurand, 1976; Yang & Shi, 2007). The second is guiding the nuclei to their destined locations, including the PC entrance at the completion of meiosis and the anterior and posterior localization at the telophase of the third postzygotic division (Gao et al, 2011a, 2011b, 2011c; Yang & Takahashi, 2002).
In the current study, stage- and space-specific cMTs were observed by immunofluorescence labeling with anti-alpha tubulin antibodies. They appeared during the third prezygotic division and were distributed around the surviving nuclei, including the meiotic products in the PC (Figure 2) and the two prospective StPs and MiPs (Figure 3). However, such cMTs appeared neither around the extra degenerating meiotic products (Figure 2C", 3A", B") nor around the nuclear products of the first prezygotic division (Figure 1C", 3C") and the three postzygotic divisions (Yang & Takahashi, 2002). These results might indicate cMT involvement in the surviving mechanism of the post-meiotic nuclei.
In fact, intranuclear microtubules (nMTs) exist in all normal functional micronuclei, including vegetative micronuclei, the two meiotic division products, the selected meiotic products in the PC, the two types of pronuclei, the products of the three postzygotic divisions (Gao et al, 2011b, 2011c; Ishida et al, 1999; Nakajima et al, 2001, 2002; Yang & Takahashi, 2002), and the presumptive micronuclei in exconjugants (Ishida et al, 1999; Taka et al, 2006). But such nMTs are never retained in the degenerating meiotic products or in the differentiated macronuclear anlagen (Ishida et al, 1999). In the current study, the cMTs were observed around the PC nuclei, and these nuclei also kept their nMTs during the third prezygotic division (Figure 3). No such cMTs appeared around the meiotic products outside the PC, and even the nMTs faded away from them completely (Figure 3). In other words, the behavior of cMTs and nMTs was consistent during the third prezygotic nuclear division. During the third prezygotic division, degeneration of the extra meiotic products outside the PC occurs (Gao et al, 2010; Yang et al, 2007). Combining the observations of the current study with those of previous studies, we suggest that both nMTs and cMTs are indispensable during the third prezygotic division and that their function might relate to the survival of the PC nuclei and the two prospective pronuclei, which needs further investigation.

References

Calkins GN, Cull SW. 1907. The conjugation of Paramecium aurelia (caudatum). Arch Protistenk, 10: 375-415.
Gao X, Zhang XJ, Yang XY. 2010. Morphological apoptotic characteristics of the post-meiotic micronuclei in Paramecium caudatum. Europ J Protistol, 46(3): 243-250.
Gao X, Shi XB, Yang XY. 2011a. Dynamic behaviour of stationary pronuclei during their positioning in Paramecium caudatum. Europ J Protistol, 47(3): 235-237.
Gao X, Zhu JJ, Yang XY, Wang YW, Yuan JQ, Song MG. 2011b. Localization of stationary pronuclei during conjugation of Paramecium indicated by immunofluorescence staining. Zool Res, 32(4): 461-464.
Gao X, Yang XY, Zhu JJ, Yuan JQ, Wang YW, Song MG. 2011c. New details indicated by different stainings during conjugation of ciliated protozoa Paramecium. Zool Res, 32(6): 651-656.
Hiwatashi K. 1968. Determination and inheritance of mating types in Paramecium caudatum. Genetics, 58(3): 373-386.
Ishida M, Nakajima Y, Kurokawa K, Mikami K. 1999. Nuclear behavior and differentiation in Paramecium caudatum analyzed by immunofluorescence with anti-tubulin antibody. Zool Sci, 16(6): 915-926.
Jurand A. 1976. Some ultrastructural features of micronuclei during conjugation and autogamy in Paramecium aurelia. J Gen Microbiol, 94(1): 193-203.
Nakajima Y, Mikami K, Takahashi M. 2001. Role of the cytoplasmic and the intranuclear microtubules on the behavior of pronuclei during conjugation in Paramecium caudatum. Proc JPN Acad Ser B, 77(9): 172-177.
Nakajima Y, Ishida M, Mikami K. 2002. Microtubules mediate germ-nuclear behavior after meiosis in conjugation of Paramecium caudatum. J Eukaryot Microbiol, 49(1): 74-81.
Orias E, Cervantes MD, Hamilton EP. 2011. Tetrahymena thermophila, a unicellular eukaryote with separate germline and somatic genomes. Res Microbiol, 162(6): 578-586.
Orias JD, Hamilton E, Orias E. 1983. A microtubule meshwork associated with gametic pronucleus transfer across a cell-cell junction. Science, 222(4620): 181-184.
Sun XT, Wu JW, Gao X, Yang XY. 2010. To enhance the production of magnetic iron-dextran particles by ultrasonic treatment and its applications in ciliate studies. Chn J Zool, 45(1): 90-93. (in Chinese)
Taka N, Kurokawa K, Araki T, Mikami K. 2006. Selection of the germinal micronucleus in Paramecium caudatum: nuclear division and nuclear death. J Eukaryot Microbiol, 53(3): 177-184.
Wichterman R. 1986. The Biology of Paramecium [M]. 2nd Ed. New York: Plenum Press.
Yang XY, Takahashi M. 1999. Disturbance of determination of germinal and somatic nuclei by heat shock in Paramecium caudatum. J Eukaryot Microbiol, 46(1): 49-55.
Yang XY, Takahashi M. 2002. Nuclei may anchor at specific locations during nuclear determination in Paramecium caudatum. Europ J Protistol, 38(2): 147-153.
Yang XY, Shi XB. 2007. Gametic nuclear exchange during the conjugation of Paramecium polycaryum. JPN J Protozool, 40: 113-121.
Yang XY, Gao X, Shi XL. 2007. Detection of haploid nuclear death in living Paramecium caudatum. JPN J Protozool, 40: 123-130.

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6

Zoological Research 33 (E5−6): E104−E110 doi: 10.3724/SP.J.1141.2012.E05-06E104

Molecular phylogeny and divergence time of Trachypithecus: with implications for the taxonomy of T. phayrei

Kai HE 1,2, Naiqing HU 1,2, Joseph D. ORKIN 1,3, Daw Thida NYEIN 4, Chi MA 5, Wen XIAO 5, Pengfei FAN 5, Xuelong JIANG 1,*

1. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, the Chinese Academy of Sciences, Kunming 650223, China
2. University of the Chinese Academy of Sciences, Beijing 100049, China
3. Department of Anthropology, Washington University in St. Louis, St. Louis, MO 63130, USA
4. Biodiversity and Nature Conservation Association, Yangon, Myanmar
5. Institute of Eastern-Himalaya Biodiversity Research, Dali University, Dali 671003, China

Abstract: The genus Trachypithecus is the most diverse langur taxon, distributed in southwestern China and in south and southeastern Asia. In this study, we included 16 recognized Trachypithecus species to reconstruct the phylogeny using multiple genes, with particular attention to the taxonomy of the three subspecies of T. phayrei. Our results support a sister-relationship between T. p. phayrei and T. p. shanicus. However, the mitochondrial CYT B gene supported T. p. crepuscula as a distinct species, whereas the nuclear PRM1 gene suggested a closer relationship between T. p. crepuscula and T. p. phayrei. The incongruence between nuclear and mitochondrial genes suggests that hybridization may have occurred, a hypothesis that would benefit from re-examination using multiple unlinked nuclear genes.

Keywords: Non-invasive sampling; Partitioned Bayesian phylogenetic analyses; Relaxed molecular clock; Trachypithecus phayrei

Received: 06 September 2012; Accepted: 15 November 2012
Foundation item: This work was supported by the Fund of State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (GREKF12-06)
* Corresponding author, E-mail: [email protected]
Science Press Volume 33 Issues E5−6

The genus Trachypithecus is the most diverse langur taxon, having a broad distribution including India, Sri Lanka, Bangladesh, southwestern China, and Southeast Asia (Groves, 2005; Wang et al, 1999). It is phylogenetically embedded within the Family Cercopithecidae and closely related to Semnopithecus (Perelman et al, 2011; Wang et al, 2012). Groves (2005) assigned full species status to 17 taxa, which he clustered into 5 species groups. While 16 of these species have been assessed in other phylogenetic contexts (Bleisch et al, 2008; Geissmann et al, 2004; Karanth et al, 2008; Liedigk et al, 2009; Nadler et al, 2003; Wang et al, 2012; Zhang & Shi, 1993), which has greatly improved our understanding of Trachypithecus evolution, there has been no dedicated molecular analysis of the complex species structure of Trachypithecus. Of particular concern is the disputed taxonomy of the endangered Phayre’s leaf monkey (T. phayrei). Three putative subspecies inhabit Bangladesh, northeastern India, Myanmar, southwestern China, Thailand, Laos, and northern Vietnam (Bleisch et al, 2008). Recent genetic analyses have demonstrated that T. p. crepuscula and T. p. phayrei do not form a monophyletic clade. T. p. phayrei from India is the sister taxon of T. barbei and T. obscurus, but T. p. crepuscula from Vietnam represents a distinct lineage, being a close relative of the T. francoisi species group (Karanth et al, 2008; Nadler et al, 2003). Accordingly, Nadler et al (2003) and Liedigk et al (2009) suggested that T. p. crepuscula from Vietnam should be given full species status. However, a lack of genetic information from T. p. crepuscula and T. p. shanicus (southwestern China) has prevented agreement on the purported taxonomy (Bleisch et al, 2008).

In this study, we sampled T. p. crepuscula and T. p. shanicus from southwestern China (Yunnan) and northern Myanmar and sequenced both mitochondrial and nuclear genes to resolve the phylogenetic relationships of T. phayrei. The sequences of T. p. shanicus are presented for the first time. By including langur sequences from GenBank in our analysis, we test the validity of the species groups proposed by Groves (2005). Furthermore, we estimated divergence times simultaneously with phylogenetic reconstruction to test the diversification pattern of Trachypithecus through time (Drummond et al, 2012). This effort will provide further evidence to clarify the nebulous classification of T. phayrei and the putative Trachypithecus species-group divisions.

MATERIALS AND METHODS

Sampling and lab work

Four samples of T. p. crepuscula and five of T. p. shanicus were collected from northern Myanmar, Yunnan, and Guangxi, China. One sample of T. germaini was obtained from the Kunming Cell Bank of the Chinese Academy of Sciences (Table 1). These samples are tissues from deceased individuals, museum skins, or noninvasive feces collections. All animal samples were obtained following the regulations of China for the implementation of the protection of terrestrial wild animals (State Council Decree [1992] No. 13) and approved by the Ethics Committee of Kunming Institute of Zoology, Chinese Academy of Sciences, China.

Total DNA was extracted from tissues and museum skins using the phenol/proteinase K/sodium dodecyl sulphate method (Sambrook & Russell, 2001). Four fecal samples were collected from the Gaoligong Mountain National Nature Reserve in 2008. Samples were stored using the “two-step” storage procedure (Nsubuga et al, 2004). We extracted total DNA from fecal samples using the 2CTAB/PCI method (Vallet et al, 2008).

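Table 1 Samples and sequences used in this study (GenBank accession numbers are given for each gene; − indicates no sequence)

| Species | Collection code | Code | Sample locality | CYT B | PRM1 | LZM |
|---|---|---|---|---|---|---|
| Trachypithecus phayrei crepuscula | Myanmar01 | NM | Northern Myanmar | KC285863 | KC285882 | KC285875 |
| T. p. crepuscula | YN_WLS0301001 | WL | Mt. Wuliang, Yunnan, China | KC285866 | KC285883 | KC285876 |
| T. p. crepuscula | YN_XSBN1 | BN1 | Xishuangbanna, Yunnan, China | KC285864 | KC285884 | KC285873 |
| T. p. crepuscula | GX_KCB89 | BN2 | Xishuangbanna, Yunnan, China | KC285865 | KC285885 | KC285874 |
| T. p. shanicus | YN_GLG0912002 | G1 | Mt. Gaoligong, Yunnan, China | KC285867 | KC285886 | KC285877 |
| T. p. shanicus | YN_GLG08002 | G2 | Mt. Gaoligong, Yunnan, China | KC285868 | KC285887 | − |
| T. p. shanicus | YN_GLG08003 | G3 | Mt. Gaoligong, Yunnan, China | KC285869 | − | KC285878 |
| T. p. shanicus | YN_GLG08004 | G4 | Mt. Gaoligong, Yunnan, China | KC285870 | − | KC285879 |
| T. p. shanicus | YN_GLG08005 | G5 | Mt. Gaoligong, Yunnan, China | KC285871 | − | KC285880 |
| T. germaini | KCB92007 | V2 | Northern Vietnam | KC285872 | KC285888 | KC285881 |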
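As a quick consistency check on Table 1, the accession numbers can be tallied per gene and tested for forming one contiguous GenBank block. The following is a minimal sketch, not part of the original study: the dictionary is transcribed by hand from Table 1, and the names `TABLE1`, `sequences_per_gene` and `accession_numbers` are hypothetical helpers introduced only for illustration.

```python
# Accession numbers transcribed from Table 1; None marks a missing sequence.
# Keys are the sample codes from the "Code" column of the table.
TABLE1 = {
    "NM":  {"CYT B": "KC285863", "PRM1": "KC285882", "LZM": "KC285875"},
    "WL":  {"CYT B": "KC285866", "PRM1": "KC285883", "LZM": "KC285876"},
    "BN1": {"CYT B": "KC285864", "PRM1": "KC285884", "LZM": "KC285873"},
    "BN2": {"CYT B": "KC285865", "PRM1": "KC285885", "LZM": "KC285874"},
    "G1":  {"CYT B": "KC285867", "PRM1": "KC285886", "LZM": "KC285877"},
    "G2":  {"CYT B": "KC285868", "PRM1": "KC285887", "LZM": None},
    "G3":  {"CYT B": "KC285869", "PRM1": None, "LZM": "KC285878"},
    "G4":  {"CYT B": "KC285870", "PRM1": None, "LZM": "KC285879"},
    "G5":  {"CYT B": "KC285871", "PRM1": None, "LZM": "KC285880"},
    "V2":  {"CYT B": "KC285872", "PRM1": "KC285888", "LZM": "KC285881"},
}


def sequences_per_gene(table):
    """Count how many samples have a deposited sequence for each gene."""
    counts = {}
    for row in table.values():
        for gene, accession in row.items():
            if accession is not None:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


def accession_numbers(table):
    """Return the numeric parts of all accessions (prefix 'KC' removed), sorted."""
    return sorted(
        int(accession[2:])
        for row in table.values()
        for accession in row.values()
        if accession is not None
    )


counts = sequences_per_gene(TABLE1)
numbers = accession_numbers(TABLE1)
# The block is contiguous if every integer between the smallest and largest
# accession number occurs exactly once.
is_contiguous = numbers == list(range(numbers[0], numbers[-1] + 1))

print(counts)
print(f"KC{numbers[0]} to KC{numbers[-1]}, contiguous: {is_contiguous}")
```

Under this transcription the table holds 26 newly generated sequences (10 CYT B, 7 PRM1, 9 LZM), and they fill a single accession block with no gaps or duplicates.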

We sequenced the mitochondrial CYT B and nuclear protamine P1 (PRM1) and lysozyme (LZM) genes in this study. CYT B has been widely used in mammalian phylogenetic and phylogeographic studies (Bradley & Baker, 2001). This gene was amplified using L14724_hk6 (CCGTGATATGAAAAACCATCGTTG) and H15915 (Irwin et al, 1991). L14724_hk6 was modified from the universal primer L14724 (Irwin et al, 1991) to avoid amplification of nuclear pseudogenes of CYT B (Karanth, 2008). PRM1 and LZM have been used previously in the phylogenetic study of other primates, including langurs (Karanth et al, 2008). These two genes were amplified following Karanth et al (2008). All PCR products were purified using the UNIQ-10 spin column DNA gel extraction kit (Sangon, Shanghai, China). Purified products were directly sequenced with PCR primers using the BigDye Terminator Cycle kit v3.1 on an ABI 3730xl sequencer by Tiangen Biotech Co, Ltd. (Beijing, China).

Phylogenetic and molecular dating analyses

Nucleotide sequences were edited using SeqMan and EditSeq in the DNASTAR package v7.1 (DNASTAR, Inc., USA) and aligned with MUSCLE (Edgar, 2004). CYT B and coding regions of the two nuclear genes were translated into amino acids to check for premature stop codons. All these sequences were submitted to GenBank (Accession numbers: KC285863−KC285888). 1 141 bp of CYT B, 387 bp of PRM1, and 135 bp of LZM were used in further analyses. Additional sequences of Catarrhini were downloaded from GenBank and added to the alignments (Supplementary Table 1, available online). In total, 70 CYT B, 35 PRM1 and 32 LZM sequences representing 36 species including all three subspecies of Trachypithecus phayrei were used in this study.

We performed partitioned Bayesian analyses (Brandley et al, 2005) to reconstruct the phylogenetic relationships using BEAST v1.7.2 (Drummond et al, 2012). The analyses were performed on both the CYT B data set and the three-gene combined data set. Missing data were coded as “?”. The model of DNA evolution was determined by the Bayesian Information Criterion (BIC) (Schwarz, 1978) in jModelTest v0.1.1 (Guindon & Gascuel, 2003; Posada, 2008). Because there are missing data in the CYT B alignment, this gene was divided into two partitions (partition 1: 1−423 bp, 826−1 141 bp; partition 2: 424−825 bp). Because partition 2 contains no missing data, it was used for the estimation of the time of molecular divergence (see below). The evolutionary models for each codon position of partition 1 and partition 2 of CYT B, for LZM, and for the coding and non-coding regions of PRM1 were determined separately. We did not calculate the evolutionary model for each codon position of LZM or the coding regions of PRM1 to avoid error caused by overpartitioning (Brown & Lemmon, 2007). In jModelTest, three substitution schemes were selected and a proportion of invariant sites was not included in the model selection, following He et al (2012). The partition information and models selected are shown in Figure 1.
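The CYT B partition scheme described above can be checked arithmetically. The sketch below is illustrative only and is not the authors' workflow: it encodes the two partitions as 1-based inclusive site ranges, confirms that together they cover the 1 141 bp alignment exactly once, and reports each partition's length. The helper names are hypothetical, and `codon_position` assumes that alignment site 1 is the first position of a codon, which the text does not state explicitly.

```python
ALIGNMENT_LENGTH = 1141  # bp of CYT B used in the analyses

# 1-based, inclusive site ranges as given in the text.
PARTITIONS = {
    "partition 1": [(1, 423), (826, 1141)],
    "partition 2": [(424, 825)],
}


def partition_length(ranges):
    """Total number of sites in a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def covered_sites(partitions):
    """Flatten all partitions into a list of individual site indices."""
    sites = []
    for ranges in partitions.values():
        for start, end in ranges:
            sites.extend(range(start, end + 1))
    return sites


def codon_position(site):
    """Codon position (1, 2 or 3) of a site, ASSUMING site 1 is codon position 1."""
    return (site - 1) % 3 + 1


lengths = {name: partition_length(r) for name, r in PARTITIONS.items()}
# Exact cover: every site from 1 to 1141 appears once, with no overlaps or gaps.
is_exact_cover = sorted(covered_sites(PARTITIONS)) == list(
    range(1, ALIGNMENT_LENGTH + 1)
)
# Codon position of the first site of each range.
range_starts = {
    start: codon_position(start)
    for ranges in PARTITIONS.values()
    for start, _ in ranges
}

print(lengths)
print("exact cover:", is_exact_cover)
print("codon position at each range start:", range_starts)
```

Partition 2 comes out at 402 bp, matching the length later quoted for the dating analysis, and under the stated assumption every range begins at a first codon position, so per-codon-position models can be applied to each partition without splitting codons.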

Figure 1 Models of molecular evolution used for each gene/partition for phylogenetic reconstruction. *: partition was used for divergence time estimation.

We performed an MCMC search of twenty million generations, sampling every 2 000 generations. We constrained the monophyly of Semnopithecus + Trachypithecus according to previous analyses (Osterholz et al, 2008; Perelman et al, 2011; Wang et al, 2012). All analyses were repeated twice. Tracer v1.5 was used to confirm that all analyses reached the same posterior distributions and to assess convergence by calculating effective sample sizes (ESSs) (Rambaut & Drummond, 2009). Posterior probabilities (PP) ≥ 0.95 were considered as strongly supported (Huelsenbeck & Rannala, 2004).

Molecular divergence times were estimated simultaneously with the multi-gene phylogenetic reconstruction. Because missing data will mislead estimates of branch lengths and thus affect the result of divergence time estimation (Lemmon et al, 2009), our molecular dating analyses were limited to partition 2 of the CYT B alignment (402 bp, Figure 1). We used a “relaxed” molecular clock approach (Drummond et al, 2006) and used fossil records and secondary calibration for dating. Six calibration points were applied as lognormal or exponential distributions (Ho, 2007). Saadanius hijazensis, dated to 29−28 Ma, places the hominoid-cercopithecoid divergence between 29−28 and 24 Ma (Zalmout et al, 2010), so we set the prior as lognormal with offset=24 and SD=0.98. The oldest known hominoid (Morotopithecus bishopi) is dated to 20.6 Ma (Gebo et al, 1997). We set the prior as lognormal. The earliest sample age was set to 20.6 Ma (offset=20.6) and the older 95% CI to the beginning of the Miocene (24.1 Ma, SD=0.76). The Homo-Pan divergence occurred between 7.2−5.6 Ma (Aiello & Collard, 2001; Senut et al, 2001), so we set the prior as lognormal with offset=5.6 and SD=0.29. The oldest Colobinae was dated to 9.8 Ma (Benefit & Pickford, 1986; Nakatsukasa et al, 2010), but the molecular divergence estimation indicated a much more ancient divergence (Chatterjee et al, 2009), so we set the prior as exponential. The earliest possible age was set to 9.8 Ma, and the older 95% CI to 20 Ma (mean=3.4). The oldest African Colobinae was dated to 6.1 Ma (Gilbert et al, 2010). We set the prior as lognormal, the earliest possible age to 6.1 Ma, and the 95% CI to the beginning of the Late Miocene (11.6 Ma, SD=1.04). The most recent common ancestor (MRCA) of Semnopithecus and Trachypithecus was estimated to have existed at about 4.05 Ma (95% CI=2.93−5.36) (Perelman et al, 2011), so we set the prior as lognormal (mean=4.03, SD=0.18). Fossils of Trachypithecus have been found in Southeast Asia and southern China, most of which date to the Pleistocene (Jablonski, 2002). The oldest known fossil is T. auratus sangiranensis with an age of 1.9±0.5 Ma (Jablonski & Tyler, 1999); however, the site stratigraphy remains in doubt (Larick et al, 2000). Other fossils are not suitable for calibration because of taxonomic uncertainty.

The minimum-spanning tree of PRM1 was derived from Network v4.5 using the median-joining approach (Bandelt et al, 1999).

RESULTS

Phylogenetic relationships

The combined phylogenetic tree is shown in Figure 2, and the mitochondrial phylogenetic topology is identical to that based on multiple genes. Trachypithecus taxa fell into two clades. T. pileatus and T. geei from India, T. vetulus, and T. johnii cluster with Semnopithecus (PP=1.0), which is consistent with previous studies (Chatterjee et al, 2009; Karanth et al, 2008; Osterholz et al, 2008; Wang et al, 2012), suggesting that T. vetulus and T. johnii should be merged into the genus Semnopithecus. The rest of the Trachypithecus taxa form a monophyletic clade (PP=1.0) with five strongly supported lineages recognized (PP=1.0). Our results strongly support that T. p. phayrei and T. p. shanicus are sister-taxa (PP=1.0) and are close relatives of T. barbei and T. obscurus (PP=1.0). T. p. crepuscula from southwestern China and Vietnam form a distinct lineage (PP=1.0) as the sister taxon to the species group T. obscurus + T. cristatus (PP=1.0). T. geei and T. pileatus from Bhutan are strongly supported as the sister taxa of T. shortridgei (PP=1.0).

Figure 2 Phylogenetic relationships of Catarrhini based on combined mitochondrial and nuclear genes and chronogram of Cercopithecidae based on partial CYT B gene. Node numbers above branches indicate mean divergence time; numbers below branches indicate Bayesian posterior probabilities; node bars indicate the 95% CI for the clade age.

LZM shows low variation in Trachypithecus and Semnopithecus. Only two mutations that lead to amino acid substitutions were observed (Table 2). As shown previously (Karanth et al, 2008), our results indicate that the T. pileatus and T. geei from India, which are members of the genus Semnopithecus in the phylogenetic tree (Figure 2), have the same LZM amino acid sequences as the species in the Trachypithecus clade, indicating possible hybridization.

Nucleotide substitutions and insertions/deletions (indels) were found in the PRM1 alignment in both coding and noncoding regions. In the Network tree, the T. geei from India and T. francoisi share the central haplotype, which differs from other haplotypes by two to nine mutations (Figure 3). These mutations also lead to amino acid substitutions/deletions. T. p. shanicus shares the same haplotype with T. germaini and T. obscurus and differs from the haplotype shared by T. p. phayrei and T. p. crepuscula by one amino acid deletion.

Table 2 The first ten amino acids of the lysozyme protein of Presbytini species

| Species | Amino acids 1−10 |
|---|---|
| Nasalis larvatus | M K A L I I L G L V |
| Pygathrix nemaeus | M K A L I I L G L V |
| Semnopithecus entellus | M K A L T I L G L V |
| Trachypithecus johnii | M K A L T I L G L V |
| T. vetulus | M K A L T I L G L V |
| T. pileatus (India) | M R A L I I L G L V |
| T. geei (India) | M R A L I I L G L V |
| T. francoisi | M R A L I I L G L V |
| T. phayrei phayrei | M R A L I I L G L V |
| T. phayrei shanicus | M R A L I I L G L V |
| T. phayrei crepuscula | M R A L I I L G L V |
| T. obscurus | M R A L I I L G L V |
| T. germaini | M R A L I I L G L V |
| T. cristatus | M R A L I I L G L V |
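The pattern in Table 2 can be made explicit by grouping the species by their ten-residue sequence and listing the variable alignment columns. This is a small illustrative sketch using only the sequences printed in Table 2; the function names are hypothetical and nothing here reproduces the authors' actual analysis pipeline.

```python
from collections import defaultdict

# First ten lysozyme residues, transcribed from Table 2 (spaces removed).
LZM_FIRST10 = {
    "Nasalis larvatus": "MKALIILGLV",
    "Pygathrix nemaeus": "MKALIILGLV",
    "Semnopithecus entellus": "MKALTILGLV",
    "Trachypithecus johnii": "MKALTILGLV",
    "T. vetulus": "MKALTILGLV",
    "T. pileatus (India)": "MRALIILGLV",
    "T. geei (India)": "MRALIILGLV",
    "T. francoisi": "MRALIILGLV",
    "T. phayrei phayrei": "MRALIILGLV",
    "T. phayrei shanicus": "MRALIILGLV",
    "T. phayrei crepuscula": "MRALIILGLV",
    "T. obscurus": "MRALIILGLV",
    "T. germaini": "MRALIILGLV",
    "T. cristatus": "MRALIILGLV",
}


def group_by_sequence(sequences):
    """Map each distinct sequence to the list of species that carry it."""
    groups = defaultdict(list)
    for species, seq in sequences.items():
        groups[seq].append(species)
    return dict(groups)


def variable_positions(sequences):
    """Return {1-based position: sorted residues} for columns that vary."""
    columns = zip(*sequences.values())
    return {
        pos: sorted(set(column))
        for pos, column in enumerate(columns, start=1)
        if len(set(column)) > 1
    }


groups = group_by_sequence(LZM_FIRST10)
variable = variable_positions(LZM_FIRST10)

for seq, species in groups.items():
    print(seq, "->", ", ".join(species))
print("variable positions:", variable)
```

Only positions 2 (K/R) and 5 (I/T) vary, giving three sequence types. The Indian T. pileatus and T. geei fall in the same group as the Trachypithecus clade species rather than with Semnopithecus entellus, T. johnii and T. vetulus, which is the observation the text interprets as possible hybridization.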

Figure 3 The median-joining network of the genus Trachypithecus based on partial PRM1. Numbers by the branches indicate the number of mutations.

Divergence time estimation

The estimates of divergence time including 95% credible intervals are shown in Figure 2. The Trachypithecus clade diverged from outgroups at about 5.25 Ma (7.01−3.77 Ma). Diversifications within this clade started at about 2.35 Ma (3.43−1.46 Ma) and occurred throughout the Pleistocene. T. p. phayrei and T. p. shanicus diverged at about 0.37 Ma (0.17−0.63 Ma). The MRCA of T. p. crepuscula occurred at about 0.41 Ma (0.16−0.77 Ma).

DISCUSSION

As has been discussed previously, the conflicting phylogenetic information observed in Trachypithecus is likely related to hybridization, incomplete lineage sorting, and ancestral polymorphism (Karanth et al, 2008; Liedigk et al, 2009; Wang et al, 2012). Pleistocene glaciations may have caused large scale deforestation throughout southeast Asia, isolating formerly widespread species in refugia, and the subsequent recolonization of these territories during interglacial periods may have allowed for the hybridization of these isolates (Brandon-Jones, 1996; Meijaard & Groves, 2006). Consistent with the karyotype stability within the subfamily Colobinae (O'Brien et al, 2006), hybridization between different species of Trachypithecus and between Trachypithecus and closely related genera has been observed in captivity and could explain the paraphyly of T. pileatus and T. geei (Que et al, 2007; Schempp et al, 2008; Wangchuk, 2005; Wangchuk et al, 2008). Coupled with hybridization, the observed discrepancies between autosomal and mitochondrial genes might also have resulted from maternal philopatry and male dispersal from natal groups, common in living Trachypithecus and possibly the ancestral state of the genus, which could have produced these different patterns of genetic inheritance (Karanth et al, 2008).

A previous study including T. p. crepuscula from Vietnam and T. p. phayrei from India found Trachypithecus phayrei to be paraphyletic (Karanth et al, 2008). As a result, it remained unclear whether the subspecies T. p. crepuscula is itself monophyletic and whether T. p. shanicus is a close relative of either of the other two subspecies. For the first time, we found that T. p. crepuscula from Vietnam, southwestern China, and Myanmar are monophyletic and share a young MRCA at about 0.41 Ma, and that T. p. shanicus diverged from T. p. phayrei at about 0.37 Ma. However, T. p. shanicus is represented by a different and probably more ancient PRM1 haplotype than the other two subspecies. This might be the result of nuclear introgression from T. p. shanicus to the ancestor of T. p. crepuscula due to male dispersal and female philopatry. Additional taxon sampling of other species and multiple unlinked genes are needed for reconstructing these robust evolutionary histories.

Although the incongruences between nuclear and mitochondrial genes undoubtedly increase the difficulty of estimating the “real” species tree and taxonomy, our combined phylogenetic tree (most likely reflecting the mitochondrial relationships) supports the species-group division proposed by Groves (2005) except for T. barbei and T. p. crepuscula. T. barbei, which was included in the T. cristatus group, is strongly supported as a member of the T. obscurus group, which is congruent with a previous study and supported by morphological characters as well (Geissmann et al, 2004). Our results are consistent with T. p. crepuscula representing a distinct maternal genealogy with nuclear introgression from T. p. shanicus, suggesting a possible species status, namely T. crepusculus, which could also be a new species group (Liedigk et al, 2009).

Our study largely supports the proposed species-group division of Groves (2005) except for two taxa (T. barbei and T. p. crepuscula), both of which have been discussed in previous studies. We support the sister-relationship of T. p. shanicus and T. p. phayrei, whereas T. p. crepuscula may represent a distinct species throughout its distribution range, although hybridization may have occurred between it and T. p. phayrei. The incongruences between nuclear and mitochondrial genes imply complicated evolutionary histories of the three subspecies, and the classification would benefit from re-examination using multiple nuclear genes.

Acknowledgements: We thank Professor Ming LI from The Institute of Zoology, Chinese Academy of Sciences, and Dr. Zhijin LIU and Dr. Zongfei CHANG from his lab for technical support. We also thank Professor Huaisen AI, the director of Baoshan Administrative Bureau, Gaoligong National Nature Reserve, for facilitating our field work at Mt. Gaoligong.

References

Aiello LC, Collard M. 2001. Palaeoanthropology: Our newest oldest ancestor? Nature, 410(6828): 526-527.
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol, 16(1): 37-48.
Benefit BR, Pickford M. 1986. Miocene fossil cercopithecoids from Kenya. Am J Phys Anthropol, 69(4): 441-464.
Bleisch B, Brockelman W, Timmins RJ, Nadler T, Thun S, Das J, Yongcheng L. 2008. Trachypithecus phayrei. IUCN 2012. IUCN Red List of Threatened Species. Version 2012.1. [www.iucnredlist.org].
Bradley RD, Baker RJ. 2001. A test of the genetic species concept: Cytochrome-b sequences and mammals. J Mammal, 82(4): 960-973.
Brandley MC, Schmitz A, Reeder TW. 2005. Partitioned bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards. Syst Biol, 54(3): 373-390.
Brandon-Jones D. 1996. The Asian Colobinae (Mammalia: Cercopithecidae) as indicators of quaternary climatic change. Biol J Linn Soc, 59(3): 327-350.
Brown JM, Lemmon AR. 2007. The importance of data partitioning and the utility of Bayes factors in Bayesian phylogenetics. Syst Biol, 56(4): 643-655.
Chatterjee HJ, Ho SYW, Barnes I, Groves C. 2009. Estimating the phylogeny and divergence times of primates using a supermatrix approach. BMC Evol Biol, 9(1): 259.
Drummond AJ, Ho SYW, Phillips MJ, Rambaut A. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol, 4(5): 699-710.
Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol, 29(8): 1969-1973.
Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res, 32(5): 1792-1797.
Gebo DL, MacLatchy L, Kityo R, Deino A, Kingston J, Pilbeam D. 1997. A hominoid genus from the early Miocene of Uganda. Science, 276(5311): 401-404.
Geissmann T, Groves CP, Roos C. 2004. The Tenasserim lutung, Trachypithecus barbei (Blyth, 1847) (Primates: Cercopithecidae): Description of a live specimen, and a reassessment of phylogenetic affinities, taxonomic history, and distribution. Contrib Zool, 73(4): 271-282.
Gilbert CC, Goble ED, Hill A. 2010. Miocene cercopithecoidea from the Tugen Hills, Kenya. J Hum Evol, 59(5): 465-483.
Groves CP. 2005. Order primates // Wilson D E, Reeder D M. Mammal Species of the World: A Taxonomic and Geographic Reference. Baltimore: Johns Hopkins University Press, 111-184.
Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol, 52(5): 696-704.
He K, Chen JH, Gould GC, Yamaguchi N, Ai HS, Wang YX, Zhang YP, Jiang XL. 2012. An estimation of erinaceidae phylogeny: A combined analysis approach. PLoS ONE, 7(6): e39304.
Ho SYW. 2007. Calibrating molecular estimates of substitution rates and divergence times in birds. J Avian Biol, 38(4): 409-414.
Huelsenbeck J, Rannala B. 2004. Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Syst Biol, 53(6): 904-913.
Irwin DM, Kocher TD, Wilson AC. 1991. Evolution of the cytochrome b gene of mammals. J Mol Evol, 32(2): 128-144.
Jablonski NG. 2002. Fossil Old World monkeys: the Late Neogene radiation // Hartwig WC. The Primate Fossil Record. Cambridge: Cambridge University Press, 255-299.
Jablonski NG, Tyler DE. 1999. Trachypithecus auratus sangiranensis, a new fossil monkey from Sangiran, central Java, Indonesia. Int J Primatol, 20(3): 319-326.
Karanth KP. 2008. Primate numts and reticulate evolution of capped and golden leaf monkeys (Primates: Colobinae). J Biosci, 33(5): 761-770.
Karanth KP, Singh L, Collura RV, Stewart CB. 2008. Molecular phylogeny and biogeography of langurs and leaf monkeys of South Asia (Primates: Colobinae). Mol Phylogenet Evol, 46(2): 683-694.
Larick R, Ciochon RL, Zaim Y, Sudijono, Suminto, Rizal Y, Aziz F. 2000. Lithostratigraphic context for Kln-1993.05-SNJ, a fossil colobine maxilla from Jokotingkir, Sangiran dome. Int J Primatol, 21(4): 731-759.
Lemmon AR, Brown JM, Stanger-Hall K, Lemmon EM. 2009. The effect of ambiguous data on phylogenetic estimates obtained by Maximum Likelihood and Bayesian Inference. Syst Biol, 58(1): 130-145.
Liedigk R, Van Ngoc Thinh TN, Walter L, Roos C. 2009. Evolutionary history and phylogenetic position of the Indochinese grey langur (Trachypithecus crepusculus). Vietn J Primatol, 3: 1-8.
Meijaard E, Groves CP. 2006. The geography of mammals and rivers in mainland Southeast Asia // Lehman SM, Fleagle JG. Primate Biogeography. New York: Springer, 305-329.
Nadler T, Momberg F, Dang NX, Lormee N. 2003. Vietnam Primate Conservation Status Review 2002. Part 2: Leaf Monkeys. Hanoi: Fauna and Flora International and Frankfurt Zoological Society, 145-164.
Nakatsukasa M, Mbua E, Yoshihiro S, Sakai T, Nakaya H, Kunimatsu Y. 2010. Earliest colobine skeletons from Nakali, Kenya. Am J Phys Anthropol, 143(3): 365-382.
Nsubuga AM, Robbins MM, Roeder AD, Morin PA, Boesch C, Vigilant L. 2004. Factors affecting the amount of genomic DNA extracted from ape faeces and the identification of an improved sample storage method. Mol Ecol, 13(7): 2089-2094.
O'Brien SJ, Menninger JC, Nash WG. 2006. Atlas of Mammalian Chromosomes. New Jersey: John Wiley and Sons, 714.
Osterholz M, Walter L, Roos C. 2008. Phylogenetic position of the langur genera Semnopithecus and Trachypithecus among Asian colobines, and genus affiliations of their species groups. BMC Evol Biol, 8(1): 58.
Perelman P, Johnson WE, Roos C, Seuánez HN, Horvath JE, Moreira MA, Kessing B, Pontius J, Roelke M, Rumpler Y, Schneider MPC, Silva A, O'Brien SJ, Pecon-Slattery J. 2011. A molecular phylogeny of living primates. PLoS Genet, 7(3): e1001342.
Posada D. 2008. jModelTest: phylogenetic model averaging. Mol Biol Evol, 25(7): 1253-1256.
Que TC, Hu YL, Zhang CC, Huang CM, Meng XJ. 2007. Observations on the Hybrid F of Trachypithecus leucocephalus and T. francoisi and its offspring. Zool Res, 28(3): 225-230.
Rambaut A, Drummond A. 2009. Tracer v1.5 [http://tree.bio.ed.ac.uk/software/tracer/].
Sambrook J, Russell D. 2001. Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory Press, 2368.
Schempp W, Münch C, Roos C, Nadler T. 2008. Chromosomal and molecular studies of a hybrid between red-shanked douc langur (Pygathrix nemaeus) and Hatinh langur (Trachypithecus laotum hatinhensis). Vietnam J Primatol, 1: 55-62.
Schwarz G. 1978. Estimating the dimension of a model. Ann Statist, 6(2): 461-464.
Senut B, Pickford M, Gommery D, Mein P, Cheboi K, Coppens Y. 2001. First hominid from the Miocene (Lukeino Formation, Kenya). Comptes Rendus De L Academie Des Sciences Serie Ii Fascicule a-Sciences De La Terre Et Des Planetes, 332(2): 137-144.
Vallet D, Petit E, Gatti S, Levréro F, Ménard N. 2008. A new 2CTAB/PCI method improves DNA amplification success from faeces of Mediterranean (Barbary macaques) and tropical (lowland gorillas) primates. Conserv Genet, 9(3): 677-680.
Wang XP, Yu L, Roos C, Ting N, Chen CP, Wang J, Zhang YP. 2012. Phylogenetic relationships among the colobine monkeys revisited: New insights from analyses of complete mt genomes and 44 nuclear non-coding markers. PLoS ONE, 7(4): e36274.
Wang YX, Jiang XL, Feng Q. 1999. Taxonomy, status and conservation of leaf monkeys in China. Zool Res, 20(4): 206-315 (in Chinese).
Wangchuk T. 2005. The Evolution, Phylogeography, and Conservation of the Golden Langur (Trachypithecus geei) in Bhutan. Ph.D. thesis. Department of Biology, University of Maryland.
Wangchuk T, Inouye DW, Hare MP. 2008. The emergence of an endangered species: Evolution and phylogeny of the Trachypithecus geei of Bhutan. Int J Primatol, 29(3): 565-582.
Zalmout IS, Sanders WJ, MacLatchy LM, Gunnell GF, Al-Mufarreh YA, Ali MA, Nasser AAH, Al-Masari AM, Al-Sobhi SA, Nadhra AO, Matari AH, Wilson JA, Gingerich PD. 2010. New Oligocene primate from Saudi Arabia and the divergence of apes and Old World monkeys. Nature, 466(7304): 360-365.
Zhang YP, Shi LM. 1993. Mitochondrial DNA restriction fragment length polymorphism in Presbytis phayrei and P. francoisi. Zool Res, 14(3): 263-269 (in Chinese).

Zoological Research www.zoores.ac.cn Zoological Research 33 (E5−6): E111−E120 doi: 10.3724/SP.J.1141.2012.E05-06E111

Complete mitogenome of the Painted Jezebel, Delias hyparete Linnaeus (Lepidoptera: Pieridae) and its comparison with other butterfly species

Qinghui SHI 1, Jing XIA 1, Xiaoyan SUN 2, Jiasheng HAO 1,*, Qun YANG 2,*

1. College of Life Sciences, Anhui Normal University, Wuhu 241000, China; 2. Nanjing Institute of Palaeontology and Stratigraphy, the Chinese Academy of Sciences, Nanjing 210008, China

Abstract: In the present study, we report the first complete mitochondrial genome (mitogenome) of the Painted Jezebel, Delias hyparete. The mitogenome is 15 186 bp in length and contains the typical set of 37 genes (13 protein-coding genes (PCGs), 2 ribosomal RNAs and 22 transfer RNAs) plus a non-coding A+T-rich region. All protein-coding genes are initiated by ATN codons, except for COI, which is tentatively designated as starting with the CGA codon, as observed in other butterfly species. Nine PCGs harbor the complete termination codon TAA or TAG, while the COI, COII, ND5 and ND4 genes end with a single T residue. All 22 tRNA genes show the typical clover-leaf structure, with the exception of tRNASer(AGN), in which the dihydrouridine (DHU) stem is replaced by a simple loop. Thirteen intergenic spacers totaling 153 bp and 13 overlapping regions totaling 46 bp are scattered throughout the genome. The 377 bp A+T-rich region of D. hyparete contains no large repetitive sequences, but harbors several features characteristic of lepidopteran insects, including the motif ATAGA followed by an 18 bp poly-T stretch, a microsatellite-like (AT)5 element preceded by the ATTTA motif, and a 10 bp polyA-like stretch (AAAAATAAAA) immediately upstream of tRNAMet.

Keywords: Lepidoptera; Pieridae; Delias hyparete; Mitochondrial genome

Received: 28 June 2012; Accepted: 24 October 2012
Foundation items: This work was supported by the National Science Foundation of China (Grant No.41172004), the CAS/SAFEA International Partnership Program for Creative Research Teams, Chinese Academy of Sciences (Grant No. KZCX22YW2JC104), and the Opening Funds from the State Key Laboratory of Palaeobiology and Stratigraphy, Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences (Grant No. 104143)
*Corresponding authors, E-mails: [email protected]; [email protected]
Science Press Volume 33 Issues E5−6

The insect mitogenome is a circular, double-stranded molecule approximately 14–20 kb in size. It typically comprises 37 genes (13 PCGs, 22 tRNA genes and 2 rRNA genes) and a non-coding A+T-rich region harboring the initiation sites for transcription and replication (Wolstenholme, 1992; Boore, 1999; Taanman, 1999). In recent decades, the mitogenome has been widely used in studies of phylogenetics, comparative and evolutionary genomics, population genetics and molecular evolution (Ballard & Whitlock, 2004; Simonsen et al, 2006) because, compared with the nuclear genome, it is smaller, evolves faster, is maternally inherited and undergoes little recombination (Vigilant et al, 1991; Stoneking & Soodyall, 1996). As of August 2012, complete or nearly complete mitochondrial genomes had been determined for 376 insect species. Of these, 25 are from butterfly species and only 3 are from Pieridae species (Artogeia melete, Pieris rapae and Aporia crataegi).

Found commonly in tropical areas of south China, India and south-east Asia (Chou, 1998), the Painted Jezebel, Delias hyparete Linnaeus, is an attractive and common garden butterfly of the genus Delias, subfamily Pierinae, family Pieridae. Like many other Pieridae species, Delias hyparete is medium sized, and it is not easily mistaken for other species owing to its bright wing colours and graceful flight pattern. Here, we report the complete mitogenome of Delias hyparete and compare its nucleotide organization and other major characteristics with those of other available butterfly species, with the aim of providing more molecular data for studies of this organism's molecular identification, population genetics and conservation biology, for comparative genomics with other butterfly species, and for other relevant areas.

MATERIALS AND METHODS

Sample collection and DNA extraction

Adult individuals of D. hyparete were collected from Jinghong, Yunnan, China in July 2006. After collection, the fresh tissues were immediately preserved in 100% ethanol and stored at −20 °C until DNA extraction. Total genomic DNA was isolated from the thoracic muscle of an adult individual using the proteinase K-SiO2 method, as described by Hao et al (2007).

Primer design, PCR amplification and DNA sequencing

Five short fragments (500−700 bp) of the COI, COIII, ND4, Cytb and 16S rRNA genes were amplified using insect universal primers (Simon et al, 1994; Caterino & Sperling, 1999; Simmons & Weller, 2001). Long PCR primers were then designed with Primer Premier 5.0 from conserved regions of the newly acquired short-fragment sequences (Table 1). All primers were synthesized by Sangon Biotechnology, Shanghai, China.

Short fragments were amplified under the following conditions: initial denaturation at 95 °C for 5 min; 35 cycles of denaturation at 95 °C for 50 s, annealing at 45−51 °C (depending on primer pairs) for 50 s and extension at 72 °C for 1 min 30 s; and a final extension at 72 °C for 10 min. Long PCR was performed using TaKaRa LA Taq polymerase with the following cycling parameters: initial denaturation at 95 °C for 5 min; 30 cycles of denaturation at 95 °C for 50 s, annealing at 46−55 °C (depending on primers) for 50 s and extension at 68 °C for 2 min 30 s during the first 15 cycles, with an additional 5 s per cycle during the last 15 cycles; and a final extension at 68 °C for 10 min.

PCR products were visualized by electrophoresis on a 1.2% agarose gel, purified using a 3S Spin PCR Product Purification Kit and sequenced directly with an ABI-377 automatic DNA sequencer. For each long PCR product, the full, double-stranded sequence was determined by primer walking.

Table 1 PCR and long-PCR primers used in this study

Primers    Upper primer sequence (5'→3')  Lower primer sequence (5'→3')  Tm (°C)  Size (kb)
COI*       GGTCAACAAATCATAAAGATATTG       TAAACTTCAGGGTGACCAAAAAAT       50.0     0.67
COI-COIII  GAACTGAACTGGGAAACC             AAGCCTGAAGGATAGAAA             50.5     2.96
COIII*     GTTCTGAGATTTCAGGTAA            TTACTAATAAATCATTTGC            49.6     0.65
COIII-ND4  ATTCCCTTTAATCCTTTCC            CGTTTAGGGAGACGAAGA             55.0     3.0
ND4*       TTATGAAAGAAATTCTTT             CAGATATAAGGGTTAATT             45.5     0.5
ND4-Cytb   ATTACTCGCAATAAACCG             TACAGCAAATCCTCCTCA             46.5     2.6
Cytb*      TATGTACTACCATGAGGACAAATAT      ATTACACCTCCTAATTTATTAGGAAT     47.0     0.5
Cytb-16S   ACTGAATCTGAGGAGGAT             CTTAGGGATAACAGCGTA             50.0     1.93
16S*       CTGTACAAAGGTAGCATA             GCCAAAACTTTAGTCTAG             49.0     0.5
16S-COI    ATTACGCTGTTATCCCTAT            TTCGTGGAAAGGCTATGTC            49.8     3.0
*: universal primer
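The long-PCR extension schedule given in this section (2 min 30 s per cycle for the first 15 cycles, then 5 s more per cycle for the last 15) can be written out as a short sketch. The helper name is hypothetical, and the sketch assumes the increment is cumulative (155 s in cycle 16, 160 s in cycle 17, and so on); the text does not say how the thermocycler applied it.

```python
def long_pcr_extension_times(base_s=150, fixed_cycles=15, ramp_cycles=15, step_s=5):
    """Per-cycle extension times in seconds (hypothetical helper).

    Assumes a cumulative auto-extension: each of the last `ramp_cycles`
    cycles is `step_s` seconds longer than the cycle before it.
    """
    times = [base_s] * fixed_cycles
    times += [base_s + step_s * (i + 1) for i in range(ramp_cycles)]
    return times


times = long_pcr_extension_times()
print(len(times), times[14], times[15], times[-1])
```

Under this reading the extension step grows from 150 s to 225 s by the final cycle; a non-cumulative reading (a flat 155 s for cycles 16 to 30) would need only a one-line change.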

Sequence analysis and annotation

The determined sequences were first checked with the NCBI Internet BLAST search function. The raw sequence files were proofread and assembled in BioEdit 7.0 (Hall, 1999) and ClustalX 1.8 (Thompson et al, 1997). Individual D. hyparete genes and the A+T-rich region were identified by aligning the sequences with homologous regions of other lepidopteran mitogenome sequences using Sequin 5.35. Sequence annotation was performed using the DNAStar package (DNAStar Inc., Madison, WI, USA) and the online BLAST tools available through the NCBI web site. The concatenated amino acid sequences of the 13 PCGs were obtained and analysed with ClustalX 1.8 (Thompson et al, 1997) and MEGA 5.0 (Tamura et al, 2011). Identification of tRNA genes was verified using the program tRNAscan-SE 1.21 (Lowe & Eddy, 1997). Putative tRNAs that could not be found by tRNAscan-SE were identified by sequence comparison with other lepidopteran tRNA genes. The tandem repeats in the A+T-rich region were predicted using the Tandem Repeats Finder available online (http://tandem.bu.edu/trf/trf.html) (Benson, 1999). All mitogenome sequence data have been deposited in GenBank under the accession number JX094279.

RESULTS

Genome organization and structure

The complete mitogenome of D. hyparete is 15 186 bp long, containing 13 protein-coding genes (ND1-6, ND4L, COI-III, Cytb, ATP6, ATP8), 2 ribosomal RNA genes (lrRNA and srRNA), 22 putative tRNA genes and one major non-coding A+T-rich region (control region) (Figure 1, Table 2). Like many other insect mitogenomes, its major strand codes for 23 genes (9 PCGs and 14 tRNAs) and the A+T-rich region, while the minor strand codes for the remaining 14 genes (4 PCGs, 8 tRNAs and 2 rRNA genes). Besides the non-coding A+T-rich region, 13 intergenic spacer sequences ranging from 1 to 46 bp (153 bp in total) and 13 overlapping regions from 1 to 8 bp (46 bp in total) are spread over the whole mitogenome (Table 2). The 7 overlapping nucleotides (ATGATAA) are located between ATP8 and ATP6. Moreover, the overlapped nucleotide sequences were also detected in all lepidopteran mitogenomes sequenced to date, and function in forming the structure of the hairpin loops for posttranslational modifications (Kim et al, 2006; Fenn et al, 2007).

Figure 1 Circular map of Delias hyparete mitochondrial genome
COI, COII and COIII refer to the cytochrome oxidase subunits; Cytb refers to cytochrome B; ATP6 and ATP8 refer to subunits 6 and 8 of F0 ATPase; ND1-6 refer to components of NADH dehydrogenase. tRNAs are denoted as one-letter symbols consistent with the IUPAC-IUB single letter amino acid codes. Gene names that are not underlined indicate a clockwise transcriptional direction, whereas underlined indicate a counter-clockwise transcriptional direction.

Table 2 Organization of the Delias hyparete mitochondrial genome

Gene            Direction  Position        Size (bp)  Intergenic length*  Start codon  Stop codon
tRNAMet         F          1−65            65         0
tRNAIle         F          63−127          65         −3
tRNAGln         R          125−193         69         −3
ND2             F          240−1 253       1 014      46                  ATT          TAA
tRNATrp         F          1 252−1 317     66         −2
tRNACys         R          1 310−1 375     66         −8
tRNATyr         R          1 375−1 439     65         −1
COI             F          1 442−2 972     1 531      2                   CGA          T
tRNALeu(UUR)    F          2 973−3 038     66         0
COII            F          3 040−3 718     679        1                   ATG          T
tRNALys         F          3 719−3 789     71         0
tRNAAsp         F          3 789−3 854     66         −1
ATP8            F          3 855−4 016     162        0                   ATT          TAA
ATP6            F          4 010−4 687     678        −7                  ATG          TAA
COIII           F          4 691−5 482     792        3                   ATG          TAA
tRNAGly         F          5 485−5 550     66         2
ND3             F          5 551−5 904     354        0                   ATT          TAG
tRNAAla         F          5 903−5 968     66         −2
tRNAArg         F          5 968−6 033     66         −1
tRNAAsn         F          6 038−6 102     65         4
tRNASer(AGN)    F          6 103−6 163     61         0
tRNAGlu         F          6 164−6 231     68         0
tRNAPhe         R          6 240−6 303     64         8
ND5             R          6 306−8 026     1 721      2                   ATT          T
tRNAHis         R          8 051−8 115     65         24
ND4             R          8 155−9 461     1 307      39                  ATG          T
ND4L            R          9 455−9 748     294        −7                  ATA          TAA
tRNAThr         F          9 741−9 804     64         −8
tRNAPro         R          9 805−9 869     65         0
ND6             F          9 872−10 396    525        2                   ATT          TAA
Cytb            F          10 396−11 544   1 149      −1                  ATG          TAA
tRNASer(UCN)    F          11 546−11 610   65         −2
ND1             R          11 627−12 562   936        16                  ATA          TAA
tRNALeu(CUN)    R          12 567−12 634   68         4
lrRNA           R          12 635−13 970   1 336      0
tRNAVal         R          13 971−14 035   65         0
srRNA           R          14 036−14 809   774        0
A+T-rich region            14 810−15 186   377        0
*In the column intergenic length, the positive number indicates interval base pairs between genes, while the negative number indicates the overlapping base pairs between genes.

Similar to other available butterfly mitogenomes, the nucleotide composition of the whole D. hyparete mitogenome is A+T biased (79.8%). This value is identical to that of Artogeia melete (79.8%), slightly higher than that of Pieris rapae (79.7%), and well within the range of butterfly species known to date, from 79.1% in Eumenis autonoe to 82.7% in Coreana raphaelis. Furthermore, the A+T content varies profoundly among genes and regions of the D. hyparete mitogenome; for example, 92.0% for the A+T-rich region, 85.1% for the srRNA gene, 80.4% for all the tRNA genes, 83.7% for the lrRNA gene and 78.4% for all the PCGs (Table 3, Table 4). The A+T- and G+C-skew of the whole D. hyparete mitogenome are −0.018 and −0.228, respectively (Table 4), indicating that more Ts and Cs than As and Gs are used. Both the A+T- and G+C-skew values are within the corresponding values of other lepidopterans, ranging from −0.048 (Parathyma sulpitia) to 0.059 (Bombyx mori) and from −0.318 (Ochrogaster lunifer) to −0.158 (Coreana raphaelis), respectively.

Protein-coding genes (PCGs)

The 13 PCGs are 11 142 bp in length. Their nucleotide frequency of the majority (J strand) is T

Table 3 Characteristics of butterfly species mitogenomes

                      Whole genome     PCG(b)                lrRNA            srRNA            A+T-rich region  GenBank
Taxon                 Size(bp) A+T(%)  No. codons(a) A+T(%)  Size(bp) A+T(%)  Size(bp) A+T(%)  Size(bp) A+T(%)  accession no.  References
Papilionidae
Parnassius bremeri    15 389   81.3    3 734         80.1    1 344    83.9    773      85.1    504      93.6    NC_014053      Kim et al, 2009
Agehana maraho        16 094   80.5    3 718         78.2    1 333    83.7    779      85.5    1 258    95.2    NC_014055      Wu et al, 2010
Sericinus montela     15 242   80.8    3 691         79.9    1 338    83.6    760      84.6    408      94.1    HQ259122       Ji et al, 2012
Teinopalpus aureus    15 242   79.8    3 720         78.3    1 320    82.4    781      85.6    395      93.1    NC_014398      Unpublished
Nymphalidae
Acraea issoria        15 245   79.7    3 717         78.1    1 331    83.9    788      83.7    430      96.0    NC_013604      Hu et al, 2010
Eumenis autonoe       15 489   79.1    3 728         76.8    1 335    83.7    775      85.2    678      94.6    GQ868707       Kim et al, 2010
Argyreus hyperbius    15 156   80.8    3 718         79.4    1 330    84.5    778      85.2    349      95.4    NC_015988      Wang et al, 2011
Calinaga davidis      15 267   80.4    3 737         78.8    1 337    83.8    773      85.9    389      92.0    HQ658143       Xia et al, 2011
Argynnis nerippe      15 140   80.9    3 729         79.7    1 321    84.5    773      84.9    329      95.7    NC_016419      Kim et al, 2011b
Apatura ilia          15 242   80.5    3 711         78.9    1 333    86.0    776      84.9    403      92.5    JF437925       Chen et al, 2012
Parathyma sulpitia    15 268   81.9    3 729         80.6    1 319    84.7    779      85.7    349      94.6    JQ347260       Tian et al, 2012
Sasakia charonda      15 244   79.9    3 695         78.2    1 323    84.4    775      85.0    380      91.8    NC_014224      Unpublished
Euploea mulciber      15 166   81.5    3 713         80.2    1 314    84.6    776      85.3    399      93.5    NC_016720      Unpublished
Libythea celtis       15 164   81.2    3 732         80.1    1 335    84.7    774      85.4    328      96.3    NC_016724      Unpublished
Lycaenidae
Coreana raphaelis     15 314   82.7    3 708         81.5    1 330    85.3    777      85.8    375      94.2    DQ102703       Kim et al, 2006
Protantigius superans 15 248   81.7    3 720         80.4    1 331    85.1    739      85.6    361      93.6    NC_016016      Kim et al, 2011a
Spindasis takanonis   15 349   82.3    3 728         81.1    1 333    85.6    777      84.7    371      94.6    NC_016018      Kim et al, 2011a
Pieridae
Artogeia melete       15 140   79.8    3 715         78.4    1 319    83.4    777      85.5    351      89.2    NC_010568      Hong et al, 2009
Pieris rapae          15 157   79.7    3 721         78.2    1 320    84.0    764      85.0    393      91.6    HM156697       Mao et al, 2010
Aporia crataegi       15 140   81.3    3 708         79.9    -        85.4    -        85.5    354      95.2    JN796473       Park et al, 2012
Delias hyparete       15 186   79.8    3 703         78.4    1 336    83.7    774      85.1    377      92.0    JX094279       This study
Hesperoidea
Ctenoptilum vasava    15 468   80.5    3 698         78.9    1 343    84.1    774      86.4    429      88.1    JF713818       Hao et al, 2012
a: Termination codons were excluded in total codon count; b: Protein coding genes.
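The ranges quoted in the text for genome size and whole-genome A+T content can be re-derived from Table 3. The sketch below is only an illustration: the tuple layout and variable names are ours, and the values are transcribed by hand from the table.

```python
# (species, genome size in bp, whole-genome A+T %) transcribed from Table 3
TABLE3 = [
    ("Parnassius bremeri", 15389, 81.3), ("Agehana maraho", 16094, 80.5),
    ("Sericinus montela", 15242, 80.8), ("Teinopalpus aureus", 15242, 79.8),
    ("Acraea issoria", 15245, 79.7), ("Eumenis autonoe", 15489, 79.1),
    ("Argyreus hyperbius", 15156, 80.8), ("Calinaga davidis", 15267, 80.4),
    ("Argynnis nerippe", 15140, 80.9), ("Apatura ilia", 15242, 80.5),
    ("Parathyma sulpitia", 15268, 81.9), ("Sasakia charonda", 15244, 79.9),
    ("Euploea mulciber", 15166, 81.5), ("Libythea celtis", 15164, 81.2),
    ("Coreana raphaelis", 15314, 82.7), ("Protantigius superans", 15248, 81.7),
    ("Spindasis takanonis", 15349, 82.3), ("Artogeia melete", 15140, 79.8),
    ("Pieris rapae", 15157, 79.7), ("Aporia crataegi", 15140, 81.3),
    ("Delias hyparete", 15186, 79.8), ("Ctenoptilum vasava", 15468, 80.5),
]

smallest_size = min(size for _, size, _ in TABLE3)
smallest = sorted(name for name, size, _ in TABLE3 if size == smallest_size)
largest = max(TABLE3, key=lambda row: row[1])
at_low = min(TABLE3, key=lambda row: row[2])
at_high = max(TABLE3, key=lambda row: row[2])

print(smallest_size, smallest)
print(largest[0], largest[1])
print(at_low[0], at_low[2], at_high[0], at_high[2])
```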

Table 4 Nucleotide composition and skewness in different regions of the Delias hyparete mitogenome

                                   Nucleotide composition (%)
Region                  Size (bp)  T     C     A     G     A+T   AT-skew  GC-skew
All gene                15 186     40.6  12.4  39.2  7.8   79.8  −0.018   −0.228
Genes on J-strand       6 884      43.7  13.4  33.2  9.7   76.9  −0.137   −0.160
Genes on N-strand       4 258      33.4  12.8  47.4  6.4   80.8  0.173    −0.333
PCGs                    11 142     39.8  13.1  38.6  8.4   78.4  −0.015   −0.219
First codon positions   3 703      38.0  10.0  41.5  10.0  79.5  0.044    0.000
Second codon positions  3 703      37.0  17.7  33.5  12.1  70.5  −0.050   −0.188
Third codon positions   3 703      44.0  11.8  40.8  3.3   84.8  −0.038   −0.563
tRNA                    1 447      39.1  11.3  41.3  8.3   80.4  0.027    −0.153
lrRNA                   1 336      42.8  10.9  40.9  5.4   83.7  −0.023   −0.337
srRNA                   774        46.5  10.2  38.6  4.7   85.1  −0.093   −0.369
A+T-rich region         377        49.3  4.5   42.7  3.4   92.0  −0.072   −0.139
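The skew values in Table 4 can be reproduced from the base percentages. The paper does not state the formula it used; the sketch below assumes the conventional definitions, AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C), which give the tabulated whole-genome values when applied to the first row.

```python
def skew(x, y):
    """Return (x - y) / (x + y); used for AT-skew (A, T) and GC-skew (G, C)."""
    return (x - y) / (x + y)


# Whole-genome base composition (%) from the first row of Table 4
t, c, a, g = 40.6, 12.4, 39.2, 7.8

at_skew = round(skew(a, t), 3)
gc_skew = round(skew(g, c), 3)
print(at_skew, gc_skew)
```

Negative values on both measures correspond to the statement in the text that Ts outnumber As and Cs outnumber Gs on the reported strand.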


(43.7%), A (33.2%), C (13.4%) and G (9.7%), whereas that of the minority strand (N strand) is A (47.4%), T (33.4%), C (12.8%) and G (6.4%) (Table 4).

All PCGs of the D. hyparete mitogenome except COI are initiated by typical ATN codons (ND2, ATP8, ND3, ND5 and ND6 with ATT; COII, ATP6, COIII, ND4 and Cytb with ATG; ND4L and ND1 with ATA), while the COI gene is tentatively designated as starting with the CGA codon. Nine PCGs harbor the complete termination codon TAN (8 genes with TAA, ND3 with TAG) and the remaining 4 genes (COI, COII, ND5 and ND4) end with a single T immediately ahead of tRNA genes (Table 2).

D. hyparete harbors 3 703 codons, excluding termination codons, and this number is slightly lower than those of Aporia crataegi (3 708) and Coreana raphaelis (3 708) (Table 3). The base composition at each codon position of the concatenated 13 PCGs showed that the A+T content at the third codon position (84.8%) was higher than those of the first (79.5%) and second (70.5%) positions, consistent with other sequenced lepidopteran species. The relative synonymous codon usage (RSCU) values of NNU and NNA codons were greater than 1, indicating that U and A are more frequently used at the third codon position. The codon usage bias and the A+T bias of the third codon position in PCGs are positively correlated with each other. Furthermore, ATT (Ile), AAT (Asn), TTT (Phe) and TTA (Leu) are the most frequently used codons, accounting for 9.2%, 7.2%, 6.6% and 6.4% of all the codons, respectively.

Transfer RNA genes and ribosomal RNA genes

In total, 22 transfer RNA genes are interspersed throughout the D. hyparete mitogenome, ranging in length from 61 bp (tRNASer(AGN)) to 71 bp (tRNALys) (Table 2). All of them can be folded into the expected clover-leaf secondary structure, except that tRNASer(AGN) lacks the dihydrouridine (DHU) stem, which is replaced by a simple loop (Figure 2). The nucleotide composition of the 22 tRNA genes (1 447 bp in total size) is also A+T biased (80.4%). Sixteen tRNA genes possess 24 mismatched pairs in their stems, including 5 pairs in the amino acid acceptor stems, 8 pairs in the DHU stems, 9 pairs in the anticodon stems and 2 pairs in the TΨC stems. These mismatched bases are mainly G•U and U•U, with the exception of tRNAArg and tRNALeu(UUR), which exhibit A•C mismatches.

As in all other insect mitogenome sequences, two ribosomal RNA genes (the 1 336 bp lrRNA and the 774 bp srRNA) are present in the D. hyparete mitogenome. These two rRNA genes have A+T contents of 83.7% and 85.1%, and are located between tRNALeu(CUN) and tRNAVal, and between tRNAVal and the A+T-rich region, respectively (Table 2, Table 3).

A+T-rich region

The 377 bp A+T-rich region of the D. hyparete mitogenome is located between the srRNA and tRNAMet (Figure 1, Table 2) and shows a higher A+T content (92.0%) than the other regions of the D. hyparete mitogenome (Table 4). This region does not contain any conspicuous macro-repeat units, but harbors some structures typical of other butterfly mitogenomes: the ON (origin of minority or light strand replication) located 18 bp upstream of the 5'-end of the srRNA gene, which contains the motif ATAGA followed by an 18 bp poly-T stretch; the microsatellite-like elements (TA)5 and (AT)7 located 196 bp and 273 bp upstream of the srRNA gene (Figure 3); the microsatellite-like repeat (AT)5 preceded by the ATTTA motif; and a 10 bp polyA-like stretch (AAAAATAAAA) present immediately upstream of tRNAMet (Figure 3).

DISCUSSION

General features of D. hyparete mitogenome

The 15 186 bp D. hyparete mitogenome is larger than those of the three other available Pieridae species; however, its size is well within the range detected in completely sequenced butterfly species, from 15 140 bp in Artogeia melete (Hong et al, 2009), Aporia crataegi (Park et al, 2012) and Argynnis nerippe (Kim et al, 2011b) to 16 094 bp in Agehana maraho (Wu et al, 2010). Likewise, the gene content, orientation and order are identical to those of the majority of other lepidopterans. This type of gene arrangement has been considered to be a synapomorphy for the order Lepidoptera (Kim et al, 2011a), differing from that of ancestral insects due to the movement of tRNAMet to a position 5'-upstream of tRNAIle, resulting in the order tRNAMet, tRNAIle, and tRNAGln, rather than the order tRNAIle, tRNAGln, and tRNAMet (Boore et al, 1998).

Although the gene content of the insect mitogenome is highly conserved, specific characteristics vary considerably among its different groups, e.g. the length of the A+T-rich region, the number of tRNA genes, and the organization of genes (Hong et al, 2009). For instance, Coreana raphaelis has an extra copy of tRNASer(AGN) (Kim et al, 2006), Acraea issoria harbors an extra copy of tRNAIle(AUR)b (Hu et al, 2010), Parnassius bremeri harbors two tRNATrp-like and tRNALeu(UUR)-like sequences (Kim et al, 2009), Argynnis nerippe possesses two extra tRNAMet-like and tRNALeu(UUR)-like sequences (Kim et al, 2011b), and Ctenoptilum vasava includes an extra copy of the trnS (AGN) gene and a tRNA-like insertion trnL (UUR) (Hao et al, 2012).

PCGs

To date, no agreement has been reached concerning

Kunming Institute of Zoology (CAS), China Zoological Society Volume 33 Issues E5−6 E116 SHI, et al.

Figure 2 Predicted secondary clover-leaf structure of the Delias hyparete 22 tRNA genes


the insect COI start codons. For example, the trinucleotide TTG was proposed as the initiation codon for the COI gene in Caligula boisduvalii (Hong et al, 2008), Acraea issoria (Hu et al, 2010) and Calinaga davidis (Xia et al, 2011); ATA for Sericinus montela (Ji et al, 2012); ATT for Ctenoptilum vasava (Hao et al, 2012) and Aporia crataegi (Park et al, 2012); the tetranucleotide TTAG for Antheraea pernyi (Liu et al, 2008) and Coreana raphaelis (Kim et al, 2006); and the hexanucleotides TATTAG for Ostrinia nubilalis and O. furnicalis (Coates et al, 2005), ATTACG for Papilio xuthus (Feng et al, 2010) and TTAAAG for Pieris rapae (Mao et al, 2010). Here, by sequence homology comparison with other available lepidopteran species, we tentatively presumed CGA to be the start codon of the COI gene (Lee et al, 2006; Jiang et al, 2009; Hong et al, 2009; Kim et al, 2009; Kim et al, 2010; Wang et al, 2011; Tian et al, 2012; Chen et al, 2012).

Figure 3 The structure of the A+T-rich region of the Delias hyparete mitochondrial genome

Transfer and ribosomal RNA genes

The D. hyparete mitogenome harbors 22 tRNA genes scattered throughout the whole genome. Unlike in some other butterfly species, no extra tRNA gene has been detected in D. hyparete (Kim et al, 2006; Hu et al, 2010; Kim et al, 2009; Kim et al, 2011b). The D. hyparete tRNAs harbor a total of 24 unmatched base pairs within the stem regions: 17 are G•U pairs and the remaining 7 are atypical pairs, including 5 U•U pairs and 2 A•C pairs, all of which form weak bonds in the tRNAs and could be corrected through RNA-editing mechanisms (Yokobori & Pääbo, 1995; Lavrov et al, 2000). All tRNAs possess an invariable 7 bp in the aminoacyl stem, as commonly found in other butterfly species (Kim et al, 2009; Kim et al, 2010; Chen et al, 2012); 21 tRNAs harbor 7 bp in the anticodon loop (the exception being tRNAHis, with 9 bp), and 19 tRNAs harbor 5 bp in the anticodon stem (the exceptions being tRNAIle, tRNAGln and tRNALys, with 4 bp). Additionally, most tRNA size variation results from length variation in the DHU and TΨC arms, within which loop sizes (3−9 bp) are slightly more variable than stem sizes (3−5 bp).

In D. hyparete, the two rRNA genes are located in the same positions as in other lepidopterans. Their sizes are likewise well within the range of other known butterfly species, from 1 314 bp in Euploea mulciber (Unpublished, NC_016720) to 1 344 bp in Parnassius bremeri (Kim et al, 2009) for lrRNA, and from 739 bp in Protantigius superans (Kim et al, 2011a) to 788 bp in Acraea issoria (Hu et al, 2010) for srRNA. The A+T content of the two rRNA genes is consistent with that observed in other available butterfly species (Table 3).

Non-coding regions

Intergenic spacer sequences The four spacers longer than 10 bp are located between tRNAGln and ND2 (46 bp), tRNAHis and ND4 (39 bp), ND5 and tRNAHis (24 bp), and tRNASer(UCN) and ND1 (16 bp). The remaining 9 spacers, each shorter than 10 bp, are dispersed throughout the whole genome and account for 18.3% of the total length of all intergenic spacers.

The longest intergenic spacer, located between tRNAGln and ND2, is found not only in D. hyparete but also in other sequenced lepidopterans (Lee et al, 2006; Kim et al, 2009; Hu et al, 2010; Kim et al, 2010), ranging in size from 40 bp in Parnassius bremeri to 87 bp in Sasakia charonda. This intergenic spacer is usually considered a constitutive synapomorphic feature and a molecular signature of lepidopteran mitogenomes due to its absence in non-lepidopteran insects (Junqueira et al, 2004; Cha et al, 2007; Cameron & Whiting, 2008). Additionally, the 67.4% homology of this intergenic spacer with its neighboring ND2 gene suggests that it originated from an ND2 duplication (Kim et al, 2009; Kim et al, 2010), similar to the cases reported in other lepidopterans, including Artogeia melete (68.8%) (Hong et al, 2009), Pieris rapae (71.7%) (Mao et al, 2010), Coreana raphaelis (62.5%) (Kim et al, 2006), Eumenis autonoe (74%) (Kim et al, 2010), Parnassius bremeri (70%) (Kim et al, 2009) and Bombyx mandarina (70.2%) (Pan et al, 2008).

The second largest spacer is inserted between tRNAHis and ND4 (39 bp); a corresponding spacer has been detected only in the Acraea issoria mitogenome (4 bp) (Hu et al, 2010), and it is absent in other sequenced butterfly

species (e.g., Kim et al, 2006; Kim et al, 2009; Kim et al, 2010; Wang et al, 2011; Hong et al, 2009; Mao et al, 2010; Ji et al, 2012; Hao et al, 2012). The sequence alignment of this spacer with the neighboring ND4 gene revealed a sequence homology of 82.1%, which likewise suggests that it originated from an ND4 duplication (Figure 4). The third largest spacer, located between ND5 and tRNAHis, is 24 bp long. This region is highly variable in length among butterfly species (8−24 bp) (Hu et al, 2010; Kim et al, 2006; Wang et al, 2011; Ji et al, 2012), and in some species the two genes even overlap, such as in Parnassius bremeri (Kim et al, 2009) and Eumenis autonoe (Kim et al, 2010).

Figure 4 Sequence alignment of the intergenic spacer located between tRNAHis and ND4 gene with its neighboring partial ND4 gene
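The spacer and overlap totals quoted in the Results and in this section follow directly from the intergenic-length column of Table 2. The sketch below transcribes that column by hand in gene order; the variable names are ours.

```python
# Intergenic-length column of Table 2, in gene order
# (positive = spacer in bp, negative = overlap in bp)
intergenic = [
    0, -3, -3, 46, -2, -8, -1, 2, 0, 1, 0, -1, 0, -7, 3, 2, 0, -2, -1, 4, 0,
    0, 8, 2, 24, 39, -7, -8, 0, 2, -1, -2, 16, 4, 0, 0, 0, 0,
]

spacers = [n for n in intergenic if n > 0]
overlaps = [-n for n in intergenic if n < 0]
long_spacers = [n for n in spacers if n > 10]
short_spacers = [n for n in spacers if n < 10]

# Share of total spacer length contributed by the short spacers
short_share = round(100 * sum(short_spacers) / sum(spacers), 1)

print(len(spacers), sum(spacers), len(overlaps), sum(overlaps))
print(sorted(long_spacers, reverse=True), len(short_spacers), short_share)
```

The four long spacers sum to 125 bp, so the nine short ones contribute the remaining 28 of 153 bp, which is the 18.3% figure given in the text.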

In some butterfly species, such as Acraea issoria, Argyreus hyperbius, Calinaga davidis, Argynnis nerippe and Parathyma sulpitia, there is a 2 bp overlap between the tRNASer(UCN) and ND1 genes. In D. hyparete and in other butterfly and moth species, however, an intergenic spacer is present at this position, such as the 9 bp spacer in Diatraea saccharalis (Unpublished, NC_013274) and the 38 bp spacer in Ostrinia nubilalis (Coates et al, 2005). All of these intergenic spacers harbor the ATACTAA motif, which is conserved in all sequenced lepidopteran species and has been suggested as a possible recognition site for the transcription termination peptide (mtTERM protein) (Cameron & Whiting, 2008; Taanman, 1999) (Figure 5).

Figure 5 Sequence alignment of the internal spacer regions located between tRNASer(UCN) and ND1
Boxed nucleotides indicate the conserved heptanucleotide region (ATACTAA) detected in all sequenced lepidopteran insects. Underlined nucleotides indicate the adjacent partial sequences of the tRNASer(UCN) gene and ND1 gene. Arrows indicate the transcriptional direction.

A+T-rich region As in other lepidopteran insects, the A+T-rich region of the D. hyparete mitogenome includes the origin sites for transcription and replication (Taanman, 1999). Its 377 bp size is well within the range observed in completely sequenced butterfly species, from 328 bp in Libythea celtis (Unpublished, NC_016724) to 1 270 bp in Papilio maraho (Wu et al, 2010). Among the sequenced Pieridae species, the A+T content of the D. hyparete A+T-rich region (92%) is lower than that of the corresponding region of Aporia crataegi (95.2%) (Park et al, 2012), but it lies within the range reported for butterflies, from 88.1% in Ctenoptilum vasava (Hao et al, 2012) to 96.3% in Libythea celtis (Unpublished, NC_016724).

The presence of varying copy numbers of tandemly repeated elements in the mitochondrial A+T-rich region has frequently been reported in insects (Zhang et al, 1995; Cameron & Whiting, 2007), including some lepidopteran insects. For example, the Antheraea pernyi A+T-rich region harbors a 38 bp repeat element tandemly repeated 6 times (Liu et al, 2008), that of A. proylei has 5 repeat elements (Arunkumar et al, 2006; Liu et al, 2008), and that of Japanese Bombyx mandarina harbors a tandem triplication of a 126 bp repeat unit, while only one of the repeat elements is found in the A+T-rich region of B. mori and Chinese B. mandarina (Yukuhiro et al, 2002; Pan et al, 2008). Within Papilionoidea, the nymphalid


Eumenis autonoe is the only species reported to have a tandem repeat sequence, which comprises 10 identical 27 bp tandem repeats and one 13 bp incomplete final repeat (Kim et al, 2010). In this study, no large sequence repeats were detected in the D. hyparete A+T-rich region, which harbors several features common to other lepidopteran insects, such as the motif ATAGA followed by an 18 bp poly-T stretch close to the srRNA gene, a microsatellite-like (AT)5 element preceded by the ATTTA motif, and an (AT)7 element (Fig. 3). The microsatellite-like repeat preceded by the ATTTA motif is common in insect mitogenomes and has been found in all other lepidopteran species, for example ATTTA (TA)8 in Hyphantria cunea (Liao et al, 2010), ATTTA (AT)7 in Coreana raphaelis (Kim et al, 2006), ATTTA (TA)9 in Pieris rapae (Mao et al, 2010), ATTTA (TA)9 in Argyreus hyperbius (Wang et al, 2011) and ATTTA (TA)10 in Apatura ilia (Chen et al, 2012). The function of these sequences is still unknown, and more data are required before any firm conclusion can be drawn (Kim et al, 2011b). The 10 bp polyA-like structure upstream of tRNAMet is absent in the other two sequenced Pieridae species, Artogeia melete (Hong et al, 2009) and Pieris rapae (Mao et al, 2010). However, similar cases have been found in other butterfly species; for example, the polyA structures of Acraea issoria and Apatura ilia are likewise interrupted by a single G and a single T base, respectively (Hu et al, 2010; Chen et al, 2012).
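The three kinds of A+T-rich region feature discussed above (ATAGA plus a poly-T run, ATTTA plus an (AT)n or (TA)n microsatellite, and the AAAAATAAAA stretch) can be located with simple pattern matching. The sketch below is an illustration only, not the authors' method: the function name is hypothetical, and the toy sequence is invented so that it contains one instance of each feature; it is not the real D. hyparete sequence (GenBank JX094279).

```python
import re

# Hypothetical toy sequence built for illustration, NOT the real A+T-rich region
toy = "TTATAGA" + "T" * 18 + "AACC" + "ATTTA" + "AT" * 5 + "GG" + "AAAAATAAAA"


def find_features(seq):
    """Return {feature: (0-based start, length in bp or repeat count)}."""
    found = {}
    m = re.search(r"ATAGA(T+)", seq)  # ATAGA motif followed by a poly-T run
    if m:
        found["poly_T"] = (m.start(1), len(m.group(1)))
    m = re.search(r"ATTTA((?:AT)+|(?:TA)+)", seq)  # ATTTA + dinucleotide repeat
    if m:
        found["microsatellite"] = (m.start(1), len(m.group(1)) // 2)
    pos = seq.find("AAAAATAAAA")  # polyA-like stretch interrupted by one T
    if pos != -1:
        found["polyA_like"] = (pos, 10)
    return found


features = find_features(toy)
print(features)
```

On a real sequence the strand and orientation matter: the motifs are described relative to the srRNA and tRNAMet genes, so the region may need to be reverse-complemented before searching.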

References

Arunkumar KP, Metta M, Nagaraju J. 2006. Molecular phylogeny of silkmoths reveals the origin of domesticated silkmoth, Bombyx mori from Chinese Bombyx mandarina and paternal inheritance of Antheraea proylei mitochondrial DNA. Mol Phylogenet Evol, 40(2): 419-427.

Ballard JWO, Whitlock MC. 2004. The incomplete natural history of mitochondria. Mol Ecol, 13(4): 729-744.

Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res, 27(2): 573-580.

Boore JL, Lavrov DV, Brown WM. 1998. Gene translocation links insects and crustaceans. Nature, 392(6677): 667-668.

Boore JL. 1999. Animal mitochondrial genomes. Nucleic Acids Res, 27(8): 1767-1780.

Cameron SL, Whiting MF. 2007. Mitochondrial genomic comparisons of the subterranean termites from the Genus Reticulitermes (Insecta: Isoptera: Rhinotermitidae). Genome, 50(2): 188-202.

Cameron SL, Whiting MF. 2008. The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. Gene, 408(1-2): 112-123.

Caterino MS, Sperling FAH. 1999. Papilio phylogeny based on mitochondrial cytochrome oxidase I and II genes. Mol Phylogenet Evol, 11(1): 122-137.

Cha SY, Yoon HJ, Lee EM, Yoon MH, Hwang JS, Jin BR, Han YS, Kim I. 2007. The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae). Gene, 392(1-2): 206-220.

Chen M, Tian LL, Shi QH, Cao TW, Hao JS. 2012. Complete mitogenome of the Lesser Purple Emperor Apatura ilia (Lepidoptera: Nymphalidae: Apaturinae) and comparison with other nymphalid butterflies. Zool Res, 33(2): 191-201.

Chou I. 1998. Classification and Identification of Chinese Butterflies[M]. Zhengzhou: Henan Scientific and Technological Publishing House. (in Chinese)

Coates BS, Sumerford DV, Hellmich RL, Lewis LC. 2005. Partial mitochondrial genome sequences of Ostrinia nubilalis and Ostrinia furnicalis. Int J Biol Sci, 1(1): 13-18.

Feng X, Liu DF, Wang NX, Zhu CD, Jiang GF. 2010. The mitochondrial genome of the butterfly Papilio xuthus (Lepidoptera: Papilionidae) and related phylogenetic analyses. Mol Biol Rep, 37(8): 3877-3888.

Fenn JD, Cameron SL, Whiting MF. 2007. The complete mitochondrial genome of the Mormon cricket (Anabrus simplex: Tettigoniidae: Orthoptera) and an analysis of control region variability. Insect Mol Biol, 16(2): 239-252.

Hao JS, Su CY, Zhu GP, Chen N, Wu DX, Zhang XP. 2007. The molecular morphologies of mitochondrial 16S rDNA of the main butterfly lineages and their phylogenetic significances. Genet Mol Biol, 18(2): 111-123.

Hao JS, Sun QQ, Zhao HB, Sun XY, Gai YH, Yang Q. 2012. The complete mitochondrial genome of Ctenoptilum vasava (Lepidoptera: Hesperiidae: Pyrginae) and its phylogenetic implication. Comp Funct Genomics, 2012: 328049, doi: 10.1155/2012/328049.

Hall TA. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser, 41: 95-98.

Hong MY, Lee EM, Jo YH, Park HC, Kim SR, Hwang JS, Jin BR, Kang PD, Kim KG, Han YS, Kim I. 2008. Complete nucleotide sequence and organization of the mitogenome of the silk moth Caligula boisduvalii (Lepidoptera: Saturniidae) and comparison with other lepidopteran insects. Gene, 413(1-2): 49-57.

Hong GY, Jiang ST, Yu M, Yang Y, Li F, Xue FS, Wei ZJ. 2009. The complete nucleotide sequence of the mitochondrial genome of the cabbage butterfly, Artogeia melete (Lepidoptera: Pieridae). Acta Biochim Biophys Sin, 41(6): 446-455.

Hu J, Zhang DX, Hao JS, Huang DY, Cameron S, Zhu CD. 2010. The complete mitochondrial genome of the yellow coaster, Acraea issoria (Lepidoptera: Nymphalidae: Heliconiinae: Acraeini): sequence, gene organization and a unique tRNA translocation event. Mol Biol Rep, 37(7): 3431-3438.

Ji LW, Hao JS, Wang Y, Huang DY, Zhao JL, Zhu CD. 2012. The complete mitochondrial genome of the dragon swallowtail, Sericinus montela Gray (Lepidoptera: Papilionidae) and its phylogenetic implication. Acta Entomol Sin, 55(1): 91-100.

Jiang ST, Hong GY, Yu M, Li N, Yang Y, Liu YQ, Wei ZJ. 2009. Characterization of the complete mitochondrial genome of the giant silkworm moth, Eriogyna pyretorum (Lepidoptera: Saturniidae). Int J Biol Sci, 5(4): 351-365.

Junqueira ACM, Lessinger AC, Torres TT, Da Silva FR, Vettore AL, Arruda P, Azeredo-Espin AML. 2004. The mitochondrial genome of the blowfly Chrysomya chloropyga (Diptera: Calliphoridae). Gene, 339: 7-15.

Kim I, Lee EM, Seol KY, Yun EY, Lee YB, Hwang JS, Jin BR. 2006. The mitochondrial genome of the Korean hairstreak, Coreana raphaelis (Lepidoptera: Lycaenidae). Insect Mol Biol, 15(2): 217-225.

Kim MII, Baek JY, Kim MJ, Jeong HC, Kim KG, Bae CH, Han YS, Jin BR, Kim I. 2009. Complete nucleotide sequence and organization of the mitogenome of the red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) and comparison with other lepidopteran insects. Mol Cells, 28(4): 347-363.

Kim MJ, Wan XL, Kim KG, Hwang JS, Kim I. 2010. Complete nucleotide sequence and organization of the mitogenome of endangered Eumenis autonoe (Lepidoptera: Nymphalidae). Afri J Biotech, 9(5): 735-754.

Kim MJ, Kang AR, Jeong HC, Kim KG, Kim I. 2011a. Reconstructing intraordinal relationships in Lepidoptera using mitochondrial genome data with the description of two newly sequenced lycaenids, Spindasis takanonis and Protantigius superans (Lepidoptera: Lycaenidae). Mol Phylogenet Evol, 61(2): 436-445.

Kim MJ, Jeong HC, Kim SR, Kim I. 2011b. Complete mitochondrial genome of the nerippe fritillary butterfly, Argynnis nerippe (Lepidoptera: Nymphalidae). Mitochondrial DNA, 22(4): 86-88.

Lee ES, Shin KS, Kim MS, Park H, Cho S, Kim CB. 2006. The mitochondrial genome of the smaller tea tortrix Adoxophyes honmai (Lepidoptera: Tortricidae). Gene, 373: 52-57.

Lavrov DV, Brown WM, Boore JL. 2000. A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proc Natl Acad Sci USA, 97(25): 13738-13742.

Liao F, Wang L, Wu S, Li YP, Zhao L, Huang GM, Niu CJ, Liu YQ, Li MG. 2010. The complete mitochondrial genome of the fall webworm, Hyphantria cunea (Lepidoptera: Arctiidae). Int J Biol Sci, 6(2): 172-186.

Liu YQ, Li YP, Pan MH, Dai FY, Zhu XW, Lu C, Xiang ZH. 2008. The complete mitochondrial genome of the Chinese oak silkmoth, Antheraea pernyi (Lepidoptera: Saturniidae). Acta Biochim Biophys Sin, 40(8): 693-703.

Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res, 25(5): 955-964.

Mao ZH, Hao JS, Zhu GP, Hu J, Si MM, Zhu CD. 2010. Sequencing and analysis of the complete mitochondrial genome of Pieris rapae Linnaeus (Lepidoptera: Pieridae). Acta Entomol Sin, 53(11): 1295-1304. (in Chinese)

Pan MH, Yu QY, Xia YL, Dai FY, Liu YQ, Lu C, Zhang Z, Xiang ZH. 2008. Characterization of mitochondrial genome of Chinese wild mulberry silkworm, Bombyx mandarina (Lepidoptera: Bombycidae). Sci Chn, 51(8): 693-701.

Park JS, Cho Y, Kim MJ, Nam SH, Kim I. 2012. Description of complete mitochondrial genome of the black-veined white, Aporia crataegi (Lepidoptera: Papilionoidea), and comparison to papilionoid species. J Asia Pac Entomol, 15(3): 331-341.

Simmons RB, Weller SJ. 2001. Utility and evolution of cytochrome b in insects. Mol Phylogenet Evol, 20(2): 196-210.

Simon C, Frati F, Beckenbach A, Crespi B, Liu H, Flook P. 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann Entomol Soc Am, 87(6): 651-701.

Simonsen TJ, Wahlberg N, Brower AVZ, Jong R. 2006. Morphology, molecules and fritillaries: approaching a stable phylogeny for Argynnini (Lepidoptera: Nymphalidae). Insect Syst Evol, 37(4): 405-418.

Stoneking M, Soodyall H. 1996. Human evolution and the mitochondrial genome. Curr Opin Genet Dev, 6(6): 731-736.

Taanman JW. 1999. The mitochondrial genome: structure, transcription, translation and replication. Biochim Biophys Acta, 1410(2): 103-123.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol, 28(10): 2731-2739.

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. 1997. The Clustal X windows interface: flexible strategies for multiple sequences alignment aided by quality analysis tools. Nucleic Acids Res, 25(24): 4876-4882.

Tian LL, Sun XY, Chen M, Gai YH, Hao JS, Yang Q. 2012. Complete mitochondrial genome of the Five-dot Sergeant Parathyma sulpitia (Nymphalidae: Limenitidinae) and its phylogenetic implications. Zool Res, 33(2): 133-143.

Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson A. 1991. African populations and the evolution of human mitochondrial DNA. Science, 253(5027): 1503-1507.

Wang XC, Sun XY, Sun QQ, Zhang DX, Hu J, Yang Q, Hao JS. 2011. Complete mitochondrial genome of the laced fritillary Argyreus hyperbius (Lepidoptera: Nymphalidae). Zool Res, 32(5): 465-475.

Wolstenholme DR. 1992. Animal mitochondrial DNA: structure and evolution. Int Rev Cytol, 141: 173-216.

Wu LW, Lees DC, Yen SH, Lu CC, Hsu YF. 2010. The complete mitochondrial genome of the near-threatened swallowtail, Agehana maraho (Lepidoptera: Papilionidae): evaluating sequence variability and suitable markers for conservation genetic studies. Entomol News, 121(3): 267-280.

Xia J, Hu J, Zhu GP, Zhu CD, Hao JS. 2011. Sequence and analysis of the complete mitochondrial genome of Calinaga davidis Oberthür (Lepidoptera: Nymphalidae). Acta Entomol Sin, 54(5): 555-565. (in Chinese)

Yokobori S, Pääbo S. 1995. Transfer RNA editing in land snail mitochondria. Proc Natl Acad Sci USA, 92(22): 10432-10435.

Yukuhiro K, Sezutsu H, Itoh M, Shimizu K, Banno Y. 2002. Significant levels of sequence divergence and gene rearrangements have occurred between the mitochondrial genomes of the wild mulberry silkmoth, Bombyx mandarina, and its close relative, the domesticated silkmoth, Bombyx mori. Mol Biol Evol, 19(8): 1385-1389.

Zhang DX, Szymura JM, Hewitt GM. 1995. Evolution and structural conservation of the control region of insect mitochondrial DNA. J Mol Evol, 40(4): 382-391.

Zoological Research 33 (E5−6): E121−E128 doi: 10.3724/SP.J.1141.2012.E05-06E121

Peak identification for ChIP-seq data with no controls

Yanfeng ZHANG, Bing SU*

State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, the Chinese Academy of Sciences, Kunming 650223, China

Abstract: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is increasingly used for genome-wide profiling of transcriptional regulation, as it enables dissection of gene regulatory networks. With input as control, a variety of statistical methods have been proposed for identifying enriched regions in the genome, i.e., transcription factor binding sites and chromatin modifications. However, whether peak calling remains reliable when no controls are available has not been systematically evaluated. To address this question, we used a Bayesian framework to show the effectiveness of peak calling without controls (PCWC). Using several different types of ChIP-seq data, we demonstrated the relatively high accuracy of PCWC, with a false discovery rate (FDR) below 5%. Compared with previously published methods, e.g., the model-based analysis of ChIP-seq (MACS), PCWC is reliable and has a lower FDR. Furthermore, when the biological significance of the called peaks was interpreted in combination with microarray gene expression data, gene ontology annotation and subsequent motif discovery, our results indicated that PCWC is highly effective. Additionally, only a small number of peaks were identified in in silico data, suggesting a low FDR for PCWC.

Keywords: ChIP-seq; Bayesian; Peak calling; Gene regulation

Received: 06 September 2012; Accepted: 31 October 2012
Foundation items: This study was supported by the National 973 project of China (2011CBA01101) and the National Natural Science Foundation of China (30871343 and 31130051)
* Corresponding author, E-mail: [email protected]

With the advance of high-throughput sequencing technologies, global surveys of genome structure, the transcriptome and transcriptional regulation have become more accurate and sensitive. As one of the more powerful and widely used experimental techniques for studying DNA-protein interactions in vivo, chromatin immunoprecipitation followed by sequencing (ChIP-seq) is one of the early applications of next generation sequencing (NGS). Since the first study in 2007 (Mikkelsen et al), ChIP-seq technology has been increasingly used for mapping DNA-binding sites of transcription factors and chromatin modifications. Compared with the earlier method of chromatin immunoprecipitation followed by tiling microarray (ChIP-chip) (Kapranov et al, 2002), ChIP-seq has several advantages, e.g., higher resolution, cost-effectiveness and technical simplicity. At the same time, ChIP-seq also has several disadvantages, including the need for efficient computational analysis techniques, sequencing depth requirements, and the like (Park, 2009).

Currently, there are three commonly used types of control samples in ChIP experiments: input DNA (DNA samples without immunoprecipitation (IP)); mock IP DNA (DNA obtained from IP without antibodies); and DNA from non-specific IP (such as IP with immunoglobulin G). Concomitantly, several tools for ChIP-seq analysis have been developed, the majority of which require a matched control sample to determine the fold enrichment and significance of peak signals. This idea of a fold ratio relative to controls is similar to ChIP-chip data analysis, where it is necessary for peak calling in the currently used methods (Rozowsky et al, 2009; Valouev et al, 2008). Meanwhile, several new methods based on Validation Discriminant Analysis (Micsinai et al, 2012), Hidden Markov Models (Choi et al, 2009) or Bayesian Models (Spyrou et al, 2009) combine ChIP-seq and ChIP-chip data for peak identification. Nonetheless, a systematic analysis of ChIP-chip and ChIP-seq datasets revealed that the input data has variable effects on peak finding (Ho et al, 2011), suggesting the need for a high quality input sample for peak calling. Additionally, sequencing depth significantly


impacts the peak calling, further causing bias in peak identification (Chen et al, 2012). To improve peak calling accuracy, Osmanbeyoglu et al (2012) utilized a strategy of co-regulation binding by integrating multiple sources of biological information. Several novel tools for peak calling without controls have recently been made available (Cairns et al, 2011; Fejes et al, 2008; Hower et al, 2011); for example, the model-based analysis of ChIP-seq (MACS) defines a dynamic parameter to capture local biases when a control profile is unavailable (Zhang et al, 2008). However, no systematic evaluation of peak calling without controls has been carried out, even for MACS or BayesPeak.

Given the bias of control data in ChIP-seq peak calling, we sought to address whether ChIP-seq peak calling without controls (PCWC) is plausible. Employing Bayes' theorem and simulation-based empirical estimates of the background distribution, we demonstrated that our method for ChIP-seq peak identification with no control is reliable, with an FDR lower than 5%. Compared with the Poisson-based MACS method, our Bayesian framework strategies showed lower FDRs, suggesting high selectivity and effectiveness for peak calling. The systematic analysis of PCWC we present in the current study could serve as an alternative strategy for ChIP-seq analysis and subsequent biological interpretation of gene regulation.

MATERIALS AND METHODS

Data sets
In order to present the comprehensive performance of PCWC, we used the FASTQ-formatted ChIP-seq data sets downloaded from Gene Expression Omnibus (GEO) (Edgar et al, 2002), including the ChIP-seq data of transcription factor E2F1 in mESCs (Chen et al, 2008), VDR binding in human lymphoblastoid cells (Ramagopalan et al, 2010) and H3K4me3 in mESCs (Creyghton et al, 2010), as well as the ABI SOLID data sets of EGR1 ChIP-seq (Tang et al, 2010) and MNase-seq in two replicates (Valouev et al, 2011). Furthermore, we integrated the microarray gene expression dataset to evaluate the effectiveness of PCWC. Additionally, we generated 1×10^7 36-bp reads in silico using the simreads program of the Rmap package (Smith et al, 2009) to simulate control samples for peak calling. The detailed description of the ChIP-seq data is shown in Table S1.

Peak calling based on the Bayesian framework
On the genome-wide scale, peaks in a ChIP-seq experiment are those regions significantly enriched in reads. Thus, regarding all uniquely mapped reads of the ChIP-seq as background, the density of reads in peak regions should be dominantly enriched. In the Bayesian framework (Madigan & Ridgeway, 2003), the posterior density for θ is obtained, up to a proportionality constant, by multiplying the prior density g(θ) by the likelihood L(θ), where θ denotes the probability that a region is a true peak:

$$\Pr(\theta \mid \text{data}) \propto L(\theta)\, g(\theta) \tag{1}$$

Based on global deep sequencing that reaches single nucleotide resolution, a uniform prior distribution is usually supposed, which has been adopted in RNA-seq analysis (Sultan et al, 2008; Wang et al, 2008). Here, for simplification and feasibility, we assume a uniform distribution for the prior g(θ) (g(θ) ~ U), so that g(θ) = 1 for all θ, leading to the equation:

$$\Pr(\theta \mid \text{data}) = \int L(\theta)\, d\theta \tag{2}$$

To measure the significant enrichment of reads in a peak, we considered a candidate peak to be a region whose number of reads (#reads) reaches a value a for which the tail probability falls below a cutoff threshold (c, usually 0.01), i.e.,

$$\Pr(\#\text{reads} \ge a) = 1 - \Pr(\#\text{reads} < a) = 1 - \int_{0}^{1-c} L(\theta)\, d\theta \le c \tag{3}$$

where the cumulative density of #reads reaching a is based on the likelihood function L(θ), and L(θ) is empirically estimated from genome-wide scanning of tag counts (10 000 random bins of w bp per chromosome). Given the optimal parameter a satisfying the cutoff c, we obtain the read count (uniquely mapped reads) in a sliding window of size w/2 bp (w=50 bp by default). Each w/2 bp sliding window with #reads above a is then regarded as a candidate peak. Peaks spanning less than 100 bp are discarded and neighboring peaks within 1 kb are merged (Guenther et al, 2008) to counteract the shifting effects of aligned tags from forward and reverse reads.

False discovery rate (FDR) estimate for the number of peaks
Similar to the method used by Valouev et al (2008), the number of called peaks (using the same threshold) that overlap between the input dataset and the ChIP dataset is taken as the false discovery number, and the FDR is the false discovery number divided by the total number of peaks called in the ChIP experiment.

Motif analysis
We used MEME 4.6.1 to discover motifs (Bailey & Elkan, 1994). Since the running time can be prohibitively long for large sets, the peaks were ranked by the number of uniquely aligned reads and only the top 10% of the peaks were selected for motif discovery. A similar methodological strategy was used by Ramagopalan et al (2010) and Jothi et al (2008). The location of each peak is centered and extended by 100 bp on either side. Motifs

between 5 and 30 bp in length were determined on both strands.

Microarray data preprocessing and analysis
A total of 45 microarray data sets for calcitriol stimulation of human lymphoblastoid cell lines were used for expression analysis. The robust multi-array average (RMA) expression values were calculated using the Bioconductor “affy” package (Gentleman et al, 2004). The log2 expression signals were used to calculate fold change.

Gene annotation
Genes regulated by trans-acting factors (or by chromatin modification, such as H3K4me3) were defined as those with a peak no more than 5 kb from a Refseq transcription start site (TSS). Enrichment analysis of genes was performed using the DAVID annotation system (Huang da et al, 2009). Terms with P<0.01 after Benjamini-Hochberg correction were considered significantly enriched. Peak calling based on the Bayesian model and other bioinformatics analyses were implemented in Perl and R (data available upon request).

RESULTS AND DISCUSSION
To demonstrate the effectiveness of ChIP-seq PCWC, we selected five representative ChIP-seq data sets from two platforms (three from the Illumina platform and two from the ABI SOLID platform) and one in silico data set (see Materials and Methods for details).

We first applied the Bayesian-based approach to recently published ChIP-seq data for transcription factor E2F1 in mouse embryonic stem cells (mESCs) (Chen et al, 2008), vitamin D receptor (VDR) binding in human lymphoblastoid cells (Ramagopalan et al, 2010) and H3 trimethylated at lysine 4 (H3K4me3) in mESCs (Creyghton et al, 2010), all of which were generated on the Illumina platform. Reads were aligned with the SOAP2 program (Li et al, 2009) allowing at most 1 mismatch, and the uniquely mapped reads were considered for further analysis. Using a cutoff threshold (c=0.01 or 0.001), the number of reads a was estimated from 10 000 random bins per chromosome with window size w=50 (Figure 1). With these two parameters, we can empirically determine the candidate peaks for each data set.

Figure 1 Cumulative density of reads based on genome-wide random bins
Grey dashed line represents the read cutoff for peak calling.

Vitamin D receptor data
The VDR ChIP-seq samples are highly suitable for evaluating ChIP-seq peak calling without controls because they include conditional data (calcitriol treated or not) with deep sequencing (ranging from 7.9×10^6 to 1.46×10^7 uniquely mapped reads, see Table 1) and an available microarray gene expression data set.
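The empirical choice of the read cutoff a described above (random bins per chromosome, then the smallest a whose tail probability Pr(#reads ≥ a) does not exceed c) can be illustrated with a minimal Python sketch. This is not the authors' Perl/R implementation, which is not shown in the paper. The function names, the uniform placement of bins, and counting a read by its start position are all assumptions made for illustration.

```python
import random
from bisect import bisect_left


def random_bin_counts(read_starts, chrom_length, n_bins=10_000, w=50, seed=0):
    """Count reads whose start falls in each of n_bins random w-bp bins.

    Assumption: bins are placed uniformly at random on one chromosome and a
    read is counted by its start coordinate (0-based).
    """
    rng = random.Random(seed)
    starts = sorted(read_starts)
    counts = []
    for _ in range(n_bins):
        left = rng.randrange(0, chrom_length - w + 1)
        lo = bisect_left(starts, left)        # first read with start >= left
        hi = bisect_left(starts, left + w)    # first read with start >= left + w
        counts.append(hi - lo)
    return counts


def tail_probability(counts, a):
    """Empirical Pr(#reads >= a) over the sampled bins."""
    return sum(1 for n in counts if n >= a) / len(counts)


def read_cutoff(counts, c=0.01):
    """Smallest read count a with empirical Pr(#reads >= a) <= c."""
    a = 1
    while tail_probability(counts, a) > c:
        a += 1
    return a
```

Tabulating `tail_probability` for a = 1, 2, 3, ... gives a column of the same kind as those reported for the bootstrap comparison in Table 3, and `read_cutoff` returns the row at which that column first drops to c or below.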

Table 1 Summary of the peak calling analysis using the VDR ChIP-seq data

Samples^a  Description        #Reads      #Uniquely aligned  Pr(#reads≥a)≤0.001  #Peaks  Mean length^b
SRX022390  unstimulated_rep1  18 252 156  13 487 568         7                   1 989   633.9
SRX022391  unstimulated_rep2  15 379 663  10 761 914         7                   548     708.6
SRX022392  vitaminD_rep1      18 391 888  13 937 615         7                   2 920   455.4
SRX022393  vitaminD_rep2      18 965 010  14 613 990         7                   2 835   467.3
SRX022394  unstimulated_rep1  12 264 149  10 149 547         7                   2 223   519.3
SRX022395  unstimulated_rep2  10 145 026  7 930 475          7                   5 298   460.7
SRX022396  vitaminD_rep1      13 302 506  10 657 775         9                   6 755   428.9
SRX022397  vitaminD_rep2      14 526 186  11 755 087         9                   6 981   434.9
SRX022398  Input1             15 001 356  11 403 930         6                   348     −
SRX022399  Input2             13 682 506  11 402 869         6                   545     −

a: Sample ID is the accession ID for meta-data documented in GEO database; b: Mean peak length reflects the resolution of binding sites.
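The "#Peaks" and "Mean length" columns of Table 1 result from the window scan, length filter and merging steps described in Materials and Methods. A minimal Python sketch of those steps follows. It is an illustration under stated assumptions rather than the authors' code. The name `call_peaks` is hypothetical, windows are tiled every w/2 bp, a window passes when its count is at least a, and short regions are discarded before neighbors are merged (the order in which the text lists the two operations).

```python
from collections import Counter


def call_peaks(read_starts, a, w=50, min_len=100, merge_gap=1000):
    """Call peaks on one chromosome from 0-based read start positions."""
    step = w // 2  # w/2 bp windows (assumed to tile the chromosome)
    counts = Counter(pos // step for pos in read_starts)
    hits = sorted(i for i, n in counts.items() if n >= a)

    # Join consecutive passing windows into candidate regions [start, end).
    regions = []
    for i in hits:
        start, end = i * step, (i + 1) * step
        if regions and regions[-1][1] == start:
            regions[-1][1] = end
        else:
            regions.append([start, end])

    # Discard regions shorter than min_len bp.
    kept = [r for r in regions if r[1] - r[0] >= min_len]

    # Merge neighboring regions separated by at most merge_gap bp.
    peaks = []
    for start, end in kept:
        if peaks and start - peaks[-1][1] <= merge_gap:
            peaks[-1][1] = end
        else:
            peaks.append([start, end])
    return [tuple(p) for p in peaks]


def mean_length(peaks):
    """Mean peak length in bp."""
    return sum(end - start for start, end in peaks) / len(peaks)
```

Whether filtering precedes merging changes the result for isolated short regions, so the order used here should be treated as one possible reading of the method.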


Firstly, we sought to call peaks in both conditions. In the samples not treated with calcitriol, the number of peaks called without control is between 548 and 5 298, while in the calcitriol-treated samples the number of peaks is between 2 835 and 6 981 (Table 1), consistent with the report of Ramagopalan et al (2010), who used MACS with two independent controls for peak calling. Interestingly, our Bayesian-based PCWC yields a number of peaks similar to that obtained using MACS with controls.

Next, we conducted analyses on multiple scales. The resolution of peaks (VDR binding sites) is also comparable with the reported study (Table 1). Furthermore, the gene ontology (GO) annotation indicates that immune system development (GO:0002520), lymphocyte activation (GO:0046649) and T cell activation (GO:0042110) are significantly enriched (P≤0.01, after Benjamini correction), also highly concordant with the report (Ramagopalan et al, 2010). Additionally, previous studies have experimentally validated that VDR (auto-regulation) (Zella et al, 2010), CCNC (Sinkkonen et al, 2005), ALOX5 (Seuter et al, 2007), IRF8 and PTPN2 (Ramagopalan et al, 2010), all modulated by VDR, show significant enrichment of peaks (VDR binding) under calcitriol-treated conditions. These were all confirmed using our method (Figure S1), suggesting the effectiveness of our method for ChIP-seq peak calling.

We then integrated the microarray gene expression data (a total of 45 microarrays) to evaluate the accuracy of PCWC. Compared with the untreated condition, a total of 205 genes (fold change≥1.5) were up-regulated with calcitriol treatment, of which 134 (65.4%) genes were significantly enriched with the called peaks. Of the 134 genes, 81 (60.4%) had peaks located in intronic regions and 33 had peaks in the TSS regions (TSS±500 bp), consistent with the report that calcitriol stimulation increases VDR binding in intronic regions (Ramagopalan et al, 2010), further suggesting the high accuracy of our Bayesian model for PCWC.

For ChIP-seq data analysis, in addition to the integration of gene expression data, the subsequent motif discovery analysis of the enriched regions is also informative for interpreting the biological implications (Park, 2009). Thus, we carried out a motif analysis of VDR binding (see Methods). Due to the prohibitively long running time for large data sets, only the top 10% of the peaks were used for motif discovery. Figure 2 shows the motifs discovered by our method, which are nearly identical to the reported data (Ramagopalan et al, 2010). Collectively, the data presented using the VDR dataset indicate that PCWC is reliable.

Figure 2 Inferred consensus binding motif for VDR

For PCWC, evaluating whether the called peaks are real signals relative to the flanking regions is crucial. We therefore conducted such an analysis based on the ratio of read coverage in peaks to that in the flanking regions. For each peak, we selected a 300 bp region from both the 5' and 3' flanking sequences. As shown in Figure 3, compared with the 5' and 3' flanking regions, the peak regions have much higher (at least 2 times higher) read coverage, indicating that the peaks called without control are relatively reliable.

Figure 3 Density of read coverage ratio between peak and flank (5' and 3') regions

Furthermore, the distribution of the uniquely mapped bases (in 1 kb bins) in the genome for the ChIP and input data also indicates that the peak regions in the ChIP data are

enriched in reads, while in the input data, they are relatively uniformly distributed (Figure S2).

We also evaluated the FDR of the Bayesian model for PCWC. As in the VDR ChIP-seq data analysis, peak calling (Pr(#reads ≥ 6) ≤ 0.001) was conducted for two independent controls (from the VDR data sets, Table 1) to evaluate the FDR. We identified a total of 348 and 545 peaks in the two independent controls (Table 1), respectively. Less than 3% of the peaks in the two controls overlap with the ChIP data, indicating that the FDR for our method is small. It should be noted that PCWC is also available in the MACS (version 1.4) program. Thus we compared the relative effectiveness of our Bayesian model with the Poisson-based MACS model. Under the default parameters, the MACS program failed to call peaks for the two independent controls because the large -mfold parameter (-mfold=32) prevented construction of the background model. When we decreased the -mfold parameter to 8 (suitable for building the background model), a total of 3 278 and 3 486 candidate peaks were called with FDRs of 5% and 8%, respectively, which are higher than the FDR of our Bayesian model. The advantage of the Bayesian model lies in the genome-wide scanning used to empirically measure the background distribution, while the Poisson-based model is subject to overestimation of the number of candidate peaks, therefore resulting in higher FDRs (Kharchenko et al, 2008).

H3K4me3 data
We also used the H3K4me3 ChIP-seq data (8.77×10^6 uniquely mapped reads, accounting for 69.7%) to compare the peak calling between the Bayesian model and the MACS model. Using MACS with no controls, a total of 14 346 peaks (Table 2) were identified when the cutoff was set to P=1e-05 (default). With the Bayesian model (Pr(#reads ≥ 9) ≤ 0.01), we identified 7 539 peaks, of which 7 518 (99.7%) overlapped with those of the MACS model with no controls, indicating a high accuracy for the Bayesian model. To avoid the potentially high false negative rate resulting from the stringent cutoff applied, we also tested a relaxed cutoff for the Bayesian model (Pr(#reads ≥ 4) ≤ 0.05) and identified a total of 11 536 peaks. Compared with the MACS model using different cutoffs, the relaxed Bayesian model retained high overlapping rates with the MACS data (above 90% for MACS cutoffs from P=1e-05 to P=1e-08; Table 2), again indicating the effectiveness of PCWC using our Bayesian model.

Table 2 Comparison of peak callings between Bayesian and MACS models

Options  MACS    Bayesian  Overlap  Overlapping (%)
1e-05    14 346  11 536    11 312   98.06
1e-06    12 627  11 536    11 116   96.36
1e-07    11 663  11 536    10 827   93.85
1e-08    10 841  11 536    10 432   90.43
1e-09    10 235  11 536    10 020   86.86

For the Bayesian model, the cutoff was set to Pr(#reads ≥ 4) ≤ 0.05.

As the likelihood function used in the Bayesian model is subject to simulation for optimizing the number of reads a, we next addressed whether the simulation method based on 10 000 bootstraps per chromosome with w=50 bp random bins would be biased. Firstly, we evaluated the effect of the number of bootstraps (ranging from 1 000 to 50 000), and no bias was observed when analyzing the H3K4me3 ChIP-seq data (Table 3). Similar results were obtained when using the VDR ChIP-seq data (data not shown). We then assessed the effect of the random bin size (w) and found no bias there either, suggesting that the simulation method itself would not generate bias for the parameters a and c (Figure 4).

Table 3 Effects of bootstrapping on the cutoff threshold

#Reads  1 000 bootstrap  5 000 bootstrap  10 000 bootstrap  20 000 bootstrap  50 000 bootstrap
1       0.148            0.148            0.144             0.141             0.144
2       0.077            0.083            0.084             0.076             0.081
3       0.052            0.060            0.062             0.052             0.056
4       0.037            0.040            0.048             0.038             0.042
5       0.027            0.028            0.036             0.029             0.031
6       0.023            0.021            0.026             0.021             0.023
7       0.017            0.016            0.018             0.015             0.016
8       0.012            0.011            0.013             0.010             0.011
9^a     0.008            0.008            0.008             0.007             0.007
10      0.004            0.005            0.007             0.004             0.005
11      0.002            0.004            0.004             0.003             0.004
13      0.000            0.002            0.003             0.001             0.002

a: bolded row indicates that bootstraps are not biased.
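The overlap-based quantities used in this section (the FDR defined in Materials and Methods and the overlapping percentages of Table 2) reduce to interval overlap counts. The following Python sketch shows one way to compute them. The function names are hypothetical, peaks are assumed to be (chromosome, start, end) tuples with half-open coordinates, and the false discovery number is assumed to be the number of ChIP peaks that overlap at least one control peak.

```python
def overlaps(p, q):
    """True if two (chrom, start, end) intervals share at least one base."""
    return p[0] == q[0] and p[1] < q[2] and q[1] < p[2]


def count_overlapping(query, reference):
    """Number of query peaks that overlap at least one reference peak."""
    return sum(1 for p in query if any(overlaps(p, q) for q in reference))


def overlap_fdr(chip_peaks, control_peaks):
    """False discovery number divided by the number of ChIP peaks."""
    return count_overlapping(chip_peaks, control_peaks) / len(chip_peaks)


def percent_overlap(n_overlap, n_total):
    """Overlap expressed as a percentage, rounded to two decimals."""
    return round(100.0 * n_overlap / n_total, 2)
```

For instance, `percent_overlap(11312, 11536)` reproduces the 98.06% reported in the first row of Table 2, where the denominator is the 11 536 peaks called by the relaxed Bayesian model. The pairwise comparison is quadratic in the number of peaks; a sorted sweep or an interval tree would be preferable for genome-scale peak sets.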


Figure 4 Effects of random bin size on the number of reads a

E2F1 data
For the E2F1 ChIP-seq data, a total of 1.188×10^7 reads (~39.0% of total reads) were uniquely mapped onto the genome. Using PCWC (Pr(#reads ≥ 9) ≤ 0.01), we identified a total of 11 470 peaks, of which the majority (75.3%, n=8 639) span annotated Refseq TSSs, consistent with the report of Chen et al (2008) that about 50% of all genes are regulated by E2F1. Our analysis also indicated that E2F1 binding regions were very close to TSSs (Figure S3), consistent with the previous ChIP-seq and ChIP-chip results (Chen et al, 2008; Xu et al, 2007). The 5 genes (Figure S4) experimentally verified to be positively regulated by E2F1, namely Cdc6 (Yan et al, 1998), Ccne1, Ccna2 (DeGregori et al, 1995), Mcm4 and Mcm7 (Arata et al, 2000), were all identified in the E2F1-regulated regions (peaks).

Other data
Using ChIP-seq data from the Illumina platform, we demonstrated the effectiveness and accuracy of PCWC. To further test the effectiveness of the Bayesian model, we used two ABI SOLID platform data sets. The raw colorspace-formatted data were aligned using Bowtie software (Langmead et al, 2009) with at most 2 mismatches, and uniquely mapped reads were used for further analysis.

For the early growth response gene 1 (EGR1) ChIP-seq data (Tang et al, 2010), we identified a total of 7 302 peaks. The KEGG pathway annotation result indicates that genes regulated by EGR1 are significantly enriched in the MAPK, the Wnt and the TGF-beta signaling pathways (hsa04010, hsa04310, hsa04350, respectively), congruent with earlier reported data (Tang et al, 2010).

For the MNase-seq data in two replicates (Valouev et al, 2011), a total of 533 401 and 514 135 peaks were identified, respectively, of which 83.12% overlapped between the two replicates, further supporting the effectiveness of peak calling with no control.

In silico data
To demonstrate the effectiveness of ChIP-seq PCWC, we generated 1×10^7 36-bp reads in silico using simreads of the Rmap package (Smith et al, 2009) to produce a simulated control sample. As expected, only 428 peaks were generated based on the Bayesian model. Notably, the number of peaks called from the in silico data is comparable to that of the two independent controls (Table 1), suggesting that the PCWC is sensitive.

CONCLUSIONS
In this study, we presented an informative analysis of ChIP-seq PCWC based on the Bayesian framework approach. By applying it to multiple ChIP-seq data sets, we have demonstrated the effectiveness of our method for both transcription factor and chromatin modification ChIP-seq experiments. Notably, demonstrating the effectiveness of PCWC does not imply superiority over methods that use controls. The purpose of this study is to demonstrate a potential alternative strategy for designing ChIP-seq experiments and identifying transcription factor binding sequences without using controls.

For ChIP-chip analysis, the tiling array may suffer from cross-hybridization; therefore, the fold ratio relative to background is required for peak calling. By contrast, our analyses indicated the effectiveness of ChIP-seq PCWC, likely due to the finer resolution and greater signal-to-noise ratio of the ChIP-seq data (Rozowsky et al, 2009). Meanwhile, as opposed to the Poisson-based model for ChIP-seq peak calling, our Bayesian model depends on genome-wide scanning to empirically measure the background distribution, resulting in higher selectivity and lower FDRs for PCWC.

In contrast to RNA-seq, ChIP-seq is more constrained by relatively complicated pre-ChIP experiments, e.g., antibody specificity, the large number of cells required (usually no less than 1×10^7 cells), formaldehyde cross-linking and ultrasonic shearing. If the ChIP experiments ensure high antibody specificity, subsequent high-throughput sequencing with deep coverage to identify peaks without control is both plausible and robust. While depth-of-sequencing issues (Kharchenko et al, 2008) exist in ChIP-seq experiments, our analyses suggest that no less than 10 million effective reads (uniquely mapped) are necessary for PCWC.

Moreover, improvements of ChIP-seq experimental strategies without (or with) control data in an appropriate way may be preferable. Based on our survey, two biologically independent replicates are highly reproducible (as shown in Table 1), as previously suggested (Park, 2009). Therefore, two replicates of ChIP-seq would be sufficient for peak calling. Meanwhile, if PCWC is chosen, integration with other data types will be essential. For example, the integration of ChIP-seq data with gene expression data (including microarray gene expression

Zoological Research www.zoores.ac.cn Peak Identification for ChIP-seq data with no controls E127 data or RNA-seq data) may maximize the interpretation gene ontology; VDR: vitamin D receptor; H3K4me3: H3 of gene regulatory network. trimethylated at lysine 4; ESCs: embryonic stem cells

Abbreviations: ChIP-seq: chromatin Acknowledgments: We are thankful to Shao-Bin immunoprecipitation followed by sequencing; NGS: next XU (Kunming Institute of Zoology, CAS) for his support generation sequencing; ChIP-chip: chromatin on super-computing service, and to Yu-qi ZHAO immunoprecipitation followed by tiling microarray; (Kunming Institute of Zoology, CAS) for his helpful MACS: model-based analysis of ChIP-seq; PCWC: peak discussion. calling without controls; FDR: false discovery rates; GO:
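Supplementary sketch: To make two quantities used in the results concrete, namely the tail cutoff quoted for the E2F1 data (Pr(# reads ≥ 9) ≤ 0.01) and the fraction of peaks shared between two replicates, the following minimal Python sketch may help. It is not the Bayesian model of this study: it simply reads a cutoff off the empirical distribution of per-bin read counts and counts interval overlaps. The function names, the fixed-bin representation, the half-open coordinates, the "at least one shared base" overlap rule and all toy data are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def tail_threshold(bin_counts, alpha=0.01):
    """Return the smallest k whose empirical tail Pr(# reads >= k) is <= alpha."""
    if not bin_counts:
        raise ValueError("bin_counts must not be empty")
    n_bins = len(bin_counts)
    histogram = Counter(bin_counts)
    at_least_k = n_bins  # number of bins holding >= k reads, starting at k = 0
    for k in range(max(bin_counts) + 2):
        if at_least_k / n_bins <= alpha:
            return k
        at_least_k -= histogram.get(k, 0)
    raise RuntimeError("unreachable for alpha >= 0")


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in A sharing at least one base with any peak in B.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    """
    if not peaks_a:
        return 0.0
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in peaks_a:
        # Quadratic scan per chromosome, kept for clarity; a sorted sweep
        # or an interval tree would be preferable for genome-scale peak sets.
        candidates = by_chrom.get(chrom, ())
        if any(b_start < end and start < b_end for b_start, b_end in candidates):
            hits += 1
    return hits / len(peaks_a)


# Toy per-bin read counts (hypothetical): 1000 bins, mostly background.
toy_bins = [0] * 600 + [1] * 250 + [2] * 100 + [3] * 35 + [8] * 5 + [9] * 6 + [12] * 4
cutoff = tail_threshold(toy_bins, alpha=0.01)
enriched_bins = [i for i, count in enumerate(toy_bins) if count >= cutoff]

# Toy replicate peak sets (hypothetical coordinates).
replicate_1 = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80), ("chr2", 500, 600)]
replicate_2 = [("chr1", 150, 250), ("chr2", 70, 90), ("chr2", 600, 700)]
shared = fraction_overlapping(replicate_1, replicate_2)

print(f"cutoff={cutoff}, enriched bins={len(enriched_bins)}, shared fraction={shared:.2f}")
```

With the toy counts above, the cutoff is 9 because 15 of 1000 bins hold at least 8 reads (tail 0.015) while only 10 hold at least 9 (tail 0.01). The overlap fraction depends on which replicate is taken as the reference set, so both directions are worth reporting in practice.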

References

Arata Y, Fujita M, Ohtani K, Kijima S, Kato J-y. 2000. Cdk2-dependent and -independent Pathways in E2F-mediated S Phase Induction. J Biol Chem, 275(9): 6337-6345.

Bailey TL, Elkan C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol, 2: 28-36.

Cairns J, Spyrou C, Stark R, Smith ML, Lynch AG, Tavare S. 2011. BayesPeak--an R package for analyzing ChIP-seq data. Bioinformatics, 27(5): 713-714.

Chen X, Xu H, Yuan P, Fang F, Huss M, Vega VB, Wong E, Orlov YL, Zhang W, Jiang J, Loh YH, Yeo HC, Yeo ZX, Narang V, Govindarajan KR, Leong B, Shahab A, Ruan Y, Bourque G, Sung WK, Clarke ND, Wei CL, Ng HH. 2008. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell, 133(6): 1106-1117.

Chen Y, Negre N, Li Q, Mieczkowska JO, Slattery M, Liu T, Zhang Y, Kim TK, He HH, Zieba J, Ruan Y, Bickel PJ, Myers RM, Wold BJ, White KP, Lieb JD, Liu XS. 2012. Systematic evaluation of factors influencing ChIP-seq fidelity. Nat Methods, 9(6): 609-614.

Choi H, Nesvizhskii AI, Ghosh D, Qin ZS. 2009. Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data. Bioinformatics, 25(14): 1715-1721.

Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA, Boyer LA, Young RA, Jaenisch R. 2010. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci USA, 107(50): 21931-21936.

DeGregori J, Kowalik T, Nevins JR. 1995. Cellular targets for activation by the E2F1 transcription factor include DNA synthesis- and G1/S-regulatory genes. Mol Cell Biol, 15(8): 4215-4224.

Edgar R, Domrachev M, Lash AE. 2002. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res, 30(1): 207-210.

Fejes AP, Robertson G, Bilenky M, Varhol R, Bainbridge M, Jones SJ. 2008. FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics, 24(15): 1729-1730.

Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J. 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol, 5(10): R80.

Guenther MG, Lawton LN, Rozovskaia T, Frampton GM, Levine SS, Volkert TL, Croce CM, Nakamura T, Canaani E, Young RA. 2008. Aberrant chromatin at genes encoding stem cell regulators in human mixed-lineage leukemia. Genes Dev, 22(24): 3403-3408.

Ho JW, Bishop E, Karchenko PV, Negre N, White KP, Park PJ. 2011. ChIP-chip versus ChIP-seq: lessons for experimental design and data analysis. BMC Genomics, 12: 134.

Hower V, Evans SN, Pachter L. 2011. Shape-based peak identification for ChIP-Seq. BMC Bioinformatics, 12: 15.

Huang da W, Sherman BT, Lempicki RA. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc, 4(1): 44-57.

Jothi R, Cuddapah S, Barski A, Cui K, Zhao K. 2008. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res, 36(16): 5221-5231.

Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR. 2002. Large-scale transcriptional activity in chromosomes 21 and 22. Science, 296(5569): 916-919.

Kharchenko PV, Tolstorukov MY, Park PJ. 2008. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol, 26(12): 1351-1359.

Langmead B, Trapnell C, Pop M, Salzberg S. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol, 10(3): R25.

Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J. 2009. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics, 25(15): 1966-1967.

Madigan D, Ridgeway G. 2003. Bayesian data analysis. In: Ye N (ed). The Handbook of Data Mining. CRC Press, USA: 103-131.

Micsinai M, Parisi F, Strino F, Asp P, Dynlacht BD, Kluger Y. 2012. Picking ChIP-seq peak detectors for analyzing chromatin modification experiments. Nucleic Acids Res, 40(9): e70.

Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, Lee W, Mendenhall E, O'Donovan A, Presser A, Russ C, Xie X, Meissner A, Wernig M, Jaenisch R, Nusbaum C, Lander ES, Bernstein BE. 2007. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature, 448(7153): 553-560.

Osmanbeyoglu H, Hartmaier R, Oesterreich S, Lu X. 2012. Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information. BMC Genomics, 13(Suppl 1): S1.

Park PJ. 2009. ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet, 10(10): 669-680.

Ramagopalan SV, Heger A, Berlanga AJ, Maugeri NJ, Lincoln MR, Burrell A, Handunnetthi L, Handel AE, Disanto G, Orton SM, Watson CT, Morahan JM, Giovannoni G, Ponting CP, Ebers GC, Knight JC. 2010. A ChIP-seq defined genome-wide map of vitamin D receptor binding: associations with disease and evolution. Genome Res, 20(10): 1352-1360.

Rozowsky J, Euskirchen G, Auerbach RK, Zhang ZD, Gibson T, Bjornson R, Carriero N, Snyder M, Gerstein MB. 2009. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol, 27(1): 66-75.

Seuter S, Vaisanen S, Radmark O, Carlberg C, Steinhilber D. 2007. Functional characterization of vitamin D responding regions in the human 5-Lipoxygenase gene. Biochim Biophys Acta, 1771(7): 864-872.

Sinkkonen L, Malinen M, Saavalainen K, Vaisanen S, Carlberg C. 2005. Regulation of the human cyclin C gene via multiple vitamin D3-responsive regions in its promoter. Nucleic Acids Res, 33(8): 2440-2451.

Smith AD, Chung W-Y, Hodges E, Kendall J, Hannon G, Hicks J, Xuan Z, Zhang MQ. 2009. Updates to the RMAP short-read mapping software. Bioinformatics, 25(21): 2841-2842.

Spyrou C, Stark R, Lynch AG, Tavare S. 2009. BayesPeak: Bayesian analysis of ChIP-seq data. BMC Bioinformatics, 10: 299.

Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, Schmidt D, O'Keeffe S, Haas S, Vingron M, Lehrach H, Yaspo ML. 2008. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science, 321(5891): 956-960.

Tang C, Shi X, Wang W, Zhou D, Tu J, Xie X, Ge Q, Xiao PF, Sun X, Lu Z. 2010. Global analysis of in vivo EGR1-binding sites in erythroleukemia cell using chromatin immunoprecipitation and massively parallel sequencing. Electrophoresis, 31(17): 2936-2943.

Valouev A, Johnson DS, Sundquist A, Medina C, Anton E, Batzoglou S, Myers RM, Sidow A. 2008. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nat Methods, 5(9): 829-834.

Valouev A, Johnson SM, Boyd SD, Smith CL, Fire AZ, Sidow A. 2011. Determinants of nucleosome organization in primary human cells. Nature, 474(7352): 516-520.

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature, 456(7221): 470-476.

Xu X, Bieda M, Jin VX, Rabinovich A, Oberley MJ, Green R, Farnham PJ. 2007. A comprehensive ChIP-chip analysis of E2F1, E2F4, and E2F6 in normal and tumor cells reveals interchangeable roles of E2F family members. Genome Res, 17(11): 1550-1561.

Yan Z, DeGregori J, Shohet R, Leone G, Stillman B, Nevins JR, Williams RS. 1998. Cdc6 is regulated by E2F and is essential for DNA replication in mammalian cells. Proc Natl Acad Sci USA, 95(7): 3603-3608.

Zella LA, Meyer MB, Nerenz RD, Lee SM, Martowicz ML, Pike JW. 2010. Multifunctional enhancers regulate mouse and human vitamin D receptor gene transcription. Mol Endocrinol, 24(1): 128-147.

Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol, 9(9): R137.


Zoological Research call for papers

The 2nd and 3rd issue, 2013, “Special Issue on Amphibians and Reptiles” and “Special Issue on Fish”

To whom it may concern: We sincerely welcome relevant contributions from you and your research teams. The details are as follows.

1) Contribution scope and requirements
We welcome research on amphibians, reptiles and fish, including integrative biology, molecular biology, genetics, biochemistry, physiology, evolutionary biology, reproductive biology, etc. Research should focus on: the investigation of species diversity, distribution patterns and their formation mechanisms; ecology; behavior and evolution; coevolution between adaptation to the environment and morphology and function; resources and protection. Submissions in both English and Chinese are acceptable, although manuscripts in English are preferred. Reviews and reports should be limited to 10,000 words; there is no strict word limit for research articles.

2) Contact us
You can send your articles by E-mail ([email protected]) or use our online submission system (http://www.zoores.ac.cn/). Please add the label “submission for the special issue” at the front of your Word (.doc) file. Deadline: 31st, Jan. 2013 for Special Issue on Amphibians and Reptiles; 28th, Feb. 2013 for Special Issue on Fish.

We appreciate your support. Thank you!

Zoological Research Editorial Office

Kunming Institute of Zoology, the Chinese Academy of Sciences