Mol Genet Genomics
DOI 10.1007/s00438-016-1233-9

ORIGINAL ARTICLE

Transcriptome sequencing and de novo characterization of Korean endemic land snail, Koreanohadra kurodana for functional transcripts and SSR markers

Se Won Kang1 · Bharat Bhusan Patnaik1,2 · Hee‑Ju Hwang1 · So Young Park1 · Jong Min Chung1 · Dae Kwon Song1 · Hongray Howrelia Patnaik1 · Jae Bong Lee3 · Changmu Kim4 · Soonok Kim4 · Hong Seog Park5 · Yeon Soo Han6 · Jun Sang Lee7 · Yong Seok Lee1

Received: 10 February 2016 / Accepted: 25 July 2016
© Springer-Verlag Berlin Heidelberg 2016

Abstract
The Korean endemic land snail Koreanohadra kurodana (Gastropoda: Bradybaenidae), found in humid areas of broadleaf forests and shrubs, has been considered vulnerable as the number of individuals has been declining in recent years. The species is poorly characterized at the genomic level, which limits the understanding of functions at the molecular and genetic levels. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset of visceral mass tissue of K. kurodana by the Illumina paired-end sequencing technology. Over 234 million quality reads were assembled to a total of 315,924 contigs and 191,071 unigenes, with average and N50 lengths of 585.6 and 715 bp for contigs and 678 and 927 bp for unigenes, respectively. Overall, 36.32 % of the unigenes matched known protein/nucleotide sequences in the public databases. The assignment of the unigenes to functional categories was determined using COG, GO, KEGG, and InterProScan protein domain search. The GO analysis resulted in 22,967 unigenes (12.02 %) being categorized into 40 functional groups. The KEGG annotation revealed that metabolism pathway genes were enriched. The most prominent protein motifs include the zinc finger, ribonuclease H, reverse transcriptase, and ankyrin repeat domains. The simple sequence repeats (SSRs) identified from unigenes of >1 kb length show a predominance of dinucleotide repeat motifs, followed by tri- and tetranucleotide motifs. A number of unigenes were putatively assessed to belong to adaptation and defense mechanisms including heat shock protein 70, Toll-like receptor 4, AMP-activated protein kinase, aquaporin-2, etc. Our data provide a rich source for the identification and functional characterization of new genes and candidate polymorphic SSR markers in K. kurodana. The availability of transcriptome information (http://bioinfo.sch.ac.kr/submission/) would promote the utilization of the resources for phylogenetics study and genetic diversity assessment.

Keywords Koreanohadra kurodana · Transcriptomics · De novo assembly · Functional annotation · Simple sequence repeats

Communicated by S. Hohmann.

S. W. Kang and B. B. Patnaik contributed equally to this work.

Electronic supplementary material The online version of this article (doi:10.1007/s00438-016-1233-9) contains supplementary material, which is available to authorized users.

* Yong Seok Lee
[email protected]

1 Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang‑myeon, Asan, Chungcheongnam‑do 31538, Korea
2 Trident School of Biotech Sciences, Trident Academy of Creative Technology (TACT), Chandaka Industrial Estate, Chandrasekharpur, Bhubaneswar, Odisha 751024, India
3 Korea Zoonosis Research Institute (KOZRI), Chonbuk National University, 820‑120 Hana‑ro, Iksan, Jeollabuk‑do 54528, Korea
4 National Institute of Biological Resources, 42, Hwangyeong‑ro, Seo‑gu, Incheon 22689, Korea
5 Research Institute, GnC BIO Co., LTD., 621‑6 Banseok‑dong, Yuseong‑gu, Daejeon 34069, Korea
6 College of Agriculture and Life Science, Chonnam National University, 77 Yongbong‑ro, Buk‑gu, Gwangju 61186, Korea
7 Institute of Environmental Research, Kangwon National University, 1 Kangwondaehak‑gil, Chuncheon‑si, Gangwon‑do 243341, Korea

Introduction

Gastropods represent one of the most successful classes of living molluscs having greatly reduced shells or a complete absence of shells. From an evolutionary standpoint it is perplexing, though on account of parallel development of such an adaptation trait, it relates to a strong advantage for colonization on land. The Gastropod class of molluscs has invaded the land with more than 62,000 described living species, comprising about 80 % of living molluscs. The land snails are the most numerous with close to 35,000 described species, mostly inhabiting the tropics. Still, many species remain undescribed mainly due to their minute sizes and under-exploration of extreme terrestrial ecozones. The land snail family Bradybaenidae, identified on the basis of the presence of dart sac and dart-related organs, is a relatively younger taxon than the closely related family Camaenidae (Hirano et al. 2014). The bradybaenids are extremely speciose, with distribution in Northwest America and Europe, Southeast and East Asia including Japan, Korea, Taiwan, Indonesia, Philippines, and China (Chen and Zhang 2004; Wu et al. 2007; Kuznik-Kowalska et al. 2013; Hirano et al. 2014; Huang et al. 2014). A study by Kwon et al. (1993) classified the Korean Bradybaenidae snails into 24 species. Among the land snail fauna of Jeju Island, the family Bradybaenidae is one of the largest, with eight reported species (Noseworthy et al. 2007). Under bradybaenid snails, Koreanohadra has been classified as a genus of its own by Min et al. (2004) and accorded synonyms of Fruticicola and Karaftohelix by Schileyko (2004) and Sysoev and Schileyko (2009), respectively. Koreanohadra kurodana is a bradybaenid land snail species endemic to Korea. The species is found in humid areas of broadleaf forests with shrubs and is known from Gangwon-do and Gyeonggi-do provinces of Korea. Chromosome numbers and karyotype of K. kurodana reported by Lee and Kwon (1993) suggest that it possesses 11 pairs of metacentric chromosomes, 17 sub-metacentric and one telocentric chromosome. Such an analysis was referred to when reporting the 29 chromosome pairs in another species from the same genus, Koreanohadra koreana (Park 2011).

Koreanohadra kurodana has been assessed as vulnerable (≤10 species occupied within <2000 km²) by the Korean Red List of Threatened Species, 2014. A continual decline has been observed in the area of occupancy of the species due to predation by natural enemies, indiscriminate collection for ornamental use, and change and loss of natural habitats caused by forest development. The species has been threatened with extinction owing to a lack of regional conservation measures. With genomic information lacking for the species, an evaluation of population genetic structure, genetic diversity, and possible evolutionary responses to environmental perturbations and climate change is not feasible. The National Centre for Biotechnology Information (NCBI) database only shows the sequence of mitochondrial protein cytochrome c oxidase I (CO-1) under the synonym name Fruticicola koreana.

Next-generation sequencing (NGS) such as Illumina sequencing by synthesis (SBS) allows for massive characterization of the transcriptome by generation of millions of short reads. The short reads produced provide a more sensitive application for the detection of differential expression compared to conventional gene arrays (Marioni et al. 2008; Collins et al. 2008). The Illumina platform has become increasingly cost-effective and has been successfully used to characterize the transcriptome of non-model species with greater precision (Wang et al. 2010, 2012; Meng et al. 2013; Yue et al. 2015; Patnaik et al. 2015; Hwang et al. 2016). The Illumina NGS technology has also greatly benefitted from the development of algorithms and experiments for de novo assembly (Martin and Wang 2011; Ekblom and Wolf 2014). The Illumina sequencing technology has been utilized for the transcriptome characterization of molluscs and annotation of genes suggested to participate in various metabolic, cellular and immune pathways (Castellanos-Martinez et al. 2014; Deng et al. 2014; Patnaik et al. 2016a). Furthermore, molecular markers such as single nucleotide polymorphism (SNP) and simple sequence repeats (SSR) have been detected from de novo assembled sequences of Illumina-based transcriptome of molluscs (Franchini et al. 2011; Patnaik et al. 2016b). In this study, we have used Illumina HiSeq 2500 sequencing to generate a transcriptome of K. kurodana. We have conducted de novo assembly of the transcriptome and used bioinformatics analysis to annotate the functional direction of the assembled sequences. In doing so, we have suggested the predicted functions based on database classification structure. This is the first large-scale transcriptome information for K. kurodana that could be useful in providing valuable resources towards functional gene discovery and development of molecular markers for the conservation of the species.

Materials and methods

Species collection and RNA extraction

Specimens of K. kurodana were collected from the mountain near Maha-ri, Mitan-myeon, Pyeongchang-gun, Gangwon-do, Republic of Korea in August 2014. No prior permission or ethical clearance was required as the species is endemic to Korea and listed under the 'vulnerable' category of Korean Red List KORED assessments.

The reads, thus, obtained were assembled using Trinity software with default options (minimum allowed length of 200 bp) (Grabherr et al. 2011). A K-mer length of 25
As per the regulations of National