Deep Sequencing-Based Transcriptome Profiling Analysis Of
Total Page:16
File Type:pdf, Size:1020Kb
Xiang et al. BMC Genomics 2010, 11:472 http://www.biomedcentral.com/1471-2164/11/472 RESEARCH ARTICLE Open Access Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune- relevant genes in marine fish Li-xin Xiang1,2†, Ding He1,2†, Wei-ren Dong1,2, Yi-wen Zhang1,2, Jian-zhong Shao1,2*† Abstract Background: Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/ Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results: RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish- specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion: This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host-pathogen interactions and evolutionary history of immunogenetics from fish to mammals. Background Further, it is also beneficial in the creation of immune- Since it is a representative population of lower verte- based therapy of severe fish diseases. Great progress in brates serving as an essential link to early vertebrate bioinformatics and genome projects in model organisms, evolution, fish is believed to be an important model in including human (Homo sapiens), mouse (Mus muscu- various developmental and comparative evolutionary lus), frog (Xenopus laevis), chicken (Gallus gallus), and studies [1-3]. Fish immunogenetics has received consid- zebrafish (Danio rerio), has led to the emergence of stu- erable attention due to its essential role in understand- dies focusing on the identification and characterization ing the origin and evolution of immune systems [4-6]. of immune-related genes in teleost fish based on com- parative genomics. These have provided preliminary * Correspondence: [email protected] observations on fish immunogenetics and evolutionary † Contributed equally history of immune systems from lower vertebrates to 1College of Life Science, Zhejiang University, Hangzhou 310058, China Full list of author information is available at the end of the article mammals [7,8]. However, large-scale identification of © 2010 Xiang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Xiang et al. BMC Genomics 2010, 11:472 Page 2 of 21 http://www.biomedcentral.com/1471-2164/11/472 immune-related genes at the genome or transcriptome in marine fish can contribute to the in-depth investiga- levels in fish was seen in limited species (e.g. Danio tion of candidate genes in fish immunity. Results are rerio) due to the inadequate number of high-throughput also expected to improve current understanding of deep sequencing technologies available [9,10]. This is an host-pathogen interactions and evolutionary history of even more difficult problem in non-model fish species immunogenetics from fish to mammals. with totally unknown genome sequences. Recently developed RNA deep sequencing technolo- Results gies, such as Solexa/Illumina RNA-seq and Digital gene Aligning raw sequencing reads to non-redundant expression (DGE), have dramatically changed the way consensus immune-related genes in fish are identified because Approximately 34.59 and 33.03 million 75-bp pair end these technologies facilitate the investigation of the (PE) raw reads (submitted to GEO database, Association functional complexity of transcriptomes [11,12]. RNA- No. GSE21721) from the head kidney and spleen tissues Seq refers to whole transcriptome shotgun sequencing of bacteria- and mock-challenged fish, respectively, were wherein mRNA or cDNA is mechanically fragmented, generated using Solexa/Illumina RNA-seq deep sequen- resulting in overlapping short fragments that cover the cing analysis. Repetitive, low-complexity, and low-quality entire transcriptome. DGE is a tag-based transcriptome reads were filtered out prior to assembly of sequence sequencing approach where short raw tags are generated reads for non-redundant consensus. Using Grape soft- by endonuclease. The expression level of virtually all ware, reliable reads were assembled into contigs, which genes in the sample is measured by counting the num- were then compared with all PE reads. Overlap of PE ber of individual mRNA molecules produced from each readswithtwocontigswastakentoindicatethatthe gene. Compared with DGE analysis, the RNA-Seq contigs are short segments of a scaffold. Reads were approach is more powerful for unraveling transcriptome used for gap-filling of these scaffolds to generate final complexity, and for identification of genes, structure of scaffold sequences. Using tgicl and cap3 software pro- transcripts, alternative splicing, non-coding RNAs, and grams, scaffold sequences were assembled into clusters new transcription units. In contrast, the DGE protocol that were then analysed for consensus. A total of is more suitable and affordable for comparative gene 150,125 and 140,330 non-redundant consensus expression studies because it enables direct transcript sequences, ranging from 100 to 2,000-bp, were gener- profiling without compromise and potential bias, thus ated from each group. Then, consensus sequences were allowing for a more sensitive and accurate profiling of merged for DGE analysis. Removal of partial overlapping the transcriptome that more closely resembles the biol- sequences yielded 169,950 non-redundant consensus ogy of the cell [9,13-17]. These two technologies have sequences (Table 1). These sequences provide abundant been used in transcriptome profiling studies for various data on healthy and infected conditions, thus allowing applications, including cellular development, cancer, and for better reference of immune-relevant genes. immune defence of various organisms [10,18-29]. How- ever, they have not been used in immunogenetic analy- Annotation of all non-redundant consensus sequences sis of marine fish species. BLASTX and ESTscan software analysis of 169,950 non- Japanese sea bass (Lateolabrax japonicus)isaneco- redundant consensus sequences revealed that about nomically important marine species widely cultured in fisheries worldwide. Various diseases caused by bacterial and viral pathogens plague this species [30]. High mortal- Table 1 Distribution of non-redundant consensus ity is associated with infection with Vibrio harveyi, a typi- sequences cal gram-negative pathogen of a wide range of marine Consensus Length (bp) Total Number Percentage animals. Infection results in a variety of vibriosis, a com- 100-500 143865 84.65% mon aquatic animal disease associated with high mortal- 500-1000 14313 8.42% ity throughout the world [31]. In L. japonicus, V. harveyi 1000-1500 5475 3.22% infection leads to bacterial septicaemia with muscle ulcer 1500-2000 2849 1.68% as well as subcutaneous and gastroenteritic haemorrhage. ≥2000 3448 2.03% The present study is the first to conduct a transcrip- Total 169950 100.00% tome profiling analysis of V. harveyi-challenged L. japo- nicus using RNA-seq and DGE to gain deep insight into Total bp 57817772 the immunogenetics of marine fish. Bacteria-challenged N50 598 L. japonicus is expected to be a model system for study- Mean 340 ing bacterial immunity in marine fish. Further, a global N50 = median length of all non-redundant consensus sequences survey of anti-bacterial immune defence gene activities Mean = average length of all consensus sequences Xiang et al. BMC Genomics 2010, 11:472 Page 3 of 21 http://www.biomedcentral.com/1471-2164/11/472 48,987 have reliable coding sequences (CDs) [32,33]. seven categories comprised of 219 known metabolic or CD-containing consensus sequences have high potential signalling pathways, including cellular growth, differen- for translation into functional proteins and most of tiation, apoptosis, migration, endocrine, and various them translated to proteins with more than