Zhang et al. Journal of Animal Science and Biotechnology (2019) 10:31
https://doi.org/10.1186/s40104-019-0331-z

RESEARCH Open Access

Genome-wide profiling of RNA editing sites in sheep

Yuanyuan Zhang1,2, Deping Han1, Xianggui Dong1, Jiankui Wang1, Jianfei Chen1, Yanzhu Yao1, Hesham Y. A. Darwish1,3, Wansheng Liu2* and Xuemei Deng1*

Abstract

Background: The widely observed RNA-DNA differences (RDDs) have been found to be due to nucleotide alteration by RNA editing. Canonical RNA editing (i.e., A-to-I and C-to-U editing), mediated by the adenosine deaminases acting on RNA (ADAR) family and the apolipoprotein B mRNA editing catalytic polypeptide-like (APOBEC) family during the transcriptional process, is considered common and essential for the development of an individual. To date, an increasing number of RNA editing sites have been reported in humans, rodents, and some farm animals; however, genome-wide detection of RNA editing events in sheep has not been reported. The aim of this study was to identify RNA editing events in sheep by comparing the RNA-seq and DNA-seq data from three biological replicates of the kidney and spleen tissues.

Results: A total of 607 and 994 common edited sites within the three biological replicates were identified in the ovine kidney and spleen, respectively. Many of the RDDs were specific to an individual. The RNA editing-related genes identified in the present study may have evolved for specific biological functions in sheep, such as structural constituent of the cytoskeleton and microtubule-based processes. Furthermore, the edited sites found in the ovine BLCAP and NEIL1 genes are in line with those in previous reports on the porcine and human homologs, suggesting that evolutionarily conserved RNA editing sites exist and may play an important role in the structure and function of genes.

Conclusions: Our study is the first to investigate RNA editing events in sheep.
We identified 607 and 994 RNA editing sites in three biological replicates of the ovine kidney and spleen and annotated 164 and 247 genes in the kidney and spleen, respectively. The gene function and conservation analysis of these RNA editing-related genes suggests that RNA editing is associated with important gene function in sheep. The putative functionally important RNA editing sites reported in the present study will help future studies on the relationship between these edited sites and the genetic traits in sheep.

Keywords: DNA resequencing, RNA-DNA differences, RNA editing, RNA-seq, Sheep

* Correspondence: [email protected]; [email protected]
2Department of Animal Science, Pennsylvania State University, University Park, PA 16802, USA
1Key Laboratory of Animal Genetics, Breeding and Reproduction of the Ministry of Agriculture & Beijing Key Laboratory of Animal Genetic Improvement, China Agricultural University, Beijing 100193, China
Full list of author information is available at the end of the article

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

According to the central dogma, it is assumed that the sequence of mRNA truthfully reflects that of the DNA template. However, we know that RNA sequences are not coded one-to-one by their corresponding DNA sequences. RNA modifications during co- and post-transcriptional processes contribute to the complexity of alternative splicing and of the sequences and structures of the mature RNA molecules. One of the most important mechanisms is RNA editing, which results in RNA–DNA differences (RDDs), e.g., codon changes leading to protein variants [1]. It has been shown that base modification at the RNA level plays an important role in post-transcriptional regulation, enhancing the complexity of transcripts and altering the function of genes.

The most commonly observed RNA editing event in humans is A-to-I editing (inosine is normally interpreted as guanosine during reverse transcription and by sequencing enzymes, so the event is recognized as A-to-G). The genes of the adenosine deaminases acting on RNA (ADAR) family have been shown to mediate the A-to-I editing process by binding double-stranded (ds) RNA secondary structures [2]. Another widely described type of RNA editing in mammals is C-to-U conversion, mediated by the catalytic deaminase apobec-1 and the apobec-1 complementation factor (ACF) of the apolipoprotein B mRNA editing catalytic polypeptide-like (APOBEC) family. C-to-U editing is relatively rare in the human transcriptome [3]. Both A-to-I and C-to-U events are considered canonical RNA editing types, whereas the other types of RNA editing are generally considered false positive detections in human genome-wide scanning studies [4–6]. However, there are some reports on the importance of the non-canonical editing types [7, 8].

According to the Rigorously Annotated Database of A-to-I RNA Editing (RADAR) [9], only a few dozen human RNA editing targets that change amino acids in non-repetitive regions have been identified. An example is the initially detected RNA editing event at the Q/R site of glutamate ionotropic receptor AMPA type subunit 2 (GluR-2), which results in an amino acid substitution by converting a glutamine codon into an arginine codon [10]. To date, the development of high-throughput sequencing technologies, such as next-generation sequencing, has facilitated the discovery of RNA editing events and their functional mechanisms in different organisms. In addition to studies on human tissues, several RNA editing sites have been reported in farm animals such as pig [11] and chicken [12].

To the best of our knowledge, there is no report on genome-wide scanning of RNA editing sites in sheep. In the present study, for the first time, we applied DNA and RNA resequencing data to identify potential editing events in sheep. To minimize false positives, we restricted our search for RNA editing sites to homogenous genomic DNA (gDNA) sequences, while heterogenous RNA sequences were also included as putative RDDs. In addition, crucial thresholds on RNA and DNA mapping quality and genotype quality were set for the detection of RDDs in order to eliminate uncertain reads and single-nucleotide variants (SNVs). Furthermore, we excluded the RDDs located in repetitive regions, which may result in mis-mapped reads and variant call errors [13, 14]. Overall, this study aims to explore reliable RNA editing events in sheep and to provide more information for RNA-editing databases of mammals.

Methods

Sample collection

The kidney and spleen tissues and blood samples were obtained from three adult (2 years old) male Lanping sheep (a Chinese indigenous sheep breed) from the same population in Yunnan Province of China. These animals were unrelated according to their pedigree records.

[...] phenol–chloroform extraction method. The total RNA was extracted and purified using the RNA Sample Total Kit (Qiagen, Germany). The quality of gDNA and RNA was evaluated using NanoDrop 2000 (Thermo Fisher Scientific) and by agarose gel electrophoresis (1.2% agarose gel).

DNA/RNA sequencing

For each DNA sample, a whole-genome sequencing library was built using the Illumina TruSeq DNA Sample Preparation Kit (Illumina, Inc., San Diego, CA, USA) with an insert size of ~350 bp. The libraries were sequenced on the Illumina HiSeq 2000 platform, generating paired-end reads of 100 bp for each fragment.

The total RNA from each sample was used as input for the TruSeq RNA Prep Kit (Illumina, Inc.), and a sequencing library with indexed adapters was created according to the manufacturer’s instructions. RNA sequencing was conducted on the Illumina HiSeq 2500 platform, resulting in paired-end 100-bp reads. The insert length of the RNA-seq library ranged from 300 to 400 bp. The DNA- and RNA-seq data have been deposited in the National Center for Biotechnology Information (NCBI) SRA Database (accession number: SRP133430).

Mapping and variant calling strategies

The raw reads of both DNA- and RNA-seq data were first checked using the FastQC tool [15] and trimmed using Trimmomatic [16]. The first six bases of each read were discarded to avoid artificial mismatches derived from random-hexamer priming; adaptors and low-quality nucleotides (quality < 30) were also removed. The resulting clean reads were prepared for alignment against the sheep reference genome (OARv4.0).

For the RNA-seq data, the reads were mapped to the sheep reference genome using the two-pass mapping strategy in STAR (v2.6) [17], and only uniquely mapped reads were extracted based on the mapping quality (MAPQ = 255). We then used the ReassignOneMappingQuality read filter of GATK (v4.0.12.0) [18] to reassign the mapping quality scores (MAPQ) of all unique alignments from 255 to the GATK default value of 60. The BWA (v0.7.17) “mem” method [13] with default parameters was used to align DNA reads against the sheep reference genome (OARv4.0). Polymerase chain reaction duplicates were removed from the resulting DNA and RNA mapped BAM files using the MarkDuplicates tool from GATK (v4.0.12.0) [18]. The results from the two mapping procedures were merged into a single BAM file.

Joint variant calling was conducted on the BAM file using Samtools/Bcftools [19, 20] to construct a prelimin-
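To make two of the read-handling rules from the mapping strategy concrete, the following is a minimal Python sketch, not the authors' pipeline. It illustrates (1) discarding the first six bases of each read and (2) keeping only alignments with MAPQ = 255 and rewriting their MAPQ to 60. The `Alignment` record and both function names are hypothetical helpers introduced only for illustration; in the study these steps were carried out with Trimmomatic, STAR, and GATK, and the quality-based trimming (quality < 30) is omitted here because its exact settings are not given.

```python
from dataclasses import dataclass, replace

# Values taken from the text; the names themselves are illustrative.
HEAD_CLIP = 6            # first six bases discarded (random-hexamer priming)
STAR_UNIQUE_MAPQ = 255   # MAPQ used in the text to select uniquely mapped reads
GATK_DEFAULT_MAPQ = 60   # MAPQ that unique alignments are reassigned to


@dataclass(frozen=True)
class Alignment:
    """Hypothetical minimal alignment record (read name and MAPQ only)."""
    name: str
    mapq: int


def clip_read_start(seq: str, qual: str, n: int = HEAD_CLIP) -> tuple[str, str]:
    """Drop the first n bases and their quality characters from a read."""
    return seq[n:], qual[n:]


def keep_unique_and_reassign(alignments: list[Alignment]) -> list[Alignment]:
    """Keep alignments with MAPQ 255 and return copies with MAPQ set to 60."""
    return [
        replace(aln, mapq=GATK_DEFAULT_MAPQ)
        for aln in alignments
        if aln.mapq == STAR_UNIQUE_MAPQ
    ]


if __name__ == "__main__":
    seq, qual = clip_read_start("ACGTACGTAC", "IIIIIIIIII")
    print(seq, qual)
    reads = [Alignment("r1", 255), Alignment("r2", 3), Alignment("r3", 255)]
    print(keep_unique_and_reassign(reads))
```

Filtering before reassignment matters in this sketch: once every retained alignment carries MAPQ 60, the original distinction between unique and multi-mapped reads can no longer be recovered from the MAPQ field alone.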