FEBS Letters 583 (2009) 1171–1174

journal homepage: www.FEBSLetters.org

Conserved recoding RNA editing of vertebrate C1q-related factor C1QL1

Christina P. Sie, Stefan Maas *

Department of Biological Sciences, Lehigh University, 111 Research Drive, Iacocca Hall D226, Bethlehem, PA 18015-4732, United States

Article info

Article history:
Received 13 February 2009
Accepted 28 February 2009
Available online 9 March 2009

Edited by Michael Ibba

Keywords:
RNA editing
Inosine
Adenosine deaminase acting on RNA
Double-stranded RNA
Complement component 1, q subcomponent-like 1
Single-nucleotide polymorphism

Abstract

A-to-I RNA editing can lead to recoding of pre-mRNAs with profound functional consequences for the ensuing proteins. Here we show that complement component 1, q subcomponent-like 1 (C1QL1) undergoes RNA editing in vivo causing non-synonymous amino acid substitutions in human, mouse as well as zebrafish. The major editing site had previously been annotated as a single-nucleotide polymorphism in human, but our analysis reveals that post-transcriptional modification is the cause for the sequence variation. Remarkably, although editing of C1QL1 is conserved across vertebrate species, the predicted RNA secondary structure mediating editing involves different regions in zebrafish versus mammals.

© 2009 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

Abbreviations: ADAR, adenosine deaminase acting on RNA; C1QL1, complement component 1, q subcomponent-like 1; SNP, single-nucleotide polymorphism; ECS, editing site complementary sequence

* Corresponding author. Fax: +1 610 758 4004. E-mail address: [email protected] (S. Maas).

0014-5793/$36.00 © 2009 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.febslet.2009.02.044

1. Introduction

The posttranscriptional processing of pre-mRNAs by A-to-I modification has been recognized as an important mechanism for generating RNA and protein diversity, as edited and non-edited gene products are usually produced side-by-side within the same cell (for review see [1–3]). If A-to-I RNA editing occurs within coding sequences, single amino acid substitutions can be the consequence since inosine is interpreted as guanosine by the translational machinery. Several mammalian genes have been described where the substitution of a single amino acid due to RNA editing leads to a significant alteration in protein function (reviewed in [1]). The deficiency or misregulation of A-to-I editing has been implicated in the etiology of several human disease phenotypes (for review see [4]).

In particular, neurotransmitter receptors and other brain-specific transcripts are among the previously characterized recoding targets for editing. Adenosine deaminases acting on RNA (ADARs) are responsible for the site-selective modification of adenosine residues to inosine. Their target specificities and enzymatic regulation are not well understood. ADARs harbor double-stranded (ds) RNA binding domains with high affinity for dsRNA structures, but known physiological targets for editing lack recognizable primary sequence motifs or recurring RNA folds that would allow straightforward prediction of editing sites.

Since editing alters coding information post-transcriptionally, the genomic sequences of an affected gene are indistinguishable from those of a gene that does not undergo editing. It is therefore important to discriminate DNA-based gene variations (genomic SNPs) from variations in gene products (RNA or protein) that originate from recoding events on the level of the RNA transcripts. Single-nucleotide polymorphisms (SNPs) are important molecular markers that link sequence variations to phenotypic changes. They constitute the most frequent type of genetic variation in the human genome [5]. Among the millions of validated genomic SNPs, some polymorphisms have been deduced from analyzing only expressed sequences [6,7]. Therefore, absent genomic validation, it is possible that such variations result from RNA editing events [8].

Focusing on recoding events, we recently performed a screen for A-to-I RNA editing candidates and identified two genes, the splicing factor SRp25 and the insulin-like growth factor binding protein IGFBP7, that are subject to RNA editing at specific sites within their coding region and had previously been falsely annotated as SNPs [9].

Here we show that complement component 1, q subcomponent-like 1 (C1QL1) undergoes A-to-I RNA editing within its open reading frame leading to non-synonymous codon changes in human, mouse and zebrafish transcripts. One of the experimentally validated positions, changing a glutamine to an arginine codon in human, emerged as a high scoring candidate editing site in our database screen [9]. The predicted T63A and Q66R amino acid substitutions may impact protein oligomerization as they are situated immediately prior to a collagen-like trimerization domain. Intriguingly, zebrafish C1QL1 also undergoes RNA editing at the Q66R site, but uses a different RNA secondary structure that mediates editing.

[Fig. 2, panel A: schematic of the C1QL1 protein with three domains (hydrophobic sequence, collagen-like domain, C1q globular domain). Amino acid sequence around the editing sites: SEQSGAPPPST LVQ GPQ GKPGRTG. RNA sequence: 5′-CCU UCC ACG CUG GUG CAG GGC CCC CAG-3′; edited codons ICG, CIG, CIG; recoding T→A, Q→R, Q→R.]
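Because inosine is read as guanosine during translation, the codon changes reported for C1QL1 can be checked by substituting G for the edited A and translating both versions. The following is a minimal sketch of that check, not part of the original study; the function names `edit_codon` and `recoding` are illustrative, and the codon table holds only the four codons needed here rather than the full genetic code.

```python
# Minimal sketch: compare a codon before and after A-to-I editing,
# treating inosine (I) as guanosine (G), as the translational machinery does.

# Only the codons needed for this example; not a full genetic code table.
CODON_TABLE = {
    "ACG": "T",  # threonine
    "GCG": "A",  # alanine
    "CAG": "Q",  # glutamine
    "CGG": "R",  # arginine
}


def edit_codon(codon: str, position: int) -> str:
    """Return the codon as read after editing the adenosine at `position` (0-based)."""
    if codon[position] != "A":
        raise ValueError(f"no adenosine at position {position} of {codon}")
    return codon[:position] + "G" + codon[position + 1:]


def recoding(codon: str, position: int) -> str:
    """Describe the amino acid change caused by editing one adenosine."""
    edited = edit_codon(codon, position)
    return f"{CODON_TABLE[codon]}->{CODON_TABLE[edited]}"


print(recoding("ACG", 0))  # T->A
print(recoding("CAG", 1))  # Q->R
```

Editing the first position of ACG and the second position of CAG reproduces the threonine-to-alanine and glutamine-to-arginine changes described for the C1QL1 sites.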

2. Materials and methods

2.1. Databases and data analysis

Annotations for human single-nucleotide polymorphisms (SNPs) from the dbSNP database [5] build 125 were downloaded using the UCSC genome table browser [10]. For subsequent analysis of candidate genes the UCSC human genome browser (assembly May 2004) was used. RNA secondary structures were predicted using the M-fold algorithm [11] and multiple sequence alignments were done with clustal W 1.8.

The bioinformatics screening procedure for the prediction and scoring of candidate editing sites within the human transcriptome is described in detail in [9] and summarized in our Supplementary data.

2.2. RNA editing analysis

For experimental validation, gene-specific fragments of cDNA as well as genomic regions were amplified by PCR and subjected to dideoxy sequencing as described previously [12]. Human brain total RNA and gDNA isolated from the same specimen (Biochain, CA) were used and processed using a standard protocol for reverse transcription. Amplification of C1QL1 cDNA was performed using an optimized PCR protocol with Phire Hot Start DNA Polymerase (NEB). For detailed PCR protocols and a list of DNA oligonucleotides used see Supplementary data. PCR products were gel-purified with QIAEX II Gel Extraction Kit (QIAGEN) and subjected to dideoxy sequencing (Geneway Research).

For further analysis PCR products were subcloned into pBluescript II (Stratagene) vector and individual DNA templates were purified and sequenced.

For analysis of mouse C1QL1 total RNA and genomic DNA from cortex and cerebellum of adult mice were prepared using standard procedures and analyzed following a similar RT-PCR strategy as described for human C1QL1. Danio rerio gDNA and total RNA were isolated from adult and hatchlings (four day post-fertilization) using the same procedures. See Supplementary data for detailed protocols, statistical analysis and primer sequences.

Fig. 2. Editing of human and mouse complement component 1, q subcomponent-like 1 (C1QL1). (A) A schematic representation of the C1QL1 protein is shown with the three main functional domains indicated. The amino acid sequence surrounding the editing sites is shown, and recoding events are indicated both at the amino acid and at the RNA sequence level. (B) Representative sequence tracks from subcloning of mouse cerebellum C1QL1. The genomic sequence is at the top. The positions of the three editing sites are boxed. (C) Representative sequence tracks from subcloning of human C1QL1. The genomic sequence is at the top. The two editing sites are boxed. For in vivo editing levels based on all clones analyzed, please refer to Table S2.
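Editing levels derived from subcloned PCR products amount to the fraction of individual clones that read G at a position where the genomic DNA reads A. The sketch below illustrates that calculation under stated assumptions: the function name `editing_level` is hypothetical, and the clone calls are invented for illustration and are not data from this study (the measured values are in Table S2).

```python
def editing_level(clone_bases):
    """Percent of informative clones that read G at a genomically encoded A."""
    edited = sum(1 for base in clone_bases if base == "G")
    unedited = sum(1 for base in clone_bases if base == "A")
    total = edited + unedited
    if total == 0:
        raise ValueError("no informative clones at this site")
    return 100.0 * edited / total


# Hypothetical base calls at one site from 50 sequenced clones
# (illustrative only, not data from the study).
calls = ["G"] * 9 + ["A"] * 41
print(f"{editing_level(calls):.0f}% edited")  # 18% edited
```

Clones showing any base other than A or G at the site are ignored in this sketch; whether such clones should instead be flagged as sequencing errors is a design choice left open here.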

Human

     A -- A GCGC CG CGC CCGA C A --- AG
5’-  GUGCG CCC CU CCCC GGGGCCC GCGCCGGCG GGA CGG GGCG CGCC CUG C
     ||||| ||| || |||| ||||||| ||||||||| ||| ||| |||| |||| ||| G
3’-  CACGC GGG GA GGGG CCCCGGG CGUGGUCGC CCU GCC CCGC GCGG GAC A
     C CC A -AC- -A -A- UCC- C - CGA G

5’- exon 1 exon 2 -3’

Mouse

     U C A -- A GCGC CG CGC UCCGA C -A- G AG
5’-  GUGCG CCC CU UCCC GGGGCCC GCGCCGGCG GG CGG GGCG CGCU UG C
     ||||| ||| || |||| ||||||| ||||||||| || ||| |||| |||| || G
3’-  CACGC GGG GA GGGG CCCCGGG CGUGGUCGC CC GCC CCGC GCGA AC A
     C CC A -AC- -A -A- UCCC- U GUG G G C

Fig. 1. Secondary structure predictions of human, mouse and rat complement component 1, q subcomponent-like 1 (C1QL1). Two thousand five hundred nucleotides of the pre-mRNA sequences starting from their known 5′-ends were analyzed by M-fold [11] and the structures surrounding the putative editing sites (bold and underlined) are shown. Within the mouse sequence, bases that differ from human are shaded and those that are different in rat are shown above the mouse sequence (shaded and boxed).
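The stems in Fig. 1 are drawn as two antiparallel strands, with the upper strand written 5′ to 3′ and the lower strand 3′ to 5′. The sketch below checks whether aligned positions of such a stem can pair, allowing Watson-Crick pairs and the G·U wobble pair. It is a simple consistency check on the printed structure, not a substitute for the M-fold prediction [11]; the function name `paired_fraction` is illustrative.

```python
# Watson-Crick pairs plus the G-U wobble pair commonly found in RNA duplexes.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def paired_fraction(top_5to3: str, bottom_3to5: str) -> float:
    """Fraction of aligned positions that can form a base pair."""
    if len(top_5to3) != len(bottom_3to5):
        raise ValueError("strands must have equal length")
    matches = sum((a, b) in PAIRS for a, b in zip(top_5to3, bottom_3to5))
    return matches / len(top_5to3)


# Two stem segments as printed for the human structure in Fig. 1.
print(paired_fraction("GUGCG", "CACGC"))          # 1.0
print(paired_fraction("GCGCCGGCG", "CGUGGUCGC"))  # 1.0
```

In the nine-base-pair segment, two of the positions are G·U wobble pairs, which is why the wobble pair is included in the allowed set.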

3. Results and discussion

3.1. A-to-I RNA editing in mammalian C1QL1

We recently completed a bioinformatics based exploration of human genome and transcriptome sequence databases predicting candidate sites for A-to-I editing within human mRNAs [9]. Predicted candidates are experimentally validated through parallel analysis of cDNA and genomic DNA isolated from the same individual to rule out polymorphisms. The experimental strategy employs standard reverse transcription and PCR to amplify gene-specific regions that span the predicted editing site followed by sequence analysis to determine the ratio of edited-to-unedited templates. Within the highest scoring group of predicted target sites we showed that three out of four positions are bona fide RNA editing recoding sites affecting two genes, splicing factor SRp25 and insulin-like growth factor binding protein IGFBP7 [9]. None of the lower scoring candidates that we evaluated experimentally (total of 64 sites) showed detectable RNA editing in human brain tissue.

Table S1 shows the top scoring sites for RNA editing within a subset of 554 human sequences that had previously been annotated as SNPs based solely on expressed sequence data. Human C1QL1 is a high scoring candidate (Table S1). However, standard PCR did not allow productive amplification of the human C1QL1 cDNA fragment, probably due to its high G/C content. We used total mouse RNA to optimize reaction conditions for an RT-PCR protocol that would allow RNA editing analysis. Not only is the mouse C1QL1 cDNA highly conserved relative to the human sequence, but the predicted RNA secondary structures of mouse, rat and human exon 1 sequences are also the same (Fig. 1). Therefore, we hypothesized that RNA editing at the candidate position in human would also be conserved in the mouse C1QL1 orthologue. We isolated total RNA from both cortex and cerebellum of two mouse brains.

Use of a special DNA polymerase, a special buffer and optimized amplification protocols allowed us to obtain a specific amplicon for C1QL1 cDNA. The analysis of purified cortex and cerebellum samples revealed three positions of RNA editing within the same exon of C1QL1, all causing non-synonymous codon changes. The Q66R site had been predicted by our computational screen ([9] and Table S1), while the others alter a threonine (ACG) to an alanine codon (GCG) and a glutamine (CAG) to arginine (CGG) codon, respectively. Table S2 summarizes the editing levels measured at the three sites within the two mice. Intriguingly, in both specimens, editing levels in cerebellum are substantially different from those in cortex, arguing for tissue-specific regulation of editing. The Q66R site is edited to 10% or 17% in cerebellum, whereas it is edited to only 1–3% in cortex. Q69R only showed evidence of editing in cerebellum (3–7%) and the T63A site is modified 1–2% in both cortex and cerebellum (Table S2).

Taken together, we confirmed that all three sites in mouse C1QL1 undergo RNA editing in vivo leading to non-synonymous codon changes. We then moved to the analysis of human specimens, applying the optimized protocols to several human brain RNAs. One sample showed high levels of modification at T63A (18% editing) and Q66R (56% editing) (Fig. 2, Table S2).

Since for this human specimen the genomic counterpart was not available, we analyzed three additional specimens from human brain together with the corresponding genomic DNA. In two of the three cases we confirmed the occurrence of RNA editing at the Q66R position since the genome samples displayed an adenosine at both sites in C1QL1 (Fig. 2 and Table S2). In these two human specimens, editing at the T63A site was not detectable. Although we can therefore formally not rule out that in human the T63A site represents a previously unknown gSNP, our results from analysis of mouse C1QL1 and the complete conservation of the predicted RNA secondary structure surrounding the editing sites argue that, like in mouse, the T63A position is also an A-to-I editing site in human.

The observed variation in editing levels at both the major and minor sites in human specimens may be due to regional and/or temporal regulation of C1QL1 editing, similar to that observed for other

A

    ------(((((-(((((-((((------(((((((--(((((((((----((-----(--((-((((-(((((((------)))---)--
hu  GATGCTGGGCACCTGCCGCATGGTGTGCGACCCCTACCCCGCGC---GGGGCCCCGGCGCCGGCGCG-CGGACCGAC--GGCGGCGACGCCCTGAGCGAGCAGAGCGGC
mo  GATGCTGGGCACCTGCCGCATGGTGTGCGACCCCTATCCCGCGC---GGGGCCCCGGCGCCGGCGCG-CGGTCCGAC--GGCGGCGACGCTCTGAGCGAGCAGAGCGGT
ra  GATGCTGGGCACCTGCCGCATGGTGTGCGACCCTTATCCCGCGC---GGGGCCCCGGCGCCGGCGCG-CGGTCCGAC--GGCGGCGACGCTCTGAGCGAGCAGAGCGGT
cow GATGCTGGGCACCTGCCGCATGGTGTGTGACCCCTACCCCGCGC---GGGGCCCCGGCGCCGGCGCG-CGGCCCGAC--GGCGGCGACGCCCTGAGCGAGCAGAGCGGC
dog GATGCTGGGCACCTGCCGCATGGTATGCGACCCCTACCCCGCGC---GGGGCCCCGGCGCCGGCGCT-CGGCCCGAC--GGCGGCGACGCCCTGAGCGAGCAGAGTGGC
fug GATGTTGGGCACCTGTCGTATGGTGTGCGACCCCTACCTGAACA-19nt-CACCAGTTCCA-CCGGTC---TTCAGGCTGAGGCTGAGGCATTGAGTGACCACAGCAAT
zeb GATGCTGGGCACCTGTCGAATGGTGTGCGATCCATACCAGAACA-19nt--CGCCAGCACCGGCTCTTCTGTACAGGCCGAGGCCGAGGCTCTGGCCGACCACAGCAAC
    ****************** *********** ** ** * * * * * * * * * * ** ** ** ** ** **_____

    ))))-)))----))-)))))---))))-)))))))--))))-))--)))-)))))
hu  GCGCCCCCGCCTTCCACGCTG---GTGCAGGGCCCCCAGGGGAAGCCGGGCCGCACCGGCAAGCCCGGCCCTCCGGGGCCTCCCGGGGACCCAGGTCCTCCCGGCCCTG
mo  GCGCCTCCGCCCTCCACGCTG---GTGCAGGGCCCCCAGGGGAAGCCGGGCCGCACGGGCAAGCCGGGCCCTCCGGGGCCTCCAGGAGACCGGGGCCCTCCAGGTCCTG
ra  GCGCCCCCGCCCTCCACGCTG---GTGCAGGGCCCCCAGGGGAAGCCGGGCCGCACCGGCAAGCCAGGCCCCCCCGGGCCTCCAGGAGACCGGGGACCTCCAGGTCCTG
cow GCGCCCCCGCCCTCCACGCTG---GTGCAGGGCCCCCAGGGGAAGCCGGGACGCACAGGCAAGCCGGGCCCCCCTGGGCCCCCCGGGGACCCAGGTCCTCCGGGTCCTG
dog GCGCCCCCGCCTTCCACGCTG---GTGCAGGGCCCCCAGGGGAAACCGGGCCGCACAGGCAAGCCGGGCCCCCCGGGGCCTCCCGGGGACCCAGGTCCTCCGGGCCCTG
fug GTGCACCCTCCTTCAACTTTA---CTACAGGGTCCACAAGGGAAGCCTGGCAGGCCAGGAAAGCCAGGACCACCCGGACCACCAGGAGAACCAGGCCCACCAGGACCAG
zeb ATGCCTCCACCCTCTACGCTC---CTCCAGGGGCCACCGGGGAAGCCGGGCCGACCAGGCAAACCCGGACCTCCGGGGCCTCCAGGGGAGCCAGGGCCCCCGGGTCCAA
    ** ** ** ** ** * * ***** ** * ***** ** ** * * ** ** ** ** ** ** ** ** ** ** ** * ** ** ** ** **
    ((((---(((--((((((-((((((---(((((((-----)))----))))--))))))))))))--)))))))

B

Zebrafish

     AC CA A AAG G
5’-  GCUCCUC GGGGCC CCGGGG CCGG----GCC A
     ||||||| |||||| |||||| ||||    ||| C
3’-  CGAGGGG CUCCGG GGCCUC GGCC    CGG C
     CA AC - -CA CAAA A

Fig. 3. Distinct RNA folds mediate editing in mammals versus zebrafish. (A) Clustal W (1.81) alignment of vertebrate complement component 1, q subcomponent-like 1 (C1QL1) sequences and RNA secondary structures that mediate editing. Exon 1 sequences were retrieved from the UCSC Genome Browser [10]. Base-paired nucleotides in the mammalian structure are depicted with purple shading for the sequence surrounding the recoding editing sites and in yellow shading for the upstream, complementary region. Nucleotides involved in base-pairing in the zebrafish RNA structure are shown in purple (region around the recoding editing site) and green (downstream complementary region). Adenosines that were experimentally shown to be subject to editing are shown in red with yellow shading. Nucleotides identical across all displayed species are indicated by a star; base-pairing nucleotides are further highlighted through bracket notation. (B) Zebrafish computer predicted RNA secondary structure. C1QL1 zebrafish sequences were analyzed using M-fold [11] and the predicted structure surrounding the Q74R editing site (bold and underlined) as well as other minor sites (underlined and gray shading) are shown.

recoding A-to-I editing targets, such as glutamate receptor transcripts [13]. That assumption is supported by the tissue-specific pattern of editing in mouse brain tissue described above. We also analyzed human lung, kidney and spleen RNA samples for editing in C1QL1 but did not detect evidence of editing above the detection limit of <5% for direct sequence track analysis.

The family of C1Q-domain proteins includes important signaling molecules with roles in inflammation, adaptive immunity and energy homeostasis [14]. The physiological function of C1QL1 has not been elucidated, but it is expressed highest within the brain and was suggested to be especially important for neurons involved in coordination and regulation of motor control [15]. Furthermore, it may be part of a neuroprotective immune response [16]. Interestingly, one study revealed upregulation of C1QL1 in response to kainic acid induced seizures [17]. Kainate is the specific agonist for the ionotropic glutamate receptors GluR-5 and GluR-6, both of which are also regulated through A-to-I RNA editing.

The RNA editing sites in C1QL1 are located immediately upstream and at the beginning of a collagen-like domain. In other C1Q-domain proteins, such as the hormone adiponectin, this coincides with a region of protease-mediated processing [18]. Future studies will show if the amino acid substitutions caused by RNA editing alter post-translational processing of C1QL1, or if they affect other properties of the protein in vivo.

Our findings validating mammalian C1QL1 as a bona fide A-to-I RNA editing target further highlight the effectiveness of our bioinformatics search strategy as applied to the subset of human mRNAs with non-synonymous A/G discrepancies chosen from the SNP database. Four of the top five highest scoring sites (80%) prove to be in vivo editing targets, whereas none of the tested sites with lower scores (an additional 60 positions tested) show detectable editing. This analysis sets the stage for a systematic, genome-wide screen for A-to-I recoding events in human and other organisms.

3.2. A distinct RNA fold supports zebrafish C1QL1 editing

The C1QL1 exon 1 sequence is strongly conserved between mammalian species (Figs. 1 and 3), which suggests that in addition to the human and mouse gene, the rat, cow and dog C1QL1 RNA is also likely subject to editing. However, we noticed that the predicted secondary structure supporting editing in mammalian C1QL1 is not conserved within the zebrafish (Danio rerio) orthologue (Fig. 3). The editing site complementary sequence (ECS) within human exon 1, located 5′ to the recoding editing sites, is not conserved in any of the non-mammal sequences including zebrafish. However, in zebrafish, another RNA fold of similar strength is formed with sequences downstream of the recoding sites within exon 1. Indeed, when we analyze RNAs isolated from adult and four day post-fertilization zebrafish specimens, we readily detect editing at the Q74R site (equivalent to human Q66R) of about 50% in adult and 33% in hatchlings (Table S2).

The distinct RNA fold predicted for the zebrafish sequence is supported by the fact that two additional adenosines located on the opposite side of the predicted duplex also undergo editing (see Fig. 3). In contrast, human C1QL1 does not show any evidence of editing at the downstream adenosines.

Acknowledgments

We thank Dr. Kathy Iovine for zebrafish and William Coleman for the mouse specimens.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.febslet.2009.02.044.

References

[1] Gommans, W.M., Dupuis, D.E., McCane, J.E., Tatalias, N.E. and Maas, S. (2008) Diversifying exon code through A-to-I RNA editing. In: DNA RNA Editing (H. Smith, Ed.), pp. 3–30. Wiley & Sons, Inc.
[2] Hoopengardner, B. (2006) Adenosine-to-inosine RNA editing: perspectives and predictions. Mini. Rev. Med. Chem. 6, 1213–1216.
[3] Nishikura, K. (2006) Editor meets silencer: crosstalk between RNA editing and RNA interference. Nat. Rev. Mol. Cell Biol. 7, 919–931.
[4] Maas, S., Kawahara, Y., Tamburro, K.M. and Nishikura, K. (2006) A-to-I RNA editing and human disease. RNA Biol. 3, 1–9.
[5] Sherry, S.T., Ward, M.H., Kholodov, M., Baker, J., Phan, L., Smigielski, E.M. and Sirotkin, K. (2001) DbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311.
[6] Buetow, K.H., Edmonson, M.N. and Cassidy, A.B. (1999) Reliable identification of large numbers of candidate SNPs from public EST data. Nat. Genet. 21, 323–325.
[7] Irizarry, K., Kustanovich, V., Li, C., Brown, N., Nelson, S., Wong, W. and Lee, C.J. (2000) Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences. Nat. Genet. 26, 233–236.
[8] Eisenberg, E., Adamsky, K., Cohen, L., Amariglio, N., Hirshberg, A., Rechavi, G. and Levanon, E.Y. (2005) Identification of RNA editing sites in the SNP database. Nucleic Acids Res. 33, 4612–4617.
[9] Gommans, W.M., Tatalias, N.E., Sie, C.P., Dupuis, D., Vendetti, N., Smith, L., Kaushal, R. and Maas, S. (2008) Screening of human SNP database identifies recoding sites of A-to-I RNA editing. RNA 14, 2074–2085.
[10] Kuhn, R.M. et al. (2007) The UCSC genome browser database: update 2007. Nucleic Acids Res. 35, D668–D673.
[11] Zuker, M. (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.
[12] Athanasiadis, A., Rich, A. and Maas, S. (2004) Widespread A-to-I RNA editing of Alu-containing mRNAs in the human transcriptome. PLoS Biol. 2, e391.
[13] Paschen, W. and Djuricic, B. (1995) Regional differences in the extent of RNA editing of the glutamate receptor subunits GluR2 and GluR6 in rat brain. J. Neurosci. Meth. 56, 21–29.
[14] Ghai, R., Waters, P., Roumenina, L.T., Gadjeva, M., Kojouharova, M.S., Reid, K.B., Sim, R.B. and Kishore, U. (2007) C1q and its growing family. Immunobiology 212, 253–266.
[15] Berube, N.G., Swanson, X.H., Bertram, M.J., Kittle, J.D., Didenko, V., Baskin, D.S., Smith, J.R. and Pereira-Smith, O.M. (1999) Cloning and characterization of CRF, a novel C1q-related factor, expressed in areas of the brain involved in motor function. Brain Res. Mol. Brain Res. 63, 233–240.
[16] Glanzer, J.G. et al. (2007) Genomic and proteomic microglial profiling: pathways for neuroprotective inflammatory responses following nerve fragment clearance and activation. J. Neurochem. 102, 627–645.
[17] Hunsberger, J.G., Bennett, A.H., Selvanayagam, E., Duman, R.S. and Newton, S.S. (2005) Gene profiling the response to kainic acid induced seizures. Brain Res. Mol. Brain Res. 141, 95–112.
[18] Waki, H. et al. (2005) Generation of globular fragment of adiponectin by leukocyte elastase secreted by monocytic cell line THP-1. Endocrinology 146, 790–796.