Annotating RNA Motifs in Sequences and Alignments Paul P

Published online Xxxx 2014
Nucleic Acids Research, 2014, Vol. XX, No. YY 1–37
doi:10.1093/nar/gkn000

Supplementary material: Annotating RNA motifs in sequences and alignments

Paul P. Gardner1,∗ and Hisham Eldai1,∗

1School of Biological Sciences, Biomolecular Interaction Centre, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand

Received July, 2014; Revised July, 2014; Accepted July, 2014

∗To whom correspondence should be addressed. Tel: +64 3 364 2987; Fax: +64 3 364 2590; Email: [email protected]

© 2014 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

SUMMARY

In the following document we present supplementary methods, results and figures relating to the RMfam resource:

1. Figures 1-8 illustrate secondary structure diagrams for each of the RMfam motifs. Figure 1 contains a legend detailing the color and symbol schemes used to illustrate different evolutionary constraints on the different structures.
2. Figure 9 illustrates our estimates of the accuracy of using covariance models to annotate RNA motifs on sequences and alignments.
3. Figures 10-43 contain secondary structures and the results of per-motif benchmarks.
4. Figures 44 and 45 illustrate improvements to Rfam (v11.0) alignments and consensus structures based upon RMfam annotations.
5. Figure 46 illustrates the network of the 50 highest scoring RMfam to Rfam mappings.

SECONDARY STRUCTURES

[Figure 1 image: legend and structure diagrams. Legend entries: base pair annotations (covarying mutations, compatible mutations, no mutations observed); nucleotide present and nucleotide identity (percentage conservation thresholds). R = A or G. Y = C or U.]

Figure 1. A legend describing the symbols used in all the secondary structure images presented in figures 1-8. Secondary structure diagrams of: tetraloops: ANYA (1, 2, 3), CUYG (4, 5, 6, 7), GNRA (8, 9, 10, 11, 12, 13), UMAC (14, 15) and UNCG (10, 12, 13, 16) and the hairpin loops C-loop (17, 18, 19, 20), T-loop (12, 13, 21, 22, 23) and U-turn (12, 13, 24, 25).

[Figure 2 image: structure diagrams.]

Figure 2. Secondary structure diagrams of: the hairpin loops: C-loop (17, 18, 19, 20), T-loop (12, 13, 21, 22, 23) and U-turn (12, 13, 24, 25).

[Figure 3 image: structure diagrams.]

Figure 3. Secondary structure diagrams of: internal loops: three k-turns (3, 12, 13, 18, 26, 27, 28) and two sarcin-ricin loops (12, 20, 29, 30).

[Figure 4 image: structure diagrams.]

Figure 4. Secondary structure diagrams of: internal loops: the tandem-GA (20, 31), twist up (17), UAA GAN (32), docking elbow (33), and right angle 2 and 3 (34) motifs.

[Figure 5 image: structure diagrams.]

Figure 5. Secondary structure diagrams of Rho independent transcription terminators (35).

[Figure 6 image: structure diagrams.]

Figure 6. Secondary structure diagrams of: interactions: the AUF1 (36), CRC (37, 38, 39), CsrA (40, 41, 42, 43, 44, 45, 46, 47), HuR (48, 49), Roquin (50) and VTS1 (51, 52, 53, 54) protein binding motifs.

[Figure 7 image: structure diagrams.]

Figure 7. Secondary structure diagrams of: vapC target (55), the SRP RNA S domain (56, 57, 58) and the catalytic Domain-V (59, 60).

[Figure 8 image: three sequence logos (WebLogo 3.1), each paired with a structure diagram. Panel titles: Shine-Dalgarno sequence from Bacillus subtilis subsp. subtilis str. 168; Shine-Dalgarno sequence from Escherichia coli str. K-12 substr. MG1655; Shine-Dalgarno sequence from Helicobacter pylori 26695. Vertical axis: bits (0.0 to 2.0). Horizontal axis: Distance from start codon (nucs), -30 to 5.]

Figure 8.
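
The tetraloop names used in these figures (ANYA, CUYG, GNRA, UMAC, UNCG) are themselves written in the degenerate nucleotide codes given in the legend (R = A or G, Y = C or U). As a minimal sketch of how such a code expands into concrete sequences, the Python below converts a degenerate pattern into a regular expression and reports where it occurs in a sequence. It is an illustration only: RMfam annotates motifs with covariance models, which also score base pairing, whereas this sketch matches sequence alone. The function names `iupac_to_regex` and `find_motif`, the example sequence, and the codes M (A or C) and N (any nucleotide), taken from the standard IUPAC alphabet, are assumptions that do not appear in the supplement.

```python
import re

# Degenerate nucleotide codes over the RNA alphabet.
# R and Y follow the figure legend; M and N are assumed from standard IUPAC usage.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine: A or G
    "Y": "CU",    # pyrimidine: C or U
    "M": "AC",    # A or C (assumption, needed for UMAC)
    "N": "ACGU",  # any nucleotide (assumption, needed for GNRA, UNCG, ANYA)
}


def iupac_to_regex(pattern):
    """Translate a degenerate pattern such as 'GNRA' into a regex string."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]  # raises KeyError on an unsupported code
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    regex = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input
    return [m.start() for m in regex.finditer(rna)]


if __name__ == "__main__":
    hairpin = "GGCGAAAGCC"  # hypothetical example sequence
    print(iupac_to_regex("GNRA"))       # G[ACGU][AG]A
    print(find_motif("GNRA", hairpin))  # [1, 3]
```

A sequence-only scan like this reports every window that fits the pattern, whether or not it sits in a hairpin loop, which is one reason a structure-aware model is the more appropriate tool for real annotation.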
