Exercise: Bioinformatics of RNA and Protein Structures

Ronny Lorenz
Bioinformatics, University of Leipzig
Leipzig, Germany, April 14, 2014

RNA Secondary structure prediction

Secondary structures can be uniquely decomposed into loops.

[Figure: loop types (stacking pair, hairpin loop, interior loop, bulge, multi loop, exterior loop), each annotated with its closing base pair and interior base pair(s)]

$$E(S) = \sum_{L \in S} E(L)$$

• The free energy of a secondary structure is the sum of the free energies of the loops it is composed of
• Loop energies depend on loop type, loop size, and sequence
• Energy parameters are measured experimentally or extrapolated by mathematical models

Decomposition scheme

[Figure: recursive decomposition diagrams for four arrays. $F_{i,j}$: position $i$ followed by $[i+1, j]$, or $[i, k]$ followed by $[k+1, j]$. $C_{i,j}$: a hairpin closed by $(i, j)$, an interior loop closed by $(i, j)$ with inner pair $(d, e)$, or a split at $u$ into $[i, u]$ and $[u+1, j]$. $M_{i,j}$: $[i, j-1]$ followed by $j$, or one of two splits at $u$ into $[i, u]$ and $[u+1, j]$. $\hat{M}_{i,j}$: $[i, j-1]$ followed by $j$, or the interval closed by $(i, j)$.]

Programs / Program suites

• Unafold / mfold (M. Zuker)
    http://mfold.rna.albany.edu/
    ...the 'inventor' of the DP recursion scheme
• RNAstructure (D. Mathews)
    http://rna.urmc.rochester.edu/RNAstructure.html
    http://rna.urmc.rochester.edu/NNDB/
    ...the energy parameter guys
• ViennaRNA Package (I. Hofacker)
    http://www.tbi.univie.ac.at/RNA/
    http://www.tbi.univie.ac.at/~ronny/RNA/
    ...comprehensive compilation and very fast implementation of RNA structure prediction algorithms

RNA/DNA sequence file formats

• Simple text file (ViennaRNA)
    CAAAGGCGACUCUCCUUAGACUCUAUAAAUAGUAAAUAGCUCCUAGGGACAAGGCUUACG
• SEQ file (RNAstructure)
    ; There can be any number of comments.
    A title must immediately follow on the next line and be on one line.
    AAA GCGG UUTGTT UTCUTaaTCTXXXXUCAGG1
• FASTA
    >some random sequence
    CAAAGGCGACUCUCCUUAGACUCUAUAAAU
    AGUAAAUAGCUCCUAGGGACAAGGCUUACG
• GenBank
    http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
• EMBL
    http://www.ebi.ac.uk/ena/
• RNAML
    XML style storage of RNA sequence- and structure data
• Clustal format for sequence alignments (used by many alignment programs)
• Stockholm format sequence alignments (http://rfam.sanger.ac.uk/)

RNA/DNA structure file formats

• dot parenthesis (a.k.a. dot-bracket) notation
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))...
• BPSEQ format
    1 G 8
    2 G 7
    3 C 0
    4 A 0
    5 U 0
    6 U 0
    7 C 2
    8 C 1
• connectivity table (CT) format
    8
    1 G 0 2 8 1
    2 G 1 3 7 2
    3 C 2 4 0 3
    4 A 3 5 0 4
    5 U 4 6 0 5
    6 U 5 7 0 6
    7 C 6 8 2 7
    8 C 7 0 1 8
• RNAML

see also http://www.rnasoft.ca/strand/help.php

The ViennaRNA Package

www.tbi.univie.ac.at/~ronny/RNA

• Source code available
• Binary packages available (Fedora, Ubuntu, Debian, Windows)
• FAST and accurate!!!
    http://www.tbi.univie.ac.at/~ronny/RNA/performance.html
• Reads simple sequence files, FASTA, clustal, and Stockholm formats
• Produces Postscript plots, dot-bracket structures, other output
• ViennaRNA Websuite: rna.tbi.univie.ac.at

RNAfold

Compute minimum free energy structures (...partition function, base pair probabilities, centroid-, and MEA-structure, etc...)

• Sequence file, e.g. FASTA format (sequence.fa)
    >some random sequence
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGA
    CGGGCGAUCUUCGAAAGUGGAAUCCGUA
• Run RNAfold
    $ RNAfold < sequence.fa
    >some random sequence
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-13.40)
• Postscript structure plot
    [Figure: secondary structure plot of the predicted structure]

RNAplot

Draw and annotate RNA secondary structure plots

• Sequence/Structure pair input, e.g. output of RNAfold
    >some random sequence
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-13.40)
• Run RNAplot with different layout algorithm
    $ RNAfold < sequence.fa | RNAplot -t 0
    [Figure: secondary structure plots drawn with the alternative layout]
• Run RNAplot with --pre="" option to create annotation macros
    $ RNAfold < sequence.fa | RNAplot --pre=""
    open resulting postscript plot in a text editor and investigate macros

RNAeval

Evaluate the free energy of a sequence/structure pair

• Simple check if RNAfold and RNAeval score equally
    $ RNAfold < sequence.fa
    >some random sequence
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-13.40)
    $ RNAfold < sequence.fa | RNAeval
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-13.40)
• Evaluate at 20°C and dangle model 1
    $ RNAfold < sequence.fa | RNAeval -d 1 -T 20.
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-20.28)
• Verbose output
    $ RNAfold < sequence.fa | RNAeval -v
    External loop                           : -140
    Interior loop (  3, 57) GC; (  4, 56) GC: -330
    Interior loop (  4, 56) GC; (  5, 55) AU: -240
    Interior loop (  5, 55) AU; ( 10, 52) UG:  380
    Interior loop ( 10, 52) UG; ( 11, 51) CG: -150
    Interior loop ( 11, 51) CG; ( 13, 49) CG:  -10
    Interior loop ( 13, 49) CG; ( 14, 48) UA: -210
    Interior loop ( 14, 48) UA; ( 15, 47) UA:  -90
    Interior loop ( 15, 47) UA; ( 17, 45) CG:  120
    Interior loop ( 17, 45) CG; ( 18, 44) GC: -240
    Interior loop ( 18, 44) GC; ( 19, 43) AU: -240
    Interior loop ( 19, 43) AU; ( 20, 42) GU:  -60
    Interior loop ( 20, 42) GU; ( 23, 39) UA:  290
    Interior loop ( 23, 39) UA; ( 24, 38) CG: -240
    Interior loop ( 24, 38) CG; ( 25, 37) GC: -240
    Interior loop ( 25, 37) GC; ( 26, 36) CG: -340
    Hairpin loop  ( 26, 36) CG              :  400
    UGGGAAUAGUCUCUUCCGAGUCUCGCGGGCGACGGGCGAUCUUCGAAAGUGGAAUCCGUA
    ..(((....((.(((.((((..((((.........))))..)))).))).))..)))... (-13.40)
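
The verbose RNAeval output can be used to check the loop decomposition by hand: the per-loop contributions should add up to the reported total, and the closing pairs should be exactly the base pairs encoded in the dot-bracket string. The following Python sketch does this for the example above. It is not part of the ViennaRNA Package; the helper names (`parse_dot_bracket`, `loop_energies`, `closing_pairs`) are made up for illustration, and it assumes the per-loop values are given in units of 0.01 kcal/mol, which is consistent with their sum of -1340 against the reported -13.40.

```python
import re

# Structure and verbose loop listing copied from the RNAeval -v example.
STRUCTURE = "..(((....((.(((.((((..((((.........))))..)))).))).))..)))..."

VERBOSE = """\
External loop                           : -140
Interior loop (  3, 57) GC; (  4, 56) GC: -330
Interior loop (  4, 56) GC; (  5, 55) AU: -240
Interior loop (  5, 55) AU; ( 10, 52) UG:  380
Interior loop ( 10, 52) UG; ( 11, 51) CG: -150
Interior loop ( 11, 51) CG; ( 13, 49) CG:  -10
Interior loop ( 13, 49) CG; ( 14, 48) UA: -210
Interior loop ( 14, 48) UA; ( 15, 47) UA:  -90
Interior loop ( 15, 47) UA; ( 17, 45) CG:  120
Interior loop ( 17, 45) CG; ( 18, 44) GC: -240
Interior loop ( 18, 44) GC; ( 19, 43) AU: -240
Interior loop ( 19, 43) AU; ( 20, 42) GU:  -60
Interior loop ( 20, 42) GU; ( 23, 39) UA:  290
Interior loop ( 23, 39) UA; ( 24, 38) CG: -240
Interior loop ( 24, 38) CG; ( 25, 37) GC: -240
Interior loop ( 25, 37) GC; ( 26, 36) CG: -340
Hairpin loop  ( 26, 36) CG              :  400
"""


def parse_dot_bracket(structure):
    """Return the sorted list of 1-based base pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def loop_energies(verbose_text):
    """Return the integer after the last ':' of every loop line."""
    return [int(line.rsplit(":", 1)[1])
            for line in verbose_text.splitlines() if ":" in line]


def closing_pairs(verbose_text):
    """Return the first (i, j) pair named on each loop line (its closing pair)."""
    found = []
    for line in verbose_text.splitlines():
        match = re.search(r"\(\s*(\d+),\s*(\d+)\)", line)
        if match:  # the external loop line names no pair
            found.append((int(match.group(1)), int(match.group(2))))
    return sorted(found)


pairs = parse_dot_bracket(STRUCTURE)
total = sum(loop_energies(VERBOSE))

print(len(pairs), "base pairs; outermost", pairs[0], "innermost", pairs[-1])
print("sum of loop energies:", total, "=", total / 100, "kcal/mol (assumed unit)")
print("closing pairs match structure:", closing_pairs(VERBOSE) == pairs)
```

Every base pair closes exactly one loop, so the 16 pairs give 15 interior loops (stacks, bulges, and interior loops are all listed as "Interior loop") plus one hairpin, and the exterior loop accounts for the remaining line.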
