Revision Hands-On Infernal Conclusion

RNA Secondary Structures 1st EVBC WinterSchool

Jörg Fallmann, Kevin Lamkiewicz

9 – 13.03.2018

A small revision of the theory

NUSSINOV ALGORITHM (1978)

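$$E_{i,j} = \min\Big\{ E_{i+1,j},\ \min_{k:\, i+m < k \le j} \big( E_{i+1,k-1} + E_{k+1,j} + \beta_{i,k} \big) \Big\}$$

Here m is the minimum number of unpaired bases enclosed by a base pair, and β_{i,k} is the energy contribution of the base pair (i, k).

[Figure: decomposition of the interval from i to j. Either i is unpaired, leaving the interval i+1 to j, or i pairs with k, leaving the intervals i+1 to k−1 and k+1 to j.]

⇒ maximize base pairs, minimize energy!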
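The recursion can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not a reference implementation: every canonical or wobble pair contributes the same energy `beta = -1`, the minimum loop length is `m = 3`, and the names `nussinov_energy`, `PAIRS`, `beta` and `m` are chosen here for illustration.

```python
from functools import lru_cache

# Assumption: only Watson-Crick and G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_energy(seq, m=3, beta=-1.0):
    """Minimum energy of seq if every allowed base pair contributes beta."""
    n = len(seq)

    @lru_cache(maxsize=None)
    def E(i, j):
        # Empty or single-base intervals cannot contain a pair.
        if j <= i:
            return 0.0
        best = E(i + 1, j)  # case 1: base i stays unpaired
        for k in range(i + m + 1, j + 1):  # case 2: i pairs with k, i + m < k <= j
            if (seq[i], seq[k]) in PAIRS:
                best = min(best, E(i + 1, k - 1) + E(k + 1, j) + beta)
        return best

    return E(0, n - 1)


print(nussinov_energy("GGGAAACCC"))
```

With `beta = -1`, minimizing the energy is the same as maximizing the number of base pairs, which is the point of the slide.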

MCCASKILL ALGORITHM (1990)

The McCaskill algorithm efficiently calculates the partition function Z over the ensemble Ω of all possible structures:

$$Z = \sum_{P \in \Omega} e^{-\frac{E(P)}{R \cdot T}} = Q_{1,n}$$

- E(P): energy of the structure P
- R: gas constant
- T: temperature
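To see how Z can be computed without enumerating Ω, here is a minimal sketch. It does not use the real (Turner) energy model: as an assumption, every base pair simply contributes `beta` to E(P), so the Boltzmann weight of a structure is a product of per-pair weights. The decomposition is the unambiguous one from the Nussinov recursion (first base unpaired, or paired with k), so every structure is counted exactly once. The names `partition_function`, `PAIRS`, `R`, `beta` and `m` are illustrative.

```python
import math
from functools import lru_cache

# Assumption: only Watson-Crick and G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
R = 0.0019872  # gas constant in kcal/(mol*K)


def partition_function(seq, m=3, beta=-1.0, temperature=310.15):
    """Z = Q(1, n) for a toy model where each base pair contributes beta."""
    rt = R * temperature
    pair_weight = math.exp(-beta / rt)  # Boltzmann weight of one base pair
    n = len(seq)

    @lru_cache(maxsize=None)
    def Q(i, j):
        if j <= i:
            return 1.0  # only the empty structure is possible
        total = Q(i + 1, j)  # base i unpaired
        for k in range(i + m + 1, j + 1):  # base i paired with k
            if (seq[i], seq[k]) in PAIRS:
                total += Q(i + 1, k - 1) * pair_weight * Q(k + 1, j)
        return total

    return Q(0, n - 1)


# With beta = 0 every structure has weight 1, so Z counts the structures.
print(partition_function("GAAAC", beta=0.0))
print(partition_function("GAAAC"))
```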

FREE ENERGY MINIMIZATION

- Overcome the main drawback of Nussinov’s algorithm: maximizing the number of base pairs is not realistic!
- Define an energy model for RNA that can be parameterized by experimentally measured energies
- Devise an algorithm that minimizes the free energy of RNA according to this model
- The algorithm (by Zuker) will be similar to Nussinov’s algorithm

DEFINITION (GIBBS FREE ENERGY)

The Gibbs free energy G of a system (e.g. a solution of RNA) is G = H − TS, where H is the enthalpy (potential to perform work), T the absolute temperature and S the entropy (measure of disorder).

- For RNA, we will compute the free energy of a certain amount of molecules (N_A ≈ 6 × 10^23, a “mol”) of a certain structure P. More precisely, we compute the change of free energy ∆E due to folding into P from P_unfolded.
- The (change of) Gibbs free energy corresponding to P can be computed by summing free energy contributions from single “structural elements”.
- Those contributions (for loops, stacks, ...) can be measured experimentally (Turner). They consist of enthalpic and entropic terms. Due to the latter, they depend on temperature.
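The temperature dependence follows directly from G = H − TS. The sketch below evaluates the change ∆G = ∆H − T∆S at a few temperatures. The enthalpy and entropy values are hypothetical numbers picked for illustration, not measured Turner parameters, and the function name `gibbs_free_energy` is an assumption of this example.

```python
def gibbs_free_energy(delta_h, delta_s, temperature_kelvin):
    """Return dG = dH - T * dS (dH, dG in kcal/mol; dS in kcal/(mol*K))."""
    return delta_h - temperature_kelvin * delta_s


# Hypothetical values for one structural element (NOT measured data).
DELTA_H = -10.0   # kcal/mol
DELTA_S = -0.025  # kcal/(mol*K)

for celsius in (25.0, 37.0, 60.0):
    kelvin = celsius + 273.15
    delta_g = gibbs_free_energy(DELTA_H, DELTA_S, kelvin)
    print(f"{celsius:5.1f} C: dG = {delta_g:6.2f} kcal/mol")
```

With a negative entropic term, the element becomes less stabilizing as the temperature rises.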

DEFINITION (FREE ENERGY OF AN RNA)

Given an RNA structure P of an RNA sequence S:

- loop free energy $E^P_{ij}$: energy contribution of Sec(i,j)
- total free energy $E(P) = \sum_{(i,j) \in P} E^P_{ij}$

ZUKER'S ALGORITHM

$$W_{i,j} = \min\Big\{ W_{i,j-1},\ \min_{i \le k < j-m} \big( W_{i,k-1} + W_{k+1,j-1} + E(???) \big) \Big\}$$

FORWARD RECURSION

RNA Structures on the command line

TOOLS AND SCRIPTS

- ViennaRNA
  - RNAfold
  - RNAsubopt
  - RNAcofold
  - RNAduplex
  - RNAalifold
  - RNALfold
  - ...
- LocARNA
- Infernal
  - cmbuild
  - cmcalibrate
  - cmsearch
  - CMfinder
- MAFFT, VARNA, ...

RNAFOLD

    # Use RNAfold on sequence.fasta in order
    # to fold all sequences in the file
    $> RNAfold < sequence.fasta

    # Redirect the output of RNAfold into a file
    $> RNAfold < sequence.fasta > sequence.fold

RNAFOLD WITH PARTITION FUNCTION

    # Use the -p parameter to calculate the partition function
    # on top of the minimum free energy secondary structure
    $> RNAfold -p < sequence.fasta > sequence.fold

    # This will also produce a centroid structure:
    # the structure with minimal average distance to all
    # structures in the ensemble.
    # Symbols of the pair-probability string in the output:
    # .   unpaired
    # ,   weakly paired
    # |   strongly paired w/o preference
    # {}  weakly paired (downstream/upstream)
    # ()  strongly paired (downstream/upstream)

MEA STRUCTURE

Maximum expected accuracy structure

Reminder: With the partition function we are able to calculate base-pair probabilities. The structure with the highest sum of all probabilities is called the MEA structure.

    # Use -p and --MEA to calculate the MFE,
    # the MEA and the centroid structure.
    # Note: The MFE and the MEA structure do not have to be the same!
    # Note 2: --MEA usually implies -p
    $> RNAfold -p --MEA < sequence.fasta > sequence.fold

STRUCTURE AND DOT PLOTS

    # the same command again; you will get an rna.ps and a dot.ps
    $> RNAfold -p --MEA < sequence.fasta > sequence.fold

[Figure: the structure plot and the dot plot (dot.ps) of the folded sequence.]

RNAFOLD WITH CONSTRAINTS

- Structure of RNA is partly known (e.g. via SHAPE experiments)
- RNAfold is able to consider this knowledge

    # Enable -C to include constraints. --noPS prevents the generation
    # of the rna.ps and dot.ps files.
    $> RNAfold --noPS -C < constrained.fasta > constrained.fold

    # .  (no constraint for this base)
    # |  (corresponding base has to be paired)
    # x  (base is unpaired)
    # <  (base i is paired with a base j > i)
    # >  (base i is paired with a base j < i)

RNAFOLD+

Let’s use some colors! The ViennaRNA Package offers some Perl scripts, which can enrich the PostScript files of our structures.

[Figure: three colored structure plots of the same RNA, with color scales from 0 to 2.3, from 0 to 1 and from 0 to 1.0.]

RNAFOLD PERL SCRIPTS

    # Low entropy regions have little structural flexibility,
    # which means the reliability of the predicted structure is high.
    # High entropy indicates many structural alternatives,
    # which might be functionally important but make the prediction
    # more difficult and thus less reliable.
    $> ./relplot.pl rna.ps dot.ps > entropy.ps

    # -p colors the nucleotides based on their base-pairing
    # probability
    $> ./relplot.pl -p rna.ps dot.ps > probability.ps

    # -a colors the nucleotides based on their accessibility
    # (i.e. the probability of being unpaired)
    $> ./relplot.pl -a rna.ps dot.ps > access.ps

RNASUBOPT

Sometimes we’re interested in suboptimal structures.

    # In general RNAsubopt is used exactly like RNAfold.
    # With -p <number>, structures are sampled according to their
    # probability in the ensemble (based on the partition function).
    $> RNAsubopt [OPTIONS] < sequence.fasta > sequence.subopt

    # With the -e parameter, one can define a certain
    # energy range (in kcal/mol). Using this, RNAsubopt returns
    # all structures whose energy lies within this range
    # of the minimum free energy.
    $> RNAsubopt -e 2 < sequence.fasta > sequence_e2.subopt

RNACOFOLD

    # RNAcofold works like RNAfold, but allows you to specify two RNA sequences.
    # These sequences are then allowed to form a dimer structure. In order
    # to calculate the hybrid structure, it is necessary to concatenate the
    # two RNA sequences, using & as a separator.

    $> RNAcofold [OPTIONS] < sequences.fasta > sequences.cofold

    # >seq1
    # AUGGCAUCGACA
    # >seq2
    # UGUCGAAUCCAA

    # RNAcofold input:
    # AUGGCAUCGACA&UGUCGAAUCCAA
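Joining two FASTA records with `&` by hand is error-prone, so a small helper can do it. The following is a minimal sketch. The function names `read_fasta` and `cofold_input` and the combined header `>seq1_seq2` are assumptions made for this example; they are not part of the ViennaRNA Package.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line  # multi-line sequences are concatenated
    return [(header, seq) for header, seq in records]


def cofold_input(fasta_text):
    """Join exactly two FASTA records into one '&'-separated record."""
    records = read_fasta(fasta_text)
    if len(records) != 2:
        raise ValueError("expected exactly two sequences")
    (name1, seq1), (name2, seq2) = records
    return f">{name1}_{name2}\n{seq1}&{seq2}\n"


FASTA = ">seq1\nAUGGCAUCGACA\n>seq2\nUGUCGAAUCCAA\n"
print(cofold_input(FASTA), end="")
```

The printed record can be written to a file and passed to RNAcofold on standard input as shown in the slide.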

RNADUPLEX

    # RNAduplex is very similar to RNAcofold. Actually,
    # it is a special case of RNAcofold, where only inter-molecular
    # base pairs are allowed.

    $> RNAduplex [OPTIONS] < sequences.fasta > sequences.duplex

    # Alternative:
    # RNAcofold -C < sequences_constrained.fasta
    # with sequences_constrained.fasta containing:
    # UAGCUAGCAUGCAUCGACGAU&CGAUGCAUGCAUGCAUGCAUC
    # <<<<<<<<<<<<<<<<<<<<<&>>>>>>>>>>>>>>>>>>>>>

RNAALIFOLD

RNAalifold calculates a consensus RNA secondary structure for several aligned RNA sequences.

    # RNAalifold accepts CLUSTAL, Stockholm, FASTA or MAF
    # formats for the input alignment.
    # --color will produce a colored version of the structure plot
    # --aln produces a colored alignment based on the structure

    $> RNAalifold --aln --color < input.aln > consensus.alifold

LOCARNA VS MAFFT

In order to create a multiple sequence alignment, we can use MAFFT and/or LocARNA (and many more...)

    # mafft creates a multiple sequence alignment based on
    # sequence conservation only
    $> mafft --clustalout dengue_3utr.fa > dengue_3utr_mafft.aln

    # locarna folds and aligns the sequences simultaneously,
    # yielding better results for sequences that share a
    # structural conservation.
    # However, locarna needs quite some time compared to sequence-based
    # alignment tools.
    $> mlocarna --thread 4 dengue_3utr.fasta > dengue_3utr.locarna

RNAALIFOLD RESULTS

    # alirna.ps and aln.ps give information on the structural conservation
    $> RNAalifold --aln --color < dengue_3utr_mafft.aln \
         > dengue_3utr_mafft.alifold

    # NOTE: both PostScript files will be overwritten!
    # locarna saves the alignment in a subdirectory
    $> RNAalifold --aln --color < dengue_3utr.out/results/result.aln \
         > dengue_3utr_locarna.alifold

STRUCTURES

[Figure: RNAalifold consensus structure plots for the MAFFT alignment and for the LocARNA alignment.]

RNAALIFOLD COLOR CODE

[Figure: RNAalifold consensus structure plot with the stem-loops SL1, SL2 and SL4 and the TRS-L region labeled. The color legend shows the number of base pair types (1 to 6) against the number of incompatible base pairs (0 to 2).]

VARNA: FOR NICE FIGURES!

Covariance Models, Calibration and successful searches

HMMS

Markov model
A Markov model is a tuple λ = (M, π(0), P), where:

- M is a finite set of states
- π(0) is the initial distribution
- P is a matrix of transition probabilities

Hidden Markov model
A hidden Markov model assumes that some states are unobserved: λ = (M, A, P, B, π), where:

- M is again the finite set of states
- A is the set of all possible observations
- P is again the matrix of transition probabilities
- B is a matrix of output probabilities (emission probabilities)
- π is the initial distribution
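To make the tuple λ = (M, A, P, B, π) concrete, the sketch below computes the probability of an observed sequence with the forward algorithm. The two states, all probabilities and the name `forward_probability` are hypothetical values chosen for illustration; they do not come from any trained model.

```python
import itertools

STATES = ("stem", "loop")             # M: hidden states (hypothetical)
ALPHABET = "ACGU"                     # A: possible observations
INITIAL = {"stem": 0.5, "loop": 0.5}  # pi: initial distribution
TRANSITION = {                        # P: transition probabilities
    "stem": {"stem": 0.8, "loop": 0.2},
    "loop": {"stem": 0.3, "loop": 0.7},
}
EMISSION = {                          # B: emission probabilities
    "stem": {"A": 0.1, "C": 0.4, "G": 0.4, "U": 0.1},
    "loop": {"A": 0.4, "C": 0.1, "G": 0.1, "U": 0.4},
}


def forward_probability(observations, states, initial, transition, emission):
    """Probability of a non-empty observation sequence (forward algorithm)."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emission[s][symbol] * sum(alpha[r] * transition[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


print(forward_probability("GGAUCC", STATES, INITIAL, TRANSITION, EMISSION))

# Sanity check: the probabilities of all sequences of length 3 sum to 1.
total = sum(
    forward_probability("".join(obs), STATES, INITIAL, TRANSITION, EMISSION)
    for obs in itertools.product(ALPHABET, repeat=3)
)
print(round(total, 9))
```

Each emission depends only on the current hidden state, so a nucleotide cannot "know" about a pairing partner further away in the sequence.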

COVARIANCE MODEL

Disadvantage of HMMs?

We model each nucleotide independently of the others. Since nucleotides upstream of the current position are not considered, we are not able to model secondary structures with HMMs.

CMs
In a covariance model both the primary and the secondary structure are modelled.
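The signal that a CM can exploit and an HMM cannot is covariation: two alignment columns that form a base pair change together. One common way to quantify this is the mutual information between two columns. The sketch below only illustrates that signal; it is not how Infernal parameterizes a CM, and the function name `mutual_information` and the toy columns are assumptions of this example.

```python
import math
from collections import Counter


def mutual_information(column_i, column_j):
    """Mutual information (in bits) between two alignment columns."""
    if len(column_i) != len(column_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(column_i)
    freq_i = Counter(column_i)
    freq_j = Counter(column_j)
    freq_ij = Counter(zip(column_i, column_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy columns from four hypothetical sequences.
print(mutual_information("AGCU", "UCGA"))  # perfectly covarying base pair
print(mutual_information("AAAA", "UCGA"))  # conserved column, no covariation
```

A perfectly conserved pair has zero mutual information, which is why sequence conservation and structural covariation are complementary signals.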

INFERNAL

http://eddylab.org/infernal

User’s Guide quote, "How to avoid reading this manual":

If you’re like most people, you don’t enjoy reading documentation. You’re probably thinking: 113 pages of documentation, you must be joking!

CMBUILD, CMCALIBRATE, CMSEARCH

    # go to: https://www.ncbi.nlm.nih.gov/nuccore
    # txid11051[Organism:exp] AND complete genome[Title]
    # download them: all_complete_flavi.fasta

    # align the sequences, then build the CM from the alignment
    $> clustalw flavi_3utr.fasta
    $> cmbuild -F --noss flavi_3utr.cvm flavi_3utr.aln
    # calibrate the E-value parameters of the model
    $> cmcalibrate --cpu 2 flavi_3utr.cvm

    # search in the fasta file for sequences matching
    # the cmfile and get their positions as table
    $> cmsearch --cpu 2 --tblout flavi_3utr_cmsearch.csv flavi_3utr.cvm \
         all_complete_flavi.fasta > cmsearch.log
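The table written by `--tblout` is whitespace-delimited text with `#` comment lines, so it is easy to post-process. The sketch below assumes the default column order of Infernal 1.1 (target name, accession, query name, accession, mdl, mdl from, mdl to, seq from, seq to, strand, trunc, pass, gc, bias, score, E-value, inc, description); please check the header of your own file. The hit line in `EXAMPLE_TBLOUT` is invented for illustration and is not a real search result, and `parse_tblout` is a hypothetical helper name.

```python
def parse_tblout(text):
    """Extract target, coordinates, strand, score and E-value of each hit."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        # The description (18th column) may contain spaces.
        fields = line.split(maxsplit=17)
        if len(fields) < 17:
            continue  # not a complete hit line
        hits.append({
            "target": fields[0],
            "seq_from": int(fields[7]),
            "seq_to": int(fields[8]),
            "strand": fields[9],
            "score": float(fields[14]),
            "evalue": float(fields[15]),
        })
    return hits


# Invented example line (hypothetical accession and values).
EXAMPLE_TBLOUT = (
    "#target name accession query name accession mdl ...\n"
    "NC_000000.0 - flavi_3utr - cm 1 100 10250 10349 + no 1 0.52 0.0 "
    "85.3 1.2e-20 ! hypothetical virus, complete genome\n"
)

for hit in parse_tblout(EXAMPLE_TBLOUT):
    print(hit["target"], hit["seq_from"], hit["seq_to"], hit["strand"], hit["evalue"])
```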

The important things on one slide

YOU ARE NOW EXPERTS:

- Algorithms
- Implementations
- ViennaRNA heroes
- Infernal and CM ninjas

TAKE-HOME MESSAGE

The structure of an RNA sequence can be more important than the sequence itself. Many functions of non-coding RNAs are associated with their structure. At first glance, sequences share no similarity at all, but if you look carefully...

SEQUENCE VS. STRUCTURE

    GCAUGCAUGCUAGCUGACUAGCAUGCAUGCAUGCAUGCAUGCAGU
    CCGAGAUACCCUAACUCUAGGGUAUCUCGGACCUCAAAAGAGGGU

    GCAUGCAUGCUAGCUGACUAGCAUGCAUGCAUGCAUGCAUGUCAG
    CCGAGAUACCCUAACUCUAGGGUAUCUCGGACCUCAAAAGAGGGU
    (((((((((((((....))))))))))))).((((....).))).
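Whether a sequence can adopt a given dot-bracket structure is easy to check by program. The sketch below parses the structure into base pairs and tests every pair for complementarity. It makes two assumptions: the structure string is the one from the slide with the extraction spaces removed, and G-U wobble pairs count as valid. The names `base_pairs` and `is_compatible` are chosen for this example, and only the second sequence of the slide is checked.

```python
# Assumption: Watson-Crick and G-U wobble pairs are valid.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

# Second sequence from the slide and the (reconstructed) structure line.
SEQ_B = "CCGAGAUACCCUAACUCUAGGGUAUCUCGGACCUCAAAAGAGGGU"
STRUCTURE = "(" * 13 + "." * 4 + ")" * 13 + "." + "((((....).)))."


def base_pairs(dot_bracket):
    """Return the sorted list of 0-based (i, j) pairs of a dot-bracket string."""
    stack, pairs = [], []
    for pos, char in enumerate(dot_bracket):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def is_compatible(seq, dot_bracket):
    """True if seq has the same length and every pair is complementary."""
    if len(seq) != len(dot_bracket):
        return False
    return all((seq[i], seq[j]) in CAN_PAIR for i, j in base_pairs(dot_bracket))


print(len(base_pairs(STRUCTURE)), is_compatible(SEQ_B, STRUCTURE))
```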