RNA Structure Prediction Software and Analysis

RNA Structure Prediction
RNA structure prediction strategies

Secondary structure prediction
1) Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational

Secondary structure prediction strategies
1) Energy minimization (thermodynamics)

• Algorithm:
Dynamic programming to find high-probability pairs
(also, some genetic algorithms)
• Software:
Mfold - Zuker
Vienna RNA Package - Hofacker
RNAstructure - Mathews
Sfold - Ding & Lawrence
R Knight 2005
2) Comparative sequence analysis (co-variation)

• Algorithm:
Mutual information
Context-free grammars
• Software:
ConStruct
Alifold
Pfold
FOLDALIGN
Dynalign
R Knight 2005
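The mutual-information idea listed above can be sketched in a few lines. This is a minimal illustration, not the method used by any of the listed packages: the function name `mutual_information` and the toy alignment columns are hypothetical, and each column is assumed to be a string with one character per aligned sequence.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)                 # single-column base counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))    # joint counts per sequence
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical toy columns: compensatory changes keep the pair intact.
covarying = mutual_information("GGCCAAUU", "CCGGUUAA")
# Fully conserved columns carry no co-variation signal.
conserved = mutual_information("GGGGGGGG", "CCCCCCCC")
```

Columns that change together score high, while perfectly conserved columns score zero, which is one reason co-variation methods need alignments with enough sequence diversity.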
3) Combined experimental & computational
• Experiment:
Map single-stranded vs double-stranded regions in folded RNA
• How?
Enzymes: S1 nuclease, T1 RNase
Chemicals: kethoxal, DMS
R Knight 2005
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
1) Energy minimization method
What are the assumptions?

Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
Gibbs free energy = ΔG in kcal/mol at 37 °C
= equilibrium stability of structure
lower (more negative) values are more favorable
Is this assumption valid?
in vivo? - this may not hold, but we don't really know
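To make "equilibrium stability" concrete, the sketch below converts a ΔG value into an equilibrium constant using ΔG = -RT ln K. It assumes a simple two-state (folded vs unfolded) picture, which is a simplification introduced here for illustration and not something the slides state; the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_37C = 310.15      # 37 degrees C expressed in kelvin

def equilibrium_constant(delta_g, temperature=T_37C):
    """K = exp(-dG / RT), with dG in kcal/mol (assumed two-state model)."""
    return math.exp(-delta_g / (R_KCAL * temperature))

def fraction_folded(delta_g, temperature=T_37C):
    """Fraction of molecules in the folded state under the two-state assumption."""
    k = equilibrium_constant(delta_g, temperature)
    return k / (1.0 + k)

# A ΔG of 0 means folded and unfolded states are equally populated;
# more negative ΔG shifts the population toward the folded state.
half = fraction_folded(0.0)
mostly_folded = fraction_folded(-5.0)
```

At 37 °C, RT is about 0.6 kcal/mol, so even a few kcal/mol of stabilization shifts the equilibrium strongly toward the folded state.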
Free energy minimization
What are the rules?
[Figure: two stacked base pairs]
A=U stacked on A=U: ΔG = -1.2 kcal/mol
A=U stacked on U=A: ΔG = -1.6 kcal/mol
What gives here?
C Staben 2005
Energy minimization calculations:
Base-stacking is critical
Stacked pairs (top strand / bottom strand)    ΔG (kcal/mol)
AA / UU                                       -1.2
AU / UA or UA / AU                            -1.6
AG / UC, AC / UG, CA / GU, GA / CU            -2.1
CC / GG                                       -4.8
CG / GC                                       -3.0
GC / CG                                       -4.3
GU / UG                                       -0.3
XG / YU, GX / UY                               0
- Tinoco et al.
C Staben 2005
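The stacking values above can be turned into a small lookup to see how a helix's stability adds up from neighboring base pairs. This is a rough sketch under stated assumptions: only the Watson-Crick rows are transcribed (the G-U rows are omitted), the top strand is assumed to be read 5'->3' with the bottom strand written beneath it, and the names `STACK`, `stack_energy`, and `helix_energy` are hypothetical. Loop and other penalty terms are not included.

```python
# Stacking energies (kcal/mol) transcribed from the slide's table.
# Key: (top dinucleotide, bottom dinucleotide written beneath it).
STACK = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6, ("UA", "AU"): -1.6,
    ("AG", "UC"): -2.1, ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1, ("GA", "CU"): -2.1,
    ("CC", "GG"): -4.8,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
}

def stack_energy(top, bottom):
    """Look up one stack, also trying the same stack read from the other strand."""
    if (top, bottom) in STACK:
        return STACK[(top, bottom)]
    flipped = (bottom[::-1], top[::-1])   # rotate the duplex by 180 degrees
    if flipped in STACK:
        return STACK[flipped]
    raise KeyError(f"no stacking value for {top}/{bottom}")

def helix_energy(top, bottom):
    """Sum stacking increments over each pair of adjacent base pairs."""
    if len(top) != len(bottom):
        raise ValueError("strands must have equal length")
    return sum(stack_energy(top[i:i + 2], bottom[i:i + 2])
               for i in range(len(top) - 1))

# Helix GGCA over CCGU: stacks GG/CC (-4.8), GC/CG (-4.3), CA/GU (-2.1).
example = helix_energy("GGCA", "CCGU")
```

The example helix sums to about -11.2 kcal/mol. The same four base pairs in a different order would give a different total, which is why stacking, and not base-pair counts alone, drives the calculation.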
Nearest-neighbor parameters
Most methods for free energy minimization use nearest-neighbor parameters (derived from
experiment) for predicting stability of an RNA secondary structure
(in terms of ΔG at 37 °C)
& most available software packages use the same set of parameters:
Mathews, Sabina, Zuker & Turner, 1999
Energy minimization - calculations:
Total free energy of a specific
conformation for a specific RNA molecule
= sum of incremental energy terms for:
• helical stacking
(sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3, Baxevanis & Ouellette 2005
But how many possible conformations for a single RNA molecule?
Huge number:
Zuker estimates (1.8)^N possible secondary structures for a
sequence of N nucleotides
for 100 nts (small RNA…) =
3 × 10^25 structures!
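The size of that number is easy to check with a few lines of arithmetic. The sketch below evaluates the 1.8^N estimate for N = 100; the enumeration rate of one billion structures per second is a hypothetical figure chosen only to give a sense of scale.

```python
def estimated_structure_count(n, base=1.8):
    """Estimated number of secondary structures for n nucleotides: base**n."""
    return base ** n

count_100 = estimated_structure_count(100)

# Hypothetical rate, for scale only: 1e9 structures evaluated per second.
SECONDS_PER_YEAR = 3600 * 24 * 365
years_at_1e9 = count_100 / 1e9 / SECONDS_PER_YEAR

print(f"{count_100:.1e} structures, {years_at_1e9:.1e} years to enumerate")
```

Even at that optimistic rate, exhaustive enumeration for a 100-nt RNA would take on the order of a billion years.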
Solution? Not exhaustive enumeration…
Dynamic programming
O(N^3) in time
O(N^2) in space/storage
iff pseudoknots excluded, otherwise:
O(N^6) in time
O(N^4) in space
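The O(N^3) time and O(N^2) space behavior can be seen in a Nussinov-style recursion, sketched below. Note the simplification: this version maximizes the number of nested base pairs and does not minimize free energy, so it is a stand-in for the energy-based dynamic programming used by the packages named earlier, not a copy of it. Pseudoknots are excluded because only nested pairs are considered. The name `max_base_pairs` and the `min_loop` default of 3 unpaired bases per hairpin are assumptions.

```python
# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style DP)."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; the table gives O(N^2) space.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):        # subsequence width j - i
        for i in range(n - span):
            j = i + span
            # Split seq[i..j] into two independent parts; O(N) choices of k.
            score = max(best[i][k] + best[k + 1][j] for k in range(i, j))
            # Or pair the two ends and add the best interior solution.
            if (seq[i], seq[j]) in PAIRS:
                score = max(score, best[i + 1][j - 1] + 1)
            best[i][j] = score
    return best[0][n - 1]

hairpin_pairs = max_base_pairs("GGGAAAUCC")
```

Two nested loops over (i, j) with an inner loop over k give the cubic running time. Swapping the "+ 1" for stacking and loop energies, and max for min, is the conceptual step toward an energy-minimizing version.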
2) Comparative sequence analysis (co-variation)
Two basic approaches:

• Algorithms constrained by initial alignment
Much faster, but not as robust as unconstrained
Base-pairing probabilities determined by a partition function
• Algorithms not constrained by initial alignment

Genetic algorithms often used for finding an alignment & set of structures
RNA Secondary structure prediction: Performance?
How to evaluate?
• Not many experimentally determined structures
currently, ~ 50% are rRNA structures
so "Gold Standard" (in absence of tertiary structure):
compare predicted RNA secondary structure with that determined by
comparative sequence analysis (!!??) using benchmark datasets
NOTE: Base-pairs predicted by comparative sequence analysis for large &
small subunit rRNAs are 97% accurate when compared with high resolution
crystal structures! - Gutell, Pace
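A comparison like this is usually scored base pair by base pair. The sketch below shows one common way to do it, assuming each structure is given as a list of (i, j) position pairs. The slides do not define "accuracy" precisely, so the two measures here (sensitivity and positive predictive value) and the function names are assumptions for illustration.

```python
def pair_set(pairs):
    """Normalize base pairs to a set of (i, j) tuples with i < j."""
    return {(min(i, j), max(i, j)) for i, j in pairs}

def sensitivity_and_ppv(predicted, reference):
    """Score predicted base pairs against a reference structure.

    sensitivity = fraction of reference pairs that were predicted
    ppv         = fraction of predicted pairs that are in the reference
    """
    pred, ref = pair_set(predicted), pair_set(reference)
    correct = len(pred & ref)
    sensitivity = correct / len(ref) if ref else 0.0
    ppv = correct / len(pred) if pred else 0.0
    return sensitivity, ppv

# Hypothetical toy structures, for illustration only.
reference = [(1, 20), (2, 19), (3, 18), (4, 17)]
predicted = [(1, 20), (19, 2), (5, 16)]
sens, ppv = sensitivity_and_ppv(predicted, reference)
```

In this toy case two of the four reference pairs are recovered and two of the three predicted pairs are correct, so the two measures differ. Reporting both avoids rewarding a method that simply predicts many pairs.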
RNA Secondary structure prediction: Performance?
1) Energy minimization (via dynamic programming)

73% avg. prediction accuracy - single sequence
2) Comparative sequence analysis
97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs)
much lower if sequence conservation is lower &/or fewer sequences are available for alignment
3) Combined - recent developments:
combine thermodynamics & co-variation
& experimental constraints? IMPROVED RESULTS
RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
1) Extensive comparative sequence analysis to predict tertiary
contacts (co-variation)
e.g., MANIP - Westhof
2) Use experimental data to constrain model building
e.g., MC-SYM - Major
3) Homology modeling using sequence alignment & reference tertiary
structure (not many of these!)
4) Low resolution molecular mechanics
e.g., yammp - Harvey
