RNA Structure Prediction Software and Analysis
1) Energy minimization
(thermodynamics)
• Software:
Mfold - Zuker
Vienna RNA Package - Hofacker
RNAstructure - Mathews
Sfold - Ding & Lawrence
R Knight 2005
Secondary structure prediction strategies
2) Comparative sequence analysis
(co-variation)
• Software:
ConStruct
Alifold
Pfold
FOLDALIGN
Dynalign
• Experiment:
Map single-stranded vs double-stranded regions in
folded RNA
• How?
Enzymes: S1 nuclease, T1 RNase
Chemicals: kethoxal, DMS
R Knight 2005
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
1) Energy minimization method
A=U base pair stacked on A=U: ΔG = -1.2 kcal/mole
A=U base pair stacked on U=A: ΔG = -1.6 kcal/mole
What gives here?
C Staben 2005
Energy minimization calculations:
Base-stacking is critical
Stacking energies (kcal/mole), written as upper strand / lower strand:
AA/UU             -1.2
AU/UA or UA/AU    -1.6
CG/GC             -3.0
GC/CG             -4.3
CC/GG             -4.8
XG/YU, GX/UY       0
- Tinoco et al.
C Staben 2005
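As a rough illustration of how stacking energies add up along a helix, the sketch below sums the values from the stacking table above over adjacent base pairs. The function name `helix_energy` and the strand representation are hypothetical, and the code assumes a stack has the same energy when read from the opposite strand. Stacks that do not appear in the table raise an error rather than being guessed.

```python
# Stacking energies in kcal/mole, copied from the table above.
# Key: (upper-strand dinucleotide, lower-strand dinucleotide), as printed.
STACK_ENERGY = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6,
    ("UA", "AU"): -1.6,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
    ("CC", "GG"): -4.8,
}


def helix_energy(upper, lower):
    """Sum stacking energies over adjacent base pairs of one helix.

    upper[i] is assumed to pair with lower[i]. This is a sketch: it ignores
    loops, bulges and every other term a real folding program includes.
    """
    if len(upper) != len(lower):
        raise ValueError("both strands must have the same length")
    total = 0.0
    for i in range(len(upper) - 1):
        stack = (upper[i:i + 2], lower[i:i + 2])
        # Assumption: the same stack read from the other strand
        # (e.g. UU/AA for AA/UU) has the same energy.
        flipped = (stack[1][::-1], stack[0][::-1])
        if stack in STACK_ENERGY:
            total += STACK_ENERGY[stack]
        elif flipped in STACK_ENERGY:
            total += STACK_ENERGY[flipped]
        else:
            raise KeyError(f"stack {stack} is not in the table")
    return total


# A=U on A=U (-1.2) followed by A=U on U=A (-1.6): about -2.8 kcal/mole
example_energy = helix_energy("AAU", "UUA")
```

The total for a three-base-pair helix is the sum of two stacks, which is why the order of neighbouring pairs matters and not only the base composition.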
Nearest-neighbor parameters
Fig 6.3
Baxevanis &
Ouellette 2005
But how many possible conformations for a single RNA molecule?
Huge number:
Zuker estimates 1.8^N possible secondary structures for a
sequence of N nucleotides
for 100 nts (small RNA…) ≈
3 × 10^25 structures!
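The arithmetic behind that estimate can be checked in a couple of lines. The function name is hypothetical; the base 1.8 is the figure quoted above.

```python
def estimated_structure_count(n, base=1.8):
    """Approximate number of secondary structures for n nucleotides (base ** n)."""
    return base ** n


count_100 = estimated_structure_count(100)
print(f"{count_100:.1e}")  # 3.4e+25
```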
Solution? Not exhaustive enumeration…
Dynamic programming
O(N^3) in time
O(N^2) in space/storage
iff pseudoknots excluded, otherwise:
O(N^6), time
O(N^4), space
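To make the O(N^3) time and O(N^2) space figures concrete, here is a minimal dynamic programming sketch. It is not the energy minimization used by Mfold or the Vienna package: it swaps in the simpler Nussinov-style recurrence, which only maximizes the number of base pairs, because the table-filling pattern is the same. The function name, the allowed pair set (including G-U) and the minimum hairpin loop of 3 are assumptions, and pseudoknots are excluded by construction.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    dp[i][j] holds the best score for the subsequence seq[i..j].
    The table has N^2 entries and each entry scans up to N split
    points, which gives O(N^2) space and O(N^3) time.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with some k, leaving at least min_loop
            # unpaired bases between them
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


pairs_in_hairpin = max_base_pairs("GGGAAAUCCC")  # three G-C pairs closing a loop
```

Replacing the "+ 1" per pair with stacking and loop energies, and maximizing with minimizing, leads toward the thermodynamic version, at the cost of more bookkeeping per table entry.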
2) Comparative sequence analysis
(co-variation)
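One common way to quantify co-variation between two alignment columns is mutual information. The slide does not name a specific measure, so treat this choice, the function name and the toy alignment as assumptions. Columns that change together, as compensating base-pair substitutions do, score high. Columns that are fully conserved score zero, which is a known blind spot of this measure.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    alignment is a list of equal-length, already aligned sequences.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    joint = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: the first and last columns always form a
# complementary pair, while the middle columns are fully conserved.
toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
covarying = mutual_information(toy_alignment, 0, 4)   # 2 bits
conserved = mutual_information(toy_alignment, 0, 1)   # 0 bits
```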
How to evaluate?
• Not many experimentally determined structures
currently, ~ 50% are rRNA structures
so "Gold Standard" (in absence of tertiary structure):
compare the predicted RNA secondary structure with that
determined by comparative sequence analysis (!!??) using Benchmark
Datasets
NOTE: Base-pairs predicted by comparative sequence analysis for large &
small subunit rRNAs are 97% accurate when compared with high resolution
crystal structures! - Gutell, Pace
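Comparing a predicted structure with a reference structure usually comes down to comparing two sets of base pairs. The sketch below reports sensitivity (fraction of reference pairs recovered) and positive predictive value (fraction of predicted pairs that are in the reference). These two metrics and the function name are assumptions, since the slide does not say which score the benchmarks use.

```python
def compare_base_pairs(predicted, reference):
    """Return (sensitivity, ppv) for predicted vs. reference base pairs.

    Each argument is an iterable of (i, j) position pairs; the order
    within a pair does not matter.
    """
    predicted = {tuple(sorted(pair)) for pair in predicted}
    reference = {tuple(sorted(pair)) for pair in reference}
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical example: two of three reference pairs are recovered,
# and one predicted pair is not in the reference.
reference_pairs = [(0, 9), (1, 8), (2, 7)]
predicted_pairs = [(9, 0), (1, 8), (3, 6)]
sensitivity, ppv = compare_base_pairs(predicted_pairs, reference_pairs)
```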
RNA Secondary structure prediction: Performance?