BRAliBase I: Benchmarking structure prediction algorithms.


Supplementary data for BRAliBase I:

Gardner PP & Giegerich R (2004) A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinformatics 5(1):140.



Data types: Structural Alignment, Phylogenetic Tree, Secondary Structure

Similarity levels: High Similarity, Med Similarity

Data-sets: E. coli LSU rRNA, E. coli SSU rRNA, E. coli RNase P, S. cerevisiae tRNA-Phe

Software:

Use compare_ct.pl to compute the sensitivity, specificity and MCC (Matthews correlation coefficient) values reported in the paper above. The script requires the paul_specials library of functions.
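
As a rough illustration of these measures (not the compare_ct.pl implementation), the sketch below compares a predicted structure against a reference, both given in dot-bracket notation. The function names are hypothetical. It assumes textbook definitions: sensitivity is TP/(TP+FN), the positive predictive value TP/(TP+FP) stands in for what RNA papers often call specificity or selectivity, and true negatives are counted over all L(L-1)/2 possible pairs. Any special handling of false positives in the published measures is not reproduced here.

```python
import math


def pairs_from_dot_bracket(structure):
    """Return the set of base pairs (i, j), 0-based with i < j."""
    stack, pairs = [], set()
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def compare_structures(reference, predicted):
    """Score a predicted structure against a reference of equal length."""
    if len(reference) != len(predicted):
        raise ValueError("structures must have the same length")
    ref = pairs_from_dot_bracket(reference)
    pred = pairs_from_dot_bracket(predicted)
    n = len(reference)

    tp = len(ref & pred)   # pairs present in both structures
    fp = len(pred - ref)   # predicted pairs absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed
    # Assumption: every other possible pair (i < j) is a true negative.
    tn = n * (n - 1) // 2 - tp - fp - fn

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "mcc": mcc}


if __name__ == "__main__":
    print(compare_structures("((..))", "(....)"))
```

Structures stored in CT format would first need converting to pair lists; only dot-bracket input is handled here.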

The algorithms marked "*" below were tested on the data-sets above. See the preprint, fig1, fig2, fig3 or my Bielefeld 2004 seminar for further details.

Some notes on structure formats.


The following algorithms were compared in this study:

Single sequence RNA folding algorithms:

RNAfold*

RNA structure prediction program that ships with the Vienna package. Also available through a web interface.

Mfold*

Mike Zuker's famous minimum free energy (MFE) RNA structure prediction algorithm. A more direct link here.

Sfold*

Statistical sampling of all possible structures. The sampling is weighted by partition function probabilities.
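
The idea of sampling structures in proportion to their partition function probabilities can be sketched on a toy scale. The example below is not Sfold's algorithm: it assumes a small, already enumerated list of candidate structures with known free energies (in kcal/mol), whereas a real implementation would sample from the full ensemble without enumerating it. The function names, the energies and the value of RT are illustrative assumptions.

```python
import math
import random

# Assumption: RT in kcal/mol at about 37 C (0.0019872 kcal/(mol K) * 310.15 K).
RT = 0.616


def boltzmann_probabilities(energies, rt=RT):
    """Return P(s) = exp(-E(s)/RT) / Z for each energy in the list."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)       # partition function over the toy ensemble
    return [w / z for w in weights]


def sample_structures(structures, energies, n_samples, seed=None):
    """Draw structures with probability proportional to their Boltzmann weight."""
    rng = random.Random(seed)
    probs = boltzmann_probabilities(energies)
    return rng.choices(structures, weights=probs, k=n_samples)


if __name__ == "__main__":
    candidates = ["((..))", "(....)", "......"]
    free_energies = [-3.0, -1.0, 0.0]  # hypothetical values
    print(boltzmann_probabilities(free_energies))
    print(sample_structures(candidates, free_energies, 10, seed=1))
```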

RNA folding with alignment (Plan A):

Pfold*

Folds alignments using an SCFG (stochastic context-free grammar) trained on rRNA alignments. The alignment length limit is 500.

RNAalifold*

Folds alignments using a combination of free-energy and a covariation measure. Ships with the Vienna package. Also available as a web server.

ILM*

Iterated Loop Matching. Evaluates stems in an alignment using a combination of free-energy and mutual information. Iteratively selects high scoring stems.
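
To make the mutual information term concrete, the sketch below computes the textbook mutual information (in bits) between two columns of an alignment. It is only an illustration of the quantity, not ILM's scoring scheme: how ILM combines it with free-energy terms is not shown. The function name and the choice to skip sequences with a gap in either column are assumptions.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j, gap="-"):
    """Mutual information in bits between alignment columns i and j.

    Sequences with a gap in either column are ignored (an assumption).
    """
    observed = [(s[i], s[j]) for s in alignment if s[i] != gap and s[j] != gap]
    n = len(observed)
    if n == 0:
        return 0.0

    pair_counts = Counter(observed)
    left_counts = Counter(a for a, _ in observed)
    right_counts = Counter(b for _, b in observed)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    toy_alignment = ["GAC", "CAG", "AAU", "UAA"]
    print(mutual_information(toy_alignment, 0, 2))  # covarying columns
    print(mutual_information(toy_alignment, 0, 1))  # column 1 is constant
```

Columns that covary perfectly, as in the first call, score high, while a fully conserved column carries no mutual information with any other column.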

RNA folding without alignment (Plan B):

caRNAc*

Computer Alignment of RNA by Cofolding. VERY fast and VERY selective.

Dynalign*

Uses a "full energy model" and comparative information to align and fold 2 sequences. Restricts the 'span' of base-pairs to improve CPU time.

Foldalign*

Predicts conserved local sequence and hairpin structures using CONSENSUS and CLUSTAL-like heuristics. Primarily used to infer cis-regulatory elements.

RNA structure alignment (Plan C):

RNAforester*

Compares and aligns RNA secondary structures via a "forest alignment" approach.

MARNA*

MARNA aligns RNAs using both primary sequence and secondary structure. It is based on pairwise comparisons scored by the costs of edit operations, which divide into edit operations on arcs and edit operations on bases.





Paul Gardner, <pg5@sanger.ac.uk>
Dept. of Evolutionary Biology, University of Copenhagen,
Universitetsparken 15, 2100 Copenhagen Ø, Denmark.

Time-stamp: 2005-10-28 10:21:24 pgardner