Forecasting residue-residue contact prediction accuracy

    loading  Checking for direct PDF access through Ovid



Apart from meta-predictors, most of today's methods for residue-residue contact prediction are based entirely on Direct Coupling Analysis (DCA) of correlated mutations in multiple sequence alignments (MSAs). These methods are on average ˜40% correct for the 100 strongest predicted contacts in each protein. The end-user who works on a single protein of interest will not know if predictions are either much more or much less correct than 40%, which is especially a problem if contacts are predicted to steer experimental research on that protein.


We designed a regression model that forecasts the accuracy of residue-residue contact prediction for individual proteins with an average error of 7 percentage points. Contacts were predicted with two DCA methods (gplmDCA and PSICOV). The models were built on parameters that describe the MSA, the predicted secondary structure, the predicted solvent accessibility and the contact prediction scores for the target protein. Results show that our models can be also applied to the meta-methods, which was tested on RaptorX.

Availability and implementation

All data and scripts are available from


Supplementary information

Supplementary data are available at Bioinformatics online.

Related Topics

    loading  Loading Related Articles