The exponential growth in the number of genomic variants uncovered by next-generation sequencing necessitates efficient and accurate computational analyses to predict their functional effects. A number of computational methods have been developed for this task, but few unbiased comparisons of their performance are available. To fill this gap, the Critical Assessment of Genome Interpretation (CAGI) comprehensively assesses phenotypic predictions on newly collected experimental datasets. Here, we present the results of the SUMO conjugase challenge, in which participants predicted the functional effects of missense mutations in the human SUMO-conjugating enzyme UBE2I. The predictors perform similarly to one another, and all remain far from perfect. Evolutionary information from sequence alignments dominates the success: deleterious mutations at conserved positions and benign mutations at variable positions are predicted accurately. Prediction accuracy for other mutations remains unsatisfactory, and this fast-growing field of research has yet to learn how to use spatial structure information to improve predictions significantly.
Competitive Growth Scores of UBE2I Mutations. A) A histogram of scores. Bars are colored on a gradient from red (deleterious mutations) through white (wild type) to blue (advantageous mutations). The same color gradient is applied to the corresponding residues in surface representations of the UBE2I structure for B) deleterious and C) advantageous mutations. The consensus substrate tetrapeptide ΨKxE (magenta) and the C-terminal SUMO peptide (green) are shown, and selected residues near the active site are labeled.