SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments

Abstract

Identifying which mutation(s) within a given genotype are responsible for an observable phenotype is important in many aspects of molecular biology. Here, we present SigniSite, an online application for subgroup-free residue-level genotype–phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo depicting the strength of the phenotype association of each residue, and a heat-map identifying ‘hot’ or ‘cold’ regions. SigniSite was benchmarked against SPEER, a state-of-the-art method for the prediction of specificity-determining positions (SDPs), using two data sets: a set of human immunodeficiency virus protease-inhibitor genotype–phenotype data with corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found to outperform SPEER. SigniSite is available at: http://www.cbs.dtu.dk/services/SigniSite/.
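To make the input and output described above concrete, the following is a minimal sketch of subgroup-free residue-level association on an aligned set of sequences, each paired with a real-valued phenotype. It is an illustration only and is not claimed to be SigniSite's actual statistic: it assumes a rank-sum z-score (normal approximation, no tie correction, no multiple-testing correction), and the names `rank_values` and `residue_z_scores` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def rank_values(values):
    """Return 1-based ranks of the values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def residue_z_scores(sequences, phenotypes):
    """Score each (position, residue) pair by a rank-sum z-score.

    Assumption (not taken from the abstract): sequences carrying the residue
    are compared with all other sequences via the rank sum of their
    phenotype values, using the normal approximation without tie correction.
    Positive scores mean the residue co-occurs with high phenotype values.
    """
    n = len(sequences)
    ranks = rank_values(phenotypes)
    scores = {}
    for pos in range(len(sequences[0])):
        groups = defaultdict(list)
        for seq, rank in zip(sequences, ranks):
            groups[seq[pos]].append(rank)
        for residue, residue_ranks in groups.items():
            m = len(residue_ranks)
            if m == n:
                continue  # fully conserved column carries no signal
            expected = m * (n + 1) / 2
            variance = m * (n - m) * (n + 1) / 12
            scores[(pos, residue)] = (sum(residue_ranks) - expected) / sqrt(variance)
    return scores


# Toy alignment: only position 1 (0-based) varies, and R tracks the higher values.
sequences = ["AKL", "AKL", "ARL", "ARL"]
phenotypes = [1.0, 2.0, 3.0, 4.0]
scores = residue_z_scores(sequences, phenotypes)
for (pos, residue), z in sorted(scores.items()):
    print(pos, residue, round(z, 3))
```

No subgroups or binary labels are defined in advance here: each residue at each position is scored directly against the continuous phenotype. A real analysis would additionally need a significance threshold with correction for the number of tests.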
