A note on gene pleiotropy estimation from phylogenetic analysis of protein sequences
Recently, several statistical methods have been independently proposed for estimating the degree (n) of gene pleiotropy (i.e. the capacity of a gene to affect many phenotypes) without knowledge of measurable phenotypic traits. However, the theoretical limitations of these approaches have not been well characterized. In this short note, we show that our previous method based on the phylogeny of protein sequences is, in fact, an effective estimate of a parameter that can be written symbolically as K = min(n,r), where r is the rank of mutations at an amino acid site. Hence, understanding r is crucial for appropriate interpretation of the estimated K, denoted by Ke (the effective gene pleiotropy). Indeed, when a protein sequence alignment is used to estimate Ke by this method, Ke can be interpreted as an effective estimate of n when n ≤ 20, as long as the phylogeny is sufficiently large. If n > 20, Ke → 20, even though the true n could be much higher.
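The ceiling implied by K = min(n,r) can be illustrated with a minimal sketch. It is not the estimation procedure itself: the function name `effective_pleiotropy` is hypothetical, and the default r = 20 is an assumption that simply takes the limiting value stated above for a sufficiently large phylogeny.

```python
def effective_pleiotropy(n: int, r: int = 20) -> int:
    """Return K = min(n, r), the quantity the estimate Ke targets.

    n: true degree of gene pleiotropy.
    r: rank of mutations at an amino acid site (assumed here to
       reach 20, the limit stated for a sufficiently large phylogeny).
    """
    if n < 0 or r < 0:
        raise ValueError("n and r must be non-negative")
    return min(n, r)


# K tracks n up to the ceiling r, then stays flat however large n becomes.
for true_n in (3, 12, 20, 50, 500):
    print(f"n = {true_n:3d}  ->  K = {effective_pleiotropy(true_n)}")
```

Under these assumptions, any true n above 20 gives the same K, so an estimate near 20 is best read as a lower bound on n rather than as a point estimate.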