Different phenotypes have increasingly been used as tools for clinical characterization of frailty among older adults. Although there have been studies about the comparability and effectiveness of various simplifications and approximations of existing frailty phenotypes for risk prediction, there have been no studies in which investigators evaluated the stability of the clinical characterization achieved. In the present study, we used baseline (1992–1996) data from 786 community-dwelling women who were 70–79 years of age in the Women's Health and Aging Study I and II to compare physical frailty phenotypes (PFPs). Using the 5 criteria set forth by Fried, we created 15 PFPs that were positive for various combinations of 3 or 4 of those criteria and compared them with the PFP that included all 5 criteria in order to assess construct validity with regard to frailty syndrome characterization and predictive validity for adverse outcomes of aging. All PFPs exhibited high specificity and negative predictive values for identifying frailty syndrome. Three-item PFPs were insensitive but were the best performers for positive predictive value, with the highest positive predictive value of 0.86 seen in the PFP characterized by the combination of weakness, exhaustion, and weight loss. In comparison, the 5-criterion PFP achieved a sensitivity of 0.82 but a positive predictive value of only 0.53. With regard to predictive validity, it was not merely the number of criteria used to characterize the PFPs but rather the specific criteria combinations that predicted the risk of adverse outcomes. Our findings show that there are clinically important contexts in which simplified PFPs cannot be used interchangeably.
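
As an illustration of the arithmetic behind the comparison, the following minimal sketch enumerates the 3- and 4-criterion subsets of 5 criteria (10 + 5 = 15) and computes the standard accuracy measures from a 2x2 table. It is not the study's analysis code. The criterion labels are shorthand chosen here for the five Fried criteria, the function names are hypothetical, and the counts passed to `diagnostic_metrics` are invented for illustration and are not study data. How each PFP was scored and how the reference frailty syndrome was defined are not specified above, so neither is modeled.

```python
from itertools import combinations

# Shorthand labels for the five Fried criteria (labels are assumptions for this sketch).
CRITERIA = ("weakness", "slowness", "exhaustion", "low_activity", "weight_loss")


def simplified_phenotypes(criteria=CRITERIA):
    """Return every 3- and 4-criterion subset of the given criteria."""
    return [combo for k in (3, 4) for combo in combinations(criteria, k)]


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 accuracy measures from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases flagged positive
        "specificity": tn / (tn + fp),  # share of non-cases flagged negative
        "ppv": tp / (tp + fp),          # share of positives that are true cases
        "npv": tn / (tn + fn),          # share of negatives that are non-cases
    }


phenotypes = simplified_phenotypes()
print(len(phenotypes))  # prints 15 (10 three-item + 5 four-item subsets)

# Hypothetical counts for illustration only, NOT data from the study.
example = diagnostic_metrics(tp=20, fp=5, fn=30, tn=200)
print(example)
```

With these invented counts, a phenotype can have a high positive predictive value while missing most true cases, which is the pattern described for the 3-item PFPs.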