The design of well-powered in vivo preclinical studies is key to building the knowledge of disease physiology needed to identify and effectively test potential antiobesity drug targets. However, because of the complexity of the obese phenotype, the variability of macroscopic end points such as food intake and body composition, both within and between study animals, is poorly understood. This, combined with limitations inherent in the measurement of certain end points, presents challenges to study design that can have significant consequences for an antiobesity program. Here, we analyze a large, longitudinal study of mouse food intake and body composition during diet perturbation to quantify the variability and interaction of the key metabolic end points. To demonstrate how conclusions can change as a function of study size, we show that a simulated preclinical study properly powered for one end point may lead to false conclusions based on secondary end points. We then propose guidelines for end point selection and study size estimation under different conditions to facilitate proper power calculation and more successful in vivo study design.
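The point about secondary end points can be illustrated with a minimal sketch of a two-group power calculation. It is not the method used in the study: it assumes a normal approximation to a two-sided, two-sample comparison with equal group sizes and equal variances, and every effect size and standard deviation in it is a hypothetical placeholder rather than a value from the mouse data. The function names `n_per_group` and `achieved_power` are likewise introduced here for illustration only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def n_per_group(effect, sd, alpha=0.05, power=0.8):
    """Animals per group needed to detect `effect` (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


def achieved_power(effect, sd, n, alpha=0.05):
    """Approximate power for `n` animals per group.

    Ignores the negligible probability of rejecting in the wrong tail.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n)
    return _STD_NORMAL.cdf(abs(effect) / standard_error - z_alpha)


if __name__ == "__main__":
    # Hypothetical primary end point: a large, low-noise difference.
    n = n_per_group(effect=2.0, sd=1.5)
    print("animals per group for the primary end point:", n)
    print("power, primary end point:", round(achieved_power(2.0, 1.5, n), 2))

    # Hypothetical secondary end point: a smaller effect relative to its
    # variability, evaluated with the group size chosen for the primary one.
    print("power, secondary end point:", round(achieved_power(0.5, 1.0, n), 2))
```

Under these assumed inputs, the group size that reaches the target power for the primary end point leaves the secondary end point well below it, so a null result on the secondary end point would not be informative. With group sizes this small, an exact calculation based on the t distribution would call for slightly more animals than the normal approximation suggests.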