Cognitive tasks that are too hard or too easy produce imprecise measurements of ability, which in turn attenuate group differences and can lead to inaccurate conclusions in clinical research. We aimed to illustrate this problem using a popular experimental measure of working memory, the N-back task, and to suggest corrective strategies for measuring working memory and other cognitive deficits in schizophrenia. Samples of undergraduates (n = 42), community controls (n = 25), outpatients with schizophrenia (n = 33), and inpatients with schizophrenia (n = 17) completed the N-back. Predictors of task difficulty (load, number of word syllables, and presentation time, among others) were experimentally manipulated. Using a methodology that combined techniques from signal detection theory and item response theory, we examined predictors of difficulty and precision on the N-back task. Load and item type were the 2 strongest predictors of difficulty. Measurement precision was associated with ability, and ability varied by group; as a result, patients were measured more precisely than controls. Although difficulty was well matched to the ability levels of impaired examinees, most task conditions were too easy for nonimpaired participants. In a simulation study, N-back tasks consisting primarily of 1- and 2-back load conditions were unreliable and attenuated effect sizes (Cohen’s d) by as much as 50%. The results suggest that N-back tasks, as commonly designed, may underestimate patients’ cognitive deficits because their measurement properties are not optimized. Overall, this cautionary study provides a template for identifying and correcting measurement problems in clinical studies of abnormal cognition.
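The link between unreliable measurement and attenuated effect sizes can be illustrated with a minimal sketch. It is not the signal detection and item response theory methodology used in the study; it assumes a simpler classical test theory model in which an observed score is a true score plus independent error, so that the expected observed Cohen's d is the true d multiplied by the square root of the within-group reliability. The function names, the true effect size of 1.0, and the reliability of 0.25 are hypothetical values chosen for illustration only.

```python
import math
import random
import statistics


def attenuated_d(true_d: float, reliability: float) -> float:
    """Expected observed Cohen's d under classical test theory.

    Assumes observed = true + independent error, with `reliability`
    being the within-group ratio var(true) / var(observed).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return true_d * math.sqrt(reliability)


def cohens_d(group_a, group_b) -> float:
    """Cohen's d using the pooled within-group standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


def simulate_observed_d(true_d: float, reliability: float,
                        n_per_group: int = 20000, seed: int = 0) -> float:
    """Simulate two groups whose true scores differ by `true_d` SD units,
    add measurement error, and return the observed Cohen's d."""
    rng = random.Random(seed)
    # True-score SD is 1, so error variance (1 - r) / r gives reliability r.
    error_sd = math.sqrt((1.0 - reliability) / reliability)
    controls = [rng.gauss(true_d, 1.0) + rng.gauss(0.0, error_sd)
                for _ in range(n_per_group)]
    patients = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, error_sd)
                for _ in range(n_per_group)]
    return cohens_d(controls, patients)


if __name__ == "__main__":
    # Hypothetical values: true d = 1.0, reliability = 0.25.
    print("expected d:", attenuated_d(1.0, 0.25))
    print("simulated d:", simulate_observed_d(1.0, 0.25))
```

Under these assumptions, a 50% attenuation of Cohen's d corresponds to a within-group reliability of 0.25, since `attenuated_d(1.0, 0.25)` evaluates to 0.5. The simulated value should fall close to that figure, with small sampling variation.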