In clinical and experimental settings, planning ability is typically assessed using the Tower of London (ToL) or one of its variants. To enhance comparability across studies, a common ToL problem set comprising 4- to 7-move problems was recently suggested. Building on previous theoretical and empirical analyses of problem space and task structure, the development of this problem set took particular account of how structural problem parameters influence the detection of individual differences in planning ability. To assess its adequacy as a clinical and research instrument, the present study evaluated the psychometric properties of the suggested problem set. Results showed a clear, nearly perfect linear increase in task difficulty with the minimum number of moves. Given the broad range of item difficulty, high- and low-achieving subjects could be discriminated well. The split-half reliability (r = .72) and internal consistency (α = .69) of the test scores were satisfactory. Taken together, the ToL problem set evaluated here proved to have good psychometric qualities and constitutes a conceptually sound basis for diagnostic and research purposes.