In multidimensional forced-choice (MFC) questionnaires, items measuring different attributes are presented in blocks, and participants have to rank-order the items within each block (fully or partially). Such comparative formats can reduce the impact of numerous response biases that often affect single-stimulus items (also known as rating or Likert scales). However, if scored with traditional methodology, MFC instruments produce ipsative data, whereby all individuals have a common total test score. Ipsative scoring distorts individual profiles (it is impossible to achieve all high or all low scale scores), construct validity (covariances between scales must sum to zero), criterion-related validity (validity coefficients must sum to zero), and reliability estimates. We argue that these problems are caused by inadequate scoring of forced-choice items and advocate the use of item response theory (IRT) models based on an appropriate response process for comparative data, such as Thurstone's law of comparative judgment. We show that when Thurstonian IRT modeling is applied (Brown & Maydeu-Olivares, 2011), even existing forced-choice questionnaires with challenging features can be scored adequately and that the IRT-estimated scores are free from the problems of ipsative data.
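
The constraints of ipsative data described above follow directly from the fixed total score: if every person's scale scores sum to the same constant, each scale's covariances with the remaining scales (and with any external criterion) must sum to zero. The following sketch illustrates this numerically under simplifying assumptions not taken from the text (blocks of k items, one per trait, scored by summing the within-block ranks 1..k; all simulated values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_blocks, k = 200, 10, 4  # k traits, one item per trait per block

# Traditional forced-choice scoring: within each block a person rank-orders
# the k items (ranks 1..k); a trait's score is the sum of the ranks its
# items received across all blocks. Ranks here are random for illustration.
scores = np.stack([
    np.stack([rng.permutation(k) + 1 for _ in range(n_blocks)]).sum(axis=0)
    for _ in range(n_people)
])  # shape (n_people, k)

# Property 1: every person has the same total score (ipsative data).
totals = scores.sum(axis=1)
assert np.all(totals == totals[0])  # always n_blocks * k * (k + 1) / 2

# Property 2: each row of the scale covariance matrix sums to zero,
# because cov(X_i, sum_j X_j) = cov(X_i, constant) = 0.
cov = np.cov(scores, rowvar=False)
assert np.allclose(cov.sum(axis=1), 0.0)

# Property 3: covariances with any external criterion also sum to zero.
criterion = rng.normal(size=n_people)
covs_with_y = [np.cov(scores[:, j], criterion)[0, 1] for j in range(k)]
assert abs(sum(covs_with_y)) < 1e-8
```

Note that these constraints hold regardless of the true trait values, which is why traditional ipsative scoring distorts construct and criterion-related validity by construction rather than by sampling accident.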