Purpose
Competence decisions in health professions education require combining scores from multiple sources into pass–fail decisions, using either noncompensatory rules (every subcomponent must be passed) or compensatory rules (strength in one subcomponent can offset weakness in another). This study investigates the consequences of combining scores, the reliability of the resulting decisions, and the implications for validity, using a national examination with subcomponent assessments.

Method
National data were used from three years (2015, 2016, and 2017) of the Japan Primary Care Association Board Certification Examination, with four subcomponent assessments: Clinical Skills Assessment–Integrated Clinical Encounter (CSA-ICE), CSA–Communication and Interpersonal Skills (CSA-CIS), Multiple-Choice Questions (MCQ), and Portfolio. Generalizability theory was used to estimate variance components and reliability. Kane's composite reliability and kappa decision consistency were used to examine the impact of using compensatory and noncompensatory scoring.

Results
Mean performance (n = 251) on the CSA-ICE, CSA-CIS, MCQ, and Portfolio subcomponent assessments was, respectively, 61% (SD = 11%), 67% (SD = 13%), 74% (SD = 8%), and 65% (SD = 9%); component-specific Φ-coefficient reliability ranged, respectively, from 0.57 to 0.67; 0.50 to 0.60; 0.65 to 0.76; and 0.87 to 0.89. Using a completely noncompensatory scoring approach on all four subcomponents, decision-consistency reliability was 0.33. Fully compensatory scoring yielded a reliability of 0.86.

Conclusions
Making entrustment decisions across a range of abilities requires balancing assessment tools that measure distinct but related competencies. These results indicate that noncompensatory pass–fail decision making, although it seems more congruent with competency-based education, may yield much lower reliability than compensatory decision making when several assessment subcomponents are used.
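The contrast between the two scoring rules can be sketched in a few lines of code. This is an illustrative sketch only: the cut scores, equal weights, and candidate profile below are hypothetical assumptions, not values taken from the study.

```python
# Hypothetical illustration of noncompensatory vs. compensatory pass-fail
# rules for the four subcomponents named in the abstract. All cut scores
# and weights are assumed for illustration, not drawn from the study.

CUT_SCORES = {"CSA-ICE": 0.60, "CSA-CIS": 0.60, "MCQ": 0.60, "Portfolio": 0.60}
WEIGHTS = {"CSA-ICE": 0.25, "CSA-CIS": 0.25, "MCQ": 0.25, "Portfolio": 0.25}
COMPOSITE_CUT = 0.60  # assumed cut score on the weighted composite

def passes_noncompensatory(scores):
    """Pass only if every subcomponent meets its own cut score."""
    return all(scores[k] >= CUT_SCORES[k] for k in CUT_SCORES)

def passes_compensatory(scores):
    """Pass if the weighted composite meets the composite cut score;
    a strong subcomponent can offset a weak one."""
    composite = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    return composite >= COMPOSITE_CUT

# A hypothetical candidate near the abstract's reported means:
# strong on MCQ, just below the cut on CSA-ICE.
candidate = {"CSA-ICE": 0.58, "CSA-CIS": 0.67, "MCQ": 0.74, "Portfolio": 0.65}
print(passes_noncompensatory(candidate))  # False: fails the CSA-ICE cut
print(passes_compensatory(candidate))     # True: composite is 0.66
```

A candidate like this one is classified differently by the two rules, which is why the choice of rule changes which small measurement errors flip a decision, and hence the decision-consistency reliability.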