Combining Scores Based on Compensatory and Noncompensatory Scoring Rules to Assess Resident Readiness for Unsupervised Practice: Implications From a National Primary Care Certification Examination in Japan



Purpose: Competence decisions in health professions education require combining scores from multiple sources and making pass–fail decisions under noncompensatory (all subcomponents must be passed) and compensatory scoring rules. This study investigates the consequences of combining scores, the resulting reliability, and the implications for validity, using a national examination with subcomponent assessments.

Method: National data were used from three years (2015, 2016, and 2017) of the Japan Primary Care Association Board Certification Examination, which has four subcomponent assessments: Clinical Skills Assessment–Integrated Clinical Encounter (CSA-ICE), CSA–Communication and Interpersonal Skills (CSA-CIS), Multiple-Choice Questions (MCQ), and Portfolio. Generalizability theory was used to estimate variance components and reliability. Kane’s composite reliability and kappa decision consistency were used to examine the impact of using compensatory and noncompensatory scoring.

Results: Mean scores (n = 251) on the CSA-ICE, CSA-CIS, MCQ, and Portfolio subcomponent assessments were, respectively, 61% (SD = 11%), 67% (SD = 13%), 74% (SD = 8%), and 65% (SD = 9%). Component-specific Φ-coefficient reliability ranged from 0.57 to 0.67 for the CSA-ICE, from 0.50 to 0.60 for the CSA-CIS, from 0.65 to 0.76 for the MCQ, and from 0.87 to 0.89 for the Portfolio. Under a completely noncompensatory scoring approach applied to all four subcomponents, decision-consistency reliability was 0.33. Fully compensatory scoring yielded reliability of 0.86.

Conclusions: Assessing a range of abilities in making entrustment decisions requires considering the balance of assessment tools measuring distinct but related competencies. These results indicate that noncompensatory pass–fail decision making, which seems more congruent with competency-based education, may lead to much lower reliability than compensatory decision making when several assessment subcomponents are used.
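
To make the difference between the two scoring rules concrete, the following is a minimal sketch of how each rule turns four subcomponent scores into one pass–fail decision. The cut score of 60%, the equal weights, the function names, and the example resident's scores are all hypothetical and chosen only for illustration; none of them comes from the examination or the study.

```python
# Hypothetical cut score (percent); NOT taken from the examination.
CUT = 60.0
COMPONENTS = ("CSA-ICE", "CSA-CIS", "MCQ", "Portfolio")


def passes_noncompensatory(scores, cut=CUT):
    """Pass only if every subcomponent score meets the cut score."""
    return all(scores[c] >= cut for c in COMPONENTS)


def passes_compensatory(scores, cut=CUT, weights=None):
    """Pass if the weighted composite score meets the cut score.

    Equal weights are an assumption made for this sketch.
    """
    if weights is None:
        weights = {c: 1.0 for c in COMPONENTS}
    total_weight = sum(weights[c] for c in COMPONENTS)
    composite = sum(weights[c] * scores[c] for c in COMPONENTS) / total_weight
    return composite >= cut


# Made-up resident: weak on one subcomponent, strong on the others.
resident = {"CSA-ICE": 55.0, "CSA-CIS": 70.0, "MCQ": 80.0, "Portfolio": 65.0}

print(passes_compensatory(resident))     # composite is 67.5, so True
print(passes_noncompensatory(resident))  # CSA-ICE is below 60, so False
```

In this sketch the noncompensatory rule depends on four separate pass decisions all holding at once, so measurement error near the cut score on any single subcomponent can flip the overall outcome, whereas the compensatory rule makes one decision on an averaged composite.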
