To determine the number of assessors needed to reliably assess medical students' overall clinical performance during the OB/GYN clerkship.
BACKGROUND:
Reliable assessment of students' clinical performance during clerkships provides important information for decisions regarding grades and advancement. However, the assessment ratings students receive can vary based on many factors, and there are no clear data regarding the number of assessors needed to obtain a reliable assessment of students' clinical performance on the OB/GYN clerkship.
METHODS:
During the 2015-2016 OB/GYN clerkship, faculty and residents completed one assessment per student. Using the overall performance score for each student, we performed generalizability analysis to determine the number of assessors needed to achieve an acceptable threshold of reliability (G=0.7).
RESULTS:
Students’ mean overall performance score was 6.38±1.21 (scale of 1-10). Assessors accounted for 90.1% of the variance in scores and students for 9.9%. The generalizability estimate for eight assessors was G=0.469. For other clerkships, G coefficients for eight assessors ranged from 0.000 to 0.795. Decision studies suggested that 17 assessors would be needed to achieve G=0.7 for the OB/GYN clerkship, compared with 4-12 assessors for other clerkships.
DISCUSSION:
Much of the variation in students’ overall performance scores can be attributed to assessors rather than students. We suggest that clerkships with high variability among assessment scores find alternative ways to interpret these scores. For example, to adjust for assessor stringency or leniency, we currently convert each assessment to a Z score: the number of standard deviations the student's score lies from that assessor's mean student evaluation score.
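The two calculations mentioned above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study. The function names, the simplified one-facet model (student variance versus a single pooled assessor/error component), the use of the sample standard deviation, and the example ratings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def projected_g(var_student, var_assessor, n_assessors):
    """Projected G coefficient for the mean of n assessor ratings.

    Assumes a simplified one-facet design in which all variance not
    attributable to students is pooled into one assessor/error term.
    """
    return var_student / (var_student + var_assessor / n_assessors)


def assessor_z_scores(ratings):
    """Standardize each score against the scores given by the same assessor.

    ratings: list of (assessor, student, score) tuples (hypothetical layout).
    Returns {(assessor, student): z}. Assessors with fewer than two ratings,
    or with no spread in their ratings, are skipped because no Z score
    is defined for them.
    """
    by_assessor = {}
    for assessor, _student, score in ratings:
        by_assessor.setdefault(assessor, []).append(score)

    z_scores = {}
    for assessor, student, score in ratings:
        scores = by_assessor[assessor]
        if len(scores) < 2:
            continue
        sd = stdev(scores)  # sample SD; an assumption, not stated in the source
        if sd == 0:
            continue
        z_scores[(assessor, student)] = (score - mean(scores)) / sd
    return z_scores


# Reported variance shares (students 9.9%, assessors 90.1%), eight assessors.
g_eight = projected_g(0.099, 0.901, 8)

# Hypothetical ratings: assessor "A" is stringent, assessor "B" is lenient.
example_ratings = [
    ("A", "s1", 5), ("A", "s2", 6), ("A", "s3", 7),
    ("B", "s1", 7), ("B", "s2", 8), ("B", "s3", 9),
]
example_z = assessor_z_scores(example_ratings)
```

With the rounded variance shares, `g_eight` comes out at about 0.47, in line with the reported G=0.469. In the hypothetical ratings, student `s1` receives a raw score of 5 from one assessor and 7 from the other, yet the same Z score from both, which is the adjustment for stringency or leniency described above. The simplified model is not expected to reproduce every decision-study figure from the full analysis.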