Objective:
To establish inter-rater reliability for genital injury detection among experienced forensic sexual assault (SA) examiners.

Methods:
Cross-sectional observational study testing inter-rater agreement on injury assessment among eight experienced SA examiners, each of whom viewed 2–4 digital images from 50 cases. Each case was rated by 4 examiners and included images taken before and after toluidine blue dye application. We calculated overall agreement and the kappa statistic (κ).

Results:
Examiners were in perfect agreement in 60 cases; in 24 cases, 3 of the 4 examiners agreed; in 5 cases, 2 agreed and 1 was unsure; and in 9 cases, ratings split as either 2 “yes” and 2 “no” or 1 “yes,” 1 “no,” and 2 “unsure.” Overall agreement was 82% (κ = 0.57) when yes/unsure and no/unsure combinations were counted as disagreement and 86% (κ = 0.66) when only yes/no dyads were counted as disagreement. Neither the number of images nor any single examiner substantially influenced the results. Highly experienced examiners agreed with one another slightly more often (86%) than moderately experienced examiners did (75%).

Conclusions:
Our set of experienced forensic examiners achieved moderate inter-rater agreement in assessment of the presence of female genital injury on selected digital images obtained during SA examination.
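For readers unfamiliar with chance-corrected agreement, the following is a minimal sketch of how a multi-rater kappa can be computed. The abstract does not state which kappa variant was used; this sketch assumes Fleiss' kappa (a common choice when each case is rated by the same number of raters, here 4), and the toy ratings below are illustrative, not the study's data.

```python
# Sketch of multi-rater agreement via Fleiss' kappa.
# Assumption: the study's exact kappa variant is not specified in the
# abstract; Fleiss' kappa fits the design (4 raters per case).

def fleiss_kappa(counts):
    """counts: one row per case, one column per rating category
    ("yes", "no", "unsure"); each row sums to the number of raters.
    Returns Fleiss' kappa."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_total = n_cases * n_raters

    # Mean per-case observed agreement P_i
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_cases

    # Chance agreement from overall category proportions
    n_categories = len(counts[0])
    p_j = [sum(row[j] for row in counts) / n_total for j in range(n_categories)]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)

# Toy data: 5 cases x 4 raters, categories (yes, no, unsure)
ratings = [
    [4, 0, 0],  # unanimous "yes"
    [3, 1, 0],
    [2, 2, 0],  # split yes/no
    [0, 4, 0],  # unanimous "no"
    [2, 0, 2],
]
print(round(fleiss_kappa(ratings), 2))  # → 0.35
```

Values near 0 indicate chance-level agreement and 1 indicates perfect agreement; the study's κ of 0.57–0.66 falls in the commonly cited "moderate to substantial" range.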