The radiographs of fifty fractures of the proximal part of the humerus were used to assess the interobserver reliability and intraobserver reproducibility of the Neer classification system. A trauma series consisting of scapular anteroposterior, scapular lateral, and axillary radiographs was available for each fracture. The radiographs were reviewed by an orthopaedic shoulder specialist, an orthopaedic traumatologist, a skeletal radiologist, and two orthopaedic residents, in their fifth and second years of postgraduate training. The radiographs were reviewed on two different occasions, six months apart.
Interobserver reliability was assessed by comparison of the fracture classifications determined by the five observers. Intraobserver reproducibility was evaluated by comparison of the classifications determined by each observer on the first and second viewings. Kappa (κ) reliability coefficients were used.
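The abstract does not spell out how the kappa statistic is computed. For two raters, Cohen's kappa compares observed agreement with the agreement expected by chance from each rater's marginal category frequencies; the pairwise comparisons reported here are of this kind. A minimal sketch (the function name and example ratings are illustrative, not from the study) might look like:

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters classifying the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e the agreement expected by chance.
    """
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters classified identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal frequency per category.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: two observers classify four fractures by Neer type.
observer_1 = ["II", "III", "III", "IV"]
observer_2 = ["II", "III", "IV", "IV"]
kappa = cohens_kappa(observer_1, observer_2)
```

Values near 1 indicate near-perfect agreement, values near 0 agreement no better than chance; the mean pairwise coefficients of roughly 0.5 reported below fall in the commonly cited "moderate" range.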
All five observers agreed on the final classification for 32 and 30 per cent of the fractures on the first and second viewings, respectively. Paired comparisons between the five observers showed a mean reliability coefficient of 0.48 (range, 0.43 to 0.58) for the first viewing and 0.52 (range, 0.37 to 0.62) for the second viewing. The attending physicians obtained a slightly higher kappa value than the orthopaedic residents (0.52 compared with 0.48). Reproducibility ranged from 0.83 (the shoulder specialist) to 0.50 (the skeletal radiologist), with a mean of 0.66. Simplification of the Neer classification system, from sixteen categories to six more general categories based on fracture type, did not significantly improve either interobserver reliability or intraobserver reproducibility.