The radiographs of ninety-five fractures of the proximal end of the humerus were classified according to the Neer and the AO/ASIF systems by five orthopaedic surgeons who had a special interest in problems of the shoulder. Eight weeks later, the same five surgeons re-evaluated the same radiographs without access to their initial interpretations. Intraobserver and interobserver reliability were found to be fair or poor for both classification systems. Kappa values for interobserver reliability were 0.40 for the Neer system and 0.53 for the AO/ASIF system; kappa values for intraobserver reliability were 0.60 for the Neer system and 0.58 for the AO/ASIF system. When the fractures were subclassified, according to the recommendations of the AO/ASIF, into groups and subgroups, reproducibility became progressively worse. A so-called extended radiographic trauma series, consisting of three perpendicular radiographs, was available for thirty-five fractures; the third perpendicular projection did not significantly improve the reproducibility values for either classification compared with those obtained with only two perpendicular projections.
We concluded that neither the Neer nor the AO/ASIF classification of fractures of the proximal end of the humerus is sufficiently reproducible to allow meaningful comparison of similarly classified fractures in different studies.
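For readers unfamiliar with the kappa statistic reported above, the following is a minimal sketch of how a chance-corrected agreement value can be computed for two sets of classifications. It assumes the simple two-rater (Cohen) form, kappa = (p_o - p_e) / (1 - p_e), which may not be the exact variant used in the study. The function name `cohen_kappa` and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two equal-length lists of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: two readings of ten fractures, labelled by number of parts.
first_reading = [2, 2, 2, 2, 3, 3, 3, 4, 4, 4]
second_reading = [2, 2, 2, 3, 3, 3, 4, 4, 4, 2]
print(round(cohen_kappa(first_reading, second_reading), 3))  # prints 0.545
```

In this hypothetical example the two readings agree on seven of ten fractures (p_o = 0.70), but 0.34 of that agreement is expected by chance alone, so kappa is lower than the raw agreement rate. The same calculation applies to intraobserver reliability when the two lists are one observer's first and second readings.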