Reliability of the Sanders Classification Versus the Risser Stage; Avoid Misinterpretation
I was interested to read the paper by Vira et al1 published in the June 2017 issue of the J Pediatr Orthop. Estimation of skeletal maturity, classically performed using the Risser sign, plays a crucial role in the treatment of adolescent idiopathic scoliosis (AIS). Recent data, however, have shown that the simplified Tanner-Whitehouse (Sanders) classification, based on an anteroposterior (AP) hand radiograph, correlates more closely with the rapid growth phase and thus with curve progression. The aim of the authors was to assess the reliability of the Sanders and Risser classifications among clinicians at different levels of training.1 Twenty AP scoliosis radiographs and 20 AP hand radiographs were randomized and distributed to 11 graders. The graders consisted of 3 orthopaedic residents, 3 spine fellows, 3 spine surgeons, and 1 radiologist.
They reported that, for all graders, the average κ coefficients for the interobserver and intraobserver reliability of the Sanders classification were 0.54 and 0.62, respectively, compared with 0.46 and 0.49 for the Risser classification. For the spine attendings alone, the average κ coefficients for the interobserver and intraobserver reliability of the Sanders classification were 0.72 and 0.77, respectively, compared with 0.46 and 0.67 for the Risser classification.
However, the unweighted κ is not the most appropriate statistic for assessing the reliability of qualitative outcomes. The κ value has 2 important weaknesses as a measure of agreement. First, it depends on the prevalence in each category, which means that different κ values can arise from the same percentages of concordant and discordant cells. Table 1 shows that in both situations (a) and (b) the proportion of concordant cells is 90% and the proportion of discordant cells is 10%; however, the κ values differ (0.44 and 0.80, respectively). Second, the κ value depends on the number of categories.2–5 Therefore, to obtain an unbiased result in such situations, the weighted κ can be suggested as a more appropriate statistical test. It has recently been demonstrated that the linearly weighted κ is a weighted average of the κ coefficients of the embedded 2 by 2 agreement matrices, whereas the quadratically weighted κ is insensitive to agreement matrices that are row or column reflection symmetric. A rank-1 matrix decomposition approach to the weighting schemes can demonstrate these phenomena in a concise manner.4
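The prevalence dependence described above can be checked with a short calculation. The following Python sketch is illustrative only: the two 2 by 2 tables contain hypothetical counts chosen so that both show 90% agreement, and they are not taken from Table 1 or from the data of Vira et al. The function name `kappa`, its `power` parameter, and the 3-category table are likewise assumptions introduced for this example. The weights follow the common convention w_ij = 1 - (|i - j| / (k - 1))^power, with power 1 for linear and power 2 for quadratic weighting.

```python
def kappa(table, power=None):
    """Cohen's kappa for a square table of counts (rows: rater A, columns: rater B).

    power=None gives the unweighted kappa; power=1 gives linear weights and
    power=2 gives quadratic weights, assuming ordered categories.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_m = [sum(table[i]) / n for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    def weight(i, j):
        if power is None:
            return 1.0 if i == j else 0.0
        return 1.0 - (abs(i - j) / (k - 1)) ** power

    # Observed and chance-expected (weighted) agreement
    p_o = sum(weight(i, j) * table[i][j] / n for i in range(k) for j in range(k))
    p_e = sum(weight(i, j) * row_m[i] * col_m[j] for i in range(k) for j in range(k))
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical counts: both tables have 90 concordant and 10 discordant ratings.
skewed = [[85, 5], [5, 5]]     # one category dominates
balanced = [[45, 5], [5, 45]]  # categories equally prevalent

print(round(kappa(skewed), 2))    # 0.44
print(round(kappa(balanced), 2))  # 0.8

# Hypothetical 3-category ordinal table: disagreements are all one category apart.
ordinal = [[20, 5, 0], [5, 20, 5], [0, 5, 20]]
print(kappa(ordinal), kappa(ordinal, power=1), kappa(ordinal, power=2))
```

With identical observed agreement, the skewed table yields a much lower κ because its chance-expected agreement is higher (0.82 versus 0.50). For the ordinal table, the weighted versions give partial credit to near-misses, which is the reason weighted κ is usually preferred for staged classifications with more than 2 ordered categories.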
They concluded that the Sanders classification is a reliable and reproducible system and should be in the armamentarium of surgeons who treat adolescent idiopathic scoliosis. Such a conclusion should be supported by an analysis that takes the above-mentioned statistical issues into account; otherwise, misinterpretation can easily lead to mismanagement of patients.