Response to a Letter to the Editor
We appreciate the insightful comments made by our colleagues with regard to our article, “The Interobserver and Intraobserver Reliability of the Sanders Classification Versus the Risser Stage”1 and the opportunity to respond to their concerns.
The commentators argue that the weighted kappa is a more appropriate statistic, given the weaknesses of the original kappa test. With regard to our reliability calculations, our use of the Cohen kappa, and the example provided, we would like to make the following 2 points. First, knowing the number of observations in a sample and the number of discordant cells does not suffice to conclude that 2 distributions are similar (eg, the behavior of the χ2 test of independence). The Cohen kappa accounts for the possibility of positive-only answers, negative-only answers, and answers in agreement; in doing so, it recognizes that if one of the categories (either positive or negative) is very small overall, the contribution of the agreement cells drops considerably. Using the example provided by the commentators, case A in Table 1 behaves differently from case B: the number of negatives in agreement over the total negatives per rater is 5/10 and 5/10 = 50% in A, whereas it is 45/50 and 45/50 = 90% in B. This discordance reflects that when a measure is skewed toward one end, it is the other end that gains importance, in terms of proportions, for detecting differences in a sample. Second, much as with other distribution tests applied to binary values, the error generated by the linear weighting method is minimized when the variable is binary.
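The effect of skewed marginals on the Cohen kappa can be sketched numerically. The two 2 × 2 tables below are illustrative reconstructions, not the commentators' actual Table 1: they are chosen only to match the quoted agreement ratios (5 of 10 negatives in agreement in case A, 45 of 50 in case B), with both tables sharing the same overall observed agreement.

```python
# Cohen's kappa for a 2x2 contingency table [[a, b], [c, d]],
# where a = both raters positive and d = both raters negative.
def cohen_kappa(table):
    (a, b), (c, d) = table
    n = a + b + c + d
    p_o = (a + d) / n                      # observed agreement
    p_pos = ((a + b) / n) * ((a + c) / n)  # chance agreement on positives
    p_neg = ((c + d) / n) * ((b + d) / n)  # chance agreement on negatives
    p_e = p_pos + p_neg
    return (p_o - p_e) / (1 - p_e)

# Hypothetical tables matching the quoted ratios (assumed values):
case_a = [[85, 5], [5, 5]]    # negatives rare: 5 of 10 in agreement
case_b = [[45, 5], [5, 45]]   # balanced: 45 of 50 in agreement

print(f"{cohen_kappa(case_a):.3f}")  # 0.444
print(f"{cohen_kappa(case_b):.3f}")  # 0.800
```

Both tables have 90% observed agreement, yet the kappa is far lower in case A because the chance-agreement term is dominated by the large positive category, which is the behavior described above.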
Our choice of the kappa test was based on its use in the prior literature for similar interrater and intrarater reliability studies.2–4 Although we agree that the weighted kappa is a superior reliability measure for contingency tables with more ordinal categories, we did not find it necessary when testing 1 rater against 1 rater on a binary outcome, the way our study presents the data. We again would like to thank our colleagues for their interest in our study.
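The point that weighting adds nothing in the binary case can be verified directly: with linear disagreement weights |i − j|/(k − 1), a 2-category table yields weights of 0 on the diagonal and 1 off it, so the weighted kappa reduces algebraically to the unweighted Cohen kappa. A minimal sketch, using an illustrative (assumed) 2 × 2 table:

```python
# Linear-weighted kappa (disagreement weights |i - j| / (k - 1))
# compared with the unweighted Cohen kappa on the same table.
def kappas(table):
    k = len(table)
    n = sum(sum(row) for row in table)
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    obs_dis = sum(abs(i - j) / (k - 1) * table[i][j]
                  for i in range(k) for j in range(k)) / n
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                  for i in range(k) for j in range(k)) / n ** 2
    weighted = 1 - obs_dis / exp_dis
    p_o = sum(table[i][i] for i in range(k)) / n
    p_e = sum(row[i] * col[i] for i in range(k)) / n ** 2
    unweighted = (p_o - p_e) / (1 - p_e)
    return weighted, unweighted

# Illustrative binary table: the two statistics coincide when k = 2.
w, u = kappas([[45, 5], [5, 45]])
print(f"{w:.3f} {u:.3f}")  # 0.800 0.800
```

For 3 or more ordered categories the off-diagonal weights differ, and the two statistics diverge; for 2 categories they are identical, which is why the unweighted kappa sufficed for our binary comparisons.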