Discordance in Pathologist Assessment of Endometrial Cancer is Informative While Level of Discordance of Molecular Classification Remains Unknown

To the Editor:
Offered here is an opposing view of how to interpret the data in the recent publication entitled “Interobserver agreement in endometrial carcinoma histotype diagnosis varies depending on the cancer genome atlas (TCGA)-based molecular subgroup.”1
One statement in the introduction of that paper was troubling: that lack of reproducibility is a profound weakness for any histopathologic variable. This letter reviews the paper from the opposite position, that poor reproducibility can be a profound strength of pathology assessment. How can this be?
The paper reported complete concordance of histotype among 7 pathologists in 90% of 41 P53 wild type (P53wt) endometrial cancers. In the P53wt group, of 287 (41×7) total assessments, discrepant opinions totalled 10 (3.5%), and 8 of these were in 2 of the 41 tumors. Meanwhile, only 36% of P53 abnormal (P53abn) tumors had complete concordance. Using the R statistics function prop.test, the difference in the proportion of concordance between P53wt and P53abn can be calculated to be strongly significant (P<0.00001). Discordance was not randomly distributed. When new information significantly changes a probability distribution, an appropriate term is classification.
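For readers who wish to repeat this kind of calculation, a minimal sketch follows. It is not the R call itself but a plain Python chi-square test on a 2×2 table with Yates continuity correction, which is the kind of test prop.test applies to two proportions by default. The function name two_proportion_chi2 is hypothetical. The count of 37 concordant P53wt tumors is an assumption derived from "90% of 41," and the P53abn group size is not stated in this letter, so the value of 25 is a placeholder only; the printed P value is therefore illustrative and is not expected to match the figure quoted above.

```python
from math import erfc, sqrt


def two_proportion_chi2(x1, n1, x2, n2):
    """Chi-square test (1 df, Yates continuity correction) comparing x1/n1 with x2/n2.

    Returns (statistic, p_value).
    """
    a, b = x1, n1 - x1          # group 1: concordant, not concordant
    c, d = x2, n2 - x2          # group 2: concordant, not concordant
    n = n1 + n2
    col_yes, col_no = a + c, b + d
    if 0 in (n1, n2, col_yes, col_no):
        raise ValueError("every row and column total must be non-zero")
    # Continuity-corrected numerator, floored at zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    stat = n * diff ** 2 / (n1 * n2 * col_yes * col_no)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Figures taken from the letter.
total_assessments = 41 * 7                    # 7 pathologists x 41 P53wt tumors
discrepant_rate = 10 / total_assessments      # 10 discrepant opinions

# ASSUMPTION: 90% of 41 tumors is taken as 37 fully concordant tumors.
wt_concordant, wt_n = 37, 41
# PLACEHOLDER: the P53abn group size is not given in this letter.
abn_n = 25
abn_concordant = round(0.36 * abn_n)          # 36% complete concordance

stat, p_value = two_proportion_chi2(wt_concordant, wt_n, abn_concordant, abn_n)
print(f"discrepant assessments: {discrepant_rate:.1%} of {total_assessments}")
print(f"chi-square = {stat:.2f}, P = {p_value:.2g} (placeholder P53abn group size)")
```

Substituting the actual P53abn group size from the original paper for the placeholder is the only change needed to check the reported significance.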
Finding zones of poor reproducibility requires multiple assessments, or what can be termed “discordance discovery.” Discordance discovery is not possible unless a method is easy enough to permit multiple assessments. This was a demonstrated strength of morphology, given that 7 pathologists from 7 institutions were able to review the same material. Relative ease of assessment cannot be assumed on the molecular side of the equation. Results of testing from 7 separate laboratories were not matched to the interpretations from the 7 pathologists.
It would have been interesting to know whether susceptibility to discordance in pathology interpretation (phenotypic analysis) is associated with a corresponding susceptibility to interlaboratory discordance in molecular assays (or surrogate immunohistochemistry methods). Until this is quantitated, it is difficult to reach conclusions. Pathologist discordance was often associated with cancers showing heterogeneity.
Discordance has been found with molecular markers even in the absence of morphologic heterogeneity.2 Molecular testing was performed on 3 separate blocks from each of 49 tumors. Five tumors (10%) had a degree of discordance that would have altered molecular-based risk assignment. The authors concluded that these 5 of 49 cases represented a limited number of misclassified cases. No confidence interval for the 10% proportion was provided, and there was no mention of whether the upper limit of such an interval would overlap with what might be regarded as an unacceptable level of misclassification. Time may tell if the conclusion, based on a small sample, was premature and overreaching.
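The missing interval is easy to approximate. The sketch below computes a 95% Wilson score interval for 5 of 49 using only the Python standard library. The choice of the Wilson method and the function name wilson_interval are assumptions made here for illustration; the cited study reported no interval, and other methods (for example, exact Clopper-Pearson) would give somewhat different limits.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Figures taken from the letter: 5 of 49 tumors with discordant risk assignment.
lower, upper = wilson_interval(5, 49)
print(f"5/49 = {5 / 49:.1%}, 95% Wilson interval {lower:.1%} to {upper:.1%}")
```

With a sample this small the interval is wide, with an upper limit a little above 20%, which is the point at issue: the data do not rule out a misclassification rate roughly double the observed 10%.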
Molecular classification was claimed to be “much more” reproducible than pathology.1 This assertion was based on an intralaboratory comparison of results from curettage specimens versus the hysterectomy specimen.3 The study consisted of 22 P53wt tumors, 11 P53abn, 12 ultramutated polymerase epsilon, and 12 mismatch repair. This is not a large number of tumors, and far fewer than the number of tumors submitted to the 7 pathologists.
There is agreement with the concern that pathologist discordance can result in a patient being treated one way at one center and another way elsewhere. The same concern applies to molecular assays. Lacking an interlaboratory study of the reproducibility of molecular classification, the alarm becomes a fallacy of incomplete comparison.
In a widely referenced breast cancer gene-based concordance study, the 70-gene profile and recurrence score were deemed to have strong concordance.4 Yet 18% discordance between the 2 can be counted from the open data set.5 Of 295 patients, a molecular assay could have resulted in 53 patients being treated as low risk at one center and high risk at another.
