We aimed to use longitudinal data from an established screening programme with good quality assurance and quality control procedures and a stable, well-trained workforce to determine the accuracy of grading in diabetic retinopathy screening.
Methods
We used a continuous-time hidden Markov model with five states to estimate the probability of true progression or regression of retinopathy and the conditional probability of an observed grade given the true grade (misclassification). The true stage of retinopathy was modelled as a function of the duration of diabetes and HbA1c.
Results
The modelling dataset consisted of 65 839 grades from 14 187 people. The median number [interquartile range (IQR)] of examinations was 5 (3, 6) and the median (IQR) interval between examinations was 1.04 (0.99, 1.17) years. In total, 14 227 grades (21.6%) were estimated as being misclassified: 10 592 (16.1%) represented over-grading and 3635 (5.5%) represented under-grading. There were 1935 (2.9%) misclassified referrals: 1305 (2.2%) were false-positive results and 630 (11.0%) were false-negative results. Misclassification of background diabetic retinopathy as no detectable retinopathy was common (3.4% of all grades) but rarely preceded referable maculopathy or retinopathy.
Conclusion
Misclassification between lower grades of retinopathy is not uncommon but is unlikely to lead to significant delays in referring people for sight-threatening retinopathy.
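The modelling approach described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes five ordered states indexed 0 to 4, transitions only between adjacent states, misclassification only into adjacent grades, and arbitrary illustrative rates. It also omits the covariates (duration of diabetes and HbA1c) used in the study. All function names and numerical values are hypothetical. The sketch shows how a transition intensity matrix gives transition probabilities over an arbitrary interval, and how a misclassification matrix links true states to observed grades in the likelihood of one person's grade sequence.

```python
# Sketch of a continuous-time hidden Markov model with misclassification.
# All rates, error probabilities and names below are illustrative assumptions.

N_STATES = 5  # five true retinopathy states, indexed 0..4 (labels not specified here)


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(q, t, terms=20, squarings=10):
    """Approximate exp(q * t) by a truncated Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def make_generator(progress, regress):
    """Build an intensity matrix allowing moves between adjacent states only.

    progress[i] is the rate i -> i+1; regress[i] is the rate i+1 -> i.
    Each row sums to zero, as required for a transition intensity matrix.
    """
    n = len(progress) + 1
    q = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        q[i][i + 1] = progress[i]
        q[i + 1][i] = regress[i]
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal is still 0 here, so this sums the off-diagonals
    return q


def make_misclassification(error_adjacent, n=N_STATES):
    """e[true][observed]: probability of an observed grade given the true state."""
    e = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in neighbours:
            e[i][j] = error_adjacent
        e[i][i] = 1.0 - error_adjacent * len(neighbours)
    return e


def sequence_likelihood(grades, times, q, e, initial):
    """Forward algorithm: likelihood of observed grades at irregular times (years)."""
    n = len(q)
    alpha = [initial[s] * e[s][grades[0]] for s in range(n)]
    for k in range(1, len(grades)):
        p = mat_exp(q, times[k] - times[k - 1])
        alpha = [sum(alpha[r] * p[r][s] for r in range(n)) * e[s][grades[k]]
                 for s in range(n)]
    return sum(alpha)


if __name__ == "__main__":
    q = make_generator([0.2, 0.1, 0.05, 0.05], [0.1, 0.05, 0.02, 0.01])
    e = make_misclassification(0.1)
    initial = [0.6, 0.25, 0.1, 0.04, 0.01]
    # Observed grades for one hypothetical person at roughly annual examinations.
    print(sequence_likelihood([0, 1, 0, 1, 1], [0.0, 1.04, 2.0, 3.1, 4.1], q, e, initial))
```

In a fitted model, the intensities and misclassification probabilities would be estimated by maximising the product of such likelihoods over all people, with the intensities allowed to depend on covariates.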