The purposes of this study were to (1) perform a systematic review of articles that reported agreement or reproducibility in repeated diagnosis of developmental dysplasia of the hip (DDH) using ultrasound imaging, (2) estimate the reproducibility of the available dysplasia metrics, and (3) compare the reproducibility of these metrics.

Methods:
A systematic review of the Medline and Embase databases was performed using a search strategy formulated from our research question: “For infants at risk of DDH, are US imaging-based diagnoses reproducible?” Two reviewers independently identified articles for inclusion and then assessed the quality of the included studies using the Guidelines for Reporting Reliability and Agreement Studies (GRRAS). Variability and agreement statistics were extracted from the included studies and summarized in a meta-analysis. The reproducibility of the available dysplasia metrics was then compared, with a Bonferroni correction applied to adjust for multiple comparisons.

Results:
Twenty-eight studies were included in the systematic review. Overall, the quality of the included studies was moderate (average, 10.7/15; range, 6 to 12). Graf’s alpha angle had the lowest interexamination variability of the metrics assessed, followed by Graf’s beta angle (the variability of the alpha angle was 10% lower than that of the beta angle, P<0.05). However, although Graf’s angles had lower variability than the other dysplasia metrics, their actual variability was still problematically high. This finding was supported by the low intraclass correlation and kappa coefficient values reported in the included studies. There was also evidence suggesting that the reproducibility of DDH diagnosis may have worsened over time.

Conclusions:
Overall, we found high variability and low agreement in all reported dysplasia metrics. Furthermore, over the last 3 decades, the repeatability of dysplasia metrics has not markedly improved and may even have declined, indicating a genuine need to improve the repeatability and reliability of ultrasound-based DDH diagnosis.

Level of Evidence:
Level III—systematic review of level III studies.