Identifying patients with acute respiratory distress syndrome (ARDS) is a recognized challenge. Experts often have only moderate agreement when applying the clinical definition of ARDS to patients. However, no study has fully examined the implications of low reliability measurement of ARDS on clinical studies.Objectives:
To investigate how the degree of variability in ARDS measurement commonly reported in clinical studies affects study power, the accuracy of treatment effect estimates, and the measured strength of risk factor associations.Methods:
We examined the effect of ARDS measurement error in randomized clinical trials (RCTs) of ARDS-specific treatments and cohort studies using simulations. We varied the reliability of ARDS diagnosis, quantified as the interobserver reliability (κ-statistic) between two reviewers. In RCT simulations, patients identified as having ARDS were enrolled, and when measurement error was present, patients without ARDS could be enrolled. In cohort studies, risk factors as potential predictors were analyzed using reviewer-identified ARDS as the outcome variable.Measurements and Main Results:
Lower reliability measurement of ARDS during patient enrollment in RCTs seriously degraded study power. Holding effect size constant, the sample size necessary to attain adequate statistical power increased by more than 50% as reliability declined, although the result was sensitive to ARDS prevalence. In a 1,400-patient clinical trial, the sample size necessary to maintain similar statistical power increased to over 1,900 when reliability declined from perfect to substantial (κ = 0.72). Lower reliability measurement diminished the apparent effectiveness of an ARDS-specific treatment from a 15.2% (95% confidence interval, 9.4–20.9%) absolute risk reduction in mortality to 10.9% (95% confidence interval, 4.7–16.2%) when reliability declined to moderate (κ = 0.51). In cohort studies, the effect on risk factor associations was similar.Conclusions:
ARDS measurement error can seriously degrade statistical power and effect size estimates of clinical studies. The reliability of ARDS measurement warrants careful attention in future ARDS clinical studies.