Background: Most epidemiological studies have missing information, leading to reduced power and potential bias. Estimates of exposure-outcome associations will generally be biased if the outcome variable is missing not at random (MNAR). Linkage to administrative data containing a proxy for the missing study outcome allows assessment of whether the outcome is MNAR and evaluation of the resulting bias. We applied this approach to the association between infant breastfeeding and IQ at 15 years, where a proxy for IQ was available through linkage to school attainment data.
Methods: Subjects were those who enrolled in the Avon Longitudinal Study of Parents and Children in 1990–91 (n = 13 795), of whom 5023 had IQ measured at age 15. For those with missing IQ, 7030 (79%) had information on educational attainment at age 16 obtained through linkage to the National Pupil Database. The association between duration of breastfeeding and IQ was estimated using a complete case analysis, multiple imputation and inverse probability-of-missingness weighting; these estimates were then compared with those derived from analyses informed by the linkage.
Results: IQ at 15 was MNAR; individuals with higher attainment were less likely to have missing IQ data, even after adjusting for socio-demographic factors. Complete case analysis, multiple imputation and inverse probability-of-missingness weighting all underestimated the association between breastfeeding and IQ compared with analyses informed by the linkage.
Conclusions: Linkage to administrative data containing a proxy for the outcome variable allows testing of whether the outcome is MNAR and allows more efficient analyses to be performed. Under certain circumstances, analyses informed by the linkage may produce unbiased results.
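The logic of the comparison described in the Methods and Results can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are simulated rather than taken from the cohort, the exposure is reduced to a binary variable, the names `proxy`, `p_obs`, `mean_diff` and `proxy_gap` are hypothetical, and the weights are estimated by simple stratification rather than by the models used in the study. The outcome is made more likely to be observed when the linked proxy is high, so it is MNAR given the exposure alone but missing at random once the proxy is taken into account.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400_000

# Hypothetical simulated cohort (NOT study data).
x = rng.binomial(1, 0.5, n)                  # binary exposure
y = 100 + 3.0 * x + rng.normal(0, 15, n)     # outcome; true mean difference = 3
proxy = y + rng.normal(0, 10, n)             # linked proxy, available for everyone

# The outcome is more likely to be observed when the proxy is high.
p_obs = 1 / (1 + np.exp(-(proxy - 100) / 10))
observed = rng.random(n) < p_obs


def mean_diff(y, x, w):
    """Weighted difference in mean outcome, exposed minus unexposed."""
    exposed = x == 1
    return (np.average(y[exposed], weights=w[exposed])
            - np.average(y[~exposed], weights=w[~exposed]))


# 1. Complete case analysis: unweighted, observed outcomes only.
cc = mean_diff(y[observed], x[observed], np.ones(observed.sum()))

# 2. Check informed by the proxy: within each exposure level, compare the
#    mean proxy of those with observed and those with missing outcomes.
proxy_gap = np.mean([
    proxy[observed & (x == k)].mean() - proxy[~observed & (x == k)].mean()
    for k in (0, 1)
])

# 3. Inverse probability weighting informed by the proxy: estimate
#    P(observed | exposure, proxy stratum) using 50 quantile strata.
edges = np.quantile(proxy, np.linspace(0, 1, 51)[1:-1])
strata = np.digitize(proxy, edges)           # stratum index 0..49
cell = strata * 2 + x                        # one cell per stratum and exposure
p_hat = (np.bincount(cell, weights=observed.astype(float), minlength=100)
         / np.bincount(cell, minlength=100))
w = 1 / p_hat[cell]
ipw = mean_diff(y[observed], x[observed], w[observed])

print(f"proxy gap, observed minus missing: {proxy_gap:.2f}")
print(f"complete case estimate:            {cc:.2f}")
print(f"proxy-informed IPW estimate:       {ipw:.2f}")
```

In this setup the complete case estimate is expected to fall below the true difference of 3, while the proxy-informed weighted estimate should lie close to it, at the cost of a larger sampling variance from the more variable weights. A clearly non-zero proxy gap within exposure levels is the kind of evidence that points to the outcome being MNAR.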