Moderation Analysis With Missing Data in the Predictors
The most widely used statistical model for conducting moderation analysis is the moderated multiple regression (MMR) model. In MMR modeling, missing data pose a challenge mainly because the interaction term is a product of two or more variables and is therefore a nonlinear function of those variables. In this study, we consider a simple MMR model in which the effect of the focal predictor X on the outcome Y is moderated by a moderator U. The primary interest is in estimating and testing the moderation effect when X contains missing data. We focus mainly on cases in which X is missing completely at random (MCAR) or missing at random (MAR). Three methods are compared: (a) normal-distribution-based maximum likelihood estimation (NML); (b) normal-distribution-based multiple imputation (NMI); and (c) Bayesian estimation (BE). Via simulations, we found that NML and NMI could lead to biased estimates of moderation effects under the MAR mechanism. The BE method outperformed NMI and NML when data were missing in the focal predictor, the missingness depended on the moderator and/or auxiliary variables, and the distribution of the focal predictor was correctly specified. Nevertheless, more robust BE methods are needed to handle misspecification of the focal predictor's distribution. An empirical example with a simple sensitivity analysis illustrates the application of the methods.
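To make the setting concrete, the following minimal sketch generates data from a simple MMR model of the form Y = b0 + b1*X + b2*U + b3*X*U + e and then deletes values of X under an MCAR or a MAR mechanism. It is an illustration only and is not the simulation design used in the study: the coefficient values, the standard normal distributions for X, U, and e, the logistic missingness model, and the function names `simulate_mmr` and `impose_missing` are all assumptions introduced here.

```python
import numpy as np


def simulate_mmr(n=500, b=(1.0, 0.5, 0.5, 0.3), seed=0):
    """Draw (X, U, Y) from Y = b0 + b1*X + b2*U + b3*X*U + e.

    The coefficients in `b` and the standard normal X, U, and e are
    illustrative assumptions, not values taken from the study.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # focal predictor
    u = rng.normal(size=n)  # moderator
    e = rng.normal(size=n)  # residual
    y = b[0] + b[1] * x + b[2] * u + b[3] * x * u + e
    return x, u, y


def impose_missing(x, u, mechanism="MAR", rate=0.3, slope=1.0, seed=1):
    """Return a copy of x with some entries set to NaN.

    MCAR: every value is deleted with the same probability `rate`.
    MAR:  the deletion probability follows an assumed logistic model in
          the fully observed moderator u, so the overall missing rate is
          only roughly equal to `rate`.
    """
    rng = np.random.default_rng(seed)
    if mechanism == "MCAR":
        p = np.full(x.shape, rate)
    elif mechanism == "MAR":
        logit = np.log(rate / (1.0 - rate)) + slope * u
        p = 1.0 / (1.0 + np.exp(-logit))
    else:
        raise ValueError("mechanism must be 'MCAR' or 'MAR'")
    x_obs = x.copy()
    x_obs[rng.random(x.shape) < p] = np.nan
    return x_obs


if __name__ == "__main__":
    x, u, y = simulate_mmr()
    x_mar = impose_missing(x, u, mechanism="MAR")
    print("proportion of X missing under MAR:", np.isnan(x_mar).mean())
```

Under the MAR setting in this sketch, cases with larger U are more likely to lose X, so the missingness depends only on an observed variable. Data sets produced this way could then be passed to an NML, NMI, or BE routine to compare the estimates of the moderation effect b3.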