Abstract
Rationale, aims and objectives
An essential requirement for ensuring the validity of outcomes in matching studies is that study groups are comparable on observed pre-intervention characteristics. Investigators typically use numerical diagnostics, such as t-tests, to assess comparability (referred to as ‘balance’). However, such diagnostics test equality along only one dimension (e.g. means in the case of t-tests), and therefore do not adequately capture imbalances that may exist elsewhere in the distribution. Furthermore, these tests are generally sensitive to sample size, raising the concern that a reduction in power may be mistaken for an improvement in covariate balance. In this paper, we demonstrate the shortcomings of numerical diagnostics and show how visual displays provide a complete representation of the data, allowing balance to be assessed more robustly.
Methods
We generate artificial datasets specifically designed to demonstrate that widely used equality tests capture only a single dimension of the data and are sensitive to sample size. We then plot the covariate distributions using several graphical displays.
Results
As expected, tests showing perfect covariate balance in means failed to reflect imbalances in higher moments (variances). However, these discrepancies were easily detected upon inspection of the graphical displays. Additionally, smaller sample sizes gave the appearance of covariate balance when the non-significant test results in fact reflected lower statistical power.
Conclusions
Given the limitations of numerical diagnostics, we advocate using graphical displays for assessing covariate balance and encourage investigators to provide such graphs when reporting balance statistics in their matching studies.
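The two limitations described above can be illustrated with a small deterministic sketch. It is not the simulation used in this paper: the helper names (`two_point_sample`, `welch_t`, `variance_ratio`, `t_for_group_size`), the covariate values and the 1.96 cut-off are assumptions chosen purely for illustration.

```python
import math
import statistics


def two_point_sample(center, spread, n):
    """Hypothetical helper: n values (n even), half at center - spread, half at center + spread."""
    half = n // 2
    return [center - spread] * half + [center + spread] * half


def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def variance_ratio(a, b):
    """Ratio of sample variances; values far from 1 indicate imbalance in spread."""
    return statistics.variance(a) / statistics.variance(b)


# 1) Identical means, very different spread: a test of means cannot see the imbalance.
treated = two_point_sample(50.0, 2.0, 100)
control = two_point_sample(50.0, 10.0, 100)
t_equal_means = welch_t(treated, control)
ratio = variance_ratio(treated, control)
print(f"t = {t_equal_means:.3f}, variance ratio = {ratio:.3f}")


# 2) Same mean difference (2 units) and same spread, but different group sizes.
#    1.96 is the large-sample two-sided 5% critical value, used only as a rough cut-off.
def t_for_group_size(n):
    return welch_t(two_point_sample(52.0, 10.0, n), two_point_sample(50.0, 10.0, n))


t_small = t_for_group_size(20)
t_large = t_for_group_size(2000)
print(f"n = 20 per group:   t = {t_small:.2f}")
print(f"n = 2000 per group: t = {t_large:.2f}")
```

In this sketch the first comparison yields a t-statistic of zero even though the group variances differ by a factor of 25. The second comparison crosses the 1.96 cut-off only because the groups are larger, while the underlying difference in means is unchanged.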