### Excerpt

A randomized controlled trial is conducted to evaluate whether a vaccine ($A = 1$ if vaccine, 0 if placebo) decreases the risk of disease ($Y = 1$ if disease, 0 otherwise). Individuals are enrolled at baseline, randomized to vaccine or placebo, followed for 6 months, and monitored for disease. The vaccine is more likely than placebo to result in injection site pain ($W = 1$ if pain, 0 otherwise), and those with pain are more likely to drop out and have unobserved outcomes ($S = 1$ if dropped out, 0 otherwise). Participants with poor (unmeasured) health ($U = 1$ if poor health, 0 otherwise) are more likely to experience pain and get the disease. The scenario is summarized in Figure A.

There is selection bias if we condition on not dropping out ($S = 0$) because the path $A \rightarrow W \leftarrow U \rightarrow Y$ is opened. Stratifying on $W$ does not block this path and may in fact induce more bias. Based on this causal diagram, it is not immediately clear how to identify the causal effect of the vaccine using the observed data (although see references 4, 5, or 6).

The single-world intervention graph in Figure B, however, clearly displays the independencies necessary to identify the effect of the vaccine from the observed data as follows (here, a variable $X^a$ represents the value of $X$ had the individual received vaccine level $a$):

$$
\begin{aligned}
E[Y^a] &= \sum_w E[Y^a \mid W^a = w]\, P(W^a = w) \\
&= \sum_w E[Y^a \mid W^a = w, S^a = 0]\, P(W^a = w) \\
&= \sum_w E[Y^a \mid W^a = w, S^a = 0]\, P(W^a = w \mid A = a) \\
&= \sum_w E[Y^a \mid W^a = w, S^a = 0, A = a]\, P(W^a = w \mid A = a) \\
&= \sum_w E[Y \mid W = w, S = 0, A = a]\, P(W = w \mid A = a)
\end{aligned}
$$

The first equality holds by the law of total probability, the second by d-separation of $Y^a$ and $S^a$ given $W^a$, the third by d-separation of $W^a$ and $A$, the fourth by d-separation of $Y^a$ and $A$ given $W^a$ and $S^a$, and the last by causal consistency. All components of the final line of the equation, which is Robins’ g-formula (reference 7), can be estimated from observed data. The key insight provided by the single-world intervention graph is that $Y^a$ is independent of $S^a$ given $W^a$, but conditioning on $W^a$ does not open any paths between $A$ and $Y^a$.
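Because injection site pain is binary, the sum in the g-formula has only two terms. As a sketch, writing $Y$ for disease, $W$ for pain, $S$ for dropout, and $A$ for vaccine assignment (these symbol names are assumptions of this sketch), the quantity estimated for each arm $a$ and the resulting risk difference (labeled RD here for convenience) expand to:

```latex
\begin{aligned}
E[Y^a] &= E[Y \mid W = 0, S = 0, A = a]\, P(W = 0 \mid A = a) \\
       &\quad + E[Y \mid W = 1, S = 0, A = a]\, P(W = 1 \mid A = a), \\
\mathrm{RD} &= E[Y^{1}] - E[Y^{0}].
\end{aligned}
```

Each conditional mean is taken among participants who did not drop out, whereas the weights $P(W = w \mid A = a)$ are taken over everyone randomized to arm $a$, which presumes that pain is recorded for all participants.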

We conducted a simulation of 1,000,000 individuals for illustration (SAS code is available in the eAppendix; http://links.lww.com/EDE/B306). Individuals were randomly assigned vaccine with probability 0.5 and had probability 0.3 of being in poor health. The probability of injection site pain for healthy individuals was 0.2 if assigned placebo and 0.6 if assigned vaccine. Poor health increased the probability of pain by 0.3. The probability of dropping out was 0.1 for those without pain and 0.9 for those with pain. Finally, the probability of disease was 0.3 for healthy individuals assigned placebo, and it was increased by 0.5 by poor health and decreased by 0.2 by the vaccine.
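The data-generating process described above can be sketched in a few lines of Python. This is not the authors' SAS code from the eAppendix; it is a minimal illustration that reads the stated probabilities as additive, and the function names `simulate` and `mean_outcome`, the variable letters, and the seed are all assumptions made for this sketch.

```python
import random


def simulate(n, seed=0):
    """Draw n individuals as tuples (a, u, w, s, y); all names are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = int(rng.random() < 0.5)                       # vaccine assignment
        u = int(rng.random() < 0.3)                       # poor (unmeasured) health
        w = int(rng.random() < 0.2 + 0.4 * a + 0.3 * u)   # injection site pain
        s = int(rng.random() < (0.9 if w else 0.1))       # dropout
        y = int(rng.random() < 0.3 + 0.5 * u - 0.2 * a)   # disease
        rows.append((a, u, w, s, y))
    return rows


def mean_outcome(rows, arm):
    """Mean of y in one arm using everyone, including dropouts (an oracle view)."""
    ys = [y for a, u, w, s, y in rows if a == arm]
    return sum(ys) / len(ys)


# A smaller n than the 1,000,000 used in the letter keeps the sketch quick to run.
rows = simulate(100_000)
oracle_rd = mean_outcome(rows, 1) - mean_outcome(rows, 0)
print(f"risk difference with no missing outcomes: {oracle_rd:.3f}")
```

In a real trial the outcome `y` would be unobserved whenever `s == 1`, so `oracle_rd` is only available in a simulation; it serves as the benchmark against which the analyses of the observed data can be compared.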

The true effect of the vaccine on the disease was a 0.20 decrease in risk. The complete case analysis gave a 0.24 decrease in risk. Stratifying on injection site pain worsened the bias, giving a 0.26 decrease in risk. Finally, the g-formula with empirically estimated expectations and probabilities yielded the true decrease of 0.20.
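To see where these numbers come from without simulation noise, the three contrasts can also be computed exactly by enumerating the joint distribution implied by the stated probabilities. The sketch below is an assumption-laden illustration rather than the authors' analysis: it reads the probabilities as additive, uses hypothetical helper names (`p_joint`, `prob`, `risk`, `g_formula`), and reports stratum-specific differences for the stratified analysis because the letter does not say how strata were combined.

```python
from itertools import product


def p_joint(a, u, w, s, y):
    """Joint probability of one (a, u, w, s, y) cell under the stated design."""
    p_w1 = 0.2 + 0.4 * a + 0.3 * u      # P(pain | a, u)
    p_s1 = 0.9 if w else 0.1            # P(dropout | w)
    p_y1 = 0.3 + 0.5 * u - 0.2 * a      # P(disease | a, u)
    return (0.5 * (0.3 if u else 0.7)
            * (p_w1 if w else 1 - p_w1)
            * (p_s1 if s else 1 - p_s1)
            * (p_y1 if y else 1 - p_y1))


def prob(event):
    """Probability of an event, summed over all 32 binary cells."""
    return sum(p_joint(*c) for c in product((0, 1), repeat=5) if event(*c))


def risk(a, w=None, complete_only=True):
    """P(y = 1 | arm a), optionally within pain stratum w and among non-dropouts."""
    def cond(a_, u, w_, s, y):
        return a_ == a and (w is None or w_ == w) and (not complete_only or s == 0)
    return prob(lambda *c: cond(*c) and c[4] == 1) / prob(cond)


def g_formula(a):
    """Sum over w of P(y = 1 | w, s = 0, a) * P(w | a)."""
    p_arm = prob(lambda a_, u, w_, s, y: a_ == a)
    total = 0.0
    for w in (0, 1):
        p_w = prob(lambda a_, u, w_, s, y: a_ == a and w_ == w) / p_arm
        total += risk(a, w) * p_w
    return total


true_rd = risk(1, complete_only=False) - risk(0, complete_only=False)
complete_case_rd = risk(1) - risk(0)
stratified_rd = {w: risk(1, w) - risk(0, w) for w in (0, 1)}
g_formula_rd = g_formula(1) - g_formula(0)

print(f"true: {true_rd:.3f}  complete case: {complete_case_rd:.3f}")
print(f"stratified: {stratified_rd[0]:.3f} (no pain), {stratified_rd[1]:.3f} (pain)")
print(f"g-formula: {g_formula_rd:.3f}")
```

Under these assumptions the complete-case contrast comes out near -0.235 and both stratum-specific contrasts near -0.26, while the g-formula recovers -0.20. Exact population values need not match a finite simulation to the last reported digit.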

An anonymous reviewer noted that the derivation above also holds with certain additional edges in the causal diagram, such as $W \rightarrow Y$ or $A \rightarrow S$. These would lead to, respectively, edges $W^a \rightarrow Y^a$ or $a \rightarrow S^a$ in the single-world intervention graph. In the latter case, $Y^a$ is d-separated from $S^a$ given $W^a$ and $a$, thus $Y^a$ would remain independent of $S^a$ conditional on $W^a$ (Theorem 12 in Richardson and Robins, reference 3). The reviewer also noted that the derivation fails with unmeasured confounding between $W$ and $S$ or between $S$ and $Y$.