Journal of Epidemiology & Community Health. 72(Suppl_1):A66, SEPTEMBER 2018

DOI: 10.1136/jech-2018-SSMabstracts.137

ISSN Print: 0143-005X

Publication Date: September 2018

# P11 Mathematical coupling and causal inference through example

L Berrie, PWG Tennant, PD Norman, PD Baxter, MS Gilthorpe

1 School of Medicine, University of Leeds, Leeds, UK; 2 School of Geography, University of Leeds, Leeds, UK

### Abstract

In health studies, proportions and percentages can often seem more informative than raw counts and therefore appear to be of more interest to analysts. However, it has long been acknowledged that their use is problematic in correlation and regression analyses where they comprise common components that are present in both the dependent and independent constituents of a model (exposure and outcome), as in the regression analysis of proportions with common denominators. We demonstrate this so-called mathematical coupling with real-world examples aided by directed acyclic graphs (DAGs) and simulations.

We consider three possible real-world scenarios: (1) the population size (N) of a geographical area causes both the number of people living in detached houses (X) and the number of people living in care homes (Y) within each area, but the number of people living in detached houses (X) does not cause the number of people living in care homes (Y) within any area, or vice versa; (2) the population size (N) of a geographical area causes both the number of people with no formal qualifications (X) and the number of people with poor self-reported health (Y), while both the population size (N) and the number of people with no formal qualifications (X) are causes of the number of people with poor self-reported health (Y); and (3) within a geographical area, the area wealth (X) causes the number of elderly people (N), while both area wealth (X) and the number of elderly people (N) cause social care expenditure (Y).

We show how historical solutions to the issue of mathematical coupling caused by a common denominator hold when the denominator is a confounder of the exposure-outcome relationship; i.e. the simulated examples under scenarios 1 and 2 yield the expected regression coefficients. The same solution does not hold in scenario 3, when the denominator is a mediator (i.e. lies on the causal path) between the exposure and outcome.

We show how DAGs and accompanying causal graph theory can be used to understand a problem first presented over a century ago. We highlight the issue of mathematical coupling when analysing proportions with a common denominator, showing under which circumstances historical solutions are valid or invalid. By using real-world examples to inform simulations, we demonstrate the utility of DAGs and causal graph theory in health geography and observational research to understand statistical problems and to verify proposed solutions.
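
The contrast between scenarios 1 and 3 can be illustrated with a small simulation. The abstract does not name the historical solution or report any simulation parameters, so the sketch below rests on stated assumptions: the solution is taken to be modelling counts with the denominator N as a covariate instead of dividing by it, all relationships are linear, and every intercept, slope and noise level is an illustrative choice, not a value from the study. The helper names `ols` and `simulate` are hypothetical.

```python
import numpy as np


def ols(y, *covariates):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    design = np.column_stack([np.ones_like(y), *covariates])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n_areas=20_000, seed=0):
    """Simulate scenarios 1 and 3 with assumed (illustrative) parameters."""
    rng = np.random.default_rng(seed)

    # Scenario 1: N -> X and N -> Y, with no effect of X on Y.
    # N is a confounder. Intercepts and slopes are assumptions.
    N = rng.uniform(500, 5000, n_areas)
    X = 200 + 0.30 * N + rng.normal(0, 30, n_areas)
    Y = 100 + 0.05 * N + rng.normal(0, 10, n_areas)

    # Regressing one proportion on the other: both share 1/N,
    # so the slope is non-zero although X has no effect on Y.
    ratio_slope = ols(Y / N, X / N)[1]
    # Assumed "historical solution": counts, with N as a covariate.
    adjusted_slope = ols(Y, X, N)[1]

    # Scenario 3: X -> N -> Y and X -> Y. N is a mediator.
    # True total effect of X on Y by construction: 1.0 + 0.5 * 2.0 = 2.0.
    X3 = rng.normal(0, 1, n_areas)
    N3 = 1000 + 2.0 * X3 + rng.normal(0, 1, n_areas)
    Y3 = 1.0 * X3 + 0.5 * N3 + rng.normal(0, 1, n_areas)

    total_effect = ols(Y3, X3)[1]        # leaves the mediated path open
    direct_effect = ols(Y3, X3, N3)[1]   # conditioning on N blocks it

    return {
        "ratio_slope": ratio_slope,
        "adjusted_slope": adjusted_slope,
        "total_effect": total_effect,
        "direct_effect": direct_effect,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the ratio-on-ratio slope in scenario 1 is clearly positive even though X has no effect on Y, while the N-adjusted slope is close to zero. In scenario 3 the same adjustment returns only the direct effect (about 1.0) and not the total effect (about 2.0), which is one way to read the statement that the solution fails when the denominator is a mediator.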