Confounders & conditioning of analyses

Idea: Statistical associations between any two variables generally vary depending on the values taken by other potentially "confounding" variables. We need to take this dependency (or conditionality) into account when using our analyses to make predictions or hypothesize about causes, but how do we decide which variables are relevant and real confounders?

Initial notes

From PT:
Before reading this week's articles, read Gordis on confounding variables (chapter 14 in older editions or chapter 15 in the new edition) and chapter 3 or 4 on age adjustment (standardization).
When reading the articles, make notes on how the readings address the topic of adjusting for confounding variables (which includes age-standardization) and identify controversies or discordant views about how to do this.
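
As a concrete illustration of age adjustment, the following minimal Python sketch applies direct standardization to two populations. All counts are hypothetical and were chosen so that the age-specific rates are identical in both populations. The function names and the use of the combined population as the standard are assumptions made for illustration, not taken from Gordis.

```python
def crude_rate(deaths, population):
    """Overall rate: total events divided by total population."""
    return sum(deaths) / sum(population)


def directly_standardized_rate(deaths, population, standard_population):
    """Weight each age-specific rate by the standard population's age share."""
    total_standard = sum(standard_population)
    return sum(
        (d / n) * (s / total_standard)
        for d, n, s in zip(deaths, population, standard_population)
    )


# Hypothetical counts in three age bands (young, middle, old).
pop_a = [8000, 1500, 500]      # population A is mostly young
deaths_a = [8, 6, 10]
pop_b = [2000, 3000, 5000]     # population B is mostly old
deaths_b = [2, 12, 100]

# Hypothetical standard: the two populations combined.
standard = [a + b for a, b in zip(pop_a, pop_b)]

crude_a = crude_rate(deaths_a, pop_a)
crude_b = crude_rate(deaths_b, pop_b)
adjusted_a = directly_standardized_rate(deaths_a, pop_a, standard)
adjusted_b = directly_standardized_rate(deaths_b, pop_b, standard)

print(f"crude per 1,000:    A={crude_a * 1000:.1f}  B={crude_b * 1000:.1f}")
print(f"adjusted per 1,000: A={adjusted_a * 1000:.1f}  B={adjusted_b * 1000:.1f}")
```

With these made-up numbers the crude rates differ almost fivefold (2.4 versus 11.4 per 1,000), although each age band has the same rate in both populations. After weighting by a common standard, both come to 6.9 per 1,000, so the crude difference here is due entirely to age structure.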

Immunization levels (Egede): Note the conclusion about racial/ethnic inequality even after adjusting for other variables thought to correlate with race/ethnicity. Do you agree with the three implications (p. 326ff) drawn from the results?

SES gradients in disease (Krieger): The abstract states that "for virtually all outcomes, risk increased with CT [census tract] poverty, and when we adjusted for CT poverty, racial/ethnic disparities were substantially reduced." Where can the result of adjustment be seen in the paper? (This paper also fits in week 7 on inequalities.)

Hormone replacement therapy (Prentice vs. Petitti): Notice the adjustments used by the first paper that bring the clinical component of the WHI hormone replacement trial into line with the observational component. Does Petitti acknowledge and rebut this in concluding that it was wrong to think that hormone therapy prevents CV disease?

Birth weight and blood pressure (Huxley vs. Davies): Along with Huxley et al.'s general argument that the birthweight-adult blood pressure association may well be an artifact of selective publication of studies with small sample size, they criticise the adjustment of the association for adult weight. (With that adjustment, the claim is that the association holds among people in the same stratum or slice of adult weight.) Try to form an opinion about whether you agree or disagree with such an adjustment. Davies et al. provide counter-evidence to Huxley et al. How does their study differ in methods, results, and interpretation?

Control at work and mortality (Davey-Smith 1997): This simple study shows that "control at work" is not the cause of SES gradients in health outcomes. What method(s) do they use to undermine previous claims about control at work?

Mendelian randomization to analyze environmental exposures (Davey-Smith & Ebrahim 2007): The approach introduced in this paper was cutting edge "epidemiology in the age of genomics" in the 2000s and led to funding of a major Research Center under Davey-Smith at Bristol. I suggest that you summarize for yourself the logic of this approach so you can explain it to someone who's never heard of it.
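
One way to rehearse the logic is with a toy calculation. The Python sketch below is not taken from the paper: it uses a tiny hypothetical data set and the ratio (Wald) estimator, a common way of formalizing an instrumental-variable argument. The assumptions are that genotype is unrelated to the confounder, that genotype shifts the exposure, and that genotype affects the outcome only through the exposure. The variable names and the numbers (a true effect of 0.5) are invented for illustration.

```python
from statistics import mean


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical individuals: genotype G and an unmeasured confounder U,
# balanced so that G and U are unrelated (as if genotype were randomized).
genotype = [0, 0, 1, 1]
confounder = [0, 1, 0, 1]

# Assumed data-generating rules (illustrative only):
#   exposure = 1.0 * G + 2.0 * U
#   outcome  = 0.5 * exposure + 3.0 * U   (true causal effect is 0.5)
exposure = [1.0 * g + 2.0 * u for g, u in zip(genotype, confounder)]
outcome = [0.5 * x + 3.0 * u for x, u in zip(exposure, confounder)]

# The conventional observational estimate is distorted by the confounder.
observational = slope(exposure, outcome)

# Ratio estimate: gene-outcome association divided by gene-exposure association.
ratio_estimate = slope(genotype, outcome) / slope(genotype, exposure)

print(f"observational slope: {observational:.2f}")
print(f"ratio estimate:      {ratio_estimate:.2f}")
```

In this constructed example the observational slope is 1.7, more than three times the true effect, while the ratio estimate recovers 0.5. That recovery depends entirely on the three stated assumptions holding.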

Franks shows that SES is associated with CHD independently of the traditional risk factors that are included in the Framingham risk score, even when those factors are monitored after the baseline date. Think about how to explain how SES might affect CHD via pathways or causal risk factors not included in the Framingham risk score. Should the online risk calculators be revised (including Ridker's Reynolds risk score)?

Krieger and Davey-Smith (2016b) respond to commentaries on Krieger and Davey-Smith (2016) (a supplementary reading for session 5). Of course, their responses require reading of the commentaries to make full sense of, but the boxes provide examples that illustrate what DAGs are and why the authors resist the reliance on DAGs for thinking about confounding.

Notes and annotations from 2007 course, 2009
Common readings and cases: Davey-Smith 1997 (Control at work and mortality), Davey-Smith & Ebrahim 2007 (Mendelian randomization to analyze environmental exposures), Hernan (2000), Lynch 2007
Supplementary Reading: Davies 2006, Egede 2003, Huxley 2002, Lawlor 2004, Petitti 2005, Prentice 2005

Annotations on common readings

Is control at work the key to socioeconomic gradients in mortality?
By: George Davey Smith and Seeromanie Harding, Lancet, 00995355, 11/08/97, Vol. 350, Issue 9088
In this article Davey Smith and Harding consider whether control at work is the key driver of socioeconomic gradients in mortality. One existing hypothesis is that low control over one’s activity at work (a pervasive job-related stressor) is a major contributor to the socioeconomic differentials in mortality rates. Historically, there have been substantial socioeconomic differentials in coronary heart disease (CHD) that are not accounted for by conventional cardiovascular risk factors. Previous analyses that adjusted statistically for self-reported job control essentially eliminated the socioeconomic gradient in CHD incidence. However, the authors argue that low job control is practically synonymous with low socioeconomic status, so collinearity between the two, and the resulting loss of statistical power, may be a weakness of these analyses. The steeper socioeconomic gradient in CHD for women than for men at the beginning of the century, when women were not a formal constituent of the labor force, is cited as an important consideration supporting skepticism about an independent causal contribution of low job control to the social distribution of CHD. It is also noted that the social gradient in CHD among people beyond working age is the same as that among people of working age. The authors explore this issue further by analyzing the association between mortality and socioeconomic position (as indexed by car access) in a longitudinal study. Their results show that socioeconomic differentials in mortality were not specific to the employed, for whom low job control could be a plausible mechanism for increased CHD risk. They suggest that job control may act as a sensitive indicator of socioeconomic position. (SY)

Causal Knowledge as a Prerequisite for Confounding Evaluation: An Application to Birth Defects Epidemiology
Miguel A. Hernán, Sonia Hernández-Díaz, Martha M. Werler, and Allen A. Mitchell
In epidemiologic studies, statistical analyses revolve around three essential sets of variables: the exposure, the outcome, and the confounder(s). The variables defining exposure and outcome are usually determined by the causal question under consideration; confounders, however, must be identified on other grounds. A confounder is usually characterized as “… a variable associated with the exposure in the population, associated with the outcome conditional on the exposure and not in the causal pathway between the exposure and the outcome” (Hernán et al.). Adjusting for, stratifying on, or conditioning on a common cause of exposure and outcome are ways of removing the spurious component of the association between exposure and disease. Hernán et al. argue that procedures for identifying confounders rely mostly on statistical associations, despite the benefits of having a priori theories about the causal network linking the variables under study. Three common strategies for deciding whether a variable is a confounder are (1) automatic variable selection, (2) a relative change of more than 10 percent between the adjusted and unadjusted effect estimates, and (3) standard rules for confounding. The authors note that strategies 1 and 2 depend only on statistical associations, whereas strategy 3 combines statistical associations from the data with background knowledge of the causal pathway. They argue that all three strategies can lead to bias, either by omitting important confounders or by adjusting inappropriately for non-confounders. In this article, Hernán et al. use a case-control study of folic acid supplementation and risk of neural tube defects to “highlight the potential inconsistencies between beliefs and actions in data analysis” (Hernán et al.). Causal diagrams are used to illustrate three possible sources of statistical association between two variables: cause and effect, sharing of common causes, and calculation of the association within levels of a common effect. Hernán et al. show that all three strategies favor the adjusted effect estimate over the crude effect estimate. However, they then apply a priori subject-matter knowledge to argue that the crude estimate should probably be preferred. They propose that knowledge of the causal structure is a prerequisite for accurately labeling a variable as a confounder, and that statistical criteria alone are insufficient for identifying confounders and thus insufficient for preventing biased inference. (SY)
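
The change-in-estimate strategy can be made concrete with a short Python sketch. The counts below are hypothetical and are not the folic acid data from the paper. The Mantel-Haenszel summary odds ratio is used here as one standard way of adjusting for a stratifying variable, and the exact form of the 10 percent rule (change relative to the crude estimate) is an assumption, since variants exist.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b = exposed and unexposed cases; c, d = exposed and unexposed controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical case-control counts within two levels of a third variable.
strata = [
    (10, 20, 40, 80),   # level 1: stratum-specific odds ratio is 1.0
    (60, 15, 40, 10),   # level 2: stratum-specific odds ratio is 1.0
]

# Crude table: collapse the counts over the third variable.
crude_table = [sum(cells) for cells in zip(*strata)]
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)

# Assumed form of the rule: flag the variable if the estimate moves by more than 10%.
relative_change = abs(crude_or - adjusted_or) / crude_or
flagged_by_10_percent_rule = relative_change > 0.10

print(f"crude OR = {crude_or:.2f}, adjusted OR = {adjusted_or:.2f}")
print(f"flagged by the rule: {flagged_by_10_percent_rule}")
```

Here the crude odds ratio is 2.25 and the adjusted one is 1.00, so the rule flags the third variable. The arithmetic is the same whether that variable is a common cause of exposure and outcome or a common effect of them, which is why the numbers alone cannot say whether the adjusted or the crude estimate is the one to trust.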

Annotated additions by students

(In alphabetical order by author's name with contributor's initials and date at the end.)