第11章 因果分析

11.1 Introduction

“Impact assessment, simply defined, is the process of identifying the future consequences of a current or proposed action.” (IAIA, 2009)

“Policy assessment seeks to inform decision-makers by predicting and evaluating the potential impacts of policy options.” (Adelle and Weiland, 2012)

“… I see no greater impediment to scientific progress than the prevailing practice of focusing all our mathematical resources on probabilistic and statistical inferences while leaving causal considerations to the mercy of intuition and good judgment.” (Pearl, 1999)

11.2 Causal Information

11.2.1 Target

practical framework for causal effect estimation in the context of policy assessment and impact analysis, and in the absence of experimental data

11.2.2 Causal sources

  • Causal Inference by Experiment: Randomized experiments
  • Causal Inference from Observational Data and Theory: Existing data or Big data

11.2.3 Identification and Estimation Process

  • Causal Identification: domain knowledge based
  • Computing the Effect Size: Bayesian networks

11.3 Theoretical Background

11.3.1 Potential Outcomes Framework

  • \(Y_{i,1}\) Potential outcome of individual i given treatment T=1 (e.g. taking two Aspirins)
  • \(Y_{i,0}\) Potential outcome of individual i given treatment T=0 (e.g. drinking a glass of water)
  • individual-level causal effect (ICE)
    • \(ICE=Y_{i,1} −Y_{i,0}\)
  • average causal effect (ACE)
    • \(ACE = E[Y_{i,1}] −E[Y_{i,0}]\)

11.3.2 Causal Identification

  • \(Y_{i,1}\) (treatment) and \(Y_{i,0}\) (non-treatment) can never be both observed for the same individual at the same time
  • Association S
    • \(S = E[Y_1|T = 1] - E[Y_0|T = 0]\)
  • S is not the same with ACE
    • association does not imply causation
    • randomized experiment

11.3.2.1 Ignorability

  • \((Y_1,Y_0) \upmodels T\), \(Y_1\) and \(Y_0\) must be jointly independent of the treatment assignment
  • \((Y_1,Y_0) \upmodels T|X\) for realobservational studies, conditional on variables X, \(Y_1\), and \(Y_0\) are jointly independent of T, the assignment mechanism
  • \(ACE|X = E[Y_1|X] - E[Y_0|X] = E[Y_1|T=1,X] - E[Y_0|􏰀T = 0,X] = E[Y|􏰀T = 1,X] - E[Y|􏰀T =0,X] = S|􏰀X\)

11.3.2.2 Assumptions

  • Causal inference requires causal assumptions

11.4 Methods for Identification and Estimation

11.4.1 Directed Acyclic Graphs(DAG) for Identification

  • DAGs Are Nonparametric
  • A Node represents a variable in a domain, regardless of whether it is observable or unobservable
  • A Directed Arc has the appearance of an arrow and represents a potential causal effect. The arc direction indicates the assumed causal direction, i.e. “A→B” means “A causes B.”
  • A Missing Arc encodes the definitive absence of a direct causal effect, i.e. no arc between A and B means that there exists no direct causal relationship between A and B and vice versa. As such, a missing arc repre- sents an assumption

11.4.2 Indirect Connection

  • A causes B via node C
  • \(A \nupmodels B\) and \(A \upmodels B|C\)

11.4.3 Common Cause

  • C causes both A and B
  • \(A \nupmodels B\) and \(A \upmodels B|C\)

11.4.4 Common Effect

  • C is the common effect of A and B
  • \(A \upmodels B\) and \(A \nupmodels B|C\)

11.4.5 Example: Simpson’s Paradox