
STA 235H - Potential Outcomes

Fall 2023

McCombs School of Business, UT Austin

1 / 46



How? Potential Outcomes Framework

What? Causal Estimands

Why? Causal Questions and Study Design

2 / 46

The "How": Potential outcomes framework

3 / 46

4 / 46

What do you think are the biggest issues here?

5 / 46


6 / 46



Before we start...

Be clear about your language
Be clear about your data
Be clear about your assumptions

7 / 46

What is Causal Inference?

Inferring the effect of one thing on another thing

  • "My headache went away because I took an aspirin".

  • "The new marketing campaign increased our sales by 20%"

  • "Providing students support when filling out FAFSA forms improves college access and completion."

8 / 46

A world of potential (outcomes)

  • Under a binary treatment or intervention, there are two potential worlds:
    • World 1: You take the pill
    • World 2: You don't take the pill

9 / 46

A world of potential (outcomes)

  • A potential outcome is the outcome under each of these scenarios or "worlds".

    • There will be one for each path!
  • A priori, each of these scenarios has a potential outcome

  • A posteriori, I can only observe at most one of the potential outcomes

This is the Fundamental Problem of Causal Inference.

10 / 46

What are the potential outcomes for our previous example?

11 / 46

Potential Outcomes Examples

  • "My headache went away because I took an aspirin".

Potential outcomes: headache status if I take an aspirin / headache status if I don't take an aspirin

  • "The new marketing campaign increased our sales by 20%"
  • "Providing students support when filling out FAFSA forms improves college access and completion."
12 / 46

Let's see a specific example

  • You work at a retail company and you are debating on whether to send out an email campaign to boost your sales:

  • You are interested in two specific outcomes:

Sales: Whether a customer makes a purchase or not.

Churn: Whether a customer unsubscribes from your mailing list or not.

13 / 46

Potential Outcomes Framework

Let's introduce some notation:

  • Let $Y_i$ be the observed outcome for unit $i$ (e.g. whether a person makes a purchase or not).
  • Let $Z_i$ be the treatment or intervention (e.g. receiving a promotional email (1) or not (0)).
  • Let $Y_i(z)$ be the potential outcome under treatment $Z = z$ (e.g. whether the person would make a purchase or not if they received treatment $z$).

If a person is treated, $Z_i = 1$, their observed outcome $Y_i$ is the same as their potential outcome under treatment, $Y_i(1)$:

$Y_i \mid (Z_i = 1) \overset{\Delta}{=} Y_i(1)$

In the same fashion, if a person is not treated, $Z_i = 0$, their observed outcome $Y_i$ is the same as their potential outcome under control, $Y_i(0)$:

$Y_i \mid (Z_i = 0) \overset{\Delta}{=} Y_i(0)$

14 / 46

Potential Outcomes Framework

This means that we can write the observed outcome as a function of the potential outcomes:

$Y_i = Z_i Y_i(1) + (1 - Z_i) Y_i(0)$

  • This definition will be useful because we can see this as a missing data problem.
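
A minimal sketch of the missing data view, assuming NumPy is available. The treatment vector matches the email example used later in these slides; the potential outcome values for the arm that is never observed are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical potential outcomes for six customers (illustrative values only).
y1 = np.array([1, 0, 1, 1, 0, 1])  # Y_i(1): purchase if emailed
y0 = np.array([0, 0, 1, 1, 0, 0])  # Y_i(0): purchase if not emailed
z = np.array([0, 1, 1, 0, 0, 1])   # Z_i: treatment actually received

# Observed outcome: Y_i = Z_i * Y_i(1) + (1 - Z_i) * Y_i(0)
y_obs = z * y1 + (1 - z) * y0

# What the analyst sees: the potential outcome of the other arm is missing (NaN).
y1_seen = np.where(z == 1, y1, np.nan)
y0_seen = np.where(z == 0, y0, np.nan)

print(y_obs)
print(y1_seen)
print(y0_seen)
```

Each unit contributes exactly one non-missing potential outcome, which is why the individual difference can never be computed from observed data alone.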
15 / 46

Causal Effects

Individual Causal Effect

$ICE_i = Y_i(1) - Y_i(0)$


Can we ever observe individual causal effects?

No!*

18 / 46

Only one realization

Z=1

Z=0

19 / 46

The "What": Causal estimands, estimates, and estimators

20 / 46

Estimands vs Estimates vs Estimators

Estimand
A quantity we want to estimate
E.g.: Population mean, $\mu$

Estimate
The result of an estimation
E.g.: Result of the sample mean for a given sample $S$, $\hat{\mu}$

Estimator
A rule for calculating an estimate based on data
E.g.: Sample mean, $\frac{1}{n}\sum_i Y_i$
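
The three terms can be told apart in a few lines of code. This is only a sketch on simulated data: the population mean of 5 and the normal distribution are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Estimand: the population mean mu (known here only because we simulate the data).
mu = 5.0

def sample_mean(y):
    """Estimator: a rule that maps any sample to a number."""
    return np.sum(y) / len(y)

# Estimate: the number the estimator returns for one particular sample S.
sample = rng.normal(loc=mu, scale=2.0, size=1000)
mu_hat = sample_mean(sample)

print(mu, mu_hat)
```

Drawing a different sample would give a different estimate, while the estimand and the estimator stay the same.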

22 / 46

Estimands vs Estimates vs Estimators

Source: Deng, 2022

23 / 46

Estimands vs Estimates vs Estimators

  • Some important estimands that we need to keep in mind:

Average Treatment Effect (ATE)

Average Treatment Effect on the Treated (ATT)

Conditional Average Treatment Effect (CATE)

24 / 46

Estimands vs Estimates vs Estimators

  • Some important estimands that we need to keep in mind:

ATE: E.g. Average Treatment Effect for all customers

ATT: E.g. Average Treatment Effect for customers that received the email

CATE: E.g. Average Treatment Effect for customers under 25 years old

25 / 46

Estimands vs Estimates vs Estimators

  • Some important estimands that we need to keep in mind:

$ATE = E[Y(1) - Y(0)]$

$ATT = E[Y(1) - Y(0) \mid Z = 1]$

$CATE = E[Y(1) - Y(0) \mid X]$
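
To see how the three estimands differ, the sketch below simulates a population in which both potential outcomes are known, something that never happens with real data. Every probability and variable name here is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical simulated population (all names and numbers are assumptions).
young = rng.random(n) < 0.3                       # X: customer is under 25
y0 = (rng.random(n) < 0.2).astype(int)            # Y(0): purchase without email
# In this made-up world the email works better for young customers.
y1 = (rng.random(n) < np.where(young, 0.5, 0.3)).astype(int)  # Y(1)
# Young customers are also more likely to receive the email.
z = (rng.random(n) < np.where(young, 0.7, 0.4)).astype(int)

ice = y1 - y0                    # individual causal effects (unobservable in practice)
ate = ice.mean()                 # E[Y(1) - Y(0)]
att = ice[z == 1].mean()         # E[Y(1) - Y(0) | Z = 1]
cate_young = ice[young].mean()   # E[Y(1) - Y(0) | X = under 25]

print(ate, att, cate_young)
```

Because the email effect is larger for young customers in this simulation, the CATE for that group is larger than the ATE; the three estimands answer different questions and need not coincide.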

26 / 46

Getting around the fundamental problem of causal inference

  • Let's go back to our original example: Does an email campaign increase sales?
i    Z    Y    Y(1)    Y(0)    Y(1)-Y(0)
1    0    0    ?       0       ?
2    1    0    0       ?       ?
3    1    1    1       ?       ?
4    0    1    ?       1       ?
5    0    0    ?       0       ?
6    1    1    1       ?       ?
27 / 46

Getting around the fundamental problem of causal inference

  • We have a missing data problem
28 / 46

Getting around the fundamental problem of causal inference

  • Compare those who received the email to those who did not receive it.

$\hat{\tau} = \frac{1}{3}\sum_{i \in Z=1} Y_i - \frac{1}{3}\sum_{i \in Z=0} Y_i = 0.333$
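
The same difference in sample means can be reproduced directly from the six rows of the email example (columns Z and Y). A short sketch, assuming NumPy:

```python
import numpy as np

# Treatment Z and observed outcome Y for units 1..6, copied from the email example.
z = np.array([0, 1, 1, 0, 0, 1])
y = np.array([0, 0, 1, 1, 0, 1])

mean_treated = y[z == 1].mean()   # (0 + 1 + 1) / 3
mean_control = y[z == 0].mean()   # (0 + 1 + 0) / 3
tau_hat = mean_treated - mean_control

print(round(tau_hat, 3))  # 0.333
```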

31 / 46

Getting around the fundamental problem of causal inference

If we had more data, we could do the same with a simple regression:

$Purchase = \beta_0 + \beta_1 \cdot Email + \varepsilon$

Imagine you get the following results:

$Purchase = 0.4 + 0.33 \cdot Email + \varepsilon$

  • Interpret the coefficient for Email:
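
With a binary regressor, the OLS slope equals the difference in sample means between the two groups, and the intercept equals the control group mean. The sketch below checks this on simulated data; the data-generating numbers are assumptions that loosely mirror the coefficients above, not course data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Simulated data (an assumption for illustration; not the course dataset).
email = rng.integers(0, 2, size=n)
purchase = (rng.random(n) < 0.4 + 0.33 * email).astype(int)

# OLS of Purchase on an intercept and Email via least squares.
X = np.column_stack([np.ones(n), email])
beta, *_ = np.linalg.lstsq(X, purchase, rcond=None)

mean_control = purchase[email == 0].mean()
diff_means = purchase[email == 1].mean() - mean_control

print(beta)                      # [intercept, slope]
print(mean_control, diff_means)  # same two numbers, up to floating point error
```

Read this way, a coefficient of 0.33 says that customers who received the email purchased at a rate 33 percentage points higher than those who did not. Whether that difference is causal depends on how the email was assigned.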
32 / 46








What could be the problem with comparing the sample means?

33 / 46

Let's do a little exercise

34 / 46


Look at your green piece of paper and go to the following website

https://sta235h.click/week4

Would you go to a physician/urgent care?

35 / 46

The "Why": Causal questions and study designs

36 / 46

Under what assumptions is our estimate causal?

We are using: $\hat{\tau} = \frac{1}{3}\sum_{i \in Z=1} Y_i - \frac{1}{3}\sum_{i \in Z=0} Y_i$

to estimate:

$\tau = E[Y_i(1) - Y_i(0)]$

Let's do some math

38 / 46

Under what assumptions is our estimate causal?

$\tau = E[Y_i(1) - Y_i(0)] = E[Y_i(1)] - E[Y_i(0)]$

Key assumption: Ignorability

Ignorability means that the potential outcomes $Y(0)$ and $Y(1)$ are independent of the treatment, i.e. $(Y(0), Y(1)) \perp\!\!\!\perp Z$. This implies:

$E[Y_i(1) \mid Z = 0] = E[Y_i(1) \mid Z = 1] = E[Y_i(1)]$ and

$E[Y_i(0) \mid Z = 0] = E[Y_i(0) \mid Z = 1] = E[Y_i(0)]$

39 / 46

Under what assumptions is our estimate causal?

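$\tau = E[Y_i(1) - Y_i(0)] = E[Y_i(1)] - E[Y_i(0)]$

  • Under ignorability (see previous slide), $E[Y_i(1)] = E[Y_i(1) \mid Z = 1] = E[Y_i \mid Z = 1]$ and $E[Y_i(0)] = E[Y_i(0) \mid Z = 0] = E[Y_i \mid Z = 0]$, so:

$\tau = E[Y_i(1)] - E[Y_i(0)] = \underbrace{E[Y_i(1) \mid Z = 1]}_{\text{Obs. outcome for T}} - \underbrace{E[Y_i(0) \mid Z = 0]}_{\text{Obs. outcome for C}}$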
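
A small simulation can illustrate why ignorability matters. In the sketch below, the population, the "loyal customer" confounder, and all probabilities are hypothetical. The difference in observed means is computed twice: once when the email is assigned at random, and once when loyal customers are targeted.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Hypothetical population: loyal customers buy more often regardless of email.
loyal = rng.random(n) < 0.5
y0 = (rng.random(n) < np.where(loyal, 0.6, 0.2)).astype(int)  # Y(0)
y1 = (rng.random(n) < np.where(loyal, 0.7, 0.3)).astype(int)  # Y(1)
true_ate = (y1 - y0).mean()

def diff_in_means(z):
    """Difference in observed mean outcomes between treated and control units."""
    y = z * y1 + (1 - z) * y0          # observed outcome
    return y[z == 1].mean() - y[z == 0].mean()

# Randomized assignment: Z is independent of (Y(0), Y(1)), so ignorability holds.
z_random = rng.integers(0, 2, size=n)
# Targeted assignment: loyal customers are far more likely to get the email.
z_targeted = (rng.random(n) < np.where(loyal, 0.9, 0.1)).astype(int)

est_random = diff_in_means(z_random)
est_targeted = diff_in_means(z_targeted)

print(true_ate, est_random, est_targeted)
```

Under random assignment the difference in means lands close to the true average effect. Under targeted assignment it is much larger, because the treated group contains more loyal customers who would have purchased anyway.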

40 / 46

Ignorability Assumption

We can just "ignore" the missing data problem:

i     Z    Y    Y(1)    Y(0)    Y(1)-Y(0)
1     0    0            0
2     1    0    0
3     1    1    1
4     0    1            1
5     0    0            0
6     1    1    1
Mean            2/3     1/3
43 / 46

Main takeaway points

Causal Inference is hard

  • Think about the causal problem

  • Check validity of assumptions (Is ignorability plausible? Am I controlling for the right covariates?)

  • Most of this chapter will be spent looking for exogenous variation that makes the ignorability assumption plausible.

44 / 46

Next week

  • Randomized Controlled Trials:

    • Pros and Cons
    • Concept of validity
    • A/B Testing

45 / 46

References

46 / 46


