Grades for Homework 2 will be posted this week.
Review the Answer Key on the course website (posted Mon/Tue after submission).
Everyone did pretty well, but remember that answers need to match submitted code.
Midterm is in class (week of Oct. 16th):
Practice quiz (not graded, but mandatory) for proctored exams (HonorLock).
There will be a review session Thur/Fri before the midterm (poll).
Finished with randomized controlled trials.
Introduced observational studies:
Talk about other Observational Studies:
Natural Experiments
Difference-in-Differences
First half: Material
Second half: You will tackle an exercise.
Recap so far
Limitations in RCTs:
Generalizability
Breaking SUTVA: Spillover effects and General Equilibrium Effects.
Introduced Observational Studies:
We need to control for confounders: Conditional Ignorability Assumption.
How? E.g. Regression, Matching.
Randomized Controlled Trials (RCTs)
Treatment assignment is randomized
Ignorability assumption holds by design: Groups are comparable in observed and unobserved characteristics.
Analysis? (i) Check balance and (ii) difference in means.
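The two analysis steps above can be sketched in R. This is a minimal illustration on simulated data: the variable names (`age`, `treat`, `y`), the sample size, and the true effect of 2 are all assumptions made up for the example, not course data.

```r
# Simulated RCT (hypothetical data, not from the course)
set.seed(235)
n <- 1000
age <- rnorm(n, mean = 40, sd = 10)          # observed covariate
treat <- rbinom(n, size = 1, prob = 0.5)     # randomized assignment
y <- 5 + 2 * treat + 0.1 * age + rnorm(n)    # true effect set to 2

rct <- data.frame(y, treat, age)

# (i) Balance check: covariate means should be similar across arms
balance <- tapply(rct$age, rct$treat, mean)
print(balance)

# (ii) Difference in means (same point estimate as lm(y ~ treat))
diff_means <- mean(rct$y[rct$treat == 1]) - mean(rct$y[rct$treat == 0])
reg_est <- unname(coef(lm(y ~ treat, data = rct))["treat"])
print(c(diff_means = diff_means, reg_est = reg_est))
```

The regression on a single binary treatment indicator reproduces the difference in means, which is why either can be used for step (ii).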
Selection on Observables (Matching, Regressions with covariates):
Treatment assignment is not randomized
Conditional independence assumption holds if we can control for all confounders (assumes all confounders are observed)
Analysis? (i) Compare balance before matching, (ii) compare balance after matching, and (iii) difference in means for the matched sample.
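As a rough sketch of why controlling for confounders matters, the simulation below compares a naive difference in means with a regression that adjusts for the confounder. It uses regression with covariates rather than matching, and the single confounder `x`, the assignment rule, and the true effect of 2 are assumptions made up for the example.

```r
# Simulated observational study (hypothetical data)
set.seed(235)
n <- 5000
x <- rnorm(n)                                  # observed confounder
treat <- rbinom(n, 1, plogis(1.5 * x))         # treatment probability depends on x
y <- 1 + 2 * treat + 3 * x + rnorm(n)          # true effect set to 2
obs <- data.frame(y, treat, x)

# Naive comparison: confounded because x drives both treat and y
naive <- unname(coef(lm(y ~ treat, data = obs))["treat"])

# Regression with the covariate: valid here only because x is the sole confounder
adjusted <- unname(coef(lm(y ~ treat + x, data = obs))["treat"])

print(c(naive = naive, adjusted = adjusted))
```

The adjusted estimate recovers the simulated effect only because every confounder is observed and included, which is exactly what the conditional independence assumption requires.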
Is there randomness out there?
Natural Experiments
You, as a researcher, did not assign units to treatment levels
Random: Assignment to an intervention is random (e.g. lottery)
As if random: Assignment to an intervention is not random, but it's not correlated with potential outcomes.
Context matters!
Oregon Health experiment: Lotteries for Medicaid expansion.
Vietnam Draft: Impact of military service/education (GI Bill) on earnings.
Lottery winners: Impact of unearned income on labor earnings.
We can analyze these cases just like an RCT
What do we do if we have something like a natural experiment, but our two groups are not necessarily balanced?
Two wrongs make a right
What happens if we raise the minimum wage?
Economic theory says there should be fewer jobs
New Jersey in 1992
$4.25 → $5.05
Avg. # of jobs per fast food restaurant in NJ
New Jersey (before) = 20.44
New Jersey (after) = 21.03
∆ (after − before) = 0.59
Is this a causal effect?
Avg. # of jobs per fast food restaurant
Pennsylvania (after) = 21.17
New Jersey (after) = 21.03
∆ (NJ − PA) = −0.14
Is this a causal effect?
Before vs After
Only looking at the treatment group
Impossible to separate changes due to treatment from changes due to time
Treatment vs Control
Only looking at post-treatment values
Impossible to separate changes due to treatment from differences in growth or other confounders
The idea of a DD analysis is to take the within-unit growth...
... and the across-group growth...
... and combine them!

| | Pre mean | Post mean | ∆ (post − pre) |
|---|---|---|---|
| Control | A (never treated) | B (never treated) | B − A |
| Treatment | C (not yet treated) | D (treated) | D − C |
| ∆ (treatment − control) | C − A | D − B | (D − C) − (B − A) or (D − B) − (C − A) |

∆ (post − pre) = within-unit growth
∆ (treatment − control) = across-group growth
∆ (post − pre) for treatment − ∆ (post − pre) for control = difference-in-differences = causal effect!
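The reasoning behind combining the two differences can be written out with the cell labels A, B, C, D from the table. As a sketch, let $\tau$ (a symbol introduced here, not in the slides) denote the effect on the treatment group, and assume that without treatment the treatment group would have grown by the same amount as the control group, $B - A$:

$$
\begin{aligned}
D - C &= (B - A) + \tau && \text{(before vs. after: effect plus time trend)} \\
D - B &= (C - A) + \tau && \text{(treatment vs. control: effect plus baseline gap)} \\
(D - C) - (B - A) &= \tau && \text{(difference-in-differences)}
\end{aligned}
$$

Each simple comparison carries one extra term, and subtracting the control group's growth removes it.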
| | Pre mean | Post mean | ∆ (post − pre) |
|---|---|---|---|
| Pennsylvania | 23.33 (A) | 21.17 (B) | −2.16 (B − A) |
| New Jersey | 20.44 (C) | 21.03 (D) | 0.59 (D − C) |
| ∆ (NJ − PA) | −2.89 (C − A) | −0.14 (D − B) | 0.59 − (−2.16) ≈ 2.76 (2.75 using the rounded means shown) |
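The 2×2 arithmetic in the table can be reproduced in a few lines of R. This sketch types in the four rounded means from the table by hand; with these rounded inputs the result is 2.75, slightly below the 2.76 reported on the slide.

```r
# Four cell means from the table (rounded values, typed in by hand)
means <- matrix(c(23.33, 21.17,
                  20.44, 21.03),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("PA", "NJ"), c("pre", "post")))

# Route 1: within-state growth, then difference across states
growth <- means[, "post"] - means[, "pre"]
did <- unname(growth["NJ"] - growth["PA"])

# Route 2: NJ - PA gap in each period, then difference across periods
gap <- means["NJ", ] - means["PA", ]
did_alt <- unname(gap["post"] - gap["pre"])

print(round(c(growth, did = did, did_alt = did_alt), 2))
```

Both routes give the same number, which mirrors the two equivalent expressions in the bottom-right cell of the table.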
We can use regressions!
$$Y_i = \beta_0 + \beta_1 Treat_i + \beta_2 Post_i + \beta_3 Treat_i \times Post_i + \varepsilon_i$$
where Treat = 1 for the treatment group, and Post = 1 for the after period.
Can you identify the different coefficients?
β3 is the causal effect!
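One way to see which coefficient is which is to write the expected outcome in each of the four group-period cells, using the labels A, B, C, D from the earlier table. This is a sketch that follows directly from plugging 0s and 1s into the regression equation:

$$
\begin{aligned}
E[Y \mid Treat = 0, Post = 0] &= \beta_0 = A \\
E[Y \mid Treat = 0, Post = 1] &= \beta_0 + \beta_2 = B \\
E[Y \mid Treat = 1, Post = 0] &= \beta_0 + \beta_1 = C \\
E[Y \mid Treat = 1, Post = 1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 = D
\end{aligned}
$$

Solving gives $\beta_1 = C - A$ (baseline gap), $\beta_2 = B - A$ (growth in the control group), and $\beta_3 = (D - C) - (B - A)$, the difference-in-differences.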
    library(dplyr)  # provides %>% and mutate()

    minwage <- read.csv("https://raw.githubusercontent.com/maibennett/sta235/main/exampleSite/content/Classes/Week7/1_DiffInDiff/data/minwage.csv")

    minwage <- minwage %>%
      mutate(treat = ifelse(location == "PA", 0, 1),  # treat group: the treated state
             post = ifelse(date == "nov1992", 1, 0))  # post: time after treatment was set in place

    head(minwage)

    ##        chain location wage full part    date treat post
    ## 1     wendys       PA 5.00   20   20 feb1992     0    0
    ## 2     wendys       PA 5.50    6   26 feb1992     0    0
    ## 3 burgerking       PA 5.00   50   35 feb1992     0    0
    ## 4 burgerking       PA 5.00   10   17 feb1992     0    0
    ## 5        kfc       PA 5.25    2    8 feb1992     0    0
    ## 6        kfc       PA 5.00    2   10 feb1992     0    0
    summary(lm(full ~ treat*post, data = minwage))

    ## 
    ## Call:
    ## lm(formula = full ~ treat * post, data = minwage)
    ## 
    ## Residuals:
    ##     Min      1Q  Median      3Q     Max 
    ## -10.664  -5.971  -2.405   3.653  52.029 
    ## 
    ## Coefficients:
    ##             Estimate Std. Error t value Pr(>|t|)    
    ## (Intercept)   10.664      1.007  10.589   <2e-16 ***
    ## treat         -2.693      1.117  -2.411   0.0162 *  
    ## post          -2.493      1.424  -1.750   0.0805 .  
    ## treat:post     2.927      1.580   1.853   0.0643 .  
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ## 
    ## Residual standard error: 8.243 on 712 degrees of freedom
    ## Multiple R-squared:  0.008207,  Adjusted R-squared:  0.004028 
    ## F-statistic: 1.964 on 3 and 712 DF,  p-value: 0.118
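To check that the interaction coefficient really is the difference-in-differences of the four cell means, the sketch below fits the same `treat * post` specification on simulated data and compares it with the manual calculation. The data are made up for illustration (the cell size, the coefficients, and the effect of 2.5 are assumptions), so the numbers do not correspond to the minimum wage data.

```r
# Simulated 2x2 design (hypothetical data, repeated cross-section)
set.seed(235)
n <- 200  # observations per group-period cell
sim <- expand.grid(id = 1:n, treat = c(0, 1), post = c(0, 1))
sim$y <- 20 - 3 * sim$treat - 2 * sim$post +
  2.5 * sim$treat * sim$post + rnorm(nrow(sim), sd = 4)

# Manual DiD from the four cell means (rows: treat, columns: post)
cell <- tapply(sim$y, list(sim$treat, sim$post), mean)
did_means <- (cell["1", "1"] - cell["1", "0"]) - (cell["0", "1"] - cell["0", "0"])

# Same quantity from the interaction term of the regression
did_reg <- unname(coef(lm(y ~ treat * post, data = sim))["treat:post"])

print(c(did_means = did_means, did_reg = did_reg))
```

The two estimates agree because the model with both indicators and their interaction fits each of the four cell means exactly. The regression additionally provides a standard error for the estimate.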
"Increasing the minimum wage from $4.25 to $5.05 had an average effect in New Jersey of 2.9 additional jobs per fast food restaurant"
In Difference-in-Differences, groups do not need to be balanced
Difference-in-Differences provides an estimate for an average treatment effect for the treated group
Diff-in-Diff Assumptions
Parallel Trends
In the absence of the intervention, treatment and control group would have changed in the same way
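In potential-outcomes notation, the assumption can be sketched as follows. The notation $Y_t(0)$ for the untreated potential outcome in period $t$ and $Y_t(1)$ for the treated one is an assumption of this sketch and does not appear in the slides:

$$
E[\,Y_{post}(0) - Y_{pre}(0) \mid Treat = 1\,] = E[\,Y_{post}(0) - Y_{pre}(0) \mid Treat = 0\,]
$$

The left-hand side is never observed, since the treatment group is actually treated in the post period. When the equality holds, the difference-in-differences identifies the average treatment effect for the treated group, $E[\,Y_{post}(1) - Y_{post}(0) \mid Treat = 1\,]$.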
Pre-Parallel Trends
Check by pretending the treatment happened earlier; if there's an effect, there's likely an underlying trend
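A placebo check of this kind might look like the sketch below. It simulates three periods (two before the treatment and one after), keeps only the pre-treatment periods, and pretends the treatment started in period 2. The data, the variable names (`period`, `fake_post`), and the common trend of 1.5 per period are all assumptions made up for the example.

```r
# Simulated data with two pre-treatment periods (1, 2) and one post period (3)
set.seed(235)
n <- 300  # observations per group-period cell
panel <- expand.grid(id = 1:n, treat = c(0, 1), period = 1:3)
panel$y <- 10 + 2 * panel$treat + 1.5 * panel$period +   # common trend in both groups
  3 * panel$treat * (panel$period == 3) + rnorm(nrow(panel))

# Placebo: keep pre-treatment periods only and pretend treatment began in period 2
pre <- subset(panel, period < 3)
pre$fake_post <- as.numeric(pre$period == 2)

placebo <- lm(y ~ treat * fake_post, data = pre)
placebo_est <- unname(coef(placebo)["treat:fake_post"])
print(summary(placebo)$coefficients["treat:fake_post", ])
```

Because both groups share the same trend in this simulation, the placebo interaction should be close to zero. A large placebo estimate in real data would suggest the groups were already diverging before the intervention.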
Your turn
We introduced a new study design!
If we think the parallel trend assumption holds, we can find an Average Treatment Effect for the treated group (ATT)
Next week we will see more identification strategies.
Angrist, J. and J.-S. Pischke. (2015). "Mastering 'Metrics". Chapter 2.
Angrist, J. and J.-S. Pischke. (2015). "Mastering 'Metrics". Chapter 5.
Heiss, A. (2020). "Program Evaluation for Public Policy". Class 8-9: Diff-in-diff I and II, Course at BYU.