class: center, middle, inverse, title-slide .title[ # STA 235H - Natural Experiments & Difference-In-Differences ] .subtitle[ ## Fall 2023 ] .author[ ### McCombs School of Business, UT Austin ] --- <!-- <script type="text/javascript"> --> <!-- MathJax.Hub.Config({ --> <!-- "HTML-CSS": { --> <!-- preferredFont: null, --> <!-- webFont: "Neo-Euler" --> <!-- } --> <!-- }); --> <!-- </script> --> <style type="text/css"> .small .remark-code { /*Change made here*/ font-size: 80% !important; } .tiny .remark-code { /*Change made here*/ font-size: 90% !important; } </style> #Announcements - **.darkorange[Grades for Homework 2]** will be posted this week. - Review the Answer Key on the course website (posted Mon/Tue after submission). - Everyone did pretty well, but remember that answers need to match submitted code. -- - **.darkorange[Midterm is in class (week of Oct. 16th)]**: - Practice quizz (not graded, but mandatory) for proctored exams (HonorLock). - There will be a review session Thur/Fri before the midterm (poll). --- # Last week .pull-left[ .center[ ![:scale 100%](https://media.giphy.com/media/IPbS5R4fSUl5S/giphy.gif)] ] .pull-right[ - Finished with **.darkorange[randomized controlled trials]**. - Limitations in generalizability and interference (e.g. spillovers). - Introduced **.darkorange[observational studies]**: - Controlling for observable confounders (e.g. regression and matching) ] --- # Today .pull-left[ - Talk about other **.darkorange[Observational Studies]**: - Natural Experiments - Difference-in-Differences - **.darkorange[First half]**: Material - **.darkorange[Second half]**: You will tackle an exercise. ] .pull-right[ ![:scale 100%](https://media.giphy.com/media/8FNlmNPDTo2wE/giphy.gif) ] --- background-position: 50% 50% class: left, bottom, inverse .big[ Recap so far ] --- # What did we see last week? - **.darkorange[Limitations in RCTs]**: -- - Generalizability - Breaking SUTVA: Spillover effects and General Equilibrium Effects. -- - Introduced **.darkorange[Observational Studies]**: - We need to control by confouders: Conditional Ignorability Assumption. - How? E.g. Regression, Matching. --- # Identification strategies (designs) we have seen so far... **.darkorange[Randomized Controlled trials (RCTs)]** -- - Treatment assignment is <u>randomized</u> - Ignorability assumption holds by design: Groups are comparable in obs. and unobs. characteristics. -- - Analysis? (i) Check balance and (ii) difference in means. --- # Identification strategies (designs) we have seen so far... **.darkorange[Selection on Observables (Matching, Regressions with covariates)]**: -- - Treatment assignment is <u>not randomized</u> - Conditional independence assumption holds <u>if we can control for all confounders</u> (assumes all confounders are observed) .small[ - *After adjusting for covariates*, assignment to treatment is as good as random (*Is this a credible assumption?*).] -- - Analysis? (i) Compare balance before matching, (ii) compare balance after matching, and (iii) difference in means for the matched sample. --- background-position: 50% 50% class: left, bottom, inverse .big[ Is there randomness out there? ] --- # Finding "RCTs" in the wild - Given that we can't run RCTs for everything, the next best thing is finding a source of random variation that, for all practical purposes, **.darkorange[would work as an RCT]** -- <br> <br> .box-5LA[Natural Experiments] .box-5t[You, as a researcher, did not assign units to treatment levels] -- 1. **.darkorange[Random]**: Assignment to an intervention is random (e.g. lottery) 2. **.darkorange[As if random]**: Assignment to an intervention is not random, but it's not correlated with potential outcomes. -- .box-6[Context matters!] --- # Examples of natural experiments - **.darkorange[Oregon Health experiment]**: Lotteries for Medicaid expansion. -- - **.darkorange[Vietnam Draft]**: Impact of military service/education (GI Bill) on earnings. -- - **.darkorange[Lottery winners]**: Impact of unearned income on labor earnings. -- .center[We can analyze these cases **.darkorange[just like an RCT]**] -- .box-7trans[What do we do if we have something like a natural experiment but <u>both our groups are not necessarily balanced</u>?] --- background-position: 50% 50% class: left, bottom, inverse .big[ Two wrongs make a right ] --- # Raising the minimum wage .box-5LA[What happens if we raise the minimum wage] -- .box-5tL[Economic theory says there should be fewer jobs] -- .box-5[New Jersey in 1992] .box-5[$4.25 → $5.05] --- # The setup .center[ ![:scale 55%](https://raw.githubusercontent.com/maibennett/sta235/main/exampleSite/content/Classes/Week7/1_DiffInDiff/images/min_wage_map.png)] --- # Before vs After .box-2[Avg. # of jobs per fast food restaurant in NJ] -- .box-2tL[New Jersey<sub>before</sub> = 20.44] -- .box-2tL[New Jersey<sub>after</sub> = 21.03] -- .box-2[∆ = 0.59] -- .box-2[Is this a causal effect?] --- # Treatment vs Control .box-3[Avg. # of jobs per fast food restaurant] -- .box-3tL[Pennsylvania<sub>after</sub> = 21.17] -- .box-3tL[New Jersey<sub>after</sub> = 21.03] -- .box-3[∆ = -0.14] -- .box-3[Is this a causal effect?] --- # Problems .pull-left[ .box-7Trans[Before vs After] .box-6trans[Only looking at the treatment group] .box-6trans[Impossible to separate changes because of treatment or time] ] -- .pull-right[ .box-7Trans[Treatment vs Control] .box-6trans[Only looking at post-treatment values] .box-6trans[Impossible to separate changes because of treatment or differences in growth/other confounders] ] --- .center2[ ![:scale 110%](https://media.giphy.com/media/3o85xIO33l7RlmLR4I/giphy.gif)] --- # Difference-in-Differences The idea of a **.darkorange[DD]** analysis is to take the **.darkorange[within-unit growth]**... <table style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Pre mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Post mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆ <span class="small">(post − pre)</span> </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Control </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>A</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>B</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>B − A</b> </td> </tr> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Treatment </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>C</b><br><span class="small">(not yet treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>D</b><br><span class="small">(treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>D − C</b> </td> </tr> <tr> <td style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;color: #FFFFFF !important;background-color: #FFFFFF !important;"> ∆<br><span class="smaller">(treatment − control)</span> </td> <td style="text-align:center;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <b>A − C</b> </td> <td style="text-align:center;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <b>B − D</b> </td> <td style="text-align:center;background-color: #DDDDDD !important;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <span class=""><b>(B − A) − (D − C)</b></span> <i>or</i><br><span class=""><b>(B − D) − (A − C)</b></span> </td> </tr> </tbody> </table> .box-5[∆ (post − pre) = <b>within-unit growth</b>] --- # Difference-in-Differences ... and the **.darkorange[across-group growth]**... <table style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Pre mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Post mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> <span class="color-light-5">∆ <span class="smaller">(post − pre)</span></span> </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Control </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>A</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>B</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <b>B − A</b> </td> </tr> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Treatment </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>C</b><br><span class="small">(not yet treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>D</b><br><span class="small">(treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <b>D − C</b> </td> </tr> <tr> <td style="text-align:center;background-color: #DDDDDD !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆<br><span class="smaller">(treatment − control)</span> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>C − A</b> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>D − B</b> </td> <td style="text-align:center;background-color: #DDDDDD !important;color: #FFFFFF !important;background-color: #FFFFFF !important;"> <span class=""><b>(B − A) − (D − C)</b></span> <i>or</i><br><span class=""><b>(B − D) − (A − C)</b></span> </td> </tr> </tbody> </table> .box-5[∆ (treatment − control) = <b>across-group growth</b>] --- # Difference-in-Differences ... and **.darkorange[combine them!]** <table style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Pre mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Post mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆ <span class="smaller">(post − pre)</span> </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Control </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>A</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>B</b><br><span class="small">(never treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>B − A</b> </td> </tr> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Treatment </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>C</b><br><span class="small">(not yet treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>D</b><br><span class="small">(treated) </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>D − C</b> </td> </tr> <tr> <td style="text-align:center;background-color: #DDDDDD !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆<br><span class="smaller">(treatment − control)</span> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>C − A</b> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>D − B</b> </td> <td style="text-align:center;background-color: #DDDDDD !important;background-color: #DDDDDD !important;"> <span class="color-5"><b>(D − C) − (B − A)</b></span> <i>or</i><br><span class="color-5"><b>(D − B) − (C − A)</b></span> </td> </tr> </tbody> </table> .box-5[∆<sub>within units</sub> − ∆<sub>across groups</sub> =<br>Difference-in-differences =<br>causal effect!] --- # Coming back to New Jersey <br> <table style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Pre mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Post mean </th> <th style="text-align:center;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆ <span class="smaller">(post − pre)</span> </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> Pennsylvania </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>23.33</b><br><span class="small">A </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>21.17</b><br><span class="small">B </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>-2.16</b><br><span class="small">B − A </span> </td> </tr> <tr> <td style="text-align:center;background-color: #FFFFFF !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> New Jersey </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>20.44</b><br><span class="small">C </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;"> <b>21.03</b><br><span class="small">D </span> </td> <td style="text-align:center;background-color: #FFFFFF !important;background-color: #DDDDDD !important;"> <b>0.59</b><br><span class="small">D − C </span> </td> </tr> <tr> <td style="text-align:center;background-color: #DDDDDD !important;font-weight: bold;color: #CF4446 !important;background-color: #FFC6C6 !important;"> ∆<br><span class="smaller">(NJ − PA)</span> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>-2.89</b><br><span class="small">C − A </span> </td> <td style="text-align:center;background-color: #DDDDDD !important;"> <b>-0.14</b><br><span class="small">D − B </span> </td> <td style="text-align:center;background-color: #DDDDDD !important;background-color: #DDDDDD !important;"> <b>(0.59) − (−2.16) = <br><span class="color-5">2.76</span></b> </td> </tr> </tbody> </table> --- # How does it look in a plot? <img src="f2023_sta235h_8_DiffInDiff_files/figure-html/dd-graph-1.svg" style="display: block; margin: auto;" /> --- # ... And the real plot! .center[ ![:scale 55%](https://raw.githubusercontent.com/maibennett/sta235/main/exampleSite/content/Classes/Week7/1_DiffInDiff/images/min_wage_plot.png)] --- # Difference-in-Differences in practice - There's no need to manually estimate all group means.. -- .box-3trans[We can use regressions!] -- - If the **.darkorange[two dimensions]** for our DD are *time* and *treatment*: `$$Y_i = \beta_0 + \beta_1Treat_i + \beta_2Post_i + \beta_3Treat_i \times Post_i + \varepsilon_i$$` where `\(Treat = 1\)` for the treatment group, and `\(Post=1\)` for the after period. -- .box-7trans[Can you identify the different coefficients?] --- # Difference-in-Differences in practice - There's no need to manually estimate all group means.. .box-3trans[We can use regressions!] - If the **.darkorange[two dimensions]** for our DD are *time* and *treatment*: `$$Y_i = \beta_0 + \beta_1Treat_i + \beta_2Post_i + \beta_3Treat_i \times Post_i + \varepsilon_i$$` where `\(Treat = 1\)` for the treatment group, and `\(Post=1\)` for the after period. .box-7trans[β<sub>3</sub> is the causal effect!] --- # Let's see it with data ```r minwage <- read.csv("https://raw.githubusercontent.com/maibennett/sta235/main/exampleSite/content/Classes/Week7/1_DiffInDiff/data/minwage.csv") minwage <- minwage %>% mutate(treat = ifelse(location=="PA", 0, 1), # treat group: the treated state post = ifelse(date=="nov1992", 1, 0)) # post: time after treatment was set in place head(minwage) ``` ``` ## chain location wage full part date treat post ## 1 wendys PA 5.00 20 20 feb1992 0 0 ## 2 wendys PA 5.50 6 26 feb1992 0 0 ## 3 burgerking PA 5.00 50 35 feb1992 0 0 ## 4 burgerking PA 5.00 10 17 feb1992 0 0 ## 5 kfc PA 5.25 2 8 feb1992 0 0 ## 6 kfc PA 5.00 2 10 feb1992 0 0 ``` --- # Let's see it with data .small[ ```r summary(lm(full ~ treat*post, data = minwage)) ``` ``` ## ## Call: ## lm(formula = full ~ treat * post, data = minwage) ## ## Residuals: ## Min 1Q Median 3Q Max ## -10.664 -5.971 -2.405 3.653 52.029 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 10.664 1.007 10.589 <2e-16 *** ## treat -2.693 1.117 -2.411 0.0162 * ## post -2.493 1.424 -1.750 0.0805 . ## treat:post 2.927 1.580 1.853 0.0643 . ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 8.243 on 712 degrees of freedom ## Multiple R-squared: 0.008207, Adjusted R-squared: 0.004028 ## F-statistic: 1.964 on 3 and 712 DF, p-value: 0.118 ``` ] - Can you interpret the treatment effect? -- .small[ *"Increasing the minimum wage from $4.25 to $5.05 had an average effect <u>in New Jersey</u> of 2.9 additional jobs per fast food restaurant"*] --- # Important things to note - In Difference-in-Differences, **.darkorange[groups do not need to be balanced]** - If differences are stable over time, they get cancelled out when doing the Diff-in-Diff. -- - Difference-in-Differences provides an estimate for an **.darkorange[average treatment effect for the treated group]** - The estimated effect is not generalizable for the entire sample, *only for the treated group*. --- background-position: 50% 50% class: left, bottom, inverse .big[ Diff-in-Diff Assumptions ] --- # Assumptions .box-3LA[Parallel Trends] -- .box-3tL[In the absence of the intervention, treatment and control group would have changed in the same way] --- # If parallel trends assumption hold... <img src="f2023_sta235h_8_DiffInDiff_files/figure-html/plot-trends-1-1.svg" style="display: block; margin: auto;" /> --- # If parallel trends assumption doesn't hold... <img src="f2023_sta235h_8_DiffInDiff_files/figure-html/plot-trends-2-1.svg" style="display: block; margin: auto;" /> --- #... the DD estimate will be biased <img src="f2023_sta235h_8_DiffInDiff_files/figure-html/plot-trends-3-1.svg" style="display: block; margin: auto;" /> --- # Robustness Check .box-3LA[Pre-Parallel Trends] -- .box-3tL[Check by pretending the treatment happened earlier; if there's an effect, there's likely an underlying trend] --- # Use the pre-intervention period and conduct a placebo DD <img src="f2023_sta235h_8_DiffInDiff_files/figure-html/plot-pretrends-1-1.svg" style="display: block; margin: auto;" /> --- <br> <br> <br> <br> <br> <br> <br> .box-4Trans[Your turn] --- # Wrapping up .pull-left[ - We introduced a new study design! - If we think the **.darkorange[parallel trend assumption holds]**, we can find an Average Treatment Effect for the treated group (ATT) - Remember that we can't say anything about the treatment effect for the control group! - Next week we will see **.darkorange[more identification strategies.]** ] .pull-right[ .center[ ![](https://media.giphy.com/media/l0HlBenJSVV0Y66aI/giphy.gif) ] ] --- # References - Angrist, J. and S. Pischke. (2015). "Mastering Metrics". *Chapter 2*. - Angrist, J. and S. Pischke. (2015). "Mastering Metrics". *Chapter 5*. - Heiss, A. (2020). "Program Evaluation for Public Policy". *Class 8-9: Diff-in-diff I and II, Course at BYU*.