Causal ML Tutorials
I've been learning and working with causal inference and causal ML methods, and I find that the best way to fully understand them is to explain them to someone else. That's why I thought it would be a good idea to create Jupyter Notebook tutorials, which you can explore here.
This is a work in progress. Currently, there are two notebooks: one on difference-in-differences and one on CausalImpact.
Difference-in-differences: do higher wages lead to job losses?
This notebook tutorial follows the study by Card, D., & Krueger, A. B. (1994), Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania, American Economic Review, 84(4), 772-793.
On April 1, 1992, New Jersey raised its minimum wage from $4.25 to $5.05 per hour. The economists Card and Krueger examined the effect of this increase on employment. At the time, conventional economic theory held that a higher minimum wage would lead employers to cut jobs, resulting in a higher unemployment rate.
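The study surveyed 410 fast-food restaurants in New Jersey and eastern Pennsylvania (where the minimum wage did not change) before and after April '92 (Figure 1). It used the difference-in-differences method to estimate the causal effect of the minimum wage increase on employment: the change in employment in New Jersey (the treated group) is compared with the change over the same period in Pennsylvania (the control group), so that trends common to both states cancel out.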
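The difference-in-differences calculation itself is simple arithmetic on four group means. The following is a minimal sketch, not the notebook's code: the FTE numbers are made up for illustration and are not the Card and Krueger data, and the names `fte` and `did_estimate` are hypothetical.

```python
from statistics import mean

# Hypothetical full-time-equivalent (FTE) employment per restaurant.
# These values are invented for illustration; they are NOT the study's data.
fte = {
    ("NJ", "before"): [20.0, 18.5, 22.0, 21.5],
    ("NJ", "after"): [21.0, 19.0, 22.5, 21.5],
    ("PA", "before"): [23.0, 24.5, 22.0, 23.5],
    ("PA", "after"): [21.5, 22.0, 21.0, 21.5],
}


def did_estimate(groups, treated="NJ", control="PA"):
    """Return (treated after - before) minus (control after - before)."""
    treated_change = mean(groups[(treated, "after")]) - mean(groups[(treated, "before")])
    control_change = mean(groups[(control, "after")]) - mean(groups[(control, "before")])
    return treated_change - control_change


effect = did_estimate(fte)
print(f"Difference-in-differences estimate: {effect:+.2f} FTE per restaurant")
```

With these invented numbers, average employment rises by 0.5 FTE in the treated group and falls by 1.75 FTE in the control group, giving an estimate of +2.25. The estimate is only causal under the parallel-trends assumption, namely that the treated group would have followed the control group's trend without the intervention.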
The study found no statistically significant reduction in employment after New Jersey increased its minimum wage. On the contrary, employment in New Jersey's fast-food sector increased slightly relative to Pennsylvania's, although this difference was not statistically significant. These findings challenged the conventional view and have strongly influenced how economists study the effects of policy on individuals and markets.
David Card was one of three recipients of the 2021 Nobel Prize in Economic Sciences, awarded for his empirical contributions to labor economics.
Figure 1: The left plot shows the relationship between full-time-equivalent employment (FTE) and wages before April '92 for New Jersey (green) and Pennsylvania (blue). Most restaurants pay below $5/hr and have fewer than 40 FTEs. New Jersey's wage distribution is skewed toward lower wages, while Pennsylvania's is more spread out. Both states have similar FTE distributions, centered around 20. The right plot shows the same relationship after April '92: New Jersey's data points cluster just above $5/hr, while Pennsylvania's remain below. New Jersey's wage distribution is now concentrated at about $5/hr, while Pennsylvania's is more uniform. The FTE distributions remain similar, centered around 20. Note: The authors of the paper set the FTE of closed restaurants to 0.
Analyzing Stock Market Interventions: A CausalImpact Tutorial
This notebook is a tutorial on CausalImpact, a causal inference method developed at Google. It estimates the causal effect of an intervention (e.g., a policy change, marketing campaign, or product launch) on a time series. It fits a Bayesian structural time-series (BSTS) model to the pre-intervention data, using control series that were not affected by the intervention as covariates. The fitted model then predicts a counterfactual: what would have happened without the intervention. By comparing the actual post-intervention data to this counterfactual, CausalImpact estimates the impact and provides measures of uncertainty.
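The counterfactual logic can be illustrated without the package. The sketch below is a deliberately simplified stand-in, not CausalImpact's method: it swaps the BSTS model for an ordinary least-squares line fitted on the pre-intervention period, produces no uncertainty estimates, and uses made-up series. The names `fit_line`, `control`, `target`, and `n_pre` are hypothetical.

```python
from itertools import accumulate
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    return slope, y_bar - slope * x_bar


# Made-up series for illustration only (not stock data).
control = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]  # unaffected by the intervention
target = [21.0, 23.0, 25.0, 27.0, 29.0, 28.0, 30.0, 32.0]   # affected from index 5 onward
n_pre = 5  # number of pre-intervention observations

# 1. Learn the target/control relationship on the pre-intervention period only.
slope, intercept = fit_line(control[:n_pre], target[:n_pre])

# 2. Predict the counterfactual for the post-intervention period.
counterfactual = [intercept + slope * c for c in control[n_pre:]]

# 3. Pointwise effect: observed minus counterfactual.
pointwise = [obs - cf for obs, cf in zip(target[n_pre:], counterfactual)]

# 4. Cumulative effect: running sum of the pointwise effects.
cumulative = list(accumulate(pointwise))

print("pointwise effect: ", pointwise)
print("cumulative effect:", cumulative)
```

The three quantities computed here (counterfactual, pointwise effect, cumulative effect) correspond to the three panels that CausalImpact typically plots. The real method replaces the fitted line with a BSTS model and reports credible intervals around each quantity.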
Bayesian structural time-series models, upon which the CausalImpact package is built, are a flexible approach to analyzing time series data. They decompose a series into interpretable components such as trend, seasonality, and the contribution of external regressors. Their probabilistic framework quantifies the uncertainty in predictions, and Bayesian inference updates beliefs about the system as new data arrives. BSTS models are especially useful for forecasting, causal inference, and anomaly detection, since each of these compares observed outcomes to model predictions, and they can handle missing observations. The Bayesian approach also yields credible intervals around the predictions, which supports decision-making in areas like policy evaluation, demand forecasting, and campaign effectiveness analysis.
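As a rough illustration of what "structural" means here, one of the simplest BSTS specifications combines a local level (a slowly drifting baseline) with a regression on control series. This is a minimal form shown for orientation, not necessarily the specification used in the notebook; the symbols are chosen here for illustration.

$$
\begin{aligned}
y_t &= \mu_t + \beta^{\top} x_t + \varepsilon_t, & \varepsilon_t &\sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
\mu_{t+1} &= \mu_t + \eta_t, & \eta_t &\sim \mathcal{N}(0, \sigma_\eta^2)
\end{aligned}
$$

Here $y_t$ is the observed series, $\mu_t$ is the latent level, $x_t$ holds the control series at time $t$, and $\beta$ their regression coefficients. Trend and seasonal components can be added as further state terms in the same way.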
For a more in-depth explanation of BSTS and the CausalImpact package, please refer to Brodersen et al. (2015), Inferring causal impact using Bayesian structural time-series models, Annals of Applied Statistics.
In this tutorial, we look at whether the Facebook-Cambridge Analytica scandal, which became public in March 2018, had an impact on the stock price of Meta (then Facebook) (Figure 2).
Figure 2: This figure shows the results of the causal analysis. The upper subplot compares Meta’s observed (black) and modeled (dashed red) stock values over time, with a divergence after the Cambridge Analytica scandal (vertical dashed line). The middle subplot displays the difference between observed and predicted values (dashed red line). The lower subplot shows the cumulative impact of the intervention, indicating an immediate decrease followed by a return to normal after 3-4 months.