Predictions & Causality

Making Predictions that Match Causal Questions: When Using Experimental Logic in Observational Data Works, and When Alternatives Are Better

Penn State Population Research Institute Methods Workshop on "Quasi-Experimental Methods in Demography"

May 16, 2023


One of the most popular approaches for approximating causal inference in observational data is the application of experimental logic: trying to recreate, in naturally occurring data, the control inherent in an experimental design. While this approach has many strengths and works well for some research questions, it is suboptimal for others. In particular, many real-world causal effects do not operate the way experimental effects do. With nonlinear effects, the effect of an intervention depends on the level of the independent variable, so differing distributions across groups will cause the same intervention to have different effects. This should not be treated as a problem to be controlled for but should instead be reflected in the estimation of predictions and effects. In addition, most interventions cause other changes to occur, so "holding all else constant" is unrealistic. Generalized marginal effects, which allow multiple independent variables to change simultaneously, better reflect the real effects of some variables.
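
As a rough illustration of the two points above, the following minimal sketch uses a logistic model with made-up coefficients and made-up group data (none of these numbers or names come from the talk). It shows how the same one-unit intervention yields different average effects in groups with different distributions, and how letting a second variable change alongside the first alters the estimated effect.

```python
import math

# Hypothetical coefficients for a logistic model; illustration only.
B0, B_X, B_M = -2.0, 1.0, 0.5


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, m):
    """Predicted probability for one observation with covariates x and m."""
    return logistic(B0 + B_X * x + B_M * m)


def average_effect(group, dx=1.0, dm=0.0):
    """Average change in predicted probability when every member's x
    rises by dx and m rises by dm (dm=0.0 holds m constant)."""
    changes = [predict(x + dx, m + dm) - predict(x, m) for x, m in group]
    return sum(changes) / len(changes)


# Two made-up groups that differ only in their distribution of x.
group_low = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
group_high = [(3.0, 0.0), (3.5, 0.0), (4.0, 0.0)]

# Same intervention (+1 in x), same coefficients, different average effects.
effect_low = average_effect(group_low)
effect_high = average_effect(group_high)

# Generalized effect: the intervention on x also shifts m by one unit.
effect_low_joint = average_effect(group_low, dx=1.0, dm=1.0)

print(f"low-x group, x only:   {effect_low:.3f}")
print(f"high-x group, x only:  {effect_high:.3f}")
print(f"low-x group, x and m:  {effect_low_joint:.3f}")
```

Because the model is nonlinear, the group starting at higher values of x sits on a flatter part of the curve and shows a smaller average effect, even though the coefficient on x is identical for both groups.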