OBJECTIVE Drawing causal estimates from observational data is problematic, because datasets often contain underlying bias (eg, discrimination in treatment assignment). To examine causal effects, it is important to evaluate what-if… Click to show full abstract
OBJECTIVE Drawing causal estimates from observational data is problematic, because datasets often contain underlying bias (eg, discrimination in treatment assignment). To examine causal effects, it is important to evaluate what-if scenarios-the so-called "counterfactuals." We propose a novel deep learning architecture for propensity score matching and counterfactual prediction-the deep propensity network using a sparse autoencoder (DPN-SA)-to tackle the problems of high dimensionality, nonlinear/nonparallel treatment assignment, and residual confounding when estimating treatment effects. MATERIALS AND METHODS We used 2 randomized prospective datasets, a semisynthetic one with nonlinear/nonparallel treatment selection bias and simulated counterfactual outcomes from the Infant Health and Development Program and a real-world dataset from the LaLonde's employment training program. We compared different configurations of the DPN-SA against logistic regression and LASSO as well as deep counterfactual networks with propensity dropout (DCN-PD). Models' performances were assessed in terms of average treatment effects, mean squared error in precision on effect's heterogeneity, and average treatment effect on the treated, over multiple training/test runs. RESULTS The DPN-SA outperformed logistic regression and LASSO by 36%-63%, and DCN-PD by 6%-10% across all datasets. All deep learning architectures yielded average treatment effects close to the true ones with low variance. Results were also robust to noise-injection and addition of correlated variables. Code is publicly available at https://github.com/Shantanu48114860/DPN-SA. DISCUSSION AND CONCLUSION Deep sparse autoencoders are particularly suited for treatment effect estimation studies using electronic health records because they can handle high-dimensional covariate sets, large sample sizes, and complex heterogeneity in treatment assignments.
               
Click one of the above tabs to view related content.