This post gathers resources for learning the semi-parametric statistical theory that underlies statistical methods development for causal inference. I intend to keep updating it. BibTeX entries are provided at the bottom.
Articles
All of Edward Kennedy’s expository writing on the subject is excellent; I recommend starting with the following two articles:
- Kennedy, Edward H. (2023). Semiparametric doubly robust targeted double machine learning: a review. https://arxiv.org/abs/2203.06469
- Fisher, Aaron, and Kennedy, Edward H. (2021). Visually Communicating and Teaching Intuition for Influence Functions. The American Statistician, 75(2), 162–172. https://doi.org/10.1080/00031305.2020.1717620
For a big-picture review of Targeted Learning and Double Machine Learning:
- Díaz, Iván. (2019). Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning. Biostatistics, 21(2), 353–358. https://doi.org/10.1093/biostatistics/kxz042
Online Resources
- Hoffman, Katherine. (2020). An Illustrated Guide to TMLE. https://www.khstats.com/blog/tmle/tutorial
- Benkeser, David and Chambaz, Antoine. (2022). A Ride in Targeted Learning Territory. https://achambaz.github.io/tlride/
- Schuler, Alejandro and van der Laan, Mark. Introduction to Modern Causal Inference. https://alejandroschuler.github.io/mci/introduction-to-modern-causal-inference.html
- Susmann, Herbert. One-step Estimators and Pathwise Derivatives. https://herbsusmann.com/notebooks/one-step-estimators/
Books
There are several books by van der Vaart and colleagues that serve as foundational references for semi-parametric theory. You will often see both of the following books cited in papers.
- van der Vaart, Aad W., & Wellner, Jon A. (2023). Weak Convergence and Empirical Processes: With Applications to Statistics (2nd ed.). Springer Cham. https://link.springer.com/book/10.1007/978-3-031-29040-4
- van der Vaart, Aad W. (1998). Asymptotic Statistics. Cambridge University Press. https://doi.org/10.1017/CBO9780511802256
The two Targeted Learning books by van der Laan and Rose are the canonical texts on the subject:
- van der Laan, Mark, and Rose, Sherri. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data. Springer New York, NY. https://doi.org/10.1007/978-1-4419-9782-1
- van der Laan, Mark, and Rose, Sherri. (2018). Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Cham. https://doi.org/10.1007/978-3-319-65304-4
Other books:
- Bickel, Peter J., Klaassen, Chris A. J., Ritov, Ya’acov, & Wellner, Jon A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Springer New York, NY. https://link.springer.com/book/9780387984735
- Tsiatis, Anastasios A. (2006). Semiparametric Theory and Missing Data. Springer New York, NY. https://doi.org/10.1007/0-387-37345-4
- Kosorok, Michael R. (2008). Introduction to Empirical Processes and Semiparametric Inference. Springer New York, NY. https://doi.org/10.1007/978-0-387-74978-5
BibTeX
Note the correct capitalization: “van der Laan” and “van der Vaart”.
@misc{kennedy2023review,
title={Semiparametric doubly robust targeted double machine learning: a review},
author={Edward H. Kennedy},
year={2023},
eprint={2203.06469},
archivePrefix={arXiv},
primaryClass={stat.ME},
url={https://arxiv.org/abs/2203.06469},
}
@article{fisher2021influence,
author = {Aaron Fisher and Edward H. Kennedy},
title = {Visually Communicating and Teaching Intuition for Influence Functions},
journal = {The American Statistician},
volume = {75},
number = {2},
pages = {162--172},
year = {2021},
publisher = {Taylor \& Francis},
doi = {10.1080/00031305.2020.1717620},
URL = {https://doi.org/10.1080/00031305.2020.1717620},
}
@article{diaz2019machinelearning,
author = {Díaz, Iván},
title = "{Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning}",
journal = {Biostatistics},
volume = {21},
number = {2},
pages = {353--358},
year = {2019},
month = {11},
issn = {1465-4644},
doi = {10.1093/biostatistics/kxz042},
url = {https://doi.org/10.1093/biostatistics/kxz042},
eprint = {https://academic.oup.com/biostatistics/article-pdf/21/2/353/32914770/kxz042.pdf},
}
@book{vdlrose2011targetd,
author={Mark {van der Laan} and Sherri Rose},
title={Targeted Learning},
subtitle={Causal Inference for Observational and Experimental Data},
year={2011},
series={Springer Series in Statistics},
publisher={Springer New York, NY},
doi={10.1007/978-1-4419-9782-1}
}
@book{vdlrose2018targetd,
author={Mark {van der Laan} and Sherri Rose},
title={Targeted Learning in Data Science},
subtitle={Causal Inference for Complex Longitudinal Studies},
year={2018},
series={Springer Series in Statistics},
publisher={Springer Cham},
doi={10.1007/978-3-319-65304-4}
}
@book{vdv1998asymptotics,
address={Cambridge},
series={Cambridge Series in Statistical and Probabilistic Mathematics},
title={Asymptotic Statistics},
publisher={Cambridge University Press},
author={Aad W. {van der Vaart}},
year={1998},
doi={10.1017/CBO9780511802256}
}
@book{vdv2023weakconvergence,
edition={2},
year={2023},
series={Springer Series in Statistics},
publisher={Springer Cham},
author={Aad W. {van der Vaart} and Jon A. Wellner},
title={Weak Convergence and Empirical Processes},
subtitle={With Applications to Statistics},
doi={10.1007/978-3-031-29040-4}
}
@book{bickel1993semiparametric,
author={Peter J. Bickel and Chris A. J. Klaassen and Ya'acov Ritov and Jon A. Wellner},
title={Efficient and Adaptive Estimation for Semiparametric Models},
year={1993},
publisher={Springer New York, NY}
}
@book{tsiatis2006theory,
author={Anastasios A. Tsiatis},
title={Semiparametric Theory and Missing Data},
doi={10.1007/0-387-37345-4},
publisher={Springer New York, NY},
series={Springer Series in Statistics},
year={2006}
}
@book{kosorok2008semiparametric,
title={Introduction to Empirical Processes and Semiparametric Inference},
author={Michael R. Kosorok},
doi={10.1007/978-0-387-74978-5},
series={Springer Series in Statistics},
publisher={Springer New York, NY},
year={2008}
}
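To cite these entries from a LaTeX document, something like the following minimal sketch should work. It assumes the entries above have been saved to a file named references.bib (a hypothetical filename) and uses the natbib package with the plainnat style, which is one common choice and not a requirement. The citation keys are the ones defined in the entries above.

```latex
\documentclass{article}
\usepackage{natbib} % provides \citet and \citep

\begin{document}

% \citet gives a textual citation, \citep a parenthetical one.
Targeted Learning is treated at length by \citet{vdlrose2011targetd},
and a recent review is available \citep{kennedy2023review}.
Foundational asymptotic theory is covered in \citet{vdv1998asymptotics}.

% "references" is the assumed .bib filename, without the extension.
\bibliographystyle{plainnat}
\bibliography{references}

\end{document}
```

Because the surnames are wrapped in braces, as in {van der Laan}, BibTeX treats each one as a single unit instead of splitting it into a lowercase "von" part and a last name. Classic BibTeX styles such as plainnat ignore fields they do not know, so the subtitle fields above cause no errors there but are not printed; a biblatex setup would be needed to display them.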