

























 Doubly Robust Missing Data Causal Inference Propensity Score Inverse Probability Weight    

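In research in fields such as sociology and biology, one often has to deal with missing data or causal inference problems. The most common approach is to build a regression model relating the observed covariates to the outcome (the OR model) and a propensity score model (the PS model), and to use them to estimate the target parameter. A doubly robust estimator (DR estimator) remains consistent as long as at least one of the PS model and the OR model is correctly specified, which gives the analyst two chances to obtain a valid estimate.

Throughout, all missingness mechanisms are assumed to be missing at random. Taking estimation of the population mean as the running example, we first discuss how to construct DR estimators in missing data models. Data commonly fall into two types, non-longitudinal and longitudinal. We first discuss how the DR estimator is constructed for non-longitudinal data and then extend it to longitudinal data by recursion.

For causal inference models, we assume that there is no unmeasured confounding: all covariates related to treatment assignment and to the potential outcomes are observed. We define the causal effect as the difference between the potential outcomes under different treatments and then model the assignment mechanism quantitatively. This lets us build an assignment mechanism model and a counterfactual outcome model, analogous to the PS model and OR model of the missing data setting, and from them construct an analogous DR estimator.

In addition, while constructing DR estimators we find that the choice of nuisance parameter estimators can affect the resulting estimate. Choosing suitable nuisance parameter estimators can effectively reduce the variance and bias of the DR estimator and improve its efficiency.

Keywords: double robustness; missing data; causal inference; propensity score; inverse probability weighting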
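As an illustration of the double robustness property described above, here is a minimal Python sketch for the non-longitudinal case: estimating a population mean when the outcome is missing at random. It is not taken from the source. The augmented inverse probability weighted form of the estimator, the logistic PS model, the linear OR model, the simulated data, and all function names (`expit`, `fit_logistic`, `fit_linear`, `dr_mean`) are assumptions chosen for illustration.

```python
import math
import random

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, r, use_x=True, n_iter=15):
    """Newton-Raphson fit of the PS model P(R=1|x) = expit(a + b*x).
    With use_x=False the slope b is fixed at 0 (a deliberately wrong model)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri in zip(x, r):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g0 += ri - p
            g1 += (ri - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        if use_x:
            det = h00 * h11 - h01 * h01
            a += (h11 * g0 - h01 * g1) / det
            b += (h00 * g1 - h01 * g0) / det
        else:
            a += g0 / h00
    return a, b

def fit_linear(x, y, r, use_x=True):
    """Least-squares fit of the OR model E[Y|x] = c + d*x on complete cases.
    With use_x=False the slope d is fixed at 0 (a deliberately wrong model)."""
    xs = [xi for xi, ri in zip(x, r) if ri == 1]
    ys = [yi for yi, ri in zip(y, r) if ri == 1]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    if not use_x:
        return ybar, 0.0
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - xbar) ** 2 for xi in xs)
    d = sxy / sxx
    return ybar - d * xbar, d

def dr_mean(x, y, r, ps_uses_x=True, or_uses_x=True):
    """AIPW estimate of E[Y]: mean of R*Y/pi - (R - pi)/pi * m(x)."""
    a, b = fit_logistic(x, r, ps_uses_x)
    c, d = fit_linear(x, y, r, or_uses_x)
    total = 0.0
    for xi, yi, ri in zip(x, y, r):
        pi = expit(a + b * xi)
        m = c + d * xi
        ipw = yi / pi if ri == 1 else 0.0  # missing outcomes are never read
        total += ipw - (ri - pi) / pi * m
    return total / len(x)

# Simulated data (assumed for illustration): true E[Y] = 1, missingness depends on x only.
random.seed(0)
n = 50000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
r = [1 if random.random() < expit(0.5 + xi) else 0 for xi in x]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) if ri == 1 else None
     for xi, ri in zip(x, r)]

est_both = dr_mean(x, y, r)                       # both models correct
est_ps_wrong = dr_mean(x, y, r, ps_uses_x=False)  # PS model misspecified
est_or_wrong = dr_mean(x, y, r, or_uses_x=False)  # OR model misspecified
observed = [yi for yi in y if yi is not None]
naive = sum(observed) / len(observed)             # complete-case mean
print(est_both, est_ps_wrong, est_or_wrong, naive)
```

In this simulation the true mean is 1 by construction. One would expect the three DR estimates to stay close to it, while the complete-case mean drifts upward because units with larger x are both more likely to be observed and have larger outcomes.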






