Robust Methods for Estimating the Mean with Missing Data.
[Abstract] Missing data are common in many empirical studies. In this dissertation, we explore robust methods for estimating the mean of an outcome variable subject to nonresponse in the presence of fully observed covariates.
In Chapter II, we consider data on a continuous outcome that are missing at random and a fully observed set of covariates. Doubly robust (DR) estimators are consistent when either the regression model for the mean function or the model for the propensity to respond (the "propensity model") is correctly specified. We compare by simulation a variety of DR estimators of the mean of the outcome. Penalized spline of propensity prediction (Zhang and Little 2009) and the augmented estimating equation method proposed in Cao et al. (2009) tended to outperform the other DR methods.
In Chapter III, we consider estimating the mean of a continuous outcome that may be missing not at random (MNAR). Bivariate normal pattern-mixture models (BNPM; Little 1994) and proxy pattern-mixture models (PPMA; Andridge and Little 2011) have been proposed to estimate the mean of the outcome under varying assumptions about the missing-data mechanism, given one or more fully observed covariates. Both BNPM and PPMA assume normality. We propose a spline proxy pattern-mixture model (S-PPMA), which relaxes the normality assumption by using a penalized spline. Properties of S-PPMA are assessed by simulation. Results show that S-PPMA yields estimates that are more robust than PPMA to deviations from normality, at the cost of some precision when the normality assumption is met.
In Chapter IV, we extend S-PPMA to a binary outcome (binS-PPMA) by assuming an underlying continuous latent variable that generates the binary outcome. We apply binS-PPMA to the latent variable to impute the binary outcome given information from observed covariates and assumptions about the nonresponse mechanism. As with continuous outcomes, binS-PPMA is more robust to deviations from normality than bin-PPMA, with no important differences between the methods when all variables are normally distributed.
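To make the double-robustness property described for Chapter II concrete, the following is a minimal sketch of the standard augmented inverse-probability-weighted (AIPW) estimator of a mean. It is a textbook DR estimator chosen for illustration only; it is not the penalized spline of propensity prediction method or the Cao et al. (2009) estimator, and the abstract does not say which estimators were compared. The function names (`fit_logistic`, `aipw_mean`), the linear outcome model, the logistic propensity model, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def aipw_mean(x, y, r):
    """AIPW estimate of E[Y] when y is observed only where r == 1.

    Assumed working models (illustrative only): a linear regression of y on x
    fitted to respondents, and a logistic regression of r on x.
    """
    X = np.column_stack([np.ones_like(x), x])
    obs = r == 1
    # Outcome model: least squares on respondents, predicted for everyone.
    coef, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    m = X @ coef
    # Propensity model: estimated probability of response for everyone.
    pi = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, r.astype(float))))
    # Residuals contribute only for respondents; missing y is never used.
    resid = np.where(obs, y - m, 0.0)
    return float(np.mean(m + resid / pi))


# Simulated example (hypothetical data): true mean of y is 1.0, and response
# depends on x, so the data are missing at random given x.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
r = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x))))
y = np.where(r == 1, y_full, np.nan)

print("complete-case mean:", np.nanmean(y))
print("AIPW estimate:     ", aipw_mean(x, y, r))
```

In this simulation both working models are correctly specified, so the sketch only shows that the AIPW estimate removes the bias of the complete-case mean; checking double robustness itself would require deliberately misspecifying one of the two models.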
[Publication date] [Issuing institution] University of Michigan
[Validity level] doubly robust [Subject classification]
[Keywords] missing data; doubly robust; Statistics and Numeric Data; Science; Biostatistics [Timeliness]