Browsing by Subject "causal inference"
Item Big Data and Causal Inference: What Does a New Analysis of the UK BioBank Data Tell Us? (AHA, 2020)
Tu, Wanzhu; Pratt, J. Howard; Biostatistics, School of Public Health

Item An OLS-Based Method for Causal Inference in Observational Studies (2019-07)
Xu, Yuanfang; Zhang, Ying; Huang, Bin; Tu, Wanzhu; Bakoyannis, Giorgos; Song, Yiqing
Observational data are frequently used for causal inference of treatment effects on prespecified outcomes. Several widely used causal inference methods have adopted the method of inverse propensity score weighting (IPW) to alleviate the influence of confounding. However, the IPW-type methods, including the doubly robust methods, are prone to large variation in the estimation of causal effects due to possible extreme weights. In this research, we developed an ordinary least-squares (OLS)-based causal inference method, which does not involve the inverse weighting of the individual propensity scores. We first considered the scenario of homogeneous treatment effect. We proposed a two-stage estimation procedure, which leads to a model-free estimator of average treatment effect (ATE). At the first stage, two summary scores, the propensity and mean scores, are estimated nonparametrically using regression splines. The targeted ATE is obtained as a plug-in estimator that has a closed form expression. Our simulation studies showed that this model-free estimator of ATE is consistent, asymptotically normal and has superior operational characteristics in comparison to the widely used IPW-type methods. We then extended our method to the scenario of heterogeneous treatment effects, by adding in an additional stage of modeling the covariate-specific treatment effect function nonparametrically while maintaining the model-free feature, and the simplicity of OLS-based estimation. The estimated covariate-specific function serves as an intermediate step in the estimation of ATE and thus can be utilized to study the treatment effect heterogeneity.
We discussed ways of using advanced machine learning techniques in the proposed method to accommodate high dimensional covariates. We applied the proposed method to a case study evaluating the effect of early combination of biologic and non-biologic disease-modifying antirheumatic drugs (DMARDs) compared to a step-up treatment plan in children with new-onset juvenile idiopathic arthritis (JIA). The proposed method gives strong evidence of a significant effect of early combination at the 0.05 level. On average, early aggressive use of biologic DMARDs leads to around 1.2 to 1.7 more reduction in clinical juvenile disease activity score at 6 months than the step-up plan for treating JIA.

Item A Switching Regressions Framework for Models with Count-Valued Omni-Dispersed Outcomes: Specification, Estimation and Causal Inference (2020-02)
Manalew, Wondimu Samuel; Terza, Joseph V.; Boukai, Ben; Osili, Una; Tennekoon, Vidhura; Trombley, Matt
In this dissertation, I develop a regression-based approach to the specification and estimation of the effect of a presumed causal variable on a count-valued outcome of interest. Statistics for relevant causal inference are also derived. As an illustration and as a basis for comparing alternative parametric specifications with respect to ease of implementation, computational efficiency and statistical performance, the proposed models and estimation methods are used to analyze household fertility decisions. I estimate the effect of a counterfactually imposed additional year of wife’s education on actual family size (AFS) and desired family size (DFS), both count-valued variables. In order to ensure the causal interpretability of the effect parameter as I define it, the underlying regression model is cast in a potential outcomes (PO) framework. The specification of the relevant data generating process (DGP) is also derived.
The regression-based approach developed in the dissertation, in addition to taking explicit account of the fact that the outcome of interest is count-valued, is designed to account for potential sample selection bias due to a particular data deficiency in the count data context and to accommodate the possibility that some structural aspects of the model may vary with the value of a binary switching variable. Moreover, my approach loosens the equi-dispersion constraint [conditional mean (CM) equals conditional variance (CV)] that plagues conventional (Poisson) count-outcome regression models. This is a particularly important feature of my model and method because in most contexts in empirical economics the data are either over-dispersed (CM < CV) or under-dispersed (CM > CV); fertility models are usually characterized by the latter. Alternative count data models were discussed and compared using simulated and real data. The simulation results and estimation results using real data suggest that the estimated effects from my proposed models (models that loosen the equi-dispersion constraint, account for the sample selection, and accommodate variability in structural aspects of the models due to a switching variable) substantively differ from estimates from conventional linear and count regression specifications.

Item Using a monotone single‐index model to stabilize the propensity score in missing data problems and causal inference (Wiley, 2019-04)
Qin, Jing; Yu, Tao; Li, Pengfei; Liu, Hao; Chen, Baojiang; Biostatistics, School of Public Health
The augmented inverse weighting method is one of the most popular methods for estimating the mean of the response in causal inference and missing data problems. An important component of this method is the propensity score. Popular parametric models for the propensity score include the logistic, probit, and complementary log‐log models.
A common feature of these models is that the propensity score is a monotonic function of a linear combination of the explanatory variables. To avoid the need to choose a model, we model the propensity score via a semiparametric single‐index model, in which the score is an unknown monotonic nondecreasing function of the given single index. Under this new model, the augmented inverse weighting estimator (AIWE) of the mean of the response is asymptotically linear, semiparametrically efficient, and more robust than existing estimators. Moreover, we have made a surprising observation. The inverse probability weighting and AIWEs based on a correctly specified parametric model may have worse performance than their counterparts based on a nonparametric model. A heuristic explanation of this phenomenon is provided. A real‐data example is used to illustrate the proposed methods.
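
For readers unfamiliar with the estimators named in the abstracts above, the following is a minimal sketch of plain inverse probability weighting (IPW) and augmented inverse weighting for an average treatment effect. It is not the monotone single-index method of Qin et al. or the OLS-based method of Xu et al.: the propensity score here is fitted with an ordinary logistic model, the outcome models are per-arm least-squares fits, and the simulated data, the true effect of 2.0, and all function names are assumptions made only for illustration.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Fit a logistic propensity model by Newton-Raphson (X has an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                              # score of the log-likelihood
        hess = X.T @ (X * (p * (1.0 - p))[:, None])       # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def ipw_ate(y, t, e):
    """Plain IPW estimate of the ATE; sensitive to propensities e near 0 or 1."""
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

def aipw_ate(y, t, e, m1, m0):
    """Augmented IPW: outcome-model predictions plus weighted residual corrections."""
    mu1 = np.mean(m1 + t * (y - m1) / e)
    mu0 = np.mean(m0 + (1 - t) * (y - m0) / (1 - e))
    return mu1 - mu0

def simulate(n=5000, seed=0):
    """Hypothetical data: one confounder x, true treatment effect 2.0 (an assumption)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e_true = 1.0 / (1.0 + np.exp(-0.5 * x))
    t = rng.binomial(1, e_true)
    y = 2.0 * t + x + rng.normal(size=n)
    return x, t, y

x, t, y = simulate()
X = np.column_stack([np.ones_like(x), x])

# Estimated propensity scores from the logistic fit.
e_hat = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))

# Per-arm linear outcome regressions, then predictions for every unit.
b1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
b0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
m1, m0 = X @ b1, X @ b0

ate_ipw = ipw_ate(y, t, e_hat)
ate_aipw = aipw_ate(y, t, e_hat, m1, m0)
print(f"IPW ATE:  {ate_ipw:.3f}")
print(f"AIPW ATE: {ate_aipw:.3f}")
```

In this toy setting both estimates should land near the simulated effect. The division by `e` and `1 - e` is where the extreme-weight instability mentioned in the abstracts enters, which is the motivation for the stabilized or weight-free alternatives those works propose.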