Robust Inference for Heterogeneous Treatment Effects With Applications to NHANES Data

Date
2024-12
Language
American English
Degree
Ph.D.
Degree Year
2024
Department
Mathematical Sciences
Grantor
Purdue University
Abstract

Estimating the conditional average treatment effect (CATE) using data from the National Health and Nutrition Examination Survey (NHANES) provides valuable insights into the heterogeneous impacts of health interventions across diverse populations, facilitating public health strategies that consider individual differences in health behaviors and conditions. However, estimating CATE with NHANES data faces challenges often encountered in observational studies, such as outliers, heavy-tailed error distributions, skewed data, model misspecification, and the curse of dimensionality. To address these challenges, this dissertation presents three consecutive studies that explore robust methods for estimating heterogeneous treatment effects.

The first study introduces an outlier-resistant estimation method by incorporating M-estimation, replacing the \(L_2\) loss in the traditional inverse propensity weighting (IPW) method with a robust loss function. To assess the robustness of our approach, we investigate its influence function and breakdown point. Additionally, we derive the asymptotic properties of the proposed estimator, enabling valid inference for CATE.
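As a rough illustration of the idea in the first study, the sketch below swaps the squared-error loss for a Huber loss when regressing an IPW-transformed outcome on covariates. It is a minimal sketch under stated assumptions, not the dissertation's estimator: the linear CATE model, the choice of the Huber loss as the robust loss, a known propensity score `e`, and the function names `ipw_pseudo_outcome` and `huber_fit` are all hypothetical choices made here for illustration.

```python
import numpy as np


def ipw_pseudo_outcome(y, t, e):
    """IPW-transformed outcome; its conditional mean given X equals the CATE
    when e is the true propensity score (assumed known in this sketch)."""
    return t * y / e - (1 - t) * y / (1 - e)


def huber_weights(residuals, tau):
    """IRLS weights for the Huber loss: 1 inside [-tau, tau], tau/|r| outside."""
    abs_r = np.abs(residuals)
    return np.where(abs_r <= tau, 1.0, tau / np.maximum(abs_r, 1e-12))


def huber_fit(X, y, tau, n_iter=100, tol=1e-8):
    """Minimize the summed Huber loss of y - X @ beta by iteratively
    reweighted least squares, starting from the ordinary least-squares fit."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        w = huber_weights(y - X @ beta, tau)
        Xw = X * w[:, None]                      # rows of X scaled by weights
        new_beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


if __name__ == "__main__":
    # Simulated data (hypothetical): linear CATE 1 + 2x, heavy-tailed noise.
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.uniform(-1, 1, n)
    e = np.full(n, 0.5)                          # known propensity score
    t = rng.binomial(1, e)
    y = 1 + x + t * (1 + 2 * x) + rng.standard_t(df=2, size=n)

    X = np.column_stack([np.ones(n), x])
    pseudo = ipw_pseudo_outcome(y, t, e)
    print("L2 fit:   ", np.linalg.lstsq(X, pseudo, rcond=None)[0])
    print("Huber fit:", huber_fit(X, pseudo, tau=5.0))
```

The Huber weights cap the pull any single observation can exert on the fit, which is the mechanism behind a bounded influence function; the threshold `tau=5.0` in the demo is arbitrary.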

The method proposed in the first study relies on a symmetry assumption, which is commonly required by standard outlier-resistant methods. To remove this assumption while maintaining unbiasedness, the second study employs the adaptive Huber loss, which adjusts the robustification parameter according to the sample size to achieve an optimal tradeoff between bias and robustness. The robustification parameter is derived explicitly from theoretical results, making time-consuming data-driven selection unnecessary. We also derive concentration and Berry-Esseen inequalities to quantify the convergence rates and the finite-sample performance.
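To make the sample-size-dependent robustification parameter concrete, the sketch below estimates a mean with a Huber loss whose threshold grows with n. The rate tau = c * sigma * sqrt(n / log n) is an assumption borrowed from common practice in the adaptive Huber literature, not the explicit value derived in the dissertation; `adaptive_tau`, `huber_mean`, and the constant `c` are hypothetical names introduced here.

```python
import numpy as np


def adaptive_tau(n, sigma, c=1.0):
    """Assumed rate for the robustification parameter: it grows like
    sqrt(n / log n), so the truncation bias shrinks as n increases."""
    return c * sigma * np.sqrt(n / np.log(n))


def huber_mean(x, tau, n_iter=500, tol=1e-10):
    """Huber M-estimate of location via unit-step gradient updates on the
    average Huber loss, started at the sample median."""
    mu = float(np.median(x))
    for _ in range(n_iter):
        step = float(np.mean(np.clip(x - mu, -tau, tau)))
        mu += step
        if abs(step) < tol:
            break
    return mu


if __name__ == "__main__":
    # Skewed sample (hypothetical): log-normal draws.
    rng = np.random.default_rng(0)
    x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
    tau = adaptive_tau(len(x), sigma=np.std(x))
    print("adaptive tau:", tau)
    print("Huber mean:  ", huber_mean(x, tau))
    print("sample mean: ", np.mean(x))
```

With a fixed threshold, the Huber estimate of a skewed distribution is pulled toward the bulk of the data; letting the threshold grow with n reduces that bias while still truncating extreme values.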

In the first two studies, the propensity scores are estimated parametrically, an approach that is sensitive to model misspecification. The third study extends the robust estimator from the first study by plugging in a kernel-based nonparametric estimate of the propensity score with sufficient dimension reduction (SDR). Specifically, we adopt robust minimum average variance estimation (rMAVE) for the central mean space under the potential outcome framework. Combined with higher-order kernels, the resulting CATE estimator achieves improved efficiency.

In all three studies, we derive theoretical results and use them to construct confidence intervals for inference. The properties of the proposed estimators are verified through extensive simulations. Additionally, applying these methods to NHANES data demonstrates the estimators' ability to handle diverse and contaminated datasets, further supporting their effectiveness in real-world scenarios.

Description
IUI
Type
Thesis
This item is under embargo.