Economics Department Theses and Dissertations

Permanent URI for this collection

https://hdl.handle.net/1805/670


Avoiding Bad Control in Regression for Partially Qualitative Outcomes, and Correcting for Endogeneity Bias in Two-Part Models: Causal Inference from the Potential Outcomes Perspective
(2021-05) Asfaw, Daniel Abebe; Terza, Joseph; Ottoni-Wilhelm, Mark; Tennekoon, Vidhura; Tan, Fei
The general potential outcomes framework (GPOF) is an essential structure that facilitates clear and coherent specification, identification, and estimation of causal effects. This dissertation utilizes and extends the GPOF to specify, identify, and estimate a causally interpretable (CI) effect parameter (EP) for an outcome of interest that manifests as either a value in a specified subset of the real line or a qualitative event -- a partially qualitative outcome (PQO). The limitations of the conventional GPOF for casting a regression model for a PQO are discussed: in that setting, the GPOF is only capable of delivering an EP that is subject to bias due to bad control. The dissertation proposes an outcome measure that maintains all of the essential features of a PQO but is entirely real-valued and is not subject to the bad control critique: the P-weighted outcome – the outcome weighted by the probability that it manifests as a quantitative (real) value. I detail a regression-based estimation method for such an EP and, using simulated data, demonstrate its implementation and validate its consistency for the targeted EP. The practicality of the proposed approach is demonstrated by estimating the causal effect, on a new measure of birth weight, of a fully effective policy that bans smoking during pregnancy. The dissertation also proposes a Generalized Control Function (GCF) approach for modeling and estimating a CI parameter in the context of a fully parametric two-part model (2PM) for a continuous outcome in which the causal variable of interest is continuous and endogenous. The proposed approach is cast within the GPOF. Given a fully parametric specification for the causal variable and under regular Instrumental Variables (IV) assumptions, the approach is shown to satisfy the conditional independence assumption, which is often difficult to satisfy under alternative approaches. 
A full information maximum likelihood (FIML) estimator is derived for estimating the “deep” parameters of the model. Using simulated data, the Average Incremental Effect (AIE) estimator based on these deep parameter estimates is shown to outperform other conventional estimators. I apply the method to estimate the medical care cost of obesity in youth in the US.
Causal analysis using two-part models: a general framework for specification, estimation and inference
(2018-06-22) Hao, Zhuang; Terza, Joseph V.; Devaraj, Srikant; Liu, Ziyue; Mak, Henry; Ottoni-Wilhelm, Mark
The two-part model (2PM) is the most widely applied modeling and estimation framework in empirical health economics. By design, the two-part model allows the process governing observation at zero to systematically differ from that which determines non-zero observations. The former is commonly referred to as the extensive margin (EM) and the latter is called the intensive margin (IM). The analytic focus of my dissertation is on the development of a general framework for specifying, estimating and drawing inference regarding causally interpretable (CI) effect parameters in the 2PM context. Our proposed fully parametric 2PM (FP2PM) framework comprises very flexible versions of the EM and IM for both continuous and count-valued outcome models and encompasses all implementations of the 2PM found in the literature. Because our modeling approach is potential outcomes (PO) based, it provides a context for clear definition of targeted counterfactual CI parameters of interest. This PO basis also provides a context for identifying the conditions under which such parameters can be consistently estimated using the observable data (via the appropriately specified data generating process). These conditions also ensure that the estimation results are CI. There is substantial literature on statistical testing for model selection in the 2PM context, yet there has been virtually no attention paid to testing the “one-part” null hypothesis. Within our general modeling and estimation framework, we devise a relatively simple test of that null for both continuous and count-valued outcomes. We illustrate our proposed model, method and testing protocol in the context of estimating price effects on the demand for alcohol.
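The split between the extensive and intensive margins described in this abstract can be illustrated with a minimal sketch. It is not the FP2PM framework of the dissertation: it uses no covariates and only shows that the overall mean of a non-negative outcome factors into the share of non-zero observations times the mean among them. The simulated data, the 40% participation rate, and the function name `two_part_mean` are assumptions made for the illustration.

```python
import numpy as np

def two_part_mean(y):
    """Split the sample mean of a non-negative outcome into its two parts."""
    y = np.asarray(y, dtype=float)
    positive = y > 0
    # Extensive margin: share of observations with a non-zero outcome
    p_positive = positive.mean()
    # Intensive margin: mean outcome among the non-zero observations
    mean_if_positive = y[positive].mean() if positive.any() else 0.0
    return p_positive, mean_if_positive, p_positive * mean_if_positive

# Hypothetical data: about 40% of units have a positive, log-normal outcome
rng = np.random.default_rng(0)
participates = rng.random(10_000) < 0.4
amount = np.exp(rng.normal(1.0, 0.5, size=10_000))
y = np.where(participates, amount, 0.0)

p, m, overall = two_part_mean(y)
print(f"P(y > 0) = {p:.3f}, E[y | y > 0] = {m:.3f}, product = {overall:.3f}")
print(f"sample mean of y = {y.mean():.3f}")
```

In a regression setting each part would be modeled conditionally on covariates, which is where the choice of specification for the two margins starts to matter.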
Childhood Bully Victimization and Adverse Life Outcomes
(2023-10) Adhikary, Satabdi; Tennekoon, Vidhura; Royalty, Anne; Morrison, Gwendolyn; Ottoni-Wilhelm, Mark; Xu, Huiping
Bullying is widely prevalent in the US. Although anti-bullying laws have been implemented across the country since 1999, bullying prevalence rates remain high. Research suggests that being a bully, a bully victim, or both makes an individual more likely to experience worse physical, mental, and financial health. This dissertation comprises three essays examining the adverse effects of bully victimization on life outcomes. The first essay uses Panel Study of Income Dynamics (PSID) data to examine how being a victim of bullying affects an individual's sleep hours over the years. Results suggest that being a bully victim during teenage years reduces sleep hours, both contemporaneously and during early adulthood. The second essay uses the National Longitudinal Survey of Youth 1997 (NLSY97) data to examine how repeated bully victimization experiences in childhood and teenage years affect future labor market outcomes. A standard Mincer wage equation is used in a Heckman selection model and an Inverse Probability Weighting (IPW) model to derive the estimates. Results indicate that being repeatedly bullied in teenage years reduces future earnings, mainly through reduced wage rates. The third essay, using NLSY97, looks at the effect of repeated bully victimization on wealth accumulation during early adulthood in a difference-in-differences type framework. Measures of wealth accumulation include net household worth and its components, financial and non-financial assets, and financial debt at 20, 25, 30 and 35 years of age. Results indicate that bully victims accumulate fewer net assets between ages 20 and 35 than their non-victimized counterparts.
Count-Regression-Based Empirical Causal Analysis from a Potential Outcomes Perspective: Accounting for Boundedness, Discreteness, Dispersion and Unobservable Confounding
(2024-06) Kazeminezhad, Golnoush; Terza, Joseph V.; Harle, Christopher A.; Morrison, Wendy; Russell, Steven
Empirical economic research is primarily driven by the desire to offer scientific evidence that serves to inform the study of cause-and-effect. In this dissertation, I develop new models for count-regression-model-based (CRM-based) causal effect estimation in which the value of the outcome of interest is restricted to the non-negative integers. I implement first-order two-stage residual inclusion (FO-2SRI) methods, in the context of the general potential outcomes framework, that accommodate unobservable confounding and the nonlinearities due to the intrinsic characteristics of count-valued outcomes: boundedness (outcome nonnegative), discreteness (outcome has countable support) and dispersion (conditional variance and other higher order conditional moments of the outcome not necessarily equal to its conditional mean). The focus here is on the case in which the causal variable is continuous. The newly proposed causal effect estimators are compared with extant FO-2SRI estimators based on conventional control function methods and with the linear instrumental variables (LIV) estimator. A series of simulation studies is performed to investigate the accuracy of the proposed estimators and compare the results with those of the extant estimators. In the simulation studies, the robustness of the fully nonlinear CRM-based FO-2SRI methods is investigated with attention to an important type of misspecification error. The models are also applied to real-world data from Nigeria to investigate the effect of women’s education on their fertility decisions in a developing country. The results of the simulation studies reveal that estimates obtained via the newly proposed estimators are very accurate and diverge widely from the results of the extant control function and LIV methods. 
Moreover, one of the new estimators, which allows dispersion flexibility, dominated all other estimators (aside from a few extreme dispersion cases) with regard to avoidance of misspecification bias. Finally, the results showed the same estimator to be quite accurate for a wide range of values of the dispersion parameter (which measures mean/variance divergence). Similar results were obtained via the real data analysis, which indicates that increasing women’s education decreases childbearing.
The effect of sugar-sweetened beverage consumption on childhood obesity - causal evidence
(2016-05-18) Yang, Yan; Terza, Joseph V.; Courtemanche, Charles; Jung, Haeil; Mak, Henry Y.; Wu, Jisong
Communities and states are increasingly targeting the consumption of sugar-sweetened beverages (SSBs), especially soda, in their efforts to curb childhood obesity. However, the empirical evidence on which policy makers base the relevant policies is not causally interpretable. In the present study, we suggest a modeling framework that can be used for causal estimation and inference in the context of childhood obesity. This modeling framework is built upon the two-stage residual inclusion (2SRI) instrumental variables method and has two levels – level one models children’s lifestyle choices and level two models children’s energy balance, which is assumed to depend on their lifestyle behaviors. We start with a simplified version of the model that includes only one policy, one lifestyle, one energy balance, and one observable control variable. We then extend this simple version to a general one that accommodates multiple policy and lifestyle variables. The two versions of the model are 1) first estimated via the nonlinear least squares (NLS) method (henceforth NLS-based 2SRI); and 2) then estimated via the maximum likelihood estimation (MLE) method (henceforth MLE-based 2SRI). Using simulated data, we show that 1) our proposed 2SRI method outperforms the conventional methods that ignore either the inherent nonlinearity [the linear instrumental variables (LIV) method] or the potential endogeneity [the nonlinear regression (NR) method]; and 2) the MLE-based 2SRI provides more efficient (and also consistent) estimators than the NLS-based one. Real data analysis is conducted to illustrate the implementation of the 2SRI method in practice using both NLS and MLE methods. However, due to data limitations, we are not able to draw any inference regarding the impacts of lifestyle, specifically SSB consumption, on childhood obesity. 
We are in the process of getting better data and, after doing so, we will replicate and extend the analyses conducted here. These analyses, we believe, will produce causally interpretable evidence of the effects of SSB consumption and other lifestyle choices on childhood obesity. The empirical analyses presented in this dissertation should, therefore, be viewed as an illustration of our newly proposed framework for causal estimation and inference.
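The two-stage residual inclusion idea named in this abstract can be sketched in its simplest linear form. The dissertation's models are nonlinear and estimated by NLS or MLE, so the following is only an illustration of the mechanics: regress the endogenous variable on the instrument, then add the first-stage residual as an extra regressor in the outcome equation. The data generating process, the coefficient values, and the names `ols` and `linear_2sri` are assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def linear_2sri(y, x, z):
    """Linear two-stage residual inclusion with one regressor and one instrument."""
    ones = np.ones(len(y))
    Z = np.column_stack([ones, z])
    # Stage 1: regress the endogenous variable on the instrument, keep residuals
    resid = x - Z @ ols(Z, x)
    # Stage 2: include the first-stage residual as an additional regressor
    X2 = np.column_stack([ones, x, resid])
    return ols(X2, y)  # [intercept, effect of x, coefficient on residual]

# Hypothetical simulated data: u is an unobserved confounder of x and y
rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)                                 # instrument
u = rng.normal(size=n)                                 # unobservable confounder
x = 0.8 * z + u + rng.normal(size=n)                   # endogenous causal variable
y = 1.0 + 2.0 * x + 1.5 * u + rng.normal(size=n)       # true effect of x is 2.0

naive = ols(np.column_stack([np.ones(n), x]), y)
tsri = linear_2sri(y, x, z)
print(f"naive OLS slope: {naive[1]:.3f}")
print(f"2SRI slope:      {tsri[1]:.3f}")
```

In this linear case the 2SRI slope coincides with the usual two-stage least squares estimate; the residual inclusion form is what carries over to nonlinear outcome models.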
Effect Specification, Identification, Estimation, and Inference in a Fractional Outcome Regression Model with an Endogenous Causal Variable
(2024-08) Cheong, Taul; Terza, Joseph V.; Gupta, Sumedha; Steinberg, Richard; Liu, Ziyue
Empirical economic research is primarily driven by the desire to offer scientific evidence that helps assess policy-relevant cause-and-effect. The approach most often applied in pursuit of this objective involves regression modeling and estimation. In this dissertation, we focus on the specification, identification, and estimation of a causal effect (CE), and on the associated causal inference, in the context of the fractional regression model (FRM), for which the support of the outcome variable of interest is restricted to the unit interval. Empirical applications of such models abound in the health economics, health services research and health policy literatures. Examples from other disciplines include labor economics, development economics, political economics, commerce and finance. Various full information maximum likelihood and quasi-maximum likelihood regression estimators and nonlinear least squares approaches have been proposed to account for the inherent nonlinearity in the FRM due to the unit interval support restriction (UISR) on the outcome variable. Additional nonlinearity is induced in the FRM when the presumed causal variable is subject to unobservable confounding (UC) [i.e., when the presumed causal variable is endogenous]. In such cases, the additional analytic and implementation effort required to account for both sources of nonlinearity (fractional outcome and UC) while avoiding UC bias (which precludes causal interpretability) can be daunting. We seek to develop and implement regression model specifications that account for the inherent nonlinearity implied by this restriction, as well as the nonlinearity that could be additionally imposed by the endogeneity of the presumed causal variable. We focus on the case where the presumed causal variable is continuous. We develop new models for FRM-based CE estimation that implement two-stage residual inclusion (2SRI) methods, as suggested by Terza et al. (2008). 
We assess the accuracy of our proposed new methods and compare them with extant 2SRI approaches using a simulation study. An empirical application demonstrates how our proposed method works in practice.
Essays in health economics
(2018-06-22) Ghosh, Ausmita; Royalty, Anne Beeson; Simon, Kosali; Freedman, Seth; Morrison, Wendy; Antwi, Yaa Akosa
My dissertation is a collection of three essays on the design of public health insurance in the United States. Each essay examines the responsiveness of health behavior and healthcare utilization to insurance-related incentives and draws implications for health policy in addressing the needs of disadvantaged populations. The first two essays evaluate the impact of Medicaid expansions under the Affordable Care Act (ACA) on health and healthcare utilization. The Medicaid expansions that included full coverage of preconception care led to a decline in childbirths, particularly those that are unintended. In addition, these fertility reductions are attributable to higher utilization of Medicaid-financed prescription contraceptives. The second essay documents patterns of aggregate prescription drug utilization in response to the Medicaid expansions. Within the first 15 months following the policy change, Medicaid prescriptions increased, with relatively larger increases for drugs that treat chronic conditions, such as diabetes and cardiovascular medications, suggesting improvements in access to medical care. There is no evidence of reductions in uninsured or privately-insured prescriptions, suggesting that Medicaid did not simply substitute for other forms of payment, and that net utilization increased. The effects on utilization are relatively higher in areas with larger minority and disadvantaged populations, suggesting a reduction in disparities in access to care. Finally, the third essay considers the effect of Medicaid coverage loss on hospitalizations and uncompensated care use among non-elderly adults. The results show that coverage loss led to higher uninsured hospitalizations, suggesting higher uncompensated care use. Most of the increase in uninsured hospitalizations is driven by visits originating in the ED, a pattern consistent with losing access to a regular place of care. 
These results indicate that policies that reduce Medicaid funding could be particularly harmful for patients with chronic conditions.
Specification and estimation of the price responsiveness of alcohol demand: a policy analytic perspective
(2016-01-13) Devaraj, Srikant; Terza, Joseph V.; Antwi, Yaa Akosa; Jones, Josette; Wu, Jisong
Accurate estimation of alcohol price elasticity is important for policy analysis – e.g., determining optimal taxes and projecting revenues generated from proposed tax changes. Several approaches to specifying and estimating the price elasticity of demand for alcohol can be found in the literature. There are two keys to policy-relevant specification and estimation of alcohol price elasticity. First, the underlying demand model should take account of alcohol consumption decisions at the extensive margin – i.e., individuals' decisions to drink or not – because the price of alcohol may impact the drinking initiation decision, and the decision to drink is likely to be structurally different from the decision about how much to drink (the intensive margin). Second, the modeling of alcohol demand elasticity should yield both theoretical and empirical results that are causally interpretable. The elasticity estimates obtained from the existing two-part model take into account the extensive margin, but are not causally interpretable. The elasticity estimates obtained using aggregate-level models, however, are causally interpretable, but do not explicitly take into account the extensive margin. There currently exists no specification and estimation method for alcohol price elasticity that both accommodates the extensive margin and is causally interpretable. I explore additional sources of bias in the extant approaches to elasticity specification and estimation: 1) the use of logged (vs. nominal) alcohol prices; and 2) implementation of unnecessarily restrictive assumptions underlying the conventional two-part model. I propose a new approach to elasticity specification and estimation that covers the two key requirements for policy relevance and remedies all such biases. I find evidence of substantial divergence between the new and extant methods using both simulated and real data. 
Such differences are profound when placed in the context of alcohol tax revenue generation.
Specification, estimation and testing of treatment effects in multinomial outcome models: accommodating endogeneity and inter-category covariance
(2018-06-18) Tang, Shichao; Terza, Joseph V.; Carlin, Paul; Lin, Hsien-Chang; Morrison, Gwendolyn; Seo, Boyoung
In this dissertation, a potential outcomes (PO) based framework is developed for causally interpretable treatment effect parameters in the multinomial dependent variable regression framework. The specification of the relevant data generating process (DGP) is also derived. This new framework simultaneously accounts for the potential endogeneity of the treatment and loosens inter-category covariance restrictions on the multinomial outcome model (e.g., the independence from irrelevant alternatives restriction). Corresponding consistent estimators for the “deep parameters” of the DGP and the treatment effect parameters are developed and implemented (in Stata). A novel approach is proposed for assessing the inter-category covariance flexibility afforded by a particular multinomial modeling specification [e.g. multinomial logit (MNL), multinomial probit (MNP), and nested multinomial logit (NMNL)] in the context of our general framework. This assessment technique can serve as a useful tool for model selection. The new modeling/estimation approach developed in this dissertation is quite general. I focus here, however, on the NMNL model because, among the three modeling specifications under consideration (MNL, MNP and NMNL), it is the only one that is both computationally feasible and is relatively unrestrictive with regard to inter-category covariance. Moreover, as a logical starting point, I restrict my analyses to the simplest version of the model – the trinomial (three-category) NMNL with an endogenous treatment (ET) variable conditioned on individual-specific covariates only. To identify potential computational issues and to assess the statistical accuracy of my proposed NMNL-ET estimator and its implementation (in Stata), I conducted a thorough simulation analysis. I found that conventional optimization techniques are, in this context, generally fraught with convergence problems. 
To overcome these problems, I implement a systematic line search algorithm. The simulation results suggest that it is important to accommodate both endogeneity and inter-category covariance simultaneously in model design and estimation. As an illustration and as a basis for comparing alternative parametric specifications with respect to ease of implementation, computational efficiency and statistical performance, the proposed model and estimation method are used to analyze the impact of substance abuse/dependence on employment status using the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) data.
A Switching Regressions Framework for Models with Count-Valued Omni-Dispersed Outcomes: Specification, Estimation and Causal Inference
(2020-02) Manalew, Wondimu Samuel; Terza, Joseph V.; Boukai, Ben; Osili, Una; Tennekoon, Vidhura; Trombley, Matt
In this dissertation, I develop a regression-based approach to the specification and estimation of the effect of a presumed causal variable on a count-valued outcome of interest. Statistics for relevant causal inference are also derived. As an illustration and as a basis for comparing alternative parametric specifications with respect to ease of implementation, computational efficiency and statistical performance, the proposed models and estimation methods are used to analyze household fertility decisions. I estimate the effect of a counterfactually imposed additional year of wife’s education on actual family size (AFS) and desired family size (DFS) [count-valued variables]. In order to ensure the causal interpretability of the effect parameter as I define it, the underlying regression model is cast in a potential outcomes (PO) framework. The specification of the relevant data generating process (DGP) is also derived. The regression-based approach developed in the dissertation, in addition to taking explicit account of the fact that the outcome of interest is count-valued, is designed to account for potential sample selection bias due to a particular data deficiency in the count data context and to accommodate the possibility that some structural aspects of the model may vary with the value of a binary switching variable. Moreover, my approach loosens the equi-dispersion constraint [conditional mean (CM) equals conditional variance (CV)] that plagues conventional (Poisson) count-outcome regression models. This is a particularly important feature of my model and method because in most contexts in empirical economics the data are either over-dispersed (CM < CV) or under-dispersed (CM > CV) – fertility models are usually characterized by the latter. Alternative count data models are discussed and compared using simulated and real data. 
The simulation results and the estimation results using real data suggest that the estimated effects from my proposed models (models that loosen the equi-dispersion constraint, account for sample selection, and accommodate variability in structural aspects of the models due to a switching variable) differ substantively from estimates from conventional linear and count regression specifications.
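The equi-dispersion constraint mentioned in this abstract can be made concrete with a small numerical sketch. It ignores covariates, so it compares unconditional rather than conditional moments, and it is not one of the dissertation's models. The three simulated distributions, their parameter values, and the function name `dispersion_ratio` are assumptions chosen for illustration.

```python
import numpy as np

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; 1 corresponds to equi-dispersion."""
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

rng = np.random.default_rng(2)
n = 100_000

# Poisson: mean equals variance (equi-dispersed)
equi = rng.poisson(3.0, size=n)
# Negative binomial: variance exceeds the mean (over-dispersed)
over = rng.negative_binomial(2, 0.4, size=n)
# Binomial: variance is below the mean (under-dispersed)
under = rng.binomial(10, 0.5, size=n)

for name, sample in [("Poisson", equi), ("negative binomial", over), ("binomial", under)]:
    print(f"{name:18s} variance/mean = {dispersion_ratio(sample):.2f}")
```

A ratio well away from 1 in either direction is the kind of pattern a Poisson specification cannot reproduce, which is the motivation the abstract gives for loosening the constraint.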
