IU Indianapolis ScholarWorks :: Browsing by Author "Terza, Joseph V."

Now showing 1 - 10 of 16

Causal analysis using two-part models : a general framework for specification, estimation and inference
(2018-06-22) Hao, Zhuang; Terza, Joseph V.; Devaraj, Srikant; Liu, Ziyue; Mak, Henry; Ottoni-Wilhelm, Mark
The two-part model (2PM) is the most widely applied modeling and estimation framework in empirical health economics. By design, the two-part model allows the process governing observation at zero to systematically differ from that which determines non-zero observations. The former is commonly referred to as the extensive margin (EM) and the latter is called the intensive margin (IM). The analytic focus of my dissertation is on the development of a general framework for specifying, estimating and drawing inference regarding causally interpretable (CI) effect parameters in the 2PM context. Our proposed fully parametric 2PM (FP2PM) framework comprises very flexible versions of the EM and IM for both continuous and count-valued outcome models and encompasses all implementations of the 2PM found in the literature. Because our modeling approach is potential outcomes (PO) based, it provides a context for clear definition of targeted counterfactual CI parameters of interest. This PO basis also provides a context for identifying the conditions under which such parameters can be consistently estimated using the observable data (via the appropriately specified data generating process). These conditions also ensure that the estimation results are CI. There is substantial literature on statistical testing for model selection in the 2PM context, yet there has been virtually no attention paid to testing the “one-part” null hypothesis. Within our general modeling and estimation framework, we devise a relatively simple test of that null for both continuous and count-valued outcomes. We illustrate our proposed model, method and testing protocol in the context of estimating price effects on the demand for alcohol.
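For reference, the extensive and intensive margins described above combine through the textbook two-part decomposition of the conditional mean. This is the standard identity for a nonnegative outcome, not the dissertation's FP2PM specification; the symbols Y (outcome) and X (regressors) are notation assumed here for illustration.

```latex
\mathrm{E}[Y \mid X]
  = \underbrace{\Pr(Y > 0 \mid X)}_{\text{extensive margin}}
    \times
    \underbrace{\mathrm{E}[Y \mid Y > 0,\, X]}_{\text{intensive margin}}
```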
Count-Regression-Based Empirical Causal Analysis from a Potential Outcomes Perspective: Accounting for Boundedness, Discreteness, Dispersion and Unobservable Confounding
(2024-06) Kazeminezhad, Golnoush; Terza, Joseph V.; Harle, Christopher A.; Morrison, Wendy; Russell, Steven
Empirical economic research is primarily driven by the desire to offer scientific evidence that serves to inform the study of cause-and-effect. In this dissertation, I developed new models for count-regression-model-based (CRM-based) causal effect estimation in which the value for the outcome of interest is restricted to the non-negative integers. I implement first-order two-stage residual inclusion (FO-2SRI) methods, in the context of the general potential outcomes framework, that accommodate nonlinearities due to the intrinsic characteristics of count-valued outcomes such as boundedness (outcome nonnegative), discreteness (outcome has countable support) and dispersion (conditional variance and other higher order conditional moments of the outcome not necessarily equal to its conditional mean) of count data, and unobservable confounding. The focus here is on the case in which the causal variable is continuous. The newly proposed causal effect estimators are compared with extant FO-2SRI estimators based on conventional control function methods and the linear instrumental variables (LIV) estimator. A series of simulation studies are performed to investigate the accuracy of the proposed estimators and compare the results with the extant estimators. In the simulation studies, the robustness of the fully nonlinear CRM-based FO-2SRI methods is investigated with attention to an important type of misspecification error. The models are also applied to real-world data from Nigeria to investigate the effect of women's education on their fertility decisions in a developing country. The results of the simulation studies reveal that estimates obtained via the newly proposed estimators are very accurate and widely diverge from the results from the extant control function and LIV methods.
Moreover, one of the new estimators, which allows dispersion flexibility, dominated all other estimators (aside from a few extreme dispersion cases) with regard to avoidance of misspecification bias. Finally, the results showed that same estimator to be quite accurate for a wide range of values of the dispersion parameter (which measures mean/variance divergence). Similar results were obtained via the real data analysis, which indicates that increasing women’s education decreases childbearing.
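To make the residual inclusion idea concrete, here is a minimal sketch of generic 2SRI for a count outcome with a continuous causal variable. It is not the dissertation's FO-2SRI estimators: the simulated data-generating process, the plain Poisson second stage, and the helper names `fit_poisson` and `two_stage_residual_inclusion` are all assumptions made for illustration.

```python
import numpy as np

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Poisson regression with exponential mean, fit by Newton-Raphson.
    Assumes column 0 of X is the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())            # start at the intercept-only fit
    for _ in range(iters):
        mu = np.exp(X @ beta)
        # Newton step: (X' diag(mu) X)^{-1} X'(y - mu)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def two_stage_residual_inclusion(y, x, z):
    """Stage 1: OLS of x on the instrument z; keep the residuals.
    Stage 2: Poisson regression of y on x and the stage-1 residuals."""
    Z = np.column_stack([np.ones_like(z), z])
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ gamma
    X2 = np.column_stack([np.ones_like(x), x, resid])
    return fit_poisson(X2, y)

# Hypothetical simulated data: u is an unobserved confounder that drives
# both the causal variable x and the count outcome y.
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)                    # instrument
u = rng.normal(size=n)                    # unobserved confounder
x = 0.5 + z + u                           # continuous causal variable
y = rng.poisson(np.exp(0.2 + 0.3 * x + 0.5 * u))

naive = fit_poisson(np.column_stack([np.ones(n), x]), y)   # ignores u
tsri = two_stage_residual_inclusion(y, x, z)               # includes residual
print("naive coefficient on x:", naive[1])
print("2SRI coefficient on x: ", tsri[1])
```

In this design the naive Poisson slope converges to about 0.55 rather than the true 0.3, while the residual inclusion fit recovers a value near 0.3. Standard errors from the second stage would still need to account for the estimated first stage, which this sketch does not attempt.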
The effect of sugar-sweetened beverage consumption on childhood obesity - causal evidence
(2016-05-18) Yang, Yan; Terza, Joseph V.; Courtemanche, Charles; Jung, Haeil; Mak, Henry Y.; Wu, Jisong
Communities and states are increasingly targeting the consumption of sugar-sweetened beverages (SSBs), especially soda, in their efforts to curb childhood obesity. However, the empirical evidence on which policy makers base the relevant policies is not causally interpretable. In the present study, we suggest a modeling framework that can be used for causal estimation and inference in the context of childhood obesity. This modeling framework is built upon the two-stage residual inclusion (2SRI) instrumental variables method and has two levels: level one models children’s lifestyle choices and level two models children’s energy balance, which is assumed to be dependent on their lifestyle behaviors. We start with a simplified version of the model that includes only one policy, one lifestyle, one energy balance, and one observable control variable. We then extend this simple version to a general one that accommodates multiple policy and lifestyle variables. The two versions of the model are 1) first estimated via the nonlinear least squares (NLS) method (henceforth NLS-based 2SRI); and 2) then estimated via the maximum likelihood estimation (MLE) method (henceforth MLE-based 2SRI). Using simulated data, we show that 1) our proposed 2SRI method outperforms conventional methods that ignore the inherent nonlinearity [the linear instrumental variables (LIV) method] or the potential endogeneity [the nonlinear regression (NR) method] in obtaining the relevant estimators; and 2) the MLE-based 2SRI provides more efficient estimators (also consistent) compared to the NLS-based one. Real data analysis is conducted to illustrate the implementation of the 2SRI method in practice using both NLS and MLE methods. However, due to data limitations, we are not able to draw any inference regarding the impacts of lifestyle, specifically SSB consumption, on childhood obesity.
We are in the process of getting better data and, after doing so, we will replicate and extend the analyses conducted here. These analyses, we believe, will produce causally interpretable evidence of the effects of SSB consumption and other lifestyle choices on childhood obesity. The empirical analyses presented in this dissertation should, therefore, be viewed as an illustration of our newly proposed framework for causal estimation and inference.
Effect Specification, Identification, Estimation, and Inference in a Fractional Outcome Regression Model with an Endogenous Causal Variable
(2024-08) Cheong, Taul; Terza, Joseph V.; Gupta, Sumedha; Steinberg, Richard; Liu, Ziyue
Empirical economic research is primarily driven by the desire to offer scientific evidence that helps assess policy relevant cause-and-effect. The approach most often applied in pursuit of this objective involves regression modeling and estimation. In this dissertation, we focus on the specification, identification, estimation, and causal inference of a causal effect (CE) in the context of the fractional regression model (FRM) for which the support of the outcome variable of interest is restricted to the unit interval. Empirical applications of such models abound in the health economics, health services research and health policy literatures. Examples from other disciplines include labor economics, development economics, political economics, commerce, and finance. Various full information maximum likelihood and quasi-maximum likelihood regression estimators and nonlinear least squares approaches have been proposed to account for the inherent nonlinearity in the FRM due to the unit interval support restriction (UISR) on the outcome variable. Additional nonlinearity is induced in the FRM when the presumed causal variable is subject to unobservable confounding (UC) [i.e., when the presumed causal variable is endogenous]. In such cases, the additional analytic and implementation effort required to account for both sources of nonlinearity (fractional outcome and UC) while avoiding UC bias (which precludes causal interpretability) can be daunting. We seek to develop and implement regression model specifications that account for the inherent nonlinearity implied by this restriction, as well as the nonlinearity that could be additionally imposed by the endogeneity of the presumed causal variable. We focus on the case where the presumed causal variable is continuous. We develop new models for FRM-based CE estimation that implement two-stage residual inclusion (2SRI) methods, as suggested by Terza et al. (2008).
We assess the accuracy of our proposed new methods and compare them with extant 2SRI approaches using a simulation study. An empirical application demonstrates the working of our proposed method.
Exploring the Importance of Accounting for Nonlinearity in Correlated Count Regression Systems from the Perspective of Causal Estimation and Inference
(2021-07) Zhang, Yilei; Terza, Joseph V.; Vest, Joshua R.; Morrison, Wendy; Gupta, Sumedha
The main motivation for nearly all empirical economic research is to provide scientific evidence that can be used to assess causal relationships of interest. Essential to such assessments is the rigorous specification and accurate estimation of parameters that characterize the causal relationship between a presumed causal variable of interest, whose value is to be set and altered in the context of a relevant counterfactual, and a designated outcome of interest. Relationships of this type are typically characterized by an effect parameter (EP) and estimation of the EP is the objective of the empirical analysis. The present research focuses on cases in which the regression outcome of interest is a vector that has count-valued elements (i.e., the model under consideration comprises a multi-equation system). This research examines the importance of accounting for nonlinearity and cross-equation correlations in correlated count regression systems from the perspective of causal estimation and inference. We evaluate the efficiency and accuracy gains of estimating bivariate count-valued systems-of-equations models by comparing three pairs of models: (1) Zellner’s Seemingly Unrelated Regression (SUR) versus Count-Outcome SUR - Conway Maxwell Poisson (CMP); (2) CMP SUR versus Single-Equation CMP Approach; (3) CMP SUR versus Poisson SUR. We show via simulation studies that it is more efficient to estimate jointly than equation-by-equation, and that it is more efficient to account for nonlinearity. We also apply our model and estimation method to real-world health care utilization data, where the dependent variables are correlated counts: count of physician office-visits, and count of non-physician health professional office-visits. The presumed causal variable is private health insurance status. Our model results in a reduction of at least 30% in standard errors for key policy EPs (e.g., Average Incremental Effect).
Our results are enabled by our development of a Stata program for approximating two-dimensional integrals via Gauss-Legendre Quadrature.
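As a rough illustration of the quadrature technique named above, the following sketch approximates a two-dimensional integral over a rectangle with a tensor-product Gauss-Legendre rule. It is written in Python with NumPy rather than Stata and is not the authors' program; the function name `gauss_legendre_2d` and the test integrands are assumptions chosen for illustration.

```python
import numpy as np

def gauss_legendre_2d(f, a, b, c, d, n=20):
    """Approximate the integral of f(x, y) over [a, b] x [c, d]
    using an n-by-n tensor-product Gauss-Legendre rule.
    f must accept NumPy arrays and evaluate elementwise."""
    nodes, weights = np.polynomial.legendre.leggauss(n)   # rule on [-1, 1]
    # Affine map of nodes and weights from [-1, 1] to each interval.
    x = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    y = 0.5 * (d - c) * nodes + 0.5 * (d + c)
    wx = 0.5 * (b - a) * weights
    wy = 0.5 * (d - c) * weights
    X, Y = np.meshgrid(x, y, indexing="ij")               # X[i, j] = x[i]
    # Double sum: sum_i sum_j wx[i] * f(x[i], y[j]) * wy[j]
    return float(wx @ f(X, Y) @ wy)

# Usage: the integral of x^2 * y over [0, 1] x [0, 2] equals 2/3.
approx = gauss_legendre_2d(lambda x, y: x**2 * y, 0.0, 1.0, 0.0, 2.0)
print(approx)
```

An n-point Gauss-Legendre rule is exact for polynomials up to degree 2n - 1 in each variable, so the polynomial example is reproduced up to rounding error; smooth non-polynomial integrands converge quickly as n grows.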
Health Policy Analysis from a Potential Outcomes Perspective: Smoking During Pregnancy and Birth Weight
(2014-08-25) Terza, Joseph V.
Most empirical research in health economics is conducted with the goal of providing scientific evidence that will serve to inform current and future health policy. The use of parametric nonlinear regression (NR) methods for empirical analysis in health economics abounds. Studies that offer clear policy-relevant interpretations of NR results are, however, rare. We offer a comprehensive policy analytic framework within which the applied researcher can: 1) clearly define the policy-relevant estimation objective; 2) consistently estimate that objective using NR methods designed to account for the possible endogeneity of the policy variable of interest; 3) conduct correct asymptotic inference; and 4) offer policy-relevant interpretations of the empirical results. For binary policies, Rubin (1974, 1977) developed the potential outcomes framework (POF). We propose a generally applicable extension of the POF (EPOF) which covers a broad range of policy analytic contexts. In particular, our EPOF accommodates: a) a non-binary policy variable of interest (Xp); b) policy-relevant counterfactual versions of Xp that are not fixed values; and c) a policy-defining increment to Xp that is not constant. Moreover, our EPOF facilitates the use of extant nonlinear regression (NR) methods that correct for potential bias due to the endogeneity of Xp. As a case in point, we consider the analysis of potential gains in infant birth weight that may result from a prenatal smoking prevention and cessation policy which, if fully effective, would maintain zero levels of smoking for non-smokers (prevention) and convince smokers to quit before becoming pregnant (cessation). In the context of our EPOF, using endogeneity-correcting NR methods, we re-analyze the data examined by Mullahy (1997) and estimate the potential effect of the smoking prevention/cessation policy described above. The EPOF should serve as a useful guide to applied health policy analysts.
Inference Using Sample Means of Parametric Nonlinear Data Transformations
(Wiley, 2016-06) Terza, Joseph V.; Economics, School of Liberal Arts
Quantifying risk of injury from usual alcohol consumption: An instrumental variable analysis
(Wiley, 2021) Ye, Yu; Cherpitel, Cheryl J.; Terza, Joseph V.; Kerr, William C.; Economics, School of Liberal Arts
Background: There have been numerous studies of roadside accidents among emergency room patients showing elevated risk of injury from acute alcohol consumption, i.e. recent drinking prior to the injury event, with large effect size and a dose-response relationship observed. In contrast, studies quantifying the association between injury risk and chronic consumption such as past year average volume show that relative risk estimates are low compared to those from acute consumption. Methods: Using the US National Alcohol Surveys (NAS) combining four waves for years 2000–2015 (N=29,571, 53% overall cooperation rate), risk of any past-year injury was first estimated by past-year volume using logistic regression. An instrumental variable (IV) analysis utilizing the two-stage residual inclusion (2SRI) approach was then conducted to estimate injury risk from volume, further adjusting for unobserved confounders, using state beer and spirits tax rates, zip code-level outlet and bar density, and control state status as instruments. Results: Based on the combined US population surveys and controlling for socio-demographics, odds ratios of injury from average volume of 1, 2 and 5 drinks per day were 1.12 [95% confidence interval: 1.02, 1.24], 1.10 [1.00, 1.22], and 1.04 [0.88, 1.22], respectively, using conventional logistic regression, compared to 1.67 [1.00, 2.78], 2.38 [0.87, 6.54] and 6.98 [0.57, 85.89] using the IV method. The proportion of injury attributed to alcohol also increased in magnitude, from 6.2% [0.3%, 11.9%] using the conventional approach to 17.9% [8.2%, 27.7%] using the IV method. Conclusions: Findings suggest that the association between injury and chronic alcohol consumption may be confounded by unobserved factors, with the risk estimate possibly biased downward.
Regression-Based Causal Analysis from the Potential Outcomes Perspective
(Taylor & Francis, 2020-01) Terza, Joseph V.; Economics, School of Liberal Arts
Most empirical economic research is conducted with the goal of providing scientific evidence that will be informative in assessing causal relationships of interest based on relevant counterfactuals. The implementation of regression methods in this context is ubiquitous. With this as motivation, we detail a comprehensive regression-based potential outcomes framework for causal modeling, estimation and inference. This framework facilitates rigorous specification of the effect parameter of interest and makes clear the sense in which it is causally interpretable, when appropriately defined in a potential outcomes setting. It also serves to crystallize the conditions under which the effect parameter and the underlying regression parameters are identified. The consistent sample analog estimator of the effect parameter is discussed. Juxtaposing this framework with a stylized version of a commonly implemented and routinely applied modeling and estimation protocol reveals how the latter is deficient in recognizing, and fully accounting for, conditions required for identification of the relevant effect parameter and the causal interpretability of estimation results. In the context of an example, we demonstrate the conceptual advantages of this general potential outcomes framework for regression modeling by showing how it resolves fundamental shortcomings in the conventional approach to characterizing and remedying omitted variable bias.
Simpler Standard Errors for Multi-Stage Regression-Based Estimators: Illustrations in Health Economics
(2014-08-25) Terza, Joseph V.
With a view towards lessening the analytic and computational burden faced by researchers in empirical health economics who seek an alternative to bootstrapping for the standard errors of two-stage estimators, we offer heretofore unexploited simplifications of the typical, but somewhat daunting, textbook approach. For the most commonly encountered cases in empirical health economics – two-stage estimators that, in either stage, involve maximum likelihood estimation or the nonlinear least squares method – we show that: 1) the usual textbook formulation of the relevant asymptotic covariance can be substantially reduced in complexity; and 2) nearly all components of our simplified formulation can be retrieved as outputs from packaged regression routines (e.g., in Stata). With the applied researcher in mind, we illustrate these points with two examples in empirical health economics that involve the estimation of causal effects in the presence of endogeneity – a sampling problem that can often be solved via two-stage estimation. As a by-product of this illustrative discussion, we detail four very useful two-stage estimators (and their asymptotic standard errors) that are consistent for the model parameters in such settings, along with their corresponding multi-stage causal effect estimators (and their asymptotic standard errors).
