Browsing by Author "Wang, Honglang"
Now showing 1 - 10 of 15
Item Comprehensive quantification of the responses of ecosystem production and respiration to drought time scale, intensity and timing in humid environments: A FLUXNET synthesis (Wiley, 2022-05) Jiao, Wenzhe; Wang, Lixin; Wang, Honglang; Lanning, Matthew; Chang, Qing; Novick, Kimberly A.; Earth Sciences, School of Science
Drought is one of the most important natural hazards impacting ecosystem carbon cycles. However, it is challenging to quantify the impacts of drought on ecosystem carbon balance and several factors hinder our explicit understanding of the complex drought impacts. First, drought impacts can have different time dimensions such as simultaneous, cumulative, and lagged impacts on ecosystem carbon balance. Second, drought is not only a multiscale (e.g., temporal and spatial) but also a multidimensional (e.g., intensity, time-scale, and timing) phenomenon, and ecosystem production and respiration may respond to each drought dimension differently. In this study, we conducted a comprehensive drought impact assessment on ecosystem productivity and respiration in humid regions by including different drought dimensions using global FLUXNET observations. Short-term drought (e.g., 1-month drought) generally did not induce a decrease in plant productivity even under high severity drought. However, ecosystem production and respiration significantly decreased as drought intensity increased for droughts longer than one month in duration. Drought timing was important, and ecosystem productivity was most vulnerable when drought occurred during or shortly after the peak vegetation growth. We found that lagged drought impacts more significantly affected ecosystem carbon uptake than simultaneous drought, and that ecosystem respiration was less sensitive to drought time scale than ecosystem production.
Overall, our results indicated that temporally-standardized meteorological drought indices can be used to reflect plant productivity decline, but drought timing, antecedent, and cumulative drought conditions need to be considered together.

Item Efficient Inference and Dominant-Set Based Clustering for Functional Data (2024-05) Wang, Xiang; Wang, Honglang; Boukai, Benzion; Tan, Fei; Peng, Hanxiang
This dissertation addresses three progressively fundamental problems for functional data analysis: (1) To do efficient inference for the functional mean model accounting for within-subject correlation, we propose the refined and bias-corrected empirical likelihood method. (2) To identify functional subjects potentially from different populations, we propose the dominant-set based unsupervised clustering method using the similarity matrix. (3) To learn the similarity matrix from various similarity metrics for functional data clustering, we propose the modularity guided and dominant-set based semi-supervised clustering method. In the first problem, the empirical likelihood method is utilized to do inference for the mean function of functional data by constructing the refined and bias-corrected estimating equation. The proposed estimating equation not only improves efficiency but also enables practically feasible empirical likelihood inference by properly incorporating within-subject correlation, which has not been achieved by previous studies. In the second problem, the dominant-set based unsupervised clustering method is proposed to maximize the within-cluster similarity and applied to functional data with a flexible choice of similarity measures between curves.
The proposed unsupervised clustering method is a hierarchical bipartition procedure under a penalized optimization framework, with the tuning parameter selected by maximizing the modularity (a clustering criterion) of the two resulting clusters. The procedure is inspired by the concept of a dominant set in graph theory and is solved by replicator dynamics from game theory. This approach is robust not only to imbalanced group sizes but also to outliers, which overcomes a limitation of many existing clustering methods. In the third problem, the metric-based semi-supervised clustering method is proposed, with the similarity metric learned by modularity maximization and followed by the dominant-set based clustering procedure proposed above. Under the semi-supervised setting, where some cluster memberships are known, the goal is to determine the best linear combination of candidate similarity metrics as the final metric to enhance the clustering performance. Besides the global metric-based algorithm, another algorithm is also proposed to learn individual metrics for each cluster, which permits overlapping membership for the clustering. This differs from many existing methods. The method is well suited to functional data with various similarity metrics between functional curves, and it is also robust to imbalanced group sizes, a property intrinsic to the dominant-set based clustering approach. In all three problems, the advantages of the proposed methods are demonstrated through extensive empirical investigations using simulations as well as real data applications.

Item Empirical Likelihood Ratio Tests for Coefficients in High Dimensional Heteroscedastic Linear Models (ICSA, 2018) Wang, Honglang; Zhong, Ping-Shou; Cui, Yuehua; Mathematical Sciences, School of Science
This paper considers hypothesis testing problems for a low-dimensional coefficient vector in a high-dimensional linear model with heteroscedastic variance.
Heteroscedasticity is a commonly observed phenomenon in many applications, including finance and genomic studies. Several statistical inference procedures have been proposed for low-dimensional coefficients in a high-dimensional linear model with homoscedastic variance, which are not applicable for models with heteroscedastic variance. The heteroscedasticity issue has been rarely investigated and studied. We propose a simple inference procedure based on empirical likelihood to overcome the heteroscedasticity issue. The proposed method is able to make valid inference even when the conditional variance of random error is an unknown function of high-dimensional predictors. We apply our inference procedure to three recently proposed estimating equations and establish the asymptotic distributions of the proposed methods. Simulation studies and real data applications are conducted to demonstrate the proposed methods.

Item Examining Ecosystem Drought Responses Using Remote Sensing and Flux Tower Observations (2022-09) Jiao, Wenzhe; Wang, Lixin; Novick, Kimberly A.; Filippelli, Gabriel; Wang, Honglang; Li, Lin
Water is fundamental for plant growth, and vegetation response to water availability influences water, carbon, and energy exchanges between land and atmosphere. Vegetation plays the most active role in the water and carbon cycles of various ecosystems. Therefore, comprehensive evaluation of drought impact on vegetation productivity will play a critical role for better understanding the global water cycle under future climate conditions. In-situ meteorological measurements and the eddy covariance flux tower network, which provide meteorological data and estimates of ecosystem productivity and respiration, are remarkable tools to assess the impacts of drought on ecosystem carbon and water cycles.
In regions with limited in-situ observations, remote sensing can be a very useful tool to monitor ecosystem drought status since it provides continuous observations of relevant variables linked to ecosystem function and the hydrologic cycle. However, a detailed understanding of ecosystem responses to drought is still lacking: it is challenging to quantify the impacts of drought on ecosystem carbon balance, and several factors hinder our explicit understanding of the complex drought impacts. This dissertation addressed drought monitoring, ecosystem drought responses, and trends in vegetation water constraint based on in-situ meteorological observations, flux tower observations, and multi-sensor remote sensing observations. This dissertation first developed a new integrated drought index applicable across diverse climate regions based on in-situ meteorological observations and multi-sensor remote sensing data, and another integrated drought index applicable across diverse climate regions based only on multi-sensor remote sensing data. The dissertation also evaluated the applicability of a new satellite dataset (e.g., solar induced fluorescence, SIF) for responding to meteorological drought. Results show that satellite SIF data could have the potential to reflect meteorological drought, but the application should be limited to dry regions. The work in this dissertation also assessed changes in water constraint on global vegetation productivity, and quantified the effects of different drought dimensions on ecosystem productivity and respiration. Results indicate a significant increase in vegetation water constraint over the last 30 years.
The results highlighted the need for a more explicit consideration of the influence of water constraints on regional and global vegetation under a warming climate.

Item A modified isotope-based method for potential high-frequency evapotranspiration partitioning (Elsevier, 2022-02) Yuan, Yusen; Wang, Lixin; Wang, Honglang; Lin, Wenqing; Jiao, Wenzhe; Du, Taisheng; Earth Sciences, School of Science
To better understand water and energy cycles, numerous efforts to partition evapotranspiration (ET) into evaporation (E) and transpiration (T) have been made over the recent half century. One of the analytical methods is the isotopic approach. The isotopic composition of ET (δET) is a crucial parameter in the traditional isotope-based ET partitioning model, which, however, has considerable uncertainty and high sensitivity. Here we proposed a modified T fraction in total ET (FT) calculation using Keeling plot slope (k), the atmospheric vapor concentration (Cv), and the isotopic composition of atmospheric vapor (δv), to avoid the direct use of δET. Following the traditional method, we used the Craig-Gordon model for the isotopic composition of evaporation (δE) and chamber method for the isotopic composition of transpiration (δT) in our modified method. The modified FT calculation method (FT (m)) can be applied at a 15-min time scale using the average values (FTi (mp)) and at a 1 Hz time scale for high-frequency method (FTi). The modified method was verified by both theoretical derivations and field observations. FTi (mp) was equivalent to those using the traditional isotopic method at a 15-min time scale. However, FTi eliminated the highly sensitive parameter δET, and redistributed the sensitivity of δET into three less sensitive parameters. Additionally, FTi has two main advantages. First, the high-frequency method avoids the extrapolation of the Keeling plot regression line intercept.
Second, the high-frequency method can produce a 95% confidence interval of FT in a measurement cycle (e.g., 15 min). The calculated confidence interval was different from that of traditional uncertainty analysis. The high-frequency method might be useful when investigating evapotranspiration partitioning under short-term extreme weather events and flush agricultural irrigation.

Item Multivariate partial linear varying coefficients model for gene‐environment interactions with multiple longitudinal traits (Wiley, 2022) Wang, Honglang; Zhang, Jingyi; Klump, Kelly L.; Burt, Sybil Alexandra; Cui, Yuehua; Mathematical Sciences, School of Science
Correlated phenotypes often share common genetic determinants. Thus, a multi‐trait analysis can potentially increase association power and help in understanding pleiotropic effects. When multiple traits are jointly measured over time, the correlation information between multivariate longitudinal responses can help to gain power in association analysis, and the longitudinal traits can provide insights on the dynamic gene effect over time. In this work, we propose a multivariate partially linear varying coefficients model to identify genetic variants with their effects potentially modified by environmental factors. We derive a testing framework to jointly test the association of genetic factors and illustrate it with a bivariate phenotypic trait, while taking the time varying genetic effects into account. We extend the quadratic inference functions to deal with the longitudinal correlations and use penalized splines for the approximation of nonparametric coefficient functions. Theoretical results such as consistency and asymptotic normality of the estimates are established. The performance of the testing procedure is evaluated through Monte Carlo simulation studies.
The utility of the method is demonstrated with a real data set from the Twin Study of Hormones and Behavior across the menstrual cycle project, in which single nucleotide polymorphisms associated with emotional eating behavior are identified.

Item Novel Keeling-plot-based methods to estimate the isotopic composition of ambient water vapor (EGU, 2020-09) Yuan, Yusen; Du, Taisheng; Wang, Honglang; Wang, Lixin; Earth Sciences, School of Science
The Keeling plot approach, a general method to identify the isotopic composition of source atmospheric CO2 and water vapor (i.e., evapotranspiration), has been widely used in terrestrial ecosystems. The isotopic composition of ambient water vapor (δa), an important source of atmospheric water vapor, could not, to date, be estimated using the Keeling plot approach. Here we proposed two new methods to estimate δa using the Keeling plots: one using an intersection point method and another relying on the intermediate value theorem. As the actual δa value was difficult to measure directly, we used two indirect approaches to validate our new methods. First, we performed external vapor tracking using the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model to facilitate explaining the variations of δa. The trajectory vapor origin results were consistent with the expectations of the δa values estimated by these two methods. Second, regression analysis was used to evaluate the relationship between δa values estimated from these two independent methods, and they are in strong agreement.
This study provides an analytical framework to estimate δa using existing facilities and provides important insights into the traditional Keeling plot approach by showing (a) a possibility to calculate the proportion of evapotranspiration fluxes to total atmospheric vapor using the same instrumental setup for the traditional Keeling plot investigations and (b) perspectives on the estimation of isotope composition of ambient CO2 (δa13C).

Item Observed increasing water constraint on vegetation growth over the last three decades (Springer Nature, 2021-06-18) Jiao, Wenzhe; Wang, Lixin; Smith, William K.; Chang, Qing; Wang, Honglang; D’Odorico, Paolo; Earth and Environmental Sciences, School of Science
Despite the growing interest in predicting global and regional trends in vegetation productivity in response to a changing climate, changes in water constraint on vegetation productivity (i.e., water limitations on vegetation growth) remain poorly understood. Here we conduct a comprehensive evaluation of changes in water constraint on vegetation growth in the extratropical Northern Hemisphere between 1982 and 2015. We document a significant increase in vegetation water constraint over this period. Remarkably divergent trends were found with vegetation water deficit areas significantly expanding, and water surplus areas significantly shrinking. The increase in water constraints associated with water deficit was also consistent with a decreasing response time to water scarcity, suggesting a stronger susceptibility of vegetation to drought. We also observed a shortened water surplus period for water surplus areas, suggesting a shortened exposure to water surplus associated with humid conditions. These observed changes were found to be attributable to trends in temperature, solar radiation, precipitation, and atmospheric CO2.
Our findings highlight the need for a more explicit consideration of the influence of water constraints on regional and global vegetation under a warming climate.

Item Optimal Policies in Reliability Modelling of Systems Subject to Sporadic Shocks and Continuous Healing (2022-12) Chatterjee, Debolina; Sarkar, Jyotirmoy; Boukai, Benzion; Li, Fang; Wang, Honglang
Recent years have seen a growth in research on system reliability and maintenance. Various studies in the scientific fields of reliability engineering, quality and productivity analyses, risk assessment, software reliability, and probabilistic machine learning are being undertaken in the present era. The dependency of human life on technology has made it more important to maintain such systems and maximize their potential. In this dissertation, some methodologies are presented that maximize certain measures of system reliability, explain the underlying stochastic behavior of certain systems, and prevent the risk of system failure. An overview of the dissertation is provided in Chapter 1, where we briefly discuss some useful definitions and concepts in probability theory and stochastic processes and present some mathematical results required in later chapters. Thereafter, we present the motivation and outline of each subsequent chapter. In Chapter 2, we compute the limiting average availability of a one-unit repairable system subject to repair facilities and spare units. Formulas for finding the limiting average availability of a repairable system exist only for some special cases: (1) either the lifetime or the repair-time is exponential; or (2) there is one spare unit and one repair facility. In contrast, we consider a more general setting involving several spare units and several repair facilities; and we allow arbitrary life- and repair-time distributions. Under periodic monitoring, which essentially discretizes the time variable, we compute the limiting average availability.
The discretization approach closely approximates the existing results in the special cases and demonstrates, as anticipated, that the limiting average availability increases with each additional spare unit and/or repair facility. In Chapter 3, the system experiences two types of sporadic impact: valid shocks that cause damage instantaneously and positive interventions that induce partial healing. Whereas each shock inflicts a fixed magnitude of damage, the accumulated effect of k positive interventions nullifies the damaging effect of one shock. The system is said to be in Stage 1, when it can possibly heal, until the net count of impacts (valid shocks registered minus valid shocks nullified) reaches a threshold $m_1$. The system then enters Stage 2, where no further healing is possible. The system fails when the net count of valid shocks reaches another threshold $m_2 (> m_1)$. The inter-arrival times between successive valid shocks and those between successive positive interventions are independent and follow arbitrary distributions. Thus, we remove the restrictive assumption of an exponential distribution, often found in the literature. We find the distributions of the sojourn time in Stage 1 and the failure time of the system. Finally, we find the optimal values of the choice variables that minimize the expected maintenance cost per unit time for three different maintenance policies. In Chapter 4, the above defined Stage 1 is further subdivided into two parts: In the early part, called Stage 1A, healing happens faster than in the later stage, called Stage 1B. The system stays in Stage 1A until the net count of impacts reaches a predetermined threshold $m_A$; then the system enters Stage 1B and stays there until the net count reaches another predetermined threshold $m_1 (>m_A)$. Subsequently, the system enters Stage 2 where it can no longer heal. The system fails when the net count of valid shocks reaches another predetermined higher threshold $m_2 (> m_1)$.
All other assumptions are the same as those in Chapter 3. We calculate the percentage improvement in the lifetime of the system due to the subdivision of Stage 1. Finally, we make optimal choices to minimize the expected maintenance cost per unit time for two maintenance policies. Next, we eliminate the restrictive assumption that all valid shocks and all positive interventions have equal magnitude, and the boundary threshold is a preset constant value. In Chapter 5, we study a system that experiences damaging external shocks of random magnitude at stochastic intervals, continuous degradation, and self-healing. The system fails if cumulative damage exceeds a time-dependent threshold. We develop a preventive maintenance policy to replace the system such that its lifetime is utilized prudently. Further, we consider three variations on the healing pattern: (1) shocks heal for a fixed finite duration $\tau$; (2) a fixed proportion of shocks are non-healable (that is, $\tau=0$); (3) there are two types of shocks: self-healable shocks, which heal for a finite duration, and non-healable shocks. We implement a proposed preventive maintenance policy and compare the optimal replacement times in these new cases with those in the original case, where all shocks heal indefinitely. Finally, in Chapter 6, we present a summary of the dissertation with conclusions and future research potential.

Item A provable smoothing approach for high dimensional generalized regression with applications in genomics (Institute of Mathematical Statistics, 2017) Han, Fang; Ji, Hongkai; Ji, Zhicheng; Wang, Honglang; Mathematical Sciences, School of Science
In many applications, linear models fit the data poorly. This article studies an appealing alternative, the generalized regression model.
This model only assumes that there exists an unknown monotonically increasing link function connecting the response $Y$ to a single index $\boldsymbol{X}^{\mathsf{T}}\boldsymbol{\beta}^{*}$ of explanatory variables $\boldsymbol{X} \in \mathbb{R}^{d}$. The generalized regression model is flexible and covers many widely used statistical models. It fits the data generating mechanisms well in many real problems, which makes it useful in a variety of applications where regression models are regularly employed. In low dimensions, rank-based M-estimators are recommended to deal with the generalized regression model, giving root-$n$ consistent estimators of $\boldsymbol{\beta}^{*}$. Applications of these estimators to high dimensional data, however, are questionable. This article studies, both theoretically and practically, a simple yet powerful smoothing approach to handle the high dimensional generalized regression model. Theoretically, a family of smoothing functions is provided, and the amount of smoothing necessary for efficient inference is carefully calculated. Practically, our study is motivated by an important and challenging scientific problem: decoding gene regulation by predicting transcription factors that bind to cis-regulatory elements. Applying our proposed method to this problem shows substantial improvement over the state-of-the-art alternative in real data.