Mathematical Sciences Department Theses and Dissertations

For more information about the Mathematical Sciences graduate programs, visit http://www.science.iupui.edu

Recent Submissions

  • Item
    A Dynamical Approach to the Potts Model on Cayley Tree
    (2024-12) Pannipitiya, Diyath Nelaka; Kitchens, Bruce P.; Roeder, Roland K. W.; Geller, William; Perez, Rodrigo A.
    The Ising model is one of the most important theoretical models in statistical physics; it was originally developed to describe ferromagnetism. A system of magnetic particles, for example, can be modeled as a linear chain in one dimension or a lattice in two dimensions, with one particle at each lattice point. Each particle is then assigned a spin σi ∈ {±1}. The q-state Potts model is a generalization of the Ising model in which each spin σi may take one of q ≥ 3 states {0, ···, q−1}. Both models have temperature T and an externally applied magnetic field h as parameters. Many statistical and physical properties of the q-state Potts model can be derived by studying its partition function, including phase transitions as T and/or h are varied. The celebrated Lee-Yang Theorem characterizes such phase transitions of the 2-state Potts model (the Ising model). This theorem does not hold for q > 2, so phase transitions for the Potts model as h is varied are more complicated and mysterious. We give some results that characterize the phase transitions of the 3-state Potts model as h is varied for constant T on the binary rooted Cayley tree. Similarly to the Ising model, we show that for fixed T > 0 the ferromagnetic 3-state Potts model exhibits a phase transition at one critical value of h or not at all, depending on T. However, an interesting new phenomenon occurs for the 3-state Potts model: the critical value of h can be non-zero for some range of temperatures. The antiferromagnetic 3-state Potts model exhibits a phase transition at up to two critical values of h. The recursive construction of the (n + 1)st level Cayley tree from two copies of the nth level Cayley tree allows one to write a relatively simple rational function relating the Lee-Yang zeros at one level to the next. This allows us to use techniques from dynamical systems.
  • Item
    Robust Inference for Heterogeneous Treatment Effects With Applications to NHANES Data
    (2024-12) Mo, Ran; Wang, Honglang; Li, Fang; Tan, Fei; Peng, Hanxiang
    Estimating the conditional average treatment effect (CATE) using data from the National Health and Nutrition Examination Survey (NHANES) provides valuable insights into the heterogeneous impacts of health interventions across diverse populations, facilitating public health strategies that consider individual differences in health behaviors and conditions. However, estimating CATE with NHANES data faces challenges often encountered in observational studies, such as outliers, heavy-tailed error distributions, skewed data, model misspecification, and the curse of dimensionality. To address these challenges, this dissertation presents three consecutive studies that thoroughly explore robust methods for estimating heterogeneous treatment effects. The first study introduces an outlier-resistant estimation method by incorporating M-estimation, replacing the \(L_2\) loss in the traditional inverse propensity weighting (IPW) method with a robust loss function. To assess the robustness of our approach, we investigate its influence function and breakdown point. Additionally, we derive the asymptotic properties of the proposed estimator, enabling valid inference for the proposed outlier-resistant estimator of CATE. The method proposed in the first study relies on a symmetry assumption, which is commonly required by standard outlier-resistant methods. To remove this assumption while maintaining unbiasedness, the second study employs the adaptive Huber loss, which dynamically adjusts the robustification parameter based on the sample size to achieve an optimal tradeoff between bias and robustness. The robustification parameter is explicitly derived from theoretical results, making it unnecessary to rely on time-consuming data-driven methods for its selection. We also derive concentration and Berry-Esseen inequalities to precisely quantify the convergence rates as well as finite sample performance.
In both previous studies, the propensity scores were estimated parametrically, which is sensitive to model misspecification issues. The third study extends the robust estimator from our first project by plugging in a kernel-based nonparametric estimation of the propensity score with sufficient dimension reduction (SDR). Specifically, we adopt a robust minimum average variance estimation (rMAVE) for the central mean space under the potential outcome framework. Together with higher-order kernels, the resulting CATE estimation gains enhanced efficiency. In all three studies, the theoretical results are derived, and confidence intervals are constructed for inference based on these findings. The properties of the proposed estimators are verified through extensive simulations. Additionally, applying these methods to NHANES data validates the estimators' ability to handle diverse and contaminated datasets, further demonstrating their effectiveness in real-world scenarios.
  • Item
    Modeling and Simulation of Osteocyte-Fluid Interaction in a Lacuno-Canalicular Network in Three Dimensions
    (2024-12) Karimli, Nigar; Barber, Jared; Zhu, Luoding; Arciero, Julia; Na, Sungsoo
    Bone health relies on its cells' ability to sense and respond to mechanical forces, a process primarily managed by osteocytes embedded within the bone matrix. The cells reside in the lacuno-canalicular network (LCN), a complex structure composed of lacunae (small cavities) and canaliculi (microscopic channels), through which they communicate and receive nutrients. The mechanotransduction (MT) process, by which osteocytes convert mechanical signals from mechanical loading into biochemical responses, is essential for bone remodeling but remains poorly understood. Both in-vitro and in-vivo studies present challenges in directly measuring the cellular stresses and strains involved, making computational modeling a valuable tool for studying osteocyte mechanics. In this dissertation, we present a coarse-grained, integrative model designed to simulate stress and strain distributions within an osteocyte and its microenvironment. Our model features the osteocyte membrane represented as a network of viscoelastic springs, with six slender, arm-like osteocytic processes extending from the membrane. The osteocyte is immersed in interstitial fluid and encompassed by the rigid extracellular matrix (ECM). The cytosol and interstitial fluid are both modeled as water-like, viscous incompressible fluids, allowing us to capture the fluid-structure interactions crucial to understanding MT. To simulate these interactions, we employ the Lattice Boltzmann-Immersed Boundary (LB-IB) method. This approach couples the Lattice Boltzmann method, which numerically solves fluid equations, with the immersed boundary method, which handles the interactions between the osteocyte structures and the surrounding fluids. This framework consists of a system of integro-partial differential equations describing both fluid and solid dynamics, enabling a detailed examination of force, strain, and stress distribution within the osteocyte.
Major results include: 1) an increase in incoming flow routes results in increased stress and strain, and 2) regions of higher stress and strain are concentrated near the junctions where the osteocytic processes meet the main body.
  • Item
    Sample Size Determination for Subsampling in the Analysis of Big Data, Multiplicative Models for Confidence Intervals and Free-Knot Changepoint Models
    (2024-05) Zhang, Sheng; Peng, Hanxiang; Tan, Fei; Sarkar, Jyoti; Boukai, Ben
    The dissertation consists of three parts. Motivated by subsampling in the analysis of Big Data and by data-splitting in machine learning, sample size determination for multidimensional parameters is presented in the first part. In the second part, we propose a novel approach to the construction of confidence intervals based on improved concentration inequalities. We provide the missing factor for the tail probability of a random variable which generalizes Talagrand’s (1995) result of the missing factor in Hoeffding’s inequalities. We give the procedure for constructing confidence intervals and illustrate it with simulations. In the third part, we study irregular change-point models using free-knot splines. The consistency and asymptotic normality of the least squares estimators are proved for the irregular models in which the linear spline is not differentiable. Simulations are carried out to explore the numerical properties of the proposed models. The results are used to analyze the US Covid-19 data.
  • Item
    Efficient Inference and Dominant-Set Based Clustering for Functional Data
    (2024-05) Wang, Xiang; Wang, Honglang; Boukai, Benzion; Tan, Fei; Peng, Hanxiang
    This dissertation addresses three progressively fundamental problems for functional data analysis: (1) To do efficient inference for the functional mean model accounting for within-subject correlation, we propose the refined and bias-corrected empirical likelihood method. (2) To identify functional subjects potentially from different populations, we propose the dominant-set based unsupervised clustering method using the similarity matrix. (3) To learn the similarity matrix from various similarity metrics for functional data clustering, we propose the modularity guided and dominant-set based semi-supervised clustering method. In the first problem, the empirical likelihood method is utilized to do inference for the mean function of functional data by constructing the refined and bias-corrected estimating equation. The proposed estimating equation not only improves efficiency but also enables practically feasible empirical likelihood inference by properly incorporating within-subject correlation, which has not been achieved by previous studies. In the second problem, the dominant-set based unsupervised clustering method is proposed to maximize the within-cluster similarity and applied to functional data with a flexible choice of similarity measures between curves. The proposed unsupervised clustering method is a hierarchical bipartition procedure under the penalized optimization framework, with the tuning parameter selected by maximizing a clustering criterion, called modularity, of the resulting two clusters. The procedure is inspired by the concept of a dominant set in graph theory and solved by replicator dynamics from game theory. This approach is robust not only to imbalanced group sizes but also to outliers, which overcomes a limitation of many existing clustering methods.
In the third problem, the metric-based semi-supervised clustering method is proposed, with the similarity metric learned by modularity maximization and followed by the dominant-set based clustering procedure proposed above. Under the semi-supervised setting, where some cluster memberships are known, the goal is to determine the best linear combination of candidate similarity metrics as the final metric to enhance the clustering performance. Besides the global metric-based algorithm, another algorithm is also proposed to learn individual metrics for each cluster, which permits overlapping membership for the clustering. This differs from many existing methods. The method is well suited to functional data with various similarity metrics between functional curves, while also exhibiting robustness to imbalanced group sizes, which is intrinsic to the dominant-set based clustering approach. In all three problems, the advantages of the proposed methods are demonstrated through extensive empirical investigations using simulations as well as real data applications.
  • Item
    Weighted Curvatures in Finsler Geometry
    (2023-08) Zhao, Runzhong; Shen, Zhongmin; Buse, Olguta; Ramras, Daniel; Roeder, Roland
    The curvatures in Finsler geometry can be defined in similar ways as in Riemannian geometry. However, since there are fewer restrictions on the metrics, many geometric quantities arise in Finsler geometry which vanish in the Riemannian case. These quantities are generally known as non-Riemannian quantities and interact with the curvatures in controlling the global geometrical and topological properties of Finsler manifolds. In the present work, we study general weighted Ricci curvatures which combine the Ricci curvature and the S-curvature, and define a weighted flag curvature which combines the flag curvature and the T-curvature. We characterize Randers metrics of almost isotropic weighted Ricci curvatures and show the general weighted Ricci curvatures can be divided into three types. On the other hand, we show that a proper open forward complete Finsler manifold with positive weighted flag curvature is necessarily diffeomorphic to the Euclidean space, generalizing the Gromoll-Meyer theorem in Riemannian geometry.
  • Item
    Values of Ramanujan's Continued Fractions Arising as Periodic Points of Algebraic Functions
    (2023-08) Akkarapakam, Sushmanth Jacob; Morton, Richard Patrick; Klimek, Slawomir D.; Roeder, Roland K. W.; Geller, William A.
    The main focus of this dissertation is to find and explain the periodic points of certain algebraic functions that are related to some modular functions, which themselves can be represented by continued fractions. Some of these continued fractions were first explored by Srinivasa Ramanujan in the early 20th century. Since then, much work has been done on studying the continued fractions, proving relations and identities, and giving different representations for them. The layout of this report is as follows. Chapter 1 has all the basic background knowledge and ingredients about algebraic number theory, class field theory, Ramanujan’s theta functions, etc. In Chapter 2, we look at the Ramanujan-Göllnitz-Gordon continued fraction, which we call v(τ), and evaluate it at certain arguments in the field K = Q(√−d), with −d ≡ 1 (mod 8), in which the ideal (2) = ℘₂℘′₂ is a product of two prime ideals. We prove several identities relating v(τ) to itself and to other modular functions. Some of these are new, while others are known but are given different proofs. These values of v(τ) are shown to generate the inertia field of ℘₂ or ℘′₂ in an extended ring class field over the field K. The conjugates over Q of these same values, together with 0, −1 ± √2, are shown to form the exact set of periodic points of a fixed algebraic function F̂(x), independent of d. These are analogues of similar results for the Rogers-Ramanujan continued fraction. See [1] and [2]. This joint work with my advisor, Dr. Morton, has been submitted for publication to the New York Journal. In Chapters 3 and 4, we take a similar approach in studying two more continued fractions c(τ) and u(τ), the first of which is more commonly known as Ramanujan’s cubic continued fraction. We show what fields a value of this continued fraction generates over Q, and we describe how the periodic points of the described functions arise as values of these continued fractions.
Then in the last chapter, we summarise all these results, give some possible directions for future research, and mention some conjectures.
  • Item
    Certain Aspects of Quantum and Classical Integrable Systems
    (2022-08) Kosmakov, Maksim; Tarasov, Vitaly; Its, Alexander; Mukhin, Evgeny; Ramras, Daniel
    We derive new combinatorial formulas for vector-valued weight functions for the evaluation modules over the Yangians Y(gl_n). We obtain them using the Nested Algebraic Bethe ansatz method. We also describe the asymptotic behavior of the radial solutions of the negative tt* equation via the Riemann-Hilbert problem and the Deift-Zhou nonlinear steepest descent method.
  • Item
    Optimal Policies in Reliability Modelling of Systems Subject to Sporadic Shocks and Continuous Healing
    (2022-12) Chatterjee, Debolina; Sarkar, Jyotirmoy; Boukai, Benzion; Li, Fang; Wang, Honglang
    Recent years have seen a growth in research on system reliability and maintenance. Various studies in the scientific fields of reliability engineering, quality and productivity analyses, risk assessment, software reliability, and probabilistic machine learning are being undertaken in the present era. The dependency of human life on technology has made it more important to maintain such systems and maximize their potential. In this dissertation, some methodologies are presented that maximize certain measures of system reliability, explain the underlying stochastic behavior of certain systems, and prevent the risk of system failure. An overview of the dissertation is provided in Chapter 1, where we briefly discuss some useful definitions and concepts in probability theory and stochastic processes and present some mathematical results required in later chapters. Thereafter, we present the motivation and outline of each subsequent chapter. In Chapter 2, we compute the limiting average availability of a one-unit repairable system subject to repair facilities and spare units. Formulas for finding the limiting average availability of a repairable system exist only for some special cases: (1) either the lifetime or the repair-time is exponential; or (2) there is one spare unit and one repair facility. In contrast, we consider a more general setting involving several spare units and several repair facilities; and we allow arbitrary life- and repair-time distributions. Under periodic monitoring, which essentially discretizes the time variable, we compute the limiting average availability. The discretization approach closely approximates the existing results in the special cases; and demonstrates as anticipated that the limiting average availability increases with additional spare unit and/or repair facility. 
In Chapter 3, the system experiences two types of sporadic impact: valid shocks that cause damage instantaneously and positive interventions that induce partial healing. Whereas each shock inflicts a fixed magnitude of damage, the accumulated effect of k positive interventions nullifies the damaging effect of one shock. The system is said to be in Stage 1, when it can possibly heal, until the net count of impacts (valid shocks registered minus valid shocks nullified) reaches a threshold $m_1$. The system then enters Stage 2, where no further healing is possible. The system fails when the net count of valid shocks reaches another threshold $m_2 (> m_1)$. The inter-arrival times between successive valid shocks and those between successive positive interventions are independent and follow arbitrary distributions. Thus, we remove the restrictive assumption of an exponential distribution, often found in the literature. We find the distributions of the sojourn time in Stage 1 and the failure time of the system. Finally, we find the optimal values of the choice variables that minimize the expected maintenance cost per unit time for three different maintenance policies. In Chapter 4, the above defined Stage 1 is further subdivided into two parts: In the early part, called Stage 1A, healing happens faster than in the later stage, called Stage 1B. The system stays in Stage 1A until the net count of impacts reaches a predetermined threshold $m_A$; then the system enters Stage 1B and stays there until the net count reaches another predetermined threshold $m_1 (>m_A)$. Subsequently, the system enters Stage 2 where it can no longer heal. The system fails when the net count of valid shocks reaches another predetermined higher threshold $m_2 (> m_1)$. All other assumptions are the same as those in Chapter 3. We calculate the percentage improvement in the lifetime of the system due to the subdivision of Stage 1. 
Finally, we make optimal choices to minimize the expected maintenance cost per unit time for two maintenance policies. Next, we eliminate the restrictive assumptions that all valid shocks and all positive interventions have equal magnitude and that the boundary threshold is a preset constant value. In Chapter 5, we study a system that experiences damaging external shocks of random magnitude at stochastic intervals, continuous degradation, and self-healing. The system fails if cumulative damage exceeds a time-dependent threshold. We develop a preventive maintenance policy to replace the system such that its lifetime is utilized prudently. Further, we consider three variations on the healing pattern: (1) shocks heal for a fixed finite duration $\tau$; (2) a fixed proportion of shocks are non-healable (that is, $\tau=0$); (3) there are two types of shocks: self-healable shocks, which heal for a finite duration, and non-healable shocks. We implement a proposed preventive maintenance policy and compare the optimal replacement times in these new cases with those in the original case, where all shocks heal indefinitely. Finally, in Chapter 6, we present a summary of the dissertation with conclusions and future research potential.
  • Item
    Sample Size Determination in Multivariate Parameters With Applications to Nonuniform Subsampling in Big Data High Dimensional Linear Regression
    (2021-12) Wang, Yu; Peng, Hanxiang; Li, Fang; Sarkar, Jyoti; Tan, Fei
    Subsampling is an important method in the analysis of Big Data. Subsample size determination (SSSD) plays a crucial part in extracting information from data and in overcoming the challenges resulting from huge data sizes. In this thesis, (1) sample size determination (SSD) is investigated for multivariate parameters, and sample size formulas are obtained for the multivariate normal distribution; (2) sample size formulas are obtained based on concentration inequalities; (3) improved bounds for McDiarmid’s inequalities are obtained; (4) the obtained results are applied to nonuniform subsampling in Big Data high dimensional linear regression; and (5) numerical studies are conducted. The sample size formula for the univariate normal distribution is a staple of elementary statistics. To the best of our knowledge, its generalization to the multivariate normal (or, more generally, to multivariate parameters) has not received much attention. In this thesis, we introduce a definition for SSD and obtain explicit formulas for the multivariate normal distribution, in close analogy with the sample size formula for the univariate normal. Commonly used concentration inequalities provide exponential rates, and sample sizes based on these inequalities are often loose. Talagrand (1995) provided the missing factor to sharpen these inequalities. We obtained the numeric values of the constants in the missing factor and slightly improved his results. Furthermore, we provided the missing factor in McDiarmid’s inequality. These improved bounds are used to give shrunken sample sizes.
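
    As background for the last abstract above, the following is a minimal sketch of the two textbook baselines it mentions: the sample size formula for a univariate normal mean, and a sample size derived from Hoeffding's inequality for bounded observations. It is not taken from the thesis. The function names, parameter names, and example values are assumptions chosen for illustration, and the thesis's multivariate formulas and improved bounds are not reproduced here.

```python
import math
from statistics import NormalDist


def normal_sample_size(sigma: float, margin: float, alpha: float) -> int:
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin (known sigma)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal quantile
    return math.ceil((z * sigma / margin) ** 2)


def hoeffding_sample_size(width: float, margin: float, alpha: float) -> int:
    """Smallest n with 2 * exp(-2 * n * margin**2 / width**2) <= alpha.

    `width` is b - a for observations bounded in [a, b] (Hoeffding's inequality).
    """
    return math.ceil(width ** 2 * math.log(2 / alpha) / (2 * margin ** 2))
```

    For observations in [0, 1] (so the standard deviation is at most 0.5), a margin of 0.05, and alpha = 0.05, `normal_sample_size(0.5, 0.05, 0.05)` gives 385 while `hoeffding_sample_size(1.0, 0.05, 0.05)` gives 738, which illustrates why sample sizes based on unsharpened concentration inequalities are described as loose.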