Statistical Deep Learning of Multivariate Longitudinal Data

Li, Yunyi

dc.contributor.advisor	Gao, Sujuan
dc.contributor.advisor	Liu, Hao
dc.contributor.author	Li, Yunyi
dc.contributor.other	Apostolova, Liana G.
dc.contributor.other	Li, Xiaochun
dc.contributor.other	Zhao, Yi
dc.date.accessioned	2024-12-17T10:58:39Z
dc.date.available	2024-12-17T10:58:39Z
dc.date.issued	2024-11
dc.degree.date	2024
dc.degree.discipline	Biostatistics & Health Data Science
dc.degree.grantor	Indiana University
dc.degree.level	Ph.D.
dc.description	IUI
dc.description.abstract	Longitudinal data of various types, including continuous, binary, and count data, are increasingly collected in many scientific fields, such as Alzheimer’s disease research. Despite this wealth of data, the complex structure of multivariate longitudinal data presents significant modeling challenges. Researchers have long sought to characterize dynamic interactions among multiple components and to understand how interventions affect outcomes over time when the underlying dynamics are complex. However, statistical methods for modeling these dynamic changes and associations remain limited. To address these gaps, we first propose a novel nonparametric method to describe the mean temporal changes of sparsely and irregularly observed multivariate longitudinal data. This method is based on an Ordinary Differential Equation (ODE) system approximated by neural networks. Furthermore, we treat the initial values of the ODEs as an unknown parameter vector, a departure from existing methods that either pre-specify the initial values or estimate them in an ad hoc manner. In the second topic, we propose deep latent ODE models. These models nonparametrically model latent temporal trends through an unknown function of an ODE system and parametrically estimate the effects of covariates using Bayesian approaches. To address the intractability of the posterior distribution of the initial values, we employ a variational autoencoder (VAE) algorithm. The approximate posterior distribution is characterized by a recurrent neural network (RNN), and high-dimensional hyperparameters are estimated using stochastic gradient descent based on the Kullback-Leibler (KL) divergence. Lastly, we propose a Bayesian generalized random effects model for longitudinal data from various distributions, including longitudinal count and binary outcomes. This model extends traditional generalized linear mixed effect models (GLMMs) to generalized semi-parametric mixed effect models. It assumes a nonparametric baseline function with a stochastic process prior, and parameters are estimated using a Bayesian approach. The proposed model is practical and can be applied to various types of longitudinal data, including binary and count data. Neural ODE, RNN, variational inference, and KL divergence techniques are also applied in this project.
dc.identifier.uri	https://hdl.handle.net/1805/45093
dc.language.iso	en_US
dc.subject	Alzheimer's Disease
dc.subject	Deep Learning
dc.subject	Longitudinal Data
dc.subject	Ordinary Differential Equation
dc.title	Statistical Deep Learning of Multivariate Longitudinal Data
dc.type	Thesis

Files

Original bundle

Name: Li_iuindianapolis_2432A_10822.pdf
Size: 3.48 MB
Format: Adobe Portable Document Format

Collections

Biostatistics Department Theses and Dissertations