Statistical Deep Learning of Multivariate Longitudinal Data

dc.contributor.advisor: Gao, Sujuan
dc.contributor.advisor: Liu, Hao
dc.contributor.author: Li, Yunyi
dc.contributor.other: Apostolova, Liana G.
dc.contributor.other: Li, Xiaochun
dc.contributor.other: Zhao, Yi
dc.date.accessioned: 2024-12-17T10:58:39Z
dc.date.available: 2024-12-17T10:58:39Z
dc.date.issued: 2024-11
dc.degree.date: 2024
dc.degree.discipline: Biostatistics & Health Data Science
dc.degree.grantor: Indiana University
dc.degree.level: Ph.D.
dc.description: IUI
dc.description.abstract: Longitudinal data of various types, including continuous, binary, and count data, are increasingly collected in scientific fields such as Alzheimer’s disease research. Despite this wealth of data, the complex structure of multivariate longitudinal data presents significant modeling challenges. Researchers have long sought to characterize dynamic interactions among multiple components and to understand how interventions affect outcomes over time under complex underlying dynamics. However, statistical methods for modeling these dynamic changes and associations remain limited. To address these gaps, we first propose a novel nonparametric method to describe the mean temporal changes of sparsely and irregularly observed multivariate longitudinal data. This method is based on an ordinary differential equation (ODE) system approximated by neural networks. We also present a novel approach that treats the initial values of the ODEs as an unknown parameter vector, a departure from existing methods that either pre-specify the initial values or estimate them in an ad hoc manner. In the second topic, we propose deep latent ODE models. These models nonparametrically model latent temporal trends through an unknown function of an ODE system and parametrically estimate the effects of covariates using Bayesian approaches. To address the intractability of the posterior distribution of the initial values, we employ a variational autoencoder (VAE) algorithm. The approximate posterior distribution is characterized by a recurrent neural network (RNN), and the high-dimensional hyperparameters are estimated by stochastic gradient descent on an objective based on the Kullback-Leibler (KL) divergence. Lastly, we propose Bayesian generalized random effects models for longitudinal data from various distributions, including longitudinal count and binary outcomes. These models extend traditional generalized linear mixed effects models (GLMMs) to generalized semiparametric mixed effects models. They assume a nonparametric baseline function with a stochastic process prior, and their parameters are estimated using a Bayesian approach. The proposed models are practical and apply to various types of longitudinal data, including binary and count data. Neural ODE, RNN, variational inference, and KL divergence techniques are also applied in this project.
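The first idea in the abstract above, an ODE system whose right-hand side is a neural network and whose initial values are estimated as part of the parameter vector, can be illustrated with a minimal sketch. This is not the thesis implementation: the network size, the Euler solver, the finite-difference gradient (a stand-in for backpropagation), the step-halving rule, and the synthetic toy observations are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

D, H = 2, 4                             # outcomes and hidden units (illustrative sizes)
N_PARAMS = H * D + H + D * H + D + D    # network weights and biases, plus D initial values

def unpack(theta):
    """Split the flat parameter vector into network weights and initial values y0."""
    i = 0
    W1 = [theta[i + r * D: i + (r + 1) * D] for r in range(H)]; i += H * D
    b1 = theta[i:i + H]; i += H
    W2 = [theta[i + r * H: i + (r + 1) * H] for r in range(D)]; i += D * H
    b2 = theta[i:i + D]; i += D
    y0 = theta[i:i + D]                 # initial values are ordinary unknown parameters
    return W1, b1, W2, b2, y0

def f(y, W1, b1, W2, b2):
    """One-hidden-layer network standing in for the unknown ODE right-hand side."""
    h = [math.tanh(sum(w * v for w, v in zip(row, y)) + b) for row, b in zip(W1, b1)]
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(W2, b2)]

def solve(theta, times, dt=0.05):
    """Euler-integrate dy/dt = f(y) from y(0) = y0; `times` must be increasing."""
    W1, b1, W2, b2, y0 = unpack(theta)
    y, t, out = list(y0), 0.0, []
    for target in times:
        while t < target - 1e-12:
            step = min(dt, target - t)
            dy = f(y, W1, b1, W2, b2)
            y = [a + step * b for a, b in zip(y, dy)]
            t += step
        out.append(list(y))
    return out

def loss(theta, obs):
    """Mean squared error over observed entries; None marks a missing outcome."""
    preds = solve(theta, [t for t, _ in obs])
    sq, n = 0.0, 0
    for (_, ys), p in zip(obs, preds):
        for yv, pv in zip(ys, p):
            if yv is not None:
                sq += (yv - pv) ** 2
                n += 1
    return sq / n

def grad(theta, obs, eps=1e-5):
    """Central finite-difference gradient with respect to weights and y0 jointly."""
    g = []
    for k in range(len(theta)):
        up = theta[:]; up[k] += eps
        dn = theta[:]; dn[k] -= eps
        g.append((loss(up, obs) - loss(dn, obs)) / (2 * eps))
    return g

def fit(theta, obs, steps=100, lr=0.1):
    """Gradient descent; a step is accepted only if it lowers the loss."""
    best = loss(theta, obs)
    for _ in range(steps):
        cand = [p - lr * gk for p, gk in zip(theta, grad(theta, obs))]
        cand_loss = loss(cand, obs)
        if cand_loss < best:
            theta, best = cand, cand_loss
        else:
            lr *= 0.5
    return theta, best

# Synthetic toy data (not from the thesis): irregular times, some outcomes missing.
obs = [(0.0, [1.0, None]), (0.3, [0.8, 0.1]), (0.9, [None, 0.35]), (1.7, [0.2, 0.4])]

random.seed(0)
theta0 = [random.gauss(0.0, 0.1) for _ in range(N_PARAMS)]
initial_loss = loss(theta0, obs)
theta_hat, final_loss = fit(theta0, obs)
y0_hat = unpack(theta_hat)[4]           # estimated initial values
```

Because `y0` sits in the same vector as the network weights, the optimizer updates it together with them rather than fixing it at the first observation, which matters here because the second outcome is unobserved at time 0.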
dc.identifier.uri: https://hdl.handle.net/1805/45093
dc.language.iso: en_US
dc.subject: Alzheimer's Disease
dc.subject: Deep Learning
dc.subject: Longitudinal Data
dc.subject: Ordinary Differential Equation
dc.title: Statistical Deep Learning of Multivariate Longitudinal Data
dc.type: Thesis
Files
Original bundle
Name: Li_iuindianapolis_2432A_10822.pdf
Size: 3.48 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.04 KB
Format: Item-specific license agreed upon to submission