New Applications of Spline-Based Learning Algorithms

Date
2021-10
Language
American English
Degree
Ph.D.
Degree Year
2021
Grantor
Indiana University
Abstract

Statistical learning methods are affecting human society and our daily lives in unprecedented ways. Most of these learning methods are motivated by practical applications, and they in turn are being used to solve real-world problems. Although generally accepted principles exist for the development of learning methods, new models and algorithms tend to emerge not as a result of theoretical extensions but as a consequence of the scientific, technological, and societal needs of the world. In view of application-motivated method development, two classes of statistical learning methods are described: one addressing the needs of precision medicine and the other exploring the underlying longitudinal data structure in an unsupervised manner. A common thread in the two methods is combining spline-based models with learning algorithms to improve analytical accuracy. The challenges in optimizing treatment for individual patients are addressed first. Specifically, therapeutic optimization must be based on a good causal understanding of the treatment effects. Furthermore, given the multiple treatment options available, recommendations must be consistent regardless of the reference treatment. To address the issue of inconsistent recommendations in a newer R-learner method, a simplex R-learning algorithm to help select the best treatment for individual patients is presented. The algorithm was tested, and results from an analysis of data from the Systolic Blood Pressure Intervention Trial (SPRINT) are presented. The proposed method provided recommendations consistent with the current clinical guidelines for hypertension treatment. The second part of this dissertation addresses the clustering of longitudinal data with sparse and irregular observations. Through simulation studies, the proposed clustering algorithm is shown to have clustering accuracy and numerical efficiency superior to those of existing methods.
In addition, the algorithm can be easily extended to multiple-outcome longitudinal data with little additional computational cost, and it is capable of detecting the correct number of clusters when cluster sizes are extremely unbalanced. The algorithm was applied to a 12-year multi-site observational study (PREDICT-HD) to investigate the disease progression patterns of Huntington's disease (HD). Finally, an R package, ClusterLong, was developed to make the algorithm publicly available. The package was incorporated into an R Shiny application so that users unfamiliar with R can access the method.

Description
Indiana University-Purdue University Indianapolis (IUPUI)
Type
Dissertation