Browsing by Author "Zhou, Junyi"
A reference-free R-learner for treatment recommendation
(Sage, 2023) Zhou, Junyi; Zhang, Ying; Tu, Wanzhu; Biostatistics and Health Data Science, School of Medicine
Assigning optimal treatments to individual patients based on their characteristics is the ultimate goal of precision medicine. Deriving evidence-based recommendations from observational data while considering the causal treatment effects and patient heterogeneity is a challenging task, especially in situations of multiple treatment options. Herein, we propose a reference-free R-learner based on a simplex algorithm for treatment recommendation. We showed through extensive simulation that the proposed method produced accurate recommendations that corresponded to optimal treatment outcomes, regardless of the reference group. We used the method to analyze data from the Systolic Blood Pressure Intervention Trial (SPRINT) and achieved recommendations consistent with the current clinical guidelines.

A spline-based nonparametric analysis for interval-censored bivariate survival data
(Institute of Statistical Science, 2022) Wu, Yuan; Zhang, Ying; Zhou, Junyi; Biostatistics, School of Public Health
In this manuscript we propose a spline-based sieve nonparametric maximum likelihood estimation method for joint distribution function with bivariate interval-censored data. We study the asymptotic behavior of the proposed estimator by proving the consistency and deriving the rate of convergence. Based on the sieve estimate of the joint distribution, we also develop an efficient nonparametric test for making inference about the dependence between two interval-censored event times and establish its asymptotic normality. We conduct simulation studies to examine the finite sample performance of the proposed methodology.
Finally, we apply the method to assess the association between two subtypes of mild cognitive impairment (MCI): amnestic MCI and non-amnestic MCI, for Huntington disease (HD) using data from a 12-year observational cohort study on premanifest HD individuals, PREDICT-HD.

Mild Cognitive Impairment as an Early Landmark in Huntington's Disease
(Frontiers Media, 2021-07-07) Zhang, Ying; Zhou, Junyi; Gehl, Carissa R.; Long, Jeffrey D.; Johnson, Hans; Magnotta, Vincent A.; Sewell, Daniel; Shannon, Kathleen; Paulsen, Jane S.; Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
As one of the clinical triad in Huntington's disease (HD), cognitive impairment has not been widely accepted as a disease stage indicator in HD literature. This work aims to study cognitive impairment thoroughly for prodromal HD individuals with the data from a 12-year observational study to determine whether Mild Cognitive Impairment (MCI) in HD gene-mutation carriers is a defensible indicator of early disease. Prodromal HD gene-mutation carriers evaluated annually at one of 32 worldwide sites from September 2002 to April 2014 were evaluated for MCI in six cognitive domains. Linear mixed-effects models were used to determine age-, education-, and retest-adjusted cut-off values in cognitive assessment for MCI, and then the concurrent and predictive validity of MCI was assessed. Accelerated failure time (AFT) models were used to determine the timing of MCI (single-, two-, and multiple-domain), and dementia, which was defined as MCI plus functional loss. Seven hundred and sixty-eight prodromal HD participants had completed all six cognitive tasks, had MRI, and underwent longitudinal assessments. Over half (i.e., 54%) of the participants had MCI at study entry, and half of these had single-domain MCI.
Compared to participants with intact cognitive performances, prodromal HD with MCI had higher genetic burden, worsened motor impairment, greater brain atrophy, and a higher likelihood of estimated HD onset. Prospective longitudinal study of those without MCI at baseline showed that 48% had MCI in subsequent visits and data visualization suggested that single-domain MCI, two-domain MCI, and dementia represent appropriate cognitive impairment staging for HD gene-mutation carriers. Findings suggest that MCI represents an early landmark of HD and may be a sensitive enrichment variable or endpoint for prodromal clinical trials of disease modifying therapeutics.

New Applications of Spline-Based Learning Algorithms
(2021-10) Zhou, Junyi; Tu, Wanzhu; Zhang, Ying; Cao, Sha; Zhang, Chi; Bakoyannis, Giorgos
Statistical learning methods are affecting human society and our daily lives in unprecedented ways. Most of these learning methods are motivated by practical applications, and they in turn are being used to solve real-world problems. Although generally accepted principles exist for the development of learning methods, new models and algorithms tend to emerge not as a result of theoretical extensions but as a consequence of the scientific, technological, and societal needs of the world. In view of application-motivated method development, two classes of statistical learning methods are described: One addressing the needs of precision medicine and the other exploring the underlying longitudinal data structure in an unsupervised manner. A common thread in the two methods is combining spline-based models with learning algorithms to improve analytical accuracy. The challenges in optimizing treatment for individual patients are first addressed. Specifically, therapeutic optimization must be based on a good causal understanding of the treatment effects.
Furthermore, given the multiple treatment options available, recommendations must be consistent regardless of the reference treatment. To address the issue of inconsistent recommendations in a newer R-learner method, a simplex R-learning algorithm to help select the best treatment for individual patients is presented. The algorithm was tested, and the analytical results of the data from the Systolic Blood Pressure Intervention Trial (SPRINT) are presented. The proposed method provided recommendations consistent with the current clinical guidelines for hypertension treatment. The second part of this dissertation addresses the clustering of longitudinal data with sparse and irregular observations. Through simulation studies, the algorithm is demonstrated to have superior clustering accuracy and numerical efficiency to those of the existing methods. In addition, the algorithm can be easily extended to multiple-outcome longitudinal data with little additional computational cost, and is capable of detecting the correct number of clusters when extremely unbalanced cluster sizes exist. The algorithm was applied to a 12-year multi-site observational study (PREDICT-HD) to investigate the disease progression patterns of Huntington's disease (HD). Finally, an R package, ClusterLong, was developed to provide a tool for the public use of this algorithm. The tool was incorporated into an R Shiny application to allow users unfamiliar with R to access the method.

PLUS: Predicting cancer metastasis potential based on positive and unlabeled learning
(PLOS, 2022-03-29) Zhou, Junyi; Lu, Xiaoyu; Chang, Wennan; Wan, Changlin; Lu, Xiongbin; Zhang, Chi; Cao, Sha; Medical and Molecular Genetics, School of Medicine
Metastatic cancer accounts for over 90% of all cancer deaths, and evaluations of metastasis potential are vital for minimizing the metastasis-associated mortality and achieving optimal clinical decision-making.
Computational assessment of metastasis potential based on large-scale transcriptomic cancer data is challenging because metastasis events are not always clinically detectable. The under-diagnosis of metastasis events results in biased classification labels, and classification tools using biased labels may lead to inaccurate estimations of metastasis potential. This issue is further complicated by the unknown metastasis prevalence at the population level, the small number of confirmed metastasis cases, and the high dimensionality of the candidate molecular features. Our proposed algorithm, called Positive and unlabeled Learning from Unbalanced cases and Sparse structures (PLUS), is the first to use a positive and unlabeled learning framework to account for the under-detection of metastasis events in building a classifier. PLUS is specifically tailored for studying metastasis that deals with the unbalanced instance allocation as well as unknown metastasis prevalence, which are not considered by other methods. PLUS achieves superior performance on synthetic datasets compared with other state-of-the-art methods. Application of PLUS to The Cancer Genome Atlas Pan-Cancer gene expression data generated metastasis potential predictions that show good agreement with the clinical follow-up data, in addition to predictive genes that have been validated by independent single-cell RNA-sequencing datasets.