Browsing by Subject "Statistical modeling"
Item Artificial Intelligence to Aid Glaucoma Diagnosis and Monitoring: State of the Art and New Directions (MDPI, 2022)
Nunez, Roberto; Harris, Alon; Ibrahim, Omar; Keller, James; Wikle, Christopher K.; Robinson, Erin; Zukerman, Ryan; Siesky, Brent; Verticchio, Alice; Rowe, Lucas; Guidoboni, Giovanna; Ophthalmology, School of Medicine
Recent developments in the use of artificial intelligence in the diagnosis and monitoring of glaucoma are discussed. To set the context and fix terminology, a brief historic overview of artificial intelligence is provided, along with some fundamentals of statistical modeling. Next, recent applications of artificial intelligence techniques in glaucoma diagnosis and the monitoring of glaucoma progression are reviewed, including the classification of visual field images and the detection of glaucomatous change in retinal nerve fiber layer thickness. Current challenges in the direct application of artificial intelligence to further our understanding of this disease are also outlined. The article also discusses how the combined use of mathematical modeling and artificial intelligence may help to address these challenges, along with stronger communication between data scientists and clinicians.

Item Combining NMR and LC/MS Using Backward Variable Elimination: Metabolomics Analysis of Colorectal Cancer, Polyps, and Healthy Controls (ACS Publications, 2016-08-16)
Deng, Lingli; Gu, Haiwei; Zhu, Jiangjiang; Gowda, G. A. Nagana; Djukovic, Danijel; Chiorean, Gabriela; Raftery, Daniel; Department of Medicine, School of Medicine
Both nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) play important roles in metabolomics. The complementary features of NMR and MS make their combination very attractive; however, currently the vast majority of metabolomics studies use either NMR or MS separately, and variable selection that combines NMR and MS for biomarker identification and statistical modeling is still not well developed.
In this study focused on methodology, we developed a backward variable elimination partial least-squares discriminant analysis algorithm embedded with Monte Carlo cross validation (MCCV-BVE-PLSDA), to combine NMR and targeted liquid chromatography (LC)/MS data. Using the metabolomics analysis of serum for the detection of colorectal cancer (CRC) and polyps as an example, we demonstrate that variable selection is vitally important in combining NMR and MS data. The combined approach was better than using NMR or LC/MS data alone in providing significantly improved predictive accuracy in all the pairwise comparisons among CRC, polyps, and healthy controls. Using this approach, we selected a subset of metabolites responsible for the improved separation for each pairwise comparison, and we achieved a comprehensive profile of altered metabolite levels, including those in glycolysis, the TCA cycle, amino acid metabolism, and other pathways that were related to CRC and polyps. MCCV-BVE-PLSDA is straightforward, easy to implement, and highly useful for studying the contribution of each individual variable to multivariate statistical models. On the basis of these results, we recommend using an appropriate variable selection step, such as MCCV-BVE-PLSDA, when analyzing data from multiple analytical platforms to obtain improved statistical performance and a more accurate biological interpretation, especially for biomarker discovery. Importantly, the approach described here is relatively universal and can be easily expanded for combination with other analytical technologies.

Item Computational modeling for identification of low-frequency single nucleotide variants (2015-11-16)
Hao, Yangyang; Liu, Yunlong; Edenberg, Howard J.; Li, Lang; Nakshatr, Harikrishna
Reliable detection of low-frequency single nucleotide variants (SNVs) carries great significance in many applications.
In cancer genetics, the frequencies of somatic variants from tumor biopsies tend to be low due to contamination with normal tissue and tumor heterogeneity. Circulating tumor DNA monitoring also faces the challenge of detecting low-frequency variants due to the small percentage of tumor DNA in blood. Moreover, in population genetics, although pooled sequencing is cost-effective compared with individual sequencing, pooling dilutes the signals of variants from any individual. Detection of low-frequency variants is difficult and can be confounded by multiple sources of errors, especially next-generation sequencing artifacts. Existing methods are limited in sensitivity and mainly focus on frequencies around 5%; most fail to consider differential, context-specific sequencing artifacts. To face this challenge, we developed a computational and experimental framework, RareVar, to reliably identify low-frequency SNVs from high-throughput sequencing data. For optimized performance, RareVar utilized a supervised learning framework to model artifacts originating from different components of a specific sequencing pipeline. This is enabled by a customized, comprehensive benchmark dataset enriched with known low-frequency SNVs from the sequencing pipeline of interest. A genomic-context-specific sequencing error model was trained on the benchmark data to characterize the systematic sequencing artifacts and to derive the position-specific detection limit for sensitive low-frequency SNV detection. Further, a machine-learning algorithm utilized sequencing quality features to refine SNV candidates for higher specificity. RareVar outperformed existing approaches, especially at 0.5% to 5% frequency. We further explored the influence of statistical modeling on position-specific error modeling and showed the zero-inflated negative binomial to be the best-performing statistical distribution.
When replicating analyses on an Illumina MiSeq benchmark dataset, our method seamlessly adapted to technologies with different biochemistries. RareVar enables sensitive detection of low-frequency SNVs across different sequencing platforms and will facilitate research and clinical applications such as pooled sequencing, cancer early detection, prognostic assessment, metastatic monitoring, and identification of relapse or acquired resistance.
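
The abstract above names the zero-inflated negative binomial (ZINB) as the distribution that fit position-specific error counts best. As a rough illustration only, and not RareVar's actual implementation, the sketch below writes out the ZINB probability mass function and one possible way to turn it into a per-position count threshold. The function names, the mean/dispersion parameterization (`mu`, `r`), the zero-inflation weight `pi`, and the tail probability `alpha` are all assumptions made for this example.

```python
import math


def nb_pmf(k, mu, r):
    """Negative binomial pmf with mean mu > 0 and dispersion (size) r > 0."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, r, pi):
    """ZINB pmf: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    base = (1.0 - pi) * nb_pmf(k, mu, r)
    return pi + base if k == 0 else base


def detection_limit(mu, r, pi, alpha=1e-3, max_count=10_000):
    """Hypothetical threshold: the smallest error count t such that
    P(X >= t) <= alpha under the ZINB error model."""
    tail = 1.0  # P(X >= 0)
    for t in range(max_count + 1):
        if tail <= alpha:
            return t
        tail -= zinb_pmf(t, mu, r, pi)  # tail becomes P(X >= t + 1)
    raise ValueError("no threshold found below max_count")
```

Under these assumptions, a position whose observed non-reference read count reaches `detection_limit(mu, r, pi)` would be unlikely to arise from sequencing error alone; in practice the parameters would have to be estimated from benchmark data for each genomic context.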