IU Indianapolis ScholarWorks :: Browsing by Author "Rajwa, Bartek"

Automated Assessment of Disease Progression in Acute Myeloid Leukemia by Probabilistic Analysis of Flow Cytometry Data
(Institute of Electrical and Electronics Engineers, 2017-05) Rajwa, Bartek; Wallace, Paul K.; Griffiths, Elizabeth A.; Dundar, Murat; Computer and Information Science, School of Science
OBJECTIVE: Flow cytometry (FC) is a widely acknowledged technology in diagnosis of acute myeloid leukemia (AML) and has been indispensable in determining progression of the disease. Although FC plays a key role as a posttherapy prognosticator and evaluator of therapeutic efficacy, the manual analysis of cytometry data is a barrier to optimization of reproducibility and objectivity. This study investigates the utility of our recently introduced nonparametric Bayesian framework in accurately predicting the direction of change in disease progression in AML patients using FC data. METHODS: The highly flexible nonparametric Bayesian model based on the infinite mixture of infinite Gaussian mixtures is used for jointly modeling data from multiple FC samples to automatically identify functionally distinct cell populations and their local realizations. Phenotype vectors are obtained by characterizing each sample by the proportions of recovered cell populations, which are, in turn, used to predict the direction of change in disease progression for each patient. RESULTS: We used 200 diseased and nondiseased immunophenotypic panels for training and tested the system with 36 additional AML cases collected at multiple time points. The proposed framework identified the change in direction of disease progression with accuracies of 90% (nine out of ten) for relapsing cases and 100% (26 out of 26) for the remaining cases. CONCLUSIONS: We believe that these promising results are an important first step toward the development of automated predictive systems for disease monitoring and continuous response evaluation. SIGNIFICANCE: Automated measurement and monitoring of therapeutic response is critical not only for objective evaluation of disease status prognosis but also for timely assessment of treatment strategies.
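The phenotype-vector step in the abstract above can be illustrated with a minimal sketch. It assumes each cell has already been assigned a population label by some clustering model; the population names, the toy label lists, and the function `phenotype_vector` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def phenotype_vector(labels, populations):
    """Return the fraction of cells assigned to each population, in a fixed order."""
    total = len(labels)
    if total == 0:
        raise ValueError("sample contains no cells")
    counts = Counter(labels)
    return [counts.get(p, 0) / total for p in populations]


# Hypothetical cluster labels for two samples from one patient (illustrative only).
populations = ["lymphocytes", "monocytes", "blasts"]
sample_t0 = ["lymphocytes"] * 6 + ["monocytes"] * 3 + ["blasts"] * 1
sample_t1 = ["lymphocytes"] * 5 + ["monocytes"] * 1 + ["blasts"] * 4

v0 = phenotype_vector(sample_t0, populations)
v1 = phenotype_vector(sample_t1, populations)

# Per-population change in proportion between the two time points.
change = [later - earlier for earlier, later in zip(v0, v1)]
```

In this sketch a positive entry in `change` for the hypothetical "blasts" population would be one possible input to a downstream predictor; the abstract does not specify how the actual prediction is made from the vectors.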
Batch Discovery of Recurring Rare Classes toward Identifying Anomalous Samples
(ACM, 2014) Dundar, Murat; Yerebakan, Halid Ziya; Rajwa, Bartek; Computer and Information Science, School of Science
We present a clustering algorithm for discovering rare yet significant recurring classes across a batch of samples in the presence of random effects. We model each sample's data by an infinite mixture of Dirichlet-process Gaussian-mixture models (DPMs) with each DPM representing the noisy realization of its corresponding class distribution in a given sample. We introduce dependencies across multiple samples by placing a global Dirichlet process prior over individual DPMs. This hierarchical prior introduces a sharing mechanism across samples and allows for identifying local realizations of classes across samples. We use a collapsed Gibbs sampler for inference to recover local DPMs and identify their class associations. We demonstrate the utility of the proposed algorithm, processing a flow cytometry data set containing two extremely rare cell populations, and report results that significantly outperform competing techniques. The source code of the proposed algorithm is available on the web via the link: http://cs.iupui.edu/~dundar/aspire.htm.
An emerging method to noninvasively measure and identify vagal response markers to enable bioelectronic control of gastroparesis symptoms with gastric electrical stimulation
(Elsevier, 2020-04-15) Ward, Matthew P.; Gupta, Anita; Wo, John M.; Rajwa, Bartek; Furness, John B.; Powley, Terry L.; Nowak, Thomas V.; Medicine, School of Medicine
Background: Gastric electrical stimulation (GES) can be a life-changing, device-based treatment option for drug-resistant nausea and vomiting associated with diabetic or idiopathic gastroparesis (GP). Despite over two decades of clinical use, the mechanism of action remains unclear. We hypothesize a vagal mechanism. New method: Here, we describe a noninvasive method to investigate vagal nerve involvement in GES therapy in 66 human subjects through the compound nerve action potential (CNAP). Results: Of the 66 subjects, 28 had diabetic GP, 35 had idiopathic GP, and 3 had postsurgical GP. Stimulus charge per pulse did not predict treatment efficacy, but did predict a significant increase in total symptom score in type 1 diabetics as GES stimulus charge per pulse increased (p < 0.01), representing a notable side effect and providing a method to identify it. In contrast, the number of significant left and right vagal fiber responses that were recorded directly related to patient symptom improvement. Increased vagal responses correlated with significant decreases in total symptom score (p < 0.05). Comparison with existing method(s): We have developed transcutaneous recording of cervical vagal activity that is synchronized with GES in conscious human subjects, along with methods of discriminating the activity of different nerve fiber groups with respect to conduction speed and treatment response. Conclusions: Cutaneous vagal CNAP analysis is a useful technique to unmask relationships among GES parameters, vagal recruitment, efficacy and side-effect management. Our results suggest that CNAP-guided GES optimization will provide the most benefit to patients with idiopathic and type 1 diabetic gastroparesis.
High-throughput segmentation of unmyelinated axons by deep learning
(Springer Nature, 2022-01-24) Plebani, Emanuele; Biscola, Natalia P.; Havton, Leif A.; Rajwa, Bartek; Shemonti, Abida Sanjana; Jaffey, Deborah; Powley, Terry; Keast, Janet R.; Lu, Kun‑Han; Dundar, M. Murat; Computer and Information Science, School of Science
Axonal characterizations of connectomes in healthy and disease phenotypes are surprisingly incomplete and biased because unmyelinated axons, the most prevalent type of fibers in the nervous system, have largely been ignored as their quantitative assessment quickly becomes unmanageable as the number of axons increases. Herein, we introduce the first prototype of a high-throughput processing pipeline for automated segmentation of unmyelinated fibers. Our team has used transmission electron microscopy images of vagus and pelvic nerves in rats. All unmyelinated axons in these images are individually annotated and used as labeled data to train and validate a deep instance segmentation network. We investigate the effect of different training strategies on the overall segmentation accuracy of the network. We extensively validate the segmentation algorithm as a stand-alone segmentation tool as well as in an expert-in-the-loop hybrid segmentation setting with preliminary, albeit remarkably encouraging results. Our algorithm achieves an instance-level F1 score of between 0.7 and 0.9 on various test images in the stand-alone mode and reduces expert annotation labor by 80% in the hybrid setting. We hope that this new high-throughput segmentation pipeline will enable quick and accurate characterization of unmyelinated fibers at scale and become instrumental in significantly advancing our understanding of connectomes in both the peripheral and the central nervous systems.
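For readers unfamiliar with the instance-level F1 score quoted above, the following sketch shows how it is computed from match counts. It assumes predicted axons have already been matched to annotated axons by some overlap criterion, which the abstract does not specify; the function name and the example counts are hypothetical.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall computed from instance match counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 matched axons, 10 spurious detections, 30 missed axons.
example_f1 = f1_score(true_positives=80, false_positives=10, false_negatives=30)
```

With these made-up counts the score works out to 2·80 / (2·80 + 10 + 30) = 0.8, which happens to fall inside the 0.7 to 0.9 range the abstract reports.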
Identification of predictive patient characteristics for assessing the probability of COVID-19 in-hospital mortality
(Public Library of Science, 2024) Rajwa, Bartek; Naved, Md Mobasshir Arshed; Adibuzzaman, Mohammad; Grama, Ananth Y.; Khan, Babar A.; Dundar, M. Murat; Rochet, Jean-Christophe; Computer Science, Luddy School of Informatics, Computing, and Engineering
As the world emerges from the COVID-19 pandemic, there is an urgent need to understand patient factors that may be used to predict the occurrence of severe cases and patient mortality. Approximately 20% of SARS-CoV-2 infections lead to acute respiratory distress syndrome caused by the harmful actions of inflammatory mediators. Patients with severe COVID-19 are often afflicted with neurologic symptoms, and individuals with pre-existing neurodegenerative disease have an increased risk of severe COVID-19. Although collectively, these observations point to a bidirectional relationship between severe COVID-19 and neurologic disorders, little is known about the underlying mechanisms. Here, we analyzed the electronic health records of 471 patients with severe COVID-19 to identify clinical characteristics most predictive of mortality. Feature discovery was conducted by training a regularized logistic regression classifier that serves as a machine-learning model with an embedded feature selection capability. SHAP analysis using the trained classifier revealed that a small ensemble of readily observable clinical features, including characteristics associated with cognitive impairment, could predict in-hospital mortality with an accuracy greater than 0.85 (expressed as the area under the ROC curve of the classifier). These findings have important implications for the prioritization of clinical measures used to identify patients with COVID-19 (and, potentially, other forms of acute respiratory distress syndrome) having an elevated risk of death.
The Infinite Mixture of Infinite Gaussian Mixtures
(2015) Yerebakan, Halid Z.; Rajwa, Bartek; Dundar, Murat; Department of Computer & Information Science, School of Science
Dirichlet process mixture of Gaussians (DPMG) has been used in the literature for clustering and density estimation problems. However, many real-world data exhibit cluster distributions that cannot be captured by a single Gaussian. Modeling such data sets by DPMG creates several extraneous clusters even when clusters are relatively well-defined. Herein, we present the infinite mixture of infinite Gaussian mixtures (I2GMM) for more flexible modeling of data sets with skewed and multi-modal cluster distributions. Instead of using a single Gaussian for each cluster as in the standard DPMG model, the generative model of I2GMM uses a single DPMG for each cluster. The individual DPMGs are linked together through centering of their base distributions at the atoms of a higher level DP prior. Inference is performed by a collapsed Gibbs sampler that also enables partial parallelization. Experimental results on several artificial and real-world data sets suggest the proposed I2GMM model can predict clusters more accurately than existing variational Bayes and Gibbs sampler versions of DPMG.
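The Dirichlet process prior that DPMG and I2GMM build on can be illustrated through its Chinese restaurant process representation. The sketch below samples only a partition from that prior. It is not the I2GMM generative model or its collapsed Gibbs sampler, and the function name, concentration value, and seed are assumptions chosen for illustration.

```python
import random


def sample_crp(n_points, alpha, rng):
    """Draw cluster assignments for n_points from a Chinese restaurant process."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    assignments = []
    sizes = []  # sizes[k] is the number of points already in cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        assignments.append(k)
    return assignments, sizes


rng = random.Random(0)  # fixed seed so the sketch is reproducible
assignments, sizes = sample_crp(200, alpha=1.0, rng=rng)
```

The number of clusters is not fixed in advance but grows with the data, which is the property that makes such priors attractive when the number of cell populations is unknown.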
A non-parametric Bayesian model for joint cell clustering and cluster matching: identification of anomalous sample phenotypes with random effects
(Springer (Biomed Central Ltd.), 2014) Dundar, Murat; Akova, Ferit; Yerebakan, Halid Z.; Rajwa, Bartek; Department of Computer & Information Science, School of Science
BACKGROUND: Flow cytometry (FC)-based computer-aided diagnostics is an emerging technique utilizing modern multiparametric cytometry systems. The major difficulty in using machine-learning approaches for classification of FC data arises from limited access to a wide variety of anomalous samples for training. In consequence, any learning with an abundance of normal cases and a limited set of specific anomalous cases is biased towards the types of anomalies represented in the training set. Such models do not accurately identify anomalies, whether previously known or unknown, that may exist in future samples tested. Although one-class classifiers trained using only normal cases would avoid such a bias, robust sample characterization is critical for a generalizable model. Owing to sample heterogeneity and instrumental variability, arbitrary characterization of samples usually introduces feature noise that may lead to poor predictive performance. Herein, we present a non-parametric Bayesian algorithm called ASPIRE (anomalous sample phenotype identification with random effects) that identifies phenotypic differences across a batch of samples in the presence of random effects. Our approach involves simultaneous clustering of cellular measurements in individual samples and matching of discovered clusters across all samples in order to recover global clusters using probabilistic sampling techniques in a systematic way. RESULTS: We demonstrate the performance of the proposed method in identifying anomalous samples in two different FC data sets, one of which represents a set of samples including acute myeloid leukemia (AML) cases, and the other a generic 5-parameter peripheral-blood immunophenotyping. Results are evaluated in terms of the area under the receiver operating characteristics curve (AUC). ASPIRE achieved AUCs of 0.99 and 1.0 on the AML and generic blood immunophenotyping data sets, respectively.
CONCLUSIONS: These results demonstrate that anomalous samples can be identified by ASPIRE with almost perfect accuracy without a priori access to samples of anomalous subtypes in the training set. The ASPIRE approach is unique in its ability to form generalizations regarding normal and anomalous states given only very weak assumptions regarding sample characteristics and origin. Thus, ASPIRE could become highly instrumental in providing unique insights about observed biological phenomena in the absence of full information about the investigated samples.
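The AUC metric used to evaluate ASPIRE above has a simple pairwise interpretation: it is the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one. The sketch below computes it that way. The scores are made-up values rather than results from the paper, and the function name is hypothetical.

```python
def pairwise_auc(scores_anomalous, scores_normal):
    """AUC as the fraction of (anomalous, normal) pairs ranked correctly; ties count one half."""
    if not scores_anomalous or not scores_normal:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in scores_anomalous:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_anomalous) * len(scores_normal))


# Made-up anomaly scores for illustration only.
auc_value = pairwise_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1])
```

In this toy case 11 of the 12 pairs are ordered correctly, so the AUC is 11/12. An AUC of 1.0, as reported for the generic immunophenotyping data set, means every anomalous sample scored above every normal one.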
Partially-observed models for classifying minerals on Mars
(IEEE, 2013) Dundar, Murat; Li, Lin; Rajwa, Bartek; Earth Sciences, School of Science
The identification of phyllosilicates by NASA's CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) strongly suggests the presence of water-related geological processes. A variety of water-bearing phyllosilicate minerals have already been identified by several research groups utilizing spectral enrichment techniques and matching phyllosilicate-rich regions on the Martian surface to known spectra of minerals found on Earth. However, fully automated analysis of the CRISM data remains a challenge for two main reasons. First, there is significant variability in the spectral signature of the same mineral obtained from different regions on the Martian surface. Second, the list of minerals confirmed to date constituting the set of training classes is not exhaustive. Thus, when classifying new regions, using a classifier trained with selected minerals and chemicals, one must consider the potential presence of unknown materials not represented in the training library. We made an initial attempt to study these problems in the context of our recent work on partially-observed classification models and present results that show the utility of such models in identifying spectra of unknown minerals while simultaneously recognizing spectra of known minerals.
Simplicity of Kmeans versus Deepness of Deep Learning: A Case of Unsupervised Feature Learning with Limited Data
(IEEE, 2015-12) Dundar, Murat; Kou, Qiang; Zhang, Baichuan; He, Yicheng; Rajwa, Bartek; Department of Computer and Information Sciences, School of Science
We study a bio-detection application as a case study to demonstrate that Kmeans-based unsupervised feature learning can be a simple yet effective alternative to deep learning techniques for small data sets with limited intra- as well as inter-class diversity. We investigate the effect on the classifier performance of data augmentation as well as feature extraction with multiple patch sizes and at different image scales. Our data set includes 1833 images from four different classes of bacteria, each bacterial culture captured at three different wavelengths and overall data collected during a three-day period. The limited number and diversity of images present, potential random effects across multiple days, and the multi-mode nature of class distributions pose a challenging setting for representation learning. Using images collected on the first day for training, on the second day for validation, and on the third day for testing, Kmeans-based representation learning achieves 97% classification accuracy on the test data. This compares very favorably to 56% accuracy achieved by deep learning and 74% accuracy achieved by handcrafted features. Our results suggest that data augmentation or dropping connections between units offers little help for deep-learning algorithms, whereas a significant boost can be achieved by Kmeans-based representation learning by augmenting data and by concatenating features obtained at multiple patch sizes or image scales.
