Browsing by Author "Johnson, Travis S."
Now showing 1 - 10 of 21
Item BERMUDA: a novel deep transfer learning method for single-cell RNA sequencing batch correction reveals hidden high-resolution cellular subtypes
(BioMed Central, 2019-08-12) Wang, Tongxin; Johnson, Travis S.; Shao, Wei; Lu, Zixiao; Helm, Bryan R.; Zhang, Jie; Huang, Kun; Medical and Molecular Genetics, School of Medicine
To fully utilize the power of single-cell RNA sequencing (scRNA-seq) technologies for identifying cell lineages and bona fide transcriptional signals, it is necessary to combine data from multiple experiments. We present BERMUDA (Batch Effect ReMoval Using Deep Autoencoders), a novel transfer-learning-based method for batch effect correction in scRNA-seq data. BERMUDA effectively combines different batches of scRNA-seq data with vastly different cell population compositions and amplifies biological signals by transferring information among batches. We demonstrate that BERMUDA outperforms existing methods for removing batch effects and distinguishing cell types in multiple simulated and real scRNA-seq datasets.

Item Combinatorial analyses reveal cellular composition changes have different impacts on transcriptomic changes of cell type specific genes in Alzheimer’s Disease
(Springer Nature, 2021-01-11) Johnson, Travis S.; Xiang, Shunian; Dong, Tianhan; Huang, Zhi; Cheng, Michael; Wang, Tianfu; Yang, Kai; Ni, Dong; Huang, Kun; Zhang, Jie; Biostatistics, School of Public Health
Alzheimer’s disease (AD) brains are characterized by progressive neuron loss and gliosis. Previous studies of gene expression using bulk tissue samples often fail to consider changes in cell-type composition when comparing AD versus control, which can lead to differences in expression levels that are not due to transcriptional regulation. We mined five large transcriptomic AD datasets for conserved gene co-expression modules, then analyzed differential expression and differential co-expression within the modules between AD samples and controls.
We performed cell-type deconvolution analysis to determine whether the observed differential expression was due to changes in cell-type proportions in the samples or to transcriptional regulation. Our findings were validated using four additional datasets. We discovered that the increased expression of microglia modules in the AD samples can be explained by increased microglia proportions in the AD samples. In contrast, decreased expression and perturbed co-expression within neuron modules in the AD samples was likely due in part to altered regulation of neuronal pathways. Several transcription factors that are differentially expressed in AD might account for such altered gene regulation. Similarly, changes in gene expression and co-expression within astrocyte modules could be attributed to combined effects of astrogliosis and astrocyte gene activation. Gene expression in the astrocyte modules was also strongly correlated with clinicopathological biomarkers. Through this work, we demonstrated that combinatorial analysis can delineate the origins of transcriptomic changes in bulk tissue data and shed light on key genes and pathways involved in AD.

Item Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations
(BMC, 2020) Huang, Zhi; Johnson, Travis S.; Han, Zhi; Helm, Bryan; Cao, Sha; Zhang, Chi; Salama, Paul; Rizkalla, Maher; Yu, Christina Y.; Cheng, Jun; Xiang, Shunian; Zhan, Xiaohui; Zhang, Jie; Huang, Kun; Medicine, School of Medicine
Background: Recent advances in kernel-based Deep Learning models have introduced a new era in medical research. Originally designed for pattern recognition and image processing, Deep Learning models are now applied to survival prognosis of cancer patients. Specifically, Deep Learning versions of the Cox proportional hazards models are trained with transcriptomic data to predict survival outcomes in cancer patients.
Methods: In this study, a broad analysis was performed on TCGA cancers using a variety of Deep Learning-based models, including Cox-nnet, DeepSurv, and a method proposed by our group named AECOX (AutoEncoder with Cox regression network). Concordance index and p-value of the log-rank test are used to evaluate the model performances. Results: All models show competitive results across 12 cancer types. The last hidden layers of the Deep Learning approaches are lower dimensional representations of the input data that can be used for feature reduction and visualization. Furthermore, the prognosis performances reveal a negative correlation between model accuracy, overall survival time statistics, and tumor mutation burden (TMB), suggesting an association among overall survival time, TMB, and prognosis prediction accuracy. Conclusions: Deep Learning-based algorithms demonstrate superior performance compared with traditional machine learning-based models. The cancer prognosis results measured in concordance index are indistinguishable across models but are highly variable across cancers. These findings shed some light on the relationships between patient characteristics and survival learnability on a pan-cancer level.

Item Diagnostic Evidence GAuge of Single cells (DEGAS): a flexible deep transfer learning framework for prioritizing cells in relation to disease
(BMC, 2022-02-01) Johnson, Travis S.; Yu, Christina Y.; Huang, Zhi; Xu, Siwen; Wang, Tongxin; Dong, Chuanpeng; Shao, Wei; Zaid, Mohammad Abu; Huang, Xiaoqing; Wang, Yijie; Bartlett, Christopher; Zhang, Yan; Walker, Brian A.; Liu, Yunlong; Huang, Kun; Zhang, Jie; Medicine, School of Medicine
We propose DEGAS (Diagnostic Evidence GAuge of Single cells), a novel deep transfer learning framework, to transfer disease information from patients to cells.
We call such transferrable information "impressions," which allow individual cells to be associated with disease attributes like diagnosis, prognosis, and response to therapy. Using simulated data and ten diverse single-cell and patient bulk tissue transcriptomic datasets from glioblastoma multiforme (GBM), Alzheimer's disease (AD), and multiple myeloma (MM), we demonstrate the feasibility, flexibility, and broad applications of the DEGAS framework. DEGAS analysis on myeloma single-cell transcriptomics identified PHF19-high myeloma cells associated with progression.

Item Disease-associated astrocytes and microglia markers are upregulated in mice fed high fat diet
(Springer Nature, 2023-08-09) Lin, Li; Basu, Rashmita; Chatterjee, Debolina; Templin, Andrew T.; Flak, Jonathan N.; Johnson, Travis S.; Pharmacology and Toxicology, School of Medicine
High-fat diet (HFD) is associated with Alzheimer's disease (AD) and type 2 diabetes risk, which share features such as insulin resistance and amylin deposition. We examined gene expression associated with astrocytes and microglia since dysfunction of these cell types is implicated in AD pathogenesis. We hypothesize gene expression changes in disease-associated astrocytes (DAA), disease-associated microglia and human Alzheimer's microglia exist in diabetic and obese individuals before AD development. By analyzing bulk RNA-sequencing (RNA-seq) data generated from brains of mice fed HFD and humans with AD, 11 overlapping AD-associated differentially expressed genes were identified, including Kcnj2, C4b and Ddr1, which are upregulated in response to both HFD and AD. Analysis of single cell RNA-seq (scRNA-seq) data indicated C4b is astrocyte specific. Spatial transcriptomics (ST) revealed that C4b colocalizes with Gfap, a known astrocyte marker, and that C4b-expressing cells colocalize with Gad2-expressing cells, i.e., GABAergic neurons, in mouse brain.
There also exists a positive correlation between C4b and Gad2 expression in ST indicating a potential interaction between DAA and GABAergic neurons. These findings provide novel links between the pathogenesis of obesity, diabetes and AD and identify C4b as a potential early marker for AD in obese or diabetic individuals.

Item Enhanced microglial dynamics and paucity of tau seeding in the amyloid plaque microenvironment contributes to cognitive resilience in Alzheimer’s disease
(bioRxiv, 2023-07-28) Jury-Garfe, Nur; You, Yanwen; Martínez, Pablo; Redding-Ochoa, Javier; Karahan, Hande; Johnson, Travis S.; Zhan, Jie; Kim, Jungsu; Troncoso, Juan C.; Lasagna-Reeves, Cristian A.; Anatomy, Cell Biology and Physiology, School of Medicine
Asymptomatic Alzheimer’s disease (AsymAD) describes the status of subjects with preserved cognition but with identifiable Alzheimer’s disease (AD) brain pathology (i.e. Aβ-amyloid deposits, neuritic plaques, and neurofibrillary tangles) at autopsy. In this study, we investigated the postmortem brains of a cohort of AsymAD cases to gain insight into the underlying mechanisms of resilience to AD pathology and cognitive decline. Our results showed that AsymAD cases exhibit an enrichment of core plaques and decreased filamentous plaque accumulation, as well as an increase in microglia surrounding this last type. In AsymAD cases we found less pathological tau aggregation in dystrophic neurites compared to AD and tau seeding activity comparable to healthy control subjects. We used spatial transcriptomics to further characterize the plaque niche and found autophagy, endocytosis, and phagocytosis within the top upregulated pathways in the AsymAD plaque niche, but not in AD. Furthermore, we found ARP2, an actin-based motility protein crucial to initiate the formation of new actin filaments, increased within microglia in the proximity of amyloid plaques in AsymAD.
Our findings support that the amyloid-plaque microenvironment in AsymAD cases is characterized by microglia with highly efficient actin-based cell motility mechanisms and decreased tau seeding compared to AD. These two mechanisms can potentially provide protection against the toxic cascade initiated by Aβ that preserves brain health and slows down the progression of AD pathology.

Item Gene Co-expression Network and Copy Number Variation Analyses Identify Transcription Factors Associated With Multiple Myeloma Progression
(Frontiers, 2019-05-17) Yu, Christina Y.; Xiang, Shunian; Huang, Zhi; Johnson, Travis S.; Zhan, Xiaohui; Han, Zhi; Abu Zaid, Mohammad; Huang, Kun; Medicine, School of Medicine
Multiple myeloma (MM) has two clinical precursor stages of disease: monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM). However, the mechanism of progression is not well understood. Because gene co-expression network analysis is a well-known method for discovering new gene functions and regulatory relationships, we utilized this framework to conduct differential co-expression analysis to identify interesting transcription factors (TFs) in two publicly available datasets. We then used copy number variation (CNV) data from a third public dataset to validate these TFs. First, we identified co-expressed gene modules in two publicly available datasets each containing three conditions: normal, MGUS, and SMM. These modules were assessed for condition-specific gene expression, and then enrichment analysis was conducted on condition-specific modules to identify their biological function and upstream TFs. TFs were assessed for differential gene expression between normal and MM precursors, then validated with CNV analysis to identify candidate genes. Functional enrichment analysis reaffirmed known functional categories in MM pathology, the main one relating to immune function.
Enrichment analysis revealed a handful of differentially expressed TFs between normal and either MGUS or SMM in gene expression and/or CNV. Overall, we identified four genes of interest (MAX, TCF4, ZNF148, and ZNF281) that aid in our understanding of MM initiation and progression.

Item Identifying 1q amplification and PHF19 expressing high-risk cells associated with relapsed/refractory multiple myeloma
(Research Square, 2023-08-16) Johnson, Travis S.; Sudha, Parvathi; Liu, Enze; Blaney, Patrick; Morgan, Gareth; Chopra, Vivek S.; Dos Santos, Cedric; Nixon, Michael; Huang, Kun; Suvannasankha, Attaya; Abu Zaid, Mohammad; Abonour, Rafat; Walker, Brian A.; Biostatistics and Health Data Science, School of Medicine
Multiple Myeloma is an incurable plasma cell malignancy with a poor survival rate that is usually treated with immunomodulatory drugs (iMiDs) and proteasome inhibitors (PIs). The malignant plasma cells quickly become resistant to these agents causing relapse and uncontrolled growth of resistant clones. From whole genome sequencing (WGS) and RNA sequencing (RNA-seq) studies, different high-risk translocation, copy number, mutational, and transcriptional markers have been identified. One of these markers, PHF19, epigenetically regulates cell cycle and other processes and has already been studied using RNA-seq. In this study a massive (325,025 cells and 49 patients) single cell multiomic dataset was generated with jointly quantified ATAC- and RNA-seq for each cell and matched genomic profiles for each patient. We identified an association between one plasma cell subtype with myeloma progression that we have called relapsed/refractory plasma cells (RRPCs). These cells are associated with 1q alterations, TP53 mutations, and higher expression of PHF19.
We also identified downstream regulation of cell cycle inhibitors in these cells, possible regulation of the transcription factor (TF) PBX1 on 1q, and determined that PHF19 may be acting primarily through this subset of cells.

Item LAmbDA: label ambiguous domain adaptation dataset integration reduces batch effects and improves subtype detection
(Oxford Academic, 2019-04) Johnson, Travis S.; Wang, Tongxin; Huang, Zhi; Yu, Christina Y.; Wu, Yi; Han, Yatong; Zhang, Yan; Huang, Kun; Zhang, Jie; Medicine, School of Medicine
Motivation: Rapid advances in single cell RNA sequencing (scRNA-seq) have produced higher-resolution cellular subtypes in multiple tissues and species. Methods are increasingly needed across datasets and species to (i) remove systematic biases, (ii) model multiple datasets with ambiguous labels and (iii) classify cells and map cell type labels. However, most methods only address one of these problems on broad cell types or simulated data using a single model type. It is also important to address higher-resolution cellular subtypes, subtype labels from multiple datasets, models trained on multiple datasets simultaneously and generalizability beyond a single model type. Results: We developed a species- and dataset-independent transfer learning framework (LAmbDA) to train models on multiple datasets (even from different species) and applied our framework on simulated, pancreas and brain scRNA-seq experiments. These models mapped corresponding cell types between datasets with inconsistent cell subtype labels while simultaneously reducing batch effects. We achieved high accuracy in labeling cellular subtypes (weighted accuracy simulated 1 datasets: 90%; simulated 2 datasets: 94%; pancreas datasets: 88% and brain datasets: 66%) using LAmbDA Feedforward 1 Layer Neural Network with bagging. This method achieved higher weighted accuracy in labeling cellular subtypes than two other state-of-the-art methods, scmap and CaSTLe in brain (66% versus 60% and 32%).
Furthermore, it achieved better performance in correctly predicting ambiguous cellular subtype labels across datasets in 88% of test cases compared with CaSTLe (63%), scmap (50%) and MetaNeighbor (50%). LAmbDA is model- and dataset-independent and generalizable to diverse data types representing an advance in biocomputing.

Item Network analysis of pseudogene-gene relationships: from pseudogene evolution to their functional potentials
(Pacific Symposium on Biocomputing, 2018) Johnson, Travis S.; Li, Sihong; Kho, Jonathan R.; Huang, Kun; Zhang, Yan; Medicine, School of Medicine
Pseudogenes are fossil relatives of genes. Pseudogenes have long been thought of as "junk DNAs", since they do not code proteins in normal tissues. Although most of the human pseudogenes do not have noticeable functions, ∼20% of them exhibit transcriptional activity. There has been evidence showing that some pseudogenes adopted functions as lncRNAs and work as regulators of gene expression. Furthermore, pseudogenes can even be "reactivated" in some conditions, such as cancer initiation. Some pseudogenes are transcribed in specific cancer types, and some are even translated into proteins as observed in several cancer cell lines. All the above have shown that pseudogenes could have functional roles or potentials in the genome. Evaluating the relationships between pseudogenes and their gene counterparts could help us reveal the evolutionary path of pseudogenes and associate pseudogenes with functional potentials. It also provides an insight into the regulatory networks involving pseudogenes with transcriptional and even translational activities. In this study, we develop a novel approach integrating graph analysis, sequence alignment and functional analysis to evaluate pseudogene-gene relationships, and apply it to human gene homologs and pseudogenes. We generated a comprehensive set of 445 pseudogene-gene (PGG) families from the original 3,281 gene families (13.56%).
Of these, 438 (98.4% of PGG families, 13.3% of all gene families) were non-trivial (containing more than one pseudogene). Each PGG family contains multiple genes and pseudogenes with high sequence similarity. For each family, we generate a sequence alignment network and phylogenetic trees recapitulating the evolutionary paths. We find evidence supporting the evolutionary history of the olfactory family (both genes and pseudogenes) in human, which also supports the validity of our analysis method. Next, we evaluate these networks with respect to the gene ontology, from which we identify functions enriched in these pseudogene-gene families and infer the functional impact of pseudogenes involved in the networks. This demonstrates the application of our PGG network database in the study of pseudogene function in disease context.