Browsing by Subject "computational biology"
Item: AIscEA: unsupervised integration of single-cell gene expression and chromatin accessibility via their biological consistency (Oxford, 2022-12-01)
Jafari, Elham; Johnson, Travis; Wang, Yue; Liu, Yunlong; Huang, Kun; Wang, Yijie; Biostatistics and Health Data Science, School of Medicine
Motivation: The integrative analysis of single-cell gene expression and chromatin accessibility measurements is essential for revealing gene regulation, but it is one of the key challenges in computational biology. Gene expression and chromatin accessibility are measurements from different modalities, and no common features can be directly used to guide integration. Current state-of-the-art methods lack practical solutions for finding heterogeneous clusters. However, previous methods might not generate reliable results when cluster heterogeneity exists. More importantly, current methods lack an effective way to select hyper-parameters under an unsupervised setting. Therefore, applying computational methods to integrate single-cell gene expression and chromatin accessibility measurements remains difficult.
Results: We introduce AIscEA (Alignment-based Integration of single-cell gene Expression and chromatin Accessibility), a computational method that integrates single-cell gene expression and chromatin accessibility measurements using their biological consistency. AIscEA first defines a ranked similarity score to quantify the biological consistency between cell clusters across measurements. AIscEA then uses the ranked similarity score and a novel permutation test to identify cluster alignment across measurements. AIscEA further utilizes graph alignment for the aligned cell clusters to align the cells across measurements. We compared AIscEA with the competing methods on several benchmark datasets and demonstrated that AIscEA is highly robust to the choice of hyper-parameters and can better handle the cluster heterogeneity problem.
Furthermore, AIscEA significantly outperforms the state-of-the-art methods when integrating real-world SNARE-seq and scMultiome-seq datasets in terms of integration accuracy.
Availability and implementation: AIscEA is available at https://figshare.com/articles/software/AIscEA_zip/21291135 on FigShare and at https://github.com/elhaam/AIscEA on GitHub.

Item: Highlights from the Fourth International Society for Computational Biology Student Council Symposium at the Sixteenth Annual International Conference on Intelligent Systems for Molecular Biology (2008-10)
Peixoto, Lucia; Gehlenborg, Nils; Janga, Sarath Chandra
In this meeting report we give an overview of the talks and presentations from the Fourth International Society for Computational Biology (ISCB) Student Council Symposium held as part of the annual Intelligent Systems for Molecular Biology (ISMB) conference in Toronto, Canada. Furthermore, we detail the role of the Student Council (SC) as an international student body in organizing this symposium series in the context of large, international conferences.

Item: Highlights from the Third International Society for Computational Biology Student Council Symposium at the Fifteenth Annual International Conference on Intelligent Systems for Molecular Biology (2007-11)
Gehlenborg, Nils; Corpas, Manuel; Janga, Sarath Chandra
In this meeting report we give an overview of the 3rd International Society for Computational Biology Student Council Symposium.
Furthermore, we explain the role of the Student Council and the symposium series in the context of large, international conferences.

Item: Information Theory in Computational Biology: Where We Stand Today (MDPI, 2020-06)
Chanda, Pritam; Costa, Eduardo; Hu, Jie; Sukumar, Shravan; Van Hemert, John; Walia, Rasna; Computer and Information Science, School of Science
"A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to address problems in data compression and communication over (noisy) communication channels. Since then, the concepts and ideas developed in Shannon's work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have played a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic information-theory-based concepts and describe their key applications in multiple major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.

Item: regSNPs-ASB: A Computational Framework for Identifying Allele-Specific Transcription Factor Binding From ATAC-seq Data (Frontiers, 2020-07-29)
Xu, Siwen; Feng, Weixing; Lu, Zixiao; Yu, Christina Y.; Shao, Wei; Nakshatri, Harikrishna; Reiter, Jill L.; Gao, Hongyu; Chu, Xiaona; Wang, Yue; Liu, Yunlong; Medical and Molecular Genetics, School of Medicine
Expression quantitative trait loci (eQTL) analysis is useful for identifying genetic variants correlated with gene expression; however, it cannot distinguish between causal and nearby non-functional variants.
Because the majority of disease-associated SNPs are located in regulatory regions, they can impact allele-specific binding (ASB) of transcription factors and result in differential expression of the target gene alleles. In this study, our aim was to identify functional single-nucleotide polymorphisms (SNPs) that alter transcriptional regulation and thus potentially impact cellular function. Here, we present regSNPs-ASB, a generalized linear model-based approach to identify regulatory SNPs that are located in transcription factor binding sites. The input for this model includes ATAC-seq (assay for transposase-accessible chromatin with high-throughput sequencing) raw read counts from heterozygous loci, where differential transposase-cleavage patterns between two alleles indicate preferential transcription factor binding to one of the alleles. Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression. By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genotype-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also in eQTL regions. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In summary, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data.
This new computational tool will enable efficient prioritization of genetic variants identified as eQTL for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the underlying molecular mechanisms of disease and the eventual development of potential therapeutic targets.
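
The regSNPs-ASB abstract above describes detecting allelic imbalance from read counts at heterozygous loci. As a rough illustration of that idea only, the sketch below applies an exact two-sided binomial test to the reference and alternate read counts at a single site. This is a simpler stand-in, not the generalized linear model that regSNPs-ASB uses; the function name `binomial_two_sided_p`, the null proportion of 0.5, and the read counts are all hypothetical and do not come from the paper.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Assumption: under the null, each read comes from the reference allele
    with probability p_null (0.5 means no allelic preference).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical (reference, alternate) read counts; not data from the paper.
sites = {"snp_balanced": (20, 20), "snp_skewed": (30, 5)}
for name, (ref, alt) in sites.items():
    print(name, binomial_two_sided_p(ref, alt))
```

A small p-value for a site suggests that reads favor one allele more than chance alone would explain. A real analysis would also need to handle reference-mapping bias, overdispersion, and multiple-testing correction, none of which this sketch attempts.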