Browsing by Author "Yazdanparast, Aida"
Bi-EB: Empirical Bayesian Biclustering for Multi-Omics Data Integration Pattern Identification among Species
(MDPI, 2022-10-30) Yazdanparast, Aida; Li, Lang; Zhang, Chi; Cheng, Lijun; BioHealth Informatics, School of Informatics and Computing
Although several biclustering algorithms have been studied, few are used for cross-species pattern identification in multi-omics data mining. A fast empirical Bayesian biclustering (Bi-EB) algorithm is developed to detect patterns shared both across integrated omics data and between species. Bi-EB addresses a clinically critical translational question with a bioinformatics strategy: how modules of genotype variation associated with phenotype in cancer cell screening data can be identified, and how these findings can be translated directly to a cancer patient subpopulation. An empirical Bayesian probabilistic interpretation and a ratio strategy are proposed in Bi-EB for the first time to detect pairwise regulation patterns among species and variations across multiple omics at the gene level, such as protein and mRNA. An expectation-maximization (EM) optimization algorithm extracts the foreground concurrent variations from background noise by adjusting two parameters: the bicluster membership probability threshold Ac and the bicluster average probability p. Three simulation experiments and two real mRNA and protein data analyses, conducted on the well-known The Cancer Genome Atlas (TCGA) and Cancer Cell Line Encyclopedia (CCLE), verify that the proposed Bi-EB algorithm significantly improves clustering recovery and relevance accuracy. It outperforms seven other biclustering methods, namely Cheng and Church (CC), xMOTIFs, BiMax, Plaid, Spectral, FABIA, and QUBIC, with a recovery score of 0.98 and a relevance score of 0.99.
In addition, the Bi-EB algorithm is used to determine the shared causality patterns from mRNA to protein between patients and cancer cells in the TCGA and CCLE breast cancer data. A module of clinically well-known treatment target proteins, comprising estrogen receptor (ER), ER (p118), AR, BCL2, cyclin E1, and IGFBP2, is identified in accordance with their mRNA expression variations in the luminal-like subtype. Ten genes, including CCNB1, CDH1, KDR, RAB25, and PRKCA, are found for the first time to maintain high mRNA-protein concordance in both breast cancer patients and cell lines in the basal-like subtype. Bi-EB provides a useful biclustering analysis tool for discovering cross patterns hidden both in multiple data matrices (omics) and across species. Implementing the Bi-EB method in the clinical setting will have a direct impact on conducting translational research guided by cancer cell screening.

Comprehensive comparison of molecular portraits between cell lines and tumors in breast cancer
(BioMed Central, 2016-08-22) Jiang, Guanglong; Zhang, Shijun; Yazdanparast, Aida; Li, Meng; Pawar, Aniruddha Vikram; Liu, Yunlong; Inavolu, Sai Mounika; Cheng, Lijun; Department of Medical and Molecular Genetics, IU School of Medicine
Background: Proper cell models for primary breast tumors have long been a focal point of cancer research. Genomic comparison between cell lines and tumors can reveal their similarities and dissimilarities and help select the right cell model to mimic tumor tissue, so that drug response can be properly evaluated in vitro. This paper presents a comprehensive comparison of copy number variation (CNV), mutation, mRNA expression, and protein expression between 68 breast cancer cell lines and 1375 primary breast tumors. Results: Whole-genome expression arrays showed strong correlations between cells and tumors.
PAM50 gene expression partially differentiated both cells and tumors into four major breast cancer subtypes: Luminal A, Luminal B, HER2amp, and Basal-like. In general, common genomic CNV patterns were observed between tumors and cells across chromosomes. High C > T and C > G substitution rates were observed in both cells and tumors, while the cells had slightly higher somatic mutation rates than tumors. Clustering analysis of protein expression data reasonably recovers the breast cancer subtypes in cell lines and tumors. Although the drug-targeted proteins ER/PR and an interesting mTOR/GSK3/TS2/PDK1/ER_P118 cluster showed consistent patterns between cells and tumors, protein-based correlations between cells and tumors were low. The expression consistency of mRNA versus protein between cell lines and tumors reaches 0.7076. The important breast cancer drug targets ESR1, PGR, HER2, EGFR, and AR show high similarity in mRNA and protein variation in both tumors and cell lines. GATA3 and RP56KB1 are two promising drug targets for breast cancer. A total score derived from the four correlations among the four molecular profiles suggests that the cell lines BT483, T47D, and MDAMB453 have the highest similarity with tumors. Conclusions: The integrated data from these multiple platforms demonstrate both similarities and dissimilarities in molecular features between breast cancer tumors and cell lines. Cell lines mirror some, but not all, of the molecular properties of primary tumors. The study results add evidence to guide the selection of cell line models for breast cancer research.

Integrative Analysis for Identifying Multi-Layer Modules in Precision Medicine
(2020-12) Yazdanparast, Aida; Wu, Huanmei; Li, Lang; Liu, Xiaowen; Liu, Yunlong; Zhang, Chi
Precision medicine aims to employ information from all modalities to develop a comprehensive view of disease progression and administer therapies tailored to the individual patient.
A set of genomic features (gene CNVs, mutations, mRNA expressions, and protein abundances) is associated with each patient, and it is hard to explain phenotypic similarities, such as gene essentiality or variability in drug response, at a single genomic level. Thus, to extract biological principles, it is critical to seek mutual information from multi-dimensional datasets. To address these concerns, we first conduct an integrated mRNA/protein analysis in both breast cancer cell lines and tumors and, most interestingly, in the breast cancer subtypes. We identified cell lines that provide optimum heterogeneity models for studying the underlying biological processes of tumors. Our systematic observation across multi-omics data identifies distinct subgroups of cancer cells and patients. Based on this identified signal transduction between mRNA and RPPA, we developed a biclustering model to characterize key genetic alterations that are shared by both cancer cell lines and patients. We integrated omics data types including copy number variations, transcriptome, and proteome. The model, Bi-EB, adopts a data-driven statistical strategy, using the expectation-maximization (EM) algorithm to extract the foreground bicluster pattern from background noise in an iterative search. Using the Bi-EB algorithm, we selected translational gene sets that are characterized by highly correlated molecular profiles among RNA and proteins. To further investigate cell lines and tissue in breast cancer, we explore the relationship between genomic features and phenotypic factors. Using in vitro/in vivo drug screening data, we adopt the partial least squares regression method and develop a multi-modular approach to predict anticancer therapy benefits for ER-negative breast cancer patients. The joint multi-dimensional modules identified here provide new insights into the molecular mechanisms of drugs and cancer treatment.