Browsing by Author "Cheng, Jun"
Now showing 1 - 10 of 12
Item BrcaSeg: A Deep Learning Approach for Tissue Quantification and Genomic Correlations of Histopathological Images
(Elsevier, 2021) Lu, Zixiao; Zhan, Xiaohui; Wu, Yi; Cheng, Jun; Shao, Wei; Ni, Dong; Han, Zhi; Zhang, Jie; Feng, Qianjin; Huang, Kun; Medicine, School of Medicine
Epithelial and stromal tissues are components of the tumor microenvironment and play a major role in tumor initiation and progression. Distinguishing stroma from epithelial tissues is critically important for spatial characterization of the tumor microenvironment. Here, we propose BrcaSeg, an image analysis pipeline based on a convolutional neural network (CNN) model to classify epithelial and stromal regions in whole-slide hematoxylin and eosin (H&E) stained histopathological images. The CNN model is trained using well-annotated breast cancer tissue microarrays and validated with images from The Cancer Genome Atlas (TCGA) Program. BrcaSeg achieves a classification accuracy of 91.02%, which outperforms other state-of-the-art methods. Using this model, we generate pixel-level epithelial/stromal tissue maps for 1000 TCGA breast cancer slide images that are paired with gene expression data. We subsequently estimate the epithelial and stromal ratios and perform correlation analysis to model the relationship between gene expression and tissue ratios. Gene Ontology (GO) enrichment analyses of genes that are highly correlated with tissue ratios suggest that the same tissue is associated with similar biological processes in different breast cancer subtypes, whereas each subtype also has its own idiosyncratic biological processes governing the development of these tissues. Taken all together, our approach can lead to new insights in exploring relationships between image-based phenotypes and their underlying genomic events and biological processes for all types of solid tumors.
BrcaSeg can be accessed at https://github.com/Serian1992/ImgBio.

Item Computational analysis of pathological images enables a better diagnosis of TFE3 Xp11.2 translocation renal cell carcinoma
(Nature Research, 2020) Cheng, Jun; Han, Zhi; Mehra, Rohit; Shao, Wei; Cheng, Michael; Feng, Qianjin; Ni, Dong; Huang, Kun; Cheng, Liang; Zhang, Jie; Medicine, School of Medicine
TFE3 Xp11.2 translocation renal cell carcinoma (TFE3-RCC) generally progresses more aggressively compared with other RCC subtypes, but it is challenging to diagnose TFE3-RCC by traditional visual inspection of pathological images. In this study, we collect hematoxylin and eosin-stained histopathology whole-slide images of 74 TFE3-RCC cases (the largest cohort to date) and 74 clear cell RCC cases (ccRCC, the most common RCC subtype) with matched gender and tumor grade. An automatic computational pipeline is implemented to extract image features. A comparative study identifies 52 image features with significant differences between TFE3-RCC and ccRCC. Machine learning models are built to distinguish TFE3-RCC from ccRCC. Tests of the classification models on an external validation set reveal high accuracy with areas under ROC curve ranging from 0.842 to 0.894. Our results suggest that automatically derived image features can capture subtle morphological differences between TFE3-RCC and ccRCC and contribute to a potential guideline for TFE3-RCC diagnosis.

Item Computational Image Analysis Identifies Histopathological Image Features Associated With Somatic Mutations and Patient Survival in Gastric Adenocarcinoma
(Frontiers Media, 2021-03-31) Cheng, Jun; Liu, Yuting; Huang, Wei; Hong, Wenhui; Wang, Lingling; Zhan, Xiaohui; Han, Zhi; Ni, Dong; Huang, Kun; Zhang, Jie; Medicine, School of Medicine
Computational analysis of histopathological images can identify sub-visual objective image features that may not be visually distinguishable by human eyes, and hence provides better modeling of disease phenotypes.
This study aims to investigate whether specific image features are associated with somatic mutations and patient survival in gastric adenocarcinoma (sample size = 310). An automated image analysis pipeline was developed to extract quantitative morphological features from H&E stained whole-slide images. We found that four frequently somatically mutated genes (TP53, ARID1A, OBSCN, and PIK3CA) were significantly associated with tumor morphological changes. A prognostic model built on the image features significantly stratified patients into low-risk and high-risk groups (log-rank test p-value = 2.6e-4). Multivariable Cox regression showed that the model-predicted risk index was an additional prognostic factor besides tumor grade and stage. Gene ontology enrichment analysis showed that the genes whose expression correlated most strongly with the contributing features in the prognostic model were enriched in biological processes such as cell cycle and muscle contraction. These results demonstrate that histopathological image features can reflect underlying somatic mutations and identify high-risk patients who may benefit from more precise treatment regimens. Both the image features and pipeline are highly interpretable to enable translational applications.

Item Correlation Analysis of Histopathology and Proteogenomics Data for Breast Cancer
(American Society for Biochemistry and Molecular Biology, 2019-08-09) Zhan, Xiaohui; Cheng, Jun; Huang, Zhi; Han, Zhi; Helm, Bryan; Liu, Xiaowen; Zhang, Jie; Wang, Tian-Fu; Ni, Dong; Huang, Kun; Medicine, School of Medicine
Tumors are heterogeneous tissues with different types of cells such as cancer cells, fibroblasts, and lymphocytes. Although the morphological features of tumors are critical for cancer diagnosis and prognosis, the underlying molecular events and genes for tumor morphology are far from being clear.
With the advancement in computational pathology and the accumulation of large numbers of cancer samples with matched molecular and histopathology data, researchers can carry out integrative analysis to investigate this issue. In this study, we systematically examine the relationships between morphological features and various molecular data in breast cancers. Specifically, we identified 73 breast cancer patients from the TCGA and CPTAC projects with matched whole-slide images, RNA-seq, and proteomic data. By calculating 100 different morphological features and correlating them with the transcriptomic and proteomic data, we inferred four major biological processes associated with various interpretable morphological features. These processes include metabolism, cell cycle, immune response, and extracellular matrix development, which are all hallmarks of cancers, and the associated morphological features are related to the area, density, and shapes of epithelial cells, fibroblasts, and lymphocytes. In addition, protein-specific biological processes were inferred solely from proteomic data, suggesting the importance of proteomic data in obtaining a holistic understanding of the molecular basis for tumor tissue morphology. Furthermore, survival analysis yielded specific morphological features related to patient prognosis, which have a strong association with important molecular events based on our analysis. Overall, our study demonstrated the power of integrating multiple types of biological data for cancer samples in generating new hypotheses as well as identifying potential biomarkers predicting patient outcome.
Future work includes causal analysis to identify key regulators for cancer tissue development and validating the findings using more independent data sets.

Item Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations
(BMC, 2020) Huang, Zhi; Johnson, Travis S.; Han, Zhi; Helm, Bryan; Cao, Sha; Zhang, Chi; Salama, Paul; Rizkalla, Maher; Yu, Christina Y.; Cheng, Jun; Xiang, Shunian; Zhan, Xiaohui; Zhang, Jie; Huang, Kun; Medicine, School of Medicine
Background: Recent advances in kernel-based Deep Learning models have introduced a new era in medical research. Originally designed for pattern recognition and image processing, Deep Learning models are now applied to survival prognosis of cancer patients. Specifically, Deep Learning versions of the Cox proportional hazards models are trained with transcriptomic data to predict survival outcomes in cancer patients. Methods: In this study, a broad analysis was performed on TCGA cancers using a variety of Deep Learning-based models, including Cox-nnet, DeepSurv, and a method proposed by our group named AECOX (AutoEncoder with Cox regression network). Concordance index and p-value of the log-rank test are used to evaluate the model performances. Results: All models show competitive results across 12 cancer types. The last hidden layers of the Deep Learning approaches are lower dimensional representations of the input data that can be used for feature reduction and visualization. Furthermore, the prognosis performances reveal a negative correlation between model accuracy, overall survival time statistics, and tumor mutation burden (TMB), suggesting an association among overall survival time, TMB, and prognosis prediction accuracy. Conclusions: Deep Learning based algorithms outperform traditional machine learning based models. The cancer prognosis results measured in concordance index are indistinguishable across models but are highly variable across cancers.
These findings shed some light on the relationships between patient characteristics and survival learnability on a pan-cancer level.

Item Differentiation between immune checkpoint inhibitor‐related and radiation pneumonitis in lung cancer by CT radiomics and machine learning
(Wiley, 2022) Cheng, Jun; Pan, Yi; Huang, Wei; Huang, Kun; Cui, Yanhai; Hong, Wenhui; Wang, Lingling; Ni, Dong; Tan, Peixin; Biostatistics, School of Public Health
Purpose: Consolidation immunotherapy after completion of chemoradiotherapy has become the standard of care for unresectable locally advanced non-small cell lung cancer and can induce potentially severe and life-threatening adverse events, including both immune checkpoint inhibitor-related pneumonitis (CIP) and radiation pneumonitis (RP), which are very challenging for radiologists to diagnose. Differentiating between CIP and RP has significant implications for clinical management, such as the treatment of pneumonitis and the decision to continue or restart immunotherapy. The purpose of this study is to differentiate between CIP and RP by a CT radiomics approach. Methods: We retrospectively collected the CT images and clinical information of patients with pneumonitis who received immune checkpoint inhibitor (ICI) only (n = 28), radiotherapy (RT) only (n = 31), and ICI+RT (n = 14). Three kinds of radiomic features (intensity histogram, gray-level co-occurrence matrix (GLCM) based, and bag-of-words features), which characterize tissue texture at different scales, were extracted from CT images. Classification models, including logistic regression, random forest, and linear SVM, were first developed and tested in patients who received ICI or RT only with 10-fold cross validation and further tested in patients who received ICI+RT using clinicians’ diagnosis as a reference.
Results: Using 10-fold cross validation, the classification models built on the intensity histogram features, GLCM based features, and bag-of-words features achieved an area under curve (AUC) of 0.765, 0.848, and 0.937, respectively. The best model was then applied to the patients receiving combination treatment, achieving an AUC of 0.896. Conclusions: This study demonstrates the promising potential of radiomic analysis of CT images for differentiating between CIP and RP in lung cancer, which could be a useful tool to attribute the cause of pneumonitis in patients who receive both ICI and RT.

Item Editorial: Computational pathology for precision diagnosis, treatment, and prognosis of cancer
(Frontiers Media, 2023-06-06) Cheng, Jun; Huang, Kun; Xu, Jun; Biostatistics and Health Data Science, School of Medicine

Item Identification of Topological Features in Renal Tumor Microenvironment Associated with Patient Survival
(Oxford, 2018-03) Cheng, Jun; Mo, Xiaokui; Wang, Xusheng; Parwani, Anil; Feng, Qianjin; Huang, Kun; Medicine, School of Medicine
Motivation: Cancer is a highly heterogeneous disease: tumor progression is not only achieved by unlimited growth of the tumor cells, but also supported, stimulated, and nurtured by the microenvironment around them. However, traditional qualitative and/or semi-quantitative parameters obtained by pathologist’s visual examination have very limited capability to capture this interaction between the tumor and its microenvironment. With the advent of digital pathology, computerized image analysis may provide better tumor characterization and give new insights into this problem. Results: We propose a novel bioimage informatics pipeline for automatically characterizing the topological organization of different cell patterns in the tumor microenvironment.
We apply this pipeline to the only publicly available large histopathology image dataset for a cohort of 190 patients with papillary renal cell carcinoma obtained from The Cancer Genome Atlas project. Experimental results show that the proposed topological features can successfully stratify early- and middle-stage patients with distinct survival, and show superior performance to traditional clinical features and cellular morphological and intensity features. The proposed features not only provide new insights into the topological organizations of cancers, but also can be integrated with genomic data in future studies to develop new integrative biomarkers.

Item Integrative analysis based on survival associated co-expression gene modules for predicting Neuroblastoma patients' survival time
(Biomed Central, 2019-02-13) Han, Yatong; Ye, Xiufen; Cheng, Jun; Zhang, Siyuan; Feng, Weixing; Han, Zhi; Zhang, Jie; Huang, Kun; Medicine, School of Medicine
BACKGROUND: More than 90% of neuroblastoma patients in the low-risk group are cured, whereas fewer than 50% of those with high-risk disease can be cured. Since the high-risk patients still have poor outcomes, we need more accurate stratification to establish an individualized precise treatment plan for the patients to improve the long-term survival rate. RESULTS: We focus on extracting features and providing a workflow to improve survival prediction for neuroblastoma patients. With a workflow for gene co-expression network (GCN) mining in microarray and RNA-Seq datasets, we extracted molecular features from each co-expressed module and summarized them into eigengenes. Then we adopted the lasso-regularized Cox proportional hazards model to select the most informative eigengene features regarding association to the risk of metastasis. Nine eigengenes were selected which show strong association with patient survival prognosis.
All nine corresponding gene modules also have highly enriched biological functions or cytoband locations. Three of them are modules unique to RNA-Seq data, which complement the modules from microarray data in terms of survival prognosis. We then merged all eigengenes from these unique modules and used an integrative method called Similarity Network Fusion to test the prognostic power of these eigengenes. The prognostic accuracies are significantly improved as compared to using all eigengenes, and a subgroup of patients with a very poor survival rate was identified. CONCLUSIONS: We first compared GCNs mined from microarray and RNA-seq data. We discovered that each data modality yields unique GCNs, which are enriched with clear biological functions. We then performed module-unique analysis and used a lasso-Cox model to select survival-associated eigengenes. Integration of unique and survival-associated eigengenes from both data types provides complementary information that leads to more accurate survival prognosis.

Item Integrative Analysis of Histopathological Images and Genomic Data Predicts Clear Cell Renal Cell Carcinoma Prognosis
(AACR, 2017-11) Cheng, Jun; Zhang, Jie; Han, Yatong; Wang, Xusheng; Ye, Xiufen; Meng, Yuebo; Parwani, Anil; Han, Zhi; Feng, Qianjin; Huang, Kun; Medicine, School of Medicine
In cancer, both histopathologic images and genomic signatures are used for diagnosis, prognosis, and subtyping. However, combining histopathologic images with genomic data for predicting prognosis, as well as the relationships between them, has rarely been explored. In this study, we present an integrative genomics framework for constructing a prognostic model for clear cell renal cell carcinoma. We used patient data from The Cancer Genome Atlas (n = 410), extracting hundreds of cellular morphologic features from digitized whole-slide images and eigengenes from functional genomics data to predict patient outcome.
The risk index generated by our model correlated strongly with survival, outperforming predictions based on considering morphologic features or eigengenes separately. The predicted risk index also effectively stratified patients in early-stage (stage I and stage II) tumors, whereas no significant survival difference was observed using staging alone. The prognostic value of our model was independent of other known clinical and molecular prognostic factors for patients with clear cell renal cell carcinoma. Overall, this workflow and the shared software code provide building blocks for applying similar approaches in other cancers.