Browsing by Author "Banerjee, Imon"
Now showing items 1-10 of 15
Item: AI recognition of patient race in medical imaging: a modelling study (Elsevier, 2022-06)
Authors: Gichoya, Judy Wawira; Banerjee, Imon; Bhimireddy, Ananth Reddy; Burns, John L.; Celi, Leo Anthony; Chen, Li-Ching; Correa, Ramon; Dullerud, Natalie; Ghassemi, Marzyeh; Huang, Shih-Cheng; Kuo, Po-Chih; Lungren, Matthew P.; Palmer, Lyle J.; Price, Brandon J.; Purkayastha, Saptarshi; Pyrros, Ayis T.; Oakden-Rayner, Lauren; Okechukwu, Chima; Seyyed-Kalantari, Laleh; Trivedi, Hari; Wang, Ryan; Zaiman, Zachary; Zhang, Haoran
BioHealth Informatics, School of Informatics and Computing

Background: Previous studies in medical imaging have shown disparate abilities of artificial intelligence (AI) to detect a person's race, yet there is no known correlation for race on medical imaging that would be obvious to human experts when interpreting the images. We aimed to conduct a comprehensive evaluation of the ability of AI to recognise a patient's racial identity from medical images.
Methods: Using private (Emory CXR, Emory Chest CT, Emory Cervical Spine, and Emory Mammogram) and public (MIMIC-CXR, CheXpert, National Lung Cancer Screening Trial, RSNA Pulmonary Embolism CT, and Digital Hand Atlas) datasets, we first quantified the performance of deep learning models in detecting race from medical images, including the ability of these models to generalise to external environments and across multiple imaging modalities. Second, we assessed possible confounding by anatomic and phenotypic population features, both by testing the ability of these hypothesised confounders to detect race in isolation using regression models and by re-evaluating the deep learning models on datasets stratified by these hypothesised confounding variables. Last, by exploring the effect of image corruptions on model performance, we investigated the underlying mechanism by which AI models can recognise race.
Findings: In our study, we show that standard AI deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities, which was sustained under external validation conditions (x-ray imaging [area under the receiver operating characteristic curve (AUC) range 0·91-0·99], chest CT imaging [0·87-0·96], and mammography [0·81]). We also showed that this detection is not due to proxies or imaging-related surrogate covariates for race (eg, performance of possible confounders: body-mass index [AUC 0·55], disease distribution [0·61], and breast density [0·61]). Finally, we provide evidence that the models' ability to predict race persisted over all anatomical regions and frequency spectrums of the images, suggesting that efforts to control this behaviour, when it is undesirable, will be challenging and demand further study.
Interpretation: The results from our study emphasise that the ability of AI deep learning models to predict self-reported race is not itself the issue of importance. However, our finding that AI can accurately predict self-reported race, even from corrupted, cropped, and noised medical images, often when clinical experts cannot, creates an enormous risk for all model deployments in medical imaging.
Funding: National Institute of Biomedical Imaging and Bioengineering, MIDRC grant of the National Institutes of Health, US National Science Foundation, National Library of Medicine of the National Institutes of Health, and Taiwan Ministry of Science and Technology.
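Editorial note: the external-validation AUCs quoted above reduce to a simple recipe — score a model frozen on internal data against a held-out external cohort and report the area under the ROC curve. A minimal scikit-learn sketch of that recipe follows; the synthetic features, the logistic-regression stand-in model, and all variable names are illustrative assumptions, not the study's code.

```python
# Minimal sketch: external validation of a frozen binary classifier by AUC.
# All data here is synthetic; in the study, features would be image-derived
# and labels would be self-reported race from an external site.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))                        # stand-in image features
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

X_int, X_ext, y_int, y_ext = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_int, y_int)  # "internal" training
scores = model.predict_proba(X_ext)[:, 1]                    # score the "external" split
print(f"external AUC: {roc_auc_score(y_ext, scores):.3f}")   # headline validation metric
```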
Item: Current Clinical Applications of Artificial Intelligence in Radiology and Their Best Supporting Evidence (Elsevier, 2020-11)
Authors: Tariq, Amara; Purkayastha, Saptarshi; Padmanaban, Geetha Priya; Krupinski, Elizabeth; Trivedi, Hari; Banerjee, Imon; Gichoya, Judy W.
BioHealth Informatics, School of Informatics and Computing

Purpose: Despite tremendous gains from deep learning and the promise of artificial intelligence (AI) in medicine to improve diagnosis and save costs, a large translational gap remains in implementing and using AI products in real-world clinical situations. Adoption of standards such as Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis, Consolidated Standards of Reporting Trials, and the Checklist for Artificial Intelligence in Medical Imaging is increasing to improve the peer-review process and reporting of AI tools. However, no such standards exist for product-level review.
Methods: A review of clinical trials showed a paucity of evidence for radiology AI products; thus, the authors developed a 10-question assessment tool for reviewing AI products, with an emphasis on their validation and result dissemination. The assessment tool was applied to commercial and open-source algorithms used for diagnosis to extract evidence on the clinical utility of the tools.
Results: There is limited technical information on methodologies for FDA-approved algorithms compared with open-source products, likely because of intellectual property concerns. Furthermore, FDA-approved products use much smaller data sets than open-source AI tools, because the terms of use of public data sets are limited to academic and noncommercial entities, which precludes their use in commercial products.
Conclusions: Overall, this study reveals a broad spectrum of maturity and clinical use of AI products, but a large gap exists in exploring the actual performance of AI tools in clinical practice.

Item: CVAD - An unsupervised image anomaly detector (Elsevier, 2022-02)
Authors: Guo, Xiaoyuan; Gichoya, Judy Wawira; Purkayastha, Saptarshi; Banerjee, Imon
BioHealth Informatics, School of Informatics and Computing

Detecting out-of-distribution samples for image applications plays an important role in safeguarding the reliability of machine learning model deployment. In this article, we developed a software tool to support our OOD detector CVAD, a self-supervised Cascade Variational autoencoder-based Anomaly Detector, which can easily be applied to various image applications without any assumptions. The corresponding open-source software is published to support public research and tool usage.

Item: CVAD: A generic medical anomaly detector based on Cascade VAE (arXiv, 2021)
Authors: Guo, Xiaoyuan; Gichoya, Judy Wawira; Purkayastha, Saptarshi; Banerjee, Imon
BioHealth Informatics, School of Informatics and Computing

Detecting out-of-distribution (OOD) samples in medical imaging plays an important role in downstream medical diagnosis. However, existing OOD detectors are demonstrated on natural images with inter-class variation and have difficulty generalizing to medical images. The key issue is the granularity of OOD data in the medical domain, where intra-class OOD samples are predominant. We focus on the generalizability of OOD detection for medical images and propose a self-supervised Cascade Variational autoencoder-based Anomaly Detector (CVAD). We use a cascade architecture of variational autoencoders, which combines latent representations at multiple scales before feeding them to a discriminator that distinguishes the OOD data from the in-distribution (ID) data. Finally, both the reconstruction error and the OOD probability predicted by the binary discriminator are used to determine the anomalies. We compare performance with state-of-the-art deep learning models to demonstrate our model's efficacy on various open-access medical imaging datasets for both intra- and inter-class OOD. Further extensive results on common natural image datasets show our model's effectiveness and generalizability.
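Editorial note: the CVAD abstracts above describe a two-part anomaly score — a VAE's reconstruction error combined with a binary discriminator's OOD probability. The PyTorch sketch below illustrates that combination under simplifying assumptions (a single latent scale rather than the paper's multi-scale cascade, untrained toy networks, and a hypothetical weighting `alpha`); it is not the released CVAD code.

```python
# Minimal sketch of a CVAD-style anomaly score: weighted sum of VAE
# reconstruction error and a discriminator's OOD probability.
# Architectures, shapes, and `alpha` are illustrative assumptions.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, dim=784, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent)
        self.logvar = nn.Linear(256, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation trick
        return self.dec(z), mu, logvar

vae = TinyVAE()
disc = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())

def anomaly_score(x, alpha=0.5):
    recon, _, _ = vae(x)
    rec_err = ((x - recon) ** 2).mean(dim=1)      # per-sample reconstruction error
    p_ood = disc(x).squeeze(1)                    # discriminator's OOD probability
    return alpha * rec_err + (1 - alpha) * p_ood  # combined anomaly score

x = torch.randn(4, 784)                           # stand-in image batch
print(anomaly_score(x))
```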
Item: A DICOM Framework for Machine Learning and Processing Pipelines Against Real-time Radiology Images (SpringerLink, 2021-08-17)
Authors: Kathiravelu, Pradeeban; Sharma, Puneet; Sharma, Ashish; Banerjee, Imon; Trivedi, Hari; Purkayastha, Saptarshi; Sinha, Priyanshu; Cadrin-Chenevert, Alexandre; Safdar, Nabile; Wawira Gichoya, Judy
BioHealth Informatics, School of Informatics and Computing

Real-time execution of machine learning (ML) pipelines on radiology images is difficult due to limited computing resources in clinical environments, whereas running them in research clusters requires efficient data transfer capabilities. We developed Niffler, an open-source Digital Imaging and Communications in Medicine (DICOM) framework that enables ML and processing pipelines in research clusters by efficiently retrieving images from the hospitals' PACS and extracting the metadata from the images. We deployed Niffler at our institution (Emory Healthcare, the largest healthcare network in the state of Georgia) and retrieved data from 715 scanners spanning 12 sites, up to 350 GB/day, continuously in real time as a DICOM data stream over the past 2 years. We also used Niffler to retrieve images in bulk on demand, based on user-provided filters, to facilitate several research projects. This paper presents the architecture and three such use cases of Niffler. First, we executed an IVC filter detection and segmentation pipeline on abdominal radiographs in real time, which classified 989 test images with an accuracy of 96.0%. Second, we applied the Niffler Metadata Extractor to understand the operational efficiency of individual MRI systems based on calculated metrics. We benchmarked the accuracy of the calculated exam time windows by comparing Niffler against the Clinical Data Warehouse (CDW). Niffler accurately identified the scanners' examination timeframes and idling times, whereas the CDW falsely depicted several exam overlaps due to human errors. Third, with metadata extracted from the images by Niffler, we identified scanners with misconfigured time and reconfigured five scanners. Our evaluations highlight how Niffler enables real-time ML and processing pipelines in a research cluster.
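Editorial note: the scanner-efficiency use case above relies on pulling acquisition metadata out of DICOM headers. The sketch below shows the general idea with pydicom, a common open-source DICOM library; it is not Niffler's API, and the file path and field selection are hypothetical.

```python
# Minimal sketch of DICOM header metadata extraction with pydicom,
# illustrating the kind of fields an extractor would pull to derive
# per-scanner exam time windows. Not Niffler's actual API;
# "study.dcm" is a hypothetical file path.
import pydicom

ds = pydicom.dcmread("study.dcm", stop_before_pixels=True)  # headers only, skip pixel data

record = {
    "StudyInstanceUID": ds.get("StudyInstanceUID"),
    "Modality": ds.get("Modality"),
    "StationName": ds.get("StationName"),          # identifies the scanner
    "AcquisitionDate": ds.get("AcquisitionDate"),
    "AcquisitionTime": ds.get("AcquisitionTime"),
}
print(record)  # exam timeframes and idle gaps can be computed from these fields
```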
Item: Margin-Aware Intra-Class Novelty Identification for Medical Images (SPIE, 2022-02)
Authors: Guo, Xiaoyuan; Gichoya, Judy Wawira; Purkayastha, Saptarshi; Banerjee, Imon
BioHealth Informatics, School of Informatics and Computing

Purpose: Existing anomaly detection methods focus on detecting interclass variations, while medical image novelty identification is more challenging in the presence of intraclass variations. For example, a model trained with normal chest x-rays and common lung abnormalities is expected to discover and flag idiopathic pulmonary fibrosis, a rare lung disease unseen during training. The nuances of intraclass variations and the lack of relevant training data in medical image analysis pose great challenges for existing anomaly detection methods.
Approach: We address these challenges by proposing a hybrid model, transformation-based embedding learning for novelty detection (TEND), which combines the merits of the classifier-based and the AutoEncoder (AE)-based approaches. Training TEND consists of two stages. In the first stage, we learn in-distribution embeddings with an AE via unsupervised reconstruction. In the second stage, we learn a discriminative classifier to distinguish in-distribution data from the transformed counterparts. Additionally, we propose a margin-aware objective to pull in-distribution data into a hypersphere while pushing away the transformed data. Eventually, the weighted sum of the class probability and the distance to the margin constitutes the anomaly score.
Results: Extensive experiments are performed on three public medical image datasets with the one-vs-rest setup (one class as in-distribution data and the rest as intraclass out-of-distribution data) and the rest-vs-one setup. Additional experiments on generated intraclass out-of-distribution data with unused transformations are implemented on the datasets. The quantitative results show competitive performance compared with state-of-the-art approaches. Qualitative examples further demonstrate the effectiveness of TEND.
Conclusion: Our anomaly detection model TEND can effectively identify the challenging intraclass out-of-distribution medical images in an unsupervised fashion. It can be applied to discover unseen medical image classes and serve as abnormal-data screening for downstream medical tasks. The corresponding code is available at https://github.com/XiaoyuanGuo/TEND_MedicalNoveltyDetection.
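Editorial note: TEND's final scoring rule, as described above, is a weighted sum of a class probability and a distance to a hypersphere margin. The sketch below illustrates that rule with untrained toy networks; the encoder, margin value, and weighting `lam` are illustrative assumptions, not the released TEND implementation (see the repository linked above for the real one).

```python
# Minimal sketch of a TEND-style anomaly score: weighted sum of
# (1 - in-distribution probability) and distance beyond a hypersphere
# margin in embedding space. All names and values are illustrative.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))  # stage-1 encoder
clf = nn.Sequential(nn.Linear(16, 1), nn.Sigmoid())                      # stage-2 classifier

center = torch.zeros(16)  # hypersphere centre for in-distribution embeddings
margin = 1.0              # hypersphere radius; learned or tuned in practice

def anomaly_score(x, lam=0.5):
    z = embed(x)
    p_in = clf(z).squeeze(1)                    # probability of being in-distribution
    dist = (z - center).norm(dim=1) - margin    # signed distance outside the margin
    return lam * (1 - p_in) + (1 - lam) * dist.clamp(min=0)

print(anomaly_score(torch.randn(4, 784)))
```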
Item: MedShift: Automated Identification of Shift Data for Medical Image Dataset Curation (IEEE, 2023)
Authors: Guo, Xiaoyuan; Wawira Gichoya, Judy; Trivedi, Hari; Purkayastha, Saptarshi; Banerjee, Imon
Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering

Automated curation of noisy external data in the medical domain has long been in demand, as AI technologies need to be validated on various sources with clean annotated data. To curate a high-quality dataset, identifying variance between the internal and external sources is a fundamental step, as the data distributions from different sources can vary significantly and subsequently affect the performance of AI models. The primary challenges for detecting data shifts are (1) restricted access to private data across healthcare institutions for manual detection, and (2) the lack of automated approaches to learn efficient shift-data representations without training samples. To overcome these problems, we propose an automated pipeline called MedShift to detect the top-level shift samples and evaluate the significance of shift data without sharing data between the internal and external organizations. MedShift employs unsupervised anomaly detectors to learn the internal distribution and identify samples showing significant shiftness in external datasets, and compares their performance. To quantify the effects of detected shift data, we train a multi-class classifier that learns internal domain knowledge and evaluate its classification performance for each class in external domains after dropping the shift data. We also propose a data quality metric to quantify the dissimilarity between the internal and external datasets. We verify the efficacy of MedShift with musculoskeletal radiographs (MURA) and chest X-ray datasets from more than one external source. Experiments show our proposed shift-data detection pipeline can help medical centers curate high-quality datasets more efficiently. The code can be found at https://github.com/XiaoyuanGuo/MedShift. An interface introduction video to visualize our results is available at https://youtu.be/V3BF0P1sxQE.

Item: MedShift: identifying shift data for medical dataset curation (2021)
Authors: Guo, Xiaoyuan; Gichoya, Judy Wawira; Trivedi, Hari; Purkayastha, Saptarshi; Banerjee, Imon
BioHealth Informatics, School of Informatics and Computing

To curate a high-quality dataset, identifying data variance between the internal and external sources is a fundamental and crucial step. However, methods to detect shift or variance in data have not been significantly researched. Challenges include the lack of effective approaches for learning dense representations of a dataset and the difficulty of sharing private data across medical institutions. To overcome these problems, we propose a unified pipeline called MedShift to detect the top-level shift samples and thus facilitate medical curation. Given an internal dataset A as the base source, we first train anomaly detectors for each class of dataset A to learn internal distributions in an unsupervised way. Second, without exchanging data across sources, we run the trained anomaly detectors on an external dataset B for each class. The data samples with high anomaly scores are identified as shift data. To quantify the shiftness of the external dataset, we cluster B's data into groups class-wise based on the obtained scores. We then train a multi-class classifier on A and measure the shiftness via the classifier's performance variance on B as we gradually drop the group with the largest anomaly scores for each class. Additionally, we adapt a dataset quality metric to help inspect the distribution differences across multiple medical sources. We verify the efficacy of MedShift with musculoskeletal radiographs (MURA) and chest X-ray datasets from more than one external source. Experiments show our proposed shift-data detection pipeline can help medical centers curate high-quality datasets more efficiently. An interface introduction video to visualize our results is available at https://youtu.be/V3BF0P1sxQE.
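Editorial note: both MedShift abstracts describe the same two-step recipe — fit an unsupervised anomaly detector on the internal source, score the external source, then check how an internally trained classifier performs on the external data before and after the highest-anomaly samples are dropped. A minimal sketch of that recipe follows, using scikit-learn's IsolationForest as a stand-in for the paper's anomaly detectors; the synthetic data, the 20% drop threshold, and all names are illustrative assumptions.

```python
# Minimal sketch of the MedShift idea: score external samples against the
# internal distribution, then measure classifier performance with and
# without the highest-shiftness group. All data and thresholds are toy.
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X_int = rng.normal(0.0, 1.0, (500, 8))            # internal dataset A
y_int = (X_int[:, 0] > 0).astype(int)
X_ext = rng.normal(0.5, 1.3, (300, 8))            # shifted external dataset B
y_ext = (X_ext[:, 0] > 0).astype(int)

detector = IsolationForest(random_state=0).fit(X_int)  # learn internal distribution
shiftness = -detector.score_samples(X_ext)              # higher = more anomalous

clf = RandomForestClassifier(random_state=0).fit(X_int, y_int)
keep = shiftness < np.quantile(shiftness, 0.8)          # drop the top-20% shift group
print("all external:  ", accuracy_score(y_ext, clf.predict(X_ext)))
print("shift dropped: ", accuracy_score(y_ext[keep], clf.predict(X_ext[keep])))
```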
Item: Multi-Label Medical Image Retrieval Via Learning Multi-Class Similarity (SSRN, 2022)
Authors: Guo, Xiaoyuan; Duan, Jiali; Gichoya, Judy Wawira; Trivedi, Hari; Purkayastha, Saptarshi; Sharma, Ashish; Banerjee, Imon
BioHealth Informatics, School of Informatics and Computing

Introduction: Multi-label image retrieval is a challenging problem in the medical area. First, compared with natural images, labels in the medical domain exhibit higher class imbalance and more nuanced variations. Second, pair-based sampling of positives and negatives during similarity optimization is ambiguous in the multi-label setting, as samples with the same set of labels are limited.
Methods: To address these challenges, we propose a proxy-based multi-class similarity (PMS) framework, which compares and contrasts samples by comparing their similarities with the discovered proxies. In this way, samples with different sets of label attributes can be utilized and compared indirectly, without the need for complicated sampling. PMS learns a class-wise feature decomposition and maintains a memory bank of positive features from each class. The memory bank keeps track of the latest features, which are used to compute the class proxies. We compare samples based on their similarity distributions against the proxies, which provide a more stable mean against noise.
Results: We benchmark against over 10 popular metric learning baselines on two public chest X-ray datasets, and experiments show the consistent stability of our approach under both exact and non-exact match settings.
Conclusions: We proposed a methodology for multi-label medical image retrieval and designed a proxy-based multi-class similarity metric, which compares and contrasts samples based on their similarity distributions with respect to the class proxies. With no prerequisites, the metric can be applied to various multi-label medical image applications. The implementation code repository will be made publicly available after acceptance.
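Editorial note: the core PMS idea above is to describe each sample by its similarity distribution over class proxies and compare samples through those distributions rather than directly. The sketch below illustrates this under strong simplifications: proxies are plain means of a random memory bank, and the two distributions are compared with a symmetric KL divergence, which is this sketch's assumption rather than the paper's stated metric.

```python
# Minimal sketch of proxy-based multi-class similarity: embed each sample
# as a distribution of cosine similarities over class proxies, then compare
# samples via those distributions. All names and choices are illustrative.
import torch
import torch.nn.functional as F

n_classes, dim = 5, 32
memory_bank = [torch.randn(20, dim) for _ in range(n_classes)]  # recent features per class
proxies = torch.stack([F.normalize(m.mean(0), dim=0) for m in memory_bank])

def similarity_profile(z):
    """Softmax over cosine similarities of an embedding against every proxy."""
    return F.softmax(F.normalize(z, dim=0) @ proxies.T, dim=0)

a, b = torch.randn(dim), torch.randn(dim)
profile_a, profile_b = similarity_profile(a), similarity_profile(b)
# Compare the two profiles with a symmetric KL divergence (an assumption):
kl = 0.5 * (F.kl_div(profile_a.log(), profile_b, reduction="sum")
            + F.kl_div(profile_b.log(), profile_a, reduction="sum"))
print(kl)  # smaller = the samples look more alike through the proxies
```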
Item: Multireader evaluation of radiologist performance for COVID-19 detection on emergency department chest radiographs (Elsevier, 2022-02)
Authors: Gichoya, Judy W.; Sinha, Priyanshu; Davis, Melissa; Dunkle, Jeffrey W.; Hamlin, Scott A.; Herr, Keith D.; Hoff, Carrie N.; Letter, Haley P.; McAdams, Christopher R.; Puthoff, Gregory D.; Smith, Kevin L.; Steenburg, Scott D.; Banerjee, Imon; Trivedi, Hari
Radiology and Imaging Sciences, School of Medicine

Background: Chest radiographs (CXR) are frequently used as a screening tool for patients with suspected COVID-19 infection pending reverse transcriptase polymerase chain reaction (RT-PCR) results, despite recommendations against this. We evaluated radiologist performance for COVID-19 diagnosis on CXR at the time of patient presentation in the Emergency Department (ED).
Materials and Methods: We extracted RT-PCR results, clinical history, and CXRs of all patients from a single institution between March and June 2020. 984 RT-PCR-positive and 1043 RT-PCR-negative radiographs were reviewed by 10 emergency radiologists from 4 academic centers. 100 cases were read by all radiologists and 1927 cases by 2 radiologists. Each radiologist chose the single best label per case: Normal, COVID-19, Other - Infectious, Other - Noninfectious, Non-diagnostic, or Endotracheal Tube. Cases labeled Endotracheal Tube (246) or Non-diagnostic (54) were excluded. The remaining cases were analyzed for label distribution, clinical history, and inter-reader agreement.
Results: 1727 radiographs (732 RT-PCR-positive, 995 RT-PCR-negative) were included from 1594 patients (51.2% male, 48.8% female, age 59 ± 19 years). For the 89 cases read by all readers, there was poor agreement for RT-PCR-positive (Fleiss score 0.36) and RT-PCR-negative (Fleiss score 0.46) exams. Agreement between two readers on 1638 cases was 54.2% (373/688) for RT-PCR-positive cases and 71.4% (679/950) for negative cases. Agreement was highest for RT-PCR-negative cases labeled Normal (50.4%, n = 479). Reader performance did not improve with clinical history or time between CXR and RT-PCR result.
Conclusion: At the time of presentation to the emergency department, emergency radiologist performance is non-specific for diagnosing COVID-19.
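Editorial note: the "Fleiss score" above is Fleiss' kappa, a chance-corrected multi-rater agreement statistic. The sketch below shows how such a value is computed from a cases-by-raters label matrix using statsmodels; the toy matrix is fabricated for illustration and is not study data.

```python
# Minimal sketch: Fleiss' kappa over a cases-by-raters label matrix,
# via statsmodels. The label codes and matrix below are fabricated.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = cases, columns = raters; entries are label codes
# (e.g. 0 = Normal, 1 = COVID-19, 2 = Other - Infectious)
labels = np.array([
    [0, 0, 1, 0],
    [1, 1, 1, 2],
    [2, 0, 2, 2],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
])

counts, _ = aggregate_raters(labels)              # cases x categories count table
print(f"Fleiss' kappa: {fleiss_kappa(counts):.2f}")  # ~0 = chance, 1 = perfect agreement
```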