Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications

Banerjee, Imon; Sinha, Priyanshu; Purkayastha, Saptarshi; Mashhaditafreshi, Nazanin; Tariq, Amara; Jeong, Jiwoong; Trivedi, Hari; Gichoya, Judy W.

dc.contributor.author	Banerjee, Imon
dc.contributor.author	Sinha, Priyanshu
dc.contributor.author	Purkayastha, Saptarshi
dc.contributor.author	Mashhaditafreshi, Nazanin
dc.contributor.author	Tariq, Amara
dc.contributor.author	Jeong, Jiwoong
dc.contributor.author	Trivedi, Hari
dc.contributor.author	Gichoya, Judy W.
dc.contributor.department	BioHealth Informatics, School of Informatics and Computing	en_US
dc.date.accessioned	2020-07-31T15:41:51Z
dc.date.available	2020-07-31T15:41:51Z
dc.date.issued	2020-06-23
dc.description.abstract	Purpose: Since the recent COVID-19 outbreak, there has been an avalanche of research papers applying deep learning based image processing to chest radiographs (CXR) for detection of the disease. We test the performance of the two top models for CXR COVID-19 diagnosis on external datasets to assess model generalizability. Methods: In this paper, we present our argument regarding the efficiency and applicability of existing deep learning models for COVID-19 diagnosis. We provide results from two popular models, COVID-Net and CoroNet, evaluated on three publicly available datasets and an additional institutional dataset collected from EMORY Hospital between January and May 2020, containing patients tested for COVID-19 infection using RT-PCR. Results: COVID-Net has a large false positive rate (FPR) on both the CheXpert (55.3%) and MIMIC-CXR (23.4%) datasets. On the EMORY dataset, COVID-Net has a sensitivity of 61.4%, an F1-score of 0.54, and a precision of 0.49. The FPR of the CoroNet model is significantly lower than that of COVID-Net across all the datasets: EMORY (9.1%), CheXpert (1.3%), ChestX-ray14 (0.02%), MIMIC-CXR (0.06%). Conclusion: The models reported good to excellent performance on their internal datasets; however, we observed in our testing that their performance worsened dramatically on external data. This likely has several causes, including model overfitting due to a lack of appropriate control patients and ground truth labels. The fourth, institutional dataset was labeled using RT-PCR, which could be positive without radiographic findings and vice versa. Therefore, a fusion model of both clinical and radiographic data may have better performance and generalization.	en_US
dc.identifier.citation	Banerjee, I., Sinha, P., Purkayastha, S., Mashhaditafreshi, N., Tariq, A., Jeong, J., Trivedi, H., & Gichoya, J. W. (2020). Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications. http://arxiv.org/abs/2006.13262	en_US
dc.identifier.uri	https://hdl.handle.net/1805/23475
dc.language.iso	en_US	en_US
dc.source	ArXiv	en_US
dc.subject	COVID-19	en_US
dc.subject	Deep Learning Models	en_US
dc.subject	Convolutional Neural Network	en_US
dc.subject	Chest Radiographs	en_US
dc.title	Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications	en_US
dc.type	Preprint	en_US
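
The metrics quoted in the abstract (FPR, sensitivity, precision, F1-score) follow standard definitions. The short Python sketch below is not taken from the preprint; the function names are illustrative assumptions. It shows how the F1-score relates to the reported precision and sensitivity for COVID-Net on the EMORY dataset, and how an FPR is computed from false positive and true negative counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (recall is the same as sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of truly negative cases that the model flags as positive."""
    if fp + tn == 0:
        raise ValueError("FPR is undefined when there are no negative cases")
    return fp / (fp + tn)


# Values reported in the abstract for COVID-Net on the EMORY dataset.
reported_precision = 0.49
reported_sensitivity = 0.614

# The result is approximately 0.545, consistent with the reported F1-score of 0.54.
print(round(f1_score(reported_precision, reported_sensitivity), 3))
```

The abstract reports rates only, so no confusion-matrix counts are assumed here; `false_positive_rate` is included to make the definition explicit.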

Files

Original bundle

Name: Banerjee2020Was.pdf
Size: 509.99 KB
Format: Adobe Portable Document Format
Description: Preprint

License bundle

Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Open Access Coronavirus-Related Works
Department of Biomedical Engineering and Informatics Works
Open Access Policy Articles
Saptarshi Purkayastha