Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications

dc.contributor.authorBanerjee, Imon
dc.contributor.authorSinha, Priyanshu
dc.contributor.authorPurkayastha, Saptarshi
dc.contributor.authorMashhaditafreshi, Nazanin
dc.contributor.authorTariq, Amara
dc.contributor.authorJeong, Jiwoong
dc.contributor.authorTrivedi, Hari
dc.contributor.authorGichoya, Judy W.
dc.contributor.departmentBioHealth Informatics, School of Informatics and Computingen_US
dc.date.accessioned2020-07-31T15:41:51Z
dc.date.available2020-07-31T15:41:51Z
dc.date.issued2020-06-23
dc.description.abstractPurpose: Since the recent COVID-19 outbreak, there has been an avalanche of research papers applying deep learning-based image processing to chest radiographs for detection of the disease. Our purpose is to test the performance of two top models for CXR COVID-19 diagnosis on external datasets and thereby assess model generalizability. Methods: In this paper, we present our argument regarding the efficiency and applicability of existing deep learning models for COVID-19 diagnosis. We provide results from two popular models, COVID-Net and CoroNet, evaluated on three publicly available datasets and on an additional institutional dataset collected from Emory Hospital between January and May 2020, containing patients tested for COVID-19 infection using RT-PCR. Results: COVID-Net has a large false positive rate (FPR) on both the CheXpert (55.3%) and MIMIC-CXR (23.4%) datasets. On the Emory dataset, COVID-Net has 61.4% sensitivity, an F1-score of 0.54, and a precision of 0.49. The FPR of the CoroNet model is significantly lower than that of COVID-Net across all datasets: Emory (9.1%), CheXpert (1.3%), ChestX-ray14 (0.02%), and MIMIC-CXR (0.06%). Conclusion: Both models reported good to excellent performance on their internal datasets; however, in our testing their performance dramatically worsened on external data. This likely stems from several causes, including overfitting due to a lack of appropriate control patients and ground-truth labels. The fourth, institutional dataset was labeled using RT-PCR, which can be positive without radiographic findings and vice versa. Therefore, a fusion model combining clinical and radiographic data may have better performance and generalization.en_US
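The abstract reports standard binary-classification metrics (FPR, sensitivity, precision, F1-score). As a minimal sketch only, not code from the paper or the repository record, these can be computed from confusion-matrix counts as below; the function name binary_metrics and all counts are hypothetical placeholders, not values from the study.

# Minimal sketch: deriving FPR, sensitivity, precision, and F1-score
# from binary confusion-matrix counts. Counts are hypothetical.

def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return FPR, sensitivity (recall), precision, and F1 for one class."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0          # false positive rate
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall / true positive rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {"fpr": fpr, "sensitivity": sensitivity,
            "precision": precision, "f1": f1}

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(binary_metrics(tp=86, fp=90, tn=410, fn=54))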
dc.identifier.citationBanerjee, I., Sinha, P., Purkayastha, S., Mashhaditafreshi, N., Tariq, A., Jeong, J., Trivedi, H., & Gichoya, J. W. (2020). Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications. http://arxiv.org/abs/2006.13262en_US
dc.identifier.urihttps://hdl.handle.net/1805/23475
dc.language.isoen_USen_US
dc.sourceArXiven_US
dc.subjectCOVID-19en_US
dc.subjectDeep Learning Modelsen_US
dc.subjectConvolutional Neural Networken_US
dc.subjectChest Radiographsen_US
dc.titleWas there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indicationsen_US
dc.typePreprinten_US
Files
Original bundle
Name: Banerjee2020Was.pdf
Size: 509.99 KB
Format: Adobe Portable Document Format
Description: Preprint
License bundle
Name: license.txt
Size: 1.99 KB
Description: Item-specific license agreed upon to submission