Trusted Data Anomaly Detection (TaDA) in Ground Truth Image Data
Abstract
Current state-of-the-art artificial intelligence (AI) anomaly detection in images is used primarily for defect detection and relies on relatively homogeneous datasets whose images share similar foregrounds and backgrounds. This type of anomaly detection depends on human-labeled ground-truth data. Our research, by contrast, works with extremely heterogeneous datasets in which we want to identify outliers. We use self-supervised Variational Autoencoders (VAEs) to identify anomalies in the latent vector feature space. Understanding the outliers in a large training dataset is important for establishing the trustworthiness of AI models learned from those data, a strong requirement for military AI applications. Our study uses 8984 examples from a Kaggle military planes dataset and 4300 examples from a Kaggle landscape dataset. We present results for the localized methods on the combined heterogeneous dataset; in one such result, landscapes/backgrounds are the inliers and all aircraft are the outliers, and aircraft are detected as anomalies with an AUC of 0.87. Results also include the inter-class AUC across the different aircraft classes. Our contribution to the state of the art is to identify anomalies in a strongly heterogeneous image dataset for military applications by applying isolation forests to the latent-space data after UMAP embedding.
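The final scoring step described in the abstract (an isolation forest applied to embedded latent vectors, evaluated by AUC) can be illustrated with a minimal sketch. This is not the thesis implementation: the VAE encoder and the UMAP embedding are omitted, and the "embedded" vectors below are synthetic Gaussian stand-ins. The cluster sizes, the dimensionality, the shift between clusters, and the forest settings are all assumptions chosen only for illustration. The sketch uses scikit-learn's IsolationForest and roc_auc_score.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical stand-ins for embedded latent vectors (NOT the thesis data):
# a large inlier cloud and a small, shifted outlier cloud.
n_inliers, n_outliers, n_dims = 500, 25, 8
inliers = rng.normal(loc=0.0, scale=1.0, size=(n_inliers, n_dims))
outliers = rng.normal(loc=4.0, scale=1.0, size=(n_outliers, n_dims))

embedded = np.vstack([inliers, outliers])
# Ground-truth labels are used only for evaluation: 1 = anomaly, 0 = inlier.
labels = np.concatenate([
    np.zeros(n_inliers, dtype=int),
    np.ones(n_outliers, dtype=int),
])

# Fit the isolation forest without labels (unsupervised).
forest = IsolationForest(n_estimators=200, random_state=0)
forest.fit(embedded)

# score_samples returns higher values for more "normal" points,
# so negate it to get a score that increases with anomalousness.
anomaly_score = -forest.score_samples(embedded)

# AUC: probability that a random anomaly outscores a random inlier.
auc = roc_auc_score(labels, anomaly_score)
print(f"AUC: {auc:.3f}")
```

In a full pipeline under the same assumptions, `embedded` would instead hold the UMAP embedding of the VAE latent vectors, and `labels` would mark which images belong to the class treated as anomalous.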