3D Object Detection Using Virtual Environment Assisted Deep Network Training

dc.contributor.advisorChristopher, Lauren
dc.contributor.authorDale, Ashley S.
dc.contributor.otherKing, Brian
dc.contributor.otherSalama, Paul
dc.date.accessioned2021-01-05T16:11:20Z
dc.date.available2021-01-05T16:11:20Z
dc.date.issued2020-12
dc.degree.date2020en_US
dc.degree.disciplineElectrical & Computer Engineeringen
dc.degree.grantorPurdue Universityen_US
dc.degree.levelM.S.E.C.E.en_US
dc.descriptionIndiana University-Purdue University Indianapolis (IUPUI)en_US
dc.description.abstractAn RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic and real-world data, F1 scores improved in four of the five classes: the average maximum F1-score over all classes and all epochs for the networks trained with synthetic data is F1* = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1-score for synthetically trained networks is σ_F1* = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Varying the backgrounds of the synthetic data was shown to have a negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and only a small bias towards clustering based on image background.en_US
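
The thesis itself is the authoritative source for methodology; the two Python sketches below are illustrative reconstructions only, not the author's code. The first shows the transfer-learning configuration the abstract describes, assuming the torchvision implementation of Mask R-CNN: the network is initialized with MS COCO weights, the backbone is frozen, and only the heads are trained. The dummy image and target are placeholders for batches drawn from the mixed synthetic/real dataset. (The F1-score being compared is the harmonic mean of precision and recall, F1 = 2PR/(P + R).)

    import torch
    import torchvision

    # Mask R-CNN pre-trained on MS COCO; the thesis likewise starts from
    # MS COCO weights before fine-tuning.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Freeze the backbone so that only the heads (RPN, box, and mask
    # predictors) receive gradient updates, mirroring a heads-only regime.
    for p in model.backbone.parameters():
        p.requires_grad = False

    optimizer = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad],
        lr=0.005, momentum=0.9)

    # One illustrative training step; a real run would draw batches from
    # the combined synthetic + real dataset instead of this dummy sample.
    model.train()
    images = [torch.rand(3, 480, 640)]
    masks = torch.zeros(1, 480, 640, dtype=torch.uint8)
    masks[0, 100:200, 100:200] = 1
    targets = [{
        "boxes": torch.tensor([[100.0, 100.0, 200.0, 200.0]]),
        "labels": torch.tensor([1]),
        "masks": masks,
    }]
    loss_dict = model(images, targets)  # dict of per-head losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

The second sketch covers the latent-space analysis: latent vectors (here random stand-ins for the VAE encoder outputs) are projected to two dimensions with PCA and with UMAP, the two methods named in the abstract.

    import numpy as np
    from sklearn.decomposition import PCA
    import umap  # provided by the umap-learn package

    # Random stand-ins for the VAE latent vectors (500 images, 32-dim space).
    rng = np.random.default_rng(0)
    latents = rng.normal(size=(500, 32))

    pca_2d = PCA(n_components=2).fit_transform(latents)         # linear projection
    umap_2d = umap.UMAP(n_components=2).fit_transform(latents)  # nonlinear embedding
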
dc.identifier.urihttps://hdl.handle.net/1805/24756
dc.identifier.urihttp://dx.doi.org/10.7912/C2/2594
dc.language.isoen_USen_US
dc.rightsAttribution-ShareAlike 4.0 International*
dc.rights.urihttp://creativecommons.org/licenses/by-sa/4.0/*
dc.subjectMachine Learningen_US
dc.subjectMASK R-CNNen_US
dc.subjectARTIFICIAL INTELLIGENCEen_US
dc.subjectIMAGE PROCESSINGen_US
dc.subject3D IMAGEen_US
dc.subjectSIGNAL PROCESSINGen_US
dc.subjectOBJECT DETECTIONen_US
dc.subjectTHREAT DETECTIONen_US
dc.subjectVIRTUAL ENVIRONMENTSen_US
dc.subjectSYNTHETIC DATASETen_US
dc.subjectIMAGE SEGMENTATIONen_US
dc.subjectRGBDen_US
dc.subjectRGBD VIDEOen_US
dc.subjectRGBZen_US
dc.subjectALGORITHMen_US
dc.subjectMS COCOen_US
dc.subjectTRANSFER LEARNINGen_US
dc.title3D Object Detection Using Virtual Environment Assisted Deep Network Trainingen_US
dc.typeThesisen
Files
Original bundle
Name: 3D_OBJECT_DETECTION_USING_VIRTUAL_ENVIRONMENT_ASSISTED_DEEP_NETWORK_TRAINING.pdf
Size: 85.14 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon at submission