A Transfer Learning Approach to Object Detection Acceleration for Embedded Applications

dc.contributor.advisor: Christopher, Lauren
dc.contributor.author: Vance, Lauren M.
dc.contributor.other: King, Brian
dc.contributor.other: Rizkalla, Maher
dc.date.accessioned: 2021-08-10T13:21:49Z
dc.date.available: 2021-08-10T13:21:49Z
dc.date.issued: 2021-08
dc.degree.date: 2021
dc.degree.discipline: Electrical & Computer Engineering
dc.degree.grantor: Purdue University
dc.degree.level: M.S.E.C.E.
dc.description: Indiana University-Purdue University Indianapolis (IUPUI)
dc.description.abstract: Deep learning solutions to computer vision tasks have revolutionized many industries in recent years, but embedded systems are too constrained to take advantage of current state-of-the-art configurations. Typical embedded processor hardware must meet strict power and memory constraints to maintain small, lightweight packaging, and the architectures of the current best deep learning models are too computationally intensive for such hardware. Current research shows that convolutional neural networks (CNNs) can be deployed on Field-Programmable Gate Arrays (FPGAs) with a few architectural modifications, resulting in minimal loss of accuracy, similar or decreased processing speeds, and lower power consumption compared to general-purpose Central Processing Units (CPUs) and Graphics Processing Units (GPUs). This research contributes further to these findings with an FPGA implementation of a YOLOv4 object detection model developed using transfer learning. The transfer-learned model uses the weights of a model pre-trained on the MS-COCO dataset as a starting point, then fine-tunes only the output layers to detect more specific objects of five classes. The model architecture was then modified slightly for compatibility with the FPGA hardware using techniques such as weight quantization and replacement of unsupported activation layer types. The model was deployed on three hardware setups (CPU, GPU, FPGA) for inference on a test set of 100 images. The FPGA achieved real-time inference speeds of 33.77 frames per second, a speedup of 7.74 frames per second over GPU deployment. The model also consumed 96% less power than the GPU configuration, with only approximately 4% average loss in accuracy across all five classes. The results are even more striking compared to CPU deployment, with a 131.7-fold speedup in inference throughput. CPUs have long been outperformed by GPUs for deep learning applications, yet they remain the processors used in most embedded systems. These results further illustrate the advantages of FPGAs for deep learning inference on embedded systems, even when transfer learning is used for an efficient end-to-end deployment process. This work advances the current state of the art with the implementation of a YOLOv4 object detection model developed with transfer learning for FPGA deployment.
dc.identifier.uri: https://hdl.handle.net/1805/26436
dc.identifier.uri: http://dx.doi.org/10.7912/C2/62
dc.language.iso: en_US
dc.rights: Attribution-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nd/4.0/
dc.subject: deep learning
dc.subject: computer vision
dc.subject: object detection
dc.subject: embedded systems
dc.subject: fpga
dc.subject: transfer learning
dc.title: A Transfer Learning Approach to Object Detection Acceleration for Embedded Applications
dc.type: Thesis
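
Note on the techniques named in the abstract: the record does not include the thesis code, so the following is only a minimal Python sketch of the three model-preparation steps the abstract describes (fine-tuning only the output layers of the MS-COCO pre-trained network, weight quantization, and replacement of unsupported activation layers), plus the frames-per-second measurement used for the hardware comparison. The framework choice (PyTorch), the "head" parameter-name prefix, the load_pretrained_yolov4 loader, and the LeakyReLU stand-in for Mish are all assumptions for illustration, not the author's implementation.

import time

import torch
import torch.nn as nn

def freeze_all_but_head(model: nn.Module, head_prefix: str = "head") -> None:
    # Transfer learning as described in the abstract: keep the MS-COCO
    # pre-trained weights fixed and fine-tune only the output layers.
    # The "head" prefix is a hypothetical naming convention.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_prefix)

def quantize_weights_int8(weights: torch.Tensor):
    # Symmetric per-tensor int8 post-training quantization: the general
    # kind of weight quantization applied for FPGA compatibility.
    scale = weights.abs().max().clamp(min=1e-8) / 127.0
    q = torch.round(weights / scale).clamp(-127, 127).to(torch.int8)
    return q, scale  # approximate weights are recovered as q.float() * scale

def replace_unsupported_activations(module: nn.Module) -> None:
    # Swap activation types the accelerator cannot execute (e.g. YOLOv4's
    # Mish) for a supported near-equivalent; LeakyReLU is a common stand-in,
    # assumed here rather than taken from the thesis.
    for name, child in module.named_children():
        if isinstance(child, nn.Mish):
            setattr(module, name, nn.LeakyReLU(0.1))
        else:
            replace_unsupported_activations(child)

def measure_fps(infer, images) -> float:
    # Throughput measurement matching the abstract's frames-per-second
    # comparison across the CPU, GPU, and FPGA deployments.
    start = time.perf_counter()
    for image in images:
        infer(image)
    return len(images) / (time.perf_counter() - start)

# Usage sketch (the loader is hypothetical):
# model = load_pretrained_yolov4(num_classes=5)
# freeze_all_but_head(model)
# replace_unsupported_activations(model)
# fps = measure_fps(lambda x: model(x), test_images)

As a consistency check drawn from the abstract's own numbers, a speedup of 7.74 frames per second over GPU deployment alongside the FPGA's 33.77 frames per second implies the GPU ran at roughly 26.03 frames per second on the same 100-image test set.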
Files

Original bundle (1 of 1)
Name: A_Transfer_Learning_Approach_to_Object_Detection_Acceleration_for_Embedded_Applications.pdf
Size: 1.28 MB
Format: Adobe Portable Document Format

License bundle (1 of 1)
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission