Attention Mechanism Improves YOLOv5x for Detecting Vehicles on Surveillance Videos
Abstract
Vehicle detection accuracy on surveillance videos is severely limited by camera angles, low lighting, poor visibility caused by harsh weather, and heavy occlusion. For full 24/7 operation, Intelligent Transportation Services (ITS) are expected to perform well on all target detection categories in the environment. Unfortunately, most existing datasets do not cover all of these difficult conditions, and the performance of state-of-the-art deep learning detectors degrades under them. This paper reports on the training of an object detection system using a range of traffic scenarios: sunny, rainy, snowy, one-side road, two-side road, complex road structures with occlusions, heavy traffic with congestion, light traffic, and reduced traffic at night. The state-of-the-art object detector YOLOv5x is used for vehicle detection and is fine-tuned on this new, diverse dataset through transfer learning. Here, transfer learning freezes the backbone network while training the remaining fully connected network. To further improve detection performance, we add two convolutional block attention modules (CBAM) to the neck, yielding our proposed system, 2xCBAM-YOLOv5. Several experiments refined the number and placement of the CBAMs to optimize performance. With transfer learning alone, the mean Average Precision (mAP) on the test data improves from 75.9% to 78.9%. After transfer learning, ablation studies were performed on YOLOv5x combined with the new CBAMs. The resulting mAP reaches 85.0%, while precision improves from 82.3% to 88.2%, recall improves from 72.3% to 80.4%, and F1-score improves from 0.77 to 0.841, all compared with transfer learning alone. This new architecture provides an overall improvement for ITS traffic surveillance applications.
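As a quick consistency check on the reported metrics, the F1-score can be recomputed from the quoted precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the function name `f1_score` and the variable names are assumptions introduced here, not part of the authors' code, and the inputs are simply the percentages quoted in the abstract converted to fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted in the abstract, as fractions.
# Variable names are hypothetical labels for the two configurations.
transfer_only = f1_score(0.823, 0.723)   # transfer learning alone
with_cbam = f1_score(0.882, 0.804)       # proposed 2xCBAM-YOLOv5

print(f"{transfer_only:.3f} {with_cbam:.3f}")  # prints "0.770 0.841"
```

Both recomputed values agree with the F1-scores stated in the abstract (0.77 and 0.841) to the reported precision.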