- Browse by Subject
Browsing by Subject "Sensor Fusion"
Now showing 1 - 8 of 8
Item: A Multi-head Attention Approach with Complementary Multimodal Fusion for Vehicle Detection (2024-05)
Tabassum, Nujhat; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher

The advancement of autonomous vehicle technologies has taken a significant leap with an improved version of the Multimodal Vehicle Detection Network (MVDNet), distinguished by the integration of a multi-head attention layer. This enhancement refines the network's ability to process and integrate multimodal sensor data, which becomes crucial under challenging weather conditions. The effectiveness of the upgraded Multi-Head MVDNet is verified on an extensive dataset acquired from the Oxford Radar RobotCar, where, in complex environmental conditions, it shows a marked superiority in Average Precision (AP) over existing models. The transition from the traditional MVDNet to the Multi-Head Vehicle Detection Network is a notable step forward for vehicle detection under severe meteorological conditions, such as the obscuring presence of dense fog or the complexities introduced by heavy snowfall. The enhancement builds on the foundational principles of the original MVDNet, which combines the individual strengths of lidar and radar sensors through a refined process of feature tensor fusion, creating a more robust and comprehensive framework for interpreting sensory data. The central innovation of the updated model is a multi-head attention layer that replaces the previously employed self-attention mechanism; segmenting attention into several distinct heads improves the network's efficiency and accuracy in processing and interpreting large volumes of sensor data. An exhaustive series of experiments explored various combinations and settings to determine the optimal configuration of this mechanism, identifying seven attention heads as the most effective, a setup that balances computational efficiency against detection accuracy. Tested on the radar and lidar datasets from the ORR project, the Multi-Head MVDNet consistently surpassed the original MVDNet and showed marked improvements over lidar-only and DEF models in vehicular detection accuracy. This enhancement not only represents a significant advance in autonomous vehicle detection but also lays a foundation for future research, opening new pathways for exploring attention mechanisms in real-time vehicle detection. Furthermore, it underscores the importance of sophisticated sensor fusion techniques in overcoming adverse environmental conditions, paving the way for more resilient and reliable autonomous vehicle technologies.
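As a rough illustration of this kind of fusion step, the minimal PyTorch sketch below applies a multi-head attention layer across concatenated lidar and radar feature tokens. The 112-dimensional features, the token layout, and the simple concatenate-then-attend strategy are assumptions for the example rather than the authors' implementation; the seven heads mirror the configuration the abstract reports, and the dimension is chosen only because PyTorch requires the embedding size to be divisible by the head count.

```python
# Illustrative sketch only: multi-head attention over concatenated lidar/radar
# feature tokens, in the spirit of replacing self-attention with a multi-head
# layer. Shapes and the fusion strategy are assumptions, not the MVDNet code.
import torch
import torch.nn as nn

class MultiHeadSensorFusion(nn.Module):
    def __init__(self, feat_dim: int = 112, num_heads: int = 7):
        super().__init__()
        # embed_dim must be divisible by num_heads (112 / 7 = 16).
        self.attn = nn.MultiheadAttention(embed_dim=feat_dim,
                                          num_heads=num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, lidar_feats: torch.Tensor, radar_feats: torch.Tensor):
        # lidar_feats, radar_feats: (batch, tokens, feat_dim) tensors produced by
        # each sensor branch, e.g. flattened feature maps before a detection head.
        tokens = torch.cat([lidar_feats, radar_feats], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)   # attend across both modalities
        return self.norm(fused + tokens)               # residual connection

if __name__ == "__main__":
    fusion = MultiHeadSensorFusion()
    lidar = torch.randn(2, 64, 112)
    radar = torch.randn(2, 64, 112)
    print(fusion(lidar, radar).shape)                  # torch.Size([2, 128, 112])
```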
Item: Data Acquisition and Processing Pipeline for E-Scooter Tracking Using 3D Lidar and Multi-Camera Setup (2020-12)
Betrabet, Siddhant S.; Tian, Renran; Zhu, Likun; Anwar, Sohel

Analyzing the behavior of objects on the road is a complex task that requires data from various sensors and their fusion to recreate object movements with a high degree of accuracy. A data collection and processing system is therefore needed to track objects accurately and to build a clear map of their trajectories relative to the coordinate frame(s) of interest. Detection and tracking of moving objects (DATMO) and simultaneous localization and mapping (SLAM) must be performed in conjunction to create a clear map of the road comprising both moving and static objects; these computational problems are commonly solved to aid scenario reconstruction for the objects of interest. Object tracking can be done in various ways, utilizing sensors such as monocular or stereo cameras, Light Detection and Ranging (LIDAR) sensors, and Inertial Navigation Systems (INS). One relatively common approach to DATMO and SLAM combines a 3D LIDAR and multiple monocular cameras with an inertial measurement unit (IMU); the resulting redundancy helps maintain object classification and tracking through sensor fusion in cases where sensor-specific algorithms prove ineffectual because an individual sensor reaches its limits. Using an IMU together with sensor fusion methods largely eliminates the need for an expensive INS rig, and fusing these sensors enables more effective tracking that exploits the full potential of each sensor while improving perceptual accuracy. The focus of this thesis is the dockless e-scooter, and the primary goal is to track its movements effectively and accurately with respect to cars on the road and to the world frame. Since cars are observed on the road far more often than e-scooters, we propose a data collection system that can be built on top of an e-scooter, together with an offline processing pipeline, for collecting data to understand the behavior of the e-scooters themselves. This thesis explores a data collection system consisting of a 3D LIDAR sensor, multiple monocular cameras, and an IMU mounted on an e-scooter, as well as an offline method for processing the collected data to aid scenario reconstruction.
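Not the thesis's pipeline, but as a hedged sketch of one step that camera-LIDAR fusion of this kind typically relies on, the snippet below projects 3D LIDAR points into a monocular camera image using calibration matrices. The intrinsic matrix, the extrinsic transform, and the point cloud are placeholder values chosen only to make the example runnable.

```python
# Generic sketch of a camera-LIDAR fusion step: projecting 3D LIDAR points into a
# monocular camera image with assumed extrinsic/intrinsic calibration. The matrices
# below are placeholders, not calibration from the thesis's e-scooter rig.
import numpy as np

def project_lidar_to_image(points_lidar, T_cam_lidar, K):
    """points_lidar: (N, 3) XYZ in the LIDAR frame.
    T_cam_lidar: (4, 4) rigid transform from LIDAR frame to camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns (M, 2) pixel coordinates of the points in front of the camera."""
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # transform into camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]              # keep points ahead of the camera
    pix = (K @ pts_cam.T).T
    return pix[:, :2] / pix[:, 2:3]                     # perspective division

if __name__ == "__main__":
    K = np.array([[700.0, 0.0, 640.0], [0.0, 700.0, 360.0], [0.0, 0.0, 1.0]])
    T = np.eye(4)                                       # identity extrinsics for the demo
    pts = np.random.uniform([-5, -2, 2], [5, 2, 30], size=(100, 3))
    print(project_lidar_to_image(pts, T, K)[:3])
```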
Item: Enhanced 3D Object Detection and Tracking in Autonomous Vehicles: An Efficient Multi-Modal Deep Fusion Approach (2024-08)
Kalgaonkar, Priyank B.; El-Sharkawy, Mohamed; King, Brian S.; Rizkalla, Maher E.; Abdallah, Mustafa A.

This dissertation delves into a significant challenge for Autonomous Vehicles (AVs): achieving efficient and robust perception under adverse weather and lighting conditions. Systems that rely solely on cameras face difficulties with visibility over long distances, while radar-only systems struggle to recognize features like stop signs, which are crucial for safe navigation in such scenarios. To overcome this limitation, this research introduces a novel deep camera-radar fusion approach using neural networks. This method ensures reliable AV perception regardless of weather or lighting conditions: cameras, like human vision, are adept at capturing rich semantic information, whereas radars can penetrate obstacles such as fog and darkness, much like X-ray vision. The dissertation presents NeXtFusion, an innovative and efficient camera-radar fusion network designed specifically for robust AV perception. Building on the efficient single-sensor NeXtDet neural network, NeXtFusion significantly enhances object detection accuracy and tracking. A notable feature of NeXtFusion is its attention module, which refines critical feature representation for object detection and minimizes information loss when processing data from both cameras and radars. Extensive experiments conducted on large-scale datasets such as Argoverse, Microsoft COCO, and nuScenes thoroughly evaluate the capabilities of NeXtDet and NeXtFusion. The results show that NeXtFusion excels at detecting small and distant objects compared to existing methods. Notably, NeXtFusion achieves a state-of-the-art mAP score of 0.473 on the nuScenes validation set, outperforming competitors such as OFT by 35.1% and MonoDIS by 9.5%. NeXtFusion's strengths extend beyond mAP: it also performs well in other crucial metrics, including mATE (0.449) and mAOE (0.534), highlighting its overall effectiveness in 3D object detection. Visualizations of real-world scenarios from the nuScenes dataset processed by NeXtFusion provide compelling evidence of its capability to handle diverse and challenging environments.

Item: Exploration of Deep Learning Applications on an Autonomous Embedded Platform (Bluebox 2.0) (2019-12)
Katare, Dewant; El-Sharkawy, Mohamed; Rizkalla, Maher; Kim, Dongsoo Stephen

An autonomous vehicle depends on a combination of the latest technologies, the ADAS safety features, such as Adaptive Cruise Control (ACC), Autonomous Emergency Braking (AEB), Automatic Parking, Blind Spot Monitoring, Forward Collision Warning or Avoidance (FCW or FCA), and Lane Departure Warning. The current trend is to implement these technologies with artificial or deep neural networks in place of the traditionally used algorithms. Recent research in deep learning and the development of capable processors for autonomous or self-driving cars have shown considerable promise, but hardware deployment faces many complexities because of limited resources such as memory, computational power, and energy. Deploying several of these ADAS safety features with multiple sensors and individual processors increases integration complexity and distributes the system, which is pivotal for autonomous vehicles. This thesis tackles two important ADAS safety features, Forward Collision Warning and object detection, using machine learning and deep neural networks, and their deployment on an autonomous embedded platform: 1. a machine-learning-based approach for the forward collision warning system in an autonomous vehicle; 2. 3D object detection using lidar and camera, based primarily on lidar point clouds. The proposed forward collision warning model relies on a forward-facing automotive radar providing sensed input values such as acceleration, velocity, and separation distance to a classifier algorithm that, on the basis of a supervised learning model, alerts the driver to a possible collision. Decision Trees, Linear Regression, a Support Vector Machine, Stochastic Gradient Descent, and a fully connected neural network are used for prediction. The second proposed method uses an object detection architecture that combines 2D object detectors with contemporary 3D deep learning techniques. In this approach, a 2D object detector is applied first to propose 2D bounding boxes on the images or video frames. A 3D object detection technique is then used in which the point clouds are instance-segmented, and a 3D bounding box is predicted around each previously segmented object based on the raw point cloud density.
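As a hedged sketch of the supervised forward-collision-warning idea (radar-style separation distance, relative velocity, and acceleration fed to a classifier), the example below trains one of the listed model types, a decision tree, on synthetic data labelled by a simple time-to-collision rule. The synthetic data, the labelling rule, and the thresholds are assumptions for illustration, not the thesis's dataset or criteria.

```python
# Hedged sketch of a supervised forward-collision-warning classifier: radar-like
# features mapped to a warn/no-warn label. Toy data and labelling rule only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
distance = rng.uniform(2, 80, n)          # m, separation to lead vehicle
rel_velocity = rng.uniform(-15, 5, n)     # m/s, negative = closing
acceleration = rng.uniform(-4, 2, n)      # m/s^2 of the ego vehicle

# Simple time-to-collision style rule used only to create toy labels.
closing_speed = np.maximum(-rel_velocity, 1e-6)
ttc = distance / closing_speed
label = (ttc < 3.0).astype(int)           # 1 = issue a collision warning

X = np.column_stack([distance, rel_velocity, acceleration])
X_train, X_test, y_train, y_test = train_test_split(X, label, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(max_depth=5).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
print("warn?", clf.predict([[12.0, -8.0, -1.0]]))   # close, fast-closing example
```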
Item: Lidar Based 3D Object Detection Using Yolov8 (2024-08)
Menon, Swetha Suresh; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher

Autonomous vehicles have gained substantial traction as the future of transportation, necessitating continuous research and innovation. While 2D object detection and instance segmentation methods have made significant strides, 3D object detection offers unparalleled precision. Deep neural network-based 3D object detection, coupled with sensor fusion, has become indispensable for self-driving vehicles, enabling a comprehensive grasp of the spatial geometry of physical objects. In our study of a lidar-based 3D object detection network using point clouds, we propose a novel architectural model based on the You Only Look Once (YOLO) framework. This model combines the efficiency and accuracy of the YOLOv8 network, a fast, state-of-the-art 2D object detector, with the real-time 3D object detection capability of the Complex YOLO model. By integrating the YOLOv8 model as the backbone network and employing the Euler Region Proposal (ERP) method, our approach achieves rapid inference speeds, surpassing other object detection models while upholding high accuracy standards. Our experiments, conducted on the KITTI dataset, demonstrate the superior efficiency of the new architectural model: it outperforms its predecessors, advancing the field of 3D object detection in autonomous vehicles.
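As a hedged illustration of the preprocessing that Complex-YOLO-style detectors commonly rely on, the sketch below rasterises a LIDAR point cloud into a bird's-eye-view pseudo-image with height, intensity, and density channels that a 2D backbone such as YOLOv8 could consume. The grid extents, resolution, and channel encoding are assumptions, not the thesis's configuration.

```python
# Hedged sketch: rasterise a LIDAR point cloud into a BEV pseudo-image
# (height / intensity / density channels). Illustrative parameters only.
import numpy as np

def pointcloud_to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), cell=0.1):
    """points: (N, 4) array of x, y, z, intensity in the LIDAR frame.
    Returns an (H, W, 3) BEV map with max-height, max-intensity and density channels."""
    h = int((x_range[1] - x_range[0]) / cell)
    w = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((h, w, 3), dtype=np.float32)

    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    rows = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / cell).astype(int)

    for r, c, z, inten in zip(rows, cols, pts[:, 2], pts[:, 3]):
        bev[r, c, 0] = max(bev[r, c, 0], z)        # running max height per cell
        bev[r, c, 1] = max(bev[r, c, 1], inten)    # running max intensity per cell
        bev[r, c, 2] += 1.0                        # point count (density)

    bev[:, :, 2] = np.log1p(bev[:, :, 2])          # compress the density channel
    return bev

if __name__ == "__main__":
    cloud = np.random.uniform([0, -25, -2, 0], [50, 25, 2, 1], size=(5000, 4))
    print(pointcloud_to_bev(cloud).shape)          # (500, 500, 3)
```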
Item: A Novel Fusion Technique for 2D LIDAR and Stereo Camera Data Using Fuzzy Logic for Improved Depth Perception (2021-08)
Saksena, Harsh; Anwar, Sohel; Li, Lingxi; Tovar, Andres

Obstacle detection, avoidance, and path finding for autonomous vehicles require precise information about the vehicle's environment for faultless navigation and decision making. Vision and depth perception sensors have therefore become an integral part of autonomous vehicles in current research and development. The advancements made in vision sensors such as radars, Light Detection and Ranging (LIDAR) sensors, and compact high-resolution cameras are encouraging; however, individual sensors can be prone to error and misinformation due to environmental factors such as scene illumination, object reflectivity, and object transparency. Sensor fusion, in which multiple sensors perceiving similar or related information are combined over a network, is implemented to provide more robust and complete system information and to minimize the overall perceived error of the system. 3D LIDAR and monocular cameras are the most commonly used vision sensors for sensor fusion. 3D LIDARs boast high accuracy and resolution for depth capture in any given environment and have a broad range of applications, such as terrain mapping and 3D reconstruction. Despite 3D LIDAR being the superior sensor for depth, its high cost and sensitivity to its environment make it a poor choice for mid-range applications such as autonomous rovers, RC cars, and robots. 2D LIDARs are more affordable, more readily available, and have a wider range of applications than 3D LIDARs, making them the obvious choice for budget projects. The primary objective of this thesis is to implement a smart and robust sensor fusion system using a 2D LIDAR and a stereo depth camera to capture depth and color information of an environment. The depth points generated by the LIDAR are fused with the depth map generated by the stereo camera by a fuzzy system that performs smart fusion and corrects gaps in the depth information of the stereo camera. The use of a fuzzy system for the sensor fusion of a 2D LIDAR and a stereo camera is a novel approach to the sensor fusion problem, and the output of the fuzzy fusion provides higher depth confidence than either individual sensor. This thesis explores the multiple layers of sensor and data fusion applied to the vision system, both to the camera and lidar data individually and in relation to each other. It details the development and implementation of the fuzzy-logic-based fusion approach, the fuzzification of the input data, the selection of the fuzzy system for depth-specific fusion for the given vision system, and how fuzzy logic can be used to provide information that is far more reliable than that provided by the camera and LIDAR separately.
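As a hedged, much-simplified sketch of fuzzy depth fusion, the example below blends a single LIDAR range reading with the corresponding stereo depth value using hand-rolled triangular membership functions over a stereo-reliability input. The membership shapes, the reliability proxy, and the weighting rule are assumptions for illustration, not the thesis's fuzzy system.

```python
# Hedged sketch of fuzzy-weighted fusion of a LIDAR range reading with a stereo
# depth value. Membership functions and the rule base are illustrative only.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet at a, c and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

def fuzzy_fuse_depth(lidar_depth, stereo_depth, stereo_texture):
    """stereo_texture in [0, 1]: proxy for how reliably the stereo matcher could
    estimate depth at this pixel (low texture -> unreliable stereo depth)."""
    # Fuzzify the stereo reliability into low / medium / high sets.
    low  = tri(stereo_texture, -0.4, 0.0, 0.5)
    med  = tri(stereo_texture,  0.1, 0.5, 0.9)
    high = tri(stereo_texture,  0.5, 1.0, 1.4)

    # Rule base (weight on the LIDAR reading): low reliability -> trust LIDAR,
    # high reliability -> trust stereo, medium -> blend the two.
    lidar_weight = (low * 0.9 + med * 0.5 + high * 0.1) / (low + med + high + 1e-9)
    return lidar_weight * lidar_depth + (1.0 - lidar_weight) * stereo_depth

if __name__ == "__main__":
    print(fuzzy_fuse_depth(4.0, 5.2, stereo_texture=0.15))  # leans toward the LIDAR value
    print(fuzzy_fuse_depth(4.0, 5.2, stereo_texture=0.85))  # leans toward the stereo value
```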
Item: Sensor fusion to detect scale and direction of gravity in monocular SLAM systems (2017)
Tucker, Seth C.; El-Sharkawy, Mohamed A.

Monocular simultaneous localization and mapping (SLAM) is an important technique that enables very inexpensive environment mapping and pose estimation in small systems such as smartphones and unmanned aerial vehicles (UAVs). However, the information generated by monocular SLAM is at an arbitrary and unobservable scale, leading to drift and making it difficult to use with other sources of odometry for control or navigation. To correct this, the odometry needs to be aligned with metric-scale odometry from another device, or else scale must be recovered from known features in the environment. Typically, known environmental features are not available, and for systems such as cellphones or UAVs, which may experience sustained, small-scale, irregular motion, an inertial measurement unit (IMU) is often the only practical option. Because accelerometers measure both acceleration and gravity, the IMU must filter out gravity and track orientation with complex algorithms in order to provide a linear acceleration measurement that can be used to recover SLAM scale. This thesis proposes an alternative method that detects and removes gravity from the accelerometer measurement by using the unscaled direction of acceleration derived from the SLAM odometry.

Item: Smart shoe gait analysis and diagnosis: designing and prototyping of hardware and software (2018)
Peddinti, Seshasai Vamsi Krishna; Agarwal, Mangilal; Rizkalla, Maher; El-Sharkawy, Mohamed

Gait analysis plays a major role in the treatment of osteoarthritis, knee or hip replacements, and musculoskeletal diseases, and it is used extensively for injury rehabilitation and physical therapy for conditions such as hemiplegia and diplegia. It also provides the information needed to detect improper gaits, such as those associated with Parkinson's disease, hemiplegia, and diplegia. Though there are many wearable and non-wearable methods for detecting improper gait, they are usually not user friendly and have restrictions; most existing devices and systems can detect the gait but are very limited in their ability to diagnose it. The proposed method uses two A201 force-sensing resistors, an accelerometer, and a gyroscope to detect the gait and to send diagnostic information about the possibility of the specified improper gaits over a Bluetooth wireless link to the user's hand-held device or desktop. The data received from the sensors are analyzed by the custom-made microcontroller and sent to the desktop or mobile device via the Bluetooth module. The peak pressure values during a gait cycle are recorded and used to indicate whether a person's walk cycle is normal or shows an abnormality. Future work: a magnetometer can be added for more accurate results; more improper gaits can be detected by using two PCBs, one under each foot; and the data can be sent to the cloud and saved for future comparisons.
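As a hedged sketch of the kind of per-cycle analysis the abstract describes, the example below extracts peak heel and toe force-sensing-resistor (FSR) readings for each gait cycle and flags cycles whose heel-to-toe peak ratio leaves an assumed normal band. The thresholds, the cycle segmentation, and the synthetic signals are illustrative assumptions, not the thesis's diagnostic criteria.

```python
# Hedged sketch: peak FSR pressures per gait cycle with a simple abnormality flag.
# The ratio band and the toy signals are assumptions for illustration only.
import numpy as np

def peak_pressures(fsr_heel, fsr_toe, cycles):
    """fsr_heel, fsr_toe: 1-D arrays of sensor readings; cycles: list of
    (start, end) sample indices, one pair per gait cycle."""
    return [(fsr_heel[s:e].max(), fsr_toe[s:e].max()) for s, e in cycles]

def flag_abnormal(peaks, ratio_band=(0.6, 1.6)):
    """Flag cycles whose heel-to-toe peak ratio leaves an assumed normal band."""
    flags = []
    for heel_peak, toe_peak in peaks:
        ratio = heel_peak / (toe_peak + 1e-9)
        flags.append(not (ratio_band[0] <= ratio <= ratio_band[1]))
    return flags

if __name__ == "__main__":
    t = np.linspace(0, 4, 400)                       # four one-second gait cycles
    heel = np.clip(np.sin(2 * np.pi * t), 0, None)   # toy heel-strike pressure
    toe = np.clip(np.sin(2 * np.pi * t - 1.5), 0, None)
    cycles = [(i * 100, (i + 1) * 100) for i in range(4)]
    print(flag_abnormal(peak_pressures(heel, toe, cycles)))
```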