Browsing by Author "Rizkalla, Maher"
Now showing 1 - 10 of 91
Item A Multi-head Attention Approach with Complementary Multimodal Fusion for Vehicle Detection (2024-05) Tabassum, Nujhat; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
The advancement of autonomous vehicle technologies has taken a significant leap with the development of an improved version of the Multimodal Vehicle Detection Network (MVDNet), distinguished by the integration of a multi-head attention layer. This key enhancement significantly refines the network's capability to process and integrate multimodal sensor data, an aspect that becomes crucial in the face of challenging weather conditions. The effectiveness of this upgraded Multi-Head MVDNet is rigorously verified through an extensive dataset acquired from the Oxford Radar Robotcar, demonstrating its enhanced performance capabilities. Notably, in complex environmental conditions, the Multi-Head MVDNet shows a marked superiority in terms of Average Precision (AP) compared to existing models, underscoring its advanced detection capabilities. The transition from the traditional MVDNet to the enhanced Multi-Head Vehicle Detection Network signifies a notable breakthrough in the arena of vehicle detection technologies, with a special emphasis on operation under severe meteorological conditions, such as the obscuring presence of dense fog or the complexities introduced by heavy snowfall. This significant enhancement capitalizes on the foundational principles of the original MVDNet, which skillfully amalgamates the individual strengths of lidar and radar sensors. This is achieved through an intricate and refined process of feature tensor fusion, creating a more robust and comprehensive sensory data interpretation framework. A major innovation introduced in this updated model is the implementation of a multi-head attention layer. This layer serves as a sophisticated replacement for the previously employed self-attention mechanism.
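As a rough illustration of what a multi-head attention layer does, the sketch below splits each feature vector into equal slices, runs scaled dot-product attention on every slice independently, and concatenates the results. It is not the MVDNet implementation: the function names, the toy input, and the omission of learned query/key/value/output projections are all assumptions made for this example. Seven heads are used only because the abstract reports that count as the best configuration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        # Similarity of this query to every key, scaled by sqrt(head size).
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out


def multi_head_attention(x, num_heads):
    """Split features into num_heads slices, attend per slice, concatenate.

    Hypothetical simplification: learned projections are replaced by the
    identity, so each head attends over its raw feature slice.
    """
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    d_head = d_model // num_heads
    merged = [[] for _ in x]
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        head_out = attention(part, part, part)
        for row, piece in zip(merged, head_out):
            row.extend(piece)
    return merged


# Toy input: 3 tokens with 14 features each, so 7 heads get 2 features apiece.
tokens = [[0.1 * (i + j) for j in range(14)] for i in range(3)]
fused = multi_head_attention(tokens, num_heads=7)
```

The output has the same shape as the input, which is what lets such a layer stand in for a single-head self-attention block.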
Segmenting the attention mechanism into several distinct partitions enhances the network's efficiency and accuracy in processing and interpreting vast arrays of sensor data. An exhaustive series of experimental analyses was undertaken to determine the optimal configuration of this multi-head attention mechanism. These experiments explored various combinations and settings, ultimately identifying a configuration consisting of seven distinct attention heads as the most effective. This setup was found to optimize the balance between computational efficiency and detection accuracy. When tested using the rich radar and lidar datasets from the ORR project, this advanced Multi-Head MVDNet configuration consistently demonstrated its superiority. It not only surpassed the performance of the original MVDNet but also showed marked improvements over models that relied solely on lidar data or the DEF models, especially in terms of vehicular detection accuracy. This enhancement in the MVDNet model, with its focus on multi-head attention, not only represents a significant leap in the field of autonomous vehicle detection but also lays a foundation for future research. It opens new pathways for exploring various attention mechanisms and their potential applicability in scenarios requiring real-time vehicle detection. Furthermore, it accentuates the importance of sophisticated sensor fusion techniques as vital tools in overcoming the challenges posed by adverse environmental conditions, thus paving the way for more resilient and reliable autonomous vehicular technologies.

Item Analysis of Latent Space Representations for Object Detection (2024-08) Dale, Ashley Susan; Christopher, Lauren; King, Brian; Salama, Paul; Rizkalla, Maher
Deep Neural Networks (DNNs) successfully perform object detection tasks, and the Convolutional Neural Network (CNN) backbone is a commonly used feature extractor before secondary tasks such as detection, classification, or segmentation.
In a DNN model, the relationship between the features learned by the model from the training data and the features leveraged by the model during test and deployment has motivated the area of feature interpretability studies. The work presented here applies equally to white-box and black-box models and to any DNN architecture. The metrics developed do not require any information beyond the feature vector generated by the feature extraction backbone. These methods are therefore the first methods capable of estimating black-box model robustness in terms of latent space complexity and the first methods capable of examining feature representations in the latent space of black-box models. This work contributes the following four novel methodologies and results. First, a method for quantifying the invariance and/or equivariance of a model using the training data shows that the representation of a feature in the model impacts model performance. Second, a method for quantifying an observed domain gap in a dataset using the latent feature vectors of an object detection model is paired with pixel-level augmentation techniques to close the gap between real and synthetic data. This results in an improvement in the model's F1 score on a test set of outliers from 0.5 to 0.9. Third, a method for visualizing and quantifying similarities of the latent manifolds of two black-box models is used to correlate similar feature representation with increased success in the transferability of gradient-based attacks.
Finally, a method for examining the global complexity of decision boundaries in black-box models is presented, where more complex decision boundaries are shown to correlate with increased model robustness to gradient-based and random attacks.

Item Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images (Springer Nature, 2023-01-27) Huang, Zhi; Shao, Wei; Han, Zhi; Alkashash, Ahmad Mahmoud; De la Sancha, Carlo; Parwani, Anil V.; Nitta, Hiroaki; Hou, Yanjun; Wang, Tongxin; Salama, Paul; Rizkalla, Maher; Zhang, Jie; Huang, Kun; Li, Zaibo; Electrical and Computer Engineering, School of Engineering and Technology
Advances in computational algorithms and tools have made the prediction of cancer patient outcomes using computational pathology feasible. However, predicting clinical outcomes from pre-treatment histopathologic images remains a challenging task, limited by the poor understanding of tumor immune micro-environments. In this study, an automatic, accurate, comprehensive, interpretable, and reproducible whole slide image (WSI) feature extraction pipeline, known as IMage-based Pathological REgistration and Segmentation Statistics (IMPRESS), is described. Using both H&E and multiplex IHC (PD-L1, CD8+, and CD163+) images, we investigated whether artificial intelligence (AI)-based algorithms using automatic feature extraction methods can predict neoadjuvant chemotherapy (NAC) outcomes in HER2-positive (HER2+) and triple-negative breast cancer (TNBC) patients. Features are derived from the tumor immune micro-environment and clinical data and used to train machine learning models to accurately predict the response to NAC in breast cancer patients (HER2+ AUC = 0.8975; TNBC AUC = 0.7674). The results demonstrate that this method outperforms models trained on features that were manually generated by pathologists.
The developed image features and algorithms were further externally validated by independent cohorts, yielding encouraging results, especially for the HER2+ subtype.

Item Asset allocation in frequency and in 3 spatial dimensions for electronic warfare application (2016-04) Crespo, Jonah Greenfield; Christopher, Lauren Ann; Dos Santos, Euzeli Cipriano, Jr.; Rizkalla, Maher; Li, Lingxi; King, Brian
This paper describes two research areas applied to Particle Swarm Optimization (PSO) in an electronic warfare asset scenario. First, a three spatial dimension solution utilizing topographical data is implemented and tested against a two dimensional solution. A three dimensional (3D) optimization increases the solution space for optimization of asset location. Topography from NASA's Digital Elevation Model is also added to the solution to provide a realistic scenario. The optimization is tested for run time, average distance between receivers, average distance between receivers and paired transmitters, and transmission power. Due to load times of maps and increased iterations, the average run times increased from 123 ms to 178 ms, which remains below the 1 second target for convergence speeds. The spread distance between receivers increased from 86 km to 89 km. The distance between a receiver and its paired transmitters, as well as the total received power, did not change significantly. In the second research contribution, a user input is created and placed into an unconstrained 2D active swarm. This "human in the swarm" scenario allows a user to change keep-away boundaries during optimization. The blended human and swarm solution successfully implemented human input into a running optimization with a time delay.
The results of this research show that electronic warfare solutions with real 3D topography can be simulated with minimal computational costs over two dimensional solutions and that electronic warfare solutions can successfully optimize using human input data.

Item Bilateral and adaptive loop filter implementations in 3D-high efficiency video coding standard (2015-09) Amiri, Delaram; El-Sharkawy, Mohamed; King, Brian; Salama, Paul; Rizkalla, Maher
In this thesis, we describe a different implementation of the in-loop filtering method for 3D-HEVC. First, we propose the use of the adaptive loop filtering (ALF) technique for 3D-HEVC standard in-loop filtering. This filter uses a Wiener-based method to minimize the mean squared error between filtered and original pixels. The performance of the adaptive loop filter at the picture-based level is evaluated. Results show up to 0.2 dB PSNR improvement in the luminance component for the texture and 2.1 dB for the depth. In addition, we obtain up to 0.1 dB improvement in the chrominance component for the texture view after applying this filter in picture-based filtering. Moreover, a design of in-loop filtering with a Fast Bilateral Filter for the 3D-HEVC standard is proposed. A bilateral filter smooths an image while preserving strong edges, and it can remove artifacts in an image. Performance of the bilateral filter at the picture-based level for 3D-HEVC is evaluated. Test model HTM-6.2 is used to demonstrate the results.
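To make the edge-preserving behaviour of a bilateral filter concrete, here is a minimal brute-force sketch on a one-dimensional signal, together with a PSNR helper. It is an illustration only: it is not the Fast Bilateral Filter evaluated in the thesis and has no connection to the HTM reference software. The function names, parameter defaults, and the synthetic step signal are assumptions chosen for this example.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Brute-force bilateral filter on a 1D list of sample values."""
    out = []
    for i, center in enumerate(signal):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            # Spatial weight: nearby samples count more.
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            # Range weight: samples with similar intensity count more,
            # which is what keeps strong edges from being blurred.
            tonal = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_range ** 2))
            weighted_sum += spatial * tonal * signal[j]
            weight_total += spatial * tonal
        out.append(weighted_sum / weight_total)
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Synthetic step edge with a small alternating disturbance added.
clean = [50.0] * 8 + [200.0] * 8
noisy = [v + (3.0 if i % 2 else -3.0) for i, v in enumerate(clean)]
smoothed = bilateral_filter_1d(noisy)
```

On this toy signal the disturbance is reduced on both sides of the step while the step itself stays sharp, because samples across the edge receive a near-zero range weight.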
Results show up to a 20 percent reduction in processing time of 3D-HEVC with less than affecting PSNR of the encoded 3D video using the Fast Bilateral Filter.

Item Building a surface atlas of hippocampal subfields from high resolution T2-weighted MRI scans using landmark-free surface registration (IEEE, 2016-10) Cong, Shan; Rizkalla, Maher; Salama, Paul; Electrical and Computer Engineering, School of Engineering and Technology
The hippocampus is widely studied in the neuroimaging field as it plays important roles in memory and learning. However, the critical subfield information is often not explored in most hippocampal studies. We previously proposed a method for hippocampal subfield morphometry by integrating FreeSurfer, FSL, and SPHARM tools. But this method had some limitations, including the analysis of T1-weighted MRI scans without detailed subfield information and hippocampal registration without using important subfield information. To bridge these gaps, in this work, we propose a new framework for building a surface atlas of hippocampal subfields from high resolution T2-weighted MRI scans by integrating state-of-the-art methods for automated segmentation of hippocampal subfields and landmark-free, subfield-aware registration of hippocampal surfaces. Our experimental results have shown the promise of the new framework.

Item Building a Surface Atlas of Hippocampal Subfields from MRI Scans using FreeSurfer, FIRST and SPHARM (Institute of Electrical and Electronics Engineers, 2014-08) Cong, Shan; Rizkalla, Maher; Du, Eliza Y.; West, John; Risacher, Shannon; Saykin, Andrew J.; Shen, Li; Alzheimer's Disease Neuroimaging Initiative; Department of Medicine, IU School of Medicine
The hippocampus is widely studied with neuroimaging techniques given its importance in learning and memory and its potential as a biomarker for brain disorders such as Alzheimer's disease and epilepsy. However, its complex folding anatomy often presents analytical challenges.
In particular, the critical hippocampal subfield information is usually ignored by hippocampal registration in detailed morphometric studies. Such an approach is thus inadequate to accurately characterize hippocampal morphometry and effectively identify hippocampal structural changes related to different conditions. To bridge this gap, we present our initial effort towards building a computational framework for subfield-guided hippocampal morphometry. This initial effort is focused on surface-based morphometry and aims to build a surface atlas of hippocampal subfields. Using the FreeSurfer software package, we obtain valuable hippocampal subfield information. Using the FIRST software package, we extract reliable hippocampal surface information. Using SPHARM, we develop an approach to create an atlas by mapping interpolated subfield information onto an average surface. The empirical result using ADNI data demonstrates the promise and good reproducibility of the proposed method.

Item Capacitorless Power Electronics Converters Using Integrated Planar Electro-Magnetics (2024-08) Kanakri, Haitham; Cipriano Dos Santos, Euzeli, Jr.; Rizkalla, Maher; Li, Lingxi; King, Brian
The short lifespan of capacitors in power electronics converters is a significant challenge. These capacitors, often electrolytic, are vital for voltage smoothing and frequency filtering. However, their susceptibility to heat, ripple current, and aging can lead to premature faults. This can cause issues like output voltage instability and short circuits, ultimately resulting in catastrophic failure and system shutdown. Capacitors are responsible for 30% of power electronics failures. To tackle this challenge, scientists, researchers, and engineers are exploring various approaches detailed in technical literature. These include exploring alternative capacitor technologies, implementing active and passive cooling solutions, and developing advanced monitoring techniques to predict and prevent failures.
However, these solutions often come with drawbacks such as increased complexity, reduced efficiency, or higher upfront costs. Additionally, research in material science is ongoing to develop corrosion-resistant capacitors, but such devices are not readily available. This dissertation presents a capacitorless solution for dc-dc and dc-ac converters. The proposed solution involves harnessing parasitic elements and integrating them as intrinsic components in power converter technology. This approach holds the promise of enhancing power electronics reliability ratings, thereby facilitating breakthroughs in electric vehicles, compact power processing units, and renewable energy systems. The central scientific premise of this proposal is that the capacitance requirement in a power converter can be met by deliberately augmenting parasitic components. Our research hypothesis is that incorporating high dielectric material-based thin-films, fabricated using nanotechnology, into planar magnetics will enable the development of a family of capacitorless electronic converters that do not rely on discrete capacitors. This innovative approach represents a departure from the traditional power converter schemes employed in industry. The first family of converters introduces a novel capacitorless solid-state power filter (SSPF) for single-phase dc-ac converters. The proposed configuration, comprising a planar transformer and an H-bridge converter operating at high frequency, generates sinusoidal ac voltage without relying on capacitors. Another innovative dc-ac inverter design is the twelve-step six-level inverter, which does not incorporate capacitors in its structure. The second family of capacitorless topologies consists of non-isolated dc-dc converters, namely the buck converter and the buck-boost converter.
These converters utilize alternative materials with high dielectric constants, such as calcium copper titanate (CCTO), to intentionally enhance specific parasitic components, notably inter capacitance. This innovative approach reduces reliance on external discrete capacitors and facilitates the development of highly reliable converters. The study also includes detailed discussions on the necessary design specifications for these parasitic capacitors. Furthermore, comprehensive finite element analysis solutions and detailed circuit models are provided. A design example is presented to demonstrate the practical application of the proposed concept in electric vehicle (EV) low-voltage-side dc-dc power converters used to supply EV low-voltage loads.

Item Compressed convolutional neural network for autonomous systems (2018-12) Pathak, Durvesh; El-Sharkawy, Mohamed; Rizkalla, Maher; King, Brian
The word “Perception” seems intuitive, and perception may be the most straightforward problem for the human brain because from childhood we have been trained to classify images and detect objects, but for computers it can be a daunting task. Giving intuition and reasoning to a computer that merely accepts and processes commands is a big challenge. However, recent leaps in hardware development, sophisticated software frameworks, and mathematical techniques have made it a little less daunting, if not easy. There are various applications built around the concept of “Perception”. These applications require substantial computational resources, expensive hardware, and some sophisticated software frameworks. Building a perception application for an embedded system is an entirely different ballgame. An embedded system is a culmination of hardware, software, and peripherals developed for specific tasks with imposed constraints on memory and power.
Therefore, the applications developed should keep in mind the memory and power constraints imposed due to the nature of these systems. Before 2012, problems related to “Perception”, such as classification and object detection, were solved using algorithms with manually engineered features. However, in recent years, instead of being manually engineered, these features are learned through learning algorithms. The game-changing architecture of Convolution Neural Networks proposed in 2012 by Alex K [1] provided tremendous momentum in the direction of pushing neural networks for perception. This thesis is an attempt to develop a convolution neural network architecture for embedded systems, i.e., an architecture that has a small model size and competitive accuracy. State-of-the-art architectures are recreated using the fire module concept to reduce the model size of the architecture. The proposed compact models are feasible for deployment on embedded devices such as the Bluebox 2.0. Furthermore, attempts are made to integrate the compact Convolution Neural Network with object detection pipelines.

Item Compressed MobileNet V3: An efficient CNN for resource constrained platforms (2021-05) Prasad, S. P. Kavyashree; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Computer Vision is a mathematical tool formulated to extend human vision to machines. This tool can perform various tasks such as object classification, object tracking, motion estimation, and image segmentation. These tasks find their use in many applications, namely robotics, self-driving cars, augmented reality, and mobile applications. However, as opposed to the traditional technique of incorporating handcrafted features to understand images, convolution neural networks are being used to perform the same function. Computer vision applications widely use CNNs due to their stellar performance in interpreting images.
Over the years, there have been numerous advancements in machine learning, particularly in CNNs. However, the need to improve their accuracy, model size, and complexity increased, making their deployment in restricted environments a challenge. Many researchers proposed techniques to reduce the size of a CNN while still retaining its accuracy. A few of these include network quantization, pruning, low-rank and sparse decomposition, and knowledge distillation. Some methods developed efficient models from scratch. This thesis achieves a similar goal using design space exploration techniques on the latest variant of MobileNets, MobileNet V3. Using DPD blocks, escalation in the number of expansion filters in some layers, and the mish activation function, MobileNet V3 is reduced to 84.96% in size and made 0.2% more accurate. Furthermore, it is deployed on the NXP i.MX RT1060 for image classification on the CIFAR-10 dataset.
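For readers unfamiliar with the mish activation mentioned above, it is commonly defined as mish(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The sketch below evaluates that formula in plain Python. The function names are chosen for this example, and nothing here is taken from the thesis code.

```python
import math


def softplus(x):
    """ln(1 + e^x), written in a form that avoids overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


# Sample the activation at a few points to see its shape.
samples = {x: mish(x) for x in (-3.0, -1.0, 0.0, 1.0, 3.0)}
```

The function is smooth, passes through zero at the origin, lets small negative values through rather than clipping them, and approaches the identity for large positive inputs.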