Browsing by Author "King, Brian S."
Now showing 1 - 10 of 21
Item AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources (2021-08)
Kalgaonkar, Priyank B.; El-Sharkawy, Mohamed A.; King, Brian S.; Rizkalla, Maher E.
The research presented in this thesis proposes a new variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model compression technique, to remove redundant and insignificant elements that are either irrelevant or do not affect the performance of the network when discarded. Cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, which applies a weighting factor inversely proportional to the number of samples, have been incorporated into the design of CondenseNeXt’s algorithm to relieve the harsh effects of pruning. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets, CIFAR-10, CIFAR-100 and ImageNet, by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in the RTMaps Remote Studio console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on three benchmark datasets, CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single-model, single-crop top-5 error), with up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, although at the cost of a 2.26% loss in accuracy.
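As a rough illustration of why swapping group convolutions for depthwise separable convolutions reduces model size, the sketch below counts the weights of a single layer under each scheme. It is not taken from the thesis: the function names, the example layer shape (64 input channels, 128 output channels, 3x3 kernel, 4 groups) and the omission of bias terms are all assumptions made for illustration.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k


def group_conv_params(c_in: int, c_out: int, k: int, groups: int) -> int:
    """Weights in a grouped k x k convolution (bias terms ignored).

    Each group maps c_in/groups input channels to c_out/groups outputs.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * (c_out // groups) * k * k * groups


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    c_in, c_out, k, groups = 64, 128, 3, 4
    print("standard:           ", standard_conv_params(c_in, c_out, k))
    print("grouped (4 groups): ", group_conv_params(c_in, c_out, k, groups))
    print("depthwise separable:", depthwise_separable_params(c_in, c_out, k))
```

For this hypothetical layer the depthwise separable variant needs fewer weights than either alternative; the actual savings in CondenseNeXt depend on its real layer shapes and on pruning, which this sketch does not model.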
Thus, CondenseNeXt performs image classification on ARM-based computing platforms with outstanding efficiency, without requiring CUDA-enabled GPU support.

Item Complex Vehicle Modeling: A Data Driven Approach (2019-12)
Schoen, Alexander C.; Ben Miled, Zina; Dos Santos, Euzeli C.; King, Brian S.
This thesis proposes an artificial neural network (NN) model to predict fuel consumption in heavy vehicles. The model uses predictors derived from vehicle speed, mass, and road grade. These variables are readily available from telematics devices that are becoming an integral part of connected vehicles. The model predictors are aggregated over a fixed distance traveled (i.e., a window) instead of a fixed time interval. It was found that a 1 km window is most appropriate for the vocations studied in this thesis. Two vocations were studied: refuse trucks and delivery trucks. The proposed NN model was compared to two traditional models. The first is a parametric model similar to one found in the literature. The second is a linear regression model that uses the same features developed for the NN model. The confidence levels of the models built with these three methods were calculated in order to evaluate the models' variances. It was found that the NN models produce lower point-wise error. However, the stability of the NN models is not as high as that of the regression models. In order to improve the variance of the NN models, an ensemble based on the average of 5-fold models was created. Finally, the confidence level of each model is analyzed in order to understand how much error is expected from each model. The mean training error was used to correct the ensemble predictions for the five K-fold models.
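The distance-based windowing described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the `Sample` and `Window` record layouts, the field names, and the choice to drop a trailing partial window are assumptions, and only speed and fuel are aggregated here (mass and road grade are omitted for brevity).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sample:
    """One telematics reading (hypothetical layout)."""
    dt_s: float        # seconds elapsed since the previous reading
    speed_mps: float   # vehicle speed in metres per second
    fuel_l: float      # fuel used during this interval, in litres


@dataclass
class Window:
    """Aggregate over one fixed-distance window."""
    distance_m: float
    duration_s: float
    mean_speed_mps: float
    fuel_l: float


def aggregate_by_distance(samples: Iterable[Sample],
                          window_m: float = 1000.0) -> List[Window]:
    """Group samples into windows of roughly window_m metres travelled.

    A window closes on the first sample that brings the accumulated
    distance to window_m or beyond, so windows can slightly exceed the
    nominal length. A trailing partial window is dropped (an assumption).
    """
    windows: List[Window] = []
    dist = dur = fuel = 0.0
    for s in samples:
        dist += s.speed_mps * s.dt_s   # distance covered in this interval
        dur += s.dt_s
        fuel += s.fuel_l
        if dist >= window_m:
            windows.append(Window(dist, dur, dist / dur, fuel))
            dist = dur = fuel = 0.0
    return windows
```

Unlike fixed-time windows, every window here covers a comparable stretch of road, so a vehicle idling at a stop does not produce a run of near-empty windows.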
The ensemble K-fold model predictions are more reliable than those of the single NN and have a lower confidence interval than both the parametric and regression models.

Item Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking (2023-05)
Dai, Chao Yang; Zhang, Qingxue; King, Brian S.; Fang, Shiaofen
This thesis aims to design and develop a spatial adaptation approach that uses spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers, and analyzed how spatial transformers increase the accuracy of predictions. A neural network called Widenet has been leveraged as a specialized network for providing the parameters for the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as a strategy to enhance the learning supervision to further improve the performance of the model. Our experiments and results have shown that the proposed deep learning framework can effectively detect human keypoints, compared with the baseline methods. Also, we have reduced the model size without significantly impacting the performance, and the enhanced supervision has improved the performance. This study is expected to greatly advance the deep learning of human keypoints and movement dynamics.

Item Deep Learning of Biomechanical Dynamics With Spatial Variability Mining and Model Sparsifiation (2024-08)
Liu, Ming; Zhang, Qingxue; King, Brian S.; Ben-Miled, Zina; Xia, Yuni
Deep learning of biomechanical dynamics holds great promise for smart health and data-driven precision medicine. Biomechanical dynamics are related to human movement patterns and gait characteristics, and may provide important insights if mined by deep learning models.
However, efficient deep learning of biomechanical dynamics is still challenging, considering that the dynamics vary widely across body locations and that the deep learning model may need to be lightweight enough to be deployed in real time. Targeting these challenges, we first studied the spatial variability of biomechanical dynamics, aiming to evaluate and determine the optimal body location for robust physical activity type detection. Further, we have developed a framework for deep learning pruning, aiming to determine the optimal pruning schemes while maintaining acceptable performance. More specifically, the proposed approach first evaluates the layer importance of the deep learning model, and then leverages probabilistic distribution-enabled threshold determination to optimize the pruning rate. The weighted random thresholding method is first investigated to better understand the behavior of the pruning action for each layer. Afterwards, Gaussian-based thresholding is designed to optimize the pruning strategies more effectively; it can find fine-grained pruning schemes with both emphasis and diversity regulation. Furthermore, we have enhanced the efficient deep learning framework to co-optimize accuracy and continuity during the pruning process. Here, continuity means that the pruning locations in the weight matrices are encouraged not to leave too many noncontinuous non-pruned locations, thereby making the model easier to implement. More specifically, the proposed framework leverages significance scoring and continuity scoring to quantify the characteristics of each pruned convolutional filter, then leverages a clustering technique to group the pruned filters for each convolutional stage.
Afterwards, the regularized ranking approach is designed to rank the pruned filters, putting more emphasis on the continuity scores to encourage friendly implementation. In the end, a dual-thresholding strategy is leveraged to increase the diversity in this framework during significance and continuity co-optimization. Experimental results have demonstrated promising findings: an enhanced understanding of the spatial variability of biomechanical dynamics and selection of the best-performing body location; an effective deep learning model pruning framework that can reduce the model size significantly while maintaining performance; and a boosted framework that co-optimizes accuracy and continuity so that implementation friendliness is also considered during the pruning process. Overall, this research will greatly advance deep biomechanical mining towards efficient smart health.

Item Deep Transferable Intelligence for Wearable Big Data Pattern Detection (2021-08)
Gangadharan, Kiirthanaa; Zhang, Qingxue; King, Brian S.; Chien, Yung-Ping S.
Biomechanical Big Data is of great significance to precision health applications, among which we take special interest in Physical Activity Detection (PAD). In this study, we have performed extensive research on deep learning-based PAD from biomechanical big data, focusing on the challenges raised by the need for real-time edge inference. First, considering that there are many places where the motion sensors can be placed, we have thoroughly compared and analyzed how sensor location affects deep learning-based PAD performance. We have further compared the differences among six sensor channels (3-axis accelerometer and 3-axis gyroscope). Second, we have selected the optimal sensor and the optimal sensor channel, which not only provides sensor usage suggestions but also enables ultra-low-power applications on the edge.
Third, we have investigated innovative methods to minimize the training effort of the deep learning model, leveraging the transfer learning strategy. More specifically, we propose to pre-train a transferable deep learning model using the data from other subjects and then fine-tune the model using limited data from the target user. In this way, we have found that, for the single-channel case, transfer learning can effectively increase the deep model performance even when the fine-tuning effort is very small. This research, demonstrated by comprehensive experimental evaluation, has shown the potential of ultra-low-power PAD with a minimized sensor stream and minimized training effort.

Item Design of Ultra-Low Power FinFET Charge Pumps for Energy Harvesting Systems (2024-08)
Atluri, Mohan Krishna; Rizkalla, Maher E.; King, Brian S.; Christopher, Lauren A.
This work introduces an ultra-low-voltage charge pump for energy harvesters in biosensors. The unique aspect of the proposed charge pump is its two-level design, where the first stage elevates the voltage to a specific level, and the output voltage of this stage becomes the input voltage of the second stage. Using two levels reduces the number of stages in a charge pump and improves efficiency to get a higher voltage gain. In our measurements, this charge pump design could convert a low 85 mV input voltage to a substantial 608.2 mV output voltage, approximately 7.15 times the input voltage, with a load resistance of 7 MΩ and a 29.5% conversion efficiency.

Item Development of Automated Fault Recovery Controls for Plug-Flow Biomass Reactors (2024-05)
Jacob, Mariam; Schubert, Peter J.; Li, Lingxi; King, Brian S.
The demand for sustainable and renewable energy sources has prompted significant research and development efforts in the field of biomass gasification.
Biomass gasification technology holds significant promise for sustainable energy production, offering a renewable alternative to fossil fuels while mitigating environmental impact. This thesis presents a detailed study on the design, development, and implementation of a Plug-Flow Reactor Biomass Gasifier integrated with an Automated Auger Jam Detection System and a Blower Algorithm that maintains constant reactor pressure by varying blower speed in response to changes in reactor pressure. The system is based on indirectly-heated pyrolytic gasification technology and is developed using Simulink™. The proposed gasification system uses the principles of pyrolysis and gasification to convert biomass feedstock into syngas efficiently. An innovative plug-flow reactor configuration ensures uniform heat distribution and residence time, optimizing gasification performance and product quality. Additionally, the system incorporates an automated auger jam detection system, which uses sensor data to detect and mitigate auger jams in real time, thereby enhancing operational reliability and efficiency. By monitoring these parameters, the system detects deviations from normal operating conditions indicative of auger jams and initiates corrective actions automatically. The detection algorithm is trained using test cases and validated through detailed testing to ensure accurate and reliable performance. The MATLAB™-based implementation offers flexibility, scalability, and ease of integration with existing gasifier control systems. The graphical user interface (GUI) provides operators with real-time monitoring and visualization of system status, auger performance, and detected jam events. Additionally, the system generates alerts and notifications to inform operators of detected jams, enabling timely intervention and preventive maintenance.
To maintain consistent gasification conditions, a blower algorithm is developed to regulate airflow and maintain constant reactor pressure within the gasifier. The blower algorithm dynamically adjusts blower speed based on feedback from differential pressure sensors, ensuring optimal gasification performance under varying operating conditions. The integration of the blower algorithm into the gasification system contributes to stable syngas production and improved process control. The development of the Plug-Flow Reactor Biomass Gasifier, Automated Auger Jam Detection System, and Blower Algorithm is accompanied by rigorous simulation studies and experimental validation. Overall, this thesis contributes to the advancement of biomass gasification technology by presenting a detailed study of a plug-flow reactor biomass gasifier that combines indirectly-heated pyrolytic gasification technology with an Automated Auger Jam Detection System and a Blower Algorithm. The findings offer valuable insights for researchers, engineers, policymakers, and industry stakeholders supporting the transition towards cleaner and more renewable energy systems.

Item Dynamic electronic asset allocation comparing genetic algorithm with particle swarm optimization (2018-12)
Islam, Md Saiful; Christopher, Lauren A.; King, Brian S.; El-Sharkawy, Mohamed
The contribution of this research work can be divided into two main tasks: 1) implementing the Electronic Warfare Asset Allocation Problem (EWAAP) with the Genetic Algorithm (GA); 2) comparing the performance of the Genetic Algorithm to that of the Particle Swarm Optimization (PSO) algorithm. This research implemented the Genetic Algorithm in C++ and used Qt Data Visualization to display three-dimensional space, pheromones, and terrain. The Genetic Algorithm implementation preserved the coding style, data structures, and visualization of the PSO implementation.
Although the Genetic Algorithm has higher fitness values and better global solutions for 3 or more receivers, it increases the running time. The Genetic Algorithm is around 15-30% more accurate for asset counts from 3 to 6 but requires 26-82% more computational time. When the allocation problem complexity increases by adding 3D space, pheromones and complex terrains, the accuracy of GA is 3.71% better but GA is 121% slower than PSO. In summary, the Genetic Algorithm gives a better global solution in some cases, but its computational time is higher than that of Particle Swarm Optimization.

Item A Dynamically Configurable Discrete Event Simulation Framework for Many-Core System-on-Chips (2010)
Barnes, Christopher J.; Lee, Jaehwan John; King, Brian S.; Chien, Yung Ping Stanley
Industry trends indicate that many-core heterogeneous processors will be the next-generation answer to Moore's law and reduced power consumption. Thus, both academia and industry are focused on the challenges presented by many-core heterogeneous processor designs. In many cases, researchers use discrete event simulators to research and validate new computer architecture innovations. However, there is a lack of dynamically configurable discrete event simulation environments for the testing and development of many-core heterogeneous processors. To fulfill this need we present Mhetero, a retargetable framework for cycle-accurate simulation of heterogeneous many-core processors along with the cycle-accurate simulation of their associated network-on-chip communication infrastructure. Mhetero is the result of research into dynamically configurable and highly flexible simulation tools with which users are free to produce custom instruction sets and communication methods in a highly modular design environment.
In this thesis we discuss our approach to dynamically configurable discrete event simulation and present several experiments performed using the framework to exemplify how Mhetero, and similarly constructed simulators, may be used for future innovations.

Item Enhanced 3D Object Detection and Tracking in Autonomous Vehicles: An Efficient Multi-Modal Deep Fusion Approach (2024-08)
Kalgaonkar, Priyank B.; El-Sharkawy, Mohamed; King, Brian S.; Rizkalla, Maher E.; Abdallah, Mustafa A.
This dissertation delves into a significant challenge for Autonomous Vehicles (AVs): achieving efficient and robust perception under adverse weather and lighting conditions. Systems that rely solely on cameras face difficulties with visibility over long distances, while radar-only systems struggle to recognize features like stop signs, which are crucial for safe navigation in such scenarios. To overcome this limitation, this research introduces a novel deep camera-radar fusion approach using neural networks. This method ensures reliable AV perception regardless of weather or lighting conditions. Cameras, similar to human vision, are adept at capturing rich semantic information, whereas radars can penetrate obstacles like fog and darkness, similar to X-ray vision. The thesis presents NeXtFusion, an innovative and efficient camera-radar fusion network designed specifically for robust AV perception. Building on the efficient single-sensor NeXtDet neural network, NeXtFusion significantly enhances object detection accuracy and tracking. A notable feature of NeXtFusion is its attention module, which refines critical feature representation for object detection, minimizing information loss when processing data from both cameras and radars. Extensive experiments conducted on large-scale datasets such as Argoverse, Microsoft COCO, and nuScenes thoroughly evaluate the capabilities of NeXtDet and NeXtFusion.
The results show that NeXtFusion excels in detecting small and distant objects compared to existing methods. Notably, NeXtFusion achieves a state-of-the-art mAP score of 0.473 on the nuScenes validation set, outperforming competitors like OFT by 35.1% and MonoDIS by 9.5%. NeXtFusion's excellence extends beyond mAP scores. It also performs well in other crucial metrics, including mATE (0.449) and mAOE (0.534), highlighting its overall effectiveness in 3D object detection. Visualizations of real-world scenarios from the nuScenes dataset processed by NeXtFusion provide compelling evidence of its capability to handle diverse and challenging environments.