IU Indianapolis ScholarWorks :: Browsing by Subject "Image Classification"

Browsing by Subject "Image Classification"

AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources
(2021-08) Kalgaonkar, Priyank B.; El-Sharkawy, Mohamed A.; King, Brian S.; Rizkalla, Maher E.
The research presented in this thesis proposes CondenseNeXt, a new variant of deep convolutional neural network architecture designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model compression technique, to remove redundant and insignificant elements that are either irrelevant or do not affect the performance of the network when removed. To relieve the harsh effects of pruning, the design of CondenseNeXt’s algorithm incorporates cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, which applies a weighting factor inversely proportional to the number of samples. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets, CIFAR-10, CIFAR-100 and ImageNet, by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in the RTMaps Remote Studio console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on all three benchmark datasets: CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single-model, single-crop top-5 error), with up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, though at the cost of a 2.26% loss in accuracy. CondenseNeXt thus performs image classification on ARM-based computing platforms with outstanding efficiency and without requiring a CUDA-enabled GPU.
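The class-balanced focal loss mentioned in this abstract can be illustrated with a small sketch. It is not taken from the thesis: it assumes the commonly used "effective number of samples" form of class balancing, and the values of `beta`, `gamma`, and the per-class counts are hypothetical.

```python
import math

def class_balanced_weights(samples_per_class, beta=0.999):
    """Weight each class by the inverse of its effective number of samples.

    Assumption: effective number = (1 - beta**n) / (1 - beta), so rarer
    classes receive larger weights. beta is an illustrative value.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    # Normalise so the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def focal_loss(p_true, gamma=2.0):
    """Focal loss for one sample, given the predicted probability of its true class."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

def class_balanced_focal_loss(p_true, label, weights, gamma=2.0):
    """Scale the focal loss of one sample by the weight of its true class."""
    return weights[label] * focal_loss(p_true, gamma)

# Hypothetical, deliberately imbalanced class counts.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts)
loss_rare = class_balanced_focal_loss(0.6, label=2, weights=weights)
loss_common = class_balanced_focal_loss(0.6, label=0, weights=weights)
```

With the same predicted probability, a sample from the rare class contributes more to the loss than one from the common class, which is the rebalancing effect the abstract describes.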
Comparison of Urban Tree Canopy Classification With High Resolution Satellite Imagery and Three Dimensional Data Derived From LIDAR and Stereoscopic Sensors
(2008-08-22) Baller, Matthew Lee; Wilson, Jeffrey S. (Jeffrey Scott), 1967-; Tedesco, Lenore P.; Li, Lin
Despite growing recognition of urban tree canopy as a significant natural resource, methods for accurately estimating its extent and change over time are not well established. This study evaluates new methods and data sources for mapping urban tree canopy cover, assessing the potential for increased accuracy by integrating high-resolution satellite imagery and 3D imagery derived from LIDAR and stereoscopic sensors. The results of urban tree canopy classifications derived from imagery, 3D data, and vegetation index data are compared across multiple urban land use types in the City of Indianapolis, Indiana. Results indicate that incorporating 3D data and vegetation index data with high-resolution satellite imagery does not significantly improve overall classification accuracy. Overall classification accuracies range from 88.34% to 89.66%, with resulting overall Kappa statistics ranging from 75.08% to 78.03%, respectively. Statistically significant differences in accuracy occurred only when high-resolution satellite imagery was not included in the classification treatment and only the vegetation index data or 3D data were evaluated. Overall classification accuracy was 78.33% for both of these treatments, with resulting overall Kappa statistics of 51.36% and 52.59%.
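For readers unfamiliar with the two accuracy measures reported above, the following sketch computes overall accuracy and Cohen's Kappa from a confusion matrix. The matrix values are hypothetical and are not data from the study; the abstract reports Kappa as a percentage, whereas the function returns it as a fraction.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's Kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i that
    were assigned to class j by the classifier.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical two-class example (canopy vs. non-canopy).
matrix = [[45, 5],
          [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(matrix)
```

Because Kappa discounts the agreement expected by chance, it is always lower than overall accuracy for an imperfect classifier, which is consistent with the gap between the two figures in the abstract.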
Design Space Exploration of Convolutional Neural Networks for Image Classification
(2020-12) Shah, Prasham; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Computer vision is a domain whose goal is to make technology as capable as human vision. To that end, after decades of research, researchers have developed computer vision algorithms that work efficiently on resource-constrained hardware such as mobile or embedded devices. As a result of these efforts, such devices have become capable of tasks like image classification, object detection, object recognition, semantic segmentation, and many other applications. Autonomous systems like self-driving cars, drones and UAVs are being successfully developed because of these advances in AI. Deep learning, a part of AI, is a specific domain of machine learning that focuses on developing algorithms for such applications. Deep learning deals with tasks like extracting features from raw image data, replacing pipelines of specialized models with single end-to-end models, and making models usable for multiple tasks with superior performance. A major focus is on techniques to detect and extract features that provide better context for inference about an image or video stream. The multiple deep layers of CNN models allow a deep hierarchy of rich features to be learned and automatically extracted from images. CNNs are the backbone of computer vision. CNNs are the focus of attention among deep learning models because they were specifically designed for image data. They are complicated but very effective at extracting features from an image or a video stream. After AlexNet won the ILSVRC in 2012, research related to CNNs increased drastically. Many state-of-the-art architectures such as VGG Net, GoogLeNet, ResNet, Inception-v4, Inception-ResNet-v2, ShuffleNet, Xception, MobileNet, MobileNetV2, SqueezeNet, SqueezeNext and many more were introduced.
The research trend shows an increase in the number of CNN layers to improve performance, but with that, model size increased as well. This problem was addressed by new algorithms that decreased model size. As a result, today there are CNN models that are implemented on mobile devices. These mobile models are compact and have low latency, which in turn reduces the computational cost for the embedded system. This thesis follows a similar idea: it proposes two new CNN architectures, A-MnasNet and R-MnasNet, which have been derived from MnasNet by Design Space Exploration. These architectures outperform MnasNet in terms of model size and accuracy. They have been trained and tested on the CIFAR-10 dataset. Furthermore, they were implemented on the NXP BlueBox 2.0, an autonomous driving platform, for image classification.
Design Space Exploration of DNNs for Autonomous Systems
(2019-08) Duggal, Jayan Kant; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Developing intelligent agents that can perceive and understand the rich visual world around us has been a long-standing goal in the field of AI. Recently, CNNs/DNNs have enabled significant progress and incredible advances in a wide range of applications such as ADAS, intelligent surveillance cameras, autonomous systems, drones, and robots. Design space exploration (DSE) of NNs and other techniques have made CNNs/DNNs memory and computationally efficient. But the major design hurdles for deployment are limited resources such as computation, memory, energy efficiency, and power budget. DSE of small DNN architectures for ADAS has produced better and more efficient architectures such as baseline SqueezeNet and SqueezeNext. These architectures are known for their small model size, good model speed and model accuracy. In this thesis study, two new DNN architectures are proposed. Before presenting the proposed architectures, the DSE of DNNs explores methods to improve DNNs/CNNs, including tuning different hyperparameters and experimenting with various optimizers and newly introduced methodologies. First, the High Performance SqueezeNext architecture improves on the performance of existing DNN architectures. The intuition behind this proposed architecture is to replace convolution layers with a more sophisticated block module and to develop a compact and efficient architecture with competitive accuracy. Second, the Shallow SqueezeNext architecture is proposed, which achieves better model size results than baseline SqueezeNet and SqueezeNext. It illustrates that the architecture is compact, efficient and flexible in terms of model size and accuracy. The state-of-the-art SqueezeNet baseline and SqueezeNext baseline are used as the foundation to recreate and propose both DNN architectures in this study.
Due to their very small model size, competitive model accuracy and decent model testing speed, the proposed architectures are expected to perform well on ADAS systems. The proposed architectures are trained and tested from scratch on the CIFAR-10 [30] and CIFAR-100 [34] datasets. All the training and testing results are visualized with live loss and accuracy graphs using livelossplot. Finally, both of the proposed DNN architectures are deployed on the BlueBox 2.0 by NXP.
HBONext: An Efficient DNN for Light Edge Embedded Devices
(2021-05) Joshi, Sanket Ramesh; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Every year, the most effective deep learning models and CNN architectures are showcased based on their compatibility with and performance on embedded edge hardware, especially for applications like image classification. These deep learning models require a significant amount of computation and memory, so they can only be used on high-performance computing systems like CPUs or GPUs. However, they often struggle to meet the requirements of portable devices due to resource, energy, and real-time constraints. Hardware accelerators have recently been designed to provide the computational resources that AI and machine learning tools need. These edge accelerators have high-performance hardware that helps maintain the precision needed to accomplish this task. Furthermore, the classification problem, which investigates channel interdependencies using either depth-wise or group-wise convolutional features, has benefited from the inclusion of bottleneck modules. Because of its increasing use in portable applications, the classic inverted residual block, a well-known architectural technique, has gained more recognition. This work takes it a step further by introducing a design method for porting CNNs to low-resource embedded systems, essentially bridging the gap between deep learning models and embedded edge systems. To achieve these goals, we use closer computing strategies to reduce the computational load and memory usage while retaining excellent deployment efficiency. This thesis introduces HBONext, a mutated version of Harmonious Bottlenecks (DHbneck) combined with a Flipped version of Inverted Residual (FIR), which outperforms the current HBONet architecture in terms of accuracy and model size miniaturization. Unlike the current definition of the inverted residual, this FIR block performs identity mapping and spatial transformation at its higher dimensions.
The HBO solution, on the other hand, focuses on two orthogonal dimensions: spatial (H/W) contraction-expansion and, later, channel (C) expansion-contraction, which are both organized in a bilaterally symmetric manner. HBONext is one such variant, designed specifically for embedded and mobile applications. In this research work, we also show how to use the NXP BlueBox 2.0 to build a real-time HBONext image classifier. Integrating the model into this hardware was successful owing to the small model size of 3 MB. The model was trained and validated on the CIFAR-10 dataset and performed exceptionally well, combining a smaller size with higher accuracy. The validation accuracy of the baseline HBONet architecture is 80.97%, and the model is 22 MB in size. The proposed HBONext variants, on the other hand, gave a higher validation accuracy of 89.70% and a model size of 3.00 MB, measured using the number of parameters. The performance metrics of the HBONext architecture and its variants are compared in the following chapters.
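The abstract states that model size is measured using the number of parameters. A minimal sketch of that arithmetic follows, under two assumptions the abstract does not confirm: each parameter is stored as a 32-bit float (4 bytes), and 1 MB means 1024 × 1024 bytes. The function names are hypothetical.

```python
BYTES_PER_MB = 1024 ** 2  # assumption: binary megabytes

def model_size_mb(num_parameters, bytes_per_parameter=4):
    """Estimate model size in MB from a parameter count (float32 by default)."""
    return num_parameters * bytes_per_parameter / BYTES_PER_MB

def parameters_for_size(size_mb, bytes_per_parameter=4):
    """Invert the estimate: how many parameters fit in a given size."""
    return int(size_mb * BYTES_PER_MB / bytes_per_parameter)

# Sizes quoted in the abstract, converted under the assumptions above.
hbonext_params = parameters_for_size(3.0)    # 786432
hbonet_params = parameters_for_size(22.0)    # 5767168
```

Under these assumptions, a 3 MB model corresponds to roughly 0.79 million parameters and a 22 MB model to roughly 5.8 million; a different storage precision would change both figures proportionally.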
Mutual Learning Algorithms in Machine Learning
(2023-05) Chowdhury, Sabrina Tarin; Mukhopadhyay, Snehasis; Fang, Shiaofen; Tuceryan, Mihran
A mutual learning algorithm is a machine learning algorithm in which multiple learning agents learn from different sources and then share their knowledge among themselves so that all the agents can improve their classification and prediction accuracies simultaneously. Mutual learning can be an efficient mechanism for improving machine learning and neural network efficiency in a multi-agent system. Usually, in knowledge distillation algorithms, a big network plays the role of a static teacher and passes its knowledge to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is shown that two small networks can dynamically and interchangeably play the roles of teacher and student to share their knowledge, so that the efficiency of both networks improves simultaneously. This type of dynamic learning mechanism can be very useful in mobile environments, where resources for training with big datasets are constrained. Data exchange in a multi-agent, teacher-student network system can lead to efficient learning. The concept and the proposed mutual learning algorithm are demonstrated using convolutional neural networks (CNNs) and support vector machines (SVMs) on a pattern recognition problem using the MNIST handwriting dataset. Machine learning is applied in the field of natural language processing (NLP) too. Machines with a basic understanding of human language are becoming increasingly popular in day-to-day life. Therefore, NLP-enabled machines with memory-efficient training can potentially become an indispensable part of our lives in the near future. A classic problem in the field of NLP is news classification, where news articles from newspapers are classified into news categories by machine learning algorithms. In this thesis, we show news classification implemented using Naïve Bayes and support vector machine (SVM) algorithms.
Then we show that two small networks can dynamically play the changing roles of teacher and student to share their knowledge on news classification, so that the efficiency of both networks improves simultaneously. The mutual learning algorithm is first applied between homogeneous algorithms, i.e., between two Naïve Bayes algorithms and between two SVM algorithms. Mutual learning is then demonstrated between heterogeneous agents, i.e., between one Naïve Bayes agent and one SVM agent, and the relative efficiency increase of the agents before and after mutual learning is discussed.
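The idea of two peers alternately acting as teacher and student can be sketched with a loss in which each agent fits the true label and is also pulled toward its peer's prediction. This is one common formulation from the mutual learning literature, not necessarily the algorithm proposed in the thesis; the function names and probability values are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Supervised loss: negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; assumes every q[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def mutual_losses(probs_a, probs_b, label):
    """Return the losses of agents A and B for one sample.

    Each agent is a student of the label and of its peer: A is penalised
    for diverging from B's prediction, and B for diverging from A's.
    """
    loss_a = cross_entropy(probs_a, label) + kl_divergence(probs_b, probs_a)
    loss_b = cross_entropy(probs_b, label) + kl_divergence(probs_a, probs_b)
    return loss_a, loss_b

# Hypothetical predictions of two agents for a sample of class 0.
agent_a = [0.7, 0.2, 0.1]
agent_b = [0.5, 0.3, 0.2]
loss_a, loss_b = mutual_losses(agent_a, agent_b, label=0)
```

When both agents already agree, the divergence terms vanish and each loss reduces to the ordinary supervised loss, so the peer term only acts where the agents have something to teach each other.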
Residual Capsule Network
(2019-08) Bhamidi, Sree Bala Shruthi; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Convolutional Neural Networks (CNNs) have brought substantial improvements to the field of machine learning, but they come with their own set of drawbacks. Capsule Networks have addressed the limitations of CNNs and have shown a great improvement by calculating the pose and transformation of the image. Deeper networks are more powerful than shallow networks but, at the same time, more difficult to train. Residual Networks ease the training and have shown that they can give good accuracy at considerable depth. Combining the strengths of Capsule Networks and Residual Networks, we present the Residual Capsule Network and the 3-Level Residual Capsule Network. The conventional convolutional layer in the Capsule Network is replaced by skip connections, as in Residual Networks, to decrease the complexity of the baseline Capsule Network and the seven-ensemble Capsule Network. We trained our models on the MNIST and CIFAR-10 datasets and have seen a significant decrease in the number of parameters compared to the baseline models.
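The skip connection borrowed from Residual Networks can be shown in a few lines. This is a generic sketch of the idea, not the architecture of the thesis; `residual_block` and the toy transform are hypothetical.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise.

    The skip connection adds the block's input back to its output, so the
    transform only needs to learn a correction to the identity mapping.
    """
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]

def toy_transform(x):
    """Stand-in for a learned layer: scales every feature by 0.1."""
    return [0.1 * xi for xi in x]

features = [1.0, -2.0, 3.0]
output = residual_block(features, toy_transform)
```

If the transform outputs zeros, the block passes its input through unchanged, which is why stacking such blocks makes deep networks easier to train.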
