Browsing by Subject "design space exploration"
Item: High Performance SqueezeNext for CIFAR-10 (IEEE, 2019-07)
Duggal, Jayan Kant; El-Sharkawy, Mohamed; Electrical and Computer Engineering, School of Engineering and Technology
CNNs are the foundation of the deep learning and computer vision domains, enabling applications such as autonomous driving, face recognition, and automatic radiology image reading. However, CNNs are memory- and computation-intensive algorithms. Design space exploration (DSE) of neural networks, together with compression techniques, has made convolutional neural networks more memory- and computation-efficient, improving CNN architectures and making them better suited to real-time embedded systems. This paper proposes an efficient and compact CNN to improve on the performance of existing CNN architectures. The intuition behind the proposed architecture is to supplant convolution layers with a more sophisticated block module and to develop a compact architecture with competitive accuracy. The paper further explores the bottleneck module and the SqueezeNext basic block structure. The state-of-the-art SqueezeNext baseline architecture is used as a foundation to recreate and propose a high performance SqueezeNext architecture, which is then trained on the CIFAR-10 dataset from scratch; all training and testing results are visualized with live loss and accuracy graphs. The focus of this paper is an adaptable, flexible model for efficient CNN performance with a minimal trade-off between model accuracy, size, and speed. Finally, it concludes that CNN performance can be improved by developing an architecture for a specific dataset.

Item: Image Classification on NXP i.MX RT1060 using Ultra-thin MobileNet DNN (IEEE, 2020-01)
Desai, Saurabh Ravindra; Sinha, Debjyoti; El-Sharkawy, Mohamed; Electrical and Computer Engineering, School of Engineering and Technology
Deep neural networks play a very significant role in computer vision applications such as image classification, object recognition, and object detection. They have achieved great success in this field, but the main obstacles to deploying a DNN model on an Advanced Driver Assistance System (ADAS) platform are limited memory, constrained resources, and limited power. MobileNet is a very efficient, lightweight DNN model developed mainly for embedded and computer vision applications, yet researchers still face many constraints and challenges in deploying it on resource-constrained microprocessor units. Design space exploration of such CNN models can make them more memory-efficient and less computationally intensive. We used the design space exploration technique to modify the baseline MobileNet V1 model and develop an improved version of it. This paper proposes seven modifications to the existing baseline architecture to develop a new, more efficient model. We use separable convolution layers and the width multiplier hyperparameter, alter the channel depth, and eliminate layers with the same output shape to reduce the size of the model. We achieve good overall accuracy by using the Swish activation function, the Random Erasing technique, and a well-chosen optimizer. We call the new model Ultra-thin MobileNet: it has a much smaller size, fewer parameters, less average computation time per epoch, and negligible overfitting, with slightly higher accuracy than the baseline MobileNet V1. Generally, when an existing model is made more compact, its accuracy decreases; here, however, there is no trade-off between accuracy and model size. The proposed model was developed with the intent of deployment on a real-time autonomous development platform with limited memory and power, keeping the model size within 5 MB. It was successfully deployed on the NXP i.MX RT1060 ADAS platform thanks to its small model size of 3.9 MB, and it classifies images of different classes in real time with an accuracy of more than 90% on that platform. We trained and tested the proposed architecture from scratch on the CIFAR-10 dataset.
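The bottleneck module and SqueezeNext basic block named in the first abstract can be illustrated in code. Below is a minimal PyTorch sketch of a SqueezeNext-style block; the channel-reduction ratios and the 3x1/1x3 separable pair follow the original SqueezeNext paper, and the class name and exact ratios are illustrative assumptions, not the authors' implementation.

```python
# Minimal, hypothetical sketch of a SqueezeNext-style basic block (PyTorch).
# Channel ratios follow the original SqueezeNext paper, not necessarily the
# high performance variant proposed in the abstract above.
import torch
import torch.nn as nn

class SqueezeNextBlock(nn.Module):
    """Two 1x1 bottleneck convs, a 3x1/1x3 separable pair, a 1x1
    expansion, and an identity shortcut."""

    def __init__(self, channels: int):
        super().__init__()
        mid = channels // 2

        def cbr(cin, cout, kernel, pad):
            # Conv -> BatchNorm -> ReLU helper.
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel, padding=pad, bias=False),
                nn.BatchNorm2d(cout),
                nn.ReLU(inplace=True),
            )

        self.body = nn.Sequential(
            cbr(channels, mid, 1, 0),            # 1x1 reduce to C/2
            cbr(mid, mid // 2, 1, 0),            # 1x1 reduce to C/4
            cbr(mid // 2, mid, (3, 1), (1, 0)),  # 3x1 conv
            cbr(mid, mid, (1, 3), (0, 1)),       # 1x3 conv
            cbr(mid, channels, 1, 0),            # 1x1 expand back to C
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x) + x  # residual shortcut preserves shape

# Shape check on a CIFAR-10-sized input.
if __name__ == "__main__":
    y = SqueezeNextBlock(64)(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Splitting the 3x3 convolution into a 3x1/1x3 pair is what makes the block cheap: it cuts the kernel parameter count while the two 1x1 bottlenecks keep the spatial convolutions operating on few channels.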
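Several of the seven modifications named in the second abstract (separable convolutions, the width multiplier, and the Swish activation) can be sketched as a single building block. The following PyTorch sketch is a hedged illustration: the value alpha=0.5 and the class name are assumptions, since the abstract does not give the exact width multiplier used.

```python
# Hypothetical depthwise-separable block with a width multiplier and Swish,
# illustrating three of the Ultra-thin MobileNet modifications described above.
import torch
import torch.nn as nn

class SeparableBlock(nn.Module):
    """Depthwise 3x3 conv + pointwise 1x1 conv; channel counts are scaled
    by the width multiplier alpha (0.5 here is an assumed example value)."""

    def __init__(self, in_ch: int, out_ch: int,
                 alpha: float = 0.5, stride: int = 1):
        super().__init__()
        in_ch = max(1, int(in_ch * alpha))
        out_ch = max(1, int(out_ch * alpha))
        # groups=in_ch makes the 3x3 conv depthwise (one filter per channel).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1,
                                   groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()  # Swish: x * sigmoid(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))
```

Every block in the network must be built with the same alpha so that channel counts line up from layer to layer. The remaining technique mentioned, Random Erasing, is a data augmentation applied in the input pipeline (e.g. torchvision.transforms.RandomErasing) rather than inside the model.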
Item: Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size (Science, 2020-12)
Duggal, Jayan Kant; El-Sharkawy, Mohamed; Electrical and Computer Engineering, School of Engineering and Technology
The significant challenges in deploying CNNs/DNNs on ADAS platforms are limited computation and memory resources and very limited efficiency. Design space exploration of CNNs/DNNs, training and testing the DNN from scratch, hyperparameter tuning, and implementation with different optimizers all contributed to the efficiency and performance improvements of the Shallow SqueezeNext architecture. The architecture is computationally efficient, inexpensive, and requires minimal memory resources, and it achieves better model size and speed than counterparts such as AlexNet, VGGNet, SqueezeNet, and SqueezeNext, all trained and tested from scratch on datasets such as CIFAR-10 and CIFAR-100. It can achieve a minimum model size of 272 KB with a model accuracy of 82% and a model speed of 9 seconds per epoch when tested on the CIFAR-10 dataset; its best results are an accuracy of 91.41%, a model size of 0.272 MB, and a model speed of 4 seconds per epoch. Memory resources are critical for real-time systems and platforms because memory is usually quite limited. To verify that Shallow SqueezeNext can be successfully deployed on a real-time platform, the BlueBox 2.0 by NXP was used. The BlueBox 2.0 deployment of the Shallow SqueezeNext architecture achieved a model accuracy of 90.50%, a model size of 8.72 MB, and a model speed of 22 seconds per epoch. Another version of Shallow SqueezeNext performed better still, attaining a model size of 0.5 MB with a model accuracy of 87.30% and a model speed of 11 seconds per epoch, trained and tested from scratch on the CIFAR-10 dataset.
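Because the abstract above reports serialized model sizes (272 KB, 0.272 MB, 8.72 MB) and compares different optimizers, a small sketch of both measurements follows, assuming PyTorch (the abstract does not name the framework). The helper names model_size_kb and make_optimizer are illustrative, and the optimizer list is an assumption rather than the authors' exact sweep.

```python
# Hypothetical helpers: check a serialized model-size budget and build
# optimizers for a from-scratch CIFAR-10 sweep, as discussed above.
import io

import torch
import torch.nn as nn

def model_size_kb(model: nn.Module) -> float:
    """Serialize the state dict to an in-memory buffer, return size in KB."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    return buf.getbuffer().nbytes / 1024

def make_optimizer(name: str, model: nn.Module, lr: float = 0.1):
    """Return one of several optimizers commonly compared in such studies."""
    if name == "sgd":
        return torch.optim.SGD(model.parameters(), lr=lr,
                               momentum=0.9, weight_decay=5e-4)
    if name == "adam":
        return torch.optim.Adam(model.parameters(), lr=lr * 0.01)
    if name == "rmsprop":
        return torch.optim.RMSprop(model.parameters(), lr=lr * 0.01)
    raise ValueError(f"unknown optimizer: {name}")

# Example: a model meeting the 272 KB budget would report <= 272 here.
# print(model_size_kb(my_model))
```

Measuring the serialized state dict, rather than counting parameters alone, matches how an on-disk figure like 272 KB is usually reported, since it includes batch-norm statistics and any storage overhead.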