Enhanced Multiple Dense Layer EfficientNet

Mohan, Aswathy

Files

Thesis_report_Aswathy_Mohan_2024_Final.pdf (2.52 MB)

Date

2024-08

Authors

Mohan, Aswathy

Language

American English

Committee Chair

El-Sharkawy, Mohamed

Committee Members

King, Brian
Rizkalla, Maher

Degree

M.S.E.C.E.

Degree Year

2024

Department

Electrical & Computer Engineering

Grantor

Purdue University

Abstract

In the dynamic and ever-evolving landscape of Artificial Intelligence (AI), the domain of deep learning has emerged as a pivotal force, propelling advancements across a broad spectrum of applications, notably in the intricate field of image classification. Image classification, a critical task that involves categorizing images into predefined classes, serves as the backbone for numerous cutting-edge technologies, including, but not limited to, automated surveillance, facial recognition systems, and advanced diagnostics in healthcare. Despite the significant strides made in the area, the quest for models that not only excel in accuracy but also demonstrate robust generalization across varied datasets and maintain resilience against the pitfalls of overfitting remains a formidable challenge. EfficientNetB0, a model celebrated for its optimized balance between computational efficiency and accuracy, stands at the forefront of solutions addressing these challenges. However, the nuanced complexities of datasets such as CIFAR-10, characterized by its diverse array of images spanning ten distinct categories, call for specialized adaptations to harness the full potential of such sophisticated architectures. In response, this thesis introduces an optimized version of the EfficientNetB0 architecture, meticulously enhanced with strategic architectural modifications, including the incorporation of an additional Dense layer with 512 units and the strategic use of Dropout regularization. These adjustments are designed to amplify the model’s capacity for learning and interpreting complex patterns inherent in the data. Complementing these architectural refinements, a nuanced two-phase training methodology is also adopted in the proposed model. This approach commences with an initial phase of training in which the base model’s pre-trained weights are frozen, thus leveraging the power of transfer learning to secure a solid foundational understanding.
The subsequent phase of fine-tuning, characterized by the selective unfreezing of layers, meticulously calibrates the model to the intricacies of the CIFAR-10 dataset. This is further bolstered by the implementation of adaptive learning rate adjustments, ensuring the model’s training process is both efficient and responsive to the nuances of the learning curve. Through a comprehensive suite of evaluations, encompassing accuracy assessments, confusion matrices, and detailed classification reports, the proposed model demonstrates a notable improvement in performance. The insights gleaned from this research not only shed light on the mechanisms underpinning successful image classification models but also chart a course for future work aimed at bridging the gap between theoretical models and their practical applications. This research encapsulates the iterative process of model enhancement, providing a beacon for future endeavors in the quest for optimal image classification solutions.
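The two-phase schedule and the adaptive learning rate described in the abstract can be illustrated with a small, framework-free sketch. It is not the thesis implementation: the abstract does not say which layers are unfrozen or which learning rate rule is used, so the reduce-on-plateau rule, the `factor`, `patience`, and `min_lr` values, the layer names, and the helper names `PlateauScheduler` and `trainable_layers` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlateauScheduler:
    """Reduce-on-plateau rule (an assumed form of 'adaptive learning rate')."""
    lr: float
    factor: float = 0.5    # assumed: halve the rate on a plateau
    patience: int = 2      # assumed: epochs without improvement before reducing
    min_lr: float = 1e-6   # assumed: lower bound on the rate
    best: float = float("inf")
    wait: int = 0

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the rate to use next."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


def trainable_layers(layer_names, phase, unfreeze_last=2):
    """Return the layers that receive gradient updates in a given phase.

    Phase 1: the pre-trained base is frozen, only the new head trains.
    Phase 2: the last `unfreeze_last` base layers are unfrozen as well.
    Head layers are identified by the hypothetical prefix "head_".
    """
    base = [n for n in layer_names if not n.startswith("head_")]
    head = [n for n in layer_names if n.startswith("head_")]
    if phase == 1:
        return head
    if phase == 2:
        start = max(0, len(base) - unfreeze_last)
        return base[start:] + head
    raise ValueError("phase must be 1 or 2")


# Hypothetical layer names standing in for EfficientNetB0 blocks and the new head.
layers = ["block5", "block6", "block7", "head_dense_512", "head_output"]
phase1 = trainable_layers(layers, phase=1)
phase2 = trainable_layers(layers, phase=2)

# Made-up validation losses, used only to exercise the scheduler.
scheduler = PlateauScheduler(lr=1e-3)
rates = [scheduler.step(loss) for loss in [1.0, 0.8, 0.9, 0.85, 0.7]]
```

In this sketch `phase1` contains only the head layers, `phase2` adds the last two base blocks, and the rate is halved once after two consecutive epochs without improvement. In a Keras workflow the same roles would typically be played by each layer's `trainable` attribute and a callback such as `ReduceLROnPlateau`, though whether the thesis uses those is not stated in the abstract.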

Description

Indiana University-Purdue University Indianapolis (IUPUI)

Keywords

Neural Network, Computer Vision, Machine Learning

Rights

CC0 1.0 Universal

Type

Thesis

Permanent Link

https://hdl.handle.net/1805/43113

Collections

Electrical & Computer Engineering Department Theses and Dissertations
