Simplicity of Kmeans versus Deepness of Deep Learning: A Case of Unsupervised Feature Learning with Limited Data

Date
2015-12
Language
English
Found At
IEEE
Abstract

We use a bio-detection application as a case study to demonstrate that Kmeans-based unsupervised feature learning can be a simple yet effective alternative to deep learning techniques for small data sets with limited intra-class as well as inter-class diversity. We investigate how classifier performance is affected by data augmentation and by feature extraction with multiple patch sizes and at different image scales. Our data set includes 1833 images from four different classes of bacteria, with each bacterial culture captured at three different wavelengths and the overall data collected during a three-day period. The limited number and diversity of images, potential random effects across multiple days, and the multi-mode nature of class distributions pose a challenging setting for representation learning. Using images collected on the first day for training, on the second day for validation, and on the third day for testing, Kmeans-based representation learning achieves 97% classification accuracy on the test data. This compares very favorably to the 56% accuracy achieved by deep learning and the 74% accuracy achieved by handcrafted features. Our results suggest that data augmentation or dropping connections between units offers little help for deep-learning algorithms, whereas a significant boost can be achieved for Kmeans-based representation learning by augmenting data and by concatenating features obtained at multiple patch sizes or image scales.
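
The abstract does not spell out the pipeline, so the following is only a minimal sketch of one common way to do Kmeans-based feature learning with several patch sizes, not the authors' implementation. It assumes grayscale images stored as nested lists, per-patch mean and variance normalization, plain Lloyd iterations, and a hard-assignment histogram encoding. All function names, patch sizes, and dictionary sizes are hypothetical, and the code uses only the Python standard library to stay self-contained.

```python
import random

def patch_at(image, r, c, size):
    """Flatten the size x size patch whose top-left corner is (r, c)."""
    return [image[r + i][c + j] for i in range(size) for j in range(size)]

def normalize(patch):
    """Subtract the patch mean and divide by its standard deviation."""
    mean = sum(patch) / len(patch)
    var = sum((v - mean) ** 2 for v in patch) / len(patch)
    std = (var + 1e-8) ** 0.5  # small constant avoids division by zero
    return [(v - mean) / std for v in patch]

def sample_patches(image, size, count, rng):
    """Draw `count` random normalized patches from one image."""
    h, w = len(image), len(image[0])
    return [normalize(patch_at(image, rng.randrange(h - size + 1),
                               rng.randrange(w - size + 1), size))
            for _ in range(count)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, iters, rng):
    """Plain Lloyd's algorithm; returns k centroids."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def encode(image, centroids, size):
    """Normalized histogram of nearest-centroid assignments over all patches."""
    h, w = len(image), len(image[0])
    hist = [0.0] * len(centroids)
    total = 0
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            hist[nearest(normalize(patch_at(image, r, c, size)), centroids)] += 1
            total += 1
    return [v / total for v in hist]

def multi_size_features(image, dictionaries):
    """Concatenate the encodings obtained with each patch size."""
    features = []
    for size, centroids in dictionaries.items():
        features.extend(encode(image, centroids, size))
    return features

# Toy usage on random 12x12 "images"; real data would replace these.
rng = random.Random(0)
images = [[[rng.random() for _ in range(12)] for _ in range(12)] for _ in range(4)]
dictionaries = {}
for size in (3, 5):  # hypothetical patch sizes
    patches = [p for img in images for p in sample_patches(img, size, 30, rng)]
    dictionaries[size] = kmeans(patches, k=4, iters=5, rng=rng)
features = multi_size_features(images[0], dictionaries)
```

Here each image ends up with one histogram per patch size, and the histograms are concatenated into a single feature vector that a separate supervised classifier (not shown) would consume. Concatenating features across image scales would follow the same pattern, with rescaled copies of the image in place of different patch sizes.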

Cite As
Dundar, M., Kou, Q., Zhang, B., He, Y., & Rajwa, B. (2015). Simplicity of Kmeans Versus Deepness of Deep Learning: A Case of Unsupervised Feature Learning with Limited Data. In 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA) (pp. 883–888). https://doi.org/10.1109/ICMLA.2015.78
Journal
Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
Type
Conference proceedings
Version
Author's manuscript