A nonparametric Bayesian perspective for machine learning in partially-observed settings

dc.contributor.advisor: Dundar, Mehmet Murat
dc.contributor.author: Akova, Ferit
dc.contributor.other: Qi, Yuan Alan
dc.date.accessioned: 2014-07-31T17:30:03Z
dc.date.available: 2014-07-31T17:30:03Z
dc.date.issued: 2014-07-31
dc.degree.date: 2013 [en_US]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: Ph.D. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Robustness and generalizability of supervised learning algorithms depend on the quality of the labeled data set in representing the real-life problem. In many real-world domains, however, we may not have full knowledge of the underlying data-generating mechanism, which may even have an evolving nature, introducing new classes continually. This constitutes a partially-observed setting, where it would be impractical to obtain a labeled data set exhaustively defined by a fixed set of classes. Traditional supervised learning algorithms, assuming an exhaustive training library, would misclassify a future sample of an unobserved class with probability one, leading to an ill-defined classification problem. Our goal is to address situations where this assumption is violated by a non-exhaustive training library, a realistic yet overlooked issue in supervised learning. In this dissertation, we pursue a new direction for supervised learning by defining self-adjusting models to relax the fixed model assumption imposed on classes and their distributions. We let the model adapt itself to the prospective data by dynamically adding new classes/components as the data demand, which in turn gradually makes the model more representative of the entire population. In this framework, we first employ suitably chosen nonparametric priors to model class distributions for observed as well as unobserved classes, and then utilize new inference methods to classify samples from observed classes and to discover and model novel classes for samples from unobserved classes. This thesis presents the initiating steps of an ongoing effort to address one of the most overlooked bottlenecks in supervised learning and indicates the potential for taking new perspectives in some of the most heavily studied areas of machine learning: novelty detection, online class discovery and semi-supervised learning. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/4825
dc.identifier.uri: http://dx.doi.org/10.7912/C2/2316
dc.language.iso: en_US [en_US]
dc.subject: Nonparametric Bayesian; Nonexhaustive; Supervised; Semi-supervised [en_US]
dc.subject.lcsh: Bayesian statistical decision theory -- Research -- Analysis -- Evaluation [en_US]
dc.subject.lcsh: Statistical decision [en_US]
dc.subject.lcsh: Supervised learning (Machine learning) -- Research [en_US]
dc.subject.lcsh: Nonparametric statistics -- Research [en_US]
dc.subject.lcsh: Mathematical statistics [en_US]
dc.subject.lcsh: Stochastic processes [en_US]
dc.subject.lcsh: Boosting (Algorithms) [en_US]
dc.subject.lcsh: Statistics -- Data processing [en_US]
dc.subject.lcsh: Machine learning [en_US]
dc.subject.lcsh: Mathematical statistics -- Data processing [en_US]
dc.subject.lcsh: Discourse analysis -- Statistical methods [en_US]
dc.subject.lcsh: Computational linguistics [en_US]
dc.subject.lcsh: Data mining [en_US]
dc.subject.lcsh: Computational intelligence [en_US]
dc.title: A nonparametric Bayesian perspective for machine learning in partially-observed settings [en_US]
dc.type: Thesis [en]
thesis.degree.discipline: Computer & Information Science [en]
Files
Original bundle
Name: Ferit_Akova_PhD_dissertation.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format
Description: Main document
License bundle
Name: license.txt
Size: 1.88 KB
Format: Item-specific license agreed upon to submission
Description: