Batch Discovery of Recurring Rare Classes toward Identifying Anomalous Samples

dc.contributor.author: Dundar, Murat
dc.contributor.author: Yerebakan, Halid Ziya
dc.contributor.author: Rajwa, Bartek
dc.contributor.department: Computer and Information Science, School of Science
dc.date.accessioned: 2023-10-24T16:13:35Z
dc.date.available: 2023-10-24T16:13:35Z
dc.date.issued: 2014
dc.description.abstract: We present a clustering algorithm for discovering rare yet significant recurring classes across a batch of samples in the presence of random effects. We model each sample's data by an infinite mixture of Dirichlet-process Gaussian-mixture models (DPMs), with each DPM representing the noisy realization of its corresponding class distribution in a given sample. We introduce dependencies across multiple samples by placing a global Dirichlet process prior over individual DPMs. This hierarchical prior introduces a sharing mechanism across samples and allows for identifying local realizations of classes across samples. We use a collapsed Gibbs sampler for inference to recover local DPMs and identify their class associations. We demonstrate the utility of the proposed algorithm by processing a flow cytometry data set containing two extremely rare cell populations, and report results that significantly outperform competing techniques. The source code of the proposed algorithm is available at http://cs.iupui.edu/~dundar/aspire.htm.
dc.eprint.version: Author's manuscript
dc.identifier.citation: Dundar M, Yerebakan HZ, Rajwa B. Batch Discovery of Recurring Rare Classes toward Identifying Anomalous Samples. KDD. 2014;2014:223-232. doi:10.1145/2623330.2623695
dc.identifier.uri: https://hdl.handle.net/1805/36608
dc.language.iso: en_US
dc.publisher: ACM
dc.relation.isversionof: 10.1145/2623330.2623695
dc.relation.journal: KDD
dc.rights: Publisher Policy
dc.source: PMC
dc.subject: Hierarchical Dirichlet process
dc.subject: Random effects
dc.subject: Batch clustering
dc.subject: Recurring classes
dc.subject: Rare classes
dc.subject: Anomaly detection
dc.title: Batch Discovery of Recurring Rare Classes toward Identifying Anomalous Samples
dc.type: Article
Files
Original bundle
Name: nihms-1875696.pdf
Size: 842.47 KB
Format: Adobe Portable Document Format