Computational Analysis of Flow Cytometry Data
Abstract
The objective of this thesis is to compare automated methods for analyzing flow cytometry data. Flow cytometry is an important and efficient tool for analyzing the characteristics of cells. It is used in several fields, including immunology, pathology, marine biology, and molecular biology. A flow cytometer measures light scattered by cells and fluorescence emitted by dyes attached to them. Analysis of these measurements involves two main tasks. The first, compensation, adjusts the measured fluorescence to correct for overlap among the emission spectra of the fluorescent markers used to characterize a cell’s chemical characteristics. The second uses the amount of each marker present in a cell to identify its phenotype. Several methods are compared for each task. For compensation, the Unconstrained Least Squares, Orthogonal Subspace Projection, Fully Constrained Least Squares, and Fully Constrained One Norm methods are compared. The Fully Constrained Least Squares method gives the best overall results in terms of accuracy and running time. For classification, Spectral Clustering, Gaussian Mixture Modeling, Naive Bayes classification, Support Vector Machines, and Expectation Maximization using a Gaussian mixture model are used to classify cells based on the amounts of dyes present in each cell. The generative models created by the Naive Bayes and Gaussian Mixture Modeling methods classified cells most accurately. These supervised methods may be the most useful when online classification is necessary, such as in cell sorting applications of flow cytometers. Unsupervised methods may be used to replace manual analysis completely when no training data are available. Of the unsupervised methods considered, Expectation Maximization combined with a cluster-merging post-processing step gives the best results.
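As a rough illustration of the compensation task described above, the following sketch applies unconstrained least squares to a single simulated cell. It is not taken from the thesis: the spillover matrix values, the dye amounts, and the names `SPILLOVER`, `TRUE_AMOUNTS`, and `unmix_least_squares` are all hypothetical, chosen only to show the idea. It assumes a linear mixing model in which each detector reading is a weighted sum of the dye amounts. The constrained variants named in the abstract are not shown.

```python
# Minimal sketch of unconstrained least-squares compensation (unmixing).
# Assumption: measured = SPILLOVER @ amounts, with more detectors than dyes.
# All numbers below are invented for illustration.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

def unmix_least_squares(spillover, measured):
    """Estimate dye amounts by solving the normal equations
    (S^T S) a = S^T y."""
    s_t = transpose(spillover)
    gram = matmul(s_t, spillover)
    rhs = [sum(s * y for s, y in zip(row, measured)) for row in s_t]
    return solve(gram, rhs)

# Rows are detectors, columns are dyes (hypothetical values).
SPILLOVER = [
    [1.00, 0.15],
    [0.20, 1.00],
    [0.05, 0.30],
]
TRUE_AMOUNTS = [100.0, 50.0]

# Simulate a noise-free measurement for one cell, then recover the amounts.
measured = [sum(s * a for s, a in zip(row, TRUE_AMOUNTS)) for row in SPILLOVER]
estimated = unmix_least_squares(SPILLOVER, measured)
print(estimated)
```

Because nothing constrains the solution, noisy measurements can yield negative dye amounts with this approach, which is one reason to consider constrained alternatives.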