Browsing by Subject "Support vector machines"

Item: 3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties (IEEE, 2021)
Mahdi, Soha Sadat; Nauwelaers, Nele; Joris, Philip; Bouritsas, Giorgos; Gong, Shunwang; Bokhnyak, Sergiy; Walsh, Susan; Shriver, Mark D.; Bronstein, Michael; Claes, Peter; Biology, School of Science
Face recognition is a widely accepted biometric verification tool, as the face contains a lot of information about the identity of a person. In this study, a 2-step neural-based pipeline is presented for matching 3D facial shape to multiple DNA-related properties (sex, age, BMI and genomic background). The first step consists of a triplet loss-based metric learner that compresses facial shape into a lower dimensional embedding while preserving information about the property of interest. Most studies in the field of metric learning have only focused on 2D Euclidean data. In this work, geometric deep learning is employed to learn directly from 3D facial meshes. To this end, spiral convolutions are used along with a novel mesh-sampling scheme that retains uniformly sampled 3D points at different levels of resolution. The second step is a multi-biometric fusion by a fully connected neural network. The network takes an ensemble of embeddings and property labels as input and returns genuine and imposter scores. Since embeddings are accepted as an input, there is no need to train classifiers for the different properties and available data can be used more efficiently. Results obtained by a 10-fold cross-validation for biometric verification show that combining multiple properties leads to stronger biometric systems.
Furthermore, the proposed neural-based pipeline outperforms a linear baseline, which consists of principal component analysis, followed by classification with linear support vector machines and a Naïve Bayes-based score-fuser.

Item: Computational Analysis of Flow Cytometry Data (2013-07-12)
Irvine, Allison W.; Dundar, Murat; Tuceryan, Mihran; Mukhopadhyay, Snehasis; Fang, Shiaofen
The objective of this thesis is to compare automated methods for performing analysis of flow cytometry data. Flow cytometry is an important and efficient tool for analyzing the characteristics of cells. It is used in several fields, including immunology, pathology, marine biology, and molecular biology. Flow cytometry measures light scatter from cells and fluorescent emission from dyes which are attached to cells. There are two main tasks that must be performed. The first is the adjustment of measured fluorescence from the cells to correct for the overlap of the spectra of the fluorescent markers used to characterize a cell’s chemical characteristics. The second is to use the amount of markers present in each cell to identify its phenotype. Several methods are compared to perform these tasks. The Unconstrained Least Squares, Orthogonal Subspace Projection, Fully Constrained Least Squares and Fully Constrained One Norm methods are used to perform compensation and compared. The fully constrained least squares method of compensation gives the overall best results in terms of accuracy and running time. Spectral Clustering, Gaussian Mixture Modeling, Naive Bayes classification, Support Vector Machine and Expectation Maximization using a Gaussian mixture model are used to classify cells based on the amounts of dyes present in each cell. The generative models created by the Naive Bayes and Gaussian mixture modeling methods performed classification of cells most accurately.
These supervised methods may be the most useful when online classification is necessary, such as in cell sorting applications of flow cytometers. Unsupervised methods may be used to completely replace manual analysis when no training data is given. Expectation Maximization combined with a cluster merging post-processing step gives the best results of the unsupervised methods considered.

Item: Optimizing hydropathy scale to improve IDP prediction and characterizing IDPs' functions (2014-01)
Huang, Fei; Dunker, A. Keith; Chen, Jake; Hurley, Thomas D., 1961-; Shen, Li
Intrinsically disordered proteins (IDPs) are flexible proteins without defined 3D structures. Studies show that IDPs are abundant in nature and actively involved in numerous biological processes. Two crucial subjects in the study of IDPs lie in analyzing IDPs’ functions and identifying them. We thus carried out three projects to better understand IDPs. In the 1st project, we propose a method that separates IDPs into different function groups. We used the approach of CH-CDF plot, which is based on the combined use of two predictors and subclassifies proteins into 4 groups: structured, mixed, disordered, and rare. Studies show different structural biases for each group. The mixed class has more order-promoting residues and more ordered regions than the disordered class. In addition, the disordered class is highly active in mitosis-related processes among others. Meanwhile, the mixed class is highly associated with signaling pathways, where having both ordered and disordered regions could possibly be important. The 2nd project is about identifying if an unknown protein is entirely disordered. One of the earliest predictors for this purpose, the charge-hydropathy plot (C-H plot), exploited the charge and hydropathy features of the protein. Not only is this algorithm simple yet powerful, its input parameters, charge and hydropathy, are informative and readily interpretable.
We found that using different hydropathy scales significantly affects the prediction accuracy. Therefore, we sought to identify a new hydropathy scale that optimizes the prediction. This new scale achieves an accuracy of 91%, a significant improvement over the original 79%. In our 3rd project, we developed a per-residue C-H IDP predictor, in which three hydropathy scales are optimized individually. This is to account for the amino acid composition differences in three regions of a protein sequence (N, C terminus and internal). We then combined them into a single per-residue predictor that achieves an accuracy of 74% for per-residue predictions for proteins containing long IDP regions.

Item: Structure-Based Target-Specific Screening Leads to Small-Molecule CaMKII Inhibitors (Wiley, 2017-05-09)
Xu, David; Li, Liwei; Zhou, Donghui; Liu, Degang; Hudmon, Andy; Meroueh, Samy O.; BioHealth Informatics, School of Informatics and Computing
Target-specific scoring methods are more commonly used to identify small-molecule inhibitors among compounds docked to a target of interest. Top candidates that emerge from these methods have rarely been tested for activity and specificity across a family of proteins. In this study we docked a chemical library into CaMKIIδ, a member of the Ca2+/calmodulin (CaM)-dependent protein kinase (CaMK) family, and re-scored the resulting protein-compound structures using Support Vector Machine SPecific (SVMSP), a target-specific method that we developed previously. Among the 35 selected candidates, three hits were identified, such as quinazoline compound 1 (KIN-1; N4-[7-chloro-2-[(E)-styryl]quinazolin-4-yl]-N1,N1-diethylpentane-1,4-diamine), which was found to inhibit CaMKIIδ kinase activity at single-digit micromolar IC50.
Activity across the kinome was assessed by profiling analogues of 1, namely 6 (KIN-236; N4-[7-chloro-2-[(E)-2-(2-chloro-4,5-dimethoxyphenyl)vinyl]quinazolin-4-yl]-N1,N1-diethylpentane-1,4-diamine), and an analogue of hit compound 2 (KIN-15; 2-[4-[(E)-[(5-bromobenzofuran-2-carbonyl)hydrazono]methyl]-2-chloro-6-methoxyphenoxy]acetic acid), namely 14 (KIN-332; N-[(E)-[4-(2-anilino-2-oxoethoxy)-3-chlorophenyl]methyleneamino]benzofuran-2-carboxamide), against 337 kinases. Interestingly, for compound 6, CaMKIIδ and homologue CaMKIIγ were among the top ten targets. Among the top 25 targets of 6, IC50 values ranged from 5 to 22 μM. Compound 14 was found to be not specific toward CaMKII kinases, but it does inhibit two kinases with sub-micromolar IC50 values among the top 25. Derivatives of 1 were tested against several kinases including several members of the CaMK family. These data afforded a limited structure-activity relationship study. Molecular dynamics simulations with explicit solvent followed by end-point MM-GBSA free-energy calculations revealed strong engagement of specific residues within the ATP binding pocket, and also changes in the dynamics as a result of binding. This work suggests that target-specific scoring approaches such as SVMSP may hold promise for the identification of small-molecule kinase inhibitors that exhibit some level of specificity toward the target of interest across a large number of proteins.