Browsing by Author "Fang, Shiaofen"
Now showing 1 - 10 of 70
Item Analysis of Pseudo-Symmetry in Protein Homo-Oligomers (2018-12)
Rajendran, Catherine Jenifer Rajam; Fang, Shiaofen; Liu, Jing-Yuan; Liang, Yao
Symmetry plays a significant role in protein structural assembly and function. This is especially true for large homo-oligomeric protein complexes, owing to their stability and fine control of function. However, symmetry in proteins is not perfect, for reasons that remain unknown, and this leads to pseudo-symmetry. This study focuses on symmetry analysis of homo-oligomers, specifically homo-dimers, homo-trimers, and homo-tetramers. We defined Off Symmetry (OS) to measure the overall symmetry of a protein, Structural Index (SI) to quantify the structural difference between subunits, and Assembly Index (AI) to quantify the assembly difference between subunits. In most symmetrical homo-trimer and homo-tetramer proteins, the Assembly Index contributes more to Off Symmetry, whereas in homo-dimers the Structural Index contributes more than the Assembly Index. The main-chain atom Carbon-Alpha (CA) is more symmetrical than the first side-chain atom Carbon-Beta (CB), suggesting that protein mobility may contribute to pseudo-symmetry. In addition, the Pearson correlation coefficient between Off Symmetry and the B-Factor (temperature factor) of the corresponding atoms was calculated. We found that the Off Symmetry of individual residues across all subunits is correlated with the average B-Factor of those residues. The correlation with B-Factor is stronger for the Structural Index than for the Assembly Index.
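The Pearson correlation computation described above can be sketched in a few lines. The per-residue values below are invented toy numbers for illustration, not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-residue off-symmetry values and average B-Factors.
off_symmetry = [0.12, 0.30, 0.45, 0.80, 0.95]
avg_bfactor = [14.0, 18.5, 22.0, 35.0, 41.0]
print(round(pearson_r(off_symmetry, avg_bfactor), 3))
```

A value near +1 would indicate, as the abstract suggests, that more off-symmetric residues also tend to be more mobile (higher B-Factor).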
All these results suggest that protein dynamics play an important role, and that a larger Off Symmetry may therefore indicate a more mobile and flexible protein complex.

Item Automated Methods To Detect And Quantify Histological Features In Liver Biopsy Images To Aid In The Diagnosis Of Non-Alcoholic Fatty Liver Disease (2016-03-31)
Morusu, Siripriya; Tuceryan, Mihran; Zheng, Jiang; Tsechpenakis, Gavriil; Fang, Shiaofen
The ultimate goal of this study is to build a decision support system to aid pathologists in diagnosing Non-Alcoholic Fatty Liver Disease (NAFLD) in both adults and children. The disease is caused by an accumulation of excess fat in liver cells and is prevalent in approximately 30% of the general population in the United States, Europe, and Asian countries. The growing prevalence of the disease is directly related to the obesity epidemic in developed countries. We built computational methods to detect and quantify the histological features of a liver biopsy that aid in staging and phenotyping NAFLD. Image processing and supervised machine learning techniques are predominantly used to develop a robust and reliable system. The contributions of this study include the development of a rich web interface for acquiring annotated data from expert pathologists, and the identification and quantification of macrosteatosis in rodent liver biopsies as well as lobular inflammation and portal inflammation in human liver biopsies. Our work on detection of macrosteatosis in mouse liver shows 94.2% precision and 95% sensitivity. The model developed for lobular inflammation detection performs with precision and sensitivity of 79.3% and 81.3%, respectively. We also present the first study on portal inflammation identification, with 82.1% precision and 88.3% sensitivity.
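Precision and sensitivity figures like those reported above are derived from confusion-matrix counts. A minimal sketch, with illustrative counts rather than the thesis's actual tallies:

```python
def precision(tp, fp):
    """Fraction of detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def sensitivity(tp, fn):
    """Fraction of true lesions detected (recall): TP / (TP + FN)."""
    return tp / (tp + fn)

# Illustrative counts for a detector scored against pathologist annotations.
tp, fp, fn = 95, 6, 5
print(f"precision={precision(tp, fp):.1%}, sensitivity={sensitivity(tp, fn):.1%}")
```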
The thesis also presents results on the correlation between model-computed scores for each of these lesions and expert pathologists' grades.

Item Automatic Extraction of Computer Science Concept Phrases Using a Hybrid Machine Learning Paradigm (2023-05)
Jahin, S M Abrar; Al Hasan, Mohammad; Fang, Shiaofen; Mukhopadhyay, Snehasis
With the proliferation of computer science in modern society, the number of computer-science-related jobs is expanding quickly. Software engineer was chosen as the best job for 2023 based on pay, stress level, opportunity for professional growth, and work-life balance, as determined by rankings from various news outlets, journals, and publications. Computer science occupations are anticipated to be in high demand not just in 2023 but for the foreseeable future. It is not surprising that the number of computer science students at universities is growing and will continue to grow. The enormous increase in student enrollment in many subdisciplines of computing has presented some distinct issues. If computer science is to be incorporated into the K-12 curriculum, it is vital that K-12 educators be competent, but one of the biggest obstacles to this plan is that there are not enough trained computer science teachers. Computer science is also constantly gaining new fields and applications, and it is difficult for schools to recruit skilled computer science instructors for a variety of reasons, including low salaries. Utilizing the K-12 teachers who are already in the schools, who have a love for teaching, and who consider teaching a vocation is therefore the most effective strategy to address this issue. So, if we want teachers to quickly grasp computer science topics, we need to give them an easy way to learn about computer science.
To simplify and expedite the study of computer science, we must acquaint schoolteachers with the terminology associated with computer science concepts so that they know which topics they need to learn according to their profile. To make it easier for schoolteachers to comprehend computer science concepts, it would be ideal to provide them with a tree of words and phrases from which they could determine where each phrase originated and which phrases are connected to it, so that the material can be learned effectively. To find a good concept word or phrase, we must first identify concepts and then establish the connections or linkages between them. As computer science is a fast-developing field, its nomenclature is also expanding at a frenetic rate, so manually adding every concept and term to the knowledge graph would be a challenging endeavor. Creating a system that automatically adds computer science domain terms to the knowledge graph is a straightforward solution to this issue. We have identified knowledge graph use cases for the schoolteacher training program, which motivate the development of a knowledge graph, and we have analyzed these use cases and the knowledge graph's ideal characteristics. We have designed a web-based system for adding, editing, and removing words from the knowledge graph. In addition, a term or phrase can be represented with its children list, parent list, and synonym list for enhanced comprehension. We have also developed an automated extraction system that can extract computer science concept phrases from any supplied text, thereby enriching the knowledge graph.
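The term representation described above (children list, parent list, synonym list) might be sketched as a simple node structure. The entries and relations below are invented examples, not terms from the actual knowledge graph.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    name: str
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)

graph = {}

def add_term(name, parents=(), synonyms=()):
    """Add a term, creating parent nodes on demand and linking both directions."""
    node = graph.setdefault(name, Term(name))
    node.synonyms.extend(s for s in synonyms if s not in node.synonyms)
    for p in parents:
        parent = graph.setdefault(p, Term(p))
        if name not in parent.children:
            parent.children.append(name)
        if p not in node.parents:
            node.parents.append(p)
    return node

# Invented example entries.
add_term("data structure", parents=["computer science"])
add_term("binary tree", parents=["data structure"], synonyms=["btree"])
print(graph["data structure"].children)  # ['binary tree']
```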
Therefore, we have designed the knowledge graph for use in teacher education so that schoolteachers can teach K-12 students computer science topics effectively.

Item Automatic Landmark Placement for Large 3D Facial Image Dataset (IEEE, 2019-12)
Wang, Jerry; Fang, Shiaofen; Fang, Meie; Wilson, Jeremy; Herrick, Noah; Walsh, Susan; Computer and Information Science, School of Science
Facial landmark placement is a key step in many biomedical and biometrics applications. This paper presents a computational method that efficiently performs automatic 3D facial landmark placement based on training images containing manually placed anthropological facial landmarks. After 3D face registration by an iterative closest point (ICP) technique, a visual analytics approach is taken to generate local geometric patterns for individual landmark points. These individualized local geometric patterns are derived interactively from a user's initial visual pattern detection. They are used to guide the refinement of landmark points projected from a template face to achieve accurate landmark placement. Compared to traditional methods, this technique is simple and robust, and it requires neither a large number of training samples (as in machine-learning-based methods) nor complex 3D image analysis procedures. This technique and the associated software tool are being used in a 3D biometrics project that aims to identify links between human facial phenotypes and their genetic associations.

Item BECA: A Software Tool for Integrated Visualization of Human Brain Data (Springer, 2017)
Li, Huang; Fang, Shiaofen; Zigon, Bob; Sporns, Olaf; Saykin, Andrew J.; Goñi, Joaquin; Shen, Li; Computer and Information Science, School of Science
Visualization plays an important role in helping neuroscientists understand human brain data. Most publicly available software focuses on visualizing a specific brain imaging modality.
Here we present an extensible visualization platform, BECA, which employs a plugin architecture to facilitate the rapid development and deployment of visualizations for human brain data. This paper introduces the architecture and discusses some important design decisions made in implementing the BECA platform and its visualization plugins.

Item Brain Connectome Network Properties Visualization (2018-12)
Zhang, Chenfeng; Fang, Shiaofen; Tuceryan, Mihran; Mukhopadhyay, Snehasis
Brain connectome network visualization can help neurologists inspect brain structure easily and quickly. In this thesis, the brain connectome network model is visualized in both three-dimensional (3D) and two-dimensional (2D) environments. One tool, "Brain Explorer for Connectomic Analysis" (BECA), was developed in previous research; it presents a 3D model of brain structure with regions of interest (ROIs) in different colors [5]. The other is mainly for 2D information visualization of the brain connectome and adopts a force-directed layout to visualize the network. However, brain network visualization alone does not give the user an intuitive picture of brain structure, and as the number of ROIs (nodes) grows, the visualization introduces more visual clutter for readers [3]. Brain connectome network properties visualization therefore becomes a useful complement to brain network visualization. For a better understanding of the effect of Alzheimer's disease on the brain's nerves, the thesis introduces several methods for visualizing brain graph properties. Five selected graph properties are discussed: degree and closeness are node properties, while shortest path, maximum flow, and clique are edge properties. Except for clique, which is visualized only in 2D, each property is visualized in both 3D and 2D. For clique, a new hypergraph visualization method is proposed with three different algorithms.
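As a rough illustration, the two node properties named above (degree and closeness) can be computed from an adjacency list with a breadth-first search. The tiny graph here is hypothetical, not a real connectome.

```python
from collections import deque

# Hypothetical undirected brain-region graph as an adjacency list.
adj = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

def degree(node):
    """Number of edges incident to the node."""
    return len(adj[node])

def closeness(node):
    """Normalized closeness: (n - 1) / sum of shortest-path distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search for unweighted shortest paths
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(dist) - 1) / sum(d for d in dist.values() if d)

print(degree("B"), round(closeness("B"), 2))  # 3 1.0
```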
Instead of using an extra node to represent a clique, the thesis uses a "belt" to connect all nodes within the same clique. The node-connection methods are based on the traveling salesman problem (TSP) and the law of cosines. In addition, the thesis applies the clique results to adjust the force-directed layout of the brain graph in 2D, dramatically reducing visual clutter. With the support of graph properties visualization, brain connectome network visualization tools thus become more flexible.

Item Brain explorer for connectomic analysis (Springer, 2017-08-23)
Li, Huang; Fang, Shiaofen; Contreras, Joey A.; West, John D.; Risacher, Shannon L.; Wang, Yang; Sporns, Olaf; Saykin, Andrew J.; Goñi, Joaquín; Shen, Li; Radiology and Imaging Sciences, School of Medicine
Visualization plays a vital role in the analysis of multimodal neuroimaging data. A major challenge in neuroimaging visualization is how to integrate structural, functional, and connectivity data to form a comprehensive visual context for data exploration, quality control, and hypothesis discovery. We developed a new integrated visualization solution for brain imaging data by combining scientific and information visualization techniques within the context of the same anatomical structure. In this paper, new surface texture techniques are developed to map non-spatial attributes onto both 3D brain surfaces and a planar volume map generated by the proposed volume rendering technique, spherical volume rendering. Two types of non-spatial information are represented: (1) time series data from resting-state functional MRI measuring brain activation; (2) network properties derived from structural connectivity data for different groups of subjects, which may help guide the detection of differentiation features. Through visual exploration, this integrated solution can help identify brain regions with highly correlated functional activations as well as their activation patterns.
Visual detection of differentiation features can also potentially uncover image-based phenotypic biomarkers for brain diseases.

Item Brain-wide structural connectivity alterations under the control of Alzheimer risk genes (Inderscience, 2020)
Yan, Jingwen; Raja V, Vinesh; Huang, Zhi; Amico, Enrico; Nho, Kwangsik; Fang, Shiaofen; Sporns, Olaf; Wu, Yu-chien; Saykin, Andrew; Goni, Joaquin; Shen, Li; BioHealth Informatics, School of Informatics and Computing
Background: Alzheimer's disease (AD) is the most common form of dementia, characterized by gradual loss of memory followed by further deterioration of other cognitive functions. Large-scale genome-wide association studies have identified and validated more than 20 AD risk genes. However, how these genes relate to the brain-wide breakdown of structural connectivity in AD patients remains unknown. Methods: We used the genotype and DTI data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. After constructing the brain network for each subject, we extracted three types of link measures: fiber anisotropy, fiber length, and fiber density. We then performed a targeted genetic association analysis of brain-wide connectivity measures using general linear regression models, with age at scan and gender included as covariates. For fair comparison of the genetic effect on different measures, fiber anisotropy, length, and density were all normalized to mean 0 and standard deviation 1. We aim to discover the abnormal brain-wide network alterations under the control of 34 AD risk SNPs identified in previous large-scale genome-wide association studies. Results: After enforcing the stringent Bonferroni correction, rs10498633 in SLC24A4 was found to be significantly associated with the anisotropy, total number, and length of fibers, including some connecting the two brain hemispheres.
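The normalization and per-connection regression described above can be sketched as follows. The dosage and fiber values are invented toy numbers, and for brevity this single-predictor sketch omits the age and gender covariates the study includes.

```python
import math

def zscore(xs):
    """Normalize a sequence to mean 0 and (population) standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

def ols_slope(x, y):
    """Least-squares slope of y on x (single predictor, with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

# Toy data: minor-allele dosage (0/1/2) per subject and one fiber measure.
dosage = [0, 0, 1, 1, 2, 2]
fiber_density = zscore([1.10, 1.05, 0.98, 0.95, 0.80, 0.85])
print(round(ols_slope(dosage, fiber_density), 3))
```

A negative slope here would mean that carrying more copies of the risk allele is associated with lower fiber density, in standard-deviation units thanks to the normalization.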
At a lower significance level of 5e-6, we observed significant genetic effects of SNPs in APOE, ABCA7, EPHA1, and CASS4 on various brain connectivity measures.

Item Characterizing software components using evolutionary testing and path-guided analysis (2013-12-16)
McNeany, Scott Edward; Hill, James H. (James Haswell); Raje, Rajeev; Al Hasan, Mohammad; Fang, Shiaofen
Evolutionary testing (ET) techniques (e.g., mutation, crossover, and natural selection) have been applied successfully to many areas of software engineering, such as error/fault identification, data mining, and software cost estimation. Previous research has also applied ET techniques to performance testing, but only as far as finding the best- and worst-case execution times. Although such performance testing is beneficial, it provides little insight into the performance characteristics of complex functions with multiple branches. This thesis therefore provides two contributions to the performance testing of software systems. First, it demonstrates how ET and genetic algorithms (GAs), which are search heuristics for solving optimization problems using mutation, crossover, and natural selection, can be combined with a constraint solver to target specific paths in the software. Second, it demonstrates how such an approach can identify local minimum and maximum execution times, providing a more detailed characterization of software performance. The results of applying our approach to example software applications show that it can characterize different execution paths in relatively short amounts of time.
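A minimal genetic-algorithm loop of the kind described above (mutation, crossover, natural selection) might look like this. The fitness function is a stand-in for a measured execution time, not the thesis's actual instrumentation or constraint-solver integration.

```python
import random

random.seed(42)

def fitness(x):
    # Stand-in for a measured execution time we want to maximize; peaks at x = 7.
    return -(x - 7) ** 2 + 50

def evolve(generations=60, pop_size=20, lo=0.0, hi=20.0):
    pop = [random.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # natural selection
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = (a + b) / 2                   # crossover: average two parents
            child += random.gauss(0, 0.5)         # mutation: small random nudge
            children.append(min(hi, max(lo, child)))
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(round(best, 1))
```

The population should converge toward the input that maximizes the stand-in "execution time"; in the thesis's setting, a constraint solver additionally restricts candidates to a chosen execution path.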
This thesis also examines a modified exhaustive approach that can be plugged in when the constraint solver cannot provide the information needed to target specific paths.

Item Computational Analysis of Flow Cytometry Data (2013-07-12)
Irvine, Allison W.; Dundar, Murat; Tuceryan, Mihran; Mukhopadhyay, Snehasis; Fang, Shiaofen
The objective of this thesis is to compare automated methods for analyzing flow cytometry data. Flow cytometry is an important and efficient tool for analyzing the characteristics of cells; it is used in several fields, including immunology, pathology, marine biology, and molecular biology. Flow cytometry measures light scatter from cells and fluorescent emission from dyes attached to cells. Two main tasks must be performed. The first is adjusting the measured fluorescence to correct for overlap in the spectra of the fluorescent markers used to characterize a cell's chemical characteristics. The second is using the amounts of markers present in each cell to identify its phenotype. Several methods for these tasks are compared. For compensation, the Unconstrained Least Squares, Orthogonal Subspace Projection, Fully Constrained Least Squares, and Fully Constrained One Norm methods are evaluated; the Fully Constrained Least Squares method gives the best overall results in terms of accuracy and running time. For classifying cells based on the amounts of dyes present in each cell, Spectral Clustering, Gaussian Mixture Modeling, Naive Bayes classification, Support Vector Machines, and Expectation Maximization with a Gaussian mixture model are compared. The generative models created by the Naive Bayes and Gaussian Mixture Modeling methods performed classification most accurately. These supervised methods may be most useful when online classification is necessary, such as in cell-sorting applications of flow cytometers.
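Unconstrained least-squares compensation of the kind compared above amounts to solving a small linear system: given a spillover matrix S (each column is one dye's signature across the detectors) and a measured signal vector m, recover dye abundances a from S·a = m. A two-dye toy example with made-up spillover values, not an actual instrument matrix:

```python
def solve2x2(S, m):
    """Solve S @ a = m for a 2x2 matrix S by Cramer's rule."""
    (s11, s12), (s21, s22) = S
    det = s11 * s22 - s12 * s21
    a1 = (m[0] * s22 - s12 * m[1]) / det
    a2 = (s11 * m[1] - m[0] * s21) / det
    return [a1, a2]

# Made-up spillover: each column is one dye's signature across two detectors.
S = [[1.00, 0.15],
     [0.25, 1.00]]
true_abundance = [100.0, 40.0]
measured = [S[0][0] * 100 + S[0][1] * 40,   # detector 1 signal
            S[1][0] * 100 + S[1][1] * 40]   # detector 2 signal
print([round(a, 1) for a in solve2x2(S, measured)])  # recovers [100.0, 40.0]
```

The fully constrained variants favored by the thesis additionally force the recovered abundances to be non-negative (and, for Fully Constrained Least Squares, to satisfy a sum constraint), which this unconstrained sketch does not enforce.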
Unsupervised methods may be used to replace manual analysis entirely when no training data are available. Expectation Maximization combined with a cluster-merging post-processing step gives the best results among the unsupervised methods considered.