Computer & Information Science Department Theses and Dissertations

Permanent URI for this collection

https://hdl.handle.net/1805/2053

For more information about the Computer & Information Science graduate programs visit: https://science.indianapolis.iu.edu.

Active geometric model : multi-compartment model-based segmentation & registration
(2014-08-26) Mukherjee, Prateep; Tsechpenakis, Gavriil; Raje, Rajeev; Tuceryan, Mihran
We present a novel, variational and statistical approach for model-based segmentation. Our model generalizes the Chan-Vese model, proposed for concurrent segmentation of multiple objects embedded in the same image domain. We also propose a novel shape descriptor, namely the Multi-Compartment Distance Functions or mcdf. Our proposed framework for segmentation is two-fold: first, several training samples distributed across various classes are registered onto a common frame of reference; then, we use a variational method similar to Active Shape Models (or ASMs) to generate an average shape model and hence use the latter to partition new images. The key advantages of such a framework are: (i) landmark-free automated shape training; (ii) a strictly shape-constrained model to fit test data. Our model can naturally deal with shapes of arbitrary dimension and topology (closed/open curves). We term our model Active Geometric Model, since it focuses on segmentation of geometric shapes. We demonstrate the power of the proposed framework in two important medical applications: one for morphology estimation of 3D Motor Neuron compartments, another for thickness estimation of Henle's Fiber Layer in the retina. We also compare the qualitative and quantitative performance of our method with that of several other state-of-the-art segmentation methods.
Adversarial Attacks and Defense Mechanisms to Improve Robustness of Deep Temporal Point Processes
(2022-08) Khorshidi, Samira; Mohler, George; Al Hasan, Mohammad; Raje, Rajeev; Durresi, Arjan
Temporal point processes (TPPs) are mathematical approaches for modeling asynchronous event sequences by considering the temporal dependency of each event on past events and its instantaneous rate. Temporal point processes can model various problems, from earthquake aftershocks, trade orders, gang violence, and reported crime patterns, to network analysis, infectious disease transmissions, and virus spread forecasting. In each of these cases, the entity’s behavior with the corresponding information is noted over time as an asynchronous event sequence, and the analysis is done using temporal point processes, which provide a means to define the generative mechanism of the sequence of events and ultimately predict events and investigate causality. Among point processes, the Hawkes process, a stochastic point process, is able to model a wide range of contagious and self-exciting patterns. One of the Hawkes process’s well-known applications is predicting the evolution of viral processes on networks, which is an important problem in biology, the social sciences, and the study of the Internet. In existing works, mean-field analysis based upon degree distribution is used to predict viral spreading across networks of different types. However, it has been shown that degree distribution alone fails to predict the behavior of viruses on some real-world networks. Recent attempts have been made to use assortativity to address this shortcoming. This thesis illustrates how the evolution of such a viral process is sensitive to the underlying network’s structure. In Chapter 3, we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution combined with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution.
Using a data-driven approach, by coupling predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different network topologies. Our results highlight the importance of graphlets and identify a small collection of graphlets that may have the most significant influence over the viral processes on a network. Due to the flexibility and expressiveness of deep learning techniques, several neural network-based approaches have recently shown promise for modeling point process intensities. However, there is a lack of research on the possible adversarial attacks and the robustness of such models regarding adversarial attacks and natural shocks to systems. Furthermore, while neural point processes may outperform simpler parametric models on in-sample tests, how these models perform when encountering adversarial examples or sharp non-stationary trends remains unknown. In Chapter 4, we propose several white-box and black-box adversarial attacks against deep temporal point processes. Additionally, we investigate the transferability of white-box adversarial attacks against point processes modeled by deep neural networks, which are considered a more elevated risk. Extensive experiments confirm that neural point processes are vulnerable to adversarial attacks. Such a vulnerability is illustrated both in terms of predictive metrics and the effect of attacks on the underlying point process’s parameters. Specifically, adversarial attacks successfully transform the temporal Hawkes process regime from sub-critical to super-critical and manipulate the modeled parameters, which is considered a risk against parametric modeling approaches.
Additionally, we evaluate the vulnerability and performance of these models in the presence of non-stationary abrupt changes, using crime and Covid-19 pandemic data as an example. Despite the success of deep learning techniques in modeling temporal point processes, deep-learning models, including deep temporal point processes, are vulnerable to adversarial attacks, so it is essential to ensure the robustness of the deployed algorithms. In Chapter 5, we study the robustness of deep temporal point processes against several proposed adversarial attacks from the adversarial defense viewpoint. Specifically, we investigate the effectiveness of adversarial training using universal adversarial samples in improving the robustness of the deep point processes. Additionally, we propose a general point process domain-adopted (GPDA) regularization, which is strictly applicable to temporal point processes, to reduce the effect of adversarial attacks and acquire an empirically robust model. In this approach, unlike other computationally expensive approaches, there is no need for additional back-propagation in the training step, and no further network is required. Ultimately, we propose an adversarial detection framework that has been trained in the Generative Adversarial Network (GAN) manner and solely on clean training data. Finally, in Chapter 6, we discuss implications of the research and future research directions.
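As an illustration of the sub-critical and super-critical regimes mentioned in the abstract above, the following is a minimal sketch of a classical parametric Hawkes process with an exponential kernel. It is not the deep point process model or any attack from the thesis. The parametrization (kernel alpha * beta * exp(-beta * t), so that the branching ratio equals alpha) and the function names are assumptions chosen for this example.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity at time t for an exponential-kernel Hawkes process.

    Assumed parametrization (illustrative, not taken from the thesis):
        lambda(t) = mu + sum over past events s < t of
                    alpha * beta * exp(-beta * (t - s))
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s)) for s in history if s < t
    )
    return mu + excitation


def branching_ratio(alpha):
    """Expected number of events directly triggered by one event.

    The kernel alpha * beta * exp(-beta * t) integrates to alpha over
    [0, infinity), so under this parametrization the ratio is alpha itself.
    """
    return alpha


def regime(alpha):
    """Classify the process by its branching ratio."""
    n = branching_ratio(alpha)
    if n < 1:
        return "sub-critical"
    if n == 1:
        return "critical"
    return "super-critical"


if __name__ == "__main__":
    events = [0.0, 0.4, 0.9]  # made-up event times
    print(hawkes_intensity(1.0, events, mu=0.5, alpha=0.8, beta=2.0))
    print(regime(0.8))  # branching ratio below 1
    print(regime(1.2))  # branching ratio above 1
```

In this sketch, a perturbation that pushes the estimated alpha from below 1 to above 1 changes the classification from sub-critical to super-critical, which is the kind of regime shift the abstract describes.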
Adversarial autoencoders for anomalous event detection in images
(2017) Dimokranitou, Asimenia; Tsechpenakis, Gavriil; Zheng, Jiang Yu; Tuceryan, Mihran
Detection of anomalous events in image sequences is a problem in computer vision with various applications, such as public security, health monitoring and intrusion detection. Despite the various applications, anomaly detection remains an ill-defined problem. Several definitions exist; the most commonly used defines an anomaly as a low probability event. Anomaly detection is a challenging problem mainly because of the lack of abnormal observations in the data. Thus, it is usually considered an unsupervised learning problem. Our approach is based on autoencoders in combination with Generative Adversarial Networks. The method is called Adversarial Autoencoders [1], and it is a probabilistic autoencoder that attempts to match the aggregated posterior of the hidden code vector of the autoencoder with an arbitrary prior distribution. The adversarial error of the learned autoencoder is low for regular events and high for irregular events. We compare our approach with state-of-the-art methods and describe our results with respect to accuracy and efficiency.
Analysis of Pseudo-Symmetry in Protein Homo-Oligomers
(2018-12) Rajendran, Catherine Jenifer Rajam; Fang, Shiaofen; Liu, Jing-Yuan; Liang, Yao
Symmetry plays a significant role in protein structural assembly and function. This is especially true for large homo-oligomeric protein complexes due to stability and finite control of function. However, symmetry in proteins is not perfect, for reasons that are unknown, which leads to pseudosymmetry. This study focuses on symmetry analysis of homo-oligomers, specifically homo-dimers, homo-trimers and homo-tetramers. We defined Off Symmetry (OS) to measure the overall symmetry of the protein, Structural Index (SI) to quantify the structural difference, and Assembly Index (AI) to quantify the assembly difference between the subunits. In most of the symmetrical homo-trimer and homo-tetramer proteins, Assembly Index contributes more to Off Symmetry, and in the case of homo-dimers, Structural Index contributes more than the Assembly Index. The main chain atom Carbon-Alpha (CA) is more symmetrical than the first side chain atom Carbon-Beta (CB), suggesting protein mobility may contribute to the pseudosymmetry. In addition, the Pearson correlation coefficient between Off Symmetry and the B-Factor (temperature factor) of the respective atoms is calculated. We found that the individual residues of a protein in all the subunits are correlated with the average B-Factor of these residues. The correlation with B-Factor is stronger in Structural Index than Assembly Index. All these results suggest that protein dynamics play an important role and therefore a larger off-symmetry may indicate a more mobile and flexible protein complex.
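The correlation step mentioned in the abstract above can be illustrated with a minimal sketch of the Pearson correlation coefficient. The per-residue values below are made up for the example and are not data from the thesis. The function name and the pairing of off-symmetry values with average B-Factors are assumptions about how such an analysis might be arranged.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical per-residue values, for illustration only.
    off_symmetry = [0.12, 0.30, 0.25, 0.80, 0.55]
    avg_b_factor = [15.0, 22.0, 20.0, 48.0, 31.0]
    print(pearson_r(off_symmetry, avg_b_factor))
```

A value near 1 would indicate that residues with larger off-symmetry also tend to have larger B-Factors, which is the direction of association the abstract reports.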
Analysis of pseudo-symmetry in protein oligomers and its correlation with protein dynamics
(2017) Shankar, Kavya; Liu, Jing-Yuan
Symmetry is a feature that can be noticed almost anywhere around us. Animals, for example, have bilateral symmetry whereas flowers have rotational symmetry. Proteins are complex systems that also exhibit this property as a rule, but there is a disturbance in it that prevents them from being perfectly symmetrical. Even homo-oligomers that are made of identical subunits are not exempt from this. In this paper, we focused on protein homo-dimers and homo-trimers and we introduced off-symmetry (OS) to quantify how much a protein complex is off from perfect symmetry. Furthermore, we decomposed off-symmetry into two aspects, namely structure index (SI) that measures structural difference and assembly index (AI) that measures assembly difference. We found that, in most cases, the major contributor to OS is SI in dimers and AI in trimers. In addition, we found that the SI and in turn OS contributed by each residue is positively correlated with their B factors, which indicates that protein flexibility and mobility may contribute to the off-symmetry of protein oligomers.
Analyzing and evaluating security features in software requirements
(2016-10-28) Hayrapetian, Allenoush; Raje, Rajeev
Software requirements for complex projects often contain specifications of non-functional attributes (e.g., security-related features). The process of analyzing such requirements for standards compliance is laborious and error-prone. Due to the inherent free-flowing nature of software requirements, it is tempting to apply Natural Language Processing (NLP) and Machine Learning (ML) based techniques for analyzing these documents. In this thesis, we propose a novel semi-automatic methodology that assesses the security requirements of the software system with respect to completeness and ambiguity, creating a bridge between the requirements documents and standards compliance. Security standards, e.g., those introduced by the ISO and OWASP, are compared against annotated software project documents for textual entailment relationships (NLP), and the results are used to train a neural network model (ML) for classifying security-based requirements. Hence, this approach aims to identify the appropriate structures that underlie software requirements documents. Once such structures are formalized and empirically validated, they will provide guidelines to software organizations for generating comprehensive and unambiguous requirements specification documents as related to security-oriented features. The proposed solution will assist organizations during the early phases of developing secure software and reduce overall development effort and costs.
Applications of Data Mining in Healthcare
(2019-05) Peng, Bo; Mohler, George; Dundar, Murat; Zheng, Jiang Yu
With increases in the quantity and quality of healthcare related data, data mining tools have the potential to improve people’s standard of living through personalized and predictive medicine. In this thesis we improve the state-of-the-art in data mining for several problems in the healthcare domain. In problems such as drug-drug interaction prediction and Alzheimer’s Disease (AD) biomarkers discovery and prioritization, current methods either require tedious feature engineering or have unsatisfactory performance. New effective computational tools are needed that can tackle these complex problems. In this dissertation, we develop new algorithms for two healthcare problems: high-order drug-drug interaction prediction and amyloid imaging biomarker prioritization in Alzheimer’s Disease. Drug-drug interactions (DDIs) and their associated adverse drug reactions (ADRs) represent a significant detriment to the public health. Existing research on DDIs primarily focuses on pairwise DDI detection and prediction. Effective computational methods for high-order DDI prediction are desired. In this dissertation, I present a deep learning based model D³I for cardinality-invariant and order-invariant high-order DDI prediction. The proposed models achieve 0.740 F1 value and 0.847 AUC value on high-order DDI prediction, and outperform classical methods on order-2 DDI prediction. These results demonstrate the strong potential of D³I and deep learning based models in tackling the prediction problems of high-order DDIs and their induced ADRs. The second problem I consider in this thesis is amyloid imaging biomarkers discovery, for which I propose an innovative machine learning paradigm enabling precision medicine in this domain. The paradigm tailors the imaging biomarker discovery process to individual characteristics of a given patient. I implement this paradigm using a newly developed learning-to-rank method PLTR.
The PLTR model seamlessly integrates two objectives for joint optimization: pushing up relevant biomarkers and ranking among relevant biomarkers. The empirical study of PLTR conducted on the ADNI data yields promising results to identify and prioritize individual-specific amyloid imaging biomarkers based on the individual’s structural MRI data. The resulting top ranked imaging biomarkers have the potential to aid personalized diagnosis and disease subtyping.
Aural Mapping of STEM Concepts Using Literature Mining
(2013-03-06) Bharadwaj, Venkatesh; Palakal, Mathew J.; Raje, Rajeev; Xia, Yuni
Recent technological applications have made people's lives heavily dependent on Science, Technology, Engineering, and Mathematics (STEM) and its applications. Understanding basic level science is a must in order to use and contribute to this technological revolution. Science education at the middle and high school levels, however, depends heavily on visual representations such as models, diagrams, figures, animations and presentations. This leaves visually impaired students with very few options to learn science and secure a career in STEM related areas. Recent experiments have shown that small aural clues called Audemes are helpful in understanding and memorization of science concepts among visually impaired students. Audemes are non-verbal sound translations of a science concept. In order to present science concepts as Audemes for visually impaired students, this thesis presents an automatic system for audeme generation from STEM textbooks. This thesis describes the systematic application of multiple Natural Language Processing tools and techniques, such as dependency parser, POS tagger, Information Retrieval algorithm, Semantic mapping of aural words, and machine learning, to transform the science concept into a combination of atomic-sounds, thus forming an audeme. We present a rule based classification method for all STEM related concepts. This work also presents a novel way of mapping and extracting most related sounds for the words being used in the textbook. Additionally, machine learning methods are used in the system to guarantee the customization of output according to a user's perception. The system being presented is robust, scalable, fully automatic and dynamically adaptable for audeme generation.
Auto-Generating Models From Their Semantics and Constraints
(2013-08-20) Pati, Tanumoy; Hill, James H. (James Haswell); Raje, Rajeev; Al Hasan, Mohammad
Domain-specific models powered using domain-specific modeling languages are traditionally created manually by modelers. There exist model intelligence techniques, such as constraint solvers and model guidance, which alleviate challenges associated with manually creating models; however, parts of the modeling process are still manual. Moreover, state-of-the-art model intelligence techniques are, in essence, reactive (i.e., invoked by the modeler). This thesis therefore provides two contributions to model-driven engineering research using domain-specific modeling languages (DSMLs). First, it discusses how DSML semantics and constraints can enable proactive modeling, which is a form of model intelligence that foresees model transformations, automatically executes these model transformations, and prompts the modeler for assistance when necessary. Second, this thesis shows how we integrated proactive modeling into the Generic Modeling Environment (GME). Our experience using proactive modeling shows that it can reduce modeling effort both by automatically generating required model elements and by guiding modelers to select what actions should be executed on the model.
Automated image classification via unsupervised feature learning by K-means
(2015-07-09) Karimy Dehkordy, Hossein; Dundar, Mehmet Murat; Song, Fengguang; Xia, Yuni
Research on image classification has grown rapidly in the field of machine learning. Many methods have already been implemented for image classification. Among all these methods, the best results have been reported by neural network-based techniques. One of the most important steps in automated image classification is feature extraction. Feature extraction includes two parts: feature construction and feature selection. Many methods for feature extraction exist, but the best ones are related to deep-learning approaches such as network-in-network or deep convolutional network algorithms. Deep learning focuses on levels of abstraction, deriving higher levels of abstraction from the previous level through multiple hidden layers. The two main problems with using deep-learning approaches are the speed and the number of parameters that should be configured. Small changes or poor selection of parameters can alter the results completely or even make them worse. Tuning these parameters is usually impossible for normal users who do not have supercomputers because one should run the algorithm and try to tune the parameters according to the results obtained. Thus, this process can be very time consuming. This thesis attempts to address the speed and configuration issues found with traditional deep-network approaches. Some of the traditional methods of unsupervised learning are used to build an automated image-classification approach that takes less time both to configure and to run.