- Browse by Author
Browsing by Author "Al Hasan, Mohammad"
Now showing 1 - 10 of 67
Results Per Page
Sort Options
Item A framework for graph-base neural network using numerical simulation of metal powder bed fusion for correlating process parameters and defect generation(Elsevier, 2022) Akter Jahan, Suchana; Al Hasan, Mohammad; El-Mounayri, Hazim; Mechanical and Energy Engineering, School of Engineering and TechnologyPowder bed fusion (PBF) is the most common technique used for metal additive manufacturing. This process involves consolidation of metal powder using a heat source such as laser or electron beam. During the formation of three-dimensional(3D) objects by sintering metal powders layer by layer, many different thermal phenomena occur that can create defects or anomalies on the final printed part. Similar to other additive manufacturing techniques, PBF has been in practice for decades, yet it is still going through research and development endeavors which is required to understand the physics behind this process. Defects and deformations highly impact the product quality and reliability of the overall manufacturing process; hence, it is essential that we understand the reason and mechanism of defect generation in PBF process and take appropriate measures to rectify them. In this paper, we have attempted to study the effect of processing parameters (scanning speed, laser power) on the generation of defects in PBF process using a graph-based artificial neural network that uses numerical simulation results as input or training data. Use of graph-based machine learning is novel in the area of manufacturing let alone additive manufacturing or powder bed fusion. The outcome of this study provides an opportunity to design a feedback controlled in-situ online monitoring system in powder bed fusion to reduce printing defects and optimize the manufacturing process.Item A-Optimal Subsampling For Big Data General Estimating Equations(2019-08) Cheung, Chung Ching; Peng, Hanxiang; Rubchinsky, Leonid; Boukai, Benzion; Lin, Guang; Al Hasan, MohammadA significant hurdle for analyzing big data is the lack of effective technology and statistical inference methods. A popular approach for analyzing data with large sample is subsampling. Many subsampling probabilities have been introduced in literature (Ma, \emph{et al.}, 2015) for linear model. In this dissertation, we focus on generalized estimating equations (GEE) with big data and derive the asymptotic normality for the estimator without resampling and estimator with resampling. We also give the asymptotic representation of the bias of estimator without resampling and estimator with resampling. we show that bias becomes significant when the data is of high-dimensional. We also present a novel subsampling method called A-optimal which is derived by minimizing the trace of some dispersion matrices (Peng and Tan, 2018). We derive the asymptotic normality of the estimator based on A-optimal subsampling methods. We conduct extensive simulations on large sample data with high dimension to evaluate the performance of our proposed methods using MSE as a criterion. High dimensional data are further investigated and we show through simulations that minimizing the asymptotic variance does not imply minimizing the MSE as bias not negligible. We apply our proposed subsampling method to analyze a real data set, gas sensor data which has more than four millions data points. In both simulations and real data analysis, our A-optimal method outperform the traditional uniform subsampling method.Item ACTS: Extracting Android App Topological Signature through Graphlet Sampling(IEEE, 2016-10) Peng, Wei; Gao, Tianchong; Sisodia, Devkishen; Saha, Tanay Kumar; Li, Feng; Al Hasan, Mohammad; Computer Information and Graphics Technology, School of Engineering and TechnologyAndroid systems are widely used in mobile & wireless distributed systems. In the near future, Android is believed to dominate the mobile distributed environment. However, with the popularity of Android-based smartphones/tablets comes the rampancy of Android-based malware. In this paper, we propose a novel topological signature of Android apps based on the function call graphs (FCGs) extracted from their Android App Packages (APKs). Specifically, by leveraging recent advances in graphlet sampling, the proposed method fully captures the invocator-invocatee relationship at local neighborhoods in an FCG without exponentially inflating the state space. Using real benign app and malware samples, we demonstrate that our method, ACTS (App topologiCal signature through graphleT Sampling), can detect malware and identify malware families robustly and efficiently. More importantly, we demonstrate that, without augmenting the FCG with any semantic features such as bytecode-based vertex typing, local topological information captured by ACTS alone can achieve a high malware detection accuracy. Since ACTS only uses structural features, which are orthogonal to semantic features, it is expected that combining them would give a greater improvement in malware detection accuracy than combining non-orthogonal semantic features.Item Adversarial Attacks and Defense Mechanisms to Improve Robustness of Deep Temporal Point Processes(2022-08) Khorshidi, Samira; Mohler, George; Al Hasan, Mohammad; Raje, Rajeev; Durresi, ArjanTemporal point processes (TPP) are mathematical approaches for modeling asynchronous event sequences by considering the temporal dependency of each event on past events and its instantaneous rate. Temporal point processes can model various problems, from earthquake aftershocks, trade orders, gang violence, and reported crime patterns, to network analysis, infectious disease transmissions, and virus spread forecasting. In each of these cases, the entity’s behavior with the corresponding information is noted over time as an asynchronous event sequence, and the analysis is done using temporal point processes, which provides a means to define the generative mechanism of the sequence of events and ultimately predict events and investigate causality. Among point processes, Hawkes process as a stochastic point process is able to model a wide range of contagious and self-exciting patterns. One of Hawkes process’s well-known applications is predicting the evolution of viral processes on networks, which is an important problem in biology, the social sciences, and the study of the Internet. In existing works, mean-field analysis based upon degree distribution is used to predict viral spreading across networks of different types. However, it has been shown that degree distribution alone fails to predict the behavior of viruses on some real-world networks. Recent attempts have been made to use assortativity to address this shortcoming. This thesis illustrates how the evolution of such a viral process is sensitive to the underlying network’s structure. In Chapter 3 , we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution combined with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. Using a data-driven approach, by coupling predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different network topologies. Our results highlight the importance of graphlets and identify a small collection of graphlets that may have the most significant influence over the viral processes on a network. Due to the flexibility and expressiveness of deep learning techniques, several neural network-based approaches have recently shown promise for modeling point process intensities. However, there is a lack of research on the possible adversarial attacks and the robustness of such models regarding adversarial attacks and natural shocks to systems. Furthermore, while neural point processes may outperform simpler parametric models on in-sample tests, how these models perform when encountering adversarial examples or sharp non-stationary trends remains unknown. In Chapter 4 , we propose several white-box and black-box adversarial attacks against deep temporal point processes. Additionally, we investigate the transferability of whitebox adversarial attacks against point processes modeled by deep neural networks, which are considered a more elevated risk. Extensive experiments confirm that neural point processes are vulnerable to adversarial attacks. Such a vulnerability is illustrated both in terms of predictive metrics and the effect of attacks on the underlying point process’s parameters. Expressly, adversarial attacks successfully transform the temporal Hawkes process regime from sub-critical to into a super-critical and manipulate the modeled parameters that is considered a risk against parametric modeling approaches. Additionally, we evaluate the vulnerability and performance of these models in the presence of non-stationary abrupt changes, using the crimes and Covid-19 pandemic dataset as an example. Considering the security vulnerability of deep-learning models, including deep temporal point processes, to adversarial attacks, it is essential to ensure the robustness of the deployed algorithms that is despite the success of deep learning techniques in modeling temporal point processes. In Chapter 5 , we study the robustness of deep temporal point processes against several proposed adversarial attacks from the adversarial defense viewpoint. Specifically, we investigate the effectiveness of adversarial training using universal adversarial samples in improving the robustness of the deep point processes. Additionally, we propose a general point process domain-adopted (GPDA) regularization, which is strictly applicable to temporal point processes, to reduce the effect of adversarial attacks and acquire an empirically robust model. In this approach, unlike other computationally expensive approaches, there is no need for additional back-propagation in the training step, and no further network isrequired. Ultimately, we propose an adversarial detection framework that has been trained in the Generative Adversarial Network (GAN) manner and solely on clean training data. Finally, in Chapter 6 , we discuss implications of the research and future research directions.Item Android Malware Detection via Graphlet Sampling(IEEE, 2018-11) Gao, Tianchong; Peng, Wei; Sisodia, Devkishen; Saha, Tanay Kumar; Li, Feng; Al Hasan, Mohammad; Computer Information and Graphics Technology, School of Engineering and TechnologyAndroid systems are widely used in mobile & wireless distributed systems. In the near future, Android is believed to dominate the mobile distributed environment. However, with the popularity of Android-based smartphones/tablets comes the rampancy of Android-based malware. In this paper, we propose a novel topological signature of Android apps based on the function call graphs (FCGs) extracted from their Android App PacKages (APKs). Specifically, by leveraging recent advances on graphlet mining, the proposed method fully captures the invocator-invocatee relationship at local neighborhoods in an FCG without exponentially inflating the state space. Using real benign app and malware samples, we demonstrate that our method, ACTS (App topologiCal signature through graphleT Sampling), can detect malware and identify malware families robustly and efficiently. More importantly, we demonstrate that, without augmenting the FCG with any semantic features such as bytecode-based vertex typing, local topological information captured by ACTS alone can achieve a high malware detection accuracy. Since ACTS only uses structural features, which are orthogonal to semantic features, it is expected that combining them would give a greater improvement in malware detection accuracy than combining non-orthogonal semantic features.Item Anti-perturbation of Online Social Networks by Graph Label Transition(arXiv, 2020) Zhuang, Jun; Al Hasan, Mohammad; Computer and Information Science, School of ScienceOnline social networks (OSNs) classify users into different categories based on their online activities and interests, a task which is referred as a node classification task. Such a task can be solved effectively using Graph Convolutional Networks (GCNs). However, a small number of users, so-called perturbators, may perform random activities on an OSN, which significantly deteriorate the performance of a GCN-based node classification task. Existing works in this direction defend GCNs either by adversarial training or by identifying the attacker nodes followed by their removal. However, both of these approaches require that the attack patterns or attacker nodes be identified first, which is difficult in the scenario when the number of perturbator nodes is very small. In this work, we develop a GCN defense model, namely GraphLT, which uses the concept of label transition. GraphLT assumes that perturbators' random activities deteriorate GCN's performance. To overcome this issue, GraphLT subsequently uses a novel Bayesian label transition model, which takes GCN's predicted labels and applies label transitions by Gibbs-sampling-based inference and thus repairs GCN's prediction to achieve better node classification. Extensive experiments on seven benchmark datasets show that GraphLT considerably enhances the performance of the node classifier in an unperturbed environment; furthermore, it validates that GraphLT can successfully repair a GCN-based node classifier with superior performance than several competing methods.Item Auto-Generating Models From Their Semantics and Constraints(2013-08-20) Pati, Tanumoy; Hill, James H. (James Haswell); Raje, Rajeev; Al Hasan, MohammadDomain-specific models powered using domain-specific modeling languages are traditionally created manually by modelers. There exist model intelligence techniques, such as constraint solvers and model guidance, which alleviate challenges associated with manually creating models, however parts of the modeling process are still manual. Moreover, state-of-the-art model intelligence techniques are---in essence---reactive (i.e., invoked by the modeler). This thesis therefore provides two contributions to model-driven engineering research using domain-specific modeling language (DSML). First, it discusses how DSML semantic and constraint can enable proactive modeling, which is a form of model intelligence that foresees model transformations, automatically executes these model transformations, and prompts the modeler for assistance when necessary. Secondly, this thesis shows how we integrated proactive modeling into the Generic Modeling environment (GME). Our experience using proactive modeling shows that it can reduce modeling effort by both automatically generating required model elements, and by guiding modelers to select what actions should be executed on the model.Item Automatic Extraction of Computer Science Concept Phrases Using a Hybrid Machine Learning Paradigm(2023-05) Jahin, S M Abrar; Al Hasan, Mohammad; Fang, Shiaofen; Mukhopadhyay, SnehasisWith the proliferation of computer science in recent years in modern society, the number of computer science-related employment is expanding quickly. Software engineer has been chosen as the best job for 2023 based on pay, stress level, opportunity for professional growth, and balance between work and personal life. This was decided by a rankings of different news, journals, and publications. Computer science occupations are anticipated to be in high demand not just in 2023, but also for the foreseeable future. It's not surprising that the number of computer science students at universities is growing and will continue to grow. The enormous increase in student enrolment in many subdisciplines of computers has presented some distinct issues. If computer science is to be incorporated into the K-12 curriculum, it is vital that K-12 educators are competent. But one of the biggest problems with this plan is that there aren't enough trained computer science professors. Numerous new fields and applications, for instance, are being introduced to computer science. In addition, it is difficult for schools to recruit skilled computer science instructors for a variety of reasons including low salary issue. Utilizing the K-12 teachers who are already in the schools, have a love for teaching, and consider teaching as a vocation is therefore the most effective strategy to improve or fix this issue. So, if we want teachers to quickly grasp computer science topics, we need to give them an easy way to learn about computer science. To simplify and expedite the study of computer science, we must acquaint school-treachers with the terminology associated with computer science concepts so they can know which things they need to learn according to their profile. If we want to make it easier for schoolteachers to comprehend computer science concepts, it would be ideal if we could provide them with a tree of words and phrases from which they could determine where the phrases originated and which phrases are connected to them so that they can be effectively learned. To find a good concept word or phrase, we must first identify concepts and then establish their connections or linkages. As computer science is a fast developing field, its nomenclature is also expanding at a frenetic rate. Therefore, adding all concepts and terms to the knowledge graph would be a challenging endeavor. Cre- ating a system that automatically adds all computer science domain terms to the knowledge graph would be a straightforward solution to the issue. We have identified knowledge graph use cases for the schoolteacher training program, which motivates the development of a knowledge graph. We have analyzed the knowledge graph's use case and the knowledge graph's ideal characteristics. We have designed a webbased system for adding, editing, and removing words from a knowledge graph. In addition, a term or phrase can be represented with its children list, parent list, and synonym list for enhanced comprehension. We' ve developed an automated system for extracting words and phrases that can extract computer science idea phrases from any supplied text, therefore enriching the knowledge graph. Therefore, we have designed the knowledge graph for use in teacher education so that school-teachers can educate K-12 students computer science topicses effectively.Item Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams(ACM, 2016-10) Zhang, Baichuan; Dundar, Murat; Al Hasan, Mohammad; Department of Computer and Information Science, School of ScienceThe name entity disambiguation task aims to partition the records of multiple real-life persons so that each partition contains records pertaining to a unique person. Most of the existing solutions for this task operate in a batch mode, where all records to be disambiguated are initially available to the algorithm. However, more realistic settings require that the name disambiguation task be performed in an online fashion, in addition to, being able to identify records of new ambiguous entities having no preexisting records. In this work, we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation task. Our proposed method uses a Dirichlet process prior with a Normal x Normal x Inverse Wishart data model which enables identification of new ambiguous entities who have no records in the training data. For online classification, we use one sweep Gibbs sampler which is very efficient and effective. As a case study we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups. Our experimental results demonstrate that the proposed method is better than existing methods for performing online name disambiguation task.Item Characterizing software components using evolutionary testing and path-guided analysis(2013-12-16) McNeany, Scott Edward; Hill, James H. (James Haswell); Raje, Rajeev; Al Hasan, Mohammad; Fang, ShiaofenEvolutionary testing (ET) techniques (e.g., mutation, crossover, and natural selection) have been applied successfully to many areas of software engineering, such as error/fault identification, data mining, and software cost estimation. Previous research has also applied ET techniques to performance testing. Its application to performance testing, however, only goes as far as finding the best and worst case, execution times. Although such performance testing is beneficial, it provides little insight into performance characteristics of complex functions with multiple branches. This thesis therefore provides two contributions towards performance testing of software systems. First, this thesis demonstrates how ET and genetic algorithms (GAs), which are search heuristic mechanisms for solving optimization problems using mutation, crossover, and natural selection, can be combined with a constraint solver to target specific paths in the software. Secondly, this thesis demonstrates how such an approach can identify local minima and maxima execution times, which can provide a more detailed characterization of software performance. The results from applying our approach to example software applications show that it is able to characterize different execution paths in relatively short amounts of time. This thesis also examines a modified exhaustive approach which can be plugged in when the constraint solver cannot properly provide the information needed to target specific paths.