Department of Computer and Information Science

Permanent URI for this community

https://hdl.handle.net/1805/6744

Anti-perturbation of Online Social Networks by Graph Label Transition
(arXiv, 2020) Zhuang, Jun; Al Hasan, Mohammad; Computer and Information Science, School of Science
Online social networks (OSNs) classify users into different categories based on their online activities and interests, a task which is referred to as a node classification task. Such a task can be solved effectively using Graph Convolutional Networks (GCNs). However, a small number of users, so-called perturbators, may perform random activities on an OSN, which significantly deteriorates the performance of a GCN-based node classification task. Existing works in this direction defend GCNs either by adversarial training or by identifying the attacker nodes followed by their removal. However, both of these approaches require that the attack patterns or attacker nodes be identified first, which is difficult in the scenario when the number of perturbator nodes is very small. In this work, we develop a GCN defense model, namely GraphLT, which uses the concept of label transition. GraphLT assumes that perturbators' random activities deteriorate GCN's performance. To overcome this issue, GraphLT subsequently uses a novel Bayesian label transition model, which takes GCN's predicted labels and applies label transitions by Gibbs-sampling-based inference and thus repairs GCN's prediction to achieve better node classification. Extensive experiments on seven benchmark datasets show that GraphLT considerably enhances the performance of the node classifier in an unperturbed environment; furthermore, they validate that GraphLT can successfully repair a GCN-based node classifier with performance superior to that of several competing methods.
Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams
(ACM, 2016-10) Zhang, Baichuan; Dundar, Murat; Al Hasan, Mohammad; Department of Computer and Information Science, School of Science
The name entity disambiguation task aims to partition the records of multiple real-life persons so that each partition contains records pertaining to a unique person. Most of the existing solutions for this task operate in a batch mode, where all records to be disambiguated are initially available to the algorithm. However, more realistic settings require that the name disambiguation task be performed in an online fashion, in addition to being able to identify records of new ambiguous entities having no preexisting records. In this work, we propose a Bayesian non-exhaustive classification framework for solving the online name disambiguation task. Our proposed method uses a Dirichlet process prior with a Normal x Normal x Inverse Wishart data model, which enables identification of new ambiguous entities who have no records in the training data. For online classification, we use a one-sweep Gibbs sampler, which is very efficient and effective. As a case study, we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups. Our experimental results demonstrate that the proposed method is better than existing methods for performing the online name disambiguation task.
A Combined Representation Learning Approach for Better Job and Skill Recommendation
(ACM, 2018-10) Dave, Vachik S.; Al Hasan, Mohammad; Zhang, Baichuan; AlJadda, Khalifeh; Korayem, Mohammed; Computer and Information Science, School of Science
Job recommendation is an important task for the modern recruitment industry. An excellent job recommender system not only recommends a higher-paying job that is maximally aligned with the skill set of the current job, but also suggests acquiring a few additional skills that are required to assume the new position. In this work, we created three types of information networks from the historical job data: (i) job transition network, (ii) job-skill network, and (iii) skill co-occurrence network. We provide a representation learning model which can utilize the information from all three networks to jointly learn the representation of the jobs and skills in the shared k-dimensional latent space. In our experiments, we show that by jointly learning the representation for the jobs and skills, our model provides better recommendations for both jobs and skills. Additionally, we show some case studies which validate our claims.
Con-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec
(Springer, 2017) Saha, Tanay Kumar; Joty, Shafiq; Al Hasan, Mohammad; Computer and Information Science, School of Science
We present a novel approach to learning distributed representations of sentences from unlabeled data by modeling both the content and the context of a sentence. The content model learns sentence representation by predicting its words. On the other hand, the context model comprises a neighbor prediction component and a regularizer to model distributional and proximity hypotheses, respectively. We propose an online algorithm to train the model components jointly. We evaluate the models in a setup where contextual information is available. The experimental results on tasks involving classification, clustering, and ranking of sentences show that our model outperforms the best existing models by a wide margin across multiple datasets.
Deep Learning based Crop Row Detection with Online Domain Adaptation
(ACM, 2021-08) Doha, Rashed; Al Hasan, Mohammad; Anwar, Sohel; Rajendran, Veera; Computer and Information Science, School of Science
Detecting crop rows from video frames in real time is a fundamental challenge in the field of precision agriculture. The deep-learning-based semantic segmentation method U-net, although successful in many tasks related to precision agriculture, performs poorly on this task. The reasons include the paucity of large-scale labeled datasets in this domain, diversity in crops, and the diversity of appearance of the same crops at various stages of their growth. In this work, we discuss the development of a practical real-life crop row detection system in collaboration with an agricultural sprayer company. Our proposed method takes the output of semantic segmentation using U-net and then applies a clustering-based probabilistic temporal calibration which can adapt to different fields and crops without the need for retraining the network. Experimental results validate that our method can be used both for refining the results of the U-net to reduce errors and for frame interpolation of the input video stream.
Defending Graph Convolutional Networks against Dynamic Graph Perturbations via Bayesian Self-Supervision
(AAAI Technical Track, 2022-06-28) Zhuang, Jun; Al Hasan, Mohammad; Computer and Information Science, School of Science
In recent years, plentiful evidence illustrates that Graph Convolutional Networks (GCNs) achieve extraordinary accomplishments on the node classification task. However, GCNs may be vulnerable to adversarial attacks on label-scarce dynamic graphs. Many existing works aim to strengthen the robustness of GCNs; for instance, adversarial training is used to shield GCNs against malicious perturbations. However, these works fail on dynamic graphs for which label scarcity is a pressing issue. To overcome label scarcity, self-training attempts to iteratively assign pseudo-labels to highly confident unlabeled nodes, but such attempts may suffer serious degradation under dynamic graph perturbations. In this paper, we generalize noisy supervision as a kind of self-supervised learning method and then propose a novel Bayesian self-supervision model, namely GraphSS, to address the issue. Extensive experiments demonstrate that GraphSS can not only affirmatively flag the perturbations on dynamic graphs but also effectively recover the prediction of a node classifier when the graph is under such perturbations. These two advantages are shown to generalize over three classic GCNs across five public graph datasets.
Deperturbation of Online Social Networks via Bayesian Label Transition
(Society for Industrial and Applied Mathematics, 2022) Zhuang, Jun; Al Hasan, Mohammad; Computer and Information Science, School of Science
Online social networks (OSNs) classify users into different categories based on their online activities and interests, a task which is referred to as a node classification task. Such a task can be solved effectively using Graph Convolutional Networks (GCNs). However, a small number of users, so-called perturbators, may perform random activities on an OSN, which significantly deteriorates the performance of a GCN-based node classification task. Existing works in this direction defend GCNs either by adversarial training or by identifying the attacker nodes followed by their removal. However, both of these approaches require that the attack patterns or attacker nodes be identified first, which is difficult in the scenario when the number of perturbator nodes is very small. In this work, we develop a GCN defense model, namely GraphLT, which uses the concept of label transition. GraphLT assumes that perturbators' random activities deteriorate GCN's performance. To overcome this issue, GraphLT subsequently uses a novel Bayesian label transition model, which takes GCN's predicted labels and applies label transitions by Gibbs-sampling-based inference and thus repairs GCN's prediction to achieve better node classification. Extensive experiments on seven benchmark datasets show that GraphLT considerably enhances the performance of the node classifier in an unperturbed environment; furthermore, they validate that GraphLT can successfully repair a GCN-based node classifier with performance superior to that of several competing methods.
Discovery of Functional Motifs from the Interface Region of Oligomeric Proteins using Frequent Subgraph Mining
(IEEE, 2018) Saha, Tanay Kumar; Katebi, Ataur; Dhifli, Wajdi; Al Hasan, Mohammad; Computer and Information Science, School of Science
Modeling the interface region of a protein complex paves the way for understanding its dynamics and functionalities. Existing works model the interface region of a complex by using different approaches, such as the residue composition at the interface region, the geometry of the interface residues, or the structural alignment of interface regions. These approaches are useful for ranking a set of docked conformations or for building scoring functions for protein-protein docking, but they do not provide a generic and scalable technique for the extraction of interface patterns leading to functional motif discovery. In this work, we model the interface region of a protein complex by graphs and extract interface patterns of the given complex in the form of frequent subgraphs. To achieve this, we develop a scalable algorithm for frequent subgraph mining. We show that a systematic review of the mined subgraphs provides an effective method for the discovery of functional motifs that exist along the interface region of a given protein complex.
Dynamic topic modeling of the COVID-19 Twitter narrative among U.S. governors and cabinet executives
(2020-04-19) Sha, Hao; Al Hasan, Mohammad; Mohler, George; Brantingham, P.; Computer and Information Science, School of Science
A combination of federal and state-level decision making has shaped the response to COVID-19 in the United States. In this paper, we analyze the Twitter narratives around this decision making by applying a dynamic topic model to COVID-19 related tweets by U.S. Governors and Presidential cabinet members. We use a network Hawkes binomial topic model to track evolving sub-topics around risk, testing, and treatment. We also construct influence networks amongst government officials using Granger causality inferred from the network Hawkes process.
E-CLoG: Counting edge-centric local graphlets
(IEEE, 2017-12) Dave, Vachik S.; Ahmed, Nesreen K.; Al Hasan, Mohammad; Computer and Information Science, School of Science
In recent years, graphlet counting has emerged as an important task in topological graph analysis. However, the existing works on graphlet counting obtain the graphlet counts for the entire network as a whole. These works capture the key graphical patterns that prevail in a given network, but they fail to meet the demand of the majority of real-life graph-related prediction tasks, such as link prediction and edge/node classification, which require building features for an edge (or a vertex) of a network. To meet the demand for such applications, efficient algorithms are needed for counting local graphlets within the context of an edge (or a vertex). In this work, we propose an efficient method, titled E-CLOG, for counting all local graphlets of size 3, 4, and 5 within the context of a given edge, for all of its different edge orbits. We also provide a shared-memory, multi-core implementation of E-CLOG, which makes it even more scalable for very large real-world networks. In particular, we obtain strong scaling on a variety of graphs (14x-20x on 36 cores). We provide extensive experimental results to demonstrate the efficiency and effectiveness of the proposed method. For instance, we show that E-CLOG is faster than existing work by multiple orders of magnitude; for the Wordnet graph, E-CLOG counts all local graphlets of size 3, 4, and 5 in 1.5 hours using a single thread and in only a few minutes using the parallel implementation, whereas the baseline method does not finish in more than 4 days. We also show that local graphlet counts around an edge are much better features for link prediction than well-known topological features; our experiments show that the former yield an improvement of between 10% and 45% in the AUC value for predicting future links in three real-life social and collaboration networks.

Browsing Department of Computer and Information Science by Author "Al Hasan, Mohammad"