Browsing by Subject "Hawkes process"
Item Hawkes binomial topic model with applications to coupled conflict-Twitter data
(Project Euclid, 2020-12) Mohler, George; McGrath, Erin; Buntain, Cody; LaFree, Gary; Computer and Information Science, School of Science
We consider the problem of modeling and clustering heterogeneous event data arising from coupled conflict event and social media data sets. In this setting conflict events trigger responses on social media, and, at the same time, signals of grievance detected in social media may serve as leading indicators for subsequent conflict events. For this purpose we introduce the Hawkes Binomial Topic Model (HBTM) where marks, Tweets and conflict event descriptions are represented as bags of words following a Binomial distribution. When viewed as a branching process, the daughter event bag of words is generated by randomly turning on/off parent words through independent Bernoulli random variables. We then use expectation–maximization to estimate the model parameters and branching structure of the process. The inferred branching structure is then used for topic cascade detection, short-term forecasting, and investigating the causal dependence of grievance on social media and conflict events in recent elections in Nigeria and Kenya.

Item Predicting Virality on Networks Using Local Graphlet Frequency Distribution
(IEEE, 2018-12) Baas, Andrew; Hung, Frances; Sha, Hao; Al Hasan, Mohammad; Mohler, George; Computer and Information Science, School of Science
The task of predicting virality has far-reaching consequences, from the world of advertising to more recent attempts to reduce the spread of fake news. Previous work has shown that graphlet distribution is an effective feature for predicting virality. Here, we investigate the use of aggregated edge-centric local graphlets around source nodes as features for virality prediction. 
These prediction features are used to predict expected virality for both a time-independent Hawkes model and an independent cascade model of virality. In the Hawkes model, we use linear regression to predict the number of Hawkes events and node ranking, while in the independent cascade model we use logistic regression to predict whether a k-size cascade will multiply by a factor X in size. Our study indicates that local graphlet frequency distribution can effectively capture the variances of the viral processes simulated by the Hawkes process and the independent cascade process. Furthermore, we identify a group of local graphlets which might be significant in the viral processes. We compare the effectiveness of our methods with eigenvector centrality-based node choice.

Item The Role of Graphlets in Viral Processes on Networks
(Springer, 2018) Khorshidi, Samira; Al Hasan, Mohammad; Mohler, George; Short, Martin; Computer and Information Science, School of Science
Predicting the evolution of viral processes on networks is an important problem with applications arising in biology, the social sciences, and the study of the Internet. In existing works, mean-field analysis based upon degree distribution is used for the prediction of viral spreading across networks of different types. However, it has been shown that degree distribution alone fails to predict the behavior of viruses on some real-world networks, and recent attempts have been made to use assortativity to address this shortcoming. In this paper, we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution in combination with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. 
Using a data-driven approach by coupling predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different network topologies. Our results not only highlight the importance of graphlets but also identify a small collection of graphlets which may have the highest influence over the viral processes on a network.

Item Solving Prediction Problems from Temporal Event Data on Networks
(2021-08) Sha, Hao; Mohler, George; Hasan, Mohammad; Dundar, Murat; Mukhopadhyay, Snehasis
Many complex processes can be viewed as sequential events on a network. In this thesis, we study the interplay between a network and the event sequences on it. We first focus on predicting events on a known network. Examples include modeling retweet cascades, forecasting earthquakes, and tracing the source of a pandemic. Specifically, given the network structure, we solve two types of problems: (1) forecasting future events based on the historical events, and (2) identifying the initial event(s) based on some later observations of the dynamics. The inverse problem of inferring the unknown network topology or links, based on the events, is also of great importance. Examples along this line include constructing influence networks among Twitter users from their tweets, soliciting new members to join an event based on their participation history, and recommending positions for job seekers according to their work experience. 
Following this direction, we study two types of problems from event sequences: (1) recovering influence networks, and (2) predicting links between a node and a group of nodes.

Item SOS-EW: System for Overdose Spike Early Warning Using Drug Mover’s Distance-Based Hawkes Processes
(Springer, 2020) Chiang, Wen-Hao; Yuan, Baichuan; Li, Hao; Wang, Bao; Bertozzi, Andrea; Carter, Jeremy; Ray, Brad; Mohler, George; Computer and Information Science, School of Science
Opioid addictions and overdoses have increased across the U.S. and internationally over the past decade. In urban environments, overdoses cluster in space and time, with 50% of overdoses occurring in less than 5% of the city and dozens of calls for emergency medical services being made within a 48-hour period. In this work, we introduce a system for early detection of opioid overdose clusters based upon the toxicology report of an initial event. We first use drug SMILES, one-hot encoded molecular substructures, to generate a bag of drug vectors corresponding to each overdose (overdoses are often characterized by multiple drugs taken at the same time). We then use spectral clustering to generate overdose categories and estimate multivariate Hawkes processes for the space-time intensity of overdoses following an initial event. As the productivity parameter of the process depends on the overdose category, this allows us to estimate the magnitude of an overdose spike based on the substances present (e.g. fentanyl leads to more subsequent overdoses compared to Oxycontin). We validate the model using opioid overdose deaths in Indianapolis and show that the model outperforms several recently introduced Hawkes-Topic models based on Dirichlet processes. 
Our system could be used in combination with drug test strips to alert drug-using populations of risky batches on the market or to more efficiently allocate naloxone to users and health/social workers.

Item Temporal Event Modeling of Social Harm with High Dimensional and Latent Covariates
(2022-08) Liu, Xueying; Mohler, George; Fang, Shiaofen; Wang, Honglang; Hasan, Mohammad A.
The counting process is fundamental to many real-world problems with event data. The Poisson process, used as the background intensity of the Hawkes process, is the most commonly used point process. The Hawkes process, a self-exciting point process, can be fit to temporal event data, spatial-temporal event data, and event data with covariates. We study a Hawkes process fit to heterogeneous drug overdose data via a novel semi-parametric approach. The counting process is also related to survival data, since both study the occurrences of events over time. We fit a Cox model to temporal event data with a large corpus that is processed into high dimensional covariates. We study the significant features that influence the intensity of events.
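
Note on the subject term: the abstracts above describe the Hawkes process as a self-exciting point process with a Poisson background intensity. As a rough illustration only, the conditional intensity of a univariate Hawkes process can be sketched as below. The exponential triggering kernel, the function name `hawkes_intensity`, and the parameter names and default values (`mu`, `alpha`, `beta`) are assumptions made for this sketch; none of them is taken from the listed works, which use their own kernels and estimation procedures.

```python
import math


def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.0):
    """Conditional intensity at time t of a univariate Hawkes process.

    Assumed form (illustrative, not from the listed works):
        lambda(t) = mu + sum over past events s < t of
                    alpha * beta * exp(-beta * (t - s))
    mu    : constant Poisson background rate
    alpha : expected number of events triggered by one event
    beta  : decay rate of the exponential triggering kernel
    """
    # Only events strictly before t can excite the intensity at t.
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t
    )
    return mu + excitation


if __name__ == "__main__":
    # Intensity shortly after a burst of events versus long after it.
    history = [1.0, 1.2, 1.3]
    print(hawkes_intensity(1.5, history))
    print(hawkes_intensity(10.0, history))
```

With no past events the intensity reduces to the background rate `mu`, and each past event adds a contribution that decays over time, which is the self-exciting behavior the abstracts refer to.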