FS3: A Sampling based method for top-k Frequent Subgraph Mining

Date
2015
Language
American English
Abstract

Mining labeled subgraphs is a popular research task in data mining because of its potential applications in many scientific domains. All existing methods for this task explicitly or implicitly solve the subgraph isomorphism problem, which is computationally expensive, so they scale poorly when the graphs in the input database are large. In this work, we propose FS3, a sampling-based method that mines a small collection of subgraphs that are most frequent in the probabilistic sense. FS3 performs Markov Chain Monte Carlo (MCMC) sampling over the space of fixed-size subgraphs such that potentially frequent subgraphs are sampled more often. In addition, FS3 is equipped with an innovative queue manager, which stores the sampled subgraphs in a finite queue over the course of mining in such a manner that the top-k positions in the queue contain the most frequent subgraphs. Our experiments on databases of large graphs show that FS3 is efficient and that it obtains subgraphs that are the most frequent among the subgraphs of a given size.
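
The general idea of sampling fixed-size subgraphs with MCMC and keeping the most-visited ones in a bounded queue can be illustrated with a minimal Python sketch. This is not the FS3 algorithm as published: it walks over a single graph rather than a graph database, the `score` function is a hypothetical stand-in for whatever frequency estimate guides the sampler, and the eviction rule (drop the least-visited entry) is an assumed policy, not the paper's queue manager. All function names are illustrative. The sketch assumes a connected graph with more vertices than the subgraph size and a strictly positive `score`.

```python
import random
from collections import Counter

def is_connected(graph, nodes):
    """Check that `nodes` induces a connected subgraph of `graph`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def neighbors(graph, state):
    """Connected vertex sets of the same size reachable by swapping one vertex."""
    result = set()
    border = {v for u in state for v in graph[u]} - state
    for out in state:
        rest = state - {out}
        for new in border:
            cand = frozenset(rest | {new})
            if is_connected(graph, cand):
                result.add(cand)
    return sorted(result, key=sorted)  # deterministic order for reproducibility

def mh_sample(graph, start, score, steps, queue_size, rng):
    """Metropolis-Hastings walk; returns visit counts held in a bounded queue."""
    queue = Counter()
    x, nx = start, neighbors(graph, start)
    for _ in range(steps):
        y = rng.choice(nx)
        ny = neighbors(graph, y)
        # The proposal is uniform over neighbors, so correct by |N(x)| / |N(y)|.
        accept = min(1.0, (score(y) * len(nx)) / (score(x) * len(ny)))
        if rng.random() < accept:
            x, nx = y, ny
        if x in queue or len(queue) < queue_size:
            queue[x] += 1
        else:
            # Assumed eviction policy: drop the least-visited entry.
            victim = min(queue, key=queue.get)
            del queue[victim]
            queue[x] = 1
    return queue
```

With an adjacency dictionary such as `{0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}`, calling `mh_sample(graph, frozenset({0, 1}), lambda s: 1.0, 2000, 2, random.Random(0))` returns a `Counter`, and `queue.most_common(k)` then lists the k most-visited vertex sets still in the queue.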

Cite As
Saha, T. K., & Al Hasan, M. (2015). FS3: A sampling based method for top-k frequent subgraph mining. Statistical Analysis and Data Mining: The ASA Data Science Journal, 8(4), 245–261. http://doi.org/10.1002/sam.11277
Journal
Statistical Analysis and Data Mining: The ASA Data Science Journal
Type
Article
Version
Author's manuscript