FS3: A Sampling based method for top-k Frequent Subgraph Mining

dc.contributor.author: Saha, Tanay Kumar
dc.contributor.author: Al Hasan, Mohammad
dc.contributor.department: Department of Computer & Information Science, School of Science
dc.date.accessioned: 2015-12-29T20:37:35Z
dc.date.available: 2015-12-29T20:37:35Z
dc.date.issued: 2015
dc.description.abstract: Mining labeled subgraphs is a popular research task in data mining because of its potential application in many different scientific domains. All the existing methods for this task explicitly or implicitly solve the subgraph isomorphism problem, which is computationally expensive, so they scale poorly when the graphs in the input database are large. In this work, we propose FS3, a sampling-based method. It mines a small collection of subgraphs that are most frequent in the probabilistic sense. FS3 performs Markov Chain Monte Carlo (MCMC) sampling over the space of fixed-size subgraphs such that the potentially frequent subgraphs are sampled more often. In addition, FS3 is equipped with an innovative queue manager, which stores the sampled subgraphs in a finite queue over the course of mining in such a manner that the top-k positions in the queue contain the most frequent subgraphs. Our experiments on databases of large graphs show that FS3 is efficient, and that it obtains subgraphs that are the most frequent amongst the subgraphs of a given size.
dc.eprint.version: Author's manuscript
dc.identifier.citation: Saha, T. K., & Al Hasan, M. (2015). FS3: A sampling based method for top-k frequent subgraph mining. Statistical Analysis and Data Mining: The ASA Data Science Journal, 8(4), 245–261. http://doi.org/10.1002/sam.11277
dc.identifier.uri: https://hdl.handle.net/1805/7848
dc.language.iso: en_US
dc.relation.isversionof: 10.1002/sam.11277
dc.relation.journal: Statistical Analysis and Data Mining: The ASA Data Science Journal
dc.rights: Publisher Policy
dc.source: Author
dc.subject: data mining
dc.subject: subgraph mining
dc.subject: FS3
dc.title: FS3: A Sampling based method for top-k Frequent Subgraph Mining
dc.type: Article
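
The abstract names two ingredients: MCMC sampling that visits high-frequency candidates more often, and a finite queue whose contents end up being the top-k candidates. The sketch below illustrates both ideas on a toy problem and is not the FS3 algorithm itself. It assumes a small ring of integer states standing in for subgraphs, a made-up `scores` list standing in for support values, and a symmetric one-step proposal; the names `metropolis_hastings` and `TopKQueue` are hypothetical and do not come from the paper.

```python
import heapq
import random
from collections import Counter


def metropolis_hastings(scores, steps, rng):
    """Random walk over states 0..n-1 arranged on a ring.

    The target distribution is proportional to scores[state]. The proposal
    (one step left or right) is symmetric, so a move is accepted with
    probability min(1, score(new) / score(old)). Scores must be positive.
    """
    n = len(scores)
    state = rng.randrange(n)
    visits = Counter()
    for _ in range(steps):
        proposal = (state + rng.choice((-1, 1))) % n
        if rng.random() < min(1.0, scores[proposal] / scores[state]):
            state = proposal
        visits[state] += 1
    return visits


class TopKQueue:
    """Bounded queue that retains the k highest-scoring distinct items."""

    def __init__(self, k):
        self.k = k
        self._heap = []      # min-heap of (score, item); root is the weakest
        self._seen = set()   # items currently held in the queue

    def offer(self, item, score):
        if item in self._seen:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (score, item))
            self._seen.add(item)
        elif score > self._heap[0][0]:
            # Queue is full: evict the weakest entry in favour of the new one.
            _, evicted = heapq.heapreplace(self._heap, (score, item))
            self._seen.discard(evicted)
            self._seen.add(item)

    def top(self):
        """Return the retained items, highest score first."""
        return [item for _, item in sorted(self._heap, reverse=True)]


# Toy usage: six hypothetical candidates with made-up support values.
scores = [1, 2, 8, 3, 1, 5]
visits = metropolis_hastings(scores, steps=20000, rng=random.Random(0))
queue = TopKQueue(k=3)
for state in visits:
    queue.offer(state, scores[state])
```

In this toy setting the sampler spends most of its time on the high-scoring states, and the queue keeps only the three best candidates it has been offered. A real subgraph miner would replace the ring with moves between neighbouring subgraphs and the fixed scores with estimated support.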
Files
Original bundle
Name: Saha_2015_FS3.pdf
Size: 240.11 KB
Format: Adobe Portable Document Format