RASMA: a reverse search algorithm for mining maximal frequent subgraphs

dc.contributor.author: Salem, Saeed
dc.contributor.author: Alokshiya, Mohammed
dc.contributor.author: Hasan, Mohammad Al
dc.contributor.department: Computer and Information Science, School of Science [en_US]
dc.date.accessioned: 2022-07-05T13:08:58Z
dc.date.available: 2022-07-05T13:08:58Z
dc.date.issued: 2021-03-16
dc.description.abstract: Background: Given a collection of coexpression networks over a set of genes, identifying subnetworks that appear frequently is an important research problem known as mining frequent subgraphs. Maximal frequent subgraphs are a representative set of frequent subgraphs: a frequent subgraph is maximal if it has no frequent supergraph. In bioinformatics, methods for mining frequent and/or maximal frequent subgraphs can be used to discover interesting network motifs that elucidate complex interactions among genes, reflected through the edges of the frequent subnetworks. Further study of frequent coexpression subnetworks enhances the discovery of biological modules and biological signatures for gene expression and disease classification. Results: We propose a reverse search algorithm, called RASMA, for mining frequent and maximal frequent subgraphs in a given collection of graphs. A key innovation in RASMA is a connected subgraph enumerator that uses a reverse-search strategy to enumerate connected subgraphs of an undirected graph. Using this enumeration strategy, RASMA obtains all maximal frequent subgraphs efficiently. Because enumerating all frequent subgraphs while mining for the maximal ones is computationally prohibitive, RASMA employs several pruning strategies that substantially improve its overall runtime. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion: Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that frequent subnetworks are highly enriched with known biological ontologies. [en_US]
dc.eprint.version: Final published version [en_US]
dc.identifier.citation: Salem S, Alokshiya M, Hasan MA. RASMA: a reverse search algorithm for mining maximal frequent subgraphs. BioData Min. 2021;14(1):19. Published 2021 Mar 16. doi:10.1186/s13040-021-00250-1 [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/29476
dc.language.iso: en_US [en_US]
dc.publisher: BMC [en_US]
dc.relation.isversionof: 10.1186/s13040-021-00250-1 [en_US]
dc.relation.journal: BioData Mining [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.source: PMC [en_US]
dc.subject: Biological networks [en_US]
dc.subject: Subgraph enumeration [en_US]
dc.subject: Frequent subgraphs [en_US]
dc.subject: Maximal subgraphs [en_US]
dc.subject: Reverse search [en_US]
dc.title: RASMA: a reverse search algorithm for mining maximal frequent subgraphs [en_US]
dc.type: Article [en_US]
Files
Original bundle
Name: 13040_2021_Article_250.pdf
Size: 688.29 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission