GRAPH BASED MINING ON WEIGHTED DIRECTED GRAPHS FOR SUBNETWORKS AND PATH DISCOVERY

dc.contributor.advisor: Palakal, Mathew J.
dc.contributor.author: Abdulkarim, Sijin Cherupilly
dc.contributor.other: Fang, Shiaofen
dc.contributor.other: Xia, Yuni
dc.date.accessioned: 2011-08-16T19:34:53Z
dc.date.available: 2011-08-16T19:34:53Z
dc.date.issued: 2011-08-16
dc.degree.date: 2011 [en_US]
dc.degree.discipline: Computer & Information Science [en]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: M.S. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Subnetwork or path mining is an emerging data mining problem in many areas, including scientific and commercial applications. Many natural and man-made systems are structured as networks, and graph modeling is an effective way of representing them. Traditional machine learning and data mining approaches treat data as a collection of homogeneous objects that are independent of each other, whereas network data are potentially heterogeneous and interlinked. In this paper we propose a novel algorithm to find subnetworks and maximal paths in a weighted, directed network represented as a graph. The main objective of this study is to find meaningful maximal paths in a given network based on three key parameters: node weight, edge weight, and direction. The algorithm is an effective way to extract maximal paths from a network modeled according to a user's interest, and it allows the user to assign weights to the nodes and edges of a biological network. The performance of the proposed technique was tested on a colorectal cancer biological network. The subnetworks and paths obtained from the biological network through our network mining algorithm were scored based on their biological significance. The derived subnetworks and maximal paths were verified using MetaCore™ as well as the literature. The algorithm has been developed into a tool that takes a node list and an edge list as input. The tool can also identify the entities upstream and downstream of a given entity (gene, protein, etc.) in the derived maximal paths. The complexity of the algorithm is found to be O(n log n) in the best case and O(n^2 log n) in the worst case. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/2618
dc.identifier.uri: http://dx.doi.org/10.7912/C2/2287
dc.language.iso: en_US [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Bioinformatics [en_US]
dc.subject.lcsh: Data mining [en_US]
dc.subject.lcsh: Directed graphs [en_US]
dc.title: GRAPH BASED MINING ON WEIGHTED DIRECTED GRAPHS FOR SUBNETWORKS AND PATH DISCOVERY [en_US]
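
The abstract describes extracting maximal paths from a weighted, directed graph and then reading off the upstream and downstream entities of a chosen node. The sketch below illustrates that general idea only; it is not the thesis's algorithm and makes no attempt to match its stated complexity. Everything in it is an assumption made for illustration: the function names, the threshold parameters `min_node` and `min_edge`, the definition of a maximal path as a simple path that starts at a node with no incoming edge and cannot be extended further, the additive score, and the toy node names and weights.

```python
from collections import defaultdict


def maximal_paths(node_weights, edges, min_node=0, min_edge=0):
    """Enumerate maximal simple directed paths after weight filtering.

    Assumed definition (not taken from the thesis): a path is maximal if it
    starts at a node with no incoming edge in the filtered graph and cannot
    be extended at its end without revisiting a node.
    """
    # Keep only nodes and edges that meet the (hypothetical) thresholds.
    keep = {n for n, w in node_weights.items() if w >= min_node}
    succ = defaultdict(list)
    has_pred = set()
    for u, v, w in edges:
        if u in keep and v in keep and w >= min_edge:
            succ[u].append((v, w))
            has_pred.add(v)

    paths = []

    def extend(path, score):
        # Follow edge direction; never revisit a node already on the path.
        nexts = [(v, w) for v, w in succ[path[-1]] if v not in path]
        if not nexts:
            if len(path) > 1:
                paths.append((score, path))
            return
        for v, w in nexts:
            # Assumed score: sum of node and edge weights along the path.
            extend(path + [v], score + w + node_weights[v])

    # Cyclic components with no source node are skipped by this sketch.
    for start in sorted(keep - has_pred):
        extend([start], node_weights[start])
    return sorted(paths, key=lambda p: -p[0])


def upstream_downstream(paths, entity):
    """Collect nodes that precede and follow `entity` on any derived path."""
    up, down = set(), set()
    for _, path in paths:
        if entity in path:
            i = path.index(entity)
            up.update(path[:i])
            down.update(path[i + 1:])
    return up, down


# Toy network with made-up names and weights.
node_weights = {"A": 3, "B": 2, "C": 4, "D": 1}
edges = [("A", "B", 5), ("B", "C", 4), ("A", "D", 1), ("D", "C", 3)]
paths = maximal_paths(node_weights, edges, min_edge=2)
```

With `min_edge=2` the weak edge from A to D is dropped, so D becomes a starting node of its own short path. Calling `upstream_downstream(paths, "B")` then reports the nodes before and after B on the paths that contain it.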
Files
Original bundle
Name: Thesis.pdf
Size: 3.58 MB
Format: Adobe Portable Document Format