GRAPH BASED MINING ON WEIGHTED DIRECTED GRAPHS FOR SUBNETWORKS AND PATH DISCOVERY
Abstract
Subnetwork or path mining is an emerging data mining problem in many areas, including scientific and commercial applications. Many natural and man-made systems are structured as networks, and graph modeling is an effective way to represent them. Traditional machine learning and data mining approaches treat data as a collection of homogeneous objects that are independent of each other, whereas network data are potentially heterogeneous and interlinked. In this paper we propose a novel algorithm to find subnetworks and Maximal paths in a weighted, directed network represented as a graph. The main objective of this study is to find meaningful Maximal paths in a given network based on three key parameters: node weight, edge weight, and direction. The algorithm extracts Maximal paths from a network modeled according to a user’s interest, and it allows the user to assign weights to the nodes and edges of a biological network. The performance of the proposed technique was tested on a colorectal cancer biological network. The subnetworks and paths obtained from this network by our mining algorithm were scored based on their biological significance, and the derived subnetworks and Maximal paths were verified using MetaCore™ as well as the literature. The algorithm has been developed into a tool that takes a node list and an edge list as input. The tool can also identify the entities (genes, proteins, etc.) upstream and downstream of a given entity in the derived Maximal paths. The time complexity of the algorithm is O(n log n) in the best case and O(n^2 log n) in the worst case.
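To make the inputs and outputs described in the abstract concrete, the following is a minimal illustrative sketch, not the algorithm proposed in this work. It assumes that a Maximal path is a simple directed path that cannot be extended at either end once low-weight nodes and edges are filtered out. The function names (`maximal_paths`, `upstream_downstream`), the threshold parameters, and the toy node and edge lists are all hypothetical. The sketch enumerates paths exhaustively and makes no attempt to match the complexity stated above.

```python
from collections import defaultdict


def maximal_paths(node_weights, edges, min_node_weight=0.0, min_edge_weight=0.0):
    """Enumerate simple directed paths that cannot be extended further.

    node_weights: dict mapping node -> weight (hypothetical input format)
    edges: list of (source, target, weight) tuples
    Thresholds are an assumed way of applying the node and edge weights.
    """
    # Keep only nodes whose weight meets the threshold.
    kept = {n for n, w in node_weights.items() if w >= min_node_weight}

    # Build the filtered adjacency list, respecting edge direction.
    succ = defaultdict(list)
    has_pred = set()
    for src, dst, w in edges:
        if src in kept and dst in kept and w >= min_edge_weight:
            succ[src].append(dst)
            has_pred.add(dst)

    paths = []

    def extend(path, on_path):
        # Successors already on the path are skipped to keep the path simple.
        nexts = [n for n in succ[path[-1]] if n not in on_path]
        if not nexts:
            if len(path) > 1:  # ignore isolated nodes
                paths.append(list(path))
            return
        for n in nexts:
            path.append(n)
            on_path.add(n)
            extend(path, on_path)
            path.pop()
            on_path.remove(n)

    # Start only from nodes with no incoming kept edge, so no path can be
    # extended backwards.
    for start in sorted(kept - has_pred):
        extend([start], {start})
    return paths


def upstream_downstream(paths, entity):
    """Collect nodes before (upstream) and after (downstream) an entity."""
    upstream, downstream = set(), set()
    for path in paths:
        if entity in path:
            i = path.index(entity)
            upstream.update(path[:i])
            downstream.update(path[i + 1:])
    return upstream, downstream


# Toy data with placeholder labels; these are not real genes or proteins.
node_weights = {"A": 1.0, "B": 0.8, "C": 0.2, "D": 0.9, "E": 0.7}
edges = [("A", "B", 0.9), ("B", "C", 0.9), ("B", "D", 0.6),
         ("D", "E", 0.8), ("A", "E", 0.1)]

paths = maximal_paths(node_weights, edges, min_node_weight=0.5, min_edge_weight=0.5)
print(paths)  # [['A', 'B', 'D', 'E']]
print(upstream_downstream(paths, "D"))
```

In this toy run, node C and the edge from A to E fall below the assumed thresholds, so the only path that survives is A, B, D, E; the upstream entities of D are A and B, and its downstream entity is E. Starting only from nodes without incoming edges means a network made up entirely of cycles yields no paths under this simplified definition.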