TEXT MINER FOR HYPERGRAPHS USING OUTPUT SPACE SAMPLING

dc.contributor.advisor: Mukhopadhyay, Snehasis
dc.contributor.author: Tirupattur, Naveen
dc.contributor.other: Fang, Shiaofen
dc.contributor.other: Xia, Yuni
dc.date.accessioned: 2011-08-16T19:55:03Z
dc.date.available: 2011-08-16T19:55:03Z
dc.date.issued: 2011-08-16
dc.degree.date: 2011 [en_US]
dc.degree.discipline: Computer & Information Science [en]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: M.S. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Text mining is the process of extracting high-quality knowledge from the analysis of textual data. Rapidly growing research interest in many fields is producing an overwhelming amount of research literature. This literature is a vast source of knowledge, but because of its sheer volume it is practically impossible for researchers to extract that knowledge manually. Hence, there is a need for an automated approach to extracting knowledge from unstructured data, and text mining is the right approach for this task. The objective of this thesis is to mine documents from the research literature to find novel associations among the entities appearing in that literature, using Incremental Mining. Traditional text mining approaches provide binary associations, but it is important to understand the context in which these associations occur: for example, entity A is associated with entity B in the context of entity C. These contexts can be viewed as multi-way associations among the entities, which are represented by a hypergraph. This thesis describes how to extract such multi-way associations using Frequent Itemset Mining, and how to apply a new concept called Output Space Sampling to extract them in a space- and time-efficient manner. We incorporated personalization into Output Space Sampling so that users can specify their interests as the frequent hyper-associations are extracted from the text. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/2620
dc.identifier.uri: http://dx.doi.org/10.7912/C2/2288
dc.language.iso: en_US [en_US]
dc.subject: Text Mining [en_US]
dc.subject: PubMed [en_US]
dc.subject: Frequent Itemset Mining [en_US]
dc.subject.lcsh: Data mining [en_US]
dc.subject.lcsh: Hypergraphs [en_US]
dc.title: TEXT MINER FOR HYPERGRAPHS USING OUTPUT SPACE SAMPLING [en_US]
Files
Original bundle
Name: My Thesis Final.pdf
Size: 485.79 KB
Format: Adobe Portable Document Format
Description: thesis