Interactive pattern mining of neuroscience data

dc.contributor.advisor: Mukhopadhyay, Snehasis
dc.contributor.author: Waranashiwar, Shruti Dilip
dc.contributor.other: Durresi, Arjan
dc.contributor.other: Xia, Yuni
dc.date.accessioned: 2014-01-29T16:47:20Z
dc.date.available: 2014-01-29T16:47:20Z
dc.date.issued: 2014-01-29
dc.degree.date: 2013 [en_US]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: M.S. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Text mining is the process of extracting knowledge from unstructured text documents. Huge volumes of text are now available in digital form, and it is impossible to extract knowledge from them manually. Text mining is therefore used to find useful information in text by identifying and exploring interesting patterns. The objective of this thesis is to find compact but high-quality frequent patterns in text documents from the field of neuroscience. We aim to show that an interactive sampling algorithm is more time-efficient than exhaustive methods such as FP-Growth, run with the RapidMiner tool. Rather than mining all frequent patterns, not all of which may interest the user, an interactive method that mines only the desired and interesting patterns makes far better use of resources. The advantage is especially pronounced when the number of keywords is large. In interactive pattern mining, the user gives feedback on whether each pattern is interesting. Frequent patterns are generated interactively using a Markov chain Monte Carlo (MCMC) sampling method. The thesis discusses the interactive extraction of patterns among keywords related to some common disorders in neuroscience. The PubMed database and keywords related to schizophrenia and alcoholism are used as inputs. The thesis reveals many associations between terms that would otherwise be difficult to discover by reading articles or journals manually. The Graphviz tool is used to visualize the associations. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/3878
dc.identifier.uri: http://dx.doi.org/10.7912/C2/2310
dc.language.iso: en_US [en_US]
dc.subject: Data Mining [en_US]
dc.subject: Text Mining [en_US]
dc.subject: PubMed [en_US]
dc.subject.lcsh: Data mining -- Research -- Methodology -- Evaluation [en_US]
dc.subject.lcsh: Electronic information resource searching [en_US]
dc.subject.lcsh: Graphic methods -- Data processing [en_US]
dc.subject.lcsh: Software visualization [en_US]
dc.subject.lcsh: User interfaces (Computer systems) [en_US]
dc.subject.lcsh: Neuroinformatics -- Data processing [en_US]
dc.subject.lcsh: Database searching -- Research -- Methodology -- Evaluation [en_US]
dc.subject.lcsh: Markov processes [en_US]
dc.subject.lcsh: Monte Carlo method [en_US]
dc.subject.lcsh: Statistics -- Data processing [en_US]
dc.subject.lcsh: Life sciences literature -- Research [en_US]
dc.subject.lcsh: National Institutes of Health (U.S.). PubMed Central -- Research [en_US]
dc.subject.lcsh: Schizophrenia -- Data processing [en_US]
dc.subject.lcsh: Alcoholism -- Data processing [en_US]
dc.title: Interactive pattern mining of neuroscience data [en_US]
dc.type: Thesis [en]
thesis.degree.discipline: Computer & Information Science [en]
Files
Original bundle
Name: MyThesisR1.pdf
Size: 1002.25 KB
Format: Adobe Portable Document Format
Description: Thesis