Interactive pattern mining of neuroscience data

Waranashiwar, Shruti Dilip

dc.contributor.advisor	Mukhopadhyay, Snehasis
dc.contributor.author	Waranashiwar, Shruti Dilip
dc.contributor.other	Durresi, Arjan
dc.contributor.other	Xia, Yuni
dc.date.accessioned	2014-01-29T16:47:20Z
dc.date.available	2014-01-29T16:47:20Z
dc.date.issued	2014-01-29
dc.degree.date	2013	en_US
dc.degree.grantor	Purdue University	en_US
dc.degree.level	M.S.	en_US
dc.description	Indiana University-Purdue University Indianapolis (IUPUI)	en_US
dc.description.abstract	Text mining is the process of extracting knowledge from unstructured text documents. Huge volumes of text are now available in digital form, and extracting knowledge from them manually is impractical. Text mining is therefore used to find useful information in text by identifying and exploring interesting patterns. The objective of this thesis is to find compact but high-quality frequent patterns in text documents from the field of neuroscience. We aim to show that an interactive sampling algorithm is efficient in running time compared with exhaustive methods such as FP-Growth run in the RapidMiner tool. Rather than mining all frequent patterns, many of which may not interest the user, an interactive method that mines only the desired, interesting patterns makes far better use of resources. This advantage is especially noticeable when the number of keywords is large. In interactive pattern mining, the user gives feedback on whether a pattern is interesting or not. Frequent patterns are generated interactively using a Markov chain Monte Carlo (MCMC) sampling method. The thesis discusses the interactive extraction of patterns among keywords related to some common disorders in neuroscience. The PubMed database and keywords related to schizophrenia and alcoholism are used as inputs. The thesis reveals many associations between terms that are otherwise difficult to discern by reading articles or journals manually. The Graphviz tool is used to visualize the associations.	en_US
dc.identifier.uri	https://hdl.handle.net/1805/3878
dc.identifier.uri	http://dx.doi.org/10.7912/C2/2310
dc.language.iso	en_US	en_US
dc.subject	Data Mining	en_US
dc.subject	Text Mining	en_US
dc.subject	PubMed	en_US
dc.subject.lcsh	Data mining -- Research -- Methodology -- Evaluation	en_US
dc.subject.lcsh	Electronic information resource searching	en_US
dc.subject.lcsh	Graphic methods -- Data processing	en_US
dc.subject.lcsh	Software visualization	en_US
dc.subject.lcsh	User interfaces (Computer systems)	en_US
dc.subject.lcsh	Neuroinformatics -- Data processing	en_US
dc.subject.lcsh	Database searching -- Research -- Methodology -- Evaluation	en_US
dc.subject.lcsh	Markov processes	en_US
dc.subject.lcsh	Monte Carlo method	en_US
dc.subject.lcsh	Statistics -- Data processing	en_US
dc.subject.lcsh	Life sciences literature -- Research	en_US
dc.subject.lcsh	National Institutes of Health (U.S.). PubMed Central -- Research	en_US
dc.subject.lcsh	Schizophrenia -- Data processing	en_US
dc.subject.lcsh	Alcoholism -- Data processing	en_US
dc.title	Interactive pattern mining of neuroscience data	en_US
dc.type	Thesis	en
thesis.degree.discipline	Computer & Information Science	en

Files

Original bundle

Name: MyThesisR1.pdf
Size: 1002.25 KB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.88 KB
Format: Item-specific license agreed upon to submission

Collections

Computer & Information Science Department Theses and Dissertations