Browsing by Subject "PubMed"

Automatic Export of PubMed Citations to EndNote (http://www.tandfonline.com/doi/full/10.1080/02763861003723317#.UnARCCQ2-CY, 2010-04)
London, Sue; Gurdal, Osman; Gall, Carole
The export of MEDLINE references to EndNote can be accomplished in various ways. Unlike Ovid MEDLINE, PubMed does not have a direct export feature to EndNote. Until recently, PubMed references had to be saved as a text file to import into EndNote. Now, the automatic export of PubMed references can be done using Internet Explorer (IE) or Mozilla Firefox Web browsers. The development and teaching of seamless citation management is a value-added service to health professionals.

A Case Study for Massive Text Mining: K Nearest Neighbor Algorithm on PubMed data (Office of the Vice Chancellor for Research, 2015-04-17)
Do, Nhan; Dundar, Murat
The US National Library of Medicine (NLM) holds a huge collection of millions of books, journals, and other publications relating to the medical domain. NLM created a database called MEDLINE to store and link citations to these publications. This database allows researchers and students to access and find medical articles easily. The public can search MEDLINE using a database called PubMed. When new PubMed documents become available online, curators have to manually decide the labels for them. The process is tedious and time-consuming because there are more than 27,149 descriptors (MeSH terms). Although the curators already use a system called MTI for MeSH term suggestion, its performance needs to be improved. This research explores the use of text classification to annotate new PubMed documents automatically, efficiently, and with reasonable accuracy. The data is gathered from the BioASQ contest and contains 4 million abstracts. The research process includes preprocessing the data, reducing the feature space, classifying, and evaluating the results.
We focus on the K nearest neighbor algorithm in this case study.

Developing and Validating a PubMed Infant Hedge: An MLA Pediatrics Librarians Caucus Initiative (2022-05)
Brennan, Emily; Willis, Christine; Kysh, Lynn; Bogucka, Roxanne; Hinrichs, Rachel J.

Extraction of pharmacokinetic evidence of drug-drug interactions from the literature (PLoS, 2015-05-11)
Kolchinsky, Artemy; Lourenço, Anália; Wu, Heng-Yi; Li, Lang; Rocha, Luis M.; Department of Medical and Molecular Genetics, IU School of Medicine
Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmacoepidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries.
Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1≈0.93, MCC≈0.74, iAUC≈0.99) and sentences (F1≈0.76, MCC≈0.65, iAUC≈0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification. We also found that some drug-related named entity recognition tools and dictionaries led to slight but significant improvements, especially in classification of evidence sentences. Based on our thorough analysis of classifiers and feature transforms and the high classification performance achieved, we demonstrate that literature mining can aid DDI discovery by supporting automatic extraction of specific types of experimental evidence.

Identification and Extraction of Binary, Ternary, Transitive associations and Frequent Patterns from Text Documents in an Interactive Way (Office of the Vice Chancellor for Research, 2013-04-05)
Waranashiwar, Shruti Dilip
As the amount of electronically accessible textual material has been growing exponentially, text mining is a new and exciting research area that tries to solve the information overload problem. It is a promising, automated approach for extracting knowledge from unstructured textual documents. The purpose of this research in the text mining area is to find compact but high-quality associations in neuroscience-related text documents. Here, we try to find the relationships (binary, ternary, and transitive) between terms related to some common disorders in neuroscience, such as alcoholism and schizophrenia, from the PubMed database, using the Vector Space Model (VSM) and an Artificial Neural Network (ANN). We also use Graphviz to visualize these associations.
This research reveals many stronger and weaker associations between the different terms in different comorbidities, which are otherwise difficult to understand by reading articles or journals manually. Once the model is developed, it can be generalized to different terms and can be used to study different combinations of terms and comorbidities. Because the response time of these models is very fast, it will greatly contribute towards speeding up medical research. In this light, extracting associations between keywords could provide very interesting insights into their roles in various diseases and other biological processes. We also try to prove that, instead of mining all frequent patterns, not all of which may be interesting to the user, an interactive method that mines only desired and interesting patterns is a far better approach in terms of resource utilization. We find the compact but high-quality frequent patterns in an interactive way using an MCMC sampling method. In interactive pattern mining, a user gives feedback on whether a pattern is interesting or not. The discovery of interesting associations has applications in many fields, including business decision-making processes, web usage mining, intrusion detection, and bioinformatics.

IDENTIFICATION OF CAUSE AND EFFECT IN CAUSAL SENTENCES OF GERIATRIC CARE DOMAIN USING CONDITIONAL RANDOM (Office of the Vice Chancellor for Research, 2012-04-13)
Mehrabi, Saeed; Krishnan, Anand; Palakal, Mathew
Event extraction is a key step in many text mining applications. Identified events can be used in various applications such as question-answering systems, information extraction, summarization, or building the knowledge base of a clinical decision support system. In this study we used PubMed abstracts from the geriatric care domain that were manually categorized into 42 different subdomains and further divided into causal and non-causal sentences by three domain experts.
There are a total of 19,677 sentences in the collected abstracts from PubMed, out of which 2,856 sentences were selected and manually annotated with cause and effect events. We used conditional random fields (CRFs), statistical algorithms that sequentially tag each word in a sentence as a cause or effect event based on input variables or features. The features used in this study are words, word categories (lowercase, uppercase, mix of letters and digits, etc.), affixes, part of speech, and phrase chunks such as noun or verb phrases. For every word, a window of features before and after the word was also considered. We tested window sizes of one to five, meaning that one to five features before and after each word were included as input variables. The CRF algorithm was trained and tested on a data set with 2,520 sentences in the training set, 252 sentences in the validation set, and 84 sentences in the test set. A window of four features before and after each word had the best performance, with 75.1% accuracy and an F-measure of 85% with 84.6% precision and 87% recall.

Interactive pattern mining of neuroscience data (2014-01-29)
Waranashiwar, Shruti Dilip; Mukhopadhyay, Snehasis; Durresi, Arjan; Xia, Yuni
Text mining is the process of extracting knowledge from unstructured text documents. We have huge volumes of text documents in digital form, and it is impossible to manually extract knowledge from these vast texts. Hence, text mining is used to find useful information in text through the identification and exploration of interesting patterns. The objective of this thesis in the text mining area is to find compact but high-quality frequent patterns in text documents related to the neuroscience field. We try to prove that an interactive sampling algorithm is efficient in terms of time when compared with exhaustive methods like FP Growth using the RapidMiner tool.
Instead of mining all frequent patterns, not all of which may be interesting to the user, an interactive method that mines only desired and interesting patterns is a far better approach in terms of resource utilization. This is especially observed with a large number of keywords. In interactive pattern mining, a user gives feedback on whether a pattern is interesting or not. Using the Markov Chain Monte Carlo (MCMC) sampling method, frequent patterns are generated in an interactive way. The thesis discusses the interactive extraction of patterns between keywords related to some common disorders in neuroscience. The PubMed database and keywords related to schizophrenia and alcoholism are used as inputs. This thesis reveals many associations between the different terms, which are otherwise difficult to understand by reading articles or journals manually. The Graphviz tool is used to visualize associations.

TEXT MINER FOR HYPERGRAPHS USING OUTPUT SPACE SAMPLING (2011-08-16)
Tirupattur, Naveen; Mukhopadhyay, Snehasis; Fang, Shiaofen; Xia, Yuni
Text mining is the process of extracting high-quality knowledge from the analysis of textual data. Rapidly growing interest and focus on research in many fields is resulting in an overwhelming amount of research literature. This literature is a vast source of knowledge, but due to its huge volume, it is practically impossible for researchers to extract that knowledge manually. Hence, there is a need for an automated approach to extracting knowledge from unstructured data, and text mining is the right approach for automated extraction of knowledge from textual data. The objective of this thesis is to mine documents from the research literature to find novel associations among entities appearing in that literature using incremental mining. Traditional text mining approaches provide binary associations, but it is important to understand the context in which these associations occur.
For example, entity A has an association with entity B in the context of entity C. These contexts can be visualized as multi-way associations among the entities, which are represented by a hypergraph. This thesis discusses extracting such multi-way associations among entities using frequent itemset mining, and the application of a new concept called output space sampling to extract them in a space- and time-efficient manner. We incorporated the concept of personalization into output space sampling so that users can specify their interests as the frequent hyper-associations are extracted from the text.

Using the ‘rentrez’ R Package to Identify Repository Records for NCBI LinkOut (code4lib, 2017-10-18)
Lee, Yoo Young; Foster, Erin D.; Polley, David E.; Odell, Jere D.; University Library
In this article, we provide a brief overview of the National Center for Biotechnology Information (NCBI) LinkOut service for institutional repositories, a service that allows links from the PubMed database to full-text versions of articles in participating institutional repositories (IRs). We discuss the criteria for participation in NCBI LinkOut for IRs and current methods for participating, and we outline our solution for automating the identification of eligible articles in a repository using R and the ‘rentrez’ package. Using our solution, we quickly processed 4,400 open access items from our repository, identified the 557 eligible records, and sent them to the NLM. Direct linking from PubMed resulted in a 17% increase in web traffic.