Analyzing Patterns of Literature-Based Phenotyping Definitions for Text Mining Applications
Abstract
Phenotyping definitions are widely used in observational studies that draw on population data from Electronic Health Records (EHRs). Biomedical text mining supports biomedical knowledge discovery, so we believe that mining phenotyping definitions from the literature can support EHR-based clinical research. However, the information about these definitions presented in the literature is inconsistent, diverse, and poorly characterized, particularly with respect to text mining. We therefore aim to analyze patterns of phenotyping definitions as a first step toward developing a text mining application to improve phenotype definition. A random set of observational studies was used for this analysis. Term frequency-inverse document frequency (TF-IDF) and term frequency (TF) were used to rank the terms in the 3958 sentences. Finally, we present preliminary results from analyzing patterns in phenotyping definitions.
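As an illustration of the ranking step, the following is a minimal sketch of how terms in a set of sentences might be ranked by TF and by TF-IDF. It is not the study's implementation: the example sentences, the whitespace tokenizer, the choice to treat each sentence as a document, the unsmoothed IDF formula log(N / df), and the function names `rank_by_tf` and `rank_by_tfidf` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical example sentences; NOT taken from the study's corpus.
SENTENCES = [
    "patients were identified using icd-9 diagnosis codes",
    "cases were defined by two or more diagnosis codes",
    "controls were patients without any diabetes medication",
]


def tokenize(sentence):
    # Assumption: simple lowercase whitespace tokenization.
    return sentence.lower().split()


def rank_by_tf(sentences):
    """Rank terms by raw term frequency summed over all sentences."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    # Sort by descending count, then alphabetically for stable ties.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def rank_by_tfidf(sentences):
    """Rank terms by TF-IDF summed over sentences.

    Assumption: each sentence is one document and
    idf(term) = log(N / df(term)), with no smoothing.
    """
    docs = [Counter(tokenize(s)) for s in sentences]
    n_docs = len(docs)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for doc in docs for term in doc)
    scores = defaultdict(float)
    for doc in docs:
        for term, tf in doc.items():
            scores[term] += tf * math.log(n_docs / df[term])
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


tf_ranking = rank_by_tf(SENTENCES)
tfidf_ranking = rank_by_tfidf(SENTENCES)
```

In this toy input, a word that occurs in every sentence (such as "were") ranks first by TF but receives a TF-IDF score of zero, which shows why the two measures can surface different terms.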