Identification of Patients with Family History of Pancreatic Cancer - Investigation of an NLP System Portability

dc.contributor.author: Mehrabi, Saeed
dc.contributor.author: Krishnan, Anand
dc.contributor.author: Roch, Alexandra M.
dc.contributor.author: Schmidt, Heidi
dc.contributor.author: Li, DingCheng
dc.contributor.author: Kesterson, Joe
dc.contributor.author: Beesley, Chris
dc.contributor.author: Dexter, Paul
dc.contributor.author: Schmidt, Max
dc.contributor.author: Palakal, Mathew
dc.contributor.author: Liu, Hongfang
dc.contributor.department: Department of BioHealth Informatics, School of Informatics and Computing [en_US]
dc.date.accessioned: 2016-07-20T16:59:24Z
dc.date.available: 2016-07-20T16:59:24Z
dc.date.issued: 2015
dc.description.abstract: In this study we developed a rule-based natural language processing (NLP) system to identify patients with a family history of pancreatic cancer. The algorithm was developed in an Unstructured Information Management Architecture (UIMA) framework and consisted of section segmentation, relation discovery, and negation detection. The system was evaluated on data from two institutions. The family history identification precision was consistent across the institutions, shifting from 88.9% on the Indiana University (IU) dataset to 87.8% on the Mayo Clinic dataset. Customizing the algorithm on the Mayo Clinic data increased its precision to 88.1%. The family member relation discovery achieved precision, recall, and F-measure of 75.3%, 91.6%, and 82.6%, respectively. Negation detection resulted in a precision of 99.1%. The results show that rule-based NLP approaches for specific information extraction tasks are portable across institutions; however, customization of the algorithm on the new dataset improves its performance. [en_US]
dc.eprint.version: Final published version [en_US]
dc.identifier.citation: Mehrabi, S., Krishnan, A., Roch, A. M., Schmidt, H., Li, D., Kesterson, J., ... & Liu, H. (2015). Identification of Patients with Family History of Pancreatic Cancer-Investigation of an NLP System Portability. Studies in health technology and informatics, 216, 604-608. [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/10427
dc.language.iso: en [en_US]
dc.publisher: IOS [en_US]
dc.relation.isversionof: 10.3233/978-1-61499-564-7-604 [en_US]
dc.relation.journal: Studies in health technology and informatics [en_US]
dc.rights: Attribution-NonCommercial 3.0 United States
dc.rights.uri: https://creativecommons.org/licenses/by-nc/3.0/us
dc.source: Author [en_US]
dc.subject: natural language processing [en_US]
dc.subject: unstructured information management architecture [en_US]
dc.title: Identification of Patients with Family History of Pancreatic Cancer - Investigation of an NLP System Portability [en_US]
dc.type: Article [en_US]
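
The abstract names three rule-based stages: section segmentation, relation discovery, and negation detection. The toy Python sketch below illustrates how such stages could fit together. It is not the authors' UIMA implementation; every pattern, header name, and function name here (`SECTION_HEADER`, `RELATION`, `NEGATION`, `family_history_section`, `find_relations`) is a hypothetical stand-in, since the record does not give the actual rules.

```python
import re

# All rules below are illustrative assumptions, not the rules from the paper.
SECTION_HEADER = re.compile(
    r"^(family history|past medical history|social history)\s*:", re.I
)
RELATIVES = r"\b(grandmother|grandfather|mother|father|sister|brother|aunt|uncle)"
CONDITION = r"pancreatic (cancer|carcinoma)"
RELATION = re.compile(RELATIVES + r"\b[^.]*?\b" + CONDITION, re.I)
NEGATION = re.compile(r"\b(no|denies|negative for|without)\b", re.I)


def family_history_section(note):
    """Section segmentation: keep only text under a 'Family History' header."""
    kept, keep = [], False
    for raw in note.splitlines():
        line = raw.strip()
        header = SECTION_HEADER.match(line)
        if header:
            # A new header either opens or closes the family history section.
            keep = header.group(1).lower() == "family history"
            line = line[header.end():].strip()
        if keep and line:
            kept.append(line)
    return " ".join(kept)


def find_relations(note):
    """Relation discovery plus negation detection, one sentence at a time.

    Returns (family_member, negated) pairs for mentions of the condition.
    """
    results = []
    for sentence in re.split(r"(?<=[.;])\s+", family_history_section(note)):
        match = RELATION.search(sentence)
        if match:
            # Treat a negation cue anywhere before the relative as negating it.
            negated = bool(NEGATION.search(sentence[:match.start()]))
            results.append((match.group(1).lower(), negated))
    return results


note = (
    "Past Medical History: Hypertension.\n"
    "Family History: Mother had pancreatic cancer at age 60. "
    "No brother with pancreatic cancer.\n"
    "Social History: Denies tobacco use."
)
print(find_relations(note))
```

Restricting the search to the family history section keeps cues such as "Denies" in other sections from affecting the result, which is one plausible reason to run segmentation before the other two stages.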
Files
Original bundle
Name: mehrabi_2015_identification.pdf
Size: 193.55 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.88 KB
Format: Item-specific license agreed upon to submission