Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning

dc.contributor.author: Goli, Rohan
dc.contributor.author: Hubig, Nina
dc.contributor.author: Min, Hua
dc.contributor.author: Gong, Yang
dc.contributor.author: Sittig, Dean F.
dc.contributor.author: Rennert, Lior
dc.contributor.author: Robinson, David
dc.contributor.author: Biondich, Paul
dc.contributor.author: Wright, Adam
dc.contributor.author: Nøhr, Christian
dc.contributor.author: Law, Timothy
dc.contributor.author: Faxvaag, Arild
dc.contributor.author: Weaver, Aneesa
dc.contributor.author: Gimbel, Ronald
dc.contributor.author: Jing, Xia
dc.contributor.department: Pediatrics, School of Medicine
dc.date.accessioned: 2024-01-25T11:07:18Z
dc.date.available: 2024-01-25T11:07:18Z
dc.date.issued: 2023-05-26
dc.description.abstract: Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying keyphrases (KPs) from the existing literature. However, KP identification for data labeling requires human expertise, consensus, and contextual understanding. This paper presents a semi-supervised KP identification framework that uses minimal labeled data and is based on hierarchical attention over the documents and domain adaptation. Our method outperforms prior neural architectures by learning through synthetic labels for initial training, document-level contextual learning, language modeling, and fine-tuning with limited gold-standard labeled data. To the best of our knowledge, this is the first functional framework for the CDSS sub-domain to identify KPs that is trained on limited labeled data. It contributes to general natural language processing (NLP) architectures in areas such as clinical NLP, where manual data labeling is challenging and lightweight deep learning models play a role in real-time KP identification as a complementary approach to human experts' effort.
dc.eprint.version: Pre-Print
dc.identifier.citation: Goli R, Hubig N, Min H, et al. Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning. Preprint. medRxiv. 2023;2023.01.26.23285060. Published 2023 May 26. doi:10.1101/2023.01.26.23285060
dc.identifier.uri: https://hdl.handle.net/1805/38181
dc.language.iso: en_US
dc.publisher: medRxiv
dc.relation.isversionof: 10.1101/2023.01.26.23285060
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: PMC
dc.subject: Clinical Decision Support System
dc.subject: Domain adaptation
dc.subject: Hierarchical context
dc.subject: Minimal labeled data
dc.subject: Natural language processing
dc.subject: Semi-supervised learning
dc.title: Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
dc.type: Article
Files
Original bundle
Name: nihpp-2023.01.26.23285060v2.pdf
Size: 3.53 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission