IU Indianapolis ScholarWorks :: Browsing by Subject "Large Language Model"

Deep Learning Based Methods for Automatic Extraction of Syntactic Patterns and their Application for Knowledge Discovery
(2023-12-28) Kabir, Md. Ahsanul; Hasan, Mohammad Al; Mukhopadhyay, Snehasis; Tuceryan, Mihran; Fang, Shiaofen
Semantic pairs, which consist of related entities or concepts, serve as the foundation for comprehending the meaning of language in both written and spoken forms. These pairs make it possible to grasp the nuances of relationships between words, phrases, or ideas, forming the basis for more advanced language tasks like entity recognition, sentiment analysis, machine translation, and question answering. They make it possible to infer causality, identify hierarchies, and connect ideas within a text, ultimately enhancing the depth and accuracy of automated language processing. Nevertheless, extracting semantic pairs from sentences poses a significant challenge, which motivates the use of syntactic dependency patterns (SDPs). Fortunately, semantic relationships tend to follow distinct SDPs when connecting pairs of entities. This makes extracting these SDPs critically important, particularly for specific semantic relationships like hyponym-hypernym, meronym-holonym, and cause-effect associations. The automated extraction of such SDPs carries substantial advantages for various downstream applications, including entity extraction, ontology development, and question answering. Unfortunately, this pivotal facet of pattern extraction has remained relatively overlooked by researchers in the domains of natural language processing (NLP) and information retrieval. To address this gap, I introduce an attention-based supervised deep learning model, ASPER. ASPER is designed to extract SDPs that denote semantic relationships between entities within a given sentential context. I rigorously evaluate the performance of ASPER across three distinct semantic relations: hyponym-hypernym, cause-effect, and meronym-holonym, utilizing six datasets.
My experimental findings demonstrate ASPER's ability to automatically identify an array of SDPs that mirror the presence of these semantic relationships within sentences, outperforming existing pattern extraction methods by a substantial margin. Second, I use the SDPs to extract semantic pairs from sentences, specifically cause-effect entities from medical literature. This task is instrumental in compiling various causality relationships, such as those between diseases and symptoms, medications and side effects, and genes and diseases. Existing solutions excel in sentences where cause and effect phrases are straightforward, such as named entities, single-word nouns, or short noun phrases. However, in the complex landscape of medical literature, cause and effect expressions often extend over several words, which stumps existing methods and results in incomplete extractions that provide low-quality, non-informative, and at times conflicting information. To overcome this challenge, I introduce PatternCausality, an unsupervised method for extracting cause and effect phrases that is tailored explicitly for medical literature. PatternCausality employs a set of cause-effect dependency patterns as templates to identify the key terms within cause and effect phrases. It then utilizes a novel phrase extraction technique to produce comprehensive and meaningful cause and effect expressions from sentences. Experiments conducted on a dataset constructed from PubMed articles reveal that PatternCausality significantly outperforms existing methods, achieving an order-of-magnitude improvement in F-score over the best-performing alternatives. I also develop several PatternCausality variants that use different phrase extraction methods, all of which surpass existing approaches.
PatternCausality and its variants also exhibit notable performance improvements in extracting cause and effect entities on a domain-neutral benchmark dataset, in which cause and effect entities are confined to single-word nouns or noun phrases of one to two words. Nevertheless, PatternCausality operates within an unsupervised framework and relies heavily on SDPs, motivating me to explore the development of a supervised approach. Although SDPs play a pivotal role in semantic relation extraction, pattern-based methodologies remain unsupervised, and the multitude of potential patterns within a language can be overwhelming. Furthermore, patterns do not consistently capture the broader context of a sentence, leading to the extraction of false-positive semantic pairs. As an illustration, consider the hyponym-hypernym pattern "the w of u", which correctly extracts a semantic pair from a phrase like "the village of Aasu" but yields a false positive for the phrase "the moment of impact". The root cause of this limitation lies in the pattern's inability to capture the nuanced meaning of words and phrases in a sentence and their contextual significance. These observations have spurred my exploration of a third model, DepBERT, which is a dependency-aware supervised transformer model. DepBERT's primary contribution lies in introducing the underlying dependency structure of sentences to a language model with the aim of enhancing token classification performance. To achieve this, I must first reframe the task of semantic pair extraction as a token classification problem. The DepBERT model can harness both the tree-like structure of dependency patterns and the masked language architecture of transformers, marking a significant milestone, as most large language models (LLMs) predominantly focus on semantics and word co-occurrence while neglecting the crucial role of dependency architecture. In summary, my overarching contributions in this thesis are threefold.
First, I validate the significance of the dependency architecture within various components of sentences and publish SDPs that incorporate these dependency relationships. Subsequently, I employ these SDPs in a practical medical domain to extract vital cause-effect pairs from sentences. Finally, my third contribution distinguishes this thesis by integrating dependency relations into a deep learning model, enhancing the understanding of language and the extraction of valuable semantic associations.
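The limitation of the "the w of u" pattern mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: real SDPs are defined over dependency parses, whereas this example assumes a plain regular expression over surface text as a stand-in. The names `PATTERN` and `extract_pairs` are hypothetical.

```python
import re

# Assumption: a surface-level approximation of the pattern "the w of u".
# w is read as the hypernym candidate and u as the hyponym candidate.
# Actual SDPs operate on dependency trees, not on raw strings.
PATTERN = re.compile(r"\bthe (?P<w>\w+) of (?P<u>\w+)", re.IGNORECASE)


def extract_pairs(sentence):
    """Return (hyponym, hypernym) candidates matched by the pattern."""
    return [(m.group("u"), m.group("w")) for m in PATTERN.finditer(sentence)]


examples = [
    "We visited the village of Aasu last year.",   # a genuine hyponym-hypernym pair
    "Nobody spoke at the moment of impact.",       # matches, but is a false positive
]
for sentence in examples:
    print(sentence, "->", extract_pairs(sentence))
```

Both sentences match, so the template alone cannot tell the valid pair from the false positive. This is the kind of contextual gap that the abstract gives as the motivation for a supervised, dependency-aware model.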
Large Language Models for Unsupervised Keyphrase Extraction and Biomedical Data Analytics
(2024-08) Ding, Haoran; Luo, Xiao; King, Brian; Zhang, Qingxue; Li, Lingxi
Natural Language Processing (NLP), a vital branch of artificial intelligence, is designed to equip computers with the ability to comprehend and manipulate human language. NLP plays a crucial role in harnessing the vast quantities of textual data generated daily and in extracting meaningful information from them. Among the various techniques, keyphrase extraction stands out due to its ability to distill concise information from extensive texts, making it invaluable for summarizing and navigating content efficiently. Keyphrase extraction usually proceeds in two steps: generating candidate phrases, then ranking them to identify the most relevant ones. Keyphrase extraction can be categorized into supervised and unsupervised approaches. Supervised methods typically achieve higher accuracy as they are trained on labeled data, which allows them to effectively capture and utilize patterns recognized during training. However, the dependency on extensive, well-annotated datasets limits their applicability in scenarios where such data is scarce or costly to obtain. On the other hand, unsupervised methods, while free from the constraints of labeled data, face challenges in capturing deep semantic relationships within text, which can impact their effectiveness. Despite these challenges, unsupervised keyphrase extraction holds significant promise due to its scalability and lower barriers to entry, as it does not require labeled datasets. This approach is increasingly favored for its potential to aid in building extensive knowledge bases from unstructured data, which can be particularly useful in domains where acquiring labeled data is impractical. As a result, unsupervised keyphrase extraction is not only a valuable tool for information retrieval but also a pivotal technology for the ongoing expansion of knowledge-driven applications in NLP.
In this dissertation, we introduce three innovative unsupervised keyphrase extraction methods: AttentionRank, AGRank, and LLMRank. Additionally, we present a method for constructing knowledge graphs from unsupervised keyphrase extraction, leveraging the self-attention mechanism. The first study discusses the AttentionRank model, which utilizes a pre-trained language model to derive underlying importance rankings of candidate phrases through self-attention. This model employs a cross-attention mechanism to assess the semantic relevance between each candidate phrase and the document, enhancing the phrase ranking process. AGRank, detailed in the second study, is a sophisticated graph-based framework that merges deep learning techniques with graph theory. It constructs a candidate phrase graph using mutual attentions from a pre-trained language model. Both global document information and local phrase details are incorporated as enhanced nodes within the graph, and a graph algorithm is applied to rank the candidate phrases. The third study, LLMRank, leverages the strengths of large language models (LLMs) and graph algorithms. It employs LLMs to generate keyphrase candidates and then integrates global information through the text's graphical structures. This process reranks the candidates, significantly improving keyphrase extraction performance. The fourth study explores how self-attention mechanisms can be used to extract keyphrases from medical literature and generate query-related phrase graphs, improving text retrieval visualization. The mutual attentions of medical entities, extracted using a pre-trained model, form the basis of the knowledge graph. This, coupled with a specialized retrieval algorithm, allows for the visualization of long-range connections between medical entities while simultaneously displaying the supporting literature. 
In summary, our exploration of unsupervised keyphrase extraction and biomedical data analysis introduces novel methods and insights in NLP, particularly in information extraction. These contributions are crucial for the efficient processing of large text datasets and suggest avenues for future research and applications.
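The two-step pipeline described in the abstract (generate candidates, then rank them with a graph algorithm) can be sketched as follows. This is a simplified illustration, not AttentionRank, AGRank, or LLMRank. It makes two assumptions: sentence-level co-occurrence counts stand in for the attention-derived edge weights, and a basic PageRank iteration stands in for the graph algorithm. The stopword list and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stopword list used to split phrases.
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "are", "for", "on", "with", "from"}


def candidates_by_sentence(text):
    """Step 1: candidates are maximal runs of non-stopword tokens in each sentence."""
    out = []
    for sentence in re.split(r"[.!?]+", text.lower()):
        phrases, run = [], []
        for tok in re.findall(r"[a-z]+", sentence):
            if tok in STOPWORDS:
                if run:
                    phrases.append(" ".join(run))
                    run = []
            else:
                run.append(tok)
        if run:
            phrases.append(" ".join(run))
        if phrases:
            out.append(phrases)
    return out


def rank_candidates(text, damping=0.85, iterations=50):
    """Step 2: PageRank over a graph linking candidates that share a sentence."""
    weights = defaultdict(lambda: defaultdict(float))
    for phrases in candidates_by_sentence(text):
        unique = sorted(set(phrases))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                # Co-occurrence count as edge weight (stand-in for attention).
                weights[a][b] += 1.0
                weights[b][a] += 1.0
    nodes = sorted(weights)  # candidates with no co-occurring neighbor are dropped
    if not nodes:
        return []
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            incoming = sum(
                score[m] * weights[m][n] / sum(weights[m].values())
                for m in weights[n]
            )
            new_score[n] = (1 - damping) / len(nodes) + damping * incoming
        score = new_score
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


text = (
    "Keyphrase extraction is useful for information retrieval. "
    "Keyphrase extraction is the ranking of candidate phrases. "
    "Graph algorithms are used for the ranking of candidate phrases."
)
for phrase, value in rank_candidates(text):
    print(f"{value:.3f}  {phrase}")
```

Replacing the co-occurrence counts with attention scores from a pre-trained language model would bring the sketch closer in spirit to the graph-based methods in the abstract. The details of that weighting are not given in the abstract and are not reproduced here.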
