Browsing by Subject "Knowledge Graph"
Automatic Extraction of Computer Science Concept Phrases Using a Hybrid Machine Learning Paradigm (2023-05)
Jahin, S M Abrar; Al Hasan, Mohammad; Fang, Shiaofen; Mukhopadhyay, Snehasis

With the proliferation of computer science in modern society in recent years, the number of computer science-related jobs is expanding quickly. Software engineer has been chosen as the best job for 2023 based on pay, stress level, opportunity for professional growth, and balance between work and personal life, according to rankings from various news outlets, journals, and publications. Computer science occupations are anticipated to be in high demand not just in 2023, but also for the foreseeable future. It is not surprising that the number of computer science students at universities is growing and will continue to grow. The enormous increase in student enrollment in many subdisciplines of computing has presented some distinct issues. If computer science is to be incorporated into the K-12 curriculum, it is vital that K-12 educators are competent. But one of the biggest problems with this plan is that there aren't enough trained computer science professors. Numerous new fields and applications, for instance, are being introduced to computer science. In addition, it is difficult for schools to recruit skilled computer science instructors for a variety of reasons, including low salaries. The most effective strategy to address this issue is therefore to draw on the K-12 teachers who are already in the schools, have a love for teaching, and consider teaching a vocation. So, if we want teachers to quickly grasp computer science topics, we need to give them an easy way to learn about computer science. To simplify and expedite the study of computer science, we must acquaint schoolteachers with the terminology associated with computer science concepts so they can know which things they need to learn according to their profile.
If we want to make it easier for schoolteachers to comprehend computer science concepts, it would be ideal to provide them with a tree of words and phrases from which they could determine where the phrases originated and which phrases are connected to them, so that the phrases can be learned effectively. To find a good concept word or phrase, we must first identify concepts and then establish their connections or linkages. As computer science is a fast-developing field, its nomenclature is also expanding at a frenetic rate. Therefore, adding all concepts and terms to the knowledge graph by hand would be a challenging endeavor. Creating a system that automatically adds all computer science domain terms to the knowledge graph would be a straightforward solution to the issue. We have identified knowledge graph use cases for the schoolteacher training program, which motivate the development of a knowledge graph. We have analyzed the knowledge graph's use cases and its ideal characteristics. We have designed a web-based system for adding, editing, and removing words from a knowledge graph. In addition, a term or phrase can be represented with its children list, parent list, and synonym list for enhanced comprehension. We have developed an automated system that can extract computer science concept phrases from any supplied text, thereby enriching the knowledge graph. In short, we have designed the knowledge graph for use in teacher education so that schoolteachers can teach computer science topics to K-12 students effectively.

Large Language Models for Unsupervised Keyphrase Extraction and Biomedical Data Analytics (2024-08)
Ding, Haoran; Luo, Xiao; King, Brian; Zhang, Qingxue; Li, Lingxi

Natural Language Processing (NLP), a vital branch of artificial intelligence, is designed to equip computers with the ability to comprehend and manipulate human language, facilitating the extraction and utilization of textual data.
NLP plays a crucial role in harnessing the vast quantities of textual data generated daily, facilitating meaningful information extraction. Among the various techniques, keyphrase extraction stands out due to its ability to distill concise information from extensive texts, making it invaluable for summarizing and navigating content efficiently. The process of keyphrase extraction usually begins by generating candidates first and then ranking them to identify the most relevant phrases. Keyphrase extraction can be categorized into supervised and unsupervised approaches. Supervised methods typically achieve higher accuracy as they are trained on labeled data, which allows them to effectively capture and utilize patterns recognized during training. However, the dependency on extensive, well-annotated datasets limits their applicability in scenarios where such data is scarce or costly to obtain. On the other hand, unsupervised methods, while free from the constraints of labeled data, face challenges in capturing deep semantic relationships within text, which can impact their effectiveness. Despite these challenges, unsupervised keyphrase extraction holds significant promise due to its scalability and lower barriers to entry, as it does not require labeled datasets. This approach is increasingly favored for its potential to aid in building extensive knowledge bases from unstructured data, which can be particularly useful in domains where acquiring labeled data is impractical. As a result, unsupervised keyphrase extraction is not only a valuable tool for information retrieval but also a pivotal technology for the ongoing expansion of knowledge-driven applications in NLP. In this dissertation, we introduce three innovative unsupervised keyphrase extraction methods: AttentionRank, AGRank, and LLMRank. Additionally, we present a method for constructing knowledge graphs from unsupervised keyphrase extraction, leveraging the self-attention mechanism. 
The first study discusses the AttentionRank model, which utilizes a pre-trained language model to derive underlying importance rankings of candidate phrases through self-attention. This model employs a cross-attention mechanism to assess the semantic relevance between each candidate phrase and the document, enhancing the phrase ranking process. AGRank, detailed in the second study, is a sophisticated graph-based framework that merges deep learning techniques with graph theory. It constructs a candidate phrase graph using mutual attentions from a pre-trained language model. Both global document information and local phrase details are incorporated as enhanced nodes within the graph, and a graph algorithm is applied to rank the candidate phrases. The third study, LLMRank, leverages the strengths of large language models (LLMs) and graph algorithms. It employs LLMs to generate keyphrase candidates and then integrates global information through the text's graphical structures. This process reranks the candidates, significantly improving keyphrase extraction performance. The fourth study explores how self-attention mechanisms can be used to extract keyphrases from medical literature and generate query-related phrase graphs, improving text retrieval visualization. The mutual attentions of medical entities, extracted using a pre-trained model, form the basis of the knowledge graph. This, coupled with a specialized retrieval algorithm, allows for the visualization of long-range connections between medical entities while simultaneously displaying the supporting literature. In summary, our exploration of unsupervised keyphrase extraction and biomedical data analysis introduces novel methods and insights in NLP, particularly in information extraction. These contributions are crucial for the efficient processing of large text datasets and suggest avenues for future research and applications.
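
The general pattern described in the abstract above (generate candidate phrases, connect them in a graph, then rank them with a graph algorithm) can be illustrated with a minimal sketch. This is not AttentionRank, AGRank, or LLMRank: those methods build the graph from language-model attention, whereas this sketch swaps in plain sentence co-occurrence counts and standard PageRank. The function names, the tiny stopword list, and the damping value are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the dissertation).
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "for",
             "is", "are", "on", "with", "from", "by"}


def candidate_phrases(text):
    """Return, per sentence, the maximal runs of non-stopword tokens."""
    sentences = []
    for sentence in re.split(r"[.!?]+", text.lower()):
        phrases, run = [], []
        for token in re.findall(r"[a-z]+", sentence):
            if token in STOPWORDS:
                if run:
                    phrases.append(" ".join(run))
                run = []
            else:
                run.append(token)
        if run:
            phrases.append(" ".join(run))
        if phrases:
            sentences.append(phrases)
    return sentences


def rank_phrases(text, damping=0.85, iterations=50):
    """Rank candidates with PageRank over a sentence co-occurrence graph."""
    # Edge weight = number of sentences in which two candidates co-occur.
    weights = defaultdict(lambda: defaultdict(float))
    nodes = set()
    for phrases in candidate_phrases(text):
        unique = sorted(set(phrases))
        nodes.update(unique)
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                weights[a][b] += 1.0
                weights[b][a] += 1.0
    if not nodes:
        return []

    n = len(nodes)
    scores = {p: 1.0 / n for p in nodes}
    out_weight = {p: sum(weights[p].values()) for p in nodes}

    for _ in range(iterations):
        # Score held by isolated candidates is spread evenly over all nodes.
        dangling = sum(scores[p] for p in nodes if out_weight[p] == 0)
        updated = {}
        for p in nodes:
            incoming = sum(scores[q] * w / out_weight[q]
                           for q, w in weights[p].items())
            updated[p] = (1 - damping) / n + damping * (incoming + dangling / n)
        scores = updated

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = ("Keyphrase extraction is the basis of the knowledge graph. "
              "The knowledge graph is built from keyphrase extraction and attention. "
              "Attention is used for ranking.")
    for phrase, score in rank_phrases(sample):
        print(f"{score:.3f}  {phrase}")
```

Co-occurrence is the simplest stand-in for the edge weights; replacing it with attention-derived or embedding-based similarity would change only how `weights` is filled, while the ranking loop would stay the same.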