Browsing by Author "Xia, Yuni"
Item Aural Mapping of STEM Concepts Using Literature Mining (2013-03-06)
Bharadwaj, Venkatesh; Palakal, Mathew J.; Raje, Rajeev; Xia, Yuni
Recent technological applications have made people's lives heavily dependent on Science, Technology, Engineering, and Mathematics (STEM) and its applications. Understanding basic science is a must in order to use and contribute to this technological revolution. Science education at the middle and high school levels, however, depends heavily on visual representations such as models, diagrams, figures, animations, and presentations. This leaves visually impaired students with very few options to learn science and secure a career in STEM-related areas. Recent experiments have shown that small aural clues called audemes help visually impaired students understand and memorize science concepts. Audemes are non-verbal sound translations of a science concept. To make science concepts available as audemes for visually impaired students, this thesis presents an automatic system for audeme generation from STEM textbooks. This thesis describes the systematic application of multiple Natural Language Processing tools and techniques, such as a dependency parser, a POS tagger, an information retrieval algorithm, semantic mapping of aural words, and machine learning, to transform a science concept into a combination of atomic sounds, thus forming an audeme. We present a rule-based classification method for all STEM-related concepts. This work also presents a novel way of mapping and extracting the most related sounds for the words used in a textbook. Additionally, machine learning methods are used in the system to guarantee that the output is customized according to a user's perception.
The system being presented is robust, scalable, fully automatic, and dynamically adaptable for audeme generation.

Item Automated image classification via unsupervised feature learning by K-means (2015-07-09)
Karimy Dehkordy, Hossein; Dundar, Mehmet Murat; Song, Fengguang; Xia, Yuni
Research on image classification has grown rapidly in the field of machine learning. Many methods have already been implemented for image classification. Among all these methods, the best results have been reported by neural network-based techniques. One of the most important steps in automated image classification is feature extraction. Feature extraction includes two parts: feature construction and feature selection. Many methods for feature extraction exist, but the best ones are related to deep-learning approaches such as network-in-network or deep convolutional network algorithms. Deep learning focuses on levels of abstraction, deriving higher levels of abstraction from the previous level through multiple hidden layers. The two main problems with using deep-learning approaches are the speed and the number of parameters that should be configured. Small changes or poor selection of parameters can alter the results completely or even make them worse. Tuning these parameters is usually impossible for normal users who do not have supercomputers, because one must run the algorithm and then tune the parameters according to the results obtained. Thus, this process can be very time consuming. This thesis attempts to address the speed and configuration issues found with traditional deep-network approaches.
Some of the traditional methods of unsupervised learning are used to build an automated image-classification approach that takes less time both to configure and to run.

Item Biomedical Literature Mining with Transitive Closure and Maximum Network Flow (http://doi.acm.org/10.1145/1851476.1851552, 2011-05-15)
Hoblitzell, Andrew P.; Mukhopadhyay, Snehasis; Xia, Yuni; Fang, Shiafoen
The biological literature is a huge and constantly increasing source of information which biologists may consult for information about their field, but the vast amount of data can sometimes become overwhelming. Medline, which makes a great amount of biological journal data available online, makes the development of automated text mining systems and hence “data-driven discovery” possible. This thesis examines current work in the field of text mining and biological literature, and then aims to mine documents pertaining to bone biology. The documents are retrieved from PubMed, and then direct associations between the terms are computed. Potentially novel transitive associations among biological objects are then discovered using the transitive closure algorithm and the maximum flow algorithm. The thesis discusses in detail the extraction of biological objects from the collected documents and the co-occurrence-based text mining algorithm, the transitive closure algorithm, and the maximum network flow algorithm, which were then run to extract the potentially novel biological associations. Generated hypotheses (novel associations) were assigned significance scores for further validation by an expert bone biologist. Extension of the work into hypergraphs for enhanced meaning and accuracy is also examined in the thesis.

Item Bridging Text Mining and Bayesian Networks (2011-03-09)
Raghuram, Sandeep Mudabail; Xia, Yuni; Palakal, Mathew; Zou, Xukai, 1963-
After the initial network is constructed using an expert’s knowledge of the domain, Bayesian networks need to be updated as and when new data is observed.
Literature mining is a very important source of this new data. In this work, we explore what kind of data needs to be extracted with a view to updating Bayesian networks, which existing technologies can be useful in achieving some of the goals, and what research is required to accomplish the remaining requirements. This thesis specifically deals with utilizing causal associations and experimental results which can be obtained from literature mining. However, these associations and numerical results cannot be directly integrated with the Bayesian network. The source of the literature and the perceived quality of the research need to be factored into the process of integration, just as a human reading the literature would do. This thesis presents a general methodology for updating a Bayesian network with the mined data. This methodology consists of solutions to some of the issues surrounding the task of integrating the causal associations with the Bayesian network and demonstrates the idea with a semiautomated software system.

Item CyberWater: An Open Framework for Data and Model Integration (2024-05)
Chen, Ranran; Liang, Yao; Song, Fengguang; Xia, Yuni; Zheng, Jiangyu
Workflow management systems (WMSs) are commonly used to organize and automate sequences of tasks as workflows to accelerate scientific discoveries. During complex workflow modeling, a local interactive workflow environment is desirable, as users usually rely on their rich, local environments for fast prototyping and refinements before they consider using more powerful computing resources. This dissertation delves into the innovative development of the CyberWater framework based on WMSs. Against the backdrop of data-intensive and complex models, CyberWater exemplifies the transition of intricate data into insightful and actionable knowledge. The dissertation introduces the architecture of CyberWater, particularly focusing on its adaptation and enhancement from the VisTrails system.
It highlights the significance of control and data flow mechanisms and the introduction of new data formats for effective data processing within the CyberWater framework. This study presents an in-depth analysis of the design and implementation of Generic Model Agent Toolkits. The discussion centers on template-based component mechanisms and the integration with popular platforms, while emphasizing the toolkits' ability to facilitate on-demand access to High-Performance Computing (HPC) resources for large-scale data handling. In addition, the development of an asynchronously controlled workflow within CyberWater is explored. This innovative approach enhances computational performance by optimizing pipeline-level parallelism and allows for on-demand submissions of HPC jobs, significantly improving the efficiency of data processing. A comprehensive methodology for model-driven development and Python code integration within the CyberWater framework and innovative applications of GPT models for automated data retrieval are introduced in this research as well. It examines the implementation of Git Actions for system automation in data retrieval processes and discusses the transformation of raw data into a compatible format, enhancing the adaptability and reliability of the data retrieval component in the adaptive generic model agent toolkit component. For the development and maintenance of software within the CyberWater framework, tools such as GitHub are used for version control, and automated processes are outlined for software updates and error reporting. In addition, the discussion of user data collection emphasizes the role of the CyberWater Server in these processes.
In conclusion, this dissertation presents our comprehensive work on the CyberWater framework’s advancements, setting new standards in scientific workflow management and demonstrating how technological innovation can significantly elevate the process of scientific discovery.

Item DCMS: A data analytics and management system for molecular simulation (SpringerOpen, 2014-11-26)
Kumar, Anand; Grupcev, Vladimir; Berrada, Meryem; Fogarty, Joseph C.; Tu, Yi-Cheng; Zhu, Xingquan; Pandit, Sagar A.; Xia, Yuni; Department of Computer and Information Science, School of Science
Molecular Simulation (MS) is a powerful tool for studying physical/chemical features of large systems and has seen applications in many scientific and engineering domains. During the simulation process, the experiments generate a very large number of atoms, and researchers intend to observe their spatial and temporal relationships for scientific analysis. The sheer data volumes and their intensive interactions impose significant challenges for data access, management, and analysis. To date, existing MS software systems fall short on storage and handling of MS data, mainly because of the lack of a platform to support applications that involve intensive data access and analytical processing. In this paper, we present the database-centric molecular simulation (DCMS) system our team developed in the past few years. The main idea behind DCMS is to store MS data in a relational database management system (DBMS) to take advantage of the declarative query interface (i.e., SQL), data access methods, query processing, and optimization mechanisms of modern DBMSs. A unique challenge is to handle the analytical queries that are often compute-intensive. For that, we developed novel indexing and query processing strategies (including algorithms running on modern co-processors) as integrated components of the DBMS. As a result, researchers can upload and analyze their data using efficient functions implemented inside the DBMS.
Index structures are generated to store analysis results that may be interesting to other users, so that the results are readily available without duplicating the analysis. We have developed a prototype of DCMS based on the PostgreSQL system, and experiments using real MS data and workloads show that DCMS significantly outperforms existing MS software systems. We also used it as a platform to test other data management issues such as security and compression.

Item Decision Support System For Geriatric Care (Office of the Vice Chancellor for Research, 2010-04-09)
Palakal, Mathew; Pandit, Yogesh; Jones, Josette; Xia, Yuni; Bandos, Jean; Geesaman, Jerry; Pecenka, Dave; Tinsley, Eric
Geriatrics is a branch of medicine that focuses on the healthcare of the elderly. We propose to build a decision support system for elderly care based on a knowledge base system that incorporates best practices that are reported in the literature. A Bayesian network model is then used for decision support for the geriatric care tool that we develop.

Item Decision Support System for Geriatric Care
Pandit, Yogesh; Palakal, Mathew J.; Jones, Josette; Xia, Yuni; Pecenka, Dave; Bandos, Jean; Tinsley, Eric; Geesaman, Jerry
Geriatrics is a branch of medicine that focuses on the healthcare of the elderly. It is a field that promotes health and aims to prevent and treat diseases and disabilities in older people. Geriatric interventions are published in many different articles and journals.

Item Deep Learning of Biomechanical Dynamics With Spatial Variability Mining and Model Sparsifiation (2024-08)
Liu, Ming; Zhang, Qingxue; King, Brian S.; Ben-Miled, Zina; Xia, Yuni
Deep learning of biomechanical dynamics is of great promise in smart health and data-driven precision medicine. Biomechanical dynamics are related to the movement patterns and gait characteristics of people and may provide important insights if mined by deep learning models.
However, efficient deep learning of biomechanical dynamics is still challenging, considering that there is a high diversity in the dynamics from different body locations, and the deep learning model may need to be lightweight enough to be deployed in real time. Targeting these challenges, we have first conducted studies on the spatial variability of biomechanical dynamics, aiming to evaluate and determine the optimal body location for robust physical activity type detection. Further, we have developed a framework for deep learning pruning, aiming to determine the optimal pruning schemes while maintaining acceptable performance. More specifically, the proposed approach first evaluates the layer importance of the deep learning model, and then leverages the probabilistic distribution-enabled threshold determination to optimize the pruning rate. The weighted random thresholding method is first investigated to further the understanding of the behavior of the pruning action for each layer. Afterwards, Gaussian-based thresholding is designed to more effectively optimize the pruning strategies, which can find fine-grained pruning schemes with both emphasis and diversity regulation. Even further, we have enhanced the efficient deep learning framework to co-optimize the accuracy and the continuity during the pruning process. Here, continuity means that the pruning locations in the weight matrices are encouraged not to leave too many noncontinuous non-pruned locations, thereby making the model friendlier to implement. More specifically, the proposed framework leverages significance scoring and continuity scoring to quantify the characteristics of each of the pruned convolutional filters, then leverages a clustering technique to group the pruned filters for each convolutional stage.
Afterwards, a regularized ranking approach is designed to rank the pruned filters, putting more emphasis on the continuity scores to encourage friendly implementation. In the end, a dual-thresholding strategy is leveraged to increase the diversity in this framework during significance and continuity co-optimization. Experimental results have demonstrated promising findings: an enhanced understanding of the spatial variability of the biomechanical dynamics and selection of the best-performing body location; an effective deep learning model pruning framework that can reduce the model size significantly with performance maintained; and a boosted framework that co-optimizes accuracy and continuity to also consider friendly implementation during the pruning process. Overall, this research will greatly advance deep biomechanical mining towards efficient smart health.

Item Design and Implementation of Web-based Data and Network Management System for Heterogeneous Wireless Sensor Networks (2011-03-09)
Yu, Qun; Liang, Yao; Zou, Xukai; Xia, Yuni
Today, Wireless Sensor Networks (WSNs) are forming an exciting new area with dramatic impacts on science and engineering innovations. New WSN-based technologies, such as body sensor networks in medical and health care and environmental monitoring sensor networks, are emerging. Sensor networks are quickly becoming a flexible, inexpensive, and reliable platform to provide solutions for a wide variety of applications in real-world settings. The increase in the proliferation of sensor networks has paralleled the use of more heterogeneous systems in deployment.
In this thesis, we attempt to develop a new network management and data collection framework for heterogeneous wireless sensor networks, called the Heterogeneous Wireless Sensor Networks Management System (H-WSNMS), which makes it possible to manage and operate various sensor network systems with unified control and management services and interfaces. The H-WSNMS framework aims to provide a scheme to manage, query, and interact with sensor network systems. By introducing the concept of a Virtual Command Set (VCS), a series of unified application interfaces and metadata (XML files) across multiple WSNs are designed to implement the scalability and flexibility of the management functions for heterogeneous wireless sensor networks. This is demonstrated through a series of web-based WSN management applications such as Monitoring, Configuration, Reprogram, Data Collection, and so on. The tests and application trials confirm the feasibility of our approach but also reveal a number of challenges to be taken into account when deploying wireless sensor and actuator networks at industrial sites, which will be considered in our future research work.