- Browse by Author
Browsing by Author "Song, Fengguang"
Now showing 1 - 10 of 30
Results Per Page
Sort Options
Item Accelerating complex modeling workflows in CyberWater using on-demand HPC/Cloud resources(IEEE, 2021-09) Li, Feng; Chen, Ranran; Fu, Yuankun; Song, Fengguang; Liang, Yao; Ranawaka, Isuru; Pamidighantam, Sudhakar; Luna, Daniel; Liang, Xu; Computer Information and Graphics Technology, School of Engineering and TechnologyWorkflow management systems (WMSs) are commonly used to organize/automate sequences of tasks as workflows to accelerate scientific discoveries. During complex workflow modeling, a local interactive workflow environment is desirable, as users usually rely on their rich, local environments for fast prototyping and refinements before they consider using more powerful computing resources. However, existing WMSs do not simultaneously support local interactive workflow environments and HPC resources. In this paper, we present an on-demand access mechanism to remote HPC resources from desktop/laptop-based workflow management software to compose, monitor and analyze scientific workflows in the CyberWater project. Cyber-Water is an open-data and open-modeling software framework for environmental and water communities. In this work, we extend the open-model, open-data design of CyberWater with on-demand HPC accessing capacity. In particular, we design and implement the LaunchAgent library, which can be integrated into the local desktop environment to allow on-demand usage of remote resources for hydrology-related workflows. LaunchAgent manages authentication to remote resources, prepares the computationally-intensive or data-intensive tasks as batch jobs, submits jobs to remote resources, and monitors the quality of services for the users. LaunchAgent interacts seamlessly with other existing components in CyberWater, which is now able to provide advantages of both feature-rich desktop software experience and increased computation power through on-demand HPC/Cloud usage. In our evaluations, we demonstrate how a hydrology workflow that consists of both local and remote tasks can be constructed and show that the added on-demand HPC/Cloud usage helps speeding up hydrology workflows while allowing intuitive workflow configurations and execution using a desktop graphical user interface.Item Accelerating Experience Replay for Deep Q-Networks with Reduced Target Computation(CS & IT, 2023) Zigon, Bob; Song, Fengguang; Computer and Information Science, School of ScienceMnih’s seminal deep reinforcement learning paper that applied a Deep Q-network to Atari video games demonstrated the importance of a replay buffer and a target network. Though the pair were required for convergence, the use of the replay buffer came at a significant computational cost. With each new sample generated by the system, the targets in the mini batch buffer were continually recomputed. We propose an alternative that eliminates the target recomputation called TAO-DQN (Target Accelerated Optimization-DQN). Our approach focuses on a new replay buffer algorithm that lowers the computational burden. We implemented this new approach on three experiments involving environments from the OpenAI gym. This resulted in convergence to better policies in fewer episodes and less time. Furthermore, we offer a mathematical justification for our improved convergence rate.Item An Algorithm for Forward Reduction in Sequence-Based Software Specification Read More: http://www.worldscientific.com/doi/abs/10.1142/S0218194016400118(World Scientific, 2016-11) Lin, Lan; Xue, Yufeng; Song, Fengguang; Computer and Information Science, School of ScienceSequence-based software specification is a rigorous method for deriving a formal system model based on informal requirements, through a systematic process called sequence enumeration. Under this process, stimulus (input) sequences are considered in a breadth-first manner, with the expected system response to each sequence given. Not every sequence needs to be further extended by the enumeration rules. The completed specification encodes a Mealy machine and forms a basis for other activities including code development and testing. This paper presents a forward reduction algorithm for sequence-based specification. The need for such an algorithm has been identified by field applications. We used the state machine as an intermediate tool to comprehend and analyze all change impacts resulted from a forward reduction, and used an axiom system for its development. We present the algorithm both mathematically in functional form and procedurally in pseudocode, illustrate it with a symbolic example, and report a larger case study from the published literature in which the algorithm is applied. The algorithm will prove useful and effective in deriving a system-level specification as well as in merging and combining partial work products towards a formal system model in field applications.Item Automated image classification via unsupervised feature learning by K-means(2015-07-09) Karimy Dehkordy, Hossein; Dundar, Mehmet Murat; Song, Fengguang; Xia, YuniResearch on image classification has grown rapidly in the field of machine learning. Many methods have already been implemented for image classification. Among all these methods, best results have been reported by neural network-based techniques. One of the most important steps in automated image classification is feature extraction. Feature extraction includes two parts: feature construction and feature selection. Many methods for feature extraction exist, but the best ones are related to deep-learning approaches such as network-in-network or deep convolutional network algorithms. Deep learning tries to focus on the level of abstraction and find higher levels of abstraction from the previous level by having multiple layers of hidden layers. The two main problems with using deep-learning approaches are the speed and the number of parameters that should be configured. Small changes or poor selection of parameters can alter the results completely or even make them worse. Tuning these parameters is usually impossible for normal users who do not have super computers because one should run the algorithm and try to tune the parameters according to the results obtained. Thus, this process can be very time consuming. This thesis attempts to address the speed and configuration issues found with traditional deep-network approaches. Some of the traditional methods of unsupervised learning are used to build an automated image-classification approach that takes less time both to configure and to run.Item Building a scientific workflow framework to enable real‐time machine learning and visualization(Wiley, 2019-08) Li, Feng; Song, Fengguang; Computer and Information Science, School of ScienceNowadays, we have entered the era of big data. In the area of high performance computing, large‐scale simulations can generate huge amounts of data with potentially critical information. However, these data are usually saved in intermediate files and are not instantly visible until advanced data analytics techniques are applied after reading all simulation data from persistent storages (eg, local disks or a parallel file system). This approach puts users in a situation where they spend long time on waiting for running simulations while not knowing the status of the running job. In this paper, we build a new computational framework to couple scientific simulations with multi‐step machine learning processes and in‐situ data visualizations. We also design a new scalable simulation‐time clustering algorithm to automatically detect fluid flow anomalies. This computational framework is built upon different software components and provides plug‐in data analysis and visualization functions over complex scientific workflows. With this advanced framework, users can monitor and get real‐time notifications of special patterns or anomalies from ongoing extreme‐scale turbulent flow simulations.Item Correcting soft errors online in fast fourier transform(ACM, 2017) Liang, Xin; Chen, Jieyang; Tao, Dingwen; Li, Sihuan; Wu, Panruo; Li, Hongbo; Ouyang, Kaiming; Liu, Yuanlai; Song, Fengguang; Chen, Zizhong; Computer and Information Science, School of ScienceWhile many algorithm-based fault tolerance (ABFT) schemes have been proposed to detect soft errors offline in the fast Fourier transform (FFT) after computation finishes, none of the existing ABFT schemes detect soft errors online before the computation finishes. This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner. We also extend our scheme to tolerate both arithmetic errors and memory errors, develop strategies to reduce its fault tolerance overhead and improve its numerical stability and fault coverage, and finally incorporate it into the widely used FFTW library - one of the today's fastest FFT software implementations. Experimental results demonstrate that: (1) the proposed online ABFT scheme introduces much lower overhead than the existing offline ABFT schemes; (2) it detects errors in a much more timely manner; and (3) it also has higher numerical stability and better fault coverage.Item CyberWater: An Open Framework for Data and Model Integration(2024-05) Chen, Ranran; Liang, Yao; Song, Fengguang; Xia, Yuni; Zheng, JiangyuWorkflow management systems (WMSs) are commonly used to organize/automate sequences of tasks as workflows to accelerate scientific discoveries. During complex workflow modeling, a local interactive workflow environment is desirable, as users usually rely on their rich, local environments for fast prototyping and refinements before they consider using more powerful computing resources. This dissertation delves into the innovative development of the CyberWater framework based on Workflow Management Systems (WMSs). Against the backdrop of data-intensive and complex models, CyberWater exemplifies the transition of intricate data into insightful and actionable knowledge and introduces the nuanced architecture of CyberWater, particularly focusing on its adaptation and enhancement from the VisTrails system. It highlights the significance of control and data flow mechanisms and the introduction of new data formats for effective data processing within the CyberWater framework. This study presents an in-depth analysis of the design and implementation of Generic Model Agent Toolkits. The discussion centers on template-based component mechanisms and the integration with popular platforms, while emphasizing the toolkits ability to facilitate on-demand access to High-Performance Computing resources for large-scale data handling. Besides, the development of an asynchronously controlled workflow within CyberWater is also explored. This innovative approach enhances computational performance by optimizing pipeline-level parallelism and allows for on-demand submissions of HPC jobs, significantly improving the efficiency of data processing. A comprehensive methodology for model-driven development and Python code integration within the CyberWater framework and innovative applications of GPT models for automated data retrieval are introduced in this research as well. It examines the implementation of Git Actions for system automation in data retrieval processes and discusses the transformation of raw data into a compatible format, enhancing the adaptability and reliability of the data retrieval component in the adaptive generic model agent toolkit component. For the development and maintenance of software within the CyberWater framework, the use of tools like GitHub for version control and outlining automated processes has been applied for software updates and error reporting. Except that, the user data collection also emphasizes the role of the CyberWater Server in these processes. In conclusion, this dissertation presents our comprehensive work on the CyberWater framework’s advancements, setting new standards in scientific workflow management and demonstrating how technological innovation can significantly elevate the process of scientific discovery.Item CyberWater: An Open Framework for Data and Model Integration in Water Science and Engineering(ACM, 2022-10-17) Chen, Ranran; Li, Feng; Bieger, Drew; Song, Fengguang; Liang, Yao; Luna, Daniel; Young, Ryan; Liang, Xu; Pamidighantam, Sudhakar; Computer and Information Science, School of ScienceThe CyberWater project is to build an open-data open-model framework for easy and incremental integration of heterogeneous data sources and diverse scientific models across disciplines in the broad water domain. The CyberWater framework extends the open-data open-model framework called Meta-Scientific-Modeling (MSM) that provides a system-wide data and model integration platform. On top of MSM, the CyberWater framework provides a set of toolkits, and external system integration engines, to further facilitate users' scientific modeling and collaboration across disciplines. For example, the developed generic model agent toolkit enables users to integrate their computational models into CyberWater via graphical user interface configuration without coding, which further simplifies the data and model integration and model coupling. CyberWater adopts a graphical scientific workflow system, VisTrails, ensuring data provenance and reproducible computing. CyberWater supports novel access to high-performance computing resources on demand for users' computational expensive model tasks. We demonstrate merits of CyberWater by a use case of hydrologic modeling workflow.Item Designing a Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems(IEEE, 2018-09) Fu, Yuankun; Li, Feng; Song, Fengguang; Zhu, Luoding; Computer and Information Science, School of ScienceLattice Boltzmann method (LBM) is an important computational fluid dynamics (CFD) approach to solving the Naiver-Stokes equations and simulating complex fluid flows. LBM is also well known as a memory bound problem and its performance is limited by the memory access time on modern computer systems. In this paper, we design and develop both sequential and parallel memory-aware algorithms to optimize the performance of LBM. The new memory-aware algorithms can enhance data reuses across multiple time steps to further improve the performance of the original and fused LBM. We theoretically analyze the algorithms to provide an insight into how data reuses occur in each algorithm. Finally, we conduct experiments and detailed performance analysis on two different manycore systems. Based on the experimental results, the parallel memory-aware LBM algorithm can outperform the fused LBM by up to 292% on the Intel Haswell system when using 28 cores, and by 302 % on the Intel Skylake system when using 48 cores.Item Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements(ACM, 2017-11) Zheng, Weijian; Song, Fengguang; Lin, Lan; Computer and Information Science, School of ScienceThe k-means clustering method is one of the most widely used techniques in big data analytics. In this paper, we explore the ideas of software blocking, asynchronous local optimizations, and heuristics of simulated annealing to improve the performance of k-means clustering. Like most of the machine learning methods, the performance of k-means clustering relies on two main factors: the computing speed (per iteration), and the convergence rate. A straightforward realization of the software-blocking synchronization-reducing clustering algorithm, however, sees sporadic slower convergence rate than the standard k-means algorithm. To tackle the issues, we design an annealing-enhanced algorithm, which introduces the heuristics of stop conditions and annealing steps to provide as good or better performance than the standard k-means algorithm. This new enhanced k-means clustering algorithm is able to offer the same clustering quality as the standard k-means. Experiments with real-world datasets show that the new parallel implementation is faster than the open source HPC library of Parallel K-Means Data Clustering (e.g., 19% faster on relatively large datasets with 32 CPU cores, and 11% faster on a large dataset with 1,024 CPU cores). Moreover, the extent to which the program performance improves is largely determined by the actual convergence rate of applying the algorithm to different datasets.
- «
- 1 (current)
- 2
- 3
- »