Browsing by Author "Zheng, Weijian"
Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements (ACM, 2017-11)
Zheng, Weijian; Song, Fengguang; Lin, Lan; Computer and Information Science, School of Science
The k-means clustering method is one of the most widely used techniques in big data analytics. In this paper, we explore the ideas of software blocking, asynchronous local optimizations, and heuristics of simulated annealing to improve the performance of k-means clustering. Like most machine learning methods, the performance of k-means clustering relies on two main factors: the computing speed (per iteration) and the convergence rate. A straightforward realization of the software-blocking, synchronization-reducing clustering algorithm, however, sporadically converges more slowly than the standard k-means algorithm. To tackle this issue, we design an annealing-enhanced algorithm, which introduces the heuristics of stop conditions and annealing steps to provide performance as good as or better than the standard k-means algorithm. This new enhanced k-means clustering algorithm is able to offer the same clustering quality as the standard k-means. Experiments with real-world datasets show that the new parallel implementation is faster than the open source HPC library of Parallel K-Means Data Clustering (e.g., 19% faster on relatively large datasets with 32 CPU cores, and 11% faster on a large dataset with 1,024 CPU cores). Moreover, the extent to which the program performance improves is largely determined by the actual convergence rate of applying the algorithm to different datasets.

FQL: An Extensible Feature Query Language and Toolkit on Searching Software Characteristics for HPC Applications (Springer, 2020)
Zheng, Weijian; Wang, Dali; Song, Fengguang; Computer and Information Science, School of Science
The amount of large-scale scientific computing software is dramatically increasing.
In this work, we designed a new query language, named Feature Query Language (FQL), to collect and extract HPC-related software features or metadata from a quick static code analysis. We also designed and implemented an FQL-based toolkit to automatically detect and present software features using an extensible query repository. A number of large-scale, high-performance computing (HPC) scientific applications have been studied in the paper with the FQL toolkit to demonstrate the HPC-related feature extraction and information/metadata collection. Unlike existing static software analysis and refactoring tools, which focus on software debugging, development, and code transformation, the FQL toolkit is simpler and significantly more lightweight, and strives to collect diverse software metadata easily and rapidly.

OpenGraphGym: A Parallel Reinforcement Learning Framework for Graph Optimization Problems (Springer, 2020-06-15)
Zheng, Weijian; Wang, Dali; Song, Fengguang; Krzhizhanovskaya, Valeria V.; Závodszky, Gábor; Lees, Michael H.; Dongarra, Jack J.; Sloot, Peter M. A.; Brissos, Sérgio; Teixeira, João; Computer and Information Science, School of Science
This paper presents an open-source, parallel AI environment (named OpenGraphGym) to facilitate the application of reinforcement learning (RL) algorithms to address combinatorial graph optimization problems. This environment incorporates a basic deep reinforcement learning method and several graph embeddings to capture graph features; it also allows users to rapidly plug in and test new RL algorithms and graph embeddings for graph optimization problems. This new open-source RL framework is targeted at achieving both high performance and high quality of the computed graph solutions.
This RL framework forms the foundation of several ongoing research directions, including 1) benchmarking work on different RL algorithms and embedding methods for classic graph problems; 2) advanced parallel strategies for extreme-scale graph computations; and 3) performance evaluation on real-world graph solutions.

suCAQR: A Simplified Communication-Avoiding QR Factorization Solver Using the TBLAS Framework (IEEE, 2016-12)
Zheng, Weijian; Song, Fengguang; Lin, Lan; Chen, Zizhong; Computer and Information Science, School of Science
The goal of this paper is to design and implement a scalable QR factorization solver that can deliver the fastest performance for tall and skinny matrices and square matrices on modern supercomputers. The new solver, named scalable universal communication-avoiding QR factorization (suCAQR), introduces a simplified and tuning-less way to realize the communication-avoiding QR factorization algorithm to support matrices of any shape. The software design includes a mixed usage of physical and logical data layouts, a simplified method of dynamic-root binary-tree reduction, and a dynamic dataflow implementation. Compared with the existing communication-avoiding QR factorization implementations, suCAQR has the benefits of being simpler, more general, and more efficient. By balancing the degree of parallelism and the proportion of faster computational kernels, it is able to achieve scalable performance on clusters of multicore nodes. The software essentially combines the strengths of both the synchronization-reducing and communication-avoiding approaches to achieve high performance.
Based on the experimental results using 1,024 CPU cores, suCAQR is faster than DPLASMA by up to 30%, and faster than ScaLAPACK by up to 30 times.

XScan: An Integrated Tool for Understanding Open Source Community-Based Scientific Code (Springer, 2019)
Zheng, Weijian; Wang, Dali; Song, Fengguang; Computer and Information Science, School of Science
Many scientific communities have adopted community-based models that integrate multiple components to simulate whole system dynamics. The complexity of community software projects, which stems from the integration of multiple individual software components that were developed under different application requirements and for various machine architectures, has become a challenge for effective software system understanding and continuous software development. The paper presents an integrated software toolkit called X-ray Software Scanner (XScan for short) for a better understanding of large-scale community-based scientific codes. Our software tool provides support for quickly summarizing the overall information of scientific codes, including the number of lines of code, programming languages, external library dependencies, as well as architecture-dependent parallel software features. The XScan toolkit also implements a static software analysis component to collect detailed structural information and provides an interactive visualization and analysis of the functions. We use a large-scale community-based Earth System Model to demonstrate the workflow, functions, and visualization of the toolkit. We also discuss the application of advanced graph analytics techniques to assist software modular design and component refactoring.