Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up
dc.contributor.author | Fu, Yuankun | |
dc.contributor.author | Li, Feng | |
dc.contributor.author | Song, Fengguang | |
dc.contributor.author | Chen, Zizhong | |
dc.contributor.department | Computer and Information Science, School of Science | en_US |
dc.date.accessioned | 2019-05-16T18:17:29Z | |
dc.date.available | 2019-05-16T18:17:29Z | |
dc.date.issued | 2018-06 | |
dc.description.abstract | This paper targets an important class of applications that requires combining HPC simulations with data analysis for online or real-time scientific discovery. We use state-of-the-art parallel-I/O and data-staging libraries to build simulation-time data analysis workflows, and conduct performance analysis with real-world applications of computational fluid dynamics (CFD) simulations and molecular dynamics (MD) simulations. Driven by an in-depth analysis of performance inefficiencies, we design an end-to-end application-level approach to eliminate the interlocks and synchronizations present in existing methods. Our new approach employs both task parallelism and pipeline parallelism to reduce synchronizations effectively. In addition, we design a fully asynchronous, fine-grained, pipelined runtime system named Zipper. Zipper is a multi-threaded distributed runtime system that executes in a layer below the simulation and analysis applications. To further reduce the simulation application's stall time and improve data transfer performance, we design a concurrent data transfer optimization that uses both the HPC network and the parallel file system for improved bandwidth. The scalability of the Zipper system has been verified by a performance model and various large-scale empirical experiments. The experimental results on an Intel multicore cluster as well as a Knights Landing HPC system demonstrate that the Zipper-based approach can outperform the fastest state-of-the-art I/O transport library by up to 220% using 13,056 processor cores. | en_US |
dc.eprint.version | Author's manuscript | en_US |
dc.identifier.citation | Fu, Y., Li, F., Song, F., & Chen, Z. (2018). Performance Analysis and Optimization of In-situ Integration of Simulation with Data Analysis: Zipping Applications Up. In Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing (pp. 192–205). New York, NY, USA: ACM. https://doi.org/10.1145/3208040.3208049 | en_US |
dc.identifier.uri | https://hdl.handle.net/1805/19328 | |
dc.language.iso | en | en_US |
dc.publisher | ACM | en_US |
dc.relation.isversionof | 10.1145/3208040.3208049 | en_US |
dc.relation.journal | Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing | en_US |
dc.rights | Publisher Policy | en_US |
dc.source | Author | en_US |
dc.subject | high performance computing | en_US |
dc.subject | performance analysis and optimization | en_US |
dc.subject | in-situ/in-transit workflows | en_US |
dc.title | Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up | en_US |
dc.type | Article | en_US |