TY - GEN
T1 - Effective dynamic load balance using space-filling curves for large-scale SPH simulations on GPU-rich supercomputers
AU - Tsuzuki, Satori
AU - Aoki, Takayuki
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/1/30
Y1 - 2017/1/30
N2 - Billions of particles are required to describe fluid dynamics by using smoothed particle hydrodynamics (SPH), which computes short-range interactions among particles. In this study, we develop a novel code for large-scale SPH simulations on a multi-GPU platform by using the domain decomposition technique. The computational load of each decomposed domain is dynamically balanced by applying domain re-decomposition, which maintains the same number of particles in each decomposed domain. The performance scalability of the SPH simulation is examined on the GPUs of a TSUBAME 2.5 supercomputer by using two different techniques of dynamic load balance: the slice-grid method and the hierarchical domain decomposition method using the space-filling curve. The weak and strong scalabilities of a test case using 111 million particles are measured with 512 GPUs. In comparison with the slice-grid method, the performance keeps improving in proportion to the number of GPUs in the case of the space-filling curve. The Hilbert curve and the Peano curve show better performance scalabilities than the Morton curve in proportion to the increase in the number of GPUs.
AB - Billions of particles are required to describe fluid dynamics by using smoothed particle hydrodynamics (SPH), which computes short-range interactions among particles. In this study, we develop a novel code for large-scale SPH simulations on a multi-GPU platform by using the domain decomposition technique. The computational load of each decomposed domain is dynamically balanced by applying domain re-decomposition, which maintains the same number of particles in each decomposed domain. The performance scalability of the SPH simulation is examined on the GPUs of a TSUBAME 2.5 supercomputer by using two different techniques of dynamic load balance: the slice-grid method and the hierarchical domain decomposition method using the space-filling curve. The weak and strong scalabilities of a test case using 111 million particles are measured with 512 GPUs. In comparison with the slice-grid method, the performance keeps improving in proportion to the number of GPUs in the case of the space-filling curve. The Hilbert curve and the Peano curve show better performance scalabilities than the Morton curve in proportion to the increase in the number of GPUs.
KW - Dynamic Load Balance
KW - Multi-GPU Computing
KW - Smoothed Particle Hydrodynamics
KW - Space-Filling Curve
UR - http://www.scopus.com/inward/record.url?scp=85015194950&partnerID=8YFLogxK
U2 - 10.1109/ScalA.2016.005
DO - 10.1109/ScalA.2016.005
M3 - Conference contribution
AN - SCOPUS:85015194950
T3 - Proceedings of ScalA 2016: 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 1
EP - 8
BT - Proceedings of ScalA 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2016
Y2 - 13 November 2016 through 18 November 2016
ER -