Tree cutting approach for domain partitioning on forest-of-octrees-based block-structured static adaptive mesh refinement with lattice Boltzmann method

Yuta Hasegawa, Takayuki Aoki, Hiromichi Kobayashi, Yasuhiro Idomura, Naoyuki Onodera

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

An aerodynamics simulation code based on the lattice Boltzmann method (LBM) with forest-of-octrees-based block-structured adaptive mesh refinement (AMR) and temporary-fixed refinement was implemented, and its performance was evaluated on GPU-based supercomputers. Although the space-filling-curve-based (SFC) domain partitioning algorithm for octree-based AMR has been widely used on conventional CPU-based supercomputers, accelerated computation on GPU-based supercomputers revealed a bottleneck caused by costly halo data communication. Our new tree cutting approach adopts a hybrid domain partitioning that combines a coarse structured block decomposition with SFC partitioning inside each block. This hybrid approach improved the locality and topology of the partitioned sub-domains and reduced the amount of halo communication to one-third of that of the original SFC approach. In the strong scaling test, the code achieved a maximum speedup of ×1.82, reaching 2207 MLUPS (mega-lattice updates per second) on 128 GPUs (NVIDIA® Tesla® V100). In the weak scaling test, the code achieved 9620 MLUPS on 128 GPUs with 4.473 billion grid points while maintaining a parallel efficiency of 93.4% from 8 to 128 GPUs.
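
The hybrid partitioning described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a cubic domain of uniform leaf blocks (no refinement levels), uses a Morton (Z-order) curve as the SFC, and splits by leaf count instead of a measured workload. The names `morton3d`, `split_evenly`, `hybrid_partition`, and all parameters are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def morton3d(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into a Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def split_evenly(items, parts):
    """Cut an ordered list into `parts` contiguous chunks of near-equal length."""
    n = len(items)
    return [items[n * p // parts: n * (p + 1) // parts] for p in range(parts)]


def hybrid_partition(leaves, domain_size, coarse_blocks, ranks_per_block):
    """Assign each leaf (x, y, z) to a rank.

    Step 1: structured decomposition of the domain into coarse blocks.
    Step 2: SFC (Morton) ordering and an even split inside each coarse block.
    Assumption: domain_size is divisible by every entry of coarse_blocks.
    """
    bx, by, bz = coarse_blocks
    sx, sy, sz = domain_size // bx, domain_size // by, domain_size // bz

    # Step 1: bin the leaves by the coarse block that contains them.
    groups = defaultdict(list)
    for (x, y, z) in leaves:
        groups[(x // sx, y // sy, z // sz)].append((x, y, z))

    # Step 2: order each bin along the SFC and cut it into contiguous chunks.
    owner = {}
    for cx in range(bx):
        for cy in range(by):
            for cz in range(bz):
                block_id = (cx * by + cy) * bz + cz
                ordered = sorted(groups[(cx, cy, cz)],
                                 key=lambda p: morton3d(*p))
                chunks = split_evenly(ordered, ranks_per_block)
                for local_rank, chunk in enumerate(chunks):
                    for leaf in chunk:
                        owner[leaf] = block_id * ranks_per_block + local_rank
    return owner
```

For example, calling `hybrid_partition(leaves, 8, (2, 2, 2), 2)` on an 8×8×8 grid of leaves yields 16 sub-domains, each confined to a single coarse block. In this sketch a pure SFC split corresponds to `coarse_blocks=(1, 1, 1)`, where a sub-domain is only guaranteed to be a contiguous run along the curve and not a spatially compact region.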

Original language: English
Article number: 102851
Journal: Parallel Computing
Volume: 108
DOIs
State: Published - Dec 2021
Externally published: Yes

Keywords

  • Adaptive mesh refinement (AMR)
  • GPU
  • Lattice Boltzmann method
  • Static AMR
