SeisSol on Distributed Multi-GPU Systems: CUDA Code Generation for the Modal Discontinuous Galerkin Method

Ravil Dorozhinskii, Michael Bader

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

We present a GPU implementation of the high-order Discontinuous Galerkin (DG) scheme in SeisSol, a software package for simulating seismic waves and earthquake dynamics. Our particular focus is on providing a performance-portable solution for heterogeneous distributed multi-GPU systems. We therefore redesigned SeisSol's code generation cascade for GPU programming models. This includes CUDA source code generation for the performance-critical small batched matrix multiplication kernels. The parallelisation extends the existing MPI+X scheme and supports SeisSol's cluster-wise Local Time Stepping (LTS) algorithm for ADER time integration. We performed a Roofline model analysis to ensure that the generated batched matrix operations achieve the performance limits posed by the memory-bandwidth roofline. Our results also demonstrate that the generated GPU kernels outperform the corresponding cuBLAS subroutines by 2.5 times on average. We present strong and weak scaling studies of our implementation on the Marconi100 supercomputer (with 4 Nvidia Volta V100 GPUs per node) on up to 256 GPUs, which revealed good parallel performance and efficiency for time integration using global time stepping (GTS). However, we show that directly mapping the LTS method from CPUs to distributed GPU environments results in lower hardware utilization. Nevertheless, due to the algorithmic advantages of local time stepping, the method still reduces time-to-solution by a factor of 1.3 on average compared with the GTS scheme.
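The memory-bandwidth roofline mentioned in the abstract can be illustrated with a short calculation. The sketch below is not taken from the paper: the function names, the 8 x 8 matrix size, and the nominal V100 figures (about 7.8 TFLOP/s in double precision and about 900 GB/s memory bandwidth) are assumptions chosen for illustration, and each matrix is assumed to be moved between memory and the GPU exactly once.

```python
def arithmetic_intensity(m, n, k, bytes_per_value=8):
    """FLOP per byte for one product C = A @ B, with A (m x k) and B (k x n).

    Assumption: A and B are each read once and C is written once.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_value * (m * k + k * n + m * n)
    return flops / bytes_moved


def roofline_bound(intensity, peak_flops, bandwidth):
    """Attainable FLOP/s under the roofline model: min(peak, intensity * bandwidth)."""
    return min(peak_flops, intensity * bandwidth)


# Assumed nominal hardware figures, not measurements from the paper.
PEAK_FP64 = 7.8e12   # FLOP/s
BANDWIDTH = 900e9    # bytes/s

# Hypothetical small matrix size typical of a batched kernel.
intensity = arithmetic_intensity(8, 8, 8)
bound = roofline_bound(intensity, PEAK_FP64, BANDWIDTH)
print(f"intensity = {intensity:.3f} FLOP/byte, bound = {bound / 1e9:.0f} GFLOP/s")
```

Under these assumptions a small matrix product sits far below the compute peak, which is why the memory-bandwidth roofline, rather than peak FLOP/s, is the relevant limit for such kernels.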

Original language: English
Title of host publication: Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
Publisher: Association for Computing Machinery
Pages: 69-82
Number of pages: 14
ISBN (Electronic): 9781450388429
State: Published - 20 Jan 2021
Event: 2021 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021 - Virtual, Online, Korea, Republic of
Duration: 20 Jan 2021 → 22 Jan 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2021 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: 20/01/21 → 22/01/21

Keywords

  • ADER
  • Discontinuous Galerkin
  • GPU
  • SeisSol
  • code generation
  • high performance computing
  • local time stepping
  • seismic wave propagation
