TY - GEN
T1 - Petascale Local Time Stepping for the ADER-DG Finite Element Method
AU - Breuer, Alexander
AU - Heinecke, Alexander
AU - Bader, Michael
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/7/18
Y1 - 2016/7/18
N2 - In this work we present a clustered local time stepping (LTS) scheme for the arbitrary high-order derivatives discontinuous Galerkin (ADER-DG) finite element scheme. By clustering elements with similar time steps, our scheme meets the regularity requirements of modern hardware through the design of the numerical discretization. We present a detailed description of our clustered local time stepping scheme for the seismic simulation package SeisSol. Our scheme is able to capture homogeneous and heterogeneous time step variations in the computational domain and maintains a large fraction of the theoretical speedup offered by LTS. From an engineering standpoint, our scheme addresses all important performance characteristics of state-of-the-art supercomputers. The combined algorithmic and computational performance results for SeisSol show that we are able to leverage the large potential of local time stepping by reducing time-to-solution by factors of 2.3 to 4.1, sustaining more than 53% of SuperMUC-II's HPL performance, which corresponds to more than 1.5 PFLOPS on 86,016 cores.
AB - In this work we present a clustered local time stepping (LTS) scheme for the arbitrary high-order derivatives discontinuous Galerkin (ADER-DG) finite element scheme. By clustering elements with similar time steps, our scheme meets the regularity requirements of modern hardware through the design of the numerical discretization. We present a detailed description of our clustered local time stepping scheme for the seismic simulation package SeisSol. Our scheme is able to capture homogeneous and heterogeneous time step variations in the computational domain and maintains a large fraction of the theoretical speedup offered by LTS. From an engineering standpoint, our scheme addresses all important performance characteristics of state-of-the-art supercomputers. The combined algorithmic and computational performance results for SeisSol show that we are able to leverage the large potential of local time stepping by reducing time-to-solution by factors of 2.3 to 4.1, sustaining more than 53% of SuperMUC-II's HPL performance, which corresponds to more than 1.5 PFLOPS on 86,016 cores.
KW - ADER-DG
KW - FEM
KW - Hardware-aware algorithms
KW - High-performance computing
KW - Local time stepping
KW - Parallel algorithms
UR - http://www.scopus.com/inward/record.url?scp=84983365313&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2016.109
DO - 10.1109/IPDPS.2016.109
M3 - Conference contribution
AN - SCOPUS:84983365313
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 854
EP - 863
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016
Y2 - 23 May 2016 through 27 May 2016
ER -