Abstract
Within the current reporting period (04/2016-04/2017) of our HLRS project, we have developed a scalable implementation of the fault-tolerant combination technique. Fault tolerance is one of the key topics in ongoing research on algorithms for future exascale systems. Our algorithms tolerate both hard and soft faults and enable the efficient, massively parallel solution of high-dimensional PDEs without the need for checkpointing or process replication. The research project EXAHD is part of DFG’s priority program “Software for Exascale Computing” (SPPEXA). The project’s target application is the large-scale simulation of plasma turbulence with the code GENE. This report combines parts of three publications.
Original language | English |
---|---|
Title | High Performance Computing in Science and Engineering' 17 |
Subtitle | Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2017 |
Publisher | Springer International Publishing |
Pages | 513-529 |
Number of pages | 17 |
ISBN (electronic) | 9783319683942 |
ISBN (print) | 9783319683935 |
DOIs | |
Publication status | Published - 1 Jan 2018 |