TY - GEN
T1 - Shared memory parallelization of fully-adaptive simulations using a dynamic tree-split and -join approach
AU - Schreiber, Martin
AU - Bungartz, Hans-Joachim
AU - Bader, Michael
PY - 2012
Y1 - 2012
N2 - In this work we present an approach for parallelizing hyperbolic simulations on fully adaptive grids on shared-memory architectures. We tackle the parallelization problem with a dynamic sub-tree split-and-join approach, running the computations on the split sub-trees in parallel using lightweight tasks. The traversal of the sub-trees created by tree splitting builds on an inherently cache-efficient approach for solving hyperbolic PDEs on dynamically adaptive triangular grids using a Sierpiński space-filling curve. Our communication scheme stores the data exchanged with adjacent sub-trees in a consecutive memory area, which is further exploited for an improved run-length-encoded data exchange. To give results for a concrete scenario, we implemented a solver for the shallow water equations, which requires fully adaptive grid refinement and coarsening after each time step. Our results give detailed statistics on the optimization of the split size and on the parallelization overhead, as well as strong-scalability results for a simulation running on multi-socket Intel and AMD architectures.
AB - In this work we present an approach for parallelizing hyperbolic simulations on fully adaptive grids on shared-memory architectures. We tackle the parallelization problem with a dynamic sub-tree split-and-join approach, running the computations on the split sub-trees in parallel using lightweight tasks. The traversal of the sub-trees created by tree splitting builds on an inherently cache-efficient approach for solving hyperbolic PDEs on dynamically adaptive triangular grids using a Sierpiński space-filling curve. Our communication scheme stores the data exchanged with adjacent sub-trees in a consecutive memory area, which is further exploited for an improved run-length-encoded data exchange. To give results for a concrete scenario, we implemented a solver for the shallow water equations, which requires fully adaptive grid refinement and coarsening after each time step. Our results give detailed statistics on the optimization of the split size and on the parallelization overhead, as well as strong-scalability results for a simulation running on multi-socket Intel and AMD architectures.
UR - http://www.scopus.com/inward/record.url?scp=84880274154&partnerID=8YFLogxK
U2 - 10.1109/HiPC.2012.6507479
DO - 10.1109/HiPC.2012.6507479
M3 - Conference contribution
AN - SCOPUS:84880274154
SN - 9781467323703
T3 - 2012 19th International Conference on High Performance Computing, HiPC 2012
BT - 2012 19th International Conference on High Performance Computing, HiPC 2012
T2 - 2012 19th International Conference on High Performance Computing, HiPC 2012
Y2 - 18 December 2012 through 21 December 2012
ER -