Abstract
In a complex processor landscape dominated by multi- and many-core processors, simplifying programming plays a crucial role in enhancing developers' productivity. One way to do so is to use highly tuned library functions. In this paper we present fastsg, an optimized library for the sparse grid technique with support for dimensional truncation. With optimizations for best cache use and vectorization, we improve the performance on one processor core by up to a factor of 10. Parallelization using OpenMP scales almost linearly on a 12-core system.
Original language | English |
---|---|
Pages (from-to) | 354-363 |
Number of pages | 10 |
Journal | Procedia Computer Science |
Volume | 9 |
DOIs | |
Publication status | Published - 2012 |
Event | 12th Annual International Conference on Computational Science, ICCS 2012 - Omaha, NB, United States. Duration: 4 June 2012 → 6 June 2012 |