Abstract
OpenMP has become the dominant standard for shared memory programming. It is traditionally used for Symmetric Multiprocessor Systems, but has more recently also found its way to parallel architectures with distributed shared memory like NUMA machines. This combines the advantages of OpenMP's easy-to-use programming model with the scalability and cost-effectiveness of NUMA architectures. In NUMA (Non Uniform Memory Access) environments, however, OpenMP codes suffer from the longer latencies of remote memory accesses. This can be observed for both hardware and software DSM systems. In this paper we present SIMT/OMP, a simulation environment capable of modeling NUMA scenarios and providing comprehensive performance data about the inter-connection traffic. We use this tool to study the impact of NUMA on the performance of OpenMP applications and show how the memory layout of these codes can be improved using a visualization tool. Based on these techniques, we have achieved performance increases of up to a factor of five on some of our benchmarks, especially in larger system configurations.
Original language | English |
---|---|
Pages (from-to) | 41-52 |
Number of pages | 12 |
Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 3349 |
DOIs | |
State | Published - 2005 |
Externally published | Yes |
Event | 5th International Workshop on OpenMP Applications and Tools, WOMPAT 2004 - Houston, TX, United States Duration: 17 May 2004 → 18 May 2004 |