TY - GEN
T1 - Automatic performance analysis of OpenMP codes on a scalable shared memory system using Periscope
AU - Benedict, Shajulin
AU - Gerndt, Michael
N1 - Funding Information: This work is partially funded by BMBF under the ISAR project, grant 01IH08005A and the SILC project, grant 01IH08006E.
PY - 2012
Y1 - 2012
N2 - OpenMP is a successful interface for programming parallel applications on shared memory systems. It is widely used on small-scale shared memory systems such as multicore processors, and also in hybrid programming on large supercomputers. This paper presents performance properties for OpenMP and their automatic detection by Periscope. We evaluate Periscope's OpenMP analysis strategy on the Altix 4700 supercomputer at the Leibniz Computing Center (LRZ) in Garching. On this unique machine, OpenMP scales up to 500 cores, which corresponds to one of its 19 partitions. We present results for the NAS parallel benchmarks and for a large hybrid scientific application.
AB - OpenMP is a successful interface for programming parallel applications on shared memory systems. It is widely used on small-scale shared memory systems such as multicore processors, and also in hybrid programming on large supercomputers. This paper presents performance properties for OpenMP and their automatic detection by Periscope. We evaluate Periscope's OpenMP analysis strategy on the Altix 4700 supercomputer at the Leibniz Computing Center (LRZ) in Garching. On this unique machine, OpenMP scales up to 500 cores, which corresponds to one of its 19 partitions. We present results for the NAS parallel benchmarks and for a large hybrid scientific application.
KW - Memory accesses analysis
KW - OpenMP
KW - Performance analysis
KW - Supercomputers
UR - http://www.scopus.com/inward/record.url?scp=84857469209&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-28145-7_44
DO - 10.1007/978-3-642-28145-7_44
M3 - Conference contribution
AN - SCOPUS:84857469209
SN - 9783642281440
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 452
EP - 462
BT - Applied Parallel and Scientific Computing - 10th International Conference, PARA 2010, Revised Selected Papers
T2 - 10th International Conference on Applied Parallel and Scientific Computing, PARA 2010
Y2 - 6 June 2010 through 9 June 2010
ER -