TY - GEN
T1 - Exploiting state-of-the-art x86 architectures in scientific computing
AU - Heinecke, Alexander
AU - Auckenthaler, Thomas
AU - Trinitis, Carsten
PY - 2012
Y1 - 2012
N2 - In recent years, general-purpose x86 architectures have undergone significant modifications towards high-performance computing capabilities. Lately, technologies like wider vector units or Fused Multiply-Add (FMA) instructions, which were mainly known from GPU architectures, have been introduced. In this paper, we examine the performance of current x86 architectures, namely Intel Sandy Bridge and AMD Bulldozer, for four different parallel workloads with different properties. These properties comprise optimally cache-blocked algorithms as well as adaptive grid structures resulting in memory-latency-bound and bandwidth-bound executions. The achieved performance on both architectures is very promising and, if extrapolated towards upcoming server silicon, can be regarded as on par with current high-end GPU-based accelerators.
AB - In recent years, general-purpose x86 architectures have undergone significant modifications towards high-performance computing capabilities. Lately, technologies like wider vector units or Fused Multiply-Add (FMA) instructions, which were mainly known from GPU architectures, have been introduced. In this paper, we examine the performance of current x86 architectures, namely Intel Sandy Bridge and AMD Bulldozer, for four different parallel workloads with different properties. These properties comprise optimally cache-blocked algorithms as well as adaptive grid structures resulting in memory-latency-bound and bandwidth-bound executions. The achieved performance on both architectures is very promising and, if extrapolated towards upcoming server silicon, can be regarded as on par with current high-end GPU-based accelerators.
KW - AMD
KW - CPU architectures
KW - Intel
KW - Multi-core
KW - parallel applications
KW - vectorization
UR - http://www.scopus.com/inward/record.url?scp=84870754745&partnerID=8YFLogxK
U2 - 10.1109/ISPDC.2012.15
DO - 10.1109/ISPDC.2012.15
M3 - Conference contribution
AN - SCOPUS:84870754745
SN - 9780769548050
T3 - Proceedings - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
SP - 47
EP - 54
BT - Proceedings - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
T2 - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
Y2 - 25 June 2012 through 29 June 2012
ER -