TY - GEN
T1 - Pattern-aware staging for hybrid memory systems
AU - Arima, Eishi
AU - Schulz, Martin
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2020.
PY - 2020
Y1 - 2020
N2 - The ever-increasing demand for higher memory performance and, at the same time, larger memory capacity is leading the industry towards hybrid main memory designs, i.e., memory systems that consist of multiple different memory technologies. This trend, however, naturally leads to one important question: how can we efficiently utilize such hybrid memories? Our paper proposes a software-based approach to solve this challenge by deploying a pattern-aware staging technique. Our work is based on the following observations: (a) the high-bandwidth fast memory outperforms the large memory for memory-intensive tasks; (b) but those tasks can run for much longer than a bulk data copy to/from the fast memory, especially when the access pattern is more irregular/sparse. We exploit these observations by applying the following staging technique if the accesses are irregular and sparse: (1) copying a chunk (a few GB of sequential data) from large to fast memory; (2) performing a memory-intensive task on the chunk; and (3) writing it back to the large memory. To check the regularity/sparseness of the accesses at runtime with negligible performance impact, we develop a lightweight pattern detection mechanism using a helper-threading-inspired approach with two different Bloom filters. Our case study using various scientific codes on a real system shows that our approach achieves significant speed-ups compared to executions using only the large memory or hardware caching: 3× or 41% speedups in the best case, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85087030168&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-50743-5_24
DO - 10.1007/978-3-030-50743-5_24
M3 - Conference contribution
AN - SCOPUS:85087030168
SN - 9783030507428
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 474
EP - 495
BT - High Performance Computing - 35th International Conference, ISC High Performance 2020, Proceedings
A2 - Sadayappan, Ponnuswamy
A2 - Chamberlain, Bradford L.
A2 - Juckeland, Guido
A2 - Ltaief, Hatem
PB - Springer
T2 - 35th International Conference on High Performance Computing, ISC High Performance 2020
Y2 - 22 June 2020 through 25 June 2020
ER -