TY - JOUR
T1 - Power- and Cache-Aware Task Mapping with Dynamic Power Budgeting for Many-Cores
AU - Rapp, Martin
AU - Sagi, Mark
AU - Pathania, Anuj
AU - Herkersdorf, Andreas
AU - Henkel, Jorg
N1 - Publisher Copyright: © 1968-2012 IEEE.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Two factors primarily affect the performance of multi-threaded tasks on many-core processors with a logically shared and physically distributed Last-Level Cache (LLC): the LLC latencies of threads running on different cores, and the per-core power budgets that aim to guarantee thermally safe operation. Two knobs affect these factors. First, the mapping of threads to cores affects both the LLC latencies and the power budgets. Second, dynamic power budgeting refines the power budgets during task execution. A mapping that spatially distributes threads across the many-core increases the power budgets but also increases the LLC latencies. Conversely, mapping all threads near the center of the many-core minimizes the LLC latencies but also decreases the power budgets. Consequently, the two metrics cannot be optimal simultaneously, which leads to a Pareto optimization for task mapping that has not previously been exploited. Dynamic power budgeting reallocates the power budgets according to the tasks' execution phases. This results in a dynamically changing, non-uniform power budget, which further increases performance. We are the first to present a run-time algorithm, PCGov, that combines task-agnostic task mapping and task-aware dynamic power budgeting for many-cores with shared distributed LLC. PCGov yields up to 21 percent lower response time and 13 percent lower energy consumption compared to the state of the art, with a low overhead of less than 0.5 percent.
KW - Processor scheduling
KW - TSP (Thermal Safe Power)
KW - cache memory
KW - dark silicon
KW - low power design
KW - power dissipation
KW - task mapping
KW - thermal stability
UR - http://www.scopus.com/inward/record.url?scp=85078707907&partnerID=8YFLogxK
U2 - 10.1109/TC.2019.2935446
DO - 10.1109/TC.2019.2935446
M3 - Article
AN - SCOPUS:85078707907
SN - 0018-9340
VL - 69
SP - 1
EP - 13
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 1
M1 - 8807211
ER -