TY - GEN
T1 - A hardware-based multi-objective thread mapper for tiled manycore architectures
AU - Pujari, Ravi Kumar
AU - Wild, Thomas
AU - Herkersdorf, Andreas
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/12/14
Y1 - 2015/12/14
N2 - Thread mapping is typically performed as an integral part of cooperative or pre-emptive operating system (OS) scheduling in order to share the processor core(s) among competing applications. Schedulers usually follow a single-objective performance optimization, such as maximizing core utilization or satisfying deadlines by the prioritization of threads. Meeting multiple orthogonal objectives, like performance vs. power or thermal resilience, in the era of manycore processors is a challenge because of the associated scalability and thread management overhead. We tackle these challenges by employing a two-stage thread management strategy. In the first stage (not covered in this short paper), threads are assigned to regions or compute tiles. For the second stage, we introduce in this paper the TCU (Thread Control Unit), a configurable, low-latency, low-overhead hardware thread mapper that takes various runtime sensor parameters into account. It can map threads within a small and bounded number of clock cycles in a round-robin, single-objective, or multi-objective manner. The TCU is designed to consider not just load balancing or performance criteria but also physical constraints like power budgets, temperature limits and reliability aspects. The TCU macro achieves 150K thread mappings per second on a tiled MPSoC FPGA prototype while operating at a moderate 50 MHz. Evaluations of different mapping policies show that multi-objective thread mapping provides about 10 to 40% lower mapping latency for periodic and bursty traffic compared to single-objective or round-robin schemes. FPGA and ASIC syntheses reveal a 9% hardware overhead for the TCU on a four-core compute tile.
AB - Thread mapping is typically performed as an integral part of cooperative or pre-emptive operating system (OS) scheduling in order to share the processor core(s) among competing applications. Schedulers usually follow a single-objective performance optimization, such as maximizing core utilization or satisfying deadlines by the prioritization of threads. Meeting multiple orthogonal objectives, like performance vs. power or thermal resilience, in the era of manycore processors is a challenge because of the associated scalability and thread management overhead. We tackle these challenges by employing a two-stage thread management strategy. In the first stage (not covered in this short paper), threads are assigned to regions or compute tiles. For the second stage, we introduce in this paper the TCU (Thread Control Unit), a configurable, low-latency, low-overhead hardware thread mapper that takes various runtime sensor parameters into account. It can map threads within a small and bounded number of clock cycles in a round-robin, single-objective, or multi-objective manner. The TCU is designed to consider not just load balancing or performance criteria but also physical constraints like power budgets, temperature limits and reliability aspects. The TCU macro achieves 150K thread mappings per second on a tiled MPSoC FPGA prototype while operating at a moderate 50 MHz. Evaluations of different mapping policies show that multi-objective thread mapping provides about 10 to 40% lower mapping latency for periodic and bursty traffic compared to single-objective or round-robin schemes. FPGA and ASIC syntheses reveal a 9% hardware overhead for the TCU on a four-core compute tile.
KW - Hardware Scheduler
KW - MPSoC
KW - Multi-objective
KW - Thread Mapping
UR - http://www.scopus.com/inward/record.url?scp=84962368227&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2015.7357148
DO - 10.1109/ICCD.2015.7357148
M3 - Conference contribution
AN - SCOPUS:84962368227
T3 - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015
SP - 459
EP - 462
BT - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 33rd IEEE International Conference on Computer Design, ICCD 2015
Y2 - 18 October 2015 through 21 October 2015
ER -