TY - GEN
T1 - Automatic generation of high-performance modular multipliers for arbitrary mersenne primes on FPGAs
AU - Koppermann, Philipp
AU - De Santis, Fabrizio
AU - Heyszl, Johann
AU - Sigl, Georg
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/6/16
Y1 - 2017/6/16
N2 - Modular multiplication is a fundamental and performance determining operation in various public-key cryptosystems. High-performance modular multipliers on FPGAs are commonly realized by several small-sized multipliers, an adder tree for summing up the digit-products, and a reduction circuit. While small-sized multipliers are available in pre-fabricated high-speed DSP slices, the adder tree and the reduction circuit are implemented in standard logic. The latter operations represent the performance bottleneck to high-performance implementations. Previous works attempted to minimize the critical path of the adder tree by rearranging digit-products on digit-level. We report improved performance by regrouping digit-products on bit-level, while incorporating the reduction for Mersenne primes. Our approach leads to very fast modular multipliers, whose latency and throughput characteristics outperform all previous results. We formalize our approach and provide algorithms to automatically generate high-performance modular multipliers for arbitrary Mersenne primes from any small-sized multipliers.
AB - Modular multiplication is a fundamental and performance determining operation in various public-key cryptosystems. High-performance modular multipliers on FPGAs are commonly realized by several small-sized multipliers, an adder tree for summing up the digit-products, and a reduction circuit. While small-sized multipliers are available in pre-fabricated high-speed DSP slices, the adder tree and the reduction circuit are implemented in standard logic. The latter operations represent the performance bottleneck to high-performance implementations. Previous works attempted to minimize the critical path of the adder tree by rearranging digit-products on digit-level. We report improved performance by regrouping digit-products on bit-level, while incorporating the reduction for Mersenne primes. Our approach leads to very fast modular multipliers, whose latency and throughput characteristics outperform all previous results. We formalize our approach and provide algorithms to automatically generate high-performance modular multipliers for arbitrary Mersenne primes from any small-sized multipliers.
UR - http://www.scopus.com/inward/record.url?scp=85025151468&partnerID=8YFLogxK
U2 - 10.1109/HST.2017.7951794
DO - 10.1109/HST.2017.7951794
M3 - Conference contribution
AN - SCOPUS:85025151468
T3 - Proceedings of the 2017 IEEE International Symposium on Hardware Oriented Security and Trust, HOST 2017
SP - 35
EP - 40
BT - Proceedings of the 2017 IEEE International Symposium on Hardware Oriented Security and Trust, HOST 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th IEEE International Symposium on Hardware Oriented Security and Trust, HOST 2017
Y2 - 1 May 2017 through 5 May 2017
ER -