As the computation bottleneck in lattice-based cryptography (LBC), the polynomial multiplication based on number theoretic transform (NTT) has been continuously studied for flexible hardware implementations with high area-efficiency. This paper… Click to show full abstract
As the computation bottleneck in lattice-based cryptography (LBC), the polynomial multiplication based on number theoretic transform (NTT) has been continuously studied for flexible hardware implementations with high area-efficiency. This paper presents an area-efficient and configurable NTT-based polynomial multiplier (AC-PM) incorporating algorithmic and architectural level optimization techniques. For the core operation of polynomial multiplication, two low-complexity and fast modular multiplication algorithms are introduced with loose constraints of LBC-friendly primes. Based on the proposed algorithms, a reconfigurable processing element (RPE) is dedicatedly designed to execute all the operations in an NTT-based polynomial multiplication: NTT, inverse NTT (INTT), and coefficient-wise multiplication (CWM). The proposed AC-PM can be configured with different numbers of RPEs and supports various polynomial degrees without recompilation. Additionally, the dataflow complexity is greatly simplified. More importantly, to the best of our knowledge, the twiddle factors are reused, for the first time, to support both NTT and INTT with multiple polynomial degrees, which leads to increased flexibility of AC-PM with small overhead on hardware resource. FPGA implementation results demonstrate that the proposed AC-PM significantly outperforms the prior arts in both flexibility and area efficiency.
               
Click one of the above tabs to view related content.