Publication | Closed Access
KaLi: A Crystal for Post-Quantum Security Using Kyber and Dilithium
Citations: 86 | References: 22 | Year: 2022
Cryptographic Primitive, Engineering, Information Security, Chemistry, Key Generation, Quantum Computing, Post-quantum Cryptography, Signature Verification, Quantum Key Distribution, Quantum Science, Quantum Cryptography, Physics, Computer Engineering, Computer Science, Crystallography, Data Security, Cryptography, Natural Sciences, Cryptographic Protection, Applied Physics, Kyber Operations
Quantum computers pose a threat to the security of communications over the internet. This imminent risk has led to the standardization of cryptographic schemes for protection in a post-quantum scenario. We present a design methodology for future implementations of such algorithms and demonstrate it using the NIST-selected digital signature scheme CRYSTALS-Dilithium and key encapsulation scheme CRYSTALS-Kyber. We propose a unified architecture, KaLi, that can perform key generation, encapsulation, decapsulation, signature generation, and signature verification for all security levels of CRYSTALS-Dilithium and CRYSTALS-Kyber. A unified yet flexible polynomial arithmetic unit is designed that can process Kyber operations twice as fast as Dilithium operations. Efficient memory management is proposed to achieve optimal latency. KaLi is explicitly tailored for ASIC platforms using multiple clock domains. On 28 nm/65 nm ASIC technology, it occupies 0.263/1.107 mm² and achieves a clock frequency of 2 GHz/560 MHz for the fast clock that drives the memory unit. On a Xilinx Zynq UltraScale+ ZCU102 FPGA, the proposed architecture uses 23,277 LUTs, 9,758 DFFs, 4 DSPs, and 24 BRAMs at a clock frequency of 270 MHz. KaLi performs better than the standalone implementations of either of the two schemes. This is the first work to provide a unified design in hardware for both schemes.
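One plausible reading of the "twice as fast" claim is that Kyber coefficients need about half as many bits as Dilithium coefficients, so a datapath sized for Dilithium could carry two Kyber coefficients per word. The sketch below illustrates only that idea. The moduli (3329 and 8380417) and the polynomial length of 256 come from the published Kyber and Dilithium parameter sets; the 24-bit word width, the packing layout, and all function names are assumptions made for illustration and are not taken from the KaLi design, whose datapath the abstract does not describe.

```python
KYBER_Q = 3329          # 12-bit prime modulus of CRYSTALS-Kyber
DILITHIUM_Q = 8380417   # 23-bit prime modulus of CRYSTALS-Dilithium
N = 256                 # coefficients per polynomial in both schemes
WORD_BITS = 24          # hypothetical shared datapath width (assumption)


def lanes_per_word(q, word_bits=WORD_BITS):
    """How many coefficients modulo q fit side by side in one word."""
    return word_bits // (q - 1).bit_length()


def pack(coeffs, q, word_bits=WORD_BITS):
    """Pack coefficients into words, lowest lane first."""
    lanes = lanes_per_word(q, word_bits)
    lane_bits = word_bits // lanes
    words = []
    for i in range(0, len(coeffs), lanes):
        word = 0
        for j, c in enumerate(coeffs[i:i + lanes]):
            word |= (c % q) << (j * lane_bits)
        words.append(word)
    return words


def unpack(words, q, count, word_bits=WORD_BITS):
    """Inverse of pack: recover the first `count` coefficients."""
    lanes = lanes_per_word(q, word_bits)
    lane_bits = word_bits // lanes
    mask = (1 << lane_bits) - 1
    coeffs = [(w >> (j * lane_bits)) & mask for w in words for j in range(lanes)]
    return coeffs[:count]


def butterfly(a, b, w, q):
    """Cooley-Tukey NTT butterfly on one lane: (a + w*b, a - w*b) mod q."""
    t = (w * b) % q
    return (a + t) % q, (a - t) % q
```

Under these assumptions a 256-coefficient Kyber polynomial occupies 128 words while a Dilithium polynomial occupies 256, so a unit that performs one butterfly per lane per cycle would move through Kyber data in half as many word operations.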