Publication | Closed Access
Fast AES Implementation: A High-Throughput Bitsliced Approach
Citations: 55
References: 25
Year: 2019
Encryption Throughput, Engineering, Hardware Algorithm, Computer Architecture, Block Cipher, AES Implementation, Hardware Security, High-performance Architecture, Fast AES Implementation, Parallel Computing, Bitsliced Approach, Data Encryption Standard, Computer Engineering, Lightweight Cryptography, Computer Science, FPGA Design, Data Security, Cryptography, Hardware Acceleration, Parallel Programming, Data-level Parallelism
In this work, a high-throughput bitsliced AES implementation is proposed. It builds upon a new data representation scheme that exploits the parallelization capability of modern multi/many-core platforms. This representation scheme is employed as a building block to redesign all of the AES stages and tailor them for multi/many-core AES implementation. With the proposed bitsliced approach, each parallelization unit processes thirty-two 128-bit input blocks at once, so a high degree of parallelization is achieved. Based on the characteristics of this implementation model, the ShiftRows stage can be handled implicitly through input rearrangement and is simplified to the point where its computational cost can be neglected. Costly byte-wise operations are performed through register shifts and swaps. In addition, the look-up-table-based I/O operations used by the SubBytes (Substitute Bytes) stage are eliminated by using an S-box logic circuit, which is optimized to process 32 blocks of 128-bit input data simultaneously. We develop high-throughput CTR and ECB AES encryption/decryption on 6 CUDA-enabled GPUs; on a Tesla V100 GPU, these achieve encryption throughputs of 1.47 and 1.38 Tbps, respectively.
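To make the idea of a bitsliced representation concrete, the following is a minimal sketch in Python, not the paper's actual data layout or CUDA code. It assumes one plausible arrangement: thirty-two 128-bit blocks are transposed into 128 words of 32 bits, so that word `i` holds bit `i` of every block (32 × 128 = 4096 bits of state per parallelization unit). The function names, the bit ordering, and the choice of AddRoundKey as the illustrated stage are all assumptions made for illustration.

```python
NBLOCKS = 32          # blocks handled by one parallelization unit (from the abstract)
NBITS = 128           # bits per AES block
MASK32 = 0xFFFFFFFF   # emulates a 32-bit register

def bitslice_pack(blocks):
    """Transpose 32 16-byte blocks into 128 32-bit words.

    Assumed layout (hypothetical): bit j of slices[i] is bit i of block j,
    where bit i of a block is bit (i % 8) of byte (i // 8).
    """
    if len(blocks) != NBLOCKS or any(len(b) != 16 for b in blocks):
        raise ValueError("expected 32 blocks of 16 bytes each")
    slices = [0] * NBITS
    for j, block in enumerate(blocks):
        for i in range(NBITS):
            bit = (block[i // 8] >> (i % 8)) & 1
            slices[i] |= bit << j
    return slices

def bitslice_unpack(slices):
    """Inverse of bitslice_pack: recover the 32 16-byte blocks."""
    blocks = [bytearray(16) for _ in range(NBLOCKS)]
    for i, word in enumerate(slices):
        for j in range(NBLOCKS):
            bit = (word >> j) & 1
            blocks[j][i // 8] |= bit << (i % 8)
    return [bytes(b) for b in blocks]

def add_round_key(slices, round_key):
    """XOR one 16-byte round key into all 32 blocks at once.

    Each key bit is broadcast across the 32 lanes of its word, so a single
    32-bit XOR updates the same bit position in every block.
    """
    out = []
    for i, word in enumerate(slices):
        key_bit = (round_key[i // 8] >> (i % 8)) & 1
        out.append(word ^ (MASK32 if key_bit else 0))
    return out
```

In this form every AES stage becomes a sequence of word-wide logic operations on the 128 slices, which is why the S-box can be evaluated as a logic circuit instead of a table look-up, and why a fixed byte permutation such as ShiftRows can reduce to choosing which slice is read rather than moving data.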