The Technology Dependence of Lightweight Hash Implementation Cost

Xu Guo and Patrick Schaumont

Center for Embedded Systems for Critical Applications (CESCA)
Bradley Department of Electrical and Computer Engineering
Virginia Tech, Blacksburg, VA 24061, USA
{xuguo,schaum}@vt.edu

Abstract. The growing demand for security features in pervasive computing requires cryptographic implementations to meet tight cost constraints. Lightweight Cryptography is a generic term that captures new efforts in this area, covering lightweight cryptography proposals as well as lightweight implementation techniques. This paper demonstrates the influence of technology selection when comparing different lightweight hash designs and when using lightweight cryptography techniques to implement a hash design. First, we demonstrate the impact of technology selection on the cost analysis of existing lightweight hash designs through two case studies: the new lightweight proposal Quark [1] and a lightweight implementation of CubeHash [2]. Second, by observing the interaction of hash algorithm design, architecture design, and technology mapping, we propose a methodology for lightweight hash implementation and apply it to CubeHash optimizations. Finally, we introduce a cost model for analyzing the hardware cost of lightweight hash implementations.

1 Introduction

Lightweight Cryptography is a recent trend that combines several cryptographic proposals that implement a careful tradeoff between resource cost and security level. For lightweight hash implementations, lower security levels mean a reduced level of collision resistance as well as preimage and second-preimage resistance compared with the common SHA standard [3].

The growing emphasis on engineering aspects of cryptographic algorithms can be observed through recent advances in lightweight hash designs, which are strongly implementation-oriented. Fig. 1 shows three important design fields in lightweight hash design: algorithm specification, hardware architecture, and silicon implementation. Crypto-engineers are familiar with the relationship between the algorithm level and the architecture level. However, the silicon implementation remains a significant challenge for the crypto-engineer, and the true impact of design decisions often remains unknown until the design is implemented.

This paper contributes to this challenge in three ways. First, by mapping an existing lightweight hash proposal to different technology nodes and standard-cell libraries, we illustrate how important technology-oriented cost factors can be taken into account during analysis. Second, by studying the interaction between hardware architecture and silicon implementation, we demonstrate the importance of low-level memory structures in lightweight hash design. As a case study, we show how to optimize CubeHash [2], a SHA-3 Round 2 candidate with a high security level, to fit in 6,000 GEs (Gate Equivalents) by combining bit-slicing (at the hardware architecture level) and proper memory structures (at the silicon implementation level). Third, we build a cost model for lightweight hash designs and provide guidelines for both lightweight hash developers and hardware designers.

[Figure 1: three interacting design fields for a lightweight hash. Algorithm Specification (e.g. datapath, state size, and number of rounds); Hardware Architecture (e.g. datapath folding and logic optimization); Silicon Implementation (e.g. memory structures, standard-cell design, and full/semi-custom design). Further labels: Cost, Throughput, and Power Models; Cost, Throughput, and Power Results; Circuit Level Optimization Choices.]
Fig. 1. The interactive process in the development of lightweight cryptography.

2 Lightweight Hash Comparison Issues

Lightweight hash designs are, by their nature, very sensitive to small variations in comparison metrics.
For example, several of them claim an area of around 1,000 GEs with a power consumption of around 2 µW. However, it appears that most lightweight hash designers only discuss algorithm implementations at the logic level and abstract away important technological factors. This makes a comparison between different lightweight hash designs difficult or even impossible. To understand the issues, one should first identify whether the selected metrics are dependent on or independent of the technology. Below we discuss several metrics that are commonly used to compare different lightweight hash proposals.

2.1 Area

Most lightweight hash papers compare hardware cost using the post-synthesis circuit area in terms of GEs. However, circuit area estimation based on GEs is a coarse approach; for example, it ignores the distinction between control, datapath, and storage. Moreover, GEs are strongly technology dependent. In the following two case studies of Quark [1] and CubeHash [2], we demonstrate different aspects of the technology dependence of area cost.

- Standard-cell Library Impact. This impact comes from different technology nodes and different standard-cell libraries. The ASIC library influence can be found by using the same synthesis scripts for the same RTL designs in a comprehensive exploration with different technology nodes and standard-cell libraries. As found in the Quark case study in Section 3, the cost variation caused by changing standard-cell libraries ranges from -17.7% to 21.4% for technology nodes from 90nm to 180nm.
- Storage Structure Impact. Hash designs are state-intensive and typically require a proportionally large number of gates for storage. This makes the selection of the proper storage structure an important tuning knob for the implementation of a hash algorithm.
For example, we implement several bit-sliced versions of CubeHash [4,5,6] and show that the use of a register file can reduce area by 42.7% compared with a common flip-flop based memory design.

2.2 Power

Power consumption is another commonly used metric that is strongly correlated with the technology. For the standard-cell library impact, Table 1 shows that the power efficiency in terms of nW/MHz/GE differs by a factor of two to three across different technology nodes. For the storage structure impact, as illustrated in our case study of CubeHash in Section 4, the power variation ranges from -31.4% to 14.5% at the 130nm technology node. Therefore, it is important to provide proper context when comparing power estimation results for different lightweight hash implementations.

Table 1. Characteristics of different ASIC technology nodes from some commercial standard-cell libraries [7].

Technology Node   Gate Density [kGEs/mm^2]   Power [nW/MHz/GE]
180 nm            125                        15.00
130 nm            206                        10.00
90 nm             403                        7.00
65 nm             800                        5.68

However, in general we think that power consumption is a less important metric for comparing different lightweight hash designs. Take the most popular and ultra-constrained application, RFID, as an example, where the power budget for the security portion is 27 µW at 100 kHz [8,9]. Using the average power numbers in Table 1, a rough estimate of the area needed to consume 27 µW is 18k GEs, 27k GEs, 39k GEs, and 48k GEs at 180nm, 130nm, 90nm, and 65nm, respectively. Therefore, all of the lightweight hash proposals, which are much smaller, should satisfy the power requirement.

2.3 Latency

The latency of hash operations is measured in clock cycles and is technology independent. However, one related metric, throughput, may be technology dependent if the maximum throughput is required, since the maximum frequency of a design is technology dependent.
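As an aside on the power-budget estimate of Section 2.2, the arithmetic can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that power scales linearly with gate count and clock frequency, it uses the average figures from Table 1, and the names `POWER_NW_PER_MHZ_PER_GE` and `max_gates_for_budget` are hypothetical.

```python
# Average power per gate from Table 1, in nW/MHz/GE, keyed by technology node in nm.
POWER_NW_PER_MHZ_PER_GE = {180: 15.00, 130: 10.00, 90: 7.00, 65: 5.68}


def max_gates_for_budget(budget_uw, freq_khz, node_nm):
    """Return the gate count (in GEs) that would consume the whole power budget.

    Assumes power grows linearly with both gate count and clock frequency,
    which is only a first-order model of the estimate in Section 2.2.
    """
    budget_nw = budget_uw * 1000.0   # convert uW to nW
    freq_mhz = freq_khz / 1000.0     # convert kHz to MHz
    nw_per_ge = POWER_NW_PER_MHZ_PER_GE[node_nm] * freq_mhz
    return budget_nw / nw_per_ge


if __name__ == "__main__":
    # RFID example from the text: 27 uW at 100 kHz.
    for node in (180, 130, 90, 65):
        gates = max_gates_for_budget(27.0, 100.0, node)
        print(f"{node} nm: about {gates / 1000:.1f}k GEs")
```

Rounded to the nearest thousand, the results match the 18k, 27k, 39k, and 48k GEs quoted in Section 2.2.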
Since in most applications targeted by lightweight hashes, especially tag-based applications, hashing a large amount of data is unlikely or the operating frequency is fixed at a very low value (e.g. 100 kHz for RFID), latency as a technology-independent metric should be sufficient to characterize lightweight hash performance.

2.4 Summary

As discussed above, only the latency metric is independent of technology. Power is a technology-dependent metric, but it is much less important than area. Therefore, the rest of this work focuses on the technology-dependent analysis of the hardware cost of lightweight hash designs. The technology impacts of standard-cell libraries and storage structures are measured quantitatively through two concrete case studies of the Quark and CubeHash designs.

3 ASIC Library Dependent Cost Analysis of Quark

In this section, we investigate the impact of technology library selection on the overall GE count of a design. We do this through the example of the Quark lightweight hash function.

3.1 Overview of Quark

The Quark hash family by Aumasson et al. was presented at CHES 2010 [1]. It uses sponge functions as the domain extension algorithm and an internal permutation inspired by the stream cipher GRAIN and the block cipher KATAN. There are three instances of Quark: U-Quark, D-Quark, and S-Quark, providing at least 64-, 80-, and 112-bit security against all attacks (collisions, second preimages, length extension, multi-collisions, etc.) [10]. The authors reported the hardware cost after layout of each variant as 1379, 1702, and 2296 GEs at 180nm.

The open-source RTL code found at the Quark website [10] lets us evaluate the standard-cell library impact on the hardware cost by reusing the same source code. The evaluation is performed in two steps: first, we look at cost variations at three technology nodes (180nm/130nm/90nm); second,
