Lightweight Cryptography: from Smallest to Fastest

Miroslav Knežević NXP Semiconductors July 21, 2015

LCW2015, NIST Gaithersburg, USA

Trade-offs in HW

Performance Silicon area

2. July 22, 2015 Trade-offs in HW

Energy

Power

Performance Silicon area

Trade-offs in Crypto HW

Energy

Security Power

Performance Silicon area

Trade-offs in Crypto HW

Energy, Security, Performance, Silicon area

Max power, Average power

Security analysis, Verification

Design costs, Design time

IP quality, Documentation

Customer requirements

Designing the Smallest

Silicon area

NAND gate

• Smallest logic gate with two inputs.

• GE (gate equivalent) = physical area of a single NAND gate.

• (Ab)used for comparing HW designs across different CMOS technologies.

• When comparing lightweight crypto algorithms NEVER trust GE!
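To see why GE figures do not transfer across libraries, consider a minimal sketch. It assumes a hypothetical netlist of 100 scan flip-flops and 200 NAND gates, and uses the scan flip-flop sizes reported in this talk for UMC 130 nm (6.25 GE) and NANGATE 45 nm (7.67 GE). The function name and the netlist are illustrative only.

```python
def netlist_ge(num_scan_ff: int, num_nand2: int, scan_ff_ge: float) -> float:
    """GE of a toy netlist: each NAND2 counts 1 GE, each scan FF counts scan_ff_ge."""
    return num_scan_ff * scan_ff_ge + num_nand2


# Hypothetical netlist, not taken from any real cipher implementation.
NUM_FF, NUM_NAND2 = 100, 200

# Scan flip-flop sizes in GE as quoted in the talk for two libraries.
ge_umc130 = netlist_ge(NUM_FF, NUM_NAND2, scan_ff_ge=6.25)
ge_nangate45 = netlist_ge(NUM_FF, NUM_NAND2, scan_ff_ge=7.67)

print(f"UMC 130 nm:    {ge_umc130:.0f} GE")
print(f"NANGATE 45 nm: {ge_nangate45:.0f} GE")
print(f"Same netlist, {100 * (ge_nangate45 / ge_umc130 - 1):.0f}% more GE")
```

The netlist is identical in both cases; only the size of a flip-flop relative to a NAND gate changes, and the GE count changes with it.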

XOR gate

2-3 GE

Modern Lightweight Ciphers

< 1000 GE

AES (128-bit, ENC only)

2500 GE

Block Cipher – HW Perspective

Minimize!

Round function + Memory + Control logic

≥ 80 bits, Block size ≥ 32 bits

KATAN – The Smallest Block Cipher

KATAN32 = 462 GE

KATAN – The Smallest Block Cipher

Only 508 bits of expanded key!

KATAN32 = 315 GE + 508 bits of ROM

How does KATAN compare to the competition?

It’s (one of) the smallest known cipher(s): < 500 GE

But it’s not very fast: 254 clock cycles

Still scalable: 3 times faster for negligible area overhead
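The speed figures can be turned into throughput with a one-line formula. This is a rough sketch: the 100 kHz clock is an assumption (the talk does not state a frequency), and "3 times faster" is read here as one third of the clock cycles per block.

```python
def throughput_bps(block_bits: int, cycles_per_block: float, clock_hz: float) -> float:
    """Bits encrypted per second for an iterated block cipher core."""
    return block_bits * clock_hz / cycles_per_block


CLOCK_HZ = 100_000   # assumed clock frequency, not given in the talk
BLOCK_BITS = 32      # KATAN32 processes 32-bit blocks
BASE_CYCLES = 254    # clock cycles per block, from the slide

base = throughput_bps(BLOCK_BITS, BASE_CYCLES, CLOCK_HZ)
# "3 times faster" interpreted as a third of the cycles (assumption).
fast = throughput_bps(BLOCK_BITS, BASE_CYCLES / 3, CLOCK_HZ)

print(f"baseline: {base / 1000:.1f} kbit/s")
print(f"3x variant: {fast / 1000:.1f} kbit/s")
```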

Fine, but let’s really compare it to others!

[Figure: smallest reported implementations of LED, KATAN, Piccolo, PRESENT and KLEIN, each synthesized with Synopsys tools in a different CMOS library (130nm, 180nm, UMC130, UMC180, TSMC180, IBM130, IHP250, AMIS350). Reported areas: ≥ 460 GE, ≥ 520 GE, ≥ 580 GE, ≥ 700 GE (twice), ~1 kGE, ≥ 1.3 kGE.]

Memory Elements in different CMOS Technologies

TECHNOLOGY      AREA OF SCAN-FF [GE]
NXP 90NM        <5
UMC 130NM       6.25
UMC 180NM       6.67
NANGATE 45NM    7.67

SPONGENT in different CMOS Technologies

AREA [GE]    NXP 90NM   UMC 130NM   UMC 180NM   NANGATE 45NM
VERSION 1    521        738         759         868
VERSION 2    737        1060        1103        1256
VERSION 3    918        1329        1367        1571
VERSION 4    1192       1728        1768        2071
VERSION 5    1340       1950        2012        2323

Up to 70% difference!

How can we do a fair comparison?

Difficult in practice. But why not at least use an open-core library?

http://www.nangate.com/
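The spread behind the "up to 70% difference" remark can be recomputed from the SPONGENT area figures quoted in this talk. A minimal sketch, assuming the four numbers per version are the same design synthesized in four different technologies; the dictionary layout and the function name are illustrative.

```python
# Area in GE of five SPONGENT versions, four technologies each (from the slide).
spongent_area_ge = {
    1: [521, 738, 759, 868],
    2: [737, 1060, 1103, 1256],
    3: [918, 1329, 1367, 1571],
    4: [1192, 1728, 1768, 2071],
    5: [1340, 1950, 2012, 2323],
}


def relative_spread(areas: list[int]) -> float:
    """Largest area relative to the smallest one, as a fraction above it."""
    return (max(areas) - min(areas)) / min(areas)


spreads = {version: relative_spread(a) for version, a in spongent_area_ge.items()}

for version, spread in spreads.items():
    print(f"version {version}: largest result is {100 * spread:.0f}% above the smallest")
```

The same netlist-level design differs by well over half its reported GE count depending only on the library, which is the argument for comparing designs in one common, openly available library.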

Trade-offs in Crypto HW

Energy

Security Power

Performance Silicon area

Designing the Fastest Block Cipher

Performance

Latency vs Throughput

Serial processing
Latency = 15 s, Throughput = 0.067 beer/s

Parallel processing!
Latency = 15 s, Throughput = 0.2 beer/s

Pipelining!
Latency = 15 s, Throughput = 0.2 beer/s

Unrolling! (bottom-up!)
Latency = 5 s, Throughput = 0.2 beer/s
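The four beer-drinking scenarios can be reproduced with a small model. This is a sketch under stated assumptions: one beer takes 15 s for a single drinker (from the slides), and the factor of 3 (three drinkers, three pipeline stages, or a three-fold unrolled gulp) is inferred from the quoted throughput of 0.2 beer/s. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    name: str
    latency_s: float   # time from starting one beer to finishing it
    beers_done: int    # beers finished per steady-state interval
    interval_s: float  # length of that interval in seconds

    @property
    def throughput(self) -> float:
        """Beers finished per second in steady state."""
        return self.beers_done / self.interval_s


BEER_S = 15.0  # one drinker, one beer (from the slides)
UNITS = 3      # assumed degree of parallelism / pipelining / unrolling

schedules = [
    Schedule("serial", BEER_S, 1, BEER_S),
    Schedule("parallel", BEER_S, UNITS, BEER_S),
    Schedule("pipelining", BEER_S, 1, BEER_S / UNITS),
    Schedule("unrolling", BEER_S / UNITS, 1, BEER_S / UNITS),
]

for s in schedules:
    print(f"{s.name:<11} latency = {s.latency_s:4.1f} s  throughput = {s.throughput:.3f} beer/s")
```

Parallelism and pipelining raise throughput but leave latency untouched; only unrolling shortens the time a single item spends in the datapath.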

Latency of Existing Ciphers – Is Lightweight = “Light + Wait”?

CIPHER      BLOCK-SIZE   KEY-SIZE      S-BOX   P-LAYER           K-SCHEDULE
AES         128          128           8       MDS               LIGHT
NOEKEON     128          128           4       BINARY            NO
MINI-AES    64           64            4       MDS               LIGHT
MCRYPTON    64           64, 96, 128   4       BINARY            LIGHT
PRESENT     64           80, 128       4       BIT PERMUTATION   LIGHT
KLEIN       64           64, 80, 96    4       MDS               LIGHT
LED         64           64, 128       4       MDS               NO

Number of Rounds vs Key Size

Unrolled HW Architectures

Results – Latency

LATENCY [NS], one row per cipher configuration

1-CYCLE   2-CYCLE
17.8      20.2
15.3      16.4
20.3      21.4
25.3      26.4
31.2      32.8
46.6      48.2
9.8       10.8
9.8       10.8
9.8       11
9.9       12
14.8      17
15.5      17.4
14.8      16.4
14.7      16.6

Results – Area

AREA [KGE], one row per cipher configuration

1-CYCLE   2-CYCLE
366.6     191.8
48.2      24.9
63.7      32.6
79.9      41.3
128.7     63.5
193.1     96
41.3      20.9
40.4      21.1
41.4      21
40        22
102.5     49.6
49.5      27.1
72.3      37.6
73.8      37.1

Low Latency S-box

• Use small S-boxes (e.g. 5-bit, 4-bit, 3-bit)

• Almost everything follows the normal distribution. So does the S-box!

choose me!

slide credit: Gregor Leander

Low Latency Encryption: Number of Rounds

Minimize!

Low Latency Encryption: Round Complexity

• Not too low complexity.

• Reduce the number of rounds at the cost of a (slightly) heavier round.

Low Latency Encryption: Key Schedule

• Number of rounds should be independent of the key schedule.

• Use constant addition instead of a key schedule (if possible).

Low Latency Encryption: Encryption vs Decryption

• Use involution where possible: $f(f(x)) = x$.

• Make Encryption and Decryption procedures similar.

• BUT: think application oriented – sometimes it is beneficial to have asymmetric constructions.
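The involution property is easy to check on a lookup table. The sketch below builds a toy 4-bit table from disjoint swaps; the table is purely illustrative, has not been vetted cryptographically, and is not the S-box of any cipher named in this talk.

```python
def involution_from_swaps(size: int, swaps: list[tuple[int, int]]) -> list[int]:
    """Build a lookup table that exchanges each listed pair and fixes the rest."""
    table = list(range(size))
    for a, b in swaps:
        table[a], table[b] = b, a
    return table


def is_involution(table: list[int]) -> bool:
    """True if applying the table twice returns every input: f(f(x)) = x."""
    return all(table[table[x]] == x for x in range(len(table)))


# Toy 4-bit example: eight disjoint pairs covering all 16 values (hypothetical).
TOY_SWAPS = [(0, 11), (1, 6), (2, 13), (3, 8), (4, 15), (5, 10), (7, 12), (9, 14)]
toy_sbox = involution_from_swaps(16, TOY_SWAPS)

# A rotation by one is a permutation but not an involution.
rotation = [(x + 1) % 16 for x in range(16)]

print("toy table is an involution:", is_involution(toy_sbox))
print("rotation is an involution: ", is_involution(rotation))
```

When a component is an involution, the same circuit serves encryption and decryption, so no separate inverse has to be implemented.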

Low Latency Encryption: Meet PRINCE

α-reflection property:
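For reference, the α-reflection property as formulated in the PRINCE design paper can be written as below. The notation is an assumption here: $E$ and $D$ denote encryption and decryption, $(k_0 \| k_0' \| k_1)$ is the expanded key, and $\alpha$ is a fixed public constant.

```latex
D_{(k_0 \,\|\, k_0' \,\|\, k_1)}(\cdot) \;=\; E_{(k_0' \,\|\, k_0 \,\|\, k_1 \oplus \alpha)}(\cdot)
```

In other words, decryption is encryption under a related key, so one datapath can serve both directions at the cost of swapping the whitening keys and one XOR with $\alpha$.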

Low Latency Encryption: Meet PRINCE

LATENCY [NS]

Existing ciphers: 46.6, 31.2, 25.3, 20.3, 17.8, 15.5, 15.3, 14.8, 14.8, 14.7, 9.9, 9.8, 9.8, 9.8

PRINCE: 8

Low Latency Encryption: Meet PRINCE

AREA [KGE]

Existing ciphers: 366.6, 193.1, 128.7, 102.5, 79.9, 73.8, 72.3, 63.7, 49.5, 48.2, 41.4, 41.3, 40.4, 40

PRINCE: 17

Trade-offs in Crypto HW

Energy

Security Power

Performance Silicon area

Power vs Energy

Energy

Power

Power vs Energy

Energy =

Power =

Every mW matters!

Total number of mobile devices in 2015 = 8.6 billion*

Average (regular) power consumption of a smartphone = 160 mW**

Total cost of the energy spent = €2.5 billion*** a year!

* Mobile Statistics Report 2014-2018, The Radicati Group Inc.

** An Analysis of Power Consumption in a Smartphone, A Carroll, G Heiser, USENIX 2010.

*** average electricity price in 2014 in EU was €0.208 per kWh.
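The €2.5 billion figure follows from the three numbers on the slide. The short calculation below is a sketch that assumes every device draws the average power continuously for a full 365-day year, which is what the slide's result implies.

```python
HOURS_PER_YEAR = 24 * 365

devices = 8.6e9            # mobile devices in 2015 (from the slide)
avg_power_w = 0.160        # average smartphone power, 160 mW (from the slide)
price_eur_per_kwh = 0.208  # average EU electricity price in 2014 (from the slide)

# Energy = power x time; assumes continuous operation all year.
energy_kwh = devices * avg_power_w * HOURS_PER_YEAR / 1000.0
cost_eur = energy_kwh * price_eur_per_kwh

print(f"energy per year: {energy_kwh / 1e9:.1f} TWh")
print(f"cost per year:   {cost_eur / 1e9:.1f} billion EUR")
```

By the same arithmetic, saving a single milliwatt per device is worth 1/160 of the yearly total.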

Reducing Power Consumption

$P_{tot} = P_{switching} + P_{leakage}$

$P_{switching} \approx C_{eff} \cdot V_{DD}^2 \cdot f_{clk} \cdot sw$

• Reduce circuit area (e.g. serializing): $C_{eff}$ ↓

• Reduce switching activity (e.g. clock gating): $sw$ ↓

• Move to smaller CMOS technologies: $C_{eff}$ ↓, $V_{DD}$ ↓, but $P_{leakage}$ ↑

• Reduce the operating clock frequency: $f_{clk}$ ↓
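The switching-power formula also shows why lowering power does not automatically lower energy. In the sketch below all circuit parameters are hypothetical, leakage is ignored, and the energy per operation is taken as power multiplied by the time needed for a fixed number of clock cycles.

```python
def switching_power(c_eff: float, v_dd: float, f_clk: float, sw: float) -> float:
    """Dynamic power in watts: C_eff * V_DD^2 * f_clk * sw."""
    return c_eff * v_dd ** 2 * f_clk * sw


def energy_per_operation(power_w: float, cycles: int, f_clk: float) -> float:
    """Energy in joules for one operation that takes `cycles` clock cycles."""
    return power_w * cycles / f_clk


# Hypothetical circuit parameters, chosen only for illustration.
C_EFF = 1e-12   # effective switched capacitance in farads
V_DD = 1.2      # supply voltage in volts
SW = 0.1        # switching activity factor
CYCLES = 254    # clock cycles per operation

p_fast = switching_power(C_EFF, V_DD, 100e3, SW)
p_slow = switching_power(C_EFF, V_DD, 50e3, SW)
e_fast = energy_per_operation(p_fast, CYCLES, 100e3)
e_slow = energy_per_operation(p_slow, CYCLES, 50e3)

print(f"power  at 100 kHz: {p_fast:.3e} W, at 50 kHz: {p_slow:.3e} W")
print(f"energy at 100 kHz: {e_fast:.3e} J, at 50 kHz: {e_slow:.3e} J")
```

Halving the clock halves the switching power, but the operation takes twice as long, so the switching energy per operation stays the same in this simplified model.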

Known Throughput Optimization Techniques and their Impact* on Power and Energy

          serialization   parallelism   unrolling   pipelining
Power
Energy

* in the context of symmetric block cipher design

Trade-offs in Crypto HW

Energy

Security Power

Performance Silicon area

Designing the Most Secure Block Cipher :)

Security Power

Designing the Most Secure Block Cipher :)

• Cryptanalysis

• Side Channel Analysis

• Fault Attacks

Designing the Most Secure Block Cipher :)

• Side Channel Analysis: templates, DPA, CPA, SPA

• Fault Attacks: DFA, safe-error attacks

Challenges with SCA Countermeasures

INCOMPLETE MODELS

• Circuit models

• Adversary models

• 1st vs higher order DPA

Challenges with FA Countermeasures

LACK OF CREATIVITY

• Redundant executions

• Dummy operations

• Light sensors

THANK YOU!

Thanks to the teams of KATAN, SPONGENT, PRINCE, FIDES
