aes and other secret key implementations...ingrid verbauwhede, k.u.leuven - cosic 1 kul - cosic...

35
Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid Verbauwhede ingrid.verbauwhede-at-esat.kuleuven.be K.U.Leuven, ESAT- SCD - COSIC Computer Security and Industrial Cryptography Acknowledgements: Current and former Ph.D. students at UCLA and K.U.Leuven KUL - COSIC ECRYPT Summer School - 2 Albena, May 2011 Outline & Goal Crypto engineering for secret key algorithms Design parameters – DES Modes of operation – AES Light weight crypto

Upload: others

Post on 08-Mar-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 1

KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011

AES and other secret key

implementations

Ingrid Verbauwhede

ingrid.verbauwhede-at-esat.kuleuven.be

K.U.Leuven, ESAT- SCD - COSIC

Computer Security and Industrial Cryptography

Acknowledgements:

Current and former Ph.D. students

at UCLA and K.U.Leuven

KUL - COSIC ECRYPT Summer School - 2 Albena, May 2011

Outline & Goal

• Crypto engineering for secret key algorithms

– Design parameters

– DES

– Modes of operation

– AES

– Light weight crypto

Page 2: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 2

KUL - COSIC ECRYPT Summer School - 3 Albena, May 2011

Design Parameters

Embedded security:

Area, delay, power, energy

KUL - COSIC ECRYPT Summer School - 4 Albena, May 2011

Crypto engineering everywhere

• Continuum between softwareand hardware– ASIC (microcode) – FPGA – fully

programmable processor

Everything is alwaysconnected everywhere

Page 3: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 3

KUL - COSIC ECRYPT Summer School - 5 Albena, May 2011

Embedded Security

NEED BOTH

• Efficient, light-weight Implementation– Within power, area, timing budgets

– Public key: 1024 bits RSA on 8 bit μC and 100 μW

– Public key on a passive RFID tag

• Trustworthy implementation– Resistant to attacks

– Active attacks: probing, power glitches, JTAG scan chain

– Passive attacks: side channel attacks, including power, timingand electromagnetic leaks

KUL - COSIC ECRYPT Summer School - 6 Albena, May 2011

Cost definition

• Area

• Time

• Power, Energy

• Physical Security

• NRE (Non Recurring Engineering) cost

Page 4: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 4

KUL - COSIC ECRYPT Summer School - 7 Albena, May 2011

Design parameters

• Speed or throughput:– HW: Gbits/sec or Mbits/sec/slice

– SW: Cycles/byte, independent of clock frequency

• Area:– HW: mm2 (gate or transistor count)

– SW: memory footprint

• Power or energy consumption:– Power (Watts) for cooling or transmission (RFID)

– Energy (Joule): battery operated devices

• Security: difficult to measure, but we want it– Entropy, leakage functions?

– Measurements until disclosure?

KUL - COSIC ECRYPT Summer School - 8 Albena, May 2011

Throughput: Real-time

• Extremely high throughput (Radar or fiber optics)

• One operator (= hardware unit, e.g. adder, shifter, register)

• for each operation (= algorithmic, e.g. addition, multiplication, delay)

clock frequency = sample frequency

• Most designs: time multiplexing

clock frequency = sample frequency

clock frequency

sample frequency = number of clock cycles available for the job

Page 5: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 5

KUL - COSIC ECRYPT Summer School - 9 Albena, May 2011

SW: cycles per byte

• “independent” of

clock frequency

or machine

• Size of packet

matters

• “match” of

algorithm to

architecture

[Source: http://bench.cr.yp.to/results-sha3.html]

Size (bytes)8 64 4096

40 cyc

les/

byte

Cycles/byte

KUL - COSIC ECRYPT Summer School - 10 Albena, May 2011

Power density problem

• Intel S. Borkar power density problem

[Author: S. Borkar, Intel]

Cooling!!

Page 6: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 6

KUL - COSIC ECRYPT Summer School - 11 Albena, May 2011

Low Energy: battery capacity

• Rabaey slide battery capacity

One AAA battery: 1300 to 5000 Joule

KUL - COSIC ECRYPT Summer School - 12 Albena, May 2011

Power and Energy are not the same!

• Power = P = I x V (current x voltage) (= Watt)

– instantaneous

– Typically checked for cooling or for peak

performance

• Energy = Power x execution time (= Joule)

– Battery content is expressed in Joules

– Gives idea of how much Joules to get the job done

Low power processor low energy solution !

Page 7: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 7

KUL - COSIC ECRYPT Summer School - 13 Albena, May 2011

Heat and parallelism

memory processor

M P

C Pmono = CV2f (Watt)

Power

(Heat)

C/4 C/4 C/4 C/4

M/4 P/4 M/4 P/4 M/4 P/4 M/4 P/44 (C/4)V2(f/4) = Pmono/4

but since f ~ V

can be even Pmono/43

Reduce power = reduce WASTE !!

TREND: MULTI-CORE!!

KUL - COSIC ECRYPT Summer School - 14 Albena, May 2011

Intermezzo: standard cell based design

Page 8: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 8

KUL - COSIC ECRYPT Summer School - 15 Albena, May 2011

Logic Design Activities

• Logic and FSM synthesis– State minim., coding

– Multilevel Logic Optimisation

• Technology Mapping– Functions to library cells

– Minimal Area for given delay

• Timing Verification– Estimate wiring load C

– Critical logic path

• Layout– P&R C extraction from wiring ...

Delay

Area

! !aoi ff

Extraction-> Timing

Timing

Closure

2 6... Logic

Depth

#literals

VHDL

Logic

Synthesis

(Synopsys)

KUL - COSIC ECRYPT Summer School - 16 Albena, May 2011

Standard Cell Layout

Std. Cell

Std. Cell Place & Route (RT-Module)

Routing Channel

Cell Row

(Courtesy : Tanner Tools)

Page 9: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 9

KUL - COSIC ECRYPT Summer School - 17 Albena, May 2011

Standard Cell Zoom In

layout

vdd

vss

KUL - COSIC ECRYPT Summer School - 18 Albena, May 2011

Module Generation

For data-path operators: structure is in bit-slices

Computer generated layout as function of wordlength

Compact, predictable IP

Power

Instruction, Clock

Data

Page 10: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 10

KUL - COSIC ECRYPT Summer School - 19 Albena, May 2011

Standard Cell and Module

Courtesy: J. Van Campenhout RUG

DatapathStandard Cell

Random Logic

KUL - COSIC ECRYPT Summer School - 20 Albena, May 2011

Start with easy one:

Block cipher - DES

Page 11: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 11

KUL - COSIC ECRYPT Summer School - 21 Albena, May 2011

Symmetric key: DES

• DES = Data Encryption Standard

• FIPS Standard 46 effective in July 1977: US government standardfor sensitive but unclassified data

• Re-affirmed in 1983, 1988, 1993, 1999 (FIPS 46-3)

• July 26, 2004: FIPS 46-3 is withdrawn: use TDEA or AES

• TDEA = Triple DES encryption algorithm – NIST 800-67

DESPlaintext (Pi) Ciphertext (Ci)

Key = 56 bits + 8 parity bits

64 64

64

KUL - COSIC ECRYPT Summer School - 22 Albena, May 2011

TDEA

• Triple DES Encryption Algorithm, NIST Spec. Pub. 800-67 (May 2004)

• Three Key options:– K1, K2, K3 different

– K1=K3, K2 different

– K1=K2=K3, backward compatible with single DES

• two-key triple DES: until 2009

• three-key triple DES: until 2030

DES

Plaintext (Pi)

Ciphertext (Ci)

64 64

64

DES-1 DES

K164

K264

K3

Page 12: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 12

KUL - COSIC ECRYPT Summer School - 23 Albena, May 2011

DES = Feistel cipher

+ f

Li-1 Ri-1

Li Ri

Encryption round i

+ f

Li-1 Ri-1

Li Ri

Decryption round i

Ki Ki

• DES has 16 rounds + initial and final permutation

• Basic cipher structure is Feistel cipher– other examples of Feistel: IDEA, FEAL, Kasumi

• Hardware: encryption = decryption!

(different for AES)

KUL - COSIC ECRYPT Summer School - 24 Albena, May 2011

DES- f function

32b-to-48b permutation

(wiring & bit duplication)

input of S-boxes: 8x6b

Si = 6b-to-4b non linear

substitution (ROM or logic

based Look up table)

output of S-boxes: 8x4b

Expansion E

+

32 Ri-1 Ki48

48

S1 S2 S3 S4 S5 S6 S7 S8

Permutation P

32

32

f(Ri-1, Ki)

• Because of Feistel: no need for f -1 (different for AES)

32b-to32b permutation

(wiring)

Page 13: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 13

KUL - COSIC ECRYPT Summer School - 25 Albena, May 2011

DES Key schedule

PC1

C D

56

64

PC2

48

56

PC1: permute and drop 8 bits

C&D: rotate left 1 or 2

bits each round

DECRYPTION: rotate right

PC2: permute and select 48

output bits

Initial key K

Round Key Ki

C&D left/right shift registers: encryption & decryption HW

KUL - COSIC ECRYPT Summer School - 26 Albena, May 2011

Key Schedule

Two options:

• On the “fly” = just in time processing

• Pre-compute and store

MemoryBC

Key

Schedule

Key

Schedule

BCTypical for Hardware

Typical for Software

Page 14: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 14

KUL - COSIC ECRYPT Summer School - 27 Albena, May 2011

Key schedule on the fly

• The cost of fast key

context switching:

• Example for IPSEC

router

– one 128 bit key = 1408

bits round keys (10 rounds

+ initial key)

– half of internet packets are

only 64 bytes in length

(512 bits)10 102 103 104 105

0

2

4

6

8

10

Record Size (bytes)

ARC4AES3DES

Co

nte

xt

ba

nd

wid

th (

Gb

ps

) Data at 1Gbps

[source: J. Goodman]

BANDWIDTH PROBLEM !

KUL - COSIC ECRYPT Summer School - 28 Albena, May 2011

Modes of operation

Page 15: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 15

KUL - COSIC ECRYPT Summer School - 29 Albena, May 2011

Design method

• Advice: include modes of operation into

hardware IP module or co-processor:

- increases the complexity somewhat: more

control or instructions are needed

+ CLEAN security partitioning

+ reduces communication overhead and traffic

KUL - COSIC ECRYPT Summer School - 30 Albena, May 2011

Modes of operation: ECB

• ECB = Electronic code book

• cipher blocks are independent, thus insertion ordeletion of blocks can go undetected

• block cipher does not hide data patterns

• BC = block cipher (e.g. 3DES or AES)

BC BC-1

Message M Ciphertext C Plaintext M

Key K K

Page 16: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 16

KUL - COSIC ECRYPT Summer School - 31 Albena, May 2011

Modes of operation: CBC

• CBC = Cipher block chaining

• error in Ci: propagation over 2 blocks (Ri and Ri+1)

• loss of block synchronization = fatal

• error in Pi: propagation for the remaining blocks

• mostly used with encryption-only for MAC generation

BC+Pi

Ci-1 CiBC-1 +

Ci-1

Ri

KUL - COSIC ECRYPT Summer School - 32 Albena, May 2011

Modes of operation: CBC-MAC

• Cipher block chaining – Message Authentication Code

• Initialization Vector: IV = C-1

• Feedback inhibits pipelining

BC+

Pi

Ci-1

BC+

Ci-1

Pi

Page 17: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 17

KUL - COSIC ECRYPT Summer School - 33 Albena, May 2011

Pipelining?

• Due to feedback pipeline remains empty

• Worse for triple DES

DES

+

Ci-1

Pi

16 xPi

DES

+

Ci-1

Pi

DES

+

Ci-1

Pi

48 xPi

48 rounds:

16 DES(K1)

16 DES-1(K2)

16 DES (K3)

KUL - COSIC ECRYPT Summer School - 34 Albena, May 2011

Modes of operation: counter

• Converts block cipher into stream cipher

• no feedback: pipelining is possible

• crucial to choose non-repeating counter functions, e.g. LFSR

• crucial to choose counter IV’s that are UNIQUE

BC

Pi

yi

Ci

yi

BC

Pi

cntri cntri

Page 18: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 18

KUL - COSIC ECRYPT Summer School - 35 Albena, May 2011

Modes of operation: OCB

• Offset code book – proprietary

• (used to be) popular because option in 802.11 WLAN

• can be pipelined

• need encryption & decryption logic (problem for AES)

BC+Pi

ZiCi

+

Zi

BC-1+Pi

Zi

+

Zi

+ extraschecksum + extras

checksum

KUL - COSIC ECRYPT Summer School - 36 Albena, May 2011

CCM (Counter + CBC MAC) mode

• Encryption & MAC creation (802.11 WLAN)

MIC = Message Integrity Check is same as MAC

Clear text frame

FC Dur A1 A2 A3 A4SC QC PC Data MIC

AES_E(K) AES_E(K) AES_E(K)AES_E(K)AES_E(K)

CBC-MAC

AES_E(K)AES_E(K)

FC Dur A1 A2 A3 A4SC QC PC Data MIC

Pl(2)Pl(1)

Counter preload

Transmitted

encrypted frame

IV

AES_E(K)

FCS

0 padded0 padded

Flag Nonce Dlen

Flag Nonce Cnt

Hlen

AES_E(K)

Pl(C)

AES_E(K)

Pl(0)

Page 19: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 19

KUL - COSIC ECRYPT Summer School - 37 Albena, May 2011

Block cipher modes of operation

• Conclusion: most practical applications ONLY use encryption. Important for:

– AES because encryption is more efficient than decryption (non-Feistel)

– area constraint applications (e.g. IEEE 802.11)

Privacy & MAC

Privacy & MAC

Yes

Only CTR

Dec

2Enc

Enc

2Enc

OCB

CCM

YesEncEncCTR

NoEncEncOFB

NoEncEncCFB

Message

authentication

NoEncEncCBC-MAC

NoDecEncCBC

not usedYesDecEncECB

NotesPipeliningReceiverSender

KUL - COSIC ECRYPT Summer School - 38 Albena, May 2011

Block cipher AES

Hardware

Page 20: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 20

KUL - COSIC ECRYPT Summer School - 39 Albena, May 2011

AES: Byte substitution

• Byte substitution: each byte individual

• 16 identical Sboxes

• Area - time trade-off: HW multiplexing

• 32 for Rijndael

a2 a6 a10 a14

a0 a4 a8 a12

a1 a5 a9 a13

a3 a7 a11 a15

b0 b4 b8 b12

b1 b5 b9 b13

b2 b6 b10 b14

b3 b7 b11 b15

GF(28)-1

Permute

ai bi

KUL - COSIC ECRYPT Summer School - 40 Albena, May 2011

AES: Shiftrow

• Shiftrow: circularly rotate each row of state array

• Easy wiring

a2 a6 a10 a14

a0 a4 a8 a12

a1 a5 a9 a13

a3 a7 a11 a15

a10 a14 a2 a6

a0 a4 a8 a12

a5 a9 a13 a1

a15 a3 a7 a11

Shiftrow

Page 21: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 21

KUL - COSIC ECRYPT Summer School - 41 Albena, May 2011

a6 a5 a4 a3 a2 a1 a0 0

0 0 0 a7 a7 0 a7 a7

b7 b6 b5 b4 b3 b2 b1 b0

AES: mix column

• matrix multiplication of state array columns

– multiply with constant entries

a2 a6 a10 a14

a0 a4 a8 a12

a1 a5 a9 a13

a3 a7 a11 a15

b0 b4 b8 b12

b1 b5 b9 b13

b2 b6 b10 b14

b3 b7 b11 b15

bi

bi+1

bi+2

bi+3

ai

ai+1

ai+2

ai+3

2 3 1 1

1 2 3 1

1 1 2 3

3 1 1 2

=

2 x

+

a7 a6 a5 a4 a3 a2 a1 a0a6 a5 a4 a3 a2 a1 a0 0

0 0 0 a7 a7 0 a7 a7

b7 b6 b5 b4 b3 b2 b1 b0

+

3 x

KUL - COSIC ECRYPT Summer School - 42 Albena, May 2011

Mix column - encryption

GF(B x 2)

GF(B x 3)

+

G(x)

00011011

<< 1carry

01

GF(B x 1)

+

GF(B)

8

a

b

c

d

02 03 01 01

01 02 03 01

01 01 02 03

03 01 01 02

Mix Column Operation is

GF(28) Linear Transform

+

<< 10

+++

Page 22: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 22

KUL - COSIC ECRYPT Summer School - 43 Albena, May 2011

Key schedule

• Unit is 32 bit words, W[i] = 32 bit = 1 column

• 4 different operations

• One round key is four W[i]’s

Input key

W[i-Nk] ^ W[i-1] W[i-Nk] ^ ByteSub(RotByte(W[i-1]))^ Rcon[i/Nk]

128b

W[i-Nk] ^ ByteSub(W[i-1])

0 1 2 3 4 5 6 7 8 9 10

0 1 2 3 4 5 6 7 8 9 10 11 12

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

192b

256b

KUL - COSIC ECRYPT Summer School - 44 Albena, May 2011

Key schedule

• Encryption key

– HW: on the fly: round key[i] = function (round key[i-1])

– SW: pre-compute and store in context (176, 208, 240Bytes)

• Decryption key

– encryption key in reverse order

– BUT need final round words to start

W[15] = D0 = C1^D1

W[14] = C0 = B1^C1

W[13] = B0 = A1^B1

W[12] = A0 = f(C1^D1)^A1

W[16] = A1 = f(D0)^A0

W[17] = B1 = f(D0)^A0^B0

W[18] = C1 = f(D0)^A0^B0^C0

W[18] = D1 = f(D0)^A0^B0^C0^D0

128 bit encrypt 128 bit decrypt

Initial Round Words

Initial Round Words

Final Round Words

Final Round Words

Page 23: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 23

KUL - COSIC ECRYPT Summer School - 45 Albena, May 2011

Combined AES architecture

• AES is not

Feistel

• Every operation

has its inverseMixColumn

ShiftRow

ByteSub

KeyAdd

KeyAdd

ShiftRow

ByteSub

KeyAdd

Ki

K0

KNr

Input Data

Output Data

Encryption

MixColumn-1

ShiftRow-1

ByteSub-1

KeyAdd-1

KeyAdd-1

ShiftRow-1

ByteSub-1

KeyAdd-1

Ki

K0

KNr

Output Data

Input Data

Decryption

KUL - COSIC ECRYPT Summer School - 46 Albena, May 2011

Decryption datapath

MixColumn-1

ShiftRow-1

ByteSub-1

KeyAdd-1

KeyAdd-1

ShiftRow-1

ByteSub-1

KeyAdd-1

Ki

K0

KNr

Output Data

Input Data

Decryption

MixColumn-1

ByteSub-1

ShiftRow-1

KeyAdd

KeyAdd

ByteSub-1

ShiftRow-1

KeyAdd

Ki

K0

KNr

Output Data

Input Data

Decryption

KeyAdd

ShiftRow-1

ByteSub-1

KeyAdd

MixColumn-1

ShiftRow-1

ByteSub-1

KeyAdd

Ki

KNr

K0

Output Data

Input Data

Decryption

MixColumn-1

ByteSub-1

ShiftRow-1

KeyAdd

KeyAdd

ByteSub-1

ShiftRow-1

KeyAdd

Ki

K0

KNr

Output Data

Input Data

Decryption

Reorganize

Key addition

is its own

inverse

switch

shiftrow and

bytesub

Flip

upside

down

Page 24: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 24

KUL - COSIC ECRYPT Summer School - 47 Albena, May 2011

Combined architecture

MixColumn

ShiftRow

ByteSub

KeyAdd

KeyAdd

ShiftRow

ByteSub

KeyAdd

Ki

K0

KNr

Input Data

Output Data

Encryption

KeyAdd

ShiftRow-1

ByteSub-1

KeyAdd

MixColumn-1

ShiftRow-1

ByteSub-1

KeyAdd

Ki

KNr

K0

Output Data

Input Data

Decryption

• Does not followcompletely Rijndaelproposal (which suggeststo switch KeyAdd andMixColumn) because itrequires a InvMixColumnto be applied to theroundkey.

KUL - COSIC ECRYPT Summer School - 48 Albena, May 2011

Combined architecture

• Provides encryption &

decryption

• Key addition is its

own inverse

ByteSub

ShiftRow

enc 10

MixColumn

10

01

KeyAdd

01

enc

enc

enc·first

first

regSel

Output Data

Input Data

Combined

enc

Page 25: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 25

KUL - COSIC ECRYPT Summer School - 49 Albena, May 2011

Sub modules

GF(28)

-1

permperm

-1

1

0 0

1

enc

In

Out

ByteSub

0

1

0

1

Out[3:0]

Out[7:4]

Out[11:8]

Out[15:12]

In[3:0]

In[4,7,6,5]

In[6,5,4,7]

In[11:8]

In[12,15,14,13]

In[14,13,12,15]

enc

ShiftRow

2 3 1 1

1 2 3 1

1 1 2 3

3 1 1 2

·Incol+enc·

c 8 c 8

8 c 8 c

c 8 c 8

8 c 8 c

·IncolOutcol=

MixColumn

Reason that

decryption is slower

KUL - COSIC ECRYPT Summer School - 50 Albena, May 2011

Sbox optimization

• GF(28)-1 requires large Look up tables

• Map to isomorphic fields, GF((24)2) or GF(((22)2)2) andinvert there

• smaller but slower!

()2 ()-1p0

A

A-1

al

ah

GF(24)

GF(28)

GF(28)

Page 26: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 26

KUL - COSIC ECRYPT Summer School - 51 Albena, May 2011

Sbox experiment• 0.18 μm CMOS, Synopsys experiment

• size of 1 Sbox, push for area or for speed

0

200

400

600

800

1000

1200

1400

1600

1800

2000

0 2 4 6 8

Latency (ns)

Are

a (

ga

tes

)

Direct Implementation Wolkerstorfer

GF(28)

GF((24)2)

KUL - COSIC ECRYPT Summer School - 52 Albena, May 2011

Compact SBOX

• GF(((22)2)2) instead of GF(28)

• Reduces the gate count to only 280 gates!

[size depends on the choice of ]

^2

^-1

4

4 4

4

88

GF(28)

GF(((22)2)2)

GF(((22)2)2)

GF(28)

[Mentens et al., RSA 2005]

Page 27: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 27

KUL - COSIC ECRYPT Summer School - 53 Albena, May 2011

Ballpark numbers

• 1 gate = 2input NAND gate = 4 transistors

• Sbox size:– GF(28)-1 = 650 to 700 gates

– GF((24)2)-1 = 400 gates ([Wol02])

– GF(((22)2)2)-1 = 280 gates ( [Sat01][Men05])

– but 50 to 100% slower

• AES core encryption only: 20K to 25Kgates– 128bit data, 128 bit key

– key schedule on the fly

– 1 clock cycle per round

• AES core for encryption and decryption: 40Kgates– 128 bit data, 128 bit key

– precompute and store round keys: 128x11bits SRAM

– 1 clock cycle per round

– savings in combining logic, losses in multiplexers!

KUL - COSIC ECRYPT Summer School - 54 Albena, May 2011

Extensions to Rijndael• Original Rijndael submission

– 128, 192 or 256 bits data (limited to 128bit in AES standard)

– 128, 192 or 256 bits key (unchanged in AES standard)

• Tricky part in key schedule: 2 loops in parallel

ciphertext

N-1 rounds

plaintext

m

m

m128

192

256{ k

128

192

256{

keyaddsubstitution

shiftrowmixcolumn

substitutionshiftrowkeyadd

keyschedule

key

k

L rounds

k

m

final round

roundkey

Page 28: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 28

KUL - COSIC ECRYPT Summer School - 55 Albena, May 2011

Key schedule for Rijndael

Cycle 1

Roundkey

Iterated Key

keysched1 keysched1

Cycle 2 Cycle 3

Roundkey

Iterated Key

keysched1 keysched2 keysched1

Cycle 1 Cycle 2

data=128

key=192

data=192

key=128

KUL - COSIC ECRYPT Summer School - 56 Albena, May 2011

Key schedule for Rijndael

• Key schedule on the

fly: one clockcycle

per round

seed

keysched1 keysched2

P C N

roundkey

data encrypt

Key Schedulecontroller

iteratedkey

roundkey assembly

m

k

m,k

m

Page 29: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 29

KUL - COSIC ECRYPT Summer School - 57 Albena, May 2011

• Rijndael

• Enc + Dec

• 0.18um CMOS

• Standard cells

KUL - COSIC ECRYPT Summer School - 58 Albena, May 2011

• AES, 2nd generation

• Regular & WDDL based implementation

• Standard cells

• 0.18 um CMOS

Page 30: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 30

KUL - COSIC ECRYPT Summer School - 59 Albena, May 2011

SW implementations

Illustrate with DES

• Option 1: operation by operation

• Option 2: table look-up

• Option 3: bit-slices

KUL - COSIC ECRYPT Summer School - 60 Albena, May 2011

SW: DES- f function

32b-to-48b perm/expansion

Combined with permutation P:

one Look-Up-Table

Wordwise EXOR: efficient

Sbox: Look-up Table

Expansion E

+

32 Ri-1 Ki48

48

S1 S2 S3 S4 S5 S6 S7 S8

Permutation P

32

32

f(Ri-1, Ki)

• DES: made for HW, inefficient in SW

Page 31: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 31

KUL - COSIC ECRYPT Summer School - 61 Albena, May 2011

Software SBOX

KUL - COSIC ECRYPT Summer School - 62 Albena, May 2011

SBOX in SW

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

Page 32: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 32

KUL - COSIC ECRYPT Summer School - 63 Albena, May 2011

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

Bit Slicing: Alternative Data RepresentationBit Slicing: Alternative Data Representation

• Introduced by Biham, 1997

• each register contains 1 bit of, e.g.,

32 blocks

• pipelining (=parallelization) of n

encryptions

bit 1, block n bit 1, block 1

bit 2, block 1 bit 2, block n

Block 1

Block n

Block 2

64

bit 64, block 1 bit 64 , block n

register (n=16, 32, 64)• CPU can be viewed as

16/32/64 one-bit parallel

processors

• CPU acts as SIMD

(Single-instruction

multiple-data) processor

Encryption with Bit Slicing (1): PermutationsEncryption with Bit Slicing (1): Permutations

bit 1, block n bit 1, block 1

bit 2, block 1 bit 2, block n

bit 64, block 1 bit 64 , block n

bit 1 , block n

bit 2, block n bit 2, block 1

bit 64, block 1 bit 64, block n

bit 1, block 1

Bit permuation is realized

by re-ordering of registers

(in practice: re-ordering of

pointers)

Ex: 64-64 bit permutation:

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

Page 33: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 33

KUL - COSIC ECRYPT Summer School - 65 Albena, May 2011

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

Encryption with Bit Slicing (2): S-BoxEncryption with Bit Slicing (2): S-Box

S-box

a b c d e f

O1 O2 O3 O4

Each output can be expressed as Boolean function (i.e., function

on bits)

O1 = f1(a, b, c, d, e, f)

O4 = f4(a, b, c, d, e, f)

On average, each S-box requires about 100 gates.

KUL - COSIC ECRYPT Summer School - 66 Albena, May 2011

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

Encryption with Bit Slicing (2): S-BoxEncryption with Bit Slicing (2): S-Box

bit 48 , block n

bit 1, block n bit 1, block 1

bit 2, block 1 bit 2, block n

bit 48, block 1

Ex:

Reg 8 AND Reg 17

OR Reg 44

NOR …

S-box

a b c d e f

O1 O2 O3 O4

Idea: Compute S-box Si for many blocks in parallel!

i.e., realize functions

O1 = f1(a, b, c, d, e, f)

O4 = f4(a, b, c, d, e, f)

for 64 (or 32 or 16) blocks of data in parallel!

Page 34: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 34

KUL - COSIC ECRYPT Summer School - 67 Albena, May 2011

Christof Paar, MEAD Course „Cryptographic Engineering“, Sept. 2009

DES Bit Slicing: Performance on 64 bit CPUDES Bit Slicing: Performance on 64 bit CPU

8 S-Boxes, 64 blocks parallel: 100 x 8 = 800 instructions

total (incl. load/store & conversion of I/O data)

19000 instr. / 64 blocks

300 instr. / block

4-5 instr. / bit

Throughput on 300 MHz Alpha

bit sliced: 137 Mbit/sec

optimized non-bit sliced: 46 Mbit/sec

ONLY COMPATIBLE WITH PIPELINE TYPE MODES OF OPERATION!!

Further reading: [Biham 97]

KUL - COSIC ECRYPT Summer School - 68 Albena, May 2011

[1] Amphion CS5230 on Virtex2 + Xilinx Virtex2 Power Estimator

[2] Dag Arne Osvik: 544 cycles AES – ECB on StrongArm SA-1110

[3] Helger Lipmaa PIII assembly handcoded + Intel Pentium III (1.13 GHz) Datasheet

[4] gcc, 1 mW/MHz @ 120 Mhz Sparc – assumes 0.25 u CMOS

[5] Java on KVM (Sun J2ME, non-JIT) on 1 mW/MHz @ 120 MHz Sparc – assumes 0.25 u CMOS

648 Mbits/secAsm

Pentium III [3] 41.4 W 0.015 (1/800)

Java [5] Emb.

Sparc 450 bits/sec 120 mW 0.0000037

(1/3.000.000)

C Emb. Sparc [4]133 Kbits/sec 0.0011 (1/10.000)

350 mW

Power

1.32 Gbit/secFPGA [1]

11 (1/1)3.84 Gbits/sec0.18μm CMOS

Figure of Merit

(Gb/s/W)

ThroughputAES 128bit key

128bit data

490 mW 2.7 (1/4)

120 mW

Throughput – Energy numbers

ASM StrongARM

[2] 240 mW 0.13 (1/85)31 Mbit/sec

Page 35: AES and other secret key implementations...Ingrid Verbauwhede, K.U.Leuven - COSIC 1 KUL - COSIC ECRYPT Summer School - 1 Albena, May 2011 AES and other secret key implementations Ingrid

Ingrid Verbauwhede, K.U.Leuven - COSIC 35

KUL - COSIC ECRYPT Summer School - 69 Albena, May 2011

Conclusions

• Energy is not the same as power

• Feistel = hardware friendly!

• Modes of operation inhibit pipelining,parallelism

– in SW bitslicing = pipelining

• Need for light weight (ultra small)

• Need for high throughput (ultra fast)

• Need for low power/low energy