Hardware Implementations of AES

ECRYPT II AES day

October 18th, Bruges, Belgium

Stefan Mangard

Infineon Technologies, Munich, Germany

[email protected]

Outline

Requirements and Motivation

AES Components

AES Architectures

Physical Attacks

Summary

17.10.2012 Page 2 Copyright © Infineon Technologies 2012. All rights reserved.

PART I

Requirements and Motivation

Why Implement AES in Hardware?

Why AES?

We are celebrating the 10th anniversary of AES. However, Triple-DES is still around; the adoption took really long for some applications …

Why Hardware?

AES can be implemented efficiently in software on all processors

Hardware implementations are necessary only in case of very specific requirements

Classical Implementation Requirements and Optimization Goals

Throughput

Power/Energy

Maintenance, Flexibility

Area/Memory

Design

Reliability

Security

Scenarios for AES Hardware Implementations

High Throughput

There is a processor, but the processor is not fast enough (e.g. servers, disk encryption)

Low Area/Power

There is no processor that could be used for AES because it would need too much area or power (e.g. RFIDs)

Low Energy

There is a processor, but given the number of cryptographic operations that need to be performed, the battery lifetime would be too short when encrypting with the processor (e.g. sensor nodes)

Security

There is a processor, but the processor and the system are not secure enough to implement a cryptographic algorithm (e.g. embedded processors)

Implementation Requirements of AES in Terms of Security

Implementations of AES in any case need to protect the

Confidentiality of the key

Confidentiality of all intermediate values

Depending on the application also the following properties might be required:

Integrity of all intermediate values

Integrity of the key

Confidentiality and integrity of the plaintext

Integrity of the ciphertext

Summary of Security Requirements

In practice, implementing AES hardware means building a module whose internal datapath and input/output interface must be suited to handle confidential data and to protect its integrity

This makes AES hardware significantly different from functional units like a USB interface

[Figure: AES Module]

Threats to AES Implementations

Logical Attacks (done via the communication interface)

Buffer overflows

Code injection

Trojans

Debug and test interfaces

Physical Attacks (require physical access to the device)

Power analysis attacks

Fault attacks

Probing/Forcing

The Two Main Threat Scenarios

Secure Environment

Logical attacks

Examples: The classical Internet communication scenario, where the attacker does not have physical access to the device

Non-Secure Environment

Logical and physical attacks

Examples: all kinds of embedded devices, USB sticks, smart cards, RFID tags, …

In those scenarios where the resource limitations (power, energy, and area) are strongest, both logical and physical attacks typically need to be considered

Worst-Case Example of How Not to Integrate an AES Hardware Module

USB stick with hardware-based AES-256 encryption to protect the content of the stick

There is a password-based authentication, which is done on the PC

A successful password check yields a 32-byte value that is independent of the password!

Whenever this byte sequence is sent to the USB stick as a result of the authentication procedure, the stick grants full access (all you need is a debugger on the PC …)

PART II

Components

Preliminaries

AES supports three key lengths: AES-128, AES-192, AES-256

AES is a round-based block cipher

One AES round consists of four transformations

SubBytes, ShiftRows, MixColumns, AddRoundKey

The key is expanded, and each round is provided with a 128-bit round key

The round function is independent of the key length

The key expansion can be inverted easily

Overview

[Figure: block diagram. Blocks: AES Key Expansion, AES Data Path. Signals: Key, Expanded Key, Round keys, Plaintext, Ciphertext.]

Initial Remarks

Decryption can be done in two ways:

Inverse of all operations in reversed order

Inverse of all operations in the same order as in encryption, plus an extra InvMixColumns transformation

In hardware, inverting the sequence of the transformations is usually cheaper than implementing an extra InvMixColumns

The round function is easy to compute; hence, pre-computation of round keys usually does not make sense in hardware

An implementation that allows immediate switching between encryption and decryption needs to store two keys (the actual key and the expanded one for decryption)

Overview

[Figure: block diagram. Blocks: AES Key Expansion, AES Data Path. Signals: Key, Expanded Key, Round keys, Plaintext, Ciphertext. Additional labels: DEC, ENC.]

AES Data Path - The Round Function

[Figure: AES data path from the plaintext through the initial round and the following rounds. Block labels: AR (AddRoundKey), SB (SubBytes), SR (ShiftRows), MC (MixColumns).]

AES Data Path – Sbox

The SubBytes operation consists of 16 independent and identical 8-bit Sboxes

There are two options to implement such an Sbox

Lookup table: The 256 8-bit output values are stored in the hardware (not efficient for ASICs; efficient on FPGAs, where BRAMs are available)

“Calculation” of the Sbox: The Sbox corresponds to an inversion in GF(2^8) followed by an affine transformation; performing this computation is the standard approach for ASIC designs

[Figure: Sbox block mapping an input byte to an output byte.]
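The "calculation" option can be illustrated in software. The following is only a minimal sketch, not the circuit from the talk: it computes the inversion naively as x^254 and applies the affine transformation from FIPS-197, whereas ASIC designs decompose the inversion into smaller fields. The function names `gf_mul`, `gf_inv`, `affine`, and `sbox` are chosen here for illustration.

```python
def gf_mul(a, b):
    """Multiply a and b in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce by the AES polynomial
        b >>= 1
    return result


def gf_inv(a):
    """Invert a in GF(2^8) as a^254; by AES convention 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def affine(b):
    """AES affine transformation: b XOR its left rotations by 1..4, XOR 0x63."""
    out = b
    for shift in range(1, 5):
        out ^= ((b << shift) | (b >> (8 - shift))) & 0xFF
    return out ^ 0x63


def sbox(x):
    """Sbox = affine transformation applied to the field inverse."""
    return affine(gf_inv(x))


# Example from FIPS-197: the Sbox maps {53} to {ed}.
print(hex(sbox(0x53)))  # prints 0xed
```

A lookup-table implementation would simply store the 256 values that `sbox` returns.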

AES Data Path – Sbox

[Figure from [WOL02]]

AES Data Path – Sbox

The affine transformation and its inverse essentially correspond to 28 XORs

There have been numerous proposals on how to efficiently implement the inversion by using different bases for the decomposition into operations of GF(2^8)

The most compact Sbox implementation was proposed by Canright in [C05] using normal bases:

It requires about 800 GE for a circuit implementing the Sbox and the inverse Sbox

In detail: 94 XORs, 34 NANDs, 6 NORs, 2 inverters, 16 MUXes

AES Data Path – MixColumns

MixColumns maps four input bytes (one column) to four output bytes (one column)

Each column is considered as a polynomial with coefficients in GF(2^8)

The operation is defined as in [NIST01]
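For reference, FIPS-197 specifies MixColumns as the multiplication of each column polynomial s(x) by the fixed polynomial a(x) modulo x^4 + 1. The notation below (s_i for the input bytes, s'_i for the output bytes) follows the standard; coefficients in braces are hexadecimal elements of GF(2^8).

```latex
\[
a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}, \qquad
s'(x) = a(x) \cdot s(x) \bmod (x^4 + 1)
\]
\[
\begin{pmatrix} s'_0 \\ s'_1 \\ s'_2 \\ s'_3 \end{pmatrix}
=
\begin{pmatrix}
\{02\} & \{03\} & \{01\} & \{01\} \\
\{01\} & \{02\} & \{03\} & \{01\} \\
\{01\} & \{01\} & \{02\} & \{03\} \\
\{03\} & \{01\} & \{01\} & \{02\}
\end{pmatrix}
\begin{pmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{pmatrix}
\]
```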

AES Data Path – MixColumns

In hardware, one byte of the MixColumns output can be calculated with the circuit from [WOL01] (figure)

In principle, it is possible to re-use this hardware for each byte; however, this causes significant control overhead, so usually four parallel units are used (re-using common expressions)

In summary, MixColumns simply corresponds to about 200 XORs
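Why MixColumns reduces to XORs can be seen in a short software sketch. This is an illustration under the FIPS-197 definition, not the circuit from [WOL01]; the names `xtime` and `mix_column` are chosen here. The multiplication by {02} is a shift plus a conditional XOR, the multiplication by {03} is {02} plus one more XOR, and the sum of all four input bytes is a common expression shared by the four outputs.

```python
def xtime(b):
    """Multiply b by {02} in GF(2^8): shift left, reduce on overflow."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def mix_column(col):
    """Apply MixColumns to one column of four bytes."""
    a0, a1, a2, a3 = col
    t = a0 ^ a1 ^ a2 ^ a3  # common expression shared by all four outputs
    return [
        a0 ^ t ^ xtime(a0 ^ a1),  # {02}a0 ^ {03}a1 ^ a2 ^ a3
        a1 ^ t ^ xtime(a1 ^ a2),  # a0 ^ {02}a1 ^ {03}a2 ^ a3
        a2 ^ t ^ xtime(a2 ^ a3),  # a0 ^ a1 ^ {02}a2 ^ {03}a3
        a3 ^ t ^ xtime(a3 ^ a0),  # {03}a0 ^ a1 ^ a2 ^ {02}a3
    ]


# Widely used test column: db 13 53 45 -> 8e 4d a1 bc
print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```

Each of the four list entries corresponds to one of the four parallel units mentioned above.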

AES Key Expansion

The key expansion does not require any other building blocks than the data path

The key expansion essentially requires four Sbox computations and some XORs for each key expansion step

All the complexity for handling different key sizes needs to be handled in the key expansion unit (remark: AES-192 is not nice to implement)
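One key expansion step, and its inverse, can be sketched as follows for AES-128 only. This is a software illustration under the FIPS-197 definition, not a hardware description; the helper names (`build_sbox`, `next_round_key`, `prev_round_key`) are chosen here. Each step uses exactly four Sbox computations plus XORs, and the inverse step needs the same building blocks.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES Sbox table: inversion (x^254) plus affine transformation."""
    table = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(s ^ 0x63)
    return table


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def g(last_word, rnd):
    """RotWord, SubWord (four Sbox computations), and round constant."""
    t = [SBOX[last_word[1]], SBOX[last_word[2]],
         SBOX[last_word[3]], SBOX[last_word[0]]]
    t[0] ^= RCON[rnd - 1]
    return t


def next_round_key(rk, rnd):
    """Derive round key rnd (1..10) from round key rnd-1 (16 bytes)."""
    out = [a ^ b for a, b in zip(rk[0:4], g(rk[12:16], rnd))]
    for i in range(4, 16):
        out.append(rk[i] ^ out[i - 4])
    return out


def prev_round_key(rk, rnd):
    """Invert next_round_key: recover round key rnd-1 from round key rnd."""
    out = [0] * 16
    for i in range(15, 3, -1):
        out[i] = rk[i] ^ rk[i - 4]
    out[0:4] = [a ^ b for a, b in zip(rk[0:4], g(out[12:16], rnd))]
    return out


key = list(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))  # FIPS-197 example key
rk1 = next_round_key(key, 1)
print(bytes(rk1).hex())
print(prev_round_key(rk1, 1) == key)
```

Running the inverse step ten times from the last round key recovers the cipher key, which is why a decryption unit can start from the last round key and compute the remaining round keys on the fly.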

PART III

Architectures

Summary of What is Needed

Storage:

Datapath: 128 bit

Key Unit: 128 bit up to 512 bit

(512 bit are needed in implementations of AES-256 that allow immediate switching between encryption and decryption)

Computational operations that need to be done per round

20 Sbox operations (16 for SubBytes, 4 for the key expansion)

4 MixColumns operations

XOR operations for key addition, key expansion

Multiplexing for ShiftRows and data selection

The Four Options

SMALL (8 bit architecture)

1 Sbox, 1 MixColumns -> 20 cycles per round

MEDIUM (32 bit architecture)

4 Sboxes, 1 MixColumns -> 5 cycles per round

LARGE (128 bit architecture)

20 Sboxes, 4 MixColumns -> 1 cycle per round

XLARGE (unrolled 128 bit architecture)

200 Sboxes, 40 MixColumns -> 1/10 cycle per round
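The cycle counts of the four options follow from dividing the 20 Sbox operations per round by the number of Sbox instances. The sketch below only reproduces this arithmetic; it assumes that the Sboxes are the bottleneck and ignores extra cycles, for example for ShiftRows. The names are chosen for illustration.

```python
from fractions import Fraction

SBOX_OPS_PER_ROUND = 20  # 16 for SubBytes + 4 for the key expansion

# Number of Sbox instances per architecture, as listed above
ARCHITECTURES = {"SMALL": 1, "MEDIUM": 4, "LARGE": 20, "XLARGE": 200}


def cycles_per_round(num_sboxes):
    """Cycles per round if the Sbox operations are the limiting resource."""
    return Fraction(SBOX_OPS_PER_ROUND, num_sboxes)


for name, num_sboxes in ARCHITECTURES.items():
    print(f"{name:7s} {num_sboxes:4d} Sboxes -> {cycles_per_round(num_sboxes)} cycles per round")
```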

AES Data Path - The Round Function

[Figure: AES data path from the plaintext through the initial round and the following rounds. Block labels: AR (AddRoundKey), SB (SubBytes), SR (ShiftRows), MC (MixColumns).]

SMALL (8 bit Datapath)

In each cycle, there is an Sbox Operation

The Sbox lookups for the key are done in parallel to MixColumns

Two options for the key expansion

32 bit key calculation in the cycles 17, 18, 19, 20

8 bit key calculation in the cycles 17, 18, 19, 20, 21, ….

ShiftRows is done either in an extra cycle or by multiplexing

Size: 2,500 – 5,000 GE (depending on feature set)

[Figure: cycle-by-cycle schedule of the 8 bit data path (cycles 1 to 20). Block labels: SB, AR, MC.]

MEDIUM (32 bit Datapath)

In each cycle, a complete column of the state is processed

ShiftRows is done either in cycle 5 or by multiplexing

The key expansion is done 128 bit parallel in clock cycle 5

Size: 6,000 – 10,000 GE (depending on feature set)

[Figure: cycle-by-cycle schedule of the 32 bit data path (cycles 1 to 5). Block labels: AR, MC, SB.]

LARGE (128 bit Datapath)

In each cycle, a complete round of AES is computed

No multiplexing for ShiftRows

The key expansion is done 128 bit parallel

Size: 20,000 – 35,000 GE (depending on feature set)

[Figure: 128 bit data path. Block labels: AR, MC, SB.]

XLARGE (Unrolled 128 bit Datapath)

In each cycle, one AES output is computed

Processing is pipelined: the design takes one plaintext per clock cycle and returns one ciphertext per clock cycle

Cannot be used efficiently with CBC encryption or other modes that require the ciphertext of the previous block as input for the current block

Size: at least 200,000 GE

Summary

For DES, there was essentially just one hardware implementation that made sense

AES is more flexible and allows three main architectures (SMALL, MEDIUM, LARGE)

Throughput, power, and energy strongly depend on the technology used and on the interfaces

Clocking a SMALL architecture on an RFID tag at about 200 kHz yields 1,000 AES-128 encryptions/sec

Clocking a LARGE architecture on a high-speed chip at 1 GHz yields 100,000,000 AES-128 encryptions/sec
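The two throughput figures follow directly from the cycle counts per round. The sketch below reproduces the arithmetic under simplifying assumptions that are not stated on the slide: 10 rounds for AES-128, no cycles for loading data or for the initial key addition, and no interface overhead. The function name is chosen for illustration.

```python
ROUNDS_AES128 = 10


def encryptions_per_second(clock_hz, cycles_per_round, rounds=ROUNDS_AES128):
    """Blocks per second for a given clock and cycles per round (no I/O overhead)."""
    cycles_per_block = cycles_per_round * rounds
    return clock_hz // cycles_per_block


small = encryptions_per_second(200_000, 20)       # SMALL at 200 kHz
large = encryptions_per_second(1_000_000_000, 1)  # LARGE at 1 GHz
print(small, large)
```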

PART IV

Physical Attacks

Power Analysis and EM Attacks

How many power traces does the best power analysis attack on AES need?

1

Power Analysis Attacks on AES

Single trace or average trace

SMALL implementations are particularly vulnerable because they leak information about many intermediate results separately

In the worst case, no averaging is necessary

State-of-the-art method to exploit the leakage: algebraic side-channel attacks

Differential power analysis attacks

Attacks on AES work nicely with all kinds of distinguishers
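A differential power analysis attack with one common distinguisher, the Pearson correlation, can be sketched on simulated data. Everything about the leakage here is an assumption for illustration: each simulated measurement is the Hamming weight of one Sbox output plus Gaussian noise, which is a textbook model and not a measurement of any real device. The names (`simulate_traces`, `recover_key_byte`) are hypothetical.

```python
import random


def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES Sbox table: inversion (x^254) plus affine transformation."""
    table = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(s ^ 0x63)
    return table


SBOX = build_sbox()
HW = [bin(x).count("1") for x in range(256)]  # Hamming weight table


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_traces(key_byte, n, sigma, rng):
    """Assumed leakage model: HW of the Sbox output plus Gaussian noise."""
    plaintexts = [rng.randrange(256) for _ in range(n)]
    leakage = [HW[SBOX[p ^ key_byte]] + rng.gauss(0, sigma) for p in plaintexts]
    return plaintexts, leakage


def recover_key_byte(plaintexts, leakage):
    """Return the key guess whose predicted leakage correlates best."""
    def score(guess):
        return pearson([HW[SBOX[p ^ guess]] for p in plaintexts], leakage)
    return max(range(256), key=score)


rng = random.Random(1)
plaintexts, leakage = simulate_traces(0x3C, 500, 0.5, rng)
print(hex(recover_key_byte(plaintexts, leakage)))
```

Other distinguishers (difference of means, mutual information, and so on) would replace `pearson` while the rest of the attack stays the same.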

Power Analysis Trends

Attack Strategies

Profiled attacks are an established tool

Exploitation more and more focuses on multiple points and their relationship (higher-order attacks)

Almost any statistical tool that can be used to measure dependencies between random variables has by now been applied to power analysis

Measurement Setup

Measurement of the power consumption is often done via the electromagnetic field

Small coils allow local attacks on the chip

Basic DPA attacks can be conducted with simple and cheap USB oscilloscopes

Storage and processing power of modern PCs and oscilloscopes allow attacks with more and more traces

Algorithmic Countermeasures for AES

Masking

Numerous publications on how to mask the Sbox have appeared over many years

The problem of how to resolve all the implementation issues (glitches, data-dependent timings, …) is left to the designer

Threshold Implementations

By now, there are proposals for threshold implementations that address the implementation issue of glitches

Open Issue

Higher-order attacks: hardware implementations process all shares in parallel or sequentially; neither a masked nor a threshold implementation provides sufficient protection given current setups and future trends
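The basic idea of masking the Sbox can be sketched with first-order Boolean masking by table re-computation. This is a software-style illustration and an assumption made here, not a scheme from the talk: hardware designs typically mask the GF(2^8) inversion itself, and the sketch addresses neither glitches nor higher-order attacks. The name `masked_sbox_table` is hypothetical.

```python
import secrets


def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES Sbox table: inversion (x^254) plus affine transformation."""
    table = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(s ^ 0x63)
    return table


SBOX = build_sbox()


def masked_sbox_table(m_in, m_out):
    """Return table T with T[x ^ m_in] == SBOX[x] ^ m_out for all x."""
    table = [0] * 256
    for x in range(256):
        table[x ^ m_in] = SBOX[x] ^ m_out
    return table


# Fresh random masks for this execution
m_in, m_out = secrets.randbelow(256), secrets.randbelow(256)
table = masked_sbox_table(m_in, m_out)

x = 0x53
x_masked = x ^ m_in          # only the masked value is processed
y_masked = table[x_masked]   # masked Sbox output, never the plain SBOX[x]
print(hex(y_masked ^ m_out))  # unmasking (done here only to check the result)
```

The lookup handles only `x_masked` and `y_masked`, each of which is statistically independent of `x` as long as the masks are fresh and uniformly random.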

Fault Attacks

How many fault inductions on AES does the best fault attack need?

1

Fault Attacks on AES

Many papers have appeared in recent years, and the topic is by now well researched

There are fault attacks on all different key sizes, on the datapath and on the key expansion path

Example results

One pair (C, C’) together with P breaks AES-128

Two pairs (C, C’) break AES-128

There are efficient attacks on the AES middle rounds

There are efficient attacks even if up to 12 bytes of the state are changed by the attack

(P … plaintext, C … ciphertext, C’ … faulty ciphertext)

Fault Attack Trends

Attack Strategies

For AES, there is not much room left for practical improvement of the attacks

However, the system around the AES implementation is also important and an active field of research

Attack Setup

Lasers are the most effective method to produce controlled and localized faults in an IC

Attack setups can contain a laser to perform attacks from the front side as well as from the back side of the chip

Setups that are able to induce multiple faults are becoming more and more prominent

Algorithmic Countermeasures for AES

General Countermeasures

Sensor-based approaches: The goal is to detect specific fault induction vehicles (temperature sensor, light sensor, voltage sensor, …)

Error-detection based approaches: The goal is to detect the error that is the consequence of the fault induction

AES-Specific Countermeasures

Most publications use duplication (temporal or spatial)

A few publications use parities in the Sbox; this is not sufficient against fault attacks

Open Issues

Strong algorithmic redundancy measures for AES

Multiple fault attacks

Probing Attacks

How many probing needles does the best probing attack on AES need?

1

Probing Attacks on AES

There are only a few papers on probing attacks

Probing attacks are significantly more expensive than fault or power analysis attacks

SMALL implementations are particularly vulnerable because they leak information about many intermediate results separately

E.g., placing a needle on a wire of an Sbox provides all Sbox outputs during each encryption run …

Countermeasures include masking, but in the end some physical protection is necessary in order to prevent probing attacks on AES

PART V

Summary

Summary

For the components of AES, there exist standard solutions

There are also essentially three standard architectures

In case AES is operated in a secure environment, building AES means taking the standard components, selecting one of the architectures, and optimizing the design according to the concrete design needs; this is a standard design task

In case AES is NOT operated in a secure environment, doing an AES implementation is very challenging

After 10 years of AES, there is no publication on a secure design that addresses all the threat scenarios

References

[C05] David Canright: A Very Compact S-Box for AES. CHES 2005

[WOL01] Johannes Wolkerstorfer: An ASIC Implementation of the AES MixColumn-operation

[NIST01] National Institute of Standards and Technology (NIST): FIPS-197: Advanced Encryption Standard, 2001

[WOL02] Johannes Wolkerstorfer, Elisabeth Oswald, Mario Lamberger: An ASIC Implementation of the AES SBoxes. CT-RSA 2002
