Crypto Hardware Design for Embedded Applications

Dr. Amlan Chakrabarti & Mr. Suman Sau

Real Time Embedded System Research Group

A.K.Choudhury School of Information Technology, University of Calcutta

email: acakcs@caluniv.ac.in

A. Chakrabarti and S. Sau, ICISS 2012

Agenda

• Insight to Embedded Systems
• High Performance Embedded System Design
  – Requirements
  – Challenges
• Basic Concepts of Cryptography
  – Public & Private Keys
  – Hashing
  – Digital Signature
  – Block Cipher and Stream Cipher
• Crypto Hardware for Embedded Systems
  – Requirements
  – Challenges
  – Reconfigurable Hardware
• Architectures
• Design examples
• Crypto Engine Design
  – Prototype Design Using FPGA
  – Example
• Conclusion

Insight to Embedded Systems 

• An embedded system is nearly any computing system (other than a general-purpose computer) with the following characteristics
  – Single-functioned
    • Typically designed to perform a predefined function
  – Tightly constrained
    • Tuned for low cost
    • Built from a single component or a few components
    • Performs functions fast enough
    • Consumes minimum power
  – Reactive and real-time
    • Must continually monitor the desired environment and react to changes
  – Hardware and software co-existence

Insight to Embedded Systems (2)

Insight to Embedded Systems (3)

• Typical embedded software components:
  – Embedded application code
  – Device drivers
  – A Real-Time Operating System (RTOS)
  – Hardware abstraction layer(s)
  – System initialization routines

Amlan Chakrabarti, PHYSTENS‐ Dec. 2012

High Performance Embedded Systems (4)

• Massive computational resources with requirements of
  – Small size
  – Low weight
  – Very low power consumption
• Need to employ innovative, advanced system architectures
• Architectures typically feature
  – Multiple processor cores
  – Tiered memory structures with multi-level memory caching
  – Multi-layer bus structures
  – Super-pipelining and/or super-scaling

High Performance Embedded Systems (6)

• Increasing software content
  – The software content of embedded systems is increasing at a phenomenal rate
  – Software development and test often dominate the costs, timelines, and risks associated with today's embedded system designs.

Multi-Core Embedded Systems

Superscalar Era

• Single thread performance scaled at 50% per year
• Bandwidth increases much more slowly, but we could add additional bits or channels.
• Lack of diversity in architecture = lack of individual tuning
• Power wall has capped single thread performance

Figure: log(performance) versus year (1990 to 2005) for processor performance, memory bandwidth, and memory latency.

Why Multicore?

• It is reasonable to question whether multicore is worth this additional work, or whether it is possible to continue gaining improvements through single-core devices.
• Raising Clock Frequency
  – Crank up the frequency
  – But it has become all too apparent that pushing the frequency came at a price
  – Frequency improvements penalize power consumption, which in turn generates heat that requires more advanced cooling, decreases reliability, and shortens the longevity of the device.
• Other issues
  – Techniques such as parallelizing instructions, speculative execution, and pipelining cannot generally scale with the frequency

Multi-core embedded systems

• Need
  – Increased computing demands from embedded systems with constrained energy and power
    • A 3G mobile handset's signal processing requires 35-40 GOPS
    • Constraints: power dissipation budget of 1 W
    • Performance efficiency required: 25 mW/GOPS or 25 pJ/operation
  – Multi-core embedded systems provide a promising solution to meet these performance and power constraints
• Multi-core embedded systems architecture
  – Processor cores
  – Caches
  – Memory controllers
  – Interconnection network
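
The efficiency figure on this slide follows from dividing the power budget by the operation rate. A minimal Python sketch of that arithmetic; the function name `energy_per_operation_pj` is illustrative and does not come from the slides:

```python
def energy_per_operation_pj(power_watts: float, gops: float) -> float:
    """Energy per operation in picojoules for a given power budget and rate."""
    operations_per_second = gops * 1e9
    # watts / (operations per second) = joules per operation; scale to pJ
    return power_watts * 1e12 / operations_per_second


# 1 W budget at the upper end of the 35-40 GOPS range quoted on the slide
print(energy_per_operation_pj(1.0, 40.0))
```

At 40 GOPS the budget works out to 25 pJ per operation, which is the same quantity as 25 mW per GOPS. At 35 GOPS it relaxes to roughly 28.6 pJ per operation.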

Multicore for Multiple Reasons

• Asymmetric multiprocessing (AMP)
  – Minimizes overhead and synchronization issues
  – Core 1 runs legacy OS, Core 2 runs RTOS, others do a variety of processing tasks (i.e. where applications can be optimized)
• Parallel pipelining
  – Taking advantage of proximity
  – The performance opportunity…

Figure labels: Linux, RTOS, Application, Video, Compress, Security, Thread 1 to Thread n.

Different Types of Multicore

• Homogeneous
  – Describes a multicore environment in which cores are identical and execute the same instruction set
• Heterogeneous
  – Describes a multicore environment in which cores are not identical and implement different instruction sets
• The current trend is to create homogeneous multicore devices, but a significant performance advantage can be obtained by using specialized cores and accelerators to offload the main cores.

Major Challenges for Multi-Core Designs

• Communication
  – Memory hierarchy
  – Data allocation (you have a large shared L2/L3 now)
  – Interconnection network
  – Scalability
  – Bus bandwidth, how to get there?
• Power-Performance: win or lose?
  – Borkar's multicore arguments
    • 15% per core performance drop ==> 50% power saving
    • Giant, single core wastes power when task is small
  – How about leakage?
• Process variation and yield
• Programming Model
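
The power-performance argument above can be made concrete with a little arithmetic. The sketch below uses the slide's two figures (15% performance drop, 50% power saving per core) and assumes, as an idealization not stated on the slide, that the workload parallelizes perfectly across cores. The function name `scale_out` is hypothetical.

```python
def scale_out(cores: int, perf_drop: float = 0.15, power_saving: float = 0.50):
    """Return (relative throughput, relative power) versus one full-speed core.

    Assumes perfect parallel scaling, which real workloads rarely achieve.
    """
    per_core_perf = 1.0 - perf_drop
    per_core_power = 1.0 - power_saving
    return cores * per_core_perf, cores * per_core_power


throughput, power = scale_out(2)
print(f"2 slower cores: {throughput:.2f}x throughput at {power:.2f}x power")
```

Under these assumptions two slower cores deliver about 1.7 times the throughput of one fast core within the same power envelope. Synchronization and communication overheads, listed above as challenges, reduce that gain in practice.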

Basic Concepts of Cryptography


Ciphers ==> ciphertext

• We start with plaintext: something we can read
• We apply a mathematical algorithm to the plaintext
• The algorithm is the cipher
• The plaintext is turned into ciphertext
• Almost all ciphers were secret until recently
• Creating a secure cipher is HARD

What it Looks Like

Symmetric Cipher

Private Key/Symmetric Ciphers

Figure: cleartext is encrypted with key K into ciphertext, which is decrypted with the same key K back into cleartext.

The same key is used to encrypt the document before sending and to decrypt it once it is received.

Examples: DES, 3DES, AES, Blowfish, IDEA
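
The shared-key idea can be shown with a deliberately insecure toy. The sketch below is not DES, AES, or any cipher named above; it XORs the data with a repeating key only to show that one key K serves both directions. The name `xor_cipher` is illustrative.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with a repeating key (insecure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = b"K"  # the shared secret key, known to sender and receiver
ciphertext = xor_cipher(b"cleartext", key)   # sender encrypts with K
recovered = xor_cipher(ciphertext, key)      # receiver decrypts with the same K
print(ciphertext != b"cleartext", recovered)
```

Because XOR is its own inverse, calling the same function with the same key recovers the cleartext. Real symmetric ciphers have separate, far more elaborate encrypt and decrypt routines, but they are keyed in the same way.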

Public/Private Keys

• We generate a cipher key pair. One key is the private key, the other is the public key
• The private key remains secret and should be protected
• The public key is freely distributable. It is related mathematically to the private key, but you cannot (easily) reverse engineer the private key from the public key
• Use the public key to encrypt data
• Only someone with the private key can decrypt.

Example (Public/Private Key pair)

Figure: cleartext is encrypted with k1 (public key) into ciphertext, which is decrypted with k2 (private key) back into cleartext.

One key is used to encrypt the document, a different key is used to decrypt it.

This is a big deal!
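
The slides do not name a public-key algorithm. As one possible illustration, the sketch below uses textbook RSA with tiny primes; the primes, the exponent, and the function names are assumptions chosen for readability, and keys of this size offer no security.

```python
# Textbook RSA with tiny primes; illustrative only, never use such sizes.
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: k1 = (e, n)
d = pow(e, -1, phi)            # private exponent: k2 = (d, n), needs Python 3.8+


def encrypt(message: int) -> int:
    """Encrypt with the public key k1."""
    return pow(message, e, n)


def decrypt(cipher: int) -> int:
    """Decrypt with the private key k2."""
    return pow(cipher, d, n)


cipher = encrypt(65)
print(cipher != 65, decrypt(cipher))
```

Anyone holding (e, n) can encrypt, but only the holder of d can undo it. Recovering d from (e, n) requires factoring n, which is trivial here and infeasible for properly sized keys.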

Block Cipher and Stream Cipher

• Block cipher
  – operates on fixed-length groups of bits, called blocks
  – unvarying transformation that is specified by a symmetric key
  – widely used to implement encryption of bulk data
• Stream cipher
  – plaintext digits are combined with a pseudorandom cipher digit stream (key stream)
  – each plaintext digit is encrypted one at a time with the corresponding digit of the key stream
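
The structural difference between the two families can be sketched in a few lines. Both functions below are toys with hypothetical names: the per-block transform is a plain XOR with the key, and the key stream comes from Python's `random` module, which is not cryptographically secure. The 8-byte block size is an arbitrary choice.

```python
import random

BLOCK_SIZE = 8  # bytes per block; an arbitrary choice for this sketch


def toy_block_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Pad to whole blocks, then apply the same keyed transform to each block."""
    pad = BLOCK_SIZE - len(plaintext) % BLOCK_SIZE
    padded = plaintext + bytes([pad]) * pad          # PKCS#7-style padding
    out = bytearray()
    for start in range(0, len(padded), BLOCK_SIZE):
        block = padded[start:start + BLOCK_SIZE]
        out += bytes(b ^ k for b, k in zip(block, key))  # unvarying per-block step
    return bytes(out)


def toy_stream_encrypt(plaintext: bytes, seed: int) -> bytes:
    """XOR each plaintext byte with the next byte of a pseudorandom key stream."""
    keystream = random.Random(seed)  # NOT cryptographically secure
    return bytes(b ^ keystream.randrange(256) for b in plaintext)


message = b"bulk data to protect"
blocks = toy_block_encrypt(message, key=b"8bytekey")
stream = toy_stream_encrypt(message, seed=2012)
print(len(blocks) % BLOCK_SIZE, len(stream) == len(message))
```

The block version always emits a whole number of blocks, so it needs padding. The stream version emits exactly as many bytes as it consumes, one at a time, which is why stream ciphers suit data that arrives continuously.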

Hashing

One-Way Encryption

Figure: cleartext passes through a hashing function to give a fixed-length hash or message digest.

Munging the document gives a short message digest (hash). Not possible to go back from the digest to the original document.
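
The fixed-length property is easy to observe with Python's standard `hashlib` module. SHA-256 is used here as an example of the SHA-2 family; the helper name `message_digest` is illustrative.

```python
import hashlib


def message_digest(document: bytes) -> str:
    """Return the SHA-256 digest of a document as 64 hex characters."""
    return hashlib.sha256(document).hexdigest()


short = message_digest(b"a")
long = message_digest(b"a" * 100_000)
print(len(short), len(long), short == long)
```

A one-byte document and a 100,000-byte document both produce a 64-character digest, and the two digests differ. Nothing in the digest allows the original document to be reconstructed.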

Protecting the Private Key

Figure: a passphrase entered by the user is hashed* to produce the key for a symmetric cipher, which decrypts k2 (encrypted on disk) into k2 (ready for use).

k2 = private key

*Such as SHA-1 or SHA-2
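
The flow in the figure can be sketched as follows. The passphrase is hashed with SHA-256 to obtain a symmetric key, and a toy hash-based key stream stands in for the symmetric cipher. All names and the sample passphrases are assumptions for illustration; production systems normally use a salted key-derivation function and a vetted cipher instead.

```python
import hashlib


def key_from_passphrase(passphrase: str) -> bytes:
    """Hash the user's passphrase into a 32-byte symmetric key (SHA-256)."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


def toy_symmetric(data: bytes, key: bytes) -> bytes:
    """Stand-in for a real symmetric cipher: XOR with a hash-derived key stream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


k2 = b"-----toy private key bytes-----"
on_disk = toy_symmetric(k2, key_from_passphrase("correct horse"))     # encrypted on disk
ready = toy_symmetric(on_disk, key_from_passphrase("correct horse"))  # ready for use
wrong = toy_symmetric(on_disk, key_from_passphrase("wrong guess"))
print(ready == k2, wrong == k2)
```

Only the right passphrase reproduces the key that unlocks k2, so the private key never needs to sit on disk in the clear.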

Digital Signatures

Let's reverse the role of public and private keys. To create a digital signature on a document do:

• Munge (hash) the document.
• Encrypt the hash with your private key.
• Send the document plus the encrypted hash.
• On the other end, munge the document and decrypt the encrypted message digest with the person's public key.
• If they match, the document is authenticated.

Digital Signatures

Take a hash of the document and encrypt only that. An encrypted hash is called a "digital signature".

Figure: the sender hashes the document and encrypts the hash with k2 (private) to form the digital signature; the receiver hashes the document, decrypts the signature with k1 (public), and compares the two hashes.
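
The sign-and-compare procedure can be sketched with SHA-256 and textbook RSA. The two Mersenne primes below are chosen only because they are short to write; they are an assumption of this sketch and far too small and too structured for real use, as is signing without a padding scheme. The function names and sample documents are illustrative.

```python
import hashlib

# Toy RSA key pair built from two Mersenne primes (illustration only)
p, q, e = 2**61 - 1, 2**89 - 1, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (k2), needs Python 3.8+


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % n


def sign(document: bytes) -> int:
    """Encrypt the hash with the private key k2."""
    return pow(digest_as_int(document), d, n)


def verify(document: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key k1 and compare the hashes."""
    return pow(signature, e, n) == digest_as_int(document)


doc = b"pay 100 rupees"
sig = sign(doc)
print(verify(doc, sig), verify(b"pay 900 rupees", sig))
```

The original document verifies, while the altered one does not, because its hash no longer matches the value recovered from the signature.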

Security Functions

• data confidentiality
• data integrity
• authentication

Embedded Security Pyramid

Design Challenges

Crypto Hardware Design

Hardware Implementation Benefits

• More secure implementations
• Implementing both algorithms in hardware removes bottleneck associated with
• Single hardware implementation supporting both algorithms reduces costs of separate hardware

Architectures for Security Processing


Second- and third-generation security processing architectures

• Cryptographic Hardware Accelerators
  – obtained through custom hardware implementations of cryptographic (asymmetric, symmetric, hash) algorithms
  – Applications
    • low-power mobile appliances and smartcards to high-performance network routers and application servers [Discretix; Safenet]
• Embedded Processor Enhancements
  – Improving the security processing capabilities of general-purpose processors
  – accelerating bit-level arithmetic operations such as the permutations performed in crypto algorithms
  – Examples
    • SmartMIPS, ARM SecureCore family, MOSES security processor developed at NEC

Second- and third-generation security processing architectures (contd.)

• Security Protocol Engines
  – Security protocol engines accelerate all or most of the functionality present in a security protocol
  – higher efficiency than cryptographic accelerators
  – these protocol engines, if programmable, can be used to execute multiple protocols efficiently
  – programmable security protocol engines are being used increasingly by embedded system designers when both flexibility and efficiency are required
  – Examples
    • 7811 security processor from HIFN can be used in VPNs to perform IPSec processing

HW-SW Codesign

Support multiple algorithms and protocols

Implementation Platforms

Reconfigurable Hardware and Cryptography 

• Why Hardware?
  – Software implementations are too slow for time-critical applications
  – Hardware implementations are intrinsically more secure
• Why Reconfigurable?

Reconfigurable Hardware and Cryptography (2)

• Advantages of reconfigurable platforms
  – Algorithm agility
  – Algorithm upgradability
  – Architecture efficiency
  – Resource efficiency
  – Algorithm modification
  – Throughput (relative to software)
  – Cost efficiency (relative to ASICs)

ASIC vs FPGA

Design With Reconfigurable Hardware

Programmable Hardware: FPGA

• Re-programmable hardware
• Enables the development of nearly all digital circuits
• Leading vendors: Xilinx, Altera, Microsemi
• Trend to heterogeneous multicore systems out of processors, DSPs, high-speed I/O and programmable logic
• Usually a rapid prototyping platform but increased exploitation as ASIC substitute

Advantages of FPGA Embedded Processor Systems

• Merge CPU and I/O functions onto a single board
• Flexible design template: optimize power, data, and form factor to match application and I/O requirements
• Tightly coupled high speed logic and control system interface on a single chip: versatile tradeoff between hardware and software tasks
• Advanced tools bridge software and logic development, provide BSP generation for Linux, VxWorks

Embedded Processors on FPGA

• Hard Core
  – Embedded processor is a dedicated physical component of the chip, separate from the programmable logic
  – E.g. Xilinx Virtex families w/ PowerPC 405
• Soft Core
  – Embedded processor is built out of the programmable logic on the chip
  – E.g. Xilinx MicroBlaze, Altera NIOS

Hard Core vs. Soft Core Considerations

• Both cores utilize about the same % of total chip resources
• Hard core performance = 3-4x faster than fastest soft cores
• FPGAs with hard cores are more expensive
• Soft cores are more flexible
  – Multiple cores can be used in a single chip
  – Can be used in a chip with a hard core

Architecture of the application-specific FPGA

Application-specific FPGA: Toolflow

Crypto Engine Design

What is a crypto engine?

• Designed to implement the specific cipher need
• Implementation through a library of general-purpose FPGA design blocks
• Can also be configured for multiple ciphers

Crypto Engine as a Coprocessor

• Customized co-processor core as per the requirement of the algorithm
• Main processor can execute the other required application tasks concurrently
• Enables multi-tasking
• Communicates with the main processor core through a bus
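
The offload pattern described above can be modeled in software. This is only a behavioral sketch, not the authors' hardware design: a thread plays the co-processor, two queues play the bus FIFOs, and a single-byte XOR stands in for the cipher. All names are hypothetical.

```python
import queue
import threading

to_engine = queue.Queue()     # models the bus FIFO, main core -> co-processor
from_engine = queue.Queue()   # models the bus FIFO, co-processor -> main core


def crypto_engine(key: int) -> None:
    """Hypothetical co-processor loop: transform each block received on the bus."""
    while True:
        block = to_engine.get()
        if block is None:          # sentinel: no more work
            break
        from_engine.put(bytes(b ^ key for b in block))   # toy stand-in for a cipher


engine = threading.Thread(target=crypto_engine, args=(0x5A,))
engine.start()

to_engine.put(b"block-0")          # main core offloads the crypto task
other_work = sum(range(1000))      # ...and keeps running application code
result = from_engine.get()         # collect the result when it is needed
to_engine.put(None)
engine.join()
print(other_work, len(result))
```

The main core blocks only at the point where it actually needs the result. That is the benefit a bus-attached co-processor aims for.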

Co‐processor based Hardware Design on FPGA

Co-processor using FSL bus (Internal Architecture)

Example Design of AES Crypto-Engine

• Internal Architecture of AES Core

AES Engine as coprocessor with MicroBlaze core

System Architecture

Research Issues 

• Hardware design of latest crypto/hash algorithms
• Parallelization of cryptographic algorithms
  – Higher throughput
• Low power design
• Hardware error detection

Conclusion

• The multi-core architecture presented provides a good trade-off between flexibility, performance and resource consumption
• Crypto-accelerators and crypto-engines can be efficiently catered for by multi-core based designs
• Synchronization is a great challenge
• Reconfigurable hardware provides new opportunities

• Prof. Ranjan Ghosh, University of Calcutta

• Mr. Rourab Paul, Research Scholar, University of Calcutta

• Mr. Sangeet Saha, M.Tech. student, University of Calcutta
