multi-ip-based soc design including ccm security mode of...

Master Thesis ICT/ECS-2006-71 Multi-IP-Based SoC Design Including CCM Security Mode of Operation By Solmaz Ghaznavi A thesis presented to the University of Waterloo and KTH University in the fulfillment of the thesis requirement for the degree of Master of Science in System on-Chip Design Waterloo, Ontario, Canada, 2006 © Solmaz Ghaznavi, 2006 Supervisor: Professor Cathy Gebotys Examiner: Professor Axel Jantsch

Page 1: Multi-IP-Based SoC Design Including CCM Security Mode of ...cgebotys/NEW/Solmaz_MSc_thesis_Aug3.pdf · more flexibility and much lower design and debug costs compared to specifically-built

Master Thesis ICT/ECS-2006-71

Multi-IP-Based SoC Design Including CCM Security Mode of Operation

By Solmaz Ghaznavi

A thesis presented to the University of Waterloo and KTH University in the fulfillment of the

thesis requirement for the degree of Master of Science

in System on-Chip Design

Waterloo, Ontario, Canada, 2006 © Solmaz Ghaznavi, 2006

Supervisor: Professor Cathy Gebotys Examiner: Professor Axel Jantsch

I hereby declare that I am the sole author of this thesis.

I authorize the University of Waterloo and KTH University to lend this thesis to other institutions or

individuals for the purpose of scholarly research.

I further authorize the University of Waterloo and KTH University to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.

Abstract

Embedding security in many mobile electronic devices is of great importance. The emergence of powerful self-contained FPGAs, which include microprocessors, memory, etc. for SoC designs, has shifted the focus to these programmable platforms. A co-design approach can be used to optimize speed, area and power consumption by partitioning functions between the on-chip microprocessor and the programmable logic blocks.

FPGAs typically provide higher efficiency compared to software. On the other hand they offer

more flexibility and much lower design and debug costs compared to specifically-built hardware.

This thesis implements the CCM security mode of operation on an FPGA platform using the AES encryption algorithm, and then builds a complete SoC based on multiple IP cores, including CCM. Except for the hard on-chip IP cores (i.e. microprocessors and memory), the elements that make up this complex system (the device controllers, the PLB and OPB buses, and CCM) are all soft IP peripherals. Building the elements as soft IP cores makes further on-chip development or modification very easy. The CCM core sits on the PLB bus, which runs at 80 MHz, and can easily communicate with the PowerPC and with the DDR SDRAM and BRAM controllers on the same bus.

The implementation exploits the iterative structure of AES to save hardware resources; it implements the key expansion core as well.

The thesis also reports on the challenges and problems encountered throughout the implementation.

Acknowledgements

I would like to thank my supervisor, Professor Cathy Gebotys, for all her advice, guidance and

encouragement. I would like to acknowledge CMC (Canadian Microelectronics Corporation)

support for using the AP1100 board. I would also like to thank my parents and my best friend

Adela for their support.

Table of Contents

Abstract
List of Figures
List of Tables

1 Introduction
    1.1 Thesis Objective
    1.2 Security Algorithm Choice
    1.3 Thesis Overview

2 Board and the FPGA Features
    2.1 Board Architecture
    2.2 Configuration, Debugging and Power Connections
    2.3 FPGA Features
        2.3.1 Configurable Logic Blocks
        2.3.2 Slice Description
        2.3.3 Memory Style
            2.3.3.1 Distributed SelectRAM+
            2.3.3.2 Block SelectRAM+
        2.3.4 FPGA Clocking

3 Security Standards
    3.1 CCM
        3.1.1 CCM Cryptographic Techniques
            3.1.1.1 Counter Mode Encryption (CTR)
            3.1.1.2 CBC-MAC
        3.1.2 CCM Security Assurance
    3.2 Advanced Encryption Standard (AES)
        3.2.1 AES Cipher
        3.2.2 Key Expansion

4 Design and Analysis of CCM in SoC
    4.1 Security Design Objective
    4.2 High Level Design Architecture
        4.2.1 User Logic S/W Register Support
        4.2.2 Memory Map of PowerPC
    4.3 CCM Implementation and Analysis
        4.3.1 Key Expansion and Synthesis Analysis
        4.3.2 Cipher Module and Synthesis Analysis
        4.3.3 Comparison with Previous Research
            4.3.3.1 Microprocessor Implementation
            4.3.3.2 FPGA Implementation
                4.3.3.2.1 AES Iterative Implementation
                4.3.3.2.2 AES Unrolled Implementation
        4.3.4 Conclusion
    4.4 Testing and Debugging
    4.5 Software Tools and Some Practical Hints

5 Discussion and Conclusions
    5.1 Summary
    5.2 Limitations and Future Work

References

Appendix A: AES Cipher HDL Synthesis Report
Appendix B: MixColumns HDL Synthesis Report
Appendix C: Key Expansion HDL Synthesis Report
Appendix D: S-box (AES Forward Cipher)
Appendix E: Test Vectors
Appendix F: VHDL Codes

List of Figures

Figure 2-1. AP1100 Board Architecture
Figure 2-2. Virtex-II Pro CLB Element
Figure 2-3. General Slice in Virtex-II Pro
Figure 2-4. Half Slice in Virtex-II Pro
Figure 2-5. Single-port Distributed SelectRAM+
Figure 2-6. Dual-port Distributed SelectRAM+
Figure 3-1. CTR Block Diagram
Figure 3-2. CBC-MAC Block Diagram
Figure 3-3. Forward Cipher Operation
Figure 3-4. Key Expansion
Figure 4-1. Baseline Block Diagram with CCM Added as Part of the System
Figure 4-2. XMD Window Showing How to Trigger Key Expansion and CCM
Figure 4-3. Memory Map of PowerPC
Figure 4-4. CBC-MAC Schematic
Figure 4-5. Key Expansion RTL Schematic
Figure 4-6. S-box After Synthesis
Figure 4-7. Cipher RTL Schematic
Figure 4-8. AES Iterative Implementation
Figure 4-9. AES Unrolled Pipelined Architecture

List of Tables

Table 2-1. Virtex-II Pro Resources
Table 2-2. Resources in a CLB (4 slices)
Table 2-3. Resources Used by Distributed Memory
Table 2-4. Supported Memory Configurations for Single-port and Dual-port Modes
Table 2-5. Distributed RAM Switching Characteristics
Table 2-6. Block RAM Switching Characteristics
Table 3-1. CTRi Formatting
Table 3-2. Flags Byte
Table 3-3. Block Zero (B0) for CBC-MAC
Table 3-4. Flags Byte in B0
Table 3-5. Parameters Dependent on Key Size
Table 3-6. Round Constant Bytes, RC in Hexadecimal
Table 4-1. IPIF Software Reset Register Description
Table 4-2. Instructions Execution for MixColumns
Table 4-3. AES Encryption Results

1 Introduction

The spread of electronic devices in everyday use, such as Subscriber Identity Module (SIM) cards in mobile phones, cash cards, and Radio Frequency Identification (RFID) chips, has increased the need for security. As a consequence, there is a growing desire to embed security in system on-chip (SoC) designs within many mobile and ubiquitous devices.

There are three common ways of implementing algorithms in electronic devices. One method is

to use hardware built specifically for that algorithm such as Application Specific Integrated Circuit

(ASIC). This method produces a device that is highly efficient with respect to speed, area, and

power consumption. However there are some downsides to this approach; including a long and

expensive design time, inflexibility (if there is a need to modify the product due to some flaws or

updates it usually has to be remanufactured) and cost.

The second approach is to use a software-programmed microprocessor. The great advantage of this method is that modifications are made in software, which makes it very flexible compared with specifically-built hardware. On the other hand, it may not be efficient with respect to speed, area, and power consumption.

Reconfigurable devices such as Field Programmable Gate Arrays (FPGAs) can be considered an

intermediate option. FPGAs provide Configurable Logic Blocks (CLBs) and routing resources that

are programmable. Since the design can be tested and verified at the user site it benefits from a

less expensive design process than ASICs.

FPGAs provide very high flexibility and can produce devices that are relatively efficient with respect to speed, area, and power consumption. Flexibility in case of modification or damage can be crucial; for instance, cosmic rays might affect the electronics in space equipment or satellites, so the ability to reprogram the devices remotely could be of extreme importance.

A security mode of operation called CCM (Counter with Cipher Block Chaining-Message Authentication Code), which uses the AES block cipher, is implemented as an IP peripheral in this thesis. The implementation has all the elements of a system on-chip design, including a microprocessor, memory, and different buses, in which the buses, controllers and CCM are all soft IP cores.

1.1 Thesis Objective

The current state-of-the-art FPGAs provide ample hardware resources for a complete SoC

design. An IP-based approach, that contains different devices, controllers or buses as soft IP

cores in a library, could lead to a very flexible, efficient, and fast design methodology in SoC

designs on FPGAs.

The main objective of this thesis is to implement the CCM security algorithm as an IP peripheral, following an IP-based design approach, in order to provide a very flexible and powerful SoC implementation on a Xilinx Virtex-II Pro FPGA. This self-contained FPGA implementation includes all the necessary elements of a SoC design, such as the microprocessor, memory, and buses (the device controllers, buses and CCM are soft IP cores), which can easily communicate with each other.

1.2 Security Algorithm Choice

CCM is a security mode that provides authentication assurance through the scarcity of valid ciphertexts, meaning that an attacker without access to the key cannot easily generate a valid ciphertext. The output of the decryption-verification process is therefore either an error message indicating that the input is invalid or the valid plaintext. An attacker can nevertheless produce a valid ciphertext by chance with a certain probability. An important parameter in CCM mode, the length of the MAC, can be set to control the probability of accepting inauthentic data as authentic. More security comes at the price of larger bandwidth [ref. 1].
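As a rough illustration of this trade-off, the sketch below assumes that the parameter in question is the MAC length in bits (called Tlen in NIST SP 800-38C, which permits 32 to 128 bits in 16-bit steps) and that a single guessed ciphertext is accepted with probability at most 2^-Tlen. The function names are hypothetical and are not taken from the thesis.

```python
from fractions import Fraction

# MAC lengths in bits permitted for CCM by NIST SP 800-38C (assumption stated above).
VALID_TLEN_BITS = (32, 48, 64, 80, 96, 112, 128)

def forgery_bound(tlen_bits: int) -> Fraction:
    """Upper bound on the chance that one guessed ciphertext is accepted."""
    if tlen_bits not in VALID_TLEN_BITS:
        raise ValueError("unsupported MAC length")
    return Fraction(1, 2 ** tlen_bits)

def mac_overhead_bytes(tlen_bits: int) -> int:
    """Extra bytes that the MAC adds to every protected message."""
    return tlen_bits // 8

for tlen in VALID_TLEN_BITS:
    print(f"Tlen={tlen:3d} bits  overhead={mac_overhead_bytes(tlen):2d} bytes  "
          f"forgery bound=2^-{tlen}")
```

Each additional 16 bits of MAC lowers the bound by a factor of 65536 and costs two more bytes per message, which is the bandwidth price mentioned above.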

CCM is based on an approved symmetric key block cipher algorithm whose block size is 128 bits.

In this thesis the underlying symmetric block cipher is the Advanced Encryption Standard (AES)

that was approved as the standard to replace Data Encryption Standard (DES) [ref. 1].

One of the advantages of CCM is that it only uses forward cipher in both generation-encryption

and decryption-verification processes. Another advantage is that it allows preprocessing; the

counter blocks may be generated in advance.
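The following minimal sketch shows why counter mode needs only the forward cipher and why the keystream can be prepared in advance. It is an illustration only: `forward_cipher` is a SHA-256 based stand-in, not AES and not secure, and the counter layout (nonce followed by a block index) is a hypothetical simplification of the CCM counter block format described in Chapter 3.

```python
import hashlib

BLOCK_BYTES = 16  # CCM requires a block cipher with a 128-bit block

def forward_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES forward cipher. NOT AES and NOT secure."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]

def counter_blocks(nonce: bytes, count: int) -> list:
    """Hypothetical counter layout: nonce followed by a big-endian index."""
    index_bytes = BLOCK_BYTES - len(nonce)
    if index_bytes <= 0:
        raise ValueError("nonce must be shorter than one block")
    return [nonce + i.to_bytes(index_bytes, "big") for i in range(count)]

def keystream(key: bytes, nonce: bytes, nbytes: int) -> bytes:
    """Depends only on the key and the counters, so it can be precomputed."""
    n_blocks = -(-nbytes // BLOCK_BYTES)  # ceiling division
    stream = b"".join(forward_cipher(key, c) for c in counter_blocks(nonce, n_blocks))
    return stream[:nbytes]

def ctr_xor(data: bytes, stream: bytes) -> bytes:
    """Encryption and decryption are the same XOR with the keystream."""
    return bytes(d ^ s for d, s in zip(data, stream))

key, nonce = b"k" * 16, b"n" * 8
message = b"example plaintext spanning several blocks"
stream = keystream(key, nonce, len(message))  # generated before use
ciphertext = ctr_xor(message, stream)
recovered = ctr_xor(ciphertext, stream)       # forward function only, no inverse cipher
print(recovered == message)
```

Because decryption reuses `forward_cipher`, a hardware core built this way does not need the inverse cipher at all.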

AES was published by NIST (National Institute of Standards and Technology) in 2001. Of the initial 21 submissions, 5 finalists were chosen, and the algorithms were evaluated according to the following criteria [ref. 2]:

- General Security

No attacks on AES are known as yet, although its mathematical structure has drawn some criticism as a possible vulnerability.

- Software Implementations

AES has high potential for parallelism, which yields efficient use of processor resources in software implementations.

- Restricted-Space Environments

AES is very well suited to environments where only encryption or only decryption is required. Where both are needed, the downside is the additional ROM requirement.

- Hardware Implementations

AES has potential for parallelism and concurrency through unrolled or pipelined implementations, which come at the price of larger area.

- Attacks on Implementations

AES was among the easiest of the candidate algorithms to defend against power and timing attacks without significant performance degradation.

- Encryption vs. Decryption

AES does not vary significantly between encryption and decryption, although the key

setup takes longer for decryption.

- Key Agility

Key agility refers to the ability to change the key quickly and with minimum resources. AES requires the key expansion to run once for a specific key. Key expansion needs some hardware resources whether the core encrypts or decrypts.

- Other Versatility and Flexibility

AES supports key sizes of 128 bits, 192 bits and 256 bits that could be selected

according to the level of security needed and it supports data block size of 128 bits.

- Potential for Instruction-Level Parallelism

AES has a very high capability for concurrency for a single block encryption.
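The key sizes listed above determine the number of rounds and the size of the expanded key. The sketch below derives these values using the relations from FIPS-197 (Nk key words, Nr = Nk + 6 rounds, Nb = 4 state words); the function name is hypothetical.

```python
def aes_parameters(key_bits: int) -> dict:
    """Derive the AES round count and expanded-key size from the key length."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES supports 128-, 192- and 256-bit keys only")
    nb = 4                # state width in 32-bit words (128-bit data block)
    nk = key_bits // 32   # key length in 32-bit words
    nr = nk + 6           # number of rounds
    return {
        "Nk": nk,
        "Nb": nb,
        "Nr": nr,
        # One round key for the initial AddRoundKey plus one per round.
        "round_key_words": nb * (nr + 1),
    }

for bits in (128, 192, 256):
    print(bits, aes_parameters(bits))
```

A longer key therefore costs both more rounds per block and a larger round-key store, which matters when the key expansion is implemented in hardware.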

1.3 Thesis Overview

This thesis is composed of 5 Chapters. Chapter 2 provides the technical information on the board

and the FPGA used in the thesis. Chapter 3 describes the security algorithm (CCM) used in this

thesis in a clear and concise manner.

Chapter 4 presents the thesis contributions in implementing CCM in an IP-based system,

compares it with previous research, and discusses other design architectures. Chapter 5

describes the limitations and contributions of this research.

2 Board and the FPGA Features

In this thesis the implementation on the FPGA is based on multiple IP cores that provide the elements and devices needed in a SoC. The main hard SoC units on the FPGA are 444 18 Kb RAM blocks and two microprocessors; the other elements, such as the PLB bus, OPB bus, PLB DDR SDRAM controller, PLB BRAM controller, and the security module (CCM), are all soft IP cores in VHDL or Verilog. Building a SoC from IP cores offers flexibility and speed in the design process and in later modifications.

This chapter is devoted to the AP1100 board and the Xilinx Virtex-II Pro FPGA used in this thesis. Its purpose is not to go through the details of the datasheets; instead, it explains the main practical features that were involved in this project or could be useful in future work.

2.1 Board Architecture [ref. 4]

Figure 2-1 shows the AP1100 board hardware architecture. The Virtex-II Pro FPGA is the main

feature of the board which has interfaces to different on-board devices.

The two DDR SDRAM banks (64 MB each) provide a 32-bit data width for the two on-chip microprocessors. There are also two separate 18 Mb synchronous SRAMs providing a large data width. These SRAMs can be accessed as a single 72-bit bank or as two completely separate 36-bit banks. The memory controllers inside the FPGA are soft IP cores.

As shown in Figure 2-1, the configuration Flash, program Flash and System ACE are accessible through the local bus interface. A ported Linux distribution (2.4.18) is included for use with the AP1100. The kernel binary is stored on the board in program Flash at 0x20060000, and a ramdisk image is stored in program Flash at 0x20160000. By default, the AP1100 loads the kernel and mounts the ramdisk when powered on. U-Boot is a bootloader that provides the ability to load Linux, as well as a monitor program that allows access to the AP1100 resources. It is stored on the board in program Flash at 0x20000000 and is transferred to memory for execution. While U-Boot is running, it makes use of the SDRAM.
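The start addresses quoted above imply how much room each image has in program Flash. The sketch below works this out under two assumptions that the text does not state: that the 16 MB program Flash begins at the U-Boot address, and that each image may grow up to the start of the next one. The resulting sizes are upper bounds on the space available, not the actual image sizes.

```python
PROGRAM_FLASH_BYTES = 16 * 1024 * 1024  # 16 MB program Flash (board feature list)

# Start addresses quoted in the text.
IMAGES = {
    "u-boot": 0x20000000,
    "kernel": 0x20060000,
    "ramdisk": 0x20160000,
}

# Assumption: program Flash starts where U-Boot is stored.
FLASH_BASE = IMAGES["u-boot"]

def flash_windows(images: dict, base: int, size: int) -> list:
    """Return (name, start, room_in_bytes) for each image, sorted by address.

    room_in_bytes is the gap to the next image, or to the end of Flash
    for the last image.
    """
    ordered = sorted(images.items(), key=lambda item: item[1])
    ends = [addr for _, addr in ordered[1:]] + [base + size]
    return [(name, start, end - start) for (name, start), end in zip(ordered, ends)]

for name, start, room in flash_windows(IMAGES, FLASH_BASE, PROGRAM_FLASH_BYTES):
    print(f"{name:8s} starts at 0x{start:08X}, room for {room // 1024} KiB")
```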

The System ACE provides the FPGA with an additional high-speed and high-performance

configuration solution.

The Processor Bus Dual PCI Bridge provides an interface to additional devices on the local PCI bus. This bus makes it possible to add a wide variety of I/O by installing a PMC module, and to attach an Ethernet controller for network access. Furthermore, high-speed network interfaces can be used through the two Gigabit Ethernet physical layer devices that are connected directly to the Virtex-II Pro.

Additional expansion connectors are available through the Expansion I/O ports on the board.

These expansion ports allow either cabling or custom PCB daughter cards to be directly

connected to the Virtex-II Pro. CompactFlash, PCI 10/100/1000 Ethernet, and RS-232D

connectors are accessible from outside the chassis while the remaining connectors are

accessible from within the system chassis.

Figure 2-1. AP1100 Board Architecture [ref. 3]

Here is the list of main features of AP1100:

- Xilinx Virtex-II Pro Platform FPGA with two embedded PowerPC405 processors

- Dual 64 MB DDR SDRAM Banks

- Dual 2 MB SRAM Banks

- 16 MB Program Flash

- 16 MB Configuration Flash

- Xilinx System ACE CompactFlash Interface

- 64-bit/66 MHz System PCI Bus

- 32-bit/66MHz Local PCI Bus

- Four HSSDC2 Connectors offering direct access to four Virtex-II Pro MGTs (Multi-

Gigabit Transceivers)

- Dual 10/100/1000BASE-T Ethernet Ports

- 10/100/1000BASE-T Ethernet Port

- RS-232D Serial Port

- Single IEEE 1386.1 PMC site

- U-Boot 1.1.1

- Linux 2.4.18

2.2 Configuration, Debugging and Power Connections

The Amirix AP1100 provides several connectors, which are described in the datasheet. The connectors that were involved in this project are as follows [ref. 4]:

- PCI bus: the card can be installed in any PCI slot; to benefit from the full PCI bandwidth it should be placed in a 64-bit PCI slot at 66 MHz. The AP1100 operates from a single 3.3 V supply and meets the maximum power consumption requirement for a PCI card (25 W).

- Xilinx Parallel-IV cable: a high-speed download cable that configures or programs the Xilinx FPGA. It connects to the JTAG port of the FPGA. The cable uses the IEEE 1284 ECP protocol and Xilinx iMPACT software, and is reported to download over eight times faster than earlier solutions. A 3-way mouse port cable between the mouse connector and the PC’s mouse port provides power for the Parallel-IV cable.

The configuration mode used is boundary-scan, an industry standard (IEEE 1149.1 and 1532) for serial programming. External logic from a cable, microprocessor, or other device drives the JTAG-specific pins Test Data In (TDI), Test Mode Select (TMS) and Test Clock (TCK), and senses the device response on Test Data Out (TDO). This is the most popular configuration mode because it is standardized and can program FPGAs, PLDs, and PROMs through only these four JTAG pins. In boundary-scan mode the data is transferred at one bit per TCK cycle [ref. 5].
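To make the one-bit-per-TCK behaviour concrete, the toy model below treats the selected JTAG data register as a simple shift register: on each clock one bit enters at TDI and one bit leaves at TDO. It deliberately omits the TAP state machine driven by TMS, and the bitstream size and clock rate in the timing example are hypothetical values, not figures for the xc2vp100.

```python
from collections import deque

class ScanChain:
    """Toy serial data register: one bit in at TDI, one bit out at TDO per TCK."""

    def __init__(self, length: int):
        self.bits = deque([0] * length, maxlen=length)

    def clock(self, tdi: int) -> int:
        tdo = self.bits[-1]        # bit that falls out of the far end
        self.bits.appendleft(tdi)  # maxlen discards the old last bit
        return tdo

def shift_seconds(total_bits: int, tck_hz: float) -> float:
    """Lower bound on transfer time at one bit per TCK (protocol overhead ignored)."""
    if tck_hz <= 0:
        raise ValueError("TCK frequency must be positive")
    return total_bits / tck_hz

chain = ScanChain(4)
first_pass = [chain.clock(bit) for bit in (1, 0, 1, 1)]  # load a pattern
second_pass = [chain.clock(0) for _ in range(4)]         # read it back out
print(first_pass, second_pass)

# Hypothetical numbers: an 8,000,000-bit image shifted at a 4 MHz TCK.
print(shift_seconds(8_000_000, 4e6), "seconds")
```

Loading N bits always takes N clock cycles in this model, which is why download time scales directly with bitstream size and TCK frequency.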

PowerPC has a built-in JTAG port for debugging. The JTAG ports of both of the PowerPC

processors can be chained with the JTAG port present in the FPGA using a bus interface called

JTAGPPC. “EDK provides wrappers (jtagppc_cntlr) for connecting the PowerPC and JTAGPPC.

This way, the same JTAG cable used by the iMPACT tool for configuring the FPGA with a

bitstream file can also be used for debugging PowerPC programs” [ref. 6].

Xilinx Microprocessor Debugger (XMD) provides a Tool Command Language (Tcl) interface.

XMD console can be used for command line control and testing and debugging of the target. It is

also capable of running complex test scripts to verify a complete system. XMD communicates

with the PowerPC through the JTAG connection on the board [ref. 7].

2.3 FPGA Features

The Virtex-II Pro family consists of user-programmable gate arrays for designs that are based on soft IP peripherals. The family includes multi-gigabit transceivers and PowerPC microprocessor blocks within the FPGA. It is based on a 0.13 µm CMOS nine-layer copper process. The specifications of the Xilinx FPGA on the AP1100 board used in this project are as follows [ref. 8]:

Architecture: virtex-ii pro

Device size: xc2vp100

Package: ff1704

Grade: -6

Table 2-1 shows the resources in Virtex-II Pro FPGA:

Table 2-1. Virtex-II Pro Resources [ref. 8]

Device                                  xc2vp100
PowerPC405 Processor Blocks             2
Logic Cells (1)                         99216
CLB (4 slices = max 128 bits):
    Slices                              44096
    Max Distributed RAM (Kb)            1378
18 x 18 Bit Multiplier Blocks           444
Block SelectRAM+:
    18 Kb Blocks                        444
    Max Block RAM (Kb)                  7992
DCMs (2)                                12
Maximum User I/O Pads                   1164

Notes: 1- Logic Cell includes 4-input LUT + (1) FF + Carry Logic.

2- DCM: Digital clock manager.
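The figures in Table 2-1 are related by simple arithmetic, which the sketch below checks. The slice, LUT and block counts come from the tables in this chapter; the factor of 2.25 logic cells per slice is inferred from the table values and should be read as an assumption rather than a quoted definition.

```python
SLICES = 44096
SLICES_PER_CLB = 4
LUTS_PER_SLICE = 2     # function generators F and G
BITS_PER_LUT = 16      # a 4-input LUT stores 16 bits
BRAM_BLOCKS = 444
BRAM_BLOCK_KB = 18

clbs = SLICES // SLICES_PER_CLB
bits_per_clb = SLICES_PER_CLB * LUTS_PER_SLICE * BITS_PER_LUT
distributed_ram_kb = clbs * bits_per_clb // 1024
block_ram_kb = BRAM_BLOCKS * BRAM_BLOCK_KB
logic_cells = SLICES * 9 // 4  # inferred ratio of 2.25 logic cells per slice

print("CLBs:", clbs)
print("Distributed RAM bits per CLB:", bits_per_clb)
print("Max distributed RAM (Kb):", distributed_ram_kb)
print("Max block RAM (Kb):", block_ram_kb)
print("Logic cells:", logic_cells)
```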

2.3.1 Configurable Logic Blocks

The Virtex-II Pro configurable logic blocks (CLBs) are organized in an array and are used to build combinatorial and sequential circuits. As shown in Figure 2-2, each logic block is attached to a switch matrix to access the routing resources. A CLB element consists of 4 identical slices, with fast local feedback within the logic block [ref. 8].

The four slices are split into two columns of two slices with two independent carry chains and one

common shift chain.

Figure 2-2. Virtex-II Pro CLB Element [ref. 8]

Table 2-2 summarizes the available resources in one CLB (4 slices). All of the CLBs are identical.

Table 2-2. Resources in a CLB (4 slices) [ref. 8]

Slices                          4
LUTs                            8
Flip-Flops                      8
Logic Multiplexers              8
MULT_ANDs                       8
Arithmetic & Carry Chains       2
SOP (1) Chains                  2
Distributed SelectRAM+          128 bits
Shift Registers                 128 bits
TBUF                            2

Notes: 1- SOP: Sum of products.


2.3.2 Slice Description

Each slice includes two 4-input function generators, a fast carry look-ahead chain, arithmetic logic
gates, wide function multiplexers and two storage elements. Figure 2-3 shows a general slice in
Virtex-II Pro.

The function generators F and G can each be configured as one of the following:

- 4-input look-up tables (LUTs)
- 16-bit shift registers
- 16-bit distributed SelectRAM+ memory

If the Virtex-II Pro function generators (F and G in figure 2-3) are implemented as 4-input look-up
tables (LUTs), each of them is provided with four input lines that act as the address lines.
These function generators can be used to build any arbitrary 4-input Boolean function [ref. 8].

Figure 2-3. General Slice in Virtex-II Pro [ref. 9]

As shown in figure 2-4, there are five options for the output signal of a function generator. It can:

- exit the slice (X or Y output),
- feed the dedicated XOR gate,
- feed the carry-logic multiplexer,
- feed the D input of the storage element,
- go to the MUXF5 (the multiplexers are not shown in figure 2-4).

The Virtex-II Pro slice contains multiplexers (MUXF5 and MUXFX) that, when
combined with function generators, can provide any function of five, six, seven, or eight inputs.
The MUXFX is either MUXF6, MUXF7 or MUXF8, according to the position of the slice in the logic
block [ref. 8].
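The behaviour of a function generator used as a LUT, and of two LUTs combined by a multiplexer, can be illustrated with a small software model. The following is only a sketch and not part of the thesis design: the names make_lut4, lut4_read, mux2 and parity5 are hypothetical, and the assumption that input a0 is the least significant address line is made for illustration only.

```python
def make_lut4(func):
    """Return the 16-bit truth table of a 4-input Boolean function."""
    init = 0
    for addr in range(16):
        # The four LUT inputs act as address lines a0 (LSB) to a3 (MSB).
        a0, a1, a2, a3 = ((addr >> k) & 1 for k in range(4))
        if func(a0, a1, a2, a3):
            init |= 1 << addr
    return init


def lut4_read(init, a0, a1, a2, a3):
    """Look up one bit of the truth table, as a LUT does."""
    addr = a0 | (a1 << 1) | (a2 << 2) | (a3 << 3)
    return (init >> addr) & 1


def mux2(in0, in1, sel):
    """2:1 multiplexer, standing in for MUXF5 (hypothetical model)."""
    return in1 if sel else in0


# A 5-input function (parity) built from two 4-input LUTs and one multiplexer.
F_INIT = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)      # selected when e = 0
G_INIT = make_lut4(lambda a, b, c, d: 1 ^ a ^ b ^ c ^ d)  # selected when e = 1


def parity5(a, b, c, d, e):
    f_out = lut4_read(F_INIT, a, b, c, d)
    g_out = lut4_read(G_INIT, a, b, c, d)
    return mux2(f_out, g_out, e)


print(hex(F_INIT))  # truth table of the 4-input XOR
```

The fifth input only drives the multiplexer select, which is why two 16-entry tables are sufficient for any function of five inputs.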


The storage element in a slice can be configured as either an edge-triggered D flip-flop or a level-sensitive latch.
The clock enable signal (CE) is active High by default.

Figure 2-4. Half Slice in Virtex-II Pro [ref. 8]

Note: Multiplexers MUXF5, MUXF6, MUXF7 and MUXF8 are not shown in this figure.

2.3.3 Memory Style

The XST HDL options offer two memory (RAM or ROM) styles: distributed memory, which is built
from LUTs, and on-chip block memory. The distributed style is used for the S-boxes in this project.

2.3.3.1 Distributed SelectRAM+

Each LUT, which has four address lines, can implement a 16 x 1-bit synchronous RAM called a
distributed SelectRAM+ element. Distributed SelectRAM+ elements can be configured within a
CLB as follows [ref. 8]:

- Single-Port 16x8-bit RAM

- Single-Port 32x4-bit RAM

- Single-Port 64x2-bit RAM


- Single-Port 128x1-bit RAM

- Dual-Port 16x4-bit RAM

- Dual-Port 32x2-bit RAM

- Dual-Port 64x1-bit RAM

Distributed SelectRAM+ memory modules are write-synchronous. The asynchronous read access

time is short, while the synchronous write simplifies high-speed designs. The distributed

SelectRAM+ memory and the register share the same clock signal. Table 2-3 shows the number

of LUTs (2 per slice) used in each distributed SelectRAM+ configuration.

Table 2-3. Resources Used by Distributed Memory [ref. 8]

RAM/ROM     # of LUTs
16x1S       1
16x1D       2
32x1S       2
32x1D       4
64x1S       4
64x1D       8
128x1S      8

Notes: S = single-port configuration, D = dual-port configuration.
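The pattern in table 2-3 is one LUT per 16 x 1 bits, doubled in dual-port mode. A minimal sketch of this rule follows; the function name distributed_ram_luts and its width parameter are hypothetical and introduced only for illustration, and the count covers storage LUTs only (any additional multiplexing logic is ignored).

```python
def distributed_ram_luts(depth, width=1, dual_port=False):
    """Estimate the number of storage LUTs of a distributed memory.

    Assumption: each LUT stores 16 x 1 bits and a dual-port memory
    needs a second LUT for every LUT of the single-port version.
    """
    if depth <= 0 or depth % 16 != 0:
        raise ValueError("depth must be a positive multiple of 16")
    luts_per_bit = depth // 16
    if dual_port:
        luts_per_bit *= 2
    return luts_per_bit * width


for depth, dual in [(16, False), (16, True), (32, False), (32, True),
                    (64, False), (64, True), (128, False)]:
    label = f"{depth}x1{'D' if dual else 'S'}"
    print(label, distributed_ram_luts(depth, dual_port=dual))
```

Under the same assumption, a 256 x 8 ROM such as an S-box would occupy 128 storage LUTs; this is an estimate derived from the rule above, not a synthesis result.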

In single-port configurations, synchronous writes and asynchronous reads use the same address
lines, as shown in figure 2-5. In dual-port mode, the address lines of one LUT are connected to a shared read
and write address. The second LUT uses the same address lines for the synchronous write and
a separate set of address lines for the second asynchronous read, as shown in figure 2-6 [ref. 8].

Figure 2-5. Single-port Distributed SelectRAM+ [ref. 8]

Notes: A is asynchronous read address lines, WG is synchronous write address lines.


Figure 2-6. Dual-port Distributed SelectRAM+ [ref. 8]

Notes: G is asynchronous read address lines, WG is synchronous write address lines.

2.3.3.2 Block SelectRAM+

In addition to distributed memory, Virtex-II Pro devices include a large amount of 18 Kb block
SelectRAM+ resources. The 18 Kb SelectRAM+ blocks can be cascaded to implement deeper or
wider single-port or dual-port memory. The xc2vp100 has 444 18 Kb blocks, which gives a total of 7992 Kb of
block SelectRAM+. Table 2-4 gives the supported memory configurations for single-port and dual-port
modes [ref. 8].

Table 2-4. Supported Memory Configurations for Single-port and Dual-port Modes [ref. 8]

16K x 1 bit     2K x 9 bits
8K x 2 bits     1K x 18 bits
4K x 4 bits     512 x 36 bits
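The capacity figures can be checked with simple arithmetic. The sketch below uses only the numbers given in table 2-1 and table 2-4; the constant and function names are hypothetical.

```python
BLOCK_SIZE_KBITS = 18   # size of one block SelectRAM+ resource
NUM_BLOCKS = 444        # number of 18 Kb blocks in the xc2vp100

# (depth, width) pairs of the supported configurations in table 2-4
CONFIGS = [(16 * 1024, 1), (8 * 1024, 2), (4 * 1024, 4),
           (2 * 1024, 9), (1024, 18), (512, 36)]


def config_bits(depth, width):
    """Number of bits addressed by one depth x width configuration."""
    return depth * width


total_kbits = NUM_BLOCKS * BLOCK_SIZE_KBITS
print("total block RAM (Kb):", total_kbits)
for depth, width in CONFIGS:
    print(f"{depth} x {width}: {config_bits(depth, width) // 1024} Kb")
```

The total agrees with the maximum block RAM listed in table 2-1. The configurations that are 1, 2 or 4 bits wide address 16 Kb of each block, while those that are 9, 18 or 36 bits wide address the full 18 Kb.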

Since the block SelectRAM+ resources have a regular array structure, place-and-route software takes
advantage of this feature to deliver optimum system performance and fast compile times. The
segmented routing resources are essential to guarantee the portability of IP cores and to efficiently
handle an incremental design flow that is based on modular implementations. Total design time is
reduced due to fewer and shorter design iterations.

Another feature of the block SelectRAM+ is that one optimized multiplier is associated with
each 18 Kb block SelectRAM+ resource. These 18-bit x 18-bit multipliers have the same
organization as the block SelectRAM+; they are optimized for high-speed operation and have
lower power consumption than an 18-bit x 18-bit multiplier built from slices [ref. 8].

The switching characteristics of both the distributed and the block type are given in table 2-5 and table 2-6
for comparison.

Table 2-5. Distributed RAM Switching Characteristics [ref. 8]

Description                                               -6 (speed grade)   Units
Sequential Delays
  Clock CLK to X/Y outputs (WE active) in 16 x 1 mode     1.38               ns, max
  Clock CLK to X/Y outputs (WE active) in 32 x 1 mode     1.75               ns, max
  Clock CLK to F5 output                                  1.68               ns, max
Setup and Hold Times Before/After Clock CLK
  BX/BY data inputs (DIN)                                 0.41/-0.07         ns, min
  F/G address inputs                                      0.47/0.00          ns, min
  SR input                                                0.24/0.05          ns, min
Clock CLK
  Minimum Pulse Width, High                               0.72               ns, min
  Minimum Pulse Width, Low                                0.72               ns, min
  Minimum clock period to meet address write cycle time   1.44               ns, min

Table 2-6. Block RAM Switching Characteristics [ref. 8]

Description                                               -6 (speed grade)   Units
Sequential Delays
  Clock CLK to DOUT output                                1.50               ns, max
Setup and Hold Times Before/After Clock CLK
  Address inputs                                          0.31/0.25          ns, min
  Data inputs (DIN)                                       0.23/0.25          ns, min
  EN inputs                                               0.32/0.00          ns, min
  RST input                                               0.32/0.00          ns, min
  WEN input                                               0.35/0.00          ns, min
Clock CLK
  Minimum Pulse Width, High                               1.30               ns, min
  Minimum Pulse Width, Low                                1.30               ns, min
  Minimum clock period to meet address write cycle time   2.60               ns, min

Notes: 1- A zero “0” hold time listing indicates no hold time or a negative hold time. Negative values cannot be guaranteed
“best-case”, but if a “0” is listed, there is no positive hold time.


2.3.4 FPGA Clocking

All Virtex-II Pro devices from XC2VP2 to XC2VP100 have 16 global clock buffers and support 16

global clock domains. Up to eight of these clocks can be used in any quadrant of the device by

the synchronous logic elements (that is, registers, 18Kb block RAM, pipeline multipliers) and the

IOBs. The software tools place and route these global clocks automatically.

Digital clock manager (DCM) and global clock multiplexer buffers provide a complete solution for

designing high-speed clock schemes. Up to twelve DCM blocks are available. To generate

deskewed internal or external clocks, each DCM can be used to eliminate clock distribution delay.

The DCM also provides 90-, 180-, and 270-degree phase-shifted versions of its output clocks.

Fine-grained phase shifting offers high-resolution phase adjustments in increments of 1/256 of

the clock period. Very flexible frequency synthesis provides a clock output frequency equal to a

fractional or integer multiple of the input clock frequency [ref. 8].
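To give a feeling for these numbers, the following sketch computes the ideal fine phase-shift step and a synthesized output frequency for a given input clock. It is an arithmetic illustration only: the function names and the multiply and divide parameters are hypothetical, and the limits that a real DCM places on its settings and on its achievable resolution are not modelled.

```python
from fractions import Fraction


def fine_phase_step_ps(clock_mhz):
    """Ideal phase-shift increment: 1/256 of the clock period, in picoseconds."""
    period_ps = Fraction(1_000_000) / Fraction(clock_mhz)
    return period_ps / 256


def synthesized_clock_mhz(clock_mhz, multiply, divide):
    """Output frequency as a fractional or integer multiple of the input clock."""
    return Fraction(clock_mhz) * Fraction(multiply, divide)


# Example values for an 80 MHz input clock
print(float(fine_phase_step_ps(80)), "ps per step")
print(synthesized_clock_mhz(80, 3, 2), "MHz for a ratio of 3/2")
```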

In this project the CCM core sits on the PLB bus, so it uses the PLB clock (80 MHz); a higher
frequency could be obtained by using a DCM as a frequency multiplier. The following clock
signals (extracted from the user constraint file, “system.ucf”) feed the FPGA clock pins:

fpga_opb_clk      40 MHz clock
fpga_plb_clk      80 MHz clock
fpga_125diff_n    125 MHz differential clock
fpga_125diff_p    125 MHz differential clock
v2p_20mhz_clk     20 MHz clock
v2p_25mhz_clk     25 MHz clock
lpci_v2p_clk      33/66 MHz clock


3 Security Standards

This chapter briefly describes a mode of operation, called CCM, for a symmetric key block cipher
algorithm, which is the focus of the thesis. Before delving into the details in section 3.1, the following
is a high-level introduction to CCM and the security terminology.

“In essence, a mode of operation is a technique for enhancing the effect of a cryptographic

algorithm or adapting the algorithm for an application, such as applying a block cipher to a

sequence of data blocks or a data stream.” [ref.2]

The National Institute of Standards and Technology (NIST) introduced five modes of operation in
Special Publication 800-38A to satisfy the requirements of applications that use the AES
algorithm.

CCM, which is implemented in this thesis, is an algorithm that combines two modes of operation: the
Cipher Block Chaining (CBC) mode and the Counter (CTR) mode. NIST published the CCM
mode for authentication and confidentiality in Special Publication 800-38C. Authentication
assures the recipient that the sender of the message is the source that it claims to be. Confidentiality
protects data from eavesdropping or monitoring.

The CBC and CTR modes both use the Advanced Encryption Standard (AES), a symmetric block
cipher that is used in a wide range of applications.

AES is an encryption/decryption scheme in which a block of ciphertext is produced from a block of
plaintext of the same length. AES is a symmetric algorithm, also referred to as a
single-key, secret-key or conventional algorithm, meaning that both sender and receiver use the same
key. The secret key is a value independent of the plaintext (the input of the encryption) and of the
algorithm. The AES algorithm produces different ciphertexts (scrambled messages) depending on
the secret key [ref. 2].

There are three inputs to the encryption process of CCM, namely the nonce, the associated data
and the payload. Depending on the application the nonce may be a timestamp, a counter or a
random number. The nonce is required to be non-repeating: no two distinct data pairs (payload and
associated data) may share a nonce during the lifetime of the key. The associated data is a header
that needs to be protected from modification; it is authenticated but not encrypted, so it remains readable. For
instance, in the IPSec protocol the associated data is used for data integrity in situations where
data is not secret but must be authenticated, for example where access is enforced by IPSec to
trusted computers only, or where network intrusion detection, QoS, or firewall filtering requires
traffic inspection. The payload is the actual data [ref. 1].

Typically in the CTR mode the first input block is set to a distinct value in each data transmission;
CCM uses the nonce to maintain distinctness, and the subsequent input blocks are built by
incrementing the previous input block by 1.

The CBC mode has a chained structure and applies a formatting function to the inputs, which are
the nonce, the payload and the associated data, to produce the input blocks.


The decryption process of CCM (not discussed in this project) has three inputs, namely the
nonce, the associated data and the ciphertext from the CCM encryption. Similar to the CCM
encryption, it uses both the CTR and CBC modes, both of which use the AES cipher. The AES decryption
(inverse cipher) is not used in the CCM mode of operation.

In general it is assumed that the attacker knows the encryption algorithm; consequently the
typical goal of attacking an encryption system is to recover the key. There are generally two
categories of attacks for deducing the key [ref. 2]:

- Cryptanalysis: Cryptanalytic attacks try to exploit the nature of the algorithm. In addition,
the attacker might have some information on the plaintext or ciphertext.
- Brute-force attack: Brute-force attacks try every possible key on a ciphertext to derive the
plaintext.

Once the attacker has deduced the key the effect is catastrophic: all ciphertexts encrypted
with this key can be decrypted.

The CCM mode provides confidentiality through the AES block cipher, meaning that
it is computationally infeasible to obtain the plaintext without knowing the secret key. Authentication is
provided by the scarcity of valid ciphertexts: an attacker without access to the secret key can
produce a valid ciphertext only with a small probability. Consequently any ciphertext that
passes the CCM decryption was probably generated legitimately [ref. 1].

3.1 CCM [ref. 1]

Counter with Cipher Block Chaining-Message Authentication Code (abbreviated CCM) is a mode
of operation of a block cipher algorithm that can provide assurance of the confidentiality and
authenticity of data. In this project CCM is based on the Advanced Encryption Standard (AES),
a symmetric cipher algorithm with a block size of 128 bits. The key expansion must be
run beforehand to produce the expanded key for AES. The total number of invocations
of the cipher during the lifetime of the key must be limited to 2^61 [ref. 1].

There are three inputs in CCM [ref. 1]:

- Payload, data that will be both authenticated and encrypted; Plen is the bit length of the
payload.
- Associated data, data that will be authenticated but not encrypted; Alen is the bit length of
the associated data.
- Nonce, a value that is assigned to the payload and the associated data; Nlen is the bit length of
the nonce.

CCM consists of two related functions: generation-encryption and decryption-verification. It
combines two cryptographic techniques: counter mode encryption (CTR) and Cipher Block
Chaining-Message Authentication Code (CBC-MAC). Only the forward cipher function of the AES
algorithm is used within these techniques. Because the CTR cipher outputs do not depend on the
payload, they can be computed in advance, before the input data is received.

In generation-encryption, CBC-MAC is applied to the payload, the associated data, and the nonce
to generate a message authentication code (MAC); then the CTR output is applied to the MAC
(the cryptographic checksum) and the payload to transform them into unreadable data, called
the ciphertext.

In decryption-verification, counter mode decryption is used to recover the MAC and the

corresponding payload; then cipher block chaining is applied to the recovered payload, the

received associated data, and the received nonce to verify the MAC. Successful verification

indicates that the payload and the associated data originated from a source with access to the

key.

A MAC (a cryptographic checksum that is designed to detect intentional, unauthorized

modifications of the data, as well as accidental modifications) provides stronger assurance of

authenticity than a non-cryptographic checksum or an error detecting code (that is designed to

detect only accidental modifications of the data). [ref. 1]

3.1.1 CCM Cryptographic Techniques

CCM combines two cryptographic modes that are based on forward cipher (AES cipher) [ref. 1]:

- Counter mode encryption (CTR)

- Cipher Block Chaining-Message Authentication Code (CBC-MAC)

Each mechanism applies a specific formatting to the input data (payload or associated data or

nonce) to produce the input sequence of blocks (the block size is 16 bytes).

3.1.1.1 Counter mode encryption (CTR) [ref. 1]

One mechanism in CCM is CTR, which is a confidentiality mode. The formatting function generates
the input sequence of blocks to the CTR unit, called the counter blocks (16-byte CTRi blocks).

The counter blocks must be distinct within a single invocation and across all other invocations
of CTR under any given key. Distinctness within a single invocation is guaranteed by the
counter bits, which count from 0 to m = ceil(Plen/128), one value for each 16-byte CTRi (0 <= i <= m).
Distinctness across all other invocations of CTR under a key is assured by the nonce. The nonce
must be non-repeating, meaning that any two distinct data pairs must be assigned distinct nonces, but
nonces do not need to be random. The formatting of CTRi is shown in table 3-1; all the blocks in
CTR have the same format.

Table 3-1. CTRi Formatting [ref. 1]

Byte number:   0       1 … n    n+1 … 15
Contents:      Flags   Nonce    [i]8q

Notes: - n = (the octet length of nonce) = (Nlen/8)
- i is the counter value that is unique in each block
- In [i]8q, 8q is the number of bits used to represent i in binary

The flags byte (octet 0) is the same for all CTRi blocks and has the formatting shown in table 3-2:

Table 3-2. Flags Byte [ref. 1]

Bit number:   7   6   5   4   3   2 1 0
Contents:     0   0   0   0   0   [q-1]3

Notes: - q = 15 - (the octet length of nonce) = 15 - (Nlen/8)
- In [q-1]3, 3 is the number of bits used to represent q-1 in binary
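The counter block layout of table 3-1 and table 3-2 can be sketched in a few lines of software. This is an illustrative model, not the hardware formatting unit of this project: the function name format_counter_block is hypothetical, and the restriction of the nonce length to 7 to 13 octets is taken from NIST Special Publication 800-38C rather than from the tables above.

```python
def format_counter_block(nonce: bytes, i: int) -> bytes:
    """Build the 16-byte counter block CTRi = Flags || Nonce || [i]8q."""
    n = len(nonce)                      # octet length of the nonce
    if not 7 <= n <= 13:
        raise ValueError("nonce must be 7 to 13 octets long")
    q = 15 - n                          # octets left for the counter value
    if not 0 <= i < 1 << (8 * q):
        raise ValueError("counter value does not fit in 8q bits")
    flags = q - 1                       # bits 7..3 are zero, bits 2..0 hold q-1
    return bytes([flags]) + nonce + i.to_bytes(q, "big")


nonce = bytes.fromhex("10111213141516")   # example 7-octet nonce
for i in range(3):
    print(i, format_counter_block(nonce, i).hex())
```

Only the last q octets change from one block to the next, so all counter blocks of an invocation can be generated as soon as the nonce is known.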

The CTR block diagram is shown in figure 3-1:

Figure 3-1. CTR Block Diagram

Notes: m is equal to ceil(Plen/128).

Since the CTRi blocks are generated from the nonce and the counter value only, and not from the payload,
CTR allows preprocessing: the cipher outputs can be computed in advance, before the input data arrives.


3.1.1.2 CBC-MAC [ref. 1]

The other cryptographic mechanism in CCM is CBC-MAC. It is built on CBC, a confidentiality mode whose
encryption process features combining (“chaining”) of the plaintext blocks with the previous
ciphertext blocks. CBC-MAC provides authenticity by applying CBC with an initialization vector of zero to the
data to be authenticated. The final block of the CBC-MAC output, possibly truncated, is the
cryptographic checksum (MAC) that serves as the message authentication code.

The formatting function generates the input sequence of blocks (16-byte Bi blocks) to CBC-MAC. The
formatting is done in three sections: it is applied to the nonce, the associated data and the
payload (three examples are given in appendix E).

In the first section, formatting is applied to the nonce and produces the first block (B0) for CBC-MAC,
as shown in table 3-3.

Table 3-3. Block Zero (B0) for CBC-MAC [ref. 1]

Byte number:   0       1 … n    n+1 … 15
Contents:      Flags   Nonce    [P_oct]8q

Notes: - n = (the octet length of nonce) = (Nlen/8)
- q = 15 - (the octet length of nonce) = 15 - (Nlen/8)
- P_oct is the octet length of payload that is Plen/8
- In [P_oct]8q, 8q is the number of bits used to represent P_oct in binary

The flags byte (octet 0) has the formatting given in table 3-4:

Table 3-4. Flags Byte in B0 [ref. 1]

Bit number:   7   6       5 4 3        2 1 0
Contents:     0   Adata   [(t-2)/2]3   [q-1]3

Notes: - t is the octet length of the MAC
- q = 15 - (the octet length of nonce) = 15 - (Nlen/8)
- In [(t-2)/2]3 and [q-1]3, 3 is the number of bits used to represent the value in binary
- If there is no associated data Adata will be ‘0’ otherwise ‘1’.
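The construction of block zero can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name format_b0 and its parameter names are hypothetical, and the permitted values of t (4, 6, …, 16 octets) and of the nonce length (7 to 13 octets) are taken from NIST Special Publication 800-38C.

```python
def format_b0(nonce: bytes, p_oct: int, t: int, has_adata: bool) -> bytes:
    """Build B0 = Flags || Nonce || [P_oct]8q for CBC-MAC."""
    n = len(nonce)
    if not 7 <= n <= 13:
        raise ValueError("nonce must be 7 to 13 octets long")
    if t not in (4, 6, 8, 10, 12, 14, 16):
        raise ValueError("MAC length t must be 4, 6, ..., 16 octets")
    q = 15 - n
    if not 0 <= p_oct < 1 << (8 * q):
        raise ValueError("payload length does not fit in 8q bits")
    flags = (0x40 if has_adata else 0x00)   # bit 6: Adata
    flags |= ((t - 2) // 2) << 3            # bits 5..3: [(t-2)/2]3
    flags |= q - 1                          # bits 2..0: [q-1]3
    return bytes([flags]) + nonce + p_oct.to_bytes(q, "big")


nonce = bytes.fromhex("10111213141516")
print(format_b0(nonce, p_oct=4, t=4, has_adata=True).hex())
```

With a 7-octet nonce, a 4-octet payload, a 4-octet MAC and associated data present, the flags byte evaluates to 0x4f.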

In the second section, formatting is applied to the associated data and produces B1, B2, …, Bu. If
there is no associated data then no block is produced; otherwise the following rules are
applied (A_oct is the octet length of the associated data) [ref. 1]:

- If 0 < A_oct < 2^16 - 2^8, then [A_oct]16 forms the first 16 bits of the first
block (B1).
- If 2^16 - 2^8 <= A_oct < 2^32, then the first 48 bits of the first block (B1) are
0xff || 0xfe || [A_oct]32.
- If 2^32 <= A_oct < 2^64, then the first 80 bits of the first block (B1) are
0xff || 0xff || [A_oct]64.

These formatting rules ensure that the three cases do not overlap. The associated data
follows the encoded length. The resulting bit string is then followed by the minimum
number of zeros such that it can be partitioned into 16-byte blocks.
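The three length-encoding rules and the zero padding can be sketched as shown below. The function name format_associated_data is hypothetical; the sketch follows the encoding of NIST Special Publication 800-38C, in which the encoded length precedes the associated data.

```python
def format_associated_data(adata: bytes) -> bytes:
    """Return the blocks B1..Bu (concatenated) for the associated data."""
    a_oct = len(adata)
    if a_oct == 0:
        return b""                                   # no block is produced
    if a_oct < 2**16 - 2**8:
        prefix = a_oct.to_bytes(2, "big")            # [A_oct]16
    elif a_oct < 2**32:
        prefix = b"\xff\xfe" + a_oct.to_bytes(4, "big")   # 0xff || 0xfe || [A_oct]32
    else:
        prefix = b"\xff\xff" + a_oct.to_bytes(8, "big")   # 0xff || 0xff || [A_oct]64
    data = prefix + adata
    padding = bytes(-len(data) % 16)                 # minimum number of zero octets
    return data + padding


adata = bytes.fromhex("0001020304050607")
print(format_associated_data(adata).hex())
```

The first two octets of a short encoding can never be 0xff 0xfe or 0xff 0xff, because A_oct is below 2^16 - 2^8 in that case; this is why the three cases cannot be confused.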

In the third section, formatting is applied to the payload and produces B(u+1), B(u+2), …, Br, where
r = u + ceil(Plen/128); Plen is the bit length of the payload. To form the blocks, the payload is
followed by the minimum number of zeros such that the resulting string can be partitioned into
16-byte blocks.

Figure 3-2 shows how the CBC-MAC mechanism processes the formatted blocks: with Y0 = CIPH(B0) and
Yi = CIPH(Bi XOR Y(i-1)) for i = 1, …, r, where CIPH is the forward cipher, the last result is Yr.
The MAC (a cryptographic checksum on data that is designed to reveal both accidental errors and
intentional modifications of the data) is the Tlen most significant bits of the last result in CBC-MAC:

MAC = MSB_Tlen(Yr); MSB_Tlen(X) denotes the Tlen most significant bits of X, and Tlen is the bit length of the MAC.

Let the Si blocks be the CTR outputs shown in figure 3-1, and define the signal S as the result of the
following concatenation:

S = S1 || S2 || … || Sm; where m is ceil(Plen/128).

The final result of CCM is defined as follows:

CCM final result = ( Payload XOR MSB_Plen(S) ) || ( MAC XOR MSB_Tlen(S0) ).
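The chaining and the final combination can be sketched in software as follows. This is a structural illustration only and not the implementation of this project: the forward cipher is replaced by a hash-based stand-in (toy_block_cipher, which is NOT AES and not secure), all function names are hypothetical, and the formatted blocks B0 … Br and the counter blocks CTR0 … CTRm are assumed to have been produced beforehand.

```python
import hashlib


def toy_block_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES forward cipher (NOT AES): 16 bytes in, 16 bytes out."""
    return hashlib.sha256(key + block).digest()[:16]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings; the result is as long as the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_mac(key, formatted, tlen_octets, cipher=toy_block_cipher):
    """Chain the formatted blocks B0..Br and return the leading Tlen bits of Yr."""
    if len(formatted) == 0 or len(formatted) % 16 != 0:
        raise ValueError("formatted data must be a whole number of 16-byte blocks")
    y = bytes(16)                                   # initialization vector of zero
    for pos in range(0, len(formatted), 16):
        y = cipher(key, xor_bytes(y, formatted[pos:pos + 16]))
    return y[:tlen_octets]


def ccm_generate_encrypt(key, formatted, counter_blocks, payload, tlen_octets,
                         cipher=toy_block_cipher):
    """Return (Payload XOR MSB_Plen(S)) || (MAC XOR MSB_Tlen(S0))."""
    mac = cbc_mac(key, formatted, tlen_octets, cipher)
    s = [cipher(key, ctr) for ctr in counter_blocks]    # S0, S1, ..., Sm
    stream = b"".join(s[1:])                            # S = S1 || ... || Sm
    if len(stream) < len(payload):
        raise ValueError("not enough counter blocks for the payload")
    return xor_bytes(payload, stream) + xor_bytes(mac, s[0])
```

Because xor_bytes truncates to the shorter operand, XORing the payload with S keeps exactly the Plen most significant bits of S, and XORing the MAC with S0 keeps the Tlen most significant bits of S0. Replacing toy_block_cipher with a real AES forward cipher would be required before comparing against published test vectors.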


Figure 3-2. CBC-MAC Block Diagram

3.1.2 CCM Security Assurance

CCM provides authentication assurance through the scarcity of valid ciphertexts, meaning that an attacker
without access to the key cannot easily generate a valid ciphertext. The result of the
decryption-verification process is either an error indication (invalid) or the valid plaintext. However,
this assurance is not absolute: an attacker can produce a ciphertext that is accepted as valid
with a certain probability.

The length of the MAC (the Tlen parameter) can be set to control the probability of
accepting inauthentic data as authentic. Larger values of Tlen provide more security
but come at the price of larger bandwidth [ref. 1].

3.2 Advanced Encryption Standard (AES)

The Advanced Encryption Standard (AES) was published by NIST (National Institute of Standards
and Technology) in 2001. AES is a symmetric block cipher that is intended to replace DES as
the approved standard for a wide range of applications. AES takes a 128-bit block as the input data
(plaintext), has a key size of 128, 192, or 256 bits and produces a 128-bit block as the output
data (ciphertext). It has no known security attacks, but it has been criticized because its mathematical
structure may lead to attacks [ref. 2].

One of the main features of AES is simplicity, which is achieved by symmetry at different levels and
by the choice of basic operations. The symmetry comes from the fact that AES encrypts the 128-bit
plaintext by repeatedly applying the same round transformation; the number of rounds depends on
the key size, as shown in table 3-5.

Table 3-5. Parameters Dependent on Key Size [ref. 8]

Key size (bits)                       128      192      256
Number of rounds                      10       12       14
Key expansion result (words/bytes)    44/176   52/208   60/240
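The entries of table 3-5 follow a simple pattern: with a key of Nk 32-bit words, the number of rounds is Nk + 6, and the expanded key holds four words for the initial key addition plus four words for every round. A short sketch that reproduces the table (the function name aes_parameters is hypothetical):

```python
def aes_parameters(key_bits):
    """Return (rounds, expanded key words, expanded key bytes) for an AES key size."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES key size must be 128, 192 or 256 bits")
    nk = key_bits // 32            # key length in 32-bit words
    rounds = nk + 6
    words = 4 * (rounds + 1)       # 4 words per round plus 4 for the initial key addition
    return rounds, words, 4 * words


for bits in (128, 192, 256):
    print(bits, aes_parameters(bits))
```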


This project uses only the AES forward cipher operation, with a 128-bit key.

3.2.1 AES Cipher

The number of rounds executed in the AES cipher depends on the key size. The 128-bit input
block (plaintext) is arranged in a 4x4 matrix of bytes (this matrix is
called the state) and is modified in each round. For a 128-bit key there are 10 rounds,
as shown in figure 3-3. The input key is expanded into an array of forty-four 32-bit words. The key
expansion must be done before the cipher operation, and each round uses 4 words of the
expanded key [ref. 2].

Each round consists of four stages as follows (the final round omits MixColumns) [ref. 8]:

- SubBytes substitutes each byte in the state.
- ShiftRows shifts each row by an offset.
- MixColumns is a column-wise operation over GF(2^8).
- AddRoundKey is a bitwise XOR of the current state with a portion of the expanded key (4
words of the expanded key).


Figure 3-3. Forward Cipher Operation [ref. 2]

SubBytes

SubBytes is the only non-linear function in AES. All four stages together provide
confusion, diffusion and non-linearity.

The SubBytes operation uses a 16x16 matrix of bytes called the S-box (given in appendix D) that contains
a permutation of all possible 8-bit values. The content of the table can be computed by a
finite-field inversion over GF(2^8) followed by an affine transformation. Each byte of the state is mapped
into a byte from the S-box: the 4 leftmost bits are used as the row index while the 4 rightmost bits
are used as the column index.

The S-box is designed to be resistant to known cryptanalytic attacks. Specifically, it has a low
correlation between input bits and output bits, and the property that the output cannot be described as
a simple mathematical function of the input [ref. 2].
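The two-step construction of the S-box (inversion in GF(2^8), then an affine transformation) can be sketched as follows. This is a software illustration, not the distributed-memory implementation used in this project; the function names are hypothetical, and the inverse is found by exhaustive search purely for clarity.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(x):
    """Multiplicative inverse in GF(2^8); 0 is mapped to 0 by convention."""
    if x == 0:
        return 0
    return next(y for y in range(1, 256) if gf_mul(x, y) == 1)


def sbox_entry(x):
    """Inverse followed by the AES affine transformation with constant 0x63."""
    inv = gf_inverse(x)
    out = inv
    for k in range(1, 5):
        out ^= ((inv << k) | (inv >> (8 - k))) & 0xFF   # rotate left by k bits
    return out ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def sub_byte(value):
    """Look up a byte using the 4 leftmost bits as row and the 4 rightmost as column."""
    row, col = value >> 4, value & 0x0F
    return SBOX[16 * row + col]


print(hex(sub_byte(0x53)))
```

The generated table can be compared entry by entry with the S-box listed in appendix D.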

ShiftRows

ShiftRows is a byte-wise circular left shift by an offset that equals the row index. The first row
(row number 0) is not changed. The second row (row number 1) is circularly left-shifted by one
byte. For the third row (row number 2) a 2-byte circular left shift is performed. For the fourth row
(row number 3) a 3-byte circular left shift is performed. Since the MixColumns and AddRoundKey
operations are done column by column, ShiftRows ensures that the 4 bytes of one column are spread
out to four different columns [ref. 2].
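As a small illustration, ShiftRows can be written as a rotation of each row by its index. The sketch assumes the state is stored as a list of four rows; the function name shift_rows is hypothetical.

```python
def shift_rows(state):
    """Circularly left-shift row r of the 4x4 state by r bytes."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
for row in shift_rows(state):
    print(row)
```

In this example the bytes 0, 4, 8 and 12 of the first input column end up in four different output columns.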

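MixColumns

The MixColumns function operates on the state column by column; each byte of the column is
mapped into a new value that is a function of all four bytes in that column as follows:

\[
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\
s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\
s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\
s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3}
\end{bmatrix}
\]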

The matrix multiplication is done in GF(2^8), meaning that addition is the bitwise XOR and
multiplication is polynomial multiplication modulo x^8 + x^4 + x^3 + x + 1. MixColumns can be
implemented in different ways, depending on the platform, to obtain maximum
efficiency. The MixColumns operation ensures a good mixing among the bytes of each column.
ShiftRows and MixColumns together ensure that after executing the rounds all output bits depend
on all input bits [ref. 2].
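One common way to implement the multiplication by 02 and 03 is a shift followed by a conditional XOR with the reduction polynomial. The sketch below applies the matrix to a single column; it illustrates the arithmetic and is not necessarily the structure used in the hardware of this project. The names xtime and mix_column are conventional but are introduced here as assumptions.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a


def mix_column(col):
    """Apply the MixColumns matrix to one column (s0, s1, s2, s3)."""
    s0, s1, s2, s3 = col
    mul3 = lambda b: xtime(b) ^ b        # 03 * b = (02 * b) XOR b
    return [
        xtime(s0) ^ mul3(s1) ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ mul3(s2) ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ mul3(s3),
        mul3(s0) ^ s1 ^ s2 ^ xtime(s3),
    ]


print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```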

AddRoundKey

The AddRoundKey operation is designed to be as simple as possible: all 128 bits of the state are XORed
with 4 words (128 bits) of the expanded key resulting from the key expansion. AddRoundKey is the only
operation that uses the key to ensure security. Key expansion is based on the SubBytes
operation in addition to some simple byte-level operations [ref. 2].

3.2.2 Key expansion

The key expansion operation takes the 16-byte input key and produces a 44-word expanded
key array; each round of the AES cipher uses 4 words of that array. The AES
developers designed the key expansion to be resistant to known cryptanalytic attacks. It is
designed so that each key bit affects many round key bits.

The first 4 words of the output array are the 16-byte input key. Every other word whose
index is not a multiple of 4 is made simply by XORing the preceding word with
the word four positions back, as shown in figure 3-4. For a word whose index is a multiple of
four, the preceding word first goes through a more complex function (called function g) before
it is XORed with the word four positions back.


Figure 3-4. Key Expansion [ref. 2]

Notes: Kis are bytes and Wis are words

The function g takes the preceding word and performs a one-byte circular left shift; it then performs
SubBytes on each byte of the shifted result. In the last step it XORs the substituted word
with a round constant word “RC(i), 0, 0, 0”; RC(i) is given in hexadecimal in table 3-6 for each
round.

Table 3-6. Round Constant Bytes, RC in Hexadecimal [ref. 2]

i (round number)   1    2    3    4    5    6    7    8    9    10
RC(i)              01   02   04   08   10   20   40   80   1B   36

The purpose of using round constants is to eliminate symmetries and similarities in making the 4-word expanded key for each round.
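The key expansion for a 128-bit key can be sketched in software as shown below. This is an illustrative model, not the implementation of this project: the function names are hypothetical, and the S-box is recomputed from its mathematical definition so that the example is self-contained.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    """Inverse in GF(2^8) (by search) followed by the AES affine transformation."""
    inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
    out = inv
    for k in range(1, 5):
        out ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
    return out ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]
RC = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def g(word, round_number):
    """Rotate left by one byte, substitute each byte, XOR RC(i) into the first byte."""
    rotated = word[1:] + word[:1]
    substituted = [SBOX[b] for b in rotated]
    substituted[0] ^= RC[round_number - 1]
    return substituted


def expand_key_128(key: bytes):
    """Expand a 16-byte key into 44 words, each a list of 4 bytes."""
    if len(key) != 16:
        raise ValueError("key must be 16 bytes")
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = g(temp, i // 4)
        w.append([a ^ b for a, b in zip(w[i - 4], temp)])
    return w


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")   # example key from FIPS 197
words = expand_key_128(key)
print(bytes(words[4]).hex(), bytes(words[43]).hex())
```

Since the expansion depends only on the key, it needs to be run once per key and its result can be stored for all subsequent cipher invocations.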


4 Design and Analysis of CCM in SoC

This chapter describes the implementation of the CCM mode in an IP-based platform. It begins with a
top-level description of the design and its communication with the other main SoC modules, and then goes
through the details of each module that makes up CCM. It then compares the results with previous
research on software and FPGA implementations of the AES cipher. The remainder of the chapter
describes testing and debugging, and provides some useful hints and solutions to the
practical problems with the software tools that were encountered in this project.

4.1 Security design Objective

Security algorithms can be implemented in software, hardware or a combination of both,
trading off speed, area and power consumption. In general, software implementations tend to
have lower throughput than hardware implementations, since the processor may not have an
efficient instruction set or operand size for a particular algorithm. On the other hand, they are more
cost-efficient and flexible than hardware solutions.

Since throughput is one of the objectives of this project, the security algorithm is implemented
on the FPGA.

Another goal is to keep the SoC design flexible for future modifications. The design can be
split such that the key expansion runs on the microprocessor and the cipher runs
on the FPGA. Since the key expansion needs to be executed only once in the lifetime of a key, it has little
impact on overall performance. At a higher level, the design offers options to communicate with other
devices that might be added in the future.

4.2 High Level Design Architecture

The underlying foundation used in this project is the baseline provided by CMC as the system-on-chip platform (CMC is a federally incorporated non-profit corporation that provides microsystems researchers with industry-calibre design resources, access to state-of-the-art manufacturing technologies, and support services).

The CCM core is connected to the PLB bus as a slave module (shown in figure 4-1) through an IP interface (abbreviated as IPIF) and uses the address space 0x40000000-0x400001FF. It is clocked by the 80 MHz PLB clock. Besides standard functions such as address decoding, the IPIF module offers other commonly used services, namely [ref. 10]:

- S/W reset and Module information register (RST/MIR)

- Burst and cacheline transaction support


- DMA

- FIFO

- User logic interrupt support

- User logic S/W register support

- User logic master support

- User logic address range support

Two of the above services have been used in this project:

- S/W reset and Module information register (RST/MIR)

- User logic S/W register support (explained in section 4.2.1)

Figure 4-1. Baseline Block Diagram with CCM Added as Part of the System [ref. 4]

Notes:

The PLB2OPB bridge shown in the figure functions as a slave on the PLB side and a master on the OPB side.

The software reset allows individual peripherals to be reset from the software application. The

peripheral has a special write-only address. When a specific word is written to this address, the


IPIF generates a reset signal for the peripheral (table 4-1). The peripheral resets itself using this

signal.

Table 4-1. IPIF Software Reset Register Description [ref. 10]

Bits    Core Access    Register's address            Description
0-31    Write          C_BASEADDR(1) + 0x00000100    “0x0000000A” generates a reset

Notes:

(1) C_BASEADDR is the address of the IPIF sitting on the PLB bus.

The inputs reach the main core, ccm_core.vhd, through the slave registers. Since the design is based on an active-high reset, generating a software reset causes all the slave registers to be set to ‘1’ accordingly.

4.2.1 User Logic S/W Register Support

The User logic S/W register service provides up to 32 registers, each 64 bits wide (the same width as the PLB bus). Selecting the same data width as the PLB bus is more efficient, since a data width narrower than the target bus results in more resource usage due to byte steering logic [ref. 10].

The inputs are read asynchronously and are stored in a write-only address space, whereas the outputs are written synchronously and are stored in a read-only address space. The input and the output data share the same addresses: when something is written to that address space it is stored in the write-only input registers, and when something is read from that address space it is read from the output registers. Sharing the addresses in this way provides 256 bytes for the inputs and 256 bytes for the outputs.

The inputs K (the cipher key), N (the nonce), A (the associated data string) and P (the payload) should be written sequentially starting at C_BASEADDR + 0x00000000. The maximum overall length allowed is 255 bytes, because the last byte is used as the control byte for triggering the circuit.

The module is triggered through two reset signals: one goes to the key expansion module and the other goes to CCM. CCM should not be triggered before the key expansion output is valid; the two can, however, be triggered at the same time. The resets are active-high and are located at the following addresses:

Key expansion reset: C_BASEADDR+ 0x00000002 (second bit position from right)

CCM reset: C_BASEADDR+ 0x00000001 (first bit from right)

The following example shows how to trigger the circuit through the XMD console; the generics are Klen=128, Tlen=32, Nlen=56, Alen=64, Plen=32.

Figure 4-2. XMD Window Showing How to Trigger Key Expansion and CCM

mwr 0x400000ff 0x00000003 -- activates the reset for both key expansion and CCM.

mrd 0x40000000 4 -- reads four output words.

mwr 0x40000000 {0x40 0x41…0x23} 35 b -- writes the inputs “K” (16 bytes), “N” (7 bytes), “A” (8 bytes) and “P” (4 bytes); the letter b indicates byte.

mwr 0x400000ff 0x00000000 -- deactivates the reset for both key expansion and CCM, which triggers the computation.

mrd 0x40000000 2 -- displays the output.
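The shared-address behaviour of the slave registers and the trigger sequence can be modelled in a few lines of Python. This is a software sketch only, under the assumption that the two reset bits live in the control byte at offset 0xFF (as the XMD commands suggest); the class name SharedAddressRegisters and its methods are hypothetical, and word size and byte order on the bus are ignored.

```python
BASE = 0x40000000           # C_BASEADDR of the CCM peripheral in this project
BANK_SIZE = 256             # bytes for the inputs and, separately, for the outputs
CONTROL_OFFSET = 0xFF       # last input byte, used as the control byte
CCM_RESET = 0x01            # first bit from the right
KEY_EXPANSION_RESET = 0x02  # second bit from the right


class SharedAddressRegisters:
    """Hypothetical model: writes land in the input bank, reads come from the output bank."""

    def __init__(self):
        self.inputs = bytearray(BANK_SIZE)
        self.outputs = bytearray(BANK_SIZE)

    def write(self, address, data):
        offset = address - BASE
        if offset < 0 or offset + len(data) > BANK_SIZE:
            raise ValueError("write outside the input register bank")
        self.inputs[offset:offset + len(data)] = data

    def read(self, address, length):
        offset = address - BASE
        if offset < 0 or offset + length > BANK_SIZE:
            raise ValueError("read outside the output register bank")
        return bytes(self.outputs[offset:offset + length])

    def resets_active(self):
        """Return (key expansion reset, CCM reset) as seen in the control byte."""
        control = self.inputs[CONTROL_OFFSET]
        return bool(control & KEY_EXPANSION_RESET), bool(control & CCM_RESET)


regs = SharedAddressRegisters()
regs.write(BASE + CONTROL_OFFSET, bytes([KEY_EXPANSION_RESET | CCM_RESET]))  # hold both in reset
payload = bytes(range(35))   # stand-in for K, N, A and P (35 bytes in total)
regs.write(BASE, payload)
held = regs.resets_active()
regs.write(BASE + CONTROL_OFFSET, bytes([0x00]))                             # release both resets
released = regs.resets_active()
```

Because the two banks are separate, reading address 0x40000000 after the write above returns the (still empty) output bank and not the bytes that were just written.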

4.2.2 Memory Map of PowerPC

The entire 4 GB memory space is shown in figure 4-3, with the lower 1.25 GB zoomed in. The map shows the default baseline address decoding employed on the AP1100. The AP1100 Baseline Platform supports 64 MB of SDRAM. This physical memory is aliased throughout the 512 MB range shown in the memory map. The lower and upper 1 MB portions within this space are used to store the U-Boot code, and care must be taken to ensure this area of memory is not overwritten. When downloading data or other software to the AP1100, these lower and upper 1 MB portions of SDRAM are best avoided [ref. 4].


Figure 4-3. Memory Map of PowerPC [ref. 4]

Notes: 1- Blue section indicates that these devices reside on the Local Bus. The PowerPC can access them through the

PLB2OPB and OPB_EXT bridges.

2- Yellow section indicates that these devices reside on the OPB. The PowerPC can access them through the PLB2OPB

bridge.

In this project CCM is a slave module that uses the address range 0x40000000-0x400001FF, taken from the unused space (0x40000000-0x4B000000) shown in figure 4-3. Since it sits on the same PLB bus as the PowerPC, it can communicate with it directly.

There are other options for CCM: it could be a slave on the OPB bus, or a master on the PLB or OPB bus.

It could be designed to communicate with the DDR SDRAM in case a large memory is needed for the result. In this case CCM should act as a master on the bus. If it sits as a master on the OPB bus, it needs an OPB2PLB bridge that is a slave on the OPB side and a master on the PLB side.


4.3 CCM Implementation and Analysis

This section explains the CCM design at the RTL level and analyzes some details of the synthesis report. The building blocks of CCM (ccm_core.vhd) are the cipher module, the key expansion module and a control unit. CCM also needs a formatting procedure on the inputs, which is done in the package datatypes.vhd. The source VHDL files are provided in appendix F.

In the Xilinx Synthesis Tool (XST), the optimization goal is set to speed, the optimization effort is set to normal, RAM/ROM extraction is activated and the RAM/ROM style is set to auto.

The RTL schematic of CBC-MAC without the control unit is shown in figure 4-4.

Figure 4-4. CBC-MAC Schematic

The CTR mode duplicates the AES cipher n times, where $n = \lceil Plen/128 \rceil + 1$. Designs with Plen equal to 32, 128 and 192 have been configured on the FPGA; among those cases, $\lceil 192/128 \rceil + 1 = 3$ was the maximum number of AES ciphers that needed to be built for CTR.
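The number of cipher copies follows from the CCM specification: one AES execution per 128-bit payload block plus one for the first counter block, whose output masks the MAC. A minimal Python sketch of this count for the three configured payload lengths is given below; the function name ctr_cipher_instances is chosen here for illustration.

```python
import math

BLOCK_BITS = 128  # AES block size


def ctr_cipher_instances(plen_bits):
    """Number of AES cipher copies built for CTR: ceil(Plen / 128) + 1."""
    return math.ceil(plen_bits / BLOCK_BITS) + 1


instances = {plen: ctr_cipher_instances(plen) for plen in (32, 128, 192)}
```

For Plen = 32, 128 and 192 this gives 2, 2 and 3 copies respectively, so the 192-bit case determines the maximum.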

4.3.1 Key Expansion and Synthesis Analysis

The key expansion RTL schematic that the VHDL code is based on is shown in figure 4-5. The control unit, which essentially uses a counter and drives the select lines of the multiplexers and other control lines (register and output enables etc.), is not shown in this figure.


Figure 4-5. Key Expansion RTL Schematic

As shown in the RTL schematic and the synthesis report (appendix C), there are 4 ROMs (16x128-bit ROM) and 4 multiplexers (8-bit 16-to-1 multiplexer). After synthesis, these resources make up the S-boxes shown in figure 4-6 (select lines and address lines are not shown in this figure). The 4 leftmost bits of the input byte select one row (128 bits) of the S-box; the multiplexer then selects 8 of the 128 bits coming from the ROM, using the 4 rightmost bits as its select signals to choose the column.

Figure 4-6. S-box After Synthesis
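The row/column organisation of the synthesized S-box can be mimicked in software. The Python sketch below is an illustration only: it computes the S-box values from their mathematical definition, packs them into sixteen 128-bit rows and then selects one byte with a 16-to-1 multiplexer function. The byte order inside a row (column 0 in the most significant byte) and all function names are assumptions made here, not taken from the synthesis result.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_value(a):
    """AES S-box entry: inverse in GF(2^8) followed by the affine transformation."""
    inv = 0
    if a:
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, a)
    rot = lambda x, n: ((x << n) | (x >> (8 - n))) & 0xFF
    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63


SBOX = [sbox_value(a) for a in range(256)]

# 16x128-bit ROM: row r packs S-box entries r*16 .. r*16+15, column 0 in the top byte.
ROM = [int.from_bytes(bytes(SBOX[r * 16:(r + 1) * 16]), "big") for r in range(16)]


def mux16(row_word, select):
    """8-bit 16-to-1 multiplexer: pick byte number `select` out of a 128-bit row."""
    return (row_word >> (8 * (15 - select))) & 0xFF


def sbox_lookup(byte):
    row = ROM[byte >> 4]              # 4 leftmost bits address the ROM
    return mux16(row, byte & 0x0F)    # 4 rightmost bits drive the multiplexer
```

The lookup through ROM and multiplexer returns the same value as a direct 256-entry table for every input byte, which is the property the synthesized structure relies on.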

The other multiplexer in the synthesis report (8-bit 11-to-1 multiplexer) selects the round constant, using i/4 as its select signal.

Regarding timing, it takes 39 clock cycles from the time the synchronous reset goes inactive, as sensed on the falling edge of the clock, until the key expansion output (43 words) is produced.


4.3.2 Cipher Module and Synthesis Analysis

The cipher RTL schematic that the VHDL code is based on is shown in figure 4-7. The control unit, which essentially consists of a counter and drives the select lines of the multiplexers and other control lines (register and output enables etc.), is not shown in this figure.

Figure 4-7. Cipher RTL Schematic

The file mix.vhd, which performs the MixColumns operation, is combinational, while add.vhd is sequential and feeds the feedback registers (the source VHDL files are provided in appendix F).

The MixColumns operation used in this project is based on ROMs; the transformation is given below [ref. 2]:

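\[
\begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix}
=
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} b_{0,j} \\ b_{1,j} \\ b_{2,j} \\ b_{3,j} \end{bmatrix}
=
\begin{bmatrix} 02 \\ 01 \\ 01 \\ 03 \end{bmatrix} b_{0,j}
\oplus
\begin{bmatrix} 03 \\ 02 \\ 01 \\ 01 \end{bmatrix} b_{1,j}
\oplus
\begin{bmatrix} 01 \\ 03 \\ 02 \\ 01 \end{bmatrix} b_{2,j}
\oplus
\begin{bmatrix} 01 \\ 01 \\ 03 \\ 02 \end{bmatrix} b_{3,j}
\]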

Each multiplication is implemented in a ROM (256x32-bit) that takes a byte value $b_{i,j}$ as input and returns a column (32-bit vector). These MixColumns tables are highly optimized automatically by the synthesis tool (XST).
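The ROM-based formulation can be illustrated with a short Python sketch that builds one 256x32-bit table per column vector of the matrix and XORs the four lookups. It is a software analogue for one state column, not the VHDL in mix.vhd; the names column_rom, ROMS and mix_column are introduced here for illustration.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8)."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def column_rom(coefficients):
    """Build a 256x32-bit table: byte b -> (c0*b, c1*b, c2*b, c3*b) packed into one word."""
    def mul(c, b):
        return {1: b, 2: xtime(b), 3: xtime(b) ^ b}[c]
    table = []
    for b in range(256):
        word = 0
        for c in coefficients:
            word = (word << 8) | mul(c, b)
        table.append(word)
    return table


# One ROM per matrix column, matching the four column vectors in the equation above.
ROMS = [
    column_rom((2, 1, 1, 3)),   # multiplies b0
    column_rom((3, 2, 1, 1)),   # multiplies b1
    column_rom((1, 3, 2, 1)),   # multiplies b2
    column_rom((1, 1, 3, 2)),   # multiplies b3
]


def mix_column(b0, b1, b2, b3):
    """MixColumns on one column as the XOR of four table lookups."""
    word = ROMS[0][b0] ^ ROMS[1][b1] ^ ROMS[2][b2] ^ ROMS[3][b3]
    return tuple((word >> shift) & 0xFF for shift in (24, 16, 8, 0))
```

Processing the four columns of the state in parallel needs four such sets of tables, which corresponds to the 16 ROMs reported by the synthesis tool.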


As shown in the synthesis report (appendix A), there are 16 ROMs (16x128-bit ROM) and 16 multiplexers (8-bit 16-to-1 multiplexer). These make up the 16 S-boxes, in the same way as explained for key expansion.

There are also 16 ROMs (256x32-bit ROM, appendix A) used for the MixColumns operation, according to the VHDL code. A rough calculation of the number of 4-input LUTs needed for the MixColumns ROMs gives the following:

A single 256x1-bit ROM needs 16 4-input LUTs plus one 16-to-1 multiplexer.

One 256x32-bit ROM therefore needs 16*32=512 LUTs.

As a result, 16 ROMs (256x32-bit) need 512*16=8192 LUTs.

The number of LUTs shown in the synthesis report (MixColumns report) is as follows:

# LUT2 (2-input LUT) : 80

# LUT4 (4-input LUT) : 176

This is much lower than the calculated value, because the ROMs for the MixColumns operation can be optimized extensively; the optimization is done automatically by XST.
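The comparison between the rough estimate and the reported figures can be checked with a few lines of Python. The sketch only repeats the arithmetic of the text (multiplexers are ignored, as in the estimate); the variable names are chosen here.

```python
LUT4_PER_256X1_ROM = 16      # each 4-input LUT stores 16 bits
ROM_WIDTH_BITS = 32
ROM_COUNT = 16

naive_luts_per_rom = LUT4_PER_256X1_ROM * ROM_WIDTH_BITS
naive_luts_total = naive_luts_per_rom * ROM_COUNT

reported_luts = {"LUT2": 80, "LUT4": 176}   # MixColumns synthesis report
reported_total = sum(reported_luts.values())
reduction_factor = naive_luts_total / reported_total
```

The reported 256 LUTs are a factor of 32 below the unoptimized estimate of 8192.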

Regarding timing, it takes 10 clock cycles from the time the synchronous reset goes inactive, as sensed on the falling edge of the clock, until the output (16 bytes) is produced. The key expansion results must be valid in advance for the cipher unit to produce valid output data.

4.3.3 Comparison with Previous Research

Since the main underlying block cipher algorithm in CCM is AES with a 128-bit key, this section describes some previous implementations of AES on microprocessors and FPGA platforms.

Software implementations process data in different word sizes on different platforms; they are based on 8-bit, 16-bit or 32-bit architectures, depending on the microprocessor.

AES hardware implementations use wider data paths than software in order to gain higher throughput. A 128-bit implementation gives the highest throughput, in the Gigabit range, since it offers the greatest degree of parallelism in the computations.

There are other techniques to increase throughput: some implementations unroll the rounds, while others use pipelining inside the round. Nevertheless, these techniques are not applicable to all modes of operation. For instance, CBC-MAC mode (described in chapter 3.1.1.2) cannot fully exploit unrolling with pipelining because of its feedback structure. Since the result of the previous encryption is needed as the input to the next step, the pipeline stalls, so the performance gain is small and hardware resources are wasted.
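The effect of the feedback on a pipelined cipher can be made concrete with a simple cycle-count model in Python. The model is an assumption-laden sketch: it supposes one pipeline stage per AES-128 round and ignores control overhead, and the function name pipelined_cycles is hypothetical.

```python
def pipelined_cycles(blocks, stages, feedback):
    """Clock cycles to process `blocks` blocks on a `stages`-deep pipelined cipher.

    With feedback (CBC-MAC) each block must wait for the previous result, so the
    pipeline holds one block at a time. Without feedback (for example CTR) a new
    block can enter every cycle once the pipeline is filled.
    """
    if blocks == 0:
        return 0
    if feedback:
        return blocks * stages
    return stages + (blocks - 1)


STAGES = 10  # assumption: one pipeline stage per AES-128 round
cbc_mac = pipelined_cycles(8, STAGES, feedback=True)
ctr = pipelined_cycles(8, STAGES, feedback=False)
```

Under these assumptions eight blocks take 80 cycles with feedback and 17 cycles without, so in CBC-MAC nine of the ten stages sit idle at any time.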

All these techniques for increasing throughput come at a price. Using wider data paths in the architecture results in larger circuits. For instance, an 8-bit architecture needs one S-box, while a 128-bit architecture uses 16 S-boxes to provide fully parallel processing (the S-boxes account for a large portion of the circuit area and are the most area-consuming parts of the AES implementation). As another example, for a 128-bit key the fully unrolled implementation uses roughly ten times more hardware resources than the iterative implementation [ref. 11].

4.3.3.1 Microprocessor Implementation

As mentioned in the introduction chapter, software implementations are generally slower than hardware, if not in clock frequency then in throughput, because they need a high number of clock cycles to produce the result. This is mainly because they lack instructions for modular arithmetic operations on long operands (128-bit operands in a fully concurrent implementation of AES). Their throughput is usually in the Megabit range.

However, previous research has produced some techniques that lead to more efficient software implementations.

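There are different ways of performing MixColumns. After multiplying the two matrices in $GF(2^8)$ (described in 3.2.1), MixColumns is expressed as

\[
\begin{aligned}
s'_{0,j} &= 2 \cdot s_{0,j} \oplus 3 \cdot s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{1,j} &= s_{0,j} \oplus 2 \cdot s_{1,j} \oplus 3 \cdot s_{2,j} \oplus s_{3,j} \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus 2 \cdot s_{2,j} \oplus 3 \cdot s_{3,j} \\
s'_{3,j} &= 3 \cdot s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus 2 \cdot s_{3,j}
\end{aligned}
\]

Multiplication by 2 in $GF(2^8)$ is a 1-bit left shift followed by a conditional bitwise XOR with “00011011”, which is applied when the leftmost bit of the original value is 1.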

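In [ref. 12], that uses a 32-bit processor, the above equations are rewritten in the following format:

\[
\begin{aligned}
s'_{0,j} &= s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \oplus (2 \cdot (s_{0,j} \oplus s_{1,j})) \\
s'_{1,j} &= s_{0,j} \oplus s_{2,j} \oplus s_{3,j} \oplus (2 \cdot (s_{1,j} \oplus s_{2,j})) \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus s_{3,j} \oplus (2 \cdot (s_{2,j} \oplus s_{3,j})) \\
s'_{3,j} &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus (2 \cdot (s_{3,j} \oplus s_{0,j}))
\end{aligned}
\]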

Consequently, the instructions can be executed in the order given in table 4-2. In their implementation the S-box is stored in memory, and they report the following numbers of clock cycles for encryption: 1675 cycles on ARM7TDMI, 1384 cycles on ARM9TDMI and 1119 cycles on Pentium III.


Table 4-2. Instructions Execution for MixColumns [ref. 12]

First Instruction      Second Instruction    Third Instruction
y0 = x1 ⊕ x2 ⊕ x3      x0 = 2·x0             y0 = y0 ⊕ x0 ⊕ x1
y1 = x0 ⊕ x2 ⊕ x3      x1 = 2·x1             y1 = y1 ⊕ x1 ⊕ x2
y2 = x0 ⊕ x1 ⊕ x3      x2 = 2·x2             y2 = y2 ⊕ x2 ⊕ x3
y3 = x0 ⊕ x1 ⊕ x2      x3 = 2·x3             y3 = y3 ⊕ x3 ⊕ x0
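The three-step schedule of the table can be sketched in Python for a single column. This is an illustration under simplifying assumptions: [ref. 12] works on 32-bit words that hold several bytes at once, whereas the sketch below handles one byte per variable, and the function names are chosen here.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8)."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def mix_column_three_steps(x0, x1, x2, x3):
    # First instruction: XOR of the three other bytes of the column.
    y0 = x1 ^ x2 ^ x3
    y1 = x0 ^ x2 ^ x3
    y2 = x0 ^ x1 ^ x3
    y3 = x0 ^ x1 ^ x2
    # Second instruction: double every input byte in GF(2^8).
    x0, x1, x2, x3 = xtime(x0), xtime(x1), xtime(x2), xtime(x3)
    # Third instruction: add 2*(x_i XOR x_{i+1}), which equals 2*x_i XOR 2*x_{i+1}.
    y0 ^= x0 ^ x1
    y1 ^= x1 ^ x2
    y2 ^= x2 ^ x3
    y3 ^= x3 ^ x0
    return y0, y1, y2, y3
```

The third step relies on doubling being linear over XOR, so the doubled bytes from the second step can be reused for two output bytes each.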

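Another approach, introduced in [ref. 2] for 8-bit processors, rewrites MixColumns as follows:

\[
\begin{aligned}
tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{0,j} &= s_{0,j} \oplus tmp \oplus [2 \cdot (s_{0,j} \oplus s_{1,j})] \\
s'_{1,j} &= s_{1,j} \oplus tmp \oplus [2 \cdot (s_{1,j} \oplus s_{2,j})] \\
s'_{2,j} &= s_{2,j} \oplus tmp \oplus [2 \cdot (s_{2,j} \oplus s_{3,j})] \\
s'_{3,j} &= s_{3,j} \oplus tmp \oplus [2 \cdot (s_{3,j} \oplus s_{0,j})]
\end{aligned}
\]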

They also suggest that in order to make the implementation resistant against timing attacks,

multiplication by 2 in the Galois Field (that is a conditional XOR operation) can be replaced by a

lookup table [ref. 2].
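A minimal Python sketch of this 8-bit formulation, with the multiplication by 2 taken from a precomputed table, is shown below. The names XTIME and mix_column_8bit are illustrative, and the sketch makes no claim about actual timing behaviour, which on a real processor also depends on caches and memory access patterns.

```python
# Lookup table for multiplication by 2 in GF(2^8); at run time an index
# operation replaces the data-dependent conditional XOR.
XTIME = [((a << 1) ^ (0x11B if a & 0x80 else 0)) & 0xFF for a in range(256)]


def mix_column_8bit(s0, s1, s2, s3):
    """MixColumns on one column using tmp and table-based doubling."""
    tmp = s0 ^ s1 ^ s2 ^ s3
    return (
        s0 ^ tmp ^ XTIME[s0 ^ s1],
        s1 ^ tmp ^ XTIME[s1 ^ s2],
        s2 ^ tmp ^ XTIME[s2 ^ s3],
        s3 ^ tmp ^ XTIME[s3 ^ s0],
    )
```

Each output byte needs one table lookup and a few XORs, which suits processors that handle one byte at a time.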

4.3.3.2 FPGA Implementations

This section describes different design implementations with high throughput as the main

optimization goal. FPGA implementations mostly use a 128-bit architecture to reach the full

parallelism and concurrency in computations within each round.

4.3.3.2.1 AES Iterative Implementation

The proposed 128-bit iterative architecture has been designed to reach Gigabit throughput range.

It exploits the iterative structure of AES and provides maximum hardware utilization since it

reuses the hardware in each round. The general iterative block diagram without the control unit

and signals is shown in figure 4-8.

In the round transformation, the ShiftRows operation in a 128-bit architecture comes for free because it uses no logic resources; in this case ShiftRows is a routing issue and is accomplished by simple rewiring.


Figure 4-8. AES Iterative Implementation

Notes: ShiftRows operation is just rewiring in 128-bit architecture without any hardware cost.

This architecture was completely implemented on the SoC platform in this thesis. The maximum clock frequency from the synthesis report was 176.398 MHz. However, the fastest clock available in the provided baseline system was 80 MHz, and the CCM works at this frequency. The cipher uses 10 clock cycles to produce the AES ciphertext (128 bits); at the maximum synthesis frequency this gives a throughput of

\[ \frac{128 \times 176}{10}\ \text{Mbit/s} \approx 2.25\ \text{Gbit/s} \]

The CCM throughput depends on the input data length. For instance, in the case where Tlen=32, Nlen=56, Alen=64, Plen=32, CBC-MAC needs three executions of the AES cipher, which yields a throughput of

\[ \frac{2.25\ \text{Gbit/s}}{3} = 0.75\ \text{Gbit/s} \]
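The throughput figures can be reproduced with a short Python calculation. The function name throughput_mbit_s is chosen here; the third value, at the 80 MHz PLB clock, is not quoted in the text and is simply the same formula applied to the clock the CCM actually runs at.

```python
def throughput_mbit_s(block_bits, clock_mhz, cycles_per_block, cipher_runs=1):
    """Throughput in Mbit/s for an iterative cipher that needs `cycles_per_block` cycles."""
    return block_bits * clock_mhz / (cycles_per_block * cipher_runs)


aes_at_synthesis_max = throughput_mbit_s(128, 176, 10)          # about 2.25 Gbit/s
ccm_example = throughput_mbit_s(128, 176, 10, cipher_runs=3)    # about 0.75 Gbit/s
aes_at_plb_clock = throughput_mbit_s(128, 80, 10)               # same formula at 80 MHz
```

The function takes the clock frequency as a parameter so that the effect of a faster or slower clock on the estimate is easy to see.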

It is important to mention that in order to reach higher frequencies, the DCM (digital clock manager) unit should be used as a frequency multiplier. This is not implemented in this thesis due to time constraints. For the detailed FPGA resources that have been used, refer to the synthesis report (appendix A).

4.3.3.2.2 AES Unrolled Implementation

In applications where even higher throughput is required, loop unrolling could be used. In order to

achieve the highest throughput, for instance when the key size is 128 bits all ten rounds can be

unrolled and pipelining registers can be inserted between the rounds. Obviously the highest

throughput comes at the price of about ten times more hardware resources [ref. 11].

The general block diagram of a fully unrolled pipelined architecture is given in figure 4-9 (the

control unit and signals are not shown).


Figure 4-9. AES Unrolled Pipelined Architecture [ref. 11]

It is important to mention that efficient place and route for a large pipelined cipher architecture may be a critical issue compared with smaller iterative implementations [ref. 11].

4.3.4 Conclusion

The CCM unit sits as a device on the PLB bus (refer to section 4.2 for details) and the platform has all the necessary elements for a SoC design; this makes further on-chip developments or modifications of this project straightforward. CCM can easily communicate with the PowerPC, the DDR SDRAM controller or the BRAM controller, which sit on the same PLB bus at 80 MHz.

Since the main part of CCM is the AES cipher, some previous research results on AES encryption are given in table 4-3 for rough comparison; the results are not accurately comparable since they use different FPGA technologies, such as Virtex E.


Table 4-3. AES Encryption Results

Implementation   # of LUTs   # of Slices   # of RAM Blocks   Throughput (Mbit/s)
[ref. 15]        NA          2222          100               6956
[ref. 14]        3516        2784          100               11776
[ref. 16]        889         NA            10                1187
[ref. 15]        877         542           10                1450
[ref. 17]        NA          1880          0                 589
[ref. 15]        2524        1767          0                 2085
[ref. 17]        NA          2529          0                 833
[ref. 15]        3846        2257          0                 2008
Our design       2948        2717          0                 2250

The implementations above use either an iterative or an unrolled architecture, each offering a different tradeoff between resources and throughput. The iterative implementation uses fewer hardware resources but has lower throughput than the unrolled architecture.

4.4 Testing and Debugging

The Xilinx Microprocessor Debugger (XMD) console has been used for testing and debugging the circuit. The XMD console provides a Tool Command Language (Tcl) interface. This interface can be used for command line control and debugging of the target, as well as for running complex verification scripts to test a system thoroughly [ref. 7].

The PowerPC JTAG logic in the baseline system is connected through the native JTAG port of the FPGA (series connection) [ref. 13]. The JTAG chain inside the FPGA passes through the two PowerPCs. The chain includes an interface bus named JTAGPPC that contains all the JTAG signals.

The test cases for this project are taken from NIST Special Publication 800-38C and are given in appendix E, together with the corresponding generics that are applied before the FPGA configuration.

4.5 Software Tools and Some Practical Recommendations

The software tools used to support the multi-IP-based SoC platform are as follows:

- ISE (version: 7.1.04i)

- ModelSim Simulator

- Platform Studio (version: Xilinx EDK 7.1.2)

- iMPACT (version: 7.1.04i) for configuring the FPGA


The purpose of this section is to go through the issues that were poorly explained in the documentation and therefore caused time-consuming problems during the research.

Connecting the Design to a Specific Interface

In general there are two ways to hook up the design to a specific IP interface (IPIF) in the Platform Studio software. However, the manual does not clearly provide this important high-level view of connecting IPs to the system.

One way is to use “Create/ Import Peripheral…” from the Tools menu, which provides a friendly user interface. The particular problem we had with “Create/ Import Peripheral…” was dealing with multi-dimensional arrays as inputs or outputs of the ccm_core.vhd code; these arrays were supposed to be connected to the PLB controller IPIF. The alternative we used to tackle this problem without changing the VHDL code was to modify user_logic.vhd for the PLB controller IPIF and define ccm_core.vhd as a component within this core.

Defining the Order in a Modular Design

The Peripheral Analyze Order file (.pao file) defines the ordered list of HDL files in a library needed for synthesis and simulation. The files in the library must be listed bottom-up. For instance, if a core named A.vhd contains B.vhd, then B must precede A in the .pao file.
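A bottom-up order of this kind is a topological ordering of the containment graph, which can be computed automatically. The Python sketch below uses the standard library module graphlib (Python 3.9 or later); the dependency graph and the file names cipher_mod.vhd and key_expansion.vhd are hypothetical and serve only to illustrate the idea.

```python
from graphlib import TopologicalSorter

# Hypothetical containment graph: each file maps to the files it instantiates or uses.
contains = {
    "user_logic.vhd": {"ccm_core.vhd"},
    "ccm_core.vhd": {"datatypes.vhd", "cipher_mod.vhd", "key_expansion.vhd"},
    "cipher_mod.vhd": {"datatypes.vhd", "mix.vhd", "add.vhd"},
    "key_expansion.vhd": {"datatypes.vhd"},
}

# static_order() yields every file after all of the files it depends on.
pao_order = list(TopologicalSorter(contains).static_order())
```

Listing the files in pao_order guarantees that every contained file appears before the file that contains it; files at the same level may appear in any order.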

Some Useful Miscellaneous Recommendations

Because of poor documentation, the useful paths that were found while tackling the problems are listed below.

Platform Studio creates a project file that can also be opened in the ISE software, which makes it easy to switch between Platform Studio and XST. This .ise file is located in

“MY_ProjectFolder\pcores\MY_Peripheral\devl\projnav\”

There is a README.txt file located in

“MY_ProjectFolder\pcores\MY_Peripheral\devl\” that gives some useful information on the peripheral and some signal definitions.

Depending on the design complexity, the place and route (PAR) process can be very lengthy. To make it faster, the overall effort level (ol) can be reduced to the lowest level, which is standard. The modification is done in the fast_runtime.opt file located in

“MY_ProjectFolder\pcores\MY_Peripheral\etc\”

There was one particular problem with the synthesis tool when concatenating (using ‘&’ operator)

all the 32 slave registers. The tool was unable to perform the synthesis on the following line:

read_vec<=slv_reg0 & slv_reg1 & … & slv_reg30 & slv_reg31;


The solution was to concatenate each group of 8 registers into a temporary signal and then concatenate these temporary signals.

read_vec_0<=slv_reg0 & slv_reg1 & … & slv_reg6 & slv_reg7;
read_vec_1<=slv_reg8 & slv_reg9 & … & slv_reg14 & slv_reg15;
read_vec_2<=slv_reg16 & slv_reg17 & … & slv_reg22 & slv_reg23;
read_vec_3<=slv_reg24 & slv_reg25 & … & slv_reg30 & slv_reg31;
read_vec<= read_vec_0 & read_vec_1 & read_vec_2 & read_vec_3;


5 Discussion and Conclusions

5.1 Summary

The purpose of this thesis is to build a self-contained on-chip system that includes CCM as the security core. This research was one of two projects among Canadian universities that were used to get the SoC platform working successfully, after much trouble caused by the lack of documentation. This research was also the first in Canada to implement a system including the CCM mode of operation that is based on a multi-IP approach to produce a complete SoC. CCM is implemented using the iterative architecture (described in chapter 4).

The main feature that makes this system very flexible is the use of soft IP peripherals. Except for the hard IP cores (i.e. microprocessors and block RAMs), the other cores, such as CCM, the buses and the device controllers, are soft IP peripherals.

With the emergence of powerful FPGAs with efficient on-chip cores (i.e. microprocessors and memory), building a SoC from multiple soft IP cores could yield very flexible self-contained solutions.

5.2 Limitations and Future Work

In order to save FPGA resources without significant degradation in performance, the security algorithm could be split into two parts: key expansion could be implemented on the PowerPC while the cipher could be implemented using CLBs. This would have little impact on performance, since key expansion has to be executed only once in each key lifetime.

From the top-level architectural point of view, in future work CCM could also be turned into a device that is able to interrupt the PowerPC, or into a master device communicating with the DDR SDRAM or BRAM available on the board, according to the application needs.

There are other ways of building the S-box tables or performing the MixColumns operation. Instead of using lookup tables, it is possible to use mathematical operations over the Galois Field at the algorithm level. These methods could be investigated further to determine how they would affect speed, area or power consumption.

As mentioned before, the maximum clock frequency that has been tested is the PLB clock at 80 MHz. In order to test higher frequencies, a Digital Clock Manager (DCM) has to be used as a frequency multiplier that feeds the CCM input clock.

The counter (CTR) mode in this project is not designed optimally, since it duplicates the cipher core $\lceil Plen/128 \rceil + 1$ times without improving the overall CCM speed: the CBC-MAC mode that works with CTR to produce the output slows down CCM no matter how fast CTR works. This could be researched in the future.


References

[ref. 1]

Morris Dworkin, “Recommendation for Block Cipher Modes of Operation: The CCM Mode for Authentication and Confidentiality”, NIST Special Publication 800-38C, May 2004. http://csrc.nist.gov/publications/nistpubs/800-38C/SP800-38C.pdf.

[ref. 2]

William Stallings, “Cryptography and Network Security”, Prentice Hall, fourth edition 2006.

[ref. 3]

Amirix AP1000 Datasheet, Amirix Systems Inc., Oct. 2005. http://www.amirix.com/downloads/ap1000.pdf.

[ref. 4]

AP1000 FPGA Development Board Users Guide, AMIRIX Systems Inc., Sep. 2007. Document #: DOC-004017 Version 02.

[ref. 5]

iMPACT Overview, Xilinx Inc., 2005. http://toolbox.xilinx.com/docsan/xilinx7/help/iseguide/mergedProjects/impact/html/imp_b_overview.htm.

[ref. 6]

Platform Studio Debugging PowerPC Hardware Setup, Xilinx Inc., 2005. http://toolbox.xilinx.com/docsan/xilinx8/EDKHelp/platform_studio/html/ps_p_dbg_debugging_ppc_hw_setup.htm.

[ref. 7]

Embedded System Tools Reference Manual Embedded Development Kit EDK 7.1i

[ref. 8]

Virtex-II Pro and Virtex-II Pro X Platform FPGAs Complete Data Sheet, Xilinx Inc., Oct. 2005. http://www.xilinx.com/bvdocs/publications/ds083.pdf.

[ref. 9]

Virtex-II Pro and Virtex-II Pro X FPGA User Guide, Xilinx Inc., March 2005. http://www.xilinx.com/bvdocs/userguides/ug012.pdf.

[ref. 10] IPIF PLB Xilinx core, Xilinx Inc. Aug. 2004.

[ref. 11] Martin Feldhofer, Kerstin Lemke, Elisabeth Oswald, François-Xavier Standaert, Thomas Wollinger and Johannes Wolkerstorfer, “State of the Art in Hardware Architectures”, ECRYPT, Sep. 2005. http://www.iaik.tugraz.at/research/krypto/AES/VAM2-IAIK-17-D.VAM2-1_0.pdf.

[ref. 12] Guido Bertoni, Luca Breveglieri, Pasqualina Fragneto, Marco Macchetti, and Stefano Marchesin, “Efficient Software Implementation of AES on 32-Bit Platforms”, CHES 2003, Germany. http://www.springerlink.com/media/1lbfddawqm0urn5m8eeq/contributions/u/v/x/5/uvx5nqgnn55vk199.pdf.

[ref. 13] PowerPC 405 Processor Block Reference Guide, Xilinx Inc., Jul. 2005. http://www.xilinx.com/bvdocs/userguides/ug018.pdf.

[ref. 14] Francois-Xavier Standaert, Gael Rouvroy, Jean-Jacques Quisquater, and Jean-Didier Legat, “Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: Improvements and Design Tradeoffs”, CHES 2003, Germany.

[ref. 15] M. McLoone and J.V. McCanny, “High Performance Single Chip FPGA Rijndael Algorithm Implementations”, in the proceedings of CHES 2001: The Third International CHES Workshop, Lecture Notes In Computer Science, LNCS 2162, pp 65–76, Springer-Verlag.

[ref. 16] Helion Technology, High Performance AES (Rijndael) Cores for XILINX FPGA, CHES 2003, Germany http://www.heliontech.com.

[ref. 17] A. Satoh et al, “Compact Hardware Architecture for 128-bit Block Cipher Camellia”, in the Proceedings of the Third NESSIE Workshop, November 6–7, 2002, Munich, Germany.

Appendix A: AES Cipher HDL Synthesis Report

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : cipher_mod.ngr
Top Level Output File Name         : cipher_mod
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 1668

Macro Statistics :
# ROMs                             : 32
#      16x128-bit ROM              : 16
#      256x32-bit ROM              : 16
# Registers                        : 34
#      1-bit register              : 1
#      4-bit register              : 1
#      8-bit register              : 32
# Multiplexers                     : 32
#      1-bit 11-to-1 multiplexer   : 16
#      8-bit 16-to-1 multiplexer   : 16

Cell Usage :
# BELS                             : 6041
#      INV                         : 1
#      LUT2                        : 123
#      LUT2_D                      : 36
#      LUT2_L                      : 16
#      LUT3                        : 179
#      LUT3_D                      : 1
#      LUT3_L                      : 1215
#      LUT4                        : 2948
#      LUT4_D                      : 129
#      LUT4_L                      : 416
#      MUXF5                       : 576
#      MUXF6                       : 272
#      MUXF7                       : 128
#      VCC                         : 1
# FlipFlops/Latches                : 1041
#      FDE_1                       : 891
#      FDRE_1                      : 150
# Clock Buffers                    : 1
#      BUFGP                       : 1
# IO Buffers                       : 1667
#      IBUF                        : 1538
#      OBUF                        : 129

=========================================================================
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
      FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
      GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
-----------------------------------+------------------------+-------+
Clock Signal                       | Clock buffer(FF name)  | Load  |
-----------------------------------+------------------------+-------+
clk                                | BUFGP                  | 1041  |
-----------------------------------+------------------------+-------+

Timing Summary:
---------------
Speed Grade: -6

   Minimum period: 5.669ns (Maximum Frequency: 176.398MHz)
   Minimum input arrival time before clock: 4.214ns
   Maximum output required time after clock: 3.615ns
   Maximum combinational path delay: No path found

Appendix B: MixColumns HDL Synthesis Report

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : mix.ngr
Top Level Output File Name         : mix
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 256

Macro Statistics :
# ROMs                             : 16
#      256x32-bit ROM              : 16

Cell Usage :
# BELS                             : 256
#      LUT2                        : 80
#      LUT4                        : 176
# IO Buffers                       : 256
#      IBUF                        : 128
#      OBUF                        : 128
=========================================================================

Appendix C: Key Expansion HDL Synthesis Report

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : expandkey.ngr
Top Level Output File Name         : expandkey
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 1539

Macro Statistics :
# ROMs                             : 4
#      16x128-bit ROM              : 4
# Registers                        : 183
#      1-bit register              : 7
#      8-bit register              : 176
# Multiplexers                     : 5
#      8-bit 11-to-1 multiplexer   : 1
#      8-bit 16-to-1 multiplexer   : 4
# Adders/Subtractors               : 1
#      6-bit adder                 : 1

Cell Usage :
# BELS                             : 3408
#      GND                         : 1
#      INV                         : 1
#      LUT1                        : 5
#      LUT2                        : 72
#      LUT2_D                      : 4
#      LUT3                        : 154
#      LUT3_D                      : 35
#      LUT3_L                      : 328
#      LUT4                        : 1740
#      LUT4_D                      : 378
#      LUT4_L                      : 443
#      MUXCY                       : 5
#      MUXF5                       : 140
#      MUXF6                       : 64
#      MUXF7                       : 32
#      VCC                         : 1
#      XORCY                       : 5
# FlipFlops/Latches                : 1632
#      FDE_1                       : 1536
#      FDRE_1                      : 89
#      FDSE_1                      : 7

# Clock Buffers                    : 1
#      BUFGP                       : 1
# IO Buffers                       : 1538
#      IBUF                        : 129
#      OBUF                        : 1409
=========================================================================
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
      FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
      GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
-----------------------------------+------------------------+-------+
Clock Signal                       | Clock buffer(FF name)  | Load  |
-----------------------------------+------------------------+-------+
clk                                | BUFGP                  | 1632  |
-----------------------------------+------------------------+-------+

Timing Summary:
---------------
Speed Grade: -6

   Minimum period: 8.873ns (Maximum Frequency: 112.701MHz)
   Minimum input arrival time before clock: 4.992ns
   Maximum output required time after clock: 3.692ns
   Maximum combinational path delay: No path found

Appendix D: S-box (AES Forward Cipher)

Rows are indexed by the upper four bits of the input byte and columns by the lower four bits (hexadecimal, 0 to F).

63 7C 77 7B F2 6B 6F C5 30 01 67 2B FE D7 AB 76
CA 82 C9 7D FA 59 47 F0 AD D4 A2 AF 9C A4 72 C0
B7 FD 93 26 36 3F F7 CC 34 A5 E5 F1 71 D8 31 15
04 C7 23 C3 18 96 05 9A 07 12 80 E2 EB 27 B2 75
09 83 2C 1A 1B 6E 5A A0 52 3B D6 B3 29 E3 2F 84
53 D1 00 ED 20 FC B1 5B 6A CB BE 39 4A 4C 58 CF
D0 EF AA FB 43 4D 33 85 45 F9 02 7F 50 3C 9F A8
51 A3 40 8F 92 9D 38 F5 BC B6 DA 21 10 FF F3 D2
CD 0C 13 EC 5F 97 44 17 C4 A7 7E 3D 64 5D 19 73
60 81 4F DC 22 2A 90 88 46 EE B8 14 DE 5E 0B DB
E0 32 3A 0A 49 06 24 5C C2 D3 AC 62 91 95 E4 79
E7 C8 37 6D 8D D5 4E A9 6C 56 F4 EA 65 7A AE 08
BA 78 25 2E 1C A6 B4 C6 E8 DD 74 1F 4B BD 8B 8A
70 3E B5 66 48 03 F6 0E 61 35 57 B9 86 C1 1D 9E
E1 F8 98 11 69 D9 8E 94 9B 1E 87 E9 CE 55 28 DF
8C A1 89 0D BF E6 42 68 41 99 2D 0F B0 54 BB 16
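The entries of the AES forward S-box can be recomputed rather than copied, which is a convenient cross-check on a hand-entered table. The following is a minimal Python sketch, assuming the standard AES definition (multiplicative inverse in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1, followed by the affine transformation with constant 0x63). The function names are illustrative only and are not part of the thesis code.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x):
    """Forward S-box value for input byte x."""
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


# Full table as 16 rows of 16 entries: row = upper nibble, column = lower nibble.
SBOX = [[sbox_entry(16 * hi + lo) for lo in range(16)] for hi in range(16)]
```

For instance, `SBOX[0][0]` evaluates to 0x63 and `SBOX[0][1]` to 0x7C, the first two entries of the table.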

Appendix E: Test Vectors

These are the test vectors that were used to verify the implementation through the XMD console. The following examples are taken from [ref. 1].

Example 1. The generics in this example are Klen = 128, Tlen = 32, Nlen = 56, Alen = 64, and Plen = 32.

K:    40414243 44454647 48494a4b 4c4d4e4f
N:    10111213 141516
A:    00010203 04050607
P:    20212223
B:    4f101112 13141516 00000000 00000004
      00080001 02030405 06070000 00000000
      20212223 00000000 00000000 00000000
T:    6084341b
Ctr0: 07101112 13141516 00000000 00000000
S0:   2d281146 10676c26 32bad748 559a679a
Ctr1: 07101112 13141516 00000000 00000001
S1:   51432378 e474b339 71318484 103cddfb
C:    7162015b 4dac255d
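The blocks listed under B in Example 1 can be reproduced from N, A, P and the tag length alone. The sketch below is a Python illustration of the CCM input formatting, assuming the usual rules: a flags byte built from the tag length and the length-field size q = 15 - len(N), a two-byte length prefix for short associated data, and zero padding to 16-byte blocks. The function name `ccm_format_blocks` is hypothetical and is not the `format` function of the VHDL package.

```python
def ccm_format_blocks(nonce, adata, payload, tag_len):
    """Return the 16-byte blocks B0, B1, ... that are fed to the CBC-MAC.

    All lengths are in bytes. Only the short associated-data encoding
    (two-byte length prefix) is handled in this sketch.
    """
    q = 15 - len(nonce)  # bytes used to encode the payload length
    flags = (64 if adata else 0) | (((tag_len - 2) // 2) << 3) | (q - 1)
    b0 = bytes([flags]) + nonce + len(payload).to_bytes(q, "big")

    def pad16(data):
        # Append zero bytes up to the next multiple of 16.
        return data + bytes(-len(data) % 16)

    stream = b0
    if adata:
        assert len(adata) < 0xFF00, "long associated data is not handled here"
        stream += pad16(len(adata).to_bytes(2, "big") + adata)
    stream += pad16(payload)
    return [stream[i:i + 16] for i in range(0, len(stream), 16)]


# Example 1: Nlen = 56 bits, Alen = 64 bits, Plen = 32 bits, Tlen = 32 bits.
blocks = ccm_format_blocks(
    bytes.fromhex("10111213141516"),
    bytes.fromhex("0001020304050607"),
    bytes.fromhex("20212223"),
    4,
)
```

With the Example 1 inputs, `blocks` holds three 16-byte blocks whose hexadecimal form matches the three lines listed under B.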

Example 2. The generics in this example are Klen = 128, Tlen = 48, Nlen = 64, Alen = 128, and Plen = 128.

K:    40414243 44454647 48494a4b 4c4d4e4f
N:    10111213 14151617
A:    00010203 04050607 08090a0b 0c0d0e0f
P:    20212223 24252627 28292a2b 2c2d2e2f
B:    56101112 13141516 17000000 00000010
      00100001 02030405 06070809 0a0b0c0d
      0e0f0000 00000000 00000000 00000000
      20212223 24252627 28292a2b 2c2d2e2f
T:    7f479ffc a464
Ctr0: 06101112 13141516 17000000 00000000
S0:   6081d043 08a97dcc 20cdcc60 bf947b78
Ctr1: 06101112 13141516 17000000 00000001
S1:   f280d2c3 75cf7945 20335db9 2b107712
C:    d2a1f0e0 51ea5f62 081a7792 073d593d
      1fc64fbf accd
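The ciphertext C in these examples is the payload XORed with the leading bytes of the key stream S1, S2, ..., followed by the tag T XORed with the leading bytes of S0. A short Python sketch of that last step, using the Example 2 values, is given below. The helper names are illustrative only.

```python
def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def ccm_assemble(payload, tag, s0, keystream):
    """C = (P xor leading bytes of S1||S2||...) || (T xor leading bytes of S0)."""
    return xor_bytes(payload, keystream[:len(payload)]) + xor_bytes(tag, s0[:len(tag)])


# Values taken from Example 2.
P = bytes.fromhex("202122232425262728292a2b2c2d2e2f")
T = bytes.fromhex("7f479ffca464")
S0 = bytes.fromhex("6081d04308a97dcc20cdcc60bf947b78")
S1 = bytes.fromhex("f280d2c375cf794520335db92b107712")

C = ccm_assemble(P, T, S0, S1)
```

Here `C.hex()` reproduces the 22-byte value listed under C in Example 2.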

Example 3. The generics in this example are Klen = 128, Tlen = 64, Nlen = 96, Alen = 160, and Plen = 192.

K:    40414243 44454647 48494a4b 4c4d4e4f
N:    10111213 14151617 18191a1b
A:    00010203 04050607 08090a0b 0c0d0e0f
      10111213
P:    20212223 24252627 28292a2b 2c2d2e2f
      30313233 34353637
B:    5a101112 13141516 1718191a 1b000018
      00140001 02030405 06070809 0a0b0c0d
      0e0f1011 12130000 00000000 00000000
      20212223 24252627 28292a2b 2c2d2e2f
      30313233 34353637 00000000 00000000
T:    67c99240 c7d51048
Ctr0: 02101112 13141516 1718191a 1b000000
S0:   2f8a00bb 06658919 c3a040a6 eaed1a7f
Ctr1: 02101112 13141516 1718191a 1b000001
S1:   c393238a d1923c5d b335c0c7 e1bac924
Ctr2: 02101112 13141516 1718191a 1b000002
S2:   514798ea 9077bc92 6c22ebef 2ac732dc
C:    e3b201a9 f5b71a7a 9b1ceaec cd97e70b
      6176aad9 a4428aa5 484392fb c1b09951
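The counter blocks Ctr0, Ctr1, ... follow directly from the nonce and the payload length. The Python sketch below assumes the usual CCM layout (a flags byte equal to q - 1, the nonce, then a q-byte big-endian counter, with q = 15 - len(N)) and generates one block for the tag plus one per 16 bytes of payload. The function name is hypothetical and is not the `format_ctr` function of the VHDL package.

```python
def ccm_counter_blocks(nonce, payload_len):
    """Return Ctr0 .. Ctr_m for a payload of payload_len bytes."""
    q = 15 - len(nonce)             # width of the counter field in bytes
    m = (payload_len + 15) // 16    # key stream blocks needed for the payload
    return [bytes([q - 1]) + nonce + i.to_bytes(q, "big") for i in range(m + 1)]


# Example 3: Nlen = 96 bits (12 bytes), Plen = 192 bits (24 bytes).
ctrs = ccm_counter_blocks(bytes.fromhex("101112131415161718191a1b"), 24)
```

For Example 3 this yields three blocks, matching Ctr0, Ctr1 and Ctr2 as listed.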

Appendix F: VHDL Codes

CCM Core VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity ccm_core is
  generic(
    Klen: natural := 128;  -- key length in bits
    Tlen: natural := 32;   -- tag length in bits
    Nlen: natural := 56;   -- nonce length in bits
    Alen: natural := 64;   -- associated data length in bits
    Plen: natural := 32    -- payload length in bits
  );
  port(
    clk:      in  std_logic;
    rst_exp:  in  std_logic;
    rst_ciph: in  std_logic;
    key_in:   in  std_logic_vector(Klen-1 downto 0);
    P:        in  std_logic_vector(Plen-1 downto 0);
    nonce:    in  std_logic_vector(Nlen-1 downto 0);
    Adata:    in  std_logic_vector(Alen-1 downto 0);
    oe:       out std_logic;
    C_out:    out std_logic_vector(Plen+Tlen-1 downto 0)
  );
end ccm_core;

architecture struct of ccm_core is

  component cipher_mod
    port(
      input:     in  key;
      key_exp:   in  word_arr;
      clk:       in  std_logic;
      rst:       in  std_logic;
      en_exp:    in  std_logic;
      oe:        out std_logic;
      state_out: out state
    );
  end component;

  component expandkey is
    port(
      clk:     in  std_logic;
      key_in:  in  key;
      reset:   in  std_logic;
      en_exp:  out std_logic;

      key_exp: out word_arr
    );
  end component;

  signal enexp, r3:  std_logic;
  signal keyexp:     word_arr;
  signal ctr_oe:     std_logic_vector(ceil(Plen) downto 0);
  signal s:          st_arr(ceil(Plen) downto 0);
  signal ctr_in:     key_arr(ceil(Plen) downto 0);
  signal k:          key;
  signal r_count:    natural range rgen(Alen, Plen)+1 downto 1;
  signal cbcmac_oe:  std_logic;
  signal cbcmac_rst: std_logic;
  signal cbcmac_in:  key;
  signal y:          state;
  signal T:          std_logic_vector(Tlen-1 downto 0);
  signal S_bitvec:   std_logic_vector(ceil(Plen)*128-1 downto 0);
  signal C_temp:     std_logic_vector(Plen+Tlen-1 downto 0);
  signal S0, Yr:     std_logic_vector(127 downto 0);
  signal sig_in:     key_arr(rgen(Alen, Plen) downto 0);

begin

  -- Format the CBC-MAC input blocks and the counter blocks.
  sig_in <= format(Tlen, Nlen, Alen, Plen, nonce, Adata, P);
  ctr_in <= format_ctr(Nlen, Plen, nonce);

  -- Split the key vector into bytes, most significant byte first.
  key_formatting: for i in 0 to 15 generate
    k(15-i) <= key_in(i*8+7 downto i*8);
  end generate;

  exp1: expandkey
    port map(
      clk     => clk,
      key_in  => k,
      reset   => rst_exp,
      en_exp  => enexp,
      key_exp => keyexp
    );

  -- One cipher instance per counter block, indices 0 to ceil(Plen).
  ctr: for m in 0 to ceil(Plen) generate
    ctr_ciph: cipher_mod
      port map(
        input     => ctr_in(m),
        key_exp   => keyexp,
        clk       => clk,
        rst       => rst_ciph,
        en_exp    => enexp,
        oe        => ctr_oe(m),
        state_out => s(m)
      );
  end generate ctr;

  -- Flatten S0 and the CBC-MAC output state into 128-bit vectors.
  gen1: for u in 0 to 3 generate
    gen1_0: for v in 0 to 3 generate
      S0(u*32+v*8+7 downto u*32+v*8) <= s(0)(3-u)(3-v);

      Yr(u*32+v*8+7 downto u*32+v*8) <= Y(3-u)(3-v);
    end generate gen1_0;
  end generate gen1;

  -- Flatten the remaining cipher outputs into one key stream vector, s(1) first.
  gen2: for m in 0 to ceil(Plen)-1 generate
    gen2_0: for u in 0 to 3 generate
      gen2_0_0: for v in 0 to 3 generate
        S_bitvec(m*128+u*32+v*8+7 downto m*128+u*32+v*8) <= s(ceil(Plen)-m)(3-u)(3-v);
      end generate gen2_0_0;
    end generate gen2_0;
  end generate gen2;

  -- Ciphertext: payload XOR the leading Plen bits of the key stream.
  gen3: for m in 0 to Plen-1 generate
    C_temp(Plen+Tlen-1-m) <= P(Plen-1-m) xor S_bitvec(ceil(Plen)*128-1-m);
  end generate gen3;

  -- Encrypted tag: T XOR the leading Tlen bits of S0.
  g4: for m in 0 to Tlen-1 generate
    C_temp(Tlen-1-m) <= T(Tlen-1-m) xor S0(127-m);
  end generate g4;

  -- T is the leading Tlen bits of the CBC-MAC output.
  T(Tlen-1 downto 0) <= Yr(127 downto 127-Tlen+1);

  cbcmac: cipher_mod
    port map(
      input     => cbcmac_in,
      key_exp   => keyexp,
      clk       => clk,
      rst       => cbcmac_rst,
      en_exp    => enexp,
      oe        => cbcmac_oe,
      state_out => y
    );

  r3 <= '1' when r_count=rgen(Alen, Plen)+1 else '0';
  cbcmac_rst <= rst_ciph or (cbcmac_oe and not(r3));

  -- CBC-MAC chaining, clocked on the falling edge.
  process(clk)
  begin
    if (clk='0' and clk'event) then
      if (rst_ciph='1') then
        C_out <= (others=>'1');
        oe <= '0';
        r_count <= 1;
        cbcmac_in <= sig_in(0);
      else
        if (cbcmac_oe='1') then
          if not(r_count=rgen(Alen, Plen)+1) then
            r_count <= r_count+1;
            for u in 0 to 3 loop
              for v in 0 to 3 loop
                cbcmac_in(u*4+v) <= sig_in(r_count)(u*4+v) xor Y(u)(v);
              end loop;
            end loop;
          else
            oe <= '1';
            C_out <= C_temp;
          end if;
        end if;--cbcmac_oe

      end if;--reset
    end if;--clk
  end process;

end struct;

AES Cipher VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity cipher_mod is
  port(
    input:     in  key;
    key_exp:   in  word_arr;
    clk:       in  std_logic;
    rst:       in  std_logic;
    en_exp:    in  std_logic;
    oe:        out std_logic;
    state_out: out state
  );
end cipher_mod;

architecture Behavioral of cipher_mod is

  component mix
    port(
      st_in:  in  state;
      st_out: out state
    );
  end component;

  component add
    port(
      st_exp:  in  state;
      st_in:   in  state;
      clk:     in  std_logic;
      reset:   in  std_logic;
      en_exp:  in  std_logic;
      round:   in  natural range 0 to 10;
      st_out:  out state;
      fin_out: out state
    );
  end component;

  signal s, p, mix_out, q, init_st, fin_out: state;
  signal next_out, st_exp: state;
  signal round2: natural range 0 to 10;

begin

  gen1: for i in 0 to 3 generate
    gen2: for j in 0 to 3 generate
      init_st(i)(j) <= input(i*4+j);
      -- S-box lookup: row from the upper nibble, column from the lower nibble
      p(i)(j) <= Sbox(conv_integer(next_out(i)(j)(7 downto 4)))(conv_integer(next_out(i)(j)(3 downto 0)));
      s(i)(j) <= p((i+j) mod 4)(j);--rotation
    end generate gen2;
  end generate gen1;

  mi: mix
    port map(
      st_in  => s,
      st_out => mix_out
    );

  ad: add
    port map(
      st_exp  => st_exp,
      st_in   => q,
      clk     => clk,
      reset   => rst,
      en_exp  => en_exp,
      round   => round2,
      st_out  => next_out,
      fin_out => fin_out
    );

  -- Round 0 takes the input block; round 10 bypasses the mix stage.
  q <= init_st when round2=0 else
       s       when round2=10 else
       mix_out;

  -- Select the four round-key words for the current round.
  decoder_1: for j in 0 to 3 generate
    st_exp(j) <= key_exp(round2*4+j);
  end generate;

  process(clk, rst)
  begin
    if (clk='0' and clk'event) then
      if (rst='1') then
        oe <= '0';
        state_out <= (others => (others => X"00"));
        round2 <= 0;
      else
        if not(round2=10) then
          if (en_exp='0') then
            round2 <= round2+1;
          end if;
        else
          oe <= '1';
          state_out <= fin_out;
        end if;
      end if;--reset
    end if;--clk
  end process;

end Behavioral;
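The row rotation in the cipher, written in the VHDL as `s(i)(j) <= p((i+j) mod 4)(j)`, can be modelled in a few lines of Python to see which byte ends up where. The sketch below assumes the same indexing as the listing, where the first index selects a column and the second a row. It is an illustration, not part of the thesis code.

```python
def shift_rows(state):
    """Rotate row r of the state left by r positions.

    state[c][r] is the byte in column c and row r, mirroring s(i)(j) in the VHDL.
    """
    return [[state[(c + r) % 4][r] for r in range(4)] for c in range(4)]


# Label each byte with its position in the 16-byte input block (index = 4*c + r).
labelled = [[4 * c + r for r in range(4)] for c in range(4)]
shifted = shift_rows(labelled)
flat = [b for col in shifted for b in col]
```

Reading `flat` from left to right gives the input byte index that lands in each output position, so row 0 is unchanged while rows 1 to 3 are rotated by one to three columns.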

AddRoundKey VHDL Code

-- add XORs the state with the round key and registers the result
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity add is
  port(
    st_exp:  in  state;
    st_in:   in  state;
    clk:     in  std_logic;
    reset:   in  std_logic;
    en_exp:  in  std_logic;
    round:   in  natural range 0 to 10;
    st_out:  out state;
    fin_out: out state
  );
end add;

architecture structural of add is
  signal st_comb: state;
begin

  g0: for k in 0 to 3 generate
    st_comb(k) <= st_in(k) xor st_exp(k);
  end generate g0;

  -- Unregistered result, used as the cipher output after the last round.
  fin_out <= st_comb;

  process(clk)
  begin
    if (clk='0' and clk'event) then
      if reset='1' then
        null;  -- st_out keeps its value during reset
      else
        if (not(round=10) and en_exp='0') then
          st_out <= st_comb;
        end if;
      end if;--reset
    end if;--clk
  end process;

end structural;

MixColumns VHDL Code

-- mul_row0 to mul_row3 are the multiplication lookup tables for the four bytes of a column
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity mix is
  port(
    st_in:  in  state;
    st_out: out state
  );
end mix;

architecture structural of mix is
  signal r0, r1, r2, r3: state;
begin

  gn1: for m in 0 to 3 generate
    r0(m) <= mul_row0(conv_integer(st_in(m)(0)));
    r1(m) <= mul_row1(conv_integer(st_in(m)(1)));
    r2(m) <= mul_row2(conv_integer(st_in(m)(2)));
    r3(m) <= mul_row3(conv_integer(st_in(m)(3)));
    st_out(m) <= r0(m) xor r1(m) xor r2(m) xor r3(m);
  end generate;

end structural;
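Each entry of the `mul_row0` lookup table used by this module holds the four products (2x, x, x, 3x) of an input byte x in GF(2^8), which is the contribution of the first byte of a column to the four output bytes. The Python sketch below recomputes such entries and combines four of them into one output column. It assumes that `mul_row1` to `mul_row3` hold the same products rotated by one, two and three positions, as the standard MixColumns matrix implies. That layout is an assumption here, and the function names are illustrative.

```python
def xtime(x):
    """Multiply a byte by 2 in GF(2^8) modulo 0x11B."""
    x <<= 1
    return (x ^ 0x11B) & 0xFF if x & 0x100 else x


def mul_row0_entry(x):
    """Table entry for byte x: (2x, x, x, 3x)."""
    return (xtime(x), x, x, xtime(x) ^ x)


def mix_column(col):
    """MixColumns on one 4-byte column, built from rotated table entries."""
    out = [0, 0, 0, 0]
    for k, byte in enumerate(col):
        entry = mul_row0_entry(byte)
        # Assumed layout of mul_row<k>: the mul_row0 entry rotated right by k.
        rotated = entry[4 - k:] + entry[:4 - k]
        for r in range(4):
            out[r] ^= rotated[r]
    return out


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

For x = 0x80, `mul_row0_entry` returns (0x1B, 0x80, 0x80, 0x9B), which corresponds to the binary entry ("00011011", "10000000", "10000000", "10011011") in the table. The column DB 13 53 45 is a commonly used MixColumns check value and maps to 8E 4D A1 BC.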

AES Key Expansion VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity expandkey is
  port(
    clk:     in  std_logic;
    key_in:  in  key;
    reset:   in  std_logic;
    en_exp:  out std_logic;
    key_exp: out word_arr
  );
end expandkey;

architecture structural of expandkey is
  signal en: std_logic;
  signal i: integer range 4 to 43;
  signal temp_w: word_arr;
  signal xor_input1, temp: word;
  signal con1: std_logic;
begin

  -- temp: bytes of word i-1 rotated by one position and passed through the S-box,
  -- with the round constant added to the first byte
  temp(1) <= Sbox(conv_integer(temp_w(i-1)(2)(7 downto 4)))(conv_integer(temp_w(i-1)(2)(3 downto 0)));
  temp(2) <= Sbox(conv_integer(temp_w(i-1)(3)(7 downto 4)))(conv_integer(temp_w(i-1)(3)(3 downto 0)));
  temp(0) <= Sbox(conv_integer(temp_w(i-1)(1)(7 downto 4)))(conv_integer(temp_w(i-1)(1)(3 downto 0))) xor Rcon(i/4);
  temp(3) <= Sbox(conv_integer(temp_w(i-1)(0)(7 downto 4)))(conv_integer(temp_w(i-1)(0)(3 downto 0)));

  -- con1 is '0' only when the two low bits of i are zero, i.e. i is a multiple of 4
  con1 <= conv_std_logic_vector(i,6)(0) or conv_std_logic_vector(i,6)(1);
  xor_input1 <= temp when (con1='0') else temp_w(i-1);

  en_exp <= en;
  key_exp <= temp_w;

  process(clk)
    variable xor_in1: word;
    variable z: std_logic_vector(1 downto 0);
  begin
    if (clk='0' and clk'event) then
      if (reset='1') then
        -- Load the cipher key into the first four words.
        for r in 0 to 3 loop
          for s in 0 to 3 loop
            temp_w(r)(s) <= key_in(r*4+s);
          end loop;
        end loop;

        i <= 4;
        en <= '1';
      else
        if (not(i=43) and en='1') then
          i <= i+1;
        else
          en <= '0';
        end if;
        if (en='1') then
          temp_w(i) <= temp_w(i-4) xor xor_input1;
        end if;
      end if;--reset
    end if;--clk
  end process;

end structural;
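The recurrence implemented by this module, `temp_w(i) <= temp_w(i-4) xor xor_input1`, is the AES-128 key schedule. A software reference model is useful when checking the expanded key in simulation. The Python sketch below is such a model under the standard AES definitions. It rebuilds the S-box instead of using the package table, and the check values come from the key expansion example in the AES specification (FIPS-197, Appendix A.1), not from the thesis. All names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Forward S-box: field inverse (x^254) followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        s = inv
        for n in range(1, 5):
            s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def expand_key_128(key):
    """Return the 44 four-byte words of the AES-128 key schedule."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = list(w[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]       # rotate the word by one byte
            temp = [SBOX[b] for b in temp]   # substitute each byte
            temp[0] ^= RCON[i // 4 - 1]      # add the round constant
        w.append([a ^ b for a, b in zip(w[i - 4], temp)])
    return w


words = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
```

The condition `i % 4 == 0` plays the role of `con1 = '0'` in the VHDL. For the FIPS-197 example key, `words[4]` is a0 fa fe 17 and `words[43]` is b6 63 0c a6.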

Package Datatypes VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_ARITH.ALL;

package datatypes is

  type word      is array (0 to 3) of std_logic_vector(7 downto 0);
  type key       is array (0 to 15) of std_logic_vector(7 downto 0);
  type word_arr  is array (0 to 43) of word;
  type box       is array (0 to 15) of key;
  type RCbox     is array (0 to 10) of std_logic_vector(7 downto 0);
  type round_arr is array (1 to 10) of word;
  type mul_table is array (0 to 255) of word;
  type state     is array (0 to 3) of word;
  type st_arr    is array (natural range <>) of state;
  type key_arr   is array (natural range <>) of key;

  constant mul_row0: mul_table := (
(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000010"), ("00000001"), ("00000001"), ("00000011")), (("00000100"), ("00000010"), ("00000010"), ("00000110")), (("00000110"), ("00000011"), ("00000011"), ("00000101")), (("00001000"), ("00000100"), ("00000100"), ("00001100")), (("00001010"), ("00000101"), ("00000101"), ("00001111")), (("00001100"), ("00000110"), ("00000110"), ("00001010")), (("00001110"), ("00000111"), ("00000111"), ("00001001")), (("00010000"), ("00001000"), ("00001000"), ("00011000")), (("00010010"), ("00001001"), ("00001001"), ("00011011")), (("00010100"), ("00001010"), ("00001010"), ("00011110")), (("00010110"), ("00001011"), ("00001011"), ("00011101")), (("00011000"), ("00001100"), ("00001100"), ("00010100")), (("00011010"), ("00001101"), ("00001101"), ("00010111")), (("00011100"), ("00001110"), ("00001110"), ("00010010")), (("00011110"), ("00001111"), ("00001111"), ("00010001")), (("00100000"), ("00010000"), ("00010000"), ("00110000")), (("00100010"), ("00010001"), ("00010001"), ("00110011")), (("00100100"), ("00010010"), ("00010010"), ("00110110")), (("00100110"), ("00010011"), ("00010011"), ("00110101")), (("00101000"), ("00010100"), ("00010100"), ("00111100")), (("00101010"), ("00010101"), ("00010101"), ("00111111")), (("00101100"), ("00010110"), ("00010110"), ("00111010")), (("00101110"), 
("00010111"), ("00010111"), ("00111001")), (("00110000"), ("00011000"), ("00011000"), ("00101000")), (("00110010"), ("00011001"), ("00011001"), ("00101011")), (("00110100"), ("00011010"), ("00011010"), ("00101110")), (("00110110"), ("00011011"), ("00011011"), ("00101101")), (("00111000"), ("00011100"), ("00011100"), ("00100100")), (("00111010"), ("00011101"), ("00011101"), ("00100111")), (("00111100"), ("00011110"), ("00011110"), ("00100010")), (("00111110"), ("00011111"), ("00011111"), ("00100001")), (("01000000"), ("00100000"), ("00100000"), ("01100000")), (("01000010"), ("00100001"), ("00100001"), ("01100011")),

(("01000100"), ("00100010"), ("00100010"), ("01100110")), (("01000110"), ("00100011"), ("00100011"), ("01100101")), (("01001000"), ("00100100"), ("00100100"), ("01101100")), (("01001010"), ("00100101"), ("00100101"), ("01101111")), (("01001100"), ("00100110"), ("00100110"), ("01101010")), (("01001110"), ("00100111"), ("00100111"), ("01101001")), (("01010000"), ("00101000"), ("00101000"), ("01111000")), (("01010010"), ("00101001"), ("00101001"), ("01111011")), (("01010100"), ("00101010"), ("00101010"), ("01111110")), (("01010110"), ("00101011"), ("00101011"), ("01111101")), (("01011000"), ("00101100"), ("00101100"), ("01110100")), (("01011010"), ("00101101"), ("00101101"), ("01110111")), (("01011100"), ("00101110"), ("00101110"), ("01110010")), (("01011110"), ("00101111"), ("00101111"), ("01110001")), (("01100000"), ("00110000"), ("00110000"), ("01010000")), (("01100010"), ("00110001"), ("00110001"), ("01010011")), (("01100100"), ("00110010"), ("00110010"), ("01010110")), (("01100110"), ("00110011"), ("00110011"), ("01010101")), (("01101000"), ("00110100"), ("00110100"), ("01011100")), (("01101010"), ("00110101"), ("00110101"), ("01011111")), (("01101100"), ("00110110"), ("00110110"), ("01011010")), (("01101110"), ("00110111"), ("00110111"), ("01011001")), (("01110000"), ("00111000"), ("00111000"), ("01001000")), (("01110010"), ("00111001"), ("00111001"), ("01001011")), (("01110100"), ("00111010"), ("00111010"), ("01001110")), (("01110110"), ("00111011"), ("00111011"), ("01001101")), (("01111000"), ("00111100"), ("00111100"), ("01000100")), (("01111010"), ("00111101"), ("00111101"), ("01000111")), (("01111100"), ("00111110"), ("00111110"), ("01000010")), (("01111110"), ("00111111"), ("00111111"), ("01000001")), (("10000000"), ("01000000"), ("01000000"), ("11000000")), (("10000010"), ("01000001"), ("01000001"), ("11000011")), (("10000100"), ("01000010"), ("01000010"), ("11000110")), (("10000110"), ("01000011"), ("01000011"), ("11000101")), (("10001000"), 
("01000100"), ("01000100"), ("11001100")), (("10001010"), ("01000101"), ("01000101"), ("11001111")), (("10001100"), ("01000110"), ("01000110"), ("11001010")), (("10001110"), ("01000111"), ("01000111"), ("11001001")), (("10010000"), ("01001000"), ("01001000"), ("11011000")), (("10010010"), ("01001001"), ("01001001"), ("11011011")), (("10010100"), ("01001010"), ("01001010"), ("11011110")), (("10010110"), ("01001011"), ("01001011"), ("11011101")), (("10011000"), ("01001100"), ("01001100"), ("11010100")), (("10011010"), ("01001101"), ("01001101"), ("11010111")), (("10011100"), ("01001110"), ("01001110"), ("11010010")), (("10011110"), ("01001111"), ("01001111"), ("11010001")), (("10100000"), ("01010000"), ("01010000"), ("11110000")), (("10100010"), ("01010001"), ("01010001"), ("11110011")), (("10100100"), ("01010010"), ("01010010"), ("11110110")), (("10100110"), ("01010011"), ("01010011"), ("11110101")), (("10101000"), ("01010100"), ("01010100"), ("11111100")), (("10101010"), ("01010101"), ("01010101"), ("11111111")), (("10101100"), ("01010110"), ("01010110"), ("11111010")), (("10101110"), ("01010111"), ("01010111"), ("11111001")), (("10110000"), ("01011000"), ("01011000"), ("11101000")), (("10110010"), ("01011001"), ("01011001"), ("11101011")), (("10110100"), ("01011010"), ("01011010"), ("11101110")),

(("10110110"), ("01011011"), ("01011011"), ("11101101")), (("10111000"), ("01011100"), ("01011100"), ("11100100")), (("10111010"), ("01011101"), ("01011101"), ("11100111")), (("10111100"), ("01011110"), ("01011110"), ("11100010")), (("10111110"), ("01011111"), ("01011111"), ("11100001")), (("11000000"), ("01100000"), ("01100000"), ("10100000")), (("11000010"), ("01100001"), ("01100001"), ("10100011")), (("11000100"), ("01100010"), ("01100010"), ("10100110")), (("11000110"), ("01100011"), ("01100011"), ("10100101")), (("11001000"), ("01100100"), ("01100100"), ("10101100")), (("11001010"), ("01100101"), ("01100101"), ("10101111")), (("11001100"), ("01100110"), ("01100110"), ("10101010")), (("11001110"), ("01100111"), ("01100111"), ("10101001")), (("11010000"), ("01101000"), ("01101000"), ("10111000")), (("11010010"), ("01101001"), ("01101001"), ("10111011")), (("11010100"), ("01101010"), ("01101010"), ("10111110")), (("11010110"), ("01101011"), ("01101011"), ("10111101")), (("11011000"), ("01101100"), ("01101100"), ("10110100")), (("11011010"), ("01101101"), ("01101101"), ("10110111")), (("11011100"), ("01101110"), ("01101110"), ("10110010")), (("11011110"), ("01101111"), ("01101111"), ("10110001")), (("11100000"), ("01110000"), ("01110000"), ("10010000")), (("11100010"), ("01110001"), ("01110001"), ("10010011")), (("11100100"), ("01110010"), ("01110010"), ("10010110")), (("11100110"), ("01110011"), ("01110011"), ("10010101")), (("11101000"), ("01110100"), ("01110100"), ("10011100")), (("11101010"), ("01110101"), ("01110101"), ("10011111")), (("11101100"), ("01110110"), ("01110110"), ("10011010")), (("11101110"), ("01110111"), ("01110111"), ("10011001")), (("11110000"), ("01111000"), ("01111000"), ("10001000")), (("11110010"), ("01111001"), ("01111001"), ("10001011")), (("11110100"), ("01111010"), ("01111010"), ("10001110")), (("11110110"), ("01111011"), ("01111011"), ("10001101")), (("11111000"), ("01111100"), ("01111100"), ("10000100")), (("11111010"), 
("01111101"), ("01111101"), ("10000111")), (("11111100"), ("01111110"), ("01111110"), ("10000010")), (("11111110"), ("01111111"), ("01111111"), ("10000001")), (("00011011"), ("10000000"), ("10000000"), ("10011011")), (("00011001"), ("10000001"), ("10000001"), ("10011000")), (("00011111"), ("10000010"), ("10000010"), ("10011101")), (("00011101"), ("10000011"), ("10000011"), ("10011110")), (("00010011"), ("10000100"), ("10000100"), ("10010111")), (("00010001"), ("10000101"), ("10000101"), ("10010100")), (("00010111"), ("10000110"), ("10000110"), ("10010001")), (("00010101"), ("10000111"), ("10000111"), ("10010010")), (("00001011"), ("10001000"), ("10001000"), ("10000011")), (("00001001"), ("10001001"), ("10001001"), ("10000000")), (("00001111"), ("10001010"), ("10001010"), ("10000101")), (("00001101"), ("10001011"), ("10001011"), ("10000110")), (("00000011"), ("10001100"), ("10001100"), ("10001111")), (("00000001"), ("10001101"), ("10001101"), ("10001100")), (("00000111"), ("10001110"), ("10001110"), ("10001001")), (("00000101"), ("10001111"), ("10001111"), ("10001010")), (("00111011"), ("10010000"), ("10010000"), ("10101011")), (("00111001"), ("10010001"), ("10010001"), ("10101000")), (("00111111"), ("10010010"), ("10010010"), ("10101101")), (("00111101"), ("10010011"), ("10010011"), ("10101110")),

(("00110011"), ("10010100"), ("10010100"), ("10100111")), (("00110001"), ("10010101"), ("10010101"), ("10100100")), (("00110111"), ("10010110"), ("10010110"), ("10100001")), (("00110101"), ("10010111"), ("10010111"), ("10100010")), (("00101011"), ("10011000"), ("10011000"), ("10110011")), (("00101001"), ("10011001"), ("10011001"), ("10110000")), (("00101111"), ("10011010"), ("10011010"), ("10110101")), (("00101101"), ("10011011"), ("10011011"), ("10110110")), (("00100011"), ("10011100"), ("10011100"), ("10111111")), (("00100001"), ("10011101"), ("10011101"), ("10111100")), (("00100111"), ("10011110"), ("10011110"), ("10111001")), (("00100101"), ("10011111"), ("10011111"), ("10111010")), (("01011011"), ("10100000"), ("10100000"), ("11111011")), (("01011001"), ("10100001"), ("10100001"), ("11111000")), (("01011111"), ("10100010"), ("10100010"), ("11111101")), (("01011101"), ("10100011"), ("10100011"), ("11111110")), (("01010011"), ("10100100"), ("10100100"), ("11110111")), (("01010001"), ("10100101"), ("10100101"), ("11110100")), (("01010111"), ("10100110"), ("10100110"), ("11110001")), (("01010101"), ("10100111"), ("10100111"), ("11110010")), (("01001011"), ("10101000"), ("10101000"), ("11100011")), (("01001001"), ("10101001"), ("10101001"), ("11100000")), (("01001111"), ("10101010"), ("10101010"), ("11100101")), (("01001101"), ("10101011"), ("10101011"), ("11100110")), (("01000011"), ("10101100"), ("10101100"), ("11101111")), (("01000001"), ("10101101"), ("10101101"), ("11101100")), (("01000111"), ("10101110"), ("10101110"), ("11101001")), (("01000101"), ("10101111"), ("10101111"), ("11101010")), (("01111011"), ("10110000"), ("10110000"), ("11001011")), (("01111001"), ("10110001"), ("10110001"), ("11001000")), (("01111111"), ("10110010"), ("10110010"), ("11001101")), (("01111101"), ("10110011"), ("10110011"), ("11001110")), (("01110011"), ("10110100"), ("10110100"), ("11000111")), (("01110001"), ("10110101"), ("10110101"), ("11000100")), (("01110111"), 
("10110110"), ("10110110"), ("11000001")), (("01110101"), ("10110111"), ("10110111"), ("11000010")), (("01101011"), ("10111000"), ("10111000"), ("11010011")), (("01101001"), ("10111001"), ("10111001"), ("11010000")), (("01101111"), ("10111010"), ("10111010"), ("11010101")), (("01101101"), ("10111011"), ("10111011"), ("11010110")), (("01100011"), ("10111100"), ("10111100"), ("11011111")), (("01100001"), ("10111101"), ("10111101"), ("11011100")), (("01100111"), ("10111110"), ("10111110"), ("11011001")), (("01100101"), ("10111111"), ("10111111"), ("11011010")), (("10011011"), ("11000000"), ("11000000"), ("01011011")), (("10011001"), ("11000001"), ("11000001"), ("01011000")), (("10011111"), ("11000010"), ("11000010"), ("01011101")), (("10011101"), ("11000011"), ("11000011"), ("01011110")), (("10010011"), ("11000100"), ("11000100"), ("01010111")), (("10010001"), ("11000101"), ("11000101"), ("01010100")), (("10010111"), ("11000110"), ("11000110"), ("01010001")), (("10010101"), ("11000111"), ("11000111"), ("01010010")), (("10001011"), ("11001000"), ("11001000"), ("01000011")), (("10001001"), ("11001001"), ("11001001"), ("01000000")), (("10001111"), ("11001010"), ("11001010"), ("01000101")), (("10001101"), ("11001011"), ("11001011"), ("01000110")), (("10000011"), ("11001100"), ("11001100"), ("01001111")),

(("10000001"), ("11001101"), ("11001101"), ("01001100")), (("10000111"), ("11001110"), ("11001110"), ("01001001")), (("10000101"), ("11001111"), ("11001111"), ("01001010")), (("10111011"), ("11010000"), ("11010000"), ("01101011")), (("10111001"), ("11010001"), ("11010001"), ("01101000")), (("10111111"), ("11010010"), ("11010010"), ("01101101")), (("10111101"), ("11010011"), ("11010011"), ("01101110")), (("10110011"), ("11010100"), ("11010100"), ("01100111")), (("10110001"), ("11010101"), ("11010101"), ("01100100")), (("10110111"), ("11010110"), ("11010110"), ("01100001")), (("10110101"), ("11010111"), ("11010111"), ("01100010")), (("10101011"), ("11011000"), ("11011000"), ("01110011")), (("10101001"), ("11011001"), ("11011001"), ("01110000")), (("10101111"), ("11011010"), ("11011010"), ("01110101")), (("10101101"), ("11011011"), ("11011011"), ("01110110")), (("10100011"), ("11011100"), ("11011100"), ("01111111")), (("10100001"), ("11011101"), ("11011101"), ("01111100")), (("10100111"), ("11011110"), ("11011110"), ("01111001")), (("10100101"), ("11011111"), ("11011111"), ("01111010")), (("11011011"), ("11100000"), ("11100000"), ("00111011")), (("11011001"), ("11100001"), ("11100001"), ("00111000")), (("11011111"), ("11100010"), ("11100010"), ("00111101")), (("11011101"), ("11100011"), ("11100011"), ("00111110")), (("11010011"), ("11100100"), ("11100100"), ("00110111")), (("11010001"), ("11100101"), ("11100101"), ("00110100")), (("11010111"), ("11100110"), ("11100110"), ("00110001")), (("11010101"), ("11100111"), ("11100111"), ("00110010")), (("11001011"), ("11101000"), ("11101000"), ("00100011")), (("11001001"), ("11101001"), ("11101001"), ("00100000")), (("11001111"), ("11101010"), ("11101010"), ("00100101")), (("11001101"), ("11101011"), ("11101011"), ("00100110")), (("11000011"), ("11101100"), ("11101100"), ("00101111")), (("11000001"), ("11101101"), ("11101101"), ("00101100")), (("11000111"), ("11101110"), ("11101110"), ("00101001")), (("11000101"), 
("11101111"), ("11101111"), ("00101010")), (("11111011"), ("11110000"), ("11110000"), ("00001011")), (("11111001"), ("11110001"), ("11110001"), ("00001000")), (("11111111"), ("11110010"), ("11110010"), ("00001101")), (("11111101"), ("11110011"), ("11110011"), ("00001110")), (("11110011"), ("11110100"), ("11110100"), ("00000111")), (("11110001"), ("11110101"), ("11110101"), ("00000100")), (("11110111"), ("11110110"), ("11110110"), ("00000001")), (("11110101"), ("11110111"), ("11110111"), ("00000010")), (("11101011"), ("11111000"), ("11111000"), ("00010011")), (("11101001"), ("11111001"), ("11111001"), ("00010000")), (("11101111"), ("11111010"), ("11111010"), ("00010101")), (("11101101"), ("11111011"), ("11111011"), ("00010110")), (("11100011"), ("11111100"), ("11111100"), ("00011111")), (("11100001"), ("11111101"), ("11111101"), ("00011100")), (("11100111"), ("11111110"), ("11111110"), ("00011001")), (("11100101"), ("11111111"), ("11111111"), ("00011010"))
);

constant mul_row1: mul_table := (
(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000011"), ("00000010"), ("00000001"), ("00000001")), (("00000110"), ("00000100"), ("00000010"), ("00000010")),

(("00000101"), ("00000110"), ("00000011"), ("00000011")), (("00001100"), ("00001000"), ("00000100"), ("00000100")), (("00001111"), ("00001010"), ("00000101"), ("00000101")), (("00001010"), ("00001100"), ("00000110"), ("00000110")), (("00001001"), ("00001110"), ("00000111"), ("00000111")), (("00011000"), ("00010000"), ("00001000"), ("00001000")), (("00011011"), ("00010010"), ("00001001"), ("00001001")), (("00011110"), ("00010100"), ("00001010"), ("00001010")), (("00011101"), ("00010110"), ("00001011"), ("00001011")), (("00010100"), ("00011000"), ("00001100"), ("00001100")), (("00010111"), ("00011010"), ("00001101"), ("00001101")), (("00010010"), ("00011100"), ("00001110"), ("00001110")), (("00010001"), ("00011110"), ("00001111"), ("00001111")), (("00110000"), ("00100000"), ("00010000"), ("00010000")), (("00110011"), ("00100010"), ("00010001"), ("00010001")), (("00110110"), ("00100100"), ("00010010"), ("00010010")), (("00110101"), ("00100110"), ("00010011"), ("00010011")), (("00111100"), ("00101000"), ("00010100"), ("00010100")), (("00111111"), ("00101010"), ("00010101"), ("00010101")), (("00111010"), ("00101100"), ("00010110"), ("00010110")), (("00111001"), ("00101110"), ("00010111"), ("00010111")), (("00101000"), ("00110000"), ("00011000"), ("00011000")), (("00101011"), ("00110010"), ("00011001"), ("00011001")), (("00101110"), ("00110100"), ("00011010"), ("00011010")), (("00101101"), ("00110110"), ("00011011"), ("00011011")), (("00100100"), ("00111000"), ("00011100"), ("00011100")), (("00100111"), ("00111010"), ("00011101"), ("00011101")), (("00100010"), ("00111100"), ("00011110"), ("00011110")), (("00100001"), ("00111110"), ("00011111"), ("00011111")), (("01100000"), ("01000000"), ("00100000"), ("00100000")), (("01100011"), ("01000010"), ("00100001"), ("00100001")), (("01100110"), ("01000100"), ("00100010"), ("00100010")), (("01100101"), ("01000110"), ("00100011"), ("00100011")), (("01101100"), ("01001000"), ("00100100"), ("00100100")), (("01101111"), 
("01001010"), ("00100101"), ("00100101")), (("01101010"), ("01001100"), ("00100110"), ("00100110")), (("01101001"), ("01001110"), ("00100111"), ("00100111")), (("01111000"), ("01010000"), ("00101000"), ("00101000")), (("01111011"), ("01010010"), ("00101001"), ("00101001")), (("01111110"), ("01010100"), ("00101010"), ("00101010")), (("01111101"), ("01010110"), ("00101011"), ("00101011")), (("01110100"), ("01011000"), ("00101100"), ("00101100")), (("01110111"), ("01011010"), ("00101101"), ("00101101")), (("01110010"), ("01011100"), ("00101110"), ("00101110")), (("01110001"), ("01011110"), ("00101111"), ("00101111")), (("01010000"), ("01100000"), ("00110000"), ("00110000")), (("01010011"), ("01100010"), ("00110001"), ("00110001")), (("01010110"), ("01100100"), ("00110010"), ("00110010")), (("01010101"), ("01100110"), ("00110011"), ("00110011")), (("01011100"), ("01101000"), ("00110100"), ("00110100")), (("01011111"), ("01101010"), ("00110101"), ("00110101")), (("01011010"), ("01101100"), ("00110110"), ("00110110")), (("01011001"), ("01101110"), ("00110111"), ("00110111")), (("01001000"), ("01110000"), ("00111000"), ("00111000")), (("01001011"), ("01110010"), ("00111001"), ("00111001")), (("01001110"), ("01110100"), ("00111010"), ("00111010")), (("01001101"), ("01110110"), ("00111011"), ("00111011")),

(("01000100"), ("01111000"), ("00111100"), ("00111100")), (("01000111"), ("01111010"), ("00111101"), ("00111101")), (("01000010"), ("01111100"), ("00111110"), ("00111110")), (("01000001"), ("01111110"), ("00111111"), ("00111111")), (("11000000"), ("10000000"), ("01000000"), ("01000000")), (("11000011"), ("10000010"), ("01000001"), ("01000001")), (("11000110"), ("10000100"), ("01000010"), ("01000010")), (("11000101"), ("10000110"), ("01000011"), ("01000011")), (("11001100"), ("10001000"), ("01000100"), ("01000100")), (("11001111"), ("10001010"), ("01000101"), ("01000101")), (("11001010"), ("10001100"), ("01000110"), ("01000110")), (("11001001"), ("10001110"), ("01000111"), ("01000111")), (("11011000"), ("10010000"), ("01001000"), ("01001000")), (("11011011"), ("10010010"), ("01001001"), ("01001001")), (("11011110"), ("10010100"), ("01001010"), ("01001010")), (("11011101"), ("10010110"), ("01001011"), ("01001011")), (("11010100"), ("10011000"), ("01001100"), ("01001100")), (("11010111"), ("10011010"), ("01001101"), ("01001101")), (("11010010"), ("10011100"), ("01001110"), ("01001110")), (("11010001"), ("10011110"), ("01001111"), ("01001111")), (("11110000"), ("10100000"), ("01010000"), ("01010000")), (("11110011"), ("10100010"), ("01010001"), ("01010001")), (("11110110"), ("10100100"), ("01010010"), ("01010010")), (("11110101"), ("10100110"), ("01010011"), ("01010011")), (("11111100"), ("10101000"), ("01010100"), ("01010100")), (("11111111"), ("10101010"), ("01010101"), ("01010101")), (("11111010"), ("10101100"), ("01010110"), ("01010110")), (("11111001"), ("10101110"), ("01010111"), ("01010111")), (("11101000"), ("10110000"), ("01011000"), ("01011000")), (("11101011"), ("10110010"), ("01011001"), ("01011001")), (("11101110"), ("10110100"), ("01011010"), ("01011010")), (("11101101"), ("10110110"), ("01011011"), ("01011011")), (("11100100"), ("10111000"), ("01011100"), ("01011100")), (("11100111"), ("10111010"), ("01011101"), ("01011101")), (("11100010"), 
("10111100"), ("01011110"), ("01011110")), (("11100001"), ("10111110"), ("01011111"), ("01011111")), (("10100000"), ("11000000"), ("01100000"), ("01100000")), (("10100011"), ("11000010"), ("01100001"), ("01100001")), (("10100110"), ("11000100"), ("01100010"), ("01100010")), (("10100101"), ("11000110"), ("01100011"), ("01100011")), (("10101100"), ("11001000"), ("01100100"), ("01100100")), (("10101111"), ("11001010"), ("01100101"), ("01100101")), (("10101010"), ("11001100"), ("01100110"), ("01100110")), (("10101001"), ("11001110"), ("01100111"), ("01100111")), (("10111000"), ("11010000"), ("01101000"), ("01101000")), (("10111011"), ("11010010"), ("01101001"), ("01101001")), (("10111110"), ("11010100"), ("01101010"), ("01101010")), (("10111101"), ("11010110"), ("01101011"), ("01101011")), (("10110100"), ("11011000"), ("01101100"), ("01101100")), (("10110111"), ("11011010"), ("01101101"), ("01101101")), (("10110010"), ("11011100"), ("01101110"), ("01101110")), (("10110001"), ("11011110"), ("01101111"), ("01101111")), (("10010000"), ("11100000"), ("01110000"), ("01110000")), (("10010011"), ("11100010"), ("01110001"), ("01110001")), (("10010110"), ("11100100"), ("01110010"), ("01110010")), (("10010101"), ("11100110"), ("01110011"), ("01110011")), (("10011100"), ("11101000"), ("01110100"), ("01110100")),

(("10011111"), ("11101010"), ("01110101"), ("01110101")), (("10011010"), ("11101100"), ("01110110"), ("01110110")), (("10011001"), ("11101110"), ("01110111"), ("01110111")), (("10001000"), ("11110000"), ("01111000"), ("01111000")), (("10001011"), ("11110010"), ("01111001"), ("01111001")), (("10001110"), ("11110100"), ("01111010"), ("01111010")), (("10001101"), ("11110110"), ("01111011"), ("01111011")), (("10000100"), ("11111000"), ("01111100"), ("01111100")), (("10000111"), ("11111010"), ("01111101"), ("01111101")), (("10000010"), ("11111100"), ("01111110"), ("01111110")), (("10000001"), ("11111110"), ("01111111"), ("01111111")), (("10011011"), ("00011011"), ("10000000"), ("10000000")), (("10011000"), ("00011001"), ("10000001"), ("10000001")), (("10011101"), ("00011111"), ("10000010"), ("10000010")), (("10011110"), ("00011101"), ("10000011"), ("10000011")), (("10010111"), ("00010011"), ("10000100"), ("10000100")), (("10010100"), ("00010001"), ("10000101"), ("10000101")), (("10010001"), ("00010111"), ("10000110"), ("10000110")), (("10010010"), ("00010101"), ("10000111"), ("10000111")), (("10000011"), ("00001011"), ("10001000"), ("10001000")), (("10000000"), ("00001001"), ("10001001"), ("10001001")), (("10000101"), ("00001111"), ("10001010"), ("10001010")), (("10000110"), ("00001101"), ("10001011"), ("10001011")), (("10001111"), ("00000011"), ("10001100"), ("10001100")), (("10001100"), ("00000001"), ("10001101"), ("10001101")), (("10001001"), ("00000111"), ("10001110"), ("10001110")), (("10001010"), ("00000101"), ("10001111"), ("10001111")), (("10101011"), ("00111011"), ("10010000"), ("10010000")), (("10101000"), ("00111001"), ("10010001"), ("10010001")), (("10101101"), ("00111111"), ("10010010"), ("10010010")), (("10101110"), ("00111101"), ("10010011"), ("10010011")), (("10100111"), ("00110011"), ("10010100"), ("10010100")), (("10100100"), ("00110001"), ("10010101"), ("10010101")), (("10100001"), ("00110111"), ("10010110"), ("10010110")), (("10100010"), 
("00110101"), ("10010111"), ("10010111")), (("10110011"), ("00101011"), ("10011000"), ("10011000")), (("10110000"), ("00101001"), ("10011001"), ("10011001")), (("10110101"), ("00101111"), ("10011010"), ("10011010")), (("10110110"), ("00101101"), ("10011011"), ("10011011")), (("10111111"), ("00100011"), ("10011100"), ("10011100")), (("10111100"), ("00100001"), ("10011101"), ("10011101")), (("10111001"), ("00100111"), ("10011110"), ("10011110")), (("10111010"), ("00100101"), ("10011111"), ("10011111")), (("11111011"), ("01011011"), ("10100000"), ("10100000")), (("11111000"), ("01011001"), ("10100001"), ("10100001")), (("11111101"), ("01011111"), ("10100010"), ("10100010")), (("11111110"), ("01011101"), ("10100011"), ("10100011")), (("11110111"), ("01010011"), ("10100100"), ("10100100")), (("11110100"), ("01010001"), ("10100101"), ("10100101")), (("11110001"), ("01010111"), ("10100110"), ("10100110")), (("11110010"), ("01010101"), ("10100111"), ("10100111")), (("11100011"), ("01001011"), ("10101000"), ("10101000")), (("11100000"), ("01001001"), ("10101001"), ("10101001")), (("11100101"), ("01001111"), ("10101010"), ("10101010")), (("11100110"), ("01001101"), ("10101011"), ("10101011")), (("11101111"), ("01000011"), ("10101100"), ("10101100")), (("11101100"), ("01000001"), ("10101101"), ("10101101")),

(("11101001"), ("01000111"), ("10101110"), ("10101110")), (("11101010"), ("01000101"), ("10101111"), ("10101111")), (("11001011"), ("01111011"), ("10110000"), ("10110000")), (("11001000"), ("01111001"), ("10110001"), ("10110001")), (("11001101"), ("01111111"), ("10110010"), ("10110010")), (("11001110"), ("01111101"), ("10110011"), ("10110011")), (("11000111"), ("01110011"), ("10110100"), ("10110100")), (("11000100"), ("01110001"), ("10110101"), ("10110101")), (("11000001"), ("01110111"), ("10110110"), ("10110110")), (("11000010"), ("01110101"), ("10110111"), ("10110111")), (("11010011"), ("01101011"), ("10111000"), ("10111000")), (("11010000"), ("01101001"), ("10111001"), ("10111001")), (("11010101"), ("01101111"), ("10111010"), ("10111010")), (("11010110"), ("01101101"), ("10111011"), ("10111011")), (("11011111"), ("01100011"), ("10111100"), ("10111100")), (("11011100"), ("01100001"), ("10111101"), ("10111101")), (("11011001"), ("01100111"), ("10111110"), ("10111110")), (("11011010"), ("01100101"), ("10111111"), ("10111111")), (("01011011"), ("10011011"), ("11000000"), ("11000000")), (("01011000"), ("10011001"), ("11000001"), ("11000001")), (("01011101"), ("10011111"), ("11000010"), ("11000010")), (("01011110"), ("10011101"), ("11000011"), ("11000011")), (("01010111"), ("10010011"), ("11000100"), ("11000100")), (("01010100"), ("10010001"), ("11000101"), ("11000101")), (("01010001"), ("10010111"), ("11000110"), ("11000110")), (("01010010"), ("10010101"), ("11000111"), ("11000111")), (("01000011"), ("10001011"), ("11001000"), ("11001000")), (("01000000"), ("10001001"), ("11001001"), ("11001001")), (("01000101"), ("10001111"), ("11001010"), ("11001010")), (("01000110"), ("10001101"), ("11001011"), ("11001011")), (("01001111"), ("10000011"), ("11001100"), ("11001100")), (("01001100"), ("10000001"), ("11001101"), ("11001101")), (("01001001"), ("10000111"), ("11001110"), ("11001110")), (("01001010"), ("10000101"), ("11001111"), ("11001111")), (("01101011"), 
("10111011"), ("11010000"), ("11010000")), (("01101000"), ("10111001"), ("11010001"), ("11010001")), (("01101101"), ("10111111"), ("11010010"), ("11010010")), (("01101110"), ("10111101"), ("11010011"), ("11010011")), (("01100111"), ("10110011"), ("11010100"), ("11010100")), (("01100100"), ("10110001"), ("11010101"), ("11010101")), (("01100001"), ("10110111"), ("11010110"), ("11010110")), (("01100010"), ("10110101"), ("11010111"), ("11010111")), (("01110011"), ("10101011"), ("11011000"), ("11011000")), (("01110000"), ("10101001"), ("11011001"), ("11011001")), (("01110101"), ("10101111"), ("11011010"), ("11011010")), (("01110110"), ("10101101"), ("11011011"), ("11011011")), (("01111111"), ("10100011"), ("11011100"), ("11011100")), (("01111100"), ("10100001"), ("11011101"), ("11011101")), (("01111001"), ("10100111"), ("11011110"), ("11011110")), (("01111010"), ("10100101"), ("11011111"), ("11011111")), (("00111011"), ("11011011"), ("11100000"), ("11100000")), (("00111000"), ("11011001"), ("11100001"), ("11100001")), (("00111101"), ("11011111"), ("11100010"), ("11100010")), (("00111110"), ("11011101"), ("11100011"), ("11100011")), (("00110111"), ("11010011"), ("11100100"), ("11100100")), (("00110100"), ("11010001"), ("11100101"), ("11100101")), (("00110001"), ("11010111"), ("11100110"), ("11100110")),

(("00110010"), ("11010101"), ("11100111"), ("11100111")), (("00100011"), ("11001011"), ("11101000"), ("11101000")), (("00100000"), ("11001001"), ("11101001"), ("11101001")), (("00100101"), ("11001111"), ("11101010"), ("11101010")), (("00100110"), ("11001101"), ("11101011"), ("11101011")), (("00101111"), ("11000011"), ("11101100"), ("11101100")), (("00101100"), ("11000001"), ("11101101"), ("11101101")), (("00101001"), ("11000111"), ("11101110"), ("11101110")), (("00101010"), ("11000101"), ("11101111"), ("11101111")), (("00001011"), ("11111011"), ("11110000"), ("11110000")), (("00001000"), ("11111001"), ("11110001"), ("11110001")), (("00001101"), ("11111111"), ("11110010"), ("11110010")), (("00001110"), ("11111101"), ("11110011"), ("11110011")), (("00000111"), ("11110011"), ("11110100"), ("11110100")), (("00000100"), ("11110001"), ("11110101"), ("11110101")), (("00000001"), ("11110111"), ("11110110"), ("11110110")), (("00000010"), ("11110101"), ("11110111"), ("11110111")), (("00010011"), ("11101011"), ("11111000"), ("11111000")), (("00010000"), ("11101001"), ("11111001"), ("11111001")), (("00010101"), ("11101111"), ("11111010"), ("11111010")), (("00010110"), ("11101101"), ("11111011"), ("11111011")), (("00011111"), ("11100011"), ("11111100"), ("11111100")), (("00011100"), ("11100001"), ("11111101"), ("11111101")), (("00011001"), ("11100111"), ("11111110"), ("11111110")), (("00011010"), ("11100101"), ("11111111"), ("11111111"))
);

constant mul_row2: mul_table := (
(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000001"), ("00000011"), ("00000010"), ("00000001")), (("00000010"), ("00000110"), ("00000100"), ("00000010")), (("00000011"), ("00000101"), ("00000110"), ("00000011")), (("00000100"), ("00001100"), ("00001000"), ("00000100")), (("00000101"), ("00001111"), ("00001010"), ("00000101")), (("00000110"), ("00001010"), ("00001100"), ("00000110")), (("00000111"), ("00001001"), ("00001110"), ("00000111")), (("00001000"), ("00011000"), ("00010000"),
("00001000")), (("00001001"), ("00011011"), ("00010010"), ("00001001")), (("00001010"), ("00011110"), ("00010100"), ("00001010")), (("00001011"), ("00011101"), ("00010110"), ("00001011")), (("00001100"), ("00010100"), ("00011000"), ("00001100")), (("00001101"), ("00010111"), ("00011010"), ("00001101")), (("00001110"), ("00010010"), ("00011100"), ("00001110")), (("00001111"), ("00010001"), ("00011110"), ("00001111")), (("00010000"), ("00110000"), ("00100000"), ("00010000")), (("00010001"), ("00110011"), ("00100010"), ("00010001")), (("00010010"), ("00110110"), ("00100100"), ("00010010")), (("00010011"), ("00110101"), ("00100110"), ("00010011")), (("00010100"), ("00111100"), ("00101000"), ("00010100")), (("00010101"), ("00111111"), ("00101010"), ("00010101")), (("00010110"), ("00111010"), ("00101100"), ("00010110")), (("00010111"), ("00111001"), ("00101110"), ("00010111")), (("00011000"), ("00101000"), ("00110000"), ("00011000")), (("00011001"), ("00101011"), ("00110010"), ("00011001")), (("00011010"), ("00101110"), ("00110100"), ("00011010")), (("00011011"), ("00101101"), ("00110110"), ("00011011")), (("00011100"), ("00100100"), ("00111000"), ("00011100")), (("00011101"), ("00100111"), ("00111010"), ("00011101")),

(("00011110"), ("00100010"), ("00111100"), ("00011110")), (("00011111"), ("00100001"), ("00111110"), ("00011111")), (("00100000"), ("01100000"), ("01000000"), ("00100000")), (("00100001"), ("01100011"), ("01000010"), ("00100001")), (("00100010"), ("01100110"), ("01000100"), ("00100010")), (("00100011"), ("01100101"), ("01000110"), ("00100011")), (("00100100"), ("01101100"), ("01001000"), ("00100100")), (("00100101"), ("01101111"), ("01001010"), ("00100101")), (("00100110"), ("01101010"), ("01001100"), ("00100110")), (("00100111"), ("01101001"), ("01001110"), ("00100111")), (("00101000"), ("01111000"), ("01010000"), ("00101000")), (("00101001"), ("01111011"), ("01010010"), ("00101001")), (("00101010"), ("01111110"), ("01010100"), ("00101010")), (("00101011"), ("01111101"), ("01010110"), ("00101011")), (("00101100"), ("01110100"), ("01011000"), ("00101100")), (("00101101"), ("01110111"), ("01011010"), ("00101101")), (("00101110"), ("01110010"), ("01011100"), ("00101110")), (("00101111"), ("01110001"), ("01011110"), ("00101111")), (("00110000"), ("01010000"), ("01100000"), ("00110000")), (("00110001"), ("01010011"), ("01100010"), ("00110001")), (("00110010"), ("01010110"), ("01100100"), ("00110010")), (("00110011"), ("01010101"), ("01100110"), ("00110011")), (("00110100"), ("01011100"), ("01101000"), ("00110100")), (("00110101"), ("01011111"), ("01101010"), ("00110101")), (("00110110"), ("01011010"), ("01101100"), ("00110110")), (("00110111"), ("01011001"), ("01101110"), ("00110111")), (("00111000"), ("01001000"), ("01110000"), ("00111000")), (("00111001"), ("01001011"), ("01110010"), ("00111001")), (("00111010"), ("01001110"), ("01110100"), ("00111010")), (("00111011"), ("01001101"), ("01110110"), ("00111011")), (("00111100"), ("01000100"), ("01111000"), ("00111100")), (("00111101"), ("01000111"), ("01111010"), ("00111101")), (("00111110"), ("01000010"), ("01111100"), ("00111110")), (("00111111"), ("01000001"), ("01111110"), ("00111111")), (("01000000"), 
("11000000"), ("10000000"), ("01000000")), (("01000001"), ("11000011"), ("10000010"), ("01000001")), (("01000010"), ("11000110"), ("10000100"), ("01000010")), (("01000011"), ("11000101"), ("10000110"), ("01000011")), (("01000100"), ("11001100"), ("10001000"), ("01000100")), (("01000101"), ("11001111"), ("10001010"), ("01000101")), (("01000110"), ("11001010"), ("10001100"), ("01000110")), (("01000111"), ("11001001"), ("10001110"), ("01000111")), (("01001000"), ("11011000"), ("10010000"), ("01001000")), (("01001001"), ("11011011"), ("10010010"), ("01001001")), (("01001010"), ("11011110"), ("10010100"), ("01001010")), (("01001011"), ("11011101"), ("10010110"), ("01001011")), (("01001100"), ("11010100"), ("10011000"), ("01001100")), (("01001101"), ("11010111"), ("10011010"), ("01001101")), (("01001110"), ("11010010"), ("10011100"), ("01001110")), (("01001111"), ("11010001"), ("10011110"), ("01001111")), (("01010000"), ("11110000"), ("10100000"), ("01010000")), (("01010001"), ("11110011"), ("10100010"), ("01010001")), (("01010010"), ("11110110"), ("10100100"), ("01010010")), (("01010011"), ("11110101"), ("10100110"), ("01010011")), (("01010100"), ("11111100"), ("10101000"), ("01010100")), (("01010101"), ("11111111"), ("10101010"), ("01010101")), (("01010110"), ("11111010"), ("10101100"), ("01010110")),

(("01010111"), ("11111001"), ("10101110"), ("01010111")), (("01011000"), ("11101000"), ("10110000"), ("01011000")), (("01011001"), ("11101011"), ("10110010"), ("01011001")), (("01011010"), ("11101110"), ("10110100"), ("01011010")), (("01011011"), ("11101101"), ("10110110"), ("01011011")), (("01011100"), ("11100100"), ("10111000"), ("01011100")), (("01011101"), ("11100111"), ("10111010"), ("01011101")), (("01011110"), ("11100010"), ("10111100"), ("01011110")), (("01011111"), ("11100001"), ("10111110"), ("01011111")), (("01100000"), ("10100000"), ("11000000"), ("01100000")), (("01100001"), ("10100011"), ("11000010"), ("01100001")), (("01100010"), ("10100110"), ("11000100"), ("01100010")), (("01100011"), ("10100101"), ("11000110"), ("01100011")), (("01100100"), ("10101100"), ("11001000"), ("01100100")), (("01100101"), ("10101111"), ("11001010"), ("01100101")), (("01100110"), ("10101010"), ("11001100"), ("01100110")), (("01100111"), ("10101001"), ("11001110"), ("01100111")), (("01101000"), ("10111000"), ("11010000"), ("01101000")), (("01101001"), ("10111011"), ("11010010"), ("01101001")), (("01101010"), ("10111110"), ("11010100"), ("01101010")), (("01101011"), ("10111101"), ("11010110"), ("01101011")), (("01101100"), ("10110100"), ("11011000"), ("01101100")), (("01101101"), ("10110111"), ("11011010"), ("01101101")), (("01101110"), ("10110010"), ("11011100"), ("01101110")), (("01101111"), ("10110001"), ("11011110"), ("01101111")), (("01110000"), ("10010000"), ("11100000"), ("01110000")), (("01110001"), ("10010011"), ("11100010"), ("01110001")), (("01110010"), ("10010110"), ("11100100"), ("01110010")), (("01110011"), ("10010101"), ("11100110"), ("01110011")), (("01110100"), ("10011100"), ("11101000"), ("01110100")), (("01110101"), ("10011111"), ("11101010"), ("01110101")), (("01110110"), ("10011010"), ("11101100"), ("01110110")), (("01110111"), ("10011001"), ("11101110"), ("01110111")), (("01111000"), ("10001000"), ("11110000"), ("01111000")), (("01111001"), 
("10001011"), ("11110010"), ("01111001")), (("01111010"), ("10001110"), ("11110100"), ("01111010")), (("01111011"), ("10001101"), ("11110110"), ("01111011")), (("01111100"), ("10000100"), ("11111000"), ("01111100")), (("01111101"), ("10000111"), ("11111010"), ("01111101")), (("01111110"), ("10000010"), ("11111100"), ("01111110")), (("01111111"), ("10000001"), ("11111110"), ("01111111")), (("10000000"), ("10011011"), ("00011011"), ("10000000")), (("10000001"), ("10011000"), ("00011001"), ("10000001")), (("10000010"), ("10011101"), ("00011111"), ("10000010")), (("10000011"), ("10011110"), ("00011101"), ("10000011")), (("10000100"), ("10010111"), ("00010011"), ("10000100")), (("10000101"), ("10010100"), ("00010001"), ("10000101")), (("10000110"), ("10010001"), ("00010111"), ("10000110")), (("10000111"), ("10010010"), ("00010101"), ("10000111")), (("10001000"), ("10000011"), ("00001011"), ("10001000")), (("10001001"), ("10000000"), ("00001001"), ("10001001")), (("10001010"), ("10000101"), ("00001111"), ("10001010")), (("10001011"), ("10000110"), ("00001101"), ("10001011")), (("10001100"), ("10001111"), ("00000011"), ("10001100")), (("10001101"), ("10001100"), ("00000001"), ("10001101")), (("10001110"), ("10001001"), ("00000111"), ("10001110")), (("10001111"), ("10001010"), ("00000101"), ("10001111")),

(("10010000"), ("10101011"), ("00111011"), ("10010000")), (("10010001"), ("10101000"), ("00111001"), ("10010001")), (("10010010"), ("10101101"), ("00111111"), ("10010010")), (("10010011"), ("10101110"), ("00111101"), ("10010011")), (("10010100"), ("10100111"), ("00110011"), ("10010100")), (("10010101"), ("10100100"), ("00110001"), ("10010101")), (("10010110"), ("10100001"), ("00110111"), ("10010110")), (("10010111"), ("10100010"), ("00110101"), ("10010111")), (("10011000"), ("10110011"), ("00101011"), ("10011000")), (("10011001"), ("10110000"), ("00101001"), ("10011001")), (("10011010"), ("10110101"), ("00101111"), ("10011010")), (("10011011"), ("10110110"), ("00101101"), ("10011011")), (("10011100"), ("10111111"), ("00100011"), ("10011100")), (("10011101"), ("10111100"), ("00100001"), ("10011101")), (("10011110"), ("10111001"), ("00100111"), ("10011110")), (("10011111"), ("10111010"), ("00100101"), ("10011111")), (("10100000"), ("11111011"), ("01011011"), ("10100000")), (("10100001"), ("11111000"), ("01011001"), ("10100001")), (("10100010"), ("11111101"), ("01011111"), ("10100010")), (("10100011"), ("11111110"), ("01011101"), ("10100011")), (("10100100"), ("11110111"), ("01010011"), ("10100100")), (("10100101"), ("11110100"), ("01010001"), ("10100101")), (("10100110"), ("11110001"), ("01010111"), ("10100110")), (("10100111"), ("11110010"), ("01010101"), ("10100111")), (("10101000"), ("11100011"), ("01001011"), ("10101000")), (("10101001"), ("11100000"), ("01001001"), ("10101001")), (("10101010"), ("11100101"), ("01001111"), ("10101010")), (("10101011"), ("11100110"), ("01001101"), ("10101011")), (("10101100"), ("11101111"), ("01000011"), ("10101100")), (("10101101"), ("11101100"), ("01000001"), ("10101101")), (("10101110"), ("11101001"), ("01000111"), ("10101110")), (("10101111"), ("11101010"), ("01000101"), ("10101111")), (("10110000"), ("11001011"), ("01111011"), ("10110000")), (("10110001"), ("11001000"), ("01111001"), ("10110001")), (("10110010"), 
("11001101"), ("01111111"), ("10110010")), (("10110011"), ("11001110"), ("01111101"), ("10110011")), (("10110100"), ("11000111"), ("01110011"), ("10110100")), (("10110101"), ("11000100"), ("01110001"), ("10110101")), (("10110110"), ("11000001"), ("01110111"), ("10110110")), (("10110111"), ("11000010"), ("01110101"), ("10110111")), (("10111000"), ("11010011"), ("01101011"), ("10111000")), (("10111001"), ("11010000"), ("01101001"), ("10111001")), (("10111010"), ("11010101"), ("01101111"), ("10111010")), (("10111011"), ("11010110"), ("01101101"), ("10111011")), (("10111100"), ("11011111"), ("01100011"), ("10111100")), (("10111101"), ("11011100"), ("01100001"), ("10111101")), (("10111110"), ("11011001"), ("01100111"), ("10111110")), (("10111111"), ("11011010"), ("01100101"), ("10111111")), (("11000000"), ("01011011"), ("10011011"), ("11000000")), (("11000001"), ("01011000"), ("10011001"), ("11000001")), (("11000010"), ("01011101"), ("10011111"), ("11000010")), (("11000011"), ("01011110"), ("10011101"), ("11000011")), (("11000100"), ("01010111"), ("10010011"), ("11000100")), (("11000101"), ("01010100"), ("10010001"), ("11000101")), (("11000110"), ("01010001"), ("10010111"), ("11000110")), (("11000111"), ("01010010"), ("10010101"), ("11000111")), (("11001000"), ("01000011"), ("10001011"), ("11001000")),

(("11001001"), ("01000000"), ("10001001"), ("11001001")), (("11001010"), ("01000101"), ("10001111"), ("11001010")), (("11001011"), ("01000110"), ("10001101"), ("11001011")), (("11001100"), ("01001111"), ("10000011"), ("11001100")), (("11001101"), ("01001100"), ("10000001"), ("11001101")), (("11001110"), ("01001001"), ("10000111"), ("11001110")), (("11001111"), ("01001010"), ("10000101"), ("11001111")), (("11010000"), ("01101011"), ("10111011"), ("11010000")), (("11010001"), ("01101000"), ("10111001"), ("11010001")), (("11010010"), ("01101101"), ("10111111"), ("11010010")), (("11010011"), ("01101110"), ("10111101"), ("11010011")), (("11010100"), ("01100111"), ("10110011"), ("11010100")), (("11010101"), ("01100100"), ("10110001"), ("11010101")), (("11010110"), ("01100001"), ("10110111"), ("11010110")), (("11010111"), ("01100010"), ("10110101"), ("11010111")), (("11011000"), ("01110011"), ("10101011"), ("11011000")), (("11011001"), ("01110000"), ("10101001"), ("11011001")), (("11011010"), ("01110101"), ("10101111"), ("11011010")), (("11011011"), ("01110110"), ("10101101"), ("11011011")), (("11011100"), ("01111111"), ("10100011"), ("11011100")), (("11011101"), ("01111100"), ("10100001"), ("11011101")), (("11011110"), ("01111001"), ("10100111"), ("11011110")), (("11011111"), ("01111010"), ("10100101"), ("11011111")), (("11100000"), ("00111011"), ("11011011"), ("11100000")), (("11100001"), ("00111000"), ("11011001"), ("11100001")), (("11100010"), ("00111101"), ("11011111"), ("11100010")), (("11100011"), ("00111110"), ("11011101"), ("11100011")), (("11100100"), ("00110111"), ("11010011"), ("11100100")), (("11100101"), ("00110100"), ("11010001"), ("11100101")), (("11100110"), ("00110001"), ("11010111"), ("11100110")), (("11100111"), ("00110010"), ("11010101"), ("11100111")), (("11101000"), ("00100011"), ("11001011"), ("11101000")), (("11101001"), ("00100000"), ("11001001"), ("11101001")), (("11101010"), ("00100101"), ("11001111"), ("11101010")), (("11101011"), 
("00100110"), ("11001101"), ("11101011")), (("11101100"), ("00101111"), ("11000011"), ("11101100")), (("11101101"), ("00101100"), ("11000001"), ("11101101")), (("11101110"), ("00101001"), ("11000111"), ("11101110")), (("11101111"), ("00101010"), ("11000101"), ("11101111")), (("11110000"), ("00001011"), ("11111011"), ("11110000")), (("11110001"), ("00001000"), ("11111001"), ("11110001")), (("11110010"), ("00001101"), ("11111111"), ("11110010")), (("11110011"), ("00001110"), ("11111101"), ("11110011")), (("11110100"), ("00000111"), ("11110011"), ("11110100")), (("11110101"), ("00000100"), ("11110001"), ("11110101")), (("11110110"), ("00000001"), ("11110111"), ("11110110")), (("11110111"), ("00000010"), ("11110101"), ("11110111")), (("11111000"), ("00010011"), ("11101011"), ("11111000")), (("11111001"), ("00010000"), ("11101001"), ("11111001")), (("11111010"), ("00010101"), ("11101111"), ("11111010")), (("11111011"), ("00010110"), ("11101101"), ("11111011")), (("11111100"), ("00011111"), ("11100011"), ("11111100")), (("11111101"), ("00011100"), ("11100001"), ("11111101")), (("11111110"), ("00011001"), ("11100111"), ("11111110")), (("11111111"), ("00011010"), ("11100101"), ("11111111")) ); constant mul_row3: mul_table:= (


(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000001"), ("00000001"), ("00000011"), ("00000010")), (("00000010"), ("00000010"), ("00000110"), ("00000100")), (("00000011"), ("00000011"), ("00000101"), ("00000110")), (("00000100"), ("00000100"), ("00001100"), ("00001000")), (("00000101"), ("00000101"), ("00001111"), ("00001010")), (("00000110"), ("00000110"), ("00001010"), ("00001100")), (("00000111"), ("00000111"), ("00001001"), ("00001110")), (("00001000"), ("00001000"), ("00011000"), ("00010000")), (("00001001"), ("00001001"), ("00011011"), ("00010010")), (("00001010"), ("00001010"), ("00011110"), ("00010100")), (("00001011"), ("00001011"), ("00011101"), ("00010110")), (("00001100"), ("00001100"), ("00010100"), ("00011000")), (("00001101"), ("00001101"), ("00010111"), ("00011010")), (("00001110"), ("00001110"), ("00010010"), ("00011100")), (("00001111"), ("00001111"), ("00010001"), ("00011110")), (("00010000"), ("00010000"), ("00110000"), ("00100000")), (("00010001"), ("00010001"), ("00110011"), ("00100010")), (("00010010"), ("00010010"), ("00110110"), ("00100100")), (("00010011"), ("00010011"), ("00110101"), ("00100110")), (("00010100"), ("00010100"), ("00111100"), ("00101000")), (("00010101"), ("00010101"), ("00111111"), ("00101010")), (("00010110"), ("00010110"), ("00111010"), ("00101100")), (("00010111"), ("00010111"), ("00111001"), ("00101110")), (("00011000"), ("00011000"), ("00101000"), ("00110000")), (("00011001"), ("00011001"), ("00101011"), ("00110010")), (("00011010"), ("00011010"), ("00101110"), ("00110100")), (("00011011"), ("00011011"), ("00101101"), ("00110110")), (("00011100"), ("00011100"), ("00100100"), ("00111000")), (("00011101"), ("00011101"), ("00100111"), ("00111010")), (("00011110"), ("00011110"), ("00100010"), ("00111100")), (("00011111"), ("00011111"), ("00100001"), ("00111110")), (("00100000"), ("00100000"), ("01100000"), ("01000000")), (("00100001"), ("00100001"), ("01100011"), ("01000010")), (("00100010"), 
("00100010"), ("01100110"), ("01000100")), (("00100011"), ("00100011"), ("01100101"), ("01000110")), (("00100100"), ("00100100"), ("01101100"), ("01001000")), (("00100101"), ("00100101"), ("01101111"), ("01001010")), (("00100110"), ("00100110"), ("01101010"), ("01001100")), (("00100111"), ("00100111"), ("01101001"), ("01001110")), (("00101000"), ("00101000"), ("01111000"), ("01010000")), (("00101001"), ("00101001"), ("01111011"), ("01010010")), (("00101010"), ("00101010"), ("01111110"), ("01010100")), (("00101011"), ("00101011"), ("01111101"), ("01010110")), (("00101100"), ("00101100"), ("01110100"), ("01011000")), (("00101101"), ("00101101"), ("01110111"), ("01011010")), (("00101110"), ("00101110"), ("01110010"), ("01011100")), (("00101111"), ("00101111"), ("01110001"), ("01011110")), (("00110000"), ("00110000"), ("01010000"), ("01100000")), (("00110001"), ("00110001"), ("01010011"), ("01100010")), (("00110010"), ("00110010"), ("01010110"), ("01100100")), (("00110011"), ("00110011"), ("01010101"), ("01100110")), (("00110100"), ("00110100"), ("01011100"), ("01101000")), (("00110101"), ("00110101"), ("01011111"), ("01101010")), (("00110110"), ("00110110"), ("01011010"), ("01101100")), (("00110111"), ("00110111"), ("01011001"), ("01101110")), (("00111000"), ("00111000"), ("01001000"), ("01110000")),


(("00111001"), ("00111001"), ("01001011"), ("01110010")), (("00111010"), ("00111010"), ("01001110"), ("01110100")), (("00111011"), ("00111011"), ("01001101"), ("01110110")), (("00111100"), ("00111100"), ("01000100"), ("01111000")), (("00111101"), ("00111101"), ("01000111"), ("01111010")), (("00111110"), ("00111110"), ("01000010"), ("01111100")), (("00111111"), ("00111111"), ("01000001"), ("01111110")), (("01000000"), ("01000000"), ("11000000"), ("10000000")), (("01000001"), ("01000001"), ("11000011"), ("10000010")), (("01000010"), ("01000010"), ("11000110"), ("10000100")), (("01000011"), ("01000011"), ("11000101"), ("10000110")), (("01000100"), ("01000100"), ("11001100"), ("10001000")), (("01000101"), ("01000101"), ("11001111"), ("10001010")), (("01000110"), ("01000110"), ("11001010"), ("10001100")), (("01000111"), ("01000111"), ("11001001"), ("10001110")), (("01001000"), ("01001000"), ("11011000"), ("10010000")), (("01001001"), ("01001001"), ("11011011"), ("10010010")), (("01001010"), ("01001010"), ("11011110"), ("10010100")), (("01001011"), ("01001011"), ("11011101"), ("10010110")), (("01001100"), ("01001100"), ("11010100"), ("10011000")), (("01001101"), ("01001101"), ("11010111"), ("10011010")), (("01001110"), ("01001110"), ("11010010"), ("10011100")), (("01001111"), ("01001111"), ("11010001"), ("10011110")), (("01010000"), ("01010000"), ("11110000"), ("10100000")), (("01010001"), ("01010001"), ("11110011"), ("10100010")), (("01010010"), ("01010010"), ("11110110"), ("10100100")), (("01010011"), ("01010011"), ("11110101"), ("10100110")), (("01010100"), ("01010100"), ("11111100"), ("10101000")), (("01010101"), ("01010101"), ("11111111"), ("10101010")), (("01010110"), ("01010110"), ("11111010"), ("10101100")), (("01010111"), ("01010111"), ("11111001"), ("10101110")), (("01011000"), ("01011000"), ("11101000"), ("10110000")), (("01011001"), ("01011001"), ("11101011"), ("10110010")), (("01011010"), ("01011010"), ("11101110"), ("10110100")), (("01011011"), 
("01011011"), ("11101101"), ("10110110")), (("01011100"), ("01011100"), ("11100100"), ("10111000")), (("01011101"), ("01011101"), ("11100111"), ("10111010")), (("01011110"), ("01011110"), ("11100010"), ("10111100")), (("01011111"), ("01011111"), ("11100001"), ("10111110")), (("01100000"), ("01100000"), ("10100000"), ("11000000")), (("01100001"), ("01100001"), ("10100011"), ("11000010")), (("01100010"), ("01100010"), ("10100110"), ("11000100")), (("01100011"), ("01100011"), ("10100101"), ("11000110")), (("01100100"), ("01100100"), ("10101100"), ("11001000")), (("01100101"), ("01100101"), ("10101111"), ("11001010")), (("01100110"), ("01100110"), ("10101010"), ("11001100")), (("01100111"), ("01100111"), ("10101001"), ("11001110")), (("01101000"), ("01101000"), ("10111000"), ("11010000")), (("01101001"), ("01101001"), ("10111011"), ("11010010")), (("01101010"), ("01101010"), ("10111110"), ("11010100")), (("01101011"), ("01101011"), ("10111101"), ("11010110")), (("01101100"), ("01101100"), ("10110100"), ("11011000")), (("01101101"), ("01101101"), ("10110111"), ("11011010")), (("01101110"), ("01101110"), ("10110010"), ("11011100")), (("01101111"), ("01101111"), ("10110001"), ("11011110")), (("01110000"), ("01110000"), ("10010000"), ("11100000")), (("01110001"), ("01110001"), ("10010011"), ("11100010")),


(("01110010"), ("01110010"), ("10010110"), ("11100100")), (("01110011"), ("01110011"), ("10010101"), ("11100110")), (("01110100"), ("01110100"), ("10011100"), ("11101000")), (("01110101"), ("01110101"), ("10011111"), ("11101010")), (("01110110"), ("01110110"), ("10011010"), ("11101100")), (("01110111"), ("01110111"), ("10011001"), ("11101110")), (("01111000"), ("01111000"), ("10001000"), ("11110000")), (("01111001"), ("01111001"), ("10001011"), ("11110010")), (("01111010"), ("01111010"), ("10001110"), ("11110100")), (("01111011"), ("01111011"), ("10001101"), ("11110110")), (("01111100"), ("01111100"), ("10000100"), ("11111000")), (("01111101"), ("01111101"), ("10000111"), ("11111010")), (("01111110"), ("01111110"), ("10000010"), ("11111100")), (("01111111"), ("01111111"), ("10000001"), ("11111110")), (("10000000"), ("10000000"), ("10011011"), ("00011011")), (("10000001"), ("10000001"), ("10011000"), ("00011001")), (("10000010"), ("10000010"), ("10011101"), ("00011111")), (("10000011"), ("10000011"), ("10011110"), ("00011101")), (("10000100"), ("10000100"), ("10010111"), ("00010011")), (("10000101"), ("10000101"), ("10010100"), ("00010001")), (("10000110"), ("10000110"), ("10010001"), ("00010111")), (("10000111"), ("10000111"), ("10010010"), ("00010101")), (("10001000"), ("10001000"), ("10000011"), ("00001011")), (("10001001"), ("10001001"), ("10000000"), ("00001001")), (("10001010"), ("10001010"), ("10000101"), ("00001111")), (("10001011"), ("10001011"), ("10000110"), ("00001101")), (("10001100"), ("10001100"), ("10001111"), ("00000011")), (("10001101"), ("10001101"), ("10001100"), ("00000001")), (("10001110"), ("10001110"), ("10001001"), ("00000111")), (("10001111"), ("10001111"), ("10001010"), ("00000101")), (("10010000"), ("10010000"), ("10101011"), ("00111011")), (("10010001"), ("10010001"), ("10101000"), ("00111001")), (("10010010"), ("10010010"), ("10101101"), ("00111111")), (("10010011"), ("10010011"), ("10101110"), ("00111101")), (("10010100"), 
("10010100"), ("10100111"), ("00110011")), (("10010101"), ("10010101"), ("10100100"), ("00110001")), (("10010110"), ("10010110"), ("10100001"), ("00110111")), (("10010111"), ("10010111"), ("10100010"), ("00110101")), (("10011000"), ("10011000"), ("10110011"), ("00101011")), (("10011001"), ("10011001"), ("10110000"), ("00101001")), (("10011010"), ("10011010"), ("10110101"), ("00101111")), (("10011011"), ("10011011"), ("10110110"), ("00101101")), (("10011100"), ("10011100"), ("10111111"), ("00100011")), (("10011101"), ("10011101"), ("10111100"), ("00100001")), (("10011110"), ("10011110"), ("10111001"), ("00100111")), (("10011111"), ("10011111"), ("10111010"), ("00100101")), (("10100000"), ("10100000"), ("11111011"), ("01011011")), (("10100001"), ("10100001"), ("11111000"), ("01011001")), (("10100010"), ("10100010"), ("11111101"), ("01011111")), (("10100011"), ("10100011"), ("11111110"), ("01011101")), (("10100100"), ("10100100"), ("11110111"), ("01010011")), (("10100101"), ("10100101"), ("11110100"), ("01010001")), (("10100110"), ("10100110"), ("11110001"), ("01010111")), (("10100111"), ("10100111"), ("11110010"), ("01010101")), (("10101000"), ("10101000"), ("11100011"), ("01001011")), (("10101001"), ("10101001"), ("11100000"), ("01001001")), (("10101010"), ("10101010"), ("11100101"), ("01001111")),


(("10101011"), ("10101011"), ("11100110"), ("01001101")), (("10101100"), ("10101100"), ("11101111"), ("01000011")), (("10101101"), ("10101101"), ("11101100"), ("01000001")), (("10101110"), ("10101110"), ("11101001"), ("01000111")), (("10101111"), ("10101111"), ("11101010"), ("01000101")), (("10110000"), ("10110000"), ("11001011"), ("01111011")), (("10110001"), ("10110001"), ("11001000"), ("01111001")), (("10110010"), ("10110010"), ("11001101"), ("01111111")), (("10110011"), ("10110011"), ("11001110"), ("01111101")), (("10110100"), ("10110100"), ("11000111"), ("01110011")), (("10110101"), ("10110101"), ("11000100"), ("01110001")), (("10110110"), ("10110110"), ("11000001"), ("01110111")), (("10110111"), ("10110111"), ("11000010"), ("01110101")), (("10111000"), ("10111000"), ("11010011"), ("01101011")), (("10111001"), ("10111001"), ("11010000"), ("01101001")), (("10111010"), ("10111010"), ("11010101"), ("01101111")), (("10111011"), ("10111011"), ("11010110"), ("01101101")), (("10111100"), ("10111100"), ("11011111"), ("01100011")), (("10111101"), ("10111101"), ("11011100"), ("01100001")), (("10111110"), ("10111110"), ("11011001"), ("01100111")), (("10111111"), ("10111111"), ("11011010"), ("01100101")), (("11000000"), ("11000000"), ("01011011"), ("10011011")), (("11000001"), ("11000001"), ("01011000"), ("10011001")), (("11000010"), ("11000010"), ("01011101"), ("10011111")), (("11000011"), ("11000011"), ("01011110"), ("10011101")), (("11000100"), ("11000100"), ("01010111"), ("10010011")), (("11000101"), ("11000101"), ("01010100"), ("10010001")), (("11000110"), ("11000110"), ("01010001"), ("10010111")), (("11000111"), ("11000111"), ("01010010"), ("10010101")), (("11001000"), ("11001000"), ("01000011"), ("10001011")), (("11001001"), ("11001001"), ("01000000"), ("10001001")), (("11001010"), ("11001010"), ("01000101"), ("10001111")), (("11001011"), ("11001011"), ("01000110"), ("10001101")), (("11001100"), ("11001100"), ("01001111"), ("10000011")), (("11001101"), 
("11001101"), ("01001100"), ("10000001")), (("11001110"), ("11001110"), ("01001001"), ("10000111")), (("11001111"), ("11001111"), ("01001010"), ("10000101")), (("11010000"), ("11010000"), ("01101011"), ("10111011")), (("11010001"), ("11010001"), ("01101000"), ("10111001")), (("11010010"), ("11010010"), ("01101101"), ("10111111")), (("11010011"), ("11010011"), ("01101110"), ("10111101")), (("11010100"), ("11010100"), ("01100111"), ("10110011")), (("11010101"), ("11010101"), ("01100100"), ("10110001")), (("11010110"), ("11010110"), ("01100001"), ("10110111")), (("11010111"), ("11010111"), ("01100010"), ("10110101")), (("11011000"), ("11011000"), ("01110011"), ("10101011")), (("11011001"), ("11011001"), ("01110000"), ("10101001")), (("11011010"), ("11011010"), ("01110101"), ("10101111")), (("11011011"), ("11011011"), ("01110110"), ("10101101")), (("11011100"), ("11011100"), ("01111111"), ("10100011")), (("11011101"), ("11011101"), ("01111100"), ("10100001")), (("11011110"), ("11011110"), ("01111001"), ("10100111")), (("11011111"), ("11011111"), ("01111010"), ("10100101")), (("11100000"), ("11100000"), ("00111011"), ("11011011")), (("11100001"), ("11100001"), ("00111000"), ("11011001")), (("11100010"), ("11100010"), ("00111101"), ("11011111")), (("11100011"), ("11100011"), ("00111110"), ("11011101")),


(("11100100"), ("11100100"), ("00110111"), ("11010011")),
(("11100101"), ("11100101"), ("00110100"), ("11010001")),
(("11100110"), ("11100110"), ("00110001"), ("11010111")),
(("11100111"), ("11100111"), ("00110010"), ("11010101")),
(("11101000"), ("11101000"), ("00100011"), ("11001011")),
(("11101001"), ("11101001"), ("00100000"), ("11001001")),
(("11101010"), ("11101010"), ("00100101"), ("11001111")),
(("11101011"), ("11101011"), ("00100110"), ("11001101")),
(("11101100"), ("11101100"), ("00101111"), ("11000011")),
(("11101101"), ("11101101"), ("00101100"), ("11000001")),
(("11101110"), ("11101110"), ("00101001"), ("11000111")),
(("11101111"), ("11101111"), ("00101010"), ("11000101")),
(("11110000"), ("11110000"), ("00001011"), ("11111011")),
(("11110001"), ("11110001"), ("00001000"), ("11111001")),
(("11110010"), ("11110010"), ("00001101"), ("11111111")),
(("11110011"), ("11110011"), ("00001110"), ("11111101")),
(("11110100"), ("11110100"), ("00000111"), ("11110011")),
(("11110101"), ("11110101"), ("00000100"), ("11110001")),
(("11110110"), ("11110110"), ("00000001"), ("11110111")),
(("11110111"), ("11110111"), ("00000010"), ("11110101")),
(("11111000"), ("11111000"), ("00010011"), ("11101011")),
(("11111001"), ("11111001"), ("00010000"), ("11101001")),
(("11111010"), ("11111010"), ("00010101"), ("11101111")),
(("11111011"), ("11111011"), ("00010110"), ("11101101")),
(("11111100"), ("11111100"), ("00011111"), ("11100011")),
(("11111101"), ("11111101"), ("00011100"), ("11100001")),
(("11111110"), ("11111110"), ("00011001"), ("11100111")),
(("11111111"), ("11111111"), ("00011010"), ("11100101"))
);

constant sbox: box:= (
  (X"63", X"7c", X"77", X"7b", X"f2", X"6b", X"6f", X"c5", X"30", X"01", X"67", X"2b", X"fe", X"d7", X"ab", X"76"), -- row 0
  (X"ca", X"82", X"c9", X"7d", X"fa", X"59", X"47", X"f0", X"ad", X"d4", X"a2", X"af", X"9c", X"a4", X"72", X"c0"), -- row 1
  (X"b7", X"fd", X"93", X"26", X"36", X"3f", X"f7", X"cc", X"34", X"a5", X"e5", X"f1", X"71", X"d8", X"31", X"15"), -- row 2
  (X"04", X"c7", X"23", X"c3", X"18", X"96", X"05", X"9a", X"07", X"12", X"80", X"e2", X"eb", X"27", X"b2", X"75"), -- row 3
  (X"09", X"83", X"2c", X"1a", X"1b", X"6e", X"5a", X"a0", X"52", X"3b", X"d6", X"b3", X"29", X"e3", X"2f", X"84"), -- row 4
  (X"53", X"d1", X"00", X"ed", X"20", X"fc", X"b1", X"5b", X"6a", X"cb", X"be", X"39", X"4a", X"4c", X"58", X"cf"), -- row 5
  (X"d0", X"ef", X"aa", X"fb", X"43", X"4d", X"33", X"85", X"45", X"f9", X"02", X"7f", X"50", X"3c", X"9f", X"a8"), -- row 6
  (X"51", X"a3", X"40", X"8f", X"92", X"9d", X"38", X"f5", X"bc", X"b6", X"da", X"21", X"10", X"ff", X"f3", X"d2"), -- row 7
  (X"cd", X"0c", X"13", X"ec", X"5f", X"97", X"44", X"17", X"c4", X"a7", X"7e", X"3d", X"64", X"5d", X"19", X"73"), -- row 8
  (X"60", X"81", X"4f", X"dc", X"22", X"2a", X"90", X"88", X"46", X"ee", X"b8", X"14", X"de", X"5e", X"0b", X"db"), -- row 9
  (X"e0", X"32", X"3a", X"0a", X"49", X"06", X"24", X"5c", X"c2", X"d3", X"ac", X"62", X"91", X"95", X"e4", X"79"), -- row 10
  (X"e7", X"c8", X"37", X"6d", X"8d", X"d5", X"4e", X"a9", X"6c", X"56", X"f4", X"ea", X"65", X"7a", X"ae", X"08"), -- row 11
  (X"ba", X"78", X"25", X"2e", X"1c", X"a6", X"b4", X"c6", X"e8", X"dd", X"74", X"1f", X"4b", X"bd", X"8b", X"8a"), -- row 12
  (X"70", X"3e", X"b5", X"66", X"48", X"03", X"f6", X"0e", X"61", X"35", X"57", X"b9", X"86", X"c1", X"1d", X"9e"), -- row 13
  (X"e1", X"f8", X"98", X"11", X"69", X"d9", X"8e", X"94", X"9b", X"1e", X"87", X"e9", X"ce", X"55", X"28", X"df"), -- row 14
  (X"8c", X"a1", X"89", X"0d", X"bf", X"e6", X"42", X"68", X"41", X"99", X"2d", X"0f", X"b0", X"54", X"bb", X"16")  -- row 15
);

constant Rcon: RCbox:= (


(X"00", X"01", X"02", X"04", X"08", X"10", X"20", X"40", X"80", X"1b", X"36")
);

function "xor"(a, b: in word) return word;
function ceil(a: in natural) return natural;
function rgen(a, b: in natural) return natural;
function ugen(a: in natural) return natural;
function format(
  Tlen, Nlen, Alen, Plen : in natural;
  nonce, Ain, Pin: in std_logic_vector
) return key_Arr;
function format_ctr(
  Nlen, Plen : in natural;
  nonce: in std_logic_vector
) return key_Arr;

end datatypes;

package body datatypes is

-- Element-wise xor of two 4-element words.
function "xor"(a, b: in word) return word is
  variable temp: word;
begin
  temp(0):= a(0) xor b(0);
  temp(1):= a(1) xor b(1);
  temp(2):= a(2) xor b(2);
  temp(3):= a(3) xor b(3);
  return temp;
end function "xor";

-- Number of 128-bit blocks needed to hold a bits (a/128 rounded up).
function ceil(a: in natural) return natural is
  variable temp: natural;
begin
  if (a mod 128=0) then
    temp:= a/128;
  else
    temp:= a/128+1;
  end if;
  return temp;
end function ceil;

-- Block count for associated data of a bits (including its length
-- prefix) plus payload of b bits.
-- Note: a = 0 also takes the else branch.
function rgen(a,b: in natural) return natural is
  variable temp: natural;
begin
  if (0<(a/8) and (a/8)<2**16-2**8) then
    temp:=ceil(16+a)+ceil(b);
  else
    --if ( 2**16-2**8<=(a/8) and (a/8)<2**32 ) then
    temp:=ceil(48+a)+ceil(b);
    --else
    --if (2**32<=(a/8) and (a/8)<2**64) then
    --temp:=ceil(80+a)+ceil(b);
    --end if;
    --end if;
  end if;
  return temp;
end function rgen;

-- Block count for a bits of data preceded by its length prefix.
function ugen(a: in natural) return natural is
  variable temp: natural;
begin
  if (0<(a/8) and (a/8)<2**16-2**8) then
    temp:=ceil(16+a);
  else
    --if ( 2**16-2**8<=(a/8) and (a/8)<2**32 ) then
    temp:=ceil(48+a);
    --else
    --if (2**32<=(a/8) and (a/8)<2**64) then
    --temp:=ceil(80+a);
    --end if;
    --end if;
  end if;
  return temp;
end function ugen;

-- Builds the formatted CCM input: block 0 (flags, nonce, payload length),
-- then the associated data blocks, then the payload blocks.
function format(
  Tlen, Nlen, Alen, Plen : in natural;
  nonce, Ain, Pin: in std_logic_vector
) return key_Arr is
  variable temp: key_arr(rgen(Alen, Plen) downto 0);
  variable t, q, n, p: natural;
  variable count: natural;
  variable Q_str: std_logic_vector((15-Nlen/8)*8-1 downto 0);
  variable a_str: std_logic_vector(ugen(Alen)*128-1 downto 0);
  variable p_str: std_logic_vector(ugen(Plen)*128-1 downto 0);
begin
  t:=Tlen/8;
  n:=Nlen/8;
  q:=15-n;
  p:=Plen/8;

  -- Flags byte of block 0: reserved bit, Adata bit, (t-2)/2, q-1.
  temp(0)(0)(7):='0';
  if Alen=0 then
    temp(0)(0)(6):='0';
  else
    temp(0)(0)(6):='1';
  end if;
  temp(0)(0)(5 downto 3):=conv_std_logic_vector((t-2)/2, 3);
  temp(0)(0)(2 downto 0):=conv_std_logic_vector(q-1, 3);

  -- Nonce occupies bytes 1 to n of block 0.
  for j in 1 to n loop
    temp(0)(n+1-j):= nonce((j-1)*8+7 downto (j-1)*8);
  end loop;

  -- Payload length in bytes occupies bytes n+1 to 15 of block 0.
  Q_str:= conv_std_logic_vector(p, q*8);
  for j in n+1 to 15 loop
    temp(0)(15+n+1-j):= Q_str((j-(n+1))*8+7 downto (j-(n+1))*8);
  end loop;

  -- Associated data: 16-bit length prefix, data bytes, zero padding.
  if (0<(Alen/8) and (Alen/8)<2**16-2**8) then
    a_str(7 downto 0):= conv_std_logic_vector(Alen/8, 16)(15 downto 8);
    a_str(15 downto 8):= conv_std_logic_vector(Alen/8, 16)(7 downto 0);
    for i in 0 to Alen/8-1 loop
      a_str(16+i*8+7 downto 16+i*8):= Ain((Alen/8-1-i)*8+7 downto (Alen/8-1-i)*8);
    end loop;
    a_str(ugen(Alen)*128-1 downto 16+Alen):=(others=>'0');
    for j in 1 to ugen(Alen) loop
      for k in 0 to 15 loop
        temp(j)(k):= a_str((j-1)*128+k*8+7 downto (j-1)*128+k*8);
      end loop;
    end loop;
  else
    -- The longer length encodings are not implemented.
    --if ( 2**16-2**8<=(Alen/8) and (Alen/8)<2**32 ) then
    --else
    --if (2**32<=(Alen/8) and (Alen/8)<2**64) then
    --end if;
    --end if;
  end if;

  -- Payload bytes followed by zero padding up to a block boundary.
  for i in 0 to Plen/8-1 loop
    p_str(i*8+7 downto i*8):= Pin((Plen/8-1-i)*8+7 downto (Plen/8-1-i)*8);
  end loop;
  p_str(ceil(Plen)*128-1 downto Plen):= (others=>'0');
  for j in ugen(Alen)+1 to rgen(Alen, Plen) loop
    for k in 0 to 15 loop
      temp(j)(k):= p_str((j-(ugen(Alen)+1))*128+k*8+7 downto (j-(ugen(Alen)+1))*128+k*8);
    end loop;
  end loop;

  return temp;
end function format;

-- Builds the counter blocks: flags byte, nonce, then the block index i.
function format_ctr(
  Nlen, Plen : in natural;
  nonce: in std_logic_vector
) return key_Arr is
  variable q, n: natural;
  variable temp: key_arr(ceil(Plen) downto 0);
begin
  n:=Nlen/8;
  q:=15-n;
  for i in 0 to ceil(Plen) loop
    temp(i)(0)(7 downto 3):=(others=>'0');
    temp(i)(0)(2 downto 0):=conv_std_logic_vector(q-1, 3);
    for j in 1 to n loop
      temp(i)(n+1-j):= nonce((j-1)*8+7 downto (j-1)*8);
    end loop;
    for j in n+1 to 15 loop
      temp(i)(15+n+1-j):= conv_std_logic_vector(i, 8*q)((j-(n+1))*8+7 downto (j-(n+1))*8);
    end loop;
  end loop;
  return temp;
end function format_ctr;

end package body datatypes;
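
The multiplication tables in the package above are long enough that a transcription slip would be hard to spot by eye. The following is a minimal sketch of a self-checking testbench that recomputes multiplication by 2 and by 3 in GF(2^8) and compares the results with a few entries printed in mul_row3, whose columns appear to read (a, a, 3a, 2a). The entity name gf_mul_check and the functions xtime and mul3 are illustrative assumptions and are not part of the thesis code. The sketch deliberately does not use the datatypes package, so it can be simulated on its own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench: all names here are illustrative assumptions.
entity gf_mul_check is
end entity gf_mul_check;

architecture sim of gf_mul_check is

  subtype byte is std_logic_vector(7 downto 0);

  -- Multiply by 2 in GF(2^8): shift left by one bit and, if the top bit
  -- was set, reduce with the AES polynomial x^8 + x^4 + x^3 + x + 1.
  function xtime(a : byte) return byte is
    variable r : byte;
  begin
    r := a(6 downto 0) & '0';
    if a(7) = '1' then
      r := r xor "00011011";
    end if;
    return r;
  end function xtime;

  -- Multiply by 3: (2 * a) xor a.
  function mul3(a : byte) return byte is
  begin
    return xtime(a) xor a;
  end function mul3;

begin

  check : process
  begin
    -- Expected values are copied from the mul_row3 entries for
    -- a = "00000001", "10000000" and "11111111".
    assert xtime("00000001") = "00000010"
      report "2*a mismatch for a = 00000001" severity failure;
    assert mul3("00000001") = "00000011"
      report "3*a mismatch for a = 00000001" severity failure;
    assert xtime("10000000") = "00011011"
      report "2*a mismatch for a = 10000000" severity failure;
    assert mul3("10000000") = "10011011"
      report "3*a mismatch for a = 10000000" severity failure;
    assert xtime("11111111") = "11100101"
      report "2*a mismatch for a = 11111111" severity failure;
    assert mul3("11111111") = "00011010"
      report "3*a mismatch for a = 11111111" severity failure;

    report "GF(2^8) spot checks finished" severity note;
    wait;
  end process check;

end architecture sim;
```

The same two functions could be looped over all 256 byte values and compared against the table constants once the package is visible to the testbench; that extension depends on the mul_table type declared earlier in the package and is left out here.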
