
Post on 19-Dec-2015

Page 1: P. Marwedel: Embedded Software:How to make it efficient? Slide -1 - Embedded Software: How to make it efficient?

P. Marwedel: Embedded Software:How to make it efficient? Slide -1 -

Embedded Software: How to make it efficient?

Peter Marwedel University of Dortmund

Informatik 12 44221 Dortmund, Germany

P. Marwedel: Embedded Software:How to make it efficient? Slide -2 -

What is an embedded system?

These are not the embedded systems we will talk about!

P. Marwedel: Embedded Software:How to make it efficient? Slide -3 -

Embedded Systems

Main reason for buying is not information processing

Characteristics:
• not recognised as information processing
• frequently real-time behaviour required
• must be dependable & guarantee privacy
• many of these systems are mobile systems
• fundamental technology for pervasive computing/ambient intelligence, implemented in complex software

Embedded systems = information processing systems embedded into a larger product

P. Marwedel: Embedded Software:How to make it efficient? Slide -4 -

Views on embedded software

For many products in the area of consumer electronics the amount of code is doubling every two years [Fritz Vaandrager in: Rozenberg, Vaandrager (eds.): Lectures on Embedded Systems, LNCS, Vol. 1494, 1998]

„On Nanoscale Integration and Gigascale Complexity in the Post - .com world“ [de Man, Keynote, DATE 2002]

... it is now common knowledge that more than 70% of the development cost for complex systems such as automotive electronics and communication systems are due to software development [A. Sangiovanni-Vincentelli, 1999]

P. Marwedel: Embedded Software:How to make it efficient? Slide -5 -

The energy/flexibility conflict - Intrinsic Power Efficiency -

[Figure: intrinsic power efficiency in operations/Watt [MOPS/mW] (axis 0.01 to 10) versus technology (1.0µ, 0.5µ, 0.25µ, 0.13µ, 0.07µ). Curve labels: hardwired muxed, Reconfigurable Computing, DSP-ASIPs, Processors (µPs). Annotations: „Ambient Intelligence“, „poor software generation techniques“.]

[H. de Man, Keynote, DATE‘02; T. Claasen, ISSCC99]

Necessary to optimize software; otherwise the price for software flexibility cannot be paid!

P. Marwedel: Embedded Software:How to make it efficient? Slide -6 -

Importance of Power and Energy Consumption

„Power is considered as the most important constraint in embedded systems“ [in: L. Eggermont (ed): Embedded Systems Roadmap 2002, STW]

Current UMTS phones can hardly be operated for more than an hour if data is being transmitted. [from a report of the Financial Times, Germany, on an analysis by Credit Suisse First Boston; http://www.ftd.de/tm/tk/9580232.html?nv=se]

P. Marwedel: Embedded Software:How to make it efficient? Slide -7 -

Key requirements for embedded software

• Hardware/software efficiency: run-time efficiency, code-size efficiency, energy efficiency, power consumption, ...
• Many standards are published as „reference implementations“ (they just provide the correct results and do not care about efficiency)
• Proposal of the „software washing machine“ (Catthoor): „dirty“ unoptimized software in, „clean“ optimized software out

P. Marwedel: Embedded Software:How to make it efficient? Slide -8 -

Generating efficient software requires work at all levels

• Algorithmic level (using the most efficient algorithm + data structures)
• High-level source code transformations
• Compiler optimizations
• Code compression
• Operating system support (e.g. for minimizing power consumption)

P. Marwedel: Embedded Software:How to make it efficient? Slide -9 -

Algorithmic level

Choosing the best decoding/filtering etc. algorithm + data structures

Example: MPEG-2 data structures: the Inverse Discrete Cosine Transform (IDCT) is the most power/cycle-hungry hot spot.

Transformations:
• Replacing „double“ by „float“ [still acceptable quality]: energy consumption reduced to 34%, cycles reduced to 35%
• Standard IDCT → „Fast IDCT“ („double“/„float“ → „integer“) [significant loss of precision]: energy consumption reduced to 4.86%, cycles reduced to 5.10%

[T. Huels, Inf 12, UniDo, 2002]
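The step from „double“ over „float“ to „integer“ can be illustrated on a single multiplication by a cosine coefficient, the basic operation inside an IDCT. This is only a minimal sketch, not the code measured in [T. Huels]: the Q15 fixed-point format, the constants and the function names are assumptions chosen for illustration.

```c
#include <stdint.h>

/* cos(pi/4) in three representations (illustrative constants). */
#define COS_PI_4_DOUBLE 0.7071067811865476
#define COS_PI_4_FLOAT  0.70710678f
#define COS_PI_4_Q15    23170            /* 0.70710678 * 32768, rounded */

/* Reference: double-precision multiply. */
double scale_double(double x) { return x * COS_PI_4_DOUBLE; }

/* Single precision: cheaper on cores with a float-only FPU or soft-float. */
float scale_float(float x) { return x * COS_PI_4_FLOAT; }

/* Integer version in Q15 fixed point: one multiply, one add, one shift.
 * Assumes an arithmetic right shift for negative products, which is
 * implementation-defined in C but common on embedded compilers. */
int32_t scale_q15(int16_t x)
{
    int32_t product = (int32_t)x * COS_PI_4_Q15;
    return (product + (1 << 14)) >> 15;   /* round to nearest */
}
```

For an input of 1000, `scale_double` yields about 707.107, while `scale_q15` yields 707: the fractional part is lost, which is the kind of precision loss the slide refers to.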

P. Marwedel: Embedded Software:How to make it efficient? Slide -10 -

High-level transformations

Example: Separation of margin handling

• Before: many if-statements for margin-checking
• After: inner region with no checking (efficient) + only few margin elements to be processed
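Separation of margin handling can be sketched on a one-dimensional three-point filter. The filter, the clamping rule at the borders and the function names are assumptions for illustration; the slide itself only shows the idea as a picture.

```c
#include <stddef.h>

/* Before: every element pays for two margin checks. */
void smooth_checked(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int left  = (i == 0)     ? in[i] : in[i - 1];  /* clamp at left margin  */
        int right = (i + 1 == n) ? in[i] : in[i + 1];  /* clamp at right margin */
        out[i] = left + in[i] + right;
    }
}

/* After: the two margin elements are handled separately,
 * so the inner loop runs without any checks. */
void smooth_separated(const int *in, int *out, size_t n)
{
    if (n == 0) return;
    if (n == 1) { out[0] = 3 * in[0]; return; }

    out[0] = in[0] + in[0] + in[1];                    /* left margin  */
    for (size_t i = 1; i + 1 < n; i++)                 /* interior     */
        out[i] = in[i - 1] + in[i] + in[i + 1];
    out[n - 1] = in[n - 2] + in[n - 1] + in[n - 1];    /* right margin */
}
```

Both functions produce the same output; for the input {1, 2, 3, 4, 5} the result is {4, 6, 9, 12, 14}.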

P. Marwedel: Embedded Software:How to make it efficient? Slide -11 -

Loop nest splitting at University of Dortmund

Loop nest from MPEG-4 full search motion estimation:

    for (z=0; z<20; z++)
     for (x=0; x<36; x++) {x1=4*x;
      for (y=0; y<49; y++) {y1=4*y;
       for (k=0; k<9; k++) {x2=x1+k-4;
        for (l=0; l<9; l++) {y2=y1+l-4;
         for (i=0; i<4; i++) {x3=x1+i; x4=x2+i;
          for (j=0; j<4; j++) {y3=y1+j; y4=y2+j;
           if (x3<0 || 35<x3 || y3<0 || 48<y3)
             then_block_1; else else_block_1;
           if (x4<0 || 35<x4 || y4<0 || 48<y4)
             then_block_2; else else_block_2;
    }}}}}}

After loop nest splitting (analysis of polyhedral domains, selection with genetic algorithm):

    for (z=0; z<20; z++)
     for (x=0; x<36; x++) {x1=4*x;
      for (y=0; y<49; y++)
       if (x>=10 || y>=14)
        for (; y<49; y++)
         for (k=0; k<9; k++)
          for (l=0; l<9; l++)
           for (i=0; i<4; i++)
            for (j=0; j<4; j++) {
             then_block_1; then_block_2}
       else {y1=4*y;
        for (k=0; k<9; k++) {x2=x1+k-4;
         for (l=0; l<9; l++) {y2=y1+l-4;
          for (i=0; i<4; i++) {x3=x1+i; x4=x2+i;
           for (j=0; j<4; j++) {y3=y1+j; y4=y2+j;
            if (0 || 35<x3 || 0 || 48<y3)
              then_block_1; else else_block_1;
            if (x4<0 || 35<x4 || y4<0 || 48<y4)
              then_block_2; else else_block_2;
    }}}}}}

[H. Falk et al., Inf 12, UniDo, 2002]
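The splitting pattern can be checked on a reduced loop nest. The sketch below is a hand-simplified stand-in, not the output of the Dortmund tool: the function names, the counters that replace then_block/else_block, and the parameters `lx` and `ly` are hypothetical. The transformation is valid here because the condition, once true for some y, stays true for all larger y.

```c
/* Original: the condition is re-evaluated in every innermost iteration. */
long count_original(int nx, int ny, int nk, int lx, int ly)
{
    long acc = 0;
    for (int x = 0; x < nx; x++)
        for (int y = 0; y < ny; y++)
            for (int k = 0; k < nk; k++) {
                if (x >= lx || y >= ly)
                    acc += 1;            /* stands in for then_block */
                else
                    acc += 2;            /* stands in for else_block */
            }
    return acc;
}

/* Split: one test per y iteration; when it holds, the remaining
 * y iterations run in a copy of the loop nest without any test. */
long count_split(int nx, int ny, int nk, int lx, int ly)
{
    long acc = 0;
    for (int x = 0; x < nx; x++)
        for (int y = 0; y < ny; y++) {
            if (x >= lx || y >= ly) {
                for (; y < ny; y++)      /* finish the y loop here */
                    for (int k = 0; k < nk; k++)
                        acc += 1;
            } else {
                for (int k = 0; k < nk; k++)
                    acc += 2;
            }
        }
    return acc;
}
```

With the bounds from the slide (`nx = 36`, `ny = 49`, `nk = 9`, `lx = 10`, `ly = 14`) both versions return 17136, while the split version evaluates the condition far less often.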

P. Marwedel: Embedded Software:How to make it efficient? Slide -12 -

Results for loop nest splitting - Execution times -

[Figure: bar chart of relative execution times (axis 0% to 110%) for the benchmarks Cavity, Motion Estimation and QSDPCM]

[H. Falk et al., Inf 12, UniDo, 2002]

P. Marwedel: Embedded Software:How to make it efficient? Slide -13 -

Results for loop nest splitting - Code sizes -

[Figure: bar chart of relative code sizes (axis 0% to 200%) for the benchmarks Cavity, Motion Estimation and QSDPCM on Sun, Pentium, HP, MIPS, PowerPC, DEC Alpha, TriMedia, TI C6x, ARM7 thumb, ARM7 arm, and the average]

[H. Falk et al., Inf 12, UniDo, 2002]

P. Marwedel: Embedded Software:How to make it efficient? Slide -14 -

Generating efficient software requires work at all levels

• Algorithmic level (using the most efficient algorithm + data structures)
• High-level source code transformations
• Compiler optimizations
• Code compression
• Operating system support (e.g. for minimizing power consumption)

P. Marwedel: Embedded Software:How to make it efficient? Slide -15 -

Compilers: Translation from C critical bottleneck

[Diagram of specification and implementation levels: (Real-time) UML or equiv.; StateCharts/SDL; RT-Java; (sets of) C-programs; Assembly level; VHDL; HW]

P. Marwedel: Embedded Software:How to make it efficient? Slide -16 -

Overhead of compilers for DSP processors

DSPStone (Zivojnovic et al.). Example: ADPCM

[Figure: cycle overhead [× n] (axis 1.0 to 8.0) for the DSP56001, TI-C51 and ADI-2101]

Optimizations exploiting architectural features of embedded processors.

Current focus: VLIW processors (powerful multimedia processors).

In this talk: focus on energy consumption.

P. Marwedel: Embedded Software:How to make it efficient? Slide -17 -

Larger & off-chip memories need more energy than smaller & on-chip memories

Example (CACTI Model):

[Figure: energy per access [nJ] (axis 0 to 2.5) versus memory size (64 to 8192)]

[Steinke et al., Inf 12, UniDo, 2002]

P. Marwedel: Embedded Software:How to make it efficient? Slide -18 -

Example: Off-chip vs. on-chip memories

ARM7TDMI cores, well-known for low power consumption

ARM Atmel Evaluation Board

[Diagram: processor with on-chip memory; on-board memory on the evaluation board]

P. Marwedel: Embedded Software:How to make it efficient? Slide -19 -

On-chip vs. off-chip current

Current, 32 Bit-Load Instruction (Thumb)

Example: Atmel ARM-Evaluation board

Configuration                  Core+On-Chip-Memory Current (mA)   Off-Chip-Memory Current (mA)
Prog Off-Chip/Data Off-Chip    48.2                               116
Prog Off-Chip/Data On-Chip     50.9                               77.2
Prog On-Chip/Data Off-Chip     44.4                               82.2
Prog On-Chip/Data On-Chip      53.1                               1.16

current reduction: / 3.02

P. Marwedel: Embedded Software:How to make it efficient? Slide -20 -

On-chip vs. off-chip energy

Energy, 32 Bit-Load Instruction (Thumb)

Example: Atmel ARM-Evaluation board

Configuration                  Energy (nJ)
Prog Off-Chip/Data Off-Chip    115.8
Prog Off-Chip/Data On-Chip     51.6
Prog On-Chip/Data Off-Chip     76.5
Prog On-Chip/Data On-Chip      16.4

Off-chip access takes more cycles ⇒ savings (86%) are larger than for the current.

energy reduction: / 7.06
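The two figures quoted on this slide can be recomputed from the chart values. The sketch below assumes that 115.8 nJ belongs to the all-off-chip configuration and 16.4 nJ to the all-on-chip configuration, following the order of the chart legend; the function names are hypothetical.

```c
/* Factor by which a quantity shrinks, e.g. "/ 7.06". */
double reduction_factor(double before, double after)
{
    return before / after;
}

/* Fraction saved relative to the starting value, e.g. 0.86 for 86%. */
double relative_saving(double before, double after)
{
    return 1.0 - after / before;
}
```

`reduction_factor(115.8, 16.4)` gives about 7.06 and `relative_saving(115.8, 16.4)` about 0.86, matching the slide.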

P. Marwedel: Embedded Software:How to make it efficient? Slide -21 -

Exploitation of on-chip memory

Which segment (array, loop, etc.) is to be stored in on-chip memory?

• Gain g_i and size s_i for each segment i.
• Maximise gain G = Σ g_i, respecting the constraint K ≥ Σ s_i (both sums over the segments placed in on-chip memory).

Static memory allocation:
• Solution: knapsack algorithm.

Dynamic reloading:
• Where to insert calls to the copy function? Solution: IP-model.

[Diagram: processor with on-chip memory of capacity K; on-board memory. Example segments of a program (for loops, while and repeat loops, calls, arrays, int variables), with the question of which ones go on-chip.]
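The static allocation problem above is a 0/1 knapsack problem and can be sketched with the textbook dynamic-programming solution. This is an illustration under stated assumptions, not the allocator of Steinke et al.: the function name, the fixed limits `MAX_SEGMENTS` and `MAX_CAPACITY`, and the use of integer sizes and gains are all hypothetical.

```c
#include <stddef.h>

#define MAX_SEGMENTS 16     /* assumed upper bounds for this sketch */
#define MAX_CAPACITY 1024

/* Selects the segments to place in on-chip memory of size `capacity`.
 * size[i] > 0 is the size s_i, gain[i] the gain g_i of segment i.
 * Sets on_chip[i] to 1 for selected segments and returns the total
 * gain G, or -1 if the inputs exceed the limits of this sketch. */
long select_segments(const int size[], const long gain[], size_t n,
                     int capacity, int on_chip[])
{
    static long best[MAX_CAPACITY + 1];
    static unsigned char take[MAX_SEGMENTS][MAX_CAPACITY + 1];

    if (n > MAX_SEGMENTS || capacity < 0 || capacity > MAX_CAPACITY)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (size[i] <= 0)
            return -1;

    for (int c = 0; c <= capacity; c++)
        best[c] = 0;

    /* best[c] = maximum gain using the segments seen so far within c bytes.
     * Iterating c downwards ensures each segment is used at most once. */
    for (size_t i = 0; i < n; i++) {
        for (int c = capacity; c >= 0; c--) {
            take[i][c] = 0;
            if (size[i] <= c && best[c - size[i]] + gain[i] > best[c]) {
                best[c] = best[c - size[i]] + gain[i];
                take[i][c] = 1;
            }
        }
    }

    /* Walk back through the decisions to recover the selected set. */
    int c = capacity;
    for (size_t i = n; i-- > 0; ) {
        on_chip[i] = take[i][c];
        if (take[i][c])
            c -= size[i];
    }
    return best[capacity];
}
```

For four segments with sizes {300, 500, 400, 200}, gains {60, 90, 70, 50} and K = 1000, the function selects segments 0, 1 and 3 for a total gain of 200.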

P. Marwedel: Embedded Software:How to make it efficient? Slide -22 -

Why not just use a cache ?

[Figure: energy per access [nJ] (axis 0 to 9) versus memory size (256 to 16384) for a scratch pad and for 2-way caches with 4 GB, 16 MB and 1 MB space]

Energy consumption in tags, comparators and muxes significant.

[R. Banakar, S. Steinke, B.-S. Lee, 2001]

P. Marwedel: Embedded Software:How to make it efficient? Slide -23 -

Results for optimization algorithm

[Figure: bar chart of energy saving (axis 0% to 50%) per benchmark and on-chip memory size; one bar annotated 0.5%]

[Steinke et al., Inf 12, UniDo, 2002]

P. Marwedel: Embedded Software:How to make it efficient? Slide -24 -

Total energy reduction for MPEG-2 [%]

Optimization step        Energy [%]
Original                 100
Algorithm (float)        33.97
High-level opt.          31.83
Compiler opt.            21.68
Cache                    6.21
Scratch pad (static)     4.87

[T. Huels, Inf 12, UniDo, 2002]

P. Marwedel: Embedded Software:How to make it efficient? Slide -25 -

Optimization technique for microcontrollers and network processors: Bit-field detection

Expression tree for the C statement: b = (b & 0xF1) | ((a & 7) << 1);

Assembly: mov b, 1, a, 0, 3   # Cost: 1

[Wagner, Inf 12, UniDo, 2002]
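What bit-field detection has to recognise can be sketched in C: a mask/shift/or expression that is equivalent to one bit-field insertion. The helper `bitfield_insert` is a hypothetical software model of such an instruction, with the operand order (destination, destination offset, source, source offset, width) assumed from the assembly line on the slide.

```c
#include <stdint.h>

/* Hypothetical model of a bit-field move: copies `width` bits starting
 * at bit `src_off` of `src` into `dst` starting at bit `dst_off`.
 * Assumes offsets below 32 and offset + width <= 32. */
uint32_t bitfield_insert(uint32_t dst, unsigned dst_off,
                         uint32_t src, unsigned src_off, unsigned width)
{
    uint32_t mask  = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t field = (src >> src_off) & mask;
    return (dst & ~(mask << dst_off)) | (field << dst_off);
}

/* The plain C form a programmer would write: several ALU operations
 * unless the compiler detects the bit-field pattern. */
uint8_t update_plain(uint8_t b, uint8_t a)
{
    return (uint8_t)((b & 0xF1) | ((a & 7) << 1));
}
```

For all 8-bit values of `a` and `b`, `update_plain(b, a)` equals `bitfield_insert(b, 1, a, 0, 3)`; for example, `b = 0xFF`, `a = 0x05` gives 0xFB in both cases.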

P. Marwedel: Embedded Software:How to make it efficient? Slide -26 -

Results available to industry?

yes!

[Diagram: partners of the trinity model: „Center of excellence“ (IMEC); Informatik 12, UniDo; ICD e.V. (technology transfer center); design houses/semiconductor vendors; CAD vendors]

P. Marwedel: Embedded Software:How to make it efficient? Slide -27 -

Generating efficient software requires work at all levels

• Algorithmic level (using the most efficient algorithm + data structures)
• High-level source code transformations
• Compiler optimizations
• Code compression
• Operating system support (e.g. for minimizing power consumption)

P. Marwedel: Embedded Software:How to make it efficient? Slide -28 -

Code compression/decompression

Key idea:

[Diagram: without compression, the µP addresses the ROM directly; with compression, a decompressor sits between the ROM and the µP]

Very good survey: Rik van de Wiel: The Code Compaction Bibliography, www.extra.research.philips.com/ccb/

P. Marwedel: Embedded Software:How to make it efficient? Slide -29 -

Variable-voltage/frequency example: INTEL Xscale

[Figure from Intel’s web site]

OS should schedule distribution of the energy budget.

P. Marwedel: Embedded Software:How to make it efficient? Slide -30 -

Conclusion

Making embedded software efficient requires efforts at all levels:

• At the algorithmic level
• At the level of high-level transformations
• Within the compiler
• At the code compression level
• Within the embedded OS

The focus of this talk was on compilers and energy efficiency; using new algorithms, the energy consumption can be significantly reduced.