platform-based design: an automotive example

14
Page 1 Platform-based Design: an Automotive Example Platform-based Design: an Automotive Example Alberto Sangiovanni Vincentelli Alberto Sangiovanni Vincentelli The Edgar L. and Harold H. Buttner Chair of The Edgar L. and Harold H. Buttner Chair of Electr Electr . Eng. And Comp. . Eng. And Comp. Science Science University of California at Berkeley University of California at Berkeley Co-Founder, Chief Technology Advisor and Board Member Co-Founder, Chief Technology Advisor and Board Member Cadence Design System Cadence Design System Founder and Scientific Director Founder and Scientific Director PARADES (Cadence, Magneti-Marelli, ST) PARADES (Cadence, Magneti-Marelli, ST) Work in collaboration with Work in collaboration with A.Ferrari, M. Baleani, L. A.Ferrari, M. Baleani, L. Mangeruca Mangeruca (PARADES) (PARADES) And the Magneti-Marelli/ST Design Teams And the Magneti-Marelli/ST Design Teams PARADES June 9, 2000 2 HW Micro-controller Architecture basic components HW Micro-controller Architecture basic components Processing Units (CPU) Processing Units (CPU) Memories Memories I/O Units I/O Units I/O sub system CPU Memory

Upload: vuongnhan

Post on 07-Jan-2017

218 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Platform-based Design: an Automotive Example

Page 1

Platform-based Design: an Automotive ExamplePlatform-based Design: an Automotive Example

Alberto Sangiovanni VincentelliAlberto Sangiovanni Vincentelli

The Edgar L. and Harold H. Buttner Chair ofThe Edgar L. and Harold H. Buttner Chair of Electr Electr. Eng. And Comp.. Eng. And Comp.ScienceScience

University of California at BerkeleyUniversity of California at Berkeley

Co-Founder, Chief Technology Advisor and Board MemberCo-Founder, Chief Technology Advisor and Board Member

Cadence Design SystemCadence Design System

Founder and Scientific DirectorFounder and Scientific Director

PARADES (Cadence, Magneti-Marelli, ST)PARADES (Cadence, Magneti-Marelli, ST)

Work in collaboration withWork in collaboration with

A.Ferrari, M. Baleani, L. A.Ferrari, M. Baleani, L. Mangeruca Mangeruca (PARADES)(PARADES)

And the Magneti-Marelli/ST Design TeamsAnd the Magneti-Marelli/ST Design Teams

PA

RA

DE

S

June 9, 2000 2

HW Micro-controller Architecture basic componentsHW Micro-controller Architecture basic components

�� Processing Units (CPU)Processing Units (CPU)

�� MemoriesMemories

�� I/O UnitsI/O Units

I/O sub system

CPU

Memory

Page 2: Platform-based Design: an Automotive Example

Page 2

June 9, 2000 3

An Example of Platform space: Micro-controllersAn Example of Platform space: Micro-controllers

I/O

Memory

CPU

40 MIPS

30 MIPS

10 MIPS

June 9, 2000 4

The criteria for selection are manyThe criteria for selection are many......

�� CostCost

�� PerformancePerformance

�� FlexibilityFlexibility

�� ReliabilityReliability

�� SizeSize

�� PowerPower

�� Re-useRe-use

�� ……

Page 3: Platform-based Design: an Automotive Example

Page 3

June 9, 2000 5

Power-Train Control SystemPower-Train Control System

�� Electronic device controlling an internal combustion engine and aElectronic device controlling an internal combustion engine and a

gearboxgearbox

�� The goalThe goal

�� offer appropriate driving performance (e.g. torque, comfort, safety)offer appropriate driving performance (e.g. torque, comfort, safety)

�� minimize fuel consumption and emissionsminimize fuel consumption and emissions

�� Relevant characteristicsRelevant characteristics�� strictly coupled with mechanical partsstrictly coupled with mechanical parts

�� hard real-time constraintshard real-time constraints

�� complex algorithms for controlling fuel injection, spark ignition, throttlecomplex algorithms for controlling fuel injection, spark ignition, throttleposition, gear shift …position, gear shift …

June 9, 2000 6

Engine Management: BehaviorEngine Management: Behavior

�� Failure detection and recovery of input sensors (6 CFSMs + 1Failure detection and recovery of input sensors (6 CFSMs + 1

Timer)Timer)

�� Computation ofComputation of�� engine phase, status and angle (6 CFSMs + 6 Timers)engine phase, status and angle (6 CFSMs + 6 Timers)

�� crankshaft revolution speed and acceleration (3 CFSMs + 1 Timer)crankshaft revolution speed and acceleration (3 CFSMs + 1 Timer)

�� Injection and ignition control law (Injection and ignition control law (18 CFSMs)18 CFSMs)

�� Injection and ignition actuation drivers (Injection and ignition actuation drivers (56 CFSMs + 48 Timers)56 CFSMs + 48 Timers)

Page 4: Platform-based Design: an Automotive Example

Page 4

June 9, 2000 7

Behavioral ValidationBehavioral Validation

InjTop

InjectionParametSync

IgnTop

IgnitionParameteSync

IgnDrvTopCyl1Coil

Cyl1RealAdvance

Cyl2Coil

Cyl2RealAdvance

Cyl3Coil

Cyl3RealAdvance

Cyl4Coil

Cyl4RealAdvance

EngineAngle

EngineSyncStatu

IgnitionParamete

sync

InjDrvTopCyl1Inj

Cyl2Inj

Cyl3Inj

Cyl4Inj

EngineAngle

EngineSyncStatus

InjectionParameter

sync

SyncSysEngineAngle

SystemStop

SystemSyncStatus

sync_t

Synchronization& Data

Acquisition

Angle-driven

DataAcquisition

Time-driven

Ign&InjControl

AlgorithmsInjector&Coil

Drivers

MSE on RPM = 0.29rpmSimulationSimulation––I/O traces of 4s of engine managementI/O traces of 4s of engine management––Fault injection (sensors)Fault injection (sensors)––about 20s on a 450MHz PII CPU with 256MBabout 20s on a 450MHz PII CPU with 256MBRAMRAM

June 9, 2000 8

Single-Core Architectural ModelingSingle-Core Architectural Modeling

Hitachi SH2 or ARM7TDMI core based µ-controller

CPU FUZZY I.S.

RTOSSCHEDULINGPOLICY ANDOVERHEAD ISR

OVERHEAD

TIME UNIT(ATU)

Page 5: Platform-based Design: an Automotive Example

Page 5

June 9, 2000 9

Mapping to SH2: SW Execution timeMapping to SH2: SW Execution time

020406080

100120140160180

Time

GAP TDC MTDC

Task Name

Execution Time (microsec)

Emulation Board

VCC Estimation

8.50%

9.00%

9.50%

10.00%

10.50%

11.00%

11.50%

Error

GAP TDC MTDC

Task

Estimation Error %

Error %

CPU Load = 2%CPU Load Error = 11.3%

June 9, 2000 10

Core Exploration: ARM7TDMICore Exploration: ARM7TDMI

0.0%

5.0%

10.0%

15.0%

20.0%

25.0%

30.0%

35.0%

Speed Up %

GAP TDC MTDC

Task

Speed Up

Speed Up

020406080

100120140160180

Time (microsec)

GAP TDC MTDC

Task

Task Execution Time

SH7055

ARM7TDMI

CPU Load = 1.44%

Page 6: Platform-based Design: an Automotive Example

Page 6

June 9, 2000 11

ARM7TDMI Code SizeARM7TDMI Code Size

Code Size (published, publicly available, benchmarks):Code Size (published, publicly available, benchmarks):

�� ARM7 (32 bit mode) is 25-35% larger than ARM7TDMIARM7 (32 bit mode) is 25-35% larger than ARM7TDMI

�� Hitachi SH2 is 20% larger than ARM7TDMIHitachi SH2 is 20% larger than ARM7TDMI

�� CPU32 (68K) is 30% larger than ARM7TDMICPU32 (68K) is 30% larger than ARM7TDMI

�� PPC is 60-100% larger than ARM7TDMIPPC is 60-100% larger than ARM7TDMI

�� ST10/C167 is 15-35% larger than ARM7TDMIST10/C167 is 15-35% larger than ARM7TDMI

�� M-Core same code sizeM-Core same code size

June 9, 2000 12

I/O exploration (Sensor Management)I/O exploration (Sensor Management)

�� Mapping A Mapping A�� maximize software re-use and portability (i.e. not maximize software re-use and portability (i.e. not optimizedoptimized for Hitachi for Hitachi

architecture, only “essential” timers included)architecture, only “essential” timers included)

�� 1313 CFSMs CFSMs + 5 timers + 5 timers

�� A1 two tasks, A2 three tasksA1 two tasks, A2 three tasks

�� Mapping B Mapping B�� optimizeoptimize for the Hitachi architecture (I.e., utilize special purpose peripheral for the Hitachi architecture (I.e., utilize special purpose peripheral

called ATU that makes more timers available)called ATU that makes more timers available)

�� 1515 CFSMs CFSMs + 7 timers + 7 timers

�� B1 two tasks, B2 three tasksB1 two tasks, B2 three tasks

�� Mapping C Mapping C�� minimize hw/minimize hw/swsw communication introducing a new virtual hardware component communication introducing a new virtual hardware component

(I.e. replaces ATU with additional functionality)(I.e. replaces ATU with additional functionality)

�� 1818 CFSMs CFSMs + 6 timers + 6 timers

Page 7: Platform-based Design: an Automotive Example

Page 7

June 9, 2000 13

Mapping: performanceMapping: performance

Mapping C

CPU load * 1.4%

1528

6

1673

B2B1A2A1

IRQs

Task switching

Task number

6.8%

8626

3

7745

6.7%

8626

2

7616

10.3%

7739

3

7745

10.2%

7739

2

7616

Cpu load%

Taskswitching

IRQs

* Deferred ISR scheme used

June 9, 2000 14

Flash and Code Size TrendFlash and Code Size Trend

2048

16384

131072

1048576

4096

32768

64

512

2048 24576

4640

10

100

1000

10000

100000

1000000

10000000

1990 1995 2000 2005

Kbi

ts

Discrete Flash Embedded Flash Code Size

Page 8: Platform-based Design: an Automotive Example

Page 8

June 9, 2000 15

Flash SpeedFlash Speed

0

20

40

60

0K]

1995 2000

0HPRU\�6SHHG�

Embedded Flash Discrete Flash

Embedded: 60Mhz (ISSCC@1999)External: 20-30Mhz (AMD/STM/INTEL)

June 9, 2000 16

Platform CostPlatform Cost

�� External FlashExternal Flash::�� Cost = Cost(Flash IC) + Cost(MCU) + Cost = Cost(Flash IC) + Cost(MCU) + IntegrationCostIntegrationCost(Flash,MCU)(Flash,MCU)

�� Micro-controller likely to be I/O boundedMicro-controller likely to be I/O bounded

�� Embedded FlashEmbedded Flash::�� Cost=Cost(MCU) + Cost=Cost(MCU) + IntegrationCostIntegrationCost(MCU)(MCU)

Other issues: speed, reliability, re-use, ...Other issues: speed, reliability, re-use, ...

Page 9: Platform-based Design: an Automotive Example

Page 9

June 9, 2000 17

Embedded Flash SolutionEmbedded Flash Solution

Embedded Flash:Embedded Flash:

��Speed: embedded is fasterSpeed: embedded is faster

��Cost: embedded is cheaperCost: embedded is cheaper

��Size: micro-controllers with higher code density reduceSize: micro-controllers with higher code density reducesystem cost since size is mostly dictated by Flashsystem cost since size is mostly dictated by Flash

June 9, 2000 18

Power-Train Control SystemPower-Train Control System

Solutions:Solutions:�� Increase Clock FrequencyIncrease Clock Frequency

� Performance bounded by memory (FLASH)� A cache memory must be introduced.� Higher clock frequency (80Mhz) might have a worse EMI impact.

�� Increase ParallelismIncrease Parallelism� Instruction Level: VLIW (larger code size)� Processor Level: Multiprocessor

Extend behavior to gearbox control

ARM7/SH2 core overloaded

Page 10: Platform-based Design: an Automotive Example

Page 10

June 9, 2000 19

The Dual-Arm ArchitectureThe Dual-Arm Architecture

A symmetric dual processor architecture with a high-bandwidthA symmetric dual processor architecture with a high-bandwidthinterconnection network among processors, memory, and I/Ointerconnection network among processors, memory, and I/Osub-systemssub-systems

The micro-controller has been designed as a collaborative effortThe micro-controller has been designed as a collaborative effortamong:among:

�� PARADES for architecture conceptsPARADES for architecture concepts

�� Magneti-Marelli for peripherals and requirementsMagneti-Marelli for peripherals and requirements

�� ST for detailed architecture and IC designST for detailed architecture and IC design

�� Accent for X-bar switch and Interrupt Controller detailed designAccent for X-bar switch and Interrupt Controller detailed design

�� Cadence for system exploration and evaluation tools and methodologyCadence for system exploration and evaluation tools and methodology

Excellent example of supply-chain integration in electronic system design forExcellent example of supply-chain integration in electronic system design fornext generation platforms!next generation platforms!

June 9, 2000 20

CPU

WatchDo

IC

FirstLevelDecoder

CPU

WatchDo

IC

FirstLevelDecoder

ARBITRER

FLASH2ARBITRER

FLASH1

RAM1

ARBITRER

ARBITRER

RAM2

ARBITRER

ARM7NATIVEBUS

APB

AHB

A2D1 A2D2 TimingUnit

SPICAN IPCP

IPCP IPCP

Generic

CPU1 CPU2

DMA I/O

IRQ I/OFLASHRAM

DMAPortP_36P_37

DMAInterface

P_38

P_39

P_38

P_39DMAInterface

P_38

P_39DMAInterface

Output

DMAInterface

IPCP SPICAN

Dual-ARM ArchitectureDual-ARM Architecture

Page 11: Platform-based Design: an Automotive Example

Page 11

June 9, 2000 21

Why dual-core ?Why dual-core ?

A simple back of the envelope calculationA simple back of the envelope calculation

�� Estimated size:Estimated size:�� (32KB.RAM + 512KB.FLASH) ~ 5.8M (32KB.RAM + 512KB.FLASH) ~ 5.8M TrsTrs. equivalent to 65-75% of. equivalent to 65-75% of

chip areachip area

�� (ARM7TDMI+Debug+IC+...) < 200K (ARM7TDMI+Debug+IC+...) < 200K TrsTrs. equivalent to less than 4%. equivalent to less than 4%of chip areaof chip area

�� The MCU size is driven by FLASH sizeThe MCU size is driven by FLASH size

Dual-core solution:Dual-core solution:�� 4% increment of MCU cost4% increment of MCU cost

�� twofold performancetwofold performance

June 9, 2000 22

Dual-ARM VCC Architectural ModelDual-ARM VCC Architectural Model

PeripheralGatew ay

IRQ

CANIRQ

IRQ

Peripherals

Generic

IRQ

SCI

PERIPHERALS

MasterGateway

DMA

Fast A/DExternalInterface

RAMFLASH

RAM

FLASH

MicroRTOS parent

ARM

IBUS

DBUS

IRQ

MicroRTOS parent

IRQA/D

ARM

IBUS

DBUS

IRQ

RTOS

SchedulerTask

InterruptHandlers

SchedulerStatic

SchedulerStatic

SchedulerStatic

SchedulerStatic

RTOS

SchedulerTask

InterruptHandlers

SchedulerStatic

SchedulerStatic

SchedulerStatic

SchedulerStatic

Page 12: Platform-based Design: an Automotive Example

Page 12

June 9, 2000 23

Sensors

Actuators

HardwareNet

BIOS

Sensors

Actuators

HardwareNet

BIOS

RT

OS

Behavioral MappingBehavioral Mapping

�� Engine Management Engine Management �� CPU1 (as single core) CPU1 (as single core)SeleSpeeed SeleSpeeed �� CPU2 CPU2

Transformation

controls

CPU2CPU1 Transformation

controls

RT

OS

Com

mun

icat

ion

Engine Management

SeleSpeed

June 9, 2000 24

Interrupt LatencyInterrupt Latency

�� XBAR arbitration policy: round robin and 1 clock ownerXBAR arbitration policy: round robin and 1 clock owner

cyclecycle

�� Best Case: 15 Best Case: 15 clksclks

�� Worst Case with 1 master: Worst Case with 1 master: 26 26 clksclks

�� Worst Case with 2 masters: 37 Worst Case with 2 masters: 37 clksclks

�� Worst Case with 3 masters: 42 Worst Case with 3 masters: 42 clksclks

�� A ‘typical ‘ value of 27 A ‘typical ‘ value of 27 clk clk (by simulation)(by simulation)

Page 13: Platform-based Design: an Automotive Example

Page 13

June 9, 2000 25

Interrupt Latency (2)Interrupt Latency (2)

June 9, 2000 26

Output DevicesInput devices

Hardware Platform

I O

Hardware

network

DUAL-CORE

RT

OS

BIOS

Device Drivers

Net

wor

k C

omm

unic

atio

n

DUAL-CORE

Architectural Space (Performance)

Application Space (Features)

Platform DefinitionPlatform Definition

Platform Instance

Application Instances

SystemPlatform(no ISA)

Platform Design SpaceExploration

PlatformSpecification

Platform API

Software Platform

Output DevicesInput devices

Hardware Platform

I O

Hardware

network

HITACHI

RT

OS

BIOS

Device Drivers

Net

wor

k C

omm

unic

atio

n

HITACHI

RT

OS

BIOS

Device Drivers

Net

wor

k C

omm

unic

atio

n

Output DevicesInput devices

Hardware Platform

I O

Hardware

network

ST10

RT

OS

BIOS

Device Drivers

Net

wor

k C

omm

unic

atio

n

ST10

Application Software

Application Software

Page 14: Platform-based Design: an Automotive Example

Page 14

June 9, 2000 27

ConclusionsConclusions

�� Complex application of system design methodology:Complex application of system design methodology:

automotive power-train controlautomotive power-train control

�� Several architectures (different CPU, different I/Os,Several architectures (different CPU, different I/Os,

different memory organization, e.g. external vs. internaldifferent memory organization, e.g. external vs. internal

flash) evaluatedflash) evaluated

�� New dual-core architecture developed and validated forNew dual-core architecture developed and validated for

Magneti Marelli applications by joint design teamsMagneti Marelli applications by joint design teams

(PARADES, Magneti-Marelli, ST)(PARADES, Magneti-Marelli, ST)

June 9, 2000 28

The Past, the Present and the Future of the Dual-The Past, the Present and the Future of the Dual-core Platformcore Platform

�� Architecture Conceived in PARADES in March 1999Architecture Conceived in PARADES in March 1999

�� First Discussion with M.M. and ST in April 1999First Discussion with M.M. and ST in April 1999

�� Specification Team with M.M. and ST started in Sept’99Specification Team with M.M. and ST started in Sept’99

�� Performance exploration started in Dec’99Performance exploration started in Dec’99

�� Close specification in Jan’00 (IRQ list)Close specification in Jan’00 (IRQ list)

�� FPGA Prototype in Oct ’00FPGA Prototype in Oct ’00

�� Tape out Jan ‘01 (0.18 Tape out Jan ‘01 (0.18 µµm)m)

�� First Silicon 2001First Silicon 2001

�� Architecture spins for different applications 2001-2002Architecture spins for different applications 2001-2002

�� MM power-train controller Start-Of-Production mid-end 2003MM power-train controller Start-Of-Production mid-end 2003