
oCPS Communication

Real-time networking and CAN

Courtesy: Damir Isovic, Reinder Bril

Adaptations and current version: Johan Lukkien

Networking in CPS context

Heterogeneous

Distributed applications
• components even on different networks
• guarantees and admission tests
• online adaptation

Open (more than before)

• embedded data available for (new) data oriented applications

• security for privacy and safety

Data and Control: connecting the ‘physical’ to the ‘Cyber’
• information about the physical world to higher layers ICT
• local sensors and actuators
• requirements of power use, throughput, latency, jitter

Fieldbus technology

Distributed control
• intended for lowest layer in control system: connecting PLCs, sensors, actuators
• reduction of wires and complexity (replaces 4-20mA standard)
• high reliability, predictability; relatively low data rates and small messages

Networking
• physically: bus, ring
• logically: bus (shared medium), mostly
• OSI model: focus on PHY, MAC (LLC) and application layer
• messages have application meaning (‘commands’)
• not straightforward to integrate with other networks (or IP)
• PLCs might perform gateway functionality, or be integrated in IP

Standards
• IEC 61158, general fieldbus description
• AS-i, CAN, EtherCAT, LON, PROFIBUS

from http://www.profibusaustralia.com.au/

The software substitute

Mechanical and electrical control systems are replaced by computer based solutions.

Contributing causes are:
• It is possible to improve already existing technologies, e.g., brakes in cars
• It is possible to do things that previously seemed impossible, e.g., drive-by-wire, electronic stability program in cars, etc.
• It simplifies the architecture

Example: Adaptive Cruise Control

Further development of the Cruise Control (CC)

Maintains distance to the car in front by slowing down the vehicle.

Radar is used to measure distance

Example: Electronic Damper Control (EDC)

Adjusts shock absorbers in real-time
- better driving comfort
- selectable driving experience

Relies on a number of sensors
• Steering wheel sensor
• Speed sensor
• Acceleration sensors

Electronic/ICT domains in vehicles

Powertrain

• power generation for driving (complex, real-time processes)

Chassis

• active safety & driving dynamics (ABS, ESC, ...)

• also: x-by-wire

Body & Comfort

• comfort functions: climate (airco), dashboard, wipers, cruise control, ....

• largest domain

Telematics (or infotainment)

• audio systems, displays, navigation, GPS, advanced driver assistance (ADAS), ...
• large data volumes

Passive safety
• airbag, belts, rollover sensors

Diagnostics (overall)

(source: IEEE magazine, Jan 2009)

In-vehicle networks

• Networks of ECUs

• Networks are distinct
  – avoid application interference
  – requirements for each domain are different

• Applications are distributed over several nodes

AT networks

• LIN – Local Interconnect Network – low speed serial bus, central master

• CAN – Controller Area Network (Bosch ‘90) – 1Mbps, serial bus, distributed control, real-time support

• MOST – Media Oriented Systems Transport – > 10Mbps, synchronous, stream oriented (i.e., not control oriented)

• Byteflight – predecessor of FlexRay, for safety applications

• FlexRay – increase speed and reliability beyond CAN and TTP – 10 Mbps, strictly timed (static/dynamic part)

Comparison of In-Vehicle Network Protocols, including Ethernet

From MSc presentation: Siva Thangamuthu, TU/e

Distributed real-time systems require robust real-time communication

Non real-time systems

• Throughput

• Average response time

• Average latency

Real-time systems
• Predictability
• Timing requirements on individual response times and latencies
• Require predictable communications network
• Analysis before the system is operating

Challenge
• to construct computer systems that have at least as good reliability and safety as the systems they are replacing

In vehicle (RT) communication

It used to look like this…

As the number of electronic devices grew
• the wiring got more “messy”
• the weight of the car increased

Nowadays, it looks like this…

• Cost reduction

• Flexibility

• Functionality

In modern cars, point-to-point wiring is replaced by a common communication bus

http://www.office-netshop.de/cgi-bin/shop/front/shop_main.cgi?func=direkt&artnr=00062242

Event-triggered communication

[Figure: Node 1 and Node 2, running Tasks A to E, connected by a network. An event triggers a message from A to B, which leads to the response.]

Time-triggered communication

[Figure: Node 1 (Task A) and Node 2 (Task B) connected by a network; control and sampling are driven by time.]

”Ordinary” communication protocols

Ethernet
• Addressed broadcast messages
• Collision: nodes resend after a random time
• Impossible to determine transmission times: not suitable for hard real-time systems

Token ring
• Circulating token
• No collisions
• RT guarantees possible

Protocols suitable for RT-communication

TDMA
• Time-triggered (periodic)
• Predictable
• High testability
• Example: TTP-protocol

[Figure: TDMA round with slots for Node 1, Node 2, Node 3, Node 4, then Node 1 again. Max waiting time = 1 TDMA round.]

CSMA/CR
• Priority based
• Online scheduled
• Flexible
• Example: CAN-protocol

[Figure: Nodes 1 to 4 start sending at the same time; after collision resolution the highest priority sends.]

CAN – Controller Area Network

Originally developed for automotive industry needs

• 1983: BOSCH starts CAN development (Intel joins 1985)

• 1987: First CAN chip

• 1990: First car with CAN (Mercedes S-class)

• 1993: ISO standard

Now used even in industry applications
• Very common in machinery
• CAN-controllers developed by Philips, Intel, NEC, Siemens …

An implementation of CSMA/CR
• Priority based
• CR is the central mechanism
• Bitwise arbitration to resolve collisions

CAN Variants: ISO 11898

• ISO 11898: physical and datalink layers

• 11898-1 (2003): general architecture, PHY, DLL, LLC, MAC

• 11898-2 (2006): High Speed (1 Mbps)

• 11898-3 (2006): Fault tolerant, low speed (40K-125Kbps) PHY

• 11898-4 (2004): Time Triggered CAN, TTCAN

• 11898-5 (2007): extend 11898-2 with low power

• 11898-6 (2013): extend -2 and -5 with configurable frames and wakeup

• 15765-2 (2011): diagnostic messaging on CAN

Discussion based on: R.I. Davis, A. Burns, R.J. Bril, and J.J. Lukkien, Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised, Real-Time Systems, ISSN 1573-1383 (online), January 30th, 2007, ISSN 0922-6443 (print), 35(3): 239-272, April 2007.

Structure and function

• Synchronous serial communication

• A shared medium (cable) with connected nodes

• Broadcast – data transmitted as frames can be picked up by all other attached nodes

• 1 Mbit/s at 40m bus length

• Behaves as an AND gate (wired-AND): bus value = AND between all bits on the bus

[Figure: Nodes A, B, C and D on a shared bus, < 40m at 1Mbit/s]

Traffic model

[Figure: nodes A, B, C and D attached to the bus, with a timeline of their frames and the response time of D]

Abstraction of CAN network:
- Frames in priority queues
- No pre-emption
- A single priority queue that holds all frames
- Frames are removed after sending

Message format

Data frames
• Used for data transmission, e.g., sampling values from a sensor
• Standard CAN frame (CAN 2.0 A), 11-bit identifier
• Extended CAN frame (CAN 2.0 B), 29-bit identifier

Remote frames
• Used for information requests
• The transmitting node is asking for information of the type given by the identifier

Error frames
• Used for error signaling

Overload frames
• Used to delay the transmission of the next message frame
• The node sending the overload frame is not ready to receive additional messages at this time

CAN-frame (version 2.0 A, standard format)

Field      Size
SOF        1 bit
ID         11 bits
RTR        1 bit
Control    6 bits
Data       0-8 bytes
CRC        15 bits
CRC DEL    1 bit
ACK        1 bit
ACK DEL    1 bit
EOF        7 bits
IFS        min 3 bits

SOF - Start of Frame, start bit (always 0), used for signaling that a frame will be sent (the bus must be free)

ID - Identifier, identity for the frame and its priority

RTR - Remote Transmission Request

Control - indicates the length of the data field

Data - message data

CRC - Cyclic Redundancy Check

CRC DEL - CRC delimiter

ACK - Acknowledgement

ACK DEL - ACK delimiter

EOF - End of Frame

IFS - Inter Frame Space, resending wait time

Arbitration mechanism

1. A node wants to send a frame.
2. Bus free? If no, wait; if yes, continue.
3. Put ID-bit 0 on the bus.
4. Read the bus value. Same as the one we put? If no, stop sending and go back to step 2.
5. Last bit? If no, put the next bit on the bus and go back to step 4.
6. If yes, send the rest of the frame.

Arbitration mechanism

Example:

Assume a simplified CAN-system with only three ID-bits and three nodes A, B, C:

A ID=010

B ID=100

C ID=011

000 – highest priority

111 – lowest priority

which gives:

A-high prio, C-middle, B-low

What does the arbitration look like if the nodes start sending simultaneously?

Node  ID   Bit 0  Bit 1  Bit 2  Result
A     010  0      1      0      Send the rest of the frame
B     100  1      -      -      abort! (bit 0 differs from bus value)
C     011  0      1      1      abort! (bit 2 differs from bus value)

Bus value: 0 1 0
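The arbitration example can be replayed in a few lines of Python. This is a minimal sketch, assuming a dominant 0 (wired-AND bus) and nodes that all start in the same bit slot; the function name `arbitrate` and its return format are illustrative and not part of any CAN API.

```python
def arbitrate(ids):
    """Simulate CAN bitwise arbitration for nodes that start sending together.

    ids maps a node name to its identifier as a bit string (equal lengths).
    Returns (winner, bus_bits, dropped), where dropped maps each losing node
    to the bit position at which it backed off.
    """
    contenders = dict(ids)
    width = len(next(iter(ids.values())))
    bus_bits = []
    dropped = {}
    for pos in range(width):
        # Wired-AND: the bus carries 0 as soon as one contender sends 0.
        bus = min(int(bits[pos]) for bits in contenders.values())
        bus_bits.append(bus)
        for node, bits in list(contenders.items()):
            if int(bits[pos]) != bus:
                # The node read back a value it did not send: it lost.
                dropped[node] = pos
                del contenders[node]
    if len(contenders) != 1:
        raise ValueError("identifiers must be unique")
    return next(iter(contenders)), bus_bits, dropped


winner, bus_bits, dropped = arbitrate({"A": "010", "B": "100", "C": "011"})
print(winner, bus_bits, dropped)
```

With the three identifiers of the example, A wins, the bus carries 0 1 0, B backs off at bit 0 and C at bit 2.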

Error handling

Error detection with checksum (CRC)
• If the frame is received correctly, the ACK-bit is set to 0 by a receiver

Error signaling
• The node that detects an error immediately puts 000000 on the bus
• Because 0 is the dominant value, all nodes will detect the error rapidly
• Some CAN-systems have 1 as the dominant bit; the bit-pattern for error signaling is then 111111

Timing properties

CAN is time deterministic
• The latency can be predicted
• It is possible to calculate how long it takes to deliver a frame

How many bits are sent in a CAN-frame?

SOF 1 + ID 11 + RTR 1 + Control 6 + Data 8n + CRC 15 + CRC DEL 1 + ACK 1 + ACK DEL 1 + EOF 7 + IFS 3 (min)

Sum = 47 + 8n bits (n = nr of data bytes)

Timing properties

We must avoid the two bit-patterns that are used for error signaling, i.e., 000000 and 111111:
• Bit stuffing: the sender puts extra bits at strategic places to prevent forbidden bit-patterns
• The receiver reconstructs the original frame by removing the extra bits

Example:

Original frame:               …00101000000101…
Sender puts extra bits:       …001010000010101…
Bits sent on the bus:         …001010000010101…
Receiver removes extra bits:  …00101000000101…
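The stuffing rule can be sketched in Python as below. This is an illustration on bit strings only, assuming one complementary bit is inserted after every five equal consecutive bits and that an inserted bit counts towards the next run; `stuff` and `unstuff` are hypothetical helper names.

```python
def stuff(bits, run=5):
    """Insert a complementary bit after every `run` equal consecutive bits."""
    out = []
    last, count = None, 0
    for b in bits:
        if b == last:
            count += 1
        else:
            last, count = b, 1
        out.append(b)
        if count == run:
            extra = "1" if b == "0" else "0"
            out.append(extra)
            # The stuff bit itself starts a new run.
            last, count = extra, 1
    return "".join(out)


def unstuff(bits, run=5):
    """Remove the stuff bits again (inverse of stuff)."""
    out = []
    last, count = None, 0
    skip = False
    for b in bits:
        if skip:
            # This is a stuff bit: drop it, but it starts a new run.
            last, count = b, 1
            skip = False
            continue
        if b == last:
            count += 1
        else:
            last, count = b, 1
        out.append(b)
        if count == run:
            skip = True
    return "".join(out)


original = "00101000000101"
on_bus = stuff(original)
print(on_bus)
print(unstuff(on_bus) == original)
```

For the fragment in the example, `stuff` yields 001010000010101 and `unstuff` restores the original 14 bits.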

Timing properties

Do we need to perform bit stuffing on all 47+8n bits?
• No, only 34 (of 47) control bits are affected (SOF, ID, RTR, Control and CRC: 1 + 11 + 1 + 6 + 15 = 34), plus the 8n data bits: 34+8n affected bits
• By forbidding some ID values we can avoid bit stuffing in the frame ID

One (1) extra bit after 5 equal bits gives:

$$\text{Nr of extra bits} = \left\lfloor \frac{34 + 8n}{5} \right\rfloor$$

Timing properties

Is this correct?
• Not completely. The standard allows both 000000 and 111111 for error signaling, and an inserted stuff bit counts towards the next run of equal bits. We might need to stuff even more extra bits.

Example (stuff bits in brackets):

Original:            11111 0000 1111 0000 1111
After bit stuffing:  11111[0] 0000[1] 1111[0] 0000[1] 1111

To avoid forbidden bit patterns we must, in the worst case, insert an extra bit after the first five bits and one extra bit after every fourth original bit after that.

Timing properties

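Hence, the number of extra bits in a CAN-frame is at most:

$$\left\lfloor \frac{34 + 8n - 1}{4} \right\rfloor$$

Now we can calculate the total transmission time for a CAN-frame:

$$WC_i = \left(47 + 8n + \left\lfloor \frac{34 + 8n - 1}{4} \right\rfloor\right)\tau_{bit} = (55 + 10n)\,\tau_{bit}$$

Transmission times for 1Mbit/s (i.e. $\tau_{bit} = 1\,\mu s$):

• longest: $n_{max} = 8$ and including stuff-bits: $WC_i = (55 + 10 \cdot 8)\,\mu s = 135\,\mu s$

• shortest: $n_{min} = 0$ and without extra bits: $BC_i = 47\,\mu s$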
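The frame-length formulas can be checked numerically. A small sketch in Python, assuming the standard (11-bit identifier) frame format used on these slides; the function names are illustrative.

```python
def max_stuff_bits(n_data_bytes):
    """Worst-case number of stuff bits: floor((34 + 8n - 1) / 4)."""
    return (34 + 8 * n_data_bytes - 1) // 4


def worst_case_bits(n_data_bytes):
    """47 + 8n frame bits plus the worst-case stuff bits."""
    return 47 + 8 * n_data_bytes + max_stuff_bits(n_data_bytes)


def best_case_bits(n_data_bytes):
    """47 + 8n frame bits, no stuff bits."""
    return 47 + 8 * n_data_bytes


def transmission_time_us(bits, bit_rate_bps=1_000_000):
    """Time on the bus in microseconds for a given number of bits."""
    return bits * 1e6 / bit_rate_bps


for n in range(9):
    # The closed form (55 + 10n) holds for every payload length 0..8.
    assert worst_case_bits(n) == 55 + 10 * n

print(transmission_time_us(worst_case_bits(8)))  # longest frame at 1 Mbit/s
print(transmission_time_us(best_case_bits(0)))   # shortest frame at 1 Mbit/s
```

At 1 Mbit/s this gives 135 and 47 microseconds, the longest and shortest times stated above.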

Response time analysis for CAN

CAN is priority based. Can we use some kind of response time analysis (similar to that for tasks)?

Yes, with a slight modification. The CAN-bus is non-preemptive, so once a frame has managed to send its first bit, it will continue sending the rest of the frame (without fear of being preempted by higher priority frames).

A station i can complete a frame after it has successfully sent the first bit. The total time $w_i$ that this takes (including delays) consists of:
– waiting for (at most 1) lower priority frame; plus push through of i
– using this bit-time;
– waiting for all higher priority messages that were queued or kick in during this waiting, recursively


$$w_i = \tau_{bit} + B_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i}{T_j} \right\rceil WC_j$$

Response time for a frame i (pessimistic):

$$WR_i = w_i + WC_i - \tau_{bit}$$

where the blocking time for a frame is given by:

$$B_i = \max\Big(\max_{k \in lp(i)} WC_k,\; WC_i\Big) \le 135\,\tau_{bit}$$

hp(i) = higher priority frames (that can delay the first bit)

lp(i) = lower priority frames (that can block the first bit)

Aside: recursive expressions

The recurrence captures the following iterative reasoning:

– consider a minimum approximation time for $w_i$: $w_i^{(0)} = \tau_{bit} + B_i$

– determine how much interference from higher priority frames is possible during this $w_i^{(0)}$. This gives a new approximation:

$$w_i^{(1)} = \tau_{bit} + B_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i^{(0)}}{T_j} \right\rceil WC_j$$

– continue this process until the new approximation is the same as the previous one

Frames can have jitter:
- variations in time when a frame is queued
- usually due to the sender task’s jitter

CAN - Example

Assume a CAN system with three nodes.

Node 1 (id = 011)
Task  T      C     Msg
A1    10000  3000  m1
A2    7000   1000  -

Node 2 (id = 001)
Task  T      C     Msg
B1    5000   1000  m2
B2    4000   1000  -

Node 3 (id = 000)
Task  T      C     Msg
C1    4000   1000  m3
C2    10000  1000  -

Assumptions:
- Dominant bit: 0
- Bus speed: 1 Mbit/s
- Task instances send their messages at the end of the execution
- The size of each message is 135 bits
- Task priority assignment is according to Rate Monotonic

a) Calculate jitter for the messages
b) Calculate response times for the messages

CAN - Example

a) Jitter

Node 1:

[Figure: timeline of A2 (released at 0, 7000, 14000) and A1 (released at 0, 10000) over 0..20000, marking WR_A1 and BR_A1]

J_m1 = WR_A1 – BR_A1 = 4000 - 3000 = 1000

Node 2:

(Same as above)

J_m2 = WR_B1 – BR_B1 = 2000 - 1000 = 1000

Node 3:

J_m3 = WR_C1 – BR_C1 = 0 (Note: No jitter, because C1 has highest priority)
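The three jitter values can be reproduced with the usual task response-time iteration. The Python sketch below assumes Rate Monotonic priorities on each node, no blocking between tasks, and a best-case response time equal to the execution time C (which is what the BR values above use); `worst_case_response` and `message_jitter` are illustrative names.

```python
def worst_case_response(c, higher_priority, limit=10**7):
    """Iterate R = C + sum(ceil(R / T_j) * C_j) until R stops changing.

    higher_priority is a list of (T_j, C_j) pairs.
    """
    r = c
    while r <= limit:
        # -(-a // b) is integer ceiling division.
        nxt = c + sum(-(-r // t_j) * c_j for t_j, c_j in higher_priority)
        if nxt == r:
            return r
        r = nxt
    raise ValueError("no fixed point below limit")


def message_jitter(c, higher_priority):
    """Jitter of a message sent at the end of the task: WR - BR, with BR = C."""
    return worst_case_response(c, higher_priority) - c


j_m1 = message_jitter(3000, [(7000, 1000)])  # A1, interfered by A2
j_m2 = message_jitter(1000, [(4000, 1000)])  # B1, interfered by B2
j_m3 = message_jitter(1000, [])              # C1, highest priority on node 3
print(j_m1, j_m2, j_m3)
```

This prints 1000 1000 0, matching the values derived by hand.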


Aside: recursive expressions

If there is jitter, the worst interference of a higher priority frame j is obtained if the first activation in an interval $w_i^{(0)}$ is as late as possible (maximum jitter) and subsequent ones are as early as possible. Starting again from $w_i^{(0)} = \tau_{bit} + B_i$, the next approximation becomes:

$$w_i^{(1)} = \tau_{bit} + B_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i^{(0)} + J_j}{T_j} \right\rceil WC_j$$

This gives the recurrence and the worst-case finalization time (finalization (WF) rather than response, to indicate the interval start as reference):

$$w_i = \tau_{bit} + B_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i + J_j}{T_j} \right\rceil WC_j$$

$$WF_i = J_i + w_i + WC_i - \tau_{bit}$$

The equations above can be re-written as ($w'_i = w_i - \tau_{bit}$):

$$w'_i = B_i + \sum_{j \in hp(i)} \left\lceil \frac{w'_i + J_j + \tau_{bit}}{T_j} \right\rceil WC_j$$

$$WF_i = J_i + w'_i + WC_i$$
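The rewritten recurrence translates directly into a fixed-point loop. The Python sketch below works in bit-times (so τ_bit = 1) and takes the blocking time B_i as a parameter supplied by the caller. The usage applies it to the frames of the CAN example (135-bit frames, the task periods as message periods, the jitters J_m1 = J_m2 = 1000 and J_m3 = 0); the blocking value of 135 bit-times per frame is an assumption made for illustration, not a result stated on the slides.

```python
def can_finalization_time(wc, blocking, jitter, higher_priority,
                          tau_bit=1, limit=10**7):
    """Worst-case finalization time WF = J + w' + WC of one CAN frame.

    Solves w' = B + sum(ceil((w' + J_j + tau_bit) / T_j) * WC_j) by iteration.
    higher_priority is a list of (T_j, J_j, WC_j) tuples.
    """
    w = blocking
    while w <= limit:
        nxt = blocking + sum(
            -(-(w + j_j + tau_bit) // t_j) * wc_j  # integer ceiling division
            for t_j, j_j, wc_j in higher_priority
        )
        if nxt == w:
            return jitter + w + wc
        w = nxt
    raise ValueError("no fixed point below limit")


# Priorities follow the identifiers: m3 (000) > m2 (001) > m1 (011).
m3 = (4000, 0, 135)     # (T, J, WC)
m2 = (5000, 1000, 135)

wf_m3 = can_finalization_time(135, blocking=135, jitter=0, higher_priority=[])
wf_m2 = can_finalization_time(135, blocking=135, jitter=1000, higher_priority=[m3])
wf_m1 = can_finalization_time(135, blocking=135, jitter=1000, higher_priority=[m2, m3])
print(wf_m3, wf_m2, wf_m1)
```

Passing `jitter=0` and zero jitter for all higher priority frames gives the jitter-free recurrence as a special case.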


TTP – Time Triggered Protocol

An implementation of TDMA
• Time-triggered
• Bus access is pre-defined in an offline schedule
• Nodes can be assigned several slots

Originally developed at the Technical University of Vienna in cooperation with several car manufacturers
• Commercial development by TTTech

Aimed at X-by-wire applications
• Boeing 777, Airbus 340, Audi,…

Very high demands on reliability
• Safety-critical real-time systems that require fault tolerance

TTP - typical system configuration

[Figure: Nodes A, B and C connected to two buses (bus 1, bus 2); nodes grouped into an FTU]

Fail silent nodes
- Nodes detect errors by themselves
- They either deliver correct results or no result at all

Grouped in FTUs (Fault Tolerant Units)
- Several nodes that do the same in parallel
- An FTU is working as long as one of its nodes is working

TTP - synchronization

Time-triggered: clocks must be synchronized
- continuous synchronization
- some tens of microseconds

The receiver compares the actual receiving time with the expected receiving time

CAN vs TTP

TTP
– Time-triggered (periodic)
– Easier analysis
– Predictable
– High testability

CAN
– Priority based
– Faster response times for high priority messages
– Flexible

FlexRay

Started in the year 2000 as an industrial consortium: BMW, Daimler-Chrysler, Philips and Motorola. Today, more than 100 members worldwide.

Goals and properties
• High speed, an order of magnitude higher than CAN (10Mbps)
• Deterministic communication
• Fault-tolerant communication
• Different connection possibilities

BMW first with FlexRay, in 2006, in a BMW X5

State-of-the-art

Combination of different buses…

Scheduling in distributed RT

How do we guarantee timing requirements over several nodes with

communication involved?

ALT1: We schedule nodes and communication bus together
– Good control over the system
– Resource effective
– Not flexible: the system must be rescheduled for each new node added

ALT2: Each node and the communication bus are scheduled separately
– Flexible: different parts (nodes) can come from different manufacturers
– The nodes must have an agreement about the data traffic

Example alternative 1: Global schedule

Assume the following design to be scheduled:

Task  Node    RT  C  Dl
A     Node 1  0   1  10
E     Node 1  0   1  10
F     Node 1  0   1  10
B     Node 2  0   1  10
D     Node 2  0   1  10
C     Node 2  0   1  10

RT: relative release time (phasing)


Precedence graphs with messages included:

[Figure: PG 1 (Period = 10) with tasks A, B, C and D; PG 2 (Period = 10) with tasks E and F; messages m1 and m2, each with C = 1, D = 10, RT = 0, Node: bus]


Joint graph:

[Figure: joint precedence graph (Period = 10) over tasks A, B, C, D, E, F and messages m1 and m2, each message with C = 1, D = 10, RT = 0, Node: bus]


Schedule:

[Figure: schedule over t = 0..10 with one row per resource. Node 1 runs A, E, F; Node 2 runs B, D, C; the bus carries m1 and m2]

Alternative 2: separated scheduling

Event-triggered distributed systems
– Each node scheduled online (examples: RM, EDF)
– No common time basis
– An event-triggered bus (example: CAN)

We want to be able to analyze distributed transactions, e.g.:
1. A task on a node is activated periodically or by an event
2. The task sends a message on the network
3. The message activates another task

Example – distributed transaction

[Figure: timeline of a distributed transaction over Task A, the CAN-bus and Task B. Marked instants: event, A activated; A sends msg (earliest / latest); Node 2 receives msg; B ready to execute; B sends response (earliest / latest). Marked intervals: execution delay (A), msg jitter, communication delay, release jitter, execution delay (B), response jitter, and the end-to-end response from event to response.]

Distributed transaction – response times

Execution delay of A (= FA) – caused by A and other tasks on node 1
• Use response time analysis to calculate FA

Message jitter (Jm1) – caused by variations in A’s execution
• Message jitter = difference between WFA and BFA

Communication delay (= Fm1) – caused by other messages on the bus
• Use jitter inherited from A (Jm1) and apply response time analysis for CAN to calculate Fm1

Release jitter for B (JB) – caused by variations in m1’s transmission
• Release jitter for B = difference between WFm1 and BFm1
• BRm1 ≥ 47 τbit (no other messages on the bus)

Execution delay of B (= RB) – caused by B and other tasks on node 2
• Use jitter inherited from m1 (JB) and calculate response time for B

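Distributed transaction – response times

Transaction: A → m1 → B

Task A:

$$w_A = C_A + B_A + \sum_{j \in hp(A)} \left\lceil \frac{w_A + J_j}{T_j} \right\rceil C_j, \qquad WF_A = J_A + w_A$$

Message m1 (m1 inherits jitter from A):

$$J_{m1} = WF_A - BF_A$$

$$w'_{m1} = B_{m1} + \sum_{j \in hp(m1)} \left\lceil \frac{w'_{m1} + J_j + \tau_{bit}}{T_j} \right\rceil C_j, \qquad WF_{m1} = J_{m1} + w'_{m1} + C_{m1}$$

Task B (B inherits jitter from m1):

$$J_B = WF_{m1} - BF_{m1}$$

$$w_B = C_B + B_B + \sum_{j \in hp(B)} \left\lceil \frac{w_B + J_j}{T_j} \right\rceil C_j, \qquad WF_B = J_B + w_B$$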
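The jitter inheritance along A → m1 → B can be chained in code. The Python sketch below is illustrative only: times are in bit-times at 1 Mbit/s, the task and frame parameters are hypothetical values in the style of the CAN example, the best-case finalization times BF_A and BF_m1 are assumed to equal the execution time of A and one uncontended 135-bit frame, and the helper names are not taken from the slides.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def fixed_point(start, step, limit=10**7):
    """Iterate w = step(w) from `start` until it stops changing."""
    w = start
    while w <= limit:
        nxt = step(w)
        if nxt == w:
            return w
        w = nxt
    raise ValueError("no fixed point below limit")


def task_wf(c, blocking, jitter, hp):
    """WF = J + w, with w = C + B + sum(ceil((w + J_j) / T_j) * C_j)."""
    w = fixed_point(
        c + blocking,
        lambda w: c + blocking + sum(ceil_div(w + j, t) * cj for t, j, cj in hp),
    )
    return jitter + w


def msg_wf(c, blocking, jitter, hp, tau_bit=1):
    """WF = J + w' + C, with w' = B + sum(ceil((w' + J_j + tau_bit) / T_j) * C_j)."""
    w = fixed_point(
        blocking,
        lambda w: blocking
        + sum(ceil_div(w + j + tau_bit, t) * cj for t, j, cj in hp),
    )
    return jitter + w + c


# hp entries are (T_j, J_j, C_j); all numbers below are hypothetical.
wf_a = task_wf(c=3000, blocking=0, jitter=0, hp=[(7000, 0, 1000)])
j_m1 = wf_a - 3000                      # J_m1 = WF_A - BF_A, BF_A assumed = C_A
wf_m1 = msg_wf(c=135, blocking=135, jitter=j_m1,
               hp=[(5000, 1000, 135), (4000, 0, 135)])
j_b = wf_m1 - 135                       # J_B = WF_m1 - BF_m1, BF_m1 assumed = 135
wf_b = task_wf(c=1000, blocking=0, jitter=j_b, hp=[(4000, 0, 1000)])
print(wf_a, j_m1, wf_m1, j_b, wf_b)
```

Each stage only needs the worst-case and best-case finalization times of the stage before it, which is what makes separate scheduling of nodes and bus analyzable.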

Conclusions

Today mechanical and electrical control systems are replaced by computer based solutions.
• It is challenging to construct computer systems that have at least as good reliability and safety as the systems they are replacing.
• Distributed real-time systems require robust communication

We can use real-time methods for:
• Predictability
• Individual response times and latencies
• Analysis before the system is operating

Scheduling of distributed real-time systems
• Requires predictable communications networks (CAN, TTP, FlexRay…)
• Either global system scheduling
• or separate scheduling on each node and the communication bus

References

R.I. Davis, A. Burns, R.J. Bril, and J.J. Lukkien, Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised, Real-Time Systems, ISSN 1573-1383 (online), http://www.springerlink.com/content/8n32720737877071/, January 30th, 2007, ISSN 0922-6443 (print), 35(3): 239-272, April 2007.

http://www.springerlink.com/content/8n32720737877071/fulltext.pdf



