
Programming Memory-Constrained Networked Embedded SystemsAdam Dunkels <[email protected]>1

Programming Memory-Constrained Networked Embedded Systems

Adam Dunkels

PhD thesis defense

February 15, 2007


Embedded systems

● Things with computers that are not computers themselves

● Refrigerators, toys, industrial robots, ...

● 98% of all microprocessors go into embedded systems

● Embedded systems are everywhere!

● 50% are much smaller than PC microprocessors

● 8-bit microprocessors

● 1024 bytes vs 1073741824 (~1 billion) bytes


Tiny microprocessors are huge


Networked, programming

● What if we could make them talk to each other?

● A wide range of new fascinating applications

● Memory-constraints make programming the small embedded systems a challenge

● Typical example: 60k ROM, 2k RAM


Programming – programming in the small


What I’ve done

1. TCP/IP networking for memory-constrained networked embedded systems

● Developed two embedded TCP/IP stacks: lwIP, uIP

2. Simplifying event-driven programming for memory-constrained systems

● Protothreads, a novel programming mechanism

● Per-process multi-threading for event-driven systems

3. Loadable modules for embedded operating systems

● Developed an embedded operating system with loadable module support: Contiki


Results of this thesis

● TCP/IP for embedded systems

● Now possible to use in systems an order of magnitude smaller

● Trade-off: memory for performance

● Protothreads – a novel programming abstraction

● Decrease program complexity

● Very small memory & performance overhead

● Dynamically loadable modules in the Contiki operating system

● First system in the community to have this

● Energy overhead of dynamic linking low

● Significant impact

● Software used by 100+ companies world-wide, in research projects, university courses; the papers are published at high-caliber conferences, ...


The details...


Networked embedded systems

● Some embedded systems already talk to each other

● Wireless car keys, the TV remote, mobile phones, ...

● The vision: wireless sensor networks

● Sensing, processing, radio on a single device

● Enable new applications


Wireless sensor networks – Applications

● Environmental monitoring

● Follow contamination flows

● Habitat observation

● Oceanography


Wireless sensor networks – Applications

● Health monitoring of buildings

● Cracks in bridges

● Mix sensors right into the concrete

… etc


Wireless sensor networks may be just a vision ...

... but networked embedded systems are a reality!


The networked refrigerator

Dave Hudson, principal software engineer for Ubicom Ltd, 26 September 2001:

“Actually, refrigerators are probably one of the most network-connected appliances I know

Not domestic refrigerators, but the commercial type that supermarkets use

We’ve supplied tens of thousands of RS485-connected control and monitoring systems for such refrigerators

This market is now headed towards Ethernet and TCP/IP connectivity because it has a tremendous benefits in terms of manageability and interoperability between different suppliers' equipment.”


1. TCP/IP for memory-constrained networked embedded systems


Traditional TCP/IP stacks are large

● Linux TCP/IP stack

● 100k code, 400k RAM

● µCLinux kernel 400k code, 1 megabyte RAM

60k ROM, 2k RAM...


µIP – Bottom-up approach

● Unconventional design

● Bottom-up design

● Single packet buffer

● Event-driven application interface

[Diagram: protocol stack, bottom to top: Network; IP, ICMP; UDP, TCP; Application]


µIP results

● 5k code, 100 bytes – 2k RAM

● An order of magnitude smaller than existing work

● RFC compliant TCP, UDP, IP

● Possible, contrary to conventional wisdom

● Single-segment design of µIP: unfortunate interaction with TCP’s delayed ACK mechanism
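The interaction with delayed ACKs can be illustrated with a back-of-the-envelope bound. A sender that keeps only one segment in flight cannot send the next segment until the previous one is acknowledged, while a receiver using delayed ACKs may hold its acknowledgment back until a timer fires when no second segment arrives. The sketch below is not taken from the thesis: the function name, the 512-byte segment, the 10 ms round-trip time and the 200 ms delayed-ACK timer are all assumed values chosen for illustration.

```c
#include <stdio.h>

/* Upper bound on throughput (bytes/second) for a sender that keeps only
 * one TCP segment in flight: each segment costs one round trip plus
 * however long the receiver delays its ACK. All parameters are
 * illustrative assumptions, not measurements from the thesis. */
static double single_segment_throughput(double segment_bytes,
                                         double rtt_s,
                                         double ack_delay_s)
{
  return segment_bytes / (rtt_s + ack_delay_s);
}

/* Prints the bound with and without an assumed 200 ms delayed ACK. */
static void print_delayed_ack_example(void)
{
  double seg = 512.0;   /* hypothetical segment size in bytes */
  double rtt = 0.010;   /* hypothetical 10 ms round-trip time */

  printf("no delayed ACK:  %.0f bytes/s\n",
         single_segment_throughput(seg, rtt, 0.0));
  printf("200 ms delayed:  %.0f bytes/s\n",
         single_segment_throughput(seg, rtt, 0.200));
}
```

Under these assumptions the ACK delay, not the link, dominates the time spent per segment, which is why the single-segment design pays a throughput penalty.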


But: ability to communicate more important than throughput

● µIP trades memory for throughput

● Low memory usage, low throughput

● Small systems: not that much data

● Example – CubeSat:

● µIP with 100 bytes buffer

● 9600 bps RF link


Event-driven

“In TinyOS, we have chosen an event model so that high levels of concurrency can be handled in a very small amount of space. A stack-based threaded approach would require that stack space be reserved for each execution context.”

J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System architecture directions for networked sensors. [ASPLOS 2000]


Problems with the event-driven model?

“This approach is natural for reactive processing and for interfacing with hardware, but complicates sequencing high-level operations, as a logically blocking sequence must be written in a state-machine style.”

P. Levis, S. Madden, D. Gay, J. Polastre, R. Szewczyk, A. Woo, E. Brewer, and D. Culler. The Emergence of Networking Abstractions and Techniques in TinyOS. [NSDI 2004]


2. Simplifying event-driven programming of memory-constrained systems


Threads vs events…

Threads: sequential code flow

Events: unstructured code flow

Very much like programming with GOTOs


Explicit state machines for flow control

● The problem: using explicit state machines for flow control

● Created ad hoc by the programmer

● No formal specification

● Must be inferred from reading code

● Very much like using GOTOs
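As an illustration of the problem, here is a minimal sketch of a logically sequential task ("wait for a connection, send a request, wait for the reply") written as an event handler. It is not code from the thesis; the states, events and function names are hypothetical. The point is that the order of operations is encoded in the `state` variable and has to be reconstructed by reading every `case`.

```c
/* A logically sequential task written as an event handler. The control
 * flow lives in the 'state' variable rather than in the code structure.
 * All names here are hypothetical. */
enum state { WAIT_CONNECT, WAIT_REPLY, DONE };
enum event { EV_CONNECTED, EV_REPLY, EV_TIMEOUT };

struct task {
  enum state state;
  int requests_sent;
};

static void task_init(struct task *t)
{
  t->state = WAIT_CONNECT;
  t->requests_sent = 0;
}

/* Called once per event; must return without blocking. */
static void task_handle_event(struct task *t, enum event ev)
{
  switch(t->state) {
  case WAIT_CONNECT:
    if(ev == EV_CONNECTED) {
      t->requests_sent++;        /* "send request" */
      t->state = WAIT_REPLY;
    }
    break;
  case WAIT_REPLY:
    if(ev == EV_REPLY) {
      t->state = DONE;
    } else if(ev == EV_TIMEOUT) {
      t->requests_sent++;        /* retransmit, stay in WAIT_REPLY */
    }
    break;
  case DONE:
    break;
  }
}
```

Even with three states, the transition structure is only implicit in the assignments to `t->state`, which is the sense in which such code resembles programming with GOTOs.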


Contiki: Combining event-driven and threads

● Event-based kernel

● Low memory usage

● Single stack

● Multi-threading is a library

● For those applications that need it

● One thread, one extra stack

● The first system in the sensor network community to do this


However...

● Threads still require stack memory

● Unused stack space wastes memory

● 200 bytes out of 2048 bytes is a lot!

● A multi-threading library is very difficult to port

● Requires use of assembly language

● Hardware specific

● Platform specific

● Compiler specific


Protothreads: A new programming abstraction

● A design point between events and threads

● Programming primitive: conditional blocking wait

● PT_WAIT_UNTIL(condition)

● Single stack

● Low memory usage, just like events

● Sequential flow of control

● No explicit state machine, just like threads

● Programming language helps us: if and while


An example protothread

int a_protothread(struct pt *pt) {
  PT_BEGIN(pt);

  PT_WAIT_UNTIL(pt, condition1);

  if(something) {
    PT_WAIT_UNTIL(pt, condition2);
  }

  PT_END(pt);
}

/* … */

/* … */

/* … */

/* … */


Proof-of-concept implementation of protothreads in ANSI C

● Implementation in pure ANSI C

● Uses the C preprocessor

● No need for a special preprocessor

● No assembly language

● Very portable

● Nothing is changed between platforms, C compilers

● However, two deviations from the mechanism

● Automatic variables not stored across blocking waits

● Limitations on the use of switch statements
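The first deviation is easiest to see in code. The sketch below uses a minimal switch-based set of macros in the style of the six-line implementation shown later in this talk (wrapped in `do { } while(0)` here so that each wait is a single statement). The `struct counter`, the `flag` variable and `count_three` are hypothetical names introduced for illustration. Because a blocked protothread returns from its function, anything that must survive a wait has to live outside the C stack, here in the struct next to the protothread state.

```c
/* Minimal switch-based protothread macros (an illustrative sketch). */
struct pt { unsigned short lc; };
#define PT_INIT(pt)   ((pt)->lc = 0)
#define PT_BEGIN(pt)  switch((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) \
  do { (pt)->lc = __LINE__; case __LINE__: if(!(c)) return 0; } while(0)
#define PT_END(pt)    } (pt)->lc = 0; return 1

struct counter {
  struct pt pt;
  int i;            /* loop counter kept in the struct, not on the stack */
};

static int flag;    /* set by "the outside world" to release the wait */

/* Waits for 'flag' three times. Returns 0 while blocked, 1 when done. */
static int count_three(struct counter *c)
{
  PT_BEGIN(&c->pt);

  /* If 'i' were an automatic variable, resuming would jump into the
   * middle of this loop with an indeterminate counter value. */
  for(c->i = 0; c->i < 3; c->i++) {
    PT_WAIT_UNTIL(&c->pt, flag);
    flag = 0;
  }

  PT_END(&c->pt);
}
```

The second deviation follows from the same construction: since the macros expand to `case` labels of an enclosing `switch`, a blocking wait cannot be placed inside a user-written `switch` statement in the same function.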


How well do protothreads work?


Reduction of complexity

| Program | States before | States after | Transitions before | Transitions after | Reduction in lines of code |
|---|---|---|---|---|---|
| XNP | 25 | 0 | 20 | 0 | 32% |
| TinyDB DBBufferC | 23 | 0 | 24 | 0 | 24% |
| Mantis CC1000 driver | 15 | 0 | 19 | 0 | 23% |
| SOS CC1000 driver | 26 | 9 | 32 | 14 | 16% |
| Contiki TR1001 driver | 12 | 3 | 22 | 3 | 49% |
| uIP SMTP client | 10 | 0 | 10 | 0 | 45% |
| Contiki codeprop | 6 | 4 | 11 | 3 | 29% |

Found state machine-related bugs in two of the programs when rewriting with protothreads

Explicit flow-control state machines could be almost completely removed


Execution time overhead is a few cycles

| Compiler flags | State machine (CPU cycles) | Protothreads (CPU cycles) |
|---|---|---|
| gcc -Os | 92 | 98 |
| gcc -O1 | 91 | 94 |

Contiki TR1001 radio driver average execution time, MSP430 CPU cycles

● Overhead: 3 – 6 CPU cycles

● Protothreads useful even in time-critical code


Now we can program...

● But how do we get the programs onto the devices?


3. Loadable modules in the Contiki operating system


Traditional reprogramming

● Physically attach to the device

● Provide a special voltage to the chip

● Rewrite the memory of the chip

● Do this for all your devices out there

● What if we have 100 devices in 100 buildings?

● 10000 devices...


Transmitting programs over the network

Load the software


Traditional systems: entire system a monolithic binary

● Most systems statically linked at compile-time

● Entire system is a monolithic binary

● Makes code smaller

● But: hard to change

● Must re-upload entire system


Contiki: run-time loadable program modules

● Core resident in memory

● Programs know the core

● The core does not know the programs

● Individual programs can be loaded/unloaded

● The first system in the sensor network community to do this

Core


Can we use a standard mechanism for the dynamic loading?

● Can we do dynamic loading the ”Linux” way in Contiki?

● Despite the resource constraints

● Run-time linking of ELF files

● Availability of tools, knowledge

● If we could, what would the overhead be?

● Compared to a tailored loading mechanism

● Compared to virtual machines


In comparison: two virtual machines

● CVM – Contiki VM

● A stack-based, typical virtual machine

● A compiler for a subset of Java

● The leJOS Java VM

● Adapted to run in ROM

● Executes Java byte code

● Bundled .class files


Memory footprint is small

| Module | ROM | RAM |
|---|---|---|
| Tailored loader | 670 | 0 |
| CVM | 1344 | 8 |
| ELF loader | 5694 | 78 |
| Java VM | 13284 | 59 |

● ROM size of dynamic linker

● ~ 2k code

● ~ 4k symbol table

● Full Contiki system, automatically generated

● ELF loading feasible for memory-constrained systems
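The symbol table is what lets a loaded program find functions in the resident core. The sketch below shows only that name-to-address resolution step, under simplifying assumptions: the table, the two core functions and `link_module` are hypothetical names, and a real ELF loader would additionally parse the object file and patch relocation entries in the loaded code, which is not shown here.

```c
#include <stddef.h>
#include <string.h>

typedef int (*core_fn)(void);

/* One entry in the core's exported symbol table. */
struct symbol {
  const char *name;
  core_fn addr;
};

/* Two hypothetical core functions that loadable programs may call. */
static int core_leds_on(void)  { return 1; }
static int core_leds_off(void) { return 0; }

/* In this sketch the table is written by hand. */
static const struct symbol symtab[] = {
  { "leds_off", core_leds_off },
  { "leds_on",  core_leds_on  },
};

/* Linear search; returns NULL if the core does not export 'name'. */
static core_fn symtab_lookup(const char *name)
{
  size_t i;
  for(i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
    if(strcmp(symtab[i].name, name) == 0) {
      return symtab[i].addr;
    }
  }
  return NULL;
}

/* Resolve every name a module imports; returns 0 on success, -1 if a
 * symbol is missing from the core (the module then cannot be loaded). */
static int link_module(const char *const *imports, size_t n, core_fn *slots)
{
  size_t i;
  for(i = 0; i < n; i++) {
    slots[i] = symtab_lookup(imports[i]);
    if(slots[i] == NULL) {
      return -1;
    }
  }
  return 0;
}
```

The direction of the dependency matches the design: the module names what it needs from the core, while the core's table contains no reference to any particular module.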


Quantifying the energy consumption

● Measure the energy consumption:

● Radio reception, measured on CC2420, TR1001

● Better estimate based on average Deluge overhead

● Storing data to EEPROM

● Linking, relocating object code

● Loading code into flash ROM

● Executing the code

● Two platforms: ESB, Telos Sky (both MSP430)


Energy consumption of the dynamic linker


Loading, linking native code vs virtual machine code

| Step | ELF (mJ) | CVM (mJ) | Java (mJ) |
|---|---|---|---|
| Reception | 29 | 2 | 22 |
| Storing | 2 | 0 | 0 |
| Linking | 3 | 0 | 0 |
| Loading | 1 | 0 | 5 |
| Total | 35 | 2 | 27 |

Energy consumption in mJ for loading an object tracking application


Execution time overhead

| | Energy per iteration (µJ) |
|---|---|
| Native | 0.54 |
| CVM | 0.95 |
| Java | 2.0 |

| | Energy per iteration (µJ) |
|---|---|
| Native | 0.75 |
| CVM | 65 |
| Java | 73 |

● Computationally “heavy” code

● 8x8 vector convolution

● Code that uses a native code library

● Object tracking application

● Most of the time is spent running native code


Break-even points, vector convolution

“ELF16”


Break-even points, object tracking

“ELF16”
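The plots themselves are not preserved in this transcript, but the idea of a break-even point can be sketched from the two kinds of numbers given earlier: native code costs more energy to load, while virtual machine code costs more energy per iteration. The function below is a simplified model with a hypothetical name, not the computation behind the original plots, which may include factors not modeled here.

```c
/* Number of iterations after which native (ELF) code has paid back its
 * higher loading cost. Loading energies are in mJ, per-iteration
 * energies in microjoules. Returns a negative value if the VM is never
 * more expensive per iteration, so that there is no break-even point.
 * Simplified illustrative model, not the thesis's own calculation. */
static double break_even_iterations(double load_native_mj,
                                    double load_vm_mj,
                                    double iter_native_uj,
                                    double iter_vm_uj)
{
  double extra_load_uj = (load_native_mj - load_vm_mj) * 1000.0;
  double saved_per_iter_uj = iter_vm_uj - iter_native_uj;

  if(saved_per_iter_uj <= 0.0) {
    return -1.0;
  }
  return extra_load_uj / saved_per_iter_uj;
}
```

For example, calling `break_even_iterations(35.0, 2.0, 0.54, 0.95)` with the ELF and CVM loading totals and one pair of per-iteration figures from the tables above gives roughly 80,000 iterations. Whether those particular figures belong to the same application is not stated in the transcript, so the number is only an illustration of the trade-off.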


Wrapping up


Future work

● Investigating the memory requirements/performance trade-off

● More memory = better performance?

● Single-buffer approach for other communication mechanisms

● Bottom-up approach to build other programming abstractions

● High-level sensor network programming


Conclusions

● Results

● TCP/IP for memory-constrained systems

● Protothreads: simplifies event-driven programming

● Dynamic loading/linking of code modules

● Low-complexity mechanisms for low-complexity systems

● Simple in hindsight!

● But it takes a lot of hard work to get there

● Some interesting future work ahead of us


The end of my part


Background – the TCP/IP stack

● UDP – best-effort datagrams

● TCP – connection oriented, reliable byte-stream, full-duplex

● Flow control, congestion control, etc

● IP – best-effort packet delivery

● Forwarding, fragmentation

● The hard parts are IP and TCP

[Diagram: protocol stack, bottom to top: Network; IP, ICMP; UDP, TCP; Application]


The secrets of µIP

● Shared packet buffer

● Lower throughput

● Event-driven application programming interface


The secrets of µIP part I – A shared packet buffer

● All packets – both outbound and inbound – use the same buffer

● Size of buffer determines throughput

[Diagram: a single packet buffer, shared by the incoming packet and the outbound packet]


The secrets of µIP part I – A shared packet buffer II

● Implicit locking: single-threaded access

1) Grab packet from network – put into buffer

2) Process packet

● Put reply packet in the same buffer

3) Send reply packet into network

[Diagram: packet buffer]
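The three steps above can be sketched as one iteration of a single-threaded main loop. This is an illustrative toy, not µIP's actual code or API: `dev_read`, `dev_send`, `process_in_place` and the one-packet "wire" are hypothetical stand-ins for a device driver and a protocol, and the 128-byte buffer size is an assumption.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 128                 /* hypothetical buffer size */

static unsigned char buf[BUF_SIZE];  /* the single shared packet buffer */
static size_t buf_len;               /* bytes currently valid in buf */

/* Hypothetical stand-in for a network device: a one-packet "wire". */
static unsigned char wire[BUF_SIZE];
static size_t wire_len;

/* 1) Grab a packet from the network and put it into the buffer. */
static size_t dev_read(unsigned char *dst)
{
  size_t n = wire_len;
  memcpy(dst, wire, n);
  wire_len = 0;
  return n;
}

/* 3) Send the reply packet into the network. */
static void dev_send(const unsigned char *src, size_t n)
{
  memcpy(wire, src, n);
  wire_len = n;
}

/* 2) Process: build the reply in place, overwriting the request.
 * The toy "protocol" echoes the packet with ASCII letters upper-cased. */
static size_t process_in_place(unsigned char *p, size_t n)
{
  size_t i;
  for(i = 0; i < n; i++) {
    if(p[i] >= 'a' && p[i] <= 'z') {
      p[i] = (unsigned char)(p[i] - 'a' + 'A');
    }
  }
  return n;
}

/* One iteration of the main loop. Nothing else touches 'buf' while
 * this runs, so no explicit lock is needed. */
static void poll_once(void)
{
  buf_len = dev_read(buf);
  if(buf_len > 0) {
    buf_len = process_in_place(buf, buf_len);
    dev_send(buf, buf_len);
  }
}
```

Because the reply overwrites the request, RAM usage stays at one buffer regardless of traffic, at the cost of handling only one packet at a time.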


The secrets of µIP part II – Throughput

● µIP trades throughput for RAM

● Low RAM usage = low throughput

● Small systems = not that much data!

● Ability to communicate more important than throughput!


The smallest µIP configuration (that I know of)

● CubeSat kit by Pumpkin Inc

● Pico satellite construction kit

● 128 bytes of RAM for µIP


The secrets of µIP part III – Application Programming Interface I

● µIP does not have BSD sockets

● BSD sockets are built on threads

● Threads induce overhead (RAM)

● Instead – event-driven API

● Execution is always initiated by µIP

● Applications are called by µIP, call must return

● Protosockets – BSD socket-like API based on protothreads


The secrets of µIP part III – Application Programming Interface II

void example2_app(void) {
  struct example2_state *s = (struct example2_state *)uip_conn->appstate;

  if(uip_connected()) {
    s->state = WELCOME_SENT;
    uip_send("Welcome!\n", 9);
    return;
  }

  if(uip_acked() && s->state == WELCOME_SENT) {
    s->state = WELCOME_ACKED;
  }

  if(uip_newdata()) {
    uip_send("ok\n", 3);
  }

  if(uip_rexmit()) {
    switch(s->state) {
    case WELCOME_SENT:
      uip_send("Welcome!\n", 9);
      break;
    case WELCOME_ACKED:
      uip_send("ok\n", 3);
      break;
    }
  }
}


The secrets of µIP part III – Application Programming Interface III

● Event-driven API is sometimes problematic

● Not all programs are well-suited to it

● Programs are explicit state machines

● Protosockets: sockets-like API using protothreads

● Extremely lightweight stackless threads

● 2 bytes per-thread state, no stack

● Protothreads allow “blocking” functions, even when called from µIP


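The secrets of µIP part III – Application Programming Interface IV

PT_THREAD(smtp_protothread(void))
{
  PSOCK_BEGIN(s);

  /* Wait for the server greeting and check for a 220 reply. */
  PSOCK_READTO(s, '\n');

  if(strncmp(inputbuffer, "220", 3) != 0) {
    PSOCK_CLOSE(s);
    PSOCK_EXIT(s);
  }

  /* Send "HELO <hostname>\r\n". */
  PSOCK_SEND(s, "HELO ", 5);
  PSOCK_SEND(s, hostname, strlen(hostname));
  PSOCK_SEND(s, "\r\n", 2);

  /* Any 2xx reply is accepted. */
  PSOCK_READTO(s, '\n');

  if(inputbuffer[0] != '2') {
    PSOCK_CLOSE(s);
    PSOCK_EXIT(s);
  }

  /* The slide ends here; the rest of the SMTP dialogue is not shown. */
  PSOCK_END(s);
}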


The secrets of µIP part III – Application Programming Interface V

● API built from the bottom (network) and up

● Protothreads and the protosocket API provide sequential programming

● Less overhead than “real” threads and the “real” socket API


Threads require per-thread stack memory

● Four threads, each with its own stack

[Diagram: four separate stacks, labeled Thread 1 to Thread 4]


Events require one stack

● Four event handlers, one stack

● Stack is reused for every event handler

● For comparison, threads require per-thread stack memory: four threads, each with its own stack

[Diagram: four thread stacks (Thread 1 to Thread 4) next to one stack shared by Event handler 1 to Event handler 4]


Protothreads require one stack

● Four protothreads, one stack

● Just like events

● For comparison, threads require per-thread stack memory: four threads, each with its own stack

● For comparison, events require one stack: four event handlers, one stack

[Diagram: four thread stacks (Thread 1 to Thread 4) next to one stack shared by Protothread 1 to Protothread 4]
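The comparison can be put into numbers with a small sketch. The 200-byte thread stack and the 2 bytes of per-protothread state are figures quoted elsewhere in this talk; the size of the single shared stack and the function names are assumptions made for illustration. The event-driven figure ignores whatever state variables the handlers keep for their state machines.

```c
/* RAM needed for n concurrent activities under each model.
 * Illustrative arithmetic only; all sizes are in bytes. */
static unsigned threads_ram(unsigned n, unsigned stack_bytes)
{
  return n * stack_bytes;                 /* one stack per thread */
}

static unsigned events_ram(unsigned shared_stack_bytes)
{
  return shared_stack_bytes;              /* one stack, reused */
}

static unsigned protothreads_ram(unsigned n, unsigned shared_stack_bytes,
                                 unsigned state_bytes)
{
  /* one shared stack plus a small state record per protothread */
  return shared_stack_bytes + n * state_bytes;
}
```

With four activities, 200-byte stacks and 2 bytes of protothread state, this gives 800 bytes for threads, 200 bytes for events and 208 bytes for protothreads, on a system where the total RAM may be 2048 bytes.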


Six-line implementation

struct pt { unsigned short lc; };
#define PT_INIT(pt)          pt->lc = 0
#define PT_BEGIN(pt)         switch(pt->lc) { case 0:
#define PT_EXIT(pt)          pt->lc = 0; return 2
#define PT_WAIT_UNTIL(pt, c) pt->lc = __LINE__; case __LINE__: \
                             if(!(c)) return 0
#define PT_END(pt)           } pt->lc = 0; return 1

Protothreads implemented using the C switch statement
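To see how the switch statement does the work, it helps to expand the macros by hand for the example protothread shown earlier in the talk. The sketch below is such a hand expansion, not preprocessor output: the constants 10 and 20 stand in for the `__LINE__` values, and `condition1`, `condition2` and `something` are hypothetical global flags.

```c
struct pt { unsigned short lc; };

static int condition1, condition2, something;  /* hypothetical inputs */

/* Hand expansion of the example protothread. Returns 0 while blocked
 * and 1 when the protothread has run to its end. */
static int a_protothread_expanded(struct pt *pt)
{
  switch(pt->lc) { case 0:                 /* PT_BEGIN(pt) */

    pt->lc = 10; case 10:                  /* PT_WAIT_UNTIL(pt, condition1) */
    if(!condition1) return 0;

    if(something) {
      pt->lc = 20; case 20:                /* PT_WAIT_UNTIL(pt, condition2) */
      if(!condition2) return 0;
    }

  } pt->lc = 0; return 1;                  /* PT_END(pt) */
}
```

Each wait records where it is in `pt->lc` and returns; the next call jumps straight to the matching `case` label, even when that label sits inside the `if` block, and re-tests the condition.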


Code footprint

[Bar chart: code size in bytes (axis from 0 to 3000), state machine versions vs protothreads versions]

● Average increase ~200 bytes

● Inconclusive


What’s wrong with using state machines?

● There is nothing wrong with state machines!

● State machines are a powerful tool

● Amenable to formal analysis, proofs

● But: state machines are typically used to control the logical program flow in many event-driven programs

● Like using gotos instead of structured programming

● The state machines are not formally specified

● Must be inferred from reading the code

● These state machines typically look like flow charts anyway

● We’re not the first to see this

● Protothreads: use language constructs for flow control


Why not just use multithreading?

● Multithreading is the basis of (almost) all embedded OS/RTOSes!

● WSN community: Mantis, BTNut (based on multithreading); Contiki (multithreading on a per-application basis)

● Nothing wrong with multithreading

● Multiple stacks require more memory

● Networked = more concurrency than traditional embedded

● Can lead to more expensive hardware

● Preemption

● Threads: explicit locking; Protothreads: implicit locking

● Protothreads are a new point in the design space

● Between event-driven and multithreaded


Implementing protothreads

● Modify the compiler?

● There are many compilers to modify… (IAR, Keil, ICC, Microchip, GCC, …)

● Special preprocessor?

● Requires us to maintain the preprocessor software on all development platforms

● Within the C language?

● The best solution, if language is expressive enough

● Possible?


Are protothreads useful in practice?

● We know that at least thirteen different embedded developers have adopted them

● AVR, PIC, MSP430, ARM, x86

● Portable: no changes when crossing platforms, compilers

● MPEG decoding equipment, real-time systems

● Others have ported protothreads to C++, Objective C

● Probably many more

● From mailing lists, forums, email questions

● Protothreads recommended twice in embedded “guru” Jack Ganssle’s Embedded Muse newsletter