Benchmarking Real-Time and Embedded CORBA ORBs

Objective Interface Systems
13873 Park Center Road, Suite 360, Herndon, VA 20171-3247
703/295-6500 (voice), 703/295-6501 (fax)
http://www.ois.com/  mailto:[email protected]

© 2002 Objective Interface Systems, Inc.
Topics
- Goals
- Important Benchmarking Issues & Infrastructure
- Speed and End-to-End Latency
- Effect of QoS Settings
- Determinism / Predictability
- Throughput
- Footprint
- Scalability
- RT CORBA Priorities and Priority Inversion
Goal of this Presentation
- Examine ORB benchmarking in light of embedded and real-time concerns
- Propose additional measures and relate them to ORB characteristics
- Look at the issues related to measuring certain aspects of an ORB

But:
- Not to compare any specific ORB's performance
- Not to focus on the details of any one kind of benchmark; rather, to survey many
- Not to focus on any one platform's specific issues
- Not to focus on issues related to time measurements on multiple machines
Important Benchmarking Issues
- The general purpose of running a benchmark is to determine the fitness of the ORB:
  - Either pass/fail or ranking of multiple ORBs
  - There are many qualities that can/should be measured
- All benchmarks that are run need to be related to a real-time goal or quality; otherwise, why measure them?
- Once something is measured, if it isn't satisfactory, the next big question is "why is it this way?"
  - Is it the ORB? The hardware? The OS? The TCP stack (or replacement transport)?
  - Isolating the cause is difficult
Benchmarking Infrastructure
- Timers: need for high resolution
  - E.g., on PPC/VxWorks, the available timer resolution is limited to 17 milliseconds
  - Clearly this is of no use for real-time ORBs
- Platform-specific nature: the supplied resolution varies with OS and hardware
- The OIS approach for dealing with this was to write our own high-resolution timer where needed and wrap it inside a portable interface (a minimal sketch follows)
  - YMMV
- Things that help: stop/start, timer overhead, number of trials; calculation of min, max, average, std. deviation; friendly output (e.g., .csv)
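Below is a minimal sketch of such a wrapper, assuming a POSIX platform with clock_gettime(); the class name HiResTimer and its interface are illustrative, not the actual OIS implementation. Timer overhead can be calibrated by timing an empty start()/stop() pair and subtracting it from the results.

```cpp
// Minimal sketch of a portable high-resolution timer wrapper.
// Assumes a POSIX platform with clock_gettime(); the class and its
// interface are illustrative, not the actual OIS implementation.
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <time.h>

class HiResTimer {
public:
    HiResTimer() : count_(0), sum_(0), sum_sq_(0), min_(DBL_MAX), max_(0) {}

    void start() { clock_gettime(CLOCK_MONOTONIC, &t0_); }

    void stop() {
        timespec t1;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double us = (t1.tv_sec - t0_.tv_sec) * 1e6
                  + (t1.tv_nsec - t0_.tv_nsec) * 1e-3;
        ++count_; sum_ += us; sum_sq_ += us * us;
        if (us < min_) min_ = us;
        if (us > max_) max_ = us;
    }

    double average() const { return sum_ / count_; }

    // One CSV row: trials, min, max, average, std. deviation (all µs).
    void report_csv(FILE* out) const {
        double avg = average();
        double sd  = sqrt(sum_sq_ / count_ - avg * avg);
        fprintf(out, "%lu,%.3f,%.3f,%.3f,%.3f\n",
                count_, min_, max_, avg, sd);
    }

private:
    timespec t0_;
    unsigned long count_;
    double sum_, sum_sq_, min_, max_;
};
```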
Speed
- Theoretically, real-time systems are about more than just speed
- Pragmatically, that's the first, most important thing everyone wants to know about
- Multiple dimensions of speed:
  - Simple: end-to-end latency of a two-way call (a single data point); a latency-loop sketch follows
  - Moderate: multiple calls with varying parameter sizes (a two-dimensional set of points)
  - Complex: multiple data sizes and varying data types (a three-dimensional set of data)
  - 4+ dimensions could include: one-way vs. two-way, varying numbers of target objects and operations, etc.
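A sketch of the simple case, reusing the HiResTimer above. The Echo interface, its round_trip() operation, and the OctetSeq type stand in for a hypothetical benchmark IDL; they are illustrative names only:

```cpp
// Sketch of the "simple" benchmark: N two-way calls of one payload
// size. Echo/OctetSeq come from a hypothetical IDL definition:
//   interface Echo { void round_trip(in OctetSeq data); };
void benchmark_round_trip(Echo_ptr echo, CORBA::ULong size,
                          unsigned long trials) {
    OctetSeq data(size);            // sequence with 'size' octets...
    data.length(size);              // ...all marshalled on each call
    HiResTimer timer;
    for (unsigned long i = 0; i < trials; ++i) {
        timer.start();
        echo->round_trip(data);     // blocking two-way invocation
        timer.stop();
    }
    timer.report_csv(stdout);       // trials, min, max, avg, std. dev.
}
```

Sweeping the size parameter produces the two-dimensional "moderate" data set; repeating the sweep for each IDL data type adds the third dimension.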
[Figure: 3D chart of round-trip time (µs, 0–175) vs. parameter size (1–32 bytes) vs. data type (octet, short, long, float)]
Speed [2]
- Need to make it easy for users to measure more than just small-data two-ways
- Minimally, need to measure end-to-end round-trip latency for a variety of data sizes
- Need to ensure that the end-to-end round-trip time can be divided into net and gross times
- Must be easy to measure the corresponding socket times and compute the "net" ORB overhead (a raw-socket baseline sketch follows)
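One way to obtain the net/gross split is to push the same payload through a raw TCP socket and subtract: net ORB overhead = gross ORB round trip minus socket round trip. A POSIX sketch, assuming a socket fd already connected to a trivial echo server and the HiResTimer above (error handling omitted):

```cpp
// Raw-socket baseline for the same payload size:
//   net ORB overhead = ORB round trip - socket round trip.
// Assumes a TCP socket 'fd' already connected to a trivial echo
// server and the HiResTimer above; error handling is omitted.
#include <unistd.h>
#include <vector>

double socket_round_trip_us(int fd, size_t size, unsigned long trials) {
    std::vector<char> buf(size);
    HiResTimer timer;
    for (unsigned long i = 0; i < trials; ++i) {
        timer.start();
        size_t sent = 0;                     // send the payload...
        while (sent < size)
            sent += write(fd, &buf[sent], size - sent);
        size_t got = 0;                      // ...and read the echo back
        while (got < size)
            got += read(fd, &buf[got], size - got);
        timer.stop();
    }
    return timer.average();                  // gross socket time, µs
}
```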
[Figure: bar chart of round-trip time (µs, 0–300) broken into ORB net overhead and TCP stack overhead]
End-to-End Latency for Varying Sized Parameters
[Figure: latency (µs, 0–1800) vs. data size (bytes, 0–14000) for socket latency and ORB latency]
End-to-End Latency for Varying Sized Parameters [2]
[Figure: time (ms, 0–4) vs. data size (bytes, 144–24016) for sockets, ORB #1, and ORB #2]
Effect of QoS Settings on Transport Benchmarks
- Another dimension to most of the benchmarks listed previously
- Goal: determine the ability of the ORB to let the transport's QoS settings be used to affect performance, determinism, etc.
- Involves repeating many of the previous tests using multiple QoS settings
- Example important TCP QoS values (a setsockopt() sketch follows):
  - Send/receive buffer size
  - TCP no delay
- Non-TCP transports will have different QoS settings
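For the raw-socket baseline, these two TCP values map directly onto standard setsockopt() calls, as in the sketch below; how a particular ORB exposes the same settings is vendor-specific.

```cpp
// Applying the two example TCP QoS values to the raw-socket
// baseline; how an ORB exposes these settings is vendor-specific.
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

void apply_tcp_qos(int fd, int buf_bytes, bool no_delay) {
    // Send/receive buffer sizes (the OS may round or clamp them).
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &buf_bytes, sizeof(buf_bytes));
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &buf_bytes, sizeof(buf_bytes));

    // TCP_NODELAY disables Nagle coalescing: lower latency for small
    // messages at the cost of more packets on the wire.
    int flag = no_delay ? 1 : 0;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}
```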
Latency Summary
What we know:
- Round-trip time for a given data size
- General "ORB overhead" information
- Ability to compare net values for multiple ORBs

What we don't know:
- Sources of any inefficiencies: the ORB, the OS, the hardware, the TCP stack, etc.
- How the ORB will react on a different OS or hardware platform
Determinism / Predictability
- Hard real-time systems worry about the maximum value
- Soft real-time systems worry about the average value and the distribution of values
- Look at the distribution of the latency over a number of runs:
  - You will not see a normal (bell-shaped) distribution
  - There is a minimum bound on the latency, but no maximum bound
  - So the distribution is skewed toward the min and trails off exponentially toward the max (a percentile-reporting sketch follows)
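Because the distribution is skewed, percentiles reveal more than the mean: the maximum matters for hard real-time, the body of the distribution for soft real-time. A minimal sketch that summarizes raw latency samples this way:

```cpp
// Summarizing raw latency samples: the skewed, min-bounded shape
// shows up clearly in the percentiles (p50 vs. p99 vs. max).
#include <algorithm>
#include <cstdio>
#include <vector>

void report_distribution(std::vector<double> samples /* µs */) {
    std::sort(samples.begin(), samples.end());
    const size_t n = samples.size();
    std::printf("min,p50,p90,p99,max\n");
    std::printf("%.1f,%.1f,%.1f,%.1f,%.1f\n",
                samples.front(),
                samples[n / 2],
                samples[(n * 90) / 100],
                samples[(n * 99) / 100],
                samples.back());
}
```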
Determinism / Predictability [2]
Multiple possible contributors to the variance of the runs:
- Initial connection vs. existing connection
- Other processing on the computer
- Other traffic on the network
- Sources of non-determinism in the ORB
- Sources of non-determinism in the OS
- Sources of non-determinism in the network stack

How can these be determined and isolated? How can their impact on the ORB's use on the project be assessed? One thing to do is to run many different kinds of tests.
Determinism Comparison: Small Data Two-Ways
[Figure: latency (µs, 0–3000) vs. data size (bytes, 0–200) for x86/NT TCP, PPC/VxWorks TCP over Myrinet, and PPC/VxWorks Myrinet]
Determinism Comparison [2]: Larger Data
[Figure: latency (µs, 0–1600) vs. data size (bytes, 0–45000) for loopback and shared memory]
Determinism Conclusions
- Don't expect predictability on a non-real-time OS
- Don't expect predictability using TCP/IP
- Remember that no system can be predictable with all things changing
- Isolating each variable helps to understand effects
Throughput
- Inverse of latency (mostly)
- Goal: this test is important to embedded and real-time systems that transfer large amounts of data
  - No sense having a big pipe between boxes if you can't make use of most or all of it
- Significant differentiator between ORBs
  - It is easy for an ORB to send small amounts of data very fast; things like copies, seize/release, system calls, etc. don't amount to much
  - But when large amounts of data are involved, the impact of many small mistakes becomes visible, and throughput suffers (a throughput sketch follows)
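Throughput is just total payload moved divided by wall-clock time. A sketch reusing the hypothetical Echo interface and the HiResTimer from earlier; a real test would also vary the element type (e.g., the sequence of doubles in the chart below):

```cpp
// Throughput sketch: move 'total_bytes' through the ORB in
// 'chunk'-byte two-way calls and report megabits per second.
// Reuses the hypothetical Echo interface and HiResTimer above.
double throughput_mbps(Echo_ptr echo, CORBA::ULong chunk,
                       unsigned long long total_bytes) {
    OctetSeq data(chunk);
    data.length(chunk);
    HiResTimer timer;
    timer.start();
    unsigned long long moved = 0;
    while (moved < total_bytes) {
        echo->round_trip(data);             // payload marshalled and
        moved += chunk;                     // unmarshalled on each call
    }
    timer.stop();
    double secs = timer.average() * 1e-6;   // one trial: average == elapsed
    return (moved * 8.0) / secs / 1e6;      // bits per second -> Mbps
}
```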
Throughput [2]
[Figure: throughput (Mbps, 0–80) vs. data size (megabits, 0–16) for a double sequence]
Footprint: Static Baseline
- Not important to everyone, but critical to some
- Goal: to determine:
  - How much memory the ORB leaves for the application when put on the embedded board
  - That the ORB won't outgrow the board under certain circumstances
- Need to examine the in-memory footprint of:
  - The basic ORB library (may involve multiple configurations)
  - A simple client or server
  - A realistic application client or server
- Need to separate out ORB overhead from application footprint
- Need to isolate the influence of:
  - ORB vs. application vs. OS & network stack
  - Processor architecture & compiler (and its switches)
Scalability of the ORB [1]
- Goal: examine how the ORB changes (e.g., memory utilization or latency) under multiple circumstances
- Two major variables:
  - Things that one changes in the server:
    - Number of objects/implementations
    - Number of POAs
    - Number of "worker" threads
  - Things that one changes in the client:
    - Number of connections to each server and in total
    - Number of threads in each client
    - Distribution of client processes to same/different machines
    - Number of connections and priority bands
- These interact with each other: changes in client behavior produce effects in the server (e.g., footprint) and vice versa
Scalability of the ORB [2]
- Need a realistic usage scenario to measure against
- Each footprint test requires a metric that measures:
  - The memory size at multiple points, deriving the incremental change
  - The rate of change per unit of growth, e.g., bytes/connection (a sketch follows)
- Also look at latency tests to determine the change in latency as the application scales
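A sketch of the incremental metric: sample a memory figure at several connection counts and derive bytes per connection. How memory is sampled varies by platform (heap statistics on an RTOS, /proc on a Unix-like host), so memory_in_use() and the harness hook open_connections() are hypothetical stubs:

```cpp
// Incremental-footprint metric: sample memory at several connection
// counts and derive the growth rate in bytes per connection.
#include <cstddef>
#include <cstdio>

extern std::size_t memory_in_use();      // platform-specific stub
extern void open_connections(int n);     // test-harness stub

void footprint_per_connection(int step, int steps) {
    std::size_t prev = memory_in_use();
    for (int i = 1; i <= steps; ++i) {
        open_connections(step);           // add 'step' more connections
        std::size_t now = memory_in_use();
        std::printf("%d,%zu,%.1f\n",      // connections, bytes, bytes/conn
                    i * step, now,
                    double(now - prev) / step);
        prev = now;
    }
}
```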
RT CORBA vs. non RT CORBA
- Real-time predictability vs. raw performance: predictability has a price
  - Additional latency: managing thread priorities in the OS
  - Additional memory: connections for priority bands
- Bounded behavior vs. consistent behavior
  - Bounded behavior:
    - Can analyze the worst case
    - Changes in lower-priority activities do not cause higher-priority activities to violate their bounds
  - Consistent behavior:
    - Execution results in the same behavior each time
    - More stringent than bounded behavior
    - The same activity executes in the same time, assuming that other activities do not compete with it
RT CORBA Priority Transmission
Measuring priority inversion:
- End-to-end correctness depends on there being no priority inversions anywhere in the chain:
  - Application
  - OS
  - ORB
  - TCP/IP stack
- If any of the pieces does not work correctly, the result will contain inversions
- Determining which piece is at fault may be nontrivial and may require testing multiple times in multiple environments
- Graph multiple threads at different priorities as a "horserace" chart (a logging sketch follows the diagram below)
[Diagram: end-to-end predictability chain, Client → ORB → Network → ORB → Server. Annotations: client and server application predictability; client and server priority; preemption at each stage; predictable memory management on both sides; predictable invocation; predictable, bounded algorithms; priority propagation over a predictable transport.]
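The raw data for the horserace chart can be produced by having each client thread log a (timestamp, priority) row around every invocation. Below is a sketch using plain POSIX threads; the mapping of these priorities to RTCORBA::Priority values and the actual CORBA invocation are left as hypothetical stubs:

```cpp
// Produces raw data for a horserace chart: each thread logs one
// (timestamp, priority) row per iteration. Plotting time on X and
// priority on Y shows which priority is "running" when; a lower lane
// making progress while a higher one is ready is a priority inversion.
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

struct Lane { int priority; FILE* log; };

void* lane_body(void* arg) {
    Lane* l = static_cast<Lane*>(arg);
    for (int i = 0; i < 1000; ++i) {
        // invoke_two_way(l);             // hypothetical CORBA call
        timespec t;
        clock_gettime(CLOCK_MONOTONIC, &t);
        fprintf(l->log, "%.6f,%d\n",
                t.tv_sec + t.tv_nsec * 1e-9, l->priority);
    }
    return 0;
}

pthread_t start_lane(Lane* l) {
    pthread_t tid;
    pthread_create(&tid, 0, lane_body, l);
    sched_param sp;
    sp.sched_priority = l->priority;      // real-time priority; needs
    pthread_setschedparam(tid, SCHED_FIFO, &sp);  // privileges on most OSes
    // (A real harness would set scheduling attributes before the
    // thread starts, via pthread_attr_t, to avoid a startup race.)
    return tid;
}
```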
Multiple Thread Priority Test [1]
Good ORB on a Bad (Non RT) OS
[Figure: horserace chart of thread priorities (0–10) vs. time]
Multiple Thread Priority Test [2]
Good ORB on a Good (RT) OS
[Figure: horserace chart of thread priorities (0–10) vs. time]
Conclusions
- Examined real-time and embedded issues related to latency, determinism, plug-in transports, throughput, scalability, priority propagation, and priority inversion
- Illustrated some practical techniques for benchmarking real-time and embedded ORBs
- Inspired by:
  - The ORBexpress product development team
  - The benchmarking work done by Boeing's Phantom Works group in their DII COE-sponsored RT CORBA benchmarking studies
  - The internal benchmarking work done by Gautam Thaker of Lockheed Martin, and mentoring work done in conjunction with our customers
- Need to continue to evolve additional real-time and embedded CORBA-specific benchmarks
Discuss at http://www.realtime-corba.com/