
Avionics Europe 2013

Zoë Stephenson

[email protected]

Timing Tool Test Effectiveness for WCET Analysis Tools

Confidence in software tools rests on the effectiveness of tool verification – essentially, asking the right questions. To determine the right questions for WCET tools, this presentation includes our WCET tool test effectiveness framework and explains how it influences our tool testing.

Overview

Motivation – WCET analysis in context

Obtaining confidence in tools

How timing tools are used

How we evaluated our own testing efforts

What the evaluation means for RapiTime

The presentation is split into five distinct sections.

MOTIVATION

Confidence

How do we show that we have effective tests of a timing tool?

Context: general SW development

Diagram: Timing Requirement, Software and Timing Evidence, connected by develop, test, evaluate and improve activities.

The V model: requirement, software, test result

Context: timing requirements

Diagram: the same develop/test/evaluate/improve loop, now with a Timing Analysis Tool + Method supporting the test activity.

This is where we introduce a timing analysis tool and method to check that the software meets those requirements.

Context: timing requirements

Diagram: Tool Operational Requirements, Tool and Tool Verification Results, connected by develop, test, evaluate and improve.

Now we run tests to ensure the tool is operating correctly (according to its TOR).

Context: tool test effectiveness

Diagram: Criteria for Test Effectiveness and Test Effectiveness, connected by develop, review, evaluate and improve.

Now we review whether the test suite for the tool has sufficiently exercised the tool.

How RVS aids testing

RVS

RapiTime

collect data on-target during execution

transmit data to host computer

combine with program static analysis: beyond end-to-end tests

report test coverage and potential untested worst-case behaviours

direct tool user to define more comprehensive tests
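As a rough illustration of the on-target data collection and structural attribution described above, the following minimal C sketch records an (ID, timestamp) pair at each instrumentation point. The names ipoint, trace_buf and read_cycle_counter are assumptions made for this example, not RapiTime's actual API.

    /* Sketch only: lightweight instrumentation of the kind a
     * measurement-based timing tool relies on. */
    #include <stdint.h>

    #define TRACE_CAPACITY 4096u

    typedef struct {
        uint16_t id;      /* static ID of the instrumentation point   */
        uint32_t cycles;  /* cycle counter value when the point fired */
    } trace_entry;

    static trace_entry trace_buf[TRACE_CAPACITY];
    static volatile uint32_t trace_len;

    /* Assumed to read a free-running hardware cycle counter. */
    extern uint32_t read_cycle_counter(void);

    static inline void ipoint(uint16_t id)
    {
        if (trace_len < TRACE_CAPACITY) {
            trace_buf[trace_len].id = id;
            trace_buf[trace_len].cycles = read_cycle_counter();
            trace_len++;
        }
    }

    /* Instrumented code under test: one point per basic block, so that
     * host-side analysis can attribute time to program structure rather
     * than only to end-to-end runs. */
    int saturating_sum(const int *data, int n)
    {
        int sum = 0;
        ipoint(1);                     /* function entry    */
        for (int i = 0; i < n; i++) {
            ipoint(2);                 /* loop body         */
            sum += data[i];
            if (sum > 1000) {
                ipoint(3);             /* saturation branch */
                sum = 1000;
            }
        }
        ipoint(4);                     /* function exit     */
        return sum;
    }

The trace buffer would then be transmitted to the host and combined with static analysis of the program structure.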

CONFIDENCE

DO-178B says…

The objective of the tool qualification process is to ensure that the tool provides confidence at least equivalent to that of the process(es) eliminated, reduced or automated.

Diagram: Tool Development and Tool Testing.

Qualification context

Diagram: the qualification context, involving the Tool Qualification Plan, Tool Accomplishment Summary, Tool Use Cases, Tool User and Tool Vendor.

The background and what we are concerned with for effectiveness of the test of RapiTime (circled in green in the diagram).

Tool testing: effective

Diagram: a correct tool goes through tool testing and is accepted; an incorrect tool goes through tool testing and is rejected.

A tool test is effective if it can distinguish a tool that meets the requirements from a tool that does not.
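A small sketch of what such a distinguishing check might look like in practice: run the tool on a reference program whose true WCET is known by construction, and reject the tool if its reported bound is unsafe or wildly pessimistic. The run_timing_tool wrapper and the pessimism threshold are assumptions for illustration, not part of any real qualification kit.

    #include <stdbool.h>
    #include <stdio.h>

    /* Assumed wrapper: returns the WCET bound (in cycles) that the tool
     * under test reports for the given reference program. */
    extern unsigned long run_timing_tool(const char *program);

    bool tool_accepted(const char *program, unsigned long true_wcet)
    {
        unsigned long reported = run_timing_tool(program);

        bool safe  = reported >= true_wcet;      /* must not under-estimate      */
        bool tight = reported <= 2 * true_wcet;  /* illustrative pessimism limit */

        printf("%s: reported %lu, true %lu -> %s\n",
               program, reported, true_wcet,
               (safe && tight) ? "accept" : "reject");
        return safe && tight;
    }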

Tool testing: representative

Diagram: the same tool is accepted both by tool testing and in real-world use.

If a test is representative, you can infer real-world correctness from correctness during the test.

TOOL USAGE

Timing tool usage

Diagram: the Timing Analysis Tool + Method is tested to produce Tool Verification Results; the Tool Test / Analysis Suite consists of test procedures and test input programs, and the testing involves the tool, the target and the user.

Adding detail to the model of the tool and its usage highlights the factors that are considered in the assessment.

ASSESSMENT

Assessment approach

Diagram: a Usage Model feeds a custom SHARD / HAZOP process, which identifies Undesired Outcomes together with their Causes and Mitigations; requirements derivation then produces Conditions of Use and changes (Δ) to test plans, which adjust the testing.

Guideword Application

No Test not present / not done

More Over-constrained analysis, cases missed

Less Shallow test, cases missed

Part of Incomplete test, not whole programs

As well as N/A

Reverse N/A

Error Test claims tool works, but it does not

Applicability of general guidewords for test effectiveness:

(and similarly for other artefacts and flows)
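One way to picture the guideword application is as a checklist generator: each guideword and its test-effectiveness interpretation is applied to each artefact under review. The artefact list and all names in this sketch are hypothetical, purely to make the procedure concrete.

    #include <stddef.h>
    #include <stdio.h>

    struct guideword {
        const char *word;
        const char *interpretation;
    };

    static const struct guideword GUIDEWORDS[] = {
        { "No",      "Test not present / not done" },
        { "More",    "Over-constrained analysis, cases missed" },
        { "Less",    "Shallow test, cases missed" },
        { "Part of", "Incomplete test, not whole programs" },
        { "Error",   "Test claims tool works, but it does not" },
    };

    static const char *ARTEFACTS[] = {
        "test procedures",
        "test input programs",
        "tool outputs",
    };

    int main(void)
    {
        /* Emit one review prompt per (artefact, guideword) pair. */
        for (size_t a = 0; a < sizeof ARTEFACTS / sizeof ARTEFACTS[0]; a++) {
            for (size_t g = 0; g < sizeof GUIDEWORDS / sizeof GUIDEWORDS[0]; g++) {
                printf("[%s] %s: %s?\n", ARTEFACTS[a],
                       GUIDEWORDS[g].word, GUIDEWORDS[g].interpretation);
            }
        }
        return 0;
    }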

Top-level analysis

Provide both procedures and review criteria for test selection and customisation.

Test procedure review criteria: Depth (from “less”), Generalisability (from “more”), Completeness (from “part of”).

General tool derived requirements, for example: if the main tool calls further tools, propagate back the error return code.
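The derived requirement about error return codes is easy to illustrate. A minimal sketch, assuming a POSIX-style environment and a hypothetical sub-tool name:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    /* Run a further tool and hand its exit status back instead of
     * swallowing it. */
    static int run_subtool(const char *command)
    {
        int status = system(command);      /* launch the further tool  */
        if (status == -1) {
            perror("system");
            return EXIT_FAILURE;           /* could not launch at all  */
        }
        if (WIFEXITED(status)) {
            return WEXITSTATUS(status);    /* propagate its exit code  */
        }
        return EXIT_FAILURE;               /* terminated by a signal   */
    }

    int main(void)
    {
        int rc = run_subtool("subtool --analyse input.bin");
        if (rc != 0) {
            fprintf(stderr, "sub-tool failed with code %d\n", rc);
        }
        return rc;                         /* main tool reports the same code */
    }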

Timing tool analysis

Structural diversity, with expected execution times:

threads, tasks, schedules…
subprograms: direct, indirect…
blocks, entries, exits…
selection, loop, nesting…
compound statement, exits…
expressions, decisions…
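As a sketch of what a structurally diverse test input program might look like (this example and its names are illustrative, not taken from the actual test suite), the following C fragment combines direct and indirect calls, selection, a bounded loop, nesting and multiple exits, so each feature has a predictable execution count:

    #include <stdint.h>

    typedef int (*op_fn)(int);

    static int twice(int x)  { return 2 * x; }
    static int square(int x) { return x * x; }

    /* Selection with multiple exits. */
    static int clamp(int x)
    {
        if (x > 100) {
            return 100;        /* exit 1 */
        }
        if (x < 0) {
            return 0;          /* exit 2 */
        }
        return x;              /* exit 3 */
    }

    /* Bounded loop with nesting, a decision and an indirect call. */
    int structural_case(const int *in, int n, int select)
    {
        op_fn op = (select != 0) ? twice : square;  /* indirect call target */
        int acc = 0;

        for (int i = 0; i < n && i < 8; i++) {      /* loop bound known: 8  */
            int v = clamp(in[i]);                   /* direct call          */
            if (v % 2 == 0) {                       /* decision             */
                acc += op(v);                       /* indirect call        */
            } else {
                acc -= v;
            }
        }
        return acc;
    }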

Timing tool analysis

Execution time diversity: bus interaction with peripherals, branch prediction, denormalized numbers, representing the deployed system.

Testing must be applicable across hardware features that lead to variations in execution time at different scales.
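One concrete source of such variation, offered here as an assumed example rather than material from the slides: on many FPUs, arithmetic on denormalized (subnormal) numbers is much slower than on normal numbers, so identical code can take very different times depending only on its data.

    #include <stdio.h>

    static double accumulate(double seed, int iters)
    {
        double x = seed;
        for (int i = 0; i < iters; i++) {
            x *= 0.5;               /* same instruction mix in both runs */
            x += seed * 1e-3;
        }
        return x;
    }

    int main(void)
    {
        /* Normal-range inputs. */
        volatile double fast = accumulate(1.0, 1000000);

        /* Inputs driven into the subnormal range (around 1e-310); on
         * hardware without flush-to-zero this run is typically much
         * slower, even though the code path is identical. */
        volatile double slow = accumulate(1e-310, 1000000);

        printf("%g %g\n", fast, slow);
        return 0;
    }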

RapiTime analysis

Creating software with a known execution time: independent time source, other calibrated delay.

Ensuring that on-target measurement is representative: target measurement library testing, target measurement for multicore, target measurement for ARINC 653.

Helping the user to manage the execution time analysis: traceability by configuration ID, workflow to validate annotations.
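As a sketch of "creating software with a known execution time" using a calibrated delay checked against an independent time source (the timer function and calibration approach here are assumptions, not RapiTime's verification kit):

    #include <stdint.h>

    /* Assumed free-running hardware timer, independent of the tool. */
    extern uint32_t read_independent_timer(void);

    /* Delay loop with a data-independent iteration count: once the
     * cycles per iteration have been measured on the target, the
     * expected execution time of calibrated_delay(n) is known. */
    void calibrated_delay(uint32_t iterations)
    {
        for (volatile uint32_t i = 0; i < iterations; i++) {
            /* empty body; volatile stops the compiler removing the loop */
        }
    }

    /* Cross-check the tool's result against the independent timer. */
    uint32_t measure_delay(uint32_t iterations)
    {
        uint32_t start = read_independent_timer();
        calibrated_delay(iterations);
        return read_independent_timer() - start;
    }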

RAPITIME IMPACT

Strengthened confidence

RapiTime works on large programs, peculiar code structures, a variety of OS styles

RapiTime works with a large range of data collection and extraction mechanisms

RapiTime provides comprehensive traceability mechanisms for observed measurements and computed execution times

Improving tool offering

New integration possibilities for multicore and time-partitioned systems

More comprehensive assessment advice for different target hardware and measurement capabilities

More flexible verification kits for on-site tool qualification

New features in RVS 3.1

Graphical report comparisons help to show where a test that is effective in the lab falls short on site

Wider range of path highlighting facilities show WCET path deviations at a glance

Command-line data export to CSV, XML and text formats helps to trace between tool assessment and individual tests

Summary

Motivation - what do we want to test?

Confidence - how do I assess tools?

Tool usage - how do I use timing tools?

Evaluation - how do we evaluate our efforts in testing RapiTime?

Impact - how has the evaluation affected RapiTime?