

Testing Robustness of UAS Safety Technology (TRUST)

Team: • John Sauter, PI, PM • Marc Huber, Co-PI • Mike Quist, Technical Lead • Eric Tucker, Software Engineer • Patrick Theisen, Software Engineer

DISTRIBUTION A. Approved for public release; distribution is unlimited. Case Number: 88ABW-2014-2554

Autonomy Challenges

• Need: ensure safe operation of autonomous systems
• SBIR Objective:
  • “Develop an innovative verification tool to assess the robustness of run time safety systems bounding autonomous and learning algorithms for operation in untrained/unknown environments.”
• TRUST Approach:
  • An intelligent test agent embeds safety testing domain knowledge with optimal search strategies to rapidly assess the robustness of Unmanned Autonomous Systems (UAS), providing key measures of system safety


Safety Assurance Approaches: Formal Analysis

• If we can formally analyze the system, we can guarantee performance under certain conditions
  • Hybrid automata have been used to create verified controllers for dynamical systems.
• Restricted by the size and complexity of the controller that can be formally verified
  • This limit is rising as methods improve and compute power increases
• Suitable for basic autonomy aids
  • E.g. lane departure warnings, adaptive cruise control, waypoint control, etc.
• A complex learning controller is beyond the state of the art (SOA) in formal analysis.


Run Time Assurance Architectures

• If we can’t formally verify an Autonomous Controller (AC), we can bound its behavior with a verified Safety Controller (SC)
• Run Time Assurance (RTA) architecture: the AC and SC operate in parallel
  • The AC is normally in control of the plant
  • A Decision Module (DM) monitors the state of the plant and environment
  • If a critical state is reached, the SC is engaged in time to bring the system back into a known safe state
• How do you verify a system built with this architecture?

[Figure: RTA Architecture. Blocks: Autonomous Controller (AC), Safety Controller (SC), Decision Module (DM), Plant]
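The switching behavior of an RTA architecture can be sketched in a few lines. This is a minimal illustration and not the TRUST implementation: the class name, the command strings, and the state keys `time_to_violation` and `time_to_recover` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]  # hypothetical plant/environment state


@dataclass
class RunTimeAssurance:
    """AC and SC run in parallel; the Decision Module picks whose command reaches the plant."""
    autonomous_controller: Callable[[State], str]
    safety_controller: Callable[[State], str]
    is_critical: Callable[[State], bool]  # Decision Module test (assumed form)

    def command(self, state: State) -> str:
        # The AC is normally in control; the SC takes over in a critical state.
        if self.is_critical(state):
            return self.safety_controller(state)
        return self.autonomous_controller(state)


# Hypothetical wiring: engage the SC while there is still time to recover.
rta = RunTimeAssurance(
    autonomous_controller=lambda s: "fly_straight_and_level",
    safety_controller=lambda s: "ascend_and_hover",
    is_critical=lambda s: s["time_to_violation"] <= s["time_to_recover"],
)
```

Verifying such a system means checking that the Decision Module always hands over early enough, which is the question the rest of the briefing addresses through testing.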


Autonomy Testing Challenges

• High-Dimensional, Massive State Space: cannot adequately cover the entire testing state space to guarantee performance and safety for all missions, environments, and conditions.
  • See also the “curse of dimensionality”
• Expensive Evaluation: testing requires live or high-fidelity simulation of behavior, which takes time
  • Compounded by the large sample sizes required to measure probabilistic bounds on performance for non-deterministic systems.
• Changing Behaviors: the UAS tested in the lab is different from the UAS in the field.


TRUST Approach

• Knowledge-based State Space Reduction Techniques (KB-SSR): use formal methods and domain knowledge to
  • Reduce the number of dimensions
  • Reduce the size of the state space using knowledge of collision geometries and RTA SC boundaries
• Intelligent Test Search Agent
  • Uses domain and statistical knowledge to eliminate unnecessary tests and accomplish multiple objectives in a single test
  • Uses optimization and abstract models to rapidly find and explore the failure boundary of the autonomous system
• Tests that Adapt as the AC Learns, posing more complex problems
  • Detects performance changes and knows when re-testing is required


Proof of Concept Demonstration

• Purpose: demonstrate the intelligent search concept of TRUST
  • Show that we can sufficiently sample the test space
• Reference: Particle Swarm Optimization (PSO) algorithm
  • Often used for high-dimensional, real-valued problems
  • 500 particles with a neighborhood of 3 to ensure good exploration over the surface
• Development: a custom particle-based search algorithm
  • Initial sampling is random (no guidance)
  • Interpolation and extrapolation from gradients to estimate surface location
  • Balances exploration and exploitation
    • Global exploration if not near the surface (|robustness| >> 0)
    • Local exploitation to improve confidence in the surface estimate and gradients
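One way to picture the interpolation step and the exploration/exploitation switch is the sketch below. It is an assumption-laden illustration, not the custom algorithm itself: it uses simple linear interpolation between two samples whose robustness values have opposite signs, and the function names and the `near_surface` threshold are hypothetical.

```python
def estimate_surface_point(p_a, r_a, p_b, r_b):
    """Linearly interpolate the zero-robustness crossing between two samples.

    p_a, p_b are points (sequences of floats); r_a, r_b are their robustness values.
    """
    if r_a * r_b > 0:
        raise ValueError("samples must lie on opposite sides of the surface")
    if r_a == r_b:  # both samples already have zero robustness
        return list(p_a)
    t = r_a / (r_a - r_b)  # fraction of the way from p_a to p_b
    return [a + t * (b - a) for a, b in zip(p_a, p_b)]


def choose_mode(robustness, near_surface=1.0):
    """Explore globally when far from the surface, exploit locally when close.

    near_surface is a hypothetical threshold on |robustness|.
    """
    return "exploit" if abs(robustness) <= near_surface else "explore"
```

For example, samples at `[0, 0]` and `[4, 0]` with robustness `2.0` and `-2.0` give an estimated surface point halfway between them.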


Proof of Concept Architecture

[Figure: Proof of Concept architecture (Scilab/Java). Labeled elements: Test User (test type, bounded safety test regions, failure conditions, safety boundaries); Intelligent Search Agent (optimize robustness testing, search state space, robustness statistics); test setup and test results; autonomy controller (straight, level flight); safety controller (ascend and hover); Decision Module (time to engage); plant model (2D kinematic); emulated sensor data; plant control commands]


POC Description

• The POC demonstration implements a simplified Run Time Assurance architecture of a rotorcraft with
  • An “Autonomy” Controller that just maintains straight, level flight
  • A Safety Controller (SC) that executes a single ascend-and-hover safety maneuver
  • A Decision Module that determines when to engage the SC based on time to decelerate and closing speed of the intruder
• The SC is designed to keep the intruder from penetrating the NMAC cylinder
  • Unless assumptions of the SC are violated, such as the accuracy of sensors

[Figure: Mid Air Collision Geometry. Shows the Near Mid Air Collision (NMAC) cylinder with dimensions 500’ and 100’, and the labels I, T, E, R, S, xI, yI, ωI, βI, βT]
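A Decision Module rule based on time to decelerate and closing speed might look like the following sketch. The slides do not give the actual rule, so the comparison used here, the parameter names, the units, and the `margin_s` allowance are all assumptions.

```python
def should_engage_sc(range_to_nmac_ft, closing_speed_fps, time_to_decelerate_s, margin_s=0.0):
    """Engage the Safety Controller when the time until the intruder reaches the
    NMAC boundary no longer exceeds the time the vehicle needs to decelerate.

    All parameters are hypothetical; margin_s is an optional safety allowance.
    """
    if closing_speed_fps <= 0:
        return False  # the intruder is not closing, nothing to avoid
    time_to_nmac_s = range_to_nmac_ft / closing_speed_fps
    return time_to_nmac_s <= time_to_decelerate_s + margin_s
```

A rule like this depends on range and closing speed estimates, which is why sensor accuracy appears above as an assumption whose violation can defeat the SC.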


Experimental Setup

• Robustness Metric:
  • Distance to NMAC at closest point of approach to test vehicle
  • Zero robustness defines the failure surface we are looking for
• Five NMAC Test Dimensions:
  • Test vehicle speed
  • Intruder velocity (relative to test vehicle, 2 coords)
  • Intruder position (relative to test vehicle, 2 coords)
• Two experiments using 3D submanifolds:
  • (1) Fixed bearings (βI = 45° and βT = 0°)
    • Varying test vehicle speed, intruder speed, and intruder distance
  • (2) Fixed intruder and test vehicle speeds
    • Varying intruder position (2 coords) and bearing from Intruder to Test βT
• One experiment searching over the entire 5D space
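To make the robustness metric concrete, the sketch below computes a signed distance to the NMAC cylinder at the closest point of approach for a purely horizontal, constant-velocity encounter. This is a simplification for illustration: in the POC the metric comes from simulating the full RTA system, and the 500 ft radius (taken from the geometry figure), the function name, and the straight-line motion are assumptions here.

```python
import math

NMAC_RADIUS_FT = 500.0  # assumed horizontal radius of the NMAC cylinder


def robustness_at_cpa(rel_pos, rel_vel, radius=NMAC_RADIUS_FT):
    """Signed horizontal distance to the NMAC cylinder at closest point of approach.

    rel_pos, rel_vel: intruder position (ft) and velocity (ft/s) relative to the
    test vehicle. Positive means a miss, negative a penetration, zero the failure surface.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    # Time of closest approach, clamped so we never look into the past.
    t_cpa = 0.0 if speed_sq == 0 else max(0.0, -(px * vx + py * vy) / speed_sq)
    miss_distance = math.hypot(px + vx * t_cpa, py + vy * t_cpa)
    return miss_distance - radius
```

With the intruder 3000 ft ahead, offset 800 ft laterally and closing at 100 ft/s, the miss distance is 800 ft and the robustness is 300 ft; reducing the offset to 500 ft puts the encounter on the failure surface.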


Adding to the Challenge

• We did not want the simplicity of the chosen RTA to reduce the value gained from the experiments
  • So we made the problem harder for the search agent
• We did not use any space reduction or domain knowledge techniques
  • No knowledge of the RTA SC boundary was employed
  • No knowledge of the physics of an encounter was used to set up tests
• We increased the ranges on the experimental variables
• Result: a much larger search space, stressing the capabilities of the search agent.


Search Quality Metrics

• Search goal: maximize the number of samples taken on the failure surface (precision) and maximize the coverage over that surface.
  • A secondary measure for the future is to maximize the evenness of the coverage (i.e. spatial entropy).
• For each problem, we formed a reference set by
  • Uniformly sampling the whole space
  • Selecting samples with robustness values near zero (the failure surface)
• Metric: percentage of search samples on the failure surface
  • We compute the number of search points with robustness values near zero divided by the total number of search points
• Metric: percentage coverage of the failure surface
  • We assume the reference set is a coverage set (though density will vary with the number of samples and the shape of the surface)
  • We determine a threshold such that >99% of the test points are within that threshold of a reference point.
  • We calculate the percentage of reference points that are within that threshold of a test point
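The two metrics above can be sketched as follows. This is an illustrative reading of the bullets rather than the code used in the experiments: the function names, the `tol` parameter for "near zero", and the way the 99% threshold is picked (a simple order statistic over nearest-reference distances) are assumptions.

```python
import math


def precision(robustness_values, tol):
    """Fraction of search samples whose robustness is near zero (on the surface)."""
    on_surface = sum(1 for r in robustness_values if abs(r) <= tol)
    return on_surface / len(robustness_values)


def coverage_threshold(test_points, reference_points, fraction=0.99):
    """Distance within which at least `fraction` of test points find a reference point."""
    nearest = sorted(
        min(math.dist(t, ref) for ref in reference_points) for t in test_points
    )
    index = min(len(nearest) - 1, math.ceil(fraction * len(nearest)) - 1)
    return nearest[index]


def coverage(reference_points, test_points, threshold):
    """Fraction of reference points that have a test point within `threshold`."""
    covered = sum(
        1
        for ref in reference_points
        if any(math.dist(ref, t) <= threshold for t in test_points)
    )
    return covered / len(reference_points)
```

The brute-force nearest-neighbor loops are quadratic in the number of points, so for the sample counts reported later a spatial index would be the more practical choice.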


Uniform Sampling to Establish Surface


Reference Set

• All the uniformly sampled points that are on the surface define the reference set

[Figure: reference point and tolerance]
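Building a reference set by uniform sampling can be sketched as below. The toy robustness function (signed distance to a unit circle) merely stands in for a simulated encounter, and the function names, bounds format, and `tol` parameter are hypothetical.

```python
import math
import random


def build_reference_set(robustness, bounds, n_samples, tol, seed=0):
    """Uniformly sample the box `bounds` and keep points with |robustness| <= tol.

    bounds: list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)
    reference = []
    for _ in range(n_samples):
        point = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        if abs(robustness(point)) <= tol:
            reference.append(point)
    return reference


def toy_robustness(point):
    """Toy stand-in: signed distance from a 2D point to the unit circle."""
    x, y = point
    return math.hypot(x, y) - 1.0


reference_set = build_reference_set(
    toy_robustness, bounds=[(-2.0, 2.0), (-2.0, 2.0)], n_samples=5000, tol=0.05
)
```

Only a small fraction of uniform samples lands near the surface, which is consistent with the low "% on surface" figures reported for the uniform datasets in the experiments.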


Search Quality Assessment

• High precision (15/18 on surface: robustness near zero)
• Low coverage (13/30 reference points covered)

[Figure legend: test point on surface; test point not on surface; reference point near a test point; reference point not near a test point]


Search Quality Assessment

• Low precision (56/93 on surface: robustness near zero)
• High coverage (29/30 reference points covered)

[Figure legend: test point on surface; test point not on surface; reference point near a test point; reference point not near a test point]


Experiments

• We can also visualize results in up to 5-d
  • For the 5-d results, we look at a 3-d “slice”, with the ranges of the remaining two coordinates fixed by a kind of mini-map
  • Example for a 5D toy problem (a sphere selected from a toroid)
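Selecting a 3-d slice from 5-d results amounts to filtering on the two fixed coordinates and keeping the other three. The sketch below shows one way to do that; the function name and the `fixed_ranges` mapping are hypothetical and not part of the TRUST tooling.

```python
def slice_3d(points, fixed_ranges):
    """Project 5-d points onto three axes, keeping only points whose other two
    coordinates fall inside the given ranges.

    fixed_ranges: {coordinate_index: (low, high)} for the two fixed coordinates.
    """
    kept_axes = [i for i in range(5) if i not in fixed_ranges]
    return [
        tuple(p[i] for i in kept_axes)
        for p in points
        if all(lo <= p[i] <= hi for i, (lo, hi) in fixed_ranges.items())
    ]


points_5d = [(1.0, 2.0, 3.0, 0.1, 0.2), (4.0, 5.0, 6.0, 0.9, 0.2)]
visible = slice_3d(points_5d, {3: (0.0, 0.5), 4: (0.0, 0.5)})
```

Here only the first point survives the filter, because the fourth coordinate of the second point lies outside the selected range.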


Experiment – Fixed Speeds (no noise)

| Dataset | # pts | # on surface | % on surface | % coverage |
|---|---|---|---|---|
| 500K uniform | 500000 | 10973 (ref) | 2.2% | 100% |
| 25K PSO | 24999 | 11746 (test) | 47% | 49% |
| 25K custom | 22646 | 10236 (test) | 45% | 39% |

Varying intruder position (2 coords) and bearing from Intruder to Test βT


[Figure: Experiment, Fixed Speeds (no noise). Panels: Reference Set, Custom Search, PSO Search]


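Experiment – Fixed Angles (45° and 0°)

| Dataset | # pts | # on surface | % on surface | % coverage |
|---|---|---|---|---|
| 500K uniform | 500000 | 10973 | 2.2% | 100% |
| 25K PSO | 24999 | 7701 | 31% | 31% |
| 25K custom | 18823 | 13563 | 72% | 23% |

Varying test vehicle speed, intruder speed, and intruder distance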


[Figure: Experiment, Fixed Angles (45° head-on). Panels: Reference Set, Custom Search, PSO Search]


Experiment – Full 5D


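| Dataset | # pts | # on surface | % on surface | % coverage |
|---|---|---|---|---|
| ~2M uniform | 1,889,568 | 15717 | 0.83% | 100% |
| 25K PSO | 24999 | 11270 (test) | 45% | 38% |
| 25K custom | 24817 | 15637 (test) | 63% | 15% |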


[Figure: Experiment, Full 5D. Panels: Reference Set, Custom Search, PSO Search]


Conclusions

• Intelligent search alone is insufficient
  • We need to rely heavily on techniques to reduce the number of dimensions and the size of the test space.
  • Phase II will add those techniques to the system
• Need for multiple abstraction levels for simulation models
  • High abstraction to run large searches (10,000+ samples), preferably in a cloud environment
  • More detailed dynamic models to explore failure surfaces directly
• Need to rely on multiple search techniques
  • Current algorithm focuses only on finding the surface quickly. Result: excellent precision, poor coverage
  • Phase II will add a spreading function to explore along the surfaces found (to improve coverage)

