safe model-based learning for robot control...safe model-based learning for robot control felix...

48
Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control 16 th December 2018

Upload: others

Post on 30-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safe model-based learning for robot control

Felix Berkenkamp, Andreas Krause, Angela P. Schoellig

@CDC Workshop on Learning for Control

16th December 2018

Page 2: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

The future of automation

2Felix Berkenkamp

Page 3: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

The future of automation

3Felix Berkenkamp

Large prior uncertainties, active decision making

Need safe and high-performance behavior

Page 4: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Control approach

4Felix Berkenkamp

System model System identification Controller design

Controlled environments Robustness towards errors Safety constraints

data collection

Page 5: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Two approaches

5Felix Berkenkamp

Control(Systems)

+ Models+ Feedback+ Safety+ Worst-case- Learning- Data

Performance limited by system understanding

Systems must learn and adapt

Page 6: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Reinforcement learning approach

6Felix Berkenkamp

System model System identification Controller designData samples Controller optimization

Action

State

Agent

Environment

Reward

Collecting relevant data for the task (in controlled environments)

Performance typically in expectation

Page 7: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Two approaches

7Felix Berkenkamp

Control(Systems)

Machine Learning(Data)

+ Models+ Feedback+ Safety+ Worst-case- Learning- Data

+ Learning+ Data collection+ Explore / exploit+ Average case- Worst-case- Safety

Systems must learn and adapt

Performance limited by system understanding

Safety limited by lack of system understanding

safety, data efficiency

Model-based reinforcement learning

Page 8: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Prerequisites for safe reinforcement learning

8Felix Berkenkamp

Safe Model-based Reinforcement Learning

Understand model errors and learning dynamics

Algorithm to safely acquiredata and optimize task

Define safety, analyze a model for safety

Page 9: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Overview

9Felix Berkenkamp

Safe Model-based Reinforcement Learning

Understand model errors and learning dynamics

Algorithm to safely acquiredata and optimize task

Define safety, analyze a model for safety

Page 10: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

Page 11: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

Page 12: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

Page 13: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

Page 14: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

Page 15: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning a model

10

Dynamics

Felix Berkenkamp

Need to quantify model error

Model error must decrease with data

sub-Gaussian

Page 16: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Gaussian process

16Felix Berkenkamp

Page 17: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Gaussian process

16Felix Berkenkamp

Page 18: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Gaussian process

16Felix Berkenkamp

Page 19: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Gaussian process

16Felix Berkenkamp

Page 20: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Gaussian process

16Felix Berkenkamp

Theorem (informally):The model error is contained in the scaled Gaussian process confidence intervalswith probability at least jointly for all , time steps, and actively selected measurements.

Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental DesignN. Srinivas, A. Krause, S. Kakade, M.Seeger, ICML 2010

Page 21: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

A Bayesian dynamics model

21

Dynamics

Felix Berkenkamp

Page 22: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

A Bayesian dynamics model

21

Dynamics

Felix Berkenkamp

Page 23: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

A Bayesian dynamics model

21

Dynamics

Felix Berkenkamp

Page 24: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

A Bayesian dynamics model

21

Dynamics

Felix Berkenkamp

Page 25: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Overview

25Felix Berkenkamp

Safe Model-based Reinforcement Learning

Understand model errors and learning dynamics

Algorithm to safely acquire data and optimize task

Define safety, analyze a model for safety

Page 26: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

26

unsafe

Felix Berkenkamp

robust, control-invariant

prior knowledge

Page 27: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety for learned models

27Felix Berkenkamp

Dynamics Policy

+

Stability?Region of attraction?

Page 28: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Lyapunov functions

28

[A.M. Lyapunov 1892]

Felix Berkenkamp

Page 29: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Lyapunov functions

29Felix Berkenkamp

Page 30: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Learning Lyapunov functions

30Felix Berkenkamp

Finding the right Lyapunov function is difficult!

The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamic SystemsS.M. Richards, F. Berkenkamp, A. Krause, CoRL 2018

Weights - positive-definiteNonlinearities - trivial nullspace

Classification problem

Page 31: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Overview

31Felix Berkenkamp

Safe Model-based Reinforcement Learning

Understand model errors and learning dynamics

Algorithm to safely acquiredata and optimize task

Define safety, analyze a model for safety

Page 32: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 33: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 34: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 35: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 36: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 37: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safety definition

32

unsafe

Felix Berkenkamp

Theorem (informally):

Under suitable conditionscan identify (near-)maximal subset of X on which π is stable, while never leaving the safe set

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

Initial safe policy

Page 38: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Illustration of safe learning

38

Policy

Need to safely explore!

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

low

high

Page 39: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Illustration of safe learning

39

Policy

Felix Berkenkamp

Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017

low

high

Page 40: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Model predictive control

40Felix Berkenkamp

Makes decisions based on predictions about the future

Includes input / state constraints

Page 41: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Model predictive control on a robot

41Felix Berkenkamp

Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016

https://youtu.be/3xRNmNv5Efk

Page 42: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Model predictive control

42Felix Berkenkamp

Problem: True dynamics are unknown!

Page 43: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Outer approximation contains true dynamics for all time steps with probability at least

Prediction under uncertainty

43

Learning-based Model Predictive Control for Safe Exploration T. Koller, F. Berkenkamp, M. Turchetta, A. Krause, CDC, 2018

Felix Berkenkamp

Page 44: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Safe model-based learning framework

44

unsafe safety trajectory

exploration trajectoryfirst step same

Theorem (informally):

Under suitableconditions can alwaysguarantee that we areable to return to thesafe set

Felix Berkenkamp

Page 45: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Exploration via expected performance

45Felix Berkenkamp

We design our cost functions to be helpful for optimization

Driving too fast Slow down for safety Faster driving after learning

Exploration objective:

subject to safety constraints

Page 46: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Example

46Felix Berkenkamp

Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016

https://youtu.be/3xRNmNv5Efk

Page 47: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Example

47Felix Berkenkamp

Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016

https://youtu.be/3xRNmNv5Efk

Page 48: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control

Summary

48Felix Berkenkamp

Safe Model-based Reinforcement Learning

Understand model and learning dynamics

Algorithm to safely acquiredata and optimize task

Define safety, analyze a model for safety

RKHS / Gaussian processes Lyapunov stability Model predictive control

https://berkenkamp.me www.dynsyslab.orgwww.las.inf.ethz.ch

reliable confidence intervals stability of learned models Uncertainty propagation, safe active learning