SPEED-RANGE DILEMMAS FOR VISION-BASED NAVIGATION IN UNSTRUCTURED TERRAIN

Pierre Sermanet¹·² Raia Hadsell¹ Jan Ben² Ayse Naz Erkan¹ Beat Flepp² Urs Muller² Yann LeCun¹

(1) Courant Institute of Mathematical Sciences, New York University
(2) Net-Scale Technologies, Morganville, NJ 07751, USA



Page 2

Pierre Sermanet. September 4th, 2007. Intelligent Autonomous Vehicles (IAV 2007), Toulouse, France.

Outline

Program and System overview

Problem definition

Architecture

Results

Page 3

Overview: Program

LAGR: Learning Applied to Ground Robots. Demonstrate learning algorithms in unstructured outdoor robotics.

Vision-based only (passive), no expensive equipment

Reach a GPS goal the fastest without any prior knowledge of location

DARPA funded, 10 teams (Universities and companies), common platform

Comparison to state-of-the-art CMU “baseline” software and other teams

Monthly tests by DARPA in various unknown locations:


Unstructured outdoor robotics is highly challenging due to the wide diversity of environments (colors, shapes, and sizes of obstacles, lighting and shadows, etc.).

Conventional algorithms are ill-suited; adaptability and learning are needed.

SwRI, TX Ft. Belvoir, VA Ft. Belvoir, VA Hanover, NH

Page 4

Overview: Platform

Manufacturer: CMU/NREC

Vision based only: 2 stereo pairs of cameras (+ GPS for global navigation)

4 Linux machines linked by Gigabit Ethernet:
Two “eye” machines (dual-core 2 GHz): image processing
One “planner” machine (single-core 2 GHz): planning and control loop
One “controller” machine: low-level communication

Maximum speed: 1.3 m/s

Proprietary CMU/NREC API to sensors and actuators

Proprietary CMU/NREC “Baseline”: end-to-end navigation software (D*, etc.), not re-used.


[Platform photo: GPS antenna, dual stereo cameras, bumper]

Page 5

Overview: Philosophy

Main goal: Demonstrate machine learning algorithms for long-range vision (RSS07).

Supporting goal: build a solid software platform for long-range vision and navigation:

Robust and reliable

Resistant to sensor imprecision and failures


Self-supervised learning using a convolutional network: the input is context-rich image windows; the output is long-range labels; the training labels come from short-range stereo on the input image.
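The near-to-far idea can be illustrated with a deliberately simplified sketch. A nearest-centroid classifier stands in for the authors' convolutional network, and all data and names below are made up; only the structure matters: short-range stereo labels supervise a model that is then applied to windows at any range.

```python
# Toy near-to-far self-supervision: stereo labels nearby windows, the
# trained classifier labels far windows (illustrative stand-in only).

def mean_feature(window):
    """Average the pixel values of an image window (here: grayscale)."""
    return sum(window) / len(window)

def train(near_windows, stereo_labels):
    """One mean feature per class, from stereo-labeled short-range windows."""
    sums, counts = {}, {}
    for w, lbl in zip(near_windows, stereo_labels):
        f = mean_feature(w)
        sums[lbl] = sums.get(lbl, 0.0) + f
        counts[lbl] = counts.get(lbl, 0) + 1
    return {lbl: sums[lbl] / counts[lbl] for lbl in sums}

def classify(window, centroids):
    """Label any window, near or far, by its closest class centroid."""
    f = mean_feature(window)
    return min(centroids, key=lambda lbl: abs(centroids[lbl] - f))

# Toy data: bright windows are traversable ground, dark ones are obstacles.
near_windows = [[200, 210, 190], [40, 50, 60], [220, 215, 225], [30, 20, 40]]
stereo_labels = ["ground", "obstacle", "ground", "obstacle"]
centroids = train(near_windows, stereo_labels)

far_window = [205, 195, 210]            # beyond stereo range: no label exists
print(classify(far_window, centroids))  # -> ground
```

The key property this sketch shares with the real system is that no hand labeling is needed: supervision comes entirely from the short-range stereo module.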

Page 6

Overview: System

Processing chain:

Sensors (cameras) → input image → image processing → traversability map [eye machine] → network transmission → path planning (using pose: GPS + IMU + wheels) → path → control → actuators (wheels) [planner machine].

The chain as a whole is characterized by its latency and its frequency.

Note: latency is not simply the inverse of the frequency; it also includes sensor, network, planning, and actuator delays.
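As a worked example of that note, end-to-end latency is a sum of per-stage delays, not the inverse of the loop frequency. All stage values below are illustrative assumptions, except the 190 ms sensor latency reported for the LAGR platform.

```python
# Back-of-the-envelope latency budget for the processing chain above.
delays_ms = {
    "sensors_and_api": 190,   # image age when it reaches image processing
    "image_processing": 150,  # assumed
    "network": 20,            # assumed
    "planning": 40,           # assumed
    "actuators": 30,          # assumed
}

latency_ms = sum(delays_ms.values())
frequency_hz = 10                 # how often a new command is issued
period_ms = 1000 / frequency_hz

# A 10 Hz loop can still be acting on information several periods old.
print(f"end-to-end latency: {latency_ms} ms, period: {period_ms:.0f} ms")
print(f"latency is {latency_ms / period_ms:.1f}x the period")
```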

Page 7

Problem

Significant performance drop in local obstacle avoidance when latency is too high and frequency too low:

Artificially increasing the latency and period increases the number of collisions with obstacles almost linearly.

Human expert drivers of the UPI Crusher vehicle reported that a feedback latency of 400 ms was the maximum for good remote driving.

How can good performance be guaranteed given the increasing complexity introduced by sophisticated long-range vision modules?

When does processing speed prevail over vision range, and vice versa?

Performance Test of July 2006, Holmdel Park, NJ

Page 8

Problem: Delays

Latency and frequency determine performance, but latency is actually composed of three types of delays:

1. Sensor/actuator latency + LAGR API latency: images are already 190 ms old when they become available to image processing.

2. Processing latency.

3. Robot dynamics latency (inertia + acceleration/deceleration): 1.5 s (worst case) between a wheel command and the actual desired speed.

(1) and (3) are relatively high on the LAGR platform and must be compensated for and taken into account by (2).

Page 9

Problem: Solutions to delays

To account for sensor and processing latencies (1) and (2):

a. Reduce processing time.

b. Estimate the delays between path planning and actuation.

c. Place traversability maps according to the delays before and after path planning.

To account for dynamics latency (3):

d. Model or record the robot's dynamics.

Solutions (a), (b), (c), and (d) are all part of the global solution presented in the results section, but here we only describe a successful architecture for (a).
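Solutions (b) and (c) amount to dead-reckoning the pose forward over the estimated delay before registering maps and commands. A minimal sketch, with illustrative names and a constant-velocity assumption (the deck does not specify the prediction model):

```python
# Predict where the robot will be when a command actually takes effect,
# so maps and plans can be registered against that pose rather than the
# stale sensing pose. Constant-velocity model; all values illustrative.
import math

def predict_pose(x, y, heading_rad, speed_mps, yaw_rate_rps, delay_s):
    """Dead reckoning over the estimated planning-to-actuation delay."""
    heading = heading_rad + yaw_rate_rps * delay_s
    x += speed_mps * delay_s * math.cos(heading)
    y += speed_mps * delay_s * math.sin(heading)
    return x, y, heading

# Driving straight along +x at 1.3 m/s with a 250 ms actuation delay:
px, py, ph = predict_pose(0.0, 0.0, 0.0, 1.3, 0.0, 0.25)
print(round(px, 3), round(py, 3))  # -> 0.325 0.0
```

Even this crude model shows why the delays matter: at full speed the robot moves a third of a meter between planning and actuation.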

Page 10

Architecture


Idea: Wagner et al.¹ showed that a walking human gazes close by more frequently than far away:

hence a higher frequency is needed close by than far away.

Close-by obstacles move toward the robot faster than far obstacles:

hence a lower latency is needed close by than far away.

To satisfy those requirements, short- and long-range vision are separated into two parallel and independent obstacle-detection (OD) modules:

“Fast-OD”: processing has to be fast; vision is not necessarily long-range.

“Far-OD”: vision has to be long-range; processing can be slower.

How to make Fast-OD fast? Simple processing and reduced input resolution.

Can we reduce resolution without reducing performance?

¹ M. Wagner, J. C. Baird, and W. Barbaresi. The locus of environmental attention. J. of Environmental Psychology, 1:195-206, 1980.

Page 11

Architecture: Fast-OD // Far-OD

Sensors (cameras) feed two pipelines on the eye machine:

Far-OD: high-resolution input image (320x240 or 512x384) → advanced image processing → traversability map (5 to >30 m). Latency: 700 ms; frequency: 3 Hz.

Fast-OD: low-resolution input image (160x120) → simple image processing → traversability map (0 to 5 m). Latency: 250 ms; frequency: 10 Hz.

On the planner machine, both maps arrive by network transmission, are merged (map merging, using pose: GPS + IMU + wheels), and path planning produces a path that is executed by control and the actuators (wheels).
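The two pipelines can be sketched as independent loops publishing timestamped maps at their own rates. This toy version (scaled-down timings, hypothetical names) shows only the structure, not the real image processing:

```python
# Fast-OD and Far-OD as independent fixed-rate loops, each publishing its
# latest timestamped map; the planner always reads the freshest of each.
import threading, time

latest = {}                 # module name -> (timestamp, map)
lock = threading.Lock()

def od_loop(name, period_s, processing_s, n_iters):
    for i in range(n_iters):
        time.sleep(processing_s)          # stand-in for image processing
        with lock:
            latest[name] = (time.monotonic(), f"{name}-map-{i}")
        time.sleep(max(0.0, period_s - processing_s))

fast = threading.Thread(target=od_loop, args=("fast", 0.01, 0.002, 10))
far = threading.Thread(target=od_loop, args=("far", 0.04, 0.03, 3))
fast.start(); far.start()
fast.join(); far.join()

with lock:
    print(sorted(latest))  # -> ['far', 'fast']
```

The point of the decoupling is that a slow Far-OD iteration never delays the next Fast-OD map.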

Page 12

Architecture: Implementation notes

CPU cycles: all cycles must be given to Fast-OD when it runs, to guarantee low latency. Possible solutions:

1. Use a real-time OS and give Fast-OD a high priority.

2. With a regular OS, give Fast-OD control of Far-OD: Fast-OD pauses Far-OD, runs, then sleeps for a bit and resumes Far-OD.

3. Use a dual-core CPU.
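Option 2 can be sketched with standard thread primitives; the timings and names are illustrative, and a real system would guard shared state more carefully:

```python
# Cooperative scheduling: Fast-OD pauses Far-OD while it runs, then
# sleeps briefly and resumes it, so Far-OD never steals cycles during a
# Fast-OD iteration. Timings are illustrative, not the platform's.
import threading, time

far_may_run = threading.Event()
far_may_run.set()
stop = threading.Event()
far_iterations = 0

def far_od():
    global far_iterations
    while not stop.is_set():
        far_may_run.wait()        # blocks while Fast-OD holds the CPU
        time.sleep(0.001)         # one slice of slow processing
        far_iterations += 1

def fast_od(n_iters):
    for _ in range(n_iters):
        far_may_run.clear()       # pause Far-OD
        time.sleep(0.002)         # Fast-OD's low-latency processing
        far_may_run.set()         # resume Far-OD
        time.sleep(0.002)         # sleep for a bit before the next cycle

far = threading.Thread(target=far_od)
far.start()
fast_od(5)
stop.set(); far_may_run.set()     # let Far-OD observe the stop flag
far.join()
print(far_iterations > 0)  # -> True
```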

Map merging: Fast and Far maps are merged together before planning according to their respective poses.
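The deck does not specify the merge rule, but one plausible sketch is a cell-wise pessimistic merge of the two aligned cost grids, with Fast-OD covering the near field and Far-OD the far field (all values below are illustrative):

```python
# Hypothetical cell-wise merge of the two traversability maps: after
# registering both maps in the same vehicle-centered grid, keep the most
# pessimistic (highest) known cost per cell.

UNKNOWN = -1.0  # cost value for cells a module cannot see

def merge(fast_map, far_map):
    """Combine two aligned cost grids, preferring the higher known cost."""
    merged = []
    for fast_row, far_row in zip(fast_map, far_map):
        row = []
        for a, b in zip(fast_row, far_row):
            known = [c for c in (a, b) if c != UNKNOWN]
            row.append(max(known) if known else UNKNOWN)
        merged.append(row)
    return merged

fast_map = [[0.2, 0.9], [UNKNOWN, UNKNOWN]]   # sees only the near rows
far_map = [[0.1, 0.1], [0.5, UNKNOWN]]        # coarser, but longer range
print(merge(fast_map, far_map))  # -> [[0.2, 0.9], [0.5, -1.0]]
```

Taking the maximum cost is a conservative choice: a cell flagged as an obstacle by either module stays an obstacle in the merged map.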

Two-step planning: this architecture makes it easier to separate the planning algorithms suited to short and long range:

Fast-OD planning happens in Cartesian space and takes robot dynamics into account (more important at short range).

Far-OD planning happens in image space and uses regular path planning.

[Diagram: dynamics planning in Cartesian space; long-range planning in image space; range marks at 5 m, 10 m, and infinity]

Page 13

Results: Timing measures


Fast-OD actuation latency: 250 ms
Fast-OD sensor latency: 190 ms
Fast-OD period (frequency): 100 ms (10 Hz)
Far-OD period (frequency): 370 ms (2-3 Hz)
Far-OD actuation latency: 700 ms

Page 14

[Figure: global-map and vehicle-map views]

Results


Short- and long-range navigation test:

1. The first obstacle appears quickly and suddenly to the robot, testing short-range navigation.

2. A cul-de-sac tests “long”-range navigation.

The parallel architecture is consistently better at short- and long-range navigation than the series architecture or Fast-OD alone. Note: here Fast-OD has a 5 m radius and Far-OD a 15 m radius.

Page 15

Results: More recent results


Fast-OD + Far-OD in parallel: short-range navigation consistently successful:

0 collisions over >5 runs

Run finished in about 16 s along the shortest path

Fast-OD: 10 Hz, 250 ms latency, 3 m range

Far-OD: 3 Hz, 700 ms latency, 30 m range

Fast-OD + Far-OD in series: short-range navigation consistently failing:

>2 collisions over >5 runs

Run finished in >40 s along a longer path

Fast-OD/Far-OD: 3 Hz, 700 ms latency, 3 m/30 m range (the frequency is acceptable but the latency is too high)

Video 1: collision-free bucket maze Video 2: collision-free bucket maze

Videos 3, 4, 5: obstacle collisions due to high latency and period.

Page 16

Results: More recent results


Video 6: Natural obstacles Video 7: Tight maze of artificial obstacles

Fast-OD + Far-OD in parallel: short-range navigation consistently successful: 0 collisions over >5 runs.

Fast-OD: 10 Hz, 250 ms latency, 3 m range

Far-OD: 3 Hz, 700 ms latency, 30 m range. Note: long-range planning is off, i.e. Far-OD is processing but ignored; only short-range navigation was tested here.

Page 17

Results: Moving obstacles


Video 8: Fast moving obstacle

Detects and avoids moving obstacles consistently.

Page 18

Results: Beating humans


Video 9: Experienced human driver

Autonomous short-range navigation is consistently better than inexperienced human drivers and equal to or better than experienced human drivers.

(driving with only the robot's images would be even harder for a human)

Page 19

Results: Processing Speed - Vision Range dilemma


We showed that processing speed prevails over vision range for short range navigation, whereas vision range prevails over speed for long range navigation.

Only a 3 m vision range was necessary to build collision-free short-range navigation for a 1.3 m/s non-holonomic vehicle:

Vehicle's worst-case stopping delay: 1.0 s.

System's worst-case reaction time: 0.25 s latency + 0.1 s period.

Worst-case reaction and stopping delay: 1.35 s (or 1.75 m at 1.3 m/s).

Only 1.0 s of anticipation was necessary in addition to the worst-case reaction and stopping delay.
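The budget above can be checked directly from the slide's figures:

```python
# Worst-case reaction and stopping budget, using the deck's numbers.
speed = 1.3           # m/s, platform maximum speed
stopping = 1.0        # s, worst-case stopping delay
latency = 0.25        # s, Fast-OD actuation latency
period = 0.1          # s, Fast-OD period (10 Hz)
anticipation = 1.0    # s, additional anticipation found necessary

reaction_and_stop = latency + period + stopping            # 1.35 s
distance = speed * reaction_and_stop                       # ~1.75 m
needed_range = speed * (reaction_and_stop + anticipation)  # ~3 m: matches
                                                           # the 3 m range
                                                           # found sufficient
print(f"{reaction_and_stop:.2f} s, {distance:.2f} m, "
      f"with anticipation {needed_range:.2f} m")
```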

A 15 m vision range with higher latency and lower frequency, run in parallel with the short-range module, consistently improved long-range navigation.

Page 20

Summary


We showed that both latency and frequency are critical in vision-based systems because of their comparatively high processing times.

A simple, very-low-resolution OD running in parallel with a high-resolution OD greatly increased the performance of a short- and long-range vision-based autonomous navigation system over the commonly used higher-resolution, sequential approaches: processing speed prevails over range in short-range navigation, and only 1.0 s of anticipation beyond dynamics and processing delays was necessary.

Additional key concepts such as dynamics modeling must be implemented to build a complete, successful end-to-end system.

A robust collision-free navigation platform, which deals with moving obstacles and beats human drivers, was successfully built; it leaves enough CPU cycles available for computationally expensive algorithms.

Page 21

Questions?