Post on 22-May-2020


Page 1

Perception Systems for Autonomous Vehicles using Energy-Efficient Deep Neural Networks

Forrest Iandola, Ben Landen, Kyle Bertin, Kurt Keutzer and the DeepScale Team

Page 2

IMPLEMENTING AUTONOMOUS DRIVING: THE FLOW

SENSORS: LIDAR, ULTRASONIC, CAMERA, RADAR

OFFLINE MAPS

REAL-TIME PERCEPTION

PATH PLANNING & ACTUATION

Page 3

What does a car need to see?

Note: above visuals are an artist’s rendering created to help convey concepts. They should not be judged for accuracy.

Page 4

What does a car need to see?

Object Detection

[Figure: scene with detected objects labeled by class and confidence]
• Vehicle (99%), Vehicle (98%), Vehicle (100%), Vehicle (100%), Vehicle (99%)
• Cyclist (99%), Cyclist (99%)
• Pedestrian (99%), Pedestrian (99%)

Page 5

What does a car need to see?

Distance

[Figure: the same detections, each annotated with its distance]
• Vehicle (99%) 15m
• Vehicle (98%) 20m
• Vehicle (100%) 10m
• Vehicle (100%) 14m
• Vehicle (99%) 18m
• Cyclist (99%) 16m
• Cyclist (99%) 14m
• Pedestrian (99%) 7m
• Pedestrian (99%) 7m

Page 6

What does a car need to see?

Object Tracking

[Figure: the same detections, each annotated with a track ID and the number of frames tracked]

Class | Confidence | Distance | ID | Frames tracked
Vehicle | 100% | 10m | 1 | 135
Vehicle | 100% | 14m | 2 | 140
Vehicle | 99% | 18m | 3 | 140
Vehicle | 98% | 20m | 4 | 140
Vehicle | 99% | 15m | 5 | 95
Cyclist | 99% | 16m | 6 | 90
Cyclist | 99% | 14m | 7 | 95
Pedestrian | 99% | 7m | 8 | 60
Pedestrian | 99% | 7m | 9 | 60

Page 7

What does a car need to see?

Free Space & Driveable Area

[Figure: the scene with the same nine tracked objects (IDs 1 to 9) labeled by class, confidence, distance, ID and frames tracked, as on the Object Tracking slide]

Page 8

What does a car need to see?

Lane Recognition

[Figure: the scene with the same nine tracked objects (IDs 1 to 9) labeled by class, confidence, distance, ID and frames tracked, as on the Object Tracking slide]

Page 9

Audi https://www.slashgear.com/man-vs-machine-my-rematch-against-audis-new-self-driving-rs-7-21415540/

BMW + Intel https://newsroom.intel.com/news-releases/bmw-group-intel-mobileye-will-autonomous-test-vehicles-roads-second-half-2017/

Ford http://cwc.ucsd.edu/content/connected-cars-long-road-autonomous-vehicles

Today's autonomous cars require a lot of computing hardware!

…and perception is the most computationally-intensive part of the software stack

Page 10

Big computers = expensive cars

Page 11

As a workaround, companies want people to share autonomous vehicles to amortize hardware costs


Page 13

Shared autonomous vehicles will likely have some of the same downsides as public transportation

Page 14

Will better computer chips make autonomous cars affordable?


Page 16

Deep Learning Processors have arrived!

[1] https://www.nvidia.com/content/PDF/kepler/Tesla-K20-Passive-BD-06455-001-v05.pdf
[2] http://www.nvidia.com/content/PDF/Volta-Datasheet.pdf (PCIe version)

THE SERVER SIDE

Platform | Computation (GFLOP/s) | Memory Bandwidth (GB/s) | Computation-to-bandwidth ratio | Power (TDP Watts) | Year
NVIDIA K20 [1] | 3500 (32-bit float) | 208 (GDDR5) | 17 | 225 | 2012
NVIDIA V100 [2] | 112000 (16-bit float) | 900 (HBM2) | 124 (yikes!) | 250 | 2018

Uh-oh… Processors are improving much faster than Memory.
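
The computation-to-bandwidth ratio in the table is peak computation divided by peak memory bandwidth. A minimal Python sketch that recomputes it from the figures above (the helper name `compute_to_bandwidth` is illustrative and does not come from the slides):

```python
def compute_to_bandwidth(gflops: float, bandwidth_gb_s: float) -> float:
    """Peak operations available per byte moved from memory (FLOP/byte)."""
    return gflops / bandwidth_gb_s


# (peak GFLOP/s, memory bandwidth in GB/s), copied from the table above
platforms = {
    "NVIDIA K20": (3500, 208),
    "NVIDIA V100": (112000, 900),
}

ratios = {name: compute_to_bandwidth(*spec) for name, spec in platforms.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} FLOP/byte")
```

Read this way, a workload has to perform roughly this many operations per byte fetched to keep the processor busy; below that, it is limited by memory rather than by arithmetic.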

Page 17

Deep Learning Processors have arrived!

[1] https://indico.cern.ch/event/319744/contributions/1698147/attachments/616065/847693/gdb_110215_cesini.pdf
[2] https://www.androidauthority.com/huawei-announces-kirin-970-797788
[3] https://blogs.nvidia.com/blog/2018/01/07/drive-xavier-processor/
[4] https://developer.nvidia.com/jetson-xavier

MOBILE PLATFORMS

Device | Cores | Computation (GFLOP/s) | Memory Bandwidth (GB/s) | Computation-to-bandwidth ratio | System Power (TDP Watts) | Year
Samsung Galaxy Note 3 | Arm Mali T-628 GPU [1] | 120 (32-bit float) | 12.8 (LPDDR3) | 9.3 | ~10 | 2013
Huawei P20 | Kirin 970 NPU [2] | 1920 (16-bit float) | 30 (LPDDR4X) | 64 (ouch!) | ~10 | 2018
NVIDIA Jetson Xavier [3,4] | NVIDIA Tensor Cores | 30000 (832 int) | 137 | 218 (yikes!) | 10 to 30 (multiple modes) | 2018

Page 18

What will the next generation Deep Learning servers look like?

https://medium.com/@shan.tang.g/a-list-of-chip-ip-for-deep-learning-48d05f1759ae

Page 19

What will the next generation Deep Learning servers look like?

20 TOP/W COMPUTATION

Platform | Efficiency (TOP/s/W) | Computation (TOP/s) | Memory Bandwidth (TB/s) | Computation-to-bandwidth ratio | Power (TDP Watts) | Year
NVIDIA K20 [1] | 0.015 | 3.50 (32-bit float) | 0.208 (GDDR5) | 17 | 225 | 2012
NVIDIA V100 [2] | 0.45 | 112 (16-bit float) | 0.900 (HBM2) | 124 | 250 | 2018
Next-gen: 20 TOP/W | 20 | 2500* | 1.800 (HBM3) [3] | 1389 (oh no!) | 250 | 2020 (est.)

[1] https://www.nvidia.com/content/PDF/kepler/Tesla-K20-Passive-BD-06455-001-v05.pdf
[2] http://www.nvidia.com/content/PDF/Volta-Datasheet.pdf (PCIe version)
[3] https://www.eteknix.com/gddr6-hbm3-details-emerge/

* Assuming half the power is spent on computation, and the other half is spent on memory and other devices: 20 TOP/s/W * 250 W * 0.5 = 2500 TOP/s
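
The next-gen row can be reproduced with a few lines of arithmetic. This is only a sketch under the slide's own assumptions: 20 TOP/s/W efficiency, the 250 W TDP listed in the table (the value that yields the stated 2500 TOP/s), half of the power going to computation, and 1.8 TB/s of HBM3 bandwidth. The function name is hypothetical.

```python
def projected_compute_tops(efficiency_tops_per_w: float,
                           tdp_w: float,
                           compute_power_fraction: float = 0.5) -> float:
    """Peak TOP/s if only a fraction of the power budget feeds the compute units."""
    return efficiency_tops_per_w * tdp_w * compute_power_fraction


compute_tops = projected_compute_tops(20, 250)   # 2500.0 TOP/s
bandwidth_tb_s = 1.8                             # HBM3 figure from the table
ratio = compute_tops / bandwidth_tb_s            # operations per byte

print(f"compute: {compute_tops:.0f} TOP/s, ratio: {ratio:.0f} OP/byte")
```

The ratio comes out at about 1389 operations per byte, matching the table and roughly 11 times the V100's 124.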

Page 20

Small Neural Nets to the rescue

Page 21

squeeze (verb): to make an AI system use fewer resources using whatever means necessary

Page 22

• Memory Footprint and Bandwidth
• Computational Operations
• Power and Energy
• Time

Page 23

• New DNN Models
• Application-specific Quantization and Pruning
• Superior Implementations
• Differentiated Data and Training Strategies

Page 24

Most CV Applications Rely on Only a Few Core CV Capabilities

Image Classification

Object Detection

Semantic Segmentation

And the best accuracy for each of these capabilities is given by Convolutional Neural Nets

Page 25

But We Need a Very Different Kind of DNN

• DGX-1: 170 TFLOPS, 3.2 kW, 128 GB Memory
• TitanX: 11 TFLOPS, 223 Watts, 12 GB Memory
• Smartphones: 100's of GFLOPS, 3 Watts, 2-4 GB
• IoT Devices: 100's of MHz, <1 Watt, <1 GB

VGG16 [1] model:
• Parameter size: 552 MB
• Memory: 93 MB/image
• Computation: 15.8 GFLOPs/image

[1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.

Page 26

Speed more Related to Memory Accesses than Operations

Samsung Exynos M1 (Galaxy S7) Access Times

Metric | L1 D-Cache (per core) | L2 Cache (shared) | Off-chip DRAM
Size | 32 KB | 2 MB | 4 GB
Read Latency | 4 cycles | 22 cycles | ~200 cycles
Read Bandwidth | 20.8 GB/s | 166.4 GB/s | 28.7 GB/s

[Figure labels: L1 Cache/TLB, L2 Cache]
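
One way to see why memory access can dominate speed is a rough lower bound on the time needed to stream a model's weights once from off-chip DRAM: bytes to read divided by read bandwidth. The sketch below assumes the weights are read exactly once per image with no reuse from cache, and uses the 28.7 GB/s DRAM figure above together with model sizes quoted elsewhere in this deck (VGG16: 552 MB, SqueezeNet: 4.8 MB). The function name is hypothetical.

```python
DRAM_READ_BANDWIDTH_GB_S = 28.7  # off-chip DRAM read bandwidth from the table above


def weight_stream_time_ms(model_size_mb: float,
                          bandwidth_gb_s: float = DRAM_READ_BANDWIDTH_GB_S) -> float:
    """Lower bound on the time to read all weights once from DRAM, in milliseconds."""
    bytes_to_read = model_size_mb * 1e6
    seconds = bytes_to_read / (bandwidth_gb_s * 1e9)
    return seconds * 1e3


vgg16_ms = weight_stream_time_ms(552)        # about 19 ms per image
squeezenet_ms = weight_stream_time_ms(4.8)   # well under 1 ms per image

print(f"VGG16: {vgg16_ms:.1f} ms, SqueezeNet: {squeezenet_ms:.2f} ms")
```

Under these assumptions, reading VGG16's weights alone caps throughput at roughly 50 images per second, regardless of how fast the arithmetic units are.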

Page 27

Energy More Related to Memory Accesses than Operations (45nm, 0.9V)

[Figure: bar chart of energy per operation in pJ, with each bar's cost relative to an 8b INT multiply]
• 8b INT Mult
• 16b FP Mult: 5.5x
• 32b FP Mult: 18.5x
• 64b Cache Read (32KB): 100x
• 64b Cache Read (1MB): 500x
• DRAM: 10,000x

Mark Horowitz, “Computing’s Energy Problem (and what we can do about it),” ISSCC 2014

Page 28

10,000 DNN Architectural Configurations Later: SqueezeNet (2016)

CNN | Top-5 Accuracy (ImageNet) | Model Parameters | Model Size
AlexNet [1] | 80.3% | 60M | 243MB
SqueezeNet [2] | 80.3% | 1.2M | 4.8MB

SqueezeNet [2] compresses to 500KB.

[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." NIPS 2012.
[2] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv:1602.07360 (February 2016).
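
The model sizes in the table follow from the parameter counts if each parameter is stored as a 32-bit float (4 bytes), which is an assumption of this sketch rather than something the slide states. The helper name is illustrative.

```python
BYTES_PER_FP32 = 4  # assumed storage format: one 32-bit float per parameter


def model_size_mb(num_params: float, bytes_per_param: int = BYTES_PER_FP32) -> float:
    """Uncompressed model size in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


alexnet_params = 60e6       # rounded count from the table
squeezenet_params = 1.2e6   # rounded count from the table

alexnet_mb = model_size_mb(alexnet_params)            # 240.0
squeezenet_mb = model_size_mb(squeezenet_params)      # 4.8
param_reduction = alexnet_params / squeezenet_params  # 50.0

print(f"AlexNet: {alexnet_mb:.0f} MB, SqueezeNet: {squeezenet_mb:.1f} MB, "
      f"{param_reduction:.0f}x fewer parameters")
```

The 240 MB result is slightly below the table's 243MB, presumably because 60M is a rounded parameter count.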

Page 29

SqueezeNet: Immediate Success in Embedded Vision

• Enabled embedded processor vendors (ARM, NXP, Qualcomm) to demo CNNs
• Quickly ported to all the major Deep Learning Frameworks

NXP – Embedded Vision Summit

Qualcomm – Facebook F8

Apple CoreML

Page 30

SqueezeDet for Object Detection (2017)

[Figure: SqueezeDet pipeline. Labels: Input image, feature map, ConvDet, Bounding boxes, Filter, Final detections]

Best Paper Award: Bichen Wu, Forrest Iandola, Peter H. Jin, and Kurt Keutzer. 2017. SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. In Proceedings, CVPR Embedded Computer Vision Workshop, July 2017.

• ~2M model parameters
• 57 FPS
• 1.4 Joules/frame
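
Frame rate and energy per frame together imply an average power draw. The short sketch below assumes both figures were measured on the same hardware while running continuously; the slide does not say so, and the function name is hypothetical.

```python
def average_power_watts(frames_per_second: float, joules_per_frame: float) -> float:
    """Average power when processing frames back to back (W = J/s)."""
    return frames_per_second * joules_per_frame


squeezedet_power_w = average_power_watts(57, 1.4)
print(f"{squeezedet_power_w:.1f} W")
```

Under that assumption the figures correspond to just under 80 W.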

Page 31

SqueezeSeg: Semantic Segmentation for LIDAR (2018)

LIDAR point cloud segmentation

SqueezeSegV2:
• Higher accuracy: v1 [1]: 64.6% -> v2 [2]: 73.2% (+8.6%)
• Better Sim2Real performance: v1 [1]: 30% -> v2 [2]: 57.4% (+27.4%)
• Outperforms v1 trained on real data w/o intensity

[1] Wu, Bichen, et al. "Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud." ICRA 2018.
[2] Wu, Bichen, et al. "SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud." arXiv:1809.08495 (2018).

Page 32

Squeeze Family

• Image Classification: SqueezeNet, SqueezeNext, ShiftNet, DiracDeltaNet, DNASNet
• Object Detection: SqueezeDet
• Semantic Segmentation: SqueezeSeg-{v1, v2}

Page 33

Andrew Howard's MobileNets: Efficient On-Device Computer Vision Models

• MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
• MobileNet V2: Inverted Residuals and Linear Bottlenecks

• Designed for efficiency on mobile phones.
• Family of Pareto-optimal models to target needs of the user.
• V1 based on Depthwise Separable Convolutions.
• V2 introduces Inverted Residuals and Linear Bottlenecks.
• Supports Classification, Detection, Segmentation and more.
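
A depthwise separable convolution replaces one dense k x k convolution with a per-channel k x k (depthwise) convolution followed by a 1x1 (pointwise) convolution. The sketch below compares weight counts for the two forms; the layer shape (3x3, 384 channels in and out) is an illustrative choice, not a figure from the MobileNets papers, and biases are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a dense k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


k, c_in, c_out = 3, 384, 384   # illustrative layer shape
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
reduction = standard / separable

print(f"standard: {standard:,}  separable: {separable:,}  reduction: {reduction:.1f}x")
```

For this shape the separable form needs a little under 9 times fewer weights, and the saving approaches k*k as the channel count grows.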

Page 34

Model Compression

≥50X 10X

Slide Credit: Prof. Warren Gross (McGill Univ.)

DNN Architecture Search

Page 35

Anatomy of a convolution layer

[Figure: a 3x3 conv layer with 384 filters, each spanning 384 input channels, applied to a 384-channel 13x13 input to produce a 384-channel 13x13 output]

Page 36

Filters: Kernel Reduction

[Figure: the 3x3 conv layer (384 input channels, 384 filters, 13x13 feature maps) with each 3x3 kernel replaced by a 1x1 kernel]

9x reduction in model parameters

Page 37

Filters/Channel Reduction

[Figure: the 3x3 conv layer with the number of input channels and filters reduced from 384 to 128]

9x reduction in model parameters
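
Both 9x figures (kernel reduction and channel reduction) can be checked by counting weights in a convolution layer as kernel height x kernel width x input channels x filters. This is a minimal sketch using the 384-channel layer from the slides; biases are ignored and the function name is illustrative.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a kernel x kernel convolution with c_in input channels and c_out filters."""
    return kernel * kernel * c_in * c_out


baseline = conv_params(3, 384, 384)          # 3x3 kernels, 384 -> 384 channels
kernel_reduced = conv_params(1, 384, 384)    # 3x3 replaced by 1x1
channel_reduced = conv_params(3, 128, 128)   # channels and filters cut from 384 to 128

print(baseline // kernel_reduced, baseline // channel_reduced)
```

Kernel reduction saves 3*3 = 9x directly; channel reduction saves (384/128)^2 = 9x because the parameter count scales with the product of input channels and filters.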

Page 38

Model Distillation/Compression

Li, et al. Mimicking Very Efficient Network for Object Detection. CVPR, 2017.
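
As a rough illustration of distillation in general, the sketch below computes a soft-target loss: the cross-entropy between the teacher's and the student's temperature-softened output distributions. This is the generic soft-target formulation, not the method of the paper cited above; the logits and the temperature value are made up for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between softened teacher and student distributions."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


teacher = [8.0, 2.0, 0.5]         # hypothetical big-model logits
close_student = [7.5, 2.2, 0.4]   # student that agrees with the teacher
far_student = [0.5, 2.0, 8.0]     # student that disagrees

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

The loss is lower for the student whose outputs resemble the teacher's, which is the signal a small model is trained on when a big model "teaches" it.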

Page 39

Examples of what's on a DNN Architect's Palette

• Spatial Convolution, e.g. 3x3
• Depthwise Convolution
• Pointwise Convolution (1x1)
• Shift
• Channel Shuffle
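
Shift and channel shuffle only move data around, so they involve no multiply-accumulate operations. The sketch below shows both on plain Python lists; it is a simplified illustration (a single shift direction, channels represented by labels), with function names chosen for this example.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups: view as (groups, n), transpose, flatten."""
    n = len(channels) // groups
    assert n * groups == len(channels), "channel count must be divisible by groups"
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


def shift_right(feature_map, fill=0):
    """Shift a 2-D feature map one pixel to the right, padding with `fill`."""
    return [[fill] + row[:-1] for row in feature_map]


shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
shifted = shift_right([[1, 2, 3],
                       [4, 5, 6]])

print(shuffled)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
print(shifted)   # [[0, 1, 2], [0, 4, 5]]
```

In a full shift layer, different channels would be shifted in different directions so that a following 1x1 convolution can combine information from neighbouring pixels.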

Page 40

The Art of Small Model Design
Small Neural Nets Are Beautiful – ESWeek 2017

The palette of an adept mobile/embedded DNN designer has grown very rich!
• Overall architecture: economize on layers while retaining accuracy
• Layer types
  • Kernel reduction: 5x5 -> 3x3 -> 1x1
  • Channel reduction: e.g. FireLayer
  • Experiment with novel layer types that consume no FLOPs
    • Shuffle
    • Shift
• Model distillation: let big models teach smaller ones
• Apply pruning
• Tailor bit precision (aka quantization) to target processor

Iandola, Forrest, and Kurt Keutzer. "Small neural nets are beautiful: enabling embedded systems with small deep-neural-network architectures." In Proceedings of the Twelfth International Conference on Hardware/Software Codesign and System Synthesis Companion, p. 1. ACM, 2017. (ESWeek 2017). Also arXiv:1710.02759.
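
To make "tailor bit precision" concrete, here is a minimal sketch of symmetric linear quantization of weights to 8-bit integers. It is one common scheme chosen for illustration; the slides do not specify a quantization method, and the weight values are made up.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.50, -1.27, 0.03, 0.0]   # hypothetical float weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q, scale, max_error)
```

Each weight now takes 1 byte instead of 4, at the cost of a rounding error of at most half a quantization step.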

Page 41

Artistic/Engineering Process of Designing a Deep Neural Net

• Manual design:
  • Each iteration to evaluate a point in the design space is very expensive
  • Exploration limited by human imagination

Page 42

Can we automate this?


Page 43

DNAS: Differentiable Neural Architecture Search

Differentiable Neural Architecture Search:
• Extremely fast: 8 GPUs, 24 hours
• Can search for different conditions case-by-case
• Optimize for actual latency

Bichen Wu, Kurt Keutzer, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia
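
The slides do not describe how the search is made differentiable. One way to do it, assumed here for illustration, is to keep a learnable logit per candidate operator in each layer, draw relaxed one-hot weights with a Gumbel-Softmax, and add a latency term computed from a per-operator latency table. The operator names and latencies below are made up.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Sample relaxed one-hot weights over candidate ops (Gumbel-Softmax)."""
    noisy = []
    for logit in logits:
        # Clamp away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate ops for one layer, with made-up measured latencies (ms)
candidate_latency_ms = {"conv3x3": 2.0, "conv1x1": 0.6, "skip": 0.0}
arch_logits = [0.0, 0.0, 0.0]   # learnable architecture parameters, one per op

rng = random.Random(0)
weights = gumbel_softmax(arch_logits, temperature=1.0, rng=rng)
expected_latency_ms = sum(
    w * latency for w, latency in zip(weights, candidate_latency_ms.values())
)

print(weights, expected_latency_ms)
```

Because the expected latency is a weighted sum of the sampled weights, it can be differentiated with respect to the architecture logits and traded off against the accuracy loss during training.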

Page 44

DNAS in context (FLOPs to normalize comparison)

[Figure: scatter plot. X-axis: FLOPs (more FLOPs is bad). Y-axis: ImageNet top-1 accuracy (higher is good). Mark size: search cost. Circles: search cost unknown.]

Model | Top-1 Acc | FLOPs | Search cost (GPU-hrs)
NAS [1] | 74.0% | 564M | 48,000
PNAS [2] | 74.2% | 588M | 6,000*
DARTS [3] | 73.1% | 595M | 288
MobileNetV2 [4] | 71.8% | 300M | -
AMC [5] | 70.8% | 150M | -
MnasNet [6] | 74.0% | 317M | 91,000*
DNASNet (ours) | 74.2% | 295M | 216

* Estimated from the paper description

[1] Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." arXiv preprint arXiv:1707.07012 (2017).
[2] Liu, Chenxi, et al. "Progressive neural architecture search." arXiv preprint arXiv:1712.00559 (2017).
[3] Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." arXiv preprint arXiv:1806.09055 (2018).
[4] Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." CVPR 2018.
[5] He, Yihui, et al. "Amc: Automl for model compression and acceleration on mobile devices." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
[6] Tan, Mingxing, et al. "Mnasnet: Platform-aware neural architecture search for mobile." arXiv preprint arXiv:1807.11626 (2018).
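
Treating the slide's points as (FLOPs, accuracy) pairs, the Pareto-optimal models are those that no other model beats on both axes. The sketch below applies that test to the numbers quoted on this slide; search cost is ignored, and the function name is illustrative.

```python
# (model, FLOPs in millions, ImageNet top-1 accuracy in %), copied from the slide
models = [
    ("MobileNetV2", 300, 71.8),
    ("PNAS", 588, 74.2),
    ("DARTS", 595, 73.1),
    ("AMC", 150, 70.8),
    ("MnasNet", 317, 74.0),
    ("NAS", 564, 74.0),
    ("DNASNet", 295, 74.2),
]


def dominates(a, b):
    """True if a is at least as good as b on both axes and strictly better on one."""
    _, a_flops, a_acc = a
    _, b_flops, b_acc = b
    return (a_flops <= b_flops and a_acc >= b_acc
            and (a_flops < b_flops or a_acc > b_acc))


pareto = [m[0] for m in models if not any(dominates(other, m) for other in models)]
print(pareto)  # ['AMC', 'DNASNet']
```

On these numbers only AMC (fewest FLOPs) and DNASNet (highest accuracy at fewer FLOPs than every model except AMC) remain undominated.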

Page 45

DNAS for device-aware search

NET | Latency on iPhoneX | Latency on Samsung S8 | Top-1 acc
DNAS-iPhoneX | 19.84 ms | 23.33 ms (20% slower) | 73.20%
DNAS-S8 | 27.53 ms (25% slower) | 22.12 ms | 73.27%

• For different targeted devices, both DNASNets achieve similar accuracy.
• However, per-target DNN optimization was required

Page 46

The Future: Breaking down the wall between DNN Design & Hardware Design

DNN Designers
• Unaware of arithmetic intensity
• Floating point vs fixed point costs
• Memory hierarchy and latency

NN HW Accelerator architects
• Using outdated models:
  - AlexNet
  - VGG16
• Using irrelevant datasets:
  - MNIST
  - CIFAR

Page 47

Key Takeaways

• Autonomous vehicles currently need thousands (or even hundreds of thousands) of dollars of computing hardware
• Processing is on a trajectory of rapid improvement (in operations-per-Watt)
  • but other aspects of the system (e.g. memory) are improving much more slowly
  • today's neural networks will be choked by slow memory on tomorrow's DNN accelerators (this is already happening and will get worse)
• Designing new (smaller) neural networks helps with all of the following
  • making full use of next-generation computing platforms
  • reducing the hardware costs in autonomous vehicles
  • enabling lower-cost, larger-scale rollouts of autonomous vehicles