Post on 04-May-2018

© 2017 Arm Limited

Comprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications

Steve Steele

Director, ML Platforms | Arm

Arm Technical Symposium 2017

Agenda

Innovation Growth Maturity

Machine learning

Platform & tools

What is Artificial Intelligence?

What are the opportunities and challenges in AI?

Arm technology for AI

• Software

• Specialized Acceleration

• Hardware

What is Artificial Intelligence?

AI Presents Significant Opportunity for Innovation

Robotics

Home, surveillance & analytics

VR/MR

IoT

Shipping & logistics

Mobile

Drones

Automotive

The Opportunity and Challenge of AI

• Autonomous driving and industrial applications

• Connected services predicted to be very valuable

AI and Machine Learning in 2020: devices, algorithms, and connected services

• Robotics open up server and knowledge-sharing services

• Algorithms change daily

$4.8 billion for chips

Machine Learning is a Subset of Artificial Intelligence

AI means many things to many people

Artificial Intelligence

Machine Learning

Perception & Vision

Natural Language Processing

Knowledge Representation

Planning & Navigation

Generalized Intelligence

ML itself has a lot of depth

Why Artificial Intelligence (AI) is Exploding Now

Availability of increased data sourced at the edge with ubiquitous powerful compute!

Compute Data

2016 – 1 zettabyte

2020 – 2.3 zettabyte

IP Traffic

2010

2015

zettabyte = 10^21 bytes
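
The two IP traffic figures above can be turned into an annual growth rate. The following is a minimal sketch of that arithmetic, assuming both figures are annual totals and that growth compounds yearly (an assumption; the slide gives only the two endpoints).

```python
# Figures from the slide; the yearly-compounding model is an assumption.
ZETTABYTE = 10**21  # bytes

traffic_2016 = 1.0 * ZETTABYTE
traffic_2020 = 2.3 * ZETTABYTE
years = 2020 - 2016

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (traffic_2020 / traffic_2016) ** (1 / years) - 1
print(f"Implied annual growth: {cagr:.1%}")
```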

Neural Networks (NN) Can Now Outperform Humans

Data for ImageNet Large Scale Visual Recognition Challenge

Deep learning introduced in 2012, resulting in big improvements

Error rates have now stabilized at ~3%

[Chart: Top-5 error rate (%) on ImageNet, axis 0 to 30, comparing computer vision and deep learning results against the human error rate.]

(Source: ImageNet and Andrej Karpathy)

Distributed Intelligence

Cloud servers: training + inference

Regional servers: training + inference

Edge devices: sensing, training, inference & actuation

Capabilities Migrating to the Edge

Why is On-device ML Driving AI to the Edge?

Bandwidth, Privacy, Latency, Cost, Power

AI Applications at the Edge on Arm

Detect plant diseases, sort cucumbers, detect Caltrain delays

The Arm ML Platform

Arm ML Platform Enables

Flexibility, Efficiency, Freedom

Components of Arm ML Platform

Software, Hardware, Specialized Acceleration

Software Development

Software Architecture Overview

Applications

Third-party libraries and benchmarks

Domain-specific high-level libraries: Mobile, Autonomous, People

TensorFlow, Caffe, MXNet, Torch

Android NN

Compute libraries for NEON, GPU

Programmable

CPUs: Arm Cortex-M, CPUs: Cortex-A, GPUs: Arm Mali, Spirit, 3rd-party accelerators

Compute Library from Arm

Faster, advanced processing

Functions for CV and deep-learning algorithms

Optimized for Arm CPU and GPU

OS and platform agnostic

No fee, MIT license

Use as a plug-in backend for your own runtime implementation

What is the Compute Library?

Delivers faster processing; offers OpenCV and OpenVX compatibility

Available now: https://developer.arm.com/technologies/compute-library

4.6x faster than stock OpenCV on NEON

Compute Library from Arm

Partners Functions

80+

Hardware

ML on Cortex CPUs

Instruction Sets for AI

Cortex-A

• Additional dot product instructions (Cortex-A55 and Cortex-A75)

• New Scalable Vector Extension (SVE) instructions

• Flexibility in multi-core computing with Arm DynamIQ technology

Cortex-M

• Optimized CMSIS-DSP libraries for matrix multiplication

Closely-coupled acceleration

• Improved performance and efficiency (for broader use cases)

• Connect accelerators with DynamIQ
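
As a rough illustration of what the dot product instructions listed above compute: in the Armv8.2-A dot product extension (SDOT/UDOT), each 32-bit accumulator lane receives the sum of four 8-bit by 8-bit products. The pure-Python sketch below emulates one signed 128-bit operation. The function name and the list-based register layout are illustrative only and are not an Arm API.

```python
def sdot_emulated(acc, a, b):
    """Emulate a signed 8-bit dot product over four 32-bit lanes.

    acc: 4 signed 32-bit accumulators
    a, b: 16 signed 8-bit values each (four per lane)
    """
    assert len(acc) == 4 and len(a) == 16 and len(b) == 16
    assert all(-128 <= v <= 127 for v in a + b)
    out = []
    for lane in range(4):
        total = acc[lane]
        for i in range(4):
            total += a[4 * lane + i] * b[4 * lane + i]
        # Wrap to signed 32 bits, as a fixed-width register would.
        total = (total + 2**31) % 2**32 - 2**31
        out.append(total)
    return out


acc = [0, 0, 0, 0]
a = [1, 2, 3, 4] * 4
b = [1, 1, 1, 1, 2, 2, 2, 2, -1, -1, -1, -1, 0, 0, 0, 0]
result = sdot_emulated(acc, a, b)
print(result)  # [10, 20, -10, 0]
```

Each lane replaces four separate multiply-accumulate steps with a single operation, which is why 8-bit quantized inference benefits from these instructions.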

New DynamIQ-based CPUs for New Possibilities

>50% more performance compared to current devices

2.5x greater power efficiency compared to current devices

Estimated device performance using SPECINT2006; final device results may vary.

Comparison using Cortex-A73 at 2.4GHz vs Cortex-A75 at 3GHz

Comparison using Cortex-A53 in 28nm devices vs Cortex-A55 in 16nm devices

Cortex-A75 processor, Cortex-A55 processor
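
The Cortex-A73 versus Cortex-A75 comparison above uses different clock speeds, so part of any gain comes from frequency alone. A quick sketch of that arithmetic follows, assuming the ">50%" figure refers to this comparison (the slide does not pair the claims and footnotes explicitly) and treating 50% as a lower bound.

```python
# Assumption: ">50% more performance" applies to the A73 vs A75 comparison.
total_speedup = 1.50      # lower bound taken from ">50%"
freq_a73_ghz = 2.4
freq_a75_ghz = 3.0

# How much of the speedup is explained by clock frequency alone.
freq_ratio = freq_a75_ghz / freq_a73_ghz

# What remains must come from per-clock improvements.
per_clock_gain = total_speedup / freq_ratio

print(f"Frequency ratio: {freq_ratio:.2f}x")
print(f"Implied per-clock gain: at least {per_clock_gain:.2f}x")
```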

DynamIQ: New Cluster Design for New Cores

Arm DynamIQ big.LITTLE systems:

• Greater product differentiation and scalability

• Improved energy efficiency and performance

• SW compatibility with Energy Aware Scheduling (EAS)

Private L2 and shared L3 caches

• Local cache close to processors

• L3 cache shared between all cores

DynamIQ Shared Unit (DSU)

• Contains L3, Snoop Control Unit (SCU) and all cluster interfaces

Additional instructions for ML

Example DynamIQ big.LITTLE configurations: 1b+2L, 1b+3L, 1b+4L, 1b+7L, 2b+6L, 4b+4L

[Figure: DynamIQ Shared Unit (DSU) block diagram. Labels: Cortex-A75 32b/64b core with private L2 cache; Cortex-A55 32b/64b core with private L2 cache; async bridges; SCU; shared L3 cache; ACP; peripheral port; AMBA4 ACE.]
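
The example configurations on this slide use a compact notation such as 1b+7L (one big core plus seven LITTLE cores). The sketch below parses that notation and checks each example against an eight-core cluster limit. The limit reflects DynamIQ's maximum of eight cores per cluster; the helper name is hypothetical.

```python
import re

MAX_CORES_PER_CLUSTER = 8  # assumption: single DynamIQ cluster limit


def parse_config(text):
    """Parse a string such as '1b+7L' into (big, little) core counts."""
    match = re.fullmatch(r"(\d+)b\+(\d+)L", text)
    if match is None:
        raise ValueError(f"unrecognized configuration: {text!r}")
    return int(match.group(1)), int(match.group(2))


examples = ["1b+2L", "1b+3L", "1b+4L", "1b+7L", "2b+6L", "4b+4L"]
parsed = {name: parse_config(name) for name in examples}

for name, (big, little) in parsed.items():
    total = big + little
    assert total <= MAX_CORES_PER_CLUSTER
    print(f"{name}: {big} big + {little} LITTLE = {total} cores")
```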

ML on Mali GPUs

Mali GPUs: Increasing ML Throughput and Efficiency

[Chart: "Increasing efficiency". Relative energy efficiency of two GPU series, axis 0.9 to 1.2, showing a 17% efficiency gain.]

• GEMM depicts core functionality of ML algorithms

• Mali-G72 has several optimizations to improve ML inference

• Less power-hungry FMA unit

• Bigger L1 cache in the execution engine

• Mali-G72 is the most efficient Mali GPU for machine learning
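
To make the GEMM point concrete: a fully connected layer is a matrix multiplication plus a bias, and convolutions are commonly lowered to the same operation. Below is a minimal pure-Python sketch of the general form C = alpha*A*B + beta*C. It is illustrative only and is not how a Mali GPU or the Compute Library implements it.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a)
    assert len(c) == rows and all(len(row) == cols for row in c)
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                # One multiply-accumulate per inner step; this is the
                # operation that hardware FMA units execute.
                acc += a[i][k] * b[k][j]
            out_row.append(alpha * acc + beta * c[i][j])
        out.append(out_row)
    return out


# Fully connected layer: 1 input sample with 3 features, 2 output units.
x = [[1.0, 2.0, 3.0]]
weights = [[0.5, -1.0], [0.25, 0.0], [1.0, 2.0]]
bias = [[0.1, 0.2]]
y = gemm(1.0, x, weights, 1.0, bias)
print(y)
```

The inner loop is almost entirely multiply-accumulate work, which is why the FMA unit and the cache feeding it dominate GEMM efficiency.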

Specialized Acceleration

Computer Vision (CV)

Spirit: Object Detection at the Edge

Direct from sensor (no ISP)

Real-time

High resolution, wide range of scale

Very detailed object description

Image analysis, small area, energy efficient

Trajectory, Pose, Identity, Gesture

[Figure: example detections. Labels: head facing right; head facing forwards; upper body facing right; upper body facing forward; full body facing right; full body facing forward; person being tracked.]

Spirit for Object Detection and Localization

[Figure: Spirit pipeline. Labels: Sensor; ISP; Spirit CV pre-processor (sensor interface, feature extraction, classifier 1, classifier 2); image stream; metadata stream (regions of interest); CPU; GPU; acceleration; example identities "Ben" and "Beth".]
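
The labels on this slide suggest a pipeline in which a pre-processor at the sensor interface runs feature extraction and classifier stages, then emits a metadata stream of regions of interest alongside the image stream. A hypothetical sketch of that data flow follows. Every name here (RegionOfInterest, extract_features, classify, process) and the toy brightness threshold are invented for illustration and do not reflect Arm's implementation; a single classifier stage stands in for the two on the slide.

```python
from dataclasses import dataclass


@dataclass
class RegionOfInterest:
    x: int
    y: int
    width: int
    height: int
    label: str


def extract_features(frame, window=2):
    """Toy feature: mean brightness of each non-overlapping window."""
    feats = []
    for y in range(0, len(frame) - window + 1, window):
        for x in range(0, len(frame[0]) - window + 1, window):
            pixels = [frame[y + dy][x + dx]
                      for dy in range(window) for dx in range(window)]
            feats.append((x, y, sum(pixels) / len(pixels)))
    return feats


def classify(features, window=2, threshold=128):
    """Toy classifier: flag bright windows as regions of interest."""
    return [RegionOfInterest(x, y, window, window, "object")
            for x, y, mean in features if mean >= threshold]


def process(frame):
    """Return the image stream unchanged plus the ROI metadata stream."""
    metadata = classify(extract_features(frame))
    return frame, metadata


frame = [
    [0, 0, 200, 210],
    [0, 0, 220, 230],
    [10, 10, 0, 0],
    [10, 10, 0, 0],
]
image, rois = process(frame)
print(rois)
```

Downstream stages on the CPU or GPU would then only need to examine the flagged regions rather than the full frame.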

Comparison with Neural Network Framework Solutions

SSD Neural Network

Yolo

Spirit

Summary

Arm’s ML Computing Platform

Flexible software with standard APIs and ML frameworks simplifies implementation and provides portability

+ Power-efficient and scalable architecture enables AI on battery-constrained devices

+ World’s largest ecosystem for devices delivers broad applicability and rich capabilities

Greater capability for ML solutions

For further information…

steve.steele@arm.com

https://developer.arm.com

Thank You! Danke! Merci! 谢谢! ありがとう! Gracias! Kiitos!

The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks
