Architecting Always-On, Context-Aware, On-Device AI Using Flexible Low-power FPGAs

Deepak Boppana – Senior Director Product & Segment Marketing

Gordon Hands – Director Solutions Marketing


Post on 26-Jun-2020


TRANSCRIPT

Page 1: Architecting Always-On, Context-Aware, On-Device AI Using Flexible Low-power FPGAs

Deepak Boppana – Senior Director Product & Segment Marketing

Gordon Hands – Director Solutions Marketing

Page 2: Rapidly Emerging Edge Computing Trend

Driven by Latency, Privacy, and Bandwidth Limitations

Unit growth for edge devices with AI will explode, increasing at over 110% CAGR over the next five years – Semico Research

Diagram labels: Edge, Networking, Cloud; IoT, Communication Gateway, Wireless / Wireline Access, Core Network

www.latticesemi.com/sensAI

Page 3: Always-on, On-device AI Applications

Human Presence Detection Example

• Smart Home Appliance: LCD turns on when needed

• Consumer Electronics: TV turns off when no one is present

• Smart Doorbell: Rings automatically when needed

• Security Camera: Alerts when intruder present, not a cat

• Smart Doors: Opens when person is present

• Vending Machine: LCD turns on when needed

Page 4: Always-on, On-device AI Applications

Other Examples

• Smart speakers: Key phrase detection

• Retail store cameras: Face tracking

• Selfie drones: Face tracking

• Toll gate camera: Vehicle classification

• Machine vision: Object counting

• After market automotive cameras: Speed sign detection

Page 5: Always-on, On-device AI Requirements

Unmet Need for Ultra-Low Power, Scalable, and Flexible Inferencing

• Few mWs of Power Consumption

• Customized Performance/Accuracy

• Flexible Legacy Interface Support: I2C, SPI, PCIe, Ethernet, USB

• Few mm² of Board Area

• Few $s of BOM Cost Adder

Diagram label: Neural Network Accelerator

Page 6

Ultra Low Power, Small Form Factor, Customizable Neural Network Accelerators

HARDWARE PLATFORMS: Mobile Development Platform – iCE40 UltraPlus FPGA (1 mW, 5.5 mm², 1/16 bits); Video Interface Platform – ECP5 FPGA (1 W, 100 mm², 1/8/16 bits)

IP CORES: CNN Compact Accelerator; CNN Accelerator

SOFTWARE TOOLS: Neural Network Compiler

REFERENCE DESIGNS / DEMOS: Face Detection, Speed Sign Detection, Key Phrase Detection, Face Tracking, Object Counting, Human Presence Detection, Hand Gesture Detection

CUSTOM DESIGN SERVICES: Mobile, Smart Home, Smart City, Smart Car, Smart Factory

Page 7: Flexible and Scalable Inferencing at the Edge

From under 1 mW to 1 W with Lattice sensAI

[Chart: power (W) versus performance (billions of neural ops per second), both on logarithmic axes, with regions labeled MCU zone, MPU zone, GPU zone, and high-end FPGA zone]

Page 8: Stand-alone, Integrated FPGA Solution

Always-on, integrated solutions on ECP5 or iCE40 UltraPlus FPGA

Low latency and secure implementation

Small form factor packages from 5.5 mm² to 100 mm²

Block diagram, iCE40 UltraPlus: Programmable FPGA Fabric, 5,280 LUTs, 120 Kb Block RAM, 8 DSP Blocks, 1 Mb RAM, NVCM, I/Os (new data in, result out)

Block diagram, ECP5-85: Programmable FPGA Fabric, 85,000 LUTs, 3.7 Mb Block RAM, 156 DSP Blocks, I/Os (new data in, result out)

Page 9: FPGA as Activity Gate to ASIC/ASSP

iCE40 UltraPlus FPGA for always-on detection of key phrases or objects

Wakes up a high-performance ASIC/ASSP for further analytics only when required

Reduces overall system power consumption

Block diagram: VGA Image Sensor feeds the iCE40 UltraPlus (Camera I/F, Down scale to 32 x 32, Neural Network IP, SRAM for weights / activations), which passes RESULTS to the ASIC/ASSP
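The gating behaviour on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Lattice's implementation: `downscale`, `placeholder_score`, and `should_wake` are hypothetical names, the block-averaging downscale and the 0.5 threshold are assumptions, and the brightness-based score merely stands in for the neural network IP.

```python
def downscale(frame, out_w=32, out_h=32):
    """Block-average a 2-D list of pixels down to out_h x out_w.

    Assumes the frame dimensions are exact multiples of the output size
    (true for a 640 x 480 VGA frame and a 32 x 32 target).
    """
    in_h, in_w = len(frame), len(frame[0])
    bh, bw = in_h // out_h, in_w // out_w  # source block covered by one output pixel
    small = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            total = 0
            for y in range(oy * bh, (oy + 1) * bh):
                total += sum(frame[y][ox * bw:(ox + 1) * bw])
            row.append(total // (bh * bw))
        small.append(row)
    return small


def placeholder_score(small):
    """Stand-in for the neural network (assumption): mean brightness in 0..1."""
    pixels = [p for row in small for p in row]
    return sum(pixels) / (255 * len(pixels))


def should_wake(score, threshold=0.5):
    """Wake the downstream ASIC/ASSP only when the score reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    bright = [[255] * 640 for _ in range(480)]
    dark = [[0] * 640 for _ in range(480)]
    for name, frame in (("bright", bright), ("dark", dark)):
        small = downscale(frame)
        print(name, "-> wake:", should_wake(placeholder_score(small)))
```

The point of the structure is that the expensive device is consulted only through `should_wake`; everything before that call runs on the small, always-on path.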

Page 10: FPGA as a Co-Processor to MCU

Scalable performance/power with ECP5 based neural network acceleration

ECP5 based IO flexibility to seamlessly interface to on-board legacy devices including sensors

Low-end MCU for flexible system control

Block diagram: ECP5-85/45/25 containing SPI to DDR loader, NN Accelerators (8/4/2 engines), and ISP Engine; external DDR Memory and SPI Memory; Legacy MCU for system control

Page 11: Delivering Edge CNN Acceleration in Lattice FPGA

Diagram labels: Neural Network Compiler, Instructions, RTL, CNN Accelerator IP, System Interfaces, FPGA Bitstream, Lattice FPGA

Page 12: CNN Accelerator IP Architecture

Block diagram labels: Control Unit; AXI Master (two instances); CMD Queue; State Machine; Memory Pool with Seq Gen 0 to Seq Gen 15 and Mem 0 to Mem 15; Engine Pool with Conv EU (two instances), FC EU, and Pooling EU; DRAM; Sequence; Parameters; Save/Load; Input/Output/Intermediate

Page 13: CNN Compact Accelerator IP Architecture

Block diagram labels: Control Unit; FIFO Control; Engine (Convolution, Scaler, ReLU, Pool, Fully Connected); Activation storage; Input; Output; Commands & weights

Page 14: Translating Trained Neural Network Into Lattice CNN Accelerator Instructions

1. Load 2. Review 3. Analyze 4. Compile 5. Simulate

Page 15: On-device AI – Complex Optimization

Correlation Between Design Factors and Product Attributes

Design Factors: Device, Network, # of Engines, Local Memory, Input Size, Number of Multipliers, Bit Widths

Attributes: Power (W), Device Size, Performance (fps), Accuracy (%), Small Object (% fov)
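To make the "Bit Widths" design factor concrete, the following Python sketch shows how a weight might be stored in signed fixed point and how the rounding error grows as bits are removed. It is illustrative only: the slides do not state the fixed-point format, so the split between integer and fractional bits (`frac_bits`) and the function names `quantize` and `dequantize` are assumptions.

```python
def quantize(value, total_bits=16, frac_bits=8):
    """Convert a float to a signed fixed-point integer, saturating on overflow.

    total_bits and frac_bits are illustrative; the real accelerator format
    is not given in the slides.
    """
    scaled = round(value * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def dequantize(q, frac_bits=8):
    """Map a fixed-point integer back to the float it represents."""
    return q / (1 << frac_bits)


if __name__ == "__main__":
    weight = 0.3
    for total_bits, frac_bits in ((16, 8), (8, 4)):
        q = quantize(weight, total_bits, frac_bits)
        approx = dequantize(q, frac_bits)
        print(f"{total_bits}-bit: stored={q} value={approx} error={abs(weight - approx):.5f}")
```

Narrower words reduce memory and multiplier cost, which is why bit width trades off against accuracy in the list of attributes above.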

Page 16: Examples for Illustration

Network                   Architecture  Number of Multiplications  Input Size  Quantization
Face Detection            VGG style     290,816                    32*32*3     16-bit fixed point
Face Detection            VGG style     14,353,920                 90*90*3     16-bit fixed point
Human Presence Detection  VGG style     8,570,880                  64*64*3     16-bit fixed point
Human Presence Detection  VGG style     338,558,976                128*128*3   16-bit fixed point
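As a rough guide to where multiplication counts like these come from, the sketch below counts multiplications for a single convolution or fully connected layer. The layer shape in the example is hypothetical and is not intended to reproduce the slide's totals, since the slides do not list the layers of each network; `conv_multiplications` and `fc_multiplications` are illustrative names.

```python
def conv_multiplications(in_h, in_w, in_c, out_c, kernel=3):
    """Multiplications for one stride-1, 'same'-padded convolution layer.

    Each of the in_h * in_w * out_c outputs needs kernel * kernel * in_c products.
    """
    return in_h * in_w * out_c * kernel * kernel * in_c


def fc_multiplications(inputs, outputs):
    """Multiplications for one fully connected layer."""
    return inputs * outputs


if __name__ == "__main__":
    # Hypothetical first layer: 3x3 kernels, 32x32x3 input, 8 output channels.
    print(conv_multiplications(32, 32, 3, 8))  # 221184

    # Ratios computed from the slide's own face detection figures.
    face_small, face_large = 290_816, 14_353_920
    print("pixel ratio:", (90 * 90) / (32 * 32))
    print("multiplication ratio:", face_large / face_small)
```

The multiplication ratio printed for the two face detection networks is considerably larger than the pixel ratio, which suggests the 90 x 90 network is also deeper or wider rather than the same network applied to a bigger image.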

Page 17: Image Based Neural Networks on Lattice FPGAs

Block diagram, ECP5: CNN Accelerator (1 – 8 engines, 0.25 – 2 Mbit Local Memory), SPI Memory, SPI to DDR loader, DDR Memory, ISP Engine, Overlay Engine

Block diagram, UltraPlus: CNN Accelerator (8 Multipliers, 0.5 – 1 Mbit Local Memory), SPI Memory, Down sample

Page 18: Image Based Neural Networks Lattice Hardware

Himax HM01B0 UPduino Shield

Embedded Vision Development Kit

Page 19: Face Detect Implementations

32 x 32 Input: 1 mW* (5.5 mm²), 0.5 W (100 mm²), 0.6 W (100 mm²), 0.8 W (100 mm²)

90 x 90 Input: 0.5 W (100 mm²), 0.6 W (100 mm²), 0.8 W (100 mm²)

* Running at 5 frames per second

Page 20: Human Presence Detect Implementations

64 x 64 Input: 7 mW* (5.5 mm²), 0.5 W (100 mm²), 0.6 W (100 mm²), 0.8 W (100 mm²)

128 x 128 Input: 0.5 W (100 mm²), 0.6 W (100 mm²), 0.8 W (100 mm²)

* Running at 5 frames per second

Page 21: Bringing It Together

Device Size / Power / Performance

Network                                Smallest Object  UltraPlus  ECP5-25  ECP5-45  ECP5-85
Power                                                   1 – 7 mW*  0.5 W    0.6 W    0.8 W
Device Size                                             5.5 mm²    100 mm²  100 mm²  100 mm²
Face Detection, 32 x 32 Input          50%              465        3360     4511     5251
Face Detection, 90 x 90 Input          20%              --         28       82       101
Human Presence Detect, 64 x 64 Input   20%              18         115      161      338
Human Presence Detect, 128 x 128 Input 10%              --         2.3      3.5      5.4

* Running at 5 frames per second
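One way to compare the devices in this table is energy per frame. The Python sketch below does that arithmetic under two assumptions that the slides do not confirm: that the table entries are frames per second (as the "Performance (fps)" attribute on the earlier slide suggests), and that each ECP5 power figure applies at the listed frame rate. The UltraPlus figures use the stated 5 frames per second operating point. The function names are illustrative.

```python
def energy_per_frame_mj(power_w, fps):
    """Energy per processed frame in millijoules (power in watts)."""
    return power_w / fps * 1000.0


def frames_per_joule(power_w, fps):
    """Frames processed per joule of energy."""
    return fps / power_w


if __name__ == "__main__":
    # UltraPlus: power is quoted at 5 frames per second on the slides.
    print("UltraPlus face detection, 32 x 32:", energy_per_frame_mj(0.001, 5), "mJ/frame")
    print("UltraPlus human presence, 64 x 64:", energy_per_frame_mj(0.007, 5), "mJ/frame")

    # ECP5, face detection 32 x 32. ASSUMPTION: listed power applies at listed rate.
    ecp5 = {"ECP5-25": (0.5, 3360), "ECP5-45": (0.6, 4511), "ECP5-85": (0.8, 5251)}
    for device, (power_w, fps) in ecp5.items():
        print(device, round(frames_per_joule(power_w, fps)), "frames per joule")
```

For an always-on product the relevant number is usually average power at the required frame rate rather than peak throughput, which is why the UltraPlus figures are stated at a fixed 5 frames per second.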

Page 22: Summary

AI at the edge solves real world problems

FPGAs can implement AI standalone or in conjunction with other components

sensAI stack components provide edge AI building blocks

• Silicon, soft IP, tools, development boards & reference designs

Configurable engine size and bit widths coupled with multiple target devices allow system optimization

• 1 mW – 1 W

• 5.5 mm² – 100 mm²

Page 23: Resources

Please visit latticesemi.com/sensAI for more information and downloads

4 ECP5 Based Reference Designs / Demonstrations – Free

4 iCE40 Based Reference Designs / Demonstrations – Free

CNN Accelerator IP – Free Evaluation

CNN Compact Accelerator IP – Free

Neural Network Compiler – Free

Embedded Vision Development Kit – $199 Promotional Price

Himax HM01B0 UPduino Shield – Available November ~$49

Page 24: Empowering Product Creators to Harness Embedded Vision

The Embedded Vision Alliance (www.Embedded-Vision.com) is a partnership of 90+ leading embedded vision technology and services suppliers, and solutions providers

Mission: Inspire and empower product creators to incorporate visual intelligence into their products

The Alliance provides low-cost, high-quality technical educational resources for product developers

Register for updates at www.Embedded-Vision.com

The Alliance enables vision technology providers to grow their businesses through leads, ecosystem partnerships, and insights

For membership, email us: [email protected]

© 2018 Embedded Vision Alliance

Page 25: Join us at the Embedded Vision Summit

May 20-23, 2019, Santa Clara, California

The only industry event focused on enabling product creators to create “machines that see”

• “Awesome! I was very inspired!”

• “Fantastic. Learned a lot and met great people.”

• “Wonderful speakers and informative exhibits!”

Embedded Vision Summit 2019 highlights:

• Inspiring keynotes by leading innovators

• High-quality, practical technical, business and product talks

• Exciting demos of the latest apps and technologies

Visit www.EmbeddedVisionSummit.com to sign up for updates

Page 26: Q & A

Page 27: Thank you