a 0.62mw ultra-low-power convolutional- neural · pdf file14.6: a 0.62mw ultra-low-power...

32
14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector © 2017 IEEE International Solid-State Circuits Conference 1 of 32 A 0.62mW Ultra-Low-Power Convolutional- Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar- Like Face Detector Kyeongryeol Bong, S. Choi, C. Kim, S. Kang, Y. Kim, and Hoi-Jun Yoo Semiconductor System Laboratory School of EE, KAIST

Upload: duongkiet

Post on 10-Feb-2018

227 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 1 of 32

A 0.62mW Ultra-Low-Power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar-

Like Face Detector

Kyeongryeol Bong, S. Choi, C. Kim, S. Kang, Y. Kim, and Hoi-Jun Yoo

Semiconductor System LaboratorySchool of EE, KAIST

Page 2: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 2 of 32

OutlineIntroductionProposed Face Recognition SystemLow-power Key Features

1. Analog-Digital Hybrid Haar-like Face Detector(HHFD)

2. Separable Filter approximation for convolutionallayers (SF-CONV)

3. Transpose-read SRAM (T-SRAM)Implementation ResultsConclusion

Page 3: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 3 of 32

DNN ResearchesPe

rfor

man

ce (T

OPS

)

<1

~10

<10

SoC0.1-0.3 TOPS90-400 mW

GPU5.75 TOPS

~250 WNervanaSys-32NVIDIA Titan X

FPGA ~1.2 TOPS

~40WArria 10 GX1150

Cloud DNN

Edge DNN

Mobile DNN

Power (W)~100<1

?

~10<0.1

Page 4: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 4 of 32

More Smart Devices Per PersonAlways-on Face Recognition(AoFR) for

“Unlock”(User Authentification)– Non-intrusive way for multiple devices

2003 2010 2015 2020

0.081.84

3.47

6.58

<Smart Devices Per Person>

Unlock!

<User Authentication w/ Face>

Page 5: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 5 of 32

Put Smart Face Recognition as New UINew UI for IoT

“Intelligence of Things”– Things can recognize users– More interactive with users

Requirements– Ultra-low power

consumption– Interactive intelligence

Intelligence of Things

Page 6: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 6 of 32

Low-power Face Recognition SystemConventional FR System [1]

Functional CIS + FR System

Transfer & Processonly detected face images

Page 7: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 7 of 32

Low-power Face Recognition SystemProposed Hybrid Face Detector

– Analog FD• Reject non-faced images instantly w/o initial processing

– Digital FD• Detect face location w/ cascaded classsifiers

Using Analog FD as Trigger for low-power FD

Page 8: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 8 of 32

Ultra-low Power DNN

Reduced Number Precision

PE Array Optimization

Algorithm

Architecture

Tensor Decomposition

Distributed Memory Architecture

Customized SRAM

Near-threshold Logic

Circuit

Page 9: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 9 of 32

Distributed Memory Architecture

Centered Global Memory Architecture Distributed Local Memory Architecture

• SRAM as a global buffer• Memory bandwidth• Data reuse distance• Architecture Scalability

• No global buffer• Memory bandwidth• Data reuse distance• Architecture Scalability

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

MEM MEM MEM MEM

MEM MEM MEM MEM

MEM MEM MEM MEM

MEM MEM MEM MEM

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

PE

[KU Leuven, Symp.VLSI 2016][MIT, ISSCC 2016] This Work

Page 10: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 10 of 32

Contribution of This Work1. 24μW Always-on 320x240 CIS w/ FD

1) Analog-digital hybrid Haar-like face detector (Low-power)

2. 0.6mW CNN Processor for FR1) Separable filter appr. for conv. layers (Low-power)2) Transpose-read SRAM (Low-power)3) Voltage & frequency scaling (Low-power)

0.62mW Low-power Always-on Face Recognition System

Page 11: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 11 of 32

Overall System Architecture1. 320x240 CIS with Always-on FD2. CNN Processor (CNNP) for FR

Page 12: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 12 of 32

Cascaded Classifiers w/ Haar-like Filters

Haar-like Filter– Intensity summation over

rectangles (black and white)Cascade Classifier

– >10 Classifier Stages– >1000 Haar-like filters

Check

<Classifier Stage(CS)>

Page 13: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 13 of 32

Analog & Digital Haar-like Filtering Digital Haar-like Filtering Unit (DHFU)

– Integral Image Gen. Block sum becomes 3 ADD/SUB Analog Haar-like Filtering Circuit (AHFC)

– Analog MEM: keep voltage intensities read from CIS pixels– Haar-like filt. circuit: charge-mode sum using analog MEM

Ana

log

Mem

ory

Switch Network

Digital Input Image Buffer

Inte

gral

Imag

e G

en. U

nit

Digital Haar-like Filt. UnitAnalog Haar-like Filt.

Circuit

A BC DE F

= D – C – B + A= F – E – D + C

Page 14: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 14 of 32

Analog Haar-like Filtering Circuit 80x20 Analog Memory

– Keep voltage intensities read out from pixels

– 1 storage cap.– 1 unity gain buffer– 1 output cap. & several switches

Analog Haar-like Filtering Circuit– Charge-mode summation

• Black rectangle Cblack

• White rectangle CwhiteA

-MEM

Cel

l (6,

4)

A-M

EM C

ell (

6, 3

)A

nalo

g H

aar-

like

Filte

ring

Circ

uit

RST

RST

Col

_BC

ol_W

Page 15: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 15 of 32

Analog vs. Digital Haar-like FilteringDigital Haar-like Filtering Unit (DHFU)

– Large initial processing (integral image gen.)– Simple operations for filtering

Efficient for long-lived windows

Analog Haar-like Filtering Circuit (AHFC)– Static current in analog MEM & filtering circuit– Direct processing (no initial processing)

Efficient for early-rejected windows

Page 16: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 16 of 32

Proposed Analog-Digital Hybrid FDAHFC for 1st Stage

– Static current during only 1 stageDHFU for Remaining Stages

– Integral image gen. for only passed windows

Page 17: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 17 of 32

Energy-efficiency in Hybrid FD> 60% windows are rejected at 1st Stage

– 60% workload reduction for integral image gen.With Hybrid FD, overall energy consumption

is reduced by 40%

Digital Processing

12

3

6

9

0Hybrid

ProcessingCascade-classifier Stage

1

0.25

0.5

0.75

1 2 3 4 50

>60% Rejected in the 1st stage

Page 18: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 18 of 32

Separable Filter Approximation

Conv. with 2D × Filter

[1] Smith, Steven W, 1997

Conv. with 1D × X-Filter + Conv. with 1D × Y-Filter[1]

d

d

d1

1

d[ [

d

d

d

d

1-D Y-filte

r (dx1)

1-D X-filter (dx1)

Page 19: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 19 of 32

Workload Reduction w/ SF-CONV

<Conv. w/o Approximation> <Conv. w/ Separable Filter>

1-D Vert. Filter

1-D Hor. Filter

c dx1

1xdOut

Input

1-D filtered c

d1

1d

dd

1-D Separable Kernel

c dxdInput Out

Page 20: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 20 of 32

Workload Reduction w/ SF-CONV2x ~ 3x Workload (# of MAC) Reduction

– Enables low-power CNN processing< 1% Accuracy Degradation

*Tested with LFW (Labeled Face in Wild) dataset

Page 21: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 21 of 32

Memory Access in SF-CONVInefficient Memory Access for Image Fetching

– Inefficient Vertical image filtering– 4.7x increased activity factor (∝ PDyn)

Inp.

C0

Inp.

C0

<Hor. Filtering> <Ver. Filtering>

4.7x

Page 22: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 22 of 32

Transpose-Read SRAM (T-SRAM)V-WD & V-SA: transposed read/write

– Vertical wordline-driver & Sense amplifier– Output vertical 1-D vector

H-WD & H-SA: conventional read/write– Output horizontal 1-D vector

(Transposed Mode)(T-SRAM Architecture) (Conventional Mode)

H-SAV-SAH-WD

V-WD

Vert. 1-D Vector

Output

V-WD

H-SAV-SAH-WD

SRAMCell Array

Hor. 1-DVector

Output

H-SAV-SAH-WD

V-WD

Page 23: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 23 of 32

SF-CONV w/ T-SRAM

1-D Vert. Filtering 1-D Hor. Filtering

c dx1 1xd

Input Image 1-D Vert. Filtered Image Output Image

(Transposed Read)

Input Image in T-SRAM

(Conv. Read)

1-D vert. filtered Image

1-D Vert. Filter

1-D Hor. Filter

Output Image

H-SAV-SAH-WD

V-WD V-WD

H-SAV-SAH-WD

Page 24: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 24 of 32

Transpose-Read SRAMTwo Read Modes

– Conventional Read & Transpose Read7T SRAM Cell

– Single-ended read using decoupled MOS– Bitline & wordline share between conventional

read & transpose read

WB

LB

WWL

VDD

WB

L

H_RDBL(V_RDWL)

IOCell Array

WL

Driv

.

SA

WL Driv.

SA

Ban

k 0

Decoder

Dec

oder

Ctrl

H_RDWL(V_RDBL)

Page 25: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 25 of 32

Transpose-Read SRAMTwo Read Modes

– Conventional Read & Transpose Read7T SRAM Cell

– Single-ended read using decoupled MOS– Bitline & wordline share between conventional

read & transpose read

WB

LB

WWL

VDD

WB

L

V_RDWL(H_RDBL)

IOCell Array

WL

Driv

.

SA

WL Driv.

SA

Ban

k 0

Decoder

Dec

oder

Ctrl

V_RDBL(H_RDWL)

Page 26: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 26 of 32

Memory Access in SF-CONV w/ T-SRAM

Efficient Memory Access for Vertical FilteringUsing Transpose Read– 4.2x decreased activity factor (∝ PDyn)

<Hor. Filtering> <Ver. Filtering>

Inp.

C0

Inp.

C0

Page 27: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 27 of 32

Chip Photograph

FD: 3.3mm x 3.36mm CIS in 65nm– 320 x 240 pixel array, Haar-like filtering circuit & analog MEM

FR: 4mm x 4mm CNNP in 65nm– 4 x 4 PE array with local T-SRAM

Page 28: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 28 of 32

NTV with Voltage & Freq. Scaling 2x Energy Efficiency Improvement

– 211mW @ 100MHz, 0.8V– 5.3mW @ 5MHz, 0.46V

Page 29: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 29 of 32

System Implementation

2.8c

m

5.3cm

FR Layer

FD Layer

ULP-DNNProcessor FPGA

FunctionalCIS

Page 30: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 30 of 32

Experiment Results

Only < 1% accuracy loss from SF-CONV 97% achieved using CNN @ LFW dataset

<Experiment Setup> <FR Result>

Page 31: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 31 of 32

Performance Comparison

The Most Accurate Face Recognition SoC– x1.20 accuracy improvement with CNN (97%)

Ultra-Low-Power Consumption– 0.62mW power consumption w/ imaging

SOVC '15

Algorithm

This Work

FD: Haar-like Cascade Classifier FD: Haar-like Cascade Classifier

Resolution HD QVGA

Framerate 5.5fps 1fps

Power 23mW @ 600mV, 100MHz(w/o imaging)

0.62mW @ 460mV, 5MHz(w/ imaging)

Accuracy 81% @ 32-class in LFW 97% @ LFW

Technology TSMC 40nm Samsung 65nm

FR: PCA + SVM FR: CNN

Page 32: A 0.62mW Ultra-Low-Power Convolutional- Neural · PDF file14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on

14.6: A 0.62mW Ultra-Low-power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-on Haar-Like Face Detector

© 2017 IEEE International Solid-State Circuits Conference 32 of 32

ConclusionAn Ultra-Low-Power Face Recognition

System with CIS and CNNUltra-Low-Power Features:

1. Analog-Digital Hybrid Haar-like Face Detector2. Separable Filter Approximation for Conv. Layers3. Transpose-Read SRAM4. NTV Implementation with Voltage & Freq. Scaling

0.62mW Face Recognition System is Realized for Always-on and IoT User

Interface Application