Low Power Embedded Gesture Recognition Using Novel Short-Range Radar Sensors
Michele Magno – ETH Zurich
May 28, 2020
tinyML Talks Sponsors
Additional Sponsorships available – contact [email protected] for info
www.edgeimpulse.com
Next tinyML Talk
Date: Tuesday, June 9
- Igor Fedorov, Arm ML Research, ARM Machine Learning Lab: "SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers"
- Dominic Binks, VP Technology, Audio Analytic Ltd.: "tinyML doesn't need Big Data, it needs Great Data"
Webcast start time is 8 am Pacific time. Each presentation is approximately 30 minutes in length.
Please contact [email protected] if you are interested in presenting
Michele Magno
Dr. Magno is a senior scientist and head of the Project-based Learning Centre at ETH Zurich. He has been working at ETH since 2013 and has been a visiting lecturer or professor at several universities, namely the University of Nice Sophia Antipolis, France, Enssat Lannion, France, the University of Bologna, Italy, and Mid Sweden University. Dr. Magno is a Senior Member of IEEE, a finalist of the ETH Spark Award 2018, and a recipient of many other awards and grants. His background is in computer science and electrical engineering.
||D-ITET Center for Project-Based Learning, Integrated Systems Laboratory, ETH Zürich
Dr. Michele Magno
Credits: Jonas Erb, Moritz Scherer, Philipp Mayer, Manuel Eggimann, Prof. Dr. Luca Benini
29/05/2020 5
Low Power Embedded Gesture Recognition Using Novel
Short-Range Radar Sensors
Michele Magno
Introduction: Hand Gesture Recognition
Gesture recognition is the field of computer science that interprets human gestures via mathematical algorithms.
Human-machine interfaces
Cameras, inertial sensors, and other sensors
Soli: ubiquitous gesture sensing with
millimeter wave radar (SIGGRAPH
2016)
Radar
Classical airplane-detection radar: 1-12 GHz (wavelength = 2.5-30 cm)
Short-range radar: 60 GHz (wavelength = 5 mm)
Data from sensor:
Sweeps at 160 Hz or higher, each over the full range
Time and range discrete signal 𝑆[𝑡, 𝑟]
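As a sketch of how the sweep stream becomes the discrete signal S[t, r] (a minimal illustration; the 160 Hz rate, 32-sweep windows, and 492 range points are assumptions taken from later slides):

```python
import numpy as np

# Assumed parameters (taken from later slides in this deck)
SWEEP_RATE_HZ = 160       # sweeps per second
SWEEPS_PER_WINDOW = 32    # time steps per gesture window
RANGE_POINTS = 492        # discrete range bins per sweep

def frame_sweeps(stream):
    """Stack a flat stream of sweeps into windows S[t, r] with
    shape (n_windows, SWEEPS_PER_WINDOW, RANGE_POINTS)."""
    stream = np.asarray(stream)
    n_windows = len(stream) // SWEEPS_PER_WINDOW
    usable = stream[: n_windows * SWEEPS_PER_WINDOW]
    return usable.reshape(n_windows, SWEEPS_PER_WINDOW, RANGE_POINTS)

# One second of simulated sweeps -> five 32-sweep windows
sweeps = np.random.randn(SWEEP_RATE_HZ, RANGE_POINTS)
windows = frame_sweeps(sweeps)
print(windows.shape)  # (5, 32, 492)
```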
Promising Technology: Short Range Radar
[Figure: sweeps stacked over time and range; range resolution 0.483 mm]
Millimetre-wavelength pulsed coherent radar: it transmits radio signals in short pulses whose starting phase is well known.
Gesture Recognition Based on Radar: Google Soli Sensor
Increasing research on radar for gesture recognition1,2,3,4
Google developed micro-radar for gesture recognition
Good results on difficult hand-gestures: 87% accuracy on
11 gestures and 10 people
Why Google Soli is not tinyML?
Sensor consumes too much power (300mW)
Too much data to process (10’000 sweeps/s)
Prediction model too big for embedded systems (689MB)
CNN+RNN (LSTM) 4
The Pixel 4 algorithm (released 2020) may use a different model
1 Soli: Ubiquitous Gesture Sensing with Millimeter Wave Radar, 2016
2 Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum, 2016
3 Sparsity-based Dynamic Hand Gesture Recognition Using Micro-Doppler Signatures, 2017
4 A Hand Gesture Recognition Sensor Using Reflected Impulses, 2017
AI Workloads from Cloud to Edge (Extreme?)
[Figure: compute (GOP+) and memory (MB+) requirements of AI workloads from cloud to edge (culurciello18)]
Extreme edge AI challenge: AI capabilities in the power envelope of an MCU:
100 mW peak (1 mW avg)
512 KB RAM (best case 1 MB)
Implement radar-based hand-gesture recognition in an embedded system based on TCN+CNN, below 512 KiB
Create a dataset with fine-grained hand gestures:
at least 1000 samples per class and 20 users
Algorithm development suitable for embedded systems:
less than 1 MB, at least 700x smaller than I. w. Soli
Achieving similar accuracy to I. w. Soli:
85% (single-user), 87% (10 people) on 11 gestures1
Algorithm implementation on a milliwatt parallel RISC-V based processor and experimental evaluation of efficiency (power, run-time)
Main contributions of this work
1Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum. 2016
We want to bring radar-based gesture recognition to low-power embedded systems
But how?
Different sensor: Acconeer
Lower power
Less data:
lower sweep rate, fewer Tx, Rx
Toward TinyML and Low-Power Embedded Systems
Sensors: Soli | Acconeer
Carrier frequency: 60 GHz | 60 GHz
Sweep rate (up to): 10,000 Hz | 1,500 Hz
Power: 300 mW | 20 mW @ 100 Hz
Transmitters/receivers: Tx: 2, Rx: 4 | Tx: 1, Rx: 1
XR112
DATASET
2 Large datasets recorded:
26 people
1610 instances per gesture
Gesture Set:
Datasets & Gesture Set
Datasets: 5-Gesture | 11-Gesture
Sweep frequency: 256 Hz | 160 Hz
Sensors: 1 | 2
Sweep range: 10 cm to 30 cm | 6.1 cm to 29.9 cm
People: 1 | 26
Instances: 2500 | 17710
Tiny-ML Model
Proposed Tiny Gesture Recognition Embedded Model
Idea: Combine Information of multiple time steps
Combining TCN1+CNN
Memory below 512KB
High Accuracy
1-10 GOPS maximum
1C. Lea, M. D. Flynn, R. Vidal, A. Reiter, and G. D. Hager, “Temporal
convolutional networks for action segmentation and detection,” 2016
Range-Frequency Doppler Map:
Radar output: range- and time-discrete signal S[t, r]
Doppler map (DFT over the sweep/time axis):
S[f, r] = Σ_{t=0}^{L_doppler − 1} S[t, r] · e^(−2πi·f·t / L_doppler) = FFT_t(S[t, r])
Strong feature based on research1,2
Technical Background
1Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum. 2016
2Short-Range FMCW Monopulse Radar for Hand-Gesture Sensing, 2015
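As a rough NumPy sketch of this feature (shapes are assumptions taken from other slides: a 32-sweep window, 492 range points, L_doppler = 32), the Doppler map is just an FFT over the sweep (time) axis:

```python
import numpy as np

def range_doppler_map(window, l_doppler=32):
    """Range-Doppler feature |S[f, r]| from one window of sweeps.

    window: complex array S[t, r] of shape (l_doppler, n_range).
    The FFT along the time axis converts sweep-to-sweep phase
    changes (caused by hand motion) into Doppler frequency bins.
    """
    return np.abs(np.fft.fft(window, n=l_doppler, axis=0))

# Simulated complex radar window: 32 sweeps x 492 range points
rng = np.random.default_rng(0)
S = rng.standard_normal((32, 492)) + 1j * rng.standard_normal((32, 492))
dmap = range_doppler_map(S)
print(dmap.shape)  # (32, 492)
```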
Basic Feature Performance Evaluation (on 11 gestures)
Doppler feature performs best (also backed by research1,2)
To reduce model complexity and model size only Doppler feature used
1 Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum, 2016
2 Short-Range FMCW Monopulse Radar for Hand-Gesture Sensing, 2015
Algorithm Architecture: 2D CNN
Why CNN?
Condensed gesture
information
Representation ~100x
smaller than input
Extraction of semantic information in Doppler Map with CNN1,2. Time steps considered are 32 and the number of range points per sweep is 492, for 2 sensors.
With only CNN about 90% Accuracy on single user and 5 gestures
1Hand Gesture Recognition Using Micro-Doppler Signatures With Convolutional Neural Network. 2016
2Multi-sensor System for Driver’s Hand-Gesture Recognition, 2015
(492 × 32 × 2 + 98 × 10 × 16) values × 8 bit = 377,344 bit ≈ 368 kbit.
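The memory estimate above can be checked in a couple of lines (shapes as stated on the slide; 8-bit values assumed):

```python
# Input window: 492 range points x 32 time steps x 2 sensors;
# CNN feature map: 98 x 10 x 16 (values from the slide).
input_vals = 492 * 32 * 2
feature_vals = 98 * 10 * 16
total_bits = (input_vals + feature_vals) * 8  # 8-bit values
print(total_bits)         # 377344 bits
print(total_bits / 1024)  # ~368.5 Kibit
```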
Algorithm Architecture: Temporal Convolutional Neural Network (TCN1)
Idea: Combine information of
multiple time steps
98.9% accuracy with TCN, 2 sensors and 5 gestures!
1An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling, 2018
TCNs model time series with dilated causal convolutions: the architecture is causal, meaning there is no information "leakage" from the future to the past.
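A minimal sketch of a causal dilated convolution (illustrative only, not the actual TCN kernels used in this work): each output sample depends only on the current and past inputs, spaced by the dilation factor.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    """Causal dilated 1D convolution.

    y[t] = sum_j w[j] * x[t - dilation * (k - 1 - j)], i.e. the
    output at time t only sees x[t], x[t - d], x[t - 2d], ...
    (never future samples).
    """
    x = np.asarray(x, dtype=float)
    k = len(w)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])  # left-pad: causality
    y = np.zeros(len(x))
    for t in range(len(x)):
        for j in range(k):
            y[t] += w[j] * xp[pad + t - dilation * (k - 1 - j)]
    return y

x = np.arange(8, dtype=float)
# Kernel [1, 0] with dilation 2: y[t] = x[t - 2] (a pure delay)
print(causal_dilated_conv1d(x, [1.0, 0.0], dilation=2))
# [0. 0. 0. 1. 2. 3. 4. 5.]
```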
Reducing memory requirements
Layer structure of the residual blocks in
TCNs 1
1 C. Lea, M. D. Flynn, R. Vidal, A. Reiter, and G. D. Hager, “Temporal
convolutional networks for action segmentation and detection,” 2016.
Reduce residual block to save memory space and
execution time
Results of Prediction Model
Final accuracies:
Sensors | Dataset | Accuracy
1 | 5 gestures, single user | 93.83%
1 | 11 gestures, multiple users, random shuffle | 78.25%
2 | 5 gestures, single user | 98.86%
2 | 11 gestures, single user | 89.52%
2 | 11 gestures, multiple users, random shuffle | 81.52%
2 | 11 gestures, multiple users, leave-one-out cross-validation | 73.66%
Gesture spotting: 99.84%
Optimal Architecture: 2D CNN: Demo
Edge AI Platform
Edge-AI Platforms
[Figure: throughput (GOPS) vs. power/thermal envelope, from 0.1 W battery-operated near-sensor devices to 10 W platforms, spanning MCUs, multi-core MCUs, Linux-capable CPUs, FPGAs, embedded GP-GPUs, and smartphone ASICs]
Representative data points:
0.6 GFLOPS @ 0.1 W, $4
9 GFLOPS @ 0.07 W, $5
6.2 GFLOPS @ 6 W, $50
100 GOPS @ 1 W, $70
1 TOPS @ 2.5 W, $80
4 TOPS @ 3 W, $150
1 TOPS @ 20 W, $1000
32 TOPS @ 30 W, $1500
Successful product development: GWT’s GAP8
TinyML Implementation on Novel Platform
Hardware limitations of GAP8 require adaptation
of network for optimal execution:
Change to symmetric convolutions
Use only 2D convolutions (also for 1D)
Dilated causal 1D convolutions modelled as 2D
Penalties due to adaptation:
Accuracy: <0.5% change
MACs: original: 16MMACs, adapted: 21MMACs
But runs faster than non-symmetric convolutions
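One way to see how a dilated 1D convolution maps onto standard (2D) convolution hardware (an illustrative sketch, not the actual GAP8 kernels): insert zeros between the kernel taps so that a dense convolution computes the same result.

```python
import numpy as np

def dilate_kernel(w, dilation):
    """Insert (dilation - 1) zeros between taps: [a, b, c] with
    dilation 2 becomes [a, 0, b, 0, c]."""
    w = np.asarray(w, dtype=float)
    k = np.zeros((len(w) - 1) * dilation + 1)
    k[::dilation] = w
    return k

x = np.arange(10, dtype=float)
w = np.array([1.0, -1.0, 2.0])
d = 2

# Direct dilated correlation: y[t] = w0*x[t] + w1*x[t+d] + w2*x[t+2d]
direct = np.array([w[0] * x[t] + w[1] * x[t + d] + w[2] * x[t + 2 * d]
                   for t in range(len(x) - 2 * d)])

# Same result from a dense convolution with the zero-inflated kernel
dense = np.convolve(x, dilate_kernel(w, d)[::-1], mode="valid")
print(np.allclose(direct, dense))  # True
```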
GAP8 Implementation: Adaptation of Network
[Figure: network kernels before and after adaptation: 3x5, 3x5, 1x7; dilation factors 2, 2, 2]
GAP8 Implementation: Run-Time, Power Consumption, FFT
Execution length @ 100 MHz | Cycles
Network + FFT | 5.877 million
2D CNN | 5.079 million
TCN | 0.458 million
Fully connected layers | 0.086 million
FFT | 0.177 million
Power consumption:
5Hz Prediction Rate: only 21mW!
Much margin: up to 50Hz possible
Comparison with Cortex-M7 (STM32F746ZG):
The same calculation consumes 147 mW-588 mW
Running at its limit (@ 216 MHz, needing 850 ms for 5 predictions) *Work in Progress
@100MHz and 8 cores running:
total energy per frame: 4.2mJ
5 frames/s: 21mW
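The 21 mW figure follows directly from the per-frame energy (a simple arithmetic check of the slide's numbers):

```python
# Average power = energy per frame x frame rate
energy_per_frame_j = 4.2e-3   # 4.2 mJ per frame (from the slide)
frame_rate_hz = 5             # predictions per second
avg_power_w = energy_per_frame_j * frame_rate_hz
print(round(avg_power_w * 1e3, 1))  # 21.0 (mW)
```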
Comparison to “Interacting with Soli …” Paper1
Per-frame accuracy on 11-gesture datasets | Soli | TCN+CNN
Single-user leave-one-out (session) cross-validation | 85.75% | 89.52%
Multiple users, randomly shuffled | 87.17% | 81.52%
Multiple users, leave-one-out (person) cross-validation | 79.06% | 73.66%
Other properties | Soli | TCN+CNN
Model size / whole system | 689 MB / ND | 91 kB / 460 KiB
Dataset: total instances per gesture | 500 | 1610
Dataset: people | 10 | 26
Embedded implementation | No | Yes
Network power consumption | - | 21 mW
Sensor power consumption | 300 mW | <190 mW (worst case at 160 Hz; more realistic: 2 sensors @ 100 Hz: 40 mW)
1Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum. 2016
Conclusion
Radar-based gesture recognition for Tiny embedded systems!
Very large hand-gesture dataset created (3x bigger than I. w. Soli)
11 gestures, 17710 instances (1610 per gesture) and 26 people
Very accurate and memory efficient TCN based prediction algorithm
90% accuracy (single-user) and 82% (26 people) on 11 gestures
Model size: 91kB (7500x smaller than I. w. Soli)
Fast and power efficient implementation in GAP8 processor
Power consumption of only 21 mW (average at 5 inferences per second)
7500x smaller model size, 3x bigger dataset, 21mW implementation
and similar accuracy as previous work on Soli
Reference on arXiv out in a few days
TinyRadarNN: Combining Spatial and Temporal Convolutional Neural Networks for Embedded
Gesture Recognition with Short Range Radars.
Moritz Scherer, Michele Magno, Jonas Erb, Philipp Mayer, Manuel Eggimann, Luca Benini
Previous work
Eggimann, M., Erb, J., Mayer, P., Magno, M. and Benini, L., 2019, October. Low Power
Embedded Gesture Recognition Using Novel Short-Range Radar Sensors. In 2019 IEEE
SENSORS (pp. 1-4). IEEE.
Biggest and OPEN dataset on short-range radar for gesture recognition: we are launching it at the beginning of June 2020. Contact me to get it earlier.
Thank you very much for your attention
Q&A backup slides.
CMSIS-NN: ARM paper1 on efficient Neural Net implementations:
Network: 12.3 MMACs, run-time: 99 ms @ 216 MHz, most efficient implementation possible
For a 5 Hz prediction rate: 105 MMACs, 845 ms
GAP8 Measurements:
21 MMACs: 4.2 mJ, i.e. 0.20 mJ/MMAC
GAP8 is 7x – 30x more energy efficient
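The 7x-30x claim can be reproduced from the per-MMAC energies on these slides (simple division over the stated numbers):

```python
# GAP8: 4.2 mJ for 21 MMACs
gap8_mj_per_mmac = 4.2 / 21          # 0.20 mJ/MMAC

# STM32F746ZG estimate (following slide): 1.39-5.94 mJ/MMAC
low = 1.39 / gap8_mj_per_mmac
high = 5.94 / gap8_mj_per_mmac
print(round(low), round(high))       # 7 30
```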
STM32F746ZG Power Consumption Estimation
1CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs. 2018
Chip power consumption @ 216 MHz | Min (periph. off) | Max (periph. on)
Supply current | 102 mA | 205 mA
Supply voltage | 1.7 V | 3.6 V
Power consumption | 173 mW | 738 mW
Energy per MMAC | 1.39 mJ/MMAC | 5.94 mJ/MMAC
GAP8 Implementation: Run-Time, Power Consumption, FFT
Algorithm adaptability to user:
Add half of test person data to training set: 5% increase in accuracy
Arm Cortex-M Implementation using CMSIS-NN
Gesture spotting capability:
Algorithm detects random noise (no hand) with 99.8% accuracy
Evaluation of continuous instructions vs. single-instance instructions:
No difference in accuracy measured
Using only 1 sensor on 11 Gestures:
78.25% instead of 81.52%, a drop of only 3%!
Further Evaluations
Copyright Notice
This presentation in this publication was presented as a tinyML® Talks webcast. The content reflects the opinion of the author(s) and their respective companies. The inclusion of presentations in this publication does not constitute an endorsement by tinyML Foundation or the sponsors.
There is no copyright protection claimed by this publication. However, each presentation is the work of the authors and their respective companies and may contain copyrighted material. As such, it is strongly encouraged that any use reflect proper acknowledgement to the appropriate source. Any questions regarding the use of any materials presented should be directed to the author(s) or their companies.
tinyML is a registered trademark of the tinyML Foundation.
www.tinyML.org