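Key Technologies for Drone Vision: Neuromorphic Intelligent Vision Chips
Prof. Kea-Tiong Tang (鄭桂忠), Department of Electrical Engineering, National Tsing Hua University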
-
EN I AC Neuromorphic and Biomedical Engineering Lab PI: Prof. Kea-Tiong (Samuel) Tang, [email protected]
Outline
• Smart Agriculture: opportunities and challenges
• Neuromorphic and AI algorithms
• Neuromorphic sensor (Processing-In-Sensor)
• Neuromorphic architecture (Computing-In-Memory)
• Summary
-
Crop Farming Challenges
• Climate
• Soil quality
• Insect pests
• Microorganisms
-
Traditional Crop Farming Solution
Spraying Pesticide
• Pesticide pollution
• Requires many farm workers
• Human food crisis
-
Smart Agriculture For Crop Farming
AIoT Tractor
• Multi-camera image capture
• Object detection on a workstation computer
• Precise spraying of pesticides and fertilizers
Drone AI System
• Evaluate the farmland situation
• Monitor the crops
• Organize the farming work
-
Drone Obstacle Avoidance System
-
DJI Drone Obstacle Avoidance System
Current technology: obstacle avoidance based on radar
• Sensor: radar
• Power consumption: 12 W
Proposed technology: obstacle avoidance based on a neuromorphic algorithm
• Sensor: FPV camera
• Power consumption: 20 mW to 200 mW (about 60 to 600 times lower than the radar)
-
Linear motion / Parabolic motion
Blue box: local motion detection result. Green box: object motion prediction result.
-
Prediction Demonstration (Block Matching and Centroid Velocity)
[C.-C. Lo, NTHU, unpublished]
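The slide names centroid velocity as one ingredient of the prediction, but the underlying work is unpublished, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes the detector returns axis-aligned boxes and that the object moves at constant velocity between frames; the names `centroid` and `predict_centroid` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def centroid(box: Box) -> Point:
    """Return the centre (x, y) of an axis-aligned bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def predict_centroid(track: List[Point], steps_ahead: int = 1) -> Point:
    """Extrapolate a future centroid from the last two observed centroids.

    Assumes constant velocity between frames (an illustrative simplification).
    """
    if len(track) < 2:
        raise ValueError("need at least two centroids to estimate a velocity")
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = x1 - x0, y1 - y0  # velocity in pixels per frame
    return (x1 + vx * steps_ahead, y1 + vy * steps_ahead)


# Two detections of the same object in consecutive frames.
boxes = [(10, 10, 20, 20), (14, 12, 24, 22)]
track = [centroid(b) for b in boxes]
predicted = predict_centroid(track, steps_ahead=2)
print(predicted)
```

A constant-velocity model like this fits the linear-motion case; the parabolic case shown in the demonstration would need at least an acceleration term estimated from three or more frames.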
-
Drone Farmland Detection and Segmentation
-
Drone Farmland Detection and Segmentation
• Farmland detection and segmentation
• Obstacle labeling
• Flight path planning and mission scheduling
Proposed technology: algorithm running on the drone
Current technology: algorithm running on a laptop
-
Instance Detection and Segmentation
Faster R-CNN
FCN
He, Kaiming, et al. "Mask R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2017.
-
The More Accuracy You Want, the More Operations and Parameters You Need
Canziani, Alfredo, Adam Paszke, and Eugenio Culurciello. "An analysis of deep neural network models for practical applications." arXiv preprint arXiv:1605.07678 (2016).
-
Reuther, Albert, et al. "Survey and Benchmarking of Machine Learning Accelerators." arXiv preprint arXiv:1908.11348 (2019).
-
Future Drone System
• Farmland detection and segmentation
• Obstacle detection and avoidance system
• Flight path planning and mission scheduling
• Lightweight battery and low takeoff weight
More powerful neuromorphic models are needed!
Low-power and cost-aware AI chips are needed!
Processing data in the sensor and in memory could be the solution!
-
Schematic of the Mammalian Retina
(1) Rods
(2) Cones
(3) horizontal cells
(4) bipolar cells
(5) amacrine cells
(6) retinal ganglion cells
Ref: H. Wässle, "Parallel processing in the mammalian retina," Nature Reviews Neuroscience, vol. 5, pp. 1-11, October 2004.
-
The Visual Language
Multiple representations of the visual world
Werblin, UC Berkeley
-
Why We Need an Application-Driven CIS (CMOS Image Sensor)
• Motion detection
• Facial detection
• Dynamic images
• Feature descriptor
• Feature extraction
An application-driven CIS can process specific tasks in real time, giving low power and low latency.
-
Processing-In-Sensor
Digital still camera (raw image) + general-purpose ISP
• HDR
• Noise reduction
• Color correction
AI and deep learning: application-driven CIS + AI CNN processor
• Well-defined application
• Simplified software module
• Reduced hardware complexity
Adapts to low-power edge devices
-
Processing-In-Sensor (cont.)
• Digital still camera with ISP
  • Fixed/full-resolution digitization
  • High-resolution data transfer
  • High-capacity frame buffer
  • High-complexity ISP
• Application-driven CIS with AI CNN processor
  • High-speed feature extraction
  • Down-resolution digitization
  • Low bandwidth/latency/power
An application-driven CIS needs processing-in-sensor (PIS): higher energy efficiency for edge devices!
-
NTHU PIS Design
• Convolutional CIS (C2IS)
• Real-time feature extraction
-
NTHU PIS Architecture
• Real-time in-column convolution
• Programmable 3x3 kernel with 4-bit weights
• Tunable-resolution ADC
• Normal linear-response image
• Goal and advantage
  • Improve system power efficiency
  • Reduce data transfer and latency
  • 1st-stage feature extraction
Application: real-time feature extraction as the front end of a CNN processor
[C.-C. Hsieh, NTHU, ASSCC 2019]
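To make the "programmable 3x3 kernel with 4-bit weights" concrete, here is a small software sketch of what such a first-stage feature extraction computes. It is a functional illustration only, not a model of the in-column circuit. The signed range [-8, 7] for the 4-bit weights is an assumption (the slide does not state the encoding), and `quantize_weight_4bit` and `conv3x3` are hypothetical names.

```python
from typing import List

Image = List[List[int]]


def quantize_weight_4bit(w: int) -> int:
    """Clamp a weight to the signed 4-bit range [-8, 7] (assumed encoding)."""
    return max(-8, min(7, w))


def conv3x3(image: Image, kernel: Image) -> Image:
    """3x3 sliding-window weighted sum with no padding (cross-correlation form)."""
    k = [[quantize_weight_4bit(w) for w in row] for row in kernel]
    rows, cols = len(image), len(image[0])
    out: Image = []
    for r in range(rows - 2):
        out_row = []
        for c in range(cols - 2):
            acc = 0
            for dr in range(3):
                for dc in range(3):
                    acc += image[r + dr][c + dc] * k[dr][dc]
            out_row.append(acc)
        out.append(out_row)
    return out


# Vertical-edge kernel; every weight fits in the assumed 4-bit range.
edge_kernel = [[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]]

# 4x4 test image: dark left half, bright right half.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]

features = conv3x3(image, edge_kernel)
print(features)
```

Because the kernel is programmable, swapping `edge_kernel` for another 3x3 set of weights changes which feature is extracted without changing the rest of the pipeline.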
-
Chip Specification
• Process: TSMC 0.18 µm
• Operating voltage: 0.5 V
• Pixel: 7.6 µm
• Chip area: 1.9 × 2.3 mm²
[C.-C. Hsieh, NTHU, ASSCC 2019]
-
Live Demonstration
-
Weight and Processing in Neural Systems
1. Weights are accessed in every processing step!
2. Weights are stored very close to the processing unit!
-
Memory Access Energy Keeps Growing
• Model size has increased over the years for higher accuracy, which means more memory accesses
• The energy efficiency of memory has saturated recently, so the energy for memory access has become difficult to reduce
Source: X. Xu, Nature Electronics, 2018
-
Memory Wall in von Neumann Architecture
• Data moves across memory layers and the system bus: nonvolatile memory (NVM/SSD) - DRAM - SRAM (on-chip) - PE (processing element)
• NVM usually requires long read/write latency
• Long latency, high power consumption, high hardware cost!
• High-bandwidth memory is required
• A beyond-von-Neumann (new) architecture is required
Figure label: von Neumann "bottleneck"
Source: M.-F. Chang, ISSCC 2018 Tutorial & 31.4
-
Inference Architectures between PE and Memory
• Digital memory (SRAM bank + ALU / digital processing)
  • Data access energy and latency dominate
• Near memory (SRAM bank + digital processing)
  • Computation is still digital
  • Eliminates data transfer costs
  • Memory read energy dominates
• Deep in-memory (SRAM bank + mixed-signal processing)
  • Memory access and computation are combined
  • Mixed-signal computation
  • Significant energy and latency reduction
Figure axes: energy efficiency, transferred data
-
Concept of Nonvolatile Computing-In-Memory
Concept:
• Input: WLs (IN)
• Weight: cell data (W)
• Multiply: WLs x cell data
• Accumulation: BL current
• Analog-to-digital out
Truth table of one memory cell (HRS/LRS: high-/low-resistance state):
Input/WL (IN)   Weight/MC (W)   Product (IN x W)   I_MC
0               0 (HRS)         0                  0
0               +1 (LRS)        0                  0
1               +1 (LRS)        +1                 I_LRS
1               0 (HRS)         0                  I_HRS
Cell current:
WL ON: $I_{MC} = I_{LRS}$ ($W_{ij} = 1$), $I_{MC} = I_{HRS}$ ($W_{ij} = 0$)
WL OFF: $I_{MC} = 0$
Accumulation at BL:
$I_{BL}[j] = \sum_{i=0}^{N} I_{MC}[i,j]$
[Figure: macro block diagram. Labels: WL driver, reference generator, write control, analog-to-digital out, cell arrays, WL[0..n], BL[0..v], SL[0..v]]
[W.-H. Chen, NTHU, ISSCC2018]
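The truth table and the bit-line summation on this slide can be checked with a short behavioral sketch. It is an idealized model under stated assumptions, not the published circuit: the current values `I_LRS` and `I_HRS` are made-up numbers in arbitrary units, and decoding by rounding `I_BL / I_LRS` is a hypothetical stand-in for the analog-to-digital step.

```python
from typing import List

I_LRS = 10.0  # assumed cell current for W = 1 with WL on (arbitrary units)
I_HRS = 0.5   # assumed residual cell current for W = 0 with WL on


def cell_current(wl_on: int, weight: int) -> float:
    """Current of one memory cell, following the slide's truth table."""
    if not wl_on:
        return 0.0  # WL off: the cell contributes no current
    return I_LRS if weight == 1 else I_HRS


def bitline_currents(inputs: List[int], weights: List[List[int]]) -> List[float]:
    """I_BL[j] = sum over i of I_MC[i, j], one value per bit line (column)."""
    n_cols = len(weights[0])
    return [
        sum(cell_current(inputs[i], weights[i][j]) for i in range(len(inputs)))
        for j in range(n_cols)
    ]


def ideal_mac(inputs: List[int], weights: List[List[int]]) -> List[int]:
    """Reference digital result: sum over i of IN[i] * W[i][j]."""
    n_cols = len(weights[0])
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
            for j in range(n_cols)]


inputs = [1, 0, 1, 1]            # word-line inputs, one per row
weights = [[1, 0],               # binary weights: rows are WLs, columns are BLs
           [1, 1],
           [0, 1],
           [1, 0]]

currents = bitline_currents(inputs, weights)
decoded = [round(i_bl / I_LRS) for i_bl in currents]
print(currents, decoded, ideal_mac(inputs, weights))
```

The non-zero `I_HRS` term shows why sensing margin matters: every activated cell that stores 0 still adds a small current, so the bit-line current is only approximately proportional to the ideal multiply-and-accumulate result.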
-
Nonvolatile Computing-In-Memory (nvCIM)
• Stores the weights at power-off: suppresses data movement across memory layers
• Parallel in-memory multiply-and-accumulate (MAC): reduces the amount of intermediate data
Potentially low energy, low cost, and high performance!
$I_{BL}[j] = \sum_{i=0}^{n} IN_i \times W_{i,j}$
Figure label: nvCIM macro
-
Recent CIM Silicon Development Status (2016 to 2019)
ReRAM-CIM
• 1Mb RRAM, 2bIN/3bW CIM @ ISSCC19
• 1Mb RRAM CNN, binary/ternary, 3b out @ ISSCC18 (NTHU)
• 224b SLC RRAM @ ISSCC18 (Stanford)
• 16Mb RRAM logic @ IEDM17 (NTHU)
• 2Mb RRAM FCN, MLC cell + binary out @ VLSI18 (Panasonic)
• 32*32 RRAM FCN, binary/ternary, 3b out @ VLSI17 (THU & NTHU)
SRAM-CIM
• 6T-SRAM classifier @ VLSI16 (Princeton)
• 4+2T SRAM @ VLSI17 (Princeton)
• BRein Memory @ VLSI17 (Hokkaido)
• 10T CSRAM BWN @ ISSCC18 (MIT)
• Classifier SVM @ ISSCC18 (UIUC)
• DSC6T SRAM BNN @ ISSCC18 (NTHU)
• XNOR-SRAM @ VLSI18 (Columbia)
• 4b T8T SRAM CNN @ ISSCC19 (NTHU)
• AI accelerator + T-SRAM @ ISSCC19 (THU + NTHU)
• Sandwich RAM BWN @ ISSCC19 (Southeast Univ.)
• Time-based SRAM + accelerator @ ISSCC19 (Minnesota)
• Compute SRAM + accelerator @ ISSCC19 (Michigan)
PCM-CIM
• 10*3 crossbar FCN, PCM 8bW CIM @ IEDM18 (IBM)
• 3Mb PCM CIM @ Nat. Electronics 18 (IBM)
• 3Mb PCM CIM @ Nat. Commun. 17 (IBM)
-
CIM for Neural Networks - 1
• Input-aware reference current generation (increases read margin)
• Small-offset multi-level current sense amplifier (DR-CSA)
• Customized ternary-bits model compression algorithm [BioCAS 2018]
[W.-H. Chen, K.-T. Tang & M.F. Chang, et al., NTHU, ISSCC2018 #31.4]
-
CIM for Neural Networks - 2
Multi-bit computation ReRAM macro
• Serial-Input Non-Weighted Product (SINWP) structure
• Down-Scaling Weighted Current Translator (DSWCT)
• Triple-Margin Current-mode Sense Amplifier (TMCSA)
• Multi-bit compact model
[Figure: macro block diagram. Labels: 2-bit input (IN[0], IN[1]), CLK, WL[0..n], BL[0..n], SL[0..n], YMUX, positive-weight group, negative-weight group, SINWP, DSWCT, IBL_MSB[0..n], IBL_LSB[0..n], MCM, MCL, comparator + PN-ISUB, IREF, TMCSA, DOUT[2:0], DOUTSIGN]
[K.T. Tang & M.F. Chang, NTHU, ISSCC 2019]
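The arithmetic behind a serial-input, split-sign multi-bit MAC can be sketched in a few lines. This is a purely functional illustration under assumptions, not a model of the SINWP, DSWCT, or TMCSA circuits: it assumes unsigned 2-bit inputs fed one bit per cycle, signed integer weights split into a positive-weight group and a negative-weight group, and an output saturated to a 3-bit magnitude plus a sign bit (matching the DOUT[2:0] and DOUTSIGN labels). The function names are hypothetical.

```python
from typing import List, Tuple


def bit_serial_mac(inputs: List[int], weights: List[int], input_bits: int = 2) -> int:
    """Multiply-accumulate with inputs applied one bit per cycle.

    Inputs are assumed to be unsigned integers in [0, 2**input_bits - 1].
    """
    pos = [max(w, 0) for w in weights]   # positive-weight group (magnitudes)
    neg = [max(-w, 0) for w in weights]  # negative-weight group (magnitudes)
    total = 0
    for b in range(input_bits):
        bits = [(x >> b) & 1 for x in inputs]  # bit b of every input
        pos_sum = sum(i * w for i, w in zip(bits, pos))
        neg_sum = sum(i * w for i, w in zip(bits, neg))
        # Each cycle's unweighted partial result gets its binary weight here.
        total += (pos_sum - neg_sum) * (1 << b)
    return total


def to_sign_magnitude(value: int, out_bits: int = 3) -> Tuple[int, int]:
    """Return (sign bit, magnitude) with the magnitude saturated to out_bits."""
    sign = 1 if value < 0 else 0
    magnitude = min(abs(value), (1 << out_bits) - 1)
    return sign, magnitude


inputs = [3, 1, 2, 0]      # 2-bit activations
weights = [2, -3, 1, -1]   # signed multi-bit weights (illustrative values)

mac = bit_serial_mac(inputs, weights)
print(mac, to_sign_magnitude(mac))
```

The saturation in `to_sign_magnitude` hints at why the deployed network has to be optimized for a limited output bit width: any accumulated value beyond the representable range is lost.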
-
Algorithm Deployment
• ResNet-18 for the CIFAR-10 dataset
• Optimization according to the ADC output bit limitation
[K.T. Tang & M.F. Chang, NTHU, ISSCC 2019]
-
MNIST Demonstration
Demo
[K.T. Tang & M.F. Chang, NTHU, ISSCC 2018]
-
CIFAR-10 Demonstration (ResNet-18)
-
Future Trend: Integrating CIM in AI-ASIC
• Ongoing/near-term (von Neumann architecture)
  • Digital neural-network-based designs
  • GPU + high-bandwidth memory
  • ASIC + novel accelerators
  • CPU + storage (memory): suffers a latency/energy bottleneck due to data movement between the ALU and memory
• Mid-term (next generation)
  • Near-memory computing (NMC)
  • In-memory computing (IMC/CIM): memory = storage + computing
  • 10~1000x energy reduction!
Figure labels: bus, von Neumann "bottleneck"
-
New Scenario of Edge AI Application Device
• High system robustness and flexibility
• High-resolution digitization
• Low power consumption and high energy efficiency
• High-speed image processing and computing
• Breaking the memory wall and speeding up MVMs (matrix-vector multiplications)
[Figure: system block diagram. Labels: application-driven CIS, Processing-In-Sensor (PIS), CIS, In IF, CIM-based DLA, bus fabric, CPU, SRAM, DMA, GPIO, UART, SPI, I2C, OSD Ctrl, Display Ctrl, DDR4 Ctrl, DDR4 PHY, Output Writer, Ctrl; axes: energy efficiency, resolution]
[K.-T. Tang, NTHU, VLSI-Symposium 2019]
-
Summary
• Processing-In-Sensor
  • Processing data in the sensor can achieve low power and low latency.
  • However, the achievable computational complexity is limited, so it suits only well-defined, application-driven architectures.
• Computing-In-Memory
  • Integrating in-memory computing can break the von Neumann bottleneck to achieve high energy efficiency.
  • Multi-bit CIM macros achieve higher accuracy but raise more hardware issues and need smarter circuit design.
• Next-generation AI-ASIC
  • In addition to hardware improvement, device and fabrication technology (e.g., 3D stacking) and software development are also key to success.
-
Vision of the future
• The drone can work automatically
• A low-power, cost-aware AI chip can be deployed on the drone
• All the algorithms that the drone needs can run on the drone, online and in real time
-
Acknowledgement
• ITRI project
• NTHU-ITRI project
• Moonshot project, MOST
• Competitive team project, NTHU