Deep Learning Based Object Detection with Microsoft Cognitive Toolkit (CNTK)
Deep Learning needs Big Data, but life is short.
http://blackjack0919.deviantart.com/art/Value-Of-Time-375650300
Microsoft
Cognitive
Toolkit
Cognitive Toolkit – The Fastest Toolkit

                 Caffe       Cognitive Toolkit     MXNet                   TensorFlow       Torch
FCN5 (1024)      55.329ms    51.038ms              60.448ms                62.044ms         52.154ms
AlexNet (256)    36.815ms    27.215ms              28.994ms                103.960ms        37.462ms
ResNet (32)      143.987ms   81.470ms              84.545ms                181.404ms        90.935ms
LSTM (256)       -           43.581ms (44.917ms)   288.142ms (284.898ms)   - (223.547ms)    1130.606ms (906.958ms)
(LSTM values in parentheses are from the v7 benchmark)

Source: http://dlbench.comp.hkbu.edu.hk/ (Benchmarking by HKBU, Version 8)
Hardware: single Tesla K80 GPU; CUDA 8.0; cuDNN v5.1
Versions: Caffe 1.0rc5 (39f28e4); CNTK 2.0 Beta10 (1ae666d); MXNet 0.93 (32dc3a2); TensorFlow 1.0 (4ac9c09); Torch 7 (748f5e3)
5 times faster than TF
2.2 times faster than TF
3.5 times faster than TF
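One way to relate the "times faster than TF" callouts to the benchmark figures is to divide the TensorFlow time by the Cognitive Toolkit time for each network. The sketch below does that with the values from the benchmark; the dictionary layout and function name are illustrative, and the LSTM row uses the v7 figures in parentheses because TensorFlow has no other LSTM entry. The LSTM and ResNet ratios come out close to 5 and 2.2.

```python
# Times in milliseconds, copied from the benchmark figures above.
# The LSTM row uses the v7 values (the ones in parentheses).
times_ms = {
    "FCN5 (1024)": {"Cognitive Toolkit": 51.038, "TensorFlow": 62.044},
    "AlexNet (256)": {"Cognitive Toolkit": 27.215, "TensorFlow": 103.960},
    "ResNet (32)": {"Cognitive Toolkit": 81.470, "TensorFlow": 181.404},
    "LSTM (256), v7": {"Cognitive Toolkit": 44.917, "TensorFlow": 223.547},
}


def speedup_over_tensorflow(row):
    """Return how many times faster Cognitive Toolkit is than TensorFlow."""
    return row["TensorFlow"] / row["Cognitive Toolkit"]


speedups = {name: speedup_over_tensorflow(row) for name, row in times_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```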
DEMO
Data Science VM?
A comprehensive cloud-based data science environment to empower data scientists
Scalability
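[Figure: bar chart of images processed per second (higher is better). Values shown: 804, 397, 1513, 766, 1240, 571; series labels not recoverable. Source: http://ai.easyapi.com/blog/view/983]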
Microsoft Cognitive Toolkit (formerly CNTK)
• Microsoft's open-source deep learning toolkit
• https://github.com/Microsoft/CNTK
• Created by Microsoft Speech researchers (Dong Yu et al.) in 2012, “Computational Network Toolkit”
• On GitHub since Jan 2016 under MIT license
• Renamed “Cognitive Toolkit”
• Community contributions, e.g. from MIT, Stanford and NVIDIA
Microsoft Cognitive Toolkit
• The deep learning tool that runs 80% of Microsoft's internal deep learning workloads
• 1st-class on Linux and Windows, docker support
• Training: Python, C++,
• Evaluation: C#, Java, Scale up evaluation in Spark
• New in GA:
  • Keras backend support (Beta)
  • Java support, Spark support
  • Model compression (Fast binarized evaluation)
Cognitive Toolkit Features
• Python and C++ API
  • Mostly implemented in C++
  • Low level + high level Python API
• Extensibility
  • User functions and learners in pure Python
• Readers
  • Distributed, highly efficient built-in data readers
Deep Learning Revolutionized Image Recognition
Largest image dataset – ImageNet
– 1.2 million training images, 100k test images
– 1000 classes

ImageNet Winners and Errors (%)
Microsoft 2015     3.5
Human              5.1
GoogLeNet 2014     6.7
Oxford 2014        7.3
NYU 2013           11.7
U Toronto 2012     16.4
2011               25.8
2010               28.2
Microsoft took first place with all 5 of its entries in 2015: ImageNet classification, ImageNet localization, ImageNet detection, COCO detection, and COCO segmentation
COCO Segmentation Challenge 2016
• MSRA won 1st place back-to-back
• 11% relatively better than 2016 2nd (Google)
• 33% relatively better than 2015 1st (MSRA)

COCO Segmentation Accuracy (%)
MSRA 2016 1st      37.6
Google 2016 2nd    33.8
MSRA 2015 1st      28.4
FAIR 2015 2nd      25
our results on COCO test set
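The "relatively better" figures can be reproduced from the chart values as (new - old) / old. The sketch below uses the accuracies as read from the chart; because those values are rounded, the results only approximately match the percentages quoted on the slide. The function and variable names are illustrative.

```python
# COCO segmentation accuracy (%) as read from the chart.
accuracy = {
    "MSRA 2016 (1st)": 37.6,
    "Google 2016 (2nd)": 33.8,
    "MSRA 2015 (1st)": 28.4,
}


def relative_improvement(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


vs_google_2016 = relative_improvement(accuracy["MSRA 2016 (1st)"],
                                      accuracy["Google 2016 (2nd)"])
vs_msra_2015 = relative_improvement(accuracy["MSRA 2016 (1st)"],
                                    accuracy["MSRA 2015 (1st)"])

print(f"vs. Google 2016: {vs_google_2016:.1f}%")
print(f"vs. MSRA 2015:   {vs_msra_2015:.1f}%")
```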
Semantic Segmentation
http://host.robots.ox.ac.uk/pascal/VOC/
Example: 2-hidden layer feed-forward NN
Math:
  $h_1 = \sigma(W_1 x + b_1)$
  $h_2 = \sigma(W_2 h_1 + b_2)$
  $P = \mathrm{softmax}(W_{out} h_2 + b_{out})$
with input $x \in \mathbb{R}^M$ and one-hot label $y \in \mathbb{R}^J$
and cross-entropy training criterion
  $ce = -y^T \log P$
Corresponding code:
  h1 = sigmoid (x @ W1 + b1)
  h2 = sigmoid (h1 @ W2 + b2)
  P = softmax (h2 @ Wout + bout)
  ce = cross_entropy (P, y)
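To make the forward pass concrete, here is a minimal pure-Python sketch of the same two-hidden-layer network. It is not CNTK code: the helper functions, toy layer sizes, and weight values are all hypothetical and chosen only so the example runs. It follows the row-vector convention of the code form (x @ W + b) and uses the conventional negative sign for cross-entropy.

```python
import math


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, W, b):
    """Compute x @ W + b for a row vector x and a matrix W stored as rows."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def cross_entropy(P, y):
    """Cross-entropy between predicted probabilities P and one-hot label y."""
    return -sum(yj * math.log(pj) for yj, pj in zip(y, P) if yj > 0)


def forward(x, params):
    W1, b1, W2, b2, Wout, bout = params
    h1 = sigmoid(affine(x, W1, b1))
    h2 = sigmoid(affine(h1, W2, b2))
    return softmax(affine(h2, Wout, bout))


# Hypothetical toy sizes: M = 2 inputs, 3 units per hidden layer, J = 2 classes.
W1 = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.2, -0.1, 0.0], [0.3, 0.1, -0.2], [-0.4, 0.2, 0.1]]
b2 = [0.0, 0.0, 0.0]
Wout = [[0.5, -0.5], [-0.3, 0.3], [0.2, 0.1]]
bout = [0.0, 0.0]

x = [1.0, 2.0]   # input
y = [0.0, 1.0]   # one-hot label
P = forward(x, (W1, b1, W2, b2, Wout, bout))
ce = cross_entropy(P, y)
print(P, ce)
```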
CNTK Model
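[Figure: computation graph of the model. Inputs x and y; parameters W1, b1, W2, b2, Wout, bout; nodes • (matrix product), +, s (sigmoid), softmax, and cross_entropy; intermediate values h1, h2, P; output ce]
h1 = sigmoid (x @ W1 + b1)
h2 = sigmoid (h1 @ W2 + b2)
P = softmax (h2 @ Wout + bout)
ce = cross_entropy (P, y)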
LEGO-like composability allows CNTK to support a wide range of networks and applications
MNIST Handwritten Digits (OCR)
• Data set of handwritten digits with
  ✓ 60,000 training images
  ✓ 10,000 test images
• Each image is 28 x 28 pixels
[Figure: sample handwritten digits and their corresponding labels]
Multi-layer perceptron
Input: 28 x 28 pix image, flattened to 784 pixels (x)
Deep Model
  Dense layer: i = 784, O = 400, a = relu (weights: 784 x 400 + 400 bias)
  Dense layer: i = 400, O = 200, a = relu (weights: 400 x 200 + 200 bias)
  Dense layer (10 nodes): i = 200, O = 10, a = None (weights: 200 x 10 + 10 bias)
Outputs: z0 z1 z2 z3 z4 z5 z6 z7 z8 z9
softmax: $p_i = \frac{e^{z_i}}{\sum_{j=0}^{9} e^{z_j}}$
Predicted probabilities: 0.08 0.08 0.10 0.17 0.11 0.09 0.08 0.08 0.13 0.01
https://github.com/Microsoft/CNTK/tree/master/Tutorials
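The weight and bias counts for the three dense layers follow directly from the layer sizes: each layer has inputs x outputs weights plus one bias per output. A small sketch of that arithmetic, with illustrative function and variable names:

```python
# (inputs, outputs) for the three dense layers of the multi-layer perceptron.
layers = [(784, 400), (400, 200), (200, 10)]


def dense_parameter_count(inputs, outputs):
    """Weights (inputs x outputs) plus one bias per output node."""
    return inputs * outputs + outputs


counts = [dense_parameter_count(i, o) for i, o in layers]
total = sum(counts)

for (i, o), n in zip(layers, counts):
    print(f"{i} -> {o}: {n} parameters")
print(f"total: {total}")
```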
Loss function
Input: 28 x 28 pix image, passed through Model(w, b)
Label, one-hot encoded (Y): 0 0 0 1 0 0 0 0 0 0
Predicted Probabilities (p): 0.08 0.08 0.10 0.17 0.11 0.09 0.08 0.08 0.13 0.01
Cross entropy error: $ce = -\sum_{j=0}^{9} y_j \log p_j$
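As a worked instance of the cross entropy error, the sketch below plugs in the one-hot label and predicted probabilities from this slide. Because the label is one-hot, only the term for the true class (index 3, probability 0.17) contributes. The function name is illustrative.

```python
import math

# One-hot label (the digit 3) and predicted probabilities from the slide.
y = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
p = [0.08, 0.08, 0.10, 0.17, 0.11, 0.09, 0.08, 0.08, 0.13, 0.01]


def cross_entropy(y, p):
    """ce = -sum_j y_j * log(p_j), using the natural logarithm."""
    return -sum(yj * math.log(pj) for yj, pj in zip(y, p))


ce = cross_entropy(y, p)
print(f"ce = {ce:.4f}")  # equals -log(0.17)
```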
[Figure: detection network diagram with an ROI Pooling stage, performed for each ROI]
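ROI pooling turns a region of interest of any size into a fixed-size grid so that every region can be fed to the same downstream layers. The following is a generic pure-Python sketch of ROI max pooling on a single-channel feature map; it is not CNTK's implementation, and the function name, the (x0, y0, x1, y1) convention with exclusive end coordinates, and the toy feature map are assumptions made for illustration.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (x0, y0, x1, y1) into an out_h x out_w grid.

    x1 and y1 are exclusive, and the ROI must lie inside the feature map.
    """
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Bin i covers ROI rows [floor(i*h/out_h), ceil((i+1)*h/out_h)).
        r_start = y0 + (i * roi_h) // out_h
        r_end = y0 + -(-((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            c_start = x0 + (j * roi_w) // out_w
            c_end = x0 + -(-((j + 1) * roi_w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map (hypothetical values).
fm = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]

# Two ROIs of different sizes both produce a 2 x 2 output.
full = roi_max_pool(fm, (0, 0, 4, 4), 2, 2)
crop = roi_max_pool(fm, (1, 1, 4, 4), 2, 2)
print(full)
print(crop)
```

When the ROI size is not divisible by the output size, neighbouring bins overlap by one row or column in this sketch; other implementations handle bin boundaries differently.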
Example object classes: "orange", "butter", "onion", "water", "apple", "milk", "tabasco", "beer".
Where to begin?
On GitHub: https://github.com/Microsoft/CNTK/wiki
Tutorials: https://www.cntk.ai/pythondocs/tutorials.html (latest release), https://github.com/Microsoft/CNTK/tree/master/Tutorials (latest)
Azure Notebooks: try the pre-hosted tutorials for free at https://notebooks.azure.com/cntk/libraries/tutorials
Seek help on Stack Overflow: http://stackoverflow.com/search?q=cntk (please add the cntk tag)
Benchmark result of parallel training on CNTK
1bit/BMUF Speedup Factors in LSTM Training
                4 GPUs   8 GPUs   16 GPUs   32 GPUs   64 GPUs
1bit-average    2.9      5.4      8.0       -         -
1bit-peak       3.3      6.7      10.8      -         -
BMUF-average    3.7      6.9      13.8      25.5      43.7
BMUF-peak       4.1      8.1      14.1      27.3      54.0
• Training data: thousands of hours of speech from real traffic
• About 16 and 20 days to train DNN and LSTM on 1 GPU, respectively
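A speedup factor can be turned into a parallel efficiency by dividing it by the number of GPUs. The sketch below does this for the BMUF-average series, reading its chart values as 3.7, 6.9, 13.8, 25.5, and 43.7 for 4 to 64 GPUs. The last line is a back-of-envelope estimate only: it assumes the roughly 20-day single-GPU LSTM figure refers to the same workload as the chart, which the slide does not state.

```python
gpus = [4, 8, 16, 32, 64]
# BMUF-average speedup factors, as read from the chart.
bmuf_average_speedup = [3.7, 6.9, 13.8, 25.5, 43.7]

# Parallel efficiency: achieved speedup divided by the ideal (linear) speedup.
efficiency = {n: s / n for n, s in zip(gpus, bmuf_average_speedup)}

for n in gpus:
    print(f"{n:2d} GPUs: efficiency {efficiency[n]:.1%}")

# Assumption: the ~20-day single-GPU LSTM training time applies to this chart.
single_gpu_days = 20.0
estimated_days_64_gpus = single_gpu_days / bmuf_average_speedup[-1]
print(f"estimated LSTM training time on 64 GPUs: {estimated_days_64_gpus:.2f} days")
```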
Results
• Achievement
  • Almost linear speedup without degradation of model quality
  • Verified for training DNN, CNN, LSTM up to 64 GPUs for speech recognition, image classification, OCR, and click prediction tasks
  • Used for enterprise scale production data loads
Cognitive Services APIs
Categories: Language, Speech, Search, Machine Learning, Knowledge, Vision
Spell check
Speech API
Entity linking
Recommendation API
Bing autosuggest
Computer vision
Emotion
Forecasting
Text to speech
Thumbnail generation
Anomaly detection
Custom recognition (CRIS)
Bing image search
Web language model
Customer feedback analysis
Academic knowledge
OCR, tagging, captioning
Sentiment scoring
Bing news search
Bing web search
Text analytics
Information Management
Data Catalog
Data Factory
Event Hubs
Machine Learning and Analytics
Stream Analytics
HDInsight
(Hadoop and Spark)
Machine Learning
Data Lake Analytics
Big Data Stores
SQL Data Warehouse
Data Lake Store
Intelligence
Cognitive Services
Bot Framework
Cortana
Dashboards & Visualizations
Power BI
Cortana Intelligence
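RStudio, Python, Jupyter Notebook
Read data from different data sources
Clean the data
Apply different machine learning algorithms
Supports the R and Python languages
Web services
Easily build, deploy, and share predictive analytics projects
• A fully managed cloud service that lets users easily build, deploy, and share predictive analytics solutions
• Predictive models built with machine learning can be deployed to production as a web service within minutes; applications and devices can immediately use the predictive analytics API
• Can be shared on the Cortana Intelligence Gallery or Azure Marketplace
• Includes many algorithms commonly used in industry, and can be extended with new machine learning algorithms through R packages or by running Python code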
[Figure: Cortana Intelligence architecture]
Data Sources: Apps; Sensors and devices; Data
Information Management: Event Hubs, Data Catalog, Data Factory
Big Data Stores: SQL Data Warehouse, Data Lake Store
Machine Learning and Analytics: HDInsight (Hadoop and Spark), Stream Analytics, Data Lake Analytics, Machine Learning
Intelligence: Cortana, Bot Framework, Cognitive Services
Dashboards & Visualizations: Power BI
Action: People; Automated Systems; Apps (Web, Mobile, Bots)
Also shown: IoT Hub