introduction of deeplearning - kbs - kbs · 2018-06-19 · you only look once (yolo) unified,...

Page 1: Introduction of DeepLearning - KBS - KBS · 2018-06-19 · You Only Look Once (YOLO) Unified, Real-Time Object Detection • Introduction à Using YOLO, you only look once at an image

Introduction of DeepLearning

Jiang Xuan

DeepLearning

→ Most modern deep learning models are based on artificial neural networks

→ Multilayer neural network

→ Multiple nonlinear transformations

Applications of DeepLearning

DeepLearning Applications In Computer Vision

• (YOLO) You Only Look Once: Unified, Real-Time Object Detection

• CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

You Only Look Once (YOLO): Unified, Real-Time Object Detection

• Introduction

→ Using YOLO, you only look once at an image to predict what objects are present and where they are.

• Traditional Object-Detection Algorithms

→ DPM (deformable parts models)

→ R-CNN

Object Detection

• R-CNN

→ uses region proposal methods to generate potential bounding boxes in an image and then runs a classifier on these proposed boxes.

• DPM (deformable parts models)

→ uses a sliding window approach

You Only Look Once (YOLO)

• Advantages of YOLO

→ YOLO reframes object detection as a single regression problem,

→ straight from image pixels to bounding box coordinates and class probabilities.

→ Using YOLO, you only look once at an image to predict what objects are present and where they are.

Model Description

(1) resizes the input image to 448 × 448

(2) runs a single convolutional network on the image

(3) thresholds the resulting detections by the model’s confidence.
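
As a rough illustration of step (3), the sketch below filters a list of detections by confidence. The `Detection` class, the sample values, and the 0.25 threshold are hypothetical; they are not taken from the slides or from the YOLO implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float
    box: tuple  # (x, y, w, h), illustrative values only


def threshold_detections(detections, min_confidence=0.25):
    """Keep only the detections whose confidence reaches min_confidence."""
    return [d for d in detections if d.confidence >= min_confidence]


# Made-up detections standing in for the network output on one image.
raw = [
    Detection("dog", 0.91, (0.40, 0.55, 0.30, 0.45)),
    Detection("bicycle", 0.62, (0.50, 0.50, 0.60, 0.50)),
    Detection("car", 0.08, (0.80, 0.20, 0.15, 0.10)),
]
kept = threshold_detections(raw)
print([d.label for d in kept])
```

Raising `min_confidence` trades missed objects for fewer false positives, so in practice the value would be tuned on held-out data.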

Model Description

• Unified Detection

→ Our system divides the input image into an S × S grid.

→ Each grid cell predicts B bounding boxes and confidence scores for those boxes.

→ Each bounding box consists of 5 predictions: x, y, w, h, and confidence.
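
To make the grid prediction concrete, here is a minimal sketch of the size of the network output. In the YOLO paper each cell also predicts C class probabilities, giving an S × S × (B·5 + C) tensor; S = 7, B = 2, C = 20 are the PASCAL VOC settings reported there. The flat per-cell layout used below (box values first, class probabilities last) is an assumption made for illustration, and `output_shape` and `split_cell` are hypothetical helper names.

```python
def output_shape(S, B, C):
    """Shape of the prediction tensor: each cell holds B boxes of 5 values plus C class scores."""
    return (S, S, B * 5 + C)


def split_cell(cell, B):
    """Split one cell's flat vector into B (x, y, w, h, confidence) tuples and class probabilities.

    Assumed layout (illustrative): box values first, class probabilities last.
    """
    boxes = [tuple(cell[i * 5:(i + 1) * 5]) for i in range(B)]
    class_probs = list(cell[B * 5:])
    return boxes, class_probs


S, B, C = 7, 2, 20  # PASCAL VOC settings reported in the YOLO paper
shape = output_shape(S, B, C)
cell = [0.0] * shape[2]  # placeholder prediction for a single grid cell
boxes, class_probs = split_cell(cell, B)
print(shape, len(boxes), len(class_probs))
```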

Model Architecture

The network architecture is inspired by the GoogLeNet model for image classification.

The network has 24 convolutional layers followed by 2 fully connected layers.

Instead of the inception modules used by GoogLeNet, it simply uses 1 × 1 reduction layers followed by 3 × 3 convolutional layers.
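
One way to see why a 1 × 1 reduction layer helps is to count weights. The sketch below compares a direct 3 × 3 convolution with a 1 × 1 reduction followed by a 3 × 3 convolution. The channel counts (512 → 256 → 512) are chosen for illustration and biases are ignored; `conv_params` is a hypothetical helper.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * c_in * c_out


# Direct 3 x 3 convolution, 512 -> 512 channels.
direct = conv_params(3, 512, 512)

# 1 x 1 reduction to 256 channels, then 3 x 3 convolution back to 512 channels.
reduced = conv_params(1, 512, 256) + conv_params(3, 256, 512)

print(direct, reduced, round(reduced / direct, 3))
```

Under these assumed sizes the reduced pair needs a little over half the weights of the direct convolution.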

• Pretraining Set

→ the ImageNet 1000-class competition dataset

• Detection Training and Evaluation Set

→ PASCAL VOC 2007 and 2012

Model Training

• Training Process

→ For pretraining, we use the first 20 convolutional layers followed by an average-pooling layer and a fully connected layer.

→ We train this network for approximately a week and achieve a single-crop top-5 accuracy of 88% on the ImageNet 2012 validation set.

→ To convert the model for detection, we add four convolutional layers and two fully connected layers with randomly initialized weights.

→ Detection often requires fine-grained visual information, so we increase the input resolution of the network from 224 × 224 to 448 × 448.

→ We then train the network for about 135 epochs on the training and validation data sets from PASCAL VOC 2007 and 2012.

Experiments and Comparison

Comparing the performance and speed of fast detectors:

1. Fast YOLO is the fastest detector on record for PASCAL VOC detection.

2. Fast YOLO is still twice as accurate as any other real-time detector.

3. YOLO is 10 mAP more accurate than the fast version while still well above real-time in speed.

Experiments and Comparison

1. Localization errors account for more of YOLO’s errors than all other sources combined.

2. Fast R-CNN makes far fewer localization errors but far more background errors.

3. Fast R-CNN is almost 3 times more likely to predict background detections than YOLO.

VOC 2007 Error Analysis

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

→ CheXNet can automatically detect pneumonia from chest X-rays at a level exceeding practicing radiologists.

ChestX-ray14 dataset

• Model Description

→ a 121-layer convolutional neural network

→ input: a chest X-ray image

→ output: the probability of pneumonia along with a heatmap localizing the areas of the image most indicative of pneumonia.

→ Trained with the ChestX-ray14 dataset, which contains 112,120 frontal-view X-ray images of 30,805 unique patients.

Problem of Traditional CNNs: As CNNs become increasingly deep, information about the input or gradient passes through many layers and can vanish by the time it reaches the end (or beginning) of the network.

How can we solve this problem?

Model Architecture

Densely Connected Convolutional Neural Network

A 5-layer dense block with a growth rate of k = 4. Each layer takes all preceding feature-maps as input.

DenseNet: DenseNet proposes a different connectivity pattern: direct connections from any layer to all subsequent layers.

Consequently, the $\ell$-th layer receives the feature-maps of all preceding layers, $x_0, \dots, x_{\ell-1}$, as input: $x_\ell = H_\ell([x_0, x_1, \dots, x_{\ell-1}])$
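
The dense connectivity pattern can be sketched with NumPy: each layer receives the concatenation of the block input and every earlier layer's output, so the number of input channels grows by k per layer. `make_layer` is a hypothetical stand-in for DenseNet's real composite layer, and the 6 input channels and 8 × 8 map size are arbitrary. Whether a "5-layer" block counts the input as a layer is left open here; the sketch simply applies five layers.

```python
import numpy as np


def make_layer(growth_rate):
    """Hypothetical stand-in for a composite layer H: returns `growth_rate` new feature maps."""
    def H(x):
        # x has shape (channels, height, width); average over channels, then repeat k times.
        return np.repeat(x.mean(axis=0, keepdims=True), growth_rate, axis=0)
    return H


def dense_block(x0, layers):
    """Run a dense block: every layer sees the concatenation of all earlier feature maps."""
    features = [x0]
    input_channels = []
    for H in layers:
        stacked = np.concatenate(features, axis=0)  # all feature maps produced so far
        input_channels.append(stacked.shape[0])
        features.append(H(stacked))
    return np.concatenate(features, axis=0), input_channels


k = 4                      # growth rate from the slide
x0 = np.ones((6, 8, 8))    # 6 input channels, 8 x 8 maps (arbitrary)
out, seen = dense_block(x0, [make_layer(k) for _ in range(5)])
print(seen, out.shape)
```

With these sizes the layers see 6, 10, 14, 18, and 22 input channels, and the block outputs 26 channels.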

• Model Architecture

→ CheXNet is a 121-layer Dense Convolutional Network (DenseNet) trained on the ChestX-ray14 dataset.

→ We replace the final fully connected layer with one that has a single output.

→ After the fully connected layer, we apply a sigmoid nonlinearity.

Model Training

Based on ChestX-ray14

We downscale the images to 224 × 224 and normalize based on the mean and standard deviation of images in the ImageNet training set.

We randomly split the dataset into:

→ training (28744 patients, 98637 images)

→ validation (1672 patients, 6351 images)

→ test (389 patients, 420 images).
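
A minimal sketch of the normalization step, assuming an image already resized to 224 × 224 with three channels. The per-channel mean and standard deviation below are the commonly used ImageNet statistics; that the authors used exactly these values is an assumption. The last lines check that the patient counts of the three splits add up to the 30,805 patients in the dataset.

```python
import numpy as np

# Commonly used ImageNet channel statistics for images scaled to [0, 1] (assumed here).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def normalize(image_uint8):
    """Scale an H x W x 3 uint8 image to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD


image = np.full((224, 224, 3), 255, dtype=np.uint8)  # placeholder all-white image
normalized = normalize(image)
print(normalized.shape, normalized[0, 0])

# Patient counts per split, as listed on the slide.
patients = {"training": 28744, "validation": 1672, "test": 389}
total_patients = sum(patients.values())
print(total_patients)
```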

Experiments

We compare radiologists and our model on the F1 metric (the F1 score is the harmonic mean of the precision and recall of the models).

Based on 420 images from ChestX-ray14
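
For reference, the F1 metric can be computed from counts of true positives, false positives, and false negatives. The function below is a generic sketch, and the counts in the example are invented; they are not results from the CheXNet evaluation.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts: precision = 0.75, recall = 0.6.
example = f1_score(tp=30, fp=10, fn=20)
print(round(example, 4))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by maximizing precision or recall alone.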

Extension

• Improvement

→ Instead of outputting one binary label, CheXNet outputs a vector t indicating the presence of each of the 14 pathology classes.

→ We replace the final fully connected layer in CheXNet with a fully connected layer producing a 14-dimensional output, after which we apply an elementwise sigmoid nonlinearity.
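
The elementwise sigmoid treats each of the 14 outputs as an independent yes/no prediction, so several pathologies can be flagged for the same image. The sketch below assumes made-up output values and a 0.5 decision threshold; neither comes from the slides.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_pathologies(logits, threshold=0.5):
    """Apply the sigmoid to each output independently and threshold each probability."""
    probs = [sigmoid(z) for z in logits]
    return probs, [p >= threshold for p in probs]


# Made-up raw outputs of the 14-dimensional final layer for one image.
logits = [2.1, -1.3, 0.0, -3.0, 0.7, -0.2, -4.5, 1.5, -2.2, -0.9, 3.3, -1.1, -0.4, 0.2]
probs, present = predict_pathologies(logits)
print(sum(present), [round(p, 2) for p in probs[:3]])
```

Unlike a softmax, the probabilities are not forced to sum to 1.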

Visualization

Visualize the areas of the image most indicative of the disease using class activation mappings (CAMs).

Feed an image into the fully trained network and extract the feature maps that are output by the final convolutional layer.
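
A class activation map is a weighted sum of the final convolutional feature maps, where the weights are those connecting each feature map to the class output in the final layer. The NumPy sketch below uses random placeholder data. The 1024 × 7 × 7 feature-map shape is an assumption based on DenseNet-121 with a 224 × 224 input, and the rescaling to [0, 1] is added only to make the map easy to display.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps: (K, H, W) and (K,) -> (H, W), rescaled to [0, 1]."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


rng = np.random.default_rng(0)
feature_maps = rng.random((1024, 7, 7))   # placeholder for the final conv output
class_weights = rng.normal(size=1024)     # placeholder final-layer weights for one class
cam = class_activation_map(feature_maps, class_weights)
print(cam.shape)
```

To overlay the map on the X-ray, the 7 × 7 result would then be upscaled to the input image size.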

Experiments and Comparison

CheXNet outperforms the best published results on all 14 pathologies in the ChestX-ray14 dataset.

Discussion