Deep learning for image super resolution
Post on 11-Feb-2017
Deep Learning for Image Super-Resolution
Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang
Presented By Prudhvi Raj Dachapally
Abstract
Using deep convolutional networks, the model learns an end-to-end mapping between low- and high-resolution images. Unlike traditional methods, which handle each component separately, this method jointly optimizes all layers of the network. A lightweight CNN structure is used, which is simple to implement and offers a favorable trade-off between restoration quality and speed compared with existing methods.
What is Deep Learning?
A branch of machine learning based on artificial neural networks with many layers, loosely modeled on the structure of the brain.
In the words of Dr. Andrew Ng, researcher at Stanford and co-founder of Coursera, increased computing power has allowed us to map and process much larger neural networks than ever before.
Appealing Properties of the Proposed Model
The model is named the Super-Resolution Convolutional Neural Network (SRCNN).
Its structure is simple, yet it provides superior accuracy compared to state-of-the-art methods.
Since it is a fully feed-forward network, no optimization problem has to be solved at usage time.
Restoration quality can be further improved with more diverse data and/or a deeper network, without changing the core structure of the network.
The SRCNN model can also process the channels of a color image simultaneously, which in turn can improve performance.
Preliminaries
Color space used: YCbCr
Y: luminance
Cb: blue-difference
Cr: red-difference
Cb and Cr are the chrominance components.
First, the image is upscaled to the desired size using bicubic interpolation. This is just a pre-processing step.
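The color-space split described above can be sketched as follows. The slides do not say which YCbCr variant is used, so this example assumes the full-range ITU-R BT.601 weights (the JPEG convention); the function name rgb_to_ycbcr is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Assumption: full-range ITU-R BT.601 weights (the JPEG convention).
    Results are returned as floats and are not clamped to [0, 255].
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b               # luminance
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b  # blue-difference
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b  # red-difference
    return y, cb, cr


# A neutral gray pixel has no color information: Cb and Cr sit at the
# midpoint (128) and only the luminance Y carries the value.
y, cb, cr = rgb_to_ycbcr(100, 100, 100)
print(round(y, 3), round(cb, 3), round(cr, 3))
```

For any gray pixel the two chrominance components stay at the midpoint, which is why the luminance channel carries most of the structural detail of an image.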
Structure of the Network
Components in the Network
Patch Extraction and Representation
Densely extracts overlapping patches from the upscaled image Y and represents each patch as a feature vector by convolving the image with a set of filters.
This layer is expressed as a function F1, where
F1(Y) = max(0, W1 * Y + B1)
Here * denotes convolution. This layer extracts an n1-dimensional feature for each patch.
Non-Linear Mapping
Maps each of the n1-dimensional vectors into an n2-dimensional one.
This layer is expressed as a function F2, where
F2(Y) = max(0, W2 * F1(Y) + B2)
It is possible to add more convolutional layers to this structure, but doing so increases the training time.
Reconstruction
In traditional methods, the predicted overlapping high-resolution patches are often averaged to produce the final full image. Here the averaging is replaced by a convolutional layer, defined as
F(Y) = W3 * F2(Y) + B3
Terms Used in the Formulations
W1: n1 filters of size c × f1 × f1, where c is the number of channels and f1 is the spatial size of the filter.
B1: an n1-dimensional vector, each element of which is associated with a filter.
W2: n2 filters of size n1 × f2 × f2.
B2: an n2-dimensional vector.
W3: c filters of size n2 × f3 × f3.
B3: a c-dimensional vector.
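The three functions F1, F2 and F can be sketched as a plain-Python forward pass. This is a minimal illustration, not the authors' implementation: it assumes unpadded ("valid") convolutions, uses the cross-correlation form common in CNN libraries, and draws small random weights only to exercise the shapes. The helper names conv2d, relu, init_layer and srcnn_forward are hypothetical.

```python
import random


def conv2d(x, w, b):
    """Unpadded multi-channel convolution (cross-correlation form).

    x: [c_in][H][W], w: [c_out][c_in][f][f], b: [c_out]
    returns: [c_out][H - f + 1][W - f + 1]
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    f = len(w[0][0])
    out_h, out_w = height - f + 1, width - f + 1
    out = []
    for k in range(len(w)):
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                s = b[k]
                for ch in range(c_in):
                    for u in range(f):
                        for v in range(f):
                            s += w[k][ch][u][v] * x[ch][i + u][j + v]
                fmap[i][j] = s
        out.append(fmap)
    return out


def relu(x):
    """Element-wise max(0, v) over a stack of feature maps."""
    return [[[max(0.0, v) for v in row] for row in fmap] for fmap in x]


def init_layer(c_out, c_in, f, rng):
    """Small random filters and zero biases (an assumed initialization)."""
    w = [[[[rng.gauss(0.0, 0.001) for _ in range(f)] for _ in range(f)]
          for _ in range(c_in)] for _ in range(c_out)]
    return w, [0.0] * c_out


def srcnn_forward(y, params):
    """F(Y) = W3 * F2(Y) + B3, with F1 and F2 as ReLU-activated convolutions."""
    (w1, b1), (w2, b2), (w3, b3) = params
    f1_out = relu(conv2d(y, w1, b1))        # patch extraction and representation
    f2_out = relu(conv2d(f1_out, w2, b2))   # non-linear mapping
    return conv2d(f2_out, w3, b3)           # reconstruction (no ReLU)


# Tiny demo: c = 1, f1 = 9, f2 = 1, f3 = 5, with n1 = 4 and n2 = 2 chosen
# small only so that the pure-Python loops finish quickly.
rng = random.Random(0)
params = [init_layer(4, 1, 9, rng), init_layer(2, 4, 1, rng), init_layer(1, 2, 5, rng)]
image = [[[rng.random() for _ in range(16)] for _ in range(16)]]
output = srcnn_forward(image, params)
print(len(output), len(output[0]), len(output[0][0]))
```

Under the no-padding assumption, each layer shrinks the spatial size by its filter size minus one, so a 16 × 16 input yields a 4 × 4 output with these filter sizes.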
Learning Process
The network parameters (W1, W2, W3, B1, B2, B3) are estimated by minimizing the loss between the reconstructed images and the corresponding original high-resolution images. The loss function is the Mean Squared Error (MSE).
Using MSE as the loss function favors a high PSNR (Peak Signal-to-Noise Ratio).
The loss is minimized using stochastic gradient descent with standard backpropagation.
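The link between MSE and PSNR can be made concrete with a short sketch. It uses the standard definition PSNR = 10 · log10(MAX² / MSE) and assumes 8-bit pixel values (MAX = 255); the function names mse and psnr are hypothetical, and images are represented as flat lists of pixel values for simplicity.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally sized flat lists of pixels."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB; assumes 8-bit pixels by default."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# PSNR is a decreasing function of MSE, so lowering the training loss
# raises the PSNR of the reconstruction.
target = [10.0, 20.0, 30.0, 40.0]
rough = [14.0, 16.0, 34.0, 36.0]   # every pixel off by 4
close = [11.0, 19.0, 31.0, 39.0]   # every pixel off by 1
print(psnr(rough, target) < psnr(close, target))
```

Because the logarithm is monotonic, any step that reduces MSE increases PSNR, which is why training on MSE directly favors this metric.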
Experiments
Training Data
A very large data set of 395,909 images from the 2013 ImageNet competition.
Test Data
The BSD200 data set, with 200 images.
Basic Network Settings
f1 = 9, f2 = 1, f3 = 5, n1 = 64 and n2 = 32.
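As a rough check on how lightweight these settings are, the filter sizes defined earlier (W1: n1 filters of size c × f1 × f1, and so on) give the following weight counts. The sketch assumes c = 1, that is, a single-channel input such as luminance only; the function names are hypothetical.

```python
def srcnn_weight_counts(c, f1, f2, f3, n1, n2):
    """Number of filter weights in each of the three layers."""
    w1 = n1 * c * f1 * f1    # n1 filters of size c x f1 x f1
    w2 = n2 * n1 * f2 * f2   # n2 filters of size n1 x f2 x f2
    w3 = c * n2 * f3 * f3    # c filters of size n2 x f3 x f3
    return w1, w2, w3


def srcnn_bias_count(c, n1, n2):
    """One bias per filter: B1, B2 and B3 together."""
    return n1 + n2 + c


# Basic settings from the slide, with c = 1 as an assumption.
counts = srcnn_weight_counts(c=1, f1=9, f2=1, f3=5, n1=64, n2=32)
print(counts, sum(counts), srcnn_bias_count(c=1, n1=64, n2=32))
# prints: (5184, 2048, 800) 8032 97
```

With these settings the network has only about eight thousand weights, which is small by deep-learning standards.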
Comparison Against the State-of-the-art Methods
PSNR: Peak Signal-to-Noise Ratio
SSIM: Structural Similarity Index
IFC: Information Fidelity Criterion
NQM: Noise Quality Measure
WPSNR: Weighted PSNR
MSSSIM: Multi-Scale SSIM
NE+LLE: Neighbor Embedding + Locally Linear Embedding
ANR: Anchored Neighborhood Regression
A+: Adjusted Anchored Neighborhood Regression
Real Time Results
Expansion Scope
Using Larger Filters
Increasing the filter size can raise the PSNR, but it also increases the training time.
Using Deeper Networks
Adding layers does not always help: the results can contradict the rule of thumb that more layers mean higher accuracy.
Conclusion
This approach, SRCNN, learns an end-to-end mapping between low- and high-resolution images, with little extra pre/post-processing beyond the optimization. With its lightweight structure, SRCNN achieves performance superior to the state-of-the-art methods. Further improvement can be gained by exploring more filters and different training strategies.
References
Images, tables and some of the text used in this presentation are taken from Chao Dong et al., "Image Super-Resolution Using Deep Convolutional Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 38, February 2016.