

2012 9th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI)

Daejeon, Korea / November 26-29, 2012

Dictionary Learning for Dark Image Enhancement

Jaesik Yoon1, Yuna Seo2, Jungyu Kang3 and Chang D. Yoo4

1,2,3 Department of Electrical Engineering, KAIST, Daejeon, 305-701, Korea (Tel: +82-42-350-5470; E-mail: {jaesik817, yn.seo, cmiller2ai}@kaist.ac.kr)

4 Department of Electrical Engineering, KAIST, Daejeon, 305-701, Korea (Tel: +82-42-350-3470; E-mail: [email protected])

Abstract - This paper proposes a dark image enhancement algorithm based on dictionary learning (DL). When an image is dark, many vision-based robotic applications cannot work well. Existing dark image enhancement algorithms have defects: for example, histogram modification loses color information, and gamma correction degrades dark areas. The proposed algorithm preserves color information and performs well in extreme conditions. Experimental results compared with existing algorithms are also presented.

Keywords - Dark Image Enhancement, Dual Dictionary, Histogram Modification, Sparse Coding, Dictionary Learning

1. Introduction

Vision-based robots using a visible-light camera require a sufficient light source to recognize objects and environmental conditions. Without sufficient light, enough features cannot be obtained from the image. Therefore, dark image enhancement is needed for robots operating in dark conditions.

Conventional dark image enhancement techniques such as histogram modification and gamma correction have limitations. Histogram modification stretches or equalizes the histogram of an image, which improves contrast. However, it loses color information and distorts bright areas. Gamma correction is a nonlinear image enhancement method that modifies the gamma value of the equation defined as follows

$$V_{out} = C \cdot V_{in}^{\gamma}, \quad (1)$$

where $C$ is a constant and $\{V_{in}, V_{out}\} \in [0,1]$ are normalized pixel values of the image. This enhances image contrast and brightness, but it may also degrade image quality in dark areas. Moreover, both methods perform poorly in harsh conditions. This paper proposes a novel dark image enhancement algorithm using dictionary learning and LASSO regression. The algorithm shows improvement in preserving color information and in performance under harsh conditions. In this paper, dictionary learning (DL) and sparse coding are described. The proposed algorithm using a dual dictionary provides a new application of DL as an image enhancement method.
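As a concrete illustration, the following is a minimal numpy sketch of Eq. (1) applied to an 8-bit grayscale image; the function name and the normalization to [0, 1] are illustrative assumptions, not part of the paper.

```python
import numpy as np

def gamma_correct(img, gamma, C=1.0):
    """Eq. (1): V_out = C * V_in**gamma on pixel values normalized to [0, 1]."""
    v_in = img.astype(np.float64) / 255.0          # normalize to [0, 1]
    v_out = np.clip(C * v_in ** gamma, 0.0, 1.0)   # apply Eq. (1) and clip
    return np.round(v_out * 255.0).astype(np.uint8)

# gamma < 1 brightens a dark image; gamma > 1 darkens it (as used in Section 4).
```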


2. Related Works

2.1 Dictionary Learning (DL)

Consider a set $X = [x^1, \ldots, x^n] \in \mathbb{R}^{m \times n}$. DL represents this set as linear combinations of dictionary basis functions. A dictionary basis function is a column vector $d^i \in \mathbb{R}^m$ of the dictionary $D = [d^1, \ldots, d^p] \in \mathbb{R}^{m \times p}$. The set $X$ can be represented with the dictionary $D$ and the basis-function coefficients $A = [\alpha^1, \ldots, \alpha^n] \in \mathbb{R}^{p \times n}$. DL is formulated as follows

$$\underset{D, A}{\text{minimize}} \;\; \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2} \left\| x^i - D\alpha^i \right\|_2^2 \quad \text{subject to} \;\; \Omega_A(\alpha^i) \le b_A, \;\; \Omega_D(D) \le b_D, \quad (2)$$

where $D$ and $\alpha$ are learned by minimizing the least-squares error under the constraints. The constraints prevent overfitting of the coefficients and the dictionary. Overfitting means that performance on the training data keeps increasing while performance on the test data decreases; it occurs when the parameter values ($D$ and $\alpha$ in DL) are too big or learning is performed too long. Hence the constraints have a shrinkage effect. Many kinds of regularization term are used: for shrinkage only, the $l_2$-norm regularization term is used; to also make the solution sparse [1], several other regularization terms are used, which will be discussed in Section 2.2. Equation (2) can be rewritten in Lagrangian form as
$$\underset{D, A}{\text{minimize}} \;\; \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{1}{2} \left\| x^i - D\alpha^i \right\|_2^2 + \lambda_A \Omega_A(\alpha^i) + \lambda_D \Omega_D(D) \right], \quad (3)$$

where the Lagrangian multipliers $\lambda_A$ and $\lambda_D$ are in $\mathbb{R}$. DL consists of a training step, a validation step and a test step. The training step learns the dictionary from the training data. In this step, instead of minimizing over $D$ and $\alpha$ simultaneously, DL minimizes over one while fixing the other. Dictionary learning methods include K-SVD [2], MOD (Method of Optimal Directions) [3] and online dictionary learning [4]. Optimization methods for $\alpha$ include the gradient method, the proximal gradient method and the H. Lee algorithm [5]. After this step, the dictionary holds primary patterns. The validation step tunes the parameters ($\lambda_A$ and $\lambda_D$ in DL): DL checks performance over different parameter values and keeps the parameters giving the best performance. The test step checks the performance of the learned dictionary. (In practice, the test step is included in the validation step.)
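To make the alternating scheme concrete, here is a toy numpy sketch of the training step under the Lagrangian form of Eq. (3). For brevity the coefficient step uses a ridge ($l_2$) penalty, which has a closed form, instead of a sparse solver; all names and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def dictionary_learning(X, p, lam_A=0.1, lam_D=0.1, lr=0.01, iters=100):
    """Toy alternating minimization for Eq. (3): fix D and solve for A,
    then fix A and take a gradient step on D."""
    m, n = X.shape
    D = np.random.default_rng(0).standard_normal((m, p))
    A = np.zeros((p, n))
    for _ in range(iters):
        # Coefficient step (D fixed): ridge solution of (D^T D + lam_A I) A = D^T X.
        A = np.linalg.solve(D.T @ D + lam_A * np.eye(p), D.T @ X)
        # Dictionary step (A fixed): gradient of the regularized least squares.
        grad_D = -(X - D @ A) @ A.T / n + lam_D * D
        D -= lr * grad_D
    return D, A
```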

DL can also be analyzed probabilistically, as reported in [1], [6]-[8]. These works suggest that

$$x = D\alpha + n, \quad (4)$$

where $n$ is a Gaussian white residual vector in $\mathbb{R}^m$ with zero mean and variance $\sigma^2$. If $D$ and $\alpha$ are given, the likelihood $P(x|D,\alpha)$ is
$$P(x|D,\alpha) = C \cdot \exp\left\{ -\frac{(x - D\alpha)^2}{2\sigma^2} \right\}, \quad (5)$$

where $C$ is a constant. If $D$ and $\alpha$ are random variables, the joint probability of $x$, $D$ and $\alpha$ is

$$P(x, D, \alpha) = P(x|D,\alpha)\,P(D)\,P(\alpha). \quad (6)$$

If the prior distributions $P(\alpha)$ and $P(D)$ are a Laplace distribution [6], [7] and a Gaussian distribution, respectively, the posterior probability $P(D, \alpha|x)$ is

$$P(D,\alpha|x) \propto P(x,D,\alpha) = P(x|D,\alpha)\,P(D)\,P(\alpha) = C_x \cdot \exp\left\{ -\left( \frac{(x - D\alpha)^2}{2\sigma^2} + C_D \|D\|_2^2 + C_\alpha \|\alpha\|_1 \right) \right\}, \quad (7)$$

where $C_x$, $C_D$ and $C_\alpha$ are constants. Maximizing Eq. (7) over $D$ and $\alpha$ is equivalent to minimizing Eq. (3).
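The equivalence can be seen by taking the negative logarithm of Eq. (7); a small numpy sketch (the constant names and default values are illustrative assumptions):

```python
import numpy as np

def neg_log_posterior(x, D, a, sigma=1.0, C_D=0.1, C_a=0.1):
    """-log P(D, a | x) from Eq. (7), up to an additive constant.
    Minimizing this over D and a matches the Lagrangian objective of Eq. (3)."""
    r = x - D @ a
    return (r @ r) / (2.0 * sigma ** 2) + C_D * np.sum(D ** 2) + C_a * np.sum(np.abs(a))
```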

2.2. Sparse Coding

Sparse coding finds a sparse coefficient vector $\alpha$ with respect to an overcomplete $D$ in DL. Using an $l_0$-norm regularization term is the intuitive choice, because the $l_0$-norm counts the nonzero elements. Algorithms using the $l_0$-norm regularization term are matching pursuit [9] and orthogonal matching pursuit (OMP) [10]. However, the optimization problem under an $l_0$-norm constraint is NP-hard. Hence the $l_1$-norm (LASSO [11]) is generally used instead. Figures 1-(a) and 1-(b) show that the $l_2$-norm does not set any coefficients to zero, whereas the $l_1$-norm sets some coefficients to zero. Since the $l_1$-norm is a convex function, if the objective function is convex, the resulting optimization problem is a convex optimization problem, and algorithms for solving it are efficient compared with algorithms for other problem classes [12]. However, since the $l_1$-norm is not differentiable, the plain gradient method cannot be used; iterative methods are generally used instead, which adds computational complexity. Algorithms for the $l_1$-norm constrained optimization problem include the proximal method, the subgradient method and the H. Lee algorithm [5].
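As an example of the proximal method mentioned above, the following is a minimal ISTA (iterative soft-thresholding) sketch for the LASSO problem; it is a stand-in illustration, not the H. Lee algorithm used in the paper.

```python
import numpy as np

def ista_lasso(x, D, lam, iters=200):
    """Proximal gradient (ISTA) for min_a 0.5 * ||x - D a||_2^2 + lam * ||a||_1."""
    a = np.zeros(D.shape[1])
    L = np.linalg.norm(D, ord=2) ** 2   # Lipschitz constant of the smooth part
    for _ in range(iters):
        z = a - D.T @ (D @ a - x) / L   # gradient step on the least-squares term
        # Soft-thresholding: the proximal operator of the l1-norm.
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a
```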

If specific dictionary basis functions tend to be used, or not used, at the same time, they form a group, and this is expressed with the group LASSO [13] ($l_1/l_2$-norm). The group LASSO regularizer $\Omega$ is
$$\Omega(\alpha) = \sum_{g \in \mathcal{G}} \left\| \alpha_g \right\|_2, \quad (8)$$
where $\mathcal{G}$ is the set of groups. Equation (8) applies an $l_2$-norm within each group and an $l_1$-norm across groups.


Fig. 1. (a) $l_2$-norm (b) $l_1$-norm (c) group LASSO (d) tree-structured group LASSO

If $\alpha^T = [\alpha_1, \alpha_2, \alpha_3]$ and $\alpha_1, \alpha_2$ form a group, Figure 1-(c) illustrates the group LASSO.

If $\alpha^T = [\alpha_1, \alpha_2, \alpha_3]$ and $\alpha_2, \alpha_3$ form one group while $\alpha_1, \alpha_2, \alpha_3$ form another, the groups overlap, and this structure is expressed with the tree-structured group LASSO [14]. Figure 1-(d) shows this constraint, which represents a hierarchical structure of the dictionary basis functions. The tree-structured group LASSO regularizer $\Omega$ is
$$\Omega(\alpha) = \sum_{v} w_v \left\| \alpha_v \right\|_2, \quad (9)$$
where $\alpha_v$ denotes the coefficients in the group rooted at node $v$ and $w_v$ is defined as
$$w_v = \begin{cases} g_v \prod_{m \in \mathrm{Ancestors}(v)} s_m & \text{if } v \text{ is an internal node} \\ \prod_{m \in \mathrm{Ancestors}(v)} s_m & \text{if } v \text{ is a leaf node,} \end{cases} \quad (10)$$
where $s_v$ and $g_v$ are group parameters with $s_v + g_v = 1$ and $0 \le s_v, g_v \le 1$.
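For illustration, the group LASSO regularizer of Eq. (8) can be evaluated directly; the grouping below mirrors the Figure 1-(c) example, and the index lists and values are illustrative assumptions.

```python
import numpy as np

def group_lasso_penalty(a, groups):
    """Eq. (8): l2-norm within each group, summed (l1) across groups."""
    return sum(np.linalg.norm(a[g]) for g in groups)

# Figure 1-(c) example: a = [a1, a2, a3] with {a1, a2} grouped and {a3} alone.
a = np.array([0.5, -1.0, 2.0])
print(group_lasso_penalty(a, [[0, 1], [2]]))
```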

3. Dual Dictionary for Dark Image Enhancement

The proposed algorithm estimates a bright image from a dark image. For this, a dark image histogram dictionary and a bright image histogram dictionary are used. Each dictionary holds primary patterns of dark image histograms and bright image histograms, respectively. The basis functions of the dual dictionary are matched: if a bright dictionary basis function $d_b^i$ represents a specific pattern of the bright image histogram, the matched dark dictionary basis function $d_d^i$ represents the corresponding pattern. Since the basis functions are matched, the dual dictionary enhances a dark image through shared coefficients.

3.1. Dictionary Learning Formulation

Consider a set $X = [x^1, \ldots, x^n] \in \mathbb{R}^{m \times n}$ of bright image histograms and a set $Y = [y^1, \ldots, y^n] \in \mathbb{R}^{m \times n}$ of dark image histograms, where $x$ and $y$ are bright and dark image histograms, $m$ is 256 and $n$ is the number of images.

The dark histogram dictionary is $D_d = [d_d^1, \ldots, d_d^p] \in \mathbb{R}^{m \times p}$ and the bright histogram dictionary is $D_b = [d_b^1, \ldots, d_b^{p+1}] \in \mathbb{R}^{m \times (p+1)}$, where $p$ is the number of dictionary basis functions. The first column vector of $D_b$ is a bias. The algorithm learns the two dictionaries with shared coefficients. If the dictionary coefficients are $A_d = [\alpha_d^1, \ldots, \alpha_d^n] \in \mathbb{R}^{p \times n}$ and $A_b = [\alpha_b^1, \ldots, \alpha_b^n] \in \mathbb{R}^{(p+1) \times n}$, the $j$th element of the $i$th column vector of the bright histogram dictionary coefficients, $\alpha_b^i(j)$, is
$$\alpha_b^i(j) = \begin{cases} 1 & \text{if } j = 1 \\ \alpha_d^i(j-1) & \text{otherwise,} \end{cases} \quad (11)$$
where $i = 1, \ldots, n$ and $j = 1, \ldots, p+1$. Therefore, if we know $A_d$, we also know $A_b$, and we can define
$$A_b = \xi(A_d), \quad (12)$$
where the function $\xi$ changes the dimension as in Eq. (11).
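The dimension-changing function $\xi$ of Eqs. (11)-(12) simply prepends a row of ones (the bias coefficients) to $A_d$; a minimal numpy sketch:

```python
import numpy as np

def xi(A_d):
    """Eq. (12): map A_d in R^{p x n} to A_b in R^{(p+1) x n} per Eq. (11),
    fixing the first row of A_b (the bias coefficient) to 1."""
    _, n = A_d.shape
    return np.vstack([np.ones((1, n)), A_d])
```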

Training Step: Given $(X, Y) \in \mathbb{R}^{m \times n}$, find $D_d$, $D_b$:
$$(\hat{D}_d, \hat{A}_d) = \underset{D_d, A_d}{\arg\min} \; \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{1}{2} \left\| y^i - D_d \alpha_d^i \right\|_2^2 + \lambda_A \Omega_A(\alpha_d^i) + \lambda_D \Omega_D(D_d) \right] \quad (13)$$
$$\hat{D}_b = \underset{D_b}{\arg\min} \; \frac{1}{2} \left\| X - D_b \xi(\hat{A}_d) \right\|_2^2 + \gamma_D \Omega_D(D_b), \quad (14)$$
where $\lambda_A, \lambda_D, \gamma_D \ge 0$. The dual dictionary $D_d$, $D_b$ is learned with the shared coefficients $A_d$ over the dark and bright image histograms; hence the dual dictionary has matched basis functions. Equation (13) uses sparse coding, and sparsity is controlled through $\lambda_A$: if $\lambda_A$ is large, the shrinkage effect increases and the sparsity increases; if $\lambda_A$ is small, the opposite occurs. For best performance, the parameters found in the validation step are used; however, if a stronger sparsity effect is needed, a large $\lambda_A$ is used.

Validation Step: The parameters $\lambda_A$, $\lambda_D$, $\gamma_D$ are tuned for the best performance.

Testing Step: Find $\hat{\alpha}$ using $y$ and the learned dictionary $D_d$, and estimate $\hat{x}$ using the dictionary $D_b$:
$$\hat{\alpha} = \underset{\alpha}{\arg\min} \; \frac{1}{2} \left\| y - D_d \alpha \right\|_2^2 + \lambda_A \Omega_A(\alpha) \quad (15)$$
$$\hat{x} = \underset{x}{\arg\min} \; \frac{1}{2} \left\| x - D_b \xi(\hat{\alpha}) \right\|_2^2, \quad (16)$$
where $\lambda_A \ge 0$. In Eq. (16), $x$ is estimated by $\hat{x} \approx D_b \xi(\hat{\alpha})$.
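Putting the testing step together, here is a sketch of dark-histogram enhancement built on the `ista_lasso` and `xi` sketches above (those function names are assumptions carried over from the earlier sketches, not the paper's own solver):

```python
def enhance_histogram(y, D_d, D_b, lam_A=0.1):
    """Testing step, Eqs. (15)-(16): sparse-code the dark histogram y on D_d,
    then reconstruct the bright histogram with the matched dictionary D_b."""
    a = ista_lasso(y, D_d, lam_A)        # Eq. (15), here via the ISTA sketch
    return D_b @ xi(a[:, None])[:, 0]    # Eq. (16): x_hat = D_b xi(a)
```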

3.2. Optimization

In Eq. (13), the function $\Omega$ is a regularization term. This algorithm uses the $l_1$-norm as the coefficient regularization term $\Omega_A$ to obtain sparsity. Therefore, solving Eq. (13) in $A_d$ requires a LASSO optimization algorithm; the proposed algorithm uses H. Lee's LASSO optimization algorithm [5]. The dictionary regularization term $\Omega_D$ is the $l_2$-norm regularization term, which is differentiable. Therefore, this algorithm uses the gradient method for solving Eqs. (13) and (14).


Algorithm 1 Dual Dictionary Learning

Require: $X = [x^1, \ldots, x^n] \in \mathbb{R}^{m \times n}$, $Y = [y^1, \ldots, y^n] \in \mathbb{R}^{m \times n}$ (bright/dark training data), $D_b = [d_b^1, \ldots, d_b^{p+1}] \in \mathbb{R}^{m \times (p+1)}$, $D_d = [d_d^1, \ldots, d_d^p] \in \mathbb{R}^{m \times p}$ (initial bright/dark dictionaries), $\lambda_A, \lambda_D, \gamma_D \in \mathbb{R}$ (parameters), $\rho_D \in \mathbb{R}$ (step size for gradient descent), $T_L$ (number of dictionary learning iterations), $T_D$ (number of gradient descent iterations)
for $t = 1$ to $T_L$ do
  Sparse coding: compute $A_d$ from Eq. (13) with $D_d$ fixed, using the H. Lee algorithm [5]
  Dictionary update:
  for $j = 1$ to $T_D$ do
    update $D_d$ and $D_b$ by gradient descent steps of size $\rho_D$ on Eqs. (13) and (14) with $A_d$ fixed
  end for
  $\hat{D}_d = D_d$, $\hat{D}_b = D_b$
end for
Return $(\hat{D}_d, \hat{D}_b)$ (learned dictionaries)

Algorithm 1 summarizes the proposed procedure for learning $D_d$ and $D_b$. In this algorithm, each initial dictionary consists of image histograms selected randomly from the training data.
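A runnable numpy sketch of Algorithm 1 follows, with the ISTA sketch from Section 2.2 standing in for the H. Lee solver and explicit gradient steps assuming squared-$l_2$ dictionary penalties; the hyperparameter values and gradient form are illustrative assumptions.

```python
import numpy as np

def dual_dictionary_learning(X, Y, p, lam_A=0.1, lam_D=0.1, gam_D=0.1,
                             rho_D=0.01, T_L=50, T_D=5, seed=0):
    """Sketch of Algorithm 1: learn matched dark/bright histogram dictionaries."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    # Initialize dictionaries from randomly selected training histograms.
    D_d = Y[:, rng.choice(n, p, replace=False)].astype(float)
    D_b = np.hstack([np.ones((m, 1)), X[:, rng.choice(n, p, replace=False)]])
    for _ in range(T_L):
        # Sparse coding: solve Eq. (13) in A_d with D_d fixed (ISTA stand-in).
        A_d = np.column_stack([ista_lasso(Y[:, i], D_d, lam_A) for i in range(n)])
        A_b = np.vstack([np.ones((1, n)), A_d])  # Eqs. (11)-(12)
        # Dictionary update: gradient descent on Eqs. (13) and (14).
        for _ in range(T_D):
            D_d -= rho_D * (-(Y - D_d @ A_d) @ A_d.T / n + lam_D * D_d)
            D_b -= rho_D * (-(X - D_b @ A_b) @ A_b.T / n + gam_D * D_b)
    return D_d, D_b
```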

4. Results and Conclusions

Experiments on the Berkeley segmentation database are conducted to evaluate the proposed algorithm. To obtain dark images, gamma correction is applied, and performance is evaluated in terms of PSNR. The Berkeley segmentation database consists of 500 natural images, and we randomly choose: 1) 400 images for training the dual dictionary $D_d$, $D_b$; 2) 50 images for parameter tuning; 3) 50 images for testing. To benchmark the proposed algorithm, two fundamental algorithms, gamma correction and histogram equalization, are adopted. Figure 2 illustrates the qualitative results on a sample image.


Fig. 2. (a) Original image (b) Dark image ($\gamma = 4$) (c) Histogram equalization result (d) Gamma correction result (e) Proposed algorithm result

Fig. 3. Experimental results: average PSNR on the test images versus $\gamma$ (1.5 to 6) for gamma correction, histogram equalization and the proposed algorithm

The histogram-equalized image in Figure 2-(c) shows contrast enhancement while losing some color information. The gamma-corrected image in Figure 2-(d) shows contrast enhancement and color recovery, but degradation in the dark area. The proposed algorithm shows contrast enhancement and color recovery, and performs better than gamma correction in the dark area, since the primary patterns, i.e., the dictionaries, are pre-learned from the training images. The averaged PSNR of the proposed algorithm (red), gamma correction (blue) and histogram equalization (black) on the test images is reported in Figure 3. The result shows that the proposed algorithm outperforms the benchmark algorithms in harsh conditions.

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2012-0005378 and No. 2012-0000985).

References

[1] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: A strategy employed by V1?", Vision Research, Vol. 37, pp. 3311-3325, 1997.
[2] M. Aharon, M. Elad and A. M. Bruckstein, "The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representation", IEEE Transactions on Signal Processing, Vol. 54, No. 11, 2006.
[3] K. Engan, S. O. Aase and J. H. Husøy, "Multi-frame compression: Theory and design", EURASIP Signal Processing, Vol. 80, No. 10, pp. 2121-2140, 2000.
[4] J. Mairal, F. Bach, J. Ponce and G. Sapiro, "Online learning for matrix factorization and sparse coding", Journal of Machine Learning Research, Vol. 11, pp. 19-60, 2010.
[5] H. Lee, A. Battle, R. Raina and A. Y. Ng, "Efficient sparse coding algorithms", Advances in Neural Information Processing Systems, Vol. 19, pp. 801-808, 2007.
[6] M. S. Lewicki and B. A. Olshausen, "A probabilistic framework for the adaptation and comparison of image codes", J. Opt. Soc. Amer. A: Opt., Image Sci. Vision, Vol. 16, No. 7, pp. 1587-1601, 1999.
[7] B. A. Olshausen and D. J. Field, "Natural image statistics and efficient coding", Network: Computation in Neural Systems, Vol. 7, No. 2, pp. 333-339, 1996.
[8] M. S. Lewicki and T. J. Sejnowski, "Learning overcomplete representations", Neural Computation, Vol. 12, pp. 337-365, 2000.
[9] F. Bergeaud and S. Mallat, "Matching pursuit of images", in Proceedings of the International Conference on Image Processing, Vol. 1, pp. 53-56, 1995.
[10] D. L. Donoho, Y. Tsaig, I. Drori and J.-L. Starck, "Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit", IEEE Transactions on Information Theory, 2012.
[11] R. Tibshirani, "Regression shrinkage and selection via the lasso", Journal of the Royal Statistical Society, Series B, Vol. 58(1), pp. 267-288, 1996.
[12] S. P. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[13] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables", Journal of the Royal Statistical Society, Series B, Vol. 68(1), pp. 49-67, 2006.
[14] R. Jenatton, J. Mairal, G. Obozinski and F. Bach, "Proximal methods for sparse hierarchical dictionary learning", International Conference on Machine Learning, pp. 487-494, 2010.