A method for fusing a pair of images in the JPEG domain
(Final Report)
Submitted by:
Marta Antosik
Supervisor:
A/P Ramakrishna Kakarala
20 July 2012
School Of Computer Science
Summer Research Internship 2012
ABSTRACT
This paper reports the work done during the Summer Research Internship. It introduces an
algorithm for fusing a pair of short and long exposure images in the JPEG domain. The
technique is very fast, as it needs only one pass over each pixel, and simple, as it does not
require any complex calculation. It also requires little memory, as only a single JPEG macro
block has to be stored during processing. These features are essential for the algorithm to be
used as part of a camera's digital processing. The paper also describes the implementation of
the method and the modifications it required.
INTRODUCTION AND BACKGROUND
Technology is developing rapidly, and customers' requirements are growing with it. Keeping up
requires continuous improvement. One area of this progress is photography. Manufacturers now
try to put as many functions as possible into a single device, and the quality of those
functions is expected to be similar to that of dedicated devices.
For photography, it is essential to increase image quality while keeping the file size as
small as possible. In addition, the algorithms need to be very fast, because photograph
enhancement has to run in real time.
Image compression is a well-known and well-developed field. Many algorithms reduce the size of
photographs while preserving good quality. However, there are still areas in which better
algorithms can be provided. One of those areas is image fusion, which can be used in digital
image stabilization and high dynamic range capture. Fusion can be done in the spatial domain,
by choosing the proper regions of the images being fused, or in the transform domain. This
paper presents an algorithm [1] that uses the second option in a very efficient way. The image
fusion is executed during the JPEG file writing operation. This is important because the JPEG
standard [2] is one of the most popular, and most devices already implement that compression
method, which makes the algorithm widely applicable.
LITERATURE REVIEW
Computational photography is an important and interesting topic, which is why much research
has already been conducted. To begin with, it is worth mentioning the publications of
Mitchell [3] and Stathaki [4], which summarize fusion techniques. Other papers provide
particular solutions. Khan et al [5] proposed a function that reduces ghosting artifacts by
using the probability that a pixel belongs to a moving object. That method needs a few
iterations, like the one provided by Lu et al [6], which is based on deconvolution. In another
work, Reinhard et al [7] proposed using the likelihood of a pixel resulting in ghosting, which
was determined by the variance across different exposures. Jacobs et al [8] provided a local
entropy-based technique which uses different low dynamic range images to detect motion in a
sequence. Tico and Pulli [9] and Tico et al [10] proposed another method of fusing a pair of
short and long exposure images, but in the wavelet domain.
Although many methods have been proposed for computational photography, there are still
goals to achieve. Those techniques require two or more images to be kept in memory, two or
more passes over each pixel, and substantial calculations which are not needed for a camera's
normal work. There is also an example of real use of image fusion, implemented in the iPhone.
However, its results are not satisfying. For static landscapes, that fusion gives many
benefits, but in the case of moving objects, the fusion still causes many motion artifacts.
RESEARCH OBJECTIVE
The purpose of this report was to provide a new implementation of the algorithm proposed by
Kakarala et al [1]. The technique covers shortcomings that occur in the algorithms provided in
other papers. The motivation for creating it was to find a simple algorithm which would be
efficient in computation and memory. As a result, it requires only one pass over each pixel and
only a single JPEG macro block to be kept in memory. In addition, the brightness function uses
a lookup table to reduce the algorithm's complexity.
Figure 1. Proposed algorithm of fusing pair of short and long exposure images in the JPEG domain
The algorithm takes a pair of short and long exposure images and fuses them in the JPEG domain.
The benefit is that the final picture gets its sharpness from the short exposure image and the
high signal-to-noise ratio of the long exposure image. Working in the JPEG domain keeps the
photo small while preserving good quality. As is widely known, JPEG exploits the fact that
human sight is more sensitive to luminance Y than to color, and therefore saves space by
subsampling the two chrominance channels Cb and Cr.
To understand the algorithm, it is important to introduce the exposure notation and the
assumption that the images vary only in exposure time, while aperture and ISO setting are
constant. Exposure value (EV) notation uses a logarithmic scale. Under that assumption, EV(0)
means the camera's default value, while EV(∆) means an exposure time $2^{\Delta}$ times longer.
Figure 1 briefly shows the steps of the algorithm. Before the images are fused, the short
exposure image needs to be boosted in brightness. To do this efficiently, a lookup table (LUT)
is used, so that only one transformation is required for each pixel. The pixels are boosted
using the sigmoid curve proposed by Zhang et al [11], $s(x; a, M, \kappa)$, where $M$ is the
maximum pixel value, which ranges from 255 to 4095. The best experimental results were achieved
for $\kappa = 1$, for which the curve takes the logistic form
$$s(x; a, M, 1) = \frac{M}{1 + e^{-ax}},$$
and for the scaled function
$$s_c(x; a, M) = 2\,s(x; a, M, 1) - M.$$
Figure 2. Sigmoid function, which is used for boosting short exposure image
For that function $s_c(0) = 0$ and $s_c'(0) = aM/2$, which means that the slope at the origin
is proportional to $a$. The best experimental results were achieved for
$$a = \frac{2(\Delta EV + 1)}{M},$$
where $\Delta EV$ is the difference in exposure value between the short and long exposure
images. In Figure 2 the curve $s_c(x; a, M)$ is plotted for the values used later, $M = 255$
and $\Delta EV = 2$. It can be seen that for $x \to \infty$ the function approaches $M$
asymptotically.
The next step is the regular transformation of the image from the RGB domain into the Y, Cb, Cr
domain. First, the long exposure image is processed and written into the file. After that, the
transformation is performed for the short exposure image. The next step is the fusion proper:
the luminance blocks of the long exposure image are overwritten by the luminance blocks of the
short exposure image. That ends the basic fusion process.
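As an illustration of these two steps, the sketch below converts one pixel with the standard
JFIF full-range RGB to YCbCr equations and then performs the basic fusion of one block by
copying luminance. The type YCbCr and the functions clamp_u8, rgb_to_ycbcr and fuse_luma are
hypothetical names chosen for this example; the report's program may organize this differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t y, cb, cr; } YCbCr;  /* hypothetical helper type */

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0)   return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* JFIF full-range RGB -> YCbCr conversion for one 8-bit pixel. */
YCbCr rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b)
{
    YCbCr p;
    p.y  = clamp_u8( 0.299    * r + 0.587    * g + 0.114    * b);
    p.cb = clamp_u8(-0.168736 * r - 0.331264 * g + 0.5      * b + 128.0);
    p.cr = clamp_u8( 0.5      * r - 0.418688 * g - 0.081312 * b + 128.0);
    return p;
}

/* Basic fusion of one 8x8 block: the luminance of the long exposure block
 * is overwritten by the (boosted) luminance of the short exposure block.
 * The chrominance of the long exposure block is left untouched. */
void fuse_luma(uint8_t y_long[64], const uint8_t y_short_boosted[64])
{
    assert(y_long != NULL && y_short_boosted != NULL);
    memcpy(y_long, y_short_boosted, 64);
}
```

In the actual algorithm the overwrite happens on JPEG-coded blocks in the output file, not on
raw pixel arrays; the memcpy here only shows which channel comes from which image.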
However, the resulting photo is not satisfactory. The combination Ybs, Cbl, Crl (boosted short
exposure luminance with long exposure chrominance) still contains motion artifacts, because the
original photos were taken at different moments. Simply composing luminance from the short
exposure image with chrominance from the long exposure image causes ghosting and color bleeding
at edges. That is the reason for the last step of the proposed algorithm, which uses DCT
analysis for artifact removal. During JPEG compression, the DCT is applied to every 8x8 macro
block. Blocks that contain many details have many nonzero coefficients. The End Of Block (EOB)
symbol marks the position after the last nonzero coefficient. By analyzing the location of that
symbol, it can be estimated how detailed the area is and whether it may contain an edge. In the
proposed technique, the luminance blocks of the short exposure image are checked, and the EOB
location threshold is set experimentally to 15 (out of 64). If the EOB symbol is detected
before the threshold location, the block is not detailed and the chrominance from the long
exposure image is kept. Otherwise, the block is classified as an edge and the chrominance from
the long exposure image is overwritten by the chrominance from the short exposure image. That
method gives very good results and is still very simple.
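The EOB test described above can be sketched as follows. The sketch assumes that the quantized
luminance coefficients of the block are already available in zigzag order, and it defines the
EOB position as one past the last nonzero coefficient, which is a convention chosen for this
example. The names eob_position, ChromaSource and choose_chroma are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the End Of Block symbol for one quantized 8x8 block whose
 * coefficients are in zigzag order: one past the last nonzero coefficient
 * (0 for an all-zero block, 64 for a block without trailing zeros). */
int eob_position(const int16_t zz[64])
{
    int last = 64;
    while (last > 0 && zz[last - 1] == 0)
        --last;
    return last;
}

typedef enum { CHROMA_FROM_LONG, CHROMA_FROM_SHORT } ChromaSource;

/* Decide which image supplies the chrominance of this block, based on the
 * luminance block of the short exposure image. An early EOB means a smooth
 * block (keep long exposure chrominance); a late EOB suggests an edge. */
ChromaSource choose_chroma(const int16_t y_short_zz[64], int threshold)
{
    assert(threshold >= 0 && threshold <= 64);
    return eob_position(y_short_zz) < threshold ? CHROMA_FROM_LONG
                                                : CHROMA_FROM_SHORT;
}
```

With the proposed threshold of 15, a block whose last nonzero coefficient sits at zigzag index
20 would take its chrominance from the short exposure image, while a block with only the first
two coefficients nonzero would keep the long exposure chrominance.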
RESEARCH
The objective of the research was to implement the proposed algorithm in the C programming
language, so that it could be used in a real device.
Because JPEG compression is well known and well developed, it was decided to use open source
code. The first choice was the program of the Independent JPEG Group. However, that code was
too complex and had many more features than were required. The next choice was a program
provided by berliOS Developer. The code was small and simple, but it required some extensions.
The work was divided into a few steps.
In the first step, restart markers were added to the JPEG file. In the program's default
algorithm, the mean value (DC) of every block (besides the first one) was coded relative to the
previous one. In the proposed algorithm that was not acceptable, because some blocks may be
overwritten. That is why a Define Restart Interval (DRI) marker with value 1 has to be put into
the file's header. The value of the marker says how many macro blocks are coded between restart
markers, and therefore depend on each other, so in that case every block is independent and can
be overwritten. The JPEG syntax also requires a restart marker after every block to work
properly.
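The two marker types mentioned above have a simple byte layout in the JPEG standard, which the
following sketch writes into a memory buffer. The function names write_dri and write_rst and
the buffer-based interface are assumptions made for this example and do not come from the
program used in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a Define Restart Interval segment: marker 0xFFDD, segment length 4,
 * then the 16-bit big-endian restart interval. Returns the bytes written. */
size_t write_dri(uint8_t *out, uint16_t interval)
{
    assert(out != NULL);
    out[0] = 0xFF; out[1] = 0xDD;            /* DRI marker */
    out[2] = 0x00; out[3] = 0x04;            /* segment length (4 bytes) */
    out[4] = (uint8_t)(interval >> 8);       /* interval, high byte */
    out[5] = (uint8_t)(interval & 0xFF);     /* interval, low byte */
    return 6;
}

/* Write restart marker RSTm with m = index mod 8 (0xFFD0 .. 0xFFD7). */
size_t write_rst(uint8_t *out, unsigned index)
{
    assert(out != NULL);
    out[0] = 0xFF;
    out[1] = (uint8_t)(0xD0 + (index % 8));
    return 2;
}
```

For the interval of 1 used here, write_dri(buf, 1) produces the bytes FF DD 00 04 00 01. An
encoder that emits restart markers also has to pad the entropy-coded data to a byte boundary
before each marker and reset the DC prediction after it, which is what makes each block
independently decodable.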
The next step was the simple fusion of the two images. In the original plan that step was
separate from the last step, artifact removal. In practice those two steps were combined. In
addition, the order of the algorithm needed to be changed. To keep the algorithm simple, both
images are processed at the same time. It is assumed that both images are taken before
processing and are kept in the camera's memory. At any moment, only 2 single JPEG macro blocks
are used. While a macro block of the short exposure image is being read, it is boosted. Then
the luminance of that block is transformed using the DCT and written into the file. During that
process, the location of the luminance EOB symbol is also checked. If it occurs before the
threshold, the chrominance of the long exposure image, which has already been read, is taken.
Otherwise, the chrominance of the short exposure image is used. After the decision, the process
is similar to that for luminance: the DCT is applied to the chrominance, which is then written
into the file.
The changes in the algorithm were required because of the way the chosen open source program
works. It does not store the EOB symbol in pure form. Detecting the EOB symbol in the written
file and the other required calculations would be much more complex than processing two images
at the same time. The change also saves time: for every macro block, the luminance and the
chrominance are each processed from only one image.
DATA ANALYSIS & RESULTS
The purpose of the research was to obtain a small, good quality photo enhanced by fusing a pair
of short and long exposure images. That goal was achieved: the output photo of the program
prepared during the research meets all the stated goals.
The results are best presented with an example. Figure 3 shows the short exposure image and
Figure 4 the long exposure image. Neither image alone is of good quality, because one is too
dark and the other has motion artifacts. To obtain a better result, the images were fused into
the photo shown in Figure 5. Simple fusion is still not enough, which is why the artifact
removal method was used. The result can be seen in Figure 6.
Figure 3. Short exposure image EV(-2)
Figure 4. Long exposure image EV(0)
Figure 5. Fused photo before artifact removal
Figure 6. Final fused photo after artifacts removal
To see the method's benefits better, it is necessary to look at zoomed photos. In Figure 7 it
is easy to recognize what the proposed algorithm achieves. It is worth remembering how simple
the algorithm is compared with its benefits.
Figure 7. Comparison of the photo before (on the left) and after (on the right) artifact removal
The benefits are easy to notice in Figure 7. The artifacts are caused by mismatched luminance
and chrominance values at the edges. That is why overwriting the chrominance with that of the
short exposure image in detailed areas brings very good results.
Figure 8. Relation between threshold and number of overwritten blocks
During testing of the algorithm, many threshold values were checked. The best value was 9,
smaller than the one in the algorithm proposal. In Figure 8 it can be seen that for a threshold
smaller than 10, 90% of blocks are still overwritten. That value allows removing the artifacts
and improving the photo's quality in comparison with the short or long exposure images, which
was the purpose of the research.
The algorithm is fast: the whole process takes about 5.9 seconds for a 10 megapixel image,
while a single image compression at that resolution takes about 0.7 second. It also gives good
size results, as the fused photo is only about twice as large as a normally compressed photo.
The program was tested on a few images. The results are presented in Table 1.
Short exposure     Long exposure     Resolution of     Size increase      Time of image     Time of
image              image             original images   after fusion [%]   compression [s]   fusion [s]
Basket_short.bmp   Basket_long.bmp   3888x2592         178                0.7               5.9
Monas_short.bmp    Monas_long.bmp    1280x1920         220                0.2               1.6
Safra_short.bmp    Safra_long.bmp    1920x1280         122                0.2               1.7
Cathedral_s.bmp    Cathedral_l.bmp   1920x1280         150                0.3               1.5
Table 1. Results of tests on further images
According to Table 1, the algorithm gives good results for many photos. The fusion time is
relatively short and the size of the final photo does not increase too much. The results create
the opportunity to use this implementation in real devices.
RECOMMENDATIONS
The proposed algorithm provides great benefits, keeping the photo small and of good quality at
the same time. Fusing the pair of images gives even better results, as the photo gets sharp
edges from the short exposure image and vivid colors from the long exposure image. The simple
trick also helps to remove motion artifacts. It is essential that the method remains very
simple and is suitable for hardware such as a camera. It is also important that only two single
JPEG macro blocks need to be stored in the camera's memory, so the method can be implemented as
part of the camera's digital processing. Because the algorithm needs only one pass over each
pixel, the method is very fast and can run in real time. However, the results are very good
only while the photos are displayed at a small size. On a bigger screen, the artifacts become
noticeable. In addition, the images need to be captured within a very short interval to allow
the fusion.
Figure 9. Fused photo taken by iPhone 4.2
For comparison with other algorithms, Figure 9 shows a fused photo made by iPhone 4.2. It is
easy to notice the motion artifacts, as the application works properly mainly for static
landscapes.
CONCLUSIONS
The research focused on computational photography. The main goal was to fuse a pair of short
and long exposure images in the JPEG domain. The method needed to be implemented in the C
language to be suitable for hardware such as a camera. It also had to be simple and fast to be
of practical benefit. The essential point of the algorithm is the artifact removal technique,
based on DCT analysis and on choosing the proper elements for the final photo. The results can
also be compared with the implementation of image fusion in the iPhone. It is easy to see that
when capturing moving objects the proposed algorithm gives much better results and, due to its
simplicity, can be implemented in a simple device as well.
REFERENCES
[1] Kakarala R, Hebbalaguppe R (2011) A method for fusing a pair of images in the JPEG
domain. Journal of Real-Time Image Processing
[2] Wallace GK (1992) The JPEG still picture compression standard. IEEE Trans Consumer
Electronics 38(1)
[3] Mitchell HB (2010) Image fusion: theories, techniques and applications. CRC Press
[4] Stathaki T (2008) Image fusion: algorithms and applications. Academic Press
[5] Khan E, Akyuz A, Reinhard E (2006) Ghost removal in high dynamic range images. In:
IEEE Intl. Conf Image Processing (ICIP), pp 530-533
[6] Lu PY, Huang TH, Wu MS, Cheng YT, Chuang YY (2009) High dynamic range image
reconstruction from hand-held cameras. Computer Vision and Pattern Recognition (CVPR),
IEEE Computer Society Conference on, pp 509-516
[7] Reinhard E, Ward G, Pattanaik S, Debevec P (2005) High dynamic range imaging:
Acquisition, display and image-based lighting. Morgan Kaufmann Publishers
[8] Jacobs K, Loscos C, Ward G (2008) Automatic high dynamic range image generation for
dynamic scenes. IEEE Computer Graphics and Applications, vol 28, pp 24-33
[9] Tico M, Pulli K (2009) Image enhancement method via blur and noisy image fusion. In:
Proc. Int. Conf. on Image Processing (ICIP), IEEE, pp 1521-1524
[10] Tico M, Gelfand N, Pulli K (2010) Motion-blur free exposure fusion. In: Proc. Int. Conf.
on Image Processing (ICIP), IEEE, pp 1521-1524
[11] Zhang X, Jones RW, Baharav I, Reid DM (2006) System and method for digital image
tone mapping using an adaptive sigmoidal function based on perceptual preference guidelines.
US Patent 7,023,580
[12] Pennebaker WB, Mitchell JL (1993) JPEG still image data compression standard. Van
Nostrand Reinhold