FAST NEAR INFRARED FUSION-BASED ADAPTIVE ENHANCEMENT OF VISIBLE IMAGES
Ahmed Elliethy, Hussein A. Aly
Dept. of Computer Engineering, Military Technical College, Cairo, Egypt. [email protected], [email protected]
ABSTRACT
Visible (VS) and near-infrared (NIR) band sensors provide digital images that capture complementary spectral radiations from a scene. Since NIR radiation propagates well through haze, mist, or fog, the captured NIR image contains better scene details than the VS image in such cases. However, NIR radiation is material dependent and provides little information about the color or texture of the scene's objects. To exploit the complementary details provided by VS and NIR images, we propose a fusion approach that adaptively injects missing spatial details into the VS image from the NIR image while preserving the spectral contents of the VS image. The spatial details are adaptively weighted based on the relative difference between the local contrasts of the NIR and VS images. Thus, the proposed approach prevents unnecessary modification of colors or amplification of scene details that would result in an unrealistic fused image. Moreover, the proposed approach is non-iterative, fast with a low complexity of O(n), and suitable for implementation on embedded camera hardware. Experimental fusion results obtained on natural NIR and VS image pairs show the effectiveness of the proposed approach compared with two alternatives.
Index Terms— Image fusion, near infrared, image enhancement
1. INTRODUCTION
Digital images represent captured spectral radiations from a scene in the visible band (VS), with wavelengths λ ∈ [400, 700] nm, in the form of three color components (red, green, and blue). Challenging imaging conditions such as haze, mist, fog, and overwhelming or poor lighting may degrade the quality of the captured VS images. On the other hand, a near-infrared (NIR) sensor sensitive to wavelengths of λ ∈ [650, 1650] nm can capture complementary radiations from the scene and provide a see-through mechanism under the aforementioned challenging imaging conditions [1]. However, NIR radiation is material dependent [2, 3], and therefore some details about objects made from the same material may be lost. Figure 1 shows examples of VS and NIR image pairs captured for the same scene that demonstrate the complementary details provided by the VS and NIR bands.
Several VS-NIR fusion approaches that exploit the complementary details provided by VS and NIR images have been proposed to tackle different kinds of problems, such as VS image enhancement [4–6] and de-hazing [7, 8]. In [4], a fused image is obtained from the VS image by replacing either a color plane or the luminance plane of the VS image with the NIR one. In [5], a contrast-preserving mapping model is proposed to alter the pixel values of the NIR image to match their corresponding pixels in the luminance plane of the VS image while preserving the local contrast of the NIR image.
Fig. 1. Sample VS and NIR image pairs from [4] captured for the same scene that demonstrate the complementary details provided by the VS and NIR bands; the VS and NIR images are shown in the first and second rows, respectively. The image pair in column (a) shows the effectiveness of the NIR in the presence of haze (at the mountain area), while column (b) shows that the NIR image suffers from some loss of details (at the water area) compared to the VS image.
The altered NIR image is then used along with other color information from the VS image to construct the fused image. These fusion approaches provide an effective enhancement of the VS image when the NIR image contains more details than the VS image. However, when the NIR image suffers from loss of details in some areas, as shown in Fig. 1 (b), the fusion may deteriorate these areas and degrade the VS image. To overcome this problem, the fusion approach proposed in [6] uses an adaptive smoothness constraint based on gradient and color correlations between the VS and NIR images. However, that fusion approach is iterative and computationally intensive.
In this paper, we propose a fusion approach that adaptively injects spatial details from the NIR image into the VS image without altering the colors of the VS image. The proposed approach has three stages. First, the local contrasts of both the NIR and VS images are computed. Then, the spatial details of the NIR image are extracted using a carefully designed high pass filter. Finally, the extracted spatial details are weighted according to the relative difference between the computed local contrasts and injected into the VS image. The proposed approach offers two key advantages over the aforementioned prior approaches.
Fig. 2. Block diagram of the proposed approach. The local contrasts for the NIR image I_NIR and the luminance plane Y of the VS image I_RGB are first computed. Then, the spatial details from the NIR are extracted using a high pass filter. Finally, the spatial details are weighted according to a fusion map F and injected into the VS image to obtain the enhanced image J_RGB. Note that a constant is added to the spatial details for better visualization.
First, the proposed approach incorporates only the missing spatial details from the NIR image into the VS image, without introducing unnecessary modification of colors or amplification of details that may result in an unrealistic fused image. Second, the proposed approach is non-iterative and fast, with a low complexity of O(n). It is therefore suitable for implementation on embedded camera hardware.
The rest of the paper is organized as follows. In Sec. 2, we detail the proposed adaptive VS-NIR fusion approach. In Sec. 3, we present a visual comparison of the fusion results obtained by the proposed approach and by the methods in [4] and [5]. Finally, the paper is concluded in Sec. 4.
2. PROPOSED ADAPTIVE VISIBLE AND NIR FUSION APPROACH
The proposed approach is designed based on the following propositions:
• The spatial details that are apparent only in the NIR image I_NIR and lost in the VS image I_RGB should be incorporated into the fused image J_RGB.
• The spectral contents (colors) of I_RGB should be preserved after fusion.
Based on these propositions, we designed the proposed approach as shown in the block diagram in Fig. 2. The local contrasts for the NIR image I_NIR and the luminance plane Y of the VS image I_RGB are first computed to estimate a fusion map F. Then, the spatial details from the NIR are extracted using a high pass filter. Finally, the spatial details are weighted according to the fusion map F and injected into the VS image to obtain the enhanced image J_RGB.
The fusion map F is the key that determines the regions that suffer from missing spatial details in I_RGB compared to I_NIR. To estimate F, we first extract the luminance plane Y from I_RGB; then F is defined as the relative difference in local contrast between I_NIR and Y. Specifically,
\[
F(\mathbf{x}) = \frac{\max\!\left(0,\; LC\!\left(I_{\mathrm{NIR}}(\mathbf{x})\right) - LC\!\left(Y(\mathbf{x})\right)\right)}{LC\!\left(I_{\mathrm{NIR}}(\mathbf{x})\right)}, \tag{1}
\]
where LC(I(x)) is the local contrast of the image I at the spatial location x = [x, y]^T. Inspired by [9], our local contrast is defined as
\[
LC\!\left(I(\mathbf{x})\right) = \alpha \left( \max_{\mathbf{x}' \in N(\mathbf{x})} I(\mathbf{x}') - \min_{\mathbf{x}' \in N(\mathbf{x})} I(\mathbf{x}') \right) + (1 - \alpha) \left( \max_{\mathbf{x}' \in N(\mathbf{x})} \left\| \nabla I(\mathbf{x}') \right\| \right), \tag{2}
\]
where N(x) is an S × S neighborhood around x, ∇I is the spatial gradient of I, and α = 0.5 is a constant. Note that F has large values in the regions that have better spatial details in I_NIR than in I_RGB, and low values (or zeros) in the other regions, where the spatial details of I_RGB are better. Hence, F serves as our adaptive selector of the amount of fusion (injection) of the spatial details of I_NIR to produce J_RGB.
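As an illustration, a minimal NumPy sketch of the local contrast of Eq. (2) and the fusion map of Eq. (1) might look as follows. The authors' implementation is in C++; here the use of maximum_filter/minimum_filter for the S × S neighborhood, np.gradient for ∇I, and the small eps guarding the division are assumptions, not details given in the paper.

```python
# Sketch of Eqs. (1) and (2); S = 5 and alpha = 0.5 follow the paper.
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_contrast(img, S=5, alpha=0.5):
    """Eq. (2): local contrast of a single-channel image (float, e.g. in [0, 1])."""
    # max - min over the S x S neighborhood N(x)
    local_range = maximum_filter(img, size=S) - minimum_filter(img, size=S)
    gy, gx = np.gradient(img)                      # spatial gradient of I
    grad_mag = np.hypot(gx, gy)                    # ||grad I(x')||
    local_grad = maximum_filter(grad_mag, size=S)  # max over N(x)
    return alpha * local_range + (1.0 - alpha) * local_grad

def fusion_map(y_vs, i_nir, S=5, eps=1e-6):
    """Eq. (1): relative local-contrast difference between I_NIR and Y."""
    lc_nir = local_contrast(i_nir, S)
    lc_vs = local_contrast(y_vs, S)
    # eps avoids division by zero in flat NIR regions (an assumption)
    return np.maximum(0.0, lc_nir - lc_vs) / (lc_nir + eps)
```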
We designed a high pass filter g to extract the higher frequency contents (spatial details) of I_NIR as g = δ − h, where δ is the unit impulse filter and h is a prototype Gaussian filter with radial cut-off frequency Ω_cut cycles/picture height (c/ph) and kernel size k × k. More details about how we set the parameters of g are presented in Sec. 3.
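As a sketch, the prototype Gaussian and the resulting high pass filter might be constructed as follows. The paper specifies Ω_cut = 0.05 and k = 19 but not the exact Gaussian parameterization, so deriving σ from a −3 dB cutoff of the Gaussian frequency response is an assumption.

```python
# Sketch of g = delta - h with a truncated Gaussian prototype h.
import numpy as np

def highpass_kernel(omega_cut=0.05, k=19):
    # Gaussian response |H(f)| ~ exp(-2 pi^2 sigma^2 f^2); solving
    # |H(omega_cut)| = 1/sqrt(2) (-3 dB) gives sigma (an assumed criterion).
    sigma = np.sqrt(np.log(2.0)) / (2.0 * np.pi * omega_cut)
    r = (k - 1) // 2                  # kernel radius, k assumed odd
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    h = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    h /= h.sum()                      # unit-DC low pass prototype h
    g = -h
    g[r, r] += 1.0                    # g = delta - h (unit impulse at center)
    return g
```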
With the estimated fusion map F and the extracted spatial details g ∗ I_NIR (where ∗ denotes convolution), we propose the fusion process that generates¹ the image J_RGB as
\[
J_{\mathrm{RGB}}(\mathbf{x}) = I_{\mathrm{RGB}}(\mathbf{x}) + F(\mathbf{x})\left(g * I_{\mathrm{NIR}}\right)(\mathbf{x}). \tag{3}
\]
Note that only the higher frequency contents (g ∗ I_NIR) are injected into I_RGB, while the base-band contents of I_RGB are left intact. Additionally, in the regions where the captured spatial details of I_RGB are attenuated compared to their counterparts in I_NIR, F will be large to boost the injected high frequency contents from I_NIR. On the contrary, in the other regions, where the spatial details of I_RGB are better, F → 0 and the second term in Eq. (3) vanishes or has very little effect. Therefore, the proposed fusion in Eq. (3) complies with our propositions.
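A sketch of the complete fusion of Eq. (3), reusing the fusion_map and highpass_kernel helpers from the sketches above, could look as follows; the BT.601 luminance weights and the final clipping to [0, 1] are assumptions, since the paper does not specify the luminance definition or the range handling.

```python
# Sketch of Eq. (3), applied to every color channel as in the paper (footnote 1).
# i_rgb: float array of shape (H, W, 3); i_nir: float array of shape (H, W).
import numpy as np
from scipy.ndimage import convolve

def fuse(i_rgb, i_nir, omega_cut=0.05, k=19, S=5):
    y = i_rgb @ np.array([0.299, 0.587, 0.114])   # luminance Y (assumed BT.601)
    F = fusion_map(y, i_nir, S)                   # Eq. (1)
    details = convolve(i_nir, highpass_kernel(omega_cut, k))  # g * I_NIR
    j_rgb = i_rgb + (F * details)[..., None]      # Eq. (3), per channel
    return np.clip(j_rgb, 0.0, 1.0)               # clipping is an assumption
```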
The proposed fusion approach is non-iterative and has low computational complexity. Due to space constraints, we summarize the operations required for each equation of the proposed approach in Table 1. From the table, the computational complexity C of the proposed approach, applied to an image I_RGB with a total number of n pixels, is given by
\[
C(n) = n\left( A\left(k^2 + 13\right) + M\left(k^2 + 12\right) + C\left(3S^2 + 1\right) \right) = O(n). \tag{4}
\]
This fast, non-iterative method can be implemented in hardware on a system-on-chip and integrated into the camera hardware.
¹Note that the fusion in Eq. (3) is performed on every channel (red, green, and blue) of I_RGB.
Eq.   add/sub (A)   mult/div (M)   comparison (C)
(1)   1             1              1
(2)   9             10             3S²
(3)   k² + 3        k² + 1         0

Table 1. Computational complexity analysis for the proposed fusion approach (operations per pixel).
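For concreteness, the per-pixel totals in Eq. (4) follow directly by summing the columns of Table 1:

\begin{align*}
A &= \underbrace{1}_{(1)} + \underbrace{9}_{(2)} + \underbrace{k^2 + 3}_{(3)} = k^2 + 13,\\
M &= \underbrace{1}_{(1)} + \underbrace{10}_{(2)} + \underbrace{k^2 + 1}_{(3)} = k^2 + 12,\\
C &= \underbrace{1}_{(1)} + \underbrace{3S^2}_{(2)} + \underbrace{0}_{(3)} = 3S^2 + 1.
\end{align*}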
3. EXPERIMENTAL RESULTS
We evaluated the proposed approach on the dataset from [4], which consists of 477 pairs of VS-NIR images organized into 9 categories. The images were captured using a modified SLR camera with an IR-block or IR-pass filter in front of the camera's lens. We set S = 5, Ω_cut = 0.05, and k = 19 in all experiments. The parameter Ω_cut was determined by performing a linear search over the range Ω_cut : 0.01 → 0.5; for each value of Ω_cut, we first performed the proposed fusion approach on a large number of images from the dataset, then subjectively evaluated every fused image, and finally picked the value of Ω_cut that results in high quality fused images. Similarly, we performed a linear search for k, starting from k = 5, until we obtained minimum ripples in both the stop and pass bands of the frequency response of g. The magnitude of the frequency response of g with the above specified parameters is shown in Fig. 3.
[Surface plot of the filter magnitude response, −25 to 0 dB, over spatial frequencies u, v ∈ [−1/2, 1/2] c/ph.]
Fig. 3. Magnitude of the frequency response of the high pass filter g with Ω_cut = 0.05.
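The paper does not state the ripple criterion used in the search for k. The following sketch, reusing highpass_kernel from the earlier sketch, illustrates one way such a search could measure ripple, via a zero-padded 2-D FFT of g; the band boundaries (0.5 Ω_cut for the stop band, 4 Ω_cut for the pass band) are assumptions.

```python
# Sketch: measure stop-band and pass-band ripple of g for a candidate k as
# deviation from zero gain (stop band) and unity gain (pass band).
import numpy as np

def response_ripple(k, omega_cut=0.05, n_fft=512):
    G = np.abs(np.fft.fftshift(np.fft.fft2(highpass_kernel(omega_cut, k),
                                           (n_fft, n_fft))))
    f = np.fft.fftshift(np.fft.fftfreq(n_fft))        # cycles/sample
    radius = np.hypot(*np.meshgrid(f, f))             # radial frequency
    stop_ripple = G[radius < 0.5 * omega_cut].max()   # should stay near 0
    pass_ripple = np.abs(G[radius > 4.0 * omega_cut] - 1.0).max()  # near 1
    return stop_ripple, pass_ripple
```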
We compare the proposed approach with (a) the coloring approach in [4] and (b) the contrast-preserving mapping approach in [5]. The experiments were performed on a large number of image pairs from the dataset, and we present a few of them in Fig. 5 and Fig. 6. The columns of both figures from left to right represent I_RGB, I_NIR, the fused image obtained using [4], the fused image obtained using [5], and the fused image J_RGB obtained using the proposed approach, respectively. Additionally, the fusion maps corresponding to the image pairs in the first two rows of Fig. 5 and the first two rows of Fig. 6 are shown in Fig. 4 (a) and (b), respectively.
As shown in Fig. 5, the fused images J_RGB obtained by the proposed approach show a great enhancement of the scene details compared to I_RGB while preserving the spectral contents of I_RGB. For example, most of the blurred details in the I_RGB shown in the first row of the figure are restored in the fused image. As another example, the details of the hazy distant regions missing in I_RGB (shown in the last three rows of the figure) are much better in the fused image than in I_RGB. The reason behind the obtained enhancements is that the fusion map automatically determines the regions in I_RGB where scene details are missing (as shown in Fig. 4 (a)); accordingly, our adaptive fusion approach incorporates the details from I_NIR into I_RGB
Fig. 4. Corresponding fusion maps for sample image pairs from Fig. 5 and Fig. 6. Specifically, the fusion maps in (a) and (b) correspond to the image pairs in the first two rows of Fig. 5 and the first two rows of Fig. 6, respectively.
without introducing unnecessary modification of colors or amplification of scene details that may result in an unrealistic fused image.
Note that, for all image pairs in Fig. 5, the NIR images have better captured scene details than the corresponding VS images. Therefore, the fused images obtained using [4] and [5] also show better scene details than I_RGB, since these approaches maintain a high fidelity between the fused and the NIR images. However, these fused images have noticeably different spectral contents compared to I_RGB and seem unnatural.
The problem of obtaining unnatural fused images using [4] and [5] is very apparent in Fig. 6. This is because the approaches in [4] and [5] blindly maintain high fidelity between the fused and the NIR images even when the NIR images lack scene details, as shown in the figure. On the other hand, the proposed fusion approach avoids this problem by adaptively estimating the fusion map, which in this case has small values (as shown in Fig. 4 (b)); therefore, the injection from the NIR to the VS image is suppressed.
We implemented the proposed approach in C++. The running time for the images included in the results (of size 682 × 1024) was 0.7 seconds on a laptop with a Core i7 processor and 12 GB of memory. The proposed approach is 2.5× faster than the running time² of the method in [5].
4. CONCLUSIONS
In this paper, we propose a fast fusion approach to enhance a VS image by adaptively injecting spatial details from a co-registered NIR image without altering the colors of the visible image. Specifically, the proposed approach first estimates a fusion map from the local contrasts of the NIR and VS images. Then, the spatial details of the NIR image are extracted using a carefully designed high pass filter. Finally, the extracted spatial details are weighted according to the fusion map and injected into the VS image.
²Note that the speed-up is reported according to our own implementation of the method in [5].
Fig. 5. Sample VS and NIR images and their fusion results. The columns from left to right represent I_RGB, I_NIR, the fused image obtained using [4], the fused image obtained using [5], and the fused image obtained using the proposed approach, respectively. Note that, for all image pairs, the NIR images have better captured scene details than the corresponding VS images. Images are best viewed electronically.
Fig. 6. Sample VS and NIR images and their fusion results. The columns are organized similarly to those of Fig. 5. Note that, for all image pairs, the NIR images lack scene details compared to the corresponding VS images. Images are best viewed electronically.
Two key advantages are offered by the proposed approach. First, the adaptive fusion incorporates only the missing spatial details from the NIR image into the VS image, without introducing any unnecessary modification of colors or amplification of details that may result in an unrealistic fused image. Second, the proposed approach has low computational complexity. The advantages of the proposed approach are highlighted with several fusion results obtained from natural NIR and VS image pairs. The proposed approach takes only about 0.7 seconds to fuse an image of size 682 × 1024 and shows better image enhancement compared to two alternative approaches.
5. REFERENCES
[1] C. Colvero, M. Cordeiro, G. De Faria, and J. Von der Weid, "Experimental comparison between far- and near-infrared wavelengths in free-space optical systems," Microwave and Optical Technology Letters, vol. 46, no. 4, pp. 319–323, 2005.
[2] N. Salamati, C. Fredembach, and S. Susstrunk, "Material classification using color and NIR images," Color and Imaging Conference, vol. 2009, no. 1, pp. 216–222, 2009.
[3] N. Salamati and S. Susstrunk, "Material-based object segmentation using near-infrared information," Color and Imaging Conference, vol. 2010, no. 1, pp. 196–201, 2010.
[4] C. Fredembach and S. Susstrunk, "Colouring the near-infrared," Color and Imaging Conference, vol. 2008, no. 1, pp. 176–182, 2008.
[5] C. H. Son, X. P. Zhang, and K. Lee, "Near-infrared coloring via a contrast-preserving mapping model," in IEEE Global Conf. on Signal and Information Proc. (GlobalSIP), Dec 2015, pp. 677–681.
[6] D. Sugimura, T. Mikami, H. Yamashita, and T. Hamamoto, "Enhancing color images of extremely low light scenes based on RGB/NIR images acquisition with different exposure times," IEEE Trans. Image Proc., vol. 24, no. 11, pp. 3586–3597, Nov 2015.
[7] C. Feng, S. Zhuo, X. Zhang, L. Shen, and S. Susstrunk, "Near-infrared guided color image dehazing," in IEEE Intl. Conf. Image Proc., Sept 2013, pp. 2363–2367.
[8] L. Schaul, C. Fredembach, and S. Susstrunk, "Color image dehazing using the near-infrared," in IEEE Intl. Conf. Image Proc., Nov 2009, pp. 1629–1632.
[9] Y.-W. Tai and M. S. Brown, "Single image defocus map estimation using local contrast prior," in IEEE Intl. Conf. Image Proc., Nov 2009, pp. 1797–1800.