
978-1-7281-5761-0/20/$31.00 ©2020 IEEE

Low-Bit Hardware Implementation of DWT for 3D Medical Images Processing

Pavel A. Lyakhov1, Maria V. Valueva2, Nikolay N. Nagornov3, Nikolay I. Chervyakov4

Department of Applied Mathematics and Mathematical Modeling,

North-Caucasus Federal University Stavropol, Russia

[email protected], 2 [email protected], [email protected], [email protected]

Dmitrii I. Kaplun Department of Automation and Control Processes

Saint Petersburg Electrotechnical University "LETI" St. Petersburg, Russia

[email protected]

Abstract— Denoising and compression of tomographic images are important problems in modern medical imaging systems, and the discrete wavelet transform (DWT) is used in practice to solve them. We analyze the effect of quantization noise in the coefficients of DWT filters for 3D tomographic imaging. A method for quantizing wavelet filter coefficients is proposed that minimizes the resources required in hardware implementation. In the proposed method of 3D tomographic image DWT, all data are represented in fixed-point format. Hardware implementation is performed in Xilinx Vivado 2018.3 for the target device Kintex UltraScale xcku115-flvf-1924-3-e. Compared to known approaches, the proposed bit-width reduction uses 15% to 70% fewer hardware resources, has 12% to 28% lower device delay and consumes 4% to 59% less power.

Keywords— discrete wavelet transform; 3D image processing; medical imaging; quantization noise; hardware costs; FPGA

I. INTRODUCTION

Medical imaging uses many different methods such as magnetic resonance (MR) imaging [1, 2], radiography [2-4], radionuclide [4], optical [3, 5] and ultrasound [1] imaging. A typical medical imaging system consists of three components [6]: data collection, data consolidation and data processing. A data collection card filters incoming visual information. Once the data have been compensated and filtered in the scanners, they are sent to the data consolidation card for alignment and buffering. They are then sent to the image processing cards, which perform further enhanced image filtering. Modern FPGA devices are widely used in data consolidation and image processing to implement sophisticated application algorithms, including pattern recognition, image enhancement and data compression [7, 8].

Medical image denoising is an important problem in modern medical imaging systems. The rheological methods for increasing the resolution of MR elastography determine the viscoelastic properties by wave inversion, which is very sensitive to noise [1]. Noise affects the analysis of anatomical structures and complicates diagnostic applications in radiology [3]. The reconstruction of positron emission tomography (PET) images involves multiplicative noise, which hinders data analysis [4].

With advances in scanning technology and digital devices, medical imaging systems produce images of increasingly high quality, with higher resolution and bit depth. Thus, the amount of processed information increases significantly. This is especially true for 3D scanning technology [2]. The PET images of a single patient may require more than 4 GB of disk space [9]. Video recording of a relatively short retinal peeling procedure may require more than 40 GB of memory [5]. At the current level of storage technology, hard disk capacity averages 1-2 TB. Thus, the compression of 3D medical images is also an important problem in modern medical imaging systems.

Various transforms are used in practice for denoising and compression of 2D and 3D medical images. The most common of them are the discrete Fourier transform (DFT) [10] and the discrete wavelet transform (DWT) [5]. DWT provides both frequency and time information about a signal, while DFT extracts only frequency information. Image DWT is performed by convolution with a pair of lowpass and highpass wavelet filters of a filter bank, which extract the main and the detailed information respectively. Modern algorithms such as set partitioning in hierarchical trees (SPIHT) [11] and embedded zerotrees of wavelet transforms (EZW) [12] perform denoising and compression by manipulating the detailed information. The convolution operation has high computational complexity. Hardware implementation on modern microelectronic devices working with fixed-point numbers, such as field-programmable gate arrays (FPGA) and application-specific integrated circuits (ASIC), is one way to improve its performance [13, 14]. Converting wavelet filter coefficients into this format introduces quantization noise, so the convolution is performed with an error. The question therefore arises of what accuracy of coefficient representation in the device's memory is efficient in terms of resources yet sufficient to achieve the required image processing quality.

A thorough review of FPGA and ASIC architectures for DWT implementation in intelligent and biomedical applications is provided in [14]. The authors of [15] demonstrated that DWT together with Gaussian filtering shows the best results in noise removal and smoothing of electrocardiogram signals. The development and implementation of a full hardware model based on DWT for compressing and restoring EEG data on FPGAs are described in [16]. A DWT-based system for detecting epileptic seizures from the EEG data of patients with epilepsy was proposed in [17]. We found no discussion of the chosen wavelet coefficient bit-width in [14-17]. The authors of [18] and [19] quantized wavelet coefficients to 12 and 16 bits respectively, but gave no rationale for this choice.

The purpose of this work is to reduce the resource costs of the hardware implementation of wavelet processing of 3D medical images on modern microelectronic devices.

II. 3D MEDICAL IMAGES DWT

DWT is a signal transform using a filter bank: the input data are convolved with wavelet filters that translate them from a time representation into the time-frequency domain. Wavelet filters $F$ of the filter bank consist of coefficients $f_{F,i}$, where $i = 1, \ldots, k$. The filter bank contains lowpass and highpass wavelet filters for decomposition $(LD, HD)$ and reconstruction $(LR, HR)$. Daubechies wavelets $db(k/2)$ are the most common ones.

Consider a 3D digital grayscale image $I$ as a function $I(x, y, z)$ with the spatial coordinates $0 \le x \le X - 1$, $0 \le y \le Y - 1$ and $0 \le z \le Z - 1$. Thus, voxel values (analogues of 2D pixels for 3D space) are represented as $I(x, y, z)$. Convolution of a 3D image with wavelet filters is performed by formulas (1)

$$
\begin{aligned}
I'(x, y, z) &= \sum_{i=1}^{k} I(x - i, y, z) \cdot f_{F,i}, \\
I''(x, y, z) &= \sum_{i=1}^{k} I'(x, y - i, z) \cdot f_{F,i}, \\
I'''(x, y, z) &= \sum_{i=1}^{k} I''(x, y, z - i) \cdot f_{F,i},
\end{aligned}
\tag{1}
$$

where $I'$, $I''$ and $I'''$ are the results of convolution along rows, columns and frames respectively. 3D image DWT is performed by sequential convolution with filters (Fig. 1).
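The three sums in (1) can be sketched in plain Python as follows. This is a minimal illustration and not the authors' hardware design: the function names are hypothetical, the volume is stored as nested lists indexed `[x][y][z]`, and out-of-range indices are wrapped periodically, which is an assumption because the text does not state a boundary rule.

```python
def convolve_axis(volume, coeffs, axis):
    """Apply one line of formula (1) along the given axis (0=x, 1=y, 2=z).

    out(p) = sum over i = 1..k of in(p shifted back by i along axis) * f_i.
    Indices that fall outside the volume wrap around (assumed periodic extension).
    """
    X, Y, Z = len(volume), len(volume[0]), len(volume[0][0])
    size = (X, Y, Z)[axis]
    out = [[[0] * Z for _ in range(Y)] for _ in range(X)]
    for x in range(X):
        for y in range(Y):
            for z in range(Z):
                acc = 0
                for i, f in enumerate(coeffs, start=1):
                    p = [x, y, z]
                    p[axis] = (p[axis] - i) % size  # shifted, wrapped index
                    acc += volume[p[0]][p[1]][p[2]] * f
                out[x][y][z] = acc
    return out


def convolve_3d(volume, coeffs):
    """Sequential convolution by rows, columns and frames, as in formula (1)."""
    first = convolve_axis(volume, coeffs, axis=0)    # I'
    second = convolve_axis(first, coeffs, axis=1)    # I''
    return convolve_axis(second, coeffs, axis=2)     # I'''
```

For simplicity the sketch applies the same filter along all three axes; in the scheme of Fig. 1 each axis is filtered by a lowpass or a highpass filter depending on the branch.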

Analysis of the original image $I$ yields 8 sets of image decomposition coefficients: $LLL$, $LLH$, $LHL$, $LHH$, $HLL$, $HLH$, $HHL$ and $HHH$. These sets can be divided into approximating $(LLL)$ and detailing $(LLH, LHL, LHH, HLL, HLH, HHL, HHH)$ coefficients. Approximating coefficients correspond to the lowpass part of the signal and contain the main information about the image $I$. Detailing coefficients correspond to the highpass part of the signal and contain detailed information about the image $I$. 3D image denoising and compression are carried out by manipulating the detailing coefficients of the image decomposition. Synthesis of the image decomposition coefficients yields the reconstructed image $\tilde{I}$. Theoretically, the original image should be fully reconstructed since the scheme in Fig. 1 has the perfect reconstruction property. In practice, however, quantization noise occurs due to the digital representation of the wavelet filter coefficients. Quantization noise distorts both the image decomposition coefficients and the reconstructed image $\tilde{I}$. Depending on the magnitude of the quantization noise, the result of image DWT may have a quality unacceptable for the task.

Fig. 1. The scheme of 3D image DWT
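The split into one approximating and seven detailing sets can be illustrated with a short Python sketch. The subband names follow the text; the `hard_threshold` helper is a hypothetical example of "manipulating detailing coefficients" and is not taken from the paper, which does not specify a particular rule.

```python
from itertools import product

# One letter per axis: L = lowpass branch, H = highpass branch.
SUBBANDS = ["".join(p) for p in product("LH", repeat=3)]
APPROXIMATING = [s for s in SUBBANDS if s == "LLL"]
DETAILING = [s for s in SUBBANDS if s != "LLL"]


def hard_threshold(subbands, threshold):
    """Zero small detailing coefficients and leave LLL untouched.

    subbands maps a name such as "LLH" to a flat list of coefficients.
    Illustrative rule only (an assumption, not the authors' method).
    """
    result = {}
    for name, values in subbands.items():
        if name == "LLL":
            result[name] = list(values)
        else:
            result[name] = [v if abs(v) >= threshold else 0 for v in values]
    return result
```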



Efficient software and hardware implementation of 3D image DWT on modern devices requires determining the minimum bit-width of the wavelet filter coefficients $f_{F,i}$ that is still sufficient for high-quality image processing. On modern devices, operations on fixed-point numbers are faster than operations on floating-point numbers, which can be used to improve 3D medical imaging systems. In the proposed method, we quantize the wavelet filter coefficients and convert them into fixed-point format by scaling by $2^n$ and rounding up:

$$
f^{*}_{F,i} = \left\lceil 2^{n} f_{F,i} \right\rceil. \tag{2}
$$

The bit-width $r$ of the quantized wavelet filter coefficients $f^{*}_{F,i}$ is found from the equation $r = n + 1$. The digital image $I^{*}$ is processed according to the scheme in Fig. 1 using the quantized wavelet filter coefficients $f^{*}_{F,i}$. Voxel values of the image $I^{*}$ should then be normalized by scaling by $2^{-6n}$ ($2^{-n}$ for each convolution in the scheme) and rounding down:

$$
\tilde{I} = \left\lfloor 2^{-6n} I^{*} \right\rfloor. \tag{3}
$$
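Formulas (2) and (3) can be tried out with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the db2 lowpass taps are the standard closed-form values in one common ordering, which may differ in order or sign from the filters used in the authors' implementation.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Standard db2 lowpass taps (ordering is an assumption for this sketch).
DB2_LOWPASS = [
    (1 + SQRT3) / NORM,
    (3 + SQRT3) / NORM,
    (3 - SQRT3) / NORM,
    (1 - SQRT3) / NORM,
]


def quantize_coefficients(coeffs, n):
    """Formula (2): scale by 2**n and round up."""
    return [math.ceil(c * 2 ** n) for c in coeffs]


def bit_width(n):
    """Bit-width r of the quantized coefficients, r = n + 1."""
    return n + 1


def normalize_voxel(value, n, convolutions=6):
    """Formula (3): scale by 2**(-6n) and round down.

    For Python integers an arithmetic right shift is a floor division
    by a power of two, including for negative values.
    """
    return value >> (convolutions * n)
```

With $n = 10$ the four taps above become the integers 495, 857, 230 and -132, and the bit-width is $r = 11$.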

Image DWT with unquantized coefficients yields only integers. The quantization error of wavelet filter coefficients rounded up is always an excess. Rounding down the DWT result minimizes this error and cannot cause an error by itself. Both rounding operations are performed by discarding the fractional part of the number, with one added in the case of rounding up. The two rounding errors have opposite signs and reduce each other's influence on the calculation result. We used the peak signal-to-noise ratio ($PSNR$) between two images (original image $I$ and processed image $\tilde{I}$) for image processing quality control. This criterion is defined by

$$
PSNR = 10 \log_{10} \frac{M^{2}}{MSE} = 10 \log_{10} \frac{\left(2^{B} - 1\right)^{2}}{MSE}, \tag{4}
$$

where $B$ is the image BPC, $M$ is the maximum brightness of the image voxels and $MSE$ is the mean square error of brightness. $PSNR = \infty$ for identical images. The image processing quality is considered high if $PSNR \ge Q$, where $Q$ is the level at which the difference between images is imperceptible to the human eye. $Q = 40$ dB for images with 8 BPC. We propose to generalize $Q$ to images with 12 and 16 BPC using the formula $Q = 5B$.
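Criterion (4) and the proposed threshold $Q = 5B$ can be sketched in Python as below. The function names are hypothetical, and the images are passed as flat sequences of voxel values for brevity.

```python
import math


def psnr(original, processed, bits):
    """PSNR in dB by formula (4) for two equally sized voxel sequences."""
    if len(original) != len(processed) or len(original) == 0:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    peak = 2 ** bits - 1  # maximum brightness M
    return 10 * math.log10(peak ** 2 / mse)


def quality_threshold(bits):
    """Proposed generalization Q = 5B, in dB."""
    return 5 * bits


def is_high_quality(original, processed, bits):
    """True if PSNR >= Q for the given bit depth."""
    return psnr(original, processed, bits) >= quality_threshold(bits)
```

The formula gives thresholds of 40, 60 and 80 dB for 8, 12 and 16 BPC respectively.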

III. EXPERIMENTAL MODELING OF THE 3D TOMOGRAPHIC IMAGES DWT

An example of the DWT of the 3D grayscale 12-bit tomographic image “CT_teeth” with the Daubechies wavelet db2 is shown in Fig. 2. The figure shows that the images processed by the various methods are visually indistinguishable from each other. Thus, the loss of quality when processing with any of the selected methods is acceptable.

Hardware implementation is performed in Xilinx Vivado 2018.3 for the target device Kintex UltraScale xcku115-flvf-1924-3-e. The run strategy is Flow_PerfOptimized_high. Hardware simulation was carried out for the DWT of 3D grayscale tomographic images with 8 BPC using the Daubechies wavelet db2 with 4 coefficients. We used the scaling factor $n = 10$ with bit-width $r = 11$ ($PSNR$ is greater than 40 dB in this case) and the scaling factor $n = 12$ with bit-width $r = 13$ ($PSNR = \infty$ in this case). Results of hardware simulation for the various coefficient bit-widths are shown in Table I and Fig. 3.

Hardware implementation results show that the proposed method of calculating the coefficient bit-width reduces the delay, hardware resources and power consumption of the device. The 11-bit coefficient representation reduces hardware resources by 15% to 70%, delay by 12% to 28% and power consumption by 4% to 59% compared to known methods. The 13-bit coefficient representation requires 3% more hardware resources, up to 1% less delay and up to 1% more power than the 12-bit representation proposed in [18]. Moreover, compared with the 16-bit [19] and 32-bit representations, the 13-bit representation requires up to 14% and 63% fewer hardware resources, up to 9% and 21% less delay and up to 12% and 57% less power respectively.

Fig. 2. Example of the DWT of the 3D grayscale 12-bit tomographic image “CT_teeth” (190th frame) by the db2 wavelet: a) result of applying the proposed method; b) result of applying the coefficient bit-width from [18]; c) result of applying the coefficient bit-width from [19]

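TABLE I. HARDWARE IMPLEMENTATION RESULTS

| Coefficient bit-width, bits | LUTs | Delay, ns | Power, W |
|---|---|---|---|
| 11 | 1263 | 12.123 | 142.047 |
| 12 | 1478 | 13.809 | 147.670 |
| 13 | 1517 | 13.661 | 149.149 |
| 16 | 1760 | 15.008 | 169.413 |
| 32 | 4147 | 17.202 | 349.429 |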
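The relative savings quoted in the text can be rechecked from Table I with a short calculation. The sketch below does this for the LUT column only. The function names are hypothetical, and the reduction is assumed to be measured relative to the wider (baseline) representation.

```python
# LUT counts from Table I, keyed by coefficient bit-width.
LUTS = {11: 1263, 12: 1478, 13: 1517, 16: 1760, 32: 4147}


def percent_reduction(proposed, baseline):
    """Reduction of `proposed` relative to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def lut_savings(bit_width):
    """Rounded LUT savings of `bit_width` against every wider representation."""
    own = LUTS[bit_width]
    return {
        other: round(percent_reduction(own, count))
        for other, count in LUTS.items()
        if other > bit_width
    }
```

For example, `lut_savings(11)` gives 15% against 12 bits and 70% against 32 bits, and `lut_savings(13)` gives 14% and 63% against 16 and 32 bits, in line with the hardware resource figures stated above.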

IV. CONCLUSIONS

A method for quantizing wavelet filter coefficients is proposed that minimizes the resources required in the hardware implementation of 3D medical image processing. All data are represented in fixed-point format. Compared to known approaches, the proposed bit-width reduction uses 15% to 70% fewer hardware resources, has 12% to 28% lower device delay and consumes 4% to 59% less power.

ACKNOWLEDGEMENT

This work was supported by the Government of the Russian Federation (state order no. 2.6035.2017/BCh), the Russian Foundation for Basic Research (projects no. 18-07-00109 A, no. 19-07-00130 A and no. 18-37-20059 mol-a-ved), and by the Presidential Grant of the Russian Federation (project no. SP-2245.2018.5 and SP-126.2019.5).

REFERENCES

[1] E. Barnhill et al., “Nonlinear multiscale regularisation in MR elastography: Towards fine feature mapping,” Med. Image Anal., vol. 35, pp. 133–145, Jan. 2017.

[2] L. F. R. Lucas, N. M. M. Rodrigues, L. A. Da Silva Cruz, and S. M. M. De Faria, “Lossless Compression of Medical Images Using 3-D Predictors,” IEEE Trans. Med. Imaging, vol. 36, no. 11, pp. 2250–2260, Nov. 2017.

[3] F. Schirrmacher et al., “Temporal and volumetric denoising via quantile sparse image prior,” Med. Image Anal., vol. 48, pp. 131–146, Aug. 2018.

[4] Z. Xu et al., “Joint solution for PET image segmentation, denoising, and partial volume correction,” Med. Image Anal., vol. 46, pp. 229–243, May 2018.

[5] L. Fang, S. Li, X. Kang, J. A. Izatt, and S. Farsiu, “3-D adaptive sparsity based image compression with applications to optical coherence tomography,” IEEE Trans. Med. Imaging, vol. 34, no. 6, pp. 1306–1320, Jun. 2015.

[6] C. Zhang, T. Liang, P. K. T. Mok, and W. Yu, “FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method,” IEEE Trans. Nanobioscience, vol. 16, no. 5, pp. 314–325, Jul. 2017.

[7] “Diagnostic Imaging FPGA Applications - Intel® FPGA.” [Online]. Available: https://www.intel.com/content/www/us/en/healthcare-it/products/programmable/applications/diagnostic-imaging.html. [Accessed: 05-Jun-2019].

[8] “Medical Imaging with CT, MRI and PET.” [Online]. Available: https://www.xilinx.com/applications/medical/medical-imaging-ct-mri-pet.html#overview. [Accessed: 05-Jun-2019].

[9] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, “High Bit-Depth Medical Image Compression with HEVC,” IEEE J. Biomed. Heal. Informatics, vol. 22, no. 2, pp. 552–560, Mar. 2018.

[10] T. Tashan and M. Al-Azawi, “Multilevel magnetic resonance imaging compression using compressive sensing,” IET Image Process., vol. 12, no. 12, pp. 2186–2191, Dec. 2018.

[11] X. Song, Q. Huang, S. Chang, J. He, and H. Wang, “Three-dimensional separate descendant-based SPIHT algorithm for fast compression of high-resolution medical image sequences,” IET Image Process., vol. 11, no. 1, pp. 80–87, Jan. 2016.

[12] C. Naveen, T. V. S. Gupta, V. R. Satpute, and A. S. Gandhi, “A simple and efficient approach for medical image security using chaos on EZW,” in ICAPR 2015 - 2015 8th International Conference on Advances in Pattern Recognition, 2015, pp. 1–6.

[13] D. G. Bailey, Design for Embedded Image Processing on FPGAs. Singapore: John Wiley & Sons (Asia) Pte Ltd, 2011.

[14] A. Madanayake et al., “Low-power VLSI architectures for DCT/DWT: Precision vs approximation for HD video, biomedical, and smart antenna applications,” IEEE Circuits Syst. Mag., vol. 15, no. 1, pp. 25–47, 2015.

[15] V. Vijendra and M. Kulkarni, “ECG signal filtering using DWT haar wavelets coefficient techniques,” in 2016 International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS), 2016, pp. 1–6.

[16] M. Elsayed, A. Badawy, M. Mahmuddin, T. Elfouly, A. Mohamed, and K. Abualsaud, “FPGA implementation of DWT EEG data compression for wireless body sensor networks,” in 2016 IEEE Conference on Wireless Sensors (ICWiSE), 2016, pp. 21–25.

[17] A. Sharmila and P. Geethanjali, “DWT Based Detection of Epileptic Seizure From EEG Signals Using Naive Bayes and k-NN Classifiers,” IEEE Access, vol. 4, pp. 7716–7727, 2016.

[18] H. Y. Alzaq and B. B. Ustundag, “An optimized two-level discrete wavelet implementation using residue number system,” EURASIP J. Adv. Signal Process., vol. 2018, no. 1, Dec. 2018.

[19] M. Chehaitly, M. Tabaa, F. Monteiro, and A. Dandache, “A ultra high speed and configurable Inverse Discrete Wavelet Packet Transform architecture,” in Proceedings of the International Conference on Microelectronics, ICM, 2018, vol. 2017-December, pp. 1–4.

Fig. 3. Hardware implementation results: a) number of LUTs; b) delay of the device; c) power consumption of the device

