3D image quality estimation (ANN) based on depth/disparity and 2D metrics

Dragan Kukolj*, Dragana Đorđević**, David Okolišan**, Ivana Ostojić**, Dragana Sandić-Stanković***, Chaminda Hewage****

* University of Novi Sad, Faculty of Technical Sciences, Serbia
** RT-RK Institute for Computer Based Systems, Narodnog fronta 23a, Novi Sad, Serbia
*** IRITEL Institute for Telecommunications and Electronics, Batajnički put 23, Beograd, Serbia
**** Kingston University, London, United Kingdom

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract— Immersive image/video services will soon be available to the mass market due to the technological advancement of 3D video technologies, which include 3D-Ready TV monitors at affordable prices. However, in order to provide demanding customers with a better service over resource-limited (e.g., bandwidth) and unreliable communication channels, system parameters need to be changed "on the fly". The measured 3D video quality can be used as feedback information to fine-tune the system parameters. The main aim of this paper is to analyze and present the impact of objective image quality assessment metrics on the perception of 3D image/video. A neural network statistical estimator was used to examine the correlation between objective measures computed on the input image database and the Differential Mean Opinion Score (DMOS) of that database. For this purpose, part of the LIVE 3D Image Quality Database [5] was used. The results suggest that neural network DMOS estimators based on full-reference and no-reference objective metrics show very similar behavior and accuracy.

Keywords—Objective Metric, Objective Image Quality Assessment, Image Quality Assessment, Mean Opinion Score

I. INTRODUCTION

The distance between the eyes (50-75 mm) gives each of us two slightly different views of the surrounding world. The distance between corresponding points of the image projections in the left and right eye is defined as disparity [22]. Using this parameter, the primary visual cortex creates the impression of depth through the stereopsis mechanism. 3D image/video applications provide two slightly different views, one to each eye, to simulate natural stereoscopic viewing as perceived by our HVS.

Various applications and systems for 3D image/video creation, processing and display occupy a significant place in the scientific and multimedia industry communities. Since 3D image/video is becoming more and more popular in the multimedia industry, subjective and objective quality assessment procedures need to be defined to cope with the complex nature of 3D image/video quality. Subjective quality assessment may be the better way to evaluate image and video quality, but it carries many problems: it is time consuming, requires special laboratory setups and human observers, is costly, and complicates results analysis. Objective quality assessment, on the other hand, comes with several advantages, such as ease of calculation, real-time processing and reasonable accuracy.

Many standards [1][2] and papers [3][4] address this topic and its associated problems. For 2D image/video quality assessment the most frequently used standard is ITU-R BT.500 [1], while for 3D image quality assessment ITU-R BT.1438 [21] is relevant, because it gives an overview of the procedure for the subjective assessment of stereoscopic television images. In this paper we investigate the correlation of 2D image quality metrics and a few 3D image quality metrics with the subjective quality scores of the perceived images (DMOS). Regarding 2D objective quality metrics, we observed activity, zero-crossing rate, gradient-based activity, edge activity, contrast, correlation, energy and local homogeneity [7] on a 3D image database [5]. We also investigated the correlation of the disparity map and of the 3D color image histogram with the subjective scores of the image database.

Quality assessment of 3D images is an issue addressed in many papers. A brief overview of subjective and objective quality assessment is presented in [4]. It can be noticed that, although some 2D objective metrics can be useful in the quality assessment of 3D image/video, there are additional features which should be taken into consideration [33][34]. The utilization of depth information for 3D image quality assessment was described in [3]. In [22], quality assessment based on 2D quality metrics and disparity is described; it was shown there that disparity is more important than the reference images in 3D image quality assessment. On the other hand, some researchers showed that depth does not have a significant influence on perceived image quality, while blur affects image quality significantly [6]. A group of researchers analyzed the quality assessment of synthesized 3D views in [23].

After the 2D and 3D quality measures were calculated, the last step was the utilization of machine learning (ML) to imitate the subjective image quality assessment of the HVS. While some researchers use polynomial models [9][10] for mapping different objective metrics onto image quality perception, others use a dedicated learning process with an artificial neural network (ANN) [11-15]. Different types of ANN relying on objective quality metrics have been proposed for image and video quality assessment. For instance, in [16] a circular back-propagation (CBP) neural network was proposed, while in [11] a Radial Basis Function Network (RBFN) was applied and compared with the previously used Multilayer Perceptron (MLP). Kukolj in [13] proposed a new approach whose aim is to obtain the most relevant input features from a wide set of inputs and use those


inputs to design a Modular Neural Network (MNN). The MLP network configuration is frequently used, with a full-reference approach in some papers [12] or a no-reference approach in [14][15].

In this paper, we use a reduced-reference approach that combines the most relevant 2D and 3D objective quality measures in order to estimate the Mean Opinion Score (MOS) of a distorted image.

This paper is organized as follows. Section II provides an overview of the most relevant 2D and 3D objective metrics, while Section III presents the preparation of the input measures, their correlation and cross-correlation, and the configuration of the ANN. Section IV gives an overview of the obtained experimental results together with a discussion. Conclusions are drawn in Section V.

II. OBJECTIVE QUALITY METRICS

Objective metrics for image quality assessment can be divided into three groups: no-reference (NR), reduced-reference (RR) and full-reference (FR) metrics. No-reference objective metrics take into consideration only the images/video whose quality is being estimated. Reduced-reference objective metrics, besides the images/video whose quality is estimated, take into account some features extracted from the reference image/video. Full-reference objective metrics demand the use of the reference image/video in the quality assessment procedure. In this paper several full-reference and no-reference metrics are considered. The information content weighted structural similarity measure (IW-SSIM), the information content weighted peak signal-to-noise ratio (IW-PSNR), the single-scale structural similarity (SSIM), the peak signal-to-noise ratio (PSNR), the intensity histogram and disparity are the FR metrics considered in this paper.

The first measure observed was disparity. The disparity between the left and right image represents the divergence between those images; more precisely, it is the vector between two corresponding points in the left and right image. Disparity was already employed for image quality assessment in [17]. Here, disparity was calculated for the reference images and the corresponding distorted images. In order to obtain the disparity, a block matching algorithm was performed using the Stereo Vision toolbox from Matlab [18][19].

The first step was to convert the color images to grayscale, since this is much more efficient than using the three-channel image. Block matching was performed with a 7x7 block size. Dynamic programming and image pyramiding [18] were further used for calculating the disparity. After disparity maps were obtained for every distorted 3D image and the corresponding reference 3D image, the corresponding pairs were correlated. Finally, the correlation between the disparity-correlation vector and the DMOS values delivered with the image database was analyzed. The main goal was to investigate the impact of disparity on perceived 3D image quality.
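
The paper computed disparity with Matlab's Stereo Vision toolbox (block matching with dynamic programming and pyramiding); as an illustration only, the sketch below reproduces the same overall pipeline with OpenCV's basic block matcher, so it approximates rather than reproduces the described method. The file paths and the disparity search range are assumptions.

    # Hedged sketch: OpenCV block matching stands in for the MATLAB toolbox used in the paper.
    import cv2
    import numpy as np

    def disparity_map(left_path, right_path, block_size=7, num_disp=64):
        """Block-matching disparity on grayscale views (7x7 blocks, as in the text)."""
        left = cv2.imread(left_path, cv2.IMREAD_GRAYSCALE)
        right = cv2.imread(right_path, cv2.IMREAD_GRAYSCALE)
        bm = cv2.StereoBM_create(numDisparities=num_disp, blockSize=block_size)
        return bm.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> float

    def disparity_correlation(ref_pair, dist_pair):
        """Pearson correlation between the reference and distorted disparity maps."""
        d_ref = disparity_map(*ref_pair).ravel()
        d_dist = disparity_map(*dist_pair).ravel()
        return float(np.corrcoef(d_ref, d_dist)[0, 1])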

Intensity histograms were calculated for every degraded left and right image and then correlated, yielding the corresponding feature values. The following paragraphs briefly describe the FR 2D metrics used.
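
A minimal sketch of the intensity-histogram feature as described above: the histograms of the degraded left and right views are computed and their correlation is taken as the feature value. The bin count and normalization are assumptions not fixed by the text.

    import numpy as np

    def histogram_correlation(left_gray, right_gray, bins=256):
        """Correlate the intensity histograms of the two views of a stereo pair."""
        h_l, _ = np.histogram(left_gray, bins=bins, range=(0, 255), density=True)
        h_r, _ = np.histogram(right_gray, bins=bins, range=(0, 255), density=True)
        return float(np.corrcoef(h_l, h_r)[0, 1])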

The simplest and most widely used full-reference quality metric, PSNR, is calculated from the mean squared error (MSE). According to [24], it is not well correlated with perceived visual quality.
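
For reference, the standard definitions (assumed here, not spelled out in the paper) for images $x$ and $y$ with $N$ pixels and peak signal value $L$ (255 for 8-bit images) are

    \mathrm{MSE}(x,y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2, \qquad
    \mathrm{PSNR}(x,y) = 10\log_{10}\frac{L^2}{\mathrm{MSE}(x,y)}.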

SSIM correlates better with human perception of image quality. It is based on the assumption that the HVS is highly adapted to extract structural information from the viewing field. The structural similarity index is calculated from three components: luminance, contrast and structure comparisons between the distorted and reference images.
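
In the commonly used single-scale form from [25], with local means $\mu_x, \mu_y$, (co)variances $\sigma_x^2, \sigma_y^2, \sigma_{xy}$ and small stabilizing constants $C_1, C_2$, the three components combine to

    \mathrm{SSIM}(x,y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
                              {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}.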

The multi-scale approach to calculating the structural similarity index matches the human visual response even better than the single-scale approach. In the multi-scale approach, Laplacian pyramids with five levels are calculated for the distorted and reference images. At each scale the contrast and structure comparisons are calculated, while the luminance comparison is calculated only for the top of the pyramid. The multi-scale SSIM is obtained by combining the measurements at the different scales.
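
In the usual multi-scale formulation (assumed here), with $M = 5$ scales, per-scale contrast $c_j$ and structure $s_j$ terms, the luminance term $l_M$ evaluated only at the coarsest scale, and exponents $\alpha_M, \beta_j, \gamma_j$, the combination is

    \mathrm{MS\text{-}SSIM}(x,y) = [l_M(x,y)]^{\alpha_M}\prod_{j=1}^{M}[c_j(x,y)]^{\beta_j}[s_j(x,y)]^{\gamma_j}.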

Introducing information-content weighted pooling instead of simple spatial averaging improves the quality scores [24]. Information content weight maps are calculated for each scale of the Laplacian pyramid; more weight is given to the regions with larger distortions. By combining information content weighting with the multi-scale structural similarity measures, the best performing measure, the information content weighted SSIM (IW-SSIM), is obtained.

Figure 1. Reference LIVE 3D Image Quality Database


By combining the information content weighting maps with the squared error maps, a competitive perceptual quality measure, the information content weighted PSNR (IW-PSNR), is obtained.

These four metrics were first computed for every corresponding reference and distorted image, and the resulting value vectors were then correlated. After obtaining the correlation values for the left and the right image, we took the mean of these two values for every 3D image.

As mentioned earlier, no-reference (NR) 2D objective measures were also examined. These 2D measures are: edge activity, zero-crossing rate, energy homogeneity, local homogeneity, gradient-based activity, image activity, blocking effect, contrast and ringing. First, from a large set of different 2D objective measures [32], a small set with a stronger correlation to DMOS was selected: the measures whose correlation to DMOS is higher than 0.7 in absolute value were chosen.
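
A minimal sketch of this pre-selection step, assuming each candidate measure is available as a per-image vector aligned with the DMOS vector; the dictionary-of-features interface and the sorting are illustrative, only the 0.7 threshold comes from the text.

    import numpy as np

    def select_features(features: dict, dmos: np.ndarray, threshold: float = 0.7) -> dict:
        """Keep measures whose absolute Pearson correlation with DMOS exceeds the threshold."""
        selected = {}
        for name, values in features.items():
            r = np.corrcoef(values, dmos)[0, 1]
            if abs(r) > threshold:
                selected[name] = r
        # order from strongest to weakest correlation, as done for Table I
        return dict(sorted(selected.items(), key=lambda kv: -abs(kv[1])))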

The edge activity [29] is calculated using the Sobel operator. Edges indicate significant local changes of intensity in an image and occur on the boundary between two different regions. Important features such as corners, lines and curves can be extracted from the edge activity measure [30].
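
A sketch of an edge-activity measure, assuming the per-image score is the mean Sobel gradient magnitude; the exact pooling used in [29] is not specified in the text.

    import numpy as np
    from scipy import ndimage

    def edge_activity(gray: np.ndarray) -> float:
        """Mean Sobel gradient magnitude as a simple edge-activity score."""
        gx = ndimage.sobel(gray.astype(float), axis=1)  # horizontal gradient
        gy = ndimage.sobel(gray.astype(float), axis=0)  # vertical gradient
        return float(np.mean(np.hypot(gx, gy)))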

The zero-crossing rate [27] is the rate of sign changes along a signal, i.e., the rate at which the signal changes from positive to negative or back, computed in both the vertical and the horizontal direction [9].
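
A sketch of the zero-crossing rate applied to the horizontal and vertical first-difference signals of the image, following the verbal description above; averaging the two directions is an assumption.

    import numpy as np

    def zero_crossing_rate(gray: np.ndarray) -> float:
        """Fraction of sign changes in the horizontal and vertical difference signals."""
        g = gray.astype(float)
        dh = np.diff(g, axis=1)                      # horizontal difference signal
        dv = np.diff(g, axis=0)                      # vertical difference signal
        zc_h = np.mean(dh[:, :-1] * dh[:, 1:] < 0)   # sign changes along rows
        zc_v = np.mean(dv[:-1, :] * dv[1:, :] < 0)   # sign changes along columns
        return float((zc_h + zc_v) / 2.0)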

The co-occurrence matrix (COM) [31] addresses the classification and description of textured images. COM features such as energy, homogeneity, contrast and others shown in [31] represent texture features of the image and give information about its spatial arrangement of intensities. Energy refers to the global homogeneity of a texture: a texture with high energy has a large number of homogeneous areas, whereas a texture with low energy has few.
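
A sketch of the co-occurrence-matrix features using scikit-image; the distance and angle of the co-occurrence matrix are assumptions (the paper does not state them), and graycomatrix is spelled greycomatrix in older scikit-image releases.

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def com_features(gray_uint8: np.ndarray) -> dict:
        """Energy, homogeneity, contrast and correlation from a single GLCM."""
        glcm = graycomatrix(gray_uint8, distances=[1], angles=[0],
                            levels=256, symmetric=True, normed=True)
        return {
            "energy": float(graycoprops(glcm, "energy")[0, 0]),
            "homogeneity": float(graycoprops(glcm, "homogeneity")[0, 0]),
            "contrast": float(graycoprops(glcm, "contrast")[0, 0]),
            "correlation": float(graycoprops(glcm, "correlation")[0, 0]),
        }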

The gradient-based activity measure [29] relies on the observation that the active parts of the image are those with strong edges and texture. The gradient value is calculated as the absolute difference between the current pixel and its neighbors, giving essential information about the image (a joint sketch of this and the next measure follows below).

The image activity [9] gives more insight into the relative blur in the image. The activity is calculated as the average absolute difference between in-block image samples in both the vertical and the horizontal direction.
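
A joint sketch of the gradient-based activity and the image activity measures. Both are expressed here as mean absolute neighbor differences; the 8x8 in-block grouping and the pooling are assumptions that fill in details the text leaves open.

    import numpy as np

    def gradient_activity(gray: np.ndarray) -> float:
        """Mean absolute difference between each pixel and its right/lower neighbor."""
        g = gray.astype(float)
        dh = np.abs(np.diff(g, axis=1))
        dv = np.abs(np.diff(g, axis=0))
        return float(dh.mean() + dv.mean())

    def image_activity(gray: np.ndarray, block: int = 8) -> float:
        """Average absolute difference between in-block samples, horizontally and vertically."""
        g = gray.astype(float)
        h = (g.shape[0] // block) * block
        w = (g.shape[1] // block) * block
        g = g[:h, :w]
        dh = np.abs(np.diff(g.reshape(h, w // block, block), axis=2)).mean()
        dv = np.abs(np.diff(g.reshape(h // block, block, w), axis=1)).mean()
        return float((dh + dv) / 2.0)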

Coding techniques based on the discrete cosine transform (DCT), which is used in JPEG and MPEG coding, introduce the blocking effect [9]. The blocking effect occurs due to discontinuities at block boundaries, which arise because quantization in JPEG is block-based and the blocks are quantized independently.

The ringing effect (Gibbs phenomenon) appears in images as oscillations near sharp edges. It is a result of cutting off high-frequency information and can appear as a consequence of image compression and image up-sampling. The approach described in [28] for measuring the quantity of this artifact consists of finding the areas of the image in which the artifact may appear, based on the standard deviation calculated for each image pixel.

III. QOE ESTIMATION USING NONLINEAR MODELS WITH NO-REFERENCE AND FULL-REFERENCE QUALITY METRICS

In this paper, we propose no-reference (NR) and full-reference (FR) driven 3D image quality prediction schemes. Both schemes are nonlinear models that use different types of input objective quality metrics. The preparation of the models for 3D image quality estimation is based on the available subjective and objective metrics, an analysis of the impact of the selected input metrics on the estimation performance, and a comparison of the performance of the two models. The NR model has various no-reference quality measures as inputs, while the FR model has full-reference metrics as inputs, both described in Section II. Both models are nonlinear quality models based on ANN predictors created to optimally predict the perceptual quality of 3D images, expressed on the Differential Mean Opinion Score (DMOS) scale. In order to identify the dependence between the number of selected input parameters and the performance of the models, the number of input parameters was varied for each model. The variation was performed on the basis of the Pearson correlation between the parameters and DMOS, gradually reducing from the totals of 9 and 6 parameters for the NR and FR models, respectively; the input parameters with a lower correlation to DMOS were removed first. We also varied the number of distortions in order to observe how the different distortions affect the models.

TABLE I. CORRELATION OF INPUT FEATURES TO DMOS

Pearson correlation with DMOS per distortion set (FF = fast fading):

NR model metrics            FF        jp2k+jpeg   Blur+jp2k+jpeg   FF+Blur+jp2k+jpeg
Edge activity              -0.797      -0.784        -0.830            -0.880
Zero-crossing rate         -0.826      -0.826        -0.846            -0.880
Energy homogeneity          0.773       0.739         0.795             0.863
Local homogeneity           0.769       0.718         0.781             0.848
Gradient based activity    -0.720      -0.591        -0.661            -0.761
Activity                   -0.719      -0.584        -0.654            -0.753
Blocking effect            -0.725      -0.531        -0.602            -0.725
Contrast                   -0.649      -0.519        -0.617            -0.722
Ringing                    -0.610      -0.547        -0.602            -0.697

FR model metrics            FF        jp2k+jpeg   Blur+jp2k+jpeg   FF+Blur+jp2k+jpeg
IWSSIM                     -0.760      -0.831        -0.856            -0.869
IWPSNR                     -0.747      -0.785        -0.748            -0.826
SSIM                       -0.560      -0.737        -0.763            -0.792
Disparity correlation      -0.588      -0.748        -0.682            -0.745
PSNR                       -0.699      -0.597        -0.552            -0.669
Histogram                  -0.374      -0.193        -0.123            -0.384

Table 1 shows the correlations between the input features and the actual DMOS for image data sets with different types of distortions. The NR and FR model metrics are ordered in Table 1 from the largest to the smallest Pearson correlation coefficient, and this order is kept in all further analyses. The paper also presents a comparative analysis of the two models, with special emphasis on the results obtained when processing both models.

In addition to the correlation of the input parameters with DMOS, we computed the cross-correlation of all input features in order to detect their common impact on each of the two models. The cross-correlation matrices are shown in Table 2 for the no-reference model and in Table 3 for the full-reference model.

TABLE II. CROSS-CORRELATION OF NO-REFERENCE FEATURES: 1) Edge activity, 2) Zero-crossing rate, 3) Energy homogeneity, 4) Local homogeneity, 5) Gradient based activity, 6) Activity, 7) Blocking effect, 8) Contrast, 9) Ringing

        2        3        4        5        6        7        8        9
1     0.952   -0.986   -0.982    0.924    0.914    0.882    0.835    0.819
2       *     -0.919   -0.938    0.911    0.910    0.828    0.801    0.851
3       *        *      0.983   -0.908   -0.899   -0.862   -0.826   -0.789
4       *        *        *     -0.946   -0.939   -0.893   -0.883   -0.843
5       *        *        *        *      0.997    0.922    0.887    0.948
6       *        *        *        *        *      0.887    0.884    0.943
7       *        *        *        *        *        *      0.815    0.882
8       *        *        *        *        *        *        *      0.838

TABLE III. CROSS-CORRELATION OF FULL-REFERENCE FEATURES: 1) IWSSIM, 2) IWPSNR, 3) SSIM, 4) Disparity correlation, 5) PSNR, 6) Histogram

        2        3        4        5        6
1     0.799    0.843    0.791    0.665    0.354
2       *      0.851    0.700    0.907    0.243
3       *        *      0.593    0.806    0.140
4       *        *        *      0.524    0.579
5       *        *        *        *      0.073

After the analysis of the input features based on correlation and cross-correlation, an MLP with a single hidden layer was configured. The MLP's hidden layer consists of 2n+1 neurons, where n is the number of input features. The tangent-hyperbolic function was chosen as the activation function of the hidden nodes, while a linear function was chosen for the output node. The MLP was trained with the Levenberg-Marquardt back-propagation optimization algorithm. 10-fold cross-validation was used in order to ensure the reliability of the results during the experiment. The training and test sets comprised 90% and 10% of the total number of images, respectively.
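
A sketch of the described predictor using scikit-learn. Note that scikit-learn's MLPRegressor does not offer Levenberg-Marquardt training, so the 'lbfgs' solver is substituted here; the 2n+1 hidden neurons, tanh hidden activation, linear output and 10-fold cross-validation follow the text, while max_iter and the random seed are assumptions.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import KFold

    def evaluate_dmos_predictor(X: np.ndarray, dmos: np.ndarray, seed: int = 0):
        """10-fold cross-validated RMSE of a 2n+1 hidden-neuron MLP DMOS predictor."""
        n = X.shape[1]                      # number of input features
        rmses = []
        for train, test in KFold(n_splits=10, shuffle=True, random_state=seed).split(X):
            model = MLPRegressor(hidden_layer_sizes=(2 * n + 1,), activation="tanh",
                                 solver="lbfgs", max_iter=2000, random_state=seed)
            model.fit(X[train], dmos[train])
            err = model.predict(X[test]) - dmos[test]
            rmses.append(float(np.sqrt(np.mean(err ** 2))))
        return float(np.mean(rmses)), float(np.std(rmses))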

IV. RESULTS

After defining which objective quality metrics would be computed, an appropriate image database had to be chosen. The objective metrics described earlier were applied to part of the LIVE 3D image quality database [5][20], which consists of 20 reference images, as shown in Figure 1.

The following distortion types are taken into consideration: blur, JPEG, JP2K and fast fading. The blur image dataset consists of 28 3D images, i.e., two different blur degradations for each of 14 reference images; only the images that have at least two distortion levels were taken into consideration for the blur dataset. The remaining distortion types include 4 different degradations for each corresponding reference image, i.e., 80 distorted 3D images per distortion type (240 3D images in total). The lowest and highest distortion level for every degradation type is shown in Figure 2.

Figure 3 shows the results of the proposed models. Figure 3a) presents the NR model for different numbers of incorporated input features in terms of the root mean square error (RMSE) and the standard deviation of the RMSE. Figure 3b) shows the results of the FR model in the same way as Figure 3a). It can be noticed on both graphs that models with all input features give a much smaller error than models with fewer input features. For example, the NR model with 6 input features gives RMSE = 0.5436, while with 1 input feature the RMSE is 4.8501, a much higher error. It is also clear from both graphs that, for the same number of input features, the FR model is better than the NR model, which is an expected result.

The Pearson correlation coefficient was used to test the correlation between the estimated DMOS and the actual DMOS.


Figure 2. Distortion types: (a) blurred image, low-level distortion; (b) blurred image, high-level distortion; (c) jpeg image, low-level distortion; (d) jpeg image, high-level distortion; (e) jp2k image, low-level distortion; (f) jp2k image, high-level distortion; (g) fast fading image, low-level distortion; (h) fast fading image, high-level distortion.


Based on the data in Table IV, it can be noticed that this statistical measure shows a high level of correlation for both models and for all numbers of input features.

As a final result, the output of the models is presented in terms of estimated DMOS. A comparison of DMOS and estimated DMOS is given in Figure 4 for both models and different numbers of input features.

TABLE IV. PEARSON CORRELATION OF ESTIMATED DMOS AND DMOS FOR THE NR AND FR MODELS

Number of input features    NR model    FR model
9                           0.9996         *
8                           0.9994         *
7                           0.9994         *
6                           0.9990      0.9961
5                           0.9982      0.9994
4                           0.9957      0.9980
3                           0.9913      0.9955
2                           0.9754      0.9798
1                           0.9444      0.9669

(* the FR model uses at most 6 input features)

Figure 3. (a) RMSE of the NR model for different numbers of input features; (b) RMSE of the FR model for different numbers of input features (average and max/min RMSE are shown).

Figure 4a) compares DMOS and estimated DMOS for the NR model with 1 input feature and for the same model with 9 input features. By comparing those two series of points it is clearly visible that reducing the number of inputs drastically worsens the prediction of DMOS. Figure 4b) shows the same comparison for the FR model, for all input features and for only one input feature. The conclusion is exactly the same for the FR model.

Figure 4. DMOS vs. estimated DMOS: (a) NR model with different numbers of input features; (b) FR model with different numbers of input features.

V. CONCLUSIONS

The main goal of this paper is the analysis of the impact of no-reference and full-reference objective metrics on 3D image quality and of the performance of the corresponding nonlinear mapping models for quality prediction. The impact of nine 2D no-reference and six full-reference objective metrics on the overall perceived quality of 3D stereoscopic images distorted by coding and transmission effects, i.e., JPEG, JPEG2K, blurring and fast fading, is investigated. On the basis of their observed correlation level to the subjective quality, neural network based DMOS predictors are created. The investigated DMOS predictors are divided into two groups, full-reference and no-reference based, and their performances are analyzed. Surprisingly, the no-reference models showed only slightly lower accuracy and robustness in comparison to the corresponding full-reference based models.




ACKNOWLEDGMENT

This work was partially supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under Grant TR-32034 and by the Secretary of Science and Technology Development of the Province of Vojvodina under Grant 114-451-2636/2012-01.

REFERENCES

[1] ITU-R BT.500-11, "Methodology for the subjective assessment of the quality of television pictures", 2002.
[2] ITU-T P.910, "Subjective video quality assessment methods for multimedia applications", 1999.
[3] A. Benoit, P. Le Callet, P. Campisi, R. Cousseau, "Quality Assessment of Stereoscopic Images", EURASIP Journal on Image and Video Processing, 2008.
[4] L. Goldmann, T. Ebrahimi, "Towards reliable and reproducible 3D video quality assessment", Three-Dimensional Imaging, Visualization, and Display, 2011.
[5] LIVE 3D Image Quality Database, http://live.ece.utexas.edu/research/quality/live_3dimage_phase1.html.
[6] R. G. Kaptein, A. Kuijsters, M. T. M. Lambooij, W. A. IJsselsteijn, I. Heynderickx, "Performance evaluation of 3D-TV systems", Image Quality and System Performance V, 2010.
[7] V. Zlokolica, D. Kukolj, M. Pokric, M. Temerinac, "Feature-Cluster Driven Multi-Model Video Quality Assessment Framework", 2011.
[8] K. Ambrosch, W. Kubinger, M. Humenberger, A. Steininger, "Flexible Hardware-Based Stereo Matching", EURASIP Journal on Embedded Systems, 2008.
[9] Z. Wang, H. R. Sheikh, A. C. Bovik, "No-Reference Perceptual Quality Assessment of JPEG Compressed Images", in Proc. of IEEE 2002 International Conference on Image Processing, 2002.
[10] C. Li, A. C. Bovik, "Content-weighted video quality assessment using a three-component image model", Journal of Electronic Imaging, vol. 19, no. 1, pp. 1-9, 2010.
[11] K. Fiegel, "A method for objective image quality metric using neural network with enhanced data preprocessing", in Radioelektronika 2006 - Conference Proc., Bratislava: Slovak University of Technology, 2006.
[12] A. Bouzerdoum, A. Havstad, A. Beghdadi, "Image quality assessment using a neural network approach", in Proc. of the Fourth IEEE International Symposium on Signal Processing and Information Technology, Dec. 2004, pp. 330-333.
[13] D. Kukolj, M. Pokrić, V. Zlokolica, J. Filipović, M. Temerinac, "No-Reference Video Quality Assessment Design Framework Based on Modular Neural Networks", Lecture Notes in Computer Science, vol. 6352, Part I, pp. 569-574, Springer-Verlag, 2010.
[14] M. Chambah, S. Ouni, M. Herbin, E. Zagrouba, "Towards an Automatic Subjective Image Quality Assessment System", in SPIE/IS&T Electronic Imaging, Proc. SPIE 7242, California, USA, 2009.
[15] R. V. Babu, S. Suresh, A. Perkis, "No-reference JPEG image quality assessment using GAP-RBF", Signal Processing, vol. 87, 2007.
[16] P. Gastaldo, R. Zunino, S. Rovetta, "Objective assessment of MPEG-2 video quality", Journal of Electronic Imaging, vol. 11, 2002.
[17] A. Benoit, P. Le Callet, P. Campisi, R. Cousseau, "Using disparity for quality assessment of stereoscopic images", hal-00324052, 2008.
[18] P. Thevenaz, U. E. Ruttimann, M. Unser, "A Pyramid Approach to Subpixel Registration Based on Intensity", IEEE Transactions on Image Processing, vol. 7, no. 1, 1998.
[19] A. Koschan, V. Rodehorst, K. Spiller, "Color Stereo Vision Using Hierarchical Block Matching and Active Color Illumination", Pattern Recognition, 1996.
[20] A. K. Moorthy, C.-C. Su, A. Mittal, A. C. Bovik, "Subjective evaluation of stereoscopic image quality", Signal Processing: Image Communication, to appear, 2012.
[21] ITU-R BT.1438, "Subjective assessment of stereoscopic television pictures", 2000.
[22] J. You, L. Xing, A. Perkis, X. Wang, "Perceptual quality assessment for stereoscopic images based on 2D image quality metrics and disparity analysis", in Proc. of the International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM '01), Scottsdale, Arizona, USA, 2010.
[23] E. Bosc, R. Pepion, P. Le Callet, M. Koppel, P. Ndjiki-Nya, M. Pressigout, L. Morin, "Towards a New Quality Metric for 3-D Synthesized View Assessment", IEEE Journal, 2011.
[24] Z. Wang, Q. Li, "Information Content Weighting for Perceptual Image Quality Assessment", IEEE Transactions on Image Processing, vol. 20, no. 5, May 2011.
[25] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity", IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.
[26] A. V. Nasonov, A. S. Krylov, "Scale-space method of image ringing estimation", ICIP, 2009.
[27] C. H. Chen, "Signal Processing Handbook", Dekker, New York, 1988.
[28] I. Kirenko, "Reduction of compression artifacts in displayed images", WO/2006/056940, 2006.
[29] T. Kusuma, M. Caldera, H. Zepernick, "Utilising objective perceptual image quality metrics for implicit link adaptation", pp. 2319-2322, 2004.
[30] S. Gupta, S. Ghosh Mazumdar, "Sobel Edge Detection Algorithm", 2013.
[31] N. Idrissi, J. Martinez, D. Aboutajdine, "Selecting a discriminant subset of co-occurrence matrix features for texture-based image retrieval", pp. 696-703, 2005.
[32] V. Zlokolica, D. Kukolj, N. Lukic, M. Temerinac, "Evaluation on the Selection of Video Quality Metrics for Overall Visual Perception", QoMEX 2010, 2nd Int. Workshop on Quality of Multimedia Experience, Trondheim, Norway, June 2010.
[33] C. Hewage, S. T. Worrall, S. Dogan, A. M. Kondoz, "Prediction of stereoscopic video quality using objective quality models of 2-D video", Electronics Letters, 44(16), pp. 963-965, 2008.
[34] C. Hewage, S. T. Worrall, S. Dogan, A. M. Kondoz, "Quality evaluation of color plus depth map-based stereoscopic video", IEEE Journal of Selected Topics in Signal Processing, 3(2), pp. 304-318, 2009.
