Why Visual Quality Assessment? - Lina Karam, lina.faculty.asu.edu/eee508/lectures/eee508_vqa.pdf
TRANSCRIPT
-
Sample image- and video-based applications
• Entertainment
• Communications
• Medical imaging
• Security
• Monitoring
• Visual sensing and control
• Art
Why Visual Quality Assessment?
Copyright 2010 by Lina J. Karam
-
What is Quality?
• Fidelity
• Satisfaction
• Performance
• Aesthetic
• Diagnostic
• Other
Some uses of Quality Assessment
• Monitoring & improving the quality of service (QoS) and quality of experience (QoE)
• Performance evaluation
• Improved operation
• Perceptually improved design
• Authentication
-
Quality affected by
• Sensing, capturing devices
• Display, printing, reproduction
• Attacks and Protection
• Compression
• Transmission
• Environment
• Human vision
• Viewing position
-
Basic Imaging System
[Diagram: Imaged Scene → Imaging Device → Digitizer (sampling + quantization) → Storage → Processing (enhancement, restoration, compression for transmission)]
Quality of captured image depends on:
• Imaging optics, sensors, and electronics
• “Color” filter characteristics
• Digitization
• Processing
• Compression
-
Basic Imaging System
• Different storage and transmission media depending on application
• Multimedia applications over wireless portable devices gaining popularity: limited bandwidth and storage
- Video over IP
- Portable devices: power issues, in addition to shared bandwidth and an error-prone environment, result in much lower data-rate transfer
- Harsh environments and security: operation under very low power and very low bandwidth, below 20 kbit/s
• Data storage devices (CDs and DVDs): data throughput (read and write rates, a few megabits per second) is much lower than storage capacity (a few gigabytes)
- 1x Blu-ray: 32 Mbps
-
Compression Artifacts
Image and video coding standards
• Transform based
• Block-based DCT coding: JPEG, MPEGx, H.26x
• Wavelet-based coding: JPEG 2000
• Motion compensation for video
• Quantization
-
EUVIP 2010
Common Compression Artifacts
• Blocking artifacts in block-based DCT codecs
• Ringing artifacts in wavelet-based codecs
• Blurriness – loss of detail and sharpness due to removal of high frequency transform coefficients
• Graininess – due to quantization of retained transform coefficients
• Contouring
• Color bleeding
• Mosquito noise in video
• Motion jerkiness in video
• Ghosting
• Flickering
-
Compression Artifacts
Degradations due to block-based DCT transform coding
-
Compression Artifacts
http://www.elecard.com/products/j2kwavelet.php
JPEG: 10,696 bytes, 757×507 Butterfly
JPEG 2000: 10,436 bytes, 757×507 Butterfly
-
Common Compression Artifacts
Ringing
Mosquito Noise
-
Compression Artifacts
-
Compression Artifacts
-
Human Vision and Perception
Quality affected by the human visual system
• Characteristics and limitations of the human visual system
• Some distortions are introduced
• Some distortions are masked
Saliency – visual attention
• Faces in images, eyes, mouth
• High-contrast objects
• Motion
• Snakes….
-
Objective Visual Quality Models and Metrics
Goal: automatically and “reliably” estimate the quality of visual media
Subjective assessments are expensive and not practical for real-time implementations
Subjective tests are important for evaluating the performance of objective visual quality metrics
Subjective tests need to follow strict and repeatable evaluation conditions
ITU-T recommendations: www.itu.int/ITU-T/Publications/recs.html
Video Quality Experts Group (VQEG) reports: www.vqeg.org
-
EEE 508
Visual Quality Assessment
• Image/Video fidelity criteria
– Useful for
• rating performance of image/video processing techniques
• measuring image/video quality and user satisfaction
– Issues:
• Viewing distance
• Subjective versus objective measures in evaluating image/video quality
-
Image Quality Assessment – Subjective criteria:
• Use rating scales – goodness scales (rate image quality)

  Overall, global scale      Group scale
  Excellent (5)              Best (7)
  Good (4)                   Well above average (6)
  Fair (3)                   Slightly above average (5)
  Poor (2)                   Average (4)
  Unsatisfactory (1)         Slightly below average (3)
                             Well below average (2)
                             Worst (1)
– Impairment scales (rates an image based on level of degradation present in image compared to ideal image; useful in applications such as image coding and compression)
Not noticeable (1)
Just noticeable (2)
Definitely noticeable but only slight impairment (3)
Impairment not objectionable (4)
Somewhat objectionable (5)
Definitely objectionable (6)
Extremely objectionable (7)
• MOS (Mean Opinion Score) calculates average rating of observers
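The MOS averaging described above can be sketched in a few lines; the z-score normalization step (often used in subjective studies to compensate for differences between observers' use of the scale) is included as a common companion step, not something this slide prescribes:

```python
import statistics

def mean_opinion_score(ratings):
    """MOS: average of the raw ratings given by the observers for one image."""
    return statistics.mean(ratings)

def z_scores(ratings):
    """Normalize one observer's ratings to zero mean and unit variance."""
    mu = statistics.mean(ratings)
    sigma = statistics.stdev(ratings)
    return [(r - mu) / sigma for r in ratings]

# Five observers rate one image on the 5-point goodness scale
print(mean_opinion_score([5, 4, 4, 3, 5]))  # 4.2
```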
-
Visual Quality Assessment – Traditional Quantitative criteria:
• The most common traditional quantitative criteria are based on the mean square error (MSE) norm.
• In most applications, the mean square error is expressed in terms of a Signal-to-Noise Ratio (SNR), defined in decibels (dB):

  SNR (dB) = 10 log10( σ² / σ²_mse )

where σ² is the original image variance and σ²_mse is the error variance (MSE):

  σ²_mse = E[ (I_o(i,j) − I_p(i,j))² ]

often approximated by the average least-squares error:

  σ²_mse ≈ (1/(MN)) Σ_{i=1..M} Σ_{j=1..N} ( I_o(i,j) − I_p(i,j) )²

with I_o the original image and I_p the processed image.
-
Visual Quality Assessment – Traditional Quantitative criteria:
• Other types of SNR used in image coding applications:
- Peak-to-Peak SNR (dB) = PPSNR:

  PPSNR (dB) = 10 log10( (peak-to-peak value of reference image)² / σ²_e )

- Peak SNR (dB) = PSNR (more commonly used):

  PSNR (dB) = 10 log10( (peak value of reference image)² / σ²_e )

• PSNR generally results in values 12 to 15 dB above the SNR value
• SNR and PSNR are widely used measures of quality; they usually correlate well with perceptual quality in image coding applications at high or very low bit rates, but they may not correlate well at intermediate low bit rates
• Commonly used because of mathematical tractability (easy to compute and handle when developing image processing algorithms)
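A minimal sketch of the three measures above, assuming 8-bit grayscale images held as NumPy arrays (the function names are illustrative):

```python
import numpy as np

def mse(ref, proc):
    """Average least-squares error between original and processed images."""
    diff = np.asarray(ref, dtype=np.float64) - np.asarray(proc, dtype=np.float64)
    return float(np.mean(diff ** 2))

def snr_db(ref, proc):
    """SNR in dB: original image variance over the error variance (MSE)."""
    return 10.0 * np.log10(np.var(np.asarray(ref, dtype=np.float64)) / mse(ref, proc))

def psnr_db(ref, proc, peak=255.0):
    """Peak SNR in dB; peak = 255 for 8-bit images."""
    return 10.0 * np.log10(peak ** 2 / mse(ref, proc))
```

Because PSNR divides by a fixed peak value rather than the image variance, it sits above SNR by a content-dependent offset; the 12-15 dB gap quoted above is what is typically observed for natural images.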
-
Image Quality Assessment
[Figure: two processed images, RMSE = 8.5 and RMSE = 9.0]
-
Design and Evaluation of Quality Metrics
[Block diagram: raw content from a visual content database is processed into test content; subjective testing on reference and test content yields raw scores or Z scores, giving the Mean Opinion Score (MOS) or DMOS; the objective visual quality metric M produces a predicted MOS (MOSp), optionally mapped through a nonlinear logistic function; statistical analysis of subjective MOS versus predicted MOS gives the performance assessment.]
[1] LIVE Database, http://live.ece.utexas.edu/research/quality/
-
Performance Evaluation of Quality Metrics
Popular performance evaluation measures
• Pearson Correlation Coefficient (PCC): measures prediction accuracy, i.e., the ability of the metric to predict the subjective MOS with low error
• Spearman Rank Order Correlation Coefficient (SROCC): measures prediction monotonicity, i.e., whether an increase (decrease) in one variable results in an increase (decrease) in the other, independent of the magnitude of the change
• Outlier Ratio (OR): measures consistency, i.e., the degree to which the metric maintains its prediction accuracy; defined as the percentage of predictions falling outside ±2 standard deviations of the subjective results
Other measures
• RMSE and MAE of objective scores
• Hypothesis testing and F statistics
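The three measures above can be sketched with NumPy alone; the rank function below ignores ties, which is a simplification (standard implementations average tied ranks):

```python
import numpy as np

def pearson(x, y):
    """PCC: prediction accuracy (linear agreement with subjective MOS)."""
    return float(np.corrcoef(x, y)[0, 1])

def _ranks(x):
    """Rank values 1..n (no tie handling, for illustration only)."""
    order = np.argsort(x)
    r = np.empty(len(x))
    r[order] = np.arange(1, len(x) + 1)
    return r

def spearman(x, y):
    """SROCC: prediction monotonicity (PCC computed on the ranks)."""
    return pearson(_ranks(np.asarray(x)), _ranks(np.asarray(y)))

def outlier_ratio(mos, predicted, mos_std):
    """OR: fraction of predictions outside MOS +/- 2 standard deviations."""
    mos, predicted, mos_std = (np.asarray(a, dtype=np.float64)
                               for a in (mos, predicted, mos_std))
    return float(np.mean(np.abs(predicted - mos) > 2.0 * mos_std))
```

A monotone but nonlinear predictor illustrates the difference: `spearman([1,2,3,4], [1,4,9,16])` is exactly 1, while the Pearson coefficient of the same data falls below 1.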
-
Visual Quality Databases
What is a visual quality database?
- A set of images/videos (typically with varying content)
- Subjective assessment scores
Why are visual quality databases needed?
- To assess the performance of objective (automatic) quality assessment methods and compare their performance
- To understand human visual perceptual properties
-
Existing Image Quality Databases
LIVE Image (Release 2)
• JPEG compressed images (169 images)
• JPEG2000 compressed images (175 images)
• Gaussian blur (145 images)
• White noise (145 images)
• Bit errors in JPEG2000 bit stream (145 images)
Tampere Image Database 2008 (TID 2008)
• 25 reference images x 17 types of distortions x 4 levels of distortions
IRCCyN/IVC Database
10 original images, 235 distorted images generated from 4 different distortion types (JPEG, JPEG 2000, Rayleigh fading, blurring)
Toyama Database
14 original images, 168 distorted images generated from 2 distortion types (JPEG, JPEG 2000)
-
Existing Video Quality Databases
VQEG
• H.263 compression
• MPEG-2 compression
LIVE Video
• MPEG-2 compression
• H.264 compression
• Simulated transmission of H.264 compressed bitstreams through error-prone IP networks and through error-prone wireless networks
-
Objective Visual Quality Models and Metrics
• Full Reference (FR): reference + test → FR objective metric → quality
• Reduced Reference (RR): features extracted from the reference + test → RR objective metric → quality
• No Reference (NR): test only → NR objective metric → quality
-
Objective Visual Quality Models and Metrics
• Full Reference (FR): reference + test → FR objective metric → quality
Example application: camera calibration/tuning
-
Objective Visual Quality Models and Metrics
• Reduced Reference (RR): sample features from the reference + test → RR objective metric → quality
-
Objective Visual Quality Models and Metrics
• No Reference (NR): test only → NR objective metric → quality
-
Objective Visual Quality Models and Metrics
Approaches: Full Reference, Reduced Reference, No Reference. Each can be based on:
• Perceptual (HVS) models: frequency domain, pixel domain, or hybrid
• Visual media characteristics: natural scene statistics, visual features, or hybrid
• Hybrid combinations of the two
-
Full Reference Perceptual-based Model
[Pipeline: the reference and test each undergo a multi-channel decomposition; locally adaptive detection thresholds (JNDs) are computed at each location in each channel; the difference at each location in each channel is computed and normalized by the local JNDs; the normalized differences are pooled over foveal regions, and all foveal differences are then pooled over the entire image/video into a distortion value D, with quality Q = 1/D.]
Basis of several metrics:
- Watson’s Spatial Standard Observer (SSO) metric
- Watson’s Video Standard Observer (VSO) metric
- Liu, Karam, & Watson JPEG2000 compression distortion quantification and control
- Watson’s DCTune
- Hontsch & Karam DCT-based JPEG compression distortion and control
- Hontsch & Karam perceptually lossless compression
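The normalize-and-pool stage of the pipeline above can be sketched as below. Minkowski (power-law) summation is a common pooling choice; the single pooling stage and the exponent beta = 4 are simplifying assumptions (the models listed pool first over foveal regions, then over the whole image, and each chooses its own exponent), and the channel decompositions and JND maps are assumed to have been computed already:

```python
import numpy as np

def pooled_distortion(ref_channels, test_channels, jnd_maps, beta=4.0):
    """Normalize per-channel differences by the local JNDs, then
    Minkowski-pool them into a single distortion D; quality Q = 1/D."""
    total = 0.0
    for ref, test, jnd in zip(ref_channels, test_channels, jnd_maps):
        normalized = np.abs(np.asarray(ref, float) - np.asarray(test, float)) / jnd
        total += float(np.sum(normalized ** beta))
    d = total ** (1.0 / beta)
    return d, (1.0 / d if d > 0 else float("inf"))
```

With this normalization, a difference of exactly one JND contributes a unit term, so D roughly counts how far the test image sits above the visibility thresholds.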
-
Perceptually lossless compression
[Figure: original image at 8 bits per pixel vs. processed image at 0.35 bits per pixel]
-
Perceptual Quality-based JPEG2K compression
Original image, 8 bits per pixel
-
Perceptual Quality-based JPEG2K compression
Conventional JPEG2K, 0.586 bits per pixel
-
Perceptual Quality-based JPEG2K compression
Perceptual JPEG2K, 0.586 bits per pixel
-
Other FR Metrics based on contrast detection thresholds
• Visual SNR, VSNR (Chandler & Hemami, ITIP, 2007)
• Weighted SNR, WSNR (Mitsa & Varkur, 1993)
• Noise Quality Measure, NQM (Damera-Venkata et al., ITIP, 2000)
-
Quality Metrics based on Natural Scene Statistics
Basic assumption: distortions are not natural in terms of Natural Scene Statistics (NSS).
-
Objective Visual Quality Models and Metrics
Structural SIMilarity (SSIM) Index
The SSIM metric is calculated on various patches of an image. The measure between two patches x and y of size N×N is:

  SSIM(x, y) = [ (2 μ_x μ_y + C1)(2 σ_xy + C2) ] / [ (μ_x² + μ_y² + C1)(σ_x² + σ_y² + C2) ]

where μ_x, μ_y are the means of x and y, σ_x², σ_y² are their variances, σ_xy is the covariance of x and y, and C1, C2 are constants that stabilize the division.
A multi-scale extension gives the Multi-Scale Structural SIMilarity (MS-SSIM) Index.
-
Popular SSIM (Structural SIMilarity) FR Metric (Wang et al., ITIP, 2004)
• The SSIM between two subimages x and y is given by

  SSIM(x, y) = [ (2 μ_x μ_y + C1)(2 σ_xy + C2) ] / [ (μ_x² + μ_y² + C1)(σ_x² + σ_y² + C2) ]

- μ_x and μ_y are the means of x and y; σ_x² and σ_y² are the variances
- σ_xy is the covariance of x and y; C1 and C2 are constants used to stabilize the division
• The SSIM index for an image is the average of the SSIM indices over all subimages
• Extensions: MS-SSIM, CWSSIM, VSSIM, …
• Other FR NSS metrics:
- Universal Quality Index (Wang & Bovik, ISPL, 2002) – earlier SSIM
- Image Fidelity Criterion (Sheikh et al., ITIP, 2005) – GSM in the wavelet domain
- Visual Information Fidelity (Sheikh et al., ITIP, 2006) – adds HVS
• RR NSS metric: Reduced Reference Image Quality Assessment (Wang & Simoncelli, 2005)
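A single-patch sketch of the SSIM formula: the stabilizing constants C1 = (K1·L)² and C2 = (K2·L)², with K1 = 0.01, K2 = 0.03, and L the dynamic range, follow the Wang et al. paper cited below; a full implementation slides a local window over the image and averages the per-window values:

```python
import numpy as np

def ssim_patch(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two same-size patches x and y."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the mean (luminance) term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the variance/covariance term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = float(((x - mu_x) * (y - mu_y)).mean())
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

Identical patches score 1; structurally dissimilar patches score much lower, and anti-correlated patches can score below zero because the covariance term goes negative.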
-
Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/research/ssim/
Other Sources of Information
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
Just-Noticeable Blur (JNB) concept: “the minimum amount of perceived blurriness around an edge, given a contrast higher than the Just Noticeable Difference (JND)”.
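The JNB/CPBD idea can be sketched as follows. The psychometric form P = 1 − exp(−(w / w_JNB)^β) and the slope β ≈ 3.6 follow the Ferzli-Karam JNB work cited later in these slides, while the edge detection and the contrast-dependent JNB widths are assumed to have been computed already:

```python
import math

BETA = 3.6  # psychometric-function slope reported in the JNB work (approximate)

def prob_blur_detection(edge_width, jnb_width):
    """Probability of detecting blur at one edge, from its measured width
    and the just-noticeable blur width for the local contrast."""
    return 1.0 - math.exp(-((edge_width / jnb_width) ** BETA))

def cpbd(edge_widths, jnb_widths, p_jnb=0.63):
    """CPBD: fraction of edges whose blur-detection probability stays at
    or below the just-noticeable level (higher CPBD = sharper image)."""
    probs = [prob_blur_detection(w, j) for w, j in zip(edge_widths, jnb_widths)]
    return sum(p <= p_jnb for p in probs) / len(probs)
```

Note that an edge whose width equals its JNB width has detection probability 1 − e⁻¹ ≈ 0.63, which is where the P_JNB = 0.63 threshold comes from.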
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
CPBD (Cumulative Probability of Blur Detection) Metric
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
CPBD (Cumulative Probability of Blur Detection) Metric
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
Performance evaluation of CPBD using the LIVE Database:
Set 1: all 174 Gaussian-blurred images in LIVE
Set 2: 30 Gaussian-blurred images with varying foreground and background blur amounts
Set 3: all 227 JPEG 2000 compressed images in LIVE
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
Performance evaluation of CPBD using TID 2008 Database
-
No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
Performance evaluation of CPBD using IVC Database
Performance evaluation of CPBD using Toyama Database
-
Other Sources of Information
• R. Ferzli and L. J. Karam, “A No-Reference Objective Image Sharpness Metric Based on the Notion of Just Noticeable Blur (JNB),” IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, Apr. 2009.
• N. D. Narvekar and L. J. Karam, “A No-Reference Image Blur Metric Based on the Cumulative Probability of Blur Detection (CPBD),” IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2678-2683, Sept. 2011.
• http://ivulab.asu.edu/Quality
-
• Existing still-image quality assessment metrics can be applied to video frame by frame, pooling the per-frame scores over frames
• PVQM (Swisscom/KPN): leader in the VQEG Phase 1 study; uses a linear combination of three distortion indicators, namely edginess, temporal decorrelation, and color error, to measure perceptual quality (visual-feature based, with weighted combinations of distortion indicators related to these features)
• VQM (NTIA): leader in the VQEG Phase 2 study and standardized by ITU-T and ISO; provides several quality models, such as the Television Model, the General Model, and the Video Conferencing Model, with several calibration options prior to feature extraction (visual-feature based, with weighted combinations of distortion indicators related to the features); the main impairments considered in the General Model include blurring, block distortion, jerky/unnatural motion, noise, and error blocks
Competitive FR Video Quality Metrics
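The frame-by-frame approach in the first bullet above can be sketched generically; the choice of per-frame metric and the use of simple mean pooling over time are assumptions for illustration, since many temporal pooling strategies exist:

```python
import numpy as np

def pooled_video_quality(ref_frames, test_frames, frame_metric):
    """Apply a still-image quality metric to each (reference, test) frame
    pair and pool the per-frame scores by averaging over time."""
    scores = [frame_metric(r, t) for r, t in zip(ref_frames, test_frames)]
    return float(np.mean(scores))
```

Any per-frame callable works as `frame_metric`, e.g. a PSNR or SSIM function; mean pooling ignores temporal effects such as flicker and jerkiness, which is exactly the limitation dedicated video metrics try to address.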
-
• PEVQ (Opticom): leader in the VQEG Multimedia Phase 1 study; builds upon PVQM; became part of ITU-T Recommendation J.247 (FR multimedia video, 2008)
• MOVIE index (Seshadrinathan & Bovik, ITIP, 2009): spatio-temporal multi-channel decomposition, visual masking, and temporal quality assessed along computed motion trajectories; builds on principles from SSIM and VIF
-
• Issue with current video quality metrics: the results of existing still-image quality assessment metrics applied to video are very competitive with state-of-the-art video quality metrics
• Better video quality models are needed.
Performance on LIVE Video Database