Why Visual Quality Assessment? - Lina Karam
lina.faculty.asu.edu/eee508/lectures/eee508_vqa.pdf

  • Sample image-and video-based applications

    • Entertainment

    • Communications

    • Medical imaging

    • Security

    • Monitoring

    • Visual sensing and control

    • Art

    Why Visual Quality Assessment?

    Copyright 2010 by Lina J. Karam

  • What is Quality?

    • Fidelity

    • Satisfaction

    • Performance

    • Aesthetic

    • Diagnostic

    • Other

    Some uses of Quality Assessment

    • Monitoring & improving the quality of service (QoS) and quality of experience (QoE)

    • Performance evaluation

    • Improved operation

    • Perceptually improved design

    • Authentication

    Why Visual Quality Assessment?

    Copyright 2010 by Lina J. Karam

  • Quality affected by

    • Sensing, capturing devices

    • Display, printing, reproduction

    • Attacks and Protection

    • Compression

    • Transmission

    • Environment

    • Human vision

    • Viewing position

    Why Visual Quality Assessment?

    Copyright 2010 by Lina J. Karam

  • Basic Imaging System

    Copyright 2010 by Lina J. Karam

    Imaging Device

    Imaged Scene

    DIGITIZER STORAGE PROCESS

    Enhancement, Restoration, Compression for transmission

    Sampling + Quantization

    Compression

    Quality of captured image depends on:

    • Imaging optics, sensors, and electronics

    • “Color” filter characteristics

    • Digitization

    • Processing

    • Compression

  • Basic Imaging System

    Copyright 2010 by Lina J. Karam

    Imaging Device

    Imaged Scene

    DIGITIZER STORAGE PROCESS

    Enhancement, Restoration, Compression for transmission

    Sampling + Quantization

    Compression

    • Different storage and transmission media depending on application

    • Multimedia applications over wireless portable devices gaining popularity: limited bandwidth and storage

    - Video over IP

    - Portable devices: power issues in addition to shared bandwidth and error-prone environment result in much lower data rate transfer

    - Harsh environments and security: operation under very low power and very low bandwidth, below 20 Kbits/sec

    • Data Storage Devices: CDs and DVDs. Data throughput (read and write rates, a few megabits per second) is much lower than storage capacity (a few gigabits) – 1x Blu-ray DVD: 32 Mbps

  • Copyright 2010 by Lina J. Karam

    Compression Artifacts

    Image and video coding standards

    • Transform based

    • Block-based DCT coding: JPEG, MPEGx, H.26x

    • Wavelet-based coding: JPEG 2000

    • Motion compensation for video

    • Quantization

  • Copyright 2010 by Lina J. Karam (EUVIP 2010)

    Common Compression Artifacts

    • Blocking artifacts in block-based DCT codecs

    • Ringing artifacts in wavelet-based codecs

    • Blurriness – loss of detail and sharpness due to removal of high frequency transform coefficients

    • Graininess – due to quantization of retained transform coefficients

    • Contouring

    • Color bleeding

    • Mosquito noise in video

    • Motion jerkiness in video

    • Ghosting

    • Flickering

  • Copyright 2010 by Lina J. Karam

    Compression Artifacts

    Degradations due to block-based DCT transform coding

  • Copyright 2010 by Lina J. Karam

    Compression Artifacts

    http://www.elecard.com/products/j2kwavelet.php

    [Figure: the 757x507 "Butterfly" image compressed with JPEG (10,696 bytes) and with JPEG 2000 (10,436 bytes), for artifact comparison.]

  • Copyright 2010 by Lina J. Karam

    Common Compression Artifacts

    [Figures: examples of ringing and of mosquito noise.]

  • Copyright 2010 by Lina J. Karam

    Compression Artifacts

    [Figures: further examples of compression artifacts.]

  • Copyright 2010 by Lina J. Karam

    Human Vision and Perception

    Quality affected by the human visual system

    • Characteristics and limitations of the human visual system

    • Some distortions are introduced

    • Some distortions are masked

    Saliency – visual attention

    • Faces in images, eyes, mouth

    • High-contrast objects

    • Motion

    • Snakes….

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    Goal: automatically and “reliably” estimate the quality of visual media

    Subjective assessments are expensive and not practical for real-time implementations

    Subjective tests are important for evaluating the performance of objective visual quality metrics

    Subjective tests need to follow strict and repeatable evaluation conditions

    ITU-T recommendations: www.itu.int/ITU-T/publications/recs.html

    Video Quality Experts Group (VQEG) reports: www.vqeg.org

  • EEE 508

    Visual Quality Assessment

    • Image/Video fidelity criteria

    – Useful for

    • rating the performance of image/video processing techniques

    • measuring image/video quality and user satisfaction

    – Issues:

    • Viewing distance

    • Subjective versus objective measures in evaluating image/video quality

  • EEE 508

    Image Quality Assessment – Subjective criteria:

    • Use rating scales – goodness scales (rate image quality):

      Overall, global scale      Group scale
      Excellent (5)              Best (7)
      Good (4)                   Well above average (6)
      Fair (3)                   Slightly above average (5)
      Poor (2)                   Average (4)
      Unsatisfactory (1)         Slightly below average (3)
                                 Well below average (2)
                                 Worst (1)

    – Impairment scales (rates an image based on level of degradation present in image compared to ideal image; useful in applications such as image coding and compression)

    Not noticeable (1)

    Just noticeable (2)

    Definitely noticeable but only slight impairment (3)

    Impairment not objectionable (4)

    Somewhat objectionable (5)

    Definitely objectionable (6)

    Extremely objectionable (7)

    • MOS (Mean Opinion Score) calculates average rating of observers
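    For illustration, a minimal sketch of the MOS computation in Python (the ratings array below is made-up example data on the 5-point goodness scale):

      import numpy as np

      # Hypothetical raw ratings: rows = observers, columns = test images,
      # values on the 5-point goodness scale (Excellent=5 ... Unsatisfactory=1).
      ratings = np.array([[5, 3, 2],
                          [4, 3, 1],
                          [5, 4, 2]])

      # MOS: average rating per test image over all observers.
      mos = ratings.mean(axis=0)
      print(mos)  # [4.67 3.33 1.67] (rounded)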

  • EEE 508

    Visual Quality Assessment – Traditional Quantitative criteria:

    • The most common set of traditional quantitative criteria used are based on the mean square error (MSE) norm.

    • In most applications, the mean square error is expressed in terms of a Signal-to-Noise Ratio (SNR), which is defined in decibels (dB)

    $$ \mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10} \frac{\sigma_I^2}{\sigma_{mse}^2} $$

    where $\sigma_I^2$ is the original image variance and $\sigma_{mse}^2$ is the error variance (mean square error):

    $$ \sigma_{mse}^2 = E\left\{ \left[ I_o(i,j) - I_p(i,j) \right]^2 \right\} $$

    with $I_o$ the original image and $I_p$ the processed image. $\sigma_{mse}^2$ is often approximated by the average least squares error:

    $$ \sigma_{lse}^2 = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I_o(i,j) - I_p(i,j) \right]^2 $$
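    A minimal sketch of these quantities in Python (assuming grayscale images stored as NumPy arrays; the function names are illustrative):

      import numpy as np

      def mse(original, processed):
          # Average least squares error over all M x N pixels.
          diff = original.astype(np.float64) - processed.astype(np.float64)
          return np.mean(diff ** 2)

      def snr_db(original, processed):
          # SNR (dB): ratio of original-image variance to error variance.
          return 10.0 * np.log10(np.var(original.astype(np.float64)) / mse(original, processed))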

  • EEE 508

    Visual Quality Assessment

    – Traditional Quantitative criteria:

    • Other types of SNR used in image coding applications:

    - Peak-to-Peak SNR (dB) = PPSNR

    - Peak SNR (dB) = PSNR (more commonly used)

    • PSNR generally results in values 12 to 15 dB above the value of SNR

    • SNR and PSNR are widely used as measures of quality; they usually correlate well with perceptual quality in image coding applications at high or very low bit rates, but they might not correlate well at moderately low bit rates

    • Commonly used because of mathematical tractability (easy to compute and handle in developing image processing algorithms)

    $$ \mathrm{PPSNR}_{\mathrm{dB}} = 10 \log_{10} \frac{(\text{peak-to-peak value of reference image})^2}{\sigma_e^2} $$

    $$ \mathrm{PSNR}_{\mathrm{dB}} = 10 \log_{10} \frac{(\text{peak value of reference image})^2}{\sigma_e^2} $$

    where $\sigma_e^2$ is the mean square error.
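    A corresponding sketch for PSNR, assuming 8-bit images so the peak value is 255 (an illustration, not a reference implementation):

      import numpy as np

      def psnr_db(reference, test, peak=255.0):
          # Peak SNR (dB); peak is the maximum possible pixel value.
          err = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
          if err == 0:
              return float('inf')  # identical images
          return 10.0 * np.log10(peak ** 2 / err)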

  • EEE 508

    Image Quality Assessment

    [Figure: two processed images with nearly equal RMSE (8.5 vs. 9.0) that can nonetheless differ noticeably in perceived quality.]

  • Design and Evaluation of Quality Metrics

    [Block diagram: raw content from a visual/content database (e.g., the LIVE Database [1]) is processed to generate test content. Reference and test content undergo subjective testing, whose raw scores (or Z scores) yield Mean Opinion Scores (MOS) or Differential MOS (DMOS). An objective visual quality metric M computes a predicted MOS (MOSp), optionally mapped through a nonlinear logistic function. Statistical analysis of MOS versus predicted MOS provides the performance assessment.]

    [1] LIVE Database, http://live.ece.utexas.edu/research/quality/

    Copyright 2010 by Lina J. Karam
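    The nonlinear mapping from raw metric scores to predicted MOS is commonly a logistic function fitted to the subjective data. A minimal sketch using a 4-parameter logistic form (one common choice in VQEG-style evaluations; the data arrays below are placeholders):

      import numpy as np
      from scipy.optimize import curve_fit

      def logistic(m, a, b, c, d):
          # 4-parameter logistic mapping from raw metric score m to predicted MOS.
          return a / (1.0 + np.exp(-(m - b) / c)) + d

      # Placeholder data: raw metric scores and corresponding subjective MOS.
      metric_scores = np.array([0.2, 0.4, 0.5, 0.7, 0.9])
      mos = np.array([1.5, 2.8, 3.2, 4.1, 4.7])

      params, _ = curve_fit(logistic, metric_scores, mos, p0=[4.0, 0.5, 0.1, 1.0])
      predicted_mos = logistic(metric_scores, *params)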

  • Copyright 2010 by Lina J. Karam

    Performance Evaluation of Quality Metrics

    Popular performance evaluation measures (a code sketch follows this list):

    • Pearson Correlation Coefficient (PCC): measures prediction accuracy, i.e., the ability of the metric to predict the subjective MOS with a low error.

    • Spearman Rank Order Correlation Coefficient (SROCC): measures prediction monotonicity, i.e., whether an increase (decrease) in one variable results in an increase (decrease) in the other, independent of the magnitude of the change.

    • Outlier Ratio (OR): measures consistency, i.e., the degree to which the metric maintains its prediction accuracy; it is defined as the percentage of predictions falling outside ±2 standard deviations of the subjective results.

    Other

    • RMSE and MAE of objective scores

    • Hypothesis testing and F statistics
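    A minimal sketch of the three popular measures using SciPy (the score arrays are placeholders; the outlier ratio assumes per-image standard deviations of the subjective scores are available):

      import numpy as np
      from scipy.stats import pearsonr, spearmanr

      predicted = np.array([3.9, 2.7, 4.5, 1.8, 3.1])   # predicted MOS (placeholder)
      mos = np.array([4.1, 2.5, 4.6, 2.2, 2.9])         # subjective MOS (placeholder)
      mos_std = np.array([0.5, 0.6, 0.4, 0.7, 0.5])     # per-image std of raw scores

      pcc, _ = pearsonr(predicted, mos)       # prediction accuracy
      srocc, _ = spearmanr(predicted, mos)    # prediction monotonicity
      outlier_ratio = np.mean(np.abs(predicted - mos) > 2 * mos_std)  # consistency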

  • Visual Quality Databases

    What is a visual quality database?

    - A set of images/videos (typically with varying content)

    - Subjective assessment scores

    Why are visual quality databases needed?

    - To assess the performance of objective (automatic) quality assessment methods and to compare their performance

    - To understand human visual perceptual properties

    Copyright 2010 by Lina J. Karam

  • Existing Image quality Databases

    LIVE Image (Release 2)

    • JPEG compressed images (169 images)

    • JPEG2000 compressed images (175 images)

    • Gaussian blur (145 images)

    • White noise (145 images)

    • Bit errors in JPEG2000 bit stream (145 images)

    Tampere Image Database 2008 (TID 2008)

    • 25 reference images x 17 types of distortions x 4 levels of distortions

    IRCCyN/IVC Database

    10 original images, 235 distorted images generated from 4 different distortion types (JPEG, JPEG 2000, Rayleigh fading, blurring)

    Toyama Database

    14 original images, 168 distorted images generated from 2 distortion types (JPEG, JPEG 2000)

    Visual Quality Databases

    Copyright 2010 by Lina J. Karam

  • Existing Video quality Databases

    VQEG

    • H.263 compression

    • MPEG-2 compression

    LIVE Video

    • MPEG-2 compression

    • H.264 compression

    • Simulated transmission of H.264 compressed bitstreams through error-prone IP networks and through error-prone wireless networks

    Visual Quality Databases

    Copyright 2010 by Lina J. Karam

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    Three paradigms, depending on how much of the reference is available:

    • Full Reference (FR): the FR objective metric compares the test content against the full reference content to produce a quality score.

    • Reduced Reference (RR): the RR objective metric compares the test content against features extracted from the reference.

    • No Reference (NR): the NR objective metric produces a quality score from the test content alone.

  • Objective Visual Quality Models and Metrics

    • Full Reference (FR): reference and test inputs feed the FR objective metric, which outputs a quality score (fidelity or aesthetic). Example application: camera calibration/tuning.

    Copyright 2010 by Lina J. Karam

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    • Reduced Reference (RR): sample features are extracted from both the reference and the test content; the RR objective metric compares these features to output a quality score.

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    • No Reference (NR): the NR objective metric outputs a quality score from the test content alone.

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    By reference availability: Full Reference | Reduced Reference | No Reference

    By modeling approach: Perceptual (HVS) | Visual Media Characteristics | Hybrid

    • Perceptual (HVS) models: frequency domain | pixel domain | hybrid

    • Visual Media Characteristics models: natural scene statistics | visual features | hybrid

  • Copyright 2010 by Lina J. Karam

    Full Reference Perceptual-based Model

    Processing pipeline (a simplified code sketch follows the list of metrics below):

    1. Apply a multi-channel decomposition to both the reference and the test image/video.

    2. Compute locally adaptive detection thresholds (JNDs) at each location in each channel.

    3. Compute the difference between reference and test at each location in each channel.

    4. Normalize each difference by the local JND.

    5. Pool the normalized differences over foveal regions.

    6. Pool all foveal differences over the entire image/video to obtain a distortion measure D; quality is Q = 1/D.

    Basis of several metrics:

    - Watson’s Spatial Standard Observer (SSO) metric
    - Watson’s Video Standard Observer (VSO) metric
    - Liu, Karam, & Watson: JPEG2000 compression distortion quantification and control
    - Watson’s DCTune
    - Hontsch & Karam: DCT-based JPEG compression distortion and control
    - Hontsch & Karam: perceptually lossless compression
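    A minimal sketch of the pipeline under strong simplifying assumptions: a difference-of-Gaussians stand-in for the multi-channel decomposition, a constant JND per channel instead of locally adaptive thresholds, and Minkowski pooling over whole channels instead of foveal regions. All functions and parameters here are illustrative and do not reproduce any of the specific metrics listed above:

      import numpy as np
      from scipy.ndimage import gaussian_filter

      def decompose(img, num_channels=4):
          # Crude multi-channel (bandpass) decomposition: differences of Gaussians.
          levels = [gaussian_filter(img.astype(np.float64), 2.0 ** k)
                    for k in range(num_channels + 1)]
          return [levels[k] - levels[k + 1] for k in range(num_channels)]

      def perceptual_quality(reference, test, jnds=(4.0, 3.0, 2.0, 1.5), beta=2.0):
          pooled = []
          for r, t, jnd in zip(decompose(reference), decompose(test), jnds):
              # Channel-wise difference, normalized by the (here constant) JND.
              normalized = np.abs(r - t) / jnd
              # Minkowski pooling over the channel (stand-in for foveal pooling).
              pooled.append(np.mean(normalized ** beta) ** (1.0 / beta))
          D = np.mean(pooled)                        # overall distortion
          return 1.0 / D if D > 0 else float('inf')  # quality Q = 1/D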

  • Copyright 2010 by Lina J. Karam

    Perceptually lossless compression

    Original image, 8 bits per pixel Processed image, 0.35 bits per pixel

  • Copyright 2010 by Lina J. Karam

    Perceptual Quality-based JPEG2K compression

    Original image, 8 bits per pixel

  • Copyright 2010 by Lina J. Karam

    Perceptual Quality-based JPEG2K compression

    Conventional JPEG2K, 0.586 bit per pixel

  • Copyright 2010 by Lina J. Karam

    Perceptual Quality-based JPEG2K compression

    Perceptual JPEG2K, 0.586 bit per pixel

  • Copyright 2010 by Lina J. Karam

    Other FR Metrics based on contrast detection thresholds

    • Visual SNR, or VSNR (Chandler & Hemami, ITIP, 2007)

    • Weighted SNR, WSNR (Mitsa & Varkur, 93)

    • Noise Quality Measure, NQM (Damera-Venkata et al., ITIP, 2000)

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    (Taxonomy recap: Full Reference / Reduced Reference / No Reference; Perceptual (HVS), Visual Media Characteristics, and Hybrid approaches.)

  • Copyright 2010 by Lina J. Karam

    Quality Metrics based on Natural Scene Statistics

    Basic assumption: distortions are not natural in terms of Natural Scene Statistics (NSS).

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    Structural SIMilarity (SSIM) Index

    The SSIM metric is calculated on various patches of an image. The measure between two patches x and y of size N×N is:

    $$ \mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} $$

    where $\mu_x$, $\mu_y$ are the means of x and y; $\sigma_x^2$, $\sigma_y^2$ are the variances of x and y; $\sigma_{xy}$ is the covariance of x and y; and $c_1$, $c_2$ are constants that stabilize the division.

    Extension: Multi-Scale Structural SIMilarity (MS-SSIM) Index.

  • Popular SSIM (Structural SIMilarity) FR Metric (Wang et al., ITIP 04)

    • The SSIM between two subimages x and y is given by the formula above, where μx and μy are the means of x and y, σx² and σy² are the variances of x and y, σxy is the covariance of x and y, and c1 and c2 are small constants used to stabilize the division.

    • The SSIM index for an image is the average of the SSIM indices over all subimages (see the sketch below).

    • Extensions: MS-SSIM, CWSSIM, VSSIM, …
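    A minimal sketch of SSIM over non-overlapping 8×8 patches (the reference implementation instead uses an 11×11 Gaussian-weighted sliding window; the constants follow the usual choice c1 = (0.01 L)², c2 = (0.03 L)² with dynamic range L = 255):

      import numpy as np

      def ssim_patch(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
          mu_x, mu_y = x.mean(), y.mean()
          var_x, var_y = x.var(), y.var()
          cov_xy = ((x - mu_x) * (y - mu_y)).mean()
          return (((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) /
                  ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))

      def ssim_image(img1, img2, patch=8):
          # Mean SSIM over non-overlapping patches.
          h, w = img1.shape
          scores = [ssim_patch(img1[i:i + patch, j:j + patch].astype(np.float64),
                               img2[i:i + patch, j:j + patch].astype(np.float64))
                    for i in range(0, h - patch + 1, patch)
                    for j in range(0, w - patch + 1, patch)]
          return float(np.mean(scores))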

    • Other FR NSS Metrics:

    - Universal Quality Index (Wang & Bovik, ISPL, 02) – earlier SSIM

    - Image Fidelity Criterion (Sheikh et al., ITIP, 05) – GSM in wavelet domain

    - Visual Information Fidelity (Sheikh et al., ITIP, 06) – adds HVS

    • RR NSS Metric: Reduced Reference Image Quality Assessment (Wang & Simoncelli, 05)

    Quality Metrics based on Natural Scene Statistics

    Copyright 2010 by Lina J. Karam

  • Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/research/ssim/

    Copyright 2010 by Lina J. Karam

    Other Sources of Information

  • Copyright 2010 by Lina J. Karam

    Objective Visual Quality Models and Metrics

    (Taxonomy recap: Full Reference / Reduced Reference / No Reference; Perceptual (HVS), Visual Media Characteristics, and Hybrid approaches.)

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    Just-Noticeable Blur (JNB) concept: “The minimum amount of perceived blurriness around an edge given a contrast higher than the Just Noticeable Difference (JND)”.

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    CPBD (Cumulative Probability of Blur Detection) Metric: the cumulative probability that the probability of detecting blur at an edge remains at or below the just-noticeable-blur level:

    $$ \mathrm{CPBD} = P(P_{BLUR} \le P_{JNB}) = \sum_{P_{BLUR} \le 0.63} P(P_{BLUR}) $$
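    A minimal sketch of the CPBD computation given per-edge measured widths and JNB widths (edge detection and the JNB width model are abstracted away; the psychometric form and beta value follow the JNB/CPBD papers cited later, but the helper below is illustrative):

      import numpy as np

      def cpbd(edge_widths, jnb_widths, beta=3.6, p_jnb=0.63):
          # Probability of detecting blur at each edge (psychometric function).
          p_blur = 1.0 - np.exp(-np.abs(edge_widths / jnb_widths) ** beta)
          # CPBD: fraction of edges whose blur-detection probability stays at or
          # below the just-noticeable-blur level (higher CPBD = sharper image).
          return float(np.mean(p_blur <= p_jnb))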

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    CPBD (Cumulative Probability of Blur Detection) Metric

    [Figure: example images with different blur amounts (values 0.9 and 1.7) illustrating the metric.]

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    Performance evaluation of CPBD using the LIVE Database:

    Set 1: All 174 Gaussian-blurred images in LIVE.

    Set 2: 30 Gaussian-blurred images with varying foreground and background blur amounts.

    Set 3: All 227 JPEG 2000 compressed images in LIVE.

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    Performance evaluation of CPBD using the TID 2008 Database

  • Copyright 2010 by Lina J. Karam

    No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection

    Performance evaluation of CPBD using the IVC Database

    Performance evaluation of CPBD using the Toyama Database

  • Copyright 2010 by Lina J. Karam

    Other Sources of Information

    • R. Ferzli and L. J. Karam, “A No-Reference Objective Image Sharpness Metric Based on the Notion of Just Noticeable Blur (JNB),” IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, April 2009.

    • N. D. Narvekar and L. J. Karam, “A No-Reference Image Blur Metric Based on the Cumulative Probability of Blur Detection (CPBD),” IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2678-2683, Sept. 2011.

    • http://ivulab.asu.edu/Quality


  • Existing still-image quality assessment metrics can be applied to video by computing them frame-by-frame and pooling over frames (see the sketch after this list).

    • PVQM (Swisscom/KPN): leader in the VQEG Phase 1 study; uses a linear combination of three distortion indicators, namely edginess, temporal decorrelation, and color error, to measure perceptual quality (visual-feature based, with weighted combinations of distortion indicators related to these features).

    • VQM (NTIA): leader in the VQEG Phase 2 study and standardized by ITU-T and ISO; provides several quality models, such as the Television Model, the General Model, and the Video Conferencing Model, with several calibration options prior to feature extraction (visual-feature based, with weighted combinations of distortion indicators related to the features). The main impairments considered in the General Model include blurring, block distortion, jerky/unnatural motion, noise, and error blocks.

    Copyright 2010 by Lina J. Karam

    Competitive FR Video Quality Metrics
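    A minimal sketch of applying a still-image metric to video via frame-wise computation and pooling (mean pooling is only one simple choice; ssim_image refers to the SSIM sketch above):

      import numpy as np

      def video_quality(reference_frames, test_frames, frame_metric):
          # Apply a still-image metric per frame, then pool over frames (mean).
          scores = [frame_metric(r, t) for r, t in zip(reference_frames, test_frames)]
          return float(np.mean(scores))

      # Usage with the earlier SSIM sketch:
      # q = video_quality(ref_frames, test_frames, ssim_image)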

  • PEVQ (Opticom): leader in the VQEG Multimedia Phase 1 study; builds upon PVQM; became part of ITU-T Recommendation J.247 (FR MM video, 2008).

    • MOVIE index (Seshadrinathan & Bovik, ITIP, 2009): spatio-temporal multi-channels, visual masking, temporal quality assessed along computed motion trajectories; builds on principles from SSIM and VIF.

    Copyright 2010 by Lina J. Karam

    Competitive FR Video Quality Metrics

  • Issue with current video quality metrics: existing still-image quality assessment metrics applied frame-by-frame to video remain very competitive with state-of-the-art video quality metrics (performance on the LIVE Video Database).

    • Better video quality models are needed.

    Copyright 2010 by Lina J. Karam

    Competitive FR Video Quality Metrics