Objective Assessment of Image and Video Quality
Vladimir Petrović, KTIOS, FTN, UNS



  • Aims
    Define motivation for image and video quality evaluation
    Introduce the problem domain of IQA and VQA
    Introduce methods and approaches used to tackle the problem
    Look in detail at representative metrics and the various steps in the process of quality evaluation

    VP, SSIP2017

  • Problem
    Images and videos are subject to a wide variety of distortions during acquisition, processing, compression, transmission and reproduction

  • Motivation (1)
    There is intense interest in being able to determine image and video quality for a variety of reasons
    E.g. monitoring quality degradations (QoS)

  • Motivation (2)
    Your life may depend on knowing what is real and what isn’t.

    [Figure: original stream from the sensor vs. compressed for a low-bandwidth channel]

  • Motivation (3)
    Benchmarking and optimising a variety of processing methods.

    [Figure: unprocessed vs. processed]

  • Evaluating Quality
    Humans effortlessly determine the quality of what they are seeing
    Objective evaluation of perceived quality turns out to be challenging

  • Benchmark
    Human Visual System: “mark 1 eyeball”
    Subjective trials in relevant conditions can tell us exactly the perceived level of quality of any signal
    But they are complex to organise so as to be statistically relevant
    They require lots of time, equipment and effort to produce results

  • Definition
    Automatically determine the perceived image or video quality of a displayed signal.
    Automatically = computationally, objectively
    Essentially: predict how a representative cohort of observers would rate the presented image/video.

    [Scatter plot: objective score vs. subjective score]

  • Quality is Multi-Faceted
    Depending on the context it can include effects such as:
    Aesthetic quality, as a subjective impression thereof
    Utility of the signal for a particular task/purpose
    Fidelity of the original information
    Usually measured as a subjective impression expressed on a numerical scale

    [Example images with subjective scores: DMOS = 0.55, DMOS = 0.41, MOS = 0.59]

  • Objective Quality Metrics
    Algorithms that process an image/video and return a quality score, usually a single scalar
    Apart from the test signal, their inputs can include the original (Reference) signal and other relevant information

    [Example images with objective scores: Q_obj = 0.780, 0.803, 0.921]

  • Practical Problem Domain
    Metrics are categorised by the practical availability of a reference
    Full Reference (FR) evaluation
      The pristine original (Reference) is available to the metric
      Evaluation is performed through a direct comparison with the degraded signal
    Reduced Reference (RR)
      A small fraction of, usually abstracted, information from the original is available
      Evaluation is performed by comparing with this information
    No-Reference (NR)
      The original is not available, only the degraded/received test signal

  • Classification of Methods
    Based on application scope
    General approaches
      Attempting evaluation of the generic “quality” of the signal
    Application specific
      Evaluating specific aspects of quality: sharpness, noise, contrast …
      Evaluating a specific type of information: dim targets, broadcast …

  • Methodology
    A variety of methods devised over the past 40 years:
      Error based methods
      Physiological and psychological vision models
      Structural similarity methods
      Natural scene statistics and information theoretic metrics
      Machine learning methods
    General purpose viewing quality evaluation performance is close to the theoretical maximum
      At least on generic publicly available datasets

  • Past and Present
    The most obvious metric is to measure the difference between the reference and the test signals
    Local differences can then be summed up into a global score
    Mean squared error (MSE, L2) and the derived PSNR are the best known examples
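    These two error measures can be sketched in a few lines of NumPy; this is a minimal illustration rather than the exact formulation used in the lecture (the default peak value of 255 assumes 8-bit images):

```python
import numpy as np

def mse(reference, test):
    """Mean squared error: average of squared local differences."""
    ref = reference.astype(np.float64)
    tst = test.astype(np.float64)
    return float(np.mean((ref - tst) ** 2))

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB, derived from MSE; higher is better."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")  # identical signals
    return 10.0 * np.log10(peak ** 2 / err)
```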

  • Limitations of MSE
    Fails on general, varied evaluation
    All of the degraded images below have approximately the same MSE

    [Figure: differently distorted images with near-identical MSE]

  • Degradation Analysis
    It is the images with local structural changes that exhibit loss of perceived quality.

    [Figure: original; no structural change; local structural change]

  • Delving Further
    Furthermore, the more local structure degradation there is, the worse the quality.
    ⇒ There is an evident relationship between structure loss and quality.

  • Structural Similarity Quality Evaluation
    We can extract local structure from the reference and test images …
    … and compare them directly using a similarity/distance model
    Repeat this systematically across the scene

    [Diagram: Reference and Test → Similarity/Distance Model → Similarity Score]

  • Structural Similarity Maps
    Show structural similarity at each location in the scene
    Depending on the similarity model, usually scaled 0 to 1, where 0 is complete disagreement and 1 is identical structures

    [Figure: Reference, Test and the resulting structural similarity map]

  • Structural Similarity Evaluation
    Similarity maps are essentially local measures of quality
    They can be integrated into global similarity/quality scores

    [Diagram: Similarity/Distance Model → Pooling → Global Similarity/Quality Score]

  • Similarity/Distance Models
    Evaluate the distance between two structures
      Similarity is essentially = 1 − Distance
    Tightly tied to the structure extraction/representation method
    Many models have been proposed in the literature:
      Window MSE
      Normalised correlation
      SSIM – Structural Similarity Index
      QAB – gradient structure distance
      …

  • Structural Similarity Index
    Probably the best known objective IQ metric [Wang and Bovik ’02; Wang, Bovik, Sheikh, Simoncelli ’04]
    A three-term evaluation, where μ is the mean and σ the standard deviation of the signal evaluated over a local window:

      l(x,y) = (2 μ_x μ_y + C_1) / (μ_x² + μ_y² + C_1)   (distance in illumination)
      c(x,y) = (2 σ_x σ_y + C_2) / (σ_x² + σ_y² + C_2)   (distance in contrast)
      s(x,y) = (σ_xy + C_3) / (σ_x σ_y + C_3)            (distance in structure)

      SSIM(x,y) = l(x,y) · c(x,y) · s(x,y)

    Abbreviated form (with C_3 = C_2 / 2):

      SSIM(x,y) = (2 μ_x μ_y + C_1)(2 σ_xy + C_2) / ((μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2))
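    The abbreviated form can be sketched with local statistics over a sliding window. Note this simplified version uses a uniform (box) window rather than the Gaussian weighting of the published metric; the constants C1 = (0.01·L)² and C2 = (0.03·L)² are the standard choices:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_mean(img, win):
    """Mean over win x win neighbourhoods ('valid' region only)."""
    return sliding_window_view(img, (win, win)).mean(axis=(-2, -1))

def ssim_map(ref, test, win=7, L=255.0):
    """Simplified SSIM map: uniform window instead of the usual Gaussian."""
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    ref = ref.astype(np.float64)
    test = test.astype(np.float64)
    mu_x, mu_y = local_mean(ref, win), local_mean(test, win)
    sigma_x2 = local_mean(ref * ref, win) - mu_x ** 2
    sigma_y2 = local_mean(test * test, win) - mu_y ** 2
    sigma_xy = local_mean(ref * test, win) - mu_x * mu_y
    return ((2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)) / (
        (mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x2 + sigma_y2 + C2)
    )

def ssim(ref, test, win=7):
    """Global SSIM score: mean of the local similarity map."""
    return float(ssim_map(ref, test, win).mean())
```

    An identical pair scores 1; a uniform brightness shift lowers only the luminance term, so the score drops below 1 without any structural change.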

  • Structural Similarity as Quality Measure
    Comparing SSIM scores to subjective scores reveals a monotonic relationship
    SSIM can therefore be used directly as a measure of quality
    Assessment performance also depends on the pooling method

  • SSIM Results
    SSIM provides a more realistic quality estimate than MSE

  • Gradient Structural Similarity Model
    Structure is represented through local image gradients
      Extracted using gradient operators, e.g. Sobel
      Compared between the Reference and Test images
      Used to determine the relative, perceptual importance of various locations across the scene

    [Diagram: Reference and Test → Local Gradient Extraction → Gradient Distance Model and Perceptual Importance Estimation → Structural Similarity Model → Pooling]

  • Structural Similarity = 1 − Gradient Loss
    We’re not interested in absolute gradient loss, only the perceived loss

    [Diagram: True and Degraded Visual Information → Gradient Change Estimation → Perceptual Loss Estimation → Information Preservation Estimates]

  • Gradient Distance
    Given gradient components s_x and s_y at each location (n,m), we first evaluate gradient magnitude and orientation, e.g. for image A:

      g_A(n,m) = √( s_xA(n,m)² + s_yA(n,m)² ) / g_max

      α_A(n,m) = arctan( s_yA(n,m) / s_xA(n,m) )

    Then measure the distance in gradient magnitude and orientation between the test and reference images (A and B), with a small constant C for stability:

      Δg_AB(n,m) = (g_B(n,m) + C) / (g_A(n,m) + C),  if g_A(n,m) > g_B(n,m)
                   (g_A(n,m) + C) / (g_B(n,m) + C),  if g_A(n,m) ≤ g_B(n,m)

      Δα_AB(n,m) = | |α_A(n,m) − α_B(n,m)| − π | / π

    This gives linear structural distance estimates at each (n,m): a similarity map
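    A sketch of the gradient extraction and distance steps, using Sobel operators via direct 2-D correlation; the value of C and the use of arctan2 are implementation choices, not values given in the lecture:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def correlate2d_valid(img, kernel):
    """Direct 'valid' 2-D correlation of an image with a small kernel."""
    windows = sliding_window_view(img, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def gradient_field(img):
    """Sobel gradient magnitude (normalised by its maximum) and orientation."""
    img = img.astype(np.float64)
    sx = correlate2d_valid(img, SOBEL_X)
    sy = correlate2d_valid(img, SOBEL_Y)
    g = np.hypot(sx, sy)
    g_max = g.max() if g.max() > 0 else 1.0
    return g / g_max, np.arctan2(sy, sx)  # arctan2 handles sx = 0 safely

def gradient_distance(ref, test, C=1e-3):
    """Per-pixel magnitude (dg) and orientation (da) similarity between A and B."""
    gA, aA = gradient_field(ref)
    gB, aB = gradient_field(test)
    # ratio of the weaker to the stronger gradient, stabilised by the constant C
    dg = np.where(gA > gB, (gB + C) / (gA + C), (gA + C) / (gB + C))
    da = np.abs(np.abs(aA - aB) - np.pi) / np.pi
    return dg, da
```

    Both maps equal 1 where the two gradient fields agree perfectly and fall towards 0 as structure is lost or reoriented.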

  • Perceptual Similarity
    The response of biological systems, including humans, is non-linear
    We quantify perceived information similarity with a non-linear mapping of gradient distance (magnitude and orientation):

      Q = Γ / (1 + e^{k(Δ − σ)})

    This yields local perceptual similarity scores

    [Plot: sigmoid response over 1 − Gradient Distance]
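    The mapping is a plain sigmoid; the constants Γ, k and σ below are illustrative values only (the lecture does not state them), chosen so that high structural similarity maps to a high Q:

```python
import numpy as np

# Illustrative constants only; the lecture does not give Γ, k or σ.
GAMMA, K, SIGMA = 1.0, -10.0, 0.5

def perceptual_similarity(delta, gamma=GAMMA, k=K, sigma=SIGMA):
    """Q = Γ / (1 + exp(k (Δ − σ))): sigmoid mapping of a similarity map Δ."""
    delta = np.asarray(delta, dtype=np.float64)
    return gamma / (1.0 + np.exp(k * (delta - sigma)))
```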

  • Aggregating Scores
    First combine similarity in gradient orientation and magnitude:

      Q_AB(n,m) = Q_g_AB(n,m) · Q_α_AB(n,m)

    Then sum across the entire scene:

      Q_AB = Σ_{n,m} w(n,m) Q_AB(n,m) / Σ_{n,m} w(n,m)

    Q_AB is a global quality score between:
      0 – complete perceived loss of information from the input
      1 – perfect representation (no perceptible quality degradation)
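    The two aggregation steps follow directly from the formulas above; the weight map w can be any non-negative perceptual importance map:

```python
import numpy as np

def aggregate_quality(q_g, q_alpha, weights):
    """Q_AB(n,m) = Q_g · Q_α, then pooled as Σ w·Q / Σ w across the scene."""
    q = np.asarray(q_g) * np.asarray(q_alpha)
    w = np.asarray(weights, dtype=np.float64)
    return float((w * q).sum() / w.sum())
```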

  • Metric Performance
    Again a good, monotonic relationship with subjective scores
    The relationship is far more linear compared to SSIM

  • Pooling
    Integrates local quality scores into a single global score over:
      Space: the field of view of the scene
      Time: all the frames in the video sequence
    The simplest pooling model is the mean of local scores
      It assumes all locations/frames are equally important
    Weighted summation gives us more freedom

    [Example quality map: Q = 0.487]

  • Perceptual Importance
    Not all areas of the scene are equally important to observers
    Perceptual importance can guide the pooling process to obtain more relevant quality scores

  • Visual Attention
    Perceptual importance is inherent to all of us
    Its manifestation in the HVS is attention
    We can use attention models to derive perceptual importance
    Attention is driven by a host of factors:
      Context, motion, visibility/contrast, structure, familiarity
    Most are difficult to model without higher cognition

  • Contrast Based Importance
    A good approximation of perceptual importance is local contrast
      Measured, say, through local gradient magnitude:

      w(n,m) = g(n,m) = √( s_x(n,m)² + s_y(n,m)² )

    Our attention is drawn to areas of high contrast

  • Complete Structural Similarity Evaluation

    [Diagram: Reference and Test Images → Structural Similarity × Perceptual Importance → Σ → Q = 0.56]

  • Quality Based Pooling
    An interesting observation from quality research:
      People devote more attention to areas of poor quality
      So the perceptual importance of poor quality regions is higher
    Local quality can be used to determine perceptual importance
    The same is true in video, where the worst frames determine quality

    [Plots: UoM and LIVE datasets]

  • Video Quality Evaluation
    Obviously harder than image quality, but closely related
    The simplest approach is to apply IQ metrics to each frame
      That may ignore some important temporal information
    Many IQ models can be adapted to work on dynamic information

  • Structural Similarity Video Quality
    Measure the representation of true scene information in the degraded video
      As a proxy for subjective impression
    1. Estimate local structural similarity across space and time
       Between all locations and frames of the Reference and Test videos
    2. Quantify perceived information loss
       At each location and time
    3. Pool scores spatially across the entire field of view
       Into frame quality scores
    4. Pool frame scores into a single video quality score

  • Gradient Preservation Video Quality

    [Diagram: Input and Output Sequences → Spatial, Temporal and Chromacity Information Extraction → Spatial, Temporal and Chromacity Information Loss Models, weighted by Spatio-Temporal Perceptual Importance Evaluation → Video Quality Performance Score]

  • Information Extraction: Colour and Space
    Transform the RGB video to a more suitable space, say HSV
    Use gradient operators to extract spatial structure from the intensity (value) channel
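    A minimal sketch of the colour-space step: the HSV value and saturation channels from an RGB frame (in practice a library conversion, e.g. OpenCV's, would be used):

```python
import numpy as np

def hsv_value(rgb):
    """HSV 'value' (intensity) channel: V = max(R, G, B), RGB floats in [0, 1]."""
    return np.asarray(rgb, dtype=np.float64).max(axis=-1)

def hsv_saturation(rgb):
    """HSV saturation: S = (V − min(R, G, B)) / V, with S = 0 where V = 0."""
    rgb = np.asarray(rgb, dtype=np.float64)
    v = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    return np.where(v > 0, (v - mn) / np.where(v > 0, v, 1.0), 0.0)
```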

  • Information Extraction: Time
    Use a temporal operator analogous to Sobel
      Evaluated over 3 subsequent frames, at all locations
    The broader time base adds robustness to noise

    [Figure: temporal gradient g_t vs. inter-frame difference]
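    One plausible 3-frame temporal operator is a central difference (the lecture's exact operator is not given); compared with a plain 2-frame difference, the wider support averages out frame noise:

```python
import numpy as np

def temporal_gradient(frames):
    """3-frame central difference: g_t[k] ≈ (frame[k+1] − frame[k−1]) / 2."""
    f = np.asarray(frames, dtype=np.float64)
    return (f[2:] - f[:-2]) / 2.0  # one gradient frame per interior frame

def frame_difference(frames):
    """Plain 2-frame inter-frame difference, for comparison."""
    f = np.asarray(frames, dtype=np.float64)
    return f[1:] - f[:-1]
```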

  • Information Loss Models
    Analogous to the IQ metric, at each location compare between the videos:
      Spatial gradient magnitude and orientation
      Temporal gradient magnitude
      Colour vectors (2D), δC_{n,m,t}
    Process with perceptual non-linearities

    [Plots: spatial and temporal non-linear responses]

  • Multi-dimensional Local Quality Maps
    Once again local quality maps are produced
    This time in three different dimensions: space, time and colour

    [Figure: spatial and temporal structure similarity maps, Reference vs. compressed Test]

  • Combining the Quality Dimensions
    Q_s, Q_t and Q_c define the preservation of spatial, temporal and chromatic information at each location
    They can be combined linearly using a weighted summation:

      VQ_AB = k_s Q_s_AB + k_t Q_t_AB + k_c Q_c_AB

    But which weights to use?
      Determine them by optimisation against subjectively annotated videos
      Optimal [k_s, k_t, k_c] = [0.8, 0.15, 0.05]
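    Given the three per-location quality maps, the linear combination is direct, using the optimised weights [0.8, 0.15, 0.05] stated on the slide:

```python
import numpy as np

K_S, K_T, K_C = 0.8, 0.15, 0.05  # optimised weights from the slide

def combined_quality(q_s, q_t, q_c, ks=K_S, kt=K_T, kc=K_C):
    """VQ_AB = k_s Q_s + k_t Q_t + k_c Q_c at each location."""
    return ks * np.asarray(q_s) + kt * np.asarray(q_t) + kc * np.asarray(q_c)
```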

  • Pooling into Frame Scores
    Pool local quality estimates for each pixel in each frame
    Define perceptual importance at each pixel
      Now we have not only spatial but temporal contrast too:

      w(n,m,T) = g(n,m,T) + g_t(n,m,T)

    And use it to weight local information preservation estimates into a frame quality estimate:

      Q_T_AB = Σ_{n,m} w(n,m,T) Q_AB(n,m,T) / Σ_{n,m} w(n,m,T)
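    Frame pooling then mirrors the image case, with w = g + g_t built from the (non-negative) spatial and temporal gradient magnitudes:

```python
import numpy as np

def frame_quality(q_map, g_spatial, g_temporal):
    """Pool one frame: w(n,m,T) = g + g_t, then Σ w·Q / Σ w over the frame."""
    w = np.asarray(g_spatial, dtype=np.float64) + np.asarray(g_temporal, dtype=np.float64)
    return float((w * np.asarray(q_map)).sum() / w.sum())
```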

  • Video Sequence Quality
    We pool frame quality scores into a global quality score
    We use the p% worst frames in the sequence to determine the global objective quality estimate
      A compromise estimate for p is 20%

      VQ_AB = ( Σ_{T ∈ worst p% frames} VQ_T_AB ) / (p N)
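    Worst-p% pooling over the frame scores can be sketched as follows, with p = 0.2 as suggested above:

```python
import numpy as np

def sequence_quality(frame_scores, p=0.2):
    """Average the lowest p fraction of frame quality scores."""
    scores = np.sort(np.asarray(frame_scores, dtype=np.float64))
    k = max(1, int(round(p * len(scores))))
    return float(scores[:k].mean())  # lowest scores = worst frames
```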

  • Video Quality over Time
    Video quality is a function of time
      i.e. it can change during a sequence

  • Localised Quality Scores
    Local spatial, temporal and colour quality estimates provide further insight into video quality

    [Figure: spatial, temporal and colour quality maps]

  • How accurate is it?
    Extremely high compression, low resolution data
    5 scenarios, 4 resolutions, 4 codecs

    [Scatter plot: objective score vs. subjective score]

  • No Reference Quality Evaluation
    A much tougher proposition
      Where to begin? What to measure?
    Again, done naturally by people
    A growing number of methods is available that essentially mimic how we do it:
      Natural statistics methods
      Machine learning methods

  • Natural Statistics Metrics
    We don’t have a reference to compare to, but we can learn what natural, good quality images look like
    Quality degradations change certain image properties
    Learn the statistics of these properties from pristine and distorted images
      Natural scene statistics
    MSCN – mean subtracted contrast normalised coefficients [Saad et al. ’12]

  • Natural Statistics Approaches
    Several approaches are available, using mainly local structural properties throughout the scene:
      DCT coefficient statistics
      Mean subtracted contrast normalised coefficients
      Discrete wavelet transform coefficients
      Gabor filter responses
    Measure the distance between the natural and test image distributions
    [Saad et al. ’12, Mittal et al. ’12, Su et al. ’13 …]
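    The MSCN coefficients mentioned above can be sketched as follows; this uses a uniform window for the local mean and standard deviation (published metrics such as BRISQUE use a Gaussian window), with C = 1 as the usual stabilising constant for 8-bit images:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mscn(img, win=7, C=1.0):
    """Mean subtracted contrast normalised coefficients: (I − μ) / (σ + C)."""
    img = np.asarray(img, dtype=np.float64)
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    windows = sliding_window_view(padded, (win, win))
    mu = windows.mean(axis=(-2, -1))     # local mean μ(n,m)
    sigma = windows.std(axis=(-2, -1))   # local standard deviation σ(n,m)
    return (img - mu) / (sigma + C)
```

    For natural images the MSCN coefficients are approximately Gaussian-distributed; distortions change the shape of that distribution, which is what these metrics measure.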
