quality of experience in high definition television: subjective assessments and objective metrics


Upload: stephane-pechard

Post on 17-May-2015


DESCRIPTION

My Ph.D. defense presentation, on October 2nd, 2008.

TRANSCRIPT

1

Quality of experience in High Definition Television:
subjective assessments and objective metrics

Stéphane Péchard
October 2nd, 2008

IVC

2

Motivations

technical

psychological

3

New technologies

capture, compression, transmission, restitution: all NEW

5x SDTV (pixels)

=> new distortions

4

Controlling quality

subjective (MOS)

objective (MOSp)

5

Outline

Subjective quality assessment

1. global quality assessment

2. comparing qualities of 2 TV services

3. towards a fine quality measurement

Objective quality metrics

1. H.264-specific metric (using prior knowledge)

2. generic metric based on spatio-temporal tubes

7

What is video quality subjective assessment?

getting a mean human quality evaluation

observers, environment, methodology
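A "mean human quality evaluation" is usually reported as a Mean Opinion Score (MOS) with a confidence interval. The sketch below is only an illustration: the votes are made up, and the normal approximation (z = 1.96 for roughly 95%) is an assumption, not necessarily the procedure used in this work.

```python
import math
import statistics

def mean_opinion_score(scores, z=1.96):
    """Return (MOS, half-width of the confidence interval) for one sequence."""
    n = len(scores)
    mos = statistics.mean(scores)
    # Sample standard deviation; z = 1.96 assumes a normal approximation (~95%).
    half_width = z * statistics.stdev(scores) / math.sqrt(n)
    return mos, half_width

# Hypothetical votes from 8 observers on a 0-100 scale (not data from the thesis).
votes = [62, 70, 55, 68, 74, 60, 66, 65]
mos, ci = mean_opinion_score(votes)
print(f"MOS = {mos:.1f} +/- {ci:.1f}")
```

With few observers, a Student t quantile would be a more cautious choice than the fixed z value.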

8

Subjective quality assessment

preference between HDTV and SDTV?

how is quality globally perceived?

can we better understand quality judgment?

10

Suitable methodology

HDTV: high quality in a short range
=> quality measure should be precise and discriminative

+ important part of visual field excited
=> how to consider this in a methodology?

11

Absolute Category Rating (ACR), Video Quality Experts Group

- random order

- only one viewing

- category scale

- no explicit reference

Subjective Assessment Methodology for Video Quality (SAMVIQ), European Broadcasting Union

- user-driven order

- multiple viewing (natural?)

- continuous scale

- explicit reference

12

State of the art

[Brotherton, 2006]: correlation between the two MOS (Mean Opinion Score) populations on CIF (352x288):
CC(MOS_ACR, MOS_SAMVIQ) = 0.94

to confirm: more tests

QVGA (320x240), VGA (640x480), HDTV (1920x1080)

13

Results

format   visual field   correlation   RMSDiff
QVGA     13°            0.969         6.73
VGA      19°            0.942         9.31
HDTV     33°            0.899         14.06

ACR and SAMVIQ are equivalent up to a certain resolution

14

Accuracy vs. number of observers

[Figure: confidence interval as a function of the number of observers (5 to 30), curves for SAMVIQ and ACR; the value 24 is marked on the plot]
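The link between accuracy and panel size can be illustrated with the usual confidence-interval formula, where the half-width shrinks as 1/sqrt(N). This is a minimal sketch under a normal approximation; the standard deviation is a hypothetical value, and the function names are chosen for this example only.

```python
import math

def ci_half_width(std_dev, n_observers, z=1.96):
    """Half-width of the ~95% confidence interval for a MOS from n observers."""
    return z * std_dev / math.sqrt(n_observers)

def observers_needed(std_dev, target_half_width, z=1.96):
    """Smallest number of observers whose interval is at most the target."""
    return math.ceil((z * std_dev / target_half_width) ** 2)

std = 15.0  # hypothetical inter-observer standard deviation (0-100 scale)
for n in (5, 10, 15, 20, 25, 30):
    print(n, round(ci_half_width(std, n), 2))
```

A methodology with a larger score dispersion needs more observers to reach the same interval, which is the kind of comparison the plot makes between ACR and SAMVIQ.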

16

Comparing two videos with different resolutions

viewing distance D = 3H for HDTV (H: picture height)

problem: observers can't move!
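The effect of viewing distance on the excited visual field can be checked with basic trigonometry. The sketch below assumes a flat picture viewed on its central axis and expresses the distance in picture heights; the function name is illustrative.

```python
import math

def horizontal_visual_angle(aspect_ratio, distance_in_heights):
    """Horizontal visual angle in degrees for a picture viewed on-axis.

    aspect_ratio: width / height of the picture.
    distance_in_heights: viewing distance as a multiple of the picture height.
    """
    width = aspect_ratio  # picture width, in units of picture height
    return math.degrees(2 * math.atan(width / (2 * distance_in_heights)))

hd = horizontal_visual_angle(16 / 9, 3)  # HDTV at D = 3H
sd = horizontal_visual_angle(4 / 3, 6)   # a 4:3 picture at D = 6h
print(round(hd, 1), round(sd, 1))
```

At D = 3H a 16:9 picture covers about 33 degrees horizontally, and a 4:3 picture at six times its height covers about 13 degrees.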

17

Technical solution

How? no specific protocol exists

comparison: HD vs. QHD (~SDTV), with QHD displayed in HD

18

Motivation: same screen for both formats

D = 3H = 6h (H: HD picture height, h: QHD picture height)

SDTV: 720x576

QHD: 960x540

19

Quality and preference tests

preference scale:

+3  I prefer A much more than B
+2  I prefer A more than B
+1  I prefer A a little more than B
 0  I have no preference
-1  I prefer A a little less than B
-2  I prefer A less than B
-3  I prefer A much less than B

A: quality tests with SAMVIQ of SDTV qualities (good and mid-range)

B: quality tests with SAMVIQ of HDTV qualities

preference tests: A vs. B

20

Results

[Figure: preference as a function of ΔQuality, with the isopreference points (preference = 0)]

ΔQuality = MOS_HD - MOS_SD

HD/SD Q_good: Q_HD may be less than Q_SD, benefit of the size

HD/SD Q_mid-range: Q_HD must be higher than Q_SD, size becomes an enemy

22

Classical approach

...

a global distortion on an entire sequence

23

Farias approach (2004): distortion-based partition

- blockiness

- blur

- ...

from disturbance functions to global distorting system

Drawbacks:

- content dependency

- coding system dependency

- distortion list exhaustivity

- pooling function?

- complex subjective assessment

Proposed approach: content-based partition

- homogeneous areas

- fine textured areas

- strong textured areas

from spatio-temporal category qualities to global quality?

24

[Diagram: generation of partly-distorted sequences]

- source

- spatio-temporal classification: spatio-temporal segmentation (tube creation along local motion), then tube classification

- categories masks sequence (C1 to C5)

- H.264 coding

- class-distorted sequences generation

- partly-distorted sequences usable for subjective tests

25

Local to global?

MOS(Ci): partly-distorted sequence qualities

related to global MOS_G: f(MOS(Ci)) = MOS_G ?

several relations tested: up to CC(f(MOS(Ci)), MOS_G) = 0.95

YES! It's possible to relate spatio-temporal category qualities to global quality

26

Farias approach (2004): distortion-based partition

- blockiness

- blur

- ...

from disturbance functions to global distorting system

Drawbacks:

- content dependency

- coding system dependency

- distortion list exhaustivity

- pooling function?

- complex subjective assessment

Proposed approach: content-based partition

- homogeneous areas

- fine textured areas

- strong textured areas

Advantages:

- generic methodology

- simple pooling function

- real distortions

- classical subjective assessment

28

What are objective quality metrics?

[Diagram: reference -> system -> distorted sequence; reduced reference extraction from the reference]

FR (full reference) metric, RR (reduced reference) metric, NR (no reference) metric -> objective scores

performance evaluation against MOS from subjective assessments

criteria: CC, RMSE, OR, difference significance
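The three numeric criteria can be sketched as follows. CC is the Pearson linear correlation and RMSE the root mean square error between MOS and predicted MOS (MOSp). For the outlier ratio (OR), this sketch assumes a common VQEG-style definition: a prediction is an outlier when its error exceeds the confidence-interval half-width of the corresponding MOS. That definition and the sample values are assumptions, not taken from the slides.

```python
import math

def pearson_cc(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmse(mos, mosp):
    """Root mean square error between subjective and predicted scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mos, mosp)) / len(mos))

def outlier_ratio(mos, mosp, ci_half_widths):
    """Fraction of predictions whose error exceeds the MOS confidence interval."""
    outliers = sum(1 for a, b, ci in zip(mos, mosp, ci_half_widths)
                   if abs(a - b) > ci)
    return outliers / len(mos)

# Hypothetical scores for four sequences (0-100 scale).
mos = [80.0, 65.0, 40.0, 20.0]
mosp = [75.0, 70.0, 42.0, 35.0]
ci = [6.0, 6.0, 6.0, 6.0]
print(pearson_cc(mos, mosp), rmse(mos, mosp), outlier_ratio(mos, mosp, ci))
```

A good metric combines a CC close to 1 with a low RMSE and a low OR.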

29

Usual approaches

signal approach: PSNR

perceptual approach:

- low level HVS models

- structural models: VSSIM [2004]

- high level distortions measurement models: VQM [2002]

30

Performances on HDTV

metric   CC      RMSE    OR
VSSIM    0.790   11.27   0.55
VQM      0.898   8.09    0.40
PSNR     0.543   15.43   0.61

168 sequences

32

33

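[Diagram: H.264-specific metric]

reference sequence -> ST (spatio-temporal) content analysis -> global motion M, proportions P_i

global motion M, proportions P_i -> model parameters prediction -> offset, slope

distorted sequence -> bitrate B

offset, slope, bitrate B -> quality model -> quality score Q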

34

ST content analysis (on the reference sequence)

use of the spatio-temporal segmentation

class proportions P_i (values shown on the slide: 60%, 10%, 5%, 20%)

mean sequence motion M

output: global motion M, proportions P_i
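The two kinds of content parameters, class proportions P_i and mean motion M, can be sketched from a list of classified tubes. This illustration assumes that each tube carries one class label and one motion vector and that every tube has the same weight; the slides do not say how tubes are weighted, and the function names are hypothetical.

```python
from collections import Counter

def class_proportions(tube_classes, classes=("C1", "C2", "C3", "C4", "C5")):
    """Proportion P_i of tubes falling in each class Ci (equal tube weights)."""
    counts = Counter(tube_classes)
    total = len(tube_classes)
    return {c: counts.get(c, 0) / total for c in classes}

def mean_motion(tube_motions):
    """Mean motion magnitude M; each motion is a (dx, dy) vector in pixels."""
    return sum((dx * dx + dy * dy) ** 0.5 for dx, dy in tube_motions) / len(tube_motions)

# Hypothetical labels for 20 tubes.
labels = ["C1"] * 12 + ["C2"] * 2 + ["C3"] * 1 + ["C4"] * 4 + ["C5"] * 1
props = class_proportions(labels)
print(props, mean_motion([(3, 4), (0, 0)]))
```

Five proportions plus one motion value give six numbers per sequence, which is consistent with a reduced-reference metric carrying six parameters.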

35

model parameters prediction (from global motion M and proportions P_i)

offset parameter: temporal complexity estimation, related to motion M

slope parameter: spatial complexity estimation, related to class proportions P_i

offset, slope, bitrate B -> quality model -> quality score Q

36

Performances

metric     CC      RMSE    OR
VSSIM      0.791   11.90   0.45
VQM        0.892   8.79    0.40
proposed   0.901   8.47    0.36

pros: faster than VQM; reduced reference metric (6 parameters)

cons: H.264-dependent

equal performances

38

Interesting HVS features for this metric

Visual inspection (gaze fixation):

- spatially localized

- duration (200-300 ms)

- smooth local motion tracking

some of them have been used in part 1

39

[Diagram: generic metric based on spatio-temporal tubes]

reference sequence, distorted sequence -> spatio-temporal segmentation -> tubes

tubes -> features extraction (reference and distorted) -> features difference

features difference -> short-term spatio-temporal pooling -> long-term temporal pooling -> quality score Q

40

spatio-temporal segmentation -> tubes

[Figure: a tube t]

41

features extraction

spatial information feature: fSI

temporal information feature: fTI

features difference: reference tube - distorted tube
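The slides name the two features fSI and fTI without giving their definitions. As an illustration only, the sketch below uses the classical spatial and temporal information measures in the style of ITU-T P.910 (standard deviation of the Sobel gradient magnitude, and of the frame difference). Whether the thesis features match these definitions is an assumption.

```python
import statistics

def sobel_magnitude(frame):
    """Sobel gradient magnitudes of a 2-D list of luminance values (borders skipped)."""
    h, w = len(frame), len(frame[0])
    mags = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y-1][x+1] + 2 * frame[y][x+1] + frame[y+1][x+1]
                  - frame[y-1][x-1] - 2 * frame[y][x-1] - frame[y+1][x-1])
            gy = (frame[y+1][x-1] + 2 * frame[y+1][x] + frame[y+1][x+1]
                  - frame[y-1][x-1] - 2 * frame[y-1][x] - frame[y-1][x+1])
            mags.append((gx * gx + gy * gy) ** 0.5)
    return mags

def spatial_information(frame):
    """SI-like feature: standard deviation of the gradient magnitude."""
    return statistics.pstdev(sobel_magnitude(frame))

def temporal_information(prev_frame, frame):
    """TI-like feature: standard deviation of the pixel-wise frame difference."""
    diffs = [b - a for row_a, row_b in zip(prev_frame, frame)
             for a, b in zip(row_a, row_b)]
    return statistics.pstdev(diffs)

# Tiny synthetic 6x6 blocks standing in for one frame of a tube.
flat = [[128] * 6 for _ in range(6)]
edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
si_difference = spatial_information(edge) - spatial_information(flat)
print(si_difference, temporal_information(flat, edge))
```

The features difference step would then subtract the value measured on the distorted tube from the value measured on the matching reference tube, as `si_difference` does for two toy blocks.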

42

short-term spatio-temporal pooling

5 frames = 1 time-slot (200 ms)
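Five frames lasting 200 ms correspond to a frame rate of 25 frames per second. The grouping of per-frame values into time-slots can be sketched as below; averaging inside a slot and dropping an incomplete trailing slot are assumptions made for this example.

```python
def short_term_pool(per_frame_values, frames_per_slot=5):
    """Average per-frame values over consecutive, non-overlapping time-slots.

    A trailing group with fewer than frames_per_slot frames is dropped.
    """
    slots = []
    last_start = len(per_frame_values) - frames_per_slot
    for start in range(0, last_start + 1, frames_per_slot):
        chunk = per_frame_values[start:start + frames_per_slot]
        slots.append(sum(chunk) / frames_per_slot)
    return slots

slot_duration_ms = 5 * 1000 / 25  # 5 frames at 25 frames per second
print(slot_duration_ms, short_term_pool(list(range(10))))
```

Each resulting value covers roughly one gaze fixation, which is the motivation given for the time-slot length.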

43

long-term temporal pooling

high level HVS properties:

- asymmetrical temporal filtering

- mid-term non-linear quality judgment

- long-term temporal filtering
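Asymmetrical temporal filtering is commonly understood as follows: observers react quickly to a quality drop and slowly to a recovery. The sketch below is one simple way to model that behaviour, a first-order recursive filter with two coefficients. The filter form and both coefficient values are assumptions for illustration, not the filter used in the thesis.

```python
def asymmetric_filter(slot_scores, alpha_down=0.8, alpha_up=0.2):
    """Recursive filter that follows quality drops faster than recoveries.

    alpha_down and alpha_up are hypothetical update rates in [0, 1].
    """
    filtered = [slot_scores[0]]
    for score in slot_scores[1:]:
        prev = filtered[-1]
        # Use the fast rate when quality falls, the slow rate when it rises.
        alpha = alpha_down if score < prev else alpha_up
        filtered.append(prev + alpha * (score - prev))
    return filtered

# Hypothetical per-slot quality values: a short drop followed by recovery.
out = asymmetric_filter([80, 20, 80])
print(out)
```

After the drop the filtered value falls to 32, then recovers only to 41.6 on the next slot, so a brief impairment keeps weighing on the score.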

44

Training and testing

168 sequences, split into training and testing sets

45

Best performances

metric                  CC      RMSE    OR
VSSIM                   0.837   10.15   0.38
VQM                     0.875   8.98    0.43
fixed tubes             0.875   9.08    0.38
motion-oriented tubes   0.898   8.30    0.31

generic metric

slightly better than VQM with fewer features

46

General conclusion

47

Subjective quality assessment

better knowledge of HDTV (visual) subjective quality assessment

generic methodology to assess fine quality => better knowledge of judgment construction

visual image size influences preference between SDTV/HDTV services

48

Experiment effort

26 sessions (6 months) (SAMVIQ, ACR and preference)

200 observers for 600 unique sessions in 300 hours of subjective evaluation

=> 25,000 subjective scores

more than 750 cumulative days of H.264 coding

49

Objective quality metrics

fast RR metric dedicated to H.264 systems evaluation

generic metric based on motion-oriented spatio-temporal tubes

both performed slightly better than VQM

50

Future work

towards a multimodal quality evaluation

considering a display model
=> work in progress (Tourancheau)

adapting ACR to HDTV: more than 5 items?
=> work in progress (VQEG)

51

Q&A

52

HDTV sequence database

[Figure: sequence grid; labels on the slide: 7, ref, 24]

53

[Figure: ACR scale (5 excellent, 4 good, 3 fair, 2 poor, 1 bad) next to the SAMVIQ scale (0 to 100; excellent, good, fair, poor, bad); annotation: 80%]

54

CC = 0.899, RMSE = 14.06

55

[Figure: mean preference as a function of ΔMOS = MOS_HD - MOS_SD; regions: HDTV preferred, SDTV preferred; MOS_HD < MOS_SD, MOS_HD > MOS_SD]

ΔMOS_0 = -8

ΔMOS_0 = -18

large screen effect, distortions effect

56

Classes

five spatial activity levels (classes C1 to C5):

- smooth areas (low and high luminance)

- textured areas (fine and strong textures)

- edges

57

Tube classification

4 spatial gradients per tube

plot in spatial space P

[Figure: space P with axes ΔH and ΔV, partitioned into classes C1 to C5 (C5 in P')]

frontiers defined to get relevant classification

58

DMOS and ΔMOS

global loss: DMOS(Sj,Bk) = MOSref - MOS(Sj,Bk)

local losses: ΔMOS(C1), ΔMOS(C2), ΔMOS(C3), ΔMOS(C4), ΔMOS(C5), one per class, from MOS1 to MOS5

[Figure: MOSref, MOS(Sj,Bk) and MOS1 to MOS5 on a common quality axis]
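The global and local losses can be computed as in the short sketch below. The DMOS formula is the one given on the slide; reading each local loss as ΔMOS(Ci) = MOSref - MOSi is an assumption based on the figure labels, and all score values are hypothetical.

```python
def dmos(mos_ref, mos_distorted):
    """Global loss: DMOS(Sj, Bk) = MOSref - MOS(Sj, Bk)."""
    return mos_ref - mos_distorted

def local_losses(mos_ref, class_mos):
    """Local losses, assuming dMOS(Ci) = MOSref - MOSi for each class Ci."""
    return {name: mos_ref - mos for name, mos in class_mos.items()}

# Hypothetical scores on a 0-100 scale.
mos_ref = 85.0
global_loss = dmos(mos_ref, 50.0)
losses = local_losses(
    mos_ref, {"C1": 70.0, "C2": 80.0, "C3": 75.0, "C4": 78.0, "C5": 82.0}
)
worst_class = max(losses, key=losses.get)
print(global_loss, losses, worst_class)
```

Comparing the local losses shows which content class contributes most to the global loss for a given sequence and bitrate.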