Non-intrusive Speech Quality Assessment Algorithm Based on Spectro-temporal Analysis School of Computer Engineering LI Qiaohong (Supervisor: Prof Weisi Lin) (Co - Supervisor: Prof Daniel THALMANN) July. 22, 2014

Page 1

Non-intrusive Speech Quality Assessment Algorithm Based on Spectro-temporal Analysis

School of Computer Engineering

LI Qiaohong

(Supervisor: Prof Weisi Lin)

(Co-Supervisor: Prof Daniel THALMANN)

July 22, 2014

Page 2

Outline

• Motivation

• Review

• Method

• Experimental results

• Conclusion and future work

Page 3

Motivation

Speech Signals

1. Acquisition (noise)

2. Synthesis (e.g. text-to-speech)

3. Reproduction (e.g. imperfect reconstruction)

4. Security (e.g. watermarking)

5. Postprocessing (e.g. enhancement)

6. Transmission (e.g. VoIP, IPTV)

Page 4

Motivation

• Applications of speech quality assessment methods:

• speech acquisition, enhancement, watermarking, compression, transmission, reconstruction, authentication, speech synthesis …

• Two broad approaches:

• subjective vs. objective methods

• Subjective assessment suffers from drawbacks:

• time-consuming, laborious and expensive; requires many human subjects and repeated listening sessions

• not feasible for on-line signal manipulations (such as encoding, transmission, relaying, etc.)

• depends on listeners’ physical conditions, emotional states, personal experience, etc.

Page 5

Objective Speech Quality Assessment

• Intrusive SQA methods (a.k.a. double-ended, full-reference methods)

Parametric-based methods: use the parameters of the compression and transmission protocols to estimate the final quality score.

Signal-based methods: calculate the perceptually weighted distance between the reference and degraded speech signals, e.g. SNR, LLR, BSD, PSQM, PESQ, POLQA.

• Non-intrusive SQA methods (a.k.a. single-ended, no-reference methods)
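As a rough illustration of an intrusive, signal-based measure, the sketch below computes a frame-averaged (segmental) SNR between a reference and a degraded signal. SNR is the simplest measure in the list above and applies no perceptual weighting. The function name, frame length, and test signals are assumptions made for this example and do not come from the slides.

```python
import numpy as np

def segmental_snr(reference, degraded, frame_len=256, eps=1e-10):
    """Mean per-frame SNR in dB between a clean reference and a degraded signal."""
    reference = np.asarray(reference, dtype=float)
    degraded = np.asarray(degraded, dtype=float)
    n_frames = min(len(reference), len(degraded)) // frame_len
    snrs = []
    for k in range(n_frames):
        ref = reference[k * frame_len:(k + 1) * frame_len]
        deg = degraded[k * frame_len:(k + 1) * frame_len]
        signal_energy = np.sum(ref ** 2)
        noise_energy = np.sum((ref - deg) ** 2)  # needs the reference: intrusive
        snrs.append(10.0 * np.log10((signal_energy + eps) / (noise_energy + eps)))
    return float(np.mean(snrs))

# Synthetic stand-ins for speech: a tone with two levels of additive noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
mild = clean + 0.01 * rng.standard_normal(clean.size)
heavy = clean + 0.1 * rng.standard_normal(clean.size)

snr_mild = segmental_snr(clean, mild)
snr_heavy = segmental_snr(clean, heavy)
```

The measure cannot be computed without `reference`, which is exactly the dependency a non-intrusive method has to avoid.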

Page 6

Engineering Approach Framework

Inputs: distorted signal; reference signal (for intrusive methods)

Stage I: Feature extraction (exploits signal processing techniques)

Stage II: Feature pooling (cognitive mapping) (based on machine learning)

Output: quality score

Page 7

Gabor Feature Extraction for Speech Quality Assessment

Gabor feature extraction pipeline
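The pipeline itself survives here only as a figure caption, so the following is a generic sketch of spectro-temporal Gabor filtering, not the author's exact pipeline. It assumes a log-magnitude spectrogram as input, a small bank of complex 2-D Gabor filters, and mean/standard-deviation pooling of the filter responses. All function names, filter sizes, and modulation frequencies are hypothetical choices for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT, shape (frequency bins, time frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)).T + 1e-8)

def gabor_kernel(omega_f, omega_t, size=15):
    """Complex 2-D Gabor filter: a plane wave under a Gaussian envelope."""
    half = size // 2
    f, t = np.meshgrid(np.arange(-half, half + 1),
                       np.arange(-half, half + 1), indexing="ij")
    sigma = size / 6.0
    envelope = np.exp(-(f ** 2 + t ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * (omega_f * f + omega_t * t))
    return envelope * carrier

def filter_2d(image, kernel):
    """Circular 2-D convolution via the FFT (output has the input's shape)."""
    spectrum = np.fft.fft2(image) * np.fft.fft2(kernel, s=image.shape)
    return np.fft.ifft2(spectrum)

def gabor_features(signal):
    """Pool the magnitude of each Gabor response into (mean, std) pairs."""
    spec = log_spectrogram(signal)
    spec = spec - spec.mean()
    feats = []
    for omega_f in (0.0, 0.5, 1.0):        # spectral modulation (rad/bin), assumed
        for omega_t in (0.25, 0.5, 1.0):   # temporal modulation (rad/frame), assumed
            response = np.abs(filter_2d(spec, gabor_kernel(omega_f, omega_t)))
            feats.extend([response.mean(), response.std()])
    return np.array(feats)

# Synthetic stand-in for a speech recording.
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000.0)
signal = signal + 0.05 * rng.standard_normal(8000)
features = gabor_features(signal)
```

Each filter responds to one combination of spectral and temporal modulation, which is what makes the resulting feature vector "spectro-temporal". Only the distorted signal is needed, so the features suit a non-intrusive setting.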

Page 8

Gabor Feature Extraction for Speech Quality Assessment

The effectiveness of extracted Gabor features

Page 9

SVR for feature mapping

We adopt support vector regression (SVR) to learn the mapping from the extracted Gabor features to objective speech quality.

We use the Radial Basis Function (RBF) kernel, $K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\rho \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2)$, in this work. The parameters $\{C, \rho, \epsilon\}$ are selected through cross-validation.
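A minimal sketch of this step using scikit-learn, assuming its `SVR` estimator as the implementation (the slides do not name a toolkit). In scikit-learn the RBF kernel is parameterised by `gamma`, which plays the role of ρ above. The feature matrix, the scores, and the parameter grid are synthetic placeholders, not the author's data or settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR

# Placeholder data: 200 feature vectors and made-up quality scores.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 18))
y = 3.0 + np.tanh(X[:, 0]) + 0.1 * rng.standard_normal(200)

# Hold out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Cross-validated search over C, gamma (rho in the kernel) and epsilon.
param_grid = {
    "C": [1.0, 10.0],
    "gamma": [0.01, 0.1],
    "epsilon": [0.05, 0.1],
}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5)
search.fit(X_train, y_train)

predicted = search.predict(X_test)
best = search.best_params_
```

Running the search only on the training portion keeps the held-out 20% untouched by parameter selection.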

80% training data vs. 20% test data

• Overall Test: split dataset according to different contents

• Test 1: split dataset according to different noise levels

• Test 2: split dataset according to different noise types

• Test 3: split dataset according to different noise enhancement algorithms
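One way to read these splits is that every sample sharing a content, noise level, noise type, or algorithm falls entirely on one side of the train/test boundary. The sketch below implements that reading; it is an assumption, since the slides do not spell out the protocol. The helper name `split_by_group` and the example labels are hypothetical.

```python
import numpy as np

def split_by_group(groups, test_fraction=0.2, seed=0):
    """Return boolean train/test masks with no group shared between them."""
    groups = np.asarray(groups)
    unique = np.unique(groups)
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * len(unique))))
    test_groups = rng.choice(unique, size=n_test, replace=False)
    test_mask = np.isin(groups, test_groups)
    return ~test_mask, test_mask

# Hypothetical labels: 30 sentences, each degraded under 4 conditions.
sentence_id = np.repeat(np.arange(30), 4)
train_mask, test_mask = split_by_group(sentence_id)
```

Passing noise-level, noise-type, or algorithm labels as `groups` gives the other splits in the same way.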

Page 10

Experimental results

Scatter plot of the predictions of the proposed metric versus the subjective scores on the NOIZEUS database
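Agreement between predicted and subjective scores in a scatter plot like this is commonly summarised with the Pearson correlation coefficient and the root-mean-square error. The slides do not state which criteria were used, so both are assumptions here, and the score arrays are made-up numbers for illustration only.

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation coefficient between two score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def rmse(x, y):
    """Root-mean-square error between two score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

# Made-up scores, not results from the NOIZEUS experiments.
subjective = np.array([1.5, 2.0, 2.8, 3.1, 3.9, 4.4])
predicted = np.array([1.7, 2.1, 2.5, 3.3, 3.8, 4.1])

correlation = pearson(subjective, predicted)
error = rmse(subjective, predicted)
```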

Page 11

Experimental results

Comparison with the state of the art

[7] T. H. Falk and W.-Y. Chan, “Single-ended speech quality measurement using machine learning methods,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 6, pp. 1935–1947, 2006.

[10] M. Narwaria, W. Lin, I. V. McLoughlin, S. Emmanuel, and L.-T. Chia, “Nonintrusive quality assessment of noise suppressed speech with mel-filtered energies and support vector regression,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1217–1232, 2012.

Page 12

Future Work

No-reference visual quality assessment

Joint audiovisual quality assessment:

humans perceive ‘overall’ multimedia quality rather than assessing audio and video separately

Possible approaches include one-stage and two-stage fusion (OSF/TSF)

OSF: both audio and video features are pooled in one stage

TSF: audio features and video features are first pooled separately, then the two scores are fused into an overall score
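The difference between the two fusion schemes can be sketched with a plain least-squares linear map standing in for the pooling stage. This is a swapped-in simplification, since any regressor (such as SVR) could take its place. The features and quality scores are synthetic, and for brevity the per-modality mappings in the two-stage variant are fitted to the overall score, where a real system would more likely use separate audio and video quality labels.

```python
import numpy as np

def fit_linear(features, targets):
    """Least-squares linear map with a bias term (stand-in for any regressor)."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def apply_linear(weights, features):
    """Apply a map returned by fit_linear to a feature matrix."""
    return np.column_stack([features, np.ones(len(features))]) @ weights

# Made-up features and overall quality scores for 100 clips.
rng = np.random.default_rng(0)
audio = rng.standard_normal((100, 4))
video = rng.standard_normal((100, 6))
overall = audio[:, 0] + 0.5 * video[:, 0] + 0.05 * rng.standard_normal(100)

# One-stage fusion: pool all features together in a single mapping.
joint = np.hstack([audio, video])
osf_scores = apply_linear(fit_linear(joint, overall), joint)

# Two-stage fusion: one score per modality first, then fuse the two scores.
audio_score = apply_linear(fit_linear(audio, overall), audio)
video_score = apply_linear(fit_linear(video, overall), video)
pair = np.column_stack([audio_score, video_score])
tsf_scores = apply_linear(fit_linear(pair, overall), pair)
```

OSF lets the mapping exploit interactions across modalities, while TSF keeps the audio and video models independent and reusable.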

Page 13

Thank you!

Questions?
