Communications & Multimedia Signal Processing Refinement in FTLP-HNM system for Speech Enhancement Qin Yan Communication & Multimedia Signal Processing Group School of Engineering and Design, Brunel University 23 Nov, 2005


Refinement in FTLP-HNM system for Speech Enhancement

Qin Yan

Communication & Multimedia Signal Processing Group

School of Engineering and Design, Brunel University

23 Nov, 2005

Outline

• Review of the FTLP-HNM system

• Parameter estimation for HNM (incl. pitch/harmonic tracking in noise)

• Objective results for pitch tracking, harmonic tracking and the FTLP-HNM system

• Demo of enhanced speech from old archive recordings


Overview of FTLP-HNM Speech Enhancement System

[Block diagram. Input: Noisy Speech. Output: Enhanced Speech. Blocks: Pre-cleaning; LP Model Decomposition; Formant Estimation; Kalman Filters (two blocks); Pitch Estimation; Voiced/Unvoiced Classification; HNM of Residual; Synthesized LP Model; LP Model Re-composition. The two processing branches are labelled "Formant estimation" and "HNM estimation".]
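The decomposition and re-composition steps named in the diagram can be illustrated with a minimal sketch, assuming the standard autocorrelation method of linear prediction. The function names `lpc`, `lp_residual` and `lp_synthesize` are hypothetical and not taken from the slides, and the pre-cleaning, formant tracking, HNM and Kalman filtering stages are left out entirely.

```python
def lpc(frame, order):
    """Levinson-Durbin on the frame autocorrelation; returns [1, a1, ..., ap]."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]  # assumes a non-silent frame, so r[0] > 0
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lp_residual(signal, a):
    """Decomposition: analysis filter A(z), e[n] = sum_j a[j] * s[n-j]."""
    return [sum(a[j] * signal[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(signal))]


def lp_synthesize(residual, a):
    """Re-composition: synthesis filter 1/A(z) driven by the residual."""
    out = []
    for n, e in enumerate(residual):
        past = sum(a[j] * out[n - j] for j in range(1, len(a)) if n - j >= 0)
        out.append(e - past)
    return out
```

In the system described here the residual would be replaced by its HNM model and the LP coefficients by their cleaned versions before re-composition; with both left unchanged, as in this sketch, the synthesis filter simply returns the input frame.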


Harmonic plus Noise Model

In HNM, speech is decomposed into two parts: a harmonic part and a noise part.

Harmonic:

$$S_h(t) = \sum_{k=-L(t)}^{L(t)} A_k(t)\, e^{j k \omega_0(t) t}$$

where L(t) denotes the number of harmonics included in the harmonic part and ω0 denotes the pitch frequency.

Noise:

$$S_n(t) = e(t)\,[h(\tau, t) \star b(t)]$$

where h is a time-varying autoregressive (AR) model and b is white Gaussian noise.

Synthesized speech:

$$\hat{S}(t) = S_h(t) + S_n(t)$$
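As a rough illustration of the model, the sketch below synthesizes one frame as a sum of harmonics plus envelope-shaped, AR-filtered Gaussian noise. It makes several simplifying assumptions that are not in the slides: amplitudes and pitch are constant within the frame, the harmonic sum is written in its real cosine form, and the time-varying AR model h is reduced to a fixed first-order filter. All function and parameter names are hypothetical.

```python
import math
import random


def synth_harmonic(amps, f0_hz, fs_hz, n_samples):
    """Harmonic part: sum over k of amps[k-1] * cos(k * w0 * n)."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    return [sum(a * math.cos(k * w0 * n) for k, a in enumerate(amps, start=1))
            for n in range(n_samples)]


def synth_noise(ar_coeff, envelope, rng):
    """Noise part: white Gaussian b through a 1st-order AR filter, times envelope e."""
    out, prev = [], 0.0
    for env in envelope:
        prev = ar_coeff * prev + rng.gauss(0.0, 1.0)
        out.append(env * prev)
    return out


def synth_hnm(amps, f0_hz, fs_hz, ar_coeff, envelope, seed=0):
    """Synthesized frame: harmonic part plus noise part, sample by sample."""
    rng = random.Random(seed)
    s_h = synth_harmonic(amps, f0_hz, fs_hz, len(envelope))
    s_n = synth_noise(ar_coeff, envelope, rng)
    return [h + v for h, v in zip(s_h, s_n)]
```

For example, `synth_hnm([1.0, 0.5], 100.0, 8000.0, 0.9, [0.1] * 160)` produces 20 ms of a 100 Hz two-harmonic tone with a low level of coloured noise at an 8 kHz sampling rate.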


HNM - Pitch Tracking

• Error function in frequency domain

$$E(F_0) = E - F_0 \sum_{k=1}^{\mathrm{MaxF}} \sum_{l=kF_0-M}^{kF_0+M} \log |X(l)|$$

• In noisy conditions the error function is modified to include SNR-dependent weights

$$E(F_0) = E - F_0 \sum_{k=1}^{\mathrm{MaxF}} \sum_{l=kF_0-M}^{kF_0+M} W(l) \log |X(l)|$$

The weighting function W(l) is SNR-dependent and is given by

$$W(l) = \frac{SNR(l)}{1 + SNR(l)}$$
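A minimal sketch of the weighted frequency-domain error function follows, under stated assumptions: E is taken as the sum of log|X(l)| over all bins (the slide does not define it), SNR(l) is a linear power ratio, F0 is expressed in DFT bins, the harmonic sum runs over every harmonic whose band of 2M+1 bins fits inside the spectrum, and M is smaller than F0. The function names are hypothetical.

```python
import math


def wiener_weight(snr):
    """W(l) = SNR(l) / (1 + SNR(l)), with SNR as a linear power ratio."""
    return snr / (1.0 + snr)


def pitch_error(mag, snr, f0_bin, half_band):
    """E(F0) = E - F0 * sum_k sum_{l=k*F0-M}^{k*F0+M} W(l) * log|X(l)|."""
    log_mag = [math.log(max(m, 1e-12)) for m in mag]
    total = sum(log_mag)  # assumed definition of E
    max_k = (len(mag) - 1 - half_band) // f0_bin
    acc = 0.0
    for k in range(1, max_k + 1):
        centre = k * f0_bin
        for l in range(centre - half_band, centre + half_band + 1):
            acc += wiener_weight(snr[l]) * log_mag[l]
    return total - f0_bin * acc


def pitch_candidates(mag, snr, f0_bins, half_band, n_best=3):
    """Return the n_best candidate F0 values (in bins) with the lowest error."""
    ranked = sorted(f0_bins, key=lambda f0: pitch_error(mag, snr, f0, half_band))
    return ranked[:n_best]
```

Bins where the SNR is low receive a weight near zero, so noise-dominated parts of the spectrum contribute little to the harmonic sum; with W(l) = 1 everywhere the function reduces to the unweighted error.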

NOTE:

• The input speech frame is band-pass filtered to eliminate the parts that do not contain explicit harmonics.

• For each speech frame, the method outputs several pitch candidates (N=3), and the Viterbi algorithm then generates the final pitch track.

• It may be useful to combine candidates from this method with those from the traditional autocorrelation method.
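The candidate selection step can be sketched as a standard Viterbi search over per-frame candidates. The transition cost used here (a penalty proportional to the pitch jump between consecutive frames) is an assumption for illustration; the slides do not state which cost the system uses, and the function name is hypothetical.

```python
def viterbi_pitch_track(candidates, costs, jump_penalty=1.0):
    """Pick one candidate per frame, minimizing local cost plus pitch jumps.

    candidates[t][i]: i-th pitch candidate (Hz) in frame t.
    costs[t][i]: local error of that candidate (lower is better).
    """
    best = [list(costs[0])]
    back = []
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for i, f in enumerate(candidates[t]):
            scores = [best[t - 1][j] + jump_penalty * abs(f - g)
                      for j, g in enumerate(candidates[t - 1])]
            j_min = min(range(len(scores)), key=scores.__getitem__)
            row.append(scores[j_min] + costs[t][i])
            ptr.append(j_min)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest final state.
    i = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [i]
    for ptr in reversed(back):
        i = ptr[i]
        path.append(i)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]
```

The continuity penalty lets the search reject an isolated octave error even when that candidate has the lowest local error in its own frame.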


Results of Pitch Tracking

[Two plots of pitch error (%) against SNR (dB) from 0 to 20 dB, comparing the improved method with weights, the improved method without weights, and Griffin's method.]

Figure - Comparison of the performance of different pitch tracking methods for speech in (a) train noise and (b) car noise, from 0 dB SNR to clean.

HNM - Harmonic Tracking

[Block diagram. Input: Noisy Speech. Blocks: VAD; Noise model; FFT; Peak picking; Pitch Tracking; Harmonic frequency bin tracks; Harmonic Track Candidates; Tracking; Smoothed Harmonic Magnitude by Kalman filter.]

• The data structure for harmonic track candidates has been improved, which speeds up the whole system.
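Two of the stages in the harmonic tracking chain, peak picking around each harmonic and Kalman smoothing of a harmonic's magnitude across frames, can be sketched as follows. This is an illustration only: it assumes a scalar random-walk state model with hand-set variances, a search band narrower than the pitch spacing, and hypothetical function names. The actual state model and candidate data structure of the system are not given in the slides.

```python
def pick_harmonic_peaks(mag, f0_bin, half_band):
    """For each harmonic k, return (bin, magnitude) of the largest bin
    within +/- half_band of k * f0_bin. Assumes half_band < f0_bin."""
    peaks = []
    k = 1
    while k * f0_bin + half_band < len(mag):
        lo = k * f0_bin - half_band
        hi = k * f0_bin + half_band
        best = max(range(lo, hi + 1), key=mag.__getitem__)
        peaks.append((best, mag[best]))
        k += 1
    return peaks


def kalman_filter_track(observations, process_var, obs_var):
    """Scalar Kalman filter with a random-walk model x[t] = x[t-1] + w[t]."""
    estimate = observations[0]
    err_var = obs_var
    filtered = [estimate]
    for z in observations[1:]:
        err_var += process_var                 # predict
        gain = err_var / (err_var + obs_var)   # Kalman gain
        estimate += gain * (z - estimate)      # update with the new observation
        err_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered
```

Feeding the magnitude of one harmonic, frame by frame, into `kalman_filter_track` attenuates isolated spikes caused by noise while following slow changes in the harmonic's level.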

Results of Harmonic Tracking in Clean Speech

Figure - An illustration of pitch tracks of a speech segment at a sampling frequency of 8 kHz.

Results of Harmonic Tracking in Noisy Speech

Pitch recovery

Harmonic Recovery

Synthesis of Excitation by HNM

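Voiced excitation:

$$\hat{e}(m) = e_h(m) + e_n(m) = \sum_{k=-L(m)}^{L(m)} A_k(m)\, e^{j k \omega_0(m) m} + b(m) \cdot \mathrm{std}(e(m)) \cdot \exp(j\alpha)$$

Unvoiced excitation:

$$\hat{e}(m) = e_n(m) = b(m) \cdot \mathrm{std}(e(m)) \cdot \exp(j\alpha)$$

where b(m) is unit-variance white Gaussian noise, e(m) is the original excitation and α is the phase of the original excitation.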
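The two excitation formulas can be transcribed fairly literally as the sketch below. Several points are assumptions rather than statements from the slides: std(e(m)) is taken as the standard deviation of the original excitation frame, one phase value α is supplied per index m, the harmonic sum uses conjugate-symmetric amplitudes so that it is real, and the k = 0 term is omitted. Function names are hypothetical.

```python
import cmath
import random
import statistics


def noise_excitation(excitation, phases, rng):
    """b(m) * std(e(m)) * exp(j*alpha): unit Gaussian noise scaled by the
    excitation's standard deviation, carrying the original phases."""
    scale = statistics.pstdev(excitation)
    return [rng.gauss(0.0, 1.0) * scale * cmath.exp(1j * a) for a in phases]


def harmonic_excitation(amps, w0, n_samples):
    """sum_{k=-L}^{L} A_k exp(j*k*w0*m) with A_{-k} = conj(A_k), k = 0 omitted."""
    out = []
    for m in range(n_samples):
        total = 0.0
        for k, a in enumerate(amps, start=1):
            total += 2.0 * (a * cmath.exp(1j * k * w0 * m)).real
        out.append(total)
    return out


def synth_excitation(voiced, excitation, phases, amps, w0, rng):
    """Voiced frames get harmonic plus noise parts; unvoiced frames noise only."""
    noise = noise_excitation(excitation, phases, rng)
    if not voiced:
        return noise
    harm = harmonic_excitation(amps, w0, len(noise))
    return [h + n for h, n in zip(harm, noise)]
```

The voiced/unvoiced flag would come from the classification stage of the system; here it is simply passed in by the caller.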

Results of Speech Enhancement

[Plot: harmonicity against SNR (dB) from 0 to 20 dB for noisy, MMSE and FTLP-HNM speech.]

Figure - Comparison of the harmonicity of MMSE and FTLP-HNM systems on train noisy speech at different SNRs

$$\mathrm{Harmonicity} = 10 \log_{10}\left[\frac{1}{N_{frames} N_H} \sum_{frames} \sum_{k=1}^{N_H} \frac{P_k + P_{k+1}}{2 P_{k,k+1}}\right]$$

[Plot: PESQ against SNR (dB) from 0 to 20 dB for noisy, MMSE and FTLP-HNM speech.]

Figure - Performance of MMSE and FTLP-HNM on train noisy speech at different SNR levels.

Enhanced speech is synthesized by inverse filtering the HNM residual with cleaned LP shape.
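The harmonicity measure reported on this slide can be sketched as below. The slide does not define its symbols, so the sketch assumes that P_k is the power at harmonic k, that P_{k,k+1} is the power midway between harmonics k and k+1, and that the measure is 10 log10 of the average ratio (P_k + P_{k+1}) / (2 P_{k,k+1}) over all harmonics and frames. The function name and input layout are hypothetical.

```python
import math


def harmonicity_db(frames):
    """Average harmonic-to-interharmonic power ratio, in dB.

    frames: list of (harmonic_powers, mid_powers) pairs, one per frame.
    harmonic_powers holds P_1 .. P_{NH+1}; mid_powers holds the NH values
    P_{k,k+1}. With the same NH in every frame this matches the
    1 / (N_frames * N_H) normalization.
    """
    total, count = 0.0, 0
    for harm, mid in frames:
        for k in range(len(mid)):
            total += (harm[k] + harm[k + 1]) / (2.0 * mid[k])
            count += 1
    return 10.0 * math.log10(total / count)
```

Under these assumptions a spectrum with no harmonic structure gives 0 dB, and the value rises as the harmonic peaks stand further above the valleys between them.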


Demo (1)

Persian speech for Iranian King Mozaffareddin Shah

Original speech Enhanced speech


Demo (2)

Florence Nightingale, 1890

Original speech Enhanced speech
