Chapter 2: Audio Processing
-
Chapter II: Audio Processing
Speech processing:
  Overview of speech processing
  Predictive coding: DPCM, ADPCM
  Vocoders
  Hybrid coding
  Speech coding standards
Audio processing:
  The psychoacoustic model and human hearing
  Basic steps in perceptual audio coding
  MPEG audio coding
-
Speech processing
Speech coding is the process of representing a digitized speech signal with as few bits as possible while maintaining a reasonable level of speech quality (speech compression).
Voice is a basic telecommunications service.
Speech coding technology attracts the attention of researchers, standardization bodies, and industry, thanks to advances in microelectronics and the availability of low-cost programmable processors and small dedicated chips, which overcome earlier drawbacks and limitations.
Speech coding methods are standardized for different applications.
-
Applications of speech coding
-
Goal of speech coding
Sampling frequency = 8 kHz, bits per sample = 16
Bit rate = 8 kHz x 16 bits = 128 kbps
Target: 2.4 kbps, a compression ratio of about 53:1
-
Structure of a speech coding system
[Figure: system block diagram; labels: Digital Speech, Focus]
-
Requirements for a speech coder
Low bit rate: lower transmission bandwidth and more efficient use of the system (traded against speech quality); the target depends on the application.
High speech quality: quality acceptable for the intended application. It is characterized by intelligibility, naturalness, pleasantness, and speaker recognizability.
Robustness: across different languages and against noise.
Good performance on non-speech signals: announcement tones, music.
Small memory footprint and low computational complexity.
Low coding delay.
-
Methods for assessing speech quality
MOS (Mean Opinion Score) rating (ITU-T Recommendation P.800): 5-Excellent / 4-Good / 3-Fair / 2-Poor / 1-Bad. PCM (64 kbps, A-law or mu-law) has MOS = 4.5-5.0.
PSQM (Perceptual Speech Quality Measurement), a perceptual-model method (ITU-T Recommendation P.861): the PSQM score expresses the deviation between the reference signal and the transmitted signal.
PESQ (Perceptual Evaluation of Speech Quality): compares the original signal X(t) with the degraded signal Y(t) that results from passing X(t) through the communication system. The PESQ output is an estimate of the perceived speech quality of Y(t).
E-model, a transmission rating model (ETR 250 standard): estimates two-way conversational speech quality and accounts for factors such as echo and delay.
-
Classification of speech coders
By bit rate
By coding technique
By mode of use
-
Classification of speech coders (by coding technique)
-
Speech quality versus bit rate for the different coders
-
Waveform coding
In the time domain:
Pulse code modulation (PCM): each signal sample is coded independently of the others.
Differential pulse code modulation (DPCM): neighboring samples are significantly correlated, so the amplitude difference between consecutive samples is fairly small.
A coding model that exploits this property reduces the source output data rate: it codes the difference between consecutive samples instead of coding each sample independently.
-
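Predictive coding (LPC, DPCM)
Observation: neighboring samples are very strongly correlated.
Predictive coding: predict the current sample from previous samples, then quantize and code the prediction error instead of the sample value itself.
If the prediction is accurate, the prediction error is concentrated near 0 and can be coded with fewer bits than the original sample.
The predictor commonly used is a linear predictor:
$x_p(n) = \sum_{k=1}^{P} a_k \, x(n-k)$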
-
Block diagram of the DPCM encoder
-
Block diagram of the DPCM decoder
-
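Example 1
Encode the following sample sequence with a DPCM coder:
Sequence: {1, 3, 4, 4, 7, 8, 6, 5, 3, 1, ...}
Predictor: the current value is predicted by the previous value:
$x_p(n) = x(n-1)$
Three-level quantizer:
$Q(d) = 2$ for $d \ge 1$; $Q(d) = 0$ for $-1 < d < 1$; $Q(d) = -2$ for $d \le -1$
Write the reconstructed sample sequence.
Write the output binary bit stream if the following codewords are used: error 0 is coded as 1, error 2 as 01, error -2 as 00.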
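Example 1 can be worked through mechanically. The sketch below is one possible reading of it, under two assumptions that the slide does not state explicitly: the first sample is known to the decoder, and the predictor uses the previous reconstructed sample (the usual closed-loop DPCM arrangement, so encoder and decoder stay in step). The function names are illustrative only.

```python
def quantize(d):
    """3-level quantizer: +2 for d >= 1, -2 for d <= -1, otherwise 0."""
    if d >= 1:
        return 2
    if d <= -1:
        return -2
    return 0

# Codewords from the example: error 0 -> "1", error 2 -> "01", error -2 -> "00".
CODEWORDS = {0: "1", 2: "01", -2: "00"}

def dpcm_encode(samples):
    """Return (reconstructed samples, bit string) for closed-loop DPCM."""
    recon = [samples[0]]          # assumption: first sample is sent as-is
    bits = []
    for x in samples[1:]:
        prediction = recon[-1]    # predict from the previous reconstructed value
        dq = quantize(x - prediction)
        bits.append(CODEWORDS[dq])
        recon.append(prediction + dq)
    return recon, "".join(bits)

recon, bitstream = dpcm_encode([1, 3, 4, 4, 7, 8, 6, 5, 3, 1])
print(recon)      # [1, 3, 5, 3, 5, 7, 5, 5, 3, 1]
print(bitstream)  # 01010001010010000
```

The decoder runs the same loop in reverse: it reads a codeword, looks up the quantized error, and adds it to its own previous reconstructed value.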
-
Delta modulation
The prediction error is quantized with a two-level quantizer: $\pm\Delta$.
Each sample is coded with 1 bit.
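A minimal sketch of delta modulation follows, assuming that bit 1 means "step up by delta" (error >= 0) and bit 0 means "step down by delta", and that the first prediction is supplied to both encoder and decoder. `dm_encode` and its parameters are hypothetical names chosen for illustration.

```python
def dm_encode(samples, delta, first_prediction):
    """Delta modulation: 1 bit per sample, reconstruction moves by +/- delta."""
    bits, recon = [], []
    prediction = first_prediction
    for x in samples:
        bit = 1 if x - prediction >= 0 else 0       # sign of the prediction error
        value = prediction + (delta if bit else -delta)
        bits.append(bit)
        recon.append(value)
        prediction = value                           # next prediction = last reconstruction
    return bits, recon

bits, recon = dm_encode([0, 2, 4, 4], delta=2, first_prediction=0)
print(bits)   # [1, 1, 1, 0]
print(recon)  # [2, 4, 6, 4]
```

Because the reconstruction always moves by exactly one step, a flat input makes it oscillate around the true value (granular noise), and a fast-changing input can outrun it (slope overload).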
-
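Exercises
Exercise 3:
Encode the following sequence with DM: {1, 3, 4, 4, 7, 8, 6, 5, 3, 1, ...}. Use delta = 2 and assume the first predicted sample is given. Show the reconstructed sequence and the binary coded sequence.
Exercise 4:
Consider a prediction system that uses DM. The predictor predicts the current sample from the previous reconstructed sample. The prediction error is quantized by the quantizer: [quantizer definition not preserved]
Given the sample sequence {3, 4, 5, 3, 1, ...}, compute the predicted value, the prediction error, the quantized prediction error, and the reconstructed value for each sample. The first predicted sample is 3 and is used by both the encoder and the decoder. Assume bit 1 represents e >= 0 and bit 0 represents e < 0.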
-
Subband coding (SBC)
The signal is split into several narrow frequency bands, and the time-domain signal in each band is coded independently.
In speech coding, the low-frequency bands contain most of the signal energy, while quantization noise in the higher bands has little effect on the ear. The low-frequency bands are therefore coded with more bits than the high-frequency bands.
-
Adaptive transform coding (ATC)
At the transmitter: the source samples are divided into frames of Nf samples, and each frame is transformed to the frequency domain for coding and transmission.
At the receiver: each frame of spectral samples is transformed back to the time domain, and the signal is resynthesized from these samples.
For efficient coding, more bits are given to the important spectral components and fewer bits to the unimportant ones.
The transform is chosen so that the spectral samples are uncorrelated: the KLT (Karhunen-Loeve transform, optimal but complex) or the DCT.
-
Waveform coding
Reconstructs a waveform that resembles the original signal.
Low complexity, cost, delay, and power consumption.
Produces high-quality speech only at bit rates above 16 kbps.
Cannot produce high-quality speech at bit rates below 16 kbps.
-
How can the bit rate be reduced further?
ADPCM cannot deliver good quality at bit rates below 16 kbps.
To reduce the bit rate further, the speech production model must be exploited: model-based coding, or vocoder coding.
Coding methods that do not rely on a model are called waveform coding.
-
Source coding
The speech signal is assumed to be generated by a model (an AR model) controlled by a small set of parameters. During encoding, the model parameters are estimated from the input speech, coded, and transmitted to the decoder.
For speech, the source coder is called a vocoder. It is based on a model of the vocal tract, excited by a white-noise source for unvoiced segments or by a pulse train whose period equals the pitch period for voiced segments.
The information sent to the decoder is the set of filter parameters.
-
AR model
A general approach to modeling random signals is to represent the signal as the output of an all-pole linear filter driven by white noise. The output power spectrum is then the product of the white-noise spectrum and the squared magnitude of the filter's transfer function: [equation not preserved]
By choosing a filter with a suitable denominator polynomial, the desired spectral characteristics can be obtained for the random signal.
A sequence x[n], x[n-1], ..., x[n-M] is a realization of an autoregressive (AR) process of order M if it satisfies the difference equation: [equation not preserved]
where a1, a2, ..., aM are the AR parameters and v[n] is a white-noise process.
Note that the current value x[n] is a linear combination of the past values x[n-1], ..., x[n-M] plus v[n]. The process x[n] is regressed on its own previous values, hence the name autoregressive (AR).
-
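Transfer function of the AR analyzer
Taking the z-transform of equation 3.41 gives the transfer function of the AR analyzer (an all-zero FIR filter), with x[n] as the filter input and v[n] as the output: [equation not preserved]
The AR analyzer thus transforms the AR process at its input into white noise at its output.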
-
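Transfer function of the AR synthesizer
With white noise v[n] as input and the transfer function [equation not preserved], the AR process x[n] is produced at the output.
The AR process synthesizer is an all-pole filter with an infinite impulse response (IIR). It takes white noise as input and produces the AR signal at the output. Equation 3.44 shows that the transfer function of the analyzer is the inverse of the transfer function of the synthesizer: [equation not preserved]
where p1, p2, ..., pM are the poles of Hs(z) and the roots of the characteristic equation: [equation not preserved]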
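The equations for the AR analyzer and synthesizer did not survive in this transcript. As a hedged reconstruction, the block below writes them in one common sign convention (AR parameters on the left-hand side of the difference equation). This convention is an assumption; the original slides may have used the opposite sign for the $a_i$.

```latex
% AR process of order M driven by white noise v[n] (assumed sign convention)
x[n] + a_1 x[n-1] + \dots + a_M x[n-M] = v[n]
\quad\Longleftrightarrow\quad
x[n] = -\sum_{i=1}^{M} a_i \, x[n-i] + v[n]

% AR analyzer (all-zero, FIR): input x[n], output v[n]
H_A(z) = \frac{V(z)}{X(z)} = 1 + \sum_{i=1}^{M} a_i z^{-i}

% AR synthesizer (all-pole, IIR): input v[n], output x[n]
H_S(z) = \frac{1}{H_A(z)} = \frac{1}{1 + \sum_{i=1}^{M} a_i z^{-i}}
       = \frac{1}{\prod_{i=1}^{M} \left(1 - p_i z^{-1}\right)}

% Characteristic equation whose roots are the poles p_1, ..., p_M
1 + \sum_{i=1}^{M} a_i z^{-i} = 0
```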
-
Synthesis filter
When the synthesis filter (transfer function Hs(z)) is excited by white noise, the filter output has a PSD close to that of the original signal.
-
Linear prediction plays an important role in speech coding algorithms.
Within a signal frame, the weights used to form the linear combination (the linear prediction coefficients) are found by minimizing the mean squared prediction error.
The same coefficients are then used to represent that signal frame.
The core of the prediction method is the AR model. Linear prediction analysis is the process of estimating the AR parameters from the signal samples (assuming speech is modeled as an AR signal).
LP is also regarded as a spectral estimation method: LP analysis yields the AR parameters, which determine the PSD of the signal itself. By computing the LPC coefficients of a frame, another signal can be generated whose spectral content is close to that of the original.
-
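The linear prediction problem
Linear prediction is an identification problem in which the AR parameters are estimated from the AR signal itself (Figure 4.1). The white-noise signal x[n] is filtered by the AR process synthesizer to give the output s[n], an AR signal with AR parameters [symbols not preserved]. The LP predictor is used to predict s[n] from the M previous samples: [equation not preserved]
where the ai are estimates of the AR parameters and are called the LPC coefficients.
Prediction error: [equation not preserved]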
-
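Minimizing the error
System identification problem: estimate the AR parameters from s[n], where the estimates are the LPC coefficients. To carry out the estimation, a criterion must be set. In this case it is the mean squared prediction error: [equation not preserved]
which is minimized by the choice of the LPC coefficients. J is a second-order (quadratic) function of the LPC coefficients, which makes the dependence of J on them explicit.
The optimal LPC coefficients are found by taking the derivative of J with respect to ak: [equation not preserved]
When equation 4.4 is satisfied, the LPC coefficients equal the AR parameters. Once the LPC coefficients are found, the system uses them to generate the AR signal (AR synthesizer).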
-
Minimizing the error
From 4.4, rewrite: [equation not preserved]
Or: [equation not preserved]
Equation 4.6 defines the optimal LPC coefficients in terms of the autocorrelation values Rs[l] of the signal s[n].
-
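Equation 4.6 in matrix form: [equation not preserved]
where: [definitions not preserved]
If the inverse of the correlation matrix Rs exists, the optimal LPC coefficients are: [equation not preserved]
Prediction gain (the ratio of the variance of the input signal to the variance of the prediction error) measures the performance of the predictor.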
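Since the numbered equations of this derivation are missing from the transcript, the block below restates the standard chain from prediction error to normal equations. It is a sketch under an assumed sign convention (predictor written with a minus sign, so that the optimal coefficients coincide with AR parameters defined on the left-hand side of the difference equation); the numbering 4.4, 4.6, 4.9, 4.13 in the surrounding text is not reproduced because it cannot be verified.

```latex
% Predictor and prediction error (assumed sign convention)
\hat{s}[n] = -\sum_{i=1}^{M} a_i \, s[n-i], \qquad
e[n] = s[n] - \hat{s}[n] = s[n] + \sum_{i=1}^{M} a_i \, s[n-i]

% Cost function and optimality condition
J = E\{e^2[n]\}, \qquad
\frac{\partial J}{\partial a_k} = 2\,E\{e[n]\, s[n-k]\} = 0, \quad k = 1, \dots, M

% Normal equations in terms of the autocorrelation R_s[l] = E\{s[n]\, s[n-l]\}
\sum_{i=1}^{M} a_i \, R_s[i-k] = -R_s[k], \quad k = 1, \dots, M

% Matrix form and solution
\mathbf{R}_s \mathbf{a} = -\mathbf{r}_s, \qquad
\mathbf{a} = -\mathbf{R}_s^{-1} \mathbf{r}_s

% Prediction gain, commonly expressed in dB
PG = 10 \log_{10} \frac{\sigma_s^2}{\sigma_e^2}
```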
-
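Minimum mean squared prediction error
From Figure 4.1, when the LPC coefficients equal the AR parameters, the prediction error equals the white noise used to generate the AR signal s[n]: e[n] = x[n]. The mean squared error is then at its minimum (and the prediction gain at its maximum): [equation not preserved]
Condition for optimality: the order of the predictor is at least the order of the AR synthesizer. In practice M is not known in advance, so the prediction gain is evaluated as a function of the prediction order. The order can then be chosen at the point where the gain saturates.
If the prediction order M is known, J reaches its minimum when the LPC coefficients equal the AR parameters used to generate s[n]: [equation not preserved]
Combining 4.16 and 4.9: [equation not preserved] (0 is the M x 1 zero vector)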
-
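LP analysis of non-stationary signals
Speech is a dynamic signal, so the LPC coefficients must be computed for every frame, assuming the statistics are constant within each frame. The LPC coefficients are computed from N data points ending at time m: s[m-N+1], s[m-N+2], ..., s[m]. The LPC vector (M: prediction order) is: [equation not preserved]
From 4.9, rewritten in time-dependent form: [equation not preserved]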
-
Prediction schemes
There are two basic techniques: internal prediction and external prediction.
Internal prediction: the LPC coefficients are computed from autocorrelation values estimated on the data of the frame being processed and are applied to that same frame.
External prediction: the LPC coefficients are applied to a future frame. External prediction works because the signal statistics change slowly over time. If the frame is not too long, the statistics can be taken from frames not far in the past.
Typical frame length: 160 to 240 samples. A finite-length window must be used to extract the samples.
Longer frames: lower computational complexity and lower bit rate, because the LPC coefficients are computed and transmitted less often. However, the coding delay is larger because the system must wait to collect the samples, and the prediction gain is not high.
Shorter frames: more accurate representation, but higher computational load and bit rate.
-
Levinson-Durbin algorithm
Equation 4.9 can be solved using the solution 4.13, but this is generally computationally expensive.
The Levinson-Durbin (LD) and Leroux-Gueguen (LG) algorithms are two algorithms well suited to LP analysis in practical systems.
Consider the equation: [equation not preserved]
Goal: find the coefficients ai from the given autocorrelation values.
The autocorrelation values are obtained by estimation from the signal samples. J is the minimum mean squared prediction error (not known in advance in practice).
LD algorithm: finds the solution of the order-M predictor from the order-(M-1) predictor (iterative recursion).
-
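The algorithm relies on a basic invariance property of the correlation matrix: [equation not preserved]
Zero-order predictor: [equation not preserved]
Extending the dimension of 4.29: [equation not preserved]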
-
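Levinson-Durbin algorithm
First-order predictor (continued):
Because a1 = 0, the optimality condition is not met; a term $\Delta_0$ is added to balance the equation, and it is given by:
$\Delta_0 = R[1]$
By the property of the correlation matrix, 4.30 is equivalent to: [equation not preserved]
Equations 4.30 and 4.32 are used in the next step.
First-order predictor:
Find the solution of: [equation not preserved]
The two unknowns in equation 4.34 are the prediction coefficient of the first-order predictor and J1, the minimum mean squared prediction error achievable with the first-order predictor.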
-
First-order predictor (continued)
This gives the reflection coefficient k1, the prediction coefficient of the first-order filter, and J1: [equation not preserved]
Algorithm:
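The algorithm listing itself is missing from the transcript. The sketch below is a textbook form of the Levinson-Durbin recursion, written under an assumed sign convention: the predictor is s_hat[n] = -sum(a_i * s[n-i]), so the coefficients solve sum_i a_i R[|i-k|] = -R[k]. If the opposite convention is wanted, negate the returned coefficients. The function name and return layout are illustrative.

```python
def levinson_durbin(R, M):
    """Solve the order-M normal equations from autocorrelations R[0..M].

    Returns (a, ks, J): LPC coefficients a_1..a_M, reflection coefficients
    k_1..k_M, and the final minimum mean squared prediction error J.
    Assumed convention: s_hat[n] = -sum(a_i * s[n-i]).
    """
    a = []          # a[i-1] holds a_i for the current order
    J = R[0]        # zero-order predictor: error equals signal power
    ks = []
    for l in range(1, M + 1):
        # Correlation left unexplained by the order-(l-1) predictor.
        acc = R[l] + sum(a[i - 1] * R[l - i] for i in range(1, l))
        k = acc / J
        # Update lower-order coefficients, then append the new one.
        a = [a[i - 1] - k * a[l - i - 1] for i in range(1, l)] + [-k]
        J *= (1.0 - k * k)
        ks.append(k)
    return a, ks, J

# Autocorrelation values taken from Exercise 1 in this chapter.
R = [1.0, 0.866, 0.554, 0.225]
a, ks, J = levinson_durbin(R, 3)
print(a, ks, J)
```

Each pass costs on the order of l operations, so the whole recursion is O(M^2) instead of the O(M^3) of a general matrix inversion, and the intermediate reflection coefficients come for free.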
-
Exercises
1. A frame of speech data has autocorrelation values R(0)=1, R(1)=0.866, R(2)=0.554, and R(3)=0.225. Find the coefficients ai (i=1,2,3).
2. An LPC system has prediction coefficients a1=1.793, a2=-1.401, a3=0.566, a4=-0.147. The gain is G=2 and the pitch period is 60; assume the sound is voiced. With zero initial conditions at the start of the pitch period, synthesize the first 10 samples.
3a. Given the block diagram of a linear prediction model of the signal x(m):
Write the equation describing the input-output relation of the model (time domain).
Take the z-transform of the input-output equation to find the transfer function of the model (all-pole model).
Write the equation of the inverse linear prediction filter.
Find the minimum squared error in terms of the linear prediction coefficients.
3b. The first three autocorrelation coefficients of the signal are r(0)=1, r(1)=0.865, r(2)=0.521.
Find the coefficients of the second-order linear prediction model, expressed in terms of its poles. Use this model to compute the frequency response of the process and plot the spectrum of the predictor.
-
Source coding
Many models have been proposed. The model based on linear prediction is the most successful: a time-varying filter used to generate speech.
Typical bit rate: 2 to 5 kbps.
Types: linear predictive coding (LPC) and mixed excitation linear prediction (MELP).
-
Modeling the speech production system
The anatomical structures that make up the human speech production system.
Figure labels: velum (soft palate), pharyngeal cavity, larynx, trachea.
Speech (a sound wave) radiates from the nose and mouth as air is expelled from the lungs.
Nasal cavity + oral cavity + pharyngeal cavity = the basic acoustic filter, with a time-varying frequency response, excited by the airflow.
Vocal tract = pharyngeal cavity + oral cavity.
The resonance frequencies of the vocal tract are the formant frequencies; they depend on the shape and size of the vocal tract.
-
Modeling the speech production system (continued)
Vocal cords (inside the larynx): open and close rapidly during speech.
Voiced sounds: the vocal cords vibrate and periodically interrupt the airflow from the lungs, producing a pulse train that excites the vocal tract.
Unvoiced sounds: air escapes without vibrating the vocal cords; the signal is not periodic and is turbulent.
In the time domain, voiced sounds are strongly periodic, with a fundamental frequency equal to the pitch frequency.
-
Modeling the speech production system (AR model: autoregressive)
Lungs: produce the airflow, which is the energy that excites the vocal tract; represented by a white-noise source.
An identification technique (linear prediction) is used to estimate the parameters of the time-varying filter from the observed signal.
-
Speech model of the vocal tract
Output of the digital filter (LPC filter): the digital speech signal.
Input: a pulse train or a white-noise sequence.
Relation between the two models:
-
Vocoder
Information sent to the decoder:
Parameters characterizing the filter.
Unvoiced/voiced decision.
Required changes to the excitation signal and the pitch period.
The input-output relation of the filter is expressed by a linear difference equation: [equation not preserved]
Transfer function of the filter: [equation not preserved]
-
Vocoder coding
The filter model is represented as a vector: [vector A not preserved]
A is updated every 20 ms (because speech is non-stationary). At a sampling frequency of 8000 Hz, 20 ms corresponds to 160 samples. The speech signal is therefore divided into frames of 20 ms (50 frames/s).
This model is equivalent to: [equation not preserved]
Thus 160 values of S are represented by the 13 values of A.
Two kinds of problems:
Synthesis: given A, generate S.
Analysis: given S, find the best A.
-
Source coder: implementation
Finding the filter parameters is linear prediction analysis; the parameters can regenerate a frame whose spectral content matches the original frame and which sounds similar.
A frame can therefore be represented by 10 filter parameters plus a scale factor (computed from the power level of the original frame). Total: 45 bits (40 bits for the parameters, 5 bits for the scale factor).
-
LPC vocoder at 2.4 kbps
Diagram:
Operates at about 2.4 kbps or lower.
Produces speech that is intelligible but not faithful to natural human speech.
The LPC coefficients are represented as line spectrum pair (LSP) parameters.
LSPs are mathematically in one-to-one correspondence with the LPC coefficients.
LSPs are computed as follows: [equations not preserved]
-
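LPC vocoder at 2.4 kbps
Factoring the equations above: [equations not preserved]
[symbols not preserved] are the LSP parameters.
LSPs are ordered and bounded: [inequality not preserved]
LSPs are more strongly correlated from frame to frame than LPC coefficients.
The frame size is 20 ms (50 frames/s; 2400 bps = 48 bits/frame). These bits are allocated as follows: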
-
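LPC vocoder at 2.4 kbps
34 bits for the LSPs, allocated as in table B1.
The gain G is coded with a 7-bit non-uniform scalar quantizer (a one-dimensional vector quantizer).
For voiced sounds, T takes values from 20 to 146. V/UV and T are coded as in table B2.
[Tables B1 and B2 not preserved]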
-
Generalization: structure of a source coder
-
Block diagram of the LPC encoder
-
Block diagram of the LPC decoder
-
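Source coder
Encoding process (frame by frame):
At the transmitter:
Find the filter coefficients from the speech frame.
Find the scale factor from the speech frame.
Send the filter coefficients and the scale factor to the receiver.
At the receiver:
Generate a white-noise sequence.
Multiply the white-noise samples by the scale factor.
Build the filter from the coefficients received from the transmitter and filter the scaled white-noise sequence. The filter output is the synthesized speech.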
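The receiver steps above can be sketched in a few lines. This is an illustration, not the coder from the slides: the filter equation is missing from the transcript, so the code assumes the all-pole form s[n] = G*u[n] - sum(a_i * s[n-i]); coefficients given in the opposite convention would need their signs flipped. The pulse-train branch for voiced frames and the names `excitation` and `lpc_synthesize` are additions for illustration.

```python
import random

def excitation(n_samples, voiced, pitch_period=60, seed=0):
    """Unit pulse train (voiced) or Gaussian white noise (unvoiced)."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def lpc_synthesize(a, gain, excite):
    """All-pole synthesis: s[n] = gain*u[n] - sum_i a[i-1]*s[n-i] (assumed convention)."""
    out = []
    for n, u in enumerate(excite):
        acc = gain * u
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:          # zero initial conditions
                acc -= ai * out[n - i]
        out.append(acc)
    return out

# One-pole toy filter: impulse response decays by a factor of 0.5 per sample.
frame = lpc_synthesize([-0.5], gain=1.0, excite=excitation(3, voiced=True))
print(frame)  # [1.0, 0.5, 0.25]
```

For a real frame the decoder would call `lpc_synthesize` once per 20 ms frame with that frame's coefficients and gain, carrying the filter state across frame boundaries; the sketch omits that state handling.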
-
Advantages and disadvantages of vocoders
Quality depends on the speech model.
Vocoders can sound quite artificial.
Quality is poor, and vocoders are very sensitive to errors.
They can deliver digital speech at rates below 2 kbps.
-
Hybrid coding
Combines the two technologies: waveform coding and vocoder coding.
Can achieve good speech quality at bit rates of 2-16 kbps.
The most common hybrid coders are analysis-by-synthesis (AbS) coders: RPE-LTP (Regular-Pulse-Excited Long-Term Prediction), CELP, ACELP, CS-ACELP, and so on.
-
Hybrid coding
Produces more natural sound: the excitation signal is arbitrary and is chosen so that the generated speech waveform is as close as possible to the real waveform.
Hybrid coder: codes the filter model parametrically and the excitation signal as a waveform.
Code-excited linear prediction (CELP) coder: selects the excitation signal from the codewords of a pre-designed codebook.
This principle gives acceptable speech quality in the 4.8-16 kbps range in wireless telephone systems.
-
Analysis-by-synthesis (AbS) coding
Closed-loop optimization: the parameters are chosen so that the synthesized speech signal is as close as possible to the original.
Because the signal is synthesized during encoding for the purpose of analysis, the method is called AbS.
-
Analysis-by-synthesis (AbS) coding
Also uses a model of the human vocal tract.
Instead of a simple excitation model, the excitation signal is chosen so that the reconstructed speech waveform is as close as possible to the original speech waveform.
The algorithm that searches for the excitation waveform determines the complexity of the coder.
Widely used in speech coding standards for mobile networks.
-
The analysis-by-synthesis loop in CELP
-
Long-term prediction filter
Experiments with real speech data show that the filter order must be large enough to span at least one pitch period in order to model a voiced signal.
A 10th-order linear predictor cannot accurately model the periodicity of a voiced signal with pitch period = 50.
As the predictor order increases, the periodicity in the prediction error disappears, which raises the prediction gain.
However, a high predictor order increases implementation cost and bit rate (more bits are needed to represent the prediction coefficients) and adds computation during analysis. A solution is needed that is both simple and able to model the signal accurately.
Experimental observation: the prediction gain rises mainly over the first 8-10 prediction coefficients, plus the coefficient at the pitch period, 49. Coefficients of order 11-48 and above 49 contribute nothing to improving the prediction gain (Figure 4.9).
-
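Long-term prediction filter
The short-term predictor has a relatively low order (M = 8-12) and removes the correlation between neighboring samples. The long-term predictor targets the correlation between samples one pitch period apart.
Transfer function of the long-term filter: [equation not preserved]
Two parameters must be determined: the pitch period T and the long-term gain b (the long-term LP analysis problem).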
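One common way to find the two long-term parameters is an exhaustive search over candidate lags. The sketch below assumes a one-tap predictor s_hat[n] = b * s[n-T] (the sign convention is an assumption, since the transfer function is missing from the transcript). For each lag T the optimal gain is b = sum(s[n]s[n-T]) / sum(s[n-T]^2), and the best lag is the one that maximizes the resulting error reduction. The function name and search bounds are illustrative.

```python
def long_term_predictor(s, t_min, t_max):
    """Return (T, b) minimizing sum((s[n] - b*s[n-T])**2) over lags t_min..t_max."""
    best = None
    for T in range(t_min, t_max + 1):
        # Start at t_max so every lag is scored over the same samples.
        num = sum(s[n] * s[n - T] for n in range(t_max, len(s)))
        den = sum(s[n - T] ** 2 for n in range(t_max, len(s)))
        if den == 0:
            continue
        score = num * num / den   # error reduction achieved with the optimal b
        if best is None or score > best[0]:
            best = (score, T, num / den)
    return best[1], best[2]

# Toy "voiced" signal with a period of 5 samples.
signal = [3.0, 1.0, -2.0, 0.0, 1.0] * 8
lag, gain = long_term_predictor(signal, t_min=2, t_max=8)
print(lag, gain)  # 5 1.0
```

In a speech coder the lag range would cover the plausible pitch periods (the slides mention values from 20 to 146 for T), and the search would be repeated for every subframe.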
-
Frame/subframe structure
Short-term LP analysis is applied over a relatively long interval (the frame, 240 samples).
The frame is divided into subframes (60 samples, a shorter interval). Long-term LP analysis is performed on these subframes (CELP coder).
-
CELP coder
Uses long-term and short-term linear prediction models to synthesize speech, avoiding the voiced/unvoiced classification of LPC. These are combined with an excitation codebook (searched during encoding) to find the best excitation sequence.
The pitch synthesis filter generates a signal that is periodic at the fundamental (pitch) frequency and feeds the formant synthesis filter, which shapes the spectral envelope.
Codebook: fixed or adaptive, containing deterministic pulses or random noise.
Simplest case: a fixed codebook containing white-noise samples.
The excitation index selects the white-noise sequence fed into the filters.
-
CELP coder
-
RPE-LTP coder (Regular-Pulse-Excited Long-Term Prediction)
It is an ADPCM-type coder: the predictor is computed from the signal, the prediction error is found, and that error is quantized using an adaptive scheme.
It has two predictors, short-term and long-term, which raises the average prediction gain.
Encoder:
The parameters of each frame/subframe are extracted and packed into the bit stream.
The input speech samples are divided into frames (160 samples, 20 ms), and each frame into subframes (40 samples).
Pre-processing: a high-pass filter removes the DC component.
LP analysis: performed on each frame with prediction order 8. Nine autocorrelation values are computed from the frame using a rectangular window and are used to find 8 reflection coefficients.
-
Long-term prediction analysis, filtering, and the coding block
-
RPE-LTP decoder
-
Remarks
Almost all hybrid coders are based on the LPC model. They differ in how the excitation signal is generated:
- Multi-pulse excitation: MPE-LTP
- Regular-pulse excitation: RPE-LTP
- Code-excited coding: CELP, ACELP, CS-ACELP
- Vector-sum-excited coding: VSELP, and so on
Hybrid coders overcome the drawbacks of LPC and provide low-rate voice service with relatively good quality.
-
Speech coding standards
-
Audio coding
Sound
Model of the audio encoder and decoder
Perceptual coder:
  Psychoacoustics
  Auditory masking: frequency masking, temporal masking
MPEG audio compression standard
-
Sound
Sound is a continuous signal produced by the compression and rarefaction of air.
Changes in air pressure make the eardrum vibrate.
The frequency range from 16 Hz to 20,000 Hz is called the audio frequency range.
The wavelength of sound in the audio range is from 21.25 m down to 0.017 m.
Sounds with frequencies below 16 Hz are called infrasound.
Sounds with frequencies above 20,000 Hz are called ultrasound.
-
Properties of sound
Frequency:
The number of oscillations of the sound-carrying air per unit time (1 second).
Frequency expresses the pitch of the sound: the higher the frequency, the higher the pitch, and vice versa.
Intensity:
The amount of energy carried by the sound wave per unit time through a unit area perpendicular to the direction of propagation. Unit: W/m2.
The sound level perceived by humans is referred to as loudness. The unit of loudness level is the phon, defined as the perceived intensity at 1000 Hz (the reference sound intensity).
Power:
The sound energy passing through an area S in one second.
Quality (timbre):
Timbre is known as the "quality" or "color" of a sound; it lets listeners distinguish different musical instruments.
-
Transmitting digital audio
Rarely single-channel (monaural sound).
CD: 2 channels (stereo).
DVD: 7.1 channels (surround sound): 7 normal channels + 1 low-frequency effects (LFE) channel.
-
Audio coding
Music has a wider bandwidth and multiple channels.
Waveform coding preserves natural sound quality.
Properties of the human ear are used to determine the number of quantization levels in the different frequency bands.
Each frequency component is quantized with a step size that depends on the hearing threshold.
Frequency components that the human ear cannot hear are not coded.
-
Audio coding
Higher audio quality means a higher sampling rate, more bits per sample, and more channels.
Bit rate of an audio signal with Nch channels: B0 = b (bits/sample) x Fs x Nch.
DVD-Video: 48 kHz x 24 bits/sample = 1,152 kbps for 1 channel; 2,304 kbps for 2 channels; 6,912 kbps for 5.1; 9,216 kbps for 7.1.
The required bandwidth is large, and delivery to customers goes over media with limited capacity (wireless: the requirement is more than 36 times the allocated channel bandwidth).
Solutions:
Increase the channel capacity (expensive, not feasible), or
Reduce the bandwidth requirement (lower the bit rate: digital audio coding).
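The uncompressed bit rates quoted in this chapter follow directly from B0 = b x Fs x Nch. The short sketch below recomputes them; it assumes that "5.1" and "7.1" are counted as 6 and 8 full-rate channels, which is what the quoted figures imply. The function name is illustrative.

```python
def pcm_bit_rate_kbps(fs_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate B0 = b * Fs * Nch, in kbps."""
    return fs_hz * bits_per_sample * channels / 1000

# DVD-Video audio: 48 kHz, 24 bits/sample.
layouts = [("mono", 1), ("stereo", 2), ("5.1", 6), ("7.1", 8)]
rates = {name: pcm_bit_rate_kbps(48_000, 24, ch) for name, ch in layouts}
print(rates)  # {'mono': 1152.0, 'stereo': 2304.0, '5.1': 6912.0, '7.1': 9216.0}

# Telephone speech: 8 kHz, 16 bits/sample, and the ratio to a 2.4 kbps vocoder.
speech_b0 = pcm_bit_rate_kbps(8_000, 16, 1)
print(speech_b0, round(speech_b0 / 2.4, 1))  # 128.0 53.3
```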
-
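Audio coding scheme
Requirement: fewer bits.
Compression ratio: r = B0/B (B: the bit rate required to transmit the compressed version).
Channel coder, modulator, physical channel, demodulator, and so on: these introduce bit errors.
Lossless: the reconstructed audio signal is identical to the source audio signal.
Lossy: the copy is nearly identical; some information is lost and the audio signal is distorted (ideally imperceptibly).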
-
Ideas
Information theory: the minimum average bit rate needed to transmit a source signal is its entropy H (determined by the probability distribution of the source).
The difference R = B0 - H is the statistical redundancy.
Lossless audio coding: remove as much statistical redundancy from the source signal as possible, so that B is as close to H as possible (Figure 1.2).
Entropy coding: the coding technique that removes statistical redundancy.
Remark: the achievable compression ratio is limited (about 2:1) and does not meet practical requirements (36:1). At that level some information in the source signal must be lost and cannot be recovered by the decoder.
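To make the entropy bound concrete, the sketch below estimates H from the empirical symbol frequencies of a sequence and computes the redundancy R = B0 - H per symbol. It treats symbols as independent, which is a simplifying assumption (real audio samples are correlated, so the true entropy rate is lower). The function name is illustrative.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical first-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

samples = "aabb"                 # two equally likely symbols
H = entropy_bits(samples)
B0 = 8                           # assumed raw representation: 8 bits per symbol
print(H, B0 - H)                 # 1.0 7.0
```

An ideal entropy coder would need about H bits per symbol here, so the lossless compression ratio for this toy source is bounded by B0/H = 8.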
-
Ideas
Lost information that cannot be recovered causes distortion in the audio signal reconstructed at the decoder output.
Problem: design the coder so that the ear cannot perceive the distortion, or can perceive it but not to the point of annoyance.
The part of the source information whose removal causes distortion that does not affect perception, or is not annoying, is perceptually irrelevant information. It can be removed from the source signal, which reduces the bit rate considerably (lossy coder).
Lossy coder: removes perceptually irrelevant information as well as statistical redundancy (Figure 1.3).
-
Data model
Makes audio coding more efficient, but also more complex.
Audio signals are highly correlated and have internal structures that can be represented by data models.
Example: a 1000 Hz sine wave sampled at 48 kHz with 16 bits/sample; its periodicity shows that it is highly correlated. The periodicity can be modeled, and the correlation removed, by linear prediction, by an orthogonal (decorrelating) transform (which converts the input into uncorrelated coefficients with the energy compacted into a few of them, for example the DCT), or by a lapped transform (subband coding).
-
Perceptual model
The model determines how much perceptually irrelevant information can safely be removed.
It is designed in the frequency domain (masking threshold).
-
Basic architecture (p. 15): audio encoder
-
Basic architecture (p. 15): audio decoder
-
Perceptual audio coder
-
Analysis and synthesis filter banks
Filter bank:
A key component of most audio coders.
Converts from the time domain to the frequency domain and back.
-
Down-sampling
Down-sampling by a factor of N keeps only every N-th sample (the samples at indices nN).
Example analysis filter bank:
N = 1024 filters
Sampling frequency: fs = 44,100 Hz
Nyquist frequency: fg = 22,050 Hz
-
Up-sampling
Up-sampling by a factor of N inserts N-1 zero samples between the input samples.
Synthesis filter bank
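The two rate-change operations are simple enough to show directly. The sketch below operates on plain Python lists; it is only the sample-dropping and zero-insertion step, without the anti-aliasing and interpolation filters that a real filter bank places around it. Note that this `upsample` also appends N-1 zeros after the last sample, a common convention that keeps the output length at exactly N times the input length.

```python
def downsample(x, n):
    """Keep every n-th sample: x[0], x[n], x[2n], ..."""
    return x[::n]

def upsample(x, n):
    """Insert n-1 zeros after each input sample."""
    out = []
    for value in x:
        out.append(value)
        out.extend([0] * (n - 1))
    return out

print(downsample([0, 1, 2, 3, 4, 5, 6], 3))  # [0, 3, 6]
print(upsample([1, 2], 3))                   # [1, 0, 0, 2, 0, 0]
```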
-
Modulated filter banks
An improved filter bank:
Allows a larger window size.
Allows different window shapes.
Example:
-
Psychoacoustics
Block diagram of the perceptual audio coder
-
Structure of the ear
-
Sound pre-processing in the peripheral system
Frequency selectivity of the basilar membrane
-
Sound processing in the auditory system
Basilar membrane = filter bank
Cochlea
-
Sound processing in the auditory system (continued)
-
The human ear as a filter bank
The auditory system can be modeled as a filter bank of 25 overlapping band-pass filters covering 0 to 20 kHz.
The ear cannot distinguish sounds that occur simultaneously within the same band.
Each band is called a critical band.
The bandwidth of a critical band is about 100 Hz for signals below 500 Hz and increases linearly from 500 Hz to 5000 Hz.
1 bark = the width of one critical band:
$\text{Bark} = f/100$ for $f < 500$ Hz; $\text{Bark} = 9 + 4\log_2(f/1000)$ for $f \ge 500$ Hz
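The frequency-to-bark mapping can be written as a small function. The sketch below assumes the widely used piecewise form in which the logarithmic branch is 9 + 4*log2(f/1000); with that form the two branches meet at 5 bark when f = 500 Hz. The function name is illustrative.

```python
import math

def hz_to_bark(f_hz):
    """Approximate critical-band number (bark) for a frequency in Hz."""
    if f_hz < 500:
        return f_hz / 100                      # linear region: 100 Hz per bark
    return 9 + 4 * math.log2(f_hz / 1000)      # logarithmic region

for f in (100, 500, 1000, 4000):
    print(f, hz_to_bark(f))   # 1.0, 5.0, 9.0, 17.0
```

Two tones whose bark values differ by less than about 1 fall in the same critical band, which is where masking between them is strongest.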
-
Sound perception: frequency and the audio frequency range
-
Hearing threshold
The hearing threshold is a function of sound frequency.
When frequency components fall below the threshold level, sounds at those frequencies cannot be heard.
The human ear is most sensitive in the 2-4 kHz range.
-
Frequency masking
A signal whose pressure level is above the hearing threshold can still be masked by a stronger signal located close to it in frequency; the weaker signal then cannot be heard. The masking signal raises the hearing threshold around it.
-
Temporal masking
-
Perceptual audio coding
Split the signal into separate frequency bands using a filter bank.
Analyze the signal energy in the different bands and determine the total masking threshold of each band caused by the signals in the other bands.
Quantize the samples in each band with a precision proportional to the masking level.
Any signal below the masking level does not need to be coded.
Signals above the masking level are quantized, and bits are allocated across the bands so that each additional bit gives the largest reduction in perceived distortion.
-
MPEG standards
MPEG: the Moving Picture Experts Group of the International Organization for Standardization (ISO).
MPEG-1: defines audio and video coding standards and how audio and video bits are packetized with time synchronization.
  Total rate: 1.5 Mbps.
  Video (352x240 pels/frame, 30 frames/s): 30 Mbps reduced to 1.2 Mbps.
  Audio (2 channels, 48 k samples/s, 16 bits/sample): 2 x 768 kbps reduced to < 0.3 Mbps.
  Applications: web movies, MP3 audio, video CD.
MPEG-2: for higher-quality audio and video.
  Video: 720x480 pels/frame, 30 frames/s: 216 Mbps reduced to 3-5 Mbps.
  Audio (5.1 channels), Advanced Audio Coding (AAC).
MPEG-4: targets a wide variety of applications with a broad range of quality and bit rate, but the quality improvement is mainly at low bit rates. For internet audio and video streaming applications.
-
MPEG coding standard
MPEG's approach: MPEG standardizes only the bit-stream format and the decoder (it makes no recommendation on the encoding algorithm).
MPEG-1: bit rates from 32 kb/s to 448 kb/s. Three layers:
Layer 1: lowest complexity.
Layer 2: higher complexity and quality.
Layer 3: highest complexity; achieves the highest quality at low bit rates.
Target bit rates: 384 kbps; 256 kbps;
-
MPEG audio coding algorithm
Subband filtering.
Masking of each band by neighboring bands, using a psychoacoustic model.
Discarding bands whose level lies below the masking threshold.
Quantization / bit allocation / coding.
Bit-stream formatting.
-
Basic steps in MPEG-1 audio coding
Use convolution filters to split the audio signal into 32 subbands: subband filtering.
Determine the masking level for each band from its frequency (absolute threshold, or threshold in quiet) and from the energy of bands that are neighbors in frequency and in time (frequency masking and temporal masking).
If the energy in a band is below the masking threshold, do not code it.
Otherwise, determine the number of bits needed to represent the coefficients in that band so that the quantization noise stays below the masking level (each added bit lowers the quantization noise by 6 dB).
Bit-stream formatting: insert the appropriate headers, code the side information such as the quantized scale factors of the different bands, and code the result (using variable-length coding: Huffman).
-
The layers of MPEG-1
Layer 1:
  Frame length: 384 samples (8 ms at fs = 48 kHz)
  Frequency resolution: 32 subbands
  Quantization: block companding (12 samples); the amplitude of the subband samples is indicated by the scale factor (SCF), with a resolution of 2 dB.
Layer 2:
  Frame length: 1152 samples (24 ms at fs = 48 kHz)
  Frequency resolution: 32 subbands
  Quantization: block companding (12 samples), using scale factor (SCF) select information.
Layer 3:
  Frame length: 1152 samples (24 ms at fs = 48 kHz)
  Frequency resolution: 576/192 subbands
  Quantization: non-uniform with coding, using scale factor (SCF) select information.
-
MPEG-1 Layer 1: block diagram
-
MPEG-1 Layer 1: frame structure
-
MPEG-1 Layer 3
MP3 = ISO/IEC IS 11172-3 (MPEG-1 Layer 3) and 13818-3 (MPEG-2 Layer 3).
The file format has no header, and none is required.
The minimum encoder/decoder delay is 59 ms.
-
MPEG-1 Layer 3
Same basic structure as Layer 2: frame length (24 ms at 48 kHz), polyphase filter bank.
Differences:
Hybrid filter bank (MDCT) (32x18 = 576 subbands or 32x6 = 192 subbands). (Figure)
Non-uniform quantization.
Huffman coding.
Analysis-by-synthesis structure.
Variable bit rate support.
-
Structure of the MPEG-1 decoder
Decoding process for Layers 1 and 2
-
Decoding process for MPEG-1 Layer 3
-
MPEG decoding process: subband filter synthesis
-
Exercise 8
1. Name the three basic hearing properties that determine the lowest level at which a sound can be heard.
2. Suppose an audio signal is split into 16 frequency bands with the following levels in each band:
Band        1  2  3   4   5  6  7   8   9   10  11  12  13  14  15  16
Level (dB)  0  8  12  10  6  2  20  60  14  20  15  2   3   5   3   1
Assume that because band 8 is at 60 dB, it provides 12 dB of masking in band 7 and 15 dB in band 9. Determine the number of bits needed to code bands 7 and 9, given that the original signal is represented with 8 bits/sample/band.
3. Name the three basic steps in perceptual audio coding. Draw a block diagram showing these three components.
4. Describe the differences between the layers of MPEG audio in terms of the techniques used and the audio quality at the same bit rate.
-
Solutions
Exercise 1: The absolute hearing threshold, frequency masking, and temporal masking. The absolute hearing threshold gives the lowest audible level when only a single tone is present; it depends on frequency.
Exercise 2: The level in band 7 is 20 dB, which is above the 12 dB masking level, so the band must be coded. Because of the 12 dB of masking, it can be coded with 12 dB of quantization noise, which saves 2 bits. Only 6 bits are needed.
For band 9, the signal level is 14 dB, below the 15 dB masking level. Band 9 does not need to be coded.
Exercise 3:
Subband filtering produces the subband signals in the different frequency bands.
Compute the masking level from the energies of the different subbands.
Allocate bits among the subbands based on the masking level of each subband.
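The arithmetic in the solution to Exercise 2 can be sketched as a small rule: a band at or below its masking level gets 0 bits; otherwise each full 6 dB of masking saves one bit from the original word length. This is a simplified model for the exercise, not the bit allocation procedure of the MPEG standard, and the function name and defaults are illustrative.

```python
def bits_for_band(level_db, mask_db, original_bits=8, db_per_bit=6):
    """Bits needed for a subband given its level and the masking level it sees."""
    if level_db <= mask_db:
        return 0                                   # fully masked: do not code
    saved = int(mask_db // db_per_bit)             # each 6 dB of masking saves 1 bit
    return original_bits - saved

print(bits_for_band(20, 12))  # band 7 -> 6
print(bits_for_band(14, 15))  # band 9 -> 0
```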