reproducing the feature outputs of common programs in matlab using melfcc
Post on 16-Nov-2015
25/3/2015 Reproducing the feature outputs of common programs in Matlab using melfcc.m
http://www.ee.columbia.edu/ln/rosa/matlab/rastamat/mfccs.html 1/4
Reproducing the feature outputs of common programs using Matlab and melfcc.m

When I decided to implement my own version of warped-frequency cepstral features (such as MFCC) in Matlab, I wanted to be able to duplicate the output of the common programs used for these features, as well as to be able to invert the outputs of those programs. This page gives some examples of how cepstra can be calculated by three common programs (HTK's HCopy, feacalc from SPRACHcore, and mfcc.m from Malcolm Slaney's Auditory Toolbox for Matlab), and how to duplicate the results (or very nearly) using my melfcc.m routine. This also automatically shows you how to invert cepstra calculated by either path into spectrograms or waveforms using invmelfcc.m, since its arguments are the same.

HTK MFCC

2013-02-26: For an emulation of HTK's MFCC calculation accurate to the 3rd decimal place, see the modified rastamat code in calc_mfcc (http://labrosa.ee.columbia.edu/projects/calc_mfcc/). The main differences were that HTK applies pre-emphasis independently on each window, and also removes the mean on each window.
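To make the per-window behaviour concrete, here is a rough sketch of framing a signal and then removing the mean and applying pre-emphasis inside each window separately. This is not the calc_mfcc code: the test tone, the variable names, the order of the two steps, and the way the first sample of each window is scaled are all assumptions made for illustration.

```matlab
% Sketch only: per-window mean removal and pre-emphasis (assumed ordering)
sr     = 16000;                          % sample rate (Hz)
x      = sin(2*pi*440*(0:1599)/sr);      % placeholder signal: 0.1 s of a 440 Hz tone
winlen = round(0.025 * sr);              % 25 ms window -> 400 samples
hop    = round(0.010 * sr);              % 10 ms hop    -> 160 samples
k      = 0.97;                           % pre-emphasis coefficient (PREEMCOEF)

nframes = 1 + floor((length(x) - winlen) / hop);
frames  = zeros(winlen, nframes);
for i = 1:nframes
    seg = x((i-1)*hop + (1:winlen));     % extract one window
    seg = seg - mean(seg);               % remove the mean of this window only
    % Pre-emphasis restarted inside the window; scaling the first sample
    % by (1-k) is an assumption of this sketch
    seg = [seg(1)*(1-k), seg(2:end) - k*seg(1:end-1)];
    frames(:, i) = seg(:);               % one processed window per column
end
```

The contrast is with filtering the whole waveform once before framing, where each window inherits filter state (and any DC offset) from the samples before it.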
Calculating features in HTK is done via HCopy, which can convert between a wide range of representations including waveform to cepstra. HCopy takes its options from a config file. Thus, to convert 16 kHz sampled sound files to standard Mel-frequency cepstral coefficients (MFCCs), you would have a file config.mfcc containing:

    SOURCEKIND = WAVEFORM
    SOURCEFORMAT = WAVE
    SOURCERATE = 625
    TARGETKIND = MFCC_0
    TARGETRATE = 100000.0
    WINDOWSIZE = 250000.0
    USEHAMMING = T
    PREEMCOEF = 0.97
    NUMCHANS = 20
    CEPLIFTER = 22
    NUMCEPS = 12

(The SOURCEFORMAT option specifies that the wave files are in MS WAVE format.) Then to calculate the features, you simply run HCopy from the Unix command line:

    $ HCopy -C config.mfcc sa1.wav sa1mfcc.htk
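It can help to translate the config values into the quantities used on the Matlab side. The sketch below assumes HTK's convention of expressing times in units of 100 ns, and the sine-shaped liftering formula from the HTK Book; the variable names are only for illustration and are not melfcc.m arguments.

```matlab
% Interpreting the HTK config values (sketch, assumed conventions as noted)
htk_unit   = 100e-9;       % HTK time unit: 100 ns, in seconds
SOURCERATE = 625;          % sample period, from config.mfcc
TARGETRATE = 100000.0;     % frame period
WINDOWSIZE = 250000.0;     % analysis window length
CEPLIFTER  = 22;           % liftering parameter L
NUMCEPS    = 12;           % number of cepstra (excluding c0)

sr      = 1 / (SOURCERATE * htk_unit);   % sample rate in Hz (about 16 kHz)
hoptime = TARGETRATE * htk_unit;         % frame advance in seconds
wintime = WINDOWSIZE * htk_unit;         % window length in seconds

% HTK-style lifter weights for c1..c12: 1 + (L/2)*sin(pi*n/L)
n    = 1:NUMCEPS;
lift = 1 + (CEPLIFTER/2) * sin(pi * n / CEPLIFTER);

fprintf('sr = %g Hz, hop = %g s, window = %g s\n', sr, hoptime, wintime);
disp(lift);
```

The window and hop come out at 25 ms and 10 ms, which appear to match the default wintime and hoptime in melfcc.m, so they do not need to be passed explicitly.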
We can emulate this processing in Matlab, and compare the results, as below: (Note that the ">>" at the start of each line is an image, so you can cut and copy multiple lines of text directly into Matlab without having to worry about the prompts).

    % Load a speech waveform
    % (wavread was removed from recent Matlab releases; audioread is the replacement)
    [d,sr] = wavread('sa1.wav');
    % Calculate HTK-style MFCCs
    % (a negative lifterexp selects HTK-style sine liftering)
    mfc = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
                 'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
    % Load the features from HCopy and compare:
    htkmfc = readhtk('sa1mfcc.htk');
    % Reorder and scale to be like melfcc output
    htkmfc = 2*htkmfc(:, [13 [1:12]])';
    % (melfcc.m is 2x HCopy because it deals in power, not magnitude, spectra)
    subplot(311)
    imagesc(htkmfc); axis xy; colorbar
    title('HTK MFCC');
    subplot(312)
    imagesc(mfc); axis xy; colorbar
    title('melfcc MFCC');
    subplot(313)
    imagesc(htkmfc - mfc); axis xy; colorbar
    title('difference HTK - melfcc');
    % Difference occasionally peaks at as much as a few percent (unexplained),
    % but is basically negligible

    % Invert the HTK features back to waveform, auditory spectrogram,
    % regular spectrogram (same args as melfcc())
    [dr,aspec,spec] = invmelfcc(htkmfc, sr, 'lifterexp', -22, 'nbands', 20, ...
                                'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
    subplot(311)
    imagesc(10*log10(spec)); axis xy; colorbar
    title('Short-time power spectrum inverted from HTK MFCCs')
    subplot(312)
    specgram(dr, 512, sr); colorbar
    title('Spectrogram of reconstructed (noise-excited) waveform');
    subplot(313)
    specgram(d, 512, sr); colorbar
    title('Original signal spectrogram');
    % Spectrograms look pretty close, although noise excitation
    % of reconstruction gives it a weird 'whispering crowd' sound

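The reordering step in the comparison above relies on HTK's MFCC_0 layout, where c0 is stored after c1..c12 in each frame. The sketch below shows that reorder on made-up data, and one way to put a number on the "few percent" difference instead of reading it off the colorbar. The placeholder matrices and the choice of normalising by the largest feature magnitude are assumptions for illustration.

```matlab
% Placeholder standing in for readhtk output: 5 frames x 13 columns,
% columns 1..12 = c1..c12, column 13 = c0 (HTK's MFCC_0 layout)
nframes = 5;
htkraw  = repmat([1:12, 0], nframes, 1);   % each entry holds its own cepstral index

% Move c0 to the front and transpose to coefficients x frames,
% which is the layout melfcc returns
htk = htkraw(:, [13 1:12])';

% Hypothetical "melfcc" output: the same values with a small perturbation
mfc = htk + 0.01 * ones(size(htk));

% Peak absolute difference, relative to the largest feature magnitude
relpeak = max(abs(htk(:) - mfc(:))) / max(abs(htk(:)));
fprintf('peak difference = %.2f%% of max magnitude\n', 100 * relpeak);
```

With real features, htk would be the reordered readhtk output and mfc the melfcc result; any scale factor (such as the 2x above) has to be applied before taking the difference.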
HTK PLP

HTK can also calculate PLP features. It turns out that these are somewhat different from the MFCC features because the cepstra are calculated by a different algorithm. However, we can still emulate and invert them with different parameters. To calculate PLP features with HCopy, we need a new config file, config.plp:

    SOURCEKIND = WAVEFORM
    SOURCEFORMAT = WAVE
    SOURCERATE = 625
    TARGETKIND = PLP_0
    TARGETRATE = 100000.0
    WINDOWSIZE = 250000.0
    USEHAMMING = T
    PREEMCOEF = 0.97
    NUMCHANS = 20
    CEPLIFTER = 22
    NUMCEPS = 12
    USEPOWER = T
    LPCORDER = 12

(TARGETKIND is changed, and USEPOWER and LPCORDER are added). Then we calculate the features:

    $ HCopy -C config.plp sa1.wav sa1plp.htk

.. and compare to the Matlab version:

    [d,sr] = wavread('sa1.wav');
    % Calculate HTK-style PLPs
    plp = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
                 'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
                 'modelorder', 12, 'usecmp', 1);
    % Load the HCopy features
    htkplp = readhtk('sa1plp.htk');
    % Reorder (no scaling in this case)
    htkplp = htkplp(:, [13 [1:12]])';
    subplot(311)
    imagesc(htkplp); axis xy; colorbar
    title('HTK PLP');
    subplot(312)
    imagesc(plp); axis xy; colorbar
    title('melfcc PLP');
    subplot(313)
    imagesc(htkplp - plp); axis xy; colorbar
    title('difference HTK - melfcc');
    % Unexplained differences can be up to 20% for higher-order
    % cepstra, but essentially the same

    % Invert the HTK features back again by mirroring args to melfcc
    [dr,aspec,spec] = invmelfcc(htkplp, sr, 'lifterexp', -22, 'nbands', 20, ...
                                'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
                                'modelorder', 12, 'usecmp', 1);
    subplot(311)
    imagesc(10*log10(spec)); axis xy; colorbar
    title('Short-time power spectrum inverted from HTK PLPs')
    subplot(312)
    specgram(dr, 512, sr); colorbar
    title('Spectrogram of reconstructed (noise-excited) waveform');
    subplot(313)
    specgram(d, 512, sr); colorbar
    title('Original signal spectrogram');
    % Pretty close

feacalc MFCC

feacalc is the main feature calculation program from ICSI's SPRACHcore package. It's actually a wrapper around the older rasta, which was the original C-language implementation of RASTA and PLP feature calculation. feacalc has been expanded to be able to calculate (its own version of) MFCC features, so to parallel the HTK examples above, we'll start with feacalc's MFCC feature. They can be calculated with the following command line:

    $ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -plp no \
        -dom cep -com no -frq mel -filt tri -cep 13 -opf htk \
        sa1.wav -o sa1fcmfc.htk

and we duplicate this in Matlab as follows:

    [d,sr] = wavread('sa1.wav');
    % Calculate feacalc-style MFCCs
    % (scale to match normalization of Mel filters)
    mfc2 = melfcc(d*5.5289, sr, 'lifterexp', 0.6, 'nbands', 19, ...
                  'dcttype', 4, 'maxfreq', 8000, 'fbtype', 'fcmel', 'preemph', 0);
    % Load the feacalc features
    fcmfc = readhtk('sa1fcmfc.htk');
    % No need to reorder or scale, just transpose
    fcmfc = fcmfc';
    subplot(311)
    imagesc(fcmfc(2:13,:)); axis xy; colorbar
    title('feacalc MFCC');
    subplot(312)
    imagesc(mfc2(2:13,:)); axis xy; colorbar
    title('melfcc MFCC (feacalc-style)');
    subplot(313)
    imagesc(fcmfc - mfc2); axis xy; colorbar
    title('difference feacalc - melfcc');
    % Small differences in high-order cepstra due to
    % cumulative errors in Mel filter shapes

.. and inverting works just the same as above.

feacalc PLP

feacalc was originally designed to calculate PLP (and Rasta) features, so this is its more 'native' invocation:

    $ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -dom cep -plp 12 \
        -opf htk sa1.wav -o sa1fcplp.htk

.. which we duplicate in Matlab as follows:

    [d,sr] = wavread('sa1.wav');
    % Calculate feacalc-style PLPs
    plp2 = melfcc(d, sr, 'lifterexp', 0.6, 'nbands', 21, ...
                  'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'bark', 'preemph', 0, ...
                  'numcep', 13, 'modelorder', 12, 'usecmp', 1);
    % Load the feacalc features
    fcplp = readhtk('sa1fcplp.htk');
    % just transpose
    fcplp = fcplp';
    subplot(311)
    imagesc(fcplp(2:13,:)); axis xy; colorbar
    title('feacalc PLP');
    subplot(312)
    imagesc(plp2(2:13,:)); axis xy; colorbar
    title('melfcc PLP (feacalc-style)');
    subplot(313)
    imagesc(fcplp - plp2); axis xy; colorbar
    title('difference feacalc - melfcc');
    % A few localized differences due to windows etc.

.. and once again inverting works just the same as above.

Auditory Toolbox mfcc.m

The most popular tool for calculating MFCCs in Matlab is mfcc.m from Malcolm Slaney's Auditory Toolbox. This is what I used for a long time, until I needed something with more flexibility. That flexibility includes being able to duplicate mfcc.m. Here's how we can compare them in Matlab.

    [d,sr] = wavread('sa1.wav');
    % Calculate MFCCs using mfcc.m from the Auditory Toolbox
    % (gain should be 2^15 because melfcc scales by that amount,
    % but in this case mfcc uses 2x FFT len)
    ce = mfcc(d*(2^14), sr);
    % Scale them to match (log_10 and power)
    ce = log(10)*2*ce;
    % Duplicate with melfcc.m
    mfc3 = melfcc(d, sr, 'lifterexp', 0, 'minfreq', 133.33, ...
                  'maxfreq', 6855.6, 'wintime', 0.016, 'sumpower', 0);
    % .. and compare:
    subplot(311)
    imagesc(ce(2:13,:)); axis xy; colorbar
    title('Auditory Toolbox MFCC');
    subplot(312)
    imagesc(mfc3(2:13,:)); axis xy; colorbar
    title('melfcc MFCC (AudToolbox-style)');
    subplot(313)
    imagesc(ce - mfc3); axis xy; colorbar
    title('difference AudTBox - melfcc');
    % Small differences mainly due to hanning vs. hamming

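The log(10)*2 factor in the code above can be checked in isolation. The sketch below assumes that one side takes log10 of a magnitude while the other takes the natural log of the corresponding power, which is what the "log_10 and power" comment suggests; the values in X are arbitrary placeholders, not real filterbank outputs.

```matlab
% Why ce is multiplied by log(10)*2 (sketch under the stated assumption)
X = [0.5 1 2 10 1234.5];            % placeholder magnitude values

ce_like  = log10(X);                % log10 of magnitude
mfc_like = log(X.^2);               % natural log of power

% ln(X^2) = 2*ln(X) = 2*ln(10)*log10(X)
rescaled = log(10) * 2 * ce_like;

maxdiff = max(abs(rescaled - mfc_like));
fprintf('largest mismatch after rescaling: %g\n', maxdiff);
```

Because the DCT is linear, the same constant that relates the two log spectra also relates the resulting cepstra, so it can be applied after the fact to ce.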
Notes on the differences between different MFCCs

- Mel mapping function
- Mel filter normalization
- DCT used to calculate cepstrum
- Number of Mel bands (and hence their width)
- Frequency span of Mel bands
- Liftering: rasta, htk, none
- Details of initial STFT (odd/even, hann/hamm, fft length, window length)
- Mel integration in linear or power domain
- Dither and DC removal
- Preemphasis
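
As an illustration of the first item, here is a sketch of two Mel mapping functions side by side. The first is the closed-form mapping used by HTK; the second is a piecewise linear-then-logarithmic mapping in the style of the Auditory Toolbox. The constants in the second (200/3 Hz linear step, 1000 Hz break frequency, 27 log steps per factor of 6.4) are my understanding of that convention and should be treated as assumptions, as should the function handle names.

```matlab
% Two Mel mapping functions (sketch; constants of the second are assumptions)

% HTK-style: a single log formula
hz2mel_htk = @(f) 2595 * log10(1 + f / 700);

% Auditory Toolbox-style: linear below the break frequency, log above it
f_sp    = 200/3;                   % Hz per unit in the linear region
brkfrq  = 1000;                    % break frequency in Hz
brkpt   = brkfrq / f_sp;           % warped value at the break (15)
logstep = exp(log(6.4) / 27);      % frequency ratio per unit in the log region
hz2mel_slaney = @(f) (f < brkfrq)  .* (f / f_sp) + ...
                     (f >= brkfrq) .* (brkpt + log(max(f, brkfrq) / brkfrq) / log(logstep));

f = [0 500 1000 4000 8000];
disp([f; hz2mel_htk(f); hz2mel_slaney(f)]);
```

The two functions use different units, so the raw numbers are not comparable; what matters for the features is where equally spaced points on each warped axis land in Hz, since that sets the band centers and widths.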

Last updated: $Date: 2013/02/26 17:00:16 $
Dan Ellis <dpwe@ee.columbia.edu>
http://www.ee.columbia.edu/~dpwe/