Reproducing the feature outputs of common programs in Matlab using melfcc

Post on 16-Nov-2015

  • 25/3/2015 Reproducing the feature outputs of common programs in Matlab using melfcc.m

    http://www.ee.columbia.edu/ln/rosa/matlab/rastamat/mfccs.html

    Reproducing the feature outputs of common programs using Matlab and melfcc.m

    When I decided to implement my own version of warped-frequency cepstral features (such as MFCC) in Matlab, I wanted to be able to duplicate the output of the common programs used for these features, as well as to be able to invert the outputs of those programs. This page gives some examples of how cepstra can be calculated by three common programs (HTK's HCopy, feacalc from SPRACHcore, and mfcc.m from Malcolm Slaney's Auditory Toolbox for Matlab), and how to duplicate the results (or very nearly) using my melfcc.m routine. This also automatically shows you how to invert cepstra calculated by either path into spectrograms or waveforms using invmelfcc.m, since its arguments are the same.

    HTK MFCC

    2013-02-26: For an emulation of HTK's MFCC calculation accurate to the 3rd decimal place, see the modified rastamat code in calc_mfcc. The main differences were that HTK applies pre-emphasis independently on each window, and also removes the mean on each window.
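The per-window processing mentioned in this note can be sketched as follows. This is only an illustration, not the code from HTK or calc_mfcc: the 4-sample frame is made up, the coefficient 0.97 is the PREEMCOEF value from the config file used on this page, and the treatment of the first sample in each window (scaling by 1-k) is an assumption made for the sketch.

```matlab
% Sketch of per-frame mean removal and pre-emphasis (illustrative values only).
k = 0.97;                 % pre-emphasis coefficient (PREEMCOEF)
frame = [1; 2; 3; 4];     % hypothetical 4-sample analysis window

% 1. Remove the mean of this window only (not of the whole signal).
frame0 = frame - mean(frame);

% 2. Pre-emphasize inside the window: y(n) = x(n) - k*x(n-1).
%    The first sample has no predecessor within the window; scaling it
%    by (1-k) is an assumption of this sketch.
pre = [frame0(1) * (1 - k); frame0(2:end) - k * frame0(1:end-1)];

disp(pre');
```

Applying the same filter once to the whole waveform gives different values at the start of every window, which is the kind of small discrepancy the note describes.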

    Calculating features in HTK is done via HCopy, which can convert between a wide range of representations including waveform to cepstra. HCopy takes its options from a config file. Thus, to convert 16 kHz sampled sound files to standard Mel-frequency cepstral coefficients (MFCCs), you would have a file config.mfcc containing:

        SOURCEKIND   = WAVEFORM
        SOURCEFORMAT = WAVE
        SOURCERATE   = 625
        TARGETKIND   = MFCC_0
        TARGETRATE   = 100000.0
        WINDOWSIZE   = 250000.0
        USEHAMMING   = T
        PREEMCOEF    = 0.97
        NUMCHANS     = 20
        CEPLIFTER    = 22
        NUMCEPS      = 12

    (The SOURCEFORMAT option specifies that the wave files are in MS WAVE format.) Then to calculate the features, you simply run HCopy from the Unix command line:

        $ HCopy -C config.mfcc sa1.wav sa1mfcc.htk
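The time-valued settings in config.mfcc are easier to compare with melfcc's arguments once they are converted to seconds and samples. The sketch below assumes HTK's convention of expressing times in units of 100 ns; the variable names are illustrative and are not part of HTK or rastamat.

```matlab
% Convert the HTK config times (assumed to be in 100 ns units) to seconds and samples.
htk_unit   = 100e-9;      % one HTK time unit, in seconds (assumption: 100 ns)
sourcerate = 625;         % SOURCERATE: sample period
targetrate = 100000.0;    % TARGETRATE: frame period (hop)
windowsize = 250000.0;    % WINDOWSIZE: analysis window duration

sr      = 1 / (sourcerate * htk_unit);   % sampling rate in Hz
hoptime = targetrate * htk_unit;         % hop in seconds
wintime = windowsize * htk_unit;         % window in seconds
winlen  = round(wintime * sr);           % window length in samples
hoplen  = round(hoptime * sr);           % hop length in samples

fprintf('sr = %g Hz, window = %g s (%d samples), hop = %g s (%d samples)\n', ...
        sr, wintime, winlen, hoptime, hoplen);
```

The resulting 25 ms window and 10 ms hop are, as far as I know, melfcc's default 'wintime' and 'hoptime', which would explain why the Matlab emulation does not need to set them explicitly.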

    We can emulate this processing in Matlab, and compare the results, as below: (Note that the ">>" at the start of each line is an image, so you can cut and copy multiple lines of text directly into Matlab without having to worry about the prompts).

        % Load a speech waveform
        % (wavread is no longer available in recent MATLAB releases;
        %  audioread is the replacement there)
        [d,sr] = wavread('sa1.wav');
        % Calculate HTK-style MFCCs
        % (a negative lifterexp selects HTK-style liftering)
        mfc = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
              'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
        % Load the features from HCopy and compare:
        htkmfc = readhtk('sa1mfcc.htk');
        % Reorder and scale to be like melfcc output
        htkmfc = 2*htkmfc(:, [13 [1:12]])';
        % (melfcc.m is 2x HCopy because it deals in power, not magnitude, spectra)
        subplot(311)
        imagesc(htkmfc); axis xy; colorbar
        title('HTK MFCC');
        subplot(312)
        imagesc(mfc); axis xy; colorbar
        title('melfcc MFCC');
        subplot(313)
        imagesc(htkmfc - mfc); axis xy; colorbar
        title('difference HTK - melfcc');
        % Difference occasionally peaks at as much as a few percent (unexplained),
        % but is basically negligible
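To put a number on the visual comparison, one option is the largest absolute difference relative to the largest reference value. The matrices here are small made-up stand-ins, not real HTK or melfcc output; with real data, `ref` and `est` would be the two feature matrices being compared, assuming they have the same number of frames.

```matlab
% Sketch: summarize the mismatch between two feature matrices (cepstra x frames).
% The values are hypothetical placeholders for real feature matrices.
ref = [10 -4 2; 3 1 -1];          % reference features (e.g. from an external tool)
est = [10.1 -4 2; 3 0.98 -1];     % emulated features

diffmat = ref - est;
maxabs  = max(abs(diffmat(:)));          % largest absolute difference
relmax  = maxabs / max(abs(ref(:)));     % relative to the largest reference value

fprintf('max abs diff = %g (%.2f%% of peak)\n', maxabs, 100 * relmax);
```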

        % Invert the HTK features back to waveform, auditory spectrogram,
        % regular spectrogram (same args as melfcc())
        [dr,aspec,spec] = invmelfcc(htkmfc, sr, 'lifterexp', -22, 'nbands', 20, ...
              'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
        subplot(311)
        imagesc(10*log10(spec)); axis xy; colorbar
        title('Short-time power spectrum inverted from HTK MFCCs')
        subplot(312)
        specgram(dr,512,sr); colorbar
        title('Spectrogram of reconstructed (noise-excited) waveform');


        subplot(313)
        specgram(d,512,sr); colorbar
        title('Original signal spectrogram');
        % Spectrograms look pretty close, although noise-excitation
        % of reconstruction gives it a weird 'whispering crowd' sound

    HTK PLP

    HTK can also calculate PLP features. It turns out that these are somewhat different from the MFCC features because the cepstra are calculated by a different algorithm. However, we can still emulate and invert them with different parameters. To calculate PLP features with HCopy, we need a new config file, config.plp:

        SOURCEKIND   = WAVEFORM
        SOURCEFORMAT = WAVE
        SOURCERATE   = 625
        TARGETKIND   = PLP_0
        TARGETRATE   = 100000.0
        WINDOWSIZE   = 250000.0
        USEHAMMING   = T
        PREEMCOEF    = 0.97
        NUMCHANS     = 20
        CEPLIFTER    = 22
        NUMCEPS      = 12
        USEPOWER     = T
        LPCORDER     = 12

    (TARGETKIND is changed, and USEPOWER and LPCORDER are added). Then we calculate the features:

        $ HCopy -C config.plp sa1.wav sa1plp.htk

    .. and compare to the Matlab version:

        [d,sr] = wavread('sa1.wav');
        % Calculate HTK-style PLPs
        plp = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
              'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
              'modelorder', 12, 'usecmp', 1);
        % Load the HCopy features
        htkplp = readhtk('sa1plp.htk');
        % Reorder (no scaling in this case)
        htkplp = htkplp(:, [13 [1:12]])';
        subplot(311)
        imagesc(htkplp); axis xy; colorbar
        title('HTK PLP');
        subplot(312)
        imagesc(plp); axis xy; colorbar
        title('melfcc PLP');
        subplot(313)
        imagesc(htkplp - plp); axis xy; colorbar
        title('difference HTK - melfcc');
        % Unexplained differences can be up to 20% for higher-order
        % cepstra, but essentially the same
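The indexing `[13 [1:12]]` used when loading HCopy output moves the last column to the front and the trailing `'` transposes to one column per frame. The need for it suggests that the `_0` target kinds store c0 after c1..c12 while melfcc puts c0 first; that ordering is inferred from the reordering step itself. A toy example with made-up numbers:

```matlab
% Sketch: reorder a frames-x-13 matrix stored as [c1..c12, c0]
% into a 13-x-frames matrix ordered [c0; c1; ...; c12].
% The values are hypothetical: each entry encodes its cepstral index.
htkfeat = [1:12, 0; ...          % frame 1: c1..c12, then c0
           101:112, 100];        % frame 2: c1..c12, then c0

reordered = htkfeat(:, [13 1:12])';   % c0 first, then transpose

disp(size(reordered));    % rows = coefficients, columns = frames
disp(reordered(1:3, :));  % c0, c1, c2 for both frames
```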

        % Invert the HTK features back again by mirroring args to melfcc
        [dr,aspec,spec] = invmelfcc(htkplp, sr, 'lifterexp', -22, 'nbands', 20, ...
              'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
              'modelorder', 12, 'usecmp', 1);
        subplot(311)
        imagesc(10*log10(spec)); axis xy; colorbar
        title('Short-time power spectrum inverted from HTK PLPs')
        subplot(312)
        specgram(dr,512,sr); colorbar
        title('Spectrogram of reconstructed (noise-excited) waveform');
        subplot(313)
        specgram(d,512,sr); colorbar
        title('Original signal spectrogram');
        % Pretty close

    feacalc MFCC


    feacalc is the main feature calculation program from ICSI's SPRACHcore package. It's actually a wrapper around the older rasta, which was the original C-language implementation of RASTA and PLP feature calculation. feacalc has been expanded to be able to calculate (its own version of) MFCC features, so to parallel the HTK examples above, we'll start with feacalc's MFCC feature. They can be calculated with the following command line:

        $ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -plp no \
            -dom cep -com no -frq mel -filt tri -cep 13 -opf htk \
            sa1.wav -o sa1fcmfc.htk

    and we duplicate this in Matlab as follows:

        [d,sr] = wavread('sa1.wav');
        % Calculate feacalc-style MFCCs
        % (scale to match normalization of Mel filters)
        mfc2 = melfcc(d*5.5289, sr, 'lifterexp', 0.6, 'nbands', 19, ...
              'dcttype', 4, 'maxfreq', 8000, 'fbtype', 'fcmel', 'preemph', 0);
        % Load the feacalc features (written in HTK format)
        fcmfc = readhtk('sa1fcmfc.htk');
        % No need to reorder or scale, just transpose
        fcmfc = fcmfc';
        subplot(311)
        imagesc(fcmfc(2:13,:)); axis xy; colorbar
        title('feacalc MFCC');
        subplot(312)
        imagesc(mfc2(2:13,:)); axis xy; colorbar
        title('melfcc MFCC (feacalc-style)');
        subplot(313)
        imagesc(fcmfc - mfc2); axis xy; colorbar
        title('difference feacalc - melfcc');
        % Small differences in high-order cepstra due to
        % cumulative errors in Mel filter shapes

    .. and inverting works just the same as above.

    feacalc PLP

    feacalc was originally designed to calculate PLP (and Rasta) features, so this is its more 'native' invocation:

        $ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -dom cep -plp 12 \
            -opf htk sa1.wav -o sa1fcplp.htk

    .. which we duplicate in Matlab as follows:

        [d,sr] = wavread('sa1.wav');
        % Calculate feacalc-style PLPs
        plp2 = melfcc(d, sr, 'lifterexp', 0.6, 'nbands', 21, ...
              'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'bark', 'preemph', 0, ...
              'numcep', 13, 'modelorder', 12, 'usecmp', 1);
        % Load the feacalc features (written in HTK format)
        fcplp = readhtk('sa1fcplp.htk');
        % just transpose
        fcplp = fcplp';
        subplot(311)
        imagesc(fcplp(2:13,:)); axis xy; colorbar
        title('feacalc PLP');
        subplot(312)
        imagesc(plp2(2:13,:)); axis xy; colorbar
        title('melfcc PLP (feacalc-style)');
        subplot(313)
        imagesc(fcplp - plp2); axis xy; colorbar
        title('difference feacalc - melfcc');
        % A few localized differences due to windows etc.

    .. and once again inverting works just the same as above.

    Auditory Toolbox mfcc.m

    The most popular tool for calculating MFCCs in Matlab is mfcc.m from Malcolm Slaney's Auditory Toolbox. This is what I used for a long time, until I needed something with more flexibility. That flexibility includes being able to duplicate mfcc.m. Here's how we can compare them in Matlab.

        [d,sr] = wavread('sa1.wav');
        % Calculate MFCCs using mfcc.m from the Auditory Toolbox
        % (gain should be 2^15 because melfcc scales by that amount,
        %  but in this case mfcc uses 2x FFT len)
        ce = mfcc(d*(2^14), sr);
        % Scale them to match (log_10 and power)
        ce = log(10)*2*ce;


        % Duplicate with melfcc.m
        mfc3 = melfcc(d, sr, 'lifterexp', 0, 'minfreq', 133.33, ...
              'maxfreq', 6855.6, 'wintime', 0.016, 'sumpower', 0);
        % .. and compare:
        subplot(311)
        imagesc(ce(2:13,:)); axis xy; colorbar
        title('Auditory Toolbox MFCC');
        subplot(312)
        imagesc(mfc3(2:13,:)); axis xy; colorbar
        title('melfcc MFCC (AudToolbox-style)');
        subplot(313)
        imagesc(ce - mfc3); axis xy; colorbar
        title('difference AudTBox - melfcc');
        % Small differences mainly due to hanning vs. hamming
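The rescaling `ce = log(10)*2*ce` can be read as converting a base-10 log of a magnitude into a natural log of a power, since ln(x^2) = 2 ln(10) log10(x); because the DCT that produces the cepstra is linear, the same factor carries through to the coefficients. This reading of the "(log_10 and power)" comment is an interpretation, and the magnitude values below are made up purely to check the identity.

```matlab
% Sketch: natural log of power equals 2*ln(10) times log10 of magnitude.
mag = [0.5 1 2 10];                     % hypothetical magnitude values

log10mag  = log10(mag);                 % log_10 of magnitude
lnpow     = log(mag .^ 2);              % natural log of power
converted = log(10) * 2 * log10mag;     % same factor as applied to ce above

disp(max(abs(converted - lnpow)));      % expected to be at rounding-error level
```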

    Notes on the differences between different MFCCs

    • Mel mapping function
    • Mel filter normalization
    • DCT used to calculate cepstrum
    • Number of Mel bands (and hence their width)
    • Frequency span of Mel bands
    • Liftering - rasta, htk, none
    • Details of initial STFT (odd/even hann/hamm, fft length, window length)
    • Mel integration in linear or power domain
    • Dither and DC removal
    • Preemphasis
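As an illustration of the first few items, the sketch below uses the widely quoted HTK-style Mel formula, mel = 2595 log10(1 + f/700), to place the band edges for 20 bands between 0 and 8000 Hz (the NUMCHANS and 'maxfreq' values used in the HTK examples). The helper names are hypothetical, and this is not claimed to be the exact code inside melfcc.m or HTK; other toolkits use different mapping functions, which is one reason their cepstra differ.

```matlab
% Sketch: band edges for a Mel filterbank using the HTK-style Mel formula.
% hz2mel_htk / mel2hz_htk are illustrative helper names, not rastamat functions.
hz2mel_htk = @(f) 2595 * log10(1 + f / 700);
mel2hz_htk = @(m) 700 * (10 .^ (m / 2595) - 1);

nbands  = 20;      % number of Mel bands
minfreq = 0;       % lower edge of the filterbank in Hz
maxfreq = 8000;    % upper edge of the filterbank in Hz

% nbands overlapping triangular filters need nbands+2 edge points,
% equally spaced on the Mel axis.
edges_mel = linspace(hz2mel_htk(minfreq), hz2mel_htk(maxfreq), nbands + 2);
edges_hz  = mel2hz_htk(edges_mel);

% Width of each band in Hz grows with frequency.
bandwidth_hz = edges_hz(3:end) - edges_hz(1:end-2);
disp(round(edges_hz));
```

Changing nbands or the frequency span changes every band's width, so two programs with nominally the same features can still disagree unless these settings match.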

    Last updated: $Date: 2013/02/26 17:00:16 $

    Dan Ellis

    http://www.ee.columbia.edu/~dpwe/

    mailto:dpwe@ee.columbia.edu
