Note: Within nine months of the publication of the mention of the grant of the European patent in the European Patent Bulletin, any person may give notice to the European Patent Office of opposition to that patent, in accordance with the Implementing Regulations. Notice of opposition shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Printed by Jouve, 75001 PARIS (FR)

(11) EP 1 808 047 B1
(12) EUROPEAN PATENT SPECIFICATION
(45) Date of publication and mention of the grant of the patent: 17.06.2015 Bulletin 2015/25
(21) Application number: 05807484.0
(22) Date of filing: 31.10.2005
(51) Int Cl.: G10L 19/008 (2013.01), H04S 5/02 (2006.01)
(86) International application number: PCT/EP2005/011664
(87) International publication number: WO 2006/048227 (11.05.2006 Gazette 2006/19)

(54) MULTICHANNEL AUDIO SIGNAL DECODING USING DE-CORRELATED SIGNALS
DEKODIERUNG VON MEHRKANALTONSIGNALEN UNTER VERWENDUNG DEKORRELIERTER SIGNALE
DECODAGE DE SIGNAUX AUDIO MULTICANAL A SIGNAUX DECORRELES

(84) Designated Contracting States: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR
(30) Priority: 02.11.2004 SE 0402649
(43) Date of publication of application: 18.07.2007 Bulletin 2007/29

(73) Proprietors:
• Dolby International AB, 1101 CN Amsterdam Zuid-Oost (NL)
• Koninklijke Philips N.V., 5656 AE Eindhoven (NL)

(72) Inventors:
• PURNHAGEN, Heiko, S-17265 Sundbyberg (SE)
• ENGDEGARD, Jonas, S-113 52 Stockholm (SE)
• BREEBAART, Jeroen, NL-5621 BA Eindhoven (NL)
• SCHUIJERS, Erik, NL-5621 BA Eindhoven (NL)

(74) Representative: Zinkler, Franz et al, Schoppe, Zimmermann, Stöckeler Zinkler, Schenk & Partner mbB, Patentanwälte, Radlkoferstrasse 2, 81373 München (DE)

(56) References cited:
• WO-A-2005/101370
• G. POTARD, I. BURNETT: "Decorrelation techniques for the rendering of apparent sound source width in 3D audio displays", PROCEEDINGS OF THE 7TH INT. CONFERENCE ON DIGITAL AUDIO EFFECTS DAFX04, [Online] 5 October 2004 (2004-10-05), pages 280-284, XP002369776, Naples, Italy. Retrieved from the Internet: URL: http://dafx04.na.infn.it/WebProc/Proc/P_280.pdf [retrieved on 2006-02-23]
• BREEBAART J ET AL: "High-quality parametric spatial audio coding at low bitrates", PREPRINTS OF PAPERS PRESENTED AT THE AES CONVENTION, XX, XX, 8 May 2004 (2004-05-08), pages 1-13, XP009042418, cited in the application
• SCHUIJERS E ET AL: "LOW COMPLEXITY PARAMETRIC STEREO CODING", PREPRINTS OF PAPERS PRESENTED AT THE AES CONVENTION, XX, XX, no. 6073, 8 May 2004 (2004-05-08), pages 1-11, XP008047510
• KENDALL G S: "THE DECORRELATION OF AUDIO SIGNALS AND ITS IMPACT ON SPATIAL IMAGERY", COMPUTER MUSIC JOURNAL, CAMBRIDGE, MA, US, vol. 19, no. 4, 1995, pages 71-87, XP008026420



    Description

[0001] The present invention relates to coding of multi-channel audio signals using spatial parameters and in particular to new, improved concepts for generating and using de-correlated signals.

[0002] Multi-channel audio reproduction techniques are becoming increasingly important. With a view to efficient transmission of multi-channel audio signals having 5 or more separate audio channels, several ways of compressing a stereo or multi-channel signal have been developed. Recent approaches for the parametric coding of multi-channel audio signals (parametric stereo (PS), "Binaural Cue Coding" (BCC) etc.) represent a multi-channel audio signal by means of a down-mix signal (which could be monophonic or comprise several channels) and parametric side information, also referred to as "spatial cues", characterizing its perceived spatial sound stage.

[0003] A multi-channel encoding device generally receives, as input, at least two channels, and outputs one or more carrier channels and parametric data. The parametric data is derived such that, in a decoder, an approximation of the original multi-channel signal can be calculated. Normally, the carrier channel (channels) will include sub-band samples, spectral coefficients, time domain samples, etc., which provide a comparatively fine representation of the underlying signal, while the parametric data does not include such samples or spectral coefficients but instead includes control parameters for controlling a certain reconstruction algorithm. Such a reconstruction could comprise weighting by multiplication, time shifting, frequency shifting, phase shifting, etc. Thus, the parametric data includes only a comparatively coarse representation of the signal or the associated channel.

[0004] The binaural cue coding (BCC) technique is described in a number of publications, for example in "Binaural Cue Coding applied to Stereo and Multi-Channel Audio Compression", C. Faller, F. Baumgarte, AES convention paper 5574, May 2002, Munich, and in the 2 ICASSP publications "Estimation of auditory spatial cues for binaural cue coding" and "Binaural cue coding: a novel and efficient representation of spatial audio", both authored by C. Faller and F. Baumgarte, Orlando, FL, May 2002.

[0005] In BCC encoding, a number of audio input channels are converted to a spectral representation using a DFT (Discrete Fourier Transform) based transform with overlapping windows. The resulting uniform spectrum is then divided into nonoverlapping partitions. Each partition has a bandwidth proportional to the equivalent rectangular bandwidth (ERB). Then, spatial parameters called ICLD (Inter-Channel Level Difference) and ICTD (Inter-Channel Time Difference) are estimated for each partition. The ICLD parameter describes a level difference between two channels and the ICTD parameter describes the time difference (phase shift) between two signals of different channels. The level differences and the time differences are normally given for each channel with respect to a reference channel. After the derivation of these parameters, the parameters are quantized and finally encoded for transmission.

[0006] Although ICLD and ICTD parameters represent the most important sound source localization parameters, a spatial representation using these parameters can be enhanced by introducing additional parameters.

[0007] A related technique, called "parametric stereo", describes the parametric coding of a two-channel stereo signal based on a transmitted mono signal plus parameter side information. In this context, 3 types of spatial parameters, referred to as inter-channel intensity differences (IIDs), inter-channel phase differences (IPDs), and inter-channel coherence (ICC), are introduced. The extension of the spatial parameter set with a coherence parameter (correlation parameter) enables a parametrization of the perceived spatial "diffuseness" or spatial "compactness" of the sound stage. Parametric stereo is described in more detail in "Parametric Coding of Stereo Audio", J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers (2005), EURASIP J. Applied Signal Proc. 9, pages 1305-1322, in "High-Quality Parametric Spatial Audio Coding at Low Bitrates", J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers, AES 116th Convention, Preprint 6072, Berlin, May 2004, and in "Low Complexity Parametric Stereo Coding", E. Schuijers, J. Breebaart, H. Purnhagen, J. Engdegard, AES 116th Convention, Preprint 6073, Berlin, May 2004.

[0008] The present invention relates to parametric coding of the spatial properties of an audio signal. Parametric multi-channel audio decoders reconstruct N channels based on M transmitted channels, where N > M, and additional control data. The additional control data represents a significantly lower data rate than transmitting all N channels, making the coding very efficient while at the same time ensuring compatibility with both M channel devices and N channel devices. Typical parameters used for describing spatial properties are inter-channel intensity differences (IID), inter-channel time differences (ITD), and inter-channel coherences (ICC). In order to reconstruct the spatial properties based on these parameters, a method is required that can reconstruct the correct level of correlation between two or more channels, according to the ICC parameters. This is accomplished by means of a de-correlation method, i.e. a method to derive decorrelated signals from transmitted signals and to combine the decorrelated signals with the transmitted signals within some upmixing process. Methods for upmixing based on a transmitted signal, a decorrelated signal, and IID/ICC parameters are described in the references given above.

[0009] Several methods are available for creating de-correlated signals. Preferably, the decorrelated signals have temporal and spectral envelopes similar or equal to those of the original input signals. Ideally, a linear time invariant (LTI) function with all-pass frequency response is desired. One obvious method for achieving this is to use a constant delay. However, using a delay, or any other LTI all-pass function, will result in a non-all-pass response after addition of the non-processed signal. In the case of a delay, the result will be a typical comb-filter. The comb-filter often gives an undesirable "metallic" sound that, even if the stereo widening effect can be efficient, considerably reduces the naturalness of the original. The constant delay method and other prior art methods suffer from the inability to create more than one de-correlated signal while preserving quality and mutual de-correlation.

[0010] The perceptual quality of a reconstructed multi-channel audio signal therefore depends strongly on an efficient concept that allows for the generation of a de-correlated signal from a transmitted signal, wherein ideally the de-correlated signal is orthogonal to the signal from which it is derived, i.e. perfectly de-correlated. Even if a perfectly de-correlated signal is available, a multi-channel upmix in which the individual channels are mutually de-correlated cannot be derived using a single de-correlated signal. During the upmixing, a reconstructed audio channel is generated by combining a transmitted signal with the generated de-correlated signal, where the extent to which the de-correlated signal is mixed with the transmitted signal is typically controlled by a transmitted spatial audio parameter (ICC). Mutually perfectly de-correlated signals can therefore not be achieved, since every reconstructed audio channel contains a fraction of the same de-correlated signal.

[0011] As multi-channel reconstruction gains more and more interest, a number of publications are known relating to this matter. For example, International patent application No. WO2005/101370 A1 relates to an apparatus for generation of a level parameter for generating a multi-channel representation from at least one downmix channel derived from a multi-channel signal. As energy preservation is required during upmixing and downmixing, an additional level parameter is calculated such that the energy of the downmix channel, when weighted by the level parameter, becomes equal to the sum of energies of the original channels of the multi-channel signal. Thus, appropriate downmix algorithms may be used without having to ensure energy preservation at the encoder side while downmixing the multi-channel signal into the at least one down-mixed channel.

[0012] The publication G. Potard, I. Burnett: "Decorrelation techniques for the rendering of apparent sound source width in 3D audio displays", Proceedings of the 7th International Conference on Digital Audio Effects DAFX04, 5 October 2004 (2004-10-05), relates to the principles for rendering of sound sources within 3D audio displays and in particular to two techniques using decorrelation as a means to decrease the Interaural Cross-correlation, which has a direct impact on the perceived source extent. Furthermore, techniques to vary decorrelation with time and with frequency are described, making it possible to create temporal and spectral variations in the spatial extent of sound sources.

[0013] Furthermore, the publication Breebaart J. et al: "High-quality parametric spatial audio coding at low bitrates", preprints of papers presented at the AES Convention on 8 May 2004 (2004-05-08), describes a coding scheme called binaural cue coding which transmits two perceptually relevant sound localization cues together with a mono downmix of an original audio signal having at least two channels. In particular, the document describes that the perceptual quality of a reconstructed signal may be enhanced when a coherence parameter (IC) is transmitted, describing the correlation between two channels of the original signal.

[0014] It is the object of the present invention to provide a more efficient concept for creation of highly de-correlated signals.

[0015] This object is achieved by an apparatus according to claim 1 or a method according to claim 10 or a computer program according to claim 13.

[0016] The present invention is based on the finding that a multi-channel signal having at least three channels can be reconstructed such that the reconstructed channels are at least partly de-correlated from each other using a downmixed signal derived from an original multi-channel signal and a set of decorrelated signals provided by a de-correlator that derives the set of de-correlated signals from the downmix signal, wherein the de-correlated signals within the set of de-correlated signals are mutually approximately orthogonal to each other, i.e. an orthogonality relation between channel pairs is satisfied within an orthogonality tolerance range.

[0017] An orthogonality tolerance range can for example be derived from the cross correlation coefficient that quantifies the degree of correlation between two signals. A cross correlation coefficient of 1 means perfect correlation, i.e. two identical signals. On the other hand, a cross correlation coefficient of 0 means perfect de-correlation, i.e. orthogonality of the signals. The orthogonality tolerance range, therefore, may be defined as an interval of correlation coefficient values ranging from 0 to a specific upper limit.

[0018] Hence, the present invention relates to, and provides a solution to, the problem of efficiently generating one or more orthogonal signals while preserving impulse properties and perceived audio quality.

[0019] In one embodiment of the present invention an IIR lattice filter is implemented as a de-correlator having filter coefficients derived from noise sequences, and the filtering is performed within a complex valued or real valued filter bank.

[0020] In one embodiment of the present invention, a method for reconstructing a multi-channel signal includes a method for creating several orthogonal or close to orthogonal signals by using a group of lattice IIR filters.

[0021] In a further embodiment of the present invention, the method for creating several orthogonal signals includes a method for choosing filter coefficients for achieving orthogonality or an approximation of orthogonality in a perceptually motivated way.

[0022] In a further embodiment of the present invention, a group of lattice IIR filters is used within a complex valued filterbank during the reconstruction of the multi-channel signal.

[0023] In a further embodiment of the present invention a method for creating one or more orthogonal or close to orthogonal signals is implemented, using one or more all-pass IIR filters based on a lattice structure within a spatial decoder.

[0024] In a further embodiment of the present invention, the embodiment described above is implemented such that the filter coefficients used for the IIR filtering are based on random noise sequences.

[0025] In a further embodiment of the present invention, additional time delays are added to the filters used.

[0026] In a further embodiment of the present invention, the filtering is processed in a filterbank domain.

[0027] In a further embodiment of the present invention, the filtering is processed in a complex valued filterbank.

[0028] In a further embodiment of the present invention, the orthogonal signals created by the filtering are mixed to form a set of output signals.

[0029] In a further embodiment of the present invention, the mixing of the orthogonal signals depends on transmitted control data, additionally supplied to an inventive decoder.

[0030] In a further embodiment of the present invention, an inventive decoder or an inventive decoding method uses control data that contains at least one parameter indicating a desired cross-correlation of at least two of the output signals generated.

[0031] In a further embodiment of the present invention, a 5.1 channel surround signal is upmixed from a transmitted monophonic signal by deriving four de-correlated signals using the inventive concept. The monophonic downmixed signal and the four de-correlated signals are then mixed together according to some mixing rules to form the output 5.1 channel signal. This makes it possible to generate output signals that are mutually de-correlated, since the signals used for the upmix, i.e. the transmitted monophonic signal and the four generated de-correlated signals, are mainly de-correlated due to their inventive generation.

[0032] In a further embodiment of the present invention, two individual channels are transmitted as a downmix of a 5.1 channel signal. In one implementation, two additional mutually de-correlated signals are derived using the inventive concept to provide four channels as a basis for an upmix which are almost perfectly de-correlated. In a modification of the embodiment described above, a third de-correlated signal is derived and mixed with the other two de-correlated signals to provide a further de-correlated signal available for the subsequent upmixing. Using this feature, the perceptual quality can be further enhanced for individual channels, e.g. the center channel of a 5.1 surround signal.

[0033] In a further embodiment of the present invention, five audio channels are upmixed from a monophonic transmitted channel prior to deriving, using the inventive concept, four de-correlated signals that are subsequently combined with four of the five aforementioned upmixed channels, allowing for a creation of five output audio channels that are mutually mainly de-correlated.

[0034] In a further embodiment of the present invention, the audio signals are delayed prior to or after the application of the inventive IIR filter based filtering. The delay further enhances the de-correlation of the generated signals, and reduces coloration when mixing the generated de-correlated signals with the original downmixed signal.

[0035] In a further embodiment of the present invention, the generation of the de-correlated signals is performed in the subband domain of a (complex modulated) filterbank, wherein the filter coefficients used by the de-correlator are derived using the specific filterbank index of the filterbank for which the de-correlated signals are derived.

[0036] In a further embodiment of the present invention, the de-correlated signals are derived using lattice IIR filters that perform a lattice IIR all-pass filtering of an audio signal. Using a lattice IIR filter has major advantages. An exponential decay of the response of such a filter, which is preferable for creating appropriate decorrelated signals, is an inherent property of such a filter. Furthermore, a desired long decaying pulse response of a filter used to generate decorrelated signals can be achieved in a manner that is extremely efficient in terms of memory and computation (low complexity) by using a lattice filter structure.

[0037] In a modification of the previously described embodiment, the filter coefficients (reflection coefficients) used are derived from noise sequences. In a modification, the reflection coefficients are individually calculated based on the sub-band index of the subband in which the lattice filter is used to derive de-correlated signals.

[0038] In one embodiment of the present invention, the filtered signals and the unmodified input signal are combined by a mixing matrix D to form a set of output signals. The mixing matrix D defines the mutual correlations of the output signals, as well as the energy of each output signal. The entries (weights) of the mixing matrix D are preferably time-variable and dependent on transmitted control data. The control parameters preferably contain (desired) level differences between certain output signals and/or specific mutual correlation parameters.

[0039] In a further embodiment of the present invention, an inventive audio decoder is comprised within an audio receiver or playback device to enhance the perceptual quality of a reconstructed signal.

[0040] Preferred embodiments of the present invention are subsequently described by the following drawings, wherein:

Fig. 1 shows a block diagram of the inventive audio decoding concepts;

Fig. 2 shows a prior art decoder not implementing the inventive concepts;

Fig. 3 shows a 5.1 multi-channel audio decoder according to the present invention;

Fig. 4 shows a further 5.1 channel audio decoder according to the present invention;

Fig. 5 shows a further inventive audio decoder;

Fig. 6 shows a further embodiment of an inventive multi-channel audio decoder;

Fig. 7 shows schematically the generation of a de-correlated signal;

Fig. 8 shows a lattice IIR filter used for generating a de-correlated signal;

Fig. 9 shows a receiver or audio player having an inventive audio decoder; and

Fig. 10 shows a transmission having a receiver or playback device having an inventive audio decoder.

[0041] The embodiments described below are merely illustrative for the principles of the present invention for advanced methods for creating orthogonal signals. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to those skilled in the art. It is the intent, therefore, to be limited only by the scope of the impending patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

[0042] Fig. 1 illustrates an inventive apparatus for the de-correlation of signals as used in a parametric stereo or multi-channel system. The inventive apparatus includes means 101 for providing a plurality of orthogonal de-correlated signals derived from an input signal 102. The providing means can be an array of de-correlation filters based on lattice IIR structures. The input signal 102 (x) can be a time-domain signal or a single sub-band domain signal as e.g. obtained from a complex QMF bank. The signals output by the means 101, y1 to yN, are the resulting de-correlated signals that are all mutually orthogonal or close to orthogonal.

[0043] As it is vital for reconstructing the spatial properties of a parametric stereo or parametric multi-channel system to decrease the coherence between two or more channels in order to reconstruct the perceived wideness of the spatial image, the resulting de-correlated signals can be used to create a final upmix of a multi-channel signal. This can be done by adding filtered versions (h1(x)) of the original signal (x) to the output channels. Hence, lowering the coherence between N signals using N different filters can be done according to:

\[
y_1 = a\,x + b\,h_1(x), \qquad y_2 = a\,x + b\,h_2(x), \qquad \ldots, \qquad y_N = a\,x + b\,h_N(x)
\]

where x is the original signal, y1 to yN are the resulting output signals, a and b are the gain factors controlling the amount of coherence and h1 to hN are the different decorrelation filters. In a more general sense, one can write the output signals yi (i=1...I) as a linear combination of the input signal x and the input signal x filtered by filters hj (j=1...N):

\[
\begin{bmatrix} y_1 \\ \vdots \\ y_I \end{bmatrix}
= D \begin{bmatrix} x \\ h_1(x) \\ \vdots \\ h_N(x) \end{bmatrix}
\]

[0044] Here, the mixing matrix D determines the mutual correlations and output levels of the output signals yi.

[0045] In order to prevent changes in the timbre, the filter in question should preferably be of all-pass character. One successful approach is to use all-pass filters similar to those used for artificial reverberation processes. Artificial reverberation algorithms usually require a high time resolution to provide an impulse response that is satisfactorily diffuse in time. One way of designing such all-pass filters is to use a random noise sequence as impulse response. The filter can then easily be implemented as an FIR filter. In order to achieve a sufficient degree of independence between the filtered outputs, the impulse response of the FIR filter should be relatively long, hence requiring a significant amount of computational effort to perform the convolution. An all-pass IIR filter is preferred for that purpose. The IIR structure has several advantages when it comes to designing de-correlation filters:

a) The natural exponential decay that is common for all natural reverberation is desired for a de-correlation filter. This is an inherent property of IIR filters.

b) For long decaying impulse responses of an IIR filter, the corresponding FIR filter is generally more expensive in terms of complexity and requires more memory.
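The mixing rule y_n = a·x + b·h_n(x) and the noise-based FIR design mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the decoder of the specification: it works on a real-valued time-domain signal, uses random ±1 taps with a one-sample pre-delay as a stand-in for "a random noise sequence as impulse response", and all function names (`noise_fir`, `fir_filter`, `correlation`, `mix`) are hypothetical. Such a noise FIR is only approximately all-pass. If the filtered signals were exactly orthogonal to each other and to x and had the same energy as x, the normalized correlation between any two outputs would be a²/(a² + b²); the sketch prints the measured value next to that ideal.

```python
import math
import random


def noise_fir(length, seed, predelay=1):
    """Random +/-1 noise taps scaled to unit energy, after `predelay` zeros.

    Assumption for illustration: the text only says that a random noise
    sequence is used as impulse response; the +/-1 choice is ours.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(length)
    return [0.0] * predelay + [rng.choice((-1.0, 1.0)) * scale
                               for _ in range(length)]


def fir_filter(x, h):
    """Direct-form convolution of x with h, truncated to len(x) samples."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(min(len(h), n + 1)):
            acc += h[k] * x[n - k]
        y.append(acc)
    return y


def correlation(u, v):
    """Normalized cross-correlation coefficient at lag zero."""
    num = sum(p * q for p, q in zip(u, v))
    den = math.sqrt(sum(p * p for p in u) * sum(q * q for q in v))
    return num / den if den > 0.0 else 0.0


def mix(x, decorrelated, a, b):
    """Return y_n = a*x + b*h_n(x) for each decorrelated signal h_n(x)."""
    return [[a * s + b * d for s, d in zip(x, dn)] for dn in decorrelated]


# Small demonstration with white noise as the downmix signal x.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
filters = [noise_fir(64, seed) for seed in (11, 12, 13)]
decorrelated = [fir_filter(x, h) for h in filters]
a = b = math.sqrt(0.5)
y = mix(x, decorrelated, a, b)
print("corr(h1(x), h2(x)):", round(correlation(decorrelated[0], decorrelated[1]), 3))
print("corr(y1, y2):      ", round(correlation(y[0], y[1]), 3))
print("ideal a^2/(a^2+b^2):", round(a * a / (a * a + b * b), 3))
```

The inner loop of `fir_filter` costs one multiplication per tap and per sample, which is the computational burden the text attributes to long FIR de-correlators.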

[0046] However, designing IIR all-pass filters is less trivial than the FIR case, where any random noise sequence qualifies as a coefficient vector. A design constraint when targeting multiple de-correlation filters is also the required ability to preserve the same decaying properties for all the filters while providing orthogonal outputs (i.e., filter impulse responses that have substantially low mutual correlation). In addition, as a basic requirement, stability has to be achieved.

[0047] The present invention shows a novel method to create multiple orthogonal all-pass filters by means of a lattice IIR filter structure. This approach has several advantages:

a) Lower complexity than FIR filters (given the required length of the impulse responses).

b) Stability constraints can be satisfied easily, as stability is automatically achieved when the magnitudes of all reflection coefficients are less than one.

c) Multiple orthogonal all-pass filters can be designed more easily with the same decaying properties based on random noise sequences.

d) High robustness against quantization errors due to finite word-length effects.
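The lattice all-pass structure and the stability condition in item b) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the filter of Fig. 8: it uses the textbook two-multiplier lattice recursion on a real-valued time-domain signal, draws the reflection coefficients uniformly from [-0.3, 0.3] as a stand-in for "noise sequences", and applies none of the per-subband or perceptual coefficient selection that the specification describes. The names `lattice_allpass`, `noise_reflection_coefficients` and `impulse_response` are hypothetical.

```python
import math
import random


def lattice_allpass(x, k):
    """Filter x through an all-pass lattice with reflection coefficients k.

    Per sample, for stages m = M..1 (with g_0[n] = f_0[n]):
        f_{m-1}[n] = f_m[n] - k_m * g_{m-1}[n-1]
        g_m[n]     = k_m * f_{m-1}[n] + g_{m-1}[n-1]
    The all-pass output is g_M[n].
    """
    if any(abs(c) >= 1.0 for c in k):
        raise ValueError("stability requires |k_m| < 1 for every stage")
    order = len(k)
    delayed = [0.0] * order              # delayed[m] holds g_m[n-1]
    out = []
    for sample in x:
        f = sample
        g = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            f -= k[m - 1] * delayed[m - 1]        # f_{m-1}[n]
            g[m] = k[m - 1] * f + delayed[m - 1]  # g_m[n]
        g[0] = f
        delayed = g[:order]              # update states after all stages
        out.append(g[order])
    return out


def noise_reflection_coefficients(order, seed, limit=0.3):
    """Uniform random coefficients in [-limit, limit]; `limit` is our choice."""
    rng = random.Random(seed)
    return [rng.uniform(-limit, limit) for _ in range(order)]


def impulse_response(k, length=2048):
    return lattice_allpass([1.0] + [0.0] * (length - 1), k)


def correlation(u, v):
    """Normalized cross-correlation coefficient at lag zero."""
    num = sum(p * q for p, q in zip(u, v))
    den = math.sqrt(sum(p * p for p in u) * sum(q * q for q in v))
    return num / den if den > 0.0 else 0.0


h1 = impulse_response(noise_reflection_coefficients(4, seed=1))
h2 = impulse_response(noise_reflection_coefficients(4, seed=2))
print("energy of h1 (close to 1 for an all-pass):", sum(v * v for v in h1))
print("correlation between h1 and h2:", correlation(h1, h2))
```

Because the transfer function is all-pass, the energy of the impulse response equals one regardless of the coefficients, so only the time structure (and hence the correlation between different filters) depends on the chosen noise sequence.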

    [0048] Although the reflection coefficients of the latticeIIR filter can be based on random noise sequences, forbetter performance those coefficients should also besorted in more sophisticated ways or processed by non-random methods in order to achieve sufficient orthogo-nality and other important properties. A straightforwardmethod is to generate a multitude of random reflectioncoefficient vectors, followed by a selection of a specificset based on certain criteria, such as a common decayingenvelope, minimization of all mutual impulse responsecorrelations of the selected set, and alike.[0049] More specifically, one could start with a largeset of random noise sequences. Each of these sequenc-es is used as reflection coefficients in the allpass section.Subsequently, the impulse response of the resulting all-pass section is computed for each random noise se-quence. Finally, one selects those noise sequences thatgive mutually decorrelated impulse responses.[0050] There are great advantages in basing the de-correlation algorithm on a (complex) filter bank such asthe complex valued QMF bank. This filter bank providesthe flexibility to allow the properties of the de-correlatorto be frequency selective in terms of for example equal-ization, decay time, impulse density and timbre. Note thatmany of these properties can be altered while preservingthe all-pass characteristic. There is much knowledge re-lated to auditory perception that guides the design of suchlattice IIR filter. An important aspect is the length andshape of the decaying envelop of the impulse response.Also the need for an additional pre-delay, optionally fre-quency dependent, is important as this largely influenceswhat kind of comb-filter characteristic will be obtainedwhen mixing the de-correlated signal with the original

    one. For sufficient impulse density the noise based re-flection coefficients in the lattice filter should preferablybe different for the different filter bank channels. For evenbetter impulse density fractional delay approximationscan be used within the filter bank.[0051] Fig. 2 shows a hierarchical decoding structureto derive a multi-channel signal for a transmitted mono-phonic downmix signal by subsequent parametric stereoboxes, using a single decorrelated signal. By shortly re-viewing the prior art approach, the problem solved by thepresent invention shall again be motivated. The 1-to-3channel decoder 110 shown in Fig. 2 comprises a de-correlator 112, a first parametric stereo upmixer 114 anda second parametric stereo upmixer 116.[0052] A monophonic input signal 118 is input into thede-correlator 112 to derive a de-correlated signal 120.Only a single de-correlated signal is derived. The firstparametric stereo upmixer receives as an input themonophonic downmix signal 118 and the de-correlatedsignal 120. The first upmixer 114 derives a center channel122 and a combined channel 124 by mixing the mono-phonic downmix signal 118 and the de-correlated signal120 using a correlation parameter 126, that steers themixing of the channels.[0053] The combined channel 124 is then input intothe second parametric stereo upmixer 116, building thesecond hierarchical level of the audio decoder. The sec-ond parametric stereo upmixer 116 is further receivingthe de-correlated signal 120 as an input and derives aleft channel 128 and a right channel 130 by mixing thecombined channel 124 and the de-correlated signal 120.[0054] It is principally feasible to generate a centerchannel 122 that is perfectly de-correlated from the com-bined channel 124, when the de-correlator 112 is able toderive a de-correlated signal which is fully orthogonal tothe monophonic downmix signal 118. 
    Almost perfect de-correlation would be achieved when the steering information 126 indicates an upmix in which each upmixed channel mainly has a signal component coming from either the de-correlated signal 120 or from the monophonic downmix signal 118. Since, however, the same de-correlated signal 120 is then used to derive the left channel 128 and the right channel 130, it is obvious that this will result in a remaining correlation between the center channel 122 and one of the channels 128 or 130.

    [0055] This becomes even more evident when examining the extreme case in which a completely de-correlated left channel 128 and right channel 130 shall be derived from a de-correlated signal 120 that is assumed to be perfectly orthogonal to the monophonic downmix signal. Perfect decorrelation between the left channel 128 and the right channel 130 can be achieved when the combined channel 124 holds information on the monophonic downmix channel 118 only, which simultaneously means that the center channel 122 mainly comprises the de-correlated signal 120. Therefore, a de-correlated left channel 128 and right channel 130 would mean that one of the channels mainly comprises the information on the de-correlated signal 120 and the other channel would mainly comprise the combined signal 124, which then is identical to the monophonic downmix signal 118. Therefore, the only way to make the left and the right channels completely de-correlated forces an almost perfect correlation between the center channel 122 and one of the channels 128 or 130.

    [0056] This most unwanted property can be successfully avoided by applying the inventive concept of generating different and mutually orthogonal de-correlated signals.

    [0057] Fig. 3 shows an embodiment of an inventive multi-channel audio decoder 400 comprising a pre-de-correlator matrix 401, a de-correlator 402 and a mix-matrix 403. The inventive decoder 400 shows a 1-to-5 configuration, where five audio channels and a low-frequency enhancement channel are derived from a monophonic downmix signal 405 and additional spatial control data, such as ICC or ICLD parameters. These are not shown in the principle sketch in Fig. 3. The monophonic downmix signal 405 is input into the pre-de-correlator matrix 401 that derives four intermediate signals 406 which serve as an input for the de-correlator 402, which comprises four inventive de-correlators h1-h4.
    These supply four mutually orthogonal de-correlated signals 408 at the output of the de-correlator 402.

    [0058] The mix-matrix 403 receives as an input the four mutually orthogonal de-correlated signals 408 and, in addition, a downmix signal 410 derived from the monophonic downmix signal 405 by the pre-de-correlator matrix 401.

    [0059] The mix-matrix 403 combines the monophonic signal 410 and the four de-correlated signals 408 to yield a 5.1 output signal 412 comprising a left-front channel 414a, a left-surround channel 414b, a right-front channel 414c, a right-surround channel 414d, a center channel 414e and a low-frequency enhancement channel 414f.

    [0060] It is important to note that the generation of four mutually orthogonal de-correlated signals 408 makes it possible to derive five channels of the 5.1 channel signal that are at least partly de-correlated. In a preferred embodiment of the present invention, these are the channels 414a to 414e. The low-frequency enhancement channel 414f comprises low-frequency parts of the multi-channel signal that are combined in one single low-frequency channel for all the surround channels 414a to 414e.

    [0061] Fig. 4 shows an inventive 2-to-5 decoder to derive a 5.1 channel surround signal from two transmitted signals.

    [0062] The multi-channel audio decoder 500 comprises a pre-de-correlator matrix 501, a de-correlator 502 and a mix-matrix 503. In the 2-to-5 setup, two transmitted channels 505a and 505b are input into the pre-de-correlator matrix, which derives an intermediate left channel 506a, an intermediate right channel 506b, an intermediate center channel 506c and two intermediate channels 506d from the transmitted channels 505a and 505b, optionally also using additional control data such as ICC and ICLD parameters.

    [0063] The intermediate channels 506d are used as input for the de-correlator 502, which derives two mutually orthogonal or nearly orthogonal de-correlated signals that are input into the mix-matrix 503 together with the intermediate left channel 506a, the intermediate right channel 506b and the intermediate center channel 506c.

    [0064] The mix-matrix 503 derives the final 5.1 channel audio signal 508 from the previously mentioned signals, wherein the finally derived audio channels have the same advantageous properties as already described for the channels derived by the 1-to-5 multi-channel audio decoder 400.

    [0065] Fig. 5 shows a further embodiment of the present invention that combines the features of multi-channel audio decoders 400 and 500. The multi-channel audio decoder 600 comprises a pre-de-correlation matrix 601, a de-correlator 602 and a mix-matrix 603. The multi-channel audio decoder 600 is a flexible device that can operate in different modes depending on the configuration of input signals 605 input into the pre-de-correlator 601. Generally, the pre-de-correlator derives intermediate signals 607 that serve as input for the de-correlator 602 and that are partially transmitted and altered to build input parameters 608.
    The input parameters 608 are the parameters input into the mix-matrix 603, which derives output channel configurations 610a or 610b depending on the input channel configuration.

    [0066] In a 1-to-5 configuration, a downmix signal and an optional residual signal are supplied to the pre-de-correlator matrix, which derives four intermediate signals (e1 to e4) that are used as an input of the de-correlator, which derives four de-correlated signals (d1 to d4) that form the input parameters 608 together with a directly transmitted signal m derived from the input signal.

    [0067] It may be noted that, in the case where an additional residual signal is supplied as input, the de-correlator 602, which is generally operative in a sub-band domain, may be operative to forward the residual signal instead of deriving a de-correlated signal. This may also be done in a selective manner for certain frequency bands only.

    [0068] In the 2-to-5 configuration, the input signals 605 comprise a left channel, a right channel and optionally a residual signal. In that configuration, the pre-de-correlator matrix derives a left, a right and a center channel and, in addition, two intermediate channels (e1, e2). Hence, the input parameters to the mix-matrix 603 are formed by the left channel, the right channel, the center channel, and two de-correlated signals (d1 and d2). In a further modification, the pre-de-correlator matrix may derive an additional intermediate signal (e5) that is used as an input for a de-correlator (D5) whose output is a combination of the de-correlated signal (d5) derived from the signal (e5) and the de-correlated signals (d1 and d2). In this case, an additional de-correlation can be guaranteed between the center channel and the left and the right channel.

    [0069] Fig. 6 shows a further embodiment of the present invention, in which de-correlated signals are combined with individual audio channels after the upmixing process. In this alternative embodiment, a monophonic audio channel 620 is upmixed by an upmixer 624, wherein the upmixing may be controlled by additional control data 622. The upmix channels 630 comprise five audio channels that are correlated with each other and are commonly referred to as dry channels. Final channels 632 can be derived by combining four of the dry channels 630 with de-correlated, mutually orthogonal signals. As a result, it is possible to provide five channels that are at least partly de-correlated from each other. With respect to Fig. 3, this can be seen as a special case of a mix-matrix.

    [0070] Fig. 7 shows a block diagram of an inventive de-correlator 700 for providing a de-correlated signal. The de-correlator 700 comprises a predelay unit 702 and a de-correlation unit 704.

    [0071] An input signal 706 is input into the predelay unit 702 for delaying the signal 706 for a predetermined time. The output from the predelay unit 702 is connected to the de-correlation unit 704 to derive a de-correlated signal 708 as an output of the de-correlator 700.

    [0072] In a preferred embodiment of the present invention, the de-correlation unit 704 comprises a lattice IIR all-pass filter. In an optional variation of the de-correlator 700, the filter coefficients (reflection coefficients) are input to the de-correlation unit 704 by means of a provider of filter coefficients 710. When the inventive de-correlator 700 is operated within a filtering sub-band (e.g. within a QMF filterbank), the sub-band index of the currently processed sub-band signal may additionally be input into the de-correlation unit 704. In that case, in a further modification of the present invention, different filter coefficients of the de-correlation unit 704 may be applied or calculated based on the sub-band index provided.

    [0073] Fig. 8 shows a lattice IIR filter as preferably used to generate the de-correlated signals.

    [0074] The IIR filter 800 shown in Fig. 8 receives as an input an audio signal 802 and derives as an output 804 a de-correlated version of the input signal. A big advantage of using an IIR lattice filter is that the exponentially decaying impulse response required to derive an appropriate de-correlated signal comes at no additional cost, since this is an inherent property of the lattice IIR filter. It is to be noted that the filter coefficients k(0) to k(M-1) must have absolute values smaller than unity to achieve the required stability of the filter. Additionally, multiple orthogonal all-pass filters can be designed more easily based on lattice IIR filters, which is a major advantage for the inventive concept of deriving multiple de-correlated signals from a single input signal, wherein the different derived de-correlated signals shall be almost perfectly de-correlated or orthogonal to one another.

    [0075] More details on the design and the properties of all-pass lattice filters may be found in "Adaptive Filter Theory", Simon Haykin, ISBN 0-13-090126-1, Prentice-Hall, 2002.

    [0076] Fig. 9 shows an inventive receiver or audio player 900, having an inventive audio decoder 902, a bit stream input 904, and an audio output 906.

    [0077] A bit stream can be input at the input 904 of the inventive receiver/audio player 900. The bit stream is then decoded by the decoder 902, and the decoded signal is output or played at the output 906 of the inventive receiver/audio player 900.

    [0078] Fig. 10 shows a transmission system comprising a transmitter 908 and an inventive receiver 900.

    [0079] The audio signal input at an input interface 910 of the transmitter 908 is encoded and transferred from the output of the transmitter 908 to the input 904 of the receiver 900. The receiver decodes the audio signal and plays back or outputs the audio signal on its output 906.

    [0080] The present invention relates to coding of multi-channel representations of audio signals using spatial parameters. The present invention teaches new methods for de-correlating signals in order to lower the coherence between the output channels. It goes without saying that, although the new concept to create multiple de-correlated signals is extremely advantageous in an inventive audio decoder, the inventive concept may also be used in any other technical field that requires the efficient generation of such signals.

    [0081] Although the present invention has been detailed within multi-channel audio decoders that perform an upmix in a single upmixing step, the present invention may of course also be incorporated in audio decoders that are based on a hierarchical decoding structure, such as, for example, shown in Fig. 2.

    [0082] Although the previously described embodiments mostly describe the derivation of decorrelated signals from a single downmix signal, it goes without saying that more than one audio channel may also be used as input for the decorrelators or the pre-decorrelation matrix, i.e. that the downmix signal may comprise more than one downmixed audio channel.

    [0083] Furthermore, the number of de-correlated signals derived from a single input signal is basically unlimited, since the filter order of lattice filters can be varied without limitation and since it is possible to find a new set of filter coefficients deriving a de-correlated signal that is orthogonal or mainly orthogonal to other signals in the set.

    [0084] Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disk, DVD or a CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is, therefore, a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.

    [0085] It is to be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and comprehended by the claims that follow.
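
    The selection procedure of paragraphs [0048] and [0049], together with the lattice IIR all-pass filter of Fig. 8, may be illustrated by the following minimal Python sketch. It is an illustration only and not part of the specification. It assumes real-valued signals, a common textbook form of the all-pass lattice recursion, reflection coefficients drawn uniformly from [-0.5, 0.5], the normalized zero-lag cross-correlation of truncated impulse responses as the orthogonality measure, and a simple greedy selection. All function names and parameter values are hypothetical.

```python
import math
import random


def allpass_lattice(x, k):
    """Filter the real sequence x through an all-pass lattice IIR filter.

    k holds the reflection coefficients k(0)..k(M-1); every |k(m)| must be
    smaller than one for the filter to be stable.
    """
    if any(abs(c) >= 1.0 for c in k):
        raise ValueError("reflection coefficients must satisfy |k| < 1")
    order = len(k)
    delayed = [0.0] * order  # delayed[m] holds the backward signal g_m[n-1]
    y = []
    for sample in x:
        f = sample  # forward signal entering the top stage
        g = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            f = f - k[m - 1] * delayed[m - 1]     # f_{m-1}[n]
            g[m] = k[m - 1] * f + delayed[m - 1]  # g_m[n]
        g[0] = f
        y.append(g[order])  # the top backward signal is the all-pass output
        delayed = g[:order]
    return y


def impulse_response(k, length):
    """Truncated impulse response of the all-pass section defined by k."""
    return allpass_lattice([1.0] + [0.0] * (length - 1), k)


def correlation(a, b):
    """Normalized zero-lag cross-correlation of two equally long sequences."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den


def select_decorrelated(responses, count):
    """Greedily pick `count` (>= 2) indices with low mutual correlation."""
    n = len(responses)
    c = [[abs(correlation(responses[i], responses[j])) for j in range(n)]
         for i in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Start from the least correlated pair of candidates.
    chosen = list(min(pairs, key=lambda p: c[p[0]][p[1]]))
    while len(chosen) < count:
        rest = [i for i in range(n) if i not in chosen]
        # Add the candidate whose worst correlation with the chosen set is smallest.
        chosen.append(min(rest, key=lambda i: max(c[i][s] for s in chosen)))
    return chosen


def design_filters(num_filters=4, candidates=20, order=6, length=256, seed=0):
    """Draw random coefficient vectors and keep a mutually decorrelated subset."""
    rng = random.Random(seed)
    coeffs = [[rng.uniform(-0.5, 0.5) for _ in range(order)]
              for _ in range(candidates)]
    responses = [impulse_response(k, length) for k in coeffs]
    picked = select_decorrelated(responses, num_filters)
    return [coeffs[i] for i in picked], [responses[i] for i in picked]
```

    Calling `design_filters()` returns four coefficient vectors together with their impulse responses. The predelay unit of Fig. 7, a common decaying envelope, and sub-band dependent coefficients are not modeled in this sketch.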

    Claims

    1. Multi-channel audio decoder (400; 500; 600) for generating a reconstruction of a multi-channel signal (412; 508; 610a; 610b; 630) using a downmix signal (405; 505a, b; 605; 620) derived from an original multi-channel signal, the reconstruction of the multi-channel signal (412; 508; 610a; 610b; 630) having at least three channels, comprising:

    a de-correlator (402; 502; 602; 700) for deriving a set of de-correlated signals using a de-correlation rule, wherein the de-correlation rule is such that a first de-correlated signal and a second de-correlated signal are derived using the downmix signal (405; 505a, b; 605; 620), and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein the deriving of the first and second de-correlated signals comprises filtering of an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by means of one or more all-pass IIR filters based on a lattice structure; and

    an output channel calculator (403; 503; 603) for generating output channels using the downmix signal (405; 505a, b; 605; 620), the first and the second de-correlated signals and upmix information so that the at least three channels are at least partly de-correlated from each other.

    2. Multi-channel audio decoder (400; 500; 600) in accordance with claim 1 in which the de-correlation rule is such that the orthogonality tolerance range includes orthogonality values [...]


    [...] using the downmix signal and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein the deriving of the first and second de-correlated signals comprises filtering of an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by means of one or more all-pass IIR filters based on a lattice structure; and

    generating output channels using the downmix signal, the first and the second de-correlation signals and upmix information so that the at least three channels are at least partly de-correlated from each other.

    11. Receiver or audio player, the receiver or audio player having a multi-channel decoder (400; 500; 600) in accordance with claim 1.

    12. Method of receiving or audio playing, the method having a method for generating a reconstruction of a multi-channel signal in accordance with claim 10.

    13. Computer program for performing, when running on a computer, a method in accordance with any of the method claims 10 or 12.

    Claims (German version)

    1. Multi-channel audio decoder (400; 500; 600) for generating a reconstruction of a multi-channel signal (412; 508; 610a; 610b; 630) using a downmix signal (405; 505a, b; 605; 620) derived from an original multi-channel signal, the reconstruction of the multi-channel signal (412; 508; 610a; 610b; 630) having at least three channels, comprising the following features:

    a de-correlator (402; 502; 602; 700) for deriving a set of de-correlated signals using a de-correlation rule, wherein the de-correlation rule is such that a first de-correlated signal and a second de-correlated signal are derived using the downmix signal (405; 505a, b; 605; 620), and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein the deriving of the first and second de-correlated signals comprises filtering of an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by one or more all-pass IIR filters based on a lattice structure; and

    an output channel calculator (403; 503; 603) for generating output channels using the downmix signal (405; 505a, b; 605; 620), the first and the second de-correlated signals and upmix information, so that the at least three channels are at least partly de-correlated from each other.

    2. Multi-channel audio decoder (400; 500; 600) in accordance with claim 1, in which the de-correlation rule is such that the orthogonality tolerance range includes orthogonality values [...]


    [...] output channels from a downmix signal (405; 505a, b; 605; 620) having information on one audio channel and from four de-correlated signals.

    8. Multi-channel audio decoder (400; 500; 600) in accordance with any of claims 1 to 7, in which the output channel calculator is operative to generate five output channels from the downmix signal (405; 505a, b; 605; 620) having information on two audio channels and from two de-correlated signals.

    9. Multi-channel audio decoder (400; 500; 600) in accordance with any of claims 1 to 8, in which the output channel calculator (403; 503; 603) is operative to use upmix information comprising at least one parameter indicating a desired correlation of a first and a second output channel.

    10. Method for generating a reconstruction of a multi-channel audio signal using a downmix signal derived from an original multi-channel signal, the reconstruction of the multi-channel signal having at least three channels, the method comprising the following steps:

    deriving a set of de-correlated signals using a de-correlation rule, wherein the de-correlation rule is such that the first de-correlated signal and the second de-correlated signal are derived using the downmix signal, and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein the deriving of the first and second de-correlated signals comprises filtering of an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by one or more all-pass IIR filters based on a lattice structure; and

    generating output channels using the downmix signal, the first and the second de-correlation signals and the upmix information, so that the at least three channels are at least partly de-correlated from each other.

    11. Receiver or audio player, the receiver or audio player having a multi-channel decoder (400; 500; 600) in accordance with claim 1.

    12. Method of receiving or of audio playback, the method having a method for generating a reconstruction of a multi-channel signal in accordance with claim 10.

    13. Computer program for performing, when running on a computer, a method in accordance with any of the method claims 10 or 12.

    Claims (French version)

    1. Multi-channel audio decoder (400; 500; 600) for generating a reconstruction of a multi-channel signal (412; 508; 610a; 610b; 630) using a downmix signal (405; 505a, b; 605; 620) derived from an original multi-channel signal, the reconstruction of the multi-channel signal (412; 508; 610a; 610b; 630) having at least three channels, comprising:

    a de-correlator (402; 502; 602; 700) for deriving a set of de-correlated signals using a de-correlation rule, wherein the de-correlation rule is such that a first de-correlated signal and a second de-correlated signal are derived using the downmix signal (405; 505a, b; 605; 620), and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein deriving the first and second de-correlation signals comprises filtering an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by means of one or more all-pass IIR filters based on a lattice structure; and

    an output channel calculator (403; 503; 603) for generating output channels using the downmix signal (405; 505a, b; 605; 620), the first and second de-correlated signals and upmix information so that the at least three channels are at least partly de-correlated from each other.

    2. Multi-channel audio decoder (400; 500; 600) in accordance with claim 1, in which the de-correlation rule is such that the orthogonality tolerance range includes orthogonality values [...]


    [...] weighted by a first weighting factor; and a second adder in a backward prediction path for adding the preceding part of the audio channel to the current part, which is weighted by a second weighting factor, of the audio signal; and wherein the absolute values of the first and second weighting factors are equal.

    4. Multi-channel audio decoder (400; 500; 600) in accordance with claim 3, in which each of said one or more IIR filters (704; 800) is operative to use a first and a second weighting factor which are derived from random noise sequences.

    5. Multi-channel audio decoder (400; 500; 600) in accordance with claims 1 to 4, in which the de-correlation rule is such that the first de-correlated signal and the second de-correlated signal are derived using a time-delayed version of the downmix signal (405; 505a, b; 605; 620).

    6. Multi-channel audio decoder (400; 500; 600) in accordance with claims 1 to 5, which is operative to derive the first and second de-correlated signals derived from the downmix signal (405; 505a, b; 605; 620) by a real-valued or complex-valued filter bank.

    7. Multi-channel audio decoder (400; 500; 600) in accordance with claims 1 to 6, in which the output channel calculator is operative to generate five output channels from a downmix signal (405; 505a, b; 605; 620) having information on one audio channel and from four de-correlated signals.

    8. Multi-channel audio decoder (400; 500; 600) in accordance with claims 1 to 7, in which the output channel calculator is operative to generate five output channels from the downmix signal (405; 505a, b; 605; 620) having information on two audio channels and from two de-correlated signals.

    9. Multi-channel audio decoder (400; 500; 600) in accordance with claims 1 to 8, in which the output channel calculator (403; 503; 603) is operative to use upmix information comprising at least one parameter indicating a desired correlation of a first and a second output channel.

    10. Method of generating a reconstruction of a multi-channel audio signal using a downmix signal derived from an original multi-channel signal, the reconstruction of the multi-channel signal having at least three channels, the method comprising:

    deriving a set of de-correlated signals using a de-correlation rule, wherein the de-correlation rule is such that the first de-correlated signal and the second de-correlated signal are derived using the downmix signal and that the first de-correlated signal and the second de-correlated signal are orthogonal to each other within an orthogonality tolerance range, wherein deriving the first and second de-correlation signals comprises filtering an audio channel (406; 506; 607) extracted from the downmix signal (405; 505a, b; 605; 620) by means of one or more all-pass IIR filters based on a lattice structure; and

    generating output channels using the downmix signal, the first and the second de-correlation signals and upmix information so that the at least three channels are at least partly de-correlated from each other.

    11. Receiver or audio player, the receiver or audio player having a multi-channel decoder (400; 500; 600) in accordance with claim 1.

    12. Method of receiving or of audio playback, the method having a method for generating a reconstruction of a multi-channel signal in accordance with claim 10.

    13. Computer program for performing, when executed on a computer, a method in accordance with any of the method claims 10 or 12.


    REFERENCES CITED IN THE DESCRIPTION

    This list of references cited by the applicant is for the reader’s convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description

    • WO 2005101370 A1 [0011]

    Non-patent literature cited in the description

    • C. FALLER ; F. BAUMGARTE. Binaural Cue Coding applied to Stereo and Multi-Channel Audio Compression. AES convention paper 5574, May 2002 [0004]

    • Estimation of auditory spatial cues for binaural cue coding. 2 ICASSP [0004]

    • C. FALLER ; F. BAUMGARTE. Binaural cue coding: a normal and efficient representation of spatial audio, May 2002 [0004]

    • J. BREEBAART ; S. VAN DE PAR ; A. KOHLRAUSCH ; E. SCHUIJERS. Parametric Coding of stereo audio. Eurasip, J. Applied Signal Proc., 2005, vol. 9, 1305-1322 [0007]

    • J. BREEBAART ; S. VAN DE PAR ; A. KOHLRAUSCH ; E. SCHUIJERS. High-Quality Parametric Spatial Audio Coding at Low Bitrates. AES 116th Convention, Preprint 6072, May 2004 [0007]

    • E. SCHUIJERS ; J. BREEBAART ; H. PURNHAGEN ; J. ENGDEGARD. Low Complexity Parametric Stereo Coding. AES 116th Convention, Preprint 6073, May 2004 [0007]

    • G. POTARD ; I. BURNETT. Decorrelation techniques for the rendering of apparent sound source width in 3D audio displays. Proceedings of the 7th International Conference on Digital Audio Effects DAFX04, 05 October 2004 [0012]

    • BREEBAART J. et al. High-quality parametric spatial audio coding at low bitrates. Preprints of papers presented at the AES Convention, 08 May 2004 [0013]

    • SIMON HAYKIN. Adaptive Filter Theory. Prentice-Hall, 2002 [0075]
