voice morphing using 3d waveform interpolation...

18
EURASIP Journal on Applied Signal Processing 2005:8, 1174–1184 c 2005 Hindawi Publishing Corporation Voice Morphing using 3D Waveform Interpolation Surfaces and Lossless Tube Area Functions Yizhar Lavner Department of Computer Science, Tel-Hai Academic College, Upper Galilee 12210, Israel Email: yizhar [email protected] Signal and Image Processing Lab (SIPL), Department of Electrical Engineering, Technion – Israel Institute of Technology, Haifa 32000, Israel Gidon Porat Signal and Image Processing Lab (SIPL), Department of Electrical Engineering, Technion – Israel Institute of Technology, Haifa 32000, Israel Email: [email protected] Received 31 December 2003; Revised 2 February 2005; Recommended for Publication by Mark Kahrs Voice morphing is the process of producing intermediate or hybrid voices between the utterances of two speakers. It can also be defined as the process of gradually transforming the voice of one speaker to that of another. The ability to change the speaker’s individual characteristics and to produce high-quality voices can be used in many applications. Examples include multimedia and video entertainment, as well as enrichment of speech databases in text-to-speech systems. In this study we present a new technique which enables production of a given number of intermediate voices or of utterances which gradually change from one voice to another. This technique is based on two components: (1) creation of a 3D prototype waveform interpolation (PWI) surface from the LPC residual signal, to produce an intermediate excitation signal; (2) a representation of the vocal tract by a lossless tube area function, and an interpolation of the parameters of the two speakers. The resulting synthesized signal sounds like a natural voice lying between the two original voices. Keywords and phrases: voice morphing, prototype waveform interpolation, lossless tube area function, speech synthesis. 1. INTRODUCTION Voice morphing is the process of producing intermediate or hybrid voices between the utterances of two speakers. It can also be defined as the process of smoothly changing speech identity between two speakers [1], or gradually transforming the voice of a given speaker to that of another [2, 3]. The abil- ity to change the speaker’s individual characteristics and pro- duce high-quality voices can be used in many applications. For example, in multimedia and video entertainment, voice morphing is similar to its visual counterpart: while seeing a face gradually changing from one person’s to another’s, we can simultaneously hear the voice progressively changing, as well. Another potential application is forensic voice identifi- cation: creating a voice bank of dierent pitches, rates, and timbres to assist in recognition of a suspect’s voice. In a simi- lar manner, producing a databank of dierent voices which are intermediates between several given speech recordings can enhance the possibility of synthesizing a given utterance more naturally. When prerecorded speech data is taken from dierent speakers, a natural-sounding new message, created by concatenation of speech segments taken from these speak- ers, may be achieved using speech morphing [1]. Speech and audio morphing can also be a valuable tool for voice and speaker perception research, for example, in an attempt to control the emotional content of speech [2]. Within a text- to-speech (TTS) synthesis framework, voice morphing also oers the opportunity to generate a variety of voices from a database containing only a small number of speakers. This is potentially advantageous since the voice creation process for a TTS system is quite time consuming, and it can also con- siderably reduce the memory requirements for storing TTS voices. A successful procedure for voice morphing requires a representation of the speech signal in a parametric space, using a suitable mathematical model that allows interpola- tion between the characteristics of the two speakers. In other words, for the speech characteristics of the source speaker’s voice to change gradually to those of the target speaker, the pitch, duration, and spectral parameters must be extracted from both speakers. Natural-sounding synthetic intermedi- ates, with a new voice timbre, can then be produced.

Upload: lamque

Post on 28-Aug-2018

248 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP Journal on Applied Signal Processing 20058 1174ndash1184ccopy 2005 Hindawi Publishing Corporation

Voice Morphing using 3D Waveform InterpolationSurfaces and Lossless Tube Area Functions

Yizhar LavnerDepartment of Computer Science Tel-Hai Academic College Upper Galilee 12210 IsraelEmail yizhar lkyiftahorgil

Signal and Image Processing Lab (SIPL) Department of Electrical Engineering Technion ndash Israel Institute of TechnologyHaifa 32000 Israel

Gidon PoratSignal and Image Processing Lab (SIPL) Department of Electrical Engineering Technion ndash Israel Institute of TechnologyHaifa 32000 IsraelEmail gidonporatintelcom

Received 31 December 2003 Revised 2 February 2005 Recommended for Publication by Mark Kahrs

Voice morphing is the process of producing intermediate or hybrid voices between the utterances of two speakers It can also bedefined as the process of gradually transforming the voice of one speaker to that of another The ability to change the speakerrsquosindividual characteristics and to produce high-quality voices can be used in many applications Examples include multimedia andvideo entertainment as well as enrichment of speech databases in text-to-speech systems In this study we present a new techniquewhich enables production of a given number of intermediate voices or of utterances which gradually change from one voice toanother This technique is based on two components (1) creation of a 3D prototype waveform interpolation (PWI) surface fromthe LPC residual signal to produce an intermediate excitation signal (2) a representation of the vocal tract by a lossless tube areafunction and an interpolation of the parameters of the two speakers The resulting synthesized signal sounds like a natural voicelying between the two original voices

Keywords and phrases voice morphing prototype waveform interpolation lossless tube area function speech synthesis

1 INTRODUCTION

Voice morphing is the process of producing intermediate orhybrid voices between the utterances of two speakers It canalso be defined as the process of smoothly changing speechidentity between two speakers [1] or gradually transformingthe voice of a given speaker to that of another [2 3] The abil-ity to change the speakerrsquos individual characteristics and pro-duce high-quality voices can be used in many applicationsFor example in multimedia and video entertainment voicemorphing is similar to its visual counterpart while seeing aface gradually changing from one personrsquos to anotherrsquos wecan simultaneously hear the voice progressively changing aswell Another potential application is forensic voice identifi-cation creating a voice bank of different pitches rates andtimbres to assist in recognition of a suspectrsquos voice In a simi-lar manner producing a databank of different voices whichare intermediates between several given speech recordingscan enhance the possibility of synthesizing a given utterancemore naturally When prerecorded speech data is taken fromdifferent speakers a natural-sounding new message created

by concatenation of speech segments taken from these speak-ers may be achieved using speech morphing [1] Speech andaudio morphing can also be a valuable tool for voice andspeaker perception research for example in an attempt tocontrol the emotional content of speech [2] Within a text-to-speech (TTS) synthesis framework voice morphing alsooffers the opportunity to generate a variety of voices from adatabase containing only a small number of speakers This ispotentially advantageous since the voice creation process fora TTS system is quite time consuming and it can also con-siderably reduce the memory requirements for storing TTSvoices

A successful procedure for voice morphing requires arepresentation of the speech signal in a parametric spaceusing a suitable mathematical model that allows interpola-tion between the characteristics of the two speakers In otherwords for the speech characteristics of the source speakerrsquosvoice to change gradually to those of the target speaker thepitch duration and spectral parameters must be extractedfrom both speakers Natural-sounding synthetic intermedi-ates with a new voice timbre can then be produced

Voice Morphing Using 3D Waveform Interpolation Surfaces 1175

Several studies have explored the subject of speech orvoice morphing to date Slaney et al [3] used a represen-tation of separate time-aligned spectrograms for pitch andspectral envelope using MFCC and modified the spectro-grams separately to achieve an audio morph Short vowelswere used to demonstrate the resulting morphs Morphingbetween a womanrsquos vowel and a short note of an oboe wasalso used A similar approach used smooth spectrographicrepresentations to interpolate between utterances with differ-ent emotional contents [2] In another study real-time mor-phing was applied to a singing voice using an interpolationof a source voice with a target voice based on a sinusoidalmodel [4] The same method of sinusoidal analysis was re-ported in [5] In the digital music industry the Morpheussynthesizer by E-mu [6] introduced a 14-pole dynamicallyvariable filter which could model different resonant charac-teristics perform spectral morphing-like effects between dif-ferent musical samples and interpolate between them in realtime

In this study we present a new technique which enablesthe production of a desired number of intermediate voicesbetween the original voices of two speakers or the produc-tion of one voice signal that changes gradually in time fromone speaker to another The latter means that at the begin-ning of the utterance the voice characteristics are those ofone speaker and the voice is perceived as belonging to thatspeaker The voice is gradually modified towards the charac-teristics of another speaker so that by the end of the utter-ance it is perceived as belonging to the second speaker Thistechnique is based on two components One is the creation ofa 3D prototype waveform interpolation (PWI) surface fromthe residual error signal which is obtained by LPC analysisto produce a new intermediate excitation signal The secondcomponent is a representation of the vocal tract by a loss-less tube area function and interpolation of the parametersof the two speakers

The morphing algorithm consists of two main stagesanalysis and synthesis In the analysis stage the residual errorsignal is estimated along with the vocal tract parameters Theresidual signal is then used to create a PWI surface for eachspeech utterance In the synthesis stage a new residual errorsignal is recovered from a PWI surface interpolated from thetwo original surfaces The area functions of the two speakersare also interpolated producing a hybrid area function fromwhich a new vocal-tract filter is computed The residual sig-nal is then transferred through this filter to yield a morphedspeech signal Thus we use an excitation signal the dynam-ics of which are comprised of both excitation waves and pitchperiod contours along with a vocal tract with an interpolatedstructure

This study resembles other studies on voice conversion orvoice transformations [7 8 9 10 11 12 13 14] but it is sig-nificantly different Voice conversion modifies the utterancesof one speaker so that hisher voice will sound like another(target) voice by matching the source voice to the statisti-cal properties of the target voice In these studies differentmethods are used to represent the relationships between thesource and the target speakers and most of the studies are

concentrated on the spectral envelope data of short segmentsof speech The spectral envelopes are characterized by one ofseveral possible representations such as HNM (8) Cepstrumor log-area ratio [8] LSF [9 11] LPC [12] and formant fre-quencies [13] For example in [7] a Gaussian mixture model(GMM) is used where the conversion is performed in thecontext of the harmonic + noise model (HNM) using a con-tinuous probabilistic model of the source envelopes Conver-sion using GMM is also utilized in [12] with joint densityestimation for the spectral conversion using LPC analysiswhile the pitch of the source has been modified to match theaverage pitch of the target The residual LPC in each pitchperiod in the latter study was left intact In other studies theconversion is performed using codebook mapping with vec-tor quantization [9 13] or with artificial neural networks[14]

The aim of speech morphing as it is proposed and usedin [1 2 3] and as it has been carried out in the currentstudy is to produce intermediate voices between two givenutterances that will be perceived as lying between the twooriginal voices In the other morphing type gradual mor-phing the morphed sound should be perceived as one ob-ject that smoothly changes into another sound [3] The mor-phing algorithm presented here is shown to produce high-quality morphing sounds that are perceived as highly naturaland smooth

The paper is organized as follows In Section 21 the PWItechnique is introduced and in Section 22 the idea of usingit for speech morphing is presented The basic morphing al-gorithm is described in Section 23 A detailed computationof the characteristic waveforms function with the construc-tion of corresponding PWI surface and the interpolation be-tween two such surfaces are all presented in Section 231The procedure for extracting a new intermediate residual er-ror signal is presented in Section 232 Subsequently the cal-culation of the new vocal tract model and the synthesis ofthe morphed speech are described In Section 24 subjectivetests for evaluating the naturalness and the intelligibility ofthe morphing voices are presented Finally we discuss theadvantages and disadvantages of the algorithm in contrast toprevious studies

2 PROCEDURE AND RESULTS

In this section a method for decomposing the speech sig-nals of two speakers and recombining the components ispresented The components are the excitation signal rep-resented by the residual error signal from an LPC analysisand the vocal tract parameters represented by the area coef-ficients of a lossless tube model The method is designed sothat the resulting speech will be characterized perceptually asan interpolated version of the voices of the two speakers

21 Prototype waveform interpolationPWI is a speech coding method described in [15 16 17 18]This method is based on the fact that voiced speech isquasiperiodic and can be considered as a chain of pitch cy-cles Comparing consecutive pitch cycles reveals a slow evo-lution in the pitch-cycle waveform and duration that is each

1176 EURASIP Journal on Applied Signal Processing

pitch cycle has a close similarity to its neighbors The slowchange in the shape and duration of the pitch cycle suggeststhat extracting the cyclesrsquo waveform at regular time intervalsshould be sufficient in order to reconstruct the signal fromthe sampled cycles by interpolation The interpolation pro-cedure is carried out by constructing a 3D surface from thespeech or the residual error waveforms The coding proce-dure can be applied for both the speech signal and its resid-ual error function derived from the LPC analysis A detaileddescription of this coding technique for bit-rate reductioncan be found in [16] The representation of a speech signalrsquosresidual error function in the form of 3D surfaces has beenfound useful for voiced speech morphing The creation ofsuch a surface is described below

22 PWI-based speech morphingPrototype waveform interpolation is based on the observa-tion that during voiced segments of speech the pitch cyclesresemble each other and their general shape usually evolvesslowly in time (see [16 17 18]) The essential characteris-tics of the speech signal can thus be described by the pitch-cycle waveform By extracting pitch cycles at regular time in-stants and interpolating between them an interpolation sur-face can be created The speech can then be reconstructedfrom this surface if the pitch contour and the phase function(see Section 231) are known

The algorithm presented here is based on the source-filter model of speech production [19 20] According to thismodel voiced speech is the output of a time-varying vocal-tract filter excited by a time-varying glottal pulse signal Inorder to separate the vocal-tract filter from the source sig-nal we used the LPC analysis [21] by which the speech isdecomposed into two components the LPC coefficients con-taining the information of the vocal tract characteristics andthe residual error signal analogous to the derivative of theglottal pulse signal In the proposed morphing technique weused the PWI to create a 3D surface from the residual errorsignal which would represent the source characteristics foreach speaker Interpolation between the surfaces of the twospeakers allows us to create an intermediate excitation signalIn addition to the fact that the information of the vocal tract(see Section 233) is manipulated separately from the infor-mation of the residual error signal it is also more advanta-geous to create a PWI surface from the residual signal thanto obtain one from the speech itself In this domain it is rel-atively easy to ensure that the periodic extension procedure(see below) does not result in artifacts in the characteristicwaveform shape [16] This is due to the fact that the resid-ual signal contains mainly excitation pulses with low-powerregions in between and thus allows a smooth reconstruc-tion of the residual signal from the PWI surface with minimalphase discontinuities

In the proposed algorithm the surfaces of the residual er-ror signals computed for each voiced phoneme of two differ-ent speakers are interpolated to create an intermediate sur-face Together with an intermediate pitch contour and an in-terpolated vocal-tract filter a new voiced phoneme is pro-duced

23 The basic algorithm

The morphing algorithm consists of two main stagesmdashanalysis and synthesis As most of the speakerrsquos individualityis contained in the voiced portion of speech [22] and be-cause in preliminary experiments it was found that mor-phing the unvoiced sections yielded low-quality utterancesthe algorithm is applied on the voiced segments only Theunvoiced segments were left intact and concatenated withthe interpolated voiced segments Concatenation of voicedand unvoiced segments was performed by overlapping andadding two adjacent frames about 20 milliseconds each onefrom the voiced phoneme and the other from the unvoicedone The two frames are overlapped after each of them ismultiplied by a half-left or half-right Hanning or linear win-dows to yield a new frame that gradually changes from thevoiced segment to the unvoiced segment or vice-versa Sincethis operation may shorten the utterance time-scale com-pensation is carried out for each vocal segment by extendingthe PWI surface as necessary

The unvoiced segments are taken according to the mor-phing factor (α) from the first speaker where 0 le α lt 05and from the second one where 05 le α lt 10 The basicblock diagram of the algorithm is shown in Figure 1

In the analysis stage the voiced segments of both speechsignals are marked and each section in one of the voicesis associated with the corresponding section in the otherThe segmentation and mapping of the speech segmentsare done semiautomatically First a simple algorithm forvoicedunvoiced segmentation is applied which is based on3 parameters the short-time energy the normalized maxi-mal peak of the autocorrelation function in the range of 3ndash16milliseconds (the possible expected pitch period duration)and the short-time zero-crossing rate (Figure 2) The outputof the automatic voicedunvoiced segmentation is a series ofvoiced segments for each of the two voices However due tothe imperfection of the segmentation algorithm and the dis-similarity of the characteristics of the two voices and as ac-curate mapping between the corresponding voiced sectionsof the two voices is crucial for the success of the algorithma manual correction mode has been added to refine the pre-liminary segmentation The manual mode allows for makingadjustments to the edges of the segments splitting segmentsjoining segments and adding new segments or deleting onesFor the demarcation of phoneme boundaries the user canbe assisted by the graph of the 2-norm of the difference be-tween the MFCCs of adjacent frames (10 milliseconds eachsee Figure 2)

It was found that by applying manual segmentation asa refinement of the automatic segmentation it is possible toreach accurate mapping with only small adjustments

A pitch detection algorithm is applied to both speakersrsquoutterances The pitch detection algorithm is based on a com-bination of the cepstral method [23] for coarse-pitch perioddetection and the cross-correlation method [24] for refiningthe results Pitch marks are obtained and after preemphasislinear prediction coefficients are calculated for each voicedphoneme (either on the whole phoneme as one windowed

Voice Morphing Using 3D Waveform Interpolation Surfaces 1177

Phonemesboundaries

demarcation

Phonemesboundaries

demarcation

Pitchdetector

LPCanalysis

LPCanalysis

Pitchdetector

PWIcreation

Interpolator PWIcreation

Excitation signalreconstruction

InterpolatedLPC model

New interpolated voice

Speaker BSpeaker A

Synthesizer

Figure 1 A basic diagram of the algorithm

Figure 2 The semiautomatic segmentation of the speech signal A simple algorithm for voicedunvoiced segmentation is applied Theupper graph shows the speech signal with the vuv boundaries In the 4 graphs below the 4 parameters on which the decision is based aredepicted the short-time energy the zero-crossing rate the autocorrelation and the MFCC difference coefficient The user can correct thesegmentation manually with an interactive graphical user interface (see text)

1178 EURASIP Journal on Applied Signal Processing

frame or pitch synchronously) to create the vocal-tract filterand the residual error function for each segment The pro-totype waveform surfaces are then created from the resid-ual error functions (see Section 231) In the synthesis stagea new residual error signal is recovered from a PWI sur-face interpolated from the two original ones (as describedin Section 232) The two speakersrsquo area functions are alsointerpolated producing an intermediate area function fromwhich a new vocal-tract filter is computed (see Section 233)The new residual signal is then used to excite the interpo-lated vocal-tract filter to yield an intermediate speech signalThe final morphed speech signal is created by concatenatingthe new vocal phonemes in order along with the unvoicedphonemes and silent periods of the source or of the target

231 Computation of the characteristicwaveform surface

The characteristic waveform surface which represents theresidual error signal derived from the voiced sections [16] isa two-dimensional signal that represents a one-dimensionalsignal and is constructed as follows let u(tφ) be the char-acteristic waveform where t denotes the time axis and φ is aphase variable whose values are in the range [0 2π] The pro-totype waveforms are displayed along the phase axis whereeach prototype is a short segment from the residual signalwith a length of one pitch period Each prototype is consid-ered as a periodic function with a period of 2π The timeaxis of the surface displays the waveform evolution A one-dimensional signal can be recovered from u(tφ) by using aspecific φ(t) so that

r(t) = u(tφ(t)

) (1)

where φ(t) is calculated using the signal pitch period func-tion or pitch contour p(t) by

φ(t) = φ(t0)

+int t

t0

2πp(t)

dt (2)

A typical prediction error signal and its surface are shown inFigure 3

In the proposed solution the surface for each phonemeis created separately The construction of such a surface isdetailed in using the following procedure (Figure 4)

(1) Pitch detection is applied in order to obtain an instan-taneous pitch value p(t) which will track the pitch cy-cle change through time At any given point in timethe pitch cycle is determined by a linear interpolationof the pitch marks obtained by the pitch detector

(2) A rectangular window with duration of one pitch pe-riod multiplies the residual error function around asampling time ti with a step update of 25 millisec-onds to create a prototype waveform In order tosmooth the surface along the time axis a low-orderSavitzky-Golay filter is applied to the error function

0 50 100 150 200 250 300 350 400minus03

minus02

minus01

0

01

02

03

04

05

06

Am

plit

ude

t

(a)

60 50 40 30 20 10 0

100200

300400

minus04

minus02

0

02

04

06

08

1

Am

plit

ude

φ

t

(b)

Figure 3 Creating the characteristic waveform surface (a) A typ-ical residual error signal derived from the speech signal using lin-ear prediction analysis (b) The surface is constructed by plottingthe prototype waveform along the φ-axis at intervals of 25 millisec-onds after alignment and interpolation of the prototypes

(3) For the reconstruction of the signal from the surfaceit is extremely important to maintain similar and min-imum energy values at both ends of the prototypewaveform (which actually represent the same pointdue to the 2π periodicity along the phase axis) There-fore a shift of plusmn∆ samples (∆max was set to be 1 mil-lisecond) is allowed for the location of the windowrsquoscenter in the construction of the PWI surface

(4) Because the pitch cycle varies in time each prototypewaveform will be of different length Therefore all pro-totypes must be aligned along φ = [0 minus 2π] and musthave the same number of samples

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 2: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Voice Morphing Using 3D Waveform Interpolation Surfaces 1175

Several studies have explored the subject of speech orvoice morphing to date Slaney et al [3] used a represen-tation of separate time-aligned spectrograms for pitch andspectral envelope using MFCC and modified the spectro-grams separately to achieve an audio morph Short vowelswere used to demonstrate the resulting morphs Morphingbetween a womanrsquos vowel and a short note of an oboe wasalso used A similar approach used smooth spectrographicrepresentations to interpolate between utterances with differ-ent emotional contents [2] In another study real-time mor-phing was applied to a singing voice using an interpolationof a source voice with a target voice based on a sinusoidalmodel [4] The same method of sinusoidal analysis was re-ported in [5] In the digital music industry the Morpheussynthesizer by E-mu [6] introduced a 14-pole dynamicallyvariable filter which could model different resonant charac-teristics perform spectral morphing-like effects between dif-ferent musical samples and interpolate between them in realtime

In this study we present a new technique which enablesthe production of a desired number of intermediate voicesbetween the original voices of two speakers or the produc-tion of one voice signal that changes gradually in time fromone speaker to another The latter means that at the begin-ning of the utterance the voice characteristics are those ofone speaker and the voice is perceived as belonging to thatspeaker The voice is gradually modified towards the charac-teristics of another speaker so that by the end of the utter-ance it is perceived as belonging to the second speaker Thistechnique is based on two components One is the creation ofa 3D prototype waveform interpolation (PWI) surface fromthe residual error signal which is obtained by LPC analysisto produce a new intermediate excitation signal The secondcomponent is a representation of the vocal tract by a loss-less tube area function and interpolation of the parametersof the two speakers

The morphing algorithm consists of two main stagesanalysis and synthesis In the analysis stage the residual errorsignal is estimated along with the vocal tract parameters Theresidual signal is then used to create a PWI surface for eachspeech utterance In the synthesis stage a new residual errorsignal is recovered from a PWI surface interpolated from thetwo original surfaces The area functions of the two speakersare also interpolated producing a hybrid area function fromwhich a new vocal-tract filter is computed The residual sig-nal is then transferred through this filter to yield a morphedspeech signal Thus we use an excitation signal the dynam-ics of which are comprised of both excitation waves and pitchperiod contours along with a vocal tract with an interpolatedstructure

This study resembles other studies on voice conversion orvoice transformations [7 8 9 10 11 12 13 14] but it is sig-nificantly different Voice conversion modifies the utterancesof one speaker so that hisher voice will sound like another(target) voice by matching the source voice to the statisti-cal properties of the target voice In these studies differentmethods are used to represent the relationships between thesource and the target speakers and most of the studies are

concentrated on the spectral envelope data of short segmentsof speech The spectral envelopes are characterized by one ofseveral possible representations such as HNM (8) Cepstrumor log-area ratio [8] LSF [9 11] LPC [12] and formant fre-quencies [13] For example in [7] a Gaussian mixture model(GMM) is used where the conversion is performed in thecontext of the harmonic + noise model (HNM) using a con-tinuous probabilistic model of the source envelopes Conver-sion using GMM is also utilized in [12] with joint densityestimation for the spectral conversion using LPC analysiswhile the pitch of the source has been modified to match theaverage pitch of the target The residual LPC in each pitchperiod in the latter study was left intact In other studies theconversion is performed using codebook mapping with vec-tor quantization [9 13] or with artificial neural networks[14]

The aim of speech morphing as it is proposed and usedin [1 2 3] and as it has been carried out in the currentstudy is to produce intermediate voices between two givenutterances that will be perceived as lying between the twooriginal voices In the other morphing type gradual mor-phing the morphed sound should be perceived as one ob-ject that smoothly changes into another sound [3] The mor-phing algorithm presented here is shown to produce high-quality morphing sounds that are perceived as highly naturaland smooth

The paper is organized as follows In Section 21 the PWItechnique is introduced and in Section 22 the idea of usingit for speech morphing is presented The basic morphing al-gorithm is described in Section 23 A detailed computationof the characteristic waveforms function with the construc-tion of corresponding PWI surface and the interpolation be-tween two such surfaces are all presented in Section 231The procedure for extracting a new intermediate residual er-ror signal is presented in Section 232 Subsequently the cal-culation of the new vocal tract model and the synthesis ofthe morphed speech are described In Section 24 subjectivetests for evaluating the naturalness and the intelligibility ofthe morphing voices are presented Finally we discuss theadvantages and disadvantages of the algorithm in contrast toprevious studies

2 PROCEDURE AND RESULTS

In this section a method for decomposing the speech sig-nals of two speakers and recombining the components ispresented The components are the excitation signal rep-resented by the residual error signal from an LPC analysisand the vocal tract parameters represented by the area coef-ficients of a lossless tube model The method is designed sothat the resulting speech will be characterized perceptually asan interpolated version of the voices of the two speakers

21 Prototype waveform interpolationPWI is a speech coding method described in [15 16 17 18]This method is based on the fact that voiced speech isquasiperiodic and can be considered as a chain of pitch cy-cles Comparing consecutive pitch cycles reveals a slow evo-lution in the pitch-cycle waveform and duration that is each

1176 EURASIP Journal on Applied Signal Processing

pitch cycle has a close similarity to its neighbors The slowchange in the shape and duration of the pitch cycle suggeststhat extracting the cyclesrsquo waveform at regular time intervalsshould be sufficient in order to reconstruct the signal fromthe sampled cycles by interpolation The interpolation pro-cedure is carried out by constructing a 3D surface from thespeech or the residual error waveforms The coding proce-dure can be applied for both the speech signal and its resid-ual error function derived from the LPC analysis A detaileddescription of this coding technique for bit-rate reductioncan be found in [16] The representation of a speech signalrsquosresidual error function in the form of 3D surfaces has beenfound useful for voiced speech morphing The creation ofsuch a surface is described below

22 PWI-based speech morphingPrototype waveform interpolation is based on the observa-tion that during voiced segments of speech the pitch cyclesresemble each other and their general shape usually evolvesslowly in time (see [16 17 18]) The essential characteris-tics of the speech signal can thus be described by the pitch-cycle waveform By extracting pitch cycles at regular time in-stants and interpolating between them an interpolation sur-face can be created The speech can then be reconstructedfrom this surface if the pitch contour and the phase function(see Section 231) are known

The algorithm presented here is based on the source-filter model of speech production [19 20] According to thismodel voiced speech is the output of a time-varying vocal-tract filter excited by a time-varying glottal pulse signal Inorder to separate the vocal-tract filter from the source sig-nal we used the LPC analysis [21] by which the speech isdecomposed into two components the LPC coefficients con-taining the information of the vocal tract characteristics andthe residual error signal analogous to the derivative of theglottal pulse signal In the proposed morphing technique weused the PWI to create a 3D surface from the residual errorsignal which would represent the source characteristics foreach speaker Interpolation between the surfaces of the twospeakers allows us to create an intermediate excitation signalIn addition to the fact that the information of the vocal tract(see Section 233) is manipulated separately from the infor-mation of the residual error signal it is also more advanta-geous to create a PWI surface from the residual signal thanto obtain one from the speech itself In this domain it is rel-atively easy to ensure that the periodic extension procedure(see below) does not result in artifacts in the characteristicwaveform shape [16] This is due to the fact that the resid-ual signal contains mainly excitation pulses with low-powerregions in between and thus allows a smooth reconstruc-tion of the residual signal from the PWI surface with minimalphase discontinuities

In the proposed algorithm the surfaces of the residual er-ror signals computed for each voiced phoneme of two differ-ent speakers are interpolated to create an intermediate sur-face Together with an intermediate pitch contour and an in-terpolated vocal-tract filter a new voiced phoneme is pro-duced

23 The basic algorithm

The morphing algorithm consists of two main stagesmdashanalysis and synthesis As most of the speakerrsquos individualityis contained in the voiced portion of speech [22] and be-cause in preliminary experiments it was found that mor-phing the unvoiced sections yielded low-quality utterancesthe algorithm is applied on the voiced segments only Theunvoiced segments were left intact and concatenated withthe interpolated voiced segments Concatenation of voicedand unvoiced segments was performed by overlapping andadding two adjacent frames about 20 milliseconds each onefrom the voiced phoneme and the other from the unvoicedone The two frames are overlapped after each of them ismultiplied by a half-left or half-right Hanning or linear win-dows to yield a new frame that gradually changes from thevoiced segment to the unvoiced segment or vice-versa Sincethis operation may shorten the utterance time-scale com-pensation is carried out for each vocal segment by extendingthe PWI surface as necessary

The unvoiced segments are taken according to the mor-phing factor (α) from the first speaker where 0 le α lt 05and from the second one where 05 le α lt 10 The basicblock diagram of the algorithm is shown in Figure 1

In the analysis stage the voiced segments of both speechsignals are marked and each section in one of the voicesis associated with the corresponding section in the otherThe segmentation and mapping of the speech segmentsare done semiautomatically First a simple algorithm forvoicedunvoiced segmentation is applied which is based on3 parameters the short-time energy the normalized maxi-mal peak of the autocorrelation function in the range of 3ndash16milliseconds (the possible expected pitch period duration)and the short-time zero-crossing rate (Figure 2) The outputof the automatic voicedunvoiced segmentation is a series ofvoiced segments for each of the two voices However due tothe imperfection of the segmentation algorithm and the dis-similarity of the characteristics of the two voices and as ac-curate mapping between the corresponding voiced sectionsof the two voices is crucial for the success of the algorithma manual correction mode has been added to refine the pre-liminary segmentation The manual mode allows for makingadjustments to the edges of the segments splitting segmentsjoining segments and adding new segments or deleting onesFor the demarcation of phoneme boundaries the user canbe assisted by the graph of the 2-norm of the difference be-tween the MFCCs of adjacent frames (10 milliseconds eachsee Figure 2)

It was found that by applying manual segmentation asa refinement of the automatic segmentation it is possible toreach accurate mapping with only small adjustments

A pitch detection algorithm is applied to both speakersrsquoutterances The pitch detection algorithm is based on a com-bination of the cepstral method [23] for coarse-pitch perioddetection and the cross-correlation method [24] for refiningthe results Pitch marks are obtained and after preemphasislinear prediction coefficients are calculated for each voicedphoneme (either on the whole phoneme as one windowed

Voice Morphing Using 3D Waveform Interpolation Surfaces 1177

Phonemesboundaries

demarcation

Phonemesboundaries

demarcation

Pitchdetector

LPCanalysis

LPCanalysis

Pitchdetector

PWIcreation

Interpolator PWIcreation

Excitation signalreconstruction

InterpolatedLPC model

New interpolated voice

Speaker BSpeaker A

Synthesizer

Figure 1 A basic diagram of the algorithm

Figure 2 The semiautomatic segmentation of the speech signal A simple algorithm for voicedunvoiced segmentation is applied Theupper graph shows the speech signal with the vuv boundaries In the 4 graphs below the 4 parameters on which the decision is based aredepicted the short-time energy the zero-crossing rate the autocorrelation and the MFCC difference coefficient The user can correct thesegmentation manually with an interactive graphical user interface (see text)

1178 EURASIP Journal on Applied Signal Processing

frame or pitch synchronously) to create the vocal-tract filterand the residual error function for each segment The pro-totype waveform surfaces are then created from the resid-ual error functions (see Section 231) In the synthesis stagea new residual error signal is recovered from a PWI sur-face interpolated from the two original ones (as describedin Section 232) The two speakersrsquo area functions are alsointerpolated producing an intermediate area function fromwhich a new vocal-tract filter is computed (see Section 233)The new residual signal is then used to excite the interpo-lated vocal-tract filter to yield an intermediate speech signalThe final morphed speech signal is created by concatenatingthe new vocal phonemes in order along with the unvoicedphonemes and silent periods of the source or of the target

231 Computation of the characteristicwaveform surface

The characteristic waveform surface which represents theresidual error signal derived from the voiced sections [16] isa two-dimensional signal that represents a one-dimensionalsignal and is constructed as follows let u(tφ) be the char-acteristic waveform where t denotes the time axis and φ is aphase variable whose values are in the range [0 2π] The pro-totype waveforms are displayed along the phase axis whereeach prototype is a short segment from the residual signalwith a length of one pitch period Each prototype is consid-ered as a periodic function with a period of 2π The timeaxis of the surface displays the waveform evolution A one-dimensional signal can be recovered from u(tφ) by using aspecific φ(t) so that

r(t) = u(tφ(t)

) (1)

where φ(t) is calculated using the signal pitch period func-tion or pitch contour p(t) by

φ(t) = φ(t0)

+int t

t0

2πp(t)

dt (2)

A typical prediction error signal and its surface are shown inFigure 3

In the proposed solution the surface for each phonemeis created separately The construction of such a surface isdetailed in using the following procedure (Figure 4)

(1) Pitch detection is applied in order to obtain an instan-taneous pitch value p(t) which will track the pitch cy-cle change through time At any given point in timethe pitch cycle is determined by a linear interpolationof the pitch marks obtained by the pitch detector

(2) A rectangular window with duration of one pitch pe-riod multiplies the residual error function around asampling time ti with a step update of 25 millisec-onds to create a prototype waveform In order tosmooth the surface along the time axis a low-orderSavitzky-Golay filter is applied to the error function

0 50 100 150 200 250 300 350 400minus03

minus02

minus01

0

01

02

03

04

05

06

Am

plit

ude

t

(a)

60 50 40 30 20 10 0

100200

300400

minus04

minus02

0

02

04

06

08

1

Am

plit

ude

φ

t

(b)

Figure 3 Creating the characteristic waveform surface (a) A typ-ical residual error signal derived from the speech signal using lin-ear prediction analysis (b) The surface is constructed by plottingthe prototype waveform along the φ-axis at intervals of 25 millisec-onds after alignment and interpolation of the prototypes

(3) For the reconstruction of the signal from the surfaceit is extremely important to maintain similar and min-imum energy values at both ends of the prototypewaveform (which actually represent the same pointdue to the 2π periodicity along the phase axis) There-fore a shift of plusmn∆ samples (∆max was set to be 1 mil-lisecond) is allowed for the location of the windowrsquoscenter in the construction of the PWI surface

(4) Because the pitch cycle varies in time each prototypewaveform will be of different length Therefore all pro-totypes must be aligned along φ = [0 minus 2π] and musthave the same number of samples

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 3: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

1176 EURASIP Journal on Applied Signal Processing

pitch cycle has a close similarity to its neighbors The slowchange in the shape and duration of the pitch cycle suggeststhat extracting the cyclesrsquo waveform at regular time intervalsshould be sufficient in order to reconstruct the signal fromthe sampled cycles by interpolation The interpolation pro-cedure is carried out by constructing a 3D surface from thespeech or the residual error waveforms The coding proce-dure can be applied for both the speech signal and its resid-ual error function derived from the LPC analysis A detaileddescription of this coding technique for bit-rate reductioncan be found in [16] The representation of a speech signalrsquosresidual error function in the form of 3D surfaces has beenfound useful for voiced speech morphing The creation ofsuch a surface is described below

22 PWI-based speech morphingPrototype waveform interpolation is based on the observa-tion that during voiced segments of speech the pitch cyclesresemble each other and their general shape usually evolvesslowly in time (see [16 17 18]) The essential characteris-tics of the speech signal can thus be described by the pitch-cycle waveform By extracting pitch cycles at regular time in-stants and interpolating between them an interpolation sur-face can be created The speech can then be reconstructedfrom this surface if the pitch contour and the phase function(see Section 231) are known

The algorithm presented here is based on the source-filter model of speech production [19 20] According to thismodel voiced speech is the output of a time-varying vocal-tract filter excited by a time-varying glottal pulse signal Inorder to separate the vocal-tract filter from the source sig-nal we used the LPC analysis [21] by which the speech isdecomposed into two components the LPC coefficients con-taining the information of the vocal tract characteristics andthe residual error signal analogous to the derivative of theglottal pulse signal In the proposed morphing technique weused the PWI to create a 3D surface from the residual errorsignal which would represent the source characteristics foreach speaker Interpolation between the surfaces of the twospeakers allows us to create an intermediate excitation signalIn addition to the fact that the information of the vocal tract(see Section 233) is manipulated separately from the infor-mation of the residual error signal it is also more advanta-geous to create a PWI surface from the residual signal thanto obtain one from the speech itself In this domain it is rel-atively easy to ensure that the periodic extension procedure(see below) does not result in artifacts in the characteristicwaveform shape [16] This is due to the fact that the resid-ual signal contains mainly excitation pulses with low-powerregions in between and thus allows a smooth reconstruc-tion of the residual signal from the PWI surface with minimalphase discontinuities

In the proposed algorithm the surfaces of the residual er-ror signals computed for each voiced phoneme of two differ-ent speakers are interpolated to create an intermediate sur-face Together with an intermediate pitch contour and an in-terpolated vocal-tract filter a new voiced phoneme is pro-duced

23 The basic algorithm

The morphing algorithm consists of two main stagesmdashanalysis and synthesis As most of the speakerrsquos individualityis contained in the voiced portion of speech [22] and be-cause in preliminary experiments it was found that mor-phing the unvoiced sections yielded low-quality utterancesthe algorithm is applied on the voiced segments only Theunvoiced segments were left intact and concatenated withthe interpolated voiced segments Concatenation of voicedand unvoiced segments was performed by overlapping andadding two adjacent frames about 20 milliseconds each onefrom the voiced phoneme and the other from the unvoicedone The two frames are overlapped after each of them ismultiplied by a half-left or half-right Hanning or linear win-dows to yield a new frame that gradually changes from thevoiced segment to the unvoiced segment or vice-versa Sincethis operation may shorten the utterance time-scale com-pensation is carried out for each vocal segment by extendingthe PWI surface as necessary

The unvoiced segments are taken according to the mor-phing factor (α) from the first speaker where 0 le α lt 05and from the second one where 05 le α lt 10 The basicblock diagram of the algorithm is shown in Figure 1

In the analysis stage the voiced segments of both speechsignals are marked and each section in one of the voicesis associated with the corresponding section in the otherThe segmentation and mapping of the speech segmentsare done semiautomatically First a simple algorithm forvoicedunvoiced segmentation is applied which is based on3 parameters the short-time energy the normalized maxi-mal peak of the autocorrelation function in the range of 3ndash16milliseconds (the possible expected pitch period duration)and the short-time zero-crossing rate (Figure 2) The outputof the automatic voicedunvoiced segmentation is a series ofvoiced segments for each of the two voices However due tothe imperfection of the segmentation algorithm and the dis-similarity of the characteristics of the two voices and as ac-curate mapping between the corresponding voiced sectionsof the two voices is crucial for the success of the algorithma manual correction mode has been added to refine the pre-liminary segmentation The manual mode allows for makingadjustments to the edges of the segments splitting segmentsjoining segments and adding new segments or deleting onesFor the demarcation of phoneme boundaries the user canbe assisted by the graph of the 2-norm of the difference be-tween the MFCCs of adjacent frames (10 milliseconds eachsee Figure 2)

It was found that by applying manual segmentation asa refinement of the automatic segmentation it is possible toreach accurate mapping with only small adjustments

A pitch detection algorithm is applied to both speakersrsquoutterances The pitch detection algorithm is based on a com-bination of the cepstral method [23] for coarse-pitch perioddetection and the cross-correlation method [24] for refiningthe results Pitch marks are obtained and after preemphasislinear prediction coefficients are calculated for each voicedphoneme (either on the whole phoneme as one windowed

Voice Morphing Using 3D Waveform Interpolation Surfaces 1177

Phonemesboundaries

demarcation

Phonemesboundaries

demarcation

Pitchdetector

LPCanalysis

LPCanalysis

Pitchdetector

PWIcreation

Interpolator PWIcreation

Excitation signalreconstruction

InterpolatedLPC model

New interpolated voice

Speaker BSpeaker A

Synthesizer

Figure 1 A basic diagram of the algorithm

Figure 2 The semiautomatic segmentation of the speech signal A simple algorithm for voicedunvoiced segmentation is applied Theupper graph shows the speech signal with the vuv boundaries In the 4 graphs below the 4 parameters on which the decision is based aredepicted the short-time energy the zero-crossing rate the autocorrelation and the MFCC difference coefficient The user can correct thesegmentation manually with an interactive graphical user interface (see text)

1178 EURASIP Journal on Applied Signal Processing

frame or pitch synchronously) to create the vocal-tract filterand the residual error function for each segment The pro-totype waveform surfaces are then created from the resid-ual error functions (see Section 231) In the synthesis stagea new residual error signal is recovered from a PWI sur-face interpolated from the two original ones (as describedin Section 232) The two speakersrsquo area functions are alsointerpolated producing an intermediate area function fromwhich a new vocal-tract filter is computed (see Section 233)The new residual signal is then used to excite the interpo-lated vocal-tract filter to yield an intermediate speech signalThe final morphed speech signal is created by concatenatingthe new vocal phonemes in order along with the unvoicedphonemes and silent periods of the source or of the target

231 Computation of the characteristicwaveform surface

The characteristic waveform surface which represents theresidual error signal derived from the voiced sections [16] isa two-dimensional signal that represents a one-dimensionalsignal and is constructed as follows let u(tφ) be the char-acteristic waveform where t denotes the time axis and φ is aphase variable whose values are in the range [0 2π] The pro-totype waveforms are displayed along the phase axis whereeach prototype is a short segment from the residual signalwith a length of one pitch period Each prototype is consid-ered as a periodic function with a period of 2π The timeaxis of the surface displays the waveform evolution A one-dimensional signal can be recovered from u(tφ) by using aspecific φ(t) so that

r(t) = u(tφ(t)

) (1)

where φ(t) is calculated using the signal pitch period func-tion or pitch contour p(t) by

φ(t) = φ(t0)

+int t

t0

2πp(t)

dt (2)

A typical prediction error signal and its surface are shown inFigure 3

In the proposed solution the surface for each phonemeis created separately The construction of such a surface isdetailed in using the following procedure (Figure 4)

(1) Pitch detection is applied in order to obtain an instan-taneous pitch value p(t) which will track the pitch cy-cle change through time At any given point in timethe pitch cycle is determined by a linear interpolationof the pitch marks obtained by the pitch detector

(2) A rectangular window with duration of one pitch pe-riod multiplies the residual error function around asampling time ti with a step update of 25 millisec-onds to create a prototype waveform In order tosmooth the surface along the time axis a low-orderSavitzky-Golay filter is applied to the error function

0 50 100 150 200 250 300 350 400minus03

minus02

minus01

0

01

02

03

04

05

06

Am

plit

ude

t

(a)

60 50 40 30 20 10 0

100200

300400

minus04

minus02

0

02

04

06

08

1

Am

plit

ude

φ

t

(b)

Figure 3 Creating the characteristic waveform surface (a) A typ-ical residual error signal derived from the speech signal using lin-ear prediction analysis (b) The surface is constructed by plottingthe prototype waveform along the φ-axis at intervals of 25 millisec-onds after alignment and interpolation of the prototypes

(3) For the reconstruction of the signal from the surfaceit is extremely important to maintain similar and min-imum energy values at both ends of the prototypewaveform (which actually represent the same pointdue to the 2π periodicity along the phase axis) There-fore a shift of plusmn∆ samples (∆max was set to be 1 mil-lisecond) is allowed for the location of the windowrsquoscenter in the construction of the PWI surface

(4) Because the pitch cycle varies in time each prototypewaveform will be of different length Therefore all pro-totypes must be aligned along φ = [0 minus 2π] and musthave the same number of samples

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 4: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Voice Morphing Using 3D Waveform Interpolation Surfaces 1177

Phonemesboundaries

demarcation

Phonemesboundaries

demarcation

Pitchdetector

LPCanalysis

LPCanalysis

Pitchdetector

PWIcreation

Interpolator PWIcreation

Excitation signalreconstruction

InterpolatedLPC model

New interpolated voice

Speaker BSpeaker A

Synthesizer

Figure 1 A basic diagram of the algorithm

Figure 2 The semiautomatic segmentation of the speech signal A simple algorithm for voicedunvoiced segmentation is applied Theupper graph shows the speech signal with the vuv boundaries In the 4 graphs below the 4 parameters on which the decision is based aredepicted the short-time energy the zero-crossing rate the autocorrelation and the MFCC difference coefficient The user can correct thesegmentation manually with an interactive graphical user interface (see text)

1178 EURASIP Journal on Applied Signal Processing

frame or pitch synchronously) to create the vocal-tract filterand the residual error function for each segment The pro-totype waveform surfaces are then created from the resid-ual error functions (see Section 231) In the synthesis stagea new residual error signal is recovered from a PWI sur-face interpolated from the two original ones (as describedin Section 232) The two speakersrsquo area functions are alsointerpolated producing an intermediate area function fromwhich a new vocal-tract filter is computed (see Section 233)The new residual signal is then used to excite the interpo-lated vocal-tract filter to yield an intermediate speech signalThe final morphed speech signal is created by concatenatingthe new vocal phonemes in order along with the unvoicedphonemes and silent periods of the source or of the target

231 Computation of the characteristicwaveform surface

The characteristic waveform surface which represents theresidual error signal derived from the voiced sections [16] isa two-dimensional signal that represents a one-dimensionalsignal and is constructed as follows let u(tφ) be the char-acteristic waveform where t denotes the time axis and φ is aphase variable whose values are in the range [0 2π] The pro-totype waveforms are displayed along the phase axis whereeach prototype is a short segment from the residual signalwith a length of one pitch period Each prototype is consid-ered as a periodic function with a period of 2π The timeaxis of the surface displays the waveform evolution A one-dimensional signal can be recovered from u(tφ) by using aspecific φ(t) so that

r(t) = u(tφ(t)

) (1)

where φ(t) is calculated using the signal pitch period func-tion or pitch contour p(t) by

φ(t) = φ(t0)

+int t

t0

2πp(t)

dt (2)

A typical prediction error signal and its surface are shown inFigure 3

In the proposed solution the surface for each phonemeis created separately The construction of such a surface isdetailed in using the following procedure (Figure 4)

(1) Pitch detection is applied in order to obtain an instan-taneous pitch value p(t) which will track the pitch cy-cle change through time At any given point in timethe pitch cycle is determined by a linear interpolationof the pitch marks obtained by the pitch detector

(2) A rectangular window with duration of one pitch pe-riod multiplies the residual error function around asampling time ti with a step update of 25 millisec-onds to create a prototype waveform In order tosmooth the surface along the time axis a low-orderSavitzky-Golay filter is applied to the error function

0 50 100 150 200 250 300 350 400minus03

minus02

minus01

0

01

02

03

04

05

06

Am

plit

ude

t

(a)

60 50 40 30 20 10 0

100200

300400

minus04

minus02

0

02

04

06

08

1

Am

plit

ude

φ

t

(b)

Figure 3 Creating the characteristic waveform surface (a) A typ-ical residual error signal derived from the speech signal using lin-ear prediction analysis (b) The surface is constructed by plottingthe prototype waveform along the φ-axis at intervals of 25 millisec-onds after alignment and interpolation of the prototypes

(3) For the reconstruction of the signal from the surfaceit is extremely important to maintain similar and min-imum energy values at both ends of the prototypewaveform (which actually represent the same pointdue to the 2π periodicity along the phase axis) There-fore a shift of plusmn∆ samples (∆max was set to be 1 mil-lisecond) is allowed for the location of the windowrsquoscenter in the construction of the PWI surface

(4) Because the pitch cycle varies in time each prototypewaveform will be of different length Therefore all pro-totypes must be aligned along φ = [0 minus 2π] and musthave the same number of samples

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 5: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

1178 EURASIP Journal on Applied Signal Processing

frame or pitch synchronously) to create the vocal-tract filterand the residual error function for each segment The pro-totype waveform surfaces are then created from the resid-ual error functions (see Section 231) In the synthesis stagea new residual error signal is recovered from a PWI sur-face interpolated from the two original ones (as describedin Section 232) The two speakersrsquo area functions are alsointerpolated producing an intermediate area function fromwhich a new vocal-tract filter is computed (see Section 233)The new residual signal is then used to excite the interpo-lated vocal-tract filter to yield an intermediate speech signalThe final morphed speech signal is created by concatenatingthe new vocal phonemes in order along with the unvoicedphonemes and silent periods of the source or of the target

231 Computation of the characteristicwaveform surface

The characteristic waveform surface which represents theresidual error signal derived from the voiced sections [16] isa two-dimensional signal that represents a one-dimensionalsignal and is constructed as follows let u(tφ) be the char-acteristic waveform where t denotes the time axis and φ is aphase variable whose values are in the range [0 2π] The pro-totype waveforms are displayed along the phase axis whereeach prototype is a short segment from the residual signalwith a length of one pitch period Each prototype is consid-ered as a periodic function with a period of 2π The timeaxis of the surface displays the waveform evolution A one-dimensional signal can be recovered from u(tφ) by using aspecific φ(t) so that

r(t) = u(tφ(t)

) (1)

where φ(t) is calculated using the signal pitch period func-tion or pitch contour p(t) by

φ(t) = φ(t0)

+int t

t0

2πp(t)

dt (2)

A typical prediction error signal and its surface are shown inFigure 3

In the proposed solution the surface for each phonemeis created separately The construction of such a surface isdetailed in using the following procedure (Figure 4)

(1) Pitch detection is applied in order to obtain an instan-taneous pitch value p(t) which will track the pitch cy-cle change through time At any given point in timethe pitch cycle is determined by a linear interpolationof the pitch marks obtained by the pitch detector

(2) A rectangular window with duration of one pitch pe-riod multiplies the residual error function around asampling time ti with a step update of 25 millisec-onds to create a prototype waveform In order tosmooth the surface along the time axis a low-orderSavitzky-Golay filter is applied to the error function

0 50 100 150 200 250 300 350 400minus03

minus02

minus01

0

01

02

03

04

05

06

Am

plit

ude

t

(a)

60 50 40 30 20 10 0

100200

300400

minus04

minus02

0

02

04

06

08

1

Am

plit

ude

φ

t

(b)

Figure 3 Creating the characteristic waveform surface (a) A typ-ical residual error signal derived from the speech signal using lin-ear prediction analysis (b) The surface is constructed by plottingthe prototype waveform along the φ-axis at intervals of 25 millisec-onds after alignment and interpolation of the prototypes

(3) For the reconstruction of the signal from the surfaceit is extremely important to maintain similar and min-imum energy values at both ends of the prototypewaveform (which actually represent the same pointdue to the 2π periodicity along the phase axis) There-fore a shift of plusmn∆ samples (∆max was set to be 1 mil-lisecond) is allowed for the location of the windowrsquoscenter in the construction of the PWI surface

(4) Because the pitch cycle varies in time each prototypewaveform will be of different length Therefore all pro-totypes must be aligned along φ = [0 minus 2π] and musthave the same number of samples

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 6: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Voice Morphing Using 3D Waveform Interpolation Surfaces 1179

The errorwaveform surface

Interpolatealong

time axis

Normalizebetween0minus 2π

Align withprevious

waveform

Prototypewaveformsampler

LPF

Short-timepitch function

calculation

Error signalcalculation

Pitch detectionmarks

LPC parameters

Figure 4 A schematic description of the PWI surface construction

(5) Once a prototype waveform is sampled according tostep (3) above a cyclical shift along the φ-axis is per-formed to obtain maximum cross-correlation with theformer prototype waveform thus creating a relativelysmooth waveform surface when moving across thetime axis

(6) In order to create a surface that reflects the errorrsquos pitchcycle evolution over time an interpolation along thetime axis is performed

232 Construction of the intermediateresidual error signal

Let us(tφ) ut(tφ) t [0 minus TsTt]φ [0 minus 2π] be thePWI of the source and target speakers respectively As de-scribed earlier the new intermediate waveform surface willbe an interpolation of the two surfaces (Figure 5) Therefore

unew(tφ) = α middot us(β middot tφ) + (1minus α) middot utminusaligned(γ middot tφ)

unew(tφ) t [0minus Tnew] φ [0minus 2π]

(3)

where β = TnewTs γ = TnewTt and α1 is the relative part ofus(tφ)

1The factor may be invariant if so the voice produced will be an inter-mediate between the two speakers or it may vary in time (from α = 1 att = 0 to α = 0 at t = T) yielding a gradual change from one voice to theother

minus02minus01

00102

03

04

0506

60 5040

3020

100

+

0100

200300

400

Am

plit

ude

φt

minus04minus03minus02

minus01

0

010203

04

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

minus02

minus01

0

01

02

03

04

05

6050

4030

2010

0 0100

200300

400

Am

plit

ude

φt

Figure 5 Residual PWI surfaces of the two speakers (upper andintermediate figures respectively) and the resulting interpolatedsurface (bottom figure) It can be noticed that the interpolatedsurface combines characteristics of both speakers The φ-axis rep-resents samples of the normalized pitch period in a scale between 0and 2π

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 7: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

1180 EURASIP Journal on Applied Signal Processing

minus02

0

02

0 050

100

150200

φ

t

Am

plit

ude

Figure 6 The new PWI surface is constructed by interpolation oftwo PWI surfaces of the residual error signals of two speakers Thenew reconstructed residual can be recovered by moving on the sur-face along the phase axis according to the phase φ(t) determined bythe new pitch contour (The tracking is represented by the black linealong the surface)

In order to maximize the cross-correlation between thetwo surfaces the target speakerrsquos surface is shifted along theφ-axis and is referred to as utminusaligned(tφ) where

utminusaligned(tφ) = ut(tφ + φn

) (4)

φn = arg maxφe

int 2πφ=0

int Tnew

t=0 us(tφ) middot ut(tφ + φe

)dtdφ∥∥us∥∥ middot ∥∥ut(tφ + φe

)∥∥ (5)

u =radicradicradicradicint 2π

φ=0

int Tnew

t=0u(tφ) middot u(tφ)dtdφ (6)

where φe is the correction needed for the surfaces to bealigned

The last step (after creating unew(tφ)) is reconstruct-ing the new residual error signal from the waveform sur-face The reconstruction is performed by defining enew(t) =unew(tφnew(t)) for all t = [0 Tnew] where φnew(t) is createdby the following equation

φnew(t) =int t

t0

2πpnew(tprime)

dtprime (7)

pnew(t) is calculated as an average of the sourcersquos and targetrsquosshort-time pitch contour functions as shown in the follow-ing equation

pnew(t) = α middot ps(β middot t) + (1minus α) middot pt(γ middot t)

β = Tnew

Ts

γ = Tnew

Tt

(8)

where the factors β and γ are scaling factors for the new timeaxis [0Tnew] and α as in (3) is a weighting factor betweenthe pitch of the source and the pitch of the target Figure 6shows the derivation of a new residual error signal using thetrack on the PWI surface determined by φnew(t)

233 New vocal tract model calculation and synthesisIt is well known that the linear prediction parameters (iethe coefficients of the predictor polynomial A(z)) are highlysensitive to quantization [19] Therefore their quantizationor interpolation may result in an unstable filter and may pro-duce an undesirable signal However certain invertible non-linear transformations of the predictor coefficients result inequivalent sets of parameters that tolerate quantization or in-terpolation better An example of such a set of parametersis the PARCOR coefficients (ki) which are related to the ar-eas of lossless tube sections modeling the vocal tract [20] asgiven by the following equation

Ai+1 =(

1minus ki1 + ki

)middot Ai (9)

The value of the first area function parameter (A1) is arbi-trarily set to be 2 A new set of LPC parameters that defines anew vocal tract is computed using an interpolation of the twoarea vectors (source and target) This choice of the area pa-rameters seems to be more reasonable since intermediate vo-cal tract models should reflect intermediate dimensions [25]Let the source and target vocal tracts be modeled by N loss-less tube sections with areasAs

i Ati i [1minusN] respectively

The new signalrsquos vocal tract will be represented by

Anewi = α middot As

i + (1minus α) middot Ati foralli [1 2 N] (10)

After calculating the new areas the prediction filter iscomputed and the new vocal phoneme is synthesized accord-ing to the following scheme (Figure 7)

(1) compute new PARCOR parameters from the new areasby reversing (9)

(2) compute the coefficients of the new LPC model fromthe new PARCOR parameters

(3) filter the new excitation signal through the new vocal-tract filter to obtain the new vocal phoneme

When temporal voice morphing is applied informal subjec-tive listening tests performed on different sets of morphingparameters have revealed that in order to have a ldquolinearrdquo per-ceptual change between the voices of the source and the tar-get the coefficient α(t) (the relative part of us(tφ)) mustvary nonlinearly in time (like the one in Figure 8) When thecoefficient changed linearly with time the listeners perceivedan abrupt change from one identity to the other Using thenonlinear option that is gradually changing the identity ofone speaker to that of another a smooth modification of theldquosourcerdquo properties to those of the ldquotargetrdquo properties wasachieved In another subjective listening test (see below) per-formed on morphing from a womanrsquos voice to a manrsquos voiceand vice versa both uttering the same sentence the mor-phed sound was perceived as changing smoothly and natu-rally from one speaker to the other The quality of the mor-phed voices was found to depend upon the specific speakersthe differences between their voices and the content of theutterance Further research is required to accurately evaluatethe effect of these factors

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 8: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Voice Morphing Using 3D Waveform Interpolation Surfaces 1181

New phoneme

Synthesizenew speech

signal

Calculatenew LPC

parameters

Calculate newtube areas

Modificationparameter (α)

Source and targetpitch functions

Source and targeterror surface

Creat newerror surface

Recover newexcitation

Source and targetLPC parameters

Modificationparameter (α)

Calculateφnew(t)

EvaluatePnew(t)

Figure 7 A block diagram of the procedure for synthesizing the new phoneme

0010203040506070809

1

0 01 02 03 04 05 06 07 08 09 1

Act

ual

mor

phin

gco

effici

ent

Desired morphing coefficient

Figure 8 An example of a nonlinear morphing function obtainedempirically and used to achieve a linear perceptual change betweenthe first speaker and the second speaker when time-varying mor-phing was performed

24 Evaluation of the algorithm

Evaluation of the algorithm was carried out using three sub-jective listening tests The naturalness intelligibility identitychange and smoothness of the morphing algorithm wereexamined in these psychoacoustic tests In all tests six lis-teners participated (three males and three females all with-out hearing impairments and inexperienced in speech mor-phing) The sentences for the tests were taken from TIMITdatabase and from BGU Hebrew database The speech stim-uli were played using high-quality loudspeakers In all tests

the stimuli were played in random order The listeners werefree to play each stimulus more than once without any limi-tation In the first test the listeners had to decide if there wasa change in the identity of the speaker for each of the threesentences on a 1ndash5 scale where 1 meant that one identitywas perceived along the utterance and 5 meant that there wasmore than one speaker The listeners had to rate the smooth-ness of the utterance as well on a similar 1ndash5 scale where5 meant smooth utterance and 1 meant abrupt change orchanges in the utterance Three types of stimuli were usedthe original speech morphed speech (in which the identitychanged gradually from one speaker to another during thesentence) and concatenated speech of two speakers at a fixedpoint [1] The aim of this test was to evaluate if the iden-tity change was perceived and to assess the effect of the algo-rithm on the smoothness of the gradual morphed utterancesThe results of this experiment for one of the sentences (ldquoWewere different shades and it did not make a bit of differenceamong usrdquo taken from [9]) are depicted in Figure 9 As ex-pected it is readily seen that the original utterance was per-ceived as a smooth speech without identity change while theabrupt change in identity in the concatenated utterance wasperceived and rated accordingly that is as more than oneidentity and with an abrupt change The change of identitywas perceived in the morphed signal as well (average score35 std = 096) since the sentence was long enough to no-tice the modification Nevertheless it was also perceived asrelatively smooth (average score 35 std = 112) The other

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 9: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

1182 EURASIP Journal on Applied Signal Processing

s3-o

rigi

nal

t3-o

rigi

nal

s3-g

radu

alm

orph

ing

t3-g

radu

alm

orph

ing

s3-

con

cate

nat

ed

t3-

con

cate

nat

ed

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Stimuli

Figure 9 Evaluation of smoothness and identity change in an ut-terance in which a gradual morphing is produced (smoothnessmdashleft column identity changemdashright column)

two utterances yielded similar results In the second test thelisteners had to evaluate the naturalness and intelligibilityof a given sentence on a 1ndash5 scale The listeners attendedto 3 stimuli two original sentences one uttered by a malespeaker and the other by a female speaker and a morphedsentence with a morphing factor of 05 which is a hybrid sig-nal between the two originals The results are summarized inFigure 10 It is clear from this figure that the morphed sig-nal was perceived as highly natural and clear by most of thelisteners (naturalness average score 43 std = 074 intelli-gibility average score 417 std = 069) In the third exper-iment the listeners had to rate the identity change and thenaturalness of two sentences on a 1ndash5 scale Each sentencewas repeated as a cyclostationary morph (see [3]) 16 timeswith various morphing factors from 0 to 1 The average nat-uralness was 325 and the identity change was rated with anaverage of 33 Naturalness in this case was not perfect (al-though it is quite high) and can be explained by the fact thatthe voice timbers in this test were distant and the hybridvoice that was produced could be perceived as uncommonIn summary the results of the tests show that the morphedsignals are perceived in most cases as highly natural highlyintelligible with a relatively smooth change from one speechvoice to another

3 CONCLUSIONS

In this study a new speech morphing algorithm is presentedThe aim is to produce natural sounding hybrid voices be-tween two speakers uttering the same content The algo-rithm is based on representing the residual error signal as aPWI surface and the vocal tract as a lossless tube area func-tion The PWI surface incorporates the characteristics of theexcitation signal and enables reproduction of a residual sig-nal with a given pitch contour and time duration which in-cludes the dynamics of both speakersrsquo excitations It is known

1000

1500

2000

2500

3000

3500

4000

4500

5000

Ave

rage

scor

e

Man Woman Morphing 05

Stimuli

Figure 10 Results of subjective evaluation for naturalness (leftbars) and intelligibility (right bars) for 3 stimuli a man (left) anda woman (middle) voices and a morphing between the two voices(right)

[16] that PWI surfaces can be exploited efficiently for speechcoding and therefore they allow for higher compression ofthe speech database The area function was used in an at-tempt to reflect an intermediate configuration of the vocaltract between the two speakers [25] The utterances producedby the algorithm were shown to be of high quality and to con-sist of intermediate features of the two speakers

There are at least two modes in which the morphing algo-rithm can be used In the first mode the morphing parameteris invariant meaning for example taking a factor of 05 andreceiving a morphed signal with characteristics which are be-tween the two voices for the whole duration of the articu-lation In the second mode we start from the first (source)speaker (morphing factor = 0) and the morphing factor ischanged gradually along the duration of the sentence so itsvalue is 1 at the end of the sentence The same morphingfactor was used for both the excitation and the vocal tractparameters

In this time-varying version of the algorithm that iswhen morphing gradually from one voice to another overtime smooth morphing was achieved producing a highlynatural transition between the source and the target speakersThis was assessed by subjective evaluation tests as previouslydescribed

The algorithm described here is more capable of pro-ducing longer and more smooth and natural sounding ut-terances than previous studies [1 3] One of the advantagesof the proposed algorithm is that in addition to the inter-polation of the vocal tract features interpolation is also per-formed between the two PWI surfaces of the correspondingresidual signals and thus captures the evolution of the excita-tion signals of both speakers In this way a hybrid excitationsignal can be produced that contains intermediate character-istics of both excitations Thus a more natural morphing be-tween utterances can be achieved as has been demonstratedFurthermore our algorithm performs the interpolation of

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 10: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Voice Morphing Using 3D Waveform Interpolation Surfaces 1183

the residual signal regardless of the pitch information sincethe pitch data is normalized within the PWI surface There-fore the morphed pitch contour is extracted independentlyand can be manipulated separately In addition the currentapproach enables morphing between utterances with differ-ent pitches between male and female voices or betweenvoices of different and perceptually distant timbres Kawa-hara and his colleagues [26 27] implemented a morphingsystem based on interpolation between time-frequency rep-resentations of the source and the target signals It appearsthat the STRAIGHT-based morphing system (see [2 26])was able to produce intermediate voices of higher clarity thanour algorithm but the need to assign multiple anchor pointsfor each short segment using visual inspection and phono-logical knowledge is a noticeable disadvantage of that sys-tem which can make it difficult to use for morphing betweenlong utterances

Further research is needed to improve the quality of themorphed signals which are natural sounding (see httpspltelhaiacilspeech) but are somewhat degraded com-pared to the originals

ACKNOWLEDGMENTS

We would like to thank Professor David Malah the Head ofSIPL for proposing to apply PWI to voice morphing andfor his valuable discussions and comments We also thankDima Ruinskiy for his valuable assistance for programmingpart of the morphing system and for preparing the semiau-tomatic segmentation and Yefim Yakir for preparing part ofthe figures and for technical support The authors thank thereviewers for their helpful comments This study was partlysupported by Guastella Fellowship of the Sacta-Rashi Foun-dation and the JAFI project

REFERENCES

[1] M Abe ldquoSpeech morphing by gradually changing spectrumparameter and fundamental frequencyrdquo Tech Rep SP96-40The Institute of Electronics Information and Communica-tion Engineers (IEICE) July 1996

[2] H Kawahara and H Matsui ldquoAuditory morphing based onan elastic perceptual distance metric in an interference-freetime-frequency representationrdquo in Proc IEEE InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo03) vol 1 pp 256ndash259 Hong Kong China 2003

[3] M Slaney M Covell and B Lassiter ldquoAutomatic audio mor-phingrdquo in Proc IEEE International Conference on AcousticsSpeech and Signal Processing (ICASSP rsquo96) vol 2 pp 1001ndash1004 Atlanta GA USA May 1996

[4] P Cano A Loscos J Bonada M de Boer and X Serra ldquoVoicemorphing system for impersonating in karaoke applicationsrdquoin Proc International Computer Music Conference (ICMA rsquo00)pp 109ndash112 Berlin Germany August 2000

[5] E Tellman L Haken and B Holloway ldquoTimbre morphingof sounds with unequal numbers of featuresrdquo Journal of theAudio Engineering Society (AES) vol 43 no 9 pp 678ndash6891995

[6] D Rossum ldquoThe ldquoarmadillordquo coefficient encoding scheme fordigital audio filtersrdquo in proc IEEE Workshop on Applicationsof Signal Processing to Audio and Acoustics (WASPAA rsquo91) pp129ndash130 New Paltz NY USA October 1991

[7] Y Stylianou O Cappe and E Moulines ldquoContinuous prob-abilistic transform for voice conversionrdquo IEEE Trans SpeechAudio Processing vol 6 no 2 pp 131ndash142 1998

[8] N Iwahashi and Y Sagisaka ldquoSpeech spectrum conversionbased on speaker interpolation and multi-functional repre-sentation with weighting by radial basis function networksrdquoSpeech Communication vol 16 no 2 pp 139ndash151 1995

[9] L M Arslan ldquoSpeaker Transformation Algorithm using Seg-mental Codebooks (STASC)rdquo Speech Communication vol 28no 3 pp 211ndash226 1999

[10] A Kain and M Macon ldquoSpectral voice conversion for text-to-speech synthesisrdquo in Proc IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo98) vol 1pp 285ndash288 Seattle WA USA May 1998

[11] A Kain and M Macon ldquoPersonalizing a speech synthesizerby voice adaptationrdquo in Proc 3rd ESCACOCOSDA Interna-tional Speech Synthesis Workshop pp 225ndash230 Jenolan CavesAustralia November 1998

[12] A Kain and M Macon ldquoDesign and evaluation of a voice con-version algorithm based on spectral envelope mapping andresidual predictionrdquo in Proc IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP rsquo01) vol 2Salt Lake City Utah USA May 2001

[13] M Abe S Nakamura K Shikano and H Kuwabara ldquoVoiceconversion through vector quantizationrdquo in Proc IEEE In-ternational Conference on Acoustics Speech and Signal Pro-cessing (ICASSP rsquo88) pp 655ndash658 New York NY USA April1988

[14] M Narendranath H A Murthy S Rajendran and B Yegna-narayana ldquoTransformation of formants for voice conversionusing artificial neural networksrdquo Speech Communication vol16 no 2 pp 206ndash216 1995

[15] O Lev (Fellah) and D Malah ldquoLow bit-rate speech coderbased on a long-term modelrdquo in Proc 22nd IEEE Conventionof Electrical and Electronics Engineers in Israel Tel-Aviv IsraelDecember 2002

[16] W B Kleijn and J Haagen ldquoWaveform interpolationfor cod-ing and synthesisrdquo in Speech Coding and Synthesis W BKleijn and K K Paliwal Eds chapter 5 pp 175ndash207 Else-vier Science B V Amsterdam the Netherlands 1995

[17] W B Kleijn ldquoEncoding speech using prototype waveformsrdquoIEEE Trans Speech Audio Processing vol 1 no 4 pp 386ndash3991993

[18] W B Kleijn and J Haagen ldquoTransformation and decompo-sition of the speech signal for codingrdquo IEEE Signal ProcessingLett vol 1 no 9 pp 136ndash138 1994

[19] J R Deller J G Proakis and J H L Hansen Discrete TimeProcessing of Speech Signals chapter 7 Macmillan PublishingNew York NY USA 1993

[20] L R Rabiner and R W Schafer Digital processing of speechsignals chapter 8 Signal Processing Series Prentice-Hall Pub-lishing Englewood Cliffs London 1978

[21] J Makhoul ldquoLinear prediction A tutorial reviewrdquo Proc IEEEvol 63 no 4 pp 561ndash580 1975

[22] H Kuwabara and Y Sagisaka ldquoAcoustic characteristics ofspeaker individuality Control and conversionrdquo Speech Com-munication vol 16 no 2 pp 165ndash173 1995

[23] A M Noll ldquoCepstrum pitch determinationrdquo Journal of theAcoustical Society of America vol 41 no 2 pp 293ndash309 1967

[24] Y Medan E Yair and D Chazan ldquoSuper resolution pitchdetermination of speech signalsrdquo IEEE Trans Acoust SpeechSignal Processing vol 39 no 1 pp 40ndash48 1991

[25] H Wakita ldquoDirect estimation of the vocal tract shape by in-verse filtering of the acoustic speech waveformsrdquo IEEE Trans-action Audio Electroacoustic vol AV-21 no 5 pp 417ndash4271973

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 11: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

1184 EURASIP Journal on Applied Signal Processing

[26] H Kawahara I Masuda-Katsuse and A de Cheveigne ldquoRe-structuring speech representations using a pitch-adaptivetime-frequency smoothing and an instantaneous-frequency-based F0 extraction Possible role of a repetitive structure insoundsrdquo Speech Communication vol 27 no 3-4 pp 187ndash207 1999

[27] H Kawahara H Katayose A de Cheveigne and R D Pat-terson ldquoFixed point analysis of frequency to instantaneousfrequency mapping for accurate estimation of f0 and period-icityrdquo in Proc European Union Activities in Human LanguageTechnologies (EUROSPEECH rsquo99) vol 6 pp 2781ndash2784 Bu-dapest Hungary September 1999

Yizhar Lavner received a PhD degree fromthe Technion ndash Israel Institute of Technol-ogy in 1997 He joined the Department ofcomputer science Tel-Hai Academic Col-lege Upper Galilee Israel in 1997 where heis a Senior Lecturer He is teaching in SIPL(Signal and Image Processing Lab) Elec-trical Engineering Faculty the TechnionHaifa Israel) since 1998 His research in-terests include speech and audio signal pro-cessing and genomic signal processing

Gidon Porat received his BS degree (cumlaude) in electrical engineering from theTechnion ndash Israel Institute of Technologyin January 2003 As a part of his grad-uation requirements Porat has researchedvoice morphing in the Electrical Engineer-ing Facultyrsquos Signal and Image ProcessingLab in the Technion Porat was a recipientof a Best Studentrsquos Paper Award at the IEEE22nd Convention of Electrical amp Electron-ics Engineers Israel in December 2002 Since his graduation Po-rat has been employed by Intel Corporation as a member of theMobile Platform Group Chip Design Team Porat takes part in thedevelopment of high-speed low-power circuit designs for the next-generation CPUs for mobile computers

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 12: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Transforming Signal Processing Applications intoParallel Implementations

Call for PapersThere is an increasing need to develop efficient ldquosystem-levelrdquo models methods and tools to support designers toquickly transform signal processing application specificationto heterogeneous hardware and software architectures suchas arrays of DSPs heterogeneous platforms involving mi-croprocessors DSPs and FPGAs and other evolving multi-processor SoC architectures Typically the design process in-volves aspects of application and architecture modeling aswell as transformations to translate the application modelsto architecture models for subsequent performance analysisand design space exploration Accurate predictions are indis-pensable because next generation signal processing applica-tions for example audio video and array signal processingimpose high throughput real-time and energy constraintsthat can no longer be served by a single DSP

There are a number of key issues in transforming applica-tion models into parallel implementations that are not ad-dressed in current approaches These are engineering theapplication specification transforming application specifi-cation or representation of the architecture specification aswell as communication models such as data transfer and syn-chronization primitives in both models

The purpose of this call for papers is to address approachesthat include application transformations in the performanceanalysis and design space exploration efforts when takingsignal processing applications to concurrent and parallel im-plementations The Guest Editors are soliciting contributionsin joint application and architecture space exploration thatoutperform the current architecture-only design space ex-ploration methods and tools

Topics of interest for this special issue include but are notlimited to

bull modeling applications in terms of (abstract)control-dataflow graph dataflow graph and processnetwork models of computation (MoC)

bull transforming application models or algorithmicengineering

bull transforming application MoCs to architecture MoCsbull joint application and architecture space exploration

bull joint application and architecture performanceanalysis

bull extending the concept of algorithmic engineering toarchitecture engineering

bull design cases and applications mapped onmultiprocessor homogeneous or heterogeneousSOCs showing joint optimization of application andarchitecture

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

F Deprettre Leiden Embedded Research Center LeidenUniversity Niels Bohrweg 1 2333 CA Leiden TheNetherlands eddliacsnl

Roger Woods School of Electrical and ElectronicEngineering Queens University of Belfast Ashby BuildingStranmillis Road Belfast BT9 5AH UK rwoodsqubacuk

Ingrid Verbauwhede Katholieke Universiteit LeuvenESAT-COSIC Kasteelpark Arenberg 10 3001 LeuvenBelgium Ingridverbauwhedeesatkuleuvenbe

Erwin de Kock Philips Research High Tech Campus 315656 AE Eindhoven The Netherlandserwindekockphilipscom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 13: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Video Adaptation for Heterogeneous Environments

Call for PapersThe explosive growth of compressed video streams andrepositories accessible worldwide the recent addition of newvideo-related standards such as H264AVC MPEG-7 andMPEG-21 and the ever-increasing prevalence of heteroge-neous video-enabled terminals such as computer TV mo-bile phones and personal digital assistants have escalated theneed for efficient and effective techniques for adapting com-pressed videos to better suit the different capabilities con-straints and requirements of various transmission networksapplications and end users For instance Universal Multime-dia Access (UMA) advocates the provision and adaptation ofthe same multimedia content for different networks termi-nals and user preferences

Video adaptation is an emerging field that offers a richbody of knowledge and techniques for handling the hugevariation of resource constraints (eg bandwidth display ca-pability processing speed and power consumption) and thelarge diversity of user tasks in pervasive media applicationsConsiderable amounts of research and development activi-ties in industry and academia have been devoted to answer-ing the many challenges in making better use of video con-tent across systems and applications of various kinds

Video adaptation may apply to individual or multiplevideo streams and may call for different means depending onthe objectives and requirements of adaptation Transcodingtransmoding (cross-modality transcoding) scalable contentrepresentation content abstraction and summarization arepopular means for video adaptation In addition video con-tent analysis and understanding including low-level featureanalysis and high-level semantics understanding play an im-portant role in video adaptation as essential video contentcan be better preserved

The aim of this special issue is to present state-of-the-art developments in this flourishing and important researchfield Contributions in theoretical study architecture designperformance analysis complexity reduction and real-worldapplications are all welcome

Topics of interest include (but are not limited to)

bull Heterogeneous video transcodingbull Scalable video codingbull Dynamic bitstream switching for video adaptation

bull Signal structural and semantic-level videoadaptation

bull Content analysis and understanding for videoadaptation

bull Video summarization and abstractionbull Copyright protection for video adaptationbull Crossmedia techniques for video adaptationbull Testing field trials and applications of video

adaptation servicesbull International standard activities for video adaptation

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Chia-Wen Lin Department of Computer Science andInformation Engineering National Chung ChengUniversity Chiayi 621 Taiwan cwlincsccuedutw

Yap-Peng Tan School of Electrical and ElectronicEngineering Nanyang Technological University NanyangAvenue Singapore 639798 Singapore eyptanntuedusg

Ming-Ting Sun Department of Electrical EngineeringUniversity of Washington Seattle WA 98195 USA suneewashingtonedu

Alex Kot School of Electrical and Electronic EngineeringNanyang Technological University Nanyang AvenueSingapore 639798 Singapore eackotntuedusg

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 14: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

Anthony Vetro Mitsubishi Electric Research Laboratories201 Broadway 8th Floor Cambridge MA 02138 USAavetromerlcom

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 15: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Knowledge-Assisted Media Analysis for InteractiveMultimedia Applications

Call for PapersIt is broadly acknowledged that the development of enablingtechnologies for new forms of interactive multimedia ser-vices requires a targeted confluence of knowledge semanticsand low-level media processing The convergence of these ar-eas is key to many applications including interactive TV net-worked medical imaging vision-based surveillance and mul-timedia visualization navigation search and retrieval Thelatter is a crucial application since the exponential growthof audiovisual data along with the critical lack of tools torecord the data in a well-structured form is rendering uselessvast portions of available content To overcome this problemthere is need for technology that is able to produce accuratelevels of abstraction in order to annotate and retrieve con-tent using queries that are natural to humans Such technol-ogy will help narrow the gap between low-level features orcontent descriptors that can be computed automatically andthe richness and subjectivity of semantics in user queries andhigh-level human interpretations of audiovisual media

This special issue focuses on truly integrative research tar-geting of what can be disparate disciplines including imageprocessing knowledge engineering information retrievalsemantic analysis and artificial intelligence High-qualityand novel contributions addressing theoretical and practicalaspects are solicited Specifically the following topics are ofinterest

bull Semantics-based multimedia analysisbull Context-based multimedia miningbull Intelligent exploitation of user relevance feedbackbull Knowledge acquisition from multimedia contentsbull Semantics based interaction with multimediabull Integration of multimedia processing and Semantic

Web technologies to enable automatic content shar-ing processing and interpretation

bull Content user and network aware media engineeringbull Multimodal techniques high-dimensionality reduc-

tion and low level feature fusion

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification January 15 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Ebroul Izquierdo Department of Electronic EngineeringQueen Mary University of London Mile End Road LondonE1 4NS United Kingdom ebroulizquierdoelecqmulacuk

Hyoung Joong Kim Department of Control and Instru-mentation Engineering Kangwon National University 192 1Hyoja2 Dong Kangwon Do 200 701 Koreakhjkangwonackr

Thomas Sikora Communication Systems Group TechnicalUniversity Berlin Einstein Ufer 17 10587 Berlin Germanysikoranuetu-berlinde

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 16: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Super-resolution Enhancement of Digital Video

Call for PapersWhen designing a system for image acquisition there is gen-erally a desire for high spatial resolution and a wide field-of-view To achieve this a camera system must typically em-ploy small f-number optics This produces an image withvery high spatial-frequency bandwidth at the focal plane Toavoid aliasing caused by undersampling the correspondingfocal plane array (FPA) must be sufficiently dense Howevercost and fabrication complexities may make this impracticalMore fundamentally smaller detectors capture fewer pho-tons which can lead to potentially severe noise levels in theacquired imagery Considering these factors one may chooseto accept a certain level of undersampling or to sacrifice someoptical resolution andor field-of-view

In image super-resolution (SR) postprocessing is used toobtain images with resolutions that go beyond the conven-tional limits of the uncompensated imaging system In somesystems the primary limiting factor is the optical resolutionof the image in the focal plane as defined by the cut-off fre-quency of the optics We use the term ldquooptical SRrdquo to re-fer to SR methods that aim to create an image with validspatial-frequency content that goes beyond the cut-off fre-quency of the optics Such techniques typically must rely onextensive a priori information In other image acquisitionsystems the limiting factor may be the density of the FPAsubsequent postprocessing requirements or transmission bi-trate constraints that require data compression We refer tothe process of overcoming the limitations of the FPA in orderto obtain the full resolution afforded by the selected optics asldquodetector SRrdquo Note that some methods may seek to performboth optical and detector SR

Detector SR algorithms generally process a set of low-resolution aliased frames from a video sequence to producea high-resolution frame When subpixel relative motion ispresent between the objects in the scene and the detector ar-ray a unique set of scene samples are acquired for each frameThis provides the mechanism for effectively increasing thespatial sampling rate of the imaging system without reduc-ing the physical size of the detectors

With increasing interest in surveillance and the prolifera-tion of digital imaging and video SR has become a rapidlygrowing field Recent advances in SR include innovative al-gorithms generalized methods real-time implementations

and novel applications The purpose of this special issue isto present leading research and development in the area ofsuper-resolution for digital video Topics of interest for thisspecial issue include but are not limited to

bull Detector and optical SR algorithms for videobull Real-time or near-real-time SR implementationsbull Innovative color SR processingbull Novel SR applications such as improved object

detection recognition and trackingbull Super-resolution from compressed videobull Subpixel image registration and optical flow

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due September 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due April 15 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Russell C Hardie Department of Electrical and ComputerEngineering University of Dayton 300 College ParkDayton OH 45469-0026 USA rhardieudaytonedu

Richard R Schultz Department of Electrical EngineeringUniversity of North Dakota Upson II Room 160 PO Box7165 Grand Forks ND 58202-7165 USARichardSchultzmailundnodakedu

Kenneth E Barner Department of Electrical andComputer Engineering University of Delaware 140 EvansHall Newark DE 19716-3130 USA barnereeudeledu

Hindawi Publishing Corporationhttpwwwhindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 17: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Advanced Signal Processing and ComputationalIntelligence Techniques for Power Line Communications

Call for PapersIn recent years increased demand for fast Internet access andnew multimedia services the development of new and fea-sible signal processing techniques associated with faster andlow-cost digital signal processors as well as the deregulationof the telecommunications market have placed major em-phasis on the value of investigating hostile media such aspowerline (PL) channels for high-rate data transmissions

Nowadays some companies are offering powerline com-munications (PLC) modems with mean and peak bit-ratesaround 100 Mbps and 200 Mbps respectively Howeveradvanced broadband powerline communications (BPLC)modems will surpass this performance For accomplishing itsome special schemes or solutions for coping with the follow-ing issues should be addressed (i) considerable differencesbetween powerline network topologies (ii) hostile propertiesof PL channels such as attenuation proportional to high fre-quencies and long distances high-power impulse noise oc-currences time-varying behavior and strong inter-symbolinterference (ISI) effects (iv) electromagnetic compatibilitywith other well-established communication systems work-ing in the same spectrum (v) climatic conditions in differ-ent parts of the world (vii) reliability and QoS guarantee forvideo and voice transmissions and (vi) different demandsand needs from developed developing and poor countries

These issues can lead to exciting research frontiers withvery promising results if signal processing digital commu-nication and computational intelligence techniques are ef-fectively and efficiently combined

The goal of this special issue is to introduce signal process-ing digital communication and computational intelligencetools either individually or in combined form for advancingreliable and powerful future generations of powerline com-munication solutions that can be suited with for applicationsin developed developing and poor countries

Topics of interest include (but are not limited to)

bull Multicarrier spread spectrum and single carrier tech-niques

bull Channel modeling

bull Channel coding and equalization techniquesbull Multiuser detection and multiple access techniquesbull Synchronization techniquesbull Impulse noise cancellation techniquesbull FPGA ASIC and DSP implementation issues of PLC

modemsbull Error resilience error concealment and Joint source-

channel design methods for video transmissionthrough PL channels

Authors should follow the EURASIP JASP manuscript for-mat described at the journal site httpasphindawicomProspective authors should submit an electronic copy of theircomplete manuscripts through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification January 1 2007

Final Manuscript Due April 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Moiseacutes Vidal Ribeiro Federal University of Juiz de ForaBrazil mribeiroieeeorg

Lutz Lampe University of British Columbia Canadalampeeceubcca

Sanjit K Mitra University of California Santa BarbaraUSA mitraeceucsbedu

Klaus Dostert University of Karlsruhe Germanyklausdostertetecuni-karlsruhede

Halid Hrasnica Dresden University of Technology Ger-many hrasnicaifnettu-dresdende

Hindawi Publishing Corporationhttpasphindawicom

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom

Page 18: Voice Morphing using 3D Waveform Interpolation …spl.telhai.ac.il/speech/pub/S1110865705502178[1].pdf · Voice Morphing Using 3D Waveform Interpolation Surfaces 1175 Several studies

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Special Issue on

Numerical Linear Algebra in Signal ProcessingApplications

Call for PapersThe cross-fertilization between numerical linear algebra anddigital signal processing has been very fruitful in the lastdecades The interaction between them has been growingleading to many new algorithms

Numerical linear algebra tools such as eigenvalue and sin-gular value decomposition and their higher-extension leastsquares total least squares recursive least squares regulariza-tion orthogonality and projections are the kernels of pow-erful and numerically robust algorithms

The goal of this special issue is to present new efficient andreliable numerical linear algebra tools for signal processingapplications Areas and topics of interest for this special issueinclude (but are not limited to)

bull Singular value and eigenvalue decompositions in-cluding applications

bull Fourier Toeplitz Cauchy Vandermonde and semi-separable matrices including special algorithms andarchitectures

bull Recursive least squares in digital signal processingbull Updating and downdating techniques in linear alge-

bra and signal processingbull Stability and sensitivity analysis of special recursive

least-squares problemsbull Numerical linear algebra in

bull Biomedical signal processing applicationsbull Adaptive filtersbull Remote sensingbull Acoustic echo cancellationbull Blind signal separation and multiuser detectionbull Multidimensional harmonic retrieval and direc-

tion-of-arrival estimationbull Applications in wireless communicationsbull Applications in pattern analysis and statistical

modelingbull Sensor array processing

Authors should follow the EURASIP JASP manuscriptformat described at httpwwwhindawicomjournalsaspProspective authors should submit an electronic copy oftheir complete manuscript through the EURASIP JASP man-uscript tracking system at httpwwwhindawicommts ac-cording to the following timetable

Manuscript Due October 1 2006

Acceptance Notification February 1 2007

Final Manuscript Due May 1 2007

Publication Date 3rd Quarter 2007

GUEST EDITORS

Shivkumar Chandrasekaran Department of Electricaland Computer Engineering University of California SantaBarbara USA shiveceucsbedu

Gene H Golub Department of Computer Science StanfordUniversity USA golubsccmstanfordedu

Nicola Mastronardi Istituto per le Applicazioni del CalcololdquoMauro Piconerdquo Consiglio Nazionale delle Ricerche BariItaly nmastronardibaiaccnrit

Marc Moonen Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiummarcmoonenesatkuleuvenbe

Paul Van Dooren Department of Mathematical Engineer-ing Catholic University of Louvain Belgiumvdoorencsamuclacbe

Sabine Van Huffel Department of Electrical EngineeringKatholieke Universiteit Leuven Belgiumsabinevanhuffelesatkuleuvenbe

Hindawi Publishing Corporationhttpwwwhindawicom