IIT Bombay ISTE, IITB, Mumbai, 28 March, 2003: Speech Synthesis, PC Pandey, EE Dept, IIT Bombay, March '03


IIT Bombay ISTE, IITB, Mumbai, 28 March
SPEECH SYNTHESIS
PC Pandey, EE Dept, IIT Bombay, March '03

Speech units
  Sentences & phrases
  Words
  Syllables
  Phonemes
  Subphonemic acoustic segments

Speech features
  Prosodic (suprasegmental) features
    * Intensity variation
    * Pitch variation
  Phonemic features
    * Articulatory
    * Acoustic
    * Perceptual

Classification of phonemes
  Vowels
    * Pure vowels
    * Diphthongs
  Consonants
    * Semivowels
    * Whisper
    * Stops
    * Nasals
    * Fricatives
    * Affricates

Speech production system

Schematic of speech production

Vowel spectrum

Speech synthesis
  Generation of speech by a machine
  Applications
    Voice response systems (limited vocabulary)
    Text-to-speech synthesis (unlimited vocabulary)
    Analysis-by-synthesis (speech research)
    Generation of speech-like test signals
    Analysis-synthesis systems
      * channel capacity reduction
      * secure communication
      * speech enhancement
      * voice transformation
      * processing for hearing aids

Development of speech synthesizers
  Mechanical / electro-mechanical ( )
  Electronic analog with keyboard input (1930s)
  Electronic analog analysis-synthesis systems ( )
  Digital synthesizer (1950..)
    * software based
    * hardware based

Mechanical synthesizers
  Von Kempelen, 1780
  Wheatstone's speaking machine

Riesz, 1930s: Speaking machine

Dudley, 1930s: Voder
  Electronic analog synthesizer with mechanical keyboard

Fant, 1950s: OVE

Holmes, 1960s: Parallel formant synthesizer

Klatt, 1970s: Cascade/parallel formant synthesizer
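The cascade formant synthesizer named on the Klatt slide can be illustrated with a minimal sketch: an impulse train standing in for the voiced source is passed through second-order digital resonators connected in series, one per formant. This is not taken from the slides. The function names, the sampling rate, and the formant frequencies and bandwidths are illustrative assumptions, and a real synthesizer would add a glottal pulse shape, a radiation characteristic, and time-varying parameters.

```python
import math

def resonator(signal, freq_hz, bw_hz, fs):
    """Second-order digital resonator modelling one formant.

    y[n] = a*x[n] + b*y[n-1] + c*y[n-2], with coefficients chosen so that
    the gain at 0 Hz is unity.
    """
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def impulse_train(f0_hz, duration_s, fs):
    """Crude voiced source: one unit impulse per pitch period."""
    n_samples = int(round(duration_s * fs))
    period = int(round(fs / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def cascade_vowel(formants, f0_hz=120.0, duration_s=0.2, fs=8000):
    """Pass the source through the formant resonators in series (cascade)."""
    signal = impulse_train(f0_hz, duration_s, fs)
    for freq_hz, bw_hz in formants:
        signal = resonator(signal, freq_hz, bw_hz, fs)
    return signal

# Assumed (frequency, bandwidth) pairs in Hz for an /a/-like vowel;
# these values are illustrative and do not come from the slides.
vowel = cascade_vowel([(700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)])
```

In a parallel configuration the same resonators would each receive the source directly and their outputs would be summed with individual gains, which is the arrangement named on the Holmes slide.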
Modern synthesis approaches
  Waveform based
    * high quality natural output
    * limited vocabulary
    * large storage requirement
  Speech model based
    * unlimited speech synthesis with small storage
    * difficulty in parameter generation & concatenation
  Text-to-speech synthesis
    * Text pre-processing & phonetic transcription
    * Parsing for syntactic & semantic structure
    * Prosodic information & sound units
    * Speech waveform generation

Speech model based approaches
  Articulatory
  Source-filter
    * channel vocoder
    * LPC vocoder
    * homomorphic vocoder
    * formant-based synthesizer
  Acoustic
    * phase vocoder
    * sinusoidal model
    * harmonic plus noise model (HNM)

HARMONIC PLUS NOISE MODEL (Stylianou, 1995; 2001)
  Speech signal divided into:
    * harmonic part
    * noise part
  Harmonic part
  Noise part
  Parameters:
    * harmonic amplitudes and phases
    * max. voiced frequency
    * V/UV & pitch
    * noise parameters

IMPLEMENTATION OF HNM

ANALYSIS

SYNTHESIS

SEGMENT CONCATENATION
  For generation of longer units from smaller ones.
  Steps:
    1) Parsing of phonetic transcript
    2) Fetching the parameters of required units
    3) Pitch and intensity modifications for prosody
    4) Smoothing of the parameter tracks at unit boundaries
    5) Interpolation of the parameters over the frame length from end point values
    6) Synthesis

RESULTS
  All VCV syllables and vowels are natural & intelligible if synthesized using the harmonic part only, except / aa / and / asa /
  HNM preserves the styles (anger, high articulatory rate)
  Synthesized /aa/
  Synthesized /asa/

RESULTS (continued)
  GCIs from glottal signal give better synthesis.
  Pitch contours for "/ ap khn ja rhE hn /"
    * From glottal signal
    * From speech (Childers and Hu, 1994)

RESULTS (continued)
  Good quality of the larger units constructed from parameters of the smaller units.
  Recorded /b h Imani/
  Synthesized from /b h I/, /Ima/, /ani/

DEMONSTRATIONS
  Cardinal Vowels
    /a/  R HN H
    /I/  R HN H
    /u/  R HN H
  R HN H   R HN H
  Stops: /a k a /, /a g a/, /a k h a/, /a g h a/
  Fricatives: /a a /, /a s a/
  Affricates: /a t a /, /a d z a/, /a t h a /, /a dz h a/
  Word: / b h Imani /  R HN
  Sentence: / mn dzmmu ja rha hun /  R HN

Further developments
  High quality multilingual / multi-dialect text-to-speech synthesis
  Voice transformations
  Processing for aids for the hearing impaired
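The harmonic plus noise decomposition listed on the HNM slides above can be illustrated with a minimal single-frame sketch. The harmonic part is a sum of sinusoids at multiples of the pitch, kept only up to the maximum voiced frequency, and the noise part is added to it. This is a sketch under stated assumptions, not the implementation from the talk: all function and parameter names are hypothetical, and the noise part here is white Gaussian noise with a first-difference high-pass as a crude stand-in for the filtered, energy-shaped noise of the full model.

```python
import math
import random

def harmonic_part(f0_hz, amps, phases, fmax_voiced_hz, n_samples, fs):
    """Sum of harmonics of f0, kept only up to the maximum voiced frequency."""
    out = [0.0] * n_samples
    for k, (amp, phase) in enumerate(zip(amps, phases), start=1):
        if k * f0_hz > fmax_voiced_hz:
            break  # harmonics above the maximum voiced frequency are dropped
        w = 2.0 * math.pi * k * f0_hz / fs
        for n in range(n_samples):
            out[n] += amp * math.cos(w * n + phase)
    return out

def noise_part(gain, n_samples, seed=0):
    """Assumed stand-in for the noise part: scaled, high-passed white noise."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # First difference acts as a simple high-pass filter (an assumption here).
    return [gain * (white[n] - (white[n - 1] if n > 0 else 0.0))
            for n in range(n_samples)]

def synthesize_frame(f0_hz, amps, phases, fmax_voiced_hz, voiced,
                     noise_gain, n_samples, fs):
    """One frame: harmonic part (voiced frames only) plus noise part."""
    if voiced:
        harmonic = harmonic_part(f0_hz, amps, phases, fmax_voiced_hz,
                                 n_samples, fs)
    else:
        harmonic = [0.0] * n_samples
    noise = noise_part(noise_gain, n_samples)
    return [h + r for h, r in zip(harmonic, noise)]

# Illustrative parameter values (assumed, not from the slides).
frame = synthesize_frame(f0_hz=100.0, amps=[1.0, 0.5, 0.25],
                         phases=[0.0, 0.0, 0.0], fmax_voiced_hz=250.0,
                         voiced=True, noise_gain=0.05, n_samples=80, fs=8000)
```

Setting `noise_gain` to zero gives the harmonic-only output, which is the condition the results slides report as natural and intelligible for most VCV syllables and vowels.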