"SpeD 2007", May 10-12, 2007, Iaşi, ROMANIA

TEXT-TO-SPEECH ENGINES AS TELECOM SERVICE ENABLERS

Mihai Surmei *, Dragoş Burileanu **, Cristian Negrescu **, Răzvan Pîrvu *, Cătălin Ungurean **, Aurelian Derviş **

* ERICSSON Telecommunications Romania S.R.L.
** Faculty of Electronics, Telecommunications and IT, University "Politehnica" of Bucharest, ROMANIA




Text-to-Speech Engines as Telecom Service Enablers

Outline:
- Service types
- Implementation view
- Network view
- Making use of open protocols
- E-mail reader platform
- Carrier-grade platform
- TTS algorithm implementation issues
- Conclusions


Introduction

- Voice interaction with a non-human peer is a complex task that involves all speech technology domains
- The areas of interest are completely independent: database information retrieval, navigation aid, or health and public administration
- Speech services nevertheless have similar requirements to fulfill. These could be unified by abstracting speech technologies as telecom enablers


Service types

Simple taxonomy of TTS-based services:
- Notification/warning services
  - News reading or location-based traffic information
  - The information is pushed to the terminal
- Legacy terminal adaptation
  - Reading e-mail or SMS from legacy POTS terminals
  - The user pulls the information from the source
- Accessibility


Implementation view

The services could be implemented as:
- Embedded systems
  - Pros: low bandwidth requirements
  - Cons: terminal resources, intellectual property
- Network-based systems
  - Pros: convergent services
  - Cons: higher communication costs for the end-user (but these costs are constantly dropping)
- Mixed solutions


Network view

- Networks are evolving to all-IP, but slowly and at high cost
- TTS-based services must rely on open and future-proof protocols
- SIP/RTP-based protocols such as MRCP are well suited


Making use of open protocols

- The client requires the generation and/or consumption of media streams
- The Media Resource Server has the relevant resources to process the input stream or to generate the output stream: speech synthesis engines, automatic speech recognition (ASR) engines, speaker verification/identification (SV/SI) engines
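
To make the client-to-Media-Resource-Server exchange more concrete, here is a minimal sketch of how a client might frame a synthesis request, assuming MRCPv2 (the SIP-based version of MRCP). The function name, the channel identifier and the SSML body are illustrative placeholders, not taken from the platform described in these slides. Session setup over SIP and the RTP audio stream are out of scope here.

```python
def build_speak_request(request_id: int, channel_id: str, ssml: str) -> bytes:
    """Frame an MRCPv2 SPEAK request: start line, headers, blank line, body."""
    body = ssml.encode("utf-8")
    headers = (
        f"Channel-Identifier: {channel_id}\r\n"
        "Content-Type: application/ssml+xml\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")

    # The start line carries the byte length of the whole message, including
    # the start line itself, so iterate until the value is self-consistent.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} SPEAK {request_id}\r\n".encode("ascii")
        total = len(start_line) + len(headers) + len(body)
        if total == length:
            return start_line + headers + body
        length = total


if __name__ == "__main__":
    # Placeholder values for illustration only.
    message = build_speak_request(
        request_id=1,
        channel_id="0001@speechsynth",
        ssml='<speak version="1.0" xml:lang="ro-RO">Bună ziua</speak>',
    )
    print(message.decode("utf-8"))
```

The synthesized audio itself does not travel in the MRCP response; the server streams it to the client over the RTP session negotiated during SIP setup.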


E-mail reader platform (1)

- Single-threaded platform
- Lets users listen to e-mail messages on legacy POTS terminals; a simple example of a convergent service
- Useful for
  - Verifying the end-to-end concept with minimal resources
  - Collecting end-user feedback on speech quality


E-mail reader platform (2)

- TTS engine
  - C-based TTS algorithms (synthesis in Romanian)
  - HTTP server
- Media server
  - Collection of Perl scripts for
    - Service control
    - E-mail connector
    - POTS connector


E-mail reader platform (3)

A typical call:
1) The platform is triggered by a call to an access number
2) The e-mail connector opens a session to the predefined e-mail account
3) Several options are presented to the user during an interactive voice response session
4) The service logic opens an HTTP connection to the TTS engine and sends the text to be converted to speech
5) The media server re-encodes the audio payload from the HTTP response and forwards it to the POTS connector
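
The call steps above can be sketched as a single control function. This is only an illustration of the sequencing: the platform itself uses Perl scripts, and every name below is hypothetical. Each connector is passed in as a callable so the sketch runs without a telephone line, a mailbox or a network; in the real platform the `synthesize` step would be the HTTP request to the TTS engine.

```python
from typing import Callable, List, Sequence


def handle_call(
    fetch_messages: Callable[[], Sequence[str]],      # e-mail connector (step 2)
    choose_message: Callable[[Sequence[str]], int],   # IVR dialogue (step 3)
    synthesize: Callable[[str], bytes],               # request to the TTS engine (step 4)
    reencode: Callable[[bytes], bytes],               # media server transcoding (step 5)
    play: Callable[[bytes], None],                    # POTS connector (step 5)
) -> None:
    """Serve one call; invoked when the access number is dialled (step 1)."""
    messages = fetch_messages()
    if not messages:
        return  # nothing to read out
    index = choose_message(messages)
    audio = synthesize(messages[index])
    play(reencode(audio))


if __name__ == "__main__":
    played: List[bytes] = []
    handle_call(
        fetch_messages=lambda: ["Salut!", "Ne vedem la ora cinci."],
        choose_message=lambda msgs: 1,                   # caller picks the second message
        synthesize=lambda text: text.encode("utf-8"),    # stand-in for real audio
        reencode=lambda audio: audio,                    # stand-in for transcoding
        play=played.append,
    )
    print(played)
```

Because the function blocks until playback is handed over, it serves one caller at a time, which matches the single-threaded character of this proof-of-concept platform.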


Carrier-grade platform

Features:
- Multi-threaded and multi-process, running on a carrier-grade hardware and software platform
- Layered architecture:
  - Horizontal processing layer: speech engines, communication and middle layer
  - Vertical layer: provisioning, O&M, statistics
- Intra-operator versus hosting deployments


Conclusions (1)

- Speech services could be unified as telecom enablers
- The new enablers can be used to build convergent services
- Implementations should leverage open protocols for future-proof solutions
- TTS technology is an important component in the development of network-based applications
- We developed an end-to-end e-mail reader application using a proprietary TTS system for Romanian


TTS algorithm implementation issues

- An e-mail or SMS reader application has to cope with an important constraint: the missing diacritics problem
- Most users still omit diacritics when typing
- Synthesizing a text written without diacritics generally leads to poor intelligibility
- The automatic restoration of diacritics is a difficult problem, as there are no evident linguistic rules to accomplish this task


Specific features in Romanian language

- Romanian makes use of three diacritic marks (a breve, a circumflex accent, and a cedilla), leading to five letters with diacritics: ă, â / î, ş and ţ.
- Some diacritics indicate only a different noun form (e.g., casă, "house", and its pair casa, "the house"); others lead to a distinct meaning (e.g., fata, "the girl", but faţa, "the face").
- The percentage of words written with diacritics in a Romanian text is substantial: between 25% and 40% of the total number of words.
- An interesting particularity: there are words that are always written with diacritics (câteva, "some"; ştiinţific, "scientific"), and also words in which some of the diacritics are always present (cămaşă / cămaşa, "shirt" / "the shirt").
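
As a small illustration of the points above, the following sketch shows what a text looks like once its diacritics are dropped, and how the share of words carrying diacritics could be measured. It is a generic approach based on Unicode decomposition, not the tooling used by the authors; the function names are hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return the text as a user typing without diacritics would write it."""
    decomposed = unicodedata.normalize("NFD", text)
    # After decomposition each diacritic is a separate combining character.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def diacritic_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated words containing at least one diacritic."""
    words = text.split()
    if not words:
        return 0.0
    marked = sum(1 for word in words if strip_diacritics(word) != word)
    return marked / len(words)


if __name__ == "__main__":
    print(strip_diacritics("ştiinţific"))
    print(diacritic_word_ratio("fata şi faţa"))
```

Stripping is trivial and lossy: both fata and faţa collapse to the same string, which is exactly why the reverse operation needs a dictionary and context.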


An automatic diacritic restoration algorithm in Romanian (1)

- The algorithm is based on a hybrid (dictionary and rule set) approach
- A large electronic dictionary of the most used Romanian words (D1) was built first; it contains more than 120,000 words
- Three further dictionaries were then iteratively inferred. The last one (D4) contains all the possible forms of each word, indexed twice: by the basic position of the word in the dictionary (index n) and by the number of the possible diacritic pattern (index i). A few examples:

Word in D4 | Index n | i = 0 | i = 1 | i = 2 | i = 3 | i = 4 | i = 5 | max(i)
câteva | 9514 | cateva | câteva | - | - | - | - | 1
ştiinţific | 56240 | stiintific | ştiinţific | - | - | - | - | 1
două | 18594 | doua | doua | două | - | - | - | 2
cămaşă | 8326 | camasa | cămaşă | cămaşa | - | - | - | 2
până | 41624 | pana | pana | pană | până | - | - | 3
rama | 48543 | rama | rama | ramă | râma | râmă | - | 4
ţara | 58334 | tara | tara | tară | ţara | ţară | ţâră | 5
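
In the D4 examples above, the entry at i = 0 is the diacritic-free spelling and the entries from i = 1 onward are the valid spellings that collapse to it, so max(i) is the number of candidates. The slides do not say how D4 was inferred; the sketch below shows one plausible way to obtain such a grouping from a list of correctly spelled words. The function names are hypothetical, and the order of candidates simply follows the input list rather than the ordering used in D4.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List


def strip_diacritics(word: str) -> str:
    """Diacritic-free spelling, used as the lookup key (the i = 0 column)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_pattern_index(words: Iterable[str]) -> Dict[str, List[str]]:
    """Group correctly spelled words by their diacritic-free key."""
    index: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        key = strip_diacritics(word)
        if word not in index[key]:
            index[key].append(word)
    return dict(index)


if __name__ == "__main__":
    # Word forms taken from the examples in the table.
    forms = [
        "câteva", "ştiinţific", "doua", "două", "cămaşă", "cămaşa",
        "pana", "pană", "până", "rama", "ramă", "râma", "râmă",
        "tara", "tară", "ţara", "ţară", "ţâră",
    ]
    for key, candidates in build_pattern_index(forms).items():
        print(key, len(candidates), candidates)
```

A key with a single candidate can be restored without looking at context; every additional candidate is a decision the rule set has to make.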


An automatic diacritic restoration algorithm in Romanian (2)

- The algorithm is part of the preprocessing module of the TTS system
- Each incoming word is first searched in D4. If it is not found, the word remains unchanged. If it is found, it is processed according to the maximum value of index i: either the diacritics that are always present are assigned automatically, or rules based on the word context are applied
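
The lookup procedure described above can be sketched as follows. This is an interpretation under stated assumptions, not the authors' implementation: the `D4` excerpt holds only a few entries from the examples, input words are assumed to be lower-case and precomposed, and `doua_rule` is an invented example of a context rule (the actual rule set is not given in the slides).

```python
from typing import Callable, Dict, List, Optional, Sequence

# Tiny excerpt in the spirit of D4: diacritic-free key -> valid spellings.
D4: Dict[str, List[str]] = {
    "cateva": ["câteva"],
    "stiintific": ["ştiinţific"],
    "doua": ["doua", "două"],
    "camasa": ["cămaşă", "cămaşa"],
}

ContextRule = Callable[[str, List[str], Sequence[str]], Optional[str]]


def always_present(key: str, forms: List[str]) -> str:
    """Assign a diacritic only where every candidate spelling agrees on it."""
    return "".join(
        column[0] if len(set(column)) == 1 else plain
        for plain, column in zip(key, zip(*forms))
    )


def restore_word(
    word: str,
    context: Sequence[str] = (),
    rule: Optional[ContextRule] = None,
) -> str:
    forms = D4.get(word)
    if forms is None:
        return word                  # not in the dictionary: leave unchanged
    if len(forms) == 1:
        return forms[0]              # max(i) = 1: the spelling is unambiguous
    if rule is not None:
        choice = rule(word, forms, context)
        if choice is not None:
            return choice            # a context rule made the decision
    return always_present(word, forms)


def doua_rule(word: str, forms: List[str], context: Sequence[str]) -> Optional[str]:
    """Illustrative rule only: 'a doua' (the second) versus 'două' (two)."""
    if word != "doua":
        return None
    return "doua" if context and context[-1] == "a" else "două"


if __name__ == "__main__":
    print(restore_word("cateva"))
    print(restore_word("camasa"))
    print(restore_word("doua", context=["cele"], rule=doua_rule))
```

Here `context` holds the words preceding the current one. When no rule applies, the fallback keeps only the diacritics shared by all candidates, so an ambiguous position stays as the user typed it.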


Conclusions (2)

- Restoring missing diacritics in text is a common problem for most languages that use the Latin alphabet
- The overall accuracy of the proposed algorithm is currently about 94% (tests were performed on three texts containing about 12,000 words; each diacritic missed or incorrectly assigned was counted as an error)
- We anticipate even better results from increasing the dictionary size and from additionally using morphological analysis for word disambiguation
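
For readers who want to run a similar evaluation, the error criterion above (every missed or wrongly assigned diacritic counts as an error) can be sketched as a character-by-character comparison against a correctly written reference. The slides do not state what the 94% figure is measured against; the sketch assumes, purely for illustration, that accuracy is the share of words restored without any error. Function names are hypothetical.

```python
import unicodedata


def count_diacritic_errors(reference: str, restored: str) -> int:
    """Count positions where the restored text differs from the reference."""
    ref = unicodedata.normalize("NFC", reference)
    out = unicodedata.normalize("NFC", restored)
    if len(ref) != len(out):
        raise ValueError("texts must align character by character")
    return sum(1 for ref_ch, out_ch in zip(ref, out) if ref_ch != out_ch)


def word_accuracy(reference: str, restored: str) -> float:
    """Share of words restored without error (assumed definition of accuracy)."""
    ref_words = unicodedata.normalize("NFC", reference).split()
    out_words = unicodedata.normalize("NFC", restored).split()
    if len(ref_words) != len(out_words):
        raise ValueError("texts must contain the same number of words")
    if not ref_words:
        return 1.0
    correct = sum(1 for r, o in zip(ref_words, out_words) if r == o)
    return correct / len(ref_words)


if __name__ == "__main__":
    reference = "fata are două cămăşi"
    restored = "fata are doua cămăşi"
    print(count_diacritic_errors(reference, restored))
    print(word_accuracy(reference, restored))
```

Restoration only substitutes letters, so the two texts have the same length once both are normalized to precomposed form, which is what makes the positional comparison valid.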