Cosc 4/5730 Android Text to Speech And Speech To Text

Upload: garin

Post on 15-Jan-2016


Page 1: Cosc  4/5730

Cosc 4/5730

Android Text to Speech and Speech to Text

Page 2: Cosc  4/5730

TEXT TO SPEECH

Android

Page 3: Cosc  4/5730

Text to Speech

• In Android 1.6+ there is a native text-to-speech engine built into the Android OS.
– In 2.3.3: Menu -> Voice input & output -> Text-to-speech settings
– In 4.x: Settings -> Language and Input -> Text-to-speech output
– You can use "Listen to an example" to hear how it works.

Page 4: Cosc  4/5730

How it works.

• The Text-to-Speech (TTS) system uses the Pico engine
– It sends the "speech" to the audio output.
• There is only one TTS engine and it is shared across all the activities on the device.
– Other activities may be using it
– The user may have overridden the settings in the preferences as well.

Page 5: Cosc  4/5730

Using the TTS

• First we need to check if the TTS engine is available.
– We can do this with an Intent using ACTION_CHECK_TTS_DATA
• Using startActivityForResult, we then find out if the TTS engine is working and available.
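
The availability check described above might look like the following sketch. The class name TtsCheckActivity and the request code TTS_CHECK_CODE are hypothetical names chosen for illustration. The slides do not say what to do when the check fails; launching ACTION_INSTALL_TTS_DATA is one common choice and is an assumption here.

```java
import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;

public class TtsCheckActivity extends Activity {
    // Hypothetical request code; any int unique within this activity works.
    private static final int TTS_CHECK_CODE = 1;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Ask the platform whether TTS voice data is installed.
        Intent checkIntent = new Intent(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        startActivityForResult(checkIntent, TTS_CHECK_CODE);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode != TTS_CHECK_CODE) {
            return;
        }
        if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
            // Voice data is present; it is now safe to create a TextToSpeech object.
        } else {
            // Assumption: send the user off to install the missing voice data.
            startActivity(new Intent(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA));
        }
    }
}
```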

Page 6: Cosc  4/5730

android.speech.tts

• To use the TTS, we need to get access to it through the constructor
• TextToSpeech(Context context, TextToSpeech.OnInitListener listener)
– The constructor for the TextToSpeech class.
  mTts = new TextToSpeech(this, this);
– The first this is the context of our application
– The language is likely US English (en-US)
– The second this is the listener.
• … Activity implements OnInitListener
– @Override public void onInit(int status)

Page 7: Cosc  4/5730

OnInitListener

• onInit(int status)
– Called to signal the completion of the TextToSpeech engine initialization.
– Status is either
• TextToSpeech.SUCCESS
– You can use it.
– or
• TextToSpeech.ERROR
– Failure, you can't use it.

Page 8: Cosc  4/5730

Using the TTS

• To have it speak words
– speak(String text, int queueMode, HashMap<String, String> params)
• To stop, call stop()
• Call shutdown() to release everything

Page 9: Cosc  4/5730

Example

• mTts.speak("Test", TextToSpeech.QUEUE_ADD, null);
– You should hear the word "test" spoken.

Page 10: Cosc  4/5730

Other methods.

• You can change the pitch and speech rate with
– setPitch(float pitch)
– setSpeechRate(float speechRate)
• To find out if "it" is still speaking
– boolean isSpeaking()
• To have the speech written to a file
– synthesizeToFile(String text, HashMap<String, String> params, String filename)
• Remember the permission for writing to the file system.

Page 11: Cosc  4/5730

Note

• In the onPause() method
– You should put at least a stop() call
– Your app has lost focus
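
Putting the TTS slides together, a minimal activity could be sketched as below. The class name SpeakActivity and the ready flag are hypothetical. The setLanguage(Locale.US) call is an assumption added so the sketch does not depend on the device default. The three-argument speak() is the one shown in the slides; it was deprecated in API 21 in favor of a four-argument overload.

```java
import android.app.Activity;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

public class SpeakActivity extends Activity implements TextToSpeech.OnInitListener {
    private TextToSpeech mTts;
    private boolean ready = false; // hypothetical flag: true once the engine is usable

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // First "this" is the context, second "this" is the OnInitListener.
        mTts = new TextToSpeech(this, this);
    }

    @Override
    public void onInit(int status) {
        if (status != TextToSpeech.SUCCESS) {
            return; // TextToSpeech.ERROR: the engine cannot be used
        }
        // Assumption: request US English explicitly rather than rely on the default.
        int result = mTts.setLanguage(Locale.US);
        ready = result != TextToSpeech.LANG_MISSING_DATA
                && result != TextToSpeech.LANG_NOT_SUPPORTED;
        if (ready) {
            mTts.setPitch(1.0f);      // 1.0 is the normal pitch
            mTts.setSpeechRate(1.0f); // 1.0 is the normal rate
            mTts.speak("Test", TextToSpeech.QUEUE_ADD, null);
        }
    }

    @Override
    protected void onPause() {
        // The app has lost focus, so at least stop any speech in progress.
        if (mTts != null) {
            mTts.stop();
        }
        super.onPause();
    }

    @Override
    protected void onDestroy() {
        // Release the engine's resources.
        if (mTts != null) {
            mTts.shutdown();
        }
        super.onDestroy();
    }
}
```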

Page 12: Cosc  4/5730

Example code

• txt2spk example on GitHub
– Simple text box and button. Type in the words you want to speak and then press play.
– If you are running the example on a phone
• For fun, use the voice input (microphone on the keyboard) for the input and then have it read it back to you.

Page 13: Cosc  4/5730

SPEECH TO TEXT

Android

Page 14: Cosc  4/5730

Speech To Text

• Like text to speech, we are going to call on another piece of software: Google's voice recognition.
– android.speech package
– The simple version uses an intent, and a dialog box lets the user know when to speak.
• RecognizerIntent
– With an onActivityResult
• Note: speech recognition doesn't work in the emulators.

Page 15: Cosc  4/5730

Simple version code

• First, get the recognizer intent
  Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
• Specify the calling package to identify your application (this one is generic for any class you use)
  intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());
• Display a hint to the user in the dialog box
  intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say Something!");
• Give a hint to the recognizer about what the user is going to say
  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
• Specify how many results you want to receive. The results are sorted so that the first result is the one with the highest confidence. In this case, a maximum of 5 results.
  intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
• Now launch the activity for a result
  startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);

Page 16: Cosc  4/5730

Simple version code (2)

• When the recognition is done, results are returned to onActivityResult
  protected void onActivityResult(int requestCode, int resultCode, Intent data) {
      if (requestCode == VOICE_RECOGNITION_REQUEST_CODE && resultCode == RESULT_OK) {
          // Fill the list view with the strings the recognizer thought it could have
          // heard; there should be at most 5, based on the call.
          ArrayList<String> matches = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
          // Now you deal with the results in the matches list.
      }
      // Lastly, send other results to the super, since we are not dealing with them.
      super.onActivityResult(requestCode, resultCode, data);
  }
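
What "dealing with the results" means depends on the app. As one illustration, the sketch below scans the candidate strings in order (highest confidence first) and returns the first one that matches a known voice command. The class CommandMatcher, the method firstKnownCommand, and the idea of a fixed lowercase command list are all hypothetical and not part of the Android API; the code is plain Java so it can be tried off-device.

```java
import java.util.List;
import java.util.Locale;

final class CommandMatcher {
    private CommandMatcher() {}

    /**
     * Returns the first candidate (after trimming and lowercasing) that appears
     * in the list of known commands, or null if none matches.
     * The commands list is expected to hold lowercase strings.
     */
    static String firstKnownCommand(List<String> matches, List<String> commands) {
        if (matches == null) {
            return null; // the recognizer returned nothing
        }
        for (String candidate : matches) {
            String normalized = candidate.trim().toLowerCase(Locale.ROOT);
            if (commands.contains(normalized)) {
                return normalized;
            }
        }
        return null;
    }
}
```

Inside onActivityResult this could be called as CommandMatcher.firstKnownCommand(matches, Arrays.asList("play", "stop")), with the command words replaced by whatever the app understands.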

Page 17: Cosc  4/5730

SpeechRecognizer class

• A second version is more complex, but it also removes the dialog box
• Many people want to implement their own dialog or just not have one.
• You will need the RECORD_AUDIO permission
– <uses-permission android:name="android.permission.RECORD_AUDIO"/>
• Get the speech recognizer and a RecognitionListener
– This still uses an intent as well.
• Remember the recognition is done by Google's "cloud".

Page 18: Cosc  4/5730

SpeechRecognizer

• First get the recognizer
  sr = SpeechRecognizer.createSpeechRecognizer(this);
• Set your listener.
  sr.setRecognitionListener(new Recognitionlistener());
– Recognitionlistener here is our own class that implements the RecognitionListener interface; it is on the next slide.

Page 19: Cosc  4/5730

RecognitionListener

• Create a class that implements RecognitionListener and implement the following methods
– void onBeginningOfSpeech()
• The user has started to speak.
– void onBufferReceived(byte[] buffer)
• More sound has been received.
– void onEndOfSpeech()
• Called after the user stops speaking.
– void onError(int error)
• A network or recognition error occurred.
• Error codes are the ERROR_* constants in SpeechRecognizer.
– void onEvent(int eventType, Bundle params)
• Reserved for adding future events.
– void onPartialResults(Bundle partialResults)
• Called when partial recognition results are available.
– void onReadyForSpeech(Bundle params)
• Called when the endpointer is ready for the user to start speaking.
– void onResults(Bundle results)
• Called when recognition results are ready.
– void onRmsChanged(float rmsdB)
• The sound level in the audio stream has changed.
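
The int passed to onError(int error) is easier to log or display once it is turned into text. The helper below is a sketch: the class SpeechErrors and the message wording are hypothetical, and the numeric values are written as literals mirroring the SpeechRecognizer.ERROR_* constants so the code runs without the Android SDK. In an app, using the named constants directly would be preferable.

```java
final class SpeechErrors {
    private SpeechErrors() {}

    /** Maps a SpeechRecognizer error code to a short description. */
    static String describe(int error) {
        switch (error) {
            case 1: return "Network timeout";          // ERROR_NETWORK_TIMEOUT
            case 2: return "Network error";            // ERROR_NETWORK
            case 3: return "Audio recording error";    // ERROR_AUDIO
            case 4: return "Server error";             // ERROR_SERVER
            case 5: return "Client-side error";        // ERROR_CLIENT
            case 6: return "No speech input";          // ERROR_SPEECH_TIMEOUT
            case 7: return "No recognition match";     // ERROR_NO_MATCH
            case 8: return "Recognizer busy";          // ERROR_RECOGNIZER_BUSY
            case 9: return "Insufficient permissions"; // ERROR_INSUFFICIENT_PERMISSIONS
            default: return "Unknown error " + error;
        }
    }
}
```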

Page 20: Cosc  4/5730

RecognitionListener (2)

• onResults method
– This is where you would pull out the results from the bundle
– ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);

Page 21: Cosc  4/5730

Start the recognition

• As in the simple version, we need an intent to start the recognition, but we send the intent through the SpeechRecognizer object we declared in the beginning.
– Get the recognizer intent
  Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
– Specify the calling package to identify your application
  intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());
– Give a hint to the recognizer about what the user is going to say
  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
– Specify the max number of results
  intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
– Use our SpeechRecognizer to send the intent.
  sr.startListening(intent);
• The listener will now get the results.
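
For reference, the SpeechRecognizer steps can be combined into a single activity roughly as follows. This is a sketch under assumptions: the class name ListenActivity and the log tag are hypothetical, the activity itself serves as the listener, listening starts immediately in onCreate, and the RECORD_AUDIO permission is assumed to have been granted already (newer Android versions also require asking for it at run time).

```java
import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.util.Log;
import java.util.ArrayList;

public class ListenActivity extends Activity implements RecognitionListener {
    private static final String TAG = "ListenActivity"; // hypothetical log tag
    private SpeechRecognizer sr;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(this);

        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
                getClass().getPackage().getName());
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
        sr.startListening(intent); // no dialog box is shown
    }

    @Override
    public void onResults(Bundle results) {
        ArrayList<String> matches =
                results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        if (matches != null) {
            for (String match : matches) {
                Log.d(TAG, "heard: " + match);
            }
        }
    }

    @Override
    public void onError(int error) {
        Log.w(TAG, "recognition error code " + error);
    }

    // The remaining callbacks are required by the interface but unused here.
    @Override public void onReadyForSpeech(Bundle params) {}
    @Override public void onBeginningOfSpeech() {}
    @Override public void onRmsChanged(float rmsdB) {}
    @Override public void onBufferReceived(byte[] buffer) {}
    @Override public void onEndOfSpeech() {}
    @Override public void onPartialResults(Bundle partialResults) {}
    @Override public void onEvent(int eventType, Bundle params) {}

    @Override
    protected void onDestroy() {
        sr.destroy(); // release the recognizer
        super.onDestroy();
    }
}
```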

Page 22: Cosc  4/5730

Code Examples

• txt2spk will speak text
• Speak2Text demo shows you more information on using other languages for voice recognition, plus it will speak the results back to you.
• speech2txtDemo is simplified voice recognition
• speech2txtDemo2 uses the RecognitionListener.

Page 23: Cosc  4/5730

iSpeech

• They have an SDK and API for BlackBerry, Android, and iPhone as well.
– Text to speech
• With many voice options as well
– Speech to text
• The demo key is limited to 100 words per application launch.
– A license key removes the 100 word limit.
• http://www.ispeech.org/

Page 24: Cosc  4/5730

References

• http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/TextToSpeechActivity.html

• http://developer.android.com/reference/android/speech/SpeechRecognizer.html

• http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html

• http://stackoverflow.com/questions/6316937/how-can-i-use-speech-recognition-without-the-annoying-dialog-in-android-phones

Page 25: Cosc  4/5730

Q&A