printed documentation - lumenvox · 2 release notes version 6.0: supports n-best. reduced server...

Printed Documentation


Post on 11-Dec-2020


Page 1: Printed Documentation - LumenVox · 2 Release Notes Version 6.0: Supports n-best. Reduced server memory footprint. Speed up on recognition algorithm. Reduced server new thread start

Table Of Contents

Welcome to the LumenVox Speech Recognition Engine
Release Notes
Version 6.0:
Version 5.0:
Version 4.0:
Programmers Guide
Initializing a Speech Port
C Code
C++ Code
C Code
C++ Code
Working with Grammars
Loading A Grammar
C Code
C++ Code
Activating A Grammar
C Code
C++ Code
See Also
Adding Audio
Batched Audio
C Code
C++ Code
Streaming
C++ Code
C Code
Decoding
C Code
C++ Code
Streaming
Getting The Return Value
C Code
C++ Code
C Code
C++ Code
See Also
Using the Speech Parse Tree
Example 1: Print the Tags in the tree
Example 2: Print a structured tree
See Also
Using the Interpretation Object
C API
C++ API
Semantic Data Examples
Example 1: Access Data Directly
C++ Code
C Code
Example 2: Traverse a Semantic Data Structure
C Code
Result
See Also
Shutting Down the Speech Port
C Code
C++ Code
Gotchas
Example Code
A Working Example
main.cpp
SimpleRecognizer.h
SimpleRecognizer.cpp
AudioStreamer.h
AudioStreamer.cpp
HeaderClasses.h
SRGS Grammars
A Simple Grammar
Rule Expansions by Example
Rule References
Special Rules
Tags
Applying Grammar Weights
SRGS Definitions
Example Grammars
Semantic Interpretation
Intro to Semantic Interpretation
Semantic Interpretation by Example
Getting The Return Value
Phonemes
Phrases
BNF Refresher
LumenVox SpeechRec API
Cautions
LV_SRE C API Functions
LV_SRE
API Functions
LVInterpretation C API Functions
LVInterpretation Summary
LVSemanticData Summary
API Functions
LVParseTree C API functions
API Functions
Related APIs
LVParseTree Class
LVGrammar C API Functions
LVGrammar Summary
API Functions
LVSpeechPort Class
class LVSpeechPort
Methods
LVInterpretation Class
Intro To LVInterpretation
LVInterpretation: Constructing and Copying
ResultData
ResultName
Language
Mode
TagFormat
InputSentence
GrammarLabel
Score
LVSemanticData Class
LVSemanticObject Class
LVSemanticArray Class
LVParseTree Class
LVParseTree Class
Methods
LVParseTree Inner Classes
LVGrammar Class
class LVGrammar
Methods
Callback Functions
Logging Callback Function
Streaming Callback Function
Grammar Logging Callback Function
Constants
Decoder Flags
Error Codes
Properties
Sound Formats
Standard Grammars
Semantic Data Type
Semantic Data Print Format
Stream Parameters
Environment Variables
Environment Variables
FAQs
FAQs
How to Contact LumenVox LLC
Copyright Information
Glossary
Index


Welcome to the LumenVox Speech Recognition Engine

We strive to make our products as user-friendly as possible and we value your opinion. If there is something you would like added to the Help system, please email your suggestions to [email protected].

Release Notes

Version 6.0:

Supports n-best.

Reduced server memory footprint.

Faster recognition algorithm.

Reduced start-up time for new server threads.

New American English acoustic models with an 8-10% relative improvement in recognition accuracy.

Improved confidence score.

Global grammars are stored on the server.

Version 5.0:

Support for the Speech Recognition Grammar Specification (SRGS). SRGS grammars are now the official grammar format for the LumenVox Engine. SRGS grammars are powerful probabilistic context-free grammars that allow a lot of flexibility in writing grammars.

Support for the Semantic Interpretation for Speech Recognition working draft (SISR). Semantic Interpretation makes it easy to transform spoken input into machine understandable data.

Version 4.0:

A new header file <LV_SRE2.h> is provided for the new C interface functions. This should be used in conjunction with <LV_SRE.h>.

A new header file <LVSpeechPort2.h> is provided. This contains a new C++ wrapper class (with the same name, "class LVSpeechPort") which contains new methods. This replaces the <LVSpeechPort.h> header.

A new DLL called "LVSpeechPort_stdcall.dll" is included to allow programming environments that require standard calls (like VB) to use the SRE engine. The file SREAPI.txt contains a sample interface for use with VB.

Programmers Guide

Initializing a Speech Port

To initialize a speech port, you only need a Speech Engine service running on your machine and a call to OpenPort.

C Code

HPORT port;
long error_code;

port = LV_SRE_OpenPort2(&error_code, NULL, NULL, 0);

switch (error_code)
{
case LV_OPEN_PORT_FAILED__LICENSES_EXCEEDED:
    printf("licenses exceeded");
    break;
case LV_OPEN_PORT_FAILED__PRIMARY_SERVER_NOT_RESPONDING:
case LV_NO_SERVER_RESPONDING:
    printf("SRE server unavailable");
    break;
case LV_SUCCESS:
    printf("port opened");
    break;
}

C++ Code

LVSpeechPort port;
port.OpenPort();
int error_code = port.GetOpenPortStatus();

switch (error_code)
{
case LV_FAILURE:
    cout << "licenses exceeded";
    break;
case LV_OPEN_PORT_FAILED__PRIMARY_SERVER_NOT_RESPONDING:
case LV_NO_SERVER_RESPONDING:
    cout << "SRE server unavailable";
    break;
case LV_SUCCESS:
    cout << "port opened";
    break;
}

Other things you can do besides opening a port include:

Register logging callback functions.

Register multiple servers.

Turn on Engine sound file and result logging, for application tuning.

C Code

/* a structure to hold logfile info */
typedef struct logdata_s
{
    FILE* file;          /* opened by the caller before messages arrive */
    long message_count;
} logdata_t;

void logdata_callback(const char* message, void* userdata)
{
    logdata_t* mydata = (logdata_t*)userdata;
    fprintf(mydata->file, "%s\n", message);
    ++(mydata->message_count);
}

int init_port(HPORT* port, logdata_t* app_message, logdata_t* log_message)
{
    long error_code;
    int save_sound_files = 1;

    /* Register a callback to accept messages from the server or
       client library, at warning level 3 */
    LV_SRE_RegisterAppLogMsg(logdata_callback, app_message, 3);

    /* point the client library to a local server and a remote server */
    LV_SRE_SetPropertyEx(NULL, PROP_EX_SRE_SERVERS, PROP_EX_VALUE_TYPE_STRING,
                         "127.0.0.1,10.0.0.1", PROP_EX_TARGET_CLIENT, 0);

    /* open the port, registering a callback to accept messages
       from the port at warning level 3 */
    *port = LV_SRE_OpenPort2(&error_code, logdata_callback, log_message, 3);

    /* turn on sound and response file logging */
    LV_SRE_SetPropertyEx(*port, PROP_EX_SAVE_SOUND_FILES, PROP_EX_VALUE_TYPE_INT_PTR,
                         &save_sound_files, PROP_EX_TARGET_PORT, 0);

    return error_code;
}

C++ Code

// a class to hold logfile info
struct logdata
{
    ofstream file;
    long message_count;

    static void callback(const char* message, void* userdata)
    {
        logdata* self = (logdata*)userdata;
        self->file << message << endl;
        ++(self->message_count);
    }
};

int init_port(LVSpeechPort& port, logdata* app_message, logdata* log_message)
{
    int save_sound_files = 1;

    // Register a callback to accept messages from the server
    // or client library, at warning level 3.
    LVSpeechPort::RegisterAppLogMsg(logdata::callback, app_message, 3);

    // point the client library to a local server and a remote server
    LVSpeechPort::SetClientPropertyEx(PROP_EX_SRE_SERVERS,
                                      PROP_EX_VALUE_TYPE_STRING,
                                      "127.0.0.1,10.0.0.1");

    // open the port, registering a callback to accept messages
    // from the port at warning level 3.
    port.OpenPort(logdata::callback, log_message, 3);

    // turn on sound and response file logging
    port.SetPropertyEx(PROP_EX_SAVE_SOUND_FILES,
                       PROP_EX_VALUE_TYPE_INT_PTR,
                       &save_sound_files);

    return port.GetOpenPortStatus();
}
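The userdata pattern used by the logging callbacks can be tried without an Engine installed. The following is a minimal stand-alone sketch, not LumenVox code: LogData, LogCallback and EmitLogMessage are hypothetical names, and EmitLogMessage only stands in for whatever the client library does when it has a message to deliver. It shows how a static member function receives its object back through the opaque void* pointer.

```cpp
#include <sstream>
#include <string>

// Same callback shape as the logging callbacks in the examples above.
typedef void (*LogCallback)(const char* message, void* userdata);

// State handed to the callback through the userdata pointer.
struct LogData
{
    std::ostringstream out;   // stands in for the ofstream in the example
    long message_count;

    LogData() : message_count(0) {}

    // A static member function has no hidden "this" argument, so it can be
    // passed where a plain C function pointer is expected.
    static void Callback(const char* message, void* userdata)
    {
        LogData* self = static_cast<LogData*>(userdata);
        self->out << message << '\n';
        ++self->message_count;
    }
};

// Hypothetical stand-in for the library side: it knows only the function
// pointer and the opaque userdata pointer it was registered with.
void EmitLogMessage(LogCallback cb, void* userdata, const char* message)
{
    cb(message, userdata);
}
```

Calling EmitLogMessage(LogData::Callback, &my_log, "port opened") appends one line to my_log.out and increments my_log.message_count. Using separate LogData objects for the application-level and port-level callbacks keeps the two message streams apart.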

Working with Grammars

Grammars tell the Speech Recognition Engine what words and phrases can be recognized, and in what order. The LumenVox grammar format is an implementation of the Speech Recognition Grammar Specification (SRGS), published by the W3C. A short tutorial on writing SRGS grammars is provided in the SRGS Grammars section.

Loading A Grammar

In order to decode audio, there must be at least one grammar loaded. Grammars can be loaded in a variety of ways, a few of which are demonstrated below:

C Code

HPORT hport;

/* Load a grammar into the global (application-level) space, and name it
 * "nav_menu".
 * This grammar will be usable by any speech port on the client machine.
 * Any syntax warnings or error messages will be sent to the
 * application-level logging callback. */
LV_SRE_LoadGlobalGrammar("nav_menu", "c:/MyGrammars/top_level_navigation.gram");

/* Load a built-in grammar into the speech port, name it "yes_no".
 * Syntax error or warning messages
 * will be sent to the port's logging callback.
 * The hport needs to be open first, of course. */
LV_SRE_LoadGrammar(hport, "yes_no", "builtin:grammar/boolean");

C++ Code

LVSpeechPort port;
port.OpenPort();
LVSpeechPort::LoadGlobalGrammar("nav_menu", "c:/MyGrammars/top_level_navigation.gram");
port.LoadGrammar("yes_no", "builtin:grammar/boolean");

Activating A Grammar

When a grammar is loaded, it is compiled into a file usable by the Engine. But to use the grammar for a decode you must activate it. You may activate multiple grammars for a single decode; the Engine will tell you which grammar was matched.

C Code

/* Activates the "nav_menu" grammar that was loaded above.
 * Activate searches for a grammar named "nav_menu" in its port,
 * then searches the global space if it can't find it. */
LV_SRE_ActivateGrammar(hport, "nav_menu");

C++ Code

port.ActivateGrammar("nav_menu");
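The lookup order described in the C comment (the port's own grammars first, then the global space) can be pictured with a small stand-alone model. This is only an illustration under assumed names: GrammarSpace and FindGrammar are hypothetical and say nothing about how the Engine actually stores grammars.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model: a grammar space maps a grammar name to its source URI.
typedef std::map<std::string, std::string> GrammarSpace;

// Looks in the port's own space first and falls back to the global space.
// Returns a pointer to the grammar source, or NULL if the name is unknown.
const std::string* FindGrammar(const GrammarSpace& port_space,
                               const GrammarSpace& global_space,
                               const std::string& name)
{
    GrammarSpace::const_iterator it = port_space.find(name);
    if (it != port_space.end())
        return &it->second;

    it = global_space.find(name);
    if (it != global_space.end())
        return &it->second;

    return NULL;
}
```

In this model a port-level grammar with the same name as a global one shadows the global grammar for that port only, which is the practical consequence of searching the port first.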

See Also

Grammar Writing Tutorial

Adding Audio

Because the LumenVox Speech Engine is hardware independent, the client application has greater flexibility when collecting the audio data. Once the audio is acquired, the client application should ensure the data is in a supported audio format. The audio must be header-less, otherwise known as "raw" audio format. For example, the standard Windows .wav files have a header which needs to be removed.
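Removing a .wav header amounts to locating the "data" chunk and keeping only its contents. The following is a minimal sketch rather than a LumenVox facility: ExtractRawAudio is a hypothetical helper, and it assumes the whole file is already in memory and follows the usual little-endian RIFF/WAVE chunk layout. It does not convert the sample format, so the samples must already be in a format the Engine supports.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of the LumenVox API): returns the raw sample
// bytes of an in-memory RIFF/WAVE file, or an empty vector if the buffer is
// not a RIFF/WAVE file or has no "data" chunk.
std::vector<unsigned char> ExtractRawAudio(const std::vector<unsigned char>& wav)
{
    std::vector<unsigned char> raw;

    // A RIFF/WAVE file starts with "RIFF", a 4-byte size, then "WAVE".
    if (wav.size() < 12 ||
        std::memcmp(&wav[0], "RIFF", 4) != 0 ||
        std::memcmp(&wav[8], "WAVE", 4) != 0)
    {
        return raw;
    }

    // Walk the chunk list: each chunk is a 4-byte id and a 4-byte
    // little-endian length, followed by that many bytes (padded to even).
    std::size_t pos = 12;
    while (pos + 8 <= wav.size())
    {
        std::uint32_t len = static_cast<std::uint32_t>(wav[pos + 4]) |
                            (static_cast<std::uint32_t>(wav[pos + 5]) << 8) |
                            (static_cast<std::uint32_t>(wav[pos + 6]) << 16) |
                            (static_cast<std::uint32_t>(wav[pos + 7]) << 24);
        std::size_t body = pos + 8;
        std::size_t avail = wav.size() - body;

        if (std::memcmp(&wav[pos], "data", 4) == 0)
        {
            // Clamp to the buffer in case the header overstates the length.
            std::size_t n = len < avail ? len : avail;
            raw.assign(wav.begin() + body, wav.begin() + body + n);
            return raw;
        }

        // Skip any other chunk (for example "fmt "), including its pad byte.
        std::size_t skip = static_cast<std::size_t>(len) + (len & 1u);
        if (skip > avail)
            break;
        pos = body + skip;
    }
    return raw;
}
```

Walking the chunk list is safer than skipping a fixed 44 bytes, because files may carry extra chunks before the audio data.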

The audio data is stored in a voice channel. Each speech port has 64 voice channels, so up to 64 audio samples can be stored in a speech port at once. Most applications need only two: one for the main answer and one for the answer to a yes/no confirmation question.

Audio may be loaded all at once for a batch decode, or it may be streamed in.

Batched Audio

To get your audio into the port, collect it into a buffer and call LoadVoiceChannel.

C Code

void LoadAudio(HPORT hport, void* audio, int audiolength)
{
    LV_SRE_LoadVoiceChannel(hport, 1, audio, audiolength, PCM_16KHZ);
}

C++ Code

void LoadAudio(LVSpeechPort& myPort, void* audio, int audiolength)
{
    myPort.LoadVoiceChannel(1, audio, audiolength, PCM_16KHZ);
}

Streaming

In order to stream audio into the server, there are several parameters to set. We will set them to the most commonly used settings:

C++ Code

// The port gets opened and initialized.
LVSpeechPort port;
port.OpenPort();
// ...

// let the port detect beginning and end of speech,
// and handle the speech decoding automatically
port.StreamSetParameter(STREAM_PARM_DETECT_BARGE_IN, 1);
port.StreamSetParameter(STREAM_PARM_DETECT_END_OF_SPEECH, 1);
port.StreamSetParameter(STREAM_PARM_AUTO_DECODE, 1);

// pick a voice channel to record audio and send responses to.
port.StreamSetParameter(STREAM_PARM_VOICE_CHANNEL, 1);

// If you wish to use your activated SRGS grammars, the grammar set
// must be LV_ACTIVE_GRAMMAR_SET
port.StreamSetParameter(STREAM_PARM_GRAMMAR_SET, LV_ACTIVE_GRAMMAR_SET);

C Code

LV_SRE_StreamSetParameter(hport, STREAM_PARM_DETECT_BARGE_IN, 1);
LV_SRE_StreamSetParameter(hport, STREAM_PARM_DETECT_END_OF_SPEECH, 1);
LV_SRE_StreamSetParameter(hport, STREAM_PARM_AUTO_DECODE, 1);
LV_SRE_StreamSetParameter(hport, STREAM_PARM_VOICE_CHANNEL, 1);
LV_SRE_StreamSetParameter(hport, STREAM_PARM_GRAMMAR_SET, LV_ACTIVE_GRAMMAR_SET);

The rest of this example will be in C++. The C version can be an exercise for the reader. Suppose we have an interface that intermittently provides audio to us. For simplicity, assume it always sends audio in u-Law 8KHz:

typedef bool (*AudioStreamCallback)(char* audio_chunk, int audio_length, void* user_data);

class AudioStreamer
{
public:
    // non-blocking function. Sends audio through the callback function
    // at regular intervals on a separate thread. It will stop sending
    // audio if the callback returns "false".
    void StartStream(AudioStreamCallback cb, void* user_data);

    // The audio thread will stop sending audio through the callback if
    // StopStream is called. When StopStream returns, the audio thread
    // is no longer sending.
    void StopStream();

    // constructors, destructors, hardware hooks, etc.
    // ...
};
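To make the callback contract concrete, here is a minimal synchronous sketch of the loop such a streamer might run. It is an assumption-laden stand-in, not the AudioStreamer implementation: FeedInChunks is a hypothetical function, it runs on the calling thread with no timing, and it redeclares the callback typedef so that it compiles on its own.

```cpp
#include <cstddef>

// Same callback shape as AudioStreamCallback in the interface above.
typedef bool (*AudioStreamCallback)(char* audio_chunk, int audio_length, void* user_data);

// Hypothetical synchronous stand-in for the streaming thread: hands the
// buffer to the callback in chunks of at most chunk_size bytes and stops
// early if the callback returns false. Returns the number of bytes handed
// to the callback, including the chunk on which it asked to stop.
std::size_t FeedInChunks(char* audio, std::size_t total, std::size_t chunk_size,
                         AudioStreamCallback cb, void* user_data)
{
    if (chunk_size == 0)
        return 0;   // nothing sensible to send

    std::size_t sent = 0;
    while (sent < total)
    {
        std::size_t n = total - sent;
        if (n > chunk_size)
            n = chunk_size;

        bool keep_going = cb(audio + sent, static_cast<int>(n), user_data);
        sent += n;

        if (!keep_going)
            break;
    }
    return sent;
}
```

A real streamer would typically pace the chunks to match the audio rate and run on its own thread, as the interface comments describe; the early exit on a false return value is the part this sketch is meant to show.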

The speech port also has a callback mechanism for letting the user know what state of processing it is in.

typedef void (*StreamStateChangeFn)(long new_state,
                                    unsigned long total_bytes,
                                    unsigned long recorded_bytes,
                                    void* user_data);

We can connect our speech port and the audio streamer together by way of their callbacks.

struct SimpleRecognizer
{
    LVSpeechPort port;
    AudioStreamer audio;
};

bool AudioCB(char* audio_chunk, int audio_length, void* user_data)
{
    SimpleRecognizer* self = (SimpleRecognizer*)user_data;
    self->port.StreamSendData(audio_chunk, audio_length);
    return true;
}

static void PortCB(long new_state, unsigned long total_bytes,
                   unsigned long recorded_bytes, void* user_data)
{
    SimpleRecognizer* self = (SimpleRecognizer*)user_data;
    switch (new_state)
    {
    case STREAM_STATUS_READY:
        self->audio.StartStream(AudioCB, self);
        break;
    case STREAM_STATUS_STOPPED:
    case STREAM_STATUS_END_SPEECH:
        self->audio.StopStream();
        // retrieve answers: we will define this later
        break;
    case STREAM_STATUS_BARGE_IN:
        // stop playing prompt
        break;
    }
}

All that remains is to plug the PortCB function into the port.

SimpleRecognizer reco;

// initialize the speech port and the audio streamer
// ...

// start the stream.
reco.port.StreamSetStateChangeCallBack(PortCB, &reco);
reco.port.StreamSetParameter(STREAM_PARM_SOUND_FORMAT, ULAW_8KHZ);

// StreamStart will put the port into the STREAM_STATUS_READY state, which
// will trigger the audio streamer to start sending audio to the port.
reco.port.StreamStart();

Decoding

Once grammars have been activated and the speech port is receiving audio, the decode process can begin. The decode process sends audio and grammars to the Engine to be parsed and interpreted for meaning.

Batched Audio

With audio that is dropped directly into a speech port's voice channel, the user can explicitly call Decode and wait for results to come back.

C Code

HPORT hport;

/* Let the port decide if the audio is suited for the MODEL_MALE or
   MODEL_FEMALE acoustic models. Otherwise, two decodes will be
   performed, and the port will choose afterward */
int choose_model = 1;
LV_SRE_SetPropertyEx(NULL, PROP_EX_CHOOSE_MODEL, PROP_EX_VALUE_TYPE_INT_PTR,
                     &choose_model, PROP_EX_TARGET_CLIENT, 0);

/* If you wish to use the LumenVox Semantic Interpretation process
   this flag needs to be present. */
unsigned long flags = LV_DECODE_SEMANTIC_INTERPRETATION;

/* voice_channel is wherever you loaded the audio */
int voice_channel = 1;

/* you should use the LV_ACTIVE_GRAMMAR_SET if you are using SRGS grammars.
   It is the grammar set that holds all of your active grammars. */
int grammar_set = LV_ACTIVE_GRAMMAR_SET;

/* wait a max of 3 seconds before abandoning hope for the Engine
   to return an answer */
int timeout = 3000;

LV_SRE_Decode(hport, voice_channel, grammar_set, flags);
int code = LV_SRE_WaitForEngineToIdle(hport, timeout, voice_channel);
if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    /* process the answers contained in the voice channel */
}

C++ Code

LVSpeechPort port;

int choose_model = 1;
LVSpeechPort::SetClientPropertyEx(PROP_EX_CHOOSE_MODEL,
                                  PROP_EX_VALUE_TYPE_INT_PTR, &choose_model);

unsigned long flags = LV_DECODE_SEMANTIC_INTERPRETATION;
int voice_channel = 1;
int grammar_set = LV_ACTIVE_GRAMMAR_SET;
int timeout = 3000;

port.Decode(voice_channel, grammar_set, flags);
int code = port.WaitForEngineToIdle(timeout, voice_channel);
if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    /* process the answers contained in the voice channel */
}

Streaming

If you are streaming the audio into the speech port, you can elect to have the speech port handle the decode process automatically, as we did in the section on adding audio when we wrote the line:

port.StreamSetParameter(STREAM_PARM_AUTO_DECODE, 1);

In order to wait for the Engine to return with results, we need to modify our callback function:

void ProcessResults(SimpleRecognizer* reco)
{
    reco->audio.StopStream();
    /* voice_channel is the channel the stream was configured to use */
    int code = reco->port.WaitForEngineToIdle(3000, voice_channel);
    if (code == LV_TIME_OUT)
    {
        /* do some clean up and exit */
    }
    else
    {
        /* process the answers contained in the voice channel */
    }
}

static void PortCB(long new_state, unsigned long total_bytes,
                   unsigned long recorded_bytes, void* user_data)
{
    SimpleRecognizer* self = (SimpleRecognizer*)user_data;
    switch (new_state)
    {
    case STREAM_STATUS_READY:
        self->audio.StartStream(AudioCB, self);
        break;
    case STREAM_STATUS_STOPPED:
    case STREAM_STATUS_END_SPEECH:
        ProcessResults(self);
        break;
    case STREAM_STATUS_BARGE_IN:
        //stop playing prompt
        break;
    }
}

Getting The Return Value

If WaitForEngineToIdle returns successfully, you can grab answers out of the port. If you are using the semantic interpretation processor, you retrieve LVInterpretation objects.

C Code

if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    int num_interp = LV_SRE_GetNumberOfInterpretations(hport, voice_channel);
    for (int i = 0; i < num_interp; ++i)
    {
        printf("interpretation %i:\n", i);
        H_SI interp = LV_SRE_CreateInterpretation(hport, voice_channel, i);
        const char* grammar = LVInterpretation_GetGrammarLabel(interp);
        int score = LVInterpretation_GetScore(interp);
        printf("utterance matched grammar %s with confidence %i\n", grammar, score);

        /* See "Using Semantic Data" to see how to handle the semantic data
           contained in this interpretation object by example */

        /* release the interpretation handle when finished with it */
        LVInterpretation_Release(interp);
    }
}

C++ Code

if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    int num_interp = port.GetNumberOfInterpretations(voice_channel);
    for (int i = 0; i < num_interp; ++i)
    {
        cout << "interpretation " << i << ":" << endl;
        LVInterpretation interp = port.GetInterpretation(voice_channel, i);
        const char* grammar = interp.GrammarLabel();
        int score = interp.Score();
        cout << "utterance matched grammar " << grammar
             << " with confidence " << score << endl;

        // See "Using Semantic Data" to see how to handle the semantic data
        // contained in this interpretation object by example
    }
}
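When a decode returns more than one interpretation, the application has to decide which one to act on. The following is a minimal sketch of one possible policy, assuming the grammar label and score have already been copied out of each interpretation into a plain struct. Candidate, PickBest, and the threshold values are hypothetical and are not part of the LumenVox API; the sketch also assumes that a higher score means higher confidence, and it needs a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data copy of what GrammarLabel() and Score() return.
struct Candidate {
    std::string grammar;
    int score;
};

// Returns the index of the highest-scoring candidate whose score is at
// least min_score, or -1 if none qualifies. Ties keep the earliest index.
int PickBest(const std::vector<Candidate>& candidates, int min_score) {
    assert(min_score >= 0);  // illustrative precondition
    int best = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].score < min_score) continue;
        if (best < 0 || candidates[i].score > candidates[best].score) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

A return value of -1 can be treated as a rejection, for example by re-prompting the caller. A suitable threshold depends on the application and would need to be tuned against real recordings.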

If you are not using semantic interpretation, you can receive LVParseTree objects from the Engine.

C Code

if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    int num_parses = LV_SRE_GetNumberOfParses(hport, voice_channel);
    for (int i = 0; i < num_parses; ++i)
    {
        printf("interpretation %i:\n", i);
        H_PARSE_TREE parse = LV_SRE_CreateParseTree(hport, voice_channel, i);

        /* See "Using the Parse Tree" to see how to handle the parse tree
           by example */

        /* release the parse tree when finished with it */
        LVParseTree_Release(parse);
    }
}

C++ Code

if (code == LV_TIME_OUT)
{
    /* do some clean up and exit */
}
else
{
    int num_parses = port.GetNumberOfParses(voice_channel);
    for (int i = 0; i < num_parses; ++i)
    {
        cout << "interpretation " << i << ":" << endl;
        LVParseTree parse = port.GetParseTree(voice_channel, i);

        // See "Using the Parse Tree" to see how to handle
        // the parse tree by example
    }
}

See Also

Using Semantic Data

Using the Parse Tree

Using the Speech Parse Tree

#include <LV_SRE_ParseTree.h>

A ParseTree represents a sentence diagram of engine output, according to the SRGS grammar that was matched. Information about the tree is accessed through iterators.

Here are a few code examples to show how information can be accessed from the speech parse tree. In every example, the active grammar will be:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <XML>; //a made up tag format.

root $PhoneNumber;

$Digit = one {1} | two {2} | three {3} | four {4} | five {5} | six {6} | seven {7} | eight {8} | nine {9} | (zero | oh) {0};

$AreaCode = [area code | one] {<AREA_CODE>} $Digit<3> {</AREA_CODE>};

$PhoneNumber = [$AreaCode] {<PHONE>} $Digit<7> {</PHONE>};

And the decoded sentence will be "area code eight five eight seven oh seven oh seven oh seven". If you do not understand how to write an SRGS Grammar, read the tutorial now.
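The tags in this grammar bracket the digits with marker tags, so an application that collects the tag texts in tree order (as Example 1 below does) can split them into an area code and a phone number. The following is a minimal sketch under that assumption. PhoneParts and CollectDigits are hypothetical helper names, not part of the LumenVox API, and the input is a plain vector of strings rather than a parse tree.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct PhoneParts {
    std::string area_code;
    std::string number;
};

// Walks the tag texts in tree order and collects the digits that appear
// between the <AREA_CODE> and <PHONE> marker tags of the example grammar.
PhoneParts CollectDigits(const std::vector<std::string>& tags) {
    PhoneParts parts;
    std::string* target = 0;  // where digits currently go; null outside markers
    for (std::size_t i = 0; i < tags.size(); ++i) {
        const std::string& t = tags[i];
        if (t == "<AREA_CODE>") {
            target = &parts.area_code;
        } else if (t == "<PHONE>") {
            target = &parts.number;
        } else if (t == "</AREA_CODE>" || t == "</PHONE>") {
            target = 0;
        } else if (target) {
            *target += t;  // a digit tag such as "8"
        }
    }
    return parts;
}
```

For the decoded sentence above, the tag texts are <AREA_CODE>, 8, 5, 8, </AREA_CODE>, <PHONE>, 7, 0, 7, 0, 7, 0, 7, </PHONE>, which this sketch splits into the area code 858 and the number 7070707.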

Example 1: Print the Tags in the tree

C++ API

#include <LV_SRE_ParseTree.h>
#include <iostream>

using namespace std;

void PrintTags(LVParseTree& Tree)
{
    LVParseTree::Iterator Itr = Tree.Begin();
    LVParseTree::Iterator End = Tree.End();
    for (; Itr != End; ++Itr)
    {
        if (Itr->IsTag())
        {
            cout << Itr->Text() << "\n";
        }
    }
}

C API

#include <LV_SRE_ParseTree.h>
#include <stdio.h>

void PrintTags(H_PARSE_TREE Tree)
{
    H_PARSE_TREE_NODE N;
    H_PARSE_TREE_ITR Itr;

    Itr = LVParseTree_CreateIteratorBegin(Tree);
    for (; !LVParseTree_Iterator_IsPastEnd(Itr); LVParseTree_Iterator_Advance(Itr))
    {
        N = LVParseTree_Iterator_GetNode(Itr);
        if (LVParseTree_Node_IsTag(N))
        {
            printf("%s ", LVParseTree_Node_GetLabel(N));
        }
    }
    /* release the iterator when finished with it */
    LVParseTree_Iterator_Release(Itr);
}

Result

"<AREA_CODE> 8 5 8 </AREA_CODE> <PHONE> 7 0 7 0 7 0 7 </PHONE>"

Example 2: Print a structured tree

C++ API

#include <LV_SRE_ParseTree.h>
#include <iostream>

using namespace std;

void PrintNode(LVParseTree::Node& N)
{
    for (int i = 0; i < N.Level(); ++i)
        cout << " ";

    if (N.IsTerminal())
        cout << "\"" << N.Text() << "\"\n";

    if (N.IsTag())
        cout << "{ " << N.Text() << " }\n";

    if (N.IsRule())
    {
        cout << "$" << N.RuleName() << ":\n";
        LVParseTree::ChildrenIterator Itr = N.ChildrenBegin();
        LVParseTree::ChildrenIterator End = N.ChildrenEnd();
        for (; Itr != End; ++Itr)
            PrintNode(*Itr);
    }
}

void PrintTree(LVParseTree& Tree)
{
    PrintNode(Tree.Root());
}

C API

#include <LV_SRE_ParseTree.h>
#include <stdio.h>

void PrintNode(H_PARSE_TREE_NODE N)
{
    H_PARSE_TREE_CHILDREN_ITR I;
    int i;

    for (i = 0; i < LVParseTree_Node_GetLevel(N); ++i)
        printf(" ");

    if (LVParseTree_Node_IsTerminal(N))
        printf("\"%s\"\n", LVParseTree_Node_GetText(N));

    if (LVParseTree_Node_IsTag(N))
        printf("{ %s }\n", LVParseTree_Node_GetText(N));

    if (LVParseTree_Node_IsRule(N))
    {
        printf("$%s:\n", LVParseTree_Node_GetRuleName(N));
        I = LVParseTree_Node_CreateChildrenIterator(N);
        while (!LVParseTree_ChildrenIterator_IsPastEnd(I))
        {
            PrintNode(LVParseTree_ChildrenIterator_GetNode(I));
            LVParseTree_ChildrenIterator_Advance(I);
        }
        LVParseTree_ChildrenIterator_Release(I);
    }
}

void PrintTree(H_PARSE_TREE Tree)
{
    PrintNode(LVParseTree_GetRoot(Tree));
}

Result:

$PhoneNumber:
 $AreaCode:
  "AREA"
  "CODE"
  { <AREA_CODE> }
  $Digit:
   "EIGHT"
   { 8 }
  $Digit:
   "FIVE"
   { 5 }
  $Digit:
   "EIGHT"
   { 8 }
  { </AREA_CODE> }
 { <PHONE> }
 $Digit:
  "SEVEN"
  { 7 }
 $Digit:
  "OH"
  { 0 }
 $Digit:
  "SEVEN"
  { 7 }
 $Digit:
  "OH"
  { 0 }
 $Digit:
  "SEVEN"
  { 7 }
 $Digit:
  "OH"
  { 0 }
 $Digit:
  "SEVEN"
  { 7 }
 { </PHONE> }

See Also

LVParseTree C API

LVParseTree C++ API

Using the Interpretation Object

#include <LV_SRE_Semantic.h>

When the speech port executes your semantic interpretation tags, the output is an ECMAScript (JavaScript) object. LumenVox provides a C and C++ API for examining this object. When the speech port has finished its decode, and processed the resulting parse tree and tags, you may request an interpretation object. The interpretation object contains information about the decode -- confidence score, matching grammar, etc -- plus a single semantic data object.

C API

H_SI interpretation = LV_SRE_CreateInterpretation(hport, voicechannel, index);

/* the name of the active grammar that matched this interpretation */
const char* grammar = LVInterpretation_GetGrammarLabel(interpretation);

/* the SRE's confidence in this interpretation */
int confidence = LVInterpretation_GetScore(interpretation);

/* the sentence that the SRE decoded */
const char* sentence = LVInterpretation_GetInputSentence(interpretation);

/* the object returned by the semantic interpretation process */
H_SI_DATA result_data = LVInterpretation_GetResultData(interpretation);

C++ API

LVInterpretation interpretation = port.GetInterpretation(voicechannel, index);
const char* grammar = interpretation.GrammarLabel();
int confidence = interpretation.Score();
const char* sentence = interpretation.InputSentence();
LVSemanticData result_data = interpretation.ResultData();

Semantic Data Examples

In the following examples, the grammar will be:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <lumenvox/1.0>;
//This line tells the engine how to interpret the grammar's tags.
//Currently, only "lumenvox/1.0" or "semantics/1.0" is supported.

root $small_number_and_text;

$base = (one:"1"|two:"2"|three:"3"|four:"4"|five:"5"|six:"6"|seven:"7"|eight:"8"|nine:"9")
        { $ = parseInt($) };

$teen = (ten:"10"|eleven:"11"|twelve:"12"|thirteen:"13"|fourteen:"14"|fifteen:"15"|
         sixteen:"16"|seventeen:"17"|eighteen:"18"|nineteen:"19")
        { $ = parseInt($) };

$twenty_to_ninetynine = (twenty:"20"|thirty:"30"|forty:"40"|fifty:"50"|sixty:"60"|
                         seventy:"70"|eighty:"80"|ninety:"90")
                        { $ = parseInt($) } [$base { $ += $base }];

$tens = ($base|$teen|$twenty_to_ninetynine) { $ = $$ };

$hundred = ([a] hundred {$ = 100} | $base hundred {$ = 100 * $base});

$small_number = $hundred {$ = $$} [[and] $tens {$ += $$}] | $tens { $ = $$ };

$small_number_and_text = $small_number { $.number = $$; $.text = $$$.text };

And the input sentence will be "four hundred and six". If you do not understand how SRGS grammars are written, or how the semantic interpretation process works, please read the SRGS Grammar and/or Semantic Interpretation tutorials now.

The result of the semantic interpretation process on the input sentence is an ECMAScript object that looks like this:

small_number_and_text :  // return value of type SI_TYPE_OBJECT
{
    number: 406,                  // property of type SI_TYPE_INT
    text: "four hundred and six"  // property of type SI_TYPE_STRING
}
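The value 406 follows from the tags in the grammar: "four" matches $base and yields 4, "four hundred" matches $hundred and yields 100 * 4 = 400, "six" matches $tens and yields 6, and $small_number adds the two to give 406. The following is a minimal sketch that mirrors only this arithmetic in C++. SmallNumber is a hypothetical helper, not part of the LumenVox API; it needs a C++11 compiler, and unlike the grammar it does not check word order.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Mirrors the arithmetic of the example grammar's tags for numbers below
// 1000. Returns -1 if a word is not covered by this sketch.
int SmallNumber(const std::string& sentence) {
    static const std::map<std::string, int> kValues = {
        {"one", 1}, {"two", 2}, {"three", 3}, {"four", 4}, {"five", 5},
        {"six", 6}, {"seven", 7}, {"eight", 8}, {"nine", 9}, {"ten", 10},
        {"eleven", 11}, {"twelve", 12}, {"thirteen", 13}, {"fourteen", 14},
        {"fifteen", 15}, {"sixteen", 16}, {"seventeen", 17}, {"eighteen", 18},
        {"nineteen", 19}, {"twenty", 20}, {"thirty", 30}, {"forty", 40},
        {"fifty", 50}, {"sixty", 60}, {"seventy", 70}, {"eighty", 80},
        {"ninety", 90}};

    std::istringstream words(sentence);
    std::string word;
    int total = 0;    // value committed so far (the hundreds)
    int pending = 0;  // $base / $tens value not yet committed
    while (words >> word) {
        if (word == "a" || word == "and") continue;  // optional filler words
        if (word == "hundred") {
            // [a] hundred -> 100; $base hundred -> 100 * $base
            total += 100 * (pending == 0 ? 1 : pending);
            pending = 0;
            continue;
        }
        std::map<std::string, int>::const_iterator it = kValues.find(word);
        if (it == kValues.end()) return -1;
        pending += it->second;  // e.g. "twenty" then "six" -> 26
    }
    return total + pending;
}
```

Calling SmallNumber with "four hundred and six" follows the same steps as the trace above: 4 is held as pending, "hundred" commits 400, "and" is skipped, and the final 6 is added.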

Example 1: Access Data Directly

If we knew that our application would always be receiving an object containing an integer property named "number", and a string property named "text", we could write code to retrieve the data as follows:

C++ Code

LVSemanticObject result_obj = interpretation.ResultData().GetSemanticObject();
int number = result_obj["number"].GetInt();
const char* text = result_obj["text"].GetString();

C Code

H_SI_DATA result = LVInterpretation_GetResultData(interpretation);

H_SI_DATA number_container = LVSemanticObject_GetPropertyValue(result, "number");
int number = LVSemanticData_GetInt(number_container);

H_SI_DATA text_container = LVSemanticObject_GetPropertyValue(result, "text");
const char* text = LVSemanticData_GetString(text_container);

Example 2: Traverse a Semantic Data Structure

The following code prints a generic interpretation object as an XML fragment.

C Code

/* PrintDataXML is defined after PrintXML, so declare it first */
void PrintDataXML(H_SI_DATA hsi);

void PrintXML(H_SI hsi)
{
    const char* result_name = LVInterpretation_GetResultName(hsi);
    printf("<%s>\n", result_name);
    PrintDataXML(LVInterpretation_GetResultData(hsi));
    printf("</%s>\n", result_name);
}

void PrintDataXML(H_SI_DATA hsi)
{
    int i;
    int n;
    const char* property_name;
    H_SI_DATA data;

    switch (LVSemanticData_GetType(hsi))
    {
    case SI_TYPE_BOOL:
        LVSemanticData_GetBool(hsi) ? printf("true\n") : printf("false\n");
        break;
    case SI_TYPE_INT:
        printf("%d\n", LVSemanticData_GetInt(hsi));
        break;
    case SI_TYPE_DOUBLE:
        printf("%f\n", LVSemanticData_GetDouble(hsi));
        break;
    case SI_TYPE_STRING:
        printf("%s\n", LVSemanticData_GetString(hsi));
        break;
    case SI_TYPE_OBJECT:
        /* print each property, recursing into its value */
        n = LVSemanticObject_GetNumberOfProperties(hsi);
        for (i = 0; i < n; i++)
        {
            property_name = LVSemanticObject_GetPropertyName(hsi, i);
            data = LVSemanticObject_GetPropertyValue(hsi, property_name);
            printf("<%s>\n", property_name);
            PrintDataXML(data);
            printf("</%s>\n", property_name);
        }
        break;
    case SI_TYPE_ARRAY:
        /* print each element, recursing into its value */
        n = LVSemanticArray_GetSize(hsi);
        for (i = 0; i < n; i++)
        {
            data = LVSemanticArray_GetElement(hsi, i);
            printf("<item>\n");
            PrintDataXML(data);
            printf("</item>\n");
        }
        break;
    }
}

Result

<small_number_and_text>
<number>
406
</number>
<text>
four hundred and six
</text>
</small_number_and_text>

See Also

Semantic Interpretation C API

Semantic Interpretation C++ API

Shutting Down the Speech Port

When the speech port is no longer needed it should be closed. Closing every unnecessary speech port frees up licensed ports, and releases all of the speech port's resources.

C Code

HPORT hport;
/* open it...do some stuff...close when done */
LV_SRE_ClosePort(hport);

C++ Code

LVSpeechPort Port;
//open it...do some stuff...close when done
Port.ClosePort();

Gotchas

While closing the port may seem trivial, as soon as you start streaming audio to the port from a separate thread, the trivial can be problematic. Remember to completely disengage your stream from the port before you close it.
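One way to make this ordering hard to get wrong in C++ is to let object lifetime enforce it. The following is a minimal sketch with stand-in classes that only record when they are torn down; FakePort, FakeStreamer, Session, and RunSession are hypothetical and do not use the LumenVox API. It relies on the language rule that class members are destroyed in the reverse order of their declaration.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<std::string> EventLog;

// Hypothetical stand-in for a speech port; records when it is closed.
class FakePort {
public:
    explicit FakePort(EventLog& log) : log_(log) {}
    ~FakePort() { log_.push_back("port closed"); }
private:
    EventLog& log_;
};

// Hypothetical stand-in for an audio streamer; records when it is stopped.
class FakeStreamer {
public:
    explicit FakeStreamer(EventLog& log) : log_(log) {}
    ~FakeStreamer() { log_.push_back("stream stopped"); }
private:
    EventLog& log_;
};

// Members are destroyed in reverse order of declaration, so declaring the
// port first guarantees the streamer is stopped before the port is closed.
class Session {
public:
    explicit Session(EventLog& log) : port_(log), streamer_(log) {}
private:
    FakePort port_;
    FakeStreamer streamer_;
};

// Creates and destroys a Session, returning the recorded teardown order.
EventLog RunSession() {
    EventLog log;
    {
        Session session(log);
    }  // session is destroyed here
    return log;
}
```

This only helps if the streamer's destructor really waits for its thread to finish before returning; a destructor that merely signals the thread would leave the same race in place.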

Example Code

A Working Example

Included in this documentation is a working example that incorporates streaming audio, SRGS grammars, and Semantic Interpretation. It is written in C++, is based on examples throughout this documentation, and compiles under Visual C++ 6.0.

It consists of six files.

main.cpp -- The entry point into the application.

SimpleRecognizer.h -- Definition of a recognizer, backed by LVSpeechPort.

SimpleRecognizer.cpp -- Implementation file.

AudioStreamer.h -- Definition of an object that mimics streaming by reading an audio file.

AudioStreamer.cpp -- Implementation file.

HeaderClasses.h -- Thread code to help implement AudioStreamer.

main.cpp

#include "AudioStreamer.h"
#include "SimpleRecognizer.h"
#include <iostream>

int main()
{
    SimpleRecognizer Reco;
    Reco.LoadGrammar("yesno", "builtin:grammar/boolean");

    AudioStreamer Audio("yesplease.ulaw");
    Reco.Recognize(&Audio, "yesno");
    Reco.WaitUntilDone();

    std::cout << std::endl << Reco.GetResult() << std::endl << std::endl;
    return 0;
}

SimpleRecognizer.h

#ifndef SIMPLE_RECOGNIZER_H
#define SIMPLE_RECOGNIZER_H

#include "AudioStreamer.h"
#include <LVSpeechPort.h>
#include <string>

class SimpleRecognizer
{
public:
    SimpleRecognizer();
    ~SimpleRecognizer();

    void WaitUntilDone();
    void LoadGrammar(const std::string& grammar_name,
                     const std::string& grammar_location);
    void Recognize(AudioStreamer* Stream, const std::string& grammar_name);
    const std::string& GetResult();

private:
    static void PortCB(long NewState, unsigned long TotalBytes,
                       unsigned long RecordedBytes, void* UserData);
    static bool AudioCB(char* audio_data, int audio_data_size, void* user_data);

    bool finished_decode;
    AudioStreamer* AudioThread;
    LVSpeechPort port;
    int voiceChannel;

    void GetAnswers();
    std::string result;
};

#endif//SIMPLE_RECOGNIZER_H

SimpleRecognizer.cpp

#include "SimpleRecognizer.h"
#include <iostream>
#include <sstream>
#include <stdlib.h>

//==============================================================================
// callback for messages from the speech port
void logger(const char* msg, void* userdata)
{
    std::cout << msg << std::endl;
}

//==============================================================================
// code to plug LVSemanticData into any standard stream
std::ostream& operator << (std::ostream& os, const LVSemanticData& Data)
{
    int i;
    LVSemanticObject Obj;
    switch (Data.Type())
    {
    case SI_TYPE_BOOL:
        os << Data.GetBool() << "\n";
        break;
    case SI_TYPE_INT:
        os << Data.GetInt() << "\n";
        break;
    case SI_TYPE_DOUBLE:
        os << Data.GetDouble() << "\n";
        break;
    case SI_TYPE_STRING:
        os << Data.GetString() << "\n";
        break;
    case SI_TYPE_OBJECT:
        Obj = Data.GetSemanticObject();
        for (i = 0; i < Obj.NumberOfProperties(); ++i)
        {
            os << "<property name=" << Obj.PropertyName(i) << ">\n";
            os << Obj.PropertyValue(i);
            os << "</property>\n";
        }
        break;
    case SI_TYPE_ARRAY:
        for (i = 0; i < Data.GetSemanticArray().Size(); ++i)
        {
            os << "<element>\n";
            os << Data.GetSemanticArray().At(i);
            os << "</element>\n";
        }
        break;
    }
    return os;
}

//==============================================================================
// code to plug LVInterpretation into any standard stream
std::ostream& operator << (std::ostream& os, const LVInterpretation& Interp)
{
    os << "<interpretation grammar=\"" << Interp.GrammarLabel()
       << "\" score=\"" << Interp.Score() << "\">" << std::endl;
    os << "<result name=\"" << Interp.ResultName() << "\">" << std::endl;
    os << Interp.ResultData();
    os << "</result>" << std::endl;
    os << "<input>" << std::endl;
    os << Interp.InputSentence() << std::endl;
    os << "</input>" << std::endl;
    os << "</interpretation>";
    return os;
}

//==============================================================================
void SimpleRecognizer::WaitUntilDone()
{
    while (!finished_decode)
        Sleep(50);
}

//==============================================================================
SimpleRecognizer::SimpleRecognizer()
    : voiceChannel(1), finished_decode(true), AudioThread(NULL)
{
    LVSpeechPort::RegisterAppLogMsg(logger, NULL, 6);
    int v = port.OpenPort(logger, NULL, 6);
    if (v != LV_SUCCESS)
    {
        std::cout << LVSpeechPort::ReturnErrorString(port.GetOpenPortStatus())
                  << std::endl;
        exit(-1);
    }

    // Turn on frequency based voice activity detector
    port.StreamSetParameter(STREAM_PARM_USE_FREQ_VAD, 1);
    port.StreamSetParameter(STREAM_PARM_DETECT_BARGE_IN, 1);
    port.StreamSetParameter(STREAM_PARM_DETECT_END_OF_SPEECH, 1);
    port.StreamSetParameter(STREAM_PARM_VOICE_CHANNEL, voiceChannel);
    port.StreamSetParameter(STREAM_PARM_GRAMMAR_SET, LV_ACTIVE_GRAMMAR_SET);

    //Let the port handle the decode process
    port.StreamSetParameter(STREAM_PARM_AUTO_DECODE, 1);

    //and use semantic interpretation processor
    port.StreamSetParameter(STREAM_PARM_DECODE_FLAGS,
                            LV_DECODE_SEMANTIC_INTERPRETATION);
    port.StreamSetStateChangeCallBack(PortCB, this);
}

//==============================================================================
SimpleRecognizer::~SimpleRecognizer()
{
    port.ClosePort();
}

//==============================================================================
void SimpleRecognizer::PortCB(long NewState, unsigned long TotalBytes,
                              unsigned long RecordedBytes, void* UserData)
{
    SimpleRecognizer* self = (SimpleRecognizer*)UserData;
    switch (NewState)
    {
    case STREAM_STATUS_END_SPEECH:
        if (!self->finished_decode)
        {
            self->AudioThread->StopStream();
            self->GetAnswers();
            self->finished_decode = true;
        }
        break;
    case STREAM_STATUS_STOPPED:
        if (!self->finished_decode)
        {
            self->AudioThread->StopStream();
            self->GetAnswers();
            self->finished_decode = true;
        }
        break;
    case STREAM_STATUS_NOT_READY:
        break;
    case STREAM_STATUS_READY:
        self->finished_decode = false;
        self->AudioThread->StartStream(AudioCB, self);
        break;
    }
}

//==============================================================================
void SimpleRecognizer::LoadGrammar(const std::string& grammar_name,
                                   const std::string& grammar_location)
{
    port.LoadGrammar(grammar_name.c_str(), grammar_location.c_str());
}

//==============================================================================
bool SimpleRecognizer::AudioCB(char* audio_data, int audio_data_size,
                               void* user_data)
{
    SimpleRecognizer* self = (SimpleRecognizer*)user_data;
    self->port.StreamSendData(audio_data, audio_data_size);
    return true;
}

//==============================================================================
void SimpleRecognizer::Recognize(AudioStreamer* Audio,
                                 const std::string& grammar_name)
{
    finished_decode = false;
    AudioThread = Audio;
    port.DeactivateGrammars(); //clear out old grammars.
    port.ActivateGrammar(grammar_name.c_str());
    port.AddEvent(EVENT_START_DECODE_SEQ);
    port.StreamSetParameter(STREAM_PARM_SOUND_FORMAT, ULAW_8KHZ);
    port.StreamStart();
}

//==============================================================================
void SimpleRecognizer::GetAnswers()
{
    int val;
    val = port.WaitForEngineToIdle(3000, voiceChannel);
    if (val < 0)
    {
        result = "<noanswer/>";
        return;
    }

    //view the results of the decode:
    std::stringstream ss;
    int numInterp = port.GetNumberOfInterpretations(voiceChannel);
    for (int t = 0; t < numInterp; ++t)
    {
        ss << port.GetInterpretation(voiceChannel, t);
    }
    result = ss.str();
}

//==============================================================================
const std::string& SimpleRecognizer::GetResult() { return result; }
//==============================================================================

AudioStreamer.h

#include "HeaderClasses.h"

#ifndef AUDIO_STREAMER_H
#define AUDIO_STREAMER_H

typedef bool (*AudioStreamCB)(char* audio_chunk, int chunk_size, void* user_data);

/**
    class AudioStreamer

    Mimics live audio being streamed. It reads audio a bit at a time
    from a file, periodically calling a user provided callback function
    to transmit the audio. It stops transmitting audio when the user
    callback function returns false. If it reaches the end of file before
    the callback tells it to stop, then it just sends silence.

    The audio is assumed to be a headerless u-Law audio file at 8 kHz.
**/
class AudioStreamer : Demo::Thread
{
public:
    AudioStreamer(const char* filename);
    void StartStream(AudioStreamCB _cb, void* _user_data);
    void StopStream();
    ~AudioStreamer();

private:
    char* audio_buffer;
    char* end_buffer;
    int audio_buffer_size;
    int increment_ms;
    AudioStreamCB cb;
    void* user_data;
    virtual void ThreadAction();
};

#endif//AUDIO_STREAMER_H

AudioStreamer.cpp

#include "AudioStreamer.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <io.h>

//==================================================================================
AudioStreamer::AudioStreamer(const char* filename)
    : increment_ms(300), audio_buffer_size(0), audio_buffer(NULL)
{
    int audio_handle = _open(filename, _O_BINARY | _O_RDONLY);
    if (audio_handle < 0)
    {
        printf("could not open audio file %s\n", filename);
        exit(-1);
    }

    // seek to the end of the file to find its size
    audio_buffer_size = _lseek(audio_handle, 0L, SEEK_END);
    _close(audio_handle);

    // reopen and read the whole file into memory
    audio_handle = _open(filename, _O_BINARY | _O_RDONLY);
    audio_buffer = new char[audio_buffer_size];
    _read(audio_handle, audio_buffer, audio_buffer_size);
    _close(audio_handle);
}

//==================================================================================
AudioStreamer::~AudioStreamer()
{
    ThreadStop();
    delete[] audio_buffer;
}

//==================================================================================
void AudioStreamer::StartStream(AudioStreamCB CB, void* UserData)
{
    cb = CB;
    user_data = UserData;
    ThreadActivate();
    ThreadStart();
    printf("audio stream started\n");
}

//==================================================================================
void AudioStreamer::StopStream()
{
    ThreadStop();
    printf("audio stream stopped\n");
}

//==================================================================================
void AudioStreamer::ThreadAction()
{
    printf("audio thread working\n");
    int chunk_size;
    int end_chunk_size;
    char* current_pos = audio_buffer;
    bool feed_more = true;

    // bytes per chunk: 8000 samples/s * 1 byte/sample * increment_ms / 1000
    chunk_size = 8000*1*increment_ms/1000;
    end_chunk_size = chunk_size;
    end_buffer = new char[end_chunk_size];
    memset(end_buffer, 0, end_chunk_size);

    // send the file contents one chunk at a time
    while (current_pos != audio_buffer + audio_buffer_size
           && feed_more && !IsThreadShuttingDown())
    {
        if (current_pos + chunk_size > audio_buffer + audio_buffer_size)
        {
            chunk_size = (audio_buffer + audio_buffer_size) - current_pos;
        }
        feed_more = cb(current_pos, chunk_size, user_data);
        current_pos += chunk_size;
        printf("sending audio\n");
        Sleep(increment_ms);
    }

    // past the end of the file: keep sending the zero-filled buffer
    while (feed_more && !IsThreadShuttingDown())
    {
        feed_more = cb(end_buffer, end_chunk_size, user_data);
        Sleep(increment_ms);
        printf("sending dead air\n");
    }

    printf("audio thread told to shut down\n");
    delete[] end_buffer;
}
//==================================================================================
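The chunk size computed in ThreadAction follows from the audio format: 8 kHz u-law carries 8000 samples per second at one byte per sample, so a 300 ms increment is 8000 * 1 * 300 / 1000 = 2400 bytes. The following is a minimal sketch of that arithmetic as two small functions; ChunkBytes and ChunkCount are hypothetical helper names that do not appear in the example files.

```cpp
#include <cassert>

// Bytes needed to hold duration_ms of audio.
// For 8 kHz u-law: sample_rate_hz = 8000, bytes_per_sample = 1.
int ChunkBytes(int sample_rate_hz, int bytes_per_sample, int duration_ms) {
    assert(sample_rate_hz > 0 && bytes_per_sample > 0 && duration_ms >= 0);
    return sample_rate_hz * bytes_per_sample * duration_ms / 1000;
}

// Number of callbacks needed to send total_bytes when each callback carries
// at most chunk_bytes; the last chunk may be shorter than the others.
int ChunkCount(int total_bytes, int chunk_bytes) {
    assert(total_bytes >= 0 && chunk_bytes > 0);
    return (total_bytes + chunk_bytes - 1) / chunk_bytes;
}
```

With these definitions, a one-second file of 8000 bytes is sent as three full 2400-byte chunks followed by one 800-byte remainder, which is the case the bounds check inside the first loop of ThreadAction handles.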

HeaderClasses.h

#ifndef HEADER_ONLY_HELPER_CLASSES_DEFINED #define HEADER_ONLY_HELPER_CLASSES_DEFINED #include <string> #include <process.h> #include <time.h> #include <sys/types.h> #include <sys/timeb.h> #include <Windows.h> #undef GetObject namespace Demo { //critical section wrapper class CS { public: CS(): m_busy(false) { InitializeCriticalSection( &m_cs ); } virtual ~CS() { DeleteCriticalSection( &m_cs ); } bool IsBusy() const { return m_busy; } //only valid at time of call void Enter() { EnterCriticalSection( &m_cs ); m_busy = true; } void Leave() { // Be careful, linux allows other non-owner of cs to unlock m_busy = false; LeaveCriticalSection( &m_cs ); } bool Try() { if (m_busy) return false; Enter(); return true; }

Page 50: Printed Documentation - LumenVox · 2 Release Notes Version 6.0: Supports n-best. Reduced server memory footprint. Speed up on recognition algorithm. Reduced server new thread start

Printed Documentation

40

private: volatile bool m_busy; CRITICAL_SECTION m_cs; }; //simple way to lock critical section (releases in destructor) class CSLock { public: CSLock(CS& cs) { m_localCs = &cs; m_localCs->Enter(); } virtual ~CSLock() { m_localCs->Leave(); } private: CS* m_localCs; }; //simple windows event wrapper class Event { public: Event() { m_event = CreateEvent(NULL, false, false, NULL); } virtual ~Event() { CloseHandle( m_event ); } bool Wait(unsigned int timeout = INFINITE) { return WaitForSingleObject( m_event, timeout ) != WAIT_TIMEOUT; } bool Reset() { return ResetEvent( m_event ) != 0; } bool Signal() { return SetEvent( m_event ) != 0; } bool Try() { return Wait(0);

    }
private:
    HANDLE m_event;
};

//a thread class. Have your class derive from this one and override the ThreadAction() function.
class Thread
{
    bool Running;
    bool ShuttingDown;
    bool InUserThread;
    HANDLE hThread;
    unsigned int thrdaddr;
    CS CS;
    Event Event;
public:
    Thread()
    {
        Running = false;
        ShuttingDown = true;
        InUserThread = false;
    }
    virtual ~Thread(){ ThreadStop(); }
    virtual void ThreadAction() = 0; //derive and override the ThreadAction function
    bool ThreadActivate()
    {
        CSLock L(CS);
        if (Running)
            return false;
        ShuttingDown = false;
        Running = true;
        InUserThread = false;
        hThread = (HANDLE) _beginthreadex(NULL, 0, CallBackThread ,(LPVOID) this, 0, &thrdaddr);
        return true;
    }
    bool ThreadStart()
    {
        CSLock L(CS);
        if (!Running || ShuttingDown || InUserThread)
            return false;
        Event.Signal();
        return true;
    }
    bool ThreadStop(unsigned long WaitTime = 1000)
    {
        {
            CSLock L(CS);
            if (!Running)
                return false;

            ShuttingDown = true;
            Event.Signal();
        }
        if (WaitForSingleObject(hThread, WaitTime) == WAIT_TIMEOUT)
            TerminateThread(hThread,0);
        Sleep(50);
        thrdaddr = 0;
        Running = false;
        return true;
    }
    bool IsThreadRunning(){CSLock L(CS);return Running;};
    bool IsThreadShuttingDown(){CSLock L(CS);return ShuttingDown;};
private:
    static unsigned int __stdcall CallBackThread(void* p)
    {
        ((Thread*)p)->InternalThread();
        return 0;
    }
    void InternalThread()
    {
        while (!ShuttingDown)
        {
            if (Event.Wait(2000))
            {
                {
                    CSLock L(CS);
                    InUserThread = true;
                }
                ThreadAction();
                {
                    CSLock L(CS);
                    InUserThread = false;
                }
            }
        }
        {
            CSLock L(CS);
            Running = false;
        }
    }
};

}//namespace Demo

#endif

SRGS Grammars

A Simple Grammar

We will begin our look at writing SRGS grammars with a simple grammar that lets the engine recognize the words "yes" or "no". Yes or no grammars are the "hello world" of grammar writing.

Example

#ABNF 1.0;
language en-US; //use the American English pronunciation dictionary.
mode voice; //the input for this grammar will be spoken words (as opposed to DTMF)

root $yesorno;

$yes = yes;
$no = no;
$yesorno = $yes | $no;

This grammar contains most of the elements of any grammar you will write. Let's take it apart.

The Grammar Identifier

Any SRGS grammar written in ABNF notation must begin with the line

#ABNF 1.0;

with no additional characters. This line tells the LumenVox grammar compiler that the file being read is an ABNF grammar, as opposed to an SRGS XML grammar or another grammar format supported in the future.

The Grammar Header

Following the identifier, a well formed grammar will contain information about the language the grammar is written in, the expected interaction mode, and the name of a rule where the engine will begin its search (the root rule). In addition, the header may contain one or more tags, and an identifier describing the tag format for this grammar. Tags will be discussed later in this tutorial.

The contents of the grammar header may be in any order, but no header data may occur in the file after the first rule is written.

Comments

ABNF grammars may contain comments anywhere in their body (with the exception of the first line, containing the grammar identifier). The comment format is the same one used by the C, C++, and Java programming languages.

Rules

The rules of a grammar specify what word combinations the engine may recognize. They are the heart of the grammar. Each rule has a name, appearing on the left hand side of an "=" sign, and a rule expansion, appearing on the right hand side.

The rule name starts with a "$", then a letter followed by additional letters, numbers, or underscore characters.

The rule expansion describes to the engine what sequences of words will allow a rule to be matched. An entire grammar is matched if its root rule is matched.

The first rule in the above grammar is matched if the engine detects the word "yes" being spoken. The second rule is matched if the word "no" is detected. The third rule contains a "|" symbol, which is a logical "or" operator. So the third rule is matched if the $yes or $no rules are matched.

Most of the rest of this tutorial will be concerned with writing more and more expressive rule expansions.

How the Speech Engine Uses a Grammar

When the engine begins decoding your audio, it starts at the root rule of the provided grammar, in this case the rule $yesorno. It then steps through all legal expansions, looking for the first words it's allowed to listen for. It moves into the rules $yes and $no, since it's allowed to match against either rule. Since the first words in the rules $yes and $no are "yes" and "no", the engine knows that it is allowed to recognize either word.

If the engine detects "yes" as a possibility, it then looks for the next word it can recognize in the $yes rule. Since there are no more words in the $yes rule, the rule is matched. And since the $yes rule is matched, the $yesorno root rule is matched, so the entire grammar is matched.
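The walk described above can be sketched in a few lines of C++. This is only an illustration of the idea, not how the LumenVox engine is implemented: the `Grammar` type, `matchRule`, `matchSequence`, `matchesGrammar`, and `yesNoGrammar` are hypothetical names, and the sketch works on already-recognized words rather than audio.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory grammar: each rule name maps to a list of
// alternatives, and each alternative is a sequence of tokens. A token that
// starts with '$' is a rule reference; any other token is a literal word.
using Words    = std::vector<std::string>;
using Sequence = std::vector<std::string>;
using Grammar  = std::map<std::string, std::vector<Sequence>>;

std::set<std::size_t> matchRule(const Grammar& g, const std::string& rule,
                                const Words& words, std::size_t pos);

// Match seq[idx..] against words[pos..]; return every position where the
// sequence can end. (Left-recursive rules would loop forever in this sketch.)
std::set<std::size_t> matchSequence(const Grammar& g, const Sequence& seq, std::size_t idx,
                                    const Words& words, std::size_t pos) {
    if (idx == seq.size()) return {pos};
    std::set<std::size_t> ends;
    const std::string& token = seq[idx];
    if (!token.empty() && token[0] == '$') {
        // Step into the referenced rule, then continue with the rest.
        for (std::size_t mid : matchRule(g, token, words, pos)) {
            std::set<std::size_t> rest = matchSequence(g, seq, idx + 1, words, mid);
            ends.insert(rest.begin(), rest.end());
        }
    } else if (pos < words.size() && words[pos] == token) {
        ends = matchSequence(g, seq, idx + 1, words, pos + 1);
    }
    return ends;
}

// A rule matches if any one of its alternatives matches.
std::set<std::size_t> matchRule(const Grammar& g, const std::string& rule,
                                const Words& words, std::size_t pos) {
    std::set<std::size_t> ends;
    Grammar::const_iterator it = g.find(rule);
    if (it == g.end()) return ends;
    for (const Sequence& alternative : it->second) {
        std::set<std::size_t> r = matchSequence(g, alternative, 0, words, pos);
        ends.insert(r.begin(), r.end());
    }
    return ends;
}

// The grammar is matched if the root rule consumes all of the words.
bool matchesGrammar(const Grammar& g, const std::string& root, const Words& words) {
    return matchRule(g, root, words, 0).count(words.size()) > 0;
}

// The yes/no grammar from this section.
Grammar yesNoGrammar() {
    Grammar g;
    g["$yes"] = {Sequence{"yes"}};
    g["$no"] = {Sequence{"no"}};
    g["$yesorno"] = {Sequence{"$yes"}, Sequence{"$no"}};
    return g;
}
```

With this sketch, `matchesGrammar(yesNoGrammar(), "$yesorno", {"yes"})` succeeds because `$yesorno` steps into `$yes`, which consumes the only word.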

Rule Expansions by Example

Rule expansions are built by combining together small phrases with a number of grammar operations. The operations are

Operation            Example              Description
Alternatives         $rule = $A | $B;     match A or B
Optional Expansion   $rule = $A [$B];     match A possibly followed by B
Repetition           $rule = $A <7>;      match A 7 times

Rule Alternatives

As we saw in the previous "yes no" grammar, the engine can be told to accept any one of several possibilities by using the rule alternative operator "|".

Example

$toppings = pepperoni | sausage | green peppers;

The above rule is matched by the phrases "pepperoni", "sausage", or "green peppers".

Note that the rule alternative operator is greedy: everything between two "|" symbols forms a single alternative. That is why "peppers" is collected with "green" to form the alternative "green peppers". If you wish to limit the scope of the rule alternative operator, you can use parentheses.

Example

$pizza = (pepperoni | sausage) pizza;

This rule matches "pepperoni pizza" or "sausage pizza". Without the parentheses, it would match "pepperoni" or "sausage pizza".

Optional Expansion

If you wish to make a portion of a rule expansion optional, you can wrap that portion of the expansion in the optional operator "[ ]".

Example

$yes = yes [please];

This rule matches "yes" or "yes please".

Any of the SRGS operators may be wrapped inside each other, or used in sequence, to create more and more expressive sentences.

Example

$yes = yes [please | thank you];

This rule matches "yes", "yes please", or "yes thank you".

Repetition

If you wish to allow a portion of a rule expansion to be repeated a number of times, you can use the repeat operator "< >". The repeat operator can be used to specify a fixed number of repetitions, or a range of repetitions.

Example

$digit = one | two | three | four | five | six | seven | eight | nine | zero;
$seven_digits = $digit <7>;
$seven_to_ten_digits = $digit <7-10>;
$one_or_more_digit = $digit <1->;

The $seven_digits rule allows any seven digit combination to be recognized. The $seven_to_ten_digits rule allows any seven to ten digit combination to be recognized. The $one_or_more_digit rule allows one or more digits to be recognized.

The repeat operator is tightly binding; it only applies to whatever immediately precedes it. Use parentheses to control how much of a rule expansion it applies to.

Example

$oh_boy1 = oh boy <3>;
$oh_boy2 = (oh boy)<3>;

The rule $oh_boy1 matches "oh boy boy boy". The rule $oh_boy2 matches "oh boy oh boy oh boy".
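To make the binding rule concrete, here is a small C++ sketch that expands the two rules by hand. The helper names (`repeatPhrase`, `expandOhBoy1`, `expandOhBoy2`) are made up for this illustration and are not part of any LumenVox API.

```cpp
#include <string>

// Repeat a phrase a fixed number of times, separated by single spaces.
std::string repeatPhrase(const std::string& phrase, int count) {
    std::string out;
    for (int i = 0; i < count; ++i) {
        if (!out.empty()) out += ' ';
        out += phrase;
    }
    return out;
}

// $oh_boy1 = oh boy <3>;   the repeat applies only to "boy"
std::string expandOhBoy1() {
    return "oh " + repeatPhrase("boy", 3);
}

// $oh_boy2 = (oh boy)<3>;  the repeat applies to the whole group
std::string expandOhBoy2() {
    return repeatPhrase("oh boy", 3);
}
```

The only difference between the two functions is what gets passed to `repeatPhrase`, which mirrors what the parentheses change in the grammar.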

Rule References

You can reference grammar rules inside rule expansions, as we have already seen. You can also reference external grammar files, or rules within external files, to create more complex grammars and to reuse existing grammar solutions. As an example, suppose you had a simple phone number grammar in a remote location that looked like this:

http://www.mycompany.com/phone_number.gram

#ABNF 1.0;
language en-US;
mode voice;
root $phone_number;

$phone_number = [$area_code] $number;

$digit = one | two | three | four | five | six | seven | eight | nine | zero;
$area_code = [one | area code] $digit<3>;
$number = $digit<7>;

You can use this grammar in another grammar by using its location as a rulename.

#ABNF 1.0;
language en-US;
mode voice;
root $main;

$main = (my | the) [phone] number is $<http://www.mycompany.com/phone_number.gram>;

The above grammar uses the root rule of the phone_number grammar in its $main rule. You can reference grammar files using http, ftp, or your operating system's local or network file paths. When writing grammars that use external grammar files, it's usually a good idea to specify a base URI in your grammar header.

To use a single rule in an external grammar, append the "#" symbol and the rule name to the grammar's location.

Example

#ABNF 1.0;
language en-US;
mode voice;
root $main;

$main = (my | the) area code is $<http://www.mycompany.com/phone_number.gram#area_code>;

In addition to referencing external grammar files, you can also reference any of the LumenVox built-in grammars.

Example

#ABNF 1.0;
language en-US;
mode voice;
root $main;

$main = (my | the) [phone] number is $<builtin:grammar/phone>;

Special Rules

In addition to the rules you create, there are several reserved rules that dictate special behaviour for the Speech Engine. These rules are

$NULL

$VOID

$GARBAGE

NULL

The $NULL rule is automatically matched as soon as it is seen. Users rarely need to use the $NULL rule, but it can be useful when creating grammars programmatically. The $NULL rule is illustrated below with standard grammar operations rewritten to use the $NULL rule.

Example 1

$yes = yes [please];

/* Identical rule expansion using the $NULL rule */
$yes = yes (please | $NULL);

Example 2

$oh_boy = (oh boy)<0->;

/* Identical rule expansion using the $NULL rule */
$oh_boy = oh boy $oh_boy | $NULL;

VOID

The $VOID rule invalidates any rule that contains it, and hence any answer that contains it.

Example

#ABNF 1.0;
language en-US;
mode voice;

root $yesorno;

$yes = yes [please];
$no = no $VOID;
$yesorno = $yes | $no;

If the engine recognizes the word "no" being spoken with the above grammar, it will invalidate the answer and return with no answer.

GARBAGE

The $GARBAGE rule engages the out-of-vocabulary filter of the engine, allowing it to listen for arbitrary phonetic sequences until it hears the next matching word in the grammar. The garbage that was matched will not be returned by the engine.

Example

#ABNF 1.0;
language en-US;
mode voice;

root $yesorno;

$yes = yes [please];
$no = no $GARBAGE;
$yesorno = $yes | $no;

The above grammar could allow the user to say "no", "no thank you", or "no you stupid machine" (Though we've never heard anyone say that last one).

When using the $GARBAGE rule, keep in mind that engaging the out-of-vocabulary filter can slow down recognition times, and can even cause additional misrecognitions if used too aggressively. If possible, we recommend creating specific "filler" models using grammar rules that match frequently occurring out-of-vocabulary words instead of using the $GARBAGE rule.

Tags

Tags are special grammar tokens that can contain any information you wish to put in them. Tags are completely ignored when the engine uses your grammar. Any time the engine sees a tag in a rule, it skips right over it. What makes tags useful is that when the engine returns the results of a decode, it returns the tags it saw, in the order it saw them, along with the words and rules it recognized. This makes tags a good way to store post-processing information.

Example

#ABNF 1.0;
language en-US;
mode voice;
root $yesorno;

$yes = yes [please] {!{ returnvalue: true }!}; // This is a tag
$no = no [way | thank you] { returnvalue: false }; // Another tag
$yesorno = $yes | $no;

To understand how you might use tags, we need to examine the form of an engine decode response.

Example

#ABNF 1.0;
language en-US;
mode voice;
root $navigate;

$direction = forward | back | backward | left | right;
$number = one | two | three | four | five;

$navigate = ( go | move | walk | step) $direction $number (steps | paces | units);

With the above grammar, if the engine recognizes "walk forward three paces", it will return a parse tree, or sentence diagram, that looks like this:

$navigate:
    "walk"
    $direction:
        "forward"
    $number:
        "three"
    "paces"

You can read more about the parse tree return type here.

In order to convert the parse tree return type into data useful to your application, you need to walk the tree and convert it into a result your application expects. For instance, your application might expect a result that looks like this:

instruction:[ direction: 1, units: 3, ]

While it is certainly possible to make the conversion, there are disadvantages to interpreting the parse tree directly to do so. One disadvantage is that your application becomes directly dependent on knowing the structure of your grammar. If the form of your grammar changes, your application code will have to change as well. Another disadvantage is that if your application uses multiple grammars (as most do), then you will most likely need a different set of parse tree processing code for each of your grammars.

Instead of manipulating the parse tree directly, you can put the conversion process in your grammar using tags. To do so, you adopt a consistent format for your tags, and a uniform way of processing your tags + parse tree. Then the shape of your grammar does not matter, as long as you process your tags and parse tree in the same way each time.

For this example we will adopt a very simple method for post-processing: we will walk the tree, ignoring anything that is not a tag. We will treat the tags as string data, and concatenate the strings as we see them in the parse tree.

Example

#ABNF 1.0;
language en-US;
mode voice;
root $navigate;
tag-format <my_simple_tag_format>;

$direction = { direction: }( forward { 1, } |
                             back { 2, } |
                             backward { 2, } |
                             left { 3, } |
                             right { 4, } );

$number = { units: } ( one { 1, } |
                       two { 2, } |
                       three { 3, } |
                       four { 4, } |
                       five { 5, } );

$navigate = { instruction:[ } ( go | move | walk | step) $direction $number (steps | paces | units) { ] };

Now, with the above grammar, when the engine recognizes "walk forward three paces", the parse tree returned will look like:

$navigate:
    {!{ instruction:[ }!}
    "walk"
    $direction:
        {!{ direction: }!}
        "forward"
        {!{ 1, }!}
    $number:
        {!{ units: }!}
        "three"
        {!{ 3, }!}
    "paces"
    {!{ ] }!}

And when we concatenate the tags we get the result type our application expects.
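The tree walk just described might look like the following C++ sketch. The `Node` structure and the helper functions are hypothetical stand-ins written for this example; they are not the parse tree types exposed by the LumenVox API. Tag contents are stored without surrounding whitespace and joined with single spaces, which is an assumption of this sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical parse tree node, for illustration only.
struct Node {
    enum Kind { Rule, Word, Tag } kind;
    std::string text;            // rule name, spoken word, or tag contents
    std::vector<Node> children;  // only rule nodes have children
};

Node makeTag(const std::string& t)  { return Node{Node::Tag, t, {}}; }
Node makeWord(const std::string& w) { return Node{Node::Word, w, {}}; }
Node makeRule(const std::string& name, std::vector<Node> kids) {
    return Node{Node::Rule, name, std::move(kids)};
}

// Depth-first walk: ignore everything that is not a tag, and append each
// tag's contents in the order the tags appear in the tree.
void appendTags(const Node& node, std::string& out) {
    if (node.kind == Node::Tag) {
        if (!out.empty()) out += ' ';
        out += node.text;
    }
    for (const Node& child : node.children) appendTags(child, out);
}

std::string concatenateTags(const Node& root) {
    std::string out;
    appendTags(root, out);
    return out;
}

// The parse tree for "walk forward three paces" shown above.
Node walkForwardThreePaces() {
    return makeRule("$navigate", {
        makeTag("instruction:["),
        makeWord("walk"),
        makeRule("$direction", {makeTag("direction:"), makeWord("forward"), makeTag("1,")}),
        makeRule("$number", {makeTag("units:"), makeWord("three"), makeTag("3,")}),
        makeWord("paces"),
        makeTag("]"),
    });
}
```

Note that `concatenateTags` never looks at rule names or words, so it does not need to change when the grammar is restructured.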

Admittedly, this is a very naive tag processing scheme, and as a result it requires a hefty number of tags to accomplish a simple task, but it does achieve the goal we want of processing our tree in a way that is independent of the form of the grammar. As a result, if ever the form of the grammar needs to change, the tags in the grammar can change, too, and the application code can stay the same.

The LumenVox API provides a much more powerful post-processing scheme based on the Semantic Interpretation for Speech Recognition working draft. It is described in detail in the Semantic Interpretation section.

Applying Grammar Weights

Ultimately, the engine is just a large probability machine. Inside the engine there are huge tables that store probability scores for phonemes and the sounds those phonemes are likely to generate when a person speaks. When the engine decodes audio input, it searches through these tables to find the most likely path through a sequence of phonemes given the audio input. Your SRGS grammars can modify these scores by providing grammar weights.

As an example, suppose we have a grammar that recognizes a person speaking a number that is four digits long.

#ABNF 1.0;
language en-US;
mode voice;
root $number;

$one_digit = zero | one | two | three | four | five | six | seven | eight | nine;
$teens = ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen;
$above_twenty = (twenty | thirty | forty | fifty | sixty | seventy | eighty | ninety)[$one_digit];
$double_digit = $teens | $above_twenty;

$single_digits = $one_digit<4>; //one two three four
$double_digits = $double_digit<2>; //twelve thirty four
$single_double = $one_digit<2> $double_digit; //one two thirty four
$double_single = $double_digit $one_digit<2>; //twelve three four

$number = $single_digits | $double_digits | $single_double | $double_single;

This is a flexible grammar, but if you used it in practice you might be disappointed. You might notice that too often words like "four three" are being misrecognized as "forty". In general, your callers may be speaking a sentence that matches $single_digits 95% of the time, but the engine too frequently returns a result that matches one of the other three rules.

You can help the engine get the right answer more frequently by predisposing it to choose the $single_digits rule. Here is the same grammar with grammar weights applied.

#ABNF 1.0;
language en-US;
mode voice;
root $number;

$one_digit = zero | one | two | three | four | five | six | seven | eight | nine;
$teens = ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen;
$above_twenty = (twenty | thirty | forty | fifty | sixty | seventy | eighty | ninety)[$one_digit];
$double_digit = $teens | $above_twenty;

$single_digits = $one_digit<4>; //one two three four
$double_digits = $double_digit<2>; //twelve thirty four
$single_double = $one_digit<2> $double_digit; //one two thirty four
$double_single = $double_digit $one_digit<2>; //twelve three four

// $single_digits has a 95% chance of being the right rule to match.
// The other rules combine to take up the remaining 5%.
$number = /0.95/ $single_digits | /0.05/ ($double_digits | $single_double | $double_single);

/**********************************************************
 * you could also write the weights as
 * /95/ $single_digits | /5/ ($double_digits | $single_double | $double_single);
 * or
 * /19/ $single_digits | /1/ ($double_digits | $single_double | $double_single);
 **********************************************************/

Now, in cases where the engine has a borderline decision to make between matching $single_digits or one of the others, it will more frequently choose $single_digits. We weighted the rules 95% to 5% only because we had records of our callers to back up the decision.

Do Not Apply Weights Without Data

Applying grammar weights should never be the first thing you do to your grammar. Initially, you don't really know how often each rule will be matched, so you are better off letting all rules be treated equally. Only after you have a compelling amount of data to suggest that applying grammar weights will help, as we did above, should you apply them. And after you do apply them, you must test their effects on real call data. Badly applied weights are worse than no weights at all.

SRGS Definitions

Interaction Mode

An interaction mode specifies the type of interaction the speech port is to have with a user. An interaction mode can be voice or DTMF.

In a grammar, you specify whether the grammar will be used in a DTMF interaction, or a voice interaction. When grammars are activated in a speech port, only the voice grammars get used to decode speech, and only the DTMF grammars get used to process a DTMF string.
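As a rough illustration of that selection step, the C++ sketch below filters a list of active grammars by the kind of input being processed. The `Mode` and `ActiveGrammar` types and the `grammarsFor` function are hypothetical and exist only for this example; the speech port does this selection internally.

```cpp
#include <string>
#include <vector>

// Hypothetical types for illustration; not part of the LumenVox API.
enum class Mode { Voice, Dtmf };

struct ActiveGrammar {
    std::string name;
    Mode mode;  // taken from the grammar's "mode" declaration
};

// Return the names of the active grammars that apply to the given input kind.
std::vector<std::string> grammarsFor(Mode input, const std::vector<ActiveGrammar>& active) {
    std::vector<std::string> names;
    for (const ActiveGrammar& g : active) {
        if (g.mode == input) names.push_back(g.name);
    }
    return names;
}
```

A grammar declared with one mode is simply never consulted for input of the other kind.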

To specify the interaction mode in a grammar, use the following syntax:

ABNF

mode voice;

or

mode dtmf;

XML

<grammar mode="voice" ...>

or

<grammar mode="dtmf" ...>

Tag Format

In an SRGS grammar, you may place pieces of data called tags anywhere in a grammar rule. When a rule is matched, the tag is returned to the user in a parse tree, along with the words spoken that caused the rule to match.

A common use for tags is to transform a speaker's sentence into data that your application can understand. The LumenVox speech port is capable of manipulating the tags in your parse tree if they are in a form known as the Semantic Interpretation for Speech Recognition (SISR) tag format. Examples of this tag format can be found in the Semantic Interpretation section of this help file.

To do any kind of interpretation, you must specify the format your tags are in.

Within the speech port, the following tag format specifiers are acceptable. Currently, semantics/1.0 and lumenvox/1.0 tell the engine to perform the same interpretation process, but as other interpretation schemes are adopted, or interpretation schemes are modified, the tag format specifier you decide on will become more important.

semantics/1.0    Use the latest working draft of the SISR, as of this help file's publication.
lumenvox/1.0     Use the working draft of the SISR published on April 1 2003.
lumenvox/1.1     Use the next working draft of the SISR (since this next draft does not exist, this tag format does nothing; it is listed as an example only).

If the tag format of your grammar does not match one of these specifiers, the speech port will not attempt to interpret your tags. You can still use the tag data in the Parse Tree to perform your own interpretation.

To specify the format of the tags in a grammar, use the following syntax:

ABNF

tag-format <lumenvox/1.0>;

XML

<grammar tag-format="semantics/1.0" ...>

Language Identifier

A language identifier specifies the language being spoken to the speech port.

The format of the language identifier follows the convention set out by RFC 3066. In a nutshell, the identifier is either a language and country pair, like "en-US" for United States English, or just a language descriptor, like "fr" for generic French.

Within the speech port, the following language identifiers are acceptable:

"en-US" or "en"    Use the LumenVox American English acoustic models and dictionary
"fr-CA" or "fr"    Use the LumenVox French acoustic models and dictionary
"es-MX" or "es"    Use the LumenVox Spanish acoustic models and dictionary

To specify the language in a grammar, use the following syntax in your grammar:

ABNF

language en-US;

XML

<grammar language="en-US" ... >

Tags

Tags are special tokens in a grammar that are automatically recognized whenever they are seen by the Speech Engine. They are usually filled with information useful to the author of the grammar, or to an application using a grammar. Tags may appear in the header or the body of a grammar. When the engine recognizes a rule containing a tag, it returns the tag information along with the rule.

Filling tags with snippets of JavaScript is the basis of the semantic interpretation process.

ABNF

{!{ tag information }!}; //this is a header tag.
                         //Its contents will be returned if the grammar is matched.
$rule = some text {!{ tag information }!} more text; //this is a tag declared in a rule.

XML

<!-- header tag. Its contents will be returned if the grammar is matched. -->
<tag> tag information </tag>
<rule id="rule">
    some text
    <!-- a tag declared in a rule -->
    <tag> tag information </tag>
    more text
</rule>

Base URI

Declaring a base URI in a grammar tells the grammar how to resolve relative path names in the grammar. If no base URI is present, they will be resolved from the location of the grammar file. Grammars loaded by buffer should have a base URI if they contain relative path names. Grammars may have multiple base paths, and they are searched in the order provided.

ABNF

base <http://www.mycompany.com/grammars>;
base <http://www.mycompany.com/more_grammars>;

XML

<grammar xml:base="http://www.mycompany.com/grammars" xml:base="http://www.mycompany.com/more_grammars" ... >

Built-in Grammars

LumenVox provides the built-in grammars expected by VoiceXML users. All of them provide the required output format.

URI                        Sample Input                                                     Output
builtin:grammar/boolean    "yes", "no thank you", etc.                                      "true" or "false"
builtin:grammar/date       "january thirteenth" or "december first two thousand"            "????0113" or "20001201"
builtin:grammar/digits     "one two three four"                                             "1234"
builtin:grammar/currency   "eighteen dollars and four cents"                                "USD18.04"
builtin:grammar/number     "four hundred point five"                                        "400.5"
builtin:grammar/phone      "area code eight five eight seven oh seven oh seven oh seven"    "8587070707"
builtin:grammar/time       "six o clock" or "five thirty p m"                               "0600?" or "0530p"

Example Grammars

phone_number.gram

#ABNF 1.0;
mode voice;
language en-US;
tag-format <lumenvox/1.0>;
// The lumenvox tag format tracks the current working draft of
// the W3C's semantic interpretation proposal.
// 1.0 corresponds to the working draft released on 01 April 2003

root $PhoneNumber;

/* ONE:"1" is shorthand for
 *     ONE {!{ $="1" }!}
 * "$" refers to the current rule being matched ($Digit)
 * So the net effect is that $Digit resolves to a one digit string
 * after semantic interpretation.
 */
$Digit = (ONE:"1" | TWO:"2" | THREE:"3" | FOUR:"4" | FIVE:"5" |
          SIX:"6" | SEVEN:"7" | EIGHT:"8" | NINE:"9" | (ZERO | O):"0" );

/* $AreaCode resolves to a three digit string
 * after semantic interpretation.
 */
$AreaCode = { $ = "" } ( $Digit { $ += $Digit } ) <3>;

/* $Number resolves to a seven digit string
 * after semantic interpretation.
 */
$Number = { $ = "" } ( $Digit { $ += $$ } ) <7>; //$$ is shorthand for the last rule detected
                                                 //i.e. $Digit

/* After semantic interpretation,
 * $PhoneNumber resolves to a structure with two member variable strings,
 * areacode (which defaults to "858"), and number.
 */
$PhoneNumber = ([AREA CODE | ONE] $AreaCode { $.areacode = $$ } $Number { $.number = $$ } ) |
               ( $Number ) { $.areacode = "858"; $.number = $$ };

top_level_navigation.gram

#ABNF 1.0;
mode voice;
language en-US;
tag-format <lumenvox/1.0>;
// The lumenvox tag format tracks the current working draft of
// the W3C's semantic interpretation proposal.
// 1.0 corresponds to the working draft released on 01 April 2003

root $directive;

$directive = (go back) {$ = "APPLICATION_BACK"} |
             (main menu) {$ = "APPLICATION_TOP"} |
             (goodbye | quit | exit) {$ = "APPLICATION_EXIT"};

Semantic Interpretation

Intro to Semantic Interpretation

When constructing an application using speech recognition, it is often not enough to know what the user said. You have to know what the user meant. In fact, often you don't care whether you heard the user correctly, as long as you got the meaning right. In the speech recognition world, semantic interpretation refers to the process of extracting meaning from what was spoken.

Creating a grammar and examining the parse tree that was generated by a user's speech input is the first step toward semantic interpretation. But sometimes, it is not enough to just read off the values of the tree; significant post processing of the tree is necessary to extract meaning.

As an example, here is an SRGS/ABNF grammar that matches spoken numbers from one to nine hundred and ninety nine (it is by no means complete; for instance, it cannot recognize "two forty six" for 246):

#ABNF 1.0;
language en-US;
mode voice;
root $small_number;

$base = one|two|three|four|five|six|seven|eight|nine;

$teen = ten|eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen|eighteen|nineteen;

$twenty_to_ninetynine = (twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)[$base];

$tens = $base|$teen|$twenty_to_ninetynine;

$hundred = ([a] hundred | $base hundred);

$small_number = $hundred [[and] $tens] | $tens;

If the engine recognizes "two hundred twelve", then the parse tree looks like this:

$small_number:
    $hundred:
        $base:
            "TWO"
        "HUNDRED"
    $tens:
        $teen:
            "TWELVE"

But if your application needs to find out if the speaker spoke a number larger than 500, then it's not enough to know the parse tree; all you have is a structure of words. You need to write code to transform the tree into the number 212, which is meaningful to your application. The logic to do this transformation is going to be tied closely to the grammar's rules. For instance, within the $hundred rule, you have to know that there is an optional $base rule that has to be multiplied by 100. But in the $twenty_to_ninetynine rule, the optional $base has to be added to the total of the number you are building.
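To make that hand-written post-processing concrete, here is a minimal C sketch. It assumes the matched words have already been read off the parse tree into an array of uppercase strings. The names word_value and words_to_number are illustrative only and are not part of the LumenVox API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative lookup table: every number word the example grammar can
 * match, paired with its numeric value. */
struct word_entry { const char *word; int value; };

static const struct word_entry WORDS[] = {
    {"ONE", 1}, {"TWO", 2}, {"THREE", 3}, {"FOUR", 4}, {"FIVE", 5},
    {"SIX", 6}, {"SEVEN", 7}, {"EIGHT", 8}, {"NINE", 9}, {"TEN", 10},
    {"ELEVEN", 11}, {"TWELVE", 12}, {"THIRTEEN", 13}, {"FOURTEEN", 14},
    {"FIFTEEN", 15}, {"SIXTEEN", 16}, {"SEVENTEEN", 17}, {"EIGHTEEN", 18},
    {"NINETEEN", 19}, {"TWENTY", 20}, {"THIRTY", 30}, {"FORTY", 40},
    {"FIFTY", 50}, {"SIXTY", 60}, {"SEVENTY", 70}, {"EIGHTY", 80},
    {"NINETY", 90}
};

/* Returns the value of a number word, or -1 if the word is not a number. */
static int word_value(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof WORDS / sizeof WORDS[0]; i++) {
        if (strcmp(WORDS[i].word, word) == 0)
            return WORDS[i].value;
    }
    return -1;
}

/* Converts the words of a $small_number match into an integer.
 * "HUNDRED" multiplies what came before it by 100 (or stands for 100 on
 * its own, as in "a hundred"); every other number word is added. */
int words_to_number(const char *const *words, size_t count)
{
    int total = 0;
    size_t i;

    assert(words != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (strcmp(words[i], "HUNDRED") == 0) {
            total = (total == 0 ? 1 : total) * 100;
        } else {
            int value = word_value(words[i]);
            if (value > 0)
                total += value;  /* "A" and "AND" carry no value */
        }
    }
    return total;
}
```

Note how the multiply-versus-add decision mirrors the structure of the $hundred and $twenty_to_ninetynine rules. Any change to the grammar would have to be repeated in this code, and the function is only meaningful for word sequences the grammar has already accepted.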

Because of the close relationship between a grammar's rules, and the semantic interpretation process, it can be convenient if you can put the semantic interpretation directly into the grammar. This is where grammar tags come into play.

The LumenVox semantic interpretation scheme is an implementation of the W3C's Semantic Interpretation working draft. The W3C will likely make changes to the draft before approving it, and LumenVox will track those changes, while maintaining backward compatibility.

The basic idea behind the LumenVox semantic interpretation scheme is this:

1. Each tag contains snippets of ECMAScript code (still popularly known as JavaScript).

2. Each grammar rule can be thought of as a function that executes the ECMAScript code in its tags from left to right, and returns a value based on that executed code.

3. Any other rules that are referenced in a grammar rule are also executed left to right, and any tag that appears after a rule reference may use that rule's return value.

4. Grammar rules are only executed if the recognizer detects something to match the rule.

There are other facets to master, but understanding these four concepts will help you with everything else.

Semantic Interpretation by Example

The details of semantic interpretation will be discussed through example, by editing the numbers grammar from the introduction.

Literals

If you do not need to process any code to provide a return value for a rule, you can just attach a literal to the rule, as follows:

$foo = ($reference1 $reference2 some text):"bar";

Now, when the rule $foo is referenced, it will return the value "bar". Note: If no tags or literals exist in a grammar rule, the rule will just return text corresponding to the spoken words that matched the rule.

Literals can also be attached to individual words or phrases, as in this example:

$base = one:"1"|two:"2"|three:"3"|four:"4"|five:"5"|six:"6"|seven:"7"|eight:"8"|nine:"9";

$teen = ten:"10"|eleven:"11"|twelve:"12"|thirteen:"13"|fourteen:"14"|fifteen:"15" | sixteen:"16"|seventeen:"17"|eighteen:"18"|nineteen:"19";

Now $base and $teen return a numeric representation of the word that matched them. Note: Since a literal is the return value of a grammar rule, only one can be returned per rule. Since we have only one literal per rule alternative, this is no problem.

The Return Value

The return value of a grammar rule is an ECMAScript object named "$". You can build the return value up by writing code in tags that manipulates this symbol. For instance, our $foo rule above is equivalent to writing

$foo = ($reference1 $reference2 some text) { $ = "bar" };

This more meaningful example allows the $twenty_to_ninetynine rule to return a numeric representation of the words it matches.

$twenty_to_ninetynine = (twenty:"20"|thirty:"30"|forty:"40"|fifty:"50"|sixty:"60"|seventy:"70"|eighty:"80"|ninety:"90")[$base {$ = parseInt($) + parseInt($base)}];

In this example, first the return value $ is set to "20" or "30" or "40", etc. Then, if the optional $base rule is matched, its value is added to $. Notice the use of the JavaScript function parseInt. This is used because literals are always strings, so without parseInt, the addition above would resolve to string concatenation. Since it can be confusing to have a rule that sometimes returns a number, and other times returns a string, we will use parseInt in all of our rules:

$base = (one:"1"|two:"2"|three:"3"|four:"4"|five:"5"|six:"6"|seven:"7"|eight:"8"|nine:"9") { $ = parseInt($) };

$teen = (ten:"10"|eleven:"11"|twelve:"12"|thirteen:"13"|fourteen:"14"|fifteen:"15"|sixteen:"16"|seventeen:"17"|eighteen:"18"|nineteen:"19") { $ = parseInt($) };

$twenty_to_ninetynine = (twenty:"20"|thirty:"30"|forty:"40"|fifty:"50"|sixty:"60"|seventy:"70"| eighty:"80"|ninety:"90"){ $ = parseInt($) } [$base { $ += $base }];

The "$$" object

So far we have seen that a rule's return can be referenced by its name after that rule has been matched. Sometimes, when there are lots of rule alternatives in a rule, it can be cumbersome to reference rules by name. Other times, a matched rule can't be referenced at all. For instance, you can never access an external rule reference by name in a tag, because its name is not a valid ECMAScript identifier. For these reasons, the "$$" object exists. The "$$" object is always equal to the last rule matched. Using the "$$" object, we can write the $tens, $hundred and $small_number rules like this:

$tens = ( $base | $teen | $twenty_to_ninetynine ) { $ = $$ };

$hundred = [a] hundred {$ = 100} | $base hundred {$ = 100 * $$} ;

$small_number = $hundred {$ = $$} [[and] $tens {$ += $$}] | $tens {$ = $$};

Composite return types

Our small numbers grammar now returns an integer named small_number. If that is all we want out of this grammar, then great. Sometimes, however, we want more than one piece of information for a return type. A grammar rule always returns an object type, and object types can have additional properties. Let's say in our grammar we also want to know the text that was spoken, possibly for transcription or reading the text back to the speaker. Each rule reference $foo also has a corresponding data structure called $foo$ (yes, the W3C working group is aware that they are seriously overworking the dollar symbol), with a property called "text". Also, the text of $$ can be referenced using $$$.text.

The following change to our grammar creates a composite return type containing the text that was spoken, and the numeric representation of that text.

root $small_number_and_text;

$small_number_and_text = $small_number { $.number = $$; $.text = $$$.text }; //Note: use semi-colons to separate ECMAScript commands within tags.

Now a successful grammar match returns an object with two member properties, number and text. Here is the grammar in one place:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <lumenvox/1.0>;
//This line tells the engine how to interpret the grammar's tags.
//currently, only "lumenvox/1.0" or "semantics/1.0" is supported.
root $small_number_and_text;

$base = (one:"1"|two:"2"|three:"3"|four:"4"|five:"5"|six:"6"|seven:"7"|eight:"8"|nine:"9") { $ = parseInt($) };

$teen = (ten:"10"|eleven:"11"|twelve:"12"|thirteen:"13"|fourteen:"14"|fifteen:"15"|sixteen:"16"|seventeen:"17"|eighteen:"18"|nineteen:"19") { $ = parseInt($) };

$twenty_to_ninetynine = (twenty:"20"|thirty:"30"|forty:"40"|fifty:"50"|sixty:"60"|seventy:"70"|eighty:"80"|ninety:"90"){ $ = parseInt($) } [$base { $ += $base }];

$tens = ($base|$teen|$twenty_to_ninetynine) { $ = $$ };

$hundred = ([a] hundred {$ = 100} | $base hundred {$ = 100 * $base});

$small_number = $hundred {$ = $$} [[and] $tens {$ += $$}] | $tens { $ = $$ };

$small_number_and_text = $small_number { $.number = $$; $.text = $$$.text };

Getting The Return Value

So far we have described how to use grammar tags to create a semantic interpretation result. So how do you access that result to use in your application?

LumenVox provides an XML fragment representation of the return type. This conforms to the W3C's proposal for generating XML from semantic interpretation results (except that we do not enclose the XML in a top-level tag). LumenVox also provides an API for accessing the return value as a data structure.

Under the XML scheme, if the engine recognized "four hundred and six" using our example grammar, then the result would look like:

<number> 406 </number> <text> FOUR HUNDRED AND SIX </text>

To access the return value of the semantic interpretation scheme, you must do the following:

1. Set the LV_DECODE_SEMANTIC_INTERPRETATION flag in your decode function call.

2. After decode, get the number of different interpretations that exist using GetNumberOfInterpretations (usually there will only be one, but an ambiguous grammar might return more than one).

3. For each result, get the interpretation result by calling GetInterpretation.
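The three steps above can be sketched in C as follows. This is only an outline under stated assumptions, not the definitive calling sequence. The function names and parameter lists are taken from this manual's function list, and LV_SRE_GetInterpretationString is used as the call that fetches each result. Everything between the "Stand-ins" comments is a placeholder so the sketch compiles without the LumenVox headers: the HPORT typedef, the numeric values of the constants, and the dummy function bodies are not the real definitions. The sketch also assumes that LV_SRE_Decode returns LV_SUCCESS on success and that results are available when it returns; if they are not, a wait call such as LV_SRE_WaitForDecode may be needed first. The helper collect_interpretations is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ---- Stand-ins so this sketch compiles on its own. ----
 * In a real client the type, constants and functions below come from the
 * LumenVox headers and library. The typedef and the numeric values are
 * placeholders chosen for this sketch, NOT the real definitions, and the
 * function bodies are dummies that return a canned result. */
typedef void *HPORT;
#define LV_SUCCESS 0
#define LV_ACTIVE_GRAMMAR_SET (-1)
#define LV_DECODE_SEMANTIC_INTERPRETATION 1u

static int LV_SRE_Decode(HPORT hport, int VoiceChannel, int grammarset,
                         unsigned int flags)
{
    (void)hport; (void)VoiceChannel; (void)grammarset; (void)flags;
    return LV_SUCCESS;
}

static int LV_SRE_GetNumberOfInterpretations(HPORT hport, int VoiceChannel)
{
    (void)hport; (void)VoiceChannel;
    return 1;
}

static const char *LV_SRE_GetInterpretationString(HPORT hport,
                                                  int VoiceChannel, int index)
{
    (void)hport; (void)VoiceChannel; (void)index;
    return "<number> 406 </number> <text> FOUR HUNDRED AND SIX </text>";
}
/* ---- End of stand-ins ---- */

/* Follows the three steps: decode with the semantic interpretation flag
 * set, ask how many interpretations exist, then fetch each one.
 * Interpretations are appended to `out`, one per line.
 * Returns the number of interpretations, or -1 on error or overflow. */
int collect_interpretations(HPORT hport, int voice_channel,
                            char *out, size_t out_size)
{
    int i, count;
    size_t used = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    if (LV_SRE_Decode(hport, voice_channel, LV_ACTIVE_GRAMMAR_SET,
                      LV_DECODE_SEMANTIC_INTERPRETATION) != LV_SUCCESS)
        return -1;

    count = LV_SRE_GetNumberOfInterpretations(hport, voice_channel);
    for (i = 0; i < count; i++) {
        const char *xml =
            LV_SRE_GetInterpretationString(hport, voice_channel, i);
        int n = snprintf(out + used, out_size - used, "%s\n", xml ? xml : "");
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return count;
}
```

In a real client the stand-in section would be deleted and replaced by the LumenVox header, and hport would be a handle returned by LV_SRE_OpenPort with a grammar already loaded and activated.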

Phonemes

The units of sound the recognition engine actually recognizes are phonemes. All phrase formats are ultimately translated into phonetic spelling for decoding. These phonetic spellings can be directly entered if surrounded by curly braces.

The phonetic alphabet used by the decoder:

Phoneme  Example #1  Phonetic Spelling #1  Example #2  Phonetic Spelling #2

Vowels

AA       barn        B AA R N              top         T AA P
AE       bat         B AE T                crab        K R AE B
AH       what        W AH T                cut         K AH T
AO       more        M AO R                auto        AO T OW
AW       cow         K AW                  house       HH AW S
AX       about       AX B AW T             dial        D AY AX L
AXR      butter      B AH DX AXR           career      K AXR IH R
AY       type        T AY P                life        L AY F
EH       check       CH EH K               mess        M EH S
ER       church      CH ER CH              bird        B ER D
EY       take        T EY K                hail        HH EY L
IH       little      L IH DX AX L          rib         R IH B
IX       action      AE K SH IX N          women       W IH M IX N
IY       team        T IY M                keep        K IY P
OW       loan        L OW N                robe        R OW B
OY       hoist       HH OY S T             joy         JH OY
UH       book        B UH K                look        L UH K
UW       flew        F L UW                who         HH UW

Consonants

B        web         W EH B                bear        B EH R
CH       chair       CH EY R               statue      S T AE CH UW
D        reed        R IY D                dark        D AA R K
DH       with        W IH DH               other       AH DH ER
DX       forty       F AO R DX IY          butter      B AH DX AXR
F        four        F AO R                graph       G R AE F
G        peg         P EH G                exam        IH G Z AE M
HH       halt        HH AO L T             Jose        HH OW Z EY
JH       cage        K EY JH               Jack        JH AE K
K        coin        K OY N                back        B AE K
L        late        L EY T                really      R IH L IY
M        lemon       L EH M AH N           mail        M EY L
N        night       N AY T                any         EH N IY
NG       ring        R IH NG               ankle       AE NG K AH L
P        pay         P EY                  beep        B IY P
R        rest        R EH S T              prior       P R AY ER
S        sit         S IH T                bass        B AE S
SH       blush       B L AH SH             sure        SH UH R
T        raft        R AE F T              taped       T EY P T
TH       three       TH R IY               youth       Y UW TH
V        van         V AE N                river       R IH V AXR
W        swap        S W AA P              wing        W IH NG
Y        yes         Y EH S                year        Y IY R
Z        arms        AA R M Z              blaze       B L EY Z
ZH       Asian       EY ZH AH N            genre       ZH AA N R AH

Phrases

The phrase is what the decoder attempts to match to speech.

A phrase can be in one or more of the following formats.

One or more words. Examples: "California" "how do I"

BNF format. Example: "[that's] (right | correct)" - that's right, that's correct, right or correct

Raw phonemes (enclosed in curly braces {} ). Example: "{Y EH S P L IY Z}" - yes please

Combination of the above formats. Example: "is that ( correct | {R AY T} )" - is that correct or is that right

The engine has an internal dictionary of approximately 120,000 words. There is also a robust phonetic speller for words not found in the dictionary. The only valid punctuation marks are the apostrophe (') and the dash. Dashes should be used for multiple words that should be looked up in the internal dictionary as a single word, an example being new-orleans. If the dashed entry does not exist in the dictionary, the dashes are replaced by spaces and the words are looked up in the dictionary separately.

BNF Refresher

BNF is an acronym for "Backus Naur Form". We use only terminal symbols. The pipe "|" is an OR operator and the square brackets "[ ]" surround optional words. Parentheses clarify order of operation and nesting. Here are some examples.

( (I would like to speak | Please connect me ) with ) John Doe [please]

translates to these variations:

1. I WOULD LIKE TO SPEAK WITH JOHN DOE PLEASE
2. PLEASE CONNECT ME WITH JOHN DOE PLEASE
3. I WOULD LIKE TO SPEAK WITH JOHN DOE
4. PLEASE CONNECT ME WITH JOHN DOE

I ( want | need ) [ to ( know | hear ) ] [ the ] directions [ to ]

translates to these variations:

1. I WANT TO KNOW THE DIRECTIONS TO
2. I NEED TO KNOW THE DIRECTIONS TO
3. I WANT TO HEAR THE DIRECTIONS TO
4. I NEED TO HEAR THE DIRECTIONS TO
5. I WANT THE DIRECTIONS TO
6. I NEED THE DIRECTIONS TO
7. I WANT TO KNOW DIRECTIONS TO
8. I NEED TO KNOW DIRECTIONS TO
9. I WANT TO HEAR DIRECTIONS TO
10. I NEED TO HEAR DIRECTIONS TO
11. I WANT DIRECTIONS TO
12. I NEED DIRECTIONS TO
13. I WANT TO KNOW THE DIRECTIONS
14. I NEED TO KNOW THE DIRECTIONS
15. I WANT TO HEAR THE DIRECTIONS
16. I NEED TO HEAR THE DIRECTIONS
17. I WANT THE DIRECTIONS
18. I NEED THE DIRECTIONS
19. I WANT TO KNOW DIRECTIONS
20. I NEED TO KNOW DIRECTIONS
21. I WANT TO HEAR DIRECTIONS
22. I NEED TO HEAR DIRECTIONS
23. I WANT DIRECTIONS
24. I NEED DIRECTIONS
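The number of variations a BNF phrase expands to can be checked mechanically: alternatives add, items in sequence multiply, and an optional group contributes one extra possibility (its absence). The following C sketch counts variations that way. It is an illustration only and not how the engine processes phrases; count_variations and its helpers are hypothetical names, and the input is assumed to have balanced brackets.

```c
#include <assert.h>
#include <string.h>

static long count_alternatives(const char **cursor);

/* Counts the variations of one sequence: the product of the counts of
 * its items. A plain word contributes a factor of 1. */
static long count_sequence(const char **cursor)
{
    long product = 1;
    for (;;) {
        const char *p = *cursor;
        while (*p == ' ')
            p++;
        if (*p == '(') {                  /* required group */
            p++;
            *cursor = p;
            product *= count_alternatives(cursor);
            p = *cursor;
            if (*p == ')')
                p++;
        } else if (*p == '[') {           /* optional group: may be absent */
            p++;
            *cursor = p;
            product *= count_alternatives(cursor) + 1;
            p = *cursor;
            if (*p == ']')
                p++;
        } else if (*p == '\0' || *p == '|' || *p == ')' || *p == ']') {
            *cursor = p;                  /* end of this sequence */
            return product;
        } else {                          /* a word: skip to next delimiter */
            while (*p != '\0' && strchr(" ()[]|", *p) == NULL)
                p++;
        }
        *cursor = p;
    }
}

/* Counts the variations of a '|'-separated list: the sum over its
 * alternatives. */
static long count_alternatives(const char **cursor)
{
    long total = count_sequence(cursor);
    while (**cursor == '|') {
        (*cursor)++;
        total += count_sequence(cursor);
    }
    return total;
}

/* Returns how many word sequences the phrase expands to. Identical
 * sequences reachable by different paths are counted once per path. */
long count_variations(const char *phrase)
{
    assert(phrase != NULL);
    return count_alternatives(&phrase);
}
```

For the two examples above the sketch yields 2 x 2 = 4 and 2 x 3 x 2 x 2 = 24, which matches the listed variations. The factor 3 comes from the optional group [ to ( know | hear ) ], which can be absent, "to know", or "to hear".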

LumenVox SpeechRec API Cautions

Calling LV_SRE functions using the same HPORT in different threads at the same time can have unexpected results.

Calling LVSpeechPort methods using the same LVSpeechPort object in different threads at the same time can have unexpected results.

Win32

The environment variable LVLANG specifies the location of the Lang subdirectory. The installation package will create this variable. If the client application needs to relocate the Lang subdirectory or the API was not installed using the installation package, the client application must make sure LVLANG has the correct location of the Lang subdirectory.

LVLANG\Dict is used to store static data files (primarily the language model files for the engine, which contain acoustic models and dictionaries).

LVLANG\Responses is used to store run-time created files (the Engine's call files which contain all the details of each recognition - audio data, grammar, recognized text, etc.). A sub-directory will be created for each day's data.

Linux

LVLANG defaults to /usr/LumenVox/Dict and is used to store static data files (primarily the language model files for the Speech Engine, which contain acoustic models and dictionaries).

LVRESPONSE defaults to /var/LumenVox/Responses and is used to store run-time created files (the Speech Engine call files which contain all the details of each recognition - audio data, grammar, recognized text, etc.). A sub-directory will be created for each day's data.

The client application can create or modify either (or both) of these two environment variables to use custom locations if desired.
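As an illustration of the override behavior described above, a client that wants to know which directories are in effect could resolve them as in the following C sketch. The helper dir_from_env is a hypothetical name, and treating an empty variable the same as an unset one is an assumption of this sketch; the engine's own lookup logic is not documented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Linux default locations, as given in the text above. */
#define DEFAULT_LVLANG     "/usr/LumenVox/Dict"
#define DEFAULT_LVRESPONSE "/var/LumenVox/Responses"

/* Returns the value of the environment variable `name`, or `fallback`
 * when the variable is unset or empty. */
const char *dir_from_env(const char *name, const char *fallback)
{
    const char *value;

    assert(name != NULL && fallback != NULL);
    value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}
```

A client would call dir_from_env("LVLANG", DEFAULT_LVLANG) and dir_from_env("LVRESPONSE", DEFAULT_LVRESPONSE) to obtain the two locations.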

LV_SRE C API Functions

LV_SRE

The following "C" API is exported from the LVSpeechPort dll. For C++ programmers, these functions are wrapped in class LVSpeechPort.

Port Management Functions

int LV_SRE_ClosePort(HPORT hport);

int LV_SRE_Decode(HPORT hport,int VoiceChannel,int grammarset,unsigned int flags);

int LV_SRE_GetVoiceChannelData(HPORT hport, int VoiceChannel, short** PCM, unsigned int Samples);

int LV_SRE_LoadVoiceChannel(HPORT hport,int VoiceChannel,void* M,int Length,SOUND_FORMAT Format,const char* SoundFileName);

HPORT LV_SRE_OpenPort(ExportLogMsg log,void *p,int verbosity);

void LV_SRE_RegisterAppLogMsg(ExportLogMsg Log,void* p,int NewMsgVerbosity);

const char* LV_SRE_ReturnErrorString(int ReturnCode);

int LV_SRE_SetProperty(HPORT hport, int property, int Value);

int LV_SRE_SetProperty(HPORT hport, int property, int valuetype, void *pvalue, int target, int ndx);

int LV_SRE_WaitForEngineToIdle(HPORT hport,int voicechannel,int ms);

int LV_SRE_WaitForDecode(HPORT hport, int voicechannel);

Streaming API Functions

int LV_SRE_StreamStart(HPORT hport);

int LV_SRE_StreamSendData(HPORT hport, void* SoundData, int SoundDataLength);

int LV_SRE_StreamGetStatus(HPORT hport);

int LV_SRE_StreamGetLength(HPORT hport);

int LV_SRE_StreamSetStateChangeCallBack(HPORT hport, LV_SRE_StreamStateChangeFn* fn, void* UserData);

void LV_SRE_StreamStateChangeFn(long NewState, unsigned long TotalBytes, unsigned long RecordedBytes, void* UserData);

int LV_SRE_StreamStop(HPORT hport);

int LV_SRE_StreamCancel(HPORT hport);

int LV_SRE_StreamSetParameter(HPORT hport, int StreamParameter, unsigned long StreamParameterValue);

int LV_SRE_StreamGetParameter(HPORT hport, int StreamParameter, unsigned long* StreamParameterValue);

int LV_SRE_StreamSetParameterToDefault(HPORT hport, int StreamParameter);

SRGS Grammar Functions

int LV_SRE_LoadGrammar(HPORT hport, const char* GrammarLabel, const char* GrammarLocation);

int LV_SRE_LoadGrammarIdx(HPORT hport, int GrammarIndex, const char* GrammarLocation);

int LV_SRE_LoadGlobalGrammar(const char* GrammarLabel, const char* GrammarLocation);

int LV_SRE_LoadGrammarFromBuffer(HPORT hport, const char* GrammarLabel, const char* GrammarContents);

int LV_SRE_LoadGrammarFromBufferIdx(HPORT hport, int GrammarIndex, const char* GrammarContents);

int LV_SRE_LoadGlobalGrammarFromBuffer(const char* GrammarLabel, const char* GrammarContents);

int LV_SRE_LoadGrammarFromObject(HPORT hport, const char* GrammarLabel, HGRAMMAR hgrammar);

int LV_SRE_LoadGrammarFromObjectIdx(HPORT hport, int GrammarIdx, HGRAMMAR hgrammar);

int LV_SRE_LoadGlobalGrammarFromObject(const char* GrammarLabel, HGRAMMAR hgrammar);

int LV_SRE_UnloadGrammar(HPORT hport, const char* GrammarLabel);

int LV_SRE_UnloadGrammarIdx(HPORT hport, int GrammarIndex);

int LV_SRE_UnloadGlobalGrammar(const char* GrammarLabel);

int LV_SRE_UnloadGrammars(HPORT hport);

int LV_SRE_UnloadGlobalGrammars(void);

int LV_SRE_IsGrammarLoaded(HPORT hport,const char* GrammarLabel);

int LV_SRE_IsGrammarLoadedIdx(HPORT hport, int GrammarIndex);

int LV_SRE_IsGlobalGrammarLoaded(const char* GrammarLabel);

int LV_SRE_ActivateGrammar(HPORT hport, const char* GrammarLabel);

int LV_SRE_ActivateGrammarIdx(HPORT hport, int GrammarIndex);

int LV_SRE_ActivateGlobalGrammar(HPORT hport, const char* GrammarLabel);

int LV_SRE_DeactivateGrammar(HPORT hport, const char* GrammarLabel);

int LV_SRE_DeactivateGrammarIdx(HPORT hport, int GrammarIndex);

int LV_SRE_DeactivateGrammars(HPORT hport);

SRGS Result Functions

int LV_SRE_GetNumberOfParses(HPORT hport, int VoiceChannel);

const char* LV_SRE_GetParseTreeString(HPORT hport, int VoiceChannel, int index);

H_PARSE_TREE LV_SRE_CreateParseTree(HPORT hport, int VoiceChannel, int Index);

int LV_SRE_GetNumberOfInterpretations(HPORT hport, int VoiceChannel);

const char* LV_SRE_GetInterpretationString(HPORT hport, int VoiceChannel, int index);

H_SI LV_SRE_CreateInterpretation(HPORT hport, int VoiceChannel, int index);

N-Best Result Functions

int LV_SRE_GetNumberOfNBestAlternatives(HPORT hport, int VoiceChannel);

int LV_SRE_SwitchToNBestAlternative(HPORT hport, int VoiceChannel, int index);

Concept-Phrase Grammar Functions (for backward compatibility)

int LV_SRE_AddPhrase(HPORT hport,int GrammarSet, const char* Concept, const char* Phrase);

int LV_SRE_LoadStandardGrammar(HPORT hport,int grammarset,int defaultgrammar);

int LV_SRE_ResetGrammar(HPORT hport,int GrammarSet);

const char* LV_SRE_GetConcept(HPORT hport,int VoiceChannel, int Index);

int LV_SRE_GetConceptScore(HPORT hport,int VoiceChannel, int Index);

int LV_SRE_GetNumberOfConceptsReturned(HPORT hport,int VoiceChannel);

int LV_SRE_GetPhonemesDecoded(HPORT hport, int VoiceChannel, int Index);

int LV_SRE_GetPhraseDecoded(HPORT hport, int VoiceChannel, int Index);

int LV_SRE_GetRawTextDecoded(HPORT hport, int VoiceChannel, int Index);

int LV_SRE_RemoveConcept(HPORT hport,int GrammarSet, const char* Concept);

API Functions

LV_SRE_OpenPort

Opens the speech port and initializes a connection to the Speech Engine.

Functions

HPORT LV_SRE_OpenPort(ExportLogMsg Log, void* p, int verbosity);

HPORT LV_SRE_OpenPort2(unsigned long* error_code, ExportLogMsg Log, void* p, int verbosity);

Return Values

Note: the returned handle is used by most other API functions, and must be closed by calling LV_SRE_ClosePort.

Non-NULL

Port initialized successfully.

NULL

Licensing has been exceeded. There are too many ports active.

Parameters

Log

Pointer to a function which will receive logging information from the object.

p

A void pointer to client application-defined data. This data will be passed into the ExportLogMsg function to identify the calling port.

verbosity

range: 0 - 6

0 - minimal logging info

6 - maximum logging info

error_code

Receives an error code indicating why the port failed to open.

Error Code Return Values for OpenPort2

LV_SUCCESS

The port opened successfully

LV_NO_SERVER_RESPONDING or LV_OPEN_PORT_FAILED__PRIMARY_SERVER_NOT_RESPONDING

The client could not find a server to request a licensed port from.

LV_OPEN_PORT_FAILED__LICENSES_EXCEEDED

The primary server has too many ports connected for the number of licenses it has to give out.

Remarks

This method activates the speech port object. The recognition engine will begin initializing when this function is called. Control will return to the application immediately.

p is passed into the ExportLogMsg function to enable client-application-defined behavior.

See Also

Logging Callback Function

LV_SRE_ClosePort

LVSpeechPort::OpenPort

LV_SRE_ClosePort

Closes the port, and releases its resources.

int LV_SRE_ClosePort(HPORT hport);

Return Values

LV_SUCCESS

No errors; the port has successfully shut down.

LV_FAILURE

The port was unable to shut down.

LV_INVALID_HPORT

The port was never successfully opened, or was already closed.

Note:

Closing a port frees it from counting against the number of ports allowed by your license. Close every port that is no longer needed.

See Also

LV_SRE_OpenPort

LVSpeechPort::ClosePort

LV_SRE_RegisterAppLogMsg

Registers an application-level log message callback.

void LV_SRE_RegisterAppLogMsg(ExportLogMsg log,void *p,int verbosity);

Return Values

none.

Parameters

Log

Pointer to a function which will receive logging information.

p

p is a void pointer to Application defined data. This data will be passed into the ExportLogMsg function to identify the application.

verbosity

range: 0 - 6

0 - minimal logging info

6 - maximum logging info

Remarks

This is in addition to the port log message callback, because some log messages are generated while not associated with any one port.

There currently is no equivalent in LVSpeechPort.

See Also

Logging Callback Function

LV_SRE_ActivateGrammar functions

If you wish to use an SRGS grammar for decode, you need to activate it. Activating a grammar puts it in the multi-grammar grammarset called LV_ACTIVE_GRAMMAR_SET. The grammars that were activated can then be used for a decode by specifying LV_ACTIVE_GRAMMAR_SET as the grammarset parameter in a call to Decode, or by setting the STREAM_PARM_GRAMMAR_SET equal to the LV_ACTIVE_GRAMMAR_SET before calling StreamStart. The reason for this mechanism is to maintain backward compatibility with previous APIs.

When ActivateGrammar is called, first the grammar is searched for among the grammars in the speech port's loaded grammars. If it can not be found there, the collection of application level grammars is searched. If you wish to explicitly activate an application level grammar, use LV_SRE_ActivateGlobalGrammar.
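The search order described above can be modelled with a short C sketch. It uses plain arrays of labels to stand in for the port's loaded grammars and the application-level grammars; resolve_grammar and the grammar_scope enum are hypothetical names for illustration and are not part of the LumenVox API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Where a grammar label was found. */
enum grammar_scope { SCOPE_NOT_FOUND, SCOPE_PORT, SCOPE_GLOBAL };

/* Returns 1 if `label` appears in labels[0..count). */
static int contains(const char *const *labels, size_t count,
                    const char *label)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (strcmp(labels[i], label) == 0)
            return 1;
    }
    return 0;
}

/* Mirrors the documented search order: the port's own grammars first,
 * then the application-level (global) grammars. */
enum grammar_scope resolve_grammar(const char *const *port_labels,
                                   size_t port_count,
                                   const char *const *global_labels,
                                   size_t global_count,
                                   const char *label)
{
    assert(label != NULL);
    if (contains(port_labels, port_count, label))
        return SCOPE_PORT;
    if (contains(global_labels, global_count, label))
        return SCOPE_GLOBAL;
    return SCOPE_NOT_FOUND;
}
```

When the same label exists in both collections, the port's grammar wins. That shadowing is the case in which the global grammar has to be activated explicitly.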

Functions

int LV_SRE_ActivateGrammar(HPORT hport, const char* gram_name);

int LV_SRE_ActivateGrammarIdx(HPORT hport, int gram_name);

Parameters

hport

The handle of the speech port for which you are activating the grammar.

gram_name

The identifier for the grammar being activated. This is the same identifier that was given to the grammar when it was loaded. This can be a string, or an integer ID if you use the *Idx version of the function call. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.

Return Values

LV_SUCCESS

No errors; this grammar is now active.

LV_GRAMMAR_LOADING_ERROR

This grammar could not be activated, because it was not found in the speech port's set of loaded grammars.

Remarks

Detailed error and warning messages are sent to the speech port's logging callback function at priorities 0 and 1, respectively.

See Also

LV_SRE_DeactivateGrammar functions

LV_SRE_ActivateGlobalGrammar

LVSpeechPort::ActivateGrammar functions (C++ API)

LV_SRE_ActivateGlobalGrammar

You only need to use this function if you have a grammar in the speech port with same name as a grammar in the global space, and you wish to activate the global grammar.

Function

int LV_SRE_ActivateGlobalGrammar(HPORT hport,const char* gram_name);

Parameters

hport

The handle of the speech port for which you are activating the grammar.

gram_name

The identifier for the grammar being activated. This is the same identifier that was given to the grammar when it was loaded.

Return Values

LV_SUCCESS

No errors; this grammar is now active.

LV_FAILURE

This grammar could not be activated, because it was not found in the application-level set of grammars.

Remarks

Since LV_SRE_ActivateGrammar searches the speech port's loaded grammars, and then searches the application level grammars, you only need to use LV_SRE_ActivateGlobalGrammar if there is a name conflict between your local and app-level grammars, and you need to activate the app-level one.

Detailed error and warning messages are sent to the speech port's logging callback function at priorities 0 and 1, respectively.

See Also

LV_SRE_ActivateGrammar functions

LV_SRE_DeactivateGrammar functions

LVSpeechPort::ActivateGlobalGrammar (C++ API)

LV_SRE_DeactivateGrammar functions

These functions remove a grammar from the set of active grammars. The last function, LV_SRE_DeactivateGrammars, clears the entire active grammar set.

Functions

int LV_SRE_DeactivateGrammar(HPORT hport, const char* gram_name);

int LV_SRE_DeactivateGrammarIdx(HPORT hport, int gram_name);

int LV_SRE_DeactivateGrammars(HPORT hport);

Parameters

hport

The handle of the speech port for which you are deactivating the grammar.

gram_name

The identifier for the grammar being deactivated. This is the same identifier that was given to the grammar when it was loaded. This can be a string, or an integer ID if you use the *Idx version of the function call. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.
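
Because the string "123" and the integer 123 name the same grammar, an application that stores integer IDs can derive the string label when it wants to call the string versions of these functions. The helper below is a minimal sketch under the assumption that the equivalence is the plain decimal rendering of the integer; grammar_label_from_index is a hypothetical name and is not part of the LumenVox API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the string label that corresponds to an integer grammar ID,
   for example 123 -> "123". Returns 0 on success, or -1 if the buffer
   is too small to hold the label and its terminating null. */
int grammar_label_from_index(int idx, char *buf, size_t buf_size) {
    assert(buf != NULL);
    int n = snprintf(buf, buf_size, "%d", idx);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}
```

The resulting buffer can be passed as gram_name to the non-Idx functions. The treatment of negative IDs is not stated in this reference, so the sketch makes no claim about them.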

Return Values

LV_SUCCESS

No errors; this grammar is no longer active.

LV_FAILURE

This grammar could not be deactivated, because it was never successfully activated.

See Also

LV_SRE_ActivateGrammar functions

LV_SRE_ActivateGlobalGrammar

LVSpeechPort::DeactivateGrammar (C++ API)

LV_SRE_LoadGrammar functions

Before you can use a grammar, you must load it into the speech port's collection of grammars, or you must load it into the collection of application-level (global) grammars. When you load a grammar, it is compiled for use in the LumenVox Speech Engine.

These functions load an SRGS grammar that will be usable by a single speech port object.

Functions

LV_SRE_LoadGrammar(HPORT hport, const char* gram_name, const char* gram_location);

LV_SRE_LoadGrammarIdx(HPORT hport, int gram_name, const char* gram_location);

LV_SRE_LoadGrammarFromBuffer(HPORT hport, const char* gram_name, const char* gram_contents);

LV_SRE_LoadGrammarFromBufferIdx(HPORT hport, int gram_name, const char* gram_contents);

LV_SRE_LoadGrammarFromObject(HPORT hport, const char* gram_name, HGRAMMAR gram_handle);

LV_SRE_LoadGrammarFromObjectIdx(HPORT hport, int gram_name, HGRAMMAR gram_handle);

Parameters

hport

The handle for the speech port you are loading the grammar into.

gram_name

The identifier for the grammar being loaded. Whenever you activate, deactivate, or unload, this is the identifier you will use. This can be a string, or an integer ID if you use the *Idx version of the function call. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.

gram_location

A file descriptor or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

gram_contents

A null terminated string containing the contents of a valid SRGS grammar file.

gram_handle

A handle for an LVGrammar object, created by LVGrammar_Create

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used.

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the speech port's logging callback function at priorities 0 and 1, respectively.
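
The return values above split into two groups: results after which the grammar can be used (success and syntax warning) and results after which it cannot. A minimal sketch of that check follows. The numeric values in the enum are placeholders chosen only so that the sketch compiles without the SDK; the real values come from the LumenVox header. The helper names grammar_is_usable, describe_load_result, and check_load are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder values so the sketch compiles on its own. The names come from
   this reference; the real values are defined by the LumenVox SDK header. */
enum {
    LV_SUCCESS = 0,
    LV_GRAMMAR_SYNTAX_WARNING = 1,
    LV_GRAMMAR_SYNTAX_ERROR = -1,
    LV_GRAMMAR_LOADING_ERROR = -2
};

/* Returns 1 if the grammar can be activated and decoded against, else 0. */
int grammar_is_usable(int load_result) {
    return load_result == LV_SUCCESS || load_result == LV_GRAMMAR_SYNTAX_WARNING;
}

/* Maps a load result to a short message for application-side logging. */
const char *describe_load_result(int load_result) {
    switch (load_result) {
    case LV_SUCCESS: return "grammar loaded";
    case LV_GRAMMAR_SYNTAX_WARNING: return "grammar loaded with warnings; check the log";
    case LV_GRAMMAR_SYNTAX_ERROR: return "grammar could not be compiled";
    case LV_GRAMMAR_LOADING_ERROR: return "grammar location could not be found";
    default: return "unexpected result code";
    }
}

/* Logs the outcome and reports whether the caller may go on to activate. */
int check_load(const char *gram_name, int load_result) {
    assert(gram_name != NULL);
    fprintf(stderr, "%s: %s\n", gram_name, describe_load_result(load_result));
    return grammar_is_usable(load_result);
}
```

In an application, the second argument to check_load would be the value returned by one of the LV_SRE_LoadGrammar functions.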

See Also

LV_SRE_UnloadGrammar functions

LV_SRE_IsGrammarLoaded functions

LV_SRE_LoadGlobalGrammar functions

LVSpeechPort::LoadGrammar functions (C++ API)

LV_SRE_UnloadGrammar functions

These functions remove a loaded grammar from a speech port object. The last function removes all loaded grammars from the speech port.

Functions

int LV_SRE_UnloadGrammar(HPORT hport, const char* gram_name);

int LV_SRE_UnloadGrammarIdx(HPORT hport, int gram_name);

int LV_SRE_UnloadGrammars(HPORT hport);

Parameters

hport

The handle for the speech port from which you are unloading the grammar.

gram_name

The identifier for the grammar being unloaded. This is the same identifier you gave the grammar when you loaded it. It can be a null terminated string, or an integer if you use the *Idx version of the method.

Return Values

LV_SUCCESS

No errors; this grammar is removed.

LV_FAILURE

The grammar was not present. Nothing was removed.

Remarks

Grammars that were activated and then unloaded are still active; they must be explicitly deactivated.
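
One way to respect this rule is to wrap the two calls in a helper that always deactivates before unloading. The sketch below uses stub versions of LV_SRE_DeactivateGrammar and LV_SRE_UnloadGrammar with the signatures documented here so that it compiles without the SDK. MockPort, the HPORT typedef, the numeric result values, and retire_grammar are assumptions made for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the SDK: a mock port that tracks a single grammar label. */
typedef struct MockPort { char loaded[32]; char active[32]; } MockPort;
typedef MockPort *HPORT;                  /* assumption: the real HPORT is opaque */
enum { LV_SUCCESS = 0, LV_FAILURE = -1 }; /* placeholder values */

int LV_SRE_DeactivateGrammar(HPORT hport, const char *gram_name) {
    if (strcmp(hport->active, gram_name) != 0) return LV_FAILURE;
    hport->active[0] = '\0';
    return LV_SUCCESS;
}

int LV_SRE_UnloadGrammar(HPORT hport, const char *gram_name) {
    if (strcmp(hport->loaded, gram_name) != 0) return LV_FAILURE;
    hport->loaded[0] = '\0';  /* the active entry is left alone, as documented */
    return LV_SUCCESS;
}

/* Deactivates the grammar (ignoring "was not active"), then unloads it.
   Returns the result of the unload call. */
int retire_grammar(HPORT hport, const char *gram_name) {
    assert(hport != NULL && gram_name != NULL);
    (void)LV_SRE_DeactivateGrammar(hport, gram_name);
    return LV_SRE_UnloadGrammar(hport, gram_name);
}
```

Ignoring the deactivate result is a design choice in this sketch: a grammar that was never activated is not an error for the purpose of retiring it.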

See Also

LV_SRE_IsGrammarLoaded functions

LV_SRE_UnloadGlobalGrammar functions

LV_SRE_LoadGrammar functions

LVSpeechPort::UnloadGrammar functions (C++ API)

LV_SRE_UnloadGlobalGrammar

These functions remove loaded grammars from the application-level space of grammars. The first function removes a single grammar; the second removes all application-level grammars.

Functions

int LV_SRE_UnloadGlobalGrammar(const char* gram_name);

void LV_SRE_UnloadGlobalGrammars(void);

Parameters

gram_name

The identifier for the grammar being unloaded. This is the same identifier you gave the grammar when you loaded it.

Return Values

LV_SUCCESS

No errors; this grammar is removed.

LV_GLOBAL_GRAMMAR_TRANSACTION_ERROR

Failed to unload the grammar on all servers.

LV_GLOBAL_GRAMMAR_TRANSACTION_PARTIAL_ERROR

Failed to unload the grammar on some of the servers.

Remarks

A global grammar is unloaded on the server only when users have called unload functions on all labels that are associated with the grammar.
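
The behavior in this remark resembles reference counting by label. The following toy model is only an illustration of that idea; SharedGrammar and its two functions are hypothetical, and the server's actual bookkeeping is not described in this reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one server-side grammar instance shared by several labels. */
typedef struct {
    int label_count;  /* labels currently referring to this grammar */
    int resident;     /* 1 while the grammar is held on the server */
} SharedGrammar;

/* Called when the same grammar is loaded under one more label. */
void shared_grammar_add_label(SharedGrammar *g) {
    assert(g != NULL);
    g->label_count++;
    g->resident = 1;
}

/* Called when one label is unloaded. Returns 1 if this call released the
   grammar, or 0 if other labels still hold it (or none held it at all). */
int shared_grammar_remove_label(SharedGrammar *g) {
    assert(g != NULL);
    if (g->label_count == 0) return 0;
    g->label_count--;
    if (g->label_count == 0) {
        g->resident = 0;
        return 1;
    }
    return 0;
}
```

In this model, a grammar loaded under two labels stays resident after the first unload and is released only by the second.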

See Also

LV_SRE_UnloadGrammar functions

LV_SRE_IsGlobalGrammarLoaded functions

LV_SRE_LoadGlobalGrammar functions

LVSpeechPort::UnloadGlobalGrammar functions (C++ API)

LV_SRE_LoadGlobalGrammar functions

When you load a global grammar, the grammar is sent to the server. Subsequent decode requests then carry only the global grammar's ID rather than the grammar itself, which avoids the network overhead of retransmitting large grammars.

A global grammar is associated with the client process that loads it. All speech ports that belong to that client have access to the global grammar. Different client processes, however, do not share global grammars with each other.

Generally, the lifetime of a global grammar is controlled by the load and unload functions. However, a client process may terminate without unloading its global grammars. To release such unused grammars, the server periodically checks whether each client process is still alive. Once the server detects that a client process has been inactive for more than 10 minutes, it removes all grammars associated with that client process.

In a multi-threaded program, it is safe to access global grammars in a read-only fashion from multiple threads simultaneously, for instance to query whether a global grammar is loaded or to call decode with global grammars. When loading or unloading is involved, such as unloading a global grammar while another thread is decoding with it, it is the user's responsibility to prevent race conditions.

Functions

LV_SRE_LoadGlobalGrammar (const char* gram_name, const char* gram_location);

LV_SRE_LoadGlobalGrammarFromBuffer (const char* gram_name, const char* gram_contents);

LV_SRE_LoadGlobalGrammarFromObject (const char* gram_name, HGRAMMAR gram_handle);

Parameters

gram_name

The identifier for the grammar being loaded. Whenever you activate, deactivate, or unload, this is the identifier you will use.

gram_location

A file descriptor or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

gram_contents

A null terminated string containing the contents of a valid SRGS grammar file.

gram_handle

A handle for an LVGrammar object, created by LVGrammar_Create

Return Values

LV_SUCCESS

No errors; this grammar is now ready to use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready for use.

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

LV_GLOBAL_GRAMMAR_TRANSACTION_ERROR

Failed to send the grammar to all servers.

LV_GLOBAL_GRAMMAR_TRANSACTION_PARTIAL_ERROR

Failed to send the grammar to some of the servers.

Remarks

Detailed error and warning messages are sent to the LVSpeechPort application-level logging callback function at priorities 0 and 1, respectively.

Users can load the same grammar under different labels. Doing so creates only one instance of that grammar on the server.

See Also

LV_SRE_LoadGrammar functions

LV_SRE_IsGlobalGrammarLoaded functions

LV_SRE_UnloadGlobalGrammar functions

LVSpeechPort::LoadGlobalGrammar functions (C++ API)

LV_SRE_IsGrammarLoaded functions

Functions

int LV_SRE_IsGrammarLoaded(HPORT hport, const char* gram_name);

int LV_SRE_IsGrammarLoadedIdx(HPORT hport, int gram_name);

Parameters

hport

The port being queried for gram_name.

gram_name

The identifier for the grammar being queried. This is the same identifier you gave the grammar when you loaded it.

Return Values

1 if a grammar was found with the label gram_name in the speech port's set of loaded grammars; 0 otherwise.

Remarks

Note: This function only tells you if a grammar with the name gram_name is loaded. It does not tell you if there are two identical grammar bodies loaded.

See Also

LV_SRE_UnloadGrammar functions

LV_SRE_IsGlobalGrammarLoaded

LV_SRE_LoadGrammar functions

LVSpeechPort::IsGrammarLoaded functions (C++ API)

LV_SRE_IsGlobalGrammarLoaded

Function

int LV_SRE_IsGlobalGrammarLoaded(const char* gram_name);

Parameters

gram_name

The identifier for the grammar being queried. This is the same identifier you gave the grammar when you loaded it.

Return Values

1 if a grammar was found with the label gram_name in the space of application-level grammars; 0 otherwise.

Remarks

Note: This function only tells you if a grammar with the name gram_name is loaded. It does not tell you if there are two identical grammar bodies loaded.

See Also

LV_SRE_UnloadGlobalGrammar

LV_SRE_IsGrammarLoaded functions

LV_SRE_LoadGlobalGrammar functions

LVSpeechPort::IsGlobalGrammarLoaded functions (C++ API)

LV_SRE_AddPhrase

Adds a phrase to a new or existing concept.

int LV_SRE_AddPhrase(HPORT hport, int GrammarSet, const char* Concept , const char* Phrase);

Return Values

LV_SUCCESS

No errors; the phrase was added to the concept.

LV_BAD_HPORT

The engine is no longer running. This is the result of an LV_SRE_ClosePort call or an unrecoverable engine error.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set is out of range.

LV_GRAMMAR_SYNTAX_ERROR or LV_GRAMMAR_SYNTAX_WARNING

The phrase entered has bad syntax, such as mismatched parentheses.

Parameters

GrammarSet

The grammar set to add the phrase to. An integer value between 0 and 63, inclusive.

Concept

The concept to add the phrase to. Null-terminated string.

Phrase

The new phrase.

Remarks

The concept can be new or existing. If the concept does not yet exist, the call automatically creates it with this single phrase.
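
Since each call adds one phrase, building a concept with several phrases means calling LV_SRE_AddPhrase repeatedly with the same concept name. The sketch below shows that loop against a stub with the documented signature, so that it compiles without the SDK. MockPort, the HPORT typedef, the numeric result values, and add_phrases are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the SDK so the sketch compiles on its own. */
typedef struct MockPort { int phrases_added; } MockPort;
typedef MockPort *HPORT;                                   /* assumption */
enum { LV_SUCCESS = 0, LV_GRAMMAR_SET_OUT_OF_RANGE = -1 }; /* placeholder values */

int LV_SRE_AddPhrase(HPORT hport, int GrammarSet, const char *Concept,
                     const char *Phrase) {
    (void)Concept;
    (void)Phrase;
    if (GrammarSet < 0 || GrammarSet > 63) return LV_GRAMMAR_SET_OUT_OF_RANGE;
    hport->phrases_added++;
    return LV_SUCCESS;
}

/* Adds every phrase to one concept. Stops at the first error and returns it;
   returns LV_SUCCESS if all phrases were added. */
int add_phrases(HPORT hport, int grammar_set, const char *concept_name,
                const char **phrases, size_t count) {
    assert(hport != NULL && concept_name != NULL && phrases != NULL);
    for (size_t i = 0; i < count; ++i) {
        int rc = LV_SRE_AddPhrase(hport, grammar_set, concept_name, phrases[i]);
        if (rc != LV_SUCCESS) return rc;
    }
    return LV_SUCCESS;
}
```

The first call creates the concept and the later calls extend it, so the caller does not need to check beforehand whether the concept exists.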

See Also

Phrase Formats

Phonemes

LVSpeechPort::AddPhrase

LV_SRE_RemoveConcept

Removes a concept and all of its phrases.

int LV_SRE_RemoveConcept(HPORT hport, int GrammarSet, const char* Concept);

Return Values

LV_SUCCESS

No errors; the concept and all of its phrases are removed from the grammar set.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set specified is outside the valid range.

LV_BAD_HPORT

The engine is no longer running. This is the result of an LV_SRE_ClosePort call or an unrecoverable engine error.

Parameters

GrammarSet

The grammar set to remove the concept from. Valid range is 0 to 63.

Concept

The existing concept to remove. Null-terminated string.

See Also

LVSpeechPort::RemoveConcept

LV_SRE_ResetGrammar

Removes all concepts from a grammar.

int LV_SRE_ResetGrammar(HPORT hport, int GrammarSet);

Return Values

LV_SUCCESS

No errors; grammar reset.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set value is out of expected range (0-63).

See Also

LVSpeechPort::ResetGrammar

LV_SRE_LoadStandardGrammar

Standard Grammars are deprecated in favor of SRGS built-in grammars.

Loads a standard, pre-defined grammar to easily recognize and format numbers, monetary figures or digits.

int LV_SRE_LoadStandardGrammar(HPORT hport,int GrammarSet, int StdGrammar);

Return Values

LV_SUCCESS

No errors; the standard grammar is loaded.

LV_STANDARD_GRAMMAR_OUT_OF_RANGE

The standard grammar value is not a recognized grammar type.

LV_GRAMMAR_SET_OUT_OF_RANGE

The standard grammar was loaded into a set that is not in range.

Parameters

GrammarSet

The grammar set to load the standard grammar into. Valid range is 0 to 63.

StdGrammar

The standard grammars are:

1. GRAMMAR_DIGITS String of single digits, like a phone number or PIN code.

2. GRAMMAR_MONEY Monetary value (only implemented for SRGS decodes).

3. GRAMMAR_NUMERIC Numeric value (like 12,000, 24.45, or 35).

4. GRAMMAR_SPELLING Alphabet letters for spelling (not implemented).

5. GRAMMAR_ALPHA_NUMERIC (Not implemented).

6. GRAMMAR_DATE Date values (only implemented for SRGS decodes).

7. GRAMMAR_NONE Clears out the standard grammar, without clearing out any phrases that were added. ResetGrammar() will clear out the entire grammar.

Remarks

The client application can load only one standard grammar, but can add any number of concepts with AddPhrase. This is not true, however, if you use SRGS grammars. The correct way to augment a standard SRGS grammar is to load a grammar to a different location, and then activate both. When a standard grammar is loaded, the decoder will return the number, dollar amount, or digit string as either a single concept or a single interpretation string, depending on whether SRGS is used.

As an example, the client application loads GRAMMAR_NUMERIC and also adds the concept and phrase "Widgets". If the sound data contained the speech "twelve widgets", the decoder will return two concepts: the first is the string "12" and the second is the string "Widgets". If the speech was "one thousand one hundred and twenty nine Widgets seven point two Widgets", the decoder would return four concepts: "1129", "Widgets", "7.2", and "Widgets".

However, if you use SRGS, this is not what happens. In order to get this sort of functionality in the SRGS setting, you would create a grammar that looks like the following:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <semantics/1.0>;
root $how_many_widgets;

$how_many_widgets = $<builtin:grammar/number> widgets {$=$$;};

In this case you wouldn't bother using LoadStandardGrammar() at all, since the standard number grammar will get loaded when you load this grammar. The return type would be an interpretation string representing the number that was recognized, like "1129" or "7.2". The word "widgets" would not be returned in this grammar.

See Also

Standard Grammars

LVSpeechPort::LoadStandardGrammar

LV_SRE_LoadVoiceChannel

Loads the audio data into the specified voice channel prior to a call to LV_SRE_Decode (which decodes the audio data).

int LV_SRE_LoadVoiceChannel(HPORT hport, int VoiceChannel, void* M, int Length, SOUND_FORMAT Format);

Return Values

LV_SUCCESS

No errors; the voice channel audio successfully loaded.

LV_BAD_HPORT

The engine is no longer running. This is the result of an LV_SRE_ClosePort call or an unrecoverable engine error.

LV_FAILURE

Sound format was incorrectly specified.

Parameters

VoiceChannel

Accepted values 0 through 63.

M

Pointer to audio data.

Length

Memory size in bytes of the audio data.

Format

The audio data sound format.

Remarks

Each LV_SpeechPort supports 64 separate voice channels. Each channel has its own separate storage for decode data, so once the call is made, the client application can release its own copy. LV_SRE_LoadVoiceChannel will accept the audio data and prepare it for decoding.
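
Because the channel keeps its own copy of the audio, the caller's buffer can be temporary. The sketch below loads from a heap buffer and frees it immediately after the call. It uses a stub LV_SRE_LoadVoiceChannel so that it compiles without the SDK. MockPort, the HPORT and SOUND_FORMAT typedefs, EXAMPLE_SOUND_FORMAT, the numeric result values, and load_from_temporary_buffer are all hypothetical; the real format constants come from the SDK header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the SDK so the sketch compiles on its own. */
typedef int SOUND_FORMAT;                 /* assumption: real type is in the SDK */
#define EXAMPLE_SOUND_FORMAT 1            /* hypothetical, not a real constant */
enum { LV_SUCCESS = 0, LV_FAILURE = -1 }; /* placeholder values */

typedef struct MockPort {
    unsigned char stored[16];  /* the mock channel's private copy of the audio */
    int stored_len;
} MockPort;
typedef MockPort *HPORT;

/* Stub: copies the audio into the port, as the Remarks describe. */
int LV_SRE_LoadVoiceChannel(HPORT hport, int VoiceChannel, void *M, int Length,
                            SOUND_FORMAT Format) {
    (void)VoiceChannel;
    (void)Format;
    if (Length < 0 || (size_t)Length > sizeof hport->stored) return LV_FAILURE;
    memcpy(hport->stored, M, (size_t)Length);
    hport->stored_len = Length;
    return LV_SUCCESS;
}

/* Copies src into a temporary heap buffer, loads it, and frees the buffer
   right after the call returns. */
int load_from_temporary_buffer(HPORT hport, int channel,
                               const unsigned char *src, int length) {
    assert(hport != NULL && src != NULL);
    if (length <= 0) return LV_FAILURE;
    unsigned char *audio = malloc((size_t)length);
    if (audio == NULL) return LV_FAILURE;
    memcpy(audio, src, (size_t)length);
    int rc = LV_SRE_LoadVoiceChannel(hport, channel, audio, length,
                                     EXAMPLE_SOUND_FORMAT);
    free(audio);  /* safe: the channel holds its own copy */
    return rc;
}
```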

See Also

LVSpeechPort::LoadVoiceChannel

LV_SRE_Decode

Processes the voice channel audio data against the active grammar.

int LV_SRE_Decode(HPORT hport,int VoiceChannel,int grammarset,unsigned int flags);

Return Values

Zero (0) or greater indicates success.

A negative result indicates a specific error.

Parameters

VoiceChannel

The voice channel to process.

GrammarSet

The grammar to use to process.

Flags (bitwise OR flags to set desired options)

LV_DECODE_BLOCK - Decode will not return until it has finished.

LV_DECODE_GENDER_MALE - Gender identifier.

LV_DECODE_GENDER_FEMALE - Gender identifier.

LV_DECODE_FIRST_TIME_USER - Reset caller weights in Recognition Engine (not implemented).

LV_DECODE_USE_OOV - Use the Out-Of-Vocabulary filter (OOV) during decode.

Remarks

If LV_DECODE_BLOCK is set, LV_SRE_Decode will not return until it has finished processing the data.

If LV_DECODE_BLOCK is not set, LV_SRE_Decode returns immediately (but continues processing the data on a separate thread); the client application can continue its own work. Calling other LVSpeechPort methods may block until the Decode is finished. Once the client application is ready to check for results, call either 1) LV_SRE_GetNumberOfConceptsReturned, or 2) LV_SRE_WaitForEngineToIdle and then LV_SRE_GetNumberOfConceptsReturned. LV_SRE_WaitForEngineToIdle waits only for the specified time and returns regardless of whether LV_SRE_Decode has finished, whereas LV_SRE_GetNumberOfConceptsReturned blocks until the decode is finished.

LV_DECODE_GENDER_FEMALE and LV_DECODE_GENDER_MALE identify which gender acoustic model to use. If these flags are not specified, the engine automatically decodes each audio file against both gender models. While this slows the engine by requiring two decodes, evaluating against both models has a very significant positive effect on recognition accuracy. Since the engine is multithreaded, unless CPU loads are a serious issue, do not use these flags.

On an error, call LV_SRE_ReturnErrorString with the negative result from LV_SRE_Decode to get a description of the error.
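
The flags argument is built by combining options with bitwise OR, and the result of LV_SRE_Decode is classified by its sign. A minimal sketch of both steps follows. The bit values below are placeholders chosen only so that the sketch compiles without the SDK (the real values come from the LumenVox header), and build_decode_flags and decode_succeeded are hypothetical helper names.

```c
#include <assert.h>

/* Placeholder bit values; only the names are taken from this reference. */
#define LV_DECODE_BLOCK         0x01u
#define LV_DECODE_GENDER_MALE   0x02u
#define LV_DECODE_GENDER_FEMALE 0x04u
#define LV_DECODE_USE_OOV       0x08u

/* Combines the requested options into one flags word. */
unsigned int build_decode_flags(int blocking, int use_oov) {
    unsigned int flags = 0u;
    if (blocking) flags |= LV_DECODE_BLOCK;
    if (use_oov) flags |= LV_DECODE_USE_OOV;
    /* Gender flags are deliberately left unset so both models are tried. */
    assert((flags & (LV_DECODE_GENDER_MALE | LV_DECODE_GENDER_FEMALE)) == 0u);
    return flags;
}

/* LV_SRE_Decode reports success as zero or greater, and errors as negative. */
int decode_succeeded(int decode_result) {
    return decode_result >= 0;
}
```

The value returned by build_decode_flags would be passed as the last argument to LV_SRE_Decode. When decode_succeeded is false, the negative result is what the paragraph above says to hand to LV_SRE_ReturnErrorString.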

See Also

LVSpeechPort::Decode

LV_SRE_WaitForEngineToIdle

(Deprecated in favor of LV_SRE_WaitForDecode)

Blocks the client application until the port is idle (not decoding).

int LV_SRE_WaitForEngineToIdle(HPORT hport, int MillisecondsToWait, int VoiceChannel);

Return Values

LV_SUCCESS

No errors or timeout; the engine is now idle.

LV_TIME_OUT

WaitForEngineToIdle's timeout was reached before the engine became idle.

Parameters

MillisecondsToWait

The number of milliseconds to wait before returning if the Speech Port does not become idle.

VoiceChannel

Which VoiceChannel to wait on, -1 waits on all the voice channels for the port.

Remarks

This function is deprecated in favor of LV_SRE_WaitForDecode. To achieve the same behavior as LV_SRE_WaitForDecode, use property PROP_EX_DECODE_TIMEOUT, and set MillisecondsToWait to TIMEOUT_INFINITE.

Some of the LV_SRE functions run asynchronously, in particular LV_SRE_Decode. LV_SRE_WaitForEngineToIdle is primarily useful when LV_SRE_Decode is called without LV_DECODE_BLOCK. In this case, LV_SRE_Decode returns immediately, but continues processing the voice channel's audio data in a separate thread. Since client applications will eventually need the results, they need a way to query the port to see whether LV_SRE_Decode has finished. LV_SRE_WaitForEngineToIdle waits up to the specified time for the engine to become idle; check the return value to confirm that the engine is idle, which indicates that decode results are available.

LV_SRE_WaitForEngineToIdle is also useful to ensure the engine has finished initializing, prior to calls to LV_SRE_Decode.
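
A common way to use a bounded wait is to call it in short slices, leaving room for other application work between checks. The sketch below does this against a stub LV_SRE_WaitForEngineToIdle with the documented signature, so that it compiles without the SDK. MockPort, the HPORT typedef, the numeric result values, and wait_in_slices are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the SDK so the sketch compiles on its own. */
typedef struct MockPort {
    int slices_until_idle;  /* how many waits time out before the port is idle */
    int waits_seen;         /* number of wait calls made so far */
} MockPort;
typedef MockPort *HPORT;                   /* assumption */
enum { LV_SUCCESS = 0, LV_TIME_OUT = -1 }; /* placeholder values */

int LV_SRE_WaitForEngineToIdle(HPORT hport, int MillisecondsToWait,
                               int VoiceChannel) {
    (void)MillisecondsToWait;
    (void)VoiceChannel;
    hport->waits_seen++;
    if (hport->slices_until_idle > 0) {
        hport->slices_until_idle--;
        return LV_TIME_OUT;
    }
    return LV_SUCCESS;
}

/* Waits in short slices. Returns LV_SUCCESS once the port is idle, or
   LV_TIME_OUT if it is still busy after max_slices attempts. */
int wait_in_slices(HPORT hport, int channel, int slice_ms, int max_slices) {
    assert(hport != NULL && slice_ms >= 0);
    for (int i = 0; i < max_slices; ++i) {
        if (LV_SRE_WaitForEngineToIdle(hport, slice_ms, channel) == LV_SUCCESS) {
            return LV_SUCCESS;
        }
        /* Application work between slices would go here. */
    }
    return LV_TIME_OUT;
}
```

Only after wait_in_slices returns LV_SUCCESS should the caller read decode results for that channel.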

See Also

LV_SRE_Decode

LVSpeechPort::WaitForEngineToIdle

LV_SRE_WaitForDecode

LV_SRE_GetNumberOfInterpretations

Returns the number of semantic interpretation results that were generated by the previous decode.

Function

int LV_SRE_GetNumberOfInterpretations(HPORT hport, int voicechannel)

Parameters

hport

A handle to the speech port.

voicechannel

The audio channel holding the decoded audio.

See Also

LV_SRE_CreateInterpretation

LV_SRE_GetInterpretationString

LVSpeechPort::GetNumberOfInterpretations (C++ API)

LV_SRE_CreateInterpretation

Returns a handle to a data structure representing the results of the semantic interpretation process. The handle must be released with LVInterpretation_Release when you are finished with it.

Function

H_SI LV_SRE_CreateInterpretation (HPORT hport, int voicechannel, int index)

Parameters

hport

A handle to the speech port

voicechannel

The channel that the decode took place on.

index

An utterance could give rise to multiple interpretations, particularly if the grammars involved are ambiguous. index ranges from 0 to LV_SRE_GetNumberOfInterpretations - 1.

Return Value

The return type is a handle to an interpretation object. The object is a representation of the ECMAScript object made by the matching grammar, using the Semantic Interpretation for Speech Recognition process. It also contains additional information such as the confidence score, matching grammar label, and the input sentence.

Remarks

The H_SI handle can be manipulated using the functions prefixed by "LVInterpretation_"

See Also

LV_SRE_GetNumberOfInterpretations

LV_SRE_GetInterpretationString

LVInterpretation C API

LVParseTree::GetInterpretation (C++ API)

LV_SRE_GetInterpretationString

Provides the user with a string representation of the semantic interpretation result data.

Function

const char* LV_SRE_GetInterpretationString(HPORT hport, int voicechannel, int index)

Parameters

hport

A handle to the speech port

voicechannel

The channel containing the decoded audio

index

A value between 0 and LV_SRE_GetNumberOfInterpretations - 1, inclusive.

Remarks

Logically, the interpretation string is the same as the result data contained in a semantic interpretation object.
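
A typical use is to walk every interpretation of the last decode, from index 0 up to the count minus one. The sketch below does so against stubs with the documented signatures, so that it compiles without the SDK. MockPort, the HPORT typedef, and print_interpretations are hypothetical. The stub returns NULL for an out-of-range index, which is an assumption because this reference does not say what the real function does in that case.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for the SDK so the sketch compiles on its own. */
typedef struct MockPort { const char *interp[4]; int count; } MockPort;
typedef MockPort *HPORT;  /* assumption: the real HPORT is opaque */

int LV_SRE_GetNumberOfInterpretations(HPORT hport, int voicechannel) {
    (void)voicechannel;
    return hport->count;
}

const char *LV_SRE_GetInterpretationString(HPORT hport, int voicechannel,
                                           int index) {
    (void)voicechannel;
    /* Out-of-range behavior is not documented; this stub returns NULL. */
    return (index >= 0 && index < hport->count) ? hport->interp[index] : NULL;
}

/* Prints each interpretation string on its own line and returns how many
   were printed. */
int print_interpretations(HPORT hport, int channel, FILE *out) {
    assert(hport != NULL && out != NULL);
    int n = LV_SRE_GetNumberOfInterpretations(hport, channel);
    int printed = 0;
    for (int i = 0; i < n; ++i) {
        const char *text = LV_SRE_GetInterpretationString(hport, channel, i);
        if (text == NULL) continue;  /* defensive; see the note above */
        fprintf(out, "interpretation %d: %s\n", i, text);
        printed++;
    }
    return printed;
}
```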

See Also

LV_SRE_GetNumberOfInterpretations

LV_SRE_CreateInterpretation

LVSpeechPort::GetInterpretationString (C++ API)

LV_SRE_GetNumberOfParses

Returns the number of parse trees that were generated by the previous decode.

Function

int LV_SRE_GetNumberOfParses(HPORT hport, int voicechannel)

Parameters

hport

A handle to the speech port.

voicechannel

The audio channel holding the decoded audio.

See Also

LV_SRE_CreateParseTree

LV_SRE_GetParseTreeString

Speech Parse Tree Introduction

LVSpeechPort::GetNumberOfParses (C++ API)

LV_SRE_CreateParseTree

Provides the user with a handle to a speech parse tree, representing the sentence structure of what was decoded by the Speech Engine, according to the active grammars. You must release the handle with LVParseTree_Release when you are finished with it.

Function

H_PARSE_TREE LV_SRE_CreateParseTree(HPORT hport, int voicechannel, int index)

Parameters

hport

The handle to the speech port.

voicechannel

The audio channel containing the input audio

index

It is possible to have more than one parse tree for an utterance (for instance if the grammar is ambiguous); this is the index of the tree

Return Value

A handle to a parse tree. The parse tree handle is manipulated with functions having the prefix "LVParseTree_".

Remark

Logically, a parse tree and the parse string returned to the user are the same. However, a speech parse tree makes it easy to search the parse tree for useful information.

See Also

LV_SRE_GetNumberOfParses

LV_SRE_GetParseTreeString

Parse Tree Introduction

LVParseTree C API

LVSpeechPort::GetParseTree (C++ API)

LV_SRE_GetParseTreeString

Provides the user with a string representation of a speech parse tree.

Function

const char* LV_SRE_GetParseTreeString(HPORT hport, int voicechannel, int index)

Parameters

hport

The handle to the speech port.

voicechannel

The audio channel containing the input audio

index

It is possible to have more than one parse tree possibility (for instance if the grammar is ambiguous); this is the index of the tree

Remark

Logically, a speech parse tree and the parse string returned to the user are the same. However, a speech parse tree makes it easy to search the parse tree for useful information. The parse tree string is based on the examples provided by the W3C SRGS specification

See Also

LV_SRE_GetNumberOfParses

LV_SRE_CreateParseTree

Parse Tree Introduction

LVSpeechPort::GetParseTreeString (C++ API)

LV_SRE_GetNumberOfConceptsReturned

Returns the number of concepts found in the last call to LV_SRE_Decode.

int LV_SRE_GetNumberOfConceptsReturned(HPORT hport,int VoiceChannel);

Return Values

The number of concepts found for this voice channel.

Parameters

VoiceChannel

The voice channel processed by LV_SRE_Decode.

See Also

LVSpeechPort::GetNumberOfConceptsReturned

LV_SRE_GetConcept

Returns one concept found in the last call to LV_SRE_Decode.

const char* LV_SRE_GetConcept(HPORT hport,int VoiceChannel, int Index);

Return Values

A null-terminated string representing the matched concept.

NULL indicates that Index was outside the possible range.

Parameters

VoiceChannel

The voice channel processed by LV_SRE_Decode.

Index

The recognition position of the concept, between 0 and (LV_SRE_GetNumberOfConceptsReturned - 1), inclusive.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine would return the concepts highlighted:

See Also

LVSpeechPort::GetConcept

LV_SRE_GetConceptScore

Returns the confidence score of a concept found in the last call to LV_SRE_Decode.

int LV_SRE_GetConceptScore(HPORT hport,int VoiceChannel, int Index);

Return Values

The confidence score of the matched concept. The range of possible values is 0 to 1000.

Parameters

VoiceChannel

The voice channel processed by LV_SRE_Decode.

Index

The recognition position of the concept, between 0 and (LV_SRE_GetNumberOfConceptsReturned - 1), inclusive.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the scores highlighted:

See Also

LVSpeechPort::GetConceptScore
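The concept functions are normally used together: ask how many concepts were returned, then read each concept and its score by index. The following is a minimal sketch rather than LumenVox sample code. Only the three function signatures come from this reference; the HPORT typedef, the canned concepts and scores, and the helper best_concept_index are assumptions made so the block compiles without the SDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. ---
   A real client would include the LumenVox header and link the library. */
typedef int HPORT;

static const char *demo_concepts[] = { "PURPLE", "BLUE" };
static const int   demo_scores[]   = { 870, 410 };

int LV_SRE_GetNumberOfConceptsReturned(HPORT hport, int VoiceChannel) {
    (void)hport; (void)VoiceChannel;
    return 2;
}

const char *LV_SRE_GetConcept(HPORT hport, int VoiceChannel, int Index) {
    (void)hport; (void)VoiceChannel;
    return (Index >= 0 && Index < 2) ? demo_concepts[Index] : NULL;
}

int LV_SRE_GetConceptScore(HPORT hport, int VoiceChannel, int Index) {
    (void)hport; (void)VoiceChannel;
    return (Index >= 0 && Index < 2) ? demo_scores[Index] : 0;
}
/* --- End of stand-ins. --- */

/* Prints every returned concept with its score and returns the index of the
   highest-scoring concept whose score is at least min_score, or -1 if none
   qualifies. Scores are documented as ranging from 0 to 1000. */
int best_concept_index(HPORT port, int vc, int min_score) {
    int total = LV_SRE_GetNumberOfConceptsReturned(port, vc);
    int best = -1;
    int best_score = -1;

    for (int i = 0; i < total; ++i) {
        const char *concept = LV_SRE_GetConcept(port, vc, i);
        if (concept == NULL) {
            continue; /* NULL means the index was out of range */
        }
        int score = LV_SRE_GetConceptScore(port, vc, i);
        printf("%d: %s (score %d)\n", i, concept, score);
        if (score >= min_score && score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```

A caller would typically pick a threshold suited to the application and treat a result of -1 as "no confident match", for example by re-prompting the speaker.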

LV_SRE_GetPhraseDecoded

Returns the decoded phrase (with BNF formatting retained) found in the last call to LV_SRE_Decode.

const char* LV_SRE_GetPhraseDecoded(HPORT hport, int VoiceChannel, int Index);

Return Values

A static string containing the decoded phrase.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the phrase to decode.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the phrases highlighted:

The main difference between LV_SRE_GetPhraseDecoded and LV_SRE_GetRawTextDecoded is in BNF formatting. LV_SRE_GetPhraseDecoded returns the decoded phrase as it was entered into the grammar. If the phrase contains BNF formatting, with selections, options, grouping, etc., then the return value preserves that formatting. LV_SRE_GetRawTextDecoded returns the decoded phrase after BNF formatting has been removed. Thus, LV_SRE_GetRawTextDecoded returns the phrase as a list of the words actually recognized, rather than the phrase as it was entered into the grammar.

See Also

LV_SRE_GetPhonemesDecoded

LV_SRE_GetRawTextDecoded

LVSpeechPort::GetPhraseDecoded

LV_SRE_GetPhonemesDecoded

Returns the actual phonemes found in a call to LV_SRE_Decode.

const char* LV_SRE_GetPhonemesDecoded(HPORT hport,int VoiceChannel, int Index);

Return Values

A null-terminated static string of the decoded phonemes.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the decoded phonemes.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the phonemes highlighted:

See Also

LV_SRE_GetPhraseDecoded

LV_SRE_GetRawTextDecoded

LVSpeechPort::GetPhonemes

LV_SRE_GetRawTextDecoded

Returns the decoded raw text (without BNF formatting) found in the last call to LV_SRE_Decode.

const char* LV_SRE_GetRawTextDecoded(HPORT hport,int VoiceChannel, int Index);

Return Values

A null-terminated string representing the decoded raw text.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the decoded raw text.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the raw text highlighted:

The main difference between LV_SRE_GetPhraseDecoded and LV_SRE_GetRawTextDecoded is in BNF formatting. LV_SRE_GetPhraseDecoded returns the decoded phrase as it was entered into the grammar. If the phrase contains BNF formatting, with selections, options, grouping, etc., then the return value preserves that formatting. LV_SRE_GetRawTextDecoded returns the decoded phrase after BNF formatting has been removed. Thus, LV_SRE_GetRawTextDecoded returns the phrase as a list of the words actually recognized, rather than the phrase as it was entered into the grammar.

See Also

LV_SRE_GetPhonemesDecoded

LV_SRE_GetPhraseDecoded

LVSpeechPort::GetRawTextDecoded

LV_SRE_GetVoiceChannelData

Sets the pointers to the voice channel's copy of the original preprocessed audio data.

int LV_SRE_GetVoiceChannelData(HPORT hport, int VoiceChannel, short** PCM, unsigned int* Samples);

Return Values

LV_SUCCESS

No errors; PCM and Samples have been successfully set.

LV_SOUND_CHANNEL_OUT_OF_RANGE

The voice channel specified is outside the valid range; possible values are 0-63, inclusive.

LV_BAD_HPORT

The Speech Engine is no longer running. This is the result of a ClosePort call or an unrecoverable Speech Engine error.

Parameters

VoiceChannel

The voice channel to process.

PCM

A pointer to a pointer that will be set to the post-processed audio data.

Samples

A pointer to an unsigned integer that will be set to the number of samples.

See Also

LVSpeechPort::GetVoiceChannelData
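Because the function hands back its results through two out-parameters, a short sketch may help. It fetches the channel's audio and computes the peak absolute sample value, a common sanity check for silent or clipped input. The HPORT typedef, the numeric value of LV_SUCCESS, the canned audio and the helper peak_amplitude are assumptions for illustration; only the LV_SRE_GetVoiceChannelData signature is taken from this reference.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
typedef int HPORT;
enum { LV_SUCCESS = 0 }; /* the real numeric value may differ */

static short demo_audio[] = { 0, 120, -340, 90 };

int LV_SRE_GetVoiceChannelData(HPORT hport, int VoiceChannel,
                               short **PCM, unsigned int *Samples) {
    (void)hport; (void)VoiceChannel;
    *PCM = demo_audio;
    *Samples = 4;
    return LV_SUCCESS;
}
/* --- End of stand-ins. --- */

/* Returns the largest absolute sample value in the channel's audio,
   or -1 if the audio could not be retrieved. */
int peak_amplitude(HPORT port, int vc) {
    short *pcm = NULL;
    unsigned int samples = 0;

    if (LV_SRE_GetVoiceChannelData(port, vc, &pcm, &samples) != LV_SUCCESS
        || pcm == NULL) {
        return -1;
    }

    int peak = 0;
    for (unsigned int i = 0; i < samples; ++i) {
        int value = pcm[i];
        if (value < 0) {
            value = -value; /* widen to int first so -32768 is safe */
        }
        if (value > peak) {
            peak = value;
        }
    }
    return peak;
}
```

The sketch only reads through the returned pointer. This reference describes the data as the voice channel's own copy, so the sketch does not free or modify it.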

LV_SRE_ReturnErrorString

Returns a description of an error code.

const char* LV_SRE_ReturnErrorString(int ReturnCode);

Return Values

A null-terminated static string describing the error code.

Parameters

ReturnCode

The error code.

Remarks

If the error code is an invalid error code, "Invalid Error Code" is returned.

See Also

LVSpeechPort::ReturnErrorString

LV_SRE_SetProperty

Sets various properties on the port.

int LV_SRE_SetProperty(HPORT hport, PROPERTIES Property, int Value);

Return Values

LV_SUCCESS

No errors; Property is set to Value.

LV_BAD_HPORT

hport was invalid.

LV_NOT_A_VALID_PROPERTY_VALUE

Value is invalid for the given property.

Parameters

HPort

The port's handle.

Property

Which property to modify.

Value

Property-dependent.

Remarks

Currently, only PROP_SAVE_SOUND_FILES is implemented. Setting Value to 1 causes the port to save request and answer files to disk; setting Value to 0 turns this feature off. The request and answer files are invaluable for troubleshooting and tuning applications, but they can quickly fill up a hard drive.

See Also

Properties

LVSpeechPort::SetProperty
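A small wrapper shows how the return code of LV_SRE_SetProperty can be checked and reported with LV_SRE_ReturnErrorString. This is a sketch under stated assumptions: the HPORT typedef, the PROPERTIES enum layout, the numeric error values, the stub behavior and the helper set_save_sound_files are all hypothetical stand-ins so that the block compiles without the SDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
typedef int HPORT;
typedef enum { PROP_SAVE_SOUND_FILES } PROPERTIES;
enum {
    LV_SUCCESS = 0,                     /* real numeric values may differ */
    LV_BAD_HPORT = -1,
    LV_NOT_A_VALID_PROPERTY_VALUE = -2
};

int LV_SRE_SetProperty(HPORT hport, PROPERTIES Property, int Value) {
    (void)Property;
    if (hport < 0) return LV_BAD_HPORT;
    if (Value != 0 && Value != 1) return LV_NOT_A_VALID_PROPERTY_VALUE;
    return LV_SUCCESS;
}

const char *LV_SRE_ReturnErrorString(int ReturnCode) {
    switch (ReturnCode) {
    case LV_SUCCESS:                    return "Success";
    case LV_BAD_HPORT:                  return "Bad port handle";
    case LV_NOT_A_VALID_PROPERTY_VALUE: return "Not a valid property value";
    default:                            return "Invalid Error Code";
    }
}
/* --- End of stand-ins. --- */

/* Turns saving of request and answer files on (enabled != 0) or off.
   Returns 1 on success, 0 on failure; failures are reported on stderr. */
int set_save_sound_files(HPORT port, int enabled) {
    int rc = LV_SRE_SetProperty(port, PROP_SAVE_SOUND_FILES, enabled ? 1 : 0);
    if (rc != LV_SUCCESS) {
        fprintf(stderr, "LV_SRE_SetProperty failed: %s\n",
                LV_SRE_ReturnErrorString(rc));
        return 0;
    }
    return 1;
}
```

Since saved files accumulate on disk, an application might enable the property only while tuning and switch it off again afterwards.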

LV_SRE_SetPropertyEx

Sets various properties for a port, client, voice channel, or grammar.

int LV_SRE_SetPropertyEx(HPORT hport, int propertyname, int valuetype, void* pvalue, int target, int index);

Return Values

LV_SUCCESS

No errors; property is set to the value pointed to by pvalue.

LV_INVALID_PROPERTY

The property does not exist.

LV_INVALID_PROPERTY_VALUE

The property value is invalid for the designated property (e.g. out of range).

LV_INVALID_PROPERTY_TARGET

The property cannot be set for the specified target.

LV_INVALID_PROPERTY_VALUE_TYPE

The property's type is incompatible with the declared type.

LV_INVALID_PROPERTY_TARGET_IDX

The target's index (grammar set, voicechannel) is out of range for this property.

Note: If more than one error occurs, which error code is returned is undefined.

Parameters

propertyname

Which property to modify.

valuetype

The value type of the property being set. Legal values are:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

PROP_EX_VALUE_TYPE_STRING

PROP_EX_VALUE_TYPE_FLOAT_PTR

Each property has a set of legal value types. See Properties.

pvalue

A pointer to the new value for propertyname. pvalue will be reinterpreted according to the value type provided.

target

The portion of the API that this property is set for. Legal values are:

PROP_EX_TARGET_PORT -- pvalue affects an entire speech port object

PROP_EX_TARGET_CHANNEL -- pvalue affects one voice channel in the speech port. The channel is specified by index.

PROP_EX_TARGET_GRAMMAR -- pvalue affects one grammar set in the speech port. The set is specified by index.

PROP_EX_TARGET_CLIENT -- pvalue is global, and affects all ports on the client.

Remarks

See Properties for a list of modifiable properties.

See Also

Properties

LVSpeechPort::SetPropertyEx (C++ API)
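Because pvalue is an untyped pointer that is reinterpreted according to valuetype, the two arguments must agree. The sketch below passes an integer by pointer with PROP_EX_VALUE_TYPE_INT_PTR. Much of it is assumed rather than confirmed: the C parameter list (HPORT first, no default arguments), the numeric values of all constants, and the idea that PROP_EX_MAX_NBEST_RETURNED accepts an integer pointer on the port target. Check Properties before relying on any of it.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
typedef int HPORT;
enum { LV_SUCCESS = 0, LV_INVALID_PROPERTY_VALUE_TYPE = -1 };
enum { PROP_EX_MAX_NBEST_RETURNED = 1 };
enum { PROP_EX_VALUE_TYPE_INT_PTR = 1 };
enum { PROP_EX_TARGET_PORT = 1 };

static int demo_max_nbest = 1; /* records what the stub was asked to store */

int LV_SRE_SetPropertyEx(HPORT hport, int propertyname, int valuetype,
                         void *pvalue, int target, int index) {
    (void)hport; (void)propertyname; (void)target; (void)index;
    if (valuetype != PROP_EX_VALUE_TYPE_INT_PTR || pvalue == NULL) {
        return LV_INVALID_PROPERTY_VALUE_TYPE;
    }
    demo_max_nbest = *(int *)pvalue;
    return LV_SUCCESS;
}
/* --- End of stand-ins. --- */

/* Asks the port to return up to max_alternatives n-best results.
   Returns the code from LV_SRE_SetPropertyEx unchanged. */
int set_max_nbest(HPORT port, int max_alternatives) {
    /* The value type says "pointer to int", so pass the address of an int. */
    return LV_SRE_SetPropertyEx(port,
                                PROP_EX_MAX_NBEST_RETURNED,
                                PROP_EX_VALUE_TYPE_INT_PTR,
                                &max_alternatives,
                                PROP_EX_TARGET_PORT,
                                0 /* index is not needed for a port target */);
}
```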

LV_SRE_StreamStart

Sets up a new stream.

int LV_SRE_StreamStart(HPORT hport);

Return Values

LV_SUCCESS

Stream set up.

LV_FAILURE

Parameters incorrectly set.

Parameters

HPort

The port's handle.

Remarks

Call this function to set up a new stream. You must call it again after calling LV_SRE_StreamStop or LV_SRE_StreamCancel, or after end-of-speech has been detected on the previous utterance.

See Also

LV_SRE_StreamSetParameter

LV_SRE_StreamStop

LV_SRE_StreamCancel

LV_SRE_StreamSendData

Sends a buffer of sound data to the stream.

int LV_SRE_StreamSendData(HPORT hport, void* SoundData, int SoundDataLength);

Return Values

LV_SUCCESS

Data accepted

LV_FAILURE

Stream not active or NULL sound data.

Parameters

HPort

The port's handle.

SoundData

Pointer to the memory buffer containing sound data.

SoundDataLength

Length in bytes of sound data.

Remarks

Used to do the actual streaming. Call this function with each sound data buffer. This call copies the sound data to an internal buffer and returns immediately. Processing of the sound data takes place on a background thread.

See Also

LV_SRE_StreamSetStateChangeCallBack

LV_SRE_StreamGetStatus

LV_SRE_StreamGetStatus

Returns status of stream.

int LV_SRE_StreamGetStatus(HPORT hport);

Return Values

Returns a stream status define. See Stream Status.

Parameters

HPort

The port's handle.

Remarks

Called to check the current state of stream.

See Also

LV_SRE_StreamSetStateChangeCallBack

LV_SRE_StreamGetLength

Returns length of sound data in stream buffer.

int LV_SRE_StreamGetLength(HPORT hport);

Return Values

Number of bytes in internal buffer for sound stream.

Parameters

HPort

The port's handle.

Remarks

This is the total number of bytes streamed. It does not include bytes sent before barge-in is detected (if STREAM_PARM_DETECT_BARGE_IN is active). It can be useful if the application wants to stop a post-barge-in stream after a certain amount of time (for example, to limit user speech to 10 seconds).

See Also

LV_SRE_StreamSetStateChangeCallBack
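The 10-second limit mentioned in the remarks comes down to converting a duration into a byte count. The following sketch shows that arithmetic on its own, without calling the API. The audio format is an assumption: the usage note supposes 8 kHz, 16-bit mono audio, which may not match your stream settings. Both helper names are hypothetical.

```c
#include <assert.h>

/* Number of bytes that correspond to `seconds` of uncompressed mono audio. */
long max_stream_bytes(int seconds, int sample_rate_hz, int bytes_per_sample) {
    return (long)seconds * sample_rate_hz * bytes_per_sample;
}

/* Returns 1 once bytes_streamed (for example, the value reported by
   LV_SRE_StreamGetLength) has reached the limit for the given duration,
   and 0 otherwise. */
int stream_limit_reached(long bytes_streamed, int seconds,
                         int sample_rate_hz, int bytes_per_sample) {
    return bytes_streamed >= max_stream_bytes(seconds, sample_rate_hz,
                                              bytes_per_sample);
}
```

With 8000 samples per second and 2 bytes per sample, 10 seconds corresponds to 160000 bytes; an application could call LV_SRE_StreamStop once stream_limit_reached returns 1.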

LV_SRE_StreamSetStateChangeCallBack

Sets up a callback to receive state change notifications for a stream.

int LV_SRE_StreamSetStateChangeCallBack(HPORT hport, LV_SRE_StreamStateChangeFn* fn, void* UserData);

Return Values

LV_SUCCESS

LV_BAD_HPORT

Parameters

HPort

The port's handle.

fn

Pointer to callback function to receive state change updates. See Stream Callback.

UserData

Application defined data sent back in callback.

Remarks

Each time a stream's status changes, this callback will be called.

See Also

LV_SRE_StreamStateChangeFn

LV_SRE_StreamGetStatus

LV_SRE_StreamStop

Stops stream and loads sound channel with streamed data.

int LV_SRE_StreamStop(HPORT hport);

Return Values

LV_SUCCESS

LV_BAD_HPORT

LV_FAILURE Stream not active.

Parameters

HPort

The port's handle.

Remarks

This function ends streaming and puts streamed data into the voice channel defined with the STREAM_PARM_VOICE_CHANNEL parameter. If the STREAM_PARM_AUTO_DECODE parameter is active, the decode will begin (non-blocking) when this function is called.

See Also

LV_SRE_StreamSetParameter

LV_SRE_StreamCancel

Stream Parameters
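The streaming calls form a simple life cycle: start the stream, send the audio one buffer at a time, then stop (to keep the audio) or cancel (to discard it). The sketch below feeds an in-memory buffer in fixed-size chunks. The HPORT typedef, the numeric return values, the stub bodies and the helper stream_audio are assumptions so the block compiles without the SDK; a real application would usually receive its chunks from a telephony or audio device.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
typedef int HPORT;
enum { LV_SUCCESS = 0, LV_FAILURE = -1 };

static int demo_stream_active = 0;
static int demo_bytes_received = 0;

int LV_SRE_StreamStart(HPORT hport) {
    (void)hport;
    demo_stream_active = 1;
    demo_bytes_received = 0;
    return LV_SUCCESS;
}

int LV_SRE_StreamSendData(HPORT hport, void *SoundData, int SoundDataLength) {
    (void)hport;
    if (!demo_stream_active || SoundData == NULL) return LV_FAILURE;
    demo_bytes_received += SoundDataLength;
    return LV_SUCCESS;
}

int LV_SRE_StreamStop(HPORT hport) {
    (void)hport;
    if (!demo_stream_active) return LV_FAILURE;
    demo_stream_active = 0;
    return LV_SUCCESS;
}

int LV_SRE_StreamCancel(HPORT hport) {
    (void)hport;
    if (!demo_stream_active) return LV_FAILURE;
    demo_stream_active = 0;
    return LV_SUCCESS;
}
/* --- End of stand-ins. --- */

/* Streams `length` bytes from `audio` in chunks of at most `chunk_size`
   bytes. On a send error the stream is cancelled and the error returned;
   otherwise the stream is stopped so the audio lands in the voice channel. */
int stream_audio(HPORT port, char *audio, int length, int chunk_size) {
    if (audio == NULL || length < 0 || chunk_size <= 0) {
        return LV_FAILURE;
    }

    int rc = LV_SRE_StreamStart(port);
    if (rc != LV_SUCCESS) {
        return rc;
    }

    for (int offset = 0; offset < length; offset += chunk_size) {
        int remaining = length - offset;
        int n = remaining < chunk_size ? remaining : chunk_size;
        rc = LV_SRE_StreamSendData(port, audio + offset, n);
        if (rc != LV_SUCCESS) {
            LV_SRE_StreamCancel(port); /* discard the partial audio */
            return rc;
        }
    }
    return LV_SRE_StreamStop(port);
}
```

The sketch always ends the stream itself. When end-of-speech detection is in use, the stream may finish on its own, so a production loop would also watch the stream status.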

LV_SRE_StreamCancel

Stops the stream and discards the sound data.

int LV_SRE_StreamCancel(HPORT hport);

Return Values

LV_SUCCESS

LV_BAD_HPORT

LV_FAILURE Stream not active.

Parameters

HPort

The port's handle.

Remarks

This kills the stream. It can be called to cancel a stream (particularly an auto-decode stream) in order to start a new stream.

See Also

LV_SRE_StreamStop

LV_SRE_StreamSetParameter

Sets a new value for a stream property.

int LV_SRE_StreamSetParameter(HPORT hport, int StreamParameter, unsigned long StreamParameterValue);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY StreamParameter does not exist.

LV_INVALID_PROPERTY_VALUE StreamParameterValue is out of range for the stream parameter.

Parameters

HPort

The port's handle.

StreamParameter

Stream parameter to change. See Stream Parameters.

StreamParameterValue

New stream parameter value.

Remarks

Sets a stream parameter value.

See Also

LV_SRE_StreamGetParameter

LV_SRE_StreamSetParameterToDefault

Stream Parameters

LV_SRE_StreamGetParameter

Gets the current value of a stream property.

int LV_SRE_StreamGetParameter(HPORT hport, int StreamParameter, unsigned long StreamParameterValue);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY StreamParameter does not exist.

LV_INVALID_PROPERTY_VALUE StreamParameterValue is out of range for the stream parameter.

Parameters

HPort

The port's handle.

StreamParameter

Stream parameter to read. See Stream Parameters.

StreamParameterValue

The current stream parameter value.

Remarks

Gets a stream parameter value.

See Also

LV_SRE_StreamSetParameter

LV_SRE_StreamSetParameterToDefault

Stream Parameters

LV_SRE_StreamSetParameterToDefault

Sets a stream property to its default value.

int LV_SRE_StreamSetParameterToDefault(HPORT hport, int StreamParameter);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY Stream parameter does not exist.

Parameters

HPort

The port's handle.

StreamParameter

Stream parameter to reset. See Stream Parameters.

Remarks

Sets a stream parameter value back to default value.

See Also

LV_SRE_StreamGetParameter

LV_SRE_StreamSetParameter

Stream Parameters

LV_SRE_GetNumberOfNBestAlternatives

Returns the number of n-best alternatives found by the engine.

int LV_SRE_GetNumberOfNBestAlternatives(HPORT hport, int voicechannel);

Return Values

Number of n-best alternatives. It will always be less than or equal to the value set for PROP_EX_MAX_NBEST_RETURNED.

Parameters

HPort

The port's handle.

voicechannel

The channel containing the decoded audio.

Remarks

Use the returned count as the upper bound when selecting an alternative with LV_SRE_SwitchToNBestAlternative.

See Also

PROP_EX_MAX_NBEST_RETURNED

LV_SRE_SwitchToNBestAlternative

LVSpeechPort::GetNumberOfNBestAlternatives

LV_SRE_SwitchToNBestAlternative

Switches the n-best alternative that is viewable. After this function call, results from subsequent result retrieval functions, such as LV_SRE_CreateInterpretation, will come from this n-best alternative.

int LV_SRE_SwitchToNBestAlternative(HPORT hport, int voicechannel, int index);

Return Values

LV_SUCCESS

LV_FAILURE The index is not valid.

Parameters

HPort

The port's handle.

voicechannel

The channel containing the decoded audio.

index

The index of the n-best alternative to switch to. It may be any value in the range [0, LV_SRE_GetNumberOfNBestAlternatives).

Remarks

Each alternative represents a distinct sentence. However, since some sentences can have multiple interpretations or multiple parses, it is possible that for some alternatives you will have multiple parse tree or interpretation objects returned. For this reason, you should get all results out as follows:

int nbest_count;
int nbest_total = LV_SRE_GetNumberOfNBestAlternatives(port, vc);
int interp_count;

for (nbest_count = 0; nbest_count < nbest_total; ++nbest_count)
{
    LV_SRE_SwitchToNBestAlternative(port, vc, nbest_count);
    int interp_total = LV_SRE_GetNumberOfInterpretations(port, vc);
    for (interp_count = 0; interp_count < interp_total; ++interp_count)
    {
        H_SI interp = LV_SRE_CreateInterpretation(port, vc, interp_count);
        /* do something with the interp */
        LVInterpretation_Release(interp);
    }
}

Even though more than one interpretation can live in a single n-best result, the same interpretation will not live in more than one n-best result. The lower scoring interpretations are pruned out.

See Also

LV_SRE_GetNumberOfNBestAlternatives

LVSpeechPort::SwitchToNBestAlternative

LV_SRE_WaitForDecode

Blocks the client application until the decode is finished.

int LV_SRE_WaitForDecode(HPORT hport, int VoiceChannel);

Return Values

LV_SUCCESS

No errors or timeout; the decode interaction is finished.

LV_TIME_OUT

The timeout value associated with PROP_EX_DECODE_TIMEOUT was exceeded before a result was returned from the Speech Engine. The decode was dropped from the Engine, and the LVSpeechPort may now start a new decode request.

Parameters

VoiceChannel

Which voice channel to wait on. Setting VoiceChannel equal to -1 causes a wait on all the voice channels for the port.

Remarks

Some of the API functions run asynchronously, in particular LV_SRE_Decode. LV_SRE_WaitForDecode is primarily useful when LV_SRE_Decode is called without LV_DECODE_BLOCK. In this case, LV_SRE_Decode returns immediately but continues processing the voice channel's audio data in a separate thread. Since client applications will eventually need the results, they need a way to find out whether LV_SRE_Decode has finished. LV_SRE_WaitForDecode waits up to the time set by PROP_EX_DECODE_TIMEOUT for the engine to become idle; check the return value to ensure the decode interaction is finished before attempting to retrieve answers from the speech port.

See Also

PROP_EX_DECODE_TIMEOUT

LV_SRE_Decode

LVSpeechPort::WaitForDecode
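The remarks above amount to a simple rule: do not read results until LV_SRE_WaitForDecode has returned LV_SUCCESS. A minimal sketch of that check follows. The HPORT typedef, the numeric values of the return codes, the stub and the helper decode_results_ready are assumptions so the block compiles without the SDK; the call that starts the decode is omitted because its signature is documented elsewhere.

```c
#include <assert.h>
#include <stdio.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
typedef int HPORT;
enum { LV_SUCCESS = 0, LV_TIME_OUT = -1 }; /* real values may differ */

static int demo_wait_result = LV_SUCCESS;  /* what the stub will report */

int LV_SRE_WaitForDecode(HPORT hport, int VoiceChannel) {
    (void)hport; (void)VoiceChannel;
    return demo_wait_result;
}
/* --- End of stand-ins. --- */

/* Blocks until the decode on voice channel vc finishes or times out.
   Returns 1 if results may now be retrieved, 0 otherwise. */
int decode_results_ready(HPORT port, int vc) {
    int rc = LV_SRE_WaitForDecode(port, vc);
    if (rc == LV_SUCCESS) {
        return 1;
    }
    if (rc == LV_TIME_OUT) {
        /* The decode was dropped; the port can accept a new request. */
        fprintf(stderr, "decode timed out on voice channel %d\n", vc);
    } else {
        fprintf(stderr, "wait failed with code %d\n", rc);
    }
    return 0;
}
```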

LVInterpretation C API Functions

LVInterpretation Summary

The LVInterpretation object contains a fully processed decode result. It includes:

The raw input the Speech Engine recognized

The name of the grammar that was matched

A confidence score for the interpretation

The semantic data object -- the result of processing the input sentence against the matching grammar, and executing the semantic tags in the sentence's parse tree

Use <LVSpeechPort.h> or <LV_SRE_Semantic.h>

Return Type | Function | Description
H_SI | LVInterpretation_Create(void) | Creates an empty LVInterpretation handle.
H_SI | LVInterpretation_CreateFromCopy(H_SI other) | Creates a copy of another LVInterpretation handle.
void | LVInterpretation_Release(H_SI hsi) | Destroys the LVInterpretation handle.
H_SI_DATA | LVInterpretation_GetResultData(H_SI hsi) | The result object, representing the end product of the semantic interpretation process.
const char* | LVInterpretation_GetResultName(H_SI hsi) | The name of the result data, according to the matching grammar.
const char* | LVInterpretation_GetGrammarLabel(H_SI hsi) | Returns the name of the grammar as it was provided to the speech port.
const char* | LVInterpretation_GetMode(H_SI hsi) | Returns the interaction mode for this interpretation.
const char* | LVInterpretation_GetLanguage(H_SI hsi) | Returns the language identifier for this interpretation.
const char* | LVInterpretation_GetInputSentence(H_SI hsi) | The sentence that generated this interpretation.
int | LVInterpretation_GetScore(H_SI hsi) | Confidence score for this interpretation.
const char* | LVInterpretation_GetTagFormat(H_SI hsi) | The tag format (interpretation scheme) that created the semantic data object.

LVSemanticData Summary

An LVSemanticData object is the result of the semantic interpretation process. A user's spoken input is combined with a grammar containing semantic tag instructions to create a compound object. An LVSemanticData object can be one of the following types:

SI_TYPE_INT -- A simple integer value

SI_TYPE_DOUBLE -- A double precision floating point value

SI_TYPE_BOOL -- An integer that is either 1 or 0

SI_TYPE_STRING -- A null-terminated character array.

SI_TYPE_OBJECT -- A structure containing one or more property-value pairs.

SI_TYPE_ARRAY -- An indexed collection of values.

SI_TYPE_NULL -- A null object.

Return Value | Function | Description
H_SI_DATA | LVSemanticData_CreateFromCopy(H_SI_DATA other) | Creates a new object from an old one. The new one will need to be released when no longer in use.
const char* | LVSemanticData_Print(H_SI_DATA data, int format) | Prints the data in XML or ECMAScript formats.
int | LVSemanticData_GetType(H_SI_DATA data) | Returns the type of the data.
const char* | LVSemanticData_GetString(H_SI_DATA data) | If the data is of type SI_TYPE_STRING, returns the string contents.
int | LVSemanticData_GetInt(H_SI_DATA data) | If the data is of type SI_TYPE_INT, returns the integer.
double | LVSemanticData_GetDouble(H_SI_DATA data) | If the data is of type SI_TYPE_DOUBLE, returns the double.
int | LVSemanticData_GetBool(H_SI_DATA data) | If the data is of type SI_TYPE_BOOL, returns 1 for true, 0 for false.
int | LVSemanticObject_GetNumberOfProperties(H_SI_DATA data) | If the data is of type SI_TYPE_OBJECT, returns the number of properties (member data) it contains.
const char* | LVSemanticObject_GetPropertyName(H_SI_DATA data, int i) | If the data is of type SI_TYPE_OBJECT, returns the name of the ith property.
int | LVSemanticObject_PropertyExists(H_SI_DATA data, const char* prop_name) | If the data is of type SI_TYPE_OBJECT, returns 1 if the object contains a value named prop_name, 0 otherwise.
H_SI_DATA | LVSemanticObject_GetPropertyValue(H_SI_DATA data, const char* prop_name) | If the data is of type SI_TYPE_OBJECT, returns the member data named prop_name.
int | LVSemanticArray_GetSize(H_SI_DATA data) | If the data is of type SI_TYPE_ARRAY, returns the number of elements in the array.
H_SI_DATA | LVSemanticArray_GetElement(H_SI_DATA data, int i) | If the data is of type SI_TYPE_ARRAY, returns the ith element in the array.
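Because objects and arrays can nest, semantic data is most naturally walked recursively: check the type, then either handle a scalar or descend into properties or elements. The sketch below counts the scalar values in a result. The function names and parameter lists follow the summary above, but everything else is assumed so the block compiles without the SDK: the numeric SI_TYPE_ values, the DemoNode structure standing in for the opaque H_SI_DATA handle, the sample data and the helper count_scalars.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- Stand-ins (assumptions) so the sketch builds without the SDK. --- */
enum { SI_TYPE_INT, SI_TYPE_DOUBLE, SI_TYPE_BOOL, SI_TYPE_STRING,
       SI_TYPE_OBJECT, SI_TYPE_ARRAY, SI_TYPE_NULL };

typedef struct DemoNode {
    int type;
    const char *name;            /* property name when held by an object */
    const struct DemoNode *kids; /* properties (object) or elements (array) */
    int kid_count;
} DemoNode;
typedef const DemoNode *H_SI_DATA; /* the real handle type is opaque */

int LVSemanticData_GetType(H_SI_DATA d) { return d->type; }
int LVSemanticObject_GetNumberOfProperties(H_SI_DATA d) { return d->kid_count; }
const char *LVSemanticObject_GetPropertyName(H_SI_DATA d, int i) {
    return d->kids[i].name;
}
H_SI_DATA LVSemanticObject_GetPropertyValue(H_SI_DATA d, const char *prop_name) {
    for (int i = 0; i < d->kid_count; ++i) {
        if (strcmp(d->kids[i].name, prop_name) == 0) return &d->kids[i];
    }
    return NULL;
}
int LVSemanticArray_GetSize(H_SI_DATA d) { return d->kid_count; }
H_SI_DATA LVSemanticArray_GetElement(H_SI_DATA d, int i) { return &d->kids[i]; }

/* Sample result: an object { hours: <int>, minutes: [ <int>, <int> ] } */
static const DemoNode demo_minutes[] = {
    { SI_TYPE_INT, NULL, NULL, 0 },
    { SI_TYPE_INT, NULL, NULL, 0 },
};
static const DemoNode demo_props[] = {
    { SI_TYPE_INT,   "hours",   NULL,         0 },
    { SI_TYPE_ARRAY, "minutes", demo_minutes, 2 },
};
static const DemoNode demo_root = { SI_TYPE_OBJECT, NULL, demo_props, 2 };
/* --- End of stand-ins. --- */

/* Counts the int, double, bool and string values reachable from data. */
int count_scalars(H_SI_DATA data) {
    int total = 0;
    if (data == NULL) {
        return 0;
    }
    switch (LVSemanticData_GetType(data)) {
    case SI_TYPE_INT:
    case SI_TYPE_DOUBLE:
    case SI_TYPE_BOOL:
    case SI_TYPE_STRING:
        return 1;
    case SI_TYPE_OBJECT: {
        int n = LVSemanticObject_GetNumberOfProperties(data);
        for (int i = 0; i < n; ++i) {
            const char *name = LVSemanticObject_GetPropertyName(data, i);
            total += count_scalars(LVSemanticObject_GetPropertyValue(data, name));
        }
        return total;
    }
    case SI_TYPE_ARRAY: {
        int n = LVSemanticArray_GetSize(data);
        for (int i = 0; i < n; ++i) {
            total += count_scalars(LVSemanticArray_GetElement(data, i));
        }
        return total;
    }
    default: /* SI_TYPE_NULL or an unknown type contributes nothing */
        return 0;
    }
}
```

The same switch structure works for other traversals, such as printing values or searching for a named property. Checking the type before calling a typed getter matters because each getter is documented only for its own type.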

API Functions

LVInterpretation: Creating, Copying and Releasing

LVInterpretation objects are fully copyable.

Functions

H_SI LVInterpretation_Create(void)

H_SI LVInterpretation_CreateFromCopy(H_SI other_si)

void LVInterpretation_Copy(H_SI hsi, H_SI other_si)

void LVInterpretation_Release(H_SI hsi)

Parameters

hsi

The interpretation handle being copied into, or being released

other_si

The interpretation handle whose contents are being copied.

Remarks

Any new handle given to you via Create or CreateFromCopy must be released. Also, any handle given to you by the speech port through LV_SRE_CreateInterpretation must be released.

Example

HPORT Port;
H_SI Interp;

// open the port and do a decode
// ...
// when the decode is finished, grab an interpretation handle
Interp = LV_SRE_CreateInterpretation(Port, voicechannel, index);

// start using the interpretation data.
// ...
// When you are done with it, release it.
LVInterpretation_Release(Interp);

See Also

Constructing, Copying and Destroying an LVInterpretation Object (C++ API)

LVInterpretation_GetResultData

Returns a handle for the semantic data generated by user input and a matching grammar. The returned handle does not allocate any additional memory, so do not release it.

Function

H_SI_DATA LVInterpretation_GetResultData(H_SI hsi)

Returns

A handle to the results of a semantic interpretation process.

Parameters

hsi

An interpretation handle.

Remarks

The semantic data handle provided to the user via this function is owned by the interpretation handle hsi. It will be released when hsi is released.

See Also

LVSemanticData C API

LVInterpretation::ResultData (C++ API)

LVInterpretation_GetResultName

Returns the name of the result data for this interpretation. The result name is usually the root rule of the matching grammar for this interpretation.

Function

const char* LVInterpretation_GetResultName (H_SI hsi)

Parameters

hsi

An interpretation handle.

See Also

LVInterpretation::ResultName (C++ API)

LVInterpretation_GetLanguage

Returns the language identifier of the grammar that generated this interpretation.

Function

const char* LVInterpretation_GetLanguage(H_SI hsi)

Parameters

hsi

An interpretation handle.

Returns

An RFC 3066 language identifier, such as "en-US" for United States English, or "fr" for French.
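
When only the primary language matters (for example, to pick a prompt set), the identifier can be split at the first hyphen. The following is a minimal sketch, not part of the LumenVox API: primary_language is a hypothetical helper, and the string passed to it stands in for the value returned by LVInterpretation_GetLanguage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the LumenVox API): copies the primary
   language subtag of an RFC 3066 identifier such as "en-US" into out.
   Returns the number of characters written, or 0 if tag is NULL or the
   subtag does not fit in the buffer. */
size_t primary_language(const char *tag, char *out, size_t out_size)
{
    size_t len;

    /* The caller must always supply a usable output buffer. */
    assert(out != NULL && out_size > 0);

    out[0] = '\0';
    if (tag == NULL)
        return 0;

    /* The primary subtag ends at the first '-' or at the end of the string. */
    len = strcspn(tag, "-");
    if (len >= out_size)
        return 0;

    memcpy(out, tag, len);
    out[len] = '\0';
    return len;
}
```

With this helper, "en-US" yields "en", and an identifier without a region, such as "fr", is copied unchanged.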

See Also

LVInterpretation::Language ( C++ API )

LVInterpretation_GetMode

Returns the interaction mode that created the interpretation.

Function

const char* LVInterpretation_GetMode(H_SI hsi)

Parameters

hsi

An interpretation handle.

Returns

"voice" or "dtmf"

See Also

LVInterpretation::Mode (C++ API)

LVInterpretation_GetInputSentence

Returns the input that was fed to the matching grammar to create this interpretation. It may represent the speech the engine recognized, or a DTMF sequence.

Function

const char* LVInterpretation_GetInputSentence(H_SI hsi)

Parameters

hsi

An interpretation handle

See Also

LVInterpretation::InputSentence (C++ API)

LVInterpretation_GetGrammarLabel

Returns the name of the grammar that generated this interpretation.

Function

const char* LVInterpretation_GetGrammarLabel (H_SI hsi)

Parameters

hsi

An interpretation handle.

Remarks

LVInterpretation_GetGrammarLabel will always return the name of one of the grammars you activated for decode. If the active grammar had an integer label, then the returned label will be a string representation of that integer.
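
Because an integer label comes back as its string representation, an application that activated grammars by integer may want to convert the label back. The following is a minimal sketch under that assumption: label_as_int is a hypothetical helper, not part of the LumenVox API, and the string passed to it stands in for the value returned by LVInterpretation_GetGrammarLabel.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the LumenVox API): returns 1 and stores
   the value in *out when label parses as a base-10 integer with no trailing
   characters; returns 0 for NULL, empty, non-numeric, or out-of-range
   labels and leaves *out untouched. */
int label_as_int(const char *label, int *out)
{
    char *end = NULL;
    long value;

    assert(out != NULL);

    if (label == NULL || *label == '\0')
        return 0;

    errno = 0;
    value = strtol(label, &end, 10);
    if (errno == ERANGE || *end != '\0' || value < INT_MIN || value > INT_MAX)
        return 0;

    *out = (int)value;
    return 1;
}
```

A label such as "42" converts to 42, while a named label such as "main_menu" is reported as non-numeric, so the caller can fall back to comparing strings.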

See Also

LVInterpretation::GrammarLabel ( C++ API )

LVInterpretation_GetScore

Returns a confidence score for this interpretation.

Function

int LVInterpretation_GetScore(H_SI hsi)

Parameters

hsi

An interpretation handle

Returns

A number from 0 to 1000. Higher numbers indicate that the speech port has more confidence in this interpretation.
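
A common way to use a score on this scale is to compare it against two application-chosen thresholds. The following is a minimal sketch, not part of the LumenVox API: the Decision enum, classify_score, and both thresholds are hypothetical, and the score argument stands in for the value returned by LVInterpretation_GetScore.

```c
#include <assert.h>

/* Hypothetical application-side decision; not part of the LumenVox API. */
typedef enum { DECISION_REJECT, DECISION_CONFIRM, DECISION_ACCEPT } Decision;

/* score:      a value in the documented 0..1000 range.
   confirm_at: scores at or above this are worth confirming with the caller.
   accept_at:  scores at or above this are accepted without confirmation.
   Both thresholds are assumptions the application must tune itself. */
Decision classify_score(int score, int confirm_at, int accept_at)
{
    assert(0 <= confirm_at && confirm_at <= accept_at && accept_at <= 1000);

    if (score >= accept_at)
        return DECISION_ACCEPT;
    if (score >= confirm_at)
        return DECISION_CONFIRM;
    return DECISION_REJECT;
}
```

For example, classify_score(550, 400, 700) yields DECISION_CONFIRM. The values 400 and 700 are placeholders, not recommendations.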

See Also

LVInterpretation::Score (C++ API)

LVInterpretation_GetTagFormat

Returns the name of the tag format declared in the matching grammar for this interpretation. The tag format determines the semantic interpretation scheme.

Function

const char* LVInterpretation_GetTagFormat(H_SI hsi)

Parameters

hsi

An interpretation handle.

See Also

LVInterpretation::TagFormat (C++ API)

LVSemanticData_Release

Releases the memory used by an H_SI_DATA handle.

Function

void LVSemanticData_Release(H_SI_DATA h_si_data)

Parameters

h_si_data

Semantic Data Handle.

LVSemanticData_CreateFromCopy

Copies the contents of another handle into a new handle and returns the new handle. This function allocates memory for the new handle, so the caller must release the new handle.

Function

H_SI_DATA LVSemanticData_CreateFromCopy(H_SI_DATA h_si_data)

Return Value

Non-zero

Successful.

NULL

Copying failed.

Parameters

h_si_data

Semantic data handle.

LVSemanticData_Print

Returns a string describing the contents of a semantic data handle. The function can return XML or ECMAScript formatted text.

Function

const char* LVSemanticData_Print(H_SI_DATA h_si_data, int format)

Return Values

A pointer to the string which contains the print out information.

Parameters

h_si_data

Semantic data handle.

format

The format type.

Remark

The string contents are stored with the semantic data handle, and will be released when the handle is released.

LVSemanticData_GetType

Returns the underlying data type of a given H_SI_DATA handle.

Function

int LVSemanticData_GetType(H_SI_DATA h_si_data)

Return Value

One of seven semantic data types.

Parameters

h_si_data

Semantic data handle.

LVSemanticData_GetString

Returns the string contained in a given handle. This function assumes that the handle is of type SI_TYPE_STRING. If the handle is of any other type, this function always returns NULL.

Function

const char* LVSemanticData_GetString(H_SI_DATA h_si_data)

Return Values

NULL

Either the handle is not of type SI_TYPE_STRING, or some error occurred.

Other

A pointer to a buffer containing the string.

Parameters

h_si_data

Semantic data handle.

LVSemanticData_GetDouble

Returns the double precision floating point value contained in the given semantic data handle. This function assumes that the handle is of type SI_TYPE_DOUBLE. If the handle is of any other type, this function always returns 0.0.

Function

double LVSemanticData_GetDouble(H_SI_DATA h_si_data)

Return Values

A double.

Parameters

h_si_data

Semantic data handle.

LVSemanticData_GetInt

Returns the integer value contained in a given semantic data handle. This function assumes that the handle is of type SI_TYPE_INT. If the handle is of any other type, this function always returns 0.

Function

int LVSemanticData_GetInt(H_SI_DATA h_si_data)

Return Values

An integer value.

Parameters

h_si_data

Semantic data handle.

LVSemanticData_GetBool

Returns an integer value contained in a given handle. A non-zero value represents true, and zero represents false. This function assumes that the semantic data handle being passed in is of type SI_TYPE_BOOL. If the handle is of any other type, this function always returns 0 (false).

Function

int LVSemanticData_GetBool(H_SI_DATA h_si_data)

Return Values

An integer value.

Parameters

h_si_data

Semantic data handle.
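
Because each scalar accessor returns 0, 0.0, or NULL on a type mismatch, a legitimate value of 0 cannot be told apart from a mismatch unless LVSemanticData_GetType is checked first. The following is a minimal sketch of that pattern. The stand-in section exists only so the sketch compiles without the SDK: the numeric values of the SI_TYPE_* constants and the layout behind H_SI_DATA are invented, and format_scalar is a hypothetical helper. Only the function names and signatures come from this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ---- Stand-ins (assumptions). With the real SDK, delete this section
   and include <LVSpeechPort.h> instead. ---- */
enum { SI_TYPE_BOOL, SI_TYPE_INT, SI_TYPE_DOUBLE, SI_TYPE_STRING, SI_TYPE_OBJECT };
typedef struct MockData { int type; int i; double d; const char *s; } MockData;
typedef const MockData *H_SI_DATA;

int LVSemanticData_GetType(H_SI_DATA h) { return h->type; }
int LVSemanticData_GetBool(H_SI_DATA h) { return h->type == SI_TYPE_BOOL ? h->i : 0; }
int LVSemanticData_GetInt(H_SI_DATA h) { return h->type == SI_TYPE_INT ? h->i : 0; }
double LVSemanticData_GetDouble(H_SI_DATA h) { return h->type == SI_TYPE_DOUBLE ? h->d : 0.0; }
const char *LVSemanticData_GetString(H_SI_DATA h) { return h->type == SI_TYPE_STRING ? h->s : NULL; }
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: renders a scalar value as text. Returns 1 on success
   and 0 when the handle holds a type this sketch does not cover (object,
   array, and so on) or when the string accessor reports an error. */
int format_scalar(H_SI_DATA h, char *out, size_t out_size)
{
    const char *text;

    assert(h != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    /* Check the type first, then call only the matching accessor. */
    switch (LVSemanticData_GetType(h)) {
    case SI_TYPE_BOOL:
        snprintf(out, out_size, "%s", LVSemanticData_GetBool(h) ? "true" : "false");
        return 1;
    case SI_TYPE_INT:
        snprintf(out, out_size, "%d", LVSemanticData_GetInt(h));
        return 1;
    case SI_TYPE_DOUBLE:
        snprintf(out, out_size, "%g", LVSemanticData_GetDouble(h));
        return 1;
    case SI_TYPE_STRING:
        text = LVSemanticData_GetString(h);
        if (text == NULL)
            return 0;
        snprintf(out, out_size, "%s", text);
        return 1;
    default:
        return 0;
    }
}
```

With the type check in place, an integer handle holding 0 formats as "0" and reports success, whereas an object handle reports failure instead of silently producing a zero.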

LVSemanticObject_GetNumberOfProperties

If a semantic data handle is of type SI_TYPE_OBJECT this function returns the number of elements in this object. Otherwise, it returns -1.

Function

int LVSemanticObject_GetNumberOfProperties(H_SI_DATA h_si_data)

Return Value

The number of elements in the object.

Parameters

h_si_data

Semantic data handle.

LVSemanticObject_GetPropertyName

If a handle is of type SI_TYPE_OBJECT, this function returns the name of a property of the object. Otherwise this function returns NULL. Usually, the user obtains the number of properties by calling LVSemanticObject_GetNumberOfProperties, then gets each property name in sequence.

Function

const char* LVSemanticObject_GetPropertyName(H_SI_DATA h_si_data, int index)

Return Values

Non-NULL pointer

A pointer to a buffer containing the name of the property specified by index.

NULL

Either the handle is not of SI_TYPE_OBJECT type, or the index exceeds the total number of properties in this object.

Parameters

h_si_data

Semantic data handle.

index

The index of the property you are inspecting. The index begins at 0. If the index is greater than or equal to the value returned by LVSemanticObject_GetNumberOfProperties, this function will return NULL.

LVSemanticObject_GetPropertyValue

If the handle is of SI_TYPE_OBJECT type, this function returns the handle to the semantic data associated with the property name in the object. If the handle is not of SI_TYPE_OBJECT type, this function always returns NULL. This function does not allocate memory for the returned handle, so do not try to release it.

Function

H_SI_DATA LVSemanticObject_GetPropertyValue(H_SI_DATA h_si_data, const char *property_name)

Return Values

Non-zero value

A handle to the semantic data associated with the property name in the object.

NULL

The property name does not exist in the object, or the handle is not of SI_TYPE_OBJECT type.

Parameters

h_si_data

Semantic data handle.

property_name

A string containing the property name.

LVSemanticObject_PropertyExists

If a handle is of SI_TYPE_OBJECT type, this function returns a boolean value indicating whether the property name exists in the object. If the handle is not of SI_TYPE_OBJECT type, this function always returns 0 (false).

Function

int LVSemanticObject_PropertyExists(H_SI_DATA h_si_data, const char *property_name)

Return Values

1

The property name exists in the object.

0

The property name does not exist in the object, or the handle is not of SI_TYPE_OBJECT type.

Parameters

h_si_data

A semantic data handle.

property_name

A string containing the property name.
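
The object functions are typically used together: get the property count, fetch each name by index, then look up each value by name. The following is a minimal sketch of that loop. The stand-in section exists only so the sketch compiles without the SDK: the SI_TYPE_* values and the layout behind H_SI_DATA are invented, and count_string_properties is a hypothetical helper. Only the function names and signatures come from this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- Stand-ins (assumptions). With the real SDK, delete this section
   and include <LVSpeechPort.h> instead. ---- */
enum { SI_TYPE_STRING, SI_TYPE_OBJECT };
typedef struct MockData MockData;
typedef const MockData *H_SI_DATA;
struct MockData {
    int type;
    const char *s;             /* used when type == SI_TYPE_STRING */
    int count;                 /* used when type == SI_TYPE_OBJECT */
    const char *const *names;  /* property names, in index order   */
    const MockData *values;    /* property values, in index order  */
};

int LVSemanticData_GetType(H_SI_DATA h) { return h->type; }

int LVSemanticObject_GetNumberOfProperties(H_SI_DATA h)
{
    return h->type == SI_TYPE_OBJECT ? h->count : -1;
}

const char *LVSemanticObject_GetPropertyName(H_SI_DATA h, int index)
{
    if (h->type != SI_TYPE_OBJECT || index < 0 || index >= h->count)
        return NULL;
    return h->names[index];
}

H_SI_DATA LVSemanticObject_GetPropertyValue(H_SI_DATA h, const char *property_name)
{
    int i;
    if (h->type != SI_TYPE_OBJECT)
        return NULL;
    for (i = 0; i < h->count; ++i)
        if (strcmp(h->names[i], property_name) == 0)
            return &h->values[i];
    return NULL;
}
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: counts the properties of obj whose value is a string.
   Returns -1 if obj is not of SI_TYPE_OBJECT type. */
int count_string_properties(H_SI_DATA obj)
{
    int n, i, strings = 0;

    assert(obj != NULL);

    n = LVSemanticObject_GetNumberOfProperties(obj);
    if (n < 0)
        return -1;

    for (i = 0; i < n; ++i) {
        const char *name = LVSemanticObject_GetPropertyName(obj, i);
        H_SI_DATA value = name ? LVSemanticObject_GetPropertyValue(obj, name) : NULL;

        /* value is owned by obj, so it is never released here. */
        if (value != NULL && LVSemanticData_GetType(value) == SI_TYPE_STRING)
            ++strings;
    }
    return strings;
}
```

The -1 returned by LVSemanticObject_GetNumberOfProperties is checked before the loop, so a non-object handle is reported rather than treated as an empty object.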

LVSemanticArray_GetSize

If a handle is of SI_TYPE_ARRAY type, this function returns the number of elements in the array. Otherwise this function returns -1.

Function

int LVSemanticArray_GetSize(H_SI_DATA h_si_data)

Return Values

Non-negative value

The number of elements in the array.

-1

Either the handle is not of SI_TYPE_ARRAY type, or some error occurred.

Parameters

h_si_data

Semantic data handle.

LVSemanticArray_GetElement

If the handle is of SI_TYPE_ARRAY type, this function returns a handle to the semantic data specified by the index. If the handle is not of SI_TYPE_ARRAY type, this function always returns NULL. This function does not allocate memory for the new handle, so do not try to release it.

Function

H_SI_DATA LVSemanticArray_GetElement(H_SI_DATA h_si_data, int index)

Return Values

Non-zero value

The handle to the semantic data specified by the index in the array.

0

The index exceeds the number of elements, or the handle is not of SI_TYPE_ARRAY type.

Parameters

h_si_data

Semantic data handle.

index

The index begins at 0. If the index is greater than or equal to the value returned by LVSemanticArray_GetSize, this function will return NULL.
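
Arrays are walked by index, from 0 up to the value returned by LVSemanticArray_GetSize. The following is a minimal sketch of that loop. The stand-in section exists only so the sketch compiles without the SDK: the SI_TYPE_* values and the layout behind H_SI_DATA are invented, and sum_int_elements is a hypothetical helper. Only the function names and signatures come from this reference.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Stand-ins (assumptions). With the real SDK, delete this section
   and include <LVSpeechPort.h> instead. ---- */
enum { SI_TYPE_INT, SI_TYPE_ARRAY };
typedef struct MockData MockData;
typedef const MockData *H_SI_DATA;
struct MockData { int type; int i; int size; const MockData *elements; };

int LVSemanticData_GetType(H_SI_DATA h) { return h->type; }
int LVSemanticData_GetInt(H_SI_DATA h) { return h->type == SI_TYPE_INT ? h->i : 0; }
int LVSemanticArray_GetSize(H_SI_DATA h) { return h->type == SI_TYPE_ARRAY ? h->size : -1; }

H_SI_DATA LVSemanticArray_GetElement(H_SI_DATA h, int index)
{
    if (h->type != SI_TYPE_ARRAY || index < 0 || index >= h->size)
        return NULL;
    return &h->elements[index];
}
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: adds up the integer elements of arr into *sum and
   returns how many integer elements were found. Elements of other types
   are skipped. Returns -1, leaving *sum untouched, if arr is not an array. */
int sum_int_elements(H_SI_DATA arr, long *sum)
{
    int size, i, found = 0;

    assert(arr != NULL && sum != NULL);

    size = LVSemanticArray_GetSize(arr);
    if (size < 0)
        return -1;

    *sum = 0;
    for (i = 0; i < size; ++i) {
        /* The element handle is owned by arr, so it is never released here. */
        H_SI_DATA element = LVSemanticArray_GetElement(arr, i);
        if (element != NULL && LVSemanticData_GetType(element) == SI_TYPE_INT) {
            *sum += LVSemanticData_GetInt(element);
            ++found;
        }
    }
    return found;
}
```

Checking each element's type before calling LVSemanticData_GetInt keeps non-integer elements from being counted as zeros.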

LVParseTree C API functions

API Functions

Creating, Copying and Releasing a LVParseTree Handle

LVParseTree objects are fully copyable and assignable.

Functions

H_PARSE_TREE LVParseTree_Create()

H_PARSE_TREE LVParseTree_CreateFromCopy(H_PARSE_TREE Other)

void LVParseTree_Copy (H_PARSE_TREE Tree, H_PARSE_TREE Other)

void LVParseTree_Release (H_PARSE_TREE Tree)

Parameters

Tree

A handle to a parse tree being released or copied into

Other

A handle to a parse tree being copied.

Remarks

CreateFromCopy and Copy both perform deep copies on the handles in question. Both handles will have to be released after either function call to release all allocated memory. Tree handles given to the user via LV_SRE_CreateParseTree must also be released.

Example

HPORT Port;

//open the port and do a decode
//...
//when the decode is finished, grab a parse tree handle
H_PARSE_TREE Tree = LV_SRE_CreateParseTree(Port, voicechannel, index);

//start using the tree.
//...
//When you are done with it, release it.
LVParseTree_Release(Tree);

See Also

Constructing, Copying and Destroying an LVParseTree Object (C++ API)

LVParseTree_GetGrammarLabel

Returns the name of the grammar that generated this tree.

Function

const char* LVParseTree_GetGrammarLabel (H_PARSE_TREE Tree)

Parameters

Tree

A handle to the parse tree.

Remarks

LVParseTree_GetGrammarLabel( ) will always return the name of one of the grammars you activated for decode. If the active grammar had an integer label, then the returned label will be a string representation of that integer.

See Also

LVParseTree::GrammarLabel ( C++ API )

LVParseTree_GetLanguage

Returns the language identifier of the grammar that generated this tree.

Function

const char* LVParseTree_GetLanguage(H_PARSE_TREE Tree)

Parameters

Tree

A handle to a parse tree.

Returns

An RFC 3066 language identifier, such as "en-US" for United States English, or "fr" for French.

See Also

LVParseTree::Language ( C++ API )

LVParseTree_GetMode

Returns the interaction mode that created the tree.

Function

const char* LVParseTree_GetMode(H_PARSE_TREE Tree)

Parameters

Tree

A handle to a parse tree.

Returns

"voice" or "dtmf"

See Also

LVParseTree::Mode (C++ API)

LVParseTree_GetTagFormat

Returns the name of the tag format declared in the matching grammar for this tree.

Function

const char* LVParseTree_GetTagFormat(H_PARSE_TREE Tree)

Parameters

Tree

A handle to a parse tree

See Also

LVParseTree::TagFormat (C++ API)

LVParseTree_GetRoot

Gets the root parse tree node.

Function

H_PARSE_TREE_NODE LVParseTree_GetRoot(H_PARSE_TREE Tree);

Parameters

Tree

Handle to a parse tree.

Return Values

An H_PARSE_TREE_NODE handle representing the top-level rule of the matching grammar.

Remarks

This node will always be a rule node (i.e., it will always satisfy LVParseTree_Node_IsRule(root) == 1). If the matching grammar specified a root rule, then this node will always represent that rule.

See Also

LVParseTree::Root ( C++ API )

LVParseTree_CreateIteratorBegin and LVParseTree_CreateIteratorEnd

LVParseTree_CreateIteratorBegin and LVParseTree_CreateIteratorEnd provide iterators for visiting every node in the tree in a top-to-bottom, left-to-right descent. They are also the basis for the Tag and Terminal iterators.

Functions

H_PARSE_TREE_ITR LVParseTree_CreateIteratorBegin(H_PARSE_TREE Tree)

H_PARSE_TREE_ITR LVParseTree_CreateIteratorEnd(H_PARSE_TREE Tree)

Parameters

Tree

Handle to a parse tree.

Example

The following code prints out every node in a parse tree.

H_PARSE_TREE_ITR Itr;
H_PARSE_TREE_ITR End;
H_PARSE_TREE_NODE Node;

Itr = LVParseTree_CreateIteratorBegin(Tree);
End = LVParseTree_CreateIteratorEnd(Tree);

while (!LVParseTree_Iterator_AreEqual(Itr, End))
{
    Node = LVParseTree_Iterator_GetNode(Itr);

    /* indent one tab per level of depth in the tree */
    for (int i = 0; i < LVParseTree_Node_GetLevel(Node); ++i)
        printf("\t");

    if (LVParseTree_Node_IsRule(Node))
        printf("$%s:\n", LVParseTree_Node_GetRuleName(Node));
    if (LVParseTree_Node_IsTag(Node))
        printf("{%s}\n", LVParseTree_Node_GetText(Node));
    if (LVParseTree_Node_IsTerminal(Node))
        printf("\"%s\"\n", LVParseTree_Node_GetText(Node));

    LVParseTree_Iterator_Advance(Itr);
}

LVParseTree_Iterator_Release(Itr);
LVParseTree_Iterator_Release(End);

/* Note: Node handles don't get released; they are part of the tree,
   and the tree releases them when it gets released */

If the grammar was the top level navigation example grammar, and the engine recognized "go back", then the above code would print out:

$directive:
	"go"
	"back"
	{$ = "APPLICATION_BACK"}

See Also

LVParseTree::Begin and LVParseTree::End (C++ API)

LVParseTree_CreateTerminalIteratorBegin and LVParseTree_CreateTerminalIteratorEnd

LVParseTree_CreateTerminalIteratorBegin and LVParseTree_CreateTerminalIteratorEnd provide access to the "terminals" of the tree. Terminals are the words and phrases in your grammar, so a TerminalIterator gives you access to the exact words the SRE heard a speaker say to match a grammar, in the order that the SRE heard those words.

Functions

H_PARSE_TREE_TERMINAL_ITR LVParseTree_CreateTerminalIteratorBegin(H_PARSE_TREE Tree)

H_PARSE_TREE_TERMINAL_ITR LVParseTree_CreateTerminalIteratorEnd(H_PARSE_TREE Tree)

Parameters

Tree

Handle to a parse tree.

Example

The following code prints out the sentence the SRE heard, with a word-level confidence score attached to each word.

H_PARSE_TREE_TERMINAL_ITR Itr;
H_PARSE_TREE_TERMINAL_ITR End;
H_PARSE_TREE_NODE Node;

Itr = LVParseTree_CreateTerminalIteratorBegin(Tree);
End = LVParseTree_CreateTerminalIteratorEnd(Tree);

while (!LVParseTree_TerminalIterator_AreEqual(Itr, End))
{
    Node = LVParseTree_TerminalIterator_GetNode(Itr);
    printf("\"%s\":(%i)\n", LVParseTree_Node_GetText(Node),
           LVParseTree_Node_GetScore(Node));
    LVParseTree_TerminalIterator_Advance(Itr);
}
printf("\n");

LVParseTree_TerminalIterator_Release(Itr);
LVParseTree_TerminalIterator_Release(End);

/* Note: Node handles don't get released; they are part of the tree,
   and the tree releases them when it gets released */

So if the grammar being used was the top level navigation example grammar, and the SRE recognized "go back", then the output of the above code might look like:

"go":(850)
"back":(901)

See Also

LVParseTree::TerminalsBegin and LVParseTree::TerminalsEnd (C++ API)

LVParseTree_CreateTagIteratorBegin and LVParseTree_CreateTagIteratorEnd

LVParseTree_CreateTagIteratorBegin and LVParseTree_CreateTagIteratorEnd provide iterators for visiting the tags in the tree's body.

Functions

H_PARSE_TREE_TAG_ITR LVParseTree_CreateTagIteratorBegin(H_PARSE_TREE Tree)

H_PARSE_TREE_TAG_ITR LVParseTree_CreateTagIteratorEnd(H_PARSE_TREE Tree)

Parameters

Tree

Handle to a parse tree.

Example

The following code prints out every tag in a parse tree.

H_PARSE_TREE_TAG_ITR Itr;
H_PARSE_TREE_TAG_ITR End;
H_PARSE_TREE_NODE Node;

Itr = LVParseTree_CreateTagIteratorBegin(Tree);
End = LVParseTree_CreateTagIteratorEnd(Tree);

while (!LVParseTree_TagIterator_AreEqual(Itr, End))
{
    Node = LVParseTree_TagIterator_GetNode(Itr);
    printf("%s;\n", LVParseTree_Node_GetText(Node));
    LVParseTree_TagIterator_Advance(Itr);
}

LVParseTree_TagIterator_Release(Itr);
LVParseTree_TagIterator_Release(End);

/* Note: Node handles don't get released; they are part of the tree,
   and the tree releases them when it gets released */

If the grammar was the top level navigation example grammar, and the engine recognized "go back", then the above code would print out:

$ = "APPLICATION_BACK";

Remark

The TagIterator does not visit the tags in a tree's header. Use LVParseTree::HeaderTag to access the contents of those tags.

See Also

LVParseTree::TagsBegin and LVParseTree::TagsEnd (C++ API)

Related APIs

LVParseTree_Node API

An LVParseTree is made out of Node objects. Each node represents a word, rule, or tag that was seen by the engine as it decoded an utterance against the matching grammar.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function

H_PARSE_TREE_NODE LVParseTree_Node_GetParent (H_PARSE_TREE_NODE Node)

H_PARSE_TREE_CHILDREN_ITR LVParseTree_Node_CreateChildrenIteratorBegin(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_CHILDREN_ITR LVParseTree_Node_CreateChildrenIteratorEnd(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_ITR LVParseTree_Node_CreateIteratorBegin(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_ITR LVParseTree_Node_CreateIteratorEnd(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_TERMINAL_ITR LVParseTree_Node_CreateTerminalIteratorBegin(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_TERMINAL_ITR LVParseTree_Node_CreateTerminalIteratorEnd(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_TAG_ITR LVParseTree_Node_CreateTagIteratorBegin(H_PARSE_TREE_NODE Node)

H_PARSE_TREE_TAG_ITR LVParseTree_Node_CreateTagIteratorEnd(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_IsRule(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_IsTerminal(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_IsTag(H_PARSE_TREE_NODE Node)

const char* LVParseTree_Node_GetText(H_PARSE_TREE_NODE Node)

const char* LVParseTree_Node_GetPhonemes(H_PARSE_TREE_NODE Node)

const char* LVParseTree_Node_GetRuleName(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_GetScore(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_GetStartTime(H_PARSE_TREE_NODE Node)

int LVParseTree_Node_GetEndTime(H_PARSE_TREE_NODE Node)

LVParseTree_Iterator C API

An LVParseTree_Iterator object traverses a parse tree in a top-to-bottom, left-to-right fashion (sometimes called a pre-order or LL traversal). You can get an iterator over a subtree rooted at a Node by calling:

LVParseTree_Node_CreateIteratorBegin(H_PARSE_TREE_NODE Node)

LVParseTree_Node_CreateIteratorEnd(H_PARSE_TREE_NODE Node)

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

H_PARSE_TREE_ITR LVParseTree_Iterator_Create(void)

Creates a blank Iterator; it does not point at anything.

H_PARSE_TREE_ITR LVParseTree_Iterator_CreateFromCopy(H_PARSE_TREE_ITR Other)

Creates a new Iterator from another. Both Iterators will need to be released when no longer needed.

void LVParseTree_Iterator_Copy(H_PARSE_TREE_ITR Iterator, H_PARSE_TREE_ITR Other)

Copies the data from one handle into another.

void LVParseTree_Iterator_Release(H_PARSE_TREE_ITR Iterator)

Releases the memory allocated to the Iterator handle.

void LVParseTree_Iterator_Advance(H_PARSE_TREE_ITR Iterator)

Advances the Iterator one position.

H_PARSE_TREE_NODE LVParseTree_Iterator_GetNode(H_PARSE_TREE_ITR Iterator)

Provides access to a node in the parse tree.

int LVParseTree_Iterator_AreEqual(H_PARSE_TREE_ITR Iterator1, H_PARSE_TREE_ITR Iterator2)

Tests equality with another Iterator. Two Iterators are equal if they are pointing to the same node in a parse tree.
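
The visiting order described above (a node first, then each of its subtrees from left to right) can be illustrated on a small hand-built tree. The following is a sketch of pre-order traversal in general and does not use the SDK: DemoNode and preorder are hypothetical names and are unrelated to H_PARSE_TREE_NODE and the iterator functions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative node type; unrelated to the SDK's H_PARSE_TREE_NODE. */
typedef struct DemoNode {
    const char *label;
    const struct DemoNode *const *children;  /* in left-to-right order */
    size_t child_count;
} DemoNode;

/* Pre-order traversal: records node's label, then visits each child
   subtree from left to right. Labels are written to out starting at
   position count; the return value is the new count. cap is the
   capacity of out. */
size_t preorder(const DemoNode *node, const char **out, size_t count, size_t cap)
{
    size_t i;

    assert(node != NULL && out != NULL);
    assert(count < cap);  /* the caller must supply enough room */

    out[count++] = node->label;
    for (i = 0; i < node->child_count; ++i)
        count = preorder(node->children[i], out, count, cap);
    return count;
}
```

For a root with children A and B, where A has children a1 and a2, the recorded order is root, A, a1, a2, B. Advancing a parse tree iterator from begin to end is described as visiting nodes in this same order.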

LVParseTree_ChildrenIterator C API

An LVParseTree_ChildrenIterator object traverses the immediate children of a rule node, from left to right. You get a ChildrenIterator object from a Node by calling

LVParseTree_Node_CreateChildrenIteratorBegin(H_PARSE_TREE_NODE Node)

LVParseTree_Node_CreateChildrenIteratorEnd(H_PARSE_TREE_NODE Node)

With these iterators, you can traverse the immediate children of Node.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function

H_PARSE_TREE_CHILDREN_ITR LVParseTree_ChildrenIterator_Create(void)

H_PARSE_TREE_CHILDREN_ITR LVParseTree_ChildrenIterator_CreateFromCopy (H_PARSE_TREE_CHILDREN_ITR Other)

void LVParseTree_ChildrenIterator_Copy(H_PARSE_TREE_CHILDREN_ITR Itr, H_PARSE_TREE_CHILDREN_ITR Other)

void LVParseTree_ChildrenIterator_Release(H_PARSE_TREE_CHILDREN_ITR Itr)

void LVParseTree_ChildrenIterator_Advance(H_PARSE_TREE_CHILDREN_ITR Itr)

H_PARSE_TREE_NODE LVParseTree_ChildrenIterator_GetNode(H_PARSE_TREE_CHILDREN_ITR Itr)

int LVParseTree_ChildrenIterator_AreEqual(H_PARSE_TREE_CHILDREN_ITR Itr1, H_PARSE_TREE_CHILDREN_ITR Itr2)

LVParseTree_TerminalIterator C API

An LVParseTree_TerminalIterator object is an adaptation of the standard LVParseTree_Iterator. It only visits the nodes in a tree that are terminals. You can get a TerminalIterator from a Node by calling:

LVParseTree_Node_CreateTerminalIteratorBegin(H_PARSE_TREE_NODE Node)

LVParseTree_Node_CreateTerminalIteratorEnd(H_PARSE_TREE_NODE Node)

With these iterators, you can visit all of the terminal nodes in the subtree rooted by Node.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function

H_PARSE_TREE_TERMINAL_ITR LVParseTree_TerminalIterator_Create(void)

H_PARSE_TREE_TERMINAL_ITR LVParseTree_TerminalIterator_CreateFromCopy (H_PARSE_TREE_TERMINAL_ITR Other)

void LVParseTree_TerminalIterator_Copy(H_PARSE_TREE_TERMINAL_ITR Itr, H_PARSE_TREE_TERMINAL_ITR Other)

void LVParseTree_TerminalIterator_Release(H_PARSE_TREE_TERMINAL_ITR Itr)

void LVParseTree_TerminalIterator_Advance(H_PARSE_TREE_TERMINAL_ITR Itr)

H_PARSE_TREE_NODE LVParseTree_TerminalIterator_GetNode(H_PARSE_TREE_TERMINAL_ITR Itr)

int LVParseTree_TerminalIterator_AreEqual(H_PARSE_TREE_TERMINAL_ITR Itr1, H_PARSE_TREE_TERMINAL_ITR Itr2)

LVParseTree_TagIterator C API

An LVParseTree_TagIterator object is an adaptation of the standard LVParseTree_Iterator. It only visits the nodes in a tree that are tags. You can get a tag iterator from a Node by calling:

LVParseTree_Node_CreateTagIteratorBegin(H_PARSE_TREE_NODE Node)

LVParseTree_Node_CreateTagIteratorEnd(H_PARSE_TREE_NODE Node)

With these iterators, you can traverse all of the tags in the subtree rooted at Node.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

H_PARSE_TREE_TAG_ITR LVParseTree_TagIterator_Create(void)

Creates a blank iterator; it is not pointing at anything.

H_PARSE_TREE_TAG_ITR LVParseTree_TagIterator_CreateFromCopy (H_PARSE_TREE_TAG_ITR Other)

Creates a new iterator from another. Both iterators must be released when no longer needed.

void LVParseTree_TagIterator_Copy(H_PARSE_TREE_TAG_ITR Itr, H_PARSE_TREE_TAG_ITR Other)

Copies the data from one handle into another.

void LVParseTree_TagIterator_Release(H_PARSE_TREE_TAG_ITR Itr)

Releases the memory allocated to the iterator handle.

void LVParseTree_TagIterator_Advance(H_PARSE_TREE_TAG_ITR Itr)

Advances the iterator one position.

H_PARSE_TREE_NODE LVParseTree_TagIterator_GetNode(H_PARSE_TREE_TAG_ITR Itr)

Provides access to a node in the parse tree.

int LVParseTree_TagIterator_AreEqual(H_PARSE_TREE_TAG_ITR Itr1, H_PARSE_TREE_TAG_ITR Itr2)

Tests equality with another iterator. Two iterators are equal if they are pointing to the same node in a parse tree.
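The begin/end functions and the iterator calls listed above combine into a loop of the following shape. This is only a sketch: HAVE_LUMENVOX_SDK is a hypothetical build flag, and the Demo types and function bodies in the #else branch are stand-ins (not the real SDK) that exist only so the sketch compiles on its own. It also assumes that LVParseTree_TagIterator_AreEqual returns nonzero when the two iterators are equal.

```c
#include <stdlib.h>

#ifdef HAVE_LUMENVOX_SDK                 /* hypothetical build flag */
#include <LV_SRE_ParseTree.h>
#else
/* Stand-ins (NOT the real SDK): a "node" only records how many tags lie
   below it, and an iterator is a position in the range [0, tag_count]. */
typedef struct { int tag_count; } DemoNode;
typedef DemoNode *H_PARSE_TREE_NODE;
typedef struct { DemoNode *node; int pos; } DemoTagItr;
typedef DemoTagItr *H_PARSE_TREE_TAG_ITR;

static H_PARSE_TREE_TAG_ITR demo_itr(H_PARSE_TREE_NODE n, int pos)
{
    DemoTagItr *it = malloc(sizeof *it);
    if (it == NULL)
        abort();
    it->node = n;
    it->pos = pos;
    return it;
}
static H_PARSE_TREE_TAG_ITR LVParseTree_Node_CreateTagIteratorBegin(H_PARSE_TREE_NODE n)
{ return demo_itr(n, 0); }
static H_PARSE_TREE_TAG_ITR LVParseTree_Node_CreateTagIteratorEnd(H_PARSE_TREE_NODE n)
{ return demo_itr(n, n->tag_count); }
static void LVParseTree_TagIterator_Advance(H_PARSE_TREE_TAG_ITR it)
{ ++it->pos; }
static int LVParseTree_TagIterator_AreEqual(H_PARSE_TREE_TAG_ITR a, H_PARSE_TREE_TAG_ITR b)
{ return a->node == b->node && a->pos == b->pos; }
static void LVParseTree_TagIterator_Release(H_PARSE_TREE_TAG_ITR it)
{ free(it); }
#endif

/* Counts the tags in the subtree rooted at node by walking from the
   begin iterator to the end iterator, then releases both handles. */
int count_tags(H_PARSE_TREE_NODE node)
{
    H_PARSE_TREE_TAG_ITR it  = LVParseTree_Node_CreateTagIteratorBegin(node);
    H_PARSE_TREE_TAG_ITR end = LVParseTree_Node_CreateTagIteratorEnd(node);
    int count = 0;

    while (!LVParseTree_TagIterator_AreEqual(it, end)) {
        ++count;                              /* visit the current tag here */
        LVParseTree_TagIterator_Advance(it);
    }
    LVParseTree_TagIterator_Release(it);
    LVParseTree_TagIterator_Release(end);
    return count;
}
```

Inside the loop, LVParseTree_TagIterator_GetNode(it) would give access to the node at the current position. The children and terminal iterators expose the same set of calls, so the same loop shape should apply to them.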

LVParseTree Class

The following C API is exported from "LV_SRE_ParseTree.h". An LVParseTree class is available for C++ programmers which wraps this API.

See Also Using the Parse Tree Tutorial

Return Type Function Description

H_PARSE_TREE LVParseTree_Create(void) Constructs an LVParseTree object.

H_PARSE_TREE LVParseTree_CreateFromCopy(H_PARSE_TREE Other)

Copy constructor

void LVParseTree_Copy(H_PARSE_TREE Tree, H_PARSE_TREE Other)

Assignment operator

void LVParseTree_Release (H_PARSE_TREE Tree) Destroys the LVParseTree object

H_PARSE_TREE_NODE LVParseTree_GetRoot (H_PARSE_TREE Tree)

Provides access to the root node of the parse tree.

H_PARSE_TREE_ITR LVParseTree_CreateIteratorBegin (H_PARSE_TREE Tree)

Provides an iterator that walks each node in the tree in a top-to-bottom, left-to-right fashion

H_PARSE_TREE_ITR LVParseTree_CreateIteratorEnd (H_PARSE_TREE Tree)

Marks the end of traversal for the parse tree iterator

H_PARSE_TREE_TERMINAL_ITR LVParseTree_CreateTerminalIteratorBegin (H_PARSE_TREE Tree)

Traverses the terminals of the parse tree (words).

H_PARSE_TREE_TERMINAL_ITR LVParseTree_CreateTerminalIteratorEnd (H_PARSE_TREE Tree)

Marks the end of traversal for the TerminalIterator.

H_PARSE_TREE_TAG_ITR LVParseTree_CreateTagIteratorBegin (H_PARSE_TREE Tree)

Traverses the tags in the parse tree (semantic data).

H_PARSE_TREE_TAG_ITR LVParseTree_CreateTagIteratorEnd (H_PARSE_TREE Tree)

Marks the end of traversal for the TagIterator

const char* LVParseTree_GetTagFormat (H_PARSE_TREE Tree)

Returns the tag format, as described by the grammar that this tree matched (e.g. "lumenvox/1.0" or "semantics/1.0")

int LVParseTree_GetNumberOfTagsInHeader (H_PARSE_TREE Tree)

Returns the number of tags (semantic data) that were defined in the matching grammar's header.

const char* LVParseTree_GetHeaderTag (H_PARSE_TREE Tree, int i)

Returns the ith header tag from the matching grammar.

const char* LVParseTree_GetGrammarLabel (H_PARSE_TREE Tree)

Returns the name of the matching grammar that was provided to the speech port when it was loaded

const char* LVParseTree_GetMode (H_PARSE_TREE Tree)

Returns the mode of the utterance decode that created this tree: "voice" or "dtmf"

const char* LVParseTree_GetLanguage (H_PARSE_TREE Tree )

Returns the language of the matching grammar (e.g. "en-US" or "es-MX")

LVGrammar C API Functions

LVGrammar Summary

The LVGrammar API allows you to manipulate a context-free grammar object that can be used in the engine to recognize speech.

Use <LVSpeechPort.h> or <LV_SRE_Grammar.h>

Return Type Function Description

HGRAMMAR LVGrammar_Create() Constructs a grammar object.

HGRAMMAR LVGrammar_CreateFromCopy (HGRAMMAR other)

Constructs a grammar object by copying an existing one.

void LVGrammar_Copy (HGRAMMAR hgram, HGRAMMAR other)

Copies the object pointed to by other into the object pointed to by hgram.

void LVGrammar_Release (HGRAMMAR hgram)

Destroys the grammar object.

int LVGrammar_Reset (HGRAMMAR hgram) Resets a grammar object.

void LVGrammar_RegisterLoggingCallback (HGRAMMAR hgram, GrammarLogCB Log, void* UserData)

Registers a callback so the object can report warnings and errors to the grammar author.

int LVGrammar_SaveCompiledGrammar (HGRAMMAR hgram, const char* filename)

Save the grammar object to a binary file

int LVGrammar_LoadCompiledGrammar (HGRAMMAR hgram, const char* filename)

Load the grammar object from a binary file

int LVGrammar_LoadGrammar (HGRAMMAR hgram, const char* uri)

Loads a grammar from a location specified by the "uri" argument.

int LVGrammar_LoadGrammarFromBuffer (HGRAMMAR hgram, const char* buffer)

Loads a grammar from a null terminated string containing the contents of the grammar.

int LVGrammar_AddRule (HGRAMMAR hgram, const char* left_hand_side, const char* right_hand_side)

Inserts a new rule into the grammar.

int LVGrammar_RemoveRule (HGRAMMAR hgram, const char* left_hand_side)

Removes a rule from the grammar.

int LVGrammar_SetRoot (HGRAMMAR hgram, const char* root)

Sets a starting rule for the grammar.

void LVGrammar_SetMode (HGRAMMAR hgram, const char* mode)

Declare the mode of grammar (the style of decode to be processed). Legal arguments are "voice" or "dtmf".

const char* LVGrammar_GetMode (HGRAMMAR hgram)

Retrieve the mode of the grammar.

void LVGrammar_SetLanguage (HGRAMMAR hgram, const char* language)

Specify the language of this grammar as a language/country code pair. Legal arguments include "en-US" and "es-MX".

const char* LVGrammar_GetLanguage (HGRAMMAR hgram)

Retrieve the language setting of the grammar.

int LVGrammar_SetTagFormat (HGRAMMAR hgram, const char* tag_format)

Identify the tag format of the grammar. To use the LumenVox semantic interpretation, the tag format must be "lumenvox/1.0" or "semantics/1.0".

const char* LVGrammar_GetTagFormat (HGRAMMAR hgram)

Retrieve the tag format of the grammar.

int LVGrammar_GetNumberOfMetaData (HGRAMMAR hgram)

Retrieve the number of meta data entries in the grammar.

const char* LVGrammar_GetMetaDataKey (HGRAMMAR hgram, int index)

Returns the key of the meta data indicated by the index.

const char* LVGrammar_GetMetaDataValue (HGRAMMAR hgram, int index)

Returns the value of the meta data indicated by the index.

int LVGrammar_ParseSentence (HGRAMMAR hgram, const char* sentence)

Use the grammar to parse a sentence.

int LVGrammar_GetNumberOfParses (HGRAMMAR hgram)

Returns the number of parses created by the most recent LVGrammar_ParseSentence call.

H_PARSE_TREE LVGrammar_CreateParseTree (HGRAMMAR hgram, int index)

Returns the parse tree handle indicated by the index.

int LVGrammar_InterpretParses (HGRAMMAR hgram)

Generate interpretations from the parse trees created by the most recent LVGrammar_ParseSentence call.

int LVGrammar_GetNumberOfInterpretations (HGRAMMAR hgram)

Returns the number of interpretations created by the most recent LVGrammar_InterpretParses call.

H_SI LVGrammar_CreateInterpretation (HGRAMMAR hgram, int index)

Returns the semantic interpretation handle indicated by the index

API Functions

LVGrammar_AddRule

Adds a rule to a grammar object.

Function

int LVGrammar_AddRule(HGRAMMAR hgram, const char* rule_name, const char* rule_definition)

Parameters

hgram

A handle to the grammar.

rule_name

The name of the rule

rule_definition

The definition of the rule

Return Values

LV_SUCCESS

No errors; the rule has been successfully added.

LV_GRAMMAR_SYNTAX_WARNING

The new rule was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The new rule was not understandable to the grammar compiler. You will not be able to decode with this grammar.

Example

LVGrammar_AddRule(hgram, "foo", "hello [world]");

Is the same as writing a rule:

$foo = hello [world];

Remarks

New rules must be written in ABNF notation. Detailed error and warning messages are sent to the grammar object's logging callback function.
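To make the correspondence between an LVGrammar_AddRule call and its ABNF form concrete, the helper below renders the rule text that a given rule_name and rule_definition pair stands for, which can be handy when logging what was added. It is a minimal sketch; format_abnf_rule is a hypothetical name and is not part of the LumenVox API.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes "$<rule_name> = <rule_definition>;" into out.
   Returns 0 on success, -1 if an argument is invalid or the text does not fit.
   Hypothetical helper, not part of the LumenVox API. */
int format_abnf_rule(char *out, size_t out_size,
                     const char *rule_name, const char *rule_definition)
{
    int n;

    if (out == NULL || out_size == 0 || rule_name == NULL || rule_definition == NULL)
        return -1;

    n = snprintf(out, out_size, "$%s = %s;", rule_name, rule_definition);
    if (n < 0 || (size_t)n >= out_size)
        return -1;                      /* encoding error or truncated output */
    return 0;
}
```

Called with "foo" and "hello [world]", the helper produces the same rule text as the example above.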

See Also

LVGrammar_RemoveRule

LVGrammar::AddRule (C++ API)

LVGrammar_SetRoot

Identifies one of the grammar rules as the root rule. The root rule is where the engine starts its search.

Function

int LVGrammar_SetRoot(HGRAMMAR hgram, const char* rule_name)

Parameters

hgram

A handle to the grammar.

rule_name

The name of the rule.

Example

LVGrammar_SetRoot(hgram, "foo");

Is the same as writing in a grammar:

root $foo;

See Also

LVGrammar::SetRoot (C++ API)

LVGrammar_SetMode

Sets the mode property for the grammar.

Function

int LVGrammar_SetMode(HGRAMMAR hgram, const char* mode)

Parameters

hgram

A handle to the grammar.

mode

The interaction mode of the grammar.

Example

LVGrammar_SetLanguage(hgram, "en-US");
LVGrammar_SetMode(hgram, "voice");
LVGrammar_SetTagFormat(hgram, "lumenvox/1.0");

Is the same as writing in your grammar:

language en-US;
mode voice;
tag-format <lumenvox/1.0>;

See Also

LVGrammar_GetMode

LVGrammar::SetMode (C++ API)

LVGrammar_Create

Creates an empty grammar object, and returns the handle.

Function

HGRAMMAR LVGrammar_Create()

Parameters

None.

Return Values

A handle to the created grammar object.

Remarks

The memory pointed to by the returned handle is not released until you call LVGrammar_Release explicitly.

See Also

LVGrammar_Release

LVGrammar_CreateFromCopy

Creates a grammar object by copying another one, and returns the handle.

Function

HGRAMMAR LVGrammar_CreateFromCopy(HGRAMMAR another)

Parameters

another

The grammar object to copy from.

Return Values

A handle to the created grammar object.

Remarks

The memory pointed to by the returned handle is not released until you call LVGrammar_Release explicitly.

See Also

LVGrammar_Release

LVGrammar_Copy

Copy one grammar object to another.

Function

int LVGrammar_Copy (HGRAMMAR hgram, HGRAMMAR other)

Parameters

hgram

Destination grammar object handle.

other

Source grammar object handle.

Return Values

LV_SUCCESS

LV_FAILURE

Remarks

This function does not create a new object for the destination handle, so no memory is allocated. It is the caller's responsibility to make sure that the object pointed to by the destination handle has already been created before calling this function.

See Also

LVGrammar::operator = (C++ API)

LVGrammar_Reset

Reset a grammar object.

Function

int LVGrammar_Reset (HGRAMMAR hgram)

Parameters

hgram

The handle to the grammar object to be reset.

Return Values

LV_SUCCESS

LV_FAILURE

See Also

LVGrammar::Reset (C++ API)

LVGrammar_Release

Destroy a grammar object.

Function

void LVGrammar_Release (HGRAMMAR hgram)

Parameters

hgram

The handle to the grammar object to be released.

Remarks

Grammar objects created by LVGrammar_Create and LVGrammar_CreateFromCopy must be explicitly destroyed by calling LVGrammar_Release.

See Also

LVGrammar_Create

LVGrammar_CreateFromCopy

LVGrammar_RegisterLoggingCallback

Registers a callback so the object can report warnings and errors to the grammar author via the callback function.

Function

void LVGrammar_RegisterLoggingCallback (HGRAMMAR hgram, GrammarLogCB log, void* userData)

Parameters

hgram

The handle to the grammar object.

log

The logging callback function pointer.

userData

A pointer to user-defined data associated with the grammar object pointed to by hgram. It is passed to the callback function.

Remarks

The callback function must have the signature defined by GrammarLogCB.

See Also

LVGrammar::RegisterLoggingCallback (C++ API)

LVGrammar_SaveCompiledGrammar

Save a grammar object to a binary file.

Function

int LVGrammar_SaveCompiledGrammar (HGRAMMAR hgram, const char* filename)

Parameters

hgram

The handle to a grammar object.

filename

Name of the binary file to write.

Return Values

LV_SUCCESS

LV_FAILURE

Remarks

The saved compiled grammar can be later loaded into a grammar object with LVGrammar_LoadCompiledGrammar.

See Also

LVGrammar_LoadCompiledGrammar

LVGrammar::SaveCompiledGrammar (C++ API)

LVGrammar_LoadCompiledGrammar

Load a grammar object from a binary file previously saved by LVGrammar_SaveCompiledGrammar.

Function

int LVGrammar_LoadCompiledGrammar (HGRAMMAR hgram, const char* filename)

Parameters

hgram

The handle to a grammar object.

filename

Name of the binary file to read.

Return Values

LV_SUCCESS

LV_FAILURE

See Also

LVGrammar_SaveCompiledGrammar

LVGrammar::LoadCompiledGrammar (C++ API)

LVGrammar_LoadGrammar

Loads a grammar from a local file, or from a remote file via HTTP or FTP. The grammar can be written in ABNF or XML notation.

Function

int LVGrammar_LoadGrammar(HGRAMMAR hgram, const char* grammar_location)

Parameters

hgram

Handle to a grammar object.

grammar_location

A file path or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.
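The return values above fall into two groups: results after which the grammar can be used (LV_SUCCESS and LV_GRAMMAR_SYNTAX_WARNING) and results after which it cannot. A caller might fold that distinction into small helpers like the ones sketched below. HAVE_LUMENVOX_SDK is a hypothetical build flag, the numeric values in the #else branch are placeholders that are almost certainly different from the real ones, and grammar_is_usable and describe_load_result are illustrative names, not part of the API.

```c
#ifdef HAVE_LUMENVOX_SDK                 /* hypothetical build flag */
#include <LV_SRE_Grammar.h>
#else
/* Placeholder values so the sketch compiles without the SDK.
   The real values come from the LumenVox headers and may differ. */
enum {
    LV_SUCCESS = 0,
    LV_GRAMMAR_SYNTAX_WARNING = 1,
    LV_GRAMMAR_SYNTAX_ERROR = 2,
    LV_GRAMMAR_LOADING_ERROR = 3
};
#endif

/* Returns 1 if a grammar loaded with this result is ready to be used. */
int grammar_is_usable(int load_result)
{
    return load_result == LV_SUCCESS ||
           load_result == LV_GRAMMAR_SYNTAX_WARNING;
}

/* Returns a short description of a load result, suitable for log output. */
const char *describe_load_result(int load_result)
{
    if (load_result == LV_SUCCESS)
        return "loaded";
    if (load_result == LV_GRAMMAR_SYNTAX_WARNING)
        return "loaded with syntax warnings";
    if (load_result == LV_GRAMMAR_SYNTAX_ERROR)
        return "syntax error";
    if (load_result == LV_GRAMMAR_LOADING_ERROR)
        return "grammar location not found";
    return "unknown result";
}
```

With the SDK available, the result of LVGrammar_LoadGrammar(hgram, grammar_location) can be passed straight to grammar_is_usable. A warning still deserves a look at the logging callback output even though the grammar is usable.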

See Also

LVGrammar::LoadGrammar (C++ API)

LVGrammar_LoadGrammarFromBuffer

Loads a grammar from a null-terminated string buffer. The grammar can be written in ABNF or XML notation.

Function

int LVGrammar_LoadGrammarFromBuffer(HGRAMMAR hgram, const char* grammar_contents)

Parameters

hgram

Handle to a grammar object.

grammar_contents

A null terminated string containing the contents of a valid SRGS grammar.

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.

See Also

LVGrammar::LoadGrammarFromBuffer (C++ API)

LVGrammar_RemoveRule

Removes a rule from a grammar object.

Function

int LVGrammar_RemoveRule(HGRAMMAR hgram, const char* rule_name)

Parameters

hgram

A handle to the grammar.

rule_name

The name of the rule

Return Values

LV_SUCCESS

No errors; the rule has been successfully removed.

LV_GRAMMAR_SYNTAX_WARNING

The new rule was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The new rule was not understandable to the grammar compiler. You will not be able to decode with this grammar.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.

See Also

LVGrammar_AddRule

LVGrammar::RemoveRule (C++ API)

LVGrammar_SetLanguage

Sets the language for the grammar.

Function

int LVGrammar_SetLanguage(HGRAMMAR hgram, const char* language)

Parameters

hgram

A handle to the grammar.

language

The language identifier for the grammar

Example

LVGrammar_SetLanguage(hgram, "en-US");
LVGrammar_SetMode(hgram, "voice");
LVGrammar_SetTagFormat(hgram, "lumenvox/1.0");

Is the same as writing in your grammar:

language en-US;
mode voice;
tag-format <lumenvox/1.0>;

See Also

LVGrammar_GetLanguage

LVGrammar::SetLanguage (C++ API)

LVGrammar_SetTagFormat

Set interpretation tag format of the grammar.

Function

int LVGrammar_SetTagFormat(HGRAMMAR hgram, const char* tag_format)

Parameters

hgram

A handle to the grammar.

tag_format

The grammar's tag format.

Example

LVGrammar_SetLanguage(hgram, "en-US");
LVGrammar_SetMode(hgram, "voice");
LVGrammar_SetTagFormat(hgram, "lumenvox/1.0");

Is the same as writing in your grammar:

language en-US;
mode voice;
tag-format <lumenvox/1.0>;

See Also

LVGrammar_GetTagFormat

LVGrammar::SetTagFormat (C++ API)

LVGrammar_GetMode

Returns the mode setting for the grammar.

Function

const char* LVGrammar_GetMode(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Return Values

The interaction mode of the grammar.

See Also

LVGrammar_SetMode

LVGrammar::GetMode (C++ API)

LVGrammar_GetLanguage

Returns the language setting for the grammar.

Function

const char* LVGrammar_GetLanguage(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Return Values

The language identifier of the grammar.

See Also

LVGrammar_SetLanguage

LVGrammar::GetLanguage (C++ API)

LVGrammar_GetTagFormat

Returns the interpretation tag format setting for the grammar.

Function

const char* LVGrammar_GetTagFormat(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Return Values

The tag format of the grammar.

See Also

LVGrammar_SetTagFormat

LVGrammar::GetTagFormat (C++ API)

LVGrammar_GetNumberOfMetaData

Returns the number of meta data entries contained in the grammar.

Function

int LVGrammar_GetNumberOfMetaData(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Example

If the grammar contains the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = LVGrammar_GetNumberOfMetaData(grammar); // returns 2
const char* key = LVGrammar_GetMetaDataKey(grammar, 0); // returns "description"
const char* value = LVGrammar_GetMetaDataValue(grammar, 1); // returns "05/12/2005"

See Also

LVGrammar_GetMetaDataKey

LVGrammar_GetMetaDataValue

LVGrammar::GetNumberOfMetaData (C++ API)

LVGrammar_GetMetaDataKey

Return the key of the meta data indicated by the index.

Function

const char* LVGrammar_GetMetaDataKey(HGRAMMAR hgram, int index)

Parameters

hgram

A handle to the grammar.

index

Index of the meta data. It should be in the range [0, LVGrammar_GetNumberOfMetaData).

Return Values

null

The index is not valid.

non-null

A pointer to the key string.

Example

If the grammar has the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = LVGrammar_GetNumberOfMetaData(grammar); // returns 2
const char* key = LVGrammar_GetMetaDataKey(grammar, 0); // returns "description"
const char* value = LVGrammar_GetMetaDataValue(grammar, 1); // returns "05/12/2005"

See Also

LVGrammar_GetNumberOfMetaData

LVGrammar_GetMetaDataValue

LVGrammar::GetMetaDataKey (C++ API)

LVGrammar_GetMetaDataValue

Return the value of the meta data indicated by the index.

Function

const char* LVGrammar_GetMetaDataValue(HGRAMMAR hgram, int index)

Parameters

hgram

A handle to the grammar.

index

Index of the meta data. It should be in the range [0, LVGrammar_GetNumberOfMetaData).

Return Values

null

The index is not valid.

non-null

A pointer to the value string.

Example

If the grammar has the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = LVGrammar_GetNumberOfMetaData(grammar); // returns 2
const char* key = LVGrammar_GetMetaDataKey(grammar, 0); // returns "description"
const char* value = LVGrammar_GetMetaDataValue(grammar, 1); // returns "05/12/2005"
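Because keys and values are addressed by the same index in the range [0, LVGrammar_GetNumberOfMetaData), looking up a value by its key comes down to a linear scan. The sketch below shows one way to write it. HAVE_LUMENVOX_SDK is a hypothetical build flag, the DemoGrammar type and the three function bodies in the #else branch are stand-ins (not the real SDK) so the sketch compiles on its own, and find_meta_value is an illustrative helper name.

```c
#include <stddef.h>
#include <string.h>

#ifdef HAVE_LUMENVOX_SDK                 /* hypothetical build flag */
#include <LV_SRE_Grammar.h>
#else
/* Stand-ins (NOT the real SDK): a "grammar" is two parallel string arrays. */
typedef struct {
    const char *const *keys;
    const char *const *values;
    int count;
} DemoGrammar;
typedef const DemoGrammar *HGRAMMAR;

static int LVGrammar_GetNumberOfMetaData(HGRAMMAR g)
{ return g->count; }
static const char *LVGrammar_GetMetaDataKey(HGRAMMAR g, int i)
{ return (i >= 0 && i < g->count) ? g->keys[i] : NULL; }
static const char *LVGrammar_GetMetaDataValue(HGRAMMAR g, int i)
{ return (i >= 0 && i < g->count) ? g->values[i] : NULL; }
#endif

/* Returns the value of the first meta data entry whose key equals
   wanted_key, or NULL if the grammar has no such entry. */
const char *find_meta_value(HGRAMMAR hgram, const char *wanted_key)
{
    int count = LVGrammar_GetNumberOfMetaData(hgram);
    int i;

    for (i = 0; i < count; ++i) {
        const char *key = LVGrammar_GetMetaDataKey(hgram, i);
        if (key != NULL && strcmp(key, wanted_key) == 0)
            return LVGrammar_GetMetaDataValue(hgram, i);
    }
    return NULL;
}
```

For the example grammar above, looking up "date" would yield "05/12/2005". How long the returned string stays valid is not stated here, so copying it before modifying or releasing the grammar is the cautious choice.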

See Also

LVGrammar_GetNumberOfMetaData

LVGrammar_GetMetaDataKey

LVGrammar::GetMetaDataValue (C++ API)

LVGrammar_ParseSentence

Use a loaded grammar object to parse a sentence.

Function

int LVGrammar_ParseSentence(HGRAMMAR hgram, const char* sentence)

Parameters

hgram

A handle to the grammar.

sentence

The sentence to parse.

Return Values

0

The sentence is not covered by the grammar.

non-0

The number of distinct parses.

Example

Assume a grammar was defined as:

root $yes_no;
$yes_no = $yes | $no;
$yes = yes [please];
$no = no [thank you];

You can use this grammar to validate sentences as follows:

int count = LVGrammar_ParseSentence(grammar, "no thank you"); // returns 1
count = LVGrammar_ParseSentence(grammar, "no thanks"); // returns 0

Remarks

With this function, you can identify how well a grammar covers your targeted transcript set.
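One way to measure coverage is to run every transcript through the grammar and count those that yield at least one parse. The sketch below keeps the parsing step behind a callback so that it compiles without the SDK. The names parse_fn, count_covered and demo_yes_no_parse are illustrative, and demo_yes_no_parse is only a stand-in that accepts the four sentences covered by the $yes_no example grammar. In a real program the callback would wrap LVGrammar_ParseSentence(hgram, sentence).

```c
#include <stddef.h>
#include <string.h>

/* Callback type: returns the number of parses for a sentence (0 = not covered).
   ctx carries whatever the callback needs, for example a grammar handle. */
typedef int (*parse_fn)(void *ctx, const char *sentence);

/* Returns how many of the n sentences produced at least one parse. */
size_t count_covered(parse_fn parse, void *ctx,
                     const char *const *sentences, size_t n)
{
    size_t covered = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        if (parse(ctx, sentences[i]) > 0)
            ++covered;
    }
    return covered;
}

/* Stand-in for a real parser: accepts only the sentences that the
   $yes_no example grammar covers. */
int demo_yes_no_parse(void *ctx, const char *sentence)
{
    static const char *const accepted[] = {
        "yes", "yes please", "no", "no thank you"
    };
    size_t i;

    (void)ctx;                          /* unused by the stand-in */
    for (i = 0; i < sizeof accepted / sizeof accepted[0]; ++i) {
        if (strcmp(sentence, accepted[i]) == 0)
            return 1;
    }
    return 0;
}
```

Dividing the result by n gives the fraction of the transcript set that the grammar covers; the sentences that return 0 are the ones worth reviewing when extending the grammar.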

See Also

LVGrammar_GetNumberOfParses

LVGrammar_CreateParseTree

LVGrammar::ParseSentence (C++ API)

LVGrammar_GetNumberOfParses

Return the number of parses created by the most recent call of LVGrammar_ParseSentence.

Function

int LVGrammar_GetNumberOfParses(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Return Values

0

The sentence is not covered by the grammar.

non-0

The number of distinct parses.

Remarks

This function can be used after a call to LVGrammar_ParseSentence. It is merely a convenience, as it returns the same value as the return value of LVGrammar_ParseSentence.

See Also

LVGrammar_ParseSentence

LVGrammar_CreateParseTree

LVGrammar::NumberOfParses (C++ API)

LVGrammar_CreateParseTree

Return the parse tree handle with the specified index.

Function

H_PARSE_TREE LVGrammar_CreateParseTree(HGRAMMAR hgram, int index)

Parameters

hgram

A handle to the grammar.

index

The index of the parse tree handle to be returned. It should be in the range [0, LVGrammar_GetNumberOfParses).

Return Values

null

The index is not valid.

non-null

The parse tree handle.

Remarks

This function should be used after a call to LVGrammar_ParseSentence.

If the returned handle is not null, you need to call LVParseTree_Release to destroy the parse tree object pointed to by the handle.

See Also

LVGrammar_ParseSentence

LVGrammar_GetNumberOfParses

LVGrammar::GetParseTree (C++ API)

LVGrammar_InterpretParses

Generate semantic interpretation results from the parse trees generated by the previous call to LVGrammar_ParseSentence.

Function

int LVGrammar_InterpretParses(HGRAMMAR hgram)

Parameters

hgram

A handle to a grammar.

Return Values

integer (>=0)

Number of available interpretations.

Remarks

Before passing a grammar object handle to this function, you should call LVGrammar_ParseSentence using that handle. Otherwise, that handle doesn't contain any parse tree information.

See Also

LVGrammar_ParseSentence

LVGrammar_GetNumberOfInterpretations

LVGrammar_CreateInterpretation

LVGrammar::InterpretParses (C++ API)

LVGrammar_GetNumberOfInterpretations

Return the number of semantic interpretations created by the most recent call to LVGrammar_InterpretParses.

Function

int LVGrammar_GetNumberOfInterpretations(HGRAMMAR hgram)

Parameters

hgram

A handle to the grammar.

Return Values

integer (>=0)

Number of available interpretations.

Remarks

This function can be used after a call to LVGrammar_InterpretParses. It is merely a convenience, as the return value of LVGrammar_InterpretParses provides the same information.

See Also

LVGrammar_InterpretParses

LVGrammar_CreateInterpretation

LVGrammar::GetNumberOfInterpretations (C++ API)

LVGrammar_CreateInterpretation

Returns the semantic interpretation handle indicated by the index.

Function

H_SI LVGrammar_CreateInterpretation (HGRAMMAR hgram, int index)

Parameters

hgram

A handle to the grammar.

index

The index of the interpretation handle to be returned. It should be in the range [0, LVGrammar_GetNumberOfInterpretations).

Return Values

null

The index is not valid.

non-null

The interpretation handle.

Remarks

This function should be used after a call to LVGrammar_InterpretParses. A non-null interpretation handle needs to be released after you are done using it, by calling LVInterpretation_Release.

See Also

LVGrammar_InterpretParses

LVGrammar_GetNumberOfInterpretations

LVGrammar::GetInterpretation (C++ API)
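
Example (sketch)

Because every non-null interpretation handle must be released, C++ callers of the C API may find it convenient to tie the release to scope exit. The sketch below does this with std::unique_ptr and a custom deleter. It is not part of the LumenVox API: FakeGrammar, FakeInterpretation and the stub bodies are hypothetical stand-ins so that the code compiles without the SDK, the LVInterpretation_Release signature is an assumption, and InterpretationReleaser, ScopedInterpretation and MakeInterpretation are made-up names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// --- Hypothetical stand-ins so the sketch compiles without the LumenVox SDK. ---
struct FakeGrammar { int interpretation_count; int live_handles; };
typedef FakeGrammar* HGRAMMAR;
struct FakeInterpretation { FakeGrammar* owner; };
typedef FakeInterpretation* H_SI;

H_SI LVGrammar_CreateInterpretation(HGRAMMAR hgram, int index) {
    if (index < 0 || index >= hgram->interpretation_count) return nullptr;
    ++hgram->live_handles;
    return new FakeInterpretation{hgram};
}

// Assumed signature: the manual names the function but does not show it here.
void LVInterpretation_Release(H_SI si) {
    --si->owner->live_handles;
    delete si;
}
// --- End of stand-ins. ---

// Deleter that hands the handle back through LVInterpretation_Release.
struct InterpretationReleaser {
    void operator()(H_SI si) const { LVInterpretation_Release(si); }
};

// Owning wrapper; the deleter is not invoked for a null handle.
typedef std::unique_ptr<std::remove_pointer<H_SI>::type, InterpretationReleaser>
    ScopedInterpretation;

// Wraps one interpretation. The result is empty when the index is out of range.
ScopedInterpretation MakeInterpretation(HGRAMMAR hgram, int index) {
    assert(hgram != nullptr);
    return ScopedInterpretation(LVGrammar_CreateInterpretation(hgram, index));
}
```

With this wrapper, an early return or an exception between creating and using the interpretation still releases the handle.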

LVSpeechPort Class

class LVSpeechPort

An LVSpeechPort object represents one speech recognition port and processes its sound data into text; all port instances can process their data in parallel. If the client application is multi-threaded, every thread that needs to process audio data should have its own LVSpeechPort.

Each port has multiple voice channels and grammar sets.

Each voice channel holds raw audio data. Before processing any data, the client application must call LoadVoiceChannel to load the channel. The channel keeps its own copy of this sound data, so the client application can free its copy after the call to LoadVoiceChannel. The voice channel will store the data until the client application loads new data into the channel. This allows the client application to decode the same sound data against different grammars without reloading the data.

The Decode method processes a voice channel against a grammar set, returning the concepts from the grammar set recognized in the channel’s audio data. Multiple voice channels are provided as a convenience, but only one voice channel can decode concurrently per port.

Use <LVSpeechPort.h>

Constructor/Destructors

LVSpeechPort Constructs an LVSpeechPort object.

~LVSpeechPort Closes the speech port object and releases its resources.

Functions

OpenPort Opens the speech port and initializes the SRE.

ClosePort Closes the port, and releases its resources.

Decode Processes the voice channel audio data against the active grammar.

ReturnErrorString Returns a description of an error code.

SetProperty Sets various properties on the port.

SetPropertyEx Sets various properties on various scopes.

SetClientPropertyEx Sets various properties at the client process level (static).

WaitForDecode Blocks the client application until the decode is finished.

WaitForEngineToIdle Blocks the client application until the port is idle (not decoding).

AddPhrase Adds a phrase to a new or existing concept.

GetConcept Returns one concept found in the last call to Decode.

GetConceptScore Returns the confidence score of a concept found in the last call to Decode.

GetNumberOfConceptsReturned Returns the number of concepts found in the last call to Decode.

GetPhonemesDecoded Returns the actual phonemes found in the last call to Decode.

GetPhraseDecoded Returns the decoded phrase (with BNF formatting) found in the last call to Decode.

GetRawTextDecoded Returns the decoded raw text (without BNF formatting) found in the last call to Decode.

GetVoiceChannelData Returns the (original) preprocessed audio data for the voice channel.

LoadStandardGrammar Loads a standard, pre-defined grammar to easily recognize and format numbers, monetary figures or digits.

LoadVoiceChannel Loads the audio data into the specified voice channel prior to a call to Decode (which decodes the audio data).

RemoveConcept Removes a concept and all of its phrases.

ResetGrammar Removes all concepts from a grammar.

StreamStart Sets up a new stream.

StreamSendData Sends a buffer of sound data to the stream.

StreamGetStatus Returns the status of the stream.

StreamGetLength Returns the length of the sound data in the stream buffer.

StreamSetStateChangeCallBack Sets up a callback to receive state change notifications for a stream.

StreamStop Stops the stream and loads the sound channel with the streamed data.

StreamCancel Stops the stream; sound data is discarded.

StreamSetParameter Sets a new value for a stream property.

StreamGetParameter Gets the current value of a stream property.

StreamSetParameterToDefault Sets a stream property to its default value.

LoadGrammar functions Loads and compiles an SRGS grammar.

UnloadGrammar functions Unloads a grammar from the speech port.

IsGrammarLoaded Checks if a grammar has already been compiled and loaded into the port.

ActivateGrammar functions Activates an SRGS grammar for decoding.

DeactivateGrammar functions Removes a grammar from the active grammar set.

GetNumberOfParses Returns the number of parses generated by the decode, according to the active grammars.

GetParseTree Returns a Parse Tree result.

GetParseTreeString Returns a string representation of the parse tree.

GetNumberOfInterpretations Returns the number of interpretations generated by the decode + semantic interpretation process.

GetInterpretation Returns an interpretation result.

GetInterpretationString Returns an XML snippet representation of the interpretation result.

GetNumberOfNBestAlternatives Returns number of n-best alternatives found by the engine.

SwitchToNBestAlternative Sets the n-best alternative that is viewable.

Constants

Error Codes Error codes returned by methods.

Properties Property settings for the port.

Sound Formats Sound data format constants.

Standard Grammars Built-in grammar constants.

Methods

LVSpeechPort::LVSpeechPort

Constructs an LVSpeechPort object.

LVSpeechPort(void);

Remarks

Does not automatically open the port.

LVSpeechPort::~LVSpeechPort

Closes the speech port object and releases its resources.

~LVSpeechPort(void)

See Also

ClosePort

LVSpeechPort::OpenPort

Opens the speech port and initializes the Speech Engine.

int OpenPort(ExportLogMsg Log, void* p, int verbosity);

Return Values

LV_SUCCESS

No errors; the port initialized successfully.

LV_FAILURE

Licensing has been exceeded. There are too many LVSpeechPorts active.

LV_SYSTEM_ERROR

The port is already open.

Parameters

Log

Pointer to a function which receives logging information from the LVSpeechPort instance.

p

A pointer to client application-defined data.

verbosity

range: 0 - 6

0 - minimal logging info

6 - maximum logging info

Remarks

This method activates the speech port object. The recognition engine will begin initializing when this function is called. Control will return to the application immediately.

p is passed into the ExportLogMsg function to enable client-application-defined behavior.

See Also

Logging Callback Function

ClosePort

LV_SRE_OpenPort
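
Example (sketch)

The way the p argument travels from OpenPort into the logging callback can be sketched as follows. This is a sketch under stated assumptions, not the SDK itself: FakeSpeechPort is a hypothetical stand-in for LVSpeechPort, the ExportLogMsg typedef shows an assumed shape (message text plus the user-data pointer), and the numeric values given to LV_SUCCESS and LV_SYSTEM_ERROR are placeholders. LogSink and CollectLog are made-up application names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// --- Hypothetical stand-ins so the sketch compiles without the LumenVox SDK. ---
// Assumed callback shape: message text plus the application-defined pointer.
typedef void (*ExportLogMsg)(const char* message, void* user_data);
enum { LV_SUCCESS = 0, LV_SYSTEM_ERROR = -2 };  // placeholder values

class FakeSpeechPort {
public:
    FakeSpeechPort() : open_(false) {}
    int OpenPort(ExportLogMsg log, void* p, int verbosity) {
        if (open_) return LV_SYSTEM_ERROR;  // documented meaning: port is already open
        open_ = true;
        if (log != nullptr && verbosity >= 0) log("port opened", p);
        return LV_SUCCESS;
    }
private:
    bool open_;
};
// --- End of stand-ins. ---

// Application-defined sink; a pointer to it is passed as OpenPort's p argument.
struct LogSink { std::vector<std::string> lines; };

// Logging callback: recovers the sink from the user-data pointer and stores the message.
void CollectLog(const char* message, void* user_data) {
    assert(user_data != nullptr);
    static_cast<LogSink*>(user_data)->lines.push_back(message);
}
```

Passing a pointer to per-port state through p lets one callback function serve several ports without global variables.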

LVSpeechPort::GetOpenPortStatus

Returns a detailed code about the results of opening the speech port.

LVSpeechPort::GetOpenPortStatus( );

Return Values

LV_SUCCESS

The port opened successfully.

LV_NO_SERVER_RESPONDING or LV_OPEN_PORT_FAILED__PRIMARY_SERVER_NOT_RESPONDING

The client could not find a server to request a licensed port from.

LV_OPEN_PORT_FAILED__LICENSES_EXCEEDED

The primary server has too many ports connected for the number of licenses it has to give out.

See Also

OpenPort

ClosePort

LV_SRE_OpenPort

LVSpeechPort::ClosePort

Closes the port, and releases its resources.

int ClosePort(void);

Return Values

LV_SUCCESS

No errors; the port has successfully shut down.

LV_FAILURE

The port was unable to shut down.

LV_INVALID_HPORT

The port was never successfully opened, or was already closed.

Note:

Closing a port frees it from counting against the number of ports allowed by your license. Close every port that is no longer needed.

See Also

OpenPort

LV_SRE_ClosePort

LVSpeechPort::Decode

Processes the voice channel audio data against the active grammar.

int Decode(int VoiceChannel, int grammarset, unsigned int flags = 0);

Return Values

Zero (0) or greater indicates success.

A negative result indicates a specific error.

Parameters

VoiceChannel

The voice channel to process.

GrammarSet

The grammar set to process.

Flags (bitwise OR flags to set desired options)

LV_DECODE_BLOCK - Decode will not return until it has finished.

LV_DECODE_GENDER_MALE - Gender identifier.

LV_DECODE_GENDER_FEMALE - Gender identifier.

LV_DECODE_FIRST_TIME_USER - Reset caller weights in Recognition Engine (not implemented).

LV_DECODE_USE_OOV - Use the Out-Of-Vocabulary filter (OOV) during decode.

Remarks

If LV_DECODE_BLOCK is set, Decode will not return until it has finished processing the data.

If LV_DECODE_BLOCK is not set, Decode returns immediately and continues processing the data on a separate thread, so the client application can continue its own work. Calling other LVSpeechPort methods may block until the Decode is finished. Once the client application is ready to check for results, call either 1) GetNumberOfConceptsReturned, or 2) WaitForEngineToIdle and then GetNumberOfConceptsReturned. WaitForEngineToIdle waits only for a specified time and returns regardless of whether Decode has finished, whereas GetNumberOfConceptsReturned blocks until Decode has finished.

LV_DECODE_GENDER_FEMALE and LV_DECODE_GENDER_MALE identify which gender acoustic model to use. If these flags are not specified, the engine automatically decodes each audio file against both gender models. While this slows the engine by requiring two decodes, evaluating against both models has a very significant positive effect on recognition accuracy. Since the engine is multithreaded, unless CPU loads are a serious issue, do not use these flags.

On an error, call ReturnErrorString with the negative result from Decode to get a description of the error.

See Also

LV_SRE_Decode
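
Example (sketch)

The flags argument is built by combining options with bitwise OR, and the result of Decode is checked by sign. The sketch below illustrates both. The bit values assigned to the LV_DECODE_* constants are placeholders chosen so that the code compiles without the SDK; the real values come from the LumenVox headers. Gender, BuildDecodeFlags and DecodeSucceeded are made-up helper names, and LV_DECODE_FIRST_TIME_USER is left out because the manual lists it as not implemented.

```cpp
#include <cassert>

// Placeholder bit values so the sketch compiles without the LumenVox SDK;
// the real constants are defined in the SDK headers and may differ.
const unsigned int LV_DECODE_BLOCK         = 1u << 0;
const unsigned int LV_DECODE_GENDER_MALE   = 1u << 1;
const unsigned int LV_DECODE_GENDER_FEMALE = 1u << 2;
const unsigned int LV_DECODE_USE_OOV       = 1u << 3;

// Which acoustic model(s) to request; kBothModels sets neither gender flag.
enum Gender { kBothModels, kMaleOnly, kFemaleOnly };

// Combines the requested options into a single flags value with bitwise OR.
unsigned int BuildDecodeFlags(bool block, Gender gender, bool use_oov) {
    unsigned int flags = 0;
    if (block) flags |= LV_DECODE_BLOCK;
    if (gender == kMaleOnly) flags |= LV_DECODE_GENDER_MALE;
    if (gender == kFemaleOnly) flags |= LV_DECODE_GENDER_FEMALE;
    if (use_oov) flags |= LV_DECODE_USE_OOV;
    // By construction at most one gender flag is set.
    assert(!((flags & LV_DECODE_GENDER_MALE) && (flags & LV_DECODE_GENDER_FEMALE)));
    return flags;
}

// Zero or greater indicates success; a negative result is an error code.
bool DecodeSucceeded(int decode_result) { return decode_result >= 0; }
```

A caller would pass the value from BuildDecodeFlags as the third argument of Decode and, on a negative result, hand that result to ReturnErrorString.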

LoadGrammar functions

Before you can use a grammar, you must load it into the speech port's collection of grammars, or you must load it into the collection of application-level (global) grammars. When you load a grammar, it is compiled for use in the LumenVox Speech Engine.

These functions load an SRGS grammar that will be usable by a single speech port object.

Functions

int LoadGrammar(const char* gram_name, const char* gram_location);

int LoadGrammar(int gram_name, const char* gram_location);

int LoadGrammarFromBuffer(const char* gram_name, const char* gram_contents);

int LoadGrammarFromBuffer(int gram_name, const char* gram_contents);

int LoadGrammarFromObject(const char* gram_name, LVGrammar& gram_obj);

int LoadGrammarFromObject(int gram_name, LVGrammar& gram_obj);

Parameters

gram_name

The identifier for the grammar being loaded. Whenever you activate, deactivate, or unload, this is the identifier you will use. This can be a string, or an integer ID. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.

gram_location

A file path or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

gram_contents

A null-terminated string containing the contents of a valid SRGS grammar file.

gram_obj

An LVGrammar object.

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used.

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the speech port's logging callback function at priorities 0 and 1, respectively.

See Also

LVSpeechPort::UnloadGrammar functions

LVSpeechPort::IsGrammarLoaded functions

LVSpeechPort::LoadGlobalGrammar functions

LV_SRE_LoadGrammar functions (C API)
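
Example (sketch)

The return value table for the LoadGrammar functions distinguishes two outcomes that leave the grammar usable (success and syntax warning) from two that do not. A caller might capture that distinction as sketched below. The numeric values assigned to the constants are placeholders so that the code compiles without the SDK, and GrammarIsUsable and DescribeLoadResult are made-up helper names.

```cpp
#include <cassert>
#include <string>

// Placeholder values so the sketch compiles without the LumenVox SDK;
// the real constants are defined in the SDK headers and may differ.
enum {
    LV_SUCCESS = 0,
    LV_GRAMMAR_SYNTAX_WARNING = 1,
    LV_GRAMMAR_SYNTAX_ERROR = -1,
    LV_GRAMMAR_LOADING_ERROR = -2
};

// True when the grammar can be used for decoding after the load call.
bool GrammarIsUsable(int load_result) {
    return load_result == LV_SUCCESS || load_result == LV_GRAMMAR_SYNTAX_WARNING;
}

// Short description for application logs, paraphrasing the return value table.
std::string DescribeLoadResult(int load_result) {
    switch (load_result) {
        case LV_SUCCESS:                return "loaded";
        case LV_GRAMMAR_SYNTAX_WARNING: return "loaded with warnings";
        case LV_GRAMMAR_SYNTAX_ERROR:   return "syntax error";
        case LV_GRAMMAR_LOADING_ERROR:  return "grammar not found";
        default:                        return "unrecognized result";
    }
}
```

The detailed compiler messages still arrive through the logging callback; a helper like this only decides whether the application should go on to activate the grammar.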

LoadGlobalGrammar functions

When a global grammar is loaded, the grammar is sent to the server. All subsequent decode requests then contain only global grammar IDs instead of the actual grammars, which avoids network transport overhead for large grammars.

A global grammar is associated with the client process that loads it. All speech ports that belong to that client have access to the global grammar. However, different client processes do not share global grammars with each other.

Generally, the lifetime of a global grammar is controlled by the load and unload functions. However, a client process may terminate without unloading its global grammars. To release such unused grammars, the server periodically checks whether the client process is still alive. Once the server detects that a client process has been inactive for more than 10 minutes, it removes all grammars associated with that client process.

In a multi-threaded program, it is safe to access global grammars in a read-only fashion on multiple threads simultaneously, for instance querying whether a global grammar is loaded or calling decode with global grammars. When loading or unloading takes place concurrently, such as unloading a global grammar while another thread is decoding with that grammar, it is the user's responsibility to prevent race conditions.

Functions

static int LoadGlobalGrammar (const char* gram_name, const char* gram_location);

static int LoadGlobalGrammarFromBuffer (const char* gram_name, const char* gram_contents);

static int LoadGlobalGrammarFromObject (const char* gram_name, LVGrammar& gram_obj);

Parameters

gram_name

The identifier for the grammar being loaded. Whenever you activate, deactivate, or unload, this is the identifier you will use.

gram_location

A file path or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

gram_contents

A null-terminated string containing the contents of a valid SRGS grammar file.

gram_obj

An LVGrammar object.

Return Values

LV_SUCCESS

No errors; this grammar is now ready to use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready for use.

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

LV_GLOBAL_GRAMMAR_TRANSACTION_ERROR

Failed to send the grammar to all servers.

LV_GLOBAL_GRAMMAR_TRANSACTION_PARTIAL_ERROR

Failed to send the grammar to some of the servers.

Remarks

Detailed error and warning messages are sent to the LVSpeechPort application-level logging callback function at priorities 0 and 1, respectively.

Users can load the same grammar with different labels. That will only create one instance of that grammar on the server.

See Also

LVSpeechPort::LoadGrammar functions

LVSpeechPort::IsGlobalGrammarLoaded functions

LVSpeechPort::UnloadGlobalGrammar functions

LV_SRE_LoadGlobalGrammar functions (C API)

UnloadGrammar functions

These functions remove a loaded grammar from a speech port object. The last function removes all loaded grammars from the speech port.

Functions

int UnloadGrammar(const char* gram_name);

int UnloadGrammar(int gram_name);

void UnloadGrammars();

Parameters

gram_name

The identifier for the grammar being unloaded. This is the same identifier you gave the grammar when you loaded it. It can be a null-terminated string, or an integer.

Return Values

LV_SUCCESS

No errors; this grammar is removed.

LV_FAILURE

The grammar was not present. Nothing was removed.

Remarks

Grammars that were activated and then unloaded are still active; they must be explicitly deactivated.

See Also

LVSpeechPort::IsGrammarLoaded functions

LVSpeechPort::UnloadGlobalGrammar functions

LVSpeechPort::LoadGrammar functions

LV_SRE_UnloadGrammar functions (C API)

UnloadGlobalGrammar functions

These functions remove a loaded grammar from the application-level set of grammars. The second function removes all application-level grammars.

Functions

static int UnloadGlobalGrammar(const char* gram_name);

static void UnloadGlobalGrammars( );

Parameters

gram_name

The identifier for the grammar being unloaded. This is the same identifier you gave the grammar when you loaded it.

Return Values

LV_SUCCESS

No errors; this grammar is removed.

LV_FAILURE

The grammar was not present. Nothing was removed.

Remarks

A global grammar is unloaded on the server only when users have called unload functions on all labels that are associated with the grammar.

See Also

LVSpeechPort::UnloadGrammar functions

LVSpeechPort::IsGlobalGrammarLoaded functions

LVSpeechPort::LoadGlobalGrammar functions

LV_SRE_UnloadGlobalGrammar functions (C API)
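
Example (sketch)

The remark above, together with the note that loading one grammar under several labels creates a single server-side instance, describes a reference-counting scheme. The toy model below illustrates that bookkeeping only; it is not LumenVox code and makes no claim about how the server is implemented. GlobalGrammarModel and its members are hypothetical names, and the choice to reject a label that is already bound is an assumption of the model.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model of label bookkeeping for global grammars; not LumenVox code.
class GlobalGrammarModel {
public:
    // Binds a label to a grammar body. Returns false if the label is already
    // bound (an assumption of this model, not documented SDK behavior).
    bool Load(const std::string& label, const std::string& contents) {
        if (labels_.count(label) != 0) return false;
        labels_[label] = contents;
        ++instances_[contents];
        return true;
    }

    // Removes one label. The grammar instance disappears only when its last
    // label is removed. Returns false if the label was not present.
    bool Unload(const std::string& label) {
        std::map<std::string, std::string>::iterator it = labels_.find(label);
        if (it == labels_.end()) return false;
        std::map<std::string, int>::iterator inst = instances_.find(it->second);
        assert(inst != instances_.end());
        if (--inst->second == 0) instances_.erase(inst);
        labels_.erase(it);
        return true;
    }

    bool IsLoaded(const std::string& label) const { return labels_.count(label) != 0; }

    // Number of distinct grammar bodies currently held.
    std::size_t ServerInstances() const { return instances_.size(); }

private:
    std::map<std::string, std::string> labels_;  // label -> grammar body
    std::map<std::string, int> instances_;       // grammar body -> number of labels
};
```

In this model, unloading one of two labels for the same grammar leaves the instance count at one, which mirrors the behavior the remark describes.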

IsGrammarLoaded functions

Functions

bool IsGrammarLoaded(const char* gram_name);

bool IsGrammarLoaded(int gram_name);

Parameters

gram_name

The identifier for the grammar being queried. This is the same identifier you gave the grammar when you loaded it.

Return Values

true if a grammar was found with the label gram_name in the space of application-level grammars; false otherwise.

Remarks

Note: This function only tells you if a grammar with the name gram_name is loaded. It does not tell you if there are two identical grammar bodies loaded.

See Also

LVSpeechPort::UnloadGrammar functions

LVSpeechPort::IsGlobalGrammarLoaded functions

LVSpeechPort::LoadGrammar functions

LV_SRE_IsGrammarLoaded functions (C API)

IsGlobalGrammarLoaded

Function

static bool IsGlobalGrammarLoaded(const char* gram_name);

Parameters

gram_name

The identifier for the grammar being queried. This is the same identifier you gave the grammar when you loaded it.

Return Values

true if a grammar was found with the label gram_name in the space of application-level grammars; false otherwise.

Remarks

Note: This function only tells you if a grammar with the name gram_name is loaded. It does not tell you if there are two identical grammar bodies loaded.

See Also

LVSpeechPort::UnloadGlobalGrammar functions

LVSpeechPort::IsGrammarLoaded functions

LVSpeechPort::LoadGlobalGrammar functions

LV_SRE_IsGlobalGrammarLoaded (C API)

ActivateGrammar functions

If you wish to use a speech port's loaded SRGS grammar for decode, you need to activate it. Activating a grammar puts it in the multi-grammar grammarset called LV_ACTIVE_GRAMMAR_SET. The grammars that were activated can then be used for a decode by specifying LV_ACTIVE_GRAMMAR_SET as the grammarset parameter in a call to Decode, or by setting the STREAM_PARM_GRAMMAR_SET equal to the LV_ACTIVE_GRAMMAR_SET before calling StreamStart. The reason for this mechanism is to maintain backward compatibility with previous APIs.

When ActivateGrammar is called, the grammar is first searched for among the speech port's loaded grammars. If it cannot be found there, the collection of application-level grammars is searched. If you wish to explicitly activate an application-level grammar, use ActivateGlobalGrammar.

Functions

int ActivateGrammar(const char* gram_name);

int ActivateGrammar(int gram_name);

Parameters

gram_name

The identifier for the grammar being activated. This is the same identifier that was given to the grammar when it was loaded. This can be a string, or an integer ID. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.

Return Values

LV_SUCCESS

No errors; this grammar is now active.

LV_GRAMMAR_LOADING_ERROR

This grammar could not be activated, because it was not found in the speech port's set of loaded grammars.

Remarks

Detailed error and warning messages are sent to the speech port's logging callback function at priorities 0 and 1, respectively.

See Also

LV_SRE_DeactivateGrammar functions

LVSpeechPort::ActivateGlobalGrammar

LV_SRE_ActivateGrammar functions (C API)
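
Example (sketch)

The lookup order described above (port-level grammars first, then application-level grammars) and the equivalence of the string "123" and the integer 123 can be modeled in a few lines. This is an illustrative model only, not the SDK: PortModel is a hypothetical class, the constant values are placeholders, and the "port:" and "global:" prefixes exist purely so the example can show which collection satisfied the lookup.

```cpp
#include <cassert>
#include <set>
#include <string>

enum { LV_SUCCESS = 0, LV_GRAMMAR_LOADING_ERROR = -1 };  // placeholder values

// Toy model of a speech port's grammar lookup; not LumenVox code.
class PortModel {
public:
    // global points at the application-level grammar labels shared by all ports.
    explicit PortModel(const std::set<std::string>* global) : global_(global) {
        assert(global != nullptr);
    }

    void LoadGrammar(const std::string& name) { loaded_.insert(name); }

    // Searches the port's own grammars first, then the application-level ones.
    int ActivateGrammar(const std::string& name) {
        if (loaded_.count(name) != 0) {
            active_.insert("port:" + name);
            return LV_SUCCESS;
        }
        if (global_->count(name) != 0) {
            active_.insert("global:" + name);
            return LV_SUCCESS;
        }
        return LV_GRAMMAR_LOADING_ERROR;
    }

    // Integer labels are treated as their decimal string form.
    int ActivateGrammar(int name) { return ActivateGrammar(std::to_string(name)); }

    bool IsActive(const std::string& qualified_name) const {
        return active_.count(qualified_name) != 0;
    }

private:
    const std::set<std::string>* global_;
    std::set<std::string> loaded_;
    std::set<std::string> active_;
};
```

When the same label exists at both levels, the model activates the port-level grammar, which is the case ActivateGlobalGrammar exists to override.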

DeactivateGrammar functions

These functions remove a grammar from the set of active grammars. The last function clears the active grammar set.

Functions

int DeactivateGrammar(const char* gram_name);

int DeactivateGrammar(int gram_name);

int DeactivateGrammars();

Parameters

gram_name

The identifier for the grammar being deactivated. This is the same identifier that was given to the grammar when it was loaded. This can be a string, or an integer ID. The string "123" and the integer 123 are identical labels. Integer names are provided for backward compatibility.

Return Values

LV_SUCCESS

No errors; this grammar is no longer active.

LV_FAILURE

This grammar could not be deactivated, because it was never successfully activated.

See Also

LVSpeechPort::ActivateGrammar functions

LV_SRE_DeactivateGrammar functions (C API)

GetNumberOfInterpretations

Returns the number of semantic interpretation results that were generated by the previous decode.

Function

int GetNumberOfInterpretations(int voicechannel)

Parameters

voicechannel

The audio channel holding the decoded audio.

See Also

LVSpeechPort::GetInterpretation

LVSpeechPort::GetInterpretationString

LV_SRE_GetNumberOfInterpretations (C API)

GetInterpretation

Returns an LVInterpretation object representing the results of the semantic interpretation process.

Function

LVInterpretation GetInterpretation (int voicechannel, int index)

Parameters

voicechannel

The channel that the decode took place on.

index

An utterance could give rise to multiple interpretations, particularly if the grammars involved are ambiguous. index ranges from 0 to GetNumberOfInterpretations - 1.

Return Value

The return type is an interpretation object. The object is a representation of the ECMAScript object made by the matching grammar, using the Semantic Interpretation for Speech Recognition process. It also contains additional information such as the confidence score, matching grammar label, and the input sentence.

See Also

LVSpeechPort::GetNumberOfInterpretations

LVSpeechPort::GetInterpretationString

LVInterpretation C++ API

LV_SRE_CreateInterpretation (C API)

LVSpeechPort::GetNumberOfParses

Returns the number of parse trees that were generated by the previous decode.

Function

int GetNumberOfParses(int voicechannel)

Parameters

voicechannel

The audio channel holding the decoded audio.

See Also

LVSpeechPort::GetParseTree

LVSpeechPort::GetParseTreeString

Parse Tree Introduction

LV_SRE_GetNumberOfParses (C API)

LVSpeechPort::GetParseTree

Provides the user with an LVParseTree object representing the sentence structure of what was decoded by the Speech Engine, according to the active grammars.

Function

LVParseTree GetParseTree(int voicechannel, int index)

Parameters

voicechannel

The audio channel containing the input audio.

index

It is possible to have more than one parse tree for an utterance (for instance if the grammar is ambiguous); this is the index of the tree.

Return Value

A parse tree.

Remark

Logically, a parse tree and the parse string returned to the user are the same. However, an LVParseTree object makes it easy to search the parse tree for useful information.

See Also

LVSpeechPort::GetNumberOfParses

LVSpeechPort::GetParseTreeString

Parse Tree Introduction

LVParseTree C++ API

LV_SRE_CreateParseTree (C API)

LVSpeechPort::GetParseTreeString

Provides the user with a string representation of a speech parse tree.

Function

const char* GetParseTreeString(int voicechannel, int index)

Parameters

voicechannel

The audio channel containing the input audio

index

It is possible to have more than one parse tree possibility (for instance if the grammar is ambiguous); this is the index of the tree

Remark

Logically, a speech parse tree and the parse string returned to the user are the same. However, a speech parse tree object makes it easier to search the parse for useful information. The format of the parse tree string is based on the examples provided in the W3C SRGS specification.

See Also

LVSpeechPort::GetNumberOfParses

LVSpeechPort::GetParseTree

Parse Tree Introduction

LV_SRE_GetParseTreeString (C API)
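
The two calls above are typically used together: ask for the number of parses, then fetch each string by index. The following is a minimal sketch of that loop, not taken from the LumenVox samples. CollectParseStrings is a hypothetical helper, and StubParsePort is a hypothetical stand-in that mirrors only the two signatures documented above so that the sketch compiles without the client library; the same template would accept a real LVSpeechPort.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: gathers every parse tree string for one voice channel.
// Port is any type exposing GetNumberOfParses and GetParseTreeString with the
// signatures documented above.
template <class Port>
std::vector<std::string> CollectParseStrings(Port& port, int voicechannel) {
    std::vector<std::string> out;
    const int total = port.GetNumberOfParses(voicechannel);
    for (int i = 0; i < total; ++i) {
        const char* s = port.GetParseTreeString(voicechannel, i);
        if (s != nullptr) {
            out.emplace_back(s);  // copy the string; do not keep the raw pointer
        }
    }
    return out;
}

// Hypothetical stand-in for a decoded port; for illustration only.
struct StubParsePort {
    std::vector<std::string> parses;

    int GetNumberOfParses(int /*voicechannel*/) const {
        return static_cast<int>(parses.size());
    }
    const char* GetParseTreeString(int /*voicechannel*/, int index) const {
        if (index < 0 || index >= static_cast<int>(parses.size())) {
            return nullptr;  // assumption: the stub signals a bad index with null
        }
        return parses[index].c_str();
    }
};
```

Copying each result into a std::string keeps the caller independent of how long the returned pointer stays valid, which this section does not specify.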

LVSpeechPort::GetNumberOfConceptsReturned

Returns the number of concepts found in the last call to Decode.

int GetNumberOfConceptsReturned(int VoiceChannel);

Return values

The number of concepts found for this voice channel.

Parameters

VoiceChannel

The voice channel processed by Decode.

See Also

LV_SRE_GetNumberOfConceptsReturned

LVSpeechPort::GetConcept

Returns one concept found in the last call to Decode.

const char* GetConcept(int VoiceChannel, int Index);

Return Values

A null-terminated string representing the matched concept.

NULL indicates that Index was outside the possible range.

Parameters

VoiceChannel

The voice channel processed by Decode.

Index

The recognition position of the concept, between 0 and (GetNumberOfConceptsReturned - 1), inclusive.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the concepts highlighted:

See Also

LV_SRE_GetConcept

LVSpeechPort::GetConceptScore

Returns the confidence score of a concept found in the last call to Decode.

int GetConceptScore(int VoiceChannel, int Index);

Return Values

The confidence score of the matched concept. The range of possible values is 0 to 1000.

Parameters

VoiceChannel

The voice channel processed by Decode.

Index

The recognition position of the concept, between 0 and (GetNumberOfConceptsReturned - 1), inclusive.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the scores highlighted:

See Also

LV_SRE_GetConceptScore
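
GetNumberOfConceptsReturned, GetConcept, and GetConceptScore share the same index, so a result loop can pair each concept with its score and discard low-confidence matches. The sketch below assumes that pattern. ConceptsAtOrAbove and StubConceptPort are hypothetical names introduced for illustration; the stub mirrors only the three documented signatures, and its behavior for an out-of-range score index is an assumption.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (concept, score) pairs whose confidence score
// is at least min_score. Scores are documented as ranging from 0 to 1000.
template <class Port>
std::vector<std::pair<std::string, int>> ConceptsAtOrAbove(Port& port,
                                                           int voice_channel,
                                                           int min_score) {
    std::vector<std::pair<std::string, int>> kept;
    const int total = port.GetNumberOfConceptsReturned(voice_channel);
    for (int i = 0; i < total; ++i) {
        const char* name = port.GetConcept(voice_channel, i);
        if (name == nullptr) {
            continue;  // NULL means the index was out of range
        }
        const int score = port.GetConceptScore(voice_channel, i);
        if (score >= min_score) {
            kept.emplace_back(name, score);
        }
    }
    return kept;
}

// Hypothetical stand-in for a port holding decode results; illustration only.
struct StubConceptPort {
    std::vector<std::pair<std::string, int>> results;

    int GetNumberOfConceptsReturned(int /*VoiceChannel*/) const {
        return static_cast<int>(results.size());
    }
    const char* GetConcept(int /*VoiceChannel*/, int Index) const {
        if (Index < 0 || Index >= static_cast<int>(results.size())) return nullptr;
        return results[Index].first.c_str();
    }
    int GetConceptScore(int /*VoiceChannel*/, int Index) const {
        if (Index < 0 || Index >= static_cast<int>(results.size())) return 0;  // assumption
        return results[Index].second;
    }
};
```

The threshold is application-specific; a value such as 500 is only a starting point for tuning, not a recommendation from this manual.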

LVSpeechPort::GetPhonemesDecoded

Returns the actual phonemes found in a call to Decode.

const char* GetPhonemesDecoded(int VoiceChannel, int Index);

Return Values

A null-terminated static string of the decoded phonemes.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the decoded phonemes.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the phonemes highlighted:

See Also

GetPhraseDecoded

GetRawTextDecoded

LV_SRE_GetPhonemes

LVSpeechPort::GetPhraseDecoded

Returns the decoded phrase (with BNF formatting) found in the last call to Decode.

const char* GetPhraseDecoded(int VoiceChannel, int Index);

Return Values

A null-terminated string representing the decoded string.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the decoded phrase.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the phrases highlighted:

The main difference between LVSpeechPort::GetPhraseDecoded and LVSpeechPort::GetRawTextDecoded is in BNF formatting. LVSpeechPort::GetPhraseDecoded returns the decoded phrase as it was entered into the grammar. If the phrase contains BNF formatting (selections, options, grouping, etc.), then the return value preserves that formatting. LVSpeechPort::GetRawTextDecoded returns the decoded phrase after BNF formatting has been removed. Thus, LVSpeechPort::GetRawTextDecoded returns the phrase as a list of the words actually recognized, rather than the phrase as it was entered into the grammar.

See Also

GetPhonemesDecoded

GetRawTextDecoded

LV_SRE_GetPhraseDecoded

LVSpeechPort::GetRawTextDecoded

Returns the decoded raw text (without BNF formatting) found in the last call to Decode.

const char* GetRawTextDecoded(int VoiceChannel, int Index);

Return Values

A null-terminated string representing the decoded raw text.

Parameters

VoiceChannel

The voice channel to process.

Index

The recognition position of the decoded raw text.

Remarks

Assuming the speaker said "Violet" and the grammar contained the concepts under Concept, and the grammar under Phrase, the Speech Engine might return the raw text highlighted:

The main difference between GetPhraseDecoded and GetRawTextDecoded is in BNF formatting. GetPhraseDecoded returns the decoded phrase as it was entered into the grammar. If the phrase contains BNF formatting (selections, options, grouping, etc.), then the return value preserves that formatting. GetRawTextDecoded returns the decoded phrase after BNF formatting has been removed. Thus, GetRawTextDecoded returns the phrase as a list of the words actually recognized, rather than the phrase as it was entered into the grammar.

See Also

GetPhonemesDecoded

GetPhraseDecoded

LV_SRE_GetRawTextDecoded

LVSpeechPort::GetVoiceChannelData

Sets the pointers to the voice channel's copy of the original preprocessed audio data.

int GetVoiceChannelData(int VoiceChannel, short** PCM, unsigned int* Samples);

Return Values

LV_SUCCESS

No errors; PCM and Samples have been successfully set.

LV_SOUND_CHANNEL_OUT_OF_RANGE

The voice channel specified is outside the valid range; possible values are 0-63, inclusive.

LV_BAD_HPORT

The Speech Engine is no longer running. This is the result of a ClosePort call or an unrecoverable Speech Engine error.

Parameters

VoiceChannel

The voice channel to process.

PCM

A pointer to a pointer that will be set to the post-processed audio data.

Samples

A pointer to an unsigned integer that will be set to the number of samples.

See Also

LV_SRE_GetVoiceChannelData
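
Because PCM and Samples are output parameters, the caller passes the addresses of its own variables and reads them only after checking the return code. The sketch below illustrates that calling convention by computing the peak sample magnitude. PeakAmplitude and StubAudioPort are hypothetical, and kSuccess is a placeholder standing in for LV_SUCCESS, whose numeric value comes from the LumenVox headers and is not assumed here.

```cpp
#include <cstdlib>  // std::abs
#include <vector>

// Placeholder for LV_SUCCESS; real code would use the constant from the
// LumenVox headers.
constexpr int kSuccess = 0;

// Hypothetical helper: returns the largest absolute sample value in the voice
// channel's audio, or -1 if the data could not be retrieved.
template <class Port>
int PeakAmplitude(Port& port, int voice_channel) {
    short* pcm = nullptr;
    unsigned int samples = 0;
    if (port.GetVoiceChannelData(voice_channel, &pcm, &samples) != kSuccess ||
        pcm == nullptr) {
        return -1;
    }
    int peak = 0;
    for (unsigned int i = 0; i < samples; ++i) {
        const int magnitude = std::abs(static_cast<int>(pcm[i]));
        if (magnitude > peak) peak = magnitude;
    }
    return peak;
}

// Hypothetical stand-in that owns a buffer of 16-bit samples; illustration only.
struct StubAudioPort {
    std::vector<short> audio;

    int GetVoiceChannelData(int /*VoiceChannel*/, short** PCM, unsigned int* Samples) {
        *PCM = audio.data();
        *Samples = static_cast<unsigned int>(audio.size());
        return kSuccess;
    }
};
```

Note that the function hands back a pointer into the voice channel's own copy of the audio, so the sketch reads the samples without taking ownership of the buffer.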

LVSpeechPort::LoadStandardGrammar

Standard Grammars are deprecated in favor of SRGS built-in grammars

Loads a standard, pre-defined grammar to easily recognize and format numbers, monetary figures or digits.

int LoadStandardGrammar(int GrammarSet, int StdGrammar);

Return Values

LV_SUCCESS

No errors; the standard grammar is loaded.

LV_STANDARD_GRAMMAR_OUT_OF_RANGE

The standard grammar value is not a recognized grammar type.

LV_GRAMMAR_SET_OUT_OF_RANGE

The GrammarSet value is out of range.

Parameters

GrammarSet

The grammar set to load the standard grammar into. Possible values are 0 through 63.

StdGrammar

The standard grammars are:

1. GRAMMAR_DIGITS String of single digits like a phone number or pin code.

2. GRAMMAR_MONEY Monetary value (only implemented for SRGS decodes).

3. GRAMMAR_NUMERIC Numeric value (like 12,000, 24.45, or 35).

4. GRAMMAR_SPELLING Alphabet letters for spelling (not implemented).

5. GRAMMAR_ALPHA_NUMERIC (Not implemented).

6. GRAMMAR_DATE Date values (only implemented for SRGS decodes).

7. GRAMMAR_NONE Clears out the standard grammar, without clearing out any phrases that were added. ResetGrammar( ) will clear out the entire grammar.

Remarks

The client application can load only one standard grammar, but can add any number of concepts with AddPhrase. This is not true, however, if you use SRGS grammars. The correct way to augment a standard SRGS grammar is to load a second grammar to a different location and then activate both. When a standard grammar is loaded, the decoder returns the number, dollar amount, or digit string as either a single concept or a single interpretation string, depending on whether SRGS is used.

As an example, suppose the client application loads GRAMMAR_NUMERIC and also adds the concept and phrase "Widgets". If the sound data contains the speech "twelve widgets", the decoder will return two concepts: the first is the string "12" and the second is the string "Widgets". If the speech was "one thousand one hundred and twenty nine Widgets seven point two Widgets", the decoder would return four concepts: "1129", "Widgets", "7.2", and "Widgets".

However, if you use SRGS, this is not what happens. To get this sort of functionality in the SRGS setting, you would create a grammar that looks like the following:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <semantics/1.0>;
root $how_many_widgets;

$how_many_widgets = $<builtin:grammar/number> widgets {$=$$;};

In this case you wouldn't bother using LoadStandardGrammar() at all, since the standard number grammar will get loaded when you load this grammar. The return type would be an interpretation string representing the number that was recognized, like "1129" or "7.2". The word "widgets" would not be returned in this grammar.

See Also

Standard Grammars

LV_SRE_LoadStandardGrammar

LVSpeechPort::LoadVoiceChannel

Loads the audio data into the specified voice channel prior to a call to Decode (which decodes the audio data).

int LoadVoiceChannel(int VoiceChannel, void* M, int Length, SOUND_FORMAT Format = ULAW_8KHZ);

Return Values

LV_SUCCESS

No errors; the voice channel audio successfully loaded.

LV_BAD_HPORT

The engine is no longer running. This is the result of a ClosePort call or an unrecoverable engine error.

LV_FAILURE

Sound format was incorrectly specified.

Parameters

VoiceChannel

Accepted values 0 through 63.

M

Pointer to audio data.

Length

Memory size in bytes of the audio data.

Format

The audio data sound format.

Remarks

Each LVSpeechPort supports 64 separate voice channels. Each channel has its own storage for decode data, so once the call returns, the client application can release its own copy of the audio. LoadVoiceChannel accepts the audio data and prepares it for decoding.

See Also

LV_SRE_LoadVoiceChannel

LVSpeechPort::AddPhrase

Adds a phrase to a new or existing concept.

int AddPhrase(int GrammarSet, const char* Concept , const char* Phrase);

Return Values

LV_SUCCESS

No errors; the phrase was added to the concept.

LV_BAD_HPORT

The engine is no longer running. This is the result of a ClosePort call or an unrecoverable engine error.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set is out of range.

LV_GRAMMAR_SYNTAX_ERROR or LV_GRAMMAR_SYNTAX_WARNING

The phrase entered has bad syntax, such as mismatched parentheses.

Parameters

GrammarSet

The grammar set to add the phrase to. Integer value between 0 and 63, inclusive.

Concept

The concept to add the phrase to. Null-terminated string.

Phrase

The new phrase.

Remarks

The concept can be new or existing; if it does not yet exist, the call automatically creates it with this single phrase.

See Also

Phrase Formats

Phonemes

LV_SRE_AddPhrase
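
Since AddPhrase creates a concept on first use and appends to it afterwards, a grammar set can be populated from a simple table of concept/phrase pairs. The following sketch shows one way to do that while stopping at the first error code. AddAllPhrases, PhraseEntry, and StubGrammarPort are hypothetical names; kSuccess and kGrammarSetOutOfRange are placeholders for LV_SUCCESS and LV_GRAMMAR_SET_OUT_OF_RANGE, whose real values come from the LumenVox headers.

```cpp
#include <map>
#include <string>
#include <vector>

// Placeholders for the documented return codes; values are not the real ones.
constexpr int kSuccess = 0;
constexpr int kGrammarSetOutOfRange = -1;

// One row of a hypothetical phrase table.
struct PhraseEntry {
    const char* concept_name;
    const char* phrase;
};

// Hypothetical helper: adds every entry to grammar_set and returns the first
// non-success code, or kSuccess if all phrases were accepted.
template <class Port>
int AddAllPhrases(Port& port, int grammar_set, const std::vector<PhraseEntry>& entries) {
    for (const PhraseEntry& entry : entries) {
        const int rc = port.AddPhrase(grammar_set, entry.concept_name, entry.phrase);
        if (rc != kSuccess) return rc;
    }
    return kSuccess;
}

// Hypothetical stand-in that records phrases per concept; illustration only.
struct StubGrammarPort {
    std::map<std::string, std::vector<std::string>> concepts;

    int AddPhrase(int GrammarSet, const char* Concept, const char* Phrase) {
        if (GrammarSet < 0 || GrammarSet > 63) return kGrammarSetOutOfRange;
        concepts[Concept].push_back(Phrase);  // creates the concept if it is new
        return kSuccess;
    }
};
```

In a real application the returned code could be passed to ReturnErrorString to obtain a readable description.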

LVSpeechPort::RemoveConcept

Removes a concept and all of its phrases.

int RemoveConcept(int GrammarSet, const char* Concept);

Return Values

LV_SUCCESS

No errors; the concept and all phrases are removed from the grammar set.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set specified is outside the valid range.

LV_BAD_HPORT

The engine is no longer running. This is the result of a ClosePort call or an unrecoverable engine error.

Parameters

GrammarSet

Which grammar set to remove the concept from. Possible value range 0 - 63.

Concept

Existing concept to remove. Null-terminated string.

See Also

LV_SRE_RemoveConcept

LVSpeechPort::ResetGrammar

Removes all concepts from a grammar.

int ResetGrammar(int GrammarSet);

Return Values

LV_SUCCESS

No errors; grammar reset.

LV_GRAMMAR_SET_OUT_OF_RANGE

The grammar set value is out of expected range (0-63).

See Also

LV_SRE_ResetGrammar

LVSpeechPort::ReturnErrorString

Returns a description of an error code.

const char* ReturnErrorString(int ReturnCode);

Return Values

A null-terminated static string describing the error code.

Parameters

ReturnCode

The error code.

Remarks

If the error code is an invalid error code, "Invalid Error Code" is returned.

See Also

LV_SRE_ReturnErrorString

LVSpeechPort::SetProperty

SetProperty is deprecated in favor of using SetPropertyEx.

Sets various properties on the port.

int SetProperty(PROPERTIES Property, int Value);

Return Values

LV_SUCCESS

No errors; Property is set to Value.

LV_NOT_A_VALID_PROPERTY_VALUE

The property value is not valid for the designated property.

Parameters

Property

Which property to modify.

Value

Property-dependent.

Remarks

Currently, only PROP_SAVE_SOUND_FILES is implemented; setting Value to 1 will cause the port to save request and answer files to disk; setting Value to 0 turns this feature off. The request and answer files are invaluable for troubleshooting and tuning applications, but will quickly fill up a hard drive.

See Also

Properties

LV_SRE_SetProperty

SetPropertyEx

LVSpeechPort::SetPropertyEx

Sets various properties for a port, client, soundchannel, or grammar.

int SetPropertyEx(int propertyname, int valuetype, void* pvalue, int target = PROP_EX_TARGET_PORT, int index = 0 );

Return Values

LV_SUCCESS

No errors; property is set to the value pointed to by pvalue.

LV_INVALID_PROPERTY

The property does not exist.

LV_INVALID_PROPERTY_VALUE

The property value is invalid for the designated property (e.g. out of range).

LV_INVALID_PROPERTY_TARGET

The property cannot be set for the specified target.

LV_INVALID_PROPERTY_VALUE_TYPE

The property's type is incompatible with the declared type.

LV_INVALID_PROPERTY_TARGET_IDX

The target's index (grammar set, voicechannel) is out of range for this property.

Note: If more than one error occurs, which error code is returned is undefined.

Parameters

propertyname

Which property to modify.

valuetype

The value type of the property being set. Legal values are:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

PROP_EX_VALUE_TYPE_STRING

PROP_EX_VALUE_TYPE_FLOAT_PTR

Each property has a set of legal value types. See Properties.

pvalue

A pointer to the new value for propertyname. pvalue will be reinterpreted according to the value type provided.

target

The portion of the API that this property is set for. Legal values are:

PROP_EX_TARGET_PORT -- pvalue affects an entire speech port object

PROP_EX_TARGET_CHANNEL -- pvalue affects one voice channel in the speech port. The channel is specified by index.

PROP_EX_TARGET_GRAMMAR -- pvalue affects one grammar set in the speech port. The set is specified by index.

PROP_EX_TARGET_CLIENT -- pvalue is global, and affects all ports on the client.

Remarks

See Properties for a list of modifiable properties.

You can use this function only after opening a port; calling it before a port is open will fail. To set a client-scope property, use the static function LVSpeechPort::SetClientPropertyEx.

See Also

Properties

LV_SRE_SetPropertyEx

(static) LVSpeechPort::SetClientPropertyEx
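
The pvalue parameter is an untyped pointer that is reinterpreted according to valuetype, so the declared type and the pointed-to data must agree. The sketch below illustrates that contract with two small typed wrappers. Everything here is hypothetical: the wrapper names, StubPropertyPort, and the numeric values of the constants, which merely stand in for PROP_EX_VALUE_TYPE_INT_PTR, PROP_EX_VALUE_TYPE_STRING, PROP_EX_TARGET_PORT, LV_SUCCESS, and LV_INVALID_PROPERTY_VALUE_TYPE from the LumenVox headers.

```cpp
#include <map>
#include <string>

// Placeholder values; real code would use the LumenVox constants instead.
constexpr int kTypeIntPtr = 1;
constexpr int kTypeString = 2;
constexpr int kTargetPort = 0;
constexpr int kSuccess = 0;
constexpr int kInvalidValueType = -1;

// Hypothetical wrapper: passes the address of an int together with the
// matching "pointer to int" value type.
template <class Port>
int SetIntProperty(Port& port, int propertyname, int value) {
    return port.SetPropertyEx(propertyname, kTypeIntPtr, &value, kTargetPort, 0);
}

// Hypothetical wrapper: passes a null-terminated string with the string type.
template <class Port>
int SetStringProperty(Port& port, int propertyname, const std::string& value) {
    return port.SetPropertyEx(propertyname, kTypeString,
                              const_cast<char*>(value.c_str()), kTargetPort, 0);
}

// Hypothetical stand-in showing how a callee might reinterpret pvalue.
struct StubPropertyPort {
    std::map<int, int> ints;
    std::map<int, std::string> strings;

    int SetPropertyEx(int propertyname, int valuetype, void* pvalue,
                      int target = kTargetPort, int index = 0) {
        (void)target;
        (void)index;
        switch (valuetype) {
            case kTypeIntPtr:
                ints[propertyname] = *static_cast<int*>(pvalue);
                return kSuccess;
            case kTypeString:
                strings[propertyname] = static_cast<const char*>(pvalue);
                return kSuccess;
            default:
                return kInvalidValueType;
        }
    }
};
```

Typed wrappers of this kind keep the valuetype argument and the pointed-to data in one place, which reduces the chance of a mismatched pair at the call site.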

LVSpeechPort::StreamStart

Sets up a new stream.

int StreamStart();

Return Values

LV_SUCCESS

Stream set up.

LV_FAILURE

Parameters incorrectly set.

Remarks

Call this function to set up a new stream. You must call it after calling StreamStop or StreamCancel, or after end-of-speech has been detected on the previous utterance.

See Also

StreamSetParameter

StreamStop

StreamCancel

LVSpeechPort::StreamSendData

Sends a buffer of sound data to the stream.

int StreamSendData(void* SoundData, int SoundDataLength);

Return Values

LV_SUCCESS

Data accepted

LV_FAILURE

Stream not active or NULL sound data.

Parameters

SoundData

Pointer to the memory buffer containing sound data.

SoundDataLength

Length in bytes of sound data.

Remarks

Used to do the actual streaming. Call this function with each sound data buffer. This call copies the sound data to an internal buffer and returns immediately. Processing of the sound data takes place on a background thread.

See Also

StreamSetStateChangeCallBack

StreamGetStatus
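
A typical streaming session calls StreamStart once, StreamSendData for each buffer, and StreamStop at the end. The sketch below walks through that sequence for audio that is already in memory, cancelling the stream if a send fails. StreamWholeBuffer and StubStreamPort are hypothetical, and kSuccess and kFailure are placeholders for LV_SUCCESS and LV_FAILURE. A live application would usually send buffers as they arrive from the telephony hardware instead of slicing a vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Placeholders for LV_SUCCESS and LV_FAILURE; values are not the real ones.
constexpr int kSuccess = 0;
constexpr int kFailure = -1;

// Hypothetical helper: streams audio to the port in chunk_bytes pieces.
template <class Port>
int StreamWholeBuffer(Port& port, std::vector<unsigned char>& audio,
                      std::size_t chunk_bytes) {
    if (chunk_bytes == 0) return kFailure;
    int rc = port.StreamStart();
    if (rc != kSuccess) return rc;
    for (std::size_t offset = 0; offset < audio.size(); offset += chunk_bytes) {
        const std::size_t n = std::min(chunk_bytes, audio.size() - offset);
        rc = port.StreamSendData(audio.data() + offset, static_cast<int>(n));
        if (rc != kSuccess) {
            port.StreamCancel();  // discard what was sent so far
            return rc;
        }
    }
    return port.StreamStop();
}

// Hypothetical stand-in that copies data like the documented call does.
struct StubStreamPort {
    bool active = false;
    int sends = 0;
    std::vector<unsigned char> buffered;

    int StreamStart() { active = true; buffered.clear(); return kSuccess; }
    int StreamSendData(void* SoundData, int SoundDataLength) {
        if (!active || SoundData == nullptr) return kFailure;
        const unsigned char* p = static_cast<unsigned char*>(SoundData);
        buffered.insert(buffered.end(), p, p + SoundDataLength);
        ++sends;
        return kSuccess;
    }
    int StreamStop() { if (!active) return kFailure; active = false; return kSuccess; }
    int StreamCancel() { if (!active) return kFailure; active = false; buffered.clear(); return kSuccess; }
};
```

Because StreamSendData copies the data before returning, the caller's buffer can be reused for the next chunk as soon as the call completes.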

LVSpeechPort::StreamGetStatus

Returns status of stream.

int StreamGetStatus();

Return Values

Returns a stream status define. See Stream Status.

Remarks

Called to check the current state of stream.

See Also

StreamSetStateChangeCallBack

LVSpeechPort::StreamGetLength

Returns length of sound data in stream buffer.

int StreamGetLength();

Return Values

Number of bytes in internal buffer for sound stream.

Remarks

This is the total number of bytes streamed. It does not include bytes sent before barge-in is detected (if STREAM_PARM_DETECT_BARGE_IN is active). It can be useful if the application wants to stop a post-barge-in stream after a certain amount of time (for example, to limit a user's speech to 10 seconds).

See Also

StreamSetStateChangeCallBack
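
To turn a time limit into a byte count, multiply the limit by the data rate of the stream format. As a worked example under the assumption of 8 kHz u-law audio (one byte per sample, so 8000 bytes per second), a 10 second limit corresponds to 80,000 bytes. The sketch below applies that arithmetic; SpeechLimitReached and StubLengthPort are hypothetical names, and the data rate must be adjusted to whatever format the stream actually uses.

```cpp
// Hypothetical helper: true once the stream holds at least max_seconds of
// audio, given the data rate of the stream format in bytes per second.
template <class Port>
bool SpeechLimitReached(Port& port, int bytes_per_second, int max_seconds) {
    const long limit = static_cast<long>(bytes_per_second) * max_seconds;
    return port.StreamGetLength() >= limit;
}

// Hypothetical stand-in reporting a fixed stream length; illustration only.
struct StubLengthPort {
    int length = 0;
    int StreamGetLength() const { return length; }
};
```

An application could poll this check between StreamSendData calls and call StreamStop once it returns true.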

LVSpeechPort::StreamSetStateChangeCallBack

Set up a call back to receive state change notification of a stream.

int StreamSetStateChangeCallBack(LV_SRE_StreamStateChangeFn* fn, void* UserData);

Return Values

LV_SUCCESS

Parameters

LV_SRE_StreamStateChangeFn

Pointer to callback function to receive state change updates. See Stream Callback.

UserData

Application defined data sent back in callback.

Remarks

Each time a stream's status changes, this callback will be called.

See Also

LV_SRE_StreamStateChangeFn

StreamGetStatus

LVSpeechPort::StreamStop

Stops stream and loads sound channel with streamed data.

int StreamStop();

Return Values

LV_SUCCESS

LV_FAILURE Stream not active.

Remarks

This function ends streaming and puts streamed data into the voice channel defined with the STREAM_PARM_VOICE_CHANNEL parameter. If the STREAM_PARM_AUTO_DECODE parameter is active, the decode will begin (non-blocking) when this function is called.

See Also

StreamSetParameter

StreamCancel

Stream Parameters

LVSpeechPort::StreamCancel

Stops stream, sound data is discarded.

int StreamCancel();

Return Values

LV_SUCCESS

LV_FAILURE Stream not active.

Remarks

This kills the stream. It can be called to cancel a stream (particularly an auto-decode stream) in order to start a new stream.

See Also

StreamStop

LVSpeechPort::StreamSetParameter

Sets a new value for a stream property.

int StreamSetParameter(int StreamParameter, unsigned long StreamParameterValue);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY StreamParameter does not exist.

LV_INVALID_PROPERTY_VALUE StreamParameterValue is out of range for the stream parameter.

Parameters

StreamParameter

Stream parameter to change. See Stream Parameters.

StreamParameterValue

New stream parameter value.

Remarks

Sets a stream parameter value.

See Also

StreamGetParameter

StreamSetParameterToDefault

Stream Parameters

LVSpeechPort::StreamGetParameter

Gets the current value of a stream property.

int StreamSetParameter(int StreamParameter, unsigned long StreamParameterValue);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY StreamParameter does not exist.

LV_INVALID_PROPERTY_VALUE StreamParameterValue is out of range for the stream parameter.

Parameters

StreamParameter

Stream parameter to change. See Stream Parameters.

StreamParameterValue

New stream parameter value.

Remarks

Gets a stream parameter value.

See Also

StreamSetParameter

StreamSetParameterToDefault

Stream Parameters

LVSpeechPort::StreamSetParameterToDefault

Sets a stream property to its default value.

int StreamSetParameterToDefault(int StreamParameter);

Return Values

LV_SUCCESS

LV_INVALID_PROPERTY Stream parameter does not exist.

Parameters

StreamParameter

Stream parameter to reset. See Stream Parameters.

Remarks

Sets a stream parameter value back to default setting.

See Also

StreamGetParameter

StreamSetParameter

Stream Parameters

LVSpeechPort::WaitForEngineToIdle

(Deprecated in favor of LVSpeechPort::WaitForDecode.)

Blocks the client application until the port is idle (not decoding).

int WaitForEngineToIdle(int MillisecondsToWait, int VoiceChannel = -1);

Return Values

LV_SUCCESS

No errors or timeout; the engine is now idle.

LV_TIME_OUT

WaitForEngineToIdle's timeout was reached before the engine became idle.

Parameters

MillisecondsToWait

The number of milliseconds to wait before returning if the Speech Port does not become idle.

VoiceChannel

Which VoiceChannel to wait on, -1 waits on all voice channels for the port.

Remarks

This function is deprecated in favor of LVSpeechPort::WaitForDecode. To achieve the same behavior as LVSpeechPort::WaitForDecode, use property PROP_EX_DECODE_TIMEOUT, and set MillisecondsToWait to TIMEOUT_INFINITE.

Some of the LVSpeechPort methods run asynchronously, in particular Decode. WaitForEngineToIdle is primarily useful when Decode is called without LV_DECODE_BLOCK. In this case, Decode returns immediately but continues processing the voice channel's audio data in a separate thread. Since client applications will eventually need the results, they need a way to query the port to see if Decode has finished. WaitForEngineToIdle will wait the specified time for the engine to idle; check the return value to ensure the engine is idle, indicating that decode results are available.

WaitForEngineToIdle is also useful to ensure the LVSpeechPort has finished initializing, prior to calls to Decode.

See Also

Decode

LVSpeechPort::WaitForDecode

LV_SRE_WaitForEngineToIdle

LVSpeechPort::GetNumberOfNBestAlternatives

Returns the number of n-best alternatives found by the engine.

int GetNumberOfNBestAlternatives(int voicechannel);

Return Values

Number of n-best alternatives. It will always be less than or equal to the value set for PROP_EX_MAX_NBEST_RETURNED.

Parameters

voicechannel

The channel containing the decoded audio.

See Also

PROP_EX_MAX_NBEST_RETURNED

LVSpeechPort::SwitchToNBestAlternative

LV_SRE_GetNumberOfNBestAlternatives

LVSpeechPort::SwitchToNBestAlternative

Switches the n-best alternative that is currently viewable. After this call, subsequent result retrieval functions, such as LVSpeechPort::GetInterpretation, are bound to the selected n-best alternative.

int SwitchToNBestAlternative(int voicechannel, int index);

Return Values

LV_SUCCESS

LV_FAILURE The index is not valid.

Parameters

voicechannel

The channel containing the decoded audio.

index

The index of the n-best alternative to switch to. It may be any value in the range [0, LVSpeechPort::GetNumberOfNBestAlternatives).

Remarks

Each alternative represents a distinct sentence. However, since some sentences can have multiple interpretations or multiple parses, some alternatives may return multiple parse tree or interpretation objects. For this reason, it is recommended to retrieve all results as follows:

int nbest_count;
int nbest_total = port.GetNumberOfNBestAlternatives(vc);
int interp_count;
for (nbest_count = 0; nbest_count < nbest_total; ++nbest_count)
{
    port.SwitchToNBestAlternative(vc, nbest_count);
    int interp_total = port.GetNumberOfInterpretations(vc);
    for (interp_count = 0; interp_count < interp_total; ++interp_count)
    {
        LVInterpretation interp = port.GetInterpretation(vc, interp_count);
        /* do something with the interp */
    }
}

Even though more than one interpretation can live in a single n-best result, the same interpretation will not live in more than one n-best result. The lower scoring interpretations are pruned out.

See Also

LVSpeechPort::GetNumberOfNBestAlternatives

LV_SRE_SwitchToNBestAlternative

LVSpeechPort::WaitForDecode

Blocks the client application until the decode is finished.

int WaitForDecode(int VoiceChannel);

Return Values

LV_SUCCESS

No errors or timeout; the decode interaction is finished.

LV_TIME_OUT

The timeout value associated with PROP_EX_DECODE_TIMEOUT was exceeded before a result was returned from the Speech Engine. The decode was dropped from the Engine, and the LVSpeechPort may now start a new decode request.

Parameters

VoiceChannel

Which voice channel to wait on. Setting VoiceChannel equal to -1 causes a wait on all the voice channels for the port.

Remarks

Some API functions run asynchronously, in particular LVSpeechPort::Decode. LVSpeechPort::WaitForDecode is primarily useful when LVSpeechPort::Decode is called without LV_DECODE_BLOCK. In this case, LVSpeechPort::Decode returns immediately but continues processing the voice channel's audio data in a separate thread. Because client applications eventually need the results, they need a way to find out whether LVSpeechPort::Decode has finished. LVSpeechPort::WaitForDecode waits for the engine to become idle, for at most the time set in PROP_EX_DECODE_TIMEOUT. Check the return value to ensure the decode interaction is finished before attempting to retrieve answers from the speech port.
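The return-value check described above might be sketched as follows. This is only an illustration: FakePort is a hypothetical stand-in that mimics the documented WaitForDecode(int VoiceChannel) signature so the sketch compiles without the SDK, and the numeric values given to LV_SUCCESS and LV_TIME_OUT are placeholders rather than the values from the SDK headers.

```cpp
#include <cassert>
#include <string>

namespace demo {

// Placeholder values: the real constants come from the SDK headers.
const int LV_SUCCESS = 0;
const int LV_TIME_OUT = -1;

// Hypothetical stand-in for LVSpeechPort; it only mimics the documented
// WaitForDecode(int VoiceChannel) signature.
class FakePort {
public:
    explicit FakePort(bool finishes_in_time)
        : finishes_in_time_(finishes_in_time) {}

    // Pretends to block until the decode finishes or the timeout expires.
    int WaitForDecode(int /*VoiceChannel*/) {
        return finishes_in_time_ ? LV_SUCCESS : LV_TIME_OUT;
    }

private:
    bool finishes_in_time_;
};

// Waits on a voice channel and reports whether results may be read.
// Templated so the same logic could accept a real port type.
template <typename Port>
std::string AwaitResults(Port& port, int voicechannel) {
    assert(voicechannel >= -1);  // -1 means "all channels" per the text above
    int rc = port.WaitForDecode(voicechannel);
    if (rc == LV_SUCCESS) {
        return "ready";      // safe to retrieve interpretations or parse trees
    }
    if (rc == LV_TIME_OUT) {
        return "timed out";  // decode was dropped; a new decode may be started
    }
    return "error";          // any other code: do not read results
}

}  // namespace demo
```

With a real port, the same pattern would follow a Decode call made without LV_DECODE_BLOCK; results are read only in the "ready" branch.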

See Also

PROP_EX_DECODE_TIMEOUT

LVSpeechPort::Decode

LV_SRE_WaitForDecode

LVSpeechPort::SetClientPropertyEx

Sets properties at the scope of the client process.

static int SetClientPropertyEx(int propertyname, int valuetype, void* pvalue);

Return Values

LV_SUCCESS

No errors; property is set to the value pointed to by pvalue.

LV_INVALID_PROPERTY

The property does not exist.

LV_INVALID_PROPERTY_VALUE

The property value is invalid for the designated property (e.g. out of range).

LV_INVALID_PROPERTY_TARGET

The property cannot be set for the specified target.

LV_INVALID_PROPERTY_VALUE_TYPE

The property's type is incompatible with the declared type.

Note: If more than one error occurs, which error code is returned is undefined.

Parameters

propertyname

Which property to modify.

valuetype

The value type of the property being set. Legal values are:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

PROP_EX_VALUE_TYPE_STRING

PROP_EX_VALUE_TYPE_FLOAT_PTR

Each property has a set of legal value types. See Properties.

pvalue

A pointer to the new value for propertyname. pvalue will be reinterpreted according to the value type provided.

Remarks

See Properties for a list of modifiable properties.

A client property can be modified by calling this function even before opening a port.
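How pvalue might be reinterpreted according to valuetype can be sketched as below. Everything here is an assumption made for illustration: FakeSetClientPropertyEx is a stand-in (the real function is a static member and takes no settings argument), the DEMO_PROP_* identifiers are hypothetical, and the numeric values of the constants are placeholders, not the SDK's.

```cpp
#include <cassert>
#include <string>

namespace demo {

// Placeholder values; the real constants are defined by the SDK headers.
enum ValueType { PROP_EX_VALUE_TYPE_INT_PTR = 1, PROP_EX_VALUE_TYPE_STRING = 2 };
enum Result {
    LV_SUCCESS = 0,
    LV_INVALID_PROPERTY = -1,
    LV_INVALID_PROPERTY_VALUE_TYPE = -2
};

// Hypothetical property identifiers, used only for this illustration.
enum Property { DEMO_PROP_TIMEOUT_MS = 100, DEMO_PROP_LABEL = 101 };

// Demo-only storage for the values that get set.
struct ClientSettings {
    int timeout_ms = 0;
    std::string label;
};

// Stand-in following the documented parameter order
// (propertyname, valuetype, pvalue). It shows one plausible way the
// untyped pvalue could be cast based on the declared value type.
inline int FakeSetClientPropertyEx(ClientSettings& settings, int propertyname,
                                   int valuetype, void* pvalue) {
    assert(pvalue != nullptr);
    switch (propertyname) {
    case DEMO_PROP_TIMEOUT_MS:
        if (valuetype != PROP_EX_VALUE_TYPE_INT_PTR)
            return LV_INVALID_PROPERTY_VALUE_TYPE;
        settings.timeout_ms = *static_cast<int*>(pvalue);
        return LV_SUCCESS;
    case DEMO_PROP_LABEL:
        if (valuetype != PROP_EX_VALUE_TYPE_STRING)
            return LV_INVALID_PROPERTY_VALUE_TYPE;
        settings.label = static_cast<const char*>(pvalue);
        return LV_SUCCESS;
    default:
        return LV_INVALID_PROPERTY;
    }
}

}  // namespace demo
```

The point of the sketch is that the caller must keep valuetype and the actual type behind pvalue in agreement, because the callee has no other way to know what the pointer refers to.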

See Also

Properties

LV_SRE_SetPropertyEx

LVInterpretation Class

Intro To LVInterpretation

Use <LVSpeechPort.h> or <LV_SRE_Semantic.h>

Return Type | Function | Description

(none) | LVInterpretation(void) | Constructs an LVInterpretation object.

(none) | LVInterpretation(const LVInterpretation& other) | Copy constructor.

LVInterpretation& | operator=(const LVInterpretation& other) | Assignment operator.

(none) | ~LVInterpretation(void) | Destroys the LVInterpretation object.

LVSemanticData& | ResultData(void) | The result object, representing the end product of the semantic interpretation process.

const char* | ResultName(void) | Returns the name of the result data for this interpretation.

const char* | GrammarLabel(void) | Returns the name of the grammar as it was provided to the speech port.

const char* | Mode(void) | Returns the interaction mode for this answer.

const char* | Language(void) | Returns the language identifier for this answer.

const char* | InputSentence(void) | The sentence that generated this interpretation.

int | Score(void) | Confidence score for this interpretation.

const char* | TagFormat(void) | The tag format that created the Data object.

LVInterpretation: Constructing and Copying

LVInterpretation objects are fully copyable.

Functions

LVInterpretation(void)

LVInterpretation(const LVInterpretation& other_si)

LVInterpretation& operator=(const LVInterpretation& other_si)

~LVInterpretation()

Parameters

other_si

The interpretation object whose contents are being copied.

Remarks

Example

LVSpeechPort Port;

//open the port and do a decode
//...
//when the decode is finished, grab an interpretation object
LVInterpretation Interp = Port.GetInterpretation(voicechannel, index);

//start using the interpretation data.
//...

See Also

Creating, Copying and Releasing an LVInterpretation Handle (C API)

ResultData

Returns a semantic data object generated by the user input and a matching grammar.

Function

const LVSemanticData& LVInterpretation::ResultData( )

Returns

An object representing the results of the semantic interpretation process.

See Also

LVSemanticData C++ API

LVInterpretation_GetResultData (C API)

ResultName

Returns the name of the result data for this interpretation. The result name is usually the root rule of the matching grammar for this interpretation.

Function

const char* LVInterpretation::ResultName ( )

See Also

LVInterpretation_GetResultName (C API)

Language

Returns the language identifier of the grammar that generated this interpretation.

Function

const char* LVInterpretation::Language( )

Returns

An RFC 3066 language identifier, such as "en-US" for United States English, or "fr" for French.

See Also

LVInterpretation_GetLanguage ( C API )

Mode

Returns the interaction mode that created this interpretation.

Function

const char* LVInterpretation::Mode()

Returns

"voice" or "dtmf"

See Also

LVInterpretation_GetMode (C API)

TagFormat

Returns the name of the semantic process that created this interpretation.

Function

const char* LVInterpretation::TagFormat()

Returns

tag format identifier

See Also

LVInterpretation_GetTagFormat (C API)

InputSentence

Returns the input that was fed to the matching grammar to create this interpretation. It may represent the speech the Speech Engine recognized, or a DTMF sequence.

Function

const char* LVInterpretation::InputSentence()

See Also

LVInterpretation_GetInputSentence (C API)

GrammarLabel

Returns the name of the grammar that generated this interpretation.

Function

const char* LVInterpretation::GrammarLabel ()

Remarks

GrammarLabel will always return the name of one of the grammars you activated for decode. If the active grammar had an integer label, then the returned label will be a string representation of that integer.

See Also

LVInterpretation_GetGrammarLabel ( C API )

Score

Function

int LVInterpretation::Score()

Returns

A number between 0 and 1000. Higher numbers indicate that the speech port has more confidence in this interpretation.

See Also

LVInterpretation_GetScore (C API)

LVSemanticData Class

LVSemanticData

LVSemanticData is the C++ class representing semantic data. Think of an LVSemanticData object as a container holding one of the following items:

A boolean

An integer

A floating point number

A string

A composite object

An array

Return Value | Function | Description

(none) | LVSemanticData() | Constructor

(none) | LVSemanticData(const LVSemanticData& other) | Copy constructor

LVSemanticData | operator=(const LVSemanticData& other) | Assignment operator

(none) | ~LVSemanticData() | Destructor

int | Type() | Returns the semantic data type contained in this object.

bool | GetBool() | If the data in this object is of type SI_TYPE_BOOL, returns the boolean value.

int | GetInt() | If the data in this object is of type SI_TYPE_INT, returns the integer value.

double | GetDouble() | If the data in this object is of type SI_TYPE_DOUBLE, returns the floating point value.

const char* | GetString() | If the data in this object is of type SI_TYPE_STRING, returns the string value.

LVSemanticObject | GetSemanticObject() | If the data in this object is of type SI_TYPE_OBJECT, returns the semantic object value.

LVSemanticArray | GetSemanticArray() | If the data in this object is of type SI_TYPE_ARRAY, returns the semantic array value.

Type

Returns the data type contained in a given LVSemanticData object.

Function

int LVSemanticData::Type( )

Return Value

One of seven semantic data types.

GetBool

Returns a boolean value contained in an LVSemanticData object. This function assumes that the object contains data of type SI_TYPE_BOOL. If the user calls this function when its type is not SI_TYPE_BOOL, the function always returns false.

Function

bool LVSemanticData::GetBool( )

Return Values

A boolean value.

GetInt

Returns the integer value contained in a given semantic data object. This function assumes that the data contained is of type SI_TYPE_INT. If it is not, this function always returns 0.

Function

int LVSemanticData::GetInt( )

Return Values

An integer value.

GetDouble

Returns a double precision floating point value contained in the given semantic data object. This function assumes that the contained data is of type SI_TYPE_DOUBLE. If it is not, this function always returns 0.0.

Function

double LVSemanticData::GetDouble( )

Return Values

A double.

GetString

Returns the string contained in a given LVSemanticData object. This function assumes that the contained data is of type SI_TYPE_STRING. If it is not, this function always returns NULL.

Function

const char* LVSemanticData::GetString( )

Return Values

NULL

Either the contained data is not of type SI_TYPE_STRING, or some error occurred.

Other

A pointer to a buffer containing the string.

GetSemanticObject

If the LVSemanticData object contains an element of type SI_TYPE_OBJECT, this function returns the composite object. Otherwise, it returns an empty object.

Function

LVSemanticObject LVSemanticData::GetSemanticObject ( );

Returns

A semantic object

See Also

LVSemanticObject C++ API

GetSemanticArray

If the LVSemanticData object contains an element of type SI_TYPE_ARRAY, this function returns the array object. Otherwise, it returns an empty array object.

Function

LVSemanticArray LVSemanticData::GetSemanticArray ( );

Returns

A semantic array

See Also

LVSemanticArray C++ API
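Because each getter falls back to a default value (false, 0, NULL) on a type mismatch, a returned 0 or false is ambiguous unless Type() is checked first. A minimal sketch of that check follows. FakeSemanticData is a hypothetical stand-in for LVSemanticData that reproduces only the fallback behavior documented above, and the SI_TYPE_* numbering is a placeholder.

```cpp
#include <cassert>
#include <string>

namespace demo {

// Placeholder numbering; the real SI_TYPE_* constants come from the SDK headers.
enum SemanticType { SI_TYPE_NULL, SI_TYPE_BOOL, SI_TYPE_INT, SI_TYPE_STRING };

// Hypothetical stand-in for LVSemanticData holding a single scalar.
// The getters mimic the documented fallbacks on a type mismatch.
class FakeSemanticData {
public:
    FakeSemanticData() : type_(SI_TYPE_NULL), b_(false), i_(0) {}

    static FakeSemanticData FromBool(bool v) {
        FakeSemanticData d; d.type_ = SI_TYPE_BOOL; d.b_ = v; return d;
    }
    static FakeSemanticData FromInt(int v) {
        FakeSemanticData d; d.type_ = SI_TYPE_INT; d.i_ = v; return d;
    }
    static FakeSemanticData FromString(const std::string& v) {
        FakeSemanticData d; d.type_ = SI_TYPE_STRING; d.s_ = v; return d;
    }

    int Type() const { return type_; }
    bool GetBool() const { return type_ == SI_TYPE_BOOL ? b_ : false; }
    int GetInt() const { return type_ == SI_TYPE_INT ? i_ : 0; }
    const char* GetString() const {
        return type_ == SI_TYPE_STRING ? s_.c_str() : nullptr;
    }

private:
    SemanticType type_;
    bool b_;
    int i_;
    std::string s_;
};

// Dispatches on Type() before calling the matching getter.
inline std::string Describe(const FakeSemanticData& data) {
    switch (data.Type()) {
    case SI_TYPE_BOOL:
        return data.GetBool() ? "bool:true" : "bool:false";
    case SI_TYPE_INT:
        return "int:" + std::to_string(data.GetInt());
    case SI_TYPE_STRING: {
        const char* s = data.GetString();
        assert(s != nullptr);  // guaranteed by the type check above
        return std::string("string:") + s;
    }
    default:
        return "other";
    }
}

}  // namespace demo
```

In this sketch, calling GetInt() on a string value yields 0, which is indistinguishable from a genuine integer 0 without the Type() check.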

LVSemanticObject Class

LVSemanticObject

LVSemanticObject represents a composite object. The user can get an LVSemanticObject by calling LVSemanticData::GetSemanticObject().

Return Types | Functions | Description

(none) | LVSemanticObject() | Constructor

(none) | LVSemanticObject(const LVSemanticObject& other) | Copy constructor

(none) | ~LVSemanticObject() | Destructor

LVSemanticObject& | operator=(const LVSemanticObject& other) | Assignment operator

int | NumberOfProperties() | Returns the number of properties in this object.

const char* | PropertyName(int index) | Returns the property name corresponding to index.

LVSemanticData | PropertyValue(const char* property_name), PropertyValue(int index) | Returns the semantic data corresponding to property_name or index.

bool | PropertyExists(const char* property_name) | If this object has a property named property_name, this method returns true, otherwise false.

NumberOfProperties

Returns the number of properties in this LVSemanticObject

Function

int LVSemanticObject::NumberOfProperties ( )

PropertyName

Returns the name of the ith property (member data) in this object.

Function

const char* LVSemanticObject::PropertyName(int i)

Parameter

i

An index between 0 and NumberOfProperties - 1

PropertyValue

Returns a property (member data) of this object.

Functions

LVSemanticData LVSemanticObject::PropertyValue(const char *property_name)

LVSemanticData LVSemanticObject::PropertyValue(int property_index)

Return Values

Returns a semantic data object. The first returns the object named property_name. The second returns the object corresponding to PropertyName(property_index)

Parameters

property_index

A number between 0 and NumberOfProperties - 1

property_name

A string containing the property name.

PropertyExists

Function

bool LVSemanticObject::PropertyExists(const char *property_name)

Return Values

Returns true if there exists a property of this object named property_name.

Parameters

property_name

A property name.
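The LVSemanticObject accessors combine naturally into a loop over all properties. The sketch below assumes a hypothetical stand-in, FakeSemanticObject, with the documented method names; as a simplification its PropertyValue returns a std::string, whereas the real method returns an LVSemanticData. The Add helper exists only to populate the demo object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace demo {

// Hypothetical stand-in for LVSemanticObject.
class FakeSemanticObject {
public:
    // Demo-only helper for filling in properties.
    void Add(const std::string& name, const std::string& value) {
        props_.push_back(std::make_pair(name, value));
    }

    int NumberOfProperties() const { return static_cast<int>(props_.size()); }

    const char* PropertyName(int index) const {
        assert(index >= 0 && index < NumberOfProperties());
        return props_[static_cast<std::size_t>(index)].first.c_str();
    }

    bool PropertyExists(const char* property_name) const {
        return Find(property_name) >= 0;
    }

    // Simplification: the real PropertyValue returns an LVSemanticData.
    std::string PropertyValue(const char* property_name) const {
        int i = Find(property_name);
        return i >= 0 ? props_[static_cast<std::size_t>(i)].second : std::string();
    }

private:
    int Find(const char* name) const {
        for (int i = 0; i < NumberOfProperties(); ++i) {
            if (props_[static_cast<std::size_t>(i)].first == name) return i;
        }
        return -1;
    }

    std::vector<std::pair<std::string, std::string> > props_;
};

// Walks indices 0 .. NumberOfProperties() - 1 and renders "name=value;" pairs.
inline std::string DumpProperties(const FakeSemanticObject& obj) {
    std::string out;
    for (int i = 0; i < obj.NumberOfProperties(); ++i) {
        const char* name = obj.PropertyName(i);
        out += name;
        out += "=";
        out += obj.PropertyValue(name);
        out += ";";
    }
    return out;
}

}  // namespace demo
```

When a property name is known in advance, PropertyExists can guard a direct PropertyValue lookup instead of enumerating every property.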

LVSemanticArray Class

LVSemanticArray

LVSemanticArray represents an array type. You can get an array out of a data type container by calling LVSemanticData::GetSemanticArray().

Return Values | Functions | Description

(none) | LVSemanticArray() | Constructor

(none) | LVSemanticArray(const LVSemanticArray& other) | Copy constructor

LVSemanticArray& | operator=(const LVSemanticArray& other) | Assignment operator

(none) | ~LVSemanticArray() | Destructor

int | Size() | Returns the number of elements in this array.

LVSemanticData | operator[](int Index) | Returns the semantic data indicated by the index. If the Index does not exist, the returned semantic data will have type SI_TYPE_NULL.

LVSemanticData | At(int Index) | Returns the semantic data indicated by the index. If the Index does not exist, the returned semantic data will have type SI_TYPE_NULL.

Size

Returns the size of an LVSemanticArray.

Function

int LVSemanticArray::Size( )

Operator [ ] or At

Access elements in an LVSemanticArray the way you would a conventional array.

Functions

LVSemanticData LVSemanticArray::operator [] (int index)

LVSemanticData LVSemanticArray::At(int index)

Example

LVSemanticData myData = myArray[6];
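Since an out-of-range index yields data of type SI_TYPE_NULL rather than an error, it can help to bound the loop with Size() and to check the element type. The following is a rough sketch using hypothetical stand-ins (FakeSemanticArray, FakeSemanticData) that hold only integers; the SI_TYPE_* numbering is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace demo {

// Placeholder numbering; the real SI_TYPE_* constants come from the SDK headers.
enum SemanticType { SI_TYPE_NULL, SI_TYPE_INT };

// Hypothetical stand-in for LVSemanticData, limited to integers.
struct FakeSemanticData {
    int type;
    int value;
    int Type() const { return type; }
    int GetInt() const { return type == SI_TYPE_INT ? value : 0; }
};

// Hypothetical stand-in for LVSemanticArray.
class FakeSemanticArray {
public:
    explicit FakeSemanticArray(const std::vector<int>& values) : values_(values) {}

    int Size() const { return static_cast<int>(values_.size()); }

    // Mirrors the documented behavior: a missing index gives SI_TYPE_NULL.
    FakeSemanticData At(int index) const {
        if (index < 0 || index >= Size()) {
            FakeSemanticData none = { SI_TYPE_NULL, 0 };
            return none;
        }
        FakeSemanticData item = { SI_TYPE_INT, values_[static_cast<std::size_t>(index)] };
        return item;
    }

    FakeSemanticData operator[](int index) const { return At(index); }

private:
    std::vector<int> values_;
};

// Sums the integer elements, visiting indices 0 .. Size() - 1 only.
inline int SumInts(const FakeSemanticArray& arr) {
    int total = 0;
    for (int i = 0; i < arr.Size(); ++i) {
        FakeSemanticData item = arr[i];
        assert(item.Type() != SI_TYPE_NULL);  // in range, so never null here
        if (item.Type() == SI_TYPE_INT) total += item.GetInt();
    }
    return total;
}

}  // namespace demo
```

An access such as arr[6] on a three-element array does not fail in this sketch; it simply returns data whose Type() is SI_TYPE_NULL, so the caller has to test for that.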

LVParseTree Class

LVParseTree Class

An LVParseTree object represents the results of a decode using a context free grammar.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

See Also Using the Parse Tree Tutorial

Return Type | Function | Description

(none) | LVParseTree(void) | Constructs an LVParseTree object.

(none) | LVParseTree(const LVParseTree& other) | Copy constructor

LVParseTree | operator=(const LVParseTree& other) | Assignment operator

(none) | ~LVParseTree(void) | Destroys the LVParseTree object.

LVParseTree::Node | Root(void) | Provides access to the root node of the parse tree.

LVParseTree::Iterator | Begin(void) | Provides an iterator that walks each node in the tree in a top-to-bottom, left-to-right fashion.

LVParseTree::Iterator | End(void) | Marks the end of traversal for the parse tree iterator.

LVParseTree::TerminalIterator | TerminalsBegin(void) | Traverses the terminals of the parse tree (words).

LVParseTree::TerminalIterator | TerminalsEnd(void) | Marks the end of traversal for the TerminalIterator.

LVParseTree::TagIterator | TagsBegin(void) | Traverses the tags in the parse tree (semantic data).

LVParseTree::TagIterator | TagsEnd(void) | Marks the end of traversal for the TagIterator.

const char* | TagFormat(void) | Returns the tag format, as described by the grammar that this tree matched (e.g. "lumenvox/1.0" or "semantics/1.0").

int | NumberOfTagsInHeader(void) | Returns the number of tags (semantic data) that were defined in the matching grammar's header.

const char* | HeaderTag(int i) | Returns the ith header tag from the matching grammar.

const char* | GrammarLabel(void) | Returns the name of the grammar as it was provided to the speech port.

const char* | Mode(void) | "voice" or "dtmf"

const char* | Language(void) | Returns the language of the matching grammar (e.g. "en-US" or "es-MX").

Methods

LVParseTree Construction, Assignment and Destruction

LVParseTree objects are fully copyable and assignable.

Functions

LVParseTree()

LVParseTree(const LVParseTree& Other)

LVParseTree& operator = (const LVParseTree& Other)

~LVParseTree()

Parameters

Other

The LVParseTree object being copied

Remarks

You shouldn't have to worry too much about construction or destruction of an LVParseTree object. When you declare an LVParseTree, an empty tree is created. Just set it equal to the results of a decode, and begin using it.

Example

LVSpeechPort Port;

//open the port and do a decode
//...
//when the decode is finished, grab a parse tree from the speech port
LVParseTree Tree = Port.GetParseTree(voicechannel, index);

//start using the tree. It is valid as long as it's in scope.

See Also

Creating and Releasing an LVParseTree Handle (C API)

LVParseTree::GrammarLabel

Returns the name of the grammar that generated this tree.

Function

const char* GrammarLabel( )

Remarks

GrammarLabel( ) will always return the name of one of the grammars you activated for decode. It will be the name of the grammar that matched the speaker's input, according to the engine. If the active grammar had an integer label, then the returned label will be a string representation of that integer.

See Also

LVParseTree_GetGrammarLabel ( C API )

LVParseTree::Language

Returns the language identifier of the grammar that generated this tree.

Function

const char* Language()

Returns

An RFC 3066 language identifier, such as "en-US" for United States English, or "fr" for French.

See Also

LVParseTree_GetLanguage ( C API )

LVParseTree::Mode

Returns the interaction mode that created the tree.

Function

const char* Mode(void)

Returns

"voice" or "dtmf"

See Also

LVParseTree_GetMode (C API)

LVParseTree::TagFormat

Returns the name of the tag format declared in the matching grammar for this tree.

Function

const char* TagFormat(void)

See Also

LVParseTree_GetTagFormat (C API)

LVParseTree::Root

Gets the root parse tree node.

Function

LVParseTree::Node Root();

Return Values

An LVParseTree::Node object representing the top-level rule of the matching grammar.

Remarks

This node will always be a rule node (i.e., it will always satisfy Tree.Root().IsRule() == true). If the matching grammar specified a root rule, then this node will always represent that rule.

See Also

LVParseTree_GetRoot ( C API )

LVParseTree::Begin and LVParseTree::End

Begin and End provide iterators for visiting every node in the tree in a top-to-bottom, left-to-right descent. These iterators are the basis for the Tag and Terminal iterators.

Functions

LVParseTree::Iterator Begin ()

LVParseTree::Iterator End ()

Example

The following code prints out every node in a parse tree.

LVParseTree::Iterator Itr = Tree.Begin();
LVParseTree::Iterator End = Tree.End();

for (; Itr != End; Itr++)
{
    for (int i = 0; i < Itr->Level(); ++i)
        cout << "\t";
    if (Itr->IsRule())
        cout << "$" << Itr->RuleName() << ":" << endl;
    if (Itr->IsTag())
        cout << "{" << Itr->Text() << "}" << endl;
    if (Itr->IsTerminal())
        cout << "\"" << Itr->Text() << "\"" << endl;
}

If the grammar was the top level navigation example grammar, and the engine recognized "go back", the above code would print out:

$directive:
    "go"
    "back"
    {$ = "APPLICATION_BACK"}
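The top-to-bottom, left-to-right order corresponds to a pre-order walk: each node is visited before its children, and children are visited from left to right. The sketch below shows that order on a hypothetical stand-in tree. FakeNode is not the SDK's LVParseTree::Node, and the tree shape used for "go back" (one rule with three direct children) is assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace demo {

// Hypothetical stand-in for a parse tree node.
struct FakeNode {
    enum Kind { RULE, TERMINAL, TAG };
    Kind kind;
    std::string text;                // rule name, word, or tag data
    std::vector<FakeNode> children;  // only rule nodes have children
};

// Appends this node to out, indented by its level, then recurses into the
// children from left to right (pre-order traversal).
inline void Render(const FakeNode& node, int level, std::string& out) {
    assert(level >= 0);
    out.append(static_cast<std::size_t>(level), '\t');
    switch (node.kind) {
    case FakeNode::RULE:
        out += "$" + node.text + ":\n";
        break;
    case FakeNode::TERMINAL:
        out += "\"" + node.text + "\"\n";
        break;
    case FakeNode::TAG:
        out += "{" + node.text + "}\n";
        break;
    }
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        Render(node.children[i], level + 1, out);
    }
}

}  // namespace demo
```

The level argument plays the role that Level() plays in the iterator example: it is 0 for the root in this sketch and grows by one per generation.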

See Also

LVParseTree_GetIteratorBegin and LVParseTree_GetIteratorEnd (C API)

LVParseTree::TerminalsBegin and LVParseTree::TerminalsEnd

TerminalsBegin and TerminalsEnd provide access to the "terminals" of the tree. Terminals are the words and phrases in your grammar, so a TerminalIterator gives you access to the exact words the engine heard a speaker say to match a grammar, in the order that the engine heard those words.

Functions

LVParseTree::TerminalIterator TerminalsBegin()

LVParseTree::TerminalIterator TerminalsEnd()

Example

The following code prints out the sentence the engine heard, with a word-level confidence score attached to each word.

LVParseTree::TerminalIterator Itr = Tree.TerminalsBegin();
LVParseTree::TerminalIterator End = Tree.TerminalsEnd();

for (; Itr != End; ++Itr)
{
    cout << "\"" << Itr->Text() << "\":(" << Itr->Score() << ") ";
}
cout << endl;

So if the grammar being used was the top level navigation example grammar, and the engine recognized "go back", then the output of the above code might look like:

"go":(850) "back":(901)

See Also

LVParseTree_GetTerminalIteratorBegin and LVParseTree_GetTerminalIteratorEnd (C API)

LVParseTree::TagsBegin and LVParseTree::TagsEnd

TagsBegin and TagsEnd provide iterators for visiting the tags in the tree's body.

Functions

LVParseTree::TagIterator TagsBegin ()

LVParseTree::TagIterator TagsEnd ()

Example

The following code prints out every tag in a parse tree.

LVParseTree::TagIterator Itr = Tree.TagsBegin();
LVParseTree::TagIterator End = Tree.TagsEnd();

for (; Itr != End; Itr++)
{
    cout << Itr->Text() << ";" << endl;
}

If the grammar was the top level navigation example grammar, and the engine recognized "go back", the above code would print out:

$ = "APPLICATION_BACK";

Remark

The TagIterator does not visit the tags in a tree's header. Use LVParseTree::HeaderTag to access the contents of those tags.

See Also

LVParseTree_GetTagIteratorBegin and LVParseTree_GetTagIteratorEnd (C API)

LVParseTree Inner Classes

LVParseTree::Node

An LVParseTree is made out of Node objects. Each node represents a word, rule, or tag that was seen by the engine as it decoded an utterance against the matching grammar.

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

Node(void) Constructs an empty node.

Node(const Node& other)

Copy constructor

LVParseTree::Node& operator=(const Node& other)

Assignment operator

~Node(void) destructor

LVParseTree::Node Parent (void)

Provides access to the parent node of this node. Note: the parent of the tree's root node has an empty parent.

LVParseTree::ChildrenIterator ChildrenBegin (void)

Traverses the immediate children of this node.

LVParseTree::ChildrenIterator ChildrenEnd (void) Marks the end of traversal for the

Page 386: Printed Documentation - LumenVox · 2 Release Notes Version 6.0: Supports n-best. Reduced server memory footprint. Speed up on recognition algorithm. Reduced server new thread start

Printed Documentation

376

ChildrenIterator

LVParseTree::Iterator SubTreeBegin (void)

Provides an iterator that walks each node in the sub tree rooted by this node in a top-to-bottom, left-to-right fashion.

LVParseTree::Iterator SubTreeEnd (void) Marks the end of traversal for the parse tree iterator

LVParseTree::TerminalIterator TerminalsBegin (void)

Traverses the terminals(words) of the subtree rooted by this node.

LVParseTree::TerminalIterator TerminalsEnd (void)

Marks the end of traversal for the TerminalIterator.

LVParseTree::TagIterator TagsBegin (void)

Traverses the tags (semantic data) in the subtree rooted by this node.

LVParseTree::TagIterator TagsEnd (void) Marks the end of traversal for the TagIterator.

bool IsRule (void)

Returns true if this node represents a matched rule in a grammar. Note: rule nodes are the only nodes that can have children. The children of a rule node match the right-hand side of the grammar rule that is represented by this node.

bool IsTerminal (void)

Returns true if this node represents a terminal (word) in a grammar. Note: the parent of a terminal node is always a rule in the matching grammar that contains this terminal.

bool IsTag (void)

Returns true if this node represents a tag (semantic data) in a grammar. Note: the parent of a tag node is always a rule in the matching grammar that contains this tag.

const char* Text (void)

For a rule node, this is the partial sentence that caused the rule to match. For a terminal node, this is the word that the node represents. For a tag node, this is the tag data.

const char* Phonemes (void)

For a rule node, this is the phonetic pronunciation of the partial sentence that caused the rule to match. For a terminal node, this is the phonetic pronunciation of the word that was spoken. For a tag node, this is empty.

const char* RuleName (void)

For a rule node, this is the name of the rule being represented. For a tag or terminal node, this is the name of the node's parent.

int Score (void)

For a rule node, this is the confidence of the rule being matched. For a terminal node, this is the confidence of the word being spoken. For a tag node, this is the parent rule's score.

int StartTime (void)

For a rule node, this is the start time of the first word that matched this rule (elapsed time from the start of the utterance, in milliseconds). For a terminal node, this is the start time of the word. For a tag node, this is the start time of the first word after the tag (or the end time of the last word before the tag).

int EndTime (void)

For a rule node, this is the end time of the last word that matched this rule (elapsed time from the start of the utterance, in milliseconds). For a terminal node, this is the end time of the word. For a tag node, this is the start time of the first word after the tag (or the end time of the last word before the tag).
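The Text, StartTime, and EndTime descriptions above suggest that a rule node summarizes the words that matched it. The following is a minimal sketch of that relationship, not LumenVox code: WordSpan, RuleSpan, and summarizeRule are hypothetical names invented for this illustration, and the engine may compute these values differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a terminal (word) node; not the LumenVox type.
struct WordSpan {
    std::string text;
    int startMs;  // elapsed time from the start of the utterance
    int endMs;
};

// Hypothetical stand-in for the values a rule node reports.
struct RuleSpan {
    std::string text;
    int startMs;
    int endMs;
};

// Derive a rule's text and timing from the words that matched it:
// the text is the partial sentence, the start is the first word's start,
// and the end is the last word's end.
inline RuleSpan summarizeRule(const std::vector<WordSpan>& words) {
    assert(!words.empty() && "this sketch expects at least one matched word");
    RuleSpan rule{"", words.front().startMs, words.back().endMs};
    for (const WordSpan& w : words) {
        if (!rule.text.empty()) rule.text += ' ';
        rule.text += w.text;
    }
    return rule;
}
```

The duration of the rule in milliseconds is then simply endMs minus startMs.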

LVParseTree::Iterator

An LVParseTree::Iterator object traverses a parse tree in a top-to-bottom, left-to-right fashion (sometimes called a pre-order or LL traversal).

Use <LVSpeechPort2.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

Iterator(void) Constructs a blank Iterator; it does not point to any node.

Iterator(const Iterator& other)

Copy constructor.

LVParseTree::Iterator& operator=(const Iterator& other)

Assignment operator.

~Iterator(void) Destructor.

LVParseTree::Iterator& operator ++ (void) pre-increments the iterator (++itr).

LVParseTree::Iterator operator ++ (int) post-increments the iterator (itr++).

const LVParseTree::Node* operator -> (void)

Provides pointer-like access to the node the iterator is currently over (e.g. const char* text = itr->Text( )).

const LVParseTree::Node& operator * (void)

Provides access to the node the iterator is currently over (e.g. LVParseTree::Node n = *itr).

bool operator == (const Iterator& other)

Tests equality with another Iterator. Two Iterators are equal if they are pointing to the same node in the same tree.

bool operator != (const Iterator& other)

returns true if and only if the equality operator returns false.
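To make the visiting order concrete, here is a small sketch of a top-to-bottom, left-to-right (pre-order) walk. DemoNode and preOrder are hypothetical stand-ins introduced for this illustration; the sketch does not show how LVParseTree::Iterator is implemented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in tree node; not the LumenVox Node class.
struct DemoNode {
    std::string text;
    std::vector<DemoNode> children;  // only rule nodes would have children
};

// Append the text of every node in the subtree rooted at `node`,
// visiting a parent before its children and children from left to right.
inline void preOrder(const DemoNode& node, std::vector<std::string>& out) {
    out.push_back(node.text);
    for (const DemoNode& child : node.children) {
        preOrder(child, out);
    }
}
```

For a DemoNode tree whose root "$yes_no" has one child "$no" with the word children "no", "thank", and "you", the walk visits the root first, then "$no", then the three words in order.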

LVParseTree::ChildrenIterator

An LVParseTree::ChildrenIterator Object traverses the immediate children of a rule node, from left to right. You get a ChildrenIterator object from a Node by calling

LVParseTree::Node::ChildrenBegin( )

and

LVParseTree::Node::ChildrenEnd( )

Use <LVSpeechPort.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

ChildrenIterator(void) Constructs a blank ChildrenIterator; it does not point to any node.

ChildrenIterator(const ChildrenIterator& other) Copy constructor.

LVParseTree::ChildrenIterator& operator=(const ChildrenIterator& other) Assignment operator.

~ChildrenIterator(void) Destructor.

LVParseTree::ChildrenIterator& operator ++ (void) pre-increments the iterator (++itr).

LVParseTree::ChildrenIterator operator ++ (int) post-increments the iterator (itr++).

const LVParseTree::Node* operator -> (void)

Provides pointer-like access to the node the iterator is currently over (e.g. const char* text = itr->Text( )).

const LVParseTree::Node& operator * (void)

provides access to the node the iterator is currently over (e.g. LVParseTree::Node n = *itr )

bool operator==(const ChildrenIterator& other)

Tests equality with another ChildrenIterator. Two ChildrenIterators are equal if they are pointing to the same node in the same tree (e.g. if (itr1 == itr2) do something).

bool operator!=(const ChildrenIterator& other)

returns true if and only if the equality operator returns false.
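The Begin/End pair is used like any half-open iterator range. The sketch below relies only on the operators listed in the table (!=, pre-increment, and ->) plus a Text() accessor. DemoChild and collectText are hypothetical names, and the example is exercised with a std::vector rather than a real parse tree, so treat it as an illustration of the loop shape only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in exposing a Text() accessor like a parse tree node.
struct DemoChild {
    const char* text;
    const char* Text() const { return text; }
};

// Walk [begin, end) using only !=, pre-increment, and ->.
// Assumes Text() never returns a null pointer for the visited elements.
template <typename Iter>
std::vector<std::string> collectText(Iter begin, Iter end) {
    std::vector<std::string> out;
    for (Iter it = begin; it != end; ++it) {
        out.push_back(it->Text());
    }
    return out;
}
```

With a real rule node, the same loop shape would presumably start at ChildrenBegin( ) and stop when the iterator compares equal to ChildrenEnd( ).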

LVParseTree::TerminalIterator

An LVParseTree::TerminalIterator object is an adaptation of the standard LVParseTree::Iterator. It only visits the nodes in a tree that are terminals. You get a TerminalIterator by calling:

LVParseTree::Node::TerminalsBegin( )

and

LVParseTree::Node::TerminalsEnd( )

Use <LVSpeechPort2.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

TerminalIterator(void)

Constructs a blank TerminalIterator; it does not point to any node.

TerminalIterator(const TerminalIterator& other)

Copy constructor.

LVParseTree::TerminalIterator& operator=(const TerminalIterator& other)

Assignment operator.

~TerminalIterator(void) Destructor.

LVParseTree::TerminalIterator& operator ++ (void) pre-increments the iterator (++itr).

LVParseTree::TerminalIterator operator ++ (int) post-increments the iterator (itr++).

const LVParseTree::Node* operator -> (void)

Provides pointer-like access to the node the iterator is currently over (e.g. const char* text = itr->Text( )).

const LVParseTree::Node& operator * (void)

provides access to the node the iterator is currently over (e.g. LVParseTree::Node n = *itr )

bool operator==(const TerminalIterator& other)

Tests equality with another TerminalIterator. Two TerminalIterators are equal if they are pointing to the same node in the same tree (e.g. if (itr1 == itr2) do something).

bool operator!=(const TerminalIterator& other)

returns true if and only if the equality operator returns false.

LVParseTree::TagIterator

An LVParseTree::TagIterator object is an adaptation of the standard LVParseTree::Iterator. It only visits the nodes in a tree that are tags. You get a TagIterator by calling:

LVParseTree::Node::TagsBegin( )

and

LVParseTree::Node::TagsEnd( )

Use <LVSpeechPort2.h> or <LV_SRE_ParseTree.h>

Return Type Function Description

TagIterator(void) Constructs a blank TagIterator; it does not point to any node.

TagIterator(const TagIterator& other)

Copy constructor.

LVParseTree::TagIterator& operator=(const TagIterator& other)

Assignment operator.

~TagIterator(void) Destructor.

LVParseTree::TagIterator& operator ++ (void) pre-increments the iterator (++itr).

LVParseTree::TagIterator operator ++ (int) post-increments the iterator (itr++).

const LVParseTree::Node* operator -> (void)

Provides pointer-like access to the node the iterator is currently over (e.g. const char* text = itr->Text( )).

const LVParseTree::Node& operator * (void)

provides access to the node the iterator is currently over (e.g. LVParseTree::Node n = *itr )

bool operator==(const TagIterator& other)

Tests equality with another TagIterator. Two TagIterators are equal if they are pointing to the same node in the same tree (e.g. if (itr1 == itr2) do something).

bool operator!=(const TagIterator& other)

returns true if and only if the equality operator returns false.
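Both TerminalIterator and TagIterator are described as the standard traversal restricted to one kind of node. A rough way to picture that is a pre-order walk that keeps only nodes of the requested kind. DemoKind, DemoParseNode, and collectByKind below are hypothetical names for this sketch, which makes no claim about how the LumenVox iterators are built.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node kinds mirroring IsRule / IsTerminal / IsTag.
enum class DemoKind { Rule, Terminal, Tag };

// Hypothetical stand-in tree node; not the LumenVox Node class.
struct DemoParseNode {
    DemoKind kind;
    std::string text;
    std::vector<DemoParseNode> children;
};

// Pre-order walk that records only the nodes of the requested kind.
inline void collectByKind(const DemoParseNode& node, DemoKind wanted,
                          std::vector<std::string>& out) {
    if (node.kind == wanted) out.push_back(node.text);
    for (const DemoParseNode& child : node.children) {
        collectByKind(child, wanted, out);
    }
}
```

Asking for DemoKind::Terminal yields the spoken words in order, while DemoKind::Tag yields only the semantic tag data.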

LVGrammar Class

class LVGrammar

An LVGrammar object represents a context-free grammar that can be used in the Speech Engine to recognize speech. An LVGrammar object can also be used to test the functionality of a grammar by processing transcripts.

Use <LVSpeechPort.h> or <LV_SRE_Grammar.h>

Return Type Function Description

LVGrammar (void) Constructs an LVGrammar object.

LVGrammar (GrammarLogCB log, void* userdata)

Constructs an LVGrammar object, with an initial logging function.

LVGrammar (const LVGrammar& other)

Copy constructor.

~LVGrammar (void) Destroys the LVGrammar object.

LVGrammar& operator = (const LVGrammar& other) Assignment operator

void RegisterLoggingCallback (GrammarLogCB log, void* userdata)

Registers a callback so the object can report warnings and errors to the grammar author.

int Reset (void) Resets a grammar object.

int SaveCompiledGrammar (const char* filename)

Save the grammar object to a binary file.

int LoadCompiledGrammar (const char* filename)

Load the grammar object from a binary file

HGRAMMAR GetHGrammar (void) Returns the underlying object handle.

int LoadGrammar (const char* location)

Loads a grammar from the location (a file path or URI) given by the "location" argument.

int LoadGrammarFromBuffer (const char* contents)

Loads a grammar from a null terminated string containing the contents of the grammar.

int AddRule (const char* rulename, const char* definition)

Inserts a new rule into the grammar.

int RemoveRule (const char* rulename)

Removes a rule from the grammar.

int SetRoot (const char* rulename) Sets a starting rule for the grammar.

void SetMode (const char* mode)

Declare the mode of grammar (the style of decode to be processed). Legal arguments are "voice" or "dtmf".

const char* GetMode (void) Return the interaction mode of the grammar.

void SetLanguage (const char* language)

Specify the language of this grammar as a language/country code pair. Legal arguments include "en-US" and "es-MX".

const char* GetLanguage (void)

Return the language setting of the grammar.

void SetTagFormat (const char* tag_format)

Identify the tag format of the grammar. To use the LumenVox semantic interpretation, the tag format must be "lumenvox/1.0" or "semantics/1.0".

const char* GetTagFormat (void)

Return the tag format setting of the grammar.

int GetNumberOfMetaData (void)

Return the number of meta data in the grammar.

const char* GetMetaDataKey (int index)

Return the key of the meta data with a specified index

const char* GetMetaDataValue (int index)

Return the value of the meta data with a specified index

int ParseSentence (const char* sentence)

Use the grammar to parse a sentence.

int GetNumberOfParses (void)

Returns the number of parses created by the most recent ParseSentence call.

LVParseTree GetParseTree (int index)

Returns the parse tree object with the specified index.

int InterpretParses (void)

Generates interpretations from the parse trees created by the most recent ParseSentence call.

int GetNumberOfInterpretations (void)

Returns the number of interpretations created by the most recent InterpretParses call.

LVInterpretation GetInterpretation (int index)

Returns the semantic interpretation with the specified index

Methods

LVGrammar Constructor/Destructor

Functions

LVGrammar()

LVGrammar(GrammarLogCB log, void* userdata)

LVGrammar(const LVGrammar& other)

~LVGrammar()

Parameters

log

Error/warning reporting callback function pointer.

userdata

A pointer to user-defined data. It is passed into the callback function.

other

Existing grammar object.

Remarks

The callback function must have the signature defined by GrammarLogCB.

See Also

LVGrammar_Create (C API)

LVGrammar_CreateFromCopy (C API)

LVGrammar_Release (C API)

LVGrammar::operator =

Assignment operator.

Function

LVGrammar& operator = (const LVGrammar& other)

Parameters

other

Existing grammar object.

See Also

LVGrammar_Copy (C API)

LVGrammar::RegisterLoggingCallback

Registers a callback so the object can report warnings and errors to the grammar author via the callback function.

Function

void RegisterLoggingCallback (GrammarLogCB log, void* userdata)

Parameters

log

The logging callback function pointer.

userdata

A pointer to user-defined data associated with this grammar object. It is passed into the callback function.

Remarks

The callback function must have the signature defined by GrammarLogCB.

See Also

LVGrammar_RegisterLoggingCallback (C API)

LVGrammar::Reset

Reset a grammar object.

Function

int Reset (void)

Return Values

LV_SUCCESS

LV_FAILURE

See Also

LVGrammar_Reset (C API)

LVGrammar::SaveCompiledGrammar

Save a grammar object to a binary file.

Function

int SaveCompiledGrammar (const char* filename)

Parameters

filename

File name.

Return Values

LV_SUCCESS

LV_FAILURE

Remarks

The saved compiled grammar can later be loaded into a grammar object with LVGrammar::LoadCompiledGrammar.

See Also

LVGrammar::LoadCompiledGrammar

LVGrammar_SaveCompiledGrammar (C API)

LVGrammar::LoadCompiledGrammar

Load a grammar object from a binary file previously saved by LVGrammar::SaveCompiledGrammar.

Function

int LoadCompiledGrammar (const char* filename)

Parameters

filename

File name.

Return Values

LV_SUCCESS

LV_FAILURE

See Also

LVGrammar::SaveCompiledGrammar

LVGrammar_LoadCompiledGrammar (C API)

LVGrammar::GetHGrammar

Return underlying grammar object handle.

Function

HGRAMMAR GetHGrammar (void)

Return Values

A pointer to the underlying grammar object.

Remarks

The LVGrammar class is a thin wrapper around the grammar object handle, HGRAMMAR.

See Also

HGRAMMAR

LVGrammar::LoadGrammar

Loads a grammar from a local file, or from a remote file via HTTP or FTP. The grammar can be written in ABNF or XML notation.

Function

int LoadGrammar(const char* grammar_location)

Parameters

grammar_location

A file path or URI that points to a valid SRGS grammar file, such as "c:/grammars/pizza.grxml", "http://www.gramsRus.com/phonenumber.gram", or "builtin:dtmf/boolean?y=1;n=2".

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.
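The return values above fall into two groups: results after which the grammar is ready for use, and results after which it is not. A minimal sketch of that decision follows. The DemoLoadResult enumerators are hypothetical placeholders, since this section does not give the numeric values of the real LV_ constants; production code would compare against the constants from the SDK headers instead.

```cpp
#include <cassert>

// Hypothetical stand-ins for the documented result codes. The values are
// placeholders, not the values of the real LV_ constants.
enum DemoLoadResult {
    DEMO_SUCCESS,
    DEMO_SYNTAX_WARNING,
    DEMO_SYNTAX_ERROR,
    DEMO_LOADING_ERROR
};

// Per the table above, a grammar is ready for use after a clean load
// or after a load that only produced syntax warnings.
inline bool grammarIsUsable(DemoLoadResult result) {
    return result == DEMO_SUCCESS || result == DEMO_SYNTAX_WARNING;
}
```

A caller might still want to inspect the logging callback output when a warning is reported, since the grammar was accepted but was not fully conforming.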

See Also

LVGrammar_LoadGrammar (C API)

LVGrammar::LoadGrammarFromBuffer

Loads a grammar from a null-terminated string buffer. The grammar can be written in ABNF or XML notation.

Function

int LoadGrammarFromBuffer(const char* grammar_contents)

Parameters

grammar_contents

A null terminated string containing the contents of a valid SRGS grammar.

Return Values

LV_SUCCESS

No errors; this grammar is now ready for use.

LV_GRAMMAR_SYNTAX_WARNING

The grammar file was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The grammar file was not understandable to the grammar compiler. You will not be able to decode with this grammar.

LV_GRAMMAR_LOADING_ERROR

The grammar compiler was unable to find the location of the grammar you loaded.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.

See Also

LVGrammar_LoadGrammarFromBuffer (C API)

LVGrammar::AddRule

Adds a rule to a grammar object.

Function

int AddRule(const char* rule_name, const char* rule_definition)

Parameters

rule_name

The name of the rule

rule_definition

The definition of the rule

Return Values

LV_SUCCESS

No errors; the rule has been successfully added.

LV_GRAMMAR_SYNTAX_WARNING

The new rule was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The new rule was not understandable to the grammar compiler. You will not be able to decode with this grammar.

Example

grammar.AddRule("foo", "hello [world]");

Is the same as writing a rule:

$foo = hello [world];

Remarks

New rules must be written in ABNF notation. Detailed error and warning messages are sent to the grammar object's logging callback function.
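The equivalence shown in the example can be expressed as a small string helper, which is handy for logging what a series of AddRule calls amounts to in ABNF. abnfRule is a hypothetical helper written for this illustration; it is not part of the LumenVox API and performs no validation of the rule name or definition.

```cpp
#include <cassert>
#include <string>

// Build the ABNF rule text that AddRule(ruleName, definition) is described
// as being equivalent to: "$" + name + " = " + definition + ";".
inline std::string abnfRule(const std::string& ruleName,
                            const std::string& definition) {
    return "$" + ruleName + " = " + definition + ";";
}
```

Calling abnfRule("foo", "hello [world]") produces the same rule text as the example above.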

See Also

LVGrammar::RemoveRule

LVGrammar_AddRule (C API)

LVGrammar::RemoveRule

Removes a rule from a grammar object.

Function

int RemoveRule(const char* rule_name)

Parameters

rule_name

The name of the rule

Return Values

LV_SUCCESS

No errors; the rule has been successfully removed.

LV_GRAMMAR_SYNTAX_WARNING

The new rule was not fully conforming, but it was understandable and is now ready to be used

LV_GRAMMAR_SYNTAX_ERROR

The new rule was not understandable to the grammar compiler. You will not be able to decode with this grammar.

Remarks

Detailed error and warning messages are sent to the grammar object's logging callback function.

See Also

LVGrammar::AddRule

LVGrammar_RemoveRule (C API)

LVGrammar::SetRoot

Identifies one of the grammar rules as the root rule. The root rule is where the engine starts its search.

Function

int SetRoot(const char* rule_name)

Parameters

rule_name

The name of the rule.

Example

grammar.SetRoot("foo");

Is the same as writing in a grammar:

root $foo;

See Also

LVGrammar_SetRoot (C API)

LVGrammar::SetMode

Sets the mode property for the grammar.

Function

int SetMode(const char* mode)

Parameters

mode

The interaction mode of the grammar.

Example

grammar.SetLanguage("en-US");
grammar.SetMode("voice");
grammar.SetTagFormat("lumenvox/1.0");

Is the same as writing in your grammar:

language "en-US";
mode "voice";
tag-format <lumenvox/1.0>;

See Also

LVGrammar::GetMode

LVGrammar_SetMode (C API)

LVGrammar::SetLanguage

Sets the language for the grammar.

Function

int SetLanguage(const char* language)

Parameters

language

The language identifier for the grammar

Example

grammar.SetLanguage("en-US");
grammar.SetMode("voice");
grammar.SetTagFormat("lumenvox/1.0");

Is the same as writing in your grammar:

language "en-US";
mode "voice";
tag-format <lumenvox/1.0>;

See Also

LVGrammar::GetLanguage

LVGrammar_SetLanguage (C API)

LVGrammar::SetTagFormat

Set interpretation tag format of the grammar.

Function

int SetTagFormat(const char* tag_format)

Parameters

tag_format

The grammar's tag format.

Example

grammar.SetLanguage("en-US");
grammar.SetMode("voice");
grammar.SetTagFormat("lumenvox/1.0");

Is the same as writing in your grammar:

language "en-US";
mode "voice";
tag-format <lumenvox/1.0>;

See Also

LVGrammar::GetTagFormat

LVGrammar_SetTagFormat (C API)

LVGrammar::GetMode

Returns the mode setting for the grammar.

Function

const char* GetMode(void)

Return Values

The interaction mode of the grammar.

See Also

LVGrammar::SetMode

LVGrammar_GetMode (C API)

LVGrammar::GetLanguage

Returns the language setting for the grammar.

Function

const char* GetLanguage(void)

Return Values

The language identifier of the grammar.

See Also

LVGrammar::SetLanguage

LVGrammar_GetLanguage (C API)

LVGrammar::GetTagFormat

Returns the interpretation tag format setting for the grammar.

Function

const char* GetTagFormat(void)

Return Values

The tag format of the grammar.

See Also

LVGrammar::SetTagFormat

LVGrammar_GetTagFormat (C API)

LVGrammar::GetNumberOfMetaData

Return the number of meta data contained in the grammar.

Function

int GetNumberOfMetaData(void)

Example

If the grammar has the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = grammar.GetNumberOfMetaData(); // returns 2
const char* key = grammar.GetMetaDataKey(0); // returns "description"
const char* value = grammar.GetMetaDataValue(1); // returns "05/12/2005"

See Also

LVGrammar::GetMetaDataKey

LVGrammar::GetMetaDataValue

LVGrammar_GetNumberOfMetaData (C API)

LVGrammar::GetMetaDataKey

Returns the key of the meta data indicated by the index.

Function

const char* GetMetaDataKey(int index)

Parameters

index

Index of the meta data. It should be in the range [0, LVGrammar::GetNumberOfMetaData).

Return Values

null

The index is not valid.

non-null

A pointer to the key string.

Example

If the grammar has the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = grammar.GetNumberOfMetaData(); // returns 2
const char* key = grammar.GetMetaDataKey(0); // returns "description"
const char* value = grammar.GetMetaDataValue(1); // returns "05/12/2005"

See Also

LVGrammar::GetNumberOfMetaData

LVGrammar::GetMetaDataValue

LVGrammar_GetMetaDataKey (C API)

LVGrammar::GetMetaDataValue

Return the value of the meta data indicated by the index.

Function

const char* GetMetaDataValue(int index)

Parameters

index

Index of the meta data. It should be in the range [0, LVGrammar::GetNumberOfMetaData).

Return Values

null

The index is not valid.

non-null

A pointer to the value string.

Example

If the grammar has the following lines:

meta 'description' is 'example grammar';
meta 'date' is '05/12/2005';

You can access meta data as follows:

int count = grammar.GetNumberOfMetaData(); // returns 2
const char* key = grammar.GetMetaDataKey(0); // returns "description"
const char* value = grammar.GetMetaDataValue(1); // returns "05/12/2005"
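The index contract for the meta data accessors (valid indexes in the half-open range from 0 up to the count, a null pointer otherwise) can be modeled with a small stand-in class. DemoMetaData and its members are hypothetical and exist only to illustrate the range check; they say nothing about how LVGrammar stores meta data internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grammar's meta data table; not the LumenVox type.
class DemoMetaData {
public:
    void Add(const std::string& key, const std::string& value) {
        entries_.push_back(std::make_pair(key, value));
    }
    int Count() const { return static_cast<int>(entries_.size()); }

    // Valid indexes are [0, Count()); anything else yields a null pointer.
    const char* Key(int index) const {
        return InRange(index)
            ? entries_[static_cast<std::size_t>(index)].first.c_str()
            : nullptr;
    }
    const char* Value(int index) const {
        return InRange(index)
            ? entries_[static_cast<std::size_t>(index)].second.c_str()
            : nullptr;
    }

private:
    bool InRange(int index) const { return index >= 0 && index < Count(); }
    std::vector<std::pair<std::string, std::string>> entries_;
};
```

Because an out-of-range index produces a null pointer, callers should check the result before treating it as a string.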

See Also

LVGrammar::GetNumberOfMetaData

LVGrammar::GetMetaDataKey

LVGrammar_GetMetaDataValue (C API)

LVGrammar::ParseSentence

Use a loaded grammar object to parse a sentence.

Function

int ParseSentence(const char* sentence)

Parameters

sentence

The sentence to parse.

Return Values

0

The sentence is not covered by the grammar.

non-0

The number of distinct parses.

Example

Assume a grammar was defined as:

root $yes_no;
$yes_no = $yes | $no;
$yes = yes [please];
$no = no [thank you];

You can use this grammar to validate sentences as follows:

int count = grammar.ParseSentence("no thank you"); // returns 1
count = grammar.ParseSentence("no thanks"); // returns 0

Remarks

With this function, you can identify how well a grammar covers your targeted transcript set.
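One way to turn that remark into a number is to run every transcript through the parser and report the fraction that produced at least one parse. The coverage function below is a hypothetical helper, not part of the API. It accepts any callable with ParseSentence-like semantics (0 means not covered, non-zero is the number of parses), so it can be tried out with a stub before wiring in a real grammar object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Fraction of transcripts for which `parse` reports at least one parse.
// Returns 0.0 for an empty transcript set.
inline double coverage(const std::vector<std::string>& transcripts,
                       const std::function<int(const char*)>& parse) {
    if (transcripts.empty()) return 0.0;
    std::size_t covered = 0;
    for (const std::string& t : transcripts) {
        if (parse(t.c_str()) != 0) ++covered;
    }
    return static_cast<double>(covered) /
           static_cast<double>(transcripts.size());
}
```

With a loaded grammar, the callable would presumably be a lambda that forwards to grammar.ParseSentence(sentence).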

See Also

LVGrammar::GetNumberOfParses

LVGrammar::GetParseTree

LVGrammar_ParseSentence (C API)

LVGrammar::GetNumberOfParses

Return the number of parses created by the most recent call of LVGrammar::ParseSentence.

Function

int GetNumberOfParses(void)

Return Values

0

The sentence is not covered by the grammar.

non-0

The number of distinct parses.

Remarks

This function can be used after a call to LVGrammar::ParseSentence. It is provided as a convenience; it returns the same value as LVGrammar::ParseSentence.

See Also

LVGrammar::ParseSentence

LVGrammar::GetParseTree

LVGrammar_GetNumberOfParses (C API)

LVGrammar::GetParseTree

Return the parse tree object with the specified index.

Function

LVParseTree GetParseTree(int index)

Parameters

index

The index of the parse tree handle to be returned. It should be in the range [0, LVGrammar::GetNumberOfParses).

Return Values

null

The index is not valid.

non-null

The parse tree handle.

Remarks

This function should be used after a call to LVGrammar::ParseSentence.

See Also

LVGrammar::ParseSentence

LVGrammar::GetNumberOfParses

LVGrammar_CreateParseTree (C API)

LVGrammar::InterpretParses

Generate semantic interpretation results from parses created by previous calls to LVGrammar::ParseSentence.

Function

int InterpretParses(void)

Return Values

integer (>=0)

Number of available interpretations.

Remarks

Before calling this function, you must call LVGrammar::ParseSentence on the grammar object. Otherwise, the grammar object does not contain any parse tree information.

See Also

LVGrammar::ParseSentence

LVGrammar::GetNumberOfInterpretations

LVGrammar::GetInterpretation

LVGrammar_InterpretParses (C API)

LVGrammar::GetNumberOfInterpretations

Return the number of semantic interpretations created by the most recent call to LVGrammar::InterpretParses.

Function

int GetNumberOfInterpretations(void)

Return Values

integer (>=0)

Number of available interpretations.

Remarks

This function can be used after a call to LVGrammar::InterpretParses. It is provided as a convenience; it returns the same value as LVGrammar::InterpretParses.

See Also

LVGrammar::InterpretParses

LVGrammar::GetInterpretation

LVGrammar_GetNumberOfInterpretations (C API)

LVGrammar::GetInterpretation

Returns the semantic interpretation handle indicated by the index.

Function

LVInterpretation GetInterpretation (int index)

Parameters

index

The index of the interpretation handle to be returned. It should be in the range [0, LVGrammar::GetNumberOfInterpretations).

Return Values

null

The index is not valid.

non-null

The interpretation handle.

Remarks

This function should be used after a call to LVGrammar::InterpretParses.

See Also

LVGrammar::InterpretParses

LVGrammar::GetNumberOfInterpretations

LVGrammar_CreateInterpretation (C API)

Callback Functions

Logging Callback Function

typedef void (*ExportLogMsg)(const char* String, void* p)

The speech port calls this callback function with informational and error messages. It is the second parameter to LV_SRE_OpenPort and LV_SRE_RegisterAppLogMsg, and the first parameter to LVSpeechPort::OpenPort.

p is a pointer to a user-defined class or function which can customize behavior when the engine sends logging messages to the callback.
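A minimal sketch of a callback that fits the ExportLogMsg signature is shown here. The typedef is copied from this section; LogCollector, CollectLogMessage and SimulateEngineLog are hypothetical names introduced only for illustration, and the engine is simulated by calling the function pointer directly rather than by opening a port.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Callback signature as given in this section.
typedef void (*ExportLogMsg)(const char* String, void* p);

// Hypothetical application-side sink; not part of the LumenVox API.
struct LogCollector {
    std::vector<std::string> messages;
};

// Matches ExportLogMsg. p is expected to point at a LogCollector.
void CollectLogMessage(const char* message, void* p) {
    LogCollector* sink = static_cast<LogCollector*>(p);
    if (sink == nullptr || message == nullptr) return;
    sink->messages.push_back(message);
}

// Stand-in for the engine: invokes the registered callback once.
void SimulateEngineLog(ExportLogMsg cb, void* user_data, const char* text) {
    assert(cb != nullptr);
    cb(text, user_data);
}
```

Casting p back to a known type inside the callback is what lets one plain function serve several ports, each with its own collector.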

See Also

LV_SRE_OpenPort

LV_SRE_RegisterAppLogMsg

LVSpeechPort::OpenPort

Streaming Callback Function

typedef void (*LV_SRE_StreamStateChangeFn)(long NewState, unsigned long TotalBytes, unsigned long RecordedBytes, void* UserData)

The speech port calls this callback function each time the stream status changes. It is used primarily with streams that perform barge-in detection and/or end-of-speech detection, to notify hardware to stop playing the prompt (barge-in) or to stop recording the user (end-of-speech).

Parameters

NewState

New state of stream. See Stream Status.

TotalBytes

Total bytes streamed at the point of the stream status change. More sound data may still be in the internal unprocessed queue.

RecordedBytes

Total bytes minus data discarded before barge-in was detected.

UserData

Pointer to application defined data.
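The parameters above can be put to work in a callback along the following lines. This is only a sketch: the typedef is copied from this section, StreamTracker and OnStreamStateChange are hypothetical names, and the discarded-byte figure is derived from the description of RecordedBytes (total bytes minus data discarded before barge-in).

```cpp
#include <cassert>

// Callback signature as given in this section.
typedef void (*LV_SRE_StreamStateChangeFn)(long NewState,
                                           unsigned long TotalBytes,
                                           unsigned long RecordedBytes,
                                           void* UserData);

// Hypothetical application-side record of the latest stream state.
struct StreamTracker {
    long last_state = -1;               // -1 is this sketch's "nothing seen yet"
    unsigned long discarded_bytes = 0;  // audio dropped before barge-in
    int changes = 0;                    // number of state changes observed
};

// Matches LV_SRE_StreamStateChangeFn. UserData points at a StreamTracker.
void OnStreamStateChange(long NewState, unsigned long TotalBytes,
                         unsigned long RecordedBytes, void* UserData) {
    StreamTracker* t = static_cast<StreamTracker*>(UserData);
    if (t == nullptr) return;
    t->last_state = NewState;
    // RecordedBytes excludes pre-barge-in data, so the difference is what
    // the port discarded. Guard against underflow on unexpected input.
    t->discarded_bytes =
        (TotalBytes >= RecordedBytes) ? TotalBytes - RecordedBytes : 0;
    ++t->changes;
}
```

An application would typically compare last_state against the Stream Status constants to decide when to stop the prompt or the recording.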

See Also

LV_SRE_StreamSendData

LV_SRE_StreamGetStatus

Grammar Logging Callback Function

typedef void (*GrammarLogCB)(const char* message, int error_level, void* user_data)

The callback function is called by the LVGrammar object when an error or warning is generated during the grammar compilation process. The types of errors which can be passed through the callback via the error_level parameter are:

LV_GRAMMAR_LOADING_ERROR -- the grammar could not be loaded from the location provided.

LV_GRAMMAR_SYNTAX_ERROR -- one or more rules or statements in the grammar were badly formed. The message parameter provides more detailed information.

LV_GRAMMAR_SYNTAX_WARNING -- one or more statements in the grammar were either missing, or not strictly conforming to specifications, but the grammar builder was able to recover. The message parameter provides more detailed information.

user_data is a pointer to a user-defined class or function which can customize behavior when the LVGrammar object sends logging messages through the callback.
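One possible shape for such a callback is sketched below. The typedef is copied from this section. The numeric levels are taken from the Error Codes table in this document (-23, -24, -25) on the assumption that error_level carries those same values; real code would use the macros from the SDK header instead of the local constants. GrammarLogSummary and SummarizeGrammarLog are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Callback signature as given in this section.
typedef void (*GrammarLogCB)(const char* message, int error_level,
                             void* user_data);

// Assumed values, copied from the Error Codes table in this document.
const int kGrammarSyntaxWarning = -23;
const int kGrammarSyntaxError = -24;
const int kGrammarLoadingError = -25;

// Hypothetical application-side tally of grammar build problems.
struct GrammarLogSummary {
    int warnings = 0;
    int errors = 0;
    std::string last_message;
};

// Matches GrammarLogCB. user_data points at a GrammarLogSummary.
void SummarizeGrammarLog(const char* message, int error_level,
                         void* user_data) {
    GrammarLogSummary* s = static_cast<GrammarLogSummary*>(user_data);
    if (s == nullptr) return;
    if (error_level == kGrammarSyntaxWarning) {
        ++s->warnings;  // grammar was still built
    } else if (error_level == kGrammarSyntaxError ||
               error_level == kGrammarLoadingError) {
        ++s->errors;    // grammar was not built or not loaded
    }
    if (message != nullptr) s->last_message = message;
}
```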

See Also

LVGrammar_RegisterLoggingCallback

LVGrammar::RegisterLoggingCallback

Constants

Decoder Flags

The engine accepts several different flags for use when calling LV_SRE_Decode (C API) and LVSpeechPort::Decode (C++ API). The flags can be bitwise OR'd ( "|" ) to customize behavior.

LV_DECODE_BLOCK

Normally, calls to the decode function/method will immediately return to allow the client application to continue working on other tasks while the engine processes the data. This flag blocks the client application until the engine has finished.

LV_DECODE_GENDER_MALE

LV_DECODE_GENDER_FEMALE

LV_DECODE_GENDER_MALE and LV_DECODE_GENDER_FEMALE identify which gender acoustic model to use during decode. If these flags are not specified, the engine automatically decodes each audio file against both gender models. While this slows the engine by requiring two decodes, evaluating against both models has a very significant positive effect on recognition accuracy. Since the engine is multithreaded, do not use these flags unless CPU load is a serious issue.

LV_DECODE_FIRST_TIME_USER

Reset caller weights in Recognition Engine (not implemented).

LV_DECODE_USE_OOV

Use the Out-Of-Vocabulary filter (OOV) during decode. The OOV filter, when set, processes each audio file against both the grammar specified by the client application, and a special grammar which detects words not in the grammar. If the engine detects these OOV words, it will not return them. Generally, the OOV filter slows the engine down without a large gain in accuracy, so client applications should use the filter only if OOV words seem to be a problem.

LV_DECODE_RETURN_EACH_DIGIT

When using standard grammars, a string of digits, monetary value etc. is passed back as a single concept. If this flag is used, each digit comes back as a separate concept. (Since each concept has a confidence score, this can be useful for determining poorly recognized individual digits.)

LV_DECODE_SRGS_GRAMMAR

Normally, you do not need to use this flag. But if you want to use a concept-phrase grammar as an SRGS grammar, and are not using the LV_ACTIVE_GRAMMAR_SET, this flag is necessary.

LV_DECODE_SEMANTIC_INTERPRETATION

This flag tells the decoder to process the parse tree return type for semantic information in the tree's tags.
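The bitwise combination of flags mentioned at the start of this section might look like the sketch below. The DEMO_ constants are hypothetical bit values chosen for illustration; the real LV_DECODE_ values come from the SDK header and are not listed in this document.

```cpp
#include <cassert>

// Hypothetical bit values for illustration only. Real code would use the
// LV_DECODE_* constants from the SDK header.
const unsigned int DEMO_DECODE_BLOCK = 0x01;
const unsigned int DEMO_DECODE_USE_OOV = 0x02;
const unsigned int DEMO_DECODE_SEMANTIC_INTERPRETATION = 0x04;

// Combine the selected options into one flags word with bitwise OR.
unsigned int BuildDecodeFlags(bool block, bool use_oov, bool semantic) {
    unsigned int flags = 0;
    if (block) flags |= DEMO_DECODE_BLOCK;
    if (use_oov) flags |= DEMO_DECODE_USE_OOV;
    if (semantic) flags |= DEMO_DECODE_SEMANTIC_INTERPRETATION;
    return flags;
}

// True if every bit of flag is set in flags.
bool HasFlag(unsigned int flags, unsigned int flag) {
    return (flags & flag) == flag;
}
```

The resulting word is what would be passed as the flags argument to the decode call.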

Error Codes

0 LM_SUCCESS No errors.

-1 LM_FAILURE General failure.

-2 LV_SYSTEM_ERROR The speech recognition engine is no longer running. This is the result of a ClosePort call or an unrecoverable engine error.

-4 LV_BAD_SOUND_DATA There was a problem with sound data.

-5 LV_INVALID_SOUND_FORMAT The sound format value is not one of the allowable formats.

-6 LV_TIME_OUT WaitForEngineToIdle's timeout was reached before the engine became idle. Losing the connection to an engine server during a decode may also return this error code.

-7 LV_GRAMMAR_SET_OUT_OF_RANGE The grammar set value is out of expected range (0-63).

-8 LV_SOUND_CHANNEL_OUT_OF_RANGE The sound channel value is out of the expected range.

-9 LV_STANDARD_GRAMMAR_ALREADY_LOADED Only one standard grammar can be loaded for a grammar set.

-10 LV_STANDARD_GRAMMAR_OUT_OF_RANGE The standard grammar value is not a recognized grammar type.

-11 LV_NOT_A_VALID_PROPERTY_VALUE The property value is not valid for the designated property.

-12 LV_BAD_HPORT The specified port handle is not valid.

-13 LV_NOT_IMPLEMENTED The action was not implemented in the current version.

-14 LV_SOCKETS_ERROR General network communication error.

-15 LV_INVALID_PROPERTY_TARGET The target type used in a call to LV_SRE_SetPropertyEx() is invalid for the property given.

-16 LV_INVALID_PROPERTY_VALUE_TYPE The value type used in a call to LV_SRE_SetPropertyEx() is invalid for the property given.

-17 LV_INVALID_PROPERTY The property supplied in a call to LV_SRE_SetPropertyEx() or LV_SRE_SetProperty() is invalid.

-18 LV_INVALID_PROPERTY_TARGET_NDX When calling LV_SRE_SetPropertyEx() and using a target type of PROP_EX_TARGET_CHANNEL or PROP_EX_TARGET_GRAMMAR the index value was out of range.

-19 LV_STREAM_NOT_ACCEPTED Stream functions called on a stopped stream.

-20 LV_FUNCTION_NOT_FOUND LVSpeechPort_stdcall.dll is a wrapper dll around LVSpeechPortl.dll. If a newer version of the standard call dll is used, it may not find a function in LVSpeechPortl.dll.

-21 LV_STRING_BUFFER_TOO_SMALL The application-supplied string buffer was too small.

-22 LV_NO_SERVER_AVAILABLE No engine servers were found to connect to.

-23 LV_GRAMMAR_SYNTAX_WARNING The grammar contained a syntax warning in one or more of its rules or declarations. A specific message from the grammar builder has been logged. The grammar was successfully built, despite the warning.

-24 LV_GRAMMAR_SYNTAX_ERROR The grammar contained a syntax error in one or more of its rules or declarations. A specific message from the grammar builder has been logged. The grammar was not built.

-25 LV_GRAMMAR_LOADING_ERROR The grammar could not be loaded, because a specified URL was invalid.

-26 LV_OPEN_PORT_FAILED__LICENSE_EXCEEDED Cannot open ports because the number of ports allowed by the license has been exceeded.

-31 LV_GLOBAL_GRAMMAR_TRANSACTION_PARTIAL_ERROR Global grammar operation failed on some of the servers.

-32 LV_GLOBAL_GRAMMAR_TRANSACTION_ERROR Global grammar operation failed on all servers.

Note:

Not all the error codes are implemented.
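When logging return values, it can help to translate a code back into its name. The following sketch does this for a subset of the table above; the numbers and names are copied from that table, while ErrorCodeName itself is a hypothetical helper, not an SDK function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a subset of the documented error codes to their
// names. Codes not listed here fall through to a generic description.
std::string ErrorCodeName(int code) {
    switch (code) {
        case 0:   return "LM_SUCCESS";
        case -1:  return "LM_FAILURE";
        case -2:  return "LV_SYSTEM_ERROR";
        case -6:  return "LV_TIME_OUT";
        case -12: return "LV_BAD_HPORT";
        case -22: return "LV_NO_SERVER_AVAILABLE";
        case -23: return "LV_GRAMMAR_SYNTAX_WARNING";
        case -24: return "LV_GRAMMAR_SYNTAX_ERROR";
        case -25: return "LV_GRAMMAR_LOADING_ERROR";
        default:  return "unknown (" + std::to_string(code) + ")";
    }
}
```

Extending the switch to the remaining rows of the table is mechanical.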

Properties

#define PROP_EX_SAVE_SOUND_FILES 2
#define PROP_EX_LANGUAGE 3
#define PROP_EX_SRE_SERVERS 4
#define PROP_EX_CHOOSE_MODEL 8
#define PROP_EX_SET_SERVER_IP 10
#define PROP_EX_SET_SERVER_PORT 11
#define PROP_EX_SEARCH_BEAM_WIDTH 12
#define PROP_EX_CONCEPT_REPETITION_MIN 13
#define PROP_EX_CONCEPT_REPETITION_MAX 14
#define PROP_EX_ENABLE_LATTICE_CONFIDENCE_SCORE 15
#define PROP_EX_MAX_NBEST_RETURNED 16
#define PROP_EX_DECODE_TIMEOUT 17
#define PROP_EX_MOD_SEL_LOW_THLD 18
#define PROP_EX_MOD_SEL_HIGH_THLD 19

PROP_EX_SAVE_SOUND_FILES

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_PORT

Default Value: 1

Save request and answer files to disk.

Setting to 1 saves request and answer files for each call to Decode to LVLANG\Responses (Win32) or LVRESPONSES/Responses (Linux). Setting to 0 stops saving the files. Turning this property on can quickly fill up a hard drive, but is invaluable for troubleshooting and tuning the application.

PROP_EX_LANGUAGE

Value Types:

PROP_EX_VALUE_TYPE_STRING

Targets: PROP_EX_TARGET_PORT

Default Value: "AmericanEnglish"

The language model to use for decodes.

PROP_EX_SRE_SERVERS

Value Types:

PROP_EX_VALUE_TYPE_STRING

Targets: PROP_EX_TARGET_CLIENT

Default Value: "127.0.0.1:5000"

The list of Speech Engine servers which will handle decodes for this client. A comma (or semicolon) delimited list of IP addresses (and ports) the client will attempt to connect to. Use a colon to separate IPs and Ports. 5000 is the default port.

Example: "127.0.0.1;10.0.0.1:5001;10.10.0.1" Client will attempt to attach to the local machine, port 5000; IP address "10.0.0.1" port 5001; and IP address "10.10.0.1" port 5000.
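To make the list format concrete, here is a sketch of how such a string could be split into host and port pairs, using 5000 when no port is given. ServerAddress and ParseServerList are hypothetical names; the client library does its own parsing internally, and this is not its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder for one parsed entry.
struct ServerAddress {
    std::string host;
    int port;
};

// Splits a comma- or semicolon-delimited list of "ip" or "ip:port" entries.
// Entries without a port get default_port. Empty entries are skipped.
// Whitespace is not trimmed, and a non-numeric port makes std::stoi throw.
std::vector<ServerAddress> ParseServerList(const std::string& list,
                                           int default_port = 5000) {
    std::vector<ServerAddress> servers;
    std::size_t start = 0;
    while (start <= list.size()) {
        std::size_t end = list.find_first_of(",;", start);
        if (end == std::string::npos) end = list.size();
        std::string entry = list.substr(start, end - start);
        if (!entry.empty()) {
            ServerAddress addr;
            std::size_t colon = entry.find(':');
            if (colon == std::string::npos) {
                addr.host = entry;
                addr.port = default_port;
            } else {
                addr.host = entry.substr(0, colon);
                addr.port = std::stoi(entry.substr(colon + 1));
            }
            servers.push_back(addr);
        }
        start = end + 1;
    }
    return servers;
}
```

Applied to the example string above, this yields three entries with ports 5000, 5001 and 5000.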

PROP_EX_SEARCH_BEAM_WIDTH

Value Types:

PROP_EX_VALUE_TYPE_FLOAT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 1e-6

The beam controls how thorough the Speech Engine search is. Legal values range from 0.0 to 1.0. The smaller the value, the more thorough the search, which can lead to more accurate results but also to more time-intensive searches. Use the default at first, and only experiment with this value while tuning your application for speed and accuracy. Make small changes only. For instance, try going from 1e-6 to 1e-9, but not 1e-30.

PROP_EX_CONCEPT_REPETITION_MIN

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_GRAMMAR

Default Value: 1

PROP_EX_CONCEPT_REPETITION_MAX

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_GRAMMAR

Default Value: -1 (infinity)

PROP_EX_CONCEPT_REPETITION_MIN and PROP_EX_CONCEPT_REPETITION_MAX control the repeat count of concepts in a concept/phrase grammar. They have no effect on SRGS grammars. For example, a concept/phrase grammar such as:

concept "topping" = "pepperoni | olives | sausage | onions | peppers"

with MIN=1 and MAX=5 is equivalent to the SRGS grammar:

root $toppings;
$toppings = $topping<1-5>;
$topping = (pepperoni | olives | sausage | onions | peppers);

PROP_EX_ENABLE_LATTICE_CONFIDENCE_SCORE

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 1

The lattice based confidence score is a slightly slower, but more accurate confidence score. Set it to 0 to turn off the score.

PROP_EX_CHOOSE_MODEL

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

Default Value: 0

If this property is set to 1, then the client will decide which acoustic model is most appropriate for the server to use, based on a frequency analysis of the speaker's voice. Otherwise, two decodes will be done simultaneously, and an answer will be selected based on which model had better "coverage" for the speaker's voice.

PROP_EX_MOD_SEL_LOW_THLD

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 135Hz

PROP_EX_MOD_SEL_HIGH_THLD

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 155Hz

When property PROP_EX_CHOOSE_MODEL is set to 1, the engine uses the pitch of the input audio to determine which acoustic model to use. If the pitch is lower than PROP_EX_MOD_SEL_LOW_THLD, the low-pitch model is used; a pitch higher than PROP_EX_MOD_SEL_HIGH_THLD selects the high-pitch model. Any value that falls in between causes the engine to use both models.
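The threshold rule can be written out as a small decision function. This is a sketch of the rule as described, not the engine's code: ModelChoice and ChooseModel are hypothetical names, the defaults are the documented 135 Hz and 155 Hz, and treating a pitch exactly equal to a threshold as "in between" is an assumption.

```cpp
#include <cassert>

// Hypothetical result type for this sketch.
enum class ModelChoice { LowPitch, HighPitch, Both };

// Applies the documented rule: below the low threshold use the low-pitch
// model, above the high threshold use the high-pitch model, otherwise both.
ModelChoice ChooseModel(int pitch_hz, int low_thld = 135, int high_thld = 155) {
    assert(low_thld <= high_thld);
    if (pitch_hz < low_thld) return ModelChoice::LowPitch;
    if (pitch_hz > high_thld) return ModelChoice::HighPitch;
    return ModelChoice::Both;  // includes values equal to either threshold
}
```

Widening the gap between the two thresholds sends more callers to the two-model decode, trading CPU time for accuracy.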

PROP_EX_MAX_NBEST_RETURNED

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 1

The maximum number of n-best results the engine can return. This property must be an integer greater than or equal to 1.

PROP_EX_DECODE_TIMEOUT

Value Types:

PROP_EX_VALUE_TYPE_INT

PROP_EX_VALUE_TYPE_INT_PTR

Targets: PROP_EX_TARGET_CLIENT

PROP_EX_TARGET_PORT

PROP_EX_TARGET_CHANNEL

Default Value: 1

The time out value used by LV_SRE_WaitForDecode and LVSpeechPort::WaitForDecode functions.

Sound Formats

enum SOUND_FORMAT {
    UNK_FORMAT = 0,
    ULAW_8KHZ,
    PCM_8KHZ,
    PCM_16KHZ,
    ALAW_8KHZ
};

ULAW_8KHZ

µ-law format at 8000 samples per second. 1 byte per sample. One minute of sound occupies approximately 0.5 MB of memory. This is the standard domestic telephone format.

PCM_8KHZ

Pulse code modulated at 8000 samples per second. 2 bytes per sample. One minute of sound occupies approximately 1 MB of memory.

PCM_16KHZ

Pulse code modulated at 16000 samples per second. 2 bytes per sample. One minute of sound occupies approximately 2 MB of memory. This is the native format of the SRE.

ALAW_8KHZ

A-law format at 8000 samples per second. 1 byte per sample. One minute of sound occupies approximately 0.5 MB of memory. This is the standard international telephone format.
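The memory figures above follow from sample rate times bytes per sample times duration. The sketch below works that out for each format; the enum is copied from this section so the example stands alone, and AudioBytes is a hypothetical helper. For one minute of 8 kHz single-byte audio the result is 8000 x 1 x 60 = 480,000 bytes, which is the "approximately 0.5 MB" quoted above.

```cpp
#include <cassert>

// Copied from this section so the sketch is self-contained.
enum SOUND_FORMAT { UNK_FORMAT = 0, ULAW_8KHZ, PCM_8KHZ, PCM_16KHZ, ALAW_8KHZ };

// Hypothetical helper: bytes needed to hold the given duration of audio.
// Returns 0 for UNK_FORMAT or any unrecognized value.
unsigned long AudioBytes(SOUND_FORMAT format, unsigned long seconds) {
    unsigned long samples_per_second = 0;
    unsigned long bytes_per_sample = 0;
    switch (format) {
        case ULAW_8KHZ:
        case ALAW_8KHZ:
            samples_per_second = 8000;
            bytes_per_sample = 1;
            break;
        case PCM_8KHZ:
            samples_per_second = 8000;
            bytes_per_sample = 2;
            break;
        case PCM_16KHZ:
            samples_per_second = 16000;
            bytes_per_sample = 2;
            break;
        default:
            break;
    }
    return samples_per_second * bytes_per_sample * seconds;
}
```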

Note:

We will be adding support for more formats in the near future, in particular the standard Windows wave format.

Standard Grammars

These grammars are deprecated in favor of built-in SRGS grammars.

The standard grammars are built-in grammars, predefined by LumenVox. Using these grammars will return a single concept, formatted appropriately. Only one standard grammar can be active at a time; no concepts can be removed from the standard grammar. The client application can, however, add and remove concepts to the voice channel grammar, which will coexist with the standard grammar.

1 GRAMMAR_DIGITS

String of single digits, like a phone number or PIN code. In version 4.0, digits use a separate acoustic model and so only the words one, two, three, four, five, six, seven, eight, nine, zero and oh are recognized. This grammar ignores any application-supplied grammar and cannot currently recognize things like "twenty-five" or "seventeen". This restriction allowed us to obtain an extremely low error rate. The number grammar can be used to mix application grammar and digit recognition.

2 GRAMMAR_MONEY

Monetary value.

3 GRAMMAR_NUMBER

Numeric value like 12,000, 24.45 or 35.

4 GRAMMAR_LETTERS

Letters of alphabet for spelling (not implemented).

5 GRAMMAR_DATE

Date values (not implemented).

Semantic Data Type

There are seven semantic data types. They are defined as macros in <LV_SRE_Semantic.h>

SI_TYPE_BOOL

SI_TYPE_INT

SI_TYPE_DOUBLE

SI_TYPE_STRING

SI_TYPE_OBJECT

SI_TYPE_ARRAY

SI_TYPE_NULL

Note: SI_TYPE_NULL is a special type which usually indicates that some error occurred.

Semantic Data Print Format

These macros are used in the SI_DATA_Print() function to specify the printing format.

SI_FORMAT_XML

Primitive data types are printed as string literals; objects and arrays are printed as a collection of XML key-value pairs.

SI_FORMAT_ECMA

Primitive data types are printed as string literals; objects and arrays are printed as ECMAScript objects.

Stream Parameters

STREAM_PARM_SOUND_FORMAT

Sound format the stream handles. Uses the SOUND_FORMAT enum. Default value: ULAW_8KHZ.

STREAM_PARM_VOICE_CHANNEL

Voice channel to load streamed sound data into. No default; the application must set it.

STREAM_PARM_GRAMMAR_SET

Grammar set to use with auto-decode streams. No default; the application must set it if STREAM_PARM_AUTO_DECODE is active.

STREAM_PARM_DECODE_FLAGS

Decode flags to send with auto-decode streams. No default; the application must set them if STREAM_PARM_AUTO_DECODE is active.

STREAM_PARM_USE_COMPRESSION

Use compression internally for sound data. Data sent to the Speech Engine and data stored to disk is compressed to approximately 10% of its normal size, which adds a small amount of load to the CPU. Default = 0 (off).

STREAM_PARM_DETECT_BARGE_IN

If active, the speech port discards stream data until barge-in is detected. Default = 0 (off).

STREAM_PARM_DETECT_END_OF_SPEECH

If active, the port stops accepting stream data once end-of-speech is detected and changes the stream status to STREAM_STATUS_END_SPEECH. If STREAM_PARM_AUTO_DECODE is also active, decoding begins immediately as well. Default = 0 (off).

STREAM_PARM_AUTO_DECODE

If active, decoding starts immediately on end-of-speech detection or on a call to StopStream(); otherwise the application needs to call Decode to begin decoding. Default = 0 (off).

STREAM_PARM_BARGE_IN_TIMEOUT

The streaming interface flags STREAM_STATUS_BARGE_IN_TIMEOUT if no speech is detected within the time frame specified by this property.

STREAM_PARM_END_OF_SPEECH_TIMEOUT

After barge-in, the streaming interface flags STREAM_STATUS_END_SPEECH_TIMEOUT if it did not detect end-of-speech within the time frame specified by this property.

STREAM_PARM_USE_FREQ_VAD

The LumenVox Speech Engine API provides two Voice Activity Detection (VAD) algorithms: Time-domain VAD (TVAD) and Frequency-domain VAD (FVAD). While TVAD is faster, FVAD has better performance and more flexibility. Set this parameter to 1 to enable FVAD, or to 0 to use TVAD. The default value is 1. Note: Each algorithm has its own set of parameters. Please make sure to use the correct parameters in your code. Each VAD parameter is listed below, along with the algorithm that it works with.

STREAM_PARM_BARGE_IN_BEGIN_DELAY <TVAD>

Number of 1/8-second units at the beginning of the stream during which barge-in is limited. During this period a much higher energy level is required to trigger barge-in. This can be useful when echo-cancelled data streamed to the port needs time for convergence. Default = 4 (0.5 seconds).

STREAM_PARM_BARGE_IN_NOISE_COUNT_LOW_THRESHOLD <TVAD>

Adjusts the signal strength needed to trigger barge-in (and end-of-speech). A lower number triggers barge-in at a lower volume. If dynamic barge-in adjustment is in use, this is the initial value. Default = 55 (optimal for telephony applications).

STREAM_PARM_BARGE_IN_DYNAMIC_ADJUST <TVAD>

Adjusts the volume trigger for barge-in dynamically. This works best when the audio data sent to a port is from the same source. It also works better if the EVENT_START_DECODE_SEQ and EVENT_END_DECODE_SEQ events are sent to the port to signify a change of audio source (for example, the start of a new telephone call). Default = 1 (on).

STREAM_PARM_VAD_BARGEIN_LVL <FVAD>

This is the signal-to-noise ratio (SNR) threshold. An audio frame is considered for voice activity only when the SNR metric is higher than this threshold. Lower this parameter for a noisy channel so that it is easier to barge in. The default value is 30. Note: this value is not a measurement in dB. It is a relative value compared to an internal standard.

STREAM_PARM_VAD_EOS_DELAY <FVAD>

End-of-speech delay in ms. The default value is 800ms.

STREAM_PARM_VAD_INIT_TIME <FVAD>

The FVAD needs to be initialized properly to optimize performance. This parameter sets the duration of the initialization time at the beginning of each audio stream. The default value is 100ms.

STREAM_PARM_VAD_NOISE_FLOOR <FVAD>

An audio frame is considered for voice activity only when the average energy is higher than this threshold. The default value is 0. This parameter is particularly useful when the echo canceler doesn't work very well. When channel noise, background noise or residual echo causes false barge-in, try raising this threshold to prevent low-energy signals from triggering barge-in. The range is from 0 to 999, but in practice you probably won't need to set it above 200.

STREAM_PARM_VAD_WIND_BACK <FVAD>

The length of audio to be wound back at the beginning of voice activity. It helps when speech onset is weak. The resolution of this parameter is 1/8 sec, i.e. 125ms, which means setting this value to 249ms is the same as setting it to 125ms. The default value is 250ms.
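The 125ms resolution of the wind-back parameter can be illustrated with a short calculation. This sketch assumes the engine rounds down to a whole multiple of 125ms, which is inferred from the single 249ms example above; how the engine treats values below 125ms is not stated, and this sketch simply returns 0 for them. EffectiveWindBackMs is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical helper: rounds a requested wind-back length down to the
// nearest multiple of 125 ms (assumed behavior, see the note above).
int EffectiveWindBackMs(int requested_ms) {
    const int kStepMs = 125;  // 1/8 second resolution
    assert(kStepMs > 0);
    if (requested_ms < 0) return 0;
    return (requested_ms / kStepMs) * kStepMs;
}
```

Choosing values that are already multiples of 125 avoids any surprise from the rounding.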

STREAM_PARM_VAD_BURST_THLD <FVAD>

The FVAD algorithm triggers barge-in only after it has observed voice activity lasting longer than this threshold. This threshold helps prevent short bursts of noise from triggering barge-in. The default value is 100ms.

STREAM_PARM_VAD_P2A_THLD <FVAD>

An audio frame is considered for voice activity only when the ratio of peak frequency band energy to average energy is higher than this threshold. This is a fine-tuning parameter that users usually don't need to modify. The valid range of this parameter is [0,1000]. The default value is 100.

Stream Status

STREAM_STATUS_NOT_READY

LV_SRE_StreamStart has not been called for this port.

STREAM_STATUS_READY

Stream is ready to accept data.

STREAM_STATUS_BARGE_IN

Only returned if the STREAM_PARM_DETECT_BARGE_IN stream type is set. The code has determined that speech has started, and stream data is now being stored. (Hardware can stop playing audio when this state is reached.)

STREAM_STATUS_END_SPEECH

Only returned if the STREAM_PARM_DETECT_END_OF_SPEECH stream type is set. The code has determined that speech has stopped. If the STREAM_PARM_AUTO_DECODE stream type has been set, decoding of the audio data has begun. (Hardware can stop recording audio when this state is reached.)

STREAM_STATUS_STOPPED

Stream has stopped. Call LV_SRE_StreamStart to reset stream.

STREAM_STATUS_BARGE_IN_TIMEOUT

Barge-in was not triggered before timeout. No audio will be sent for decode.

STREAM_STATUS_END_SPEECH_TIMEOUT

End-of-speech was not detected before timeout. Note, the streaming will not stop until you call StreamStop or StreamCancel.

Environment Variables

LV_SRE_CLIENT_CONNECT_IP

A comma (or semicolon) delimited list of IP addresses (and ports) the client will attempt to connect to. If this variable does not exist, the client will default to IP 127.0.0.1 (the local machine) and port 5000. Use a colon to separate IPs and Ports.

Example: "127.0.0.1;10.0.0.1:5001;10.10.0.1" Client will attempt to attach to the local machine, port 5000; IP address "10.0.0.1" port 5001; and IP address "10.10.0.1" port 5000.

Win32

The following environment variables need to be set up for the LVSpeechPort.Dll to function. The installation program creates these variables.

LVLANG

Location of the dictionary and language files, stored in two subdirectories: Dict and Responses.

LVBIN

Location of LVSpeechPort.Dll.

The following optional environment variables are set up for creating applications with the LVSpeechPort.DLL. See the LVSpeechPortConsole example program.

LVLIB

Location of LVSpeechPort.Lib

LVINCLUDE

Location of LVSpeechPort.h

Linux

The following environment variables can be used to override the default locations used by LVSpeechPort.so, and BNF_Dict.so.

LVLANG

Location of the dictionary files, stored in the Dict sub-directory. Default location "/usr/LumenVox".

LVRESPONSE

Location of the answer and response files created at run-time, stored in the Responses sub-directory. Default location "/var/LumenVox".

FAQs

Please email your questions to [email protected].

I cannot get the engine to recognize correctly, or my results have a low confidence.

A good speech recognition application depends on a well-designed grammar. A grammar that contains very similar words (like "bit" and "pit") is an inefficient grammar that will hurt accuracy and speed. The engine will take longer as it tests the competing words against the audio. The resulting match will have a lower confidence because of the additional words which are very similar.

What do the confidence scores mean?

The confidence score is a rough measure of how closely the speech matched the phrases in the grammar. The score ranges from 0 to 1000. The higher the score, the higher the estimated probability that the result is correct. Typically, an application designer will use the confidence score to make decisions about the quality of a recognition result. For instance, results of 600 or higher might always be accepted, results from 200 to 599 might trigger a confirmation, and results below 200 might be rejected outright. The thresholds to use depend largely on the grammar that is being used. In addition to the grammars, an application's confidence thresholds should be one of the first things to tune.
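The three-way decision described above can be sketched as a small function. This is an illustration only: `Action` and `ClassifyConfidence` are hypothetical names, and the defaults of 600 and 200 are simply the example thresholds from the text, which an application is expected to tune.

```cpp
// What the application does with a recognition result.
enum class Action { Accept, Confirm, Reject };

// Hypothetical helper: maps a 0-1000 confidence score to an action.
// Each threshold is the lowest score that still falls in its band.
Action ClassifyConfidence(int score, int accept_at = 600, int confirm_at = 200) {
    if (score >= accept_at) return Action::Accept;
    if (score >= confirm_at) return Action::Confirm;
    return Action::Reject;
}
```

Keeping the thresholds as parameters makes it easy to use different values per grammar, since the right cut-offs depend largely on the grammar in use.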

Do I need a Dialogic card?

Our engine is hardware-independent, so if the client application can collect the audio and put it into a buffer, the engine can decode. Which hardware a particular client application needs depends only on the client application.

How much memory does the Speech Engine need?

The memory requirement for running the Speech Engine is mainly determined by the maximum number of decoder threads. The start-up memory usage is about 160MB, including one thread for each acoustic model. After that, each additional thread requires about 20MB. The maximum number of threads is determined by the number of processors. The more processors you have, the more simultaneous threads you can run, and consequently the more memory you need. In the future, we shall allow users to set the maximum number of threads on the server. Currently, the typical memory requirement for running the engine is:

- One processor with one acoustic model and 2 threads: 207MB.

- Dual processors with one acoustic model and 4 threads: 247MB.

- Quad-processor with one acoustic model and 8 threads: 327MB.
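For rough capacity planning, the three figures above fit a straight line of about 20MB per thread on top of a fixed 167MB. The sketch below encodes that fit; the 167MB constant is inferred from the listed figures rather than stated anywhere in this guide (it differs slightly from the 160MB start-up figure), and `EstimateMemoryMB` is a hypothetical name.

```cpp
// Hypothetical estimate for one acoustic model, fitted to the figures listed
// above (2 threads: 207MB, 4 threads: 247MB, 8 threads: 327MB).
// kBaseMB is inferred from those figures, not a documented constant.
int EstimateMemoryMB(int threads) {
    const int kBaseMB = 167;
    const int kPerThreadMB = 20;
    return kBaseMB + kPerThreadMB * threads;
}
```

Treat the result as an approximation only; measured usage will vary with grammar sizes and the number of acoustic models loaded.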

How fast does the computer need to be?

This is dependent on the expected density of your application. The Speech Engine can perform about 14 recognitions per minute per 100 megahertz of processor speed. This calculation is based on a single-word, 50-item grammar.
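As a worked example of that rule of thumb, the sketch below converts between processor speed and expected throughput. Both function names are hypothetical, and the linear scaling is only as good as the benchmark it comes from (single words against a 50-item grammar); larger grammars will lower the rate.

```cpp
// Hypothetical helpers based on the figure of about 14 recognitions per
// minute per 100 MHz quoted above.
double RecognitionsPerMinute(double processor_mhz) {
    return 14.0 * processor_mhz / 100.0;
}

// Inverse of the above: processor speed needed for a target load.
double RequiredMHz(double recognitions_per_minute) {
    return recognitions_per_minute / 14.0 * 100.0;
}
```

For instance, by this estimate a 1000 MHz processor handles about 140 such recognitions per minute.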

What are some ways to increase the recognition accuracy?

Smaller grammars always work better. The practical phrase limit is 2000, but depending on how easily the words in the grammar can be confused, or the number of branches at any point in the grammar, that number could be anywhere from 1000 to 10,000.

Longer phrases also work better. When you need to recognize a phrase like "How do I" or "transfer me to", put these in as a single phrase, not individual words. Except where you are recognizing a single word (like "Yes" or "No"), avoid single small words.

You can use the ABNF format to cover several variations of small words:

"How (do | would | could) (I | we | you)"

Also, attempt to cover all the words you believe a user will speak. If a word or phrase is not in the grammar, the engine will not be able to identify it.
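To see which phrases the ABNF alternation shown above stands for, the following sketch expands a pattern into its individual phrases. It is not the engine's grammar parser: `ExpandAlternatives` is a hypothetical helper that handles only flat, non-nested parenthesized groups separated by `|`, which is enough to enumerate the nine phrases in the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Removes leading and trailing spaces.
static std::string Trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Splits "do | would | could" into {"do", "would", "could"}.
static std::vector<std::string> SplitAlternatives(const std::string& group) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t bar = group.find('|', start);
        if (bar == std::string::npos) {
            parts.push_back(Trim(group.substr(start)));
            break;
        }
        parts.push_back(Trim(group.substr(start, bar - start)));
        start = bar + 1;
    }
    return parts;
}

// Hypothetical helper: expands flat "(a | b)" groups into every phrase.
// Nested groups and other ABNF operators are not handled.
std::vector<std::string> ExpandAlternatives(const std::string& pattern) {
    std::vector<std::string> results(1, "");
    std::size_t pos = 0;
    while (pos < pattern.size()) {
        std::vector<std::string> choices;
        if (pattern[pos] == '(') {
            std::size_t close = pattern.find(')', pos);
            if (close == std::string::npos) close = pattern.size();
            choices = SplitAlternatives(pattern.substr(pos + 1, close - pos - 1));
            pos = close + 1;
        } else {
            std::size_t open = pattern.find('(', pos);
            if (open == std::string::npos) open = pattern.size();
            choices.push_back(pattern.substr(pos, open - pos));  // literal text
            pos = open;
        }
        // Combine every phrase built so far with every choice at this point.
        std::vector<std::string> next;
        for (const std::string& prefix : results) {
            for (const std::string& choice : choices) {
                next.push_back(prefix + choice);
            }
        }
        results = next;
    }
    return results;
}
```

Listing the expansion is a quick way to check that a compact pattern covers the phrases you expect users to say, and nothing you did not intend.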

Will the engine handle proper names?

The internal dictionary has thousands of common names. (Around half of the 120,000 words are names). If a name is not in the dictionary, the decoder will use basic rules to phonetically spell any name.

For unknown names, enter the phonetic spelling of the name if the phonetic speller is unable to come up with a good pronunciation. This has been shown to work in the vast majority of cases. The phonetic spelling can be directly entered as the phrase, if necessary, by enclosing the phoneme characters in curly braces "{ }". See Phonemes.

Can I ask for ticker symbols with your recognition engine?

Speaker-independent recognition systems have a hard time with open spelling. This is caused by the very similar sounding letters. For example, b, c, d, e, g, p, t, v and z all end with the sound of 'e'. Dictation software allows spelling because it trains for a single person's voice; many of those products also supply a phonetic alphabet system ("Alpha" for A, "Beta" for B, etc.).

In addition, there are more than sixteen thousand ticker symbols. Many of the symbols are very similar in the way they sound when being spelled out, and thus are hard to correct for:

eeee is the symbol for eMachines, Inc.

cccc is the symbol for Concord Career Colleges Inc.

How can I get around this problem?

Limit the tickers you support.

Break down the category of the stock. Make grammars smaller. First ask which stock exchange. Then ask for the symbol. Have a strategy available to disambiguate symbols until the proper answer is found.

What are the languages currently supported?

We currently support North American English. Spanish is the next language planned.

Does/Can LumenVox support language X?

The short answer is that, yes, LumenVox can localize/customize the products to the extent that we can add in different languages for speech recognition. There are two ways to do it:

The first option is very fast and easy to implement. Phonetically spell the (for example) Spanish words using the English phone set. For example, the Spanish word mañana can be entered {M AO N Y AO N AE}. See Phrases and Phonemes for more information on entering raw phonemes as phrases.

The second option requires a couple of items and more time. Basically, LumenVox needs:

- Lots of audio data in the target language; the amount can vary from 10 hours for male and 10 for female (20 total) for small vocabularies (10-15 words), to as much as can be collected.

- The same audio data, transcribed as text.

- A machine-readable dictionary in the target language.

The first option is quite easy to implement, but loses some accuracy across very large vocabularies because the target language's sound inventory is still different from the English inventory. The second option takes more time and energy to produce, but is quite a bit more accurate.

As a first step, phonetically spell each word so that your organization can test and deploy the application. Then, once you have collected enough audio data, LumenVox can train native language models and stop using the English models entirely.

With some work, LumenVox could adapt the Speech Engine itself so that it displays in a different language, but that is a special case situation.

Why does the engine occasionally recognize my speech in the Female model when I am male?

First, some notes about the "male" and "female" models. The models are entirely statistical: one model simply encodes speakers of type 1, and the other encodes speakers of type 2. It happens that a very useful distinction lies along gender lines (owing mainly to pitch differences between males and females), but there are men who sound like women and women who sound like men. In addition, it is possible that the particular utterance involved simply had better examples in the other model, so the "wrong" model did a better job of recognizing the speech. Because we trained the two separate models using data divided by gender, we named the models according to their gender as a convenience. In fact, the recognizer has no knowledge as to which gender the speaker is, only which model had the best match.

Do not use the engine to classify speakers according to their sex; the engine is not designed or intended to be used to categorize speakers according to personal characteristics, whether the characteristic is age, sex, dialect, or any other attribute. LumenVox takes NO responsibility for issues arising from using the engine in such a manner.

Why does the engine always do two decodes, one in a male model, and one in a female model?

Suppose we have two models, a generic male (MM) and generic female model (MF), as well as a Speaker (S1). S1 says something, and the decoder runs two decodes, one against each model, MM and MF. The results break down as follows (for our purposes, correct means "got the right thing" whether the result is the actual string of words, or the right concept):

Case a: MM has the highest score and the correct answer; MF may or may not return the correct answer.

Case b: MF has the highest score and the correct answer; MM may or may not return the correct answer.

Case c: MM has the highest score but returned the incorrect answer, while MF had the lower score but returned the correct answer.

Case d: MF has the highest score but returned the incorrect answer, while MM had the lower score but returned the correct answer.

Case e: Neither is correct, regardless of score.

For case e, since neither model got the right answer, all we can do is try to make the models better and the system tighter. Cases c and d are the worst case performance; we try to avoid these :). Cases a and b are the hoped-for result, since we get a correct answer. Notice that we never specify which is the "correct model" only the "correct answer". Also, note that for all cases "correct" requires some outside knowledge about which answer was correct. The engine has no such information, and is forced to choose the best answer based on highest score.

The potentially bad results are cases c and d; in these cases, the recognizer picks the wrong answer, when it would have picked the right answer had the engine had more knowledge. Fortunately, c and d rarely happen; instead, what we have found is that in cases a and b, the speaker's gender frequently does not match the gender model which had the best answer. But it doesn't actually matter, since we obtain the correct answer anyway (and we are looking for the answer, not the gender).

Running two decodes (ignoring decode history) allows us to capture each case where, for some reason, the mismatched gender model gets the right answer and the matched one blows up. There are several reasons this might happen: the mismatched model may have better coverage on the acoustics in question, the speaker's voice could crack, or the speaker could be sucking on helium, etc. Since some people will waffle between the two different models, given the above, we are better off running two decodes. If we were to select a particular model based on previous history, we would lose the accuracy gain between running two models and letting the system pick the best result.

In addition, the incidence rate for mismatched, but correct answers is quite a bit higher than the incidence rate for mismatched, incorrect answers, which means running two decodes and picking the best result gets a net gain, even given incorrect answers occasionally.

That said, one plausible scenario where a client application might want to cut the second decode is for load balancing. If all 48 ports go active at once (or the system is on a slow machine), it might be better to sacrifice some accuracy to handle more customers more quickly. For the systems LumenVox deploys on, we haven't had a problem with running two decodes yet; the load balancing feature is in the short-term pipeline and should be online soon.

If the client application wants to track decode models for a caller, there is no restriction against doing so; load-balancing becomes an issue of deciding how many double decodes the application can handle, and then picking a permanent model for that caller/speaker. One thing not to do is to make the decision after only one utterance; let the double decodes continue for a few rounds (at least three or five) and then pick the model which had the highest score the most (the application will also need to take into account whether the decodes were correct). The gender model flags (LV_DECODE_GENDER_MALE, LV_DECODE_GENDER_FEMALE) for LV_SRE_Decode() and LVSpeechPort::Decode() tell the recognizer which model to use for the decode, thus disabling the dual decodes.
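The bookkeeping described above could look something like the following sketch. It is application-side logic only and makes no engine calls; `DecodeRound` and `PickModel` are hypothetical names, and counting a round as a win only when the higher-scoring result was also judged correct by the application is one possible reading of "take into account whether the decodes were correct".

```cpp
#include <cstddef>
#include <vector>

enum class Model { Male, Female };

// One double decode as recorded by the application (hypothetical structure).
struct DecodeRound {
    int male_score;
    int female_score;
    bool best_was_correct;  // the application's own judgement, e.g. from a confirmation
};

// Hypothetical helper: returns true and sets *chosen once at least min_rounds
// double decodes have been seen and one model has won more of them.
// Returns false when the application should keep running both decodes.
bool PickModel(const std::vector<DecodeRound>& rounds, std::size_t min_rounds,
               Model* chosen) {
    if (rounds.size() < min_rounds) return false;
    int male_wins = 0;
    int female_wins = 0;
    for (const DecodeRound& round : rounds) {
        if (!round.best_was_correct) continue;  // incorrect rounds count for neither model
        if (round.male_score > round.female_score) {
            ++male_wins;
        } else if (round.female_score > round.male_score) {
            ++female_wins;
        }
    }
    if (male_wins == female_wins) return false;  // no clear winner yet
    *chosen = male_wins > female_wins ? Model::Male : Model::Female;
    return true;
}
```

Once `PickModel` returns true, the application would pass the corresponding gender flag (LV_DECODE_GENDER_MALE or LV_DECODE_GENDER_FEMALE) on later decodes for that caller.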

Because there is an accuracy gain from doing both decodes, we recommend letting the system do both decodes for most applications. If load becomes a serious issue, then disable the double decode system and pick the model the application should use.

What is n-best?

Instead of hypothesizing only one sentence, the engine hypothesizes several sentences for what it heard. The top sentence is usually the highest-scoring one. The others are the top alternative sentences, which scored lower. N-best results can be used to craft more intelligent confirmations.
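As one illustration of a more intelligent confirmation, the sketch below offers the caller a choice when the top two hypotheses score close together. It works on a plain list the application has already collected; `Hypothesis`, `BuildConfirmation` and the 50-point margin are all assumptions made for this example, not engine behavior.

```cpp
#include <string>
#include <vector>

// One entry of an n-best list as held by the application (hypothetical).
struct Hypothesis {
    std::string text;
    int score;
};

// Hypothetical helper: expects nbest sorted from highest to lowest score.
// When the two best entries are within close_margin points of each other,
// the prompt offers both instead of assuming the first one.
std::string BuildConfirmation(const std::vector<Hypothesis>& nbest,
                              int close_margin = 50) {
    if (nbest.empty()) return "Sorry, I did not catch that.";
    if (nbest.size() >= 2 && nbest[0].score - nbest[1].score <= close_margin) {
        return "Did you say " + nbest[0].text + " or " + nbest[1].text + "?";
    }
    return "Did you say " + nbest[0].text + "?";
}
```

The same list can also be used after a "no" answer: rather than asking the caller to repeat, the application can move on to the next alternative.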

Why does the API appear to cause a memory leak?

A common cause of growing memory usage is repeatedly loading grammars without unloading them. A good practice is to unload grammars that will not be used for a while.

Also, please exercise caution when using the C API. Most of the handles created by the API, such as H_SI, H_GRAMMAR, and HPORT, need to be explicitly released once you are done using them.
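In C++ client code, one way to make sure a C handle is always released is to tie the release to a scope guard. The sketch below is generic and assumes a C++11 compiler: `ScopedHandle` and `MakeScoped` are hypothetical helpers, not part of the API, and the release callable you pass in must be whichever documented release function matches the handle type.

```cpp
#include <utility>

// Hypothetical scope guard: calls releaser(handle) exactly once when the
// guard goes out of scope, even on early return or exception.
template <typename Handle, typename Releaser>
class ScopedHandle {
public:
    ScopedHandle(Handle handle, Releaser releaser)
        : handle_(handle), releaser_(std::move(releaser)), owns_(true) {}

    ~ScopedHandle() {
        if (owns_) releaser_(handle_);
    }

    // Copying would release the same handle twice, so it is disabled.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    // Moving transfers the responsibility for releasing the handle.
    ScopedHandle(ScopedHandle&& other)
        : handle_(other.handle_),
          releaser_(std::move(other.releaser_)),
          owns_(other.owns_) {
        other.owns_ = false;
    }

    Handle get() const { return handle_; }

private:
    Handle handle_;
    Releaser releaser_;
    bool owns_;
};

// Convenience factory so the template arguments are deduced.
template <typename Handle, typename Releaser>
ScopedHandle<Handle, Releaser> MakeScoped(Handle handle, Releaser releaser) {
    return ScopedHandle<Handle, Releaser>(handle, std::move(releaser));
}
```

The guard does not replace unloading grammars that are no longer needed; it only ensures that handles created in a scope are not forgotten when that scope exits.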

How to Contact LumenVox LLC

Web site: www.LumenVox.com
Email: [email protected]
Sales: [email protected]
Support: [email protected]
Phone: (858) 707-0707
Fax: (858) 707-7072

LumenVox LLC
3615 Kearny Villa Road, Suite # 202
San Diego, CA 92123

Copyright Information

Copyright 2001, 2002, 2003, 2004, 2005 LumenVox LLC. All rights reserved.

Glossary

C

Concept: The string value returned by the decoder. The decoder can return multiple concepts. A concept represents words or phrases grouped together under a single "heading".

P

Phrase: A word or series of words. Can also include BNF-formatted words and/or pure phonemes.

S

SISR: Semantic Interpretation for Speech Recognition; a companion to SRGS grammars, this working draft describes a process for turning sentences recognized by an ASR into data objects usable by an application.

SRGS: Speech Recognition Grammar Specification; a W3C recommendation for the format of grammars used in a speech recognizer.

Index

A

AddPhrase .......................... 109, 307

Asynchronously................... 120, 326

B

Backus Naur Form........................ 78

BNF............................................... 78

C

Callback Function ....................... 425

Cautions........................................ 80

ClosePort .............................. 89, 271

Concept ..... 109, 111, 131, 294, 307, 309

confidence value................. 132, 295

Contact Us .................................. 457

Copyright Information.................. 458

D

Decode ............... 118, 130, 272, 293

dictionary ...................................... 78

E

email............................................457

Environment Variables ................448

F

FAQ.............................................450

G

GetConcept .........................131, 294

GetConceptScore ......... 130, 132, 293, 295

Grammar .....................112, 310, 441

GRAMMAR_DIGITS............113, 302

GRAMMAR_LETTERS .......113, 302

GRAMMAR_MONEY ..........113, 302

GRAMMAR_NUMBER........113, 302

I

Invalid Error Code ...............139, 311

L

LoadStandardGrammar.......113, 302

LoadVoiceChannel..............116, 305

Logging ................... 86, 90, 268, 425

LV_DEFAULT_GRAMMAR_ALREADY_LOADED........................... 430

LV_DEFAULT_GRAMMAR_OUT_OF_RANGE ............................... 430

LV_GRAMMAR_SET_OUT_OF_RANGE ........................................ 430

LV_INVALID_SOUND_FORMAT 430

LV_RESET ................................. 430

LV_SOUND_CHANNEL_OUT_OF_RANGE ................................... 430

LV_STANDARD_GRAMMAR_ALREADY_LOADED ................ 113, 302

LV_STANDARD_GRAMMAR_OUT_OF_RANGE..................... 113, 302

LV_SYSTEM_ERROR................ 430

LV_TIME_OUT ........... 120, 326, 430

LVBIN ......................................... 448

LVINCLUDE................................ 448

LVLANG...................................... 448

LVLIB .......................................... 448

LVRESPONSE............................ 448

LVSpeechPort............................. 261

M

MillisecondsToWait .............120, 326

O

OpenPort...............................86, 268

P

pcm .....................................116, 305

PCM_16KHZ ...............................439

PCM_8KHZ .................................439

Phonemes .....................................75

Phonetic Spelling ..........................75

Phrase...........................78, 109, 307

Port........................................86, 268

Properties....................................433

Q

Questions ....................................450

R

RemoveConcept .................111, 309

ResetGrammar....................112, 310

ReturnCode.........................139, 311

ReturnErrorString ................139, 311

S

scoring ................................ 132, 295

Sound Formats ........................... 439

speech port ........................... 86, 268

Standard Grammars ................... 441

StandardGrammar .............. 113, 302

Subdirectories............................. 448

T

Technical Support ....................... 457

TRIM_SILENCE_VALUE ............433

U

Ulaw ............................116, 305, 439

U-law...........................................439

ULAW_8KHZ...............................439

V

VoiceChannel......................116, 305

W

WaitForEngineToIdle...........120, 326