OpenSpeech Dialog 1.4

Developer’s Guide

Notice

OpenSpeech Dialog 1.4 Developer’s Guide

Copyright © 2001–2007 Nuance Communications, Inc. All rights reserved. Published by Nuance Communications, Inc., One Wayside Road, Burlington, Massachusetts, 01803, U.S.A. Last updated July 16, 2007.

Nuance Communications, Inc. provides this document without representation or warranty of any kind. The information in this document is subject to change without notice and does not represent a commitment by Nuance Communications, Inc. The software and/or databases described in this document are furnished under a license agreement and may be used or copied only in accordance with the terms of such license agreement. Without limiting the rights under copyright reserved below, and except as permitted by such license agreement, no part of this document may be reproduced or transmitted in any form or by any means, including, without limitation, electronic, mechanical, photocopying, recording, or otherwise, or transferred to information storage and retrieval systems, without the prior written permission of Nuance Communications, Inc.

Nuance, the Nuance logo, DialogModules, and RealSpeak are trademarks or registered trademarks of Nuance Communications, Inc. or its affiliates in the United States and/or other countries. All other trademarks referenced herein are the property of their respective owners.

Contents

Preface

Introduction

Restrictions and limitations

Audiences and objectives

Abbreviations

Available documentation

Related standards and third-party documents

Chapter 1. Introduction

Overview of OSD applications

Dialog configurations with xHMI

VoiceXML runtime platforms with the OSD framework

Application Java code

Sample applications

Running the pizza application

Chapter 2. OSDM node configuration (<osdm>)

<OSDM>

Chapter 3. Dynamic property framework (DPF)

Using DPF

Performance guidelines

Error handling

Setting up the DPF tree

Storing HTTP parameters in the DPF tree

DPF facades

Generic facade

OSDM-like facade

Nuance Proprietary

Chapter 4. Creating an OSD component

Introduction

Example PIN component

Details for the example component

Implementing OSD components

Create the directory structure

Configure the DPF tree

Defining the DPF (dpfInit.xml)

Enabling DPF (appconfig.xhmi)

Create xHMI file for the component (pin.xhmi)

The collection node example

The confirmation node example

The error <catch> example

Write a wrapper for the component (appconfig.xhmi)

Calling OSD components from VoiceXML applications

Calling OSD components from xHMI applications

Create the application structure

Create the application configuration (appconfig.xhmi)

Example parameter list

Example reference to a component

Example call to a component

Example result handling

Example error handling

Chapter 5. TransitionNode configuration

Chapter 6. Server-side event handling

Handling events on application servers

Overview

Performance considerations

Enabling server-side event handling

Counting events as they occur

How to catch events on the application server

Examples of event handling on the application server

Restarting a node and varying the output

Using conditions to vary the output

Setting the maximum retries of a node

Running scripts inside event handlers

Complete event-handling example

Chapter 7. OSD logging

About OSD logging

Diagnostic logging

Page logging

Application logging

Log message formats

Scoping of log messages

Nesting log events

Log events, parameters, and values

Generic log events

Transaction events

Node transitions

Catch handlers

Final transitions

Database interactions

Caller segmentation

Transfers

Turning application logging on and off

Chapter 8. OSD administration

Deployment to a web server

Providing an XML parser

Starting a session

Operation administration & management (OA&M)

Using JMX in Tomcat

Configuring the JMX connector

Set JMX parameters in web.xml

Load the configurator class


Managing configuration

Balancing system loads

Controlling shutdown and update operations

The IOAM interface

OA&M event notifications

Default error messages

Extending error messages

Example

Localizing OA&M notifications

Routing calls to the application

Background concepts

Registering OSD applications for routing

Configuring the routing servlet

Setting the persistent storage directory

Setting the initial routing table

Enabling the routing servlet

Configuring the web application server

Static routing

Dynamic routing

Using the routing servlet

Chapter 9. Application development topics

Using “skip lists” to avoid recognizing specific words

Key facts about skip lists

When skip list processing occurs

Controlling where skip list processing occurs

OSD automatically adds homophones to skip lists

Sample skip list grammar

Dynamic prompts

Writing prompt text to an attribute

Adding output to the StepResponse

Working with dates and times programmatically

Setting dates and times

Getting dates

Details on date and time formats

Exceptions for invalid timestamps

Creating grammars dynamically (at runtime)

Comparison of dynamic and static grammars

Overview of OSD support

Example of a dynamic grammar

Reading the <config> content of a node

Add the elements by extending the DTD

Use the new element in your xHMI configuration

Write classes to access the custom node

Extending the application object

Rendering

Extending the rendering system

Create a custom node

Configure the custom node in xHMI

Change the <vuiforward> map in xHMI

Create a jsp page

Using custom Render Data objects

Chapter 10. The OSD Datamodel

Overview of variables and data storage

Datamodel error events

Accessing variables with xHMI

Accessing variables with ECMAScript

Passing variables to subdialogs

Writing your own bean

Accessing variables with Java code

Access from a custom node

Access from an update rule

Access from an attribute facade

Access from a custom bean

ISessionFrame methods

Differences between a.best() and s.best(‘a’)

AttributeBean methods


Writing a factory

When do you need a factory?

Steps for writing a factory

Implementing IDataElementFactory

The factory lifecycle

Configuring factories

Using the OSD datamodel interface

Implementing IDataModelAccess

Example user-defined JNDI factory

Chapter 11. FAQ

Evaluating variables and making logic decisions

Accessing variables in ECMAScript expressions

Declaring custom classes in xHMI

Creating attributes via Java

Configuring OSDM parameters dynamically

Timing of updates in the SessionFrame

Changing the provided rendering jsp

Recognizing long utterances with robust parsing grammars

Chapter 12. Getting started with development

Speech application development lifecycle

Application design

Design the callflow

Design the prompts and speech grammars

Application development

Create a directory structure

Configure the application (create xHMI files)

Validating with a DTD

Validating with a W3C schema

Implementing custom nodes

General activities

Test the application callflow

Application deployment and tuning

Chapter 13. Testing

Appendix A. Predefined properties

Miscellaneous properties

Ambiguous recognition results

Skip list processing

Properties for OpenSpeech Insight logging (OSI)

Appendix B. Command line tools

Summary of command line tools

Prerequisites

Recording list tool (listing prompts for the recording studio)

Grammar List tool (lists grammars in an xHMI file)

Validate tool (validating xHMI configuration files)

Appendix C. Timestamp abbreviations

Appendix D. Negative confirmations


Preface

Introduction

OpenSpeech Dialog (OSD) is an open environment that helps platform vendors and application developers shorten development time for speech applications, lower the total cost of application deployment, and provide a higher level of service to customers.

OSD interprets the xHMI configuration language, an open XML specification language for dialog applications. (The name xHMI is an abbreviation for Extensible Human Machine Interface.) xHMI provides a high-level way to design and specify speech applications.

Together, OSD and the xHMI language address the complexity of implementing user interfaces that personalize telephone calls for each customer and that offer natural-language dialogs with broad vocabularies. Without OSD and xHMI, building these advanced applications with VoiceXML or JSP pages requires considerable design and programming skill, and the resulting costs are often too high to justify development. OSD and xHMI change this cost structure by speeding development, improving reliability, automating operational reports, and reducing the amount of tuning needed for initial phases of deployment.


In addition, environments that integrate OSD gain access to Nuance products that also use OSD. These products include:

■ SpeechPAKs™ (vertical application suites for healthcare, finance, utilities, etc.)

■ DirectoryAssistant (telephone directory solutions)

■ SpeechAttendant (automated telephone attendant solutions)

■ X|Mode (multi-modal applications)

■ Custom applications built by the Nuance Professional Services organization

Restrictions and limitations

OSD 1.4 is compatible with xHMI 1.4.

OSD compatibility with OSR

For its natural language capabilities, OSD requires version 3.0.4 or higher of the OpenSpeech Recognizer (OSR). This requirement applies to the following OSD capabilities:

■ Robust parsing grammars (first available in OSR 3.0.3, but only with sentence-based confidence scores)

■ Slot-based confidence scores. By default, OSR provides confidence scores for entire sentences and not for individual grammar (attribute) slots. To enable slot-based scores, you must configure OSR to add the special grammar key SWI_attributes to the recognition result. You can do this with the swirec_extra_nbest_keys parameter in an OSR user configuration file. For example, in a configuration file the parameter might appear as follows:

    <param name="swirec_extra_nbest_keys">
      <value>SWI_meaning</value>
      <value>SWI_literal</value>
      <value>SWI_grammarName</value>
      <value>SWI_utteranceSNR</value>
      <value>SWI_attributes</value>
    </param>
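Before relying on slot-based scores, it can help to confirm that a configuration actually lists SWI_attributes. The following Java sketch parses an XML fragment like the one above with the standard JAXP API and reports whether the key is present. The class and method names (NbestKeysCheck, paramValues, slotScoresEnabled) are illustrative only and are not part of OSD or OSR; the sketch also assumes that the configuration is well-formed XML that uses <param> and <value> elements as shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not an OSD or OSR class.
class NbestKeysCheck {

    /** Returns the <value> entries of every <param> with the given name, in document order. */
    static List<String> paramValues(String xml, String paramName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> values = new ArrayList<>();
            NodeList params = doc.getElementsByTagName("param");
            for (int i = 0; i < params.getLength(); i++) {
                Element param = (Element) params.item(i);
                if (!paramName.equals(param.getAttribute("name"))) {
                    continue; // some other parameter
                }
                NodeList entries = param.getElementsByTagName("value");
                for (int j = 0; j < entries.getLength(); j++) {
                    values.add(entries.item(j).getTextContent().trim());
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    /** True when swirec_extra_nbest_keys includes the SWI_attributes key. */
    static boolean slotScoresEnabled(String xml) {
        return paramValues(xml, "swirec_extra_nbest_keys").contains("SWI_attributes");
    }
}
```

A deployment script could call slotScoresEnabled on the contents of the user configuration file and log a warning when it returns false.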

Audiences and objectives

Application developers

Application developers are responsible for building speech applications that meet their customers’ business needs. With xHMI configurations and the OSD runtime environment, application developers can provide truly conversational speech interfaces. The xHMI configuration defines the conversational callflow of applications, and the OSD runtime provides interfaces, classes, and methods for developing xHMI applications.

To develop OSD applications, you also need the xHMI Reference Guide.

Abbreviations

Abbreviation   Description
ASP            Application Services Platform
ASR            Automatic Speech Recognition
CCXML          Call Control eXtensible Markup Language
DPF            Dynamic Property Framework working draft (from the W3C)
DTD            Document Type Definition
FIA            Form Interpretation Algorithm
HTTP           Hypertext Transfer Protocol
J2EE           Java 2 Platform, Enterprise Edition
JSP            Java Server Pages
MVC            Model-View-Controller
NL             Natural Language
OSD            OpenSpeech Dialog
RDO            Render Data Object
SALT           Speech Application Language Tags
SRGS           Speech Recognition Grammar Specification (from the W3C)
SSFT           ScanSoft (the former name of Nuance Communications, Inc.)
SSML           Speech Synthesis Markup Language (from the W3C)
TTS            Text to Speech
VCL            Verification Candidate List
VoiceXML       Voice Extensible Markup Language
W3C            World Wide Web Consortium
xHMI           eXtensible Human-Machine Interface
XML            eXtensible Markup Language

Available documentation

The documentation set for the OSD product includes the following:

■ OSD Migration Guide – Topics for applications and platforms built on previous releases of OSD and xHMI.

■ xHMI Reference Guide – Overview, architecture, and the xHMI language.

■ OSD Integration Guide – How to add OSD to an existing application development platform.

■ OSD Developer’s Guide – How to use OSD to build applications.

Related standards and third-party documents

DPF                         http://www.w3.org/TR/DPF/
ECMAScript                  http://www.ecma-international.org/publications/files/ecma-st/ECMA-262.pdf
HTTP                        http://www.ietf.org/rfc/rfc2616.txt
J2EE                        http://java.sun.com/j2ee/
Jakarta Struts              http://struts.apache.org/
Language tags (RFC 3066)    http://www.ietf.org/rfc/rfc3066.txt
RFC 2806, Telephone URLs    http://www.ietf.org/rfc/rfc2806.txt
SALT                        http://www.saltforum.org
SRGS                        http://www.w3.org/TR/speech-grammar/
SSML                        http://www.w3.org/TR/speech-synthesis/
Velocity template engine    http://jakarta.apache.org/velocity/
VoiceXML 2.0                http://www.w3.org/TR/voicexml20/
W3C Schema                  http://www.w3.org/TR/xmlschema-0/
                            http://www.w3.org/TR/xmlschema-1/
                            http://www.w3.org/TR/xmlschema-2/
XML 1.0                     http://www.w3.org/TR/2004/REC-xml-20040204/
XML Namespaces              http://www.w3.org/TR/REC-xml-names/



Chapter 1

Introduction

This guide helps Java application developers to build speech applications using the OSD runtime framework with xHMI configurations.

Overview of OSD applications

An OSD (OpenSpeech Dialog) application is a set of dialogs configured in the xHMI language and run as a web application on a platform that has integrated the OSD framework.

Each dialog has one or more dialog nodes that are implemented in Java. Only one dialog is active at a time.

Dialog configurations with xHMI

xHMI (the eXtensible Human-Machine Interface) is an XML markup language that configures the application callflow. The callflow consists of dialogs and their nodes; the callflow progresses as one node transitions to the next. The configuration also specifies prompts, grammars, and slots for recognition results.

Application developers can write the xHMI configuration manually, or they can generate it automatically with a tool (such as Nuance V-Builder), or they can use a combination of the two (for example, automatically generating a skeleton application, and then manually coding the details).

VoiceXML runtime platforms with the OSD framework

OSD is an application framework that simplifies the building of Java web applications (speech and multi-modal applications). These applications read xHMI configurations that define dialog behavior and the transitions between dialogs (for example, the prompts, confirmations, speech grammars, and so on).

Most OSD installations are integrated into a runtime platform supplied by a Nuance partner. You can also use OSD as provided directly by Nuance.

OSD is a flexible and extensible framework providing several integration APIs for customizing your Java applications. OSD provides interfaces, classes, and methods for controlling dialogs and interacting with the session states configured via xHMI.

OSD renders VoiceXML pages to your browser platform. The pages define everything needed for playing output to users and communicating with speech recognizers and text-to-speech engines. OSD writes detailed log messages for application analysis and tuning, and it provides an interface for systems operations.

OSD does not provide telephony, operational control, speech recognition, text-to-speech, or the logging of log messages. We assume these services are controlled by your browser platform.

Application Java code

Here is a partial list of the responsibilities of your Java code:

■ Implementing decision logic based on the current status of a session (the SessionFrame status) and on external information (for example, using a customer database to personalize the application callflow).

■ Accessing back-end databases and performing transactions (for example, a money transfer in a bank application).

Sample applications

OSD installations include sample code. Here is the path to the samples:

installdir\OpenSpeech Dialog\samples\

The pizza application – The ./pizza directory contains a simple application for ordering pizza. The application illustrates basic xHMI configuration concepts and Java coding practices.

The restaurant guide application – The ./restguide directory contains a simple application for selecting a restaurant and booking a table. The application illustrates the use of advanced natural language features.



The controller example – The ./sourcecode/servlet directory contains the source code for the front controller for handling events that affect the model or views.

The transfer example – The ./transfer directory contains code that demonstrates how to use the Transfer node for telephony transfers (typically to human agents).

The routing example – The ./routing directory contains the OSD routing servlet. Use the servlet for routing incoming telephone calls to OSD applications, and for performing rolling updates of applications. For an overview, see "Routing calls to the application" in Chapter 8.

Dialog node examples – The ./sourcecode/nodes directory contains source code of the OSD implementation of the xHMI primitive nodes: Output, Collection, and Transfer. The sources are provided for educational purposes and serve as a reference for application developers building custom nodes.

OSDM samples – These samples are useful if you plan to use OpenSpeech DialogModules (OSDMs) in your application. OSDMs perform specific callflow and prompting actions such as collecting a telephone number from a user.

The ./dateofbirth directory contains an xHMI application that makes use of the date OSDM to ask for a date. To run this example an OSDM installation is required.

The ./appconfig/OSDM directory contains fragments of xHMI configurations that show how to use OpenSpeech DialogModules in OSD. Use TmplAppConfig.fragment.xml as a template for other OSDM nodes.

Running the pizza application

After installation of OSD, the pizza sample application is ready for deployment and is found in the following directory:

installdir\SpeechWorks\OpenSpeech Dialog\samples\pizza\

Before installing this (or any OSD application) on an application server, you must copy dialogmanager-shared-1.4.jar to the shared library folder of the application server:

Application server Shared library folder

Tomcat TomcatInstallDir\shared\lib

WebSphere WebSphereInstallDir\lib


To install the sample on the application server, perform these steps:

1 Deploy the pizza.war file:

Tomcat Copy pizza.war to TomcatInstallDir\webapps\ and start deployment on the management console.

WebSphere In the Administrative Console, click Applications→Install New Applications. Follow the standard procedures to install a war file.

2 Ensure that the jar files included in the sample application have precedence:

Tomcat No action required.

WebSphere In the Administrative Console, click Applications→Enterprise Applications. Select the pizza application. On the next page, make the following changes:

Classloader Mode PARENT_LAST

WAR Classloader Policy

3 Ensure that you are using an XML parser that is compatible with JAXP 1.3. OSD installs the Apache Xerces parser, which you can use, in the java/lib/ext directory. For Tomcat 5.0, copy the XML parser libraries (xercesImpl.jar and xml-apis.jar) to the common/endorsed folder of the Tomcat installation, and then restart Tomcat.

4 Change the properties in installdir\webapps\pizza\WEB-INF\global.prop. Most importantly, you must set the properties that specify your servers:

grammar_server_ext Location of grammars

prompt_server_ext Location of prompts

osdm_server_ext Location of dialog modules



Chapter 2

OSDM node configuration (<osdm>)

The <osdm> element invokes sub-components and enables the creation of re-usable, modular building blocks for applications. You can call any OpenSpeech DialogModule (OSDM) provided by Nuance or another vendor, and you can implement your own sub-components. (For details and an example, see Creating an OSD component on page 23.)

You can run sub-components as client-side or server-side executions: a client-side invocation runs as a VoiceXML <subdialog>, and a server-side invocation runs inside the application itself. A given invocation is either client-side or server-side; you cannot switch between the two at will or combine them. Server-side sub-components must be OSD applications configured with xHMI. (See Server-side versus client-side sub-components below.)

<osdm>

Defines a modular, re-usable dialog executed as an application sub-component.

Scope

Scope node

Parent <config>

Allowed child elements <fills>

Attributes

Attribute¹ Description

srcexpr Required (except when you use src). An ECMAScript expression containing the URL of an OSDM or an OSD application.

src Required (except when you use srcexpr). For client-side, this is the URL of the sub-component’s VoiceXML subdialog. (For client-side, the <osdm> tag is a simple wrapper around a VoiceXML <subdialog>.) For server-side, this is the URL of the sub-component’s xHMI configuration file. The path can be relative or absolute.

¹ Either srcexpr or src is required; you cannot specify both.

Implemented tags

The OSDMNode and ServerSideOSDMNode node classes implement these tags:

Config element Description

<osdm> On the client-side, specifies an OSDM address. On the server-side, specifies the name of an xHMI file.

<fills> Specifies elements to map the return values to xHMI attributes.

<property> Specifies sub-component parameters. A child of <config>.

Server-side versus client-side sub-components

Each <osdm> element in your xHMI configuration invokes a client-side or server-side component:

■ The server-side capability enables complete encapsulation of the called module within a single application context. (You can use the server-side class to call any OSD application configured with xHMI.)

■ The client-side capability enables partial encapsulation of non-OSD applications. You can use the client-side class to call OpenSpeech DialogModules (OSDMs) created by Nuance and any other component that can run as a <subdialog> on the VoiceXML page.

There are performance differences between calling an OSD component with OSDMNode and calling it with ServerSideOSDMNode. We recommend implementing components as described in Creating an OSD component and using OSDMNode initially to call them. If you suspect degraded performance, change to ServerSideOSDMNode and compare results.

Here are the basic differences:

■ Client-side–When you execute an OSD component using OSDMNode, OSD renders a VoiceXML page with a <subdialog> that calls the component. When the component exits, the OSDMNode regains control and continues.

OSDMNode adds a roundtrip communication with the browser platform. The cost depends mostly on the number of parameters passed between the application and the components: if the application passes many parameters to components, or the components return many parameters to the application, the performance cost increases.

■ Server-side–In comparison, when you execute an OSD component on the server-side, the ServerSideOSDMNode creates a component handler without generating a VoiceXML page to get to the component. There is no roundtrip communication with the browser platform.

ServerSideOSDMNode adds a validation procedure and uses more memory. The cost depends on the size of the component’s xHMI configuration files: larger files add load.

The configuration differences are as follows:

■ The node class:

Client-side class com.scansoft.osd.nodes.OSDMNode

Server-side class com.scansoft.osd.nodes.ServerSideOSDMNode

■ The src location:

Client-side src URL to the OSDM subdialog. This URL will be rendered as <subdialog src="myURL"/> on the VoiceXML page.

Server-side src URL to an xHMI configuration file. Can be absolute or relative. A relative URL is resolved as follows:

  ■ If the URL starts with a forward slash (/), the location is relative to the application context root. For example, if the root is http://server/app, then these src values are equivalent: /subapps/getPIN.xhmi (relative) and http://server/app/subapps/getPIN.xhmi (absolute).

  ■ If the URL does not start with a forward slash, the file location is relative to the parent xHMI file location (the calling application’s xHMI file). For example, if the parent xHMI is at /xhmi/callPIN.xhmi, then a src value of subapps/getPIN.xhmi resolves to http://server/app/xhmi/subapps/getPIN.xhmi.

■ Events:

Client-side The sub-component must handle all events.

Server-side Events are automatically forwarded to the calling application if not handled by the sub-component. This allows applications to write top-level event handlers (for example, a single configuration to handle hang-up events in all components).

■ Sharing data:

Client-side The calling application can use <property> to configure the VoiceXML page that runs the OSDM subdialog. It cannot set attributes in the sub-component. The subdialog can return attribute values.

Server-side The calling application can use <property> to set attribute values in the sub-component. The sub-component can return values from any of its top-level (global-scope) attributes. Both the application and the sub-component share the same DPF tree (see Dynamic property framework (DPF) on page 15).

■ Return values:

Client-side The calling application evaluates explicit return values when performing transitions to the <next> target.

Server-side The calling application evaluates the “expr” return value. (See examples below.)

■ Global transitions in the calling application are not available to the sub-component. (This applies to both client- and server-side sub-components.)

Example client-side invocation

This example invokes a client-side OSDM.

<xhmi xml:lang="en-US" root="PinDialog">
  <dialog id="PinDialog" root="PIN">
    <var-list>
      <var name="PINReturnValue" type="attribute"/>
      <var name="PINReturnCode" type="attribute"/>
      <var name="PINConfidence" type="attribute"/>
    </var-list>

    <node id="PIN" class="com.scansoft.osd.nodes.OSDMNode">
      <config>
        <property-list>
          <property name="length" value="4"/>
        </property-list>
        <osdm src="%{osdm_server}osdm3-pin/osd-component">
          <fills name="PINReturnValue" slot="returnvalue"/>
          <fills name="PINReturnCode" slot="returncode"/>
          <fills name="PINConfidence" slot="confidencescore"/>
        </osdm>
      </config>
      <transition>
        <next name="SUCCESS">
          <target path="playbackResult"/>
        </next>
        <next name="FAILURE">
          <target path="/ErrorHandler(PINReturnValue)"/>
        </next>
      </transition>
    </node>
  </dialog>
</xhmi>

The OSDMNode class allows you to specify transitions directly to other nodes by setting transition properties to the values returned by the OSDM. In the example above, the <next name="SUCCESS"> and <next name="FAILURE"> elements are the transitions that use the returned properties.

Example server-side invocation

The next example invokes a server-side sub-component. It is nearly identical to the previous client-side example; the differences are highlighted with comments:

<xhmi xml:lang="en-US" root="PinDialog">
  <dialog id="PinDialog" root="PIN">
    <var-list>
      <var name="PINReturnValue" type="attribute"/>
      <var name="PINReturnCode" type="attribute"/>
      <var name="PINConfidence" type="attribute"/>
    </var-list>

    <!-- Difference: the server-side node class -->
    <node id="PIN" class="com.scansoft.osd.nodes.ServerSideOSDMNode">
      <config>
        <property-list>
          <property name="length" value="4"/>
        </property-list>
        <!-- Difference: src points to an xHMI configuration file -->
        <osdm src="getPIN.xhmi">
          <fills name="PINReturnValue" slot="returnvalue"/>
          <fills name="PINReturnCode" slot="returncode"/>
          <fills name="PINConfidence" slot="confidencescore"/>
        </osdm>
      </config>
      <transition>
        <!-- Difference: the application evaluates the "expr" return value -->
        <next name="expr">
          <target condexpr="PINReturnCode == 'SUCCESS'" path="playbackResult"/>
          <target condexpr="PINReturnCode == 'FAILURE'" path="/ErrorHandler(PINReturnValue)"/>
        </next>
      </transition>
    </node>
  </dialog>
</xhmi>
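The relative src value in the server-side example (getPIN.xhmi) is resolved using the rules listed under the configuration differences. As a rough illustration only, the following Java sketch models those two rules. The class and method names are hypothetical and are not part of the OSD API; the context root and the parent xHMI path are passed in explicitly for the sake of the example.

```java
/** Hypothetical helper that models the documented src resolution rules; not an OSD class. */
final class SrcResolverSketch {

    /**
     * @param contextRoot    application context root, for example "http://server/app"
     * @param parentXhmiPath path of the calling xHMI file below the context root,
     *                       assumed to start with "/", for example "/xhmi/callPIN.xhmi"
     * @param src            value of the src attribute on the osdm element
     */
    static String resolve(String contextRoot, String parentXhmiPath, String src) {
        if (src.contains("://")) {
            return src; // already an absolute URL
        }
        if (src.startsWith("/")) {
            return contextRoot + src; // relative to the application context root
        }
        // Otherwise relative to the directory that holds the parent xHMI file.
        String parentDir = parentXhmiPath.substring(0, parentXhmiPath.lastIndexOf('/') + 1);
        return contextRoot + parentDir + src;
    }
}
```

With the values used in this chapter, resolving subapps/getPIN.xhmi against a parent at /xhmi/callPIN.xhmi yields http://server/app/xhmi/subapps/getPIN.xhmi.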

Returning server-side values to the calling application

The calling application can define attributes to be filled when the sub-component returns (see the <fills> elements in the Example server-side invocation). To accomplish the return, the sub-component must define the “return” path and assign values to the returning attributes. The following example shows how the getPIN.xhmi sub-component might be implemented:

<transition>
  <next name="expr">
    <target condexpr="s.ver('pin')" path="return">
      <assign name="returnvalue" value="s.best('pin')"/>
      <assign name="returncode" value="'SUCCESS'"/>
      <assign name="confidencescore" value="s.conf('pin')"/>
    </target>

    <target condexpr="s.def('pin')" path="confirmPIN"/>

    <target condexpr="maxRetries == 3" path="return">
      <assign name="returnvalue" value="''"/>
      <assign name="returncode" value="'FAILURE'"/>
      <assign name="confidencescore" value="0"/>
    </target>

    <target path="getPIN"/>
  </next>
</transition>

Note: The sub-component can only return data in globally-scoped attributes. In the example, returnvalue, returncode, and confidencescore must be declared as top-level attributes in the sub-component (or declared in the “attributes” attribute of the sub-component root dialog).
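The getPIN.xhmi transition above reads as a decision sequence. The Java sketch below restates it under the assumption that the <target> elements are tried from top to bottom and the first one whose condition holds is taken; this guide does not state that rule explicitly, so treat it as an interpretation. The class, enum, and parameter names are hypothetical.

```java
/** Hypothetical restatement of the getPIN.xhmi transition; not an OSD class. */
final class PinTransitionSketch {

    enum Outcome { RETURN_SUCCESS, CONFIRM_PIN, RETURN_FAILURE, RETRY_GET_PIN }

    /**
     * @param verified   stands for s.ver('pin')
     * @param defined    stands for s.def('pin')
     * @param maxRetries stands for the maxRetries attribute tested in the example
     */
    static Outcome next(boolean verified, boolean defined, int maxRetries) {
        if (verified) {
            return Outcome.RETURN_SUCCESS;   // path="return" with returncode 'SUCCESS'
        }
        if (defined) {
            return Outcome.CONFIRM_PIN;      // path="confirmPIN"
        }
        if (maxRetries == 3) {
            return Outcome.RETURN_FAILURE;   // path="return" with returncode 'FAILURE'
        }
        return Outcome.RETRY_GET_PIN;        // default target, path="getPIN"
    }
}
```

Under this reading, a PIN that is defined but not verified always goes to confirmation, even when the retry limit has been reached.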

Passing parameters to sub-components

The calling application uses properties to configure attributes in sub-components. The attributes are globally scoped (they are mapped to top-level attributes in the sub-component; you cannot set local attributes in dialogs or nodes).

The <property> elements in the calling application will reset the values of the corresponding attributes (if they exist) in the sub-component. Thus, you can define default values in the sub-component, and override those attributes as needed with the calling application. You cannot configure properties on the VoiceXML platform with server-side components.

The <property> configurations are identical for client-side and server-side components, but the effect is different: for client-side sub-components, OSD renders the properties as VoiceXML properties.

Note: A good coding practice is to declare expected attributes at the top of the sub-component. This is recommended but not required (since OSD will create the attributes if they do not already exist).



Setting parameters at runtime

To use dynamic content in sub-components, use ECMAScript expressions. For example:

<property name="collection_parallelgrammar1" expr="s.get('nameOfCommandgrammar')"/>

Example call to the Date OSDM

This example shows a client-side invocation of the Date OSDM:

<xhmi xml:lang="en-US" root="Horoscope">
  <dialog id="Horoscope" root="nodeAskForDate">
    <var-list>
      <var name="osdmReturnCode" type="attribute"/>
      <var name="osdmDateReturnValue" type="attribute"/>
      <var name="osdmInputMode" type="attribute"/>
      <var name="osdmDateYear" type="attribute"/>
    </var-list>

    <node id="nodeAskForDate" class="com.scansoft.osd.dm.nodes.OSDMNode">
      <config>
        <osdm src="http://osdmserver:8080/osdm2-core/date">
          <fills name="osdmReturnCode" slot="nodeAskForDate.returncode"/>
          <fills name="osdmDateReturnValue" slot="nodeAskForDate.returnvalue"/>
          <fills name="osdmInputMode" slot="nodeAskForDate.inputmode"/>
          <fills name="osdmDateYear" slot="nodeAskForDate.returnkeys.Year"/>
        </osdm>
        <property name="propertiesfile1" value="http://myserver/params/application.properties"/>
        <property name="collection_parallelgrammar1" value="command.grxml"/>
      </config>
    </node>
  </dialog>
</xhmi>

Chapter 3

Dynamic property framework (DPF)

The Dynamic Property Framework (DPF)¹ enables changing and passing property values during runtime execution. You can set properties statically or dynamically with as many DPF trees as needed in your OSD applications and components.

¹ The dynamic property framework is modeled on the W3C working draft DPF from November 2004.

One use for DPF is to pass parameters when calling OSD components from a client application (see Creating an OSD component on page 23).

Using DPF

After you set up a DPF tree, the DPF object and its properties are available to all ECMAScript expressions in the xHMI configuration. For example, myDPF.mycomponent.daily is a reference to the daily property, which appears in mycomponent in the root tree named myDPF. Use the root name as a prefix in property paths. For example, consider this script:

<output id="advertisement">
  <audio srcexpr="myDPF.mycomponent.daily"/>
</output>

Assuming that daily has the value “http://server/usedcars.wav”, the system automatically renders this VoiceXML fragment:

<prompt><audio src="http://server/usedcars.wav"/></prompt>


Performance guidelines

For all DPF structures (DPF trees and DPF facades):

■ Loading large DPF trees can affect runtime performance. Because OSD creates DPF structures when it encounters <var> elements, the best location to define DPF trees is at the global scope.

■ Applications must treat DPF structures as read-only when sharing those structures with other applications. Otherwise, one application can corrupt property values in other applications by writing to the shared DPF tree.

■ Avoid defining property names that might cause problems with your ECMAScript interpreter. Do not define a property name that is also the name of a DPF tree or the String “Function”.

Error handling

When DPF errors occur, OSD throws error events for datamodel errors just as it would for any other error. For a description of events, see <catch> in the xHMI Reference Guide. Applications should always catch the error.datamodel root event or at least the general error event.

Setting up the DPF tree

To use dynamic properties in an OSD application or component, you must define a DPF tree in xHMI. It’s best to do this at global scope. For example:

<xhmi>
  <var-list>
    <var name="myDPF" type="dpf" expr="'myComponent'"/>
  </var-list>
</xhmi>

The example above creates a tree named myDPF. The tree automatically reads these properties files, in order:

1 /dpf/dpfInit.xml initialization file. This file is required.

2 /all.properties file. This file is optional; if it does not exist, the system writes a warning.

3 /myComponent.properties file. This file is optional; if it does not exist, the system writes a warning. Typically, the name you provide for myComponent is the same name you provide for <mycomponent> in the dpfInit.xml file described below.

You can create as many DPF trees as needed. Define the contents of each tree in an xml file, then refer to this file in the configuration. This example creates a tree named myDPF that is defined in myDPFInit.xml:

<var name="myDPF" type="dpf:/dpf/myDPFInit.xml"/>

For example, the contents of myDPFInit.xml might look like this:

<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <mycomponent>
    <todays_ad>http://server/usedcars.wav</todays_ad>
  </mycomponent>
</DPF>

In the example above, the content is static. You can supply dynamic values programmatically by writing a Java class and linking it to the property:

<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <mycomponent>
    <todays_ad class="com.mycorp.myclassname"></todays_ad>
  </mycomponent>
</DPF>

Your class must extend com.scansoft.osd.dpf.DPFProperty and supply the value of the property. The class must override the following method:

public String getValue()
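A minimal sketch of such a class follows. To keep it compilable on its own, it declares a local stand-in named DPFProperty; in a real component you would instead extend com.scansoft.osd.dpf.DPFProperty, whose constructors and other members are not described in this guide. The TodaysAd class, the day-of-week rule, the newcars.wav URL, and the need for a no-argument constructor are all assumptions made for illustration.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

/** Local stand-in for com.scansoft.osd.dpf.DPFProperty (assumption: only getValue is modeled). */
abstract class DPFProperty {
    public abstract String getValue();
}

/** Hypothetical dynamic property that chooses an advertisement by day of the week. */
class TodaysAd extends DPFProperty {
    private final DayOfWeek day;

    /** Assumed no-argument constructor for use from a class="..." reference. */
    TodaysAd() {
        this(LocalDate.now().getDayOfWeek());
    }

    /** Lets a caller fix the day, which makes the behavior easy to check. */
    TodaysAd(DayOfWeek day) {
        this.day = day;
    }

    @Override
    public String getValue() {
        boolean weekend = day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
        // The weekend URL is invented for this sketch.
        return weekend ? "http://server/newcars.wav" : "http://server/usedcars.wav";
    }
}
```

The value is computed when getValue() is called, so an expression that reads the property gets the current result rather than a value fixed in the XML file.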

You must avoid name conflicts when designing property names; this is especially important for reusable components. One way to accomplish this is to use a subtree in DPF which has the same name as the component (as shown by <mycomponent> in the previous example).

The framework creates the DPF tree whenever it encounters the dpf variable. Because the variable is scoped, the tree is also scoped. For example, when a <node> contains a <var> that loads a DPF tree, the framework re-creates the tree every time the callflow enters the node (the name is always the same, as is the content).

You can create a DPF that is available globally to more than one application. When applications define this tree, they share a single instance of the tree instead of creating new instances.



To create a global DPF, use the “static” identifier as follows:

<var name="myCallDPF" type="dpf:static:/dpf/myCallDPFInit.xml"/>

Use this for read-only properties. Applications should not write to these DPF trees because any change immediately affects all applications using the tree.

Storing HTTP parameters in the DPF tree

When calling OSD components, applications pass parameters using HTTP requests. The components store these parameters in DPF trees (this is discussed in Creating an OSD component on page 23).²

² OSD components allow insertion of HTTP parameters; OSD applications do not.

The parameters are inserted as children of the DPF property that has the name of the component. For example, assume the following dpfInit.xml file:

<DPF>
  <mycomponent>
    <a>40</a>
    <b>41</b>
  </mycomponent>
</DPF>

If you send a parameter “myparam=42” to “mycomponent,” it is inserted into the DPF as follows:

DPF
  mycomponent
    a = 40
    b = 41
    myparam = 42

Note: Avoid using underscores in parameter names. The underscore (_) character in a parameter name creates hierarchies in the DPF tree: for every underscore in the name, a new level in the tree is created. This feature was designed for special purposes and should be avoided in your applications. For example, if you send a_b_c=42 to mycomponent, the DPF tree looks like this:

DPF
  mycomponent
    todays_ad class="com.mycorp.myclassname"
    a
      b
        c=42
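The effect of underscores can be modeled with nested maps. The following Java sketch is only an illustration of the behavior described in the note, not OSD code: the class and method names are hypothetical, and the way it replaces an existing leaf with a subtree on a name clash is an assumption, since this guide does not describe that case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of how a parameter name maps onto DPF tree levels; not an OSD class. */
final class ParamTreeSketch {

    /** Inserts name=value below the component node, adding one level per underscore. */
    @SuppressWarnings("unchecked")
    static void insert(Map<String, Object> component, String name, String value) {
        String[] parts = name.split("_");
        Map<String, Object> node = component;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = node.get(parts[i]);
            if (!(child instanceof Map)) {
                // Assumption: a missing or leaf entry is replaced by a new subtree.
                child = new LinkedHashMap<String, Object>();
                node.put(parts[i], child);
            }
            node = (Map<String, Object>) child;
        }
        node.put(parts[parts.length - 1], value);
    }
}
```

Inserting myparam=42 adds a single child, whereas a_b_c=42 creates the levels a and b before storing c, which is why flat parameter names are easier to reason about.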

DPF facades

You can group several DPF trees into a facade, and then reference properties in the facade instead of the specific trees where the properties reside. When OSD looks up a property, it searches the trees in the facade in an order that you define.

Accessing DPF facade entries with scripts is the same as for DPF trees. For example, consider the ECMAScript expression in this configuration:

<output id="advertisement">
  <audio srcexpr="myDPF.mycomponent.daily"/>
</output>

Generic facade

This is a generic (list-like) approach that could be used to implement the following OSDM-like DPF facade.

<var name="myDPF" type="dpf_facade"/>

To add DPF trees to this DPF facade, execute an ECMAScript expression such as the following:

<script>
  myDPF.addTree(dpfA);
</script>

This DPF facade has the following methods:

addNamedTree(String name, IDPFProperty dpf)–adds a DPF tree with the specified name.

addTree(IDPFProperty dpf)–adds a DPF tree.

removeTree(int index)–removes the tree specified by the index.

setTree(int index, IDPFProperty dpf)

getTree(int index)–returns the tree specified by the index.

size()–returns the number of DPF trees in the DPFFacade.
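To make the list-like behavior concrete, here is a small Java model of a generic facade. It is a sketch and not the OSD implementation: plain maps stand in for IDPFProperty trees, the lookup method is hypothetical, and searching from the lowest index upward is an assumption about how the defined order is applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical list-backed model of the generic facade; maps stand in for DPF trees. */
final class GenericFacadeSketch {
    private final List<Map<String, String>> trees = new ArrayList<>();

    void addTree(Map<String, String> tree) { trees.add(tree); }

    void setTree(int index, Map<String, String> tree) { trees.set(index, tree); }

    void removeTree(int index) { trees.remove(index); }

    Map<String, String> getTree(int index) { return trees.get(index); }

    int size() { return trees.size(); }

    /** Returns the value from the first tree that defines the path, or null if none does. */
    String lookup(String path) {
        for (Map<String, String> tree : trees) {
            String value = tree.get(path);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

In this model, the order in which trees are added is the order in which they are searched, so a tree added earlier shadows the same property in a tree added later.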



OSDM-like facade

This facade is tailored to the usage needed in packaged applications like OSDMs. Filling the DPF facade works like this:

<var name="myDPF" type="dpf_facade_osdm">
  <param name="all" expr="dpfAll"/>
  <param name="component" expr="dpfComponent"/>
  <param name="language" expr="dpfEnglish"/>
  <param name="instance" expr="dpfInstance"/>
</var>

The individual DPF trees can be instantiated as follows:

<var name="dpfAll" type="dpf:/dpf/all.xml"/>
<var name="dpfComponent" type="dpf:/dpf/component.xml"/>
<var name="dpfEnglish" type="dpf:/dpf/language.xml"/>
<var name="dpfInstance" type="dpf:/dpf/instance.xml"/>

The DPF tree instantiation can also be done using the mechanism described in the Outside Bean Creation document.

The dpf_facade type lets you replace the references at runtime, as in the following example for language. The other references can be replaced similarly.

<script>
  myDPF.language = dpfGerman;
</script>

When the DPF facade is accessed like a normal DPF tree, the referenced DPF trees are searched in the following order:

1 instance DPF tree

2 language DPF tree

3 component DPF tree

4 all DPF tree
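The search order above can be sketched in Java as follows. This is a model for illustration, not the OSD class: maps stand in for IDPFProperty trees, only the setters needed for the example are shown, and the lookup method and property path are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the OSDM-like facade with its fixed search order. */
final class OsdmFacadeSketch {
    private Map<String, String> all = Collections.emptyMap();
    private Map<String, String> component = Collections.emptyMap();
    private Map<String, String> language = Collections.emptyMap();
    private Map<String, String> instance = Collections.emptyMap();

    void setAll(Map<String, String> tree) { all = tree; }
    void setComponent(Map<String, String> tree) { component = tree; }
    void setLanguage(Map<String, String> tree) { language = tree; }
    void setInstance(Map<String, String> tree) { instance = tree; }

    /** Searches instance, then language, then component, then all; returns null if not found. */
    String lookup(String path) {
        List<Map<String, String>> order = new ArrayList<>();
        order.add(instance);
        order.add(language);
        order.add(component);
        order.add(all);
        for (Map<String, String> tree : order) {
            String value = tree.get(path);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

Replacing the language tree at runtime, as in the script above, changes what a lookup returns for any property that the instance tree does not define.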

This DPF facade has the following methods:

■ setAll(IDPFProperty)
■ getAll()
■ setComponent(IDPFProperty)
■ getComponent()
■ setLanguage(IDPFProperty)
■ getLanguage()
■ setInstance(IDPFProperty)
■ getInstance()





Chapter 4

Creating an OSD component

This chapter shows how to write an OSD component that you can call from OSD applications or directly from a platform browser. It has a step-by-step description for writing and calling the component.

Introduction

An OSD component is a web application. Conceptually, it is a re-usable building block for speech applications. You can call a component from VoiceXML or OSD applications:

■ By calling OSD components from existing VoiceXML applications, you can use the components without fully re-implementing the applications. By adding a <subdialog> to the VoiceXML application, the application benefits without being configured with xHMI.

■ By calling OSD components from OSD applications, you increase modularity and re-usability. Applications call components as if they were nodes (using the OSDMNode class). The node generates a VoiceXML page that uses <subdialog>.

An OSD component is similar to an OSD application except for these differences:

■ Components run inside the calling application; they are not standalone applications. For example, this has implications for logging since the component is part of a session whereas an OSD application comprises the entire session.

■ When a component exits (<target path="return"/>), the calling application regains control of the session. When a standalone OSD application exits, the session ends.



■ When you call a component, the URI ends with “/osd-component”. When you call a standalone OSD application the URI ends with “/osd”.

■ When calling a component, applications pass property values using HTTP parameters.

■ When components return to their calling application, they fill values in a formal parameter list defined in the application’s root dialog. The list reinforces the purpose of a component to perform a specific activity, and ensures that the component returns a specific amount of information.
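As an illustration of the last points in the list, the Java sketch below assembles the address of a component call. It assumes that the property values travel as URL query parameters, which this guide does not specify beyond saying that HTTP parameters are used; the class and method names and the server address are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical builder for a component address; assumes query-string parameters. */
final class ComponentUrlSketch {

    /** Appends "/osd-component" to the component base URL, followed by the encoded parameters. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        if (!baseUrl.endsWith("/")) {
            url.append('/');
        }
        url.append("osd-component");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

When the call is made from an OSD application, the OSDMNode renders the <subdialog> for you, so a helper like this would only matter when building such an address by hand.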

Example PIN component

Consider an application that requests a personal identification number (PIN) for security purposes. We have a client application and a re-usable component:

■ The application configures the collection prompts, and defines the PIN length. Then it calls the component, gets the result, and continues.

■ The component collects the PIN, confirms it if necessary, and returns the result. The component uses xHMI to configure the prompts, grammars, confirmations, and so on, needed to perform the collection.

■ The example is kept simple to demonstrate the key features. For example, there is no configuration for nomatch, noinput, and help.

Here is a sample conversation:

System Please enter your 4-digit PIN

User one-two-three-four

System I think you said “1234.” Correct?

User Yes

[SUCCESS returned]

Another example call flow:

System Please enter your 3-digit PIN

User one-eight-three

System I think you said “...” Correct?

User No

System Please enter your 3-digit PIN

User one-eight-three

System I think you said “...” Correct?

User Yes

[SUCCESS returned]

This figure shows the callflow for the example OSD component:

[Figure: callflow with the states Initial Prompt, Collection, Confirmation, Success, and Failure. Transitions are labeled Something recognized, Nothing heard or Low confidence, Too many retries, and Rejected by user.]
Details for the example component

For implementation and discussion of each piece of the example PIN component, see these details:

■ Implementing OSD components
■ Calling OSD components from VoiceXML applications
■ Calling OSD components from xHMI applications



Implementing OSD components

The following subsections show how to create an OSD component by implementing the Example PIN component. You must:

■ Create the directory structure
■ Configure the DPF tree
■ Create xHMI file for the component (pin.xhmi)
■ Write a wrapper for the component (appconfig.xhmi)

Create the directory structure

OSD components use the same directory structure as OSD applications. Follow the conventions in Create a directory structure on page 138.

Below, we show key pieces of the PIN example. See the following subsections for details on dpfInit.xml, pin.xhmi, and appconfig.xhmi. The parent directory is osdm3-pin:

osdm3-pin
  WEB-INF
    dpf
      dpfInit.xml
    xhmi
      app
        inc
          pin.xhmi
        appconfig.xhmi
      dtd
        ...dtd files...
    all.properties
    ...rendering jsp files...
    pin.properties
    return.jsp
    ...

Configure the DPF tree

OSD components define and enable DPF trees for receiving variables from their calling applications. This enables the applications to dynamically insert variables into the trees when invoking components. Steps:

1 Defining the DPF (dpfInit.xml)

2 Enabling DPF (appconfig.xhmi)

For detailed DPF information, see Dynamic property framework (DPF) on page 15.



Defining the DPF (dpfInit.xml)

You must define a DPF tree so that applications can set parameters for the OSD component to use in its callflow. You should define default values for every entry in the DPF tree. The examples below use the length property to demonstrate setting and changing the default value.

For the Example PIN component, the application sets the PIN length. Below, we create the <length> property in the dpfInit.xml file as well as properties for prompts and various counters. The example configures a single element called pin below the DPF root element. The system automatically includes all properties in the pin.properties file under the pin element.

<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <pin>
    <length>4</length>
    <collection_initialprompt><![CDATA[ Please state your <value expr="DPF.pin.length"/>-digit PIN]]></collection_initialprompt>
    <collection_maxnoinputs>3</collection_maxnoinputs>
    <collection_maxnomatches>3</collection_maxnomatches>
    <collection_maxretries>3</collection_maxretries>
    <confirmation_initialprompt>I think you said</confirmation_initialprompt>
    <confirmation_initialprompt2>Correct?</confirmation_initialprompt2>
    <confirmation_maxnoinputs>3</confirmation_maxnoinputs>
    <confirmation_maxnomatches>3</confirmation_maxnomatches>
    <confirmation_maxretries>3</confirmation_maxretries>
  </pin>
</DPF>

You can reference DPF properties using “DPF.pin.name” syntax. For example: <value expr="DPF.pin.length"/>.

In a component, this xHMI fragment assigns the length from the DPF:

<property-list> <property name="length" expr="DPF.pin.length"/>

</property-list>



The component can access all its properties with the asterisk symbol (*) as a wildcard. The following example references all pin properties:

<property-list><property-set expr="DPF.pin.*"/>

</property-list>

To see DPF in a more complete example, see The collection node example on page 30.

Enabling DPF (appconfig.xhmi)

The OSD component must enable DPF in its xHMI configuration. Typically, you configure the DPF in a global <var-list>. For the Example PIN component, this is done in the appconfig.xhmi file as follows:

<var-list><var name="DPF" type="dpf" expr="'pin'"/>

</var-list>

See Write a wrapper for the component (appconfig.xhmi) on page 34.

Create xHMI file for the component (pin.xhmi)

This section shows the complete structure of a configuration file for an OSD component including the initial header and placeholders for the important blocks of code. This file, pin.xhmi, implements the Example PIN component:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dialog SYSTEM "../../dtd/xhmi.dtd">
<dialog id="pin" root="collection"
    attributes="returncode returnvalue confidencescore">

  <var-list>
    <var name="pin" type="attribute"/>
  </var-list>



</dialog>

Note: Do not enclose the OSD component with <xhmi> </xhmi> elements.

The header defines a dialog called pin that has a root node called collection, which uses three parameters: returncode, returnvalue, and confidencescore. The parameters are implicitly defined in the component with no need for <var> definitions.

The collection node example

The PIN component first executes the collection node, which makes extensive use of the DPF tree defined in Configure the DPF tree. The user hears the prompt: “Please state your n-digit PIN,” and the value of n is set dynamically.

Here is the collection node configuration in pin.xhmi. It defines properties, prompts, grammars, and transitions; and it gets these values from the DPF tree (the values previously set by the calling application):

<node id="collection" class="Collection">
  <config>
    <property-list>
      <property name="_maxnoinputs" expr="DPF.pin.collection_maxnoinputs"/>
      <property name="_maxnomatches" expr="DPF.pin.collection_maxnomatches"/>
      <property name="_maxretries" expr="DPF.pin.collection_maxretries"/>
    </property-list>

    <output-list>
      <initial>
        <output>
          <value expr="DPF.pin.collection_initialprompt"/>
        </output>
      </initial>
    </output-list>

    <grm-list>
      <grm ...>
        <fills name="pin" slot="NONE"/>
      </grm>
    </grm-list>

    <understand namelist="pin"/>
  </config>

  <transition>
    <next name="expr">
      <target condexpr="pin.ver()" path="return">
        <assign name="returncode" value="SUCCESS"/>
        <assign name="confidencescore" expr="pin.conf()"/>
        <assign name="returnvalue" expr="pin.best()"/>
      </target>

      <target condexpr="pin.def()" path="confirmation"/>

      <target condexpr="s.nodeVisits()>2" path="return">
        <assign name="returncode" value="FAILURE"/>
        <assign name="returnvalue" value="noValueInCaseOfFailure"/>
        <assign name="confidencescore" value="0"/>
      </target>

      <target path="collection"/>
    </next>
  </transition>
</node>

The collection node maps values to several predefined OSD properties: _maxnoinputs, _maxnomatches, and _maxnoretries. OSD automatically renders these properties to configure the browser.

To prepare for recognition, the node defines a grammar. In this example, we use the built-in digits grammar with parameters for the minimum, expected, and the maximum number of digits.1 The values for these parameters are retrieved from the DPF tree (where set by the calling application). The grammar fills the pin attribute with the recognition result.

The confirmation node example

The confirmation node executes when pin is defined, but not verified. This means the dialog collected a valid PIN, but the confidence score was too low to automatically verify. When the user confirms, the node returns the PIN to the calling instance of the PIN component, and the instance returns to the calling application.

1 The documentation for built-in grammars is provided in the OSR Language Supplement for the recognized language.

Here is the configuration of the confirmation node in pin.xhmi:

<node id="confirmation" class="Collection"><var-list><var name="YESNO" type="attribute"><param name="temporary" expr="true"/> </var>

</var-list>

<config><property-list><property name="_maxnoinputs"

expr="DPF.pin.confirmation_maxnoinputs" /><property name="_maxnomatches"

expr="DPF.pin.confirmation_maxnomatches"/><property name="_maxretries"

expr="DPF.pin.confirmation_maxnoretries"/></property-list>

<output-list><initial><output><value expr="DPF.pin.confirmation_initialprompt"/>

</output><output>

<say-as interpret-as="number" format="digits"><value expr="pin.best()"/>

</say-as></output><output><value expr="DPF.pin.confirmation_initialprompt2"/>

</output></initial>

</output-list><grm-list><grm src="builtin:grammar/boolean"><fills name="YESNO" slot="NONE"/>

</grm></grm-list>

<understand namelist="YESNO"/>

</config>

<transition><next name="expr"><target condexpr="YESNO.best()=='true'" path="return"><assign name="returncode" value="SUCCESS"/><assign name="returnvalue" expr="s.best('pin')"/><assign name="confidencescore"

expr="pin.conf()"/></target>

<target condexpr="s.nodeVisits()>2" path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue"


</target>

<target condexpr="YESNO.best()=='false'"

path="collection"/>

<target path="confirmation"/>

</next></transition>


</node>

The error <catch> example

The PIN component must catch and handle any errors that occur during collection and confirmation. This is configured by the following block at the dialog scope in pin.xhmi:

<catch><target event="connection.disconnect" path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue"


</target>

<target event="maxspeechtimeout" path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue" value="maxspeechtimeout"/><assign name="confidencescore" value = "0"/>

</target>

<target event="maxnoinputs" path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue" value="maxnoinputs"/><assign name="confidencescore" value = "0"/>

</target>

<target event="error" path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue" value="error"/><assign name="confidencescore" value = "0"/>

</target>

<target event="." path="return"><assign name="returncode" value="FAILURE"/><assign name="returnvalue"

value="defaultEventHandlerActivated"/><assign name="confidencescore" value = "0"/>

</target></catch>

Write a wrapper for the component (appconfig.xhmi)

The final step in building an OSD component is to create a wrapper that loads the component as a web application. This is done in an appconfig.xhmi file (see Create the directory structure).

Once the wrapper is complete, the implementation of the component is done. You can pack the component into a WAR or EAR file and install it on any servlet container.

This example wraps the OSD component described in Example PIN component:

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"><xhmi root="pin"

xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude">

<vuiforward><forward name="_collection" path="/collection.jsp"/><forward name="_error" path="/error.jsp"/><forward name="_return" path="/return.jsp"/>

</vuiforward>

<var-list><var name="DPF" type="dpf" expr="'pin'"/>

</var-list>

<xi:include href="inc/pin.xhmi"/>

</xhmi>

Above, the wrapper defines an xHMI application with the pin root dialog. This structure (using <xi:include>) is recommended because it enables you to also use the pin.xhmi file directly in an OSD application.

Calling OSD components from VoiceXML applications

Here is a VoiceXML fragment that calls a subdialog named mypin, and passes the PIN length as an HTTP parameter. The subdialog is actually an OSD component named osdm3-pin that collects a personal identification number.

When the component returns, the application checks the returned result. The example does not show what the VoiceXML application does next. We omit error handling for brevity:

<?xml version="1.0" encoding="UTF-8"?><vxml version="2.0"

xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

<form><var name="result" /><var name="length" expr="4" /><subdialog name="mypin" src="http://somehost:8080/osdm3-pin/osd-component"

namelist="length"><filled><assign name="result" expr="mypin.returnvalue" />

</filled></subdialog>

</form></vxml>

The VoiceXML application can use the HTTP request to set properties because the OSD component configures a DPF tree to receive them. For an example, see Implementing OSD components. Above, the <subdialog> inserts length into the tree, and the component accesses the property using syntax like “DPF.pin.length” (for an example, see The collection node example on page 30).
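The fragment above omits error handling. As a minimal sketch, a caller could branch on the return code inside <filled>. This assumes that the component's returncode value is visible to the VoiceXML page in the same way as returnvalue (as mypin.returncode); the prompt and log texts are illustrative only:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <var name="result"/>
    <var name="length" expr="4"/>
    <subdialog name="mypin"
               src="http://somehost:8080/osdm3-pin/osd-component"
               namelist="length">
      <filled>
        <!-- Assumption: returncode is returned alongside returnvalue. -->
        <if cond="mypin.returncode == 'SUCCESS'">
          <assign name="result" expr="mypin.returnvalue"/>
        <else/>
          <!-- On FAILURE, the component puts a reason in returnvalue,
               for example maxnoinputs or error. -->
          <log>PIN collection failed: <value expr="mypin.returnvalue"/></log>
          <prompt>Sorry, I could not collect your PIN.</prompt>
        </if>
      </filled>
    </subdialog>
  </form>
</vxml>
```

Only standard VoiceXML 2.0 elements are used here (<if>, <else/>, <log>, <prompt>); what the application does after either branch is left open, as in the original fragment.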



Calling OSD components from xHMI applications

The following subsections show how to use OSD components. This example creates a client application that calls the component described in the Example PIN component on page 24. The application configures the PIN length, and decides what to do after a valid PIN is collected and verified.

Create the application structure

The directory structure follows the conventions described in Create a directory structure on page 138. Here is the directory structure for the example client application. The parent directory is named secured:

secured
    WEB-INF
        xhmi
            app
                appconfig.xhmi
            dtd
                ...dtd files...

...rendering jsp files...

...

Create the application configuration (appconfig.xhmi)

OSD applications call OSD components as if they were nodes. In this example, the OSDM node configures and calls the OSD component (itself an OSDM node). The code below shows the complete xHMI file including a header followed by placeholders for the important blocks of code (which are shown later).

Here is the application configuration:

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"><xhmi root="Main" xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude">

<vuiforward><forward name="_collection" path="/collection.jsp"/><forward name="_outputExit" path="/outputExit.jsp"/><forward name="callOSDM" path="/outputOSDM.jsp"/><forward name="_error" path="/error.jsp"/>

</vuiforward>


<dialog id="Main" root="intro">

</dialog>

<catch><target event="connection.disconnect" path="exit"/><target event="error" path="exit"/><target event="." path="exit"/>

</catch></xhmi>

Example parameter list

To receive return values from OSD components, the OSD application must declare a formal list of attributes in its root dialog.

Below is the root dialog for the Example PIN component. The Main dialog defines intro as the root node, and declares attributes to contain the return values. Here is the beginning of the Main dialog:

<dialog id="Main" root="intro"><var-list><var name="osdmPINReturnValue" type="attribute"/><var name="osdmPINReturnCode" type="attribute"/><var name="osdmPINConfidence" type="attribute"/>

</var-list>

Example reference to a component

The OSD application calls components as if they were nodes. Below, intro is the first node called by the Main dialog of the Example PIN component. The node adds a prompt to the output queue and calls the pinOSDM node. Here is the configuration:

<node id="intro" class="Output"><config><output-list><initial><output>Welcome to the PIN demonstrator</output>

</initial></output-list>

</config><transition><next name="expr"><target path="pinOSDM"/>

</next>

</transition></node>


Example call to a component

This example defines a node that invokes an OSD component. You can use the OSDMNode or ServerSideOSDMNode classes for the definition.

Below is the configuration for the pinOSDM node, which calls the Example PIN component. Like all nodes using the OSDMNode class, pinOSDM uses a namelist to send properties to the OSDM (in this example, we set the PIN length). The system writes property values to a DPF tree provided by the OSD component, and the values override any defaults defined in the component.

<node id="pinOSDM" class="com.scansoft.osd.nodes.OSDMNode"><config><property-list><property name="length" value="3"/>

</property-list>

<osdm src="%{osdm_server}osdm3-pin/osd-component"><fills name="osdmPINReturnValue" slot="returnvalue"/><fills name="osdmPINReturnCode" slot="returncode"/><fills name="osdmPINConfidence" slot=

"confidencescore"/></osdm>

</config>

<transition><next name="SUCCESS"><target path="presentResult"/>

</next><next name="FAILURE"><target path="/ErrorHandler(osdmPINReturnValue)"/>

</next></transition>


</node>

The <transition> block defines what happens after the PIN component returns. On SUCCESS, the application calls the presentResult node. On FAILURE the application calls the ErrorHandler dialog (and passes the return value as a parameter for use in the error message).

Example result handling

A real application would do something useful with the collected PIN, but this sample merely repeats the collected PIN to the user and then exits. Here is the configuration for the presentResult node:

<node id="presentResult" class="Output"><config><output-list><initial><output>Your pin is <say-as interpret-as="number" format="digits"><value expr="osdmPINReturnValue.best()"/>

</say-as>Goodbye

</output></initial>

</output-list></config><transition><next name="expr"><target path="exit"/>

</next></transition>


</node></dialog>

Note how the presentResult node controls the text-to-speech output: because the format is declared as digits, the TTS engine will speak the PIN as individual digits (for example, one-two-three-four) instead of combining digits into some natural number expression (for example, “twelve thirty-four” or “one thousand two hundred thirty-four”).



Example error handling

The ErrorHandler dialog is called when an error occurs (see footnote 2). This example error handler simply reports the error (for testing purposes), and then the application exits. A real application would do something more useful; for example, it might transfer the telephone call to a human operator. Here is the configuration for the ErrorHandler dialog:

<dialog id="ErrorHandler" root="errorHandler" attributes="resval">

<node id="errorHandler" class="Output"><config><output-list><initial><output>An error occurred: <value expr="s.best('resval')"/>

</output><output>please try again later</output>

</initial></output-list>

</config><transition><next name="expr"><target path="exit"/>

</next></transition>


</node></dialog>

2 For this example, the error handler is implemented as a dialog instead of a node. This choice emphasizes that a re-usable dialog is more useful when the example performs a more complicated transaction.



Chapter 5

TransitionNode configuration

The TransitionNode node class provides a way to specify transitions in a node. Using a Transition node allows you to centralize and re-use transition handling instead of repeating transition configurations in every node.

Description

TransitionNode ignores all elements in <config> except for the predefined properties (see Predefined properties on page 147).

Your Transition nodes can log activities just as any other node: use <log> inside the <node> and its <target> elements.

Example

Below is an example Transition node. It defines three targets that could be used by several nodes in the application:

<node class="com.scansoft.osd.nodes.TransitionNode" id="transit">

<transition><next name="expr"><target condexpr="s.best('serviceType')=='WEATHER'"

path="/Weather"/><target condexpr="s.best('serviceType')=='NEWS'"

path="/News"/><target condexpr="true" path="/Help"/>

</next></transition>


</node>
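As a minimal sketch of the logging mentioned in the Description, the same Transition node could carry a <log> at the node level and inside each <target>. The EVNT=OSDInfo message format is the one described in the OSD logging chapter; the message texts themselves are illustrative assumptions:

```xml
<node class="com.scansoft.osd.nodes.TransitionNode" id="transit">
  <!-- Node-level log: written when the Transition node is invoked. -->
  <log>EVNT=OSDInfo|INFO=Node transit entered.</log>
  <transition>
    <next name="expr">
      <target condexpr="s.best('serviceType')=='WEATHER'" path="/Weather">
        <!-- Target-level log: written only when this target is taken. -->
        <log>EVNT=OSDInfo|INFO=Routing caller to Weather</log>
      </target>
      <target condexpr="s.best('serviceType')=='NEWS'" path="/News">
        <log>EVNT=OSDInfo|INFO=Routing caller to News</log>
      </target>
      <target condexpr="true" path="/Help">
        <log>EVNT=OSDInfo|INFO=No known service type, routing to Help</log>
      </target>
    </next>
  </transition>
</node>
```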

<config> Element Description

<property-list> Specifies predefined properties that steer the transition to the next node.


Here is an example of a node that uses the Transition node:

<var-list><var name="serviceType" type="attribute"/>

</var-list>

<node class="Collection" id="welcome"><config><output-list><initial><output>Welcome, what service do you want to use?

</output><output>You can have information about the WEATHER or NEWS

</output></initial>

</output-list>

<grm-list><grm src="serviceNames.grxml"><fills name="serviceType"/>

</grm></grm-list>

<understand namelist="serviceType"/>

</config>

<transition><next name="expr"><target path="#transit"/>

</next></transition>


</node>



Chapter 6

Handling events with application servers

When events arise, application designers and developers can handle them client-side or server-side:

■ Client-side event handling means handling events on the VoiceXML page.

■ Server-side event handling means handling events on the application server.

By default, event handling is done client-side. To change the default, see Enabling server-side event handling on page 46.

Overview

Prior to OSD 1.4, applications handled user-defined events on the application server and handled predefined recognition events (nomatch, noinput and help) on the VoiceXML page. As of OSD 1.4, applications can define a single event-handling mechanism located on the server-side.

When recognition events arise, and the application handles them client-side, the only possible actions are to play a prompt and restart the node to collect the information. Many more actions are possible on the server-side:

■ <assign>

■ <clear>

■ <output>

■ <log>

■ <script>



Performance considerations

Server-side event handling shifts processing requirements from client to server. Applications are likely to demand more processing on the server-side (because more actions are possible there).

A disadvantage to server-side event handling is that it adds one roundtrip of messages between the VoiceXML interpreter and the web application server for each recognition event that occurs. If you anticipate high counts of recognition events (nomatch, noinput and help), then you should consider the additional network load of roundtrip messages for your application.

Enabling server-side event handling

By default, server-side event handling is disabled. This is done to preserve compatibility with previous OSD releases. Applications can enable server-side event handling by setting the following predefined properties to false:

■ _renderNoMatchOutputs

■ _renderNoInputOutputs

■ _renderHelpOutputs

You can set the properties at the global, dialog, or node scope. Thus, applications can switch between client- and server-side event handling on a node-by-node basis.

Note: Applications can use client-side or server-side event handling for each type of recognition event. For example, you can handle noinput events on the client and nomatch and help on the server. (However, you cannot mix client and server for the same type of event; for example, you cannot handle noinput events on both the client and server.)

Setting _renderNoMatchOutputs to false disables automatic playing of the nomatch output and sends the nomatch event to the server.
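As a sketch of the mixed setup described in the Note, the following node leaves noinput on the client (by not setting _renderNoInputOutputs) and sends nomatch and help to the server. The property syntax and the <catch> structure follow the examples in this chapter; the node id and the prompt texts are illustrative assumptions:

```xml
<node class="Collection" id="collect">
  <config>
    <property-list>
      <!-- _renderNoInputOutputs is not set here, so noinput keeps the
           default client-side handling. -->
      <property name="_renderNoMatchOutputs" value="false"/>
      <property name="_renderHelpOutputs" value="false"/>
    </property-list>
    <!-- Outputs, grammars, and <understand> omitted for brevity. -->
  </config>
  <!-- <transition> omitted for brevity. -->
  <catch>
    <!-- Server-side handlers for the two events that are no longer
         rendered on the VoiceXML page. -->
    <target event="nomatch" path="#collect">
      <output>I did not understand, please repeat</output>
    </target>
    <target event="help" path="#collect">
      <output>Please say the name of the item you want</output>
    </target>
  </catch>
</node>
```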



Counting events as they occur

Applications can use the following xHMI attributes to track when events occur in the current node:

Predefined attribute    Description

__helpCounter           Counts how many times the help event has been raised. Resets to zero (0) when exiting the current node.

__noinputCounter        Counts how many times the noinput event has been raised. Resets to zero (0) when exiting the current node.

__nomatchCounter        Counts how many times the nomatch event has been raised. Resets to zero (0) when exiting the current node.

For examples showing counters, see Restarting a node and varying the output on page 48 and Using conditions to vary the output on page 49.

How to catch events on the application server

To catch events on the server-side, use the <catch> element. You can scope your configurations as global, dialog, and node. The <catch> configuration contains one child <target> for each handled event.

This example handles the nomatch and noinput events. Both targets invoke the collect node after performing their actions:

<catch>

<target event="nomatch" path="#collect"><log>nomatch log message with <value expr="'values'"/></log><clear name="slot_to_be_cleared"/><assign name="slot_to_be_assigned_to" expr="'some value'"/><output>I did not understand, please repeat</output>

</target>

<target event="noinput" path="#collect"><log>noinput log message with <value expr="'values'"/></log><output>Please answer the question carefully</output>...

</target>

</catch>


Examples of event handling on the application server

This section shows snippets and a complete example of xHMI configurations for handling recognition events (noinput, nomatch, and help).

The examples show how to count iterations of node executions and customize output prompts based on the number of retries experienced by the end-user.

Restarting a node and varying the output

This example collects one item and transitions to the next node (which is always the bye node). If a recognition event arises, the handler restarts the node.

<node class="Collection" id="collect"><config><var name="counter" type="int" expr="0"/><property-list><property name="_renderNoMatchOutputs" value="false"/><property name="_renderNoInputOutputs" value="false"/><property name="_renderHelpOutputs" value="false"/>

</property-list>

<output-list><initial><output>This is a collection</output>


<grm-list><grm src="collection.grxml"><fills name="collectedItem"/>

</grm></grm-list>

<understand namelist="collectedItem"/></config>

<transition><next name="expr"><target condexpr="counter > 17" path="Help()"/>

<target condexpr="collectedItem.def()" path="#bye"/></next>

</transition>

<catch><target event="nomatch" path="#collect"/><target event="noinput" path="#collect"/>

</catch></node>

Using conditions to vary the output

The next example uses conditions to vary the <output> elements each time the node is restarted. Because the conditions require processing that cannot execute on the VoiceXML page, the recognition event handling must happen server-side.

The previous example (Restarting a node and varying the output on page 48) used “<output count=...” to implement its conditional logic. Here, user-defined attributes serve the same purpose.

The behavior of this node is the same as that shown in Restarting a node and varying the output on page 48. The difference is that this node controls the contents of the counters. The catch handlers contain executable content that computes the counter values:

<node class="Collection" id="collect"> <var-list> <var name="nomatches" type="attribute" expr="0"/><var name="noinputs" type="attribute" expr="0"/><var name="counter" type="int" expr="0"/>

</var-list>

<config><property-list> <property name="_renderNoMatchOutputs" value="false"/><property name="_renderNoInputOutputs" value="false"/><property name="_renderHelpOutputs" value="false"/>

</property-list>

<output-list><initial>

<output condexpr="counter==0">This is a collection

</output><output condexpr="counter==1">


This is still a collection</output>



</grm></grm-list>

</config>

<transition><next name="expr"><target condexpr="collectedItem.def()" path="#bye"/>

</next></transition>


<catch><target event="nomatch" path="#collect"><assign name="nomatches"

expr="Number(s.best('nomatches'))+1"/><assign name="counter"

expr="Number(s.best('counter'))+1"/></target>

<target event="noinput" path="#collect"><assign name="noinputs"

expr="Number(s.best('noinputs'))+1"/><assign name="counter"


</catch></node>

(In the preceding example, we omit <understand> assuming that it is defined at a higher scope.)

Setting the maximum retries of a node

This example implements a maximum number of restarts for a node. (A node restart is a retry state for the user: the node begins again after an unsuccessful collection of information from the user.) The counter in this example is the total of all retry causes.

Define the maximum retry counter as first or last transition in the node. Here, the counter is the first transition target:

<node><transition><target condexpr="s.best('counter')>17" path="Help"><log>counter reached maximum number of events</log>

</target>



</transition>

</node>

To see where this <target> fits inside the node, see the example in Restarting a node and varying the output on page 48.

Running scripts inside event handlers

The next example adds the powerful scripting feature. (This feature is sometimes called “scripts as a child of <target>”.)

Upon catching any event (not just recognition events, but also user-defined events), the application can run a script. Because the event handler is server-side, the scripts have full access to the data in the OSD session.

This example catches only the nomatch event (the <catch> is at the bottom):

<node class="Collection" id="collect"><config><property-list><property name="_renderNoMatchOutputs" value="false"/><property name="_renderNoInputOutputs" value="false"/><property name="_renderHelpOutputs" value="false"/>

</property-list>

<output-list><initial><output>This is a collection</output><output count="2">This is still a collection</output>



</grm>


</grm-list>


<transition><next name="expr"><target condexpr="collectedItem.def()" path="#bye"/>

</next>

</transition>

<catch><target event="nomatch" path="#collect"><output>Please say something I can understand!</output><script>a = 'executed every time this target is used';

</script></target>

</catch></node>

The first execution of the node sends the following output:

“This is a collection.”

The first nomatch event, and any subsequent nomatch, sends this output:

“Please say something I can understand! This is still a collection.”

Complete event-handling example

This example shows a simple, complete application for reference using features provided by default in OSD. There is a single dialog (Main), and several nodes (intro, collect, and bye).

When noinput and nomatch recognition events arise, the example handles them on the application server. This allows for logging, scripting, and interactions with the session’s data.

This example application refers to counters that are managed internally by OSD, and it has one user-defined counter for the total number of nomatch and noinput events:

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"><xhmi root="Main" xml:lang="en-US" xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude">

<vuiforward><forward name="_outputExit" path="/outputExit.jsp"/><forward name="_collection" path="/collection.jsp"/>

</vuiforward>

<dialog root="intro" id="Main"><var-list><var name="collectedItem" type="attribute"/><var name="counter" type="int" expr="0"/>

</var-list>

<node class="Output" id="intro"><config><output-list><initial><output>Welcome to the demo</output>

</initial></output-list>

</config>

<transition><next name="expr"><target path="#collect"/>

</next></transition>


</node>



<node class="Collection" id="collect"><log>collection entered</log><log>counter is '<value expr="counter"/>'</log>

<config><property-list><property name="_renderNoMatchOutputs" expr="'false'"/><property name="_renderNoInputOutputs" expr="'false'"/><property name="_renderHelpOutputs" expr="'false'"/>

</property-list>

<output-list><initial><output>This is a collection</output>



</grm></grm-list>


<transition><next name="expr">

<target condexpr="counter > 17" path="#bye"><log>maximum retries and events exceeded; leaving unsuccessful collection

</log></target>

<target condexpr="collectedItem.def()" path="#bye"><log>leaving successful collection</log><clear name="counter"/>

</target>

<target condexpr="true" path="#collect"><log>executing node again</log><assign name="counter"

<catch> <target event="nomatch" path="#collect"><log>nomatch event occurred; the number of times this node has tried to collect is <value expr="counter.best()"/>

</log><script>counter=counter+1;</script><clear name="collectedItem"/>

</target>

<target event="noinput" path="#collect"><log>noinput event occurred; the number of times this node has tried to collect is <value expr="counter.best()"/>

</log><script>counter=counter+1;</script><clear name="collectedItem"/>

</target></catch>



<final><log>non node transition used to leave collection</log>

</final></node>

<node id="bye" class="Output"><config><output-list><initial><output>Thanks for calling</output>

</initial></output-list>

</config>

<transition><next name="expr"><target path="exit"/>

</next>

</transition></node>

<final><log>Dialog exited without transitioning to another dialog

</log></final>

</dialog>

<catch><target event="session.connection.disconnect" path="exit"/>

</catch>

</xhmi>

Chapter 7

OSD logging

This chapter describes OSD logging mechanisms, including these topics:

■ About OSD logging, an overview of the available logging streams.

■ Application logging, a description of xHMI configuration and critical events your applications should log.

■ Turning application logging on and off

About OSD logging

OSD generates log events and writes log files for various purposes. OSD implements these types of logging:

■ Diagnostic logging–For debugging and monitoring system operations, OSD uses log4j, an open-source utility that is a project of the Apache Software Foundation.

■ Page logging–For debugging, OSD writes copies of every VoiceXML page it renders.

■ Application logging–For analysis and tuning of your applications, OSD writes log files to a documented file system. You have control of the content of log messages, and you can choose more than one format of records in the files. For example, the most common format is the one used by the OpenSpeech Insight (OSI) tuning tool.



Diagnostic logging

For debugging and monitoring system operations, OSD uses log4j, an open-source Java utility that is a project of the Apache Software Foundation. The utility has six logging levels:

■ FATAL

■ ERROR

■ WARN

■ INFO

■ DEBUG

■ TRACE

You control the logging mostly with the help of a log4j property file. The contents of the file control the format of the log entries and the amount of logged information.

For each OSD application, you point to the log4j property file with the log4jPropertyFileName context parameter in the application’s web.xml. For example:

<web-app>…<context-param><param-name> log4jPropertyFileName </param-name><param-value>WEB-INF/log4j.properties

</param-value><description>Location of log4j property file for diagnostic logging.

</description></context-param>…

</web-app>

If you omit this parameter, the default location of the property file is /log/OSD.log inside the installed application.

For concepts, reference, and configuration details, see this website:

http://logging.apache.org/log4j/docs/

Performance tips:

■ Disable console output in the log4j.properties file.

■ In a production system, do not use DEBUG or INFO levels.

■ Limit logging for each class if specific DEBUG or INFO logs are required.



Page logging

For debugging, OSD can write a copy of every VoiceXML page it renders to a local directory. When you enable this feature, OSD creates a directory named pages, and writes the pages there.

Page logging adds a substantial load to your computer system. Do not use it during normal operations. When you use it, enable it for one HTTP session (web service) at a time, because the code running in the background is not thread-safe.

Note: Do not enable page logging on production systems that are already operating near capacity.

To enable page logging, add the following <filter> and <filter-mapping> elements to your application’s web.xml file:

<web-app>...<filter><filter-name>PageLog</filter-name><filter-class>com.scansoft.osd.servlet.FilterPageLog

</filter-class></filter>

<filter-mapping><filter-name>PageLog</filter-name><servlet-name>osdservlet</servlet-name>

</filter-mapping>…

</web-app>

To disable page logging, convert the section to a comment or delete it.
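As a sketch, disabling page logging by converting the section to a comment could look like this. The element names and the filter class are the ones shown above; only the comment markers are added:

```xml
<web-app>
  <!-- Other web.xml elements omitted. -->

  <!-- Page logging is disabled while this block stays commented out.
  <filter>
    <filter-name>PageLog</filter-name>
    <filter-class>com.scansoft.osd.servlet.FilterPageLog</filter-class>
  </filter>
  <filter-mapping>
    <filter-name>PageLog</filter-name>
    <servlet-name>osdservlet</servlet-name>
  </filter-mapping>
  -->
</web-app>
```

Keeping the block as a comment, rather than deleting it, makes it easy to re-enable page logging for a later debugging session.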

Application logging

Application logging writes information about the dialog flow as it occurs during each session: you write messages with the <log> element, and your subsequent analysis of the logs reveals the performance of the application, its overall success, and the success of its discrete parts.

Although OSD automatically writes some application logs, application developers provide the majority of logs for the OSD applications and components they write. By using <log> in the appropriate locations of your xHMI configuration, you control when the application writes messages and the content of those messages (for example, you might indicate the success or failure of a transaction when leaving a dialog).



Log message formats

You can write any text in a log message. For example:

<log>This message is simple text. </log>

However, it is better to write event-based messages so that the logs can merge with other Nuance speech products in a complementary fashion. Each speech product has a different role during sessions, and their logs describe different aspects of runtime events. During a session, the products write logs to different directories and machines. The files are complementary because you can assemble all the logs for analysis by a single tuning tool.

Here is the format of an event-based message:

<log>EVNT=event-name|parameter-name <value>=value

</log>

You can use any number of <log> elements and parameter/value pairs. The field EVNT is required; it classifies the event.

Log message values can be scripts or constants. Use ECMAScript to provide dynamic values that are only available at runtime. For example, a script can access the SessionFrame:

<log>EVNT=OSDInfo|

INFO=pizzatopping: <value expr="pizzatopping.best()"/></log>

In this example, the expression writes the name of the pizza topping collected from the caller. The name is the first hypothesis in the recognition result.
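As a hedged illustration, the following message combines several parameter/value pairs with a dynamic value. It assumes that additional pairs are separated by the same | character that follows the event name. The SWItrxe event and the NAME, RSLT, and INFO parameters are taken from the tables later in this chapter; the transaction name orderPizza is hypothetical:

```xml
<!-- One event-based message with three parameters; the last value is
     computed at runtime from the recognition result. -->
<log>EVNT=SWItrxe|NAME=orderPizza|RSLT=SUCC|INFO=topping: <value expr="pizzatopping.best()"/></log>
```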

Scoping of log messages

You can write logs in these locations in the xHMI configuration file:

■ <dialog>

■ <node>

■ <transition>

■ <catch>

■ <final>

The scope where you insert a <log> element is important. For example, if you log an event at the <dialog> level, the system writes the message (once) when invoking any node inside that dialog. Alternatively, if you log an event inside a <node>, the event is logged only when invoking that node.
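The difference between the two scopes can be sketched as follows. The dialog and node ids reuse names from earlier examples in this guide, and the message texts are illustrative assumptions:

```xml
<dialog id="Main" root="intro">
  <!-- Dialog scope: written when any node inside Main is invoked. -->
  <log>EVNT=OSDInfo|INFO=Dialog Main active</log>

  <node id="intro" class="Output">
    <!-- Node scope: written only when intro is invoked. -->
    <log>EVNT=OSDInfo|INFO=Node intro entered.</log>
    <!-- <config> and <transition> omitted for brevity. -->
  </node>

  <node id="collect" class="Collection">
    <!-- Node scope: written only when collect is invoked. -->
    <log>EVNT=OSDInfo|INFO=Node collect entered.</log>
    <!-- <config> and <transition> omitted for brevity. -->
  </node>
</dialog>
```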

Nesting log events

Some events indicate the start or end of a situation. For example, a transaction starts, processing occurs, and the transaction ends. During the processing, you can nest additional log events inside the started event. Conceptually, nested events work like this:

dialog transaction [startA]

    node transaction [startB]
    node transaction [endB]

    node transaction [startC]
    node transaction [endC]

dialog transaction [endA]

Log events, parameters, and values

Below are the log events used by the OSI tuning tool. By using these events and their associated parameters, you ensure that OSI can analyze and report on application performance in a standard way:

EVNT Purpose

OSDError Logs an error situation.

OSDInfo Logs any general information.

SWIcllr Categorizes the caller into a user population.

SWIdbrx Ends a database transaction.

SWIdbtx Starts a database transaction.

SWItrfr Starts a transfer.

SWItrxb Starts a transaction.

SWItrxe Ends a transaction.



Each log event accepts one or more parameters:

Event       Allowed parameters
OSDError    INFO
OSDInfo     INFO
SWIcllr     CLID, GRP1, GRP2, GRP3, GRP4
SWIdbrx     INFO, NAME, RSLT
SWIdbtx     NAME, SERV
SWItrfr     INFO, NAME, RESN
SWItrxb     NAME
SWItrxe     INFO, NAME, RESN, RSLT

Here are definitions of the parameters:

■ CLID–a caller ID, for example, the telephone number of the caller.

■ GRP1, GRP2, GRP3, GRP4–a category for the caller.

■ INFO–any additional information about the event.

■ NAME–the name of the event.

■ SERV–the name of a database server.

■ RESN–the reason for the RSLT.

■ RSLT–the result of the event. The values are:

    ■ FAIL indicates a failed event, for example, one in which not all the required information was collected from the caller due to a recognition or user interface problem.

    ■ SUCC indicates a successful situation, for example, where all the required information was collected from the caller.

    ■ UNKN indicates an unknown situation. For example, the caller disconnected or inexplicably requested a transfer to a human agent.
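The events and their allowed parameters can be cross-checked before a message is written. The following Java class is a minimal sketch under stated assumptions: it is not an OSD API, the class and method names are hypothetical, and it treats event and parameter names as case-sensitive, which this guide does not state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative validator for messages of the form EVNT=...|KEY=VALUE|... */
final class LogMessageCheck {

    private static final Map<String, Set<String>> ALLOWED = new HashMap<>();

    static {
        // Copied from the table of events and allowed parameters.
        allow("OSDError", "INFO");
        allow("OSDInfo", "INFO");
        allow("SWIcllr", "CLID", "GRP1", "GRP2", "GRP3", "GRP4");
        allow("SWIdbrx", "INFO", "NAME", "RSLT");
        allow("SWIdbtx", "NAME", "SERV");
        allow("SWItrfr", "INFO", "NAME", "RESN");
        allow("SWItrxb", "NAME");
        allow("SWItrxe", "INFO", "NAME", "RESN", "RSLT");
    }

    private static void allow(String event, String... params) {
        ALLOWED.put(event, new HashSet<>(Arrays.asList(params)));
    }

    /** Splits a message into its KEY=VALUE fields, trimming surrounding spaces. */
    static Map<String, String> parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : message.split("\\|")) {
            String[] kv = part.split("=", 2);
            if (kv.length == 2) {
                fields.put(kv[0].trim(), kv[1].trim());
            }
        }
        return fields;
    }

    /** True if the event is known and every other field is allowed for it. */
    static boolean isValid(String message) {
        Map<String, String> fields = parse(message);
        Set<String> allowed = ALLOWED.get(fields.get("EVNT"));
        if (allowed == null) {
            return false; // missing or unknown EVNT
        }
        for (String key : fields.keySet()) {
            if (!key.equals("EVNT") && !allowed.contains(key)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, a SWItrxb message that carries RSLT is rejected, because RSLT belongs only on SWItrxe and SWIdbrx.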



Generic log events

Use the OSDInfo event for any generic message, for example at the beginning and end of dialogs and nodes, and after successful transfers.

This example shows the beginning of a dialog:

<dialog id="Main" root="intro"><log>EVNT=OSDInfo | INFO=Gathering user input

</log>

This example shows the beginning of a node:

<node id="pizzaSize" class="Collection">
<log>EVNT=OSDInfo|INFO=Node pizzaSize entered.</log>

Use the OSDError event for any generic error message. For example:

<log>EVNT=OSDError | INFO=An error occurred, going to last anchor.

</log>

OSDInfo and OSDError are also useful for certain transfer situations. See Transfers.

Transaction events

One use for log messages is to signal the start and end of application transactions such as the accessing of records in a database or the execution of a group of nodes.

A transaction consists of one or more collections from a caller, and can also include database interactions. For example, applications that identify users might define a transaction to include these parts:

■ Collect caller ID
■ Collect password
■ Validate password (database interaction)

To measure the success of an application, your logs must report each transaction. For example, in a banking application users identify themselves, view account balances, transfer currency among accounts, and make payments. The application associates transactions with each of these tasks and tracks them using the <log> element.

Because transactions begin and end, their associated log messages must also begin and end. You can start and end transactions anywhere in your xHMI configuration, but in general a <dialog> starts a transaction and a <node> starts a task (or a sub-transaction) within a transaction. Here are recommendations:

■ Write start and end events for each <dialog>.
■ Write start and end events for each <node> in the <dialog>.

Use SWItrxb at the start of a transaction. For example:

<node id="pizzaSize" class="Collection"><log>EVNT=SWItrxb|NAME=get_size_from_user</log>

Use SWItrxe in the transitions at the end of a transaction. The NAME must match the name of a previous SWItrxb event:

<log>EVNT=SWItrxe|NAME=get_size_from_user|RSLT=SUCC|INFO=prompts queued

</log></target>


Node transitions

A node transition is always the result of gathering new information (for example, from a recognizer or a database). Depending on the status of the dialog (resulting from the new information applied to its attributes), the transition chooses the next target. For transitions that are transfers, see Transfers on page 67.

Generally, a start transaction has already been logged, and the transition needs an end transaction event. Because transitions have the characteristics of an if-then-else decision, you need to log information for each possible outcome, including messages for:

■ Each <target> in the node’s <transition>
■ Each <target> in the dialog’s <transition>
■ Each <target> in a global <transition>
■ Each <target> in a thrown event

Here’s an example for successful outcomes:

When the transaction for the node is successful (target condition=true), configure this log in the <transition> element:

<log>EVNT=SWItrxe | NAME=get_pizzasize |RSLT=SUCC

</log>

When the transaction is a failure, configure the log message in the <final> element. See Final transitions.

Catch handlers

Catch handlers should log the thrown events. The <catch> element enables the application to react to specific situations. Your logs can provide details and frequencies of those situations. Using <log> in a global catch handler (the <catch> inside the <xhmi> element) defines global log messages.

OSD processes catch handlers before node transitions. When the system throws an event, and the <catch> has a true transition, the system invokes the target and never visits the node’s transition. This behavior ensures catching events immediately when they occur.

Final transitions

Use <log> inside <final> to report final status before the application exits, but remember that applications do not visit <final> at the end of every session. For example, events such as failing transactions are good candidates for final logging. The logging is done in these cases:

■ OSD calls the node’s <final> element when leaving the node without using a node transition. This occurs when there is an exception during execution, and when using a <transition> or <catch> at the dialog or global scope.

■ OSD calls the dialog’s <final> element when leaving the dialog without using a dialog or node target. This occurs when there is an exception during execution, and when using a global <transition> or <catch>.

Here’s an example:

<final><log>EVNT=SWItrxe |NAME=get_pizzasize |RSLT=FAIL|RESN=targetmissing |INFO=Node left without a target used. User data not complete.

</log>

Database interactions

You can log database interactions as individual transactions or nest them in other transactions. Because it is logged as a transaction, a database transaction has a start and end event log.

Use SWIdbtx at the beginning of a database transaction. For example:

<node id="Your_databaseAccess_node_name" class="com.YourName.nodes.Database">

<log>EVNT=SWIdbtx|NAME=go_fetch_my_data</log>

Use SWIdbrx at the end of a database transaction. The NAME must match the name of a previous SWIdbtx event:



<log>EVNT=SWIdbrx |NAME=go_fetch_my_data |RSLT=SUCC

</log></target>


Caller segmentation

You can use logs to group user populations into categories. For example, you can identify particular aspects of a call (its id number) and the callers (what they say and what they want). You can categorize types of callers, types of products, or the parts of a product a caller wants.

To accomplish this, log the SWIcllr event in a <dialog> whenever a collection reveals new information that characterizes the caller. Log caller information as soon as it becomes available, and log it again if it changes. If you write the same information more than once, OSI uses the latest value logged. You do not need to repeat previously logged data that has not changed.

This example identifies the user with the session ID from the servlet container, which is stored in a predefined OSD property. The message categorizes the user by the size of pizza ordered and their desired topping:

<log>EVNT=SWIcllr|CLID=s.best('_sessionId')|GRP1=s.best('pizzasize')|GRP2=s.best('pizzatopping')

</log>

Below is a more detailed example. GRP1 should be filled with the pizza size and GRP2 should be filled with the pizza topping requested by the caller.

1 In the beginning of the application the first information logged is the call ID stored in the SessionFrame:

<log>EVNT=SWIcllr|CLID=${s.best("__callId")}</log>

In the log file, this might appear as:

EVNT=SWIcllr|CLID=1234

2 Then the application asks the caller about the pizza size. Assume the caller answers “large”. Below, the next SWIcllr event fills GRP1 with the answer:

<log>EVNT=SWIcllr|GRP1=s.best('pizzasize')</log>

In the log file, this might appear as follows. Note that the CLID remains known to the system:

EVNT=SWIcllr|CLID=1234|GRP1=large

3 Next, the application asks the caller about the topping. Assume the caller answers “ham.” Below, the next SWIcllr event fills GRP2 with the answer:

<log>EVNT=SWIcllr|GRP2=s.best('pizzatopping')</log>

In the log file, this might appear as follows. Again, the previously logged values remain known:

EVNT=SWIcllr|CLID=1234|GRP1=large|GRP2=ham
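The accumulation shown in the three steps above (earlier values stay known, and the latest value for a parameter wins) can be sketched as follows. This Java class is illustrative only: it is not how OSD or OSI is implemented, its names are hypothetical, and it assumes that EVNT is the first field of each message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative accumulator for SWIcllr events of one session. */
final class CallerProfile {

    // Keeps parameters in the order they were first logged.
    private final Map<String, String> values = new LinkedHashMap<>();

    /** Merges one message into the profile; messages other than SWIcllr are ignored. */
    void apply(String message) {
        String[] parts = message.split("\\|");
        if (!parts[0].trim().equals("EVNT=SWIcllr")) {
            return;
        }
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            if (kv.length == 2) {
                values.put(kv[0].trim(), kv[1].trim()); // a later value replaces an earlier one
            }
        }
    }

    /** Renders the accumulated state in the form used in the log file examples. */
    String summary() {
        StringBuilder out = new StringBuilder("EVNT=SWIcllr");
        for (Map.Entry<String, String> entry : values.entrySet()) {
            out.append('|').append(entry.getKey()).append('=').append(entry.getValue());
        }
        return out.toString();
    }
}
```

Applying the three messages from the example in order yields the last log line shown above; logging GRP1 again with a different value would replace "large" while leaving CLID and GRP2 untouched.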

Transfers

Logging transfers is another type of transaction logging. Because a speech application should result in a minimum of transfers, logging them provides critical information for tuning an application.

OSD makes three types of transfers:

■ Blind transfer
■ Silent bridge transfer
■ Bridge transfer

It is very important to log the SWItrfr event when the application transfers the caller to another application or to a human operator. This is the last opportunity to add session information, and it enables analysis of how sessions are ending.

You can write transfer log messages in either of these locations:

■ In the node from which the caller is transferred
■ In the Transfer node when the transfer is happening

Use SWItrfr to log the critical part, that is, the actual transfer, inside the <transition> element. Example for blind transfers:

<log>EVNT=SWItrfr |NAME=blind|RESN=caller demanded a blind transfer|INFO=Transferring caller blindly

</log>



Example for silent bridge transfers:

<log>EVNT=SWItrfr |NAME=silen|RESN=caller demanded a silent bridge transfer|INFO=Transferring caller to %{_transferDestination}

</log>

Example for bridge transfers:

<log>EVNT=SWItrfr|NAME=bridge|RESN=caller demanded an interruptible bridge transfer|INFO=Transferring caller to %{_transferDestination}

</log>

Use OSDInfo and OSDError to log transfer events other than the actual transfer. For example:

<transition><target condexpr="s.best('typeOfTransfer')=='hangup'" path="#nodeBye"><log>EVNT=OSDInfo|INFO=Caller does not want a transfer.

Transferring caller to %{_transferDestination}</log>

</target></transition>

Turning application logging on and off

OSD provides a logging web service as a web archive file (osd-osilogger.war). To use the service, update the configuration of its web.xml file and deploy it on an application server. The deployment is the same as any other web application.

OSD uses an EventLogger class to write log messages. The logging is done on the server-side, a feature that helps to centralize the location of logged data while minimizing load on client browsers.

By default, logging is turned off. You turn it on by setting the OSILogServer property to true in the application’s xHMI configuration file.

To configure the logger, use the INF/classes/com/nuance/log/config.properties file:

# FILTER CONFIGURATION
#
# filter=XYZ The filter selection, XYZ, needs to be defined
# using filter.XYZ.class=...
#
# PLEASE NOTE: use ONE filter only
#
# Configure each filter XYZ as follows:
#
# [REQUIRED] filter.XYZ.class
# The implementation of com.nuance.log.IFilter
#
# [OPTIONAL] filter.XYZ.logdir
# The directory to write log files. The default is: logs/
#
# [OPTIONAL] filter.XYZ.filepattern
# The pattern for generating the logfile. The default pattern
# uses the sessionid in the file name: sid_%{SESSIONID}.log
#
filter=OSI_SIMPLIFIED
filter.OSI_SIMPLIFIED.class=com.nuance.log.OSISimplifiedFilter
filter.OSI_SIMPLIFIED.logdir=log
filter.OSI_SIMPLIFIED.logmerge=classic
filter.OSI_SIMPLIFIED.maxBackupIndex=100
filter.OSI_SIMPLIFIED.bufferSize=52428800
# Here is the default filepattern
# filter.OSI_SIMPLIFIED.filepattern=coreEvents.log
# Here is the filepattern used by OSD
filter.OSI_SIMPLIFIED.filepattern=http://host:8080/osd-osilogger/log
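To verify which settings are in effect for the selected filter, the file can be read with the standard java.util.Properties class. This is a minimal sketch with hypothetical class and method names; it only illustrates the filter=XYZ and filter.XYZ.* naming scheme from the listing above and makes no claim about how OSD itself reads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Illustrative reader for the filter settings in config.properties. */
final class LoggerConfig {

    /** Parses properties text; lines starting with '#' are comments. */
    static Properties load(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return config;
    }

    /** Returns filter.<selected filter>.<setting>, or the fallback if it is not set. */
    static String setting(Properties config, String setting, String fallback) {
        String selected = config.getProperty("filter");
        if (selected == null) {
            return fallback; // no filter selected
        }
        return config.getProperty("filter." + selected.trim() + "." + setting, fallback);
    }
}
```

For example, setting(config, "logdir", "logs/") returns "log" for the listing above and falls back to the documented default "logs/" when the property is absent.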



Chapter 8

OSD administration

Operators can configure OSD applications as web applications in various ways:

■ You can copy and edit the global.prop file that resides in the WEB-INF directory of any OSD sample application. The file contains properties that the application can reference in the xHMI configuration file (see the xHMI Reference Guide for details on properties).

■ You can edit the web.xml to add certain services as shown in this chapter.

■ You can dynamically configure each session of an OSD application by including specific URL parameters in the first request to the application.

Deployment to a web server

It is simple to deploy an OSD application to a web application server. The support for this task depends heavily on which web application server you use. For example, the Tomcat server uses a web-based configuration manager that allows several options:

■ Uploading a war file for installation.
■ Copying the war file into the webapps directory and restarting the server.
■ Automatically unpacking the war file without restarting the server.

If your web application does not use a war file, you can copy the whole directory structure to the web application server for deployment.

See your web application server documentation for information about deployment.



Providing an XML parser

You must provide an XML parser in your runtime environment (the application server). The parser must be compatible with JAXP 1.3.

OSD installs the Apache Xerces parser for this purpose in the java/lib/ext directory. To use this parser, copy the libraries (xercesImpl.jar and xml-apis.jar) to the appropriate location for your server. For Tomcat, the location is the common/endorsed folder of the Tomcat installation. After you copy the libraries, you might need to restart the server. Then, the OSD samples and all your OSD web applications will use the parser.

Starting a session

The first request of a client to an OSD application creates a session on the server. You can configure the session using URL parameters appended to the request, but you must do this in the initial request and not in subsequent requests.

OSD accepts the following parameters in the first request:

■ callId
■ callerId
■ calledId
■ session ID (you can define this parameter as described below)

For example, you could send this start request to the sample pizza application:

http://server:port/pizza/osd?xhmiCallerId=123&xhmiCalledId=456&xhmiCallId=789

OSD initializes the session with the parameter values, and stores them in predefined attributes named __callerId, __calledId and __callId. You can access the attributes in the same way as any variable, for example: __callerId.best().

The rendering system uses these properties when rendering markup for your browser. If you replace the OSD rendering system, you can access the properties as _xhmiCallerId, _xhmiCalledId and _xhmiCallId.

To define a session ID, add a global property to the xHMI configuration file:

<property name="_aliasUrlParamUserSessionId" value="SID"/>

Then, the request to the OSD application can contain a SID parameter:

http://server:port/pizza/osd?xhmiCallerId=123&xhmiCalledId=456&xhmiCallId=789&SID=abc
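A client that builds the start request programmatically could assemble the URL as in the following sketch. The class and method names are hypothetical and not part of OSD; the parameter names come from the examples above, and SID applies only if _aliasUrlParamUserSessionId is set to that value. Values are URL-encoded with the standard java.net.URLEncoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Illustrative builder for the first request to an OSD application. */
final class StartRequest {

    /** Appends the parameters to baseUrl as a query string, in map iteration order. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Because OSD accepts these parameters only in the initial request, a builder like this would be used once per session; passing a LinkedHashMap keeps the parameters in a predictable order.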



The session ID becomes available to xHMI and the rendering system as follows:

xHMI attributes       __userSessionIdName and __userSessionIdValue

rendering property    _xhmiUserSessionId (This corresponds to __userSessionIdValue. The name is contained in the property _aliasUrlParamUserSessionId that is set in the xHMI configuration.)

You can rename the callId, callerId, and calledId parameters by adding these global properties to the xHMI configuration file:

<property name="_aliasUrlParamCallerId" value="myCallerId"/>
<property name="_aliasUrlParamCalledId" value="myCalledId"/>
<property name="_aliasUrlParamCallId" value="myCallId"/>

With the redefined parameter names, the example request URL looks like the following. (This URL assumes that the session ID parameter is also renamed, to mySID.)

http://server:port/pizza/osd?myCallerId=123&myCalledId=456&myCallId=789&mySID=abc

Operation administration & management (OA&M)

OSD provides an Operations, Administration & Management interface (OA&M) to control installed OSD applications. To use the interface, your servlet container must be JMX compliant. Some web application servers automatically include a JMX configuration (JBoss, for example), while others require manual setup (Tomcat 5, for example).

This documentation describes the JMX installation for Tomcat, with general comments for other web application servers. To use a different server, see the server vendor’s documentation.

To use a management framework that is not JMX compliant, you must implement the framework in a manner similar to the provided OA&M interface. See the OSD Integration Guide for information about integrating different management frameworks.

Using JMX in Tomcat

Using JMX in Tomcat differs slightly from other web application servers. Tomcat does not provide a pre-configured JMX system. You must install a compatible JMX version and implementation for your Tomcat server.

We assume Tomcat 5.0.x with an implementation of the JMX API v1.2 and the JMX Remote API v1.0.



Configuring the JMX connector

OSD provides a JMX configurator that you must configure in the web.xml. Doing this allows JMX Management Consoles to connect to the web application servers. Different web application servers do this differently.

Configuration requirements:

■ Set JMX parameters in web.xml
■ Load the configurator class

For Tomcat, set the servlet-listener to add the OA&M MBean. You must do this even if other web applications already use the JMX configurator and the web application server is already configured for it. The JMX configurator then detects the new configuration and only adds an MBean.

Set JMX parameters in web.xml

Add the following configuration to the same web.xml that contains the JMXConfigurator listener (described in Load the configurator class on page 75).

This example shows default values. If you do not need to change values, you can omit them from the configuration:

<context-param>
  <param-name>jmxProtocol</param-name>
  <param-value>jmxmp</param-value>
</context-param>

<context-param>
  <param-name>jmxHost</param-name>
  <param-value>localhost</param-value>
</context-param>

<context-param>
  <param-name>jmxPort</param-name>
  <param-value>1099</param-value>
</context-param>

<context-param>
  <param-name>jmxUrlPath</param-name>
  <param-value></param-value>
</context-param>

<context-param>
  <param-name>mbeanDomain</param-name>
  <param-value>OSD</param-value>
</context-param>



The protocol, host, port, and URL path parameters form the service url:

service:jmx:jmxmp://localhost:1099

The mbeanDomain parameter names the MBean in the ObjectName:

OpenSpeech Dialog:MBean=Instrumentation

For details, see the javadoc for the JMXServiceURL and ObjectName classes.
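The way the parameters combine can be reproduced with the standard JMXServiceURL and ObjectName classes. The following Java sketch uses hypothetical class and method names and only builds the two addresses; it does not connect. It assumes that the MBean name has the form domain:MBean=Instrumentation, as in the example above, with the mbeanDomain value as the domain. Actually connecting over the jmxmp protocol typically requires an additional JMX connector library on the class path.

```java
import java.net.MalformedURLException;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXServiceURL;

/** Illustrative helper that derives JMX addresses from the web.xml parameters. */
final class JmxAddress {

    /** Combines jmxProtocol, jmxHost, jmxPort, and jmxUrlPath into a service URL. */
    static JMXServiceURL serviceUrl(String protocol, String host, int port, String urlPath) {
        try {
            return new JMXServiceURL(protocol, host, port, urlPath);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Builds the name of the Instrumentation MBean under the given mbeanDomain. */
    static ObjectName instrumentation(String mbeanDomain) {
        try {
            return new ObjectName(mbeanDomain + ":MBean=Instrumentation");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

With the default values, serviceUrl("jmxmp", "localhost", 1099, "") produces the service URL shown above, which is the address a management console would be pointed at.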

Load the configurator class

To enable JMX in Tomcat, insert the following lines into the web.xml:

<listener>
  <listener-class>com.scansoft.osd.oam.jmx.JMXConfigurator</listener-class>
</listener>

This configuration loads the JMXConfigurator class as a web application context listener, and enables the JMX server components.

Note: All OSD web applications share this setting. Any web application that uses this configurator can overwrite settings from previously loaded web applications. We strongly recommend using a single web application to set up JMX.

Managing configuration

After you complete the JMX configuration, an MBean named Instrumentation becomes available to JMX Management consoles (under the mbeanDomain, configured as “OSD” by default). You can use any JMX management console.

Balancing system loads

When you deploy OSD applications as web applications, you can use a resource management service to balance loads on CPU and memory. OSD does not provide a load balancer, but you can use third-party routers and load balancers.

Requirements:

■ The load balancer must direct each OSD session to a single server. To accomplish this, the service uses the JSESSIONID cookie to balance loads. The cookie is described in the Java Servlets Specification.

■ The VoiceXML browser platform must support and use this cookie for each call when communicating with application servers running OSD applications.



Controlling shutdown and update operations

System operators need to update OSD applications periodically without interrupting service to application users. We recommend that operators use the routing servlet when performing these updates.

You can also use the methods described here to support an alternative update process. OSD provides management operations to stop applications from accepting new sessions (graceful shutdown) and to force the end of sessions that exceed a reasonable duration. OSD does not provide complete program logic for application updates; this must be implemented (integrated), probably as a script running on your management console.

You can also implement graceful shutdown by blocking new sessions, eventually having zero instances of an application running. Once this status is reached, you can safely update and restart the application without losing sessions.

Another mechanism is to kill an application, which immediately terminates all running sessions. This technique should be a last resort, and will result in an error when using any method of the IDialogManagerInvocation interface.

The IOAM interface

The following table shows the management operations accessible through the class IOAM and exposed by the JMX MBean server. For details on the methods, see the OSD Integration Guide and javadoc.

Method Description

acceptNewSessions Cancels a previous call to blockNewSessions or kill.

blockNewSessions Prevents new sessions.

getMaxActiveSessions Gets the maximum number of sessions allowed.

getNumSessions Gets the current number of sessions.

getProperties Gets all context properties for an application.

getStatistics Gets statistical information about an application.

kill Aborts all sessions immediately.

listApplications Lists the names of all running applications.

sendNotification Sends a message to the OAM management framework.

setMaxActiveSessions Sets the maximum number of sessions allowed.
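The graceful shutdown sequence described above (block new sessions, wait until no sessions remain, and kill only as a last resort) could be scripted along the following lines. This is a sketch under stated assumptions: SessionControl is a hypothetical stand-in that borrows three method names from the table, and the real IOAM signatures (documented in the OSD Integration Guide and javadoc) may take arguments such as an application name.

```java
/** Hypothetical stand-in for the management operations; not the real IOAM interface. */
interface SessionControl {
    void blockNewSessions();
    int getNumSessions();
    void kill();
}

/** Illustrative drain-then-update helper. */
final class GracefulShutdown {

    /**
     * Blocks new sessions, then polls until no sessions remain or maxPolls is used up.
     * Returns true if the application drained on its own, false if kill() was needed.
     */
    static boolean drain(SessionControl app, int maxPolls, long pollMillis) {
        app.blockNewSessions();
        for (int i = 0; i < maxPolls; i++) {
            if (app.getNumSessions() == 0) {
                return true; // safe to update and restart
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop waiting; fall through to the final check
            }
        }
        if (app.getNumSessions() == 0) {
            return true;
        }
        app.kill(); // last resort: running sessions are terminated
        return false;
    }
}
```

After the update, a call corresponding to acceptNewSessions would put the application back into service. The polling limits are the operator's choice and define what counts as a reasonable session duration.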



OA&M event notifications

Operational event notifications are messages sent to a management framework. This is different from event logging, which records all events, because only a subset of events is usually sent (a strategy to avoid information overflow at the console).

OSD automatically sends notifications in case of exceptional situations. In addition, your OSD applications can send notification events. Each event contains a type, message, and userdata. If you need more fields (for example, sessionid and tenantid), your implementation must add them when writing notification events.

A notification type has the following predefined hierarchy:

error.application
error.system
warn.application
warn.system
info.application
info.system

Your events must conform to this hierarchy. The OSD framework sends error.system, warn.system, and info.system. Your applications can use an API call to send the others. You can also define subtrees of events under error.application, warn.application, and info.application.

The message is a free format string to contain any information to be displayed at the management console.

The userdata field contains key-value pairs delimited by semicolons. For example:

sessionid=42;application=pizza;callerId=0815;calledId=4711;…

To send OA&M application events, use the sendNotification method:

sendNotification(String type, String message, String userData)
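Before calling sendNotification, an application might check the type and assemble the userdata string with a small helper such as the following. This is a sketch with hypothetical names, not an OSD API. It assumes that subtree names are appended with a dot (for example, warn.application.billing) and that keys and values contain no semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper for notification types and userdata strings. */
final class NotificationData {

    /** True if type is error.application, warn.application, info.application, or a subtree. */
    static boolean isApplicationType(String type) {
        for (String root : new String[] {"error.application", "warn.application", "info.application"}) {
            if (type.equals(root) || type.startsWith(root + ".")) {
                return true;
            }
        }
        return false;
    }

    /** Joins key-value pairs into the semicolon-delimited userdata form. */
    static String format(Map<String, String> pairs) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> pair : pairs.entrySet()) {
            if (out.length() > 0) {
                out.append(';');
            }
            out.append(pair.getKey()).append('=').append(pair.getValue());
        }
        return out.toString();
    }

    /** Splits a userdata string back into its pairs, keeping their order. */
    static Map<String, String> parse(String userData) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String item : userData.split(";")) {
            String[] kv = item.split("=", 2);
            if (kv.length == 2) {
                pairs.put(kv[0].trim(), kv[1].trim());
            }
        }
        return pairs;
    }
}
```

The check rejects the system types, which are reserved for the OSD framework, so an application cannot send them by mistake.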

OSD adds a timestamp and a prefix of predefined key-value pairs to the userdata. The pairs are taken from the create method of the DialogManager Invocation Interface:

Key          Value
Sessionid    Usually the ID generated by the application server (jsessionid)
appname      The application name
CallerId     The caller’s telephone number (if available)
CalledId     The called number (if available)
Called       A unique identifier of the call (if available)

Default error messages

The messages.dtd file contains keys for the predefined error messages. OSD uses the keys to find message text in messages.xml. Here is a list of keys:

Message key                          Message parameters
CONFIGURATION_ERROR                  none
VUIFORWARDMAP_UNDEFINED              {0} name of node
TRANSITION_FAILED                    none
NO_NEW_SESSIONS                      none
SHUTTING_DOWN                        none
CONTEXT_NOT_REGISTERED               {0} name of application
NODE_EXECUTION_FAILED                {0} name of node
PROCESSING_ERROR                     {0} information about failed processing
PROTOCOL_ERROR                       none
PROTOCOL_ERROR_DIALOG_NOT_STARTED    none
PROTOCOL_ERROR_ONLY_HANGUP           none
PROTOCOL_ERROR_SMEX                  none
BROWSER_ERROR_EVENT                  {0} event message

Each message describes a single error. OSD fills the {0} construct at runtime. For a description of how this works, see the javadoc for the ErrorMessages class.

Extending error messages

You can define new error messages and use them in any custom nodes you create. (OSD does not use the messages in the predefined nodes.) The DialogManager API has a method for sending the custom error messages from within your Java code.



Extending the error messages is simple:

1 Copy messages.dtd, messages_custom.dtd, and messages.xml from their installation location (in the OSD system folder) to a temporary location for editing. Do not change the originally installed versions of these files.

2 Edit the messages_custom.dtd file to define the new messages. (OSD automatically appends the file to messages.dtd.)

3 Edit the messages.xml file to write the text of the messages.

4 Copy the dtd files to the WEB-INF/dtd folder of every web application that will use the custom error messages.

Example

Here is a fragment of messages.dtd. It shows how the custom dtd is appended:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT messages ANY>

<!ELEMENT CONFIGURATION_ERROR (#PCDATA)>
<!ELEMENT VUIFORWARDMAP_UNDEFINED (#PCDATA)>

<!ENTITY % messages_custom.dtd SYSTEM "messages_custom.dtd">
%messages_custom.dtd;

Here is an example of messages_custom.dtd. It shows the key for a new message called HELLO_WORLD:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT HELLO_WORLD (#PCDATA)>

Here is an example of messages.xml. It shows the text added for the new message:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE messages SYSTEM "dtd/messages.dtd">
<messages>
  <CONFIGURATION_ERROR>Configuration error. Check the appconfig.xhmi for errors.
  </CONFIGURATION_ERROR>

  <VUIFORWARDMAP_UNDEFINED>VUIForwardMap undefined in node {0}. Check VUI forward maps in the appconfig.xhmi
  </VUIFORWARDMAP_UNDEFINED>

  <HELLO_WORLD>Hello World Message</HELLO_WORLD>
</messages>

Your application can return the new message from the ErrorMessages class, which is available through the DialogNode class (see the javadoc).

Localizing OA&M notifications

You can localize the default OA&M notification messages to any language by creating xml files containing your definitions. OSD automatically uses your localized message definitions when you place the files in the WEB-INF folder of the web application (see Create a directory structure on page 138).

You can only localize the OA&M error messages defined by the DialogManager and its components. You cannot use this mechanism to localize values (such as prompts) in the xHMI configuration.

To localize messages, use the following templates for the xml filenames. You can substitute other languages using the same pattern:

messages.xml          Messages in the default language.
messages_en.xml       Messages in English.
messages_en_US.xml    Messages in US English.

When you include the full or partial language code, you can provide any number of files with different language codes. (For example, you could have Spanish files messages_es.xml, messages_es_ES.xml, and messages_es_CO.xml.) At runtime the system chooses messages in this order:

1 From the given locale

2 From the default locale

3 From the messages.xml

4 From the internal defaults

OSD sends messages from these definitions to the OA&M client in notification messages. Using the localized messages.xml and messages.dtd files, you can overwrite the default messages defined in the ErrorMessages class. Because the default notification messages have built-in defaults, no messages.xml file is needed. OSD provides the dtd in its system folder. You must copy the dtd to every web application that uses custom error messages. (See Extending error messages on page 78.)
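The four-step lookup order for localized messages amounts to a fallback chain. The following Java sketch illustrates the idea with plain maps; it is not the ErrorMessages implementation, the names are hypothetical, and loading the xml files into the maps is left out.

```java
import java.util.List;
import java.util.Map;

/** Illustrative fallback lookup for localized notification messages. */
final class MessageLookup {

    /**
     * Returns the first text found for key. The sources are searched in list order:
     * given locale, default locale, then messages.xml. If none defines the key,
     * the built-in default is returned.
     */
    static String find(String key, List<Map<String, String>> sources, String internalDefault) {
        for (Map<String, String> source : sources) {
            String text = source.get(key);
            if (text != null) {
                return text;
            }
        }
        return internalDefault;
    }
}
```

With this ordering, a key that is defined only in the default-locale file is still found when the given locale's file lacks it, and a key defined nowhere falls back to the built-in text.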

Routing calls to the application

This section describes using the routing servlet to route incoming requests to OSD applications.

Background concepts

OSD provides a routing servlet that accepts and forwards requests from voice browsers to OSD web applications. The routing servlet offers an easy way to update web applications and to perform management operations for application installation, update, and removal.

The routing servlet lets you map requests to specific web applications using telephone numbers or application names. Importantly, the servlet lets you remove applications from service so they can be updated without interrupting sessions.

To remotely manage the routing, the management framework provides a set of management operations. OSD installs an example implementation of a routing application (which uses JMX) in the routing.war that is installed in the samples folder. For additional discussion of the routing servlet, see the OSD Integration Guide.

Registering OSD applications for routing

There are no special steps to register an OSD application for routing. At startup, every OSD application registers itself to the OSD framework and is therefore available for routing. The framework knows which context of an OSD application is the latest (and stores this knowledge in the oamDataDirectory). If a server restarts, the applications are configured to use the latest context when receiving start requests.

Configuring the routing servlet

There are two ways to configure the routing servlet:

■ Configure static or dynamic routing.
■ Configure the routing servlet dynamically at runtime.

Static routing is configured in the web.xml of the OSD application.

Setting the persistent storage directory

The servlet needs a persistent storage location; you must configure this location in the web application deployment descriptor (web.xml). Do this in only one web.xml, and the setting will be used by every OSD web application. To configure this setting, add the following context parameter to the web.xml:

<context-param>
  <param-name>oamDataDirectory</param-name>
  <param-value>C:\OSDSavedData</param-value>
</context-param>

Setting the initial routing table

The web.xml also configures the initial entry set of the routing table. Use this example as a model to specify the set:

<context-param>
  <param-name>routingTableEntry001</param-name>
  <param-value>123,HelloWorld</param-value>
</context-param>

<context-param>
  <param-name>routingTableEntry017</param-name>
  <param-value>456,HelloWeb</param-value>
</context-param>

Each entry is named routingTableEntryXYZ where XYZ is a unique string or number. Every parameter name that starts with “routingTableEntry” is an entry for the routing table. In the example, the phone number 123 maps to the HelloWorld application, and 456 maps to HelloWeb. In all entries, the application names must match the <display-name> entry in the web.xml of the web applications.
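The naming rule for routing table entries can be illustrated with a short Java sketch. It is not the RoutingServiceConfigurator implementation: the class and method names are hypothetical, and the context parameters are passed in as a plain map rather than read from a servlet context.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative parser for routingTableEntry* context parameters. */
final class RoutingTable {

    static final String PREFIX = "routingTableEntry";

    /** Builds a map from telephone number to application name. */
    static Map<String, String> fromContextParams(Map<String, String> contextParams) {
        Map<String, String> table = new LinkedHashMap<>();
        // Sort by parameter name so the result does not depend on the input map's order.
        for (Map.Entry<String, String> param : new TreeMap<String, String>(contextParams).entrySet()) {
            if (!param.getKey().startsWith(PREFIX)) {
                continue; // for example, oamDataDirectory
            }
            String[] pair = param.getValue().split(",", 2);
            if (pair.length == 2) {
                table.put(pair[0].trim(), pair[1].trim());
            }
        }
        return table;
    }
}
```

For the two entries above, the result maps 123 to HelloWorld and 456 to HelloWeb; parameters without the prefix are ignored.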

Enabling the routing servlet

The <servlet> and <servlet-mapping> elements are required in the web.xml for starting the RoutingServlet. These elements accept the parameters described later in this section.

<servlet>
  <servlet-name>routingservlet</servlet-name>
  <servlet-class>com.scansoft.osd.servlet.RoutingServlet</servlet-class>
</servlet>

<servlet-mapping>
  <servlet-name>routingservlet</servlet-name>
  <url-pattern>/router</url-pattern>
</servlet-mapping>



The order of the entries in the web.xml file is significant (and is dictated by the dtd used for the file). Consult the documentation for your web application server for details.

Configuring the web application server

To use the routing servlet, the web application server must allow the servlet to access other web contexts. For example, Tomcat requires a configuration such as the following in the server.xml file inside the <host> element:

<DefaultContext reloadable="true" crossContext="true"/>

Static routing

The routing configurator sets up the RoutingServlet using parameters from the web.xml. These parameters include settings for the initial routing table.

To enable routing in Tomcat, insert the following lines in the web.xml:

<listener>
  <listener-class>com.scansoft.osd.oam.jmx.RoutingServiceConfigurator</listener-class>
</listener>

This configuration reads the routing table entries from web.xml and sets up the routing table with these values. With static routing, the routing table does not change after this initial setup. With dynamic routing, changes are possible at runtime.

Dynamic routing

For dynamic routing, you must set up JMX as described in Set JMX parameters in web.xml on page 74. After specifying the web.xml entries described there, make the settings needed for Static routing.

After you set up dynamic routing, a new MBean is available under the JMX mbeanDomain “OSD” for changing the routing table at runtime. For information, see Managing configuration on page 75.

Using the routing servlet

To use the configured routing servlet, copy the routing.war file into the web applications folder of your web application server.

Sending requests to the routing servlet is similar to sending requests to the OSD servlet. Here is a URL request to the routing servlet for starting a dialog with the Pizza application on MyServer, port 8080:

http://MyServer:8080/routing/router?application=Pizza

Here is a request using telephone number 123 and a routing table as identifiers:

http://MyServer:8080/routing/router?calledId=123

If the request uses both application name and telephone number, the telephone number takes precedence (assuming the routing table has an appropriate entry).
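The precedence rule can be sketched as follows. This is an illustration only: RouteResolverSketch is a hypothetical name, and falling back to the application name when the number has no routing table entry is an assumption drawn from the parenthetical remark above.

```java
import java.util.Map;

/** Hypothetical illustration of the precedence rule; not the RoutingServlet implementation. */
final class RouteResolverSketch {
    /** Returns the application to start, or null if neither identifier resolves. */
    static String resolve(String calledId, String application, Map<String, String> routingTable) {
        if (calledId != null && routingTable.containsKey(calledId)) {
            return routingTable.get(calledId); // the telephone number takes precedence
        }
        // Assumption: without a matching entry, the explicit application name is used.
        return application;
    }
}
```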



The returned page carries the real context that handled the forwarded request; this allows all subsequent requests to be sent directly to the serving context instead of the routing servlet.

The URL used in the request to the routing servlet is forwarded to the target application without change. Therefore, the start request can carry additional information to configure the OSD application.



Chapter 9

Application development topics

This chapter describes:

■ Using “skip lists” to avoid recognizing specific words
■ Dynamic prompts
■ Working with dates and times programmatically
■ Creating grammars dynamically (at runtime)
■ Reading the <config> content of a node
■ Extending the application object
■ Rendering

Using “skip lists” to avoid recognizing specific words

A skip list is a list of recognition hypotheses that should be skipped. This feature supports the concept of “not asking the same question twice.” The purpose is to prevent illogical conversations such as this:

System What time do you want to go to Boston?

User Not Boston, BOLTON!

System Okay, what time do you want to go to Boston?

In the example above, the negative confirmation “Not Boston” is understood, but then “BOLTON!” is misrecognized again as Boston. Users get angry when applications make mistakes like this; skip lists solve the problem.



Key facts about skip lists

OSD maintains a skip list for each attribute. When a caller rejects an attribute value during a verification step, that value is added to the skip list for that attribute. In other words, any values on an attribute's skip list are automatically removed from the recognizer’s next results.

Skip lists are lists of values only. They are attached to single attributes. For example, an attribute named count can have a skip list containing one, two, and three, so that OSD never again accepts any of those values for that attribute.

Skip lists do not contain combined values. When a user rejects more than one slot in a single utterance, the system does not update any skip list. For example, if the user rejects “do you want to go to Boston tomorrow?” the skip lists for the destination and time attributes are not changed.
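The update rule described in the preceding paragraphs can be sketched as follows. This is not the OSD implementation; SkipListUpdateSketch and its methods are hypothetical names, and rejected slots are modeled as a simple map of attribute name to value.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of per-attribute skip lists; not the OSD IAttribute API. */
final class SkipListUpdateSketch {
    // One skip list per attribute name.
    private final Map<String, Set<String>> skipLists = new LinkedHashMap<>();

    /** Called when the caller answers a verification question negatively. */
    void onRejected(Map<String, String> rejectedSlots) {
        if (rejectedSlots.size() != 1) {
            return; // combined rejection: no skip list is updated
        }
        Map.Entry<String, String> slot = rejectedSlots.entrySet().iterator().next();
        skipLists.computeIfAbsent(slot.getKey(), k -> new LinkedHashSet<>()).add(slot.getValue());
    }

    /** Returns the skip list of one attribute (empty if nothing was rejected yet). */
    Set<String> skipListFor(String attribute) {
        return skipLists.getOrDefault(attribute, new LinkedHashSet<>());
    }
}
```

Rejecting “Boston” alone adds it to the destination skip list; rejecting “Boston tomorrow” as a whole leaves both skip lists unchanged.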

Skip lists are driven by the content of the verification candidate list (VCL) and are only used when a verification question is answered negatively. (A verification question means using, for example, <verify yesno="YESNO"/> instead of <understand namelist="YESNO"/> in xHMI.) A negative answer is the value false for the YESNO attribute.

Unless you clear a skip list value, the system retains it in the attribute until the end of the session. Use these IAttribute methods to read, clear, or change the skip list:

clearSkipList()
addToSkipList()
getSkipList()
setSkipList(<new skiplist>)
skip(<new value>)

Application developers can choose where the system processes skip lists: either by the recognition engine or by OSD. Thus, skip lists are processed either automatically by the OSD framework or within ECMAScript inside speech grammars. When you select processing on the recognizer side, OSD automatically adds the ECMAScript to the grammar.

Use these OSD properties to control where processing occurs:

asrSideSkipList
serverSideSkipList



When skip list processing occurs

For background information, this figure shows when skip list processing occurs:

Controlling where skip list processing occurs

By default skip lists are processed by OSD (which is known as server-side processing) and disabled on the speech recognizer (which is known as ASR-side processing). This is equivalent to the following setting of the properties serverSideSkipList and asrSideSkipList:

<property-list>
  <property name="serverSideSkipList" value="true"/>
  <property name="asrSideSkipList" value="false"/>
</property-list>

These are the default settings. If you want server-side processing, do not change the defaults.

Alternatively, you can process skip lists on the speech recognizer instead of using OSD. Set the parameters as follows:

<property-list>
  <property name="serverSideSkipList" value="false"/>
  <property name="asrSideSkipList" value="true"/>
</property-list>

To disable skip list processing, use the following settings:

<property-list>
  <property name="serverSideSkipList" value="false"/>
  <property name="asrSideSkipList" value="false"/>
</property-list>

You can enable skip list processing on both sides (recognizer and OSD), but this adds load to processing without increasing performance.

The default is to process skip lists on the OSD machine. This is recommended for these reasons:

■ If you are not using the Nuance Recognizer, and your recognizer does not support ECMAScript in speech grammars that can be activated via parameter grammars, processing on the recognizer is not possible and you must keep the default processing location.

■ Performance—If your recognition server is already heavily loaded, the additional ECMAScript processing for skip lists might be undesirable. Normally, the additional load is minimal.

■ There is a small cost savings (cpu and network activity) to process with OSD. For example, there is less rendering for the voice browser and less data transfer to the recognizer. This is not likely a significant factor for changing the default processing location.

However, processing on the recognizer is also useful because it returns more accurate results that are easier to work with. (The recognizer replaces skipped hypotheses with new possibilities and re-adjusts confidence levels, whereas OSD simply removes next-best entries. With OSD, removed slots are not re-filled and confidence scores are not re-adjusted.)
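The server-side behavior described in the parenthetical remark can be illustrated with a minimal sketch. It is not OSD code: ServerSideSkipSketch is a hypothetical name, and hypotheses are modeled as plain strings without confidence scores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of server-side skip list filtering; not the OSD implementation. */
final class ServerSideSkipSketch {
    /** Removes skipped values from an n-best list; removed entries are not replaced. */
    static List<String> filter(List<String> nBest, Set<String> skipList) {
        List<String> kept = new ArrayList<>();
        for (String hypothesis : nBest) {
            if (!skipList.contains(hypothesis)) {
                kept.add(hypothesis); // order (and any score) of the survivors is unchanged
            }
        }
        return kept;
    }
}
```

The filtered list is shorter than the original n-best list, which is the trade-off against processing on the recognizer.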

OSD automatically adds homophones to skip lists

If any of the items on the skip list are associated with homophones, OSD automatically adds the homophones to the list. This is done because the default implementation of xHMI works with semantic values, and it could unknowingly prompt with an item that has a different semantic value but the same pronunciation. To avoid this, OSD reads the contents of the skip list, detects homophones, and appends them to the list.

Assume the following dialog:

System What is the person’s name?

User Cager.

System Meyer, correct?

User No.

System Please say the name again.

When the caller says “no,” OSD has a skip item containing “Meyer.” Because Meyer has homophones such as Mayer, Meier, and Mayor, OSD also adds these names to the skip list.

Sample skip list grammar

Below is a skip list grammar (as created automatically by OSD):

<?xml version="1.0"?>
<SWIparameter version="1.0" id="set_grammar_script" precedence="1" ignore_unknown_parameters="0">
  <parameter name="swirec_grammar_script">
    <value>if(typeof(origin) != 'undefined' && origin == 'London') {SWI_disallow=1;}</value>
  </parameter>
</SWIparameter>
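The homophone expansion described in this section can be sketched as follows. How OSD actually detects homophones is not stated here, so the lookup table passed to the method is an assumption, and HomophoneSkipSketch is a hypothetical name.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration: expands a skip list with homophones from a lookup table. */
final class HomophoneSkipSketch {
    static Set<String> expand(Set<String> skipList, Map<String, List<String>> homophones) {
        Set<String> expanded = new LinkedHashSet<>(skipList);
        for (String value : skipList) {
            List<String> sameSound = homophones.get(value);
            if (sameSound != null) {
                expanded.addAll(sameSound); // same pronunciation, different semantic value
            }
        }
        return expanded;
    }
}
```

For a skip list containing Meyer and a table that lists Mayer, Meier, and Mayor as its homophones, the expanded skip list contains all four names.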



Dynamic prompts

To present dynamic content to the caller, the application developer can do either of the following:

■ Write the presentation text (the text to be presented to the caller) to an attribute and then use an ECMAScript expression in the <output> configuration.

■ Add the output to the StepResponse that is passed to the rendering system.

To describe these alternatives (below), we use the Hello World example (originally presented in the xHMI Reference Guide).

In these examples, the prompt text is hard-coded in Java. This is done for simplicity; in a real application, the dynamic prompts would typically be generated from database content.

Writing prompt text to an attribute

The following xHMI configuration shows the Hello World written so that the greeting prompt depends on the time of day:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi">

  <dialog id="HelloWorld" root="hello">
    <var-list>
      <var name="dynamicPrompt" type="attribute" expr="Default prompt"/>
    </var-list>

    <node class="com.scansoft.osd.tutorial.DynamicPrompt1" id="hello">
      <config>
        <output-list>
          <initial>
            <output>
              <value expr="s.best('dynamicPrompt')"/>
            </output>
          </initial>
        </output-list>
      </config>
      <transition>
        <next name="expr">
          <target condexpr="true" path="exit"/>
        </next>
      </transition>
    </node>
  </dialog>

  <catch>
    <target event="error" path="exit">
      <log>An error occurred.</log>
    </target>
  </catch>

  <vuiforward>
    <forward name="_outputExit" path="/outputExit.jsp"/>
    <forward name="_exit" path="/exit.jsp"/>
  </vuiforward>
</xhmi>

Java code for the custom node

package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;

import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.nodes.Output;

public class DynamicPrompt1 extends Output
{
    public void execute(IStepRequest request, IStepResponse response)
        throws DialogException
    {
        // calculate prompt text
        String greetingPrompt = new String( "Hello!" );
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if( hour < 10 )
        {
            greetingPrompt = "Good morning!";
        }
        else if( hour > 17 )
        {
            greetingPrompt = "Good evening!";
        }

        // write prompt text to attribute
        getSessionFrame().addAttribute( "dynamicPrompt", greetingPrompt);

        super.execute(request, response);
    }
}

(An alternative implementation to the above is discussed in Adding output to the StepResponse on page 92.)

Adding output to the StepResponse

In contrast to the preceding example, this xHMI configuration file does not contain an <output> element. Instead the initial output is added to the StepResponse (shown after the configuration):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi">

  <dialog id="HelloWorld" root="hello">
    <node class="com.scansoft.osd.tutorial.DynamicPrompt2" id="hello">
      <transition>
        <next name="expr">
          <target condition="true" path="exit"/>
        </next>
      </transition>
    </node>
  </dialog>


</target></catch>





The initial output in the StepResponse


package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;

import com.scansoft.osd.config.XInitial;
import com.scansoft.osd.config.XOutput;
import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.nodes.Output;

public class DynamicPrompt2 extends Output
{
    public void execute(IStepRequest request, IStepResponse response)
        throws DialogException
    {
        // calculate prompt text
        String greetingPrompt = new String( "Hello!" );
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if( hour < 10 )
        {
            greetingPrompt = "Good morning!";
        }
        else if( hour > 17 )
        {
            greetingPrompt = "Good evening!";
        }

        XInitial initial = new XInitial();
        initial.add(new XOutput(greetingPrompt));

        // write output to step response
        response.putInitialOutputList(initial);
        super.execute(request, response);
    }
}



Working with dates and times programmatically

In xHMI, the IDateTime interface defines methods for working with dates and times. It allows interactions with any part of a complete timestamp, which in turn allows applications to work with timestamps in any desired format. It also allows conditional expressions based on dates. For example, the application can check whether the current time is morning or evening.

OSD provides a default implementation for this interface. The implementation is available via the com.scansoft.osd.date.DateTime class.

The following sections describe the use of the interface and the default implementation, but these are not the only features available for handling date and time information. Applications can also use the following features:

■ Update rules—OSD provides sample date and time update rules for evaluating timestamp information in recognition results, improving the next-best entries, and filling slot attributes in the SessionFrame.

■ Attribute facades—When discussing dates and times with users, one challenge is to collect separate slots of timestamp information, and then use them together. For example, the application might allow: date, day, day of week, month, year, time of day, hour of day, and so on. Internally, the application translates these slots into a concise timestamp object. Externally, the application needs to formulate output using any combination of the slots. To simplify the challenge, applications can use attribute facades.

For example, date and time attribute facades can detect when timestamps are incomplete (i.e. when slots are missing), and automatically formulate follow-up questions in output to users. To collect each slot, the needed grammars are automatically referenced, and the application does not need to control the chaotic activation and deactivation of individual grammars in the recognizer.

Setting dates and times

The interface IDateTime has two methods to set dates:

set(String date)

set(String date, String format, String assume)

With set, you can specify any part of the date (using the notation described in Details on date and time formats on page 96), and the system will automatically add any parts that you omit. To do this, the system assumes future dates when resolving ambiguities. You can specify dates in the following combinations of fields:

Complete timestamp Partial timestamp combinations

date date date date

time time time

timezone timezone

The three-argument set method provides additional control. The format string selects the timestamp format and takes one of these values:

osd:datetime
vxml:date
vxml:time

The assume string changes the assumptions made when filling in omitted parts of the timestamp. It takes one of these values:

ASSUMEPAST
ASSUMEFUTURE
ASSUMECLOSEST
ASSUMENOTHING

For example, to set a timestamp in the past (the first day of the current month):

set("--01","osd:datetime","ASSUMEPAST")

Getting dates

Use the get method of IDateTime to retrieve dates in any desired format.

You can get complete dates or parts of dates. The signature is:

String get(String format)

For example, the following format string prints a four-digit year (zero-padded when only three digits are available), a two-digit month, and so on. This format is similar to the ISO 8601 definition for dates and times:

YYYY-MM-dd HH:mm:ss Z



The format string can contain the following arguments:

Argument  Description
y         One digit of a year; for a four-digit year use yyyy
M         One digit of a month; for a two-digit month use MM
d         One digit of a day; for a two-digit day use dd
h         One digit of an hour (12-hour clock); for a two-digit hour use hh
H         One digit of an hour (24-hour clock); for a two-digit hour use HH
m         One digit of a minute; for a two-digit minute use mm
s         One digit of a second; for a two-digit second use ss
Z         Gets the +hh:mm timezone designator

Details on date and time formats

The com.scansoft.osd.date.DateTime class uses the following format for the date portion of a timestamp:

YYYY-MM-dd

You can abbreviate values so long as your abbreviation is not ambiguous when the specification is parsed (from left to right). When your specification omits parts of the date, OSD fills those parts from the current date, assuming a date in the future.

Examples:

Specification  Date
2005-02-10     10 February 2005
2005-02        February 2005
2005           Year 2005
02             Year 02
-02            February
-02-10         10th February
--10           10th day
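To illustrate how an omitted year might be filled in, here is a minimal sketch using the standard java.time API. It is not the DateTime implementation: PartialDateSketch and the Assume enum are hypothetical names, only the month-and-day case (such as -02-10) is handled, and treating today's date as valid for both directions is an assumption.

```java
import java.time.LocalDate;

/** Hypothetical illustration of filling in an omitted year; not the OSD DateTime class. */
final class PartialDateSketch {
    enum Assume { PAST, FUTURE }

    /** Resolves a month and day against a reference date ("today"). */
    static LocalDate resolveMonthDay(int month, int day, LocalDate today, Assume assume) {
        // Start with the current year, then move one year if the direction is wrong.
        LocalDate candidate = LocalDate.of(today.getYear(), month, day);
        if (assume == Assume.FUTURE && candidate.isBefore(today)) {
            return candidate.plusYears(1);
        }
        if (assume == Assume.PAST && candidate.isAfter(today)) {
            return candidate.minusYears(1);
        }
        return candidate;
    }
}
```

With a reference date of 15 June 2005, the specification -02-10 resolves to 10 February 2006 when the future is assumed and to 10 February 2005 when the past is assumed.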



The com.scansoft.osd.date.DateTime class uses the following format for the time portion of a timestamp:

HH:mm:ss Z

The “Z” is a timezone designator showing an offset relative to Greenwich Mean Time (GMT). It can take the values -23:59 to +23:59. For example:

2005-02-10 10:00:00 +01:00

The timezone must contain a plus sign (+) or a minus sign (-). The format is:

(+|-)HH:mm

In the timezone format, HH is a two-digit number between 0 and 23, and mm is a two-digit number between 0 and 59.
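The timezone rules above can be checked with a short sketch. TimezoneDesignatorSketch is a hypothetical helper, not part of OSD; it only demonstrates the (+|-)HH:mm format and the stated value ranges.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical validator for the (+|-)HH:mm timezone designator. */
final class TimezoneDesignatorSketch {
    private static final Pattern DESIGNATOR = Pattern.compile("([+-])(\\d{2}):(\\d{2})");

    /** Returns the offset in minutes, or throws if the designator is invalid. */
    static int offsetMinutes(String designator) {
        Matcher m = DESIGNATOR.matcher(designator);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected (+|-)HH:mm but got: " + designator);
        }
        int hours = Integer.parseInt(m.group(2));
        int minutes = Integer.parseInt(m.group(3));
        if (hours > 23 || minutes > 59) {
            throw new IllegalArgumentException("Out of range: " + designator);
        }
        int total = hours * 60 + minutes;
        return m.group(1).equals("-") ? -total : total;
    }
}
```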

Times are always based on a 24-hour clock (not 12-hour). You cannot specify “am” or “pm.”

You can abbreviate values when specifying the time. The class fills abbreviations using zeros:

Specification  Time
13:15          13:15:00 (this is 1:15 pm)
13             13:00:00
:30            00:30:00
::55           00:00:55

For a description of format abbreviations, see Timestamp abbreviations on page 157.

Exceptions for invalid timestamps

The set method rejects invalid dates and times by throwing an exception that states the error. For example, the date 2005-02-40 is invalid (because there is no month having more than 31 days).

The smallest possible date is 1-1-1 (the first day of the first month in year 1). Although there is a zero time (00:00:00 is midnight), there is no zero date: a value of 0-0-0 will throw an exception.
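The zero-filling of abbreviated times can be sketched as follows. This is an illustration, not the DateTime parser: TimeAbbreviationSketch is a hypothetical name, and the timezone part is left out.

```java
import java.util.Locale;

/** Hypothetical illustration of expanding abbreviated times to HH:mm:ss. */
final class TimeAbbreviationSketch {
    static String expand(String spec) {
        // A limit of -1 keeps empty fields, so ":30" yields ["", "30"].
        String[] parts = spec.split(":", -1);
        if (parts.length > 3) {
            throw new IllegalArgumentException("Too many fields: " + spec);
        }
        int[] fields = new int[3]; // hour, minute, second; omitted fields stay 0
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                fields[i] = Integer.parseInt(parts[i]);
            }
        }
        if (fields[0] < 0 || fields[0] > 23 || fields[1] < 0 || fields[1] > 59
                || fields[2] < 0 || fields[2] > 59) {
            throw new IllegalArgumentException("Out of range: " + spec);
        }
        return String.format(Locale.ROOT, "%02d:%02d:%02d", fields[0], fields[1], fields[2]);
    }
}
```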



Creating grammars dynamically (at runtime)

A dynamic grammar is a grammar created at runtime (during a session) because it depends on previous user input, content of a database, or some other information that is unknown during application development.

Dynamic grammars should be small or medium-sized. Large dynamic grammars add load to your system.

Comparison of dynamic and static grammars

Application developers have a choice when designing and building speech grammars: the grammars can be written in advance, perhaps compiled in advance too, and then stored on a server until needed for loading into the recognizer; or they can be written dynamically (just before loading in the recognizer) at the moment they are needed.

Pre-written grammars are called static grammars. Runtime grammars are dynamic grammars. (Because dynamic grammars are written into memory as a string, they are known informally as string grammars.)

Benefits: static grammars can be stored in cache (thus saving runtime resources); dynamic grammars can recognize information that is only available at runtime (for example, the geographical location of the caller). Most applications use a combination of static and dynamic grammars:

■ Large vocabularies of known information (item lists, natural language grammars, robust parsing grammars, and so on) are built as static grammars.

■ Smaller, customized vocabularies (personalized information that is specific to the session) are built as dynamic grammars (typically, by jsp pages).

■ Static and dynamic grammars can be complementary. For example, two grammars can be activated in parallel where the larger grammar is static, and the smaller grammar is dynamically generated to extend the coverage of the recognized speech.

Overview of OSD support

For convenience, the OSD Collection node provides an extension point to create dynamic grammars. Basically, you implement the addAllDynamicGrms and removeDynamicGrms methods in your extended node (see Example of a dynamic grammar). Your custom dialog node executes the following steps:

1 Create the dynamic grammar.

2 Store the grammar.

3 Add and use the grammar.

4 Release unused grammars. This step is optional; it improves performance.

To store and release grammars, the com.scansoft.xhmi.nodes.DialogNode class provides the storeGrammar and releaseGrammar methods.

OSD adds the dynamic grammar to the generated page so it is activated for the next recognition.
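Step 1 (creating the grammar) often means turning runtime data into a grammar string. The following sketch shows one way to do that for a simple word list. SrgsBuilderSketch is a hypothetical helper, not an OSD class; it assumes that the rule id and slot name are plain identifiers.

```java
import java.util.List;

/** Hypothetical helper that builds a small SRGS XML grammar from runtime values. */
final class SrgsBuilderSketch {
    static String build(String ruleId, String slot, List<String> values) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n")
          .append("<grammar version=\"1.0\" xml:lang=\"en-US\"")
          .append(" xmlns=\"http://www.w3.org/2001/06/grammar\"")
          .append(" mode=\"voice\" tag-format=\"semantics/1.0\"")
          .append(" root=\"").append(ruleId).append("\">\n")
          .append(" <rule id=\"").append(ruleId).append("\" scope=\"public\">\n")
          .append("  <one-of>\n");
        for (String value : values) {
            // Escape for the ECMAScript string literal first, then for XML.
            String literal = value.replace("\\", "\\\\").replace("'", "\\'");
            sb.append("   <item>").append(escapeXml(value))
              .append("<tag>").append(slot).append("='").append(escapeXml(literal))
              .append("';</tag></item>\n");
        }
        sb.append("  </one-of>\n </rule>\n</grammar>");
        return sb.toString();
    }

    private static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

The resulting string can then be passed to storeGrammar in the same way as the hard-coded grammar string in the example that follows.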

Example of a dynamic grammar

This example implementation uses a simple grammar and a simple mechanism (a node property) to tell the node which attribute is to receive the return value. Note that the method addAllDynamicGrms is called during node execution, and removeDynamicGrms is called when the node finishes and is left through a transition.

public class DynamicGrammarCollection extends Collection
{
    final String name = "myDynamicGrammar";

    // the following method is called from DialogNode#execute
    protected void addAllDynamicGrms(IStepResponse response, IUnderstandList ul)
        throws DialogException
    {
        IPropertyList pl = getLocalProperties();
        String fills = pl.getValue("attributeToFill");

        String grammar = "<?xml version=\"1.0\"?>\n"
            + "<grammar version=\"1.0\" xml:lang=\"en-US\""
            + " xmlns=\"http://www.w3.org/2001/06/grammar\""
            + " mode=\"voice\" tag-format=\"semantics/1.0\""
            + " root=\"SIZE\">\n"
            + " <rule id=\"SIZE\" scope=\"public\">\n"
            + " <one-of>\n"
            + " <item>one<tag>DYNAMICSLOT='one';</tag></item>\n"
            + " <item>two<tag>DYNAMICSLOT='two';</tag></item>\n"
            + " <item>three<tag>DYNAMICSLOT='three';</tag></item>\n"
            + " </one-of>\n"
            + " </rule>\n"
            + "</grammar>";

        // store new grammar
        DynamicGrammarDesc dynamicGrammar = storeGrammar(name,
            grammar, "application/srgs+xml", false);
        dynamicGrammar.addMapping(fills, "DYNAMICSLOT");

        // add grammar to set of used grammars
        addDynamicGrammar(response, ul, dynamicGrammar);
    }

    // the following method is called from DialogNode#transition
    protected void removeDynamicGrms(IStepResponse response)
        throws DialogException
    {
        releaseGrammar(name); // release grammar after using it
    }
}

Use the node in your xHMI configuration. You must configure the <understand> element with any attribute names that are filled by the dynamic grammars. Otherwise, the system cannot copy recognized slots to attributes:

<node class="com.examples.nodes.DynamicGrammarCollection" id="collect">
  <config>
    <property-list>
      <property name="attributeToFill" value="dummy"/>
    </property-list>
    <output-list>
      <initial>
        <output>please state value for 'dummy'</output>
      </initial>
    </output-list>
    <understand namelist="dummy"/>
  </config>
  <transition>
    <next name="expr">
      <target path="#waiting"/>
    </next>
  </transition>
</node>

Reading the <config> content of a node

Custom nodes might need custom child elements in the <config> element. Let's revisit the Hello World example once more to show the steps required to extend the <config> element. This time we want a list of typed parameters that configure the different prompt texts, the hour until which the good-morning prompt is played, and the hour from which the good-evening prompt is played. Furthermore, parameters are allowed to take default values.



To extend the <config>, perform these steps:

1 Add the elements by extending the DTD

2 Use the new element in your xHMI configuration

3 Write classes to access the custom node

Add the elements by extending the DTD

In xhmi_mai.dtd, add the new, custom element. In the following example, we add <parameter-list> as child to the <config> element of xHMI:

<!ELEMENT config (xi:include|osdm?|understand?|verify?|property-list?|output-list?|grm-list?|transfer-result?|parameter-list?)*>

Put the definition of the custom element into the custom.dtd file. For example:

<!ELEMENT parameter-list (parameter+)>
<!ELEMENT parameter EMPTY>
<!ATTLIST parameter
  name    NMTOKEN #REQUIRED
  type    NMTOKEN #REQUIRED
  value   CDATA   #IMPLIED
  default CDATA   #IMPLIED
>

Use the new element in your xHMI configuration

The following example shows the new element <parameter-list> in the Hello World application. The DTD now allows multiple greeting prompts whose times of day are set by parameters:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi">
  <dialog id="HelloWorld" root="hello">
    <node class="com.scansoft.osd.tutorial.CustomConfig" id="hello">
      <config>
        <parameter-list>
          <parameter name="helloPrompt" value="Hello!" type="String"/>
          <parameter name="morningPrompt" value="Good morning!" type="String" default="Hello"/>
          <parameter name="eveningPrompt" value="Good evening!" type="String" default="Hello"/>
          <parameter name="morning" value="10" type="Integer"/>
          <parameter name="evening" value="17" type="Integer"/>
        </parameter-list>
      </config>
      <transition>
        <next name="expr">
          <target condition="true" path="exit"/>
        </next>
      </transition>
    </node>
  </dialog>


</target></catch>



Write classes to access the custom node

To access the new <parameter-list> element, the custom node must read the custom elements from the DOM. To enable this, you must write classes that implement the IDomElementReader interface. Then, the node can retrieve the custom elements using the DialogNode.getConfig method.

Here is the Java code:


package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.HashMap;

import javax.xml.transform.TransformerException;

import org.apache.xpath.XPathAPI;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import com.scansoft.osd.DialogManagerConstants;
import com.scansoft.osd.config.XInitial;
import com.scansoft.osd.config.XOutput;
import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IDomElementReader;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.XMLException;
import com.scansoft.xhmi.nodes.Output;

public class CustomConfig extends Output
{
    // reads the <parameter-list> element into a map of name -> XParameter
    class XParameterList extends HashMap implements IDomElementReader
    {
        public static final String XML_ELEMENT_NAME = "parameter-list";
        public static final String TAG_ATTRIBUTE = XParameter.XML_ELEMENT_NAME;

        public String getXMLElementName()
        {
            return XML_ELEMENT_NAME;
        }

        public void readFromDomElement(Element element,
            String namespacePrefix, Node namespaceNode) throws XMLException
        {
            String xpath = "./" + namespacePrefix + ":" + TAG_ATTRIBUTE;
            try
            {
                try
                {
                    NodeList nodeList = XPathAPI.selectNodeList(
                        element, xpath, namespaceNode);
                    if (nodeList != null)
                    {
                        for (int n = 0; n < nodeList.getLength(); n++)
                        {
                            XParameter param = new XParameter();
                            param.readFromDomElement(
                                (Element)nodeList.item(n),
                                namespacePrefix, namespaceNode);
                            put(param.getName(), param);
                        }
                    }
                }
                catch (TransformerException e)
                {
                    throw new DialogException("Cannot find: " + xpath, e);
                }
            }
            catch (Exception e)
            {
                throw new XMLException("Cannot get XParameterList", e);
            }
        }
    }



    // reads a single <parameter> element;
    // Cloneable is declared so that clone() cannot fail at runtime
    class XParameter implements IDomElementReader, Cloneable
    {
        public static final String XML_ELEMENT_NAME = "parameter";
        public static final String XML_ATTRIB_NAME = "name";
        public static final String XML_ATTRIB_TYPE = "type";
        public static final String XML_ATTRIB_VALUE = "value";
        public static final String XML_ATTRIB_DEFAULT = "default";

        private String name_ = null;
        private String type_ = null;
        private Object value_ = null;
        private Object default_ = null;

        public String getXMLElementName()
        {
            return XML_ELEMENT_NAME;
        }

        public String getAttribute(Node node, String attributeName)
        {
            NamedNodeMap map = node.getAttributes();
            Node n = map.getNamedItem(attributeName);
            if (n == null)
            {
                return null;
            }
            else
            {
                return n.getNodeValue();
            }
        }

        public void readFromDomElement(Element element,
            String namespacePrefix, Node namespaceNode) throws XMLException
        {
            name_ = getAttribute(element, XML_ATTRIB_NAME);
            type_ = getAttribute(element, XML_ATTRIB_TYPE);

            if( type_.equalsIgnoreCase( "String") )
            {
                value_ = new String(getAttribute(element, XML_ATTRIB_VALUE));
                if( null != getAttribute(element, XML_ATTRIB_DEFAULT))
                {
                    default_ = new String(
                        getAttribute(element, XML_ATTRIB_DEFAULT));
                }
            }
            else if( type_.equalsIgnoreCase( "Integer") )
            {
                value_ = new Integer(Integer.parseInt(
                    getAttribute(element, XML_ATTRIB_VALUE)));
                if( null != getAttribute(element, XML_ATTRIB_DEFAULT))
                {
                    default_ = new Integer(Integer.parseInt(
                        getAttribute(element, XML_ATTRIB_DEFAULT)));
                }
            }
        }

        public String getName()
        {
            return name_;
        }

        public void setName(String string)
        {
            name_ = string;
        }

        public Object clone()
        {
            Object clone = null;
            try
            {
                clone = super.clone();
            }
            catch(CloneNotSupportedException e)
            {
                // should never happen
                // because we have implemented clone,
                // so it is supported
                throw new RuntimeException(e);
            }
            return clone;
        }
    }

    public void execute(IStepRequest request, IStepResponse response)
        throws DialogException
    {
        XParameterList parameterList = new XParameterList();

        // read parameter list from xHMI
        IDomElementReader reader = getConfig(
            getApplication().getNamespacePrefix()
                + ":" + XParameterList.XML_ELEMENT_NAME,
            parameterList, DialogManagerConstants.SCOPE_NODE, true);

        // access parameters (this sample does not
        // make use of the default value)
        XParameter param = (XParameter)parameterList.get("morning");
        int morning = ((Integer)param.value_).intValue();
        param = (XParameter)parameterList.get("evening");
        int evening = ((Integer)(param).value_).intValue();
        param = (XParameter)parameterList.get("morningPrompt");
        String goodMorning = ((String)(param).value_);
        param = (XParameter)parameterList.get("eveningPrompt");
        String goodEvening = ((String)(param).value_);
        param = (XParameter)parameterList.get("helloPrompt");
        String hello = ((String)(param).value_);

        // calculate prompt text
        String greetingPrompt = new String( hello );
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if( hour < morning )
        {
            greetingPrompt = goodMorning;
        }
        else if( hour > evening )
        {
            greetingPrompt = goodEvening;
        }

        XInitial initial = new XInitial();
        initial.add(new XOutput(greetingPrompt));

        // write output to step response
        response.putInitialOutputList(initial);
        super.execute(request, response);
    }
}
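The sample above reads the default attribute but never uses it. The following standalone sketch shows one plausible way to apply defaults. It uses only the standard JDK DOM parser rather than the OSD classes, ParameterDefaultsSketch is a hypothetical name, and falling back to default when value is absent is an assumption about the intended semantics.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical illustration of value/default resolution for <parameter> elements. */
final class ParameterDefaultsSketch {
    /** Returns name -> effective value (value if present, else default, else null). */
    static Map<String, String> read(String parameterListXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    parameterListXml.getBytes(StandardCharsets.UTF_8)));
            NodeList params = doc.getElementsByTagName("parameter");
            Map<String, String> values = new HashMap<>();
            for (int i = 0; i < params.getLength(); i++) {
                Element p = (Element) params.item(i);
                // hasAttribute distinguishes a missing attribute from an empty one.
                String effective = p.hasAttribute("value") ? p.getAttribute("value")
                    : p.hasAttribute("default") ? p.getAttribute("default") : null;
                values.put(p.getAttribute("name"), effective);
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read parameter list", e);
        }
    }
}
```

A parameter that has only a default attribute resolves to that default; a parameter with both attributes resolves to its value.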

Extending the application object

Sometimes it is useful for an application to set up data at load time that can be shared by all sessions of the application at runtime. You can do this by extending the supplied class com.scansoft.osd.Application and overriding its init method. In init, call super.init first to start OSD's regular initialization. The application class is loaded when the web application starts.

To tell the framework about your class, you need to set a parameter in the web.xml of your web application.

<context-param>
  <param-name>ApplicationClassName</param-name>
  <param-value>com.scansoft.osd.MyApplication</param-value>
</context-param>

You can access the application from a node simply by calling getApplication and casting the result to your application class. For example:

MyApplication app = (MyApplication)getApplication();

Do not store any dynamic session data in the application object because the data will be lost in the case of a session failure and re-initialization. Instead, store all dynamic data in the SessionFrame.
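The pattern described in this section can be sketched as follows. Because the signature of com.scansoft.osd.Application.init is not shown here, the sketch uses a hypothetical stand-in base class (ApplicationStandIn) with a parameterless init; the class names and the sample data are assumptions for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for com.scansoft.osd.Application; the real init signature may differ. */
class ApplicationStandIn {
    boolean frameworkReady = false;

    void init() {
        frameworkReady = true; // represents OSD's regular initialization
    }
}

/** Loads shared, read-only data once at load time. */
class MyApplicationSketch extends ApplicationStandIn {
    private Map<String, String> sharedPrompts = Collections.emptyMap();

    @Override
    void init() {
        super.init(); // regular initialization first
        Map<String, String> loaded = new HashMap<>();
        loaded.put("hello", "Hello!"); // sample value; a real application might read a database
        // Unmodifiable, so no session can store dynamic data here by accident.
        sharedPrompts = Collections.unmodifiableMap(loaded);
    }

    Map<String, String> getSharedPrompts() {
        return sharedPrompts;
    }
}
```

Exposing the shared data as an unmodifiable map is a design choice that supports the rule above: session-specific data has no place to go except the SessionFrame.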



Rendering

OSD renders VoiceXML 2.0 pages and sends them to your browser. It implements the rendering system using JSP pages stored in installDir\voiceXML\jsp.

The jsp pages use a JSP tag library using the xhmi-voicexml.tld descriptor file. OSD stores the file in installDir\voiceXML\jsp\WEB-INF. Applications must copy the file to their WEB-INF folder.

OSD provides these jsp pages:

■ collection.jsp: plays prompt and collects user input
■ error.jsp: used internally by OSD
■ outputExit.jsp: optionally plays prompts and exits
■ outputOSDM.jsp: optionally plays prompts and calls an OSDM
■ outputSync.jsp: plays a prompt and triggers VoiceXML generation
■ return.jsp: handles the return from a component to the calling application
■ root.jsp: used internally by OSD
■ root-nocache.jsp: used internally by OSD
■ start.jsp: used internally by OSD
■ submit.jsp: invokes a transfer to another VoiceXML application
■ transfer.jsp: invokes a call transfer

These pages correspond to the predefined <vuiforward> properties _collection, _outputExit, _transfer, plus the vui forward key defined by the class OSDMNode: callOSDM.

In addition to the Render Data objects defined by xHMI, OSD uses the following Render Data object:

Key                  Object
RenderOsdmGeneric    OSDMCallDesc

In this table, the key is an entry in the hash map of Render Data objects contained in the step response.

Extending the rendering system

Application developers can extend the rendering system provided with OSD.1 For example, you could supply additional JSP pages when your platform exports additional functionality with a VoiceXML <object> element. To use the functionality, an application needs to generate an appropriate VoiceXML page.

1 Substituting a complete rendering system (for example, to adopt a different markup language) is described in the OSD Integration Guide.

For example, assume you want to use an object called “X,” and a call to this object needs two parameters “a” and “b.” The following example shows the desired rendering of VoiceXML when the parameter values are 42 and 43 (values chosen arbitrarily for this example):

<object name="X">
  <param name="a" value="42"/>
  <param name="b" value="43"/>
</object>

To extend the rendering system, do the following:

1 Create a custom node

2 Configure the custom node in xHMI

3 Change the <vuiforward> map in xHMI

4 Create a jsp page

Create a custom node

We need a custom node that sets a vuiforward key to select the new rendering component. We define an arbitrary key (named "callX"). The node's execute function looks like this:

public void execute(IStepRequest request, IStepResponse response)
        throws DialogException
{
    setVuiForward(response, "callX");
    response.setCommit(true);
}

The step response must be committed to invoke immediate page rendering.

Configure the custom node in xHMI

The node can set required parameters (“a” and “b” in the example scenario) because the runtime framework passes all properties that are visible to a node into the StepResponse as a Render Data object. Therefore, your xHMI configuration defines properties in the <config> section of the custom node.

For example:

<config>
  <property name="a" value="42"/>
  <property name="b" value="43"/>
</config>

Change the <vuiforward> map in xHMI

You must map the vuiforward key created by the custom node to the jsp page that renders the VoiceXML. Any existing mappings are unchanged (even if they are not used in the new custom node). Assume that the name of the jsp page we want to create is callObjectX.jsp:

<vuiforward>
  <forward name="_collection" path="/collection.jsp"/>
  <forward name="_outputExit" path="/outputExit.jsp"/>
  <forward name="callX" path="/callObjectX.jsp"/>
</vuiforward>

Above, this example assumes the new page resides in the root directory of the web application.

Create a jsp page

In our example scenario, the jsp page needs to render a VoiceXML document that makes a call to the object. It is beyond the scope of this document to explain this process in detail. However, we explain how JSP code can access the properties we need.

For Render Data objects that contain properties, the runtime framework uses the key RenderPropertyList. The object returned for the key implements the interface IPropertyList.

The Render Data object can be found in a StepResponse object. First, we retrieve this object from the HTTP request, and then we extract the property list. The following JSP fragment shows how to do this:

<%@page contentType="text/xml;charset=UTF-8"
    errorPage="/error.jsp"
    import="com.scansoft.osd.StepResponse"
    import="com.scansoft.xhmi.renderbeans.RenderBeanKeys" %>

<%
StepResponse stepResponse =
    (StepResponse)request.getAttribute(RequestAttributeNames.STEP_RESPONSE);
IPropertyList props =
    (IPropertyList)stepResponse.get(RenderBeanKeys.RENDER_PROPERTY_LIST);
%>

... more VoiceXML here

<object name="X">
  <param name="a" value="<%=props.getValue("a")%>"/>
  <param name="b" value="<%=props.getValue("b")%>"/>
</object>

... more VoiceXML here

Note the use of the getValue function to access a property in the list. For more rendering information, study the JSP pages supplied with the OSD installation.
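The JSP fragment writes property values straight into XML attributes, so a value containing a quote, an ampersand, or an angle bracket would produce a malformed page. The sketch below shows one way to escape values before rendering the <object> element. It is an illustration only: the Map stands in for the IPropertyList, and the class and method names (ObjectElementSketch, escapeAttribute, renderObject) are hypothetical, not part of OSD.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ObjectElementSketch {

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute value. */
    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders an <object> element with one <param> child per property, in map order. */
    static String renderObject(String objectName, Map<String, String> properties) {
        StringBuilder xml = new StringBuilder();
        xml.append("<object name=\"").append(escapeAttribute(objectName)).append("\">\n");
        for (Map.Entry<String, String> p : properties.entrySet()) {
            xml.append("  <param name=\"").append(escapeAttribute(p.getKey()))
               .append("\" value=\"").append(escapeAttribute(p.getValue()))
               .append("\"/>\n");
        }
        xml.append("</object>");
        return xml.toString();
    }

    public static void main(String[] args) {
        // A LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("a", "42");
        props.put("b", "43");
        System.out.println(renderObject("X", props));
    }
}
```

In a JSP page, the same escaping helper could wrap each props.getValue call before the value is written into the attribute.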



Using custom Render Data objects

The example scenario above shows how to use a system-defined Render Data object to create a custom render component. You can also use arbitrary objects as your own Render Data object. The only requirement is that the key for storing these objects in the StepResponse must not conflict with any of the render keys defined by xHMI (see the appendix of reserved names in the xHMI Reference Guide).

To add your object to the response, make a call like this inside the execute function of a node:

response.put("myKey", new MyClass());



Chapter 10

The OSD Datamodel

This chapter describes the model for handling application data in OSD and xHMI, including how to declare and use variables, access data from application components, and define new datatypes.

Overview of variables and data storage

The xHMI configuration uses the <var> element to define variables. OSD provides the necessary runtime classes to make variables available to any ECMAScript or Java code in your application.

OSD and xHMI store the following kinds of data:

■ Recognition data (results from the recognizer). In xHMI, you define this data as attribute variables. For example:

<var name="destination" type="attribute" expr="'London'">
  <param name="temporary" expr="true"/>
  <param name="verified" expr="false"/>
  <param name="homophone" expr="true"/>
</var>

■ Application data of common datatypes: integers, doubles, booleans, and strings. For example, this includes counters for purposes like retries, no-match recognitions, and how often a node has been entered. You can also store static strings to hold presentation data (such as the application name). In xHMI, you define this data as typed variables. For example:

<var-list>
  <var name="maxIndex" type="int" expr="17" />
  <var name="counter" type="int" expr="maxIndex + 2" />
  <var name="pi" type="double" expr="3.1415" />
  <var name="isValid" type="boolean" expr="true" />
  <var name="version" type="string" expr="'MyApp Version 5'" />
</var-list>

■ Application data of complex datatypes: you can define Java classes for any complex type, and then define xHMI variables of these types. For example:

<var name="rc" type="class:org.examples.beans.MyComplexType">
  <param name="age" expr="1000000000"/>
  <param name="message" expr="'hello World'"/>
  <param name="version" expr="1.0"/>
</var>

■ ECMAScript variables. You can define and use data within ECMAScript, and you can use xHMI variables in ECMAScript expressions. See Accessing variables with ECMAScript.

Datamodel error events

OSD throws error events for datamodel errors just as it would any error. For a description of events, see <catch> in the xHMI Reference Guide. Applications should always catch the error.datamodel root event or at least the general error event.1

Accessing variables with xHMI

Use the <var> element to define variables in your xHMI configuration. Within the <var>, use ECMAScript expressions to assign initial values, or omit the expressions so the variables take default values. For details on <var>, see the xHMI Reference Guide.

Here is an integer variable initialized to the number 42:

<var name="myCounter" type="int" expr="42"/>

Here is a recognition attribute variable:

<var name="myAttribute" type="attribute">
  <param name="temporary" expr="true"/>
</var>

Below is an attribute facade variable. This example refers to a fictitious class org.example.MyFacade as an example of a user-defined Java extension to OSD. The class accepts the parameter namelist, which is defined in the extension. The namelist is a string with attribute names separated by white space:

<var name="myFacade" type="facade:org.example.MyFacade">
  <param name="namelist" expr="'myAttribute'"/>
</var>

1 Some severe errors (for example, errors in xHMI configuration files) throw the general error event. The event message provides details. When enabled, the diagnostic log (osd.log) contains an exception trace.

Accessing variables with ECMAScript

Your ECMAScript expressions can access any variable of any datatype created with the <var> element. (Of course, the expressions can define new variables too.) For example, counter is an integer variable defined by the <var> element:

<script>
  if (counter==0) {
    // ...
  }
  counter = counter + 1;
</script>

Below, the ECMAScript appears in a conditional target. If myAttribute is defined, the path is taken:

<target condexpr="myAttribute.def()" path="#next"/>

Here is a guard condition. The node is entered if myAttribute is not yet defined:

<node id="n1" class="Collection">
  <guard condexpr="myAttribute.undef()"/>
  ...
</node>

Here is a script that creates the ECMAScript variable z and assigns it the first-best result associated with the xHMI variable myAttribute:

<script>
  var z = myAttribute.best();
  ...
</script>

The following example shows a complex datatype called agent. You could create such a datatype by extending org.example.beans.MyComplexType with the identifier agent. You must define the variable before using it in a script; for example:

<var name="agent" type="class:org.example.beans.MyComplexType"/>

The example sets three properties of the agent: age, version, and message. (Not shown are the variable declarations, which must be done before the script executes.)

<var name="agent" type="class:org.example.beans.MyComplexType">
  <param name="age" expr="1000000000"/>
  <param name="message" expr="'hello World'"/>
  <param name="version" expr="1.0"/>
</var>

Here is the implementation:

package org.example.beans;

public class MyComplexType {

    private int age = 0;
    private String message = "";
    private float version = 0.1f;

    public void setAge(Number age) {
        this.age = age.intValue();
    }

    public Number getAge() {
        return new Integer(this.age);
    }

    public void setVersion(Number version) {
        this.version = version.floatValue();
    }

    public Number getVersion() {
        return new Float(this.version);
    }

    // Setter for the message property (matches getMessage below).
    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return this.message;
    }
}
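Setter names matter in a bean like this one if the framework applies each <param> by calling the setter named after it. The sketch below illustrates that idea with plain Java reflection. It is an assumption that OSD follows this JavaBeans-style convention; the Agent class and the applyParam helper are hypothetical and exist only for this example.

```java
import java.lang.reflect.Method;

class BeanParamSketch {

    /** Minimal bean in the style of MyComplexType (hypothetical, for illustration). */
    public static class Agent {
        private int age = 0;
        private String message = "";

        public void setAge(Number age) { this.age = age.intValue(); }
        public Number getAge() { return Integer.valueOf(age); }
        public void setMessage(String message) { this.message = message; }
        public String getMessage() { return message; }
    }

    /** Calls the public one-argument setter named after the parameter, e.g. "age" -> setAge. */
    static void applyParam(Object bean, String name, Object value) {
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : bean.getClass().getMethods()) {
            Class<?>[] types = m.getParameterTypes();
            if (m.getName().equals(setter) && types.length == 1 && types[0].isInstance(value)) {
                try {
                    m.invoke(bean, value);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Setter " + setter + " failed", e);
                }
                return;
            }
        }
        // No matching setter: the parameter cannot be applied to this bean.
        throw new IllegalArgumentException(
                "No setter " + setter + " accepting " + value.getClass().getName());
    }

    public static void main(String[] args) {
        Agent agent = new Agent();
        applyParam(agent, "age", Integer.valueOf(42));
        applyParam(agent, "message", "hello World");
        System.out.println(agent.getAge() + " " + agent.getMessage());
    }
}
```

Under this convention, a parameter without a matching setter (for example, "version" on the Agent class above) fails at configuration time, which is why each property needs a correctly named setter of a compatible type.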



Accessing variables with Java code

This section describes how to access xHMI and ECMAScript data from any Java class created by your application or development platform. You can create classes for any purpose, including:

■ Complex datatypes
■ Custom nodes
■ Attribute facades
■ Update rules
■ Extensions to the nodes, facades, and update rules provided with OSD

When creating a class, use the IDataModel interface to gain access to the datamodel. In general, this means the following:

IDataModel model = // get datamodel instance
Object varObj = model.get("myAttribute");
AttributeBean attr = (AttributeBean) varObj;
// use the attribute

Access from a custom node

To get the datamodel from a custom node, extend DialogNode and use the getDataModel method. For example:

public class MyNode extends DialogNode {
    public void execute(IStepRequest request,
                        IStepResponse response) {
        IDataModel model = super.getDataModel();
        // the model is accessed
        // now get any data in the model
        ...
    }
}

Above, DialogNode is the base class for all nodes. Use getDataModel to return the current scope of the IDataModel instance. The getDataModel method has the signature:

public IDataModel getDataModel();

Access from an update rule

To write to the datamodel from an update rule, implement the IUpdateRule interface. For simplicity, also implement the IDataModelAccess interface to automatically reference the datamodel in the instantiated rule before the rule is initialized. (For information on update rules, see the xHMI Reference Guide.)

For example:

public class MyUpdateRule implements IUpdateRule, IDataModelAccess {

    private IDataModel model = null;
    private String attrName = null;

    // OSD calls this method after you create an instance of this
    // update rule and before it calls the init method.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    // Initialize this instance of the update rule, taking a list
    // of attribute names separated by whitespace.
    // This implementation only expects one attribute on the list.
    public void init(String variableNames) throws DialogException {
        this.attrName = variableNames.split("\\s")[0];
    }

    // Define behavior when update rule runs as a pre-update rule.
    public void preUpdate(INbestResult nbestResult, ISessionFrame sf)
            throws DialogException {
        AttributeBean attr = (AttributeBean) model.get(attrName);
        // work with the attribute bean
        ...
    }

    public void postUpdate(INbestResult nbestResult, ISessionFrame sf)
            throws DialogException {
    }
}

Access from an attribute facade

To get the datamodel from an attribute facade, implement the IAttributeFacade interface. For simplicity, also implement the IDataModelAccess interface to automatically reference the datamodel in the instantiated facade before using the facade. For example:

public class MyAttributeFacade implements IAttributeFacade, IDataModelAccess {

    private IDataModel model = null;

    // OSD calls this method after you create an instance of this
    // facade and before the facade is used.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    // A user-defined method for this attribute facade.
    public String someFacadeMethod() {
        AttributeBean attr = (AttributeBean) model.get("myAttribute");
        return attr.best();
    }
}

Above, DefaultAttributeFacade is the base class. Use getDataModel to return the current scope of the IDataModel instance. The getDataModel method has the signature:

public IDataModel getDataModel();

Access from a custom bean

Your JavaBeans must implement the interface IDataModelAccess. OSD uses dependency injection to provide IDataModel references to instances of each bean. For example, this bean implements the interface, sets the model, and gets an xHMI variable named myAttribute:

public class MyComplexType implements IDataModelAccess {
    private IDataModel model = null;

    // OSD calls this method after it creates an instance of this bean.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    public String getAttrValue() {
        AttributeBean attr = (AttributeBean) model.get("myAttribute");
        // Return the best choice from the recognition result.
        return attr.best();
    }
    // ... see myComplexType in Writing your own bean
}

The getAttrValue method is only defined in this class (for bean instances). You can use it in xHMI as follows:

<script>rc.getAttrValue()</script>

ISessionFrame methods

amb(java.lang.String qname)–returns true if the specified attribute's first best value is ambiguous, false otherwise. Returns a boolean.



best(java.lang.String qname)–returns the first best value of an attribute or null if it does not exist in the current node, one of the dialogs on the stack or in global scope. Returns a java.lang.String.

conf(java.lang.String qname)–returns the confidence (0..1.0) of the first best value of an attribute or 0 if it does not exist in the current node, one of the dialogs on the stack or in global scope. Returns a double.

def(java.lang.String qname)–checks whether attribute is defined. Returns a boolean.

hom(java.lang.String qname)–returns true if the specified attribute's first best value is a homophone, false otherwise. Returns a boolean.

nbest(java.lang.String qname, int order)–returns the n-th best value of an attribute or null if the attribute does not exist. Returns a java.lang.String.

nconf(java.lang.String qname, int order)–returns the confidence of the n-th best item of an attribute or 0 if the attribute does not exist. Returns a double.

nsay(java.lang.String qname, int order)–returns the representation of the best items in a format that is suitable for output generation. Returns a java.lang.String.

say(java.lang.String qname)–returns the representation of the best items in a format that is suitable for output generation. Returns a java.lang.String.

undef(java.lang.String qname)–checks whether an attribute is not defined. Returns a boolean.

unver(java.lang.String qname)–checks whether attribute is not verified. Returns a boolean.

ver(java.lang.String qname)–checks whether attribute is verified. Returns a boolean.

Differences between a.best() and s.best(‘a’)

The following expressions have the same result:

mySlot.best()
s.best('mySlot')

The differences between the expressions are:

■ The expression mySlot.best() uses simpler syntax. If the variable is undefined, the expression throws error.script.execution.

■ You can use s.best('mySlot') even when mySlot is not defined (in which case the ISessionFrame instance returns ‘null’).
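The difference between the two forms can be modeled in plain Java. The sketch below is illustrative only: the nested SessionFrame class is a hypothetical stand-in backed by a map, bestOfVariable imitates the direct mySlot.best() form by raising an exception where OSD would throw error.script.execution, and bestOrDefault is an invented helper that shows a null-safe use of the lookup form.

```java
import java.util.HashMap;
import java.util.Map;

class BestLookupSketch {

    /** Hypothetical stand-in for the session frame: maps attribute names to first-best values. */
    static class SessionFrame {
        private final Map<String, String> firstBest = new HashMap<>();

        void define(String qname, String value) {
            firstBest.put(qname, value);
        }

        /** Like s.best('name'): returns null when the attribute is not defined. */
        String best(String qname) {
            return firstBest.get(qname);
        }

        /** Like name.best(): fails when the variable itself does not exist. */
        String bestOfVariable(String qname) {
            if (!firstBest.containsKey(qname)) {
                throw new IllegalStateException(
                        "error.script.execution: " + qname + " is not defined");
            }
            return firstBest.get(qname);
        }
    }

    /** Null-safe use of the lookup form: falls back to a default when nothing is defined. */
    static String bestOrDefault(SessionFrame s, String qname, String fallback) {
        String value = s.best(qname);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        SessionFrame s = new SessionFrame();
        s.define("mySlot", "London");
        System.out.println(bestOrDefault(s, "mySlot", "unknown"));
        System.out.println(bestOrDefault(s, "otherSlot", "unknown"));
    }
}
```

The lookup form suits code that may run before a slot has been declared, while the direct form is shorter when the variable is known to exist.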

AttributeBean methods

Method                 Returns    Comparable to
clear()                void       <clear name…/>
amb()                  boolean    s.amb
best()                 String     s.best..
nbest(int)             String     s.nbest…
say()                  String     s.say..
nsay(int)              String     s.nsay…
conf()                 double     s.conf..
nconf(int)             double     s.nconf…
def()                  boolean    s.def..
undef()                boolean    s.undef..
ver()                  boolean    s.ver..
unver()                boolean    s.unver..
getAmbiguousCount()    int        s.amb…
skip(String)           void       IAttribute#skip(…); adds value … to the skip list of this attribute bean

Writing a factory

You can define a complex variable in the xHMI configuration, and then use a factory in OSD to instantiate a JavaBean object of the desired class in the datamodel.

OSD provides the following predefined factories:

■ Class factory
■ Facade factory

In addition, you can create factories of your own, and use these factories to create classes conveniently without knowing their concrete type.



When do you need a factory?

You need a factory when the <var> element does not provide the needed datatype. For example:

■ If you anticipate creating objects with many parameters, then a factory simplifies and significantly reduces the xHMI configuration.

■ If you need access to an external registry like JNDI or Spring, a factory creates a bridge between the registry and the contents of the xHMI datamodel. This is possible because a factory can contain arbitrary Java code that accesses xHMI variables and interacts with the external entity.

Steps for writing a factory

Steps for writing and using factories:

1 Implement the interface IDataElementFactory. See Implementing IDataElementFactory.

2 Add the new class to the osd-config.xml configuration file. See Configuring factories.

3 In the xHMI configuration, create variables with the complex datatype pointing to the class. See the example in Overview of variables and data storage, or see the <var> element in the xHMI Reference Guide.

Implementing IDataElementFactory

To write a factory, implement the interface IDataElementFactory. The interface defines a create method that your implementation must provide to create objects of the desired type.

Any JavaBeans or factories that you create must have a default constructor in the implementation. (Other constructors are also allowed.) OSD uses the default constructor to initialize variables defined with <var name="a" type="z"/>. In other words:

■ When you do not define expr, OSD uses the default constructor.

■ When you define expr, OSD uses the constructor for the type of the expr result. If this constructor is not present, OSD throws the error.datamodel event.
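The two constructor rules above can be sketched with Java reflection. This is a minimal illustration, not OSD's code: the instantiate helper is hypothetical, it matches the constructor parameter against the exact class of the expr result (an assumption about how strictly OSD matches), and an IllegalStateException stands in for the error.datamodel event.

```java
import java.lang.reflect.Constructor;
import java.math.BigDecimal;
import java.util.ArrayList;

class ConstructorChoiceSketch {

    /**
     * Creates an instance of the given type. Without an expr result, the default
     * constructor is used; otherwise the public one-argument constructor whose
     * parameter is exactly the class of the expr result is used.
     */
    static Object instantiate(Class<?> type, Object exprResult) {
        try {
            if (exprResult == null) {
                return type.getConstructor().newInstance();
            }
            Constructor<?> ctor = type.getConstructor(exprResult.getClass());
            return ctor.newInstance(exprResult);
        } catch (ReflectiveOperationException e) {
            // Stand-in for the error.datamodel event described in the text.
            throw new IllegalStateException(
                    "error.datamodel: cannot create " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // No expr: default constructor.
        System.out.println(instantiate(ArrayList.class, null));
        // expr evaluates to a String: BigDecimal(String) is selected.
        System.out.println(instantiate(BigDecimal.class, "3.1415"));
    }
}
```

A type that lacks the required constructor, such as a class with no public default constructor used without expr, fails in the same way as a missing expr-typed constructor.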

The interface is defined as:

public interface IDataElementFactory {
    /**
     * This method creates an instance based on a specific scheme.
     * For example, the scheme 'class' would call the class loader.
     * This loads the class and creates an instance
     * of the loaded class.
     */
    public Object create(String uri, IDataModel dm);
}

Here is an example implementation of a class factory:

package com.scansoft.osd.datamodel;

public class ClassFactory implements IDataElementFactory {
    public Object create (String uri, IDataModel dm)
            throws Exception {
        return getClass().getClassLoader().loadClass(uri).newInstance();
    }
}

You can create an instance of the class in xHMI as follows:

<var name="a" type="class:org.example.beans.MyComplexType"/>

The factory lifecycle

The framework instantiates factories when executing their corresponding <var> elements, and then removes them after building the variable. Therefore, your factory implementations must not store data for individual application sessions.

Configuring factories

After creating a new IDataElementFactory class, you must configure it in the /WEB-INF/lib/osd-config.xml configuration file.

During runtime initialization, the osd-config.xml configuration file defines the available datatypes and their classes. For example:

<osd-config>
  <var-types>
    <!-- predefined datatypes provided with OSD -->
    <var-type name='int' class='java.lang.Integer'/>
    <var-type name='string' class='java.lang.String'/>
    <var-type name='double' class='java.lang.Double'/>
    <var-type name='boolean' class='java.lang.Boolean'/>
    <var-type name='attribute'
              class='com.scansoft.osd.datamodel.AttributeBean'/>
  </var-types>

  <var-factories>
    <!-- predefined factories provided with OSD -->
    <var-factory name='class'
                 class='com.scansoft.osd.datamodel.ClassFactory'/>
    <var-factory name='facade'
                 class='com.scansoft.osd.datamodel.FacadeFactory'/>
    <!-- example user-defined factory -->
    <var-factory name='record'
                 class='org.examples.beans.MyComplexType'/>
  </var-factories>
</osd-config>

Above, you can create new types and classes, and insert them into the configuration file. Also, you can replace any class with an implementation of your own. For example, if you implement a variant for integers, you can map int to a class other than java.lang.Integer.
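One way to picture how a type such as class:org.example.beans.MyComplexType reaches the right factory is a registry keyed by the scheme name, mirroring the var-factory entries in the configuration file. The sketch below is an assumption about the mechanism rather than OSD's code: ElementFactory is a hypothetical, reduced stand-in for IDataElementFactory (the real create method also receives an IDataModel), and FactoryRegistrySketch and classFactory are invented names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

class FactoryRegistrySketch {

    /** Hypothetical, reduced stand-in for IDataElementFactory. */
    interface ElementFactory {
        Object create(String uri);
    }

    private final Map<String, ElementFactory> factories = new HashMap<>();

    /** Registers a factory under a scheme name, like a var-factory entry. */
    void register(String scheme, ElementFactory factory) {
        factories.put(scheme, factory);
    }

    /** Splits a type such as "class:java.util.ArrayList" at the first colon and delegates. */
    Object create(String type) {
        int colon = type.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No scheme in type: " + type);
        }
        String scheme = type.substring(0, colon);
        ElementFactory factory = factories.get(scheme);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown scheme: " + scheme);
        }
        return factory.create(type.substring(colon + 1));
    }

    /** A factory in the spirit of the class factory: loads the class and calls its default constructor. */
    static ElementFactory classFactory() {
        return uri -> {
            try {
                return Class.forName(uri).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + uri, e);
            }
        };
    }

    public static void main(String[] args) {
        FactoryRegistrySketch registry = new FactoryRegistrySketch();
        registry.register("class", classFactory());
        Object list = registry.create("class:java.util.ArrayList");
        System.out.println(list instanceof ArrayList);
    }
}
```

Replacing a predefined factory then amounts to registering a different implementation under the same scheme name.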

Using the OSD datamodel interface

To access the OSD datamodel, use the interface IDataModel in your Java code.

The OSD datamodel is responsible for serving declared data elements to other modules of the running system. (See Accessing variables with Java code.) The IDataModel interface provides access to the data elements that are stored in the datamodel. Available methods in IDataModel:

contains(name)–Indicates whether the named variable is defined or not.

get(name)–Gets the value of name. Returns null if name does not exist.

getNames()–Gets the names of all defined variables.

getType(name)–Gets the datatype of name.

set(name, value)–Sets the value of name.

size()–Gets the number of defined variables.

store(name, datatype, value)–Creates a variable of the given type and stores the value. Once stored, you can update the value with the set method.

In addition, to ensure correct processing of back processing and the servlet, your application must also implement the following: Cloneable and Serializable.
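To make the contract of these methods concrete, here is a small map-backed model. It is a sketch under stated assumptions, not the OSD implementation: the parameter and return types (for example, Set<String> for getNames and a String datatype name) are guesses, and rejecting set for a variable that was never stored is an assumed behavior chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class DataModelSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, String> types = new LinkedHashMap<>();

    /** Indicates whether the named variable is defined. */
    boolean contains(String name) {
        return values.containsKey(name);
    }

    /** Gets the value of the variable, or null if it does not exist. */
    Object get(String name) {
        return values.get(name);
    }

    /** Gets the names of all defined variables, in declaration order. */
    Set<String> getNames() {
        return values.keySet();
    }

    /** Gets the datatype name recorded when the variable was stored. */
    String getType(String name) {
        return types.get(name);
    }

    /** Gets the number of defined variables. */
    int size() {
        return values.size();
    }

    /** Creates a variable of the given type and stores its value. */
    void store(String name, String datatype, Object value) {
        types.put(name, datatype);
        values.put(name, value);
    }

    /** Updates an existing variable; rejecting unknown names is an assumption of this sketch. */
    void set(String name, Object value) {
        if (!values.containsKey(name)) {
            throw new IllegalArgumentException("Undeclared variable: " + name);
        }
        values.put(name, value);
    }

    public static void main(String[] args) {
        DataModelSketch dm = new DataModelSketch();
        dm.store("counter", "int", 17);
        dm.set("counter", 18);
        System.out.println(dm.getNames() + " " + dm.get("counter"));
    }
}
```

The store-then-set sequence in main follows the order the method list describes: a variable is created once with store and updated afterwards with set.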

Implementing IDataModelAccess

Applications must implement this interface in any object (such as a bean, update rule, attribute facade, or class factory) that accesses the OSD datamodel and SessionFrame interfaces. In response, OSD automatically injects the datamodel into the object when the object is created.

public interface IDataModelAccess {
    public void setDataModel(IDataModel dm) throws Exception;
}

The injected instance of the datamodel is appropriately scoped, and all variables are available. This includes OSD variables such as __callId, __callerId, and __calledId (as described in the xHMI Reference Guide).
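The injection step can be pictured as a check the framework performs right after it creates an object. The sketch below is a simplified illustration: the nested IDataModel and IDataModelAccess interfaces are stand-ins that mirror the names in this chapter (IDataModel is reduced to a single get method), and the inject helper and CallerBean class are hypothetical.

```java
class InjectionSketch {

    /** Reduced stand-in for IDataModel. */
    interface IDataModel {
        Object get(String name);
    }

    /** Stand-in that mirrors the interface shown in the text. */
    interface IDataModelAccess {
        void setDataModel(IDataModel dm) throws Exception;
    }

    /** A bean that reads an OSD variable from the injected datamodel. */
    static class CallerBean implements IDataModelAccess {
        private IDataModel model;

        public void setDataModel(IDataModel dm) {
            this.model = dm;
        }

        String describeCaller() {
            return "Caller: " + model.get("__callerId");
        }
    }

    /** What a framework might do after instantiating an object: inject only if supported. */
    static <T> T inject(T instance, IDataModel dm) {
        if (instance instanceof IDataModelAccess) {
            try {
                ((IDataModelAccess) instance).setDataModel(dm);
            } catch (Exception e) {
                throw new IllegalStateException("Datamodel injection failed", e);
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        // A one-variable datamodel, sufficient for this example.
        IDataModel dm = name -> "__callerId".equals(name) ? "5551234" : null;
        CallerBean bean = inject(new CallerBean(), dm);
        System.out.println(bean.describeCaller());
    }
}
```

Objects that do not implement the interface pass through unchanged, which is why only classes that need the datamodel have to declare it.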

Example user-defined JNDI factory

A convenient way of sharing data among OSD applications is to package the data in a JNDI object and access the data with a JNDI factory in each application. OSD does not provide a JNDI factory, but the following example shows how to create one.

JNDI is the Java Naming and Directory Interface provided by Sun Microsystems, Inc. For details, see these links:

■ http://java.sun.com/products/jndi/
■ http://java.sun.com/j2se/1.4.2/docs/api/javax/naming/package-summary.html

1 Create the JNDI factory class:

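import javax.naming.InitialContext;
import javax.naming.NamingException;

public class JNDIFactory implements IDataElementFactory {

    // Looks up the object bound to the scheme-specific part of the type
    // (for example, "weatherforecast" in type="jndi:weatherforecast").
    public Object create(String schemeSpecificPart, IDataModel dm) {
        try {
            return new InitialContext().lookup(schemeSpecificPart);
        } catch (NamingException e) {
            throw new RuntimeException("JNDI lookup failed: " + schemeSpecificPart, e);
        }
    }
}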

2 Add the factory to the osd-config.xml configuration file:

<osd-config>
  <var-types>
    ...
  </var-types>
  <var-factories>
    ...
    <var-factory name="jndi" class="com.scansoft.osd.datamodel.JNDIFactory"/>
    ...
  </var-factories>
</osd-config>

3 Define a JNDI bean in xHMI:

<var name="myBean" type="jndi:weatherforecast"/>



Above, the example gets weatherforecast from the JNDI local context, which must contain an instance of the org.examples.jndi.WeatherService class. Then, you can use myBean to access the methods of the class as follows (we assume the class implements a temperature method):

<script>
  var t = myBean.temperature();
</script>



Chapter 11

FAQ

This chapter provides answers to typical questions that might arise.

Evaluating variables and making logic decisions

Q: Can I create a node that doesn't do any prompting or recognition but only tests a variable and branches accordingly?

A: There are two ways to do this:

a Write a Java node with an empty execute function and a transition function that looks like this:

public String transition(StepResponse response) throws DialogException
{
    return ("expr");
}

In xHMI <transition>, use the "condexpr" attribute of <target> to write some conditions using ECMAScript.

This approach has the advantage that you only need one Java class (independent from the conditions), because the conditions are expressed in xHMI.

The disadvantage is that sometimes the expressions might get too complicated to be written and debugged in ECMAScript.

b Write a Java node with an empty execute function and a transition function that checks your condition:

public String transition(StepResponse response) throws DialogException
{
    String value = getSessionFrame().best("varname");
    if ("G".equals(value)) {
        return "good";
    } else if ("B".equals(value)) {
        return "bad";
    }
    // Every path must return a transition property; this fallback
    // is an assumption, so choose one that suits your application.
    return "bad";
}

In the <transition> you can branch on the transition property:

<transition>
  <next name="good">
    <target path="#node_g"/>
  </next>
  <next name="bad">
    <target path="#node_b"/>
  </next>
</transition>

Q: How can I specify branching conditions in XML when a node always returns the same transition property?

A: You can use more than one <target> element within a <next> element, each with a condition. See the xHMI Reference Guide for details.

Accessing variables in ECMAScript expressions

Q: How can I access dialog variables in ECMAScript expressions used in xHMI? Can I have my own objects there?

A: OSD does not give application developers direct access to the ECMAScript context. Instead, all data of an application needs to be stored in the SessionFrame. This is required so that the application state can be made persistent for session fail-over.

The SessionFrame can be accessed with the ECMAScript variable name "s". In the Java code of a node, it can be accessed with getSessionFrame.

The variables stored in the SessionFrame are of class type "Attribute". An Attribute has n-best lists of strings together with their confidences. The SessionFrame object has convenience methods that allow a shorthand notation for accessing the best value of an Attribute or its confidence. For example:

s.best('varname')

gives you the first best value of the attribute. This is equivalent to:

$s.get('varname').getFirstBest().getText()

You can also put arbitrary objects into the SessionFrame with getObject and putObject. These are stored in a hashtable. ECMAScript has no casts, so in ECMAScript you would access them with:

s.getObject('myname').getSomeMember()

Declaring custom classes in xHMI

Q: Can I declare custom classes in xHMI and use them as dialog session data?

A: No, you can only declare attributes. These correspond to the com.scansoft.xhmi.IAttribute interface.

Creating attributes via Java

Q: Can attributes be added from the Java code?

A: Yes, they can. Use the addAttribute and assignAttribute methods of the class com.scansoft.xhmi.ISessionFrame. To access the SessionFrame, call the getSessionFrame method. For example, this method adds a new attribute and assigns a value to it:

public void addAttribute(String qname, String value)

Configuring OSDM parameters dynamically

Q: How can I set a parameter for an OSDM call dynamically?

A: You can either use an ECMAScript expression in the value of a <property> or do it in Java.

To do it in Java, create a new node that extends com.scansoft.osd.nodes.OSDM, and override the method updateProperties(OSDMCallDesc osdmCallDesc). You can use getProperty and putProperty of OSDMCallDesc to change properties.

Timing of updates in the SessionFrame

Q: When does the update of the attributes in the SessionFrame occur?

A: Short answer: after the call to execute and before the call to transition. In more detail: a node renders a VoiceXML page via a JSP, the VoiceXML browser executes the page, gathers input from the caller, and submits the recognition result back to the server. The HTTP request that reaches the server is transformed into a StepRequest for the DialogManager. The DialogManager reads the semantic attributes in the StepRequest and uses them to update the SessionFrame. After the status has been updated, the framework checks whether an event was raised and calls a <catch> handler if a matching handler is present. The framework then calls the transition function of the current node. With the result of the transition function and the content of the global and node-specific transition maps in the XML, the framework decides which node to call next.

Changing the provided rendering jsp

Q: Can I change the provided rendering jsp?

A: The OSD rendering engine uses jsp pages to create dynamic VoiceXML pages. One typical question from application developers working with OSD 1.0 is whether they can change the provided jsp pages. The answer is this: the jsp pages supplied should be sufficient for any type of application. Although it is possible for application developers to create custom jsp pages (for reasons described below), we strongly recommend that you do not adapt the provided jsp pages:

■ Your custom pages will not benefit from feature updates and bug fixes in future OSD releases.

■ Your custom jsp pages are required to support OSI logging. This is an error-prone task that easily leads to a situation where problems occur only very late in the development process (for example, only when logged data is analyzed).



In rare circumstances, application developers might want to create their own pages. Example reasons:

■ To use a VoiceXML object tag to trigger non-VoiceXML functions of the browser platform.

■ To use custom ECMAScript on the generated page.

■ To generate different mark-up such as SALT.

Recognizing long utterances with robust parsing

Q: My application uses a robust parsing grammar, but the recognizer seems to reject all long utterances. What is wrong?

A: With robust parsing grammars and OSR versions 3.0.3 or 3.0.4, you should set the reserved property _confidenceForMatch to 0.0. Otherwise, your VoiceXML browser might always reject long utterances.

Background information: often, the recognizer assigns a sentence confidence of 0 or just above 0.0 to long utterances. If the application specifies a _confidenceForMatch threshold, then these sentences will have a confidence that is below the threshold. If you do not have access to the slot confidences (OSR 3.0.3), OSD automatically uses the sentence confidence as confidence for each of the slots. In that case you also need to set _confidenceForDefined to 0.0.
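The effect described above can be illustrated with a few lines of Java. This is only a sketch: the class and method names are hypothetical and are not part of the OSD API, and it assumes that a result is accepted when its confidence is at least the configured threshold. Only the property names _confidenceForMatch and _confidenceForDefined come from this guide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; none of these names belong to the OSD API.
final class ConfidenceSketch {

    // Assumption: an utterance is accepted when its sentence confidence
    // is at least the _confidenceForMatch threshold.
    static boolean isMatch(double sentenceConfidence, double confidenceForMatch) {
        return sentenceConfidence >= confidenceForMatch;
    }

    // When the recognizer supplies no slot confidences (slotConfidences is null
    // or has no entry for a slot), the slot inherits the sentence confidence.
    static Map<String, Double> effectiveSlotConfidences(
            Iterable<String> slotNames,
            Map<String, Double> slotConfidences,
            double sentenceConfidence) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String slot : slotNames) {
            Double own = (slotConfidences == null) ? null : slotConfidences.get(slot);
            result.put(slot, own != null ? own : sentenceConfidence);
        }
        return result;
    }

    public static void main(String[] args) {
        double longUtterance = 0.0; // sentence confidence often seen for long utterances
        System.out.println(isMatch(longUtterance, 0.5)); // prints false
        System.out.println(isMatch(longUtterance, 0.0)); // prints true
    }
}
```

With a threshold of 0.0, the long utterance is no longer rejected. When the slots inherit the same sentence confidence, the threshold that applies to them must be lowered in the same way.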



Chapter 12

Getting started with development

An xHMI speech or multi-modal application typically consists of one or more xHMI files, a web deployment descriptor file (web.xml), grammar files, audio files, Java files for custom nodes and backend logic.

It is beyond the scope of this document to describe the development process for such an application in detail. However, it is assumed that the development process roughly follows the process in the picture below. Usually such a process is iterative: a version of the application is created with a subset of the functionality, then tested, refined, and tested again, and so on until the application meets all test criteria.


Speech application development lifecycle

We assume a simple, iterative development lifecycle consisting of Design, Development, and Deployment phases. Speech applications also require tuning iterations during deployment. The following figure shows the lifecycle in detail:

In this document, we focus on the aspects of the lifecycle that are closely related to xHMI. (These are depicted in the white boxes).

Application design

Application design includes the specification of every possible interaction with end-users: the prompts, the speech grammar coverage, and the expected recognition results. All these specifications are reflected in the application’s xHMI configuration.

At the highest level of organization, the xHMI configuration consists of dialogs and nodes: major branches of the application are represented by <dialog> elements; individual interactions are represented by <node> elements.

The callflow (as embodied in dialogs and nodes) can be highly conditionalized. That is, the nodes that are visited and the prompts that are played can change depending on the current state of the session with the caller. This is a key strength of OSD applications; they are not limited to linear (directed-dialog) callflow models. Instead, OSD applications are most versatile when employed for information-driven or state-driven models.

For example, you might design a directed-dialog as the default callflow:

System Do you want ice cream or pizza?

User Ice cream

System Cone or cup?

User Cup.

System Large, medium, or small?

User Small.

The default callflow can be conditionalized (in the xHMI configuration instead of the application code) depending on the data collected (an information-driven dialog):

User Ice cream cone.

System Large, medium, or small?
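The logic behind an information-driven dialog can be sketched in plain Java. In an OSD application this behavior is configured in xHMI rather than coded, so the following is only an illustration of the idea; the class name and the slot names (item, container, size) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of an information-driven callflow; not OSD code.
final class IceCreamOrderSketch {

    // Questions keyed by the piece of information they collect, in asking order.
    private static final Map<String, String> QUESTIONS = new LinkedHashMap<>();
    static {
        QUESTIONS.put("item", "Do you want ice cream or pizza?");
        QUESTIONS.put("container", "Cone or cup?");
        QUESTIONS.put("size", "Large, medium, or small?");
    }

    // Returns the prompt for the first piece of information that is still
    // missing, or null when everything has been collected.
    static String nextPrompt(Map<String, String> collected) {
        for (Map.Entry<String, String> question : QUESTIONS.entrySet()) {
            if (!collected.containsKey(question.getKey())) {
                return question.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> collected = new LinkedHashMap<>();
        // "Ice cream cone." fills two pieces of information at once.
        collected.put("item", "ice cream");
        collected.put("container", "cone");
        System.out.println(nextPrompt(collected)); // prints: Large, medium, or small?
    }
}
```

Because the next prompt depends only on what is still missing, the caller can supply the pieces in any order or several at once.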



OSD relies on a design that defines the individual pieces of information needed, that collects the pieces, and that changes depending on which pieces have been collected. For example, the previous example might have evolved as follows:

User Small ice cream.

System Cone or cup?

Design the callflow

The UI designer specifies the conversational flow (the callflow) of user sessions. This includes the introductory prompts, the possible main branches of the callflow, the collection of information, the sequence of that collection, and the expected vocabularies to be verified.

The following list shows rules of thumb when designing application callflows. The terms in parentheses show the xHMI configuration elements associated with these User Interface (UI) design tasks:

■ Define the branches of the application, the tasks to be accomplished in each branch, and the primitive activities that comprise each task.

■ Each branch of the application is a single, self-contained dialog (<dialog>).

■ Each primitive activity in the callflow is a node (<node>). Each node collects information, sends information, interacts with a database, or calls an object such as a SpeechPAK or an OpenSpeech DialogModule (OSDM).

One goal is to determine the optimal size of a task by balancing tasks that are small enough to be generic and reusable yet large enough to perform a meaningful transaction.

■ Define the root dialog of the application, and the root nodes of each dialog:

■ Define a root dialog, the first dialog that runs at the start of every session.

■ Define a root node inside each dialog, the first node that runs in the dialog.

■ Specify the transitions (<transition>) between the nodes. A transition happens when all callflow activity is completed inside the current dialog and node.


■ Define the possible states (success, failure, and so on) at the end of each node.

■ Define the possible targets (<target>) for the next dialog or node.

■ Define the conditions for choosing the correct target when transitioning from the current node or dialog.

■ Define how error situations are handled (<catch>). Consider an operator fallback.

■ Define the information collected by each node, and the likely vocabulary words used by callers. Specify a label for each piece of data (<var>).

■ Define an OpenSpeech Insight (OSI) transaction for each high-level application branch, mid-level application task, and detailed-level node. To prove that the callflow meets application objectives, there should be at least one transaction for each application requirement. Transactions are logged (<log>), added to OSI databases, and used for generating reports (for example, transaction success rates) and for tuning (for example, locating and correcting dialogs and nodes with high failure rates).

Design the prompts and speech grammars

For each node, specify the initial prompt, retry prompts, success and failure prompts, and any other expected prompts.

Whenever the application collects information, specify the utterances that you expect to collect from callers. Categorize the vocabulary as follows:

■ Globally recognized (for example, for commands and shortcuts).

■ Statically recognized (the expected utterances are known in advance and the speech grammar can be written before the runtime session begins).

■ Dynamically recognized (the expected utterances are identified during the session and the speech grammar must be written at runtime by the application).
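For the dynamically recognized category, the application has to produce grammar text during the session. The following sketch shows one way this could look, assuming the phrases are available as a list of strings. The helper class is hypothetical and not part of OSD; the grammar header attributes follow the W3C SRGS XML format used elsewhere in this guide, and the rule id is assumed to be a valid XML name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of OSD. Builds a minimal SRGS XML grammar
// from a list of phrases that is only known at runtime.
final class DynamicGrammarSketch {

    // Escapes the characters that are significant in XML text content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Assumption: ruleId is a valid XML name, so it is inserted unescaped.
    static String buildGrammar(String ruleId, List<String> phrases) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<grammar version=\"1.0\" xmlns=\"http://www.w3.org/2001/06/grammar\"\n");
        sb.append("    xml:lang=\"en-US\" mode=\"voice\" root=\"").append(ruleId).append("\">\n");
        sb.append("  <rule id=\"").append(ruleId).append("\">\n");
        sb.append("    <one-of>\n");
        for (String phrase : phrases) {
            sb.append("      <item>").append(escapeXml(phrase)).append("</item>\n");
        }
        sb.append("    </one-of>\n");
        sb.append("  </rule>\n");
        sb.append("</grammar>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example: names looked up in a database during the session.
        System.out.print(buildGrammar("names", Arrays.asList("john smith", "mary jones")));
    }
}
```

A real grammar would normally also assign semantic results to each item; that part is omitted here because it depends on the recognizer and on the slots the node expects.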

You can write highly-constrained or minimally-constrained speech grammars. This includes robust parsing grammars (grammars that distinguish meaningful phrases within longer utterances) and natural language grammars (grammars that categorize each utterance based on large samplings of possible sentences and their intended meanings).



Application development

Steps for coding an xHMI application:

1 Create a directory structure.

2 Configure the application (create xHMI files).

3 Test the application callflow.

4 Implement grammars. (Not described in this guide.)

5 Create recordings. (Not described in this guide.)

Create a directory structure

For new xHMI applications, we recommend a directory structure as follows:

myapp
    audio
    grammars
    log
    xhmi
        app
        dtd
        inc
    WEB-INF
        lib
        src
        test
            src

Here are details:

myapp–Storage of jsp pages (or links to the pages). The jsp pages from install-dir\voiceXML\jsp need to be copied (or linked) into the application root directory. If you use Ant as a build tool, the Ant build.xml goes into this directory.

myapp/audio–Contains waveforms (caller utterances) collected by the application.

myapp/grammars–Contains speech and DTMF grammars.

myapp/log–Contains application-dependent log files. Use this directory for diagnostic log files created by log4j. The log4j.properties file supplied in the /samples directory shows how to set the log file location. For more information, see Diagnostic logging on page 58.



myapp/xhmi/app–Contains appconfig.xhmi (the xHMI configuration file) and global.prop (global properties).

myapp/xhmi/dtd–Contains a copy of (or link to) the xHMI DTD file.

myapp/xhmi/inc–Contains xHMI include files.

myapp/WEB-INF–Contains files needed by the application including classfiles, libraries, taglib tld files, and files for any custom nodes such as a database access node. Some specific files in the directory:

■ log4j.properties–The log4j configuration properties.

■ messages.xml–Localized alarm messages in different languages.

■ web.xml–The web application configuration file.

■ A copy (or link) of the tag descriptor file: install-dir\voiceXML\jsp\WEB-INF\xhmi-voicexml.tld

myapp/WEB-INF/lib–Contains copies of (or links to) all OSD jar files. The files are located in the following directories:

install-dir\lib
install-dir\Shared\java\lib\ext

myapp/WEB-INF/src–The source code for any customization needed by the application (for example, for a custom database access node).

myapp/WEB-INF/test–Contains all code for unit tests (source code, DTDs, data, and so on). See Testing on page 143.

myapp/WEB-INF/test/src–Contains only the source code for unit test classes. See Testing on page 143.
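The directory layout described above can be created by hand or with a small helper such as the following sketch. The class is hypothetical and not part of OSD; it only creates empty directories and does not copy the jsp pages, DTD, or jar files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of OSD. Creates the recommended directory
// skeleton below a given application root.
final class AppSkeletonSketch {

    static final String[] DIRECTORIES = {
        "audio", "grammars", "log",
        "xhmi/app", "xhmi/dtd", "xhmi/inc",
        "WEB-INF/lib", "WEB-INF/src", "WEB-INF/test/src"
    };

    static void create(Path appRoot) {
        try {
            for (String dir : DIRECTORIES) {
                // createDirectories also creates missing parents and
                // does nothing if the directory already exists.
                Files.createDirectories(appRoot.resolve(dir));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Uses "myapp" in the current directory unless a root is given.
        create(Paths.get(args.length > 0 ? args[0] : "myapp"));
    }
}
```

Running the helper twice is harmless, so it can also be used to complete a partially created structure.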

Configure the application (create xHMI files)

You can use any text editor to create xHMI files. Because the syntax is XML, you can minimize application load errors by using an XML editor that supports DTD or W3C Schema validation.

Validating with a DTD

The dtd (Document Type Definition) for xHMI configurations can be found in <installdir>/system/xhmi.dtd. It consists of two entities: xhmi_main.dtd and custom.dtd. We recommend you copy these three files into a directory relative to the location of your xHMI file. In your application’s xHMI file you can then use a DOCTYPE declaration such as the following:

<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">



The dtd can be extended by declaring new elements in the custom.dtd file. A typical use case is when a node requires a new xml element inside <config>. The dtd for these elements accepts any element, so declaring them in custom.dtd will suffice.
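If no validating XML editor is at hand, the standard JAXP parser that ships with Java can perform the same DTD check. The following is a minimal sketch; the class name is hypothetical and the code is not part of OSD. It validates a file against whatever DTD its DOCTYPE declaration names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper; not part of OSD. Uses the standard JAXP parser to
// validate an XML file against the DTD named in its DOCTYPE declaration.
final class DtdCheckSketch {

    // Returns the problems found; an empty list means the file is valid.
    static List<String> validate(File xmlFile) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // enable DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override
                public void error(SAXParseException e) { problems.add(describe(e)); }
                @Override
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            // Parsing from a File lets a relative system identifier such as
            // "dtd/xhmi.dtd" resolve against the location of the xHMI file.
            builder.parse(xmlFile);
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: DtdCheckSketch <file>");
            return;
        }
        for (String problem : validate(new File(args[0]))) {
            System.out.println(problem);
        }
    }
}
```

This only checks conformance to the DTD. The semantic checks performed by the OSD validate tool (see Appendix B) are not covered.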

Validating with a W3C schema

A mechanism similar to the dtd is provided for schema validation. The schema files (*.xsd) can also be found in <installdir>/system.

For schema validation, declare attributes in the xHMI root element (instead of using DOCTYPE) as shown in the following example:

<xhmi root="Main"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.scansoft.com/2004/xhmi xhmi.xsd">

Implementing custom nodes

If the application needs to access some backend logic, such as a transactional system or a database, custom nodes are used to implement the desired behavior. Sometimes an application developer might also choose to implement decision logic in Java instead of building the logic with ECMAScript expressions in xHMI. Any Java IDE, such as Eclipse, can be used for this step.

See also, Application development topics on page 85.

General activities

Design the behavior of the application, and divide the main user tasks into separate modules. In xHMI, use a separate <dialog> for each module, and within each <dialog>, use a <node> for each phase of interaction with the user.

For example, a corporate speech application might have these modules:

■ greetings and security checks

■ corporate information and promotions

■ employee directory

■ technical support

■ good-byes and follow-up

Each module has a node for each subtask. For example, “greetings and security checks” might have nodes for public greeting, internal greeting, and security password.

Within each <node>, define the needed configuration, prompts, slots, grammars, and transitions. To re-use definitions, define them at the dialog or global scopes.



Test the application callflow

An advantage of the xHMI and OSD architecture is the ability to test the callflow of an application before completing the grammars and prompts. These tests can be done as a test against a Java API (see Testing on page 143).

Application deployment and tuning

Once all parts of an application are completed, the application can be deployed to a web server.

See Deployment to a web server on page 71 for details.



Chapter 13

Testing

Testing can be divided into several sub processes:

■ grammar testing—testing that the grammar delivers the expected semantic results with a variety of different utterances

■ call-flow test—testing that the application behaves in the way expected with a variety of semantic inputs

■ backend interface test—testing the access to a backend system

■ system test—testing the complete system, usually by making telephony calls

■ acceptance test—testing the system at the customer's premises

A speech application test is often executed by making phone calls into a live system. The system comprises a number of complex sub-systems, such as the telephony switch, a VoiceXML gateway with a browser, an application server, and a backend system such as a database. Testing an application under development in this way can be a cumbersome procedure: whenever the application shows an error, the error has to be fixed and the process of deploying and calling has to be repeated.

With OSD you simplify the process by executing the call-flow tests at the Java API level, without the need to deploy the application. In this way it is possible to create automatic regression tests for an application.


From the xHMI architecture, recall the separation of the Front-end Controller and the DialogManager:

[Figure: xHMI architecture with controller and model layers. Labeled elements: FrontController, interface (a), DialogManager, SessionFrame, Application (*.xHMI), DialogNode, Render Data Objects. Labeled actions: execute, create, update.]

At the IDialogManagerInvocation interface (a), many aspects of an application can be examined. We will use this interface for callflow testing:

[Figure: Tester, interface (a), DialogManager.]

The general flow of such a test is the following:

■ Create and initialize an application object. The xHMI file is read during initialization.

■ Create a DialogManager, passing in the application object.

■ Make a start request by calling the nextStep method with a StepRequest.

■ Test that the StepResponse contains the expected Render Data Objects and test their content.

■ Set up the semantic input, that is, the results that would have been created by the recognizer.

■ Make a further request supplying the results.

During execution of the nextStep method, the DialogManager executes one or several nodes. If one of these nodes accesses the backend system, this access is included in call-flow testing. The StepResponse object contains the Render Data Objects that would have been used to render the markup in a deployed system. The information that can be retrieved from these objects includes, for example, a list of activated grammars and a list of prompts. The OSD library contains a support class that facilitates access to these objects. See class com.scansoft.osd.test.TstSupport.

A test program can also access the SessionFrame object with:

ISessionFrame sessionFrame = dialogManager.getSessionFrame();

From the SessionFrame, the current dialog and node names can be read as well as the SessionFrame attributes and user objects. The following is a small example of what such a test looks like. You can find a more comprehensive example in a subfolder of the pizza sample application: \samples\pizza\test\src\com\scansoft\osd\samples\pizza\TestPizzaOrder

Both examples are using JUnit, a popular unit test framework (see www.junit.org). However, any other suitable test software can also be used.

public void testFirstInitialOutput() {
    try {
        String configFileName = appRoot + "/output.xhmi";
        Application app = new Application();
        app.init(appRoot, configFileName);
        DialogManager dm = new DialogManager(app, "ABC", "usrid", "1235",
            "callerId", "calledId", "callId", null);
        dm.init();
        StepRequest request = new StepRequest();
        StepResponse response = dm.nextStep(request);
        assertEquals("__exitDialog", dm.getSessionFrame().getCurrentDialogName());
        assertEquals("__exitNode", dm.getSessionFrame().getCurrentNodeName());
        assertEquals(2, TstSupport.getNumberOfQueuedOutputs(response));
        assertEquals("Hello World from Nuance", TstSupport.getFirstQueuedOutput(response));
        assertEquals("GoodBye", TstSupport.getQueuedOutput(response, 1));
        assertTrue(dm.isTerminated());
    } catch (Exception e) {
        fail(e.getMessage());
    }
}



Appendix A

Predefined properties

This appendix lists property names that are predefined by OSD.

Miscellaneous properties

Ambiguous recognition results

Applications can use the following properties to handle ambiguous recognition results for speech that has more than one meaning:

■ ambiguousSeparatorChar

■ ambiguousGroupSeparatorChar

For example, consider this conversation:

System Where will you pick up the car in Germany?

User In Frankfurt.

Above, the user’s response is ambiguous; it could refer to Frankfurt Oder or Frankfurt Main. A well-written speech grammar will detect ambiguous meanings and provide all possibilities in the recognition result; the grammar concatenates the meanings. For example, the top next-best item might appear as follows:

0 Frankfurt|frankfurt oder#frankfurt main


The application can configure guard conditions to control how the ambiguous meanings should be distinguished.

Above, the pound sign (#) and pipe symbol (the vertical bar, |) are used inside the speech grammar as delimiters:

■ The pound sign (#) is a delimiter between ambiguous items (frankfurt oder and frankfurt main). To use a different separator, you must configure that delimiter with the following OSD property:

ambiguousSeparatorChar

■ The pipe symbol (|) is a delimiter between an ambiguous group (Frankfurt) and the ambiguous items. To use a different separator, you must configure that delimiter with the following OSD property:

ambiguousGroupSeparatorChar
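To make the role of the two delimiters concrete, the following sketch splits a concatenated value into its group and its items. OSD performs this processing internally; the class below is hypothetical and serves only as an illustration. It assumes that a value without a group separator is a plain list of items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the OSD API.
final class AmbiguousValueSketch {
    final String group;       // for example "Frankfurt"; null if no group separator is present
    final List<String> items; // for example ["frankfurt oder", "frankfurt main"]

    private AmbiguousValueSketch(String group, List<String> items) {
        this.group = group;
        this.items = items;
    }

    // groupSeparator corresponds to ambiguousGroupSeparatorChar,
    // itemSeparator to ambiguousSeparatorChar.
    static AmbiguousValueSketch parse(String value, char groupSeparator, char itemSeparator) {
        int bar = value.indexOf(groupSeparator);
        String group = (bar < 0) ? null : value.substring(0, bar);
        String rest = (bar < 0) ? value : value.substring(bar + 1);
        List<String> items = new ArrayList<>();
        int start = 0;
        // Split by hand so that separator characters need no regex escaping.
        for (int i = 0; i <= rest.length(); i++) {
            if (i == rest.length() || rest.charAt(i) == itemSeparator) {
                items.add(rest.substring(start, i));
                start = i + 1;
            }
        }
        return new AmbiguousValueSketch(group, items);
    }

    public static void main(String[] args) {
        AmbiguousValueSketch v = parse("Frankfurt|frankfurt oder#frankfurt main", '|', '#');
        System.out.println(v.group + " -> " + v.items);
    }
}
```

If the grammar uses different delimiters, the same characters must be passed here and configured in the two OSD properties.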

Skip list processing

The asrSideSkipList and serverSideSkipList properties control where skip list processing occurs, either on the OSD server or on the recognition server. For details, see Controlling where skip list processing occurs on page 87.

Properties for OpenSpeech Insight logging (OSI)

All time values are specified in milliseconds. All real numbers are specified in a simple format such as "1.0"; exponential formats are not supported.

Property                  Description

OSIEventLogDestination    The file or path for the log file. Use this for server-side OSI logging.

OSILogMode                Selects the type of OSI logging that is done automatically. Values are system (default) or app. When set to system, a more complete set of events is automatically logged. When set to app, only basic events are written.

OSILogServer              When true, this property enables server-side OSI logging. The default is "false".

OSIVarNameUniqueCallID    This property defines the name of the VoiceXML variable that holds a unique call ID at any time during page execution. There is no default value. OSD checks for an ID set via a shadow variable in the platform adaptor and also checks for a global xHMI property.

OSIUrlCallStart           This property selects the web application that handles the start of call event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.

OSIUrlCallEnd             This property selects the web application that handles the end of call event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.

OSIUrlLogApplication      This property selects the web application that handles the general application event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.

Here are example property definitions:

<property name="OSIUrlCallStart" value="%{osdm_server}/osd-osilogger/sessionstart"/>

<property name="OSIUrlCallEnd" value="%{osdm_server}/osd-osilogger/sessionend"/>

<property name="OSIUrlLogApplication" value="%{osdm_server}/osd-osilogger/log"/>

Appendix B

Command line tools

This appendix summarizes available command line tools.

Summary of command line tools

OSD provides the following command line tools:

Tool             File       Purpose

Recording List   rl         Extracts all prompts from an xHMI configuration.

Grammar List     gl         Extracts all grammar filenames from an xHMI configuration.

Validate         validate   Validates the configuration described in xHMI files.

Prerequisites

To run the command line tools, your system requires the following:

■ Java SDK version 1.4.2 or higher with java.exe in the current path.

■ OSD version 1.1. This release sets the environment variable SWIOSD to the OSD base directory and adds SWIOSD\bin to the path.

Recording list tool (listing prompts for the recording studio)

This command line tool extracts all prompts from the xHMI configuration and writes them to a text file. For example, the list might be used when making audio files in a recording studio.

Windows command:

rl.bat (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

Linux command:

rl.sh (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

The tool reads the inputfile and extracts all outputs. Any output that has an “id” and an <audio> tag is written to the outputfile. The input and output parameters are required.

The <infile> is the name of the xHMI configuration file. The <outfile> is the name of the recording list that will be created.

Given the following output definition at node scope in the configuration file /xyz/appconfig.xhmi:

<output id="id1"><audio src="uri">text</audio></output>

(If there is no audio file at the URI, then the text is played.)

The recording list generator generates the following file:

file                  dialog    node       output-id   soundfile   text

/xyz/appconfig.xhmi   DlgName   NodeName   id1         uri         text

This file now serves as input for the recording of the outputs. If the output is not specified in a node/dialog then the node/dialog fields contain the value none.

Here is a more interesting example:

<output id="bla">
    <audio src="problems.wav">We could not understand you.</audio>
    <audio src="operator.wav">An operator will assist you soon.</audio>
</output>



Screen output:

Warning: Found an <output> element with multiple <audio> elements

Warning: Found name mismatch: output id is 'bla', but audio file is called 'problems.wav'

Warning: Found name mismatch: output id is 'bla', but audio file is called 'operator.wav'

Recording file:

file dialog node output-id soundfile text

appconfig.xhmi OperatorTransfer doTransfer bla problems.wav We could not understand you.

appconfig.xhmi OperatorTransfer doTransfer bla operator.wav An operator will assist you soon.


Grammar List tool (lists grammars in an xHMI file)

This command line tool extracts all grammar filenames from the xHMI configuration and writes them to a text file as tab-separated items. For example, the list might be used in a functional specification for the person who writes the grammars.

Windows command:

gl.bat (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

Linux command:

gl.sh (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

The input and output parameters are required.

The <infile> is the name of the xHMI configuration file. The <outfile> is the filename of the grammar list that will be created.

This example shows a sample output file (the count of slots can vary).

dialog id   node id   grm id     grm name         slot 0         slot 1         grammar path

none        none      restart    restart.grxml                                  path/to/grammars
none        none      help       __help.grxml     __help                        path/to/grammars
none        none      starkey    starkey.grxml    __help                        path/to/grammars
none        none      shortcut   shortcut.grxml   pizzasize      pizzatopping   path/to/grammars
Order       size      size       size.grxml       pizzasize                     path/to/grammars
Order       topping   none       topping.grxml    pizzatopping                  path/to/grammars
Order       confirm   yes_no     Boolean          yesNo                         builtin:grammar



Validate tool (validating xHMI configuration files)

This command line tool validates the configuration files of an xHMI application. The validation is performed on all xHMI files included in the application (via the xi:include statement).

Windows command:

validate.bat <filename>

Linux command:

validate.sh <filename>

The <filename> is the name of the xHMI configuration file.

Errors are written to stderr; warnings are written to stdout.

The tool performs numerous validations; not all are documented.

File-level validations:

■ Validation of the file against the DTD

■ Validation of the reference to the root dialog (ensures the root dialog is defined)

■ Validation of all references to root nodes in dialogs (ensures root nodes exist)

■ Validation of path attributes of all nodes (ensures the specified paths exist)

Within the file, the validation includes:

■ check for duplicate symbols

■ validate all <fills> elements to ensure that the name attributes used are previously declared in that scope

■ validate all <verify> elements to ensure that the actor attributes used are previously declared in that scope

■ validate all <verify> elements to ensure that the vcl attributes used are previously declared in any scope

■ validate all <verify> elements to ensure that all names in the vcl attribute are activated by grammars in visible scopes

■ validate all <understand> elements to ensure that the attributes used in the namelist are previously declared in that scope

■ check all <property> elements for suspicious names. For example:


■ When their names suggest that they are meant to override xHMI system properties but are missing the leading underscore character (_).

■ When their names match a discontinued OSI logging property name.

■ validate values of properties

■ check that each grammar (<grm> element) has a <fills> element as a child, except when the <fills> element is optional (when the <grm> element has an event attribute).

■ check consistency of all instances of verifyOutput against the VCL. This ensures that the used attributes or facades exist in one of the verify-output-list elements (on different scopes).

■ for <verify> ensure that the vcl and appendvcl attributes are not used at the same time; also, ensure that any appended attributes are also declared.

This example shows the output of a successful validation:

> validate appconfig.xhmi

Validation result:0 fatal errors,0 errors,0 warnings

This example shows the output of a failed validation:

> validate appconfig.xhmi

Error: Duplicated definition of node 'ask_for_location' (previously declared in file 'appconfig.xhmi' in line 128)

Error: The attribute 'undefined_attribute' cannot be resolved.

Validation result:0 fatal errors,2 errors,0 warnings
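When the validate tool is run from a build script, the summary line can be evaluated automatically. The following sketch parses that line with a regular expression. The class is hypothetical and not part of OSD, and the pattern tolerates optional whitespace and singular forms because the exact layout of the tool output is an assumption based on the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of OSD. Extracts the counts from the
// "Validation result" summary printed by the validate tool.
final class ValidationSummarySketch {

    // Assumption: the summary has the shape shown in the examples above;
    // whitespace and line breaks between the parts are tolerated.
    private static final Pattern SUMMARY = Pattern.compile(
        "Validation result:\\s*(\\d+)\\s*fatal errors?,\\s*(\\d+)\\s*errors?,\\s*(\\d+)\\s*warnings?");

    // Returns {fatalErrors, errors, warnings}, or null if no summary is found.
    static int[] parse(String toolOutput) {
        Matcher m = SUMMARY.matcher(toolOutput);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    // True only when a summary was found and it reports no fatal errors and no errors.
    static boolean passed(String toolOutput) {
        int[] counts = parse(toolOutput);
        return counts != null && counts[0] == 0 && counts[1] == 0;
    }

    public static void main(String[] args) {
        System.out.println(passed("Validation result:0 fatal errors,2 errors,0 warnings")); // prints false
    }
}
```

Treating a missing summary as a failure is a deliberate choice in this sketch: if the tool did not run at all, the build should not continue silently.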



Appendix C

Timestamp abbreviations

When using the OSD-provided classes for date and time, dates can be abbreviated as defined in the following grammar:

S -> fulldate | datetime | datez | timez | date | time

fulldate -> date ' ' time ' ' z

datetime -> date ' ' time

datez -> date ' ' z

timez -> time ' ' z

date -> year date2 | date2

date2 -> '-' month date3 | date3

date3 -> '-' day | *eps*

time -> hour time2 | time2

time2 -> ':' minute time3 | time3

time3 -> ':' second | *eps*

year -> ('0' | … | '9') year | ('0' | … | '9')

month -> ('0' | … | '9') ('0' | … | '9')

day -> ('0' | … | '9') ('0' | … | '9')

hour -> ('0' | … | '9') ('0' | … | '9')

minute -> ('0' | … | '9') ('0' | … | '9')

second -> ('0' | … | '9') ('0' | … | '9')

z -> ( '+' | '-' ) ( hour ':' minute | ':' minute | hour | hour ':' )

Above, *eps* is the empty word and terminal symbols are quoted. Other symbols are rule references.
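The date and z rules of this grammar translate directly into regular expressions. The following sketch checks whether a string belongs to either rule; it does not use the OSD date and time classes, and the class name is hypothetical. It assumes that the colon in the z rule is written without a trailing space.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; independent of the OSD date and time classes.
final class TimestampAbbreviationSketch {

    // date  -> year date2 | date2
    // date2 -> '-' month date3 | date3
    // date3 -> '-' day | *eps*
    // An optional run of digits followed by up to two "-dd" groups.
    private static final Pattern DATE = Pattern.compile("\\d*(-\\d{2}){0,2}");

    // z -> ('+' | '-') (hour ':' minute | ':' minute | hour | hour ':')
    private static final Pattern ZONE = Pattern.compile("[+-](\\d{2}:\\d{2}|:\\d{2}|\\d{2}:?)");

    static boolean isDate(String s) {
        return DATE.matcher(s).matches();
    }

    static boolean isZone(String s) {
        return ZONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDate("2007-07-16")); // prints true
        System.out.println(isDate("2007-7"));     // prints false: month needs two digits
        System.out.println(isZone("+02:00"));     // prints true
    }
}
```

Note that the date rule also derives the empty word, so isDate accepts an empty string. The sketch only tests membership; it does not decide which field an abbreviated value stands for.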



Appendix D

Negative confirmations

When using robust parsing grammars (sometimes called open grammars), the ROOT rule is not a rule but a set of concepts. Although the following grammar is complete, the example does not show the additional files needed for robust parsing (fsm, wordlist, and userdict). See the OSR Grammar Developer’s Guide for details.

Instead of a single ROOT rule, the ROOT rule is divided into individual rules; each rule is called by a rule-ref tag inside a concept tag. The concept tag then uses ECMAScript to copy the values into the returned slots.

<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="concepts">

  <meta name="swirec_user_dict_name" content="my.userdict"/>
  <meta name="swirec_fsm_grammar" content="some.fsm"/>
  <meta name="swirec_fsm_wordlist" content="some.wordlist"/>

  <conceptset id="concepts" xmlns="http://www.scansoft.com/grammar">
    <concept>
      <ruleref uri="#r_origin"/>
      <tag>origin = r_origin.origin;</tag>
    </concept>

    <concept>
      <ruleref uri="#r_origin_destination"/>
      <tag>
        origin = r_origin_destination.origin;
        destination = r_origin_destination.destination;
      </tag>
    </concept>

    <concept>
      <ruleref uri="#neg_origin_destination"/>
      <tag>
        var tmp = '';
        if (neg_origin_destination.origin != undefined)
        {
          tmp += (tmp != '' ? '**' : '') + 'origin*' + neg_origin_destination.origin;
        }
        if (neg_origin_destination.destination != undefined)
        {
          tmp += (tmp != '' ? '**' : '') + 'destination*' + neg_origin_destination.destination;
        }
        if (tmp != '')
        {
          NEG_GROUP = tmp;
        }
      </tag>
    </concept>

    <concept>
      <ruleref uri="#neg_origin"/>
      <tag>NEG_origin = neg_origin.NEG_origin;</tag>
    </concept>
  </conceptset>

  <rule id="neg_origin_destination">
    <item>
      <ruleref uri="#r_not"/>
      <ruleref uri="#r_origin_destination"/>
      <tag>
        origin = r_origin_destination.origin;
        destination = r_origin_destination.destination;
      </tag>
    </item>
  </rule>

  <rule id="neg_origin">
    <item>
      <ruleref uri="#r_not"/>
      <ruleref uri="#r_origin"/>
      <tag>NEG_origin = r_origin.origin;</tag>
    </item>
  </rule>

  <rule id="r_not">
    <one-of>
      <item>not</item>
    </one-of>
  </rule>

  <rule id="r_origin">
    <item>from <ruleref uri="#r_cities"/> <tag>origin=r_cities.v;</tag></item>
  </rule>

  <rule id="r_origin_destination">
    <item>
      from <ruleref uri="#r_cities"/> <tag>origin=r_cities.v;</tag>
      to <ruleref uri="#r_cities"/> <tag>destination=r_cities.v;</tag>
    </item>
  </rule>

  <rule id="r_cities">
    <one-of>
      <item>boston<tag>v='boston';</tag></item>
      <item>austin<tag>v='austin';</tag></item>
      <item>houston<tag>v='houston';</tag></item>
    </one-of>
  </rule>
</grammar>

The order of the concepts is defined by the training of the voice model for a robust parsing grammar and not by the order of the tags.
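The grammar above packs the negated attributes into a single NEG_GROUP string: each attribute name is joined to its value with one asterisk, and attributes are joined to each other with two asterisks (for example, origin*boston**destination*austin). The following sketch unpacks such a string. It illustrates the string layout only; the class is hypothetical, and how OSD itself consumes NEG_GROUP is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the NEG_GROUP string layout; not OSD code.
final class NegGroupSketch {

    // Splits "name*value**name*value" into an ordered name-to-value map.
    // Assumption: names and values do not themselves contain asterisks.
    static Map<String, String> parse(String negGroup) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : negGroup.split("\\*\\*")) {
            int star = pair.indexOf('*');
            if (star > 0) {
                result.put(pair.substring(0, star).trim(), pair.substring(star + 1).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "not from boston to austin" would yield this value in the grammar above.
        System.out.println(parse("origin*boston**destination*austin"));
        // prints {origin=boston, destination=austin}
    }
}
```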

