Implementation Notes for VXI

Overview

Open VXI code organization

Classes

Our implementation consists of three main classes. The class VXI holds most of the methods for processing VoiceXML elements and the Form Interpretation Algorithm (FIA). It has very few instance variables; the most important of these is the channel, which contains the platform handles required by the rec, prompt, tel, and object interfaces.

The class ExeCont represents the execution context and contains most of the state information relevant to document execution. It also abstracts functionality relating to event and prompt counts, form item variables, and interactions with the ECMAScript context.

The class GInfo represents information relating to a particular grammar, including its scope, its grammar handle (as returned by recLoadGrammar), and its target (in the case of a <link> or <choice> grammar). This class also contains most of the methods for building grammars for menus and option lists.

Aside from these three, there is a utility class mediating XML attribute access, and a few utility classes local to the prompt processing section (VXI_prompt.cpp).

Internal Type System

Control Flow (VXI_doc.cpp, VXI_form.cpp)

The main control sequence for VoiceXML processing consists of three nested loops: the document loop, the dialog loop, and the item loop. These loops are implemented in VXI_doc.cpp and VXI_form.cpp.

The document loop loads and executes documents; each iteration of the loop produces the next document to load. When there is no new document to load, the loop exits.

The dialog loop is called for each document (possibly with a dialog name). It looks for the proper dialog to execute and enters it. Generally, this loop has only one iteration, as the executed dialog typically determines the next document, dialog, or item to run. If the current dialog produces no next target, the dialog loop exits.

The item loop is essentially the FIA. It iterates over the form items in a dialog and ends when all form items are complete or a new target (item, dialog, or URL) is specified. Note: Event handling is done primarily in this loop, and catch handlers result in either continuation of this loop or a new target. A new target is passed up through the enclosing loops and processed in the appropriate loop depending on its type: items in the FIA, dialogs (URL fragments) in the dialog loop, and URLs in the document loop.

Communication in and out of the three control loops is handled by the execution context (class ExeCont) and a return code. Each loop returns a return code to its caller and handles the return codes from its inner loops. Based on these return codes and the values in the execution context, control is passed up to the proper level. The return codes are enumerated in vxi.hpp and are:

With the exception of RTN_GOTO, handling these is straightforward. In the case of RTN_GOTO, the target can be a field, a local dialog, or a dialog in a new document. Each loop must check the target type and determine whether the control flow should be handled locally or passed up to the enclosing loops.
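The handling of RTN_GOTO across the three loops can be illustrated with a minimal sketch. This is not the Open VXI code: apart from RTN_GOTO, every name here (RTN_DONE, TargetKind, Target, Context, executeItem, and the loop functions) is hypothetical, and the scripted list of targets merely stands in for what real form items would produce. Each loop handles the target kind it owns and returns anything else to its caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Only RTN_GOTO is named in these notes; RTN_DONE is an assumed placeholder.
enum ReturnCode { RTN_DONE, RTN_GOTO };

enum class TargetKind { Item, Dialog, Document };

struct Target {
    TargetKind kind;
    std::string name;
};

// Stand-in for the part of the execution context that carries the pending target.
struct Context {
    std::vector<Target> script;            // scripted gotos, consumed in order
    std::size_t next = 0;
    Target target{TargetKind::Item, ""};   // most recent goto target
    std::vector<std::string> log;          // which loop handled each goto
};

// Pretends to run one form item: yields the next scripted goto, or finishes.
ReturnCode executeItem(Context& ctx) {
    if (ctx.next < ctx.script.size()) {
        ctx.target = ctx.script[ctx.next++];
        return RTN_GOTO;
    }
    return RTN_DONE;
}

// Item loop: item targets are handled here, everything else is passed up.
ReturnCode itemLoop(Context& ctx) {
    for (;;) {
        ReturnCode rc = executeItem(ctx);
        if (rc == RTN_GOTO && ctx.target.kind == TargetKind::Item) {
            ctx.log.push_back("item:" + ctx.target.name);
            continue;
        }
        return rc;
    }
}

// Dialog loop: dialog targets are handled here, document targets are passed up.
ReturnCode dialogLoop(Context& ctx) {
    for (;;) {
        ReturnCode rc = itemLoop(ctx);
        if (rc == RTN_GOTO && ctx.target.kind == TargetKind::Dialog) {
            ctx.log.push_back("dialog:" + ctx.target.name);
            continue;
        }
        return rc;
    }
}

// Document loop: the outermost level, so any goto arriving here is a document.
void documentLoop(Context& ctx) {
    for (;;) {
        ReturnCode rc = dialogLoop(ctx);
        if (rc != RTN_GOTO) return;
        assert(ctx.target.kind == TargetKind::Document);
        ctx.log.push_back("document:" + ctx.target.name);
    }
}
```

Filling a Context's script with an item target, a document target, and a dialog target and calling documentLoop() records each goto in the log under the loop that consumed it.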

Document Processing (VXI_doc.cpp)

Form and Field Item (VXI_form.cpp)

The main form processing loop fia_loop() follows the Form Interpretation Algorithm as described in Appendix 1 of the VoiceXML specification, although the loop is structured slightly differently.

Each iteration through the loop checks for pending events. If an event is found, it is processed, with control either resuming in the FIA loop or transferring due to a <goto> etc.

If no events are pending, the loop checks for field results, whether from recognition, subdialog calls, or other field item execution. If a result is pending, it is processed (see Result Processing), with control again either continuing in the FIA or being explicitly transferred.

If no events or results are pending, the FIA loop selects a form item to process. If a target item has been specified by a control element (<goto> etc.), this item is selected. Otherwise, the field item variables and guard conditions of each item are tested and the first unfilled and enabled item is selected. If no item is selected, the loop exits.
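The selection rule described above can be sketched as follows. This is a simplified illustration rather than the code in VXI_form.cpp: FormItem, selectItem, and selectionExample are hypothetical names, and the two booleans stand in for evaluating the form item variable and the guard condition in the ECMAScript context.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a form item; these names are not from Open VXI.
struct FormItem {
    std::string name;
    bool filled;    // stands in for "the form item variable is defined"
    bool enabled;   // stands in for "the guard condition evaluates to true"
};

// Returns the index of the item to visit, or -1 when nothing is selected
// (the point at which the loop described above would exit).
int selectItem(const std::vector<FormItem>& items, const std::string& gotoTarget) {
    // An explicit target from a control element takes priority.
    if (!gotoTarget.empty()) {
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (items[i].name == gotoTarget) return static_cast<int>(i);
        }
        return -1;  // unknown target: error handling is outside this sketch
    }
    // Otherwise pick the first item that is both unfilled and enabled.
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (!items[i].filled && items[i].enabled) return static_cast<int>(i);
    }
    return -1;
}

// Usage example with made-up item names.
void selectionExample() {
    std::vector<FormItem> items = {
        {"city", true,  true},   // already filled, skipped
        {"date", false, false},  // guard condition false, skipped
        {"time", false, true},   // first unfilled and enabled item
    };
    assert(selectItem(items, "") == 2);
    assert(selectItem(items, "city") == 0);  // explicit target wins
}
```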

If an item is selected, the "collect phase" for that item is performed. Most form items (except <block>) have a similar collect phase which consists of

The <block> element simply sets its guard variable, processes its executable content, and returns.

Executable Content (VXI_exe.cpp)

Executable content consists of the elements: These elements are implemented in the file VXI_exe.cpp.

Executable content occurs inside <block>, <filled>, and <catch> elements, and recursively inside <if> elements. Processing of these elements is straightforward. The <var>, <assign>, <script>, <clear>, and <reprompt> elements produce side effects in either the ECMAScript environment or the execution context (class ExeCont). The <exit>, <throw>, <goto>, <submit>, and <return> elements cause control to leave the executable content.

In the current release, control transfers out of executable content sections are implemented by a C++ throw of a VXI return code (see VXI.hpp for enumeration). They could equally well be implemented by checking ordinary return values at each execution step, though this would be tedious. As a consequence, those places in the code that can invoke executable content are wrapped in try/catch blocks. These are the three main control loops, as well as the initial document download and setup (which can invoke <catch> elements through download failure or other errors). Control transfers are caught and handled as described in the sections above.
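The throw-based control transfer can be sketched as below. This is an illustration under stated assumptions, not the code in VXI_exe.cpp: RTN_OK and RTN_EXIT are assumed enumerators (only RTN_GOTO is named in these notes), and Element, Context, execute, and runContent are hypothetical. The sketch shows why a throw is convenient: a <goto> nested inside an <if> unwinds through the recursion without every level checking a return value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// RTN_GOTO is named in these notes; RTN_OK and RTN_EXIT are assumed here.
enum ReturnCode { RTN_OK, RTN_EXIT, RTN_GOTO };

// Hypothetical parsed element: a tag name plus nested executable content.
struct Element {
    std::string tag;
    std::vector<Element> children;  // used by "if" in this sketch
};

// Stand-in for the execution context; it only records what was executed.
struct Context {
    std::vector<std::string> executed;
};

// Executes a list of elements. Side-effect elements return normally;
// control-transfer elements throw a return code.
void execute(const std::vector<Element>& content, Context& ctx) {
    for (const Element& e : content) {
        assert(!e.tag.empty());  // every element needs a tag
        ctx.executed.push_back(e.tag);
        if (e.tag == "goto") throw RTN_GOTO;
        if (e.tag == "exit") throw RTN_EXIT;
        if (e.tag == "if") execute(e.children, ctx);  // condition assumed true
    }
}

// A caller that can invoke executable content wraps the call in try/catch
// and converts the thrown code back into an ordinary return value.
ReturnCode runContent(const std::vector<Element>& content, Context& ctx) {
    try {
        execute(content, ctx);
    } catch (ReturnCode rc) {
        return rc;
    }
    return RTN_OK;
}
```

In this sketch, elements that follow the throwing element, at any nesting depth, are never executed, which mirrors control leaving the executable content.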

Descriptions of the implementation of individual executable elements can be found in comments within the source code.

Grammar Processing (VXI_grammar.cpp)

Result Processing (VXI_result.cpp)

Prompt Generation (VXI_prompt.cpp)

Event Handling (VXI_event.cpp)

Document Fetching (VXI_fetch.cpp)

fetchobj

This document refers to files in the Open VXI release. This is available at http://www.speech.cs.cmu.edu/openvxi .

VoiceXML is a Trademark of the VoiceXML forum.
Copyright 2000, 2001. SpeechWorks International, Inc. All rights reserved. Distributed under SpeechWorks Open Document License, v1.0