Nuance

xHMI

Frequently Asked Questions
Q. What is xHMI?
A. xHMI - the extensible Human-Machine Interface - is an open, XML-based dialog configuration language that enables efficient development of more natural, conversational applications.

By defining a common approach to configuring dialog components, the xHMI specification promotes compatibility between the new generation of advanced speech applications, tools, application frameworks and components, and makes it easier to create and deliver powerful dialog-driven applications. xHMI is supported by over twenty developers of advanced speech applications, development tools, platforms, and other speech-related technologies.

Q. Why is Nuance creating this initiative?
A. As the speech industry has expanded, new tools, applications and application components from multiple vendors have provided customers with increased choice and flexibility. Until now, however, products from different suppliers have remained incompatible, limiting customers' ability to mix and match the best applications, tools and components for their specific needs.

The industry faces a choice. It can continue along the current path, which requires different tool sets and specialized skills to configure and maintain applications from different vendors, with ensuing high costs, limited choices for the customer, and limited business opportunities for the vendors. Alternatively, the industry can adopt an open, standards-based approach that facilitates compatibility across different vendors' products. The advent of VoiceXML not only significantly accelerated the growth of the speech and voice platform industry, it also gave rise to an entire new market around packaged speech applications and tools. The industry has a tremendous opportunity to leverage the VoiceXML experience and ensure openness across applications and tools from a very early stage, rather than heading down a high-cost, proprietary path that will limit growth.

Additionally, speech recognition technology has advanced to the point where more sophisticated and natural conversational applications have proven their value in a number of successful commercial deployments. The use of natural language can increase both user satisfaction and transaction efficiency; however, such applications are still expensive and time-consuming to build. A common language that allows developers to describe natural language interactions more easily will reduce the cost, time, and risk of developing these powerful applications.

The xHMI initiative is driven by Nuance's and partners' extensive experience building standards-based speech applications based on reusable components, such as OSDMs, and a number of different development frameworks and environments.

Q. What problems does xHMI solve?
A. xHMI addresses the need for a common approach to control the behavior of dialog nodes in a speech application. xHMI reflects the standard Model View Controller 2 (MVC 2) design pattern used in modern Web applications. MVC 2 separates functionality related to how application data are manipulated from how they are displayed and stored. An xHMI-based dialog framework therefore makes it easier to develop and maintain speech applications using accepted standard practices.
Benefits of this approach include:
  1. Tool and application interoperability: By design, xHMI decouples applications from the tool environments used to build them, allowing customers to use their choice of best-of-breed tools and runtime environments to develop, configure, extend, run and manage these applications.
  2. Separation of application from markup language: xHMI decouples application logic from the presentation layer, permitting support of alternative markup languages and speech browsers, as well as supporting multimodality.
  3. Application component interoperability: xHMI enables interoperability between application components from different vendors.
  4. Modular application design: An xHMI-based dialog framework promotes a strong degree of application modularization and a clean separation of concerns. xHMI thus encourages component reuse, makes it easier to divide development tasks and allocate staff, and keeps applications maintainable and flexible enough to accommodate future requirements.
  5. Flexible, layered development model: An xHMI-based approach to application design is compatible with a broad range of development models within the MVC 2 paradigm. Application developers and users are free to work at the appropriate level of abstraction and encapsulation, ranging from node configuration at the XML level, to customizing or extending existing nodes, to developing new nodes.
  6. More powerful interactions: xHMI reflects a dialog-based approach to application design, which views dialog management (the selection of the next state) as an explicit, centralized functionality. xHMI supports a range of dialog strategies, and therefore permits the cost-effective implementation of both mainstream directed dialogs, in which state transitions are explicitly predefined, and advanced natural language dialogs, in which state transitions are computed dynamically based on user input and the conversation history.
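
The contrast in item 6 between predefined and dynamically computed transitions can be illustrated with a minimal Python sketch. It is not xHMI syntax and is not taken from the specification: the class names (`DirectedStrategy`, `SlotDrivenStrategy`, `DialogManager`), the state names, and the slot names are all hypothetical, chosen only to show one central component selecting the next state while the strategy behind it is swapped.

```python
from typing import Dict, List, Optional


class DirectedStrategy:
    """Directed dialog: every transition is predefined in a table."""

    def __init__(self, transitions: Dict[str, str]) -> None:
        self.transitions = transitions

    def next_state(self, current: Optional[str], heard: Dict[str, str]) -> Optional[str]:
        # The caller's input does not change the path; the table decides.
        return self.transitions.get(current)


class SlotDrivenStrategy:
    """Dynamic dialog: the next state is computed from what is still missing."""

    def __init__(self, required: List[str]) -> None:
        self.required = required

    def next_state(self, current: Optional[str], heard: Dict[str, str]) -> Optional[str]:
        for slot in self.required:
            if slot not in heard:
                return "ask_" + slot
        return None  # nothing left to ask


class DialogManager:
    """Single place where the next state is selected (hypothetical API)."""

    def __init__(self, strategy, start: str) -> None:
        self.strategy = strategy
        self.state: Optional[str] = start
        self.heard: Dict[str, str] = {}  # everything the caller has said so far

    def step(self, user_input: Dict[str, str]) -> Optional[str]:
        self.heard.update(user_input)
        self.state = self.strategy.next_state(self.state, self.heard)
        return self.state


directed = DialogManager(
    DirectedStrategy({"ask_destination": "ask_date", "ask_date": "ask_time"}),
    start="ask_destination",
)
dynamic = DialogManager(
    SlotDrivenStrategy(["destination", "date", "time"]),
    start="ask_destination",
)

# The caller volunteers two pieces of information in one turn.
turn = {"destination": "New York", "date": "tomorrow"}
print(directed.step(turn))  # ask_date
print(dynamic.step(turn))   # ask_time
```

In this sketch the directed manager still asks for the date the caller has already given, while the dynamic one skips ahead to the missing time. Both strategies sit behind the same `step` call, so the application code does not change when the strategy does.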

Q. Is xHMI a standard?
A. Nuance has created the xHMI specification in partnership with several leading speech technology providers who are driving this initiative. The specification and all intellectual property contained within it will be donated to the standards community.

Q. Is this complementary to the IBM RDC initiative?
A. Yes. The IBM Reusable Dialog Component (RDC) initiative and the xHMI initiative are complementary. Both are working to help reduce the cost of speech application development and enable a wider set of speech applications, and both address important aspects of the interoperability of reusable components. xHMI focuses on the general configurability of dialog nodes via XML. RDC deals with the idea of structuring dialog components as JSP tag libraries in a J2EE environment. The configuration of components in a JSP tag library via XML (xHMI) is consistent with the RDC vision.

Q. How does xHMI relate to VoiceXML?
A. xHMI is complementary to VoiceXML, and helps developers build dialog-based VoiceXML applications within an MVC 2 (Model View Controller) framework. xHMI is a dialog configuration language corresponding to the Model and Controller elements of MVC 2, which deal with the way application data are stored and manipulated. xHMI is independent of the View part of the framework (the rendering or display element), and is thus compatible with a range of rendering markups, including VoiceXML, which can be delivered through Java Server Pages and other active template languages like Velocity.
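
As a rough illustration of this separation, the Python sketch below keeps a dialog node's data apart from the code that renders it. The `PromptNode` class and both render functions are hypothetical and do not come from the xHMI specification; only the VoiceXML element names (`vxml`, `form`, `field`, `prompt`) and the namespace are standard VoiceXML 2.0. A deployable field would also need a grammar or a built-in type, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

VXML_NS = "http://www.w3.org/2001/vxml"


@dataclass
class PromptNode:
    """Model side: what the node collects and says, with no markup in it."""
    name: str
    prompt: str


def render_vxml(node: PromptNode) -> str:
    """View side, option 1: render the node as a VoiceXML 2.0 form."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": node.name})
    field = ET.SubElement(form, "field", {"name": node.name})
    ET.SubElement(field, "prompt").text = node.prompt
    return ET.tostring(vxml, encoding="unicode")


def render_text(node: PromptNode) -> str:
    """View side, option 2: render the same node for a text console."""
    return f"[{node.name}] {node.prompt}"


node = PromptNode(name="departure_time", prompt="At what time would you like to fly?")
print(render_vxml(node))
print(render_text(node))
```

The node definition is identical in both cases; only the renderer differs, which is the property that lets the same dialog logic target more than one markup language.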

We view many elements of xHMI as ideal components of future VoiceXML specifications, and are committed to actively contributing relevant concepts from xHMI to the standards process.

Q. How does xHMI relate to SALT?
A. xHMI is based on the MVC 2 architecture, which separates the way application data are stored and manipulated from the way they are displayed. xHMI is concerned with dialog data and flow (Model and Controller), but not the display, and is thus complementary to a variety of rendering markups (corresponding to the View). xHMI is therefore compatible with SALT as well as VoiceXML.

Q. How does xHMI relate to CCXML?
A. CCXML is an XML-based standard for telephony call control for dialog systems. CCXML can be used to specify how phone calls are placed, answered, transferred and conferenced. xHMI, on the other hand, is concerned with the management of the conversational turns within a dialog, and its functionality is thus complementary with CCXML.

Q. How does xHMI relate to Java- or .NET-based applications?
A. xHMI is language-neutral: the specification is independent of the programming language. It is therefore compatible with any J2EE environment and can also be implemented within a .NET environment.

The xHMI model allows for separation of application flow from dialog rendering, and can therefore support both JSP and Microsoft's ASP technologies. Note, however, that xHMI-compatible components implemented in Java are unlikely to be interoperable with those written in .NET.

Q. What partners have committed to support this?
A. Over twenty leading industry vendors publicly declared their support for xHMI at the time of its announcement. Nuance has been working with a number of tool, application, and platform partners in the development of this approach to applications. Multiple tools partners have stated their intention to support xHMI, and speech platform vendors have stated their intention to support applications written with xHMI. For an up-to-date list of xHMI partners, please visit: www.nuance.com/xhmi

Q. What can we expect from tools vendors in terms of xHMI support?
A. Several tools vendors are currently working on xHMI-based implementations. xHMI-based applications can currently coexist with any tool environment that is capable of calling an external web page. These applications will be accessible directly within the tool environment once xHMI support is available within the tool.

Q. Is Nuance supporting xHMI?
A. xHMI is strategic to Nuance's direction in applications. Nuance's applications are being written to this specification, and partners and customers can expect a number of additional announcements about speech optimization tools built around xHMI.

Q. If I use an xHMI-based solution, will I be able to migrate to different platforms?
A. Yes. xHMI defines a way for applications to specify call flow and configure dialog nodes. However, the rendering is not part of the xHMI spec, and xHMI applications can therefore be migrated to different VoiceXML and SALT platforms.

Q. Will xHMI make my tools more valuable?
A. Yes - customers are demanding interoperability and solutions that embrace standards. Tools are evolving to support these standards, and several tool vendors are participating in the xHMI initiative.

The ability to have call flow development, debugging, and design tools interoperate with applications from a variety of vendors will enable customers to choose best-of-breed tools that are most appropriate to them, increasing the number of potential customers that will purchase these tools.

Q. Are there any tools that are available now or in the near future?
A. Nuance will be announcing relationships with major tool providers in the near future, including information on availability of xHMI-based solutions within those offerings.

Q. When will xHMI be made publicly available?
A. Nuance is currently circulating a full draft among partners and will make it available to the public once full reviews and edits have been received and incorporated. The xHMI specification is available to interested partners today.

Q. What is an example of an "advanced" dialog?
A. Recent advances in speech recognition and understanding have made it possible to use more open-ended prompts, and to extract meaning from the resulting natural user utterances. For example:

System: "Can I help you?"

Caller: "Yeah, uh, I'd like to fly to New York tomorrow, oh, and I'd also like to book a hotel"

System: "Let's help you with the flight first. At what time would you like to fly to New York?"



System: "Now let me help you with the hotel …"

In interactions like the dialog above, callers are free to provide varying amounts of information in any order they want (and may include information that the system didn't prompt the caller to provide).

Conventional directed dialogs require the explicit specification of every possible dialog state and transition. However, attempting to enumerate every possible combination of user inputs in a natural language application quickly leads to a combinatorial explosion. The application development task is complicated further by the fact that the selection of the next dialog state can also be influenced by the conversation history (e.g., the request for a hotel in the example above).
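
To put rough numbers on this growth, the Python sketch below counts states and transitions under one simplifying assumption that is ours, not the specification's: a directed design tracks which pieces of information (slots) have been filled, and needs an explicit transition for every way a caller's utterance can add one or more of the missing ones. The slot names are illustrative.

```python
from itertools import combinations


def filled_slot_states(slots):
    """All subsets of slots that may already be filled at some point."""
    return [
        frozenset(combo)
        for size in range(len(slots) + 1)
        for combo in combinations(slots, size)
    ]


def explicit_transitions(states):
    """Count pairs (a, b) where b fills strictly more slots than a."""
    return sum(1 for a in states for b in states if a < b)


slots = ["destination", "date", "time", "airline", "seat_class"]
states = filled_slot_states(slots)

print(len(states))                   # 32  (2**5)
print(explicit_transitions(states))  # 211 (3**5 - 2**5)
```

With n slots the count is 2^n states and 3^n - 2^n transitions, so each additional slot roughly triples the number of transitions to specify, before the conversation history is taken into account.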

Natural language applications clearly call for a different dialog strategy, one in which explicit transitions need not be specified. One such alternative is the "information-driven" dialog strategy, in which the system dynamically determines the best next dialog state by taking into consideration the information received from the caller, the reliability of the recognition, the conversation history, and the activation rules associated with dialog nodes. For natural language applications, specifying such rules is far more efficient than predefining all state transitions.
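
A minimal Python sketch of such an information-driven selection follows, loosely modeled on the flight-and-hotel exchange above. Everything in it is an assumption made for illustration: the node names, the slot names, the 0.5 confidence threshold, and the rule that the first node whose activation rule fires wins. The xHMI specification may define activation rules quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

CONFIRM_THRESHOLD = 0.5  # assumed cut-off below which a value is re-confirmed


@dataclass
class DialogState:
    slots: Dict[str, str] = field(default_factory=dict)         # information received from the caller
    confidence: Dict[str, float] = field(default_factory=dict)  # recognition reliability per slot
    goals: List[str] = field(default_factory=list)              # requests heard earlier in the conversation


@dataclass
class Node:
    name: str
    is_active: Callable[[DialogState], bool]  # activation rule for this node


def needs_confirmation(state: DialogState) -> bool:
    return any(score < CONFIRM_THRESHOLD for score in state.confidence.values())


# Nodes are listed in priority order; no node names its successor.
NODES = [
    Node("confirm_low_confidence", needs_confirmation),
    Node("ask_destination",
         lambda s: "flight" in s.goals and "destination" not in s.slots),
    Node("ask_departure_time",
         lambda s: "flight" in s.goals and "destination" in s.slots and "time" not in s.slots),
    Node("start_hotel_booking",
         lambda s: "hotel" in s.goals and "hotel_city" not in s.slots),
]


def next_node(state: DialogState, nodes: List[Node] = NODES) -> Optional[str]:
    """Return the first node whose activation rule fires, or None if none does."""
    for node in nodes:
        if node.is_active(state):
            return node.name
    return None


# "I'd like to fly to New York tomorrow, oh, and I'd also like to book a hotel"
state = DialogState(
    slots={"destination": "New York", "date": "tomorrow"},
    confidence={"destination": 0.9, "date": 0.8},
    goals=["flight", "hotel"],
)
print(next_node(state))  # ask_departure_time

state.slots["time"] = "9 am"
state.confidence["time"] = 0.95
print(next_node(state))  # start_hotel_booking
```

Adding a new capability in this sketch means adding one node with its rule, rather than revisiting every existing transition.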

The xHMI framework unifies and centralizes dialog management functionality, and thus allows the efficient implementation of a range of dialog strategies, including those suitable for powerful natural-language dialogs.

Q. Is there any charge for xHMI?
A. No, xHMI is a specification designed by Nuance and several partners, and will be donated to the public domain.

© 2008 Nuance Communications, Inc. All rights reserved.