Vocalizer enriches the user experience in a variety of applications ranging from automotive and consumer electronics to assistive technologies.
Vocalizer seamlessly combines dynamic text-to-speech with pre-recorded audio and tuned text-to-speech segments to generate audibly pleasing speech output. Deep-learning-based voices reach a new standard of quality and are being rolled out first in the Nuance cloud and then on embedded platforms.
Embedded and cloud-based
Vocalizer Embedded (VE) is widely deployed in automotive and mobile platforms and offers the highest quality for any footprint from 2 MB to 500 MB per voice. Vocalizer Server (VS) offers the highest quality voices with footprints in the GB range and with daily dictionary updates in the Nuance cloud.
Localised for a global audience
Nuance's broad language coverage supports 55+ languages and features 100+ different voices to meet the needs of a global audience.
For a complete text-to-speech solution, Vocalizer Studio offers improved prompt tuning capabilities for optimising applications.
When used with Dragon Drive, Vocalizer offers the driver a hands-free, distraction-free experience with readout of all information and an engaging voice-powered interface.
Vocalizer provides natural-sounding voice output to screen readers and other products that provide access to information for the visually impaired.
Vocalizer enables brands to add personality to a range of products including TVs, home appliances, games and toys, language learning software and mobile phones.
TTS technology has been successfully deployed and used to increase productivity and safety in warehousing, stock picking and transportation.
Provides a more human-like interaction through Natural Language Understanding, pronunciation accuracy, appropriate breaks for longer sentences and emphasis on the correct words.
High linguistic accuracy offers correct readout for all types of text input including map data, music data and a large dictionary of person names.
Lively, natural speech
Gilded speech databases containing laughs, hesitations and other aspects of speech provide a more natural experience.
Superior output quality
New signal-processing algorithms and advanced syntactical analysis result in improved smoothness and natural prosody.
The Vocalizer voice has personality with a natural, fluent and lively speaking style, which provides an engaging user experience.
Improved long text readout
Vocalizer has been optimised for reading out long texts such as news, emails and social media status updates more smoothly and naturally.
Direct phonetic input
Allows for optimal and seamless readout of offline phonetic databases such as navigation map data.
More accurate language identification as well as high-quality acoustic extensions provide superior foreign language readout.
User text rules
Customised readout of application-specific abbreviations and text patterns is possible using a user text processing rule set.
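The idea behind a user text rule set can be illustrated with a minimal Python sketch. This is not Vocalizer's rule format or API; the rule list, the `apply_user_text_rules` function and the example abbreviations are all hypothetical, and simply show how application-specific patterns might be rewritten into speakable text before synthesis.

```python
import re

# Hypothetical rule set: (pattern, replacement) pairs applied in order.
# This is an illustration only, not the Vocalizer rule syntax.
USER_TEXT_RULES = [
    (re.compile(r"\bETA\b"), "estimated time of arrival"),
    (re.compile(r"\bApt\.\s*(\d+)"), r"apartment \1"),
    (re.compile(r"\b(\d+)\s*km/h\b"), r"\1 kilometres per hour"),
]


def apply_user_text_rules(text, rules=USER_TEXT_RULES):
    """Rewrite application-specific abbreviations and patterns in text."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(apply_user_text_rules("ETA 5 min, speed limit 50 km/h"))
```

Rules are applied in list order, so in a sketch like this more specific patterns would normally be placed before more general ones.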
Application-specific lexica can be phonetically optimised for accurate readout of exceptional pronunciations.
Global language and voice portfolio
Vocalizer currently supports more than 55 languages and features more than 100 different voices to facilitate the creation of global solutions using a single engine.
Custom configurations and optimisation
Built-in domain intelligence
Optimisation settings provide extra control options for special use cases such as SMS reading.
Improved prompt tuning
With offline tuning options, any prompt set can be further optimised and customised for maximum flexibility.
Seamless prompt insertion
Recorded audio prompts or tuned prompts can be blended with dynamic text-to-speech seamlessly through active prompt matching.
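As a rough illustration of the concept, the Python sketch below splits a sentence into segments that could be played from recordings and segments left for dynamic text-to-speech. It is an assumption-laden sketch, not Vocalizer's active prompt matching: the `plan_playback` function, the prompt table and the file names are hypothetical, and real matching would also have to account for prosody and context.

```python
import re


def plan_playback(text, recorded_prompts):
    """Split text into ("prompt", audio_file) and ("tts", text) segments.

    recorded_prompts maps a phrase to a (hypothetical) recorded audio file.
    """
    if not recorded_prompts:
        return [("tts", text)]

    # Try longer phrases first so they win over shorter overlapping ones.
    phrases = sorted(recorded_prompts, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    by_lower = {phrase.lower(): audio for phrase, audio in recorded_prompts.items()}

    segments = []
    cursor = 0
    for match in pattern.finditer(text):
        gap = text[cursor:match.start()].strip()
        if gap:
            segments.append(("tts", gap))  # no recording: synthesise dynamically
        segments.append(("prompt", by_lower[match.group(0).lower()]))
        cursor = match.end()

    tail = text[cursor:].strip()
    if tail:
        segments.append(("tts", tail))
    return segments


if __name__ == "__main__":
    prompts = {"You have": "you_have.wav", "new messages": "new_messages.wav"}
    for segment in plan_playback("You have 3 new messages from Anna", prompts):
        print(segment)
```

In this sketch only the variable parts of the sentence (the count and the sender) fall through to dynamic synthesis, while the fixed carrier phrases are served from recordings.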
A wide range of footprints scaling from 2 MB to 900 MB ensures optimal performance on embedded platforms, from very small mobile devices to powerful multimedia systems.
Built-in embedded Speech Synthesis Markup Language (SSML) support allows for TTS-vendor-independent markup. The W3C-conformant solution can also be used for more generic applications.
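To show what vendor-independent markup looks like, the following Python sketch builds a small SSML document using only the standard library. The `speak`, `sub` and `break` elements and the namespace come from the W3C SSML specification; the `build_ssml` helper and the example sentence are hypothetical, and which SSML elements a given Vocalizer version accepts is not stated here and should be checked against the product documentation.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(before, written, spoken, after, pause="300ms", lang="en-GB"):
    """Return an SSML string that reads `written` aloud as `spoken`."""
    ET.register_namespace("", SSML_NS)  # serialise SSML as the default namespace
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: lang})
    speak.text = before

    # <sub alias="..."> replaces the written form with a spoken form.
    sub = ET.SubElement(speak, f"{{{SSML_NS}}}sub", {"alias": spoken})
    sub.text = written

    # <break time="..."/> inserts a pause before the rest of the sentence.
    pause_el = ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": pause})
    pause_el.tail = after

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Turn left onto Mill ", "Dr.", "Drive", " then continue for two miles."))
```

Because the markup is standard SSML rather than engine-specific control codes, the same string could in principle be passed to any SSML-capable synthesiser.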
Easy configuration on all platforms
Porting the core engine to a variety of embedded platforms is highly efficient and reliable. Voices and languages can be easily configured by adding data files.
Vocalizer Studio is a comprehensive, user-friendly tool suite that allows developers to prototype and optimise speech output applications by easily creating optimisation data such as user text rules, user dictionaries and prompts.
Custom voices can be built upon request to reinforce corporate branding, highlight product uniqueness and reflect a global user base.
Server-based TTS offering available
Vocalizer Server offers high-quality readout of longer text in the cloud. Use cases include news and email reading. Hosted TTS is available through Nuance Mix and customer-specific deployments. For on-premises deployment, a RESTful API allows for easy integration.
Consistent with VoCon voice recognition
The Common Linguistic Component (CLC), shared with Nuance VoCon, allows for pronunciation consistency between speech input and output.