IVONA For Developers – Products Overview
| | IVONA SDK | IVONA Speech Cloud (SaaS) |
|---|---|---|
| Requirements | | |
| Storage memory (ROM) per voice | 100-250 MB; < 50 MB**; < 20 MB** | 0 MB |
| Runtime memory (RAM) | 5-13 MB; < 5 MB** | 0 MB |
| CPU | Scalable | Under 50 MIPS |
| Chipset | X86 (32/64 bit); ARM 7, 8, 9, 11; Strong ARM; X-Scale; Sparc (32/64 bit)*; PowerPC*; MIPS* | |
| OS | Linux, Windows, Android, Windows Mobile, Windows CE, iOS**, MeeGo, Mac OS X, Solaris*, FreeBSD* | |
| API | IVONA C/C++ API, IVONA Java API, TCP/IP, Unix socket, SAPI 4*, SAPI 5 | Web Services (SOAP), IVONA C/C++ API**, IVONA Java API** |
| Key features | | |
| BrightVoice™: superior speech output quality | | |
| Languages and voices | US English (2 male, 3 female, child); British English (male, 2 female); US Spanish (male, female, child**); Spanish (male, female, child**); German (male, female); French (male, female); Polish (2 male, 2 female); Romanian (female); Welsh; Welsh English; Australian English*; Italian**; Dutch**; Canadian French**; Brazilian Portuguese**; Portuguese**; Icelandic**; Russian**; Korean**; Danish**; Swedish**; Japanese**; Special Effects (SFX)/Character voices* | |
| Sampling rate | 8 kHz, 22 kHz (up to 48 kHz*) | 8 kHz, 22 kHz (up to 48 kHz*) |
| Audio formats | PCM 16 bit mono/stereo*, A-law, µ-law, mp3*, vorbis (ogg)* | mp3, vorbis (ogg), PCM 16 bit mono*/stereo**, A-law, µ-law |
| Low response (start/stop) time | | |
| SDK components | Libraries, documentation, examples (C/C++, PHP), support; Supported events: word-highlighting (alignment of text with audio), visemes (lip-sync), bookmarks, SSML events | |
| Prosody control: volume, speed, pitch | | |
| User level pronunciation lexicon | (with regular expression rules support) | |
| Language detection | ** | ** |
| Phoneme mapping for mixed languages input | ** | ** |
| Text preprocessing rules for specific domains | * | * |
| Dynamic voice and language switching | | |
| Mixing static expressive prompts | | |
| Custom voices: voice branding or custom accent and/or style | | |
| Support for natural reading of long texts | | |
| Support for phonetic alphabets | IPA, X-SAMPA, TeleAtlas®, Navteq™ | IPA, X-SAMPA, TeleAtlas®, Navteq™ |
| Expressive TTS effects | ** | ** |
| Standards compliance | W3C SSML 1.0/1.1, W3C PLS 1.0 (with IVONA extensions) | W3C SSML 1.0/1.1 (with IVONA extensions) |
| Built-in audio effects | ** | |
| Support for text highlighting | | ** |
| REST support | N/A | ** |

* Available on request
** Feature in development
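
The overview lists prosody control (volume, speed, pitch) and W3C SSML 1.0/1.1 compliance. As a rough illustration of what such input can look like, the sketch below builds a standard SSML 1.0 document with a single `<prosody>` element using only the Python standard library. It does not call any IVONA API; the helper name `build_ssml` and its parameters are hypothetical, and how the resulting markup is submitted to the SDK or Speech Cloud is not covered here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, volume="medium", rate="medium", pitch="medium", lang="en-US"):
    """Hypothetical helper: wrap text in an SSML <speak> document.

    volume, rate and pitch take the label values defined by SSML 1.0
    (for example "soft"/"loud", "slow"/"fast", "low"/"high").
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    prosody = ET.SubElement(
        speak, "prosody", {"volume": volume, "rate": rate, "pitch": pitch}
    )
    prosody.text = text  # special characters are escaped on serialization
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello & welcome", volume="loud", rate="slow", pitch="high"))
```

Building the document through an XML library rather than string concatenation keeps characters such as `&` and `<` in the input text correctly escaped.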