IVONA For Developers

Develop with IVONA Text-to-Speech.

3. Synthesis process in the Cloud

Calling the createSpeechFile() method does not itself start speech synthesis. It only stores the synthesis requirements in the IVONA Speech Cloud (SaaS) database and returns a URL from which the file will be available. Synthesis starts no earlier than the first request for that URL.

Downloading the file involves a couple of redirects that end at the final file location, from which the file is streamed to the client over HTTP. Some audio formats are not streamable: “wav” files, for example, require complete file data in the file header and therefore cannot be sent to the user before synthesis has finished. For such files the stream is not available immediately after the URL is requested; instead of the file, the server returns HTTP headers carrying additional information about the file’s availability.

The following HTTP headers could be returned on file requests:

FOR STREAMABLE FORMATS (mp3/ogg)

| HTTP Code | Additional headers | Description |
|-----------|--------------------|-------------|
| 200 | - | The file is ready to download; streaming should start immediately. |
| 302 | “Location” | Redirection to the selected TTS host, from which the file will be streamed. The user should follow the location (most client applications, for example those using the Flash Sound object, proceed to the new address automatically, without notifying the user). |
| 449 | - | The file is not yet ready for synthesis; some additional preparation is required (for example, content analysis). The user should repeat the request after a few seconds. |

FOR NON-STREAMABLE FORMATS (wav/alaw/ulaw)

| HTTP Code | Additional headers | Description |
|-----------|--------------------|-------------|
| 200 | - | The file is ready to download; it will be sent to the user immediately. |
| 302 | “Location” | Redirection to the selected TTS host, from which the file will be served. The user should follow the location. |
| 449 | “Retry-After”, “Refresh” | Synthesis is in progress and the user should wait until it is done. The expected synthesis duration (in seconds) is returned in the “Retry-After” and “Refresh” headers (the latter automatically refreshes the page if the user downloads the file with a web browser). The user should repeat the file request after the suggested duration. |
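The status-code handling described above can be sketched as a small polling loop. This is a minimal illustration in Python, not an official IVONA client: the names download_speech_file and retry_delay, the (status, headers, body) response shape, the injected fetch callable, and the 3-second fallback delay are all assumptions made for this example.

```python
import time
from typing import Callable, Mapping, Tuple

# A response is modelled as (status_code, headers, body). This shape is an
# assumption for the sketch, not part of the IVONA API.
Response = Tuple[int, Mapping[str, str], bytes]

DEFAULT_RETRY_SECONDS = 3.0  # assumed fallback when no "Retry-After" is sent


def retry_delay(headers: Mapping[str, str],
                default: float = DEFAULT_RETRY_SECONDS) -> float:
    """Return the wait time (seconds) suggested by a 449 response."""
    value = headers.get("Retry-After")  # lookup is case-sensitive here
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        return default


def download_speech_file(url: str,
                         fetch: Callable[[str], Response],
                         sleep: Callable[[float], None] = time.sleep,
                         max_attempts: int = 20) -> bytes:
    """Request the file until it is ready, following 302 and waiting on 449."""
    for _ in range(max_attempts):
        status, headers, body = fetch(url)
        if status == 200:
            return body                   # file is ready
        if status == 302:
            url = headers["Location"]     # follow the redirect to the TTS host
            continue
        if status == 449:
            sleep(retry_delay(headers))   # not ready yet: wait, then repeat
            continue
        raise RuntimeError(f"unexpected HTTP status {status}")
    raise TimeoutError("file not ready after max_attempts requests")
```

The fetch callable is injected so that the loop stays independent of any particular HTTP library; a real implementation would need a fetch function that performs a single request without following redirects on its own.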

If the user downloads the file from the returned soundUrl with a command-line application (for example wget or curl), the URL must be enclosed in quotes or apostrophes, because some characters in the returned URL, such as the ampersand (“&”), may be interpreted by the shell.
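When the command line is assembled by a script, the quoting can be done programmatically. The sketch below uses Python’s standard shlex.quote; the URL is a made-up placeholder, not a real soundUrl, and the output file name is likewise an assumption.

```python
import shlex

# Hypothetical soundUrl for illustration; real URLs are returned by
# createSpeechFile().
sound_url = "http://example.com/file.mp3?token=abc&format=mp3"

# shlex.quote wraps the URL in single quotes so the shell does not
# interpret "&" or "?".
command = "curl -L -o speech.mp3 " + shlex.quote(sound_url)
print(command)
# prints: curl -L -o speech.mp3 'http://example.com/file.mp3?token=abc&format=mp3'
```

The -L option tells curl to follow the 302 redirects; a 449 response still has to be retried separately.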

Note
The “449 Retry” HTTP code is returned on file requests instead of “503 Service Temporarily Unavailable”.

Many Flash multimedia applications use the “Sound” object to load an audio file into a Flash movie. In most environments Flash cannot detect the exact HTTP status code returned for the audio file request. In most cases only 4xx HTTP status codes are detected as “file not available” events, for which a retry procedure can easily be implemented in the application. A Flash developer could, for example, implement a “retry after 3 seconds” (or any other period) function and call it on such an event. That is why this “exotic” status code, rather than a standard response from the 5xx family, is returned when the file is not ready yet.