Speech Cloud Documentation

Actions

Note

The API of the service might change at any time during the beta phase. We strongly encourage using one of the available client libraries to integrate with Speech Cloud.
See the Developer Guide for more information.

The following actions are supported:

  • CreateSpeech

  • ListVoices

  • PutLexicon

  • GetLexicon

  • DeleteLexicon

  • ListLexicons

All actions require request signing using the AWS Signature Version 4 process.

Check the request signing section in IVONA Speech Cloud Developer Guide for additional service-specific information.
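As a rough illustration of the Signature Version 4 process mentioned above, the following Python sketch derives a signing key and signs an already-prepared string to sign. The secret key, region, and service name are placeholders invented for the example; this page does not state the values Speech Cloud expects, so take them from the Developer Guide. Building the canonical request and the string to sign is left out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the Signature Version 4 signing key.

    date_stamp is the request date as YYYYMMDD. The region and service
    strings are supplied by the caller; this sketch does not know the
    values Speech Cloud expects.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder credentials and scope, for illustration only.
key = derive_signing_key("EXAMPLE-SECRET", "20150101", "example-region", "example-service")
signature = sign(key, "AWS4-HMAC-SHA256\n<timestamp>\n<scope>\n<hashed-canonical-request>")
```

The resulting hex string is what would appear as Signature=<Signature> in the sample requests on this page. In practice a client library performs these steps for you.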

CreateSpeech

The CreateSpeech operation synthesizes the requested text into an audio stream containing the speech. A request contains all the data the service needs to perform the operation, grouped into several objects:

  • The Input object, which holds the text to be synthesized along with its type declaration.

  • The Voice object, which selects the IVONA TTS voice.

  • The Parameters object, which holds additional synthesis parameters.

  • The OutputFormat object, which determines the encoding of the returned audio stream.

  • LexiconNames, a list of lexicon names to be used in speech generation.

In case of an error caused by an invalid request (for example, using invalid synthesis Parameters, or a nonexistent Voice), HTTP code 400 (Bad Request) containing an error message is returned instead of a valid audio stream. In case of a service failure, HTTP code 500 (Internal Error) is returned instead of a valid audio stream. If a lexicon specified in the request is missing, synthesis will proceed ignoring the missing lexicon.

Request syntax
{
    "Input" : {
        "Data" : "string",
        "Type" : "string"
    },

    "OutputFormat" : {
        "Codec" : "string",
        "SampleRate" : number,
        "SpeechMarks" : {
            "Sentence" : boolean,
            "Ssml" : boolean,
            "Viseme" : boolean,
            "Word" : boolean
        }
    },

    "Parameters" : {
        "Rate" : "string",
        "Volume" : "string",
        "SentenceBreak" : number,
        "ParagraphBreak" : number
    },

    "Voice" : {
        "Name" : "string",
        "Language" : "string",
        "Gender" : "string"
    },

    "LexiconNames" : ["string", "string", "string"]
}

Only the Input → Data attribute is required; it has no default value. All other attributes are optional and take the default values listed in the Default Values table.

Request attributes

The request is formatted as a JSON object. All attribute names and values are case-sensitive.

Input

Contains attributes describing the user input.

Type: an Input object.

Required fields: Data.

OutputFormat

Contains attributes describing the audio compression and format in which the returned stream should be encoded.

Type: an OutputFormat object.

Required fields: none.

Parameters

Additional attributes affecting the generated speech.

Type: a Parameters object.

Required fields: none.

Voice

Filter used to select the voice for speech synthesis. Any or all of these attributes can be omitted, in which case the service tries to select the first voice that matches the attribute values that were provided. Note that the voice order is not guaranteed to be consistent, so a filter that matches several voices may not always select the same one.

Type: a Voice object.

Required fields: none.

LexiconNames

The names of the lexicons that should be applied to this synthesis to adjust the pronunciation. The lexicons are applied in the same left-to-right order as they are specified in this list. If a name in the list does not match a lexicon previously stored with a PutLexicon request, that name is silently discarded.

Note

In the case of a GET request to CreateSpeech, the names should be combined as a comma-separated string without quotes.

Type: list of strings (POST), or a comma-separated string (GET)
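The two behaviours described above (unknown names are dropped while order is preserved, and GET requests take a comma-separated string) can be mirrored on the client side. The sketch below is only an illustration: the function names and the name missingLexicon are made up for the example, and URL-encoding of the resulting query value is not handled.

```python
from typing import Iterable, List


def effective_lexicons(requested: Iterable[str], stored: Iterable[str]) -> List[str]:
    """Keep requested names that exist in stored, preserving request order."""
    available = set(stored)
    return [name for name in requested if name in available]


def lexicon_names_for_get(names: Iterable[str]) -> str:
    """Join lexicon names into the comma-separated form used in GET requests."""
    return ",".join(names)


# "missingLexicon" is a hypothetical name that was never stored.
requested = ["AmericanEng", "missingLexicon", "lambFix"]
stored = ["tomatoFix", "AmericanEng", "lambFix"]
applied = effective_lexicons(requested, stored)
get_value = lexicon_names_for_get(applied)
```

Here applied keeps AmericanEng and lambFix in their requested order, which is the order in which the service would apply them.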

Default Values

Table 1. List of default values

Request parameter                         Default value

Input → Type                              text/plain
OutputFormat → Codec                      MP3
OutputFormat → SampleRate                 22050
OutputFormat → SpeechMarks → Sentence     false
OutputFormat → SpeechMarks → Ssml         false
OutputFormat → SpeechMarks → Viseme       false
OutputFormat → SpeechMarks → Word         false
Parameters → Rate                         medium
Parameters → Volume                       medium
Parameters → SentenceBreak                400
Parameters → ParagraphBreak               650
Voice → Name                              (*) Salli
Voice → Language                          (*) en-US
Voice → Gender                            (*) Female
LexiconNames                              []
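To see what a partial request amounts to once the defaults are applied, the following sketch merges a request with the values from Table 1. It is a client-side illustration only; the service applies its defaults itself, and the helper name with_defaults is invented here. The Voice defaults are marked (*) in the table, so the sketch leaves them out rather than guess at their exact semantics.

```python
import copy
from typing import Any, Dict

# Defaults copied from Table 1. The Voice entries are marked (*) in the
# table, so they are left out here rather than forced onto every request.
DEFAULTS: Dict[str, Any] = {
    "Input": {"Type": "text/plain"},
    "OutputFormat": {
        "Codec": "MP3",
        "SampleRate": 22050,
        "SpeechMarks": {"Sentence": False, "Ssml": False, "Viseme": False, "Word": False},
    },
    "Parameters": {"Rate": "medium", "Volume": "medium", "SentenceBreak": 400, "ParagraphBreak": 650},
    "LexiconNames": [],
}


def with_defaults(request: Dict[str, Any], defaults: Dict[str, Any] = DEFAULTS) -> Dict[str, Any]:
    """Return a copy of request with missing attributes filled from defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in request.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested objects attribute by attribute.
            merged[key] = with_defaults(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged


full = with_defaults({"Input": {"Data": "Hello"}, "Parameters": {"SentenceBreak": 500}})
```

The merged request keeps the caller's SentenceBreak of 500 and fills in every attribute the caller omitted.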

Response syntax

An audio stream is returned as the response to a successful request. If an error or service failure occurs, a detailed error message is returned.
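A caller therefore has to look at the status code before treating the body as audio. The sketch below shows one way to do that; the exception class names are hypothetical and the format of the error message body is not assumed beyond it being text.

```python
class SpeechCloudRequestError(Exception):
    """Hypothetical error for HTTP 400 responses (invalid request)."""


class SpeechCloudServiceError(Exception):
    """Hypothetical error for any other non-200 response, such as HTTP 500."""


def audio_or_raise(status: int, body: bytes) -> bytes:
    """Return the audio bytes for a 200 response, otherwise raise."""
    if status == 200:
        return body
    # The body of a failed request carries the error message, not audio.
    message = body.decode("utf-8", errors="replace")
    if status == 400:
        raise SpeechCloudRequestError(message)
    raise SpeechCloudServiceError(f"HTTP {status}: {message}")
```

Separating the two error types lets a caller fix and resend an invalid request while treating a service failure as a candidate for a retry.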

Errors

For information about errors that are common to all actions, see the Errors section.

InvalidSsmlException

The Input → Data attribute contains incorrect SSML. This exception is returned if Input → Type is set to application/ssml+xml and the provided SSML cannot be validated because of incorrect XML syntax or other parsing errors.

HTTP Status Code: 400

VoiceNotFoundException

The combination of attributes in the Voice object does not match any existing voice. For example, VoiceNotFoundException is returned if the requested voice specifies Name = Kendra, which is an American English voice, and the attribute Language is set to the value of en-GB.

HTTP Status Code: 400
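Some InvalidSsmlException cases can be caught before a request is sent by checking that the SSML is at least well-formed XML. The sketch below uses Python's standard XML parser for that. It is a partial check only: a document can be well-formed and still be rejected by the service for other SSML reasons.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(document: str) -> bool:
    """Return True if document parses as XML, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True
```

For example, is_well_formed_xml("<speak>Hello") returns False because the speak element is never closed.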

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-IvonaTtsRequestCharacters

Indicates the number of characters used to perform a speech synthesis.

x-amzn-IvonaTtsRequestUnits

Indicates the number of units* charged for performing a speech synthesis.

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.

*For billing purposes, the price of each request is calculated in units. The number of units for a single request is calculated by dividing the number of characters (excluding SSML tags) by 200 and rounding up the result.
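The unit calculation in the footnote can be written out as follows. The first function implements the stated rule directly. The second is an assumption-laden helper that estimates the character count of SSML input by removing anything that looks like a tag; the service's own counting may differ, and the x-amzn-IvonaTtsRequestUnits header remains the authoritative value.

```python
import re

# Rough pattern for SSML tags; an approximation, not a full XML parser.
_TAG_PATTERN = re.compile(r"<[^>]+>")


def billable_units(character_count: int) -> int:
    """Units for a request: characters divided by 200, rounded up."""
    return (character_count + 199) // 200


def units_for_ssml(ssml: str) -> int:
    """Estimate units for SSML input by dropping anything that looks like a tag."""
    return billable_units(len(_TAG_PATTERN.sub("", ssml)))
```

With this rule a 200-character request costs one unit and a 201-character request costs two.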

Examples

Sample Request
POST /CreateSpeech HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Input" : {
        "Data" : "Mary has a little lamb. Alice has a black cat.",
        "Type" : "text/plain"
    },

    "OutputFormat" : {
        "Codec" : "MP3",
        "SampleRate" : 22050,
        "SpeechMarks" : {
            "Sentence" : false,
            "Ssml" : false,
            "Viseme" : false,
            "Word" : false
        }
    },

    "Parameters" : {
        "Rate" : "medium",
        "Volume" : "medium",
        "SentenceBreak" : 500,
        "ParagraphBreak" : 800
    },

    "Voice" : {
        "Name" : "Salli",
        "Language" : "en-US",
        "Gender" : "Female"
    },

    "LexiconNames" : [ "AmericanEng", "lambFix" ]
}
Sample Response
HTTP/1.1 200 OK
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestCharacters: <NumberOfCharacters>
x-amzn-IvonaTtsRequestUnits: <NumberOfUnits>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Content-Type: <OutputFormatContentType>
Transfer-Encoding: chunked
Date: <Date>

<Audio>

In response to the above sample request, an audio stream would be returned containing speech generated using the Salli voice.

ListVoices

The ListVoices operation returns a list of TTS voices available for speech synthesis using the CreateSpeech action. This list can be filtered by matching the attribute values specified in the Voice object. Each attribute is optional; sending a request without a Voice object or with a Voice object having empty attributes will return all voices.

For example, to retrieve a list of all female American English voices, send a request with Language set to en-US and Gender set to Female (see Sample request).

In case of an error caused by an invalid request, such as a nonexistent Language value, HTTP code 400 (Bad Request) containing an error message is returned instead of a list of voices. In case of a service failure, HTTP code 500 (Internal Error) is returned instead of a list of voices.

Request syntax
{
    "Voice" : {
        "Name" : "string",
        "Language" : "string",
        "Gender" : "string"
    }
}

All attributes are optional and have no default value.

The request is formatted as a JSON object. All attribute names and values are case-sensitive.

Request attributes

Voice

An object containing the values used to filter the list of available text-to-speech voices. If no Voice object is provided, this action returns all available voices.

Type: a Voice object.

Required fields: none.

Response syntax
{
    "Voices" : [
        {
            "Name" : "string",
            "Language" : "string",
            "Gender" : "string"
        }, ...
    ]
}

Voices

The ListVoices action returns an array of Voice objects that match the provided filter (or all voices if no filtering attributes are provided in the request).

Type: an array of Voice objects.
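The filter semantics can be pictured with a small local model: every non-empty attribute in the filter must match exactly, and an empty or missing filter matches everything. The sketch below applies that rule to the two voices shown in this page's sample response. It only models the matching; where this function returns an empty list, the service itself reports a VoiceNotFoundException.

```python
from typing import Dict, List, Optional

Voice = Dict[str, str]

# The two voices that appear in this document's sample response.
KNOWN_VOICES: List[Voice] = [
    {"Name": "Salli", "Language": "en-US", "Gender": "Female"},
    {"Name": "Kendra", "Language": "en-US", "Gender": "Female"},
]


def filter_voices(voices: List[Voice], criteria: Optional[Voice] = None) -> List[Voice]:
    """Return voices whose attributes equal every non-empty criterion."""
    # Empty attribute values are treated as "not specified".
    active = {k: v for k, v in (criteria or {}).items() if v}
    return [voice for voice in voices if all(voice.get(k) == v for k, v in active.items())]
```

Because values are case-sensitive, a filter of {"Gender": "female"} matches nothing in this model, while {"Gender": "Female"} matches both voices.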

Errors

For information about errors that are common to all actions, see the Errors section.

VoiceNotFoundException

The combination of attributes in the Voice object does not match any existing voice. For example, VoiceNotFoundException is returned if Name is set to Kendra, which is an American English voice, and Language is set to en-GB.

HTTP Status Code: 400

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.
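Since x-amzn-RequestId can be set by the client, a caller can attach its own identifier and later match it against logs. The sketch below adds one when the caller has not already chosen a value. Using a UUID is an assumption made for the example; this page does not specify a required format for the identifier.

```python
import uuid
from typing import Dict


def with_request_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers with a client-chosen x-amzn-RequestId."""
    tagged = dict(headers)
    # Keep an identifier the caller already set; otherwise generate one.
    tagged.setdefault("x-amzn-RequestId", str(uuid.uuid4()))
    return tagged
```

The header must be in place before the request is signed if it is included in the signed headers.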

Examples

Sample Request
POST /ListVoices HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date,
Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Voice" : {
        "Language" : "en-US",
        "Gender" : "Female"
    }
}
Sample Response
HTTP/1.1 200 OK
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Content-Type: application/json
Content-Length: <ReturnedContentLength>
Date: <Date>

{
    "Voices" : [
        {
            "Name" : "Salli",
            "Language" : "en-US",
            "Gender" : "Female"
        },
        {
            "Name" : "Kendra",
            "Language" : "en-US",
            "Gender" : "Female"
        }
    ]
}

PutLexicon

The PutLexicon operation enables the storage of lexicons that can later be used during synthesis. A request to store a lexicon contains the lexicon contents and a user-defined name for the lexicon. A lexicon can later be used by referring to its name in the CreateSpeech action. Executing this operation using the name of an existing lexicon replaces the contents of the existing lexicon with the contents specified in the PutLexicon operation.

Lexicons must be valid XML files that conform to W3C’s PLS specification. No more than five lexicons can be stored at a time.

The list of errors this operation can return is described later in this section.

Request syntax
{
    "Lexicon" : {
        "Name" : "string",
        "Contents" : "string"
    }
}

All attributes are required.

The operation supports only HTTP POST requests and JSON-formatted payloads. All attribute names and values are case-sensitive.

Request attributes

Lexicon

An object containing the lexicon and the name used to identify it.

Type: a Lexicon object.

Required fields: Name and Contents.

Response

A successful request yields an empty response with a 204 No Content HTTP status code.

Errors

For information about errors that are common to all actions, see the Errors section.

InvalidLexiconException

The lexicon does not conform to W3C PLS Recommendation.

HTTP Status Code: 400

MaxLexemeLengthExceededException

At least one of the lexeme substitution rules (i.e., alias and phoneme elements) exceeded 100 characters in length.

HTTP Status Code: 400

MaxLexiconsNumberExceededException

There was an attempt to store a new lexicon when five lexicons were already stored. No more than five lexicons can be stored at any given time.

HTTP Status Code: 400

MaxLexiconSizeExceededException

The lexicon content exceeded the maximum length allowed.

HTTP Status Code: 400
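Two of these errors can be anticipated locally before calling PutLexicon: malformed XML and substitution rules that are too long. The sketch below parses a lexicon and reports alias or phoneme elements whose text exceeds 100 characters. It assumes the limit applies to the element's text content, and it does not validate the document against the full PLS schema, so the service may still reject a lexicon that passes this check.

```python
import xml.etree.ElementTree as ET
from typing import List

MAX_RULE_LENGTH = 100  # limit quoted for alias and phoneme elements


def overlong_rules(pls_document: str, limit: int = MAX_RULE_LENGTH) -> List[str]:
    """Return the text of alias/phoneme elements longer than limit.

    Raises xml.etree.ElementTree.ParseError if the document is not
    well-formed XML.
    """
    root = ET.fromstring(pls_document)
    offenders = []
    for element in root.iter():
        # Strip a namespace prefix such as "{http://...}alias" -> "alias".
        local_name = element.tag.rsplit("}", 1)[-1]
        text = element.text or ""
        if local_name in ("alias", "phoneme") and len(text) > limit:
            offenders.append(text)
    return offenders


# A small illustrative lexicon; the entry itself is made up for the example.
SAMPLE_PLS = (
    '<lexicon version="1.0" '
    'xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" '
    'alphabet="ipa" xml:lang="en-US">'
    "<lexeme><grapheme>tomato</grapheme><alias>tomahto</alias></lexeme>"
    "</lexicon>"
)
```

An empty result from overlong_rules(SAMPLE_PLS) means no rule in the sample would trigger MaxLexemeLengthExceededException under the assumption above.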

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.

Examples

Sample Request
POST /PutLexicon HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Lexicon" : {
        "Name" : "tomatoFix",
        "Contents" : "<PLS>"
    }
}
Sample Response
HTTP/1.1 204 No Content
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Date: <Date>

GetLexicon

The GetLexicon operation enables the retrieval of lexicons previously stored in the service. Lexicons are referred to by name, so a request to retrieve a lexicon must contain its name.

The list of errors this operation can return is described later in this section.

Request syntax
{
    "Name" : "string"
}

Name is a required attribute.

The operation supports only HTTP POST requests and JSON-formatted payloads. All attribute names and values are case-sensitive.

Request attributes

Name

The name of the lexicon to retrieve.

Type: string

Response syntax
{
    "Lexicon" : {
        "Name" : "string",
        "Contents" : "string"
    }
}

Lexicon

The GetLexicon action returns the Lexicon object with the provided name.

Type: a Lexicon object.

Errors

For information about errors that are common to all actions, see the Errors section.

LexiconNotFoundException

No lexicon matching the given name was found.

HTTP Status Code: 400

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.

Examples

Sample Request
POST /GetLexicon HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Name" : "tomatoFix"
}
Sample Response
HTTP/1.1 200 OK
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Date: <Date>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Lexicon" : {
        "Name" : "tomatoFix",
        "Contents" : "<PLS>"
    }
}

DeleteLexicon

The DeleteLexicon operation enables the removal of lexicons previously stored in the service. Lexicons are referred to by name, so a request to delete a lexicon must contain its name. When a lexicon is deleted, it is no longer available for retrieval from the service and it is not possible to restore it; it is therefore recommended that you back up the lexicon’s contents using the GetLexicon operation before executing DeleteLexicon.

The list of errors this operation can return is described later in this section.
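The backup recommendation above can be turned into a small routine that refuses to delete until the lexicon has been saved. In the sketch below, call is a hypothetical transport function supplied by the caller: it takes an action name and a payload, performs the signed POST, and returns the decoded JSON response (or None for an empty response). Nothing about how requests are actually sent is assumed.

```python
import json
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: takes an action name and a JSON-serializable
# payload, performs the signed POST, and returns the decoded JSON response
# (or None for an empty 204 response).
Transport = Callable[[str, Dict[str, Any]], Optional[Dict[str, Any]]]


def backup_and_delete(call: Transport, name: str, backup_path: str) -> None:
    """Save a lexicon to backup_path, then delete it from the service."""
    response = call("GetLexicon", {"Name": name})
    if response is None or "Lexicon" not in response:
        raise RuntimeError(f"could not retrieve lexicon {name!r}; nothing deleted")
    with open(backup_path, "w", encoding="utf-8") as handle:
        json.dump(response["Lexicon"], handle)
    # Only reached once the backup has been written.
    call("DeleteLexicon", {"Name": name})
```

The saved file holds the same Name and Contents pair that PutLexicon accepts, so the lexicon can be stored again later if needed.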

Request syntax
{
    "Name" : "string"
}

Name is a required attribute.

The operation supports only HTTP POST requests and JSON-formatted payloads. All attribute names and values are case-sensitive.

Request attributes

Name

The name of the lexicon to remove.

Type: string

Response

A successful request yields an empty response with a 204 No Content HTTP status code.

Errors

For information about errors that are common to all actions, see the Errors section.

LexiconNotFoundException

No lexicon matching the given name was found.

HTTP Status Code: 400

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.

Examples

Sample Request
POST /DeleteLexicon HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "Name" : "tomatoFix"
}
Sample Response
HTTP/1.1 204 No Content
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Date: <Date>

ListLexicons

The ListLexicons operation retrieves a list of user-defined lexicons available for speech synthesis using the CreateSpeech action.

The list of errors this operation can return is described later in this section.

Request

An HTTP POST request with an empty payload.
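Even with an empty payload, the sample request for this action carries an x-amz-content-sha256 header. Assuming that header holds the hex SHA-256 digest of the request body, as is usual with Signature Version 4, its value for an empty body is a fixed constant that the following sketch computes. The function name is chosen for the example.

```python
import hashlib


def payload_hash(payload: bytes = b"") -> str:
    """Hex SHA-256 digest of the request body."""
    return hashlib.sha256(payload).hexdigest()


# Digest of the empty body sent with ListLexicons.
EMPTY_PAYLOAD_HASH = payload_hash()
```

The same function applies to the other actions by passing the encoded JSON payload instead of an empty byte string.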

Response syntax
{
    "LexiconNames" : [ "string", "string", "string" ]
}
LexiconNames

The names of the lexicons currently stored in the service and available for speech synthesis.

Type: list of strings

Errors

For information about errors that are common to all actions, see the Errors section.

Additional response headers

The response contains additional headers that give more information about the performed request.

Header name Description

x-amzn-RequestId

String identifying the request. It can be overwritten by setting this header on the request. This is useful for tracking requests.

x-amzn-IvonaTtsRequestId

String uniquely identifying the request.

Examples

Sample Request
POST /ListLexicons HTTP/1.1
X-Amz-Date: <Date>
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>,
SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=<Signature>
User-Agent: <UserAgentString>
x-amz-content-sha256: <PayloadHash>
Host: <ServiceEndpointAddress>
Sample Response
HTTP/1.1 200 OK
x-amzn-RequestId: <RequestId>
x-amzn-IvonaTtsRequestId: <IvonaTtsRequestId>
Date: <Date>
Content-Type: application/json
Content-Length: <PayloadSizeBytes>

{
    "LexiconNames" : ["tomatoFix", "AmericanEng", "lambFix"]
}
 
Copyright © 2015 IVONA Software. All rights reserved.