IVONA For Developers

Develop with IVONA Text-to-Speech.

1. Synthesize text to file


Introduction

This tutorial shows how to synthesize text to an audio file using the IVONA Speech Cloud service.

Establish connection

After creating a project in Visual Studio, we can start by adding a reference to the IVONA Speech Cloud Web Service. To do that, add a service reference to the project and point it at the Web Service, which is available at the following address:

http://api.ivona.com/saasapiwsdl.xml

Let's call this reference Ivona.SpeechCloud in our project. Once the service reference has been added successfully, Visual Studio automatically creates a wrapper for the Web Service, and we can call Web Service methods directly from code. We just need to create an instance of the Web Service client:

var ws = new Ivona.SpeechCloud.IVONATTSSaaSPortTypeClient();

Authorization

All IVONA.com functionality is available only to registered users whose account has the IVONA TTS SaaS service active, so almost every API request must be authorized. The only unauthorized request in the API is the request for a token, which is then used in the authorization process.

Authorization is based on MD5 sums calculated from your API key and a token obtained from the API with the getToken() request. Every request other than getToken() itself must be preceded by a getToken() call. Every authorized request takes an md5 parameter whose value is derived from that token: the MD5 hash of the API key's MD5 hash concatenated with the token. It can be computed as follows:

public static string GetMd5Hash(string input)
{
    MD5 md5Hash = MD5.Create();
    // Convert the input string to a byte array and compute the hash.
    byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
    // Create a new Stringbuilder to collect the bytes
    // and create a string.
    StringBuilder sBuilder = new StringBuilder();
    // Loop through each byte of the hashed data
    // and format each one as a hexadecimal string.
    for (int i = 0; i < data.Length; i++)
    {
        sBuilder.Append(data[i].ToString("x2"));
    }
    // Return the hexadecimal string.
    return sBuilder.ToString();
}
...
string token = ws.getToken("user@address");
// md5 parameter: MD5 of (MD5 of the API key, followed by the token)
string md5 = GetMd5Hash(GetMd5Hash("apiKey") + token);

Because every authorized request needs its own token, a typical sequence of calls looks like this:

  1. getToken() method call for a new token.
  2. listVoices() method call using token acquired in step 1 to calculate the md5 parameter value.
  3. getToken() method call for a new token.
  4. listCodecs() method call using token acquired in step 3 to calculate the md5 parameter value.
  5. etc…
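
The call sequence can also be sketched without touching the network. The following is only an illustration, assuming the md5(md5(apiKey) + token) formula used by the sample code in this tutorial. ITokenSource and FakeTokenSource are hypothetical stand-ins for the generated Web Service client and are not part of the IVONA API; the API key is a placeholder.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-in for the Web Service client; only getToken() is modeled.
public interface ITokenSource
{
    string getToken(string userName);
}

// Fake source that returns a different token on every call (illustration only).
public class FakeTokenSource : ITokenSource
{
    private int counter;

    public string getToken(string userName)
    {
        counter++;
        return "token-" + counter;
    }
}

public static class AuthSketch
{
    // Lowercase hexadecimal MD5 digest of a UTF-8 string.
    public static string GetMd5Hash(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] data = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            StringBuilder sb = new StringBuilder(data.Length * 2);
            foreach (byte b in data)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }

    // Value of the md5 parameter: md5(md5(apiKey) + token).
    public static string Md5Parameter(string apiKey, string token)
    {
        return GetMd5Hash(GetMd5Hash(apiKey) + token);
    }

    public static void Main()
    {
        ITokenSource ws = new FakeTokenSource();
        string apiKey = "apiKey"; // placeholder, not a real key

        // Each authorized request gets its own token and its own md5 value.
        foreach (string request in new[] { "listVoices", "listCodecs" })
        {
            string token = ws.getToken("user@address");
            string md5 = Md5Parameter(apiKey, token);
            Console.WriteLine(request + ": token=" + token + " md5=" + md5);
        }
    }
}
```

Keeping the formula in one helper such as Md5Parameter means each request only has to fetch a fresh token and pass it through.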

Generate speech

Once we have a token and can authorize against the service, we can start generating speech. The following method requests synthesis of the specified text:

/// <summary>
/// This function creates a new sound file from text.
/// </summary>
/// <param name="webService">Ivona Web Service</param>
/// <param name="u">User data</param>
/// <param name="td">Text data</param>
private static void TTSSimple(IVONATTSSaaS webService, User u, TextData td)
{
    string token = webService.getToken(u.UserName);
    string speechid, fileUrl, playerCode;
    int charactersPrice;
 
    speechid = webService.createSpeechFile(token, User.GetMd5Hash(u.apiKey + token), td.text, td.contentType,
                                                    td.defaultVoice, td.codecID, td.additionaParams.ToArray(),
                                                    out charactersPrice, out fileUrl, out playerCode);
 
    System.Console.WriteLine("Your file is placed right here:\n" + fileUrl);
    System.Console.WriteLine("Download price: " + charactersPrice);
 
    if(ArgsParser.speakers)
        IvonaSoftware.Player.WebPlayer.DirectPlayAudioStream(fileUrl);
}

Additionally, for the purpose of this sample we have created a WebPlayer class that plays the synthesized audio through the speakers. This class uses the open source NAudio library to play the synthesized file. NAudio can be downloaded for free from the NAudio web page.
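
Since createSpeechFile returns only the location of the generated audio, storing it on disk is a separate step. Below is a minimal sketch of that step, assuming fileUrl is a plain HTTP(S) URL that can be fetched without further authorization (this tutorial does not state that explicitly). SpeechFileSaver, FileNameFromUrl, SaveAsync and the fallback name "speech.mp3" are hypothetical names chosen for illustration and are not part of the sample project.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class SpeechFileSaver
{
    // Picks a local file name from the last path segment of the URL,
    // falling back to the given name when the URL has no usable segment.
    public static string FileNameFromUrl(string fileUrl, string fallback)
    {
        string name = Path.GetFileName(new Uri(fileUrl).LocalPath);
        return string.IsNullOrEmpty(name) ? fallback : name;
    }

    // Downloads the audio at fileUrl, writes it into targetDirectory
    // and returns the path of the written file.
    public static async Task<string> SaveAsync(string fileUrl, string targetDirectory)
    {
        // "speech.mp3" is an arbitrary fallback; the real extension depends on the codec.
        string path = Path.Combine(targetDirectory, FileNameFromUrl(fileUrl, "speech.mp3"));
        using (HttpClient http = new HttpClient())
        {
            byte[] audio = await http.GetByteArrayAsync(fileUrl);
            File.WriteAllBytes(path, audio);
        }
        return path;
    }
}
```

Inside TTSSimple this could be called after createSpeechFile, for example as SpeechFileSaver.SaveAsync(fileUrl, ".").GetAwaiter().GetResult();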

Now we just need to call TTSSimple from the main program:

static void Main(string[] args)
{
    ArgsParser.ParseArgs(args);
 
    IVONATTSSaaS webService = new IVONATTSSaaS();
    User u = new User(ConfigurationManager.AppSettings["username"], User.GetMd5Hash(ConfigurationManager.AppSettings["apiKey"]));
    string token = webService.getToken(u.UserName);
 
    // for debug information you can check if a pair user, apiKey is valid
    if (webService.checkToken(token, User.GetMd5Hash(u.apiKey + token)) == 1)
    {
        foreach (TextData td in ArgsParser.parsedArgs)
        {
            // run synthesis
            TTSSimple(webService, u, td);
        }
    }
    else
    {
        System.Console.WriteLine("Access denied!");
    }
}
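
Main reads the credentials through ConfigurationManager.AppSettings, so they have to be present in the application configuration file. A minimal app.config sketch, assuming the standard .NET appSettings section and the key names "username" and "apiKey" used in the code above; both values are placeholders:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Placeholder values; replace them with your own account data. -->
    <add key="username" value="user@address" />
    <add key="apiKey" value="your-api-key" />
  </appSettings>
</configuration>
```

Note that Main applies GetMd5Hash to the configured value itself, so the file holds the plain API key rather than its hash.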

Check result

Now it is time to check whether our program works correctly. First, make sure that the username and API key in the app.config file are the right ones for your account. Then run the sample with the following command:

[Tutorial] --text "hello world" "text/plain" --speakers

The command line above synthesizes the given text and sends it to the speakers (assuming you have linked an MP3 decoder library such as NAudio). The “text/plain” parameter specifies the format of the input text.
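
The ArgsParser class itself is not listed in this tutorial. To illustrate how such a command line can be interpreted, here is a simplified and purely hypothetical stand-in that understands only the two switches shown above: --text followed by a text and its content type, and --speakers. The names ParsedArgs and ArgsSketch are assumptions; the parser in the downloadable sample may work differently and support more options.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type: (text, content type) pairs plus the speakers flag.
public sealed class ParsedArgs
{
    public List<KeyValuePair<string, string>> Texts = new List<KeyValuePair<string, string>>();
    public bool Speakers;
}

public static class ArgsSketch
{
    public static ParsedArgs Parse(string[] args)
    {
        ParsedArgs result = new ParsedArgs();
        for (int i = 0; i < args.Length; i++)
        {
            if (args[i] == "--speakers")
            {
                result.Speakers = true;
            }
            else if (args[i] == "--text")
            {
                // --text must be followed by two values: the text and its content type.
                if (i + 2 >= args.Length)
                {
                    throw new ArgumentException("--text expects a text and a content type");
                }
                result.Texts.Add(new KeyValuePair<string, string>(args[i + 1], args[i + 2]));
                i += 2; // skip the two values just consumed
            }
            else
            {
                throw new ArgumentException("Unknown argument: " + args[i]);
            }
        }
        return result;
    }

    public static void Main(string[] args)
    {
        ParsedArgs parsed = Parse(args);
        foreach (KeyValuePair<string, string> t in parsed.Texts)
        {
            Console.WriteLine(t.Value + ": " + t.Key);
        }
        Console.WriteLine("speakers: " + parsed.Speakers);
    }
}
```

The shell removes the quotes before the program starts, so "hello world" arrives as a single argument.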

Conclusion

In the paragraphs above we showed how to establish a connection with IVONA Speech Cloud and generate speech.

Complete example

Download source code