Speech Cloud Documentation

Lexicons

IVONA SpeechCloud supports the Pronunciation Lexicon Specification (PLS).

A lexicon is a PLS file that contains a set of rules to be applied for a specific language. The rules defined in the standard can:

  • Perform text substitution and expand acronyms.

  • Tune pronunciation using IPA or X-SAMPA alphabets.

  • Disambiguate homographs using roles.

You can store PLS files and name them using the PutLexicon method.

When calling the CreateSpeech method, you can specify a list of lexicons, referring to them by name. All the rules from the specified lexicons are applied in the order they were defined when generating speech.

You can use ListLexicons, GetLexicon, and DeleteLexicon to list the names of all your lexicons, retrieve the content of a specific lexicon, and delete a lexicon by name, respectively.

Limits

The following limits and restrictions apply when uploading lexicons to the service:

  • Each account can store no more than 5 lexicons at a time.

  • A lexicon must not exceed 4 KB in size.

  • The name of a lexicon must be a string of no more than 10 alphanumeric characters, and cannot include whitespace or any other special characters.

  • Lexicon names are unique per account; storing a lexicon with the same name as one already stored will replace the older lexicon.
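
The limits above can be checked on the client before a lexicon is uploaded. The following is a minimal sketch, not part of the service API: the function name check_lexicon is hypothetical, "4 KB" is assumed to mean 4096 bytes of UTF-8 encoded content, and "alphanumeric" is assumed to mean ASCII letters and digits with at least one character.

```python
import re

MAX_LEXICONS = 5            # per account, from the limits above
MAX_SIZE_BYTES = 4 * 1024   # assumption: "4 KB" means 4096 bytes
NAME_PATTERN = re.compile(r"[A-Za-z0-9]{1,10}")  # assumption: ASCII only, non-empty


def check_lexicon(name, content, stored_names=()):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be 1-10 alphanumeric characters")
    size = len(content.encode("utf-8"))
    if size > MAX_SIZE_BYTES:
        problems.append(f"lexicon is {size} bytes; limit is {MAX_SIZE_BYTES}")
    # Reusing an existing name replaces that lexicon, so it does not add to the count.
    if name not in stored_names and len(stored_names) >= MAX_LEXICONS:
        problems.append(f"account already stores {MAX_LEXICONS} lexicons")
    return problems


if __name__ == "__main__":
    print(check_lexicon("my lexicon", "<lexicon/>"))
```

The stored_names argument would typically come from a ListLexicons call; it is optional so the name and size checks can run offline.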

Examples of valid PLS

Example PLS that handles the pronunciation of tomato
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>tomato</grapheme>
    <phoneme>təmei̥ɾou̥</phoneme>
    <!-- IPA string is: "t&#x0259;mei&#x325;&#x027E;ou&#x325;" -->
  </lexeme>
</lexicon>
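
The comment in the tomato example spells the same IPA string with numeric character references, which is convenient when an editor cannot type IPA directly. The short check below is only an illustration: it uses Python's html.unescape, which resolves numeric references the same way an XML parser would, to confirm which code points the escaped form produces.

```python
import html
import unicodedata

# Escaped form copied from the comment in the tomato example.
escaped = "t&#x0259;mei&#x325;&#x027E;ou&#x325;"
decoded = html.unescape(escaped)

# List every code point so combining marks (which are easy to miss) are visible.
for ch in decoded:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```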
Example PLS that expands UE to "Unione Europea" in Italian
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="it">
  <lexeme>
    <grapheme>UE</grapheme>
    <alias>Unione Europea
      <!-- This substitutes the acronym for the European
      Union in the Italian language. --></alias>
  </lexeme>
</lexicon>
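
Before uploading a lexicon, it can help to confirm that the file is well-formed and contains the rules you expect. The sketch below uses only Python's standard xml.etree.ElementTree module; the function name summarize_lexicon and the SAMPLE string (a trimmed-down Italian lexicon) are illustrative and not part of the service. It checks well-formedness only and does not validate against the PLS schema.

```python
import xml.etree.ElementTree as ET

PLS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def summarize_lexicon(pls_text):
    """Return (language, rules); each rule is (grapheme, kind, value).

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(pls_text.encode("utf-8"))
    rules = []
    for lexeme in root.iter(PLS + "lexeme"):
        graphemes = [(g.text or "").strip() for g in lexeme.findall(PLS + "grapheme")]
        for kind in ("phoneme", "alias"):
            for node in lexeme.findall(PLS + kind):
                # XML comments are skipped by the default parser, so only text remains.
                value = "".join(node.itertext()).strip()
                rules.extend((g, kind, value) for g in graphemes)
    return root.get(XML_LANG), rules


SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="it">
  <lexeme>
    <grapheme>UE</grapheme>
    <alias>Unione Europea</alias>
  </lexeme>
</lexicon>"""

if __name__ == "__main__":
    print(summarize_lexicon(SAMPLE))
```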
 
Copyright © 2015 IVONA Software. All rights reserved.