SIP Caller | Using SSML to Improve Your Text-to-Speech Messages

Overview

SSML (Speech Synthesis Markup Language) lets you control how text is read out loud. It allows you to fine-tune the pronunciation, tone, emphasis, and more to create natural and engaging voice prompts. Below are the most useful SSML commands you can use with SIP Caller to customize your Text-to-Speech experience.

Key SSML Commands

<speak>

Purpose: The root element for all SSML, ensuring your prompt is processed as SSML.

Example:

<speak>Hello! How can I help you today?</speak>

<break>

Purpose: Adds a pause in speech. Use this to create a natural conversational flow.
Attributes:
- time: Duration of the pause (e.g., "500ms" for milliseconds or "1s" for seconds).

Example:

<speak>Hello! <break time="500ms"/> How can I assist you?</speak>
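Besides a fixed duration, the W3C SSML specification also defines a strength attribute for <break>, with values such as "weak", "medium", and "strong". The sketch below assumes the selected Text-to-Speech provider honors it. That attribute is not covered elsewhere in this article, so treat it as optional and test it with your campaign voice.

```xml
<speak>
  Thank you for calling.
  <!-- Assumption: the provider maps strength to a pause length of its own choosing -->
  <break strength="strong"/>
  Please stay on the line.
</speak>
```

If the pause length needs to be predictable, the time attribute shown above is the safer choice.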

<emphasis>

Purpose: Changes the emphasis on specific words to add expression.
Attributes:
- level: Can be "strong", "moderate", or "reduced".

Example:

<speak>This is <emphasis level="strong">very important</emphasis> information.</speak>

<prosody>

Purpose: Controls the pitch, rate, and volume of the spoken text.
Attributes:
- rate: Speed of the speech (e.g., "fast", "slow", or a percentage).
- pitch: Pitch of the voice (e.g., "high", "low", or "+10%").
- volume: Volume level (e.g., "loud", "soft").

Example:

<speak><prosody rate="slow" pitch="low">Please listen carefully.</prosody></speak>
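The example above sets only rate and pitch. As a minimal sketch, the volume attribute can be combined with rate in the same way to make one part of a prompt stand out. The prompt text is illustrative, and the sketch sticks to named values because percentage values may be interpreted differently by each provider.

```xml
<speak>
  Your confirmation code is
  <!-- Slower and louder for the part the listener needs to write down -->
  <prosody rate="slow" volume="loud">4 7 2 9</prosody>.
</speak>
```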

<say-as>

Purpose: Specifies the type of content to help with pronunciation (e.g., dates, times, addresses).
Attributes:
- interpret-as: Can be set to "date", "time", "characters", "expletive", etc.

Example:

<speak>The date is <say-as interpret-as="date">2024-11-08</say-as>.</speak>
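Two of the other interpret-as values listed above can be sketched as follows. The reference code and opening time are made-up sample values, and the format="hms12" attribute is an assumption taken from the Google and Azure SSML documentation rather than from this article, so confirm it against the provider you use.

```xml
<speak>
  <!-- "characters" spells the content out one character at a time -->
  Your reference is <say-as interpret-as="characters">AB12</say-as>.
  <break time="300ms"/>
  <!-- Assumption: format="hms12" marks the content as a 12-hour clock time -->
  We open at <say-as interpret-as="time" format="hms12">9:00am</say-as>.
</speak>
```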

<sub>

Purpose: Replaces the enclosed text with an alias when spoken, for example to read an abbreviation or acronym in its full form.
Attributes:
- alias: The full text to be read.

Example:

<speak>SIP Caller uses <sub alias="Session Initiation Protocol">SIP</sub> to connect to the PBX.</speak>

<p> and <s>

Purpose: Defines paragraphs (<p>) and sentences (<s>) to structure longer prompts for better pacing.

Example:

<speak><p>Welcome to SIP Caller.</p><p>How can we assist you today?</p></speak>
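The example above uses only <p>. A longer prompt can nest <s> elements inside each paragraph so that sentence boundaries are explicit. The wording below is illustrative only.

```xml
<speak>
  <p>
    <s>Welcome to SIP Caller.</s>
    <s>Your appointment is confirmed for tomorrow.</s>
  </p>
  <p>
    <s>If you need to reschedule, please contact us.</s>
  </p>
</speak>
```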

Quick Tips

- Test Variations: Play around with combinations of <break>, <emphasis>, and <prosody> to find the most natural-sounding prompts.
- Keep it Clear: Overuse of SSML can make prompts sound robotic. Focus on clarity and flow.
- Accessibility: Use <say-as> for dates, times, and special pronunciations to ensure accuracy.
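As a starting point for experimenting, the following sketch combines several of the elements described above in one short prompt. The company name and wording are placeholders, and the result should be played back with your campaign voice, since each provider may render the same markup differently.

```xml
<speak>
  <p>
    <s>Hello, this is a reminder from ACME.</s>
    <s>Your appointment is on <say-as interpret-as="date">2024-11-08</say-as>.</s>
  </p>
  <break time="500ms"/>
  <p>
    <s>Please arrive <emphasis level="moderate">ten minutes early</emphasis>.</s>
    <s><prosody rate="slow">Thank you, and goodbye.</prosody></s>
  </p>
</speak>
```

Note that only a few tags are used, in line with the advice to keep prompts clear.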

By using SSML effectively in SIP Caller, you can deliver clear, professional, and engaging voice prompts that enhance customer interactions. For further customization, see the SSML documentation for Google Cloud Text-to-Speech or for Azure Text-to-Speech, and try out different configurations with SIP Caller's Text-to-Speech feature.

Special Note When Using SSML with Azure Text-to-Speech

Azure Text-to-Speech validates SSML tag structure much more strictly than Google Cloud Text-to-Speech, which is more permissive.

For Azure TTS to work correctly in SIP Caller, you must explicitly include both the <speak> root element and a <voice> element in your SSML input.

Additionally, the <voice> element must specify exactly the same voice that is selected for the campaign in SIP Caller. If the structure is incomplete or the voice defined in SSML does not match the campaign voice, Azure may reject the request and no audio will be generated.

Example:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <voice name="en-US-AvaNeural">
        Hello. <break time="1s"/> This is a call from ACME. <break time="1s"/> This is to confirm your appointment tomorrow. <break time="1s"/> If there are any issues with this visit, please contact us. Thank you.
    </voice>
</speak>
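The other elements described in this article go inside the <voice> element when Azure is the provider. The sketch below assumes the campaign voice is en-US-AvaNeural, as in the example above. Replace it with the voice actually selected for your campaign.

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <!-- Assumption: the campaign in SIP Caller uses this same voice -->
    <voice name="en-US-AvaNeural">
        <prosody rate="slow">Please listen carefully.</prosody>
        <break time="500ms"/>
        Your appointment is confirmed for tomorrow.
    </voice>
</speak>
```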
