Kai voice
Kai is an English (en-US) voice available in the Azure Text-to-Speech service.
How to use the Kai voice in your videos
To use the Kai voice in your videos, use the following JSON2Video code:
{
"type": "voice",
"model": "azure",
"voice": "en-US-KaiNeural",
"text": "In springtime, the garden comes alive with colorful flowers and singing birds. The old oak tree provides shade for visitors, while butterflies dance among the roses. A small fountain creates peaceful sounds, making this the perfect spot to relax and enjoy nature's beauty."
}
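If you generate the movie JSON from a script, a small helper can build this element for you. The following is a minimal sketch in Python that assumes only the fields shown in the example above; the function name `voice_element` is hypothetical and not part of any JSON2Video SDK.

```python
import json

def voice_element(text, voice="en-US-KaiNeural", model="azure"):
    """Build a voice element dict with the fields shown in the example above."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"type": "voice", "model": model, "voice": voice, "text": text}

# Serialize the element so it can be pasted into a movie definition.
element = voice_element("In springtime, the garden comes alive.")
print(json.dumps(element, indent=2))
```

Keeping the voice name as a default argument makes it easy to reuse the same helper with other Azure voices.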
Kai supports SSML
SSML stands for Speech Synthesis Markup Language. It's a way to add instructions to your text so that a Text-To-Speech (TTS) system knows how to read it aloud.
SSML works like HTML, but for controlling speech: you wrap parts of the text in tags to adjust pronunciation, pauses, pitch and volume, emphasis, and speaking rate.
{
"type": "voice",
"voice": "en-US-KaiNeural",
"text": "<speak>Hello, <break time=\"500ms\"/> how are you today? <emphasis level=\"strong\">This is important!</emphasis></speak>"
}
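SSML attributes use double quotes, and inside a JSON string each of those quotes has to be escaped as \". One way to avoid escaping by hand is to let a JSON serializer do it. The sketch below, in Python, assumes the same element fields as the example above; `ssml_voice_element` is a hypothetical helper name, and the XML parse only checks that the markup is well-formed, not that the tags are valid SSML.

```python
import json
import xml.etree.ElementTree as ET

def ssml_voice_element(ssml, voice="en-US-KaiNeural"):
    """Return a voice element dict after checking the SSML is well-formed XML."""
    ET.fromstring(ssml)  # raises xml.etree.ElementTree.ParseError if malformed
    return {"type": "voice", "voice": voice, "text": ssml}

ssml = (
    '<speak>Hello, <break time="500ms"/> how are you today? '
    '<emphasis level="strong">This is important!</emphasis></speak>'
)

# json.dumps escapes the inner double quotes, producing valid JSON.
payload = json.dumps(ssml_voice_element(ssml))
print(payload)
```

Parsing the markup first catches unclosed tags before the text is sent for synthesis.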
Kai supports different voice styles
As part of SSML, you can use style tags to change the voice style.
Kai supports these styles:
- conversation
{
"type": "voice",
"voice": "en-US-KaiNeural",
"text": "<conversation>I have a secret for you</conversation>"
}
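Because the set of supported styles differs from voice to voice, it can help to check the style name before building the text. The following Python sketch assumes that the tag name is the style name, as the form of the example above suggests; `SUPPORTED_STYLES` and `with_style` are hypothetical names, and the set holds only the style listed for Kai on this page.

```python
# Styles listed for Kai on this page (assumption: tag name equals style name).
SUPPORTED_STYLES = {"conversation"}

def with_style(text, style):
    """Wrap text in a style tag, rejecting styles not listed for this voice."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"style {style!r} is not listed for this voice")
    return f"<{style}>{text}</{style}>"

element = {
    "type": "voice",
    "voice": "en-US-KaiNeural",
    "text": with_style("Nice to see you again.", "conversation"),
}
print(element["text"])
```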
Kai is a neural voice
In Azure Cognitive Services, a neural voice is a voice generated with neural network technology. The Text-To-Speech system uses machine learning models to produce more natural, human-like speech than traditional synthesis methods.
Key characteristics of neural voices:
- More expressive and realistic
- Better at handling pitch, tone, and rhythm variations
- Closer to how humans naturally speak
Use Azure voices in your videos with JSON2Video
JSON2Video lets you create videos programmatically with AI voiceover, subtitles, images and effects. Add any Azure voice to your videos via the API, no-code tools or the CLI.