Dariya voice
Dariya is a Russian-language (ru-RU) voice available in the Azure Text-to-Speech service.
How to use Dariya voice in your videos
To use the Dariya voice in your videos, add a voice element like the following to your JSON2Video code:
{
"type": "voice",
"model": "azure",
"voice": "ru-RU-DariyaNeural",
"text": "\u0412\u0435\u0441\u043d\u043e\u0439 \u0441\u0430\u0434 \u043e\u0436\u0438\u0432\u0430\u0435\u0442 \u044f\u0440\u043a\u0438\u043c\u0438 \u0446\u0432\u0435\u0442\u0430\u043c\u0438 \u0438 \u043f\u0435\u043d\u0438\u0435 \u043f\u0442\u0438\u0446. \u0421\u0442\u0430\u0440\u044b\u0439 \u0434\u0443\u0431 \u0441\u043e\u0437\u0434\u0430\u0435\u0442 \u0442\u0435\u043d\u044c \u0434\u043b\u044f \u043f\u043e\u0441\u0435\u0442\u0438\u0442\u0435\u043b\u0435\u0439, \u0430 \u0431\u0430\u0431\u043e\u0447\u043a\u0438 \u0442\u0430\u043d\u0446\u0443\u044e\u0442 \u0441\u0440\u0435\u0434\u0438 \u0440\u043e\u0437. \u041d\u0435\u0431\u043e\u043b\u044c\u0448\u043e\u0439 \u0444\u043e\u043d\u0442\u0430\u043d \u0438\u0437\u0434\u0430\u0435\u0442 \u043c\u0438\u0440\u043d\u044b\u0435 \u0437\u0432\u0443\u043a\u0438, \u0434\u0435\u043b\u0430\u044f \u044d\u0442\u043e \u0438\u0434\u0435\u0430\u043b\u044c\u043d\u044b\u043c \u043c\u0435\u0441\u0442\u043e\u043c \u0434\u043b\u044f \u043e\u0442\u0434\u044b\u0445\u0430 \u0438 \u043d\u0430\u0441\u043b\u0430\u0436\u0434\u0435\u043d\u0438\u044f \u043a\u0440\u0430\u0441\u043e\u0442\u043e\u0439 \u043f\u0440\u0438\u0440\u043e\u0434\u044b."
}
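The `\uXXXX` sequences in the `text` field above are ordinary JSON escapes for Cyrillic characters, so a JSON parser reads them as the same string as literal Russian text. The following is a minimal sketch, using only Python's standard `json` module, of how such an element could be generated from readable text. The helper name `build_voice_element` is hypothetical and not part of JSON2Video, and the sample text is just the first two words of the example above.

```python
import json


def build_voice_element(text, voice="ru-RU-DariyaNeural", model="azure"):
    """Return a dict shaped like the voice element shown above.

    Hypothetical helper for illustration; the field names are taken
    from the example element, nothing else is assumed about the API.
    """
    return {"type": "voice", "model": model, "voice": voice, "text": text}


element = build_voice_element("Весной сад")

# Default output is ASCII-only: Cyrillic becomes \uXXXX escapes.
escaped = json.dumps(element)

# ensure_ascii=False keeps the Cyrillic characters readable instead.
readable = json.dumps(element, ensure_ascii=False)

print(escaped)
```

Both strings parse back to the same element, so either form can be used wherever the JSON is accepted as UTF-8.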
Dariya supports SSML
SSML stands for Speech Synthesis Markup Language. It is an XML-based markup for adding instructions to your text so that a Text-to-Speech (TTS) system knows how to read it aloud.
SSML works much like HTML, but it controls speech instead of layout. It lets you adjust pronunciation, pauses, pitch, volume, emphasis, and speaking rate.
{
"type": "voice",
"voice": "ru-RU-DariyaNeural",
"text": "<speak>Hello, <break time=\"500ms\"/> how are you today? <emphasis level=\"strong\">This is important!</emphasis></speak>"
}
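Because SSML attributes use double quotes, those quotes have to be escaped when the markup is placed inside a JSON string. One way to avoid escaping by hand is to let a JSON library do it. The sketch below, again using only the Python standard library, checks that the SSML is well-formed XML and then serializes the element. It assumes nothing about JSON2Video beyond the field names shown above, and it does not check whether the service accepts any particular SSML tag.

```python
import json
import xml.etree.ElementTree as ET

# SSML written naturally, with double-quoted attributes.
ssml = (
    '<speak>Hello, <break time="500ms"/> how are you today? '
    '<emphasis level="strong">This is important!</emphasis></speak>'
)

# Parsing raises xml.etree.ElementTree.ParseError if a tag is left
# unclosed or an attribute is malformed.
root = ET.fromstring(ssml)

element = {"type": "voice", "voice": "ru-RU-DariyaNeural", "text": ssml}

# json.dumps escapes the inner double quotes as \" automatically.
payload = json.dumps(element, indent=2)
print(payload)
```

Serializing this way keeps the SSML readable in source code while the output stays valid JSON.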
Dariya is a neural voice
In Azure Cognitive Services, a neural voice is a voice generated with neural network technology. The Text-to-Speech system uses machine learning models to produce more natural, human-like speech than traditional synthesis methods.
Key characteristics of neural voices:
- More expressive and realistic
- Better at handling pitch, tone, and rhythm variations
- Closer to how humans naturally speak