Azure
Azure AI Speech (formerly Azure Cognitive Services Speech) is Microsoft's text-to-speech service. JSON2Video integrates with Azure for high-quality voiceovers, particularly when you need:
- Languages and neural voices not covered by ElevenLabs.
- Enterprise / compliance requirements that mandate Azure as the AI provider.
- SSML support for fine-grained pronunciation, breaks, and prosody control.
How it appears in JSON2Video
Reference Azure on a voice element by setting model to azure:
{
"type": "voice",
"model": "azure",
"voice": "en-US-JennyNeural",
"text": "Hello, this is an Azure-generated voiceover."
}
The voice field expects an Azure voice short name. See json2video.com/ai-voices/azure for the catalog of supported voices, or refer to Microsoft's voice gallery for the complete list.
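A short name combines a locale with a voice name, so switching language is a matter of changing the voice value. As a sketch, a Spanish voiceover might look like the following. es-ES-ElviraNeural is a voice from Microsoft's gallery; its availability in the JSON2Video catalog is an assumption to check against the catalog page mentioned above.

```json
{
  "type": "voice",
  "model": "azure",
  "voice": "es-ES-ElviraNeural",
  "text": "Hola, esta es una locución generada con Azure."
}
```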
Bring your own Azure subscription
To use your own Azure subscription (and pay Microsoft directly):
- In the Azure portal, create or select a Speech resource, and note its region and API key.
- At json2video.com/dashboard/connections, create a Connection of type Azure, paste the key, and set the region.
- Reference the Connection on the voice element through the connection field (my-azure below is a placeholder for your own Connection):
{
"type": "voice",
"model": "azure",
"connection": "my-azure",
"voice": "en-US-JennyNeural",
"text": "Hello, world!"
}
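For context, the voice element would typically sit inside a scene. The sketch below assumes the usual JSON2Video movie layout of a scenes array whose entries each hold an elements array, which this section does not itself show. my-azure remains a placeholder for your own Connection.

```json
{
  "scenes": [
    {
      "elements": [
        {
          "type": "voice",
          "model": "azure",
          "connection": "my-azure",
          "voice": "en-US-JennyNeural",
          "text": "Hello, world!"
        }
      ]
    }
  ]
}
```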
SSML support
Azure voices accept SSML input, including tags such as <break>, <prosody>, and <phoneme>. Pass SSML via the text field; the engine forwards it directly to Azure.
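A minimal sketch of SSML in the text field, assuming inline tags are accepted without a surrounding <speak> root (this page does not say whether the wrapper is required). Double quotes inside SSML attributes are escaped because the value is a JSON string.

```json
{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Welcome back.<break time=\"500ms\"/> <prosody rate=\"slow\">Let's take this part slowly.</prosody>"
}
```

If the rendered audio reads the tags aloud, the input was probably treated as plain text. Wrapping the content in a full SSML document is the first thing to try in that case.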
Cost
Without a Connection, usage is billed by JSON2Video as documented in the credit consumption table. With a Connection, Microsoft bills your Azure subscription directly per the Speech service pricing.