Azure
Azure Speech is Microsoft's text-to-speech service. JSON2Video integrates with Azure for high-quality voiceovers, particularly when you need:
- Languages and neural voices not covered by ElevenLabs.
- Enterprise / compliance requirements that mandate Azure as the TTS provider.
- SSML support for fine-grained pronunciation, breaks, and prosody control.
How it appears in JSON2Video
Reference Azure on a voice element by setting model to azure:
{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, this is an Azure-generated voiceover."
}
The voice field expects an Azure voice short name. Refer to Microsoft's voice gallery for the complete list.
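As a hedged illustration of the short-name pattern (a locale code, then a voice name that typically ends in Neural), the element below swaps in a Spanish voice. The name es-ES-ElviraNeural comes from Microsoft's catalog rather than from this page, so confirm it in the voice gallery before relying on it:
{
  "type": "voice",
  "model": "azure",
  "voice": "es-ES-ElviraNeural",
  "text": "Hola, esta es una locución generada con Azure."
}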
Bring your own Azure subscription
To use your own Azure subscription (and pay Microsoft directly):
- In the Azure portal, create or pick a Speech resource. Note the region and the API key.
- Create a Connection of type Azure at json2video.com/dashboard/connections, paste the key, and set the region to match the Speech resource.
- Reference the Connection on the voice element:
{
  "type": "voice",
  "model": "azure",
  "connection": "my-azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, world!"
}
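For context, here is a minimal sketch of where such an element might sit in a complete movie. The scenes and elements wrapper is an assumption about the surrounding movie JSON, not something this page specifies, and my-azure stands in for whatever identifier your own Connection uses:
{
  "scenes": [
    {
      "elements": [
        {
          "type": "voice",
          "model": "azure",
          "connection": "my-azure",
          "voice": "en-US-JennyNeural",
          "text": "Hello, world!"
        }
      ]
    }
  ]
}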
SSML support
Azure voices accept SSML, including tags such as <break>, <prosody>, and <phoneme>. Pass the SSML in the text field; the engine forwards it to Azure unchanged.
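A minimal sketch of an element carrying SSML, assuming inline tags in the text field are accepted as written. This page does not say whether a wrapping <speak> root is required, so treat the exact shape as unverified and test it against your account. Single-quoted attributes avoid escaping inside the JSON string:
{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Welcome to the demo.<break time='500ms'/> <prosody rate='slow'>This sentence is read slowly.</prosody>"
}
Here <break> inserts a half-second pause and <prosody> slows the speaking rate for the enclosed sentence only.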
Cost
Usage billed through JSON2Video is documented in the credit consumption table. With a Connection, Microsoft bills your Azure subscription according to the Speech service pricing.