Azure

Azure Speech is Microsoft's speech service, and JSON2Video uses its text-to-speech voices for high-quality voiceovers. It is particularly useful when you need:

Languages and neural voices not covered by ElevenLabs.
Enterprise / compliance requirements that mandate Azure as the TTS provider.
SSML support for fine-grained pronunciation, breaks, and prosody control.

How it appears in JSON2Video

Reference Azure on a voice element by setting model to azure:

{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, this is an Azure-generated voiceover."
}

The voice field expects an Azure voice short name, such as en-US-JennyNeural. Refer to Microsoft's voice gallery for the complete list.

Bring your own Azure subscription

To use your own Azure subscription (and pay Microsoft directly):

1. In the Azure portal, create or pick a Speech resource. Note the region and the API key.
2. Create a Connection at json2video.com/dashboard/connections of type Azure, paste the key, and set the region.
3. Reference the Connection on the voice element:

{
  "type": "voice",
  "model": "azure",
  "connection": "my-azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, world!"
}

SSML support

Azure voices accept SSML, including tags such as <break>, <prosody>, and <phoneme>. Pass the SSML in the text field; the engine forwards it directly to Azure.
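
As a minimal sketch of SSML in the text field, the voice element below adds a half-second pause and slows down one sentence. It assumes the field accepts SSML tag fragments without a surrounding <speak> root element, which this page does not confirm. The sentence content is purely illustrative.

```json
{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Welcome back.<break time=\"500ms\"/> <prosody rate=\"slow\">Let's take this part slowly.</prosody>"
}
```

Because the SSML sits inside a JSON string, the double quotes around attribute values are escaped as \". Single-quoted attributes such as <break time='500ms'/> are equally valid XML and avoid the escaping.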

Cost

Usage billed by JSON2Video is documented in the credit consumption table. With a Connection, Microsoft bills your Azure subscription according to the Speech service pricing.
