
Azure

Azure AI Speech (formerly Azure Cognitive Services Speech) is Microsoft's speech service, which includes neural text-to-speech. JSON2Video integrates with it to generate voiceovers, and it is a good fit when you need:

  • Languages and neural voices not covered by ElevenLabs.
  • Enterprise or compliance requirements that mandate Azure as the AI provider.
  • SSML support for fine-grained pronunciation, breaks, and prosody control.

How it appears in JSON2Video

To use Azure, set model to azure on a voice element:

{
  "type": "voice",
  "model": "azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, this is an Azure-generated voiceover."
}

The voice field expects an Azure voice short name. See json2video.com/ai-voices/azure for the catalog of supported voices, or refer to Microsoft's voice gallery for the complete list.
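
If you generate movies from code, the element above can be built as a plain dictionary and serialized to JSON. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the scenes/elements envelope around the element is an assumption about the surrounding movie structure, not something this page specifies.

```python
import json


def azure_voice_element(text: str, voice: str = "en-US-JennyNeural") -> dict:
    """Build a voice element with the four fields shown in the example above."""
    return {
        "type": "voice",
        "model": "azure",
        "voice": voice,  # Azure voice short name
        "text": text,
    }


def wrap_in_movie(elements: list) -> dict:
    """ASSUMPTION: a movie holds a list of scenes, each with an "elements" list."""
    return {"scenes": [{"elements": elements}]}


movie = wrap_in_movie(
    [azure_voice_element("Hello, this is an Azure-generated voiceover.")]
)
print(json.dumps(movie, indent=2))
```

Keeping the element builder separate from the movie envelope means the same helper works wherever your movie structure places voice elements.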

Bring your own Azure subscription

To use your own Azure subscription (and pay Microsoft directly):

  1. In the Azure portal, create or pick a Speech resource. Note the region and the API key.
  2. At json2video.com/dashboard/connections, create a Connection of type Azure, paste the key, and set the region.
  3. Reference the Connection on the voice element:
{
  "type": "voice",
  "model": "azure",
  "connection": "my-azure",
  "voice": "en-US-JennyNeural",
  "text": "Hello, world!"
}
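
When the same code has to serve both billing modes, the connection field can be added conditionally. This sketch assumes that omitting connection falls back to JSON2Video-billed usage. The function name is hypothetical, and "my-azure" is simply the value from the example above.

```python
import json
from typing import Optional


def azure_voice_element(
    text: str, voice: str, connection: Optional[str] = None
) -> dict:
    """Build an Azure voice element, adding "connection" only when one is given."""
    element = {"type": "voice", "model": "azure"}
    if connection:
        # Only set when you want to use your own Azure subscription.
        element["connection"] = connection
    element["voice"] = voice
    element["text"] = text
    return element


own_subscription = azure_voice_element(
    "Hello, world!", "en-US-JennyNeural", connection="my-azure"
)
print(json.dumps(own_subscription, indent=2))
```

Leaving the key out entirely, rather than sending an empty string, avoids depending on how the engine treats blank connection values.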

SSML support

Azure voices accept SSML tags such as <break>, <prosody>, and <phoneme>. Pass the SSML in the text field; the engine forwards it to Azure unchanged.
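
Because SSML is XML embedded in a JSON string, two layers of escaping apply: XML-escape the spoken text, then let a JSON serializer escape the quotes. The sketch below shows one way to do that with the Python standard library. The helper names are hypothetical. Whether the text field needs a full <speak> wrapper is not stated here, so the sketch sends only inline tags and uses a temporary <speak> root purely for a local well-formedness check.

```python
import json
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def pause(milliseconds: int) -> str:
    """Return an SSML <break> tag of the given length."""
    return f'<break time="{int(milliseconds)}ms"/>'


def prosody(text: str, rate: str) -> str:
    """Wrap XML-escaped text in a <prosody> tag with the given rate."""
    return f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"


def is_well_formed(fragment: str) -> bool:
    """Check the fragment parses as XML inside a temporary <speak> root."""
    try:
        ET.fromstring(f"<speak>{fragment}</speak>")
        return True
    except ET.ParseError:
        return False


ssml_text = (
    escape("Terms & conditions apply.")
    + pause(500)
    + prosody("Read this part slowly.", "slow")
)

element = {
    "type": "voice",
    "model": "azure",
    "voice": "en-US-JennyNeural",
    "text": ssml_text,
}

if is_well_formed(ssml_text):
    # json.dumps escapes the double quotes inside the SSML attributes.
    print(json.dumps(element, indent=2))
```

Validating locally before rendering catches unclosed tags and stray ampersands early, before the text reaches Azure.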

Cost

Without a Connection, usage is billed by JSON2Video; see the credit consumption table. With a Connection, Microsoft bills your Azure subscription according to the Speech service pricing.

See also