8. Automatic subtitles

Most social platforms autoplay video with sound off. Subtitles double watch time. This chapter adds a subtitles element that transcribes the chapter-7 voice-over automatically, with no SRT file required.

Prerequisites: chapter 7. The listing must already include a voice element; the subtitles element transcribes it.

Step 1: The simplest subtitles element

A bare subtitles element transcribes the audio track of the movie automatically. There can be only one subtitles element per movie, and it always lives at movie level.

{
  "type": "subtitles"
}

That's it. Add it to the top-level elements array and the renderer:

  1. Mixes the voice + audio tracks.
  2. Runs speech-to-text on the result.
  3. Burns the captions onto the canvas in a default style.
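
The two placement rules above (one subtitles element per movie, always at movie level) are easy to check before you submit the JSON. Here is a minimal sketch in Python, assuming the movie is a plain dict shaped like the listings in this chapter. check_subtitles is a hypothetical helper written for this tutorial, not part of any JSON2Video SDK.

```python
def check_subtitles(movie):
    """Return a list of placement problems (empty when the movie looks fine)."""
    problems = []

    # Rule 1: at most one subtitles element in the top-level elements array.
    top_level = [e for e in movie.get("elements", []) if e.get("type") == "subtitles"]
    if len(top_level) > 1:
        problems.append("more than one subtitles element at movie level")

    # Rule 2: subtitles never belong inside a scene.
    for index, scene in enumerate(movie.get("scenes", [])):
        for element in scene.get("elements", []):
            if element.get("type") == "subtitles":
                problems.append(f"subtitles element inside scene {index}; move it to movie level")

    return problems


movie = {
    "elements": [
        {"type": "voice", "text": "Welcome to 123 Oak Street."},
        {"type": "subtitles"},
    ],
    "scenes": [{"duration": 4, "elements": []}],
}
print(check_subtitles(movie))  # prints []
```

The check only looks at structure. It says nothing about whether the renderer accepts the rest of the payload.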

Step 2: Customise the look

Subtitles styling lives in a settings object. The most-used keys are style, font-family, font-size, font-color, outline-color, position, and all-caps.

{
  "type": "subtitles",
  "settings": {
    "style": "boxed-word",
    "font-family": "Inter",
    "font-size": 90,
    "font-color": "#FFFFFF",
    "outline-color": "#000000",
    "position": "bottom-center",
    "all-caps": true,
    "box-color": "#0E7C66"
  }
}

style ranges from classic (simple text overlay) to boxed-word (modern social-style with a coloured box behind the current word). See the Subtitles element reference for all styles and settings.
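
If you generate listings from code, it can help to keep the shared styling in one place and vary only the style. The sketch below is one way to do that in Python. It uses only the keys and values shown in this chapter; subtitles_element and BASE_SETTINGS are hypothetical names, and whether box-color has any effect outside boxed-word is not covered here, so the classic variant simply leaves it out.

```python
import json

# Settings shared by every variant, copied from the example above.
BASE_SETTINGS = {
    "font-family": "Inter",
    "font-size": 90,
    "font-color": "#FFFFFF",
    "outline-color": "#000000",
    "position": "bottom-center",
    "all-caps": True,
}


def subtitles_element(style, **extra):
    """Build a subtitles element; keyword names use underscores, JSON keys use hyphens."""
    settings = {"style": style, **BASE_SETTINGS}
    settings.update({key.replace("_", "-"): value for key, value in extra.items()})
    return {"type": "subtitles", "settings": settings}


classic = subtitles_element("classic")
boxed = subtitles_element("boxed-word", box_color="#0E7C66")
print(json.dumps(boxed, indent=2))
```

json.dumps writes Python's True as JSON true, so the all-caps flag comes out in the form the listings use.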

Step 3: Specify the language (optional)

The transcription engine auto-detects the language by default. To be explicit, or to skip detection and speed up transcription, set language:

{
  "type": "subtitles",
  "language": "en",
  "settings": {
    "style": "boxed-word",
    "font-family": "Inter",
    "font-size": 90,
    "font-color": "#FFFFFF",
    "outline-color": "#000000",
    "position": "bottom-center",
    "all-caps": true,
    "box-color": "#0E7C66"
  }
}

language accepts ISO 639-1 codes (en, es, fr, …).
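
ISO 639-1 codes are two lowercase letters, so a typo such as "eng" or "EN" can be caught before rendering. A minimal sketch, assuming you build the element as a Python dict; with_language is a hypothetical helper, and it checks only the shape of the code, not whether the code is assigned or supported by the transcription engine.

```python
import re

# Two lowercase ASCII letters: the shape of an ISO 639-1 code.
ISO_639_1_SHAPE = re.compile(r"[a-z]{2}")


def with_language(element, code):
    """Return a copy of the element with an explicit language code."""
    if not ISO_639_1_SHAPE.fullmatch(code):
        raise ValueError(f"expected a two-letter ISO 639-1 code, got {code!r}")
    return {**element, "language": code}


spanish = with_language({"type": "subtitles"}, "es")
print(spanish)  # prints {'type': 'subtitles', 'language': 'es'}
```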

Step 4: Mind the room labels at the bottom-left

The chapter-4 room labels live at bottom-left. Subtitles at bottom-center are usually far enough away that the two don't clash, but to rule out any overlap, move the room labels to top-left. The final JSON below does exactly that.
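
With only three labels the move is quick to do by hand, but it can also be scripted. The sketch below assumes the chapter-4 labels are text elements inside scenes with "position": "bottom-left" (an assumption, since that listing is not repeated here) and rewrites them to the top-left placement used in the final JSON. move_labels_to_top_left is a hypothetical helper.

```python
import copy


def move_labels_to_top_left(movie, margin=60):
    """Return a copy of the movie with bottom-left text labels moved to top-left."""
    updated = copy.deepcopy(movie)  # leave the caller's dict untouched
    for scene in updated.get("scenes", []):
        for element in scene.get("elements", []):
            if element.get("type") == "text" and element.get("position") == "bottom-left":
                # Same placement as the labels in the final JSON: 60 px from each edge.
                element.update({"position": "top-left", "x": margin, "y": margin})
    return updated


before = {
    "scenes": [
        {
            "duration": 4,
            "elements": [
                # Assumed shape of a chapter-4 room label.
                {"type": "text", "text": "Exterior", "position": "bottom-left", "x": 60, "y": -60},
            ],
        }
    ]
}
after = move_labels_to_top_left(before)
print(after["scenes"][0]["elements"][0])
```

Movie-level elements such as the price tag are left alone, because the function only walks the scenes array.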

The complete final JSON

{
  "resolution": "full-hd",
  "elements": [
    {
      "type": "audio",
      "src": "https://cdn.json2video.com/assets/audios/uplifting-corporate.mp3",
      "volume": 0.4
    },
    {
      "type": "voice",
      "text": "Welcome to 123 Oak Street, a four-bedroom craftsman home, listed at $849,000.",
      "voice": "en-US-EmmaMultilingualNeural",
      "start": 1.5
    },
    {
      "type": "subtitles",
      "language": "en",
      "settings": {
        "style": "boxed-word",
        "font-family": "Inter",
        "font-size": 90,
        "font-color": "#FFFFFF",
        "outline-color": "#000000",
        "position": "bottom-center",
        "all-caps": true,
        "box-color": "#0E7C66"
      }
    },
    {
      "type": "html",
      "tailwind": true,
      "wait": 0.5,
      "html": "<div class='inline-flex items-center gap-2 px-6 py-4 rounded-xl bg-emerald-700 text-white text-5xl font-bold shadow-lg'>💰 $849,000</div>",
      "position": "bottom-right",
      "x": -60,
      "y": -60,
      "start": 4,
      "duration": 12
    }
  ],
  "scenes": [
    {
      "duration": 4,
      "elements": [
        {
          "type": "component",
          "component": "basic/000",
          "settings": { "headline": "FOR SALE", "subline": "123 Oak Street" }
        }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-front.jpg" },
        { "type": "text", "text": "Exterior", "position": "top-left", "x": 60, "y": 60 }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-kitchen.jpg" },
        { "type": "text", "text": "Chef's Kitchen", "position": "top-left", "x": 60, "y": 60 }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-bedroom.jpg" },
        { "type": "text", "text": "Master Bedroom", "position": "top-left", "x": 60, "y": 60 }
      ]
    }
  ]
}

(Room labels moved to top-left to keep the bottom band reserved for subtitles.)

Expected output

The chapter-7 listing with bold, green-boxed subtitles highlighted word by word at the bottom, room labels at top-left, and the price tag at bottom-right. Sample render: tutorial-08.mp4 (placeholder).

What you learned

  • type: subtitles auto-transcribes the audio track of the movie.
  • Only one subtitles element per movie; it always sits at movie level.
  • settings.style switches between classic captions and modern boxed-word styles.
  • language is optional; set it to skip auto-detection and pick a specific transcription model.

Going further

You can also supply subtitles from a pre-existing SRT/VTT/ASS file with captions: "https://…". This is useful for translated subtitle tracks, which the engine cannot generate yet. See the Subtitles reference.

Previous chapter / Next chapter

← 7. AI voiceover · 9. AI generated images and videos →