9. AI generated images and videos

Sometimes an agent gives you the listing but no twilight exterior shot, and you need one for the closing scene. JSON2Video can generate b-roll images and short videos with text-to-image and text-to-video models, all from the same image and video elements. This chapter adds a twilight AI exterior shot and a short AI b-roll clip.

Prerequisites: chapter 8. You should know the difference between scene-level and movie-level elements.

Step 1 โ€” AI images

Instead of src, supply a prompt and a model on an image element. JSON2Video pipes the prompt to the chosen model and embeds the result.

{
  "type": "image",
  "prompt": "A craftsman house at twilight with warm lights, magazine real-estate photo, cinematic, high detail",
  "model": "flux-schnell",
  "aspect-ratio": "horizontal",
  "duration": 4
}

The full list of image models, with their IDs, output sizes, and per-model parameters, lives in the live AI models catalog. This tutorial uses flux-schnell as a fast, low-cost default; swap the model field to try others.

aspect-ratio accepts horizontal, vertical, or square. The renderer picks compatible model output sizes.
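If you build these elements from a script, a small helper can catch a misspelled aspect-ratio before the render request is sent. The Python sketch below is illustrative only: ai_image_element is a hypothetical helper, not part of any JSON2Video SDK, and it checks nothing beyond the three aspect-ratio values listed above.

```python
import json

# The three aspect-ratio values named in this chapter.
ALLOWED_ASPECT_RATIOS = {"horizontal", "vertical", "square"}


def ai_image_element(prompt, model="flux-schnell", aspect_ratio="horizontal", duration=None):
    """Build an AI image element as a plain dict (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect-ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    # An AI element carries prompt + model instead of src.
    element = {
        "type": "image",
        "prompt": prompt,
        "model": model,
        "aspect-ratio": aspect_ratio,
    }
    if duration is not None:
        element["duration"] = duration
    return element


if __name__ == "__main__":
    element = ai_image_element(
        "A craftsman house at twilight with warm lights, magazine real-estate photo",
        duration=4,
    )
    print(json.dumps(element, indent=2))
```

The helper returns a plain dict, so the result can be dropped into a scene's elements list and serialized with json.dumps.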

Step 2 โ€” AI videos

Short AI clips work the same way on a video element:

{
  "type": "video",
  "prompt": "Slow dolly shot through a sunlit modern kitchen, real-estate listing style",
  "model": "kling-1-6",
  "aspect-ratio": "horizontal",
  "duration": 5
}

AI video generation is the most credit-expensive operation in the catalog, so check Credit consumption before running this at scale.
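Before rendering a template at scale, it can help to count how many AI generations it would trigger. The following Python sketch assumes an element is AI-generated when it is an image or video that carries a prompt; count_ai_elements is a hypothetical helper, and it reports counts only, not credits, because per-model credit prices are not given in this chapter.

```python
from collections import Counter


def count_ai_elements(movie):
    """Count image/video elements that use a prompt, grouped by type."""
    counts = Counter()
    # Collect movie-level elements plus the elements of every scene.
    elements = list(movie.get("elements", []))
    for scene in movie.get("scenes", []):
        elements.extend(scene.get("elements", []))
    for element in elements:
        if element.get("type") in ("image", "video") and "prompt" in element:
            counts[element["type"]] += 1
    return counts


if __name__ == "__main__":
    movie = {
        "scenes": [
            {"elements": [{"type": "image", "prompt": "House at twilight", "model": "flux-schnell"}]},
            {"elements": [{"type": "video", "prompt": "Slow dolly shot", "model": "kling-1-6"}]},
            {"elements": [{"type": "image", "src": "https://example.com/kitchen.jpg"}]},
        ]
    }
    counts = count_ai_elements(movie)
    print(f"AI images: {counts['image']}, AI videos: {counts['video']}")
```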

Step 3 โ€” Add a closing twilight scene

Append a fourth room scene whose photo is AI-generated. The output stays cached after the first render, so future renders reuse the same image at no extra cost (chapter 16 covers caching).

{
  "duration": 4,
  "transition": { "style": "fade", "duration": 0.5 },
  "elements": [
    {
      "type": "image",
      "prompt": "A craftsman house at twilight with warm lights, magazine real-estate photo, cinematic",
      "model": "flux-schnell",
      "aspect-ratio": "horizontal"
    },
    { "type": "text", "text": "Twilight Exterior", "position": "top-left", "x": 60, "y": 60 }
  ]
}

Tip: flux-schnell returns in ~5 s. flux-pro and AI video generators can take 30–60 s; expect the overall render to slow down. Always pair AI generation with cache: true (the default) so subsequent renders of the same prompt are instant.
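If the movie JSON is assembled in code, the closing scene from Step 3 can be appended programmatically. This is a minimal Python sketch, not an official client: append_ai_closing_scene is a hypothetical helper, and placing "cache" on the image element itself is an assumption, since this chapter only states that cache: true is the default.

```python
import copy


def append_ai_closing_scene(movie, prompt, label, model="flux-schnell"):
    """Return a copy of movie with an AI-generated closing scene appended."""
    scene = {
        "duration": 4,
        "transition": {"style": "fade", "duration": 0.5},
        "elements": [
            {
                "type": "image",
                "prompt": prompt,
                "model": model,
                "aspect-ratio": "horizontal",
                # Assumption: cache is set per element; true is the stated default.
                "cache": True,
            },
            {"type": "text", "text": label, "position": "top-left", "x": 60, "y": 60},
        ],
    }
    updated = copy.deepcopy(movie)  # leave the caller's template untouched
    updated.setdefault("scenes", []).append(scene)
    return updated


if __name__ == "__main__":
    base = {"resolution": "full-hd", "scenes": [{"duration": 4, "elements": []}]}
    final = append_ai_closing_scene(
        base,
        "A craftsman house at twilight with warm lights, magazine real-estate photo, cinematic",
        "Twilight Exterior",
    )
    print(len(base["scenes"]), len(final["scenes"]))
```

Working on a deep copy keeps the base template reusable for listings that do not need a generated shot.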

The complete final JSON

{
  "resolution": "full-hd",
  "elements": [
    {
      "type": "audio",
      "src": "https://cdn.json2video.com/assets/audios/uplifting-corporate.mp3",
      "volume": 0.4
    },
    {
      "type": "voice",
      "text": "Welcome to 123 Oak Street. Take a tour of the kitchen, bedrooms, and backyard.",
      "voice": "en-US-EmmaMultilingualNeural",
      "start": 1.5
    },
    {
      "type": "subtitles",
      "language": "en",
      "settings": {
        "style": "boxed-word",
        "font-family": "Inter",
        "font-size": 90,
        "font-color": "#FFFFFF",
        "outline-color": "#000000",
        "position": "bottom-center",
        "all-caps": true,
        "box-color": "#0E7C66"
      }
    }
  ],
  "scenes": [
    {
      "duration": 4,
      "elements": [
        {
          "type": "component",
          "component": "basic/000",
          "settings": { "headline": "FOR SALE", "subline": "123 Oak Street" }
        }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-front.jpg" },
        { "type": "text", "text": "Exterior", "position": "top-left", "x": 60, "y": 60 }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-kitchen.jpg" },
        { "type": "text", "text": "Chef's Kitchen", "position": "top-left", "x": 60, "y": 60 }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        { "type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-bedroom.jpg" },
        { "type": "text", "text": "Master Bedroom", "position": "top-left", "x": 60, "y": 60 }
      ]
    },
    {
      "duration": 4,
      "transition": { "style": "fade", "duration": 0.5 },
      "elements": [
        {
          "type": "image",
          "prompt": "A craftsman house at twilight with warm lights, magazine real-estate photo, cinematic",
          "model": "flux-schnell",
          "aspect-ratio": "horizontal"
        },
        { "type": "text", "text": "Twilight Exterior", "position": "top-left", "x": 60, "y": 60 }
      ]
    }
  ]
}

Expected output

A 20-second listing that ends on a generated twilight exterior of the house. The voice-over has been updated to invite the viewer on a tour, and the subtitles transcribe the new line. Sample render: tutorial-09.mp4 (placeholder).
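The 20-second figure follows from the scene durations: five scenes of 4 seconds each. A short Python sketch of that arithmetic, assuming every scene sets an explicit numeric duration and that the fade transitions do not shorten the total (total_duration is a hypothetical helper):

```python
def total_duration(movie):
    """Sum the duration of every scene, in seconds."""
    return sum(scene["duration"] for scene in movie["scenes"])


if __name__ == "__main__":
    # Durations only; elements are omitted because they do not affect the sum.
    movie = {"scenes": [{"duration": 4} for _ in range(5)]}
    print(total_duration(movie))  # prints 20
```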

What you learned

  • Swap src for prompt + model to ask JSON2Video to generate the image or video.
  • aspect-ratio selects compatible output sizes.
  • AI generation is cached by default, so repeat renders are instant and free.
  • AI video is the most credit-expensive primitive; prefer AI images where possible.

Previous chapter / Next chapter

โ† 8. Automatic subtitles ยท 10. Variables โ†’