16. Optimization & cost
Your CRM is now triggering renders automatically (chapter 15) — possibly hundreds per day. Each render consumes credits and takes wall-clock time. This final chapter covers four levers that bring down both: cache, scene granularity for parallel rendering, quality, and asset reuse.
Prerequisites: chapter 15. Optimization assumes you understand the rest of the pipeline.
Lever 1 — Caching
Every renderable thing in JSON2Video is cached by default. The cache key is "everything that affects the output": the JSON of the element, its sources, its settings. If you submit a movie with the same JSON twice, the second render is served from cache — essentially free and instant.
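The idea of a content-based cache key can be illustrated with a minimal sketch. This is not JSON2Video's actual key derivation, which this chapter does not specify; the `cache_key` function and the choice of SHA-256 over canonical JSON are assumptions made only to show why identical JSON hits the cache and any change misses it.

```python
import hashlib
import json

def cache_key(element: dict) -> str:
    """Hash a canonical JSON form of the element (illustrative only)."""
    # Sorting keys makes the hash independent of key order (an assumption here).
    canonical = json.dumps(element, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

voice = {
    "type": "voice",
    "text": "Welcome to 123 Oak Street.",
    "voice": "en-US-EmmaMultilingualNeural",
}
same_voice = dict(reversed(list(voice.items())))  # same content, different key order
tweaked = {**voice, "text": "Welcome to 125 Oak Street."}

print(cache_key(voice) == cache_key(same_voice))  # True
print(cache_key(voice) == cache_key(tweaked))     # False
```

Under this model, changing a single character of the text produces a new key, so the element would be rendered again.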
To opt out for a specific element, set cache: false. Useful when:
- A source URL has the same path but changing contents (e.g. a price tag served by your CMS).
- You want to force a fresh voice render after tweaking voice settings.
{
"type": "voice",
"text": "Welcome to {{ address }}.",
"voice": "en-US-EmmaMultilingualNeural",
"cache": false
}
In production, leave caching on. Each generated voiceover in the listing costs credits the first time — every subsequent render of the same listing is free.
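One way to keep caching on everywhere except for a few volatile sources is to patch the movie JSON just before submitting it. The sketch below assumes elements live in the movie-level `elements` array and in each scene's `elements` array, as in this tutorial; the helper name `disable_cache_for` and the CMS URL are hypothetical.

```python
import copy

def disable_cache_for(movie: dict, volatile_srcs: set) -> dict:
    """Return a copy of the movie with cache: false on volatile elements."""
    patched = copy.deepcopy(movie)
    # Collect the movie-level element list and every scene-level element list.
    element_lists = [patched.get("elements", [])]
    element_lists += [scene.get("elements", []) for scene in patched.get("scenes", [])]
    for elements in element_lists:
        for element in elements:
            if element.get("src") in volatile_srcs:
                element["cache"] = False
    return patched

# Hypothetical movie: one stable image and one price tag that changes in place.
movie = {
    "scenes": [
        {
            "elements": [
                {"type": "image", "src": "https://cdn.json2video.com/assets/images/sample-house-front.jpg"},
                {"type": "image", "src": "https://cms.example/price-tag.png"},
            ]
        }
    ]
}
patched = disable_cache_for(movie, {"https://cms.example/price-tag.png"})
print(patched["scenes"][0]["elements"][1]["cache"])  # False
```

Every other element keeps the default behaviour, so only the price tag is fetched again on each render.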
Lever 2 — Split into more scenes
JSON2Video renders scenes in parallel, so total render time is roughly the render time of the slowest scene, not the sum across all scenes. Two consequences:
- Splitting one 30-second scene into six 5-second scenes is significantly faster.
- A monolithic scene with 30 elements is the worst case: it cannot be parallelised.
In our listing, the chapter-13 iterate pattern already produces N scenes — one per room. That's optimal. If you have a long "stats roll" with many beats, break it into one scene per beat.
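The arithmetic behind this lever can be sketched as follows. The render times are made-up numbers, the estimate ignores queueing and final assembly overhead, and both helper names are hypothetical; the scene shape uses only the `duration`, `elements`, `type`, and `text` fields already seen in this tutorial.

```python
def estimated_render_seconds(scene_render_seconds, parallel=True):
    """Rough wall-clock estimate: slowest scene if parallel, sum otherwise."""
    if not scene_render_seconds:
        return 0
    return max(scene_render_seconds) if parallel else sum(scene_render_seconds)

def one_scene_per_beat(beats, seconds_per_beat):
    """Build one short scene per beat instead of a single long scene."""
    return [
        {"duration": seconds_per_beat, "elements": [{"type": "text", "text": beat}]}
        for beat in beats
    ]

# Hypothetical render costs: one 60 s monolith vs. six scenes of 10 s each.
monolith = [60]
split = [10] * 6
print(estimated_render_seconds(monolith))               # 60
print(estimated_render_seconds(split))                  # 10
print(estimated_render_seconds(split, parallel=False))  # 60

# A hypothetical "stats roll" broken into one scene per beat.
scenes = one_scene_per_beat(["3 bedrooms", "2 bathrooms", "1,850 sq ft"], 5)
print(len(scenes))  # 3
```

The total work is the same in both layouts; only the split layout lets the slowest scene, rather than the sum, set the wall-clock time.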
Lever 3 — Tune quality
The movie-level quality field controls render fidelity:
| Value | Use for |
|---|---|
| "low" | Internal previews, internal QA |
| "medium" | Social previews, draft reviews |
| "high" | Final delivery (default) |
low renders ~3× faster than high and costs fewer credits. A typical workflow: render low while drafting, then re-render high once approved.
{ "quality": "high" }
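The draft-then-final workflow could be wired into the code that builds the movie JSON. This is a minimal sketch: the `with_quality` helper and the `approved` flag are hypothetical, and it assumes your pipeline already knows whether a listing has been approved.

```python
def with_quality(movie: dict, approved: bool) -> dict:
    """Return a copy of the movie with quality set for the current stage."""
    # "low" while drafting, "high" once the listing is approved.
    return {**movie, "quality": "high" if approved else "low"}

movie = {"resolution": "full-hd", "scenes": []}
draft = with_quality(movie, approved=False)
final = with_quality(movie, approved=True)
print(draft["quality"], final["quality"])  # low high
```

Because the helper returns a new dict, the same base movie can be submitted once as a draft and again for delivery without being mutated in between.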
Lever 4 — Reuse heavy assets across renders
The most expensive operations are voiceover synthesis and heavy remote downloads. Two strategies:
4a. Hoist heavy assets to preload. The preload array generates or fetches an asset once and exposes its URL as {{<id>_url}} (for example, {{intro_url}} for the id intro), ready for every element that references it. Generation happens before the main render and is cached.
{
"preload": [
{
"id": "intro",
"type": "voice",
"text": "Welcome to today's listing.",
"voice": "en-US-EmmaMultilingualNeural"
}
],
"scenes": [
{
"elements": [
{ "type": "audio", "src": "{{intro_url}}" }
]
}
]
}
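If the movie JSON is assembled in code, the hoisting step can be wrapped in a small helper. This sketch assumes the `preload` entry shape and the `{{<id>_url}}` naming shown above; the function name `hoist_voice` is hypothetical.

```python
import copy

def hoist_voice(movie: dict, asset_id: str, text: str, voice: str):
    """Add a voice asset to preload and return an audio element that plays it."""
    patched = copy.deepcopy(movie)
    patched.setdefault("preload", []).append(
        {"id": asset_id, "type": "voice", "text": text, "voice": voice}
    )
    # The preloaded asset is referenced through the <id>_url variable.
    audio_element = {"type": "audio", "src": "{{" + asset_id + "_url}}"}
    return patched, audio_element

movie = {"scenes": [{"elements": []}]}
movie, intro_audio = hoist_voice(
    movie, "intro", "Welcome to today's listing.", "en-US-EmmaMultilingualNeural"
)
movie["scenes"][0]["elements"].append(intro_audio)
print(intro_audio["src"])  # {{intro_url}}
```

The returned audio element can be appended to as many scenes as needed; each copy points at the same preloaded asset.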
4b. Upload the asset once via /v2/media. If you want full control, upload your reusable assets to your own Media library and reference them by URL. No re-generation ever.
The complete final JSON
{
"resolution": "full-hd",
"quality": "high",
"cache": true,
"client-data": {
"listing_id": "L-4821"
},
"exports": [
{
"destinations": [
{ "type": "webhook", "endpoint": "https://your-app.example/json2video-callback" }
]
}
],
"preload": [
{
"id": "intro",
"type": "voice",
"text": "Welcome to {{ address }}.",
"voice": "en-US-EmmaMultilingualNeural"
}
],
"variables": {
"address": "123 Oak Street",
"rooms": [
{ "name": "Exterior", "image": "https://cdn.json2video.com/assets/images/sample-house-front.jpg" },
{ "name": "Chef's Kitchen", "image": "https://cdn.json2video.com/assets/images/sample-house-kitchen.jpg" },
{ "name": "Master Bedroom", "image": "https://cdn.json2video.com/assets/images/sample-house-bedroom.jpg" }
]
},
"elements": [
{
"type": "audio",
"src": "https://cdn.json2video.com/assets/audios/uplifting-corporate.mp3",
"volume": 0.4
},
{
"type": "audio",
"src": "{{intro_url}}",
"start": 1.5
},
{
"type": "subtitles",
"language": "en",
"settings": {
"style": "boxed-word",
"font-family": "Inter",
"font-size": 90,
"font-color": "#FFFFFF",
"position": "bottom-center",
"all-caps": true,
"box-color": "#0E7C66"
}
}
],
"scenes": [
{
"duration": 4,
"elements": [
{
"type": "component",
"component": "basic/000",
"settings": { "headline": "FOR SALE", "subline": "{{address}}" }
}
]
},
{
"duration": 4,
"transition": { "style": "fade", "duration": 0.5 },
"iterate": "rooms",
"iterate-as": "room",
"elements": [
{ "type": "image", "src": "{{ room.image }}" },
{ "type": "text", "text": "{{ room.name }}", "position": "top-left", "x": 60, "y": 60 }
]
},
{
"duration": 4,
"transition": { "style": "fade", "duration": 0.5 },
"elements": [
{
"type": "html",
"tailwind": true,
"html": "<div class='flex h-full w-full items-center justify-center bg-emerald-700 text-white text-7xl font-bold'>Open House Sunday</div>"
}
]
}
]
}
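Before submitting a movie like this, it can help to check that every `{{<id>_url}}` reference has a matching `preload` entry. The checker below is a hypothetical local lint, not part of JSON2Video. It assumes that any variable ending in `_url` refers to a preloaded asset, so an ordinary variable with that suffix would be flagged as a false positive.

```python
import json
import re

def unresolved_preload_refs(movie: dict) -> set:
    """Return ids referenced as {{<id>_url}} that have no preload entry."""
    preload_ids = {item["id"] for item in movie.get("preload", [])}
    # Scan the serialized movie so references are found at any nesting depth.
    referenced = set(re.findall(r"\{\{\s*(\w+)_url\s*\}\}", json.dumps(movie)))
    return referenced - preload_ids

# Hypothetical movie with one valid reference and one dangling reference.
movie = {
    "preload": [{"id": "intro", "type": "voice", "text": "Welcome."}],
    "scenes": [
        {
            "elements": [
                {"type": "audio", "src": "{{intro_url}}"},
                {"type": "audio", "src": "{{ outro_url }}"},
            ]
        }
    ]
}
print(unresolved_preload_refs(movie))  # {'outro'}
```

An empty result means every preload reference in the movie resolves to a declared asset.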
Expected output
Functionally the same listing as chapter 15, but it renders faster on the first hit (parallel scenes) and is free on repeat hits (cache), with the voiceover hoisted into preload so it generates exactly once and is reused everywhere. Sample render: tutorial-16.mp4 (placeholder).
What you learned
- cache: true (default) makes repeat renders of the same JSON essentially free.
- Splitting work into more scenes lets the renderer parallelise: total time tracks the slowest scene, not the sum.
- quality is a three-step dial; use low for drafts and high for delivery.
- Hoist expensive generated assets into preload so they are produced once and reused across every element.
You finished the tutorial
You now have a production-grade real-estate listing pipeline: data-driven, conditional, narrated, captioned, cached, parallel, and webhook-delivered. Where to go next:
- Reference — every field and every endpoint.
- Guides — task-oriented walkthroughs (dashboards, no-code, advanced patterns).
- For coding agents — set up an MCP server or hand these docs to your coding agent.