In this tutorial we will use the JSON2Video API and Make.com to build a small no-code application that turns text into a video with AI. You can create your own videos with this text to video API.

The application will use Airtable as a database to store the prompt text, Make.com to automate the process, and the JSON2Video API to generate the videos.

Choosing the AI models

The JSON2Video API integrates with different AI services to generate images and voices, such as Flux-Pro, Freepik, ElevenLabs, and Azure TTS.

In this tutorial we will use Freepik's classic model to generate images and Microsoft Azure TTS to generate voices. We chose these services because they are included in the JSON2Video API plan and do not consume extra credits from your account.

Read more about the credits consumption here: How credits are consumed.

Example of a video generated in this tutorial

Prompt: What Schools Don’t Teach You About Money: Filling in gaps in traditional education with financial literacy basics.

Create the Text to Video app

Step 1: Create an Airtable base

First we need to create an Airtable base to manage the list of videos to generate. You can clone the following base and then edit it to manage your own list of videos:

Text to video Airtable base

Clone the Text to video Airtable base

Step 2: Create the Make.com scenario

Now we need to create the Make.com scenario that will generate the videos.

Text to video Make.com scenario

Download the scenario blueprint

Once you have downloaded the blueprint, you can import it into a new scenario in your Make.com account.

Step 3: Configure the connections

Now we need to configure the connections to your Airtable account and the JSON2Video API.

Generate your first video from text

Once you have configured the connections, you can start generating videos:

Step 1: Add a new record to the Airtable base

Text to video Airtable base with new record

Step 2: Run the scenario in Make.com

Run the scenario in Make.com by clicking the Run button.

The scenario will take a few minutes to complete. If everything goes well, it will run to the end and the video will be generated.
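Under the hood, the Make.com scenario is calling the JSON2Video REST API. As a rough sketch (the endpoint, header, and field names below are assumptions based on how JSON2Video's template rendering is generally described, and the API key and template ID are placeholders — check the API reference for the exact request shape), the request the scenario builds looks roughly like this:

```python
API_KEY = "YOUR_JSON2VIDEO_API_KEY"  # placeholder: paste your real API key

def build_render_request(template_id: str, variables: dict) -> dict:
    """Build the JSON body for a template-based render request.

    The "template" and "variables" field names are assumptions for
    illustration; the scenario fills the variables from the Airtable record.
    """
    return {"template": template_id, "variables": variables}

body = build_render_request(
    "your-template-id",  # placeholder: the ID of the tutorial's template
    {"prompt": "What Schools Don't Teach You About Money"},
)

# Conceptually, the scenario then POSTs this body to the JSON2Video API
# (with the API key in a request header), polls until rendering finishes,
# and writes the resulting video URL back into the Airtable record.
print(body["template"])  # → your-template-id
```

This is only a mental model of what Make.com automates for you; with the blueprint imported, you never need to write this code yourself.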

If the scenario fails, check the Solving issues section at the end of this page.

Step 3: Check the Airtable base

Now check the Airtable base: the Status field should have changed to Done, and the VideoURL field should have been populated with the video URL.

Examples

Here are some examples of videos generated with the text to video app:

- 10 Kitchen Hacks That Will Change the Way You Cook: Innovative tips to simplify cooking and food preparation.

- Cleopatra Lived Closer to the Moon Landing Than the Pyramids: A mind-blowing perspective on historical timelines.

- How Rich People Avoid Taxes (Legally): Explaining strategies like trusts, deductions, and loopholes.

How the app works

Let's dive into the details of the Make scenario and how it works.

Understanding the video template

The template used in this example defines the design of the output video and the variables that will be replaced by the JSON2Video API.

Text to video video template

Open the video template

The template defines a vertical video with a single replicated scene that has a background image and a voice-over.

The template receives 4 variables. The scene (using the iterate property) is populated for each item in the scenes array, creating a video with as many scenes as the scenes array has items.
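To make the iterate behaviour concrete, here is a minimal, hypothetical template sketched as a Python dict. The property names and element types are illustrative, not the exact template used in this tutorial; the point is only that one scene definition plus a scenes array yields one rendered scene per array item:

```python
# Hypothetical, simplified template: a vertical video whose single scene
# definition is replicated once per item in the "scenes" variable.
template = {
    "resolution": "vertical",        # illustrative: portrait output
    "scenes": [
        {
            "iterate": "scenes",     # replicate this scene per array item
            "elements": [
                # background image generated by the image model
                {"type": "image", "prompt": "{{ scene.image_prompt }}"},
                # voice-over generated by the voice model
                {"type": "voice", "text": "{{ scene.voiceover }}"},
            ],
        }
    ],
}

# A "scenes" variable with 3 items would produce a 3-scene video:
scenes_variable = [
    {"image_prompt": "a piggy bank", "voiceover": "Schools rarely teach money."},
    {"image_prompt": "a budget sheet", "voiceover": "Budgeting is a skill."},
    {"image_prompt": "a coin jar", "voiceover": "Small savings add up."},
]
print(len(scenes_variable))  # → 3 scenes in the resulting video
```

Swapping in a longer or shorter scenes array changes the video length without touching the template itself, which is what makes the same template reusable for every prompt in the Airtable base.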

Learn more about building and customizing video templates in the video templates documentation.

Using other AI models

You can use other AI models by entering the model name in the Voice Model and Image Model fields in the Toolbox module settings.

Text to video AI models to use

By default, the scenario blueprint uses the azure voice model and the freepik-classic image model, but you can switch to flux-pro or elevenlabs.

Be aware that using Flux-Pro or ElevenLabs will consume credits from your JSON2Video account; see the section below.
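If you were driving the API directly rather than through the Make.com form, the model choice amounts to two values, mirroring the Toolbox fields above (the dictionary keys here are illustrative names, not verbatim API parameters):

```python
# Default models per this tutorial: included in the plan, no extra credits.
default_models = {
    "voice_model": "azure",
    "image_model": "freepik-classic",
}

# Switching to premium models consumes extra credits from your account.
premium_models = dict(default_models,
                      voice_model="elevenlabs",
                      image_model="flux-pro")

print(premium_models["image_model"])  # → flux-pro
```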

How many credits does this text to video app consume?

By default, the scenario blueprint uses the azure voice model and the freepik-classic image model. These models are included in the JSON2Video API plan and do not consume extra credits from your account.

Therefore, this app will consume as many credits as the number of seconds of the generated videos. For example, a video of 38 seconds will consume 38 credits.

However, if you use Flux-Pro or ElevenLabs, you will be charged for the credits used by these services. Read more about the credits consumption here: How credits are consumed.
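The pricing rule above is simple enough to sketch: with the included models, credits equal the video duration in seconds. This snippet models only that base rule; the extra charges for Flux-Pro or ElevenLabs are not included here (see the credits documentation for their rates):

```python
def estimate_credits(duration_seconds: int) -> int:
    """Credits consumed with the included azure / freepik-classic models:
    one credit per second of generated video."""
    return duration_seconds

print(estimate_credits(38))  # → 38 credits for a 38-second video
```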

Solving issues

There are some common errors that may occur when generating videos with this tutorial.

Published on January 12th, 2025

Author
David Bosch
David is an experienced engineer, now collaborating with JSON2Video.