In this tutorial we will use the JSON2Video API and Make.com to build a small no-code application that turns text into a video with AI. You can use this text-to-video API to create your own videos.
The application uses Airtable as a database to store the text, Make.com to automate the process, and the JSON2Video API to generate the videos.
The JSON2Video API integrates with Flux Pro to generate images and with ElevenLabs to generate voices.
Example of a video generated in this tutorial
Text to video prompt | Resulting video |
---|---|
What Schools Don’t Teach You About Money: Filling in gaps in traditional education with financial literacy basics. | |
Create the Text to Video app
Step 1: Create an Airtable base
First we need to create an Airtable base to manage the list of videos to generate. You can clone the following base and then edit it to manage your own list of videos:
Clone the Text to video Airtable base
Step 2: Create the Make.com scenario
Now we need to create the Make.com scenario that will be used to generate the videos.
Download the scenario blueprint
Once you have downloaded the blueprint, you can import it into a new scenario in your Make.com account.
Step 3: Configure the connections
Now we need to configure the connections to your Airtable account and to the JSON2Video API:
- Airtable: Read the Airtable integration guide to learn how to connect your Airtable account to Make.com.
- JSON2Video: Watch the JSON2Video getting started video to learn how to connect your JSON2Video account to Make.com.
Generate your first video from text
Once you have configured the connections, you can start generating videos:
Step 1: Add a new record to the Airtable base
- In the Title field, add a title for your video. It is for your own reference only and is not used by the video generation process.
- In the Topic field, add the topic of the video you want to generate (see the example below).
- In the Language field, select the language of the video you want to generate. Not all languages are supported, but you can try the most common ones, such as English, French, Spanish, Italian or German.
- In the Voice field, select the voice of the video you want to generate. The field is predefined with a list of ElevenLabs voices you can choose from.
- Finally, change the Status field to To do.
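For reference, a record filled in this way could be pictured as the plain data below. This is only an illustrative sketch: the field names come from the steps above, while the example values, the voice name and the `missing_fields` helper are hypothetical and are not part of Airtable or Make.com.

```python
# Field names taken from the steps above; Title is optional for the generation process.
REQUIRED_FIELDS = ("Topic", "Language", "Voice", "Status")

record = {
    "Title": "Money lessons",  # for your own reference only
    "Topic": "What Schools Don't Teach You About Money: "
             "Filling in gaps in traditional education with financial literacy basics.",
    "Language": "English",
    "Voice": "ExampleVoice",  # hypothetical name; choose one from the predefined list
    "Status": "To do",
}


def missing_fields(rec: dict) -> list[str]:
    """Return the required fields that are absent or blank in the record."""
    return [name for name in REQUIRED_FIELDS if not str(rec.get(name, "")).strip()]


print(missing_fields(record))  # []
```

An empty list means the record has everything the scenario reads; any name in the list points at a field that still needs a value.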
Step 2: Run the scenario in Make.com
Run the scenario in Make.com by clicking on the Run button.
The scenario will take a few minutes to complete.
Step 3: Check the Airtable base
Now check the Airtable base: the Status field should have changed to Done, and the VideoURL field should contain the URL of the generated video.
Examples
Here are some examples of videos generated with the text to video app:
Prompt | Video |
---|---|
10 Kitchen Hacks That Will Change the Way You Cook: Innovative tips to simplify cooking and food preparation. | |
Cleopatra Lived Closer to the Moon Landing Than the Pyramids: A mind-blowing perspective on historical timelines | |
How Rich People Avoid Taxes (Legally): Explaining strategies like trusts, deductions, and loopholes. | |
How the app works
Let's dive into the details of the Make scenario and how it works.
- The scenario starts by checking the Airtable base to see if there are any records with the Status field set to To do.
- If there are records to process, the scenario takes the first record and updates its Status field to In progress.
- Then the scenario calls "ChatGPT" to generate the script for the video. The prompt given to ChatGPT is:
Create a script of a social media video about the topic included below. The video will be organized in scenes. Each scene has an overlaid text, a voice over and an image. The voice over text must be at least 20 words. Your response must be in JSON format following this schema: { "scenes": [{ "overlaidText": "", "voiceOverText": "", "imagePrompt": "" }] } The image prompt must be written in ENGLISH, being detailed and photo realistic. The overlaidText and the voiceOverText must be in {{16.language}}. The topic of the video is:
- ChatGPT returns a JSON string with the script for the video, including the overlaid text, voice over text and image prompt for each scene.
- The scenario then converts the JSON string into a JSON object, and calls the JSON2Video module "Create a Movie from a Template ID".
- The template ID used in this example is Awe8I3PY4cRkhcfjPdUc. The template contains the logic and design of the output video. See below for more details about the template.
- The next module is "Wait for a Movie to Render". This module waits until the JSON2Video API has finished generating the video.
- Finally, the scenario updates the Airtable base with the video URL and the status of the record.
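To make the script-generation step more concrete, here is a minimal Python sketch of what happens around the ChatGPT call: filling in the prompt and checking the JSON reply against the schema the prompt asks for. It is not how Make.com implements these modules; the function names `build_prompt` and `parse_script` are hypothetical, and the checks simply mirror the requirements stated in the prompt.

```python
import json

# Prompt text copied from the scenario; {{16.language}} is the Make.com placeholder.
PROMPT = (
    "Create a script of a social media video about the topic included below. "
    "The video will be organized in scenes. Each scene has an overlaid text, "
    "a voice over and an image. The voice over text must be at least 20 words. "
    "Your response must be in JSON format following this schema: "
    '{ "scenes": [{ "overlaidText": "", "voiceOverText": "", "imagePrompt": "" }] } '
    "The image prompt must be written in ENGLISH, being detailed and photo realistic. "
    "The overlaidText and the voiceOverText must be in {{16.language}}. "
    "The topic of the video is:"
)
SCENE_KEYS = ("overlaidText", "voiceOverText", "imagePrompt")
MIN_VOICE_OVER_WORDS = 20  # from the prompt: "at least 20 words"


def build_prompt(language: str, topic: str) -> str:
    """Substitute the language placeholder and append the topic."""
    return PROMPT.replace("{{16.language}}", language) + " " + topic


def parse_script(raw: str) -> list[dict]:
    """Convert the JSON string into objects and validate each scene."""
    data = json.loads(raw)
    scenes = data.get("scenes") if isinstance(data, dict) else None
    if not isinstance(scenes, list) or not scenes:
        raise ValueError("expected a non-empty 'scenes' array")
    for index, scene in enumerate(scenes):
        if not isinstance(scene, dict):
            raise ValueError(f"scene {index}: expected an object")
        for key in SCENE_KEYS:
            value = scene.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"scene {index}: missing or empty '{key}'")
        if len(scene["voiceOverText"].split()) < MIN_VOICE_OVER_WORDS:
            raise ValueError(f"scene {index}: voice over has fewer than "
                             f"{MIN_VOICE_OVER_WORDS} words")
    return scenes
```

Validating the reply before passing it on is a design choice of this sketch: a language model can return malformed or incomplete JSON, and failing early is cheaper than rendering a broken video.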
Understanding the video template
The template used in this example defines the design of the output video and the variables that will be replaced by the JSON2Video API.
The template defines a vertical video, with a background image, an animation at the bottom, an overlaid text and a voice over.
The template receives 2 variables:
- scenes: an array of objects, each object representing a scene of the video, with 3 fields: overlaidText, voiceOverText and imagePrompt.
- voice: the name of the voice used for the voice over.
The scene (using the iterate property) will be populated for each item in the scenes array, creating a video with as many scenes as the scenes array has items.
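The effect of the iteration can be simulated in a few lines of Python. This sketch does not call the JSON2Video API and does not reproduce the real template format: the `expand_scenes` function, the shape of the rendered scene and the sample values are all hypothetical. Only the two variable names (scenes and voice) and the three scene fields come from the template description above.

```python
def expand_scenes(variables: dict) -> list[dict]:
    """Produce one rendered scene per item in variables['scenes']."""
    rendered = []
    for item in variables["scenes"]:
        rendered.append({
            "background_image_prompt": item["imagePrompt"],
            "overlaid_text": item["overlaidText"],
            # The same voice is used for every scene.
            "voice_over": {"text": item["voiceOverText"], "voice": variables["voice"]},
        })
    return rendered


variables = {
    "voice": "ExampleVoice",  # hypothetical voice name
    "scenes": [
        {
            "overlaidText": "Budget first",
            "voiceOverText": "Start by writing down what you earn and what you spend.",
            "imagePrompt": "A notebook with a handwritten monthly budget on a desk",
        },
        {
            "overlaidText": "Save early",
            "voiceOverText": "Small amounts saved early can grow a lot over time.",
            "imagePrompt": "A glass jar filled with coins next to a small plant",
        },
    ],
}

print(len(expand_scenes(variables)))  # 2
```

Two items in the scenes array produce two scenes; adding a third item would add a third scene without any change to the template itself.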
Published on November 18th, 2024