JSON2Video uses OpenAI's Whisper to transcribe the audio and generate subtitles for you, supporting a wide variety of languages and accents.

See more examples

The subtitle element

The subtitle element differs from the other elements: it is always processed after the scene or movie has been rendered, so that it captures and transcribes the audio correctly.

You can add a subtitle element to any scene or movie. It has the following format:

{
    "type": "subtitles",
    "language": "en",
    "settings": {}
}

language

The language property defines the expected language of the voice-over, using the ISO 639-1 standard. If it's not set, JSON2Video tries to detect the language automatically. It defaults to "en" (English).

The full list of supported languages can be found in the API specification.
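
As a minimal sketch, a subtitle element for a Spanish voice-over would set language to the ISO 639-1 code "es". This assumes Spanish is among the supported languages listed in the API specification.

{
    "type": "subtitles",
    "language": "es",
    "settings": {}
}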

Some examples of common languages are:

Settings object

The settings object allows you to customize the style of the subtitles using the following properties:

style

The style of the subtitle element. Styles are presets that serve as a starting point; you can customize them further with the other properties of the settings object.

The default style is classic. These are the available styles:

font-family

Sets the font family of the subtitle text.

The full list of supported fonts can be found in the API specification.

Some examples of common fonts are: Arial, Luckiest Guy, Nanum Pen Script, Roboto.

font-size

Sets the font size of the subtitle text. Defaults to 10% of the video resolution.

max-words-per-line

Sets the maximum number of words per line. Defaults to 4. If you set it to 1, the subtitle displays just one word at a time.

position

Sets the subtitle position on the video. Defaults to bottom-center.
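
As a minimal sketch, the font, line length, and position properties can be combined in the settings object. The values are only illustrative: Roboto is one of the fonts named above, bottom-center is the documented default position, and a max-words-per-line of 1 (assumed here to be a JSON number) shows one word at a time.

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "font-family": "Roboto",
        "max-words-per-line": 1,
        "position": "bottom-center"
    }
}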

word-color, line-color

Set the colors of the subtitle text. word-color is the color of the word currently being spoken, and line-color is the color of the rest of the text. Setting different values for word-color and line-color highlights the spoken word. Setting both properties to the same value gives the whole line a single color.

Colors are set in hexadecimal format, like #00B140, and can also include an alpha (opacity) value, like #00B14008.
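
As a minimal sketch, the following settings highlight the spoken word in yellow against white text. The two color values are arbitrary choices for illustration, not defaults.

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "word-color": "#FFFF00",
        "line-color": "#FFFFFF"
    }
}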

outline-width, outline-color

Some styles support an outline (the classic styles, for example). outline-width sets the width of the outline; set it to 0 to disable the outline. outline-color sets the color of the outline.
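
As a minimal sketch, disabling the outline might look like this, assuming outline-width is given as a JSON number:

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "outline-width": 0
    }
}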

box-color

A few styles support a box around the text instead of an outline. This property sets the color of the box framing the subtitle. Like all color properties, it must be in hexadecimal format, like #00B140, and can also include an alpha (opacity) value, like #00B14008.
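
As a minimal sketch, a black box with an alpha value could be set as follows. The color is an arbitrary example, and the property only has a visible effect with a style that supports a box.

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "box-color": "#00000080"
    }
}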

shadow-color, shadow-offset

Some styles support a shadow. These properties set the color and the offset of the shadow.

keywords

Keywords is an array of words that helps the AI transcribe the audio more accurately.

For example, a voice-over that mentions "JSON2Video" may be transcribed as "JSON to video" or "Jason to video". Adding the word "JSON2Video" to the keywords array helps the AI transcribe it correctly.

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "keywords": [
            "JSON2Video"
        ]
    }
}

replace

Replace is a collection of words that will be replaced by other words.

For example, we can replace every occurrence of the word "apple" with the word "orange".

{
    "type": "subtitles",
    "language": "en",
    "settings": {
        "replace": {
            "apple": "orange"
        }
    }
}
