JSON2Video uses OpenAI's Whisper to transcribe the audio and generate subtitles for you, supporting a wide variety of languages and accents.
The subtitle element
The subtitle element differs from the other elements: it is always processed after the scene or movie has been rendered, so that it can capture and transcribe the final audio correctly.
You can add a subtitle element to any scene or movie:
- If you add it to a scene, subtitles are created for that scene only, and other scenes in the movie may have none.
- If you add it to a movie, subtitles are created for the full movie.
- If you add a subtitle element to both a scene and the movie, the two sets of subtitles will be drawn on top of each other, so avoid this.
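As a rough sketch of the movie-level case, the fragment below places one subtitle element at the movie level so that it covers every scene. The surrounding movie layout (a top-level "scenes" array whose items carry their own "elements" array, plus a movie-level "elements" array) is an assumption that this page does not spell out, so check it against the API specification.

```json
{
  "scenes": [
    { "elements": [] },
    { "elements": [] }
  ],
  "elements": [
    {
      "type": "subtitles",
      "language": "en"
    }
  ]
}
```

To subtitle a single scene instead, the same object would go inside that scene's "elements" array and be removed from the movie level.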
The subtitle element has the following format:
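Going by the examples further down this page, a subtitle element is an object with "type" set to "subtitles", an optional "language", and an optional "settings" object. A minimal sketch, using the documented default values for illustration only:

```json
{
  "type": "subtitles",
  "language": "en",
  "settings": {
    "style": "classic",
    "max-words-per-line": 4
  }
}
```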
language
The language property defines the expected language of the voice-over as an ISO 639-1 code.
If it is not set, JSON2Video tries to detect the language automatically.
It defaults to "en" (English).
The full list of supported languages can be found in the API specification.
Some examples of common languages are:
- en: English
- es: Spanish
- fr: French
- de: German
- it: Italian
- ar: Arabic
- ja: Japanese
- ko: Korean
- zh: Chinese
- ca: Catalan
- pt: Portuguese
Settings object
The settings object allows you to customize the style of the subtitles using the following properties:
style
The style of the subtitle element. Styles are presets of settings that serve as a starting point; you can customize them further through the other properties of the settings object.
The default style is classic. These are the available styles:
font-family
Sets the font family of the subtitle text.
The full list of supported fonts can be found in the API specification.
Some examples of common fonts are: Arial, Luckiest Guy, Nanum Pen Script, Roboto.
font-size
Sets the font size of the subtitle text. Defaults to 10% of the video resolution.
max-words-per-line
Sets the maximum number of words per line. Defaults to 4. If you set it to 1, the subtitles display just one word at a time.
position
Sets the subtitle position on the video. Defaults to bottom-center.
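The text layout properties above can be combined in one settings object. The sketch below uses only values mentioned on this page (the Roboto font, one word per line, and the default position); whether a given font or position suits a particular style is not covered here.

```json
{
  "type": "subtitles",
  "settings": {
    "font-family": "Roboto",
    "max-words-per-line": 1,
    "position": "bottom-center"
  }
}
```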
word-color, line-color
Set the colors of the subtitle text. word-color is the color of the word currently being spoken, and line-color is the color of the rest of the text.
Setting word-color and line-color to different values highlights the spoken word.
Setting both properties to the same value gives the whole line a single color.
Colors are set in hexadecimal format, like #00B140, and can also include an alpha (opacity) value, like #00B14008.
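For instance, a highlight effect could be sketched as follows, with the spoken word in green and the rest of the line in white. The specific colors are illustrative choices, not recommended values.

```json
{
  "type": "subtitles",
  "settings": {
    "word-color": "#00B140",
    "line-color": "#FFFFFF"
  }
}
```

Setting both properties to "#FFFFFF" would instead render the whole line in white with no highlight.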
outline-width, outline-color
Some styles support an outline (classic styles, for example).
outline-width sets the width of the outline; set it to 0 to disable the outline.
outline-color sets the color of the outline.
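A minimal sketch of disabling the outline on the classic style. It assumes outline-width takes a plain number, which this page implies but does not state explicitly.

```json
{
  "type": "subtitles",
  "settings": {
    "style": "classic",
    "outline-width": 0
  }
}
```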
box-color
A few styles support a box around the text instead of an outline. This property sets the color of the box framing the subtitle.
Like all color properties, it must be in hexadecimal format, like #00B140, and can also include an alpha (opacity) value, like #00B14008.
shadow-color, shadow-offset
Some styles support a shadow. shadow-color sets the color of the shadow and shadow-offset sets its offset.
keywords
keywords is an array of words that helps the AI transcribe the audio more accurately.
For example, a voice-over that mentions "JSON2Video" may be transcribed as "JSON to video" or "Jason to video".
Adding "JSON2Video" to the keywords array helps the AI transcribe the name correctly.
{
  "type": "subtitles",
  "language": "en",
  "settings": {
    "keywords": [
      "JSON2Video"
    ]
  }
}
replace
replace is a collection of words that will be replaced by other words.
For example, we can replace every occurrence of the word "apple" with the word "orange".
{
  "type": "subtitles",
  "language": "en",
  "settings": {
    "replace": {
      "apple": "orange"
    }
  }
}