The Voice element lets you add a voice-over to your videos: you specify the text to be spoken and the voice (and with it the language) to use.

JSON2Video uses Microsoft Azure's Text-To-Speech service, which provides natural-sounding voices in a wide variety of languages and accents.

The following examples show how to include voice elements in your videos.

Simple voice-over

In this example, we will use the default voice to add a short voice-over to a still-image video.

{
    "project": "tutorial",
    "resolution": "full-hd",
    "quality": "high",
    "scenes": [
        {
            "comment": "Scene #1",
            "elements": [
                {
                    "type": "image",
                    "src": "https://assets.json2video.com/assets/images/space-apollo11-01.jpg",
                    "scale": {
                        "width": 1920,
                        "height": 1280
                    }
                },
                {
                    "type": "voice",
                    "text": "That's one small step for a man, one giant leap for mankind. Upon taking a \"small step\" onto the surface of the moon in 1969, Neil Armstrong uttered what would become one of history's most famous one-liners.",
                    "start": 1.5
                }
            ]
        }
    ]
}

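The video uses one scene with 2 elements: an image element that shows the photo, and a voice element that speaks the text, with its start set to 1.5.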

This example does not specify a voice, so the element uses the default value of the voice field: en-GB-LibbyNeural.
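
For comparison, the voice can also be named explicitly. The fragment below is a minimal sketch of the same voice element with the voice field set to the default value mentioned above, so it should behave the same as leaving the field out. The text is shortened here for brevity.

```json
{
    "type": "voice",
    "text": "That's one small step for a man, one giant leap for mankind.",
    "voice": "en-GB-LibbyNeural",
    "start": 1.5
}
```

Replacing the value of the voice field is how a different voice is selected, as in the next example.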

Using multiple voices

In this example, we will use two voices in two different languages to showcase the Voice element features.

{
    "project": "tutorial",
    "resolution": "full-hd",
    "quality": "high",
    "scenes": [
        {
            "comment": "Scene #1",
            "elements": [
                {
                    "type": "image",
                    "src": "https://assets.json2video.com/assets/images/woman-01.jpg",
                    "y": -100
                },
                {
                    "type": "text",
                    "items": [
                        {
                            "text": "Hello Diego! Could you please introduce yourself in Italian?",
                            "font-size": 60
                        }
                    ],
                    "y": 850
                },
                {
                    "type": "voice",
                    "text": "Hello Diego! Could you please introduce yourself in Italian?",
                    "voice": "en-US-AriaNeural",
                    "start": 1
                }
            ]
        },
        {
            "comment": "Scene #2",
            "elements": [
                {
                    "type": "image",
                    "src": "https://assets.json2video.com/assets/images/man-01.jpg",
                    "y": -100
                },
                {
                    "type": "text",
                    "items": [
                        {
                            "text": "Sì, certo, Aria. Mi chiamo Diego Rossi e sono di Firenze.",
                            "font-size": 60
                        }
                    ],
                    "y": 850
                },
                {
                    "type": "voice",
                    "text": "Sì, certo, Aria. Mi chiamo Diego Rossi e sono di Firenze.",
                    "voice": "it-IT-DiegoNeural"
                }
            ]
        }
    ]
}

The video simulates a short conversation between an English-speaking woman and an Italian-speaking man.
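
The voice identifiers used here follow the Azure naming pattern of language, region and voice name: en-US-AriaNeural is a US English voice and it-IT-DiegoNeural is an Italian one. As a sketch of how the conversation could be extended to a third language, the voice element below could be placed in an additional scene. The voice name fr-FR-DeniseNeural and the French line are assumptions chosen for illustration and are not part of the original example, so check that the voice is available before relying on it.

```json
{
    "type": "voice",
    "text": "Merci, Diego. Enchantée de faire votre connaissance.",
    "voice": "fr-FR-DeniseNeural",
    "start": 1
}
```

As in Scene #1, the start value delays the voice so that it does not begin at the very first frame of the scene.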