Stream mappings

The Transcoder API allows you to concatenate videos, mix audio tracks, and more. The JobConfig JSON specification is highly flexible, which can make it ambiguous how inputs map to outputs. You can define stream mappings explicitly to resolve this ambiguity. If you do not, the API applies reasonable default stream mappings for you.

This page shows the default stream mappings provided by the API and some advanced configuration examples for the encoding of input media files.

Background

The inputs list in a JobConfig specifies which files to download, not how to use them. Each input is paired with a key to identify it.

The editList defines a sequence of edits as a timeline for the output file (or manifest) from a transcoding job. The inputs in the editList determine which inputs to use in each atom.

For more information, read the Concepts section in the Overview.

Default video mapping

Each atom in the editList must reference at least one input that contains a video track. If multiple inputs are defined for an atom and each contains a video track, the first input in the inputs list is used as the video source; this is the default mapping. If none of the inputs contains a video track, the job fails.

The following configuration concatenates the first 5 seconds of the video track in input0.mp4 with 10 seconds of the video track in input1.mov (from 10s to 20s) into the output file:

"inputs": [
  {
    "key": "input0",
    "uri": "gs://my-bucket/input0.mp4"
  },
  {
    "key": "input1",
    "uri": "gs://my-bucket/input1.mov"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["input0"],
    "endTimeOffset": "5s",
    "startTimeOffset": "0s"
  },
  {
    "key": "atom1",
    "inputs": ["input1"],
    "endTimeOffset": "20s",
    "startTimeOffset": "10s"
  }
]

See Concatenating multiple input videos for more information.
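As a quick check of the timeline described above, the following minimal sketch computes how many seconds each atom contributes to the output. The helper names `seconds` and `atom_durations` are hypothetical, and the sketch assumes every atom sets both offsets as strings that end in "s", as in the example configuration.

```python
from typing import Dict, List


def seconds(offset: str) -> float:
    """Parse a duration string such as "5s" or "3.5s" into seconds."""
    if not offset.endswith("s"):
        raise ValueError(f"unsupported duration: {offset!r}")
    return float(offset[:-1])


def atom_durations(edit_list: List[dict]) -> Dict[str, float]:
    """Return the length of each atom (end offset minus start offset)."""
    return {
        atom["key"]: seconds(atom["endTimeOffset"]) - seconds(atom["startTimeOffset"])
        for atom in edit_list
    }


# Same editList as the configuration above.
edit_list = [
    {"key": "atom0", "inputs": ["input0"], "endTimeOffset": "5s", "startTimeOffset": "0s"},
    {"key": "atom1", "inputs": ["input1"], "endTimeOffset": "20s", "startTimeOffset": "10s"},
]

durations = atom_durations(edit_list)
print(durations)                # {'atom0': 5.0, 'atom1': 10.0}
print(sum(durations.values()))  # 15.0
```

Under these assumptions the concatenated output is 15 seconds long: 5 seconds from input0 followed by 10 seconds from input1.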

Default audio mappings

Audio mappings apply to a variety of situations, most notably when the number of channels in the input audio differs from the number of channels in the output audioStream.

Concatenate multiple inputs

Each atom in the editList must reference at least one input that contains an audio track if an audioStream is defined. If multiple inputs are defined for an atom and each contains an audio track, the first input in the inputs list is used as the audio source; this is the default mapping. If none of the inputs contains an audio track, the job fails.

The Transcoder API generates a default mapping for a defined audioStream only if the client does not specify the mapping explicitly.

Consider the following configuration that contains a defined audioStream:

"inputs": [
  {
    "key": "video_and_stereo_audio",
    "uri": "gs://my-bucket/video_and_stereo_audio.mp4"
  },
  {
    "key": "video_only",
    "uri": "gs://my-bucket/video_only.mov"
  },
  {
    "key": "stereo_audio_only",
    "uri": "gs://my-bucket/stereo_audio_only.mp3"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["video_and_stereo_audio"]
  },
  {
    "key": "atom1",
    "inputs": ["video_only", "stereo_audio_only"]
  }
],
"elementaryStreams": [
  {
    "key": "output_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2, // API default
      "channelLayout": ["fl", "fr"], // API default
      "sampleRateHertz": 48000
    }
  }
]

The Transcoder API generates the following default mappings for audio output. Note that atom1 is mapped to the stereo_audio_only input, not to video_only: although video_only appears first in the atom's inputs list, it does not contain an audio track.

"elementaryStreams": [
  {
    "key": "output_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2,
      "channelLayout": ["fl", "fr"],
      "sampleRateHertz": 48000,
      "mapping": [
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 0,
          "outputChannel": 0,
          "gainDb": 0
        },
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 1,
          "outputChannel": 1,
          "gainDb": 0
        },
        {
          "atomKey": "atom1",
          "inputKey": "stereo_audio_only",
          "inputTrack": 0,
          "inputChannel": 0,
          "outputChannel": 0,
          "gainDb": 0
        },
        {
          "atomKey": "atom1",
          "inputKey": "stereo_audio_only",
          "inputTrack": 0,
          "inputChannel": 1,
          "outputChannel": 1,
          "gainDb": 0
        }
      ]
    }
  }
]
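The selection rule behind this default mapping (use the first input in the atom's inputs list that contains an audio track) can be sketched as follows. This is an illustration rather than the API's implementation: the `TRACKS` table is a hypothetical stand-in for the track layout of each file, and `default_audio_source` is a made-up helper name.

```python
from typing import Dict, List, Tuple

# Hypothetical track layout of each input file, in track order.
TRACKS: Dict[str, List[str]] = {
    "video_and_stereo_audio": ["video", "audio"],
    "video_only": ["video"],
    "stereo_audio_only": ["audio"],
}


def default_audio_source(atom_inputs: List[str],
                         tracks: Dict[str, List[str]]) -> Tuple[str, int]:
    """Return (inputKey, inputTrack) of the first input that has an audio track."""
    for key in atom_inputs:
        kinds = tracks.get(key, [])
        if "audio" in kinds:
            return key, kinds.index("audio")
    # Mirrors the documented behavior: no audio track means the job fails.
    raise ValueError(f"no input in {atom_inputs} contains an audio track")


edit_list = [
    {"key": "atom0", "inputs": ["video_and_stereo_audio"]},
    {"key": "atom1", "inputs": ["video_only", "stereo_audio_only"]},
]

sources = {a["key"]: default_audio_source(a["inputs"], TRACKS) for a in edit_list}
print(sources)
# {'atom0': ('video_and_stereo_audio', 1), 'atom1': ('stereo_audio_only', 0)}
```

The result matches the inputKey and inputTrack values in the generated mapping: video_only is skipped for atom1 because it has no audio track.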

N to N copy

If the number of channels in the input audio track matches the number of channels in the output audioStream, the Transcoder API copies the input channels into the output channels.

Consider the following configuration that contains an input with two-channel stereo audio and a defined audioStream with 2 channels:

"inputs": [
  {
    "key": "video_and_stereo_audio",
    "uri": "gs://my-bucket/video_and_stereo_audio.mp4"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["video_and_stereo_audio"]
  }
],
"elementaryStreams": [
  {
    "key": "output_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2, // API default
      "channelLayout": ["fl", "fr"], // API default
      "sampleRateHertz": 48000
    }
  }
]

The Transcoder API generates the following default mappings for audio output:

"elementaryStreams": [
  {
    "key": "output_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2,
      "channelLayout": ["fl", "fr"],
      "sampleRateHertz": 48000,
      "mapping": [
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 0,
          "outputChannel": 0,
          "gainDb": 0
        },
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 1,
          "outputChannel": 1,
          "gainDb": 0
        }
      ]
    }
  }
]
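The one-to-one channel copy can be sketched as a small generator of mapping entries. The function name `n_to_n_mapping` is hypothetical; the field names are the ones used in the mapping shown above.

```python
import json
from typing import List


def n_to_n_mapping(atom_key: str, input_key: str,
                   input_track: int, channels: int) -> List[dict]:
    """Map input channel i to output channel i for every channel."""
    return [
        {
            "atomKey": atom_key,
            "inputKey": input_key,
            "inputTrack": input_track,
            "inputChannel": ch,
            "outputChannel": ch,  # same index on both sides
            "gainDb": 0,
        }
        for ch in range(channels)
    ]


mapping = n_to_n_mapping("atom0", "video_and_stereo_audio", 1, 2)
print(json.dumps(mapping, indent=2))
```

For two channels this yields the same two entries as the generated mapping: channel 0 to channel 0 and channel 1 to channel 1.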

N to 1 downmix

If the number of channels in the input audio track is greater than the number of channels in the output audioStream, the Transcoder API combines all input channels into a single output channel.

If the audioStream defines multiple output channels, that single downmixed channel is copied to each output channel. For example, if the input audio track consists of 5 channels and the audioStream defines 2 output channels, both output channels contain exactly the same audio: a downmix of the 5 input channels.

Consider the following configuration that contains an input with two-channel stereo audio and a defined audioStream with one output channel:

"inputs": [
  {
    "key": "video_and_stereo_audio",
    "uri": "gs://my-bucket/video_and_stereo_audio.mp4"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["video_and_stereo_audio"]
  }
],
"elementaryStreams": [
  {
    "key": "output_mono_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 1,
      "channelLayout": ["fc"],
      "sampleRateHertz": 48000
    }
  }
]

The Transcoder API generates the following default mappings for audio output:

"elementaryStreams": [
  {
    "key": "output_mono_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 1,
      "channelLayout": ["fc"],
      "sampleRateHertz": 48000,
      "mapping": [
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 0,
          "outputChannel": 0,
          "gainDb": 0
        },
        {
          "atomKey": "atom0",
          "inputKey": "video_and_stereo_audio",
          "inputTrack": 1,
          "inputChannel": 1,
          "outputChannel": 0,
          "gainDb": 0
        }
      ]
    }
  }
]
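The downmix rule can be sketched as a list of (inputChannel, outputChannel) pairs. The helper `downmix_pairs` is hypothetical. The stereo-to-mono case reproduces the mapping shown above; how the API expresses a downmix to several output channels as mapping entries is an assumption here (every input channel routed to every output channel).

```python
from typing import List, Tuple


def downmix_pairs(input_channels: int, output_channels: int) -> List[Tuple[int, int]]:
    """Route every input channel into every output channel (downmix case only)."""
    if input_channels <= output_channels:
        raise ValueError("downmix applies only when the input has more channels")
    return [
        (i, o)
        for o in range(output_channels)
        for i in range(input_channels)
    ]


print(downmix_pairs(2, 1))       # [(0, 0), (1, 0)]
print(len(downmix_pairs(5, 2)))  # 10
```

In the 5-to-2 case each output channel receives all 5 input channels, so both outputs carry the same downmix.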

1 to N copy

If the number of channels in the input audio track is less than the number of channels in the output audioStream, the Transcoder API copies the first input channel into each output channel.

Consider the following configuration that contains an input with one-channel mono audio and a defined audioStream with 2 output channels:

"inputs": [
  {
    "key": "video_and_mono_audio",
    "uri": "gs://my-bucket/video_and_mono_audio.mp4"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["video_and_mono_audio"]
  }
],
"elementaryStreams": [
  {
    "key": "output_mono_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2, // API default
      "channelLayout": ["fl", "fr"], // API default
      "sampleRateHertz": 48000
    }
  }
]

The Transcoder API generates the following default mappings for audio output:

"elementaryStreams": [
  {
    "key": "output_mono_audio",
    "audioStream": {
      "codec": "aac",
      "bitrateBps": 64000,
      "channelCount": 2,
      "channelLayout": ["fl", "fr"],
      "sampleRateHertz": 48000,
      "mapping": [
        {
          "atomKey": "atom0",
          "inputKey": "video_and_mono_audio",
          "inputTrack": 1,
          "inputChannel": 0,
          "outputChannel": 0,
          "gainDb": 0
        },
        {
          "atomKey": "atom0",
          "inputKey": "video_and_mono_audio",
          "inputTrack": 1,
          "inputChannel": 0,
          "outputChannel": 1,
          "gainDb": 0
        }
      ]
    }
  }
]
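The fan-out rule can be sketched the same way, as (inputChannel, outputChannel) pairs. The helper `fan_out_pairs` is hypothetical. Note that, following the rule as stated above, only the first input channel is used even when the input has more than one channel.

```python
from typing import List, Tuple


def fan_out_pairs(input_channels: int, output_channels: int) -> List[Tuple[int, int]]:
    """Copy input channel 0 into every output channel (fewer inputs than outputs)."""
    if input_channels >= output_channels:
        raise ValueError("fan-out applies only when the input has fewer channels")
    return [(0, o) for o in range(output_channels)]


print(fan_out_pairs(1, 2))  # [(0, 0), (0, 1)]
print(fan_out_pairs(2, 6))  # [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]
```

The mono-to-stereo result matches the generated mapping: input channel 0 feeds both output channel 0 and output channel 1.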

Default text mapping

Text mappings are generally used for subtitles and closed-captioning (CC).

Each atom in the editList must reference at least one input that contains a text track if a textStream is defined. If multiple inputs are defined for an atom and each contains a text track, the first input in the inputs list is used as the text source; this is the default mapping. If none of the inputs contains a text track, the job fails.

Consider the following configuration that contains a defined textStream:

"inputs": [
  {
    "key": "video_and_audio",
    "uri": "gs://my-bucket/video_and_audio.mp4"
  },
  {
    "key": "sub",
    "uri": "gs://my-bucket/sub.srt"
  }
],
"editList": [
  {
    "key": "atom0",
    "inputs": ["video_and_audio", "sub"]
  }
],
"elementaryStreams": [
  {
    "key": "output_sub",
    "textStream": {
      "codec": "webvtt"
    }
  }
]

The Transcoder API generates the following default mappings for text output:

"elementaryStreams": [
  {
    "key": "output_sub",
    "textStream": {
      "codec": "webvtt",
      "mapping": [
        {
          "atomKey": "atom0",
          "inputKey": "sub",
          "inputTrack": 0
        }
      ]
    }
  }
]
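When you write mappings by hand instead of relying on the defaults, it is easy to reference an input that the atom does not list. The following client-side sanity check is a sketch only: `unknown_mapping_inputs` is a hypothetical helper, the key "missing_input" is made up for the example, and the check does not describe the API's own validation.

```python
from typing import List


def unknown_mapping_inputs(edit_list: List[dict], mapping: List[dict]) -> List[str]:
    """Report mapping entries whose atomKey or inputKey does not match the editList."""
    inputs_by_atom = {atom["key"]: set(atom["inputs"]) for atom in edit_list}
    problems = []
    for entry in mapping:
        allowed = inputs_by_atom.get(entry["atomKey"])
        if allowed is None:
            problems.append(f"unknown atomKey {entry['atomKey']!r}")
        elif entry["inputKey"] not in allowed:
            problems.append(
                f"{entry['inputKey']!r} is not an input of {entry['atomKey']!r}"
            )
    return problems


edit_list = [{"key": "atom0", "inputs": ["video_and_audio", "sub"]}]
good = [{"atomKey": "atom0", "inputKey": "sub", "inputTrack": 0}]
bad = [{"atomKey": "atom0", "inputKey": "missing_input", "inputTrack": 0}]

print(unknown_mapping_inputs(edit_list, good))  # []
print(unknown_mapping_inputs(edit_list, bad))
```

The first call reports no problems; the second reports that "missing_input" is not one of the inputs of atom0.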