The input format for the video generation model.
prompt
string
The text prompt for generating the videos.
image
object (Image)
An image to use as the first frame of the generated video. If an input image is provided, an input video is not supported.
video
object (Video)
An input video. If this field is provided, an input image is not supported. If a mask is provided along with the video, the video is edited using the mask. Otherwise, the video is extended by the given duration.
lastFrame
object (Image)
An image to use as the last frame of the generated video. An input image must also be provided.
cameraControl
string
Camera motion to use in the generated videos. An input image must also be provided. Valid values are:
- fixed
- pan_left
- pan_right
- tilt_up
- tilt_down
- truck_left
- truck_right
- pedestal_up
- pedestal_down
- push_in
- pull_out
mask
object (Mask)
A mask to use in the generated videos.
referenceImages[]
object (ReferenceImage)
The images to use as references to generate the videos. If this field is provided, the text prompt field must also be provided. The image, video, and lastFrame fields are not supported. Each image must be associated with a type. Veo 2 supports up to 3 asset images or 1 style image.
JSON representation
{
  "prompt": string,
  "image": { object (Image) },
  "video": { object (Video) },
  "lastFrame": { object (Image) },
  "cameraControl": string,
  "mask": { object (Mask) },
  "referenceImages": [ { object (ReferenceImage) } ]
}
Image
Image input format for the prediction.
mimeType
string
The MIME type of the image content. Only the following MIME types are supported:
- image/jpeg
- image/png
data
Union type
data
can be only one of the following:
bytesBase64Encoded
string
Base64-encoded byte string representing the image.
gcsUri
string
The Google Cloud Storage location of the image.
JSON representation
{
  "mimeType": string,

  // data: union type, only one of the following:
  "bytesBase64Encoded": string,
  "gcsUri": string
}
Video
Video input format for the prediction.
mimeType
string
The MIME type of the video content. Only the following MIME types are supported:
- video/mov
- video/mpeg
- video/mp4
- video/mpg
- video/avi
- video/wmv
- video/mpegps
- video/flv
data
Union type
data
can be only one of the following:
gcsUri
string
The Google Cloud Storage location of the video on which to perform the prediction.
bytesBase64Encoded
string
Base64-encoded byte string representing the video.
JSON representation
{
  "mimeType": string,

  // data: union type, only one of the following:
  "gcsUri": string,
  "bytesBase64Encoded": string
}
Mask
Mask input format for the prediction.
mimeType
string
Valid values:
- image/png
- image/jpeg
- image/webp
- video/mov
- video/mpeg
- video/mp4
- video/mpg
- video/avi
- video/wmv
- video/mpegps
- video/flv
maskMode
string
Describes how the mask will be used. Inpainting masks must match the aspect ratio of the input video. Outpainting masks can be either 9:16 or 16:9. Available options are:
- insert: The image mask contains a masked rectangular region that is applied to the first frame of the input video. The object described in the prompt is inserted into this region and appears in subsequent frames.
- remove: The image mask is used to identify an object in the first video frame to track. This object is removed from the video.
- remove_static: The image mask is used to identify a region in the video. Objects in this region are removed.
- outpaint: The image mask contains a masked rectangular region where the input video is placed. The remaining area is generated.
Video masks are not supported.
data
Union type
data
can be only one of the following:
bytesBase64Encoded
string
Base64-encoded byte string representing the mask.
gcsUri
string
The Google Cloud Storage location of the mask.
JSON representation
{
  "mimeType": string,
  "maskMode": string,

  // data: union type, only one of the following:
  "bytesBase64Encoded": string,
  "gcsUri": string
}
ReferenceImage
Reference image input format for the prediction. A ReferenceImage is an image that is used to provide additional context for the video generation.
image
object (Image)
The image data to be used as the reference image.
referenceType
string
The type of the reference image, which defines how the reference image is used to generate the video. Supported types are:
- asset: The reference image provides assets for the generated video, such as the scene, an object, or a character.
- style: The aesthetics of the reference image, including colors, lighting, and texture, are used as the style of the generated video (for example, 'anime', 'photography', or 'origami').
JSON representation
{
  "image": { object (Image) },
  "referenceType": string
}
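To round this out, a reference-driven request supplies a prompt plus a list of ReferenceImage objects (Veo 2 allows up to 3 asset images or 1 style image). The top-level list field name referenceImages is assumed to match the JSON representation above; the prompt and URIs are illustrative only.

```python
# Hypothetical instance built around reference images; each entry pairs an
# Image object with a referenceType of "asset" or "style".
instance = {
    "prompt": "The character walks through a neon-lit market at night",
    "referenceImages": [  # assumed top-level field name for the list
        {
            "image": {"mimeType": "image/png", "gcsUri": "gs://my-bucket/character.png"},
            "referenceType": "asset",
        },
        {
            "image": {"mimeType": "image/png", "gcsUri": "gs://my-bucket/market.png"},
            "referenceType": "asset",
        },
    ],
}
```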