VisionEmbeddingModelResult

Prediction format for the large vision model embedding API.

Fields
imageEmbedding array (ListValue format)

The 1024-dimensional image embedding computed from the provided image.

textEmbedding array (ListValue format)

The 1024-dimensional text embedding computed from the provided text.

videoEmbeddings[] object (VideoEmbedding)

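The video embeddings computed from the provided video. Each entry covers the time range given by its start and end offsets.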

JSON representation
{
  "imageEmbedding": array,
  "textEmbedding": array,
  "videoEmbeddings": [
    {
      object (VideoEmbedding)
    }
  ]
}
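
As an illustration of how a client might consume this structure, the following is a minimal Python sketch that converts one already JSON-decoded prediction into typed objects. The dataclass names, the `parse_prediction` helper, and the choice to treat missing fields as `None` or an empty list are assumptions made for this example and are not part of the API. The sample payload uses 3-element vectors for readability; real embeddings have 1024 elements.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class VideoEmbedding:
    start_offset_sec: int
    end_offset_sec: int
    embedding: List[float]


@dataclass
class VisionEmbeddingModelResult:
    # Assumption: a field missing from the JSON is represented as None / [].
    image_embedding: Optional[List[float]] = None
    text_embedding: Optional[List[float]] = None
    video_embeddings: List[VideoEmbedding] = field(default_factory=list)


def parse_prediction(prediction: Dict[str, Any]) -> VisionEmbeddingModelResult:
    """Convert one JSON-decoded prediction dict into dataclasses.

    Hypothetical helper for illustration; not provided by any client library.
    """
    videos = [
        VideoEmbedding(
            start_offset_sec=int(v["startOffsetSec"]),
            end_offset_sec=int(v["endOffsetSec"]),
            embedding=list(v["embedding"]),
        )
        for v in prediction.get("videoEmbeddings", [])
    ]
    return VisionEmbeddingModelResult(
        image_embedding=prediction.get("imageEmbedding"),
        text_embedding=prediction.get("textEmbedding"),
        video_embeddings=videos,
    )


# Toy payload; the vectors are shortened to 3 elements for readability.
sample = {
    "imageEmbedding": [0.1, 0.2, 0.3],
    "videoEmbeddings": [
        {"startOffsetSec": 0, "endOffsetSec": 4, "embedding": [0.5, 0.5, 0.0]},
    ],
}
result = parse_prediction(sample)
print(result.image_embedding, len(result.video_embeddings))
```

Keeping the JSON field names confined to one parsing function means the rest of a client can work with snake_case attributes and type hints.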

VideoEmbedding

The video embedding message.

Fields
startOffsetSec integer

The start offset, in seconds, of the video segment that the embedding covers.

endOffsetSec integer

The end offset, in seconds, of the video segment that the embedding covers.

embedding array (ListValue format)

The 1024-dimensional video embedding computed from the provided video.

JSON representation
{
  "startOffsetSec": integer,
  "endOffsetSec": integer,
  "embedding": array
}
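
One common use of per-segment embeddings is to find the part of a video that best matches a query vector, such as a text embedding. The sketch below ranks VideoEmbedding entries, given as plain JSON-decoded dicts, by cosine similarity to a query vector. The `cosine_similarity` and `rank_segments` helpers are hypothetical names introduced for this example, and using cosine similarity as the comparison metric is an assumption rather than something this reference prescribes.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


def rank_segments(
    query: Sequence[float],
    video_embeddings: List[Dict[str, Any]],
) -> List[Tuple[int, int, float]]:
    """Return (startOffsetSec, endOffsetSec, score) tuples, best match first."""
    scored = [
        (
            seg["startOffsetSec"],
            seg["endOffsetSec"],
            cosine_similarity(query, seg["embedding"]),
        )
        for seg in video_embeddings
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Toy data with 2-element vectors; real embeddings have 1024 elements.
segments = [
    {"startOffsetSec": 0, "endOffsetSec": 4, "embedding": [1.0, 0.0]},
    {"startOffsetSec": 4, "endOffsetSec": 8, "embedding": [0.0, 1.0]},
]
for start, end, score in rank_segments([0.0, 1.0], segments):
    print(f"{start}s-{end}s: {score:.3f}")
```

Returning the offsets alongside the score lets a caller seek directly to the best-matching segment without a second lookup.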