This guide shows you how to get multimodal embeddings for text, image, and video data using the multimodalembedding model.

This page covers the following topics:

- Supported models
- Best practices
- Before you begin
- Send an embedding request (image and text)
- Send an embedding request (video, image, or text)
- Specify lower-dimension embeddings
- API usage and limits

The multimodal embeddings model generates 1408-dimension vectors based on the input you provide, which can include a combination of image, text, and video data. You can then use the embedding vectors for subsequent tasks like image classification or video content moderation.

The image embedding vector and the text embedding vector are in the same semantic space with the same dimensionality. Consequently, you can use these vectors interchangeably for use cases like searching for an image with text or searching for a video with an image.

For text-only embedding use cases, we recommend using the Vertex AI text-embeddings API instead. For example, the text-embeddings API might be better for text-based semantic search, clustering, long-form document analysis, and other text retrieval or question-answering use cases. For more information, see Get text embeddings.
To get multimodal embeddings, use the following model: Consider the following best practices when you use the multimodal embeddings model: the text 'cat' picture of a cat In the Google Cloud console, on the project selector page,
select or create a Google Cloud project.
Verify that billing is enabled for your Google Cloud project.
Enable the Vertex AI API.
Set up authentication for your environment. Select the tab for how you plan to use the samples on this page:
Java

To use the Java samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.

1. Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

2. After initializing the gcloud CLI, update it and install the required components:

   gcloud components update
   gcloud components install beta

3. If you're using a local shell, then create local authentication credentials for your user account:

   gcloud auth application-default login

   You don't need to do this if you're using Cloud Shell.

   If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
Node.js

To use the Node.js samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.

1. Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

2. After initializing the gcloud CLI, update it and install the required components:

   gcloud components update
   gcloud components install beta

3. If you're using a local shell, then create local authentication credentials for your user account:

   gcloud auth application-default login

   You don't need to do this if you're using Cloud Shell.

   If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
Python

To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.

1. Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

2. After initializing the gcloud CLI, update it and install the required components:

   gcloud components update
   gcloud components install beta

3. If you're using a local shell, then create local authentication credentials for your user account:

   gcloud auth application-default login

   You don't need to do this if you're using Cloud Shell.

   If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

1. Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

2. After initializing the gcloud CLI, update it and install the required components:

   gcloud components update
   gcloud components install beta

For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
To use the Python SDK, follow the instructions at
Install the Vertex AI SDK for Python.
For more information, see the
Vertex AI SDK for Python API reference documentation.

Send an embedding request (image and text)

Use the following code samples to send an embedding request with image and text data. The samples show how to send a request with both data types, but you can also send requests with only image data or only text data.

REST

For more information about multimodalembedding model requests, see the multimodalembedding model API reference.

Before using any of the request data, make the following replacements:

- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- PROJECT_ID: Your Google Cloud project ID.
- TEXT: The target text to get embeddings for. For example, a cat.
- B64_ENCODED_IMG: The target image to get embeddings for, specified as a base64-encoded byte string.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict

Request JSON body:
{
  "instances": [
    {
      "text": "TEXT",
      "image": {
        "bytesBase64Encoded": "B64_ENCODED_IMG"
      }
    }
  ]
}
To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict"PowerShell
request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict" | Select-Object -Expand Content
You should receive a JSON response similar to the following:

{
"predictions": [
{
"textEmbedding": [
0.010477379,
-0.00399621,
0.00576670747,
[...]
-0.00823613815,
-0.0169572588,
-0.00472954148
],
"imageEmbedding": [
0.00262696808,
-0.00198890246,
0.0152047109,
-0.0103145819,
[...]
0.0324628279,
0.0284924973,
0.011650892,
-0.00452344026
]
}
],
"deployedModelId": "DEPLOYED_MODEL_ID"
}
Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Go

Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
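For a quick end-to-end illustration, the following is a minimal Python sketch of the same image-and-text request using the Vertex AI SDK for Python. It assumes the SDK is installed, Application Default Credentials are configured, and that the project ID and image path placeholders are replaced with your own values.

import math

import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

# Placeholders: substitute your own project ID, region, and image.
vertexai.init(project="PROJECT_ID", location="us-central1")

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
image = Image.load_from_file("cat.png")  # local path or Cloud Storage URI

embeddings = model.get_embeddings(
    image=image,
    contextual_text="a cat",
)

# Both vectors are 1408-dimensional and share a semantic space, so you can
# compare them directly, for example with cosine similarity.
t, i = embeddings.text_embedding, embeddings.image_embedding
cosine = sum(a * b for a, b in zip(t, i)) / (
    math.sqrt(sum(a * a for a in t)) * math.sqrt(sum(b * b for b in i))
)
print(f"{len(t)}-d text vector, {len(i)}-d image vector, cosine={cosine:.4f}")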
Send an embedding request (video, image, or text)
You can send an embedding request with an input video alone, or with a combination of video, image, and text data.
Video embedding modes
There are three modes you can use with video embeddings: Essential, Standard, and Plus. The mode corresponds to the density of the generated embeddings, which you specify with the intervalSec field in the request. The model generates one embedding for each video interval of the length that you specify. The minimum video interval length is 4 seconds. Interval lengths greater than 120 seconds might negatively affect the quality of the generated embeddings.
Pricing for video embedding depends on the mode you use. For more information, see pricing.
The following table summarizes the three modes you can use for video embeddings:
Mode | Description | Use Case | Embedding Interval (intervalSec) |
---|---|---|---|
Essential | Generates the fewest embeddings per minute of video. | Best for coarse-grained analysis where cost-efficiency is a priority and high temporal detail is not required. | >= 15 seconds |
Standard | Offers a balance between embedding density and cost. | Suitable for general-purpose video analysis, such as topic segmentation or standard search and retrieval. | 8 to < 15 seconds |
Plus | Generates the most embeddings per minute of video, providing the highest temporal granularity. | Ideal for detailed, fine-grained analysis, such as detecting short events, action recognition, or content moderation. | 4 to < 8 seconds |
Video embeddings best practices
Consider the following when you send video embedding requests:
- To generate a single embedding for the first two minutes of an input video of any length, use the following videoSegmentConfig setting in request.json:

  // other request body content
  "videoSegmentConfig": {
    "intervalSec": 120
  }
  // other request body content
- To generate embeddings for a video longer than two minutes, you can send multiple requests that specify the start and end times in videoSegmentConfig, as shown in the Python sketch after this list:

  request1.json:

  // other request body content
  "videoSegmentConfig": {
    "startOffsetSec": 0,
    "endOffsetSec": 120
  }
  // other request body content

  request2.json:

  // other request body content
  "videoSegmentConfig": {
    "startOffsetSec": 120,
    "endOffsetSec": 240
  }
  // other request body content
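To make this pattern concrete, here is a minimal Python sketch that walks a longer video in 120-second windows with the Vertex AI SDK for Python. The video URI is a placeholder, and the total duration is assumed to be known in advance (for example, from your media pipeline).

import vertexai
from vertexai.vision_models import MultiModalEmbeddingModel, Video, VideoSegmentConfig

vertexai.init(project="PROJECT_ID", location="us-central1")
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

video = Video.load_from_file("gs://my-bucket/embeddings/long-video.mp4")
DURATION_SEC = 360  # assumed total length of the input video

segments = []
for start in range(0, DURATION_SEC, 120):
    response = model.get_embeddings(
        video=video,
        video_segment_config=VideoSegmentConfig(
            start_offset_sec=start,
            end_offset_sec=min(start + 120, DURATION_SEC),
            interval_sec=16,  # Essential mode interval
        ),
    )
    segments.extend(response.video_embeddings)

# Each element covers one interval and carries its own offsets.
for seg in segments:
    print(seg.start_offset_sec, seg.end_offset_sec, len(seg.embedding))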
Get video embeddings
Use the following sample to get embeddings for video content alone.
REST
For more information about multimodalembedding model requests, see the multimodalembedding model API reference.
The following example uses a video located in Cloud Storage. You can
also use the video.bytesBase64Encoded
field to provide a
base64-encoded string representation of the
video.
Before using any of the request data, make the following replacements:
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- PROJECT_ID: Your Google Cloud project ID.
- VIDEO_URI: The Cloud Storage URI of the target video to get embeddings for. For example, gs://my-bucket/embeddings/supermarket-video.mp4. You can also provide the video as a base64-encoded byte string:

  [...] "video": { "bytesBase64Encoded": "B64_ENCODED_VIDEO" } [...]
- videoSegmentConfig (START_SECOND, END_SECOND, INTERVAL_SECONDS): Optional. The specific video segments (in seconds) that the embeddings are generated for. For example:

  [...] "videoSegmentConfig": { "startOffsetSec": 10, "endOffsetSec": 60, "intervalSec": 10 } [...]

  Using this config specifies video data from 10 seconds to 60 seconds and generates embeddings for the following 10-second video intervals: [10, 20), [20, 30), [30, 40), [40, 50), [50, 60). This video interval ("intervalSec": 10) falls in the Standard video embedding mode, and the user is charged at the Standard mode pricing rate.

  If you omit videoSegmentConfig, the service uses the following default values: "videoSegmentConfig": { "startOffsetSec": 0, "endOffsetSec": 120, "intervalSec": 16 }. This video interval ("intervalSec": 16) falls in the Essential video embedding mode, and the user is charged at the Essential mode pricing rate.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict
Request JSON body:
{ "instances": [ { "video": { "gcsUri": "VIDEO_URI", "videoSegmentConfig": { "startOffsetSec": START_SECOND, "endOffsetSec": END_SECOND, "intervalSec": INTERVAL_SECONDS } } } ] }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict" | Select-Object -Expand Content
Response (7-second video, no videoSegmentConfig specified):
{ "predictions": [ { "videoEmbeddings": [ { "endOffsetSec": 7, "embedding": [ -0.0045467657, 0.0258095954, 0.0146885719, 0.00945400633, [...] -0.0023291884, -0.00493789, 0.00975185353, 0.0168156829 ], "startOffsetSec": 0 } ] } ], "deployedModelId": "DEPLOYED_MODEL_ID" }
Response (59-second video, with the following video segment config: "videoSegmentConfig": { "startOffsetSec": 0, "endOffsetSec": 60, "intervalSec": 10 }):

{
  "predictions": [
    {
      "videoEmbeddings": [
        {
          "startOffsetSec": 0,
          "endOffsetSec": 10,
          "embedding": [
            -0.00683252793,
            0.0390476175,
            [...]
            0.00657121744,
            0.013023301
          ]
        },
        {
          "startOffsetSec": 10,
          "endOffsetSec": 20,
          "embedding": [
            -0.0104404651,
            0.0357737206,
            [...]
            0.00509833824,
            0.0131902946
          ]
        },
        {
          "startOffsetSec": 20,
          "endOffsetSec": 30,
          "embedding": [
            -0.0113538112,
            0.0305239167,
            [...]
            -0.00195809244,
            0.00941874553
          ]
        },
        {
          "startOffsetSec": 30,
          "endOffsetSec": 40,
          "embedding": [
            -0.00299320649,
            0.0322436653,
            [...]
            -0.00993082579,
            0.00968887936
          ]
        },
        {
          "startOffsetSec": 40,
          "endOffsetSec": 50,
          "embedding": [
            -0.00591270532,
            0.0368893594,
            [...]
            -0.00219071587,
            0.0042470959
          ]
        },
        {
          "startOffsetSec": 50,
          "endOffsetSec": 59,
          "embedding": [
            -0.00458270218,
            0.0368121453,
            [...]
            -0.00317760976,
            0.00595594104
          ]
        }
      ]
    }
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID"
}
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Go
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
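As a complement to the REST sample, the following is a minimal Python sketch of the same video-only request with the Vertex AI SDK for Python; the URI and segment values mirror the placeholders above.

import vertexai
from vertexai.vision_models import MultiModalEmbeddingModel, Video, VideoSegmentConfig

vertexai.init(project="PROJECT_ID", location="us-central1")
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

video = Video.load_from_file("gs://my-bucket/embeddings/supermarket-video.mp4")
response = model.get_embeddings(
    video=video,
    video_segment_config=VideoSegmentConfig(
        start_offset_sec=10,
        end_offset_sec=60,
        interval_sec=10,  # 8 to < 15 seconds: Standard mode
    ),
)

# One embedding per 10-second interval in [10, 60).
for segment in response.video_embeddings:
    print(f"[{segment.start_offset_sec}, {segment.end_offset_sec}): "
          f"{len(segment.embedding)} dimensions")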
Get image, text, and video embeddings
Use the following sample to get embeddings for video, text, and image content.
REST
For more information about multimodalembedding model requests, see the multimodalembedding model API reference.
The following example uses image, text, and video data. You can use any combination of these data types in your request body.
Additionally, this
sample uses a video located in Cloud Storage. You can
also use the video.bytesBase64Encoded
field to provide a
base64-encoded string representation of the
video.
Before using any of the request data, make the following replacements:
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- PROJECT_ID: Your Google Cloud project ID.
- TEXT: The target text to get embeddings for. For example, a cat.
- IMAGE_URI: The Cloud Storage URI of the target image to get embeddings for. For example, gs://my-bucket/embeddings/supermarket-img.png. You can also provide the image as a base64-encoded byte string:

  [...] "image": { "bytesBase64Encoded": "B64_ENCODED_IMAGE" } [...]
- VIDEO_URI: The Cloud Storage URI of the target video to get embeddings for. For example, gs://my-bucket/embeddings/supermarket-video.mp4. You can also provide the video as a base64-encoded byte string:

  [...] "video": { "bytesBase64Encoded": "B64_ENCODED_VIDEO" } [...]
- videoSegmentConfig (START_SECOND, END_SECOND, INTERVAL_SECONDS): Optional. The specific video segments (in seconds) that the embeddings are generated for. For example:

  [...] "videoSegmentConfig": { "startOffsetSec": 10, "endOffsetSec": 60, "intervalSec": 10 } [...]

  Using this config specifies video data from 10 seconds to 60 seconds and generates embeddings for the following 10-second video intervals: [10, 20), [20, 30), [30, 40), [40, 50), [50, 60). This video interval ("intervalSec": 10) falls in the Standard video embedding mode, and the user is charged at the Standard mode pricing rate.

  If you omit videoSegmentConfig, the service uses the following default values: "videoSegmentConfig": { "startOffsetSec": 0, "endOffsetSec": 120, "intervalSec": 16 }. This video interval ("intervalSec": 16) falls in the Essential video embedding mode, and the user is charged at the Essential mode pricing rate.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict
Request JSON body:
{ "instances": [ { "text": "TEXT", "image": { "gcsUri": "IMAGE_URI" }, "video": { "gcsUri": "VIDEO_URI", "videoSegmentConfig": { "startOffsetSec": START_SECOND, "endOffsetSec": END_SECOND, "intervalSec": INTERVAL_SECONDS } } } ] }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict" | Select-Object -Expand Content
{ "predictions": [ { "textEmbedding": [ 0.0105433334, -0.00302835181, 0.00656806398, 0.00603460241, [...] 0.00445805816, 0.0139605571, -0.00170318608, -0.00490092579 ], "videoEmbeddings": [ { "startOffsetSec": 0, "endOffsetSec": 7, "embedding": [ -0.00673126569, 0.0248149596, 0.0128901172, 0.0107588246, [...] -0.00180952181, -0.0054573305, 0.0117037306, 0.0169312079 ] } ], "imageEmbedding": [ -0.00728622358, 0.031021487, -0.00206603738, 0.0273937676, [...] -0.00204976718, 0.00321615417, 0.0121978866, 0.0193375275 ] } ], "deployedModelId": "DEPLOYED_MODEL_ID" }
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Go
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
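Similarly, here is a minimal Python sketch of the combined request, reusing the placeholder URIs from the REST example above:

import vertexai
from vertexai.vision_models import (
    Image,
    MultiModalEmbeddingModel,
    Video,
    VideoSegmentConfig,
)

vertexai.init(project="PROJECT_ID", location="us-central1")
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

response = model.get_embeddings(
    image=Image.load_from_file("gs://my-bucket/embeddings/supermarket-img.png"),
    video=Video.load_from_file("gs://my-bucket/embeddings/supermarket-video.mp4"),
    contextual_text="a cat",
    video_segment_config=VideoSegmentConfig(interval_sec=16),
)

print(len(response.text_embedding))   # 1408
print(len(response.image_embedding))  # 1408
for seg in response.video_embeddings:
    print(seg.start_offset_sec, seg.end_offset_sec, len(seg.embedding))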
Specify lower-dimension embeddings
By default, an embedding request returns a 1408-dimension float vector for each data type. You can also specify lower-dimension embeddings (128-, 256-, or 512-dimension float vectors) for text and image data. This option lets you optimize for latency and storage, or for quality, based on how you plan to use the embeddings. Lower-dimension embeddings require less storage and provide lower latency for subsequent embedding tasks (like search or recommendation), while higher-dimension embeddings offer greater accuracy for the same tasks.
Dimension | Description | Pros | Cons | Use Case |
---|---|---|---|---|
128 | Lowest dimension embedding. | Fastest performance, lowest storage and memory cost. | Lowest accuracy and granularity. | Applications where speed and cost are critical, and lower accuracy is acceptable (e.g., large-scale, coarse-grained search). |
256 | A balance between performance and quality. | Good performance with moderate storage costs. | Less accurate than higher-dimension embeddings. | General-purpose applications requiring a balance of speed and accuracy. |
512 | Higher-quality embedding. | Improved accuracy and detail representation. | Slower performance and higher storage costs than lower dimensions. | Tasks requiring higher accuracy, such as fine-grained similarity search or classification. |
1408 (default) | Highest dimension embedding. | Highest accuracy and most detailed semantic representation. | Highest latency and storage/memory requirements. | High-stakes applications where quality is paramount and performance costs are acceptable. |
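As a rough sizing example (assuming 4-byte float32 storage), 10 million image embeddings take about 10,000,000 x 1408 x 4 bytes, or roughly 56 GB, at the default dimension, versus roughly 5 GB at 128 dimensions.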
REST
For more information about multimodalembedding model requests, see the multimodalembedding model API reference.
To request a lower-dimension embedding, add the parameters.dimension field to your request. Set the value to 128, 256, 512, or 1408. The response then includes an embedding of the specified dimension.
Before using any of the request data, make the following replacements:
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- PROJECT_ID: Your Google Cloud project ID.
- IMAGE_URI: The Cloud Storage URI of the target image to get embeddings for. For example, gs://my-bucket/embeddings/supermarket-img.png. You can also provide the image as a base64-encoded byte string:

  [...] "image": { "bytesBase64Encoded": "B64_ENCODED_IMAGE" } [...]
- TEXT: The target text to get embeddings for. For example, a cat.
- EMBEDDING_DIMENSION: The number of embedding dimensions. Lower values offer decreased latency when using these embeddings for subsequent tasks, while higher values offer better accuracy. Available values: 128, 256, 512, and 1408 (default).
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict
Request JSON body:
{ "instances": [ { "image": { "gcsUri": "IMAGE_URI" }, "text": "TEXT" } ], "parameters": { "dimension": EMBEDDING_DIMENSION } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/multimodalembedding@001:predict" | Select-Object -Expand Content
128 dimensions:
{ "predictions": [ { "imageEmbedding": [ 0.0279239565, [...128 dimension vector...] 0.00403284049 ], "textEmbedding": [ 0.202921599, [...128 dimension vector...] -0.0365431122 ] } ], "deployedModelId": "DEPLOYED_MODEL_ID" }
256 dimensions:
{ "predictions": [ { "imageEmbedding": [ 0.248620048, [...256 dimension vector...] -0.0646447465 ], "textEmbedding": [ 0.0757875815, [...256 dimension vector...] -0.02749932 ] } ], "deployedModelId": "DEPLOYED_MODEL_ID" }
512 dimensions:
{ "predictions": [ { "imageEmbedding": [ -0.0523675755, [...512 dimension vector...] -0.0444030389 ], "textEmbedding": [ -0.0592851527, [...512 dimension vector...] 0.0350437127 ] } ], "deployedModelId": "DEPLOYED_MODEL_ID" }
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Go
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
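In the Vertex AI SDK for Python, the lower-dimension option is exposed as the dimension argument of get_embeddings. A minimal sketch, with the same placeholder image and text as above:

import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

vertexai.init(project="PROJECT_ID", location="us-central1")
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

embeddings = model.get_embeddings(
    image=Image.load_from_file("gs://my-bucket/embeddings/supermarket-img.png"),
    contextual_text="a cat",
    dimension=128,  # one of 128, 256, 512, or 1408 (default)
)

print(len(embeddings.image_embedding))  # 128
print(len(embeddings.text_embedding))   # 128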
API usage and limits
API limits
The following limits apply when you use the multimodalembedding model for text, image, and video embeddings:
Limit | Value and description |
---|---|
Text and image data | |
Maximum number of API requests per minute per project | 120-600 depending on region |
Maximum text length | 32 tokens (~32 words). If the input text exceeds this limit, the model truncates it to 32 tokens. |
Language | English |
Image formats | BMP, GIF, JPG, PNG |
Image size | 20 MB. For base64-encoded images, this limit applies after transcoding to PNG. For Cloud Storage images, this limit applies to the original file. To reduce network latency, use smaller images. The model resizes all images to 512x512 pixels, so you don't need to provide higher-resolution images. |
Video data | |
Audio supported | N/A - The model doesn't consider audio content when generating video embeddings. |
Video formats | AVI, FLV, MKV, MOV, MP4, MPEG, MPG, WEBM, and WMV |
Maximum video length (Cloud Storage) | No limit. However, the model analyzes only 2 minutes of content at a time. |
Locations
A location is a region you can specify in a request to control where data is stored at rest. For a list of available regions, see Generative AI on Vertex AI locations.
Error messages
Quota exceeded error
google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for
aiplatform.googleapis.com/online_prediction_requests_per_base_model with base
model: multimodalembedding. Please submit a quota increase request.
If this is the first time you receive this error, use the Google Cloud console to request a quota adjustment for your project. Use the following filters before requesting your adjustment:
- Service ID: aiplatform.googleapis.com
- metric: aiplatform.googleapis.com/online_prediction_requests_per_base_model
- base_model: multimodalembedding
If you have already sent a quota adjustment request, wait before sending another request. If you need to further increase the quota, repeat the quota adjustment request with your justification for a sustained quota adjustment.
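While a quota adjustment is pending, you can also soften intermittent 429 errors on the client by retrying with exponential backoff. The following is a minimal Python sketch; the backoff schedule is illustrative, not a recommended setting.

import time

from google.api_core import exceptions

def get_embeddings_with_retry(model, max_attempts=5, **kwargs):
    """Calls model.get_embeddings, backing off on ResourceExhausted (429)."""
    for attempt in range(max_attempts):
        try:
            return model.get_embeddings(**kwargs)
        except exceptions.ResourceExhausted:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s

# Example (model as created in the earlier samples):
# embeddings = get_embeddings_with_retry(model, contextual_text="a cat")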
What's next
- Read the blog "What is Multimodal Search: 'LLMs with vision' change businesses".
- For information about text-only use cases (text-based semantic search, clustering, long-form document analysis, and other text retrieval or question-answering use cases), read Get text embeddings.
- View all Vertex AI image generative AI offerings in the Imagen on Vertex AI overview.
- Explore more pretrained models in Model Garden.
- Learn about responsible AI best practices and safety filters in Vertex AI.