# Image captions

**Caution:** Starting on June 24, 2025, Imagen versions 1 and 2 are deprecated. The Imagen models `imagegeneration@002`, `imagegeneration@005`, and `imagegeneration@006` will be removed on September 24, 2025. For more information about migrating to Imagen 3, see [Migrate to Imagen 3](/vertex-ai/generative-ai/docs/image/migrate-to-imagen-3).

**Note:** Starting on April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have never used them, including new projects. For details, see Model versions and lifecycle.
`imagetext` is the name of the model that supports image captioning. `imagetext`
generates a caption from an image you provide based on the language that you
specify. The model supports the following languages: English (`en`), German
(`de`), French (`fr`), Spanish (`es`), and Italian (`it`).
To explore this model in the console, see the `Image Captioning` model card in
the Model Garden.

[View Imagen for Captioning & VQA model card](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/imagetext)

Use cases
---------

Some common use cases for image captioning include:

- Creators can generate captions for uploaded images and videos (for example, a short description of a video sequence)
- Generate captions to describe products
- Integrate captioning with an app using the API to create new experiences (a minimal SDK sketch follows this list)
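For readers who prefer a client library over raw REST calls, here is a minimal sketch using the Vertex AI SDK for Python. The project ID, image path, and the `imagetext@001` version string are illustrative assumptions, not values taken from this page.

```python
import vertexai
from vertexai.vision_models import Image, ImageTextModel

# Assumption: "your-project-id" is a placeholder; us-central1 matches the
# endpoint used in the REST examples below.
vertexai.init(project="your-project-id", location="us-central1")

# Assumption: imagetext@001 is a published version of the imagetext model.
model = ImageTextModel.from_pretrained("imagetext@001")

# Load a local PNG or JPEG (20 MB max, per the parameter table below).
image = Image.load_from_file("./cake.png")

# Ask for two English captions; get_captions returns a list of strings.
captions = model.get_captions(image=image, number_of_results=2, language="en")
for caption in captions:
    print(caption)
```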
{"instances":[{"image":{// Union field can be only one of the following:"bytesBase64Encoded":string,"gcsUri":string,// End of list of possible types for union field."mimeType":string}}],"parameters":{"sampleCount":integer,"storageUri":string,"language":string,"seed":integer}}
Use the following parameters for the Imagen model `imagetext`. For more
information, see [Get image descriptions using visual captioning](/vertex-ai/generative-ai/docs/image/image-captioning).

| Parameter | Description | Values |
|---|---|---|
| `instances` | An array that contains the object with details about the image to get information for. | array (only one image object is allowed) |
| `bytesBase64Encoded` | The image to caption. | Base64-encoded image string (PNG or JPEG, 20 MB max) |
| `gcsUri` | The Cloud Storage URI of the image to caption. | String URI of the image file in Cloud Storage (PNG or JPEG, 20 MB max) |
| `mimeType` | Optional. The MIME type of the image that you specify. | string (`image/jpeg` or `image/png`) |
| `sampleCount` | The number of generated text strings. | Int value: 1-3 |
| `seed` | Optional. The seed for the random number generator (RNG). If the RNG seed is the same for requests with the same inputs, the prediction results are the same. | integer |
| `storageUri` | Optional. The Cloud Storage location to save the generated text responses to. | string |
| `language` | Optional. The language of the generated caption. | string: `en` (default), `de`, `fr`, `it`, `es` |
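Putting the table together, a request that asks for reproducible German captions and writes the generated text to Cloud Storage might carry a `parameters` object like this sketch (the bucket name is hypothetical):

```python
# Hypothetical parameters object; all field names come from the table above.
parameters = {
    "sampleCount": 3,  # generate three captions (allowed range: 1-3)
    "language": "de",  # German captions
    "seed": 42,        # same seed + same inputs => same predictions
    "storageUri": "gs://example-bucket/captions/",  # hypothetical bucket
}
```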
[[["Facile da capire","easyToUnderstand","thumb-up"],["Il problema è stato risolto","solvedMyProblem","thumb-up"],["Altra","otherUp","thumb-up"]],[["Difficile da capire","hardToUnderstand","thumb-down"],["Informazioni o codice di esempio errati","incorrectInformationOrSampleCode","thumb-down"],["Mancano le informazioni o gli esempi di cui ho bisogno","missingTheInformationSamplesINeed","thumb-down"],["Problema di traduzione","translationIssue","thumb-down"],["Altra","otherDown","thumb-down"]],["Ultimo aggiornamento 2025-09-04 UTC."],[],[],null,["# Image captions\n\n| **Caution:** Starting on June 24, 2025, Imagen versions 1 and 2 are deprecated. Imagen models `imagegeneration@002`, `imagegeneration@005`, and `imagegeneration@006` will be removed on September 24, 2025 . For more information about migrating to Imagen 3, see [Migrate to\n| Imagen 3](/vertex-ai/generative-ai/docs/image/migrate-to-imagen-3).\n\n\u003cbr /\u003e\n\n`imagetext` is the name of the model that supports image captioning. `imagetext`\ngenerates a caption from an image you provide based on the language that you\nspecify. The model supports the following languages: English (`en`), German\n(`de`), French (`fr`), Spanish (`es`) and Italian (`it`).\n\nTo explore this model in the console, see the `Image Captioning` model card in\nthe Model Garden.\n\n\n[View Imagen for Captioning \\& VQA model card](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/imagetext)\n\nUse cases\n---------\n\nSome common use cases for image captioning include:\n\n- Creators can generate captions for uploaded images and videos (for example, a short description of a video sequence)\n- Generate captions to describe products\n- Integrate captioning with an app using the API to create new experiences\n\nHTTP request\n------------\n\n POST https://us-central1-aiplatform.googleapis.com/v1/projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/locations/us-central1/publishers/google/models/imagetext:predict\n\nRequest body\n------------\n\n {\n \"instances\": [\n {\n \"image\": {\n // Union field can be only one of the following:\n \"bytesBase64Encoded\": string,\n \"gcsUri\": string,\n // End of list of possible types for union field.\n \"mimeType\": string\n }\n }\n ],\n \"parameters\": {\n \"sampleCount\": integer,\n \"storageUri\": string,\n \"language\": string,\n \"seed\": integer\n }\n }\n\nUse the following parameters for the Imagen model `imagetext`.\nFor more information, see\n[Get image descriptions using visual captioning](/vertex-ai/generative-ai/docs/image/image-captioning).\n\nSample request\n--------------\n\n### REST\n\nTo test a text prompt by using the Vertex AI API, send a POST request to the\npublisher model endpoint.\n\n\nBefore using any of the request data,\nmake the following replacements:\n\n- \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e: Your Google Cloud [project ID](/resource-manager/docs/creating-managing-projects#identifiers).\n- \u003cvar translate=\"no\"\u003eLOCATION\u003c/var\u003e: Your project's region. For example, `us-central1`, `europe-west2`, or `asia-northeast3`. For a list of available regions, see [Generative AI on Vertex AI locations](/vertex-ai/generative-ai/docs/learn/locations-genai).\n- \u003cvar translate=\"no\"\u003eB64_IMAGE\u003c/var\u003e: The image to get captions for. The image must be specified as a [base64-encoded](/vertex-ai/generative-ai/docs/image/base64-encode) byte string. 
Size limit: 10 MB.\n- \u003cvar translate=\"no\"\u003eRESPONSE_COUNT\u003c/var\u003e: The number of image captions you want to generate. Accepted integer values: 1-3.\n- \u003cvar translate=\"no\"\u003eLANGUAGE_CODE\u003c/var\u003e: One of the supported language codes. Languages supported:\n - English (`en`)\n - French (`fr`)\n - German (`de`)\n - Italian (`it`)\n - Spanish (`es`)\n\n\nHTTP method and URL:\n\n```\nPOST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict\n```\n\n\nRequest JSON body:\n\n```\n{\n \"instances\": [\n {\n \"image\": {\n \"bytesBase64Encoded\": \"B64_IMAGE\"\n }\n }\n ],\n \"parameters\": {\n \"sampleCount\": RESPONSE_COUNT,\n \"language\": \"LANGUAGE_CODE\"\n }\n}\n```\n\nTo send your request, choose one of these options: \n\n#### curl\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) , or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\ncurl -X POST \\\n -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n -H \"Content-Type: application/json; charset=utf-8\" \\\n -d @request.json \\\n \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict\"\n```\n\n#### PowerShell\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\n$cred = gcloud auth print-access-token\n$headers = @{ \"Authorization\" = \"Bearer $cred\" }\n\nInvoke-WebRequest `\n -Method POST `\n -Headers $headers `\n -ContentType: \"application/json; charset=utf-8\" `\n -InFile request.json `\n -Uri \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict\" | Select-Object -Expand Content\n```\nThe following sample responses are for a request with `\"sampleCount\": 2`. The response returns two prediction strings.\n\n**English (`en`):** \n\n```\n{\n \"predictions\": [\n \"a yellow mug with a sheep on it sits next to a slice of cake\",\n \"a cup of coffee with a heart shaped latte art next to a slice of cake\"\n ],\n \"deployedModelId\": \"DEPLOYED_MODEL_ID\",\n \"model\": \"projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID\",\n \"modelDisplayName\": \"MODEL_DISPLAYNAME\",\n \"modelVersionId\": \"1\"\n}\n```\n\n**Spanish (`es`):**\n\n```\n{\n \"predictions\": [\n \"una taza de café junto a un plato de pastel de chocolate\",\n \"una taza de café con una forma de corazón en la espuma\"\n ]\n}\n```\n\n\u003cbr /\u003e\n\nResponse body\n-------------\n\n {\n \"predictions\": [ string ]\n }\n\nSample response\n---------------\n\n {\n \"predictions\": [\n \"text1\",\n \"text2\"\n ]\n }"]]
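Because the response body is just a `predictions` array of strings, client-side handling is straightforward. A small helper sketch follows; the function name is my own, not part of the API.

```python
def extract_captions(response_json: dict) -> list[str]:
    """Return the caption strings from an imagetext predict response."""
    # The response body is {"predictions": [string, ...]}.
    return list(response_json.get("predictions", []))

# Mirrors the sample response above.
sample = {"predictions": ["text1", "text2"]}
assert extract_captions(sample) == ["text1", "text2"]
```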