Class ImageTextModel (1.33.0)

ImageTextModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates text from images.

Examples:

from vertexai.vision_models import Image, ImageTextModel

model = ImageTextModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")

captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)

answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)

Methods

ImageTextModel

ImageTextModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Creates a _ModelGardenModel.

This constructor should not be called directly. Use ImageTextModel.from_pretrained(model_name=...) instead.

ask_question

ask_question(
    image: vertexai.vision_models.Image, question: str, *, number_of_results: int = 1
) -> typing.List[str]

Answers questions about an image.
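
As a minimal sketch of how ask_question might be wrapped in practice, the hypothetical helper below (ask_many is an illustrative name, not part of the SDK) sends several questions about one image and collects the answers per question. It assumes only the signature documented above, and it accepts any object exposing a compatible ask_question method, so it can be exercised without a live endpoint.

```python
from typing import Any, Dict, Iterable, List


def ask_many(
    model: Any,
    image: Any,
    questions: Iterable[str],
    number_of_results: int = 1,
) -> Dict[str, List[str]]:
    """Hypothetical helper: ask several questions about one image.

    `model` is assumed to expose ask_question(image, question, *,
    number_of_results) as documented for ImageTextModel.
    """
    answers: Dict[str, List[str]] = {}
    for question in questions:
        # number_of_results is keyword-only in the documented signature.
        answers[question] = list(
            model.ask_question(
                image=image,
                question=question,
                number_of_results=number_of_results,
            )
        )
    return answers
```

With a real model this could be called as ask_many(model, image, ["What color is the car in this image?"]), where model and image are created as in the class example.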

from_pretrained

from_pretrained(model_name: str) -> vertexai._model_garden._model_garden_models.T

Loads a _ModelGardenModel.

Exceptions
Type	Description
`ValueError`	If model_name is unknown.
`ValueError`	If model does not support this class.
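
Because from_pretrained raises ValueError both for an unknown model name and for a model that does not support the class, callers may want to handle that case explicitly. The sketch below is a hypothetical wrapper (load_or_none is an illustrative name, not an SDK function). It takes the class as an argument, so it relies on nothing beyond the documented from_pretrained(model_name=...) call.

```python
from typing import Any, Optional


def load_or_none(model_class: Any, model_name: str) -> Optional[Any]:
    """Hypothetical helper: return a loaded model, or None on ValueError."""
    try:
        return model_class.from_pretrained(model_name=model_name)
    except ValueError as err:
        # Documented cases: unknown model_name, or a model that does not
        # support the requested class.
        print(f"Could not load {model_name!r}: {err}")
        return None
```

A call such as load_or_none(ImageTextModel, "imagetext@001") would then yield either a model or None, leaving the caller to decide on a fallback.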

get_captions

get_captions(
    image: vertexai.vision_models.Image,
    *,
    number_of_results: int = 1,
    language: str = "en"
) -> typing.List[str]

Generates captions for a given image.
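
The language parameter can be varied to request captions in more than one language. The following is a minimal sketch under stated assumptions: captions_by_language is a hypothetical helper, and this page does not list which language codes the service accepts, so the caller is expected to supply valid ones.

```python
from typing import Any, Dict, List, Sequence


def captions_by_language(
    model: Any,
    image: Any,
    languages: Sequence[str] = ("en",),
    number_of_results: int = 1,
) -> Dict[str, List[str]]:
    """Hypothetical helper: collect captions for one image per language.

    `model` is assumed to expose get_captions(image, *, number_of_results,
    language) as documented for ImageTextModel.
    """
    captions: Dict[str, List[str]] = {}
    for language in languages:
        # Both optional arguments are keyword-only in the documented signature.
        captions[language] = list(
            model.get_captions(
                image=image,
                number_of_results=number_of_results,
                language=language,
            )
        )
    return captions
```

The result maps each requested language code to the list of captions returned for it, which keeps the per-language output separate when number_of_results is greater than 1.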

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-06-20 UTC.
