Module vision_models (1.50.0)

Classes for working with vision models.

Classes

GeneratedImage

GeneratedImage(
    image_bytes: typing.Optional[bytes],
    generation_parameters: typing.Dict[str, typing.Any],
    gcs_uri: typing.Optional[str] = None,
)

Generated image.

Image

Image(
    image_bytes: typing.Optional[bytes] = None, gcs_uri: typing.Optional[str] = None
)

Image.

ImageCaptioningModel

ImageCaptioningModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates captions from an image.

Examples:

model = ImageCaptioningModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)

ImageGenerationModel

ImageGenerationModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates images from a text prompt.

Examples:

model = ImageGenerationModel.from_pretrained("imagegeneration@002")
response = model.generate_images(
    prompt="Astronaut riding a horse",
    # Optional:
    number_of_images=1,
    seed=0,
)
response[0].show()
response[0].save("image1.png")

ImageGenerationResponse

ImageGenerationResponse(images: typing.List[GeneratedImage])

Image generation response.

ImageQnAModel

ImageQnAModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Answers questions about an image.

Examples:

model = ImageQnAModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)

ImageTextModel

ImageTextModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates text from images.

Examples:

model = ImageTextModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")

captions = model.get_captions(
    image=image,
    # Optional:
    number_of_results=1,
    language="en",
)

answers = model.ask_question(
    image=image,
    question="What color is the car in this image?",
    # Optional:
    number_of_results=1,
)

MultiModalEmbeddingModel

MultiModalEmbeddingModel(model_id: str, endpoint_name: typing.Optional[str] = None)

Generates embedding vectors from images and videos.

Examples:

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
image = Image.load_from_file("image.png")
video = Video.load_from_file("video.mp4")

embeddings = model.get_embeddings(
    image=image,
    video=video,
    contextual_text="Hello world",
)
image_embedding = embeddings.image_embedding
video_embeddings = embeddings.video_embeddings
text_embedding = embeddings.text_embedding

MultiModalEmbeddingResponse

MultiModalEmbeddingResponse(
    _prediction_response: typing.Any,
    image_embedding: typing.Optional[typing.List[float]] = None,
    video_embeddings: typing.Optional[
        typing.List[vertexai.vision_models.VideoEmbedding]
    ] = None,
    text_embedding: typing.Optional[typing.List[float]] = None,
)

The multimodal embedding response.

Video

Video(
    video_bytes: typing.Optional[bytes] = None, gcs_uri: typing.Optional[str] = None
)

Video.

VideoEmbedding

VideoEmbedding(
    start_offset_sec: int, end_offset_sec: int, embedding: typing.List[float]
)

Embeddings generated from a video, with time offsets.

VideoSegmentConfig

VideoSegmentConfig(
    start_offset_sec: int = 0, end_offset_sec: int = 120, interval_sec: int = 16
)

The specific video segments (in seconds) for which embeddings are generated.

WatermarkVerificationModel

WatermarkVerificationModel(
    model_id: str, endpoint_name: typing.Optional[str] = None
)

Verifies whether an image contains a watermark.
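A usage sketch in the style of the other model classes on this page; the model id "imageverification@001" and the `verify_image` method are assumptions not confirmed by this page, and the call requires Google Cloud credentials:

```python
from vertexai.vision_models import Image, WatermarkVerificationModel

# Model id is an assumption; check the current Vertex AI model list.
model = WatermarkVerificationModel.from_pretrained("imageverification@001")
image = Image.load_from_file("image.png")

# verify_image is assumed to return a WatermarkVerificationResponse whose
# watermark_verification_result indicates whether a watermark was found.
response = model.verify_image(image)
print(response.watermark_verification_result)
```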

WatermarkVerificationResponse

WatermarkVerificationResponse(
    _prediction_response: Any, watermark_verification_result: Optional[str] = None
)

Watermark verification response.