Class LocalEndpoint (1.36.3)

LocalEndpoint(
    serving_container_image_uri: str,
    artifact_uri: typing.Optional[str] = None,
    serving_container_predict_route: typing.Optional[str] = None,
    serving_container_health_route: typing.Optional[str] = None,
    serving_container_command: typing.Optional[typing.Sequence[str]] = None,
    serving_container_args: typing.Optional[typing.Sequence[str]] = None,
    serving_container_environment_variables: typing.Optional[
        typing.Dict[str, str]
    ] = None,
    serving_container_ports: typing.Optional[typing.Sequence[int]] = None,
    credential_path: typing.Optional[str] = None,
    host_port: typing.Optional[str] = None,
    gpu_count: typing.Optional[int] = None,
    gpu_device_ids: typing.Optional[typing.List[str]] = None,
    gpu_capabilities: typing.Optional[typing.List[typing.List[str]]] = None,
    container_ready_timeout: typing.Optional[int] = None,
    container_ready_check_interval: typing.Optional[int] = None,
)

Class that represents a local endpoint.

Methods

LocalEndpoint

LocalEndpoint(
    serving_container_image_uri: str,
    artifact_uri: typing.Optional[str] = None,
    serving_container_predict_route: typing.Optional[str] = None,
    serving_container_health_route: typing.Optional[str] = None,
    serving_container_command: typing.Optional[typing.Sequence[str]] = None,
    serving_container_args: typing.Optional[typing.Sequence[str]] = None,
    serving_container_environment_variables: typing.Optional[
        typing.Dict[str, str]
    ] = None,
    serving_container_ports: typing.Optional[typing.Sequence[int]] = None,
    credential_path: typing.Optional[str] = None,
    host_port: typing.Optional[str] = None,
    gpu_count: typing.Optional[int] = None,
    gpu_device_ids: typing.Optional[typing.List[str]] = None,
    gpu_capabilities: typing.Optional[typing.List[typing.List[str]]] = None,
    container_ready_timeout: typing.Optional[int] = None,
    container_ready_check_interval: typing.Optional[int] = None,
)

Creates a local endpoint instance.

Parameters
Name	Description
`serving_container_image_uri`	`str` Required. The URI of the Model serving container.
`artifact_uri`	`str` Optional. The path to the directory containing the Model artifact and any of its supporting files. The path is either a GCS URI or the path to a local directory. If this parameter is set to a GCS URI: (1) `credential_path` must be specified for local prediction. (2) The GCS URI will be passed directly to `Predictor.load`. If this parameter is a local directory: (1) The directory will be mounted to a default temporary model path. (2) The mounted path will be passed to `Predictor.load`.
`serving_container_predict_route`	`str` Optional. The HTTP path on the container to which prediction requests are sent. The container must support this path. If not specified, Vertex AI uses a default HTTP path.
`serving_container_health_route`	`str` Optional. The HTTP path on the container to which health check requests are sent. The container must support this path. If not specified, Vertex AI uses a standard HTTP path.
`serving_container_command`	`Sequence[str]` Optional. The command with which the container is run. Not executed within a shell. The Docker image's ENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container's environment. If a variable cannot be resolved, the reference in the input string will be unchanged. The $(VAR_NAME) syntax can be escaped with a double $$, i.e., $$(VAR_NAME). Escaped references will never be expanded, regardless of whether the variable exists or not.
`serving_container_args`	`Sequence[str]` Optional. The arguments to the command. The Docker image's CMD is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container's environment. If a variable cannot be resolved, the reference in the input string will be unchanged. The $(VAR_NAME) syntax can be escaped with a double $$, i.e., $$(VAR_NAME). Escaped references will never be expanded, regardless of whether the variable exists or not.
`serving_container_environment_variables`	`Dict[str, str]` Optional. The environment variables that are to be present in the container. Should be a dictionary where keys are environment variable names and values are environment variable values for those names.
`serving_container_ports`	`Sequence[int]` Optional. The ports that the container exposes. This field is primarily informational: it tells Vertex AI which network connections the container uses. Whether or not a port is listed here has no effect on whether it is actually exposed; any port listening on the default "0.0.0.0" address inside a container is accessible from the network.
`credential_path`	`str` Optional. The path to the credential key that will be mounted to the container. If unset, the `GOOGLE_APPLICATION_CREDENTIALS` environment variable is used, if it is set.
`host_port`	`str` Optional. The port on the host to which the container port given by `AIP_HTTP_PORT` is mapped. If unset, a random host port is assigned.
`gpu_count`	`int` Optional. Number of devices to request. Set to -1 to request all available devices. To use GPU, set either `gpu_count` or `gpu_device_ids`. The default value is -1 if `gpu_capabilities` is set but neither `gpu_count` nor `gpu_device_ids` is set.
`gpu_device_ids`	`List[str]` Optional. This parameter corresponds to `NVIDIA_VISIBLE_DEVICES` in the NVIDIA Runtime. To use GPU, set either `gpu_count` or `gpu_device_ids`.
`gpu_capabilities`	`List[List[str]]` Optional. This parameter corresponds to `NVIDIA_DRIVER_CAPABILITIES` in the NVIDIA Runtime. The outer list acts like an OR, and each sub-list acts like an AND. The driver will try to satisfy one of the sub-lists. Available capabilities for the NVIDIA driver can be found in https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#driver-capabilities. The default value is `[["utility", "compute"]]` if `gpu_count` or `gpu_device_ids` is set.
`container_ready_timeout`	`int` Optional. The timeout, in seconds, for the container to start or for the first health check to succeed.
`container_ready_check_interval`	`int` Optional. The interval, in seconds, between checks of whether the container is ready or the first health check has succeeded.
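
The `$(VAR_NAME)` rule described for `serving_container_command` and `serving_container_args` can be illustrated with a small sketch. This is not the SDK's implementation: `expand_refs` is a hypothetical helper, and rendering an escaped `$$(VAR_NAME)` as the literal text `$(VAR_NAME)` is an assumption borrowed from the Kubernetes convention, since the description above only says that escaped references are never expanded.

```python
import re

# Matches "$$(NAME)" (escaped) or "$(NAME)" (a reference).
_REF = re.compile(r"\$(\$?)\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_refs(text, env):
    """Hypothetical helper: expand $(VAR_NAME) references using `env`."""

    def _substitute(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # Assumption: "$$(NAME)" becomes the literal "$(NAME)".
            return "$(" + name + ")"
        if name in env:
            return env[name]
        # Unresolvable references are left unchanged.
        return match.group(0)

    return _REF.sub(_substitute, text)
```

For example, with `env = {"PORT": "8080"}`, the string `--port=$(PORT)` expands to `--port=8080`, while `$(MISSING)` is returned as-is.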

Exceptions
Type	Description
`ValueError`	If both `gpu_count` and `gpu_device_ids` are set.
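
The interaction between the three GPU parameters can be summarized in a short sketch. `resolve_gpu_options` is a hypothetical helper written only from the parameter descriptions above; the SDK's own checks may differ in detail (for example, in how empty values are treated).

```python
def resolve_gpu_options(gpu_count=None, gpu_device_ids=None, gpu_capabilities=None):
    """Hypothetical helper mirroring the documented GPU defaults."""
    if gpu_count is not None and gpu_device_ids is not None:
        # Documented: setting both raises ValueError.
        raise ValueError("Set gpu_count or gpu_device_ids, not both.")

    wants_devices = gpu_count is not None or gpu_device_ids is not None
    if wants_devices and gpu_capabilities is None:
        # Documented default when devices are requested.
        gpu_capabilities = [["utility", "compute"]]
    elif gpu_capabilities is not None and not wants_devices:
        # Documented default: request all available devices.
        gpu_count = -1

    return gpu_count, gpu_device_ids, gpu_capabilities
```

Under these assumptions, passing only `gpu_capabilities` yields a `gpu_count` of -1, and passing neither of the three leaves all of them unset (no GPU).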

__del__

__del__()

Stops the container when the instance is about to be destroyed.

__enter__

__enter__()

Enters the runtime context related to this object.

__exit__

__exit__(exc_type, exc_value, exc_traceback)

Exits the runtime context related to this object.
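
Because the class defines `__enter__` and `__exit__`, an instance can be used in a `with` statement. The sketch below assumes that entering the context leaves the container serving and that exiting stops it; this page does not state that explicitly. `health_status_code` is a hypothetical helper, and `endpoint` is any object with the interface documented here.

```python
def health_status_code(endpoint):
    """Hypothetical helper: run one health check inside the endpoint's context."""
    # Assumption: the container is serving while inside the `with` block.
    with endpoint:
        response = endpoint.run_health_check(verbose=False)
        # `run_health_check` is documented to return a requests Response.
        return response.status_code
```

The `with` statement guarantees that `__exit__` runs even if the health check raises, so the cleanup does not depend on the caller remembering to call `stop()`.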

get_container_status

get_container_status() -> str

Gets the container status.

predict

predict(
    request: typing.Optional[typing.Any] = None,
    request_file: typing.Optional[str] = None,
    headers: typing.Optional[typing.Dict] = None,
    verbose: bool = True,
) -> requests.models.Response

Executes a prediction.

Parameters
Name	Description
`request`	`Any` Optional. The request sent to the container.
`request_file`	`str` Optional. The path to a request file sent to the container.
`headers`	`Dict` Optional. The headers in the prediction request.
`verbose`	`bool` Optional. Whether to print logs, if any. Defaults to `True`.

Exceptions
Type	Description
`RuntimeError`	If the local endpoint has been stopped.
`ValueError`	If both `request` and `request_file` are specified, if neither `request` nor `request_file` is provided, or if `request_file` is specified but does not exist.
`requests.exceptions.RequestException`	If the request fails with an exception.
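
The three `ValueError` conditions amount to "exactly one of `request` and `request_file`, and the file must exist". A minimal sketch of that rule, using a hypothetical `check_predict_args` helper rather than the SDK's own code:

```python
import os


def check_predict_args(request=None, request_file=None):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    if request is not None and request_file is not None:
        raise ValueError("Specify only one of request and request_file.")
    if request is None and request_file is None:
        raise ValueError("Specify one of request and request_file.")
    if request_file is not None and not os.path.exists(request_file):
        raise ValueError("request_file does not exist: " + request_file)
```

Running a check like this before calling `predict` lets a caller report a bad input path without first starting a container.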

print_container_logs

print_container_logs(
    show_all: bool = False, message: typing.Optional[str] = None
) -> None

Prints container logs.

Parameters
Name	Description
`show_all`	`bool` Optional. If True, prints all logs since the container started. Defaults to `False`.
`message`	`str` Optional. The message to be printed before printing the logs.

print_container_logs_if_container_is_not_running

print_container_logs_if_container_is_not_running(
    show_all: bool = False, message: typing.Optional[str] = None
) -> None

Prints container logs if the container is not in "running" status.

Parameters
Name	Description
`show_all`	`bool` Optional. If True, prints all logs since the container started. Defaults to `False`.
`message`	`str` Optional. The message to be printed before printing the logs.
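
Going by the description above, the method behaves roughly like a status check followed by `print_container_logs`. The sketch below spells that out with the two documented methods; `logs_if_down` is a hypothetical helper, and the default text of `note` is an illustration only.

```python
def logs_if_down(endpoint, note="Container is not running; full logs follow:"):
    """Hypothetical helper: print all logs only when the container is not running."""
    status = endpoint.get_container_status()
    if status != "running":
        endpoint.print_container_logs(show_all=True, message=note)
    # Returning the status lets the caller decide what to do next.
    return status
```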

run_health_check

run_health_check(verbose: bool = True) -> requests.models.Response

Runs a health check.

Parameters
Name	Description
`verbose`	`bool` Optional. Whether to print logs, if any. Defaults to `True`.

Exceptions
Type	Description
`RuntimeError`	If the local endpoint has been stopped.
`requests.exceptions.RequestException`	If the request fails with an exception.

serve

serve()

Starts running the container and serves the traffic locally.

The environment variable GOOGLE_CLOUD_PROJECT is set to the project in the global config. The Cloud Storage client uses it to identify the project, which is required when the credentials file does not specify one.

Exceptions
Type	Description
`DockerError`	If the container is not ready or health checks do not succeed after the timeout.
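
The constructor parameters `container_ready_timeout` and `container_ready_check_interval` suggest a polling loop: check readiness, wait one interval, and give up once the timeout has elapsed. The sketch below shows that general pattern only and is not the SDK's code. `wait_until_ready` and its `is_ready`, `sleep`, and `clock` arguments are hypothetical, and it returns `False` on timeout where `serve` is documented to raise `DockerError`.

```python
import time


def wait_until_ready(is_ready, timeout, interval, sleep=time.sleep, clock=time.monotonic):
    """Hypothetical helper: poll `is_ready` every `interval` seconds until `timeout`."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            # Timed out without a successful check.
            return False
        sleep(interval)
```

Passing `sleep` and `clock` as arguments is a design choice for this sketch: it allows the loop to be exercised with a fake clock instead of real waiting.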

stop

stop() -> None

Explicitly stops the container.
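
Putting `serve`, `predict`, and `stop` together, an explicit lifecycle might look like the sketch below. `predict_once` is a hypothetical helper, and the JSON body with an `"instances"` key is an assumption about what the serving container accepts; adjust it to the container's actual request format.

```python
import json


def predict_once(endpoint, instances):
    """Hypothetical helper: start the container, send one request, then stop it."""
    endpoint.serve()
    try:
        return endpoint.predict(
            # Assumption: the container expects {"instances": [...]} as JSON.
            request=json.dumps({"instances": instances}),
            headers={"Content-Type": "application/json"},
            verbose=False,
        )
    finally:
        # Stop the container even if the prediction raises.
        endpoint.stop()
```

Here `endpoint` would be a `LocalEndpoint` constructed with at least `serving_container_image_uri`. Calling this function against a real instance requires a working local Docker installation.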