KRM API prediction.aiplatform.gdc.goog/v1

prediction.aiplatform.gdc.goog/v1

Package v1 contains API Schema definitions for the prediction.aiplatform.gdc.goog v1 API group.

Autoscaling

Defines the autoscaling parameters for a deployment.

Appears in:

DedicatedResources

Field	Description
`minReplica` integer	Minimum number of replicas. The default value is `1`.
`maxReplica` integer	Maximum number of replicas.
`cpuTarget` integer	The threshold of CPU usage for scaling up a pod.
`gpuDutyCycleTarget` integer	The threshold of GPU duty-cycle utilization for scaling up a pod.
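
As an illustration, the fields above could be assembled and sanity-checked before they are placed in a manifest. This is a minimal Python sketch under stated assumptions: the helper name `make_autoscaling` is hypothetical, the check that `minReplica` must not exceed `maxReplica` is an assumption rather than documented API behavior, and the two targets are treated as optional and passed through unchanged because the table does not state their units.

```python
def make_autoscaling(max_replica, min_replica=1, cpu_target=None,
                     gpu_duty_cycle_target=None):
    """Build an Autoscaling block as a plain dict (hypothetical helper)."""
    # Assumption: the minimum must not exceed the maximum.
    if min_replica > max_replica:
        raise ValueError("minReplica must not exceed maxReplica")
    # minReplica falls back to the documented default of 1.
    block = {"minReplica": min_replica, "maxReplica": max_replica}
    # Assumption: targets are optional, so only the ones provided are emitted.
    if cpu_target is not None:
        block["cpuTarget"] = cpu_target
    if gpu_duty_cycle_target is not None:
        block["gpuDutyCycleTarget"] = gpu_duty_cycle_target
    return block


autoscaling = make_autoscaling(max_replica=3, cpu_target=60)
```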

DedicatedResources

Defines the resources that are dedicated to a resource pool.

Appears in:

ResourcePoolSpec

Field	Description
`machineSpec`	Not required. Specifies the configuration of a single machine using the `machineType` value. If not provided, a default value is used. For Prediction, the default machine type for a deployment is `n1-standard-2` (GKE-based models) or `n2-standard-2-gdc` (CPU-based models).
`autoscaling` Autoscaling	Specifies the autoscaling parameters for the user workloads, for example, the predictor deployment for prediction.

DeployedModel

Defines the Schema for the DeployedModels API.

Appears in:

DeployedModelList

Field	Description
`apiVersion` string	`prediction.aiplatform.gdc.goog/v1`
`kind` string	`DeployedModel`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` DeployedModelSpec
`status` DeployedModelStatus

DeployedModelList

Contains a list of DeployedModel resources.

Field	Description
`apiVersion` string	`prediction.aiplatform.gdc.goog/v1`
`kind` string	`DeployedModelList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` DeployedModel array

DeployedModelSpec

Defines the expected state of DeployedModel resources.

Appears in:

DeployedModel

Field	Description
`endpointPath` string	Specifies the resource name of the endpoint. The format is `projects/{project}/locations/{location}/endpoints/{endpoint-id}`.
`modelSpec`	Defines the model specification needed when deploying the model.
`resourcePoolRef` ObjectReference	Specifies the reference of the resource pool with the resource specifications required for this `DeployedModel`.
`sharesResourcePool` boolean	Specifies if the `DeployedModel` shares a resource pool with other models.
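
The `endpointPath` format and the overall shape of a `DeployedModel` object can be sketched as follows. This is an illustrative Python example, not a definitive manifest: every name and ID (`example-model`, `example-project`, `example-location`, `example-pool`, `1234`) is a placeholder, the helper `parse_endpoint_path` is hypothetical, and `resourcePoolRef` is assumed to accept the `name` and `namespace` fields of a Kubernetes `ObjectReference`. The `modelSpec` field is left out because its structure is not described on this page.

```python
import re

# Pattern for projects/{project}/locations/{location}/endpoints/{endpoint-id}
ENDPOINT_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/endpoints/(?P<endpoint_id>[^/]+)$"
)


def parse_endpoint_path(path):
    """Split an endpointPath into its three components (hypothetical helper)."""
    match = ENDPOINT_PATH.match(path)
    if match is None:
        raise ValueError("unexpected endpointPath format: " + path)
    return match.groupdict()


# Placeholder object; modelSpec is intentionally omitted.
deployed_model = {
    "apiVersion": "prediction.aiplatform.gdc.goog/v1",
    "kind": "DeployedModel",
    "metadata": {"name": "example-model", "namespace": "example-project"},
    "spec": {
        "endpointPath": "projects/example-project/locations/example-location/endpoints/1234",
        "resourcePoolRef": {"name": "example-pool", "namespace": "example-project"},
        "sharesResourcePool": False,
    },
}

parts = parse_endpoint_path(deployed_model["spec"]["endpointPath"])
```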

DeployedModelStatus

Defines the observed state of the DeployedModel resource.

Appears in:

DeployedModel

Field	Description
`ready` boolean	Indicates whether the resource is in a ready state.
`primaryCondition`	Represents the primary condition of a resource. If the resource is ready, then the condition indicates the resource is ready. Otherwise, the condition is the primary reason why the resource is not ready.
`resourceConditions` array	Represents a collection of conditions for a resource and its subresources. You can use it to determine the overall health of a resource and its subresources.
`conditions` Condition array	Represents raw resource conditions populated from Kubernetes resources for debugging purposes.
`routes` Routes	Represents the container or system routes for the deployed model prediction or health check.
`ports` Ports	Represents the container HTTP or gRPC ports.
`rpcStatus` RpcStatus	Indicates a canonical RPC representation of the deployed model's primary condition.
`observedGeneration` integer	Indicates the revision of the resource that was most recently reconciled.
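
A client that waits for a deployment might combine `ready` with `observedGeneration`. The Python sketch below assumes the common Kubernetes convention that `observedGeneration` is compared against `metadata.generation`; this page does not confirm that convention for this API, and the helper name `is_ready_and_current` is hypothetical.

```python
def is_ready_and_current(resource):
    """Return True if the resource reports ready for its latest generation."""
    status = resource.get("status", {})
    generation = resource.get("metadata", {}).get("generation")
    observed = status.get("observedGeneration")
    # Assumption: a status is stale until the controller has observed
    # the generation currently stored in metadata.
    up_to_date = generation is not None and observed == generation
    return bool(status.get("ready")) and up_to_date


example = {
    "metadata": {"generation": 2},
    "status": {"ready": True, "observedGeneration": 1},
}
stale_result = is_ready_and_current(example)
```

In this example the status still describes generation 1, so the helper reports the resource as not ready even though `ready` is `true`.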

Ports

Appears in:

DeployedModelStatus

Field	Description
`predictorPorts` integer array	Lists the HTTP ports to expose from the predictor. Requests aren't forwarded to ports other than the first one listed. This field corresponds to the `ports` field of the Kubernetes Containers v1 core API.
`predictorGRPCPorts` integer array	Lists the gRPC ports to expose from the predictor. If this field is omitted, then the gRPC requests to the container are disabled. Requests aren't forwarded to ports other than the first one listed. This field corresponds to the `ports` field of the Kubernetes Containers v1 core API.
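
Because requests are forwarded only to the first listed port, a client that reads this status needs just the first element of each array. The following Python sketch shows one way to extract them. The helper name `serving_ports` is hypothetical, and returning `None` for a missing list is a choice made for this example.

```python
def serving_ports(ports):
    """Pick the port that actually receives traffic for each protocol."""
    http_ports = ports.get("predictorPorts") or []
    grpc_ports = ports.get("predictorGRPCPorts") or []
    return {
        # Only the first listed port receives forwarded requests.
        "http": http_ports[0] if http_ports else None,
        # An omitted gRPC list means gRPC requests are disabled.
        "grpc": grpc_ports[0] if grpc_ports else None,
    }


selected = serving_ports({"predictorPorts": [8080, 8081]})
```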

ReplicaStatus

Defines the replica information of the ResourcePool resource.

Appears in:

ResourcePoolStatus

Field	Description
`resourceType` ResourceType	Specifies the type of the resource.
`resourceName` string	Indicates the unique resource name in the namespace of the resource type. For example, a `Deployment` `ResourceType` has the Kubernetes deployment name as its resource name.
`resourceNamespace` string	Indicates the namespace of the resource. This field is not applicable for cluster-scoped resources.
`appType` string	Indicates the application type of the resource, defined by a specific product. For example, Prediction supports `predictor` and `explainer` as `appType`.
`availableReplicas` integer	Indicates the total number of available replicas. For the Kubernetes deployment resource type, this field represents the total number of available pods (ready for at least `minReadySeconds`) targeted by the deployment.
`unavailableReplicas` integer	Indicates the total number of unavailable replicas. For the Kubernetes deployment resource type, this field represents the total number of unavailable pods targeted by the deployment.
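
The replica counts can be aggregated per application type to get a quick view of pool health. The Python sketch below is illustrative only: the helper name `summarize_replicas` and the sample resource names are hypothetical, and treating a missing count as `0` is an assumption.

```python
def summarize_replicas(replica_statuses):
    """Sum available and unavailable replicas per appType."""
    summary = {}
    for entry in replica_statuses:
        app = entry.get("appType", "unknown")
        totals = summary.setdefault(app, {"available": 0, "unavailable": 0})
        # Assumption: an absent count means zero replicas.
        totals["available"] += entry.get("availableReplicas", 0)
        totals["unavailable"] += entry.get("unavailableReplicas", 0)
    return summary


sample = [
    {"resourceName": "predictor-a", "appType": "predictor",
     "availableReplicas": 2, "unavailableReplicas": 1},
    {"resourceName": "predictor-b", "appType": "predictor",
     "availableReplicas": 1},
    {"resourceName": "explainer-a", "appType": "explainer",
     "unavailableReplicas": 1},
]
totals_by_app = summarize_replicas(sample)
```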

ResourcePool

Defines the Schema for the ResourcePools API.

Appears in:

ResourcePoolList

Field	Description
`apiVersion` string	`prediction.aiplatform.gdc.goog/v1`
`kind` string	`ResourcePool`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` ResourcePoolSpec
`status` ResourcePoolStatus

ResourcePoolList

Contains a list of ResourcePool resources.

Field	Description
`apiVersion` string	`prediction.aiplatform.gdc.goog/v1`
`kind` string	`ResourcePoolList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` ResourcePool array

ResourcePoolSpec

Defines the expected state of ResourcePool resources.

Appears in:

ResourcePool

Field	Description
`resourcePoolID` string	Represents the system-generated ID of the `ResourcePool` resource. This field is only applicable for Google Cloud and GDCE.
`userProvidedID` string	Represents the user-provided ID of the `ResourcePool` resource. This field is only applicable for Google Cloud and GDCE.
`dedicatedResources` DedicatedResources	Contains a description of resources that are dedicated to the resource pool.
`enableContainerLogging` boolean	Indicates whether or not container logging is enabled for the `ResourcePool`.
`userGsa` string	Indicates the user-provided IAM service account in the user project. If not specified, the default serving service account is used.
`customKsaName` string	Specifies the custom name that the operator creates and the user workload uses for bring your own service account (BYOSA) cases only. If empty, the system uses the default naming pattern.
`rolloutStrategy`	Specifies whether models deployed to this pool must be rolled out to the model server replicas gradually or all at once.
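
To show how the spec fields nest, the following Python sketch builds a `ResourcePool` object as a dictionary and serializes it to JSON, a format Kubernetes accepts for manifests. Treat it as an example under assumptions rather than a validated manifest: the metadata names are placeholders, the `machineSpec.machineType` nesting is inferred from the `machineSpec` description, and `rolloutStrategy` is omitted because its structure is not described on this page.

```python
import json

resource_pool = {
    "apiVersion": "prediction.aiplatform.gdc.goog/v1",
    "kind": "ResourcePool",
    # Placeholder name and namespace.
    "metadata": {"name": "example-pool", "namespace": "example-project"},
    "spec": {
        "dedicatedResources": {
            # Assumption: machineType is a direct child of machineSpec.
            "machineSpec": {"machineType": "n2-standard-2-gdc"},
            "autoscaling": {"minReplica": 1, "maxReplica": 2},
        },
        "enableContainerLogging": True,
    },
}

# Serialize to a JSON manifest that could be saved to a file.
manifest_json = json.dumps(resource_pool, indent=2)
```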

ResourcePoolStatus

Defines the observed state of ResourcePool resources.

Appears in:

ResourcePool

Field	Description
`ready` boolean	Indicates whether the resource is in a ready state.
`primaryCondition`	Represents the primary condition of a resource. If the resource is ready, then the condition indicates the resource is ready. Otherwise, the condition is the primary reason why the resource is not ready.
`resourceConditions`	Represents a collection of conditions for a resource and its subresources. You can use it to determine the overall health of a resource and its subresources.
`conditions` Condition array	Represents raw resource conditions populated from Kubernetes resources for debugging purposes.
`replicaStatuses` ReplicaStatus array
`rpcStatus` RpcStatus	Indicates a canonical RPC representation of the ResourcePool's primary condition.
`observedGeneration` integer	Indicates the revision of the resource that was most recently reconciled.

Routes

Appears in:

DeployedModelStatus

Field	Description
`predictRoute` string	Represents the routing path on the container to send prediction requests. Vertex AI forwards requests using `projects.locations.endpoints.predict` to this path on the container's IP address and port. Vertex AI then returns the container's response in the API response.
`predictSystemRoute` string	Represents the system routing path to send prediction requests to the cluster ingress. This field is populated internally only when it is copied to the `deployedModel` during deployment.
`healthRoute` string	Represents the routing path on the container to send health checks. Vertex AI intermittently sends GET requests to this path on the container's IP address and port to check that the container is healthy.
`healthSystemRoute` string	Represents the system routing path to send health check requests to the cluster ingress. This field is populated internally only when it is copied to the `deployedModel` during deployment.

RpcStatus

Encapsulates an RPC code and a message.

Appears in:

DeployedModelStatus, ResourcePoolStatus

Field	Description
`code` Code	Represents the RPC code.
`message` string	Contains a user-facing description of the condition.
`terminalState` boolean	Indicates a value of `true` if the resource has reached a terminal state and it cannot become ready.
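
The `terminalState` flag lets a polling client decide whether waiting longer can help. The Python sketch below shows one possible decision rule. The helper name `next_action` and its three return values are hypothetical, and the sample message is invented for the example.

```python
def next_action(status):
    """Decide what a polling client should do with a resource status."""
    if status.get("ready"):
        return "ready"
    rpc_status = status.get("rpcStatus") or {}
    # A terminal state means the resource cannot become ready,
    # so further polling is pointless.
    if rpc_status.get("terminalState"):
        return "failed: " + rpc_status.get("message", "no message")
    return "wait"


action = next_action({
    "ready": False,
    "rpcStatus": {"message": "example failure", "terminalState": True},
})
```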