KRM API prediction.aiplatform.gdc.goog/v1

prediction.aiplatform.gdc.goog/v1

Package v1 contains API Schema definitions for the prediction.aiplatform.gdc.goog v1 API group.

Autoscaling

Defines the autoscaling parameters for a deployment.

Appears in: DedicatedResources

Field Description
minReplica integer Minimum number of replicas. The default value is 1.
maxReplica integer Maximum number of replicas.
cpuTarget integer The threshold of CPU usage for scaling up a pod.
gpuDutyCycleTarget integer The threshold of GPU duty-cycle utilization for scaling up a pod.
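
As a rough illustration of how these fields fit together, the following Python sketch fills in the documented default for minReplica and applies a bounds check. The helper name normalize_autoscaling is hypothetical, and the rule that minReplica must not exceed maxReplica is an assumption that this reference does not state.

```python
def normalize_autoscaling(autoscaling: dict) -> dict:
    """Return a copy of an Autoscaling mapping with minReplica defaulted to 1."""
    result = dict(autoscaling)
    # Documented default: minReplica is 1 when it is not set.
    result.setdefault("minReplica", 1)
    # Assumption (not stated in the reference): the minimum must not exceed the maximum.
    max_replica = result.get("maxReplica")
    if max_replica is not None and result["minReplica"] > max_replica:
        raise ValueError("minReplica must not be greater than maxReplica")
    return result


# Example values are illustrative only.
print(normalize_autoscaling({"maxReplica": 3, "cpuTarget": 60}))
```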

DedicatedResources

Defines the resources that are dedicated to a resource pool.

Appears in: ResourcePoolSpec

Field Description
machineSpec Optional. Specifies the configuration of a single machine through its machineType value. If omitted, a default value is used. For Prediction, the default machine type for a deployment is n1-standard-2 (GKE-based models) or n2-standard-2-gdc (CPU-based models).
autoscaling Autoscaling Specifies the autoscaling parameters for the user workloads, for example, the predictor deployment for prediction.

DeployedModel

Defines the Schema for the DeployedModels API.

Appears in: DeployedModelList

Field Description
apiVersion string prediction.aiplatform.gdc.goog/v1
kind string DeployedModel
metadata ObjectMeta Refer to the Kubernetes API documentation for the fields of metadata.
spec DeployedModelSpec
status DeployedModelStatus

DeployedModelList

Contains a list of DeployedModel resources.

Field Description
apiVersion string prediction.aiplatform.gdc.goog/v1
kind string DeployedModelList
metadata ListMeta Refer to the Kubernetes API documentation for the fields of metadata.
items DeployedModel array

DeployedModelSpec

Defines the expected state of DeployedModel resources.

Appears in: DeployedModel

Field Description
endpointPath string Specifies the resource name of the endpoint. The format is projects/{project}/locations/{location}/endpoints/{endpoint-id}.
modelSpec Defines the model specification needed when deploying the model.
resourcePoolRef ObjectReference Specifies the reference of the resource pool with the resource specifications required for this DeployedModel.
sharesResourcePool boolean Specifies if the DeployedModel shares a resource pool with other models.
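
The following Python sketch shows one way to check an endpointPath against the documented format and to assemble a DeployedModel manifest as a plain dictionary. The helper names and all example values (my-model, my-project, my-pool, and so on) are hypothetical. The sketch assumes that path segments contain no slashes, sets only the name and namespace fields of the ObjectReference, and leaves out modelSpec because its type is not documented here.

```python
import json
import re

# Mirrors the documented format:
# projects/{project}/locations/{location}/endpoints/{endpoint-id}
ENDPOINT_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/endpoints/(?P<endpoint_id>[^/]+)$"
)


def parse_endpoint_path(path: str) -> dict:
    """Split an endpointPath into its project, location, and endpoint ID."""
    match = ENDPOINT_PATH.match(path)
    if match is None:
        raise ValueError(f"unexpected endpointPath format: {path!r}")
    return match.groupdict()


def deployed_model_manifest(name, namespace, endpoint_path, pool_name, shares_pool=False):
    """Build a DeployedModel manifest as a dict (modelSpec intentionally omitted)."""
    parse_endpoint_path(endpoint_path)  # fail early on a malformed path
    return {
        "apiVersion": "prediction.aiplatform.gdc.goog/v1",
        "kind": "DeployedModel",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "endpointPath": endpoint_path,
            # ObjectReference: only name and namespace are set in this sketch.
            "resourcePoolRef": {"name": pool_name, "namespace": namespace},
            "sharesResourcePool": shares_pool,
        },
    }


# Hypothetical example values.
manifest = deployed_model_manifest(
    "my-model",
    "my-project-ns",
    "projects/my-project/locations/my-location/endpoints/my-endpoint",
    "my-pool",
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied, once the omitted modelSpec is supplied according to the actual schema.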

DeployedModelStatus

Defines the observed state of the DeployedModel resource.

Appears in: DeployedModel

Field Description
ready boolean Indicates whether the resource is in a ready state.
primaryCondition Represents the primary condition of a resource. If the resource is ready, then the condition indicates the resource is ready. Otherwise, the condition is the primary reason why the resource is not ready.
resourceConditions array Represents a collection of conditions for a resource and its subresources. You can use it to determine the overall health of a resource and its subresources.
conditions Condition array Represents raw resource conditions populated from Kubernetes resources for debugging purposes.
routes Routes Represents the container or system routes for the deployed model prediction or health check.
ports Ports Represents the container HTTP or gRPC ports.
rpcStatus RpcStatus Indicates a canonical RPC representation of the deployed model's primary condition.
observedGeneration integer Indicates the revision of the resource that was most recently reconciled.
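
A client that reads this status usually wants to know whether the ready flag reflects the latest spec. The Python sketch below compares observedGeneration with metadata.generation before trusting ready. That comparison is the common Kubernetes convention rather than something this reference states, and the function name is hypothetical.

```python
def is_current_and_ready(resource: dict) -> bool:
    """Return True when the status reflects the latest spec and reports ready."""
    generation = resource.get("metadata", {}).get("generation")
    status = resource.get("status", {})
    # The status is stale until the controller has reconciled the latest generation.
    if generation is None or status.get("observedGeneration") != generation:
        return False
    return bool(status.get("ready", False))


# Illustrative resource: spec is at generation 3, status was reconciled at 2.
stale = {"metadata": {"generation": 3}, "status": {"observedGeneration": 2, "ready": True}}
print(is_current_and_ready(stale))
```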

Ports

Appears in: DeployedModelStatus

Field Description
predictorPorts integer array Lists the HTTP ports to expose from the predictor. Requests aren't forwarded to ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.
predictorGRPCPorts integer array Lists the gRPC ports to expose from the predictor. If this field is omitted, then the gRPC requests to the container are disabled. Requests aren't forwarded to ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.
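
The two rules above (only the first listed port receives requests, and an omitted predictorGRPCPorts field disables gRPC) can be sketched in Python as follows. The function name and the port numbers are hypothetical.

```python
def serving_ports(ports: dict) -> dict:
    """Return the HTTP and gRPC ports that actually receive forwarded requests."""
    http_ports = ports.get("predictorPorts") or []
    grpc_ports = ports.get("predictorGRPCPorts") or []
    return {
        # Only the first listed port receives requests.
        "http": http_ports[0] if http_ports else None,
        # An omitted predictorGRPCPorts field means gRPC is disabled (None here).
        "grpc": grpc_ports[0] if grpc_ports else None,
    }


# Illustrative port numbers only.
print(serving_ports({"predictorPorts": [8080, 9090]}))
```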

ReplicaStatus

Defines the replica information of the ResourcePool resource.

Appears in: ResourcePoolStatus

Field Description
resourceType ResourceType Specifies the type of the resource.
resourceName string Indicates the unique resource name in the namespace of the resource type. For example, a Deployment ResourceType has the Kubernetes deployment name as its resource name.
resourceNamespace string Indicates the namespace of the resource. This field is not applicable for cluster-scoped resources.
appType string Indicates the application type of the resource, defined by a specific product. For example, Prediction supports predictor and explainer as appType.
availableReplicas integer Indicates the total number of available replicas. For the Kubernetes deployment resource type, this field represents the total number of available pods (ready for at least minReadySeconds) targeted by this deployment.
unavailableReplicas integer Indicates the total number of unavailable replicas. For Kubernetes deployment resource type, it represents the total number of unavailable pods targeted by the deployment.
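
To get an overall picture from several ReplicaStatus entries, the replica counts can be summed per appType. The Python sketch below is one way to do that. The function name and sample entries are hypothetical, and treating a missing count as 0 is an assumption.

```python
from collections import defaultdict


def replicas_by_app_type(replica_statuses: list) -> dict:
    """Sum available and unavailable replicas for each appType."""
    totals = defaultdict(lambda: {"available": 0, "unavailable": 0})
    for entry in replica_statuses:
        bucket = totals[entry.get("appType", "")]
        # Assumption: a missing count is treated as 0.
        bucket["available"] += entry.get("availableReplicas", 0)
        bucket["unavailable"] += entry.get("unavailableReplicas", 0)
    return dict(totals)


# Hypothetical entries, shaped like the fields in the table above.
sample = [
    {"resourceName": "predictor-a", "appType": "predictor", "availableReplicas": 2, "unavailableReplicas": 1},
    {"resourceName": "predictor-b", "appType": "predictor", "availableReplicas": 1},
    {"resourceName": "explainer-a", "appType": "explainer", "availableReplicas": 1, "unavailableReplicas": 0},
]
print(replicas_by_app_type(sample))
```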

ResourcePool

Defines the Schema for the ResourcePools API.

Appears in: ResourcePoolList

Field Description
apiVersion string prediction.aiplatform.gdc.goog/v1
kind string ResourcePool
metadata ObjectMeta Refer to the Kubernetes API documentation for the fields of metadata.
spec ResourcePoolSpec
status ResourcePoolStatus

ResourcePoolList

Contains a list of ResourcePool resources.

Field Description
apiVersion string prediction.aiplatform.gdc.goog/v1
kind string ResourcePoolList
metadata ListMeta Refer to the Kubernetes API documentation for the fields of metadata.
items ResourcePool array

ResourcePoolSpec

Defines the expected state of ResourcePool resources.

Appears in: ResourcePool

Field Description
resourcePoolID string Represents the system-generated ID of the ResourcePool resource. This field is only applicable for Google Cloud and GDCE.
userProvidedID string Represents the user-provided ID of the ResourcePool resource. This field is only applicable for Google Cloud and GDCE.
dedicatedResources DedicatedResources Contains a description of resources that are dedicated to the resource pool.
enableContainerLogging boolean Indicates whether or not container logging is enabled for the ResourcePool.
userGsa string Indicates the user-provided IAM service account in the user project. If not specified, the default serving service account is used.
customKsaName string Specifies the custom name of the Kubernetes service account (KSA) that the operator creates and that the user workload uses. This field applies only to bring your own service account (BYOSA) cases. If empty, the system uses the default naming pattern.
rolloutStrategy Specifies whether models deployed to this pool must be rolled out to the model server replicas gradually or all at once.
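
As an illustration of how the spec fields nest, the Python sketch below assembles a minimal ResourcePool manifest as a dictionary. The helper name and example values are hypothetical. The shape of machineSpec (a single machineType key) is an assumption inferred from the DedicatedResources description. Fields whose types or values are not documented here (rolloutStrategy, userGsa, customKsaName, and the ID fields) are left out.

```python
import json


def resource_pool_manifest(name, namespace, machine_type=None,
                           min_replica=1, max_replica=1, logging=False):
    """Build a minimal ResourcePool manifest as a dict."""
    dedicated = {
        "autoscaling": {"minReplica": min_replica, "maxReplica": max_replica},
    }
    # machineSpec is optional; when omitted, the documented default machine type applies.
    if machine_type is not None:
        # Assumed shape: machineSpec carries a machineType value.
        dedicated["machineSpec"] = {"machineType": machine_type}
    return {
        "apiVersion": "prediction.aiplatform.gdc.goog/v1",
        "kind": "ResourcePool",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "dedicatedResources": dedicated,
            "enableContainerLogging": logging,
        },
    }


# Hypothetical names; the machine type is one of the defaults named in this reference.
pool = resource_pool_manifest("my-pool", "my-project-ns",
                              machine_type="n2-standard-2-gdc", max_replica=3)
print(json.dumps(pool, indent=2))
```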

ResourcePoolStatus

Defines the observed state of ResourcePool resources.

Appears in: ResourcePool

Field Description
ready boolean Indicates whether the resource is in a ready state.
primaryCondition Represents the primary condition of a resource. If the resource is ready, then the condition indicates the resource is ready. Otherwise, the condition is the primary reason why the resource is not ready.
resourceConditions Represents a collection of conditions for a resource and its subresources. You can use it to determine the overall health of a resource and its subresources.
conditions Condition array Represents raw resource conditions populated from Kubernetes resources for debugging purposes.
replicaStatuses ReplicaStatus array
rpcStatus RpcStatus Indicates a canonical RPC representation of the ResourcePool's primary condition.
observedGeneration integer Indicates the revision of the resource that was most recently reconciled.

Routes

Appears in: DeployedModelStatus

Field Description
predictRoute string Represents the routing path on the container to send prediction requests. Vertex AI forwards requests using projects.locations.endpoints.predict to this path on the container's IP address and port. Vertex AI then returns the container's response in the API response.
predictSystemRoute string Represents the system routing path to send prediction requests to the cluster ingress. This field is populated internally only when it is copied to the deployedModel during deployment.
healthRoute string Represents the routing path on the container to send health checks. Vertex AI intermittently sends GET requests to this path on the container's IP address and port to check that the container is healthy.
healthSystemRoute string Represents the system routing path to send health check requests to the cluster ingress. This field is populated internally only when it is copied to the deployedModel during deployment.
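
The container routes are paths that are combined with the container's IP address and port. The Python sketch below shows that combination. The function name, IP address, port, and route values are hypothetical, and the plain http scheme is an assumption.

```python
def container_url(ip: str, port: int, route: str) -> str:
    """Join a container address with a predictRoute or healthRoute path."""
    # Tolerate routes written without a leading slash.
    path = route if route.startswith("/") else "/" + route
    # Assumption: plain HTTP inside the cluster.
    return f"http://{ip}:{port}{path}"


# Illustrative values only.
print(container_url("10.0.0.5", 8080, "/predict"))
print(container_url("10.0.0.5", 8080, "health"))
```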

RpcStatus

Encapsulates an RPC code and a message.

Appears in: DeployedModelStatus, ResourcePoolStatus

Field Description
code Code Represents the RPC code.
message string Contains a user-facing description of the condition.
terminalState boolean Set to true if the resource has reached a terminal state and can't become ready.
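
When polling a resource, terminalState helps distinguish a resource that is still converging from one that will never become ready. The Python sketch below shows one possible decision rule over a status mapping. The function name and return strings are hypothetical, and the sample message is invented for illustration.

```python
def next_action(status: dict) -> str:
    """Decide whether to stop or keep polling, based on ready and rpcStatus."""
    if status.get("ready"):
        return "ready"
    rpc = status.get("rpcStatus") or {}
    if rpc.get("terminalState"):
        # Terminal: the resource can't become ready, so surface the message.
        return "failed: " + rpc.get("message", "")
    # Not ready and not terminal: the resource may still become ready.
    return "wait"


# Invented sample status for illustration.
print(next_action({"ready": False, "rpcStatus": {"message": "image not found", "terminalState": True}}))
```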