REST Resource: projects.models.versions

Resource: Version

Represents a version of the model.

Each version is a trained model deployed in the cloud, ready to handle prediction requests. A model can have multiple versions. You can get information about all of the versions of a given model by calling projects.models.versions.list.

JSON representation
{
  "name": string,
  "description": string,
  "isDefault": boolean,
  "deploymentUri": string,
  "createTime": string,
  "lastUseTime": string,
  "runtimeVersion": string,
  "machineType": string,
  "state": enum (State),
  "errorMessage": string,
  "packageUris": [
    string
  ],
  "labels": {
    string: string,
    ...
  },
  "etag": string,
  "framework": enum (Framework),
  "pythonVersion": string,
  "acceleratorConfig": {
    object (AcceleratorConfig)
  },
  "serviceAccount": string,
  "requestLoggingConfig": {
    object (RequestLoggingConfig)
  },
  "explanationConfig": {
    object (ExplanationConfig)
  },
  "container": {
    object (ContainerSpec)
  },
  "routes": {
    object (RouteMap)
  },

  // Union field scaling can be only one of the following:
  "autoScaling": {
    object (AutoScaling)
  },
  "manualScaling": {
    object (ManualScaling)
  }
  // End of list of possible types for union field scaling.
  "predictionClass": string
}
Fields
name

string

Required. The name specified for the version when it was created.

The version name must be unique within the model it is created in.

description

string

Optional. The description specified for the version when it was created.

isDefault

boolean

Output only. If true, this version will be used to handle prediction requests that do not specify a version.

You can change the default version by calling projects.models.versions.setDefault.

deploymentUri

string

The Cloud Storage URI of a directory containing trained model artifacts to be used to create the model version. See the guide to deploying models for more information. The total number of files under this directory must not exceed 1000.

During projects.models.versions.create, AI Platform Prediction copies all files from the specified directory to a location managed by the service. From then on, AI Platform Prediction uses these copies of the model artifacts to serve predictions, not the original files in Cloud Storage, so this location is useful only as a historical record.

If you specify container, then this field is optional. Otherwise, it is required. Learn how to use this field with a custom container.

createTime

string (Timestamp format)

Output only. The time the version was created.

lastUseTime

string (Timestamp format)

Output only. The time the version was last used for prediction.

runtimeVersion

string

Required. The AI Platform runtime version to use for this deployment.

For more information, see the runtime version list and how to manage runtime versions.

machineType

string

Optional. The type of machine on which to serve the model. Currently only applies to online prediction service. If this field is not specified, it defaults to mls1-c1-m2.

Online prediction supports the following machine types:

  • mls1-c1-m2
  • mls1-c4-m2
  • n1-standard-2
  • n1-standard-4
  • n1-standard-8
  • n1-standard-16
  • n1-standard-32
  • n1-highmem-2
  • n1-highmem-4
  • n1-highmem-8
  • n1-highmem-16
  • n1-highmem-32
  • n1-highcpu-2
  • n1-highcpu-4
  • n1-highcpu-8
  • n1-highcpu-16
  • n1-highcpu-32

mls1-c4-m2 is in beta. All other machine types are generally available. Learn more about the differences between machine types.

state

enum (State)

Output only. The state of a version.

errorMessage

string

Output only. The details of a failure or a cancellation.

packageUris[]

string

Optional. Cloud Storage paths (gs://…) of packages for custom prediction routines or scikit-learn pipelines with custom code.

For a custom prediction routine, one of these packages must contain your Predictor class (see predictionClass). Additionally, include any dependencies that your Predictor or scikit-learn pipeline uses and that are not already included in your selected runtime version.

If you specify this field, you must also set runtimeVersion to 1.4 or greater.

labels

map (key: string, value: string)

Optional. One or more labels that you can add to organize your model versions. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

etag

string (bytes format)

etag is used for optimistic concurrency control as a way to help prevent simultaneous updates of a model from overwriting each other. It is strongly suggested that systems make use of the etag in the read-modify-write cycle to perform model updates in order to avoid race conditions: An etag is returned in the response to versions.get, and systems are expected to put that etag in the request to versions.patch to ensure that their change will be applied to the model as intended.

A base64-encoded string.
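The read-modify-write cycle described for etag can be sketched as follows. This is a minimal illustration that only builds a request body; build_patch_body is a hypothetical helper, the sample etag value is made up, and the actual calls to versions.get and versions.patch are left out.

```python
def build_patch_body(current_version, changes):
    """Build a versions.patch body that carries the etag read via versions.get.

    current_version: dict parsed from a versions.get response.
    changes: dict of fields to update, for example {"description": "..."}.
    """
    if "etag" not in current_version:
        raise ValueError("versions.get response has no etag")
    body = dict(changes)
    # Echo back the etag that was read, so a concurrent change made between
    # the read and the write can be detected instead of silently overwritten.
    body["etag"] = current_version["etag"]
    return body


# Hypothetical versions.get response (the etag value is made up).
fetched = {"name": "projects/p/models/m/versions/v1", "etag": "BwWWja0YfJA="}
patch_body = build_patch_body(fetched, {"description": "retrained on new data"})
```

The helper copies the changes rather than mutating them, so the caller can re-read the version and retry with a fresh etag if the update does not go through.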

framework

enum (Framework)

Optional. The machine learning framework AI Platform uses to train this version of the model. Valid values are TENSORFLOW, SCIKIT_LEARN, XGBOOST. If you do not specify a framework, AI Platform will analyze files in the deploymentUri to determine a framework. If you choose SCIKIT_LEARN or XGBOOST, you must also set the runtime version of the model to 1.4 or greater.

Do not specify a framework if you're deploying a custom prediction routine or if you're using a custom container.

pythonVersion

string

Required. The version of Python used in prediction.

The following Python versions are available:

  • Python '3.7' is available when runtimeVersion is set to '1.15' or later.
  • Python '3.5' is available when runtimeVersion is set to a version from '1.4' to '1.14'.
  • Python '2.7' is available when runtimeVersion is set to '1.15' or earlier.

Read more about the Python versions available for each runtime version.
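The compatibility rules listed above can be checked with a small helper. This is a sketch, not part of the API; python_version_allowed and parse_runtime_version are hypothetical names. Runtime versions are compared as integer pairs because comparing them as strings would rank '1.4' above '1.15'.

```python
def parse_runtime_version(text):
    """Turn a runtime version such as '1.15' into (1, 15) for comparison."""
    major, minor = text.split(".")
    return int(major), int(minor)


def python_version_allowed(python_version, runtime_version):
    """Apply the three availability rules listed for pythonVersion."""
    rt = parse_runtime_version(runtime_version)
    if python_version == "3.7":
        return rt >= (1, 15)            # '1.15' or later
    if python_version == "3.5":
        return (1, 4) <= rt <= (1, 14)  # from '1.4' to '1.14'
    if python_version == "2.7":
        return rt <= (1, 15)            # '1.15' or earlier
    return False
```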

acceleratorConfig

object (AcceleratorConfig)

Optional. Accelerator config for using GPUs for online prediction (beta). Only specify this field if you have specified a Compute Engine (N1) machine type in the machineType field. Learn more about using GPUs for online prediction.

serviceAccount

string

Optional. Specifies the service account for resource access control. If you specify this field, then you must also specify either the container field or the predictionClass field.

Learn more about using a custom service account.

requestLoggingConfig

object (RequestLoggingConfig)

Optional. Only specify this field in a projects.models.versions.patch request. Specifying it in a projects.models.versions.create request has no effect.

Configures the request-response pair logging on predictions from this Version.

explanationConfig

object (ExplanationConfig)

Optional. Configures explainability features on the model's version. Some explanation features require additional metadata to be loaded as part of the model payload.

container

object (ContainerSpec)

Optional. Specifies a custom container to use for serving predictions.

If you specify this field, then machineType is required.

If you specify this field, then deploymentUri is optional.

If you specify this field, then you must not specify runtimeVersion, packageUris, framework, pythonVersion, or predictionClass.
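The rules above for the container field can be expressed as a client-side check on a Version body before it is sent. This is a sketch under the assumption that the body is a plain dict; check_container_fields is a hypothetical helper, and the service performs its own validation regardless.

```python
FORBIDDEN_WITH_CONTAINER = (
    "runtimeVersion",
    "packageUris",
    "framework",
    "pythonVersion",
    "predictionClass",
)


def check_container_fields(version):
    """Return a list of problems found in a Version body; empty if none."""
    problems = []
    if "container" not in version:
        # Without a custom container, deploymentUri is required.
        if "deploymentUri" not in version:
            problems.append("deploymentUri is required when container is not set")
        return problems
    if "machineType" not in version:
        problems.append("machineType is required when container is set")
    for field in FORBIDDEN_WITH_CONTAINER:
        if field in version:
            problems.append(field + " must not be set when container is set")
    return problems
```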

routes

object (RouteMap)

Optional. Specifies paths on a custom container's HTTP server where AI Platform Prediction sends certain requests. If you specify this field, then you must also specify the container field.

If you specify the container field and do not specify this field, it defaults to the following:

{
  "predict": "/v1/models/MODEL/versions/VERSION:predict",
  "health": "/v1/models/MODEL/versions/VERSION"
}

See RouteMap for more details about these default values.

Union field scaling. Optional. Sets the options for scaling. If not specified, defaults to autoScaling with minNodes of 0 (see the documentation for AutoScaling.minNodes). scaling can be only one of the following:
autoScaling

object (AutoScaling)

Automatically scale the number of nodes used to serve the model in response to increases and decreases in traffic. Care should be taken to ramp up traffic according to the model's ability to scale or you will start seeing increases in latency and 429 response codes.

Note that you cannot use AutoScaling if your version uses GPUs. Instead, you must specify manualScaling.

manualScaling

object (ManualScaling)

Manually select the number of nodes to use for serving the model. You should generally use autoScaling with an appropriate minNodes instead, but this option is available if you want more predictable billing. Beware that latency and error rates will increase if the traffic exceeds the capability of the system to serve it based on the selected number of nodes.
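The two constraints on the scaling union field (only one member may be set, and versions that use GPUs need manualScaling) can be sketched as a check on a Version body. check_scaling is a hypothetical helper, and treating the presence of acceleratorConfig as "uses GPUs" is an assumption of this sketch.

```python
def check_scaling(version):
    """Return a list of problems with the scaling settings; empty if none."""
    problems = []
    has_auto = "autoScaling" in version
    has_manual = "manualScaling" in version
    if has_auto and has_manual:
        problems.append("set only one of autoScaling and manualScaling")
    # Assumption: a version "uses GPUs" when acceleratorConfig is present.
    uses_gpus = "acceleratorConfig" in version
    if uses_gpus and not has_manual:
        # Omitting both fields falls back to auto scaling, which cannot be
        # combined with GPUs, so manualScaling has to be given explicitly.
        problems.append("versions that use GPUs must specify manualScaling")
    return problems
```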

predictionClass

string

Optional. The fully qualified name (module_name.class_name) of a class that implements the Predictor interface described in this reference field. The module containing this class should be included in a package provided to the packageUris field.

Specify this field if and only if you are deploying a custom prediction routine (beta). If you specify this field, you must set runtimeVersion to 1.4 or greater and you must set machineType to a legacy (MLS1) machine type.

The following code sample provides the Predictor interface:

class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Instances are the decoded values from the request. They have already
        been deserialized from JSON.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list must
            be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating the
                version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()

Learn more about the Predictor interface and custom prediction routines.
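As an illustration of the interface, here is a toy class that provides the same two methods. It is a sketch only: ScalingPredictor and the artifact file name model.json are hypothetical, and a real predictor would load a trained model instead of a single number.

```python
import json
import os
import tempfile


class ScalingPredictor(object):
    """Toy predictor that multiplies each numeric instance by a stored factor."""

    def __init__(self, factor):
        self._factor = factor

    def predict(self, instances, **kwargs):
        # Return a JSON-serializable list with one output per input instance.
        return [instance * self._factor for instance in instances]

    @classmethod
    def from_path(cls, model_dir):
        # "model.json" is a hypothetical artifact name chosen for this sketch.
        with open(os.path.join(model_dir, "model.json")) as f:
            params = json.load(f)
        return cls(params["factor"])


# Local smoke test that mimics loading from a model directory.
with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "model.json"), "w") as f:
        json.dump({"factor": 2}, f)
    predictor = ScalingPredictor.from_path(model_dir)
    outputs = predictor.predict([1, 2, 3])
```

Keeping all loading inside from_path means the constructor stays cheap and the class can be exercised locally against a temporary directory, as the smoke test does.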

AutoScaling

Options for automatically scaling a model.

JSON representation
{
  "minNodes": integer
}
Fields
minNodes

integer

Optional. The minimum number of nodes to allocate for this model. These nodes are always up, starting from the time the model is deployed. Therefore, the cost of operating this model will be at least rate * minNodes * number of hours since last billing cycle, where rate is the cost per node-hour as documented in the pricing guide, even if no predictions are performed. There is additional cost for each prediction performed.
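The cost floor in the formula above is a simple product. The sketch below uses a made-up rate of 0.5 per node-hour purely for illustration; real rates come from the pricing guide, and per-prediction charges are not included.

```python
def minimum_node_cost(rate_per_node_hour, min_nodes, hours):
    """Lower bound on cost: rate * minNodes * hours, excluding prediction charges."""
    return rate_per_node_hour * min_nodes * hours


# Hypothetical rate; two always-on nodes for 24 hours.
floor = minimum_node_cost(0.5, 2, 24)
```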

Unlike manual scaling, if the load gets too heavy for the nodes that are up, the service will automatically add nodes to handle the increased load as well as scale back as traffic drops, always maintaining at least minNodes. You will be charged for the time in which additional nodes are used.

If minNodes is not specified and AutoScaling is used with a legacy (MLS1) machine type, minNodes defaults to 0, in which case, when traffic to a model stops (and after a cool-down period), nodes will be shut down and no charges will be incurred until traffic to the model resumes.

If minNodes is not specified and AutoScaling is used with a Compute Engine (N1) machine type, minNodes defaults to 1. minNodes must be at least 1 for use with a Compute Engine machine type.
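The two defaulting rules can be summarized in code. This sketch assumes that legacy machine types are recognizable by the mls1- prefix and Compute Engine machine types by the n1- prefix, which matches the machine types listed for machineType; both function names are hypothetical.

```python
def default_min_nodes(machine_type):
    """Default minNodes when autoScaling is used and minNodes is omitted."""
    # Assumption: the name prefix identifies the machine type family.
    if machine_type.startswith("mls1-"):
        return 0
    if machine_type.startswith("n1-"):
        return 1
    raise ValueError("unrecognized machine type: " + machine_type)


def min_nodes_is_valid(machine_type, min_nodes):
    """Compute Engine (N1) machine types need minNodes of at least 1."""
    if machine_type.startswith("n1-"):
        return min_nodes >= 1
    return min_nodes >= 0
```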

Note that you cannot use AutoScaling if your version uses GPUs. Instead, you must use ManualScaling.

You can set minNodes when creating the model version, and you can also update minNodes for an existing version:

update_body.json:
{
  "autoScaling": {
    "minNodes": 5
  }
}

HTTP request:

PATCH
https://ml.googleapis.com/v1/{name=projects/*/models/*/versions/*}?updateMask=autoScaling.minNodes
-d @./update_body.json
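The same request can be assembled with the Python standard library. This sketch only builds the request object and does not send it; build_min_nodes_patch is a hypothetical helper, the resource name is a placeholder, and passing an OAuth 2.0 access token in a Bearer Authorization header is an assumption about how the caller authenticates.

```python
import json
import urllib.request


def build_min_nodes_patch(version_name, min_nodes, access_token):
    """Build (but do not send) a PATCH request that updates autoScaling.minNodes."""
    url = (
        "https://ml.googleapis.com/v1/"
        + version_name
        + "?updateMask=autoScaling.minNodes"
    )
    body = json.dumps({"autoScaling": {"minNodes": min_nodes}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: the caller already holds a valid access token.
            "Authorization": "Bearer " + access_token,
        },
    )


request = build_min_nodes_patch(
    "projects/my-project/models/my_model/versions/v1", 5, "ACCESS_TOKEN"
)
# urllib.request.urlopen(request) would send it; that call is omitted here.
```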

ManualScaling

Options for manually scaling a model.

JSON representation
{
  "nodes": integer
}
Fields
nodes

integer

The number of nodes to allocate for this model. These nodes are always up, starting from the time the model is deployed, so the cost of operating this model will be proportional to nodes * number of hours since last billing cycle plus the cost for each prediction performed.

State

Describes the version state.

Enums
UNKNOWN The version state is unspecified.
READY The version is ready for prediction.
CREATING The version is being created. New versions.patch and versions.delete requests will fail if a version is in the CREATING state.
FAILED The version failed to be created, possibly cancelled. errorMessage should contain the details of the failure.
DELETING The version is being deleted. New versions.patch and versions.delete requests will fail if a version is in the DELETING state.
UPDATING The version is being updated. New versions.patch and versions.delete requests will fail if a version is in the UPDATING state.
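A client can use the state to avoid sending requests that are documented to fail. The helper below is a hypothetical sketch that encodes only the three states named above; it makes no claim about how the other states behave.

```python
# States in which new versions.patch and versions.delete requests fail.
BUSY_STATES = frozenset({"CREATING", "DELETING", "UPDATING"})


def patch_or_delete_will_fail(state):
    """True if the version is in a state where patch and delete requests fail."""
    return state in BUSY_STATES
```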

Framework

Available frameworks for prediction.

Enums
FRAMEWORK_UNSPECIFIED Unspecified framework. Assigns a value based on the file suffix.
TENSORFLOW TensorFlow framework.
SCIKIT_LEARN Scikit-learn framework.
XGBOOST XGBoost framework.

AcceleratorConfig

Represents a hardware accelerator request config. Note that the AcceleratorConfig can be used in both Jobs and Versions. Learn more about accelerators for training and accelerators for online prediction.

JSON representation
{
  "count": string,
  "type": enum (AcceleratorType)
}
Fields
count

string (int64 format)

The number of accelerators to attach to each machine running the job.

type

enum (AcceleratorType)

The type of accelerator to use.

RequestLoggingConfig

Configuration for logging request-response pairs to a BigQuery table. Online prediction requests to a model version and the responses to these requests are converted to raw strings and saved to the specified BigQuery table. Logging is constrained by BigQuery quotas and limits. If your project exceeds BigQuery quotas or limits, AI Platform Prediction does not log request-response pairs, but it continues to serve predictions.

If you are using continuous evaluation, you do not need to specify this configuration manually. Setting up continuous evaluation automatically enables logging of request-response pairs.

JSON representation
{
  "samplingPercentage": number,
  "bigqueryTableName": string
}
Fields
samplingPercentage

number

Percentage of requests to be logged, expressed as a fraction from 0 to 1. For example, if you want to log 10% of requests, enter 0.1. The sampling window is the lifetime of the model version. Defaults to 0.

bigqueryTableName

string

Required. Fully qualified BigQuery table name in the following format: "projectId.dataset_name.table_name"

The specified table must already exist, and the "Cloud ML Service Agent" for your project must have permission to write to it. The table must have the following schema:

Field name      Type       Mode
model           STRING     REQUIRED
model_version   STRING     REQUIRED
time            TIMESTAMP  REQUIRED
raw_data        STRING     REQUIRED
raw_prediction  STRING     NULLABLE
groundtruth     STRING     NULLABLE
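The required table schema can be written down as data in the name/type/mode layout that BigQuery uses for JSON schema definitions. The sketch below also includes a hypothetical helper, missing_required_fields, that checks a row dict against the REQUIRED columns; it does not call BigQuery.

```python
LOGGING_TABLE_SCHEMA = [
    {"name": "model", "type": "STRING", "mode": "REQUIRED"},
    {"name": "model_version", "type": "STRING", "mode": "REQUIRED"},
    {"name": "time", "type": "TIMESTAMP", "mode": "REQUIRED"},
    {"name": "raw_data", "type": "STRING", "mode": "REQUIRED"},
    {"name": "raw_prediction", "type": "STRING", "mode": "NULLABLE"},
    {"name": "groundtruth", "type": "STRING", "mode": "NULLABLE"},
]


def missing_required_fields(row):
    """Return the names of REQUIRED columns that are absent or None in a row."""
    return [
        column["name"]
        for column in LOGGING_TABLE_SCHEMA
        if column["mode"] == "REQUIRED" and row.get(column["name"]) is None
    ]
```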

ExplanationConfig

Message holding configuration options for explaining model predictions. There are three feature attribution methods supported for TensorFlow models: integrated gradients, sampled Shapley, and XRAI. Learn more about feature attributions.

JSON representation
{

  // Union field attribution_method can be only one of the following:
  "integratedGradientsAttribution": {
    object (IntegratedGradientsAttribution)
  },
  "sampledShapleyAttribution": {
    object (SampledShapleyAttribution)
  },
  "xraiAttribution": {
    object (XraiAttribution)
  }
  // End of list of possible types for union field attribution_method.
}
Fields
Union field attribution_method. The attribution method to enable for explaining the model's predictions. attribution_method can be only one of the following:
integratedGradientsAttribution

object (IntegratedGradientsAttribution)

Attributes credit by computing the Aumann-Shapley value taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1703.01365

sampledShapleyAttribution

object (SampledShapleyAttribution)

An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features.

xraiAttribution

object (XraiAttribution)

Attributes credit by computing the XRAI taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1906.02825. Currently implemented only for models with natural image inputs.

IntegratedGradientsAttribution

Attributes credit by computing the Aumann-Shapley value taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1703.01365

JSON representation
{
  "numIntegralSteps": integer
}
Fields
numIntegralSteps

integer

Number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum-to-diff property is met within the desired error range.
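To see why the step count matters, the sketch below approximates integrated gradients for a small hand-written function and measures how far the attributions are from the sum-to-diff property, read here as: the attributions should sum to the difference between the model output at the input and at the baseline. The function f, its gradient, and the midpoint rule are choices made for this illustration; the service's own approximation scheme is not documented here.

```python
def integrated_gradients(grad_fn, x, baseline, num_steps):
    """Midpoint Riemann approximation of the integrated gradients path integral."""
    attributions = [0.0] * len(x)
    for k in range(num_steps):
        alpha = (k + 0.5) / num_steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_fn(point)
        for i, g in enumerate(grads):
            attributions[i] += g * (x[i] - baseline[i]) / num_steps
    return attributions


def f(x):
    """Toy differentiable model used only for this sketch."""
    return x[0] ** 3 + 2.0 * x[1]


def grad_f(x):
    return [3.0 * x[0] ** 2, 2.0]


def sum_to_diff_error(num_steps, x=(2.0, 1.0), baseline=(0.0, 0.0)):
    """Gap between the summed attributions and f(x) - f(baseline)."""
    attributions = integrated_gradients(grad_f, list(x), list(baseline), num_steps)
    return abs(sum(attributions) - (f(x) - f(baseline)))
```

Calling sum_to_diff_error with increasing step counts and stopping once the result falls below the desired tolerance mirrors the tuning procedure described for numIntegralSteps.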

SampledShapleyAttribution

An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features.

JSON representation
{
  "numPaths": integer
}
Fields
numPaths

integer

The number of feature permutations to consider when approximating the Shapley values.

XraiAttribution

Attributes credit by computing the XRAI taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1906.02825. Currently implemented only for models with natural image inputs.

JSON representation
{
  "numIntegralSteps": integer
}
Fields
numIntegralSteps

integer

Number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum-to-diff property is met within the desired error range.

ContainerSpec

Specification of a custom container for serving predictions. This message is a subset of the Kubernetes Container v1 core specification.

JSON representation
{
  "image": string,
  "command": [
    string
  ],
  "args": [
    string
  ],
  "ports": [
    {
      object (ContainerPort)
    }
  ],
  "env": [
    {
      object (EnvVar)
    }
  ]
}
Fields
image

string

URI of the Docker image to be used as the custom container for serving predictions. This URI must identify an image in Artifact Registry and begin with the hostname {REGION}-docker.pkg.dev, where {REGION} is replaced by the region that matches the AI Platform Prediction regional endpoint that you are using. For example, if you are using the us-central1-ml.googleapis.com endpoint, then this URI must begin with us-central1-docker.pkg.dev.
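The hostname rule can be checked before a version is created. The helpers below are a hypothetical sketch; they assume regional endpoints always have the form {REGION}-ml.googleapis.com, as in the example above, and the image path used in any call is up to the caller.

```python
ENDPOINT_SUFFIX = "-ml.googleapis.com"


def required_image_host(endpoint):
    """Derive the Artifact Registry hostname that matches a regional endpoint."""
    if not endpoint.endswith(ENDPOINT_SUFFIX):
        raise ValueError("not a regional endpoint: " + endpoint)
    region = endpoint[: -len(ENDPOINT_SUFFIX)]
    return region + "-docker.pkg.dev"


def image_matches_endpoint(image_uri, endpoint):
    """True if the image URI begins with the hostname required for the endpoint."""
    return image_uri.startswith(required_image_host(endpoint) + "/")
```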

To use a custom container, the AI Platform Google-managed service account must have permission to pull (read) the Docker image at this URI. The AI Platform Google-managed service account has the following format:

service-{PROJECT_NUMBER}@cloud-ml.google.com.iam.gserviceaccount.com

{PROJECT_NUMBER} is replaced by your Google Cloud project number.

By default, this service account has necessary permissions to pull an Artifact Registry image in the same Google Cloud project where you are using AI Platform Prediction. In this case, no configuration is necessary.

If you want to use an image from a different Google Cloud project, learn how to grant the Artifact Registry Reader (roles/artifactregistry.reader) role for a repository to your project's AI Platform Google-managed service account.

To learn about the requirements for the Docker image itself, read Custom container requirements.

command[]

string

Immutable. Specifies the command that runs when the container starts. This overrides the container's ENTRYPOINT. Specify this field as an array of executable and arguments, similar to a Docker ENTRYPOINT's "exec" form, not its "shell" form.

If you do not specify this field, then the container's ENTRYPOINT runs, in conjunction with the args field or the container's CMD, if either exists. If this field is not specified and the container does not have an ENTRYPOINT, then refer to the Docker documentation about how CMD and ENTRYPOINT interact.

If you specify this field, then you can also specify the args field to provide additional arguments for this command. However, if you specify this field, then the container's CMD is ignored. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.

In this field, you can reference environment variables set by AI Platform Prediction and environment variables set in the env field. You cannot reference environment variables set in the Docker image. In order for environment variables to be expanded, reference them by using the following syntax:

$(VARIABLE_NAME)

Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:

$$(VARIABLE_NAME)

This field corresponds to the command field of the Kubernetes Containers v1 core API.

args[]

string

Immutable. Specifies arguments for the command that runs when the container starts. This overrides the container's CMD. Specify this field as an array of executable and arguments, similar to a Docker CMD's "default parameters" form.

If you don't specify this field but do specify the command field, then the command from the command field runs without any additional arguments. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.

If you don't specify this field and don't specify the command field, then the container's ENTRYPOINT and CMD determine what runs based on their default behavior. See the Docker documentation about how CMD and ENTRYPOINT interact.

In this field, you can reference environment variables set by AI Platform Prediction and environment variables set in the env field. You cannot reference environment variables set in the Docker image. In order for environment variables to be expanded, reference them by using the following syntax:

$(VARIABLE_NAME)

Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:

$$(VARIABLE_NAME)

This field corresponds to the args field of the Kubernetes Containers v1 core API.

ports[]

object (ContainerPort)

Immutable. List of ports to expose from the container. AI Platform Prediction sends any prediction requests that it receives to the first port on this list. AI Platform Prediction also sends liveness and health checks to this port.

If you do not specify this field, it defaults to the following value:

[
  {
    "containerPort": 8080
  }
]

AI Platform Prediction does not use ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.

env[]

object (EnvVar)

Immutable. List of environment variables to set in the container. After the container starts running, code running in the container can read these environment variables.

Additionally, the command and args fields can reference these variables. Later entries in this list can also reference earlier entries. For example, the following example sets the variable VAR_2 to have the value foo bar:

[
  {
    "name": "VAR_1",
    "value": "foo"
  },
  {
    "name": "VAR_2",
    "value": "$(VAR_1) bar"
  }
]

If you switch the order of the variables in the example, then the expansion does not occur.

This field corresponds to the env field of the Kubernetes Containers v1 core API.
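The ordering behavior described for env can be reproduced with a small local model of the $(VARIABLE_NAME) syntax. This is a sketch, not the service's implementation: expand and resolve_env are hypothetical helpers, and the handling of $$ (collapsing to a single $) is an assumption modeled on the Kubernetes behavior that this field corresponds to.

```python
import re

# Matches either the escape "$$" or a reference of the form $(NAME).
_REFERENCE = re.compile(r"\$\$|\$\(([^)]*)\)")


def expand(value, variables):
    """Expand $(NAME) references using the given dict of known variables."""
    def replace(match):
        if match.group(0) == "$$":
            return "$"  # escaped: "$$(NAME)" becomes the literal "$(NAME)"
        # Unresolved references are left unchanged.
        return variables.get(match.group(1), match.group(0))
    return _REFERENCE.sub(replace, value)


def resolve_env(entries):
    """Resolve a list of {"name", "value"} entries in order.

    Each value may reference only the entries that come before it.
    """
    resolved = {}
    for entry in entries:
        resolved[entry["name"]] = expand(entry.get("value", ""), resolved)
    return resolved


ordered = [
    {"name": "VAR_1", "value": "foo"},
    {"name": "VAR_2", "value": "$(VAR_1) bar"},
]
swapped = list(reversed(ordered))
```

With ordered, VAR_2 resolves against the already-defined VAR_1; with swapped, VAR_1 is not yet known when VAR_2 is processed, so the reference stays as written.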

ContainerPort

Represents a network port in a single container.

This message is a subset of the Kubernetes ContainerPort v1 core specification.

JSON representation
{
  "containerPort": integer
}
Fields
containerPort

integer

Number of the port to expose on the container. This must be a valid port number: 0 < PORT_NUMBER < 65536.

EnvVar

Represents an environment variable to be made available in a container.

This message is a subset of the Kubernetes EnvVar v1 core specification.

JSON representation
{
  "name": string,
  "value": string
}
Fields
name

string

Name of the environment variable. Must be a valid C identifier and must not begin with the prefix AIP_.
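Both naming rules can be checked with a regular expression. The helper name below is hypothetical; the pattern encodes the usual definition of a C identifier (a letter or underscore followed by letters, digits, or underscores).

```python
import re

_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_env_name(name):
    """True if name is a valid C identifier that does not start with AIP_."""
    return bool(_C_IDENTIFIER.fullmatch(name)) and not name.startswith("AIP_")
```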

value

string

Value of the environment variable. Defaults to an empty string.

In this field, you can reference environment variables set by AI Platform Prediction and environment variables set earlier in the same env field as where this message occurs. You cannot reference environment variables set in the Docker image. In order for environment variables to be expanded, reference them by using the following syntax:

$(VARIABLE_NAME)

Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:

$$(VARIABLE_NAME)

RouteMap

Specifies HTTP paths served by a custom container. AI Platform Prediction sends requests to these paths on the container; the custom container must run an HTTP server that responds to these requests with appropriate responses. Read Custom container requirements for details on how to create your container image to meet these requirements.

JSON representation
{
  "predict": string,
  "health": string
}
Fields
predict

string

HTTP path on the container to send prediction requests to. AI Platform Prediction forwards requests sent using projects.predict to this path on the container's IP address and port. AI Platform Prediction then returns the container's response in the API response.

For example, if you set this field to /foo, then when AI Platform Prediction receives a prediction request, it forwards the request body in a POST request to the following URL on the container:

localhost:PORT/foo

PORT refers to the first value of Version.container.ports.

If you don't specify this field, it defaults to the following value:

/v1/models/MODEL/versions/VERSION:predict

The placeholders in this value are replaced as follows:

  • MODEL: The name of the parent Model. This does not include the "projects/PROJECT_ID/models/" prefix that the API returns in output; it is the bare model name, as provided to projects.models.create.

  • VERSION: The name of the model version. This does not include the "projects/PROJECT_ID/models/MODEL/versions/" prefix that the API returns in output; it is the bare version name, as provided to projects.models.versions.create.

health

string

HTTP path on the container to send health checks to. AI Platform Prediction intermittently sends GET requests to this path on the container's IP address and port to check that the container is healthy. Read more about health checks.

For example, if you set this field to /bar, then AI Platform Prediction intermittently sends a GET request to the following URL on the container:

localhost:PORT/bar

PORT refers to the first value of Version.container.ports.

If you don't specify this field, it defaults to the following value:

/v1/models/MODEL/versions/VERSION

The placeholders in this value are replaced as follows:

  • MODEL: The name of the parent Model. This does not include the "projects/PROJECT_ID/models/" prefix that the API returns in output; it is the bare model name, as provided to projects.models.create.

  • VERSION: The name of the model version. This does not include the "projects/PROJECT_ID/models/MODEL/versions/" prefix that the API returns in output; it is the bare version name, as provided to projects.models.versions.create.
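The placeholder substitution for both default paths can be sketched as follows. default_routes is a hypothetical helper; it accepts either bare names or the full resource names that the API returns and keeps only the last path segment, as the placeholder descriptions above require.

```python
def default_routes(model, version):
    """Compute the default predict and health paths for a custom container."""
    # Keep only the bare names, dropping any "projects/.../" prefix.
    model_id = model.rsplit("/", 1)[-1]
    version_id = version.rsplit("/", 1)[-1]
    base = "/v1/models/{}/versions/{}".format(model_id, version_id)
    return {"predict": base + ":predict", "health": base}


routes = default_routes(
    "projects/my-project/models/my_model",
    "projects/my-project/models/my_model/versions/v1",
)
```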

Methods

create

Creates a new version of a model from a trained TensorFlow model.

delete

Deletes a model version.

get

Gets information about a model version.

list

Gets basic information about all the versions of a model.

patch

Updates the specified Version resource.

setDefault

Designates a version to be the default for the model.