REST Resource: projects.models.versions

Resource: Version

Represents a version of the model.

Each version is a trained model deployed in the cloud, ready to handle prediction requests. A model can have multiple versions. You can get information about all of the versions of a given model by calling projects.models.versions.list.

JSON representation
{
  "name": string,
  "description": string,
  "isDefault": boolean,
  "deploymentUri": string,
  "createTime": string,
  "lastUseTime": string,
  "runtimeVersion": string,
  "state": enum(State),
  "errorMessage": string,

  // Union field scaling can be only one of the following:
  "autoScaling": {
    object(AutoScaling)
  },
  "manualScaling": {
    object(ManualScaling)
  },
  // End of list of possible types for union field scaling.
}
Fields
name

string

Required. The name specified for the version when it was created.

The version name must be unique within the model it is created in.

description

string

Optional. The description specified for the version when it was created.

isDefault

boolean

Output only. If true, this version will be used to handle prediction requests that do not specify a version.

You can change the default version by calling projects.models.versions.setDefault.

deploymentUri

string

Required. The Google Cloud Storage location of the trained model used to create the version. See the overview of model deployment for more information.

When you pass Version to projects.models.versions.create, the model service uses the specified location as the source of the model. Once deployed, the model version is hosted by the prediction service, so this location is useful only as a historical record. The total number of model files can't exceed 1000.
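As an illustration of the fields described so far, the following minimal sketch assembles a Version body that could be passed to projects.models.versions.create. The helper name build_version_body, the bucket path, and the gs:// prefix check are assumptions made for this example and are not part of the API. Output-only fields (isDefault, createTime, lastUseTime, state, errorMessage) are left out because the service sets them.

```python
import json


def build_version_body(name, deployment_uri, description=None,
                       runtime_version=None, min_nodes=None):
    """Assemble a Version body from the documented request fields.

    Hypothetical helper: it only builds a dict and does not call the service.
    """
    # Assumption: Cloud Storage locations are written as gs:// URIs.
    if not deployment_uri.startswith("gs://"):
        raise ValueError("deploymentUri should be a Cloud Storage location (gs://...)")

    # name and deploymentUri are the two fields documented as Required.
    body = {"name": name, "deploymentUri": deployment_uri}

    # Optional fields are added only when the caller supplies them.
    if description is not None:
        body["description"] = description
    if runtime_version is not None:
        body["runtimeVersion"] = runtime_version
    if min_nodes is not None:
        body["autoScaling"] = {"minNodes": min_nodes}
    return body


# Placeholder values, not taken from any real project.
example_body = build_version_body(
    "v1",
    "gs://example-bucket/model-dir/",
    description="First deployed version",
)
print(json.dumps(example_body, indent=2))
```

Because runtimeVersion and the scaling union are omitted here, the service would apply its documented defaults for both.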

createTime

string (Timestamp format)

Output only. The time the version was created.

A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".

lastUseTime

string (Timestamp format)

Output only. The time the version was last used for prediction.

A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".
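The timestamps in createTime and lastUseTime can carry up to nine fractional digits, while Python's datetime type stores only microseconds. The sketch below is one way to handle that: it returns a datetime truncated to microseconds together with the full nanosecond count. The function name parse_zulu_timestamp and the regular expression are assumptions made for this example. It accepts only the "Z" form shown above, not numeric UTC offsets.

```python
import re
from datetime import datetime, timezone

# Matches e.g. 2014-10-02T15:01:23.045123456Z (fraction optional, 1-9 digits).
_ZULU_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_zulu_timestamp(value):
    """Return (UTC datetime truncated to microseconds, nanoseconds part)."""
    match = _ZULU_TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 'Zulu' timestamp: {value!r}")

    seconds_part, fraction = match.groups()
    # Right-pad the fraction to nine digits so that ".5" means 500000000 ns.
    nanos = int((fraction or "0").ljust(9, "0"))

    base = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    # datetime cannot hold nanoseconds, so keep microseconds and return the rest.
    return base.replace(microsecond=nanos // 1000), nanos


moment, nanos = parse_zulu_timestamp("2014-10-02T15:01:23.045123456Z")
print(moment.isoformat(), nanos)
```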

runtimeVersion

string

Optional. The Google Cloud ML runtime version to use for this deployment. If not set, Google Cloud ML will choose a version.

state

enum(State)

Output only. The state of a version.

errorMessage

string

Output only. The details of a failure or a cancellation.

Union field scaling. Optional. Sets the options for scaling. If not specified, defaults to autoScaling with minNodes of 0 (see the documentation for AutoScaling.minNodes). scaling can be only one of the following:
autoScaling

object(AutoScaling)

Automatically scale the number of nodes used to serve the model in response to increases and decreases in traffic. Ramp up traffic gradually, in line with the model's ability to scale; otherwise you will start seeing increases in latency and 429 response codes.

manualScaling

object(ManualScaling)

Manually select the number of nodes to use for serving the model. You should generally use autoScaling with an appropriate minNodes instead, but this option is available if you want more predictable billing. Beware that latency and error rates will increase if the traffic exceeds the capacity of the selected number of nodes.
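Because scaling is a union field, a Version body may carry autoScaling or manualScaling, but not both. The following sketch shows one way a client might check this before sending a request, and how the documented default (automatic scaling with a minimum of 0 nodes) could be made explicit. The function name resolve_scaling is hypothetical, and the check is client-side only; it is an assumption about how a caller might validate input, not a description of the service's own validation.

```python
SCALING_FIELDS = ("autoScaling", "manualScaling")


def resolve_scaling(version):
    """Return (field_name, options) for the scaling union of a Version dict."""
    present = [field for field in SCALING_FIELDS if field in version]

    # A union field can hold only one of its members at a time.
    if len(present) > 1:
        raise ValueError("set only one of: " + ", ".join(SCALING_FIELDS))

    # Nothing set: fall back to the documented default.
    if not present:
        return "autoScaling", {"minNodes": 0}

    field = present[0]
    return field, version[field]


print(resolve_scaling({"name": "v1"}))
print(resolve_scaling({"name": "v1", "manualScaling": {"nodes": 3}}))
```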

AutoScaling

Options for automatically scaling a model.

JSON representation
{
  "minNodes": number,
}
Fields
minNodes

number

Optional. The minimum number of nodes to allocate for this model. These nodes are always up, starting from the time the model is deployed. The cost of operating this model will therefore be at least rate * minNodes * number of hours since last billing cycle, even if no predictions are performed, where rate is the cost per node-hour as documented in pricing. There is additional cost for each prediction performed.

Unlike manual scaling, if the load gets too heavy for the nodes that are up, the service will automatically add nodes to handle the increased load as well as scale back as traffic drops, always maintaining at least minNodes. You will be charged for the time in which additional nodes are used.

If not specified, minNodes defaults to 0, in which case, when traffic to a model stops (and after a cool-down period), nodes will be shut down and no charges will be incurred until traffic to the model resumes.
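The lower bound on cost described above is a simple product, which the sketch below works through. The rate of 0.5 per node-hour is a made-up placeholder, not a real price; consult the pricing documentation for actual rates. The function name minimum_node_cost is also hypothetical. The result covers only the always-on nodes and excludes per-prediction charges and any additional nodes added under load.

```python
def minimum_node_cost(rate_per_node_hour, min_nodes, hours):
    """Lower bound on node cost: rate * minNodes * hours since last billing cycle."""
    if min_nodes < 0 or hours < 0 or rate_per_node_hour < 0:
        raise ValueError("rate, min_nodes and hours must be non-negative")
    return rate_per_node_hour * min_nodes * hours


# Placeholder rate (NOT a real price): 0.5 currency units per node-hour.
PLACEHOLDER_RATE = 0.5

# Two always-on nodes for 24 hours.
floor_cost = minimum_node_cost(PLACEHOLDER_RATE, min_nodes=2, hours=24)
print(floor_cost)

# With the default minNodes of 0, the node floor is zero.
idle_cost = minimum_node_cost(PLACEHOLDER_RATE, min_nodes=0, hours=24)
print(idle_cost)
```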

ManualScaling

Options for manually scaling a model.

JSON representation
{
  "nodes": number,
}
Fields
nodes

number

The number of nodes to allocate for this model. These nodes are always up, starting from the time the model is deployed, so the cost of operating this model will be proportional to nodes * number of hours since last billing cycle plus the cost for each prediction performed.

State

Describes the version state.

Enums
UNKNOWN The version state is unspecified.
READY The version is ready for prediction.
CREATING The version is in the process of creation.
FAILED The version could not be created, possibly because creation was cancelled. errorMessage should contain the details of the failure.
DELETING The version is in the process of deletion.
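A client that reads a Version resource will typically branch on state. The sketch below mirrors the enum values listed above and turns a Version dict into a short status line. The State class, the describe_version helper, and the status wording are all assumptions made for this example. Treating a missing state as UNKNOWN is also a choice made here, not documented behavior.

```python
from enum import Enum


class State(Enum):
    """Local mirror of the documented version states."""
    UNKNOWN = "UNKNOWN"
    READY = "READY"
    CREATING = "CREATING"
    FAILED = "FAILED"
    DELETING = "DELETING"


def describe_version(version):
    """Summarize a Version dict based on its state field."""
    # Assumption: a missing state is treated the same as UNKNOWN.
    state = State(version.get("state", "UNKNOWN"))

    if state is State.READY:
        return "ready for prediction"
    if state is State.FAILED:
        # errorMessage should carry the failure or cancellation details.
        details = version.get("errorMessage", "no details provided")
        return f"failed: {details}"
    if state in (State.CREATING, State.DELETING):
        return f"in progress ({state.value.lower()})"
    return "state unspecified"


print(describe_version({"name": "v1", "state": "READY"}))
print(describe_version({"name": "v2", "state": "FAILED", "errorMessage": "example failure"}))
```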

Methods

create

Creates a new version of a model from a trained TensorFlow model.

delete

Deletes a model version.

get

Gets information about a model version.

list

Gets basic information about all the versions of a model.

patch

Updates the specified Version resource.

setDefault

Designates a version to be the default for the model.
