This legacy version of AI Platform Training is deprecated and will no longer be available on Google Cloud after January 31, 2025. Migrate your resources to Vertex AI custom training to get new machine learning features that are unavailable in AI Platform.

REST Resource: projects.jobs

Resource: Job

Represents a training or prediction job.

JSON representation

{
  "jobId": string,
  "createTime": string,
  "startTime": string,
  "endTime": string,
  "state": enum (State),
  "errorMessage": string,
  "labels": {
    string: string,
    ...
  },
  "etag": string,
  "trainingInput": {
    object (TrainingInput)
  },
  "trainingOutput": {
    object (TrainingOutput)
  },
  "jobPosition": string
}

Fields
`jobId`	`string` Required. The user-specified id of the job.
`createTime`	`string (Timestamp format)` Output only. When the job was created.
`startTime`	`string (Timestamp format)` Output only. When the job processing was started.
`endTime`	`string (Timestamp format)` Output only. When the job processing was completed.
`state`	`enum (State)` Output only. The detailed state of a job.
`errorMessage`	`string` Output only. The details of a failure or a cancellation.
`labels`	`map (key: string, value: string)` Optional. One or more labels that you can add to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`etag`	`string (bytes format)` `etag` is used for optimistic concurrency control as a way to help prevent simultaneous updates of a job from overwriting each other. It is strongly suggested that systems make use of the `etag` in the read-modify-write cycle to perform job updates in order to avoid race conditions: An `etag` is returned in the response to `jobs.get`, and systems are expected to put that etag in the request to `jobs.patch` to ensure that their change will be applied to the same version of the job. A base64-encoded string.
`trainingInput`	`object (TrainingInput)` Input parameters to create a training job.
`trainingOutput`	`object (TrainingOutput)` The current training job result.
`jobPosition`	`string (int64 format)` Output only. This field is meaningful only when the job is in the `QUEUED` state. If it is positive, it indicates the job's position in the job scheduler. It is 0 when the job is already scheduled.
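The read-modify-write cycle described for `etag` can be sketched as follows. This is a minimal illustration, not the service's client library: the helper names `build_patch_body` and `job_position`, the sample job, and the choice of a label update as the change being patched are all assumptions made for the example.

```python
import copy


def build_patch_body(fetched_job: dict, new_labels: dict) -> dict:
    """Build a jobs.patch request body from a job returned by jobs.get.

    Hypothetical helper: it only assembles the body, it sends nothing.
    """
    body = {"labels": copy.deepcopy(new_labels)}
    # Echo the etag back so the change is applied to the same version
    # of the job that was read.
    if "etag" in fetched_job:
        body["etag"] = fetched_job["etag"]
    return body


def job_position(job: dict) -> int:
    """int64 fields arrive as JSON strings, so convert before comparing."""
    return int(job.get("jobPosition", "0"))


# Sample response shaped like the Job resource (values are made up).
fetched = {
    "jobId": "my_job",
    "state": "QUEUED",
    "etag": "BwWWja0YfJA=",
    "labels": {"team": "a"},
    "jobPosition": "3",
}

patch_body = build_patch_body(fetched, {"team": "b"})
```

Building the body from the freshly fetched job, rather than from a cached copy, is what keeps the `etag` meaningful.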

TrainingInput

Represents input parameters for a training job. When using the gcloud command to submit your training job, you can specify the input parameters as command-line arguments and/or in a YAML configuration file referenced from the --config command-line argument. For details, see the guide to submitting a training job.

JSON representation

{
  "scaleTier": enum (ScaleTier),
  "masterType": string,
  "masterConfig": {
    object (ReplicaConfig)
  },
  "workerType": string,
  "workerConfig": {
    object (ReplicaConfig)
  },
  "parameterServerType": string,
  "parameterServerConfig": {
    object (ReplicaConfig)
  },
  "evaluatorType": string,
  "evaluatorConfig": {
    object (ReplicaConfig)
  },
  "workerCount": string,
  "parameterServerCount": string,
  "evaluatorCount": string,
  "packageUris": [
    string
  ],
  "pythonModule": string,
  "args": [
    string
  ],
  "hyperparameters": {
    object (HyperparameterSpec)
  },
  "region": string,
  "jobDir": string,
  "runtimeVersion": string,
  "pythonVersion": string,
  "encryptionConfig": {
    object (EncryptionConfig)
  },
  "scheduling": {
    object (Scheduling)
  },
  "network": string,
  "serviceAccount": string,
  "useChiefInTfConfig": boolean,
  "enableWebAccess": boolean
}

Fields
`scaleTier`	`enum (ScaleTier)` Required. Specifies the machine types and the number of replicas for workers and parameter servers.
`masterType`	`string` Optional. Specifies the type of virtual machine to use for your training job's master worker. You must specify this field when `scaleTier` is set to `CUSTOM`. You can use certain Compute Engine machine types directly in this field. See the list of compatible Compute Engine machine types. Alternatively, you can use certain legacy machine types in this field. See the list of legacy machine types. Finally, if you want to use a TPU for training, specify `cloud_tpu` in this field. Learn more about the special configuration options for training with TPUs.
`masterConfig`	`object (ReplicaConfig)` Optional. The configuration for your master worker. You should only set `masterConfig.acceleratorConfig` if `masterType` is set to a Compute Engine machine type. Learn about restrictions on accelerator configurations for training. Set `masterConfig.imageUri` only if you build a custom image. Only one of `masterConfig.imageUri` and `runtimeVersion` should be set. Learn more about configuring custom containers.
`workerType`	`string` Optional. Specifies the type of virtual machine to use for your training job's worker nodes. The supported values are the same as those described in the entry for `masterType`. This value must be consistent with the category of machine type that `masterType` uses. In other words, both must be Compute Engine machine types or both must be legacy machine types. If you use `cloud_tpu` for this value, see special instructions for configuring a custom TPU machine. This value must be present when `scaleTier` is set to `CUSTOM` and `workerCount` is greater than zero.
`workerConfig`	`object (ReplicaConfig)` Optional. The configuration for workers. You should only set `workerConfig.acceleratorConfig` if `workerType` is set to a Compute Engine machine type. Learn about restrictions on accelerator configurations for training. Set `workerConfig.imageUri` only if you build a custom image for your worker. If `workerConfig.imageUri` has not been set, AI Platform uses the value of `masterConfig.imageUri`. Learn more about configuring custom containers.
`parameterServerType`	`string` Optional. Specifies the type of virtual machine to use for your training job's parameter server. The supported values are the same as those described in the entry for `masterType`. This value must be consistent with the category of machine type that `masterType` uses. In other words, both must be Compute Engine machine types or both must be legacy machine types. This value must be present when `scaleTier` is set to `CUSTOM` and `parameterServerCount` is greater than zero.
`parameterServerConfig`	`object (ReplicaConfig)` Optional. The configuration for parameter servers. You should only set `parameterServerConfig.acceleratorConfig` if `parameterServerType` is set to a Compute Engine machine type. Learn about restrictions on accelerator configurations for training. Set `parameterServerConfig.imageUri` only if you build a custom image for your parameter server. If `parameterServerConfig.imageUri` has not been set, AI Platform uses the value of `masterConfig.imageUri`. Learn more about configuring custom containers.
`evaluatorType`	`string` Optional. Specifies the type of virtual machine to use for your training job's evaluator nodes. The supported values are the same as those described in the entry for `masterType`. This value must be consistent with the category of machine type that `masterType` uses. In other words, both must be Compute Engine machine types or both must be legacy machine types. This value must be present when `scaleTier` is set to `CUSTOM` and `evaluatorCount` is greater than zero.
`evaluatorConfig`	`object (ReplicaConfig)` Optional. The configuration for evaluators. You should only set `evaluatorConfig.acceleratorConfig` if `evaluatorType` is set to a Compute Engine machine type. Learn about restrictions on accelerator configurations for training. Set `evaluatorConfig.imageUri` only if you build a custom image for your evaluator. If `evaluatorConfig.imageUri` has not been set, AI Platform uses the value of `masterConfig.imageUri`. Learn more about configuring custom containers.
`workerCount`	`string (int64 format)` Optional. The number of worker replicas to use for the training job. Each replica in the cluster will be of the type specified in `workerType`. This value can only be used when `scaleTier` is set to `CUSTOM`. If you set this value, you must also set `workerType`. The default value is zero.
`parameterServerCount`	`string (int64 format)` Optional. The number of parameter server replicas to use for the training job. Each replica in the cluster will be of the type specified in `parameterServerType`. This value can only be used when `scaleTier` is set to `CUSTOM`. If you set this value, you must also set `parameterServerType`. The default value is zero.
`evaluatorCount`	`string (int64 format)` Optional. The number of evaluator replicas to use for the training job. Each replica in the cluster will be of the type specified in `evaluatorType`. This value can only be used when `scaleTier` is set to `CUSTOM`. If you set this value, you must also set `evaluatorType`. The default value is zero.
`packageUris[]`	`string` Required. The Google Cloud Storage location of the packages with the training program and any additional dependencies. The maximum number of package URIs is 100.
`pythonModule`	`string` Required. The Python module name to run after installing the packages.
`args[]`	`string` Optional. Command-line arguments passed to the training application when it starts. If your job uses a custom container, then the arguments are passed to the container's `ENTRYPOINT` command.
`hyperparameters`	`object (HyperparameterSpec)` Optional. The set of Hyperparameters to tune.
`region`	`string` Required. The region to run the training job in. See the available regions for AI Platform Training.
`jobDir`	`string` Optional. A Google Cloud Storage path in which to store training outputs and other data needed for training. This path is passed to your TensorFlow program as the '--job-dir' command-line argument. The benefit of specifying this field is that Cloud ML validates the path for use in training.
`runtimeVersion`	`string` Optional. The AI Platform runtime version to use for training. You must either specify this field or specify `masterConfig.imageUri`. For more information, see the runtime version list and learn how to manage runtime versions.
`pythonVersion`	`string` Optional. The version of Python used in training. You must either specify this field or specify `masterConfig.imageUri`. The following Python versions are available: Python '3.7' is available when `runtimeVersion` is set to '1.15' or later. Python '3.5' is available when `runtimeVersion` is set to a version from '1.4' to '1.14'. Python '2.7' is available when `runtimeVersion` is set to '1.15' or earlier. Read more about the Python versions available for each runtime version.
`encryptionConfig`	`object (EncryptionConfig)` Optional. Options for using customer-managed encryption keys (CMEK) to protect resources created by a training job, instead of using Google's default encryption. If this is set, then all resources created by the training job will be encrypted with the customer-managed encryption key that you specify. Learn how and when to use CMEK with AI Platform Training.
`scheduling`	`object (Scheduling)` Optional. Scheduling options for a training job.
`network`	`string` Optional. The full name of the Compute Engine network to which the Job is peered. For example, `projects/12345/global/networks/myVPC`. The format of this field is `projects/{project}/global/networks/{network}`, where {project} is a project number (like `12345`) and {network} is a network name. Private services access must already be configured for the network. If left unspecified, the Job is not peered with any network. Learn about using VPC Network Peering.
`serviceAccount`	`string` Optional. The email address of a service account to use when running the training application. You must have the `iam.serviceAccounts.actAs` permission for the specified service account. In addition, the AI Platform Training Google-managed service account must have the `roles/iam.serviceAccountAdmin` role for the specified service account. Learn more about configuring a service account. If not specified, the AI Platform Training Google-managed service account is used by default.
`useChiefInTfConfig`	`boolean` Optional. Use `chief` instead of `master` in the `TF_CONFIG` environment variable when training with a custom container. Defaults to `false`. Learn more about this field. This field has no effect for training jobs that don't use a custom container.
`enableWebAccess`	`boolean` Optional. Whether you want AI Platform Training to enable interactive shell access to training containers. If set to `true`, you can access interactive shells at the URIs given by `TrainingOutput.web_access_uris` or `HyperparameterOutput.web_access_uris` (within `TrainingOutput.trials`).

ScaleTier

A scale tier is an abstract representation of the resources Cloud ML will allocate to a training job. When selecting a scale tier for your training job, you should consider the size of your training dataset and the complexity of your model. As the tiers increase, virtual machines are added to handle your job, and the individual machines in the cluster generally have more memory and greater processing power than they do at lower tiers. The number of training units charged per hour of processing increases as tiers get more advanced. Refer to the pricing guide for more details. Note that in addition to incurring costs, your use of training resources is constrained by the quota policy.

Enums
`BASIC`	A single worker instance. This tier is suitable for learning how to use Cloud ML, and for experimenting with new models using small datasets.
`STANDARD_1`	Many workers and a few parameter servers.
`PREMIUM_1`	A large number of workers with many parameter servers.
`BASIC_GPU`	A single worker instance with a GPU.
`BASIC_TPU`	A single worker instance with a Cloud TPU.
`CUSTOM`	The CUSTOM tier is not a set tier, but rather enables you to use your own cluster specification. When you use this tier, set values to configure your processing cluster according to these guidelines: You must set `TrainingInput.masterType` to specify the type of machine to use for your master node. This is the only required setting. You may set `TrainingInput.workerCount` to specify the number of workers to use. If you specify one or more workers, you must also set `TrainingInput.workerType` to specify the type of machine to use for your worker nodes. You may set `TrainingInput.parameterServerCount` to specify the number of parameter servers to use. If you specify one or more parameter servers, you must also set `TrainingInput.parameterServerType` to specify the type of machine to use for your parameter servers. Note that all of your workers must use the same machine type, which can be different from your parameter server type and master type. Your parameter servers must likewise use the same machine type, which can be different from your worker type and master type.
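The `CUSTOM` tier guidelines, together with the rules stated for the `*Count` and `*Type` fields of `TrainingInput`, can be checked before a job is submitted. The sketch below is an assumption-laden illustration: `validate_cluster` is a hypothetical helper, the machine type names in the example are placeholders, and the service itself may apply additional checks.

```python
def validate_cluster(training_input: dict) -> list:
    """Return a list of problems found in the cluster-related fields."""
    errors = []
    custom = training_input.get("scaleTier") == "CUSTOM"

    # masterType is the only setting that CUSTOM always requires.
    if custom and not training_input.get("masterType"):
        errors.append("masterType is required when scaleTier is CUSTOM")

    for role in ("worker", "parameterServer", "evaluator"):
        # Counts are int64 values serialized as strings; the default is zero.
        count = int(training_input.get(f"{role}Count", "0"))
        if count == 0:
            continue
        if not custom:
            errors.append(f"{role}Count can only be used when scaleTier is CUSTOM")
        elif not training_input.get(f"{role}Type"):
            errors.append(f"{role}Type is required when {role}Count is greater than zero")
    return errors


# Placeholder machine type names, used only for illustration.
good = {
    "scaleTier": "CUSTOM",
    "masterType": "n1-standard-8",
    "workerType": "n1-standard-8",
    "workerCount": "4",
}
bad = {"scaleTier": "CUSTOM", "workerCount": "2"}

good_errors = validate_cluster(good)
bad_errors = validate_cluster(bad)
```

Returning a list of messages instead of raising on the first problem lets a caller report every misconfiguration at once.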

ReplicaConfig

Represents the configuration for a replica in a cluster.

JSON representation

{
  "acceleratorConfig": {
    object (AcceleratorConfig)
  },
  "imageUri": string,
  "tpuTfVersion": string,
  "diskConfig": {
    object (DiskConfig)
  },
  "containerCommand": [
    string
  ],
  "containerArgs": [
    string
  ]
}

Fields
`acceleratorConfig`	`object (AcceleratorConfig)` Represents the type and number of accelerators used by the replica. Learn about restrictions on accelerator configurations for training.
`imageUri`	`string` The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers.
`tpuTfVersion`	`string` The AI Platform runtime version that includes a TensorFlow version matching the one used in the custom container. This field is required if the replica is a TPU worker that uses a custom container. Otherwise, do not specify this field. This must be a runtime version that currently supports training with TPUs. Note that the version of TensorFlow included in a runtime version may differ from the numbering of the runtime version itself, because it may have a different patch version. In this field, you must specify the runtime version (TensorFlow minor version). For example, if your custom container runs TensorFlow `1.x.y`, specify `1.x`.
`diskConfig`	`object (DiskConfig)` Represents the configuration of disk options.
`containerCommand[]`	`string` The command with which the replica's custom container is run. If provided, it overrides the default ENTRYPOINT of the Docker image. If not provided, the Docker image's ENTRYPOINT is used. It cannot be set if a custom container image is not provided. Note that this field and `TrainingInput.args` are mutually exclusive; both cannot be set at the same time.
`containerArgs[]`	`string` Arguments to the entrypoint command (`containerCommand`). The following rules apply for `containerCommand` and `containerArgs`: - If you do not supply command or args: The defaults defined in the Docker image are used. - If you supply a command but no args: The default ENTRYPOINT and the default CMD defined in the Docker image are ignored. Your command is run without any arguments. - If you supply only args: The default ENTRYPOINT defined in the Docker image is run with the args that you supplied. - If you supply a command and args: The default ENTRYPOINT and the default CMD defined in the Docker image are ignored. Your command is run with your args. It cannot be set if a custom container image is not provided. Note that this field and `TrainingInput.args` are mutually exclusive; both cannot be set at the same time.
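The four rules for `containerCommand` and `containerArgs` can be expressed as a small function. This is only a sketch of the resolution logic described above: `effective_command` is a hypothetical name, and the image defaults passed in are made-up values.

```python
def effective_command(image_entrypoint, image_cmd,
                      container_command=None, container_args=None):
    """Return the argv that runs, following the four documented rules."""
    command = list(container_command or [])
    args = list(container_args or [])

    if not command and not args:
        # Neither supplied: the image defaults are used.
        return list(image_entrypoint) + list(image_cmd)
    if command and not args:
        # Command only: image ENTRYPOINT and CMD are ignored.
        return command
    if args and not command:
        # Args only: image ENTRYPOINT runs with the supplied args.
        return list(image_entrypoint) + args
    # Both supplied: image ENTRYPOINT and CMD are ignored.
    return command + args


# Hypothetical image defaults.
ENTRYPOINT = ["python", "-m", "trainer.task"]
CMD = ["--epochs=1"]

defaults = effective_command(ENTRYPOINT, CMD)
only_args = effective_command(ENTRYPOINT, CMD, container_args=["--epochs=10"])
```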

AcceleratorConfig

Represents a hardware accelerator request config. Note that the AcceleratorConfig can be used in both Jobs and Versions. Learn more about accelerators for training and accelerators for online prediction.

JSON representation
{
  "count": string,
  "type": enum (AcceleratorType)
}

Fields
`count`	`string (int64 format)` The number of accelerators to attach to each machine running the job.
`type`	`enum (AcceleratorType)` The type of accelerator to use.

DiskConfig

Represents the config of disk options.

JSON representation
{
  "bootDiskType": string,
  "bootDiskSizeGb": integer
}

Fields
`bootDiskType`	`string` Type of the boot disk (default is "pd-ssd"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive).
`bootDiskSizeGb`	`integer` Size in GB of the boot disk (default is 100GB).

HyperparameterSpec

Represents a set of hyperparameters to optimize.

JSON representation

{
  "goal": enum (GoalType),
  "params": [
    {
      object (ParameterSpec)
    }
  ],
  "maxTrials": integer,
  "maxParallelTrials": integer,
  "maxFailedTrials": integer,
  "hyperparameterMetricTag": string,
  "resumePreviousJobId": string,
  "enableTrialEarlyStopping": boolean,
  "algorithm": enum (Algorithm)
}

Fields
`goal`	`enum (GoalType)` Required. The type of goal to use for tuning. Available types are `MAXIMIZE` and `MINIMIZE`. Defaults to `MAXIMIZE`.
`params[]`	`object (ParameterSpec)` Required. The set of parameters to tune.
`maxTrials`	`integer` Optional. How many training trials should be attempted to optimize the specified hyperparameters. Defaults to one.
`maxParallelTrials`	`integer` Optional. The number of training trials to run concurrently. You can reduce the time it takes to perform hyperparameter tuning by adding trials in parallel. However, each trial only benefits from the information gained in completed trials. That means that a trial does not get access to the results of trials running at the same time, which could reduce the quality of the overall optimization. Each trial will use the same scale tier and machine types. Defaults to one.
`maxFailedTrials`	`integer` Optional. The number of failed trials that need to be seen before failing the hyperparameter tuning job. You can specify this field to override the default failing criteria for AI Platform hyperparameter tuning jobs. Defaults to zero, which means the service decides when a hyperparameter job should fail.
`hyperparameterMetricTag`	`string` Optional. The TensorFlow summary tag name to use for optimizing trials. For current versions of TensorFlow, this tag name should exactly match what is shown in TensorBoard, including all scopes. For versions of TensorFlow prior to 0.12, this should be only the tag passed to tf.Summary. By default, "training/hptuning/metric" will be used.
`resumePreviousJobId`	`string` Optional. The ID of a prior hyperparameter tuning job to continue from. The job ID is used to find the corresponding Vertex AI Vizier study GUID and resume the study.
`enableTrialEarlyStopping`	`boolean` Optional. Indicates if the hyperparameter tuning job enables auto trial early stopping.
`algorithm`	`enum (Algorithm)` Optional. The search algorithm specified for the hyperparameter tuning job. Uses the default AI Platform hyperparameter tuning algorithm if unspecified.
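As an illustration of how these fields fit together, the sketch below builds a `HyperparameterSpec` as a Python dictionary and estimates how many sequential waves of trials it implies. The metric tag, parameter names, and ranges are invented for the example, and `sequential_waves` is a hypothetical helper; the wave count is only a rough guide, because trials need not start and finish in lockstep.

```python
import math

# Example spec; the tag, names, and ranges are illustrative only.
hyperparameters = {
    "goal": "MINIMIZE",
    "hyperparameterMetricTag": "val_loss",
    "maxTrials": 20,
    "maxParallelTrials": 5,
    "enableTrialEarlyStopping": True,
    "params": [
        {
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",
        },
        {
            "parameterName": "batch_size",
            "type": "DISCRETE",
            "discreteValues": [16, 32, 64],
        },
    ],
}


def sequential_waves(spec: dict) -> int:
    """Rough number of back-to-back batches of trials.

    Both fields default to one when unset, as documented above.
    """
    max_trials = spec.get("maxTrials", 1)
    parallel = spec.get("maxParallelTrials", 1)
    return math.ceil(max_trials / parallel)


waves = sequential_waves(hyperparameters)
```

Fewer waves means a shorter wall-clock time, but also fewer points at which new trials can learn from completed ones, which is the trade-off noted for `maxParallelTrials`.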

GoalType

The available types of optimization goals.

Enums
`GOAL_TYPE_UNSPECIFIED`	Goal Type will default to maximize.
`MAXIMIZE`	Maximize the goal metric.
`MINIMIZE`	Minimize the goal metric.

ParameterSpec

Represents a single hyperparameter to optimize.

JSON representation
{
  "parameterName": string,
  "type": enum (ParameterType),
  "minValue": number,
  "maxValue": number,
  "categoricalValues": [
    string
  ],
  "discreteValues": [
    number
  ],
  "scaleType": enum (ScaleType)
}

Fields
`parameterName`	`string` Required. The parameter name must be unique amongst all ParameterConfigs in a HyperparameterSpec message. E.g., "learning_rate".
`type`	`enum (ParameterType)` Required. The type of the parameter.
`minValue`	`number` Required if type is `DOUBLE` or `INTEGER`. This field should be unset if type is `CATEGORICAL`. This value should be an integer if type is `INTEGER`.
`maxValue`	`number` Required if type is `DOUBLE` or `INTEGER`. This field should be unset if type is `CATEGORICAL`. This value should be an integer if type is `INTEGER`.
`categoricalValues[]`	`string` Required if type is `CATEGORICAL`. The list of possible categories.
`discreteValues[]`	`number` Required if type is `DISCRETE`. A list of feasible points. The list should be in strictly increasing order. For instance, this parameter might have possible settings of 1.5, 2.5, and 4.0. This list should not contain more than 1,000 values.
`scaleType`	`enum (ScaleType)` Optional. How the parameter should be scaled to the hypercube. Leave unset for categorical parameters. Some kind of scaling is strongly recommended for real or integral parameters (e.g., `UNIT_LINEAR_SCALE`).

ParameterType

The type of the parameter.

Enums
`PARAMETER_TYPE_UNSPECIFIED`	You must specify a valid type. Using this unspecified type will result in an error.
`DOUBLE`	Type for real-valued parameters.
`INTEGER`	Type for integral parameters.
`CATEGORICAL`	The parameter is categorical, with a value chosen from the categories field.
`DISCRETE`	The parameter is real valued, with a fixed set of feasible points. If `type==DISCRETE`, `discreteValues` must be provided, and {`minValue`, `maxValue`} will be ignored.
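The per-type requirements for a `ParameterSpec` can be collected into one check. The following is a minimal sketch under the rules stated in this section; `check_parameter_spec` is a hypothetical helper and does not reproduce any validation the service performs beyond what is documented here.

```python
def check_parameter_spec(spec: dict) -> list:
    """Return a list of rule violations for a single ParameterSpec."""
    errors = []
    ptype = spec.get("type")

    if not spec.get("parameterName"):
        errors.append("parameterName is required")

    if ptype in ("DOUBLE", "INTEGER"):
        for key in ("minValue", "maxValue"):
            if key not in spec:
                errors.append(f"{key} is required for {ptype}")
            elif ptype == "INTEGER" and spec[key] != int(spec[key]):
                errors.append(f"{key} should be an integer for INTEGER")
    elif ptype == "CATEGORICAL":
        if not spec.get("categoricalValues"):
            errors.append("categoricalValues is required for CATEGORICAL")
        for key in ("minValue", "maxValue"):
            if key in spec:
                errors.append(f"{key} should be unset for CATEGORICAL")
    elif ptype == "DISCRETE":
        values = spec.get("discreteValues", [])
        if not values:
            errors.append("discreteValues is required for DISCRETE")
        if any(b <= a for a, b in zip(values, values[1:])):
            errors.append("discreteValues should be strictly increasing")
        if len(values) > 1000:
            errors.append("discreteValues should not exceed 1,000 values")
    else:
        errors.append("type must be DOUBLE, INTEGER, CATEGORICAL, or DISCRETE")
    return errors


ok = check_parameter_spec(
    {"parameterName": "dropout", "type": "DISCRETE", "discreteValues": [1.5, 2.5, 4.0]}
)
```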

ScaleType

The type of scaling that should be applied to this parameter.

Enums
`NONE`	By default, no scaling is applied.
`UNIT_LINEAR_SCALE`	Scales the feasible space to (0, 1) linearly.
`UNIT_LOG_SCALE`	Scales the feasible space logarithmically to (0, 1). The entire feasible space must be strictly positive.
`UNIT_REVERSE_LOG_SCALE`	Scales the feasible space "reverse" logarithmically to (0, 1). The result is that values close to the top of the feasible space are spread out more than points near the bottom. The entire feasible space must be strictly positive.
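To make the scale types concrete, the sketch below maps a value from its feasible range onto the unit interval. The linear and logarithmic mappings are the standard ones. The reverse logarithmic formula is an assumption chosen to match the description above (more resolution near the top of the range); the source does not give the exact formula the service uses, and the function names are hypothetical.

```python
import math


def unit_linear(x: float, lo: float, hi: float) -> float:
    """Linear map of [lo, hi] onto [0, 1]."""
    return (x - lo) / (hi - lo)


def unit_log(x: float, lo: float, hi: float) -> float:
    """Logarithmic map; the whole range must be strictly positive."""
    if lo <= 0:
        raise ValueError("feasible space must be strictly positive")
    return math.log(x / lo) / math.log(hi / lo)


def unit_reverse_log(x: float, lo: float, hi: float) -> float:
    """Assumed 'reverse' log map: spreads out values near hi."""
    if lo <= 0:
        raise ValueError("feasible space must be strictly positive")
    return 1.0 - math.log((hi + lo - x) / lo) / math.log(hi / lo)


# A learning rate of 0.01 in the range [0.0001, 0.1].
linear_pos = unit_linear(0.01, 0.0001, 0.1)
log_pos = unit_log(0.01, 0.0001, 0.1)
```

On a linear scale the value 0.01 sits near the bottom of this range, while on a log scale it sits two thirds of the way up, which is why log scaling is often preferred for parameters that span several orders of magnitude.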

Algorithm

The available search algorithms for hyperparameter tuning. Learn more about these algorithms.

Enums
`ALGORITHM_UNSPECIFIED`	The default algorithm used by the hyperparameter tuning service. This is a Bayesian optimization algorithm.
`GRID_SEARCH`	Simple grid search within the feasible space. To use grid search, all parameters must be `INTEGER`, `CATEGORICAL`, or `DISCRETE`.
`RANDOM_SEARCH`	Simple random search within the feasible space.

EncryptionConfig

Represents a custom encryption key configuration that can be applied to a resource.

JSON representation
{
  "kmsKeyName": string
}

Fields
`kmsKeyName`	`string` The Cloud KMS resource identifier of the customer-managed encryption key used to protect a resource, such as a training job. It has the following format: `projects/{PROJECT_ID}/locations/{REGION}/keyRings/{KEY_RING_NAME}/cryptoKeys/{KEY_NAME}`
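A key name in this format can be assembled and sanity-checked before it is placed in an `EncryptionConfig`. The sketch below is illustrative: the helper names and the sample project, region, key ring, and key are made up, and the regular expression checks only the path shape, not whether the key exists.

```python
import re

# Matches the documented path shape; each segment must be non-empty.
_KMS_KEY_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


def kms_key_name(project_id: str, region: str, key_ring: str, key: str) -> str:
    """Assemble the Cloud KMS resource identifier."""
    return (
        f"projects/{project_id}/locations/{region}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def encryption_config(name: str) -> dict:
    """Wrap a key name in an EncryptionConfig after checking its shape."""
    if not _KMS_KEY_PATTERN.match(name):
        raise ValueError(f"unexpected kmsKeyName format: {name!r}")
    return {"kmsKeyName": name}


# Sample values for illustration only.
config = encryption_config(
    kms_key_name("my-project", "us-central1", "my-ring", "my-key")
)
```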

Scheduling

All parameters related to scheduling of training jobs.

JSON representation
{
  "maxRunningTime": string,
  "maxWaitTime": string,
  "priority": integer
}

Fields

maxRunningTime

string (Duration format)

Optional. The maximum job running time, expressed in seconds. The field can contain up to nine fractional digits, terminated by s. If not specified, this field defaults to 604800s (seven days).

If the training job is still running after this duration, AI Platform Training cancels it. The duration is measured from when the job enters the RUNNING state; therefore it does not overlap with the duration limited by Scheduling.max_wait_time.

For example, if you want to ensure your job runs for no more than 2 hours, set this field to 7200s (2 hours * 60 minutes / hour * 60 seconds / minute).

If you submit your training job using the gcloud tool, you can specify this field in a config.yaml file. For example:

trainingInput:
  scheduling:
    maxRunningTime: 7200s

maxWaitTime

string (Duration format)

Optional. The maximum job wait time, expressed in seconds. The field can contain up to nine fractional digits, terminated by s. If not specified, there is no limit to the wait time. The minimum for this field is 1800s (30 minutes).

If the training job has not entered the RUNNING state after this duration, AI Platform Training cancels it. After the job begins running, it can no longer be cancelled due to the maximum wait time. Therefore the duration limited by this field does not overlap with the duration limited by Scheduling.max_running_time.

For example, if the job temporarily stops running and retries due to a VM restart, this cannot lead to a maximum wait time cancellation. However, independently of this constraint, AI Platform Training might stop a job if there are too many retries due to exhausted resources in a region.

The following example describes how you might use this field: To cancel your job if it doesn't start running within 1 hour, set this field to 3600s (1 hour * 60 minutes / hour * 60 seconds / minute). If the job is still in the QUEUED or PREPARING state after an hour of waiting, AI Platform Training cancels the job.

If you submit your training job using the gcloud tool, you can specify this field in a config.yaml file. For example:

trainingInput:
  scheduling:
    maxWaitTime: 3600s
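
The limits described above can be checked before a job is submitted. The following Python sketch is illustrative only: `duration_string` and `build_scheduling` are hypothetical helper names, not part of any Google client library. It assumes durations are passed as seconds and produces a dictionary shaped like the `scheduling` block shown in the config.yaml examples.

```python
def duration_string(seconds):
    """Format seconds in the Duration format, e.g. 7200 -> "7200s"."""
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    # Keep at most nine fractional digits, then drop trailing zeros.
    text = f"{seconds:.9f}".rstrip("0").rstrip(".")
    return f"{text}s"


def build_scheduling(max_running_seconds=None, max_wait_seconds=None, priority=None):
    """Return a dict shaped like Scheduling, enforcing the documented limits."""
    scheduling = {}
    if max_running_seconds is not None:
        scheduling["maxRunningTime"] = duration_string(max_running_seconds)
    if max_wait_seconds is not None:
        # The documented minimum for maxWaitTime is 1800s (30 minutes).
        if max_wait_seconds < 1800:
            raise ValueError("maxWaitTime must be at least 1800s")
        scheduling["maxWaitTime"] = duration_string(max_wait_seconds)
    if priority is not None:
        # The documented priority range is [0, 1000].
        if not 0 <= priority <= 1000:
            raise ValueError("priority must be in the range [0, 1000]")
        scheduling["priority"] = priority
    return scheduling


# Run for at most 2 hours, wait at most 1 hour before starting.
config = {
    "trainingInput": {
        "scheduling": build_scheduling(
            max_running_seconds=2 * 60 * 60,
            max_wait_seconds=1 * 60 * 60,
        )
    }
}
print(config)
```

Validating locally only catches the limits stated on this page; the service may apply additional checks.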

State

Describes the job state.

Enums
`STATE_UNSPECIFIED`	The job state is unspecified.
`QUEUED`	The job has just been created and processing has not yet begun.
`PREPARING`	The service is preparing to run the job.
`RUNNING`	The job is in progress.
`SUCCEEDED`	The job completed successfully.
`FAILED`	The job failed. `errorMessage` should contain the details of the failure.
`CANCELLING`	The job is being cancelled. `errorMessage` should describe the reason for the cancellation.
`CANCELLED`	The job has been cancelled. `errorMessage` should describe the reason for the cancellation.
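
A client that waits for a job usually needs to know which states are final. The Python sketch below assumes that `SUCCEEDED`, `FAILED`, and `CANCELLED` are the terminal states, which this page implies but does not state outright. `wait_for_job`, `describe_outcome`, and the `fetch_job` callable are hypothetical names; `fetch_job` stands in for whatever call returns the Job resource as a dictionary.

```python
import time

# Assumption: these three states are final; all others may still change.
TERMINAL_STATES = frozenset({"SUCCEEDED", "FAILED", "CANCELLED"})


def wait_for_job(fetch_job, poll_seconds=30.0, sleep=time.sleep):
    """Call fetch_job() repeatedly until the job reaches a terminal state."""
    while True:
        job = fetch_job()
        if job.get("state") in TERMINAL_STATES:
            return job
        sleep(poll_seconds)


def describe_outcome(job):
    """Summarize the final state, including errorMessage when relevant."""
    state = job.get("state", "STATE_UNSPECIFIED")
    if state in ("FAILED", "CANCELLED"):
        return f"{state}: {job.get('errorMessage', 'no details provided')}"
    return state


# Simulated responses in place of real API calls.
responses = iter([
    {"state": "QUEUED"},
    {"state": "RUNNING"},
    {"state": "FAILED", "errorMessage": "out of memory"},
])
final = wait_for_job(lambda: next(responses), poll_seconds=0, sleep=lambda _: None)
print(describe_outcome(final))  # prints "FAILED: out of memory"
```

Passing `sleep` as a parameter keeps the loop testable without real delays.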

TrainingOutput

Represents results of a training job. Output only.

JSON representation

{
  "completedTrialCount": string,
  "trials": [
    {
      object (HyperparameterOutput)
    }
  ],
  "consumedMLUnits": number,
  "isHyperparameterTuningJob": boolean,
  "isBuiltInAlgorithmJob": boolean,
  "builtInAlgorithmOutput": {
    object (BuiltInAlgorithmOutput)
  },
  "hyperparameterMetricTag": string,
  "webAccessUris": {
    string: string,
    ...
  }
}

Fields
`completedTrialCount`	`string (int64 format)` The number of hyperparameter tuning trials that completed successfully. Only set for hyperparameter tuning jobs.
`trials[]`	`object (HyperparameterOutput)` Results for individual hyperparameter tuning trials. Only set for hyperparameter tuning jobs.
`consumedMLUnits`	`number` The amount of ML units consumed by the job.
`isHyperparameterTuningJob`	`boolean` Whether this job is a hyperparameter tuning job.
`isBuiltInAlgorithmJob`	`boolean` Whether this job is a built-in algorithm job.
`builtInAlgorithmOutput`	`object (BuiltInAlgorithmOutput)` Details related to built-in algorithm jobs. Only set for built-in algorithm jobs.
`hyperparameterMetricTag`	`string` The TensorFlow summary tag name used for optimizing hyperparameter tuning trials. See `HyperparameterSpec.hyperparameterMetricTag` for more information. Only set for hyperparameter tuning jobs.
`webAccessUris`	`map (key: string, value: string)` Output only. URIs for accessing interactive shells (one URI for each training node). Only available if `trainingInput.enable_web_access` is `true`. The keys are names of each node in the training job; for example, `master-replica-0` for the master node, `worker-replica-0` for the first worker, and `ps-replica-0` for the first parameter server. The values are the URIs for each node's interactive shell. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
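
Because `completedTrialCount` uses the int64 format, it arrives as a JSON string rather than a number, and fields that are "only set" for certain job types may be absent. The following Python sketch shows one way to read a `TrainingOutput` dictionary defensively. `summarize_training_output` is a hypothetical helper, and the sample values (including the example.com URI) are invented for illustration.

```python
def summarize_training_output(output):
    """Extract commonly used TrainingOutput fields with safe defaults."""
    return {
        # int64 values are encoded as strings in JSON, so convert explicitly.
        "completed_trials": int(output.get("completedTrialCount", "0")),
        "ml_units": output.get("consumedMLUnits", 0.0),
        "is_tuning_job": output.get("isHyperparameterTuningJob", False),
        # Keys are node names such as master-replica-0.
        "shell_uris": dict(output.get("webAccessUris", {})),
    }


# Invented sample data, not real service output.
sample = {
    "completedTrialCount": "4",
    "consumedMLUnits": 1.25,
    "isHyperparameterTuningJob": True,
    "webAccessUris": {"master-replica-0": "https://example.com/shell"},
}
summary = summarize_training_output(sample)
print(summary["completed_trials"] + 1)  # prints 5
```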

HyperparameterOutput

Represents the result of a single hyperparameter tuning trial from a training job. The TrainingOutput object that is returned on successful completion of a training job with hyperparameter tuning includes a list of HyperparameterOutput objects, one for each successful trial.

JSON representation

{
  "trialId": string,
  "hyperparameters": {
    string: string,
    ...
  },
  "startTime": string,
  "endTime": string,
  "state": enum (State),
  "finalMetric": {
    object (HyperparameterMetric)
  },
  "isTrialStoppedEarly": boolean,
  "allMetrics": [
    {
      object (HyperparameterMetric)
    }
  ],
  "builtInAlgorithmOutput": {
    object (BuiltInAlgorithmOutput)
  },
  "webAccessUris": {
    string: string,
    ...
  }
}

Fields
`trialId`	`string` The trial id for these results.
`hyperparameters`	`map (key: string, value: string)` The hyperparameters given to this trial. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`startTime`	`string (Timestamp format)` Output only. Start time for the trial.
`endTime`	`string (Timestamp format)` Output only. End time for the trial.
`state`	`enum (State)` Output only. The detailed state of the trial.
`finalMetric`	`object (HyperparameterMetric)` The final objective metric seen for this trial.
`isTrialStoppedEarly`	`boolean` True if the trial is stopped early.
`allMetrics[]`	`object (HyperparameterMetric)` All recorded objective metrics for this trial. This field is not currently populated.
`builtInAlgorithmOutput`	`object (BuiltInAlgorithmOutput)` Details related to built-in algorithms jobs. Only set for trials of built-in algorithms jobs that have succeeded.
`webAccessUris`	`map (key: string, value: string)` URIs for accessing interactive shells (one URI for each training node). Only available if this trial is part of a hyperparameter tuning job and the job's `trainingInput.enable_web_access` is `true`. The keys are names of each node in the training job; for example, `master-replica-0` for the master node, `worker-replica-0` for the first worker, and `ps-replica-0` for the first parameter server. The values are the URIs for each node's interactive shell. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
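
A common use of the `trials` list is to pick the best trial by `finalMetric.objectiveValue`. The Python sketch below assumes the tuning goal is either `MAXIMIZE` or `MINIMIZE` (the `GoalType` values) and that trials without a final metric should be skipped. `best_trial` is a hypothetical helper, and the trial data is invented.

```python
def best_trial(trials, goal="MAXIMIZE"):
    """Return the trial with the best finalMetric.objectiveValue, or None."""
    if goal not in ("MAXIMIZE", "MINIMIZE"):
        raise ValueError("goal must be MAXIMIZE or MINIMIZE")
    # Skip trials that never reported a final objective value.
    scored = [t for t in trials if "objectiveValue" in t.get("finalMetric", {})]
    if not scored:
        return None

    def objective(trial):
        return trial["finalMetric"]["objectiveValue"]

    return max(scored, key=objective) if goal == "MAXIMIZE" else min(scored, key=objective)


# Invented sample data shaped like HyperparameterOutput objects.
trials = [
    {"trialId": "1", "hyperparameters": {"lr": "0.1"},
     "finalMetric": {"trainingStep": "1000", "objectiveValue": 0.81}},
    {"trialId": "2", "hyperparameters": {"lr": "0.01"},
     "finalMetric": {"trainingStep": "1000", "objectiveValue": 0.93}},
    {"trialId": "3", "hyperparameters": {"lr": "1.0"}},  # no final metric
]
print(best_trial(trials)["trialId"])              # prints 2
print(best_trial(trials, "MINIMIZE")["trialId"])  # prints 1
```

Note that the values in `hyperparameters` are strings, so numeric parameters need converting before any arithmetic.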

HyperparameterMetric

An observed value of a metric.

JSON representation
{
  "trainingStep": string,
  "objectiveValue": number
}

Fields
`trainingStep`	`string (int64 format)` The global training step for this metric.
`objectiveValue`	`number` The objective value at this training step.

BuiltInAlgorithmOutput

Represents output related to a built-in algorithm job.

JSON representation
{
  "framework": string,
  "runtimeVersion": string,
  "pythonVersion": string,
  "modelPath": string
}

Fields
`framework`	`string` Framework on which the built-in algorithm was trained.
`runtimeVersion`	`string` AI Platform runtime version on which the built-in algorithm was trained.
`pythonVersion`	`string` Python version on which the built-in algorithm was trained.
`modelPath`	`string` The Cloud Storage path to the `model/` directory where the training job saves the trained model. Only set for successful jobs that don't use hyperparameter tuning.

Methods
`cancel`	Cancels a running job.
`create`	Creates a training or a batch prediction job.
`get`	Describes a job.
`getIamPolicy`	Gets the access control policy for a resource.
`list`	Lists the jobs in the project.
`patch`	Updates a specific job resource.
`setIamPolicy`	Sets the access control policy on the specified resource.
`testIamPermissions`	Returns permissions that a caller has on the specified resource.
