REST Resource: projects.locations.nodes

Resource: Node

A TPU instance.

JSON representation
{
  "name": string,
  "description": string,
  "acceleratorType": string,
  "state": enum (State),
  "healthDescription": string,
  "runtimeVersion": string,
  "networkConfig": {
    object (NetworkConfig)
  },
  "cidrBlock": string,
  "serviceAccount": {
    object (ServiceAccount)
  },
  "createTime": string,
  "schedulingConfig": {
    object (SchedulingConfig)
  },
  "networkEndpoints": [
    {
      object (NetworkEndpoint)
    }
  ],
  "health": enum (Health),
  "labels": {
    string: string,
    ...
  },
  "metadata": {
    string: string,
    ...
  },
  "tags": [
    string
  ],
  "id": string,
  "dataDisks": [
    {
      object (AttachedDisk)
    }
  ],
  "apiVersion": enum (ApiVersion),
  "symptoms": [
    {
      object (Symptom)
    }
  ]
}
Fields
name

string

Output only. Immutable. The name of the TPU.

description

string

The user-supplied description of the TPU. Maximum of 512 characters.

acceleratorType

string

Required. The type of hardware accelerators associated with this node.

state

enum (State)

Output only. The current state for the TPU Node.

healthDescription

string

Output only. If this field is populated, it contains a description of why the TPU Node is unhealthy.

runtimeVersion

string

Required. The runtime version running in the Node.

networkConfig

object (NetworkConfig)

Network configurations for the TPU node.

cidrBlock

string

The CIDR block that the TPU node will use when selecting an IP address. This CIDR block must be a /29 block; the Compute Engine networks API forbids a smaller block, and using a larger block would be wasteful (a node can only consume one IP address). Errors will occur if the CIDR block has already been used for a currently existing TPU node, the CIDR block conflicts with any subnetworks in the user's provided network, or the provided network is peered with another network that is using that CIDR block.
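
As a rough illustration of the /29 requirement, the sketch below uses Python's standard `ipaddress` module to pre-check a candidate block on the client side. The function name and the sample ranges are hypothetical and not part of the TPU API. The service performs its own validation and may reject a block for reasons this check cannot see, such as a conflict in a peered network.

```python
import ipaddress


def check_cidr_block(cidr_block, existing_ranges=()):
    """Return a list of problems found with a candidate TPU cidrBlock.

    Hypothetical client-side pre-check; not part of the TPU API.
    """
    problems = []
    try:
        # strict=True rejects blocks with host bits set, e.g. 10.2.0.1/29.
        net = ipaddress.ip_network(cidr_block, strict=True)
    except ValueError as exc:
        return [f"not a valid CIDR block: {exc}"]
    if net.prefixlen != 29:
        problems.append(f"prefix length is /{net.prefixlen}, expected /29")
    for other in existing_ranges:
        # existing_ranges stands in for subnetwork ranges the caller knows about.
        if net.overlaps(ipaddress.ip_network(other)):
            problems.append(f"overlaps existing range {other}")
    return problems


print(check_cidr_block("10.2.0.0/29"))
print(check_cidr_block("10.2.0.0/28", ["10.2.0.0/16"]))
```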

serviceAccount

object (ServiceAccount)

The Google Cloud Platform service account to be used by the TPU node VMs. If none is specified, the default Compute service account will be used.

createTime

string (Timestamp format)

Output only. The time when the node was created.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
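
Python's `datetime` stores at most microseconds, so a timestamp with nine fractional digits loses precision if parsed directly. The following sketch, which handles only the "Z"-suffixed form described here, returns the nanosecond count separately. The function name is an illustrative choice.

```python
import re
from datetime import datetime, timezone

# Matches only the UTC "Zulu" form with zero to nine fractional digits.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_zulu_timestamp(value):
    """Parse a Zulu timestamp into (datetime, nanoseconds).

    The datetime is truncated to microseconds; the second element keeps
    the full fractional part as an integer number of nanoseconds.
    """
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"unsupported timestamp: {value!r}")
    base, fraction = match.groups()
    # Right-pad so that ".045" is read as 45,000,000 ns, not 45 ns.
    nanos = int((fraction or "").ljust(9, "0"))
    moment = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc, microsecond=nanos // 1000
    )
    return moment, nanos


print(parse_zulu_timestamp("2014-10-02T15:01:23.045123456Z"))
```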

schedulingConfig

object (SchedulingConfig)

The scheduling options for this node.

networkEndpoints[]

object (NetworkEndpoint)

Output only. The network endpoints where TPU workers can be accessed and sent work. It is recommended that runtime clients of the node reach out to the 0th entry in this list first.

health

enum (Health)

The health status of the TPU node.

labels

map (key: string, value: string)

Resource labels to represent user-provided metadata.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

metadata

map (key: string, value: string)

Custom metadata to apply to the TPU Node. Can be used to set startup-script and shutdown-script.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

tags[]

string

Tags to apply to the TPU Node. Tags are used to identify valid sources or targets for network firewalls.

id

string (int64 format)

Output only. The unique identifier for the TPU Node.

dataDisks[]

object (AttachedDisk)

The additional data disks for the Node.

apiVersion

enum (ApiVersion)

Output only. The API version that created this Node.

symptoms[]

object (Symptom)

Output only. The Symptoms that have occurred to the TPU Node.
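
To show how the fields above fit together, here is a minimal sketch that assembles a Node body as a Python dict, enforces the two required fields and the 512-character description limit, and refuses fields marked "Output only". The helper name is hypothetical, and the accelerator type and runtime version strings are placeholder values that do not come from this page.

```python
import json

# Fields this reference marks "Output only"; the server populates them.
OUTPUT_ONLY_FIELDS = frozenset({
    "name", "state", "healthDescription", "createTime",
    "networkEndpoints", "id", "apiVersion", "symptoms",
})


def build_node_body(accelerator_type, runtime_version, **optional_fields):
    """Assemble a Node JSON body with the two required fields set."""
    if not accelerator_type or not runtime_version:
        raise ValueError("acceleratorType and runtimeVersion are required")
    rejected = OUTPUT_ONLY_FIELDS.intersection(optional_fields)
    if rejected:
        raise ValueError(f"output-only fields cannot be set: {sorted(rejected)}")
    if len(optional_fields.get("description", "")) > 512:
        raise ValueError("description exceeds 512 characters")
    return {
        "acceleratorType": accelerator_type,
        "runtimeVersion": runtime_version,
        **optional_fields,
    }


body = build_node_body(
    "v2-8",         # placeholder accelerator type (assumption)
    "tpu-vm-base",  # placeholder runtime version (assumption)
    description="training node",
    labels={"team": "research"},
    schedulingConfig={"preemptible": True},
)
print(json.dumps(body, indent=2))
```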

State

Represents the different states of a TPU node during its lifecycle.

Enums
STATE_UNSPECIFIED TPU node state is not known/set.
CREATING TPU node is being created.
READY TPU node has been created.
RESTARTING TPU node is restarting.
REIMAGING TPU node is undergoing reimaging.
DELETING TPU node is being deleted.
REPAIRING TPU node is being repaired and may be unusable. Details can be found in the healthDescription field.
STOPPED TPU node is stopped.
STOPPING TPU node is currently stopping.
STARTING TPU node is currently starting.
PREEMPTED TPU node has been preempted. Only applies to Preemptible TPU Nodes.
TERMINATED TPU node has been terminated due to maintenance or has reached the end of its life cycle (for preemptible nodes).
HIDING TPU node is currently hiding.
HIDDEN TPU node has been hidden.
UNHIDING TPU node is currently unhiding.
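
A client that waits on a node typically polls until the state stops changing. The sketch below shows one way to do that. The split of State values into "transitional" and "settled" is an assumption made for this example and is not defined by the reference, and `fetch_node` is a hypothetical callable that returns the Node as a dict.

```python
# Assumed grouping: states that describe an operation still in progress.
TRANSITIONAL_STATES = frozenset({
    "CREATING", "RESTARTING", "REIMAGING", "DELETING", "REPAIRING",
    "STOPPING", "STARTING", "HIDING", "UNHIDING",
})


def wait_for_settled_state(fetch_node, sleep, max_polls=60, interval_seconds=10):
    """Poll until the node leaves a transitional state, then return the state.

    fetch_node: hypothetical callable returning the Node as a dict.
    sleep: callable such as time.sleep, injected so the loop can be tested.
    """
    for _ in range(max_polls):
        state = fetch_node().get("state", "STATE_UNSPECIFIED")
        if state not in TRANSITIONAL_STATES:
            return state
        sleep(interval_seconds)
    raise TimeoutError(f"node still transitional after {max_polls} polls")
```

Passing `time.sleep` as `sleep` gives real waiting. Callers would still need to decide what a returned STATE_UNSPECIFIED, PREEMPTED, or TERMINATED means for their workload.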

NetworkConfig

Network related configurations.

JSON representation
{
  "network": string,
  "subnetwork": string,
  "enableExternalIps": boolean
}
Fields
network

string

The name of the network for the TPU node. It must be a preexisting Google Compute Engine network. If none is provided, "default" will be used.

subnetwork

string

The name of the subnetwork for the TPU node. It must be a preexisting Google Compute Engine subnetwork. If none is provided, "default" will be used.

enableExternalIps

boolean

Indicates whether external IP addresses are associated with the TPU workers. If set to false, the specified subnetwork or network should have Private Google Access enabled.

ServiceAccount

A service account.

JSON representation
{
  "email": string,
  "scope": [
    string
  ]
}
Fields
email

string

Email address of the service account. If empty, the default Compute service account will be used.

scope[]

string

The list of scopes to be made available for this service account. If empty, access to all Cloud APIs will be allowed.

SchedulingConfig

Sets the scheduling options for this node.

JSON representation
{
  "preemptible": boolean,
  "reserved": boolean
}
Fields
preemptible

boolean

Defines whether the node is preemptible.

reserved

boolean

Whether the node is created under a reservation.

NetworkEndpoint

A network endpoint over which a TPU worker can be reached.

JSON representation
{
  "ipAddress": string,
  "port": integer,
  "accessConfig": {
    object (AccessConfig)
  }
}
Fields
ipAddress

string

The internal IP address of this network endpoint.

port

integer

The port of this network endpoint.

accessConfig

object (AccessConfig)

The access config for the TPU worker.

AccessConfig

An access config attached to the TPU worker.

JSON representation
{
  "externalIp": string
}
Fields
externalIp

string

Output only. An external IP address associated with the TPU worker.
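
The NetworkEndpoint and AccessConfig objects nest as shown in the sketch below, which reads the 0th entry of `networkEndpoints` from a Node that has already been decoded from JSON. The helper name and the sample addresses are made up for illustration.

```python
def first_endpoint(node):
    """Return address details for the 0th networkEndpoints entry, or None."""
    endpoints = node.get("networkEndpoints") or []
    if not endpoints:
        return None
    endpoint = endpoints[0]
    access_config = endpoint.get("accessConfig") or {}
    return {
        "internal_ip": endpoint.get("ipAddress"),
        "port": endpoint.get("port"),
        # None when the worker has no external IP address.
        "external_ip": access_config.get("externalIp"),
    }


# Sample data for illustration only; addresses and port are invented.
node = {
    "networkEndpoints": [
        {"ipAddress": "10.2.0.2", "port": 8470,
         "accessConfig": {"externalIp": "203.0.113.7"}},
        {"ipAddress": "10.2.0.3", "port": 8470},
    ]
}
print(first_endpoint(node))
```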

Health

Health defines the status of a TPU node as reported by Health Monitor.

Enums
HEALTH_UNSPECIFIED Health status is unknown: not initialized or failed to retrieve.
HEALTHY The resource is healthy.
TIMEOUT The resource is unresponsive.
UNHEALTHY_TENSORFLOW The in-guest ML stack is unhealthy.
UNHEALTHY_MAINTENANCE The node is being rescheduled because of maintenance or a priority boost, and will resume running once rescheduled.

AttachedDisk

A node-attached disk resource.

JSON representation
{
  "sourceDisk": string,
  "mode": enum (DiskMode)
}
Fields
sourceDisk

string

Specifies the full path to an existing disk. For example: "projects/my-project/zones/us-central1-c/disks/my-disk".

mode

enum (DiskMode)

The mode in which to attach this disk. If not specified, the default is READ_WRITE mode. Only applicable to dataDisks.
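
The sketch below builds an AttachedDisk object after checking that `sourceDisk` follows the path shape shown in the example above. Only that zonal form is recognised here; whether other path forms are accepted by the service is not stated on this page, so treat the pattern as an assumption. Both function names are hypothetical.

```python
import re

# Path shape taken from the example: projects/<p>/zones/<z>/disks/<d>
_DISK_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)/zones/(?P<zone>[^/]+)/disks/(?P<disk>[^/]+)$"
)


def parse_source_disk(path):
    """Split a full disk path into its project, zone, and disk parts."""
    match = _DISK_PATH.match(path)
    if match is None:
        raise ValueError(f"not a full disk path: {path!r}")
    return match.groupdict()


def attached_disk(path, mode="READ_WRITE"):
    """Build an AttachedDisk dict; READ_WRITE mirrors the documented default."""
    parse_source_disk(path)  # raises ValueError on a malformed path
    if mode not in ("READ_WRITE", "READ_ONLY"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return {"sourceDisk": path, "mode": mode}


print(parse_source_disk("projects/my-project/zones/us-central1-c/disks/my-disk"))
```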

DiskMode

The different modes of an attached disk.

Enums
DISK_MODE_UNSPECIFIED The disk mode is not known/set.
READ_WRITE Attaches the disk in read-write mode. Only one TPU node can attach a disk in read-write mode at a time.
READ_ONLY Attaches the disk in read-only mode. Multiple TPU nodes can attach a disk in read-only mode at a time.
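
Because only one node may hold a disk in read-write mode at a time, a client managing several nodes may want to look for clashes before submitting changes. The following sketch scans a list of Node dicts for disks that more than one node attaches read-write. It treats a missing or unspecified mode as READ_WRITE, which is an assumption based on the documented default. The function name and sample data are hypothetical.

```python
def read_write_conflicts(nodes):
    """Map each sourceDisk to the nodes attaching it read-write, if more than one."""
    writers = {}
    for node in nodes:
        for disk in node.get("dataDisks", []):
            # Anything other than READ_ONLY is counted as a writer (assumption).
            if disk.get("mode", "READ_WRITE") != "READ_ONLY":
                writers.setdefault(disk["sourceDisk"], []).append(node.get("name"))
    return {disk: names for disk, names in writers.items() if len(names) > 1}


# Illustrative data; node names and disk paths are invented.
nodes = [
    {"name": "node-a", "dataDisks": [
        {"sourceDisk": "projects/p/zones/z/disks/d1", "mode": "READ_WRITE"},
        {"sourceDisk": "projects/p/zones/z/disks/d2", "mode": "READ_ONLY"},
    ]},
    {"name": "node-b", "dataDisks": [
        {"sourceDisk": "projects/p/zones/z/disks/d1"},
        {"sourceDisk": "projects/p/zones/z/disks/d2", "mode": "READ_ONLY"},
    ]},
]
print(read_write_conflicts(nodes))
```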

ApiVersion

TPU API Version.

Enums
API_VERSION_UNSPECIFIED API version is unknown.
V1_ALPHA1 TPU API V1Alpha1 version.
V1 TPU API V1 version.
V2_ALPHA1 TPU API V2Alpha1 version.

Symptom

A Symptom instance.

JSON representation
{
  "createTime": string,
  "symptomType": enum (SymptomType),
  "details": string,
  "workerId": string
}
Fields
createTime

string (Timestamp format)

Timestamp when the Symptom is created.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

symptomType

enum (SymptomType)

Type of the Symptom.

details

string

Detailed information of the current Symptom.

workerId

string

A string used to uniquely distinguish a worker within a TPU node.

SymptomType

SymptomType represents the different types of Symptoms that a TPU can exhibit.

Enums
SYMPTOM_TYPE_UNSPECIFIED Unspecified symptom.
LOW_MEMORY TPU VM memory is low.
OUT_OF_MEMORY TPU runtime is out of memory.
EXECUTE_TIMED_OUT TPU runtime execution has timed out.
MESH_BUILD_FAIL TPU runtime fails to construct a mesh that recognizes each TPU device's neighbors.
HBM_OUT_OF_MEMORY TPU HBM is out of memory.
PROJECT_ABUSE Abusive behaviors have been identified on the current project.
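
As an example of consuming the `symptoms` list, the sketch below counts memory-related symptoms per worker. Grouping LOW_MEMORY, OUT_OF_MEMORY, and HBM_OUT_OF_MEMORY together as "memory-related" is a choice made for this example rather than something the reference defines, and the function name and sample data are hypothetical.

```python
from collections import Counter

# Assumed grouping of SymptomType values that concern memory pressure.
MEMORY_SYMPTOMS = frozenset({"LOW_MEMORY", "OUT_OF_MEMORY", "HBM_OUT_OF_MEMORY"})


def memory_symptoms_per_worker(node):
    """Count memory-related symptoms for each workerId on a Node dict."""
    counts = Counter()
    for symptom in node.get("symptoms", []):
        if symptom.get("symptomType") in MEMORY_SYMPTOMS:
            counts[symptom.get("workerId", "")] += 1
    return dict(counts)


# Illustrative data only.
node = {
    "symptoms": [
        {"symptomType": "LOW_MEMORY", "workerId": "0"},
        {"symptomType": "HBM_OUT_OF_MEMORY", "workerId": "0"},
        {"symptomType": "EXECUTE_TIMED_OUT", "workerId": "1"},
        {"symptomType": "OUT_OF_MEMORY", "workerId": "2"},
    ]
}
print(memory_symptoms_per_worker(node))
```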

Methods

create

Creates a node.

delete

Deletes a node.

get

Gets the details of a node.

getGuestAttributes

Retrieves the guest attributes for the node.

list

Lists nodes.

patch

Updates the configurations of a node.

start

Starts a node.

stop

Stops a node.