Resource: Node
A TPU instance.
JSON representation | |
---|---|
{ "name": string, "description": string, "acceleratorType": string, "ipAddress": string, "port": string, "state": enum ( |
Fields | |
---|---|
name |
Output only. The immutable name of the TPU. |
description |
The user-supplied description of the TPU. Maximum of 512 characters. |
acceleratorType |
The type of hardware accelerators associated with this node. Required. |
ipAddress |
Output only. DEPRECATED! Use networkEndpoints instead. The network address for the TPU Node as visible to Compute Engine instances. |
port |
Output only. DEPRECATED! Use networkEndpoints instead. The network port for the TPU Node as visible to Compute Engine instances. |
state |
Output only. The current state for the TPU Node. |
healthDescription |
Output only. If this field is populated, it contains a description of why the TPU Node is unhealthy. |
tensorflowVersion |
The version of TensorFlow running in the node. Required. |
network |
The name of the network to peer the TPU node to. It must be a preexisting Compute Engine network inside the project on which this API has been activated. If none is provided, "default" will be used. |
cidrBlock |
The CIDR block that the TPU node will use when selecting an IP address. This CIDR block must be a /29 block; the Compute Engine networks API forbids a smaller block, and using a larger block would be wasteful (a node can only consume one IP address). Errors will occur if the CIDR block has already been used for a currently existing TPU node, the CIDR block conflicts with any subnetworks in the user's provided network, or the provided network is peered with another network that is using that CIDR block. |
serviceAccount |
Output only. The service account used to run the TensorFlow services within the node. To share resources, including Google Cloud Storage data, with the TensorFlow job running in the node, this account must have permissions to that data. |
createTime |
Output only. The time when the node was created. A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: |
schedulingConfig |
The scheduling options for this node. |
networkEndpoints[] |
Output only. The network endpoints where TPU workers can be accessed and sent work. It is recommended that TensorFlow clients of the node reach out to the 0th entry in this list first. |
health |
The health status of the TPU node. |
labels |
Resource labels to represent user-provided metadata. An object containing a list of |
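
The constraints stated in the field descriptions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `validate_node_body` and the sample values are hypothetical, and it checks only what the descriptions state (the two required fields, the 512-character description limit, and the /29 CIDR block). It does not detect conflicts with existing nodes, subnetworks, or peered networks.

```python
import ipaddress


def validate_node_body(body: dict) -> list[str]:
    """Return the problems found in a Node request body (hypothetical helper)."""
    problems = []

    # acceleratorType and tensorflowVersion are marked Required.
    for field in ("acceleratorType", "tensorflowVersion"):
        if not body.get(field):
            problems.append(f"{field} is required")

    # description is limited to 512 characters.
    if len(body.get("description", "")) > 512:
        problems.append("description exceeds 512 characters")

    # cidrBlock, when given, must be a /29 block.
    cidr = body.get("cidrBlock")
    if cidr is not None:
        try:
            block = ipaddress.ip_network(cidr, strict=True)
        except ValueError:
            problems.append(f"cidrBlock {cidr!r} is not a valid CIDR block")
        else:
            if block.prefixlen != 29:
                problems.append("cidrBlock must be a /29 block")

    return problems


# Sample values are placeholders for illustration, not recommendations.
body = {
    "acceleratorType": "v2-8",
    "tensorflowVersion": "2.12.0",
    "network": "default",
    "cidrBlock": "10.2.0.0/29",
}
print(validate_node_body(body))
```

An empty list means none of the checked constraints were violated; the service may still reject the request for reasons this sketch cannot see.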
State
Represents the different states of a TPU node during its lifecycle.
Enums | |
---|---|
STATE_UNSPECIFIED |
TPU node state is not known/set. |
CREATING |
TPU node is being created. |
READY |
TPU node has been created and is fully usable. |
RESTARTING |
TPU node is restarting. |
REIMAGING |
TPU node is undergoing reimaging. |
DELETING |
TPU node is being deleted. |
REPAIRING |
TPU node is being repaired and may be unusable. Details can be found in the healthDescription field. |
STOPPED |
TPU node is stopped. |
STOPPING |
TPU node is currently stopping. |
STARTING |
TPU node is currently starting. |
PREEMPTED |
TPU node has been preempted. Only applies to Preemptible TPU Nodes. |
TERMINATED |
TPU node has been terminated due to maintenance or has reached the end of its life cycle (for preemptible nodes). |
HIDING |
TPU node is currently hiding. |
HIDDEN |
TPU node has been hidden. |
UNHIDING |
TPU node is currently unhiding. |
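
A client that waits on a node usually needs to tell settled states from states that are still changing. The sketch below mirrors the enum values above in a Python `Enum`. The grouping into transitional states (the names ending in `ING`) and the helpers `is_transitional` and `parse_state` are assumptions for illustration; the API itself does not define such a grouping.

```python
from enum import Enum

_STATE_NAMES = [
    "STATE_UNSPECIFIED", "CREATING", "READY", "RESTARTING", "REIMAGING",
    "DELETING", "REPAIRING", "STOPPED", "STOPPING", "STARTING",
    "PREEMPTED", "TERMINATED", "HIDING", "HIDDEN", "UNHIDING",
]

# Each member's value is its own name, matching the strings used in JSON.
State = Enum("State", {name: name for name in _STATE_NAMES})


def is_transitional(state: State) -> bool:
    """Assumption: states whose names end in ING are still changing."""
    return state.name.endswith("ING")


def parse_state(raw: str) -> State:
    """Map a JSON state string to State, falling back to STATE_UNSPECIFIED."""
    try:
        return State(raw)
    except ValueError:
        return State.STATE_UNSPECIFIED


print(parse_state("CREATING"), is_transitional(parse_state("CREATING")))
```

Falling back to `STATE_UNSPECIFIED` for unrecognized strings keeps a client working if new states are added later, at the cost of hiding the unknown value.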
SchedulingConfig
Sets the scheduling options for this node.
JSON representation | |
---|---|
{ "preemptible": boolean, "reserved": boolean } |
Fields | |
---|---|
preemptible |
Defines whether the node is preemptible. |
reserved |
Whether the node is created under a reservation. |
NetworkEndpoint
A network endpoint over which a TPU worker can be reached.
JSON representation | |
---|---|
{ "ipAddress": string, "port": integer } |
Fields | |
---|---|
ipAddress |
The IP address of this network endpoint. |
port |
The port of this network endpoint. |
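
The `networkEndpoints[]` field description recommends that clients try the 0th entry first. A minimal sketch of that selection follows, assuming a node already parsed from JSON into a dict. The helper name `primary_endpoint`, the `ip:port` string format, and the sample addresses are assumptions for illustration.

```python
from typing import Optional


def primary_endpoint(node: dict) -> Optional[str]:
    """Return "ip:port" for the first network endpoint, or None if there is none."""
    endpoints = node.get("networkEndpoints") or []
    if not endpoints:
        return None
    first = endpoints[0]
    return f'{first["ipAddress"]}:{first["port"]}'


# Sample data; the addresses and port are placeholders.
node = {
    "networkEndpoints": [
        {"ipAddress": "10.2.0.2", "port": 8470},
        {"ipAddress": "10.2.0.3", "port": 8470},
    ]
}
print(primary_endpoint(node))
```

The helper reads `networkEndpoints` rather than the deprecated top-level `ipAddress` and `port` fields.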
Health
Health defines the status of a TPU node as reported by Health Monitor.
Enums | |
---|---|
HEALTH_UNSPECIFIED |
Health status is unknown: not initialized or failed to retrieve. |
HEALTHY |
The resource is healthy. |
DEPRECATED_UNHEALTHY |
The resource is unhealthy. |
TIMEOUT |
The resource is unresponsive. |
UNHEALTHY_TENSORFLOW |
The in-guest ML stack is unhealthy. |
UNHEALTHY_MAINTENANCE |
The node is being rescheduled because of maintenance or a priority boost and will resume running once rescheduled. |
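
The `health` and `healthDescription` fields of a node can be combined into a single status line for logs. The following is a minimal sketch, assuming a node already parsed from JSON into a dict; `describe_health` is a hypothetical helper, and the format of its output is illustrative rather than taken from the API.

```python
def describe_health(node: dict) -> str:
    """Summarize a node's health, appending healthDescription when present."""
    health = node.get("health", "HEALTH_UNSPECIFIED")
    if health == "HEALTHY":
        return "healthy"
    # healthDescription is populated only when the node is unhealthy.
    detail = node.get("healthDescription")
    if detail:
        return f"{health}: {detail}"
    return health


print(describe_health({"health": "TIMEOUT", "healthDescription": "no response"}))
```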
Methods | |
---|---|
|
Creates a node. |
|
Deletes a node. |
|
Gets the details of a node. |
|
Lists nodes. |
|
Reimages a node's OS. |
|
Starts a node. |
|
Stops a node. |