Resource: Node
A TPU instance.
JSON representation |
---|
{ "name": string, "description": string, "acceleratorType": string, "state": enum ( |
Fields | |
---|---|
name |
Output only. Immutable. The name of the TPU. |
description |
The user-supplied description of the TPU. Maximum of 512 characters. |
acceleratorType |
The type of hardware accelerators associated with this node. |
state |
Output only. The current state for the TPU Node. |
healthDescription |
Output only. If this field is populated, it contains a description of why the TPU Node is unhealthy. |
runtimeVersion |
Required. The runtime version running in the Node. |
networkConfig |
Network configurations for the TPU node. |
cidrBlock |
The CIDR block that the TPU node will use when selecting an IP address. This CIDR block must be a /29 block; the Compute Engine networks API forbids a smaller block, and using a larger block would be wasteful (a node can only consume one IP address). Errors will occur if the CIDR block has already been used for a currently existing TPU node, the CIDR block conflicts with any subnetworks in the user's provided network, or the provided network is peered with another network that is using that CIDR block. |
serviceAccount |
The Google Cloud Platform service account to be used by the TPU node VMs. If none is specified, the default compute service account will be used. |
createTime |
Output only. The time when the node was created. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. |
schedulingConfig |
The scheduling options for this node. |
networkEndpoints[] |
Output only. The network endpoints where TPU workers can be accessed and sent work. It is recommended that runtime clients of the node reach out to the 0th entry in this list first. |
health |
The health status of the TPU node. |
labels |
Resource labels to represent user-provided metadata. An object containing a list of "key": value pairs. |
metadata |
Custom metadata to apply to the TPU Node. Can set startup-script and shutdown-script. An object containing a list of "key": value pairs. |
tags[] |
Tags to apply to the TPU Node. Tags are used to identify valid sources or targets for network firewalls. |
id |
Output only. The unique identifier for the TPU Node. |
dataDisks[] |
The additional data disks for the Node. |
apiVersion |
Output only. The API version that created this Node. |
symptoms[] |
Output only. The symptoms that have occurred on the TPU Node. |
queuedResource |
Output only. The qualified name of the QueuedResource that requested this Node. |
acceleratorConfig |
The AcceleratorConfig for the TPU Node. |
shieldedInstanceConfig |
Shielded Instance options. |
multisliceNode |
Output only. Whether the Node belongs to a Multislice group. |
autocheckpointEnabled |
Optional. Whether Autocheckpoint is enabled. |
bootDiskConfig |
Optional. Boot disk configuration. |
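The `cidrBlock` constraints described above (a /29 block that does not collide with blocks already in use) can be checked client-side before sending a request. The following is a minimal sketch using only the Python standard library; the function name `check_cidr_block` and the `existing_blocks` parameter are hypothetical and are not part of the TPU API, and the server remains the authority on which blocks are acceptable.

```python
import ipaddress


def check_cidr_block(cidr_block, existing_blocks=()):
    """Return a list of problems found with a candidate cidrBlock.

    An empty list means this local check found nothing wrong.
    `existing_blocks` is a hypothetical list of CIDR strings the caller
    already knows are in use (other TPU nodes, subnetworks, peered networks).
    """
    try:
        # strict=True rejects values such as 10.2.0.1/29 that have host bits set.
        candidate = ipaddress.ip_network(cidr_block, strict=True)
    except ValueError as exc:
        return [f"not a valid CIDR block: {exc}"]

    problems = []
    if candidate.prefixlen != 29:
        problems.append(f"{candidate} is not a /29 block")

    for other in existing_blocks:
        other_net = ipaddress.ip_network(other, strict=False)
        # overlaps() only accepts networks of the same IP version.
        if other_net.version == candidate.version and candidate.overlaps(other_net):
            problems.append(f"{candidate} overlaps {other_net}")
    return problems
```

For example, `check_cidr_block("10.2.0.0/29", ["10.2.0.0/16"])` reports an overlap, while the same block with no known conflicts returns an empty list.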
State
Represents the different states of a TPU node during its lifecycle.
Enums | |
---|---|
STATE_UNSPECIFIED |
TPU node state is not known/set. |
CREATING |
TPU node is being created. |
READY |
TPU node has been created. |
RESTARTING |
TPU node is restarting. |
REIMAGING |
TPU node is undergoing reimaging. |
DELETING |
TPU node is being deleted. |
REPAIRING |
TPU node is being repaired and may be unusable. Details can be found in the healthDescription field. |
STOPPED |
TPU node is stopped. |
STOPPING |
TPU node is currently stopping. |
STARTING |
TPU node is currently starting. |
PREEMPTED |
TPU node has been preempted. Only applies to Preemptible TPU Nodes. |
TERMINATED |
TPU node has been terminated due to maintenance or has reached the end of its life cycle (for preemptible nodes). |
HIDING |
TPU node is currently hiding. |
HIDDEN |
TPU node has been hidden. |
UNHIDING |
TPU node is currently unhiding. |
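Client code that waits on a node often needs to tell "still changing" states apart from states the node rests in. The sketch below is one possible grouping based only on the enum descriptions above; the grouping itself, and the names `IN_PROGRESS_STATES`, `is_in_progress`, and `wait_until_settled`, are assumptions for illustration and are not defined by the API. `get_state` is a caller-supplied function that returns the node's current `state` string.

```python
import time

# Assumption: these are the states whose descriptions say the node
# "is being" or "is currently" doing something. The API does not define
# this grouping. STATE_UNSPECIFIED is treated as settled here.
IN_PROGRESS_STATES = frozenset({
    "CREATING", "RESTARTING", "REIMAGING", "DELETING", "REPAIRING",
    "STOPPING", "STARTING", "HIDING", "UNHIDING",
})


def is_in_progress(state):
    """Return True if the state name describes an ongoing transition."""
    return state in IN_PROGRESS_STATES


def wait_until_settled(get_state, max_polls=30, delay_seconds=0.0):
    """Poll get_state() until it returns a state that is not in progress.

    Raises TimeoutError if the node is still in progress after max_polls.
    """
    for _ in range(max_polls):
        state = get_state()
        if not is_in_progress(state):
            return state
        time.sleep(delay_seconds)
    raise TimeoutError(f"node still in progress after {max_polls} polls")
```

The caller still has to decide what each settled state means for its workload: `READY` and `PREEMPTED` both end the wait, but only one of them is usable.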
NetworkConfig
Network related configurations.
JSON representation |
---|
{ "network": string, "subnetwork": string, "enableExternalIps": boolean, "canIpForward": boolean, "queueCount": integer } |
Fields | |
---|---|
network |
The name of the network for the TPU node. It must be a preexisting Google Compute Engine network. If none is provided, "default" will be used. |
subnetwork |
The name of the subnetwork for the TPU node. It must be a preexisting Google Compute Engine subnetwork. If none is provided, "default" will be used. |
enableExternalIps |
Indicates whether external IP addresses are associated with the TPU workers. If set to false, the specified subnetwork or network should have Private Google Access enabled. |
canIpForward |
Allows the TPU node to send and receive packets with non-matching destination or source IPs. This is required if you plan to use the TPU workers to forward routes. |
queueCount |
Optional. Specifies the networking queue count for the TPU VM instance's network interface. |
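As an illustration of how the NetworkConfig fields fit together, the sketch below builds the JSON object from keyword arguments and flags the one cross-field requirement stated above: when `enableExternalIps` is false, the network or subnetwork should have Private Google Access enabled. The helper names are hypothetical. Fields left unset are omitted rather than given a default, because this page only documents defaults for `network` and `subnetwork`.

```python
def network_config(network=None, subnetwork=None, enable_external_ips=None,
                   can_ip_forward=None, queue_count=None):
    """Build a NetworkConfig dict, leaving out any field that was not set."""
    fields = {
        "network": network,
        "subnetwork": subnetwork,
        "enableExternalIps": enable_external_ips,
        "canIpForward": can_ip_forward,
        "queueCount": queue_count,
    }
    return {key: value for key, value in fields.items() if value is not None}


def needs_private_google_access(config):
    """True only when external IPs are explicitly disabled in the config."""
    return config.get("enableExternalIps") is False
```

A config built with `network_config(network="my-network", enable_external_ips=False)` serializes to two fields and makes `needs_private_google_access` return `True`, which is a cue to verify the subnetwork settings before creating the node.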
ServiceAccount
A service account.
JSON representation |
---|
{ "email": string, "scope": [ string ] } |
Fields | |
---|---|
email |
Email address of the service account. If empty, the default Compute service account will be used. |
scope[] |
The list of scopes to be made available for this service account. If empty, access to all Cloud APIs will be allowed. |
SchedulingConfig
Sets the scheduling options for this node.
JSON representation |
---|
{ "preemptible": boolean, "reserved": boolean, "spot": boolean } |
Fields | |
---|---|
preemptible |
Defines whether the node is preemptible. |
reserved |
Whether the node is created under a reservation. |
spot |
Optional. Defines whether the node is a Spot VM. |
NetworkEndpoint
A network endpoint over which a TPU worker can be reached.
JSON representation |
---|
{ "ipAddress": string, "port": integer, "accessConfig": { object (AccessConfig) } } |
Fields | |
---|---|
ipAddress |
The internal IP address of this network endpoint. |
port |
The port of this network endpoint. |
accessConfig |
The access config for the TPU worker. |
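The Node resource recommends that runtime clients contact the 0th entry of `networkEndpoints[]` first. A minimal sketch of that lookup, operating on a Node already decoded from JSON into a Python dict, might look like the following; the function names are hypothetical and the sample addresses in the usage note are illustrative only.

```python
def primary_endpoint(node):
    """Return "ip:port" for the 0th network endpoint of a decoded Node dict."""
    endpoints = node.get("networkEndpoints", [])
    if not endpoints:
        # Output-only field: it may be absent until the node has workers.
        raise ValueError("node has no networkEndpoints")
    first = endpoints[0]
    return f'{first["ipAddress"]}:{first["port"]}'


def external_ips(node):
    """Return the external IP of each endpoint, or None where there is none."""
    return [
        endpoint.get("accessConfig", {}).get("externalIp")
        for endpoint in node.get("networkEndpoints", [])
    ]
```

Given a node whose first endpoint is `{"ipAddress": "10.0.0.2", "port": 8470}`, `primary_endpoint` returns `"10.0.0.2:8470"`.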
AccessConfig
An access config attached to the TPU worker.
JSON representation |
---|
{ "externalIp": string } |
Fields | |
---|---|
externalIp |
Output only. An external IP address associated with the TPU worker. |
Health
Health defines the status of a TPU node as reported by Health Monitor.
Enums | |
---|---|
HEALTH_UNSPECIFIED |
Health status is unknown: not initialized or failed to retrieve. |
HEALTHY |
The resource is healthy. |
TIMEOUT |
The resource is unresponsive. |
UNHEALTHY_TENSORFLOW |
The in-guest ML stack is unhealthy. |
UNHEALTHY_MAINTENANCE |
The node is being rescheduled because of maintenance or a priority boost, and will resume running once rescheduled. |
AttachedDisk
A node-attached disk resource.
JSON representation |
---|
{ "sourceDisk": string, "mode": enum (DiskMode) } |
Fields | |
---|---|
sourceDisk |
Specifies the full path to an existing disk. For example: "projects/my-project/zones/us-central1-c/disks/my-disk". |
mode |
The mode in which to attach this disk. If not specified, the default is READ_WRITE mode. Only applicable to dataDisks. |
DiskMode
The modes in which a disk can be attached.
Enums | |
---|---|
DISK_MODE_UNSPECIFIED |
The disk mode is not known/set. |
READ_WRITE |
Attaches the disk in read-write mode. Only one TPU node can attach a disk in read-write mode at a time. |
READ_ONLY |
Attaches the disk in read-only mode. Multiple TPU nodes can attach a disk in read-only mode at a time. |
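Because only one TPU node can attach a given disk in read-write mode at a time, it can be useful to scan a set of planned nodes for conflicts before creating them. The sketch below does this over decoded Node dicts. The function name is hypothetical, and treating `DISK_MODE_UNSPECIFIED` the same as an omitted `mode` (that is, as `READ_WRITE`) is an assumption rather than documented behavior.

```python
from collections import defaultdict


def find_read_write_conflicts(nodes):
    """Map each sourceDisk to the node names attaching it read-write.

    Only disks with more than one read-write attachment are returned.
    """
    writers = defaultdict(list)
    for node in nodes:
        for disk in node.get("dataDisks", []):
            # Documented: an unspecified mode defaults to READ_WRITE.
            # Assumption: DISK_MODE_UNSPECIFIED behaves the same way.
            mode = disk.get("mode", "READ_WRITE")
            if mode in ("READ_WRITE", "DISK_MODE_UNSPECIFIED"):
                writers[disk["sourceDisk"]].append(node["name"])
    return {disk: names for disk, names in writers.items() if len(names) > 1}
```

Read-only attachments never appear in the result, since multiple nodes may share a disk in `READ_ONLY` mode.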
ApiVersion
TPU API Version.
Enums | |
---|---|
API_VERSION_UNSPECIFIED |
API version is unknown. |
V1_ALPHA1 |
TPU API V1Alpha1 version. |
V1 |
TPU API V1 version. |
V2_ALPHA1 |
TPU API V2Alpha1 version. |
Symptom
A Symptom instance.
JSON representation |
---|
{ "createTime": string, "symptomType": enum (SymptomType), "details": string, "workerId": string } |
Fields | |
---|---|
createTime |
Timestamp when the Symptom is created. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. |
symptomType |
Type of the Symptom. |
details |
Detailed information of the current Symptom. |
workerId |
A string used to uniquely distinguish a worker within a TPU node. |
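The `createTime` fields on this page use RFC3339 UTC "Zulu" timestamps with up to nine fractional digits, which is finer than the microsecond resolution of Python's `datetime`. The following sketch keeps the full precision by returning the whole-second `datetime` and the nanoseconds separately. The function name and the sample timestamp in the usage note are illustrative, not taken from the API.

```python
import re
from datetime import datetime, timezone

# Accepts e.g. 2014-10-02T15:01:23Z or 2014-10-02T15:01:23.045123456Z
_ZULU = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$")


def parse_zulu_timestamp(value):
    """Parse a "Zulu" RFC3339 timestamp into (utc_datetime, nanoseconds).

    The datetime carries whole seconds only; the fractional part is
    returned as an integer count of nanoseconds.
    """
    match = _ZULU.match(value)
    if match is None:
        raise ValueError(f"not an RFC3339 Zulu timestamp: {value!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    whole = whole.replace(tzinfo=timezone.utc)
    fraction = match.group(2) or ""
    # Right-pad to nine digits so ".5" becomes 500000000 nanoseconds.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return whole, nanos
```

Parsing `"2014-10-02T15:01:23.045123456Z"` yields 15:01:23 UTC on 2 October 2014 together with `45123456` nanoseconds.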
SymptomType
SymptomType represents the different types of Symptoms that a TPU can exhibit.
Enums | |
---|---|
SYMPTOM_TYPE_UNSPECIFIED |
Unspecified symptom. |
LOW_MEMORY |
TPU VM memory is low. |
OUT_OF_MEMORY |
TPU runtime is out of memory. |
EXECUTE_TIMED_OUT |
TPU runtime execution has timed out. |
MESH_BUILD_FAIL |
TPU runtime fails to construct a mesh that recognizes each TPU device's neighbors. |
HBM_OUT_OF_MEMORY |
TPU HBM is out of memory. |
PROJECT_ABUSE |
Abusive behaviors have been identified on the current project. |
ShieldedInstanceConfig
A set of Shielded Instance options.
JSON representation |
---|
{ "enableSecureBoot": boolean } |
Fields | |
---|---|
enableSecureBoot |
Defines whether the instance has Secure Boot enabled. |
BootDiskConfig
Boot disk configurations.
JSON representation |
---|
{ "customerEncryptionKey": { object (CustomerEncryptionKey) }, "enableConfidentialCompute": boolean } |
Fields | |
---|---|
customerEncryptionKey |
Optional. Customer encryption key for boot disk. |
enableConfidentialCompute |
Optional. Whether the boot disk will be created with confidential compute mode. |
CustomerEncryptionKey
Customer's encryption key.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field
|
|
kmsKeyName |
The name of the encryption key that is stored in Google Cloud KMS. For example: "kmsKeyName": "projects/kms_project_id/locations/region/keyRings/key_region/cryptoKeys/key". The fully-qualified key name may be returned for resource GET requests. For example: "kmsKeyName": "projects/kms_project_id/locations/region/keyRings/key_region/cryptoKeys/key/cryptoKeyVersions/1" |
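Since a GET may return `kmsKeyName` with a trailing `/cryptoKeyVersions/...` segment that the caller did not supply, comparing key names as raw strings can give false mismatches. The sketch below splits a key name into its parts so the version can be ignored when comparing. The regular expression follows the two example shapes shown above; the function name and the returned dict keys are hypothetical.

```python
import re

_KMS_KEY = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)"
    r"/cryptoKeys/(?P<key>[^/]+)"
    r"(?:/cryptoKeyVersions/(?P<version>[^/]+))?$"
)


def parse_kms_key_name(name):
    """Split a kmsKeyName into its components; version is None if absent."""
    match = _KMS_KEY.match(name)
    if match is None:
        raise ValueError(f"unrecognized kmsKeyName: {name!r}")
    return match.groupdict()


def same_key(name_a, name_b):
    """Compare two key names while ignoring any cryptoKeyVersions suffix."""
    a, b = parse_kms_key_name(name_a), parse_kms_key_name(name_b)
    a.pop("version")
    b.pop("version")
    return a == b
```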
Methods |
|
---|---|
|
Creates a node. |
|
Deletes a node. |
|
Gets the details of a node. |
|
Retrieves the guest attributes for the node. |
|
Lists nodes. |
|
Updates the configurations of a node. |
|
Simulates a maintenance event. |
|
Starts a node. |
|
Stops a node. |