Cloud TPU Error Glossary

This document is a glossary of common errors returned by the Cloud TPU service, with a solution for each.


Invalid Accelerator Type

Error Message

generic::invalid_argument: Accelerator type v2-512 as preemptible (false) and
reserved (false) is not available in zone us-central1-a, please contact support.

Solution

The create command requested a combination of parameters that is not available in the zone. The availability of an accelerator in a zone depends on three parameters: the accelerator type, the preemptible flag, and the reserved flag. To change the preemptible and reserved flags, include or exclude them in the create command.

A TPU created with the reserved flag uses reserved capacity. A TPU created with the preemptible flag can be preempted by higher-priority TPUs. If neither flag is provided, the TPU is on-demand. Enabling both flags is not a valid configuration. See the create command documentation for more information.
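The flag rules above can be summarized in a few lines of Python. This is only an illustrative sketch: the function name capacity_type and the returned strings are hypothetical and are not part of any Cloud TPU API.

```python
def capacity_type(preemptible: bool = False, reserved: bool = False) -> str:
    """Return the capacity type implied by the two create-command flags.

    Hypothetical helper for illustration; it mirrors the rules described
    in this section and does not call the Cloud TPU service.
    """
    if preemptible and reserved:
        # Enabling both flags is not a valid configuration.
        raise ValueError("preemptible and reserved cannot both be enabled")
    if reserved:
        return "reserved"
    if preemptible:
        return "preemptible"
    # Neither flag provided: the TPU is on-demand.
    return "on-demand"
```

A check like this could run before issuing a create request, so that an invalid flag combination is caught locally.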

The accelerator types available in each zone can be found in the TPU regions and zones documentation, or they can be queried using the accelerator-types list command. Change the create command to use one of these accelerator types and try again, or contact support if the problem persists.
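When triaging this error in a script, it can help to pull the three parameters and the zone out of the message. The sketch below assumes the message follows the template shown in this section; the function names are hypothetical. The gcloud argument list is an assumption about how the accelerator-types list command is invoked, so verify it against your installed gcloud version before relying on it.

```python
import re

# Matches the message template shown in this section. The message may be
# wrapped across lines, so whitespace between words is matched with \s+.
_UNAVAILABLE = re.compile(
    r"Accelerator type (?P<type>\S+) as preemptible \((?P<preemptible>true|false)\)"
    r"\s+and\s+reserved \((?P<reserved>true|false)\)"
    r"\s+is not available in zone (?P<zone>[\w-]+)"
)


def parse_unavailable_accelerator(message: str):
    """Return the requested parameters as a dict, or None if no match."""
    match = _UNAVAILABLE.search(message)
    if match is None:
        return None
    return {
        "accelerator_type": match["type"],
        "preemptible": match["preemptible"] == "true",
        "reserved": match["reserved"] == "true",
        "zone": match["zone"],
    }


def list_accelerator_types_command(zone: str) -> list:
    """Build (but do not run) an argument list for querying a zone.

    Assumption: the command is `gcloud compute tpus accelerator-types list`.
    """
    return ["gcloud", "compute", "tpus", "accelerator-types", "list", f"--zone={zone}"]
```

The returned argument list can be passed to subprocess.run once you have confirmed the command form for your environment.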

Network Not Found

Error Message

  Cloud TPU received a bad request. The field "Network" cannot be "xxxx":
  requested resource not found

Solution

The network xxxx named in the request was not found. Ensure that the network has been created and set up properly. See Create and manage VPC networks for more information.

Service Account Permission Denied

Error Message

  generic::permission_denied: Cloud TPU got permissions denied when trying to
  access the customer project. Make sure that the IAM account
  'service-[project number]@cloud-tpu.iam.gserviceaccount.com' has the 'Cloud
  TPU API Service Agent' role by following https://cloud.google.com/iam/docs/manage-access-service-accounts

Solution

This error occurs when a user attempts to create or list nodes in a project without IAM authorization. A likely cause is that the Cloud TPU API service account does not have the required role for the project.

The Manage access to service accounts documentation gives an overview of how to manage access. Follow the Grant or revoke a single role steps and give the account 'service-PROJECT_NUMBER@cloud-tpu.iam.gserviceaccount.com' the 'Cloud TPU API Service Agent' role. Replace PROJECT_NUMBER with your project number, which can be found in the project settings in the Google Cloud console. For more information on service agents, see the Service agents documentation.
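As a small aid for filling in the account name, the following sketch builds the service agent address from a project number. The function names are hypothetical. The "serviceAccount:" prefix is the standard IAM member format for service accounts. The sketch only formats strings and does not grant any role.

```python
def tpu_service_agent_email(project_number) -> str:
    """Return the Cloud TPU service agent address for a project number."""
    number = str(project_number).strip()
    # A project number is numeric; a project ID (such as "my-project") is not.
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"expected a numeric project number, got {project_number!r}")
    return f"service-{number}@cloud-tpu.iam.gserviceaccount.com"


def tpu_service_agent_member(project_number) -> str:
    """Return the address in IAM member form, as used in policy bindings."""
    return "serviceAccount:" + tpu_service_agent_email(project_number)
```

The numeric check guards against a common slip: passing the project ID instead of the project number.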

Quota Exceeded

Error Message

You have reached XXXX limit. Please request an increase for the 'YYYY' quota for
Compute Engine API by following https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota.

Solution

Your project has reached a quota limit for the Compute Engine API. This quota should not be confused with the TPU quota, which governs the usage of TPU pods. To learn more about working with quotas, see the Cloud Quotas documentation.

To request an increase to the appropriate limit, follow the steps listed at Request a higher quota. On the quotas page, search for the quota named in the 'YYYY' part of the message. Some quotas are split across different regions or services; the error message indicates which one needs to be increased.

The 'XXXX' (limit) and 'YYYY' (quota) parts of the message are one of the following pairs:

  • HEALTH_CHECKS - 'Health checks' quota
  • FIREWALLS - 'Firewall rules' quota
  • NETWORK_ENDPOINT_GROUPS - 'Network endpoint groups' quota for this region
  • READ_REQUESTS - 'Read requests per minute' quota for the Compute Engine API service
  • OPERATION_READ_REQUESTS - 'Operation read requests per minute' quota
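The pairs above can be kept as a lookup table when handling this error programmatically. The sketch below assumes the message follows the template shown in this section; the names QUOTA_NAMES and quota_for_limit are hypothetical.

```python
import re

# Limit name (the 'XXXX' part) mapped to quota name (the 'YYYY' part),
# copied from the list in this section.
QUOTA_NAMES = {
    "HEALTH_CHECKS": "Health checks",
    "FIREWALLS": "Firewall rules",
    "NETWORK_ENDPOINT_GROUPS": "Network endpoint groups",
    "READ_REQUESTS": "Read requests per minute",
    "OPERATION_READ_REQUESTS": "Operation read requests per minute",
}


def quota_for_limit(message: str):
    """Return (limit, quota_name) for a quota error, or None if no match.

    quota_name is None when the limit is not one of the listed pairs.
    """
    match = re.search(r"You have reached (\w+) limit", message)
    if match is None:
        return None
    limit = match.group(1)
    return limit, QUOTA_NAMES.get(limit)
```

The returned quota name is the term to search for on the quotas page.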

A quota increase request is typically processed within 2 to 3 business days. If the request is urgent, reach out to a customer engineer or technical account manager.