Troubleshooting

API not enabled or service account deleted error

When calling the Cloud Life Sciences API, you might encounter one or both of the following errors:

  • API not enabled or service account deleted
  • checking service account permission: Account deleted: PROJECT_ID

To solve these issues, complete the following steps in order:

  1. Make sure that the Cloud Life Sciences and Compute Engine APIs are enabled.
  2. Make sure that the Cloud Life Sciences Service Agent service account is configured correctly.
  3. Make sure that the Compute Engine default service account is configured correctly.

Enabling the Cloud Life Sciences and Compute Engine APIs

Make sure that the Cloud Life Sciences and Compute Engine APIs are enabled in your Google Cloud project:

  1. Enable the Cloud Life Sciences API:

    Enable Cloud Life Sciences API

  2. Enable the Compute Engine API:

    Enable Compute Engine API

If you encounter an error that says you don't have permission to enable Google Cloud APIs for your project, see Enabling and disabling APIs.
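
Both APIs can also be enabled from the Google Cloud CLI. The following is a minimal sketch rather than an official procedure: the helper names enable_pipeline_apis and list_pipeline_apis are hypothetical, and PROJECT_ID is a placeholder for your own project.

```shell
# Enable the two APIs that Cloud Life Sciences pipelines depend on.
# enable_pipeline_apis is a hypothetical helper name used for illustration.
# Usage: enable_pipeline_apis PROJECT_ID
enable_pipeline_apis() {
  gcloud services enable \
    lifesciences.googleapis.com \
    compute.googleapis.com \
    --project="$1"
}

# Print whichever of the two APIs are currently enabled in the project.
# Prints nothing (and exits non-zero) when neither is enabled.
# Usage: list_pipeline_apis PROJECT_ID
list_pipeline_apis() {
  gcloud services list --enabled --project="$1" \
    --format="value(config.name)" |
    grep -E '^(lifesciences|compute)\.googleapis\.com$'
}
```

Running enable_pipeline_apis PROJECT_ID followed by list_pipeline_apis PROJECT_ID should list both service names once the APIs are enabled.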

Missing the Cloud Life Sciences service account or Cloud Life Sciences Service Agent role

The Cloud Life Sciences Service Agent service account is automatically created when you run a pipeline for the first time in a Google Cloud project. You can run the pipeline using the Google Cloud CLI or the REST and RPC APIs. You can't delete the service account, but it might not appear in the Identity and Access Management page. This might lead to errors with the Cloud Life Sciences API.

For the Cloud Life Sciences API to function and complete tasks like running pipelines on Compute Engine VMs, the Cloud Life Sciences Service Agent service account must exist. It must also have the Life Sciences Service Agent IAM role.

If you encounter any of the following issues, recreate the Cloud Life Sciences Service Agent service account or grant it the Life Sciences Service Agent IAM role:

  • You cannot find the Cloud Life Sciences Service Agent service account in the Identity and Access Management page.
  • You can find the Cloud Life Sciences Service Agent service account, but it does not contain the Life Sciences Service Agent role.

Use the Google Cloud CLI to add the lifesciences.serviceAgent role to the Cloud Life Sciences Service Agent service account using the service account's identifier, which uses the format service-PROJECT_NUMBER@gcp-sa-lifesciences.iam.gserviceaccount.com.

To recreate the service account or grant it the Life Sciences Service Agent IAM role, run the gcloud projects add-iam-policy-binding command. To find the PROJECT_ID and PROJECT_NUMBER, see Identifying projects.

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-lifesciences.iam.gserviceaccount.com \
    --role=roles/lifesciences.serviceAgent

If the request is successful, the command prompt displays a message similar to the following sample:

Updated IAM policy for project [PROJECT_ID].
bindings:
...
- members:
  - serviceAccount:service-PROJECT_NUMBER@gcp-sa-lifesciences.iam.gserviceaccount.com
  role: roles/lifesciences.serviceAgent
...
etag: VALUE
version: VALUE

Return to the Identity and Access Management page and verify the following:

  • The Member column contains a service account identifier in the format service-PROJECT_NUMBER@gcp-sa-lifesciences.iam.gserviceaccount.com.
  • In the same row as the Member column, the Name column contains Cloud Life Sciences Service Agent.
  • In the same row as the Member column, the Role column contains Life Sciences Service Agent.
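
The same check can be made from the command line. The sketch below assumes the identifier format described in this section; the helper names lifesciences_agent_email and list_agent_roles are hypothetical and exist only for illustration.

```shell
# Build the Cloud Life Sciences Service Agent identifier for a project number.
# Usage: lifesciences_agent_email PROJECT_NUMBER
lifesciences_agent_email() {
  printf 'service-%s@gcp-sa-lifesciences.iam.gserviceaccount.com\n' "$1"
}

# Print the roles bound to the service agent in the project's IAM policy.
# The project number is looked up from the project ID first.
# Usage: list_agent_roles PROJECT_ID
list_agent_roles() {
  lar_number=$(gcloud projects describe "$1" --format="value(projectNumber)")
  lar_member="serviceAccount:$(lifesciences_agent_email "$lar_number")"
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$lar_member" \
    --format="value(bindings.role)"
}
```

If the binding exists, the output of list_agent_roles PROJECT_ID should include roles/lifesciences.serviceAgent. Empty output suggests that the role still needs to be granted.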

Missing the Compute Engine default service account

Newly created Google Cloud projects come with the Compute Engine default service account, identifiable using the following email:

PROJECT_NUMBER-compute@developer.gserviceaccount.com

The service account must exist in your Google Cloud project; otherwise, the Cloud Life Sciences API cannot run pipelines on Compute Engine VMs. If you delete the service account from your project, any applications that depend on the service account's credentials might fail. If you accidentally delete the Compute Engine default service account, you can try to recover the account within 30 days. See Undeleting a service account for more information.
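
To check whether the account is still present, you can describe it with the Google Cloud CLI. This is a minimal sketch: the helper names compute_default_sa_email and check_compute_default_sa are hypothetical, and a failed lookup can also mean that the caller lacks permission to view the account.

```shell
# Build the Compute Engine default service account email for a project number.
# Usage: compute_default_sa_email PROJECT_NUMBER
compute_default_sa_email() {
  printf '%s-compute@developer.gserviceaccount.com\n' "$1"
}

# Report whether the account can be described. The describe command exits
# non-zero when it cannot find the account or cannot access it.
# Usage: check_compute_default_sa PROJECT_ID PROJECT_NUMBER
check_compute_default_sa() {
  ccs_email=$(compute_default_sa_email "$2")
  if gcloud iam service-accounts describe "$ccs_email" \
      --project="$1" >/dev/null 2>&1; then
    echo "found: $ccs_email"
  else
    echo "missing or not visible: $ccs_email"
    return 1
  fi
}
```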

Cannot authenticate to the Cloud Life Sciences API

If you are running a pipeline with the Cloud Life Sciences API using a service account as your credentials (as opposed to using gcloud auth application-default login as your credentials), ensure that the service account has the following roles:

  • roles/lifesciences.workflowsRunner
  • roles/iam.serviceAccountUser

To add these roles to your service account, complete the following steps using either the Google Cloud console or the Google Cloud CLI:

Console

  1. Make sure that you have enabled the Cloud Life Sciences API.
  2. On the IAM page in Google Cloud console, find your service account.
  3. In the Inheritance column that matches the service account, click the pencil icon. The Edit permissions pane opens.
  4. Click Add another role and then search for the Life Sciences Workflows Runner and Service Account User roles.
  5. Select each role and then click Save. The lifesciences.workflowsRunner and iam.serviceAccountUser roles are then added to the service account.

gcloud

To add the service account permissions, run the gcloud projects add-iam-policy-binding command. To find the PROJECT_ID and PROJECT_NUMBER, see Identifying projects.

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:service-PROJECT_NUMBER@SERVICE_ACCOUNT_ID.iam.gserviceaccount.com \
    --role=roles/lifesciences.workflowsRunner
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:service-PROJECT_NUMBER@SERVICE_ACCOUNT_ID.iam.gserviceaccount.com \
    --role=roles/iam.serviceAccountUser

Cannot authenticate using Application Default Credentials

When calling the Cloud Life Sciences API, you might receive an error message indicating that your "Application Default Credentials" are unavailable.

See Setting up authentication for server-to-server applications for information on how to configure Application Default Credentials or how to pass in authentication credentials manually to an application or command.
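
Before changing any configuration, it can help to see which credential file, if any, is discoverable. The sketch below checks only the two file-based sources: the GOOGLE_APPLICATION_CREDENTIALS environment variable and the file that gcloud auth application-default login writes. It assumes the Linux and macOS location of that file, does not check credentials provided by an attached service account, and uses a hypothetical helper name, adc_file_source.

```shell
# Print which file-based source of Application Default Credentials is present.
# Assumption: the gcloud credential file lives under $HOME/.config/gcloud,
# which is the Linux and macOS location.
# Usage: adc_file_source
adc_file_source() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    if [ -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
      echo "env: $GOOGLE_APPLICATION_CREDENTIALS"
      return 0
    fi
    echo "GOOGLE_APPLICATION_CREDENTIALS points to a missing file" >&2
    return 1
  fi
  adc_default="${HOME}/.config/gcloud/application_default_credentials.json"
  if [ -f "$adc_default" ]; then
    echo "gcloud: $adc_default"
    return 0
  fi
  echo "no credential file found" >&2
  return 1
}
```

If the function reports that no file was found, running gcloud auth application-default login or setting GOOGLE_APPLICATION_CREDENTIALS to the path of a service account key file are the two usual ways to provide one.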

Error codes

The Cloud Life Sciences API can return the following error codes:

RESOURCE_EXHAUSTED (8)

Code: 8

Status: RESOURCE_EXHAUSTED

Category: User error

Description: A resource has been exhausted. This can indicate that your application has exhausted a project-level admin API quota.

Recommended action: Retry the operation.

FAILED_PRECONDITION (9)

Code: 9

Status: FAILED_PRECONDITION

Category: User error

Full error: Execution failed: while running "[USER_COMMAND_LINE]": unexpected exit status [NUMBER] was not ignored

Description: The operation was rejected because a user action returned a non-zero exit status. The error includes a snippet of the action's standard error output that you can use to diagnose the issue. To upload the full logs from the Compute Engine virtual machine (VM), add an action with the ALWAYS_RUN flag to the pipeline request, as in the following example:

{
  "commands": [
    "-c",
    "gcloud storage cp /google/logs/output gs://CLOUD_STORAGE_BUCKET/output --quiet"
  ],
  "entrypoint": "bash",
  "flags": [ "ALWAYS_RUN" ],
  "imageUri": "gcr.io/cloud-genomics-pipelines/io"
}

Recommended action: Do not retry without fixing the problem.

ABORTED (10)

Code: 10

Status: ABORTED

Category: System error

Full error: The assigned worker has failed to complete the operation

Description: The operation was aborted because the Compute Engine VM running the pipeline failed, possibly because it was preempted and could not report its status before terminating.

Recommended action: Retry the operation. If the error reoccurs consistently, there might be an issue causing the Compute Engine VM to fail, such as using too many resources. Inspect the Compute Engine logs in Cloud Logging for further information.

INTERNAL (13)

Code: 13

Full error: Execution failed: generic::internal: action INDEX: waiting for container: container is still running, possibly due to low system resources

Description: The container for the action might have run out of memory.

Recommended action: Retry the pipeline using a larger machine type.

UNAVAILABLE (14)

Code: 14

Status: UNAVAILABLE

Category: System error

Full error: Execution failed: worker was terminated

Description: The Compute Engine VM running the pipeline was preempted.

Recommended action: Retry the operation.

Retrying after encountering errors

A pipeline might fail and return an error code. Failures can occur due to issues unrelated to the work that the pipeline was doing. In most cases, you should retry the pipeline operation. Pipelines are prone to failures when you use preemptible VMs, which are cheaper but more likely to encounter interruptions. The Cloud Life Sciences API cannot retry pipeline operations automatically because not all pipelines are idempotent.

As shown in the error codes section, retrying is typically recommended when you encounter any of the following error codes:

  • RESOURCE_EXHAUSTED (8)
  • ABORTED (10)
  • UNAVAILABLE (14)
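
Because the API does not retry on your behalf, a client-side retry loop is one option for pipelines that you know are idempotent. The following is a sketch under stated assumptions, not a recommended implementation: is_retryable_code and retry_pipeline are hypothetical helpers, and the wrapped command is assumed to exit with the operation's error code (0 on success), which you would have to arrange yourself.

```shell
# Succeed when CODE is one of the error codes this page marks as retryable.
# Usage: is_retryable_code CODE
is_retryable_code() {
  case "$1" in
    8|10|14) return 0 ;;  # RESOURCE_EXHAUSTED, ABORTED, UNAVAILABLE
    *)       return 1 ;;  # for example 9: fix the problem before retrying
  esac
}

# Run COMMAND until it succeeds, fails with a non-retryable code, or
# MAX_ATTEMPTS is reached. The delay doubles after every failed attempt.
# Assumption: COMMAND exits with the operation's error code.
# Usage: retry_pipeline MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_pipeline() {
  rp_max=$1
  rp_delay=$2
  shift 2
  rp_attempt=1
  while :; do
    rp_code=0
    "$@" || rp_code=$?
    if [ "$rp_code" -eq 0 ]; then
      return 0
    fi
    if ! is_retryable_code "$rp_code"; then
      return "$rp_code"
    fi
    if [ "$rp_attempt" -ge "$rp_max" ]; then
      return "$rp_code"
    fi
    sleep "$rp_delay"
    rp_delay=$((rp_delay * 2))
    rp_attempt=$((rp_attempt + 1))
  done
}
```

Doubling the delay is a design choice rather than a requirement; it gives quota-related failures more time to clear between attempts.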

Enabling Cloud Monitoring

You can enable Cloud Monitoring on your pipelines to monitor the health and resource usage of the worker VMs used to run the pipeline. However, enabling Monitoring can incur additional costs. To enable Monitoring, specify the enableStackdriverMonitoring flag on the VirtualMachine object when making the pipeline request.
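
As an illustration of where the flag goes, the following sketch prints a small Pipeline object with Monitoring enabled. The helper name print_monitored_pipeline is hypothetical, and the image, command, region, and machine type are placeholder assumptions that you would replace with your own values.

```shell
# Print a Pipeline object whose worker VM has Monitoring enabled.
# All values other than enableStackdriverMonitoring are placeholders.
# Usage: print_monitored_pipeline > pipeline.json
print_monitored_pipeline() {
  cat <<'EOF'
{
  "actions": [
    {
      "imageUri": "bash",
      "commands": ["-c", "echo hello"]
    }
  ],
  "resources": {
    "regions": ["us-central1"],
    "virtualMachine": {
      "machineType": "n1-standard-1",
      "enableStackdriverMonitoring": true
    }
  }
}
EOF
}
```

The resulting file could then be passed to gcloud beta lifesciences pipelines run with the --pipeline-file flag.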

Pipeline is running out of disk space

If your pipeline runs out of disk space and cannot pull Docker images, or if it needs more disk space to log or perform its tasks, you can choose one of the following options:

  • Increase the boot disk size using the bootDiskSizeGb flag in the VirtualMachine object when making the pipeline request.
  • Attach a separate disk and add it to the Mount object inside the Action object when making the pipeline request.
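
Both options can be combined in one request. The sketch below prints a Pipeline object that enlarges the boot disk and mounts an additional disk into the action. The helper name print_disk_pipeline is hypothetical, and the disk name, sizes, mount path, image, region, and machine type are placeholder assumptions.

```shell
# Print a Pipeline object with a larger boot disk and an extra mounted disk.
# The "disk" value in the mount must match the "name" of an attached disk.
# Usage: print_disk_pipeline > pipeline.json
print_disk_pipeline() {
  cat <<'EOF'
{
  "actions": [
    {
      "imageUri": "bash",
      "commands": ["-c", "df -h /mnt/data"],
      "mounts": [
        { "disk": "data", "path": "/mnt/data" }
      ]
    }
  ],
  "resources": {
    "regions": ["us-central1"],
    "virtualMachine": {
      "machineType": "n1-standard-1",
      "bootDiskSizeGb": 50,
      "disks": [
        { "name": "data", "sizeGb": 200 }
      ]
    }
  }
}
EOF
}
```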

Encountering quota delays

If your Google Cloud project is out of Compute Engine quota, the Cloud Life Sciences API does not allocate VMs. Any further allocation attempts are delayed to give existing pipelines time to complete. If this delay continues to occur, you can request an increase in quota.

Pipelines are stopping

If your pipelines stop and repeatedly assign and release VMs because the VMs cannot communicate with the Cloud Life Sciences API service, the cause might be the following two conditions occurring together:

  • The VMs' default network might have been deleted.
  • Another network was not specified in the Network object.

To solve this issue, do one of the following:

  • Specify a network in the Network object.
  • Recreate the default network using the steps in Creating an auto mode network and name the new network default.
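
For the second option, the check and the recreation can be scripted. This is a minimal sketch with a hypothetical helper name, ensure_default_network. Note that a failed describe can also mean that the caller lacks permission to view the network, so review the result before relying on it.

```shell
# Create an auto mode network named "default" if describing it fails.
# Usage: ensure_default_network PROJECT_ID
ensure_default_network() {
  if gcloud compute networks describe default \
      --project="$1" >/dev/null 2>&1; then
    echo "default network already exists"
  else
    gcloud compute networks create default \
      --subnet-mode=auto --project="$1"
  fi
}
```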

Canceling or deleting VMs

Rather than deleting unwanted worker VMs, you should cancel their associated operations. If you delete the VM, the pipeline fails slowly and, if it is in the process of starting up, a new VM might be assigned.
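
Canceling an operation from the Google Cloud CLI might look like the following sketch. The helper names cancel_pipeline and show_pipeline_state are hypothetical, OPERATION_ID is a placeholder for the ID returned when the pipeline was started, and the commands assume the operation is in the CLI's default location.

```shell
# Cancel a pipeline operation instead of deleting its worker VM.
# Usage: cancel_pipeline OPERATION_ID
cancel_pipeline() {
  gcloud beta lifesciences operations cancel "$1"
}

# Print the operation's "done" field to confirm that it has finished.
# Usage: show_pipeline_state OPERATION_ID
show_pipeline_state() {
  gcloud beta lifesciences operations describe "$1" --format="value(done)"
}
```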