Troubleshooting notebooks

Nothing happens after clicking "Open JupyterLab"

Verify that your browser does not block pop-up tabs. JupyterLab opens in a new browser tab.

No Inverting Proxy server access to JupyterLab

AI Platform Notebooks uses a Google internal Inverting Proxy server to provide access to JupyterLab. AI Platform Notebooks instance settings, network configuration, and other factors can prevent access to JupyterLab. Use SSH to connect to JupyterLab and learn more about why you might not have access through the Inverting Proxy.

Unable to SSH into notebook instance

AI Platform Notebooks uses OS Login to enable SSH access. OS Login is enabled automatically when an AI Platform Notebooks instance is created, by setting the enable-oslogin metadata entry to TRUE. To give users SSH access to an instance, complete the steps for configuring OS Login roles on user accounts.

Opening a notebook results in a 403 (Forbidden) error

There are three access modes for JupyterLab on a notebook instance:

  • Single User
  • Service Account
  • Project Editors

The access mode is set when the notebook instance is created and is stored in the instance metadata:

  • Single User: proxy-mode=mail, proxy-user-mail=user@domain.com
  • Service Account: proxy-mode=service_account
  • Project Editors: proxy-mode=project_editors
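
The mapping above could be captured in a small shell helper that builds the value passed to the --metadata flag at creation time. This is only a sketch: the function name proxy_metadata is hypothetical, while the keys and values are the ones listed above.

```shell
# Hypothetical helper: print the --metadata value for a given access mode.
# Keys and values are taken from the list above; the function name is
# illustrative only.
proxy_metadata() {
  local mode="$1"
  local user_mail="${2:-}"
  case "$mode" in
    mail)
      # Single User mode also needs the user's email address.
      if [ -z "$user_mail" ]; then
        echo "mail mode requires a user email" >&2
        return 1
      fi
      echo "proxy-mode=mail,proxy-user-mail=${user_mail}"
      ;;
    service_account|project_editors)
      echo "proxy-mode=${mode}"
      ;;
    *)
      echo "unknown access mode: ${mode}" >&2
      return 1
      ;;
  esac
}

proxy_metadata mail user@domain.com
proxy_metadata project_editors
```

The printed string can then be supplied as the value of --metadata when you create an instance.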

If you can't access a notebook when you click Open JupyterLab, try the following:

  • Verify that the user accessing the instance has the iam.serviceAccounts.actAs permission for the defined service account. The service account on the instance provides access to other Google Cloud services. You can use any service account within the same project, but you must have the Service Account User role, which includes the iam.serviceAccounts.actAs permission, to access the instance. If you don't specify a service account, the Compute Engine default service account is used, and the same permission is required for it.

The following example shows how to specify a service account when you create an instance:

gcloud beta notebooks instances create nb-1 \
  --vm-image-family=tf2-latest-cpu \
  --metadata=proxy-mode=mail,proxy-user-mail=user@domain.com \
  --service-account=your_service_account@project_id.iam.gserviceaccount.com \
  --location=us-west1-a

  • When you click Open JupyterLab to open a notebook, the notebook opens in a new browser tab. If you are signed in to more than one Google account, the new tab opens with your default Google account. If you did not create your notebook instance with your default Google account, the new browser tab will show a 403 (Forbidden) error.

Opening a notebook results in a 504 (Gateway Timeout) error

A 504 error indicates an internal proxy timeout or a backend server (Jupyter) timeout. It can occur when:

  • The request never reached the internal Inverting Proxy server.
  • The backend (Jupyter) returns a 504 error.

If you can't access a notebook:

  • Open a Google support case.

Opening a notebook results in a 524 (A Timeout Occurred) error

The internal Inverting Proxy server hasn't received a response from the Inverting Proxy agent for the request within the timeout period. The Inverting Proxy agent runs inside your notebook instance as a Docker container. A 524 error usually indicates that the Inverting Proxy agent isn't connecting to the Inverting Proxy server, or that requests are taking too long on the backend server (Jupyter). The cause is typically on the user side, for example a networking issue, or an Inverting Proxy agent or Jupyter service that isn't running.

If you can't access a notebook, try the following:

  • Verify that your notebook is started.

  • Verify that the notebook instance disk is not out of space.

    1. Connect to your Deep Learning VM using SSH. For information on connecting to a VM using SSH, see Connecting to instances.

    2. Run the following command:

      df -h -T /home/jupyter
      

      If the Use% is above 85%, you need to manually delete files from /home/jupyter. As a first step, you can empty the trash with the following command:

      sudo rm -rf /home/jupyter/.local/share/Trash/*
      
  • Verify that the Docker service is started.

  • Verify that the Inverting Proxy agent is running. If the agent is started, try restarting it.

  • Make sure the Jupyter service is running. If it is, try restarting it.

  • Verify that you are using AI Platform Deep Learning VM Image version M55 or later.
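
The disk check in the list above can be scripted. The following sketch assumes df -P (POSIX output format, without the -T column used above), so that the usage percentage is always the fifth field. The function names are illustrative, and the 85% threshold is the one mentioned above.

```shell
# Print the Use% of a filesystem as a bare number, reading `df -P` output
# on stdin. With -P each filesystem stays on one line and the capacity is
# the fifth column.
use_percent_from_df() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return success when the usage is above the threshold (85 by default).
disk_nearly_full() {
  local used="$1"
  local threshold="${2:-85}"
  [ "$used" -gt "$threshold" ]
}

# Example against the root filesystem; on a notebook instance you would
# pass /home/jupyter instead.
used="$(df -P / | use_percent_from_df)"
if disk_nearly_full "$used"; then
  echo "Disk is ${used}% full: free up space"
else
  echo "Disk is ${used}% full: OK"
fi
```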

Opening a notebook results in a 598 (Network read timeout) error

The Inverting Proxy server hasn't heard from the Inverting Proxy agent at all for more than 10 minutes. This strongly indicates a problem with the Inverting Proxy agent or with Jupyter.

If you can't access a notebook, try the following:

  • Verify that your notebook is started.

  • Verify that the Docker service is started.

  • Verify that the Inverting Proxy agent is running. If the agent is started, try restarting it.

  • Make sure the Jupyter service is running. If it is, try restarting it.

  • Verify that you are using AI Platform Deep Learning VM Image version M55 or later.

Downloading files from JupyterLab results in a 403 (Forbidden) error

The "notebook" package in the M23 release of Deep Learning VM includes a bug that prevents you from downloading a file using the JupyterLab UI. You can read more about the bug at Cannot download files after JL update and Download file functionality is broken in notebook packages version 5.7.6+ (5.7.7, 5.7.8).

If you are using the M23 release of Deep Learning VM, you can resolve the issue in one of two ways:

  • Use a Safari browser. The download functionality works for Safari.

  • Downgrade your notebook package to version 5.7.5.

    To downgrade your notebook package:

    1. Connect to your Deep Learning VM using SSH. For information on connecting to a VM using SSH, see Connecting to instances.

    2. Run the following commands:

      sudo pip3 install notebook==5.7.5
      sudo service jupyter restart
      

After restarting VM, local files cannot be referenced from notebook terminal

Sometimes after restarting an AI Platform Notebooks instance, local files cannot be referenced from within a notebook terminal.

This is a known issue. To reference your local files from within a notebook terminal, first re-establish your current working directory using the following command:

cd PWD

In this command, replace PWD with your current working directory. For example, if your current working directory was /home/jupyter/, use the command cd /home/jupyter/.

After re-establishing your current working directory, your local files can be referenced from within the notebook terminal.

GPU quota has been exceeded

Determine the number of GPUs available in your project by checking the quotas page. If GPUs are not listed on the quotas page, or you require additional GPU quota, you can request a quota increase. See Requesting additional quota on the Compute Engine Resource Quotas page.
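
As a command-line alternative to the quotas page, regional quotas can also be listed with gcloud and filtered for GPU metrics. The sketch below is an assumption, not part of the original procedure: the helper name remaining_gpu_quota is hypothetical, and it expects tab-separated metric, limit, and usage values such as those that gcloud's value() format would print.

```shell
# Read "metric<TAB>limit<TAB>usage" lines on stdin and print the remaining
# quota (limit minus usage) for every metric whose name contains "GPU".
remaining_gpu_quota() {
  awk -F '\t' '$1 ~ /GPU/ { printf "%s %d\n", $1, $2 - $3 }'
}

# Assumed usage against a real project (REGION is a placeholder):
#   gcloud compute regions describe REGION \
#     --flatten="quotas[]" \
#     --format="value(quotas.metric,quotas.limit,quotas.usage)" \
#     | remaining_gpu_quota

# Sample input for illustration only; these numbers are made up.
printf 'CPUS\t24.0\t4.0\nNVIDIA_T4_GPUS\t4.0\t1.0\n' | remaining_gpu_quota
```

A remaining value of 0 for the GPU type you need would suggest that a quota increase is required.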

New notebook is not created (insufficient permissions)

It usually takes about a minute to create a notebook instance. If your new notebook instance remains in "pending" state indefinitely, it might be because the service account used to start the notebook instance does not have the required Editor permission in your Google Cloud Platform (GCP) project.

You can start a notebook instance with a custom service account that you create, or in single-user mode with a user ID. If you start a notebook instance in single-user mode, the instance begins the boot process using the Compute Engine default service account before turning control over to your user ID.

To verify that a service account has the appropriate permissions, follow these steps:

Console

  1. Open the IAM page in the Cloud Console.

  2. Determine the service account used with your notebook instance, which is one of the following:

    • A custom service account that you specified when you created your notebook instance.

    • The Compute Engine default service account for your GCP project, which is used when you start your notebook instance in single-user mode. The Compute Engine default service account for your GCP project is named project-number-compute@developer.gserviceaccount.com. For example: 113377992299-compute@developer.gserviceaccount.com.

  3. Verify that your service account has the Editor role.

  4. If it doesn't, grant the Editor role to the service account.

For more information, see Granting, changing, and revoking access to resources in the IAM documentation.

gcloud

  1. If you have not already, install the gcloud command-line tool.

  2. Get the name and project number for your GCP project with the following command. Replace project-id with the project ID for your GCP project.

    gcloud projects describe project-id
    

    You should see output similar to the following, which displays the name (name:) and project number (projectNumber:) for your project.

    createTime: '2018-10-18T21:03:31.408Z'
    lifecycleState: ACTIVE
    name: my-project-name
    parent:
     id: '396521612403'
     type: folder
    projectId: my-project-id-1234
    projectNumber: '113377992299'
    
  3. Determine the service account used with your notebook instance, which is one of the following:

    • A custom service account that you specified when you created your notebook instance.

    • The Compute Engine default service account for your GCP project, which is used when you start your notebook instance in single-user mode. The Compute Engine default service account for your GCP project is named project-number-compute@developer.gserviceaccount.com. For example: 113377992299-compute@developer.gserviceaccount.com.

  4. Add the roles/editor role to the service account with the following command. Replace project-id with the project ID for your GCP project, and replace service-account-id with the email address of the service account for your notebook instance.

    gcloud projects add-iam-policy-binding project-id \
     --member serviceAccount:service-account-id \
     --role roles/editor
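
The naming pattern for the Compute Engine default service account described above can be expressed as a one-line helper. This is a minimal sketch; the function name default_compute_sa is hypothetical, and the project number is the example value used above.

```shell
# Build the Compute Engine default service account address from a project
# number, following the pattern project-number-compute@developer.gserviceaccount.com.
default_compute_sa() {
  local project_number="$1"
  echo "${project_number}-compute@developer.gserviceaccount.com"
}

# Example with the project number shown in the sample output above.
default_compute_sa 113377992299
```

The result can be used as the service account address in the add-iam-policy-binding command.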
    

Creating an instance results in a "Permission denied" error

When creating a new instance, verify that the user creating the instance has the iam.serviceAccounts.actAs permission for the defined service account.

The service account on the instance provides access to other Google Cloud services. You can use any service account within the same project, but you must have the Service Account User role, which includes the iam.serviceAccounts.actAs permission, to create the instance. If you don't specify a service account, the Compute Engine default service account is used.

The following example shows how to specify a service account when you create an instance:

gcloud beta notebooks instances create nb-1 \
  --vm-image-family=tf2-latest-cpu \
  --service-account=your_service_account@project_id.iam.gserviceaccount.com \
  --location=us-west1-a

To grant the Service Account User role, see Allowing a member to impersonate a single service account.
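
For illustration, granting that access on a single service account might look like the following. This is a sketch under assumptions: the wrapper function and its RUN switch are hypothetical, the email addresses are placeholders, and by default the command is only printed so that you can review it before running it.

```shell
# Print (or, with RUN=1, execute) the gcloud command that grants a user the
# Service Account User role on one service account.
grant_service_account_user() {
  local sa_email="$1"
  local user_email="$2"
  # Reuse the positional parameters to hold the command and its arguments.
  set -- gcloud iam service-accounts add-iam-policy-binding "$sa_email" \
    --member="user:${user_email}" \
    --role="roles/iam.serviceAccountUser"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Dry run with placeholder addresses.
grant_service_account_user \
  your_service_account@project_id.iam.gserviceaccount.com user@domain.com
```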

Creating a new instance results in an "already exists" error

When creating a new instance, check whether an AI Platform Notebooks instance with the same name was previously deleted through Compute Engine but still exists in the AI Platform Notebooks API database.

The following example shows how to list instances using the AI Platform Notebooks API and verify their state.

gcloud beta notebooks instances list --location=LOCATION

If an instance's state is DELETED, run the following command to delete it permanently.

gcloud beta notebooks instances delete INSTANCE_NAME --location=LOCATION
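
To find such instances without reading the full listing, the list output could be filtered by state. The sketch below assumes that the list command is run with a value(name,state) format, which prints tab-separated fields; the helper name deleted_instances is hypothetical.

```shell
# Read "name<TAB>state" lines on stdin and print the names of instances
# whose state is DELETED.
deleted_instances() {
  awk -F '\t' '$2 == "DELETED" { print $1 }'
}

# Assumed usage (LOCATION is a placeholder):
#   gcloud beta notebooks instances list --location=LOCATION \
#     --format="value(name,state)" | deleted_instances

# Sample input for illustration only.
printf 'nb-1\tACTIVE\nnb-2\tDELETED\n' | deleted_instances
```

Each name printed is a candidate for the permanent delete command shown above.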

Notebook is unresponsive

If your notebook instance isn't executing cells or appears to be frozen, first try restarting the kernel by clicking Kernel from the top menu and then Restart Kernel. If that doesn't work, you can try the following:

  • From a terminal session in the notebook, run top to see if there are processes consuming the CPU.
  • From the terminal, check the amount of free disk space using df or available RAM using free.
  • Shut your instance down by selecting it from the Notebook instances page and clicking Stop. Once it has stopped completely, select it and click Start.

To re-register with Inverting Proxy server

To re-register the notebook instance with the internal Inverting Proxy server, you can stop and start the VM from the Notebook instances page or you can log in to the notebook instance via SSH and enter:

cd /opt/deeplearning/bin
sudo ./attempt-register-vm-on-proxy.sh

Verify the Docker service status

To verify the Docker service status you can log in to the notebook instance via SSH and enter:

sudo service docker status

Verify that the Inverting Proxy agent is running

To verify that the notebook Inverting Proxy agent is running, log in to the notebook instance via SSH and enter:

# Confirm Inverting Proxy agent Docker container is running (proxy-agent)
sudo docker ps

# Verify State.Status is running and State.Running is true.
sudo docker inspect proxy-agent

# Grab logs
sudo docker logs proxy-agent
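
The State.Status and State.Running check above can be reduced to a single test by asking docker inspect for only those two fields. This is a sketch: the helper name agent_is_running is hypothetical, and the sample state value stands in for real docker inspect output.

```shell
# Interpret a "<status> <running>" pair and return success only when the
# container is reported as running.
agent_is_running() {
  [ "$1" = "running" ] && [ "$2" = "true" ]
}

# On the instance, the pair could be obtained with (assumed usage):
#   sudo docker inspect \
#     --format '{{.State.Status}} {{.State.Running}}' proxy-agent
state="running true"   # sample value for illustration

# Word splitting is intentional: $state expands to two arguments.
if agent_is_running $state; then
  echo "proxy-agent is running"
else
  echo "proxy-agent is not running: check 'sudo docker logs proxy-agent'"
fi
```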

Verify the Jupyter service status and collect logs

To verify the Jupyter service status you can log in to the notebook instance via SSH and enter:

sudo service jupyter status

To collect Jupyter service logs:

sudo journalctl -u jupyter.service --no-pager
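
The full journal can be long, so it may help to narrow it to lines that look like failures. The filter below is only a heuristic sketch: the helper name jupyter_log_errors and the choice of keywords are assumptions, not part of the Jupyter service.

```shell
# Keep only log lines that mention an error or a Python traceback
# (case-insensitive).
jupyter_log_errors() {
  grep -iE 'error|traceback'
}

# Assumed usage on the instance:
#   sudo journalctl -u jupyter.service --no-pager | jupyter_log_errors

# Sample input for illustration only.
printf 'Jan 1 server started\nJan 1 OSError: disk full\n' | jupyter_log_errors
```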

Verify the Jupyter internal API is active

To verify the Jupyter internal API is active you can log in to the notebook instance via SSH and enter:

curl http://127.0.0.1:8080/api/kernelspecs
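
When the check needs a yes-or-no answer, for example in a script, curl can be asked for only the HTTP status code. The following sketch assumes that a 200 response means the API is active; the function names are hypothetical.

```shell
# Return success when the given HTTP status code is 200.
jupyter_api_ok() {
  [ "$1" = "200" ]
}

# Fetch only the status code of the kernelspecs endpoint. When nothing is
# listening on the port, curl reports the code as 000.
check_jupyter_api() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    http://127.0.0.1:8080/api/kernelspecs 2>/dev/null)"
  if jupyter_api_ok "$code"; then
    echo "Jupyter API is active"
  else
    echo "Jupyter API is not responding (HTTP ${code:-none})"
    return 1
  fi
}

# Run check_jupyter_api from an SSH session on the notebook instance.
```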

Restart the Docker service

To restart the Docker service, you can stop and start the VM from the Notebook instances page or you can log in to the notebook instance via SSH and enter:

sudo service docker restart

Restart the Inverting Proxy agent

To restart the Inverting Proxy agent, you can stop and start the VM from the Notebook instances page or you can log in to the notebook instance via SSH and enter:

sudo docker restart proxy-agent

Restart the Jupyter service

To restart the Jupyter service, you can stop and start the VM from the Notebook instances page or you can log in to the notebook instance via SSH and enter:

sudo service jupyter restart