Troubleshooting SSH

A Google Compute Engine instance can stop accepting SSH connections for many reasons, from a full disk to an accidental misconfiguration of sshd. This section describes several approaches for troubleshooting and resolving common SSH issues.

Check your firewall rules

Google Compute Engine provisions each project with a default set of firewall rules, one of which permits SSH traffic. If that rule is removed, you will be unable to access your instance. List your firewall rules with the gcloud compute command-line tool and make sure the default-allow-ssh rule is present. If it is missing, add it back:

gcloud compute firewall-rules list
gcloud compute firewall-rules create default-allow-ssh --allow tcp:22
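
The check-then-create step above can be scripted. The following is a minimal sketch, assuming the rule names arrive one per line on standard input (for instance from gcloud compute firewall-rules list --format='value(name)'). The helper name has_default_ssh_rule is hypothetical and not part of the gcloud tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): succeeds if a rule named exactly
# default-allow-ssh appears in the list of rule names read from standard input.
has_default_ssh_rule() {
  grep -qx 'default-allow-ssh'
}

# Intended use (requires the gcloud tool and an authenticated project):
#   gcloud compute firewall-rules list --format='value(name)' | has_default_ssh_rule \
#     || gcloud compute firewall-rules create default-allow-ssh --allow tcp:22
```

This check matches on the rule name only. It does not confirm that the rule still allows tcp:22, so inspect the full listing if the rule is present but connections still fail.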

Debug the issue in the serial console

You can enable read-write access to an instance's serial console so you can log into the console and troubleshoot problems with the instance. This is particularly useful when you cannot log in with SSH or if the instance has no connection to the network. The serial console remains accessible in both these conditions.

To learn how to enable interactive access and connect to an instance's serial console, read Interacting with the Serial Console.

Test the network

You can use the netcat tool to connect to your instance on port 22 and see whether the network connection is working. If you connect and see an SSH banner (e.g. SSH-2.0-OpenSSH_6.0p1 Debian-4), your network connection is working, and you can rule out firewall problems. First, use the gcloud tool to obtain the external IP address (natIP) of your instance:

gcloud compute instances describe example-instance --format='get(networkInterfaces[0].accessConfigs[0].natIP)'

Use the nc command to connect to your instance:

# Check for SSH banner
user@local:~$ nc [EXTERNAL_IP] 22
SSH-2.0-OpenSSH_6.0p1 Debian-4
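
The banner check can also be automated. The sketch below is one possible approach, not an official tool: is_ssh_banner and probe_ssh are hypothetical names, and probe_ssh assumes an nc variant that accepts -w as a timeout in seconds. Netcat implementations differ in how they handle timeouts and end of input, so adjust the flags for your system.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if its argument looks like an SSH
# identification string, which begins with "SSH-".
is_ssh_banner() {
  case "$1" in
    SSH-*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Hypothetical wrapper: reads the first line the server sends on port 22
# and checks it. Assumes nc supports -w (timeout in seconds).
# Argument: the external IP address of the instance.
probe_ssh() {
  banner=$(nc -w 5 "$1" 22 </dev/null | head -n 1)
  is_ssh_banner "$banner"
}
```

For example, probe_ssh [EXTERNAL_IP] && echo "network path to sshd is working" reports success only when a banner was received.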

Try a new user

The issue that prevents you from logging in might be limited to your account (e.g. if the permissions on the ~/.ssh/authorized_keys file on the instance were set incorrectly).

Try logging in as a fresh user with the gcloud tool by specifying another username with the SSH request. The gcloud tool will update the project's metadata to add the new user and allow SSH access.

user@local:~$ gcloud compute ssh [USER]@example-instance

where [USER] is a new username to log in with.

Use your disk on a new instance

If the steps above don't work and the instance boots from a persistent disk, you can delete the instance while keeping its boot disk, and then attach that disk to a new instance. Replace DISK in the following example with your disk name:

gcloud compute instances delete old-instance --keep-disks=boot
gcloud compute instances create new-instance --disk name=DISK,boot=yes,auto-delete=no
gcloud compute ssh new-instance
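
Because the first command deletes an instance, it can help to preview the whole sequence before running it. The following is a minimal dry-run sketch: plan_disk_move is a hypothetical helper that only prints the commands and never calls gcloud. Note that the properties passed to --disk are comma-separated.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands that move a boot
# disk from an old instance to a new one.
# Arguments: old instance name, new instance name, disk name.
plan_disk_move() {
  old=$1
  new=$2
  disk=$3
  printf 'gcloud compute instances delete %s --keep-disks=boot\n' "$old"
  printf 'gcloud compute instances create %s --disk name=%s,boot=yes,auto-delete=no\n' "$new" "$disk"
  printf 'gcloud compute ssh %s\n' "$new"
}
```

Running plan_disk_move old-instance new-instance DISK prints the three commands so you can review the names before executing them yourself.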

Inspect an instance without shutting it down

You might have an instance you can't connect to that continues to correctly serve production traffic. In this case, you might want to inspect the disk without interrupting the instance's ability to serve users. First, take a snapshot of the instance's boot disk, then create a new disk from that snapshot, create a temporary instance, and finally attach and mount the new persistent disk to your temporary instance to troubleshoot the disk.

  1. Create a new VPC network to host your cloned instance:

    gcloud compute networks create debug-network
  2. Add a firewall rule to allow SSH connections to the network:

    gcloud compute firewall-rules create debug-network-allow-ssh --network debug-network --allow tcp:22
  3. Create a snapshot of the disk in question, replacing DISK with the disk name:

    gcloud compute disks snapshot DISK --snapshot-names debug-disk-snapshot
  4. Create a new disk with the snapshot you just created:

    gcloud compute disks create example-disk-debugging --source-snapshot debug-disk-snapshot
  5. Create a new debugging instance without an external IP address:

    gcloud compute instances create debugger --network debug-network --no-address
  6. Attach the debugging disk to the instance:

    gcloud compute instances attach-disk debugger --disk example-disk-debugging
  7. Follow the instructions to connect to an instance without an external IP address.

  8. Once logged into the debugger instance, troubleshoot the instance. For example, you can look at the instance logs:

    $ sudo su -
    $ mkdir /mnt/myinstance
    $ mount /dev/disk/by-id/scsi-0Google_PersistentDisk_example-disk-debugging /mnt/myinstance
    $ cd /mnt/myinstance/var/log
    # Identify the issue preventing SSH from working
    $ ls
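
Once the disk is mounted, a quick way to narrow the search is to pull sshd-related lines out of the log directory. The sketch below is one possible approach: recent_sshd_lines is a hypothetical helper, and it assumes a grep that supports recursive search (-r). Log file names vary by distribution (for example auth.log on Debian-based images and secure on Red Hat-based ones), which is why it searches the whole directory.

```shell
#!/bin/sh
# Hypothetical helper: prints the last sshd-related lines found in the
# files under the given log directory. Matches from different files are
# concatenated, so the output is not strictly chronological.
# Arguments: log directory, optional maximum number of lines (default 20).
recent_sshd_lines() {
  grep -rhi 'sshd' "$1" 2>/dev/null | tail -n "${2:-20}"
}

# Intended use on the debugger instance:
#   recent_sshd_lines /mnt/myinstance/var/log
```

Compressed, rotated logs are not searched by this sketch, so decompress them first if the problem started some time ago.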

Use a startup script

If none of the above helped, you can create a startup script to collect information right after the instance starts. Follow the instructions for running a startup script.

Afterwards, reset your instance with gcloud compute instances reset so that the new metadata takes effect. Alternatively, you can recreate your instance with a diagnostic startup script:

  1. Run gcloud compute instances delete with the --keep-disks flag.

    gcloud compute instances delete INSTANCE --keep-disks boot
  2. Add a new instance with the same disk and specify your startup script.

    gcloud compute instances create example-instance --disk name=DISK,boot=yes --startup-script-url URL

As a starting point, you can use the compute-ssh-diagnostic script to collect diagnostics information for most common issues.
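
For orientation, the kind of checks a diagnostic startup script might run can be sketched as follows. This is not the compute-ssh-diagnostic script. All function names are hypothetical, the /usr/sbin/sshd path and the /home/*/.ssh layout are assumptions about the image, and the checks simply mirror the causes mentioned earlier in this section: a full disk, an sshd misconfiguration, and incorrect permissions on authorized_keys.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; not the compute-ssh-diagnostic script.

# Prints the percentage of space used on the filesystem holding the path.
disk_used_percent() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds if the file is writable by group or others, which sshd rejects
# for authorized_keys when StrictModes is enabled (the default).
keys_file_too_open() {
  [ -n "$(find "$1" \( -perm -020 -o -perm -002 \) 2>/dev/null)" ]
}

# Runs every check and prints a short report.
# Call this function from the startup script itself.
run_ssh_diagnostics() {
  echo "root filesystem used: $(disk_used_percent /)%"
  # sshd -t only validates the configuration; it does not start the daemon.
  # The absolute path is an assumption about where sshd is installed.
  if /usr/sbin/sshd -t 2>&1; then
    echo "sshd config: ok"
  else
    echo "sshd config: errors reported above"
  fi
  for f in /home/*/.ssh/authorized_keys; do
    [ -e "$f" ] || continue
    if keys_file_too_open "$f"; then
      echo "too permissive: $f"
    fi
  done
}
```

The report is only useful if you can read it without SSH, so this sketch assumes the script's output ends up somewhere reachable, such as the instance's serial console output.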
