This page describes troubleshooting steps that you might find helpful if you run into problems using Google Compute Engine instances.
My instance will not start up. What can I do?
Here are some tips to help troubleshoot your persistent boot disk if it doesn't boot.
Examine your virtual machine instance's serial port output.
An instance's BIOS, bootloader, and kernel will print their debug messages into the instance's serial port output, providing valuable information about any errors or issues that the instance experienced. To get your serial port information, run:
gcloud compute instances get-serial-port-output INSTANCE
You can also access this information in the Google Cloud Platform Console:
- Go to the VM instances page in the GCP Console.
- Click the instance that is not booting up.
- On the instance's page, scroll to the bottom and click Serial console output.
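Serial port output can be long, so it may help to filter it for lines that look like failures. The following is a minimal sketch, not an official tool: the function name `filter_boot_errors` and the list of patterns are assumptions chosen for illustration, and real boot errors may use different wording.

```shell
# Sketch: surface likely boot-failure lines from serial port output.
# The pattern list below is an assumption, not an exhaustive or official list.
filter_boot_errors() {
  # Reads serial output on stdin and prints matching lines with line numbers.
  # "|| true" keeps the function from failing when nothing matches.
  grep -n -i -E 'panic|error|fail|cannot|unable to' || true
}

# Example (requires the gcloud CLI and a real instance name):
#   gcloud compute instances get-serial-port-output INSTANCE | filter_boot_errors
```

An empty result only means that none of the assumed patterns matched; it is still worth reading the last lines of the raw output.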
Enable interactive access to the serial console.
You can enable interactive access to an instance's serial console so you can log in and debug boot issues from within the instance, without requiring your instance to be fully booted. For more information, read Interacting with the Serial Console.
Validate that your disk has a valid file system.
If your file system is corrupted or otherwise invalid, you won't be able to launch your instance. Validate your disk's file system:
Detach the disk in question from any instance it is attached to, if applicable. The following command deletes the instance but keeps its boot disk:
gcloud compute instances delete old-instance --keep-disks boot
Start a new instance with the latest Google-provided image:
gcloud compute instances create debug-instance
Attach your disk as a non-boot disk but don't mount it. Replace DISK with the name of the disk that won't boot. The command also sets a device name so that the disk is easy to identify on the instance:
gcloud compute instances attach-disk debug-instance --disk DISK --device-name debug-disk
Connect to the instance:
gcloud compute ssh debug-instance
Look up the root partition of the disk, which is identified with the part1 notation. In this case, the root partition of the disk is at
user@debug-instance:~$ ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root  9 Jan 22 17:09 google-debug-disk -> ../../sdb
lrwxrwxrwx 1 root root 10 Jan 22 17:09 google-debug-disk-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  9 Jan 22 17:02 google-persistent-disk-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Jan 22 17:02 google-persistent-disk-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 Jan 22 17:09 scsi-0Google_PersistentDisk_debug-disk -> ../../sdb
lrwxrwxrwx 1 root root 10 Jan 22 17:09 scsi-0Google_PersistentDisk_debug-disk-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  9 Jan 22 17:02 scsi-0Google_PersistentDisk_persistent-disk-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Jan 22 17:02 scsi-0Google_PersistentDisk_persistent-disk-0-part1 -> ../../sda1
Run a file system check on the root partition:
user@debug-instance:~$ sudo fsck /dev/sdb1
fsck from util-linux 2.20.1
e2fsck 1.42.5 (29-Jul-2012)
/dev/sdb1: clean, 19829/655360 files, 208111/2621184 blocks
Mount your file system:
user@debug-instance:~$ sudo mkdir /mydisk
user@debug-instance:~$ sudo mount /dev/sdb1 /mydisk
Check that the disk has kernel files:
user@debug-instance:~$ ls /mydisk/boot/vmlinuz-*
/mydisk/boot/vmlinuz-3.2.0-4-amd64
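The on-instance checks above can be wrapped in small helper functions if you need to repeat them. This is a sketch under stated assumptions: the function names are hypothetical, the `google-NAME-part1` link name follows the directory listing shown earlier, and the status meanings come from the exit codes documented in fsck(8), where the exit status is a bit mask.

```shell
# Sketch of the on-instance checks as reusable functions.
# All function names and parameters here are hypothetical.

# Resolve the root partition for a device name, for example debug-disk.
# $1: name given to --device-name; $2: by-id directory (default /dev/disk/by-id)
root_partition() {
  by_id_dir="${2:-/dev/disk/by-id}"
  readlink -f "$by_id_dir/google-$1-part1"
}

# Describe an fsck exit status. The status is a bit mask, per fsck(8).
# $1: exit status, for example the value of $? right after running fsck
describe_fsck_status() {
  code="$1"
  if [ "$code" -eq 0 ]; then echo "no errors"; return 0; fi
  msg=""
  if [ $((code & 1)) -ne 0 ]; then msg="$msg errors-corrected"; fi
  if [ $((code & 2)) -ne 0 ]; then msg="$msg reboot-recommended"; fi
  if [ $((code & 4)) -ne 0 ]; then msg="$msg errors-left-uncorrected"; fi
  if [ $((code & 8)) -ne 0 ]; then msg="$msg operational-error"; fi
  if [ $((code & 16)) -ne 0 ]; then msg="$msg usage-error"; fi
  if [ $((code & 32)) -ne 0 ]; then msg="$msg canceled"; fi
  if [ $((code & 128)) -ne 0 ]; then msg="$msg shared-library-error"; fi
  echo "${msg# }"
}

# Succeed if the mounted file system has at least one kernel image.
# $1: mount point, for example /mydisk
has_kernel_files() {
  for f in "$1"/boot/vmlinuz-*; do
    if [ -e "$f" ]; then return 0; fi
  done
  return 1
}

# Example, run on the debug instance:
#   part="$(root_partition debug-disk)"
#   sudo fsck "$part"; describe_fsck_status "$?"
#   has_kernel_files /mydisk && echo "kernel found"
```

Passing the directories as parameters keeps the helpers usable with a mount point other than /mydisk.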
Validate that the disk has a valid master boot record (MBR).
Run the following command on the debug instance that has attached the persistent boot disk, such as
$ sudo parted /dev/sdb print
If your MBR is valid, the output lists the partition table type and the file system on each partition:
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type     File system  Flags
 1      2097kB  10.7GB  10.7GB  primary  ext4         boot
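If you want to check the parted output in a script, one possible approach is to look for the msdos partition table and a partition with the boot flag, as in the sample output above. This is only a sketch: the function name `check_mbr_output` is hypothetical, and the check mirrors the sample output rather than covering every valid layout (for example, it reports a problem for any table type other than msdos).

```shell
# Sketch: check `parted ... print` output for an msdos (MBR) partition table
# and at least one partition row carrying the boot flag.
# The function name is hypothetical; the criteria mirror the sample output only.
check_mbr_output() {
  # Reads parted output on stdin; prints "ok" and succeeds, or prints a
  # short diagnosis and fails.
  awk '
    /^Partition Table:/ { table = $3 }
    /^ *[0-9]+ / && /boot/ { boot = 1 }
    END {
      if (table == "msdos" && boot) { print "ok"; exit 0 }
      print "problem: table=" table " boot=" (boot ? "yes" : "no")
      exit 1
    }'
}

# Example, run on the debug instance:
#   sudo parted /dev/sdb print | check_mbr_output
```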
What does it mean for my instance to be in
Why is network traffic to/from my instance being dropped?
Google Compute Engine only allows network traffic that is explicitly permitted by your project's firewall rules to reach your instance. By default, all projects automatically come with a default network that allows certain kinds of connections. If you deny all traffic by default, that will also deny SSH connections and all internal traffic. For more information, see the Firewall Rules page.
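To see which rules apply and to re-allow SSH, the gcloud CLI can be used along the following lines. This is a sketch, not a recommended configuration: the rule name `allow-ssh`, the function names, and the choice of the `default` network are illustrative assumptions, and the rule as written does not restrict source addresses.

```shell
# Sketch: inspect firewall rules and add a rule that allows SSH (TCP port 22).
# The rule name "allow-ssh" and the function names are illustrative assumptions.

list_firewall_rules() {
  gcloud compute firewall-rules list
}

allow_ssh() {
  # $1: network name (assumed to be "default" when omitted)
  gcloud compute firewall-rules create allow-ssh \
    --network "${1:-default}" \
    --allow tcp:22
}

# Example (requires the gcloud CLI and permission to edit firewall rules):
#   list_firewall_rules
#   allow_ssh default
```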
In addition, you may need to adjust TCP keep-alive settings to work around the default idle connection timeout of 10 minutes. For more information, see Communicating between your instances and the Internet.
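On a Linux instance, one way to check this is to compare the kernel's keep-alive time with the 10-minute (600-second) timeout. The helper below is a minimal sketch; its name is hypothetical, and the 60-second value in the comments is only an example.

```shell
# Sketch: check whether a TCP keep-alive time is shorter than the
# 10-minute (600 s) idle connection timeout described above.
# The function name is hypothetical.
IDLE_TIMEOUT_SECONDS=600

keepalive_within_timeout() {
  # $1: keep-alive time in seconds
  [ "$1" -lt "$IDLE_TIMEOUT_SECONDS" ]
}

# On a Linux instance, the current value can be read and lowered like this
# (60 seconds is an example value, not a requirement):
#   sysctl -n net.ipv4.tcp_keepalive_time
#   sudo sysctl -w net.ipv4.tcp_keepalive_time=60
```

A value set with `sysctl -w` does not persist across reboots.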