Troubleshooting OS Login


This document describes how to troubleshoot OS Login using the metadata server. For information about setting up OS Login or for step-by-step instructions, see Setting up OS Login.

You can query the metadata server from within a virtual machine (VM) instance. For more information, see Storing and retrieving instance metadata.

Before you begin

  • If you haven't already, then set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init
    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Common error messages

The following are examples of common errors you might encounter when you use OS Login.

Cannot find name for group

On some VMs using OS Login, you might receive the following error message after the connection is established:

/usr/bin/id: cannot find name for group ID 123456789

Ignore this error message. This error does not affect your VMs.

Failure getting groups

You might see logs similar to the following when you create VMs:

Dec 10 22:31:05 instance-1 google_oslogin_nss_cache[381]: oslogin_cache_refresh[381]: Refreshing group entry cache
Dec 10 22:31:05 instance-1 google_oslogin_nss_cache[381]: oslogin_cache_refresh[381]: Failure getting groups, quitting

These logs indicate that your organization doesn't have OS Login Linux groups configured. Ignore these messages.

Failed precondition

You might see an error similar to the following when you connect to the VM using SSH:

ERROR: (gcloud.compute.ssh) FAILED_PRECONDITION: The specified username or UID is not unique within given system ID.

This error occurs when OS Login attempts to generate a username that already exists within an organization. This is common when a user account is deleted and a new user with the same email address is created shortly after. After a user account is deleted, it takes up to 48 hours to remove the user's POSIX information.

To resolve this issue, do one of the following:

Invalid argument

You might see errors similar to the following when you connect to a VM using SSH or use SCP to transfer files:

ERROR: (gcloud.compute.ssh) INVALID_ARGUMENT: Login profile size exceeds 32 KiB. Delete profile values to make additional space.
ERROR: (gcloud.compute.scp) INVALID_ARGUMENT: Login profile size exceeds 32 KiB. Delete profile values to make additional space.

To resolve these errors, do the following:

  1. View your OS Login profile by running the gcloud compute os-login describe-profile command:

    gcloud compute os-login describe-profile
    

    The output looks similar to the following:

    name: '00000000000000'
    posixAccounts:
    ...
    sshPublicKeys:
     ...:
       fingerprint: ...
       key: |
         ssh-rsa AAAAB3NzaC1yc2...
       name: ...
     ...
    
  2. Review the output to identify any unused SSH keys.

  3. Remove any unused keys from your login profile using the gcloud compute os-login ssh-keys remove command:

    gcloud compute os-login ssh-keys remove --key=KEY
    

    Replace KEY with the key's fingerprint or the key string.

To prevent this issue from occurring in the future, add an expiry time for SSH keys. Expired keys are automatically removed from your login profile 48 hours after expiry, or when you add a new key to your profile.
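As a minimal sketch of adding a key with an expiry time, the following shell function wraps the gcloud compute os-login ssh-keys add command and its --ttl flag. The function name add_key_with_ttl and the 30d default lifetime are illustrative assumptions, not values from this document.

```shell
# Hypothetical helper: add a public key to your OS Login profile with an expiry.
# Usage: add_key_with_ttl KEY_FILE [TTL]
add_key_with_ttl() {
  key_file="$1"
  ttl="${2:-30d}"   # assumed default lifetime; pick what fits your policy
  gcloud compute os-login ssh-keys add --key-file="$key_file" --ttl="$ttl"
}
```

For example, to add a key that expires after seven days, you might run:

    add_key_with_ttl ~/.ssh/id_ed25519.pub 7d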

HTTP response code: 503

You might see the following error when you attempt to connect to a VM using SSH:

Failed to validate organization user USERNAME has login permission, got HTTP response code: 503

This issue is caused by the metadata server rate limit of 100 queries per second per virtual machine instance. This limit cannot be adjusted. To resolve this issue, wait a few seconds, then retry the connection.

To prevent this issue in the future, try the following:

  • Implement a retry mechanism in the application code.
  • Re-use existing SSH connections.
  • Send commands in batches to reduce SSH connections and OS Login metadata queries.
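
One possible shape for the retry mechanism is a small wrapper that reruns a command with exponential backoff. This is a sketch under stated assumptions: the function name retry_with_backoff, the attempt count, and the delays are hypothetical and should be tuned to your workload.

```shell
# Hypothetical helper: retry a command with exponential backoff.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2
  while :; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))       # double the wait before the next attempt
    attempt=$((attempt + 1))
  done
}
```

For example, to try an SSH command up to five times, starting with a one-second wait:

    retry_with_backoff 5 1 gcloud compute ssh VM_NAME --zone=ZONE --command="uptime"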

Default OS Login metadata entries

Compute Engine defines a set of default metadata entries that serves OS Login information. Default metadata is always defined and set by the server. Default metadata keys are case sensitive.

The following list describes the entries you can query. Each entry is relative to http://metadata.google.internal/computeMetadata/v1/.

  • project/attributes/enable-oslogin: Checks if OS Login is enabled on the current Google Cloud project.
  • instance/attributes/enable-oslogin: Checks if OS Login is enabled on the current VM.
  • oslogin/users/: Retrieves profile information for OS Login users. You can pass query parameters such as username, uid, pagesize, and pagetoken.
  • oslogin/authorize/: Retrieves login or administrative level permission settings for an OS Login user. To check a permission, you must specify the policy query parameter. The value of the policy parameter must be set to either login (to check for login permission) or adminLogin (to check for sudo access).

Checking if OS Login is configured

Use the Google Cloud console or Google Cloud CLI to query metadata to determine if OS Login is enabled. OS Login is enabled when the enable-oslogin metadata key is set to TRUE in project or instance metadata. If both instance and project metadata are set, the value set in instance metadata takes precedence.
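
The precedence rule can also be checked from within the VM by reading both enable-oslogin entries from the metadata server. The following is a minimal sketch: the helper names md_get and effective_oslogin are hypothetical, and the case-insensitive comparison of TRUE is an assumption.

```shell
# Base URL of the metadata server, as used elsewhere on this page.
MD_URL="http://metadata.google.internal/computeMetadata/v1"

# Hypothetical helper: print the value of a metadata entry, or nothing if the
# request fails (for example, because the key is not set).
md_get() {
  curl -sf "$MD_URL/$1" -H "Metadata-Flavor: Google"
}

# Hypothetical helper: decide the effective setting from the two values.
# The instance value wins whenever it is set; otherwise the project value applies.
effective_oslogin() {
  instance_value="$1"
  project_value="$2"
  value="${instance_value:-$project_value}"
  # Assumption: compare case-insensitively, so TRUE and true both count as enabled.
  case "$(printf '%s' "$value" | tr '[:lower:]' '[:upper:]')" in
    TRUE) echo "enabled" ;;
    *)    echo "disabled" ;;
  esac
}
```

On a VM, you might combine the two helpers as follows:

    effective_oslogin "$(md_get instance/attributes/enable-oslogin)" "$(md_get project/attributes/enable-oslogin)"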

Viewing OS Login users

To view the profile information for multiple users, you need to specify the pagesize and pagetoken parameters. Replace PAGE_SIZE and PAGE_TOKEN with the required numeric values.

curl "http://metadata.google.internal/computeMetadata/v1/oslogin/users?pagesize=PAGE_SIZE&pagetoken=PAGE_TOKEN" -H "Metadata-Flavor: Google"

For example, to set the pagesize to 1 and the pagetoken to 0, run the following command:

curl "http://metadata.google.internal/computeMetadata/v1/oslogin/users?pagesize=1&pagetoken=0" -H "Metadata-Flavor: Google"

On most distributions, you can also run the Unix command getent passwd to retrieve the password entries for organization users.
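
If you request pages often, it can help to build the URL in one place so that the whole query string stays on a single line. The following is a sketch only; the function names oslogin_users_page_url and oslogin_users_page are hypothetical.

```shell
# Hypothetical helper: build the users URL for one page of results.
# Usage: oslogin_users_page_url PAGE_SIZE PAGE_TOKEN
oslogin_users_page_url() {
  # Both parameters are expected to be numeric.
  for v in "$1" "$2"; do
    case "$v" in
      ''|*[!0-9]*) echo "pagesize and pagetoken must be numeric" >&2; return 1 ;;
    esac
  done
  printf '%s/oslogin/users?pagesize=%s&pagetoken=%s\n' \
    "http://metadata.google.internal/computeMetadata/v1" "$1" "$2"
}

# Hypothetical helper: fetch one page of OS Login user profiles.
oslogin_users_page() {
  url="$(oslogin_users_page_url "$1" "$2")" || return 1
  curl -s "$url" -H "Metadata-Flavor: Google"
}
```

For example, oslogin_users_page 1 0 issues the same request as the example command with pagesize set to 1 and pagetoken set to 0.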

Viewing a specific OS Login user

To view the profile information for a specific user on your VM, run the following command:

curl "http://metadata.google.internal/computeMetadata/v1/oslogin/users?username=USERNAME" -H "Metadata-Flavor: Google"

Replace USERNAME with the username of the user that you want to query.

For example, you can perform a request to look up the user user_example_com. The following command and output show added formatting for improved readability.

curl "http://metadata.google.internal/computeMetadata/v1/oslogin/users?username=user_example_com" -H "Metadata-Flavor: Google"

The output is similar to the following:

{
    "loginProfiles": [{
        "name": "12345678912345",
        "posixAccounts": [{
            "primary": true,
            "username": "user_example_com",
            "uid": "123451",
            "gid": "123451",
            "homeDirectory": "/home/user_example_com",
            "operatingSystemType": "LINUX"
        }],
        "sshPublicKeys": {
            "204c4b4fb...": {
                "key": "ssh-rsa AAAAB3Nz...",
                "fingerprint": "204c4b4fb..."
            }
        }
    }]
}

On most distributions, you can also run Unix commands such as getent passwd USERNAME or getent passwd UID to retrieve profile information.

To retrieve the SSH keys for a user, you can also run /usr/bin/google_authorized_keys USERNAME. If no keys are returned, the user might not have the required permissions to log into the VM.
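
The name value in the profile output is needed later when checking permissions, so it can be convenient to extract it in a script. The following sketch uses plain text matching rather than a JSON parser. It assumes that the profile's name is the first "name" field in the response, as in the sample output above, and the function name profile_name is hypothetical.

```shell
# Hypothetical helper: read a users response on stdin and print the first
# "name" value. This is text matching, not JSON parsing; it assumes the
# profile name is the first "name" field, as in the sample output.
profile_name() {
  grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/'
}
```

For example:

    curl -s "http://metadata.google.internal/computeMetadata/v1/oslogin/users?username=USERNAME" -H "Metadata-Flavor: Google" | profile_name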

Checking login permissions

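To view login and administrative level permissions, you must provide the policy and email query parameters, for example policy=login&email=LOGIN_NAME. Set policy to login to check for login permission or to adminLogin to check for sudo access, and set email to the value of the name field in the user's profile.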

  1. Query the user profile to get the value of the name field:

    curl "http://metadata.google.internal/computeMetadata/v1/oslogin/users?username=user_example_com" -H "Metadata-Flavor: Google"
  2. In the output, take note of the name.

  3. Run the following command, replacing LOGIN_NAME with the value of name:

    curl "http://metadata.google.internal/computeMetadata/v1/oslogin/authorize?policy=login&email=LOGIN_NAME" -H "Metadata-Flavor: Google"
    

For example, you can query the login permissions for the user user_example_com that was viewed in the previous section:

curl "http://metadata.google.internal/computeMetadata/v1/oslogin/authorize?policy=login&email=12345678912345" -H "Metadata-Flavor: Google"

The command output indicates the user is authorized to log in to the VM:

{"success":true}
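
The same check can be scripted for either policy value. The following is a sketch under stated assumptions: the function names oslogin_authorize_url and is_authorized are hypothetical, and any response that does not contain "success":true is treated as not authorized.

```shell
# Hypothetical helper: build the authorize URL for a policy and profile name.
# POLICY must be login or adminLogin; NAME is the profile's name value.
oslogin_authorize_url() {
  case "$1" in
    login|adminLogin) ;;
    *) echo "policy must be login or adminLogin" >&2; return 1 ;;
  esac
  printf '%s/oslogin/authorize?policy=%s&email=%s\n' \
    "http://metadata.google.internal/computeMetadata/v1" "$1" "$2"
}

# Hypothetical helper: succeed only if the response reports "success":true.
is_authorized() {
  url="$(oslogin_authorize_url "$1" "$2")" || return 1
  curl -s "$url" -H "Metadata-Flavor: Google" \
    | grep -q '"success"[[:space:]]*:[[:space:]]*true'
}
```

For example, to check for sudo access instead of login permission:

    is_authorized adminLogin 12345678912345 && echo "sudo access granted"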

Checking if your VM has a service account

You can query the metadata server to find the service account associated with your VM. On your VM, run the following command:

curl "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/" -H "Metadata-Flavor: Google"

The output is similar to the following:

12345-sa@developer.gserviceaccount.com/
default/

If no service account is found, the output is blank.
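
In a script, the blank-output case can be turned into an explicit failure. The following is a minimal sketch; the function name vm_service_accounts and the error message are hypothetical.

```shell
# Hypothetical helper: list the service accounts attached to this VM, one per
# line, and fail if the listing is blank.
vm_service_accounts() {
  accounts="$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/" \
    -H "Metadata-Flavor: Google")"
  if [ -z "$accounts" ]; then
    echo "no service account attached to this VM" >&2
    return 1
  fi
  printf '%s\n' "$accounts"
}
```

For example, vm_service_accounts || exit 1 stops a diagnostic script early when the VM has no service account.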

Debugging OS Login issues with gcpdiag

gcpdiag is an open source tool. It is not an officially supported Google Cloud product. You can use the gcpdiag tool to help you identify and fix Google Cloud project issues. For more information, see the gcpdiag project on GitHub.

This gcpdiag runbook investigates potential causes for SSH access problems on both Windows and Linux VMs in Google Cloud. It focuses on:
  • VM Health: Checks if the VM is running and has sufficient resources (CPU, memory, disk).
  • Permissions: Ensures you have the right IAM permissions to configure SSH keys.
  • VM Settings: Verifies SSH keys and other metadata are configured correctly.
  • Network Rules: Reviews firewall rules to confirm SSH traffic is allowed.
  • Guest OS: Looks for internal OS issues that might block SSH.

Google Cloud console

  1. Complete and then copy the following command.

    GOOGLE_AUTH_TOKEN=GOOGLE_AUTH_TOKEN \
      gcpdiag runbook gce/ssh --project=PROJECT_ID \
        --parameter name=VM_NAME \
        --parameter zone=ZONE \
        --parameter principal=PRINCIPAL \
        --parameter tunnel_through_iap=IAP_ENABLED \
        --auto --reason=REASON

  2. Open the Google Cloud console and activate Cloud Shell.
  3. Paste the copied command.
  4. Run the gcpdiag command, which downloads the gcpdiag Docker image and then performs diagnostic checks. If applicable, follow the output instructions to fix failed checks.

Docker

You can run gcpdiag using a wrapper that starts gcpdiag in a Docker container. Docker or Podman must be installed.

  1. Copy and run the following command on your local workstation.
    curl https://gcpdiag.dev/gcpdiag.sh >gcpdiag && chmod +x gcpdiag
  2. Execute the gcpdiag command.
    ./gcpdiag runbook gce/ssh --project=PROJECT_ID \
        --parameter name=VM_NAME \
        --parameter zone=ZONE \
        --parameter principal=PRINCIPAL \
        --parameter tunnel_through_iap=IAP_ENABLED

View available parameters for this runbook.

Replace the following:

  • VM_NAME: The name of the target VM within your project.
  • ZONE: The zone in which your target VM is located.
  • PRINCIPAL: The user or service account principal initiating the SSH connection. For key-based authentication, use the user authenticated by your Cloud Shell command-line tool or signed into the Google Cloud console. For service account impersonation, it should be the service account's email.
  • IAP_ENABLED: A boolean value (true or false) indicating whether Identity-Aware Proxy is used for establishing the SSH connection. Default: true

Useful flags:

For a list and description of all gcpdiag tool flags, see the gcpdiag usage instructions.

What's next