Agent Policies enable automated installation and maintenance of Google Cloud's operations suite agents across a fleet of VMs that match user-specified criteria. With one command, you can create a Policy for your Google Cloud project that governs existing and new VMs associated with that project, ensuring proper installation and optional auto-upgrade of all agents.
Supported operating systems
You can apply an Agent Policy to Compute Engine instances with the following operating systems.
Logging agent maps to policies with agent type logging. Monitoring agent maps to policies with agent type metrics. Ops Agent maps to policies with agent type ops-agent.
Operating system | Logging agent | Monitoring agent | Ops Agent |
---|---|---|---|
CentOS 7 | | | |
CentOS 8 | | | |
Rocky Linux 8 | | | |
RHEL 6 | | | |
RHEL 7: rhel-7, rhel-7-6-sap-ha, rhel-7-7-sap-ha, rhel-7-9-sap-ha | 1 | | |
RHEL 8: rhel-8, rhel-8-1-sap-ha, rhel-8-2-sap-ha, rhel-8-4-sap-ha | 1 | | |
Debian 9 (Stretch) | | | |
Debian 10 (Buster) | | | |
Debian 11 (Bullseye) | | | |
Ubuntu LTS 18.04 (Bionic Beaver): ubuntu-1804-lts, ubuntu-minimal-1804-lts | | | |
Ubuntu LTS 20.04 (Focal Fossa): ubuntu-2004-lts, ubuntu-minimal-2004-lts | | | |
Ubuntu LTS 22.04 (Jammy Jellyfish): ubuntu-2204-lts, ubuntu-minimal-2204-lts | | | |
SLES 12: sles-12, sles-12-sp3-sap, sles-12-sp4-sap, sles-12-sp5-sap | | | |
SLES 15: sles-15, sles-15-sap, sles-15-sp1-sap, sles-15-sp2-sap, sles-15-sp3-sap, sles-15-sp4-sap | | | |
OpenSUSE Leap 15: opensuse-leap (opensuse-leap-15-3-*, opensuse-leap-15-4-*) | | | |
Windows Server: 2012 R2, 2016, 2019, 2022, Core 2012 R2, Core 2016, Core 2019, Core 2022 | | | |
1 The Monitoring agent is not supported on rhel-7-9-sap-ha, rhel-8-2-sap-ha, or rhel-8-4-sap-ha.
Creating an Agent Policy
To create an Agent Policy using the Google Cloud CLI, complete the following steps:
If you haven't done so already, install the Google Cloud CLI.
In the gcloud CLI, the command group for managing Agent Policies is in beta release. If you haven't done so already, install the beta component of the gcloud CLI:
gcloud components install beta
To check whether the beta component is installed, run:
gcloud components list
If you previously installed the beta component, ensure that you have the latest version:
gcloud components update
Use the following script to enable the APIs and to set the proper permissions for using the Google Cloud CLI: set-permissions.sh. For information about the script, refer to What's the set-permissions.sh script doing?.
Use the gcloud beta compute instances ops-agents policies create command to create a Policy. For the syntax of the command, refer to the gcloud beta compute instances ops-agents policies create documentation.
For examples of how to format the command, refer to the Examples section in the Google Cloud CLI documentation.
For more information about the available gcloud CLI commands and the available options, refer to the gcloud beta compute instances ops-agents policies documentation.
Best practices for using Agent Policies
To control the impact to production systems during rollout, we recommend that you use instance labels and zones to filter the instances that the policy applies to.
Here is an example of a phased rollout plan for CentOS 7 VMs:
Phase 1: Create a policy to install the legacy Logging agent and Monitoring agent on all VMs with the labels env=test and app=myproduct.
gcloud beta compute instances \
ops-agents policies create ops-agents-policy-safe-rollout \
--agent-rules="type=logging,version=current-major,package-state=installed,enable-autoupgrade=true;type=metrics,version=current-major,package-state=installed,enable-autoupgrade=true" \
--os-types=short-name=centos,version=7 \
--group-labels=env=test,app=myproduct \
--project=my_project
For more information about specifying the operating system, see gcloud beta compute instances ops-agents policies create.
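The --agent-rules value in Phase 1 is a semicolon-separated list with one rule per agent type. As a minimal sketch, the helper below assembles that string from a list of agent types. The name build_agent_rules is hypothetical (it is not part of the gcloud CLI), and the helper assumes that every rule reuses the version, package-state, and enable-autoupgrade settings from the Phase 1 example.

```shell
# Hypothetical helper: build the --agent-rules value for the given agent types.
# Assumption: every rule reuses the settings from the Phase 1 example.
build_agent_rules() {
  rules=""
  for agent_type in "$@"; do
    rule="type=${agent_type},version=current-major,package-state=installed,enable-autoupgrade=true"
    if [ -z "$rules" ]; then
      rules="$rule"
    else
      rules="${rules};${rule}"
    fi
  done
  printf '%s\n' "$rules"
}

# Prints the same value that the Phase 1 command passes to --agent-rules.
build_agent_rules logging metrics
```

The output can be passed to the create command as --agent-rules="$(build_agent_rules logging metrics)".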
Phase 2: Update that policy to target env=prod and app=myproduct and only a single zone.
gcloud beta compute instances \
ops-agents policies update ops-agents-policy-safe-rollout \
--group-labels=env=prod,app=myproduct \
--zones=us-central1-c
Phase 3: Update that policy to clear the zones filter so that it rolls out globally.
gcloud beta compute instances \
ops-agents policies update ops-agents-policy-safe-rollout \
--clear-zones
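To rehearse the later phases before touching production, one option is to wrap the update commands in a dry-run helper. The sketch below is illustrative only: the run, phase2, and phase3 functions and the DRY_RUN variable are hypothetical names, and the explicit --project flag is an addition that the Phase 2 and Phase 3 commands above leave to the active gcloud configuration.

```shell
POLICY_ID="ops-agents-policy-safe-rollout"
PROJECT_ID="my_project"

# Print the command when DRY_RUN is 1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Phase 2: retarget the policy at production VMs in a single zone.
phase2() {
  run gcloud beta compute instances ops-agents policies update "$POLICY_ID" \
    --group-labels=env=prod,app=myproduct \
    --zones=us-central1-c \
    --project="$PROJECT_ID"
}

# Phase 3: clear the zones filter so that the policy applies in every zone.
phase3() {
  run gcloud beta compute instances ops-agents policies update "$POLICY_ID" \
    --clear-zones \
    --project="$PROJECT_ID"
}

# With the default DRY_RUN=1, these calls only print the commands.
phase2
phase3
```

Setting DRY_RUN=0 before calling a phase function executes the command instead of printing it.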
Limitations
For a Policy to take effect on VMs that predate OS Config, additional setup is needed to ensure the OS Config Agent that the policy relies on is installed on the VMs. To install the OS Config Agent on a fleet of VMs, complete the following steps:
Ensure you have run the set-permissions.sh script in the Creating an Agent Policy section.
Decide on which VMs you want to install the OS Config Agent and list them in a CSV file.
To export a list of all instances that are not managed by a Google service (for example, Google Kubernetes Engine or App Engine) to a CSV file, run:
gcloud compute instances list \
--filter="-labels.list(show="keys"):goog-" \
--format="csv(name,zone)" \
| grep -v -x -F -f <(gcloud compute instances os-inventory list-instances \
--format="csv(name,zone)") \
| sed 's/$/,update/' > instances.csv
The grep section filters out the VMs that already have the OS Config Agent installed and enabled. The VM label exclusion based on goog- filters out Compute Engine VMs managed by GKE, App Engine, etc.
To further filter the instances by zones or labels, change the --filter to something similar to the following:
"-labels.list(show="keys"):goog- AND zone:(ZONE_1,ZONE_2) AND labels.KEY_1:VALUE_1 AND labels.KEY_2=VALUE_2"
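To see what the grep and sed stages of the listing pipeline do, the following sketch replays them on sample data instead of live gcloud output. The instance names, zones, and the build_csv helper are hypothetical, and a temporary file stands in for the process substitution so that the sketch also runs in plain sh.

```shell
# Hypothetical sample data standing in for the two gcloud listings.
all_instances='name,zone
vm-a,us-central1-a
vm-b,us-central1-b
vm-c,us-central1-c'
with_agent='name,zone
vm-b,us-central1-b'

# $1: CSV text for all instances
# $2: CSV text for instances that already report OS inventory
build_csv() {
  tmp=$(mktemp)
  printf '%s\n' "$2" > "$tmp"
  # -F: fixed strings, -x: whole-line match, -v: keep only non-matching lines.
  printf '%s\n' "$1" | grep -v -x -F -f "$tmp" | sed 's/$/,update/'
  rm -f "$tmp"
}

# Prints the vm-a and vm-c rows, each with ",update" appended.
build_csv "$all_instances" "$with_agent"
```

Because both listings begin with the same name,zone header row, the whole-line match also drops the header, so the resulting file contains only instance rows.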
Download and run the mass-install-osconfig-agent.sh script by following the instructions in the script to run a command like:
bash mass-install-osconfig-agent.sh --project project-id --input-file instances.csv
This script automates the Installing the OS Config agent instructions.
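Before handing instances.csv to the script, a quick sanity check of the file can catch malformed rows. The sketch below assumes the three-field name,zone,update layout that the listing command above produces. The check_instances_csv helper is hypothetical, and the input format that the script actually accepts is documented in the script itself.

```shell
# Hypothetical check: report rows that are not shaped like "name,zone,update".
# Exits with status 1 if any such row is found, 0 otherwise.
check_instances_csv() {
  awk -F',' '
    NF != 3 || $1 == "" || $2 == "" || $3 != "update" {
      printf "line %d: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# Example:
# check_instances_csv instances.csv && echo "instances.csv looks well-formed"
```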
Troubleshooting
The ops-agents policy commands fail
If ops-agents policy commands fail, they show a corresponding validation error. Correct those errors by fixing the command arguments and flags as suggested by the error message.
In addition to the validation errors, you might see the following errors:
Insufficient IAM permission
A sample error looks like:
ERROR: (gcloud.beta.compute.instances.ops-agents.policies.XXX) PERMISSION_DENIED: Caller does not have required permission to XXX
Make sure you run the set-permissions.sh script in the Creating an Agent Policy section to set up the osconfig.guestPolicy specific IAM role.
To verify whether you have a sufficient OS Config guest policy role for the project, you can run the following command. In this example, the command checks if the user has the roles/osconfig.guestPolicyAdmin role. The GCLOUD_MEMBER should be in the format of user:USER_EMAIL or serviceAccount:SERVICE_ACCOUNT_EMAIL.
gcloud projects get-iam-policy project-id \
--filter=--member=gcloud-member \
| grep "roles/osconfig.guestPolicyAdmin" -B 2
The expected output is:
- members:
  - gcloud-member
  role: roles/osconfig.guestPolicyAdmin
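As an alternative sketch of the same check, the gcloud --flatten, --filter, and --format flags can list the roles bound to a single member, which avoids relying on grep context lines. The helper names roles_for_member and member_has_role are hypothetical.

```shell
# List the roles bound to one member on a project.
# $1: project ID, $2: member in IAM form, for example user:USER_EMAIL
roles_for_member() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$2" \
    --format="value(bindings.role)"
}

# Succeed only if the member holds exactly the given role ($3).
member_has_role() {
  roles_for_member "$1" "$2" | grep -q -x -F "$3"
}

# Example (placeholders as in the text above):
# member_has_role project-id user:USER_EMAIL roles/osconfig.guestPolicyAdmin \
#   && echo "role present"
```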
OS Config API is not enabled
A sample error looks like:
API [osconfig.googleapis.com] not enabled on project [XXX]. Would you like to enable and retry (this will take a few minutes)? (y/N)?
Make sure you run the set-permissions.sh script in the Creating an Agent Policy section to enable the required APIs and grant all the necessary permissions.
To verify whether the OS Config API is enabled for the project, you can run the following command:
gcloud services list --project project-id \
| grep osconfig.googleapis.com
The expected output is:
osconfig.googleapis.com Cloud OS Config API
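For use in scripts, the same check can be expressed with a filter and a value format so that the result is an exit status rather than text to read. This is a minimal sketch; the osconfig_api_enabled helper is a hypothetical name.

```shell
# Succeed only if the OS Config API is enabled on the project ($1).
osconfig_api_enabled() {
  enabled=$(gcloud services list --enabled \
    --project "$1" \
    --filter="config.name:osconfig.googleapis.com" \
    --format="value(config.name)")
  [ "$enabled" = "osconfig.googleapis.com" ]
}

# Example: enable the API only when it is missing.
# osconfig_api_enabled project-id \
#   || gcloud services enable osconfig.googleapis.com --project project-id
```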
The policy does not exist
A sample error looks like:
NOT_FOUND: Requested entity was not found
This suggests that the policy was never created or has already been deleted. Make sure the policy ID in the describe, update, or delete command maps to an existing policy.
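One way to confirm the policy ID before running update or delete is to call describe and inspect its exit status. The following sketch uses a hypothetical policy_exists helper. Note that describe can also fail for other reasons, such as missing permissions, so a failure is not proof that the policy is absent.

```shell
# Succeed if `describe` can fetch the policy.
# $1: project ID, $2: policy ID
# Caveat: any describe failure (including PERMISSION_DENIED) returns non-zero.
policy_exists() {
  gcloud beta compute instances ops-agents policies describe "$2" \
    --project "$1" >/dev/null 2>&1
}

# Example: fall back to listing the existing policies when the ID is not found.
# policy_exists project-id POLICY_ID \
#   || gcloud beta compute instances ops-agents policies list --project project-id
```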
The policy is created, but seems to have no effect
OS Config agents are deployed to each Compute Engine instance to manage the packages for the Logging and Monitoring agents. The policy may seem to have no effect if the underlying OS Config agent is not installed.
LINUX
To verify that the OS Config agent is installed, run the following command:
gcloud compute ssh instance-id \
--project project-id \
-- sudo systemctl status google-osconfig-agent
A sample output is:
google-osconfig-agent.service - Google OSConfig Agent
Loaded: loaded (/lib/systemd/system/google-osconfig-agent.service; enabled; vendor preset:
Active: active (running) since Wed 2020-01-15 00:14:22 UTC; 6min ago
Main PID: 369 (google_osconfig)
Tasks: 8 (limit: 4374)
Memory: 102.7M
CGroup: /system.slice/google-osconfig-agent.service
└─369 /usr/bin/google_osconfig_agent
WINDOWS
To verify that the OS Config agent is installed, complete the following steps:
Connect to your instance using RDP or a similar tool and log in to Windows.
Open a PowerShell terminal, then run the following PowerShell command. You don't need administrator privileges.
Get-Service google_osconfig_agent
SUSE and Ubuntu Compute Engine instances don't have the OS Config agent preinstalled, so you need to follow the OS Config agent installation instructions to get the OS Config agent installed on those Compute Engine instances.
The OS Config agent is installed, but it does not install the Ops agents
To verify if there are any errors when the OS Config agent applies Policies, you can check the OS Config agent's log. This can be done either via Logs Explorer or via SSH / RDP into individual Compute Engine instances.
To view OS Config agent logs in Logs Explorer, use the following filter:
resource.type="gce_instance"
logName="projects/project-id/logs/OSConfigAgent"
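The same filter can also be used from the command line with gcloud logging read. In the sketch below, the osconfig_log_filter helper is a hypothetical name. It joins the two filter lines with an explicit AND, which is equivalent to placing them on separate lines in Logs Explorer.

```shell
# Build the Logs Explorer filter for OS Config agent logs in a project ($1).
osconfig_log_filter() {
  printf 'resource.type="gce_instance" AND logName="projects/%s/logs/OSConfigAgent"\n' "$1"
}

# Example: read the 20 most recent matching entries.
# gcloud logging read "$(osconfig_log_filter project-id)" \
#   --project project-id --limit 20

# Prints the filter for the placeholder project ID.
osconfig_log_filter project-id
```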
To view OS Config agent logs via SSH for individual Compute Engine Linux instances, run the following command:
CentOS / RHEL / SLES / SUSE
gcloud compute ssh instance-id \
--project project-id \
-- sudo cat /var/log/messages \
| grep "OSConfigAgent\|google-fluentd\|stackdriver-agent"
Debian / Ubuntu
gcloud compute ssh instance-id \
--project project-id \
-- sudo cat /var/log/syslog \
| grep "OSConfigAgent\|google-fluentd\|stackdriver-agent"
To view OS Config agent logs via RDP for individual Compute Engine Windows instances, complete the following steps:
Connect to your instance using RDP or a similar tool and log in to Windows.
Open the Event Viewer app. Under Windows Logs => Application, search for logs with Source equal to OSConfigAgent.
If there is an error connecting to the OS Config Service, make sure you run the set-permissions.sh script in the Creating an Agent Policy section to set up the metadata.
To verify that the OS Config metadata is enabled, you can run the following command:
gcloud compute project-info describe \
--project project-id \
| grep "enable-osconfig\|enable-guest-attributes" -A 1
The expected output is:
- key: enable-guest-attributes
value: 'TRUE'
- key: enable-osconfig
value: 'TRUE'
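To turn this visual check into a pass or fail result, the describe output can be piped through a small parser. The sketch below is an assumption-laden illustration: the osconfig_metadata_enabled helper is hypothetical, and it relies on each value line following its key line, as in the expected output above.

```shell
# Read `gcloud compute project-info describe` output on stdin and succeed only
# if enable-osconfig and enable-guest-attributes are each followed by TRUE.
osconfig_metadata_enabled() {
  awk '
    /key:/ && $NF == "enable-osconfig"         { want = "osconfig"; next }
    /key:/ && $NF == "enable-guest-attributes" { want = "guest"; next }
    want != "" && /value:/ {
      if (toupper($0) ~ /TRUE/) seen[want] = 1
      want = ""
    }
    END {
      if (seen["osconfig"] && seen["guest"]) exit 0
      exit 1
    }
  '
}

# Example:
# gcloud compute project-info describe --project project-id \
#   | osconfig_metadata_enabled && echo "OS Config metadata is enabled"
```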
Ops agents are installed, but not functioning properly
Refer to the Logging agent and the Monitoring agent troubleshooting pages to debug specific issues.
Enabling debug-level logs
It's very helpful to enable debug level logging of the OS Config agent when reporting an issue.
You can set the osconfig-log-level: debug metadata to enable debug-level logging for the OS Config agent. The collected logs have more information to help with the investigation.
To enable debug-level logging for the entire project, run the following command:
gcloud compute project-info add-metadata \
--project project-id \
--metadata osconfig-log-level=debug
To enable debug-level logging for one VM, run the following command:
gcloud compute instances add-metadata instance-id \
--project project-id \
--metadata osconfig-log-level=debug
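Once the investigation is finished, the debug setting can be reverted by removing the metadata key with the matching remove-metadata commands. The sketch below only prints the command so that it can be reviewed first. The disable_debug_cmd helper is a hypothetical name, and the placeholder IDs follow the commands above.

```shell
# Print the command that removes osconfig-log-level from one VM, or from the
# whole project when no instance ID is given.
# $1: project ID, $2: optional instance ID
disable_debug_cmd() {
  project_id="$1"
  instance_id="${2:-}"
  if [ -n "$instance_id" ]; then
    echo "gcloud compute instances remove-metadata $instance_id --project $project_id --keys osconfig-log-level"
  else
    echo "gcloud compute project-info remove-metadata --project $project_id --keys osconfig-log-level"
  fi
}

# Prints the project-wide command, then the single-VM command.
disable_debug_cmd project-id
disable_debug_cmd project-id instance-id
```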
Additional information
What's the set-permissions.sh script doing?
Given a project ID, an Identity and Access Management (IAM) role, and an email or a service account, the set-permissions.sh script performs the following actions:
Enables the Cloud Logging API, the Cloud Monitoring API, and the OS Config API for the project.
Grants the roles/logging.logWriter and the roles/monitoring.metricWriter roles to the Compute Engine default service account so that the agents can write logs and metrics to the Logging and Cloud Monitoring APIs.
Enables the OS Config metadata for the project so that OS Config agents get activated on the VMs.
Grants the specified IAM role to the gcloud user or the service account. Project owners have full access to create and manage a Policy. For all other users or service accounts, project owners must grant one of the following roles:
roles/osconfig.guestPolicyAdmin: Provides full access to a Policy.
roles/osconfig.guestPolicyEditor: Allows users to get, update, and list a Policy.
roles/osconfig.guestPolicyViewer: Provides read-only access to get and list a Policy.
When running the script, you only need to specify the guestPolicy* part of the role name. The script supplies the roles/osconfig. part of the name.
The following invocation of the script enables the APIs, grants the necessary roles to the default service account, and enables the OS Config metadata:
bash set-permissions.sh --project=PROJECT_ID
To use the script to also grant one of the OS Config roles to a user who does not have the roles/owner (Owner) role on the project, run the script as follows:
bash set-permissions.sh --project=PROJECT_ID \
--iam-user=USER_EMAIL \
--iam-permission-role=guestPolicy[Admin|Editor|Viewer]
To use the script to also grant one of the OS Config roles to a non-default service account, run the script as follows:
bash set-permissions.sh --project=PROJECT_ID \
--iam-service-account=SERVICE_ACCT_EMAIL \
--iam-permission-role=guestPolicy[Admin|Editor|Viewer]
For more information, see the contents of the script.
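The role-name shorthand described above can be illustrated with a small function that validates the short name and prepends the roles/osconfig. prefix. This is a sketch of the described behavior, not the script's actual code, and the expand_osconfig_role helper is a hypothetical name.

```shell
# Expand a short role name into the full IAM role name.
# Accepts only the three guest policy roles listed above.
expand_osconfig_role() {
  case "$1" in
    guestPolicyAdmin|guestPolicyEditor|guestPolicyViewer)
      printf 'roles/osconfig.%s\n' "$1"
      ;;
    *)
      echo "unsupported role: $1" >&2
      return 1
      ;;
  esac
}

# Prints the full name of the admin role.
expand_osconfig_role guestPolicyAdmin
```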
What's the diagnose.sh script doing?
Given a project, a Compute Engine instance ID, and an Ops agent Policy ID, the diagnose.sh script automatically collects the information needed to help diagnose issues with the policy:
The OS Config agent version
The underlying OS Config guest policy
The Policies that are applicable to this Compute Engine instance
The agent package repos that are pulled on to a Compute Engine instance
Terraform integration
Terraform support is built on top of the Google Cloud CLI commands. To create an Agent Policy using Terraform, follow the Terraform module instructions.