This topic describes how to use Cloud Monitoring to monitor the boot integrity of Shielded VM instances that have integrity monitoring enabled, identify the cause of an integrity validation failure, and update the integrity policy baseline.
Monitoring VM boot integrity by using Monitoring
Use Cloud Monitoring to view integrity validation events and set alerts for them, and Cloud Logging to review details of those events.
Viewing integrity validation events
To view the metrics for a monitored resource by using the Metrics Explorer, do the following:
- In the Google Cloud console, go to the Metrics explorer page.
  If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- In the Metric element, expand the Select a metric menu, enter Boot Validation in the filter bar, and then use the submenus to select a specific resource type and metric:
  - In the Active resources menu, select VM instance.
  - In the Active metric categories menu, select Instance.
  - In the Active metrics menu, select Early Boot Validation or Late Boot Validation.
    - Early Boot Validation: Shows the pass/fail status of the early boot portion of the last boot sequence. Early boot is the boot sequence from the start of the UEFI firmware until it passes control to the bootloader.
    - Late Boot Validation: Shows the pass/fail status of the late boot portion of the last boot sequence. Late boot is the boot sequence from the bootloader until completion. This includes the loading of the operating system kernel.
  - Click Apply.
- To remove time series from the display, use the Filter element.
- To combine time series, use the menus on the Aggregation element. For example, to display the CPU utilization for your VMs, based on their zone, set the first menu to Mean and the second menu to zone.
  All time series are displayed when the first menu of the Aggregation element is set to Unaggregated. The default settings for the Aggregation element are determined by the metric type you selected.
- For quota and other metrics that report one sample per day, do the following:
  - In the Display pane, set the Widget type to Stacked bar chart.
  - Set the time period to at least one week.
Setting alerts on integrity validation events
Set alerts on the values of the Early Boot Validation and Late Boot Validation metrics if you want to be notified when a boot validation failure occurs on your VM instance. For information about alerting, see Introduction to Alerting.
For early boot validation alerting policy settings, see Compute Engine early boot validation.
For late boot validation alerting policy settings, see Compute Engine late boot validation.
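If you build an alerting policy or a query programmatically, you need a Monitoring filter that selects the relevant metric. The following is a minimal sketch of such a filter builder. The metric type names and the `status` label (and its `failed` value) are assumptions that are not stated in this topic; verify them against the Compute Engine metrics list before relying on them. The function name `boot_validation_filter` is hypothetical.

```python
# Assumed metric types for the Early/Late Boot Validation metrics.
# These names are NOT taken from this topic; confirm them in the metrics list.
_METRIC_TYPES = {
    "early": "compute.googleapis.com/instance/integrity/early_boot_validation_status",
    "late": "compute.googleapis.com/instance/integrity/late_boot_validation_status",
}


def boot_validation_filter(phase: str, status: str = "failed") -> str:
    """Build a Monitoring filter string for one boot phase and status.

    phase must be "early" or "late". The "status" metric label is an
    assumption about how the metric reports pass/fail.
    """
    if phase not in _METRIC_TYPES:
        raise ValueError(f"phase must be 'early' or 'late', got {phase!r}")
    return (
        f'metric.type="{_METRIC_TYPES[phase]}" '
        f'AND resource.type="gce_instance" '
        f'AND metric.labels.status="{status}"'
    )


if __name__ == "__main__":
    print(boot_validation_filter("early"))
    print(boot_validation_filter("late"))
```

The resulting string could be pasted into a filter field or passed to a client library; the sketch deliberately stops short of calling any API.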
Viewing integrity validation event details
- Go to the VM instances page
- Click the instance ID to open the VM instance details page.
- Under Logs, click Cloud Logging.
- Locate the earlyBootReportEvent or lateBootReportEvent log entry that you want to review.
- Expand the log entry > jsonPayload > earlyBootReportEvent or lateBootReportEvent, as appropriate. Within that section, the policyEvaluationPassed element identifies whether the given section of the boot sequence passed verification against the integrity policy baseline.
- Expand the actualMeasurements section and the numbered elements within it to see the platform configuration register (PCR) values saved from the latest boot sequence. The PCR values are saved in the value elements within the numbered elements. The PCR values identify the boot components and component load order used by the latest boot sequence, and are compared to the integrity policy baseline to determine if there has been any change in the VM instance boot sequence. For more information about what the PCRs represent, see Integrity monitoring events.
- Expand the policyMeasurements section to see the PCR values saved for the integrity policy baseline.
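The same fields can be read from a log entry that has been exported as JSON. The following is a minimal sketch, assuming the entry has already been parsed into a Python dictionary and that the numbered elements appear as a list of objects. The function name `summarize_integrity_entry` and the keys of the returned dictionary are hypothetical.

```python
from typing import Any, Dict, Optional

# The two report sections described in the steps above.
REPORT_KEYS = ("earlyBootReportEvent", "lateBootReportEvent")


def summarize_integrity_entry(entry: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Summarize one integrity log entry that was parsed from JSON.

    Returns None if the entry contains neither report section.
    Assumes actualMeasurements and policyMeasurements are lists of
    objects that each carry a "value" field.
    """
    payload = entry.get("jsonPayload", {})
    for key in REPORT_KEYS:
        report = payload.get(key)
        if report is None:
            continue
        return {
            "event": key,
            "passed": report.get("policyEvaluationPassed"),
            "actual_values": [
                m.get("value") for m in report.get("actualMeasurements", [])
            ],
            "policy_values": [
                m.get("value") for m in report.get("policyMeasurements", [])
            ],
        }
    return None


if __name__ == "__main__":
    sample = {
        "jsonPayload": {
            "earlyBootReportEvent": {
                "policyEvaluationPassed": True,
                "actualMeasurements": [{"value": "aa"}],
                "policyMeasurements": [{"value": "aa"}],
            }
        }
    }
    print(summarize_integrity_entry(sample))
```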
Automating responses to integrity validation events
You can automate responses to boot validation events by exporting the Cloud Logging logs and processing them in another service like Cloud Run functions. For more information, see Routing and storage overview and Automating responses to integrity validation failures.
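As an illustration of the processing step, the following sketch shows the decision logic that such a function might run. It assumes the logs are routed to a Pub/Sub topic and that each message carries the log entry as base64-encoded JSON in a `data` field; that delivery format, and the function name `failed_boot_phase`, are assumptions rather than something this topic specifies. What to do on failure (for example, stopping the instance) is left to the caller.

```python
import base64
import json
from typing import Any, Dict, Optional

REPORT_KEYS = ("earlyBootReportEvent", "lateBootReportEvent")


def failed_boot_phase(pubsub_message: Dict[str, Any]) -> Optional[str]:
    """Return the name of the report section that failed validation.

    pubsub_message is assumed to hold a base64-encoded JSON log entry
    under the "data" key. Returns None when no section reports a failure.
    """
    entry = json.loads(base64.b64decode(pubsub_message["data"]))
    payload = entry.get("jsonPayload", {})
    for key in REPORT_KEYS:
        report = payload.get(key)
        # Only an explicit false counts as a failure; a missing field does not.
        if report is not None and report.get("policyEvaluationPassed") is False:
            return key
    return None


if __name__ == "__main__":
    entry = {"jsonPayload": {"lateBootReportEvent": {"policyEvaluationPassed": False}}}
    message = {"data": base64.b64encode(json.dumps(entry).encode("utf-8"))}
    print(failed_boot_phase(message))
```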
Determining the cause of boot integrity validation failure
- Go to the VM instances page
- Click the instance ID to open the VM instance details page.
- Under Logs, click Cloud Logging.
- Locate the most recent earlyBootReportEvent and lateBootReportEvent log entries and see which one has a policyEvaluationPassed value of false.
- Expand the log entry > jsonPayload > earlyBootReportEvent or lateBootReportEvent, as appropriate.
- Expand the actualMeasurements and policyMeasurements sections and the numbered elements within them to see the platform configuration register (PCR) values saved from the latest boot sequence and the integrity policy baseline, respectively. The PCR values identify the boot components and component load order used by the latest boot sequence and the integrity policy baseline.
- Compare the PCR values in the actualMeasurements and policyMeasurements sections to determine where the variation between the latest boot sequence and the integrity policy baseline occurred. The PCR whose values differ between the two sections identifies what caused the validation failure. Be aware that the element numbers in these sections do not always correspond to the PCR numbers, and similarly numbered elements in actualMeasurements and policyMeasurements might represent different PCRs. For example, in the early boot sequence for both Windows and Linux, the 3 element in actualMeasurements and the 2 element in policyMeasurements both represent PCR7.
- Check Integrity monitoring events to determine what the changed PCR represents, and investigate whether that is an expected change.
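Because element positions in the two sections can represent different PCRs, a scripted comparison should match measurements by PCR rather than by position. The following is a minimal sketch of that comparison. It assumes each measurement object carries a `pcrNum` field naming its PCR alongside the `value` field; `pcrNum` is an assumption about the log schema that this topic does not confirm, and the function name `diff_pcrs` is hypothetical.

```python
from typing import Any, Dict, Tuple


def diff_pcrs(report: Dict[str, Any]) -> Dict[str, Tuple[str, str]]:
    """Return PCRs whose actual value differs from the baseline value.

    report is one earlyBootReportEvent or lateBootReportEvent section.
    Measurements are keyed by the assumed "pcrNum" field, not by list
    position. Only PCRs present in both sections are compared.
    The result maps PCR name -> (actual value, policy value).
    """
    actual = {m["pcrNum"]: m["value"] for m in report.get("actualMeasurements", [])}
    policy = {m["pcrNum"]: m["value"] for m in report.get("policyMeasurements", [])}
    return {
        pcr: (actual[pcr], policy[pcr])
        for pcr in sorted(set(actual) & set(policy))
        if actual[pcr] != policy[pcr]
    }


if __name__ == "__main__":
    report = {
        "actualMeasurements": [
            {"pcrNum": "PCR_0", "value": "aaa"},
            {"pcrNum": "PCR_4", "value": "bbb"},
            {"pcrNum": "PCR_5", "value": "ddd"},
            {"pcrNum": "PCR_7", "value": "eee"},
        ],
        "policyMeasurements": [
            {"pcrNum": "PCR_0", "value": "aaa"},
            {"pcrNum": "PCR_4", "value": "ccc"},
            {"pcrNum": "PCR_7", "value": "eee"},
        ],
    }
    print(diff_pcrs(report))
```

In the sample data, PCR_7 sits at position 3 in one list and position 2 in the other, which is the situation described above; keying by PCR name keeps the comparison correct regardless of position.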
Updating the integrity policy baseline
The initial integrity policy baseline is derived from the implicitly trusted boot image when the instance is created. Updating the baseline replaces it with one derived from the current instance configuration. The VM instance must be running when you update the baseline.
You should update the baseline after any planned boot-specific changes in the instance configuration, like kernel updates or kernel driver installation, as these will cause integrity validation failures. If you have an unexpected integrity validation failure, you should investigate the reason for the failure and be prepared to stop the instance if necessary.
You must have the setShieldedInstanceIntegrityPolicy permission to be able to update the integrity policy baseline.
Use the following procedure to update the integrity policy baseline.
gcloud
Update the VM instance's integrity policy baseline by using the compute instances update command with the --shielded-learn-integrity-policy flag.
The following example resets the integrity policy baseline for a VM instance:
gcloud compute instances update INSTANCE_NAME \
    --zone=ZONE \
    --shielded-learn-integrity-policy
Replace the following:
- INSTANCE_NAME: Name of the VM.
- ZONE: Zone where the VM exists.
API
Update the VM instance's integrity policy baseline by using the updateAutoLearnPolicy request body item with the setShieldedInstanceIntegrityPolicy method.
The following example resets the integrity policy baseline for a VM instance.
PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/INSTANCE_NAME/setShieldedInstanceIntegrityPolicy?key=YOUR_API_KEY
{
"updateAutoLearnPolicy": true
}
Replace the following:
- PROJECT_ID: ID of the project that contains the VM.
- INSTANCE_NAME: Name of the VM.
- ZONE: Zone where the VM exists.
- YOUR_API_KEY: An API key that lets you access the API.
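The request above can also be assembled in code. The following is a minimal sketch that uses only the Python standard library to build the same PATCH request without sending it. The function name `build_update_baseline_request` is hypothetical, and the sketch follows the API-key style shown in the example; your environment may require a different form of authentication.

```python
import json
import urllib.request


def build_update_baseline_request(
    project_id: str, zone: str, instance_name: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) the baseline-update request shown above."""
    url = (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project_id}/zones/{zone}/instances/{instance_name}"
        f"/setShieldedInstanceIntegrityPolicy?key={api_key}"
    )
    # Request body from the example: ask the service to relearn the baseline.
    body = json.dumps({"updateAutoLearnPolicy": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder values only; replace them as described in the list above.
    req = build_update_baseline_request(
        "PROJECT_ID", "ZONE", "INSTANCE_NAME", "YOUR_API_KEY"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a separate step, for example with `urllib.request.urlopen(req)`, and requires valid credentials and a running VM instance.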
What's next
- Read more about the security features offered by Shielded VM.
- Learn more about modifying options on a Shielded VM instance.
- Learn about one approach to automating responses to integrity monitoring events.