Google Distributed Cloud air-gapped release notes

March 5, 2024 [GDCH 1.12.1]


  • Google Distributed Cloud air-gapped 1.12.1 is now available.
    See the product overview to learn about the features of Google Distributed Cloud air-gapped.


The Rocky Linux image version is updated to 20240131 to apply the latest security patches and important updates. To take advantage of the bug and security vulnerability fixes, you must upgrade all nodes with each release. The following security vulnerabilities are fixed:


The following container image security vulnerabilities are fixed:


Backup and restore:

  • An issue prevents volume backups to org buckets.
  • The backup route to orgs fails.

Cluster management:

  • User clusters with Kubernetes version 1.27.x might have node pools that fail to initialize.

Istio:

  • Pods are in the ImagePullBackOff state with the Back-off pulling image "auto" event.
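
To check whether a cluster shows this symptom, one option is to scan the pod list for containers that are waiting in ImagePullBackOff while their spec still names the image "auto". The following is a minimal sketch, not an official diagnostic. It assumes the input is the parsed JSON of a standard Kubernetes pod list; the function name, namespace, pod names, and image paths in the sample are hypothetical.

```python
def find_auto_image_backoff(pod_list):
    """Return (namespace, pod, container) triples for containers that are
    waiting in ImagePullBackOff while the pod spec names the image "auto"."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        # Map container name -> image as declared in the pod spec.
        images = {
            c["name"]: c.get("image")
            for c in pod.get("spec", {}).get("containers", [])
        }
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            if (waiting.get("reason") == "ImagePullBackOff"
                    and images.get(cs["name"]) == "auto"):
                hits.append((meta.get("namespace"), meta.get("name"), cs["name"]))
    return hits


# Hypothetical sample data shaped like a Kubernetes pod list.
sample = {
    "items": [
        {
            "metadata": {"namespace": "demo", "name": "app-1"},
            "spec": {
                "containers": [
                    {"name": "app", "image": "example.invalid/app:1.0"},
                    {"name": "istio-proxy", "image": "auto"},
                ]
            },
            "status": {
                "containerStatuses": [
                    {"name": "app", "state": {"running": {}}},
                    {
                        "name": "istio-proxy",
                        "state": {"waiting": {"reason": "ImagePullBackOff"}},
                    },
                ]
            },
        }
    ]
}

print(find_auto_image_backoff(sample))
```

In practice, the input could come from the output of kubectl get pods --all-namespaces -o json, loaded with json.load.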

File and block storage:

  • When upgrading from 1.11.1 to 1.12.1, file-netapp-trident subcomponent rollout might fail.

Hardware security module:

  • A rotatable secret for hardware security modules is in an unknown state.

Logging:

  • When upgrading from 1.11.1 to 1.12.1, ValidatingWebhookConfigurations, MutatingWebhookConfigurations, and MonitoringRules deployed by the Log component might fail to upgrade.
  • The cortex-ingester pod shows an OOMKilled status.
  • After enabling logs export to an external SIEM destination, the forwarded logs don't contain any Kubernetes API server logs.
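
The OOMKilled status mentioned above is recorded in each container's current or last termination state. The following is a minimal sketch for spotting it in a parsed Kubernetes pod list. It is an illustration only: the function name, the namespace, and the container name in the sample are hypothetical, and the cortex-ingester name prefix is assumed from the issue description.

```python
def find_oom_killed(pod_list, name_prefix="cortex-ingester"):
    """Return (namespace, pod, container, restart_count) tuples for containers
    whose current or last state is terminated with reason OOMKilled."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        pod_name = meta.get("name", "")
        if not pod_name.startswith(name_prefix):
            continue
        for cs in pod.get("status", {}).get("containerStatuses", []):
            # A container that was killed and restarted reports the kill in
            # lastState; one that is still down reports it in state.
            for key in ("state", "lastState"):
                terminated = cs.get(key, {}).get("terminated") or {}
                if terminated.get("reason") == "OOMKilled":
                    hits.append((meta.get("namespace"), pod_name,
                                 cs["name"], cs.get("restartCount", 0)))
                    break
    return hits


# Hypothetical sample data shaped like a Kubernetes pod list.
sample = {
    "items": [
        {
            "metadata": {"namespace": "example-ns", "name": "cortex-ingester-0"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "ingester",
                        "restartCount": 4,
                        "state": {"running": {}},
                        "lastState": {"terminated": {"reason": "OOMKilled"}},
                    }
                ]
            },
        },
        {
            "metadata": {"namespace": "example-ns", "name": "cortex-ingester-1"},
            "status": {
                "containerStatuses": [
                    {"name": "ingester", "restartCount": 0,
                     "state": {"running": {}}, "lastState": {}}
                ]
            },
        },
    ]
}

print(find_oom_killed(sample))
```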

Monitoring:

  • Configuring the ServiceNow webhook results in Lifecycle Management (LCM) re-reconciling and reverting the changes made to the ConfigMap object mon-alertmanager-servicenow-webhook-backend and the Secret object mon-alertmanager-servicenow-webhook-backend in the mon-system namespace.
  • Audit logs and operational logs are not collected.
  • When upgrading from 1.11.x to 1.12.1, Cortex bucket deletion might fail.
  • The metrics storage class is incorrectly defined in the configuration.

Networking:

  • GDC experiences issues with VM and container updates, termination, and scheduling.
  • The preinstall script fails on several switches.
  • Upgrading from 1.11 to 1.12.1 fails because the hairpinlink custom resource is not generated successfully.

Node platform:

  • When upgrading from 1.11.x to 1.12.1, a switch image download pod might get stuck in the ErrImagePull state.
  • When upgrading from 1.11.x to 1.12.1, the host firewall blocks the switch image download.

NTP server:

  • The NTP relay server pod crashes after restarting.
  • The NTP relay job pod crashes after restarting.

Physical servers:

  • When upgrading from 1.11.x to 1.12.1, NodeUpgrade contains multiple versions for the same hardware model, blocking firmware upgrade verification.
  • When installing a server manually, the server installation might get stuck.
  • The servers are stuck in the provisioning state.

System artifact registry:

  • Harbor crash loops after an ABM upgrade.
  • When upgrading from 1.11.x to 1.12.1, the Harbor cluster status might be unhealthy.

Upgrade:

  • When upgrading from 1.11.x to 1.12.1, node upgrade gets stuck with the MaintenanceModeHealthCheckReady undrain error.
  • When upgrading from 1.11.x to 1.12.1, a cluster node might not exit the maintenance mode due to a health check failure for registy_mirror.
  • OS in-place node upgrade might stop responding.

VM manager:

  • When upgrading from 1.11.x to 1.12.x, a VM might not be ready due to too many pods.

  • VMRuntime might not be ready due to a network-controller-manager installation failure.


Billing:

  • Fixed the issue causing the patch upgrade to fail with the upgrade check.
  • Fixed the issue causing the creation of multiple billing-storage-init-job objects.

Firewall:

  • Fixed the issue with blocked traffic to object storage from the bootstrapper, caused by a deny policy configured on port 8082.

Monitoring:

  • Fixed the issue where metrics were not collected from user clusters. The issue affected user VM clusters but not the system cluster.
  • Fixed the issue where the primary Prometheus sent metrics to the Cortex tenant across cluster boundaries.

Operations Suite Infrastructure Core Services (OIC):

  • Fixed the issue where Desired State Configuration (DSC) returns incorrect results and fails to update resources.
  • Fixed the issue where Microsoft System Center Configuration Manager (SCCM) deployment doesn't finish successfully and requires manual intervention to fix.

VM Backup and Restore:

  • Fixed an issue where role-based access control (RBAC) and schema settings in the VM manager were stopping users from starting VM backup and restore processes.

Add-on Manager:

  • The GKE on Bare Metal version is updated to 1.28.100-gke.150 to apply the latest security patches and important updates.

Operations Suite Infrastructure Core Services (OIC):

  • Google Distributed Cloud air-gapped 1.12.1 added instructions for partners to prepare OIC artifacts excluded from the release.

Security Information and Event Management (SIEM):

  • Splunk Enterprise and Splunk Universal Forwarder are upgraded to version 9.1.3.

Version update:

  • The Debian-based image version is updated to bookworm-v1.0.1-gke.1.