Google's primary support objective is to resolve production incidents as quickly as possible. Understanding your configuration, analyzing logs and metrics, and collaborating with partners helps us to solve incidents quickly.
Google Cloud offers various support packages to accommodate your support needs. All Google Cloud Support packages include support for Anthos and Anthos clusters on bare metal. If you have an existing Google Cloud Support package, then you already have support for Anthos and Anthos clusters on bare metal.
For more information, see the Google Cloud Support documentation.
Requirements for Anthos clusters on bare metal support
To troubleshoot business-critical incidents effectively:
- Check that your environment is current and within the published end-of-support timeframes. See the Version Support Policy section for more information.
- Enable Cloud Logging and Cloud Monitoring for system components. For details, see the following Support tools section.
To troubleshoot an Anthos clusters on bare metal incident, Google Cloud Support relies on three pieces of information:
Your environment configuration
When you open a support case, running the following commands provides key information about your cluster setup:
For all your cluster types, run the bmctl check cluster --snapshot command to capture information about Kubernetes and your nodes. Attach the resulting tarball to the support case.
For admin, hybrid, and standalone clusters, run the bmctl check cluster command to check the health status of the cluster and nodes. Attach the resulting logs to the support case. They should exist under the
For user clusters, first create a health check YAML file with the cluster name and namespace, and then apply the file in the appropriate admin cluster:
Create a YAML file with the following healthcheck properties. Here is sample content for a cluster named user1:
    apiVersion: baremetal.cluster.gke.io/v1
    kind: HealthCheck
    metadata:
      generateName: healthcheck-
      namespace: cluster-user1
    spec:
      clusterName: user1
After you create the YAML file, use the kubectl command to apply the custom resource in the admin cluster that manages the user cluster. Here is a sample command using the YAML file created in the previous step. In the sample, the ADMIN_KUBECONFIG variable specifies the path to the admin cluster's kubeconfig file:
    kubectl --kubeconfig ADMIN_KUBECONFIG create -f healthcheck-user1.yaml
The command returns the following response:
Wait for the health check job to finish reconciling. In the previous example, the health check job name is healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf. The following sample kubectl command waits up to 30 minutes for the health check job to complete:
    kubectl --kubeconfig ADMIN_KUBECONFIG wait healthcheck healthcheck-7c4qf \
        -n cluster-user1 --for=condition=Reconciling=False --timeout=30m
When completed, this command returns:
healthcheck.baremetal.cluster.gke.io/healthcheck-7c4qf condition met
You can see the health check job results with the following command:
    kubectl --kubeconfig ADMIN_KUBECONFIG get healthcheck healthcheck-7c4qf \
        -n cluster-user1
The command returns the following result:
    NAME                PASS   AGE
    healthcheck-7c4qf   true   17m
Gather the logs from all of the health check job's pods into a local file with the kubectl command. Here's an example using the previous sample health check job:
    kubectl --kubeconfig ADMIN_KUBECONFIG logs -n cluster-user1 \
        -l baremetal.cluster.gke.io/check-name=healthcheck-7c4qf --tail=-1 > \
        healthcheck-7c4qf.log
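The user cluster steps above (create the HealthCheck resource, wait for it to reconcile, read the result, collect the pod logs) can be chained so that the generated job name is not copied by hand. The following is a minimal sketch, not an official tool: the function name run_user_cluster_healthcheck and its three arguments are hypothetical, and it assumes that kubectl create is run with the standard -o name output option so that only the resource name is printed.

```shell
#!/usr/bin/env bash
# Sketch: run a user cluster health check end to end and save the pod logs.
# The function name and argument order are illustrative assumptions.

run_user_cluster_healthcheck() {
  admin_kubeconfig=$1   # path to the admin cluster's kubeconfig file
  manifest=$2           # HealthCheck YAML file, for example healthcheck-user1.yaml
  namespace=$3          # namespace used in the manifest, for example cluster-user1

  # "-o name" prints only TYPE/NAME of the created object, so the generated
  # name (healthcheck-xxxxx) can be captured from the part after the slash.
  created=$(kubectl --kubeconfig "$admin_kubeconfig" create -f "$manifest" -o name) || return 1
  check_name=${created##*/}

  # Block until the health check stops reconciling (30 minute limit).
  kubectl --kubeconfig "$admin_kubeconfig" wait healthcheck "$check_name" \
    -n "$namespace" --for=condition=Reconciling=False --timeout=30m || return 1

  # Print the result row, then gather all pod logs into a local file.
  kubectl --kubeconfig "$admin_kubeconfig" get healthcheck "$check_name" -n "$namespace"
  kubectl --kubeconfig "$admin_kubeconfig" logs -n "$namespace" \
    -l "baremetal.cluster.gke.io/check-name=$check_name" --tail=-1 \
    > "$check_name.log"

  # Report the log file name so it can be attached to the support case.
  echo "$check_name.log"
}
```

A call such as run_user_cluster_healthcheck ADMIN_KUBECONFIG healthcheck-user1.yaml cluster-user1 would leave the logs in a file named after the generated health check.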
Logs
When you create a new Anthos on bare metal cluster, Cloud Logging agents are enabled by default and scoped only to system-level components. This replicates system-level logs into the Google Cloud project associated with the cluster. System-level logs are from Kubernetes pods in the following namespaces:
Logs can be queried from the Cloud Logging console.
For more details, see Logging and Monitoring.
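Besides the console, replicated logs can also be read from the command line. The sketch below only builds a Cloud Logging filter string for one namespace; the helper name system_log_filter is hypothetical, and it assumes that the system-level pod logs are stored under the k8s_container resource type with a namespace_name label. The kube-system namespace in the usage note is an assumed example, not a value taken from this page.

```shell
# system_log_filter NAMESPACE
# Print a Cloud Logging filter that selects container logs from one namespace.
# Assumption: system pod logs use the k8s_container monitored resource type.
system_log_filter() {
  printf 'resource.type="k8s_container" AND resource.labels.namespace_name="%s"\n' "$1"
}
```

The filter can then be passed to the gcloud CLI, for example: gcloud logging read "$(system_log_filter kube-system)" --project=PROJECT_ID --limit=20, where PROJECT_ID is a placeholder for the Google Cloud project associated with the cluster.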
Google Cloud CLI and remote cluster access
If you open a support case, Cloud Customer Care may ask you for remote read-only access to your clusters to help diagnose and resolve issues more effectively. For the support team to have sufficient access to troubleshoot your cluster issue remotely, ensure that you've installed and updated to the latest version of the Google Cloud CLI. The Google Cloud CLI must be at version 401.0.0 or higher to give Cloud Customer Care the needed permissions. We recommend that you update Google Cloud CLI regularly to pick up added permissions and other enhancements.
To install the latest components of the gcloud CLI, use the gcloud components update command.
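Before you open a case, you can check whether an installed gcloud CLI meets the 401.0.0 minimum. The following is a small sketch under stated assumptions: the function names version_ge and check_gcloud_version are hypothetical, and the comparison relies on the sort -V version-sort option from GNU coreutils being available.

```shell
# Minimum gcloud CLI version named in this section.
MIN_GCLOUD_VERSION="401.0.0"

# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Relies on "sort -V" (version sort), a GNU coreutils option; assumed available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_gcloud_version VERSION: report whether VERSION meets the minimum.
check_gcloud_version() {
  if version_ge "$1" "$MIN_GCLOUD_VERSION"; then
    echo "ok: gcloud CLI $1 meets the $MIN_GCLOUD_VERSION minimum"
  else
    echo "outdated: gcloud CLI $1 is below $MIN_GCLOUD_VERSION; run: gcloud components update"
    return 1
  fi
}
```

The installed version can be supplied by hand, or extracted from the output of gcloud version, assuming that output contains a line of the form "Google Cloud SDK 401.0.0".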
For more information about giving Cloud Customer Care remote read-only access to your clusters, see Google Cloud Support for your registered clusters.
Metrics
In addition to logs, the Cloud Monitoring agent also captures metrics. This replicates system-level metrics into the Google Cloud project associated with the cluster. System-level metrics are from Kubernetes pods running in the same namespaces listed in Logs.
For more details, see Logging and Monitoring.
How we troubleshoot your environment
Here is an example of a typical support incident:
- The cluster administrator opens a support case in Google Cloud console or the Google Cloud Support Center, and selects Anthos and Anthos clusters on bare metal as Category and Component, respectively. They enter the information required and attach the output of relevant bmctl commands to the case.
- The support case is routed to a Technical Support Engineer specializing in Anthos clusters on bare metal.
- The support engineer examines the contents of the snapshot to gain context of the environment.
- The support engineer examines the logs and metrics in the Google Cloud project, entering the support case ID as the business justification, which is logged internally.
- The support engineer responds to the case with an assessment and recommendation. The support engineer and the user continue troubleshooting until they come to a resolution.
What does Google support?
Generally, the Cloud Support team supports all software components shipped as part of Anthos clusters on bare metal, Anthos Service Mesh, and Anthos Config Management. See the following table for a more complete list of what is and isn't supported:
|Google Cloud supported||Not supported|
|Kubernetes and the container runtime||Customer choice of load balancer (manual load balancing)|
|Connect and the Connect Agent||Customer code (see Developer Support)|
|Google Cloud operations, Monitoring, Logging, and agents||Customer choice of operating system|
|Bundled load balancer||Physical or virtual server, storage, and network|
|Ingress controller||External DNS, DHCP, and identity systems|
|Anthos Identity Service|
|Anthos Service Mesh|
|Anthos Config Management|
Version Support Policy
Support for Anthos clusters on bare metal follows the Anthos Version Support Policy. Google supports the current version and the previous two (n-2) minor versions of Anthos clusters on bare metal.
The following table shows the supported and unsupported versions of this product.
|Minor version||Release date||Estimated end of support date||Available patches||Kubernetes version|
|1.13 (latest)||September 29, 2022||June 29, 2023||1.13.0||v1.24.2-gke.1900|
|1.12||June 29, 2022||March 29, 2023||1.12.2||v1.23.5-gke.1505|
|1.11||March 21, 2022||December 21, 2022||1.11.6||v1.22.8-gke.204|
|1.10 (unsupported)||December 10, 2021||September 10, 2022||1.10.8||v1.21.13-gke.202|
|1.9 (unsupported)||September 23, 2021||June 23, 2022||1.9.8||v1.21.13-gke.200|
|1.8 (unsupported)||June 21, 2021||March 21, 2022||1.8.9||v1.20.9-gke.102|
|1.7 (unsupported)||March 25, 2021||December 25, 2021||1.7.7||v1.19.14-gke.2201|
|1.6 (unsupported)||November 30, 2020||August 30, 2021||1.6.4||v1.18.20-gke.3000|
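The n-2 rule stated above (the current minor version plus the previous two) can be expressed as a quick check. This is only a sketch of the rule of thumb: the function name is_supported_minor is hypothetical, it compares minor numbers within one major version, and the published end-of-support dates in the table remain authoritative.

```shell
# is_supported_minor LATEST CANDIDATE
# Succeeds when CANDIDATE is the latest minor version or one of the two
# minor versions before it (n-2). Versions are MAJOR.MINOR, and an optional
# patch component such as 1.12.2 is ignored.
is_supported_minor() {
  latest_major=${1%%.*}
  latest_rest=${1#*.}
  latest_minor=${latest_rest%%.*}
  cand_major=${2%%.*}
  cand_rest=${2#*.}
  cand_minor=${cand_rest%%.*}

  # Only compare versions that share a major version.
  [ "$latest_major" -eq "$cand_major" ] || return 1

  # The candidate may trail the latest release by at most two minor versions.
  gap=$((latest_minor - cand_minor))
  [ "$gap" -ge 0 ] && [ "$gap" -le 2 ]
}
```

With 1.13 as the latest release, is_supported_minor 1.13 1.11 succeeds and is_supported_minor 1.13 1.10 fails, which matches the table.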
This document lists the availability of features and capabilities for Anthos clusters on bare metal for supported releases. The table is not intended to be an exhaustive list, but it highlights some of the benefits of upgrading your clusters to the latest supported version.
Features listed as Preview are covered by the Pre-GA Offerings Terms of the Google Cloud Terms of Service. Pre-GA products and features might have limited support, and changes to pre-GA products and features might not be compatible with other pre-GA versions. For more information, see the launch stage descriptions. Preview offerings are intended for use in test environments only.
Features listed as General Availability (GA) are fully supported, open to all customers, and ready for production use.
|Anthos VM Runtime||Preview||Preview||GA||GA|
|Bundled load balancing with BGP||Preview||GA||GA||GA|
|Cloud Audit Logging||GA||GA||GA||GA|
|Cluster backup and restore CLI support||GA||GA||GA||GA|
|Cluster Certificate Authorities (CAs) rotation||GA||GA||GA||GA|
|Cluster node reset CLI support||GA||GA||GA||GA|
|containerd container runtime||GA||GA||GA||GA|
|Dynamic Flat IP with Border Gateway Protocol (BGP)||Not Available||Preview||Preview||GA|
|Egress NAT gateway||Preview||GA||GA||GA|
|Flat IPv4 mode (static)||Preview||GA||GA||GA|
|Flat IPv6 support (BGP mode)||Not Available||Preview||Preview||GA|
|BGP-based Load Balancer support for IPv6||Not Available||Not Available||Preview||GA|
|IPv4/IPv6 Dual Stack||Preview||GA||GA||GA|
|Managed Collector for Google Cloud Managed Service for Prometheus||Not Available||Not Available||Preview||GA|
|Multi-NIC for Pods||GA||GA||GA||GA|
|Network Connectivity Gateway||Not Available||Not Available||Preview||Preview|
|Node problem detector||GA||GA||GA||GA|
|Registry mirror support||Preview||Preview||Preview||GA|
|Summary API metrics||Not Available||Preview||GA||GA|
Shared Responsibility Model
Running a business-critical production application on Anthos clusters on bare metal requires multiple parties to carry different responsibilities. While not an exhaustive list, the following sections list the roles and responsibilities.
Google responsibilities
- Maintenance and distribution of the Anthos clusters on bare metal software package.
- Notifying users of available upgrades for Anthos clusters on bare metal, and producing upgrade scripts for the previous version; Anthos clusters on bare metal supports sequential upgrades only (example: 1.2 → 1.3 → 1.4 and not 1.2 → 1.4).
- Operating the Connect and Cloud Operations services.
- Troubleshooting, providing workarounds, and correcting the root cause of any issues related to Google-provided components.
User responsibilities
- Overall system administration for on-premises clusters.
- Maintaining any application workload deployed on the cluster.
- Running, maintaining, and patching the data center infrastructure, including networking, servers, operating system, storage, and connectivity to Google Cloud.
- Running, maintaining, and patching network load balancers if the manual load balancer option is chosen.
- Upgrading Anthos clusters on bare metal versions regularly.
- Monitoring of the cluster and applications, and responding to any incidents.
- Ensuring Cloud Operations agents are deployed to clusters.
- Providing Google with environmental details for troubleshooting purposes.
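Because only sequential upgrades are supported, a cluster that is several minor versions behind must pass through each intermediate version in turn. The sketch below lists those hops for planning purposes; the function name upgrade_path is hypothetical, and it assumes MAJOR.MINOR version strings that share the same major version.

```shell
# upgrade_path FROM TO
# Print every minor version a cluster passes through when upgrading one
# minor version at a time. Versions are MAJOR.MINOR (a patch part is ignored).
upgrade_path() {
  major=${1%%.*}
  from_rest=${1#*.}
  from_minor=${from_rest%%.*}
  to_rest=${2#*.}
  to_minor=${to_rest%%.*}

  [ "$major" = "${2%%.*}" ] || { echo "major versions differ" >&2; return 1; }
  [ "$from_minor" -le "$to_minor" ] || { echo "target is older than current" >&2; return 1; }

  # Walk the minor versions in order, joining them with " -> ".
  path=""
  m=$from_minor
  while [ "$m" -le "$to_minor" ]; do
    path="${path:+$path -> }${major}.${m}"
    m=$((m + 1))
  done
  echo "$path"
}
```

For example, upgrade_path 1.11 1.13 prints 1.11 -> 1.12 -> 1.13, which means two separate upgrades rather than one direct jump.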
Developer Support
Google doesn't support your application workloads running on Anthos clusters on bare metal. However, we do provide best-effort developer support to ensure your developers can easily run applications on Anthos clusters on bare metal. We believe that engaging earlier during development can prevent critical incidents later in the deployment.
This Developer Support is available to customers with a paid support package and is treated as a P3 priority for an issue blocking a launch, or a P4 priority for general consultation.
Last updated 2022-09-30 UTC.