Google's primary support objective is to resolve production incidents as quickly as possible. We do this by understanding your configuration, analyzing logs and metrics, and collaborating with partners.
Google Cloud offers a variety of support packages to accommodate your support needs. All Google Cloud Support packages include support for Anthos and GKE on-prem. If you have an existing Google Cloud Support package, then you already have support for Anthos and GKE on-prem.
For more information, see the Google Cloud Support documentation.
Requirements for GKE on-prem Support
To effectively troubleshoot business-critical incidents, you must:
- Check that the environment is current with the published end-of-support timeframes (see Version Support Policy below).
- Enable Cloud Logging and Cloud Monitoring for system components (for more details, see the Support tools section).
- When you open a support case, provide a configuration snapshot using the gkectl diagnose snapshot command.
Support tools
To troubleshoot a GKE on-prem incident, Google Cloud Support relies on three pieces of information:
- Your environment's configuration
- Logs from your admin and user clusters
- Metrics from your admin and user clusters
Configuration
When you open a support case, you are asked to run the gkectl diagnose snapshot --seed-config command and attach the resulting tarball to the support case. gkectl diagnose snapshot --seed-config captures information about Kubernetes and your nodes.
The tool is highly configurable and includes several predefined scenarios. You can also pass a YAML file with a customized set of information to gather. To learn more, refer to Diagnosing Clusters.
You can add an
excludeWords field to your configuration file to omit
sensitive or confidential information. Be sure to carefully review the
information captured by the tool. Highly confidential or sensitive information
should not be attached to your support case.
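As an aid to that review, the snapshot tarball can be scanned for known sensitive strings before it is attached. The following is a minimal Python sketch, not part of gkectl: the function name find_sensitive_terms and the example terms are hypothetical, and it assumes the snapshot is a standard tar archive whose members can be read as text.

```python
import os
import tarfile


def find_sensitive_terms(tarball, terms):
    """Return sorted (member name, term) pairs for terms found in the archive.

    `tarball` is a filesystem path or a seekable binary file object.
    Matching is a case-insensitive substring search.
    """
    lowered = [term.lower() for term in terms]
    hits = set()
    # Accept either a path or an already-open file object.
    if isinstance(tarball, (str, os.PathLike)):
        open_args = {"name": tarball}
    else:
        open_args = {"fileobj": tarball}
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(mode="r:*", **open_args) as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and devices
            handle = archive.extractfile(member)
            if handle is None:
                continue
            text = handle.read().decode("utf-8", errors="replace").lower()
            for term in lowered:
                if term in text:
                    hits.add((member.name, term))
    return sorted(hits)
```

For example, find_sensitive_terms("snapshot.tar.gz", ["password", "internal.example.com"]) (a hypothetical file name and term list) returns the files that still contain those strings. An empty result means only that none of the listed terms were found, not that the snapshot is free of sensitive data.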
Logs
When you create a new GKE on-prem cluster, Cloud Logging agents are enabled by default and scoped only to system-level components. This replicates system-level logs into the Google Cloud project associated with the cluster. System-level logs are from Kubernetes pods running in one of five namespaces:
Logs can be queried from the Cloud Logging console.
For more details, see Logging and Monitoring.
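Outside the console, the same system-level logs can be narrowed with a Cloud Logging query filter. The sketch below only builds the filter string. It assumes that container logs from the cluster are reported under the k8s_container monitored resource type with cluster_name and namespace_name labels, and it uses kube-system purely as an illustrative namespace. The function name system_log_filter is hypothetical.

```python
def system_log_filter(cluster_name, namespace, min_severity="WARNING"):
    """Build a Cloud Logging query filter for one namespace of one cluster.

    Assumption: logs are written against the k8s_container resource type.
    """
    clauses = [
        'resource.type="k8s_container"',
        f'resource.labels.cluster_name="{cluster_name}"',
        f'resource.labels.namespace_name="{namespace}"',
        f"severity>={min_severity}",
    ]
    # Cloud Logging joins comparisons with the AND operator.
    return " AND ".join(clauses)


if __name__ == "__main__":
    # "user-cluster-1" is a placeholder cluster name.
    print(system_log_filter("user-cluster-1", "kube-system"))
```

The resulting string can be pasted into the query box of the Cloud Logging console or passed to a client library. Adjust the labels if your project reports these logs under a different resource type.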
Metrics
In addition to logs, metrics are also captured by the Cloud Monitoring agent. This replicates system-level metrics into the Google Cloud project associated with the cluster. System-level metrics are from Kubernetes pods running in the same namespaces listed in Logs.
For more details, see Logging and Monitoring.
How we troubleshoot your environment
Here is an example of a typical support incident:
- Someone (for example, the cluster administrator) opens a support case via Cloud Console or the Google Cloud Support Center and selects Anthos and GKE on-prem as Category and Component, respectively. They enter the required information and attach the output of gkectl diagnose snapshot to the case.
- The support case is routed to a Technical Support Engineer specializing in GKE on-prem.
- The support engineer examines the contents of the snapshot to understand the environment.
- The support engineer examines the logs and metrics in the Google Cloud project, entering the support case ID as the business justification, which is logged internally.
- The support engineer responds to the case with an assessment and recommendation. The support engineer and the user continue troubleshooting until they come to a resolution.
Collaborative Support Partners
Google maintains collaborative support relationships with select partners to deliver a more seamless support experience for GKE on-prem. These relationships let Google work closely with each partner on behalf of our shared customers.
To benefit from Collaborative Support, you must maintain support agreements with both Google and the partner in question.
Google currently has collaborative support relationships in place with the partners listed on the Collaborative Support Partners page.
Data about support issues may be shared with Collaborative Support Partners, as described in Google's Technical Support Services Guidelines.
What does Google support?
Generally, the Cloud Support team supports all software components shipped as part of GKE on-prem plus open source Istio. The table below details this further:
|GCP Support|Collaborative Support|Not Supported|
|---|---|---|
|Kubernetes and the container runtime|VMware vSphere (vCenter Server and ESXi)|VMware products beyond vSphere|
|Canonical Ubuntu for guest/node OS|F5 BIG-IP load balancers|Customer code (see Developer Support below)|
|Calico and related network policies|Hardware and hyper-converged infrastructure solutions as listed in the Collaborative Support Partners page|Customer choice of host OS|
|Prometheus and Grafana| |Physical server, storage, and network|
|Stackdriver Monitoring, Stackdriver Logging, and Stackdriver agents| |External DNS, DHCP, and identity systems|
|Identity federation with OIDC-compliant providers| |Calico Enterprise Edition|
|Hub, Connect, and the Connect Agent| | |
|Open source Istio| | |
|Cloud Run / Knative| | |
|Bundled LoadBalancer (Seesaw)| | |
Version Support Policy
To learn about the overall version support policy, see the Anthos support page.
Shared Responsibility Model
Running a business-critical production application on GKE on-prem requires multiple parties to carry different responsibilities. While not exhaustive, the sections below list the roles and responsibilities.
Google's responsibilities
- Maintenance and distribution of the GKE on-prem software package, including Kubernetes, vCenter and F5 controllers, ingress controller, Connect and Stackdriver agents, and the gkectl command-line tool.
- Maintenance and distribution of the Ubuntu admin workstation and node machine images including regular patching and security fixes.
- Notifying users of available upgrades for GKE on-prem, and producing upgrade scripts for the previous version. GKE on-prem supports sequential upgrades only (for example, 1.2 → 1.3 → 1.4, not 1.2 → 1.4 directly).
- Operating the Connect and Stackdriver services.
- Troubleshooting, providing workarounds, and correcting the root cause of any issues related to Google-provided components.
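The sequential-upgrade rule noted above can be expressed as a small planning helper. This is an illustrative Python sketch, not a Google-provided tool: the function name upgrade_path is hypothetical, and it assumes versions are written as "major.minor" and that every intermediate minor release must be installed in order.

```python
def upgrade_path(current, target):
    """Return the versions to install, in order, one minor release at a time."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target version is older than the current version")
    # Skipping a minor release is not allowed, so list every step in between.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    print(upgrade_path("1.2", "1.4"))
```

Calling upgrade_path("1.2", "1.4") yields the two steps 1.3 and 1.4, matching the 1.2 → 1.3 → 1.4 sequence. A cluster already on the target version gets an empty list.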
Customer's responsibilities
- Overall system administration for on-premises clusters.
- Maintaining any application workload deployed on the cluster.
- Running, maintaining, and patching the data center infrastructure, including networking, servers, storage, and connectivity to Google Cloud.
- Running, maintaining, and patching vSphere and network load balancers.
- Maintaining support contracts with VMware and F5 (if deployed).
- Upgrading GKE on-prem versions on a regular basis.
- Testing and deploying updated node machine images with Ubuntu patches.
- Monitoring of the cluster and applications, and responding to any incidents.
- Ensuring Cloud Logging and Stackdriver agents are deployed to clusters.
- Providing Google with environmental details for troubleshooting purposes.
Developer Support
Google does not provide support for application workloads running on GKE on-prem. However, we do provide best-effort developer support to ensure your developers can easily run applications on GKE on-prem. We believe that engaging earlier during development can prevent critical incidents later in the deployment.
This Developer Support is available to customers with a paid support package and is treated as a P3 for an issue blocking a launch, or a P4 for general consultation.