Day 2 Ops for GKE

Day 2 Ops for GKE helps customers simplify how they operate their GKE platform and build an effective strategy for monitoring and managing it. To schedule a hands-on workshop, please contact your Google Cloud account team.


Simplifying operations to manage the platform in a cost-effective manner

Comprehensive Solution

Google Cloud’s Day 2 Ops solution provides an end-to-end approach for managing a GKE platform, and for monitoring and troubleshooting it so that it meets the required SLAs.

Minimize Operational Risk

Streamline and standardize platform upgrades so that GKE clusters don’t stay on outdated versions for long, where they remain exposed to known vulnerabilities and the security breaches that can follow.

Reduce Operational Costs

Organizations can reduce their operational costs by taking a unified approach to monitoring and managing their various GKE environments.

Key features

Hands-on workshop: Day 2 Ops for GKE

Our solution uses a hands-on workshop to help customers understand Day 2 strategies for GKE. The workshop covers the topics below.

GKE cluster notifications with Pub/Sub

When certain events occur that are relevant to a GKE cluster, such as important scheduled upgrades or available security bulletins, GKE can publish cluster notifications about those events as messages to Pub/Sub topics. You can receive these notifications on a Pub/Sub subscription, integrate with third-party services, and filter for the notification types you want to receive.
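As a rough illustration of filtering by notification type, the sketch below inspects the attributes of an already-received Pub/Sub message and keeps only the event types of interest. It assumes the notification type arrives in a message attribute named `type_url` whose value ends in an event name such as `UpgradeEvent` or `SecurityBulletinEvent`; verify both against the current GKE documentation. The function name and the choice of wanted events are hypothetical.

```python
from typing import Mapping, Optional

# Assumption: the notification type is carried in this message attribute.
TYPE_ATTRIBUTE = "type_url"

# Assumption: event names of interest (the last dotted segment of the type URL).
WANTED_EVENTS = ("UpgradeEvent", "SecurityBulletinEvent")


def notification_kind(attributes: Mapping[str, str]) -> Optional[str]:
    """Return the short event name if it is one we want, otherwise None."""
    type_url = attributes.get(TYPE_ATTRIBUTE, "")
    # Take everything after the last dot, e.g. "...v1beta1.UpgradeEvent".
    short_name = type_url.rsplit(".", 1)[-1]
    return short_name if short_name in WANTED_EVENTS else None
```

A subscriber callback could call `notification_kind(message.attributes)` and acknowledge the message without further processing when the result is `None`. Filtering can also be configured on the notification or subscription itself, so the subscriber never sees unwanted types.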

GKE release channels and cluster upgrades

By default, node auto-upgrade is enabled for Google Kubernetes Engine (GKE) clusters and node pools. GKE release channels let you balance stability against the feature set of the version deployed in the cluster. When you enroll a new cluster in a release channel, Google automatically manages the version and upgrade cadence for the cluster and its node pools.

GKE maintenance windows and exclusions

A maintenance window is a repeating window of time during which automatic maintenance is permitted. A maintenance exclusion is a non-repeating window of time during which automatic maintenance is forbidden. Together, they provide fine-grained control over when automatic maintenance can occur on your GKE clusters.
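The interaction between a repeating window and non-repeating exclusions can be sketched as a simple check: maintenance is permitted at a given moment only if the moment falls inside the window and outside every exclusion. This is a simplified model for illustration, not GKE's actual scheduling logic. It assumes a window that repeats daily and naive `datetime` values in a single time zone; the `DailyWindow` type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Sequence, Tuple


@dataclass(frozen=True)
class DailyWindow:
    """Hypothetical repeating window: opens at `start` every day."""
    start: time
    duration: timedelta


def in_window(window: DailyWindow, moment: datetime) -> bool:
    # Check today's and yesterday's occurrence so that a window
    # crossing midnight is handled correctly.
    for days_back in (0, 1):
        day = moment.date() - timedelta(days=days_back)
        opened = datetime.combine(day, window.start)
        if opened <= moment < opened + window.duration:
            return True
    return False


def maintenance_permitted(
    window: DailyWindow,
    exclusions: Sequence[Tuple[datetime, datetime]],
    moment: datetime,
) -> bool:
    # An exclusion is a one-off [start, end) interval that forbids maintenance.
    excluded = any(start <= moment < end for start, end in exclusions)
    return in_window(window, moment) and not excluded
```

For example, with a window opening at 23:00 for four hours, 01:00 the next morning is permitted unless an exclusion covers that day.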

GKE node pool updates

Node pools represent a subset of nodes within a cluster; a container cluster can contain one or more node pools. Dynamic configuration changes are limited to network tags, node labels, and node taints. Any other field change made through the UpdateNodePool API is not applied dynamically and results in node re-creation.
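Before applying a node pool update, it can help to check which of the planned changes would trigger node re-creation. The sketch below encodes the rule stated above: only network tags, node labels, and node taints are applied dynamically. The field names are hypothetical labels chosen for illustration and are not the actual UpdateNodePool request field names.

```python
from typing import Iterable, List

# Hypothetical short names for the three kinds of change that are
# applied dynamically, per the description above.
DYNAMIC_FIELDS = frozenset({"network_tags", "node_labels", "node_taints"})


def fields_forcing_recreation(changed_fields: Iterable[str]) -> List[str]:
    """Return the changed fields that would cause nodes to be re-created."""
    return sorted(set(changed_fields) - DYNAMIC_FIELDS)
```

An empty result means the update touches only dynamically applied settings. A non-empty result is a signal to schedule the change inside a maintenance window.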

GKE backup and restore

Backup for GKE is a service for backing up and restoring workloads in GKE clusters. Backups of your workloads may be useful for disaster recovery, CI/CD pipelines, cloning workloads, or upgrade scenarios. Protecting your workloads can help you achieve business-critical recovery point objectives.

Ready to get started? Contact us