This document in the Google Cloud Architecture Framework describes best practices to optimize the performance of your workloads in Google Cloud.
Evaluate performance requirements. Determine the priority of your various applications and what minimum performance you require of them.
Use scalable design patterns. Improve scalability and performance with autoscaling, compute choices, and storage configurations.
- Use autoscaling and data processing.
- Use GPUs and TPUs to increase performance.
- Identify apps to tune.
Use autoscaling and data processing
Use autoscaling so that as load increases or decreases, the services add or release resources to match.
Compute Engine autoscaling
Managed instance groups (MIGs) let you scale your stateless apps across multiple identical VMs, which are launched from an instance template. You can configure an autoscaling policy that scales the group based on CPU utilization, load-balancing capacity, Cloud Monitoring metrics, schedules, and, for zonal MIGs, queue-based workloads such as those driven by Pub/Sub.
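To make the idea of a CPU-based autoscaling policy concrete, the following minimal sketch estimates a group size from a target utilization. It assumes a simple proportional model (total work stays constant, so size scales with observed utilization divided by target utilization). The function name `recommended_size` and its parameters are hypothetical, and the real Compute Engine autoscaler applies additional logic, such as stabilization periods, that is not modeled here.

```python
def recommended_size(current_size: int, observed_cpu_pct: int,
                     target_cpu_pct: int, min_size: int, max_size: int) -> int:
    """Estimate how many instances keep average CPU near the target.

    Simplified proportional model for illustration only; it is not the
    algorithm that the Compute Engine autoscaler uses.
    """
    if current_size <= 0:
        raise ValueError("current_size must be positive")
    if not 0 < target_cpu_pct <= 100:
        raise ValueError("target_cpu_pct must be in the range 1-100")
    # Integer ceiling division: ceil(current_size * observed / target).
    wanted = -(-(current_size * observed_cpu_pct) // target_cpu_pct)
    # Clamp to the minimum and maximum sizes set on the policy.
    return max(min_size, min(max_size, wanted))


# Four VMs at 90% CPU with a 60% target: the group grows.
scale_out = recommended_size(4, 90, 60, min_size=1, max_size=10)
# Four VMs at 30% CPU with the same target: the group shrinks.
scale_in = recommended_size(4, 30, 60, min_size=1, max_size=10)
```

In this model, setting a lower target leaves more headroom for spikes at the cost of running more instances.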
Google Kubernetes Engine autoscaling
You can use the cluster autoscaler feature in Google Kubernetes Engine (GKE) to manage your cluster's node pool based on the varying demand of your workloads. Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests (rather than the actual resource utilization) of the Pods that run on that node pool's nodes.
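The distinction between resource requests and actual utilization can be illustrated with a small sketch. The code below assumes identical nodes and uses a first-fit-decreasing packing heuristic to estimate how many nodes a set of Pod requests needs. The `Pod` class and `nodes_needed` function are hypothetical names, and the real cluster autoscaler simulates scheduling in more detail than this.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    cpu_request_m: int     # requested CPU in millicores
    mem_request_mib: int   # requested memory in MiB
    cpu_used_m: int = 0    # observed usage; deliberately ignored below


def nodes_needed(pods, node_cpu_m, node_mem_mib):
    """Estimate the node count by packing Pod requests onto identical nodes."""
    free = []  # remaining [cpu, memory] on each node opened so far
    for pod in sorted(pods, key=lambda p: p.cpu_request_m, reverse=True):
        if pod.cpu_request_m > node_cpu_m or pod.mem_request_mib > node_mem_mib:
            raise ValueError(f"{pod.name} does not fit on any node")
        for slot in free:
            if slot[0] >= pod.cpu_request_m and slot[1] >= pod.mem_request_mib:
                slot[0] -= pod.cpu_request_m
                slot[1] -= pod.mem_request_mib
                break
        else:
            # No existing node has room for the request, so open a new one.
            free.append([node_cpu_m - pod.cpu_request_m,
                         node_mem_mib - pod.mem_request_mib])
    return len(free)


# Three mostly idle Pods still need two nodes, because their requests
# (not their usage) determine placement.
pods = [Pod(f"web-{i}", cpu_request_m=1500, mem_request_mib=1024, cpu_used_m=100)
        for i in range(3)]
node_count = nodes_needed(pods, node_cpu_m=4000, node_mem_mib=8192)
```

Because sizing follows requests, Pods with oversized requests can cause the node pool to grow even when the nodes are lightly used.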
Dataproc and Dataflow offer autoscaling options to scale your data pipelines and data processing. Use these options to allow your pipelines to access more computing resources based on the processing load.
Design questions
- Which of your applications have variable user load or processing requirements?
- Which of your data processing pipelines have variable data requirements?
Recommendations
- Use Google Cloud Load Balancers to provide a global endpoint.
- Use managed instance groups with Compute Engine to automatically scale.
- Use the cluster autoscaler in GKE to automatically scale the cluster.
- Use App Engine to autoscale your Platform-as-a-Service (PaaS) application.
- Use Cloud Run or Cloud Functions to autoscale your function or microservice.
Use GPUs and TPUs to increase performance
Google Cloud provides options to accelerate the performance of your workloads. You can use these specialized hardware platforms to increase your application and data processing performance.
Graphics Processing Unit (GPU)
Compute Engine provides GPUs that you can add to your virtual machine instances. You can use these GPUs to accelerate specific workloads on your instances such as machine learning and data processing.
Tensor Processing Unit (TPU)
A TPU is a matrix processor that Google designed specifically for machine learning workloads. TPUs are best suited for massive matrix operations with a large pipeline and significantly less memory access.
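One way to see why large matrix operations involve relatively little memory access is to compare the arithmetic they perform with the data they move. The following sketch computes that ratio for a matrix multiplication. It assumes an idealized case in which each matrix is read or written exactly once and each element takes 2 bytes. The function name `matmul_arithmetic_intensity` is hypothetical, and the numbers describe the operation itself, not any particular TPU.

```python
def matmul_arithmetic_intensity(m, k, n, bytes_per_element=2):
    """Return FLOPs per byte for C = A @ B, with A (m x k) and B (k x n).

    Idealized assumption: A and B are each read once and C is written once.
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


# Large square matrices: many operations for each byte moved.
large = matmul_arithmetic_intensity(3000, 3000, 3000)
# A single vector times the same matrix: less than one operation per byte.
vector = matmul_arithmetic_intensity(1, 3000, 3000)
```

Under these assumptions, the ratio grows with the matrix dimensions, which is why batching work into larger matrix operations tends to suit matrix processors.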
Identify apps to tune
Application Performance Management (APM) includes tools to help you reduce latency and cost, so that you can run more efficient applications. With Cloud Trace, Cloud Debugger, and Cloud Profiler, you gain insight into how your code and services function, and you can troubleshoot if needed.
Latency plays a big role in determining your users' experience. When your application backend becomes complex or you adopt a microservice architecture, it's challenging to identify latency in inter-service communication or to find bottlenecks. Cloud Trace and OpenTelemetry tools help you collect latency data from deployments at scale and analyze it quickly.
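The following sketch illustrates the underlying idea of tracing: timing nested units of work (spans) so that you can see which call contributes most to a request's latency. It uses only the Python standard library. The `Tracer` class, the service names, and the sleep durations are hypothetical stand-ins, and this is not the Cloud Trace or OpenTelemetry API.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Record (name, depth, duration_seconds) for nested spans."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        depth = self._depth
        self._depth += 1
        start = time.perf_counter()
        try:
            yield
        finally:
            self._depth -= 1
            # A span is recorded when it ends, so children precede parents.
            self.spans.append((name, depth, time.perf_counter() - start))


tracer = Tracer()


def handle_request():
    with tracer.span("handle_request"):
        with tracer.span("auth_service"):
            time.sleep(0.01)  # stand-in for a remote call
        with tracer.span("inventory_service"):
            time.sleep(0.03)  # stand-in for a slower remote call


handle_request()
for name, depth, seconds in tracer.spans:
    print(f"{'  ' * depth}{name}: {seconds * 1000:.1f} ms")
```

Comparing the durations of the child spans against the parent span shows where the request spends its time.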
Cloud Debugger helps you inspect and analyze your production code behavior in real time without affecting its performance or slowing it down.
Poorly performing code increases the latency and cost of applications and web services. Cloud Profiler helps you identify and address performance issues by continuously analyzing the performance of CPU-intensive or memory-intensive functions executed across an application.
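As a local analogy for finding CPU-intensive functions, the following sketch uses `cProfile` and `pstats` from the Python standard library to find the function with the most self time. It does not use Cloud Profiler, which collects profiles continuously from running services. The functions `slow_sum`, `fast_path`, and `handler` are hypothetical examples.

```python
import cProfile
import pstats


def slow_sum(n):
    """CPU-heavy stand-in: sum of squares computed in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_path():
    return sum(range(100))


def handler():
    slow_sum(200_000)
    fast_path()


profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

stats = pstats.Stats(profiler)
# stats.stats maps (file, line, function_name) to
# (primitive_calls, total_calls, self_time, cumulative_time, callers).
own_time = {key[2]: value[2] for key, value in stats.stats.items()}
hottest = max(own_time, key=own_time.get)
print(f"Most self time: {hottest}")
```

Sorting by self time rather than cumulative time points to the function that does the work, instead of the callers that merely wait on it.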
Recommendations
- Use Cloud Trace to instrument your applications.
- Use Cloud Debugger to provide real-time production debugging capabilities.
- Use Cloud Profiler to analyze the operating performance of your applications.
Explore the other categories of the Google Cloud Architecture Framework.