Google Cloud products are served from specific regional failure domains and are backed by Service Level Agreements, so that you can design your application architecture around the structure of Google Cloud.
Google Cloud infrastructure services are available in locations across North America, South America, Europe, Asia, and Australia. These locations are divided into regions and zones. You can choose where to locate your applications to meet your latency, availability, and durability requirements.
Regions and zones
Regions are independent geographic areas that consist of zones.
A zone is a deployment area for Google Cloud resources within a region. Zones should be considered a single failure domain within a region. To deploy fault-tolerant applications with high availability and help protect against unexpected failures, deploy your applications across multiple zones in a region.
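As a minimal sketch of the multi-zone advice above, the following Python spreads application replicas round-robin across the zones of one region and then shows which replicas survive the loss of a single zone. The function names and zone names are illustrative assumptions, not part of any Google Cloud API; in practice a managed feature such as a regional managed instance group would handle placement.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(replica_count: int, zones: list[str]) -> dict[str, str]:
    """Assign each replica to a zone in round-robin order.

    Returns a mapping of (hypothetical) replica name -> zone.
    """
    if replica_count < 0:
        raise ValueError("replica_count must be non-negative")
    if not zones:
        raise ValueError("at least one zone is required")
    zone_cycle = cycle(zones)
    return {f"replica-{i}": next(zone_cycle) for i in range(replica_count)}


def surviving_replicas(placement: dict[str, str], failed_zone: str) -> list[str]:
    """Return the replicas still running if one zone becomes unavailable."""
    return [name for name, zone in placement.items() if zone != failed_zone]


# Zone names are used for illustration only.
zones = ["us-central1-a", "us-central1-b", "us-central1-c"]
placement = spread_across_zones(6, zones)
print(Counter(placement.values()))
print(surviving_replicas(placement, "us-central1-a"))
```

With six replicas over three zones, losing any one zone leaves four replicas running, which is the property the multi-zone recommendation is after.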
To protect against the loss of an entire region due to natural disaster, have a disaster recovery plan and know how to bring up your application in the unlikely event that your primary region is lost. See application deployment considerations for more information.
For more information about the specific resources available within each location option, see our Cloud locations.
Google Cloud's services and resources can be zonal, regional, or managed by Google across multiple regions. For more information about what these options mean for your data, see geographic management of data.
Zonal resources operate within a single zone. Zonal outages can affect some or all of the resources in that zone. An example of a zonal resource is a Compute Engine virtual machine (VM) instance that resides within a specific zone.
Regional resources are redundantly deployed across multiple zones within a region, for example, App Engine applications or regional managed instance groups. This gives them higher availability relative to zonal resources.
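To illustrate why redundancy across zones raises availability, the sketch below computes the combined availability of a service replicated in several zones. It rests on two simplifying assumptions: zones fail independently, and the service is up whenever at least one zone is up. The 99.5% input is a made-up figure, not a Google Cloud SLA value, and real zone failures are not perfectly independent.

```python
def combined_availability(per_zone_availability: float, zone_count: int) -> float:
    """Availability of a service that is up if at least one zone is up.

    Assumes independent zone failures (a simplification):
    A_total = 1 - (1 - A_zone) ** zone_count
    """
    if not 0.0 <= per_zone_availability <= 1.0:
        raise ValueError("per_zone_availability must be between 0 and 1")
    if zone_count < 1:
        raise ValueError("zone_count must be at least 1")
    return 1.0 - (1.0 - per_zone_availability) ** zone_count


# Hypothetical per-zone availability, for illustration only.
for zone_count in (1, 2, 3):
    print(zone_count, f"{combined_availability(0.995, zone_count):.6%}")
```

Under these assumptions each added zone multiplies the probability of a total outage by the per-zone failure probability, which is why a regional resource can be more available than any single zonal one.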
Multiple Google Cloud services are managed by Google to be redundant and distributed within and across regions. These services optimize availability, performance, and resource efficiency. As a result, they require a trade-off in either latency or the consistency model. These trade-offs are documented on a product-specific basis.
The following services have one or more multiregional locations in addition to any regional locations:
- Cloud Bigtable
- Cloud Data Loss Prevention
- Cloud Healthcare API
- Cloud Key Management Service
- Cloud Spanner
- Cloud Storage
These multiregional services are designed to be able to function following the loss of a single region.
You can find each product's exact configurations and options with respect to regions and zones in Google Cloud public documentation.
Google Cloud has been designed to operate globally from the ground up and continually conducts maintenance and upgrades 24/7/365 without inconveniencing you. Our global backbone provides tremendous flexibility for load balancing, and reduces end-user latency by having interconnects close to you. Our global cloud management plane simplifies managing multi-region deployments.
Underpinning and supporting many customer-facing Google Cloud services is a set of proven internal services such as Spanner, Colossus, Borg, and Chubby.
These internal services are either globally load-balanced across multiple regions, or dedicated to each region in which they are available. Where services are load-balanced across multiple regions, we deploy updates progressively region-by-region, allowing us to detect and address problems without affecting your service usage. None of these internal services are limited to a single logical data center or to a single region.
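The region-by-region rollout described above can be sketched as a loop that updates one region at a time and halts at the first failed health check, so that later regions keep running the previous version. This is an illustrative model only: `progressive_rollout`, the `deploy` and `is_healthy` callables, and the region names are assumptions and do not describe Google's internal tooling.

```python
from typing import Callable, Iterable, Optional


def progressive_rollout(
    regions: Iterable[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> tuple[list[str], Optional[str]]:
    """Deploy to regions one at a time, stopping at the first unhealthy one.

    Returns (regions updated successfully, region where the rollout halted
    or None if every region was updated).
    """
    updated: list[str] = []
    for region in regions:
        deploy(region)
        if not is_healthy(region):
            # Halt here: regions later in the order are never touched.
            return updated, region
        updated.append(region)
    return updated, None


# Hypothetical rollout in which the second region fails its health check.
touched: list[str] = []
updated, halted_at = progressive_rollout(
    ["us-east1", "europe-west1", "asia-east1"],
    deploy=touched.append,
    is_healthy=lambda region: region != "europe-west1",
)
print(updated, halted_at, touched)
```

In this run the problem is detected in the second region and the third is left on the old version, which limits how far a bad change can spread.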
In general, if a single region goes down, only resources that reside solely in that region are impacted; multi-region products are not impacted.
All Google Cloud services rely upon core internal tools to provide fundamental services such as networking (in and out of data centers), access to data centers, and Identity Authorization systems. These tools are regionalized, and one region is not impacted if other regions go down.
Google Cloud provides clear direction on how you can architect your applications for the desired level of resilience in Compute Engine, BigQuery, Pub/Sub, and other services via public documentation.
Maintaining and improving availability and resilience
Site Reliability Engineering (SRE) is Google's internal organization dedicated to working on availability, latency, performance, and capacity. Outages and service unavailability are correlated with the deployment of new code or changes to the environment. Using industry best practices, SRE balances the need to release new software and keep the environment secure against the understanding that those necessary changes might cause downtime.
Partnering with customers to build resilient services
If you have mission-critical needs and must architect for resilience and disaster recovery, our SRE/CRE and PSO teams can work with you to architect your applications to span multiple regions and zones, and can further assist you with designing high availability (HA) systems.
If you have heightened availability requirements around specific dates, such as Black Friday/Cyber Monday, Google Cloud has a program to partner with you to check and validate your specific application running on Google Cloud and identify any unexpected service dependencies between your application and our services.
Geographic management of data
Data locality for Google Cloud services is governed by the terms of service, including service-specific terms. Google understands that each customer might have unique security and compliance needs. The Google Cloud sales team can help you work towards meeting your requirements.
When using regional or zonal storage resources, we strongly recommend that you replicate data to another region or snapshot it to a multiregional storage resource for disaster recovery purposes.
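One way to act on this recommendation is to audit an inventory of datasets and flag any that have no copy outside their primary region. The sketch below is a plain-Python illustration; the `Dataset` type, the dataset names, and the inventory itself are hypothetical, and a real audit would read locations from your own configuration or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    primary_region: str
    # Regions or multiregions that hold a replica or snapshot of the data.
    copy_locations: list[str] = field(default_factory=list)


def datasets_without_dr_copy(datasets: list[Dataset]) -> list[str]:
    """Return names of datasets with no copy outside their primary region."""
    return [
        ds.name
        for ds in datasets
        if not any(loc != ds.primary_region for loc in ds.copy_locations)
    ]


# Hypothetical inventory, for illustration only.
inventory = [
    Dataset("orders-db", "us-central1", ["us-east1"]),
    Dataset("session-cache", "us-central1"),
    Dataset("vm-disk-snapshots", "europe-west1", ["EU"]),
    Dataset("logs", "asia-east1", ["asia-east1"]),
]
print(datasets_without_dr_copy(inventory))
```

Here a copy in the same region as the primary does not count as disaster recovery protection, because it would be lost together with the region.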
Application deployment considerations
- To build highly available services and applications that can withstand zones becoming unavailable, use the following:
  - Regional resources, such as App Engine applications or regional managed instance groups, or managed multiregional resources, such as Cloud Storage, Datastore, Firestore, or Cloud Spanner.
  - Zonal resources, such as Compute Engine virtual machines, while managing your own compute and storage redundancy across zones or across regions.
- To build disaster recovery capable applications that can withstand the extended loss of entire regions:
  - For data, use one or more of the following strategies:
    - Use managed, multiregional storage services such as Cloud Storage, Datastore, Firestore, or Cloud Spanner.
    - Use zonal or regional resources, but snapshot data to a multiregional resource such as Cloud Storage, Datastore, Firestore, or Cloud Spanner.
    - Use zonal or regional resources, but manage your own data replication to one or more other regions.
  - For compute, use the following strategy:
    - Use zonal or regional resources, such as Compute Engine or App Engine, but bring up your application in another region, manually or automatically, when a region fails. Point it at copies of your primary data if the data is not already in a managed, multiregional resource.
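The compute strategy above implies a decision at failover time: which region to bring the application up in. A minimal sketch, assuming you keep an ordered list of candidate regions and know which regions are healthy and which hold a copy of your data, is shown below. The function name, its parameters, and the region names are hypothetical and only illustrate the selection logic.

```python
from typing import Optional


def choose_failover_region(
    primary: str,
    candidates: list[str],
    healthy_regions: set[str],
    regions_with_data: set[str],
) -> Optional[str]:
    """Pick the first preferred region that is healthy and has a data copy.

    Returns None when no candidate qualifies, which signals that the
    disaster recovery plan has a gap for this failure.
    """
    for region in candidates:
        if (
            region != primary
            and region in healthy_regions
            and region in regions_with_data
        ):
            return region
    return None


# Hypothetical scenario: the primary region us-central1 is lost.
target = choose_failover_region(
    primary="us-central1",
    candidates=["us-east1", "us-west1", "europe-west1"],
    healthy_regions={"us-west1", "europe-west1"},
    regions_with_data={"us-east1", "europe-west1"},
)
print(target)
```

In this scenario `us-east1` has the data but is unhealthy and `us-west1` is healthy but has no data copy, so the sketch selects `europe-west1`. If the data already lives in a managed multiregional resource, the data check could be dropped.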
For more information about service dependencies, contact sales.
Additional solutions and tutorials
The following solutions and tutorials provide guidance for ensuring your application is highly available and can withstand outages:
- Learn how to use Google Cloud to build scalable and resilient application architectures using patterns and practices that apply broadly to any web application.
- Configure Compute Engine instances in different regions and use HTTP load balancing to distribute traffic across the regions to increase availability across regions and provide failover in the case of a service outage.
- Design your application on the Compute Engine service to be robust against failures, network interruptions, and unexpected disasters.
- Learn how to add basic disaster recovery to your Cassandra installation by backing up your data into, and restoring your data from, Cloud Storage.
- General principles for designing and testing a disaster recovery plan with Google Cloud.