This page provides information about BigQuery disaster resilience for datasets and the disaster recovery system.
Failures in Google Cloud data centers fall into the following failure domains.
Machine-level: Failures that affect one or a few machines, but not all machines, within a Google Cloud zone. An example of a machine-level failure is a hardware failure on a single machine.
Zonal: Failures that render a single Google Cloud zone unavailable while other zones in the same Google Cloud region remain available. Google Cloud zones have different failure domains, but multiple zones can be co-located in the same geographical location. Examples are a building fire, a power outage, a cut fiber-optic cable, and a network partition.
Regional: Failures affecting an entire Google Cloud region that consists of multiple zones. Examples are hurricanes and large-scale earthquakes.
Types of failures
There are two types of failures: soft failures and hard failures.
A soft failure is an operational deficiency in which hardware is not destroyed. Examples include a power failure, a network partition, or a machine crash. In general, BigQuery should never lose data in the event of a soft failure, even if the failure damages some hardware.
A hard failure is an operational deficiency in which hardware is destroyed. Hard failures are more severe than soft failures. Examples include damage from floods, terrorist attacks, earthquakes, and hurricanes.
Availability and durability
When you create a BigQuery dataset, you select a location in which to store your data. This location is either a region, which is a specific geographical location, such as Iowa (us-central1) or Montréal (northamerica-northeast1), or a multi-region, which is a large geographic area, such as the United States (US) or Europe (EU), that contains two or more geographic places.
In either case, BigQuery automatically stores copies of your data in two different Google Cloud zones within the selected location.
In the event of a machine-level failure, BigQuery continues running with a delay of no more than a few milliseconds, and all currently running queries continue processing. In the event of either a soft or hard zonal failure, no data loss is expected. However, currently running queries may fail and need to be resubmitted. A soft zonal failure, such as one resulting from a power outage, destroyed transformer, or network partition, is a well-tested path and is automatically mitigated within a few minutes.
A soft regional failure, such as a region-wide loss of network connectivity, will result in loss of availability until the region is brought back online, but it will not result in lost data. A hard regional failure, for example, if a disaster destroys the entire region, could result in loss of data stored in that region. BigQuery does not automatically provide a backup or replica of your data in another geographic region. You can create cross-region dataset copies to enhance your disaster recovery strategy.
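The outcomes described in the two paragraphs above can be summarized as a small lookup, which may help when reasoning about which scenarios a disaster recovery plan needs to cover. The following Python sketch is illustrative only: the names `Domain`, `Severity`, `Outcome`, and `expected_outcome` are hypothetical and not part of any BigQuery API, and the "no data loss" value for machine-level failures is an assumption, because this page does not state it explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    MACHINE = "machine-level"
    ZONAL = "zonal"
    REGIONAL = "regional"


class Severity(Enum):
    SOFT = "soft"
    HARD = "hard"


@dataclass(frozen=True)
class Outcome:
    data_loss_possible: bool
    summary: str


def expected_outcome(domain: Domain, severity: Severity) -> Outcome:
    """Return the outcome this page describes for a failure scenario."""
    if domain is Domain.MACHINE:
        # The page does not split machine-level failures by severity and does
        # not mention data loss for them; False here is an assumption.
        return Outcome(False, "keeps running; delay of a few milliseconds at most")
    if domain is Domain.ZONAL:
        if severity is Severity.SOFT:
            return Outcome(False, "running queries may fail; mitigated within a few minutes")
        return Outcome(False, "running queries may fail and need to be resubmitted")
    # Remaining case: regional failures.
    if severity is Severity.SOFT:
        return Outcome(False, "unavailable until the region is back online")
    return Outcome(True, "data stored only in that region could be lost")


if __name__ == "__main__":
    # Print the full matrix of scenarios covered on this page.
    for domain in Domain:
        for severity in Severity:
            outcome = expected_outcome(domain, severity)
            print(f"{severity.value} {domain.value}: {outcome.summary} "
                  f"(data loss possible: {outcome.data_loss_possible})")
```

In this sketch, only the hard regional case reports possible data loss, which is the scenario that a cross-region dataset copy is meant to address.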
To learn more about BigQuery locations, see Location considerations.