Google Cloud Platform for Azure Professionals: Storage

Updated March 15, 2018

Compare the storage services provided by Microsoft Azure and Google Cloud Platform (GCP). This article discusses the following service types:

  • Distributed object storage, or redundant key-value stores in which you can store data objects.
  • Block storage, or virtual disk volumes that you can attach to virtual machine instances.
  • File storage, or network-attached, file-server-based storage.
  • Storage area networks, or remote storage that provides block-level storage access.
  • Cool storage, or storage services designed to store data backups.
  • Archival storage, or storage services designed to store archival data for compliance or analysis purposes.

This article does not discuss databases or message queues.

Service model comparison

Microsoft Azure and GCP take different approaches to the high-level organization and configuration of their storage services. However, in practice, the storage services themselves are often more similar than they are different.

Microsoft Azure

On Azure, you store your various types of data, including binary objects (blobs), databases, and message queues, in specific services within a storage account. Azure requires you to define your account type, disk type, and redundancy type at the storage-account level. These attributes then apply to all of the storage services within that account.

Azure storage accounts come in three types: general-purpose v1 (GPv1), blob-specific, and general-purpose v2 (GPv2). GPv1 accounts can support any of Azure's standard storage types, and blob-specific accounts are designed to support advanced functionality for Azure Blob Storage. GPv2 accounts support the APIs and features of both of the other account types.

While blob-specific accounts run exclusively on hard disk drives (HDDs), both general-purpose account types are further split into Standard Storage accounts, which run on HDDs, and Premium Storage accounts, which run on solid-state drives (SSDs). This latter account type supports page blobs only.

When you create a new Azure storage account, you select the level of replication you want to use. Azure offers the following levels:

  • Locally redundant storage (LRS), which replicates your data locally within the data center that contains your storage account. Your data is replicated three times.
  • Zone-redundant storage (ZRS), which replicates your data across one or two regions with eventual consistency. As with LRS, your data is replicated three times locally as well. ZRS is limited to block blob storage in general-purpose storage accounts.
  • Geo-redundant storage (GRS), which replicates your data in a primary region and in a secondary region at least 100 miles away. Your data is replicated three times in the primary region, and then asynchronously replicated three times in the secondary region.
  • Read-access geo-redundant storage (RA-GRS), which is identical to GRS but adds a secondary read-only endpoint in the secondary region.

Google Cloud Platform

Like Azure, GCP stores each data type within a type-specific service. However, GCP does not have a high-level organizational layer such as a storage account. Instead, you create storage resources and define resource attributes, such as disk type or redundancy type, at the service level.

For distributed object storage, GCP provides Cloud Storage, which is comparable to Azure Blob Storage's block and append blob storage. For block storage, GCP offers Compute Engine persistent disks, which are equivalent to Azure VHDs.

Distributed object storage

For distributed object storage, Azure offers Azure Blob Storage, and GCP offers Cloud Storage.

Azure Blob Storage and Cloud Storage have many similarities. In both services, you store binary objects inside a named unit of storage. In Azure Blob Storage, these binary objects are called blobs, and the unit of storage is called a container. In Cloud Storage, these binary objects are called objects, and the unit of storage is called a bucket.

In both services, each binary object within a unit of storage is identified by a unique key within that unit of storage, and each object has an associated metadata record. This metadata record contains information such as object size, date of last modification, and media type. If you have the appropriate permissions, you can view and modify some of this metadata. You can also add custom metadata, if needed.

Although both containers and buckets are key-value stores, the user experience for each is similar, though not identical, to that of a filesystem. By convention, object keys are usually paths such as "/foo/bar.txt" or "/foo/subdir/baz.txt". Both services also provide filesystem-like APIs. For example, Azure Blob Storage's List Blobs method and Cloud Storage's list method can both list all object keys with a common prefix, much as ls -R would on a Unix-like filesystem.
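The prefix-based listing described above can be illustrated with a small, self-contained sketch. It does not call either service. The function name list_keys and its delimiter parameter are hypothetical, chosen only to show how a flat key space can be made to look like a directory tree.

```python
from typing import Iterable, List, Set, Tuple


def list_keys(
    keys: Iterable[str], prefix: str = "", delimiter: str = ""
) -> Tuple[List[str], List[str]]:
    """Return (matching keys, pseudo-directory prefixes) for a flat key space.

    Hypothetical helper for illustration only; not part of any client library.
    """
    objects: List[str] = []
    prefixes: Set[str] = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything past the next delimiter into one "directory".
            prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(prefixes)


keys = ["/foo/bar.txt", "/foo/subdir/baz.txt", "/other/readme.txt"]

# Without a delimiter, the listing is recursive, similar to ls -R.
recursive, _ = list_keys(keys, prefix="/foo/")

# With a delimiter, deeper keys are collapsed into pseudo-directories.
shallow, subdirs = list_keys(keys, prefix="/foo/", delimiter="/")
```

In this sketch, the recursive listing returns both keys under "/foo/", while the delimited listing returns only "/foo/bar.txt" plus the pseudo-directory "/foo/subdir/".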

In addition to their most obvious use, distributed object storage, both services can be used to host static web content and media.

Azure Blob Storage's features and terminology map to those of Cloud Storage as follows:

Feature | Azure Blob Storage | Cloud Storage
Unit of deployment | Container | Bucket
Deployment identifier | Account-level unique key | Globally unique key
File system emulation | Limited | Limited
Object types | Block blobs, append blobs | Objects
Object metadata | Yes | Yes
Object versioning | Manual, per-object snapshotting | Automatic versioning of all objects in a bucket (must be enabled)
Object lifecycle management | Yes (through Azure Automation) | Yes (native)
Object change notifications | Yes (through Azure Event Grid) | Yes (through Cloud Pub/Sub)
Service classes | Redundancy levels: LRS, ZRS, GRS, RA-GRS; Tiers: Hot, Cool, Archive | Regional, Multi-Regional, Cloud Storage Nearline, Cloud Storage Coldline
Deployment locality | Zonal and regional | Regional and multi-regional
Redundancy | Yes | Yes

Blob types

In Azure Blob Storage, you store data as block blobs, append blobs, or page blobs. In Cloud Storage, you store all data as objects, which are equivalent to block blobs. GCP doesn't provide a service or object type directly comparable to append blobs. However, you can use Cloud Storage's object composition functionality and concurrency controls to approximate the functionality of an append blob. For more information, see Composite Objects and Parallel Uploads.
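As a rough illustration of how object composition can approximate an append blob, the following sketch simulates the pattern in memory. The FakeBucket class and the append helper are hypothetical stand-ins, not a real client library, and the limit of 32 source objects per compose call is an assumption that you should confirm in Composite Objects and Parallel Uploads.

```python
from typing import Dict, List

MAX_COMPOSE_SOURCES = 32  # assumed per-call limit; verify against the docs


class FakeBucket:
    """Hypothetical in-memory stand-in for a bucket, for illustration only."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def upload(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def compose(self, source_keys: List[str], destination_key: str) -> None:
        if not 1 <= len(source_keys) <= MAX_COMPOSE_SOURCES:
            raise ValueError("compose needs between 1 and 32 source objects")
        # Concatenate the sources in order and store the result.
        self.objects[destination_key] = b"".join(
            self.objects[k] for k in source_keys
        )

    def delete(self, key: str) -> None:
        del self.objects[key]


def append(bucket: FakeBucket, key: str, chunk: bytes) -> None:
    """Append a chunk by uploading it separately and composing it onto key."""
    temp_key = key + ".append-tmp"  # hypothetical naming scheme
    bucket.upload(temp_key, chunk)
    sources = [key, temp_key] if key in bucket.objects else [temp_key]
    bucket.compose(sources, key)
    bucket.delete(temp_key)


bucket = FakeBucket()
append(bucket, "app.log", b"line 1\n")
append(bucket, "app.log", b"line 2\n")
```

A production version would also need the concurrency controls discussed later in this article, because two writers composing onto the same object at once could otherwise overwrite each other's appends.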

Unlike Azure, GCP does not store your disk volumes as page blobs within its object storage service. Instead, your disk volumes are stored within Compute Engine, GCP's infrastructure-as-a-service offering. See Block storage for more information.

Access tiers and replication

The flexibility of Azure Blob Storage depends on the type of storage account you've created and the replication options you've chosen for that account. If you're using a GPv1 storage account, you are limited to Azure's default tier for blob storage. However, when you use a blob-specific or GPv2 storage account, you can choose between Hot, Cool, and Archive access tiers for Azure Blob Storage. The Hot tier is designed for frequently accessed data, the Cool tier is designed for infrequently accessed data, and the Archive tier is designed for data archiving. The level of replication for your Azure Blob Storage service is determined by your storage account's replication type.

In contrast, Cloud Storage's replication types are built into its service classes. These service classes map to Azure Blob Storage access tiers and replication types as follows:

Configuration | Azure | GCP
Frequently accessed data with automatic failover | N/A | Multi-Regional Storage
Frequently accessed data with geo-redundancy | Azure Blob Storage (general-purpose or Hot tier) with GRS or RA-GRS | Multi-Regional Storage
Frequently accessed data with regional redundancy | Azure Blob Storage (general-purpose) with ZRS | Regional Storage
Frequently accessed data with local (data center) redundancy | Azure Blob Storage (general-purpose or Hot tier) with LRS | Regional Storage*
Infrequently accessed data | Azure Blob Storage Cool tier | Cloud Storage Nearline
Archival data | Azure Blob Storage Archive tier | Cloud Storage Coldline

* Regional redundancy is the lowest level of redundancy available on GCP.

Object versioning

Both Azure and Cloud Storage allow you to version your stored objects, storing distinct versions of an object with a given key under distinct version IDs. However, they implement this functionality in different ways.

In Azure Blob Storage, you can achieve versioning by taking read-only snapshots of your blobs. If you upload your files programmatically, you can take a new snapshot before each upload. Azure Blob Storage also lets you specify access conditions to avoid unnecessary snapshotting.

In contrast, Cloud Storage allows you to enable automatic object versioning for all objects within a bucket. With automatic versioning enabled, Cloud Storage automatically creates a new version of an object each time you modify the object. This approach simplifies the object versioning process but is slightly less flexible than Azure's approach. Each version of an object also adds to your total stored data, which can increase storage costs. You can mitigate this issue by using Cloud Storage's object lifecycle management.

Concurrency controls

Azure Blob Storage and Cloud Storage each default to a "last write wins" write strategy. This strategy works well for sequential writes, but it allows for race conditions if you're performing concurrent writes to the same object. To mitigate this issue, both services provide mechanisms for managing concurrent writes.

On Azure, you can manage concurrent writes optimistically or pessimistically:

  • Optimistic approach: You retrieve an object's ETag header when performing a GET operation, and then compare that ETag to the object's current ETag when attempting a write. If the tags match, you commit the write.
  • Pessimistic approach: You lease the target object, locking it for a specified duration while you perform the write.

On Cloud Storage, you use an optimistic approach. To manage concurrent writes, you obtain the current generation number of a given object, and then check against that generation number when your script or application attempts a write. If the numbers match, you commit the write. Otherwise, you abort the transaction and then restart it. For more information, see Object Versioning and Concurrency Control.
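The generation-matching flow described above can be sketched as a compare-and-swap loop. The example below is an in-memory simulation: InMemoryBucket, PreconditionFailed, and update_with_retry are hypothetical names, and the sequential generation counter is a simplification, since real generation numbers should be treated as opaque. The sketch also assumes that an expected generation of 0 means "the object must not exist yet".

```python
from dataclasses import dataclass
from typing import Callable, Dict


class PreconditionFailed(Exception):
    """Raised when the stored generation differs from the expected one."""


@dataclass
class StoredObject:
    data: bytes
    generation: int


class InMemoryBucket:
    """Hypothetical stand-in for a bucket; not a real client library."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def write(self, key: str, data: bytes, if_generation_match: int) -> int:
        current = self._objects.get(key)
        current_generation = current.generation if current else 0
        if current_generation != if_generation_match:
            # Someone else wrote the object since the caller last read it.
            raise PreconditionFailed(key)
        new_generation = current_generation + 1  # simplified counter
        self._objects[key] = StoredObject(data, new_generation)
        return new_generation


def update_with_retry(
    bucket: InMemoryBucket,
    key: str,
    transform: Callable[[bytes], bytes],
    max_attempts: int = 5,
) -> int:
    """Read, transform, and write back, restarting if the generation moved."""
    for _ in range(max_attempts):
        snapshot = bucket.get(key)
        try:
            return bucket.write(
                key, transform(snapshot.data), snapshot.generation
            )
        except PreconditionFailed:
            continue  # abort this attempt and restart the transaction
    raise RuntimeError(f"gave up updating {key!r} after {max_attempts} attempts")


bucket = InMemoryBucket()
first_generation = bucket.write("counter", b"1", if_generation_match=0)
second_generation = update_with_retry(bucket, "counter", lambda d: d + b"1")
```

The ETag-based optimistic approach on Azure follows the same shape, with the ETag playing the role of the generation number.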

Object lifecycle management

Cloud Storage allows you to automate object deletion according to user-specified lifecycle policies. For more information, see Object Lifecycle Management.
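As a minimal sketch, a lifecycle policy that deletes objects after a given age can be built as a JSON document. The layout below (a "rule" list with an "action" and a "condition") reflects the bucket lifecycle configuration format as commonly documented, but treat it as an assumption and check it against Object Lifecycle Management. The helper name delete_after_days and the 365-day threshold are hypothetical.

```python
import json
from typing import Any, Dict


def delete_after_days(age_days: int) -> Dict[str, Any]:
    """Build a lifecycle policy that deletes objects older than age_days.

    Hypothetical helper; the JSON layout is an assumption to verify.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": age_days},
            }
        ]
    }


policy = delete_after_days(365)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON would then be applied to a bucket with whichever tool or API you normally use to manage bucket configuration.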

Though Azure Blob Storage does not provide a native lifecycle management feature, you can use Azure Automation to automate object deletion.

Object change notifications

Both Azure and GCP provide a publish/subscribe model that allows you to send and receive notifications when your objects are modified. With Azure Blob Storage, you can use Azure Event Grid to track Blob Storage events and send them to a webhook, Azure Function, or other endpoint. Similarly, GCP provides Cloud Pub/Sub Notifications, which allow you to publish notifications to a Cloud Pub/Sub topic when objects are created, deleted, or updated within a Cloud Storage bucket. To receive these notifications, you can subscribe to this Cloud Pub/Sub topic from other applications or services.

Encryption

Azure supports encryption at rest through Azure Storage Service Encryption (SSE) for Data at Rest. All blob-based storage within your storage account is encrypted with AES256 during ingress and decrypted during egress. If you enable encryption for your account after you've already uploaded data, that data will not be encrypted until it is rewritten.

Similarly, all data stored in GCP's storage services, including Cloud Storage, is automatically encrypted at rest with either AES256 or AES128. For data that requires you to manage your own encryption key, GCP also supports customer-managed encryption keys (CMEK) using Cloud Key Management Service and customer-supplied encryption keys (CSEK). For more information, see Encryption at Rest in Google Cloud Platform.

Service level agreement

Microsoft and Google both provide uptime guarantees and have policies in place for crediting customer accounts in the event that these guarantees are not met. Microsoft defines the guarantees and policies for Azure Blob Storage in the Azure Storage SLA. Google defines the guarantees and policies for Cloud Storage in the Cloud Storage SLA.

Costs

Azure Blob Storage

Azure Blob Storage is priced by amount of data stored per month, storage account type, replication type, and network egress. If you take object snapshots, these snapshots are charged at the same rate as the live versions of the objects. Azure Blob Storage also charges for common API requests.

Cloud Storage Regional and Multi-regional

Cloud Storage Regional and Multi-regional are priced by amount of data stored per month and by network egress. For buckets with object versioning enabled, each archived version of an object is charged at the same rate as the live version of the object. Cloud Storage Regional and Multi-regional also charge for common API requests.

Block storage

GCP and Azure both offer block storage options. GCP provides block storage in the form of persistent disks, which are part of Compute Engine. Azure provides block storage in the form of page blobs, which are stored in a container in a general-purpose storage account. Both platforms also provide users the option of using locally attached SSDs.

Service model comparison

Apart from how they're stored, Compute Engine persistent disks and Azure virtual hard disks (VHDs) are very similar. In both cases, disk volumes are network-attached, though both Compute Engine and Azure also provide the ability to locally attach a disk if necessary. While networked disks have higher operational latency and lower throughput than locally attached disks, they have many benefits as well, including built-in redundancy, snapshotting, and ease of disk detachment and reattachment.

Azure VHDs

Azure stores its VHDs as page blobs. Page blobs can be up to 8 TB in size, and VHDs can be up to 4 TB in size. Azure virtual machines (VMs) impose machine-type-based limits on how many VHDs can be attached at a time. For example, basic-tier machines can attach up to 16 TB of VHD storage, while higher-tier machines can attach up to 256 TB of VHD storage.

Each VHD must reside in a storage account based in the same region as the VM to which the VHD is attached. The latency and throughput of VHDs depend on both the storage account type and the machine type of the VM:

  • Azure Standard Storage accounts run on standard HDDs and are recommended for infrequently accessed data or bulk storage use cases.
  • Azure Premium Storage accounts run on SSD drives and are recommended for I/O-intensive operations. The Premium Storage tier supports only LRS replication. Because the Premium Storage tier is available for VHDs only, if you choose to manually manage your disks instead of letting Azure manage them, you might need to create a separate storage account specifically for your SSD-backed VHDs. Some low-tier machines, such as A0, do not support SSD-backed VHDs.

Compute Engine persistent disks

GCP provides block storage in the form of persistent disks that are stored in Compute Engine. You can attach up to 16 persistent disks to a single Compute Engine VM instance, regardless of its machine type. You can attach up to 64 TB of persistent disk storage per VM instance, and each persistent disk can be up to 64 TB in size. Persistent disks can be HDDs or SSDs. As with Azure, you must create your disk volume in the same zone as the VM instance to which the disk will be attached.
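The attachment limits stated above (up to 16 disks, up to 64 TB per disk, and up to 64 TB in total per instance) can be checked with a short validation sketch. The function name check_disk_plan is hypothetical, and the limits are taken from this article as of its publication date, so they might not match current quotas.

```python
from typing import List, Sequence

# Limits as stated in this article; they may have changed since.
MAX_DISKS_PER_INSTANCE = 16
MAX_DISK_TB = 64.0
MAX_TOTAL_TB = 64.0


def check_disk_plan(disk_sizes_tb: Sequence[float]) -> List[str]:
    """Return a list of problems with a proposed set of persistent disks."""
    problems: List[str] = []
    if len(disk_sizes_tb) > MAX_DISKS_PER_INSTANCE:
        problems.append(
            f"{len(disk_sizes_tb)} disks exceeds the limit of "
            f"{MAX_DISKS_PER_INSTANCE}"
        )
    for index, size in enumerate(disk_sizes_tb):
        if size > MAX_DISK_TB:
            problems.append(
                f"disk {index} is {size} TB, above the per-disk limit"
            )
    total = sum(disk_sizes_tb)
    if total > MAX_TOTAL_TB:
        problems.append(f"total of {total} TB exceeds the per-instance limit")
    return problems


ok_plan = check_disk_plan([2.5, 2.5])
too_big = check_disk_plan([40, 30])
too_many = check_disk_plan([1] * 17)
```

Here the first plan passes, the second fails on total capacity, and the third fails on disk count.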

Compute Engine persistent disks map to Azure VHDs as follows:

Feature | Azure VHDs | Compute Engine persistent disks
Volume types | Standard Storage (HDD), Premium Storage (SSD) | Standard persistent disk (HDD), SSD persistent disk
Management schemes | Unmanaged disks, managed disks | N/A (GCP-managed at the project level)
Volume attachment | Can be attached to only one instance at a time | Read-write volumes: Can be attached to only one instance at a time; Read-only volumes: Can be attached to multiple instances
Maximum volume size | 4 TiB | 64 TB
Redundancy | Yes | Yes
Snapshotting | Yes | Yes
Disk encryption | Encrypted by default | Encrypted by default

Compute Engine's locally attached disks compare to those of Azure as follows:

Feature | Azure | Compute Engine
Service name | Local SSD | Local SSD
Volume attachment | Tied to instance type | Can be attached to any non-shared-core instance
Attached volumes per instance | Varies by instance type | Up to 8
Storage capacity | Varies by instance type | 375 GB per volume
Live migration | No | Yes
Redundancy | None | None

Replication

Azure page blobs allow for more replication options than Compute Engine persistent disks. Depending on your machine type and storage account type, you can replicate VHDs within a data center (LRS) or across regions (GRS or RA-GRS).

In contrast, Compute Engine persistent disks are replicated within a single Compute Engine zone. This configuration is equivalent to Azure's LRS option. To ensure high availability when using persistent disks, you must design for high availability between the regions and zones in which your workload is running.

While each service provides replication for increased durability, this feature does not protect against data corruption or accidental deletions due to user or application error. To protect important data, users should perform regular data backups and disk snapshots.

Volume attachment and detachment

After creating a disk volume, you can attach the volume to a Compute Engine VM instance or Azure VM. The VM instance can then mount and format the disk volume like any other block device. Similarly, you can unmount and detach a volume from an instance, allowing it to be reattached to other instances.

An Azure VHD can be attached to only one VM at a time. Compute Engine persistent disks in read/write mode have the same limitation. However, persistent disks in read-only mode can be attached to multiple instances simultaneously.

Volume backup

Compute Engine and Azure both allow users to capture and store snapshots of disk volumes. These snapshots can be used to create new volumes at a later time.

Compute Engine persistent disks and Azure unmanaged disks both support differential snapshots. The initial snapshot creates a full copy of the volume, and subsequent snapshots only copy the blocks that have changed since the previous snapshot. After a number of differential snapshots, another full snapshot is taken, and the cycle repeats.
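The idea behind differential snapshots can be shown with a toy model in which a volume is a fixed-length list of blocks. This is only a conceptual sketch: take_snapshot and restore are hypothetical names, and neither platform exposes snapshots in this form.

```python
from typing import Dict, List, Optional, Sequence


def take_snapshot(
    blocks: Sequence[bytes], previous: Optional[Sequence[bytes]] = None
) -> Dict[int, bytes]:
    """Return the blocks to store: all of them, or only the changed ones."""
    if previous is None:
        # Initial snapshot: a full copy of the volume.
        return dict(enumerate(blocks))
    # Later snapshots: only blocks that differ from the previous state.
    return {
        index: block
        for index, block in enumerate(blocks)
        if index >= len(previous) or previous[index] != block
    }


def restore(snapshots: Sequence[Dict[int, bytes]]) -> List[bytes]:
    """Rebuild a volume by replaying a full snapshot and later differences."""
    merged: Dict[int, bytes] = {}
    for snapshot in snapshots:
        merged.update(snapshot)
    return [merged[index] for index in sorted(merged)]


volume_v1 = [b"aaaa", b"bbbb", b"cccc"]
volume_v2 = [b"aaaa", b"BBBB", b"cccc"]

full = take_snapshot(volume_v1)
diff = take_snapshot(volume_v2, previous=volume_v1)
rebuilt = restore([full, diff])
```

In this example the second snapshot stores a single block rather than three, which is where the storage savings of differential snapshots come from.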

Azure managed disks do not currently support differential snapshots. Instead, each snapshot makes a full copy of your disk volume.

Azure also provides Azure Backup and Azure Recovery Service, which help automate backup and restore operations. GCP does not provide equivalents for these services.

Volume performance

For both Compute Engine persistent disks and Azure VHDs, disk performance depends on several factors, including:

  • Volume type: Each service offers several distinct volume types. Each has its own set of performance characteristics and limits.
  • Available bandwidth: The throughput of a networked volume depends on the network bandwidth available to the Compute Engine or Azure VM to which it is attached.

This section discusses additional performance details for each service.

Azure VHDs

Azure VM types vary widely in networking performance. VM types with a small number of cores might not have sufficient network capacity to achieve the advertised maximum IOPS or throughput for a given VHD disk type. See High-performance Premium Storage and Managed Disks for VMs for more information.

Compute Engine persistent disks

Compute Engine allocates throughput on a per-core basis. You get 2 Gbps of network egress per virtual CPU core, with a theoretical maximum of 16 Gbps for a single virtual machine instance. Because Compute Engine has a data redundancy factor of 3.3x, each logical write actually requires 3.3 writes' worth of network bandwidth. As such, machine types with a small number of cores might not have sufficient network capacity to achieve the advertised maximum IOPS or throughput for a given persistent disk type. See Network egress caps for more information.
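To make the arithmetic above concrete, the following sketch estimates the ceiling on logical write throughput for a given core count. It uses only the figures quoted in this section (2 Gbps per virtual CPU, a 16 Gbps cap, and a 3.3x redundancy factor) and makes the simplifying assumption that all egress bandwidth is available for disk writes. The function names are hypothetical.

```python
EGRESS_GBPS_PER_VCPU = 2.0   # from this article
MAX_EGRESS_GBPS = 16.0       # from this article
REDUNDANCY_FACTOR = 3.3      # from this article


def egress_cap_gbps(vcpus: int) -> float:
    """Network egress available to an instance with the given core count."""
    return min(vcpus * EGRESS_GBPS_PER_VCPU, MAX_EGRESS_GBPS)


def max_logical_write_mb_per_s(vcpus: int) -> float:
    """Upper bound on logical write throughput, in decimal MB/s.

    Assumes every bit of egress is spent on persistent disk writes,
    which is a simplification.
    """
    logical_gbps = egress_cap_gbps(vcpus) / REDUNDANCY_FACTOR
    return logical_gbps * 1000 / 8  # Gbps to MB/s


for vcpus in (1, 4, 8, 16):
    print(vcpus, round(max_logical_write_mb_per_s(vcpus), 1))
```

Under these assumptions, a single-core instance tops out at roughly 76 MB/s of logical writes, and the ceiling stops rising at 8 cores, where the 16 Gbps egress cap is reached.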

For each Compute Engine disk type, the total I/O available is related to the total size of the volumes that are connected to a given instance. For example, if you have two 2.5 TB standard persistent disks connected to an instance, your total available I/O comes out to 3000 read IOPS and 7500 write IOPS.
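The figures in this example are consistent with a model in which IOPS scale linearly with total volume size up to a per-instance cap. The per-GB rates and caps in the sketch below are assumptions chosen to reproduce the example (0.75 read and 1.5 write IOPS per GB, capped at 3000 and 15000). They are not stated in this article, so confirm them against the current persistent disk performance documentation. The function name standard_pd_iops is hypothetical.

```python
from typing import Tuple


def standard_pd_iops(
    total_gb: float,
    read_iops_per_gb: float = 0.75,   # assumed rate
    write_iops_per_gb: float = 1.5,   # assumed rate
    read_cap: int = 3000,             # assumed per-instance cap
    write_cap: int = 15000,           # assumed per-instance cap
) -> Tuple[int, int]:
    """Estimate (read IOPS, write IOPS) for the total attached capacity."""
    read_iops = min(total_gb * read_iops_per_gb, read_cap)
    write_iops = min(total_gb * write_iops_per_gb, write_cap)
    return int(read_iops), int(write_iops)


# Two 2.5 TB disks, treating 1 TB as 1000 GB.
total_gb = 2 * 2500
print(standard_pd_iops(total_gb))  # (3000, 7500)
```

Under these assumed rates, the read figure is limited by the cap rather than by volume size, while the write figure is still scaling with capacity.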

Locally attached disks

In addition to standard networked block storage, Azure and Compute Engine both allow users to use SSDs that are locally attached to the physical machine running the instance. In both environments, these disks are called local SSDs. Local SSDs offer much faster transfer rates than networked block storage. However, unlike networked block storage, they do not guarantee data persistence, and they cannot be snapshotted with native differential snapshotting features.

On Azure, the size and availability of local SSDs are directly tied to your machine type. Local SSD sizes can be as small as 16 GB and as large as 6 TB. Some machine types, such as the A Series, do not include local SSDs.

On Compute Engine, local SSDs can be attached to almost any machine type, with the exception of shared-core types like f1-micro and g1-small. Local SSDs have a fixed size of 375 GB per disk, and you can attach a maximum of 8 to a single instance.
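The Compute Engine rules above reduce to a simple capacity calculation, sketched below. The function name is hypothetical, the set of shared-core types contains only the two named in this article, and "n1-standard-4" is used purely as an example of a machine type that is not shared-core.

```python
SHARED_CORE_TYPES = {"f1-micro", "g1-small"}  # the examples named above
LOCAL_SSD_SIZE_GB = 375
MAX_LOCAL_SSDS = 8


def local_ssd_capacity_gb(machine_type: str, disk_count: int) -> int:
    """Total local SSD capacity for an instance, or raise if not allowed."""
    if machine_type in SHARED_CORE_TYPES:
        raise ValueError(f"{machine_type} does not support local SSDs")
    if not 0 <= disk_count <= MAX_LOCAL_SSDS:
        raise ValueError(f"disk_count must be between 0 and {MAX_LOCAL_SSDS}")
    return disk_count * LOCAL_SSD_SIZE_GB


max_capacity = local_ssd_capacity_gb("n1-standard-4", MAX_LOCAL_SSDS)
print(max_capacity)  # 3000
```

With the limits stated in this article, the maximum local SSD capacity for a single instance is therefore 3000 GB.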

Compute Engine migrates local SSDs automatically and seamlessly before their host machines go down for maintenance. See Live Migrate for more information.

Costs

On Azure, disk volumes are priced per GB per month. Charges for local SSDs are included with the cost of the VM.

Compute Engine persistent disks and disk snapshots are also priced per GB per month. Local SSDs are charged independently of the machine costs. See Local SSD pricing for details.

File storage

For file-server-based workloads, Azure provides File Storage, a distributed SMB-based file server service.

GCP does not provide a native file server solution as a service. However, you can run a file server on GCP in a variety of ways. For more information, see File Servers on Google Compute Engine.

Storage area network (SAN)

For SAN workloads, Azure provides integration with StorSimple, Microsoft's proprietary SAN appliance. Architecturally, StorSimple comprises an on-premises StorSimple SAN and a virtual cloud-based appliance that replicates the behavior of the on-premises SAN.

On GCP, you can use persistent disks to support workloads that expect SANs. Used in a SAN context, persistent disks are analogous to the logical disk volumes you would access through logical unit number (LUN) devices, and can be provisioned in a similar way. As with LUN-based logical disk volumes, you can mount multiple persistent disks to a single VM instance. You can also mount a single read-only persistent disk to multiple VM instances. For more information, see Google Cloud Platform for Data Center Professionals: Storage.

Cool storage

GCP and Azure each offer a cool storage option for data that does not need to be accessed regularly. Cloud Storage offers an additional class called Cloud Storage Nearline, and Azure Blob Storage offers an additional Cool tier.

Latency

Both services have a time-to-first-byte measured in milliseconds.

Availability

The availability of Azure Blob Storage's Cool tier depends on your storage account's replication type. Cloud Storage Nearline is regional.

Storage duration

If you're using a blob-specific storage account, Azure Blob Storage's Cool tier does not have a minimum storage period. For GPv2 storage accounts, Azure Blob Storage's Cool tier has a minimum storage period of 30 days for each blob. If you delete or overwrite a blob before the minimum storage period ends, you will incur additional charges.

Cloud Storage Nearline has a minimum storage period of 30 days for each data object. As with Azure Blob Storage's Cool tier when used within a GPv2 account, if you delete or overwrite a data object before the minimum storage period ends, you will incur additional charges.

Costs

Azure Blob Storage's Cool tier

Azure Blob Storage's Cool tier is priced by amount of data stored per month, storage account type, replication type, and network egress. If you take blob snapshots, these snapshots are charged at the same rate as the live versions of the blobs. Azure Cool Blob Storage also charges for common API requests, data writes, and data retrieval.

Cloud Storage Nearline

Cloud Storage Nearline is priced by amount of data stored per month and by network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete an object 5 days after storing the object, you will be charged for the remaining 25 days of storage for that object. Cloud Storage Nearline also charges for common API requests.

For more information about Cloud Storage Nearline pricing, see Cloud Storage Nearline pricing.

Cold or archival storage

GCP and Azure each offer an archival storage option. Cloud Storage offers an additional class called Cloud Storage Coldline, and Azure Blob Storage offers an additional Archive tier.

Latency

Cloud Storage Coldline has a time-to-first-byte measured in milliseconds. Azure Blob Storage's Archive tier has a time-to-first-byte of 15 hours or less.

Availability

The availability of Azure Blob Storage's Archive tier depends on your storage account's replication type. Cloud Storage Coldline is regional.

Storage duration

Azure Blob Storage's Archive tier has a minimum storage period of 180 days for each blob. If you delete or overwrite a blob before the minimum storage period ends, you will incur additional charges.

Cloud Storage Coldline has a minimum storage period of 90 days for each data object. As with Azure Blob Storage's Archive tier, if you delete or overwrite a data object before the minimum storage period ends, you will incur additional charges.

Costs

Azure Blob Storage's Archive tier

Azure Blob Storage's Archive tier is priced by amount of data stored per month, storage account type, replication type, and network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete a blob 5 days after storing the blob, you will be charged for the remaining 175 days of storage for that blob. In addition, if you take blob snapshots, these snapshots are charged at the same rate as the live versions of the blobs.

Azure Blob Storage's Archive tier also charges for common API requests.

Cloud Storage Coldline

Cloud Storage Coldline is priced by amount of data stored per month and by network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete an object 5 days after storing the object, you will be charged for the remaining 85 days of storage for that object. Cloud Storage Coldline also charges for common API requests.

For more information about Cloud Storage Coldline pricing, see Cloud Storage Pricing.
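The early-deletion examples in this article all follow the same arithmetic: you are billed for the days remaining in the minimum storage period. The sketch below captures that rule using the minimum periods stated in this article. The dictionary keys and the function name are hypothetical labels, and the sketch computes billable days only, not prices.

```python
# Minimum storage periods in days, as stated in this article.
MIN_STORAGE_DAYS = {
    "azure_cool_gpv2": 30,
    "nearline": 30,
    "coldline": 90,
    "azure_archive": 180,
}


def early_deletion_days(storage_class: str, days_stored: int) -> int:
    """Days still billed if an object is deleted after days_stored days."""
    if days_stored < 0:
        raise ValueError("days_stored must be non-negative")
    minimum = MIN_STORAGE_DAYS[storage_class]
    return max(minimum - days_stored, 0)


for name in ("nearline", "coldline", "azure_archive"):
    print(name, early_deletion_days(name, 5))
```

Deleting after 5 days yields 25, 85, and 175 remaining billable days for Nearline, Coldline, and the Azure Archive tier respectively, which matches the examples given in the cost sections.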
