Updated Dec 07, 2020
Compare the storage services provided by Microsoft Azure and Google Cloud. This article discusses the following service types:
- Distributed object storage, or redundant key-value stores in which you can store data objects.
- Block storage, or virtual disk volumes that you can attach to virtual machine instances.
- File storage, or network-attached, file-server-based storage.
- Storage area networks, or remote storage that provides block-level storage access.
- Cool storage, or storage services designed to store infrequently accessed data, such as backups.
- Archival storage, or storage services designed to store archival data for compliance or analysis purposes.
This article does not discuss databases or message queues.
Service model comparison
Microsoft Azure and Google Cloud take different approaches to the high-level organization and configuration of their storage services. However, in practice, the storage services themselves are often more similar than they are different.
On Azure, you store your various types of data, including binary objects (blobs), databases, and message queues, in specific services within a storage account. Azure requires you to define your account type, disk type, and redundancy type at the storage-account level. These attributes then apply to all of the storage services within that account.
Azure storage accounts come in three types: general-purpose v1 (GPv1), blob-specific, and general-purpose v2 (GPv2). GPv1 accounts can support any of Azure's standard storage types, and blob-specific accounts are designed to support advanced functionality for Azure Blob Storage. GPv2 accounts support the APIs and features of both of the other account types.
While blob-specific accounts run exclusively on hard disk drives (HDDs), both general-purpose account types are further split into Standard Storage accounts, which run on HDDs, and Premium Storage accounts, which run on solid-state drives (SSDs). Premium Storage accounts support page blobs only.
When you create a new Azure storage account, you select the level of replication you want to use. Azure offers the following levels:
- Locally redundant storage (LRS), which replicates your data locally within the data center that contains your storage account. Your data is replicated three times.
- Zone-redundant storage (ZRS), which replicates your data synchronously across three availability zones within a single region. ZRS is limited to block blob storage in general-purpose storage accounts.
- Geo-redundant storage (GRS), which replicates your data in a primary region and in a secondary region at least 100 miles away. Your data is replicated three times in the primary region, and then asynchronously replicated three times in the secondary region.
- Read-access geo-redundant storage (RA-GRS), which is identical to GRS but adds a secondary read-only endpoint in the secondary region.
Like Azure, Google Cloud stores each data type within a type-specific service. However, Google Cloud does not have a high-level organizational layer such as a storage account. Instead, you create storage resources and define resource attributes, such as disk type or redundancy type, at the service level.
For distributed object storage, Google Cloud provides Cloud Storage, which is comparable to Azure Blob Storage's block and append blob storage. For block storage, Google Cloud offers Compute Engine persistent disks, which are equivalent to Azure VHDs.
Distributed object storage
For distributed object storage, Azure offers Azure Blob Storage, and Google Cloud offers Cloud Storage.
Azure Blob Storage and Cloud Storage have many similarities. In both services, you store binary objects inside a named unit of storage. In Azure Blob Storage, these binary objects are called blobs, and the unit of storage is called a container. In Cloud Storage, these binary objects are called objects, and the unit of storage is called a bucket.
In both services, each binary object within a unit of storage is identified by a unique key within that unit of storage, and each object has an associated metadata record. This metadata record contains information such as object size, date of last modification, and media type. If you have the appropriate permissions, you can view and modify some of this metadata. You can also add custom metadata, if needed.
Though both containers and buckets are key-value stores, the user experience for each is similar, though not identical, to that of a file system. By convention, object keys are usually paths such as "/foo/bar.txt" or "/foo/subdir/baz.txt". Azure Blob Storage and Cloud Storage both also provide file-system-like APIs. For example, Azure Blob Storage's List Blobs method and Cloud Storage's list method can both list all object keys with a common prefix, not unlike ls -R would on a file system.
In addition to their most obvious use, distributed object storage, both services can be used to host static web content and media.
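The prefix-based listing behavior described above can be sketched locally. The following Python example simulates how both services emulate directories over a flat key namespace; the key names are illustrative, and a real listing would go through the List Blobs or Objects: list API rather than an in-memory list:

```python
# Both Azure's List Blobs and Cloud Storage's list API accept a prefix
# and, optionally, a delimiter. This local sketch mimics that behavior
# over a flat key-value namespace.
keys = [
    "foo/bar.txt",
    "foo/subdir/baz.txt",
    "foo/subdir/qux.txt",
    "logs/2020/12/07.log",
]

def list_keys(keys, prefix):
    """All keys sharing the prefix, like a recursive listing (ls -R)."""
    return sorted(k for k in keys if k.startswith(prefix))

def list_keys_delimited(keys, prefix, delimiter="/"):
    """Immediate children only, like a non-recursive listing (ls)."""
    results = set()
    for k in keys:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        if delimiter in rest:
            # Collapse deeper keys into a synthetic "directory" entry.
            results.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            results.add(k)
    return sorted(results)

print(list_keys(keys, "foo/"))           # every key under foo/
print(list_keys_delimited(keys, "foo/")) # ['foo/bar.txt', 'foo/subdir/']
```

Passing a delimiter is what makes a flat key space feel like a directory tree: deeper keys collapse into a single "subdirectory" result.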
Azure Blob Storage's features and terminology map to those of Cloud Storage as follows:
|Feature|Azure Blob Storage|Cloud Storage|
|---|---|---|
|Unit of deployment|Container|Bucket|
|Deployment identifier|Account-level unique key|Globally unique key|
|File system emulation|Limited|Limited|
|Object types|Block blobs, append blobs, page blobs|Objects|
|Object versioning|Manual, per-object snapshotting|Automatic versioning of all objects in a bucket (must be enabled)|
|Object lifecycle management|Yes (through lifecycle rules or Azure Automation)|Yes (native)|
|Object change notifications|Yes (through Azure Event Grid)|Yes (through Pub/Sub)|
|Service classes|Redundancy levels: LRS, ZRS, GRS, RA-GRS; tiers: Hot, Cool, Archive|Standard, Nearline, Coldline, Archive|
|Deployment locality|Zonal and regional|Regional and multi-regional|
In Azure Blob Storage, you store data as block blobs, append blobs, or page blobs. In Cloud Storage, you store all data as objects, which are equivalent to block blobs. Google Cloud doesn't provide a service or object type directly comparable to append blobs. However, you can use Cloud Storage's object composition functionality and concurrency controls to approximate the functionality of an append blob. For more information, see Composite Objects and Parallel Uploads.
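The compose-based approximation of an append blob can be sketched as follows. This is a local model of the pattern, not a client-library call; a real implementation would upload each new chunk as its own object and use google-cloud-storage's Blob.compose() with a generation-match precondition to guard against concurrent appenders:

```python
# Cloud Storage has no append blob type, but you can approximate appends
# by composing newly uploaded chunk objects onto a running "log" object.
MAX_COMPONENTS = 32  # Cloud Storage accepts at most 32 sources per compose

class ComposedLog:
    """Local stand-in for an object that grows via compose requests."""

    def __init__(self):
        self.data = b""

    def append(self, chunks):
        # One compose request: [current object, *new chunk objects].
        if len(chunks) + 1 > MAX_COMPONENTS:
            raise ValueError("a single compose accepts at most 32 source objects")
        self.data = b"".join([self.data, *chunks])

log = ComposedLog()
log.append([b"line 1\n"])
log.append([b"line 2\n", b"line 3\n"])
print(log.data.decode())
```

Because the composed result replaces the previous object, each append is one compose request; appending more than 31 new chunks at once requires chaining multiple compose operations.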
Unlike Azure, Google Cloud does not store your disk volumes as page blobs within its object storage service. Instead, your disk volumes are stored within Compute Engine, Google Cloud's infrastructure-as-a-service offering. See Block storage for more information.
Access tiers and replication
The flexibility of Azure Blob Storage depends on the type of storage account you've created and the replication options you've chosen for that account. If you're using a GPv1 storage account, you are limited to Azure's default tier for blob storage. However, when you use a blob-specific or GPv2 storage account, you can choose between Hot, Cool, and Archive access tiers for Azure Blob Storage. The Hot tier is designed for frequently accessed data, the Cool tier is designed for infrequently accessed data, and the Archive tier is designed for data archiving. The level of replication for your Azure Blob Storage service is determined by your storage account's replication type.
In contrast, Cloud Storage's replication types are built into its service classes. These service classes map to Azure Blob Storage access tiers and replication types as follows:
|Use case|Azure Blob Storage|Cloud Storage|
|---|---|---|
|Frequently accessed data with geo-redundancy|Azure Blob Storage (general-purpose or Hot tier) with GRS or RA-GRS|Standard Storage in a multi-region or dual-region|
|Frequently accessed data with regional redundancy|Azure Blob Storage (general-purpose) with ZRS|Standard Storage in a region|
|Frequently accessed data with local (data center) redundancy|Azure Blob Storage (general-purpose or Hot tier) with LRS|Standard Storage in a region*|
|Infrequently accessed data|Azure Blob Storage Cool tier|Cloud Storage Nearline and Cloud Storage Coldline|
|Archival data|Azure Blob Storage Archive tier|Cloud Storage Archive|

* Regional redundancy is the lowest level of redundancy available on Google Cloud.
Both Azure and Cloud Storage allow you to version your stored objects, storing distinct versions of an object with a given key under distinct version IDs. However, they implement this functionality in different ways.
In Azure Blob Storage, you can achieve versioning by taking read-only snapshots of your blobs. If you upload your files programmatically, you can take a new snapshot before each upload. Azure Blob Storage also lets you specify access conditions to avoid unnecessary snapshotting.
In contrast, Cloud Storage allows you to enable automatic object versioning for all objects within a bucket. With automatic versioning enabled, Cloud Storage automatically creates a new version of an object each time you modify the object. This approach simplifies the object versioning process but is slightly less flexible than Azure's approach. Each version of an object also adds to your total stored data, which can increase storage costs. You can mitigate this issue by using Cloud Storage's object lifecycle management.
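The storage-cost implication of automatic versioning can be sketched locally. In this simplified Python model, overwriting a key archives the old payload as a noncurrent version instead of discarding it, so every version continues to accrue stored bytes until lifecycle management deletes it:

```python
# Simplified model of a Cloud Storage bucket with object versioning
# enabled: overwrites archive the old payload rather than replacing it.
class VersionedBucket:
    def __init__(self):
        self.live = {}        # key -> live payload
        self.noncurrent = {}  # key -> list of archived payloads

    def upload(self, key, data):
        if key in self.live:
            # With versioning on, the replaced data becomes noncurrent.
            self.noncurrent.setdefault(key, []).append(self.live[key])
        self.live[key] = data

    def stored_bytes(self):
        """Every version, live or noncurrent, counts toward storage charges."""
        total = sum(len(v) for v in self.live.values())
        total += sum(len(v) for vs in self.noncurrent.values() for v in vs)
        return total

bucket = VersionedBucket()
bucket.upload("report.csv", b"v1" * 100)  # 200 bytes live
bucket.upload("report.csv", b"v2" * 150)  # 300 bytes live, 200 archived
print(bucket.stored_bytes())              # 500 -- both versions are billed
```

A lifecycle rule that deletes noncurrent versions after some age or version count is the usual way to cap this growth.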
Azure Blob Storage and Cloud Storage each default to a "last write wins" write strategy. This strategy works well for sequential writes, but it allows for race conditions if you're performing concurrent writes to the same object. To mitigate this issue, both services provide mechanisms for managing concurrent writes.
On Azure, you can manage concurrent writes optimistically or pessimistically:
- Optimistic approach: You retrieve an object's ETag header when performing a GET operation, and then compare that ETag to the object's current ETag when attempting a write. If the tags match, you commit the write.
- Pessimistic approach: You lease the target object, locking it for a specified duration while you perform the write.
On Cloud Storage, you use an optimistic approach. To manage concurrent writes, you obtain the current generation number of a given object, and then check against that generation number when your script or application attempts a write. If the numbers match, you commit the write. Otherwise, you abort the transaction and then restart it. For more information, see Object Versioning and Concurrency Control.
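The optimistic check-and-retry loop described above can be sketched without any cloud dependency. This local Python model mirrors Cloud Storage's ifGenerationMatch precondition; Azure's If-Match/ETag check follows the same compare-and-swap pattern with an opaque tag instead of a counter:

```python
# Local sketch of generation-match optimistic concurrency: a write only
# succeeds if the object's generation still matches what the writer read.
class PreconditionFailed(Exception):
    pass

class Store:
    def __init__(self):
        self.objects = {}  # key -> (generation, data)

    def get(self, key):
        # Generation 0 represents "object does not exist yet".
        return self.objects.get(key, (0, None))

    def put(self, key, data, if_generation_match):
        current_gen, _ = self.get(key)
        if current_gen != if_generation_match:
            raise PreconditionFailed(
                f"expected gen {if_generation_match}, found {current_gen}")
        self.objects[key] = (current_gen + 1, data)
        return current_gen + 1

store = Store()
gen, _ = store.get("counter")            # gen == 0, object absent
store.put("counter", b"1", if_generation_match=gen)

# A second writer still holding the stale generation must re-read and retry:
try:
    store.put("counter", b"2", if_generation_match=gen)
except PreconditionFailed:
    gen, _ = store.get("counter")        # re-read the current generation
    store.put("counter", b"2", if_generation_match=gen)
```

The retry path is the important part: on a precondition failure, the writer re-reads, reapplies its change to the fresh state, and attempts the write again.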
Object lifecycle management
Azure Blob Storage lifecycle management offers rule-based policies for GPv2 and blob-specific storage accounts. You can use these policies to transition your data to the appropriate access tiers or to expire data at the end of its lifecycle. These rules support scenarios such as archiving or deleting data based on age, archiving data at ingest, and deleting old snapshots. For account types that don't support native lifecycle management, you can use Azure Automation to automate object deletion.
Cloud Storage provides native object lifecycle management, which lets you automate object deletion or storage-class transitions according to user-specified lifecycle policies. For more information, see Object lifecycle management.
Object change notifications
Both Azure and Google Cloud provide a publish/subscribe model that allows you to send and receive notifications when your objects are modified. With Azure Blob Storage, you can use Azure Event Grid to track Blob Storage events and send them to a webhook, Azure Function, or other endpoint. Similarly, Google Cloud provides Pub/Sub Notifications, which allow you to publish notifications to a Pub/Sub topic when objects are created, deleted, or updated within a Cloud Storage bucket. To receive these notifications, you can subscribe to this Pub/Sub topic from other applications or services.
Azure supports encryption at rest through Azure Storage Service Encryption (SSE) for Data at Rest. All blob-based storage within your storage account is encrypted with AES256 during ingress and decrypted during egress. If you enable encryption for your account after you've already uploaded data, that data is not encrypted until it is rewritten. Azure also supports customer-managed encryption keys (CMEK) for server-side encryption (SSE). SSE for Azure Blob Storage and Azure Files is integrated with Azure Key Vault so that you can use a key vault to manage your encryption keys.
Similarly, all data stored in Google Cloud's storage services, including Cloud Storage, is automatically encrypted at rest with either AES256 or AES128. For data that requires you to manage your own encryption key, Google Cloud also supports CMEK using Cloud Key Management Service and customer-supplied encryption keys (CSEK). For more information, see Encryption at Rest in Google Cloud Platform.
Service level agreement
Microsoft and Google both provide uptime service level agreements (SLAs) and have policies in place for crediting customer accounts in the event that these SLAs are not met. Microsoft defines the guarantees and policies for Azure Blob Storage in the Azure Storage SLA. Google defines the guarantees and policies for Cloud Storage in the Cloud Storage SLA.
Azure Blob Storage
Azure Blob Storage is priced by amount of data stored per month, storage account type, replication type, and network egress. If you take object snapshots, these snapshots are charged at the same rate as the live versions of the objects. Azure Blob Storage also charges for common API requests.
Cloud Storage Standard
Cloud Storage Standard is priced by amount of data stored per month and by network egress. For buckets with object versioning enabled, each archived version of an object is charged at the same rate as the live version of the object. Cloud Storage Standard also charges for common API requests.
Google Cloud and Azure both offer block storage options. Google Cloud provides block storage in the form of persistent disks, which are part of Compute Engine. Azure provides block storage in the form of managed disks. Both platforms also provide users the option of using locally attached SSDs.
Service model comparison
Apart from the method in which they're stored, Compute Engine persistent disks and Azure Managed Disks are very similar in most ways. In both cases, disk volumes are network-attached, though both Compute Engine and Azure also provide the ability to locally attach a disk if necessary. While networked disks have higher operational latency and less throughput than locally attached disks, they have many benefits as well, including built-in redundancy, snapshotting, and ease of disk detachment and reattachment.
Azure Managed Disks
An Azure managed disk can be up to 32 TiB in size (64 TiB for Ultra SSDs). Azure virtual machines (VMs) impose machine-type-based limits on how many managed disks can be attached at a time.
Each managed disk must reside in the same region as the VM to which the disk is attached. The latency and throughput of managed disks depend on both the disk type and the machine type of the VM:
- Azure Standard HDDs (hard disk drives) run on spinning magnetic HDDs and are recommended for infrequently accessed data or bulk storage use cases.
- Azure Standard SSDs (solid-state drives), Azure Premium SSDs, and Azure Ultra SSDs run on SSDs of varying performance and are recommended for the increasing demands of production workloads. Throughput and IOPS increase as you move up the managed disk storage tiers. Some low-tier machine types, such as A0, do not support SSD-backed managed disks.
Compute Engine persistent disks
Google Cloud provides block storage in the form of persistent disks that are stored in Compute Engine. For most Compute Engine VM instances with custom machine types or predefined machine types, you can attach up to 128 persistent disks. Instances with shared-core machine types are limited to a maximum of 16 persistent disks. You can attach up to 257 TB of persistent disk storage per VM instance, and each persistent disk can be up to 64 TB in size.
Persistent disks can be HDDs, balanced (SSD-based), or SSDs. Zonal persistent disks are created in the same zone as the VM instance. Regional persistent disks are replicated between zones.
Compute Engine persistent disks map to Azure managed disks as follows:
|Feature|Azure managed disks|Compute Engine persistent disks|
|---|---|---|
|Volume types|Standard HDD, Standard SSD, Premium SSD, Ultra SSD|Standard persistent disk (HDD), balanced persistent disk (SSD-based), SSD persistent disk|
|Management schemes|Managed disks, unmanaged disks|N/A (Google Cloud-managed at the project level)|
|Volume attachment|Regular managed disks can be attached to only one instance at a time; shared disks can attach to multiple instances in a limited capacity|Read-write volumes can be attached to only one instance at a time; read-only volumes can be attached to multiple instances|
|Maximum volume size|32 TiB for Standard HDD, Standard SSD, and Premium SSD; 64 TiB for Ultra SSD|64 TB|
|Disk encryption|Encrypted by default|Encrypted by default|
Compute Engine's locally attached disks compare to those of Azure as follows:
|Feature|Azure|Compute Engine|
|---|---|---|
|Service name|Local SSD|Local SSD|
|Volume attachment|Tied to instance type|Can be attached to any non-shared-core instance|
|Attached volumes per instance|Varies by instance type|Up to 24|
|Storage capacity|Varies by instance type|375 GB per volume, 9 TB in total|
Azure managed disks can use Azure availability zones, which replicate your disks across up to three datacenters within the same region. A VM that uses managed disks in an availability zone can fail over if there's a problem in one datacenter.
Zonal persistent disks are replicated within a single Compute Engine zone. If you are designing robust systems on Compute Engine, you can use regional persistent disks to maintain high availability for resources across multiple zones within a region. Regional persistent disks provide synchronous replication for workloads that might not have application-level replication. If there's a failure, you can force-attach a regional persistent disk to a VM instance in the other zone of the same region.
While each service provides replication for increased durability, this feature does not protect against data corruption or accidental deletions due to user or application error. To protect important data, users should perform regular data backups and disk snapshots.
Volume attachment and detachment
After creating a disk volume, you can attach the volume to a VM; Azure VMs and Compute Engine VM instances work similarly in this respect. The instance can then mount and format the disk volume like any other block device. Similarly, you can unmount and detach a volume from an instance, allowing it to be reattached to another instance.
Most Azure managed disks attach to only one VM at a time. Some Premium SSD and Ultra SSD managed disks can be shared across VMs, with some limitations in their use and configuration. Compute Engine persistent disks in read/write mode can attach to only one VM instance, while persistent disks in read-only mode can attach to multiple VM instances simultaneously.
Compute Engine and Azure both allow users to capture and store snapshots of disk volumes. These snapshots can be used to create new volumes at a later time.
Compute Engine persistent disks and Azure managed disks both support differential, or incremental, snapshots. The initial snapshot creates a full copy of the volume, and subsequent snapshots only copy the blocks that have changed since the previous snapshot. After a number of differential snapshots, another full snapshot is taken, and the cycle repeats.
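Neither service exposes its snapshot internals, but the differential scheme described above can be modeled in a few lines. In this Python sketch, a volume is a map of block index to block contents; each snapshot stores only the blocks that changed since the previous one, and a restore replays the chain in order:

```python
def diff_snapshot(volume, chain):
    """Append a snapshot to the chain; only changed blocks are stored.

    The first snapshot in an empty chain is effectively a full copy.
    """
    baseline = restore(chain) if chain else {}
    delta = {i: block for i, block in volume.items()
             if baseline.get(i) != block}
    chain.append(delta)
    return delta

def restore(chain):
    """Rebuild the volume by replaying the snapshot chain in order."""
    volume = {}
    for delta in chain:
        volume.update(delta)
    return volume

chain = []
diff_snapshot({0: b"AA", 1: b"BB"}, chain)  # full snapshot: 2 blocks stored
diff_snapshot({0: b"AA", 1: b"CC"}, chain)  # differential: only block 1 stored
print(len(chain[1]))   # 1
print(restore(chain))
```

This model omits block deletion and the periodic full snapshot that restarts the cycle, but it shows why differential snapshots are cheap to store: unchanged blocks cost nothing after the first copy.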
Azure also provides Azure Backup and Azure Recovery Service, which help automate backup and restore operations. Google Cloud does not provide equivalents for these services.
For both Compute Engine persistent disks and Azure managed disks, disk performance depends on several factors, including:
- Volume type: Each service offers several distinct volume types. Each has its own set of performance characteristics and limits.
- Available bandwidth: The throughput of a networked volume depends on the network bandwidth available to the Compute Engine or Azure VM to which it is attached.
This section discusses additional performance details for each service.
Azure Managed Disks
Azure VM types vary widely in networking performance. VM types with a small number of cores might not have sufficient network capacity to achieve the advertised maximum IOPS or throughput for a given managed disk type. See High-performance Premium Storage and Managed Disks for VMs for more information.
Compute Engine persistent disks
Compute Engine allocates throughput on a per-core basis. You get 2 Gbps of network egress per virtual CPU core, with a theoretical maximum of 16 Gbps for a single virtual machine instance. Because Compute Engine has a data redundancy factor of 3.3x, each logical write actually requires 3.3 writes' worth of network bandwidth. As such, machine types with a small number of cores might not have sufficient network capacity to achieve the advertised maximum IOPS or throughput for a given persistent disk type. See Network egress caps for more information.
For each Compute Engine disk type, the total I/O available is related to the total size of the volumes that are connected to a given instance. For example, if you have two 2.5 TB standard persistent disks connected to an instance, your total available I/O comes out to 3000 read IOPS and 7500 write IOPS.
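The sizing arithmetic above can be sketched as follows. The per-GB rates and caps in this Python example (0.75 read IOPS/GB and 1.5 write IOPS/GB for standard persistent disks, capped at 3,000 read and 15,000 write IOPS per instance) reflect the published limits at the time of writing; check the current Compute Engine documentation before relying on them:

```python
# Assumed standard persistent disk performance limits (see lead-in).
READ_IOPS_PER_GB = 0.75
WRITE_IOPS_PER_GB = 1.5
READ_IOPS_CAP = 3000
WRITE_IOPS_CAP = 15000

def standard_pd_iops(total_gb):
    """Per-instance IOPS scales with total attached capacity, up to a cap."""
    read = min(total_gb * READ_IOPS_PER_GB, READ_IOPS_CAP)
    write = min(total_gb * WRITE_IOPS_PER_GB, WRITE_IOPS_CAP)
    return int(read), int(write)

# Two 2.5 TB standard persistent disks attached to one instance:
print(standard_pd_iops(2 * 2500))  # (3000, 7500)

# Network-side ceiling: 2 Gbps egress per vCPU (max 16 Gbps per instance),
# and each logical write costs ~3.3x its size in replication traffic.
def max_write_throughput_gbps(vcpus):
    egress = min(2 * vcpus, 16)
    return egress / 3.3

print(round(max_write_throughput_gbps(4), 2))  # ~2.42 Gbps of logical writes
```

The two functions capture the two independent ceilings: whichever of the capacity-based IOPS limit and the egress-based throughput limit is lower determines what the instance actually achieves.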
Locally attached disks
In addition to standard networked block storage, Azure and Compute Engine both allow users to use SSDs that are locally attached to the physical machine running the instance. In both environments, these disks are called local SSDs. Local SSDs offer much faster transfer rates than networked block storage. However, unlike networked block storage, they do not guarantee data persistence, and they cannot be snapshotted with native differential snapshotting features.
On Azure, the size and availability of local SSDs are directly tied to your machine type. Local SSD sizes can be as small as 16 GiB and as large as 8 TiB. Some machine types, such as the A Series, do not include local SSDs.
On Compute Engine, local SSDs can be attached to almost any machine type, with the exception of shared-core types like f1-micro and g1-small. Local SSDs have a fixed size of 375 GB per disk, and you can attach a maximum of 24 to a single instance for a total of 9 TB.
Compute Engine migrates local SSDs automatically and seamlessly before their host machines go down for maintenance. See Live Migrate for more information.
On Azure, disk volumes are priced per allocated GB per month. Charges for local SSDs are included with the cost of the VM.
Compute Engine persistent disks and disk snapshots are also priced per GB per month. Local SSDs are charged independently of the machine costs. See Local SSD pricing for details.
For file-server-based workloads, Azure provides Azure Files, a distributed SMB-based file server service.
Google Cloud provides Filestore, a managed file storage service, for applications that require a file system interface and a shared file system for data. Filestore gives users a simple, native experience for standing up managed Network Attached Storage (NAS) with their Compute Engine and Google Kubernetes Engine (GKE) instances.
Storage area network (SAN)
For SAN workloads, Azure provides integration with StorSimple, Microsoft's proprietary SAN appliance. Architecturally, StorSimple comprises an on-premises StorSimple SAN and a virtual cloud-based appliance that replicates the behavior of the on-premises SAN.
On Google Cloud, you can use persistent disks to support workloads that expect SANs. Used in a SAN context, persistent disks are analogous to the logical disk volumes you would access through logical unit number (LUN) devices, and can be provisioned in a similar way. As with LUN-based logical disk volumes, you can mount multiple persistent disks to a single VM instance. You can also mount a single read-only persistent disk to multiple VM instances. For more information, see Google Cloud Platform for Data Center Professionals: Storage.
Google Cloud and Azure each offer a cool storage option for data that does not need to be accessed regularly. Cloud Storage offers additional classes called Cloud Storage Nearline and Cloud Storage Coldline, while Azure Blob Storage offers an additional Cool tier.
Both services have a time-to-first-byte of milliseconds.
When using Azure Blob Storage's Cool tier, the way your data is replicated depends on your storage account's replication type. When you use Cloud Storage Nearline or Cloud Storage Coldline, the way your data is replicated depends on the type of location where you store the data.
If you're using a blob-specific storage account, Azure Blob Storage's Cool tier does not have a minimum storage period. For GPv2 storage accounts, Azure Blob Storage's Cool tier has a minimum storage period of 30 days for each blob. If you delete or overwrite a blob before the minimum storage period ends, you will incur additional charges.
Cloud Storage Nearline has a minimum storage period of 30 days for each data object, while Cloud Storage Coldline has a minimum storage period of 90 days for each data object. As with Azure Blob Storage's Cool tier when used within a GPv2 account, if you delete or overwrite a data object before the minimum storage period ends, you will incur additional charges.
Azure Blob Storage's Cool tier
Azure Blob Storage's Cool tier is priced by amount of data stored per month, storage account type, replication type, and network egress. Azure Cool Blob Storage also charges for common API requests, data writes, and data retrieval.
Cloud Storage Nearline and Cloud Storage Coldline
Both Cloud Storage Nearline and Cloud Storage Coldline are priced by the amount of data stored per month and by network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete an object stored as Cloud Storage Nearline 5 days after storing the object, you will be charged for the remaining 25 days of storage for that object. Cloud Storage Nearline and Cloud Storage Coldline also charge for common API requests and data retrieval.
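The early-deletion charge in the example above is straightforward arithmetic. This Python sketch uses the minimum storage periods stated in this article (30 days for Nearline, 90 for Coldline, 365 for Archive); verify the current values against Cloud Storage pricing before using them:

```python
# Minimum storage periods, in days, as described in this article.
MIN_DAYS = {"nearline": 30, "coldline": 90, "archive": 365}

def early_deletion_days_charged(storage_class, days_stored):
    """Extra days of storage billed when deleting before the minimum period."""
    return max(MIN_DAYS[storage_class] - days_stored, 0)

print(early_deletion_days_charged("nearline", 5))    # 25
print(early_deletion_days_charged("coldline", 100))  # 0 -- past the minimum
```

Deleting a Nearline object after 5 days, for example, incurs charges for the remaining 25 days of its 30-day minimum; objects kept past the minimum incur no extra charge.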
For more information about Cloud Storage Nearline and Cloud Storage Coldline pricing, see Cloud Storage pricing.
GCP and Azure each offer an archival storage option. Cloud Storage offers an additional class called Cloud Storage Archive, and Azure Blob Storage offers an additional Archive tier.
Cloud Storage Archive has a time-to-first-byte of milliseconds. Azure Blob Storage's Archive tier has a time-to-first-byte of 15 hours or less.
When using Azure Blob Storage's Archive tier, the way your data is replicated depends on your storage account's replication type. When you use Cloud Storage Archive, the way your data is replicated depends on the type of location where you store the data.
Azure Blob Storage's Archive tier has a minimum storage period of 180 days for each blob. If you delete or overwrite a blob before the minimum storage period ends, you will incur additional charges.
Cloud Storage Archive has a minimum storage period of 365 days for each data object. As with Azure Blob Storage's Archive tier, if you delete or overwrite a data object before the minimum storage period ends, you will incur additional charges.
Azure Blob Storage's Archive tier
Azure Blob Storage's Archive tier is priced by amount of data stored per month, storage account type, replication type, and network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete a blob 5 days after storing the blob, you will be charged for the remaining 175 days of storage for that blob. In addition, if you take blob snapshots, these snapshots are charged at the same rate as the live versions of the blobs.
Azure Blob Storage's Archive tier also charges for common API requests.
Cloud Storage Archive
Cloud Storage Archive is priced by amount of data stored per month and by network egress. If you delete or modify your data before the minimum storage period, you will be charged for the remainder of the period. For example, if you delete an object 5 days after storing the object, you will be charged for the remaining 360 days of storage for that object. Cloud Storage Archive also charges for common API requests and data retrieval.
For more information about Cloud Storage Archive pricing, see Cloud Storage Pricing.
Check out the other Google Cloud for Azure Professionals articles.