Persistent disks give you the performance described in the disk type chart only if the VM drives enough usage to reach those performance limits. After you size your persistent disk volumes to meet your performance needs, your app and operating system might need some tuning.
In the following sections, we describe a few key elements that can be tuned for better performance and how to apply some of them to specific types of workloads.
Use a high I/O queue depth
Persistent disks have higher latency than locally attached disks such as local SSDs because they are network-attached devices. They can provide very high IOPS and throughput, but you need to make sure that enough I/O requests are done in parallel. The number of I/O requests done in parallel is referred to as the I/O queue depth.
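For example, you can use a benchmarking tool such as fio, if it is installed on the instance, to see how queue depth affects the IOPS you measure. The following read-only sketch uses a 16 KB I/O size and a queue depth of 64; `/dev/[DEVICE_ID]` is a placeholder for the attached persistent disk, and you can vary `--iodepth` to compare results:

$ sudo fio --name=queue-depth-test --filename=/dev/[DEVICE_ID] --ioengine=libaio --direct=1 --readonly --rw=randread --bs=16k --iodepth=64 --runtime=60 --time_based

Because the job specifies `--readonly` and a read-only workload, it does not modify data on the disk.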
The following table shows the recommended I/O queue depth needed to achieve a particular performance level. The table uses a slight overestimate of typical latency in order to show conservative recommendations, and assumes an I/O size of 16 KB.
Generate enough I/Os using large I/O size
Use large I/O size
To ensure that IOPS limits and latency don't bottleneck your application performance, use an I/O size of at least 256 KB.
Use large stripe sizes for distributed file system applications. A random I/O workload using large stripe sizes (4 MB or larger) achieves great performance on standard persistent disks due to how closely the workload mimics multiple sequential stream disk access.
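One way to present large stripes at the block-device level is to combine several persistent disks into a RAID 0 array. This is only a sketch with hypothetical device names and assumes mdadm is installed; the `--chunk` value is in KiB, so 4096 corresponds to a 4 MB stripe size:

$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=4096 /dev/[DEVICE_1] /dev/[DEVICE_2] /dev/[DEVICE_3] /dev/[DEVICE_4]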
Make sure your application is generating enough I/O
Make sure your application is generating enough I/Os to fully utilize the IOPS and throughput limits of the disk. To better understand your workload I/O pattern, review persistent disk usage and performance metrics in Cloud Monitoring.
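To observe the I/O that your application actually issues from inside the VM, you can also use a tool such as iostat from the sysstat package; `[DEVICE_ID]` is a placeholder for the disk device name:

$ iostat -dxm 5 /dev/[DEVICE_ID]

The r/s and w/s columns show read and write IOPS, the rMB/s and wMB/s columns show throughput, and the average queue size column indicates whether your application keeps enough requests in flight.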
Make sure there is enough available CPU on the instance that is generating the I/O
If your VM instance is starved for CPU, your app won't be able to manage the IOPS described earlier. We recommend that you have one available CPU for every 2,000–2,500 IOPS of expected traffic.
Limit heavy I/O loads to a maximum span
A span refers to a contiguous range of logical block addresses on a single physical disk. Heavy I/O loads achieve maximum performance when limited to a certain maximum span, which depends on the machine type of the VM to which the disk is attached, as listed in the following table.
| Machine type | Recommended maximum span |
|---|---|
| | 25 TB |
| All other machine types | 50 TB |
Spans on separate persistent disks that add up to 50 TB or less can be considered equal to a single 50 TB span for performance purposes.
Avoid using ext3 filesystems in Linux
Using Linux's ext3 filesystem can result in very poor performance under heavy write loads. Use ext4 when possible. The ext4 filesystem driver is backwards compatible with ext3/ext2 and supports mounting ext3 filesystems. The ext4 filesystem is the default on most Linux operating systems.
If you can't migrate to ext4, as a workaround you can mount ext3 filesystems with the `data=journal` mount option. This improves write IOPS at the cost of write throughput. Migrating to ext4 can result in up to a 7x improvement in some benchmarks.
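As a quick check, you can confirm which filesystem a device currently uses and, if you must stay on ext3, apply the workaround mount option; the device and mount point below are placeholders:

$ lsblk -f /dev/[DEVICE_ID]
$ sudo mount -o data=journal /dev/[DEVICE_ID] /mnt/disks/[MNT_DIR]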
Disable lazy initialization and enable DISCARD commands
Persistent disks support DISCARD or TRIM commands, which allow operating systems to inform the disks when blocks are no longer in use. DISCARD support allows the OS to mark disk blocks as no longer needed, without incurring the cost of zeroing out the blocks.
On most Linux operating systems, you enable DISCARD when you mount a persistent disk to your instance. Windows Server 2012 R2 instances enable DISCARD by default when you mount a persistent disk.
Enabling DISCARD can boost general runtime performance, and it can also speed up the performance of your disk when it is first mounted. Formatting an entire disk volume can be time consuming, so "lazy formatting" is a common practice. The downside of lazy formatting is that the cost is often paid instead the first time the volume is mounted. By disabling lazy initialization and enabling DISCARD commands, you can format and mount the volume quickly.
Disable lazy initialization and enable DISCARD during format by passing the following parameters to mkfs.ext4:
-E lazy_itable_init=0,lazy_journal_init=0,discard
The `lazy_journal_init=0` parameter does not work on instances with CentOS 6 or RHEL 6 images. For those instances, format persistent disks without that parameter:
-E lazy_itable_init=0,discard
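For example, a complete format command might look like the following; `/dev/[DEVICE_ID]` is a placeholder, and formatting destroys any data already on the disk:

$ sudo mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/[DEVICE_ID]

On CentOS 6 or RHEL 6 images, omit lazy_journal_init=0:

$ sudo mkfs.ext4 -E lazy_itable_init=0,discard /dev/[DEVICE_ID]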
Enable DISCARD commands on mount by passing the following flag to the mount command:
-o discard
Persistent disks work well with the `discard` option enabled. However, you can optionally run `fstrim` periodically in addition to, or instead of, using the `discard` option. If you do not use the `discard` option, run `fstrim` before you create a snapshot of your disk. Trimming the file system lets you create smaller snapshot images, which reduces the cost of storing snapshots.
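For example, to mount a disk with DISCARD enabled and to trim a mounted filesystem manually before you take a snapshot; the device and mount point are placeholders:

$ sudo mount -o discard,defaults /dev/[DEVICE_ID] /mnt/disks/[MNT_DIR]
$ sudo fstrim /mnt/disks/[MNT_DIR]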
Adjust the readahead value
To improve I/O performance, operating systems employ techniques such as readahead, where more of a file than was requested is read into memory with the assumption that subsequent reads are likely to need that data. Higher readahead increases throughput at the expense of memory and IOPS. Lower readahead increases IOPS at the expense of throughput.
On Linux systems, you can get and set the readahead value with the blockdev command:
$ sudo blockdev --getra /dev/[DEVICE_ID]
$ sudo blockdev --setra [VALUE] /dev/[DEVICE_ID]
The readahead value is `<desired_readahead_bytes> / 512 bytes`.
For example, for an 8 MB readahead, 8 MB is 8388608 bytes (8 * 1024 * 1024), and 8388608 bytes / 512 bytes = 16384, so you set blockdev to 16384:
$ sudo blockdev --setra 16384 /dev/[DEVICE_ID]
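The same setting is also exposed in kibibytes through sysfs. The sketch below assumes `[DEVICE_ID]` is the kernel device name (for example, sdb); 8192 KB corresponds to the same 8 MB readahead, and values set either way do not persist across reboots:

$ cat /sys/block/[DEVICE_ID]/queue/read_ahead_kb
$ echo 8192 | sudo tee /sys/block/[DEVICE_ID]/queue/read_ahead_kb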
Ensure you have free CPUs
Reading and writing to persistent disk requires CPU cycles from your VM. To achieve very high, consistent IOPS levels, you must have CPUs free to process I/O.
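As a quick check, you can watch idle CPU while your workload runs; vmstat is available on most Linux images:

$ vmstat 1 5

The id column in the cpu section shows the percentage of idle CPU time; sustained values near zero suggest that the instance doesn't have enough free CPU to drive higher IOPS.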
Optimize your disks for IOPS-oriented or throughput-oriented workloads
IOPS-oriented workloads
Databases, whether SQL or NoSQL, have usage patterns of random access to data. Google recommends the following values for IOPS-oriented workloads:
- An I/O queue depth value of 1 for each 400–800 IOPS, up to a limit of 64 on large volumes
- One free CPU for every 2,000 random read IOPS and one free CPU for every 2,500 random write IOPS
- Lower readahead values, which are typically suggested in best practices documents for MongoDB, Apache Cassandra, and other database applications (see the example after this list)
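For example, to apply a small readahead value to the device that backs your database, you can use `blockdev` as shown earlier. The value of 32 sectors (16 KB) below is only an illustration and `/dev/[DEVICE_ID]` is a placeholder; use the value your database vendor recommends:

$ sudo blockdev --setra 32 /dev/[DEVICE_ID]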
Throughput-oriented workloads
Streaming operations, such as a Hadoop job, benefit from fast sequential reads, and larger I/O sizes can increase streaming performance.
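For example, here is a read-only fio sketch that drives eight parallel sequential read streams with a 1 MB I/O size, in line with the recommendations that follow; `/dev/[DEVICE_ID]` is a placeholder and fio must be installed on the instance:

$ sudo fio --name=seq-read-test --filename=/dev/[DEVICE_ID] --ioengine=libaio --direct=1 --readonly --rw=read --bs=1M --numjobs=8 --offset_increment=10G --runtime=60 --time_based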
- Use an I/O size of 256 KB or larger.
- On standard persistent disks, use 8 or more parallel sequential I/O streams when possible. Standard persistent disks are designed to optimize I/O performance for sequential disk access, similar to a physical HDD hard drive.
- Make sure your app is optimized for a reasonable temporal data locality on large disks. If your app accesses data that is distributed across different parts of a disk over a short period of time (hundreds of GB per vCPU), you won't achieve optimal IOPS. For best performance, optimize for temporal data locality, weighing factors like the fragmentation of the disk and the randomness of accessed parts of the disk.
- On SSD persistent disks, make sure the I/O scheduler in the OS is configured to meet your specific needs (see the following steps).
On Linux-based systems, check whether the I/O scheduler is set to `none`. This I/O scheduler doesn't reorder requests and is ideal for fast, random I/O devices.

1. On the command line, verify the I/O scheduler being used by your Linux machine:

   cat /sys/block/sda/queue/scheduler

   The output is similar to the following:

   [mq-deadline] none

   The I/O scheduler that is currently active is displayed in square brackets (`[]`).

2. If your I/O scheduler is not set to `none`, perform one of the following steps:

   - To change your default I/O scheduler to `none`, set `elevator=none` in the `GRUB_CMDLINE_LINUX` entry of the GRUB configuration file. Usually this file is located in `/etc/default/grub`, but on some earlier distributions it might be located in a different directory.

     GRUB_CMDLINE_LINUX="elevator=none vconsole.keymap=us console=ttyS0,38400n8 vconsole.font=latarcyrheb-sun16"

     After you update the GRUB configuration file, configure the bootloader on the system so that it can boot on Compute Engine (see the example after these steps).

   - Alternatively, you can change the I/O scheduler at runtime:

     echo 'none' | sudo tee /sys/block/sda/queue/scheduler

     If you use this method, the system switches back to the default I/O scheduler on reboot. Run the `cat` command again to verify your I/O scheduler.
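How you apply the updated GRUB configuration so that the system boots with `elevator=none` varies by distribution. A minimal sketch for common image families:

On Debian or Ubuntu based images:

$ sudo update-grub

On CentOS or RHEL based images:

$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg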
Review persistent disk performance metrics
You can review persistent disk performance metrics in Cloud Monitoring, Google Cloud's integrated monitoring solution.
To learn more, see Reviewing persistent disk performance metrics.
What's next
- Benchmark your persistent disks.
- Learn about persistent disk pricing.