Best practices for 16 KB persistent disk and MySQL

This document describes how to use a persistent disk with a 16 KB physical block size to improve the performance of a MySQL database.

Write-heavy MySQL workloads usually benefit from disabling the InnoDB doublewrite buffer. MySQL's InnoDB performs doublewrite during the dirty page flushing process so that it can recover from torn pages, that is, pages that were only partially written to disk.

However, if there is an end-to-end 16 KB atomic write path that ensures a 16 KB data page is never partially committed to disk (a torn write), then there is no need to perform doublewrite. When doublewrite is disabled, the database's dirty page flushing capacity is essentially doubled. This reduces how often the database falls into a sync flush state, which leads to more stable, and possibly higher, performance.

Before you begin

Building a 16 KB atomic write path from database to block device

You can build an end-to-end 16 KB atomic write path from the database to the block device by using a 16 KB persistent disk. With that path in place, you can safely disable the doublewrite feature in MySQL/InnoDB and achieve more stable and better performance under a write-heavy load.

Create and attach a persistent disk through the Google Cloud Console, the gcloud tool, or the API.
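
As a rough sketch of the gcloud route, the following shell function prints the two commands instead of running them, so that they can be reviewed first. The function name, the pd-ssd disk type, and the 500GB size are illustrative assumptions. The disk name, VM name, and zone are placeholders that you supply.

```shell
# Sketch only: prints gcloud commands for review, it does not execute them.
# print_disk_commands is a hypothetical helper name; pd-ssd and 500GB are
# example values, not recommendations from this document.
print_disk_commands() {
    disk=$1   # name for the new 16 KB persistent disk
    vm=$2     # name of the VM to attach the disk to
    zone=$3   # zone shared by the disk and the VM

    # --physical-block-size=16384 requests the 16 KB physical block size.
    echo "gcloud compute disks create $disk --zone=$zone --type=pd-ssd --size=500GB --physical-block-size=16384"
    echo "gcloud compute instances attach-disk $vm --disk=$disk --zone=$zone"
}
```

For example, calling print_disk_commands with a disk name, a VM name, and a zone prints one create command and one attach command that you can copy into a terminal.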

  1. Create a 16 KB block size persistent disk and attach it to your VM. The 16 KB persistent disk provides 16 KB write atomicity at the physical block level.

    Although optional, it is recommended that you configure your MySQL instance to store only data files on the 16 KB persistent disk. Store log files, especially the redo log and binlogs, on a 4 KB persistent disk that is attached to the same VM. This keeps log file writes fast, because small log writes on a 16 KB persistent disk might trigger many read-modify-write operations, which are slower.

  2. Format the 16 KB disk using the ext4 file system with the BigAlloc option and set the cluster size to 16 KB. Here is an example mkfs command with the BigAlloc option specified:

    mkfs.ext4 -O bigalloc -C 16384 [...other options…]

    Using BigAlloc with a 16 KB cluster size ensures that the file system allocates file space that is aligned to 16 KB boundaries on the disk.

  3. When you create the VM instance, choose an OS image from the Google Container-Optimized OS image families in the cos-cloud project.

    Use the gcloud command to see a list of all available cos images:

    gcloud compute images list --project cos-cloud --no-standard-images

    Select version 67 or higher. For better results, consider choosing an image from the cos-stable family.

    Choosing a cos image ensures that writes are not improperly split across a 16 KB boundary by layers between the file system and the physical block device. The cos image qualification process has built-in tests to ensure this result.

  4. Ensure max_segments and max_sectors_kb are configured properly in the OS:

    max_segments >= max_sectors_kb/4

    These two variables are already configured in all Compute Engine VMs. If you don't have a script that changes them after VM creation, then you don't need to do anything here.

    You can query these two values in the OS under this path:

    /sys/block/sd<drive_letter>/queue/

  5. Configure noop or none as the I/O scheduler for the 16 KB persistent disk.

    echo "none" > /sys/block/sd<drive_letter>/queue/scheduler

  6. Disable I/O request merging in the kernel for the 16 KB persistent disk.

    echo 2 > /sys/block/sd<drive_letter>/queue/nomerges

  7. Configure InnoDB to use O_DIRECT. Set innodb_flush_method to O_DIRECT in the database configuration, adding the option if it is not already present.
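
The block-layer settings in steps 4 through 6 can be sketched as a few shell functions. The function names and the queue directory argument are assumptions made for illustration. The argument is expected to be the /sys/block/sd<drive_letter>/queue directory of the 16 KB persistent disk, and writing to it requires root privileges.

```shell
# Sketch only: helper names are hypothetical. Each function takes the
# queue directory of the 16 KB disk as its first argument.

# Succeeds when max_segments >= max_sectors_kb / 4 (step 4).
check_segments() {
    queue_dir=$1
    max_segments=$(cat "$queue_dir/max_segments")
    max_sectors_kb=$(cat "$queue_dir/max_sectors_kb")
    [ "$max_segments" -ge $((max_sectors_kb / 4)) ]
}

# Prints "none" if the kernel offers it, otherwise "noop" (step 5).
# The scheduler file lists the available schedulers, with the active
# one in square brackets.
pick_scheduler() {
    queue_dir=$1
    case " $(tr -d '[]' < "$queue_dir/scheduler") " in
        *" none "*) echo none ;;
        *" noop "*) echo noop ;;
        *) return 1 ;;
    esac
}

# Selects the scheduler and disables request merging (steps 5 and 6).
apply_queue_settings() {
    queue_dir=$1
    scheduler=$(pick_scheduler "$queue_dir") || return 1
    echo "$scheduler" > "$queue_dir/scheduler"
    echo 2 > "$queue_dir/nomerges"
}
```

A typical use would be to call check_segments first and call apply_queue_settings only if it succeeds. These sysfs settings do not persist across reboots, so a startup script would need to reapply them.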

Now you can safely turn off the innodb_doublewrite option.
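
One possible way to express the MySQL side of this setup is the sketch below, which writes an option-file fragment. The function name, the mount points, and the directory layout are hypothetical. The log placement lines reflect the optional recommendation in step 1, and the sketch assumes that InnoDB uses its default 16 KB page size.

```shell
# Sketch only: write_mysql_fragment and all paths are illustrative.
#   $1  output file for the option-file fragment
#   $2  mount point of the 16 KB persistent disk (data files)
#   $3  mount point of the 4 KB persistent disk (redo log and binlogs)
write_mysql_fragment() {
    out_file=$1
    data_mount=$2
    log_mount=$3
    cat > "$out_file" <<EOF
[mysqld]
# Data files on the 16 KB persistent disk.
datadir=$data_mount/data
# Redo log and binlogs on the 4 KB persistent disk (optional, step 1).
innodb_log_group_home_dir=$log_mount/redo
log_bin=$log_mount/binlog/mysql-bin
# Bypass the page cache for data file writes (step 7).
innodb_flush_method=O_DIRECT
# Safe only when the whole 16 KB atomic write path is in place.
innodb_doublewrite=OFF
EOF
}
```

After merging such a fragment into the server configuration and restarting MySQL, you can confirm the values with SHOW VARIABLES LIKE 'innodb_doublewrite' and SHOW VARIABLES LIKE 'innodb_flush_method'.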

This method is not the only approach you can take to ensure end-to-end 16 KB atomic writes using a 16 KB block device. For example, if you configure your database to use the block device directly as a raw device without using a filesystem, then you can skip the steps above that describe configuration of the file system.
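
If you do keep the ext4 file system, it can be worth confirming after formatting that the 16 KB cluster size took effect. The sketch below assumes that dumpe2fs -h reports a "Cluster size:" line for BigAlloc file systems. The function name is hypothetical, and it reads the dumpe2fs output from standard input.

```shell
# Sketch only: has_16k_clusters is a hypothetical helper.
# Reads superblock information (as printed by dumpe2fs -h) on stdin and
# succeeds only if it reports a cluster size of 16384 bytes.
has_16k_clusters() {
    awk -F: '
        /^Cluster size:/ { gsub(/[ \t]/, "", $2); size = $2 }
        END { exit (size == "16384") ? 0 : 1 }
    '
}
```

For example, you could pipe the output of dumpe2fs -h for the formatted device into has_16k_clusters and stop the setup if the function fails.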

What's next