Cloud Storage FUSE performance

This page provides guidance on how you can improve Cloud Storage FUSE performance.

Improve read and write performance

To improve read and write performance, we recommend the following:

  • Enable caching: Cloud Storage FUSE offers four optional client-side cache types that store specific types of data and metadata locally to help improve performance.

    Cloud Storage FUSE caching works with any user-specified directory that's backed by your choice of storage. Cache performance matches the performance of the underlying storage, with minimal overhead.

  • Accelerate reads by enabling parallel downloads: to speed up reads of files larger than 1 GB, enable parallel downloads. For more information, see Improve read performance using parallel downloads.

  • Run sequential read workloads when possible: Cloud Storage FUSE performs better for sequential read workloads than random read workloads. Cloud Storage FUSE uses a heuristic to detect when a file is being read sequentially, which enables Cloud Storage FUSE to issue fewer, larger read requests to Cloud Storage using the same TCP connection.

  • Adjust file sizes based on read type: to optimize sequential read performance, we recommend that you upload and read files between 5 MB and 200 MB in size. To optimize random read performance, we recommend that you upload and read files around 2 MB in size.

  • Mount buckets with hierarchical namespace enabled: to increase read and write speeds, and to ensure atomicity for operations that need higher initial queries per second (QPS), we recommend mounting buckets with hierarchical namespace enabled. To learn more about how buckets with hierarchical namespace enabled can improve Cloud Storage FUSE performance, see Mount buckets with hierarchical namespace enabled.
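The file size guidance above can be checked before you upload data. The following is a minimal sketch, not an official tool: classify_size and report_sizes are hypothetical helper names, the sketch assumes that 1 MB means 1024 * 1024 bytes, and it relies on the -printf option of GNU find.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: compare file sizes with the 5 MB to 200 MB range
# recommended above for sequential reads.
# Assumption: 1 MB is treated as 1024 * 1024 bytes.
MB=$((1024 * 1024))

# Print "below-range", "in-range", or "above-range" for a size in bytes.
classify_size() {
  local bytes=$1
  if [ "$bytes" -lt $((5 * MB)) ]; then
    echo "below-range"
  elif [ "$bytes" -le $((200 * MB)) ]; then
    echo "in-range"
  else
    echo "above-range"
  fi
}

# List every regular file under a directory with its classification.
# Uses GNU find's -printf to emit "<size in bytes> <path>" per file.
report_sizes() {
  local dir=$1
  find "$dir" -type f -printf '%s %p\n' | while read -r size path; do
    printf '%s\t%s\n' "$(classify_size "$size")" "$path"
  done
}
```

For example, running report_sizes ./dataset prints one line per file, which makes it easier to spot files that you might want to combine or split before uploading.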

Improve first-time read performance

Before running your workload, we recommend that you recursively list the files in your mounted bucket. Listing populates the stat and type caches ahead of time in a faster, batched way, which improves performance on the first run:

ls -R MOUNT_POINT > /dev/null

Use file caching to improve throughput

Cloud Storage FUSE has higher latency than a local file system. Throughput is reduced when you read or write small files one at a time, as it results in several separate API calls. Reading or writing multiple large files at a time can help increase throughput. Use the Cloud Storage FUSE file cache feature to improve performance for small and random I/Os. To learn more about file caching and how to enable the feature, see Use Cloud Storage FUSE file caching.
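As an illustration of enabling the file cache, the following sketch writes a small Cloud Storage FUSE configuration file. It assumes that your version reads the cache-dir and file-cache: max-size-mb fields from a file passed with --config-file, so check the field names against the file caching documentation for your version. The write_cache_config helper, the cache directory, and the size limit are hypothetical placeholders, not recommendations.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a config file that points the file cache at a
# local directory and caps its size.
# Assumption: the cache-dir and file-cache.max-size-mb field names match the
# config file format of your Cloud Storage FUSE version.
write_cache_config() {
  local config_path=$1 cache_dir=$2 max_size_mb=$3
  cat > "$config_path" <<EOF
cache-dir: ${cache_dir}
file-cache:
  max-size-mb: ${max_size_mb}
EOF
}

# Example usage (BUCKET_NAME and MOUNT_POINT are placeholders):
#   write_cache_config ./gcsfuse-config.yaml /tmp/gcsfuse-cache 10240
#   gcsfuse --config-file=./gcsfuse-config.yaml BUCKET_NAME MOUNT_POINT
```

Because cache performance follows the underlying storage, the directory that you pass as the cache location matters more than the helper itself.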

Mount buckets with hierarchical namespace enabled

To ensure atomicity for operations that need higher initial queries per second (QPS), such as checkpointing and directory renames or changes, we recommend mounting buckets with hierarchical namespace enabled. Hierarchical namespace organizes your data into a hierarchical file system structure, making operations within the bucket more efficient. List object calls (BucketHandle.Objects) are replaced with get folder calls, resulting in quicker response times and fewer overall list calls for every operation.

Increase read-ahead size to improve large read throughput

You can improve large read throughput by increasing the amount of data that's prefetched with each read request using the read_ahead_kb Linux kernel parameter on your local machine. We recommend increasing the read_ahead_kb kernel parameter to 1 MB instead of using the default amount of 128 KB that's set on most Linux distributions. Either sudo or root permissions are required to successfully increase the kernel parameter.

To increase the read_ahead_kb kernel parameter to 1 MB for a specific Cloud Storage FUSE mounted directory, use the following commands, where /path/to/mount/point is your Cloud Storage FUSE mount point. Your bucket must be mounted with Cloud Storage FUSE before you run the commands. Otherwise, the kernel parameter doesn't increase.

  export MOUNT_POINT=/path/to/mount/point
  echo 1024 | sudo tee /sys/class/bdi/0:$(stat -c "%d" "$MOUNT_POINT")/read_ahead_kb
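To confirm that the change took effect, you can read the value back. The following sketch uses two hypothetical helpers, read_ahead_path and show_read_ahead. It assumes a Linux system with GNU stat and a FUSE mount whose device major number is 0, which is why the path uses the 0: prefix, as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the sysfs path that holds read_ahead_kb for a
# mount point.
# Assumption: the mount is a FUSE mount with device major number 0, so the
# decimal device number that stat reports equals the minor number.
read_ahead_path() {
  local mount_point=$1
  printf '/sys/class/bdi/0:%s/read_ahead_kb\n' "$(stat -c '%d' "$mount_point")"
}

# Print the current read-ahead value in KB. Return an error if the sysfs
# entry is missing, for example because the bucket isn't mounted yet.
show_read_ahead() {
  local path
  path=$(read_ahead_path "$1")
  if [ -r "$path" ]; then
    cat "$path"
  else
    echo "read_ahead_kb not found at $path" >&2
    return 1
  fi
}
```

After you run the tee command, show_read_ahead "$MOUNT_POINT" should print the value that you wrote. The setting applies to the current mount, so it might need to be reapplied after you remount the bucket.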

Achieve maximum throughput

To achieve maximum throughput, use a machine with enough CPU resources to drive throughput and saturate the network interface card (NIC). Insufficient CPU resources can cause Cloud Storage FUSE throttling.

If you're using Google Kubernetes Engine, increase the CPU allocation to the Cloud Storage FUSE sidecar container if your workloads need higher throughput. You can increase the resources used by the sidecar container or allocate unlimited resources.
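As a sketch of what raising the sidecar allocation can look like, the following hypothetical helper prints a Pod metadata fragment. It assumes that your GKE version configures the sidecar container through gke-gcsfuse/* Pod annotations. Confirm the annotation names and the accepted values, including how to request unlimited resources, in the GKE documentation before you use them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Pod metadata fragment that sets resource
# limits for the Cloud Storage FUSE sidecar container.
# Assumption: the gke-gcsfuse/* annotation names below match your GKE version.
sidecar_annotations() {
  local cpu_limit=$1 memory_limit=$2
  cat <<EOF
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
    gke-gcsfuse/cpu-limit: "${cpu_limit}"
    gke-gcsfuse/memory-limit: "${memory_limit}"
EOF
}

# Example usage: print a fragment that allows 4 CPUs and 8Gi of memory.
#   sidecar_annotations 4 8Gi
```

The values 4 and 8Gi are placeholders. Choose a CPU limit that's high enough for the sidecar to drive the throughput that your workload needs.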

Assess IOPS needs in queries-per-second

Filestore is a better option than Cloud Storage FUSE for workloads that require high instantaneous input/output operations per second (IOPS), also known as queries-per-second in Cloud Storage. Filestore is also the better option for very high IOPS on a single file system with lower latency.

Alternatively, you can use the Cloud Storage FUSE file cache feature, which builds on the performance characteristics of the underlying cache media. This approach helps only if the cache media provides high IOPS and low latency.

Perform load tests

For instructions on how to perform load tests on Cloud Storage FUSE, see Performance Benchmarks in the GitHub documentation.

What's next