Use Cloud Storage FUSE file caching

The Cloud Storage FUSE file cache is a client-based read cache that lets repeat file reads be served from faster cache storage of your choice. This page describes how to enable and use Cloud Storage FUSE file caching. For an overview of file caching, stat caching, or type caching, see Overview of caching.

Before you begin

The file cache requires a directory in which to store cached files. You can create a new directory on an existing file system or create a new file system on provisioned storage. If you are provisioning new storage, use the following instructions to create a new file system:

  1. To format a Persistent Disk, see Compute Engine instructions for how to format a Persistent Disk.

  2. To create in-memory RAM disks, see Compute Engine instructions for how to mount RAM disks.

  3. To format and mount Local SSDs, see Compute Engine instructions for mounting Local SSDs. To combine multiple Local SSDs into a single volume, see Compute Engine instructions for how to add a Local SSD to your VM.
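If you are instead creating a new directory on an existing file system, the following is a minimal sketch of that step. The path /tmp/gcsfuse-cache is a hypothetical placeholder, not a recommended location; substitute a path on the storage you want the cache to use.

```shell
# Hypothetical cache location (an assumption for illustration);
# replace it with a path on the file system you want to cache on.
CACHE_DIR="${CACHE_DIR:-/tmp/gcsfuse-cache}"

# Create the directory; -p avoids an error if it already exists.
mkdir -p "$CACHE_DIR"

# Show the capacity and free space of the file system backing the directory,
# which helps when deciding on a maximum cache size.
df -h "$CACHE_DIR"
```

The resulting directory is the value you would later supply in the cache-dir field.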

Enable and configure caching behavior

  1. Enable and configure file caching by using the file-cache field in a Cloud Storage FUSE configuration file, and specify the cache directory you want to use in the cache-dir field. The file cache is disabled by default; passing a directory to the cache-dir field is what enables it.

  2. Optional: configure stat caching and type caching by using the metadata-cache field in a configuration file. To learn more about stat and type caches, see Overview of type caching or Overview of stat caching.

  3. Optional: increase the TTL of cached entries by setting the ttl-secs option to a value based on the expected time between repeat reads, balanced against your consistency needs. We recommend that you set ttl-secs as high as your workload allows. You can configure the TTL in a Cloud Storage FUSE configuration file. For more information about setting a TTL for cached entries, see Time to live.

    For example, the following configuration file enables file caching, stat caching, and type caching with a TTL of 3600 seconds and the cache directory set to /path/to/a/directory. Note that max-size-mb is set to -1, which configures the file cache to use all available capacity in the cache directory.

    file-cache:
      max-size-mb: -1
      cache-file-for-range-read: false
    
    metadata-cache:
      stat-cache-max-size-mb: 32
      ttl-secs: 3600
      type-cache-max-size-mb: 4
    
    cache-dir: /path/to/a/directory
    
  4. Before you run your workload, manually run the ls -R command on your mounted bucket to pre-populate metadata. This ensures that the type cache is populated ahead of the first read, using a faster, batched method.
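The steps above can be sketched end to end as follows. This is an illustrative outline rather than a definitive procedure: the bucket name, mount point, and configuration file path are hypothetical placeholders, and prewarm_metadata is a helper name introduced here for illustration. The sketch assumes the configuration file is passed to gcsfuse with the --config-file flag and skips the mount entirely when gcsfuse or the configuration file is not present.

```shell
# Hypothetical values (assumptions for illustration); replace with your own.
BUCKET_NAME="${BUCKET_NAME:-my-bucket}"
MOUNT_POINT="${MOUNT_POINT:-$HOME/gcs-mount}"
CONFIG_FILE="${CONFIG_FILE:-$HOME/gcsfuse-config.yaml}"

# Recursively list a directory and discard the output. On a mounted bucket,
# performing the listing is what pre-populates the cached metadata.
prewarm_metadata() {
  ls -R "$1" > /dev/null
}

# Mount and pre-warm only when gcsfuse is installed and the config file exists.
if command -v gcsfuse > /dev/null 2>&1 && [ -f "$CONFIG_FILE" ]; then
  mkdir -p "$MOUNT_POINT"
  gcsfuse --config-file="$CONFIG_FILE" "$BUCKET_NAME" "$MOUNT_POINT"
  prewarm_metadata "$MOUNT_POINT"
fi
```

Discarding the listing output keeps the pre-warm step quiet on large buckets; only the side effect of listing matters here.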

What's next