Cloud Storage FUSE

This page provides an overview of Cloud Storage FUSE, a FUSE adapter that lets you mount and access Cloud Storage buckets as local file systems, so applications can read and write objects in your bucket using standard file system semantics. Cloud Storage FUSE is an open source product that's supported by Google.

See the following documentation for instructions on how to use Cloud Storage FUSE:

This documentation always reflects the latest version of Cloud Storage FUSE. For details on the latest version, see Cloud Storage FUSE releases on GitHub.

How Cloud Storage FUSE works

Cloud Storage FUSE uses FUSE and Cloud Storage APIs to transparently expose buckets as locally mounted folders on your file system.

Cloud Storage FUSE works by translating object names into a file and directory structure, interpreting the slash character ("/") in object names as a directory separator so that objects with the same common prefix are treated as files in the same directory. Applications can interact with the mounted bucket like a file system, providing virtually limitless file storage running in the cloud. Cloud Storage FUSE can be run from anywhere with connectivity to Cloud Storage, including Google Kubernetes Engine, Compute Engine VMs, and on-premises systems.
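The prefix-to-directory translation described above can be illustrated with a short sketch. This is not Cloud Storage FUSE code; it's a simplified simulation (with made-up object names) of how a flat object namespace is interpreted as a directory tree by splitting names on "/":

```python
# Sketch only: simulate how a flat list of Cloud Storage object names is
# interpreted as a directory tree by treating "/" as a separator.

def build_tree(object_names):
    """Group flat object names into a nested dict; None marks a file."""
    tree = {}
    for name in object_names:
        parts = name.split("/")
        node = tree
        # Every component except the last is treated as a directory.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        if parts[-1]:  # a trailing "/" would denote an explicit directory object
            node[parts[-1]] = None
    return tree

# Hypothetical object names sharing common prefixes.
objects = [
    "logs/2024/app.log",
    "logs/2024/err.log",
    "models/checkpoint.pt",
    "readme.txt",
]
tree = build_tree(objects)

# "logs/2024/app.log" and "logs/2024/err.log" share the prefix "logs/2024/",
# so they appear as two files in the same directory.
print(sorted(tree["logs"]["2024"]))  # ['app.log', 'err.log']
```

Note that Cloud Storage itself stores these as four independent objects in a flat namespace; the directory hierarchy exists only in the mounted view.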

Cloud Storage FUSE is ideal for use cases where Cloud Storage has the right performance and scalability characteristics for an application that requires file system semantics. For example, Cloud Storage FUSE is useful for machine learning (ML) projects because it provides a way to store data, models, checkpoints, and logs directly in Cloud Storage. For more information, see Cloud Storage FUSE for ML workloads.

Cloud Storage FUSE is integrated with other Google Cloud services. For example, the Cloud Storage FUSE CSI driver lets you use the Google Kubernetes Engine (GKE) API to consume buckets as volumes, so you can read from and write to Cloud Storage from within your Kubernetes pods. For more information on other integrations, see Integrations.

Limitations

While Cloud Storage FUSE has a file system interface, it is not like an NFS or CIFS file system on the backend. Additionally, Cloud Storage FUSE is not POSIX compliant. For a POSIX file system product in Google Cloud, see Filestore.

When using Cloud Storage FUSE, be aware of its limitations and semantics, which differ from those of POSIX file systems. Cloud Storage FUSE should be used only within its capabilities.

Limitations and differences from POSIX file systems

The following list describes the limitations of Cloud Storage FUSE:

  • Metadata: Cloud Storage FUSE does not transfer object metadata when uploading files to Cloud Storage, with the exception of mtime and symlink targets. This means that you cannot set object metadata when you upload files using Cloud Storage FUSE. If you need to preserve object metadata, consider uploading files using the Google Cloud CLI, the JSON API, or the Google Cloud console.
  • Concurrency: Cloud Storage FUSE does not provide concurrency control for multiple writes to the same file. When multiple writes try to replace a file, the last write wins and all previous writes are lost. There is no merging, version control, or user notification of the subsequent overwrite.
  • Linking: Cloud Storage FUSE does not support hard links.
  • File locking and file patching: Cloud Storage FUSE does not support file locking or file patching. As such, you should not store version control system repositories in Cloud Storage FUSE mount points, as version control systems rely on file locking and patching. Additionally, you should not use Cloud Storage FUSE as a filer replacement.
  • Semantics: Semantics in Cloud Storage FUSE differ from semantics in a traditional file system. For example, metadata such as last access time isn't supported, and some metadata operations, such as directory renaming, aren't atomic. For a list of differences between Cloud Storage FUSE semantics and traditional file system semantics, see Semantics in the Cloud Storage FUSE GitHub documentation.
  • Workloads that do file patching (or overwrites in place): Cloud Storage FUSE can only write whole objects at a time to Cloud Storage and does not provide a mechanism for patching. If you try to patch a file, Cloud Storage FUSE reuploads the entire file. The only exception is appending content to the end of a file that's 2 MB or larger, in which case Cloud Storage FUSE reuploads only the appended content.
  • Access: Authorization for files is governed by Cloud Storage permissions. POSIX-style access control does not work.
  • Performance: Cloud Storage FUSE has much higher latency than a local file system and, as such, should not be used as the backend for storing a database. Throughput may be reduced when reading or writing one small file at a time. Using larger files and transferring multiple files at a time helps increase throughput.
  • Availability: Transient errors can sometimes occur when you use Cloud Storage FUSE to access Cloud Storage. It's recommended that you retry failed operations using retry strategies.
  • Object versioning: Cloud Storage FUSE does not formally support usage with buckets that have object versioning enabled. Attempting to use Cloud Storage FUSE with buckets that have object versioning enabled can produce unpredictable behavior.
  • File transcoding: Objects that have content-encoding: gzip in their metadata don't undergo decompressive transcoding in a Cloud Storage FUSE-mounted directory. Instead, the object remains compressed, exactly as it's stored in the bucket.
    For example, a 1,000-byte file uploaded to a bucket using the gcloud storage cp command with the --gzip-local flag might become 60 bytes as a Cloud Storage object (the actual compressed size depends on the content and the gzip implementation used by the gcloud CLI). If the bucket is mounted using gcsfuse and the corresponding file is listed or read from the mount directory, its size is returned as 60 bytes, and its contents are the compressed version of the original 1,000-byte content.
    This is in contrast to a download using gcloud storage cp gs://bucket/path /local/path, which undergoes decompressive transcoding: the content is automatically decompressed during the download, and the original, uncompressed content is served.
    Note: Attempting to use Cloud Storage FUSE to edit or modify objects with content-encoding: gzip can produce unpredictable behavior. Cloud Storage FUSE uploads the object content as is (without compressing it) while retaining content-encoding: gzip. If the uploaded content isn't valid gzip, other clients, such as the gcloud CLI, can fail to read the object, because they apply decompressive transcoding when reading and decompression fails for improperly compressed content.
  • Retention policies: Cloud Storage FUSE does not support writing to buckets with a retention policy. If you attempt to write to a bucket with a retention policy, your writes will fail.

    Cloud Storage FUSE supports reading objects from buckets with a retention policy, but the bucket must be mounted as read-only by passing the -o ro flag during bucket mounting.

  • Local storage: Objects that are new or modified are stored in their entirety in a local temporary file until they are closed or synced. When working with large files, make sure you have enough local storage capacity for temporary copies of the files, particularly if you are working with Compute Engine instances. For more information, see the README in the Cloud Storage FUSE GitHub documentation.
  • Directories: Cloud Storage operates with a flat namespace. By default, only directories that are explicitly defined (meaning they exist as objects in Cloud Storage) can appear in the mounted file system. Implicit directories (ones that are only parts of the pathname of other files or directories) do not appear by default. If you have files with a pathname that contains an implicit directory, the file does not appear in the overall directory tree, since the implicit directory containing them does not appear. However, you can use a flag to change this behavior. For more information, see Files and directories in the Cloud Storage FUSE GitHub documentation.

    Cloud Storage FUSE does not support directory renaming. A directory rename cannot be performed atomically in Cloud Storage; instead, renaming a directory involves copying an object with a new name and deleting the original object.

  • File handle limits: By default, the Linux kernel allows a maximum of 1,024 open file handles. Cloud Storage FUSE shouldn't be used as a server handling concurrent parallel connections from external clients, as this might exceed the maximum number of open file handles. Some common use cases to avoid are web serving content from a Cloud Storage FUSE mount, exposing a Cloud Storage FUSE mount as network-attached storage (NAS) using file sharing protocols (for example, NFS or SMB), and hosting a file transfer protocol (FTP) server backed by a Cloud Storage FUSE mount.
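The availability guidance above recommends retrying failed operations. One common retry strategy is truncated exponential backoff with jitter, sketched below. The function names, exception type, and delay values are illustrative assumptions, not part of Cloud Storage FUSE:

```python
import random
import time

# Sketch of truncated exponential backoff with jitter for transient errors.
# All names and values here are illustrative, not Cloud Storage FUSE APIs.

def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=32.0):
    """Call operation(), retrying on OSError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Random jitter spreads out retries from many clients.
            time.sleep(random.uniform(0, delay))

# Simulate a read that fails transiently twice, then succeeds.
attempts = {"count": 0}

def flaky_read():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise OSError("transient error")
    return b"object contents"

data = retry_with_backoff(flaky_read, base_delay=0.001)
print(data, attempts["count"])  # b'object contents' 3
```

Capping the delay (truncation) and adding jitter keeps a fleet of clients from hammering the service in synchronized waves after an outage.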

Frameworks, operating systems, and architectures

Cloud Storage FUSE has been validated with the following frameworks:

  • TensorFlow V2.x

  • TensorFlow V1.x

  • PyTorch V2.x

  • PyTorch V1.x

  • JAX 0.4.x

Cloud Storage FUSE supports the following operating systems and architectures:

  • Ubuntu 18.04 or later

  • Debian 10 or later

  • CentOS 7.9 or later

  • RHEL 7.9 or later

  • x86_64

  • ARM64

Get support

You can get support, submit general questions, and request new features by using one of Google Cloud's official support channels. You can also get support by filing issues in GitHub.

For solutions to commonly encountered issues, see Troubleshooting in the Cloud Storage FUSE GitHub documentation.

Pricing for Cloud Storage FUSE

Cloud Storage FUSE is available free of charge, but the storage, metadata, and network I/O it generates to and from Cloud Storage are charged like any other Cloud Storage interface. In other words, all data transfer and operations performed by Cloud Storage FUSE map to Cloud Storage transfers and operations, and are charged accordingly. For more information on common Cloud Storage FUSE operations and how they map to Cloud Storage operations, see the operations mapping.

To avoid surprises, you should estimate how your use of Cloud Storage FUSE translates to Cloud Storage charges. For example, if you are using Cloud Storage FUSE to store log files, you can incur charges quickly if logs are aggressively flushed on hundreds or thousands of machines at the same time.
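A back-of-the-envelope estimate for the log-flushing example above might look like the following. The machine counts, flush rates, and especially the per-operation price are placeholder assumptions, not real Cloud Storage rates; see the Cloud Storage pricing page for actual prices:

```python
# Rough estimate of operation counts from aggressive log flushing through
# Cloud Storage FUSE. ALL numbers below are placeholder assumptions.

machines = 1000        # machines writing logs through a mount
flushes_per_hour = 60  # each flush re-uploads the log object (one write)
hours = 24

writes_per_day = machines * flushes_per_hour * hours

# Placeholder price, NOT a real Cloud Storage rate.
hypothetical_price_per_10k_writes = 0.05

estimated_cost = writes_per_day / 10_000 * hypothetical_price_per_10k_writes
print(writes_per_day, round(estimated_cost, 2))  # 1440000 7.2
```

Even with modest per-operation prices, a fleet of machines flushing every minute generates over a million write operations per day, which is why estimating the operation volume up front matters.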

See Cloud Storage Pricing for information on charges such as storage, network usage, and operations.

Map of Cloud Storage FUSE operations to Cloud Storage operations

When you perform an operation using Cloud Storage FUSE, you also perform the Cloud Storage operations associated with the Cloud Storage FUSE operation. The following table describes common Cloud Storage FUSE commands and their associated Cloud Storage JSON API operations. You can display information about the Cloud Storage FUSE operations by using the --debug_gcs flag.

Command                                  JSON API operations
gcsfuse --debug_gcs example-bucket mp    Objects.list (to check credentials)
cd mp                                    n/a
ls mp                                    Objects.list("")
mkdir subdir                             Objects.get("subdir")
                                         Objects.get("subdir/")
                                         Objects.insert("subdir/")
cp ~/local.txt subdir/                   Objects.get("subdir/local.txt")
                                         Objects.get("subdir/local.txt/")
                                         Objects.insert("subdir/local.txt"), to create an empty object
                                         Objects.insert("subdir/local.txt"), when closing after done writing
rm -rf subdir                            Objects.list("subdir")
                                         Objects.list("subdir/")
                                         Objects.delete("subdir/local.txt")
                                         Objects.list("subdir/")
                                         Objects.delete("subdir/")

Known issues

For a list of known issues in Cloud Storage FUSE, refer to GitHub.

Next steps