Configure Cloud Storage volume mounts for jobs

This page shows how to mount a Cloud Storage bucket as a storage volume, using Cloud Run volume mounts.

Mounting the bucket as a volume in Cloud Run presents the bucket content as files in the container file system. After you mount the bucket as a volume, you access the bucket as if it were a directory on your local file system, using your programming language's file system operations and libraries instead of using Google API Client Libraries.
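
For example, with a bucket mounted at /mnt/my-volume, reading an object becomes an ordinary file read. The following sketch contrasts the two approaches in Python; the object name data/report.txt and the bucket name my-bucket are hypothetical, and the client-library call is shown only for comparison:

# Read an object through the mounted volume: plain file I/O, no client library needed.
with open("/mnt/my-volume/data/report.txt") as f:
    contents = f.read()

# The equivalent read without a volume mount, using the Cloud Storage client library
# (requires the google-cloud-storage package).
from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-bucket").blob("data/report.txt")
contents = blob.download_as_text()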

You can mount your volume as read-only and you can also specify mount options for your volume.

Memory requirements

Cloud Storage volume mounts use the Cloud Run container memory for the following activities:

  • For all Cloud Storage FUSE caching, Cloud Run uses the stat cache setting with a Time to live (TTL) of 60 seconds by default. The default maximum size of the stat cache is 32 MiB, and the default maximum size of the type cache is 4 MiB.

  • When reading, Cloud Storage FUSE also consumes memory beyond the stat and type caches, for example, a 1 MiB array for each file being read, as well as memory for goroutines.

  • When writing to Cloud Storage, the entire file is staged in Cloud Run memory before the file is written to Cloud Storage.
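
As a rough, hypothetical sizing example based on the values above: a job that writes one 500 MiB file while reading 10 other files needs on the order of 32 MiB (stat cache) + 4 MiB (type cache) + 10 × 1 MiB (read buffers) + 500 MiB (staged write) ≈ 546 MiB of container memory for the volume mount, in addition to whatever the application itself uses.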

Limitations

Since Cloud Run uses Cloud Storage FUSE for this volume mount, there are a few things to keep in mind when mounting a Cloud Storage bucket as a volume:

  • Cloud Storage FUSE does not provide concurrency control for multiple writes (file locking) to the same file. When multiple writes try to replace a file, the last write wins and all previous writes are lost.
  • Cloud Storage FUSE is not a fully POSIX-compliant file system. For more details, refer to the Cloud Storage FUSE documentation.

Disallowed paths

Cloud Run does not allow you to mount a volume at /dev, /proc and /sys, or on their subdirectories.

Before you begin

You need a Cloud Storage bucket to mount as the volume.

For optimal read/write performance to Cloud Storage, see Optimizing Cloud Storage FUSE network bandwidth performance.

Required roles

To get the permissions that you need to configure Cloud Storage volume mounts, ask your administrator to grant you the following IAM roles:

To get the permissions that your service identity needs to access the file and Cloud Storage bucket, ask your administrator to grant the service identity the following IAM role:

For more details on Cloud Storage roles and permissions, see IAM for Cloud Storage.

For a list of IAM roles and permissions that are associated with Cloud Run, see Cloud Run IAM roles and Cloud Run IAM permissions. If your Cloud Run job interfaces with Google Cloud APIs, such as Cloud Client Libraries, see the service identity configuration guide. For more information about granting roles, see deployment permissions and manage access.

Mount a Cloud Storage volume

You can mount multiple buckets at different mount paths. You can also mount a volume to more than one container using the same or different mount paths across containers.

If you are using multiple containers, first specify the volumes, then specify the volume mounts for each container.

Console

  1. In the Google Cloud console, go to the Cloud Run jobs page:

    Go to Cloud Run

  2. Click Deploy container and select Job to fill out the initial job settings page. If you are configuring an existing job, select the job, then click Edit.

  3. Click Container, variables and secrets, connections, security to expand the job properties page.

  4. Click the Volumes tab.

    • Under Volumes:
      • Click Add volume.
      • In the Volume type drop-down, select Cloud Storage bucket as the volume type.
      • In the Volume name field, enter the name you want to use for the volume.
      • Browse and select the bucket you want to use for the volume.
      • Optionally, select the Read only checkbox to make the bucket read-only.
      • Click Done.
    • Click the Container tab, then expand the container that you are mounting the volume to so that you can edit it.
    • Click the Volume Mounts tab.
    • Click Mount volume.
      • Select the Cloud Storage volume from the menu.
      • Specify the path where you want to mount the volume.
      • Click Mount volume.
  5. Click Create or Update.

gcloud

  • To add a volume and mount it:

    gcloud run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH

    Replace:

    • JOB with the name of your job.
    • MOUNT_PATH with the relative path where you are mounting the volume, for example, /mnt/my-volume.
    • VOLUME_NAME with any name you want for your volume. The VOLUME_NAME value is used to map the volume to the volume mount.
    • BUCKET_NAME with the name of your Cloud Storage bucket.
  • To mount your volume as a read-only volume:

    --add-volume=name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,readonly=true
  • If you are using multiple containers, first specify your volume(s), then specify the volume mount(s) for each container:

    gcloud run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME \
    --container CONTAINER_1 \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH \
    --container CONTAINER_2 \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH2

YAML

  1. If you are creating a new job, skip this step. If you are updating an existing job, download its YAML configuration:

    gcloud run jobs describe JOB_NAME --format export > job.yaml
  2. Update the MOUNT_PATH, VOLUME_NAME, BUCKET_NAME and IS_READ_ONLY as needed.

    apiVersion: run.googleapis.com/v1
    kind: Job
    metadata:
      name: JOB_NAME
    spec:
      template:
        metadata:
          annotations:
            run.googleapis.com/execution-environment: gen2
        spec:
          template:
            spec:
              containers:
              - image: IMAGE_URL
                volumeMounts:
                - mountPath: MOUNT_PATH
                  name: VOLUME_NAME
              volumes:
              - name: VOLUME_NAME
                csi:
                  driver: gcsfuse.run.googleapis.com
                  readOnly: IS_READ_ONLY
                  volumeAttributes:
                    bucketName: BUCKET_NAME

    Replace:

    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • MOUNT_PATH with the relative path where you are mounting the volume, for example, /mnt/my-volume.
    • VOLUME_NAME with any name you want for your volume. The VOLUME_NAME value is used to map the volume to the volume mount.
    • IS_READ_ONLY with True to make the volume read-only, or False to allow writes.
    • BUCKET_NAME with the name of the Cloud Storage bucket.
  3. Create or update the job using the following command:

    gcloud run jobs replace job.yaml

Reading and writing to a volume

If you use the Cloud Run volume mount feature, you access a mounted volume using the same libraries in your programming language that you use to read and write files on your local file system.

This is especially useful if you're using an existing container that expects data to be stored on the local file system and uses regular file system operations to access it.

The following snippets assume a volume mount with a mountPath set to /mnt/my-volume.

Nodejs

Use the File System module to create a new file or append to an existing file in the volume, /mnt/my-volume:

var fs = require('fs');
fs.appendFileSync('/mnt/my-volume/sample-logfile.txt', 'Hello logs!', { flag: 'a+' });

Python

Write to a file kept in the volume, /mnt/my-volume:

with open("/mnt/my-volume/sample-logfile.txt", "a") as f:
    f.write("Hello logs!")

Go

Use the os package to create a new file kept in the volume, /mnt/my-volume:

f, err := os.Create("/mnt/my-volume/sample-logfile.txt")

Java

Use the java.io.File class to create a log file in the volume, /mnt/my-volume:

import java.io.File;

File f = new File("/mnt/my-volume/sample-logfile.txt");
f.createNewFile(); // creates an empty file if it does not already exist (throws IOException on error)
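
Reading works the same way as on a local file system. The following sketch (in Python, using the file name from the snippets above) lists the mount directory and reads the log file back; on a volume mounted as read-only, any write attempt fails with an OSError (read-only file system):

import os

# List the objects visible under the mount path.
print(os.listdir("/mnt/my-volume"))

# Read back the log file written by the snippets above.
with open("/mnt/my-volume/sample-logfile.txt") as f:
    print(f.read())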

Volume configuration using mount options

You can optionally use mount options to configure various properties of your volume mount. The available mount options allow you to configure cache settings, mount a specific directory, enable debug logging, and other behaviors.

Specify mount options

You can specify mount options using the Google Cloud CLI or YAML.

gcloud

To add a volume and mount it with mount options:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="OPTION_1=VALUE_1;OPTION_N=VALUE_N" \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH

Replace:

  • JOB with the name of your job.
  • MOUNT_PATH with the relative path where you are mounting the volume, for example, /cache.
  • VOLUME_NAME with any name you want for your volume. The VOLUME_NAME value is used to map the volume to the volume mount.
  • BUCKET_NAME with the name of your Cloud Storage bucket.
  • OPTION_1 with the first mount option. Note that you can specify as many mount options as you need, with each mount option and value pair separated by a semicolon.
  • VALUE_1 with the setting you want for the first mount option.
  • OPTION_N with the second mount option.
  • VALUE_N with the setting for the second mount option.

YAML

  1. If you are creating a new job, skip this step. If you are updating an existing job, download its YAML configuration:

    gcloud run jobs describe JOB_NAME --format export > job.yaml
  2. Update the MOUNT_PATH, VOLUME_NAME, BUCKET_NAME, IS_READ_ONLY and mount options as needed.

    apiVersion: run.googleapis.com/v1
    kind: Job
    metadata:
      name: JOB_NAME
    spec:
      metadata:
        annotations:
          run.googleapis.com/launch-stage: BETA
      template:
        metadata:
          annotations:
            run.googleapis.com/execution-environment: gen2
        spec:
          template:
            spec:
              containers:
              - image: IMAGE_URL
                volumeMounts:
                - mountPath: MOUNT_PATH
                  name: VOLUME_NAME
              volumes:
              - name: VOLUME_NAME
                csi:
                  driver: gcsfuse.run.googleapis.com
                  readOnly: IS_READ_ONLY
                  volumeAttributes:
                    bucketName: BUCKET_NAME
                    mountOptions: OPTION_1=VALUE_1,OPTION_N=VALUE_N

    Replace:

    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • MOUNT_PATH with the relative path where you are mounting the volume, for example, /cache.
    • VOLUME_NAME with any name you want for your volume. The VOLUME_NAME value is used to map the volume to the volume mount.
    • IS_READ_ONLY with True to make the volume read-only, or False to allow writes.
    • BUCKET_NAME with the name of the Cloud Storage bucket.
    • OPTION_1 with the first mount option. Note that you can specify as many mount options as you need, with each mount option and value pair separated by a comma.
    • VALUE_1 with the setting you want for the first mount option.
    • OPTION_N with the second mount option.
    • VALUE_N with the setting for the second mount option.
  3. Create or update the job using the following command:

    gcloud run jobs replace job.yaml

Commonly used mount options

Mount options are commonly used to configure cache settings, mount only a specific directory from the Cloud Storage bucket, configure the ownership of the volume (uid, gid), turn off implicit directories, or specify debug logging levels.

Configure caching settings

You can change the caching settings for your volume by setting the caching-related mount options. The following list describes each setting, along with its default Cloud Run value:

  • metadata-cache-ttl-secs: Time to live (TTL), in seconds, of cached metadata entries. For example, metadata-cache-ttl-secs=120. To always use the most up-to-date file, specify a value of 0. To always use the cached version, specify a value of -1. To learn more, see Configuring cache invalidation. Default: 60.

  • stat-cache-max-size-mb: Maximum size in mebibytes (MiB) that the stat cache can use. The stat cache is always kept entirely in memory, which affects memory consumption. Specify a value of 32 if your workload involves up to 20,000 files. If your workload is larger than 20,000 files, increase the size by 10 for every additional 6,000 files; the stat cache uses an average of about 1,500 bytes per file. To let the stat cache use as much memory as needed, that is, to set no limit, specify a value of -1. To disable the stat cache, specify a value of 0. Default: 32.

  • type-cache-max-size-mb: Maximum size in MiB, per directory, that the type cache can use. The type cache is always kept entirely in memory, which affects memory consumption. Specify a value of 4 if the largest directory in the bucket you're mounting contains up to 20,000 files. If it contains more than 20,000 files, increase the value by 1 for every additional 5,000 files; the type cache uses an average of around 200 bytes per file. To let the type cache use as much memory as needed, that is, to set no limit, specify a value of -1. To disable the type cache, specify a value of 0. Default: 4.

The following Google Cloud CLI command sets the metadata-cache-ttl-secs to 120 seconds and increases the stat and type cache capacity to 52 and 7 MiB, respectively:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="metadata-cache-ttl-secs=120;stat-cache-max-size-mb=52;type-cache-max-size-mb=7" \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH

Enable debug logging

By default, Cloud Storage FUSE logs events that have Info severity. You can change this behavior by setting the log-severity mount option to any of the following values:

  • trace
  • debug
  • info
  • warning
  • error
  • off (turns off all logging)

These severity levels are ordered from lowest to highest. When you specify a severity level, Cloud Storage FUSE generates logs for events that have a severity level equal to or higher than the specified severity. For example, when you specify the warning level, Cloud Storage FUSE generates logs for warnings and errors.

Setting the log severity to a more verbose level than info, that is, debug or trace, can impact performance and generate a large amount of logging data, so we recommend doing this only when needed.

The following command line turns on debug logging:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="log-severity=debug" \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH

Disable implicit directories

To make a Cloud Storage bucket behave more like a standard file system, Cloud Run enables implicit directories by default when mounting the bucket. You can turn implicit directories off using the implicit-dirs mount option. Disabling implicit directories can improve performance and reduce cost, but it comes with compatibility tradeoffs.

The implicit directories feature enables Cloud Run to recognize pre-existing Cloud Storage files whose filenames mimic a directory structure, such as /mydir/myfile.txt. If you disable implicit directories, Cloud Run won't be able to list or read such files.

Turning off implicit directories reduces the number of requests to Cloud Storage, which can improve your application's performance and reduce cost. Read the Cloud Storage FUSE Files and directories documentation to learn more.
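
As a hypothetical illustration, suppose the bucket contains only an object named mydir/myfile.txt and no explicit directory entry for mydir/. With the default behavior the file is reachable through the mount; with implicit-dirs=false it is not:

import os

# Bucket contains only the object "mydir/myfile.txt" (no explicit directory object).

# Implicit directories enabled (default): the directory is inferred from the object name.
print(os.path.exists("/mnt/my-volume/mydir/myfile.txt"))  # True
print(os.listdir("/mnt/my-volume"))                       # ['mydir']

# With mount-options="implicit-dirs=false", the same calls return False and [],
# because Cloud Storage FUSE no longer infers "mydir/" from the object name.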

The following command line disables implicit directories:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="implicit-dirs=false" \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH

Mount a specific directory inside your Cloud Storage bucket

By default, Cloud Run mounts the entire Cloud Storage bucket, which gives Cloud Run jobs access to all of its contents. In some cases you might want to mount only a specific directory. For example, if the bucket contains a large number of files, mounting a specific directory can improve performance.

Another example is isolation, where you want different jobs to have access to different directories in the storage bucket.

The following command line specifies the directory to mount:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="only-dir=images" \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH
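
With only-dir=images, the images/ prefix of the bucket becomes the root of the mount rather than the bucket root. A short sketch of the resulting paths, using hypothetical object names:

import os

# Assume the bucket contains the objects "images/cat.png" and "docs/readme.txt",
# and the volume is mounted at /mnt/my-volume with mount-options="only-dir=images".

# images/cat.png appears at the mount root:
with open("/mnt/my-volume/cat.png", "rb") as f:
    data = f.read()

# Objects outside the images/ prefix are not visible through this mount:
print(os.path.exists("/mnt/my-volume/readme.txt"))  # False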

Set the volume UID and GID

Use the uid and gid mount options to change the user ID (UID) and group ID (GID) of the volume. This is useful if you want to set the ownership of the volume's files to a specific user or group that matches the identity of one or more of the running containers. By default, volumes are owned by root.

The following command line sets uid and gid:

gcloud beta run jobs update JOB \
    --add-volume name=VOLUME_NAME,type=cloud-storage,bucket=BUCKET_NAME,mount-options="uid=UID;gid=GID"  \
    --add-volume-mount volume=VOLUME_NAME,mount-path=MOUNT_PATH
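
To confirm that the ownership options took effect, you can inspect the mounted directory from inside the container, for example with a sketch like this (assuming the volume is mounted at /mnt/my-volume):

import os

# st_uid and st_gid reflect the uid and gid mount options;
# without them, the volume is owned by root (UID 0, GID 0).
info = os.stat("/mnt/my-volume")
print(info.st_uid, info.st_gid)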

Set other mount options

The following is the complete list of mount options supported by Cloud Run.

Directory

  • implicit-dirs
  • only-dir
  • rename-dir-limit

Debug

  • debug_fuse_errors
  • debug_fuse
  • debug_gcs
  • debug_invariants
  • debug_mutex

Cache

  • stat-cache-capacity
  • stat-cache-ttl
  • type-cache-ttl
  • enable-nonexistent-type-cache

Permissions

  • uid
  • gid
  • file-mode
  • dir-mode

Other

  • billing-project
  • client-protocol
  • experimental-enable-json-read
  • experimental-opentelemetry-collector-address
  • http-client-timeout
  • limit-bytes-per-sec
  • limit-ops-per-sec
  • max-conns-per-host
  • max-idle-conns-per-host
  • max-retry-sleep
  • -o
  • retry-multiplier
  • sequential-read-size-mb
  • stackdriver-export-interval

For full documentation of the supported mount options, see the Cloud Storage FUSE command line mount options.

View volume mount settings

You can view current volume mount settings using the Google Cloud console or the Google Cloud CLI.

Console

  1. In the Google Cloud console, go to the Cloud Run jobs page:

    Go to Cloud Run jobs

  2. Click the job you are interested in to open the Job details page.

  3. Click the Volumes tab.

  4. Locate the volume mount settings in the volume details.

gcloud

  1. Use the following command:

    gcloud run jobs describe JOB_NAME
  2. Locate the volume mount settings in the returned configuration.

Optimizing Cloud Storage FUSE network bandwidth performance

For better read and write performance, connect your Cloud Run job to a VPC network using Direct VPC and route all outbound traffic through your VPC network. You can do this using any of the following options:

Container startup time and Cloud Storage FUSE mounts

Using Cloud Storage FUSE can slightly increase your Cloud Run container cold start time because the volume mount is set up before the containers start. Your container starts only after Cloud Storage FUSE is successfully mounted.

Note that Cloud Storage FUSE successfully mounts a volume only after establishing a connection to Cloud Storage, so any networking delays can affect container startup time. If the connection attempt fails, Cloud Storage FUSE fails to mount and the Cloud Run job fails to start. The job also fails to start if Cloud Storage FUSE takes longer than 30 seconds to mount, because Cloud Run allows a total of 30 seconds to perform all mounts.

Cloud Storage FUSE performance characteristics

If you define two volumes, each pointing to a different bucket, two Cloud Storage FUSE processes will be started. The mounts and processes occur in parallel.

Operations using Cloud Storage FUSE are impacted by network bandwidth because Cloud Storage FUSE communicates with Cloud Storage using the Cloud Storage API. Some operations such as listing the content of a bucket can be slow if the network bandwidth is low. Similarly, reading a large file can take time as this is also limited by network bandwidth.

When you write to a bucket, Cloud Storage FUSE fully stages the object in memory. This means that writing large files is limited by the amount of memory available to the container instance (the maximum container memory limit is 32 GiB).

Writes are flushed to the bucket only when you perform a close or an fsync; the full object is then uploaded or re-uploaded to the bucket. The only exception to the object being entirely re-uploaded is appending content to a file that is 2 MiB or larger.
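
A minimal Python sketch of this behavior, assuming a read-write volume mounted at /mnt/my-volume and a hypothetical file name:

import os

f = open("/mnt/my-volume/output.csv", "w")
f.write("id,value\n")   # staged in instance memory, not yet in the bucket
f.write("1,42\n")

f.flush()
os.fsync(f.fileno())    # fsync: the full object is uploaded to the bucket

f.write("2,43\n")
f.close()               # close: the object is uploaded again with the new content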

For more information, see the following resources:

Clear and remove volumes and volume mounts

You can clear all volumes and volume mounts, or you can remove individual volumes and volume mounts.

Clear all volumes and volume mounts

To clear all volumes and volume mounts from your single-container job, run the following command:

gcloud beta run jobs update JOB \
    --clear-volumes \
    --clear-volume-mounts

If you have multiple containers, follow the sidecars CLI conventions to clear volumes and volume mounts:

gcloud beta run jobs update JOB \
    --clear-volumes \
    --clear-volume-mounts \
    --container=container1 \
    --clear-volumes \
    --clear-volume-mounts \
    --container=container2 \
    --clear-volumes \
    --clear-volume-mounts

Remove individual volumes and volume mounts

In order to remove a volume, you must also remove all volume mounts using that volume.

To remove individual volumes or volume mounts, use the remove-volume and remove-volume-mount flags:

gcloud beta run jobs update JOB \
    --remove-volume VOLUME_NAME \
    --container=container1 \
    --remove-volume-mount MOUNT_PATH \
    --container=container2 \
    --remove-volume-mount MOUNT_PATH