This guide shows you how to use Kubernetes persistent volumes backed by your Cloud Storage buckets to manage storage resources for your Kubernetes Pods on Google Kubernetes Engine (GKE). Consider using this storage option if you are already familiar with PersistentVolumes and want consistency with your existing deployments that rely on this resource type.
This guide is for Platform admins and operators who want to simplify storage management for their GKE applications.
Before reading this page, ensure you're familiar with Kubernetes persistent volumes, Kubernetes Pods, and Cloud Storage buckets.
If you want a streamlined Pod-based interface that requires no previous experience with Kubernetes persistent volumes, see Mount Cloud Storage buckets as CSI ephemeral volumes.
Before you begin
Make sure you have completed these prerequisites:
- Understand the requirements and limitations of the Cloud Storage FUSE CSI driver.
- Create the Cloud Storage bucket.
- Enable the Cloud Storage FUSE CSI driver.
- Configure access to Cloud Storage buckets.
How persistent volumes for Cloud Storage buckets work
With static provisioning, you create one or more PersistentVolume objects containing the details of the underlying storage system. Pods in your clusters can then consume the storage through PersistentVolumeClaims.
Using a persistent volume backed by a Cloud Storage bucket involves these operations:
Storage definition: You define a PersistentVolume in your GKE cluster, including the CSI driver to use and any required parameters. For the Cloud Storage FUSE CSI driver, you specify the bucket name and other relevant details.
Optionally, you can fine-tune the performance of your CSI driver by using the file caching feature. File caching can boost GKE app performance by caching frequently accessed Cloud Storage files on a faster local disk.
Additionally, you can use the parallel download feature to accelerate reading large files from Cloud Storage by downloading them in multiple parallel streams. This feature can improve model load times, especially for reads larger than 1 GB.
Driver invocation: When a PersistentVolumeClaim requests storage matching the PersistentVolume's specification, GKE invokes the Cloud Storage FUSE CSI driver.
Bucket mounting: The CSI driver mounts the bucket to the node where the requesting Pod is scheduled. This makes the bucket's contents accessible to the Pod as a directory in the Pod's local file system. To fine-tune how buckets are mounted in the file system, you can use mount options. You can also use volume attributes to configure specific behavior of the Cloud Storage FUSE CSI driver.
Re-attachment: If the Pod restarts or is rescheduled to another node, the CSI driver remounts the same bucket to the new node, ensuring data accessibility.
Create a PersistentVolume
Create a PersistentVolume manifest with the following specification:
Pod
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-fuse-csi-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  storageClassName: example-storage-class
  mountOptions:
  - implicit-dirs
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: BUCKET_NAME
  claimRef:
    name: gcs-fuse-csi-static-pvc
    namespace: NAMESPACE
Replace the following values:
- NAMESPACE: the Kubernetes namespace where you want to deploy your Pod.
- BUCKET_NAME: the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore (_) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.
The example manifest shows these required settings:
- spec.csi.driver: use gcsfuse.csi.storage.gke.io as the CSI driver name.
Optionally, you can adjust these variables:
- spec.mountOptions: Pass mount options to Cloud Storage FUSE. Specify the flags in one string separated by commas, without spaces. See the example snippet after this list.
- spec.csi.volumeAttributes: Pass additional volume attributes to Cloud Storage FUSE.
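For example, you can pass Cloud Storage FUSE flags that set the file ownership of the mounted objects. The following snippet is an illustrative sketch of the mountOptions field only; the uid and gid values are placeholder assumptions, not values from this guide:

mountOptions:
- implicit-dirs
- uid=1001 # placeholder user ID; match your container's user
- gid=2002 # placeholder group ID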
Pod (file caching)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-fuse-csi-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  storageClassName: example-storage-class
  mountOptions:
  - implicit-dirs
  - file-cache:max-size-mb:-1
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: BUCKET_NAME
  claimRef:
    name: gcs-fuse-csi-static-pvc
    namespace: NAMESPACE
Replace the following values:
- NAMESPACE: the Kubernetes namespace where you want to deploy your Pod.
- BUCKET_NAME: the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore (_) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.
Pod (parallel download)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-fuse-csi-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  storageClassName: example-storage-class
  mountOptions:
  - implicit-dirs
  - file-cache:enable-parallel-downloads:true
  - file-cache:max-size-mb:-1
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: BUCKET_NAME
  claimRef:
    name: gcs-fuse-csi-static-pvc
    namespace: NAMESPACE
Replace the following values:
- NAMESPACE: the Kubernetes namespace where you want to deploy your Pod.
- BUCKET_NAME: the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore (_) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.
Apply the manifest to the cluster:
kubectl apply -f PV_FILE_PATH
Replace PV_FILE_PATH with the path to your YAML file.
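To verify that the PersistentVolume was created, you can check its status. This check assumes the volume name from the example manifest:

kubectl get pv gcs-fuse-csi-pv

The volume should report an Available status until a matching PersistentVolumeClaim binds to it.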
Create a PersistentVolumeClaim
Create a PersistentVolumeClaim manifest with the following specification:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcs-fuse-csi-static-pvc
  namespace: NAMESPACE
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: example-storage-class
Replace NAMESPACE with the Kubernetes namespace where you want to deploy your Pod.
To bind your PersistentVolume to a PersistentVolumeClaim, check these configuration settings:
- The spec.storageClassName fields in your PersistentVolume and PersistentVolumeClaim manifests must match. The storageClassName doesn't need to refer to an existing StorageClass object; to bind the claim to a volume, you can use any name you want, but it can't be empty.
- The spec.accessModes fields in your PersistentVolume and PersistentVolumeClaim manifests must match.
- The spec.capacity.storage field in your PersistentVolume manifest must match spec.resources.requests.storage in the PersistentVolumeClaim manifest. Because Cloud Storage buckets don't have size limits, you can use any number for capacity, but it can't be empty.
Apply the manifest to the cluster:
kubectl apply -f PVC_FILE_PATH
Replace PVC_FILE_PATH with the path to your YAML file.
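To verify that the claim bound to the volume, you can check its status in the namespace you used:

kubectl get pvc gcs-fuse-csi-static-pvc -n NAMESPACE

The STATUS column should show Bound once the PersistentVolumeClaim matches the PersistentVolume.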
Consume the volume in a Pod
Create a Pod manifest with the following specification:
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-csi-example-static-pvc
  namespace: NAMESPACE
  annotations:
    gke-gcsfuse/volumes: "true"
    gke-gcsfuse/ephemeral-storage-limit: "50Gi"
spec:
  containers:
  - image: busybox
    name: busybox
    command: ["sleep"]
    args: ["infinity"]
    volumeMounts:
    - name: gcs-fuse-csi-static
      mountPath: /data
      readOnly: true
  serviceAccountName: KSA_NAME
  volumes:
  - name: gcs-fuse-csi-static
    persistentVolumeClaim:
      claimName: gcs-fuse-csi-static-pvc
      readOnly: true
Replace the following values:
- NAMESPACE: the Kubernetes namespace where you want to deploy your Pod.
- KSA_NAME: the Kubernetes ServiceAccount name that you created when configuring access to the Cloud Storage buckets.
The example manifest shows these required settings:
- metadata.annotations: the annotation gke-gcsfuse/volumes: "true" is required. See Configure the sidecar container for optional annotations.
Optionally, you can adjust these variables:
- spec.containers[n].volumeMounts[n].readOnly: Specify true if only specific volume mounts are read-only.
- spec.volumes[n].persistentVolumeClaim.readOnly: Specify true if all volume mounts are read-only.
Apply the manifest to the cluster:
kubectl apply -f POD_FILE_PATH
Replace POD_FILE_PATH with the path to your YAML file.
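After the Pod reaches the Running state, you can spot-check that the bucket contents are visible at the mount path. This command assumes the Pod and container names from the example manifest:

kubectl exec -n NAMESPACE gcs-fuse-csi-example-static-pvc -c busybox -- ls /data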
Troubleshoot issues
If you need to troubleshoot Cloud Storage FUSE issues, you can set the log-severity flag to TRACE. You set the flag in the args section of the driver's container spec in the deployment YAML. This causes the gcsfuseLoggingSeverity volume attribute to be automatically set to trace.
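Alternatively, because the PersistentVolume accepts spec.csi.volumeAttributes, you can set the gcsfuseLoggingSeverity attribute directly on the volume. This is a sketch of the csi block from the earlier example manifest with the attribute added; it mirrors the effect of the TRACE flag described above:

csi:
  driver: gcsfuse.csi.storage.gke.io
  volumeHandle: BUCKET_NAME
  volumeAttributes:
    gcsfuseLoggingSeverity: trace # enables trace-level Cloud Storage FUSE logs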
For additional troubleshooting tips, see Troubleshooting Guide in the GitHub project documentation.
What's next
- Learn how to optimize performance for the Cloud Storage FUSE CSI driver.
- Explore additional samples for using the CSI driver on GitHub.
- Learn more about Cloud Storage FUSE.