GKE Volume Populator lets you preload data from a source storage to a destination PersistentVolumeClaim during dynamic provisioning, without running additional scripts or CLI commands for manual data transfer. This feature automates and streamlines the data transfer process by using the Kubernetes Volume Populator feature. It provides data portability so that you can swap storage types to benefit from price or performance optimizations.
Use this feature if you need to transfer large amounts of data from Cloud Storage buckets to a PersistentVolumeClaim backed by another Google Cloud storage type (such as Parallelstore).
You primarily interact with GKE Volume Populator through the gcloud CLI and kubectl CLI. GKE Volume Populator is supported on both Autopilot and Standard clusters. You don't need to enable the GKE Volume Populator. It's a GKE managed component that's enabled by default.
Benefits
- If you want to take advantage of the performance of a managed parallel file system, but your data is stored in Cloud Storage, you can use GKE Volume Populator to simplify data transfer.
- GKE Volume Populator allows for data portability; you can move data per your needs.
- GKE Volume Populator supports IAM-based authentication so you can transfer data while maintaining fine-grained access control.
The diagram shows how data flows from the source storage to the destination storage, and the creation of the PersistentVolume for the destination storage using GKE Volume Populator.
Limitations
- GKE Volume Populator only supports Cloud Storage buckets as the source storage and Parallelstore instances as the destination storage type.
- GKE Volume Populator only supports StorageClass resources that have their volumeBindingMode set to Immediate.
- The GCPDataSource custom resource must be in the same namespace as your Kubernetes workload. Volumes with cross-namespace data sources are not supported.
- GKE Volume Populator only supports Workload Identity Federation for GKE binding of IAM service accounts to a Kubernetes service account. Granting IAM permissions to the Kubernetes service account directly is not supported.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Parallelstore API and the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- See the Parallelstore CSI driver overview for limitations and requirements.
- Create your Cloud Storage buckets, populated with the data that you want to transfer.
Requirements
To use GKE Volume Populator, your clusters must meet the following requirements:
- Use GKE cluster version 1.31.1-gke.1729000 or later.
- Have the Parallelstore CSI driver enabled. GKE enables the CSI driver for you by default on new and existing GKE Autopilot clusters. On new and existing Standard clusters, you'll need to enable the CSI driver.
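The version requirement above can be checked with a small script before you go further. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that version strings always follow the MAJOR.MINOR.PATCH-gke.BUILD pattern used in this document.

```shell
# Minimum GKE version stated in the requirements above.
MIN_GKE_VERSION="1.31.1-gke.1729000"

# Convert "MAJOR.MINOR.PATCH-gke.BUILD" into four space-separated numbers.
# Prints nothing if the input does not match the assumed pattern.
gke_version_fields() {
  printf '%s\n' "$1" | sed -n \
    's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\)-gke\.\([0-9][0-9]*\)$/\1 \2 \3 \4/p'
}

# Return 0 if version $1 is the same as or later than version $2,
# 1 if it is earlier, and 2 if either string cannot be parsed.
gke_version_at_least() {
  have=$(gke_version_fields "$1")
  want=$(gke_version_fields "$2")
  if [ -z "$have" ] || [ -z "$want" ]; then
    return 2
  fi
  # Positional parameters 1-4 hold the first version, 5-8 the second.
  set -- $have $want
  for pair in "$1:$5" "$2:$6" "$3:$7" "$4:$8"; do
    a=${pair%%:*}
    b=${pair##*:}
    if [ "$a" -gt "$b" ]; then return 0; fi
    if [ "$a" -lt "$b" ]; then return 1; fi
  done
  return 0
}

if gke_version_at_least "1.32.0-gke.1000" "$MIN_GKE_VERSION"; then
  echo "version is new enough"
fi
```

Comparing the fields numerically, rather than as strings, avoids treating a build such as 999 as later than 1729000.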
Prepare your environment
This section covers the steps to create your GKE clusters and set up the necessary permissions to use GKE Volume Populator.
Set up your VPC network
You must specify the same Virtual Private Cloud (VPC) network when creating the Parallelstore instance and client Compute Engine VMs or GKE clusters. To let your VPC network connect privately to Google Cloud services without exposing traffic to the public internet, you need to configure private services access (PSA) once, if you have not already done so.
To configure PSA, follow these steps:
Grant yourself the Compute Network Admin (roles/compute.networkAdmin) IAM role so that you can set up network peering for your project. To grant the role, run the following command:
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="user:EMAIL_ADDRESS" \
--role=roles/compute.networkAdmin
Replace EMAIL_ADDRESS with your email address.
Enable service networking:
gcloud services enable servicenetworking.googleapis.com
Create a VPC network:
gcloud compute networks create NETWORK_NAME \
--subnet-mode=auto \
--mtu=8896 \
--project=PROJECT_ID
Replace the following:
- NETWORK_NAME: the name of the VPC network where you will create your Parallelstore instance.
- PROJECT_ID: your Google Cloud project ID.
Create an IP range.
Private services access requires an IP address range (CIDR block) with a prefix length of at least /24 (256 addresses). Parallelstore reserves 64 addresses per instance, which means that you can reuse this IP range with other services or other Parallelstore instances if needed.
gcloud compute addresses create IP_RANGE_NAME \
--global \
--purpose=VPC_PEERING \
--prefix-length=24 \
--description="Parallelstore VPC Peering" \
--network=NETWORK_NAME \
--project=PROJECT_ID
Replace IP_RANGE_NAME with the name of the VPC network IP range.
Set an environment variable with the CIDR range associated with the range you created in the previous step:
CIDR_RANGE=$(
gcloud compute addresses describe IP_RANGE_NAME \
--global \
--format="value[separator=/](address, prefixLength)" \
--project=PROJECT_ID \
)
Create a firewall rule to allow TCP traffic from the IP range you created:
gcloud compute firewall-rules create FIREWALL_NAME \
--allow=tcp \
--network=NETWORK_NAME \
--source-ranges=$CIDR_RANGE \
--project=PROJECT_ID
Replace FIREWALL_NAME with the name of the firewall rule that allows TCP traffic from the IP range you created.
Connect the peering:
gcloud services vpc-peerings connect \
--network=NETWORK_NAME \
--ranges=IP_RANGE_NAME \
--project=PROJECT_ID \
--service=servicenetworking.googleapis.com
If you encounter issues while setting up the VPC network, check the Parallelstore troubleshooting guide.
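The address arithmetic behind the /24 range created above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the instance count is a simple upper bound obtained by dividing the block size by the 64 addresses that Parallelstore reserves per instance, ignoring anything else that shares the range.

```shell
# Addresses that Parallelstore reserves per instance, as stated above.
ADDRESSES_PER_INSTANCE=64

# Number of IPv4 addresses in a CIDR block with the given prefix length.
cidr_address_count() {
  echo $(( 1 << (32 - $1) ))
}

# Upper bound on Parallelstore instances that fit in a block of that size.
instances_per_range() {
  echo $(( $(cidr_address_count "$1") / $ADDRESSES_PER_INSTANCE ))
}

cidr_address_count 24    # prints 256
instances_per_range 24   # prints 4
```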
Create your GKE cluster
We recommend that you use an Autopilot cluster for a fully managed Kubernetes experience. To choose the GKE mode of operation that's the best fit for your workload needs, see Choose a GKE mode of operation.
Autopilot
To create a GKE cluster using Autopilot, run the following command:
gcloud container clusters create-auto CLUSTER_NAME \
--network=NETWORK_NAME \
--cluster-version=CLUSTER_VERSION \
--location=CLUSTER_LOCATION
GKE enables Workload Identity Federation for GKE and the Parallelstore CSI Driver by default in Autopilot clusters.
Replace the following values:
- CLUSTER_NAME: the name of your cluster.
- CLUSTER_VERSION: the GKE version number. You must specify 1.31.1-gke.1729000 or later.
- NETWORK_NAME: the name of the VPC network you created for the Parallelstore instance. To learn more, see Configure a VPC network.
- CLUSTER_LOCATION: the region where you want to create your cluster. We recommend that you create the cluster in a supported Parallelstore location for best performance. If you create your cluster in a location that Parallelstore doesn't support, you must specify a custom topology that uses a supported Parallelstore location when you create the Parallelstore StorageClass; otherwise, provisioning fails.
Standard
Create a Standard cluster with the Parallelstore CSI Driver and Workload Identity Federation for GKE enabled using the following command:
gcloud container clusters create CLUSTER_NAME \
--addons=ParallelstoreCsiDriver \
--cluster-version=CLUSTER_VERSION \
--workload-pool=PROJECT_ID.svc.id.goog \
--network=NETWORK_NAME \
--location=CLUSTER_LOCATION
Replace the following values:
- CLUSTER_NAME: the name of your cluster.
- CLUSTER_VERSION: the GKE version number. You must specify 1.31.1-gke.1729000 or later.
- PROJECT_ID: your Google Cloud project ID.
- NETWORK_NAME: the name of the VPC network you created for the Parallelstore instance. To learn more, see Configure a VPC network.
- CLUSTER_LOCATION: the region or zone where you want to create your cluster. We recommend that you create the cluster in a supported Parallelstore location for best performance. If you create your cluster in a location that Parallelstore doesn't support, you must specify a custom topology that uses a supported Parallelstore location when you create the Parallelstore StorageClass; otherwise, provisioning fails.
Set up necessary permissions
To transfer data from a Cloud Storage bucket, you need to set up permissions for Workload Identity Federation for GKE.
Create a Kubernetes namespace:
kubectl create namespace NAMESPACE
Replace NAMESPACE with the namespace that your workloads will run in.
Create a Kubernetes service account.
kubectl create serviceaccount KSA_NAME \
--namespace=NAMESPACE
Replace KSA_NAME with the name of the Kubernetes service account that your Pod uses to authenticate to Google Cloud APIs.
Create an IAM service account. You can also use any existing IAM service account in any project in your organization:
gcloud iam service-accounts create IAM_SA_NAME \
--project=PROJECT_ID
Replace the following:
- IAM_SA_NAME: the name for your IAM service account.
- PROJECT_ID: your Google Cloud project ID.
Grant your IAM service account the roles/storage.objectViewer role so that it can access your Cloud Storage bucket:
gcloud storage buckets \
add-iam-policy-binding gs://GCS_BUCKET \
--member "serviceAccount:IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/storage.objectViewer"
Replace GCS_BUCKET with your Cloud Storage bucket name.
Create the IAM allow policy that gives the Kubernetes service account access to impersonate the IAM service account:
gcloud iam service-accounts \
add-iam-policy-binding IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]"
Annotate the Kubernetes service account so that GKE sees the link between the service accounts.
kubectl annotate serviceaccount KSA_NAME \
--namespace NAMESPACE \
iam.gke.io/gcp-service-account=IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com
Create the Parallelstore service identity:
gcloud beta services identity create \
--service=parallelstore.googleapis.com \
--project=PROJECT_ID
Grant the Parallelstore service identity the roles/iam.serviceAccountTokenCreator role to allow it to impersonate the IAM service account. Set the PROJECT_NUMBER environment variable so you can use it in subsequent steps.
export PROJECT_NUMBER=$(gcloud projects describe PROJECT_ID --format="value(projectNumber)")
gcloud iam service-accounts \
add-iam-policy-binding "IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--member=serviceAccount:"service-${PROJECT_NUMBER?}@gcp-sa-parallelstore.iam.gserviceaccount.com" \
--role=roles/iam.serviceAccountTokenCreator
The PROJECT_NUMBER value is the automatically generated unique identifier for your project. To find this value, refer to Creating and managing projects.
Grant the Parallelstore service identity the roles/iam.serviceAccountUser role to allow it to access all the resources that the IAM service account can access:
gcloud iam service-accounts \
add-iam-policy-binding "IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--member=serviceAccount:"service-${PROJECT_NUMBER?}@gcp-sa-parallelstore.iam.gserviceaccount.com" \
--role=roles/iam.serviceAccountUser
Grant the GKE service identity the roles/iam.serviceAccountUser role to allow it to access all the resources that the IAM service account can access. This step is not required if the GKE cluster and the IAM service account are in the same project.
gcloud iam service-accounts \
add-iam-policy-binding "IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--member=serviceAccount:"service-${PROJECT_NUMBER?}@container-engine-robot.iam.gserviceaccount.com" \
--role=roles/iam.serviceAccountUser
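The steps above refer to four different principals, and it is easy to mistype one of them. The following sketch prints each principal string from a single set of variables so that you can compare them with the commands you ran. The variable values and function names are hypothetical examples; the string formats are the ones used in the commands in this section.

```shell
# Hypothetical example values; substitute your own.
PROJECT_ID="my-project"
PROJECT_NUMBER="123456789012"
NAMESPACE="my-namespace"
KSA_NAME="my-ksa"
IAM_SA_NAME="my-iam-sa"

# Email address of the IAM service account.
iam_sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$IAM_SA_NAME" "$PROJECT_ID"
}

# Member that represents the Kubernetes service account
# through Workload Identity Federation for GKE.
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$PROJECT_ID" "$NAMESPACE" "$KSA_NAME"
}

# Member for the Parallelstore service identity.
parallelstore_identity_member() {
  printf 'serviceAccount:service-%s@gcp-sa-parallelstore.iam.gserviceaccount.com\n' "$PROJECT_NUMBER"
}

# Member for the GKE service identity.
gke_identity_member() {
  printf 'serviceAccount:service-%s@container-engine-robot.iam.gserviceaccount.com\n' "$PROJECT_NUMBER"
}

iam_sa_email
workload_identity_member
parallelstore_identity_member
gke_identity_member
```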
Create a Parallelstore volume with preloaded data
The following sections describe the typical process for creating a Parallelstore volume with data preloaded from a Cloud Storage bucket, using the GKE Volume Populator.
- Create a GCPDataSource resource.
- Create a Parallelstore StorageClass.
- Create a PersistentVolumeClaim to access the volume.
- Verify that the PersistentVolumeClaim provisioning completed.
- (Optional) View the data transfer progress.
- Create a workload that consumes the volume.
Create a GCPDataSource resource
To use GKE Volume Populator, create a GCPDataSource custom resource. This resource defines the source storage properties to use for volume population.
Save the following manifest in a file named gcpdatasource.yaml.
apiVersion: datalayer.gke.io/v1
kind: GCPDataSource
metadata:
  name: GCP_DATA_SOURCE
  namespace: NAMESPACE
spec:
  cloudStorage:
    serviceAccountName: KSA_NAME
    uri: gs://GCS_BUCKET/
Replace the following values:
- GCP_DATA_SOURCE: the name of the GCPDataSource CRD that holds a reference to your Cloud Storage bucket. See the GCPDataSource CRD reference for more details.
- NAMESPACE: the namespace that your workloads will run on. The namespace value should be the same as your workload namespace.
- KSA_NAME: the name of the Kubernetes service account that your Pod uses to authenticate to Google Cloud APIs. The cloudStorage.serviceAccountName value should be the Kubernetes service account you set up for Workload Identity Federation for GKE in the Set up necessary permissions step.
- GCS_BUCKET: your Cloud Storage bucket name. Alternatively, you can also specify gs://GCS_BUCKET/PATH_INSIDE_BUCKET/ for the uri field.
Create the GCPDataSource resource by running this command:
kubectl apply -f gcpdatasource.yaml
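If you create several data sources, you might prefer to generate the manifest from variables instead of editing placeholders by hand. The following is a minimal sketch under that assumption. The render_gcpdatasource function and the example argument values are hypothetical; the manifest fields are the ones shown in this section.

```shell
# Print a GCPDataSource manifest.
# $1: resource name, $2: namespace, $3: Kubernetes service account name,
# $4: Cloud Storage bucket name, $5: optional path inside the bucket.
render_gcpdatasource() {
  cat <<EOF
apiVersion: datalayer.gke.io/v1
kind: GCPDataSource
metadata:
  name: $1
  namespace: $2
spec:
  cloudStorage:
    serviceAccountName: $3
    uri: gs://$4/${5:+$5/}
EOF
}

# Example with hypothetical values; redirect the output to gcpdatasource.yaml
# if you want to apply it with kubectl.
render_gcpdatasource my-data-source my-namespace my-ksa my-bucket
```

The optional fifth argument appends a path inside the bucket, which matches the gs://GCS_BUCKET/PATH_INSIDE_BUCKET/ form of the uri field.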
Create a Parallelstore StorageClass
Create a StorageClass to direct the Parallelstore CSI driver to provision Parallelstore instances in the same region as your GKE cluster. This ensures optimal I/O performance.
Save the following manifest as parallelstore-class.yaml. Make sure that the volumeBindingMode field in the StorageClass definition is set to Immediate.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: parallelstore-class
provisioner: parallelstore.csi.storage.gke.io
volumeBindingMode: Immediate
reclaimPolicy: Delete
Create the StorageClass by running this command:
kubectl apply -f parallelstore-class.yaml
If you want to create a custom StorageClass with a specific topology, refer to the Parallelstore CSI guide.
Create a PersistentVolumeClaim to access the volume
The following manifest file shows an example of how to create a PersistentVolumeClaim in ReadWriteMany access mode that references the StorageClass you created earlier.
Save the following manifest in a file named volume-populator-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: PVC_NAME
  namespace: NAMESPACE
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: parallelstore-class
  resources:
    requests:
      storage: 12Gi
  dataSourceRef:
    apiGroup: datalayer.gke.io
    kind: GCPDataSource
    name: GCP_DATA_SOURCE
Replace the following values:
- PVC_NAME: the name of the PersistentVolumeClaim where you want to transfer your data. The PersistentVolumeClaim must be backed by a Parallelstore instance.
- NAMESPACE: the namespace where your workloads will run. The namespace value should be the same as your workload namespace.
- GCP_DATA_SOURCE: the name of the GCPDataSource CRD that holds a reference to your Cloud Storage bucket. See the GCPDataSource CRD reference for more details.
Create the PersistentVolumeClaim by running the following command:
kubectl apply -f volume-populator-pvc.yaml
GKE won't schedule the workload Pod until the PersistentVolumeClaim provisioning is complete. To check on your data transfer progress, see View the data transfer progress. If you encounter errors during provisioning, refer to Troubleshooting.
Verify that the PersistentVolumeClaim provisioning completed
GKE Volume Populator uses a temporary PersistentVolumeClaim in the gke-managed-volumepopulator namespace for volume provisioning.
The temporary PersistentVolumeClaim is essentially a snapshot of your PersistentVolumeClaim that is still in transit (waiting for data to be fully loaded). Its name has the format prime-YOUR_PVC_UID.
To check its status:
Run the following commands:
PVC_UID=$(kubectl get pvc PVC_NAME -n NAMESPACE -o yaml | grep uid | awk '{print $2}')
TEMP_PVC=prime-$PVC_UID
echo $TEMP_PVC
kubectl describe pvc ${TEMP_PVC?} -n gke-managed-volumepopulator
If the output is empty, this means the temporary PersistentVolumeClaim was not created. In that case, refer to the Troubleshooting section.
If provisioning is successful, the output is similar to the following. Look for the ProvisioningSucceeded log:
Warning  ProvisioningFailed  9m12s  parallelstore.csi.storage.gke.io_gke-10fedd76bae2494db688-2237-793f-vm_5f284e53-b25c-46bb-b231-49e894cbba6c  failed to provision volume with StorageClass "parallelstore-class": rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning  ProvisioningFailed  3m41s (x11 over 9m11s)  parallelstore.csi.storage.gke.io_gke-10fedd76bae2494db688-2237-793f-vm_5f284e53-b25c-46bb-b231-49e894cbba6c  failed to provision volume with StorageClass "parallelstore-class": rpc error: code = DeadlineExceeded desc = Volume pvc-808e41a4-b688-4afe-9131-162fe5d672ec not ready, current state: CREATING
Normal  ExternalProvisioning  3m10s (x43 over 13m)  persistentvolume-controller  Waiting for a volume to be created either by the external provisioner 'parallelstore.csi.storage.gke.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
Normal  Provisioning  8s (x13 over 10m)  "xxx"  External provisioner is provisioning volume for claim "xxx"
Normal  ProvisioningSucceeded  7s  "xxx"  Successfully provisioned volume "xxx"
Check if the Parallelstore instance creation has started.
gcloud beta parallelstore instances list \
--project=PROJECT_ID \
--location=-
The output is similar to the following. Verify that your volume is in the CREATING state. When the Parallelstore instance creation is finished, the state will change to ACTIVE.
"projects/PROJECT_ID/locations/<my-location>/<my-volume>"  12000  2024-10-09T17:59:42.582857261Z  2024-10-09T17:59:42.582857261Z  CREATING  projects/PROJECT_ID/global/NETWORK_NAME
If provisioning failed, refer to the Parallelstore troubleshooting guide for additional guidance.
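The commands in this section derive the temporary PersistentVolumeClaim name by searching the YAML output for uid. If the output ever contains more than one uid line (for example, under ownerReferences), that search returns several values. The following sketch shows a slightly stricter parse. It is an illustration only: the function names and the sample UID are hypothetical, and it assumes the two-space indentation that kubectl uses for fields directly under metadata.

```shell
# Print the value of the first "uid:" field that is indented by exactly
# two spaces in the YAML read from standard input.
metadata_uid() {
  awk '/^  uid: / { print $2; exit }'
}

# Build the temporary PersistentVolumeClaim name from a UID.
temp_pvc_name() {
  printf 'prime-%s\n' "$1"
}

# Example with hypothetical YAML instead of live kubectl output.
sample_yaml='apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  ownerReferences:
  - kind: Example
    uid: 11111111-1111-1111-1111-111111111111
  uid: 22222222-2222-2222-2222-222222222222'

temp_pvc_name "$(printf '%s\n' "$sample_yaml" | metadata_uid)"
# prints prime-22222222-2222-2222-2222-222222222222
```

With a live cluster, you would pipe the output of kubectl get pvc PVC_NAME -n NAMESPACE -o yaml into metadata_uid instead of the sample text.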
(Optional) View the data transfer progress
This section shows how you can track the progress of your data transfers from a Cloud Storage bucket to a Parallelstore volume. You can do this to monitor the status of your transfer and ensure that your data is copied successfully. You should also run this command if your PersistentVolumeClaim binding operation is taking too long.
Verify the status of your PersistentVolumeClaim by running the following command:
kubectl describe pvc PVC_NAME -n NAMESPACE
Check the PersistentVolumeClaim events message to find the data transfer progress. GKE logs the messages about once per minute. The output is similar to the following:
Reason                         Message
------                         -------
PopulateOperationStartSuccess  Populate operation started
PopulateOperationStartSuccess  Populate operation started
Provisioning                   External provisioner is provisioning volume for claim "my-namespace/my-pvc"
Provisioning                   Assuming an external populator will provision the volume
ExternalProvisioning           Waiting for a volume to be created either by the external provisioner 'parallelstore.csi.storage.gke.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
PopulateOperationStartSuccess  Populate operation started
PopulatorPVCCreationProgress   objects found 7, objects copied 7, objects skipped 0. bytes found 1000020010, bytes copied 1000020010, bytes skipped 0
PopulateOperationFinished      Populate operation finished
PopulatorFinished              Populator finished
It can take some time for the populate operation to start; this operation is dependent on file size. If you don't see any data transfer progress after several minutes, refer to the Troubleshooting section.
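The PopulatorPVCCreationProgress message shown in the example output reports byte counts, which you can turn into a percentage. The following is a minimal sketch that assumes the message keeps the "bytes found N, bytes copied N" wording from that example; the progress_percent function is hypothetical.

```shell
# Print the percentage of bytes copied, rounded down.
# $1: the message text of a PopulatorPVCCreationProgress event.
progress_percent() {
  found=$(printf '%s\n' "$1" | sed -n 's/.*bytes found \([0-9][0-9]*\).*/\1/p')
  copied=$(printf '%s\n' "$1" | sed -n 's/.*bytes copied \([0-9][0-9]*\).*/\1/p')
  # Fail if the message has a different shape or nothing was found yet.
  if [ -z "$found" ] || [ -z "$copied" ] || [ "$found" -eq 0 ]; then
    echo "unknown"
    return 1
  fi
  echo $(( copied * 100 / found ))
}

progress_percent "objects found 7, objects copied 3, objects skipped 0. bytes found 1000, bytes copied 500, bytes skipped 0"
# prints 50
```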
Create a workload that consumes the volume
This section shows an example of how to create a Pod that consumes the PersistentVolumeClaim resource you created earlier.
Save the following YAML manifest for your Pod as pod.yaml.
apiVersion: v1
kind: Pod
metadata:
  name: POD_NAME
  namespace: NAMESPACE
spec:
  volumes:
  - name: parallelstore-volume
    persistentVolumeClaim:
      claimName: PVC_NAME
  containers:
  - image: nginx
    name: nginx
    volumeMounts:
    - name: parallelstore-volume
      mountPath: /mnt/data
Replace the following values:
- POD_NAME: the name of the Pod that runs your workload.
- NAMESPACE: the namespace where your workloads will run. The namespace value should be the same as your workload namespace.
- PVC_NAME: the name of the PersistentVolumeClaim where you want to transfer your data. The PersistentVolumeClaim must be backed by a Parallelstore instance.
Run the following command to apply the manifest to the cluster:
kubectl apply -f pod.yaml
Check the status of your Pod and wait until its status is Running. Your PersistentVolumeClaim must be bound before the workload can run.
kubectl describe pod POD_NAME -n NAMESPACE
Verify that the files were successfully transferred and can be accessed by your workload.
kubectl exec -it POD_NAME -n NAMESPACE -c nginx -- /bin/sh
Change to the /mnt/data directory and run ls:
cd /mnt/data
ls
The output should list all the files that exist in your Cloud Storage bucket URI.
Delete a PersistentVolumeClaim during dynamic provisioning
If you need to delete your PersistentVolumeClaim while data is still being transferred during dynamic provisioning, you have two options: graceful deletion and forced deletion.
Graceful deletion requires less effort, but can take longer and doesn't account for user misconfiguration that prevents the data transfer from completing. Forced deletion is faster and gives you more control; this option is suitable when you need to quickly restart or correct misconfigurations.
Graceful deletion
Use this deletion option to ensure that the data transfer process is completed before GKE deletes the associated resources.
Delete the workload Pod if it exists, by running this command:
kubectl delete pod POD_NAME -n NAMESPACE
Find the name of the temporary PersistentVolumeClaim:
PVC_UID=$(kubectl get pvc PVC_NAME -n NAMESPACE -o yaml | grep uid | awk '{print $2}')
TEMP_PVC=prime-$PVC_UID
echo $TEMP_PVC
Find the name of the PersistentVolume:
PV_NAME=$(kubectl describe pvc ${TEMP_PVC?} -n gke-managed-volumepopulator | grep "Volume:" | awk '{print $2}')
echo ${PV_NAME?}
If the output is empty, that means that the PersistentVolume has not been created yet.
Delete your PersistentVolumeClaim by running this command. The finalizer will block your deletion operation. Press Ctrl+C, then move on to the next step.
kubectl delete pvc PVC_NAME -n NAMESPACE
Wait for data transfer to complete. GKE will eventually delete the PersistentVolumeClaim, PersistentVolume, and Parallelstore instance.
Check that the temporary PersistentVolumeClaim, PersistentVolumeClaim, and PersistentVolume resources are deleted:
kubectl get pvc,pv -A | grep -E "${TEMP_PVC?}|PVC_NAME|${PV_NAME?}"
Check that the Parallelstore instance is deleted. The Parallelstore instance will share the same name as the PersistentVolume. You don't need to run this command if you confirmed in Step 3 that the PersistentVolume was not created.
gcloud beta parallelstore instances list \
--project=PROJECT_ID \
--location=- | grep ${PV_NAME?}
Forced deletion
Use this deletion option when you need to delete a PersistentVolumeClaim and its associated resources before the data transfer process is complete. This might be necessary in situations where the data transfer is taking too long or has encountered errors, or if you need to reclaim resources quickly.
Delete the workload Pod if it exists:
kubectl delete pod POD_NAME -n NAMESPACE
Update the PersistentVolume reclaim policy to Delete. This ensures that the PersistentVolume, along with the underlying storage, is automatically deleted when the associated PersistentVolumeClaim is deleted.
Skip the kubectl patch command if any of the following apply:
- You don't want to delete the PersistentVolume or the underlying storage.
- Your current reclaim policy is Retain and you want to keep the underlying storage. Clean up the PersistentVolume and storage instance manually as needed.
- The following echo $PV_NAME command outputs an empty string, which means that the PersistentVolume has not been created yet.
# TEMP_PVC must be set before it is used to look up the PersistentVolume.
PVC_UID=$(kubectl get pvc PVC_NAME -n NAMESPACE -o yaml | grep uid | awk '{print $2}')
TEMP_PVC=prime-$PVC_UID
PV_NAME=$(kubectl describe pvc $TEMP_PVC -n gke-managed-volumepopulator | grep "Volume:" | awk '{print $2}')
echo $PV_NAME
kubectl patch pv $PV_NAME -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
Find the name of the temporary PersistentVolumeClaim and set the environment variable for a later step:
PVC_UID=$(kubectl get pvc PVC_NAME -n NAMESPACE -o yaml | grep uid | awk '{print $2}')
TEMP_PVC=prime-$PVC_UID
echo $TEMP_PVC
Delete the PersistentVolumeClaim by running this command. The finalizer will block your deletion operation. Press Ctrl+C, then move on to the next step.
kubectl delete pvc PVC_NAME -n NAMESPACE
Remove the datalayer.gke.io/populate-target-protection finalizer from your PersistentVolumeClaim. This step is needed after deleting the PersistentVolumeClaim, otherwise gke-volume-populator adds the finalizer back to the PersistentVolumeClaim.
kubectl get pvc PVC_NAME -n NAMESPACE -o=json | \
jq '.metadata.finalizers = null' | kubectl apply -f -
Delete the temporary PersistentVolumeClaim in the gke-managed-volumepopulator namespace.
kubectl delete pvc $TEMP_PVC -n gke-managed-volumepopulator
Check that the temporary PersistentVolumeClaim, PersistentVolumeClaim, and PersistentVolume resources are deleted:
kubectl get pvc,pv -A | grep -E "${TEMP_PVC?}|PVC_NAME|${PV_NAME?}"
Check that the Parallelstore instance is deleted. The Parallelstore instance will share the same name as the PersistentVolume. You don't need to run this command if you confirmed in Step 2 that the PersistentVolume was not created.
gcloud beta parallelstore instances list \
--project=PROJECT_ID \
--location=- | grep ${PV_NAME?}
Troubleshooting
This section shows you how to resolve issues related to GKE Volume Populator.
Before proceeding, run the following command to check for PersistentVolumeClaim event warnings:
kubectl describe pvc PVC_NAME -n NAMESPACE
Error: An internal error has occurred
If you encounter the following error, this indicates that a Parallelstore API internal error has occurred.
Warning  PopulateOperationStartError  gkevolumepopulator-populator  Failed to start populate operation: populate data for PVC "xxx". Import data failed, error: rpc error: code = Internal desc = An internal error has occurred ("xxx")
To resolve this issue, you'll need to follow these steps to gather data for Support:
Run the following commands to get the name of the temporary PersistentVolumeClaim, replacing placeholders with the actual names:
PVC_UID=$(kubectl get pvc PVC_NAME -n NAMESPACE -o yaml | grep uid | awk '{print $2}')
TEMP_PVC=prime-${PVC_UID?}
echo ${TEMP_PVC?}
Run the following command to get the volume name:
PV_NAME=$(kubectl describe pvc ${TEMP_PVC?} -n gke-managed-volumepopulator | grep "Volume:" | awk '{print $2}')
Contact the support team with the error message, your project name, and the volume name.
Permission issues
If you encounter errors like the following during volume population, it indicates GKE encountered a permissions problem:
- Cloud Storage bucket doesn't exist: PopulateOperationStartError with code = PermissionDenied.
- Missing permissions on the Cloud Storage bucket or service accounts: PopulateOperationFailed with "code: "xxx" message:"Verify if bucket "xxx" exists and grant access".
- Service account not found: PopulateOperationStartError with code = Unauthenticated.
To resolve these, double-check the following:
- Cloud Storage bucket access: Verify that the bucket exists and that the IAM service account has the roles/storage.objectViewer role.
- Service accounts: Confirm that both the Kubernetes service account and the IAM service account exist and are correctly linked.
- Parallelstore service account: Ensure that it exists and has the necessary roles (roles/iam.serviceAccountTokenCreator and roles/iam.serviceAccountUser) on the IAM service account.
For detailed steps and verification commands, refer to Set up necessary permissions. If errors persist, contact support with the error message, your project name, and the Cloud Storage bucket name.
Invalid argument errors
If you encounter InvalidArgument
errors, it means you've likely provided
incorrect values in either the GCPDataSource
or PersistentVolumeClaim. The
error log will pinpoint the exact fields containing the invalid data. Check your
Cloud Storage bucket URI and other relevant fields for accuracy.
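The error patterns described in this Troubleshooting section can be collected into a small lookup helper. The following sketch is only a convenience for reading event output. The classify_populate_error function is hypothetical, and the patterns are limited to the reasons and messages quoted in this section; real messages might differ.

```shell
# Print the likely cause for a PersistentVolumeClaim event.
# $1: event reason, $2: event message.
classify_populate_error() {
  case "$1:$2" in
    PopulateOperationStartError:*"code = PermissionDenied"*)
      echo "Cloud Storage bucket doesn't exist" ;;
    PopulateOperationStartError:*"code = Unauthenticated"*)
      echo "service account not found" ;;
    PopulateOperationStartError:*"code = Internal"*)
      echo "Parallelstore API internal error" ;;
    PopulateOperationFailed:*"exists and grant access"*)
      echo "missing permissions on the bucket or service accounts" ;;
    *InvalidArgument*)
      echo "invalid value in GCPDataSource or PersistentVolumeClaim" ;;
    *)
      echo "unrecognized"
      return 1 ;;
  esac
}

classify_populate_error PopulateOperationStartError \
  "rpc error: code = Unauthenticated"
# prints service account not found
```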
What's next
- Explore the Parallelstore CSI reference documentation.
- Learn how to use the Parallelstore interception library to improve workload performance.
- Try the tutorial to train a TensorFlow model with Keras on GKE.