Persistent Volumes with Persistent Disks

This page provides an overview of PersistentVolumes and PersistentVolumeClaims in Kubernetes, and their use with Kubernetes Engine. This page focuses on storage backed by Compute Engine Persistent Disks.

Overview

PersistentVolume resources are used to manage durable storage in a cluster. On Kubernetes Engine, PersistentVolumes are typically backed by Compute Engine persistent disks. PersistentVolumes can also be used with other storage types like NFS. Refer to the Kubernetes documentation for an exhaustive overview of PersistentVolumes.

Unlike Volumes, the PersistentVolume lifecycle is managed by Kubernetes. PersistentVolumes can be dynamically provisioned; the user does not have to manually create and delete the backing storage.

PersistentVolumes are cluster resources that exist independently of Pods. This means that the disk and data represented by a PersistentVolume continue to exist as the cluster changes and as Pods are deleted and recreated. PersistentVolume resources can be provisioned dynamically through PersistentVolumeClaims, or they can be explicitly created by a cluster administrator.

A PersistentVolumeClaim is a request for and claim to a PersistentVolume resource. PersistentVolumeClaim objects request a specific size, access mode, and StorageClass for the PersistentVolume. If a PersistentVolume that satisfies the request exists or can be provisioned, the PersistentVolumeClaim is bound to that PersistentVolume.

Pods use claims as Volumes. The cluster inspects the claim to find the bound Volume and mounts that Volume for the Pod.

Portability is another advantage of using PersistentVolumes and PersistentVolumeClaims. You can easily use the same Pod specification across different clusters and environments because PersistentVolume is an interface to the actual backing storage.

StorageClasses

Volume implementations such as gcePersistentDisk are configured through StorageClass resources. Kubernetes Engine creates a default StorageClass for you that uses the standard persistent disk type. The default StorageClass is used when a PersistentVolumeClaim doesn't specify a storageClassName. You can replace the provided default StorageClass with your own.
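
To see which StorageClasses exist in your cluster and which one is marked as the default, you can list them with kubectl:

kubectl get storageclass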

You can create your own StorageClass resources to describe different classes of storage. For example, classes might map to quality-of-service levels, or to backup policies. This concept is sometimes called "profiles" in other storage systems.

Dynamically provisioning PersistentVolumes

Most of the time, you don't need to directly configure PersistentVolume objects or create Compute Engine persistent disks. Instead, you can create a PersistentVolumeClaim and Kubernetes automatically provisions a persistent disk for you.

The following manifest describes a request for a 30 GiB disk whose access mode allows it to be mounted by one Pod at a time in read/write mode:

pvc-demo.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: helloweb-disk
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

When you create this PersistentVolumeClaim with kubectl apply -f pvc-demo.yaml, Kubernetes dynamically creates a corresponding PersistentVolume object. Assuming that you haven't replaced the Kubernetes Engine default storage class, this PersistentVolume is backed by a new, empty Compute Engine persistent disk. You use this disk in a Pod by using the claim as a volume.
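
For example, the following Pod manifest is a minimal sketch of a Pod that mounts the claim; the Pod name, container image, and mount path are illustrative:

pod-demo.yaml

apiVersion: v1
kind: Pod
metadata:
  name: helloweb
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    # Mount the claimed disk into the container's filesystem.
    - name: helloweb-volume
      mountPath: /var/data
  volumes:
  # Reference the PersistentVolumeClaim created above.
  - name: helloweb-volume
    persistentVolumeClaim:
      claimName: helloweb-disk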

When you delete this claim, the corresponding PersistentVolume object and the provisioned Compute Engine persistent disk are also deleted.

Should you want to prevent deletion of dynamically provisioned persistent disks, set the reclaim policy of the PersistentVolume resource, or its StorageClass resource, to Retain. In this case, you are charged for the persistent disk for as long as it exists even if there is no PersistentVolumeClaim consuming it.
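
To change the reclaim policy of an existing PersistentVolume in place, you can use kubectl patch. The volume name below is a placeholder; run kubectl get pv to find the name of the volume bound to your claim:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'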

Access Modes

PersistentVolumes support the following access modes:

  • ReadWriteOnce: The Volume can be mounted as read-write by a single node.
  • ReadOnlyMany: The Volume can be mounted read-only by many nodes.
  • ReadWriteMany: The Volume can be mounted as read-write by many nodes. PersistentVolumes that are backed by Compute Engine persistent disks don't support this access mode.

Using Compute Engine Persistent Disks as ReadOnlyMany

ReadWriteOnce is the most common use case for Persistent Disks and works as the default access mode for most applications. Compute Engine Persistent Disks also support ReadOnlyMany mode so that many applications or many replicas of the same application can consume the same disk at the same time. An example use case is serving static content across multiple replicas.

To use your disk as ReadOnlyMany, create a new PersistentVolume and PersistentVolumeClaim for the disk with the accessModes field set to ReadOnlyMany:

readonly-pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-readonly-pv
spec:
  storageClassName: ""
  capacity:
    storage: 10G
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: my-test-disk
    fsType: ext4

readonly-pvclaim.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-readonly-pvc
spec:
  # Specify "" as the storageClassName so that the claim binds to the
  # manually created PersistentVolume above instead of triggering
  # dynamic provisioning with the default StorageClass.
  storageClassName: ""
  volumeName: my-readonly-pv
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      # The request can't exceed the PersistentVolume's capacity.
      storage: 10G

Then, when using this PVC in your workloads, you need to specify readOnly: true in the Pod specification's volumes section:

volumes:
- name: my-volume
  persistentVolumeClaim:
    claimName: my-readonly-pvc
    readOnly: true

Now, multiple Pods on different nodes can all mount this PVC in read-only mode. Note that you can't attach persistent disks in write mode on multiple nodes at the same time. See Deployments vs. StatefulSets.
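
For example, the following Deployment is a minimal sketch that serves the same read-only disk from three replicas; the names, image, and mount path are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-content
spec:
  replicas: 3
  selector:
    matchLabels:
      app: static-content
  template:
    metadata:
      labels:
        app: static-content
    spec:
      containers:
      - name: web
        image: nginx
        volumeMounts:
        - name: my-volume
          mountPath: /usr/share/nginx/html
          readOnly: true
      volumes:
      # All replicas share the same read-only claim.
      - name: my-volume
        persistentVolumeClaim:
          claimName: my-readonly-pvc
          readOnly: true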

SSD persistent disks

By default, dynamically provisioned PersistentVolumes use the default StorageClass described above and are backed by standard hard disks. If you need faster SSDs, you can create your own StorageClass that uses the SSD persistent disk type. The following manifest describes a StorageClass named faster; PersistentVolumeClaims made with this StorageClass are backed by SSDs:

ssd-storageclass.yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: faster
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd

To create a PersistentVolumeClaim named "my-volume" with the faster StorageClass, refer to the StorageClass in the claim's manifest:

ssd-claim.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-volume
spec:
  storageClassName: faster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

Use kubectl apply to create this StorageClass and PersistentVolumeClaim:

kubectl apply -f ssd-storageclass.yaml
kubectl apply -f ssd-claim.yaml
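
To verify that the claim was bound to an SSD-backed volume, you can inspect the claim and the dynamically provisioned PersistentVolume:

kubectl get pvc my-volume
kubectl get pv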

Using preexisting persistent disks as PersistentVolumes

Dynamically provisioned PersistentVolumes are empty when they are created. If you have an existing Compute Engine persistent disk populated with data, you can introduce it to your cluster by manually creating a corresponding PersistentVolume resource. The persistent disk must be in the same zone as the cluster nodes.
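
If you don't have a suitable disk yet and want to try this out, you can create one with gcloud; the size and zone below are illustrative, and the zone must match your cluster's nodes:

gcloud compute disks create pd-name --size=500GB --zone=us-central1-a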

For example, if you already have a 500 GB persistent disk named pd-name, the manifest file below describes a corresponding PersistentVolume and PersistentVolumeClaim:

existing-pd.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  storageClassName: ""
  capacity:
    storage: 500G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim-demo
spec:
  # It's necessary to specify "" as the storageClassName
  # so that the default storage class won't be used, see
  # https://kubernetes.io/docs/concepts/storage/persistent-volumes/#class-1
  storageClassName: ""
  volumeName: pv-demo
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500G

Use kubectl apply -f existing-pd.yaml to create the PersistentVolume and PersistentVolumeClaim.

Deployments vs. StatefulSets

You can use PersistentVolumeClaims in higher-level controllers such as Deployments, or volume claim templates in StatefulSets.

Deployments are designed for stateless applications, so all replicas of a Deployment share the same PersistentVolumeClaim. Because the replica Pods are identical to each other, only Volumes with the ReadOnlyMany or ReadWriteMany access modes work in this setting.

Even Deployments with one replica using a ReadWriteOnce Volume are not recommended. This is because the default Deployment strategy creates a second Pod before bringing down the first Pod during a recreate. The Deployment can fail in a deadlock: the second Pod can't start because the ReadWriteOnce Volume is already in use, and the first Pod won't be removed because the second Pod has not yet started. Instead, use a StatefulSet with ReadWriteOnce Volumes.

StatefulSets are the recommended method of deploying stateful applications that require a unique Volume per replica. By using StatefulSets with volume claim templates, you can have applications that scale up automatically with a unique PersistentVolumeClaim associated with each replica Pod.
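
For example, the following StatefulSet is a minimal sketch that gives each replica its own 30 GiB disk through a volume claim template; the names, image, and mount path are illustrative:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app
spec:
  serviceName: my-app
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /var/data
  # Each replica gets its own PersistentVolumeClaim created
  # from this template (data-my-app-0, data-my-app-1, ...).
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi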

Regional Persistent Disks

Regional persistent disks replicate data between two zones in the same region, and can be used similarly to regular persistent disks. In the event of a zonal outage, Kubernetes can fail over workloads that use the volume to the other zone. You can use regional persistent disks to build highly available solutions for stateful workloads on Kubernetes Engine. You must ensure that both the primary and failover zones are configured with enough resource capacity to run the workload.

Regional SSD persistent disks are an option for applications such as databases that require both high availability and high performance. For more details see Block storage performance comparison.

As with regular persistent disks, regional persistent disks can be dynamically provisioned as needed or manually provisioned in advance by the cluster administrator.

Dynamic provisioning

To enable dynamic provisioning of regional persistent disks, the cluster administrator can create a StorageClass with the replication-type and zones parameters. For example, the following manifest describes a StorageClass that uses standard persistent disks and that replicates data to the europe-west1-b and europe-west1-c zones:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: regionalpd-storageclass
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd
  zones: europe-west1-b, europe-west1-c

To create a PersistentVolumeClaim named "regional-pvc" with the regionalpd-storageclass StorageClass, refer to the StorageClass in the claim's manifest:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: regional-pvc
  namespace: testns
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: regionalpd-storageclass

Manual provisioning

First, create a regional persistent disk. The following example creates a disk named "gce-disk-1" replicated to the europe-west1-b and europe-west1-c zones:

gcloud beta compute disks create \
 gce-disk-1 \
 --region europe-west1 \
 --replica-zones europe-west1-b,europe-west1-c

You can then create a PersistentVolume that references the regional persistent disk, following the procedure described in the previous section.
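
For example, a PersistentVolume manifest for the disk above could follow the same pattern as existing-pd.yaml; the volume name is illustrative, and the capacity must match the actual size of gce-disk-1:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-regional-pv
spec:
  storageClassName: ""
  capacity:
    # Set this to the actual size of gce-disk-1.
    storage: 500G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: gce-disk-1
    fsType: ext4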

Kubernetes automatically adds a label to PersistentVolume objects that are backed by a regional persistent disk. The label's key is failure-domain.beta.kubernetes.io/zone and its value encodes the two zones where the persistent disk is located. For example, a PersistentVolume that is backed by a regional persistent disk that replicates data to europe-west1-b and europe-west1-c has this label added to it:

failure-domain.beta.kubernetes.io/zone: "europe-west1-b__europe-west1-c"
