Persistent Volume & Claim

A Persistent Volume (PV) is the actual storage resource, while a Persistent Volume Claim (PVC) is a request for that storage, which a Pod then uses.

Persistent Volume

A Persistent Volume (PV) in Kubernetes is a storage resource that exists independently of Pods, allowing data to persist even if a Pod is deleted or rescheduled.

One use case that needs a persistent volume is deploying a database. We want our data to persist even when the Pods running the postgres server die or are upgraded.

Let's create a new file, postgres-pv.yaml. In this file we will put the persistent volume definition to be used for our database.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 1Gi  # 1 GiB of storage
  accessModes:
    - ReadWriteOnce  # Only one node can mount it at a time
  persistentVolumeReclaimPolicy: Retain  # Keep data even if PVC is deleted
  storageClassName: manual
  hostPath:
    path: "/mnt/data/postgresql"  # Local storage for Minikube

accessModes defines how the volume can be accessed:
- ReadWriteOnce: Can be mounted as read-write by a single node.
- ReadOnlyMany: Can be mounted as read-only by multiple nodes.
- ReadWriteMany: Can be mounted as read-write by multiple nodes.

persistentVolumeReclaimPolicy determines what happens to the PV when its PVC is deleted:
- Retain: Keeps the PV and its data so they can be reclaimed manually.
- Delete: Deletes the PV and its data.
- Recycle: Performs a basic wipe of the PV (rm -rf /thevolume/*). This policy is deprecated in favor of dynamic provisioning.

Apply and Validate

Let's apply our persistent volume definition and validate it using kubectl get pv or kubectl describe pv.

➜ kubectl apply -f postgres-pv.yaml
persistentvolume/postgres-pv created

➜ kubectl get pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
postgres-pv   1Gi        RWO            Retain           Available           manual         <unset>                          13s

Persistent Volume Claim

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes.

Create a new file, postgres-pvc.yaml, and put our PVC definition there.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: manual

After we create the PVC, Kubernetes looks for a PV that satisfies the claim's requirements. For the PV and PVC to bind, the following must hold:

- The PV supports the access mode the PVC requests (ReadWriteOnce).
- Both use the same storage class (manual).
- The storage size requested by the PVC is less than or equal to the PV's capacity. If not, the PVC stays in Pending status until a PV with enough capacity becomes available.
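
To make the size rule concrete, here is a minimal sketch of a claim that would not be expected to bind. The name postgres-pvc-too-large is hypothetical and not part of this tutorial. The sketch assumes that no other PV with the manual storage class offers at least 2Gi, and that no provisioner serves that class.

```yaml
# Hypothetical claim used only to illustrate the size rule.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc-too-large  # assumed name, not used elsewhere in this tutorial
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi  # larger than the 1Gi offered by postgres-pv
  storageClassName: manual
```

Under those assumptions, kubectl get pvc should report this claim as Pending rather than Bound.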

Apply and Validate

Let's apply the claim and validate it using kubectl get pvc or kubectl describe pvc.

➜ kubectl apply -f postgres-pvc.yaml
persistentvolumeclaim/postgres-pvc created

➜ kubectl get pvc
NAME           STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
postgres-pvc   Bound    postgres-pv   1Gi        RWO            manual         <unset>                 2s

As we can see from the get pvc output above, our PVC is successfully bound to postgres-pv.

Postgres Deployment

The next step is to create a Deployment definition for our postgres server that uses the previously defined PVC. Create a new file, postgres-deployment.yaml, and put our postgres server definition there.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:17
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_USER
          value: "admin"
        - name: POSTGRES_PASSWORD
          value: "password"
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data  # PostgreSQL data directory
      volumes:
      - name: postgres-storage
        persistentVolumeClaim:
          claimName: postgres-pvc  # Connects to PVC

In this definition we expose POSTGRES_USER and POSTGRES_PASSWORD in plain text. This is not recommended, so please don't do this in production. We will learn how to use Kubernetes Secrets for this later.
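
As a rough preview of the Secret-based alternative, the sketch below moves both values into a Secret. The name postgres-secret and its key names are assumptions chosen for illustration, not something defined earlier in this tutorial.

```yaml
# Hypothetical Secret; the name and keys are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
type: Opaque
stringData:  # plain values here; the API server stores them base64-encoded
  POSTGRES_USER: admin
  POSTGRES_PASSWORD: password
```

The container's env entries in postgres-deployment.yaml could then reference the Secret instead of holding literal values. This is a fragment of the container spec, not a complete manifest:

```yaml
        env:
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: postgres-secret  # the hypothetical Secret above
              key: POSTGRES_USER
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: POSTGRES_PASSWORD
```

The Secret manifest still contains the values in readable form, so it should be kept out of version control.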

Apply and Validate

➜ kubectl apply -f postgres-deployment.yaml 
deployment.apps/postgres created

➜ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
postgres-5f969684c4-kssfd   1/1     Running   0          62s

Postgres Service

After creating the Deployment, we need to create a Service to expose it so it is accessible to our apps. Create a new file, postgres-service.yaml, and put the postgres Service definition there.

apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432

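This exposes postgres on port 5432 inside the cluster. Because no type is set, the Service defaults to ClusterIP, so it is not reachable from outside the cluster.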
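
To illustrate how another workload might reach the database through this Service, here is a minimal sketch of a one-off client Pod. The Pod name psql-client is hypothetical. The sketch assumes the Pod runs in the same namespace as the Service and that cluster DNS is available, so the Service name postgres resolves as a hostname.

```yaml
# Hypothetical one-off client Pod; not part of the tutorial's manifests.
apiVersion: v1
kind: Pod
metadata:
  name: psql-client  # assumed name
spec:
  restartPolicy: Never
  containers:
  - name: psql
    image: postgres:17  # reused only because it ships the psql client
    command: ["psql"]
    # "postgres" is the Service name, resolved through cluster DNS
    args: ["-h", "postgres", "-U", "admin", "-c", "SELECT 1;"]
    env:
    - name: PGPASSWORD  # read by psql; plain text only for this demo
      value: "password"
```

If the connection works, kubectl logs psql-client should show the query result.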

Apply and Validate

➜ kubectl apply -f postgres-service.yaml 
service/postgres created

➜ kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP    23d
postgres     ClusterIP   10.99.16.100     <none>        5432/TCP   6s

Testing Data Persistence

To access our recently deployed postgres database and do some testing, we can port-forward the Service to our local machine.

➜ kubectl port-forward services/postgres 5432:5432 
Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432

Connect to postgres server using psql.

psql -h localhost -U admin

Create a new database named mydb (or whatever name you want), connect to it, and create a simple table named message.

admin=# CREATE DATABASE mydb;
CREATE DATABASE
admin=# \c mydb 
You are now connected to database "mydb" as user "admin".
mydb=# CREATE TABLE message
(
  id SERIAL PRIMARY KEY,
  content TEXT DEFAULT ''
);
CREATE TABLE

Insert a row into the table and select it.

mydb=# INSERT INTO message (content) VALUES ('Hello!');
INSERT 0 1
mydb=# SELECT * FROM message;
 id | content 
----+---------
  1 | Hello!
(1 row)

Let's restart our Pods using kubectl rollout restart deployment postgres. The port-forward stops once the old Pod is removed, so after the new Pod is running we need to start it again. Then connect using psql and select from the table. The existing data should still be there even after the old Pod was removed and a new one was created.

admin=# \c mydb
You are now connected to database "mydb" as user "admin".
mydb=# SELECT * FROM message;
 id | content 
----+---------
  1 | Hello!
(1 row)
