How To Recover Persistent Volume Snapshots
Estimated time to read: 5 minutes
In this tutorial you will learn how to recover a PostgreSQL database from a previously created persistent volume snapshot.
First we will deploy a PostgreSQL database in Kubernetes, storing its data on a new persistent volume. After populating the database, we will snapshot the persistent volume as a backup and then use that snapshot to demonstrate recovery of lost data.
Prerequisites:
In this tutorial we use the following tools:
- kubectl (https://kubernetes.io/docs/tasks/tools/)
kubectl must be installed before you begin.
The tutorial will be split into five parts:
- Deploy PostgreSQL Database
- Create VolumeSnapshotClass
- Create Volume Snapshot
- Recover Volume from Snapshot
- Cleanup
Deploy PostgreSQL Database
1. Create file postgres-configmap.yaml containing the database configuration, update accordingly:
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  labels:
    app: postgres
data:
  POSTGRES_DB: mydb
  POSTGRES_USER: myuser
  POSTGRES_PASSWORD: mypassword
2. Create file postgres-deployment.yaml containing a PostgreSQL Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      initContainers:
        # Create the subdirectory that the database container mounts via subPath
        - name: init-volume
          image: 'postgres:17'
          command: ['sh', '-c', "mkdir -p /data/postgres"]
          volumeMounts:
            - mountPath: /data
              name: data
      containers:
        - name: postgres
          image: 'postgres:17'
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-config
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: data
              subPath: postgres
      volumes:
        # Initially use the PVC defined in postgres-pvc.yaml
        - name: data
          persistentVolumeClaim:
            claimName: postgres-pvc
3. Create file postgres-pvc.yaml containing a PersistentVolumeClaim for storing the data:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: default
4. Deploy all the resources:
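A minimal sketch of this step, assuming the three manifests above were saved in the current directory and that your kubectl context already points at the target cluster and namespace:

```shell
# Create the ConfigMap and the PVC first, then the Deployment that uses them.
kubectl apply -f postgres-configmap.yaml
kubectl apply -f postgres-pvc.yaml
kubectl apply -f postgres-deployment.yaml
```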
5. Confirm the postgres deployment is ready:
kubectl get deployment -o wide
NAME       READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES        SELECTOR
postgres   1/1     1            1           86m   postgres     postgres:17   app=postgres
Populating the database is out of scope for this tutorial, but there are several ways to access it from your workstation so you can insert some dummy data.
When accessing the database, use the credentials set in the database configuration created earlier.
The first method is to forward the database connection to your workstation, making the database accessible at address 127.0.0.1:5432:
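One way to set up such a forward, assuming the Deployment is named postgres as above and local port 5432 is free:

```shell
# Forward local port 5432 to port 5432 of a Pod in the postgres Deployment.
# The command stays in the foreground; stop it with Ctrl+C.
kubectl port-forward deployment/postgres 5432:5432
```

While the forward is running, any PostgreSQL client on your workstation can connect to 127.0.0.1:5432 with the credentials from the ConfigMap.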
Another method is to open a psql prompt in the running postgres deployment:
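A possible invocation, assuming the user and database names from the ConfigMap above (myuser and mydb) were left unchanged:

```shell
# Open an interactive psql session inside a Pod of the postgres Deployment.
kubectl exec -it deployment/postgres -- psql -U myuser -d mydb
```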
Example queries to populate the database with some dummy data:
-- Create a table
CREATE TABLE IF NOT EXISTS films (
  id SERIAL PRIMARY KEY,
  title VARCHAR(100) NOT NULL
);
-- Insert rows
INSERT INTO films (title)
VALUES ('Inside Out'), ('Toy Story'), ('Monsters Inc.'), ('Finding Nemo');
-- List current rows
SELECT id, title
FROM films;
Output of the last query should list all inserted film titles:
 id |     title
----+---------------
  1 | Inside Out
  2 | Toy Story
  3 | Monsters Inc.
  4 | Finding Nemo
(4 rows)
In the next step we will create a VolumeSnapshotClass.
Create VolumeSnapshotClass
Creating volume snapshots requires a VolumeSnapshotClass, which is similar to a StorageClass but specific to volume snapshots. If your cluster already has a VolumeSnapshotClass, you may skip this part and use the existing one instead.
1. Create file volumesnapshotclass.yaml containing the VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
driver: cinder.csi.openstack.org
deletionPolicy: Delete
metadata:
  name: default
2. Deploy the VolumeSnapshotClass:
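A sketch of this step, assuming the manifest was saved as volumesnapshotclass.yaml in the current directory:

```shell
# VolumeSnapshotClass is cluster-scoped, so no namespace is needed.
kubectl apply -f volumesnapshotclass.yaml

# Optionally list the classes to confirm it exists.
kubectl get volumesnapshotclass
```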
In the next step we will create a snapshot of the database volume.
Create Volume Snapshot
To be fast, databases and filesystems cache writes in volatile memory and periodically flush them to persistent storage. This ubiquitous optimization makes it unsafe to snapshot a volume that is in use, so we need to temporarily stop the database and detach the volume. Doing so flushes all pending writes to persistent storage and prevents data loss.
Info
If your use-case is not affected by unflushed writes you can forcefully create snapshots from in-use volumes by updating the target VolumeSnapshotClass with an additional parameter, e.g.:
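As a sketch of what such a class could look like, the Cinder CSI driver documents a force-create snapshot parameter for this purpose. The file name below is hypothetical, and the manifest otherwise repeats the VolumeSnapshotClass used in this tutorial; check your driver's documentation before relying on it.

```shell
# Write a VolumeSnapshotClass manifest that allows snapshots of in-use volumes.
# The file name is an assumption made for this example.
cat > volumesnapshotclass-force.yaml <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
driver: cinder.csi.openstack.org
deletionPolicy: Delete
metadata:
  name: default
parameters:
  force-create: "true"
EOF
```

The resulting file can be applied with kubectl apply -f like the other manifests in this tutorial.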
1. Scale down the deployment to stop the database:
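A sketch of this step, mirroring the scale-up command used later in this part. The wait command is an optional addition that relies on the app=postgres label from the Deployment manifest:

```shell
# Stop the database by scaling the Deployment to zero replicas.
kubectl scale --replicas 0 deployment postgres

# Optionally block until the Pod is gone, so the volume can be detached.
kubectl wait --for=delete pod -l app=postgres --timeout=120s
```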
2. Create file postgres-snapshot.yaml containing the reference for the new volume snapshot:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-pvc
spec:
  volumeSnapshotClassName: default
  source:
    persistentVolumeClaimName: postgres-pvc
3. Create the snapshot:
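A sketch of this step, assuming the manifest was saved as postgres-snapshot.yaml in the current directory:

```shell
# Request the snapshot; the CSI driver creates it asynchronously.
kubectl apply -f postgres-snapshot.yaml
```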
4. Confirm the snapshot is created and ready to use:
% kubectl get VolumeSnapshot
NAME           READYTOUSE   SOURCEPVC      RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT
postgres-pvc   true         postgres-pvc   1Gi           default         snapcontent-be0019e0-2209-43e5-9a54-485f973911d5
5. Restart the database by scaling the deployment back up, and confirm it is ready:
% kubectl scale --replicas 1 deployment postgres
deployment.apps/postgres scaled
% kubectl get deployment postgres
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
postgres   1/1     1            1           131m
In the next step we will recover the database using the volume snapshot.
Recover Volume from Snapshot
First, let's simulate some data loss by deleting the first row in the database:
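One way to do this without an interactive session, assuming the sample films table from earlier and the default credentials from the ConfigMap:

```shell
# Delete the row with id 1 ('Inside Out' in the sample data).
kubectl exec deployment/postgres -- psql -U myuser -d mydb -c 'DELETE FROM films WHERE id = 1;'
```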
The first row was deleted:
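To check this, the table can be listed again. This sketch assumes the sample data and default credentials from earlier, in which case only the rows with id 2, 3 and 4 should remain:

```shell
# List the remaining rows in the films table.
kubectl exec deployment/postgres -- psql -U myuser -d mydb -c 'SELECT id, title FROM films;'
```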
The recovery process in a nutshell:
- Create a new persistent volume based on the volume snapshot created before the data loss occurred
- Update the postgres deployment with the new persistent volume
1. Create file postgres-restored-pvc.yaml containing a reference to the volume snapshot created earlier:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-restored-pvc
spec:
  storageClassName: default
  dataSource:
    name: postgres-pvc
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
2. Apply the manifest to create the new PVC:
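A sketch of this step, assuming the manifest was saved as postgres-restored-pvc.yaml in the current directory:

```shell
# Create a new PVC whose volume is provisioned from the snapshot.
kubectl apply -f postgres-restored-pvc.yaml
```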
3. Confirm the PVC is created. Depending on the StorageClass used, its status will be Pending or Bound:
kubectl get pvc
NAME                    STATUS    VOLUME                                          CAPACITY   ACCESS MODES   STORAGECLASS
postgres-restored-pvc   Pending   pv-shoot-cc978d5b-351e-4245-b801-86d2e4e13bcc   1Gi        RWO            default
4. Change the postgres deployment in file postgres-deployment.yaml to use the new PVC:
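The only field that needs to change is claimName under volumes, which should become postgres-restored-pvc. You can edit the file by hand, or, as a sketch that assumes the file contains exactly one claimName entry, use sed:

```shell
# Point the Deployment's volume at the restored PVC.
# -i.bak edits the file in place and keeps a backup as postgres-deployment.yaml.bak.
sed -i.bak 's/claimName: .*/claimName: postgres-restored-pvc/' postgres-deployment.yaml

# Show the edited line to verify the change.
grep 'claimName' postgres-deployment.yaml
```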
5. Apply the changes:
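A sketch of this step, assuming postgres-deployment.yaml was updated as described in the previous step. The rollout status command is an optional addition:

```shell
# Apply the updated Deployment.
kubectl apply -f postgres-deployment.yaml

# Optionally wait until the new Pod is ready.
kubectl rollout status deployment/postgres
```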
Info
Kubernetes will notice the change and detach the current PVC, stop and delete the current Pod, attach the new PVC and deploy a new Pod.
6. Confirm the database was recovered by listing the rows in the database table:
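One way to run the query, assuming the default credentials from the ConfigMap and the sample films table:

```shell
# List all rows in the films table of the restored database.
kubectl exec deployment/postgres -- psql -U myuser -d mydb -c 'SELECT id, title FROM films;'
```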
Output:
 id |     title
----+---------------
  1 | Inside Out
  2 | Toy Story
  3 | Monsters Inc.
  4 | Finding Nemo
(4 rows)
Cleanup
To wrap it up, delete all previously created resources:
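A sketch of the cleanup, assuming all manifests from this tutorial are still in the current directory. Skip the last command if you used a VolumeSnapshotClass that already existed in your cluster:

```shell
# Remove the Deployment first so the volumes are released.
kubectl delete -f postgres-deployment.yaml

# Remove both PVCs, the snapshot and the database configuration.
kubectl delete -f postgres-restored-pvc.yaml
kubectl delete -f postgres-pvc.yaml
kubectl delete -f postgres-snapshot.yaml
kubectl delete -f postgres-configmap.yaml

# Remove the VolumeSnapshotClass created for this tutorial.
kubectl delete -f volumesnapshotclass.yaml
```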