Author: Quan Hua
Last Updated: Tue, Mar 29, 2022PostgreSQL is a powerful, open-source relational database system. This tutorial explains how the Postgres Operator from Crunchy Data v5 automates and simplifies deploying and managing PostgreSQL clusters on Kubernetes. Within a few minutes, you can have a production-grade Postgres cluster with high availability for disaster recovery.
Before you begin, you should:
kubectl
and git
in your machine.Clone the official example repository from Crunchy Data.
$ git clone --depth=1 https://github.com/CrunchyData/postgres-operator-examples
$ cd cd postgres-operator-examples
Install the PGO.
$ kubectl apply -k kustomize/install
Check if PGO is READY.
$ kubectl get pods -n postgres-operator
The result should look like:
NAME READY STATUS RESTARTS AGE
pgo-59c4f987b6-6pj72 1/1 Running 0 44s
A Vultr Object Storage stores the Write-Ahead-Logging files and daily backups of your Postgres cluster.
postgres-demo-bucket
inside that Object StorageIn the postgres-operator-examples
repository, there are multiple examples to create Postgres clusters. In this tutorial, you use the postgres-operator-examples/kustomize/s3
as the starting point.
Change directory to kustomize/s3/
folder.
$ cd kustomize/s3
Copy the file s3.conf.example
to s3.conf
.
$ cp s3.conf.example s3.conf
Set your Vultr Object Storage Access Key and Secret Key into s3.conf file. Here is the example content of this tutorial.
[global]
repo1-s3-key=OR70GNH<redacted>HVKG3X
repo1-s3-key-secret=MnsrWR5kKAZ<redacted>83P3b5J2BdY5pU
Open the file postgres.yaml
and find the following section.
s3:
bucket: "<YOUR_AWS_S3_BUCKET_NAME>"
endpoint: "<YOUR_AWS_S3_ENDPOINT>"
region: "<YOUR_AWS_S3_REGION>"
Replace "<YOUR_AWS_S3_ENDPOINT>"
with the Hostname of your Vultr Object Storage. Replace "<YOUR_AWS_S3_BUCKET_NAME>"
with the bucket name in section 2. Replace <YOUR_AWS_S3_REGION>
with any text. Here is the example content of this tutorial.
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
Add the repo1-s3-uri-style: path
to the global
section as follows:
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
Add a new section under the spec
section to back up the Write-AHead-Logging (WAL) every 60 seconds:
spec:
patroni:
dynamicConfiguration:
postgresql:
parameters:
archive_timeout: 60
Vultr Block Storage requires to have a minimum size of 10GB. Change the storage: 1Gi
to storage: 10Gi
.
Add another repo2
to the repos
section, which has a volume
instead of an s3
. This creates another Vultr Block Storage to save the Write-AHead-Logging and daily backups. Here is an example configuration
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repos:
- name: repo1
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
The final content of the postgres.yaml
should be as follows:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo-s3
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
patroni:
dynamicConfiguration:
postgresql:
parameters:
archive_timeout: 60
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
configuration:
- secret:
name: pgo-s3-creds
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repos:
- name: repo1
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
Under the s3 folder, run the following command to create the Postgres Cluster
$ kubectl apply -k .
Check the running pods with kubectl get pods -n postgres-operator
. The result should look like
NAME READY STATUS RESTARTS AGE
hippo-s3-00-7nt4-0 2/2 Running 0 96s
pgo-59c4f987b6-nzpnn 1/1 Running 0 10m
If you see a similar result, you have successfully deployed a Postgres cluster on Vultr Kubernetes Engine with the following features:
The information to connect to the Postgres Cluster is inside a secret that has the name <clusterName>-pguser-<userName>
in postgres-operator
namespace. In this tutorial, the secret is hippo-s3-pguser-hippo-s3
Get the secret hippo-s3-pguser-hippo-s3
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
The output should look as follows. The value in data
are base64-encoded strings.
apiVersion: v1
data:
dbname: aGlwcG8tczM=
host: aGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM=
jdbc-uri: amRiYzpwb3N0Z3Jlc3FsOi8vaGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM6NTQzMi9oaXBwby1zMz9wYXNzd29yZD1vJTNCJTVEQlhmWFo0azUlNUVUZyU0MDklM0IlMkJGTyUyQTduJTNCJnVzZXI9aGlwcG8tczM=
password: bztdQlhmWFo0azVeVGdAOTsrRk8qN247
port: NTQzMg==
uri: cG9zdGdyZXNxbDovL2hpcHBvLXMzOm87JTVEQlhmWFo0azUlNUVUZyU0MDk7K0ZPJTJBN247QGhpcHBvLXMzLXByaW1hcnkucG9zdGdyZXMtb3BlcmF0b3Iuc3ZjOjU0MzIvaGlwcG8tczM=
user: aGlwcG8tczM=
verifier: U0NSQU0tU0hBLTI1NiQ0MDk2OnJsRlNIUERLU1VmNDE0KzNLNlN4Qmc9PSR4Z2dTbjgzaFk1QjZYSERoR2gxbjdvZmdIUWNUNnJRamZHUGwvdUVFQUVrPTo2YUFiSk9pUSs2cVVtUzZTNkpwbW1McFJXeDFFVGdFcTdKSVQ1UnozSmR3PQ==
kind: Secret
metadata:
creationTimestamp: "2022-03-12T07:24:42Z"
labels:
postgres-operator.crunchydata.com/cluster: hippo-s3
postgres-operator.crunchydata.com/pguser: hippo-s3
postgres-operator.crunchydata.com/role: pguser
name: hippo-s3-pguser-hippo-s3
namespace: postgres-operator
ownerReferences:
- apiVersion: postgres-operator.crunchydata.com/v1beta1
blockOwnerDeletion: true
controller: true
kind: PostgresCluster
name: hippo-s3
uid: 2743b032-51d0-46e7-ace8-fef49eb305a1
resourceVersion: "3380"
uid: f7664af8-f371-4a3a-b4ac-2871e2abda02
type: Opaque
Decode the value for the root password using the following command
$ echo 'bztdQlhmWFo0azVeVGdAOTsrRk8qN247' | base64 --decode
Run the following commands to get the user, database name, and password
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.user | base64decode}}'
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.dbname | base64decode}}'
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.password | base64decode}}'
Get the pod's name that is the primary node of the Postgres Cluster (pod/hippo-s3-00-7nt4-0
in this tutorial).
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Run port-forward to access the pod through localhost with port 5432. Replace hippo-s3-00-7nt4-0
with your pod name. Then, connect to your Postgres with your favorite database tool with the above credentials.
$ kubectl -n postgres-operator port-forward hippo-s3-00-7nt4-0 5432:5432
Here are some customizations that you may need. Whenever you change the postgres.yaml
file, make sure that you run the apply command as follows to apply the changes to the Postgres cluster with htis command:
$ kubectl apply -k .
Add a schedules
section to repo
to allow automatically full backup every day at 1 a.m. and incremental backup every 4 hours. You can change the Cron schedule expression as you want.
Add repo1-retention-full
field to global
field to automatically remove old backups.
Here is an example configuration:
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repo1-retention-full: "14"
repo1-retention-full-type: "count"
repo2-retention-full: "14"
repo2-retention-full-type: "count"
repos:
- name: repo1
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
Add replicas: 3
under the instances to get two more replicas in the Postgres Cluster.
Here is an example configuration:
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
replicas: 3
Synchronous Replication is useful for workloads that are sensitive to losing transactions.
The trade-offs are:
Under the patroni
section, add the synchronous_mode: true
and synchronous_commit: "on"
as follows:
patroni:
dynamicConfiguration:
synchronous_mode: true
postgresql:
parameters:
synchronous_commit: "on"
archive_timeout: 60
Connection Pooling is useful when scaling and maintaining the connection between the application and the database, especially if you are using a serverless architecture for your application.
Add the proxy
section under spec
to enable the connection pooling with PgBouncer connection pooler. You can also specify the number of replicas of the pooling
specs: proxy: pgBouncer: image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1 replicas: 2
Run the following command to see the pods for connection pooling
$ kubectl get pod -n postgres-operator -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=pgbouncer
Run the following command to see the secrets. You should see the new attributes for the connection pooling including pgbouncer-host
, pgbouncer-jdbc-uri
, pgbouncer-port
and pgbouncer-uri
.
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
Run port-forwarding to access the PgBouncer service through localhost with port 5432.
$ kubectl -n postgres-operator port-forward service/hippo-s3-pgbouncer 5432:5432
Create a section under the pgbackrest as follows:
backup:
pgbackrest:
manual:
repoName: repo1
options:
- --type=full
Annotate the Postgres Cluster to trigger a one-off backup
$ kubectl annotate -n postgres-operator postgrescluster hippo-s3 --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
Here is the final postgres.yaml
file which all the above customizations.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo-s3
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
replicas: 3
patroni:
dynamicConfiguration:
synchronous_mode: true
postgresql:
parameters:
synchronous_commit: "on"
archive_timeout: 60
proxy:
pgBouncer:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1
replicas: 2
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
configuration:
- secret:
name: pgo-s3-creds
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repo1-retention-full: "14"
repo1-retention-full-type: "count"
repo2-retention-full: "14"
repo2-retention-full-type: "count"
manual:
repoName: repo1
options:
- --type=full
repos:
- name: repo1
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
Use kubectl describe
to check parameters and events of the Postgres Cluster
$ kubectl describe -n postgres-operator postgrescluster hippo-s3
Get all the events under the postgres-operator
namespace
$ kubectl get events -n postgres-operator --sort-by='.metadata.creationTimestamp'
Get the pod name of the primary pod. Replace hippo-s3
with your cluster name
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Get the pod name of the pod which mounts the volume of the volume backup repository named repo2
in this tutorial.
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/data=pgbackrest
Get a shell into the pod that controls the volume backup repository. Replace <POD_NAME>
with your pod name
$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
In the shell, you can query the files inside the Vultr Block Storage that stores the data of repo2. Run df -h
to get the mounted location of the Vultr Block Storage (/pgbackrest/repo2
in this tutorial).
Get a shell into the primary pod. Replace <POD_NAME>
with your pod name
$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
You can access the pgbackrest tool that backs up and restores the database in the shell. Here are some useful commands
Get information of the backup repos
$ pgbackrest info
Check the pgbackrest configuration
$ pgbackrest check --stanza=db
Find the list of available Write-AHead-Logging files
$ ls -l /pgdata/pg14_wal
Find the list of uploaded Write-AHead-Logging files
$ ls -l /pgdata/pg14_wal/archive_status
Check the pgbackrest log files
$ ls /pgdata/pgbackrest/log
$ cat /pgdata/pgbackrest/log/db-backup.log
$ cat /pgdata/pgbackrest/log/db-expire.log
$ cat /pgdata/pgbackrest/log/db-stanza-create.log