How to Install PostgreSQL in Vultr Kubernetes Engine (VKE) with Postgres Operator

Author: Quan Hua

Last Updated: Tue, Mar 29, 2022
Kubernetes PostgreSQL

Introduction

PostgreSQL is a powerful, open-source relational database system. This tutorial explains how the Postgres Operator from Crunchy Data v5 automates and simplifies deploying and managing PostgreSQL clusters on Kubernetes. Within a few minutes, you can have a production-grade Postgres cluster with high availability for disaster recovery.

Prerequisites

Before you begin, you should:

1. Install PGO, the Postgres Operator from Crunchy Data

  1. Clone the official example repository from Crunchy Data.

    $ git clone --depth=1 https://github.com/CrunchyData/postgres-operator-examples
    $ cd cd postgres-operator-examples
    
  2. Install the PGO.

    $ kubectl apply -k kustomize/install
    
  3. Check if PGO is READY.

    $ kubectl get pods -n postgres-operator
    

The result should look like:

NAME                   READY   STATUS    RESTARTS   AGE
pgo-59c4f987b6-6pj72   1/1     Running   0          44s

2. Prepare a Vultr Object Storage

A Vultr Object Storage stores the Write-Ahead-Logging files and daily backups of your Postgres cluster.

3. Prepare a Manifest for Your Postgres Cluster

In the postgres-operator-examples repository, there are multiple examples to create Postgres clusters. In this tutorial, you use the postgres-operator-examples/kustomize/s3 as the starting point.

  1. Change directory to kustomize/s3/ folder.

    $ cd kustomize/s3
    
  2. Copy the file s3.conf.example to s3.conf.

    $ cp s3.conf.example s3.conf
    
  3. Set your Vultr Object Storage Access Key and Secret Key into s3.conf file. Here is the example content of this tutorial.

    [global]
    repo1-s3-key=OR70GNH<redacted>HVKG3X
    repo1-s3-key-secret=MnsrWR5kKAZ<redacted>83P3b5J2BdY5pU
    
  4. Open the file postgres.yaml and find the following section.

    s3:
      bucket: "<YOUR_AWS_S3_BUCKET_NAME>"
      endpoint: "<YOUR_AWS_S3_ENDPOINT>"
      region: "<YOUR_AWS_S3_REGION>"
    
  5. Replace "<YOUR_AWS_S3_ENDPOINT>" with the Hostname of your Vultr Object Storage. Replace "<YOUR_AWS_S3_BUCKET_NAME>" with the bucket name in section 2. Replace <YOUR_AWS_S3_REGION> with any text. Here is the example content of this tutorial.

    s3:
      bucket: "postgres-demo-bucket"
      endpoint: "ewr1.vultrobjects.com"
      region: "default"
    
  6. Add the repo1-s3-uri-style: path to the global section as follows:

    global:
      repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
      repo1-s3-uri-style: path
    
  7. Add a new section under the spec section to back up the Write-AHead-Logging (WAL) every 60 seconds:

    spec:
      patroni:
      dynamicConfiguration:
      postgresql:
        parameters:
        archive_timeout: 60
    
  8. Vultr Block Storage requires to have a minimum size of 10GB. Change the storage: 1Gi to storage: 10Gi.

  9. Add another repo2 to the repos section, which has a volume instead of an s3. This creates another Vultr Block Storage to save the Write-AHead-Logging and daily backups. Here is an example configuration

    global:
      repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
      repo1-s3-uri-style: path        
    repos:
    - name: repo1
      s3:
        bucket: "postgres-demo-bucket"
        endpoint: "ewr1.vultrobjects.com"
        region: "default"          
    - name: repo2
      volume:
        volumeClaimSpec:
          accessModes:
            - "ReadWriteOnce"
          resources:
            requests:
              storage: 10Gi
    

The final content of the postgres.yaml should be as follows:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-s3
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_timeout: 60            
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
        repo1-s3-uri-style: path
      repos:
      - name: repo1
        s3:
          bucket: "postgres-demo-bucket"
          endpoint: "ewr1.vultrobjects.com"
          region: "default"
      - name: repo2
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
            requests:
                storage: 10Gi

4. Create a Postgres Cluster

  1. Under the s3 folder, run the following command to create the Postgres Cluster

    $ kubectl apply -k .
    
  2. Check the running pods with kubectl get pods -n postgres-operator. The result should look like

    NAME                   READY   STATUS    RESTARTS   AGE
    hippo-s3-00-7nt4-0     2/2     Running   0          96s
    pgo-59c4f987b6-nzpnn   1/1     Running   0          10m
    

If you see a similar result, you have successfully deployed a Postgres cluster on Vultr Kubernetes Engine with the following features:

  • Save the data of the Postgres Cluster in a Vultr Block Storage with a size of 10GB.
  • Upload Write-AHead-Logging (WAL) every 60 seconds to a Vultr Object Storage.
  • The WAL files and backups are stored inside another Vultr Block Storage in the same Kubernetes cluster and Vultr Object Storage in another region.

5. Connect to the Postgres cluster

The information to connect to the Postgres Cluster is inside a secret that has the name <clusterName>-pguser-<userName> in postgres-operator namespace. In this tutorial, the secret is hippo-s3-pguser-hippo-s3

  1. Get the secret hippo-s3-pguser-hippo-s3

    $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
    

    The output should look as follows. The value in data are base64-encoded strings.

    apiVersion: v1
    data:
      dbname: aGlwcG8tczM=
      host: aGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM=
      jdbc-uri: amRiYzpwb3N0Z3Jlc3FsOi8vaGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM6NTQzMi9oaXBwby1zMz9wYXNzd29yZD1vJTNCJTVEQlhmWFo0azUlNUVUZyU0MDklM0IlMkJGTyUyQTduJTNCJnVzZXI9aGlwcG8tczM=
      password: bztdQlhmWFo0azVeVGdAOTsrRk8qN247
      port: NTQzMg==
      uri: cG9zdGdyZXNxbDovL2hpcHBvLXMzOm87JTVEQlhmWFo0azUlNUVUZyU0MDk7K0ZPJTJBN247QGhpcHBvLXMzLXByaW1hcnkucG9zdGdyZXMtb3BlcmF0b3Iuc3ZjOjU0MzIvaGlwcG8tczM=
      user: aGlwcG8tczM=
      verifier: U0NSQU0tU0hBLTI1NiQ0MDk2OnJsRlNIUERLU1VmNDE0KzNLNlN4Qmc9PSR4Z2dTbjgzaFk1QjZYSERoR2gxbjdvZmdIUWNUNnJRamZHUGwvdUVFQUVrPTo2YUFiSk9pUSs2cVVtUzZTNkpwbW1McFJXeDFFVGdFcTdKSVQ1UnozSmR3PQ==
    kind: Secret
    metadata:
      creationTimestamp: "2022-03-12T07:24:42Z"
      labels:
        postgres-operator.crunchydata.com/cluster: hippo-s3
        postgres-operator.crunchydata.com/pguser: hippo-s3
        postgres-operator.crunchydata.com/role: pguser
      name: hippo-s3-pguser-hippo-s3
      namespace: postgres-operator
      ownerReferences:
      - apiVersion: postgres-operator.crunchydata.com/v1beta1
        blockOwnerDeletion: true
        controller: true
        kind: PostgresCluster
        name: hippo-s3
        uid: 2743b032-51d0-46e7-ace8-fef49eb305a1
      resourceVersion: "3380"
      uid: f7664af8-f371-4a3a-b4ac-2871e2abda02
    type: Opaque
    
  2. Decode the value for the root password using the following command

    $ echo 'bztdQlhmWFo0azVeVGdAOTsrRk8qN247' | base64 --decode 
    
  3. Run the following commands to get the user, database name, and password

    $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.user | base64decode}}'
    $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.dbname | base64decode}}'
    $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.password | base64decode}}'
    
  4. Get the pod's name that is the primary node of the Postgres Cluster (pod/hippo-s3-00-7nt4-0 in this tutorial).

    $ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
    
  5. Run port-forward to access the pod through localhost with port 5432. Replace hippo-s3-00-7nt4-0 with your pod name. Then, connect to your Postgres with your favorite database tool with the above credentials.

    $ kubectl -n postgres-operator port-forward hippo-s3-00-7nt4-0 5432:5432
    

6. Customize the Postgres Cluster

Here are some customizations that you may need. Whenever you change the postgres.yaml file, make sure that you run the apply command as follows to apply the changes to the Postgres cluster with htis command:

$ kubectl apply -k .

Daily Backup to S3

Add a schedules section to repo to allow automatically full backup every day at 1 a.m. and incremental backup every 4 hours. You can change the Cron schedule expression as you want.

Add repo1-retention-full field to global field to automatically remove old backups.

Here is an example configuration:

global:
  repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
  repo1-s3-uri-style: path
  repo1-retention-full: "14"
  repo1-retention-full-type: "count"        
  repo2-retention-full: "14"
  repo2-retention-full-type: "count"                  
repos:
- name: repo1
  schedules:
    full: '0 1 * * *'
    incremental: "0 */4 * * *"
  s3:
    bucket: "postgres-demo-bucket"
    endpoint: "ewr1.vultrobjects.com"
    region: "default"
- name: repo2
  schedules:
    full: '0 1 * * *'
    incremental: "0 */4 * * *"
  volume:
    volumeClaimSpec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 10Gi

Add replicas to the Postgres Cluster

Add replicas: 3 under the instances to get two more replicas in the Postgres Cluster.

Here is an example configuration:

spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
      replicas: 3

Enable Synchronous Replication

Synchronous Replication is useful for workloads that are sensitive to losing transactions.

The trade-offs are:

  • Take longer for a transaction to commit
  • A crash in synchronous replicas blocks writes to the primary.

Under the patroni section, add the synchronous_mode: true and synchronous_commit: "on" as follows:

patroni:
  dynamicConfiguration:
    synchronous_mode: true
    postgresql:
      parameters:
        synchronous_commit: "on"
        archive_timeout: 60  

Connection Pooling

Connection Pooling is useful when scaling and maintaining the connection between the application and the database, especially if you are using a serverless architecture for your application.

  • Add the proxy section under spec to enable the connection pooling with PgBouncer connection pooler. You can also specify the number of replicas of the pooling

    specs: proxy: pgBouncer: image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1 replicas: 2

  • Run the following command to see the pods for connection pooling

    $ kubectl get pod -n postgres-operator -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=pgbouncer
    
  • Run the following command to see the secrets. You should see the new attributes for the connection pooling including pgbouncer-host, pgbouncer-jdbc-uri, pgbouncer-port and pgbouncer-uri.

    $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
    
  • Run port-forwarding to access the PgBouncer service through localhost with port 5432.

    $ kubectl -n postgres-operator port-forward service/hippo-s3-pgbouncer 5432:5432
    

Perform a Manual Backup

  • Create a section under the pgbackrest as follows:

    backup:
      pgbackrest:
        manual:
          repoName: repo1
          options:
          - --type=full
    
  • Annotate the Postgres Cluster to trigger a one-off backup

    $ kubectl annotate -n postgres-operator postgrescluster hippo-s3 --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
    

Here is the final postgres.yaml file which all the above customizations.

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-s3
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
      replicas: 3
  patroni:
    dynamicConfiguration:
      synchronous_mode: true
      postgresql:
        parameters:
          synchronous_commit: "on"
          archive_timeout: 60       
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1
      replicas: 2           
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
        repo1-s3-uri-style: path
        repo1-retention-full: "14"
        repo1-retention-full-type: "count"       
        repo2-retention-full: "14"
        repo2-retention-full-type: "count"                
      manual:
        repoName: repo1
        options:
        - --type=full
      repos:
      - name: repo1
        schedules:
          full: '0 1 * * *'
          incremental: "0 */4 * * *"
        s3:
          bucket: "postgres-demo-bucket"
          endpoint: "ewr1.vultrobjects.com"
          region: "default"          
      - name: repo2
        schedules:
          full: '0 1 * * *'
          incremental: "0 */4 * * *"
        volume:
          volumeClaimSpec:
            accessModes:
              - "ReadWriteOnce"
            resources:
              requests:
                storage: 10Gi

7. Troubleshooting Tips

  • Use kubectl describe to check parameters and events of the Postgres Cluster

    $ kubectl describe -n postgres-operator postgrescluster hippo-s3
    
  • Get all the events under the postgres-operator namespace

    $ kubectl get events -n postgres-operator  --sort-by='.metadata.creationTimestamp'
    
  • Get the pod name of the primary pod. Replace hippo-s3 with your cluster name

    $ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
    
  • Get the pod name of the pod which mounts the volume of the volume backup repository named repo2 in this tutorial.

    $ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/data=pgbackrest
    
  • Get a shell into the pod that controls the volume backup repository. Replace <POD_NAME> with your pod name

    $ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
    
  • In the shell, you can query the files inside the Vultr Block Storage that stores the data of repo2. Run df -h to get the mounted location of the Vultr Block Storage (/pgbackrest/repo2 in this tutorial).

  • Get a shell into the primary pod. Replace <POD_NAME> with your pod name

    $ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
    

You can access the pgbackrest tool that backs up and restores the database in the shell. Here are some useful commands

  • Get information of the backup repos

    $ pgbackrest info
    
  • Check the pgbackrest configuration

    $ pgbackrest check --stanza=db
    
  • Find the list of available Write-AHead-Logging files

    $ ls -l /pgdata/pg14_wal
    
  • Find the list of uploaded Write-AHead-Logging files

    $ ls -l /pgdata/pg14_wal/archive_status
    
  • Check the pgbackrest log files

    $ ls /pgdata/pgbackrest/log
    $ cat /pgdata/pgbackrest/log/db-backup.log
    $ cat /pgdata/pgbackrest/log/db-expire.log
    $ cat /pgdata/pgbackrest/log/db-stanza-create.log 
    

More Information

Want to contribute?

You could earn up to $600 by adding new articles.