Commit

instructions & standby updates

brettedw committed Dec 10, 2024
1 parent 9b7bb02 commit 9a381df
Showing 3 changed files with 70 additions and 20 deletions.
20 changes: 14 additions & 6 deletions docs/database/CLUSTER_DB.MD
@@ -49,7 +49,7 @@ Source: https://access.crunchydata.com/documentation/postgres-operator/latest/ar
When the repo-standby is spun up, it reads and applies the WAL to replicate the state of the database, and it runs as its own cluster, separate from the primary.

Spin up the Repo-Standby Cluster with:
- `PROJ_TARGET=<namespace-license-plate> BUCKET=<s3-bucket> bash openshift/scripts/oc_provision_crunchy_standby.sh <suffix> apply`
+ `PROJ_TARGET=<namespace-license-plate> BUCKET=<s3-bucket> DATE=<YYYY-MM-DD> bash openshift/scripts/oc_provision_crunchy_standby.sh <suffix> apply`

- Anecdotally, spinning up a standby cluster for a 15GB database took about 5 minutes
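The standby cluster's resources are all named from the app name plus the provision date. A minimal sketch of that naming, assuming the template's default `APP_NAME` (`wps-crunchydb-16`) and the same `DATE` default the provision script applies:

```shell
APP_NAME="wps-crunchydb-16"          # template's default APP_NAME parameter
DATE=$(date +"%Y-%m-%d")             # provision script's default when DATE is unset
CLUSTER_NAME="${APP_NAME}-${DATE}"   # name used for the PostgresCluster and its labels
echo "${CLUSTER_NAME}"               # e.g. wps-crunchydb-16-2024-12-10
```

Passing an explicit `DATE=<YYYY-MM-DD>` pins the name, which is useful when referring back to a standby created on an earlier day.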

@@ -61,16 +61,24 @@ Promote the standby cluster by editing the [crunchy_standby.yaml](../../openshif

More details here: <https://access.crunchydata.com/documentation/postgres-operator/latest/architecture/disaster-recovery#promoting-a-standby-cluster>

### Setting secrets

The promoted standby cluster creates its own secrets for connecting to pgbouncer, and it creates a new database user with the same name as the cluster. For example, if the standby cluster is named "wps-crunchy-16-2024-12-10", the new user is also named "wps-crunchy-16-2024-12-10".
Once the standby has been promoted, the easiest way to update user privileges is to reassign table ownership from the old user to the new user:
`REASSIGN OWNED BY "<old-user>" TO "<new-user>";`

The deployment that uses the crunchy secrets will then need to be updated. Manually edit the deployment YAML and change every reference to the original crunchy secrets over to the newly promoted standby cluster's secrets; new pods should then roll out successfully.
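A sketch of that ownership handoff, with illustrative user names ("wps" as the old owner, the promoted cluster's name as the new one); the `psql` line is left commented because it needs a superuser connection to the promoted cluster:

```shell
OLD_USER="wps"                        # hypothetical old table owner
NEW_USER="wps-crunchy-16-2024-12-10"  # hypothetical user created by the promoted cluster
# REASSIGN OWNED moves ownership of every object OLD_USER owns to NEW_USER
SQL="REASSIGN OWNED BY \"${OLD_USER}\" TO \"${NEW_USER}\";"
echo "${SQL}"
# psql -h localhost -p 5432 -d wps -c "${SQL}"   # run over a superuser connection
```

Note that `REASSIGN OWNED` must be run in each database containing objects owned by the old user.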

## Cluster Restore From pg_dump

In the event that the cluster can't be restored from pgbackrest, you can create a new cluster and restore it using a pg_dump stored in S3.

##### Deploy new cluster

```
oc login --token=<your-token> --server=<openshift-api-url>
PROJ_TARGET=<namespace-license-plate> BUCKET=<s3-bucket> CPU_REQUEST=75m CPU_LIMIT=2000m MEMORY_REQUEST=2Gi MEMORY_LIMIT=16Gi DATA_SIZE=65Gi WAL_SIZE=45Gi bash ./oc_provision_crunchy.sh <suffix> apply
```

##### Set superuser permissions in new cluster via OpenShift web GUI

@@ -90,7 +98,7 @@ PGUSER=$(oc get secrets -n <namespace-license-plate> "<wps-crunchydb-pguser-secr
PGDATABASE=$(oc get secrets -n <namespace-license-plate> "<wps-crunchydb-pguser-secret-name>" -o go-template='{{.data.dbname | base64decode}}')
oc -n <namespace-license-plate> port-forward "${PG_CLUSTER_PRIMARY_POD}" 5432:5432
```

##### Restore SQL dump into new cluster in another shell

Download the latest SQL dump from S3 storage and unzip it.
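A self-contained sketch of the unzip-and-restore step. It fakes the downloaded dump so the commands are runnable as-is; the dump file name and database name are illustrative, and the final `psql` command is echoed rather than executed because it assumes the port-forward from the previous step is active:

```shell
# Stand-in for the real S3 download: fake a gzipped SQL dump
printf 'SELECT 1;\n' > wps-dump.sql && gzip -f wps-dump.sql
# Unzip the dump (-f overwrites any stale .sql from a previous attempt)
gunzip -f wps-dump.sql.gz
# Restore through the port-forward opened in the other shell
RESTORE_CMD="psql -h localhost -p 5432 -d wps -f wps-dump.sql"
echo "${RESTORE_CMD}"
```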
6 changes: 5 additions & 1 deletion openshift/scripts/oc_provision_crunchy_standby.sh
@@ -25,6 +25,9 @@ source "$(dirname ${0})/common/common"
 # Target project override for Dev or Prod deployments
 #
 PROJ_TARGET="${PROJ_TARGET:-${PROJ_DEV}}"
+
+# Set DATE to today's date if it isn't set
+DATE=${DATE:-$(date +"%Y-%m-%d")}
 
 # Prepare names for crunchy ephemeral instance for this PR.
 IMAGE_STREAM_NAMESPACE=${IMAGE_STREAM_NAMESPACE:-${PROJ_TOOLS}}
@@ -35,7 +38,8 @@ OC_PROCESS="oc -n ${PROJ_TARGET} process -f ${TEMPLATE_PATH}/crunchy_standby.yaml \
   -p SUFFIX=${SUFFIX} \
   -p TARGET_NAMESPACE=${PROJ_TARGET} \
   -p BUCKET=${BUCKET} \
-  -p DATA_SIZE=45Gi \
+  -p DATE=${DATE} \
+  -p DATA_SIZE=65Gi \
   -p WAL_SIZE=15Gi \
   ${IMAGE_NAME:+ " -p IMAGE_NAME=${IMAGE_NAME}"} \
   ${IMAGE_TAG:+ " -p IMAGE_TAG=${IMAGE_TAG}"} \
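The `${VAR:+...}` expansions at the end of the `OC_PROCESS` command are what make `IMAGE_NAME` and `IMAGE_TAG` optional parameters: the `-p` flag is only emitted when the variable is set and non-empty. A simplified illustration (the tag value is hypothetical):

```shell
# ${VAR:+word} expands to word only when VAR is set and non-empty
IMAGE_TAG=""
WITH_EMPTY=$(echo ${IMAGE_TAG:+"-p IMAGE_TAG=${IMAGE_TAG}"})   # expands to nothing
IMAGE_TAG="ubi8-16.4"
WITH_VALUE=$(echo ${IMAGE_TAG:+"-p IMAGE_TAG=${IMAGE_TAG}"})   # expands to the flag
echo "empty: '${WITH_EMPTY}' set: '${WITH_VALUE}'"
```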
64 changes: 51 additions & 13 deletions openshift/templates/crunchy_standby.yaml
@@ -1,12 +1,12 @@
 apiVersion: template.openshift.io/v1
 kind: Template
 metadata:
-  name: wps-crunchydb-standby
+  name: ${APP_NAME}-${DATE}
   annotations:
-    "openshift.io/display-name": wps-crunchydb-standby
+    "openshift.io/display-name": ${APP_NAME}-${DATE}
   labels:
-    app.kubernetes.io/part-of: wps-crunchydb-standby
-    app: wps-crunchydb-standby
+    app.kubernetes.io/part-of: ${APP_NAME}-${DATE}
+    app: ${APP_NAME}-${DATE}
 parameters:
   - description: Namespace in which database resides
     displayName: Target Namespace
@@ -15,6 +15,13 @@ parameters:
   - name: BUCKET
     description: S3 bucket name
     required: true
+  - name: APP_NAME
+    description: Application name (wps - wildfire predictive services)
+    value: wps-crunchydb-16
+    required: true
+  - name: DATE
+    description: Date the standby was created
+    required: true
   - name: DATA_SIZE
     description: Data PVC size
     required: true
@@ -60,23 +67,17 @@ objects:
   - apiVersion: postgres-operator.crunchydata.com/v1beta1
     kind: PostgresCluster
     metadata:
-      name: wps-crunchydb-standby
+      name: ${APP_NAME}-${DATE}
     spec:
       postgresVersion: 16
       postGISVersion: "3.3"
       metadata:
-        name: wps-crunchydb-standby
+        name: ${APP_NAME}-${DATE}
         labels:
-          app: wps-crunchydb-standby
+          app: ${APP_NAME}-${DATE}
       databaseInitSQL:
         key: init.sql
         name: wps-init-sql
-      users:
-        - name: wps
-          databases:
-            - postgres
-            - wps
-          options: "SUPERUSER"
       instances:
         - name: crunchy
           replicas: 1
@@ -104,20 +105,57 @@
       backups:
         pgbackrest:
           image: artifacts.developer.gov.bc.ca/bcgov-docker-local/crunchy-pgbackrest:ubi8-2.41-4
+          manual:
+            repoName: repo1
+            options:
+              - --type=full
           configuration:
             - secret:
                 name: crunchy-pgbackrest
                 items:
                   - key: conf
                     path: s3.conf
           global:
+            repo1-retention-full: "3"
+            repo1-retention-full-type: count
             repo1-path: /pgbackrest/${SUFFIX}/repo1
           repos:
             - name: repo1
+              schedules:
+                full: "0 1 * * 0"
+                differential: "0 1 * * 1-6"
               s3:
                 bucket: ${BUCKET}
                 endpoint: nrs.objectstore.gov.bc.ca
                 region: "ca-central-1"
+      proxy:
+        pgBouncer:
+          image: artifacts.developer.gov.bc.ca/bcgov-docker-local/crunchy-pgbouncer:ubi8-1.21-0
+          affinity:
+            podAntiAffinity:
+              preferredDuringSchedulingIgnoredDuringExecution:
+                - podAffinityTerm:
+                    labelSelector:
+                      matchLabels:
+                        postgres-operator.crunchydata.com/cluster: db
+                        postgres-operator.crunchydata.com/role: pgbouncer
+                    topologyKey: kubernetes.io/hostname
+                  weight: 1
+          config:
+            global:
+              pool_mode: transaction
+              ignore_startup_parameters: options, extra_float_digits
+              max_prepared_statements: "10"
+              max_client_conn: "1000"
+          port: 5432
+          replicas: 1
+          resources:
+            limits:
+              cpu: 500m
+              memory: 3Gi
+            requests:
+              cpu: 100m
+              memory: 1Gi
       standby:
         enabled: true
         repoName: repo1
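With the `manual:` stanza in place, a one-off full backup can be triggered by annotating the PostgresCluster; the operator watches the `postgres-operator.crunchydata.com/pgbackrest-backup` annotation and starts the backup defined under `manual:` when it changes. A hedged sketch, where the cluster name is illustrative and the command is echoed rather than executed:

```shell
CLUSTER="wps-crunchydb-16-2024-12-10"   # hypothetical promoted/standby cluster name
# Using the current date as the annotation value makes each trigger unique
CMD="oc annotate postgrescluster ${CLUSTER} --overwrite postgres-operator.crunchydata.com/pgbackrest-backup=\"$(date)\""
echo "${CMD}"
```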
