Update guide - database-upgrade-storage-optimization doc. Closes #210. #211

Open
wants to merge 2 commits into
base: main
@@ -5,36 +5,42 @@ sidebar_label: "Database Upgrade and Storage Optimization"

# Database Upgrade and Storage Optimization

In this guide, you will
In this guide, you will:

- Resize and/or upgrade a database engine version with minimal downtime using AWS and PostgreSQL tools.
- Monitor and troubleshoot the upgrade process.
<!-- - Set up logical replication between the source and target databases. -->
<!-- - Monitor the progress of data migration and verify database synchronization. -->
<!-- - Rename and clean up database instances after the resizing or upgrade process is complete. -->

Efficient management of database resources ensures optimal storage utilization, minimizes costs, and enhances performance by reducing unused storage. This process also ensures seamless version upgrades with minimal disruption.
[Turbot Guardrails Enterprise Database (TED)](/guardrails/docs/reference/glossary#turbot-guardrails-enterprise-database-ted) is an AWS Service Catalog product that provides automated configuration and management of the infrastructure needed to run the enterprise version of Turbot Guardrails in your AWS account. Efficient management of database resources ensures optimal storage utilization, minimizes costs, and enhances performance by reducing unused storage. This process also ensures seamless version upgrades with minimal disruption.

## Prerequisites

- Access to the Guardrails AWS account with [Administrator Privileges](/guardrails/docs/enterprise/FAQ/admin-permissions).
- PostgreSQL client installed on the [bastion host](https://github.com/turbot/guardrails-samples/tree/main/enterprise_installation/turbot_bastion_host).
- Ensure logical replication is supported and enabled on the database engine.
- Knowledge of the current database usage (storage and version).
- TED stack version is at least 1.45.0.

> [!WARNING]
> After you create replication slots (Step 7 below), you cannot upgrade existing workspaces or create new ones until the process is complete. In short, avoid any DDL changes during this window.

## Step 1: Spin up a new TED

- Create a new [TED](/guardrails/docs/reference/glossary#turbot-guardrails-enterprise-database-ted) with the same name as the original, appending `-blue` or `-green` to the end.
- If performing a database version upgrade, use the `DB Engine Version` and `Read Replica DB Engine Version` parameters under the "Database - Advanced - Engine" section. Set the appropriate `DB Engine Parameter Group Family` and the `Hive RDS Parameter Group` under the "Database - Advanced - Parameters" section.
- Set the allocated storage to match the current disk usage (e.g., if 210 GB out of 500 GB is used, set allocated storage to 210 GB) using the `Allocated Storage in GB` parameter under the "Database - Advanced - Storage" section.
- If performing a database version upgrade, use the `DB Engine Version` and `Read Replica DB Engine Version` parameters under the `Database - Advanced - Engine` section. Set the appropriate `DB Engine Parameter Group Family` and the `Hive RDS Parameter Group` under the `Database - Advanced - Parameters` section.
- Set the allocated storage to match the current disk usage using the `Allocated Storage in GB` parameter under the "Database - Advanced - Storage" section. For example, if 210 GB of 500 GB is used, set allocated storage to 210 GB. Check the `FreeStorageSpace` CloudWatch metric to determine the current usage.
- Set the maximum allocated storage to a suitable value using the `Maximum Allocated Storage limit in GB` parameter under the "Database - Advanced - Storage" section.
- Set up encryption by configuring the `Custom Hive Key` parameter to use the original KMS key under the "Advanced - Infrastructure" section. This should be the Key ID, typically formatted as: 1111233-abcd-4444-2322-123456789012.
- Set up encryption by configuring the `Custom Hive Key` parameter to use the original KMS key under the `Advanced - Infrastructure` section. This should be the Key ID, typically formatted as: `1111233-abcd-4444-2322-123456789012`.
- Keep the other parameters the same.
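
The storage-sizing rule above can be sketched as a small shell calculation. This is only an illustration, not part of the TED tooling: `used_gb` is a hypothetical helper name, and it assumes the `FreeStorageSpace` value is read from CloudWatch, where it is reported in bytes.

```shell
# Hypothetical helper: estimate used storage (GB) from the allocated size
# and the CloudWatch FreeStorageSpace value, which is reported in bytes.
used_gb() {
  allocated_gb=$1
  free_bytes=$2
  bytes_per_gb=$((1024 * 1024 * 1024))
  # Integer division rounds free space down, so the estimate errs on the high side.
  free_gb=$((free_bytes / bytes_per_gb))
  echo $((allocated_gb - free_gb))
}

# Example matching the bullet above: 500 GB allocated, 290 GB free -> prints 210
used_gb 500 311385128960
```

Rounding the free space down keeps the new allocation from being undersized by a fraction of a gigabyte.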

## Step 2: Enable Logical Replication

- Go to the AWS Console and navigate to the relevant parameter group.
- Set `rds.logical_replication` to **`1`** if it’s not already set.
- Reboot the DB instance (expected downtime is ~50 seconds).
- Turn off events. See [Pause Events](https://turbot.com/guardrails/docs/enterprise/FAQ/pause-events) for details.
- **Reboot** the DB instance (expected downtime is ~50 seconds).
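
After the reboot, it may be worth confirming that the setting took effect. On RDS, setting `rds.logical_replication` to `1` results in `wal_level` being `logical`. The sketch below assumes `psql` is available on the bastion host and that `SOURCE` holds the instance endpoint, as in the `pg_dump` command later in this guide; `logical_replication_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only when the server reports wal_level = logical.
# Assumes psql is on the PATH and SOURCE holds the instance endpoint.
logical_replication_enabled() {
  # -A (unaligned) and -t (tuples only) make psql print just the value.
  level=$(psql -h "$SOURCE" -U master -d turbot -At -c "SHOW wal_level;")
  [ "$level" = "logical" ]
}

# Usage (run from the bastion host after the reboot):
# logical_replication_enabled && echo "logical replication is on"
```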

## Step 3: Set Master Password

@@ -105,6 +111,11 @@ Use pg_dump to create a dump of the source database:
nohup pg_dump -h $SOURCE -U master -F c -b -v -f data.dump turbot > dump.log 2>&1
```

Check the dump log for errors:
```shell
grep -i error dump.log
```
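
If the check is part of a script, it can help to turn it into a pass/fail test. The following is a minimal sketch, assuming the log file name `dump.log` from the command above; `dump_log_clean` is a hypothetical helper name. Because `pg_dump -v` logs object names, a table whose name contains "error" would cause a false positive, so review any matches by hand.

```shell
# Hypothetical helper: print any lines mentioning "error" (case-insensitive)
# and return non-zero if at least one such line exists.
dump_log_clean() {
  log_file=$1
  if grep -i -n "error" "$log_file"; then
    return 1
  fi
  return 0
}

# Usage:
# dump_log_clean dump.log && echo "no errors found in dump.log"
```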

## Step 9: Restore the Dump in the Target DB

Restore the database in the target instance:
@@ -129,7 +140,7 @@ set local search_path to <workspace_schema>, public;
```

```sql
set local search_path to <workspace_schema>;
set local search_path to <workspace_schema>, public;
create trigger control_category_path_au after update on control_categories for each row when (old.path is distinct from new.path) execute procedure types_path_au('controls', 'control_category_id', 'control_category_path');
create trigger control_resource_category_path_au after update on resource_categories for each row when (old.path is distinct from new.path) execute procedure types_path_au('controls', 'resource_category_id', 'resource_category_path');
create trigger control_resource_types_path_au after update on resource_types for each row when (old.path is distinct from new.path) execute procedure types_path_au('controls', 'resource_type_id', 'resource_type_path');
@@ -204,7 +215,7 @@ SELECT n.nspname AS schema_name, COUNT(c.conname) AS constraint_count FROM pg_ca
SELECT count(tgname), tgenabled FROM pg_trigger GROUP by tgenabled;
```

## Step 14: Turn Off Events (Optional)
## Step 14: Turn Off Events

Disable events as per the guidelines: [Pause Events](https://turbot.com/guardrails/docs/guides/hosting-guardrails/troubleshooting/pause-events).

@@ -213,6 +224,16 @@ Disable events as per the guidelines: [Pause Events](https://turbot.com/guardrai
- Rename the primary instance by appending `-green`.
- Rename the new instance by removing the `-blue` suffix.

## Step 16: Disable and delete subscription

- On the target database, list the subscriptions, then disable each one, detach its replication slot, and drop it:
```sql
SELECT subname AS "Subscription Name", subowner AS "Owner ID", subenabled AS "Is Enabled", subpublications AS "Publications" FROM pg_subscription;
alter subscription <subscription_name> disable;
alter subscription <subscription_name> set (slot_name=NONE);
drop subscription <subscription_name>;
```
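
Because the subscription is detached from its slot with `slot_name=NONE` before it is dropped, PostgreSQL does not remove the corresponding replication slot on the source instance, so it has to be dropped there separately. The sketch below is one way to do that from the bastion host. It assumes `SOURCE` holds the source endpoint and reuses the `master` user and `turbot` database from the `pg_dump` command; both function names are hypothetical.

```shell
# Hypothetical helper: list replication slots remaining on the source instance.
list_replication_slots() {
  psql -h "$SOURCE" -U master -d turbot -At \
    -c "SELECT slot_name, active FROM pg_replication_slots;"
}

# Hypothetical helper: drop one inactive slot by name.
# pg_drop_replication_slot fails if the slot is still in use.
drop_replication_slot() {
  psql -h "$SOURCE" -U master -d turbot -At \
    -c "SELECT pg_drop_replication_slot('$1');"
}

# Usage:
# list_replication_slots
# drop_replication_slot <slot_name>
```

Leftover slots cause the source to retain WAL, so removing them matters if the old instance stays running for a while.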

## Step 17: Turn On Events

Refer to the documentation: [Turn On Events](https://turbot.com/guardrails/docs/guides/hosting-guardrails/troubleshooting/pause-events).