docs: add backup concept, guide, and configuration (#98)
arinda-arif authored Nov 3, 2021
1 parent d1001c8 commit d4476fa
Showing 4 changed files with 76 additions and 1 deletion.
8 changes: 7 additions & 1 deletion docs/docs/concepts/overview.md
@@ -457,7 +457,13 @@ After passing the validation, Replay will clear task instances of the requested
up and run. Replay will frequently check the status of each task from the scheduler (every 5 minutes) to track if
each task is still in progress, failed, or succeeded.
Optimus also provides Replay Dry Run to simulate all the impacted tasks without actually re-running the tasks.
Optimus also provides Backup, which duplicates a resource and is well suited as a step before running Replay. Users
specify the datastore and the resource to back up, and can choose to also back up the downstream resources
within the same project. The location of the backup result and its expiry can be set in the project
configuration.
Both Replay and Backup support a Dry Run that simulates all the impacted tasks or resources without actually re-running
the tasks or backing up the resources.
## Monitoring & Alerting
2 changes: 2 additions & 0 deletions docs/docs/getting-started/configuration.md
@@ -19,6 +19,8 @@ datastore:
type: bigquery
# path where resource spec for BQ are stored
path: "bq"
# backup configurations of a datastore
backup: {}

# project variables usable in specifications
config:
66 changes: 66 additions & 0 deletions docs/docs/guides/backup.md
@@ -0,0 +1,66 @@
---
id: backup
title: Backup Resources
---

Backup is a common prerequisite step before re-running or modifying a resource. Currently, Optimus supports
backup for BigQuery tables and provides dependency resolution, so downstream tables can also be backed up
as long as they are registered in Optimus and within the same project.

## Configuring backup details

Several configurations control how the backup result is created in your project. The following configurations are
available for the BigQuery datastore.

| Configuration key | Description                              | Default        |
|-------------------|------------------------------------------|----------------|
| ttl               | Time to live in duration                 | 720h           |
| prefix            | Prefix of the result table name          | backup         |
| dataset           | Where the table result should be located | optimus_backup |

These values can be set in the project [configuration](../getting-started/configuration.md).
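
As a minimal sketch, the keys from the table above could be placed in the `backup` block of the BigQuery datastore entry. The nesting and list syntax shown here are assumptions reconstructed from the project configuration sample, and the values are simply the documented defaults (720h corresponds to 30 days):

```yaml
datastore:
  - type: bigquery
    # path where resource spec for BQ are stored
    path: "bq"
    # backup configurations of a datastore
    # (key placement is an assumption; values are the documented defaults)
    backup:
      ttl: 720h                # time to live of the backup result
      prefix: backup           # prefix of the result table name
      dataset: optimus_backup  # dataset where the result table is located
```

Omitting a key should leave it at the default listed in the table.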


## Run a backup

To start a backup, run the following command:

```shell
$ optimus backup resource --project sample-project --namespace sample-namespace
```

After you run the command, you will be prompted with a few questions to answer.

```
$ optimus backup resource --project sample-project --namespace sample-namespace
? Select supported datastore? bigquery
? Why is this backup needed? backfill due to business logic change
? Backup downstream? Yes
```

You will be shown a list of resources that will be backed up, including the downstream resources (if you chose to include them).
If the list is as expected, confirm to proceed and wait until the backup is finished.

Once the backup is finished, the list of backup results is shown along with where each result is located.


## Get list of backups

The recent backups of a project can be listed using this subcommand:

```shell
$ optimus backup list --project sample-project
```

The output shows the ID of each recent backup, the resource, when the backup was created, and its description or purpose.
The backup ID is used as a postfix in the backup result name, so you can find those results in the datastore
(for example BigQuery) using the backup ID. Keep in mind, however, that these backup results have an expiry time set.

## Run a backup dry run

A dry run is also available to simulate which resources would be backed up without actually backing them up. Example of dry
run usage:

```shell
$ optimus backup resource --project sample-project --namespace sample-namespace --dry-run
```
1 change: 1 addition & 0 deletions docs/sidebars.js
@@ -37,6 +37,7 @@ module.exports = {
"guides/organising-specifications",
"guides/optimus-serve",
"guides/task-bq2bq",
"guides/backup",
"guides/replay"
],
},