PoC for 1-1 restore (VNODES) #4145

karol-kokoszka commented Dec 2, 2024

This PR is just to explore the 1-1 restore.

.1-1-restore-poc/backup.tar.gz is the archived snapshot.
To run the PoC, which restores the simple backuptest_data.big_table table:

  1. Clean the Scylla data folders on the containers.
make clean_restore
  2. Extract the token ring information from the backup manifest.
make extract_tokens_and_schema
  3. Set initial_token in scylla.yaml on every node.
make set_initial_tokens
  4. Start the cluster.
make start-dev-env
  5. Recreate the schema.
make recreate_schema
  6. Extract the SSTables from the archive.
make extract_sstables
  7. Copy each node's SSTables to /var/lib/scylla/data.
make copy_sstables
  8. Refresh the restored table.
make refresh_nodes
  9. Query the number of rows.
make query_data
  10. Extract the restored data to a sorted CSV (sorted by ID).
make extract_data_to_csv
  11. Compare the restored content with the expected content.
make compare_restored
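
The extract_tokens_and_schema and set_initial_tokens steps boil down to turning each node's token list into an initial_token entry. The sketch below is only an illustration of that idea, not the Makefile's actual implementation: the manifest fragment, its "tokens" key, and the helper name initial_token_line are assumptions made for the example, and the token values are made up.

```python
import json


def initial_token_line(tokens):
    """Render a scylla.yaml `initial_token` entry from a list of integer tokens.

    Hypothetical helper: the PoC's Makefile may build this value differently.
    """
    unique = sorted(set(tokens))
    # A node cannot own the same token twice, so treat duplicates as an error.
    if len(unique) != len(tokens):
        raise ValueError("duplicate tokens in ring description")
    # initial_token takes a comma-separated list when a node owns many tokens.
    return "initial_token: " + ",".join(str(t) for t in unique)


# Assumed manifest fragment for one node; the real manifest layout may differ.
manifest = json.loads('{"ip": "192.168.100.11", "tokens": [42, -9000, 17]}')
print(initial_token_line(manifest["tokens"]))
```

Each node must get exactly the tokens it owned at backup time, so that the SSTables copied to it later contain only data in token ranges the node is responsible for.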
➜  scylla-manager git:(PoC-1-1-restore) ✗ make extract_data_to_csv 
Extracting data from Scylla table...
cqlsh 192.168.100.11 -u cassandra -p cassandra -e "COPY backuptest_data.big_table TO 'backuptest_data.big_table.restored.csv' WITH HEADER = TRUE;"
Using 11 child processes

Starting copy of backuptest_data.big_table with columns [id, data].
Processed: 2560 rows; Rate:    7451 rows/s; Avg. rate:    6158 rows/s
2560 rows exported to 1 files in 0.441 seconds.
Sorting the extracted data...
sort -t, -k1 backuptest_data.big_table.restored.csv > .1-1-restore-poc/backuptest_data.big_table.restored.sorted.csv
Removing unsorted CSV file...
rm backuptest_data.big_table.restored.csv
Data extraction and sorting complete!
➜  scylla-manager git:(PoC-1-1-restore) ✗ make compare_restored 
Comparing the content of restored ks with the expected content
Contents are the same
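
For reference, the comparison can also be done without relying on sort order in the shell. This is a minimal sketch under the assumption that both files are CSV exports with a header row, as produced by COPY ... WITH HEADER = TRUE; the function names and sample rows are illustrative and not part of the PoC.

```python
import csv
import io


def load_rows(fileobj):
    """Return (header, sorted data rows) from a CSV file object."""
    reader = csv.reader(fileobj)
    # Keep the header apart so it is never sorted in among the data rows.
    header = next(reader, None)
    return header, sorted(reader)


def same_content(restored, expected):
    """True when both CSVs have the same header and the same set of rows."""
    return load_rows(restored) == load_rows(expected)


# Illustrative in-memory data; real use would pass opened CSV files.
restored = io.StringIO("id,data\n2,b\n1,a\n")
expected = io.StringIO("id,data\n1,a\n2,b\n")
print("Contents are the same" if same_content(restored, expected) else "Contents differ")
```

Rows are compared as lists of strings, so the check is independent of the order in which COPY exported them.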


Please make sure that:

  • Code is split to commits that address a single change
  • Commit messages are informative
  • Commit titles have module prefix
  • Commit titles have issue nr. suffix

@karol-kokoszka

The current scenario covers VNODES only.
Another scenario, with tablets, must be created and tested (is it enough to restore the system tables related to tablets?).

@karol-kokoszka

UPDATE: This solution won't work for tablets and is not expected to.
For tablet keyspaces, we will fall back to load and stream.
