When the public key/node ID changes, use a database with a separate namespace #225
I think this is a good idea that we should implement. It stops us from needing to "wipe the database" and instead lets us move over to new databases, with the possibility of rolling back if we made a mistake. It also handles rolling upgrades well, since new code will write to the new DB while old code is still running smoothly.

The remaining question is "what are the inputs to the database identifier?". We can create an arbitrary set of inputs, then derive a hash to be used in the DB identifier:

- The node public key is one input.
- At least until we decide to add a per-smart-contract originator ID, we can also include the addresses of all the smart contracts. So, if you change a smart contract, we start fresh with a new DB.
- We could have some "generation ID" in the code that we can manually increment to force upgrades. This can live in the code, allowing us to deploy releases which force a DB upgrade. It should be a "last resort" option, since wiping the DB without changing the smart contract could lead to incompatibilities. But there are cases where it makes sense, e.g. wire format changes to messages that don't impact the smart contracts. These should be rare.

So, the database name would look something like …. To implement this, we'll need to do some gymnastics to connect to the ….
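To make the derivation concrete, here is a minimal sketch of the idea, assuming the inputs are the node public key, the sorted contract addresses, and a generation ID constant compiled into the binary. This is hypothetical, not the actual xmtpd code; the separator, hash truncation, and prefix are all illustrative guesses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// generationID lives in the code; incrementing it in a release forces every
// node onto a fresh database (the "last resort" option described above).
const generationID = 1

// namespace derives a database name from a hash of all the inputs. The exact
// inputs, separator, and truncation length here are illustrative guesses.
func namespace(publicKey string, contractAddresses []string) string {
	// Sort the addresses so the hash doesn't depend on configuration order.
	sorted := append([]string(nil), contractAddresses...)
	sort.Strings(sorted)

	input := fmt.Sprintf("%s|%s|%d", publicKey, strings.Join(sorted, ","), generationID)
	sum := sha256.Sum256([]byte(input))

	// Truncate to stay well under PostgreSQL's 63-byte identifier limit.
	return "xmtp_" + hex.EncodeToString(sum[:])[:12]
}

func main() {
	fmt.Println(namespace("0x02abc...", []string{"0xNodesContract", "0xMessagesContract"}))
}
```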
If changing the generation ID switches to a new database without changing the public key, will that cause weird behavior replicating to other nodes during rolling upgrades?
Almost certainly. But they should be upgrading soon after and wiping their own databases, since the generation ID is in the code. Once everyone upgrades it'll quiet down, and anything that happens before the upgrade will be thrown away. So, it will be noisy but will self-resolve.
I am not a huge fan of generationID, mostly because I see no issues with generating a new ID from publicKey/nodeId + contract, since it gives us a clear boundary. A new public key means a new node using the same PSQL server, which is safe. And a new contract means a full reality reset. Allowing this to be reset manually would really complicate things, as it would lead to inconsistencies.
Fair concern. If we didn't have a generation ID, we would need to normalize creating new public/private keypairs as part of upgrades. That would be fine, but it takes a little more hand-holding with partners when they upgrade. We would need to re-register them into the nodes contract, etc.
I guess what I'm worried about is upgraded nodes syncing updates from nodes that haven't upgraded yet. If I interpret that situation correctly, it won't self-resolve. And I think the same issue also happens if you update the contract addresses.
So, the concern is what would happen if you have Node A (wiped DB) talking to Node B (not yet updated/wiped) after having replicated some of Node B's pre-wipe messages? Then Node B wipes and its originator sequence ID goes back to 0, causing conflicts? Hmm. Maybe we do want to keep this simple and force everyone to update their private keys/nodes contract address whenever we do an upgrade. That way they have no way to talk to one another between versions.
I assumed that this would be hidden behind some test flag that explicitly states that it is hacky. In testnet/mainnet, the only way to wipe an instance should be to create a new nodeId.
Ya. This falls into the category of "training wheels" that we'd need to remove before mainnet.
Seems like there are two use-cases here:
…
Alternatively, we could change the name derivation to:

…
And use the act of setting a new node registry address as the act of resetting the network. Then we don't need to change the node keys; once all nodes have been reset, we can add them all to the registry.
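The exact derivation above wasn't preserved, but from the surrounding discussion it presumably hashes just the node public key and the node registry contract address. A minimal, hypothetical sketch of that idea (the function name and formatting are mine, not xmtpd's):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// namespaceFromRegistry sketches the alternative derivation: hash only the
// node public key and the node registry address, so deploying a new registry
// contract acts as the network reset and the node keys never need to change.
func namespaceFromRegistry(publicKey, nodeRegistryAddress string) string {
	sum := sha256.Sum256([]byte(publicKey + "|" + nodeRegistryAddress))
	return "xmtp_" + hex.EncodeToString(sum[:])[:12]
}

func main() {
	fmt.Println(namespaceFromRegistry("0x02abc...", "0xNewNodeRegistry"))
}
```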
If we add the nodes to the registry after wiping the DBs, all the nodes are going to crash-loop until we add them, since they can't find themselves in the registry. Maybe that's fine, but it leads to more downtime and noise. If the HTTP addresses and public keys aren't changing, the old nodes will still be able to connect to the new nodes and read messages, since the connection info in the old contract will work. That's not the end of the world, since they will wipe their DBs eventually, but it is still a little messy.
If we control all the nodes, that script is straightforward enough. But when other people are running the nodes, there's a coordination cost. It feels hacky for us to be handing node operators their private keys. They should generate them locally and send us the public keys.
Let's make this specific: an upgrade plan for us to use the new namespaced databases to migrate testnet to the new chain/nodes contract.
### TL;DR

Added support for namespaced databases with automatic creation and migration capabilities.

### What changed?

- Introduced `NewNamespacedDB` function to create and manage isolated database instances
- Refactored database connection logic into smaller, focused functions
- Added `createNamespace` function to handle database creation
- Implemented `waitUntilDBReady` for connection readiness checks
- Added unit tests to verify namespaced database functionality
- Made test database DSN constants public for reuse

### How to test?

1. Run the new unit test `TestNamespacedDB`
2. Verify that a new database is created with the specified namespace
3. Confirm that the created database is properly initialized with migrations
4. Check that the database name matches the provided namespace

### Why make this change?

This change enables better isolation between different database instances, allowing for separate namespaces per instance. This is particularly useful for testing scenarios and multi-tenant applications where data separation is crucial.

## Ticket

#225

## Summary by CodeRabbit

- **New Features**
  - Introduced a function to check database readiness and a function to create a namespaced database.
  - Enhanced database configuration handling with a new parsing function.
- **Bug Fixes**
  - Improved error handling during namespace creation.
- **Tests**
  - Added tests to validate the creation and querying of namespaced databases, including support for multiple instances with the same name and error handling for invalid inputs.
- **Documentation**
  - Updated visibility of constants for better accessibility in the test utilities.
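For orientation, here is a rough sketch of how the three functions named in the PR could fit together. The function names come from the PR description, but the signatures, the `database/sql` + `lib/pq` usage, and the DSN handling are all assumptions; the real xmtpd implementation may differ.

```go
package db

import (
	"context"
	"database/sql"
	"fmt"
	"time"

	_ "github.com/lib/pq" // assumed driver; xmtpd may use a different one
)

// waitUntilDBReady pings the database until it responds or the context expires.
func waitUntilDBReady(ctx context.Context, db *sql.DB) error {
	for {
		if err := db.PingContext(ctx); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("database never became ready: %w", ctx.Err())
		case <-time.After(100 * time.Millisecond):
		}
	}
}

// createNamespace issues CREATE DATABASE. The real implementation reportedly
// tolerates an already-existing database; that error inspection is omitted
// here. CREATE DATABASE cannot be parameterized, so the name must be
// validated before being interpolated.
func createNamespace(ctx context.Context, admin *sql.DB, name string) error {
	_, err := admin.ExecContext(ctx, fmt.Sprintf("CREATE DATABASE %s", name))
	return err
}

// NewNamespacedDB connects with an admin DSN, creates the namespaced
// database, and returns a connection to it. Running migrations on the new
// database is omitted from this sketch.
func NewNamespacedDB(ctx context.Context, adminDSN, namespace string) (*sql.DB, error) {
	admin, err := sql.Open("postgres", adminDSN)
	if err != nil {
		return nil, err
	}
	defer admin.Close()

	if err := waitUntilDBReady(ctx, admin); err != nil {
		return nil, err
	}
	if err := createNamespace(ctx, admin, namespace); err != nil {
		return nil, err
	}

	// Assumes a keyword/value DSN with no dbname already set; real code
	// would rewrite the DSN's database component properly.
	return sql.Open("postgres", adminDSN+" dbname="+namespace)
}
```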
@neekolas has this been resolved?
As a starting point (which may or may not be the best approach), our unit tests create namespaced databases like this:
`xmtpd/pkg/testutils/store.go`, lines 30 to 47 at commit `02c16df`
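The embedded snippet isn't preserved here. As a rough idea of what such a test helper can look like, here is a hypothetical sketch — not the actual contents of `store.go` — assuming a local Postgres instance and made-up DSN constants (the PR mentions making such constants public):

```go
package testutils

import (
	"context"
	"database/sql"
	"fmt"
	"testing"

	"github.com/google/uuid"
	_ "github.com/lib/pq"
)

// Assumed DSN constants; the actual values and names in xmtpd may differ.
const (
	LocalTestDBDSNPrefix = "postgres://postgres:xmtp@localhost:5432"
	LocalTestDBDSNSuffix = "?sslmode=disable"
)

// NewDB creates a uniquely named database so parallel tests stay isolated,
// and registers a cleanup that drops it when the test finishes.
func NewDB(t *testing.T, ctx context.Context) *sql.DB {
	dbName := "test_" + uuid.NewString()[:8]

	admin, err := sql.Open("postgres", LocalTestDBDSNPrefix+"/postgres"+LocalTestDBDSNSuffix)
	if err != nil {
		t.Fatal(err)
	}
	if _, err = admin.ExecContext(ctx, fmt.Sprintf("CREATE DATABASE %s", dbName)); err != nil {
		t.Fatal(err)
	}

	db, err := sql.Open("postgres", LocalTestDBDSNPrefix+"/"+dbName+LocalTestDBDSNSuffix)
	if err != nil {
		t.Fatal(err)
	}

	t.Cleanup(func() {
		_ = db.Close()
		_, _ = admin.ExecContext(context.Background(), fmt.Sprintf("DROP DATABASE %s", dbName))
		_ = admin.Close()
	})
	return db
}
```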
Easiest would be to derive the namespace name from the public key, so that during rolling upgrades all new nodes roll onto the same database.