
When the public key/node ID changes, use a database with a separate namespace #225

Closed
Tracked by #118
richardhuaaa opened this issue Oct 15, 2024 · 14 comments

Comments

richardhuaaa (Contributor) commented Oct 15, 2024

As a starting point (which may or may not be the best approach), our unit tests create namespaced databases like this:

// newCtlDB opens a connection to the control database, which is used to
// create and drop per-test databases.
func newCtlDB(t testing.TB) (*sql.DB, string, func()) {
	return openDB(t, localTestDBDSNPrefix+localTestDBDSNSuffix)
}

// newInstanceDB creates a randomly named database, runs migrations against it,
// and returns a cleanup function that drops it again.
func newInstanceDB(t testing.TB, ctx context.Context, ctlDB *sql.DB) (*sql.DB, string, func()) {
	dbName := "test_" + RandomStringLower(12)
	_, err := ctlDB.Exec("CREATE DATABASE " + dbName)
	require.NoError(t, err)

	db, dsn, cleanup := openDB(t, localTestDBDSNPrefix+"/"+dbName+localTestDBDSNSuffix)
	require.NoError(t, migrations.Migrate(ctx, db))

	return db, dsn, func() {
		cleanup()
		_, err = ctlDB.Exec("DROP DATABASE " + dbName)
		require.NoError(t, err)
	}
}

The easiest option would be to derive the namespace name from the public key, so that during rolling upgrades all new nodes roll onto the same database.

neekolas (Contributor) commented Nov 26, 2024

I think this is a good idea that we should implement. It stops us from needing to "wipe the database" and instead lets us move over to new databases, with the possibility of rolling back if we made a mistake. It also handles rolling upgrades well, since new code will write to the new DB while old code is still running smoothly.

The remaining question is "what are the inputs to the database identifier". We can create an arbitrary set of inputs, then derive a hash to be used in the DB identifier.

Node public key is one input. At least until we decide to add a per-smart-contract originator ID, we can also include the addresses of all the smart contracts. So, if you change the smart contract we start fresh with a new DB.

We could have some "generation ID" in the code that we can manually increment to force upgrades. This can live in the code, allowing us to deploy releases which force a DB upgrade. This should be a "last resort" option, since wiping the DB without changing the smart contract could lead to incompatibilities. But there are cases where it makes sense. For example, wire format changes to messages that don't impact the smart contracts. These should be rare.

So, the database name would look something like xmtpd-${HASH(CONCAT($GENERATION_ID, $NODE_PRIVATE_KEY, $NODES_CONTRACT_ADDRESS, $MESSAGES_CONTRACT_ADDRESS, $IDENTITY_UPDATES_CONTRACT_ADDRESS))}
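A minimal sketch of that derivation (the function name, field order, and encoding here are assumptions, not the shipped implementation):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// namespace derives a deterministic database name from the inputs that define
// a node's identity. Hypothetical: the real implementation may hash different
// fields or encode them differently.
func namespace(generationID, nodePrivateKey, nodesContract, messagesContract, identityUpdatesContract string) string {
	h := sha256.Sum256([]byte(generationID + nodePrivateKey + nodesContract + messagesContract + identityUpdatesContract))
	// Truncate the digest so the name stays well under Postgres's
	// 63-byte identifier limit.
	return "xmtpd_" + hex.EncodeToString(h[:])[:12]
}

func main() {
	fmt.Println(namespace("1", "0xkey", "0xnodes", "0xmsgs", "0xidents"))
}
```

One detail worth deciding up front: this sketch uses an underscore separator rather than the `xmtpd-` hyphen, because a hyphen would force the identifier to be double-quoted in every Postgres statement.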

To implement we'll need to do some gymnastics to connect to the postgres DB on startup, make sure the DB exists, and then update the connection string to point to the right database.
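Those gymnastics might look roughly like this. Everything below is a sketch: `replaceDBName` and `ensureNamespacedDB` are assumed names, the duplicate-database check is simplified, and real code would also run migrations and wait for readiness.

```go
package main

import (
	"database/sql"
	"fmt"
	"net/url"
	"strings"
	// _ "github.com/lib/pq" // a real build would register a Postgres driver
)

// replaceDBName swaps the database path in a URL-style Postgres DSN for the
// derived namespace, leaving credentials, host, and options untouched.
func replaceDBName(dsn, dbName string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", err
	}
	u.Path = "/" + dbName
	return u.String(), nil
}

// ensureNamespacedDB connects with the bootstrap DSN, creates the namespaced
// database if it is missing, then reconnects to the new database.
func ensureNamespacedDB(bootstrapDSN, dbName string) (*sql.DB, error) {
	admin, err := sql.Open("postgres", bootstrapDSN)
	if err != nil {
		return nil, err
	}
	defer admin.Close()

	// CREATE DATABASE has no IF NOT EXISTS form, so tolerate a concurrent
	// creator. Production code should match SQLSTATE 42P04 rather than the
	// error text.
	if _, err := admin.Exec(fmt.Sprintf("CREATE DATABASE %q", dbName)); err != nil &&
		!strings.Contains(err.Error(), "already exists") {
		return nil, err
	}

	nodeDSN, err := replaceDBName(bootstrapDSN, dbName)
	if err != nil {
		return nil, err
	}
	return sql.Open("postgres", nodeDSN)
}

func main() {
	dsn, _ := replaceDBName("postgres://xmtp:xmtp@localhost:5432/postgres?sslmode=disable", "xmtpd_0a1b2c3d")
	fmt.Println(dsn)
}
```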

richardhuaaa (Contributor, Author)

If changing the generation ID switches to a new database without changing the public key, will that cause weird behavior replicating to other nodes during rolling upgrades?

mkysel removed their assignment Nov 26, 2024
neekolas (Contributor)

will that cause weird behavior replicating to other nodes during rolling upgrades?

Almost certainly. But they should be upgrading soon after and wiping their own databases, since the generation ID is in the code. Once everyone upgrades it'll quiet down, and anything that happened before the upgrade will be thrown away. So it will be noisy but self-resolving.

mkysel (Contributor) commented Nov 26, 2024

I am not a huge fan of the generation ID, mostly because I see no issues with deriving a new ID from the public key/node ID plus the contract addresses: that gives us a clear boundary. A new public key means a new node using the same PSQL server, which is safe, and a new contract means a full reality reset. Allowing a manual reset would really complicate things, as it could lead to inconsistencies.

neekolas (Contributor)

Fair concern. If we didn't have a generation ID, we would need to normalize creating new public/private keypairs as part of upgrades. That would be fine, but it takes a little more hand-holding with partners when they upgrade: we'd need to re-register them in the nodes contract, etc.

richardhuaaa (Contributor, Author)

But they should be upgrading soon after and wiping their own databases, since the generation ID is in the code. Once everyone upgrades it'll quiet down, and anything that happens before the upgrade will be thrown away. So, will be noisy but self-resolve.

I guess what I'm worried about is nodes after upgrading syncing updates from nodes that haven't upgraded yet. If I interpret that situation correctly, it won't self-resolve. And I think the same issue also happens if you update the contract addresses.

neekolas (Contributor)

So, the concern is what would happen if you have Node A (wiped DB) talking to Node B (not yet updated/wiped) and had replicated some of their pre-wipe messages?

Then Node B wipes and their originator sequence ID goes back to 0, causing conflicts?

Hmm. Maybe we do want to keep this simple and force everyone to update their private keys/nodes contract address whenever we do an upgrade. That way they have no way to talk to one another between versions.

mkysel (Contributor) commented Nov 26, 2024

I assumed that this would be hidden behind some test flag that explicitly states that it is hacky. In testnet/mainnet, the only way to wipe an instance should be to create a new nodeId.

neekolas (Contributor)

Ya. This falls into the category of "training wheels" that we'd need to remove before mainnet.

richardhuaaa (Contributor, Author)

Seems like there are two use-cases here:

  1. A specific node wants to start from scratch. In that case, namespacing the DB on the private key, and changing the private key, should solve it.
  2. Wanting to nuke the entire network and start from scratch. I'm wondering if we should just get good at running a script that automates this:
    1. Deploy new nodes contract
    2. Generate however many node keys we want
    3. Insert new node keys into contract
    4. Replace private key of all existing nodes

richardhuaaa (Contributor, Author) commented Nov 26, 2024

Alternatively, we could change the name derivation to:

xmtpd-${HASH(CONCAT($NODE_PRIVATE_KEY, $NODES_CONTRACT_ADDRESS))}

And use the act of setting a new node registry address as the act of resetting the network. Then we don't need to change the node keys - once all nodes have been reset, we can add them all to the registry.

neekolas (Contributor) commented Nov 26, 2024

Then we don't need to change the node keys - once all nodes have been reset, we can add them all to the registry.

If we add the nodes to the registry only after wiping the DBs, all the nodes are going to crash-loop until they appear in the registry, since they can't find themselves. Maybe that's fine, but it leads to more downtime and noise.

If the HTTP addresses and public keys aren't changing, the old nodes will still be able to connect to the new nodes and read messages, since the connection info in the old contract will work. That's not the end of the world, since they will wipe their DBs eventually but is still a little messy.

I'm wondering if we should just get good at running a script that automates this:

If we control all the nodes that script is straightforward enough. But when other people are running the nodes, there's a coordination cost. It feels hacky for us to be handing node operators their private keys. They should generate them locally and send us the public keys.

neekolas (Contributor) commented Nov 26, 2024

Let's make this specific: an upgrade plan for us to use the new namespaced databases to migrate testnet to the new chain/nodes contract.

xmtpd-${HASH(CONCAT($NODE_PRIVATE_KEY, $NODES_CONTRACT_ADDRESS))}

  1. Update Terraform with the new contract addresses
  2. Deploy
  3. Add the nodes to the contract after they have deployed, at which point they will stop crash looping and connect to any other nodes that are in the contract

xmtpd-${HASH(CONCAT($NODE_PRIVATE_KEY))}

  1. Generate new public/private keypairs for all nodes
  2. Deploy new Nodes smart contract
  3. Add nodes to the new smart contract
  4. Update Terraform with new private keys and contract addresses
  5. Deploy

Conclusion

I like the xmtpd-${HASH(CONCAT($NODE_PRIVATE_KEY, $NODES_CONTRACT_ADDRESS))} plan. It leads to some noisy deploys, but guarantees that after the wipe you are only talking to other nodes that have also been wiped (assuming we sequence adding nodes to the contract correctly)

neekolas added a commit that referenced this issue Dec 2, 2024
### TL;DR
Added support for namespaced databases with automatic creation and migration capabilities.

### What changed?
- Introduced `NewNamespacedDB` function to create and manage isolated database instances
- Refactored database connection logic into smaller, focused functions
- Added `createNamespace` function to handle database creation
- Implemented `waitUntilDBReady` for connection readiness checks
- Added unit tests to verify namespaced database functionality
- Made test database DSN constants public for reuse

### How to test?
1. Run the new unit test `TestNamespacedDB`
2. Verify that a new database is created with the specified namespace
3. Confirm that the created database is properly initialized with migrations
4. Check that the database name matches the provided namespace

### Why make this change?
This change enables better isolation between different database instances, allowing for separate namespaces per instance. This is particularly useful for testing scenarios and multi-tenant applications where data separation is crucial.

## Ticket

#225

mkysel (Contributor) commented Dec 3, 2024

@neekolas has this been resolved?

neekolas closed this as completed Dec 3, 2024