Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: registrar #471

Merged
merged 7 commits into from
Nov 6, 2019
Merged

feat: registrar #471

merged 7 commits into from
Nov 6, 2019

Conversation

vasco-santos
Copy link
Member

@vasco-santos vasco-santos commented Nov 2, 2019

The registrar aims to be a bridge between all subsystems and js-libp2p (peer-store + connection-manager). Before this refactor, js-libp2p subsystems like pubsub and dht were responsible for tracking new peers that appear in the network and try to dial them using their respective protocols. Other than that, each subsystem was responsible for managing the connections and dials they were doing. This way, we aim to centralize this logic and enable subsystems to only care about their work, and leave the dialing and connection management for who should take care of them. It is also important pointing out that this is the first step to enable us to enhance the connection-manager.

More context on libp2p/notes#13 and cryptpad

This PR is on top of #470 branch and was also extracted from #467


// Setup the Upgrader
this.upgrader = new Upgrader({
localPeer: this.peerInfo.id,
onConnection: (connection) => {
const peerInfo = getPeerInfo(connection.remotePeer)

this.peerStore.put(peerInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The put here doesn't do anything. connection.remotePeer is a PeerId, so getPeerInfo is just going to check the peerStore anyway. I would delete the getPeerInfo usage and just do:

const peerInfo = this.peerStore.get(connection.remotePeer.toB58String())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a peerB dials peerA the peerStore for peerA does not have the peerB on it. That way, we would have peerInfo = undefined
getPeerInfo puts in the peerStore if it is provided. I changed for the following:

const peerInfo = getPeerInfo(connection.remotePeer, this.peerStore)

this.registrar.onConnect(peerInfo, connection)
this.emit('peer:connect', peerInfo)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, if I do that getPeerInfo tries to put invalid PeerInfo to the store in some scenarios from other tests.

src/index.js Show resolved Hide resolved
src/registrar.js Outdated Show resolved Hide resolved
src/registrar.js Outdated Show resolved Hide resolved
src/registrar.js Outdated Show resolved Hide resolved
src/registrar.js Outdated
* @param {object} topology properties for topology
* @return {string} registrar identifier
*/
register (multicodecs, handlers, topology = {}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

topology should be required. The registrar shouldn't create them, it should just track and inform them. With that the handlers shouldn't need to be passed, because they're part of the topology.

multicodecs shouldn't be needed and should exist solely in the topology

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this scenario, I am also using multicodecs to try to connect to the already connected peers using that multicodec. If I don't use them here, I will need to try onConnect to all the peers, which will fail the newStream inside Pubsub (as an example). Other than that, we will need to also try to connect with peers from the peerStore that are not connected

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only needed right now because Identify hasn't been refactored yet, right? The Registrar shouldn't need to care about multicodecs once identify is running. I think eventually we could have a peer discovery mechanism that checked the multicodecs our node supports (those added via libp2p.handle) and querying the network or rendezvous servers for nodes that also support those. This still wouldn't require getting multicodecs from the topology as we'll know them from handle/unhandle calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this conversation has more insights.

Basically, I am using them for getting the already connected peers.

I have a doubt though: Once pubsub starts, a identify-push will occur. If we have already connected and identified other peers (who support pubsub), we will already have that information right? This way, protocols will not change and we need to try to connect using pubsub on those peers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, let's say Libp2p has been running for a bit with pubsub off and the following is true:

  • We have 100 known peers
  • We are connected to 25 of those
  • The connection manager has a minimum of 25 and a max of 300 connections.
  • The pubsub topology will want a minimum of 15 peers

Now, we enable pubsub. This causes a MulticodecTopology to be created and registered. The logic could look like this:

const top = new MulticodecTopology({ ... })
const regId = registrar.register(top)

// In Registrar.register
topology.registrar = this

Instead of passing the registrar to the topology when it's created, the registrar could set the value during registration, if it's successful. If Topology uses a setter for registrar, it could then just iterate over the existing registrar.peerStore.peers set. If any peers support the desired multicodecs, the Topology can then just add them to its own peer set.

The change in the Topology peer set, would cause the Registrar to reprioritize the peer. If we assume the Topology finds 20 valid peers, and only 5 are connected, we would have a deficit of 10 known peers we need to establish new connections to (which the connection manager will take care of later).

The Topology could then retrieve the existing connections from the registrar and pass that off to pubsub to create new streams from.


Once pubsub starts, a identify-push will occur. If we have already connected and identified other peers (who support pubsub), we will already have that information right? This way, protocols will not change and we need to try to connect using pubsub on those peers.

The identify-push won't matter because it's just informing peers we're already connected to. This is why I think the Topology should iterate over the known peers after registration. It has access to that information via the Registrar, it just needs to check. Not all Topologies will care about doing that, so they should be managing that themselves.

src/registrar.js Outdated Show resolved Hide resolved
sinon.spy(libp2p.registrar, 'onConnect')
sinon.spy(libp2p.registrar, 'onDisconnect')

const connection = await libp2p.dial(remoteAddr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mock the internal dial as we don't actually need to connect to another peer for these tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test needs the remotePeer to receive the dial. I just changed it accordingly

src/connection-manager/topology.js Outdated Show resolved Hide resolved
// New to protocol support
for (const protocol of protocols) {
if (this.multicodecs.includes(protocol)) {
this.tryToConnect(peerInfo, this.registrar.getPeerConnection(peerInfo))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of tryToConnect? Ultimately I think this line should just be an add to this.peers. When registration happens we could use Proxy to have the registrar watch for changes on each Toplogies peer set. When that changes it can take action on prioritizing peers for the connection manager to connect/disconnect to.

The flow I am thinking of is:

  1. A new peer is discovered and we are under the min threshold for a topology (multicodec, bootstrap, root, etc)
  2. The connection manager takes the highest priority peers and dials to them (for a typical node on startup, this would be the bootstrap peers, and any other priority peers like ipfs preload nodes)
  3. When the connection is made, the registrar is notified and in turn notifies the topologies, so they can take action if desired
    a. A multicodec topology could just ignore this, because it cares about protocol updates
  4. When identify runs, the protocol change is triggered by the peer store, and the multicodec does a check for its multicodecs
    a. If the multicodec is present, it adds the peer to its list of peers
    b. The peer set change, causes the registrar to update the priority of that peer
  5. The reprioritized peer will now be connected/disconnected accordingly by the connection manager

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point for tryToConnect is basically the following use case:

When a peer starts a subsystem (pubsub as an example), the peer may already be connected to several peers. Take into account that someone enables the subsystem in runtime. By that time, the peer will know and be connected to several peers (some of them may support the protocol of the subsystem). Once the subsystem is started and the registrar is notified of it, the registrar must inform the topology that it is already connected with some peers (tryToConnect), and the topology will do onConnect if it is inside the thresholds. This way, the tryToConnect is basically a subsystem.onConnect with a validation before, in order to guarantee that we are inside the thresholds.

The flow that you described looks good, except for the initial part that I just described. However, I am not sure if we are getting too complex with the Proxy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the registrar must inform the topology that it is already connected with some peers

I would expect that when a topology is registered, the registrar would simply trigger onConnect for any existing connections. It would basically be a replay mechanism for lazily started systems.

However, I am not sure if we are getting too complex with the Proxy

It's not really complex though, it's just another way of creating an observer. We could do it with hooks or events, but the Proxy is cleaner IMO. The registrar should be the only thing observing topology.peers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that when a topology is registered, the registrar would simply trigger onConnect for any existing connections. It would basically be a replay mechanism for lazily started systems.

I see 2 problems with that. The topology needs to update its peers set and if we are already connected with a lot of peers, we may immediately exceed the threshold. That's why I would rather have a tryToConnect method.

It's not really complex though, it's just another way of creating an observer. We could do it with hooks or events, but the Proxy is cleaner IMO. The registrar should be the only thing observing topology.peers.

I think we can iterate on that later, as we will not need this until getting to connectionManager

@vasco-santos vasco-santos force-pushed the feat/registrar branch 3 times, most recently from da49930 to 0ec963e Compare November 5, 2019 11:40
src/registrar.js Outdated Show resolved Hide resolved
src/connection-manager/topology.js Outdated Show resolved Hide resolved
src/peer-store/index.js Outdated Show resolved Hide resolved
}
})

this.peers.set(peerInfo.id.toB58String(), peerProxy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A problem with this approach is that the original PeerInfo that is being added isn't tracked, the new instance is. So if the original instance has any updates occur to it after the add, the changes will never trigger. I think it would be safer to require changes to a peer happen through the peer store (the deprecation of PeerInfo) rather than trying to track the metadata like this, as we don't have complete control of the instances being used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safer to require changes to a peer happen through the peer store

Yes, that is why I made this change now, as a first step to deprecate PeerInfo.
With the new instance of peer-info, we cannot change the added PeerInfo from outside (same as when we destruct it).

Then, I added the proxy so that when people do the get the changes are triggered for now. Once we get rid of the peer-info we just need to also remove the Proxy.

I can change this to not have the peer-info if you prefer to make it now, but I would rather do it in a subsequent PR instead of tracking code changes in the 3 current PRs. I think that with the current code we have guarantees that what is in the peer-store is always correct according to the events we emit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to refactor this later. In most instances it should be fine, but if the following occurs, we won't get notified because the provided peer is not the one being proxied:

libp2p.peerStore.add(peerInfo)
peerInfo.protocols.add('/new/protocol')

In order for it to work properly you'd need to do:

libp2p.peerStore.add(peerInfo)
peerInfo = libp2p.peerStore.get(peerInfo.id.toB58String())
peerInfo.protocols.add('/new/protocol')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting the following flow to be required for updates:

libp2p.peerStore.add(peerInfo)
peerInfo.protocols.add('/new/protocol')
libp2p.peerStore.add(peerInfo)

To update the record

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 , should the second call be libp2p.peerStore.update(peerInfo)? .put would work fine for both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update would be better, but the put would call the update

storedConn.push(conn)
} else {
this.connections.set(id, [conn])
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The topologies are never informed of new connections here as they are with onDisconnect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a new connection happens, I was expecting that the peer-store will trigger the protocol-change after identify and we do not need to inform the topologies with the connection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave this for now. There may be Topologies in the future that need to known when connections are created but that don't care about the multicodecs, but I don't think we have any of those planned right now. A static peer set Topology, such as a Bootstrap or Priority Node topology won't care about the multicodecs, but they also won't care about connections, just expressing their desire to be connected to certain peers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense yes. We iterate on this with the new versions of connMgr + peer-store

* @param {Connection} conn
* @returns {void}
*/
onConnect (peerInfo, conn) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about changing the connection tracking to just use the connection id, since we have those now? onConnect could just take the connection, since we can determine the PeerInfo using its remotePeer and the PeerStore. This would also simplify having to track multiple connections per peer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we move in that direction, when we want to get a connection from a peer, we will need to iterate the collection and get the remotePeer from the connection and then check if it matches. This simplifies the multiple connections tracking but makes the getConnection more complex. I am not sure if we should change it or not, but we can do it if you feel this is a more viable approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point, I think we can leave this for now. We could potentially create an id dictionary for quicker lookups both ways, but that might be overkill at the moment.

src/connection-manager/topology.js Outdated Show resolved Hide resolved
* @returns {void}
*/
disconnect (peerInfo, error) {
if (this.peers.delete(peerInfo.id.toB58String())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't remove the peer from tracking here. The only time a peer should get removed is if they don't support our protocol, or we've decided they shouldn't be included in our Topology, such as with certain mesh overlays. For multicodec, the peer still has value to us, we just don't have a connection anymore. By keeping it in our peer set we are telling the registrar this peer is useful to us, so if we go below our connection threshold, Connection Manager knows that specific peer has value. If no Topology has it in their peer sets, it won't get dialed (unless it is randomly dialed when we are below our global threshold of minimum connections).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was deleting this from a perspective of when a peer gets offline and not on a perspective of we intentionally disconnect to it. I agree with you in a perspective of intentionally disconnecting, but what are your thoughts on a peer getting offline?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really have a good way of knowing that right now. For peers that may come back online we would probably want to keep them in the peer set so that they have a higher chance of being connected to when they come back online. For peers that permanently leave the network, we'll likely need some way to prune them from our node entirely, which will likely be initiated by the peer store and bubbled out. Until we have support for that I think we should keep them in the peer set as long as we believe they might be of value to us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, I will make this change!

Copy link
Contributor

@jacobheun jacobheun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@jacobheun jacobheun merged commit d7c38d3 into refactor/async-await Nov 6, 2019
@jacobheun jacobheun deleted the feat/registrar branch November 6, 2019 14:47
dirkmc pushed a commit that referenced this pull request Nov 26, 2019
* feat: peer-store v0

* feat: registrar

* chore: apply suggestions from code review

Co-Authored-By: Jacob Heun <[email protected]>

* chore: address review

* chore: support multiple conns

* chore: address review

* fix: no remote peer from topology on disconnect
dirkmc pushed a commit that referenced this pull request Nov 26, 2019
* feat: peer-store v0

* feat: registrar

* chore: apply suggestions from code review

Co-Authored-By: Jacob Heun <[email protected]>

* chore: address review

* chore: support multiple conns

* chore: address review

* fix: no remote peer from topology on disconnect
jacobheun pushed a commit that referenced this pull request Dec 12, 2019
* feat: peer-store v0

* feat: registrar

* chore: apply suggestions from code review

Co-Authored-By: Jacob Heun <[email protected]>

* chore: address review

* chore: support multiple conns

* chore: address review

* fix: no remote peer from topology on disconnect
jacobheun pushed a commit that referenced this pull request Jan 24, 2020
* feat: peer-store v0

* feat: registrar

* chore: apply suggestions from code review

Co-Authored-By: Jacob Heun <[email protected]>

* chore: address review

* chore: support multiple conns

* chore: address review

* fix: no remote peer from topology on disconnect
maschad pushed a commit to maschad/js-libp2p that referenced this pull request Jun 21, 2023
…#471)

Previously we chose peers before the self-query ran which meant there
was a reasonable chance we chose 0 peers.

Instead, wait for the query to run, then choose peers to query.
maschad pushed a commit to maschad/js-libp2p that referenced this pull request Jun 21, 2023
## [9.1.5](libp2p/js-libp2p-kad-dht@v9.1.4...v9.1.5) (2023-05-05)

### Bug Fixes

* log peer id not whole object ([libp2p#470](libp2p/js-libp2p-kad-dht#470)) ([e9efb7f](libp2p/js-libp2p-kad-dht@e9efb7f))
* only choose query peers after initial self-query has run ([libp2p#471](libp2p/js-libp2p-kad-dht#471)) ([4d05154](libp2p/js-libp2p-kad-dht@4d05154))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants