-
Notifications
You must be signed in to change notification settings - Fork 378
Add simple collator election mechanism #2960
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you also need to impl SortedListProvider
, unless I'm missing it somewhere? But I don't see a default impl in Substrate.
@@ -486,6 +504,7 @@ construct_runtime!( | |||
Session: pallet_session = 22, | |||
Aura: pallet_aura = 23, | |||
AuraExt: cumulus_pallet_aura_ext = 24, | |||
CandidateList: pallet_bags_list::<Instance1> = 25, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implied
CandidateList: pallet_bags_list::<Instance1> = 25, | |
CandidateList: pallet_bags_list = 25, |
|
||
Self::deposit_event(Event::CandidateRemoved { account_id: target }); | ||
Self::deposit_event(Event::CandidateAdded { account_id: who, deposit }); | ||
Ok(Some(T::WeightInfo::take_candidate_slot(length as u32)).into()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should do the same for the other functions: refund weight by providing the actual length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I forgot to do this for increase_bond
/decrease_bond
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if candidate_info.deposit < <CandidacyBond<T>>::get() { | ||
return Err(Error::<T>::OnRemoveInsufficientFunds.into()) | ||
} | ||
T::Currency::unreserve(&who, bond); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to migrate to new fungible trait from Currency
may be in different PR
here about new traits - paritytech/substrate#12951
@@ -289,6 +300,16 @@ pub mod pallet { | |||
NoAssociatedValidatorId, | |||
/// Validator ID is not yet registered. | |||
ValidatorNotRegistered, | |||
/// Some doc. | |||
OnInsert, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more idiomatic way within our libs to name the errors, is not describing on what occasion an error has happened, but what has happened, FailedToUpdateCandidateList
for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the names are just placeholders right now. Same with doc comments. I'll write some proper ones once the other comments are addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
}); | ||
Ok(()) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be easier to review with the docs telling what this meant for
who is target, who is origin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller of this extrinsic wants to register themselves in the candidate list by taking the spot of another candidate target
. The caller's deposit
must be greater than the deposit of the candidate it's trying to replace for this to work.
I'll get some proper documentation on these functions in a future iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
collators | ||
} | ||
|
||
/// Kicks out candidates that did not produce a block in the kick threshold and refunds | ||
/// their deposits. | ||
pub fn kick_stale_candidates( | ||
candidates: BoundedVec<CandidateInfo<T::AccountId, BalanceOf<T>>, T::MaxCandidates>, | ||
) -> BoundedVec<T::AccountId, T::MaxCandidates> { | ||
) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to document what is returned value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -142,6 +143,8 @@ pub mod pallet { | |||
/// Maximum number of invulnerables. | |||
type MaxInvulnerables: Get<u32>; | |||
|
|||
type CandidateList: SortedListProvider<Self::AccountId, Score = BalanceOf<Self>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain why we using the bags-list
?
the number of Candidates
is not more than desired_candidates
(can be more if desired_candidates
changed to the less value), so the Candidates
list is quite small.
Candidates
list stored as a value, the whole list is being read usually from the storage.
is this more efficient than maintaining a sorted list of Candidates
within the pallet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be at most MaxCandidates
candidates, out of which the top N (where N is DesiredCandidates
) are selected to be collators when sessions begin/end. The list could be small, it depends on the config, but the idea behind using bags-list
was to have it lazily sorted, as in people take a slot in the list and then they use bags-list
extrinsics to place their entry higher on the list, thus sorting it themselves and paying for it. This moves the sorting logic off-chain as the candidates registering have to watch the list and do the calls if/when they see fit. It would also give an incentive to only the top N candidates to be in a sorted order, since if you're below the N threshold, it doesn't matter if your entry is out of order or not since you're not gonna get selected for collation. This is also true for all the entries above the threshold, since it doesn't matter what spot of the top N you got, you're still going to be selected. This reduces the sorting burden on the list and also makes the candidates responsible for their place in the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bags-list
will perform additional reads and inserts on insert/decrease/increase etc. On decrease and increase it will run the same rebag
as it does with the extrinsic you mentioning. So I am not sure what exactly you optimizing with it.
what the BagThresholds
you planning to set?
I think for this case, when MaxCandidates
wont be more than 50 in near future, its not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bags list is counter productive in terms of performance and resources since the number of candidates is potentially small, I'd suggest implementing a simpler SortedListProvider
that allows parachain devs to decide whether to use that implementation or an instance of bags-lists.
In any case, it would be interesting to compare benchmarks with both implementations and see what's the threshold in terms of number of candidates (and corresponding thresholds) where the bags list approach starts to make sense (if it's not always better -- I'm unsure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bags-list
will perform additional reads and inserts on insert/decrease/increase etc. On decrease and increase it will run the samerebag
as it does with the extrinsic you mentioning. So I am not sure what exactly you optimizing with it.
what theBagThresholds
you planning to set?
I was planning to have a single universal bag - so BagThresholds = []
- and on_increase
and on_decrease
would be noops for that configuration since rebag
would do nothing, but they would be needed if the user chose a different SortedListProvider
implementation. The on_insert
operation for bags-list
is very simple, while on_remove
is a bit costly indeed, since it would need to traverse the list.
I think for this case, when MaxCandidates wont be more than 50 in near future, its not necessary.
I thought it was supposed to be about 100 in the near future with the potential to grow even more, but I'm not sure, I might be misremembering.
In any case, it would be interesting to compare benchmarks with both implementations and see what's the threshold in terms of number of candidates (and corresponding thresholds) where the bags list approach starts to make sense (if it's not always better -- I'm unsure).
I'm in favor of implementing a simpler SortedListProvider
and comparing the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented both the simple list as a separate pallet as well as integrated within collator-selection
. I ran the benchmarks on my machine and these are the results in terms of weight:
register_as_candidate
: bags-list
outperforms for MaxCandidates
> 55
increase_bond
: bags-list
outperforms for MaxCandidates
> 10
decrease_bond
: bags-list
outperforms for MaxCandidates
> 6
take_candidate_slot
: bags-list
outperforms for MaxCandidates
> 160
leave_intent
always: more expensive on bags-list
by about 50%
new_session
always: more expensive on bags-list
by about 30%
There is no major difference between the simple sorted list pallet vs list kept in the storage of collator-selection
, aside from increase_bond
which starts to be better for the integrated list for MaxCandidates
> 13.
The bags-list
benchmarks do not include the put_in_front_of
call costs because those are separate and are supposed to be called less often than the other extrinsics presented here, as in not every time a bond changes.
I didn't get to run the bags-list
pallet benchmarks locally to get an idea of how much those would cost, but each put_in_front_of
scales linearly with MaxCandidates
. Given the small MaxCandidates
values we're working with, bags-list
is not ideal. That said, the simple list as a separate pallet seems to be the way to go for now, since it allows for customization down the line and the weights are very similar with the integrated version.
I'll link the branches used for the benchmarks below:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@georgepisaltu why you need two lists (Candidates and CandidateList) for the simple list within collator-selection
solution?
here you even do an additional read for very candidate - https://github.com/georgepisaltu/cumulus/blob/b0ca1b615336d83ea210a26950efc4833ab5367e/pallets/collator-selection/src/lib.rs#L823
you can have only one Candidate<T: Config> = StorageValue<_, BoundedVec<(AccountId, Deposit), T::MaxCandidates>
for that solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with that we wont event need those new three dispatchables. we need only one more or register_as_candidate
can both. we can only rename it (keeping the same index) to imply that its not only register but also reset bond.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why you need two lists (Candidates and CandidateList) for the simple list within collator-selection solution?
The purpose of the Candidates
map is to allow fast access to the deposits. Even in the solution where the list was within the collator-selection pallet storage, there are still places where we look for an account's deposit and we have to iterate the list if we don't have the map. However, with these small numbers it's probably counterproductive to load a 10-sized map into memory just for that. As a bonus, it works well with the SortedListProvider
/ScoreProvider
abstraction we have in place. I'm not too worried about the "extra" map since it uses the Identity
hasher, so it should be cheaper than a standard one.
with that we wont event need those new three dispatchables. we need only one more or register_as_candidate can both. we can only rename it (keeping the same index) to imply that its not only register but also reset bond.
The plan right now is to merge increase_bond
and decrease_bond
into a update_bond
. A function like update_bond
is necessary to allow an auction to take place and I like the idea of these functions being as intuitive and simple as possible, so I'm against having bid update mechanics in the function used to register.
Anyway, I implemented a simpler version without the map, as you suggested, and I got the benchmarks comparing the simple list in a separate pallet with the mapless implementation (I'll refer to it as "simplest" from this point on):
register_as_candidate
: always more expensive with the pallet version by about 40%
increase_bond
: simplest version outperforms for MaxCandidates
> 10
decrease_bond
: same base weight, increases by about 1/30 of the base weight for each extra candidate for the pallet version
take_candidate_slot
: always more expensive with the pallet version by about 35%
leave_intent
: always more expensive with the pallet version by about 20%
new_session
: always more expensive with the pallet version by about 25%
Also, the simplest version also uses about half the reads/writes on the database compared to the simple pallet version.
Also, I managed to get the benchmarks for the bags-list
put_in_front_of
calls, which would be additional weight incurred by users in the bags-list
version. However, the catch with that is that if we use an empty BagThresholds
, there is only a single universal bag, so all rebags
are basically noops (they still have to fetch the thresholds from storage but the cost is reflected in the previous benchmarks I posted anyway). This means that the optimal way to use the bags-list
version would be to register as a candidate, observe the list off-chain and do calls to update_bond
as necessary to have a top-DesiredCandidates
bond in place, so before the new session is about to begin and the collators are being selected (after the dust is settled and the bidding is done), if you are a top-DesiredCandidates
candidate, call put_in_front_of
so that you can claim your collator spot. This means that the amount of swaps done on the list becomes DesiredCandidates
, a fraction of MaxCandidates
by definition, and it's easy to see how it would scale if the numbers were greater, but as I understand, MaxCandidates
will probably be a small number. put_in_front_of
is about as expensive as a leave_intent
, which means it's inexpensive for a candidate to call this to claim their spot.
That being said, I think for the problem we're trying to solve right now we should go with the simple list version, but in a separate pallet while keeping the SortedListProvider
abstraction. The reasons for this are:
- there isn't a significant difference between the separate pallet version and the list inside
collator-selection
pallet storage (see benches above) - easy to upgrade in the future
- works for other users of the
collator-selection
pallet who may want to have lots of collators, even if we plan on having small lists of candidates
Main downside of this approach is the extra complexity in the code because of the new simple-list
pallet which needs to be tested/maintained.
I assumed that other users may want to use the collator-selection
pallet with MaxCandidates
values greater than 50-100, which seems to be the cut-off point for bags-list
. If I'm wrong about that, and, AFAIK, we don't really plan to raise that over 20 any time soon, then it's probably better to go with the simplest possible version, a list kept inside the collator-selection
pallet storage.
@muharem @joepetrowski @gpestana let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go for the simpler solution since we don't expect needing more than 100 candidates at any time. Having a list and a map also adds overhead of needing to maintain some invariants like have one-to-one correspondence of elements and keys.
About the solution of removing the map, BoundedVec
s should generally be at their limit in usage, while general storage should use maps to facilitate O(1) operations. Perhaps a good solution is to store candidates in a map, but only pass a (full) BoundedVec
to Session for the next set of validators? I'm not sure the costs of computing that on session change vs. just maintaining candidates in a vec though.
Agree with collapsing increase/decrease into update.
I don't think we should optimize for benchmarks. We don't expect these functions to be called very often. The one we do need to keep an eye on is the new_session
call since that is called in on_initialize
and needs to stay within block building bounds.
@@ -142,6 +143,8 @@ pub mod pallet { | |||
/// Maximum number of invulnerables. | |||
type MaxInvulnerables: Get<u32>; | |||
|
|||
type CandidateList: SortedListProvider<Self::AccountId, Score = BalanceOf<Self>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be at most MaxCandidates
candidates, out of which the top N (where N is DesiredCandidates
) are selected to be collators when sessions begin/end. The list could be small, it depends on the config, but the idea behind using bags-list
was to have it lazily sorted, as in people take a slot in the list and then they use bags-list
extrinsics to place their entry higher on the list, thus sorting it themselves and paying for it. This moves the sorting logic off-chain as the candidates registering have to watch the list and do the calls if/when they see fit. It would also give an incentive to only the top N candidates to be in a sorted order, since if you're below the N threshold, it doesn't matter if your entry is out of order or not since you're not gonna get selected for collation. This is also true for all the entries above the threshold, since it doesn't matter what spot of the top N you got, you're still going to be selected. This reduces the sorting burden on the list and also makes the candidates responsible for their place in the list.
@@ -289,6 +300,16 @@ pub mod pallet { | |||
NoAssociatedValidatorId, | |||
/// Validator ID is not yet registered. | |||
ValidatorNotRegistered, | |||
/// Some doc. | |||
OnInsert, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the names are just placeholders right now. Same with doc comments. I'll write some proper ones once the other comments are addressed.
}); | ||
Ok(()) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller of this extrinsic wants to register themselves in the candidate list by taking the spot of another candidate target
. The caller's deposit
must be greater than the deposit of the candidate it's trying to replace for this to work.
I'll get some proper documentation on these functions in a future iteration.
collators | ||
} | ||
|
||
/// Kicks out candidates that did not produce a block in the kick threshold and refunds | ||
/// their deposits. | ||
pub fn kick_stale_candidates( | ||
candidates: BoundedVec<CandidateInfo<T::AccountId, BalanceOf<T>>, T::MaxCandidates>, | ||
) -> BoundedVec<T::AccountId, T::MaxCandidates> { | ||
) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
@@ -142,6 +143,8 @@ pub mod pallet { | |||
/// Maximum number of invulnerables. | |||
type MaxInvulnerables: Get<u32>; | |||
|
|||
type CandidateList: SortedListProvider<Self::AccountId, Score = BalanceOf<Self>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bags list is counter productive in terms of performance and resources since the number of candidates is potentially small, I'd suggest implementing a simpler SortedListProvider
that allows parachain devs to decide whether to use that implementation or an instance of bags-lists.
In any case, it would be interesting to compare benchmarks with both implementations and see what's the threshold in terms of number of candidates (and corresponding thresholds) where the bags list approach starts to make sense (if it's not always better -- I'm unsure).
let who = ensure_signed(origin)?; | ||
|
||
let new_bond = | ||
<Candidates<T>>::try_mutate(|candidates| -> Result<BalanceOf<T>, DispatchError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a map keyed by account Id would suit better the candidates use case to avoid this loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
/// | ||
/// Todo | ||
#[pallet::call_index(7)] | ||
#[pallet::weight(T::WeightInfo::increase_bond(T::MaxCandidates::get()))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
/// | ||
/// Todo | ||
#[pallet::call_index(8)] | ||
#[pallet::weight(T::WeightInfo::decrease_bond(T::MaxCandidates::get()))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment regarding weights as in increase_bond
.
let who = ensure_signed(origin)?; | ||
|
||
let new_bond = | ||
<Candidates<T>>::try_mutate(|candidates| -> Result<BalanceOf<T>, DispatchError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a suggestion for you here: the implementation of the increase and decrease bond extrinsics are very similar. I'd suggest refactoring out most it it's mutation logic into a Self::update_bond(accountID, imbalance)
, where the imbalance is an enum that can take positive and negative imbalance (to add or remove bond).
The pallet balances implements a similar logic and, perhaps more relevant, I've implemented something similar in this PR, which also calls into a SortedListProvoder::on_increase/decrease
paritytech/substrate#14620
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea.
Error::<T>::ValidatorNotRegistered | ||
); | ||
|
||
ensure!(deposit >= Self::candidacy_bond(), Error::<T>::InsufficientBond); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this check before fetching the validator key, in the spirit of verifying first and read/write later. The storage transactional ensures that all the writes are rolled back if the extrinsic fails, but the node may spend unnecessary cycles reading/writing something that will be eventually rolled back due to this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
ensure!(deposit >= Self::candidacy_bond(), Error::<T>::InsufficientBond); | ||
|
||
<Candidates<T>>::try_mutate(|candidates| -> Result<(), DispatchError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I start to think that Candidates should be refactored as a storage map instead of a vec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
Signed-off-by: georgepisaltu <[email protected]>
The CI pipeline was cancelled due to failure one of the required jobs. |
Closes paritytech/polkadot-sdk#106
This PR adds the ability to bid for collator slots even after the max number of collators have already registered. This eliminates the first come, first served mechanism that was in place before.
Key changes:
increase_bond
/decrease_bond
extrinsics to allow registered candidates to adjust their bonds in order to dynamically control their bidstake_candidate_slot
extrinsic to try to replace an already existing candidate by bidding more than thembags-list
pallet, where the topDesiredCandidates
out ofMaxCandidates
candidates in the list will be selected by the session pallet as collatorsI'm not sure whether we should allow decreasing the candidate bids. The mechanic is partially in place already through
leave_intent
, but it's not quite the same thing. I'm also open to any rename suggestions.I made adjustments to the parachain template runtime to test things out, but will go ahead and fix the rest of them once we reach consensus on the overall approach.