# `<algorithm>`: Implement worst-case linear-time `nth_element` #5100

+535 −81
Fixes #856 for `std::nth_element` and `std::ranges::nth_element`. This implements a fallback to the median-of-medians-of-five algorithm when the quickselect algorithm seems to be making too little progress. The median-of-medians algorithm is mostly the textbook version, with two minor tweaks.
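For context, here is a minimal sketch of the textbook median-of-medians-of-five selection that serves as the fallback (illustrative only: the raw `int*` interface and all names are mine, and the PR's two tweaks are not reflected here):

```cpp
#include <algorithm>
#include <cstddef>

// Textbook median-of-medians-of-five selection, sketched on raw int* ranges.
// Rearranges [first, last) and returns a pointer to its k-th smallest
// element (0-based). Worst-case linear time.
int* select_kth(int* first, int* last, std::ptrdiff_t k) {
    for (;;) {
        const std::ptrdiff_t n = last - first;
        if (n <= 5) {
            std::sort(first, last); // tiny range: sort and index directly
            return first + k;
        }
        // Move the median of every full group of five to the front.
        int* medians_end = first;
        for (int* group = first; last - group >= 5; group += 5) {
            std::sort(group, group + 5); // median of five via a small sort
            std::iter_swap(medians_end++, group + 2);
        }
        // Recursively select the exact median of those medians as the pivot.
        const std::ptrdiff_t m = medians_end - first;
        const int pivot = *select_kth(first, medians_end, m / 2);
        // Three-way partition around the pivot value.
        int* lo = std::partition(first, last, [=](int x) { return x < pivot; });
        int* hi = std::partition(lo, last, [=](int x) { return x == pivot; });
        if (k < lo - first) {
            last = lo;        // k-th smallest lies left of the pivot block
        } else if (k < hi - first) {
            return first + k; // k-th smallest equals the pivot
        } else {
            k -= hi - first;  // continue right of the pivot block
            first = hi;
        }
    }
}
```

Partitioning around the median of the five-element-group medians discards roughly 30 % of the range per round, which is what yields the linear worst case.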
### Benchmark results
`bm_uniform` just applies `nth_element` to an integer array of the given length. The integer array is uniformly sampled from a fixed seed. This is to check that the worst-case fallback does not noticeably worsen the processing time on such a sequence.

`bm_tunkey_adversary` applies `nth_element` to a sequence on which the implemented quickselect algorithm performs terribly.

Before:

After:
As expected, the fallback greatly improves the running time for `bm_tunkey_adversary`. The timings for `bm_uniform` are about on par; if anything, they are a bit better with this PR on my machine.
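For reference, a harness in the spirit of `bm_uniform` could look roughly like this (an illustrative Google Benchmark sketch, not the PR's actual benchmark code; the seed and sizes are made up):

```cpp
#include <benchmark/benchmark.h>

#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative sketch in the spirit of bm_uniform: nth_element on an
// integer array sampled uniformly from a fixed seed.
static void bm_uniform_sketch(benchmark::State& state) {
    std::mt19937 gen{12345}; // fixed seed: the same input for every run
    std::vector<int> data(static_cast<std::size_t>(state.range(0)));
    std::uniform_int_distribution<int> dist;
    for (int& x : data) {
        x = dist(gen);
    }

    std::vector<int> buf;
    for (auto _ : state) {
        buf = data; // restore the unsorted input each iteration
        std::nth_element(buf.begin(), buf.begin() + buf.size() / 2, buf.end());
        benchmark::DoNotOptimize(buf);
    }
}
BENCHMARK(bm_uniform_sketch)->Arg(1 << 10)->Arg(1 << 16);

BENCHMARK_MAIN();
```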
### The fallback heuristic

`std::sort` switches to its fallback when the recursion depth exceeds some logarithmic threshold. We could use the same heuristic here as well; however, this would not guarantee linear time in the worst case, but "only" an O(n log n) bound.

So I opted for an adaptive depth limit: like the heuristic for `std::sort`, it assumes that each iteration should reduce the range of inspected elements by 25 %. But while `std::sort` derives a maximum recursion depth from this assumption, this heuristic falls back to the median-of-medians algorithm when the actual size of the processed sequence exceeds the desired size by some constant tolerance factor (currently about 2) during some iteration. Thus the total number of processed elements over all quickselect iterations is bounded by a multiple of the sequence length times a geometric sum (with the current constants, iteration i touches at most 2 · (3/4)^i · n elements, so at most 8 n in total), ensuring worst-case linear time overall. At the same time, the tolerance factor introduces some leeway so that one or two bad iterations (especially at the beginning) don't trigger the fallback immediately.
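A compact sketch of that check, reusing `select_kth` from the sketch above as the worst-case fallback (constants and names are illustrative, not the PR's actual code):

```cpp
#include <algorithm>
#include <cstddef>

// select_kth is the median-of-medians sketch from above.
int* select_kth(int* first, int* last, std::ptrdiff_t k);

// Illustrative quickselect with the adaptive fallback described above:
// `ideal` shrinks by 25 % per iteration; if the live range ever exceeds
// about twice `ideal`, quickselect has made too little progress and we
// switch to the linear-time median-of-medians selection.
void nth_element_sketch(int* first, int* nth, int* last) {
    if (nth == last) return;
    std::ptrdiff_t ideal = last - first;
    while (last - first > 32) { // small-range cutoff (illustrative value)
        if (last - first > 2 * ideal) { // too little progress: bail out
            select_kth(first, last, nth - first);
            return;
        }
        // One quickselect step: median-of-three pivot, then partition.
        int* mid = first + (last - first) / 2;
        if (*mid < *first) std::iter_swap(first, mid);
        if (*(last - 1) < *mid) {
            std::iter_swap(mid, last - 1);
            if (*mid < *first) std::iter_swap(first, mid);
        }
        const int pivot = *mid;
        int* cut = std::partition(first, last, [=](int x) { return x < pivot; });
        if (nth < cut) {
            last = cut;  // nth lies in the "< pivot" part
        } else {
            first = cut; // nth lies in the ">= pivot" part
        }
        ideal -= ideal / 4; // expect at least a 25 % reduction per step
    }
    std::sort(first, last); // finish small ranges, placing *nth correctly
}
```

Note how the budget check also rescues degenerate cases (e.g. all-equal elements) where this simplistic partition step makes no progress at all.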
Obviously, there are many possible choices for the desired percentage reduction per iteration and for the tolerance factor. But the benchmarks seem to suggest that the chosen values aren't too bad: a smaller percentage reduction or a larger tolerance factor noticeably worsens the `bm_tunkey_adversary` benchmark, but results in little difference for `bm_uniform`. Besides, the implementation of `std::sort` already sets a precedent for a desired reduction of 25 % per iteration.

### Test
The newly added test applies `nth_element` to the same worst-case sequence as the benchmark. This makes sure that the fallback is actually exercised by the test. (I think it is also the first test that exercises the quickselect algorithm and not just the insertion-sort fallback.)
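Schematically, such a test boils down to the following (a sketch only: the actual adversarial sequence from `bm_tunkey_adversary` is not reproduced here, so a descending array merely stands in for it):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

int main() {
    // Stand-in input only: the real test feeds nth_element the same
    // adversarial sequence as bm_tunkey_adversary.
    std::vector<int> v(1 << 16);
    std::iota(v.rbegin(), v.rend(), 0); // descending placeholder values

    const std::size_t k = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k), v.end());

    // nth_element's postcondition: v[k] is the k-th smallest value and
    // the range is partitioned around it.
    assert(v[k] == static_cast<int>(k));
    for (std::size_t i = 0; i < k; ++i) assert(v[i] <= v[k]);
    for (std::size_t i = k + 1; i < v.size(); ++i) assert(v[k] <= v[i]);
}
```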