Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce assertion about performance #36053

Merged
merged 2 commits into from
Sep 27, 2024
Merged

Reduce assertion about performance #36053

merged 2 commits into from
Sep 27, 2024

Conversation

concavelenz
Copy link
Contributor

Correct wording that implies the iterator helps are always more efficient. It is a trade off and depends on operations and the size of the array. Creating iterators and the iterator protocol has its own cost.

Description

Change "is" to "may be"

Motivation

Correct wording that implies the iterator helps are always more efficient. It is a trade off and depends on operations and the size of the array. Creating iterators and the iterator protocol has its own cost.

Correct wording that implies the iterator helps are always more efficient.  It is a trade off and depends on operations and the size of the array.   Creating iterators and the iterator protocol has its own cost.
@concavelenz concavelenz requested a review from a team as a code owner September 25, 2024 21:16
@concavelenz concavelenz requested review from Josh-Cena and removed request for a team September 25, 2024 21:16
@github-actions github-actions bot added Content:JS JavaScript docs size/xs [PR only] 0-5 LoC changed labels Sep 25, 2024
@Josh-Cena
Copy link
Member

Do you have benchmarks? The second one goes through strictly fewer operations than the first one, so unless the array reduce method has been heavily optimized I don't believe it's a "may".

Creating iterators and the iterator protocol has its own cost.

The array method also needs to first create an iterator because the initial value is a map.

@concavelenz
Copy link
Contributor Author

concavelenz commented Sep 26, 2024 via email

@Josh-Cena
Copy link
Member

Josh-Cena commented Sep 26, 2024

Here's what I mean. You are comparing the following two:

[...map.values()].reduce((a, b) => a + b);
map.values().reduce((a, b) => a + b);

The first one:

  1. Invokes values() which returns an Iterator.
  2. Steps through the Iterator and collects everything into an intermediate array.
  3. Loops through the array, and sums up every value.

The second one:

  1. Invokes values().
  2. Steps through the Iterator, where for each iteration it does the same thing as in array reduce (calculating sum)

Both have to iterate through the Iterator exactly once, and both have to perform N additions, but the second version avoids the extra work of allocating an array, populating it, index-accessing it, and discarding it.

@shicks
Copy link
Contributor

shicks commented Sep 26, 2024

I just ran a benchmark to try to make a point that, even for this simple reduce-only case, iterator helpers would still lose to a basic for loop, even if they beat the array methods. But I was surprised that the array methods actually soundly beat the iterator helpers and ended up neck-and-neck with the for loop.

In any case, my main point was that leading off with such a trivial example is misleading because even if the case of a simple reduce is more efficient, it's not clear that a bunch of chained operations (e.g. iter.map(...).filter(...).reduce(...)) would win over the array alternative (or the naive for-loop), since the iterator version also makes plenty of garbage in the form of iterator result objects for every element at every stage, which is likely more expensive than a memory-contiguous array of values, provided the latter fits comfortably in memory. So to my mind, leading with this example seems to suggest a false generalization and is therefore misleading.

@Josh-Cena
Copy link
Member

since the iterator version also makes plenty of garbage in the form of iterator result objects for every element at every stage

I don't think that's the issue. Again [...map.values()] literally went through all the same iterator machinery that map.values().reduce would go through. As I said, the only reason the array method may be faster is because array methods are old and well-optimized, while iterator methods are new and therefore unoptimized. By principle, iterator methods go through strictly less computational steps than first converting the iterator to an array and then summing up the array; it's only because the runtime is less optimistic in making predictions and optimizations.

@Josh-Cena
Copy link
Member

Thanks for the benchmark though, confirms our belief that new methods, no matter how seemingly efficient they are, can never beat old and manual ways of doing things. I'm okay to soften the language here since we can never be assertive about performance.

@Josh-Cena Josh-Cena changed the title Update index.md Reduce assertion about performane Sep 26, 2024
@Josh-Cena Josh-Cena changed the title Reduce assertion about performane Reduce assertion about performance Sep 26, 2024
@Josh-Cena
Copy link
Member

Josh-Cena commented Sep 26, 2024

Also by the way, your benchmark is way too small to be comparative. I changed the setup to:

var nameToDeposit = new Map();

for (let i = 0; i < 100000; i++) {
  nameToDeposit.set(i, i);
}

var sink = 0;

And the result:
image

So, array methods still beat iterator methods by a constant factor, but for loop is by far the fastest, which is very reasonable, because it avoids calling a user-provided callback (callback-based APIs are always the slowest).

@concavelenz
Copy link
Contributor Author

I think you misunderstand your example:

[...map.values()].reduce((a, b) => a + b); iterates once using iterator protocol, and once using loop incrementing.

map.values().reduce((a, b) => a + b); iterates twice using iterator protocol. The iterator protocol creates an object per-item as part of the iteration.

Loop incrementing is a more local operation and thus easier to optimize and this is why the first is faster. Also the first is likely more memory "efficient" in that there are fewer object allocations, while the second can minimize peek memory for larger arrays. But at the point that one is handling very large arrays, one should should not be using these methods if you care about efficiency as you would be significantly better off using a simple loop rather than these methods at all.

Therefore, I would argue that you should only say that these methods are convenient and nothing about efficiency until there is some some evidence otherwise.

@Josh-Cena
Copy link
Member

Josh-Cena commented Sep 26, 2024

Let's think about the underlying implementation.
[...map.values()].reduce((a, b) => a + b); becomes:

// [...map.values()]
const array = [];
for (const v of map.values()) {
  array.push(v);
}
// arr.reduce((a, b) => a + b)
let acc = array[0];
const callback = (a, b) => a + b;
for (let i = 1; i < array.length; i++) {
  acc = callback(acc, array[i]);
}

map.values().reduce((a, b) => a + b); becomes:

const iter = map.values();
let acc = iter.next().value;
for (const v of iter) {
  callback(acc, v);
}

The for...of loop to consume the iterator is the same. the N callback invocations is the same. The only thing that changes is the array allocation.

@concavelenz
Copy link
Contributor Author

concavelenz commented Sep 27, 2024 via email

Copy link
Member

@Josh-Cena Josh-Cena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks

Copy link
Contributor

@Josh-Cena Josh-Cena merged commit 5a140a2 into mdn:main Sep 27, 2024
8 checks passed
fiji-flo pushed a commit that referenced this pull request Oct 2, 2024
* Update index.md

Correct wording that implies the iterator helps are always more efficient.  It is a trade off and depends on operations and the size of the array.   Creating iterators and the iterator protocol has its own cost.

* Update files/en-us/web/javascript/reference/global_objects/iterator/index.md

---------

Co-authored-by: Joshua Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Content:JS JavaScript docs size/xs [PR only] 0-5 LoC changed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants