Remove HashSet::get_or_insert_with #123657

Closed
wants to merge 1 commit into from

Amanieu
Member

@Amanieu Amanieu commented Apr 8, 2024

This method is unsound because it allows inserting a key at the "wrong" position in a HashSet, which could result in it not appearing in future lookups or being inserted multiple times in the set.

Instead, HashSet::get_or_insert and HashSet::get_or_insert_owned should be preferred.
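To make the failure mode concrete, here is a minimal sketch on stable Rust. It does not use the real (nightly-only) `HashSet::get_or_insert_with`; `ToySet`, its `key % 8` hash, and `demo` are hypothetical stand-ins that only mirror the shape of the method, where the slot is chosen from the lookup key but the stored value comes from a closure.

```rust
/// Hypothetical miniature hash set over `u32`, using `key % 8` as the hash.
struct ToySet {
    buckets: Vec<Vec<u32>>,
}

impl ToySet {
    fn new() -> Self {
        ToySet { buckets: vec![Vec::new(); 8] }
    }

    fn bucket_of(&self, key: u32) -> usize {
        key as usize % self.buckets.len()
    }

    fn contains(&self, key: u32) -> bool {
        self.buckets[self.bucket_of(key)].contains(&key)
    }

    /// Ordinary insert: returns false if the key was already found.
    fn insert(&mut self, key: u32) -> bool {
        if self.contains(key) {
            return false;
        }
        let b = self.bucket_of(key);
        self.buckets[b].push(key);
        true
    }

    /// Same shape as the method under discussion: the bucket comes from
    /// `probe`, the stored value from `make`, and nothing checks they agree.
    fn get_or_insert_with(&mut self, probe: u32, make: impl FnOnce(u32) -> u32) -> u32 {
        let b = self.bucket_of(probe);
        if let Some(&found) = self.buckets[b].iter().find(|&&k| k == probe) {
            return found;
        }
        let value = make(probe);
        self.buckets[b].push(value);
        value
    }

    fn iter(&self) -> impl Iterator<Item = u32> + '_ {
        self.buckets.iter().flatten().copied()
    }
}

/// Returns (whether 2 is found by lookup, how many copies of 2 iteration sees).
fn demo() -> (bool, usize) {
    let mut set = ToySet::new();
    set.get_or_insert_with(1, |_| 2); // stores 2 in the bucket that belongs to 1
    let found = set.contains(2); // lookup searches the bucket that belongs to 2
    set.insert(2); // the misplaced copy is not seen, so a second copy goes in
    let copies = set.iter().filter(|&k| k == 2).count();
    (found, copies)
}
```

In this sketch `demo()` reports that the misplaced key is invisible to lookups yet is seen twice by iteration, which are the two symptoms described above.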

@rustbot
Collaborator

rustbot commented Apr 8, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 8, 2024
@Mark-Simulacrum
Member

r=me, but I don't think we should say this is unsound? The method is entirely safe code so that implies other problems. It might be a bad idea, but that's a different kind of problem than a soundness issue.

@Amanieu Amanieu force-pushed the remove-get_or_insert_with branch from 24b7e97 to 8d5e8b2 Compare April 9, 2024 00:30
@Amanieu
Member Author

Amanieu commented Apr 9, 2024

The view of the libs-api team is that unsafe code should be able to rely on an incoming HashMap<T> (and other collections) to work correctly (i.e. not have items that appear in iteration but not in lookups) when T is not a user-controlled type.

@cuviper
Member

cuviper commented Apr 9, 2024

I agree that we want that reliability, but the term "unsound" is the other way around:
https://doc.rust-lang.org/reference/behavior-considered-undefined.html

if unsafe code can be misused by safe code to exhibit undefined behavior, it is unsound.

@fogti
Contributor

fogti commented Apr 9, 2024

wouldn't it make more sense to panic! if the key doesn't match?

@cuviper
Member

cuviper commented Apr 9, 2024

Sure, adding a check on the new value might be better than throwing this API out altogether...
rust-lang/hashbrown#518
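As a rough illustration of the "check the new value" idea, the following is a sketch on stable Rust, not the code from the linked hashbrown PR. The free function `get_or_insert_with_checked` is a hypothetical helper built only from stable `HashSet` methods; it panics when the closure returns a key that does not compare equal to the lookup key.

```rust
use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: look up `probe`, and if it is absent insert the key
/// built by `make`, panicking when that key does not match `probe`.
fn get_or_insert_with_checked<'a, T, Q, F>(
    set: &'a mut HashSet<T>,
    probe: &Q,
    make: F,
) -> &'a T
where
    T: Eq + Hash + Borrow<Q>,
    Q: Eq + Hash + ?Sized,
    F: FnOnce(&Q) -> T,
{
    if !set.contains(probe) {
        let value = make(probe);
        // The guard: the new key must compare equal to the lookup key.
        assert!(
            Borrow::<Q>::borrow(&value) == probe,
            "closure returned a key that does not match the lookup key"
        );
        set.insert(value);
    }
    set.get(probe).expect("key was either found or just inserted")
}
```

Called as `get_or_insert_with_checked(&mut set, "cat", |s: &str| s.to_owned())` on a `HashSet<String>`, this inserts `"cat"`; a closure returning `"dog"` for the same lookup key panics instead of inserting.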

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 13, 2024
@RustyYato
Contributor

RustyYato commented May 29, 2024

@Amanieu

The view of the libs-api team is that unsafe code should be able to rely on an incoming HashMap (and other collections) to work correctly (i.e. not have items that appear in iteration but not in lookups) when T is not a user-controlled type.

Then get_or_insert_owned is in the same boat, since it's trivial to write a malicious implementation.

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=1ec481da7b5fce04dc5dd3e0506409fa

So either both get_or_insert_with and get_or_insert_owned are removed or both are kept, since it is possible to implement one in terms of the other (I can show that too if you need). (The hashbrown PR does ~~fix this~~ mitigate the damage and make this harder to exploit, but it should also check get_or_insert_owned for the reasons I showed above.)
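The playground link is not reproduced here. The following is an independent sketch, on stable Rust, of what a misbehaving `ToOwned` implementation can look like. `Probe`, `Key`, `get_or_insert_owned_sketch`, and `lookup_after_insert` are hypothetical names; the stand-in function uses the stable `insert`, so it only shows the key mismatch, not the misplaced slot that the real method could produce.

```rust
use std::borrow::Borrow;
use std::collections::HashSet;

// Hypothetical lookup type. It deliberately does not implement `Clone`,
// so it can carry its own `ToOwned` impl instead of the blanket one.
#[derive(PartialEq, Eq, Hash)]
struct Probe(u32);

// Hypothetical stored key, borrowable as a `Probe`.
#[derive(PartialEq, Eq, Hash)]
struct Key(Probe);

impl Borrow<Probe> for Key {
    fn borrow(&self) -> &Probe {
        &self.0
    }
}

impl ToOwned for Probe {
    type Owned = Key;

    // Misbehaving: the "owned" key is not equal to the value it came from.
    fn to_owned(&self) -> Key {
        Key(Probe(self.0 + 1))
    }
}

/// Stand-in with the shape of `get_or_insert_owned`: look up by `probe`,
/// and on a miss insert whatever `to_owned` returns.
fn get_or_insert_owned_sketch(set: &mut HashSet<Key>, probe: &Probe) {
    if !set.contains(probe) {
        set.insert(<Probe as ToOwned>::to_owned(probe));
    }
}

/// Returns whether `Probe(n)` can be found right after "inserting" it.
fn lookup_after_insert(n: u32) -> bool {
    let mut set = HashSet::new();
    let probe = Probe(n);
    get_or_insert_owned_sketch(&mut set, &probe);
    set.contains(&probe)
}
```

Here the `to_owned` call plays the same role as the closure passed to get_or_insert_with: it is user code that decides which key is stored, with no guarantee that the key matches the lookup value.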

@cuviper
Member

cuviper commented May 29, 2024

the hashbrown PR does fix this

Does it? If we're considering malicious impls, then your PartialEq could also abuse static/thread-local state to return true right after the ..._with closure or its to_owned.
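A small sketch of this objection, assuming a hypothetical type `Sneaky` whose `PartialEq` consults thread-local state: an equality check on the freshly created key can be made to pass even though the keys differ. `LIE`, `passes_check`, and `check_with_lie` are illustrative names, not part of any library.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical switch that the misbehaving `PartialEq` consults.
    static LIE: Cell<bool> = Cell::new(false);
}

struct Sneaky(u32);

impl PartialEq for Sneaky {
    fn eq(&self, other: &Self) -> bool {
        // Claims equality whenever the switch is on, whatever the contents.
        LIE.with(|lie| lie.get()) || self.0 == other.0
    }
}

/// The kind of check a guarded insert might run on the freshly made key.
fn passes_check(probe: &Sneaky, made: &Sneaky) -> bool {
    made == probe
}

/// Sets the switch, then compares two keys with different contents.
fn check_with_lie(lie: bool) -> bool {
    LIE.with(|flag| flag.set(lie));
    passes_check(&Sneaky(1), &Sneaky(2))
}
```

With the switch off the check rejects the mismatched key; with it on, the same check accepts it, so a check of this kind only helps for key types whose comparisons are well behaved.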

@Amanieu
Member Author

Amanieu commented Oct 1, 2024

Closing: since get_or_insert_owned has the same problem, this doesn't actually help.

@Amanieu Amanieu closed this Oct 1, 2024