Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add immutable map accessors to OccupiedEntry and VacantEntry #600

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

apasel422
Copy link

@apasel422 apasel422 commented Dec 11, 2024

Without these, it is not possible to inspect other keys in the map while deciding how to process the entry, forcing the user to either perform unnecessary work before calling entry or performing multiple map lookups for the entry key.

For example:

fn some_expensive_operation(...) -> bool { ... }

let mut map: HashMap<K, V> = ...;

// old way, with two lookups for `key`
if !map.contains_key(key) && some_expensive_operation(map.get(other_key)) {
    map.insert(key, val);
}

// old way, with expensive operation done unnecessarily when `key` is
// already present
if some_expensive_operation(map.get(other_key)) {
    if let VacantEntry(e) = map.entry(key) {
        e.insert(val);
    }
}

// new way, with one lookup for `key`
if let VacantEntry(e) = map.entry(key) {
  if some_expensive_operation(e.map().get(other_key)) {
    e.insert(val);
  }
}

We do not provide accessors returning a mutable reference to the map because doing so would make it possible to invalidate the entry itself.

Without these, it is not possible to inspect other keys in the map while
deciding how to process the entry, forcing the user to either perform
unnecessary checks before calling entry or performing multiple map
lookups for the entry key.

For example:

```rust
fn some_expensive_operation(...) -> bool { ... }

// old way, with two lookups for `key`
if !map.contains_key(key) && some_expensive_operation(map.get(other_key)) {
    map.insert(key, val);
}

// old way, with expensive operation done unnecessarily when key is
// already present
if some_expensive_operation(map.get(other_key)) {
    if let VacantEntry(e) = map.entry(key) {
        e.insert(val);
    }
}

// new way, with one lookup for `key`
if let VacantEntry(e) = map.entry(key) {
  if some_expensive_operation(e.map().get(other_key)) {
    e.insert(val);
  }
}
```

We do not provide accessors returning a mutable reference to the map
because doing so would make it possible to invalidate the entry itself.
@clarfonthey
Copy link
Contributor

clarfonthey commented Dec 11, 2024

This feels sound and reasonable to add, but I'll defer to others on whether that's actually the case.

I would be interested in some more specifics on why you're looking to do this, since this doesn't quite feel like the best way of going about this, but is probably fine. Maybe an entry form of get_many_mut might help.

@apasel422
Copy link
Author

Maybe an entry form of get_many_mut might help.

That would be fine too.

I would be interested in some more specifics on why you're looking to do this

While deciding whether to insert a new entry into a map, I need to do as many as four other lookups, each accompanied by some additional, potentially expensive processing on the corresponding values, if they are present at all. Right now I just do two lookups for the to-be-inserted key (contains_key followed by insert) rather than using the entry API due to the inability to access the map (immutably in this case) while an entry is outstanding.

@apasel422 apasel422 marked this pull request as ready for review December 11, 2024 22:45
@clarfonthey
Copy link
Contributor

I guess that my confusion lies in why you specifically need the different things being checked to all exist within the map as separate entries, rather than all lumped together in a single entry, if you want to access them all at once.

@Amanieu
Copy link
Member

Amanieu commented Dec 11, 2024

Could these be renamed to hash_map to avoid confusion with the very commonly used map method?

Also I can see a use case for into_hash_map which discards the entry and returns the original mutable reference that the entry was constructed from.

Maybe an entry form of get_many_mut might help.

That won't work since an invariant of the entry API is that only one entry may exist at any time. Otherwise methods on one entry would invalidate the others.

@Amanieu
Copy link
Member

Amanieu commented Dec 11, 2024

Also could you add these to:

  • Entry, EntryRef, VacantEntryRef
  • The entry types for HashSet and HashTable.

@apasel422
Copy link
Author

  • The entry types for HashSet and HashTable.

Is it possible to do this for HashSet without unsafe? hash_set::OccupiedEntry has a reference to hash_map::OccupiedEntry, but HashSet isn't just a type alias for HashMap.

@Amanieu
Copy link
Member

Amanieu commented Dec 13, 2024

Actually the API as it is will certainly cause problems in the future when we want to rewrite HashMap on top of HashTable for the same reason we can't support HashSet directly: HashTable's ExtractIf can only provide a reference to the HashTable.

Perhaps the API could be re-considered: instead of a returning a reference the the map, there could be a method which explicitly performs a lookup.

@apasel422
Copy link
Author

Perhaps the API could be re-considered: instead of a returning a reference the the map, there could be a method which explicitly performs a lookup.

This makes sense, though it's unfortunate that each method of interest would have to be explicitly exposed here. I will update this PR with that approach shortly, exposing only get for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants