0.4.1 (#19)
* add missing constructor methods to SegmentedHashMap (#17)

* add SegmentedHashMap BuildHasher constructors

* SegmentedHashMap#with_hasher: default capacity and number of segments,
  user-specified BuildHasher instance
* SegmentedHashMap#with_capacity_and_hasher: default number of segments,
  user-specified capacity and BuildHasher instance

* correctly implement SegmentedHashMap#default

I'm not exactly sure what the previous implementation would've done, but
it sure wouldn't have been what you wanted. At the very least, the
segment shift would've been wrong.

* add default test case

This will catch a faultily-implemented Default, such as for the previous
version of SegmentedHashMap

* Rewrite or update all documentation (#18)

* update SegmentedHashMap module and struct docs

A fair amount of this is copied from the documentation of
std::collections::HashMap.

* update SegmentedHashMap constructor documentation

* update SegmentedHashMap#capacity documentation

Note to self: update the segment module documentation to explain what
a segment is. Also, update the crate documentation to explain the
lockfree hash table algorithm.

* update SegmentedHashMap#get* docs

The standard library is a little terser when referring to the
requirements on `Q`, which is a nice change.

* update SegmentedHashMap#insert* docs

* update SegmentedHashMap#insert_*or_modify* docs

goodness is this a lot of functions to maintain

* update SegmentedHashMap#remove* docs

* update SegmentedHashMap#remove_*if* docs

* update SegmentedHashMap#modify docs

* update HashMap documentation

this is 100% copy-pasted from the SegmentedHashMap documentation I
wrote, except when it shouldn't have been.

* update segment module documentation

The new documentation explains the basics of segmented hash tables, as
well as the likely performance wins and losses for using it instead of
the regular hash table.

* update crate-level documentation

Describe, in *excruciating* detail, the gory details of the hash table
algorithm.

* update README.md

Shill for SegmentedHashMap a little bit

* bump version to 0.4.1
Gregory-Meyer authored Mar 11, 2020
1 parent cf85f62 commit b7770a0
Showing 7 changed files with 650 additions and 639 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "cht"
version = "0.4.0"
version = "0.4.1"
authors = ["Gregory Meyer <[email protected]>"]
edition = "2018"
description = "Lockfree resizeable concurrent hash table."
9 changes: 6 additions & 3 deletions README.md
@@ -4,15 +4,18 @@
[![docs.rs](https://docs.rs/cht/badge.svg)](https://docs.rs/cht)
[![Travis CI](https://travis-ci.com/Gregory-Meyer/cht.svg?branch=master)](https://travis-ci.com/Gregory-Meyer/cht)

cht provides a lockfree hash table that supports concurrent lookups, insertions,
and deletions.
cht provides a lockfree hash table that supports fully concurrent lookups,
insertions, modifications, and deletions. The table may also be concurrently
resized to allow more elements to be inserted. cht also provides a segmented
hash table using the same lockfree algorithm for increased concurrent write
performance.

## Usage

In your `Cargo.toml`:

```toml
cht = "^0.3.0"
cht = "^0.4.1"
```

Then in your code:
72 changes: 68 additions & 4 deletions src/lib.rs
@@ -22,14 +22,78 @@
// CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

//! Lockfree resizeable concurrent hash table.
//! Lockfree hash tables.
//!
//! The hash table in this crate was inspired by
//! [a blog post by Jeff Phreshing], which describes the implementation of a
//! hash table in [Junction].
//! The hash tables in this crate are, at their core, open addressing hash
//! tables implemented with boxed buckets. At the core of each table is a
//! bucket array, which consists of a vector of atomic pointers to buckets,
//! an atomic pointer to the next bucket array, and an epoch number. In the
//! context of this crate, an atomic pointer is a nullable pointer that is
//! accessed and manipulated using atomic memory operations. Each bucket
//! consists of a key and a possibly-uninitialized value.
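The layout described above can be sketched with standard-library atomics. The names `Bucket` and `BucketArray` and the field layout here are illustrative only; the crate's actual types (and its use of epoch-based memory reclamation) differ:

```rust
use std::mem::MaybeUninit;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// A bucket owns a key and a possibly-uninitialized value slot. In a real
// implementation the value is only initialized for live (non-tombstone)
// entries.
pub struct Bucket<K, V> {
    pub key: K,
    pub value: MaybeUninit<V>,
}

// A bucket array: a fixed-size vector of atomic, nullable bucket pointers,
// an atomic pointer to the next (larger) bucket array, and an epoch number.
pub struct BucketArray<K, V> {
    pub buckets: Vec<AtomicPtr<Bucket<K, V>>>,
    pub next: AtomicPtr<BucketArray<K, V>>,
    pub epoch: usize,
}

impl<K, V> BucketArray<K, V> {
    pub fn with_capacity(capacity: usize, epoch: usize) -> Self {
        BucketArray {
            // Every slot starts out as a null pointer, meaning "empty".
            buckets: (0..capacity)
                .map(|_| AtomicPtr::new(ptr::null_mut()))
                .collect(),
            next: AtomicPtr::new(ptr::null_mut()),
            epoch,
        }
    }
}

fn main() {
    let array: BucketArray<String, u32> = BucketArray::with_capacity(8, 0);
    // All bucket pointers start out null, and there is no next array yet.
    assert!(array
        .buckets
        .iter()
        .all(|p| p.load(Ordering::Relaxed).is_null()));
    assert!(array.next.load(Ordering::Relaxed).is_null());
    println!("capacity = {}", array.buckets.len());
}
```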
//!
//! The key insight into making the hash table resizeable is to incrementally
//! copy buckets from the old bucket array to the new bucket array. As buckets
//! are copied between bucket arrays, their pointers in the old bucket array are
//! CAS'd with a null pointer that has a sentinel bit set. If the CAS fails,
//! that thread must read the bucket pointer again and retry copying it into the
//! new bucket array. If at any time a thread reads a bucket pointer with the
//! sentinel bit set, that thread knows that a new (larger) bucket array has
//! been allocated. That thread will then immediately attempt to copy all
//! buckets to the new bucket array. It is possible to implement an algorithm in
//! which a subset of buckets are relocated per-thread; such an algorithm has
//! not been implemented for the sake of simplicity.
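Because buckets are boxed, their addresses are aligned, which leaves the low bits of every bucket pointer free to carry flags such as the sentinel bit. A minimal sketch of this pointer-tagging idea, with hypothetical bit assignments (the crate's actual tag values may differ):

```rust
// Tag bits packed into the low bits of aligned bucket pointers.
// These particular bit positions are illustrative, not the crate's.
const SENTINEL: usize = 0b001; // "this slot was relocated to a new array"
const TOMBSTONE: usize = 0b010; // "this entry was removed"
const BORROWED: usize = 0b100; // "this entry was copied from an old array"

fn with_tag(ptr: usize, tag: usize) -> usize {
    ptr | tag
}

fn has_tag(ptr: usize, tag: usize) -> bool {
    ptr & tag != 0
}

// Strip all tag bits to recover the actual bucket address.
fn untagged(ptr: usize) -> usize {
    ptr & !(SENTINEL | TOMBSTONE | BORROWED)
}

fn main() {
    // A resizing thread replaces a copied slot with null + sentinel:
    // readers that see it know to look in the new bucket array.
    let redirected = with_tag(0, SENTINEL);
    assert!(has_tag(redirected, SENTINEL));

    // A hypothetical 8-byte-aligned bucket address with a removal tag.
    let addr: usize = 0x1000;
    let removed = with_tag(addr, TOMBSTONE);
    assert!(has_tag(removed, TOMBSTONE));
    assert_eq!(untagged(removed), addr);
    println!("ok");
}
```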
//!
//! Bucket pointers that have been copied from an old bucket array into a new
//! bucket array are marked with a borrowed bit. If a thread copies a bucket
//! from an old bucket array into a new bucket array, then fails to CAS the
//! bucket pointer in the old bucket array, it attempts to CAS the bucket
//! pointer in the new bucket array that it previously inserted.
//! in the new bucket array does *not* have the borrowed tag bit set, that
//! thread knows that the value in the new bucket array was modified more
//! recently than the value in the old bucket array. To avoid discarding updates
//! to the new bucket array, a thread will never replace a bucket pointer that
//! has the borrowed tag bit set with one that does not. To see why this is
//! necessary, consider the case where a bucket pointer is copied into the new
//! array, removed from the new array by a second thread, then copied into the
//! new array again by a third thread.
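The rule above reduces to a single check a copier thread makes before finishing a copy. A sketch, again with a hypothetical borrowed-bit layout:

```rust
// Illustrative tag bit; the crate's actual layout may differ.
const BORROWED: usize = 0b100;

// A copier finishing a bucket copy may install its (borrowed-tagged) copy
// over the current pointer in the new array only if that pointer still
// carries the borrowed bit. An untagged pointer was written by a live
// insert, remove, or modify in the new array, so it is strictly newer
// than anything copied from the old array and must not be clobbered.
fn copier_may_replace(current: usize) -> bool {
    current & BORROWED != 0
}

fn main() {
    let post_resize_update = 0x2000; // untagged: written directly to the new array
    let borrowed_copy = 0x3000 | BORROWED; // installed by a resize copy

    assert!(!copier_may_replace(post_resize_update));
    assert!(copier_may_replace(borrowed_copy));
    println!("ok");
}
```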
//!
//! Mutating operations are, at their core, an atomic compare-and-swap (CAS) on
//! a bucket pointer. Insertions CAS null pointers and bucket pointers with
//! matching keys, modifications CAS bucket pointers with matching keys, and
//! removals CAS non-tombstone bucket pointers. Tombstone bucket pointers are
//! bucket pointers with a tombstone bit set as part of a removal; this
//! indicates that the bucket's value has been moved from and will be destroyed
//! if it has not been already.
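The common shape of these mutating operations is one compare-and-swap on a slot's atomic pointer. A stripped-down sketch of a single insertion attempt into an empty slot (real insertions also match keys, handle tags, and retry on failure; names here are hypothetical):

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// One CAS attempt of an insertion: succeed only if the slot is still null
// (empty). On failure, a real implementation re-reads the slot and decides
// whether to retry, probe the next slot, or overwrite a matching key.
pub fn try_insert_empty<T>(slot: &AtomicPtr<T>, new: *mut T) -> bool {
    slot.compare_exchange(ptr::null_mut(), new, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

fn main() {
    let slot: AtomicPtr<u32> = AtomicPtr::new(ptr::null_mut());

    let first = Box::into_raw(Box::new(42u32));
    assert!(try_insert_empty(&slot, first)); // first CAS wins the empty slot

    let second = Box::into_raw(Box::new(7u32));
    assert!(!try_insert_empty(&slot, second)); // slot occupied; CAS fails

    // Clean up the raw allocations (a real table defers this with
    // epoch-based reclamation so concurrent readers stay safe).
    unsafe {
        drop(Box::from_raw(slot.load(Ordering::Acquire)));
        drop(Box::from_raw(second));
    }
    println!("ok");
}
```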
//!
//! As previously mentioned, removing an entry from the hash table results in
//! that bucket pointer having a tombstone bit set. Insertions cannot
//! displace a tombstone bucket unless their key compares equal, so once an
//! entry is inserted into the hash table, the specific index it is assigned to
//! will only ever hold entries whose keys compare equal. Without this
//! restriction, resizing operations could result in the old and new bucket
//! arrays being temporarily inconsistent. Consider the case where one thread,
//! as part of a resizing operation, copies a bucket into a new bucket array
//! while another thread removes and replaces that bucket from the old bucket
//! array. If the new bucket has a non-matching key, what happens to the bucket
//! that was just copied into the new bucket array?
//!
//! Tombstone bucket pointers are typically not copied into new bucket arrays.
//! The exception is the case where a bucket pointer was copied to the new
//! bucket array, then CAS on the old bucket array fails because that bucket has
//! been replaced with a tombstone. In this case, the tombstone bucket pointer
//! will be copied over to reflect the update without displacing a key from its
//! bucket.
//!
//! This hash table algorithm was inspired by [a blog post by Jeff Preshing]
//! that describes the implementation of the Linear hash table in [Junction], a
//! C++ library of concurrent data structures. Additional inspiration was drawn
//! from the lockfree hash table described by Cliff Click in [a tech talk] given
//! at Google in 2007.
//!
//! [a blog post by Jeff Preshing]: https://preshing.com/20160222/a-resizable-concurrent-map/
//! [Junction]: https://github.com/preshing/junction
//! [a tech talk]: https://youtu.be/HJ-719EGIts
pub mod map;
pub mod segment;
Expand Down
