feat: Support memory freeing #236

Draft
hpeebles wants to merge 22 commits into base: EXC-1404-memory-manager-support-freeing-a-memory
Conversation

hpeebles (Author):

WIP

@hpeebles hpeebles changed the base branch from main to EXC-1404-memory-manager-support-freeing-a-memory November 11, 2024 15:29
@hpeebles hpeebles changed the base branch from EXC-1404-memory-manager-support-freeing-a-memory to main November 11, 2024 15:34
@dragoljub-duric dragoljub-duric changed the base branch from main to EXC-1404-memory-manager-support-freeing-a-memory November 11, 2024 15:44
@dragoljub-duric dragoljub-duric changed the base branch from EXC-1404-memory-manager-support-freeing-a-memory to main November 11, 2024 15:45
@hpeebles hpeebles changed the base branch from main to EXC-1404-memory-manager-support-freeing-a-memory November 11, 2024 21:47
@@ -294,6 +291,8 @@ impl<M: Memory> Memory for VirtualMemory<M> {
struct MemoryManagerInner<M: Memory> {
    memory: M,

    version: u8,
Contributor:

Do we need this?

Contributor:

The reason: regardless of whether storage holds an old or a new version, we always load it as V2, so when saving we can be sure we are writing the new version.

hpeebles (Author):

It is only used in the upgrade_from_v1_to_v2 test to ensure the correct version of grow is called.
I'll wrap it in a cfg(test).
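
For reference, a minimal sketch of what gating the field behind cfg(test) could look like; Inner and its fields are hypothetical stand-ins rather than the PR's MemoryManagerInner:

```rust
// Sketch only: a field that exists solely in test builds.
struct Inner {
    data: Vec<u8>,
    // Only compiled when the crate is built for tests; production code never sees it.
    #[cfg(test)]
    version: u8,
}

impl Inner {
    fn new() -> Self {
        Self {
            data: Vec::new(),
            #[cfg(test)]
            version: 2,
        }
    }
}

fn main() {
    let inner = Inner::new();
    println!("data len: {}", inner.data.len());
}

#[cfg(test)]
mod tests {
    use super::Inner;

    #[test]
    fn version_is_visible_in_tests() {
        // The field is accessible here because tests compile with cfg(test).
        assert_eq!(Inner::new().version, 2);
    }
}
```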

Comment on lines 824 to 943
        BucketId(self.inner.first_bucket_per_memory[memory_id.0 as usize])
    }

    // Sets the first bucket assigned to a memory
    fn set_first(&mut self, memory_id: MemoryId, value: BucketId) {
        self.inner.first_bucket_per_memory[memory_id.0 as usize] = value.0;
        self.dirty_first_buckets.insert(memory_id);
    }

    // Gets the next bucket in the linked list of buckets
    fn get_next(&self, bucket: BucketId) -> BucketId {
        let start_bit_index = bucket.0 as usize * 15;
        let mut next_bits: BitArray<[u8; 2]> = BitArray::new([0u8; 2]);

        for i in 0..15 {
            next_bits.set(
                i + 1,
                self.inner
                    .bucket_links
                    .get(start_bit_index + i)
                    .unwrap()
                    .as_bool(),
            );
        }

        BucketId(u16::from_be_bytes(next_bits.data))
    }

    // Sets the next bucket in the linked list of buckets
    fn set_next(&mut self, bucket: BucketId, next: BucketId) {
        let start_bit_index = bucket.0 as usize * 15;
        let next_bits: BitArray<[u8; 2]> = BitArray::from(next.0.to_be_bytes());

        for (index, bit) in next_bits.iter().skip(1).enumerate() {
            self.inner
                .bucket_links
                .set(start_bit_index + index, bit.as_bool());
        }

        let start_byte_index = start_bit_index / 8;
        let end_byte_index = (start_bit_index + 14) / 8;

        self.dirty_bucket_link_bytes
            .extend((start_byte_index..=end_byte_index).map(|i| i as u16))
    }

    // Calculates the buckets for a given memory by iterating over its linked list
    fn buckets_for_memory(&self, memory_id: MemoryId, count: u16) -> Vec<BucketId> {
        if count == 0 {
            return Vec::new();
        }

        let mut bucket = self.get_first(memory_id);
        let mut buckets = vec![bucket];
        for _ in 1..count {
            bucket = self.get_next(bucket);
            buckets.push(bucket);
        }
        buckets
    }

    // Flushes only the dirty bytes to memory
    fn flush_dirty_bytes<M: Memory>(&mut self, memory: &M, start_offset: u64) {
        const FIRST_BUCKET_PER_MEMORY_LEN: usize = 2 * MAX_NUM_MEMORIES as usize;

        if !self.dirty_first_buckets.is_empty() {
            // SAFETY: This is safe because we simply cast from [u16] to [u8] and double the length.
            let bytes: [u8; FIRST_BUCKET_PER_MEMORY_LEN] =
                unsafe { transmute(self.inner.first_bucket_per_memory) };

            // Multiply by 2 since we've converted from [u16] to [u8].
            let min = 2 * self.dirty_first_buckets.first().unwrap().0 as usize;
            let max = 2 * self.dirty_first_buckets.last().unwrap().0 as usize + 1;

            let slice = &bytes[min..=max];
            write(memory, start_offset + min as u64, slice);
            self.dirty_first_buckets.clear();
        }

        if !self.dirty_bucket_link_bytes.is_empty() {
            let min = *self.dirty_bucket_link_bytes.first().unwrap() as usize;
            let max = *self.dirty_bucket_link_bytes.last().unwrap() as usize;

            let slice = &self.inner.bucket_links.data[min..=max];
            write(
                memory,
                start_offset + (FIRST_BUCKET_PER_MEMORY_LEN + min) as u64,
                slice,
            );
            self.dirty_bucket_link_bytes.clear();
        }
    }

    // Flushes all bytes to memory
    fn flush_all<M: Memory>(&mut self, memory: &M, start_offset: u64) {
        write_struct(&self.inner, Address::from(start_offset), memory);
        self.dirty_first_buckets.clear();
        self.dirty_bucket_link_bytes.clear();
    }
}
Contributor:

I would strongly prefer to have only one place where we read the header, and after that to work with u16 values (instead of the 15-bit representation used in storage). I believe that will improve performance, but more importantly, it will be easier to maintain and understand. We can still use the information from dirty_first_buckets and dirty_bucket_link_bytes to apply changes to the 15-bit representation in storage when saving.
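
A rough sketch of that idea (my own illustration, not code from this PR): decode the packed links into plain u16 values once when the header is read, and re-encode individual entries only when flushing. The exact bit layout used here (15 bits per link, most significant bit first) is an assumption for illustration.

```rust
// Sketch only: decode all 15-bit links to u16 at load time, re-encode on save.
// Assumed layout: link `i` occupies bits [i * 15, i * 15 + 15), MSB first.

fn unpack_links(packed: &[u8], count: usize) -> Vec<u16> {
    (0..count)
        .map(|i| {
            let mut value = 0u16;
            for b in 0..15 {
                let bit_index = i * 15 + b;
                let bit = (packed[bit_index / 8] >> (7 - bit_index % 8)) & 1;
                value = (value << 1) | bit as u16;
            }
            value
        })
        .collect()
}

fn pack_link(packed: &mut [u8], i: usize, value: u16) {
    for b in 0..15 {
        let bit_index = i * 15 + b;
        let mask = 1u8 << (7 - bit_index % 8);
        if (value >> (14 - b)) & 1 == 1 {
            packed[bit_index / 8] |= mask;
        } else {
            packed[bit_index / 8] &= !mask;
        }
    }
}

fn main() {
    // Round-trip a few links through the packed representation.
    let mut packed = vec![0u8; (4 * 15 + 7) / 8];
    for (i, link) in [5u16, 0x7FFF, 0, 123].iter().enumerate() {
        pack_link(&mut packed, i, *link);
    }
    assert_eq!(unpack_links(&packed, 4), vec![5, 0x7FFF, 0, 123]);
    println!("round-trip ok");
}
```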

hpeebles (Author):

Yeah, that makes sense.
This means we'll have to convert from 15-bit to 16-bit after reading, which adds some overhead, but at least it's only done once. I'll have a go at it now.

hpeebles (Author):

Done. Will likely need a bit of cleaning up. Need to run off now but will give it another look tomorrow.

ielashi (Contributor) commented Nov 14, 2024:

@dragoljub-duric @hpeebles @dsarlis I'm not sure we'd want to merge this solution to support memory freeing.

With the expansion of stable memory, we now have another issue to solve: we have ~32k buckets, and with a default size of 8 MiB per bucket, the memory manager can only manage 256 GiB of memory. I bet most people don't know this and assume they have access to the full 400 GiB (or is it 500 now?). This is a footgun that we should resolve, and resolving it would require increasing the number of buckets.

In order to both increase the number of buckets and support memory freeing in the future, we'll have to update the memory manager in such a way that its header can span, say, two Wasm pages instead of one. And if we have two Wasm pages, then we won't need to represent buckets using 15 bits as we're doing in this PR, which adds more complication than it's worth imho.
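
(Back-of-envelope check of the capacity figure above, assuming 32,768 buckets and the default 8 MiB bucket size:)

```rust
fn main() {
    // Assumed figures from the comment above: ~32k buckets, 8 MiB default bucket size.
    const NUM_BUCKETS: u64 = 32_768;
    const BUCKET_SIZE_BYTES: u64 = 8 * 1024 * 1024;

    let max_managed_bytes = NUM_BUCKETS * BUCKET_SIZE_BYTES;
    assert_eq!(max_managed_bytes, 256 * 1024 * 1024 * 1024); // 256 GiB

    println!("max manageable memory: {} GiB", max_managed_bytes >> 30);
}
```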

hpeebles (Author) commented Nov 14, 2024:

We could keep the header as a single page, store each bucket in the linked list as 3 bytes (which could support up to 16,777,216 buckets), and, if additional buckets are needed, use the unused space in the header to point to the page where the subsequent buckets are defined.

Then each page of buckets holds a pointer to the next page.

This way we can support far more buckets than anyone would ever need, the header remains small in most use cases, and we wouldn't need to move the existing data that sits in the first memory page after the header.

hpeebles (Author) commented Nov 14, 2024:

Having thought about this a bit more, I think my idea to store bucket IDs as 3 bytes is wrong.
If we support spreading buckets onto more pages, then storing them as 3 bytes is an unnecessary storage optimisation at the expense of code complexity. I think storing buckets as u32s is the way to go.
The first page could then store 8192 buckets in its last 32768 bytes (using the format I introduced in this PR, where the entry at each bucket's index points to the next bucket in the linked list).
Follow-on pages used to store buckets could then hold 16384 buckets per page.
The first page then just needs to somehow record where the additional pages are.
If my calculations are correct, then using the format above we'd have 30178 bytes available within the first page to encode the additional bucket pages and any metadata we may need on each one.
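
A sketch of that arithmetic (the individual header field sizes below are assumptions inferred from the discussion, i.e. the existing V1 header plus the per-memory first-bucket table added in this PR, not values taken from the code):

```rust
fn main() {
    const WASM_PAGE_SIZE: usize = 65_536;
    const LINK_SIZE: usize = 4; // each bucket link stored as a u32

    // Last 32 KiB of the first page holds 8192 bucket links.
    let links_in_first_page = 32_768 / LINK_SIZE;
    assert_eq!(links_in_first_page, 8_192);

    // A dedicated follow-on page holds 16384 links.
    let links_per_extra_page = WASM_PAGE_SIZE / LINK_SIZE;
    assert_eq!(links_per_extra_page, 16_384);

    // Space left in the first half of the first page after the header.
    // Assumed header fields: magic + version + bucket counts + reserved bytes,
    // 255 u64 memory sizes, and 255 u16 first-bucket entries (this PR).
    let header_bytes = 3 + 1 + 2 + 2 + 32 + 255 * 8 + 255 * 2;
    let free_in_first_page = 32_768 - header_bytes;
    assert_eq!(free_in_first_page, 30_178); // matches the figure above

    println!("bytes free for extra bucket-page pointers: {free_in_first_page}");
}
```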
