feat: Support memory freeing #236
base: EXC-1404-memory-manager-support-freeing-a-memory

Conversation
…nto freeing_memory (conflicts: Cargo.lock, Cargo.toml, examples/Cargo.lock, src/memory_manager.rs)
…tps://github.com/dfinity/stable-structures into freeing_memory (conflicts: Cargo.lock, Cargo.toml, examples/Cargo.lock, src/memory_manager.rs)
@@ -294,6 +291,8 @@ impl<M: Memory> Memory for VirtualMemory<M> {
struct MemoryManagerInner<M: Memory> {
    memory: M,

    version: u8,
Do we need this?
Reason: regardless of whether storage holds an old or a new version, we always load it as V2, so when saving we can be sure we are saving the new version.
It is only used in the upgrade_from_v1_to_v2 test to ensure the correct version of grow is called. I'll wrap it in a cfg(test).
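A rough sketch of the cfg(test) gating being proposed; the struct and field names here are simplified stand-ins rather than the PR's actual types, so treat them as assumptions:

/// Hypothetical, simplified header struct, only to show a test-only field:
/// it is visible to unit tests but compiled out of release builds.
struct Header {
    num_buckets: u16,
    /// Only kept so the upgrade test can assert which version was written.
    #[cfg(test)]
    version: u8,
}

#[cfg(test)]
mod tests {
    use super::Header;

    #[test]
    fn version_is_visible_in_tests() {
        let header = Header {
            num_buckets: 0,
            version: 2,
        };
        assert_eq!(header.version, 2);
    }
}

One thing to watch with this pattern: a cfg(test)-gated field changes the struct's layout between test and release builds, which matters if the struct is ever serialized byte-for-byte.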
src/memory_manager.rs
Outdated
        BucketId(self.inner.first_bucket_per_memory[memory_id.0 as usize])
    }

    // Sets the first bucket assigned to a memory
    fn set_first(&mut self, memory_id: MemoryId, value: BucketId) {
        self.inner.first_bucket_per_memory[memory_id.0 as usize] = value.0;
        self.dirty_first_buckets.insert(memory_id);
    }

    // Gets the next bucket in the linked list of buckets
    fn get_next(&self, bucket: BucketId) -> BucketId {
        let start_bit_index = bucket.0 as usize * 15;
        let mut next_bits: BitArray<[u8; 2]> = BitArray::new([0u8; 2]);

        for i in 0..15 {
            next_bits.set(
                i + 1,
                self.inner
                    .bucket_links
                    .get(start_bit_index + i)
                    .unwrap()
                    .as_bool(),
            );
        }

        BucketId(u16::from_be_bytes(next_bits.data))
    }

    // Sets the next bucket in the linked list of buckets
    fn set_next(&mut self, bucket: BucketId, next: BucketId) {
        let start_bit_index = bucket.0 as usize * 15;
        let next_bits: BitArray<[u8; 2]> = BitArray::from(next.0.to_be_bytes());

        for (index, bit) in next_bits.iter().skip(1).enumerate() {
            self.inner
                .bucket_links
                .set(start_bit_index + index, bit.as_bool());
        }

        let start_byte_index = start_bit_index / 8;
        let end_byte_index = (start_bit_index + 14) / 8;

        self.dirty_bucket_link_bytes
            .extend((start_byte_index..=end_byte_index).map(|i| i as u16))
    }

    // Calculates the buckets for a given memory by iterating over its linked list
    fn buckets_for_memory(&self, memory_id: MemoryId, count: u16) -> Vec<BucketId> {
        if count == 0 {
            return Vec::new();
        }

        let mut bucket = self.get_first(memory_id);
        let mut buckets = vec![bucket];
        for _ in 1..count {
            bucket = self.get_next(bucket);
            buckets.push(bucket);
        }
        buckets
    }

    // Flushes only the dirty bytes to memory
    fn flush_dirty_bytes<M: Memory>(&mut self, memory: &M, start_offset: u64) {
        const FIRST_BUCKET_PER_MEMORY_LEN: usize = 2 * MAX_NUM_MEMORIES as usize;

        if !self.dirty_first_buckets.is_empty() {
            // SAFETY: This is safe because we simply cast from [u16] to [u8] and double the length.
            let bytes: [u8; FIRST_BUCKET_PER_MEMORY_LEN] =
                unsafe { transmute(self.inner.first_bucket_per_memory) };

            // Multiply by 2 since we've converted from [u16] to [u8].
            let min = 2 * self.dirty_first_buckets.first().unwrap().0 as usize;
            let max = 2 * self.dirty_first_buckets.last().unwrap().0 as usize + 1;

            let slice = &bytes[min..=max];
            write(memory, start_offset + min as u64, slice);
            self.dirty_first_buckets.clear();
        }

        if !self.dirty_bucket_link_bytes.is_empty() {
            let min = *self.dirty_bucket_link_bytes.first().unwrap() as usize;
            let max = *self.dirty_bucket_link_bytes.last().unwrap() as usize;

            let slice = &self.inner.bucket_links.data[min..=max];
            write(
                memory,
                start_offset + (FIRST_BUCKET_PER_MEMORY_LEN + min) as u64,
                slice,
            );
            self.dirty_bucket_link_bytes.clear();
        }
    }

    // Flushes all bytes to memory
    fn flush_all<M: Memory>(&mut self, memory: &M, start_offset: u64) {
        write_struct(&self.inner, Address::from(start_offset), memory);
        self.dirty_first_buckets.clear();
        self.dirty_bucket_link_bytes.clear();
    }
}
I would strongly prefer to have only one place where we read the header, and after that use u16 values (instead of the 15-bit representation used in storage). I believe that will improve performance, but more importantly, it will be easier to maintain and understand. We can still use the information from dirty_first_buckets and dirty_bucket_link_bytes to apply changes to the 15-bit representation in storage when saving.
Yeah that makes sense.
This means we'll have to convert from the 15-bit to the 16-bit representation after reading, which will have some overhead, but at least it's only done once. I'll have a go at it now.
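A minimal sketch of that decode-once / re-encode-on-save idea, assuming the same MSB-first packing as the PR's 15-bit links; the function names and buffer sizes here are illustrative, not the PR's actual code:

const BITS_PER_LINK: usize = 15;

/// Reads the 15-bit link for `bucket` out of the packed byte buffer.
fn decode_link(packed: &[u8], bucket: usize) -> u16 {
    let start_bit = bucket * BITS_PER_LINK;
    let mut value: u16 = 0;
    for i in 0..BITS_PER_LINK {
        let bit_index = start_bit + i;
        let bit = (packed[bit_index / 8] >> (7 - bit_index % 8)) & 1;
        value = (value << 1) | bit as u16;
    }
    value
}

/// Writes a 15-bit link for `bucket` back into the packed byte buffer.
fn encode_link(packed: &mut [u8], bucket: usize, value: u16) {
    let start_bit = bucket * BITS_PER_LINK;
    for i in 0..BITS_PER_LINK {
        let bit = (value >> (BITS_PER_LINK - 1 - i)) & 1;
        let bit_index = start_bit + i;
        let mask = 1u8 << (7 - bit_index % 8);
        if bit == 1 {
            packed[bit_index / 8] |= mask;
        } else {
            packed[bit_index / 8] &= !mask;
        }
    }
}

/// Decodes every link once at load time, so the rest of the code can work on u16s.
fn decode_all(packed: &[u8], num_buckets: usize) -> Vec<u16> {
    (0..num_buckets).map(|b| decode_link(packed, b)).collect()
}

fn main() {
    // Round-trip check: write a link for bucket 3, read it back.
    let mut packed = vec![0u8; 64];
    encode_link(&mut packed, 3, 0x1ABC);
    assert_eq!(decode_link(&packed, 3), 0x1ABC);
    assert_eq!(decode_all(&packed, 8)[3], 0x1ABC);
}

With this split, set_next would just update the in-memory Vec<u16> and mark the entry dirty, and the flush path would call encode_link only for the dirty entries before writing the affected bytes.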
Done. Will likely need a bit of cleaning up. Need to run off now but will give it another look tomorrow.
@dragoljub-duric @hpeebles @dsarlis I'm not sure we'd want to merge this solution to support memory freeing. With the expansion of stable memory, we now have another issue to solve: we have ~32k buckets, and with a default size of 8 MiB per bucket, the memory manager can only manage 256 GiB of memory. I bet most people don't know this and assume they have access to the full 400 GiB (or is it 500 now?). This is a footgun that we should resolve, and resolving it would require increasing the number of buckets. In order to both increase the number of buckets and support memory freeing in the future, we'll have to update the memory manager so that its header can span, say, two wasm pages instead of one. And if we have two wasm pages, then we won't need to represent buckets using 15 bits as we're doing in this PR, which is adding more complication than it's worth imho.
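A quick sanity check of the 256 GiB figure quoted above; the constants are the numbers from this comment rather than values read from the crate, so treat them as assumptions:

// ~32k buckets addressable by the current header, 8 MiB default bucket size.
const NUM_BUCKETS: u64 = 32_768;
const BUCKET_SIZE_BYTES: u64 = 8 * 1024 * 1024;

fn main() {
    let max_managed_bytes = NUM_BUCKETS * BUCKET_SIZE_BYTES;
    // 32_768 buckets * 8 MiB = 256 GiB, well below the full stable memory capacity.
    assert_eq!(max_managed_bytes, 256 * 1024 * 1024 * 1024);
    println!("memory manager limit: {} GiB", max_managed_bytes >> 30);
}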
We could keep the header as a single page, store each bucket in the linked list as 3 bytes (which could support up to 16,777,216 buckets, i.e. 2^24), and, if additional buckets are needed, use the unused space in the header to point to the page where the subsequent buckets are defined. Each page of buckets would then hold a pointer to the next page. This way we can support far more buckets than anyone would ever need, the header remains small in most use cases, and we wouldn't need to move the existing data in the first memory page after the header.
Having thought about this a bit more, I think my idea to store BucketIds as 3 bytes is wrong.
WIP