[loader-v2] Fixing global cache reads & read-before-write on publish #15285
Changes from all commits
cef0262
c6d6182
2541069
b0a93db
1ed9b80
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -136,44 +136,39 @@ impl<'a, T: Transaction, S: TStateView<Key = T::Key>, X: Executable> ModuleCache | |
Self::Version, | ||
)>, | ||
> { | ||
// First, look up the module in the cross-block global module cache. Record the read for | ||
// later validation in case the read module is republished. | ||
if let Some(module) = self.global_module_cache.get_valid(key) { | ||
match &self.latest_view { | ||
ViewState::Sync(state) => state | ||
.captured_reads | ||
.borrow_mut() | ||
.capture_global_cache_read(key.clone()), | ||
ViewState::Unsync(state) => { | ||
state.read_set.borrow_mut().capture_module_read(key.clone()) | ||
}, | ||
} | ||
return Ok(Some((module, Self::Version::default()))); | ||
} | ||
|
||
// Global cache miss: check module cache in versioned/unsync maps. | ||
match &self.latest_view { | ||
ViewState::Sync(state) => { | ||
// Check the transaction-level cache with already read modules first. | ||
let cache_read = state.captured_reads.borrow().get_module_read(key)?; | ||
match cache_read { | ||
CacheRead::Hit(read) => Ok(read), | ||
CacheRead::Miss => { | ||
// If the module has not been accessed by this transaction, go to the | ||
// module cache and record the read. | ||
let read = state | ||
.versioned_map | ||
.module_cache() | ||
.get_module_or_build_with(key, builder)?; | ||
state | ||
.captured_reads | ||
.borrow_mut() | ||
.capture_per_block_cache_read(key.clone(), read.clone()); | ||
Ok(read) | ||
}, | ||
if let CacheRead::Hit(read) = state.captured_reads.borrow().get_module_read(key) { | ||
return Ok(read); | ||
} | ||
|
||
// Otherwise, it is a miss. Check global cache. | ||
Why do we check the global cache before checking state.versioned_map.module_cache? On rolling commit, are we updating GlobalCache itself? |
We update the global cache at rolling commit: if published keys exist in the global cache, we mark them as invalid. Reads to them then result in a cache miss and we fall back to MVHashMap, where we placed the write at commit time. |
You can check versioned first, but then you end up acquiring a lock for a potentially non-republished module (publish is rare). If 32 threads do this for aptos-framework, this is bad. |
So instead, we look up in global first, but check an atomic bool flag there (better than a lock), so we optimize for the read case. |
I see, then I would rename PerBlockCache to UnfinalizedBlockCache or something like that, to make it clear it only ever refers to things before rolling commit, and GlobalCache is global and updated within the block itself. (You can do that in a separate PR of course :) ) |
||
if let Some(module) = self.global_module_cache.get_valid(key) { | ||
Why do we reverse the order of checking now? (I was wondering about the order in the previous PR too.) |
Now, we always check the local cache first. If it is not there, we check, as before, 1) global first, if valid, 2) per-block next. In both cases, we clone the module to captured reads (the local cache), so the next read always reads the same thing. Does this make sense? |
I was asking more about why we checked the global cache in the previous PR: is this an orthogonal change, or do we need to reverse the order now? |
We still check the global cache first. What is added here is a check in captured reads, meaning whether this same transaction has already read the module; if it did, we do not read it again. |
||
state | ||
.captured_reads | ||
.borrow_mut() | ||
.capture_global_cache_read(key.clone(), module.clone()); | ||
It would be total overkill here, but I wonder if we could use a small RAII-style struct that captures things on drop, to make sure the different paths for getting things all get recorded in captured reads (not just for modules). But I suppose we don't have such complex flows anywhere else. |
||
return Ok(Some((module, Self::Version::default()))); | ||
} | ||
|
||
// If not global cache, check per-block cache. | ||
let read = state | ||
.versioned_map | ||
.module_cache() | ||
.get_module_or_build_with(key, builder)?; | ||
state | ||
.captured_reads | ||
.borrow_mut() | ||
.capture_per_block_cache_read(key.clone(), read.clone()); | ||
Ok(read) | ||
}, | ||
ViewState::Unsync(state) => { | ||
if let Some(module) = self.global_module_cache.get_valid(key) { | ||
state.read_set.borrow_mut().capture_module_read(key.clone()); | ||
return Ok(Some((module, Self::Version::default()))); | ||
} | ||
|
||
let read = state | ||
.unsync_map | ||
.module_cache() | ||
|
should this whole match be equivalent to:
why do we need to update GlobalCache at all while executing a block?
We do if we read first from it (to know if entry is overridden or not). An alternative is to check lower level cache first, but this means performance penalty due to locking.
The code can be somewhat equivalent, but:
causes a prefetch of storage version by default. We would need to special case validation to not do it. And we also end up locking the cache (shard, worst case), instead of checking an atomic bool
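To make the "atomic bool instead of a lock" point concrete, here is a minimal sketch of a read-mostly cache whose entries carry a validity flag. It is not the PR's implementation: `GlobalCache`, `Entry`, `insert`, and `mark_invalid` are hypothetical names, and only `get_valid` mirrors a method name that appears in the diff. The sketch assumes the map itself is not structurally modified while a block executes, so readers only need `&self`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical cache entry: a shared module plus a validity flag.
struct Entry<M> {
    module: Arc<M>,
    valid: AtomicBool,
}

/// Hypothetical stand-in for a cross-block global module cache.
struct GlobalCache<M> {
    entries: HashMap<String, Entry<M>>,
}

impl<M> GlobalCache<M> {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Populates the cache; needs `&mut self`, so it cannot race with readers.
    fn insert(&mut self, key: &str, module: M) {
        let entry = Entry { module: Arc::new(module), valid: AtomicBool::new(true) };
        self.entries.insert(key.to_string(), entry);
    }

    /// Read path: one atomic load, no lock. An invalidated entry looks like a miss,
    /// so the caller falls back to the lower-level cache.
    fn get_valid(&self, key: &str) -> Option<Arc<M>> {
        let entry = self.entries.get(key)?;
        if entry.valid.load(Ordering::Acquire) {
            Some(Arc::clone(&entry.module))
        } else {
            None
        }
    }

    /// Commit path: flips the flag when a key is republished.
    /// Returns whether the key was present in the cache.
    fn mark_invalid(&self, key: &str) -> bool {
        match self.entries.get(key) {
            Some(entry) => {
                entry.valid.store(false, Ordering::Release);
                true
            }
            None => false,
        }
    }
}
```

The design choice this illustrates: because publishing is rare, the common case pays only for an atomic load, while the rare republish pays for the fallback lookup.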
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is because we may publish a module that invalidates the global cache that's being read I think
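The lookup order discussed in the review threads (captured reads first, then the global cache if the entry is still valid, then the per-block cache) can be sketched as below. This is a simplified, single-threaded illustration, not the code under review: `TxnView`, `Source`, and the string-valued "modules" are hypothetical, and the per-block map is pre-populated rather than written at commit time.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};

/// Which cache served a captured read (hypothetical, for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Source {
    Global,
    PerBlock,
}

/// Hypothetical, heavily simplified view held by one transaction.
struct TxnView<'a> {
    /// Transaction-local record of modules this transaction has already read.
    captured_reads: RefCell<HashMap<String, (Source, String)>>,
    /// key -> (module, validity flag)
    global: &'a HashMap<String, (String, AtomicBool)>,
    per_block: &'a HashMap<String, String>,
}

impl<'a> TxnView<'a> {
    fn new(
        global: &'a HashMap<String, (String, AtomicBool)>,
        per_block: &'a HashMap<String, String>,
    ) -> Self {
        Self { captured_reads: RefCell::new(HashMap::new()), global, per_block }
    }

    fn get_module(&self, key: &str) -> Option<(Source, String)> {
        // 1) Already read by this transaction: return the same thing again.
        let hit = self.captured_reads.borrow().get(key).cloned();
        if hit.is_some() {
            return hit;
        }
        // 2) Global cache if still valid, 3) otherwise the per-block cache.
        let read = match self.global.get(key) {
            Some((module, valid)) if valid.load(Ordering::Acquire) => {
                (Source::Global, module.clone())
            }
            _ => (Source::PerBlock, self.per_block.get(key)?.clone()),
        };
        // In both cases, record the read so later reads are stable.
        self.captured_reads.borrow_mut().insert(key.to_string(), read.clone());
        Some(read)
    }
}
```

Under these assumptions, a transaction that read a module from the global cache keeps seeing that version even if the entry is invalidated afterwards, while a transaction that starts later misses in the global cache and reads the per-block version.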