Replies: 15 comments
-
I have seen a production deployment where the consumer could never catch up when the bookie's average read latency degraded to 10+ ms due to disk contention and other workloads running on the same machine. The problem can potentially be mitigated by a proper readahead mechanism in the managed ledger, like what dlog is doing.
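To make the idea concrete, here is a minimal sketch of the readahead pattern being suggested (hypothetical names, not the actual managed-ledger or dlog code): while the consumer processes entry N, the next few entries are already being fetched asynchronously, so a slow bookie read overlaps with processing instead of stalling it.

```java
import java.util.Map;
import java.util.concurrent.*;

// Hypothetical sketch: a reader that keeps a window of entries in flight
// ahead of the consumer's position. `fetchFromBookie` stands in for a real
// (slow, ~10 ms) bookie read.
class ReadAheadSketch {
    private final ExecutorService ioPool = Executors.newFixedThreadPool(4);
    private final Map<Long, CompletableFuture<String>> inflight = new ConcurrentHashMap<>();
    private final int window;

    ReadAheadSketch(int window) { this.window = window; }

    // Simulated slow bookie read (stand-in for a real readEntries call).
    private String fetchFromBookie(long entryId) {
        try { Thread.sleep(10); } catch (InterruptedException ignored) {}
        return "entry-" + entryId;
    }

    String read(long entryId) {
        // Kick off prefetches for the next `window` entries before blocking.
        for (long id = entryId; id < entryId + window; id++) {
            final long eid = id;
            inflight.computeIfAbsent(eid,
                k -> CompletableFuture.supplyAsync(() -> fetchFromBookie(eid), ioPool));
        }
        // The requested entry is usually already in flight or complete,
        // so sequential reads rarely pay the full per-entry latency.
        return inflight.remove(entryId).join();
    }

    void shutdown() { ioPool.shutdown(); }

    public static void main(String[] args) {
        ReadAheadSketch reader = new ReadAheadSketch(8);
        for (long id = 0; id < 16; id++) {
            System.out.println(reader.read(id));
        }
        reader.shutdown();
    }
}
```

With a window of 8 and 4 I/O threads, a catch-up read keeps several bookie requests in flight, so the consumer's effective read rate is bounded by aggregate bookie throughput rather than per-entry round-trip latency.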
-
Should the logic be similar to the readAhead logic in BKLogSegmentEntryReader?
-
@MarvinCai yes. I think it is worth pushing this logic to BK to provide a
-
/cc @jiazhai @eolivelli in this thread, so they can provide more thoughts on whether it is worth adding this readahead logic to the BK read handle.
-
@sijie
-
This is also a blocking issue for practical use of tiered storage as historical retention, since replay from tiered storage (at least S3) is too slow. It would be great if the number of read-ahead threads and/or outstanding requests were tunable, as many blob/object stores scale throughput proportionally to the number of connections.
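The tunable-parallelism idea can be sketched roughly as follows (hypothetical code, not the actual offloader; `rangedGet` stands in for a ranged GET against the blob store). A semaphore caps the number of in-flight requests, and that cap is exactly the knob this comment is asking for:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Hypothetical sketch: read many fixed-size blocks from a blob store with a
// configurable bound on concurrent ranged GETs. Throughput against stores
// like S3 tends to scale with the number of parallel connections.
class ParallelBlobReadSketch {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Semaphore outstanding;

    ParallelBlobReadSketch(int maxOutstanding) {
        this.outstanding = new Semaphore(maxOutstanding);
    }

    // Stand-in for a ranged GET (offset/length) against the blob store.
    private byte[] rangedGet(long offset, int len) {
        try { Thread.sleep(20); } catch (InterruptedException ignored) {}
        return new byte[len];
    }

    // Fetch `count` blocks, never exceeding maxOutstanding in-flight GETs.
    List<byte[]> readBlocks(long baseOffset, int blockSize, int count) {
        List<CompletableFuture<byte[]>> futures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            final long off = baseOffset + (long) i * blockSize;
            outstanding.acquireUninterruptibly(); // back-pressure: cap in-flight GETs
            futures.add(CompletableFuture.supplyAsync(() -> {
                try { return rangedGet(off, blockSize); }
                finally { outstanding.release(); }
            }, pool));
        }
        List<byte[]> blocks = new ArrayList<>();
        for (CompletableFuture<byte[]> f : futures) blocks.add(f.join());
        return blocks;
    }

    void shutdown() { pool.shutdown(); }
}
```

With `maxOutstanding` exposed as configuration, operators could raise it for high-latency object stores until replay throughput matches ingest.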
-
@vicaya thank you for your feedback. @MarvinCai are you willing to give it a try?
-
@sijie sorry, I just saw the replies. How about I start with a doc containing the problem statement and a proposed solution? If everything looks good, we can proceed from there.
-
@MarvinCai Are you working on this issue already?
-
@sijie I am new to the BK code base and have been reading some of the LedgerHandle and DL code to figure out what should be changed. I have a simple doc about what I think may need to change.
-
Is there any progress on this issue? Whenever people ask me why we don't use tiered storage, I have to point them to this issue as the reason it is too slow for us: readers cannot read fast enough from tiered storage, and backlogs build up even at moderate throughput (<1000 msgs/s with small messages of <100 bytes, i.e. <100 KB/s).
-
@nicoloboschi you might be interested in working on a fix for this issue if @MarvinCai doesn't have time for this topic.
-
@MarvinCai, @sijie, @eolivelli - It looks like the feature to improve read throughput from blob storage is still outstanding. I'm very interested in helping to contribute it. Do you think it deserves its own issue to start a dedicated discussion on offloading?
-
@michaeljmarshall you can start a discussion with a proposal.
-
@eolivelli - thanks, I'll do that.
-
Problems
Currently the managed ledger reads entries in very large batch requests (100 entries by default). This is an inefficient approach. We should stream the read requests the way dlog is doing.
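One way to picture the streaming alternative (a sketch with illustrative names, not the actual managed-ledger or dlog code): instead of issuing one blocking 100-entry read, issue small pipelined sub-reads up front and deliver entries to the consumer, in order, as each sub-batch lands. The first entries then arrive after one small read's latency rather than after the whole batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.function.LongConsumer;

// Hypothetical sketch of streaming reads: split one large batch into small
// sub-reads that run concurrently, while delivery to the consumer stays in
// entry order.
class StreamingReadSketch {
    // Simulated bookie read of one small sub-batch of entry ids.
    static long[] readSubBatch(long first, int n) {
        try { Thread.sleep(5); } catch (InterruptedException ignored) {}
        long[] ids = new long[n];
        for (int i = 0; i < n; i++) ids[i] = first + i;
        return ids;
    }

    // Issue all sub-reads up front (pipelined); deliver entries in order as
    // each sub-batch completes, so the consumer starts almost immediately.
    static void streamEntries(long start, int total, int subBatch,
                              ExecutorService pool, LongConsumer deliver) {
        List<CompletableFuture<long[]>> pending = new ArrayList<>();
        for (int b = 0; b * subBatch < total; b++) {
            final long first = start + (long) b * subBatch;
            final int n = Math.min(subBatch, total - b * subBatch);
            pending.add(CompletableFuture.supplyAsync(
                () -> readSubBatch(first, n), pool));
        }
        for (CompletableFuture<long[]> f : pending)
            for (long id : f.join()) deliver.accept(id);
    }

    // Convenience wrapper: collect all delivered entry ids into an array.
    static long[] collect(long start, int total, int subBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        long[] out = new long[total];
        int[] idx = {0};
        streamEntries(start, total, subBatch, pool, id -> out[idx[0]++] = id);
        pool.shutdown();
        return out;
    }

    public static void main(String[] args) {
        long[] ids = collect(0, 100, 10);
        System.out.println("delivered " + ids.length + " entries in order");
    }
}
```

The same 100 entries are fetched either way; the difference is that the ten 10-entry sub-reads overlap with each other and with consumer processing instead of serializing behind one large request.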