more questions
Signed-off-by: Alex Chi Z <[email protected]>
skyzh committed Jan 23, 2024
1 parent 415c3c4 commit 940125e
Showing 5 changed files with 26 additions and 7 deletions.
mini-lsm-book/src/week1-04-sst.md (2 changes: 1 addition & 1 deletion)
@@ -99,7 +99,7 @@ We do not provide reference answers to the questions; feel free to discuss them in the Discord community.

## Bonus Tasks

- * **Explore different SST encoding and layout.** For example, in the [Lethe](https://disc-projects.bu.edu/lethe/) paper, the author adds secondary key support to SST.
+ * **Explore different SST encoding and layout.** For example, in the [Lethe: Enabling Efficient Deletes in LSMs](https://disc-projects.bu.edu/lethe/) paper, the author adds secondary key support to SSTs.
* Or you can use B+ Tree as the SST format instead of sorted blocks.
* **Index Blocks.** Split block indexes and block metadata into index blocks, and load them on-demand.
* **Index Cache.** Use a separate cache for indexes apart from the data block cache.
mini-lsm-book/src/week2-01-compaction.md (2 changes: 1 addition & 1 deletion)
@@ -111,7 +111,7 @@ scan 2000 2333
* What are the definitions of read/write/space amplifications? (This is covered in the overview chapter)
* What are the ways to accurately compute the read/write/space amplifications, and what are the ways to estimate them?
* Is it correct that a key will take some storage space even if a user requests to delete it?
- * Given that compaction takes a lot of write bandwidth and read bandwidth and may interfere with foreground operations, it is a good idea to postpone compaction when there are large write flow. It is even beneficial to stop/pause existing compaction tasks in this situation. What do you think of this idea? (Read the [Silk](https://www.usenix.org/conference/atc19/presentation/balmau) paper!)
+ * Given that compaction takes a lot of read and write bandwidth and may interfere with foreground operations, it is a good idea to postpone compaction when there is a large write flow. It is even beneficial to stop/pause existing compaction tasks in this situation. What do you think of this idea? (Read the [SILK: Preventing Latency Spikes in Log-Structured Merge Key-Value Stores](https://www.usenix.org/conference/atc19/presentation/balmau) paper!)
* Is it a good idea to use/fill the block cache for compactions? Or is it better to fully bypass the block cache during compaction?
* Does it make sense to have a `struct ConcatIterator<I: StorageIterator>` in the system?
* Some researchers/engineers propose to offload compaction to a remote server or a serverless lambda function. What are the benefits, and what might be the potential challenges and performance impacts of doing remote compaction? (Think of the point when a compaction completes and what happens to the block cache on the next read request...)
mini-lsm-book/src/week2-02-simple.md (4 changes: 3 additions & 1 deletion)
@@ -7,7 +7,7 @@ In this chapter, you will:
* Implement a simple leveled compaction strategy and simulate it on the compaction simulator.
* Start compaction as a background task and implement a compaction trigger in the system.

- ## Task 1: Simple Leveled Compaction + Compaction Simulation
+ ## Task 1: Simple Leveled Compaction

In this chapter, we are going to implement our first compaction strategy -- simple leveled compaction. In this task, you will need to modify:

@@ -152,6 +152,8 @@ You may print something, for example, the compaction task information, when the

## Test Your Understanding

+ * What is the estimated write amplification of leveled compaction?
+ * What is the estimated read amplification of leveled compaction?
* Is it correct that a key will only be purged from the LSM tree if the user requests to delete it and it has been compacted to the bottom-most level?
* Is it a good strategy to periodically do a full compaction on the LSM tree? Why or why not?
* Is it a good choice to actively compact some old files/levels even if they do not violate the level amplifier? (Look at the [Lethe](https://disc-projects.bu.edu/lethe/) paper!)
mini-lsm-book/src/week2-03-tiered.md (13 changes: 10 additions & 3 deletions)
@@ -109,13 +109,20 @@ src/lsm_storage.rs

As tiered compaction does not use the L0 level of the LSM state, you should directly flush your memtables to a new tier instead of as an L0 SST. You can use `self.compaction_controller.flush_to_l0()` to know whether to flush to L0. You may use the first output SST id as the level/tier id for your new sorted run. You will also need to modify your compaction process to construct merge iterators for tiered compaction jobs.
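
A minimal sketch of what this flush branch could look like is below. Only `flush_to_l0()` is taken from the paragraph above; the state layout, field names, and the `add_flushed_sst` helper are illustrative assumptions, not the actual mini-lsm API.

```rust
/// Simplified stand-ins for the real mini-lsm state (assumed shapes).
struct CompactionController {
    tiered: bool,
}

impl CompactionController {
    /// True for strategies that keep an L0 level (no compaction, simple, leveled).
    fn flush_to_l0(&self) -> bool {
        !self.tiered
    }
}

struct LsmStorageState {
    compaction_controller: CompactionController,
    l0_sstables: Vec<usize>,          // SST ids, newest first
    levels: Vec<(usize, Vec<usize>)>, // (tier/level id, SST ids in that sorted run)
}

/// Hypothetical helper: record a freshly flushed memtable in the LSM state.
fn add_flushed_sst(state: &mut LsmStorageState, sst_id: usize) {
    if state.compaction_controller.flush_to_l0() {
        // L0-based strategies: the flushed memtable becomes an L0 SST.
        state.l0_sstables.insert(0, sst_id);
    } else {
        // Tiered compaction: the flushed memtable becomes a new tier (sorted
        // run) on top, using its first output SST id as the tier id.
        state.levels.insert(0, (sst_id, vec![sst_id]));
    }
}

fn main() {
    let mut state = LsmStorageState {
        compaction_controller: CompactionController { tiered: true },
        l0_sstables: Vec::new(),
        levels: Vec::new(),
    };
    add_flushed_sst(&mut state, 42);
    // Under tiered compaction, nothing lands in L0.
    assert!(state.l0_sstables.is_empty());
    assert_eq!(state.levels[0], (42, vec![42]));
}
```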

+ ## Related Readings

+ [Universal Compaction - RocksDB Wiki](https://github.com/facebook/rocksdb/wiki/Universal-Compaction)

## Test Your Understanding

+ * What is the estimated write amplification of universal compaction? (Okay, this is hard to estimate... But what if the last *reduce sorted run* trigger did not exist?)
+ * What is the estimated read amplification of universal compaction?
* What are the pros/cons of universal compaction compared with simple leveled/tiered compaction?
- * How much storage space is it required (compared with user data size) to run universal compaction without using up the storage device space?
+ * How much storage space is required (compared with the user data size) to run universal compaction?
* Can we merge two tiers that are not adjacent in the LSM state?
- * What happens if compaction cannot keep up with the SST flushes?
-   * The log-on-log problem.
+ * What happens if compaction speed cannot keep up with the SST flushes?
+ * What might need to be considered if the system schedules multiple compaction tasks in parallel?
+ * SSDs also write their own logs (basically, an SSD is log-structured storage). If the SSD has a write amplification of 2x, what is the end-to-end write amplification of the whole system? Related: [ZNS: Avoiding the Block Interface Tax for Flash-based SSDs](https://www.usenix.org/conference/atc21/presentation/bjorling).

We do not provide reference answers to the questions; feel free to discuss them in the Discord community.

mini-lsm-book/src/week2-04-leveled.md (12 changes: 11 additions & 1 deletion)
@@ -144,11 +144,21 @@ src/lsm_storage.rs

The implementation should be similar to simple leveled compaction. Remember to change both the get/scan read paths and the compaction iterators, as in the sketch below.
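
A rough sketch of the read-path change follows; the types are simplified stand-ins (SST ids instead of real iterators), so this only illustrates the iterator structure, not the actual mini-lsm API. L0 SSTs may overlap, so each one still needs its own iterator in the merge, while every lower level is sorted and non-overlapping, so a single concat iterator per level suffices.

```rust
/// Simplified stand-in state (assumed shape, ids instead of iterators).
struct LsmStorageState {
    l0_sstables: Vec<usize>,          // may overlap: one table iterator per SST
    levels: Vec<(usize, Vec<usize>)>, // sorted runs: one concat iterator per level
}

/// Hypothetical planning helper: how many iterators a full scan would merge.
fn scan_iterator_plan(state: &LsmStorageState) -> (usize, usize) {
    let l0_iters = state.l0_sstables.len();
    // Skip empty levels; each non-empty level contributes one concat iterator.
    let concat_iters = state
        .levels
        .iter()
        .filter(|(_, ssts)| !ssts.is_empty())
        .count();
    (l0_iters, concat_iters)
}

fn main() {
    let state = LsmStorageState {
        l0_sstables: vec![7, 6],
        levels: vec![(1, vec![3, 4]), (2, vec![5]), (3, vec![])],
    };
    let (l0, concat) = scan_iterator_plan(&state);
    // Prints: merge 2 L0 iterators with 2 per-level concat iterators.
    println!("merge {l0} L0 iterators with {concat} per-level concat iterators");
}
```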

+ ## Related Readings

+ [Leveled Compaction - RocksDB Wiki](https://github.com/facebook/rocksdb/wiki/Leveled-Compaction)

## Test Your Understanding

- * Finding a good key split point for compaction may potentially reduce the write amplification, or it does not matter at all?
+ * What is the estimated write amplification of leveled compaction?
+ * What is the estimated read amplification of leveled compaction?
+ * Can finding a good key split point for compaction reduce the write amplification, or does it not matter at all? (Consider the case where the user writes keys beginning with two prefixes, `00` and `01`. The number of keys under these two prefixes is different, and so are their write patterns. If we can always split `00` and `01` into different SSTs...)
* Imagine that a user was using tiered (universal) compaction before and wants to migrate to leveled compaction. What might be the challenges of this migration? And how would you do the migration?
+ * What if the user wants to migrate from leveled compaction to tiered compaction?
+ * What happens if compaction speed cannot keep up with the SST flushes?
+ * What might need to be considered if the system schedules multiple compaction tasks in parallel?
+ * What is the peak storage usage for leveled compaction? How does it compare with universal compaction?
+ * Is it true that with a lower `level_size_multiplier`, you can always get a lower write amplification?
* What needs to be done if a user who is not using compaction at all decides to migrate to leveled compaction?

We do not provide reference answers to the questions; feel free to discuss them in the Discord community.
