Concepts

Radu Marias edited this page Aug 8, 2024 · 8 revisions

See Chapter 9 of Volume 2 of the book System Design Interview.

We need to organize these by topic and write a summary for each of them, like:

Raft

https://www.cncf.io/blog/2019/11/04/building-a-large-scale-distributed-storage-system-based-on-raft/

https://github.com/tikv/raft-rs

File sharding

Each file is split into chunks (shards). We distribute those chunks across multiple nodes and also replicate them.

To decide which node a chunk initially goes to, I was thinking of using something like chunk_index % num_nodes. The problem with this is that num_nodes changes whenever we add or remove nodes, so most chunks would map to a different node and we would need to rebalance the cluster by moving shards around.
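To get a feel for how much data the modulo scheme would move, here is a minimal sketch. The function names and the cluster sizes (4 nodes growing to 5) are made up for illustration and are not part of any existing code.

```rust
/// Naive placement: chunk `chunk_index` lives on node `chunk_index % num_nodes`.
fn node_for_chunk(chunk_index: u64, num_nodes: u64) -> u64 {
    chunk_index % num_nodes
}

/// Counts how many of the first `num_chunks` chunks map to a different node
/// when the cluster size changes from `old_nodes` to `new_nodes`.
fn chunks_moved(num_chunks: u64, old_nodes: u64, new_nodes: u64) -> u64 {
    (0..num_chunks)
        .filter(|&i| node_for_chunk(i, old_nodes) != node_for_chunk(i, new_nodes))
        .count() as u64
}

fn main() {
    // Hypothetical scenario: 1000 chunks, cluster grows from 4 to 5 nodes.
    let moved = chunks_moved(1000, 4, 5);
    println!("{} of 1000 chunks change node", moved);
}
```

With these numbers, 800 of the 1000 chunks end up on a different node, which is the rebalancing cost we want to avoid.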

To solve this we could use a weighted random distribution, like the one described here: https://dev.to/jacktt/understanding-the-weighted-random-algorithm-581p We would derive each node's weight from its space usage (emptier nodes get a larger weight), build the cumulative weight intervals, and select a random interval (node) to put the shard on. This handles adding or removing nodes gracefully without moving existing shards around: the cluster rebalances over time because new shards land more often on emptier nodes.

The rand crate also has something for this: https://docs.rs/rand/latest/rand/distributions/struct.WeightedIndex.html#example
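The interval selection can be sketched with the standard library only. This is an illustration under assumptions, not a final design: it assumes each node's weight is its free space in GiB, the function name `pick_node` is hypothetical, and the caller supplies the random `roll` (in real code it would come from a random number generator, or the whole function could be replaced by rand's WeightedIndex).

```rust
/// Picks a node index by walking the cumulative weight intervals.
/// `weights[i]` is the weight of node i (assumed here: free space in GiB).
/// `roll` is expected to be uniformly random in `0..sum(weights)`.
fn pick_node(weights: &[u64], roll: u64) -> Option<usize> {
    let mut upper = 0u64;
    for (index, &weight) in weights.iter().enumerate() {
        upper += weight; // node `index` owns the interval [upper - weight, upper)
        if roll < upper {
            return Some(index);
        }
    }
    None // roll is out of range, or there are no nodes with weight
}

fn main() {
    // Hypothetical cluster: node 2 was just added and is the emptiest.
    let free_space: [u64; 3] = [100, 300, 600];
    let total: u64 = free_space.iter().sum();

    // Step through every possible roll instead of sampling randomly,
    // so the result is deterministic.
    let mut counts = [0u32; 3];
    for roll in 0..total {
        if let Some(node) = pick_node(&free_space, roll) {
            counts[node] += 1;
        }
    }
    println!("{:?}", counts);
}
```

Each node is chosen in proportion to its weight, so the newly added node receives most new shards while existing shards stay where they are.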

Alternatives

Consistent hashing

Metadata DB

Distributed DB

File sync

Decentralized BitTorrent: BitTorrent filesystem

https://www.bittorrent.com/token/btt
BTFS is a decentralized file storage system supported by millions of BitTorrent user nodes. It runs on a blockchain that uses a Delegated Proof of Stake method of processing transactions.

https://en.m.wikipedia.org/wiki/Bencode
Bencode (pronounced like Bee-encode) is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.
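To make the format concrete, here is a small encoder sketch for the four bencode value types (integers, byte strings, lists, dictionaries). The type and function names, and the dictionary fields in `main`, are invented for this example and do not come from BitTorrent or from our code.

```rust
use std::collections::BTreeMap;

/// The four bencode value types.
enum Bencode {
    Int(i64),
    Bytes(Vec<u8>),
    List(Vec<Bencode>),
    /// BTreeMap iterates keys in byte order, which is the order bencode requires.
    Dict(BTreeMap<Vec<u8>, Bencode>),
}

/// Appends a length-prefixed byte string, e.g. `4:spam`.
fn encode_bytes(bytes: &[u8], out: &mut Vec<u8>) {
    out.extend_from_slice(format!("{}:", bytes.len()).as_bytes());
    out.extend_from_slice(bytes);
}

fn encode(value: &Bencode, out: &mut Vec<u8>) {
    match value {
        // Integers: `i<decimal>e`
        Bencode::Int(n) => out.extend_from_slice(format!("i{}e", n).as_bytes()),
        Bencode::Bytes(bytes) => encode_bytes(bytes, out),
        // Lists: `l<items>e`
        Bencode::List(items) => {
            out.push(b'l');
            for item in items {
                encode(item, out);
            }
            out.push(b'e');
        }
        // Dictionaries: `d<key><value>...e` with keys as byte strings
        Bencode::Dict(entries) => {
            out.push(b'd');
            for (key, item) in entries {
                encode_bytes(key, out);
                encode(item, out);
            }
            out.push(b'e');
        }
    }
}

fn to_bytes(value: &Bencode) -> Vec<u8> {
    let mut out = Vec::new();
    encode(value, &mut out);
    out
}

fn main() {
    // Hypothetical file record, only to show the shape of the encoding.
    let mut record = BTreeMap::new();
    record.insert(b"name".to_vec(), Bencode::Bytes(b"a.txt".to_vec()));
    record.insert(b"size".to_vec(), Bencode::Int(42));
    record.insert(
        b"chunks".to_vec(),
        Bencode::List(vec![
            Bencode::Bytes(b"c0".to_vec()),
            Bencode::Bytes(b"c1".to_vec()),
        ]),
    );
    let encoded = to_bytes(&Bencode::Dict(record));
    println!("{}", String::from_utf8_lossy(&encoded));
    // prints: d6:chunksl2:c02:c1e4:name5:a.txt4:sizei42ee
}
```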

Raft filesystem

https://github.com/deepmehtait/Distributed-file-system-server-with-RAFT