Coordinator and Data nodes
- Coordinator nodes with many Data nodes. We can use Raft for coordinator nodes
- Coordinator node is served over `http` (`axum`, or gRPC with `tonic`); a minimal `axum` sketch follows the list
- File is split into shards that are kept on data nodes, distributed and replicated (see the sharding sketch after this list)
- First, clients communicate with the coordinator node to set up operations and metadata; then the actual file content is accessed directly from the data nodes
- Data nodes also act as an interface for `DHT` queries, accessing the actual data from `tikv`
- After a shard is uploaded to the data nodes, they use `DHT` and `BitTorrent` between themselves to replicate the shard to multiple nodes; this doesn't require the coordinator node
- Coordinator node communicates with the `tikv` cluster to get/put metadata about the file (see the `tikv` sketch after this list)
- Metadata contains information about the files, where the key is the piece hash and the value is the list of data nodes holding that piece
- `WAL` strategy is used to commit files to all of our replicas and update `tikv` with the data node handle
- Coordinator/Data node communication is done over a channel; we can try out `Kafka` or maybe `gRPC`, to make sure that a data node contains a shard or to distribute a shard over the data nodes when a client uploads a file
- Client is then given a list of data nodes so it can access the shards in parallel and assemble the file (see the parallel-fetch sketch below)
https://docs.rs/raft/latest/raft/
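A minimal single-voter sketch following the `raft` crate's documented `RawNode` pattern (it needs `slog` for logging); persisting the Ready state and sending messages to peers is left out:

```rust
use raft::{prelude::*, storage::MemStorage};
use slog::{o, Discard, Logger};

fn main() {
    // Single-voter configuration; id 1 would be this coordinator node.
    let config = Config { id: 1, ..Default::default() };
    config.validate().unwrap();

    // In-memory log storage with node 1 as the only voter.
    let storage = MemStorage::new_with_conf_state((vec![1], vec![]));
    let logger = Logger::root(Discard, o!());
    let mut node = RawNode::new(&config, storage, &logger).unwrap();

    // Drive the state machine: tick on a timer, then drain the Ready state.
    node.tick();
    if node.has_ready() {
        let ready = node.ready();
        // ...persist entries and send messages to peer coordinators here...
        let _light_rd = node.advance(ready);
    }
}
```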
From https://en.wikipedia.org/wiki/CAP_theorem I think we should target Consistency and Availability. Availability is also affected by network partitions, so in that case we will choose Consistency.