Distributed-Hash-Table DHT #4

Open
radumarias opened this issue Jul 27, 2024 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@radumarias
Owner

radumarias commented Jul 27, 2024

https://github.com/radumarias/rfs/wiki/Distributed-Hash-Table-DHT

#39
#3

@radumarias radumarias added this to rfs Jul 27, 2024
@radumarias radumarias converted this from a draft issue Jul 27, 2024
@radumarias radumarias added the enhancement New feature or request label Jul 27, 2024
@radumarias radumarias changed the title Implement BitTorrent tracker Building the bittorrent tracker and deciding on the encoding we want to use and how to communicate with clients Jul 27, 2024
@bnchi bnchi changed the title Building the bittorrent tracker and deciding on the encoding we want to use and how to communicate with clients Deciding on the protocol used to exchange between the nodes Jul 31, 2024
@bnchi bnchi changed the title Deciding on the protocol used to exchange between the nodes Deciding on the protocol used to disribute files between the nodes Jul 31, 2024
@radumarias
Owner Author

I propose that files are immutable in the system; that is, once a file is replicated and in the system, it can't be modified.

That could be problematic: imagine a database that makes many small changes in a short time.

I think it would not be so hard to locate the replicas and update them on change.
Also, we will apply changes through a WAL, so multiple writes can happen in parallel and still be applied in order, similar to how DBs handle transactions. I'm doing this in the rencfs project.
This way we could also allow parallel writes to multiple replicas, as all of them will eventually converge to the same state.
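To make the WAL idea concrete, here is a minimal sketch of a replica that accepts log entries in any arrival order and applies them strictly by sequence number. This is not the rencfs implementation; `WalEntry`, `Replica`, and the assumption that some single component assigns the `seq` numbers are all hypothetical, for illustration only.

```rust
use std::collections::BTreeMap;

/// One logged write: overwrite `data` at `offset` in the file.
/// Hypothetical type; `seq` is assumed to be assigned by one log writer.
#[derive(Debug, Clone)]
pub struct WalEntry {
    pub seq: u64,
    pub offset: usize,
    pub data: Vec<u8>,
}

/// A replica that buffers out-of-order entries and applies them in `seq` order.
pub struct Replica {
    pub content: Vec<u8>,
    next_seq: u64,
    pending: BTreeMap<u64, WalEntry>,
}

impl Replica {
    pub fn new() -> Self {
        Replica {
            content: Vec::new(),
            next_seq: 0,
            pending: BTreeMap::new(),
        }
    }

    /// Accept an entry in any arrival order, then apply every entry that
    /// has become contiguous with what was already applied.
    pub fn receive(&mut self, entry: WalEntry) {
        if entry.seq < self.next_seq {
            return; // already applied; ignore the duplicate delivery
        }
        self.pending.insert(entry.seq, entry);
        while let Some(e) = self.pending.remove(&self.next_seq) {
            self.apply(&e);
            self.next_seq += 1;
        }
    }

    /// Overwrite the target byte range, growing the file with zeros if needed.
    fn apply(&mut self, e: &WalEntry) {
        let end = e.offset + e.data.len();
        if self.content.len() < end {
            self.content.resize(end, 0);
        }
        self.content[e.offset..end].copy_from_slice(&e.data);
    }
}
```

Two replicas that receive the same entries in different orders end up with identical content, because neither applies entry `n` before entry `n - 1`. The sketch leaves out durability, and how `seq` numbers get agreed on across nodes is a separate question.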

@radumarias
Owner Author

If the user defines how many nodes there are and the topology of the system, that is, we have x nodes in the cluster, do we really need to implement a peer exchange protocol?

As I understand it, PEX sends the list of peers for the file and also, for each peer, which shards it has and which chunks from those shards (like a seeder).

I assume the network topology could also be synced with Raft; we would use PEX only for the shard and chunk metadata above.

Or am I getting it wrong about what PEX does?

@radumarias
Owner Author

I would still try PEX as well, as it seems more reliable and more scalable.
Storing the initial metadata in a DHT seems better than a torrent file, as it's more scalable and fault tolerant.

The file is divided into chunks, ideally not more than 256 KB per chunk.

The actual shards (splits of the original file) I would imagine at 512 MB or 64 MB. From the torrent POV each of these shards is then seen as a file, where yes, a chunk should be on the order of hundreds of KB.
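A quick sketch of the arithmetic behind that two-level split, assuming 64 MB shards (one of the two sizes mentioned) and 256 KB chunks, both in binary units. The names `SHARD_SIZE`, `CHUNK_SIZE`, `Layout`, `layout`, and `locate` are made up for illustration and are not taken from the rfs code.

```rust
/// Assumed shard size: 64 MB (binary), one of the two sizes discussed.
pub const SHARD_SIZE: u64 = 64 * 1024 * 1024;
/// Assumed chunk size: 256 KB (binary), the upper bound discussed.
pub const CHUNK_SIZE: u64 = 256 * 1024;

/// How a file of a given length breaks down into shards and chunks.
#[derive(Debug, PartialEq)]
pub struct Layout {
    pub shards: u64,
    pub chunks_per_full_shard: u64,
    pub total_chunks: u64,
}

/// Integer division rounding up.
fn div_ceil(a: u64, b: u64) -> u64 {
    (a + b - 1) / b
}

/// Compute the shard and chunk counts for a file of `file_len` bytes.
/// Chunks never span shards, so the last shard is counted separately.
pub fn layout(file_len: u64) -> Layout {
    let full_shards = file_len / SHARD_SIZE;
    let tail = file_len % SHARD_SIZE; // bytes in the last, partial shard
    let chunks_per_full_shard = SHARD_SIZE / CHUNK_SIZE;
    Layout {
        shards: full_shards + u64::from(tail > 0),
        chunks_per_full_shard,
        total_chunks: full_shards * chunks_per_full_shard + div_ceil(tail, CHUNK_SIZE),
    }
}

/// Map a byte offset in the file to (shard index, chunk index within that shard).
pub fn locate(offset: u64) -> (u64, u64) {
    (offset / SHARD_SIZE, (offset % SHARD_SIZE) / CHUNK_SIZE)
}
```

With these sizes a full shard holds 256 chunks, so a 150 MB file becomes 3 shards (64 + 64 + 22 MB) and 600 chunks, and byte offset 70 MB falls in shard 1, chunk 24.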

@radumarias
Owner Author

I agree we don't need PEX, but rather a DHT. I think it's more scalable to use a DHT instead of a tracker, as it eliminates the single point of failure.

Will we need to create some service that acts like a DHT and reads metadata from TiKV?

@radumarias
Owner Author

radumarias commented Jul 31, 2024

This leaves us with a tough problem: replication. How are we going to replicate a piece across the nodes?

Well, we can implement it similarly to consistent hashing. Actually, the initial implementation I did already supports distributing file replicas: https://github.com/radumarias/rfs/blob/feat/Building-the-sharding-algorithm-to-know-where-each-chunk-go_1/shard-distribution/src/consistent_hashing.rs#L65. We could use something like that to distribute replicas and to redistribute the ones from dead nodes.
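For reference, replica placement on a consistent-hash ring could look roughly like the sketch below. It is not the code from the linked `consistent_hashing.rs`; the `Ring` type, its method names, the virtual-node count, and the use of the standard library's `DefaultHasher` are assumptions made only to illustrate the idea.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any value to a position on the ring.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Hypothetical ring: each node owns several virtual positions.
pub struct Ring {
    vnodes: BTreeMap<u64, String>,
    vnodes_per_node: u32,
}

impl Ring {
    pub fn new(vnodes_per_node: u32) -> Self {
        Ring {
            vnodes: BTreeMap::new(),
            vnodes_per_node,
        }
    }

    /// Place `vnodes_per_node` virtual positions for this node on the ring.
    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes_per_node {
            self.vnodes.insert(hash_of(&(node, i)), node.to_string());
        }
    }

    /// Drop every virtual position of a dead node.
    pub fn remove_node(&mut self, node: &str) {
        self.vnodes.retain(|_, n| n.as_str() != node);
    }

    /// Walk clockwise from the key's position and collect the first
    /// `count` distinct nodes; these hold the replicas of `key`.
    pub fn replicas_for(&self, key: &str, count: usize) -> Vec<String> {
        let start = hash_of(&key);
        let mut out: Vec<String> = Vec::new();
        let clockwise = self.vnodes.range(start..).chain(self.vnodes.range(..start));
        for (_, node) in clockwise {
            if out.len() == count {
                break;
            }
            if !out.contains(node) {
                out.push(node.clone());
            }
        }
        out
    }
}
```

When a node dies, the surviving replicas of a key stay where they are and only the lost replica moves to the next distinct node clockwise, which is what keeps redistribution cheap. Diffing `replicas_for` before and after `remove_node` gives the list of shards that need to be copied.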

@bnchi bnchi removed their assignment Sep 1, 2024
@radumarias radumarias changed the title Deciding on the protocol used to disribute files between the nodes Distributed-Hash-Table DHT Sep 6, 2024
@radumarias radumarias changed the title Distributed-Hash-Table DHT Distributed-Hash-Table, DHT Sep 8, 2024
@radumarias radumarias changed the title Distributed-Hash-Table, DHT Distributed-Hash-Table DHT Sep 8, 2024
@Eyob94 Eyob94 self-assigned this Sep 8, 2024
@radumarias radumarias moved this from Todo to In Progress in rfs Dec 2, 2024
Projects
Status: In Progress
Development

No branches or pull requests

3 participants