Threshold Cryptography #254

Open
1 task
radumarias opened this issue Dec 8, 2024 · 0 comments
radumarias commented Dec 8, 2024

This is similar to Shamir's secret sharing, but it tries to address a specific vulnerability: while a file is being encrypted, one node holds the fully reconstructed key, and if an attacker obtains the key at that point, they can decrypt all the data.

This approach adds a layer of “secret sharing” that is conceptually similar to classical Shamir’s Secret Sharing, though not the same thing. With this solution, no single entity holds the whole key, so no single entity can decrypt all the data.

Solution

We have a group of users who share the content of the files. Each file is split into chunks, and each user owns and encrypts a chunk. Each user has their own key, so each chunk is encrypted with a different key. On decryption, we ask each user to decrypt their chunk and send the content over mTLS (mutual TLS).


Security and Trust Model

TODO

  • We will use mTLS to authenticate both ends of each connection. We will also need a secure way to issue certificates to each node; this has yet to be discussed.

Pros

  • Distributing encryption keys across multiple nodes can make it significantly harder for an attacker to gain full access to any file. A compromise of one node does not compromise the entire file.
  • This approach adds a layer of “secret sharing” that is conceptually similar to classical Shamir’s Secret Sharing, though not the same thing. No single entity holds the whole key, so no single entity can decrypt all the data.

Cons / Considerations

  • If a node responsible for a certain chunk goes offline permanently, that chunk cannot be decrypted, rendering the file incomplete
  • The design relies heavily on the trust and reliability of each node in the cluster. If a malicious node refuses to decrypt its chunk, the file becomes partially unrecoverable (unless there is redundancy)
  • Access control and revocation: If a user leaves the cluster or is compromised, how are keys rotated or chunks reassigned and re-encrypted?
  • Every read of a chunk that does not belong to the local node requires a decryption request to a remote node. This introduces network latency and additional round trips. For workloads with frequent read operations, this could severely impact performance, since reads issued locally become dependent on remote availability and speed

If we want more security, we can also use Shamir's secret sharing for the local keys. Then, if one node is compromised, the attacker still cannot decrypt the associated chunks, because that node holds only a share of the key. For maximum security, we can require all nodes to be present to reconstruct a local key.
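To make the Shamir option concrete, here is a minimal sketch of splitting and reconstructing a key-sized secret over a prime field. It is an illustration under assumptions, not part of rencfs: the function names `split` and `reconstruct`, the field modulus, and the caller-supplied `coeffs` are all hypothetical. A real implementation would draw the coefficients from a CSPRNG and use a vetted library.

```rust
// Shamir secret sharing over GF(P) with P = 2^61 - 1 (a Mersenne prime).
// Illustration only: `coeffs` must come from a CSPRNG in practice.
const P: u64 = (1u64 << 61) - 1;

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    add(a, P - b % P)
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

// Modular inverse via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (a % P, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Split `secret` into `n` shares (x, f(x)) for x = 1..=n.
/// Any `coeffs.len() + 1` shares reconstruct the secret.
fn split(secret: u64, coeffs: &[u64], n: u64) -> Vec<(u64, u64)> {
    (1..=n)
        .map(|x| {
            // Horner evaluation of f(x) = secret + c1*x + c2*x^2 + ...
            let mut y = 0u64;
            for &c in coeffs.iter().rev() {
                y = mul(add(y, c % P), x);
            }
            (x, add(y, secret % P))
        })
        .collect()
}

/// Lagrange interpolation at x = 0 recovers f(0), the secret.
fn reconstruct(shares: &[(u64, u64)]) -> u64 {
    let mut secret = 0u64;
    for (i, &(xi, yi)) in shares.iter().enumerate() {
        let (mut num, mut den) = (1u64, 1u64);
        for (j, &(xj, _)) in shares.iter().enumerate() {
            if i != j {
                num = mul(num, xj);
                den = mul(den, sub(xj, xi));
            }
        }
        secret = add(secret, mul(yi, mul(num, inv(den))));
    }
    secret
}
```

With two coefficients the threshold is three shares. Requiring the presence of all nodes corresponds to passing `n - 1` coefficients, so that only the full set of `n` shares reconstructs the key.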

Start the app

  • At the start, specify the group ID via a CLI parameter like --decentralized-group-id. This should be a hash over some unique identifier or over a public key
  • Discover other nodes and join the cluster, marking yourself as available and fetching the cluster info, such as the list of nodes and the node metadata describing which chunks each node is associated with
  • We can use algorithms like consistent hashing or shard keys to assign nodes to chunks (see https://github.com/radumarias/rfs/wiki/File-sharding). To protect against a single node dying and making its chunk unavailable, assign more than one node to each chunk
  • Possibly distribute this metadata via a blockchain
  • On the first start, we will, as usual, generate a key
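As a rough illustration of the chunk assignment step, the sketch below builds a consistent-hash ring and picks several distinct nodes per chunk. The `Ring` type, its methods, and the node names are hypothetical. `DefaultHasher` is used only to keep the example free of dependencies; its output is not guaranteed to be stable across Rust versions, so a real cluster would need a hash that every node computes identically (for example SHA-256).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash ring mapping points on the ring to node IDs (hypothetical type).
struct Ring {
    points: BTreeMap<u64, String>,
}

impl Ring {
    /// Place each node on the ring `vnodes` times to smooth the distribution.
    fn new(nodes: &[&str], vnodes: u32) -> Self {
        let mut points = BTreeMap::new();
        for node in nodes {
            for v in 0..vnodes {
                points.insert(hash_of(&(node, v)), node.to_string());
            }
        }
        Ring { points }
    }

    /// Walk clockwise from the chunk's position and collect up to
    /// `replicas` distinct nodes responsible for that chunk.
    fn nodes_for_chunk(&self, chunk_id: u64, replicas: usize) -> Vec<String> {
        let start = hash_of(&chunk_id);
        let mut owners: Vec<String> = Vec::new();
        let clockwise = self.points.range(start..).chain(self.points.range(..start));
        for (_, node) in clockwise {
            if owners.len() >= replicas {
                break;
            }
            if !owners.contains(node) {
                owners.push(node.clone());
            }
        }
        owners
    }
}
```

For example, `Ring::new(&["node-a", "node-b", "node-c"], 16).nodes_for_chunk(7, 2)` returns two distinct nodes, so the chunk stays decryptable if one of them goes offline.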

Encrypt

  • When writing to a file, we split the content into chunks, for example as determined by the sharding algorithm
  • If the chunks correspond to us, then encrypt the data corresponding to the chunks locally and distribute it to all other nodes. We can use BitTorrent over uTP for that
  • If the chunk corresponds to other nodes, send the plaintext data to one of the nodes
  • The receiving node will encrypt the data, keep it locally, too, and send the encrypted content back to the sender node
  • The sender node will also save the encrypted data, ensuring data replication for backup purposes. All nodes will have the same data
  • Send the encrypted data to all nodes corresponding to the chunk
  • File metadata is encrypted locally on all nodes with the local node key because we want to have operations like read_dir, get file info fast, and execute only locally
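The write path above can be sketched as a per-chunk decision: encrypt locally when the chunk is ours, otherwise hand the plaintext to the owning node. This is only a planning sketch under assumptions: `WriteAction`, `plan_write`, fixed-size chunks, and the `owner_of` lookup are hypothetical names, and the actual encryption, mTLS transport, and replication are left out.

```rust
#[derive(Debug, PartialEq)]
enum WriteAction {
    /// We own the chunk: encrypt with the local key, then replicate the ciphertext.
    EncryptLocally { chunk: usize },
    /// Another node owns the chunk: send it the plaintext (over mTLS);
    /// it encrypts, keeps a copy, and returns the ciphertext for us to store.
    SendToOwner { chunk: usize, owner: String },
}

/// Decide what to do with each fixed-size chunk of a write.
/// `owner_of` stands in for the cluster's chunk-to-node lookup.
fn plan_write(
    content_len: usize,
    chunk_size: usize,
    local_node: &str,
    owner_of: impl Fn(usize) -> String,
) -> Vec<WriteAction> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    // Round up so a trailing partial chunk is included.
    let chunk_count = (content_len + chunk_size - 1) / chunk_size;
    (0..chunk_count)
        .map(|chunk| {
            let owner = owner_of(chunk);
            if owner == local_node {
                WriteAction::EncryptLocally { chunk }
            } else {
                WriteAction::SendToOwner { chunk, owner }
            }
        })
        .collect()
}
```

Note that the `SendToOwner` branch is where plaintext leaves the local node, so the confidentiality of that chunk depends on the mTLS channel and on the receiving node.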

Decrypt

  • When reading from a file, if the chunk by offset corresponds to us, we decrypt it locally
  • If not, we ask one of the nodes responsible for that chunk to decrypt it and send the content to us
  • We then save the chunk locally in a cache, encrypting it with our local key
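A minimal sketch of the read path follows: map the offset to a chunk index, then decide where the plaintext comes from. `ChunkReader`, `ReadSource`, and the `owner_of` lookup are hypothetical. Checking the cache before contacting the remote node is an assumption that follows from the caching step above, and the cache here stores opaque bytes standing in for data re-encrypted with the local key.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ReadSource {
    /// The chunk is ours: decrypt it with the local key.
    Local,
    /// A previously fetched chunk, stored re-encrypted with the local key.
    Cache,
    /// Ask this node to decrypt the chunk and send it to us.
    Remote(String),
}

struct ChunkReader {
    local_node: String,
    chunk_size: u64,
    cache: HashMap<u64, Vec<u8>>,
}

impl ChunkReader {
    fn new(local_node: &str, chunk_size: u64) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        ChunkReader {
            local_node: local_node.to_string(),
            chunk_size,
            cache: HashMap::new(),
        }
    }

    /// Index of the chunk that contains the byte at `offset`.
    fn chunk_index(&self, offset: u64) -> u64 {
        offset / self.chunk_size
    }

    /// Decide where a read at `offset` is served from.
    fn source_for(&self, offset: u64, owner_of: impl Fn(u64) -> String) -> ReadSource {
        let chunk = self.chunk_index(offset);
        let owner = owner_of(chunk);
        if owner == self.local_node {
            ReadSource::Local
        } else if self.cache.contains_key(&chunk) {
            ReadSource::Cache
        } else {
            ReadSource::Remote(owner)
        }
    }

    /// Store a chunk received from a remote node (already re-encrypted locally).
    fn cache_chunk(&mut self, chunk: u64, locally_encrypted: Vec<u8>) {
        self.cache.insert(chunk, locally_encrypted);
    }
}
```

A design question this leaves open is cache invalidation: if the owning node rewrites a chunk, cached copies on other nodes become stale.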

Implementation

All of this can be implemented as a backend for the Storage layer (#111). It would be similar to NFS but hybrid: if the chunk is local, it accesses the local filesystem; if not, it accesses the other nodes over the network. Ideally, most of the upper-level code will stay the same.

Data replication

The above solution describes data being replicated across all nodes for backup purposes. If some nodes lose the encrypted data but still have access to the key, we can recover the data from other nodes and participate in the decryption. If the data for that chunk resides only on one node and is lost, then we cannot recover that data, and the file will be corrupted.

For cases where this is not desired, we need a CLI flag like --decentralized-keep-chunk-data-only-locally, in which case we don't distribute the data to all other nodes. Instead, we distribute it only to the related nodes that hold data for the same chunk.

Use

  • Iroh, as it offers communication between nodes. This is mostly what we need, as the data will be kept on our own nodes
  • Veilid, if we need a decentralized filesystem in addition to decentralized data transmission.
    Its Private Routing offers the ability to create a private network just between our nodes https://chatgpt.com/share/67563e26-f9b4-8003-81c3-5ed8ef2623de
  • IPFS, which is similar to Veilid except that it doesn't have encryption at rest. We don't need that, as this is what we do :) IPFS supports the creation of private networks, which restrict access to authorized nodes only; this is useful

https://docs.ipfs.tech/concepts/privacy-and-encryption/#encryption

IPFS uses transport encryption but not content encryption. This means your data is secure when sent from one IPFS node to another. However, anyone can download and view that data with the CID. The lack of content encryption is an intentional decision. Instead of forcing you to use a particular encryption protocol, you can choose the best method for your project. This modular design keeps IPFS lightweight and free of vendor lock-in.

However, to add encryption support, it might help to create a plugin for IPFS.

Solutions using IPFS.

Structure notes

  • Create a dedicated crate for this that uses rencfs (the core) as a lib. You can add abstractions and common, minimal functionality to the core, but keep the core generic

Related

@radumarias radumarias added this to rencfs Dec 8, 2024
@radumarias radumarias moved this to Todo in rencfs Dec 8, 2024
@radumarias radumarias changed the title Distributed encryption Decentralized/distributed encryption Dec 8, 2024
@radumarias radumarias changed the title Decentralized/distributed encryption Decentralized encryption Dec 8, 2024
@radumarias radumarias changed the title Decentralized encryption Decentralized shaeded encryption Dec 8, 2024
@radumarias radumarias changed the title Decentralized shaeded encryption Decentralized sharded encryption Dec 8, 2024
@radumarias radumarias changed the title Decentralized sharded encryption Threshold Cryptography Dec 9, 2024