Threshold Cryptography #254

radumarias · 2024-12-08T07:32:16Z

This is similar to Shamir's secret sharing, but it tries to close a remaining vulnerability: while a file is being encrypted, one node holds the whole reconstructed key, and if the key falls into an attacker's hands at that point, they can decrypt all the data.

This approach adds a layer of “secret sharing” that is conceptually similar to, though not the same as, classical Shamir’s Secret Sharing. With this solution, no single entity holds the whole key, so no single entity can decrypt the whole data.

Solution

We have a group of users who share the content of the files. Each file is split into chunks, and each user owns and encrypts a chunk. Each user has their own key, so each chunk is encrypted with a different key. On decryption, we ask each user to decrypt their chunk and send the content over mTLS (mutual TLS).

Security and Trust Model

TODO

We will use mTLS to validate both ends. We will also have a secure way to give certificates to each node, which has yet to be discussed.

Pros

Distributing encryption keys across multiple nodes can make it significantly harder for an attacker to gain full access to any file. A compromise of one node does not compromise the entire file.
No single entity holds the whole key, so no single entity can decrypt the whole data.

Cons / Considerations

If a node responsible for a certain chunk goes offline permanently, that chunk cannot be decrypted, rendering the file incomplete.
The design relies heavily on the trust and reliability of each node in the cluster. If a malicious node refuses to decrypt its chunk, the file becomes partially unrecoverable (unless there is redundancy).
Access control and revocation: If a user leaves the cluster or is compromised, how are keys rotated or chunks reassigned and re-encrypted?
Whenever we read from a file, we must ask the remote node to decrypt any chunk that does not belong to the local node. This introduces network latency and additional round trips. For workloads with frequent reads, this could severely impact performance, because reads of a local file then depend on remote availability and speed.

If we want more security, we can use Shamir's secret sharing for the local keys. Then an attacker who compromises one node cannot decrypt the associated chunks, because that node holds only a share of the key. For maximum security, we can require all nodes to be present to reconstruct a local key.
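To make the threshold idea concrete, here is a minimal sketch of Shamir's split and reconstruct steps over a small prime field. It is not part of rencfs. The field size, the function names, and the choice to pass the random coefficients in as a parameter are all assumptions made for illustration. A real implementation would draw the coefficients from a CSPRNG and would more likely use a vetted crate that works on whole key bytes.

```rust
/// Prime modulus of the toy field: 2^61 - 1 (a Mersenne prime).
/// Assumption for this sketch: the secret is a single value below P.
const P: u64 = (1u64 << 61) - 1;

fn add_mod(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}

/// Expects both inputs to be already reduced modulo P.
fn sub_mod(a: u64, b: u64) -> u64 {
    ((a as u128 + P as u128 - b as u128) % P as u128) as u64
}

fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Modular exponentiation by squaring.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base);
        }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

/// Multiplicative inverse via Fermat's little theorem (valid because P is prime).
fn inv_mod(a: u64) -> u64 {
    pow_mod(a, P - 2)
}

/// Splits `secret` into `n` shares `(x, f(x))` for x = 1..=n.
/// `coeffs` are the non-constant polynomial coefficients, so the threshold is
/// `coeffs.len() + 1`. They MUST come from a CSPRNG in real use.
pub fn split(secret: u64, coeffs: &[u64], n: u64) -> Vec<(u64, u64)> {
    (1..=n)
        .map(|x| {
            // Horner evaluation of secret + c1*x + c2*x^2 + ...
            let mut y = 0u64;
            for &c in coeffs.iter().rev() {
                y = add_mod(mul_mod(y, x), c % P);
            }
            (x, add_mod(mul_mod(y, x), secret % P))
        })
        .collect()
}

/// Recovers the secret by Lagrange interpolation at x = 0.
/// Needs at least `threshold` shares with distinct x values.
pub fn reconstruct(shares: &[(u64, u64)]) -> u64 {
    let mut secret = 0u64;
    for (i, &(xi, yi)) in shares.iter().enumerate() {
        let mut num = 1u64;
        let mut den = 1u64;
        for (j, &(xj, _)) in shares.iter().enumerate() {
            if i != j {
                num = mul_mod(num, xj);
                den = mul_mod(den, sub_mod(xj, xi));
            }
        }
        secret = add_mod(secret, mul_mod(yi, mul_mod(num, inv_mod(den))));
    }
    secret
}
```

Requiring all nodes to be present corresponds to passing `n - 1` coefficients, so that the threshold equals `n`. A lower threshold trades some security for tolerance of offline nodes.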

Start the app

At startup, specify the group ID via a CLI parameter like --decentralized-group-id. This should be a hash of some unique identifier or of a public key.
Discover other nodes and join the cluster, marking yourself as available and fetching the cluster info, such as the list of nodes. The node metadata also records which chunks each node is associated with.
We can use algorithms like consistent hashing or shard keys to assign nodes to chunks (see https://github.com/radumarias/rfs/wiki/File-sharding). To protect against a node dying and making its chunk unavailable, assign more than one node to each chunk.
Possibly distribute this metadata via a blockchain.
On the first start, we generate a key as usual.
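The chunk-to-node assignment with redundancy could look roughly like the sketch below. It uses rendezvous (highest-random-weight) hashing, a close relative of consistent hashing, picked here only because it fits in a few lines. The function names, the string node IDs, and the use of FNV-1a are assumptions for illustration, not what the File-sharding wiki page specifies.

```rust
/// FNV-1a, 64-bit. Used here only because it is tiny and yields the same
/// value on every node; a real deployment would likely pick a stronger hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Returns the `replicas` nodes responsible for `chunk_index`.
/// Every node scores each (node, chunk) pair the same way, so all nodes
/// reach the same answer without coordination.
pub fn nodes_for_chunk<'a>(
    nodes: &[&'a str],
    chunk_index: u64,
    replicas: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(u64, &'a str)> = nodes
        .iter()
        .map(|&node| {
            let mut input = node.as_bytes().to_vec();
            input.extend_from_slice(&chunk_index.to_be_bytes());
            (fnv1a(&input), node)
        })
        .collect();
    // Highest score first; ties are broken by node ID so the order is total.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    scored.into_iter().take(replicas).map(|(_, node)| node).collect()
}
```

One property that matters for the "node dies" case: removing a node that is not among a chunk's owners does not change that chunk's owners, so only the chunks of the departed node need reassignment.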

Encrypt

When writing to a file, we split the content into chunks, for example according to the sharding algorithm.
If a chunk corresponds to us, we encrypt its data locally and distribute the ciphertext to all other nodes. We can use BitTorrent over uTP for that.
If a chunk corresponds to other nodes, we send the plaintext data to one of them.
The receiving node encrypts the data, keeps it locally, and sends the encrypted content back to the sender node.
The sender node also saves the encrypted data, which replicates it for backup purposes. All nodes end up with the same data.
The encrypted data is sent to all nodes corresponding to the chunk.
File metadata is encrypted locally on every node with that node's local key, because we want operations like read_dir and get file info to be fast and to execute only locally.
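The per-chunk decision in the write path above can be sketched as a small planning function. `WriteAction`, `plan_write`, and the `Unassigned` case are hypothetical names introduced for this illustration. The actual encryption and network transfer are left out.

```rust
/// What the writing node does with one chunk. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
pub enum WriteAction {
    /// We own the chunk: encrypt with our key, then replicate the ciphertext.
    EncryptLocally { chunk_index: u64 },
    /// Another node owns it: send it the plaintext, then store the
    /// ciphertext it sends back.
    SendToOwner { chunk_index: u64, owner: String },
    /// No node is assigned to the chunk, so the write cannot proceed.
    Unassigned { chunk_index: u64 },
}

/// `owners_per_chunk[i]` lists the nodes assigned to chunk `i`.
pub fn plan_write(local_node: &str, owners_per_chunk: &[Vec<&str>]) -> Vec<WriteAction> {
    owners_per_chunk
        .iter()
        .enumerate()
        .map(|(index, owners)| {
            let chunk_index = index as u64;
            if owners.contains(&local_node) {
                WriteAction::EncryptLocally { chunk_index }
            } else if let Some(owner) = owners.first() {
                // Assumption: take the first listed owner. A real
                // implementation might prefer the closest or least loaded.
                WriteAction::SendToOwner {
                    chunk_index,
                    owner: owner.to_string(),
                }
            } else {
                WriteAction::Unassigned { chunk_index }
            }
        })
        .collect()
}
```

Surfacing `Unassigned` explicitly, rather than skipping the chunk, keeps a gap in the cluster metadata from silently producing an incomplete file.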

Decrypt

When reading from a file, if the chunk at the requested offset corresponds to us, we decrypt it locally.
If not, we ask one of the chunk's nodes to decrypt it and send it to us.
We then save the chunk locally in a cache, encrypted with our local key.
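The read path can be sketched as follows, assuming fixed-size chunks so that the chunk index is `offset / chunk_size`, and assuming that a chunk fetched once is served from the local cache afterwards. Both are assumptions, since the issue specifies neither the chunking scheme nor any cache invalidation. `ReadPath` and `ReadSource` are hypothetical names.

```rust
use std::collections::HashSet;

/// Where the bytes for a read come from. Hypothetical type for this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ReadSource {
    /// We own the chunk and decrypt it with our own key.
    Local,
    /// A previously fetched chunk, stored re-encrypted with our local key.
    Cache,
    /// Ask one of the chunk's nodes to decrypt and send it to us.
    Remote,
}

pub struct ReadPath {
    chunk_size: u64,
    owned: HashSet<u64>,
    cached: HashSet<u64>,
}

impl ReadPath {
    pub fn new(chunk_size: u64, owned: HashSet<u64>) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        ReadPath { chunk_size, owned, cached: HashSet::new() }
    }

    /// Maps a byte offset to its chunk and decides how to obtain it.
    /// A remote fetch is recorded so that the next read hits the cache.
    pub fn resolve(&mut self, offset: u64) -> (u64, ReadSource) {
        let index = offset / self.chunk_size;
        if self.owned.contains(&index) {
            (index, ReadSource::Local)
        } else if self.cached.contains(&index) {
            (index, ReadSource::Cache)
        } else {
            self.cached.insert(index);
            (index, ReadSource::Remote)
        }
    }
}
```

Note that the cache weakens the "no single node can decrypt everything" property over time, since a node that reads the whole file ends up holding every chunk under its own key. Cache size and lifetime are therefore part of the security model.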

Implementation

It can all be implemented as an implementation of the Storage layer (#111), similar to NFS but hybrid: if the chunk is local, it accesses the local filesystem; if not, it accesses the other nodes over the network. Ideally, most of the upper-layer code will stay the same.
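As a rough illustration of the hybrid idea, the sketch below dispatches reads to a local or a remote backend behind one trait. The `ChunkStore` trait and everything else here is hypothetical. The real interface of the Storage layer from #111 is not described in this issue and may look quite different.

```rust
use std::collections::HashSet;
use std::io;

/// Hypothetical stand-in for the Storage layer interface.
pub trait ChunkStore {
    fn read_chunk(&self, index: u64) -> io::Result<Vec<u8>>;
}

/// Routes each chunk to the local filesystem or to the network,
/// so callers above this layer see a single store.
pub struct HybridStore<L: ChunkStore, R: ChunkStore> {
    pub local: L,
    pub remote: R,
    /// Indices of the chunks this node owns.
    pub local_chunks: HashSet<u64>,
}

impl<L: ChunkStore, R: ChunkStore> ChunkStore for HybridStore<L, R> {
    fn read_chunk(&self, index: u64) -> io::Result<Vec<u8>> {
        if self.local_chunks.contains(&index) {
            self.local.read_chunk(index)
        } else {
            self.remote.read_chunk(index)
        }
    }
}

/// Test double that returns the same bytes for every chunk. It stands in
/// for a real filesystem or network backend in this sketch.
pub struct FixedStore(pub &'static [u8]);

impl ChunkStore for FixedStore {
    fn read_chunk(&self, _index: u64) -> io::Result<Vec<u8>> {
        Ok(self.0.to_vec())
    }
}
```

Because `HybridStore` itself implements the trait, upper layers that already work against a store abstraction would not need to know whether a chunk came from disk or from another node.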

Data replication

The above solution describes data being replicated across all nodes for backup purposes. If some nodes lose the encrypted data but still have access to the key, they can recover the data from other nodes and still participate in the decryption. If the data for a chunk resides on only one node and is lost, then we cannot recover that data, and the file will be corrupted.

For cases where this is not desired, we need a CLI flag like --decentralized-keep-chunk-data-only-locally, in which case we don't distribute the data to all other nodes. We distribute it only to the related nodes that hold data for the same chunk.
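One possible reading of the two replication modes, sketched as a function that picks the nodes to send a chunk's ciphertext to. The interpretation that the flag narrows replication to the chunk's own nodes is an assumption, as are the function and parameter names.

```rust
/// Returns the nodes that should receive the ciphertext of a chunk.
/// `keep_only_locally` mirrors the proposed
/// --decentralized-keep-chunk-data-only-locally flag (interpretation assumed).
pub fn replication_targets<'a>(
    all_nodes: &[&'a str],
    chunk_owners: &[&'a str],
    local_node: &str,
    keep_only_locally: bool,
) -> Vec<&'a str> {
    // With the flag set, only nodes assigned to this chunk get a copy;
    // otherwise every node in the cluster does.
    let candidates = if keep_only_locally { chunk_owners } else { all_nodes };
    candidates
        .iter()
        .copied()
        // The local node already has the data, so it is never a target.
        .filter(|node| *node != local_node)
        .collect()
}
```

With the flag set and a single owner per chunk, the target list is empty, which is exactly the case where losing that node loses the chunk.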

Use

Iroh, as it offers communication between nodes. This is mostly what we need, as the data will be kept on our nodes.
Veilid, if we need a decentralized filesystem in addition to decentralized data transmission.
Private Routing offers the ability to create a private network just between our nodes https://chatgpt.com/share/67563e26-f9b4-8003-81c3-5ed8ef2623de
IPFS, similar to Veilid, except that it doesn't have encryption at rest; we don't need that, as this is what we do :) IPFS supports the creation of private networks, which restrict access to authorized nodes only; this is useful.

https://docs.ipfs.tech/concepts/privacy-and-encryption/#encryption

IPFS uses transport encryption but not content encryption. This means your data is secure when sent from one IPFS node to another. However, anyone can download and view that data with the CID. The lack of content encryption is an intentional decision. Instead of forcing you to use a particular encryption protocol, you can choose the best method for your project. This modular design keeps IPFS lightweight and free of vendor lock-in.

However, to add encryption support, it might help to create a plugin for IPFS.

Solutions using IPFS.

Structure notes

Create a dedicated crate for this that uses rencfs (the core) as a lib. You can create abstractions and common, minimal generic functionality in the core, but keep the core generic.
