Canonicalise node ID representation #261
Comments
My proposal is to just use a Then any kind of encoding can be used, and we always use We knock out the problem of dereferencing node IDs improperly, and also ensure that all domains use the byte-form canonical version of NodeId and not some encoded form. The exact encoding algorithm we use needs some important features:
If we do choose |
Remember that with this issue done, it does not matter what the encoded node ID lengths are. There is no guarantee that all encoded node IDs are the same length, and this is fine because it no longer matters. At the moment in the source code, there's a need to temporarily ensure that the base encoding we are using produces a fixed length. That's |
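To illustrate why encoded lengths cannot be relied on, here is a minimal sketch of a base58btc encoder applied to two 32-byte values. The function name `base58btcEncode` is hypothetical and this is not the encoder used in the Polykey source; it only shows that the same byte length can yield different string lengths.

```typescript
// Bitcoin base58 alphabet (no 0, O, I or l).
const BASE58BTC_ALPHABET =
  '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Hypothetical helper: encodes bytes as base58btc without a multibase prefix.
function base58btcEncode(bytes: Uint8Array): string {
  // Each leading zero byte is represented by a literal '1'.
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;
  // Base58 digits, least significant first.
  const digits: number[] = [];
  for (let i = zeros; i < bytes.length; i++) {
    let carry = bytes[i];
    // Multiply the running number by 256 and add the new byte.
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  let out = '1'.repeat(zeros);
  for (let k = digits.length - 1; k >= 0; k--) {
    out += BASE58BTC_ALPHABET.charAt(digits[k]);
  }
  return out;
}

// Two 32-byte inputs: all 0xff, and 0x01 followed by 31 zero bytes.
const highId = new Uint8Array(32).fill(0xff);
const lowId = new Uint8Array(32);
lowId[0] = 0x01;

const highLength = base58btcEncode(highId).length; // 44
const lowLength = base58btcEncode(lowId).length; // 43
console.log(highLength, lowLength);
```

Both inputs are 32 bytes, yet the encoded strings differ in length, which is harmless once all internal logic works on the bytes rather than the string.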
Some interesting discussions came up between Brian and me whilst we've been debugging the tests. Previously, the node ID's "canonicalised" representation was its string representation (as a base64/base58 string). Instead, we're now using the actual, internal 32 bytes of the node ID for all calculations in the nodes domain. See https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205#note_712546331 for more details. |
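As a rough sketch of what "calculations on the internal bytes" can look like, the snippet below computes a Kademlia-style XOR distance directly on two byte arrays and compares distances as big-endian unsigned integers. The names `xorDistance` and `compareBytes` are assumptions for illustration, not functions from the Polykey codebase.

```typescript
// Hypothetical helper: XOR distance between two node IDs of equal length.
function xorDistance(a: Uint8Array, b: Uint8Array): Uint8Array {
  if (a.length !== b.length) {
    throw new RangeError('node IDs must have the same byte length');
  }
  const distance = new Uint8Array(a.length);
  for (let i = 0; i < a.length; i++) {
    distance[i] = a[i] ^ b[i];
  }
  return distance;
}

// Hypothetical helper: compares two equal-length byte arrays as big-endian
// unsigned integers. Returns -1, 0 or 1.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Three illustrative 32-byte IDs.
const self = new Uint8Array(32);
const near = new Uint8Array(32);
near[31] = 0x01; // differs from self only in the lowest bit
const far = new Uint8Array(32);
far[0] = 0x80; // differs from self in the highest bit

// near is closer to self than far is, so this comparison yields -1.
const ordering = compareBytes(xorDistance(self, near), xorDistance(self, far));
console.log(ordering);
```

No string decoding is involved at any step, so the choice of human-readable encoding cannot affect the result.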
This work on the node ID could also relate to #168. When we eventually move to Ed25519 keys, then the node ID will be the 32 byte sequence of the Ed25519 public key. |
As talked about in #269, we need to support multibase encoded Node IDs as these will be used by Note that we do not need ordered base encoding for Node IDs, since Node IDs do not have any order. One question I have is whether this blocks testnet deployment? If not, then this can come after we have testnet working, and I don't think it blocks general publishing and distribution of PK CLI #268. If so this can be pushed to the icebox. |
No, I don't think it does block testnet deployment, nor #268, so this can be pushed back. |
For this issue, I think we need to do a bit more speccing out of the requirements for our string-encoded node ID. e.g. what's needed from a CLI point of view? What would we want a user to see as the representation of their node ID? Do we have length constraints on this? etc etc |
Yes, we would need to pick a specific node ID encoding to use by default. There's a robustness principle at play here: specific outputs, diverse inputs. So wherever a human-readable rendering or external printing (such as logging) is needed, base58 can be used, while as input we can take any base and decode it to bytes. What base encoding does WireGuard use for their Ed25519 keys?
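The "diverse inputs" half of that principle could be sketched as follows: a decoder that dispatches on the multibase prefix character and always returns the 32 raw bytes. This is only an illustration under stated assumptions. The function names are hypothetical, only the power-of-two bases (`f` base16, `v` base32hex, `m` base64, `u` base64url, all unpadded) are handled, base58btc (`z`) is left out, and leftover trailing bits are not validated.

```typescript
// Multibase prefix -> alphabet and bits per character (unpadded variants).
const MULTIBASE_TABLE: Record<string, { alphabet: string; bits: number }> = {
  f: { alphabet: '0123456789abcdef', bits: 4 },
  v: { alphabet: '0123456789abcdefghijklmnopqrstuv', bits: 5 },
  m: {
    alphabet:
      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/',
    bits: 6,
  },
  u: {
    alphabet:
      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
    bits: 6,
  },
};

// Hypothetical helper: decodes text in a power-of-two base into bytes.
function decodeBits(text: string, alphabet: string, bits: number): Uint8Array {
  const out: number[] = [];
  let buffer = 0; // holds bits not yet emitted as a full byte
  let pending = 0; // number of valid bits in buffer
  for (let i = 0; i < text.length; i++) {
    const value = alphabet.indexOf(text.charAt(i));
    if (value < 0) throw new SyntaxError(`invalid character at index ${i}`);
    buffer = (buffer << bits) | value;
    pending += bits;
    if (pending >= 8) {
      pending -= 8;
      out.push((buffer >> pending) & 0xff);
      buffer &= (1 << pending) - 1; // keep only the leftover bits
    }
  }
  return Uint8Array.from(out);
}

// Hypothetical helper: accepts any supported multibase string, returns bytes.
function decodeNodeId(encoded: string): Uint8Array {
  const spec = MULTIBASE_TABLE[encoded.charAt(0)];
  if (spec === undefined) {
    throw new SyntaxError('unsupported multibase prefix');
  }
  const bytes = decodeBits(encoded.slice(1), spec.alphabet, spec.bits);
  if (bytes.length !== 32) {
    throw new RangeError(`expected 32 bytes, got ${bytes.length}`);
  }
  return bytes;
}

// The same all-zero ID supplied in two different encodings.
const fromHex = decodeNodeId('f' + '0'.repeat(64));
const fromBase64Url = decodeNodeId('u' + 'A'.repeat(43));
console.log(fromHex.length, fromBase64Url.length);
```

The "specific outputs" half would then be a single encoder used at every printing location, so users only ever see one form.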
|
Seems like Wireguard uses base64 encodings: https://www.wireguard.com/quickstart/#key-generation Interestingly, seems like the standard for SSH keys is also base64: https://datatracker.ietf.org/doc/html/rfc4716#section-3.4 |
Perhaps there's also merit in standardising this with our claims on the sigchain with base64url (removes unsafe URL characters for transfer): https://datatracker.ietf.org/doc/html/rfc7515 |
Note that |
As a refresher, it's worthwhile to remember that we've already done some form of canonicalisation of the node ID, as discussed here: #261 (comment). See the following commit too: 5f382e2 |
It will also potentially be worthwhile to undertake #299 (the review of the |
With #318, we can use the raw typed array/buffer form of NodeId without needing to encode it first, this works by using the Encoded representations of NodeId can be used elsewhere where string form/human readable is needed. And in those cases, we would use base32hex in order to preserve lexicographic order. |
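The order-preserving property of base32hex can be shown with a small sketch. The encoder below is an assumption for illustration (the name `encodeNodeIdBase32Hex` is hypothetical and it is not the multibase library call). It emits the lowercase unpadded base32hex form with the multibase `v` prefix. Because the alphabet `0-9a-v` is already in ASCII order, comparing two encoded strings of equal-length inputs gives the same result as comparing the underlying bytes.

```typescript
// base32hex alphabet: already sorted in ASCII order, unlike plain base32.
const BASE32HEX_ALPHABET = '0123456789abcdefghijklmnopqrstuv';

// Hypothetical helper: unpadded lowercase base32hex with multibase prefix 'v'.
function encodeNodeIdBase32Hex(bytes: Uint8Array): string {
  let out = 'v';
  let buffer = 0; // bits not yet emitted as a character
  let pending = 0; // number of valid bits in buffer
  for (let i = 0; i < bytes.length; i++) {
    buffer = (buffer << 8) | bytes[i];
    pending += 8;
    while (pending >= 5) {
      pending -= 5;
      out += BASE32HEX_ALPHABET.charAt((buffer >> pending) & 31);
    }
    buffer &= (1 << pending) - 1; // keep only the leftover bits
  }
  if (pending > 0) {
    // Left-align the final partial group, padding with zero bits.
    out += BASE32HEX_ALPHABET.charAt((buffer << (5 - pending)) & 31);
  }
  return out;
}

// Two 32-byte IDs where smallId < largeId when compared as bytes.
const smallId = new Uint8Array(32);
smallId[31] = 0x01;
const largeId = new Uint8Array(32);
largeId[0] = 0x01;

const smallText = encodeNodeIdBase32Hex(smallId);
const largeText = encodeNodeIdBase32Hex(largeId);

// 32 bytes always encode to 52 characters plus the 'v' prefix.
console.log(smallText.length, smallText < largeText);
```

For 32-byte input the output is always 53 characters, which matches the fixed length mentioned in the specification.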
Example of encoding the public key fingerprint as |
Specification
We'd like to canonicalise the representation of a keynode's node ID.
Recall that the node ID of a node is based on the public key of a keynode: it is a public key fingerprint. At the moment, we produce a SHA-256 hash of the public key, resulting in an array of 32 bytes. This array of bytes can be currently seen as the primitive canonicalised representation of the node ID.
The representation needs to provide:
Previously, we were using base64 as the encoding choice for a human-readable version of the node ID. This produced fixed-size node IDs of 44 characters, but used URL-unsafe characters (e.g. /, +).
After that, we were then looking at base58btc. This has the advantage of no longer producing URL-unsafe characters, as well as some quality-of-life improvements by not including visually similar characters. However, it can produce node IDs of different sizes.
As a temporary solution, we've adopted a base32hex encoding, producing a fixed 53-character node ID, such as v6n7m9vuf44pfqq133re8v91ju7a8968h4nkirjt8r7r9pbcoqqq0 (with a prepended v character from multibase): https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205#note_711237922
Additional context
NodeId as a proxy/singleton instance #254 - will need to look at this issue concurrently
base58 and base58btc
Tasks
NodeGraph
(kademlia), across GRPC, etc)