About generating <podcast:guid>: how to deal with URL encoding and punycode? #440
Hi all, the section on Is there a definition somewhere of how URLs containing umlauts or percent-encoding (or domain names with uppercase letters) should be handled before the UUID is generated? Or is this officially undefined and theoretically there could just be multiple semantically equivalent GUIDs for a podcast?
Replies: 2 comments 5 replies
Actually it does not really matter as the
I’m not sure why anything in the feed URL would be re-encoded. Why would the punycode or percent-encoding need to be modified before calculating the GUID? The feed URL needs to be in its native form as the input to the UUIDv5 algorithm so that we all get the same output. In a sense it’s just the seed value. Since UUIDv5 uses SHA-1, the input doesn’t need to be in any particular form: it is simply treated as byte data, processed in 512-bit blocks.
From RFC 4122: “Convert the name to a canonical sequence of octets”
Forgive me if I’ve misunderstood the issue.
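To make the "just the seed value" point concrete, here is a minimal Python sketch of what happens when semantically equivalent spellings of a URL are fed to UUIDv5 unchanged. The feed URLs are made up for illustration, the helper name `feed_guid` is hypothetical, and the namespace UUID is what I understand the podcast namespace to use, so treat that value as an assumption and check it against the spec before relying on it.

```python
import uuid

# Assumption: namespace UUID as I understand the podcast namespace defines it.
# Verify against the spec; it is only a parameter in this sketch.
PODCAST_NAMESPACE = uuid.UUID("ead4c236-bf58-58c6-a2c6-a6b28d128cb6")


def feed_guid(feed_url: str, namespace: uuid.UUID = PODCAST_NAMESPACE) -> uuid.UUID:
    """Hash the URL string exactly as given (hypothetical helper).

    uuid.uuid5 encodes a str name as UTF-8 and runs SHA-1 over
    namespace bytes + name bytes. No URL normalization happens here.
    """
    return uuid.uuid5(namespace, feed_url)


# Made-up URLs: different spellings that a URL parser might treat as equivalent.
variants = [
    "example.com/caf%C3%A9/feed.xml",   # percent-encoded path
    "example.com/café/feed.xml",        # same path, not encoded
    "Example.com/caf%C3%A9/feed.xml",   # uppercase letter in the host
    "xn--bcher-kva.example/feed.xml",   # punycode host
    "bücher.example/feed.xml",          # same host in Unicode form
]

guids = {url: feed_guid(url) for url in variants}

for url, guid in guids.items():
    print(f"{guid}  {url}")
```

Each spelling is a different byte sequence, so each one produces a different GUID, while the same string always produces the same GUID. That is why everyone has to hash the same literal form of the feed URL.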