Extending the URI

Glyn Matthews edited this page Jun 17, 2013 · 4 revisions

Why extend network::uri? On its own, this class doesn't actually do a lot. When developers need a URI, they are usually working with a particular scheme, which defines the protocol used to access a specific resource. Here are several common use cases that network::uri currently ignores:

  1. Getting the path and converting it to a filesystem path:
http://www.example.org/path/to/file.html --> boost::filesystem::path("/path/to/file.html")
  2. Copying query arguments to a std container:
http://www.example.org/?a=b;foo=bar;x=5;pi=3.141 --> std::map["a"] == "b" etc.
  3. Every scheme defines its own specific form. The mailto scheme, for example, defines (RFC 2368):
mailtoURL  =  "mailto:" [ to ] [ headers ]
to         =  #mailbox
headers    =  "?" header *( "&" header )
header     =  hname "=" hvalue
hname      =  *urlc
hvalue     =  *urlc

XMPP addresses (JIDs) are of the form (RFC 3920):

jid             = [ node "@" ] domain [ "/" resource ]
domain          = fqdn / address-literal
fqdn            = (sub-domain 1*("." sub-domain))
sub-domain      = (internationalized domain label)
address-literal = IPv4address / IPv6address

We should provide accessors and builders that can handle these scheme-specific forms.
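As a rough illustration of the second use case in the list above, copying query arguments into a std::map could look something like the sketch below. The function name `query_to_map` is hypothetical and is not part of network::uri; it takes the raw query string (the part after the `?`), treats both `;` and `&` as separators, and does no percent-decoding.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper (not part of network::uri): splits a raw query string
// into key/value pairs. Both ';' and '&' are accepted as separators.
// Percent-decoding is deliberately left out of this sketch.
std::map<std::string, std::string> query_to_map(const std::string& query) {
    std::map<std::string, std::string> result;
    std::string::size_type pos = 0;
    while (pos <= query.size()) {
        std::string::size_type end = query.find_first_of(";&", pos);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(pos, end - pos);
        if (!pair.empty()) {
            const std::string::size_type eq = pair.find('=');
            if (eq == std::string::npos) {
                result[pair] = "";  // a key without a value
            } else {
                result[pair.substr(0, eq)] = pair.substr(eq + 1);
            }
        }
        pos = end + 1;  // continue after the separator
    }
    return result;
}

// Usage, with the query string from the example URI above.
inline void query_to_map_example() {
    std::map<std::string, std::string> args =
        query_to_map("a=b;foo=bar;x=5;pi=3.141");
    assert(args["a"] == "b");
    assert(args["foo"] == "bar");
}
```

With the real class, the argument would come from whatever accessor exposes the query component; that part is not shown here.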

I can think of different ways of handling these extra requirements.

1. Sub-classing network::uri

This would require giving network::uri a virtual destructor, which comes at an additional cost. Sub-classing would also break the invariants of network::uri. I am against this.

2. Provide free functions in scheme-specific namespaces

network::uri itself doesn't change; whenever new functionality is needed, a new free function is added. This is unsatisfactory, in my opinion.
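To make this option concrete, here is a minimal sketch of what such free functions might look like. Everything in it is an assumption made for illustration: `sketch::uri` is a simplified stand-in for network::uri (the real class parses its own components), and `mailto::to` and `mailto::headers` are hypothetical names.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Simplified stand-in for network::uri, holding already-parsed parts.
// This is NOT the real class, only enough structure for the example.
struct uri {
    std::string scheme;
    std::string path;
    std::string query;
};

namespace mailto {

// Hypothetical free function: the "to" part of a mailto URI is its path.
// Returns an empty string if the URI is not a mailto URI.
inline std::string to(const uri& u) {
    return u.scheme == "mailto" ? u.path : std::string();
}

// Hypothetical free function: the raw "headers" part is the query.
inline std::string headers(const uri& u) {
    return u.scheme == "mailto" ? u.query : std::string();
}

}  // namespace mailto
}  // namespace sketch

// Usage: every function has to re-check the scheme on each call.
inline void free_function_example() {
    sketch::uri u{"mailto", "john.doe@example.org", "subject=Hello"};
    assert(sketch::mailto::to(u) == "john.doe@example.org");
    assert(sketch::mailto::headers(u) == "subject=Hello");
}
```

Note that nothing stops a caller from passing an http URI to `mailto::to`; each function has to check the scheme again for itself.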

3. Provide scheme-specific classes that conform to a URI concept

There can be specific classes for each scheme that we want to support, e.g.:

network::mailto::address
network::http::url
network::https::url
network::file::path
network::xmpp::jid
network::git::repo

They each contain a uri object, can be constructed from either a network::uri object or a string, and provide further parsing if necessary. Each class and sub-namespace could contain additional scheme-specific functions. Furthermore, it could be easier to parse relative references, if this is necessary. I favour this approach, even though it involves a certain amount of duplication, because most use cases will use scheme-specific URIs.
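A minimal sketch of how one such class might look, using mailto as the example. All names here are assumptions for illustration: `sketch::uri` is a toy stand-in for network::uri that only splits scheme, path and query, and the members of `mailto::address` (`to`, `headers`, `get_uri`) are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {

// Toy stand-in for network::uri: splits "scheme:path?query" and nothing more.
class uri {
public:
    explicit uri(const std::string& text) {
        const std::string::size_type colon = text.find(':');
        if (colon == std::string::npos)
            throw std::invalid_argument("missing scheme");
        scheme_ = text.substr(0, colon);
        const std::string::size_type qmark = text.find('?', colon + 1);
        if (qmark == std::string::npos) {
            path_ = text.substr(colon + 1);
        } else {
            path_ = text.substr(colon + 1, qmark - colon - 1);
            query_ = text.substr(qmark + 1);
        }
    }
    const std::string& scheme() const { return scheme_; }
    const std::string& path() const { return path_; }
    const std::string& query() const { return query_; }

private:
    std::string scheme_, path_, query_;
};

namespace mailto {

// Hypothetical scheme-specific class: contains a uri rather than deriving
// from it, and checks the scheme once, at construction.
class address {
public:
    explicit address(const sketch::uri& u) : uri_(u) {
        if (uri_.scheme() != "mailto")
            throw std::invalid_argument("not a mailto URI");
    }
    explicit address(const std::string& text) : address(sketch::uri(text)) {}

    const std::string& to() const { return uri_.path(); }        // [ to ]
    const std::string& headers() const { return uri_.query(); }  // raw headers
    const sketch::uri& get_uri() const { return uri_; }          // generic view

private:
    sketch::uri uri_;
};

}  // namespace mailto
}  // namespace sketch

// Usage: construct from a string, then use scheme-specific accessors.
inline void address_example() {
    sketch::mailto::address a("mailto:john.doe@example.org?subject=Hello");
    assert(a.to() == "john.doe@example.org");
    assert(a.headers() == "subject=Hello");
}
```

Because the scheme is validated in the constructor, the accessors can rely on it afterwards, which is one way the invariants of the contained uri stay intact.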