Enhancement request: Adding some error information for DNS requests like RQ #19
Comments
This would be equally useful for FQ/FR messages of course.
A possible way (which would be consistent I think) would be to add a second message type, so that "Message" (the only supported type for now) still reflects actual queries and responses, while the second one "Error?" could include information about the failed query and the error information (unreachable destination, timeouts, etc).
This is not correct. dnstap is an instrumentation format for representing events that occur inside DNS software. It is not a "packet capture on steroids" format. E.g., dnstap is oblivious to the packetized representation (TCP segmentation, IP fragmentation, TLS encryption) of a wire-format DNS message. Similarly, packet capture representations of DNS server traffic cannot capture metadata that dnstap can export (e.g., the
A timeout is intrinsically an event that occurs inside DNS software, so it would be plausible to design new protobuf message type(s) for the dnstap protobuf schema and instrument DNS servers to support the new message type(s). Unbound's timeout algorithm is described here: https://www.nlnetlabs.nl/documentation/unbound/info-timeout/.
I think there are a lot of possible "error" events that can occur in a recursive DNS server beyond just network timeouts, for instance RFC 8914 (Extended DNS Errors) specifies an in-band way of encoding several dozen different error codes in response to a client that supports the EDE option. These will get logged into dnstap incidentally by a recursive DNS server when responding to clients that set the EDE option. But maybe it makes sense to design a dnstap schema for encapsulating an out-of-band EDE-like payload so that the server operator can log the occurrences of these kinds of errors.
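To make the out-of-band idea a bit more concrete, here is a rough sketch of what such an event could carry. Everything in it is hypothetical: `ResolverErrorEvent`, its field names, and `describe` are made up for illustration and are not part of the dnstap schema. Only the numeric info codes are taken from RFC 8914.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class InfoCode(IntEnum):
    """Small subset of the RFC 8914 Extended DNS Error info codes."""
    OTHER = 0
    NO_REACHABLE_AUTHORITY = 22
    NETWORK_ERROR = 23


@dataclass
class ResolverErrorEvent:
    """Hypothetical out-of-band error event (NOT an existing dnstap type)."""
    info_code: InfoCode
    query_name: str
    query_time_sec: int            # when the failed query was sent
    query_time_nsec: int
    response_address: str          # server that was expected to answer
    extra_text: Optional[str] = None


def describe(event: ResolverErrorEvent) -> str:
    """Render the event as a single log line."""
    line = (f"{event.info_code.name}({int(event.info_code)}) "
            f"{event.query_name} -> {event.response_address}")
    if event.extra_text:
        line += f": {event.extra_text}"
    return line


event = ResolverErrorEvent(
    info_code=InfoCode.NETWORK_ERROR,
    query_name="example.com.",
    query_time_sec=1700000000,
    query_time_nsec=0,
    response_address="192.0.2.53",
    extra_text="timeout",
)
print(describe(event))
```

In a real design this would presumably be a new protobuf message alongside the existing "Message" type, so that existing consumers keep seeing only actual queries and responses.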
dnstap is really comprehensive as a DNS server monitoring solution.
Thanks to dnstap it is really simple to obtain, for instance, response time data for queries and responses. Because dnstap includes the query timestamp in response messages, the response time can be computed without keeping track of individual queries and responses; the context information stored by the DNS server is used instead.
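As a minimal sketch of that calculation: the field names below mirror the timestamp fields of the dnstap "Message" type (`query_time_sec`, `query_time_nsec`, `response_time_sec`, `response_time_nsec`), but the `ResponseMessage` class is only a stand-in for a decoded message, and `response_latency_ms` is a name chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResponseMessage:
    """Stand-in for a decoded dnstap response message (timestamps only)."""
    query_time_sec: int
    query_time_nsec: int
    response_time_sec: int
    response_time_nsec: int


def response_latency_ms(msg: ResponseMessage) -> float:
    """Response time in milliseconds, taken from a single response message."""
    sec = msg.response_time_sec - msg.query_time_sec
    # The nanosecond difference may be negative; the sum is still correct.
    nsec = msg.response_time_nsec - msg.query_time_nsec
    return sec * 1000.0 + nsec / 1_000_000.0


msg = ResponseMessage(
    query_time_sec=1700000000, query_time_nsec=900_000_000,
    response_time_sec=1700000001, response_time_nsec=150_000_000,
)
print(response_latency_ms(msg))  # 250.0
```

No query/response matching is needed here, which is exactly what is missing for timeouts: a query that never gets a response never produces a message carrying both timestamps.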
However, there is a situation in which dnstap (in my opinion) falls short: timeouts due to packet loss or non-responsive servers are not reported through dnstap.
This means that in order to obtain this data the possibilities are:
The second option doesn't look so good. Moreover, dnstap seems to be designed from the ground up to avoid a situation like that. Response messages benefit from the DNS server software being aware of the query state, and they include the query timestamp when available.
Although it would break one aspect of dnstap, in which it tries to behave as closely as possible to a packet capture on steroids, that kind of out-of-band message would (in my opinion) greatly improve it.
At least in the situation I am describing, detecting certain errors when trying to query another DNS server, I guess the performance impact would be negligible and all of the state information needed is already in place.
What do you think?