This repository has been archived by the owner on Jun 25, 2023. It is now read-only.
TL;DR: I think landrush's recursive DNS limitations can cause pain on Linux when using dnsmasq with libvirt and NetworkManager, together with the default of redirecting guest DNS to landrush via iptables.
Key issue from VM guest with landrush defaults
$ dig -p 10053 @127.0.0.1 www.google.com
...
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 11678
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
...
Workaround
config.landrush.guest_redirect_dns = false avoids the pain!
But then, when set to false:
VM guest can't find the VM host IP.
resolving the VM host's name now incorrectly returns a localhost address, not the host's real interface address! (the guest talks to itself instead of the VM host)
previously, when set to true, the VM guest failed to resolve the VM host at all, instead of incorrectly returning a localhost address for it.
Host can still find VMs
resolving the guest VM's FQDN from the host works.
VMs can still find each other
resolving another guest VM's FQDN from within a guest worked.
Workaround side-effect on VM server host resolution
Here's an example where, with false, from within a guest, the VM host server's IP doesn't resolve correctly.
$ nslookup <VM server hostname>
Server: 192.168.121.1
Address: 192.168.121.1#53
Name: <VM server hostname>
Address: 127.0.1.1
And the default true:
$ nslookup <VM server hostname>
Server: 192.168.121.1
Address: 192.168.121.1#53
** server can't find <VM server hostname>: SERVFAIL
Potentially Related Issues
Originally, I had the same symptoms as #198. No matter which host I ping, landrush seemed to end up 'wildcarding' the FQDNs of external hosts and appending the configured 'local' TLD (in my case, vagrant.test). Might be to do with search vagrant.test being put into /etc/resolv.conf for guests...
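For reference, a search line in /etc/resolv.conf makes the stub resolver try extra candidate names on its own, independent of landrush. Here is a minimal sketch, assuming glibc-style rules as described in resolv.conf(5) with the default ndots of 1. The helper name `candidate_names` is hypothetical and only illustrates the ordering; it is not part of landrush or glibc.

```shell
#!/usr/bin/env bash
# Sketch of glibc-style search-list expansion (assumption: resolv.conf(5) rules).
# Usage: candidate_names NAME NDOTS SEARCH_DOMAIN...
candidate_names() {
  local name=$1 ndots=$2
  shift 2
  # A trailing dot means the name is already fully qualified: no search list.
  if [[ $name == *. ]]; then
    printf '%s\n' "$name"
    return
  fi
  # Count the dots in the name by deleting every non-dot character.
  local only_dots=${name//[^.]/}
  local dots=${#only_dots}
  # Enough dots: the name is tried as-is first.
  if (( dots >= ndots )); then
    printf '%s.\n' "$name"
  fi
  # Then each search domain is appended in turn.
  local domain
  for domain in "$@"; do
    printf '%s.%s.\n' "$name" "$domain"
  done
  # Too few dots: the as-is name is tried last.
  if (( dots < ndots )); then
    printf '%s.\n' "$name"
  fi
}
```

Under these assumptions, `candidate_names www.google.com 1 vagrant.test` prints `www.google.com.` followed by `www.google.com.vagrant.test.`, so a failed first lookup would be retried with the landrush TLD appended.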
And then there are extra complications noted... which relate more to #252 and possibly #174.
More or less default / minimal config causes this upstream DNS resolution bug
A fair bit of verbose context/info - jump down to the dig command that backs up what I saw in network packet captures. landrush DNS (with my stack) can't handle recursive queries.
By the way, not sure why landrush decides to run on all interfaces!? 0.0.0.0? Why not just the network vagrant is provisioning (i.e. 192.168.121.1). Maybe something to do with config.landrush.host_redirect_dns (and I should probably file a separate bug for this, I digress)
Checking what happened with iptables on the VM host shows another potential mess with multiple allows for both UDP and TCP.
When doing a packet trace on the virbr1 (vagrant provisioned) interface of the VM host, with nslookup from the guest (192.168.121.102), I observed multiple DNS query attempts:
1st go (doesn't append the landrush TLD), e.g. 192.168.121.102 -> 192.168.121.1:10053
DNS query from guest IP for www.google.com: type A, class IN to landrush DNS on host (port 10053) listening on all interfaces, including the VM host interface (192.168.121.1)
has 0x0100 flags
asking for recursion
indicating non-authenticated data is unacceptable
DNS response from landrush DNS on host seems to suggest that a recursive DNS query is not permitted
has 0x8502 flags
recursion not allowed
answer not authenticated
2nd go (does append the landrush TLD)
Same as above, except now DNS query from guest IP for www.google.com.vagrant.test: type A, class IN
probably default code logic to try appending the landrush TLD if the first attempt failed?
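The two flag values from the capture can be decoded by hand. A minimal sketch, assuming the standard 16-bit DNS header flags layout from RFC 1035 section 4.1.1; `decode_dns_flags` is a hypothetical helper name, and the DNSSEC-related AD/CD bits are left out for brevity.

```shell
#!/usr/bin/env bash
# Decode the 16-bit DNS header flags word (layout per RFC 1035, section 4.1.1).
# Usage: decode_dns_flags 0x8502
decode_dns_flags() {
  local flags=$(( $1 ))   # bash arithmetic accepts 0x-prefixed hex
  printf 'qr=%d opcode=%d aa=%d tc=%d rd=%d ra=%d rcode=%d\n' \
    $(( (flags >> 15) & 1 )) \
    $(( (flags >> 11) & 15 )) \
    $(( (flags >> 10) & 1 )) \
    $(( (flags >> 9) & 1 )) \
    $(( (flags >> 8) & 1 )) \
    $(( (flags >> 7) & 1 )) \
    $(( flags & 15 ))
}
```

`decode_dns_flags 0x0100` gives `qr=0 opcode=0 aa=0 tc=0 rd=1 ra=0 rcode=0` (a query asking for recursion), and `decode_dns_flags 0x8502` gives `qr=1 opcode=0 aa=1 tc=0 rd=1 ra=0 rcode=2` (an authoritative response with recursion not available and rcode 2, i.e. SERVFAIL).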
Queries didn't make it to 127.0.1.1:53 (NetworkManager's dnsmasq, and later I also test upstream)
When using nslookup, from the host, I noticed this (working) behaviour where queries did make it to 127.0.1.1:53 (the NetworkManager's dnsmasq):
DNS query from host via host to itself on the virbr1 interface 192.168.121.1 -> 192.168.121.1:53
flags in response from DNS service say recursion is allowed!
Triggers a forwarded (recursive) DNS query from the dnsmasq part on 127.0.0.1 to 127.0.1.1:53
127.0.1.1 must have then queried the upstream DNS (as managed by NetworkManager) and responded correctly
192.168.121.1 responds to itself.
Reading the man page for dnsmasq, I noticed the following:
Dnsmasq is a DNS query forwarder: it is not capable of recursively answering arbitrary queries starting from the root servers but forwards such queries to a fully recursive upstream DNS server which is typically provided by an ISP
So at a guess, landrush -> libvirt -> NetworkManager causes issues with a recursive DNS query? To confirm this, I also poked at landrush from the VM host:
$ dig -p 10053 @127.0.0.1 www.google.com
;<<>> DiG 9.10.3-P4-Ubuntu <<>> -p 10053 @127.0.0.1 www.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 11678
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;www.google.com. IN A
;; Query time: 2003 msec
;; SERVER: 127.0.0.1#10053(127.0.0.1)
;; WHEN: Thu Oct 20 22:59:54 SAST 2016
;; MSG SIZE rcvd: 32
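The tell-tale sign in that output is the missing ra flag on the flags line. A rough sketch of checking for it in a script, assuming dig's usual multi-line output where the header flags appear on a line starting with `;; flags:`; `recursion_available` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Reads dig output on stdin and succeeds (exit 0) only if the response
# header carries the "ra" (recursion available) flag.
recursion_available() {
  # Take the first ";; flags:" line, keep only the part before the counters
  # (third ';'-separated field), and look for "ra" as a whole word.
  grep -E '^;; flags:' | head -n 1 | cut -d';' -f3 | grep -qw 'ra'
}

# Example use against the landrush port (not executed here):
#   dig -p 10053 @127.0.0.1 www.google.com | recursion_available \
#     && echo "recursion available" || echo "recursion NOT available"
```

Running the same check against port 53 on 192.168.121.1 would be one way to compare the working host path with the landrush path.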
I also hacked in config.landrush.upstream '127.0.1.1' to explicitly get landrush to target NetworkManager's dnsmasq, but no luck. Also tried real upstream DNS servers found via:
for d in $(nmcli device show | grep -E "^IP4.DNS" | grep -oP '(\d{1,3}\.){3}\d{1,3}'); do echo $d; done
Doesn't work. Seems landrush doesn't pass on recursive DNS, even directly to upstream!
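To rule out the upstream servers themselves, each one can be queried directly, bypassing landrush. A minimal sketch, assuming dig is installed; `check_upstreams` and the `DIG` override variable are hypothetical names introduced here for illustration, and the server list is whatever the nmcli loop above printed.

```shell
#!/usr/bin/env bash
# Query each given DNS server directly for NAME and report whether it answered.
# DIG can be overridden (e.g. with a stub) for testing; defaults to the real dig.
DIG=${DIG:-dig}

# Usage: check_upstreams NAME SERVER...
check_upstreams() {
  local name=$1 server answer
  shift
  for server in "$@"; do
    # +short prints only the answer data; +time/+tries keep failures quick.
    answer=$("$DIG" +short +time=2 +tries=1 "@$server" "$name" A 2>/dev/null | head -n 1)
    if [[ -n $answer ]]; then
      printf '%s ok %s\n' "$server" "$answer"
    else
      printf '%s FAILED\n' "$server"
    fi
  done
}

# Example use (not executed here):
#   check_upstreams www.google.com 127.0.1.1 192.168.121.1
```

If every upstream reports ok here while the same query through port 10053 still returns SERVFAIL, that points at landrush's forwarding rather than at the upstream servers.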
All of the above was with the following setup (I try to keep to base/stable repos as far as possible):
Ubuntu 16.04.1 LTS
landrush (1.1.2)
Vagrant 1.8.1
Libvirt version: 1.3.1
Vagrantfile:
/etc/NetworkManager/dnsmasq.d/vagrant-landrush (because Ubuntu, like Fedora, ships with NetworkManager, which already has dnsmasq plugged in)
libvirt provides DNS on the virbr1 network spooled up by the vagrant libvirt provider. On the guest VM:
$ cat /etc/resolv.conf
# Generated by NetworkManager
search vagrant.test
nameserver 192.168.121.1
libvirt is also using dnsmasq... So yay, three layers of dnsmasq that need to play nice together, landrush -> libvirt -> NetworkManager :-/
On the host, various DNS services are listening
And on the guest
On the host, www.google.com resolves fine via libvirt's DNS, e.g.
On the libvirt guest, it fails, oddly, with the TLD appended: