Feature Request: Re-resolve ddns endpoint #6

Closed
merc4derp opened this issue Jul 11, 2023 · 25 comments

@merc4derp

The biggest problem with the official app imo is that it won't re-resolve the endpoint unless you turn the tunnel off and back on.
This means that if your endpoint's ip changes while the tunnel is active, you're stuck with no connectivity and you don't even know it.

There are no non-root solutions to this problem currently afaik.

Part of the problem is the fact that while the tunnel is active, your dns comes from the tunnel too. So if the endpoint switches ip, you can't even re-resolve because you lose access to the dns server.

Not sure how much agency you have on android with this app. On a typical Linux installation you periodically re-resolve (every 5 minutes or so) via 1.1.1.1 or another resolver outside the tunnel, compare the result to the cached resolution, and restart the tunnel if they differ. I'm sure you know of these workarounds already.

If you could implement something similar in this app, it would be fantastic.
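
The Linux-style workaround described above could be sketched roughly as follows. This is only a minimal illustration, not anything the app is known to do: `check_endpoint`, `resolve` and `restart_tunnel` are hypothetical names, and `resolve` is assumed to query a resolver that stays reachable outside the tunnel (for example 1.1.1.1).

```python
from typing import Callable, Iterable, Set


def check_endpoint(
    hostname: str,
    cached_ips: Set[str],
    resolve: Callable[[str], Iterable[str]],
    restart_tunnel: Callable[[], None],
) -> Set[str]:
    """Re-resolve `hostname` and restart the tunnel if its addresses changed.

    `resolve` and `restart_tunnel` are hypothetical hooks supplied by the
    caller. Returns the address set to cache for the next check.
    """
    try:
        fresh = set(resolve(hostname))
    except OSError:
        # Resolver unreachable: keep the cache and retry at the next interval.
        return cached_ips
    if not fresh:
        # Empty answer: treat it as a transient failure, not as a change.
        return cached_ips
    if cached_ips and fresh != cached_ips:
        # The endpoint moved since the last check, so bounce the tunnel.
        restart_tunnel()
    return fresh
```

A caller would run this on a timer (every 5 minutes in the description above) and feed the returned set back in as `cached_ips` on the next run.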

@zaneschepke zaneschepke self-assigned this Jul 11, 2023
@zaneschepke zaneschepke added the enhancement New feature or request label Jul 11, 2023
@zaneschepke
Owner

Thank you for showing interest in this project! I've done a bit of investigating and if we lose connection to the server (like from an ip change or even the server just going down) the WireGuard library will just log that the handshake failed and will endlessly keep trying to initiate a handshake again without exposing to the client app that the tunnel connection is down. Meanwhile, the user is left thinking the tunnel is connected but wondering why they have no internet connection. I even noticed this behavior on initial connection and even if you've never actually successfully connected to the tunnel it will still tell the client app that the tunnel is up. Unfortunately, I don't have too much control over this behavior as none of it is exposed to the client app except via log messages.

An initial solution could be for the app to monitor the logs from the WireGuard library. There are some very obvious log messages at 5 and 15 second intervals saying that the handshake has failed, which could be used as a cue to bring the tunnel down and then back up (to try to solve the ddns issue). If that still does not resolve the issue, at some point we should bring the tunnel down and let the user know that the connection has failed. Any thoughts on this solution, or on possible thresholds for this behavior?

@zaneschepke zaneschepke added the bug Something isn't working label Jul 11, 2023
@merc4derp
Author

It's obviously very dangerous to turn off the tunnel without user consent when we're talking about a security app.
Can you somehow guarantee that all active states reset, all connections drop and nothing is leaked outside the tunnel if you restart it silently on the fly?

I'll assume that you can't, in which case I suppose the behavior could be offered as an option with a big fat disclaimer in the app. For use cases where privacy is not critical, such as using the tunnel for adblock, users may opt to allow it. And if privacy is critical there could be an option to send a notification that tunnel seems dead, allowing for manual restart if desired.

As for possible thresholds, it depends on how accurate these logs are at indicating a dead tunnel and how often you're reading them keeping in mind android battery optimization policies for connections and apps. 3 failed handshakes should probably trigger it? Will need testing.

And of course it could take multiple iterations switching off and back on until the endpoint's new ip is updated on dns servers. Probably needs a notification after 10-15mins of failed attempts that this endpoint seems dead, to account for server crashes, power outages and what not.

@zaneschepke
Owner

Yeah, there would not be a way to guarantee nothing is leaked when restarting the tunnel.

I really like your idea of showing in the notification if the connection is failing and providing the user with an optional restart action.

These logs are accurate and I would read them in real time.

I was thinking that after 3 failed handshakes (about 15 seconds) we could just display the state of the tunnel as failing and attempting to connect. From there the user is welcome to wait it out for as long as they want (to prevent any leaks) or attempt a restart as many times and whenever they want via the action button on the notification. Thoughts?
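
The "3 failed handshakes" idea could look something like the sketch below. It is only an illustration under assumptions: the two log patterns are guesses at the library's wording and may not match the real messages, and `HandshakeMonitor` and `TunnelHealth` are hypothetical names, not part of the app or the WireGuard library.

```python
import re
from enum import Enum

# Assumed log wording; the exact messages emitted by the library may differ.
HANDSHAKE_FAILED = re.compile(r"Handshake did not complete")
HANDSHAKE_OK = re.compile(r"Received handshake response")


class TunnelHealth(Enum):
    HEALTHY = "healthy"
    FAILING = "failing"


class HandshakeMonitor:
    """Counts consecutive failed handshakes seen in a stream of log lines."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    def feed(self, line: str) -> TunnelHealth:
        """Update the failure streak from one log line and return the state."""
        if HANDSHAKE_OK.search(line):
            self.failures = 0  # a successful handshake clears the streak
        elif HANDSHAKE_FAILED.search(line):
            self.failures += 1
        return self.state

    @property
    def state(self) -> TunnelHealth:
        if self.failures >= self.threshold:
            return TunnelHealth.FAILING
        return TunnelHealth.HEALTHY
```

The monitor only reports state; whether to notify the user or restart the tunnel when it flips to `FAILING` stays a separate decision, which matches the concern about not dropping the tunnel without consent.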

@merc4derp
Author

I think it's a good start. Might not be the definitive solution but sounds better than what the official app offers currently and can be further iterated later.

I'd still love to see a silent auto-restart option eventually for people who aren't tech-savvy enough to keep an eye on notifications and interact with them though (such as my parents, heh).

Happy to provide more feedback when you implement it. Love what you've done with the app so far.

zaneschepke added a commit that referenced this issue Jul 17, 2023
Adds details screen which displays details of tunnel configuration as well as last handshake and rx/tx of peer.

Adds last handshake monitoring with statuses and thresholds.

Adds handshake/connection notifications based on last successful handshake.

Adds status LED next to tunnel on main screen.

Fixes bug where first click on QR code could result in nothing happening if QR code module is being downloaded. Now shows message to user.

Fixes bug where changes made after editing tunnel were not propagated to settings if that tunnel was configured as the default tunnel.

Fixes bug causing crash if wrong config file selected

Update README

Closes #7, Closes #6
@gitthangbaby

Switching by maintaining the connection on WAN or LAN network

My solution to this is

  1. Keep using a (D)DNS address that resolves to the WAN IP, and make sure it stays external all the time. If you redirect your own domain to LAN IPs at home, which is common, make an exception for one subdomain for the sake of WG connectivity.
  2. Let the firewall redirect firewall:WG port to LAN WG IP:WG port, so that WG users connecting to the WAN WG address actually connect to the LAN WG.

This way I'm switching all day long without interruption. The DNS record is fetched all day from the outside resolver.

Switching by turning off at a known network

Now the problem is what to do if we want to turn off WG at home (as this app, and soon the official WG client, offer). In this case there's obviously no issue when dropping the connection, and no DNS issue when starting it up again (as we fetch the DNS record from the WAN). However, there are leaks in this case, unlike the previous case, where the connection was permanent.

The moment home WiFi disconnects, this happens:

  • instant connection to WAN whether Mobile connection is permanent in Developer Settings or not
  • DNS resolving of WG and many other addresses
  • connectivity check
  • leak: several seconds of all the sleazy corporate apps reconnecting via the WAN IP and writing your WAN IP to their UniqueIP tables

Try it yourself: turn off WiFi and run while true; do curl ifconfig.co; sleep 0.5; done, or keep refreshing the network info in the Network Tool. Or you can use tcpdump to see the whole myriad of reconnections outside of WG.

The moment home WiFi connects, this happens:

  • several seconds of connectivity checking before WiFi icon turns positive (without "?"/"!" subsymbol)
  • leak 1: the connectivity check can fail and mobile connection be still used for various reasons: DNS resolving issue, [x] Block connections is enforced etc
  • leak 2: this also leaks if VPN connection is being switched between servers (least: WG kernel, most: OpenVPN clients), but that's not this case

To patch WiFi disconnect event:

  • it is necessary to block the network and allow only DNS + the WG IP, or, as a simplification, at least block TCP (which still allows DNS resolving): iptables -I OUTPUT -p tcp -j DROP
  • let WG connect
  • after some seconds, or when the WG connection is confirmed and working, drop the limitation: iptables -D OUTPUT -p tcp -j DROP. No problem for the apps, as there are sticky connections anyway; few will complain about the little blackout.

Of course the app controlling the WG tunnel could do all this fastest. Since none of them do, I am launching the commands myself via XposedEdge and the leak is undetectable. But that's on a rooted phone.
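
The disconnect patch above could be scripted roughly like this. It is a sketch under assumptions: the two iptables commands are the ones quoted in the steps, while `reconnect_with_tcp_block`, `run`, `bring_up_tunnel` and `tunnel_is_up` are hypothetical names for hooks the caller would supply (on a real device `run` would need root). One design choice that is mine rather than the original's: if the tunnel is never confirmed, the block is left in place instead of being lifted after a timeout.

```python
import time
from typing import Callable, List

# The two commands quoted in the steps above.
BLOCK_TCP = ["iptables", "-I", "OUTPUT", "-p", "tcp", "-j", "DROP"]
UNBLOCK_TCP = ["iptables", "-D", "OUTPUT", "-p", "tcp", "-j", "DROP"]


def reconnect_with_tcp_block(
    run: Callable[[List[str]], None],
    bring_up_tunnel: Callable[[], None],
    tunnel_is_up: Callable[[], bool],
    max_checks: int = 10,
    interval: float = 1.0,
    wait: Callable[[float], None] = time.sleep,
) -> bool:
    """Block outgoing TCP, start the tunnel, unblock once it is confirmed.

    Returns True if the tunnel came up and the block was removed. Returns
    False if it never came up, in which case the block stays in place so
    nothing leaks; the caller decides what to do next.
    """
    run(BLOCK_TCP)
    bring_up_tunnel()
    for _ in range(max_checks):
        if tunnel_is_up():
            run(UNBLOCK_TCP)
            return True
        wait(interval)
    return False
```

With `run` set to something like `lambda cmd: subprocess.run(cmd, check=True)`, this reproduces the manual sequence; passing a fake `run` makes it easy to dry-run without touching the firewall.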

To patch WiFi connect event:

  • release potential blocks, e.g. settings put secure always_on_vpn_lockdown 0
  • let WiFi connect and the "?"/"!" subsymbol disappear
  • after some seconds, check again whether the SSID is connected, and only then drop the tunnel; otherwise add the blocks back, e.g. settings put secure always_on_vpn_lockdown 1

@zaneschepke
Owner

@gitthangbaby Thank you for all of these details. It will take me a bit to unpack this all, but I think these are some really valuable insights.

Would you be willing to chat on something like Discord? I have a lot of questions about this information you shared.

Additionally, I think there are ways to accomplish some of the things you listed in the patches without root, but that would require changes to the official WireGuard Android tunnel library.

Also of note is some of the exempt traffic Android has, which is referenced in Mullvad's documentation here

I think a lot of what you shared is relevant to #52.

@gitthangbaby

Thanks for the resources. Indeed the Block mode is not a great insurance, I've caught it allowing far more than just DNS / NTP. And then it blocks LAN. So the only chance is the WG app to handle this, and fight apps which modify the network rules. Indeed some apps like Mullvad will be more prone to leaking, but they don't allow home WG server connections. And sadly, no SSID driven behavior. That is sad, after so many years of VPN boom.
Sadly i don't use chats. Perhaps use https://hack.chat/?git .

@olivluca

I'm not using your app (yet). My use case is not to have absolute certainty that no data will be leaked, but to safely access a phone in a remote location.
To "solve" the ddns problem, I use an Automate flow that, every 30 seconds, checks 5 times if it can ping a host only reachable through the vpn (i.e. a host in my internal network). If it fails, it turns the tunnel off and then on again. The same flow ensures that the tunnel is brought up when the phone reboots. Unfortunately, from time to time Automate stops processing all flows, which could lead to a dead tunnel and an unreachable phone, hence I was looking for alternatives and found your app and this bug.
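
The flow described above boils down to something like the following sketch. It assumes that "fails" means all five pings fail, which is one reading of the description; `ping` and `restart_tunnel` are hypothetical callables (in Automate these would be the ping block and the tunnel toggle), and the function names are made up for illustration.

```python
from typing import Callable


def tunnel_seems_dead(ping: Callable[[str], bool], host: str, attempts: int = 5) -> bool:
    """Return True only if every one of `attempts` pings to `host` fails.

    `any` stops at the first successful ping, so a healthy tunnel costs a
    single ping per check.
    """
    return not any(ping(host) for _ in range(attempts))


def watchdog_tick(
    ping: Callable[[str], bool],
    host: str,
    restart_tunnel: Callable[[], None],
    attempts: int = 5,
) -> bool:
    """Run one check; restart the tunnel if the host is unreachable.

    Returns True when a restart was triggered.
    """
    if tunnel_seems_dead(ping, host, attempts):
        restart_tunnel()
        return True
    return False
```

Calling `watchdog_tick` every 30 seconds mirrors the interval mentioned above; the host should be one that is reachable only through the tunnel, otherwise a dead tunnel goes unnoticed.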

@zaneschepke
Owner

Hello @olivluca. Thank you for your interest in the app! Yes, this app does not completely solve this issue. Right now, it only shows a push notification that allows the user to easily restart the tunnel if there are connection issues.

@olivluca

This is what openwrt does to solve the issue, I have no idea if the wireguard library exposes the same functionality as the wg command:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=package/network/utils/wireguard-tools/files/wireguard_watchdog

@zaneschepke
Owner

Thanks for this! This is super helpful. I'll dig in and see if this is something I can implement.

@zaneschepke zaneschepke reopened this Nov 18, 2023
@zaneschepke zaneschepke removed the bug Something isn't working label Nov 18, 2023
@Laserlicht

It would be great to find a solution for this.

The biggest disadvantage of the official app in my opinion.

@nistvan86

I'm trying to use Wireguard outside my home network with a dynamic WAN IP address (my domain name gets updated by DDNS), but this makes running apps like Home Assistant through the tunnel unreliable, as it breaks the tunnel when my ISP decides to renew my public IP.
I would really like to have some builtin solution for this. Tasker and such can break too easily.

There was a related pull request for this on the official android client which was not accepted. WireGuard/wireguard-android#26

I understand that there must be some kind of verification that the new peer IP updated from the DNS entry really belongs to the endpoint in question and not to someone impersonating it. But wouldn't a successful impersonation require stealing the peer's private key? We also have the pre-shared key extra security option. We could tie the DNS re-resolving to the presence of that extra security feature, if it helps. (I'm just thinking out loud here, I'm not sure if this is the case)

@konradmoesch

There is a script in wireguard-tools which checks the age of the latest handshake and re-resolves the endpoint, if needed.
I'd like such a feature in this app, as well.
I think monitoring the logs of the library is not needed, since the latest handshake age is already shown in the UI.
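
As a rough illustration of the handshake-age check, the sketch below parses the output of `wg show <interface> latest-handshakes` (one public key and one Unix timestamp per line) and lists peers whose last handshake is too old. The function names are hypothetical and the 135-second default is just a value picked for illustration; whether the Android library exposes the same data in this form is not something this sketch assumes.

```python
from typing import Dict, List


def parse_latest_handshakes(output: str) -> Dict[str, int]:
    """Parse `wg show <interface> latest-handshakes` output.

    Each line is expected to hold a public key and a Unix timestamp
    separated by whitespace; a timestamp of 0 means no handshake yet.
    """
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            handshakes[parts[0]] = int(parts[1])
    return handshakes


def stale_peers(handshakes: Dict[str, int], now: int, max_age: int = 135) -> List[str]:
    """Return the public keys whose last handshake is older than `max_age` seconds."""
    return [key for key, ts in handshakes.items() if now - ts > max_age]
```

Only the peers returned by `stale_peers` would need their endpoint re-resolved, which avoids touching peers that are handshaking normally.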

Regarding the issue of dns entry verification: this is already a problem, not something new, right? For example, activating the tunnel when switching to a mobile network or an untrusted wifi would mean that the endpoint is resolved using an untrusted dns server.
Furthermore, the handshake would not succeed if someone impersonated the peer and poisoned the dns.

Regarding leaks during re-resolving and restart: if we make this a default-off setting, this would not be a big problem, right?

Is someone already working on this? If not, what would be a good starting point for implementing it?

@zaneschepke
Owner

Thanks for this info! This one is still in the backlog. I was planning on first implementing a kill-switch feature to prevent the leaks before working on this re-resolve (I think that is what you mean by a default-off setting). The problem is that a lot of this work needs to be done inside WireGuard's Android tunnel lib. This lib needs to be integrated directly into the project as a local lib to make the appropriate changes. The app is currently just pulling the lib in from maven, but I am in the process of adding it as a local lib.

@zaneschepke zaneschepke added the wg-lib This issue requires changes to the official WireGuard Android library. label Feb 19, 2024
zaneschepke added a commit that referenced this issue Feb 19, 2024
Migrated app to a forked version of wireguard-android to enable development work on features that require changes to the core lib, like #107 #104 #87 #52 #6

Improved first launch flow by change vpn permission to only launch on first tunnel start

Changed to proper database seeding strategy

Updated README to account for GitHub packages auth requirement

Migrated from deprecated UI components and libs

Bump versions
@maurerle

When using DynDNS, i have used Easer with the official wireguard app, which is unfortunately not on F-Droid anymore.
See here: https://android.stackexchange.com/a/254263

So one has to use the Easer start service intent to toggle this.
I described a nicer Feature Request to easer here: renyuneyun/Easer#478

But it would be even cooler to have this functionality in wgtunnel directly.

@orbital253

Is it possible to check that the tunnel is active and valid by pinging (at an interval) the server's local IP? If it's pingable, then the tunnel is working as intended.
If the ping fails, then the tunnel is not valid and it needs to be reestablished.

So an option to set an IP to monitor when a tunnel is enabled

I use Tasker to ping the server's local IP, and if the ping fails I use a WireGuard intent to disable the tunnel and another intent to enable it again. I use a ping interval of 1 minute.

@zaneschepke
Owner

Thank you @orbital253 and @maurerle for the info. Yes, @orbital253 this is precisely the plan. I was delaying before because I thought it would be nice to have a kill switch feature before this feature (to prevent data leaks), but I don't think this is necessary.

I was also thinking it would be nice if I could solve this issue in the wireguard lib logic itself (to re-resolve endpoints) but I think it might actually just be an android API limitation with how it creates the tunnel interface via the API.

A simple pinger with a restart of the tunnel should be pretty easy to implement. This one is long overdue so I'll get to work on it next.

@perzarys

Hi, thanks for implementing this, I was really looking forward to this feature and immediately downloaded the new release. Unfortunately it does not work reliably for me. I feel like the 60s handshake timer is not enough for some DDNS endpoints to update, especially when the last handshake before endpoint renewal was already like 20s ago, then only leaving a 40s DDNS renewal margin which will fail with most DDNS providers, like in my case. The 60 minute cooldown timer for the auto-restart is way too high in my opinion, maybe it would already help to reduce this to 5min to give endpoints enough time to update? Let me know if you need more info. Thanks
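
To make the timing concern above concrete, here is a small worked sketch. It assumes the first restart fires exactly when the handshake age reaches the threshold and that later restarts are spaced by the cooldown; neither is confirmed app behavior, and both function names are hypothetical.

```python
from typing import List


def ddns_margin(threshold_s: int, handshake_age_s: int) -> int:
    """Seconds the DDNS record has to update before the first restart fires."""
    return max(threshold_s - handshake_age_s, 0)


def restart_times(first_restart_s: int, cooldown_s: int, window_s: int) -> List[int]:
    """Times (seconds after the IP change) at which restarts fire within the window.

    Assumes one restart at `first_restart_s` and then one every `cooldown_s`.
    """
    return list(range(first_restart_s, window_s + 1, cooldown_s))
```

With a 60 s threshold and a handshake that was already 20 s old, `ddns_margin(60, 20)` gives the 40 s margin mentioned above. Over a 15-minute window, `restart_times(40, 3600, 900)` yields a single attempt at 40 s, while a 5-minute cooldown, `restart_times(40, 300, 900)`, yields attempts at 40 s, 340 s and 640 s, giving a slow DDNS update more chances to be picked up.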

@perzarys

Even after more than 60 minutes of failed handshakes after IP change, my tunnel did not restart automatically. Had to manually restart.

@zaneschepke zaneschepke reopened this Mar 19, 2024
@zaneschepke
Owner

Hey! I'll reopen this issue until I get this right. It was my first stab at this so it was a very rudimentary implementation. It seems it would be ideal to integrate the last handshake time into the ping test. Any suggestions for timings/thresholds on this? We can lower the cooldown timer as well and have it check wifi/mobile data availability so it doesn't spam in situations when you've lost all network availability.

I am surprised it wasn't working at all for you. I added a logs screen which might help us determine why it didn't work. Did you have auto-tunneling enabled?
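
One way to combine the signals mentioned above (ping result, last handshake age, network availability, cooldown) is sketched below. This is only an assumption about how such a policy could be wired together, not the app's implementation; `RestartPolicy` is a hypothetical name and both default values are placeholders, since the right thresholds are exactly what is being asked about.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    max_handshake_age_s: int = 180  # placeholder threshold, not a recommendation
    cooldown_s: int = 300           # placeholder cooldown, not a recommendation

    def should_restart(
        self,
        *,
        ping_ok: bool,
        handshake_age_s: int,
        network_available: bool,
        since_last_restart_s: int,
    ) -> bool:
        """Decide whether to restart the tunnel on this check."""
        if not network_available:
            # No wifi/mobile data at all: restarting cannot help, so don't spam.
            return False
        if since_last_restart_s < self.cooldown_s:
            # Still cooling down from the previous restart.
            return False
        # Require both signals: the ping failed and the handshake is stale.
        return (not ping_ok) and handshake_age_s > self.max_handshake_age_s
```

Requiring both a failed ping and a stale handshake avoids restarting on a single lost ping, and the network check keeps the feature quiet while the device is simply offline.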

@perzarys

perzarys commented Mar 19, 2024

Yes, I have auto tunneling enabled, as well as "Restart on ping fail". I think I saw it working once after I triggered an IP change on my router and the WG Tunnel app was open, but it never worked when the app was in the background. Maybe something unintentional caused this exception, not sure as I could never reproduce it. Foreground service is enabled, location permission set to always allow. Auto tunneling with my trusted SSID works perfectly fine, btw. I tried numerous times to do the following: connect to my VPN, check traffic I/O and internet connection, close the WG Tunnel App, trigger a renewal of the router IP, wait a minute or two, check if the VPN automatically restarted. It has never automatically restarted during my tests. Manually restarting the tunnel after a minute of failed handshakes has always worked, however, so it seems like the DDNS update succeeded in that short amount of time. Thanks for continuing to dig into this, I can grab some logs later today. (Edit: Logs are sent over to [email protected])

@perzarys

I tried a few more times, respecting the 60 minutes cooldown of the ping test. Restarting the tunnel has now worked a couple of times, but it's unfortunately not reliable. A lot of times I had to manually restart after like 15 minutes of failed handshakes, at other times it automatically restarted after like 1 minute which is totally fine. But unfortunately too unreliable right now.

@zaneschepke
Owner

I'm going to go back to marking this one as closed as this feature is now in the app. For anyone still following this feature, there is a new conversation happening in a new issue, #198, about improving it to allow users to customize a per-tunnel ping target, ping interval, and ping cooldown. I think adding these changes will help make this feature more reliable (especially for tunnels that are using split tunneling or specific allowedIps). It will also give the user more flexibility and improve the expected behavior by allowing users to customize the intervals at which this feature will operate.

@seppiola

seppiola commented Jun 9, 2024

I know this is closed because implemented in the app but I really think that the auto tunneling feature to restart the tunnel on ping fail is a fundamentally different feature.

If the goal is to re-resolve dns because a change in the network caused a change in an endpoint ip address, the tunnel should not be restarted merely because the endpoint has been unreachable for more than 30 or 60 seconds, which can be pretty common on mobile networks. Moreover, wireguard will already repeat the handshake at specified intervals if keepalive is set, and reconnect as soon as the endpoint becomes reachable again.
What would be ideal is something like this script from wireguard, which checks potentially changed endpoint ips against the ones active in the peers and resets them if needed, without the need to repeatedly ping the endpoints and restart the tunnel every time the connection fails for any reason.
This is an implementation of the feature with systemd on archlinux just for reference, and I've been using it reliably for years.
I never actually looked at the wireguard libs on android so I don't know how this should be implemented here, also because that script uses wireguard tools to reset endpoints in the kernel module, which is mostly unused by android users (me included). But I think this should be an option unrelated to auto tunneling, and it should be usable without enabling it, so that it can be used in conjunction with always-on vpn.

To be clear, I'm not saying #198 shouldn't be addressed or that "restart tunnel on ping fail" isn't a potentially useful feature, just that it's not the correct solution for seamlessly using a wireguard configuration whose endpoint ips change.
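
The endpoint comparison described above could be sketched like this on a Linux box with wireguard-tools. It parses `wg show <interface> endpoints` and builds `wg set <interface> peer <key> endpoint <ip:port>` commands only for peers whose hostname now resolves elsewhere. The function names and the `configured` mapping are hypothetical, `resolve` is a caller-supplied callable, only IPv4-style `ip:port` endpoints are handled, and nothing here assumes the Android library offers an equivalent.

```python
from typing import Callable, Dict, List, Tuple


def parse_endpoints(output: str) -> Dict[str, str]:
    """Parse `wg show <interface> endpoints` output into {public_key: "ip:port"}."""
    endpoints = {}
    for line in output.splitlines():
        parts = line.split()
        # Peers without a known endpoint are listed as "(none)".
        if len(parts) == 2 and parts[1] != "(none)":
            endpoints[parts[0]] = parts[1]
    return endpoints


def endpoint_updates(
    interface: str,
    active: Dict[str, str],
    configured: Dict[str, Tuple[str, int]],
    resolve: Callable[[str], str],
) -> List[List[str]]:
    """Build `wg set` commands for peers whose endpoint hostname moved.

    `configured` maps public key -> (hostname, port) from the tunnel config;
    `resolve` returns the current IP for a hostname.
    """
    commands = []
    for key, (host, port) in configured.items():
        wanted = f"{resolve(host)}:{port}"
        if active.get(key) != wanted:
            commands.append(["wg", "set", interface, "peer", key, "endpoint", wanted])
    return commands
```

Because only the endpoint of the affected peer is rewritten, the interface stays up the whole time, which is the difference from the ping-and-restart approach.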
