Good day everyone,
I stumbled across the Skudonet Community Edition a week or two ago and tested it a bit in my homelab.
I wanted to ask if the CE (or the commercial versions) supports Port Following like KEMP or F5? The use case here is MS Always On VPN, specifically with IKEv2, which establishes a connection first on Port 500 but then switches to Port 4500 when it detects it's behind a NAT. I created a Service/Farm (l4xnat mode), put both ports in it, and used IP Source Persistency. While my test clients do establish IKEv2 connections, at some point (mostly after about 1.5 h) the tunnels either drop or enter an inconsistent state where the tunnel is up but no traffic can go through.
Apart from that, load balancing SSTP tunnels (HTTPS) feels great with Skudonet, and the unlimited bandwidth is a huge plus compared to other Community or Free Editions!
Thank you very much and kind regards
Good day. Regarding your question about Port Following: your configuration is well suited for traffic that uses multiple ports, but here is where the issue can lie.
If the client starts the new connection, the load balancer will be able to continue the traffic based on the source IP persistence. However, if the backend initiates a new connection, that request will reach the load balancer, and it won't be able to determine which client the connection belongs to. If this is the case, I encourage you to use DSR or DNAT mode instead of NAT; that will fix the problem.
It is quite strange to me that it works for 1.5 hours and then fails after that period of time. Maybe MS is using additional ports for something else. Before testing DSR or DNAT, you can try the following:
- Configure 1 Virtual IP only for the MS Always On VPN service, and ensure this IP is not used in any additional farms.
- Configure a farm using this new Virtual IP with ALL the ports and ALL the protocols.
- Configure the persistence and the backends in the same way as in the other farm, without configuring any PORT in the backend table, so no port translation is applied.
With that configuration, if the client requires additional ports that were not taken into consideration, the VIP will forward them just as it does with 500 and 4500.
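Just to illustrate the idea of an "all ports, no port translation" DNAT farm: the behaviour is conceptually similar to an nftables DNAT rule that rewrites only the destination address, so every port and protocol reaching the VIP is forwarded unchanged. This is only a sketch with made-up addresses, not the ruleset Skudonet itself generates and not something to apply on the appliance:

# Conceptual illustration only (hypothetical VIP 192.0.2.10, backend 192.168.50.11):
# rewrite the destination address for ALL ports/protocols, leave the port untouched.
nft add table ip aovpn_demo
nft 'add chain ip aovpn_demo prerouting { type nat hook prerouting priority -100; }'
nft add rule ip aovpn_demo prerouting ip daddr 192.0.2.10 counter dnat to 192.168.50.11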
Regarding the limits: we develop both our commercial and community solutions without any limitation in the software; you can configure as many farms, backends, certificates, CPUs, RAM, or network interfaces as you need.
Finally, please keep us updated, as we are interested in confirming the result.
Regards!
Hello and thank you for your response!
I'll do some more testing tomorrow and come back to you with more information and details.
Kind regards
Thanks for waiting, I'm back with some more info. I'll also preface it with the full original setup, which I should've done in the opening post (my bad).
The setup looks as follows:
- 1x SKUDONET CE Load Balancer in the DMZ VLAN (one-armed)
- 1x l4xnat IKEv2 Farm (Ports 500,4500 | UDP | DNAT | IP: Source IP | Persistency TTL: 4 Days)
- 3x Windows 11 Enterprise Clients (VMs) in an "outside" VLAN (no communication links to the internal infrastructure)
- 3x Windows Server 2022 (VMs) with two NICs (one in LAN, one in DMZ). The DMZ NIC Gateways point to the Load Balancer for Transparency to work and to prevent asymmetric routing.
I ran the setup today, and after 2.5 h I noticed one of the Win 11 clients losing connectivity on the Device Tunnel (no ports open to the Domain Controller, ICMP not working, etc.). In the GUI, under the Monitoring Stats, I could see that only 2 IKEv2 connections were established. Observing it further, the IKEv2 tunnel connection for that client seemed to bounce back and forth between two backend servers. An hour later, another client experienced the same issue.
I will try your suggestion with the no-limits farm, though I think the result will be the same, since the problem is not present with other LBs and Microsoft using undocumented ports isn't very likely. Perhaps I have just misconfigured something in SKUDONET, since it's the first time I'm using it. I'll be back tomorrow with the results.
Thank you very much for your help, much appreciated!
Hi, it seems quite strange that it starts failing after a good period of time, and also that the same client creates sessions against two different backends.
Are you configuring farmguardian to detect failures in the backends? It is mandatory for l4xnat. Also confirm that farmguardian is configured with "cut connections".
This issue sounds to me like the backend is detected as down by farmguardian, but the already established connection is not deleted and a new connection is created against another host.
Anyway, please feel free to send us a supportsave after your tests; with it we will be able to see more. You can also check the syslog file /var/log/syslog.
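For example, something along these lines will pull out the farmguardian-related messages around the time of the failure (the keyword is just a suggestion; adjust it or grep for your farm name instead):

# Hypothetical example: filter farmguardian messages from syslog
grep -i farmguardian /var/log/syslog | less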
Regards
Hi,
I can confirm that I'm using Farmguardian for the health check on the backends (check_ping). It's not strictly mandatory, I guess, since I can select "Disabled" as an option. I checked the Farmguardian section under the Monitoring tab, and I can confirm "Cut connections" is enabled but greyed out (no changes can be made by me for the pre-defined ones).
For the SSTP farm (basically HTTPS), I use the "check_tcp" Farmguardian, where "cut connections" is disabled by default, though I have no issues with it (just to mention and confirm it).
I have noticed that under the Farms Tab of the Farmguardian (Monitoring>Farmguardians>check_tcp) I can't see my Farm that uses it or my other one. Is this intended behavior?
I'd be delighted if you could take a look at the supportsave! What's the best way to send it over? While this is just a lab environment for testing, I'd still prefer not to post the file online as an attachment.
Again, thanks for your help and have a nice weekend!
@bob-sample, please send the support save or any direct download link to " community at skudonet dot com ".
Also, indicate the affected farm name and the hours when you experienced the issue, so we can find that time window in the logs.
Regards!
@emiliocm I just sent the mail out. Thanks again for taking the time and looking into this issue 🙂
Hi @bob-sample, checking the logs I can see that farmguardian detects the backends as down. The farmguardian in use is:
[check_ping]
command=check_ping -H HOST -w 2,100% -c 2,100% -p 2
cut_conns=true
description=Send 2 ICMP packets and check that 100% reaches the host.
farms= AOVPN-IKE-500-4500
interval=15
log=false
template=true
So the conclusion is that a backend will be detected as DOWN if 2 pings are lost. From my point of view, that is not technically correct, because the pings can be lost while the layer 7 service is still OK. I would therefore recommend that you change the health check to check_udp or something similar, or even configure your own health check with some VPN test authentication.
When a backend is detected as DOWN by farmguardian, ALL the established connections are deleted and the sessions are deleted as well. The VPN client then has to detect that the endpoint failed and force a renegotiation, sending a new connection to the load balancer, which will select another available VPN backend on the farm.
So the behaviour is expected and the load balancer is doing its job properly.
Thanks!
Good day and thank you kindly for the swift analysis and response!
Hmm, seems like check_ping is a rather strict health check. Do you perhaps have general recommendations for when to use which health check? Seems like the one I used isn't well suited for VPN workloads. I'll use another one, run the setup today, and see if there's still something going wrong (I don't expect so, but it's good to test and verify).
Thanks again!
@bob-sample, we encourage you to use at least a layer 4 check like check_tcp or check_udp. In your case, as you are running the two UDP ports 500 and 4500, I would recommend that you create your own health check.
Write a script that first checks backend port 500 and, if that is OK, then checks backend port 4500.
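A minimal sketch of such a script, assuming it is saved as, for example, check_ike.sh under /usr/local/skudonet/app/libexec and referenced in the farmguardian command as "check_ike.sh HOST", so the HOST token gets replaced with each backend address (the script name and the use of nmap here are my own assumptions, not a preconfigured check):

#!/bin/bash
# Hypothetical combined check: consider the backend UP (exit 0) only if
# nmap reports both UDP 500 and UDP 4500 as open (or open|filtered,
# i.e. no ICMP port-unreachable received); otherwise exit 1.
HOST="$1"
for PORT in 500 4500; do
    nmap -sU -p "$PORT" "$HOST" | grep -q "open" || exit 1
done
exit 0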
For preconfigured health checks:
Regards
Thanks for the suggestion. I tried to use check_udp, check_udp_nc and a custom one based on check_udp_nc (just checking port 500), and unfortunately they don't work for me; the backends get detected as being down. Interestingly enough, if I run the checks manually from the VM console (same command, just filling out the two variables for HOST and PORT), I can see packets arriving on the host via Wireshark and I get a success message from the executed commands. With the three Farmguardians mentioned, though, I don't see any packets arriving.
Sorry for having another issue. Any advice on your part?
Please try the following commands and paste the output here:
nmap -sU -p 4500 backend_IP
nmap -sU -p 500 backend_IP
In case nmap is not installed, please install it.
Regards
I have attached a picture. nmap was installed by default. I hid the MACs, just so you know.
Try the following:
0-Create a symlink to nmap in path /usr/local/skudonet/app/libexec
1-Create a new farmguardian check_udp_nmap
2-Configure the following command:
nmap -sU -p 4500 HOST | grep open
timeout: 5
interval: number of backends*5+1.
cut connections: yes
logs: no
3-Apply this new health check to the affected farm.
This health check looks for the "open" status in the nmap output for UDP port 4500.
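In case it helps, this is roughly how steps 0 and 2 could look from the shell (assuming nmap is in the regular PATH; with the 3 backends in this lab, the interval formula gives 3*5+1 = 16 seconds):

# Step 0: make nmap reachable from the farmguardian libexec directory
ln -s "$(command -v nmap)" /usr/local/skudonet/app/libexec/nmap

# Step 2: the check command as configured; farmguardian replaces HOST
# with each backend IP when it runs the check.
nmap -sU -p 4500 HOST | grep open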
Regards!