Summary: kernel: r295285 in 10.2-STABLE breaks OpenVPN functionality
Product: Base System
Component: kern
Status: Closed Overcome By Events
Severity: Affects Only Me
Priority: ---
Version: 10.2-STABLE
Hardware: amd64
OS: Any
Reporter: g_amanakis
Assignee: Mark Linimon <linimon>
CC: KOT, ae, brooks, cy, garga, gnn, madpilot, mandree, melifaro, mgrooms, net, re, vangyzen
Keywords: needs-qa, patch, regression
Flags: koobs: mfc-stable10?
URL: https://reviews.freebsd.org/D4042
Created attachment 166845 [details]
client.conf
Also, I just figured out that my Android devices, which connect directly to the gateway running the OpenVPN server (they connect to the internal interface and not through OpenVPN), are not able to open regular web pages, and some apps (eBay, Amazon, YouTube) stop functioning if this commit is applied.
Recently I noticed that after upgrading two separate pairs of firewalls to 10.2-RELEASE, my ISAKMP daemons stopped negotiating SAs with peers. I just haven't gotten around to submitting a bug report yet. It only seems to happen when large UDP packets get fragmented due to large payloads (i.e. certificate info is transmitted late in phase 1 negotiation). This may be unique to the bge driver or related hardware, as the ISAKMP daemon started working again on both sets of firewalls once I disabled hardware checksum offload (ifconfig bgeX -rxcsum). This workaround wasn't required until the upgrade to 10.2-RELEASE, but I can't say if it was at a specific patch level. I can say that one set of firewalls was upgraded from 9.2-RELEASE-p?? and the other set was upgraded from a patched 10.0-RELEASE, so I assume the commit that broke UDP reassembly was committed sometime between 10.0-RELEASE and 10.2-RELEASE-p11. Sorry I can't be more specific.
BTW, this isn't an attempt to hijack your problem report. I just thought that the issue you describe (OpenVPN with UDP) may be related to mine, so I thought it would be worth mentioning. Have you tried disabling hardware checksum offload on your public-facing network device? If that improves the situation, it's quite possible that we are being bitten by the same issue.
(In reply to mgrooms from comment #3) This issue concerns only 10.2-STABLE (now 10.3-BETA1), which is about to become 10.3-RELEASE. The commit has not been applied to 10.2-RELEASE, so you must be facing another issue. The OpenVPN clients connect with no problems to the server. It has to do with ip_tryforward().
If I comment out this function in ip_input.c the symptoms resolve. Could it be that some of the traffic entering ip_tryforward() bypasses the NAT?
I see. The underlying cause is quite possibly unrelated then. As I said, I wasn't trying to hijack your bug report. But the symptom still sounds similar in the respect that some of your UDP traffic (your OpenVPN control traffic, for example) appears to be processed correctly, but other traffic (your OpenVPN transport traffic being tunneled) does not. That smacks of a reassembly problem. In the latter case, you could have a large inner IP packet size due to the tunnel overhead, which would cause the outer IP packet to be fragmented. This would lead to stalls and resets from the client perspective, just as you describe in your bug report. However, that doesn't necessarily explain your second problem, where non-tunneled traffic stalls. You can't NAT fragmented packets if you have a reassembly problem, as the required UDP/TCP port values are only available in the initial packet of a fragmented chain. That usually only affects UDP packets, but it can still be a problem for TCP if the TCP MSS is large enough, as the DF bit is typically set in the IP header. In any case, good luck with your problem.
D'oh, sorry. I stopped and started writing that last paragraph while in the middle of something else. I was still thinking of things in terms of tunneling. Please disregard and I'll go away and be quiet now :)
Assign to committer of 295285.
Kernels before this commit (e.g. r295264) with "net.inet.ip.fastforwarding=1" do not exhibit these symptoms.
Can you try this without VIMAGE, and then possibly without IPSEC_NAT_T, and tell me if the problem persists? Also, can you share the output of netstat -s for all protocols, including tcp, esp, ah?
Created attachment 166885 [details]
netstat.txt
Output of "netstat -s" attached.
In the local network the problem concerns primarily smartphones (I have an Android ecosystem) where some pages do not open at all. Commenting out the ip_tryforward() function resolves this.
I tried with IPSEC_NAT_T and VIMAGE disabled and it doesn't resolve it.
I did some thorough testing with a simplified IPFW ruleset (only in-kernel NAT enabled and allow everything on the local and WAN interfaces). Enabling "net.inet.ip.fastforwarding" in kernels before r295285 also exhibits the symptoms. Please disregard Comment #8 above.
Created attachment 166886 [details]
tcpdump.txt
I did a tcpdump while an android client tries to access a webpage (www.gutefrage.net) while "net.inet.ip.fastforwarding" was on. I interrupted both dumps as soon as the client gave up trying to open the page.
Thanks for all the updates; this does help to track some of this down. A few more questions: If you are not using an Android client, does everything just work? In your last test, did you also turn off IPSEC and just use IPFW? Can I see the IPFW ruleset you're using? And can I get a full pcap file rather than a text dump of the attempted session? Have you tested, or can you test, this on HEAD?
Remove freebsd-amd64 from cc.
Created attachment 166901 [details]
ipfw.txt
This is the simplified IPFW ruleset I am using. IPSEC is turned off in kernel compilation. I will use only this from now on in order to have a common basis. xxx.yyy and aaa.bbb are local networks. All the local clients are on the xxx.yyy network.
With this I am getting a mixed behaviour. For example my laptop client (Thinkpad X230 running Archlinux) exhibits the symptoms on some sites (most notably www.gutefrage.net) when the gateway runs the r295545 kernel (commenting out ip_tryforward() resolves it). However when the gateway runs the r295264 kernel with net.inet.ip.fastforwarding=1 the archlinux client doesn't exhibit the symptoms anymore.
I will test this on HEAD. Is there any special tcpdump command you'd like me to run? I will try to get simultaneous dumps from the interfaces involved.
With tcpdump just use -w /tmp/capture.pcap so you get a file rather than text based output. Created attachment 166908 [details]
wan.pcap
Created attachment 166909 [details]
tun0.pcap
tun0.pcap and wan.pcap (gateway interfaces) were captured simultaneously. A client (archlinux) was connected over OpenVPN to the gateway running r295545. The simplified IPFW ruleset was used and www.gutefrage.net was accessed. The webpage did not load at all.
After the capture, I also tried lowering the MTU of the tun interface on the client from 1500 to 1212 and 1196 but this didn't resolve it.
The problem persists on HEAD (build 20160127). Created attachment 167003 [details]
ffon.pcapng
Created attachment 167004 [details]
ffoff.pcapng
I did another dump on a client on the local network (directly connected to gateway, no OpenVPN involved). The gateway ran 10.2-STABLE r295264 GENERIC. The symptoms when fastforwarding was enabled were the same as with r295285.
I did 2 dumps on the client:
net.inet.ip.fastforwarding=0 on the gateway ===> ffoff.pcapng ===> HTTP/GET happens at packet 10
net.inet.ip.fastforwarding=1 on the gateway ===> ffon.pcapng ===> HTTP/GET happens at packet 36
The only significant difference I see is that when fastforwarding is turned off the gateway sends an ICMP Fragmentation needed to the client whereas when fastforwarding is on this doesn't happen, and the client keeps retransmitting the HTTP/GET packet. Could it be that the ip_fastfwd.c doesn't correctly send ICMP when the destination is unreachable and fragmentation is required?
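The forwarding rule behind this observation can be sketched as follows. This is a minimal model, not the kernel code: `forward_decision` and its return strings are hypothetical names, and an IPv4 header without options is assumed. Only the rule itself (an oversized packet with DF set is dropped and answered with an ICMP "fragmentation needed" error carrying the next-hop MTU) is standard path MTU discovery behaviour.

```python
import math

IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def forward_decision(packet_len, df_set, mtu):
    """Model what a router does with a packet of packet_len bytes on a link with this MTU."""
    if packet_len <= mtu:
        return "forward"
    if df_set:
        # The sender relies on this ICMP error to shrink its packets.
        # If the error never arrives, the sender keeps retransmitting the same packet.
        return "drop and send ICMP frag needed (MTU %d)" % mtu
    # DF clear: fragment. Every fragment except the last carries a payload
    # that is a multiple of 8 bytes.
    payload = packet_len - IP_HEADER_LEN
    per_fragment = (mtu - IP_HEADER_LEN) // 8 * 8
    return "fragment into %d packets" % math.ceil(payload / per_fragment)


print(forward_decision(1500, True, 576))
print(forward_decision(1500, False, 576))
print(forward_decision(500, True, 576))
```

In this model, the retransmitted HTTP/GET seen in ffon.pcapng corresponds to the first case with the ICMP error lost or never sent.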
Thanks for the update and the new files. I am still trying to reproduce this on HEAD. With your latest test, were you still using IPFW and NAT, or was this just vanilla forwarding?
I have set up some test hosts in the lab. The setup is:
    172.16.0.1       172.16.0.2  172.16.1.2       172.16.1.3
    source     <->         router          <->    sink
               1500                        576
and I'm doing a
    ping -s 1024 -D 172.16.1.3
and I do see the MTU error returning on the source:
    36 bytes from 172.16.0.2: frag needed and DF set (MTU 576)
    Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
     4  5  00 041c 0000   0 0000  3f  01 debc 172.16.0.1  172.16.1.3
Your hardware addresses in the pcaps are obfuscated, so it's hard to tell what's happening at layer 2.
(In reply to g_amanakis from comment #20) Jumping back a bit. I definitely see data to your client on both interfaces in the tun and em0 traces. Looks like the client is 70.78.231.153?
(In reply to George V. Neville-Neil from comment #25) Yes, correct. 70.78.231.153 is the WAN IP of the gateway. I used tcprewrite to spoof the MAC addresses.
You only see this with IPFW + NAT, right? If you just use tryforward or, on older versions, fastforward, things are fine?
Correct, up until now the problem occurred with IPFW and in-kernel NAT for IPv4. I will test using plain fastforwarding (without NAT on IPv4) and report.
If you have a natd.conf file, that would also be helpful.
I am using in-kernel NAT; you can see the configuration in the ipfw.txt attached above.
Looking at the pcap files, I see that the client is always advertising an MSS of 1460. In your setup, what are the MTUs of each interface involved?
(In reply to George V. Neville-Neil from comment #31) The MTU is 1500 on all interfaces (on the WAN and LAN interfaces of the gateway, as well as on the client).
Really? Then why is there the "packet too large" ICMP message?
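The numbers quoted in this exchange can be cross-checked with a little header arithmetic. The sketch below assumes a 20-byte IPv4 header and a 20-byte TCP header (no options) and the 8-byte ICMP echo header; the helper names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def tcp_mss(mtu):
    """Largest TCP payload that fits in one unfragmented packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def ping_packet_len(payload):
    """IP total length of an echo request sent with 'ping -s payload'."""
    return payload + ICMP_HEADER + IPV4_HEADER


print(tcp_mss(1500))                # 1460, the MSS the client advertises
print(hex(ping_packet_len(1024)))   # 0x41c, the Len field in the quoted header dump
print(ping_packet_len(1024) > 576)  # True, so the ping cannot cross the 576-byte link unfragmented
```

An advertised MSS of 1460 is therefore consistent with a 1500-byte interface MTU on the client, which is why a "fragmentation needed" error on the path points at a smaller MTU somewhere beyond the client.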
The only hypothesis I have is that when fragmentation is needed for an outgoing packet (I have no idea why) and the client sending this packet is behind NAT, the gateway cannot see the real IP of the client in order to send it the ICMP fragmentation-required message, because the icmp_error() occurs after the outgoing packet has gone through the pfil hooks (and ipfw). Can someone watching this report reproduce the symptoms using IPFW+NAT?
I just did a:
    $ route get 8.8.8.8
and got:
       route to: google-public-dns-a.google.com
    destination: default
           mask: default
        gateway: 69.251.142.1
            fib: 0
      interface: em0
          flags: <UP,GATEWAY,DONE,STATIC>
     recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
           0         0         0         0       576         1         0
However:
    # netstat -i
    Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts
    em0    1500 <Link#1>      00:aa:bb:cc:dd:ee   136920     0     0   103864
    em0       - fe80::225:90f fe80::225:90ff:fe      190     -     -      107
    em0       - 2001:558:6020 2001:558:6020:167      108     -     -       96
    em0       - 69.251.142.0/ c-69-251-143-153.     4555     -     -     4982
em0 is the WAN interface. Why is there this discrepancy, 576 versus 1500?
I figured it out: dhcpcd changed the MTU of em0 each time it acquired a lease. Commenting out the option in dhcpcd.conf ("#option interface_mtu") leaves the MTU at 1500. I think this resolves the whole thing. I am going to test it right now and report back.
Setting dhcpcd to ignore the interface MTU resolves my problem. However, if I manually reduce the MTU, the problem reappears and the client receives no fragmentation-needed ICMP. I am leaving this to the discretion of George.
I think the problem lies here:
    =======8<======== ip_fastfwd.c
    if (ip_off & IP_DF) {
            IPSTAT_INC(ips_cantfrag);
            icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, mtu);
            goto consumed;
    } else {
    =======8<========
By the time the icmp_error() happens, m has gone through the firewall (see "Step 5:" in ip_fastfwd.c), meaning that outgoing NAT has already happened and that the source address of the packet has already been changed to that of the gateway.
Thus when the icmp_error() takes place the ICMP is not sent to the client. Is this correct?
(In reply to g_amanakis from comment #38) That does look suspicious. In the ip_forward() routine we make a copy of the mbuf first. I will look at a patch that synchronizes the way these work.
I'd like to ask about your various MTUs. Are they mismatched across any of the links? I ask because I am trying to get the code to misbehave here and I have had a hard time getting that to happen. In a simple, 3-host test I'm trying this:
    source -> router -> sink
    MTU    1500      576
When you say "em0 is set to 576", where, in your setup, does that exist?
    client -> LAN-router-WAN -> webserver (e.g. gutefrage.net)
    1500      1500       576    1500?
client MTU:
    interface: 1500
    route: 1500
router MTU:
    LAN-interface: 1500
    LAN-route: 1500
    WAN-interface (em0): 1500
    WAN-route: 576 (set by dhcpcd when run on WAN-interface)
webserver MTU: probably 1500. I don't know this for sure.
Does this help?
(In reply to g_amanakis from comment #40) Yes, it does. Also, without IPFW and NAT, that is, if you can make this a regular routing setup, do you see the problem? My theory is that you will not, and that it requires the packet to go through IPFW to show the issue.
You are correct, I can confirm this. On this setup without NAT involved (ipfw was set to pass all):
    client --> LAN-router-LAN --> server
    1500       1500       576     1500
I can see the client getting an ICMP fragmentation-required message from the router when it tries to access the server on the other side. Thus, the client can access the server.
(In reply to g_amanakis from comment #34) Hi, my home router is a NanoBSD image I just updated to 10.3:
    10.3-BETA2 FreeBSD 10.3-BETA2 #0 r295652: Tue Feb 16 10:09:07 CET 2016
It's running OpenVPN, ipfw and NAT. I connected with my laptop (running head) via OpenVPN and had no problems. I just ran a few basic things: ssh, http, transferred a few files with those protocols and had no problems.
I'm not sure about the MTUs; both connections are residential ADSL, so I guess both use 1492 at the WAN level and 1500 in the LAN. One more difference is that the OpenVPN package was compiled in a poudriere 10.2 jail, not on the machine itself and not on 10.3, but this should not make a difference IMHO. Not sure if this helps in some way. I can't run too many tests, but if something specific is needed I can do it.
Created attachment 167113 [details]
Only use tryfoward() when pfilter hooks are not present
This is a patch against HEAD that I'm testing. It ought to also apply against 10-STABLE though with an offset. It bypasses tryforward() when there are pfil hooks present which will prevent issues from rewritten packets not having error reports generated.
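The failure mode hypothesised earlier in this thread, and the idea of keeping a copy of the packet as ip_forward() is said to do, can be modelled in user space. This is only an illustration under stated assumptions: the addresses, the `Packet` class and the function names are all hypothetical, and nothing here reproduces the attached patches or the real kernel code paths.

```python
from dataclasses import dataclass, replace

CLIENT = "192.168.1.10"      # hypothetical LAN client address
GATEWAY_WAN = "203.0.113.1"  # hypothetical gateway WAN address
SERVER = "198.51.100.7"      # hypothetical web server address


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    length: int
    df: bool


def nat_out(pkt):
    """Outbound NAT: rewrite the source address to the gateway's WAN address."""
    return replace(pkt, src=GATEWAY_WAN)


def icmp_target_after_nat(pkt, mtu):
    """Address the ICMP error using the packet as it looks after NAT."""
    translated = nat_out(pkt)
    if translated.length > mtu and translated.df:
        return translated.src  # the gateway itself; the client never hears about it
    return None


def icmp_target_with_copy(pkt, mtu):
    """Keep the pre-NAT packet around and address the ICMP error from that copy."""
    original = pkt
    translated = nat_out(pkt)
    if translated.length > mtu and translated.df:
        return original.src  # the real client
    return None


pkt = Packet(src=CLIENT, dst=SERVER, length=1500, df=True)
print(icmp_target_after_nat(pkt, 576))
print(icmp_target_with_copy(pkt, 576))
```

In this model the first variant addresses the error to the gateway's own WAN address, while the second reaches the client, which matches the symptom of a client that keeps retransmitting without ever learning the smaller MTU.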
The patch resolves the OpenVPN bug (tested with the above ipfw.txt ruleset and OpenVPN config files). I will report in a couple of hours if it also resolves the bug in a direct LAN connection.
This also resolves the bug in a direct LAN connection.
(In reply to George V. Neville-Neil from comment #44)
> Created attachment 167113 [details]
> Only use tryfoward() when pfilter hooks are not present
>
> This is a patch against HEAD that I'm testing. It ought to also apply
> against 10-STABLE though with an offset. It bypasses tryforward() when
> there are pfil hooks present which will prevent issues from rewritten
> packets not having error reports generated.
With this patch we will lose tryforward's goal (fastforwarding by default) on routers where a firewall is present. I guess most routers have a firewall.
(In reply to Andrey V. Elsukov from comment #47) It turns out that for this bug fastforward (the predecessor to tryforward) would never have worked either. I am working up an alternate fix and testing it now, but the issue is now time. This bug is holding up the 10.3 release.
(In reply to George V. Neville-Neil from comment #48)
> It turns out that for this bug fastforward (the predecessor to tryforward)
> would never have worked either. I am working up an alternate fix and
> testing it now, but the issue is now time. This bug is holding up the 10.3
> release.
But for those for whom fastforwarding worked (i.e. IPSEC is disabled and ipfw is enabled), now it will never work. I think it is easiest and best to revert this MFC for 10.3 and properly fix it in head/.
Created attachment 167150 [details]
Copy the mbuf for use in icmp error messages.
Comment on attachment 167150 [details]
Copy the mbuf for use in icmp error messages.
In the "Copy the mbuf" patch, some paths seem to either double-free or leak an mbuf. I can comment on specific lines, if you'd like.
I am now tracking an updated patch in Phabricator: https://reviews.freebsd.org/D5330 That's where the rest of this will be carried out.
ping?
(In reply to Matthias Andree from comment #53)
> ping?
I still have the same router based on NanoBSD; in the meantime I updated the image to 11.0-RELEASE. As before, everything is working fine for me, and I'm not seeing this problem. But most probably my configuration differs from the reporter's.
^Triage: overcome by events.
Created attachment 166844 [details]
server.conf
r295285 in 10.2-STABLE breaks OpenVPN server functionality. Tested with OpenVPN 2.3.10 on amd64 bare-metal hardware with IPv4. Clients connected to the OpenVPN server experience slow IPv4 web traffic and connection resets. Clients connect via IPv4 UDP to the server, and in-kernel NAT is performed on the external interface. OpenVPN configs are attached. The kernel has IPSEC, IPSEC_NAT_T and VIMAGE enabled, and SCTP disabled.