Summary: | [pf] [ip6]: 'scrub reassemble tcp' breaks IPv6 packet checksum on SYN ACK | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Mark.Martinec | ||||
Component: | kern | Assignee: | Kristof Provost <kp> | ||||
Status: | Open --- | ||||||
Severity: | Affects Only Me | CC: | daniel, fbsdbugzilla, feld, glebius, ilya, pi, viktor.stujber+freebsd-bugs_v4CCPfay | ||||
Priority: | Normal | ||||||
Version: | 9.1-PRERELEASE | ||||||
Hardware: | Any | ||||||
OS: | Any | ||||||
Attachments: |
|
Description
Mark.Martinec
2012-10-12 20:10:00 UTC
Btw, the effect described here looks very similar, checksum errors on a SYN reply with IPv6 and pf:

http://lists.freebsd.org/pipermail/freebsd-stable/2012-July/068990.html
  Regression with jails/IPv6/pf
  Matthew Seaman <m.seaman@infracaninophile.co.uk>
  Thu Jul 26 23:10:43 UTC 2012

Mark

Responsible Changed
From-To: freebsd-bugs->freebsd-pf

Over to maintainer(s).

(In reply to Kurt Jaeger from comment #3)
> See
>
> https://lists.freebsd.org/pipermail/freebsd-net/2014-November/040319.html

Patch from Ermal Luçi inline in:
https://lists.freebsd.org/pipermail/freebsd-pf/2014-November/007500.html

In PR 179392 the commit r274709 worked on checksums. Can someone reproduce the problem with that fix applied?

I just tried the inline patch to these files on 10.1-p3
sys/netinet6/ip6_output.c
sys/netinet6/ip6_var.h
sys/netpfil/pf/pf_ioctl.c
(there was nothing changed in sys/netpfil/pf/pf.c if I'm reading the patch correctly)

It seems it does not work, this rule does not end up with the traffic at ::1.8080;
rdr pass log on igb0 inet6 proto tcp from any to any port 80 -> ::1 port 8080

Although I can see the rule being executed in pflog;
rule 10..16777216/0(match): rdr in on igb0: 2a00:1a28:1200:11::2.56746 > ::1.8080: Flags [S], seq 4110669173, win 65535, options [mss 1440,nop,wscale 6,sackOK,TS val 2462372368 ecr 0], length 0

Can you check whether the bug is still valid for stable/10?

Created attachment 152061
Minimal pf config
I just built a kernel against stable at r277607.
I added the pf devices to the GENERIC kernel and rebuilt; the bug is still there.
Attaching a minimal pf config.
Here is how to test:
On the server with pf:
nc -6 -l 8080
From any other ipv6 enabled server:
nc -6 yourfbsdserversipv6 80
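
When looking at a capture taken during the test above, it can help to verify the TCP checksum of the SYN-ACK independently of tcpdump. The following is a minimal sketch, not code from pf or the kernel: it computes the standard TCP checksum over the IPv6 pseudo-header. The function names are hypothetical, and the segment bytes are assumed to be the raw TCP header plus payload extracted from the capture.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """Return the folded 16-bit one's-complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, length: int) -> bytes:
    """Build the IPv6 pseudo-header for a TCP segment of the given length."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I", length)   # upper-layer packet length
        + b"\x00\x00\x00\x06"         # three zero bytes, next header 6 (TCP)
    )


def tcp6_checksum(src: str, dst: str, segment: bytes) -> int:
    """Compute the checksum for a segment whose checksum field is zero."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF


def tcp6_checksum_ok(src: str, dst: str, segment: bytes) -> bool:
    """Return True if a segment with its checksum field filled in verifies."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF
```

If any field covered by the checksum (for example a TCP option) is rewritten after the checksum was computed and the checksum is not updated, `tcp6_checksum_ok` returns False for that segment.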
(In reply to Gleb Smirnoff from comment #7)
This bug has been valid for 8.x, 9.x, 10.x and is not solved anywhere. There's no need to validate; a fix is strongly needed, though, exactly like it was fixed for ipfw (Bug 145733).

Kristof, can you please look at this bug?

Sorry, markp@. For some unknown reason Bugzilla rewrites kp@FreeBSD.org to your login name. I will assign the bug to myself until this is fixed.

Kristof, can you please look at this bug?

I've thus far been unable to reproduce this on either a bhyve guest (vtnet) or a physical machine (ale(4)).
I might be missing some part of the reproduction scenario, but I don't see what.

My pf.conf:
> scrub all reassemble tcp
> pass all

> curl -6 -L http://tools.ietf.org/rfc/rfc3021.txt | wc -l
Responds almost instantly. There's no 9 second delay.

If you change the listen port of your webserver to 8080 and then change pass all to
rdr pass log on $ext inet6 proto tcp from any to any port 80 -> port 8080
you will see that the rule is executed in pflog but the traffic never ends up on the webserver.

I initially noticed this bug on all of my previous employer's FreeBSD servers when I upgraded to FreeBSD 9.x. The cause was definitely "scrub all reassemble tcp", as removing it across all our servers solved the problem for us. The symptoms we had were long connection establishment times, and once connected, very high latency and terribly slow throughput; ssh over IPv6 was unusable.

(In reply to daniel from comment #14)
I'm still having no luck reproducing this.

The rdr rule you gave as an example (rdr pass log on $ext inet6 proto tcp from any to any port 80 -> port 8080) doesn't actually work because it doesn't specify a redirect target.
I've tested with this one instead:
rdr log on $ext inet6 proto tcp from any to any port 1234 -> 2001:db8::2 port 8080
and things work correctly.

(As an aside, I was initially testing with a rule which redirected to ::1. This doesn't work, and I'm not quite sure yet if I think that's a feature or a bug.)
(In reply to Kristof Provost from comment #16)
Sorry for forgetting to specify the target of ::1 in the reply to you, I must have been tired :)
I did however specify it in the other comments in this PR.

I don't know if it's a missing feature or a bug, but it does work to rdr with a target of 127.0.0.1 for IPv4. This is why I've been pushing this PR for IPv6 (because I thought it was a bug).

Sorry for any misunderstandings.

No worries.

From my perspective that means we're looking at two different issues though.
The first (which I can't reproduce) is that 'scrub all reassemble tcp' breaks TCP checksums.
The second is that rdr to ::1 doesn't work. This I can reproduce, and perhaps even do something about.

Let's look at the second one first, because there's more likely to be progress on that.

Ok, I think I've got a handle on what the problem is.

With rdr to ::1 we fail the scope check in ip6_input() (right after the pfil hook) because we have a packet to localhost with an m->m_pkthdr.rcvif which is not a loopback interface.
That can be fixed by having pf rewrite the rcvif, but that'd special-case rdr to ::1.

We've got a similar problem for the reply. There we've got a packet from ::1 to something else. This fails the scope lookup too.

In essence the problem is that we've already made the routing decision before pf gets the chance to rewrite the destination address.
I'm not quite sure how to fix this though.

If I understand the problem correctly, wouldn't the same problem also occur if you were to rdr IPv6 from a public NIC to a LAN NIC, since you can't affect the routing from pf?
I.e.:
rdr pass log on igb0 inet6 proto tcp from any to 2a00:1a28:1252::2 port 80 -> fc00::1234 port 8080

Or any scenario where you want to rdr the traffic from one NIC to another.

At first glance the issue is in in6_clearscope() and in6_setscope(). Those will only fail for the loopback address (::1) or link-local addresses. The rdr rule does work with GUAs.
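
The scope behaviour described in the comments above can be summarised as a small model. This is only an illustrative sketch of what the thread reports, not a transcription of ip6_input(), in6_clearscope() or in6_setscope(); the function names and the boolean `rcvif_is_loopback` parameter are hypothetical.

```python
import ipaddress


def inbound_fails_scope_check(dst: str, rcvif_is_loopback: bool) -> bool:
    """Model of the reported inbound check: a packet addressed to the
    loopback address is rejected unless it arrived on a loopback interface."""
    return ipaddress.IPv6Address(dst).is_loopback and not rcvif_is_loopback


def rdr_target_affected(target: str) -> bool:
    """Model of the last observation: only loopback (::1) and link-local
    redirect targets run into the scope failure; other targets do not."""
    addr = ipaddress.IPv6Address(target)
    return addr.is_loopback or addr.is_link_local
```

Under this model, a packet redirected to ::1 that arrived on igb0 gives `inbound_fails_scope_check("::1", False) == True`, while a redirect to 2001:db8::2 or to fc00::1234 is not classified as affected, which matches the report that the rule works with GUAs.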
batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.

I have been observing this issue for several years. I believe last time I tested it was on FreeBSD 11.2 from 2017. I re-tested this today on FreeBSD 12.0-RELEASE-p3 r343997 and have not noticed the disruption in tcp traffic that comes from bad checksums. From the tcpdumps, it looks like the OS is correctly performing timestamp randomization/masquerade on behalf of computers on both sides of the connection. So... I guess it's fixed? Don't know when exactly. A second confirmation would be appreciated.