Created attachment 222023
A sample of MD5 signed traffic

I was testing net/bird on 13.0-ALPHA3 and found that a BGP MD5-signed IPv4 session with another instance of Bird running on stable/11 cannot be established. Please let me describe the scenario.

On the affected machine running 13.0-ALPHA I have:

# cat /etc/ipsec.conf
flush ;
add 172.31.1.2 172.31.1.6 tcp 0x1000 -A tcp-md5 "abigpassword" ;
add 172.31.1.6 172.31.1.2 tcp 0x1001 -A tcp-md5 "abigpassword" ;

# setkey -D
172.31.1.6 172.31.1.2
	tcp mode=any spi=4097(0x00001001) reqid=0(0x00000000)
	A: tcp-md5  61626967 70617373 776f7264
	seq=0x00000000 replay=0 flags=0x00000040 state=mature
	created: Jan 30 15:32:23 2021	current: Jan 30 15:54:43 2021
	diff: 1340(s)	hard: 0(s)	soft: 0(s)
	last: Jan 30 15:33:05 2021	hard: 0(s)	soft: 0(s)
	current: 440(bytes)	hard: 0(bytes)	soft: 0(bytes)
	allocated: 131	hard: 0	soft: 0
	sadb_seq=1 pid=7647 refcnt=1
172.31.1.2 172.31.1.6
	tcp mode=any spi=4096(0x00001000) reqid=0(0x00000000)
	A: tcp-md5  61626967 70617373 776f7264
	seq=0x00000000 replay=0 flags=0x00000040 state=mature
	created: Jan 30 15:32:23 2021	current: Jan 30 15:54:43 2021
	diff: 1340(s)	hard: 0(s)	soft: 0(s)
	last: Jan 30 15:33:05 2021	hard: 0(s)	soft: 0(s)
	current: 4111(bytes)	hard: 0(bytes)	soft: 0(bytes)
	allocated: 52	hard: 0	soft: 0
	sadb_seq=0 pid=7647 refcnt=1

On the machine running stable/11 I have:

# cat /etc/ipsec.conf
flush ;
add 172.31.1.6 172.31.1.2 tcp 0x1000 -A tcp-md5 "abigpassword" ;
add 172.31.1.2 172.31.1.6 tcp 0x1001 -A tcp-md5 "abigpassword" ;

# setkey -D
172.31.1.2 172.31.1.6
	tcp mode=any spi=4097(0x00001001) reqid=0(0x00000000)
	A: tcp-md5  61626967 70617373 776f7264
	seq=0x00000000 replay=0 flags=0x00000040 state=mature
	created: Jan 30 15:18:20 2021	current: Jan 30 15:55:13 2021
	diff: 2213(s)	hard: 0(s)	soft: 0(s)
	last: Jan 30 15:26:32 2021	hard: 0(s)	soft: 0(s)
	current: 8031(bytes)	hard: 0(bytes)	soft: 0(bytes)
	allocated: 101	hard: 0	soft: 0
	sadb_seq=1 pid=85243 refcnt=1
172.31.1.6 172.31.1.2
	tcp mode=any spi=4096(0x00001000) reqid=0(0x00000000)
	A: tcp-md5  61626967 70617373 776f7264
	seq=0x00000000 replay=0 flags=0x00000040 state=mature
	created: Jan 30 15:18:20 2021	current: Jan 30 15:55:13 2021
	diff: 2213(s)	hard: 0(s)	soft: 0(s)
	last: Jan 30 15:33:05 2021	hard: 0(s)	soft: 0(s)
	current: 10226(bytes)	hard: 0(bytes)	soft: 0(bytes)
	allocated: 131	hard: 0	soft: 0
	sadb_seq=0 pid=85243 refcnt=1

So far, everything looks fine. For easier debugging I took nc(1) and tried to transmit some signed TCP segments. Segments sent from stable/11 arrive at stable/13 correctly signed, but the responding traffic is reported to be signed incorrectly (please see the attached dumps). The handshake looks fine, but the segments sent later from the affected stable/13 have invalid signatures, which is reported accordingly on both sides.
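For readers who want to check a captured segment by hand, here is a minimal sketch of the digest described in RFC 2385 for IPv4: MD5 over the TCP pseudo-header, the fixed 20-byte TCP header with the checksum zeroed and options excluded, the payload, and then the key. The function name tcp_md5_digest and its parameters are made up for this illustration and are not taken from the FreeBSD sources. It also decodes the key bytes shown by setkey -D above.

```python
import hashlib
import socket
import struct

# Key bytes as printed by "setkey -D" above; they decode to the
# password from ipsec.conf, i.e. b"abigpassword".
KEY_FROM_SETKEY = bytes.fromhex("61626967 70617373 776f7264")


def tcp_md5_digest(src_ip, dst_ip, tcp_header, payload, key):
    """Illustrative RFC 2385 digest for one IPv4 TCP segment.

    tcp_header is the full TCP header as captured (options included);
    only its first 20 bytes are hashed, but the full length is needed
    for the pseudo-header. Hypothetical helper, not FreeBSD code.
    """
    # Data offset lives in the upper 4 bits of byte 12, in 32-bit words.
    header_len = (tcp_header[12] >> 4) * 4
    if header_len < 20 or len(tcp_header) < header_len:
        raise ValueError("truncated or malformed TCP header")

    # 1. Pseudo-header: source, destination, zero byte, protocol, TCP length.
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, header_len + len(payload))
    )

    # 2. Fixed TCP header without options, checksum (bytes 16-17) set to zero.
    fixed = bytearray(tcp_header[:20])
    fixed[16:18] = b"\x00\x00"

    # 3. Segment data, then 4. the shared key.
    md5 = hashlib.md5()
    md5.update(pseudo)
    md5.update(bytes(fixed))
    md5.update(payload)
    md5.update(key)
    return md5.digest()
```

Comparing the 16-byte result with the digest carried in the segment's TCP option 19 shows which side computed it wrongly. Note that source and destination must be given in the direction the segment actually travelled.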
It turns out to be quirkier than it looked at first glance. In this setup igb(4), lagg(4), vlan(4) and if_bridge(4) are involved. Since the machine was upgraded from a recent 12.2-STABLE and I still have the old BE, I rebooted into it to check whether 12.2-STABLE was affected. The machine is not running Bird in production; I only occasionally test peering setups here and haven't done so for a long while. After booting into 12.2-STABLE it turned out that the MD5-signed session cannot be established there either.

The interface used for peering is a vlan(4) created on top of a lagg(4) which aggregates two igb(4)s; moreover, this lagg(4) was a member of an if_bridge(4). To simplify things, as a first step I destroyed the bridge and reloaded the IPsec rules, and everything went back to normal: the MD5-signed BGP session was established (12.2-STABLE). The diagnosis is not so obvious, though, since the same steps taken under 13.0-STABLE (removing the lagg from the bridge and destroying the bridge) don't change anything: the MD5 signatures of segments originating from this host are still invalid. I will test a simplified scenario later, with neither lagg(4) nor vlan(4) involved, and I suspect TCP MD5 will work fine in such a setup.
It is a persistent issue for me. I have tested it on bare metal and bare NICs, but with nc(1) only. Perhaps someone else will be able to reproduce it and definitively confirm or deny this? Anyway, the most recent 11.4-STABLE seems to be fine.
The setting net.inet.tcp.functions_default=rack was the culprit. Probably TCP RACK is not supposed to support TCP MD5 and this bug has to be closed, but let people from the project decide and give some feedback here. I have done more tests with the most recent stable/{12,13}, and with net.inet.tcp.functions_default=freebsd TCP MD5 signatures work fine.

I had tried reverting this setting to the default before reporting this as a bug, but that did not always help. I am sorry for the noise on Bugzilla and the freebsd-net@ mailing list, but in the initial tests disabling RACK wasn't sufficient to get TCP MD5 working (probably because I accidentally flushed the IPsec rules in the meantime), so I took some ad-hoc steps to repair it quickly: disabling some devices, reverting sysctls to default values, etc.
I think neither RACK nor BBR supports TCP MD5. Up to now, this was also not intended, I think, but I'll let rrs@ confirm. I don't think it is a regression, since RACK did not support it in the past either.
(In reply to Michael Tuexen from comment #4)
> I think neither RACK nor BBR do support TCP MD5. Up to now, this was also not intended, I think, but I'll let rrs@ confirm.

Do any of the stacks support TCP-AO? I think that should be a requirement, since it is the replacement for TCP-MD5. And though TCP-MD5 is officially deprecated, given how slowly TCP-AO has rolled out it would probably be a good idea to keep supporting TCP-MD5 in all stacks (this is not a hard thing to implement) until TCP-AO is more widely deployed. Most of my BGP peers fall back to TCP-MD5 if you can't do TCP-AO.
(In reply to Rodney W. Grimes from comment #5) Is TCP-AO supported by the base stack?
(In reply to Michael Tuexen from comment #6)
Not that I can find, though I found some material on the internet suggesting that Juniper sponsored some work on it; where that ended up I have no idea. This is one of my reasons for wanting TCP-MD5 support to be prevalent: without it you cannot protect BGP sessions, and most BGP peers request MD5 protection at a minimum. It's probably OK that RACK does not have it, but that should be documented somehow, or at least an error should be raised if one tries to use it with RACK. Silent failure like this person experienced is painful, and people dealing with BGP already have enough pain.
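Until the kernel reports this itself, an operator-side preflight check is one way to avoid the silent failure. The following is only a sketch: the helper names are invented here, the set of affected stacks is taken from this thread ("rack" is the sysctl value reported above, "bbr" is assumed by analogy), and it says nothing about kernels that already contain a fix.

```python
import subprocess

# Stack names that, per this thread, did not compute TCP MD5 correctly.
# "rack" is the value reported above; "bbr" is an assumption by analogy.
STACKS_WITHOUT_MD5 = {"rack", "bbr"}


def md5_warning(stack_name):
    """Return a warning string if the stack is known not to sign correctly."""
    name = stack_name.strip().lower()
    if name in STACKS_WITHOUT_MD5:
        return (
            "TCP stack '%s' may emit invalid TCP-MD5 signatures; "
            "consider net.inet.tcp.functions_default=freebsd" % name
        )
    return None


def default_tcp_stack():
    """Read net.inet.tcp.functions_default via sysctl(8); None if unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "net.inet.tcp.functions_default"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Not FreeBSD, or the sysctl does not exist on this system.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    stack = default_tcp_stack()
    warning = md5_warning(stack) if stack else None
    if warning:
        print(warning)
```

Such a check could run before the BGP daemon starts. It only inspects the system default, so a socket that selects a different stack for itself would not be covered.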
TCP MD5 is currently not supported by the RACK stack. It is planned to add that.
A patch is under review in D40597.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=02b885b09d1e90574162a1442b9ede06cef2b13a

commit 02b885b09d1e90574162a1442b9ede06cef2b13a
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-06-21 20:54:33 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2023-06-21 20:54:33 +0000

    tcp: fix TCP MD5 computation for the BBR and RACK stack

    PR:                     253096
    Reviewed by:            cc, rscheff
    MFC after:              3 days
    Sponsored by:           Netflix, Inc.
    Differential Revision:  https://reviews.freebsd.org/D40597

 sys/netinet/tcp_stacks/bbr.c  | 10 +++----
 sys/netinet/tcp_stacks/rack.c | 66 ++++++++++++++++++++++++++++++++++++-------
 2 files changed, 61 insertions(+), 15 deletions(-)
(In reply to Marek Zarychta from comment #3)
Please test the patch, if possible, and report whether it fixes the problem for you.
(In reply to Michael Tuexen from comment #11)
Thank you for taking care of the problem and finally solving it, Michael! I can confirm that after cherry-picking 02b885b09d1e90574162a1442b9ede06cef2b13a to stable/13, TCP-MD5 can be used with RACK.
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=edff1d344c6bf8f3de2ba1e36b2807fd6d1e7ea8

commit edff1d344c6bf8f3de2ba1e36b2807fd6d1e7ea8
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-06-21 20:54:33 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2023-06-25 19:26:32 +0000

    tcp: fix TCP MD5 computation for the BBR and RACK stack

    PR:                     253096
    Reviewed by:            cc, rscheff
    Sponsored by:           Netflix, Inc.
    Differential Revision:  https://reviews.freebsd.org/D40597

    (cherry picked from commit 02b885b09d1e90574162a1442b9ede06cef2b13a)

 sys/netinet/tcp_stacks/bbr.c  | 10 +++----
 sys/netinet/tcp_stacks/rack.c | 66 ++++++++++++++++++++++++++++++++++++-------
 2 files changed, 61 insertions(+), 15 deletions(-)