There seems to be an issue with path MTU discovery and SSH (and possibly other applications).
I have a FreeBSD router, running 12.0-RELEASE-p9. The router terminates an IPv6 tunnel, using gif (protocol 41). The MTU on the tunnel is 1480, because of the encapsulating IPv4 header.
In my internal network, the MTU is the standard 1500.
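For reference, the MTU and MSS figures involved can be worked out with a little arithmetic. This is only a sketch: it assumes a 20-byte outer IPv4 header without options, the fixed 40-byte IPv6 header, and a 20-byte TCP header. The function names are made up for illustration.

```shell
#!/bin/sh
# Header sizes in bytes (assumed: no IPv4 options, no IPv6 extension headers).
IPV4_HDR=20   # outer IPv4 header added by the gif tunnel
IPV6_HDR=40   # fixed IPv6 header
TCP_HDR=20    # TCP header without options

# MTU left for IPv6 packets inside the tunnel, given the outer link MTU.
tunnel_mtu() { echo $(( $1 - IPV4_HDR )); }

# TCP MSS for IPv6 over a link with the given MTU.
tcp6_mss() { echo $(( $1 - IPV6_HDR - TCP_HDR )); }

echo "tunnel MTU:         $(tunnel_mtu 1500)"
echo "MSS on the LAN:     $(tcp6_mss 1500)"
echo "MSS through tunnel: $(tcp6_mss "$(tunnel_mtu 1500)")"
```

Under these assumptions, a full-sized segment built for the 1500-byte LAN is 20 bytes too large for the tunnel, which is why the router has to answer with a Packet Too Big message.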
From my workstation, when I try to SSH over IPv6 to, for example, freefall.freebsd.org, the connection fails. I can see in packet dumps that the router responds with an ICMPv6 Packet Too Big (PTB) message. That is understandable for the first attempt, since the local link has an MTU of 1500 but the remote link has an MTU of 1480. However, ssh makes multiple attempts with the same packet size and gets a PTB in return each time. Eventually, the connection is reset.
I expected that something, whether in our TCP stack, our IP stack or SSH, would adjust to the lower MTU and resend a shorter packet (possibly splitting the data across multiple TCP segments).
If I try again after the first SSH connection attempt has failed, the connection succeeds, so something does adjust, and later connections no longer exceed the path MTU.
I have a packet capture of the traffic between my workstation, the router and freefall.freebsd.org that will hopefully shed some light on what's going on. It's available on request.
So, I did some more digging, and I think my initial conclusions weren't entirely correct.
I set up the following mini network, with a client on one subnet, and a server on another, and a router in between running PF. All machines are running FreeBSD 12.0.
|server| ---- <MTU 1280> ---- |router| ---- <MTU 1500> ---- |client|
server IP: 2001:db8:ffff:ff00::2
client IP: 2001:db8:ffff:ff10::2
I then try two connections to the server:
One with ssh, running ssh on the client to connect to sshd on the server.
One using netcat:
nc command on server: nc -6 -l 1234
nc command on client: cat /usr/share/examples/IPv6/USAGE | nc -6 ip-of-server 1234
Between the ssh and nc invocations, I wipe the TCP host cache using
I run the above tests with three different router configurations.
First, I use the ruleset modulate.pf.conf, which uses modulate state for state tracking of TCP connections.
Second, I use the ruleset keep.pf.conf, which uses keep state for state tracking.
Third, I disable PF completely.
In the first case, using modulate state, the ssh connection stalls, and it looks like path MTU discovery fails. The nc connection works, though.
In the second and third cases, things work as normal.
I am guessing that 'modulate state' somehow breaks path MTU discovery, so that the PTB message sent by the router isn't recognized by the client, but this is just a guess.
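One way such a failure could arise is sketched below. This is a hypothesis for illustration only, not something the captures confirm. TCP stacks commonly validate an ICMP error by checking that the sequence number embedded in it falls inside the window of sent but unacknowledged data. If the embedded sequence number still carried a sequence offset added by the firewall, that check would fail and the PTB would be ignored. The function name and all numbers are made up.

```shell
#!/bin/sh
# Succeeds if seq lies in [snd_una, snd_max), using 32-bit wraparound arithmetic.
# Arguments: snd_una snd_max seq
seq_in_window() {
    una=$1; max=$2; seq=$3
    off=$(( (seq - una) & 4294967295 ))   # distance of seq from snd_una, mod 2^32
    len=$(( (max - una) & 4294967295 ))   # amount of unacknowledged data
    [ "$off" -lt "$len" ]
}

SND_UNA=1000
SND_MAX=2440            # one 1440-byte segment in flight
MODULATE_OFFSET=123456789   # hypothetical offset applied by the firewall

# PTB quoting the sequence number the client actually sent.
if seq_in_window "$SND_UNA" "$SND_MAX" 1000; then
    echo "original seq: PTB accepted"
else
    echo "original seq: PTB ignored"
fi

# PTB quoting the sequence number after the hypothetical offset was added.
if seq_in_window "$SND_UNA" "$SND_MAX" $(( 1000 + MODULATE_OFFSET )); then
    echo "shifted seq: PTB accepted"
else
    echo "shifted seq: PTB ignored"
fi
```

Comparing the TCP sequence number quoted inside the PTB with the one the client sent, in traces taken on both sides of the router, would show whether this is what happens here.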
I've attached the two different PF rule sets used, as well as /etc/rc.conf from the router, and pcap traffic dumps from all three runs.
Created attachment 206496
PF ruleset using modulate state
Created attachment 206497
pcap traffic dump using modulate state PF conf
Created attachment 206498
PF ruleset using modulate state
Created attachment 206499
PF ruleset using keep state
Created attachment 206500
pcap traffic dump using keep state PF conf
Created attachment 206501
pcap traffic dump not using PF
Created attachment 206502
I swapped the router for one running OpenBSD with a similar setup. OpenBSD is affected as well.
I guess you traced at the router.
Could you run the first and third scenario with tracing at the client and the router at the same time? I'm wondering if the client receives the ICMPv6 PTB message at all.
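A capture limited to PTB messages makes this easy to check. The following is a minimal sketch: the interface name and output path in the usage line are placeholders, and the filter assumes the ICMPv6 header directly follows the fixed 40-byte IPv6 header (no extension headers). ICMPv6 type 2 is Packet Too Big.

```shell
#!/bin/sh
# BPF filter: ICMPv6 packets whose type field (first byte after the
# fixed 40-byte IPv6 header) is 2, i.e. Packet Too Big.
PTB_FILTER='icmp6 and ip6[40] == 2'

# Capture PTB messages on interface $1 into pcap file $2 (run as root).
capture_ptb() {
    tcpdump -n -i "$1" -w "$2" "$PTB_FILTER"
}

# Usage (placeholder interface and file name):
#   capture_ptb em0 /tmp/client-ptb.pcap
```

Running this on the client and the router at the same time shows whether a PTB that leaves the router also arrives at the client.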
Right now, I don't see a problem with the TCP stack on the client side.
To improve the situation, you could enable blackhole detection by setting
sudo sysctl -w net.inet.tcp.pmtud_blackhole_detection=1
Then the TCP connections should not stall, but should automatically reduce the MSS after a (small) number of retransmissions.
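To see what the stack would fall back to, the related sysctls can be inspected alongside the switch itself. This is a sketch for FreeBSD only; the helper name is made up, and the values reported depend on the system.

```shell
#!/bin/sh
# Sysctls related to PMTUD blackhole detection on FreeBSD:
#   pmtud_blackhole_detection  on/off switch
#   pmtud_blackhole_mss        fallback MSS for IPv4
#   v6pmtud_blackhole_mss      fallback MSS for IPv6
BLACKHOLE_OIDS="net.inet.tcp.pmtud_blackhole_detection
net.inet.tcp.pmtud_blackhole_mss
net.inet.tcp.v6pmtud_blackhole_mss"

# Print the current values (FreeBSD only).
show_blackhole_settings() {
    # Intentionally unquoted so each OID becomes a separate argument.
    sysctl $BLACKHOLE_OIDS
}

# Usage:
#   show_blackhole_settings
```

To keep the setting across reboots, the line net.inet.tcp.pmtud_blackhole_detection=1 can be added to /etc/sysctl.conf.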