281395 – ICMPv6 neighbor discovery broken under Proxmox

Status:	Closed Works As Intended

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	14.1-STABLE
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	Michael Gmelin

URL:	https://bugzilla.kernel.org/show_bug....
Keywords:	regression

Depends on:
Blocks:

Reported:	2024-09-09 17:56 UTC by Dr. Uwe Meyer-Gruhl
Modified:	2024-09-14 14:53 UTC
CC List:	13 users

See Also:	280701 281397 https://bugzilla.kernel.org/show_bug.cgi?id=99081

Description Dr. Uwe Meyer-Gruhl 2024-09-09 17:56:58 UTC

As already stated in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701, FreeBSD-SA-24:05 causes ICMPv6 neighbor discovery to fail.

I quote the info given there by Franco:

According to multiple users the ICMP patch series causes stalls in neighbor discovery and only a full revert brings back the desired behaviour.

A TCP dump showed that the Cisco is sending ICMP6 neighbour solicitations, which are answered by the opnsense with a large delay.
The Cisco switch loses its IPv6 neighbour.

tcpdump -n -i ix0 icmp6 and host fe80::86b8:2ff:fe1a:c67f

07:34:42.764553 IP6 fe80::86b8:2ff:fe1a:c67f > 2001:xxxx:x:x::x:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:43.852542 IP6 fe80::86b8:2ff:fe1a:c67f > 2001:xxxx:x:x::x:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:44.940525 IP6 fe80::86b8:2ff:fe1a:c67f > 2001:xxxx:x:x::x:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:46.094207 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:47.120778 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:48.201460 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:49.336747 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:50.360952 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:51.385618 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:52.436467 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:53.529962 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:54.617082 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:55.717592 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:56.765964 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:57.796680 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:58.888994 IP6 fe80::86b8:2ff:fe1a:c67f > ff02::1:ff56:2: ICMP6, neighbor solicitation, who has 2001:xxxx:x:x::x:2, length 32
07:34:58.889051 IP6 fe80::3eec:efff:fe70:7326 > fe80::86b8:2ff:fe1a:c67f: ICMP6, neighbor advertisement, tgt is 2001:xxxx:x:x::x:2, length 32

via: https://github.com/opnsense/src/issues/217
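
To put a number on the delay in a capture like the one above, the gap between the first solicitation and the first advertisement can be computed from tcpdump's text output. This is only a minimal sketch: it assumes the default HH:MM:SS.ffffff timestamps and a capture that does not cross midnight, and the function name nd_reply_delay is made up for illustration.

```sh
#!/bin/sh
# nd_reply_delay (hypothetical helper): reads tcpdump text output on stdin and
# prints the seconds between the first neighbor solicitation and the first
# neighbor advertisement that follows it.
nd_reply_delay() {
    awk '
        # Convert an HH:MM:SS.ffffff timestamp to seconds since midnight.
        function secs(ts,    p) {
            split(ts, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        /neighbor solicitation/ && !seen { first = secs($1); seen = 1 }
        /neighbor advertisement/ && seen {
            printf "%.3f\n", secs($1) - first
            exit
        }
    '
}

# Example, reusing the filter from the report:
# tcpdump -n -i ix0 icmp6 and host fe80::86b8:2ff:fe1a:c67f | nd_reply_delay
```

For the capture quoted above (07:34:42.764553 to 07:34:58.889051), that gap is roughly 16.1 seconds.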


If you want to construct a test setup to cover this, try directing the following command from another machine to a FreeBSD machine and look at the results:

while :
do
	ndisc6 -m -n -r 1 fe80::1111:2222:3333:4444 eth0
done

Of course, fill in the target's EUI-64 instead of 1111:2222:3333:4444 and use the correct interface instead of eth0.

You will find that a machine with the SA applied does not always respond in due time to these requests and the requests time out, whereas a machine without the SA always answers correctly.
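
If you would rather have a failure count than watch the loop scroll by, the same probe can be wrapped in a small counter. This is a sketch under one assumption: that ndisc6 exits non-zero when a solicitation times out. The helper name count_timeouts is hypothetical, and the address and interface are the same placeholders as above.

```sh
#!/bin/sh
# count_timeouts RUNS CMD [ARGS...] (hypothetical helper)
# Runs CMD RUNS times and prints how many invocations exited non-zero.
# Assumption: the probe command signals a timeout through its exit status.
count_timeouts() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs probes failed"
}

# Example (fill in the target's address and your interface):
# count_timeouts 100 ndisc6 -m -n -r 1 fe80::1111:2222:3333:4444 eth0
```

Running it once right after the target boots and again a few minutes later should make any change in behaviour easy to compare.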


P.S.: As noted, this is another fallout of not applying the full set of OpenBSD patches to address the issues with stateful ICMPv6 handling introduced by the SA.

Comment 1 Gleb Smirnoff freebsd_committer

2024-09-09 18:43:58 UTC

Dr. Uwe Meyer-Gruhl,

thanks for the report!  Two questions:

1) Where can one get the ndisc6 program used to reproduce? Is it in ports
   or maybe somewhere on github or anywhere else?

2) Is it possible to record those messages from Cisco with 'tcpdump -w' so
   that the exact packets can be replayed in a test environment?

P.S. Assigning to committer of supposed regression.

Comment 2 Dr. Uwe Meyer-Gruhl 2024-09-09 19:33:18 UTC

(In reply to Gleb Smirnoff from comment #1)

ndisc6 is included with many Linux distributions. I think it comes from here:

https://www.remlab.net/ndisc6/

Also, I have seen Windows versions of it and it should work on FreeBSD as well. It is simply an IPv6 client to initiate neighbour discoveries and show the answers to them.

You can initiate those discoveries any way you like, this is just the way I have done it.

Comment 3 Gleb Smirnoff freebsd_committer

2024-09-09 19:46:50 UTC

Your initial report specifies FreeBSD version as 14.1-RELEASE.  The release
itself doesn't have the mentioned SA in it.  So it can have neither the
regression nor the fix.  So, I guess you are running some different version of
FreeBSD.  Can you please clarify that?

In the FreeBSD 14.1-STABLE there is a commit 0121a4baaca0, that is supposed to
fix the bug.  There is also a test case in tests/sys/netpfil/pf/icmp6.sh added
in 4909bd69ddef.  The test case does exactly what you suggested - it uses the
ndisc6 program.  btw, it lives in ports/net/ndisc6.

Can you please confirm or deny that you observe the problem on an up-to-date
FreeBSD 14.1-STABLE?

Comment 4 Dr. Uwe Meyer-Gruhl 2024-09-09 19:48:01 UTC

(In reply to Gleb Smirnoff from comment #1)

Sorry, I meant STABLE...

Comment 5 Gleb Smirnoff freebsd_committer

2024-09-09 20:46:39 UTC

What exact version of 14.1-STABLE do you run? Can you please show 'uname -v'?

Comment 6 Dr. Uwe Meyer-Gruhl 2024-09-09 21:06:08 UTC

"uname -v" shows FreeBSD 14.1-STABLE stable/14-n268665-4938f554469b GENERIC.

I just tried again and ndisc6 shows a few correct and timely answers just after booting:


#ndisc6 -m -n -r 1 fe80::be24:11ff:fe9f:eee9 eth0
Soliciting fe80::be24:11ff:fe9f:eee9 (fe80::be24:11ff:fe9f:eee9) on eth0...
Target link-layer address: BC:24:11:9F:EE:E9
 from fe80::be24:11ff:fe9f:eee9
#

Then, after a few minutes:

#ndisc6 -m -n -r 1 fe80::be24:11ff:fe9f:eee9 eth0
Soliciting fe80::be24:11ff:fe9f:eee9 (fe80::be24:11ff:fe9f:eee9) on eth0...
Timed out.
No response.
#

Comment 7 Michael Gmelin freebsd_committer

2024-09-10 16:01:53 UTC

Just to share a data point from my tests (different setup, so this neither proves nor disproves anything).

I used FreeBSD 13.3-RELEASE-p6 to run ndisc6 from ports.

When running 13.3-RELEASE-p6 on the target, all NDs time out (as expected), while running stable/13, basically all NDs are answered, even when lowering the timeout to 1/10th of a second (when going down to 50ms, timeouts start to happen in bursts, but that's the same with pf disabled). I'm also not seeing any log messages from pf about ICMP messages being too short (using "set debug loud" in pf.conf). Looking at pftop, packet counters for the state entry go up as expected.

Could you share a typical pf.conf that you use while testing (mine was basically: set skip on lo0, set debug loud, pass)?

Comment 8 Michael Gmelin freebsd_committer

2024-09-11 10:32:14 UTC

For comparison, I tried replicating the issue on OPNsense in a VM using OPNsense kernel 24.7.2-nd2 (as mentioned here: https://github.com/opnsense/src/issues/218#issuecomment-2312311377). Unfortunately, it seems that experimental kernel is not available anymore.

Comment 9 Michael Gmelin freebsd_committer

2024-09-11 16:31:16 UTC

I also tried sending NDs with ndisc6 on Debian to today's stable/14 (9f319352d7aca). Both were in VMs (so the setup is a bit sterile), but on two different physical hosts. I still could not reproduce the issue using this setup.

Comment 10 Gleb Smirnoff freebsd_committer

2024-09-12 23:13:57 UTC

Dr. Uwe Meyer-Gruhl,

> #ndisc6 -m -n -r 1 fe80::be24:11ff:fe9f:eee9 eth0
> Soliciting fe80::be24:11ff:fe9f:eee9 (fe80::be24:11ff:fe9f:eee9) on eth0...
> Target link-layer address: BC:24:11:9F:EE:E9
>  from fe80::be24:11ff:fe9f:eee9

This check is exactly what tests/sys/netpfil/pf/icmp6:repeat does.

> Then, after a few minutes:
> 
> #ndisc6 -m -n -r 1 fe80::be24:11ff:fe9f:eee9 eth0
> Soliciting fe80::be24:11ff:fe9f:eee9 (fe80::be24:11ff:fe9f:eee9) on eth0...
> Timed out.
> No response.

To instrument this, I edited /usr/tests/sys/netpfil/pf/icmp6 and in the
function repeat_body() added 'sleep 300' before the last invocation
of ndisc6:

        sleep 300
        atf_check -s exit:0 -o ignore \
          ndisc6 -m -n -r 1 2001:db8::1 ${epair}a
        jexec alcatraz pfctl -ss -vv

I also increased test timeout in the /usr/tests/sys/netpfil/pf/Kyuafile:

atf_test_program{name="icmp6", is_exclusive=true, timeout="360"}

On my bare virtual machine running 14.1-STABLE n268665-4938f554469 GENERIC
the test with added 5 minute delay succeeded:

[root@stable14 ~]# kyua test -k /usr/tests/sys/netpfil/pf/Kyuafile icmp6:repeat
icmp6:repeat  ->  passed  [307.995s]

Can you please perform the same check on your host? If the kyua test
succeeds, but the aforementioned manual test with ndisc6 from an external host
still fails, then we probably need to know more about the pf.conf on
your host and network configuration.

Alternatively, you can work on the icmp6:repeat test case itself, adding
necessary bits to the test virtual machine configuration to make it reproduce
the problem.

Comment 11 Dr. Uwe Meyer-Gruhl 2024-09-13 08:19:23 UTC

I just retried with the same results. I rebooted the FreeBSD VM, got a reply just then, waited a few minutes, then got no more replies.

When I try from the FreeBSD VM locally via:

ndisc6 -m -n -r 1 fe80::be24:11ff:fe9f:eee9 em0

I always get a timeout, which is not really surprising, as this happens on my Linux client machine as well (i.e. when I direct ndisc6 at the machine itself, the request also times out). And BTW: Even when the FreeBSD machine does not answer any ND solicitations any more, it can get answers from the Linux counterpart.

Also, when I get no more ND replies from the outside, the FreeBSD VM still has full outside IPv6 connectivity (i.e. I can ping IPv6 machines on the internet).

The virtual NIC is an E1000, and I have changed nothing on the pf side. I just installed FreeBSD from the ISO and configured it via DHCP and SLAAC.

Comment 12 Michael Gmelin freebsd_committer

2024-09-13 08:32:52 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #11)

It's really strange, as I cannot replicate the result here at all (see my previous comments on the various things I experimented with).

Question: Once you do not get replies anymore, does disabling pf make things work again? Also, what's in pftop (or at least pfctl -s state) at that point?

Comment 13 Michael Gmelin freebsd_committer

2024-09-13 08:34:48 UTC

(In reply to Michael Gmelin from comment #12)

Extra question: What is the output of `ifconfig em0` in your VM?

Comment 14 Dr. Uwe Meyer-Gruhl 2024-09-13 09:19:26 UTC

As I have not changed anything from the default config, I probably do not have pf running on that instance. At least "pfctl -d" gives "pfctl: /dev/pf: No such file or directory".

ifconfig em0 gives:


em0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=48525bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,HWSTATS,MEXTPG>
        ether bc:24:11:9f:ee:e9
        inet 192.168.x.106 netmask 0xffffff00 broadcast 192.168.x.255
        inet6 fe80::be24:11ff:fe9f:eee9%em0 prefixlen 64 scopeid 0x1
        inet6 2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 prefixlen 64 autoconf pltime 14400 vltime 86400
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

Comment 15 Michael Gmelin freebsd_committer

2024-09-13 10:25:33 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #14)

What is the output of: `netstat -s -p icmp6`?

If you're not running pf, then this sounds to me like it is unrelated, since FreeBSD-SA-24:05 only touched two files:

- sys/netpfil/pf/pf.c
- sys/netpfil/pf/pf_lb.c

which are both part of pf.ko.
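
A quick way to tell whether pf can be involved at all on a given host is to check for the pf control device, which is the path pfctl complained about in comment 14. This is a minimal sketch; the helper name pf_available is hypothetical, and the optional path argument exists only so the check can be pointed elsewhere.

```sh
#!/bin/sh
# pf_available [DEVICE] (hypothetical helper)
# Succeeds if the pf control device exists. The default /dev/pf is the path
# pfctl reported as missing in comment 14.
# On FreeBSD, `kldstat -q -m pf` is another way to ask whether the module
# is loaded.
pf_available() {
    dev=${1:-/dev/pf}
    [ -e "$dev" ]
}

# Example:
# if pf_available; then echo "pf is present"; else echo "pf is not loaded"; fi
```

If this reports that pf is absent while neighbor discovery still stalls, the cause most likely lies outside the files the SA touched.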

I also checked commits on stable that followed to fix the various issues the SA introduced. From what I can tell, they all only touched pf.ko.

I would still like to replicate what you're experiencing, but I would need to get a better understanding of your test setup to do so.

Comment 16 Dr. Uwe Meyer-Gruhl 2024-09-13 10:48:17 UTC

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 3
                neighbor advertisement: 1
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 13
                neighbor solicitation: 1
                neighbor advertisement: 1
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

As for pf: The bug is just a reminder for one of the remaining issues of 280701, which has been closed because of your "official policy of one bug per bug report". I simply installed FreeBSD 14.1 plain vanilla in order to reproduce the cited bugs that turned up on OpnSense 24.7.3 and found ND broken under FreeBSD 14.1 in much the same way as it was on OpnSense.

I tried to replicate this on plain FreeBSD so that the bug report would not be dismissed as "downstream only". As a matter of fact, I wanted to change as little as possible and did nothing besides what I stated, yet I get these results on this kernel.

I know for a fact that this specific issue was not present with FreeBSD 14.1-RELEASE before I installed the FreeBSD 14.1-STABLE kernel, and it is also gone in OpnSense 24.7.4, on which the whole SA-24:05 has been reverted.

Comment 17 Michael Gmelin freebsd_committer

2024-09-13 11:52:04 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #16)

Just to be clear, the bug report is appreciated and I'm not arguing if there is a problem or not. I just want to figure out how to replicate what you (and others) are experiencing. So far I wasn't able to do this, so it would be good to understand more about your setup. Like, network environment, hypervisor you're running, how many cores assigned to the vm etc.

That said, your netstat output shows:

        Input histogram:
                router advertisement: 13
                neighbor solicitation: 1
                neighbor advertisement: 1

So this reads only one (1) neighbor solicitation. Is this after it broke? (for comparison, in my test setup this number is in the thousands and counting).
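
Since the question here is whether the input counter moves at all, it may help to pull just that number out of the netstat output and compare it before and after a probe. This is a sketch that assumes the section layout shown in comment 16; the helper name icmp6_input_ns is hypothetical.

```sh
#!/bin/sh
# icmp6_input_ns (hypothetical helper): reads `netstat -s -p icmp6` output on
# stdin and prints the "neighbor solicitation" count from the Input histogram
# section, ignoring the line of the same name in the Output histogram.
icmp6_input_ns() {
    awk '
        /Input histogram:/   { in_input = 1; next }
        /Histogram of error/ { in_input = 0 }
        in_input && /neighbor solicitation:/ { print $NF; exit }
    '
}

# Example: sample the counter, send one probe from the other machine, sample again.
# before=$(netstat -s -p icmp6 | icmp6_input_ns)
# ... run ndisc6 from the client ...
# after=$(netstat -s -p icmp6 | icmp6_input_ns)
# echo "input neighbor solicitations: $before -> $after"
```

If the number stays the same across a failed probe, the solicitation was not counted by the ICMPv6 input path on the target.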

Additional things I tried:
- switch vm to e1000 instead of virtio-net
- assign two cores to vm instead of one
- build the exact kernel version you've been testing with

Still, it keeps on working without issues.

One difference I noticed between your and my `ifconfig em0` output is that I have a third inet6 address assigned (ULA, fd00::), but I don't think that this makes a difference.

Comment 18 Dr. Uwe Meyer-Gruhl 2024-09-13 11:58:38 UTC

I appreciate your stance on this. Do you use Proxmox as the hypervisor? I will try to add a ULA later...

Yes, this was after it broke. I will also try to see what happens before (i.e. just after boot).

Comment 19 Michael Gmelin freebsd_committer

2024-09-13 12:08:15 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #18)

I'm currently running this on bhyve (I previously also did some tests on virtualbox).

A few more questions to better understand the situation:

- What is the output of `ndp -na` right after boot and then again after receiving the first ND sent by ndisc6?
- Does `ndp -c` unbreak things temporarily?

Thanks!

Comment 20 Dr. Uwe Meyer-Gruhl 2024-09-13 13:39:03 UTC

So:

After boot:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 3
                neighbor advertisement: 1
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 1
                neighbor solicitation: 1
                neighbor advertisement: 1
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 25s       R R
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 4
                neighbor advertisement: 2
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 1
                neighbor solicitation: 2
                neighbor advertisement: 2
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes


After 1 successful ndisc6 from the Linux client:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 4
                neighbor advertisement: 2
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 1
                neighbor solicitation: 2
                neighbor advertisement: 2
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

After 2 successful ndisc6s:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 4
                neighbor advertisement: 3
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 1
                neighbor solicitation: 3
                neighbor advertisement: 2
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 23h59m37s S R
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R
fe80::d435:77ff:fe88:2299%em0        d6:35:77:88:22:99    em0 11s       R


The first LL-addr is the router, the last one is the Linux client.

After ~2 minutes and 1 unsuccessful ndisc6:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 4
                neighbor advertisement: 3
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 2
                neighbor solicitation: 3
                neighbor advertisement: 2
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

After 2 unsuccessful ndisc6s:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                router solicitation: 1
                neighbor solicitation: 4
                neighbor advertisement: 3
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                router advertisement: 2
                neighbor solicitation: 3
                neighbor advertisement: 2
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 23h53m54s S R
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R
fe80::d435:77ff:fe88:2299%em0        d6:35:77:88:22:99    em0 23h54m28s S


Finally, ndp -c does not fix the problem and gives:

# ndp -c
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 23h53m7s  S R
fe80::62be:b4ff:fe16:a800 (fe80::62be:b4ff:fe16:a800) deleted
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
ndp: delete 2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9: Operation not permitted
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R
ndp: delete fe80::be24:11ff:fe9f:eee9: Operation not permitted
fe80::d435:77ff:fe88:2299%em0        d6:35:77:88:22:99    em0 23h53m40s S
fe80::d435:77ff:fe88:2299 (fe80::d435:77ff:fe88:2299) deleted

BTW: Afterwards, I only see:

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                echo: 2
                router solicitation: 1
                neighbor solicitation: 6
                neighbor advertisement: 4
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                echo reply: 2
                router advertisement: 3
                neighbor solicitation: 4
                neighbor advertisement: 4
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit


So it seems the ND requests either do not reach the FreeBSD VM or they do not trigger a neighbor entry any more. I would bet it is the latter, as another failed ndisc6 shows:

# netstat -s -p icmp6
icmp6:
        0 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                echo: 2
                router solicitation: 1
                neighbor solicitation: 6
                neighbor advertisement: 4
                MLDv2 listener report: 1
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 total packets dropped due to failed NDP resolution
        Input histogram:
                echo reply: 2
                router advertisement: 3
                neighbor solicitation: 4
                neighbor advertisement: 7
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                0 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        0 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        0 bad neighbor solicitation messages
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes

So the machine seems to see / count the NDs, but it neither answers them nor creates new entries. It does re-create the router entries:

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
2001:xxxx:xxxx:xxxx:62be:b4ff:fe16:a800 60:be:b4:16:a8:00   em0 23h56m14s S R
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 23h56m5s  S R
2001:a61:5fb:2e10:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R

...but not the Linux client, unless I ping the VM directly, which works - afterwards, I see it again:

# ndp -na
Neighbor                             Linklayer Address  Netif Expire    1s 5s
2001:xxxx:xxxx:xxxx:62be:b4ff:fe16:a800 60:be:b4:16:a8:00   em0 23h53m48s S R
fe80::62be:b4ff:fe16:a800%em0        60:be:b4:16:a8:00    em0 23h53m39s S R
2001:xxxx:xxxx:xxxx:be24:11ff:fe9f:eee9 bc:24:11:9f:ee:e9   em0 permanent R
fe80::be24:11ff:fe9f:eee9%em0        bc:24:11:9f:ee:e9    em0 permanent R
fe80::d435:77ff:fe88:2299%em0        d6:35:77:88:22:99    em0 23h59m50s S
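
The two `netstat -s -p icmp6` snapshots above are easier to compare mechanically than by eye. A minimal sketch, assuming each snapshot was saved to a file first (the function name and file names are hypothetical, not part of any FreeBSD tool):

```shell
# Print every icmp6 counter whose value differs between two saved
# `netstat -s -p icmp6` snapshots.
# $1 = snapshot taken before the test, $2 = snapshot taken after it.
icmp6_diff() {
    awk '
        function trim(s) { sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s); return s }
        {
            line = trim($0)
            # Lines ending in ":" are section headers such as "Input histogram:".
            if (line ~ /:$/) { section = line; next }
            if (line ~ /^[0-9]+ /) {
                # Plain counters: "0 bad checksums"
                value = line; sub(/ .*$/, "", value)
                label = line; sub(/^[0-9]+ /, "", label)
            } else if (line ~ /: [0-9]+$/) {
                # Histogram entries: "neighbor advertisement: 4"
                value = line; sub(/^.*: /, "", value)
                label = line; sub(/: [0-9]+$/, "", label)
                label = section " " label
            } else next
            if (FILENAME == ARGV[1]) before[label] = value
            else if (!(label in before) || before[label] != value)
                printf "%s: %s -> %s\n", label, ((label in before) ? before[label] : "n/a"), value
        }
    ' "$1" "$2"
}
```

Usage would be along the lines of `netstat -s -p icmp6 > before.txt`, running the failing ndisc6, `netstat -s -p icmp6 > after.txt`, and then `icmp6_diff before.txt after.txt`. Counters that appear only in the second snapshot are reported with `n/a` as the old value.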

Comment 21 Michael Gmelin freebsd_committer

2024-09-13 13:52:15 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #20)

Thanks, that's lots of input to process.

Could you please do two final tests before I'll try to reproduce this once more (hopefully on the weekend):

1. Can you add this to /etc/sysctl.conf and reboot:

    net.inet6.icmp6.nd6_delay=200


and see if ND works for longer after a reboot?


2. Can you undo 1. and add this to /etc/sysctl.conf:

    net.inet6.icmp6.nd6_onlink_ns_rfc4861=1


reboot again and see if it makes a difference.

Comment 22 Dr. Uwe Meyer-Gruhl 2024-09-13 14:33:49 UTC

Ad 1: Sometimes, the ND were answered even up to 5 minutes after reboot. After I have set net.inet6.icmp6.nd6_delay=200, I watched the VM and after 4 minutes, the ND still worked, after 6 minutes, ND stopped being answered.

Ad 2: Undoing step 1 and setting net.inet6.icmp6.nd6_onlink_ns_rfc4861=1, same story: After 5 minutes, the problem surfaces.

Comment 23 Michael Gmelin freebsd_committer

2024-09-13 15:41:20 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #22)

Ok, thank you.

TL;DR if below is too much hassle for you at the moment, don't bother (I promised the questions above were the last for now). Also, maybe somebody more knowledgeable of IPv6 could look into this as well.

----

Just to confirm: You're positive that exact same setup worked using 14.1-RELEASE *before* the correction date? (I know, it's annoying to get the same thing asked multiple times).

Also: Does 14.1-RELEASE *after* the correction date also work in your setup (vanilla FreeBSD install in your vm, pf disabled)?

Hypothesis: There might actually be two independent issues at hand. The one you're seeing when pf is disabled and the one originally reported by Franco, which is a problem with pf enabled.

(this is assuming `kldstat` does not list pf on your test host)
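
A small sketch of how that assumption could be checked in a script rather than by reading the `kldstat` listing by eye. The function name is hypothetical; it only looks for a `pf.ko` entry in the last column of whatever is piped into it, so it would not notice pf compiled statically into the kernel:

```shell
# Succeed (exit status 0) if the kldstat listing on stdin contains pf.ko.
pf_loaded() {
    awk '$NF == "pf.ko" { found = 1 } END { exit !found }'
}
```

On the test host this would be used as `kldstat | pf_loaded && echo "pf.ko is loaded"`. `kldstat -q -m pf` should give the same answer through its exit status, and it also covers a statically compiled pf.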

Going through diffs between 14.1-RELEASE and 14-STABLE I found one commit about neighbor discovery that might be worth trying to reverse and see if it improves the situation in your setup: https://cgit.freebsd.org/src/commit/?id=407ef8669f5a3

Assuming /usr/src is empty:

git clone -b stable/14 --shallow-since=2024-07-24 \
https://git.freebsd.org/src.git /usr/src
cd /usr/src
git show 407ef8669f5a3 | patch -p1 -R
make kernel
reboot

I also double-checked the commit that backed out the SA at OPNsense:
https://github.com/opnsense/src/commit/d0d18dbbaba27b342bbb10df89e75d2156c136fe

It really just removed changes in pf.ko, so it's unlikely that what you're seeing here (with pf disabled) is caused by it.

This doesn't mean there isn't another issue with pf enabled though, for which we would need a separate way to reproduce it, once we figured out what's up with the issue you're seeing.

Comment 24 Gleb Smirnoff freebsd_committer

2024-09-13 16:56:49 UTC

Dr. Uwe Meyer-Gruhl,

I understand from your reply that all tests in tests/sys/netpfil/pf/icmp
pass on your host.  Could you please confirm that?

Also, looks like your investigations together with Michael show that the
problem is not related to pf(4).  However, if you had run tests from
tests/sys/netpfil/pf, then pf.ko is in memory after that until you reboot.
Can you please double check and confirm once again that lost NDs are not
related to pf?

Once that confirmation is done, I would also suggest to double check the
assumption that SA-24:05 introduced the regression.  As Michael notes it
is very unlikely since it touched files related to pf only.

If you insist that bug manifests itself without pf.ko in memory and also
that bug was introduced by SA-24:05, then I would ask to checkout stable/14
at 96ff33484ee5, which is the last commit before SA-24:05 and _triple_
check that the bug doesn't exist.  If that is true, then please check out
3382c691dc6a, which is exactly last commit of SA-24:05, and _quadruple_
check that bug is introduced again.  Also check that pf.ko is not in memory
at this point.

Sorry for insisting on all these re-checks, but the news that pf does not
need to be present in memory makes the whole picture too unrealistic
and hints that there is some testing error in our records.

Comment 25 Dr. Uwe Meyer-Gruhl 2024-09-13 17:07:03 UTC

(In reply to Gleb Smirnoff from comment #24)

I have not tried any automated tests, as I am not familiar with the ATF framework. I also have not checked out any source versions, but only installed pre-made binaries.

And yes, it really seems that this is not related to the SA, but happens independently, with pf enabled and without and with both 14.1-RELEASE and 14.1-STABLE. I just have to wait 5+ minutes (actually, it is a few seconds after 5 minutes uptime are reached).

Comment 26 Michael Gmelin freebsd_committer

2024-09-13 17:10:19 UTC

One additional data point: I tried to disable ULA on my network, but it still works (so this additional test is off the table).

Another thing you could test, since you seem to be on proxmox already (otherwise I would try to set that up myself, but I have no experience doing that and if possible would like to avoid the extra time spent):

Could you try to disable multicast snooping on the bridge device on the proxmox host:

    echo -n 0 > /sys/class/net/<BRIDGE>/bridge/multicast_snooping

This was suggested on the proxmox forums: https://forum.proxmox.com/threads/ipv6-neighbor-solicitation-not-forwarded-to-vm.96758/

It's a bit of a wild shot, but since it's super low effort to try, it would be nice if we could rule that one out.
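
Before and after flipping the switch it can help to see what every bridge on the host is currently set to. A minimal sketch, assuming the usual Linux sysfs layout from the command above; the function name is hypothetical, and the optional argument exists only so the function can be tried against a scratch directory:

```shell
# List the multicast_snooping setting of every bridge below a sysfs root.
# $1 is optional and defaults to /sys/class/net.
bridge_snooping_status() {
    root=${1:-/sys/class/net}
    for f in "$root"/*/bridge/multicast_snooping; do
        [ -r "$f" ] || continue          # skip when the glob matched nothing
        dev=${f#"$root"/}                # strip the sysfs root ...
        dev=${dev%%/*}                   # ... and keep only the device name
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Running `bridge_snooping_status` on the Proxmox host should print one line per bridge, for example `vmbr0 1` while snooping is still enabled (the bridge name is an assumption here). Note that a value written with `echo` into sysfs does not survive a reboot of the host.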

Comment 27 Dr. Uwe Meyer-Gruhl 2024-09-13 17:23:14 UTC

(In reply to Michael Gmelin from comment #26)

Wow. Right on the spot! Once I set that on the Proxmox host even on the VM that ran for 18 minutes, it immediately answers again.

What I do not get is that during the first five minutes, the setting does not cause problems and also, that the ND counters in the VM seemed to increase, so what is actually filtered?

I also would never have guessed that, because I saw the ND problems before on bare metal under OpnSense.

I think we can close this issue and I will forward the info to the OpnSense team. Sorry for the noise and thanks for listening!

Comment 28 Michael Gmelin freebsd_committer

2024-09-13 17:35:10 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #27)

That's good news, thank you!

Maybe we should leave this open for a couple of days until the in-project reporters confirm that there isn't another ND issue that shows when pf is enabled?

By the way, even though I'm not using OPNsense myself - I'm a bit too hardcore raw FreeBSD for that - I think it's a really cool project and I already recommended it to friends in the past.

Comment 29 Ed Maste freebsd_committer

2024-09-13 17:44:31 UTC

> Maybe we should leave this open for a couple of days until the in-project reporters confirm that there isn't another ND issue that shows when pf is enabled?

IMO it doesn't hurt to keep this ticket open for a bit, but if there is a distinct issue that is observed with pf enabled it should be a new bug as the investigation and updates in this one go down the path of the proxmox issue.

We should also update the headline to indicate that this one is (apparently) a Linux kernel / proxmox issue rather than a SA-24:05 regression.

Comment 30 Dr. Uwe Meyer-Gruhl 2024-09-13 17:49:20 UTC

That would be my idea as well. I would open a separate issue for the SA/pf-related problem when/if I can reproduce it.

This one really is an interoperability issue, but it is solved.

Comment 31 Dr. Uwe Meyer-Gruhl 2024-09-13 17:51:57 UTC

Changed title as root cause was interoperability with Proxmox

Comment 32 Michael Gmelin freebsd_committer

2024-09-13 17:52:51 UTC

There's actually an open bug on kernel.org tracking the issue: https://bugzilla.kernel.org/show_bug.cgi?id=99081

Comment 33 Ed Maste freebsd_committer

2024-09-13 19:01:11 UTC

> There's actually an open bug on kernel.org tracking the issue

Oh my, submitted in 2015. Thanks for the link - I've added it to the "See Also" list. (Bugzilla allows that since it's another Bugzilla instance.)

Comment 34 Michael Gmelin freebsd_committer

2024-09-13 22:59:53 UTC

@meyergru For a lack of having a better place to write about this:

I've seen the posts[0] in the OPNsense forum about this. I cannot comment on the history of the *senses, personal harms, and whatever might have happened in the past - including the communication on bug #280701. What I can do is give my current (non-sarcastic, personal, not representative for the project) status of the situation we're in:

1. There was FreeBSD-SA-24:05, that addresses an icmp6 state table vulnerability (I can't comment on how urgent this was, but I'm giving the security officer(s) the benefit of the doubt and therefore trust their judgement).
2. This SA introduced a couple of regressions, including *breaking neighbor discovery*. This one I could also easily reproduce locally when running 14.1-RELEASE with pf enabled. So there definitely was, and still is, an issue. I'm unhappy that this is what the current release patch level does and we all want this to be rectified asap.
3. There were a couple of patches to 14-STABLE that were supposed to fix the regressions introduced by the SA. These patches have not yet been applied to any releng branches.
4. In bug #280701 there were reports about additional problems (including this one) *after applying all additional fixes*. These were put into separate bugs.
5. Besides waiting for more test reports to come in, addressing the issues introduced by the SA was delayed by these bugs as well as by waiting a bit to make sure that there are no additional issues (as issuing an SA for the SA for the SA would be pretty bad).

So from my perspective, the way forward would be to resolve bug #281397 as well, and then, given that there are no more concerns, issue another SA or EN to address the problems that were introduced by SA-24:05.

I'm adding a couple of people I think might be relevant to the Cc list of this bug. If you feel like this is inappropriate, please simply remove yourselves from the Cc list.

[0]https://forum.opnsense.org/index.php?topic=42458.msg212548#msg212548

Comment 35 Dr. Uwe Meyer-Gruhl 2024-09-13 23:25:41 UTC

(In reply to Michael Gmelin from comment #34)

Exactly my point of view and the main reason I opened this bug and bug #281397, as I already pointed out. 

I have closed this bug just because the problem I could show really was about interoperation with Proxmox only, but I would open a new issue once I find a way to reproduce the ND bug without any probability of it being dismissed as "downstream" again by showing it on pure FreeBSD.

Alas, that is not so easy, partly, because of some automation in OpnSense, like there is no static /etc/pf.conf.

I can understand each position: Yes, FreeBSD cannot provide solutions for issues created downstream. On the other hand, I am quite sure Franco is right about the partly applied patches leaving reasonable doubt that all bugs have been fixed. Given the prehistory and because OpnSense was definitely showing these remaining problems, I also would have chosen to revert the SA altogether for OpnSense at this time, but the best way going forward is indeed to fix them upstream.

However, the old bug was closed for formal reasons and without forking off separate issues for the remaining problems.

I like your way to address this, because I think this is the best way to improve both FreeBSD and OpnSense.

Comment 36 Michael Gmelin freebsd_committer

2024-09-14 10:05:23 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #35)

> Alas, that is not so easy, partly, because of some automation
> in OpnSense, like there is no static /etc/pf.conf.

That's why I tried to replicate the issue using OPNsense. I spent a couple of hours familiarizing myself with it and upgraded to a specific release (getting to 24.7.2 in this case, as this would be the basis for additional patches).

Like I wrote in comment #8, the experimental kernels which would have the same patches as 14.1-STABLE were not available anymore (e.g., `opnsense-update -zkr 24.7.2-nd2` could not locate them, I tried various mirrors). As testing the official release kernels is pointless (24.7.1 has no patches beyond the original SA, 24.7.2 does not have all the patches from STABLE and 24.7.3 has the SA backed out completely), this is where I stopped, hoping for some advice, where to get these kernels alternatively or how to build them manually.

The original SA caused issues, also on bare metal, which bug #280701 was supposed to address. I would like to make sure there are no remaining problems before an EN is created (even though I'm not part of that process).

The proxmox issue is kind of unfortunate, as some of its symptoms are similar to the ones the SA introduced also on bare metal and now it's a bit hard to tell apart which patches were tested on which setups.

Comment 37 Dr. Uwe Meyer-Gruhl 2024-09-14 10:08:55 UTC

No, 24.7.3_1 should have the buggy behaviour, as it was based on the partial SA fix. Only 24.7.4 has the full SA revert.

Comment 38 Michael Gmelin freebsd_committer

2024-09-14 14:53:57 UTC

(In reply to Dr. Uwe Meyer-Gruhl from comment #37)

Ok, I'm not sure how much sense it makes to continue this conversation here, as I keep spamming those who are subscribed, so maybe we can move it somewhere else (you could, e.g., simply send me an email or we create a new/separate bug). That said, I now created the following:

- OPNsense VM running 24.7.3_1
- Custom kernel containing bits from 14-STABLE

The custom kernel was created by taking OPNsense kernel sources from 24.7.2 and replacing sys/net, sys/netinet, sys/netinet6, sys/netipsec, and sys/netpfil with those from 14-STABLE.
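
For anyone who wants to build a comparable tree, the directory replacement could be scripted roughly as follows. This is only a sketch of one possible way to do it, not necessarily the exact steps used here; the function name and both path arguments are placeholders:

```shell
# Replace the networking subdirectories of one source tree with the copies
# from another tree.
# $1 = tree to copy from (e.g. a 14-STABLE checkout)
# $2 = tree to overwrite (e.g. the OPNsense 24.7.2 kernel sources)
overlay_net_dirs() {
    from=$1
    to=$2
    for d in sys/net sys/netinet sys/netinet6 sys/netipsec sys/netpfil; do
        [ -d "$from/$d" ] || { echo "missing $from/$d" >&2; return 1; }
        rm -rf "${to:?}/$d"              # :? guards against an empty target path
        cp -R "$from/$d" "$to/$d"
    done
}
```

Everything outside the five listed directories in the target tree is left untouched, so whether the result compiles depends on how far the two trees have diverged elsewhere.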

This means I now have a testbed in which I can test reports that are only reproducible on OPNsense. So in case you happen to have such reports, I'm happy to check if I can reproduce them locally.