Bug 279208 - filling up arp table with static entries leads to a crash
Status: Closed DUPLICATE of bug 277063
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.0-RELEASE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-net (Nobody)
Keywords: crash
Reported: 2024-05-21 22:59 UTC by martin
Modified: 2024-05-24 14:52 UTC

Description martin 2024-05-21 22:59:11 UTC
Loading the ARP table with the arp -f command leads to a panic. Sometimes the panic occurs immediately, sometimes only after loading more entries (more subnets or a wider subnet). Running a few arp -a processes and waiting a few minutes also leads to a panic.

To reproduce, I created an alias on the interface and a list of dummy entries:

# ifconfig em0 alias 172.17.1.1/24
# cat 1list
172.17.1.2 13:01:00:00:00:02
172.17.1.3 13:01:00:00:00:03
...
172.17.1.255 13:01:00:00:00:ff


# arp -f 1list
# ps axl |grep arp
  0 842  820 1  20  0 12956  2688 sbwait   I+    0   0:00.02 arp -a
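
The dummy list used above is elided in the middle, so here is a minimal sketch of one way to generate such a file. The function name gen_arp_list is hypothetical and not part of the report; the address and MAC pattern is inferred from the first and last entries shown.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print "IP MAC" pairs for
# hosts .2 through .255 of 172.17.<subnet>.0/24, following the pattern
# of the entries shown in the list above.
gen_arp_list() {
    subnet=$1
    host=2
    while [ "$host" -le 255 ]; do
        # The subnet and host numbers are reused as hex octets of the MAC.
        printf '172.17.%d.%d 13:%02x:00:00:00:%02x\n' \
            "$subnet" "$host" "$subnet" "$host"
        host=$((host + 1))
    done
}

# Example use: gen_arp_list 1 > 1list
```

Calling the function with other subnet numbers and loading each resulting file with arp -f would presumably give the "more subnets" variant of the test, assuming a matching alias exists for each subnet.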

The entries that the arp command did show have an obvious overflow in the expiry time:

# arp -an
? (172.17.3.254) at 13:03:00:00:00:fe on em0 expires in -1716331940 seconds [ethernet]
? (172.17.3.222) at 13:03:00:00:00:de on em0 expires in -1716331940 seconds [ethernet]


Sleeping thread (tid 100853, pid 0) owns a non-sleepable lock
KDB: stack backtrace of thread 100853:
#0 0xffffffff80b5028b at mi_switch+0xbb
#1 0xffffffff80b4fa00 at _sleep+0x1f0
#2 0xffffffff80ba6c11 at taskqueue_thread_loop+0xb1
#3 0xffffffff80afdb7f at fork_exit+0x7f
#4 0xffffffff80fe4b2e at fork_trampoline+0xe
panic: sleeping thread
cpuid = 1
time = 1716332236
KDB: stack backtrace:
#0 0xffffffff80b9009d at kdb_backtrace+0x5d
#1 0xffffffff80b431a2 at vpanic+0x132
#2 0xffffffff80b43063 at panic+0x43
#3 0xffffffff80ba8e9e at propagate_priority+0x29e
#4 0xffffffff80ba99e4 at turnstile_wait+0x314
#5 0xffffffff80b3e9c9 at __rw_rlock_hard+0x279
#6 0xffffffff80d8c2af at dump_lle+0x1f
#7 0xffffffff80c6c38c at htable_foreach_lle+0x5c
#8 0xffffffff80d8c234 at dump_llts_iface+0x54
#9 0xffffffff80d8bfcd at rtnl_handle_getneigh+0x20d
#10 0xffffffff80d882d2 at rtnl_handle_message+0x132
#11 0xffffffff80d85c0b at nl_taskqueue_handler+0x79b
#12 0xffffffff80ba5992 at taskqueue_run_locked+0x182
#13 0xffffffff80ba6c22 at taskqueue_thread_loop+0xc2
#14 0xffffffff80afdb7f at fork_exit+0x7f
#15 0xffffffff80fe4b2e at fork_trampoline+0xe
Uptime: 4m49s
Comment 1 Marek Zarychta 2024-05-22 04:04:27 UTC
Please also see bug 277063. It may already have been fixed [1]. Could you test whether the panics are reproducible on 14.1-BETA3?

1. https://cgit.freebsd.org/src/commit/?id=1e74fc950419f2b2482d313fc664cc03aa46f13c
Comment 2 martin 2024-05-22 08:16:16 UTC
Thanks for the link. On releng/14.1-n267636-2a964a7fc34e I was not able to reproduce the issue; the system is stable.
Comment 3 Sergey 2024-05-23 11:21:21 UTC
(In reply to Marek Zarychta from comment #1)

I applied the patch and rebuilt, but the system still crashes and reboots automatically.

# uname -a
FreeBSD localhost 14.0-RELEASE-p6 FreeBSD 14.0-RELEASE-p6 #0: Tue Mar 26 20:26:20 UTC 2024 root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Comment 4 Marek Zarychta 2024-05-23 14:26:40 UTC
(In reply to Sergey from comment #3)
If you are building from source, it would probably be better to test how a recent stable/14 or releng/14.1 behaves.
Comment 5 Sergey 2024-05-23 14:58:21 UTC
(In reply to Marek Zarychta from comment #4)
I have a master gateway that I have updated 
FreeBSD localhost 13.2-RELEASE-p4 FreeBSD 13.2-RELEASE-p4 GENERIC i386
up to
FreeBSD localhost 14.0-RELEASE-p6 FreeBSD 14.0-RELEASE-p6 GENERIC amd64 

After that, I encountered unstable operation of the static arp table, which is actively used there.

It seems that it is too early to upgrade to a 64-bit system.
Comment 6 Zhenlei Huang freebsd_committer freebsd_triage 2024-05-24 14:52:26 UTC
(In reply to Sergey from comment #5)
> I have a master gateway that I have updated 
> FreeBSD localhost 13.2-RELEASE-p4 FreeBSD 13.2-RELEASE-p4 GENERIC i386
> up to
> FreeBSD localhost 14.0-RELEASE-p6 FreeBSD 14.0-RELEASE-p6 GENERIC amd64 

`arp -na` is stable on 14.1-BETA3. No problem with about 10000 arp entries.
```
# arp -na | wc -l
   10003
``` 

> After that, I encountered unstable operation of the static arp table, which is
> actively used there.

That is probably a regression in arp(8), which was converted to use netlink in 14.0.

> It seems that it is too early to upgrade to a 64-bit system.

That is not fair to 64-bit systems ;)

*** This bug has been marked as a duplicate of bug 277063 ***