14-STABLE eff27c3872300e594e0b410364a02302fc555121, built 4 June. This machine is a gateway and does indeed use IPv6. It runs dns/blocky (a filtering resolver, like Pi-hole, written in Go) in a jail that lives on ZFS; the rest of the system is on UFS. I had just rolled back the jail to an old snapshot when this happened, but I'm not positive that is related, even though it appears to have crashed right after I hit enter on the zfs rollback command. It looks like it crashed when blocky went to close a TCP connection (the upstream resolver is DNS-over-HTTPS over IPv6).

Message buffer:

```
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b10416
stack pointer           = 0x28:0xfffffe00b4245980
frame pointer           = 0x28:0xfffffe00b42459b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 11116 (blocky)
rdi: fffff8004c742000 rsi: 000000000000001c rdx: fffff801dba0a278
rcx: fffff8004c742000  r8: 00000000ffffffbd  r9: 0000000000000018
rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00b42459b0
r10: fffff8004ca20e20 r11: fffff8005ec6b880 r12: fffff8003fb4e898
r13: 0000000000000000 r14: fffffe00b424598c r15: 0000000000010480
trap number             = 12
panic: page fault
cpuid = 3
time = 1718033759
KDB: stack backtrace:
#0 0xffffffff808b899d at kdb_backtrace+0x5d
#1 0xffffffff8086b701 at vpanic+0x131
#2 0xffffffff8086b5c3 at panic+0x43
#3 0xffffffff80d6325b at trap_fatal+0x40b
#4 0xffffffff80d632a6 at trap_pfault+0x46
#5 0xffffffff80d3b718 at calltrap+0x8
#6 0xffffffff80adda9a at tcp_default_output+0x1cda
#7 0xffffffff80aef193 at tcp_usr_disconnect+0x83
#8 0xffffffff8090ff05 at soclose+0x75
#9 0xffffffff8080a5c1 at _fdrop+0x11
#10 0xffffffff8080d82a at closef+0x24a
#11 0xffffffff8080cee6 at fdescfree+0x4e6
#12 0xffffffff8081fa2e at exit1+0x49e
#13 0xffffffff8081f58d at sys_exit+0xd
#14 0xffffffff80d63b15 at amd64_syscall+0x115
#15 0xffffffff80d3c02b at fast_syscall_common+0xf8
```

kgdb backtrace:

```
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff8086b297 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:523
#3  0xffffffff8086b76e in vpanic (fmt=0xffffffff80e79c24 "%s", ap=ap@entry=0xfffffe00b42457e0)
    at /usr/src/sys/kern/kern_shutdown.c:967
#4  0xffffffff8086b5c3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:891
#5  0xffffffff80d6325b in trap_fatal (frame=0xfffffe00b42458c0, eva=16)
    at /usr/src/sys/amd64/amd64/trap.c:952
#6  0xffffffff80d632a6 in trap_pfault (frame=<unavailable>, usermode=false, signo=<optimized out>,
    ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:760
#7  <signal handler called>
#8  0xffffffff80b10416 in in6_selecthlim (inp=inp@entry=0xfffff8005ea2b540, ifp=ifp@entry=0x0)
    at /usr/src/sys/netinet6/in6_src.c:850
#9  0xffffffff80adda9a in tcp_default_output (tp=0xfffff8005ea2b540)
    at /usr/src/sys/netinet/tcp_output.c:1444
#10 0xffffffff80aef193 in tcp_usr_disconnect (so=<optimized out>)
    at /usr/src/sys/netinet/tcp_usrreq.c:705
#11 0xffffffff8090ff05 in sodisconnect (so=0xfffff80136b683c0)
    at /usr/src/sys/kern/uipc_socket.c:1436
#12 soclose (so=0xfffff80136b683c0) at /usr/src/sys/kern/uipc_socket.c:1271
#13 0xffffffff8080a5c1 in fo_close (fp=0xfffff8004c742000, fp@entry=0xfffff8019bc50730, td=0x1c,
    td@entry=0xfffff8019bc50730) at /usr/src/sys/sys/file.h:392
#14 _fdrop (fp=0xfffff8004c742000, fp@entry=0xfffff8019bc50730, td=0x1c,
    td@entry=0xfffff801db4cb000) at /usr/src/sys/kern/kern_descrip.c:3670
#15 0xffffffff8080d82a in closef (fp=fp@entry=0xfffff8019bc50730, td=td@entry=0xfffff801db4cb000)
    at /usr/src/sys/kern/kern_descrip.c:2843
#16 0xffffffff8080cee6 in fdescfree_fds (td=0xfffff801db4cb000, fdp=0xfffffe00b1260860)
    at /usr/src/sys/kern/kern_descrip.c:2566
#17 fdescfree (td=td@entry=0xfffff801db4cb000) at /usr/src/sys/kern/kern_descrip.c:2609
#18 0xffffffff8081fa2e in exit1 (td=0xfffff801db4cb000, rval=<optimized out>, signo=signo@entry=0)
    at /usr/src/sys/kern/kern_exit.c:404
#19 0xffffffff8081f58d in sys_exit (td=0xfffff8004c742000, uap=<optimized out>)
    at /usr/src/sys/kern/kern_exit.c:210
#20 0xffffffff80d63b15 in syscallenter (td=0xfffff801db4cb000)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:191
#21 amd64_syscall (td=0xfffff801db4cb000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1194
#22 <signal handler called>
#23 0x000000000047398b in ?? ()
Backtrace stopped: Cannot access memory at address 0x8702b7ee8
```
(In reply to Daniel Ponte from comment #0)

The stack trace is weird. The caller, `sys/netinet/tcp_output.c`:

```
1444			ip6->ip6_hlim = in6_selecthlim(inp, NULL);
```

The callee, `sys/netinet6/in6_src.c`:

```
 843 int
 844 in6_selecthlim(struct inpcb *inp, struct ifnet *ifp)
 845 {
 846
 847 	if (inp && inp->in6p_hops >= 0)
 848 		return (inp->in6p_hops);
 849 	else if (ifp)
 850 		return (ND_IFINFO(ifp)->chlim);
 851 	else if (inp && !IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_faddr)) {
 	...
 	}
```

Line 850 should never be hit, as `ifp` is NULL; the backtrace also shows that clearly. That is quite odd. Is it possible that kgdb reports the wrong line number?
(In reply to Zhenlei Huang from comment #1)

The fault virtual address 0x10 corresponds to the offset of the nd_ifinfo field in struct in6_ifextra, which is returned by if_getafdata().
(In reply to Andrey V. Elsukov from comment #2)

Hmm, I guess we have to disassemble the kernel file to figure out what happens behind the scenes, if this cannot be repeated.
```
ffffffff80b10380 <in6_selecthlim>:
ffffffff80b10380: 55                      pushq   %rbp
ffffffff80b10381: 48 89 e5                movq    %rsp, %rbp
ffffffff80b10384: 41 56                   pushq   %r14
ffffffff80b10386: 53                      pushq   %rbx
ffffffff80b10387: 48 83 ec 20             subq    $0x20, %rsp
ffffffff80b1038b: 48 85 ff                testq   %rdi, %rdi
ffffffff80b1038e: 74 74                   je      0xffffffff80b10404 <in6_selecthlim+0x84>
ffffffff80b10390: 0f b7 87 04 01 00 00    movzwl  0x104(%rdi), %eax
ffffffff80b10397: 66 85 c0                testw   %ax, %ax
ffffffff80b1039a: 0f 89 9a 00 00 00       jns     0xffffffff80b1043a <in6_selecthlim+0xba>
ffffffff80b103a0: 48 85 f6                testq   %rsi, %rsi
ffffffff80b103a3: 75 64                   jne     0xffffffff80b10409 <in6_selecthlim+0x89>
ffffffff80b103a5: 83 bf 94 00 00 00 00    cmpl    $0x0, 0x94(%rdi)
ffffffff80b103ac: 75 1b                   jne     0xffffffff80b103c9 <in6_selecthlim+0x49>
ffffffff80b103ae: 83 bf 98 00 00 00 00    cmpl    $0x0, 0x98(%rdi)
ffffffff80b103b5: 75 12                   jne     0xffffffff80b103c9 <in6_selecthlim+0x49>
ffffffff80b103b7: 83 bf 9c 00 00 00 00    cmpl    $0x0, 0x9c(%rdi)
ffffffff80b103be: 75 09                   jne     0xffffffff80b103c9 <in6_selecthlim+0x49>
ffffffff80b103c0: 83 bf a0 00 00 00 00    cmpl    $0x0, 0xa0(%rdi)
ffffffff80b103c7: 74 57                   je      0xffffffff80b10420 <in6_selecthlim+0xa0>
ffffffff80b103c9: 0f b7 9f 8e 00 00 00    movzwl  0x8e(%rdi), %ebx
ffffffff80b103d0: 48 81 c7 94 00 00 00    addq    $0x94, %rdi
ffffffff80b103d7: 4c 8d 75 dc             leaq    -0x24(%rbp), %r14
ffffffff80b103db: 48 8d 55 ec             leaq    -0x14(%rbp), %rdx
ffffffff80b103df: 4c 89 f6                movq    %r14, %rsi
ffffffff80b103e2: e8 19 dd 01 00          callq   0xffffffff80b2e100 <in6_splitscope>
ffffffff80b103e7: 8b 55 ec                movl    -0x14(%rbp), %edx
ffffffff80b103ea: 89 df                   movl    %ebx, %edi
ffffffff80b103ec: 4c 89 f6                movq    %r14, %rsi
ffffffff80b103ef: 31 c9                   xorl    %ecx, %ecx
ffffffff80b103f1: 45 31 c0                xorl    %r8d, %r8d
ffffffff80b103f4: e8 07 46 ff ff          callq   0xffffffff80b04a00 <fib6_lookup>
ffffffff80b103f9: 48 85 c0                testq   %rax, %rax
ffffffff80b103fc: 74 22                   je      0xffffffff80b10420 <in6_selecthlim+0xa0>
ffffffff80b103fe: 48 8b 78 20             movq    0x20(%rax), %rdi
ffffffff80b10402: eb 08                   jmp     0xffffffff80b1040c <in6_selecthlim+0x8c>
ffffffff80b10404: 48 85 f6                testq   %rsi, %rsi
ffffffff80b10407: 74 17                   je      0xffffffff80b10420 <in6_selecthlim+0xa0>
ffffffff80b10409: 48 89 f7                movq    %rsi, %rdi
ffffffff80b1040c: be 1c 00 00 00          movl    $0x1c, %esi
ffffffff80b10411: e8 0a a3 e8 ff          callq   0xffffffff8099a720 <if_getafdata>
ffffffff80b10416: 48 8b 40 10             movq    0x10(%rax), %rax
ffffffff80b1041a: 0f b6 40 1c             movzbl  0x1c(%rax), %eax
ffffffff80b1041e: eb 1a                   jmp     0xffffffff80b1043a <in6_selecthlim+0xba>
ffffffff80b10420: 65 48 8b 04 25 00 00 00 00  movq %gs:0x0, %rax
ffffffff80b10429: 48 8b 80 90 06 00 00    movq    0x690(%rax), %rax
ffffffff80b10430: 48 8b 40 28             movq    0x28(%rax), %rax
ffffffff80b10434: 8b 80 48 5c 33 81       movl    -0x7ecca3b8(%rax), %eax
ffffffff80b1043a: 48 83 c4 20             addq    $0x20, %rsp
ffffffff80b1043e: 5b                      popq    %rbx
ffffffff80b1043f: 41 5e                   popq    %r14
ffffffff80b10441: 5d                      popq    %rbp
ffffffff80b10442: c3                      retq
ffffffff80b10443: 66 66 66 66 2e 0f 1f 84 00 00 00 00 00  nopw %cs:(%rax,%rax)
```
Created attachment 251522 [details] test nh_ifp
(In reply to Daniel Ponte from comment #4)

I do not see any problems with the disassembled code, from my limited x86-64 ASM knowledge. There are only two paths that reach ffffffff80b10416. One is:

> ffffffff80b103a0: 48 85 f6                testq   %rsi, %rsi
> ffffffff80b103a3: 75 64                   jne     0xffffffff80b10409 <in6_selecthlim+0x89>

and the other is:

> ffffffff80b103fe: 48 8b 78 20             movq    0x20(%rax), %rdi
> ffffffff80b10402: eb 08                   jmp     0xffffffff80b1040c <in6_selecthlim+0x8c>

So I suspect the line number 850 reported by kgdb is wrong, and the correct one should be 861. I have no evidence, but could you please try the patch?
The x86-64 C calling convention I refer to: https://people.freebsd.org/~obrien/amd64-elf-abi.pdf
(In reply to Daniel Ponte from comment #4)

Can you show me the output of

```
print ((struct ifnet *)0xfffff8004c742000)->if_afdata[28]
print *(struct ifnet *)0xfffff8004c742000
```

in kgdb? Probably %rdi still held the ifnet pointer at the fatal fault, because if_getafdata() is a tiny function (I can confirm if the disassembly of if_getafdata is provided).
kgdb output:

```
(kgdb) print ((struct ifnet *)0xfffff8004c742000)->if_afdata[28]
$1 = (void *) 0x0
(kgdb) print *(struct ifnet *)0xfffff8004c742000
$2 = {if_link = {cstqe_next = 0x0}, if_clones = {le_next = 0x0, le_prev = 0xfffff8004c897828},
  if_groups = {cstqh_first = 0x0, cstqh_last = 0xfffff8004c742018}, if_alloctype = 6 '\006',
  if_numa_domain = 255 '\377', if_softc = 0x0, if_llsoftc = 0x0, if_l2com = 0x0,
  if_dname = 0xffffffff834e2000 <epairname> "epair", if_dunit = 0, if_index = 23, if_idxgen = 0,
  if_xname = "epair0b\000\000\000\000\000\000\000\000", if_description = 0x0, if_flags = 2131970,
  if_drv_flags = 0, if_capabilities = 8, if_capabilities2 = 0, if_capenable = 8,
  if_capenable2 = 0, if_linkmib = 0x0, if_linkmiblen = 0, if_refcount = 4, if_type = 6 '\006',
  if_addrlen = 6 '\006', if_hdrlen = 14 '\016', if_link_state = 1 '\001', if_mtu = 1500,
  if_metric = 0, if_baudrate = 10000000000, if_hwassist = 0, if_epoch = 77,
  if_lastchange = {tv_sec = 1718033759, tv_usec = 498647}, if_snd = {ifq_head = 0x0,
    ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 50, ifq_mtx = {lock_object = {
        lo_name = 0xfffff8004c742058 "epair0b", lo_flags = 16973824, lo_data = 0,
        lo_witness = 0x0}, mtx_lock = 0}, ifq_drv_head = 0x0, ifq_drv_tail = 0x0,
    ifq_drv_len = 0, ifq_drv_maxlen = 50, altq_type = 0, altq_flags = 1, altq_disc = 0x0,
    altq_ifp = 0xfffff8004c742000, altq_enqueue = 0x0, altq_dequeue = 0x0, altq_request = 0x0,
    altq_tbr = 0x0, altq_cdnr = 0x0}, if_linktask = {ta_link = {stqe_next = 0x0},
    ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 '\000',
    ta_func = 0xffffffff8099ab60 <do_link_state_change>, ta_context = 0xfffff8004c742000},
  if_addmultitask = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000',
    ta_flags = 0 '\000', ta_func = 0xffffffff8099add0 <if_siocaddmulti>,
    ta_context = 0xfffff8004c742000}, if_addr_lock = {lock_object = {
      lo_name = 0xffffffff80e985c6 "if_addr_lock", lo_flags = 16973824, lo_data = 0,
      lo_witness = 0x0}, mtx_lock = 0}, if_addrhead = {cstqh_first = 0x0,
    cstqh_last = 0xfffff8004c7421c0}, if_multiaddrs = {cstqh_first = 0x0,
    cstqh_last = 0xfffff8004c7421d0}, if_amcount = 0, if_addr = 0xfffff8004c921000,
  if_hw_addr = 0xfffff80007d7e7d0,
  if_broadcastaddr = 0xffffffff80fa0530 <etherbroadcastaddr> "\377\377\377\377\377\377",
  if_afdata_lock = {lock_object = {lo_name = 0xffffffff80eea36d "if_afdata",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0},
  if_afdata = {0x0 <repeats 44 times>}, if_afdata_initialized = 0, if_fib = 0,
  if_vnet = 0xfffff80016c43580, if_home_vnet = 0xfffff800010af9c0, if_vlantrunk = 0x0,
  if_bpf = 0xffffffff80f9f0b0 <dead_bpf_if>, if_pcount = 0, if_bridge = 0x0, if_lagg = 0x0,
  if_pf_kif = 0x0, if_carp = 0x0, if_label = 0x0, if_netmap = 0x0,
  if_output = 0xffffffff809a3760 <ifdead_output>, if_input = 0xffffffff809a3780 <ifdead_input>,
  if_bridge_input = 0x0, if_bridge_output = 0x0, if_bridge_linkstate = 0x0,
  if_start = 0xffffffff809a3790 <ifdead_start>, if_ioctl = 0xffffffff809a37a0 <ifdead_ioctl>,
  if_init = 0xffffffff834e1020 <epair_init>,
  if_resolvemulti = 0xffffffff809a37b0 <ifdead_resolvemulti>,
  if_qflush = 0xffffffff809a37d0 <ifdead_qflush>,
  if_transmit = 0xffffffff809a37e0 <ifdead_transmit>,
  if_reassign = 0xffffffff809a5070 <ether_reassign>,
  if_get_counter = 0xffffffff809a3800 <ifdead_get_counter>,
  if_requestencap = 0xffffffff809a4fa0 <ether_requestencap>,
  if_counters = {0xfffffe012c2c88b8, 0xfffffe012c2c88b0, 0xfffffe012c2c8878, 0xfffffe012c2c8870,
    0xfffffe012c2c8868, 0xfffffe012c2c8860, 0xfffffe012c2c8858, 0xfffffe012c2c8850,
    0xfffffe012c2c8848, 0xfffffe012c2c8840, 0xfffffe012c2c8838, 0xfffffe012c2c8830},
  if_hw_tsomax = 65518, if_hw_tsomaxsegcount = 35, if_hw_tsomaxsegsize = 2048,
  if_snd_tag_alloc = 0xffffffff809a3810 <ifdead_snd_tag_alloc>,
  if_ratelimit_query = 0xffffffff809a3820 <ifdead_ratelimit_query>, if_ratelimit_setup = 0x0,
  if_pcp = 255 '\377', if_debugnet_methods = 0x0, if_epoch_ctx = {data = {0x0, 0x0}},
  if_ispare = {0, 0, 0, 0}}
```

As far as testing the patch, I can build with it, but this probably won't be reproducible anyway. I'm not totally certain what was happening when it crashed.
(In reply to Daniel Ponte from comment #9)

It looks like an epair(4) device was detached while some packets that were going to be sent through it were delayed. Then afdata[AF_INET6] was freed during the epair ifnet detach, and access to this freed data triggers the panic.
We've been running into this in pfSense for a while as well: https://redmine.pfsense.org/issues/14431 We wound up applying this band-aid: https://github.com/pfsense/FreeBSD-src/commit/9834d8bb0d3344cd82552c3cd16e5b2d84543d8f That's very much not a fix, but it does seem to mitigate the panics.
I have not reproduced the crash, but I guess the following patch for if_detach_internal() would fix the problem:

```
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -1235,6 +1235,8 @@ if_detach_internal(struct ifnet *ifp, bool vmove)
 #ifdef VIMAGE
 finish_vnet_shutdown:
 #endif
+	epoch_wait_preempt(net_epoch_preempt);
+	NET_EPOCH_DRAIN_CALLBACKS();
 	/*
 	 * We cannot hold the lock over dom_ifdetach calls as they might
 	 * sleep, for example trying to drain a callout, thus open up the
```

The routing entries related to the detaching ifnet are removed in if_purgeaddrs() and rt_flushifroutes(). It seems that the transport layer protects itself from freed objects with NET_EPOCH_ENTER/EXIT, so there should be no threads still referencing nhop objects related to the ifnet after rt_flushifroutes() + epoch_wait_preempt(). I am not sure that NET_EPOCH_DRAIN_CALLBACKS() is required, but it is probably harmless.
Created attachment 251613 [details] shell script that reproduces the crash

I have been able to reproduce the crash with the attached script (repro.sh). With the patch in comment #12 applied, the crash has not occurred so far.
Patch posted on the Phabricator, review D45690.
Xref another report by bz@ https://lists.freebsd.org/archives/freebsd-net/2024-May/004981.html
I've retracted review D45690 because it does not completely fix the problem.
What about upstreaming pfSense's workaround until a proper solution can be found? https://github.com/pfsense/FreeBSD-src/commit/9834d8bb0d3344cd82552c3cd16e5b2d84543d8f