Bug 272319 - FreeBSD kernel crash on MPD5 restart with PPP configuration.
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Gleb Smirnoff
Keywords: crash
Reported: 2023-07-01 17:21 UTC by ny
Modified: 2023-11-30 17:07 UTC

Attachments
Fix panic and try to add solisten upcall (2.11 KB, patch)
2023-07-02 17:01 UTC, Aleksandr Fedorov
Simple MPD5 configuration, which not work on FreeBSD 12,13,14 and crash kernel. (637 bytes, text/plain)
2023-08-06 06:49 UTC, ny
FreeBSD 15 crash / trace (13.38 KB, image/png)
2023-11-19 16:15 UTC, ny

Description ny 2023-07-01 17:21:11 UTC
FreeBSD 12.0 through 13.2 (both amd64 and i386) panics when the MPD5 daemon is restarted or the OS is rebooted with a PPP configuration.

How to reproduce:
1. Install FreeBSD 13.2 (for example amd64) with the default kernel.
2. Install mpd5 from ports.
3. Configure mpd5 with PPP over TCP/IP.
4. Start the MPD5 daemon.
5. Restart MPD5 or reboot the OS.
6. The kernel crashes.

Sample of mpd5 configuration (/usr/local/etc/mpd5/mpd.conf):
========
startup:
#	set log +all

default:
	load ppp_server

ppp_server:
        set ippool add pool2 10.0.0.0 10.0.255.255

        create bundle template B2
        set ipcp ranges 10.0.1.1/16 ippool pool2
        set iface enable proxy-arp
        set iface enable tcpmssfix
        set iface idle 0

        create link template L2 tcp
        set link enable multilink
        set link enable shortseq
        set link yes acfcomp protocomp
        set link action bundle B2

        set link disable chap pap eap
        set link enable chap chap-msv1 chap-msv2 chap-md5

        set tcp self 127.0.0.1 57
        set link enable incoming
======

Trace:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x18
fault code		= supervisor write data, page not present
instruction pointer	= 0x20:0xffffffff80be3cc2
stack pointer	        = 0x28:0xfffffe00939e6c70
frame pointer	        = 0x28:0xfffffe00939e6c80
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 475 (ng_queue0)
trap number		= 12
panic: page fault
cpuid = 0
time = 1688225854
KDB: stack backtrace:
#0 0xffffffff80c53dc5 at kdb_backtrace+0x65
#1 0xffffffff80c06741 at vpanic+0x151
#2 0xffffffff80c065e3 at panic+0x43
#3 0xffffffff810b1fa7 at trap_fatal+0x387
#4 0xffffffff810b1fff at trap_pfault+0x4f
#5 0xffffffff81088e78 at calltrap+0x8
#6 0xffffffff80c6bef8 at propagate_priority+0x58
#7 0xffffffff80c6cce3 at turnstile_wait+0x323
#8 0xffffffff80be33a0 at __mtx_lock_sleep+0x180
#9 0xffffffff82b366fb at ng_ksocket_shutdown+0x1ab
#10 0xffffffff82b23923 at ng_rmnode+0x1c3
#11 0xffffffff82b258b5 at ng_apply_item+0x85
#12 0xffffffff82b287b8 at ngthread+0x1e8
#13 0xffffffff80bc2fce at fork_exit+0x7e
#14 0xffffffff81089eee at fork_trampoline+0xe
Uptime: 1m52s
Dumping 161 out of 2006 MB:..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:396
#2  0xffffffff80c0630a in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:484
#3  0xffffffff80c067ae in vpanic (fmt=<optimized out>, 
    ap=ap@entry=0xfffffe00939e6ac0) at /usr/src/sys/kern/kern_shutdown.c:923
#4  0xffffffff80c065e3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:847
#5  0xffffffff810b1fa7 in trap_fatal (frame=0xfffffe00939e6bb0, eva=24)
    at /usr/src/sys/amd64/amd64/trap.c:942
#6  0xffffffff810b1fff in trap_pfault (frame=0xfffffe00939e6bb0, 
    usermode=false, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:761
#7  <signal handler called>
#8  0xffffffff80be3cc2 in atomic_cmpset_long (expect=0, 
    src=18446741876100055968, dst=<optimized out>)
    at /usr/src/sys/amd64/include/atomic.h:217
#9  _thread_lock (td=0xfffff800210a4158) at /usr/src/sys/kern/kern_mutex.c:845
#10 0xffffffff80c6bef8 in propagate_priority (td=0xfffff800210a4158, 
    td@entry=0xfffffe00544443a0) at /usr/src/sys/kern/subr_turnstile.c:234
#11 0xffffffff80c6cce3 in turnstile_wait (ts=ts@entry=0xfffff800104ff240, 
    owner=owner@entry=0xfffff800210a4158, queue=queue@entry=0)
    at /usr/src/sys/kern/subr_turnstile.c:808
#12 0xffffffff80be33a0 in __mtx_lock_sleep (c=0xfffff800210a4160, 
    v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:668
#13 0xffffffff82b366fb in ng_ksocket_shutdown (node=0xfffff80021ae7800)
    at /usr/src/sys/netgraph/ng_ksocket.c:939
#14 0xffffffff82b23923 in ng_rmnode (node=0xfffff80021ae7800, 
    dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>)
    at /usr/src/sys/netgraph/ng_base.c:758
#15 0xffffffff82b258b5 in ng_apply_item (node=node@entry=0xfffff80021ae7800, 
    item=item@entry=0xfffff80021659d80, rw=rw@entry=1)
    at /usr/src/sys/netgraph/ng_base.c:2477
#16 0xffffffff82b287b8 in ngthread (arg=arg@entry=0x0)
    at /usr/src/sys/netgraph/ng_base.c:3444
#17 0xffffffff80bc2fce in fork_exit (callout=0xffffffff82b285d0 <ngthread>, 
    arg=0x0, frame=0xfffffe00939e6f40) at /usr/src/sys/kern/kern_fork.c:1093
#18 <signal handler called>
#19 0x000004c708f40bfa in ?? ()
Backtrace stopped: Cannot access memory at address 0x4c700446b68
(kgdb) 
=========

The crash reproduces reliably. It occurs only with PPP over TCP/IP;
PPTP and L2TP are not affected. The FreeBSD 11 kernel works fine
and does not have this problem.
Comment 1 Tino Engel 2023-07-01 19:32:53 UTC
I cannot test this at the moment, but I will be happy to look into it as soon as I have set up an AWS image (and, preferably, documented the process in the wiki).
Comment 2 ny 2023-07-01 20:01:41 UTC
(In reply to Tino Engel from comment #1)
You can test it on any existing FreeBSD 12.0-13.2 system by installing MPD5 and having it listen on some IP address (even 127.0.0.1) for PPP over TCP.
Comment 3 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 08:56:51 UTC
I can reliably reproduce the problem and get the same kernel panic on my small VM system, updated today:

$ uname -pv
FreeBSD 12.4-STABLE 593211cc7 HZ  amd64
Comment 4 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 09:55:48 UTC
I've added some debugging options to my kernel:

options         INVARIANT_SUPPORT
options         INVARIANTS
options         WITNESS
options         WITNESS_KDB

And got a saner panic message:

panic: mtx_lock() of spin mutex (null) @ /usr/src/sys/netgraph/ng_ksocket.c:940

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff804c6eef in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452
#3  0xffffffff804c7251 in vpanic (fmt=<optimized out>, ap=0xfffffe00184e6280)
    at /usr/src/sys/kern/kern_shutdown.c:881
#4  0xffffffff804c7093 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:808
#5  0xffffffff804a77a0 in __mtx_lock_flags (c=0xfffff8004da77ba8, opts=<optimized out>,
    file=0xffffffff807e1b04 "/usr/src/sys/netgraph/ng_ksocket.c", line=940)
    at /usr/src/sys/kern/kern_mutex.c:261
#6  0xffffffff805e1478 in ng_ksocket_shutdown (node=0xfffff8001066f100)
    at /usr/src/sys/netgraph/ng_ksocket.c:940
#7  0xffffffff805db00f in ng_rmnode (node=0xfffff8001066f100, dummy1=<optimized out>,
    dummy2=<optimized out>, dummy3=<optimized out>) at /usr/src/sys/netgraph/ng_base.c:757
#8  0xffffffff805dcec9 in ng_apply_item (node=<unavailable>, item=<unavailable>, rw=1)
    at /usr/src/sys/netgraph/ng_base.c:2477
#9  0xffffffff805e0080 in ngthread (arg=<optimized out>) at /usr/src/sys/netgraph/ng_base.c:3439
#10 0xffffffff804891c0 in fork_exit (callout=0xffffffff805dfea0 <ngthread>, arg=0x0,
    frame=0xfffffe00184e6480) at /usr/src/sys/kern/kern_fork.c:1080
#11 <signal handler called>
Comment 5 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 09:59:25 UTC
CC'ing some more people. Maybe one of them has something to add.
Comment 6 Tino Engel 2023-07-02 11:27:02 UTC
I need to wait 24h for AWS cost management in order to be able to afford the machine.
Comment 7 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-02 11:52:40 UTC
Evgeniy sent me the output of "p priv->so":

$24 = {so_lock = {lock_object = {lo_name = 0xffffffff807f7904 "socket", lo_flags = 21168128,
      lo_data = 0, lo_witness = 0xfffff8007cd5a800}, mtx_lock = 0}, so_count = 1, so_rdsel = {
    si_tdlist = {tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0},
      kl_lock = 0xffffffff80555a00 <so_rdknl_lock>,
      kl_unlock = 0xffffffff80555a40 <so_rdknl_unlock>,
      kl_assert_locked = 0xffffffff80555a80 <so_rdknl_assert_locked>,
      kl_assert_unlocked = 0xffffffff80555ac0 <so_rdknl_assert_unlocked>,
      kl_lockarg = 0xfffff8004da77a38, kl_autodestroy = 0}, si_mtx = 0x0}, so_wrsel = {
    si_tdlist = {tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0},
      kl_lock = 0xffffffff80555b00 <so_wrknl_lock>,
      kl_unlock = 0xffffffff80555b40 <so_wrknl_unlock>,
      kl_assert_locked = 0xffffffff80555b80 <so_wrknl_assert_locked>,
      kl_assert_unlocked = 0xffffffff80555bc0 <so_wrknl_assert_unlocked>,
      kl_lockarg = 0xfffff8004da77a38, kl_autodestroy = 0}, si_mtx = 0x0}, so_type = 1,
  so_options = 514, so_linger = 0, so_state = 256, so_pcb = 0xfffff800355bd988,
  so_vnet = 0xfffff8000203e8c0, so_proto = 0xffffffff80a62460 <inetsw+192>, so_timeo = 0,
  so_error = 0, so_rerror = 0, so_sigio = 0x0, so_cred = 0xfffff8005f954400, so_label = 0x0,
  so_gencnt = 11170, so_emuldata = 0x0, so_dtor = 0x0, osd = {osd_nslots = 0, osd_slots = 0x0,
    osd_next = {le_next = 0x0, le_prev = 0x0}}, so_fibnum = 0, so_user_cookie = 0,
  so_ts_clock = 0, so_max_pacing_rate = 0, {{so_rcv = {sb_mtx = {lock_object = {lo_name = 0x0,
            lo_flags = 1302821776, lo_data = 4294965248, lo_witness = 0x0},
          mtx_lock = 18446735278919351200}, sb_sx = {lock_object = {lo_name = 0x0, lo_flags = 1,
            lo_data = 0, lo_witness = 0x0}, sx_lock = 0}, sb_sel = 0x0, sb_state = 0,
        sb_mb = 0x0, sb_mbtail = 0x80000000001, sb_lastrecord = 0x800000010000,
        sb_sndptr = 0x8200820, sb_fnrdy = 0x0, sb_sndptroff = 0, sb_acc = 0, sb_ccc = 0,
        sb_hiwat = 0, sb_mbcnt = 0, sb_mcnt = 0, sb_ccnt = 0, sb_mbmax = 0, sb_ctl = 0,
        sb_lowat = 0, sb_timeo = 0, sb_flags = 0, sb_upcall = 0x0, sb_upcallarg = 0x0,
        sb_aiojobq = {tqh_first = 0x0, tqh_last = 0x0}, sb_aiotask = {ta_link = {
            stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0x0, ta_context = 0x0}},
      so_snd = {sb_mtx = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0,
            lo_witness = 0x0}, mtx_lock = 0}, sb_sx = {lock_object = {lo_name = 0x0,
            lo_flags = 0, lo_data = 0, lo_witness = 0x0}, sx_lock = 0}, sb_sel = 0x0,
        sb_state = 0, sb_mb = 0x0, sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_sndptr = 0x0,
        sb_fnrdy = 0x0, sb_sndptroff = 0, sb_acc = 0, sb_ccc = 0, sb_hiwat = 0, sb_mbcnt = 0,
        sb_mcnt = 0, sb_ccnt = 0, sb_mbmax = 0, sb_ctl = 0, sb_lowat = 0, sb_timeo = 0,
        sb_flags = 0, sb_upcall = 0x0, sb_upcallarg = 0x0, sb_aiojobq = {tqh_first = 0x0,
          tqh_last = 0x0}, sb_aiotask = {ta_link = {stqe_next = 0x0}, ta_pending = 0,
          ta_priority = 0, ta_func = 0x0, ta_context = 0x0}}, so_list = {tqe_next = 0x0,
        tqe_prev = 0x0}, so_listen = 0x0, so_qstate = SQ_NONE, so_peerlabel = 0x0,
      so_oobmark = 0}, {sol_incomp = {tqh_first = 0x0, tqh_last = 0xfffff8004da77b90},
      sol_comp = {tqh_first = 0x0, tqh_last = 0xfffff8004da77ba0}, sol_qlen = 0, sol_incqlen = 0,
      sol_qlimit = 1, sol_accept_filter = 0x0, sol_accept_filter_arg = 0x0,
      sol_accept_filter_str = 0x0, sol_upcall = 0x0, sol_upcallarg = 0x0, sol_sbrcv_lowat = 1,
      sol_sbsnd_lowat = 2048, sol_sbrcv_hiwat = 65536, sol_sbsnd_hiwat = 32768,
      sol_sbrcv_flags = 2080, sol_sbsnd_flags = 2080, sol_sbrcv_timeo = 0, sol_sbsnd_timeo = 0}}}
(kgdb)

priv->so->so_options == 514 (0x202, i.e. SO_ACCEPTCONN | SO_REUSEPORT), so this is a listening socket.

After this commit: https://github.com/freebsd/freebsd-src/commit/779f106aa169256b7010a1d8f963ff656b881e92

access to the so_rcv and so_snd fields is invalid for listening sockets, because they share the same storage as the sol_* fields.
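
The overlap can be seen directly in the dump: so_rcv.sb_mtx.mtx_lock reads 18446735278919351200, which is 0xfffff8004da77ba0, the same value as sol_comp.tqh_last. The following is a minimal user-space sketch of that aliasing, not the kernel's actual definitions. All model_* names and the reduced struct layouts are assumptions made for illustration, and the offset match assumes an LP64 platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FreeBSD's SO_ACCEPTCONN value; renamed so it cannot clash with host headers. */
#define MODEL_SO_ACCEPTCONN 0x0002

/* Simplified mutex layout: a lock_object header followed by the lock word. */
struct model_mtx {
	const char *lo_name;
	uint32_t    lo_flags;
	uint32_t    lo_data;
	void       *lo_witness;
	uintptr_t   mtx_lock;
};

/* Simplified tail queue head, shaped like TAILQ_HEAD(). */
struct model_tailq {
	void  *tqh_first;
	void **tqh_last;
};

/* Hypothetical stand-in for struct socket: two views over the same storage. */
struct model_socket {
	int so_options;
	union {
		struct {			/* data sockets only */
			struct model_mtx so_rcv_mtx;
		} data;
		struct {			/* listening sockets only */
			struct model_tailq sol_incomp;
			struct model_tailq sol_comp;
		} listen;
	} u;
};

/* Same idea as the kernel's SOLISTENING() test. */
static int
model_solistening(const struct model_socket *so)
{
	return ((so->so_options & MODEL_SO_ACCEPTCONN) != 0);
}

/* Returns 1 when the lock word occupies the same bytes as sol_comp.tqh_last. */
static int
model_lock_word_aliases_queue(void)
{
	return (offsetof(struct model_socket, u.data.so_rcv_mtx.mtx_lock) ==
	    offsetof(struct model_socket, u.listen.sol_comp.tqh_last));
}

/* Read the receive-buffer lock word; only legal for non-listening sockets. */
static uintptr_t
model_rcv_lock_word(const struct model_socket *so)
{
	assert(!model_solistening(so));
	return (so->u.data.so_rcv_mtx.mtx_lock);
}
```

In this model, locking the "receive buffer mutex" of a listening socket would operate on a queue pointer, which matches the nonsensical mutex state reported in the panic messages.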
Comment 8 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-02 11:59:29 UTC
I think fix should look like this:

diff --git a/sys/netgraph/ng_ksocket.c b/sys/netgraph/ng_ksocket.c
index ba9845410e42..7074549ae403 100644
--- a/sys/netgraph/ng_ksocket.c
+++ b/sys/netgraph/ng_ksocket.c
@@ -936,12 +936,18 @@ ng_ksocket_shutdown(node_p node)
 
        /* Close our socket (if any) */
        if (priv->so != NULL) {
-               SOCKBUF_LOCK(&priv->so->so_rcv);
-               soupcall_clear(priv->so, SO_RCV);
-               SOCKBUF_UNLOCK(&priv->so->so_rcv);
-               SOCKBUF_LOCK(&priv->so->so_snd);
-               soupcall_clear(priv->so, SO_SND);
-               SOCKBUF_UNLOCK(&priv->so->so_snd);
+               /*
+                * Listening sockets don't have data upcalls.
+                */
+               if (!SOLISTENING(priv->so)) {
+                       SOCKBUF_LOCK(&priv->so->so_rcv);
+                       soupcall_clear(priv->so, SO_RCV);
+                       SOCKBUF_UNLOCK(&priv->so->so_rcv);
+                       SOCKBUF_LOCK(&priv->so->so_snd);
+                       soupcall_clear(priv->so, SO_SND);
+                       SOCKBUF_UNLOCK(&priv->so->so_snd);
+               }
+
                soclose(priv->so);
                priv->so = NULL;
        }
Comment 9 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 13:04:35 UTC
(In reply to Aleksandr Fedorov from comment #8)

Indeed, this patch eliminates the panic. However, after mpd5 has stopped successfully without a panic, the command "sockstat | fgrep :57" still shows the following (just as it does while mpd5 runs):

?        ?          ?     ?  tcp4   127.0.0.1:57          *:*

After a second mpd5 start, the line is duplicated:

?        ?          ?     ?  tcp4   127.0.0.1:57          *:*
?        ?          ?     ?  tcp4   127.0.0.1:57          *:*

Looks like a ksocket leak, doesn't it?
Comment 10 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 13:13:35 UTC
Never mind, I created the leak with bad manual patching. Your version is fine.
Comment 11 ny 2023-07-02 13:43:09 UTC
I have checked the
     if (!SOLISTENING(priv->so)) 
patch on FreeBSD 13.2 amd64.

Yes, the kernel no longer crashes on mpd5 restart. Thanks.

At the same time, telnet to 127.0.0.1 port 57 still does not produce { {{{{ {{{ {{.

The PPP/TCP problem in MPD5 started with FreeBSD 12.0 and persists to this day.
In FreeBSD 11 everything was fine.

It looks like there is an additional bug in the kernel netgraph code that prevents mpd5 from listening on the PPP/TCP socket correctly and sending answers back to the PPP client.
Comment 12 ny 2023-07-02 13:47:18 UTC
Also, "telnet 127.0.0.1 57" is never disconnected by a timeout.
Comment 13 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-02 13:52:18 UTC
mpd5 uses the poll(2) system call to obtain events, such as an incoming connection, from ng_ksocket. If such an event occurs for a TCP ng_ksocket, mpd5 prints either an "Incoming TCP connection from ..." log message at its PHYS loglevel, or "TCP: error reading message from ..." at its error loglevel if reading data from ng_ksocket fails.

My tests show that neither message occurs, so poll(2) is broken for ng_ksocket(4).
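
For comparison, here is a rough sketch of what a working poll(2) notification looks like on an ordinary listening TCP socket. It is only a plain-sockets analogue: mpd5 presumably waits on its netgraph sockets rather than on the TCP socket itself, and the sketch does not model netgraph at all. The function name listener_becomes_readable() is made up for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a loopback TCP listener on an ephemeral port, connect to it,
 * and report whether poll(2) flags the listener as readable.
 * Returns 1 if POLLIN was reported, 0 if poll timed out, -1 on error.
 */
static int
listener_becomes_readable(int timeout_ms)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	struct pollfd pfd;
	int lfd, cfd, n, result = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return (-1);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0) {
		close(lfd);
		return (-1);
	}

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;		/* let the kernel pick a free port */

	/* Bind, listen, learn the chosen port, then connect as a client. */
	if (bind(lfd, (struct sockaddr *)&sin, sizeof(sin)) != 0 ||
	    listen(lfd, 1) != 0 ||
	    getsockname(lfd, (struct sockaddr *)&sin, &len) != 0 ||
	    connect(cfd, (struct sockaddr *)&sin, sizeof(sin)) != 0)
		goto out;

	/* A pending connection makes the listening socket readable. */
	pfd.fd = lfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		goto out;
	result = (n == 1 && (pfd.revents & POLLIN) != 0);
out:
	close(cfd);
	close(lfd);
	return (result);
}
```

A return value of 0 from this helper would correspond to the symptom described here: a connection was made, but the waiter was never told about it.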
Comment 14 Tino Engel 2023-07-02 14:54:03 UTC
Maybe removing the #error directives in
https://cgit.freebsd.org/src/tree/sys/amd64/include/pcpu_aux.h?h=releng/13.2
gives more insights?
Comment 15 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-02 15:08:16 UTC
(In reply to ny from comment #11)

Yes, I think there is an additional bug here.

Commit https://github.com/freebsd/freebsd-src/commit/779f106aa169256b7010a1d8f963ff656b881e92 added a new KPI, solisten_upcall_set(), which is used to catch events associated with a listening socket.

However, the current version of ng_ksocket(4) continues to use the old KPI, which is meant for data sockets.

Therefore, I think that because ng_ksocket(4) doesn't set a handler function via solisten_upcall_set(), it doesn't get the ACCEPT event, etc.

For an example, see how it was done for ctl_ha.c: https://github.com/freebsd/freebsd-src/commit/779f106aa169256b7010a1d8f963ff656b881e92#diff-77eaa0ac186398050c82052906359306b18a8a274cb81b7ee3dbd897f8207705

The same should be done for ng_ksocket(4).
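
To illustrate the suspected failure mode, here is a toy user-space model with separate handler slots for data events and listen events. It is not kernel code: struct model_sock, model_deliver() and the other model_* names are invented for this sketch, and the real KPI has a different signature and locking rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds delivered by the socket layer. */
enum model_event { MODEL_EV_DATA, MODEL_EV_ACCEPT };

typedef void (*model_upcall_t)(void *arg);

/* Hypothetical socket with one handler slot per event family. */
struct model_sock {
	model_upcall_t data_upcall;	/* analogue of the old data-socket upcall */
	model_upcall_t listen_upcall;	/* analogue of the listen upcall slot */
	void          *upcall_arg;
};

/*
 * Deliver an event to whichever slot is responsible for it.
 * Returns 1 if a handler ran, 0 if the event was silently dropped.
 */
static int
model_deliver(const struct model_sock *so, enum model_event ev)
{
	model_upcall_t fn;

	fn = (ev == MODEL_EV_ACCEPT) ? so->listen_upcall : so->data_upcall;
	if (fn == NULL)
		return (0);
	fn(so->upcall_arg);
	return (1);
}

/* Example handler: counts how many events it has seen. */
static void
model_count_event(void *arg)
{
	(*(int *)arg)++;
}
```

With only data_upcall registered, model_deliver(&so, MODEL_EV_ACCEPT) returns 0 and the counter stays untouched, which is the analogue of the accept event never reaching the node; setting listen_upcall as well makes the same call return 1.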
Comment 16 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-02 17:01:54 UTC
Created attachment 243150 [details]
Fix panic and try to add solisten upcall
Comment 17 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-02 17:03:50 UTC
I added a quick and dirty patch against HEAD that tries to fix the panic and add a solisten upcall. The patch should also work with 13.2.

After the patch, telnet shows me:

# telnet 127.0.0.1 57
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
~#!}!}!} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})~~#!}!}"} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})Z}:~~#!}!}#} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})}%~~#!}!}$} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})~~#!}!}%} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})%~~#!}!}&} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})}$~~#!}!}'} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})[Y~~#!}!}(} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})X~~#!}!})} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})}'&~~#!}!}*} &}(}"}'}"}!}$}%}%}&a6}#}%#}1}$}(} }2}"}3})}#.})~Connection closed by foreign host.
Comment 18 ny 2023-07-02 17:28:12 UTC
Is there anything negative in the "dirty patch" for a production PPP/TCP server?
Comment 19 ny 2023-07-02 18:45:03 UTC
I have tested the patch on FreeBSD 13.2 / amd64.
Now an mpd5 restart no longer crashes.

On telnet 127.0.0.1 57 there is no connection, but there is a new crash:

Unread portion of the kernel message buffer:
panic: _mtx_lock_sleep: recursed on non-recursive mutex socket @ /usr/src/sys/netgraph/ng_ksocket.c:1196

cpuid = 0
time = 1688330477
KDB: stack backtrace:
#0 0xffffffff80c423a5 at kdb_backtrace+0x65
#1 0xffffffff80bf5ff1 at vpanic+0x151
#2 0xffffffff80bf5df3 at panic+0x43
#3 0xffffffff80bd176c at __mtx_lock_sleep+0x43c
#4 0xffffffff80bd12b5 at __mtx_lock_flags+0xe5
#5 0xffffffff82d37d53 at ng_ksocket_accept+0x33
#6 0xffffffff80c99806 at solisten_wakeup+0x26
#7 0xffffffff80dc7037 at tcp_do_segment+0x15b7
#8 0xffffffff80dc4ddb at tcp_input_with_port+0xddb
#9 0xffffffff80dc59ab at tcp_input+0xb
#10 0xffffffff80db628b at ip_input+0x18b
#11 0xffffffff80d38cd1 at swi_net+0x1a1
#12 0xffffffff80bb2de9 at ithread_loop+0x279
#13 0xffffffff80bafcc0 at fork_exit+0x80
#14 0xffffffff8108df2e at fork_trampoline+0xe
Uptime: 1m46s
Dumping 175 out of 2004 MB:..10%..19%..28%..37%..46%..55%..64%..73%..82%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:396
#2  0xffffffff80bf5bff in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:484
#3  0xffffffff80bf605e in vpanic (fmt=<optimized out>, 
    ap=ap@entry=0xfffffe00037339c0) at /usr/src/sys/kern/kern_shutdown.c:923
#4  0xffffffff80bf5df3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:847
#5  0xffffffff80bd176c in __mtx_lock_sleep (c=c@entry=0xfffff8000d95d018, 
    v=18446741874752305408, opts=opts@entry=0, 
    file=file@entry=0xffffffff82d39567 "/usr/src/sys/netgraph/ng_ksocket.c", 
    line=line@entry=1196) at /usr/src/sys/kern/kern_mutex.c:546
#6  0xffffffff80bd12b5 in __mtx_lock_flags (c=0xfffff8000d95d018, 
    opts=<unavailable>, 
    file=0xffffffff82d39567 "/usr/src/sys/netgraph/ng_ksocket.c", line=1196)
    at /usr/src/sys/kern/kern_mutex.c:284
#7  0xffffffff82d37d53 in ng_ksocket_accept (priv=0xfffff80014246180, 
    priv@entry=<error reading variable: value is not available>)
    at /usr/src/sys/netgraph/ng_ksocket.c:1196
#8  0xffffffff80c99806 in solisten_wakeup (sol=0xfffff8000d95d000, 
    sol@entry=0xfffff8000deb257c) at /usr/src/sys/kern/uipc_socket.c:985
#9  0xffffffff80c9fee8 in soisconnected (so=so@entry=0xfffff8000d961b10)
    at /usr/src/sys/kern/uipc_socket.c:4053
#10 0xffffffff80dc7037 in tcp_do_segment (m=0xfffff8000deb2500, 
    m@entry=<error reading variable: value is not available>, 
    th=0xfffff8000deb257c, 
    th@entry=<error reading variable: value is not available>, 
    so=<unavailable>, 
    so@entry=<error reading variable: value is not available>, 
    tp=0xfffffe0094589ca8, 
    tp@entry=<error reading variable: value is not available>, 
    drop_hdrlen=52, 
    drop_hdrlen@entry=<error reading variable: value is not available>, 
    tlen=<optimized out>, 
    tlen@entry=<error reading variable: value is not available>, 
    iptos=16 '\020', 
    iptos@entry=<error reading variable: value is not available>)
    at /usr/src/sys/netinet/tcp_input.c:2469
#11 0xffffffff80dc4ddb in tcp_input_with_port (mp=mp@entry=<unavailable>, 
    offp=offp@entry=<unavailable>, proto=<optimized out>, port=port@entry=0)
    at /usr/src/sys/netinet/tcp_input.c:1180
#12 0xffffffff80dc59ab in tcp_input (mp=<unavailable>, 
    mp@entry=<error reading variable: value is not available>, 
    offp=<unavailable>, 
    offp@entry=<error reading variable: value is not available>, 
    proto=<unavailable>, 
    proto@entry=<error reading variable: value is not available>)
    at /usr/src/sys/netinet/tcp_input.c:1509
#13 0xffffffff80db628b in ip_input (m=0x0, 
    m@entry=<error reading variable: value is not available>)
    at /usr/src/sys/netinet/ip_input.c:840
#14 0xffffffff80d38cd1 in netisr_process_workstream_proto (
    nwsp=0xffffffff82b876c0, proto=1) at /usr/src/sys/net/netisr.c:919
#15 swi_net (arg=0xffffffff82b876c0) at /usr/src/sys/net/netisr.c:966
#16 0xffffffff80bb2de9 in intr_event_execute_handlers (ie=0xfffff800034cd500, 
    p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1169
#17 ithread_execute_handlers (ie=0xfffff800034cd500, p=<optimized out>)
    at /usr/src/sys/kern/kern_intr.c:1182
#18 ithread_loop (arg=arg@entry=0xfffff800034704c0)
    at /usr/src/sys/kern/kern_intr.c:1270
#19 0xffffffff80bafcc0 in fork_exit (
    callout=0xffffffff80bb2b70 <ithread_loop>, arg=0xfffff800034704c0, 
    frame=0xfffffe0003733f40) at /usr/src/sys/kern/kern_fork.c:1093
#20 <signal handler called>
(kgdb)
Comment 20 ny 2023-08-05 09:06:21 UTC
I have checked FreeBSD 14.0, and the situation is far worse: the kernel crashes even on MPD5 start.
Comment 21 ny 2023-08-06 06:42:55 UTC
FreeBSD-14.0-CURRENT-i386-20230720-a52f23f4c49e-264226-disc1.iso 
(default kernel + MPD5 start)

FreeBSD fb 14.0-CURRENT FreeBSD 14.0-CURRENT i386 1400093 #0 main-n264226-a52f23f4c49e: Thu Jul 20 07:59:56 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC  i386

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x1e588898
fault code		= supervisor write data, page not present
instruction pointer	= 0x20:0xfb062d
stack pointer	        = 0x28:0x172b5550
frame pointer	        = 0x28:0x172b555c
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 824 (mpd5)
trap number		= 12
panic: page fault
cpuid = 0
time = 1691310521
KDB: stack backtrace:
db_trace_self_wrapper(0,17e073a0,172b5510,1e588898,c,...) at db_trace_self_wrapper+0x28/frame 0x172b53a0
vpanic(13faf8e,172b53dc,172b53dc,172b5408,1391746,...) at vpanic+0xf3/frame 0x172b53bc
panic(13faf8e,14a24f7,0,fffff,c09b,...) at panic+0x14/frame 0x172b53d0
trap_fatal(2,1a2d79c,0,172b5440,0,...) at trap_fatal+0x346/frame 0x172b5408
trap_pfault(1e588898,0,0) at trap_pfault+0x6f/frame 0x172b543c
trap(172b5510,8,28,28,f,...) at trap+0x316/frame 0x172b5504
calltrap() at 0xffc0321f/frame 0x172b5504
--- trap 0xc, eip = 0xfb062d, esp = 0x172b5550, ebp = 0x172b555c ---
hashinit(10,1861d908,1e588898,17e073a0,77bc000,...) at hashinit+0xed/frame 0x172b555c
vnet_netgraph_init(0) at vnet_netgraph_init+0x37/frame 0x172b557c
vnet_register_sysinit(1861dbd0) at vnet_register_sysinit+0xe1/frame 0x172b5594
linker_load_module(5a95300,1860eeec,0) at linker_load_module+0xa7c/frame 0x172b5770
linker_load_dependencies(5a95300) at linker_load_dependencies+0x189/frame 0x172b57a4
link_elf_load_file(1800f28,5a96900,172b5920) at link_elf_load_file+0x560/frame 0x172b5890
linker_load_module(0,0,172b5a88) at linker_load_module+0x8b3/frame 0x172b5a74
kern_kldload(17e073a0,e6e5800,172b5ab0) at kern_kldload+0x142/frame 0x172b5a9c
sys_kldload(17e073a0,17e0764c) at sys_kldload+0x50/frame 0x172b5ac0
syscall(172b5ba8,3b,3b,3b,0,...) at syscall+0x1c5/frame 0x172b5b9c
Xint0x80_syscall() at 0xffc03479/frame 0x172b5b9c
--- syscall (304, FreeBSD ELF32, kldload), eip = 0x20ae897f, esp = 0xffbfea60, ebp = 0xffbfeb20 ---
KDB: enter: panic

0x00f6b6f9 in doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:407
407		dump_savectx();
(kgdb) #0  0x00f6b6f9 in doadump (textdump=0)
    at /usr/src/sys/kern/kern_shutdown.c:407
#1  0x00a0d3c4 in db_dump (dummy=<optimized out>, dummy2=<optimized out>, 
    dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/src/sys/ddb/db_command.c:593
#2  0x00a0d1cd in db_command (last_cmdp=<optimized out>, 
    cmd_table=<optimized out>, dopager=true)
    at /usr/src/sys/ddb/db_command.c:506
#3  0x00a0cedc in db_command_loop () at /usr/src/sys/ddb/db_command.c:553
#4  0x00a1018d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:270
#5  0x00fb1cc6 in kdb_trap (type=3, code=0, tf=0x172b5358)
    at /usr/src/sys/kern/subr_kdb.c:784
#6  0x01390e51 in trap (frame=0x172b5358) at /usr/src/sys/i386/i386/trap.c:694
#7  0xffc0321f in ?? ()
#8  0x172b5358 in ?? ()
#9  0x00000028 in ?? ()
#10 0x00000028 in ?? ()
#11 0x00000100 in ?? ()
#12 0x172b53dc in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(kgdb)
Comment 22 ny 2023-08-06 06:49:18 UTC
Created attachment 243879 [details]
Simple MPD5 configuration, which not work on FreeBSD 12,13,14 and crash kernel.
Comment 23 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-08-09 10:04:08 UTC
(In reply to ny from comment #21)

The last panic is very strange. Is it reproduced by a simple call to kldload netgraph?
Comment 24 ny 2023-08-09 10:10:54 UTC
I have checked "kldload netgraph" on FreeBSD 14 without starting MPD5.
Yes, the kernel also crashes.
Comment 25 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-08-09 11:01:00 UTC
(In reply to ny from comment #24)

Unfortunately I can't test it on i386. On amd64, this panic does not reproduce for me. Maybe netgraph(4) is broken on i386 in 14-CURRENT.
Comment 26 ny 2023-08-09 12:29:04 UTC
(In reply to Aleksandr Fedorov from comment #25)

I just installed FreeBSD 14 amd64, and "kldload netgraph" indeed does not crash on it.
Comment 27 Mike Karels freebsd_committer freebsd_triage 2023-08-09 13:05:33 UTC
(In reply to Aleksandr Fedorov from comment #25)
I installed the 20230803 snapshot for i386 on a virtual machine.  kldload netgraph did not crash.
Comment 28 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-08-09 15:06:43 UTC
(In reply to Mike Karels from comment #27)

I also tested:
FreeBSD-14.0-CURRENT-i386-20230720-a52f23f4c49e-264226-disc1.iso - panic.
FreeBSD-14.0-CURRENT-i386-20230803-8a5c836b51ce-264491-mini-memstick.img - no panic.

This is an issue unrelated to mpd5.

The main problem is ng_ksocket(4) + TCP. That is, it's not even an mpd5 problem; ng_ksocket(4) is simply no longer able to do TCP.

I have some understanding of the problem.
Comment 29 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-08-09 15:10:33 UTC
(In reply to ny from comment #19)

This is because I tested with a kernel built without DEBUG. That's why you got a panic on KASSERT().

I will try to rework the patch.
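
The trace in comment 19 shows ng_ksocket_accept() being entered from solisten_wakeup() and panicking on a recursive acquisition of the "socket" mutex, which suggests the upcall ran with that lock already held. The pattern can be sketched in user space with a toy non-recursive lock. Everything named model_* is hypothetical, and the error return stands in for what the kernel reports as a panic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical non-recursive lock that records its current owner. */
struct model_lock {
	const void *owner;	/* NULL when unlocked */
};

/* Returns 0 on success, EDEADLK if the caller already holds the lock. */
static int
model_lock_acquire(struct model_lock *lk, const void *self)
{
	if (lk->owner == self)
		return (EDEADLK);	/* the kernel would panic here instead */
	if (lk->owner != NULL)
		return (EBUSY);		/* a real mutex would block */
	lk->owner = self;
	return (0);
}

static void
model_lock_release(struct model_lock *lk)
{
	lk->owner = NULL;
}

/* Upcall that wrongly assumes its caller does not hold the lock. */
static int
model_accept_upcall(struct model_lock *lk, const void *self)
{
	int error;

	error = model_lock_acquire(lk, self);
	if (error != 0)
		return (error);
	/* ... the new connection would be dequeued here ... */
	model_lock_release(lk);
	return (0);
}

/* Wakeup path: takes the lock, then invokes the upcall with it held. */
static int
model_listen_wakeup(struct model_lock *lk, const void *self)
{
	int error;

	error = model_lock_acquire(lk, self);
	if (error != 0)
		return (error);
	error = model_accept_upcall(lk, self);
	model_lock_release(lk);
	return (error);
}
```

In this model the upcall works when called on its own but fails with EDEADLK when reached through model_listen_wakeup(), so a reworked upcall would have to rely on the caller's lock rather than take it again.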
Comment 30 ny 2023-08-09 16:43:40 UTC
On FreeBSD 14 amd64, "kldload netgraph" does not crash, but starting mpd5 does.
Comment 31 ny 2023-08-10 09:14:53 UTC
The FreeBSD 14 amd64 kernel (8a5c836b51ce) crashes on MPD5 start.

Here is the crash:

FreeBSD fb 14.0-CURRENT FreeBSD 14.0-CURRENT amd64 1400093 #0 main-n264491-8a5c836b51ce: Thu Aug  3 08:15:15 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

panic: page fault

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x18
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80b25b68
stack pointer	        = 0x28:0xfffffe00545f0d60
frame pointer	        = 0x28:0xfffffe00545f0da0
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 846 (ng_queue0)
rdi: 0000000000000000 rsi: fffffe0054b71560 rdx: 0000000000000000
rcx: 00000000000003aa  r8: 0000000000000000  r9: 0000000000010000
rax: fffff800074af3c0 rbx: 0000000000000018 rbp: fffffe00545f0da0
r10: 0000000000000001 r11: 0000000000010000 r12: 00000000000003aa
r13: 0000000000000001 r14: fffff80003621800 r15: ffffffff82b396a5
trap number		= 12
panic: page fault
cpuid = 0
time = 1691665616
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00545f0b10
vpanic() at vpanic+0x149/frame 0xfffffe00545f0b60
panic() at panic+0x43/frame 0xfffffe00545f0bc0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00545f0c20
trap_pfault() at trap_pfault+0xae/frame 0xfffffe00545f0c90
calltrap() at calltrap+0x8/frame 0xfffffe00545f0c90
--- trap 0xc, rip = 0xffffffff80b25b68, rsp = 0xfffffe00545f0d60, rbp = 0xfffffe00545f0da0 ---
__mtx_lock_flags() at __mtx_lock_flags+0x48/frame 0xfffffe00545f0da0
ng_ksocket_shutdown() at ng_ksocket_shutdown+0x39/frame 0xfffffe00545f0dc0
ng_rmnode() at ng_rmnode+0x188/frame 0xfffffe00545f0df0
ng_apply_item() at ng_apply_item+0x4fb/frame 0xfffffe00545f0e80
ngthread() at ngthread+0x291/frame 0xfffffe00545f0ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00545f0f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00545f0f30
--- trap 0xc, rip = 0xe344027ccba, rsp = 0xe343d35b558, rbp = 0xe343d35b650 ---
KDB: enter: panic

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59
59		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59
#1  doadump (textdump=textdump@entry=0)
    at /usr/src/sys/kern/kern_shutdown.c:407
#2  0xffffffff804a2f1a in db_dump (dummy=<optimized out>, 
    dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/src/sys/ddb/db_command.c:593
#3  0xffffffff804a2d1d in db_command (last_cmdp=<optimized out>, 
    cmd_table=<optimized out>, dopager=true)
    at /usr/src/sys/ddb/db_command.c:506
#4  0xffffffff804a29dd in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:553
#5  0xffffffff804a60b6 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/src/sys/ddb/db_main.c:270
#6  0xffffffff80b99d53 in kdb_trap (type=type@entry=3, code=code@entry=0, 
    tf=tf@entry=0xfffffe00545f0a50) at /usr/src/sys/kern/subr_kdb.c:792
#7  0xffffffff81045db9 in trap (frame=0xfffffe00545f0a50)
    at /usr/src/sys/amd64/amd64/trap.c:610
#8  <signal handler called>
#9  kdb_enter (why=<optimized out>, msg=<optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:558
#10 0xffffffff80b4b86a in vpanic (fmt=0xffffffff81182bad "%s", 
    ap=ap@entry=0xfffffe00545f0ba0) at /usr/src/sys/kern/kern_shutdown.c:960
#11 0xffffffff80b4b633 in panic (
    fmt=0xffffffff8194fec0 <cnputs_mtx> "\257\346\023\201\377\377\377\377")
    at /usr/src/sys/kern/kern_shutdown.c:896
#12 0xffffffff8104624c in trap_fatal (frame=0xfffffe00545f0ca0, eva=24)
    at /usr/src/sys/amd64/amd64/trap.c:954
#13 0xffffffff810462fe in trap_pfault (frame=0xfffffe00545f0ca0, 
    usermode=false, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:762
#14 <signal handler called>
#15 __mtx_lock_flags (c=0x18, opts=opts@entry=0, 
    file=0xffffffff82b396a5 "/usr/src/sys/netgraph/ng_ksocket.c", 
    line=line@entry=938) at /usr/src/sys/kern/kern_mutex.c:273
#16 0xffffffff82b37559 in ng_ksocket_shutdown (node=0xfffff80003621800)
    at /usr/src/sys/netgraph/ng_ksocket.c:938
#17 0xffffffff82b23a48 in ng_rmnode (node=node@entry=0xfffff80003621800, 
    dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>)
    at /usr/src/sys/netgraph/ng_base.c:760
#18 0xffffffff82b25ddb in ng_generic_msg (here=0xfffff80003621800, 
    item=<optimized out>, lasthook=0xfffff800030b6580)
    at /usr/src/sys/netgraph/ng_base.c:2528
#19 ng_apply_item (node=node@entry=0xfffff80003621800, 
    item=item@entry=0xfffff80007ee1d80, rw=rw@entry=1)
    at /usr/src/sys/netgraph/ng_base.c:2442
#20 0xffffffff82b28d11 in ngthread (arg=<optimized out>)
    at /usr/src/sys/netgraph/ng_base.c:3451
#21 0xffffffff80b01b92 in fork_exit (callout=0xffffffff82b28a80 <ngthread>, 
    arg=0x0, frame=0xfffffe00545f0f40) at /usr/src/sys/kern/kern_fork.c:1162
#22 <signal handler called>
#23 0x00000e344027ccba in ?? ()
Backtrace stopped: Cannot access memory at address 0xe343d35b558
(kgdb)
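
The faulting address in the trace above (eva=24, which is also the c=0x18 argument to __mtx_lock_flags) is consistent with a lock word being accessed at a small offset from a NULL pointer. The following is only a sketch of that address arithmetic. The structures `demo_lock_object` and `demo_mtx` and the helper `lock_word_address` are hypothetical names with an assumed layout; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layouts for illustration only. These are NOT the kernel's
 * definitions; the field names and sizes are assumptions.
 */
struct demo_lock_object {
	const char	*name;
	unsigned int	 flags;
	unsigned int	 data;
	void		*witness;
};

struct demo_mtx {
	struct demo_lock_object	 lock_object;
	volatile uintptr_t	 lock_word;
};

/*
 * Return the address that would be accessed when touching the lock word of
 * a mutex located at `base`. Nothing is dereferenced here; only the address
 * arithmetic is computed.
 */
static uintptr_t
lock_word_address(uintptr_t base)
{
	/* Guard against wrap-around in the addition below. */
	assert(base <= UINTPTR_MAX - offsetof(struct demo_mtx, lock_word));
	return (base + offsetof(struct demo_mtx, lock_word));
}
```

With a base of 0 (a NULL mutex pointer) the computed address equals the member offset. On a typical LP64 platform with this assumed layout that is 0x18, far below any mapped kernel address, so the access page-faults.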
Comment 32 ny 2023-09-12 07:01:45 UTC
Is it possible to simply restore the old Netgraph code from the FreeBSD 11 kernel in 12, 13, and 14, so that MPD5 works again?
Comment 34 Eugene Grosbein freebsd_committer freebsd_triage 2023-11-17 04:13:29 UTC
(In reply to Gleb Smirnoff from comment #33)

The last two patches apply to the 14.0-RELEASE sources just fine, but D42635 does not apply at all. Could you please attach a version adjusted to the 14.0-RELEASE sources?
Comment 35 Gleb Smirnoff freebsd_committer freebsd_triage 2023-11-17 06:14:00 UTC
D42635 is not planned for merge into 14.0-RELEASE, as it is too risky for a stable branch. However, it is not required to repair ng_ksocket operation if the kernel runs without INVARIANTS.
Comment 36 commit-hook freebsd_committer freebsd_triage 2023-11-17 17:25:34 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=43f7e21668105cc5a3c66eae5ecef0203c2df62f

commit 43f7e21668105cc5a3c66eae5ecef0203c2df62f
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2023-11-17 17:24:30 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2023-11-17 17:24:30 +0000

    ng_ksocket: fix accept(2)

    - Provide listen upcall and set it on NGM_KSOCKET_LISTEN
    - Mask EWOULDBLOCK on NGM_KSOCKET_ACCEPT

    Reviewed by:            afedorov
    Differential Revision:  https://reviews.freebsd.org/D42637
    PR:                     272319
    PR:                     275106
    Fixes:                  779f106aa169256b7010a1d8f963ff656b881e92

 sys/netgraph/ng_ksocket.c | 41 +++++++++++++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 6 deletions(-)
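
The commit message above mentions masking EWOULDBLOCK on NGM_KSOCKET_ACCEPT. The kernel change itself is not shown in this thread, so the following is only a userland analogy, assuming a non-blocking listening socket: a would-block result from accept(2) is treated as "no connection pending yet" rather than as a failure. The helpers `make_nonblocking_listener` and `try_accept` are hypothetical names and are not part of ng_ksocket.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a non-blocking TCP socket listening on the
 * loopback address, on a port chosen by the system. Returns the descriptor,
 * or -1 on failure.
 */
static int
make_nonblocking_listener(void)
{
	struct sockaddr_in sin;
	int fd, flags;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return (-1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;	/* let the system pick a free port */

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 ||
	    fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
	    bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(fd, 1) < 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/*
 * Hypothetical helper: try to accept one connection. Returns 0 on success,
 * with *connfd set to the new descriptor, or to -1 if nothing was pending.
 * Any other failure is returned as an errno value.
 */
static int
try_accept(int listenfd, int *connfd)
{
	int fd, error;

	assert(connfd != NULL);
	fd = accept(listenfd, NULL, NULL);
	if (fd >= 0) {
		*connfd = fd;
		return (0);
	}
	error = errno;		/* save before any other call can change it */
	*connfd = -1;
	if (error == EWOULDBLOCK || error == EAGAIN)
		return (0);	/* masked: no pending connection is not an error */
	return (error);
}
```

A caller would invoke `try_accept()` again later, for example when notified that a connection has arrived, instead of reporting the would-block result as an error.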
Comment 37 commit-hook freebsd_committer freebsd_triage 2023-11-17 17:25:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=efad7cbfdc06e92bcc589a6c0cae2f3bea0d5cb9

commit efad7cbfdc06e92bcc589a6c0cae2f3bea0d5cb9
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2023-11-17 17:23:58 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2023-11-17 17:23:58 +0000

    ng_ksocket: fix upcall clearing on node shutdown

    Note: imho, the proper solution would be to guarantee that upcalls
    won't ever be called after soclose(), but this isn't the case, yet.
    This change at least makes the node work the way it always worked.

    Reviewed by:            afedorov
    Differential Revision:  https://reviews.freebsd.org/D42636
    PR:                     272319
    PR:                     275106
    Fixes:                  779f106aa169256b7010a1d8f963ff656b881e92

 sys/netgraph/ng_ksocket.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
Comment 38 ny 2023-11-18 23:03:34 UTC
I have checked it with the FreeBSD 14.0 kernel source.

1. Compiled and installed the generic kernel: everything passed.
2. Applied the patches.
3. Got errors during compilation:

cc -target i386-unknown-freebsd14.0 --sysroot=/usr/obj/usr/src/i386.i386/tmp -B/usr/obj/usr/src/i386.i386/tmp/usr/bin -c -O2 -pipe  -fno-strict-aliasing  -g -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common    -MD  -MF.depend.tcp_usrreq.o -MTtcp_usrreq.o -fdebug-prefix-map=./machine=/usr/src/sys/i386/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mno-mmx -mno-sse -msoft-float -ffreestanding -fwrapv -fstack-protector -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error=tautological-compare -Wno-error=empty-body -Wno-error=parentheses-equality -Wno-error=unused-function -Wno-error=pointer-sign -Wno-error=shift-negative-value -Wno-address-of-packed-member -Wno-format-zero-length   -mno-aes -mno-avx  -std=gnu99 -Werror /usr/src/sys/netinet/tcp_usrreq.c

/usr/src/sys/netinet/tcp_usrreq.c:746:2: error: use of undeclared identifier 'port'
        port = inp->inp_fport;
        ^
/usr/src/sys/netinet/tcp_usrreq.c:747:2: error: use of undeclared identifier 'addr'
        addr = inp->inp_faddr;
        ^
/usr/src/sys/netinet/tcp_usrreq.c:754:4: error: use of undeclared identifier 'nam'
                *nam = in_sockaddr(port, &addr);
                 ^
/usr/src/sys/netinet/tcp_usrreq.c:754:22: error: use of undeclared identifier 'port'
                *nam = in_sockaddr(port, &addr);
                                   ^
/usr/src/sys/netinet/tcp_usrreq.c:754:29: error: use of undeclared identifier 'addr'
                *nam = in_sockaddr(port, &addr);
                                          ^
/usr/src/sys/netinet/tcp_usrreq.c:808:11: error: call to undeclared function 'in6_v4mapsin6_sockaddr'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]
                        *nam = in6_v4mapsin6_sockaddr(port, &addr);
                               ^
/usr/src/sys/netinet/tcp_usrreq.c:808:11: note: did you mean 'in6_mapped_sockaddr'?
/usr/src/sys/netinet6/in6_pcb.h:105:5: note: 'in6_mapped_sockaddr' declared here
int     in6_mapped_sockaddr(struct socket *so, struct sockaddr **nam);
        ^
/usr/src/sys/netinet/tcp_usrreq.c:808:9: error: incompatible integer to pointer conversion assigning to 'struct sockaddr *' from 'int' [-Wint-conversion]
                        *nam = in6_v4mapsin6_sockaddr(port, &addr);
                             ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/src/sys/netinet/tcp_usrreq.c:1425:16: error: incompatible function pointer types initializing 'pr_accept_t *' (aka 'int (*)(struct socket *, struct sockaddr *)') with an expression of type 'int (struct socket *, struct sockaddr **)' [-Wincompatible-function-pointer-types]
        .pr_accept =            tcp6_usr_accept,
                                ^~~~~~~~~~~~~~~
8 errors generated.
*** Error code 1
Comment 39 ny 2023-11-19 16:15:27 UTC
Created attachment 246427 [details]
FreeBSD 15 crash / trace
Comment 40 ny 2023-11-19 16:16:33 UTC
FreeBSD 15 kernel+world compiled well, but it crashes on:

telnet 127.0.0.1 57
Comment 41 Gleb Smirnoff freebsd_committer freebsd_triage 2023-11-19 16:45:32 UTC
Hi, ny!

The panic you see on FreeBSD 15 is covered by https://reviews.freebsd.org/D42635. It is not committed yet. Should happen soon.

However, you will not see the panic on a kernel built without debugging, i.e. without the INVARIANTS option.

For FreeBSD 14 I will merge D42636 and D42637. But I'm not going to merge D42635. It is too large a change for a stable branch. As you see, it already doesn't apply. This means that FreeBSD 14 with INVARIANTS is not going to be fixed. We expect people to run STABLE branches without INVARIANTS unless debugging a problem. Note that the problem reported by malloc_dbg() on soaccept() has been in ng_ksocket since the very beginning of ng_ksocket history. It is malloc that became smarter and stricter recently.
Comment 42 ny 2023-11-19 16:54:34 UTC
Please update us when new FreeBSD 14 kernel source will be ready with all required patches. I will try to test it. It is a bit hard to use a version 15 kernel in production while many things (like packages) are not yet available for 15.

Using version 14 is more realistic.
But I will retest 15 without INVARIANTS and give you my feedback.

Thank you for your time.
Comment 43 Gleb Smirnoff freebsd_committer freebsd_triage 2023-11-19 16:59:10 UTC
> when new FreeBSD 14 kernel source will be ready with all required patches

When I merge the ng_ksocket fixes to stable/14, this bug will be updated automatically.

> But I will retest 15 without INVARIANTS and give you my feedback.

It would be better to test it with INVARIANTS and with https://reviews.freebsd.org/D42635 applied.
Comment 44 ny 2023-11-19 17:54:51 UTC
I can confirm that the FreeBSD 15 kernel works with debugging disabled. However, sockstat shows "? ? ? ?" at the start of the line for this socket. Maybe it is possible to show at least the name and PID of the mpd5 process that created the socket.
Comment 45 Gleb Smirnoff freebsd_committer freebsd_triage 2023-11-19 17:57:27 UTC
No, this can't be done. In principle, ng_ksocket implements a socket that faces the kernel instead of userland. It is the netgraph node that created the socket, not mpd. There is no process associated with the socket. If you kill -KILL mpd5, the socket will remain.
Comment 46 ny 2023-11-19 19:26:09 UTC
Tested FreeBSD 15 / i386 / 32 bit + INVARIANTS + D42635 patch + all debug.
Everything works well now. I am not able to crash it in any known way.

It looks like you can try to port all of this to FreeBSD 14. At minimum, this code is far better than the original crashing code. Thanks!
Comment 47 commit-hook freebsd_committer freebsd_triage 2023-11-30 17:02:42 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=09f4b840bd7cb6427af2a28a10bd839da6dd76d5

commit 09f4b840bd7cb6427af2a28a10bd839da6dd76d5
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2023-11-17 17:23:58 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2023-11-30 17:01:39 +0000

    ng_ksocket: fix upcall clearing on node shutdown

    Note: imho, the proper solution would be to guarantee that upcalls
    won't ever be called after soclose(), but this isn't the case, yet.
    This change at least makes the node work the way it always worked.

    Reviewed by:            afedorov
    Differential Revision:  https://reviews.freebsd.org/D42636
    PR:                     272319
    PR:                     275106
    Fixes:                  779f106aa169256b7010a1d8f963ff656b881e92

    (cherry picked from commit efad7cbfdc06e92bcc589a6c0cae2f3bea0d5cb9)

 sys/netgraph/ng_ksocket.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
Comment 48 commit-hook freebsd_committer freebsd_triage 2023-11-30 17:02:43 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ae3c8991cf0db9beff762f90b55e8995326eb894

commit ae3c8991cf0db9beff762f90b55e8995326eb894
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2023-11-17 17:24:30 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2023-11-30 17:01:40 +0000

    ng_ksocket: fix accept(2)

    - Provide listen upcall and set it on NGM_KSOCKET_LISTEN
    - Mask EWOULDBLOCK on NGM_KSOCKET_ACCEPT

    Reviewed by:            afedorov
    Differential Revision:  https://reviews.freebsd.org/D42637
    PR:                     272319
    PR:                     275106
    Fixes:                  779f106aa169256b7010a1d8f963ff656b881e92

    (cherry picked from commit 43f7e21668105cc5a3c66eae5ecef0203c2df62f)

 sys/netgraph/ng_ksocket.c | 41 +++++++++++++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 6 deletions(-)
Comment 49 Gleb Smirnoff freebsd_committer freebsd_triage 2023-11-30 17:07:03 UTC
Fixes to ng_ksocket have been merged to stable/14. With INVARIANTS, stable/14 will still panic. However, the problem now being caught by INVARIANTS has always been there, so it can be ignored.

The problem reported by INVARIANTS is fixed in the main branch. The change is too intrusive to be merged to a stable branch.