Hi. After upgrading to FreeBSD 14.0 I noticed lines like these in /var/log/messages:

Jun 20 01:36:14 servername kernel: pid 74752 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:37:41 servername kernel: pid 75150 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:38:51 servername kernel: pid 75425 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:39:15 servername kernel: pid 75587 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:44:02 servername kernel: pid 76745 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)

This happens when our script runs sockstat and parses its output; sockstat sometimes crashes. Running sockstat without any arguments gives the same result. One remark: the server handles a lot of connections.

Maybe a backtrace will help:

Process 46451 stopped
* thread #1, name = 'sockstat.full', stop reason = signal SIGSEGV: invalid address (fault address: 0x18)
    frame #0: 0x00001852b6abe497 sockstat.full`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
(lldb) list
(lldb) bt
* thread #1, name = 'sockstat.full', stop reason = signal SIGSEGV: invalid address (fault address: 0x18)
  * frame #0: 0x00001852b6abe497 sockstat.full`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
    frame #1: 0x00001852b6abe497 sockstat.full`displaysock [inlined] files_t_RB_FIND(head=<unavailable>, elm=<unavailable>) at sockstat.c:181:1
    frame #2: 0x00001852b6abe48e sockstat.full`displaysock(s=0x000026891e269a40, pos=40) at sockstat.c:1165:10
    frame #3: 0x00001852b6abdc10 sockstat.full`display at sockstat.c:1364:3
    frame #4: 0x00001852b6abcbd8 sockstat.full`main(argc=<unavailable>, argv=<unavailable>) at sockstat.c:1577:2
    frame #5: 0x0000185ada4cbafa libc.so.7`__libc_start1 + 298
    frame #6: 0x00001852b6abb17d sockstat.full`_start at crt1_s.S:83
(lldb)
'Me too'

Recent 14-STABLE amd64:

FreeBSD 14.1-STABLE #0 stable/14-n268159-60f78f8ed14d: Tue Jul 16 19:25:41 AEST 2024 john@rwsrv08.gfn.riverwillow.net.au:/build/obj/john/kits/src/amd64.amd64/sys/RWSRV08

There is no segfault if I specify -j to restrict the display to one of the jails; it only happens if I specify -j0 or omit -j.

This is my third build of 14-STABLE (beginning early May) and all of them have done the same. Same-vintage 14-STABLE on i386 is fine. These are the only two systems I have running FreeBSD.

rwsrv08# lldb -X sockstat
(lldb) target create "sockstat"
Current executable set to '/usr/bin/sockstat' (x86_64).
(lldb) run
Process 87548 launched: '/usr/bin/sockstat' (x86_64)
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
root sockstat 87554 6 stream -> [87548 8]
root sockstat 87553 6 stream -> [87548 7]
...
root syslogd 2948 9 dgram /var/run/logpriv
root gssd 2810 3 stream /var/run/gssd.sock
Process 87548 stopped
* thread #1, name = 'sockstat', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x18)
    frame #0: 0x000002c892dde507 sockstat`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
   176  static int64_t
   177  file_compare(const struct file *a, const struct file *b)
   178  {
-> 179          return ((int64_t)(a->xf_data/2 - b->xf_data/2));
                ^
   180  }
   181  RB_GENERATE_STATIC(files_t, file, file_tree, file_compare);
   182
(lldb) bt
* thread #1, name = 'sockstat', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x18)
  * frame #0: 0x000002c892dde507 sockstat`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
    frame #1: 0x000002c892dde507 sockstat`displaysock [inlined] files_t_RB_FIND(head=<unavailable>, elm=<unavailable>) at sockstat.c:181:1
    frame #2: 0x000002c892dde4fe sockstat`displaysock(s=0x00001790ce24be00, pos=<unavailable>) at sockstat.c:1165:10
    frame #3: 0x000002c892ddd71f sockstat`display at sockstat.c:1345:4
    frame #4: 0x000002c892ddcc07 sockstat`main(argc=<unavailable>, argv=<unavailable>) at sockstat.c:1577:2
    frame #5: 0x000002d0b7f008da libc.so.7`__libc_start1(argc=1, argv=0x000002d0b2e0ed10, env=0x000002d0b2e0ed20, cleanup=<unavailable>, mainX=(sockstat`main at sockstat.c:1434)) at libc_start1.c:157:7
    frame #6: 0x000002c892ddb18d sockstat`_start at crt1_s.S:83
(lldb) q
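A fault address of 0x18 together with a NULL pointer argument is consistent with reading a struct member that sits 0x18 bytes into an object through a null pointer. The sketch below uses a hypothetical `demo_file` struct to show how such an offset arises on a 64-bit system. The layout (three tree link pointers followed by a 64-bit key) is an assumption for illustration and is not copied from sockstat.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, assumed for illustration only: three tree link
 * pointers followed by a 64-bit key.
 */
struct demo_file {
	struct demo_file *link[3];	/* tree linkage */
	uint64_t key;			/* comparison key */
	int pid;
	int fd;
};

/* Address that a read of p->key would touch when p is NULL. */
static size_t
demo_null_fault_offset(void)
{
	return (offsetof(struct demo_file, key));
}
```

With 8-byte pointers, `demo_null_fault_offset()` evaluates to 24, that is 0x18.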
Rebuild sockstat without optimization, like this:

# make -C /usr/src/usr.bin/sockstat DEBUG_FLAGS=-g clean all install

then get the backtrace from gdb.
(In reply to Konstantin Belousov from comment #2)

Thank you.

- Re-built and installed sockstat as per your instruction.
- Built and installed devel/gdb

rwsrv08# gdb sockstat
GNU gdb (GDB) 14.1 [GDB v14.1 for FreeBSD]
...
Reading symbols from sockstat...
Reading symbols from /build/usr/lib/debug//usr/bin/sockstat.debug...
(gdb) r
Starting program: /usr/bin/sockstat
[Detaching after fork from child process 85890]
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
root sockstat 85895 6 stream -> [85889 8]
root sockstat 85894 6 stream -> [85889 7]
...
root syslogd 2948 9 dgram /var/run/logpriv
root gssd 2810 3 stream /var/run/gssd.sock

Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
files_t_RB_FIND (head=<optimized out>, elm=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:181
181 RB_GENERATE_STATIC(files_t, file, file_tree, file_compare);
(gdb) bt
#0 files_t_RB_FIND (head=<optimized out>, elm=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:181
#1 displaysock (s=s@entry=0x80184b440, pos=<optimized out>, pos@entry=30) at /kits/src/usr.bin/sockstat/sockstat.c:1165
#2 0x000000000102671f in display () at /kits/src/usr.bin/sockstat/sockstat.c:1345
#3 0x0000000001025c07 in main (argc=<optimized out>, argv=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:1577
(gdb)
(In reply to Konstantin Belousov from comment #2)

Sorry. That re-compile was still built with -O2. I was able to get it to compile with -O0 by adding 'CFLAGS=-O0' to /etc/make.conf. Is there a way to tell make(1) to use -O0 just for 'this' Makefile?

Here is what I hope is a more useful backtrace.

rwsrv08#
rwsrv08# gdb sockstat
GNU gdb (GDB) 14.1 [GDB v14.1 for FreeBSD]
...
Reading symbols from sockstat...
Reading symbols from /build/usr/lib/debug//usr/bin/sockstat.debug...
(gdb) r
Starting program: /usr/bin/sockstat
[Detaching after fork from child process 88560]
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
root sockstat 88565 6 stream -> [88559 8]
root sockstat 88564 6 stream -> [88559 7]
...
root syslogd 2948 9 dgram /var/run/logpriv
root gssd 2810 3 stream /var/run/gssd.sock

Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
0x0000000001029669 in displaysock (s=0x80184b200, pos=59) at /kits/src/usr.bin/sockstat/sockstat.c:1169
1169 (u_long)f->xf_pid, f->xf_fd);
(gdb) bt
#0 0x0000000001029669 in displaysock (s=0x80184b200, pos=59) at /kits/src/usr.bin/sockstat/sockstat.c:1169
#1 0x000000000102776b in display () at /kits/src/usr.bin/sockstat/sockstat.c:1345
#2 0x0000000001024eab in main (argc=0, argv=0x7fffffffeb60) at /kits/src/usr.bin/sockstat/sockstat.c:1577
(gdb)
Have you set something that prevents visibility of processes/jails, like see_other_uids etc.?

Anyway, please apply this debugging patch and report whether the program still SIGSEGVs with it.

diff --git a/usr.bin/sockstat/sockstat.c b/usr.bin/sockstat/sockstat.c
index 73b1f00a4481..20a0a5e65e0a 100644
--- a/usr.bin/sockstat/sockstat.c
+++ b/usr.bin/sockstat/sockstat.c
@@ -1164,6 +1164,7 @@ displaysock(struct sock *s, int pos)
 				f = RB_FIND(files_t, &ftree,
 				    &(struct file){ .xf_data =
 				    p->socket });
+				if (f != NULL)
 				pos += xprintf("[%lu %d]",
 				    (u_long)f->xf_pid, f->xf_fd);
 			} else
@@ -1183,6 +1184,7 @@ displaysock(struct sock *s, int pos)
 				f = RB_FIND(files_t, &ftree,
 				    &(struct file){ .xf_data =
 				    p->socket });
+				if (f != NULL);
 				pos += xprintf("%s[%lu %d]",
 				    fref ? "" : ",",
 				    (u_long)f->xf_pid, f->xf_fd);
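A note on the second hunk as posted: `if (f != NULL);` ends in a semicolon. In C that semicolon is an empty statement, so the `xprintf()` call after it is not guarded by the check. The sketch below illustrates the difference with two hypothetical helper functions (the names are made up for this example and do not appear in sockstat.c).

```c
#include <stddef.h>

/* Counts how often the "guarded" statement runs. */
static int
guard_with_stray_semicolon(const int *f)
{
	int ran = 0;

	if (f != NULL);		/* empty statement: the check guards nothing */
		ran++;		/* always executed, despite the indentation */
	return (ran);
}

static int
guard_without_semicolon(const int *f)
{
	int ran = 0;

	if (f != NULL)
		ran++;		/* executed only when f is not NULL */
	return (ran);
}
```

Called with a NULL argument, the first function still increments its counter while the second does not. Compilers may flag the first form with an empty-body or misleading-indentation warning.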
(In reply to Konstantin Belousov from comment #5)

That debugging patch stops the segfault in my case. The program runs to completion.

Yes, I have security restrictions in place. My sysctl.conf includes the following.

--
kern.securelevel=0 # init(8) will raise this to 1 going multi-user.
security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
security.bsd.unprivileged_proc_debug=0
security.bsd.unprivileged_read_msgbuf=0
--

Thank you.
https://reviews.freebsd.org/D46050
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=35f4984343229545881a324a00cdbb3980d675ce

commit 35f4984343229545881a324a00cdbb3980d675ce
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-07-20 00:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-07-21 08:51:42 +0000

    sockstat(1): tolerate situation where file info cannot be fetched

    Either due to a race, or to the privilege restrictions, it is not
    guaranteed that kern.files returned file information for all pcbs
    read from net.inet.<proto>.pcblist. In this case the file rbtree
    does not return the matching file by data address, and code must
    avoid dereferencing NULL.

    PR:     279875
    Reviewed by:    asomers
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D46050

 usr.bin/sockstat/sockstat.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
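The pattern described in the commit message (a lookup by data address that may find nothing, followed by a check before dereferencing) can be sketched as below. This is a minimal illustration, not the committed code: it swaps the rbtree for `bsearch()` over a sorted array so that it builds with only the C standard library, and the names `demo_file`, `demo_find` and `demo_format_peer` are hypothetical. Printing nothing when no match is found is also an assumption of this sketch.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: kernel data address plus owning pid and fd. */
struct demo_file {
	uint64_t data;
	int pid;
	int fd;
};

static int
demo_cmp(const void *a, const void *b)
{
	const struct demo_file *fa = a, *fb = b;

	return ((fa->data > fb->data) - (fa->data < fb->data));
}

/*
 * Returns the matching record, or NULL when no file information is
 * available for this address.  The table must be sorted by data.
 */
static const struct demo_file *
demo_find(const struct demo_file *tab, size_t n, uint64_t data)
{
	struct demo_file key = { .data = data };

	return (bsearch(&key, tab, n, sizeof(*tab), demo_cmp));
}

/* Formats the peer as "[pid fd]"; writes an empty string if not found. */
static int
demo_format_peer(char *buf, size_t len, const struct demo_file *tab,
    size_t n, uint64_t data)
{
	const struct demo_file *f = demo_find(tab, n, data);

	if (f == NULL) {
		if (len > 0)
			buf[0] = '\0';
		return (0);
	}
	return (snprintf(buf, len, "[%lu %d]", (unsigned long)f->pid, f->fd));
}
```

The return value is the number of characters produced, so a caller can add it to a running column position and a missing entry simply contributes zero.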
(In reply to commit-hook from comment #8) Thank you. Applying this patch to 14.1-STABLE works fine for me.
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7d1e4b7bf299dbf2319c969357cd1545ad81c8a6

commit 7d1e4b7bf299dbf2319c969357cd1545ad81c8a6
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-07-20 00:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-07-28 15:02:45 +0000

    sockstat(1): tolerate situation where file info cannot be fetched

    PR:     279875

    (cherry picked from commit 35f4984343229545881a324a00cdbb3980d675ce)

 usr.bin/sockstat/sockstat.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)