Bug 279875 - sockstat: segmentation fault
Summary: sockstat: segmentation fault
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 14.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2024-06-20 09:42 UTC by Kirill
Modified: 2024-07-28 15:06 UTC

Description Kirill 2024-06-20 09:42:55 UTC
Hi.

After upgrading to FreeBSD 14.0, I noticed lines like these in /var/log/messages:

Jun 20 01:36:14 servername kernel: pid 74752 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:37:41 servername  kernel: pid 75150 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:38:51 servername  kernel: pid 75425 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:39:15 servername  kernel: pid 75587 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)
Jun 20 01:44:02 servername  kernel: pid 76745 (sockstat), jid 1, uid 1003: exited on signal 11 (no core dump - other error)

This happens when our script parses sockstat output; sockstat crashes only some of the time. Running sockstat without any arguments gives the same result. Note that the server handles a large number of connections.

Maybe a backtrace will help:

Process 46451 stopped
* thread #1, name = 'sockstat.full', stop reason = signal SIGSEGV: invalid address (fault address: 0x18)
    frame #0: 0x00001852b6abe497 sockstat.full`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
(lldb) list
(lldb) bt
* thread #1, name = 'sockstat.full', stop reason = signal SIGSEGV: invalid address (fault address: 0x18)
  * frame #0: 0x00001852b6abe497 sockstat.full`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
    frame #1: 0x00001852b6abe497 sockstat.full`displaysock [inlined] files_t_RB_FIND(head=<unavailable>, elm=<unavailable>) at sockstat.c:181:1
    frame #2: 0x00001852b6abe48e sockstat.full`displaysock(s=0x000026891e269a40, pos=40) at sockstat.c:1165:10
    frame #3: 0x00001852b6abdc10 sockstat.full`display at sockstat.c:1364:3
    frame #4: 0x00001852b6abcbd8 sockstat.full`main(argc=<unavailable>, argv=<unavailable>) at sockstat.c:1577:2
    frame #5: 0x0000185ada4cbafa libc.so.7`__libc_start1 + 298
    frame #6: 0x00001852b6abb17d sockstat.full`_start at crt1_s.S:83
(lldb)
Comment 1 John Marshall 2024-07-18 10:40:00 UTC
'Me too'

Recent 14-STABLE amd64

 FreeBSD 14.1-STABLE #0 stable/14-n268159-60f78f8ed14d: Tue Jul 16 19:25:41 AEST 2024     john@rwsrv08.gfn.riverwillow.net.au:/build/obj/john/kits/src/amd64.amd64/sys/RWSRV08

No segfault if I specify -j to restrict the display to one of the jails; it crashes only if I specify -j0 or omit -j. This is my third build of 14-STABLE (beginning in early May) and all of them have behaved the same way. 14-STABLE of the same vintage on i386 is fine. These are the only two systems I have running FreeBSD.

rwsrv08# lldb -X sockstat
(lldb) target create "sockstat"
Current executable set to '/usr/bin/sockstat' (x86_64).
(lldb) run
Process 87548 launched: '/usr/bin/sockstat' (x86_64)
USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
root     sockstat   87554 6   stream -> [87548 8]
root     sockstat   87553 6   stream -> [87548 7]
...
root     syslogd     2948 9   dgram  /var/run/logpriv
root     gssd        2810 3   stream /var/run/gssd.sock
Process 87548 stopped
* thread #1, name = 'sockstat', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x18)
    frame #0: 0x000002c892dde507 sockstat`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
   176 	static int64_t
   177 	file_compare(const struct file *a, const struct file *b)
   178 	{
-> 179 		return ((int64_t)(a->xf_data/2 - b->xf_data/2));
    		                                    ^
   180 	}
   181 	RB_GENERATE_STATIC(files_t, file, file_tree, file_compare);
   182 	
(lldb) bt
* thread #1, name = 'sockstat', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x18)
  * frame #0: 0x000002c892dde507 sockstat`displaysock [inlined] file_compare(a=<unavailable>, b=0x0000000000000000) at sockstat.c:179:38
    frame #1: 0x000002c892dde507 sockstat`displaysock [inlined] files_t_RB_FIND(head=<unavailable>, elm=<unavailable>) at sockstat.c:181:1
    frame #2: 0x000002c892dde4fe sockstat`displaysock(s=0x00001790ce24be00, pos=<unavailable>) at sockstat.c:1165:10
    frame #3: 0x000002c892ddd71f sockstat`display at sockstat.c:1345:4
    frame #4: 0x000002c892ddcc07 sockstat`main(argc=<unavailable>, argv=<unavailable>) at sockstat.c:1577:2
    frame #5: 0x000002d0b7f008da libc.so.7`__libc_start1(argc=1, argv=0x000002d0b2e0ed10, env=0x000002d0b2e0ed20, cleanup=<unavailable>, mainX=(sockstat`main at sockstat.c:1434)) at libc_start1.c:157:7
    frame #6: 0x000002c892ddb18d sockstat`_start at crt1_s.S:83
(lldb) q
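
The fault address 0x18 together with b=0x0 is what one would expect when a field is read through a NULL pointer: the faulting address is simply the field's offset within the structure. A minimal sketch of that arithmetic follows. The struct demo_file is a hypothetical stand-in that assumes three tree-link pointers followed by the key; it is not copied from sockstat.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical node layout (an assumption for illustration):
 * three tree-link pointers, then the lookup key.
 */
struct demo_file {
	struct demo_file *link[3];	/* assumed: left, right, parent */
	uint64_t xf_data;		/* lookup key */
	int xf_pid;
	int xf_fd;
};

/*
 * Address that would be touched by ((struct demo_file *)NULL)->xf_data,
 * i.e. the offset of the key inside the node.
 */
static size_t
null_deref_address(void)
{
	size_t off = offsetof(struct demo_file, xf_data);

	/* The key cannot sit before the three link pointers. */
	assert(off >= 3 * sizeof(void *));
	return (off);
}
```

With 8-byte pointers, as on amd64, null_deref_address() evaluates to 0x18 under this assumed layout, which matches the reported fault address.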
Comment 2 Konstantin Belousov freebsd_committer freebsd_triage 2024-07-19 05:40:18 UTC
Rebuild sockstat without optimization, like this:
# make -C /usr/src/usr.bin/sockstat DEBUG_FLAGS=-g clean all install
then get the backtrace from gdb.
Comment 3 John Marshall 2024-07-19 08:42:06 UTC
(In reply to Konstantin Belousov from comment #2)

Thank you.
 - Re-built and installed sockstat as per your instruction.
 - Built and installed devel/gdb

rwsrv08# gdb sockstat
GNU gdb (GDB) 14.1 [GDB v14.1 for FreeBSD]
...
Reading symbols from sockstat...
Reading symbols from /build/usr/lib/debug//usr/bin/sockstat.debug...
(gdb) r
Starting program: /usr/bin/sockstat 
[Detaching after fork from child process 85890]
USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
root     sockstat   85895 6   stream -> [85889 8]
root     sockstat   85894 6   stream -> [85889 7]
...
root     syslogd     2948 9   dgram  /var/run/logpriv
root     gssd        2810 3   stream /var/run/gssd.sock

Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
files_t_RB_FIND (head=<optimized out>, elm=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:181
181	RB_GENERATE_STATIC(files_t, file, file_tree, file_compare);
(gdb) bt
#0  files_t_RB_FIND (head=<optimized out>, elm=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:181
#1  displaysock (s=s@entry=0x80184b440, pos=<optimized out>, pos@entry=30) at /kits/src/usr.bin/sockstat/sockstat.c:1165
#2  0x000000000102671f in display () at /kits/src/usr.bin/sockstat/sockstat.c:1345
#3  0x0000000001025c07 in main (argc=<optimized out>, argv=<optimized out>) at /kits/src/usr.bin/sockstat/sockstat.c:1577
(gdb)
Comment 4 John Marshall 2024-07-19 10:22:21 UTC
(In reply to Konstantin Belousov from comment #2)

Sorry.  That re-compile still had -O2.  I found that I was able to get it to compile with -O0 by adding 'CFLAGS=-O0' to /etc/make.conf.  Is there a way to tell make(1) to use -O0 just for 'this' Makefile?

Here is what I hope is a more useful backtrace.

rwsrv08# 
rwsrv08# gdb sockstat
GNU gdb (GDB) 14.1 [GDB v14.1 for FreeBSD]
...
Reading symbols from sockstat...
Reading symbols from /build/usr/lib/debug//usr/bin/sockstat.debug...
(gdb) r
Starting program: /usr/bin/sockstat 
[Detaching after fork from child process 88560]
USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
root     sockstat   88565 6   stream -> [88559 8]
root     sockstat   88564 6   stream -> [88559 7]
...
root     syslogd     2948 9   dgram  /var/run/logpriv
root     gssd        2810 3   stream /var/run/gssd.sock

Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
0x0000000001029669 in displaysock (s=0x80184b200, pos=59) at /kits/src/usr.bin/sockstat/sockstat.c:1169
1169						    (u_long)f->xf_pid, f->xf_fd);
(gdb) bt
#0  0x0000000001029669 in displaysock (s=0x80184b200, pos=59) at /kits/src/usr.bin/sockstat/sockstat.c:1169
#1  0x000000000102776b in display () at /kits/src/usr.bin/sockstat/sockstat.c:1345
#2  0x0000000001024eab in main (argc=0, argv=0x7fffffffeb60) at /kits/src/usr.bin/sockstat/sockstat.c:1577
(gdb)
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2024-07-19 23:04:08 UTC
Have you set anything that restricts the visibility of processes or jails,
such as see_other_uids?

Anyway, please apply this debugging patch and report if the program still
SIGSEGVs with it.

diff --git a/usr.bin/sockstat/sockstat.c b/usr.bin/sockstat/sockstat.c
index 73b1f00a4481..20a0a5e65e0a 100644
--- a/usr.bin/sockstat/sockstat.c
+++ b/usr.bin/sockstat/sockstat.c
@@ -1164,6 +1164,7 @@ displaysock(struct sock *s, int pos)
 					f = RB_FIND(files_t, &ftree,
 					    &(struct file){ .xf_data =
 					    p->socket });
+					if (f != NULL)
 					pos += xprintf("[%lu %d]",
 					    (u_long)f->xf_pid, f->xf_fd);
 				} else
@@ -1183,6 +1184,7 @@ displaysock(struct sock *s, int pos)
 					f = RB_FIND(files_t, &ftree,
 					    &(struct file){ .xf_data =
 					    p->socket });
+					if (f != NULL)
 					pos += xprintf("%s[%lu %d]",
 					    fref ? "" : ",",
 					    (u_long)f->xf_pid, f->xf_fd);
Comment 6 John Marshall 2024-07-20 00:26:38 UTC
(In reply to Konstantin Belousov from comment #5)

That debugging patch stops the segfault in my case. Program runs to completion.

Yes, I have security restrictions in place. My sysctl.conf includes the following.

--
kern.securelevel=0		# init(8) will raise this to 1 going multi-user.
security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
security.bsd.unprivileged_proc_debug=0
security.bsd.unprivileged_read_msgbuf=0
--

Thank you.
Comment 7 Konstantin Belousov freebsd_committer freebsd_triage 2024-07-20 00:39:50 UTC
https://reviews.freebsd.org/D46050
Comment 8 commit-hook freebsd_committer freebsd_triage 2024-07-21 08:52:02 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=35f4984343229545881a324a00cdbb3980d675ce

commit 35f4984343229545881a324a00cdbb3980d675ce
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-07-20 00:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-07-21 08:51:42 +0000

    sockstat(1): tolerate situation where file info cannot be fetched

    Either due to a race, or to the privilege restrictions, it is not
    guaranteed that kern.files returned file information for all pcbs
    read from net.inet.<proto>.pcblist.  In this case the file rbtree does
    not return the matching file by data address, and code must avoid
    dereferencing NULL.

    PR:     279875
    Reviewed by:    asomers
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D46050

 usr.bin/sockstat/sockstat.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
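
The situation described in the commit message can be sketched as follows. This is an illustration only, not the committed change: it uses the standard bsearch(3) on a sorted array in place of the RB_FIND tree lookup, and the names demo_file, find_file and format_peer are hypothetical. The "[?]" placeholder is also an assumption; the thread does not show what the committed code prints when the lookup fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical file record keyed by the socket's kernel address. */
struct demo_file {
	uint64_t xf_data;
	int xf_pid;
	int xf_fd;
};

static int
demo_compare(const void *a, const void *b)
{
	const struct demo_file *fa = a, *fb = b;

	/* Explicit comparison instead of subtracting the 64-bit keys. */
	return ((fa->xf_data > fb->xf_data) - (fa->xf_data < fb->xf_data));
}

/*
 * Look up a socket address in an array sorted by xf_data.
 * Returns NULL when no file record matches, for example because
 * the record was not visible or the socket appeared after the scan.
 */
static const struct demo_file *
find_file(const struct demo_file *files, size_t n, uint64_t sock)
{
	struct demo_file key = { .xf_data = sock };

	return (bsearch(&key, files, n, sizeof(*files), demo_compare));
}

/* Format "[pid fd]", or a placeholder when the lookup fails. */
static int
format_peer(char *buf, size_t len, const struct demo_file *files,
    size_t n, uint64_t sock)
{
	const struct demo_file *f;

	assert(buf != NULL && len > 0);
	f = find_file(files, n, sock);
	if (f == NULL)			/* the check the crash was missing */
		return (snprintf(buf, len, "[?]"));
	return (snprintf(buf, len, "[%lu %d]",
	    (unsigned long)f->xf_pid, f->xf_fd));
}
```

The point of the sketch is only that the result of the lookup is tested before f->xf_pid and f->xf_fd are read; how the missing entry is reported is a separate choice.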
Comment 9 John Marshall 2024-07-22 00:01:35 UTC
(In reply to commit-hook from comment #8)
Thank you.  Applying this patch to 14.1-STABLE works fine for me.
Comment 10 commit-hook freebsd_committer freebsd_triage 2024-07-28 15:03:06 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7d1e4b7bf299dbf2319c969357cd1545ad81c8a6

commit 7d1e4b7bf299dbf2319c969357cd1545ad81c8a6
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-07-20 00:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-07-28 15:02:45 +0000

    sockstat(1): tolerate situation where file info cannot be fetched

    PR:     279875

    (cherry picked from commit 35f4984343229545881a324a00cdbb3980d675ce)

 usr.bin/sockstat/sockstat.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)