Bug 259084 - Panic while kill -9 syslogd
Summary: Panic while kill -9 syslogd
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-10-11 22:29 UTC by Peter Eriksson
Modified: 2023-08-03 21:35 UTC

See Also:


Attachments
procstat -kk, ps and some vmstat log output just before the panic (81.01 KB, application/gzip)
2021-10-11 22:40 UTC, Peter Eriksson

Description Peter Eriksson 2021-10-11 22:29:06 UTC
Just noticed that one of our servers had many "hanging" ntpd processes. Tried to kill them and then noticed many <defunct> processes that seemed to have been created by syslogd. Tried to kill syslogd in order to restart it, and then the kernel panicked:

Fatal trap 12: page fault while in kernel mode
cpuid = 20; apic id = 14
fault virtual address	= 0x410
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80b9f55c
stack pointer	        = 0x28:0xfffffe14debc6710
frame pointer	        = 0x28:0xfffffe14debc6790
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 9277 (sshd)
trap number		= 12
panic: page fault
cpuid = 20
time = 1633990484
KDB: stack backtrace:
#0 0xffffffff80c0ad75 at kdb_backtrace+0x65
#1 0xffffffff80bbf02b at vpanic+0x17b
#2 0xffffffff80bbeea3 at panic+0x43
#3 0xffffffff8108e911 at trap_fatal+0x391
#4 0xffffffff8108e96f at trap_pfault+0x4f
#5 0xffffffff8108dfb6 at trap+0x286
#6 0xffffffff81066c28 at calltrap+0x8
#7 0xffffffff80c6365f at unp_pcb_owned_lock2_slowpath+0x12f
#8 0xffffffff80c61e0f at uipc_send+0x139f
#9 0xffffffff80c55b7a at sosend_generic+0x4ca
#10 0xffffffff80c55f90 at sosend+0x50
#11 0xffffffff80c5cc55 at kern_sendit+0x225
#12 0xffffffff80c5cfcc at sendit+0x19c
#13 0xffffffff80c5ce1d at sys_sendto+0x4d
#14 0xffffffff8108f4c7 at amd64_syscall+0x387
#15 0xffffffff8106754e at fast_syscall_common+0xf8
Uptime: 212d21h35m47s
mpr0: Sending StopUnit: path (xpt0:mpr0:0:20:ffffffff):

This server was running the 12.2-RELEASE-p4 kernel at the time of the crash, with an uptime of about 212 days.
Comment 1 Peter Eriksson 2021-10-11 22:40:23 UTC
Created attachment 228602
procstat -kk, ps and some vmstat log output just before the panic

Adding a tar.gz file containing some "procstat -kka", "ps auxwww" and various "vmstat" outputs captured just a minute before the panic/reboot in case it would help in providing some clues...
Comment 2 Peter Eriksson 2021-10-12 13:22:28 UTC
It seems the syslogd process is blocked in the kernel in "pipe_write":

# egrep syslogd /var/log/sys/15:10/procstat-kk-a.log
 9212 101640 syslogd             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_wait_sig+0xf _sleep+0x1de pipe_write+0x583 dofilewrite+0xb0 sys_write+0xc0 amd64_syscall+0x387 fast_syscall_common+0xf8 

Looking at our other servers, I notice that syslogd has frozen on more than one of them, causing thousands of defunct processes, and it seems to have happened at the time of a log rotation. Hmm...
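
For anyone reading the "pipe_write" stack above: a writer sleeps in write(2) on a pipe when the pipe buffer is full and nothing is reading from the other end. The following is a minimal sketch of that general behaviour only. It is not syslogd code, and whether a stalled pipe reader is what actually happened on these servers is an assumption that nothing in this report confirms. The function names fill_pipe and drain are made up for this illustration. The write end is set non-blocking so the demo returns instead of hanging the way a blocking writer would.

```python
import os


def fill_pipe(chunk_size=4096):
    """Write into a pipe that nobody reads from until the buffer is full.

    Returns (read_fd, write_fd, bytes_written). The write end is made
    non-blocking so this demo returns; a blocking writer would instead
    sleep inside write(2) at exactly this point.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    written = 0
    while True:
        try:
            written += os.write(write_fd, chunk)
        except BlockingIOError:
            # Buffer is full and no reader is draining it.
            return read_fd, write_fd, written


def drain(read_fd, nbytes):
    """Read exactly nbytes from the pipe, as a healthy reader would."""
    remaining = nbytes
    while remaining > 0:
        data = os.read(read_fd, min(remaining, 65536))
        remaining -= len(data)


if __name__ == "__main__":
    r, w, n = fill_pipe()
    print(f"pipe accepted {n} bytes before a write would block")
    drain(r, n)  # once the reader catches up, writes succeed again
    print("write after drain:", os.write(w, b"ok"), "bytes")
    os.close(r)
    os.close(w)
```

The number of bytes accepted before the write would block depends on the operating system's pipe buffer size, so the sketch reports it rather than assuming a value.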