Just noticed that one of our servers had many "hanging" ntpd processes. I tried to kill them and then noticed many <defunct> processes that seemed to have been created by syslogd. I tried to kill syslogd in order to restart it, and then the kernel panicked:

Fatal trap 12: page fault while in kernel mode
cpuid = 20; apic id = 14
fault virtual address = 0x410
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80b9f55c
stack pointer = 0x28:0xfffffe14debc6710
frame pointer = 0x28:0xfffffe14debc6790
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 9277 (sshd)
trap number = 12
panic: page fault
cpuid = 20
time = 1633990484
KDB: stack backtrace:
#0 0xffffffff80c0ad75 at kdb_backtrace+0x65
#1 0xffffffff80bbf02b at vpanic+0x17b
#2 0xffffffff80bbeea3 at panic+0x43
#3 0xffffffff8108e911 at trap_fatal+0x391
#4 0xffffffff8108e96f at trap_pfault+0x4f
#5 0xffffffff8108dfb6 at trap+0x286
#6 0xffffffff81066c28 at calltrap+0x8
#7 0xffffffff80c6365f at unp_pcb_owned_lock2_slowpath+0x12f
#8 0xffffffff80c61e0f at uipc_send+0x139f
#9 0xffffffff80c55b7a at sosend_generic+0x4ca
#10 0xffffffff80c55f90 at sosend+0x50
#11 0xffffffff80c5cc55 at kern_sendit+0x225
#12 0xffffffff80c5cfcc at sendit+0x19c
#13 0xffffffff80c5ce1d at sys_sendto+0x4d
#14 0xffffffff8108f4c7 at amd64_syscall+0x387
#15 0xffffffff8106754e at fast_syscall_common+0xf8
Uptime: 212d21h35m47s
mpr0: Sending StopUnit: path (xpt0:mpr0:0:20:ffffffff):

This server was running the 12.2-RELEASE-p4 kernel at the time of the crash, after roughly 212 days of uptime.
Created attachment 228602
procstat -kk, ps and some vmstat log output just before the panic

Adding a tar.gz file containing "procstat -kka", "ps auxwww" and various "vmstat" outputs captured just a minute before the panic/reboot, in case it helps provide some clues.
It seems the syslogd process is blocked in the kernel in pipe_write:

# egrep syslogd /var/log/sys/15:10/procstat-kk-a.log
9212 101640 syslogd - mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_wait_sig+0xf _sleep+0x1de pipe_write+0x583 dofilewrite+0xb0 sys_write+0xc0 amd64_syscall+0x387 fast_syscall_common+0xf8

Looking at our other servers, I notice that syslogd has frozen on more than one of them, leaving thousands of defunct processes behind, and it seems to have happened at the time of a log rotation. Hmm...
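In case it helps anyone check their own machines for the same pile-up of defunct processes, here is a rough Python sketch that counts zombies per parent PID. It is not taken from the attached logs. The function name count_zombies_by_parent is made up for illustration, and it assumes the input is text in the column layout of "ps -axo pid,ppid,stat,comm" (a header line, then one process per line).

```python
from collections import Counter


def count_zombies_by_parent(ps_output: str) -> Counter:
    """Count zombie (state Z) processes per parent PID.

    Assumes the column layout of `ps -axo pid,ppid,stat,comm`:
    one header line, then PID, PPID, STAT and COMMAND per process.
    """
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # blank or truncated line
        ppid, stat = fields[1], fields[2]
        if stat.startswith("Z"):  # Z marks a zombie/defunct process
            counts[int(ppid)] += 1
    return counts


# Hypothetical sample input, not real output from the affected server.
sample = (
    "  PID  PPID STAT COMMAND\n"
    " 9212     1 Ss   syslogd\n"
    " 9300  9212 Z    <defunct>\n"
    " 9301  9212 Z    <defunct>\n"
    " 9400     1 Z    <defunct>\n"
)
zombies = count_zombies_by_parent(sample)
```

Feeding it the real ps output and looking at the parent PIDs with the highest counts should show quickly whether a single syslogd process owns most of the defunct children.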