I updated my server two days ago with freebsd-update, going from a 12.2-RELEASE-p7 kernel to p12. Since then it has panicked roughly 20 times in 24 hours, each time as soon as there was filesystem activity. gmirror repairs seemed to make no difference. After reverting to the old p7 kernel, the machine has been stable again for 24 hours. I also tried upgrading to 12.3, but it crashed just like 12.2-RELEASE-p12 right after the kernel upgrade step, before I had a chance to upgrade the binaries.

I managed to photograph the panic stack trace just before the machine rebooted after one of the crashes (see transcript below). I read through the changes from p7 to p12 and I think the panic is related to the changes made to the pmap code by this commit (1), since the stack trace starts in PTDpde. I'll admit that, while I have been running FreeBSD servers since, I think, 2.7, I'm not familiar with the kernel code, but this reeks of a race condition under load.

The machine is a Core i5 2400 (see details below). Yes, I know I could be running the 64-bit version, but I never needed more than 4 GB of RAM, and the 32-bit version of the OS has run flawlessly on it for years.

Sorry if this is a duplicate; I searched and could not find anything related.
(1) https://github.com/freebsd/freebsd-src/commit/a165b4591e48cd2adce8215fca73147c016e6cea#diff-b34ee41e14f87fb2b18fdf77337237f336830ae88aac2a02e1c32aa45e43b4de

panic: vm_fault: fault on nofault entry, addr: 0
cpuid = 1
time = 1642161900
KDB: stack backtrace:
#0 0x10327ae at kdb_backtrace+0x4e
#1 0xfed128 at vpanic+0x118
#2 0xfeda4 at panic+0x14
#3 0x12e5733 at vm_fault+0x2613
#4 0x12e3832 at vm_fault_trap+0x42
#5 0x154c0f5 at trap_pfault+0x115
#6 0x154b71f at trap+0x36f
#7 0xffc0319d at PTDpde-0x41a5
#8 0x18bbaa3 at _umtx_op_nwake_private+0x93
#9 0x154c7b9 at syscall+0x3e9
#10 0xffc033e7 at PTDpde+0x43ef
Uptime: 3m20s

CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz (3093.04-MHz 686-class CPU)
  Origin="GenuineIntel" Id=0x206a7 Family=0x6 Model=0x2a Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x1fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x28100000<NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
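For anyone correlating a panic like the one above with backup or cron schedules, the `time = ...` field in the panic header can be decoded. The following is only a minimal sketch: it assumes the value is seconds since the Unix epoch, and the helper name `panic_time_utc` is made up for illustration, not part of any FreeBSD tool.

```python
import re
from datetime import datetime, timezone


def panic_time_utc(panic_text):
    """Return the panic timestamp as an aware UTC datetime, or None if absent.

    Assumption: the 'time = N' field is seconds since the Unix epoch.
    """
    # \b keeps the pattern from matching inside words such as 'Uptime'.
    match = re.search(r"\btime\s*=\s*(\d+)", panic_text)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Header copied from the panic transcript in this report.
header = "panic: vm_fault: fault on nofault entry, addr: 0 cpuid = 1 time = 1642161900"
print(panic_time_utc(header).isoformat())
```

The result is in UTC, so it has to be converted to the server's local time zone before comparing it with local logs.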
Created attachment 231117 coredump1
Created attachment 231118 coredump2
Created attachment 231119 coredump3
Created attachment 231120 coredump4
Created attachment 231121 coredump5
Created attachment 231122 coredump6
I upgraded one server from amd64 12.3-RELEASE to amd64 12.3-RELEASE-p1; it is stable. I also upgraded two servers from i386 12.2-RELEASE-p11 to i386 12.3-RELEASE-p1; both have crashed several times since.
Just a +1 for this. Upgraded from kernel 12.2-RELEASE-p7 to 12.2-RELEASE-p12 with freebsd-update on several systems, both amd64 and i386. Some of the i386 systems now panic under disk load (backups!). These are otherwise very lightly loaded web servers or firewalls. Is this related to Bug 261338?

Example panics from three systems (cut-and-paste from dmesg) below:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0x1b2de880
frame pointer = 0x28:0x1b2de8bc
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 836 (bacula-fd)
trap number = 12
panic: page fault
cpuid = 0
time = 1642636089
KDB: stack backtrace:
#0 0x10327ae at kdb_backtrace+0x4e
#1 0xfed128 at vpanic+0x118
#2 0xfed004 at panic+0x14
#3 0x154bfd5 at trap_fatal+0x335
#4 0x154c013 at trap_pfault+0x33
#5 0x154b71f at trap+0x36f
#6 0xffc0319d at PTDpde+0x41a5
#7 0x1534905 at copyout+0xa5
#8 0x154d0f9 at uiomove_fromphys+0x159
#9 0x12c3483 at ffs_read+0x3d3
#10 0x157b77d at VOP_READ_APV+0x5d
#11 0x10b980a at vn_read+0x18a
#12 0x10b95da at vn_io_fault_doio+0x3a
#13 0x10b7232 at vn_io_fault1+0x162
#14 0x10b548b at vn_io_fault+0x1cb
#15 0x104c580 at dofileread+0x70
#16 0x104c1d8 at sys_read+0x78
#17 0x154c7b9 at syscall+0x3e9
Uptime: 23h37m45s
Physical memory: 2009 MB

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0x1afd6880
frame pointer = 0x28:0x1afd68bc
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2134 (bacula-fd)
trap number = 12
panic: page fault
cpuid = 0
time = 1642547014
KDB: stack backtrace:
#0 0x10327ae at kdb_backtrace+0x4e
#1 0xfed128 at vpanic+0x118
#2 0xfed004 at panic+0x14
#3 0x154bfd5 at trap_fatal+0x335
#4 0x154c013 at trap_pfault+0x33
#5 0x154b71f at trap+0x36f
#6 0xffc0319d at PTDpde+0x41a5
#7 0x1534905 at copyout+0xa5
#8 0x154d0f9 at uiomove_fromphys+0x159
#9 0x12c3483 at ffs_read+0x3d3
#10 0x157b77d at VOP_READ_APV+0x5d
#11 0x10b980a at vn_read+0x18a
#12 0x10b95da at vn_io_fault_doio+0x3a
#13 0x10b7232 at vn_io_fault1+0x162
#14 0x10b548b at vn_io_fault+0x1cb
#15 0x104c580 at dofileread+0x70
#16 0x104c1d8 at sys_read+0x78
#17 0x154c7b9 at syscall+0x3e9
Uptime: 1d2h4m47s

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0x1df3a8d4
frame pointer = 0x28:0x1df3a90c
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 932 (bacula-fd)
trap number = 12
panic: page fault
cpuid = 0
time = 1642636048
KDB: stack backtrace:
#0 0x10327ae at kdb_backtrace+0x4e
#1 0xfed128 at vpanic+0x118
#2 0xfed004 at panic+0x14
#3 0x154bfd5 at trap_fatal+0x335
#4 0x154c013 at trap_pfault+0x33
#5 0x154b71f at trap+0x36f
#6 0xffc0319d at PTDpde+0x41a5
#7 0x12c33ce at ffs_read+0x31e
#8 0x157b77d at VOP_READ_APV+0x5d
#9 0x10b980a at vn_read+0x18a
#10 0x10b95da at vn_io_fault_doio+0x3a
#11 0x10b7232 at vn_io_fault1+0x162
#12 0x10b548b at vn_io_fault+0x1cb
#13 0x104c580 at dofileread+0x70
#14 0x104c1d8 at sys_read+0x78
#15 0x154c7b9 at syscall+0x3e9
#16 0xffc033e7 at PTDpde+0x43ef
Uptime: 23h37m42s
Physical memory: 3026 MB
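When several machines panic like this, it can help to check mechanically which functions appear in every backtrace. Below is a rough sketch, not an official tool: the helper names `parse_frames` and `common_symbols` are hypothetical, the regular expression only assumes the `#N 0xADDR at symbol+0xOFF` layout seen in the traces above, and the two sample strings are short excerpts copied from those traces.

```python
import re

# Matches one KDB frame, e.g. "#9 0x12c3483 at ffs_read+0x3d3".
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+(\w+)([+-]0x[0-9a-fA-F]+)")


def parse_frames(trace_text):
    """Return a list of (frame_index, symbol) for every frame in trace_text."""
    return [(int(m.group(1)), m.group(3)) for m in FRAME_RE.finditer(trace_text)]


def common_symbols(traces):
    """Return the set of symbols that occur in every trace."""
    symbol_sets = [{symbol for _, symbol in parse_frames(t)} for t in traces]
    if not symbol_sets:
        return set()
    return set.intersection(*symbol_sets)


# Excerpts from two of the panics above (frames #5 onward, truncated).
trace_a = ("#5 0x154b71f at trap+0x36f #6 0xffc0319d at PTDpde+0x41a5 "
           "#7 0x1534905 at copyout+0xa5 #8 0x154d0f9 at uiomove_fromphys+0x159 "
           "#9 0x12c3483 at ffs_read+0x3d3 #10 0x157b77d at VOP_READ_APV+0x5d")
trace_c = ("#5 0x154b71f at trap+0x36f #6 0xffc0319d at PTDpde+0x41a5 "
           "#7 0x12c33ce at ffs_read+0x31e #8 0x157b77d at VOP_READ_APV+0x5d")

print(sorted(common_symbols([trace_a, trace_c])))
```

The parser works on both one-line and line-per-frame dmesg output, since it only looks for the frame pattern and ignores everything else.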
Yes Richard, it seems like the other ticket is closely related to this one. I was not able to propose a kernel patch, but I will follow the other ticket. Thanks for pointing it out. Yann
Updating to the latest 12.3 with the patch from Bug 261338 applied fixes the issue. I'm going to close this ticket.