Created attachment 243120 [details]
backtrace

I am experiencing crashes on 13.2-RELEASE, usually around 3:05 AM, every few weeks.

grep 2023$ | sort
./core.txt.1:Mon Apr 10 03:05:10 CEST 2023
./core.txt.2:Thu Apr 27 03:04:54 CEST 2023
./core.txt.3:Fri May 26 03:05:20 CEST 2023
./core.txt.4:Mon Jun 12 03:05:18 CEST 2023
./core.txt.5:Sun Jul 2 03:05:14 CEST 2023

They could be ZFS related.

panic: page fault

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8286b65c
stack pointer = 0x28:0xfffffe0353890870
frame pointer = 0x28:0xfffffe0353890930
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 55948 (zfs)
trap number = 12
panic: page fault
cpuid = 6
time = 1688259833

Lines in syslog when the crashes occur:

Apr 10 03:01:00 zen-pobro root[5666]: [cron] daily
Apr 10 03:05:04 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Apr 27 03:01:00 zen-pobro root[9491]: [cron] daily
Apr 27 03:04:48 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
May 26 03:01:00 zen-pobro root[70274]: [cron] daily
May 26 03:05:14 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Jun 12 03:01:00 zen-pobro root[75378]: [cron] daily
Jun 12 03:05:12 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Jul 2 03:01:00 zen-pobro root[55797]: [cron] daily
Jul 2 03:05:08 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel

I have modified /etc/periodic to run 310.locate daily:

lrwxr-xr-x 1 root wheel 31B Jun 10 12:01 /etc/periodic/daily/310.locate -> /etc/periodic/weekly/310.locate

The box is an amd64 PC used as a home workstation, ZFS server and VM server. It has ECC RAM (on a desktop motherboard), two HDDs in a ZFS mirror and a single NVMe SSD.

Similar PR from a decade ago: #174372

Backtraces attached.
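To relate the panic to the daily periodic run, the "time =" value printed with the panic can be converted to local time and compared with the "[cron] daily" syslog entry from the same night. The following is only a minimal sketch: it assumes the "time =" field is seconds since the Unix epoch and that the machine was on CEST (UTC+2), as the core.txt timestamps suggest. The function name `panic_local_time` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time was CEST (UTC+2), matching the core.txt timestamps.
CEST = timezone(timedelta(hours=2), "CEST")


def panic_local_time(epoch_seconds: int) -> datetime:
    """Convert the 'time = N' value from a panic message to CEST."""
    return datetime.fromtimestamp(epoch_seconds, tz=CEST)


# 'time =' value copied from the panic message in this comment.
panic = panic_local_time(1688259833)

# '[cron] daily' syslog entry for the same night (Jul 2 03:01:00).
cron_daily = datetime(2023, 7, 2, 3, 1, 0, tzinfo=CEST)

print(panic.isoformat())    # 2023-07-02T03:03:53+02:00
print(panic - cron_daily)   # 0:02:53
```

Under those assumptions the panic falls a little under three minutes after the daily run starts, which fits the suspicion that something in the daily periodic scripts triggers it. It does not identify which script.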
Created attachment 243121 [details] backtrace
Created attachment 243122 [details] backtrace
Created attachment 243123 [details] backtrace
Another reproduction:

Jul 21 03:01:00 zen-pobro root[98948]: [cron] daily
Jul 21 03:04:40 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Jul 21 03:04:40 zen-pobro kernel: ---<<BOOT>>---
Jul 21 03:04:40 zen-pobro kernel: Copyright (c) 1992-2021 The FreeBSD Project.
Jul 21 03:04:40 zen-pobro kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jul 21 03:04:40 zen-pobro kernel: The Regents of the University of California. All rights reserved.
Jul 21 03:04:40 zen-pobro kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8286a65c
stack pointer = 0x28:0xfffffe024f13a870
frame pointer = 0x28:0xfffffe024f13a930
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 99113 (zfs)
trap number = 12
panic: page fault
cpuid = 6
time = 1689901415
Created attachment 243515 [details] backtrace
Created attachment 244082 [details]
crash 7

Another crash:

Aug 14 03:01:00 zen-pobro root[4964]: [cron] daily
Aug 14 03:05:15 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Aug 14 03:05:15 zen-pobro kernel: ---<<BOOT>>---
Aug 14 03:05:15 zen-pobro kernel: Copyright (c) 1992-2021 The FreeBSD Project.

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8286a65c
stack pointer = 0x28:0xfffffe01f8416870
frame pointer = 0x28:0xfffffe01f8416930
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 5110 (zfs)
trap number = 12
panic: page fault
cpuid = 6
time = 1691975038
Created attachment 244545 [details]
crash 8

Yet another reproduction:

Sep 1 03:01:00 zen-pobro root[98040]: [cron] daily
Sep 1 03:05:21 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Sep 1 03:05:21 zen-pobro kernel: ---<<BOOT>>---
Sep 1 03:05:21 zen-pobro kernel: Copyright (c) 1992-2021 The FreeBSD Project.

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff822e865c
stack pointer = 0x28:0xfffffe035b2c7870
frame pointer = 0x28:0xfffffe035b2c7930
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 98180 (zfs)
trap number = 12
panic: page fault
cpuid = 3
time = 1693530254
KDB: stack backtrace:
#0 0xffffffff80c55825 at kdb_backtrace+0x65
#1 0xffffffff80c081a1 at vpanic+0x151
#2 0xffffffff80c08043 at panic+0x43
#3 0xffffffff810b2fa7 at trap_fatal+0x387
#4 0xffffffff810b2fff at trap_pfault+0x4f
#5 0xffffffff8108a8b8 at calltrap+0x8
#6 0xffffffff822e9828 at zap_lookup_norm+0x68
#7 0xffffffff822e97b1 at zap_lookup+0x11
#8 0xffffffff8218083c at zfs_get_zplprop+0x9c
#9 0xffffffff8230026d at zfs_ioc_objset_zplprops+0x8d
#10 0xffffffff822f973a at zfsdev_ioctl_common+0x58a
#11 0xffffffff8216d826 at zfsdev_ioctl+0x116
#12 0xffffffff80a9f116 at devfs_ioctl+0xc6
#13 0xffffffff80cfabb4 at vn_ioctl+0x1a4
#14 0xffffffff80a9f7ce at devfs_ioctl_f+0x1e
#15 0xffffffff80c762dd at kern_ioctl+0x26d
#16 0xffffffff80c75fc0 at sys_ioctl+0x100
#17 0xffffffff810b389c at amd64_syscall+0x10c
Uptime: 15d9h20m31s
Created attachment 244815 [details]
crash 9

The crash is reproduced once more.

Sep 12 03:01:00 zen-pobro root[87648]: [cron] daily
Sep 12 03:04:40 zen-pobro syslogd: kernel boot file is /boot/kernel/kernel
Sep 12 03:04:40 zen-pobro kernel: ---<<BOOT>>---
Sep 12 03:04:40 zen-pobro kernel: Copyright (c) 1992-2021 The FreeBSD Project.

panic: page fault

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8287565c
stack pointer = 0x28:0xfffffe022151f870
frame pointer = 0x28:0xfffffe022151f930
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 87839 (zfs)
trap number = 12
panic: page fault
cpuid = 5
time = 1694480616
KDB: stack backtrace:
#0 0xffffffff80c55825 at kdb_backtrace+0x65
#1 0xffffffff80c081a1 at vpanic+0x151
#2 0xffffffff80c08043 at panic+0x43
#3 0xffffffff810b2fa7 at trap_fatal+0x387
#4 0xffffffff810b2fff at trap_pfault+0x4f
#5 0xffffffff8108a8b8 at calltrap+0x8
#6 0xffffffff82876828 at zap_lookup_norm+0x68
#7 0xffffffff828767b1 at zap_lookup+0x11
#8 0xffffffff8270d83c at zfs_get_zplprop+0x9c
#9 0xffffffff8288d26d at zfs_ioc_objset_zplprops+0x8d
#10 0xffffffff8288673a at zfsdev_ioctl_common+0x58a
#11 0xffffffff826fa826 at zfsdev_ioctl+0x116
#12 0xffffffff80a9f116 at devfs_ioctl+0xc6
#13 0xffffffff80cfabb4 at vn_ioctl+0x1a4
#14 0xffffffff80a9f7ce at devfs_ioctl_f+0x1e
#15 0xffffffff80c762dd at kern_ioctl+0x26d
#16 0xffffffff80c75fc0 at sys_ioctl+0x100
#17 0xffffffff810b389c at amd64_syscall+0x10c
Uptime: 10d23h58m39s
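The absolute addresses differ between crashes, so it is not obvious at a glance that each panic hits the same code. A rough check, sketched below, compares the faulting instruction pointers from all five panic messages in this report and, for the two crashes with a pasted backtrace, the distance from the faulting instruction to the zap_lookup_norm+0x68 frame. All values are copied from the comments above. The reading that equal low 12 bits point to the same code loaded at a different page-aligned base is an assumption, not something the traces prove.

```python
# Faulting instruction pointers copied from the panic messages in this report.
fault_ips = [
    0xffffffff8286b65c,  # comment #0
    0xffffffff8286a65c,  # Jul 21
    0xffffffff8286a65c,  # crash 7
    0xffffffff822e865c,  # crash 8
    0xffffffff8287565c,  # crash 9
]

# Address of frame #6 (zap_lookup_norm+0x68) where a backtrace was pasted.
frames = {
    "crash 8": (0xffffffff822e865c, 0xffffffff822e9828),
    "crash 9": (0xffffffff8287565c, 0xffffffff82876828),
}

# Offset within a 4 KiB page; identical values are consistent with the same
# instruction in a module loaded at different page-aligned bases.
page_offsets = {hex(ip & 0xfff) for ip in fault_ips}
print("page offsets:", page_offsets)

# Distance from the faulting instruction up to the zap_lookup_norm+0x68 frame.
deltas = {name: frame - ip for name, (ip, frame) in frames.items()}
for name, delta in deltas.items():
    print(f"{name}: fault is {hex(delta)} bytes below zap_lookup_norm+0x68")
```

Every instruction pointer ends in 0x65c, and in both backtraces the fault sits 0x11cc bytes below the zap_lookup_norm+0x68 frame, which suggests (without proving) that all of the panics share one faulting instruction reached through the same call path.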
Do all pools scrub without error?

What is the S.M.A.R.T. status of each of the hard disk drives?

(In reply to porsolic from comment #0)
> Similar PR from a decade ago: #174372

From bug 174372 comment 4 (the recent closure):

> ⋯ I'm sure the other issue that was linked is unrelated to this.
(Sorry, the repeat addition of 174372 was a slip of the fingers. I'll repeat the removal. Apologies for the noise.)
Created attachment 246786 [details] smart-nvme
Created attachment 246787 [details] smart-hdd0
Created attachment 246788 [details] smart-hdd1
(In reply to Graham Perrin from comment #9)

A zpool scrub of the mirrored mechanical disks finishes without error.

A zpool scrub of the single NVMe disk finishes with a (semi?) error:

scan: scrub repaired 0B in 00:20:58 with 0 errors on Thu Nov 30 14:56:10 2023
errors: 3 data errors, use '-v' for a list

But "zpool status -v" shows more than 3 errors (189 to be exact), all of them in my home dataset or its snapshots, for example:

pool/encrypted/home:<0x1>
pool/encrypted/home@auto_daily-2023-12-03_02.01.00--1w:<0x1>
pool/encrypted/home@auto_hourly-2023-12-02_22.00.00--2d:<0x1>
pool/encrypted/home@auto_hourly-2023-12-02_20.00.00--2d:<0x1>

All of those errors go away after "zpool clear" followed by "zpool scrub":

scan: scrub repaired 0B in 00:13:29 with 0 errors on Mon Dec 4 23:37:59 2023

smartctl output for the disks is attached.

Regarding the related PR: that's a fascinating find, and a fascinating reply for a decade-old PR! I also have PCI cards (all of which are passed through to VMs): Intel 4xGbit, Intel Wifi, integrated 2.5Gbit, and a cheap Asmedia USB controller.

That said, I have not experienced any crashes at 3 AM since I upgraded to the 14.0 branch.