Bug 278958 - zfs panic: page fault in sync_dnodes_task
Summary: zfs panic: page fault in sync_dnodes_task
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2024-05-13 19:17 UTC by nunziotocci2000
Modified: 2024-05-31 15:17 UTC
CC List: 0 users

See Also:


Attachments
core.txt (612.71 KB, text/plain)
2024-05-13 19:17 UTC, nunziotocci2000

Description nunziotocci2000 2024-05-13 19:17:26 UTC
Created attachment 250626 [details]
core.txt

Fatal trap 12: page fault while in kernel mode
cpuid = 29; apic id = 1d
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff820975a1
stack pointer           = 0x28:0xfffffe022b901de0
frame pointer           = 0x28:0xfffffe022b901de0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6 (dp_sync_taskq_17)
rdi: fffff8000234cf60 rsi: fffff800022f8328 rdx: 0000000000000000
rcx: 0000000000000000  r8: 0000000000000000  r9: fffffe027bbf1a00
rax: 00000000000000e8 rbx: 0000000000000270 rbp: fffffe022b901de0
r10: 0000000000000000 r11: 0000000098ff24fe r12: fffff800022f8328
r13: 0000000000000000 r14: fffff8000234cf40 r15: fffff80aa5726c00
trap number             = 12
panic: page fault
cpuid = 29
time = 1715524597
KDB: stack backtrace:
#0 0xffffffff80b9009d at kdb_backtrace+0x5d
#1 0xffffffff80b431a2 at vpanic+0x132
#2 0xffffffff80b43063 at panic+0x43
#3 0xffffffff8100c85c at trap_fatal+0x40c
#4 0xffffffff8100c8af at trap_pfault+0x4f
#5 0xffffffff80fe3ac8 at calltrap+0x8
#6 0xffffffff82105083 at sync_dnodes_task+0x63
#7 0xffffffff8209addf at taskq_run+0x1f
#8 0xffffffff80ba5992 at taskqueue_run_locked+0x182
#9 0xffffffff80ba6c22 at taskqueue_thread_loop+0xc2
#10 0xffffffff80afdb7f at fork_exit+0x7f
#11 0xffffffff80fe4b2e at fork_trampoline+0xe
Uptime: 4d20h39m45s



kgdb backtrace:

#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b42d37 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:526
#3  0xffffffff80b4320f in vpanic (fmt=0xffffffff81136b3b "%s",
    ap=ap@entry=0xfffffe022b901c30) at /usr/src/sys/kern/kern_shutdown.c:970
#4  0xffffffff80b43063 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:894
#5  0xffffffff8100c85c in trap_fatal (frame=0xfffffe022b901d20, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:952
#6  0xffffffff8100c8af in trap_pfault (frame=0xfffffe022b901d20,
    usermode=false, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:760
#7  <signal handler called>
#8  0xffffffff820975a1 in list_remove (list=0xfffff8000234cf60,
    object=object@entry=0xfffff800022f8328)
    at /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/list.c:127
#9  0xffffffff8216158e in multilist_sublist_remove (
    mls=mls@entry=0xfffff8000234cf40, obj=obj@entry=0xfffff800022f8328)
    at /usr/src/sys/contrib/openzfs/module/zfs/multilist.c:363
#10 0xffffffff82105083 in dmu_objset_sync_dnodes (list=0xfffff8000234cf40,
    tx=0xfffff80aa5726c00)
    at /usr/src/sys/contrib/openzfs/module/zfs/dmu_objset.c:1557
#11 sync_dnodes_task (arg=0xfffff8083ac22e60)
    at /usr/src/sys/contrib/openzfs/module/zfs/dmu_objset.c:1638
#12 0xffffffff8209addf in taskq_run (arg=0xfffff8005728f900,
    pending=<optimized out>)
    at /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_taskq.c:320
#13 0xffffffff80ba5992 in taskqueue_run_locked (
    queue=queue@entry=0xfffff80002356300)
    at /usr/src/sys/kern/subr_taskqueue.c:512
#14 0xffffffff80ba6c22 in taskqueue_thread_loop (
    arg=arg@entry=0xfffff80003a91620) at /usr/src/sys/kern/subr_taskqueue.c:824
#15 0xffffffff80afdb7f in fork_exit (
    callout=0xffffffff80ba6b60 <taskqueue_thread_loop>,
    arg=0xfffff80003a91620, frame=0xfffffe022b901f40)
    at /usr/src/sys/kern/kern_fork.c:1160
#16 <signal handler called>

See the attached core.txt for the full report.

This seems to happen intermittently during backups, which a remote computer performs by logging in over SSH and running `zfs send` on this machine.

If there's anything else you'd like to see, please let me know. I also have a full vmcore (11 GB) if needed, and I can run kgdb against it.
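
For reference, frame #8 faults in list_remove() on a supervisor write to virtual address 0x0. Below is a minimal sketch of the illumos-style unlink used by the OpenZFS spl list code (simplified for illustration, not the verbatim source), showing how unlinking a node whose link pointers were already NULLed produces exactly that fault:

#include <stddef.h>

/*
 * Simplified illumos-style list node, as used by
 * sys/contrib/openzfs/module/os/freebsd/spl/list.c (sketch only).
 */
typedef struct list_node {
	struct list_node *list_next;
	struct list_node *list_prev;
} list_node_t;

static void
list_remove_node(list_node_t *node)
{
	/*
	 * A node that was already unlinked has list_next == list_prev
	 * == NULL, so the first store below dereferences NULL and
	 * writes to offset 0 -- a "supervisor write data" fault at
	 * virtual address 0x0, as in the trap frame above.
	 */
	node->list_prev->list_next = node->list_next;
	node->list_next->list_prev = node->list_prev;
	node->list_next = node->list_prev = NULL;
}

If that is what happened, it would point at the dnode being removed from its sublist twice rather than at random corruption; rdx and rcx both being zero in the register dump would be consistent with that, though this is speculation on my part.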
Comment 1 nunziotocci2000 2024-05-14 15:45:55 UTC
Another panic this morning at 3:33 AM with an identical backtrace.
Looking at the core.txt, a backup was running this time as well: there is a `zfs` process with `sudo` as its parent and `sshd` as the next process up the tree.

A `zpool status` shows that our jail pool, zsmtp_jail, experienced an error:
  pool: zsmtp_jail
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 92K in 00:23:52 with 0 errors on Tue May 14 03:53:36 2024
config:

        NAME        STATE     READ WRITE CKSUM
        zsmtp_jail  ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nda1p3  ONLINE       0     0     1
            nda2p3  ONLINE       0     0     0

It seems to me that the checksum error was likely caused by the panic corrupting something on that drive. The output of `smartctl -a /dev/nvme1` doesn't look out of the ordinary:
=== START OF INFORMATION SECTION ===
Model Number:                       Force MP600
Serial Number:                      2006823000012856205C
Firmware Version:                   EGFM11.3
PCI Vendor/Subsystem ID:            0x1987
IEEE OUI Identifier:                0x6479a7
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            6479a7 2fc124183d
Local Time is:                      Tue May 14 10:42:58 2024 CDT
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005d):     Comp DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x08):         Telmtry_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     90 Celsius
Critical Comp. Temp. Threshold:     95 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.78W       -        -    0  0  0  0        0       0
 1 +     6.75W       -        -    1  1  1  1        0       0
 2 +     5.23W       -        -    2  2  2  2        0       0
 3 -   0.0490W       -        -    3  3  3  3     2000    2000
 4 -   0.0018W       -        -    4  4  4  4    25000   25000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        39 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    3%
Data Units Read:                    295,807,203 [151 TB]
Data Units Written:                 98,425,650 [50.3 TB]
Host Read Commands:                 1,499,250,766
Host Write Commands:                2,297,088,561
Controller Busy Time:               5,756
Power Cycles:                       108
Power On Hours:                     33,640
Unsafe Shutdowns:                   67
Media and Data Integrity Errors:    0
Error Information Log Entries:      543
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 63 entries)
No Errors Logged

Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
 0   Short             Completed without error               30728            -     -   -   -    -

I will run a SMART self-test (`smartctl -t long /dev/nvme1`) and report the results.
Comment 2 nunziotocci2000 2024-05-14 16:00:26 UTC
=== START OF SMART DATA SECTION ===
Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
 0   Extended          Completed without error               33641            -     -   -   -    -
 1   Short             Completed without error               33640            -     -   -   -    -
 2   Short             Completed without error               30728            -     -   -   -    -
Comment 3 nunziotocci2000 2024-05-17 16:52:59 UTC
There was another crash yesterday, this time with a different backtrace:
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b42d37 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:526
#3  0xffffffff80b4320f in vpanic (fmt=0xffffffff81136b3b "%s", ap=ap@entry=0xfffffe0231e84ca0)
    at /usr/src/sys/kern/kern_shutdown.c:970
#4  0xffffffff80b43063 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:894
#5  0xffffffff8100c85c in trap_fatal (frame=0xfffffe0231e84d90, eva=164) at /usr/src/sys/amd64/amd64/trap.c:952
#6  0xffffffff8100c8af in trap_pfault (frame=0xfffffe0231e84d90, usermode=false, signo=<optimized out>,
    ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:760
#7  <signal handler called>
#8  taskqueue_run_locked (queue=0x0, queue@entry=0xfffff800021dd100) at /usr/src/sys/kern/subr_taskqueue.c:514
#9  0xffffffff80ba6c22 in taskqueue_thread_loop (arg=arg@entry=0xfffff800020b30e0)
    at /usr/src/sys/kern/subr_taskqueue.c:824
#10 0xffffffff80afdb7f in fork_exit (callout=0xffffffff80ba6b60 <taskqueue_thread_loop>, arg=0xfffff800020b30e0,
    frame=0xfffffe0231e84f40) at /usr/src/sys/kern/kern_fork.c:1160
#11 <signal handler called>

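A side note on reading this trap: eva=164 in frame #5, together with queue=0x0 in frame #8, fits a NULL-pointer dereference, where the fault address is simply the offset of the member being read. A generic sketch (the struct and field here are hypothetical; only the offset arithmetic is the point):

#include <stddef.h>

/* Hypothetical layout; only the arithmetic matters. */
struct example {
	char	pad[164];
	int	field;		/* offsetof(struct example, field) == 164 */
};

int
read_field(struct example *p)
{
	return (p->field);	/* with p == NULL, faults at address 164 */
}
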
I found a thread on TrueNAS: https://www.truenas.com/community/threads/importing-pool-crashes-os.45049/

This seems to indicate an SSD failure. We are going to remove the suspect drive (nda1p3) from the mirror, e.g. with `zpool detach zsmtp_jail nda1p3`, and see if the crashes stop. If we don't see any more after a week, I will close this bug as a hardware issue.
Comment 4 nunziotocci2000 2024-05-31 15:17:21 UTC
We have not gotten around to replacing the drive yet. Meanwhile, we have had a new panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 28; apic id = 1c
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80eb9c53
stack pointer           = 0x28:0xfffffe02b60c7b90
frame pointer           = 0x28:0xfffffe02b60c7bb0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 47378 (sshd)
rdi: ffffffff81a135a8 rsi: ffffffff81a135a8 rdx: 0000000000000000
rcx: fffff80a9be5dc60  r8: 0000000000000000  r9: fffff80a9be5dc60
rax: 0000000000000000 rbx: fffffe011d24f000 rbp: fffffe02b60c7bb0
r10: ffffffff81a135a8 r11: 0000000000000000 r12: 0000000000000000
r13: fffff8075191c2a0 r14: fffff807e1ed8240 r15: 0000000000000000
trap number             = 12
panic: page fault
cpuid = 28
time = 1717134122
KDB: stack backtrace:
#0 0xffffffff80b9009d at kdb_backtrace+0x5d
#1 0xffffffff80b431a2 at vpanic+0x132
#2 0xffffffff80b43063 at panic+0x43
#3 0xffffffff8100c85c at trap_fatal+0x40c
#4 0xffffffff8100c8af at trap_pfault+0x4f
#5 0xffffffff80fe3ac8 at calltrap+0x8
#6 0xffffffff80ebd5ab at vm_map_entry_delete+0x1b
#7 0xffffffff80eb8d0b at vm_map_delete+0x7b
#8 0xffffffff80ebd84c at vm_map_remove+0x9c
#9 0xffffffff80bb5f79 at pipeclose+0x2f9
#10 0xffffffff80bb5970 at pipe_close+0x60
#11 0xffffffff80ae2221 at _fdrop+0x11
#12 0xffffffff80ae543a at closef+0x24a
#13 0xffffffff80ae9194 at closefp_impl+0x64
#14 0xffffffff8100d119 at amd64_syscall+0x109
#15 0xffffffff80fe43db at fast_syscall_common+0xf8

Interestingly enough, the only process in the `ps` output of the core.txt that was in a `pipewr` state was the `zfs send` of a backup.

Also, my father and I have been working on our backup scripts, and we've had other issues with ZFS, described here: https://forums.freebsd.org/threads/zfs-dataset-size-is-out-of-control-zfs-send-recv-hangs-on-random-datasets.93549/