Bug 236357 - [zfs] kernel panic on 12-STABLE
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
Keywords: crash
Reported: 2019-03-07 10:14 UTC by Sergey Anokhin
Modified: 2023-07-16 17:21 UTC
Description Sergey Anokhin 2019-03-07 10:14:21 UTC
Hi All,

12.0-STABLE FreeBSD 12.0-STABLE #2 r343904M

After 19 days of uptime I got a strange crash:

# kgdb /boot/kernel/kernel /var/crash/vmcore.last 
 GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD] 
 Copyright (C) 2018 Free Software Foundation, Inc. 
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> 
 This is free software: you are free to change and redistribute it. 
 There is NO WARRANTY, to the extent permitted by law. 
 Type "show copying" and "show warranty" for details. 
 This GDB was configured as "x86_64-portbld-freebsd12.0". 
 Type "show configuration" for configuration details. 
 For bug reporting instructions, please see: 
 <http://www.gnu.org/software/gdb/bugs/>. 
 Find the GDB manual and other documentation resources online at: 
     <http://www.gnu.org/software/gdb/documentation/>. 
  
 For help, type "help". 
 Type "apropos word" to search for commands related to "word"... 
 Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done. 
 done. 
  
 Unread portion of the kernel message buffer: 
  
  
 Fatal trap 12: page fault while in kernel mode 
 cpuid = 1; apic id = 01 
 fault virtual address   = 0xfffff80282bf3401 
 fault code              = supervisor write data, page not present 
 instruction pointer     = 0x20:0xffffffff826f26e0 
 stack pointer           = 0x28:0xfffffe006475d320 
 frame pointer           = 0x28:0xfffffe006475d320 
 code segment            = base rx0, limit 0xfffff, type 0x1b 
                         = DPL 0, pres 1, long 1, def32 0, gran 1 
 processor eflags        = interrupt enabled, resume, IOPL = 0 
 current process         = 37083 (syslogd) 
 trap number             = 12 
 panic: page fault 
 cpuid = 1 
 time = 1551808983 
 KDB: stack backtrace: 
 #0 0xffffffff80c531c7 at kdb_backtrace+0x67 
 #1 0xffffffff80c07143 at vpanic+0x1a3 
 #2 0xffffffff80c06f93 at panic+0x43 
 #3 0xffffffff8118d9ff at trap_fatal+0x35f 
 #4 0xffffffff8118da59 at trap_pfault+0x49 
 #5 0xffffffff8118d07e at trap+0x29e 
 #6 0xffffffff81168af5 at calltrap+0x8 
 #7 0xffffffff827bb83a at zil_itx_assign+0x3da 
 #8 0xffffffff827e4b99 at zfs_log_write+0x2d9 
 #9 0xffffffff827f2338 at zfs_freebsd_write+0xbc8 
 #10 0xffffffff81315acf at VOP_WRITE_APV+0xff 
 #11 0xffffffff80ce98fe at vn_write+0x1ee 
 #12 0xffffffff80ce9443 at vn_io_fault_doio+0x43 
 #13 0xffffffff80ce72da at vn_io_fault1+0x16a 
 #14 0xffffffff80ce54b5 at vn_io_fault+0x195 
 #15 0xffffffff80c70232 at dofilewrite+0xb2 
 #16 0xffffffff80c70130 at sys_writev+0x70 
 #17 0xffffffff8118e592 at amd64_syscall+0x352 
 Uptime: 19d17h37m19s 
 Dumping 1611 out of 8077 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% 
  
 __curthread () at ./machine/pcpu.h:230 
 230             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD)); 
 (kgdb) bt 
 #0  __curthread () at ./machine/pcpu.h:230 
 #1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366 
 #2  0xffffffff80c06d2b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446 
 #3  0xffffffff80c071a3 in vpanic (fmt=<optimized out>, ap=0xfffffe006475d070) at /usr/src/sys/kern/kern_shutdown.c:872 
 #4  0xffffffff80c06f93 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799 
 #5  0xffffffff8118d9ff in trap_fatal (frame=0xfffffe006475d260, eva=18446735288400032769) at /usr/src/sys/amd64/amd64/trap.c:929 
 #6  0xffffffff8118da59 in trap_pfault (frame=0xfffffe006475d260, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:765 
 #7  0xffffffff8118d07e in trap (frame=0xfffffe006475d260) at /usr/src/sys/amd64/amd64/trap.c:441 
 #8  <signal handler called> 
 #9  0xffffffff826f26e0 in list_insert_tail (list=0xfffff80048db44c8, object=0xfffff80182bf3400) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/os/list.c:121 
 #10 0xffffffff827bb83a in zil_itx_assign (zilog=0xfffff8001f588c00, itx=0xfffff80182bf3400, tx=0xfffff8004d651e00) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:1737 
 #11 0xffffffff827e4b99 in zfs_log_write (zilog=0xfffff8001f588c00, tx=0xfffff8004d651e00, txtype=365662, zp=0xfffff80224bbd440, off=365662, resid=91, ioflag=8) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c:525 
 #12 0xffffffff827f2338 in zfs_write (vp=<optimized out>, uio=<optimized out>, ioflag=<optimized out>, cr=<optimized out>, ct=<optimized out>) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1203 
 #13 zfs_freebsd_write (ap=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4898 
 #14 0xffffffff81315acf in VOP_WRITE_APV (vop=<optimized out>, a=0xfffffe006475d788) at vnode_if.c:1000 
 #15 0xffffffff80ce98fe in VOP_WRITE (vp=<optimized out>, uio=<optimized out>, cred=<optimized out>, ioflag=<optimized out>) at ./vnode_if.h:413 
 #16 vn_write (fp=<optimized out>, uio=<optimized out>, active_cred=0x5945e, flags=<optimized out>, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:881 
 #17 0xffffffff80ce9443 in vn_io_fault_doio (args=0xfffffe006475d9b0, uio=0xfffff8002ab5bd00, td=0xfffff8007c685000) at /usr/src/sys/kern/vfs_vnops.c:946 
 #18 0xffffffff80ce72da in vn_io_fault1 (vp=<optimized out>, uio=<optimized out>, args=<optimized out>, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:1064 
 #19 0xffffffff80ce54b5 in vn_io_fault (fp=<optimized out>, uio=0xfffff8004ebef188, active_cred=0xfffff80021a684b0, flags=<optimized out>, td=<optimized out>) 
     at /usr/src/sys/kern/vfs_vnops.c:1168 
 #20 0xffffffff80c70232 in fo_write (fp=<optimized out>, uio=<optimized out>, active_cred=0xfffff80282bf3401, flags=<optimized out>, td=<optimized out>) 
     at /usr/src/sys/sys/file.h:314 
 #21 dofilewrite (td=0x0, fd=34, fp=0xfffff801a82000f0, auio=0xfffff8002ab5bd00, offset=<optimized out>, flags=<optimized out>) at /usr/src/sys/kern/sys_generic.c:567 
 #22 0xffffffff80c70130 in kern_writev (td=0xfffff8007c685000, fd=34, auio=0xfffff8002ab5bd00) at /usr/src/sys/kern/sys_generic.c:491 
 #23 sys_writev (td=0xfffff8007c685000, uap=0xfffff8007c6853c0) at /usr/src/sys/kern/sys_generic.c:477 
 #24 0xffffffff8118e592 in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135 
 #25 amd64_syscall (td=0xfffff8007c685000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1154 
 #26 <signal handler called> 
 #27 0x00000008004122ca in ?? () 
 Backtrace stopped: Cannot access memory at address 0x7fffffffc378 
 (kgdb) frame 9 
 #9  0xffffffff826f26e0 in list_insert_tail (list=0xfffff80048db44c8, object=0xfffff80182bf3400) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/os/list.c:121 
 121             list_insert_before_node(list, lold, object); 
 (kgdb) frame 10 
 #10 0xffffffff827bb83a in zil_itx_assign (zilog=0xfffff8001f588c00, itx=0xfffff80182bf3400, tx=0xfffff8004d651e00) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:1737 
 1737                    list_insert_tail(&ian->ia_list, itx); 
 (kgdb) frame 11 
 #11 0xffffffff827e4b99 in zfs_log_write (zilog=0xfffff8001f588c00, tx=0xfffff8004d651e00, txtype=365662, zp=0xfffff80224bbd440, off=365662, resid=91, ioflag=8) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c:525 
 525                     zil_itx_assign(zilog, itx, tx); 
 (kgdb) frame 12 
 #12 0xffffffff827f2338 in zfs_write (vp=<optimized out>, uio=<optimized out>, ioflag=<optimized out>, cr=<optimized out>, ct=<optimized out>) 
     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1203 
 1203                    zfs_log_write(zilog, tx, TX_WRITE, zp, woff, tx_bytes, ioflag); 
 (kgdb) frame 13 
 #13 zfs_freebsd_write (ap=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4898 
 4898            return (zfs_write(ap->a_vp, ap->a_uio, ioflags(ap->a_ioflag), 
 (kgdb) frame 14 
 #14 0xffffffff81315acf in VOP_WRITE_APV (vop=<optimized out>, a=0xfffffe006475d788) at vnode_if.c:1000 
 1000                    rc = vop->vop_write(a); 
 (kgdb) frame 15 
 #15 0xffffffff80ce98fe in VOP_WRITE (vp=<optimized out>, uio=<optimized out>, cred=<optimized out>, ioflag=<optimized out>) at ./vnode_if.h:413 
 413             return (VOP_WRITE_APV(vp->v_op, &a)); 
 (kgdb) frame 16 
 #16 vn_write (fp=<optimized out>, uio=<optimized out>, active_cred=0x5945e, flags=<optimized out>, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:881 
 881                     error = VOP_WRITE(vp, uio, ioflag, fp->f_cred); 
 (kgdb) frame 17 
 #17 0xffffffff80ce9443 in vn_io_fault_doio (args=0xfffffe006475d9b0, uio=0xfffff8002ab5bd00, td=0xfffff8007c685000) at /usr/src/sys/kern/vfs_vnops.c:946 
 946                     error = (args->args.fop_args.doio)(args->args.fop_args.fp, 
 (kgdb) frame 18 
 #18 0xffffffff80ce72da in vn_io_fault1 (vp=<optimized out>, uio=<optimized out>, args=<optimized out>, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:1064 
 1064            error = vn_io_fault_doio(args, uio, td); 
 (kgdb) frame 19 
 #19 0xffffffff80ce54b5 in vn_io_fault (fp=<optimized out>, uio=0xfffff8004ebef188, active_cred=0xfffff80021a684b0, flags=<optimized out>, td=<optimized out>) 
     at /usr/src/sys/kern/vfs_vnops.c:1168 
 1168                    error = vn_io_fault1(vp, uio, &args, td); 
 (kgdb) frame 20 
 #20 0xffffffff80c70232 in fo_write (fp=<optimized out>, uio=<optimized out>, active_cred=0xfffff80282bf3401, flags=<optimized out>, td=<optimized out>) 
     at /usr/src/sys/sys/file.h:314 
 314             return ((*fp->f_ops->fo_write)(fp, uio, active_cred, flags, td)); 
 (kgdb) frame 21 
 #21 dofilewrite (td=0x0, fd=34, fp=0xfffff801a82000f0, auio=0xfffff8002ab5bd00, offset=<optimized out>, flags=<optimized out>) at /usr/src/sys/kern/sys_generic.c:567 
 567             if ((error = fo_write(fp, auio, td->td_ucred, flags, td))) { 
 (kgdb) frame 22 
 #22 0xffffffff80c70130 in kern_writev (td=0xfffff8007c685000, fd=34, auio=0xfffff8002ab5bd00) at /usr/src/sys/kern/sys_generic.c:491 
 491             error = dofilewrite(td, fd, fp, auio, (off_t)-1, 0); 
 (kgdb)
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2019-03-07 22:29:06 UTC
(In reply to Sergey Anokhin from comment #0)

Are you using ECC RAM?
I suspect that you might have memory corruption.

Compare:
> fault virtual address   = 0xfffff80282bf3401
to
> itx=0xfffff80182bf3400
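
To make the comparison above concrete, here is a minimal sketch of the arithmetic on the two addresses. The constants are copied from the panic message and the kgdb backtrace in comment #0; the helper name `compare_addresses` is made up for illustration. It only shows how the values differ and does not by itself prove or rule out a hardware fault.

```python
# Values copied from the panic message and the kgdb backtrace in comment #0.
FAULT_ADDR = 0xfffff80282bf3401      # "fault virtual address"
ITX_PTR = 0xfffff80182bf3400         # itx argument of zil_itx_assign (frame 10)
EVA_DECIMAL = 18446735288400032769   # eva argument of trap_fatal (frame 5)


def compare_addresses(a: int, b: int) -> dict:
    """Return the XOR, the differing bit positions and the difference a - b.

    Hypothetical helper, not part of any FreeBSD tool.
    """
    xor = a ^ b
    bits = [i for i in range(64) if (xor >> i) & 1]
    return {"xor": xor, "bits": bits, "delta": a - b}


result = compare_addresses(FAULT_ADDR, ITX_PTR)
print(f"eva matches fault address: {EVA_DECIMAL == FAULT_ADDR}")
print(f"xor   = {result['xor']:#018x}")
print(f"bits  = {result['bits']}")
print(f"delta = {result['delta']:#x}")
```

The two values differ in bits 0, 32 and 33, and the fault address is exactly 0x100000001 (4 GiB + 1) above the itx pointer. The decimal eva in frame 5 is the same value as the hexadecimal fault address.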
Comment 2 Sergey Anokhin 2019-03-08 07:47:50 UTC
(In reply to Andriy Gapon from comment #1)

This is a test machine without ECC RAM, so it's hard to say with 100% certainty. The memtest runs passed. I suppose that if the problem were in memory, the error would be intermittent, right?
Comment 3 Eugene Grosbein freebsd_committer freebsd_triage 2023-07-16 17:21:26 UTC
Is this problem still relevant? 12.0-STABLE is EoL.