Bug 232544 - general protection fault while in kernel mode - vdev_indirect
Summary: general protection fault while in kernel mode - vdev_indirect
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-22 19:07 UTC by Jeremy Faulkner
Modified: 2019-03-13 12:09 UTC

See Also:


Attachments

Description Jeremy Faulkner 2018-10-22 19:07:25 UTC
constans% uname -a
FreeBSD constans 12.0-BETA1 FreeBSD 12.0-BETA1 r339534 GENERIC  amd64


Fatal trap 9: general protection fault while in kernel mode
cpuid = 14; apic id = 34
instruction pointer     = 0x20:0xffffffff82cb48a0
stack pointer           = 0x28:0xfffffe00e336c830
frame pointer           = 0x28:0xfffffe00e336c830
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_write_intr_0)
trap number             = 9
panic: general protection fault
cpuid = 14
time = 1540226847
KDB: stack backtrace:
#0 0xffffffff80bf9a97 at kdb_backtrace+0x67
#1 0xffffffff80bada63 at vpanic+0x1a3
#2 0xffffffff80bad8b3 at panic+0x43
#3 0xffffffff8108586f at trap_fatal+0x35f
#4 0xffffffff81084cbd at trap+0x6d
#5 0xffffffff81060b65 at calltrap+0x8
#6 0xffffffff82cb3a70 at vdev_indirect_remap+0xa0
#7 0xffffffff82cb32ec at vdev_indirect_io_start+0x6c
#8 0xffffffff82ce3e79 at zio_vdev_io_start+0x2a9
#9 0xffffffff82ce02ec at zio_execute+0xbc
#10 0xffffffff82cdfbfb at zio_nowait+0xcb
#11 0xffffffff82cde23f at zil_lwb_write_done+0x13f
#12 0xffffffff82ce4f1e at zio_done+0x88e
#13 0xffffffff82ce02ec at zio_execute+0xbc
#14 0xffffffff80c0bdd4 at taskqueue_run_locked+0x154
#15 0xffffffff80c0cf38 at taskqueue_thread_loop+0x98
#16 0xffffffff80b6e4f3 at fork_exit+0x83
#17 0xffffffff81061b4e at fork_trampoline+0xe
Uptime: 1d2h16m58s

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80bad64b in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80badac3 in vpanic (fmt=<optimized out>, ap=0xfffffe00e336c5e0)
    at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80bad8b3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff8108586f in trap_fatal (frame=0xfffffe00e336c770, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff81084cbd in trap (frame=0xfffffe00e336c770)
    at /usr/src/sys/amd64/amd64/trap.c:217
#7  <signal handler called>
#8  dva_mapping_overlap_compare (v_array_elem=<optimized out>,
    v_key=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c:139
#9  vdev_indirect_mapping_entry_for_offset_impl (vim=<optimized out>,
    offset=0, next_if_missing=0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c:191
#10 vdev_indirect_mapping_entry_for_offset (vim=<optimized out>, offset=0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c:270
#11 0xffffffff82cb3a70 in vdev_indirect_mapping_duplicate_adjacent_entries (
    offset=0, asize=0, vd=<optimized out>, copied_entries=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:964
#12 vdev_indirect_remap (vd=0xfffff802499b5000, offset=<optimized out>,
    asize=0, func=0xffffffff82cb40a0 <vdev_indirect_gather_splits>,
    arg=0xfffff803ae71d000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1060
#13 0xffffffff82cb32ec in vdev_indirect_io_start (zio=0xfffff803ae71d000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1282
#14 0xffffffff82ce3e79 in zio_vdev_io_start (zio=0xfffff803ae71d000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3341
#15 0xffffffff82ce02ec in zio_execute (zio=0xfffff803ae71d000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#16 0xffffffff82cdfbfb in zio_nowait (zio=0xfffff803ae71d000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1841
#17 0xffffffff82cde23f in zil_lwb_write_done (zio=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:1150
#18 0xffffffff82ce4f1e in zio_done (zio=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4117
#19 0xffffffff82ce02ec in zio_execute (zio=0xfffff80a85caf418)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#20 0xffffffff80c0bdd4 in taskqueue_run_locked (queue=0xfffff80952b72d00)
    at /usr/src/sys/kern/subr_taskqueue.c:465
#21 0xffffffff80c0cf38 in taskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:757
#22 0xffffffff80b6e4f3 in fork_exit (
    callout=0xffffffff80c0cea0 <taskqueue_thread_loop>,
    arg=0xfffff8010a8c6f50, frame=0xfffffe00e336cc00)
    at /usr/src/sys/kern/kern_fork.c:1057
#23 <signal handler called>
(kgdb)
Comment 1 Jeremy Faulkner 2018-10-27 00:44:03 UTC
Fatal trap 12: page fault while in kernel mode
cpuid = 14; apic id = 34
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8273fa90
stack pointer           = 0x28:0xfffffe02639f8600
frame pointer           = 0x28:0xfffffe02639f8690
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (dsl_scan_tq_reflect)
trap number             = 12
panic: page fault
cpuid = 14
Fri Oct 26 20:37:33 EDT 2018

FreeBSD constans 12.0-BETA1 FreeBSD 12.0-BETA1 r339534 GENERIC  amd64

panic: page fault

GNU gdb (GDB) 8.2 [GDB v8.2 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 14; apic id = 34
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8273fa90
stack pointer           = 0x28:0xfffffe02639f8600
frame pointer           = 0x28:0xfffffe02639f8690
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (dsl_scan_tq_reflect)
trap number             = 12
panic: page fault
cpuid = 14
time = 1540599485
KDB: stack backtrace:
#0 0xffffffff80bf9a97 at kdb_backtrace+0x67
#1 0xffffffff80bada63 at vpanic+0x1a3
#2 0xffffffff80bad8b3 at panic+0x43
#3 0xffffffff8108586f at trap_fatal+0x35f
#4 0xffffffff810858c9 at trap_pfault+0x49
#5 0xffffffff81084eee at trap+0x29e
#6 0xffffffff81060b65 at calltrap+0x8
#7 0xffffffff8273f2ec at vdev_indirect_io_start+0x6c
#8 0xffffffff8276fe79 at zio_vdev_io_start+0x2a9
#9 0xffffffff8276c2ec at zio_execute+0xbc
#10 0xffffffff8276bbfb at zio_nowait+0xcb
#11 0xffffffff8274571f at vdev_mirror_io_start+0x41f
#12 0xffffffff8276fd2c at zio_vdev_io_start+0x15c
#13 0xffffffff8276c2ec at zio_execute+0xbc
#14 0xffffffff8276bbfb at zio_nowait+0xcb
#15 0xffffffff82708251 at scan_exec_io+0x2f1
#16 0xffffffff8270a3c5 at scan_io_queues_run_one+0x4f5
#17 0xffffffff82699f80 at taskq_run+0x10
Uptime: 4d7h20m57s
Dumping 14448 out of 73670 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80bad64b in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80badac3 in vpanic (fmt=<optimized out>, ap=0xfffffe02639f8350)
    at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80bad8b3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff8108586f in trap_fatal (frame=0xfffffe02639f8540, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff810858c9 in trap_pfault (frame=0xfffffe02639f8540, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:765
#7  0xffffffff81084eee in trap (frame=0xfffffe02639f8540)
    at /usr/src/sys/amd64/amd64/trap.c:441
#8  <signal handler called>
#9  vdev_indirect_mapping_duplicate_adjacent_entries (offset=7868951343104,
    asize=4096, vd=<optimized out>, copied_entries=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:974
#10 vdev_indirect_remap (vd=0xfffff806aff1f000, offset=<optimized out>,
    asize=<optimized out>,
    func=0xffffffff827400a0 <vdev_indirect_gather_splits>,
    arg=0xfffff807481d8000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1060
#11 0xffffffff8273f2ec in vdev_indirect_io_start (zio=0xfffff807481d8000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1282
#12 0xffffffff8276fe79 in zio_vdev_io_start (zio=0xfffff807481d8000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3341
#13 0xffffffff8276c2ec in zio_execute (zio=0xfffff807481d8000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#14 0xffffffff8276bbfb in zio_nowait (zio=0xfffff807481d8000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1841
#15 0xffffffff8274571f in vdev_mirror_io_start (zio=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:560
#16 0xffffffff8276fd2c in zio_vdev_io_start (zio=0xfffff804e1733000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3188
#17 0xffffffff8276c2ec in zio_execute (zio=0xfffff804e1733000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#18 0xffffffff8276bbfb in zio_nowait (zio=0xfffff804e1733000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1841
#19 0xffffffff82708251 in scan_exec_io (dp=0xfffff8019bab8000,
    bp=0xfffffe02639f89c8, zio_flags=8388776, zb=0xfffff808716cd148,
    queue=0xfffff80ffb424d00)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:3682
#20 0xffffffff8270a3c5 in scan_io_queue_issue (queue=<optimized out>,
    io_list=0x80)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:2587
#21 scan_io_queues_run_one (arg=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:2766
#22 0xffffffff82699f80 in taskq_run (arg=0xfffff81002611cf0, pending=-512)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c:110
#23 0xffffffff80c0bdd4 in taskqueue_run_locked (queue=0xfffff803a27e5d00)
    at /usr/src/sys/kern/subr_taskqueue.c:465
#24 0xffffffff80c0cf38 in taskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:757
#25 0xffffffff80b6e4f3 in fork_exit (
    callout=0xffffffff80c0cea0 <taskqueue_thread_loop>,
    arg=0xfffff807a98545f0, frame=0xfffffe02639f8c00)
    at /usr/src/sys/kern/kern_fork.c:1057
#26 <signal handler called>
(kgdb)
Comment 2 Jeremy Faulkner 2019-03-12 20:47:39 UTC
It has been months since this pool last had a vdev removed; it had 8MB used for removed-device remapping. I attempted to delete old snapshots from before the last vdev was removed and hit this crash. While I was deleting the old snapshots, the pool was also scrubbing and under a steady I/O load from jails running on the pool.

The core dump is over 15GB. I'll try to upload it later.


FreeBSD constans 12.0-STABLE FreeBSD 12.0-STABLE #6 r344436M: Thu Feb 21 13:40:08 EST 2019     gldisater@constans:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64


(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80bc361a in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80bc3a90 in vpanic (fmt=<optimized out>, ap=0xfffffe00ec040350) at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80bc3873 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff8109b539 in trap_fatal (frame=0xfffffe00ec040540, eva=0) at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff8109b599 in trap_pfault (frame=0xfffffe00ec040540, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:765
#7  0xffffffff8109abaf in trap (frame=0xfffffe00ec040540) at /usr/src/sys/amd64/amd64/trap.c:441
#8  <signal handler called>
#9  vdev_indirect_mapping_duplicate_adjacent_entries (vd=0xfffff8012a7a0000, offset=402853027840, asize=4096, copied_entries=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:974
#10 vdev_indirect_remap (vd=0xfffff8012a7a0000, offset=<optimized out>, asize=<optimized out>, func=0xffffffff82cd5860 <vdev_indirect_gather_splits>, arg=0xfffff80402c7d418)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1060
#11 0xffffffff82cd4a6c in vdev_indirect_io_start (zio=0xfffff80402c7d418) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1282
#12 0xffffffff82d05656 in zio_vdev_io_start (zio=0xfffff80402c7d418) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3341
#13 0xffffffff82d01aac in zio_execute (zio=0xfffff80402c7d418) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#14 0xffffffff82d013cb in zio_nowait (zio=0xfffff80402c7d418) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1841
#15 0xffffffff82cdaeec in vdev_mirror_io_start (zio=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:560
#16 0xffffffff82d05509 in zio_vdev_io_start (zio=0xfffff80262174000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3188
#17 0xffffffff82d01aac in zio_execute (zio=0xfffff80262174000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1785
#18 0xffffffff82d013cb in zio_nowait (zio=0xfffff80262174000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1841
#19 0xffffffff82c9de90 in scan_exec_io (dp=0xfffff8097b629000, bp=0xfffffe00ec0409c8, zio_flags=8388784, zb=0xfffff80ebc23dbc8, queue=0xfffff80ed63c8600)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:3682
#20 0xffffffff82c9ff96 in scan_io_queue_issue (queue=0xfffff80ed63c8600, io_list=0x80) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:2587
#21 scan_io_queues_run_one (arg=0xfffff80ed63c8600) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:2766
#22 0xffffffff82c30da0 in taskq_run (arg=0xfffff810357eeea0, pending=-512) at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c:110
#23 0xffffffff80c21894 in taskqueue_run_locked (queue=0xfffff80b1fc9db00) at /usr/src/sys/kern/subr_taskqueue.c:467
#24 0xffffffff80c22c18 in taskqueue_thread_loop (arg=<optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:773
#25 0xffffffff80b84c32 in fork_exit (callout=0xffffffff80c22b80 <taskqueue_thread_loop>, arg=0xfffff811c3072590, frame=0xfffffe00ec040c00) at /usr/src/sys/kern/kern_fork.c:1059
#26 <signal handler called>
(kgdb)
Comment 3 Jeremy Faulkner 2019-03-13 12:09:10 UTC
After deleting the snapshots, the removed-device remapping data was reduced to 4MB, so the manipulation of the remapping data may be tied to the crash.

doas tar zcvf vdev_indirect_mapping-panic.tar.gz /var/crash/info.2 /var/crash/vmcore.2 /usr/lib/debug/boot/kernel
https://drive.google.com/open?id=1dgWRWlymB9iioHXc0rVxzTwU1YmhX6cB