11.2-beta2 and 11.2-RC1

Steps to reproduce:

zpool create test mirror da1 da2 mirror da3 da4 mirror da5 da6
zpool remove test mirror-1

It then panics. It also seems to panic after a reboot when trying to import, scrub, or destroy the pool.

# kgdb /boot/kernel/kernel /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: page fault
cpuid = 8
KDB: stack backtrace:
#0 0xffffffff80b3d407 at kdb_backtrace+0x67
#1 0xffffffff80af6a77 at vpanic+0x177
#2 0xffffffff80af68f3 at panic+0x43
#3 0xffffffff80f77f6f at trap_fatal+0x35f
#4 0xffffffff80f77fc9 at trap_pfault+0x49
#5 0xffffffff80f77797 at trap+0x2c7
#6 0xffffffff80f5744c at calltrap+0x8
#7 0xffffffff824f01d7 at vdev_indirect_io_start_cb+0x37
#8 0xffffffff824efe58 at vdev_indirect_remap+0x2f8
#9 0xffffffff824efb3d at vdev_indirect_io_start+0x2d
#10 0xffffffff8251ac9e at zio_vdev_io_start+0x2ae
#11 0xffffffff8251774c at zio_execute+0xac
#12 0xffffffff8251706b at zio_nowait+0xcb
#13 0xffffffff824f38ef at vdev_mirror_io_start+0x3ff
#14 0xffffffff8251ab52 at zio_vdev_io_start+0x162
#15 0xffffffff8251774c at zio_execute+0xac
#16 0xffffffff80b4ec14 at taskqueue_run_locked+0x154
#17 0xffffffff80b4fd78 at taskqueue_thread_loop+0x98
Uptime: 1m46s
Dumping 1886 out of 49109 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done.
done.
Loaded symbols for /boot/kernel/pf.ko Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done. done. Loaded symbols for /boot/kernel/opensolaris.ko #0 doadump (textdump=<value optimized out>) at pcpu.h:229 229 pcpu.h: No such file or directory. in pcpu.h (kgdb) (kgdb) bt #0 doadump (textdump=<value optimized out>) at pcpu.h:229 #1 0xffffffff80af668b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383 #2 0xffffffff80af6ab1 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776 #3 0xffffffff80af68f3 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:707 #4 0xffffffff80f77f6f in trap_fatal (frame=0xfffffe0c58d30720, eva=0) at /usr/src/sys/amd64/amd64/trap.c:875 #5 0xffffffff80f77fc9 in trap_pfault (frame=0xfffffe0c58d30720, usermode=0) at pcpu.h:229 #6 0xffffffff80f77797 in trap (frame=0xfffffe0c58d30720) at /usr/src/sys/amd64/amd64/trap.c:415 #7 0xffffffff80f5744c in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #8 0xffffffff82476994 in abd_get_offset (sabd=0x0, off=0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:443 #9 0xffffffff824f01d7 in vdev_indirect_io_start_cb (split_offset=<value optimized out>, vd=0xfffff800237fd000, offset=1258659328, size=3584, arg=0xfffff8006ac49000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1082 #10 0xffffffff824efe58 in vdev_indirect_remap (vd=<value optimized out>, offset=<value optimized out>, asize=<value optimized out>, func=0xffffffff824f01a0 <vdev_indirect_io_start_cb>, arg=0xfffff8006ac49000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1041 #11 0xffffffff824efb3d in vdev_indirect_io_start (zio=0xfffff8006ac49000) at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1099 #12 0xffffffff8251ac9e in zio_vdev_io_start (zio=0xfffff8006ac49000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3297 #13 0xffffffff8251774c in zio_execute (zio=0xfffff8006ac49000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768 #14 0xffffffff8251706b in zio_nowait (zio=0xfffff8006ac49000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1826 #15 0xffffffff824f38ef in vdev_mirror_io_start (zio=<value optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:557 #16 0xffffffff8251ab52 in zio_vdev_io_start (zio=0xfffff8006ad1c000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3166 #17 0xffffffff8251774c in zio_execute (zio=0xfffff8006ad1c000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768 #18 0xffffffff80b4ec14 in taskqueue_run_locked (queue=0xfffff8006a839900) at /usr/src/sys/kern/subr_taskqueue.c:463 #19 0xffffffff80b4fd78 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:755 #20 0xffffffff80aba0b3 in fork_exit (callout=0xffffffff80b4fce0 <taskqueue_thread_loop>, arg=0xfffff8006a7b81f0, frame=0xfffffe0c58d30c00) at /usr/src/sys/kern/kern_fork.c:1054 #21 0xffffffff80f5836e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:957 #22 0x0000000000000000 in ?? () (kgdb) -------------------------------------- kgdb /boot/kernel/kernel /var/crash/vmcore.1 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... 
Unread portion of the kernel message buffer: code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 current process = 0 (zio_free_issue_6_1) trap number = 12 fault code = supervisor read data, page not present = DPL 0, pres 1, long 1, def32 0, gran 1 instruction pointer = 0x20:0xffffffff82476994 stack pointer = 0x28:0xfffffe0c58b217e0 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 0 (zio_free_issue_3_1) frame pointer = 0x28:0xfffffe0c58b21810 panic: page fault cpuid = 5 KDB: stack backtrace: #0 0xffffffff80b3d407 at kdb_backtrace+0x67 #1 0xffffffff80af6a77 at vpanic+0x177 #2 0xffffffff80af68f3 at panic+0x43 #3 0xffffffff80f77f6f at trap_fatal+0x35f #4 0xffffffff80f77fc9 at trap_pfault+0x49 #5 0xffffffff80f77797 at trap+0x2c7 #6 0xffffffff80f5744c at calltrap+0x8 #7 0xffffffff824f01d7 at vdev_indirect_io_start_cb+0x37 #8 0xffffffff824efe58 at vdev_indirect_remap+0x2f8 #9 0xffffffff824efb3d at vdev_indirect_io_start+0x2d #10 0xffffffff8251ac9e at zio_vdev_io_start+0x2ae #11 0xffffffff8251774c at zio_execute+0xac #12 0xffffffff8251706b at zio_nowait+0xcb #13 0xffffffff824f38ef at vdev_mirror_io_start+0x3ff #14 0xffffffff8251ab52 at zio_vdev_io_start+0x162 #15 0xffffffff8251774c at zio_execute+0xac #16 0xffffffff80b4ec14 at taskqueue_run_locked+0x154 #17 0xffffffff80b4fd78 at taskqueue_thread_loop+0x98 Uptime: 16m41s Dumping 1894 out of 49109 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done. done. Loaded symbols for /boot/kernel/ums.ko Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done. done. Loaded symbols for /boot/kernel/pf.ko Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done. done. 
Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done. done. Loaded symbols for /boot/kernel/opensolaris.ko #0 doadump (textdump=<value optimized out>) at pcpu.h:229 229 pcpu.h: No such file or directory. in pcpu.h (kgdb) (kgdb) bt #0 doadump (textdump=<value optimized out>) at pcpu.h:229 #1 0xffffffff80af668b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383 #2 0xffffffff80af6ab1 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776 #3 0xffffffff80af68f3 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:707 #4 0xffffffff80f77f6f in trap_fatal (frame=0xfffffe0c58adb720, eva=0) at /usr/src/sys/amd64/amd64/trap.c:875 #5 0xffffffff80f77fc9 in trap_pfault (frame=0xfffffe0c58adb720, usermode=0) at pcpu.h:229 #6 0xffffffff80f77797 in trap (frame=0xfffffe0c58adb720) at /usr/src/sys/amd64/amd64/trap.c:415 #7 0xffffffff80f5744c in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #8 0xffffffff82476994 in abd_get_offset (sabd=0x0, off=0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:443 #9 0xffffffff824f01d7 in vdev_indirect_io_start_cb (split_offset=<value optimized out>, vd=0xfffff8002373f800, offset=1225929216, size=2560, arg=0xfffff801c43c9000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1082 #10 0xffffffff824efe58 in vdev_indirect_remap (vd=<value optimized out>, offset=<value optimized out>, asize=<value optimized out>, func=0xffffffff824f01a0 <vdev_indirect_io_start_cb>, arg=0xfffff801c43c9000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1041 #11 0xffffffff824efb3d in vdev_indirect_io_start (zio=0xfffff801c43c9000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c:1099 #12 0xffffffff8251ac9e in zio_vdev_io_start (zio=0xfffff801c43c9000) at 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3297 #13 0xffffffff8251774c in zio_execute (zio=0xfffff801c43c9000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768 #14 0xffffffff8251706b in zio_nowait (zio=0xfffff801c43c9000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1826 #15 0xffffffff824f38ef in vdev_mirror_io_start (zio=<value optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:557 #16 0xffffffff8251ab52 in zio_vdev_io_start (zio=0xfffff801c4419820) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3166 #17 0xffffffff8251774c in zio_execute (zio=0xfffff801c4419820) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768 #18 0xffffffff80b4ec14 in taskqueue_run_locked (queue=0xfffff80168168100) at /usr/src/sys/kern/subr_taskqueue.c:463 #19 0xffffffff80b4fd78 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:755 #20 0xffffffff80aba0b3 in fork_exit (callout=0xffffffff80b4fce0 <taskqueue_thread_loop>, arg=0xfffff800220f6f60, frame=0xfffffe0c58adbc00) at /usr/src/sys/kern/kern_fork.c:1054 #21 0xffffffff80f5836e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:957 #22 0x0000000000000000 in ?? () Current language: auto; currently minimal (kgdb) # zpool import pool: test id: 632784374722369342 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: test ONLINE mirror-0 ONLINE da1 ONLINE da2 ONLINE indirect-1 ONLINE mirror-2 ONLINE da5 ONLINE da6 ONLINE and then it panics when trying to import.
Can you provide dmesg(8) output, so we know what type of hardware is involved?
FYI, this is an issue on 12-CURRENT as well.
I just ran into the same problem on FreeBSD 11.2-p2. Please at least add a warning if the feature is not (yet) usable.
On 12.0-ALPHA5 (r338620M) today, the panic is:

ZFS storage pool version: features support (5000)
panic: solaris assert: rc->rc_count == number, file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/refcount.c, line: 94
cpuid = 2
time = 1536840299
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0091cb87b0
vpanic() at vpanic+0x1a3/frame 0xfffffe0091cb8810
panic() at panic+0x43/frame 0xfffffe091cb8870
assfail() at assfail+0x1a/frame 0xfffffe0091cb8880
refcount_destroy_many() at refcount_destroy_many+0x2b/frame 0xfffffe0091cb88b0
abd_free() at abd_free+0x18d/frame 0xfffffe0091cb88e0
spa_vdev_copy_segment_write_done() at spa_vdev_copy_segment_write_done+0x20/frame 0xfffffe0091cb8910
zio_done() at zio_done+0xf21/frame 0xfffffe0091cb8990
zio_execute() at zio_execute+0x18c/frame 0xfffffe0091cb89e0
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe0091cb8a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe0091cb8a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0091cb8ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0091cb8ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100477 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db>
(In reply to Roger Hammerstein from comment #4) That panic is different, and may warrant its own PR. See also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229007
I think those crashes are caused by a lack of TRIM support in the device removal code. This patch fixes similar crashes for me: https://reviews.freebsd.org/D17523
(In reply to Alexander Motin from comment #6) I meant the original crashes. The last one is a different issue, also related to TRIM, which I have already identified and am looking for a solution to.
A commit references this bug: Author: mav Date: Fri Oct 12 15:14:22 UTC 2018 New revision: 339329 URL: https://svnweb.freebsd.org/changeset/base/339329 Log: Add ZIO_TYPE_FREE support for indirect vdevs. Upstream code expects only ZIO_TYPE_READ and some ZIO_TYPE_WRITE requests to removed (indirect) vdevs, while on FreeBSD there is also ZIO_TYPE_FREE (TRIM). ZIO_TYPE_FREE requests do not have the data buffers, so don't need the pointer adjustment. PR: 228750, 229007 Reviewed by: allanjude, sef Approved by: re (kib) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D17523 Changes: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c
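To illustrate the idea described in this commit log, here is a minimal, self-contained sketch of why a request without a data buffer must skip the pointer adjustment. It is a simplified model, not the actual change to vdev_indirect.c: every `model_*` name is hypothetical, and the NULL check stands in for whatever condition the real patch uses. The original backtraces show the same shape of failure, with `abd_get_offset` called with `sabd=0x0`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request types standing in for ZIO_TYPE_READ/WRITE/FREE. */
enum model_zio_type { MODEL_ZIO_READ, MODEL_ZIO_WRITE, MODEL_ZIO_FREE };

/* Hypothetical stand-in for a data buffer. */
struct model_buf {
    unsigned char *data;
    size_t size;
};

/* Hypothetical request: FREE (TRIM) requests carry no buffer at all. */
struct model_zio {
    enum model_zio_type type;
    struct model_buf *buf;
};

/*
 * Return a view of src starting at off. Like the function in the
 * backtrace, this reads through src, so a NULL src is a fault.
 */
static struct model_buf
model_buf_get_offset(const struct model_buf *src, size_t off)
{
    assert(src != NULL);
    assert(off <= src->size);
    struct model_buf view = { src->data + off, src->size - off };
    return view;
}

/*
 * Prepare the buffer for a child request. Returns 1 and fills *out when
 * the request has data; returns 0 when there is no buffer to adjust.
 */
static int
model_child_buf(const struct model_zio *zio, size_t split_offset,
    struct model_buf *out)
{
    if (zio->buf == NULL)
        return 0;   /* nothing to adjust, e.g. a FREE request */
    *out = model_buf_get_offset(zio->buf, split_offset);
    return 1;
}
```

In this model a write request yields an offset view of its buffer, while a FREE request is passed through with no buffer instead of faulting on a NULL pointer.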
I moved the 11.2 test machine to 12, but I can set up a new 11.2 test later. The panic in comment 4 is still occurring on the latest 12 at r339345. Can you take a look at that? FreeBSD 12.0-ALPHA9 (GENERIC) #3 r339345M
(In reply to Roger Hammerstein from comment #9) There is a 2nd patch you'll need as well. See here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229007#c12
(In reply to Allan Jude from comment #10) yes, that patch works. root@freebsd12:~ # zpool remove test mirror-1 root@freebsd12:~ # root@freebsd12:~ # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-1 in progress since Sat Oct 13 10:03:42 2018 5.81M copied out of 5.81M at 850K/s, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da7 ONLINE 0 0 0 da6 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da5 ONLINE 0 0 0 da4 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # root@freebsd12:~ # zpool status pool: test state: ONLINE scan: none requested remove: Removal of vdev 1 copied 5.81M in 0h0m, completed on Sat Oct 13 10:03:51 2018 384 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da5 ONLINE 0 0 0 da4 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # and for extra: root@freebsd12:~ # zpool remove test mirror-2 root@freebsd12:~ # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 10:05:49 2018 7.15M copied out of 7.15M at 1.79M/s, 100.00% done, 0h0m to go 384 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da5 ONLINE 0 0 0 da4 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # root@freebsd12:~ # zpool status pool: test state: ONLINE scan: none requested remove: Removal of vdev 2 copied 7.15M in 0h0m, completed on Sat Oct 13 10:05:54 2018 912 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # root@freebsd12:~ # zpool clear test root@freebsd12:~ # zpool 
status test pool: test state: ONLINE scan: none requested remove: Removal of vdev 2 copied 7.15M in 0h0m, completed on Sat Oct 13 10:05:54 2018 912 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # root@freebsd12:~ # zpool export test root@freebsd12:~ # zpool import test zpool status test root@freebsd12:~ # zpool status test pool: test state: ONLINE scan: none requested remove: Removal of vdev 2 copied 7.15M in 0h0m, completed on Sat Oct 13 10:05:54 2018 912 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da9 ONLINE 0 0 0 da8 ONLINE 0 0 0 errors: No known data errors root@freebsd12:~ # FreeBSD freebsd12 12.0-ALPHA9 FreeBSD 12.0-ALPHA9 #3 r339345M
11.2-stable also works (for one removal) with the second patch from 229007#c12 FreeBSD freebsd11 11.2-STABLE FreeBSD 11.2-STABLE #0 r339346: Sat Oct 13 15:30:06 EDT 2018 without the second patch, the removal seemed to never finish, after leaving it overnight: root@freebsd11:~ # zpool create test mirror /dev/da8 /dev/da7 mirror /dev/da6 /dev/da5 mirror /dev/da4 /dev/da3 mirror /dev/da2 /dev/da1 root@freebsd11:~ # zpool status pool: test state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors root@freebsd11:~ # cp -a /usr/src /test/ root@freebsd11:~ # zpool remove test mirror-2 root@freebsd11:~ # root@freebsd11:~ # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 15:44:31 2018 22.2M copied out of 22.2M at 2.46M/s, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors root@freebsd11:~ # root@freebsd11:~ # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 15:44:31 2018 22.2M copied out of 22.2M at 169K/s, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors root@freebsd11:~ # root@freebsd11:/usr/src # 
zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 15:44:31 2018 22.2M copied out of 22.2M at 6.68K/s, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors root@freebsd11:/usr/src # after overnight: root@freebsd11:/usr/src # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 15:44:31 2018 22.2M copied out of 22.2M at 355/s, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors rebooting onto the new kernel with the patch from 229007 root@freebsd11:~ # zpool status pool: test state: ONLINE scan: none requested remove: Evacuation of mirror-2 in progress since Sat Oct 13 15:44:31 2018 1 copied out of 22.2M at 1/s, 0.00% done, (copy is slow, no estimated time) config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 da4 ONLINE 0 0 0 da3 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors root@freebsd11:~ # zpool status pool: test state: ONLINE scan: none requested remove: Removal of vdev 2 copied 22.2M in 18h20m, completed on Sun Oct 14 10:04:46 2018 816 memory used for removed device mappings config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 da8 ONLINE 0 0 0 da7 ONLINE 0 0 0 mirror-1 
ONLINE 0 0 0 da6 ONLINE 0 0 0 da5 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 da2 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors

But then a second removal, zpool remove test mirror-0, panicked:

KDB: stack backtrace:
#0 0xffffffff80b40df7 at kdb_backtrace+0x67
#1 0xffffffff80afa337 at vpanic+0x177
#2 0xffffffff80afa1b3 at panic+0x43
#3 0xffffffff80f7c38f at trap_fatal+0x35f
#4 0xffffffff80f7c3e9 at trap_pfault+0x49
#5 0xffffffff80f7ba8c at trap+0x29c
#6 0xffffffff80f5bfcc at calltrap+0x8
#7 0xffffffff824bc89b at vdev_indirect_io_start+0x9b
#8 0xffffffff824e9fa9 at zio_vdev_io_start+0x2a9
#9 0xffffffff824e68ec at zio_execute+0xbc
#10 0xffffffff824e61fb at zio_nowait+0xcb
#11 0xffffffff824c248f at vdev_mirror_io_start+0x41f
#12 0xffffffff824e9e5c at zio_vdev_io_start+0x15c
#13 0xffffffff824e68ec at zio_execute+0xbc
#14 0xffffffff80b52694 at taskqueue_run_locked+0x154
#15 0xffffffff80b537f8 at taskqueue_thread_loop+0x98
#16 0xffffffff80abd963 at fork_exit+0x83
#17 0xffffffff80f5cf8e at fork_trampoline+0xe
Uptime: 11m46s

But I think not everything that is in head has been MFCed back to 11 yet.
(In reply to Roger Hammerstein from comment #12) When posting a panic, please include the message that comes before the backtrace as well. While the backtrace is very important, we also need the actual error message.
(In reply to Allan Jude from comment #13) actually there is no panic on the console. kgdb /boot/kernel/kernel /var/crash/vmcore.2 Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode Fatal trap 12: page fault while in kernel mode cpuid = 19; apic id = 26 fault virtual address = 0x0 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff8243df34 Fatal trap 12: page fault while in kernel mode stack pointer = 0x28:0xfffffe023ca77750 frame pointer = 0x28:0xfffffe023ca77780 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 0 (zio_free_issue_2_3) trap number = 12 panic: page fault cpuid = 19 KDB: stack backtrace: #0 0xffffffff80b40df7 at kdb_backtrace+0x67 #1 0xffffffff80afa337 at vpanic+0x177 #2 0xffffffff80afa1b3 at panic+0x43 #3 0xffffffff80f7c38f at trap_fatal+0x35f #4 0xffffffff80f7c3e9 at trap_pfault+0x49 #5 0xffffffff80f7ba8c at trap+0x29c #6 0xffffffff80f5bfcc at calltrap+0x8 #7 0xffffffff824bc89b at vdev_indirect_io_start+0x9b #8 0xffffffff824e9fa9 at zio_vdev_io_start+0x2a9 #9 0xffffffff824e68ec at zio_execute+0xbc #10 0xffffffff824e61fb at zio_nowait+0xcb #11 0xffffffff824c248f at vdev_mirror_io_start+0x41f #12 0xffffffff824e9e5c at zio_vdev_io_start+0x15c #13 0xffffffff824e68ec at zio_execute+0xbc #14 0xffffffff80b52694 at taskqueue_run_locked+0x154 #15 0xffffffff80b537f8 at taskqueue_thread_loop+0x98 #16 0xffffffff80abd963 at fork_exit+0x83 #17 0xffffffff80f5cf8e at fork_trampoline+0xe
A commit references this bug: Author: mav Date: Mon Oct 15 21:59:24 UTC 2018 New revision: 339372 URL: https://svnweb.freebsd.org/changeset/base/339372 Log: Skip VDEV_IO_DONE stage only for ZIO_TYPE_FREE. Device removal code uses zio_vdev_child_io() with ZIO_TYPE_NULL parent, that never happened before. It confused FreeBSD-specific TRIM code, which does not use VDEV_IO_DONE for logical ZIO_TYPE_FREE ZIOs. As result of that stage being skipped device removal ZIOs leaked references and memory that supposed to be freed by VDEV_IO_DONE, making it stuck. It is a quick patch rather then a nice fix, but hopefully we'll be able to drop it all together when alternative TRIM implementation finally get landed. PR: 228750, 229007 Discussed with: allanjude, avg, smh Approved by: re (delphij) MFC after: 5 days Sponsored by: iXsystems, Inc. Changes: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
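The leak described in this commit log can be sketched with a toy pipeline model. This is only an illustration under stated assumptions: all `model_*` names are hypothetical, and the "inherited" rule below is an assumed stand-in for the old behaviour (dropping the done stage whenever the parent lacks it), not a quote from zio.c. The point is that a data-carrying child of a NULL-type parent takes a reference in its start stage and never releases it if the done stage is skipped.

```c
#include <stdbool.h>

/* Hypothetical request types standing in for ZIO_TYPE_NULL/READ/WRITE/FREE. */
enum model_type { MODEL_NULL, MODEL_READ, MODEL_WRITE, MODEL_FREE };

/* Hypothetical stage bits standing in for VDEV_IO_START / VDEV_IO_DONE. */
#define MODEL_STAGE_IO_START 0x1u
#define MODEL_STAGE_IO_DONE  0x2u

/*
 * Assumed old rule: the child drops IO_DONE whenever its parent has no
 * such stage. In this model only READ and WRITE parents have it.
 */
static unsigned
model_pipeline_inherited(enum model_type parent_type)
{
    bool parent_has_done =
        (parent_type == MODEL_READ || parent_type == MODEL_WRITE);
    unsigned pipeline = MODEL_STAGE_IO_START | MODEL_STAGE_IO_DONE;

    if (!parent_has_done)
        pipeline &= ~MODEL_STAGE_IO_DONE;
    return pipeline;
}

/* Rule from the commit title: skip IO_DONE only for FREE requests. */
static unsigned
model_pipeline_by_type(enum model_type child_type)
{
    unsigned pipeline = MODEL_STAGE_IO_START | MODEL_STAGE_IO_DONE;

    if (child_type == MODEL_FREE)
        pipeline &= ~MODEL_STAGE_IO_DONE;
    return pipeline;
}

/*
 * Run a child through its pipeline and return how many references are
 * still held afterwards. Data-carrying requests take one reference in
 * IO_START; IO_DONE releases it. FREE requests take none in this model.
 */
static int
model_run_child(enum model_type child_type, unsigned pipeline)
{
    int held = 0;

    if ((pipeline & MODEL_STAGE_IO_START) && child_type != MODEL_FREE)
        held++;
    if ((pipeline & MODEL_STAGE_IO_DONE) && held > 0)
        held--;
    return held;
}
```

Under the assumed old rule, a WRITE child of a NULL parent finishes with one reference still held, which matches the "stuck" evacuation seen earlier in the thread; with the per-type rule the same child finishes with none.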
I believe ZFS device removal should now work in FreeBSD head. I'll merge the fixes to stable/11 in a week.
A commit references this bug: Author: mav Date: Fri Oct 19 04:30:26 UTC 2018 New revision: 339440 URL: https://svnweb.freebsd.org/changeset/base/339440 Log: MFC r339329: Add ZIO_TYPE_FREE support for indirect vdevs. Upstream code expects only ZIO_TYPE_READ and some ZIO_TYPE_WRITE requests to removed (indirect) vdevs, while on FreeBSD there is also ZIO_TYPE_FREE (TRIM). ZIO_TYPE_FREE requests do not have the data buffers, so don't need the pointer adjustment. PR: 228750, 229007 Changes: _U stable/11/ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c
A commit references this bug: Author: mav Date: Fri Oct 19 04:37:28 UTC 2018 New revision: 339441 URL: https://svnweb.freebsd.org/changeset/base/339441 Log: MFC r339372: Skip VDEV_IO_DONE stage only for ZIO_TYPE_FREE. Device removal code uses zio_vdev_child_io() with ZIO_TYPE_NULL parent, that never happened before. It confused FreeBSD-specific TRIM code, which does not use VDEV_IO_DONE for logical ZIO_TYPE_FREE ZIOs. As result of that stage being skipped device removal ZIOs leaked references and memory that supposed to be freed by VDEV_IO_DONE, making it stuck. It is a quick patch rather then a nice fix, but hopefully we'll be able to drop it all together when alternative TRIM implementation finally get landed. PR: 228750, 229007 Changes: _U stable/11/ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
Merged to stable/11.
Sorry to bring this up again, but was this fixed on 12.0 too? I just removed a device, and I did see the message in status about starting to remove it, but then it panics with a similar error. Only in my case I see:

panic: solaris assert: ((offset) & ((1ULL << vd->vdev_ashift) - 1)) == 0 (0xa00 == 0x0), file /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c, line: 3561

The fact that it shows vdev_ashift leads me to believe that the cause is slightly different but still related to device removal.
Imported read-only, I see:

remove: Evacuation of label/zfs1 in progress since Wed Sep 4 17:37:55 2019
29.5K copied out of 837G at 1/s, 0.00% done, (copy is slow, no estimated time)

With zdb I can also see that ashift is not the same, and I'm guessing that's why I get the panic:

metaslab_array: 33
metaslab_shift: 33
ashift: 9
asize: 1000199946240
is_log: 0
removing: 1

I suppose it's probably easier to rebuild the pool.
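The arithmetic behind that assertion can be checked with a small sketch. The helper name `model_misaligned_bits` is hypothetical; the expression inside it is the one printed in the panic message. The value 0xa00 is taken from the assertion output and ashift 9 from the zdb output above, while ashift 12 is only an assumption about the other vdev, since the thread does not state its value.

```c
#include <stdint.h>

/*
 * Return the low bits of offset that violate alignment to 2^ashift.
 * A result of 0 means the offset is aligned. ashift must be below 64.
 */
static uint64_t
model_misaligned_bits(uint64_t offset, unsigned ashift)
{
    return offset & ((UINT64_C(1) << ashift) - 1);
}
```

With these inputs, 0xa00 (2560 bytes) is a multiple of 512, so it passes for ashift 9, but it is not a multiple of 4096, so for ashift 12 the masked value is 0xa00, the same left-hand value shown in the panic. That is consistent with an offset valid on the ashift 9 vdev being checked against a vdev with a larger ashift.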