| Summary: | [zfs] kernel crashes on remove disk | | |
| --- | --- | --- | --- |
| Product: | Base System | Reporter: | Sergei Masharov <serzh> |
| Component: | kern | Assignee: | Alexander Motin <mav> |
| Status: | Closed FIXED | | |
| Severity: | Affects Some People | CC: | allanjude, emaste, gldisater, maurizio1018, mav |
| Priority: | --- | | |
| Version: | CURRENT | | |
| Hardware: | amd64 | | |
| OS: | Any | | |
Description
Sergei Masharov
2018-06-14 08:56:58 UTC
It also crashes with file-backed vdevs:

root@:/ # mount -t tmpfs tmpfs /tmp
root@:/ # dd if=/dev/zero of=/tmp/z1 bs=1G count=0 seek=1
0+0 records in
0+0 records out
0 bytes transferred in 0.000010 secs (0 bytes/sec)
root@:/ # dd if=/dev/zero of=/tmp/z2 bs=1G count=0 seek=1
0+0 records in
0+0 records out
0 bytes transferred in 0.000010 secs (0 bytes/sec)
root@:/ # zpool create zm /tmp/z1
root@:/ # zpool add zm /tmp/z2
root@:/ # zpool list -v zm
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zm         1.88G   154K  1.87G        -         -     0%     0%  1.00x  ONLINE  -
  /tmp/z1   960M   145K   960M        -         -     0%     0%
  /tmp/z2   960M  8.50K   960M        -         -     0%     0%
root@:/ # zpool remove zm /tmp/z1
root@:/ # zpool list -v zm
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zm          960M   176K   960M        -         -     0%     0%  1.00x  ONLINE  -
  /tmp/z2   960M   176K   960M        -         -     0%     0%
root@:/ # dd if=/dev/zero of=/tmp/z0 bs=1M count=0 seek=100
0+0 records in
0+0 records out
0 bytes transferred in 0.000010 secs (0 bytes/sec)
root@:/ # zpool add zm /tmp/z0
root@:/ # zpool list -v zm
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zm         1.02G   138K  1.02G        -         -     0%     0%  1.00x  ONLINE  -
  /tmp/z2   960M   138K   960M        -         -     0%     0%
  /tmp/z0    80M      0    80M        -         -     0%     0%
root@:/ # zpool remove zm /tmp/z2
cannot remove /tmp/z2: out of space
root@:/ # zpool remove zm /tmp/z2
cannot remove /tmp/z2: out of space
root@:/ # zpool remove zm /tmp/z2
cannot remove /tmp/z2: out of space
root@:/ # zpool add zm /tmp/z1
root@:/ # zpool remove zm /tmp/z2

and the system crashes.

It also occurs on CURRENT (r335317):

db> trace
Tracing pid 0 tid 100536 td 0xfffff8001149b000
kdb_enter() at kdb_enter+0x3b/frame 0xfffffe004cfa8810
vpanic() at vpanic+0x1c0/frame 0xfffffe004cfa8870
panic() at panic+0x43/frame 0xfffffe004cfa88d0
assfail3() at assfail3+0x2c/frame 0xfffffe004cfa88f0
vdev_indirect_io_start() at vdev_indirect_io_start+0x6b/frame 0xfffffe004cfa8910
zio_vdev_io_start() at zio_vdev_io_start+0x4c6/frame 0xfffffe004cfa8970
zio_execute() at zio_execute+0x17c/frame 0xfffffe004cfa89c0
zio_nowait() at zio_nowait+0xc4/frame 0xfffffe004cfa89f0
vdev_mirror_io_start() at vdev_mirror_io_start+0x3fa/frame 0xfffffe004cfa8a70
zio_vdev_io_start() at zio_vdev_io_start+0x208/frame 0xfffffe004cfa8ad0
zio_execute() at zio_execute+0x17c/frame 0xfffffe004cfa8b20
taskqueue_run_locked() at taskqueue_run_locked+0x14c/frame 0xfffffe004cfa8b80
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe004cfa8bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe004cfa8bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004cfa8bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>

The same panic occurs on FreeBSD 12.0-ALPHA2 #3 r337947 (Fri Aug 17 10:29:32 EDT 2018), during "Starting file system checks":

panic: solaris assert: zio->io_type == ZIO_TYPE_WRITE (0x3 == 0x2), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c, line: 1267
cpuid = 3
time = 1534517459
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004d1f37e0
vpanic() at vpanic+0x1a3/frame 0xfffffe004d1f3840
panic() at panic+0x43/frame 0xfffffe004d1f38a0
assfail3() at assfail3+0x2c/frame 0xfffffe004d1f38c0
vdev_indirect_io_start() at vdev_indirect_io_start+0xae/frame 0xfffffe004d1f3900
zio_vdev_io_start() at zio_vdev_io_start+0x4b5/frame 0xfffffe004d1f3960
zio_execute() at zio_execute+0x18c/frame 0xfffffe004d1f39b0
zio_nowait() at zio_nowait+0xc4/frame 0xfffffe004d1f39e0
vdev_mirror_io_start() at vdev_mirror_io_start+0x42a/frame 0xfffffe004d1f3a70
zio_vdev_io_start() at zio_vdev_io_start+0x1fe/frame 0xfffffe004d1f3ad0
zio_execute() at zio_execute+0x18c/frame 0xfffffe004d1f3b20
taskqueue_run_locked() at taskqueue_run_locked+0x14c/frame 0xfffffe004d1f3b80
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe004d1f3bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe004d1f3bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004d1f3bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100538 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db>

The October 2018 OpenZFS Leadership Meeting listed mav@ as maintainer for ZFS device removal. CC'd.

I crashed differently:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer = 0x20:0xffffffff827ccc3f
stack pointer     = 0x28:0xfffffe002d962390
frame pointer     = 0x28:0xfffffe002d962390
code segment      = base 0x0, limit 0xfffff, type 0x1b
                  = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags  = interrupt enabled, resume, IOPL = 0
current process   = 0 (zio_write_issue_1)

#10 0xffffffff827ccc3f in vdev_queue_offset_compare (x1=0xfffff800843ea830, x2=0xdeadc0dedeadbe8e) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:320
        z1 = 0xfffff800843ea830
        z2 = 0xdeadc0dedeadbe8e
        cmp = <optimized out>
#11 0xffffffff826fc174 in avl_find (tree=0xfffff80005f28740, value=0xfffff800843ea830, where=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c:268
        node = 0xfffff80003d98a80
        off = 592
        prev = 0xdeadc0dedeadc0de
        diff = <optimized out>
#12 avl_add (tree=0xfffff80005f28740, new_node=0xfffff800843ea830) at /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c:644
        where = <optimized out>
#13 0xffffffff827cdbbc in vdev_queue_pending_add (vq=0xfffff80005f285b8, zio=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:472
        spa = <optimized out>
        __right = <optimized out>
        __left = <optimized out>

and:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer = 0x20:0xffffffff826fbd1f
stack pointer     = 0x28:0xfffffe002d8e09d8
frame pointer     = 0x28:0xfffffe002d8e0a10
code segment      = base 0x0, limit 0xfffff, type 0x1b
                  = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags  = interrupt enabled, resume, IOPL = 0
current process   = 0 (zio_write_intr_3)

#10 0xffffffff826fbd1f in avl_rotation (tree=0xfffff8000d01f740, node=0xfffff80003d8b668, balance=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c:379
        right = 0
        left = <optimized out>
        left_heavy = <optimized out>
        right_heavy = <optimized out>
        parent = 0xdeadc0dedeadc0d8
        child = 0xfffff80003dd9250
        which_child = 1
        child_bal = -1
        gchild = <optimized out>
        gleft = <optimized out>
        gright = <optimized out>
        cright = 0x0
#11 0xffffffff826fc3e3 in avl_remove (tree=0xfffff8000d01f740, data=0xfffff80006fd4830) at /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c:818
        off = <optimized out>
        delete = <optimized out>
        left = <optimized out>
        right = <optimized out>
        node = 0xdeadc0dedeadc0d8
        parent = 0xfffff80003d8b668
        which_child = <optimized out>
        old_balance = <optimized out>
        new_balance = 0
        tmp = <optimized out>
#12 0xffffffff827cdd42 in vdev_queue_pending_remove (vq=0xfffff8000d01f5b8, zio=0xfffff80006fd4830) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:490
        spa = <optimized out>
        __right = <optimized out>
        __left = <optimized out>
#13 vdev_queue_io_done (zio=0xfffff80006fd4830) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:937
        vq = 0xfffff8000d01f5b8
        nio = <optimized out>
#14 0xffffffff827fe7f1 in zio_vdev_io_done (zio=0xfffff80006fd4830) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3366
        vd = 0xfffff8000d01f000
        unexpected_error = <error reading variable unexpected_error (Cannot access memory at address 0x0)>
        ops = <optimized out>
#15 0xffffffff827f903c in zio_execute (zio=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1784
        pipeline = <optimized out>
        stage = <optimized out>

Possibly related to two different crashes discussed in: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228750

(In reply to Allan Jude from comment #6)
I think they are related. In my case I also see "indirect" devices in the problem pools.

(In reply to Sergei Masharov from comment #7)
An indirect vdev is what you get once a remove has progressed.
Do you have a stack trace or crash dump from that point in the process?

tar zcvf core-dump-zfs-devrem.tar.gz /boot/kernel /usr/lib/debug/boot/kernel /var/crash/*.0

https://drive.google.com/open?id=1SUTm-hBgIhkkTmYNridU9zZqVMSIlA3T

root@devremoval:~ # uname -a
FreeBSD devremoval 12.0-ALPHA7 FreeBSD 12.0-ALPHA7 r338859 GENERIC amd64

I haven't updated this bhyve VM recently, but it has been unable to boot whenever three of its drives are enabled.

panic: solaris assert: zio->io_type == ZIO_TYPE_WRITE (0x3 == 0x2), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c, line: 1267
cpuid = 1
time = 1539270204
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004d1677e0
vpanic() at vpanic+0x1a3/frame 0xfffffe004d167840
panic() at panic+0x43/frame 0xfffffe004d1678a0
assfail3() at assfail3+0x2c/frame 0xfffffe004d1678c0
vdev_indirect_io_start() at vdev_indirect_io_start+0xae/frame 0xfffffe004d167900
zio_vdev_io_start() at zio_vdev_io_start+0x4b5/frame 0xfffffe004d167960
zio_execute() at zio_execute+0x18c/frame 0xfffffe004d1679b0
zio_nowait() at zio_nowait+0xc4/frame 0xfffffe004d1679e0
vdev_mirror_io_start() at vdev_mirror_io_start+0x42a/frame 0xfffffe004d167a70
zio_vdev_io_start() at zio_vdev_io_start+0x1fe/frame 0xfffffe004d167ad0
zio_execute() at zio_execute+0x18c/frame 0xfffffe004d167b20
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe004d167b80
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe004d167bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe004d167bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004d167bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100510 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db>

At least some of these crashes (zio->io_type == ZIO_TYPE_WRITE (0x3 == 0x2)) are caused by the lack of TRIM support in the device removal code.
This patch fixes crashes like these for me: https://reviews.freebsd.org/D17523

The patch allowed my bhyve virtual machine to boot and import the pool that had previously had a device removed. I attempted to remove an additional device from the pool and got a new panic:

root@devremoval:~ # zpool status
  pool: devrem
 state: ONLINE
  scan: none requested
remove: Removal of vdev 2 copied 4.36G in 1659h54m, completed on Wed Aug 29 20:37:33 2018
        108K memory used for removed device mappings
config:

        NAME        STATE     READ WRITE CKSUM
        devrem      ONLINE       0     0     0
          vtbd1     ONLINE       0     0     0
          vtbd2     ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          vtbd0p3   ONLINE       0     0     0

errors: No known data errors

root@devremoval:~ # zpool remove devrem vtbd2

panic: solaris assert: rc->rc_count == number, file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/refcount.c, line: 94
cpuid = 1
time = 1539304693
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004d13a8f0
vpanic() at vpanic+0x1a3/frame 0xfffffe004d13a950
panic() at panic+0x43/frame 0xfffffe004d13a9b0
assfail() at assfail+0x1a/frame 0xfffffe004d13a9c0
refcount_destroy_many() at refcount_destroy_many+0x2b/frame 0xfffffe004d13a9f0
abd_free() at abd_free+0x18d/frame 0xfffffe004d13aa20
spa_vdev_copy_segment_write_done() at spa_vdev_copy_segment_write_done+0x20/frame 0xfffffe004d13aa50
zio_done() at zio_done+0xf21/frame 0xfffffe004d13aad0
zio_execute() at zio_execute+0x18c/frame 0xfffffe004d13ab20
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe004d13ab80
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe004d13abb0
fork_exit() at fork_exit+0x84/frame 0xfffffe004d13abf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004d13abf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100501 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db>

tar zcvf core-dump-zfs-devrem-2.tar.gz /boot/kernel /usr/lib/debug/boot/kernel /var/crash/*.1

https://drive.google.com/open?id=1eBBJC3s3CnHRCQL8dNxEIBfNyAVy5S9K

This is a different issue. I reproduced it too, but I am still thinking about the proper solution. A quick and dirty patch looks like this:

--- zio.c       (revision 339291)
+++ zio.c       (working copy)
@@ -1180,7 +1180,8 @@
        }

        /* Not all IO types require vdev io done stage e.g. free */
-       if (!(pio->io_pipeline & ZIO_STAGE_VDEV_IO_DONE))
+       if (type == pio->io_type &&
+           !(pio->io_pipeline & ZIO_STAGE_VDEV_IO_DONE))
                pipeline &= ~ZIO_STAGE_VDEV_IO_DONE;

        if (vd->vdev_ops->vdev_op_leaf) {

A commit references this bug:

Author: mav
Date: Fri Oct 12 15:14:22 UTC 2018
New revision: 339329
URL: https://svnweb.freebsd.org/changeset/base/339329

Log:
  Add ZIO_TYPE_FREE support for indirect vdevs.

  Upstream code expects only ZIO_TYPE_READ and some ZIO_TYPE_WRITE
  requests to removed (indirect) vdevs, while on FreeBSD there is also
  ZIO_TYPE_FREE (TRIM). ZIO_TYPE_FREE requests do not have the data
  buffers, so don't need the pointer adjustment.

  PR: 228750, 229007
  Reviewed by: allanjude, sef
  Approved by: re (kib)
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D17523

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c

A commit references this bug:

Author: mav
Date: Mon Oct 15 21:59:25 UTC 2018
New revision: 339372
URL: https://svnweb.freebsd.org/changeset/base/339372

Log:
  Skip VDEV_IO_DONE stage only for ZIO_TYPE_FREE.

  Device removal code uses zio_vdev_child_io() with a ZIO_TYPE_NULL
  parent, which never happened before. It confused the FreeBSD-specific
  TRIM code, which does not use VDEV_IO_DONE for logical ZIO_TYPE_FREE
  ZIOs. As a result of that stage being skipped, device removal ZIOs
  leaked references and memory that was supposed to be freed by
  VDEV_IO_DONE, making removal stuck.

  It is a quick patch rather than a nice fix, but hopefully we'll be
  able to drop it altogether when the alternative TRIM implementation
  finally lands.
  PR: 228750, 229007
  Discussed with: allanjude, avg, smh
  Approved by: re (delphij)
  MFC after: 5 days
  Sponsored by: iXsystems, Inc.

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c

I believe ZFS device removal should now work in FreeBSD head.

When will the patch be available for 11.2-RELEASE?

A commit references this bug:

Author: mav
Date: Fri Oct 19 04:30:26 UTC 2018
New revision: 339440
URL: https://svnweb.freebsd.org/changeset/base/339440

Log:
  MFC r339329:
  Add ZIO_TYPE_FREE support for indirect vdevs.

  Upstream code expects only ZIO_TYPE_READ and some ZIO_TYPE_WRITE
  requests to removed (indirect) vdevs, while on FreeBSD there is also
  ZIO_TYPE_FREE (TRIM). ZIO_TYPE_FREE requests do not have the data
  buffers, so don't need the pointer adjustment.

  PR: 228750, 229007

Changes:
  _U stable/11/
  stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c

A commit references this bug:

Author: mav
Date: Fri Oct 19 04:37:28 UTC 2018
New revision: 339441
URL: https://svnweb.freebsd.org/changeset/base/339441

Log:
  MFC r339372:
  Skip VDEV_IO_DONE stage only for ZIO_TYPE_FREE.

  Device removal code uses zio_vdev_child_io() with a ZIO_TYPE_NULL
  parent, which never happened before. It confused the FreeBSD-specific
  TRIM code, which does not use VDEV_IO_DONE for logical ZIO_TYPE_FREE
  ZIOs. As a result of that stage being skipped, device removal ZIOs
  leaked references and memory that was supposed to be freed by
  VDEV_IO_DONE, making removal stuck.

  It is a quick patch rather than a nice fix, but hopefully we'll be
  able to drop it altogether when the alternative TRIM implementation
  finally lands.

  PR: 228750, 229007

Changes:
  _U stable/11/
  stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c