It looks like the system here is crashing on startup trying to do something with a vdev that has a failed member on host startup. (da3:mpt0:0:3:0): READ(10). CDB: 28 00 15 37 a9 29 00 00 24 00 (da3:mpt0:0:3:0): CAM status: SCSI Status Error (da3:mpt0:0:3:0): SCSI status: Check Condition (da3:mpt0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) (da3:mpt0:0:3:0): Info: 0x1537a933 (da3:mpt0:0:3:0): Field Replaceable Unit: 129 (da3:mpt0:0:3:0): Actual Retry Count: 151 (da3:mpt0:0:3:0): Error 5, Unretryable error Fatal trap 12: page fault while in kernel mode cpuid = 6; apic id = 06 fault virtual address = 0x11 fault code = supervisor read instruction, page not present instruction pointer = 0x20:0x11 stack pointer = 0x28:0xfffffe009d5e9118 frame pointer = 0x28:0xfffffe009d5e9150 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2464 (syncthing) [ thread pid 2464 tid 100994 ] Stopped at 0x11 db> bt Tracing pid 2464 tid 100994 td 0xfffff800cf1d8000 ??() at 0x11 avl_find() at avl_find+0x44/frame 0xfffffe009d5e9150 vdev_queue_change_io_priority() at vdev_queue_change_io_priority+0xa1/frame 0xfffffe009d5e9190 zio_change_priority() at zio_change_priority+0x38/frame 0xfffffe009d5e91e0 zio_change_priority() at zio_change_priority+0xe7/frame 0xfffffe009d5e9230 zio_change_priority() at zio_change_priority+0xe7/frame 0xfffffe009d5e9280 zio_change_priority() at zio_change_priority+0xe7/frame 0xfffffe009d5e92d0 arc_read() at arc_read+0xf3/frame 0xfffffe009d5e9360 dbuf_read() at dbuf_read+0x714/frame 0xfffffe009d5e9420 dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x163/frame 0xfffffe009d5e9490 dmu_read_uio_dnode() at dmu_read_uio_dnode+0x37/frame 0xfffffe009d5e9500 dmu_read_uio_dbuf() at dmu_read_uio_dbuf+0x3b/frame 0xfffffe009d5e9530 zfs_freebsd_read() at zfs_freebsd_read+0x2d3/frame 0xfffffe009d5e95e0 VOP_READ_APV() at VOP_READ_APV+0x7c/frame 0xfffffe009d5e9610 null_bypass() at null_bypass+0xd8/frame 0xfffffe009d5e9750 VOP_READ_APV() at VOP_READ_APV+0x81/frame 0xfffffe009d5e9780 vn_read() at vn_read+0x195/frame 0xfffffe009d5e9800 vn_io_fault_doio() at vn_io_fault_doio+0x43/frame 0xfffffe009d5e9860 vn_io_fault1() at vn_io_fault1+0x161/frame 0xfffffe009d5e99a0 vn_io_fault() at vn_io_fault+0x195/frame 0xfffffe009d5e9a10 dofileread() at dofileread+0x95/frame 0xfffffe009d5e9a50 sys_read() at sys_read+0xc3/frame 0xfffffe009d5e9ac0 amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe009d5e9bf0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe009d5e9bf0 --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x4a5d50, rsp = 0xc4219532c0, rbp = 0xc421953318 ---
ftp-mirror0.isc.freebsd.org 12.0-ALPHA6 FreeBSD 12.0-ALPHA6 #0 r338761M: Tue Sep 18 22:55:31 UTC 2018 I've put the kernel/crashdump/debuginfo in freefall.freebsd.org:~sbruno/ftpmirror_crash.tgz
da3 is the disk that is failing. Starting up the jails and indexing files on the zftp pool will cause the panic in this ticket. I *can* remove the drive, but I want to wait for some diagnostics from ZFS folks first. root@ftp-mirror0.isc:~ # zpool status pool: zftp state: ONLINE scan: scrub repaired 150K in 0 days 02:03:31 with 0 errors on Fri Aug 17 05:19:19 2018 config: NAME STATE READ WRITE CKSUM zftp ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 da0p1 ONLINE 0 0 0 da1p1 ONLINE 0 0 0 da2p1 ONLINE 0 0 0 da3p1 ONLINE 0 0 0 da4p1 ONLINE 0 0 0 da5p1 ONLINE 0 0 0 da6p1 ONLINE 0 0 0 da7p1 ONLINE 0 0 0 errors: No known data errors pool: zroot state: ONLINE scan: scrub repaired 0 in 0 days 03:16:08 with 0 errors on Fri Aug 17 06:31:58 2018 config: NAME STATE READ WRITE CKSUM zroot ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 errors: No known data errors
Recently saw a ZFS crash due to 2 dying/dead disks (of 4) in a RAIDZ1: FreeBSD 11.1-STABLE #0 r331113: Sun Mar 18 19:57:45 AEST 2018 panic: solaris assert: remove_reference(hdr, ((void *)0), tag) == 0 (0xd == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line: 3463 cpuid = 7 KDB: stack backtrace: #0 0xffffffff805cf167 at kdb_backtrace+0x67 #1 0xffffffff8058cc46 at vpanic+0x186 #2 0xffffffff8058cab3 at panic+0x43 #3 0xffffffff815cb22c at assfail3+0x2c #4 0xffffffff8124cb50 at arc_buf_destroy+0x1a0 #5 0xffffffff8125c56a at dbuf_read_done+0x4a #6 0xffffffff8124f236 at arc_read_done+0x296 #7 0xffffffff812e357e at zio_done+0x80e #8 0xffffffff812df05c at zio_execute+0xac #9 0xffffffff805e0714 at taskqueue_run_locked+0x154 #10 0xffffffff805e1858 at taskqueue_thread_loop+0x98 #11 0xffffffff805596a2 at fork_exit+0x82 #12 0xffffffff8074a13e at fork_trampoline+0xe Uptime: 53s The culprits: (da2:mps0:0:5:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) (da2:mps0:0:5:0): Error 5, Unretryable error (da2:mps0:0:5:0): READ(16). CDB: 88 00 00 00 00 01 cf 2f 5c dc 00 00 00 55 00 00 (da2:mps0:0:5:0): CAM status: SCSI Status Error (da2:mps0:0:5:0): SCSI status: Check Condition (da2:mps0:0:5:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred) (da2:mps0:0:5:0): Error 6, Retries exhausted (da2:mps0:0:5:0): Invalidating pack ... Sep 21 11:50:16 <daemon.crit> drunkfish smartd[1060]: Device: /dev/da3 [SAT], 39538 Currently unreadable (pending) sectors
Yet another resilvering crash due to disk errors. I've dropped the crashinfo, kernel and debug in: freefall.freebsd.org:~sbruno/ftp-mirror_zfs.tar.gz
(In reply to Andy Farkas from comment #3) This crash is different, and should get its own bugzilla number.
(In reply to Sean Bruno from comment #0) I have submitted the fix for this to Release Engineering
A commit references this bug: Author: allanjude Date: Sat Sep 29 01:26:07 UTC 2018 New revision: 339009 URL: https://svnweb.freebsd.org/changeset/base/339009 Log: Avoid panic when adjusting priority of a read in the face of an IO error PR: 231516 Reported by: sbruno Approved by: re (rgrimes) Obtained from: ZFS-on-Linux X-MFC-with: 334844 Sponsored by: Klara Systems MFV/ZoL: Fix zio->io_priority failed (7 < 6) assert commit c26cf0966d131b722c32f8ccecfe5791a789d975 Author: Tony Hutter <hutter2@llnl.gov> Date: Tue May 29 18:13:48 2018 -0700 Fix zio->io_priority failed (7 < 6) assert This fixes an assert in vdev_queue_change_io_priority(): VERIFY3(zio->io_priority < ZIO_PRIORITY_NUM_QUEUEABLE) failed (7 < 6) PANIC at vdev_queue.c:832:vdev_queue_change_io_priority() Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Changes: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c
A commit references this bug: Author: mav Date: Thu Apr 25 15:51:01 UTC 2019 New revision: 346679 URL: https://svnweb.freebsd.org/changeset/base/346679 Log: MFC r339009 (by allanjude): Avoid panic when adjusting priority of a read in the face of an IO error PR: 231516 Reported by: sbruno Approved by: re (rgrimes) Obtained from: ZFS-on-Linux X-MFC-with: 334844 Sponsored by: Klara Systems MFV/ZoL: Fix zio->io_priority failed (7 < 6) assert commit c26cf0966d131b722c32f8ccecfe5791a789d975 Author: Tony Hutter <hutter2@llnl.gov> Date: Tue May 29 18:13:48 2018 -0700 Fix zio->io_priority failed (7 < 6) assert This fixes an assert in vdev_queue_change_io_priority(): VERIFY3(zio->io_priority < ZIO_PRIORITY_NUM_QUEUEABLE) failed (7 < 6) PANIC at vdev_queue.c:832:vdev_queue_change_io_priority() Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Changes: _U stable/11/ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c