I started a dd command to copy data between ZFS volumes in different pools. I returned to find my system had rebooted. Unfortunately there was not enough space to save a dump. All I have is the syslog message: savecore[1921]: reboot after panic: VERIFY(zio->io_stage == ZIO_STAGE_VDEV_IO_START) failed There are two such assertions in sys/contrib/openzfs/module/zfs/zio.c, one in zio_vdev_io_bypass and the other in zio_vdev_io_reissue. dd block size argument was bs=1m. Transfer rate was about 100 MB/s. Server has 160 GB RAM. Source disk was SAS SSD, destination was raidz2 with 5 spinning SATA disks. Both pools have a cache on different partitions of the same NVMe drive. Kernel has 13.2 changes through 4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a.
Repeating the dd command, this time with bs=8m, caused another panic with a different message but also related to tracking state of I/O. I have a crash dump this time. The new kernel is based off 4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a. There are no substantial changes to the ZFS kernel code in between. panic: VERIFY(zio->io_stall == NULL) failed cpuid = 38 time = 1705586225 KDB: stack backtrace: #0 0xffffffff80c19745 at kdb_backtrace+0x65 #1 0xffffffff80bce982 at vpanic+0x152 #2 0xffffffff82a2a2ea at spl_panic+0x3a #3 0xffffffff82c201e8 at zio_execute+0x308 #4 0xffffffff80c2e0eb at taskqueue_run_locked+0xab #5 0xffffffff80c2f183 at taskqueue_thread_loop+0xd3 #6 0xffffffff80b88730 at fork_exit+0x80 #7 0xffffffff81061b7e at fork_trampoline+0xe #4 0xffffffff82a2a2ea in spl_panic (file=<optimized out>, func=<optimized out>, line=<optimized out>, fmt=<unavailable>) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:107 ap = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0xfffffe03a6f8ae20, reg_save_area = 0xfffffe03a6f8adc0}} #5 0xffffffff82c201e8 in __zio_execute (zio=0xfffff804582b19a0) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2175 stage = ZIO_STAGE_VDEV_IO_DONE pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE | ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE) #6 zio_execute (zio=<optimized out>) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2130 cookie = 0 #7 0xffffffff80c2e0eb in taskqueue_run_locked ( queue=queue@entry=0xfffff801afeafb00) at /usr/home/jfc/freebsd/src/sys/kern/subr_taskqueue.c:512 et = {et_link = {tqe_next = 0xfffffe03a6f8aec0, tqe_prev = 0xffffffff80bdac78 <_sleep+712>}, et_td = 0x0, et_section = {bucket = 0}, et_old_priority = 0 '\000'} tb = {tb_running = 0xfffff804582b1db0, tb_seq = 3579256, tb_canceling = false, tb_link = {le_next = 0x0, le_prev = 0xfffff801afeafb10}} task = 0xfffff804582b1db0 in_net_epoch = false pending = 1 The zio argument to __zio_execute is {io_bookmark = {zb_objset = 2437, zb_object = 1, zb_level = 0, zb_blkid = 3700975}, io_prop = { zp_checksum = ZIO_CHECKSUM_INHERIT, zp_compress = ZIO_COMPRESS_INHERIT, zp_complevel = 0 '\000', zp_type = DMU_OT_NONE, zp_level = 0 '\000', zp_copies = 0 '\000', zp_dedup = 0, zp_dedup_verify = 0, zp_nopwrite = 0, zp_encrypt = 0, zp_byteorder = 0, zp_salt = "\000\000\000\000\000\000\000", zp_iv = '\000' <repeats 11 times>, zp_mac = '\000' <repeats 15 times>, zp_zpl_smallblk = 0}, io_type = ZIO_TYPE_WRITE, io_child_type = ZIO_CHILD_VDEV, io_trim_flags = 0, io_cmd = 0, io_priority = ZIO_PRIORITY_ASYNC_WRITE, io_reexecute = 0 '\000', io_state = "\001", io_txg = 23709657, io_spa = 0xfffffe03a69f9000, io_bp = 0x0, io_bp_override = 0x0, io_bp_copy = { blk_dva = {{dva_word = {0, 0}}, {dva_word = {0, 0}}, {dva_word = {0, 0}}}, blk_prop = 0, blk_pad = { 0, 0}, blk_phys_birth = 0, blk_birth = 0, blk_fill = 0, blk_cksum = {zc_word = {0, 0, 0, 0}}}, io_parent_list = {list_size = 48, list_offset = 16, list_head = {list_next = 0xfffff802c327cd60, list_prev = 0xfffff802c327cd60}}, io_child_list = {list_size = 48, list_offset = 32, list_head = { list_next = 0xfffff8068a043800, list_prev = 0xfffff8068a043800}}, io_logical = 0xfffff804899659a0, io_transform_stack = 0x0, io_ready = 0x0, io_children_ready = 0x0, io_physdone = 0xffffffff82a7cd30 <arc_write_physdone>, io_done = 0xffffffff82b71880 <vdev_raidz_child_done>, io_private = 0xfffff801623fc490, io_prev_space_delta = 0, io_bp_orig = {blk_dva = {{dva_word = {0, 0}}, {dva_word = {0, 0}}, {dva_word = {0, 0}}}, blk_prop = 0, blk_pad = {0, 0}, blk_phys_birth = 0, blk_birth = 0, blk_fill = 0, blk_cksum = {zc_word = {0, 0, 0, 0}}}, io_lsize = 4096, io_abd = 0xfffff801623fc4a8, io_orig_abd = 0xfffff801623fc4a8, io_size = 4096, io_orig_size = 4096, io_vd = 0xfffffe03a6d7d000, io_vsd = 0x0, io_vsd_ops = 0x0, io_metaslab_class = 0xfffff801623c2000, io_offset = 459490152448, io_timestamp = 1564556504339, io_queued_timestamp = 1564556504259, io_target_timestamp = 0, io_delta = 0, io_delay = 0, io_queue_node = {avl_child = {0x0, 0x0}, avl_pcb = 1}, io_offset_node = {avl_child = {0x0, 0x0}, avl_pcb = 1}, io_alloc_node = {avl_child = {0x0, 0x0}, avl_pcb = 0}, io_alloc_list = {zal_list = {list_size = 72, list_offset = 0, list_head = {list_next = 0xfffff804582b1c98, list_prev = 0xfffff804582b1c98}}, zal_size = 0}, io_flags = 3672192, io_stage = ZIO_STAGE_VDEV_IO_DONE, io_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE | ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE), io_orig_flags = 1048704, io_orig_stage = ZIO_STAGE_READY, io_orig_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE | ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE), io_pipeline_trace = (ZIO_STAGE_OPEN | ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_ASSESS), io_error = 0, io_child_error = {0, 0, 0, 0}, io_children = {{0, 1}, {0, 0}, {0, 0}, {0, 0}}, io_child_count = 1, io_phys_children = 0, io_parent_count = 1, io_stall = 0xfffff804582b1ce8, io_gang_leader = 0x0, io_gang_tree = 0x0, io_executor = 0xfffff801b92c5000, io_waiter = 0x0, io_bio = 0x0, io_lock = {lock_object = {lo_name = 0xffffffff82ca3c8d <.L.str.112+1> "zio->io_lock", lo_flags = 577830912, lo_data = 0, lo_witness = 0x0}, sx_lock = 18446735284676890624}, io_cv = { cv_description = 0xffffffff82caea62 <.L.str.113+1> "zio->io_cv", cv_waiters = 0}, io_allocator = 0, io_cksum_report = 0x0, io_ena = 0, io_tqent = {tqent_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 '\000', ta_func = 0xffffffff82a2bd90 <taskq_run_ent>, ta_context = 0xfffff804582b1db0}, tqent_timeout_task = {q = 0x0, t = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 '\000', ta_func = 0x0, ta_context = 0x0}, c = {c_links = { le = {le_next = 0x0, le_prev = 0x0}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0x0, c_lock = 0x0, c_flags = 0, c_iflags = 0, c_cpu = 0}, f = 0}, tqent_func = 0xffffffff82c1fee0 <zio_execute>, tqent_arg = 0xfffff804582b19a0, tqent_id = 0, tqent_hash = {cle_next = 0x0, cle_prev = 0x0}, tqent_type = 0 '\000', tqent_registered = 0 '\000', tqent_cancelled = 0 '\000', tqent_rc = 0}}
I have also reproduced the original panic with a crash dump. panic: VERIFY(zio->io_stage == ZIO_STAGE_VDEV_IO_START) failed It is from zio_vdev_io_bypass. #5 0xffffffff82c2209e in zio_vdev_io_bypass ( zio=zio@entry=0xfffff80767ed29a0) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:4093 No locals. #6 0xffffffff82b70926 in vdev_queue_io_done ( zio=zio@entry=0xfffff80767ed29a0) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/vdev_queue.c:951 zl = 0xfffff80763b9d600 vq = 0xfffffe03a677b1c0 now = <optimized out> nio = 0xfffff807698819a0 dio = <optimized out> #7 0xffffffff82c27d1f in zio_vdev_io_done (zio=0xfffff80767ed29a0) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:3908 vd = 0xfffffe03a6778000 ops = 0xffffffff82d0e530 <vdev_disk_ops> unexpected_error = 0 #8 0xffffffff82c1ff5b in __zio_execute (zio=0xfffff80767ed29a0) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2219 stage = ZIO_STAGE_VDEV_IO_DONE pipeline = <optimized out> #9 zio_execute (zio=<optimized out>) at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2130 cookie = 0 The zio object is {io_bookmark = {zb_objset = 2437, zb_object = 1, zb_level = 0, zb_blkid = 776249}, io_prop = {zp_checksum = ZIO_CHECKSUM_INHERIT, zp_compress = ZIO_COMPRESS_INHERIT, zp_complevel = 0 '\000', zp_type = DMU_OT_NONE, zp_level = 0 '\000', zp_copies = 0 '\000', zp_dedup = 0, zp_dedup_verify = 0, zp_nopwrite = 0, zp_encrypt = 0, zp_byteorder = 0, zp_salt = "\000\000\000\000\000\000\000", zp_iv = '\000' < repeats 11 times >, zp_mac = '\000' < repeats 15 times >, zp_zpl_smallblk = 0}, io_type = ZIO_TYPE_WRITE, io_child_type = ZIO_CHILD_VDEV, io_trim_flags = 0, io_cmd = 0, io_priority = ZIO_PRIORITY_ASYNC_WRITE, io_reexecute = 0 '\000', io_state = "\001", io_txg = 23709913, io_spa = 0xfffffe03a66d8000, io_bp = 0x0, io_bp_override = 0x0, io_bp_copy = {blk_dva = {{dva_word = {0, 0}}, {dva_word = {0, 0}}, {dva_word = {0, 0}}}, blk_prop = 0, blk_pad = {0, 0}, blk_phys_birth = 0, blk_birth = 0, blk_fill = 0, blk_cksum = {zc_word = {0, 0, 0, 0}}}, io_parent_list = {list_size = 48, list_offset = 16, list_head = {list_next = 0xfffff80768647490, list_prev = 0xfffff80768647490}}, io_child_list = {list_size = 48, list_offset = 32, list_head = {list_next = 0xfffff80763b9d620, list_prev = 0xfffff80763b9d620}}, io_logical = 0xfffff8075b3f94d0, io_transform_stack = 0x0, io_ready = 0x0, io_children_ready = 0x0, io_physdone = 0xffffffff82a7cd30 < arc_write_physdone >, io_done = 0xffffffff82b71880 < vdev_raidz_child_done >, io_private = 0xfffff8047802ea70, io_prev_space_delta = 0, io_bp_orig = {blk_dva = {{dva_word = {0, 0}}, {dva_word = {0, 0}}, {dva_word = {0, 0}}}, blk_prop = 0, blk_pad = {0, 0}, blk_phys_birth = 0, blk_birth = 0, blk_fill = 0, blk_cksum = {zc_word = {0, 0, 0, 0}}}, io_lsize = 8192, io_abd = 0xfffff8047802ea88, io_orig_abd = 0xfffff8047802ea88, io_size = 8192, io_orig_size = 8192, io_vd = 0xfffffe03a6778000, io_vsd = 0x0, io_vsd_ops = 0x0, io_metaslab_class = 0xfffff8010a80a000, io_offset = 899567525888, io_timestamp = 2052263446783, io_queued_timestamp = 2052263446703, io_target_timestamp = 0, io_delta = 4538041, io_delay = 0, io_queue_node = {avl_child = {0x0, 0x0}, avl_pcb = 18446735309376501621}, io_offset_node = {avl_child = {0x0, 0x0}, avl_pcb = 18446735309466822541}, io_alloc_node = {avl_child = {0x0, 0x0}, avl_pcb = 0}, io_alloc_list = {zal_list = {list_size = 72, list_offset = 0, list_head = {list_next = 0xfffff80767ed2c98, list_prev = 0xfffff80767ed2c98}}, zal_size = 0}, io_flags = 1575040, io_stage = ZIO_STAGE_VDEV_IO_DONE, io_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE | ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE), io_orig_flags = 1048704, io_orig_stage = ZIO_STAGE_READY, io_orig_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE | ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE), io_pipeline_trace = (ZIO_STAGE_OPEN | ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE), io_error = 0, io_child_error = {0, 0, 0, 0}, io_children = {{0, 1}, {0, 0}, {0, 0}, {0, 0}}, io_child_count = 1, io_phys_children = 0, io_parent_count = 1, io_stall = 0x0, io_gang_leader = 0x0, io_gang_tree = 0x0, io_executor = 0xfffff801b82bc000, io_waiter = 0x0, io_bio = 0x0, io_lock = {lock_object = {lo_name = 0xffffffff82ca3c8d <.L.str .112 + 1 > "zio->io_lock", lo_flags = 577830912, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, io_cv = {cv_description = 0xffffffff82caea62 <.L.str .113 + 1 > "zio->io_cv", cv_waiters = 0}, io_allocator = 0, io_cksum_report = 0x0, io_ena = 0, io_tqent = {tqent_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 '\000', ta_func = 0xffffffff82a2bd90 < taskq_run_ent >, ta_context = 0xfffff80767ed2db0}, tqent_timeout_task = {q = 0x0, t = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 '\000', ta_func = 0x0, ta_context = 0x0}, c = {c_links = {le = {le_next = 0x0, le_prev = 0x0}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0x0, c_lock = 0x0, c_flags = 0, c_iflags = 0, c_cpu = 0}, f = 0}, tqent_func = 0xffffffff82c1fee0 < zio_execute >, tqent_arg = 0xfffff80767ed29a0, tqent_id = 0, tqent_hash = {cle_next = 0x0, cle_prev = 0x0}, tqent_type = 0 '\000', tqent_registered = 0 '\000', tqent_cancelled = 0 '\000', tqent_rc = 0}}; Here are the volume properties: NAME PROPERTY VALUE SOURCE fat32 type volume - fat32 creation Thu Jan 18 5:36 2024 - fat32 used 100G - fat32 available 734G - fat32 referenced 74.2G - fat32 compressratio 1.02x - fat32 reservation none default fat32 volsize 279G local fat32 volblocksize 16K - fat32 checksum on default fat32 compression on inherited from private fat32 readonly off default fat32 createtxg 23707850 - fat32 copies 1 default fat32 refreservation 100G local fat32 guid 3356515442365534536 - fat32 primarycache all default fat32 secondarycache all default fat32 usedbysnapshots 0B - fat32 usedbydataset 74.2G - fat32 usedbychildren 0B - fat32 usedbyrefreservation 25.8G - fat32 logbias latency default fat32 objsetid 2437 - fat32 dedup off default fat32 mlslabel none default fat32 sync standard default fat32 refcompressratio 1.02x - fat32 written 74.2G - fat32 logicalused 57.0G - fat32 logicalreferenced 57.0G - fat32 volmode default default fat32 snapshot_limit none default fat32 snapshot_count none default fat32 snapdev hidden default fat32 context none default fat32 fscontext none default fat32 defcontext none default fat32 rootcontext none default fat32 redundant_metadata all default fat32 encryption off default fat32 keylocation none default fat32 keyformat none default fat32 pbkdf2iters 0 default Here are the pool features: private feature@async_destroy enabled local private feature@empty_bpobj active local private feature@lz4_compress active local private feature@multi_vdev_crash_dump enabled local private feature@spacemap_histogram active local private feature@enabled_txg active local private feature@hole_birth active local private feature@extensible_dataset active local private feature@embedded_data active local private feature@bookmarks enabled local private feature@filesystem_limits enabled local private feature@large_blocks active local private feature@large_dnode enabled local private feature@sha512 enabled local private feature@skein enabled local private feature@userobj_accounting active local private feature@encryption enabled local private feature@project_quota active local private feature@device_removal enabled local private feature@obsolete_counts enabled local private feature@zpool_checkpoint enabled local private feature@spacemap_v2 active local private feature@allocation_classes enabled local private feature@resilver_defer enabled local private feature@bookmark_v2 enabled local private feature@redaction_bookmarks enabled local private feature@redacted_datasets enabled local private feature@bookmark_written enabled local private feature@log_spacemap active local private feature@livelist enabled local private feature@device_rebuild enabled local private feature@zstd_compress enabled local private feature@draid enabled local
I can also reproduce the crash writing to a volume with zfs recv and with dd from /dev/zero (with compression and deduplication disabled to force writes to disk). The common factor is tens of gigabytes of continuous writes to a volume in a raidz2 pool spread across five spinning disks with an NVMe cache.