Bug 277267 - ZFS panic: VERIFY3(zio->io_error == ENXIO) failed (0 == 6)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2024-02-23 19:40 UTC by John F. Carr
Modified: 2024-02-23 23:58 UTC (History)

Description John F. Carr 2024-02-23 19:40:17 UTC
My amd64 server crashed while I was running git clone on a repository that adds up to 12 GB unpacked.  The crash dump confirms that the assertion is associated with the pool being written.  The pool is a mirror of two 2TB HP MM2000JEFRC HPD4 drives on a 1200 MBps SAS bus.  The smartpqi driver is from FreeBSD 14.0 because it fixes a bug I was hitting.  Otherwise, the system is stable/13 with ZFS changes through

282fd2c39ee6 Add vnode_pager_clean_{a,}sync(9)

and other changes through

763b10806cd4 LinuxKPI: 802.11: lsta txq locking cleanup



#3  0xffffffff80bd26bf in vpanic (
    fmt=0xffffffff82cf72a7 "VERIFY3(zio->io_error == ENXIO) failed (%lld == %lld)\n", ap=ap@entry=0xfffffe03a0489db0)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_shutdown.c:921
        buf = "VERIFY3(zio->io_error == ENXIO) failed (0 == 6)\n", '\000' <repeats 207 times>
        other_cpus = {__bits = {281474974613503, 0, 0, 0}}
        td = 0xfffff80182b88000
        bootopt = <unavailable>
        newpanic = <optimized out>
#4  0xffffffff82a2a2ea in spl_panic (file=<optimized out>, 
    func=<optimized out>, line=<optimized out>, fmt=<unavailable>)
    at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:107
        ap = {{gp_offset = 48, fp_offset = 48, 
            overflow_arg_area = 0xfffffe03a0489de0, 
            reg_save_area = 0xfffffe03a0489d80}}
#5  0xffffffff82c27e60 in zio_vdev_io_done (zio=0xfffff8184fb87000)
    at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:3929
        vd = 0xfffffe03a0a2e000
        ops = 0xffffffff82d0e530 <vdev_disk_ops>
        unexpected_error = <optimized out>
#6  0xffffffff82c1ff5b in __zio_execute (zio=0xfffff8184fb87000)
    at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2219
        stage = ZIO_STAGE_VDEV_IO_DONE
        pipeline = <optimized out>
#7  zio_execute (zio=<optimized out>)
    at /usr/home/jfc/freebsd/src/sys/contrib/openzfs/module/zfs/zio.c:2130
        cookie = 0
#8  0xffffffff80c31e1b in taskqueue_run_locked (
    queue=queue@entry=0xfffff80182f7e200)
    at /usr/home/jfc/freebsd/src/sys/kern/subr_taskqueue.c:518
        et = {et_link = {tqe_next = 0xfffffe03a0489ec0, 
            tqe_prev = 0xffffffff80bde948 <_sleep+712>}, et_td = 0x0, 
          et_section = {bucket = 0}, et_old_priority = 0 '\000'}
        tb = {tb_running = 0xfffff8184fb87410, tb_seq = 523385, 
          tb_canceling = false, tb_link = {le_next = 0x0, 
            le_prev = 0xfffff80182f7e210}}
        task = 0xfffff8184fb87410
        in_net_epoch = false
        pending = 1
#9  0xffffffff80c32eb3 in taskqueue_thread_loop (
    arg=arg@entry=0xfffff80178145320)
    at /usr/home/jfc/freebsd/src/sys/kern/subr_taskqueue.c:830
        tq = 0xfffff80182f7e200
        tqp = <optimized out>
#10 0xffffffff80b8c400 in fork_exit (
    callout=0xffffffff80c32de0 <taskqueue_thread_loop>, 
    arg=0xfffff80178145320, frame=0xfffffe03a0489f40)
    at /usr/home/jfc/freebsd/src/sys/kern/kern_fork.c:1151
        td = 0xfffff80182b88000
        p = 0xffffffff81e46828 <proc0>
        dtd = <optimized out>
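
As a reading aid for the panic string in frame #3, the short Python sketch below splits the message into the asserted expression and the two values that were compared. It assumes the message has the form "VERIFY3(A == B) failed (a == b)" and uses Python's errno module to name the right-hand value; the variable names are illustrative only.

```python
import errno
import re

# Panic string copied from frame #3 of the backtrace.
panic = "VERIFY3(zio->io_error == ENXIO) failed (0 == 6)"

# Capture the two asserted expressions and the two values that were compared.
m = re.search(r"VERIFY3\((.+?) == (.+?)\) failed \((-?\d+) == (-?\d+)\)", panic)
lhs_expr, rhs_expr = m.group(1), m.group(2)
lhs_val, rhs_val = int(m.group(3)), int(m.group(4))

# Map the numeric value back to a symbolic errno name where one exists.
rhs_name = errno.errorcode.get(rhs_val, "unknown")

print(f"{lhs_expr} was {lhs_val}")
print(f"{rhs_expr} was {rhs_val} ({rhs_name})")
```

Read this way, the message says that zio->io_error was 0 at the point where the code expected ENXIO (6), which agrees with io_error = 0 in the zio dump.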

The zio object is

stage = 0x200000

*zio =
{
  io_bookmark = {zb_objset = 93, zb_object = 1087285, zb_level = 0,
                 zb_blkid = 0},
  io_prop = {zp_checksum = ZIO_CHECKSUM_INHERIT,
             zp_compress = ZIO_COMPRESS_INHERIT,
             zp_complevel = 0 '\000',
             zp_type = DMU_OT_NONE,
             zp_level = 0 '\000',
             zp_copies = 0 '\000',
             zp_dedup = 0,
             zp_dedup_verify = 0,
             zp_nopwrite = 0,
             zp_encrypt = 0,
             zp_byteorder = 0,
             zp_salt = "\000\000\000\000\000\000\000",
             zp_iv = '\000' <repeats 11 times>,
             zp_mac = '\000' <repeats 15 times>, zp_zpl_smallblk = 0},
  io_type = ZIO_TYPE_WRITE, io_child_type = ZIO_CHILD_VDEV, io_trim_flags = 0,
  io_cmd = 0, io_priority = ZIO_PRIORITY_ASYNC_WRITE, io_reexecute = 0 '\000',
  io_state = "\001", io_txg = 24604026, io_spa = 0xfffffe03a0755000,
  io_bp = 0xfffff819959c5578, io_bp_override = 0x0,
  io_bp_copy = {blk_dva = {{dva_word = {8, 3219260528}},
                           {dva_word = {0, 0}},
                           {dva_word = {0, 0}}},
                blk_prop = 13840414884522819587,
                blk_pad = {0, 0},
                blk_phys_birth = 0,
                blk_birth = 24604026,
                blk_fill = 0,
                blk_cksum = {zc_word = {3816859218741551286,
                                        1781004871607911648,
                                        15693030277188822173,
                                        4275945154273572093}}},
  io_parent_list = {list_size = 48, list_offset = 16,
                    list_head = {list_next = 0xfffff809344e0820,
                                 list_prev = 0xfffff809344e0820}},
  io_child_list = {list_size = 48, list_offset = 32,
                   list_head = {list_next = 0xfffff8184fb87158,
                                list_prev = 0xfffff8184fb87158}},
  io_logical = 0xfffff8250eb404d0, io_transform_stack = 0x0, io_ready = 0x0,
  io_children_ready = 0x0, io_physdone = 0x0,
  io_done = 0xffffffff82b6eb00 <vdev_mirror_child_done>,
  io_private = 0xfffff808d1b901c8, io_prev_space_delta = 0,
  io_bp_orig = {blk_dva = {{dva_word = {8, 3219260528}},
                           {dva_word = {0, 0}},
                           {dva_word = {0, 0}}},
                blk_prop = 13840414884522819587,
                blk_pad = {0, 0},
                blk_phys_birth = 0,
                blk_birth = 24604026,
                blk_fill = 0,
                blk_cksum = {zc_word = {3816859218741551286,
                                        1781004871607911648,
                                        15693030277188822173,
                                        4275945154273572093}}},
  io_lsize = 4096, io_abd = 0xfffff806be831400,
  io_orig_abd = 0xfffff806be831400, io_size = 4096, io_orig_size = 4096,
  io_vd = 0xfffffe03a0a2e000, io_vsd = 0x0, io_vsd_ops = 0x0,
  io_metaslab_class = 0xfffff801783e9800, io_offset = 1648265584640,
  io_timestamp = 349058909606831, io_queued_timestamp = 349058909606751,
  io_target_timestamp = 0, io_delta = 11021342, io_delay = 0,
  io_queue_node = {avl_child = {0xfffff818775ddc40, 0xfffff81e7bd742a0},
                   avl_pcb = 18446735417614089285},
  io_offset_node = {avl_child = {0xfffff818775ddc58, 0xfffff81e7bd742b8},
                    avl_pcb = 18446735417614089309},
  io_alloc_node = {avl_child = {0x0, 0x0}, avl_pcb = 0},
  io_alloc_list = {zal_list = {list_size = 72, list_offset = 0,
                               list_head = {list_next = 0xfffff8184fb872f8,
                                            list_prev = 0xfffff8184fb872f8}},
                   zal_size = 0},
  io_flags = 1575040, io_stage = ZIO_STAGE_VDEV_IO_DONE,
  io_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE |
                 ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE),
  io_orig_flags = 1048704, io_orig_stage = ZIO_STAGE_READY,
  io_orig_pipeline = (ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE |
                      ZIO_STAGE_VDEV_IO_ASSESS | ZIO_STAGE_DONE),
  io_pipeline_trace =
      (ZIO_STAGE_OPEN | ZIO_STAGE_VDEV_IO_START | ZIO_STAGE_VDEV_IO_DONE),
  io_error = 0, io_child_error = {0, 0, 0, 0},
  io_children = {{0, 0}, {0, 0}, {0, 0}, {0, 0}}, io_child_count = 0,
  io_phys_children = 0, io_parent_count = 1, io_stall = 0x0,
  io_gang_leader = 0x0, io_gang_tree = 0x0, io_executor = 0xfffff80182b88000,
  io_waiter = 0x0, io_bio = 0x0,
  io_lock = {lock_object = {lo_name = 0xffffffff82ca3c8d <.L.str.112+1>
                                      "zio->io_lock",
                            lo_flags = 577830912, lo_data = 0,
                            lo_witness = 0x0},
             sx_lock = 1},
  io_cv = {cv_description = 0xffffffff82caea62 <.L.str.113+1> "zio->io_cv",
           cv_waiters = 0},
  io_allocator = 0, io_cksum_report = 0x0, io_ena = 0, io_tqent = {
    tqent_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0,
                  ta_priority = 0 '\000', ta_flags = 0 '\000',
                  ta_func = 0xffffffff82a2bd90 <taskq_run_ent>,
                  ta_context = 0xfffff8184fb87410},
    tqent_timeout_task =
        {q = 0x0,
         t = {ta_link = {stqe_next = 0x0}, ta_pending = 0,
              ta_priority = 0 '\000', ta_flags = 0 '\000', ta_func = 0x0,
              ta_context = 0x0},
         c = {c_links = {le = {le_next = 0x0, le_prev = 0x0},
                         sle = {sle_next = 0x0},
                         tqe = {tqe_next = 0x0, tqe_prev = 0x0}},
              c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0x0,
              c_lock = 0x0, c_flags = 0, c_iflags = 0, c_cpu = 0},
         f = 0},
    tqent_func = 0xffffffff82c1fee0 <zio_execute>,
    tqent_arg = 0xfffff8184fb87000,
    tqent_id = 0,
    tqent_hash = {cle_next = 0x0, cle_prev = 0x0},
    tqent_type = 0 '\000',
    tqent_registered = 0 '\000',
    tqent_cancelled = 0 '\000',
    tqent_rc = 0
  }
}
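
As a consistency check on the dump above, the following Python sketch decodes the first DVA and blk_prop of io_bp_copy and compares them with io_offset and io_size. The bit positions, the 512-byte unit, and the 4 MiB of label space ahead of the allocatable area are assumptions based on my understanding of the OpenZFS on-disk format, not something read from this crash dump; they should be checked against the spa.h of the matching source tree.

```python
# Values copied from the zio dump above.
dva_word0 = 8
dva_word1 = 3219260528
blk_prop = 13840414884522819587
io_offset = 1648265584640
io_size = 4096

SECTOR_SHIFT = 9               # assumption: sizes and offsets are stored in 512-byte units
LABEL_START = 4 * 1024 * 1024  # assumption: 4 MiB of labels/boot area precede allocatable space


def bits(word, low, length):
    """Return `length` bits of `word`, starting at bit `low`."""
    return (word >> low) & ((1 << length) - 1)


# Assumed DVA layout: asize in bits 0-23 and vdev id in bits 32-55 of word 0;
# offset in bits 0-62 and the gang flag in bit 63 of word 1.
dva = {
    "asize": bits(dva_word0, 0, 24) << SECTOR_SHIFT,
    "vdev": bits(dva_word0, 32, 24),
    "offset": bits(dva_word1, 0, 63) << SECTOR_SHIFT,
    "gang": bits(dva_word1, 63, 1),
}

# Assumed blk_prop layout for a non-embedded block pointer.
bp = {
    "lsize": (bits(blk_prop, 0, 16) + 1) << SECTOR_SHIFT,
    "psize": (bits(blk_prop, 16, 16) + 1) << SECTOR_SHIFT,
    "compress": bits(blk_prop, 32, 7),
    "embedded": bits(blk_prop, 39, 1),
    "checksum": bits(blk_prop, 40, 8),
    "type": bits(blk_prop, 48, 8),
    "level": bits(blk_prop, 56, 5),
    "dedup": bits(blk_prop, 62, 1),
    "byteorder": bits(blk_prop, 63, 1),
}

print(dva)
print(bp)
print("DVA offset + label space == io_offset:", dva["offset"] + LABEL_START == io_offset)
print("DVA asize == io_size:", dva["asize"] == io_size)
```

Under these assumptions both comparisons hold: the DVA offset plus 4 MiB equals io_offset, and the allocated size equals io_size, so the child zio and its block pointer are at least self-consistent. The same decoding gives a logical and physical size of 2048 bytes and a set dedup bit; the numeric compress, checksum, and type values still have to be looked up in the enums of the corresponding source.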

zio->io_vd->vdev_guid = 11713793445848286627 which corresponds to da1 in this pool:

  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 04:45:15 with 0 errors on Sat Oct  7 00:08:52 2023
config:

	NAME        STATE     READ WRITE CKSUM
	data        ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    da0     ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	cache
	  nvd0p5    ONLINE       0     0     0
Comment 1 John F. Carr 2024-02-23 20:27:28 UTC
When I ran "git reset --hard" to recover the partially extracted repository I got one of the crashes from bug #276420:

panic: VERIFY(zio->io_stall == NULL) failed

The underlying problem seems to be that my server with lots of memory and CPU is incapable of writing 10 gigabytes of data to a spinning disk without crashing.  The pool configuration is different from bug 276420: this time it is a two-way mirror instead of a 5-disk raidz2.