Bug 243126 - if_vmx(4): iflib: Assertion fl->ifl_cidx == cidx failed at /usr/src/sys/net/iflib.c:2531
Summary: if_vmx(4): iflib: Assertion fl->ifl_cidx == cidx failed at /usr/src/sys/net/i...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Patrick Kelsey
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2020-01-06 11:11 UTC by Andriy Gapon
Modified: 2020-07-26 00:03 UTC (History)
4 users (show)

See Also:
julien: mfc-stable12?
koobs: mfc-stable11?


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andriy Gapon freebsd_committer freebsd_triage 2020-01-06 11:11:11 UTC
This happened with FreeBSD head r351653 and if_em interface in a VMware guest.

I have a crash dump.
This is the stack trace:
#11 kdb_enter (why=0xffffffff80bdbcb7 "panic", msg=0xffffffff80bdbcb7 "panic") at /usr/src/sys/kern/subr_kdb.c:483
#12 0xffffffff8077cff1 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:987
#13 0xffffffff8077c2cd in kassert_panic (fmt=0xffffffff80b4f01d "Assertion %s failed at %s:%d") at /usr/src/sys/kern/kern_shutdown.c:879
#14 0xffffffff808b721e in rxd_frag_to_sd (rxq=0xfffffe00003fe000, irf=<optimized out>, unload=<error reading variable: Cannot access memory at address 0x0>, sd=0xfffffe0011a54900, pf_rv=0xfffffe0011a549b0, ri=0xfffffe0011a54960)
    at /usr/src/sys/net/iflib.c:2531
#15 0xffffffff808b10e4 in assemble_segments (rxq=<optimized out>, ri=0x1dd05ea00000000, sd=0xfffff80002d1b470, pf_rv=0x100000000) at /usr/src/sys/net/iflib.c:2594
#16 iflib_rxd_pkt_get (rxq=<optimized out>, ri=<optimized out>) at /usr/src/sys/net/iflib.c:2671
#17 iflib_rxeof (rxq=<optimized out>, budget=16) at /usr/src/sys/net/iflib.c:2803
#18 _task_fn_rx (context=<optimized out>) at /usr/src/sys/net/iflib.c:3795
#19 0xffffffff807c72a9 in gtaskqueue_run_locked (queue=0xfffff80002922600) at /usr/src/sys/kern/subr_gtaskqueue.c:377
#20 0xffffffff807c7028 in gtaskqueue_thread_loop (arg=<optimized out>) at /usr/src/sys/kern/subr_gtaskqueue.c:558
#21 0xffffffff8073ce34 in fork_exit (callout=0xffffffff807c6fa0 <gtaskqueue_thread_loop>, arg=0xfffffe00007f9008, frame=0xfffffe0011a54ac0) at /usr/src/sys/kern/kern_fork.c:1048

I can provide any additional information.
Also, I can do some debugging myself, but I need some help with that as I don't know iflib code at all.
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2020-01-07 11:40:44 UTC
Correction, the panic happened with vmxnet3 network driver.
The VM was later switched to em as a workaround and that got me confused.

Some data from the crash:
(kgdb) fr 14
#14 0xffffffff808b721e in rxd_frag_to_sd (rxq=0xfffffe00003fe000, irf=<optimized out>, unload=<error reading variable: Cannot access memory at address 0x0>, sd=0xfffffe0011a54900, pf_rv=0xfffffe0011a549b0, ri=0xfffffe0011a54960)
    at /usr/src/sys/net/iflib.c:2531
2531    /usr/src/sys/net/iflib.c: No such file or directory.
(kgdb) p cidx
$1 = 142

(kgdb) p *rxq
$3 = {ifr_ctx = 0xfffff80002d22400, ifr_fl = 0xfffff80002d1a000, ifr_rx_irq = 0, pfil = 0xfffff8000436fb80, ifr_cq_cidx = 477, ifr_id = 0, ifr_nfl = 2 '\002', ifr_ntxqirq = 1 '\001', ifr_txqid = "\000\000\000",
  ifr_fl_offset = 1 '\001', ifr_lc = {ifp = 0xfffff80002d1a800, lro_mbuf_data = 0xfffffe00da9c4000, lro_queued = 1753531, lro_flushed = 339087, lro_bad_csum = 0, lro_cnt = 8, lro_mbuf_count = 0, lro_mbuf_max = 256,
    lro_ackcnt_lim = 65535, lro_length_lim = 65535, lro_hashsz = 251, lro_hash = 0xfffff80002fa5000, lro_active = {lh_first = 0x0}, lro_free = {lh_first = 0xfffffe00da9c5360}}, ifr_task = {gt_task = {ta_link = {stqe_next = 0x0},
      ta_flags = 2, ta_priority = 0, ta_func = 0xffffffff808b0bd0 <_task_fn_rx>, ta_context = 0xfffffe00003fe000}, gt_taskqueue = 0xfffff80002922600, gt_list = {le_next = 0x0, le_prev = 0xfffffe00117e98a8},
    gt_uniq = 0xfffffe00003fe000, gt_name = "rxq0", '\000' <repeats 27 times>, gt_dev = 0xfffff80002d3b000, gt_irq = 0xfffff80002d17900, gt_cpu = 0}, ifr_filter_info = {ifi_filter = 0xffffffff80af8510 <vmxnet3_rxq_intr>,
    ifi_filter_arg = 0xfffff80002fab800, ifi_task = 0xfffffe00003fe090, ifi_ctx = 0xfffffe00003fe000}, ifr_ifdi = 0xfffff80002d17d80, ifr_frags = {{irf_flid = 0 '\000', irf_idx = 142, irf_len = 1514}, {irf_flid = 1 '\001',
      irf_idx = 76, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 77, irf_len = 1762}, {irf_flid = 1 '\001', irf_idx = 53, irf_len = 1038}, {irf_flid = 1 '\001', irf_idx = 114, irf_len = 2048}, {irf_flid = 1 '\001',
      irf_idx = 115, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 116, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 117, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 118, irf_len = 2048}, {irf_flid = 1 '\001',
      irf_idx = 119, irf_len = 1906}, {irf_flid = 1 '\001', irf_idx = 184, irf_len = 1306}, {irf_flid = 1 '\001', irf_idx = 17, irf_len = 706}, {irf_flid = 1 '\001', irf_idx = 3, irf_len = 2048}, {irf_flid = 1 '\001',
      irf_idx = 4, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 5, irf_len = 2048}, {irf_flid = 1 '\001', irf_idx = 6, irf_len = 1202}, {irf_flid = 0 '\000', irf_idx = 0, irf_len = 0} <repeats 48 times>}}

(kgdb) p *ri
$4 = {iri_qsidx = 0, iri_vtag = 0, iri_len = 1514, iri_cidx = 477, iri_ifp = 0xfffff80002d1a800, iri_frags = 0xfffffe00003fe140, iri_flowid = 600473664, iri_csum_flags = 251658240, iri_csum_data = 65535, iri_flags = 0 '\000',
  iri_nfrags = 1 '\001', iri_rsstype = 130 '\202', iri_pad = 0 '\000'}


(kgdb) fr 17
#17 iflib_rxeof (rxq=<optimized out>, budget=16) at /usr/src/sys/net/iflib.c:2803
2803    in /usr/src/sys/net/iflib.c
(kgdb) i loc
ctx = <optimized out>
scctx = <optimized out>
lro_possible = <error reading variable lro_possible (Cannot access memory at address 0x0)>
v4_forwarding = <error reading variable v4_forwarding (Cannot access memory at address 0x0)>
v6_forwarding = <error reading variable v6_forwarding (Cannot access memory at address 0x0)>
sctx = 0xffffffff810e7780 <vmxnet3_sctx_init>
rx_pkts = 1
rx_bytes = 1514
mh = 0x0
mt = 0x0
ifp = 0xfffff80002d1a800
cidxp = 0xfffffe00003fe020
avail = 17
i = <error reading variable i (Cannot access memory at address 0x0)>
fl = <optimized out>
m = 0x0
budget_left = 16
ri = <optimized out>
err = <optimized out>
mf = <optimized out>
lro_enabled = <optimized out>

(kgdb) p *cidxp
$1 = 477

(kgdb) p *$5.ifc_sctx
$7 = {isc_magic = 3405705229, isc_driver = 0xffffffff810e7900 <vmxnet3_iflib_driver>, isc_q_align = 512, isc_tx_maxsize = 65536, isc_tx_maxsegsize = 16383, isc_tso_maxsize = 65550, isc_tso_maxsegsize = 16383,
  isc_rx_maxsize = 16383, isc_rx_maxsegsize = 16383, isc_rx_nsegments = 1, isc_admin_intrcnt = 1, isc_vendor_info = 0xffffffff810e7930 <vmxnet3_vendor_info_array>, isc_driver_version = 0xffffffff80ba63e4 "2",
  isc_parse_devinfo = 0x0, isc_nrxd_min = {32, 32, 32, 0, 0, 0, 0, 0}, isc_nrxd_default = {256, 256, 256, 0, 0, 0, 0, 0}, isc_nrxd_max = {2048, 2048, 2048, 0, 0, 0, 0, 0}, isc_ntxd_min = {32, 32, 0, 0, 0, 0, 0, 0},
  isc_ntxd_default = {512, 512, 0, 0, 0, 0, 0, 0}, isc_ntxd_max = {4096, 4096, 0, 0, 0, 0, 0, 0}, isc_nfl = 2, isc_ntxqs = 2, isc_nrxqs = 3, __spare0__ = 0, isc_tx_reclaim_thresh = 0, isc_flags = 9, isc_name = 0x0}

(kgdb) p *rxq->ifr_ctx
$5 = {ops = 0xfffff80002d18000, ifc_softc = 0xfffff80002d22000, ifc_dev = 0xfffff80002d3b000, ifc_ifp = 0xfffff80002d1a800, ifc_cpus = {__bits = {255, 0, 0, 0}}, ifc_sctx = 0xffffffff810e7780 <vmxnet3_sctx_init>,
  ifc_softc_ctx = {isc_vectors = 9, isc_nrxqsets = 8, isc_ntxqsets = 8, __spare0__ = 0, __spare1__ = 0, isc_msix_bar = 24, isc_tx_nsegments = 32, isc_ntxd = {512, 512, 0, 0, 0, 0, 0, 0}, isc_nrxd = {512, 256, 256, 0, 0, 0, 0,
      0}, isc_txqsizes = {8192, 8192, 0, 0, 0, 0, 0, 0}, isc_rxqsizes = {8192, 4096, 4096, 0, 0, 0, 0, 0}, isc_txd_size = "\000\000\000\000\000\000\000", isc_rxd_size = "\000\000\000\000\000\000\000",
    isc_tx_tso_segments_max = 32, isc_tx_tso_size_max = 65532, isc_tx_tso_segsize_max = 16383, isc_tx_csum_flags = 5654, isc_capabilities = 6621115, isc_capenable = 6554555, isc_rss_table_size = 128, isc_rss_table_mask = 127,
    isc_nrxqsets_max = 8, isc_ntxqsets_max = 8, __spare2__ = 0, isc_intr = IFLIB_INTR_MSIX, isc_max_frame_size = 1522, isc_min_frame_size = 0, isc_pause_frames = 0, __spare3__ = 0, __spare4__ = 0, __spare5__ = 0, __spare6__ = 0,
    __spare7__ = 0, __spare8__ = 0, __spare9__ = 0x0, isc_disable_msix = 0, isc_txrx = 0xffffffff810e7680 <vmxnet3_txrx>, isc_media = 0x0}, ifc_ctx_sx = {lock_object = {lo_name = 0xffffffff80be034d "iflib ctx lock",
      lo_flags = 36896768, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, ifc_state_mtx = {lock_object = {lo_name = 0xfffff80002dac7d0 "vmx0", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0},
  ifc_txqs = 0xfffffe00117e5000, ifc_rxqs = 0xfffffe00003fe000, ifc_if_flags = 34819, ifc_flags = 112, ifc_max_fl_buf_size = 2048, ifc_rx_mbuf_sz = 2048, ifc_link_state = 2, ifc_watchdog_events = 0, ifc_led_dev = 0x0,
  ifc_msix_mem = 0xfffff80002d15480, ifc_legacy_irq = {ii_res = 0x0, __spare0__ = 0, ii_tag = 0x0}, ifc_admin_task = {gt_task = {ta_link = {stqe_next = 0x0}, ta_flags = 2, ta_priority = 0,
      ta_func = 0xffffffff808acd60 <_task_fn_admin>, ta_context = 0xfffff80002d22400}, gt_taskqueue = 0xfffff80002923500, gt_list = {le_next = 0x0, le_prev = 0xfffffe00007fb000}, gt_uniq = 0xfffff80002d22400,
    gt_name = "admin", '\000' <repeats 26 times>, gt_dev = 0x0, gt_irq = 0x0, gt_cpu = -1}, ifc_vflr_task = {gt_task = {ta_link = {stqe_next = 0x0}, ta_flags = 0, ta_priority = 0, ta_func = 0x0, ta_context = 0x0},
    gt_taskqueue = 0x0, gt_list = {le_next = 0x0, le_prev = 0x0}, gt_uniq = 0x0, gt_name = '\000' <repeats 31 times>, gt_dev = 0x0, gt_irq = 0x0, gt_cpu = 0}, ifc_filter_info = {
    ifi_filter = 0xffffffff80af8550 <vmxnet3_event_intr>, ifi_filter_arg = 0xfffff80002d22000, ifi_task = 0xfffff80002d225f8, ifi_ctx = 0xfffff80002d22400}, ifc_media = {ifm_mask = -268435456, ifm_media = 0,
    ifm_cur = 0xfffff8000437fcc0, ifm_list = {lh_first = 0xfffff8000437fcc0}, ifm_change = 0xffffffff808b6d40 <iflib_media_change>, ifm_status = 0xffffffff808b6de0 <iflib_media_status>}, ifc_mediap = 0xfffff80002d22708,
  ifc_sysctl_node = 0xfffff80002d16d80, ifc_sysctl_ntxqs = 0, ifc_sysctl_nrxqs = 0, ifc_sysctl_qs_eq_override = 0, ifc_sysctl_rx_budget = 0, ifc_sysctl_tx_abdicate = 0, ifc_sysctl_core_offset = 0,
  ifc_sysctl_separate_txrx = 0 '\000', ifc_sysctl_ntxds = {0, 0, 0, 0, 0, 0, 0, 0}, ifc_sysctl_nrxds = {0, 0, 0, 0, 0, 0, 0, 0}, ifc_txrx = {ift_txd_encap = 0xffffffff80af5190 <vmxnet3_isc_txd_encap>,
    ift_txd_flush = 0xffffffff80af5430 <vmxnet3_isc_txd_flush>, ift_txd_credits_update = 0xffffffff80af5480 <vmxnet3_isc_txd_credits_update>, ift_rxd_available = 0xffffffff80af5560 <vmxnet3_isc_rxd_available>,
    ift_rxd_pkt_get = 0xffffffff80af5690 <vmxnet3_isc_rxd_pkt_get>, ift_rxd_refill = 0xffffffff80af5970 <vmxnet3_isc_rxd_refill>, ift_rxd_flush = 0xffffffff80af5a50 <vmxnet3_isc_rxd_flush>,
    ift_legacy_intr = 0xffffffff80af5ab0 <vmxnet3_legacy_intr>}, ifc_vlan_attach_event = 0xfffff80002da2300, ifc_vlan_detach_event = 0xfffff80002da22c0, ifc_mac = {octet = "\000PV\243\265\275"}}

(kgdb) p $5.ifc_softc_ctx
$8 = {isc_vectors = 9, isc_nrxqsets = 8, isc_ntxqsets = 8, __spare0__ = 0, __spare1__ = 0, isc_msix_bar = 24, isc_tx_nsegments = 32, isc_ntxd = {512, 512, 0, 0, 0, 0, 0, 0}, isc_nrxd = {512, 256, 256, 0, 0, 0, 0, 0},
  isc_txqsizes = {8192, 8192, 0, 0, 0, 0, 0, 0}, isc_rxqsizes = {8192, 4096, 4096, 0, 0, 0, 0, 0}, isc_txd_size = "\000\000\000\000\000\000\000", isc_rxd_size = "\000\000\000\000\000\000\000", isc_tx_tso_segments_max = 32,
  isc_tx_tso_size_max = 65532, isc_tx_tso_segsize_max = 16383, isc_tx_csum_flags = 5654, isc_capabilities = 6621115, isc_capenable = 6554555, isc_rss_table_size = 128, isc_rss_table_mask = 127, isc_nrxqsets_max = 8,
  isc_ntxqsets_max = 8, __spare2__ = 0, isc_intr = IFLIB_INTR_MSIX, isc_max_frame_size = 1522, isc_min_frame_size = 0, isc_pause_frames = 0, __spare3__ = 0, __spare4__ = 0, __spare5__ = 0, __spare6__ = 0, __spare7__ = 0,
  __spare8__ = 0, __spare9__ = 0x0, isc_disable_msix = 0, isc_txrx = 0xffffffff810e7680 <vmxnet3_txrx>, isc_media = 0x0}

(kgdb) p/x $7.isc_flags
$9 = 0x9

So, it seems that because isc_flags has IFLIB_HAS_RXCQ bit, cidx was taken from rxq->ifr_cq_cidx (477) in iflib_rxeof().
But rxd_frag_to_sd() asserts that fl->ifl_cidx == irf->irf_idx, but if we look at rxq->ifr_frags, none of the fragments has that ID.
In fact:
(kgdb) p rxq->ifr_frags[0]
$11 = {irf_flid = 0 '\000', irf_idx = 142, irf_len = 1514}
And that's where the assertion trips, 142 != 477.
Comment 2 Andriy Gapon freebsd_committer freebsd_triage 2020-01-07 12:05:07 UTC
Looking at vmxnet3_isc_rxd_pkt_get, I see that both iri_cidx and ifr_cq_cidx will be set to cqidx after the last fragment.  irf_idx of each fragment is set to rxd_idx of the rspective descriptor.

Getting back to the assertion in rxd_frag_to_sd():

(kgdb) p *ri
$4 = {iri_qsidx = 0, iri_vtag = 0, iri_len = 1514, iri_cidx = 477, iri_ifp = 0xfffff80002d1a800, iri_frags = 0xfffffe00003fe140, iri_flowid = 600473664, iri_csum_flags = 251658240, iri_csum_data = 65535, iri_flags = 0 '\000',
  iri_nfrags = 1 '\001', iri_rsstype = 130 '\202', iri_pad = 0 '\000'}

(kgdb) p ri->iri_frags[0]
$13 = {irf_flid = 0 '\000', irf_idx = 142, irf_len = 1514}

So, irf_idx = 142 in the first and the only fragment.
irf_flid  is zero, so:

$15 = {ifl_cidx = 141, ifl_pidx = 140, ifl_credits = 254, ifl_gen = 1 '\001', ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002daa8c0, ifl_fragidx = 139, ifl_size = 256, ifl_buf_size = 2048, ifl_cltype = 1, 
  ifl_zone = 0xfffff80002957000, ifl_sds = {ifsd_map = 0xfffff80002d2d800, ifsd_m = 0xfffff80002d1b800, ifsd_cl = 0xfffff80002d1b000, ifsd_ba = 0xfffff80002d1e000}, ifl_rxq = 0xfffffe00003fe000, ifl_id = 0 '\000', 
  ifl_buf_tag = 0xfffff80002cf8100, ifl_ifdi = 0xfffff80002d17da8, ifl_bus_addrs = {32051296256, 32050450432, 31627747328, 31625582592, 31637278720, 6269554688, 6269536256, 6272448512, 6512416768, 6269433856, 5922217984, 
    6273015808, 31638720512, 6261741568, 6268790784, 29797609472, 6271569920, 6271576064, 6271574016, 6271580160, 6271578112, 6271584256, 6271582208, 6271588352, 6271739904, 6271733760, 6271735808, 6271729664, 6271731712, 
    6271725568, 6271727616, 6271721472}, ifl_vm_addrs = {0xfffff8077667f800 ...

So, ifl_cidx is 141 and that's the actual problem.
477 that I mentioned in the previous comment is irrelevant.
The real problem is ifl_cidx != irf_idx, 141 != 142.

Still no clue how that could happen.
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2020-01-07 12:50:48 UTC
I wonder if irf_idx can "jump ahead" of ifl_cidx because of a skipped zero-length packet?

Hmm, it seems like it could be the case.
Given the current value of iri_cidx/ifr_cq_cidx, I examined three descriptors before that index:

(kgdb) p $19.vxcr_u.rxcd[476]
$21 = {rxd_idx = 142, pad1 = 0, eop = 1, sop = 1, qid = 0, rss_type = 2, no_csum = 0, pad2 = 0, rss_hash = 600473664, len = 1514, error = 0, vlan = 0, vtag = 0, csum = 0, csum_ok = 1, udp = 0, tcp = 1, ipcsum_ok = 1, ipv6 = 0, 
  ipv4 = 1, fragment = 0, fcs = 0, type = 3, gen = 1}

(kgdb) p $19.vxcr_u.rxcd[475]
$22 = {rxd_idx = 141, pad1 = 0, eop = 1, sop = 1, qid = 0, rss_type = 0, no_csum = 0, pad2 = 0, rss_hash = 0, len = 0, error = 0, vlan = 0, vtag = 0, csum = 0, csum_ok = 0, udp = 0, tcp = 0, ipcsum_ok = 0, ipv6 = 0, ipv4 = 0, 
  fragment = 0, fcs = 0, type = 3, gen = 1}

(kgdb) p $19.vxcr_u.rxcd[474]
$23 = {rxd_idx = 140, pad1 = 0, eop = 1, sop = 1, qid = 0, rss_type = 2, no_csum = 0, pad2 = 0, rss_hash = 600473664, len = 66, error = 0, vlan = 0, vtag = 0, csum = 0, csum_ok = 1, udp = 0, tcp = 1, ipcsum_ok = 1, ipv6 = 0, 
  ipv4 = 1, fragment = 0, fcs = 0, type = 3, gen = 1}

The descriptor at 476 with rxd_idx = 142 seems like the current packet.
And the previous descriptor at 475 with rxd_idx = 141 is a zero-length packet: eop = 1, sop = 1, len = 0.
The packet before it is a normal packet again: eop = 1, sop = 1,  len = 66.

Now, how to fix the problem?
I see two ways:
- a driver can notify iflib of a zero length packet (via a new callback), so that iflib can skip the corresponding entry in the appropriate free list
- rxd_frag_to_sd() can skip through fl entries until ifl_cidx becomes equal to irf_idx
Comment 4 Andriy Gapon freebsd_committer freebsd_triage 2020-01-07 13:02:02 UTC
I wonder if vmxnet3_isc_rxd_pkt_get() should not hide those zero-length packets.
It could return packets with irf_len = 0.
I see that there is some provision for that case in assemble_segments(), but not sure if it is really expected.
Especially, in rxd_frag_to_sd().

Could any of iflib developers please help?
Thank you!
Comment 5 Andriy Gapon freebsd_committer freebsd_triage 2020-01-16 11:53:37 UTC
Still looking for help from iflib developers.
Comment 6 Kubilay Kocak freebsd_committer freebsd_triage 2020-01-21 03:40:32 UTC
@Andriy Is this a recent regression/change in behaviour? Any chance of bisection?
Comment 7 Andriy Gapon freebsd_committer freebsd_triage 2020-01-21 05:53:09 UTC
(In reply to Kubilay Kocak from comment #6)
That's certainly a fallout from iflib migration.
There have not been that many changes to vmx code.
Comment 8 Kubilay Kocak freebsd_committer freebsd_triage 2020-01-22 03:02:05 UTC
(In reply to Andriy Gapon from comment #7)

Could you request feedback from developers that were involved in that migration by setting maintainer-feedback ? <email> flags. Thanks!
Comment 9 commit-hook freebsd_committer freebsd_triage 2020-03-14 19:44:45 UTC
A commit references this bug:

Author: pkelsey
Date: Sat Mar 14 19:43:44 UTC 2020
New revision: 358995
URL: https://svnweb.freebsd.org/changeset/base/358995

Log:
  Fix iflib freelist state corruption

  This fixes a bug in iflib freelist management that breaks the required
  correspondence between freelist indexes and driver ring slots.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	avg, gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23943

Changes:
  head/sys/net/iflib.c
Comment 10 commit-hook freebsd_committer freebsd_triage 2020-03-14 19:55:53 UTC
A commit references this bug:

Author: pkelsey
Date: Sat Mar 14 19:55:06 UTC 2020
New revision: 358997
URL: https://svnweb.freebsd.org/changeset/base/358997

Log:
  Remove freelist contiguous-indexes assertion from rxd_frag_to_sd()

  The vmx driver is an example of an iflib driver that might report
  packets using non-contiguous descriptors (with unused descriptors
  either between received packets or between the fragments of a received
  packet), so this assertion needs to be removed.

  For such drivers, the freelist producer and consumer indexes don't
  relate directly to driver ring slots (the driver deals directly with
  freelist buffer indexes supplied by iflib during refill, and reports
  them with each fragment during packet reception), but do continue to
  be used by iflib for accounting, such as determining the number of
  ring slots that are refillable.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23946

Changes:
  head/sys/net/iflib.c
Comment 11 commit-hook freebsd_committer freebsd_triage 2020-03-14 20:08:58 UTC
A commit references this bug:

Author: pkelsey
Date: Sat Mar 14 20:08:04 UTC 2020
New revision: 359000
URL: https://svnweb.freebsd.org/changeset/base/359000

Log:
  Fix if_vmx receive checksum offload bug and harden against the device skipping receive descriptors

  This fixes a bug where the checksum offload status of received packets
  was being taken from the first descriptor instead of the last, which
  affected LRO packets.

  The driver has been hardened against the device skipping receive
  descriptors, although it is not believed that this can occur given the
  way this implementation configures the receive rings.

  Additionally, for packets received with the error indicator set, the
  driver now forces the length of all fragments in that packet to zero
  prior to passing it to iflib.  Such packets should wind up being
  discarded at some point in the stack anyway, but this removes any
  questions by killing them in the driver.

  Counters have been added (and exposed via sysctls) for skipped receive
  descriptors, zero-length packets received, and packets received with
  the error indicator set so that these conditions can be easily
  observed in the field.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23949

Changes:
  head/sys/dev/vmware/vmxnet3/if_vmx.c
  head/sys/dev/vmware/vmxnet3/if_vmxvar.h
Comment 12 Kubilay Kocak freebsd_committer freebsd_triage 2020-03-19 03:44:17 UTC
^Triage: Assign to committer resolving, pending MFC. If this doesn't need to go
to stable/11, set mfc-stable11 to -

Thanks Patrick!
Comment 13 commit-hook freebsd_committer freebsd_triage 2020-07-13 17:21:08 UTC
A commit references this bug:

Author: mav
Date: Mon Jul 13 17:20:21 UTC 2020
New revision: 363163
URL: https://svnweb.freebsd.org/changeset/base/363163

Log:
  MFC r358995 (by pkelsey): Fix iflib freelist state corruption

  This fixes a bug in iflib freelist management that breaks the required
  correspondence between freelist indexes and driver ring slots.

  PR: 243126, 243392, 240628

Changes:
_U  stable/12/
  stable/12/sys/net/iflib.c