Bug 240608 - if_vmx(4): iflib - Panic with INVARIANTS: Memory modified after free (12.1-pre-QA)
Summary: if_vmx(4): iflib - Panic with INVARIANTS: Memory modified after free (12.1-pr...
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Patrick Kelsey
URL:
Keywords: crash
Depends on:
Blocks: 240700
  Show dependency treegraph
 
Reported: 2019-09-16 07:45 UTC by Harald Schmalzbauer
Modified: 2020-03-19 03:42 UTC (History)
9 users (show)

See Also:
koobs: mfc-stable12?
koobs: mfc-stable11?


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Harald Schmalzbauer 2019-09-16 07:45:31 UTC
Hello,

testing 12.1-PRERELEASE updates with debug kernel on cold-standby hardware revealed some unexpected panics related to iflib.
Not sure if I shall file individual bug reports or collect them here in one report.
Need to collect the others one after another, so let's start here with the most unexpected, happened during traffic test utilizing if_vmx(4):

panic: Memory modified after free 0xfffff801381d0000(2048) val=0 @ 0xfffff801381d0000

cpuid = 0
time = 1568618749
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0041352670
vpanic() at vpanic+0x19d/frame 0xfffffe00413526c0
panic() at panic+0x43/frame 0xfffffe0041352720
trash_ctor() at trash_ctor+0x49/frame 0xfffffe0041352730
mb_ctor_clust() at mb_ctor_clust+0x18/frame 0xfffffe0041352760
uma_zalloc_arg() at uma_zalloc_arg+0x8a0/frame 0xfffffe00413527e0
m_cljget() at m_cljget+0x8a/frame 0xfffffe0041352810
_iflib_fl_refill() at _iflib_fl_refill+0x2f1/frame 0xfffffe0041352900
_task_fn_rx() at _task_fn_rx+0xb29/frame 0xfffffe00413529f0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xf9/frame 0xfffffe0041352a40
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe0041352a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0041352ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0041352ab0

#9  0xffffffff805cf4ca in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_shutdown.c:866
#10 0xffffffff805cf273 in panic (fmt=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_shutdown.c:804
#11 0xffffffff808da039 in trash_ctor (mem=<value optimized out>, size=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/vm/uma_dbg.c:82
#12 0xffffffff805b2b08 in mb_ctor_clust (mem=0xfffff801381d0000, size=2048, arg=0x0, how=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_mbuf.c:702
#13 0xffffffff808d5030 in uma_zalloc_arg (zone=<value optimized out>, udata=0x0, flags=1)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/vm/uma_core.c:2506
#14 0xffffffff805b18fa in m_cljget (m=0x0, how=1, size=2048)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_mbuf.c:956
#15 0xffffffff80703e41 in _iflib_fl_refill (ctx=0xfffff800028ec800, fl=0xfffff8000293eac0, count=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/net/iflib.c:2025
#16 0xffffffff806fea59 in _task_fn_rx (context=0xfffff8000293d000)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/net/iflib.c:2117
#17 0xffffffff80616539 in gtaskqueue_run_locked (queue=0xfffff80002360a00)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_gtaskqueue.c:378
#18 0xffffffff806162f8 in gtaskqueue_thread_loop (arg=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_gtaskqueue.c:559
#19 0xffffffff80596274 in fork_exit (callout=0xffffffff80616270 <gtaskqueue_thread_loop>, arg=0xfffffe000029b008, 
    frame=0xfffffe0041352ac0) at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_fork.c:1065
#20 0xffffffff80912c6e in fork_trampoline () at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:1077
#21 0x0000000000000000 in ?? ()

Hope someone can use that information.  Happily providing more info on request.
Guess I'd better open individual bug reports...

Thanks,
-harry
Comment 1 Harald Schmalzbauer 2019-09-16 09:43:04 UTC
I guess this panic with an non-INVARIANTS kernel (wihtout any other debug OPTIONS) is related.
This is happening quiet early after some traffic tests utilizing if_vmx(4).
12.1-prerelease without debug options:

Fatal trap 12: page fault while in kernel mode                                                                                         
cpuid = 1; apic id = 01                                                                                                                
fault virtual address   = 0x0                                                                                                          
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff806ffdc2                                                                                      
stack pointer           = 0x28:0xfffffe0040557900                                                                                      
frame pointer           = 0x28:0xfffffe00405579e0                                                                                      
code segment            = base rx0, limit 0xfffff, type 0x1b                                                                           
                        = DPL 0, pres 1, long 1, def32 0, gran 1                                                                       
processor eflags        = interrupt enabled, resume, IOPL = 0                                                                          
current process         = 0 (if_io_tqg_1)                                                                                              
trap number             = 12                                                                                                           
panic: page fault                                                                                                                      
cpuid = 1                                                                                                                              
time = 1568626299                                                                                                                      
KDB: stack backtrace:                                                                                                                  
#0 0xffffffff8061f047 at kdb_backtrace+0x67                                                                                            
#1 0xffffffff805d33bd at vpanic+0x19d                                                                                                  
#2 0xffffffff805d3213 at panic+0x43                                                                                                    
#3 0xffffffff80941d2c at trap_fatal+0x39c                                                                                              
#4 0xffffffff80941d79 at trap_pfault+0x49                                                                                              
#5 0xffffffff8094136f at trap+0x29f                                                                                                    
#6 0xffffffff8091b92c at calltrap+0x8                                                                                                  
#7 0xffffffff8061d904 at gtaskqueue_run_locked+0x144
#8 0xffffffff8061d568 at gtaskqueue_thread_loop+0x98
#9 0xffffffff8059b103 at fork_exit+0x83
#10 0xffffffff8091c96e at fork_trampoline+0xe

#3  0xffffffff805d3213 in panic (fmt=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_shutdown.c:804
#4  0xffffffff80941d2c in trap_fatal (frame=<value optimized out>, eva=<value optimized out>)                                          
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/trap.c:943                                                          
#5  0xffffffff80941d79 in trap_pfault (frame=0xfffffe0040557840, usermode=0) at RELENG_12/src/sys/amd64/include/pcpu.h:234             
#6  0xffffffff8094136f in trap (frame=0xfffffe0040557840) at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/trap.c:443    
#7  0xffffffff8091b92c in calltrap () at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:289                   
#8  0xffffffff806ffdc2 in _task_fn_rx (context=<value optimized out>)                                                                  
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/net/iflib.c:2614                                                                
#9  0xffffffff8061d904 in gtaskqueue_run_locked (queue=0xfffff8000206c100)                                                             
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_gtaskqueue.c:378                                                      
#10 0xffffffff8061d568 in gtaskqueue_thread_loop (arg=<value optimized out>)                                                           
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_gtaskqueue.c:559                                                      
#11 0xffffffff8059b103 in fork_exit (callout=0xffffffff8061d4d0 <gtaskqueue_thread_loop>, arg=0xfffffe00025fe020,                      
    frame=0xfffffe0040557ac0) at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_fork.c:1065                                 
#12 0xffffffff8091c96e in fork_trampoline () at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:1077           
#13 0x0000000000000000 in ?? ()

Thanks for any help,

-harry
Comment 2 Eric Joyner freebsd_committer 2019-10-09 19:30:53 UTC
Maybe your issue was similar to the one found in this bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239118, which got fixed.

Could you retest?
Comment 3 Harald Schmalzbauer 2019-10-10 04:37:27 UTC
(In reply to Eric Joyner from comment #2)

Thanks for the hint, I also thought this could solve/influence the panic, but won't find time before the weekend.  I also found that it only happens when I use a MTU of 9000 bytes - if I don't confuse things from memory…
Will update soon.

-harry
Comment 4 Harald Schmalzbauer 2019-10-12 13:28:59 UTC
tested with RC1:
Still same panic as soon as if_vmx(4) get's load _and_ jumbo frames are in use (mtu 9000).

Unfortunately, solid vmxnet3 support is crucial to my setups.
I managed to patch the vmware vmxnet3 guest driver to work with FreeBSD 12, but it also suffers from panics…
The vmware vmxnet3 driver provides doubled transfer rates compared to if_vmx(3), but my skills and time don't last to fix any of both myself.

In case anyone is interested in the vmxnet3 patch I'll provide on request of course, but won't pollute this bug report with anything completely different.

-harry
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2019-10-13 22:08:16 UTC
(In reply to Harald Schmalzbauer from comment #4)

@Herald Can you confirm (explicitly) that the panic does *not* occur with jumbo frames *disabled/not configured* ?
Comment 6 Harald Schmalzbauer 2019-10-14 18:44:45 UTC
(In reply to Kubilay Kocak from comment #5)

As far as I could briefly test (some iperf3 streams), using if_vmx(4) does _not_ lead to panic if frames are <= 4k (and interface MTU is set to 9000).

Also, using if_igb(4) with interface MTU of 9000 and frames as big as 9k is working fine with FreeBSD-12_RC1.

Maybe 240700-block can be released if man page mentions the 4k frame size limit?

Thanks,
-harry
Comment 7 Kubilay Kocak freebsd_committer freebsd_triage 2019-10-15 02:17:39 UTC
(In reply to Harald Schmalzbauer from comment #6)

We can leave/remove the blocker depending on the necessary/chosen resolution. In my opinion though, crashes (panics) should be resolved whether or not there are intended limits in specific scenarios, ie: "all crashes are bugs"
Comment 8 Andriy Gapon freebsd_committer 2020-01-08 11:15:43 UTC
I am seeing the same problem with FreeBSD CURRENT as of r351653 (early September).

As far as I can tell, jumbo frames are not used.
The interface mtu is 1500 and there are no routes with non-default mtu.
The interface driver is also vmxnet3.

Unread portion of the kernel message buffer:
panic: Memory modified after free 0xfffff8017f069800(2048) val=9d671660 @ 0xfffff8017f069800

cpuid = 0
time = 1578053135
KDB: stack backtrace:
 stack1 db_trace_self_wrapper+0x2b vpanic+0x1a4 panic+0x43 trash_ctor+0x49 mb_ctor_clust+0x18 uma_zalloc_arg+0xa4d m_cljget+0x8a _iflib_fl_refill+0x511 _task_fn_rx+0xc06 gtaskqueue_run_locked+0x139 gtaskqueue_thread_loop+0x88 fork_exit+0x84 fork_trampoline+0xe
KDB: enter: panic

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55      /usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=4620928) at /usr/src/sys/kern/kern_shutdown.c:467
#2  0xffffffff8039641c in db_fncall_generic (addr=<optimized out>, args=<optimized out>, rv=<optimized out>, nargs=<optimized out>) at /usr/src/sys/ddb/db_command.c:620
#3  db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /usr/src/sys/ddb/db_command.c:668
#4  0xffffffff80395d39 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=0) at /usr/src/sys/ddb/db_command.c:492
#5  0xffffffff8039acd8 in db_script_exec (scriptname=<optimized out>, warnifnotfound=<optimized out>) at /usr/src/sys/ddb/db_script.c:304
#6  0xffffffff8039ab23 in db_script_kdbenter (eventname=<optimized out>) at /usr/src/sys/ddb/db_script.c:326
#7  0xffffffff80398d2a in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:251
#8  0xffffffff807c90fc in kdb_trap (type=3, code=0, tf=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:696
#9  0xffffffff80abe2d4 in trap (frame=0xfffffe0000468580) at /usr/src/sys/amd64/amd64/trap.c:621
#10 <signal handler called>
#11 kdb_enter (why=0xffffffff80bdbcb7 "panic", msg=0xffffffff80bdbcb7 "panic") at /usr/src/sys/kern/subr_kdb.c:483
#12 0xffffffff8077cff1 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:987
#13 0xffffffff8077d073 in panic (fmt=0xffffffff8115abd0 <cnputs_mtx+24> "") at /usr/src/sys/kern/kern_shutdown.c:921
#14 0xffffffff80a0d399 in trash_ctor (mem=0x80, size=4621632, arg=<optimized out>, flags=<optimized out>) at /usr/src/sys/vm/uma_dbg.c:82
#15 0xffffffff8075ad68 in mb_ctor_clust (mem=0xfffff8017f069800, size=2048, arg=0x0, how=128) at /usr/src/sys/kern/kern_mbuf.c:738
#16 0xffffffff80a0783d in uma_zalloc_arg (zone=0xfffff8000289c000, udata=0x0, flags=<optimized out>) at /usr/src/sys/vm/uma_core.c:2441
#17 0xffffffff80759aca in m_cljget (m=0x0, how=1, size=2048) at /usr/src/sys/kern/kern_mbuf.c:1392
#18 0xffffffff808b5d71 in _iflib_fl_refill (ctx=0xfffff80002a4b000, fl=<optimized out>, count=<optimized out>) at /usr/src/sys/net/iflib.c:1986
#19 0xffffffff808b17d6 in __iflib_fl_refill_lt (ctx=<optimized out>, fl=0xfffff80002a42000, max=24) at /usr/src/sys/net/iflib.c:2079
#20 iflib_rxeof (rxq=0xfffff80002a32000, budget=24) at /usr/src/sys/net/iflib.c:2823
#21 _task_fn_rx (context=0xfffff80002a32000) at /usr/src/sys/net/iflib.c:3795
#22 0xffffffff807c72a9 in gtaskqueue_run_locked (queue=0xfffff800020a6100) at /usr/src/sys/kern/subr_gtaskqueue.c:377
#23 0xffffffff807c7028 in gtaskqueue_thread_loop (arg=<optimized out>) at /usr/src/sys/kern/subr_gtaskqueue.c:558
#24 0xffffffff8073ce34 in fork_exit (callout=0xffffffff807c6fa0 <gtaskqueue_thread_loop>, arg=0xfffffe00003f9008, frame=0xfffffe0000468ac0) at /usr/src/sys/kern/kern_fork.c:1048
Comment 9 Andriy Gapon freebsd_committer 2020-01-08 11:18:20 UTC
I have a crash dump and I can provide any data requested.
I can also do some debugging myself if given some directions.
Comment 10 Andriy Gapon freebsd_committer 2020-01-15 10:15:33 UTC
We (Panzura) are still getting this panic semi-regularly.
I have been trying to root cause the bug, but I am failing at it so far.
I am afraid that we will to revert vmxnet3 code to the pre-iflib state.
It's kind of a lose-lose situation, unfortunately.
Comment 11 Andriy Gapon freebsd_committer 2020-01-16 07:49:40 UTC
Not sure if that's relevant, but one thing I noticed is that the converted driver lost handling of rxcd->error field.
The old code would do:
        if (rxcd->error) {
                rxq->vxrxq_stats.vmrxs_ierrors++;
                m_freem(m);
                return;
        }

But in the iflib world the error is simply ignored.
Comment 12 Andriy Gapon freebsd_committer 2020-01-16 09:05:35 UTC
Also, to be honest, I do not understand how ifl_fragidx and ifl_rx_bitmap on the one hand and ifl_pidx and ifl_cidx on the other work together.
The map potentially allows for an arbitrary mix of free and busy descriptors while cidx/pidx assume strictly linear iteration.
Comment 13 Andriy Gapon freebsd_committer 2020-01-16 11:09:14 UTC
I see one potential problem related to ifl_fragidx.
_iflib_fl_refill() has this logic:
        frag_idx = fl->ifl_fragidx;
<loop>
                bit_ffc_at(fl->ifl_rx_bitmap, frag_idx, fl->ifl_size,                                                                                                                                                                 
                    &frag_idx);                                                                                                                                                                                                       
                if (frag_idx < 0)                                                                                                                                                                                                     
                        bit_ffc(fl->ifl_rx_bitmap, fl->ifl_size, &frag_idx);

                bit_set(fl->ifl_rx_bitmap, frag_idx);
<end of loop>
        fl->ifl_fragidx = frag_idx;

So, ifl_fragidx is used to store the latest set bit in ifl_rx_bitmap.
bit_ffc_at() finds the first cleared bit at or after the start position.
Typically, that means /after/ as the bit /at/ frag_idx is set.

But let's consider this scenario.  Somehow the hardware consumes descriptors very fast, faster than they are refilled.  Let's say we refilled descriptors up to index N, so ifl_fragidx=N and ifl_pidx=N+1.  Let's say the hardware consumed all descriptors available to it.  That means that the whole ifl_rx_bitmap is clear and ifl_cidx=N+1.  Now we do the next refill and we start searching from ifl_fragidx.  That position is free now, so bit_ffc_at() will return it.  At this point ifl_fragidx and ifl_pidx get out of sync.  We populate various software resources by frag_idx, but we program the hardware by pidx.  For example, we will allocate a cluster at index N, but program its bus address in a descriptor at index N+1.
That will mess up things for a driver that expects that indexes are always advanced linearly.

There is a simple solution.
Either we should store frag_idx + 1 to ifl_fragidx for the benefit of the next refill or we should call bit_ffc_at() with frag_idx + 1 as a starting position.
Comment 14 Andriy Gapon freebsd_committer 2020-01-16 11:52:57 UTC
As to why the hardware can exhaust all descriptors in a free list.
I think that there is a certain impedance mismatch.  iflib_rxeof() takes its budget in terms of full packets and also iflib_rxd_avail() works in the same terms. __iflib_fl_refill_lt() is called with a limit equal to budget + 8. 
But the descriptors are used by packet fragments and a single packet may have many fragments.  Also, a driver like vmx may waste some descriptors[*] for reasons that are known only to vmware.

For example, in one crash dumps that I have here I see that iflib_rxeof() was processing its 11th packet (rx_pkts = 11, avail = 7, budget = 16) and that packet had iri_nfrags = 14.
It's conceivable that such a batch could exhaust all descriptors populated by the last refill.

[*]
eop=1, sop=1, len=0 but a command descriptor 141 in rx queue 0 is "consumed":

(kgdb) p $19.vxcr_u.rxcd[475]
$22 = {rxd_idx = 141, pad1 = 0, eop = 1, sop = 1, qid = 0, rss_type = 0, no_csum = 0, pad2 = 0, rss_hash = 0, len = 0, error = 0, vlan = 0, vtag = 0, csum = 0, csum_ok = 0, udp = 0, tcp = 0, ipcsum_ok = 0, ipv6 = 0, ipv4 = 0,
  fragment = 0, fcs = 0, type = 3, gen = 1}
Comment 15 Patrick Kelsey freebsd_committer 2020-01-16 14:21:39 UTC
(In reply to Andriy Gapon from comment #11)

Regarding ignoring rxd->error in the iflib driver:

Just as a piece of background information, one thing that is true is that rxd->error can only be set when rx->eop is set (according to reference driver sources).

In the error case, we still need iflib to do all of the related completion queue/free list processing for descriptors from sop to eop, but we also have knowledge that the packet is 'bad'.

The current treatment of the error indication case is based on the following.  There is no facility in the iflib interface (now, or when this driver was written) to indicate this case to iflib.  So the vmx driver submits the packet anyway, under the reasoning that it is already possible to receive damaged packets from the network (that pass the checksum checks and so on).  It is perhaps not strictly optimal to not take advantage of the early knowledge that the packet is bad, but this does not introduce a new possible condition for the network stack, and this is a rare event.

Considering this again, this approach does rely on the virtual device to not provide damaged fragment length indicators (meaning something greater than the configured buffer size) in the error case, and perhaps we should not trust it to do so, in which case the minimal change to what we currently have would be to check rxd->error, and if set, consider the length of the eop fragment to be zero.  iflib probably needs an audit on all of the cases of receiving zero length fragments.

I may have some time next week to re-analyze the current iflib descriptor/free list mechanism and how the vmx driver interacts with it in order to determine whether there is a bug in the vmx driver, the iflib implementation has shifted since the vmx driver was written in a way that breaks the vmx driver, or everything looks OK and there is some other root cause here.
Comment 16 Patrick Kelsey freebsd_committer 2020-03-16 13:52:02 UTC
The following commits address this bug (the commit hook missed them because there was a typo in the reference to this bug in their commit logs):


Author: pkelsey
Date: Sat Mar 14 19:43:44 UTC 2020
New revision: 358995
URL: https://svnweb.freebsd.org/changeset/base/358995

Log:
  Fix iflib freelist state corruption

  This fixes a bug in iflib freelist management that breaks the required
  correspondence between freelist indexes and driver ring slots.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	avg, gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23943

Changes:
  head/sys/net/iflib.c



Author: pkelsey
Date: Sat Mar 14 19:55:06 UTC 2020
New revision: 358997
URL: https://svnweb.freebsd.org/changeset/base/358997

Log:
  Remove freelist contiguous-indexes assertion from rxd_frag_to_sd()

  The vmx driver is an example of an iflib driver that might report
  packets using non-contiguous descriptors (with unused descriptors
  either between received packets or between the fragments of a received
  packet), so this assertion needs to be removed.

  For such drivers, the freelist producer and consumer indexes don't
  relate directly to driver ring slots (the driver deals directly with
  freelist buffer indexes supplied by iflib during refill, and reports
  them with each fragment during packet reception), but do continue to
  be used by iflib for accounting, such as determining the number of
  ring slots that are refillable.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23946

Changes:
  head/sys/net/iflib.c



Author: pkelsey
Date: Sat Mar 14 20:08:05 UTC 2020
New revision: 359000
URL: https://svnweb.freebsd.org/changeset/base/359000

Log:
  Fix if_vmx receive checksum offload bug and harden against the device skipping receive descriptors

  This fixes a bug where the checksum offload status of received packets
  was being taken from the first descriptor instead of the last, which
  affected LRO packets.

  The driver has been hardened against the device skipping receive
  descriptors, although it is not believed that this can occur given the
  way this implementation configures the receive rings.

  Additionally, for packets received with the error indicator set, the
  driver now forces the length of all fragments in that packet to zero
  prior to passing it to iflib.  Such packets should wind up being
  discarded at some point in the stack anyway, but this removes any
  questions by killing them in the driver.

  Counters have been added (and exposed via sysctls) for skipped receive
  descriptors, zero-length packets received, and packets received with
  the error indicator set so that these conditions can be easily
  observed in the field.

  PR:		243126, 243392, 240628
  Reported by:	avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
  Reviewed by:	gallatin
  MFC after:	1 week
  Differential Revision:	https://reviews.freebsd.org/D23949

Changes:
  head/sys/dev/vmware/vmxnet3/if_vmx.c
  head/sys/dev/vmware/vmxnet3/if_vmxvar.h
Comment 17 Kubilay Kocak freebsd_committer freebsd_triage 2020-03-19 03:42:49 UTC
^Triage: Assign to committer resolving, pending MFC. If this doesn't need to go to stable/11, set mfc-stable11 to -

Thanks Patrick!